How to use FMPE and NPSE


This guide gives practical recommendations for using vector field methods (FMPE and NPSE) in sbi. For a comprehensive tutorial covering the API and concepts, see the advanced tutorial on vector field methods.

The recommendations below are heuristics derived from practical experience, not from systematic benchmarking.

Explicit recommendations

Start with FMPE: it is faster and more stable because it samples with an ODE by default.
Switch to NPSE if you need SDE-based sampling for better exploration of multimodal posteriors, if you work in high dimensions (>50 parameters), or if you want to use EDM-style noise schedules.
Both are single-round only. For multi-round inference, use NPE instead.
Use the default MLP architecture unless you have structured or very high-dimensional data.
Don’t tune noise schedules until you’ve tried the defaults.
For IID posteriors, use iid_method="auto_gauss".
If you plan to use guidance, train with a Gaussian prior (e.g., MultivariateNormal) rather than BoxUniform. Guidance computes the marginal prior score analytically, and this works better with smooth priors.
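To illustrate what an analytic marginal prior score looks like, here is a minimal sketch for a one-dimensional Gaussian prior. It assumes variance-exploding noising (theta plus Gaussian noise of scale sigma), in which case the diffused prior is again Gaussian. The function names are hypothetical and not part of sbi; the finite-difference check only confirms the closed form.

```python
import math


def diffused_gaussian_log_density(theta, sigma, prior_mean=0.0, prior_std=1.0):
    """Log density of N(prior_mean, prior_std^2) convolved with N(0, sigma^2)."""
    var = prior_std**2 + sigma**2
    return -0.5 * math.log(2 * math.pi * var) - 0.5 * (theta - prior_mean) ** 2 / var


def diffused_gaussian_prior_score(theta, sigma, prior_mean=0.0, prior_std=1.0):
    """Closed-form derivative of the log density above with respect to theta."""
    return -(theta - prior_mean) / (prior_std**2 + sigma**2)


# Compare the closed form against a central finite difference.
theta, sigma, h = 0.7, 0.3, 1e-5
finite_diff = (
    diffused_gaussian_log_density(theta + h, sigma)
    - diffused_gaussian_log_density(theta - h, sigma)
) / (2 * h)
analytic = diffused_gaussian_prior_score(theta, sigma)
print(analytic, finite_diff)
```

The score stays smooth and bounded for every noise level, which is one plausible reason a Gaussian prior is the easier case.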

FMPE vs NPSE

FMPE learns a velocity field and samples via ODE (fast, deterministic). NPSE learns a score function and samples via SDE (stochastic, more robust). Through a mathematical bridge that converts the learned velocity field into a score, FMPE also supports SDE sampling and all score-based features (IID, guidance).

Criterion	FMPE	NPSE
Speed	Faster (ODE default)	Slower (SDE default)
Sampling quality	Good for smooth posteriors	More robust for complex posteriors
IID posteriors	Supported (via bridge)	Supported (native)
Guidance	Supported (via bridge)	Supported (native)
EDM schedules	Not applicable	Supported for `sde_type="ve"`
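The bridge between velocity and score can be sketched for a generic Gaussian probability path x_t = alpha_t * x_1 + sigma_t * eps, where x_1 is data and eps is standard normal noise. For such a path the score follows from the velocity u as (alpha * u - alpha' * x) / (sigma * (sigma * alpha' - alpha * sigma')). The example below checks this identity on a one-dimensional Gaussian toy problem. The linear path (alpha_t = t, sigma_t = 1 - t) and the function name are assumptions for illustration; sbi may use a different parameterization or time direction.

```python
def score_from_velocity(x, u, alpha, sigma, d_alpha, d_sigma):
    """Score implied by velocity u at x for the path x_t = alpha*x_1 + sigma*eps."""
    return (alpha * u - d_alpha * x) / (sigma * (sigma * d_alpha - alpha * d_sigma))


# Toy check: data x_1 ~ N(m, s^2), so every quantity is available in closed form.
m, s = 1.0, 2.0
t, x = 0.5, 0.3
alpha, sigma = t, 1.0 - t          # assumed linear path
d_alpha, d_sigma = 1.0, -1.0       # time derivatives of alpha and sigma

var_t = alpha**2 * s**2 + sigma**2                          # variance of x_t
mean_x1 = m + alpha * s**2 / var_t * (x - alpha * m)        # E[x_1 | x_t = x]
mean_eps = (x - alpha * mean_x1) / sigma                    # E[eps | x_t = x]
velocity = d_alpha * mean_x1 + d_sigma * mean_eps           # marginal velocity
true_score = -(x - alpha * m) / var_t                       # score of N(alpha*m, var_t)

bridged_score = score_from_velocity(x, velocity, alpha, sigma, d_alpha, d_sigma)
print(bridged_score, true_score)
```

For the linear path the denominator reduces to 1 - t, so under this convention the conversion becomes ill-conditioned close to the data end of the path.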

Choosing an architecture

Use the default "mlp" unless you have a specific reason:

< 20 parameter dims: "mlp" (default) — fast, reliable.
20–100 parameter dims or complex posteriors: try "ada_mlp" — uses FiLM-style time conditioning.
> 100 parameter dims or structured data: try "transformer" — benefits strongly from GPU.
Variable-length observation sequences (e.g., time series): use "transformer_cross_attn" with an appropriate embedding net.

See the guide "How to choose neural nets" and Section 4 of the tutorial for details.

from sbi.neural_nets import posterior_flow_nn, posterior_score_nn

# FMPE with a transformer velocity-field network
flow_net = posterior_flow_nn(
    model="transformer", num_layers=2, num_heads=2, hidden_features=64
)

# NPSE with ada_mlp and the VE SDE
score_net = posterior_score_nn(
    model="ada_mlp", sde_type="ve", hidden_features=128, num_layers=6
)
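The sizing heuristics in this section can also be written down as a small decision helper. This is only a sketch of the rules of thumb listed above: suggest_architecture and its arguments are hypothetical and not part of sbi, and the returned string is meant to be passed as the model argument.

```python
def suggest_architecture(
    num_param_dims,
    complex_posterior=False,
    structured_data=False,
    variable_length_obs=False,
):
    """Map the rules of thumb from this guide to a model name (heuristic only)."""
    if variable_length_obs:
        # Variable-length sequences also need an appropriate embedding net.
        return "transformer_cross_attn"
    if num_param_dims > 100 or structured_data:
        return "transformer"
    if num_param_dims >= 20 or complex_posterior:
        return "ada_mlp"
    return "mlp"


for dims in (5, 50, 500):
    print(dims, suggest_architecture(dims))
```

Treat the output as a starting point and fall back to "mlp" whenever a larger architecture gives no clear improvement.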

Choosing an SDE type (NPSE only)

"ve" (default): Best general choice. Supports EDM-style noise schedules.
"vp": Can work better for low-dimensional problems.
"subvp": Experimental, tighter variance bounds.

If unsure, use "ve".

Tuning noise schedules

All SDE types have tunable noise range parameters: sigma_min/sigma_max for VE, beta_min/beta_max for VP/SubVP. These can be passed to posterior_score_nn().

Additionally, VE supports EDM-style non-uniform time sampling. Consider these when:

Training loss is very noisy.
Samples look blurry or lack fine detail.
The posterior is concentrated in a large prior space (see #1754).

For systematic hyperparameter tuning (including noise schedules), we recommend using Optuna as described in the hyperparameter tuning guide.

from sbi.neural_nets import posterior_score_nn

# VE with EDM-style schedules
net_ve = posterior_score_nn(
    model="mlp", sde_type="ve",
    train_schedule="lognormal", solve_schedule="power_law",
    sigma_min=1e-3, sigma_max=15.0,
)

# VP: tune beta range
net_vp = posterior_score_nn(model="mlp", sde_type="vp", beta_min=0.01, beta_max=20.0)
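For intuition about what EDM-style schedules do, the following library-independent sketch implements a log-normal distribution over training noise levels and a power-law grid of noise levels for solving. The formulas and the constants rho=7, p_mean=-1.2, p_std=1.2 follow the conventions commonly used for EDM and are assumptions here; the clipping to the sigma range is an illustrative choice, and sbi's "lognormal" and "power_law" options may be parameterized differently.

```python
import math
import random


def power_law_sigmas(num_steps, sigma_min=1e-3, sigma_max=15.0, rho=7.0):
    """Noise levels from sigma_max down to sigma_min, spaced uniformly in sigma^(1/rho)."""
    hi = sigma_max ** (1.0 / rho)
    lo = sigma_min ** (1.0 / rho)
    return [(hi + i / (num_steps - 1) * (lo - hi)) ** rho for i in range(num_steps)]


def lognormal_sigma(rng, p_mean=-1.2, p_std=1.2, sigma_min=1e-3, sigma_max=15.0):
    """Training noise level with ln(sigma) ~ N(p_mean, p_std^2), clipped to the range."""
    sigma = math.exp(rng.gauss(p_mean, p_std))
    return min(max(sigma, sigma_min), sigma_max)


sigmas = power_law_sigmas(num_steps=10)
rng = random.Random(0)
train_sigmas = [lognormal_sigma(rng) for _ in range(1000)]
print([round(s, 4) for s in sigmas])
```

The power-law grid takes large steps at high noise and small steps near sigma_min, while the log-normal sampler concentrates training on intermediate noise levels instead of weighting all levels equally.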

SDE vs ODE sampling

FMPE defaults to ODE (fast, deterministic).
NPSE defaults to SDE (stochastic, more robust). Switch to ODE if you need speed.

For SDE sampling:

corrector=None (default): fast, good for most problems.
corrector="langevin": better quality when SDE samples look noisy.
steps=500 (default): increase to 1000 for quality.

# Use SDE for FMPE (non-default)
import torch

from sbi.inference import FMPE

# Toy simulations: theta from a standard normal, x is theta plus Gaussian noise.
theta = torch.randn(1000, 2)
x = theta + 0.5 * torch.randn_like(theta)
x_o = torch.tensor([0.0, 0.0])

# Train FMPE, then sample with the SDE instead of the default ODE.
trainer = FMPE()
trainer.append_simulations(theta, x).train()
posterior = trainer.build_posterior(sample_with="sde")
samples = posterior.sample((1000,), x=x_o, corrector="langevin", steps=500)
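To see the difference between ODE and SDE sampling in isolation, here is a toy sketch that does not use sbi. It assumes a one-dimensional target N(0, 0.5^2) under a variance-exploding diffusion, so the score is known exactly, and it uses a plain Euler discretization on a geometric noise grid. All names are hypothetical, and sbi's samplers may discretize differently.

```python
import math
import random
import statistics

DATA_STD = 0.5  # toy target N(0, 0.5^2), chosen so the diffused score is analytic


def score(x, sigma):
    """Exact score of the diffused toy target N(0, DATA_STD^2 + sigma^2)."""
    return -x / (DATA_STD**2 + sigma**2)


def sigma_grid(steps, sigma_min=0.01, sigma_max=15.0):
    """Geometric noise levels from sigma_max down to sigma_min (steps + 1 values)."""
    ratio = (sigma_min / sigma_max) ** (1.0 / steps)
    return [sigma_max * ratio**i for i in range(steps + 1)]


def reverse_sample(x_init, steps=500, rng=None):
    """Integrate from sigma_max to sigma_min; rng=None selects the ODE."""
    sigmas = sigma_grid(steps)
    x = x_init
    for s_hi, s_lo in zip(sigmas[:-1], sigmas[1:]):
        delta = s_hi**2 - s_lo**2
        if rng is None:
            # Probability-flow ODE: half the score drift, no injected noise.
            x += 0.5 * delta * score(x, s_hi)
        else:
            # Reverse SDE: full score drift plus fresh Gaussian noise.
            x += delta * score(x, s_hi) + math.sqrt(delta) * rng.gauss(0.0, 1.0)
    return x


rng = random.Random(0)
starts = [rng.gauss(0.0, 15.0) for _ in range(1000)]
ode_samples = [reverse_sample(x0) for x0 in starts]
sde_samples = [reverse_sample(x0, rng=rng) for x0 in starts]
print(statistics.pstdev(ode_samples), statistics.pstdev(sde_samples))
```

Both samplers should recover a spread close to 0.5 here. The ODE result is fully determined by the starting point, whereas the SDE injects new noise at every step, which is the source of both its extra cost and its extra exploration.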

IID method

When you have multiple i.i.d. observations, use iid_method in .sample():

"auto_gauss" (recommended): auto-calibrates, accurate, moderate overhead.
"gauss": faster, good for simple problems or few observations.
"fnpe": fastest but approximate — use only for quick checks with < 5 observations.
"jac_gauss": most accurate but slowest — only when accuracy is critical.