How to use FMPE and NPSE#
This guide gives practical recommendations for using vector field methods (FMPE and NPSE) in sbi. For a comprehensive tutorial covering the API and concepts, see the advanced tutorial on vector field methods.
The recommendations below are heuristics derived from practical experience, not from systematic benchmarking.
Explicit recommendations#
- Start with FMPE: it is faster and more stable (ODE sampling).
- Switch to NPSE if you need SDE-based sampling for better exploration of multimodal posteriors, if you work in high dimensions (>50 parameters), or if you want to use EDM-style noise schedules.
- Both are single-round only. For multi-round inference, use `NPE` instead.
- Use the default MLP architecture unless you have structured or very high-dimensional data.
- Don't tune noise schedules until you've tried the defaults.
- For IID posteriors, use `iid_method="auto_gauss"`.
- For guidance, train with a Gaussian prior (e.g., `MultivariateNormal`) rather than `BoxUniform`: guidance requires computing the marginal prior score analytically, which works better with smooth priors.
FMPE vs NPSE#
FMPE learns a velocity field and samples via ODE (fast, deterministic). NPSE learns a score function and samples via SDE (stochastic, more robust). Through a mathematical bridge, FMPE also supports SDE sampling and all score-based features (IID, guidance).
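The relationship between a velocity field and a score function can be checked on a toy problem. The sketch below is not sbi's implementation. It assumes a 1-D Gaussian target, the linear path `x_t = t * x1 + (1 - t) * eps`, and a standard identity for Gaussian probability paths; sbi's time convention and parameterization may differ, and all function names are hypothetical.

```python
# Toy setup (assumptions for illustration): target x1 ~ N(M, S^2),
# noise eps ~ N(0, 1), and the path x_t = t * x1 + (1 - t) * eps.
M, S = 1.5, 0.7


def marginal_var(t):
    """Variance of x_t, whose marginal is N(t * M, t^2 S^2 + (1 - t)^2)."""
    return t**2 * S**2 + (1.0 - t) ** 2


def true_score(x, t):
    """Analytic score of the Gaussian marginal of x_t."""
    return -(x - t * M) / marginal_var(t)


def true_velocity(x, t):
    """Marginal velocity E[x1 - eps | x_t = x] via Gaussian conditioning."""
    d = (x - t * M) / marginal_var(t)
    expected_x1 = M + t * S**2 * d
    expected_eps = (1.0 - t) * d
    return expected_x1 - expected_eps


def score_from_velocity(v, x, t):
    """Convert a velocity into a score for alpha_t = t, sigma_t = 1 - t.

    Uses v = (alpha'/alpha) x + (sigma^2 alpha'/alpha - sigma sigma') s,
    solved for the score s. Valid for 0 < t < 1.
    """
    alpha, d_alpha = t, 1.0
    sigma, d_sigma = 1.0 - t, -1.0
    coeff = sigma**2 * d_alpha / alpha - sigma * d_sigma
    return (v - d_alpha / alpha * x) / coeff


# The converted score agrees with the analytic score at an arbitrary point.
x, t = 0.4, 0.3
gap = abs(score_from_velocity(true_velocity(x, t), x, t) - true_score(x, t))
print(f"absolute difference: {gap:.2e}")
```

In this toy case a trained velocity network would play the role of `true_velocity`, which is why score-based features can in principle be driven by an FMPE model.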
| Criterion | FMPE | NPSE |
|---|---|---|
| Speed | Faster (ODE default) | Slower (SDE default) |
| Sampling quality | Good for smooth posteriors | More robust for complex posteriors |
| IID posteriors | Supported (via bridge) | Supported (native) |
| Guidance | Supported (via bridge) | Supported (native) |
| EDM schedules | Not applicable | Supported for VE |
Choosing an architecture#
Use the default `"mlp"` unless you have a specific reason:
- Fewer than 20 parameter dims: `"mlp"` (default), fast and reliable.
- 20–100 parameter dims or complex posteriors: try `"ada_mlp"`, which uses FiLM-style time conditioning.
- More than 100 parameter dims or structured data: try `"transformer"`, which benefits strongly from GPU.
- Variable-length observation sequences (e.g., time series): use `"transformer_cross_attn"` with an appropriate embedding net.
See How to choose neural nets and the tutorial Section 4 for details.
```python
from sbi.neural_nets import posterior_flow_nn, posterior_score_nn

# FMPE with a transformer velocity network
net_fmpe = posterior_flow_nn(
    model="transformer", num_layers=2, num_heads=2, hidden_features=64
)

# NPSE with an ada_mlp score network
net_npse = posterior_score_nn(
    model="ada_mlp", sde_type="ve", hidden_features=128, num_layers=6
)
```
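The architecture rules of thumb above can be written down as a small lookup. `suggest_architecture` is a hypothetical helper, not part of sbi, and the order in which the rules are checked is an assumption; it only returns the `model` string you would pass to the builder functions.

```python
def suggest_architecture(
    num_param_dims,
    variable_length_obs=False,
    complex_posterior=False,
    structured_data=False,
):
    """Return a model name following the heuristics in this guide."""
    # Variable-length observations take priority over dimensionality.
    if variable_length_obs:
        return "transformer_cross_attn"
    if num_param_dims > 100 or structured_data:
        return "transformer"
    if num_param_dims >= 20 or complex_posterior:
        return "ada_mlp"
    return "mlp"


for dims in (5, 50, 500):
    print(dims, suggest_architecture(dims))
```

Treat the output as a starting point: the thresholds are heuristics, so it is still worth comparing against the default `"mlp"` on your own problem.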
Choosing an SDE type (NPSE only)#
- `"ve"` (default): Best general choice. Supports EDM-style noise schedules.
- `"vp"`: Can work better for low-dimensional problems.
- `"subvp"`: Experimental, tighter variance bounds.
If unsure, use "ve".
Tuning noise schedules#
All SDE types have tunable noise range parameters: sigma_min/sigma_max for VE, beta_min/beta_max for VP/SubVP. These can be passed to posterior_score_nn().
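To get a feel for what these ranges control, the sketch below evaluates the textbook noise levels for VE and VP diffusions (geometric interpolation for VE, a linear beta schedule for VP). It assumes sbi follows these standard definitions, which this guide does not confirm; the function names are hypothetical and the default values are just example settings.

```python
import math


def ve_sigma(t, sigma_min=1e-3, sigma_max=15.0):
    """Noise std at time t in [0, 1] for a VE SDE (geometric interpolation)."""
    return sigma_min * (sigma_max / sigma_min) ** t


def vp_mean_and_std(t, beta_min=0.01, beta_max=20.0):
    """Signal scale and noise std at time t for a VP SDE with linear beta(t)."""
    # Integral of beta(s) = beta_min + s * (beta_max - beta_min) from 0 to t.
    integral = beta_min * t + 0.5 * (beta_max - beta_min) * t**2
    return math.exp(-0.5 * integral), math.sqrt(1.0 - math.exp(-integral))


for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    scale, std = vp_mean_and_std(t)
    print(f"t={t:.2f}  VE sigma={ve_sigma(t):.4f}  VP scale={scale:.4f}  VP std={std:.4f}")
```

Under these assumptions, `sigma_max` should be large relative to the spread of the parameters so that the data is fully noised at `t = 1`, while `sigma_min` sets the finest scale the model can resolve.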
Additionally, VE supports EDM-style non-uniform time sampling. Consider these when:
- Training loss is very noisy.
- Samples look blurry or lack fine detail.
- The posterior is concentrated in a large prior space (see #1754).
For systematic hyperparameter tuning (including noise schedules), we recommend using Optuna as described in the hyperparameter tuning guide.
```python
from sbi.neural_nets import posterior_score_nn

# VE with EDM-style schedules
net_ve = posterior_score_nn(
    model="mlp",
    sde_type="ve",
    train_schedule="lognormal",
    solve_schedule="power_law",
    sigma_min=1e-3,
    sigma_max=15.0,
)

# VP: tune the beta range
net_vp = posterior_score_nn(model="mlp", sde_type="vp", beta_min=0.01, beta_max=20.0)
```
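For intuition about what "EDM-style" schedules do, the sketch below implements the two schedules from the EDM paper (Karras et al., 2022): a power-law grid of noise levels for solving and log-normal sampling of noise levels for training. It assumes that `"power_law"` and `"lognormal"` refer to these constructions; the function names, the defaults `rho=7`, `p_mean=-1.2`, `p_std=1.2` (taken from the paper), and the clipping step are illustrative and may differ from sbi.

```python
import math
import random


def power_law_sigmas(num_steps, sigma_min=1e-3, sigma_max=15.0, rho=7.0):
    """Noise levels from sigma_max down to sigma_min, denser near sigma_min."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    hi, lo = sigma_max ** (1.0 / rho), sigma_min ** (1.0 / rho)
    return [(hi + i / (num_steps - 1) * (lo - hi)) ** rho for i in range(num_steps)]


def lognormal_sigma(rng, p_mean=-1.2, p_std=1.2, sigma_min=1e-3, sigma_max=15.0):
    """Draw one training noise level with log(sigma) ~ N(p_mean, p_std^2)."""
    sigma = math.exp(rng.gauss(p_mean, p_std))
    # Clipping to the configured range is an assumption of this sketch.
    return min(max(sigma, sigma_min), sigma_max)


sigmas = power_law_sigmas(8)
print([round(s, 4) for s in sigmas])

rng = random.Random(0)
print([round(lognormal_sigma(rng), 4) for _ in range(5)])
```

Both schedules concentrate effort on small and intermediate noise levels, which is consistent with the symptoms listed above (blurry samples, noisy loss) being the cases where they are worth trying.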
SDE vs ODE sampling#
FMPE defaults to ODE (fast, deterministic).
NPSE defaults to SDE (stochastic, more robust). Switch to ODE if you need speed.
For SDE sampling:
- `corrector=None` (default): fast, good for most problems.
- `corrector="langevin"`: better quality when SDE samples look noisy.
- `steps=500` (default): increase to 1000 for higher quality.
```python
import torch
from sbi.inference import FMPE

# Toy training data: 2-D parameters with Gaussian observation noise.
theta = torch.randn(1000, 2)
x = theta + 0.5 * torch.randn_like(theta)
x_o = torch.tensor([0.0, 0.0])

trainer = FMPE()
trainer.append_simulations(theta, x).train()

# Use SDE sampling for FMPE (non-default).
posterior = trainer.build_posterior(sample_with="sde")
samples = posterior.sample((1000,), x=x_o, corrector="langevin", steps=500)
```
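The difference between ODE and SDE sampling can be seen without any neural network. The following is a minimal sketch, not sbi code: it assumes a 1-D Gaussian target with a known score, a VE noise schedule, and plain Euler and Euler-Maruyama steps. All names and constants are hypothetical.

```python
import math
import random
import statistics

M, S0 = 1.0, 0.5  # toy target N(M, S0^2), standing in for a posterior
SIGMA_MIN, SIGMA_MAX = 0.01, 10.0
LOG_RATIO = math.log(SIGMA_MAX / SIGMA_MIN)


def sigma(t):
    """VE noise std at time t in [0, 1]."""
    return SIGMA_MIN * (SIGMA_MAX / SIGMA_MIN) ** t


def score(x, t):
    """Exact score of the noised marginal N(M, S0^2 + sigma(t)^2)."""
    return -(x - M) / (S0**2 + sigma(t) ** 2)


def draw(x_init, steps, stochastic, rng):
    """Integrate from t=1 down to t=0: Euler (ODE) or Euler-Maruyama (SDE)."""
    dt = 1.0 / steps
    x = x_init
    for i in range(steps):
        t = 1.0 - i * dt
        g2 = 2.0 * sigma(t) ** 2 * LOG_RATIO  # g(t)^2 = d sigma(t)^2 / dt
        if stochastic:
            # Reverse-time SDE: full score drift plus fresh noise.
            x += g2 * score(x, t) * dt + math.sqrt(g2 * dt) * rng.gauss(0.0, 1.0)
        else:
            # Probability-flow ODE: half the score drift, no noise.
            x += 0.5 * g2 * score(x, t) * dt
    return x


rng = random.Random(0)
starts = [rng.gauss(0.0, SIGMA_MAX) for _ in range(500)]
ode_samples = [draw(x0, 200, False, rng) for x0 in starts]
sde_samples = [draw(x0, 200, True, rng) for x0 in starts]

for name, s in (("ODE", ode_samples), ("SDE", sde_samples)):
    print(f"{name}: mean={statistics.mean(s):.3f}  std={statistics.stdev(s):.3f}")
```

With the ODE, each starting point maps to exactly one sample, so all randomness comes from the initial draw. The SDE injects fresh noise at every step, which costs more network evaluations in practice but can correct errors accumulated earlier in the trajectory.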
IID method#
When you have multiple i.i.d. observations, use `iid_method` in `.sample()`:
- `"auto_gauss"` (recommended): auto-calibrates, accurate, moderate overhead.
- `"gauss"`: faster, good for simple problems or few observations.
- `"fnpe"`: fastest but approximate; use only for quick checks with fewer than 5 observations.
- `"jac_gauss"`: most accurate but slowest; use only when accuracy is critical.