How to choose abstraction levels in sbi

sbi offers flexibility ranging from simple, high-level workflows to full control over neural networks and sampling. This guide shows:

Four abstraction levels for controlling the density estimator (common to NPE and NLE)
Four additional levels of sampling control for NLE

We’ll use the same simple example throughout to keep things clear.

Setup#

First, let’s define a simple linear Gaussian simulator and generate data we’ll use for all examples:

import torch

from sbi.inference import NLE, NPE
from sbi.utils import BoxUniform


# Define a simple linear Gaussian simulator
def simulator(theta):
    """Linear Gaussian simulator with noise."""
    return theta + 1.0 + torch.randn_like(theta) * 0.1

# Define prior over 3 parameters
num_dim = 3
prior = BoxUniform(low=-2 * torch.ones(num_dim), high=2 * torch.ones(num_dim))

# Generate training data (used for all examples)
num_simulations = 2000
theta = prior.sample((num_simulations,))
x = simulator(theta)

# Generate a single observation for inference
theta_o = prior.sample((1,))
x_o = simulator(theta_o)

print(f"Generated {num_simulations} simulations for training")
print(f"Parameter shape: {theta.shape}, Data shape: {x.shape}")
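For this toy problem the true posterior is known in closed form, which gives a ground truth for checking the samples produced later in this guide. Given x_o, each parameter has a Gaussian posterior centered at x_o - 1 with standard deviation 0.1, truncated to the prior box [-2, 2]. The sketch below is plain Python; `reference_posterior` and `within_tolerance` are hypothetical helpers, not part of sbi, and they ignore the truncation, which only matters when x_o - 1 lies close to the prior boundary.

```python
def reference_posterior(x_o, shift=1.0, noise_std=0.1):
    """Per-dimension (mean, std) of the Gaussian posterior, ignoring truncation.

    Assumes the simulator x = theta + shift + noise_std * eps from the setup
    and a flat prior, so the posterior mean is simply x_o - shift.
    """
    return [(x - shift, noise_std) for x in x_o]


def within_tolerance(sample_means, x_o, num_std=3.0):
    """Check that each sample mean lies within num_std posterior stds of the reference."""
    reference = reference_posterior(x_o)
    return all(
        abs(m - mean) <= num_std * std
        for m, (mean, std) in zip(sample_means, reference)
    )


x_o_example = [1.5, 0.0, 2.0]  # made-up observation, for illustration only
print(reference_posterior(x_o_example))  # [(0.5, 0.1), (-1.0, 0.1), (1.0, 0.1)]
```

With the tensors from the setup, `x_o[0].tolist()` and `samples.mean(dim=0).tolist()` would supply the two inputs for any of the sample sets drawn below.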

Part 1: Density Estimator Abstraction Levels#

The following 4 levels apply to both NPE and NLE. They control how the neural density estimator is specified and constructed. We’ll demonstrate with NPE first.

Level 1: Trainer Classes (Recommended)#

Use case: Standard workflows, most common approach

The trainer classes provide the recommended interface with string-based customization.

# Level 1: Simple trainer class with string specification
inference = NPE(prior=prior, density_estimator="nsf")

# Train on the data
inference.append_simulations(theta, x)
posterior_net = inference.train()

# Build posterior and sample
posterior = inference.build_posterior()
samples_lvl1 = posterior.sample((1000,), x=x_o)

print("Level 1 complete - used NSF with default settings")

Key features:

Simple string specification: "nsf", "maf", "zuko_nsf", "mdn", etc.
Multi-round inference support
Automatic handling of training loops
Start here for most use cases

Level 2: Factory Functions#

Use case: Need specific architecture hyperparameters

Use factory functions like posterior_nn() when you need to tune the network architecture.

from sbi.neural_nets import posterior_nn

# Level 2: Factory function with custom hyperparameters
density_estimator = posterior_nn(
    model="maf",              # Masked Autoregressive Flow
    hidden_features=50,        # Customize hidden layer size
    num_transforms=5,          # Customize number of transform layers
)

# Pass to NPE (rest of workflow is the same)
inference = NPE(prior=prior, density_estimator=density_estimator)
inference.append_simulations(theta, x)
posterior_net = inference.train()

posterior = inference.build_posterior()
samples_lvl2 = posterior.sample((1000,), x=x_o)

print("Level 2 complete - used MAF with custom hyperparameters")

Key features:

Fine-grained control over hyperparameters
Can add embedding networks for high-dimensional data
Still benefits from trainer conveniences
For NPE: posterior_nn(), for NLE: likelihood_nn(), for NRE: classifier_nn()

Level 3: Direct Network Builders#

Use case: Custom neural network architecture with full parameter access

Use direct builder functions like build_nsf() for maximum control over network construction.

from functools import partial

from sbi.neural_nets.net_builders.flow import build_nsf

# Level 3: Direct builder with full parameter control
custom_builder = partial(
    build_nsf,
    hidden_features=60,
    num_transforms=3,
    num_bins=8,                # Number of spline bins
    tail_bound=3.0,            # Spline tail bound
)

# Pass to NPE (rest of workflow is the same)
inference = NPE(prior=prior, density_estimator=custom_builder)
inference.append_simulations(theta, x)
posterior_net = inference.train()

posterior = inference.build_posterior()
samples_lvl3 = posterior.sample((1000,), x=x_o)

print("Level 3 complete - used custom NSF configuration")

Key features:

Direct access to all builder parameters
Maximum flexibility for architecture design
Can implement fully custom architectures by subclassing DensityEstimator

Level 4: Custom Training Loops#

Use case: Custom training logic, loss functions, research applications

For complete control over the training process, implement custom training loops. This is covered in detail in advanced tutorial 18.

At this level, you:

Manually construct the density estimator
Define custom loss functions and regularization
Implement your own training loops with custom data loaders
Have full control over optimization, early stopping, etc.

When to use: Research on new methods, custom loss functions, specialized data augmentation.
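To give a feel for what Level 4 involves, here is a deliberately tiny stand-in written in plain Python. It is not how sbi trains flows: the "density estimator" is a Gaussian with fixed noise and a single learnable shift, the loss is the matching fixed-variance negative log-likelihood (up to constants), and all function names are hypothetical. What it does show is the shape of a hand-written loop: a custom loss, a manual update step, a held-out validation split, and early stopping.

```python
import random


def make_data(num_simulations, seed=0):
    """One-dimensional version of the toy simulator: x = theta + 1 + 0.1 * eps."""
    rng = random.Random(seed)
    thetas = [rng.uniform(-2.0, 2.0) for _ in range(num_simulations)]
    xs = [t + 1.0 + 0.1 * rng.gauss(0.0, 1.0) for t in thetas]
    return thetas, xs


def mean_loss(shift, data):
    """Custom loss: half the mean squared residual (Gaussian NLL up to constants)."""
    return sum(0.5 * (x - t - shift) ** 2 for t, x in data) / len(data)


def train(thetas, xs, lr=0.1, max_epochs=500, patience=10, val_fraction=0.1):
    """Gradient descent on `shift` with validation-based early stopping."""
    num_val = max(1, int(len(xs) * val_fraction))
    pairs = list(zip(thetas, xs))
    val, train_set = pairs[:num_val], pairs[num_val:]

    shift, best_shift, best_val = 0.0, 0.0, float("inf")
    epochs_without_improvement = 0
    for _ in range(max_epochs):
        # Gradient of the mean training loss with respect to `shift`.
        grad = -sum(x - t - shift for t, x in train_set) / len(train_set)
        shift -= lr * grad

        val_loss = mean_loss(shift, val)
        if val_loss < best_val:
            best_shift, best_val = shift, val_loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # early stopping: validation loss stopped improving
    return best_shift


thetas_1d, xs_1d = make_data(400)
print(f"learned shift: {train(thetas_1d, xs_1d):.3f}")
```

The learned shift should land close to the simulator's true offset of 1.0. In a real Level 4 workflow the scalar parameter would be replaced by a neural density estimator and the manual gradient by an optimizer.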

Part 2: NLE - Same Levels + Sampling Control#

Understanding the Difference#

NPE directly approximates the posterior \(p(\theta|x)\):

Sampling is straightforward: just sample from the neural network
No additional configuration typically needed

NLE approximates the likelihood \(p(x|\theta)\):

Must combine with prior using MCMC, VI, or rejection sampling to get posterior samples
This adds a second dimension of control: choosing and configuring the sampling method

Important: The 4 density estimator levels above work exactly the same for NLE - just use likelihood_nn() instead of posterior_nn() at Level 2.
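To make the NLE case concrete: the quantity handed to MCMC, VI, or rejection sampling is the unnormalized log-posterior, log p(x_o|θ) + log p(θ). The one-dimensional sketch below uses plain Python, and the known Gaussian likelihood of the toy simulator stands in for the trained likelihood network. The function names are illustrative and are not sbi API.

```python
import math


def log_prior(theta, low=-2.0, high=2.0):
    """Log density of a uniform prior on [low, high]; -inf outside the support."""
    if low <= theta <= high:
        return -math.log(high - low)
    return -math.inf


def log_likelihood(x, theta, shift=1.0, noise_std=0.1):
    """Gaussian log-likelihood; a stand-in for the trained likelihood network."""
    z = (x - theta - shift) / noise_std
    return -0.5 * z * z - math.log(noise_std) - 0.5 * math.log(2.0 * math.pi)


def log_potential(theta, x_o):
    """Unnormalized log-posterior: log p(x_o | theta) + log p(theta)."""
    return log_likelihood(x_o, theta) + log_prior(theta)


# The potential peaks at theta = x_o - 1 and is -inf outside the prior box.
print(log_potential(0.5, x_o=1.5), log_potential(0.0, x_o=1.5), log_potential(3.0, x_o=1.5))
```

A sampler only ever needs to evaluate this function, which is why the choice of sampling method is independent of how the likelihood network was built.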

NLE Density Estimator (Same 4 Levels)#

Quick example showing NLE uses the same abstraction levels:

# Level 1 with NLE - same pattern as NPE
inference_nle = NLE(prior=prior, density_estimator="nsf")
inference_nle.append_simulations(theta, x)
likelihood_net = inference_nle.train()

# Build posterior (defaults to MCMC)
posterior_nle = inference_nle.build_posterior()
samples_nle = posterior_nle.sample((1000,), x=x_o)

print("NLE Level 1 complete")
print("Default sampling method: MCMC with slice_np_vectorized")

Note: Levels 2-4 for the density estimator work identically:

Level 2: Use likelihood_nn() instead of posterior_nn()
Level 3: Use build_nsf() (same as NPE)
Level 4: Custom training (see tutorial 18)

Part 3: NLE Sampling Control#

NLE provides additional control over how posterior samples are generated. This is independent of the density estimator configuration above.

Four levels of sampling control:

Sampling Level 1: Default#

Use case: Starting point, works well for most problems

Just call build_posterior() with no arguments - uses slice sampling by default.

# Sampling Level 1: Use defaults
posterior = inference_nle.build_posterior()
samples = posterior.sample((1000,), x=x_o)

print("Sampling Level 1: Default MCMC (slice_np_vectorized)")

Default behavior: MCMC with the slice_np_vectorized method, 200 warmup steps, and 20 chains.

Sampling Level 2: Choose Method#

Use case: Different problem characteristics favor different sampling methods

Use the sample_with parameter to choose between MCMC, rejection sampling, VI, or importance sampling.

# Sampling Level 2: Choose sampling method
# Use rejection sampling instead of default MCMC (fast for few parameters)
posterior_rejection = inference_nle.build_posterior(sample_with="rejection")
samples_rejection = posterior_rejection.sample((1000,), x=x_o)

print("Sampling Level 2: Using rejection sampling instead of MCMC")

Available sampling methods:

"mcmc": Markov Chain Monte Carlo (default) - accurate but can be slow
"rejection": Rejection sampling - fast and accurate for few parameters (<3)
"vi": Variational inference - faster for many parameters (>10), may be less accurate
"importance": Importance sampling - useful for refining VI posteriors

Usage: Simply change sample_with="rejection" to sample_with="vi" or any other method.

See also: how_to_guide/09_sampler_interface.ipynb for detailed guidance on choosing sampling algorithms.
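The reason rejection sampling suits only a few parameters can be seen in a generic one-dimensional sketch. It proposes from the prior and accepts each draw with probability equal to the likelihood divided by its maximum. This is textbook rejection sampling on the toy problem, written in plain Python under the assumption of a known Gaussian likelihood; it is not sbi's implementation, and `rejection_sample` is a hypothetical name.

```python
import math
import random


def rejection_sample(x_o, num_samples, shift=1.0, noise_std=0.1,
                     low=-2.0, high=2.0, seed=0):
    """Draw posterior samples by proposing from the uniform prior."""
    rng = random.Random(seed)
    samples, num_proposed = [], 0
    while len(samples) < num_samples:
        theta = rng.uniform(low, high)  # proposal: a draw from the prior
        num_proposed += 1
        z = (x_o - theta - shift) / noise_std
        accept_prob = math.exp(-0.5 * z * z)  # likelihood relative to its maximum
        if rng.random() < accept_prob:
            samples.append(theta)
    return samples, len(samples) / num_proposed


samples_1d, acceptance_rate = rejection_sample(x_o=1.5, num_samples=500)
print(f"mean: {sum(samples_1d) / len(samples_1d):.3f}, acceptance: {acceptance_rate:.3f}")
```

In this setting the expected acceptance rate is roughly 6% for one parameter. If three independent parameters were sampled jointly the rates would multiply, dropping well below 0.1%, which is why the method becomes impractical as the dimension grows.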

Sampling Level 3: Configure Method Specifics#

Use case: Choose specific algorithms within a sampling method

Use mcmc_method or vi_method parameters to select specific algorithms.

# Sampling Level 3: Configure method specifics
# Use NUTS (No-U-Turn Sampler) instead of default slice sampling
posterior_nuts = inference_nle.build_posterior(
    sample_with="mcmc",
    mcmc_method="nuts_pyro"
)
samples_nuts = posterior_nuts.sample((1000,), x=x_o)

print("Sampling Level 3: Using NUTS from Pyro")

Available MCMC methods (use with mcmc_method=):

"slice_np_vectorized": Slice sampling (numpy, vectorized, default)
"slice_np": Slice sampling (numpy, sequential)
"nuts_pyro": No-U-Turn Sampler (requires pip install "sbi[pyro]")
"hmc_pyro": Hamiltonian Monte Carlo (requires pip install "sbi[pyro]")
"slice_pymc", "hmc_pymc", "nuts_pymc": PyMC samplers (require pip install "sbi[pymc]")

Available VI methods (use with vi_method=):

"rKL": Reverse KL divergence (mode-seeking, default)
"fKL": Forward KL divergence (mass-covering)
"IW": Importance weighted
"alpha": Alpha divergence

Usage: Change mcmc_method="nuts_pyro" to any other MCMC method, or use vi_method="fKL" when sample_with="vi".

Sampling Level 4: Fine-Tune Parameters#

Use case: Optimize sampling performance, troubleshoot convergence issues

Fine-tune sampling parameters using dictionaries or PosteriorParameters dataclasses.

# Sampling Level 4a: Using parameter dictionaries
posterior_tuned = inference_nle.build_posterior(
    sample_with="mcmc",
    mcmc_method="slice_np_vectorized",
    mcmc_parameters={
        "warmup_steps": 100,      # Burn-in samples to discard
        "num_chains": 4,          # Number of parallel chains
        "thin": 2,                # Thinning factor
        "num_workers": 2,         # CPU cores for parallelization
    }
)

print("Sampling Level 4a: Dictionary-based parameter tuning")

# Sampling Level 4b: Using PosteriorParameters (recommended)
from sbi.inference.posteriors import MCMCPosteriorParameters

mcmc_params = MCMCPosteriorParameters(
    method="nuts_pyro",
    warmup_steps=100,
    num_chains=4,
    init_strategy="sir",                               # Sequential Importance Resampling for init
    init_strategy_parameters={"num_candidate_samples": 1000},
    num_workers=2,
    mp_context="spawn"                                 # Multiprocessing context
)

posterior_advanced = inference_nle.build_posterior(
    posterior_parameters=mcmc_params
)

samples_advanced = posterior_advanced.sample((1000,), x=x_o)

print("Sampling Level 4b: PosteriorParameters with validation")
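The `init_strategy="sir"` option used above picks chain starting points by importance resampling. As a generic illustration of the idea (a sketch on the one-dimensional toy problem, not sbi's implementation; the function names are hypothetical), each chain draws a batch of candidates from the prior, weights them by the unnormalized posterior, and resamples one of them as its starting point.

```python
import math
import random


def log_potential(theta, x_o, shift=1.0, noise_std=0.1):
    """Unnormalized log-posterior of the 1D toy problem (flat prior on [-2, 2])."""
    z = (x_o - theta - shift) / noise_std
    return -0.5 * z * z


def sir_init(x_o, num_chains, num_candidate_samples=1000, seed=0):
    """Pick one starting point per chain by importance resampling from the prior."""
    rng = random.Random(seed)
    inits = []
    for _ in range(num_chains):
        candidates = [rng.uniform(-2.0, 2.0) for _ in range(num_candidate_samples)]
        log_w = [log_potential(c, x_o) for c in candidates]
        max_log_w = max(log_w)  # subtract the maximum for numerical stability
        weights = [math.exp(lw - max_log_w) for lw in log_w]
        inits.append(rng.choices(candidates, weights=weights, k=1)[0])
    return inits


print(sir_init(x_o=1.5, num_chains=4))
```

Starting points chosen this way already sit in the high-probability region, which usually shortens the warmup a chain needs compared with starting from an arbitrary prior draw.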

Key tuning parameters:

warmup_steps: Number of initial samples to discard (default: 200)
num_chains: Number of parallel chains (default: 20)
thin: Thinning factor - keep every nth sample (default: -1, auto)
init_strategy: How to initialize chains ("proposal", "sir", "resample")
num_workers: Number of CPU cores for parallelization
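The effect of `warmup_steps` and `thin` on the amount of MCMC work can be sketched with a little arithmetic. The helpers below are a generic illustration in plain Python and assume `thin >= 1`; they are hypothetical names, and sbi's internal bookkeeping may differ in detail.

```python
import math


def postprocess_chain(chain, warmup_steps, thin):
    """Drop the first `warmup_steps` draws, then keep every `thin`-th draw."""
    return chain[warmup_steps::thin]


def steps_per_chain(num_samples, num_chains, warmup_steps, thin):
    """MCMC steps each chain must run so that all chains together yield `num_samples`."""
    kept_per_chain = math.ceil(num_samples / num_chains)
    return warmup_steps + kept_per_chain * thin


# Settings from the Level 4a example: 100 warmup steps, 4 chains, thin=2.
print(steps_per_chain(num_samples=1000, num_chains=4, warmup_steps=100, thin=2))  # 600
print(postprocess_chain(list(range(10)), warmup_steps=4, thin=2))  # [4, 6, 8]
```

More chains reduce the number of retained draws each chain must produce, but every chain pays the warmup cost separately, so the total work does not shrink in proportion.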

Advantages of PosteriorParameters:

Type checking and validation
Better IDE autocomplete support
Clear documentation of available parameters

See also: how_to_guide/19_posterior_parameters.ipynb for complete details.

Decision Guides#

Guide 1: Which Density Estimator Level? (NPE and NLE)#

You want to…	Use Level	Example
Standard workflows with good defaults	1	`NPE(prior, density_estimator="nsf")`
Try different density estimator types	1	Switch `"nsf"`, `"maf"`, `"zuko_nsf"`
Tune network depth or width	2	`posterior_nn(hidden_features=100)`
Add embedding networks for images/timeseries	2	`posterior_nn(embedding_net=my_cnn)`
Access specialized flow parameters	3	`build_nsf(num_bins=16, tail_bound=5.0)`
Implement custom network architecture	3	Subclass `DensityEstimator`
Define custom loss functions or training	4	See advanced tutorial 18

Rule of thumb: Start with Level 1. Move to higher levels only when you need specific control.

Guide 2: Which Sampling Level? (NLE and NRE only)#

Situation	Sampling Level	Example
Starting out, need good defaults	1	`build_posterior()`
Very few parameters (<3)	2	`sample_with="rejection"`
Many parameters (>10), speed is critical	2	`sample_with="vi"`
Want to use NUTS or HMC	3	`mcmc_method="nuts_pyro"`
MCMC not converging, need more warmup	4	`mcmc_parameters={"warmup_steps": 500}`
Want type checking and validation	4	`MCMCPosteriorParameters(...)`
Troubleshooting sampling issues	4	Tune `num_chains`, `init_strategy`, etc.

Rule of thumb:

Start with default MCMC (Level 1)
If that is too slow, try rejection sampling (few parameters) or VI (many parameters) at Level 2
Use Levels 3 and 4 for performance tuning or troubleshooting
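The rule of thumb can be written down as a small lookup that returns keyword arguments for `build_posterior()`. This is a hypothetical convenience helper, not part of sbi; the thresholds (<3 and >10 parameters) are simply the ones quoted in this guide.

```python
def suggest_sampling_kwargs(num_params, too_slow=False, want_nuts=False):
    """Map the rule of thumb to keyword arguments for build_posterior()."""
    if want_nuts:
        # Sampling Level 3: pick a specific MCMC algorithm.
        return {"sample_with": "mcmc", "mcmc_method": "nuts_pyro"}
    if not too_slow:
        # Sampling Level 1: keep the defaults.
        return {}
    if num_params < 3:
        return {"sample_with": "rejection"}
    if num_params > 10:
        return {"sample_with": "vi"}
    # In between, the guide offers no shortcut: stay with MCMC and tune it at Level 4.
    return {"sample_with": "mcmc"}


print(suggest_sampling_kwargs(num_params=2, too_slow=True))
print(suggest_sampling_kwargs(num_params=30, too_slow=True))
```

The result can be unpacked directly, for example `inference_nle.build_posterior(**suggest_sampling_kwargs(num_dim, too_slow=True))`.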

Summary#

All Methods (NPE, NLE, NRE)#

4 Abstraction Levels for Density Estimator:

Level 1: Trainer classes with strings → NPE(prior, density_estimator="nsf")
Level 2: Factory functions → posterior_nn(model="maf", hidden_features=50)
Level 3: Direct builders → build_nsf(num_bins=8, tail_bound=3.0)
Level 4: Custom training → Full control (see tutorial 18)

NPE Sampling#

Direct sampling from neural network
No additional configuration needed
Can optionally use MCMC or VI for more control

NLE and NRE Sampling (Additional Dimension)#

4 Sampling Control Levels:

Level 1: Default → build_posterior() (uses slice_np_vectorized)
Level 2: Choose method → sample_with="mcmc"/"vi"/"rejection"/"importance"
Level 3: Configure algorithm → mcmc_method="nuts_pyro", vi_method="fKL"
Level 4: Fine-tune parameters → mcmc_parameters={...} or MCMCPosteriorParameters(...)

General Principle#

Start simple, add complexity only when needed:

Begin with Level 1 for density estimator
For NLE, begin with default sampling (Level 1)
Move to higher levels only when you need specific control or encounter issues
Both dimensions are independent - you can use Level 1 density estimator with Level 4 sampling, or vice versa