How to choose abstraction levels in sbi#

sbi offers flexibility ranging from simple, high-level workflows to full control over neural networks and sampling. This guide shows:

  1. Four abstraction levels for controlling the density estimator (common to NPE and NLE)

  2. Additional sampling control for NLE (4 more levels)

We’ll use the same simple example throughout to keep things clear.

Setup#

First, let’s define a simple linear Gaussian simulator and generate data we’ll use for all examples:

import torch

from sbi.inference import NLE, NPE
from sbi.utils import BoxUniform


# Define a simple linear Gaussian simulator
def simulator(theta):
    """Linear Gaussian simulator with noise."""
    return theta + 1.0 + torch.randn_like(theta) * 0.1

# Define prior over 3 parameters
num_dim = 3
prior = BoxUniform(low=-2 * torch.ones(num_dim), high=2 * torch.ones(num_dim))

# Generate training data (used for all examples)
num_simulations = 2000
theta = prior.sample((num_simulations,))
x = simulator(theta)

# Generate a single observation for inference
theta_o = prior.sample((1,))
x_o = simulator(theta_o)

print(f"Generated {num_simulations} simulations for training")
print(f"Parameter shape: {theta.shape}, Data shape: {x.shape}")

Part 1: Density Estimator Abstraction Levels#

The following 4 levels apply to both NPE and NLE. They control how the neural density estimator is specified and constructed. We’ll demonstrate with NPE first.

Level 2: Factory Functions#

Use case: Need specific architecture hyperparameters

Use factory functions like posterior_nn() when you need to tune the network architecture.

from sbi.neural_nets import posterior_nn

# Level 2: Factory function with custom hyperparameters
density_estimator = posterior_nn(
    model="maf",              # Masked Autoregressive Flow
    hidden_features=50,        # Customize hidden layer size
    num_transforms=5,          # Customize number of transform layers
)

# Pass to NPE (rest of workflow is the same)
inference = NPE(prior=prior, density_estimator=density_estimator)
inference.append_simulations(theta, x)
posterior_net = inference.train()

posterior = inference.build_posterior()
samples_lvl2 = posterior.sample((1000,), x=x_o)

print("Level 2 complete - used MAF with custom hyperparameters")

Key features:

  • Fine-grained control over hyperparameters

  • Can add embedding networks for high-dimensional data

  • Still benefits from trainer conveniences

  • For NPE: posterior_nn(), for NLE: likelihood_nn(), for NRE: classifier_nn()

Level 3: Direct Network Builders#

Use case: Custom neural network architecture with full parameter access

Use direct builder functions like build_nsf() for maximum control over network construction.

from functools import partial

from sbi.neural_nets.net_builders.flow import build_nsf

# Level 3: Direct builder with full parameter control
custom_builder = partial(
    build_nsf,
    hidden_features=60,
    num_transforms=3,
    num_bins=8,                # Number of spline bins
    tail_bound=3.0,            # Spline tail bound
)

# Pass to NPE (rest of workflow is the same)
inference = NPE(prior=prior, density_estimator=custom_builder)
inference.append_simulations(theta, x)
posterior_net = inference.train()

posterior = inference.build_posterior()
samples_lvl3 = posterior.sample((1000,), x=x_o)

print("Level 3 complete - used custom NSF configuration")

Key features:

  • Direct access to all builder parameters

  • Maximum flexibility for architecture design

  • Can implement fully custom architectures by subclassing DensityEstimator
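`functools.partial` is used at Level 3 because the trainer, as far as we understand the interface, calls the builder itself with a batch of data (for shape inference), so hyperparameters have to be bound beforehand. The sketch below shows only that binding mechanism. `build_toy_estimator` is a hypothetical stand-in, not part of sbi, and it returns a plain dictionary where a real builder would return a network.

```python
from functools import partial


def build_toy_estimator(batch_x, batch_y, hidden_features=50, num_transforms=5):
    """Hypothetical stand-in for a builder such as build_nsf (not sbi code).

    It infers dimensions from the example batches and returns a config dict
    where a real builder would return a neural network.
    """
    return {
        "input_dim": len(batch_x[0]),
        "condition_dim": len(batch_y[0]),
        "hidden_features": hidden_features,
        "num_transforms": num_transforms,
    }


# Bind hyperparameters now; the two data arguments stay open.
custom_builder = partial(build_toy_estimator, hidden_features=60, num_transforms=3)

# Later, the caller supplies only the example batches.
batch_theta = [[0.1, -0.3, 1.2], [0.5, 0.0, -1.0]]
batch_x = [[1.1, 0.8, 2.1], [1.4, 1.0, 0.1]]
config = custom_builder(batch_theta, batch_x)
print(config)
```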

Level 4: Custom Training Loops#

Use case: Custom training logic, loss functions, research applications

For complete control over the training process, implement custom training loops. This is covered in detail in advanced tutorial 18.

At this level, you:

  • Manually construct the density estimator

  • Define custom loss functions and regularization

  • Implement your own training loops with custom data loaders

  • Have full control over optimization, early stopping, etc.

When to use: Research on new methods, custom loss functions, specialized data augmentation.

Part 2: NLE - Same Levels + Sampling Control#

Understanding the Difference#

NPE directly approximates the posterior \(p(\theta|x)\):

  • Sampling is straightforward: just sample from the neural network

  • No additional configuration typically needed

NLE approximates the likelihood \(p(x|\theta)\):

  • Must combine with prior using MCMC, VI, or rejection sampling to get posterior samples

  • This adds a second dimension of control: choosing and configuring the sampling method
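To make the extra step for NLE concrete, here is a one-dimensional sketch of rejection sampling that combines a likelihood with the prior. It uses the closed-form Gaussian likelihood of this guide's simulator as a stand-in for a trained likelihood network, and plain Python instead of sbi. The function names and the observation value 0.5 are illustrative assumptions.

```python
import math
import random


def gaussian_likelihood(x, theta, noise_std=0.1):
    """Likelihood p(x | theta) for the simulator x = theta + 1 + noise."""
    z = (x - (theta + 1.0)) / noise_std
    return math.exp(-0.5 * z * z) / (noise_std * math.sqrt(2.0 * math.pi))


def rejection_sample_posterior(x_o, num_samples, low=-2.0, high=2.0, seed=0):
    """Draw posterior samples by proposing from a uniform prior on [low, high]."""
    rng = random.Random(seed)
    # The likelihood is maximal where theta + 1 == x_o, which gives an upper bound.
    max_likelihood = gaussian_likelihood(x_o, x_o - 1.0)
    samples = []
    while len(samples) < num_samples:
        theta = rng.uniform(low, high)  # propose from the prior
        accept_prob = gaussian_likelihood(x_o, theta) / max_likelihood
        if rng.random() < accept_prob:
            samples.append(theta)
    return samples


samples = rejection_sample_posterior(x_o=0.5, num_samples=2000)
posterior_mean = sum(samples) / len(samples)
posterior_std = math.sqrt(sum((s - posterior_mean) ** 2 for s in samples) / len(samples))
print(round(posterior_mean, 2), round(posterior_std, 2))
```

Because the prior is uniform, accepting each prior draw with probability proportional to its likelihood yields posterior samples, here concentrated around `x_o - 1`. Most proposals are rejected, which is why this approach is only practical for few parameters.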

Important: The 4 density estimator levels above work exactly the same for NLE - just use likelihood_nn() instead of posterior_nn() at Level 2.

NLE Density Estimator (Same 4 Levels)#

Quick example showing NLE uses the same abstraction levels:

# Level 1 with NLE - same pattern as NPE
inference_nle = NLE(prior=prior, density_estimator="nsf")
inference_nle.append_simulations(theta, x)
likelihood_net = inference_nle.train()

# Build posterior (defaults to MCMC)
posterior_nle = inference_nle.build_posterior()
samples_nle = posterior_nle.sample((1000,), x=x_o)

print("NLE Level 1 complete")
print("Default sampling method: MCMC with slice_np_vectorized")

Note: Levels 2-4 for the density estimator work identically:

  • Level 2: Use likelihood_nn() instead of posterior_nn()

  • Level 3: Use build_nsf() (same as NPE)

  • Level 4: Custom training (see tutorial 18)

Part 3: NLE Sampling Control#

NLE provides additional control over how posterior samples are generated. This is independent of the density estimator configuration above.

Four levels of sampling control:

Sampling Level 1: Default#

Use case: Starting point, works well for most problems

Call build_posterior() with no arguments; it then uses MCMC with slice sampling by default.

# Sampling Level 1: Use defaults
posterior = inference_nle.build_posterior()
samples = posterior.sample((1000,), x=x_o)

print("Sampling Level 1: Default MCMC (slice_np_vectorized)")

Default behavior: MCMC with slice_np_vectorized method, 200 warmup steps, 20 chains.
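To get a feel for what these defaults cost, the following sketch estimates how many steps each chain runs for a requested number of samples. The formula is generic MCMC accounting and an assumption on our part, not sbi's exact internal schedule; `mcmc_steps_per_chain` is a hypothetical helper.

```python
import math


def mcmc_steps_per_chain(num_samples, num_chains, warmup_steps, thin=1):
    """Rough number of MCMC steps each chain must run.

    Generic accounting (an assumption, not sbi's exact internal schedule):
    each chain discards `warmup_steps`, then keeps every `thin`-th step until
    all chains together have produced `num_samples`.
    """
    kept_per_chain = math.ceil(num_samples / num_chains)
    return warmup_steps + kept_per_chain * thin


# Defaults quoted above (200 warmup steps, 20 chains), no thinning assumed.
default_steps = mcmc_steps_per_chain(1000, num_chains=20, warmup_steps=200)

# A hypothetical alternative: fewer chains, shorter warmup, thinning of 2.
tuned_steps = mcmc_steps_per_chain(1000, num_chains=4, warmup_steps=100, thin=2)

print(default_steps, tuned_steps)
```

Under this accounting, warmup makes up most of each chain's work when only 1000 samples are requested with the defaults, which is one reason to revisit these values if sampling is slow.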

Sampling Level 2: Choose Method#

Use case: Different problem characteristics favor different sampling methods

Use the sample_with parameter to choose between MCMC, rejection sampling, VI, or importance sampling.

# Sampling Level 2: Choose sampling method
# Use rejection sampling instead of default MCMC (fast for few parameters)
posterior_rejection = inference_nle.build_posterior(sample_with="rejection")
samples_rejection = posterior_rejection.sample((1000,), x=x_o)

print("Sampling Level 2: Using rejection sampling instead of MCMC")

Available sampling methods:

  • "mcmc": Markov Chain Monte Carlo (default) - accurate but can be slow

  • "rejection": Rejection sampling - fast and accurate for few parameters (<3)

  • "vi": Variational inference - faster for many parameters (>10), may be less accurate

  • "importance": Importance sampling - useful for refining VI posteriors

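Usage: Change sample_with="rejection" to sample_with="vi" or any other method. Note that a VI posterior is not ready to sample right after build_posterior(): it first has to be fitted to the observation (set it with set_default_x(x_o), then call train()).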

See also: how_to_guide/09_sampler_interface.ipynb for detailed guidance on choosing sampling algorithms.

Sampling Level 3: Configure Method Specifics#

Use case: Choose specific algorithms within a sampling method

Use mcmc_method or vi_method parameters to select specific algorithms.

# Sampling Level 3: Configure method specifics
# Use NUTS (No-U-Turn Sampler) instead of default slice sampling
posterior_nuts = inference_nle.build_posterior(
    sample_with="mcmc",
    mcmc_method="nuts_pyro"
)
samples_nuts = posterior_nuts.sample((1000,), x=x_o)

print("Sampling Level 3: Using NUTS from Pyro")

Available MCMC methods (use with mcmc_method=):

  • "slice_np_vectorized": Slice sampling (numpy, vectorized, default)

  • "slice_np": Slice sampling (numpy, sequential)

  • "nuts_pyro": No-U-Turn Sampler (requires pip install "sbi[pyro]")

  • "hmc_pyro": Hamiltonian Monte Carlo (requires pip install "sbi[pyro]")

  • "slice_pymc", "hmc_pymc", "nuts_pymc": PyMC samplers (require pip install "sbi[pymc]")

Available VI methods (use with vi_method=):

  • "rKL": Reverse KL divergence (mode-seeking, default)

  • "fKL": Forward KL divergence (mass-covering)

  • "IW": Importance weighted

  • "alpha": Alpha divergence

Usage: Change mcmc_method="nuts_pyro" to any other MCMC method, or use vi_method="fKL" when sample_with="vi".

Sampling Level 4: Fine-Tune Parameters#

Use case: Optimize sampling performance, troubleshoot convergence issues

Fine-tune sampling parameters using dictionaries or PosteriorParameters dataclasses.

# Sampling Level 4a: Using parameter dictionaries
posterior_tuned = inference_nle.build_posterior(
    sample_with="mcmc",
    mcmc_method="slice_np_vectorized",
    mcmc_parameters={
        "warmup_steps": 100,      # Burn-in samples to discard
        "num_chains": 4,          # Number of parallel chains
        "thin": 2,                # Thinning factor
        "num_workers": 2,         # CPU cores for parallelization
    }
)

print("Sampling Level 4a: Dictionary-based parameter tuning")

# Sampling Level 4b: Using PosteriorParameters (recommended)
from sbi.inference.posteriors import MCMCPosteriorParameters

mcmc_params = MCMCPosteriorParameters(
    method="nuts_pyro",
    warmup_steps=100,
    num_chains=4,
    init_strategy="sir",                               # Sequential Importance Resampling for init
    init_strategy_parameters={"num_candidate_samples": 1000},
    num_workers=2,
    mp_context="spawn"                                 # Multiprocessing context
)

posterior_advanced = inference_nle.build_posterior(
    posterior_parameters=mcmc_params
)

samples_advanced = posterior_advanced.sample((1000,), x=x_o)

print("Sampling Level 4b: PosteriorParameters with validation")

Key tuning parameters:

  • warmup_steps: Number of initial samples to discard (default: 200)

  • num_chains: Number of parallel chains (default: 20)

  • thin: Thinning factor - keep every nth sample (default: -1, auto)

  • init_strategy: How to initialize chains ("proposal", "sir", "resample")

  • num_workers: Number of CPU cores for parallelization
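The `"sir"` option for init_strategy is easiest to understand with a small example. The sketch below implements generic importance resampling for choosing chain starting points in one dimension, using the closed-form likelihood of this guide's simulator. It is plain Python under our own assumptions (function names, observation value 0.5, uniform proposal), not sbi's implementation.

```python
import math
import random


def log_potential(theta, x_o, noise_std=0.1, low=-2.0, high=2.0):
    """Unnormalized log posterior: Gaussian log-likelihood plus uniform log-prior."""
    if not low <= theta <= high:
        return -math.inf
    z = (x_o - (theta + 1.0)) / noise_std
    return -0.5 * z * z - math.log(high - low)


def sir_init(x_o, num_chains, num_candidate_samples=1000, low=-2.0, high=2.0, seed=0):
    """Pick chain starting points by importance resampling of prior draws."""
    rng = random.Random(seed)
    candidates = [rng.uniform(low, high) for _ in range(num_candidate_samples)]
    log_proposal = -math.log(high - low)  # density of the uniform proposal
    log_weights = [
        log_potential(theta, x_o, low=low, high=high) - log_proposal
        for theta in candidates
    ]
    max_log_weight = max(log_weights)  # subtract the max for numerical stability
    weights = [math.exp(lw - max_log_weight) for lw in log_weights]
    # Resample with probability proportional to the importance weights.
    return rng.choices(candidates, weights=weights, k=num_chains)


inits = sir_init(x_o=0.5, num_chains=4)
print(inits)
```

Compared with starting chains at arbitrary prior draws, the resampled starting points already lie in the high-probability region (here near `x_o - 1`), so fewer warmup steps are spent reaching it.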

Advantages of PosteriorParameters:

  • Type checking and validation

  • Better IDE autocomplete support

  • Clear documentation of available parameters

See also: how_to_guide/19_posterior_parameters.ipynb for complete details.

Decision Guides#

Guide 1: Which Density Estimator Level? (NPE and NLE)#

| You want to…                                 | Use Level | Example                                |
|----------------------------------------------|-----------|----------------------------------------|
| Standard workflows with good defaults        | 1         | NPE(prior, density_estimator="nsf")    |
| Try different density estimator types        | 1         | Switch "nsf", "maf", "zuko_nsf"        |
| Tune network depth or width                  | 2         | posterior_nn(hidden_features=100)      |
| Add embedding networks for images/timeseries | 2         | posterior_nn(embedding_net=my_cnn)     |
| Access specialized flow parameters           | 3         | build_nsf(num_bins=16, tail_bound=5.0) |
| Implement custom network architecture        | 3         | Subclass DensityEstimator              |
| Define custom loss functions or training     | 4         | See advanced tutorial 18               |

Rule of thumb: Start with Level 1. Move to higher levels only when you need specific control.

Guide 2: Which Sampling Level? (NLE and NRE only)#

| Situation                                | Sampling Level | Example                               |
|------------------------------------------|----------------|---------------------------------------|
| Starting out, need good defaults         | 1              | build_posterior()                     |
| Very few parameters (<3)                 | 2              | sample_with="rejection"               |
| Many parameters (>10), speed is critical | 2              | sample_with="vi"                      |
| Want to use NUTS or HMC                  | 3              | mcmc_method="nuts_pyro"               |
| MCMC not converging, need more warmup    | 4              | mcmc_parameters={"warmup_steps": 500} |
| Want type checking and validation        | 4              | MCMCPosteriorParameters(...)          |
| Troubleshooting sampling issues          | 4              | Tune num_chains, init_strategy, etc.  |

Rule of thumb:

  • Start with default MCMC (Level 1)

  • If too slow, try rejection (few params) or VI (many params) at Level 2

  • Use Levels 3-4 for performance tuning or troubleshooting
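The rule of thumb above can be written down as a small lookup. This is only a sketch: `suggest_sample_with` is a hypothetical helper, the thresholds are the ones quoted in this guide, and the behavior for 3 to 10 parameters (stay with MCMC and tune it) is our assumption, since the guide gives no explicit recommendation for that range.

```python
def suggest_sample_with(num_parameters, default_mcmc_too_slow=False):
    """Suggest a value for `sample_with` following the guide's rule of thumb."""
    if not default_mcmc_too_slow:
        return "mcmc"  # Level 1: keep the default
    if num_parameters < 3:
        return "rejection"  # few parameters
    if num_parameters > 10:
        return "vi"  # many parameters
    # Assumption: in between, stay with MCMC and tune it (Levels 3-4).
    return "mcmc"


print(suggest_sample_with(2, default_mcmc_too_slow=True))
print(suggest_sample_with(25, default_mcmc_too_slow=True))
```

The returned string can be passed as `build_posterior(sample_with=...)`.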

Summary#

All Methods (NPE, NLE, NRE)#

4 Abstraction Levels for Density Estimator:

  • Level 1: Trainer classes with strings → NPE(prior, density_estimator="nsf")

  • Level 2: Factory functions → posterior_nn(model="maf", hidden_features=50)

  • Level 3: Direct builders → build_nsf(num_bins=8, tail_bound=3.0)

  • Level 4: Custom training → Full control (see tutorial 18)

NPE Sampling#

  • Direct sampling from neural network

  • No additional configuration needed

  • MCMC or VI can optionally be used for more control

NLE and NRE Sampling (Additional Dimension)#

4 Sampling Control Levels:

  • Level 1: Default → build_posterior() (uses slice_np_vectorized)

  • Level 2: Choose method → sample_with="mcmc"/"vi"/"rejection"/"importance"

  • Level 3: Configure algorithm → mcmc_method="nuts_pyro", vi_method="fKL"

  • Level 4: Fine-tune parameters → mcmc_parameters={...} or MCMCPosteriorParameters(...)

General Principle#

Start simple, add complexity only when needed:

  1. Begin with Level 1 for density estimator

  2. For NLE and NRE, begin with default sampling (Level 1)

  3. Move to higher levels only when you need specific control or encounter issues

  4. Both dimensions are independent - you can use Level 1 density estimator with Level 4 sampling, or vice versa