NPE_A

class NPE_A(prior=None, density_estimator='mdn_snpe_a', num_components=10, device='cpu', logging_level='WARNING', summary_writer=None, tracker=None, show_progress_bars=True)[source]#

Bases: PosteriorEstimatorTrainer

Neural Posterior Estimation algorithm as in Papamakarios et al. (2016) [1].

[1] *Fast epsilon-free Inference of Simulation Models with Bayesian Conditional Density Estimation*, Papamakarios et al., NeurIPS 2016. https://arxiv.org/abs/1605.06376

Like all NPE methods, this method trains a deep neural density estimator to directly approximate the posterior. As in all other NPE methods, the density estimator is trained with a maximum-likelihood loss in the first round.

This class implements NPE-A (called SNPE-A in its sequential, multi-round form). NPE-A keeps using the maximum-likelihood loss in every round. Because later rounds draw parameters from a proposal rather than from the prior, training converges to the proposal posterior instead of the true posterior. To correct for this, SNPE-A applies a post-hoc correction after training. This correction is performed analytically and requires Mixture of Gaussians (MoG) density estimators.

Note

In multi-round SNPE-A, the number of MoG components grows multiplicatively with each round: if the proposal has L components and the density estimator has K components, the corrected posterior has L×K components. For many rounds, consider using SNPE-C (APT) instead, which handles multi-round inference more efficiently.
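The growth described in this note can be made concrete with a little arithmetic. The sketch below assumes that the first-round proposal is the prior, counted as a single component, and that each later proposal is the previous round's corrected posterior. The helper name `corrected_component_counts` is hypothetical and not part of sbi.

```python
def corrected_component_counts(num_components: int, num_rounds: int) -> list:
    """Number of MoG components in the corrected posterior after each round.

    Assumes the first-round proposal is the prior (counted as one component)
    and that every later proposal is the previous corrected posterior.
    """
    counts = []
    proposal_components = 1  # L: components of the current proposal
    for _ in range(num_rounds):
        # The corrected posterior has L x K components.
        corrected = proposal_components * num_components
        counts.append(corrected)
        proposal_components = corrected
    return counts


print(corrected_component_counts(num_components=5, num_rounds=5))
# [5, 25, 125, 625, 3125]
```

Under these assumptions, five rounds with `num_components=5` already yield several thousand components, which is why the note recommends a different method for many rounds.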

Example:#

import torch
from sbi.inference import NPE_A
from sbi.utils import BoxUniform

# 1. Setup simulator, prior, and observation
prior = BoxUniform(low=torch.zeros(3), high=torch.ones(3))
x_o = torch.randn(1, 3)  # Observed data

def simulator(theta):
    return theta + torch.randn_like(theta) * 0.1

# 2. Multi-round inference
inference = NPE_A(prior=prior, num_components=5)
proposal = prior

for round_idx in range(5):
    theta = proposal.sample((100,))
    x = simulator(theta)
    # Pass the proposal so that later rounds are not treated as prior samples.
    density_estimator = inference.append_simulations(
        theta, x, proposal=proposal
    ).train()
    posterior = inference.build_posterior(density_estimator)
    proposal = posterior.set_default_x(x_o)

# 3. Sample from final posterior
samples = posterior.sample((1000,), x=x_o)

train(training_batch_size=200, learning_rate=0.0005, validation_fraction=0.1, stop_after_epochs=20, max_num_epochs=2147483647, clip_max_norm=5.0, calibration_kernel=None, resume_training=False, retrain_from_scratch=False, show_train_summary=False, dataloader_kwargs=None)[source]#

Return density estimator that approximates the proposal posterior.

[1] _Fast epsilon-free Inference of Simulation Models with Bayesian Conditional Density Estimation_, Papamakarios et al., NeurIPS 2016, https://arxiv.org/abs/1605.06376.

Training is performed with maximum likelihood on samples from the latest round, which leads the algorithm to converge to the proposal posterior.

Parameters:

training_batch_size (int) – Training batch size.
learning_rate (float) – Learning rate for Adam optimizer.
validation_fraction (float) – The fraction of data to use for validation.
stop_after_epochs (int) – The number of epochs to wait for improvement on the validation set before terminating training.
max_num_epochs (int) – Maximum number of epochs to run. If reached, we stop training even when the validation loss is still decreasing. Otherwise, we train until validation loss increases (see also stop_after_epochs).
clip_max_norm (float | None) – Value at which to clip the total gradient norm in order to prevent exploding gradients. Use None for no clipping.
calibration_kernel (Callable | None) – A function to calibrate the loss with respect to the simulations x. See Lueckmann, Gonçalves et al., NeurIPS 2017.
resume_training (bool) – Can be used in case training time is limited, e.g. on a cluster. If True, the split between train and validation set, the optimizer, the number of epochs, and the best validation log-prob will be restored from the last time .train() was called.
retrain_from_scratch (bool) – Whether to retrain the conditional density estimator for the posterior from scratch each round. Not supported for SNPE-A.
show_train_summary (bool) – Whether to print the number of epochs, the validation loss, and the leakage after training.
dataloader_kwargs (Dict | None) – Additional or updated kwargs to be passed to the training and validation dataloaders (e.g., a collate_fn).
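The interplay between stop_after_epochs and max_num_epochs can be sketched as a patience-based early-stopping loop. This illustrates the documented behaviour and is not sbi's internal code: `epochs_trained` and the list of validation losses are hypothetical stand-ins for a real training loop.

```python
def epochs_trained(val_losses, stop_after_epochs=20, max_num_epochs=2**31 - 1):
    """Return how many epochs run before training stops.

    `val_losses` is a hypothetical sequence of per-epoch validation losses.
    """
    best_loss = float("inf")
    epochs_since_improvement = 0
    epoch = 0
    for loss in val_losses:
        if epoch >= max_num_epochs:
            break  # hard cap, even if the loss is still improving
        epoch += 1
        if loss < best_loss:
            best_loss = loss
            epochs_since_improvement = 0
        else:
            epochs_since_improvement += 1
        if epochs_since_improvement >= stop_after_epochs:
            break  # no improvement for `stop_after_epochs` epochs
    return epoch


losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9]
print(epochs_trained(losses, stop_after_epochs=3))  # 6
print(epochs_trained(losses, max_num_epochs=2))     # 2
```

In the first call the best loss (0.7) is reached in epoch 3 and the next three epochs do not improve on it, so training stops after epoch 6. In the second call the epoch cap ends training while the loss is still decreasing.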

Returns:

Density estimator that approximates the proposal posterior, which coincides with \(p(\theta|x)\) when all parameters were sampled from the prior.

Return type:

ConditionalDensityEstimator

build_posterior(density_estimator=None, prior=None, sample_with='direct', **kwargs)[source]#

Build posterior from the neural density estimator.

Returns an NPE_A_Posterior that applies the SNPE-A correction formula:

p(θ|x) ∝ q(θ|x) × prior(θ) / proposal(θ)
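For intuition, consider the simplest case of this correction: one-dimensional Gaussians for both q(θ|x) and the proposal, and a uniform prior that is constant over the region of interest. Dividing two Gaussian densities subtracts their precisions and their precision-weighted means. The sketch below works this out in plain Python. `divide_gaussians` is a hypothetical helper and not part of sbi; the mixture case is more involved, since the ratio has to be formed for pairs of components.

```python
def divide_gaussians(mean_q, var_q, mean_prop, var_prop):
    """Mean and variance of the Gaussian proportional to q(theta) / proposal(theta).

    Assumes a flat prior, so the prior factor only rescales the density.
    """
    precision = 1.0 / var_q - 1.0 / var_prop
    if precision <= 0:
        # The ratio is only normalizable if q is narrower than the proposal.
        raise ValueError("q must be narrower than the proposal")
    mean = (mean_q / var_q - mean_prop / var_prop) / precision
    return mean, 1.0 / precision


mean, var = divide_gaussians(mean_q=1.0, var_q=0.5, mean_prop=0.0, var_prop=2.0)
print(mean, var)  # approximately 1.333 and 0.667
```

The corrected Gaussian is shifted away from the proposal mean and is wider than q, which undoes the narrowing that sampling from a proposal introduces.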

Note

NPE_A only supports sample_with=“direct”. The corrected posterior is a Mixture of Gaussians (MoG), which can be sampled directly and efficiently. MCMC, VI, rejection, and importance sampling methods do not provide benefits over direct MoG sampling and are therefore not supported.

Parameters:

density_estimator (ConditionalDensityEstimator | None) – The density estimator that the posterior is based on. If None, use the latest neural density estimator that was trained.
prior (Distribution | None) – Prior distribution.
sample_with (Literal['direct']) – Must be “direct”. Other sampling methods are not supported.
**kwargs – Additional arguments passed to NPE_A_Posterior.

Returns:

NPE_A_Posterior with the SNPE-A correction applied.

Raises:

ValueError – If sample_with is not “direct”.

Return type:

NPE_A_Posterior

append_simulations(theta, x, proposal=None, exclude_invalid_x=None, data_device=None)#

Store parameters and simulation outputs to use them for later training.

Data are stored as entries in lists for each type of variable (parameter/data).

Stores \(\theta\), \(x\), prior_masks (indicating if simulations are coming from the prior or not) and an index indicating which round the batch of simulations came from.

Parameters:

theta (Tensor) – Parameter sets.
x (Tensor) – Simulation outputs.
proposal (DirectPosterior | None) – The distribution that the parameters \(\theta\) were sampled from. Pass None if the parameters were sampled from the prior. If not None, it will trigger a different loss function.
exclude_invalid_x (bool | None) – Whether invalid simulations are discarded during training. For single-round SNPE, it is fine to discard invalid simulations, but for multi-round SNPE (atomic), discarding invalid simulations gives systematically wrong results. If None, it will be True in the first round and False in later rounds.
data_device (str | None) – Where to store the data. By default, data is stored on the same device on which training happens. When training on a large dataset on a GPU with limited VRAM, set this to ‘cpu’ to store the data in system memory instead.

Returns:

NeuralInference object (returned so that this function is chainable).

Return type:

Self
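The bookkeeping described for this method can be pictured with a toy container. This is a simplified sketch, not sbi's implementation: `ToySimulationStore` and its attribute names are hypothetical, plain lists stand in for tensors, the prior mask is reduced to one boolean per batch, and the round-numbering rule is an assumption.

```python
class ToySimulationStore:
    """Minimal stand-in for per-round storage of simulations."""

    def __init__(self):
        self.theta_rounds = []  # one entry per call to append_simulations
        self.x_rounds = []
        self.from_prior = []    # True if the batch was sampled from the prior
        self.round_index = []   # round that each batch belongs to

    def append_simulations(self, theta, x, proposal=None):
        if len(theta) != len(x):
            raise ValueError("theta and x must have the same number of rows")
        # Assumption: a batch without a proposal belongs to round 0; any
        # other batch opens a new round after the latest one stored so far.
        if proposal is None:
            current_round = 0
        else:
            current_round = max(self.round_index, default=-1) + 1
        self.theta_rounds.append(list(theta))
        self.x_rounds.append(list(x))
        self.from_prior.append(proposal is None)
        self.round_index.append(current_round)
        return self  # returning self makes the call chainable

    def get_simulations(self, starting_round=0):
        theta, x = [], []
        for t, xs, r in zip(self.theta_rounds, self.x_rounds, self.round_index):
            if r >= starting_round:
                theta.extend(t)
                x.extend(xs)
        return theta, x


store = ToySimulationStore()
store.append_simulations([[0.1], [0.2]], [[1.0], [2.0]]).append_simulations(
    [[0.3]], [[3.0]], proposal="round-1 posterior"
)
print(store.get_simulations(starting_round=1))  # ([[0.3]], [[3.0]])
```

Returning the object itself is what allows calls such as `inference.append_simulations(theta, x).train()` to be written on a single line.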

get_dataloaders(starting_round=0, training_batch_size=200, validation_fraction=0.1, resume_training=False, dataloader_kwargs=None)#

Return dataloaders for training and validation.

Parameters:

dataset – holding all theta and x, optionally masks.
training_batch_size (int) – training arg of inference methods.
resume_training (bool) – Whether the current call is resuming training so that no new training and validation indices into the dataset have to be created.
dataloader_kwargs (dict | None) – Additional or updated kwargs to be passed to the training and validation dataloaders (like, e.g., a collate_fn).
starting_round (int)
validation_fraction (float)

Returns:

Tuple of dataloaders for training and validation.

Return type:

Tuple[DataLoader, DataLoader]
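How validation_fraction and training_batch_size shape the two dataloaders can be illustrated with index arithmetic alone. The following is a minimal sketch under stated assumptions, not the library's code: `split_indices` and `batches` are hypothetical helpers, the split is a seeded random permutation, and the last partial batch is kept (a real dataloader may treat it differently).

```python
import random


def split_indices(num_examples, validation_fraction=0.1, seed=0):
    """Randomly split example indices into training and validation sets."""
    num_validation = int(validation_fraction * num_examples)
    indices = list(range(num_examples))
    random.Random(seed).shuffle(indices)
    val_indices = indices[:num_validation]
    train_indices = indices[num_validation:]
    return train_indices, val_indices


def batches(indices, batch_size=200):
    """Chunk indices into consecutive batches; the last one may be smaller."""
    return [indices[i:i + batch_size] for i in range(0, len(indices), batch_size)]


train_idx, val_idx = split_indices(1000)
print(len(train_idx), len(val_idx))          # 900 100
print([len(b) for b in batches(train_idx)])  # [200, 200, 200, 200, 100]
```

Reusing the same index split on a later call is the idea behind resume_training=True: no new training and validation indices have to be drawn.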

get_simulations(starting_round=0)#

Returns all \(\theta\), \(x\), and prior_masks from rounds >= starting_round.

If requested, do not return invalid data.

Parameters:

starting_round (int) – The earliest round to return samples from (we start counting from zero).
warn_on_invalid – Whether to give out a warning if invalid simulations were found.

Returns:

Parameters, simulation outputs, prior masks.

Return type:

Tuple[Tensor, Tensor, Tensor]

property summary#

Parameters:

prior (Distribution | None)
density_estimator (Literal['mdn_snpe_a'] | ~sbi.neural_nets.estimators.base.ConditionalEstimatorBuilder[~sbi.neural_nets.estimators.base.ConditionalDensityEstimator])
num_components (int)
device (str)
logging_level (int | str)
summary_writer (SummaryWriter | None)
tracker (Tracker | None)
show_progress_bars (bool)