NPE_C

class NPE_C(prior=None, density_estimator='maf', device='cpu', logging_level='WARNING', summary_writer=None, tracker=None, show_progress_bars=True)

Bases: PosteriorEstimatorTrainer

Neural Posterior Estimation algorithm (NPE-C) as in Greenberg et al. (2019) [1].

NPE-C (also known as APT, Automatic Posterior Transformation, or SNPE-C) trains a neural network over multiple rounds to directly approximate the posterior for a specific observation x_o. In the first round, NPE-C is equivalent to other NPE methods and is fully amortized, i.e., the trained network can be used for inference on any new observation. After the first round, NPE-C automatically selects between two loss variants depending on the chosen density estimator: the non-atomic loss (for mixtures of Gaussians), which is stable and avoids leakage, or the atomic loss (for flows), which is more flexible but may suffer from leakage.
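As a rough illustration of the atomic loss, the sketch below computes the atomic proposal-posterior probability described in Greenberg et al. (2019) for one (theta, x) pair and a small set of contrasting "atoms", using only the standard library. The toy Gaussian `log_q`, the uniform `log_prior`, and all function names are assumptions made for this example; this is not sbi's implementation.

```python
import math


def log_q(theta, x, scale=0.5):
    """Toy stand-in for the density estimator: log N(theta; x, scale^2)."""
    z = (theta - x) / scale
    return -0.5 * z * z - math.log(scale * math.sqrt(2.0 * math.pi))


def log_prior(theta, low=0.0, high=1.0):
    """Log density of a uniform prior on [low, high] (assumed for this sketch)."""
    return -math.log(high - low) if low <= theta <= high else -math.inf


def atomic_log_prob(theta_true, atoms, x):
    """Log-probability of theta_true among the candidate set {theta_true} + atoms.

    Each candidate is weighted by q(theta | x) / p(theta); the weights are
    then normalised over the candidate set (a softmax over log-weights).
    """
    candidates = [theta_true] + list(atoms)
    log_w = [log_q(t, x) - log_prior(t) for t in candidates]
    m = max(log_w)
    log_norm = m + math.log(sum(math.exp(lw - m) for lw in log_w))
    return log_w[0] - log_norm


# One training pair plus three contrasting parameters, i.e. four atoms in total.
loss = -atomic_log_prob(theta_true=0.4, atoms=[0.1, 0.7, 0.9], x=0.45)
```

Because the loss only compares candidates against each other, it works with any density estimator that can evaluate log-probabilities. It also never looks at the density outside the sampled atoms, which is one way to see how mass can leak outside the prior support.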

For single-round inference, NPE-A, NPE-B, and NPE-C are equivalent and use plain NLL loss.

[1] Automatic Posterior Transformation for Likelihood-free Inference, Greenberg et al., ICML 2019, https://arxiv.org/abs/1905.07488.

Example:

import torch
from sbi.inference import NPE_C
from sbi.utils import BoxUniform

# 1. Setup simulator, prior, and observation
prior = BoxUniform(low=torch.zeros(3), high=torch.ones(3))
x_o = torch.randn(1, 3)  # Observed data

def simulator(theta):
    return theta + torch.randn_like(theta) * 0.1

# 2. Multi-round inference
inference = NPE_C(prior=prior)
proposal = prior

for round_idx in range(5):
    theta = proposal.sample((100,))
    x = simulator(theta)
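    # Pass the proposal so that rounds after the first use the proposal-corrected loss
    density_estimator = inference.append_simulations(theta, x, proposal=proposal).train()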
    posterior = inference.build_posterior(density_estimator)
    proposal = posterior.set_default_x(x_o)

# 3. Sample from final posterior
samples = posterior.sample((1000,), x=x_o)

train(num_atoms=10, training_batch_size=200, learning_rate=0.0005, validation_fraction=0.1, stop_after_epochs=20, max_num_epochs=2147483647, clip_max_norm=5.0, calibration_kernel=None, resume_training=False, force_first_round_loss=False, discard_prior_samples=False, use_combined_loss=False, retrain_from_scratch=False, show_train_summary=False, dataloader_kwargs=None)

Return density estimator that approximates the distribution \(p(\theta|x)\).

Parameters:
  • num_atoms (int) – Number of atoms to use for classification.

  • training_batch_size (int) – Training batch size.

  • learning_rate (float) – Learning rate for Adam optimizer.

  • validation_fraction (float) – The fraction of data to use for validation.

  • stop_after_epochs (int) – The number of epochs to wait for improvement on the validation set before terminating training.

  • max_num_epochs (int) – Maximum number of epochs to run. If reached, we stop training even when the validation loss is still decreasing. Otherwise, we train until validation loss increases (see also stop_after_epochs).

  • clip_max_norm (float | None) – Value at which to clip the total gradient norm in order to prevent exploding gradients. Use None for no clipping.

  • calibration_kernel (Callable | None) – A function to calibrate the loss with respect to the simulations x. See Lueckmann, Gonçalves et al., NeurIPS 2017.

  • resume_training (bool) – Can be used in case training time is limited, e.g. on a cluster. If True, the split between train and validation set, the optimizer, the number of epochs, and the best validation log-prob will be restored from the last time .train() was called.

  • force_first_round_loss (bool) – If True, train with maximum likelihood, i.e., potentially ignoring the correction for using a proposal distribution different from the prior.

  • discard_prior_samples (bool) – Whether to discard samples simulated in round 1, i.e. from the prior. Training may be sped up by ignoring such less targeted samples.

  • use_combined_loss (bool) – Whether to train the neural net also on prior samples using maximum likelihood in addition to training it on all samples using atomic loss. The extra MLE loss helps prevent density leaking with bounded priors.

  • retrain_from_scratch (bool) – Whether to retrain the conditional density estimator for the posterior from scratch each round.

  • show_train_summary (bool) – Whether to print the number of epochs, the validation loss, and the leakage after training.

  • dataloader_kwargs (Dict | None) – Additional or updated kwargs to be passed to the training and validation dataloaders (like, e.g., a collate_fn).

Returns:

Density estimator that approximates the distribution \(p(\theta|x)\).

Return type:

ConditionalDensityEstimator
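
The interplay between stop_after_epochs and max_num_epochs can be sketched with a small stand-alone function. This is a simplified illustration of patience-based early stopping, not sbi's training loop; the function name and the list of validation losses are hypothetical.

```python
def epochs_until_stop(val_losses, stop_after_epochs=20, max_num_epochs=2**31 - 1):
    """Return how many epochs run before training stops.

    val_losses is a hypothetical sequence of per-epoch validation losses.
    Training stops once the best loss has not improved for
    stop_after_epochs consecutive epochs, or when max_num_epochs is reached.
    """
    best = float("inf")
    epochs_since_improvement = 0
    epochs_run = 0
    for loss in val_losses:
        if epochs_run >= max_num_epochs:
            break
        epochs_run += 1
        if loss < best:
            best = loss
            epochs_since_improvement = 0
        else:
            epochs_since_improvement += 1
        if epochs_since_improvement >= stop_after_epochs:
            break
    return epochs_run


losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.69]
patience_3 = epochs_until_stop(losses, stop_after_epochs=3)  # stops during the plateau
patience_5 = epochs_until_stop(losses, stop_after_epochs=5)  # survives the plateau
capped = epochs_until_stop(losses, max_num_epochs=2)         # hard cap wins
```

With a patience of 3 the run ends on the plateau and never sees the later improvement to 0.69, which is why a small stop_after_epochs can terminate training too early on noisy validation curves.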

append_simulations(theta, x, proposal=None, exclude_invalid_x=None, data_device=None)

Store parameters and simulation outputs to use them for later training.

Data are stored as entries in lists for each type of variable (parameter/data).

Stores \(\theta\), \(x\), prior_masks (indicating if simulations are coming from the prior or not) and an index indicating which round the batch of simulations came from.

Parameters:
  • theta (Tensor) – Parameter sets.

  • x (Tensor) – Simulation outputs.

  • proposal (DirectPosterior | None) – The distribution that the parameters \(\theta\) were sampled from. Pass None if the parameters were sampled from the prior. If not None, it will trigger a different loss-function.

  • exclude_invalid_x (bool | None) – Whether invalid simulations are discarded during training. For single-round SNPE, it is fine to discard invalid simulations, but for multi-round SNPE (atomic), discarding invalid simulations gives systematically wrong results. If None, it will be True in the first round and False in later rounds.

  • data_device (str | None) – Where to store the data. By default, data are stored on the same device on which training happens. When training on a large dataset with a GPU that has little VRAM, set this to ‘cpu’ to store the data in system memory instead.

Returns:

NeuralInference object (returned so that this function is chainable).

Return type:

Self
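
The bookkeeping described above (per-round lists, prior masks, a round index, and the default for exclude_invalid_x) can be sketched as follows. `SimulationStore` is a hypothetical, heavily simplified stand-in that uses plain Python lists instead of tensors; it only mirrors the behaviour stated in this section and is not the library's code.

```python
import math


class SimulationStore:
    """Hypothetical, simplified model of the per-round data bookkeeping."""

    def __init__(self):
        self.theta, self.x, self.prior_masks, self.rounds = [], [], [], []

    def append_simulations(self, theta, x, proposal=None, exclude_invalid_x=None):
        if len(theta) != len(x):
            raise ValueError("theta and x must contain the same number of rows")
        from_prior = proposal is None
        # Prior samples belong to round 0; proposal samples open a new round.
        round_idx = 0 if from_prior else max(self.rounds, default=0) + 1
        if exclude_invalid_x is None:
            # Default stated above: True in the first round, False afterwards.
            exclude_invalid_x = round_idx == 0
        if exclude_invalid_x:
            keep = [all(math.isfinite(v) for v in row) for row in x]
            theta = [t for t, k in zip(theta, keep) if k]
            x = [row for row, k in zip(x, keep) if k]
        self.theta.append(theta)
        self.x.append(x)
        self.prior_masks.append([from_prior] * len(theta))
        self.rounds.append(round_idx)
        return self  # returned so that calls can be chained


store = SimulationStore()
store.append_simulations([[0.1], [0.2]], [[1.0], [float("nan")]])
store.append_simulations([[0.3]], [[float("inf")]], proposal="posterior")
```

In this sketch the NaN simulation from the first round is dropped, while the infinite simulation from the second round is kept, matching the default described for exclude_invalid_x.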

build_posterior(density_estimator=None, prior=None, sample_with='direct', mcmc_method='slice_np_vectorized', vi_method='rKL', direct_sampling_parameters=None, mcmc_parameters=None, vi_parameters=None, rejection_sampling_parameters=None, importance_sampling_parameters=None, posterior_parameters=None)

Build posterior from the neural density estimator.

For SNPE, the posterior distribution that is returned here implements the following functionality over the raw neural density estimator:

  • correct the calculation of the log probability such that it compensates for the leakage.

  • reject samples that lie outside of the prior bounds.

  • alternatively, if leakage is very high (which can happen for multi-round SNPE), sample from the posterior with MCMC.

Parameters:
  • density_estimator (ConditionalDensityEstimator | None) – The density estimator that the posterior is based on. If None, use the latest neural density estimator that was trained.

  • prior (Distribution | None) – Prior distribution.

  • sample_with (Literal['mcmc', 'rejection', 'vi', 'importance', 'direct']) – Method to use for sampling from the posterior. Must be one of [direct | mcmc | rejection | vi | importance].

  • mcmc_method (Literal['slice_np', 'slice_np_vectorized', 'hmc_pyro', 'nuts_pyro', 'slice_pymc', 'hmc_pymc', 'nuts_pymc']) – Method used for MCMC sampling, one of slice_np, slice_np_vectorized, hmc_pyro, nuts_pyro, slice_pymc, hmc_pymc, nuts_pymc. slice_np is a custom numpy implementation of slice sampling. slice_np_vectorized is identical to slice_np, but if num_chains>1, the chains are vectorized for slice_np_vectorized whereas they are run sequentially for slice_np. The samplers ending in _pyro use Pyro, and the samplers ending in _pymc use PyMC.

  • vi_method (Literal['rKL', 'fKL', 'IW', 'alpha']) – Method used for VI, one of [rKL, fKL, IW, alpha]. Some of these methods are mode-seeking (e.g., rKL), whereas others are mass-covering (e.g., fKL).

  • direct_sampling_parameters (Dict[str, Any] | None) – Additional kwargs passed to DirectPosterior.

  • mcmc_parameters (Dict[str, Any] | None) – Additional kwargs passed to MCMCPosterior.

  • vi_parameters (Dict[str, Any] | None) – Additional kwargs passed to VIPosterior.

  • rejection_sampling_parameters (Dict[str, Any] | None) – Additional kwargs passed to RejectionPosterior.

  • importance_sampling_parameters (Dict[str, Any] | None) – Additional kwargs passed to ImportanceSamplingPosterior.

  • posterior_parameters (DirectPosteriorParameters | MCMCPosteriorParameters | VIPosteriorParameters | RejectionPosteriorParameters | ImportanceSamplingPosteriorParameters | None) – Configuration passed to the init method for the posterior. Must be one of: VIPosteriorParameters, ImportanceSamplingPosteriorParameters, MCMCPosteriorParameters, DirectPosteriorParameters, or RejectionPosteriorParameters.

Returns:

Posterior \(p(\theta|x)\) with .sample() and .log_prob() methods (the returned log-probability is unnormalized).

Return type:

NeuralPosterior
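
The leakage handling listed above can be illustrated with a one-dimensional toy model: samples from the raw estimator are rejected when they fall outside the prior bounds, and the acceptance rate is used to renormalise the log-probability. Everything here (the Gaussian stand-in for the estimator, the function names, the bounds) is an assumption made for illustration and does not reproduce the library's posterior classes.

```python
import math
import random


def raw_sample(rng, mean=0.9, std=0.2):
    """Toy density estimator that leaks mass above the prior bound of 1.0."""
    return rng.gauss(mean, std)


def raw_log_prob(theta, mean=0.9, std=0.2):
    z = (theta - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def sample_within_prior(sample_fn, low, high, num_samples, rng):
    """Rejection-sample until num_samples draws lie inside [low, high].

    Also returns the empirical acceptance rate, an estimate of the
    estimator's probability mass inside the prior support.
    """
    accepted, num_drawn = [], 0
    while len(accepted) < num_samples:
        theta = sample_fn(rng)
        num_drawn += 1
        if low <= theta <= high:
            accepted.append(theta)
    return accepted, len(accepted) / num_drawn


def corrected_log_prob(theta, low, high, acceptance_rate):
    """Renormalise the raw log-density over the prior support."""
    if not (low <= theta <= high):
        return -math.inf
    return raw_log_prob(theta) - math.log(acceptance_rate)


rng = random.Random(0)
samples, rate = sample_within_prior(raw_sample, 0.0, 1.0, 1000, rng)
```

When the acceptance rate becomes very small, this loop needs many draws per accepted sample, which is the situation in which sampling with MCMC becomes the more practical option.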

get_dataloaders(starting_round=0, training_batch_size=200, validation_fraction=0.1, resume_training=False, dataloader_kwargs=None)

Return dataloaders for training and validation.

Parameters:
  • dataset – holding all theta and x, optionally masks.

  • training_batch_size (int) – Training batch size, i.e., the same argument that is passed to train().

  • resume_training (bool) – Whether the current call is resuming training so that no new training and validation indices into the dataset have to be created.

  • dataloader_kwargs (dict | None) – Additional or updated kwargs to be passed to the training and validation dataloaders (like, e.g., a collate_fn).

  • starting_round (int)

  • validation_fraction (float)

Returns:

Tuple of dataloaders for training and validation.

Return type:

Tuple[DataLoader, DataLoader]
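
A minimal sketch of how validation_fraction, training_batch_size, and resume_training could interact is shown below, using plain index lists instead of PyTorch dataloaders. The helper names and the `cache` dictionary are hypothetical, and the exact rounding used by the library may differ.

```python
import random


def get_split(cache, num_examples, validation_fraction=0.1,
              resume_training=False, rng=None):
    """Return (train_indices, val_indices), reusing the old split on resume."""
    if resume_training and "split" in cache:
        return cache["split"]
    rng = rng or random.Random()
    indices = list(range(num_examples))
    rng.shuffle(indices)
    num_val = round(validation_fraction * num_examples)
    cache["split"] = (indices[num_val:], indices[:num_val])
    return cache["split"]


def make_batches(indices, training_batch_size=200):
    """Chunk indices into consecutive batches; the last one may be smaller."""
    return [
        indices[i:i + training_batch_size]
        for i in range(0, len(indices), training_batch_size)
    ]


cache = {}
train_idx, val_idx = get_split(cache, 1000, rng=random.Random(0))
train_batches = make_batches(train_idx)
resumed = get_split(cache, 1000, resume_training=True)
```

Reusing the stored split on resume keeps validation examples out of the training set across interrupted runs, so the validation loss stays comparable between calls.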

get_simulations(starting_round=0)

Returns all \(\theta\), \(x\), and prior_masks from rounds >= starting_round.

If requested, do not return invalid data.

Parameters:
  • starting_round (int) – The earliest round to return samples from (we start counting from zero).

  • warn_on_invalid – Whether to give out a warning if invalid simulations were found.

Returns:

Parameters, simulation outputs, prior masks.

Return type:

Tuple[Tensor, Tensor, Tensor]
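
The effect of starting_round can be sketched with plain lists: only batches whose round index is at least starting_round are concatenated and returned. The function and argument names below are hypothetical and stand in for the tensors that the library stores internally.

```python
def select_simulations(theta_rounds, x_rounds, mask_rounds, round_indices,
                       starting_round=0):
    """Concatenate all stored batches from rounds >= starting_round."""
    theta, x, masks = [], [], []
    for t, xs, m, r in zip(theta_rounds, x_rounds, mask_rounds, round_indices):
        if r >= starting_round:
            theta.extend(t)
            x.extend(xs)
            masks.extend(m)
    return theta, x, masks


theta_rounds = [[0.1, 0.2], [0.3], [0.4, 0.5]]
x_rounds = [[1.0, 2.0], [3.0], [4.0, 5.0]]
mask_rounds = [[True, True], [False], [False, False]]
round_indices = [0, 1, 2]

all_theta, all_x, all_masks = select_simulations(
    theta_rounds, x_rounds, mask_rounds, round_indices
)
late_theta, late_x, late_masks = select_simulations(
    theta_rounds, x_rounds, mask_rounds, round_indices, starting_round=1
)
```

Setting starting_round=1 here drops the prior samples from round 0, which is the same selection that discard_prior_samples describes for training.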

property summary

Parameters (NPE_C constructor):
  • prior (Distribution | None)

  • density_estimator (Literal['nsf', 'maf', 'mdn', 'made'] | ~sbi.neural_nets.estimators.base.ConditionalEstimatorBuilder[~sbi.neural_nets.estimators.base.ConditionalDensityEstimator])

  • device (str)

  • logging_level (int | str)

  • summary_writer (SummaryWriter | None)

  • tracker (Tracker | None)

  • show_progress_bars (bool)