DirectPosterior

class DirectPosterior(posterior_estimator, prior, max_sampling_batch_size=10000, device=None, x_shape=None, enable_transform=True)

Bases: NeuralPosterior

Posterior based on neural networks that directly estimate the posterior (NPE).

NPE trains a neural network to directly approximate the posterior distribution. However, for bounded priors, the neural network can have leakage: it puts non-zero mass in regions where the prior is zero. The DirectPosterior class wraps the trained network to deal with these cases.

Specifically, this class offers the following functionality:

correct the calculation of the log probability such that it compensates for the leakage.
reject samples that lie outside of the prior bounds.

This class cannot be used in combination with NLE or NRE.
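To make the notion of leakage concrete, here is a minimal sketch using only the Python standard library. It does not use this class or any sbi code: the "estimator" is a hypothetical 1D Gaussian with made-up parameters, and the prior is assumed to be uniform on [0, 1]. It computes how much mass falls outside the prior support and shows that rejecting out-of-support draws leaves an acceptance rate equal to the mass inside the support.

```python
import math
import random

# Toy stand-in for a trained density estimator: a 1D Gaussian q(theta | x_o).
# Mean and std are made-up numbers chosen so that mass leaks below 0.
Q_MEAN, Q_STD = 0.1, 0.3
# Hypothetical bounded prior: uniform on [0, 1].
PRIOR_LOW, PRIOR_HIGH = 0.0, 1.0


def normal_cdf(value, mean, std):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((value - mean) / (std * math.sqrt(2.0))))


# Mass that the toy estimator places inside the prior support.
mass_inside = normal_cdf(PRIOR_HIGH, Q_MEAN, Q_STD) - normal_cdf(PRIOR_LOW, Q_MEAN, Q_STD)
leakage = 1.0 - mass_inside

# Rejection step: draw from the estimator, keep only draws inside the support.
rng = random.Random(0)
raw_draws = [rng.gauss(Q_MEAN, Q_STD) for _ in range(10_000)]
accepted = [theta for theta in raw_draws if PRIOR_LOW <= theta <= PRIOR_HIGH]
acceptance_rate = len(accepted) / len(raw_draws)

print(f"leaked mass: {leakage:.3f}, empirical acceptance: {acceptance_rate:.3f}")
```

The empirical acceptance rate approximates the mass inside the support, which is the quantity needed to renormalize the density.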

Parameters:

posterior_estimator (ConditionalDensityEstimator)
prior (Distribution)
max_sampling_batch_size (int)
device (str | device | None)
x_shape (Size | None)
enable_transform (bool)

to(device)

Move posterior_estimator, prior and x_o to device.

Changes the device attribute, reinstantiates the posterior, and resets the default x.

Parameters: device (str | device) – device where to move the posterior to.
Return type: None

sample(sample_shape=(), x=None, max_sampling_batch_size=10000, show_progress_bars=True, reject_outside_prior=True, max_sampling_time=None, return_partial_on_timeout=False)

Draw samples from the approximate posterior distribution $p(\theta|x)$.

Parameters:

sample_shape (Size | Tuple[int, ...]) – Desired shape of samples that are drawn from posterior. If sample_shape is multidimensional we simply draw sample_shape.numel() samples and then reshape into the desired shape.
x (Tensor | None) – Conditioning observation $x_o$. If not provided, uses the default x set via .set_default_x().
max_sampling_batch_size (int) – Maximum batch size for rejection sampling.
show_progress_bars (bool) – Whether to show sampling progress monitor.
reject_outside_prior (bool) – If True (default), rejection sampling is used to ensure samples lie within the prior support. If False, samples are drawn directly from the neural density estimator without rejection, which is faster but may include samples outside the prior support.
max_sampling_time (float | None) – Optional maximum allowed sampling time in seconds. If exceeded, sampling is aborted and a RuntimeError is raised. Only applies when reject_outside_prior=True (no effect otherwise since direct sampling is fast).
return_partial_on_timeout (bool) – If True and max_sampling_time is exceeded, return the samples collected so far instead of raising a RuntimeError. A warning will be issued. Only applies when reject_outside_prior=True (default).

Return type:

Tensor
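The interplay of sample_shape, rejection, and the time limit can be sketched with a toy rejection loop in plain Python. This is an illustration under stated assumptions, not the library's implementation: the function rejection_sample and its propose and in_support arguments are hypothetical names, the proposal is a made-up 1D Gaussian, and nested lists stand in for tensors.

```python
import math
import random
import time
import warnings


def rejection_sample(num_samples, propose, in_support, batch_size=1000,
                     max_sampling_time=None, return_partial_on_timeout=False):
    """Toy rejection loop: keep proposals for which in_support(theta) is True."""
    start = time.monotonic()
    accepted = []
    while len(accepted) < num_samples:
        timed_out = (
            max_sampling_time is not None
            and time.monotonic() - start > max_sampling_time
        )
        if timed_out:
            if return_partial_on_timeout:
                warnings.warn("Time limit reached; returning partial samples.")
                return accepted
            raise RuntimeError("Sampling exceeded max_sampling_time.")
        batch = [propose() for _ in range(batch_size)]
        accepted.extend(theta for theta in batch if in_support(theta))
    return accepted[:num_samples]


rng = random.Random(0)
sample_shape = (2, 3)
num_draws = math.prod(sample_shape)  # plays the role of sample_shape.numel()
flat = rejection_sample(
    num_draws,
    propose=lambda: rng.gauss(0.1, 0.3),
    in_support=lambda theta: 0.0 <= theta <= 1.0,
)
# Reshape the flat list of 6 draws into 2 rows of 3.
rows, cols = sample_shape
samples = [flat[r * cols:(r + 1) * cols] for r in range(rows)]

# A proposal that never lands in the support triggers the time limit.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    partial = rejection_sample(
        5, propose=lambda: 2.0, in_support=lambda theta: False,
        batch_size=10, max_sampling_time=0.01, return_partial_on_timeout=True,
    )
```

In this sketch the partial result is an empty list because no proposal is ever accepted; without return_partial_on_timeout=True the same call raises a RuntimeError.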

sample_batched(sample_shape, x, max_sampling_batch_size=10000, show_progress_bars=True, reject_outside_prior=True, max_sampling_time=None, return_partial_on_timeout=False)

Draw samples from the posteriors for a batch of different xs.

Given a batch of observations [x_1, …, x_B], this method samples from posteriors $p(\theta|x_1), \ldots, p(\theta|x_B)$ in a vectorized manner.

Parameters:

sample_shape (Size | Tuple[int, ...]) – Desired shape of samples that are drawn from the posterior given every observation.
x (Tensor) – A batch of observations, of shape (batch_dim, event_shape_x). batch_dim corresponds to the number of observations in the batch.
max_sampling_batch_size (int) – Maximum batch size for rejection sampling.
show_progress_bars (bool) – Whether to show sampling progress monitor.
reject_outside_prior (bool) – If True (default), rejection sampling is used to ensure samples lie within the prior support. If False, samples are drawn directly from the neural density estimator without rejection, which is faster but may include samples outside the prior support.
max_sampling_time (float | None) – Optional maximum allowed sampling time in seconds. If exceeded, sampling is aborted and a RuntimeError is raised. Only applies when reject_outside_prior=True.
return_partial_on_timeout (bool) – If True and max_sampling_time is exceeded, return the samples collected so far instead of raising a RuntimeError. A warning will be issued. Only applies when reject_outside_prior=True.

Returns:

Samples from the posteriors of shape (*sample_shape, B, *input_shape)

Return type:

Tensor
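The output layout (*sample_shape, B, *input_shape) can be illustrated with a small stand-alone sketch. It is not vectorized and does not call this method: toy_sample_batched is a hypothetical helper, each observation x_b is assumed to be a scalar, and the per-observation "posterior" is a made-up Gaussian centered on x_b and restricted to [0, 1].

```python
import random


def toy_sample_batched(num_samples, xs, seed=0):
    """Draw num_samples in-support values per observation; return a (num_samples, B) layout."""
    rng = random.Random(seed)
    per_observation = []
    for x in xs:
        accepted = []
        while len(accepted) < num_samples:
            theta = rng.gauss(x, 0.2)  # toy posterior centered on the observation
            if 0.0 <= theta <= 1.0:    # assumed prior support
                accepted.append(theta)
        per_observation.append(accepted)
    # Transpose from (B, num_samples) to (num_samples, B): sample index first.
    return [
        [per_observation[b][s] for b in range(len(xs))]
        for s in range(num_samples)
    ]


xs = [0.2, 0.5, 0.8]                  # B = 3 observations
batched = toy_sample_batched(4, xs)   # sample_shape = (4,)
```

Here batched[s][b] is the s-th draw for observation b, mirroring the convention that the sample dimensions come before the batch dimension.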

log_prob(theta, x=None, norm_posterior=True, track_gradients=False, leakage_correction_params=None)

Returns the log-probability of the posterior $p(\theta|x)$.

Parameters:

theta (Tensor) – Parameters $\theta$.
norm_posterior (bool) – Whether to enforce a normalized posterior density. Renormalization of the posterior is useful when some probability mass leaks out of the prescribed prior support. The normalizing factor is calculated via rejection sampling, so if you need faster but unnormalized log posterior estimates, set norm_posterior=False. The returned log posterior is set to -∞ outside of the prior support regardless of this setting.
track_gradients (bool) – Whether the returned tensor supports tracking gradients. This can be helpful for e.g. sensitivity analysis, but increases memory consumption.
leakage_correction_params (dict | None) – A dict of keyword arguments to override the default values of leakage_correction(). Possible options are: num_rejection_samples, force_update, show_progress_bars, and rejection_sampling_batch_size. These parameters only have an effect if norm_posterior=True.
x (Tensor | None)

Returns:

(len(θ),)-shaped log posterior probability $\log p(\theta|x)$ for θ in the support of the prior, -∞ (corresponding to 0 probability) outside.

Return type:

Tensor
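The effect of norm_posterior can be sketched for a 1D toy case. This is an illustration, not the library code: the estimator is a hypothetical Gaussian with made-up parameters, the prior support is assumed to be [0, 1], and the acceptance probability is computed analytically here, whereas the documentation states that the class estimates it via rejection sampling.

```python
import math

# Toy estimator q(theta | x_o): Gaussian with made-up parameters.
Q_MEAN, Q_STD = 0.1, 0.3
# Assumed prior support.
PRIOR_LOW, PRIOR_HIGH = 0.0, 1.0


def normal_cdf(value):
    return 0.5 * (1.0 + math.erf((value - Q_MEAN) / (Q_STD * math.sqrt(2.0))))


def estimator_log_prob(theta):
    z = (theta - Q_MEAN) / Q_STD
    return -0.5 * z * z - math.log(Q_STD * math.sqrt(2.0 * math.pi))


# Probability that a draw from the estimator lands inside the support.
ACCEPTANCE = normal_cdf(PRIOR_HIGH) - normal_cdf(PRIOR_LOW)


def toy_log_prob(theta, norm_posterior=True):
    """Log density that is -inf outside the support and optionally renormalized."""
    if not (PRIOR_LOW <= theta <= PRIOR_HIGH):
        return -math.inf
    log_q = estimator_log_prob(theta)
    if norm_posterior:
        log_q -= math.log(ACCEPTANCE)  # divide the density by the acceptance probability
    return log_q


# Midpoint-rule check that the corrected density integrates to one on the support.
num_bins = 10_000
width = (PRIOR_HIGH - PRIOR_LOW) / num_bins
integral = sum(
    math.exp(toy_log_prob(PRIOR_LOW + (i + 0.5) * width)) * width
    for i in range(num_bins)
)
```

Because the acceptance probability is below one, the corrected log-probability inside the support is larger than the uncorrected one by the constant -log(ACCEPTANCE).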

log_prob_batched(theta, x, norm_posterior=True, track_gradients=False, leakage_correction_params=None)

Given a batch of observations [x_1, …, x_B] and a batch of parameters [$\theta_1$, …, $\theta_B$], this function evaluates the log-probabilities of the posteriors $p(\theta_1|x_1), \ldots, p(\theta_B|x_B)$ in a batched (i.e. vectorized) manner.

Parameters:

theta (Tensor) – Batch of parameters $\theta$ of shape (*sample_shape, batch_dim, *theta_shape).
x (Tensor) – Batch of observations $x$ of shape (batch_dim, *condition_shape).
norm_posterior (bool) – Whether to enforce a normalized posterior density. Renormalization of the posterior is useful when some probability mass leaks out of the prescribed prior support. The normalizing factor is calculated via rejection sampling, so if you need faster but unnormalized log posterior estimates, set norm_posterior=False. The returned log posterior is set to -∞ outside of the prior support regardless of this setting.
track_gradients (bool) – Whether the returned tensor supports tracking gradients. This can be helpful for e.g. sensitivity analysis, but increases memory consumption.
leakage_correction_params (dict | None) – A dict of keyword arguments to override the default values of leakage_correction(). Possible options are: num_rejection_samples, force_update, show_progress_bars, and rejection_sampling_batch_size. These parameters only have an effect if norm_posterior=True.

Returns:

(len(θ), B)-shaped log posterior probability $\log p(\theta|x)$ for θ in the support of the prior, -∞ (corresponding to 0 probability) outside.

Return type:

Tensor

leakage_correction(x, num_rejection_samples=10000, force_update=False, show_progress_bars=False, rejection_sampling_batch_size=10000)

Return leakage correction factor for a leaky posterior density estimate.

The factor is estimated from the acceptance probability during rejection sampling from the posterior.

This is to avoid re-estimating the acceptance probability from scratch whenever log_prob is called and norm_posterior=True. Here, it is estimated only once for self.default_x and saved for later. We re-evaluate only whenever a new x is passed.

Parameters:

num_rejection_samples (int) – Number of samples used to estimate correction factor.
show_progress_bars (bool) – Whether to show a progress bar during sampling.
rejection_sampling_batch_size (int) – Batch size for rejection sampling.
x (Tensor)
force_update (bool)

Returns:

Saved or newly-estimated correction factor (as a scalar Tensor).

Return type:

Tensor
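The caching behaviour described for this method (estimate once per x, re-estimate only for a new x or when forced) can be sketched as follows. This is a simplified illustration, not the library's code: LeakageCache, propose, in_support, and the seed argument are hypothetical, and x is assumed to be a plain float so that equality comparison is trivial.

```python
import random


class LeakageCache:
    """Toy cache for a Monte Carlo estimate of the acceptance probability."""

    def __init__(self, propose, in_support):
        self._propose = propose        # propose(x, rng) -> theta
        self._in_support = in_support  # in_support(theta) -> bool
        self._cached_x = None
        self._cached_factor = None
        self.num_estimates = 0         # counts how often the factor was re-estimated

    def leakage_correction(self, x, num_rejection_samples=10_000,
                           force_update=False, seed=0):
        is_new_x = self._cached_factor is None or x != self._cached_x
        if is_new_x or force_update:
            rng = random.Random(seed)
            num_accepted = sum(
                self._in_support(self._propose(x, rng))
                for _ in range(num_rejection_samples)
            )
            self._cached_factor = num_accepted / num_rejection_samples
            self._cached_x = x
            self.num_estimates += 1
        return self._cached_factor


cache = LeakageCache(
    propose=lambda x, rng: rng.gauss(x, 0.3),       # toy estimator centered on x
    in_support=lambda theta: 0.0 <= theta <= 1.0,   # assumed prior support
)
first = cache.leakage_correction(0.1)
second = cache.leakage_correction(0.1)                     # served from the cache
other = cache.leakage_correction(0.5)                      # new x, re-estimated
forced = cache.leakage_correction(0.5, force_update=True)  # forced re-estimate
```

The second call returns the stored value without drawing any samples, which is the saving that caching is meant to provide when log_prob is called repeatedly for the same observation.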

map(x=None, num_iter=1000, num_to_optimize=100, learning_rate=0.01, init_method='posterior', num_init_samples=1000, save_best_every=10, show_progress_bars=False, force_update=False)

Returns the maximum-a-posteriori estimate (MAP).

The method can be interrupted (Ctrl-C) when the user sees that the log-probability converges. The best estimate will be saved in self._map and can be accessed with self.map(). The MAP is obtained by running gradient ascent from a given number of starting positions (samples from the posterior with the highest log-probability). After the optimization is done, we select the parameter set that has the highest log-probability after the optimization.

Warning: The default values used by this function are not well-tested. They might require hand-tuning for the problem at hand.

For developers: if the prior is a BoxUniform, we carry out the optimization in unbounded space and transform the result back into bounded space.

Parameters:

x (Tensor | None) – Deprecated - use .set_default_x() prior to .map().
num_iter (int) – Number of optimization steps that the algorithm takes to find the MAP.
learning_rate (float) – Learning rate of the optimizer.
init_method (str | Tensor) – How to select the starting parameters for the optimization. If it is a string, it must be either 'posterior' or 'prior', in which case the respective distribution is sampled num_init_samples times. If it is a tensor, the tensor is used as the initial locations.
num_init_samples (int) – Draw this number of samples from the posterior and evaluate the log-probability of all of them.
num_to_optimize (int) – From the drawn num_init_samples, use the num_to_optimize with highest log-probability as the initial points for the optimization.
save_best_every (int) – The best log-probability is computed, saved in the map-attribute, and printed every save_best_every-th iteration. Computing the best log-probability creates a significant overhead (thus, the default is 10).
show_progress_bars (bool) – Whether to show a progress bar during sampling from the posterior.
force_update (bool) – Whether to re-calculate the MAP when x is unchanged and a cached value exists.
log_prob_kwargs – Will be empty for SNLE and SNRE. Will contain {'norm_posterior': True} for SNPE.

Returns:

The MAP estimate.

Return type:

Tensor
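The procedure described for map() (draw initial candidates, keep the ones with the highest log-probability, run gradient ascent, return the best result) can be sketched in plain Python. This is a toy illustration under stated assumptions, not the library's optimizer: toy_map is a hypothetical function, the log-probability is a made-up 1D quadratic peaked at 0.4, initial candidates come from a uniform distribution on [0, 1], and the transformation to unbounded space is omitted.

```python
import random

PEAK, SCALE = 0.4, 0.1  # made-up location and width of the toy posterior mode


def log_prob(theta):
    """Unnormalized toy log-probability with its maximum at PEAK."""
    return -0.5 * ((theta - PEAK) / SCALE) ** 2


def grad_log_prob(theta):
    """Analytic derivative of log_prob with respect to theta."""
    return -(theta - PEAK) / SCALE ** 2


def toy_map(num_iter=1000, num_to_optimize=10, learning_rate=0.001,
            num_init_samples=100, seed=0):
    rng = random.Random(seed)
    # Draw candidate starting points (here from a uniform distribution on [0, 1]).
    candidates = [rng.uniform(0.0, 1.0) for _ in range(num_init_samples)]
    # Keep the num_to_optimize candidates with the highest log-probability.
    points = sorted(candidates, key=log_prob, reverse=True)[:num_to_optimize]
    # Plain gradient ascent on every kept starting point.
    for _ in range(num_iter):
        points = [theta + learning_rate * grad_log_prob(theta) for theta in points]
    # Return the optimized point with the highest log-probability.
    return max(points, key=log_prob)


map_estimate = toy_map()
```

The learning rate in this sketch is tuned to the toy objective; as the warning for map() notes, defaults may need hand-tuning for a real problem.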

property default_x: Tensor | None

Return default x used by .sample() and .log_prob() as conditioning context.

potential(theta, x=None, track_gradients=False)

Evaluates $\theta$ under the potential that is used to sample the posterior.

The potential is the unnormalized log-probability of $\theta$ under the posterior.

Parameters:

theta (Tensor) – Parameters $\theta$.
track_gradients (bool) – Whether the returned tensor supports tracking gradients. This can be helpful for e.g. sensitivity analysis, but increases memory consumption.
x (Tensor | None)

Return type:

Tensor

set_default_x(x)

Set new default x for .sample(), .log_prob to use as conditioning context.

Reset the MAP stored for the old default x if applicable.

This is a pure convenience to avoid having to repeatedly specify x in calls to .sample() and .log_prob() - only $\theta$ needs to be passed.

This convenience is particularly useful when the posterior is focused, i.e. has been trained over multiple rounds to be accurate in the vicinity of a particular x=x_o (you can check if your posterior object is focused by printing it).

NOTE: this method is chainable, i.e. it returns the NeuralPosterior object so that calls like posterior.set_default_x(my_x).sample(sample_shape) are possible.

Parameters: x (Tensor) – The default observation to set for the posterior $p(\theta|x)$.
Returns: NeuralPosterior that will use a default x when not explicitly passed.
Return type: NeuralPosterior
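The chaining and cache-reset behaviour can be sketched with a minimal stand-in class. ToyPosterior and its internals are hypothetical and only mimic the pattern described here (store the default x, clear the stored MAP, return the object itself); the log-probability is a made-up quadratic.

```python
class ToyPosterior:
    """Minimal stand-in that mimics the default-x pattern."""

    def __init__(self):
        self._x = None
        self._map = None

    def set_default_x(self, x):
        self._x = x
        self._map = None  # a MAP cached for the old default x is no longer valid
        return self       # returning self makes the call chainable

    def _resolve_x(self, x):
        # An explicitly passed x wins; otherwise fall back to the default.
        if x is not None:
            return x
        if self._x is None:
            raise ValueError("No x passed and no default x set.")
        return self._x

    def log_prob(self, theta, x=None):
        x = self._resolve_x(x)
        return -0.5 * (theta - x) ** 2  # made-up toy log-probability


posterior = ToyPosterior()
value_default = posterior.set_default_x(1.0).log_prob(0.0)  # chained call
value_explicit = posterior.log_prob(0.0, x=3.0)             # explicit x overrides
```

Returning the object from the setter is what allows the setter and a subsequent method call to be written on one line.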