How to run expected coverage

Expected coverage provides a simple and interpretable tool to diagnose issues in the posterior. In comparison to other diagnostic tools such as L-C2ST, it requires relatively few additional simulations (~200) and it does not rely on any additional hyperparameters (as TARP would) or additional neural network training.

Expected coverage allows you to evaluate whether your posterior is, on average across many observations (prior predictive samples), over- or under-confident.
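To make the idea concrete, here is a minimal sketch on a toy model that is not part of sbi. It assumes a Gaussian prior and likelihood so that the exact posterior is known in closed form, and it uses a hypothetical `posterior_std_scale` knob to shrink or widen that posterior artificially. For each (parameter, observation) pair it checks whether the true parameter falls inside the 90% highest-density interval, then averages over pairs.

```python
import random
from statistics import NormalDist


def expected_coverage(posterior_std_scale, level=0.9, num_pairs=5000, seed=0):
    """Fraction of prior predictive pairs whose true parameter lies inside
    the `level` highest-density interval of a (possibly distorted) posterior.

    Toy model (an assumption for illustration):
        theta ~ N(0, 1),  x | theta ~ N(theta, 1)
    so the exact posterior is N(x / 2, 1 / 2).
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    hits = 0
    for _ in range(num_pairs):
        theta = rng.gauss(0.0, 1.0)  # prior sample
        x = rng.gauss(theta, 1.0)    # prior predictive sample
        post_mean = x / 2.0
        # Scale < 1 makes the posterior too narrow, scale > 1 too wide.
        post_std = (0.5 ** 0.5) * posterior_std_scale
        z = abs(theta - post_mean) / post_std
        # Smallest credibility level whose interval still contains theta.
        theta_level = 2.0 * std_normal.cdf(z) - 1.0
        if theta_level <= level:
            hits += 1
    return hits / num_pairs


calibrated = expected_coverage(1.0)      # close to the nominal 0.9
overconfident = expected_coverage(0.5)   # clearly below 0.9
underconfident = expected_coverage(2.0)  # clearly above 0.9
print(calibrated, overconfident, underconfident)
```

A posterior that is too narrow covers the true parameter less often than its nominal level claims, while one that is too wide covers it more often. Expected coverage reports this comparison across all credibility levels at once.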

You can run expected coverage with the sbi toolbox as shown below:

Main syntax

from sbi.diagnostics import run_sbc
from sbi.analysis.plot import sbc_rank_plot

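# Assumes `inference` is an NPE, NLE, or NRE object that has already been trained,
# and that `prior` and `simulate` are the prior and simulator used for training.
posterior = inference.build_posterior()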

num_sbc_samples = 200  # number of SBC runs; a few hundred is usually enough
prior_samples = prior.sample((num_sbc_samples,))
prior_predictives = simulate(prior_samples)

# run SBC: for each inference we draw 1000 posterior samples.
num_posterior_samples = 1_000
ranks, dap_samples = run_sbc(
    prior_samples,
    prior_predictives,
    posterior,
    reduce_fns=lambda theta, x: -posterior.log_prob(theta, x),
    num_posterior_samples=num_posterior_samples,
    use_batched_sampling=False,  # `True` can give speed-ups, but can cause memory issues.
)
fig, ax = sbc_rank_plot(
    ranks,
    num_posterior_samples,
    plot_type="cdf",
    num_bins=20,
    figsize=(5, 3),
)

This will return a figure such as the following:

You can interpret this plot as follows:

If the blue line is below the gray region, then the posterior is, on average, over-confident.
If the line is above the gray region, then the posterior is, on average, under-confident.
If the line is within the gray region, then we cannot reject the null hypothesis that the posterior is well-calibrated.
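The three cases above can also be checked numerically. The sketch below is a simplified illustration, not the procedure used by `sbc_rank_plot`. It assumes `ranks` is a plain list of integers, one per SBC run, counting how many posterior samples have a higher log-probability than the true parameter. It also assumes a pointwise band of `num_sigmas` binomial standard deviations around the diagonal, which may differ from how the gray region in the plot is computed. The helper names `coverage_curve` and `diagnose` are hypothetical.

```python
import math


def coverage_curve(ranks, num_posterior_samples):
    """Empirical coverage at credibility levels 0.05, 0.10, ..., 0.95."""
    levels = [i / 20 for i in range(1, 20)]
    normalized = [r / num_posterior_samples for r in ranks]
    coverage = [sum(u <= c for u in normalized) / len(ranks) for c in levels]
    return levels, coverage


def diagnose(ranks, num_posterior_samples, num_sigmas=3.0):
    """Compare the coverage curve to the diagonal with a pointwise band."""
    levels, coverage = coverage_curve(ranks, num_posterior_samples)
    n = len(ranks)
    below = above = False
    for c, cov in zip(levels, coverage):
        # Under perfect calibration, n * cov is Binomial(n, c).
        half_width = num_sigmas * math.sqrt(c * (1.0 - c) / n)
        below = below or cov < c - half_width
        above = above or cov > c + half_width
    if below and above:
        return "miscalibrated"
    if below:
        return "over-confident"
    if above:
        return "under-confident"
    return "consistent with calibration"


# Synthetic, deterministic ranks for illustration only.
num_runs, num_samples = 200, 1000
grid = [(i + 0.5) / num_runs for i in range(num_runs)]
uniform_ranks = [int(num_samples * u) for u in grid]      # evenly spread
high_ranks = [int(num_samples * u ** 0.3) for u in grid]  # piled up near the top
low_ranks = [int(num_samples * u ** 3) for u in grid]     # piled up near zero

print(diagnose(uniform_ranks, num_samples))
print(diagnose(high_ranks, num_samples))
print(diagnose(low_ranks, num_samples))
```

Ranks concentrated near the top mean the true parameter often sits in the low-density tails of the posterior, so the coverage curve falls below the diagonal. Ranks concentrated near zero mean it almost always sits near the mode, so the curve rises above it.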

Citation

The sample-based implementation of expected coverage used in sbi is described in:

@article{
  deistler2022truncated,
  title={Truncated proposals for scalable and hassle-free simulation-based inference},
  author={Deistler, Michael and Goncalves, Pedro J and Macke, Jakob H},
  journal={Advances in Neural Information Processing Systems},
  volume={35},
  pages={23135--23149},
  year={2022}
}

Expected coverage was previously introduced for simulation-based inference in:

@article{
  hermans2022crisis,
  title={A crisis in simulation-based inference? beware, your posterior approximations can be unfaithful},
  author={Hermans, Joeri and Delaunoy, Arnaud and Rozet, Fran{\c{c}}ois and Wehenkel, Antoine and Louppe, Gilles},
  journal={Transactions on Machine Learning Research},
  year={2022},
  publisher={OpenReview, Amherst, United States-Massachusetts}
}

@article{
  miller2021truncated,
  title={Truncated marginal neural ratio estimation},
  author={Miller, Benjamin K and Cole, Alex and Forr{\'e}, Patrick and Louppe, Gilles and Weniger, Christoph},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  pages={129--143},
  year={2021}
}