Commit
Merge pull request #579 from ICB-DCM/develop
Release 0.12.5
yannikschaelte authored Jun 21, 2022
2 parents cae6cfb + a7ff171 commit b077f5a
Showing 5 changed files with 19 additions and 8 deletions.
8 changes: 8 additions & 0 deletions CHANGELOG.rst
@@ -8,6 +8,14 @@ Release Notes
...........


0.12.5 (2022-06-21)
-------------------

Minor:

* Document outdated Google Colab version (Python 3.7)


0.12.4 (2022-05-05)
-------------------

2 changes: 1 addition & 1 deletion doc/conf.py
@@ -99,7 +99,7 @@
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
- language = None
+ language = "en"

# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
3 changes: 3 additions & 0 deletions doc/examples.rst
@@ -18,6 +18,9 @@ which can be performed by
!pip install pyabc --quiet
Potentially, further dependencies may be required.
+ Unfortunately, at the moment (2022-06), Google Colab is using Python 3.7,
+ while pyABC and many other packages have proceeded to require Python >= 3.8.
+ Thus, not everything may work properly.
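As an aside to the note added in this hunk (not part of the diff itself), a notebook cell could surface the version mismatch up front instead of failing on import. A minimal sketch; the helper name `python_is_supported` is hypothetical, not part of pyABC:

```python
import sys

# pyABC (as of 2022-06) requires Python >= 3.8, while Google Colab
# still shipped Python 3.7 at the time.
REQUIRED = (3, 8)


def python_is_supported(version_info=sys.version_info, required=REQUIRED):
    """Return True if the running interpreter meets the requirement."""
    return tuple(version_info[:2]) >= required


if not python_is_supported():
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor} detected; "
        f"pyABC requires >= {REQUIRED[0]}.{REQUIRED[1]}, "
        "so some examples may not work."
    )
```

Such a check makes the failure mode explicit rather than surfacing as an obscure installation error.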

Getting started
---------------
12 changes: 6 additions & 6 deletions doc/examples/informative.ipynb
@@ -15,7 +15,7 @@
"source": [
"Approximate Bayesian computation (ABC) relies on the efficient comparison of relevant features in simulated and observed data, via distance metrics and potentially summary statistics. Separately, methods have been developed to adaptively scale-normalize the distance metric, and to semi-automatically derive informative, low-dimensional summary statistics.\n",
"\n",
- "In the notebook on \"Adaptive distances\" we demonstrated how distances adjusting weights to normalize scales are beneficial for heterogeneous, including outlier-corrupted, data. However, when parts of the data are uninformative, it is desirable to further concentrate the analysis on informative data points. Various methods have been develoepd to capture information of data on parameters in a low-dimensional summary statistics representation, see e.g. [Blum et al. 2013](https://doi.org/10.1214/12-STS406) for a review. A particular approach constructs summary statistics as outputs of regression models of parameters on data, see the seminal work by [Fearnhead and Prangle 2012](https://doi.org/10.1111/j.1467-9868.2011.01010.x). In this notebook, we illustrate the use of regression methods to construct informative summary statistics and sensitivity distance weights in pyABC."
+ "In the notebook on \"Adaptive distances\" we demonstrated how distances adjusting weights to normalize scales are beneficial for heterogeneous, including outlier-corrupted, data. However, when parts of the data are uninformative, it is desirable to further concentrate the analysis on informative data points. Various methods have been developed to capture information of data on parameters in a low-dimensional summary statistics representation, see e.g. [Blum et al. 2013](https://doi.org/10.1214/12-STS406) for a review. A particular approach constructs summary statistics as outputs of regression models of parameters on data, see the similar work by [Fearnhead and Prangle 2012](https://doi.org/10.1111/j.1467-9868.2011.01010.x). In this notebook, we illustrate the use of regression methods to construct informative summary statistics and sensitivity distance weights in pyABC."
]
},
{
@@ -709,12 +709,12 @@
"id": "2e590cd1-4a6e-4101-a4b7-d650a317b7a8",
"metadata": {},
"source": [
- "To tackle these problems, we suggest to firstly consistenly employ scale normalization, both on the raw model outputs and on the level of summary statistics. Secondly, we suggest to instead of only inferring a mapping $s: y \\mapsto \\theta$, we target augmented parameter vectors, $s: y \\mapsto \\lambda(\\theta)$, with e.g. $\\lambda(\\theta) = (\\theta^1,\\ldots,\\theta^4)$. This practically allows to break symmetry, e.g. if only $\\theta^2$ can be expressed as a function of the data. Conceptually, this further allows to obtain a more accurate description of the posterior distribution, as the summary statistics may be regarded as approximations to $s(y) = \\mathbb{E}[\\lambda(\\theta)|y]$, using which as summary statistics preserves the corresponding posterior moments, i.e.\n",
+ "To tackle these problems, we suggest to firstly consistently employ scale normalization, both on the raw model outputs and on the level of summary statistics. Secondly, we suggest to instead of only inferring a mapping $s: y \\mapsto \\theta$, we target augmented parameter vectors, $s: y \\mapsto \\lambda(\\theta)$, with e.g. $\\lambda(\\theta) = (\\theta^1,\\ldots,\\theta^4)$. This practically allows to break symmetry, e.g. if only $\\theta^2$ can be expressed as a function of the data. Conceptually, this further allows to obtain a more accurate description of the posterior distribution, as the summary statistics may be regarded as approximations to $s(y) = \\mathbb{E}[\\lambda(\\theta)|y]$, using which as summary statistics preserves the corresponding posterior moments, i.e.\n",
"\n",
"$$\\lim_{\\varepsilon\\rightarrow 0}\\mathbb{E}_{\\pi_{\\text{ABC},\\varepsilon}}[\\lambda(\\Theta)|s(y_\\text{obs})] = \\mathbb{E}[\\lambda(\\Theta)|Y=y_\\text{obs}].$$\n",
"\n",
"Methods employing scale normalization, accounting for informativeness, and augmented regression targets, are L1+Ada.+MAD+StatLR+P4, which uses regression-based summary statistics, and L1+Ada.+MAD+SensiLR+P4, which uses sensitivity weights.\n",
- "For comparison, we consider L1+Ada.+MAD only normalizing scales, and L1+StatLR, using non-scale normaled summary statistics, as well as L1+Ada.+MAD+StatLR and L1+Ada.+MAD+SensiLR using only a subset of methods."
+ "For comparison, we consider L1+Ada.+MAD only normalizing scales, and L1+StatLR, using non-scale normalised summary statistics, as well as L1+Ada.+MAD+StatLR and L1+Ada.+MAD+SensiLR using only a subset of methods."
]
},
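Outside the diff itself, the regression-based construction this cell describes, fitting $s(y) \approx \mathbb{E}[\lambda(\theta)|y]$ on simulated pairs and using the predictions as summary statistics, can be sketched with plain least squares. This is a minimal illustration, not pyABC's actual implementation; the function names `fit_summary_statistic` and `augment` are hypothetical:

```python
import numpy as np


def fit_summary_statistic(y_train, theta_train):
    """Fit the linear map s(y) ~= E[lambda(theta) | y] by least squares.

    y_train:     (n, d) array of simulated data
    theta_train: (n, p) array of regression targets lambda(theta)
    Returns a function mapping data y to low-dimensional summaries.
    """
    # Append an intercept column and solve the least-squares problem.
    X = np.hstack([y_train, np.ones((y_train.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(X, theta_train, rcond=None)

    def summary(y):
        y = np.atleast_2d(y)
        return np.hstack([y, np.ones((y.shape[0], 1))]) @ coef

    return summary


def augment(theta):
    """Augmented regression targets lambda(theta) = (theta^1, ..., theta^4)."""
    theta = np.atleast_2d(theta)
    return np.hstack([theta**k for k in range(1, 5)])
```

Using `augment(theta_train)` as the regression target corresponds to the augmented targets $\lambda(\theta) = (\theta^1,\ldots,\theta^4)$ discussed above; more flexible regressors (Gaussian processes, neural networks) would replace the least-squares fit.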
{
@@ -1448,7 +1448,7 @@
"metadata": {},
"source": [
"While overall all approaches would benefit from a continued analysis, the approaches L1+Ada.+MAD+StatLR+P4 and L1+Ada.+MAD+SensiLR+P4 employing scale normalization, accounting for informativeness, and using augmented regression targets, approximate the true posterior distribution best.\n",
- "Using only scale normalization captures the overall dynamics, however givese large uncertainties, as unnecessary emphasis is put on $y_5$.\n",
+ "Using only scale normalization captures the overall dynamics, however gives large uncertainties, as unnecessary emphasis is put on $y_5$.\n",
"Approaches only using $\\theta$ as regression targets however fail to capture the dynamics of $\\theta_4$, as the regression model cannot unravel a meaningful relationship between data and parameters."
]
},
@@ -1561,7 +1561,7 @@
"id": "0e007295-f9f2-4bed-b6c3-9b6fe6be7ba4",
"metadata": {},
"source": [
- "While the scale weights accurately depict the scales the various model output types vary on, the sensitivity weights are high for $y_1$ through $y_4$, with low weights assigned to $y_5$. Very roughly, the sum of sensitivity weights for the four model outputs $y_3$ is roughly equal to e.g. the sensitivity weigh assigned to $y_2$, as desirable. However, the weights assigned are now completely homogeneous, indicating that an increased training sample or more complex regression model may be preferable."
+ "While the scale weights accurately depict the scales the various model output types vary on, the sensitivity weights are high for $y_1$ through $y_4$, with low weights assigned to $y_5$. Very roughly, the sum of sensitivity weights for the four model outputs $y_3$ is roughly equal to e.g. the sensitivity weight assigned to $y_2$, as desirable. However, the weights assigned are now completely homogeneous, indicating that an increased training sample or more complex regression model may be preferable."
]
},
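The sensitivity weights discussed in this cell can, in rough terms, be read off a fitted regression model: model outputs whose coefficients are uniformly small across all targets carry little information about the parameters. A minimal sketch under that assumption for a linear regressor; `sensitivity_weights` is a hypothetical helper, not pyABC's API:

```python
import numpy as np


def sensitivity_weights(coef, scale=None, normalize=True):
    """Derive per-output distance weights from fitted regression coefficients.

    coef: (d, p) array of coefficients mapping d model outputs to p targets.
    Each output's sensitivity is the aggregated magnitude of its
    coefficients across all regression targets.
    """
    sens = np.abs(coef).sum(axis=1)
    if scale is not None:
        # Optionally rescale to account for heterogeneous output scales.
        sens = sens * scale
    if normalize:
        total = sens.sum()
        if total > 0:
            sens = sens / total
    return sens
```

Outputs like $y_5$ above, whose coefficients are near zero, would thus receive near-zero weight in the distance, concentrating the comparison on informative data points.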
{
@@ -1766,7 +1766,7 @@
"tags": []
},
"source": [
- "This was a little introduction to approaches accounting for heterogeneous data scales by adaptive scale normalization, and data informativeness by using regression models to either construct low-dimensionsal summary statistics, or inform sensitivity weights.\n",
+ "This was a little introduction to approaches accounting for heterogeneous data scales by adaptive scale normalization, and data informativeness by using regression models to either construct low-dimensional summary statistics, or inform sensitivity weights.\n",
"Beyond linear regression employed in this notebook, various other regression methods are possible, including e.g. Gaussian processes or neural networks to capture non-linear relationships better. These are also implemented in pyABC.\n",
"\n",
"It should be noted that the use of regression models may be sensitive to in particular training sample size, especially for more complex regression models, such that some care may need to be taken there. For example, also model selection with out-of-sample validation is provided via :class:`ModelSelectionPredictor <pyabc.predictor.ModelSelectionPredictor>`."
2 changes: 1 addition & 1 deletion pyabc/version.py
@@ -1 +1 @@
- __version__ = '0.12.4'
+ __version__ = '0.12.5'
