diff --git a/joss.06372/10.21105.joss.06372.crossref.xml b/joss.06372/10.21105.joss.06372.crossref.xml new file mode 100644 index 0000000000..cc000669cc --- /dev/null +++ b/joss.06372/10.21105.joss.06372.crossref.xml @@ -0,0 +1,555 @@ + + + + 20240506T171906-f9de69fcc0c18123fd7efa3ff6c28cf2b5477e00 + 20240506171906 + + JOSS Admin + admin@theoj.org + + The Open Journal + + + + + Journal of Open Source Software + JOSS + 2475-9066 + + 10.21105/joss + https://joss.theoj.org + + + + + 05 + 2024 + + + 9 + + 97 + + + + CalibrateEmulateSample.jl: Accelerated Parametric +Uncertainty Quantification + + + + Oliver R. A. + Dunbar + https://orcid.org/0000-0001-7374-0382 + + + Melanie + Bieli + + + Alfredo + Garbuno-Iñigo + https://orcid.org/0000-0003-3279-619X + + + Michael + Howland + https://orcid.org/0000-0002-2878-3874 + + + Andre Nogueira + de Souza + https://orcid.org/0000-0002-9906-7824 + + + Laura Anne + Mansfield + https://orcid.org/0000-0002-6285-6045 + + + Gregory L. + Wagner + https://orcid.org/0000-0001-5317-2445 + + + N. + Efrat-Henrici + + + + 05 + 06 + 2024 + + + 6372 + + + 10.21105/joss.06372 + + + http://creativecommons.org/licenses/by/4.0/ + http://creativecommons.org/licenses/by/4.0/ + http://creativecommons.org/licenses/by/4.0/ + + + + Software archive + 10.5281/zenodo.10946875 + + + GitHub review issue + https://github.com/openjournals/joss-reviews/issues/6372 + + + + 10.21105/joss.06372 + https://joss.theoj.org/papers/10.21105/joss.06372 + + + https://joss.theoj.org/papers/10.21105/joss.06372.pdf + + + + + + Julia: A fresh approach to numerical +computing + Bezanson + SIAM Review + 1 + 59 + 10.1137/141000671 + 2017 + Bezanson, J., Edelman, A., Karpinski, +S., & Shah, V. B. (2017). Julia: A fresh approach to numerical +computing. SIAM Review, 59(1), 65–98. +https://doi.org/10.1137/141000671 + + + High-Dimensional ABC + Nott + Handbook of Approximate Bayesian +Computation + 10.1201/9781315117195-8 + 978-1-315-11719-5 + 2018 + Nott, D. J., Ong, V. 
M.-H., Fan, Y., +& Sisson, S. A. (2018). High-Dimensional ABC. In Handbook of +Approximate Bayesian Computation (pp. 211–241). CRC Press. +https://doi.org/10.1201/9781315117195-8 + + + Calibrate, emulate, sample + Cleary + Journal of Computational +Physics + 424 + 10.1016/j.jcp.2020.109716 + 0021-9991 + 2021 + Cleary, E., Garbuno-Inigo, A., Lan, +S., Schneider, T., & Stuart, A. M. (2021). Calibrate, emulate, +sample. Journal of Computational Physics, 424, 109716. +https://doi.org/10.1016/j.jcp.2020.109716 + + + EnsembleKalmanProcesses.jl: Derivative-free +ensemble-based model calibration + Dunbar + Journal of Open Source +Software + 80 + 7 + 10.21105/joss.04869 + 2022 + Dunbar, O. R. A., Lopez-Gomez, I., +Garbuno-Iñigo, A. G.-I., Huang, D. Z., Bach, E., & Wu, J. (2022). +EnsembleKalmanProcesses.jl: Derivative-free ensemble-based model +calibration. Journal of Open Source Software, 7(80), 4869. +https://doi.org/10.21105/joss.04869 + + + An efficient Bayesian approach to learning +droplet collision kernels: Proof of concept using “Cloudy,” a new +n-moment bulk microphysics scheme + Bieli + Journal of Advances in Modeling Earth +Systems + 8 + 14 + 10.1029/2022MS002994 + 2022 + Bieli, M., Dunbar, O. R. A., Jong, E. +K. de, Jaruga, A., Schneider, T., & Bischoff, T. (2022). An +efficient Bayesian approach to learning droplet collision kernels: Proof +of concept using “Cloudy,” a new n-moment bulk microphysics scheme. +Journal of Advances in Modeling Earth Systems, 14(8), e2022MS002994. +https://doi.org/10.1029/2022MS002994 + + + Supervised calibration and uncertainty +quantification of subgrid closure parameters using ensemble Kalman +inversion + Hillier + 1721.1/145140 + 2022 + Hillier, A. (2022). Supervised +calibration and uncertainty quantification of subgrid closure parameters +using ensemble Kalman inversion [Master’s thesis, Massachusetts +Institute of Technology. Department of Electrical Engineering; Computer +Science]. 
https://doi.org/1721.1/145140 + + + Gaussian processes for machine +learning + Williams + 2 + 10.1142/S0129065704001899 + 2006 + Williams, C. K., & Rasmussen, C. +E. (2006). Gaussian processes for machine learning (Vol. 2). MIT press +Cambridge, MA. +https://doi.org/10.1142/S0129065704001899 + + + Ensemble kalman methods for inverse +problems + Iglesias + Inverse Problems + 4 + 29 + 10.1088/0266-5611/29/4/045001 + 2013 + Iglesias, M. A., Law, K. J., & +Stuart, A. M. (2013). Ensemble kalman methods for inverse problems. +Inverse Problems, 29(4), 045001. +https://doi.org/10.1088/0266-5611/29/4/045001 + + + Random features for large-scale kernel +machines. + Rahimi + NIPS + 3 + 2007 + Rahimi, A., Recht, B., & others. +(2007). Random features for large-scale kernel machines. NIPS, 3, 5. +https://proceedings.neurips.cc/paper_files/paper/2007/file/013a006f03dbc5392effeb8f18fda755-Paper.pdf + + + Uniform approximation of functions with +random bases + Rahimi + 2008 46th annual allerton conference on +communication, control, and computing + 10.1109/allerton.2008.4797607 + 2008 + Rahimi, A., & Recht, B. (2008). +Uniform approximation of functions with random bases. 2008 46th Annual +Allerton Conference on Communication, Control, and Computing, 555–561. +https://doi.org/10.1109/allerton.2008.4797607 + + + Random features for kernel approximation: A +survey on algorithms, theory, and beyond + Liu + IEEE Transactions on Pattern Analysis and +Machine Intelligence + 10 + 44 + 10.1109/TPAMI.2021.3097011 + 2022 + Liu, F., Huang, X., Chen, Y., & +Suykens, J. A. K. (2022). Random features for kernel approximation: A +survey on algorithms, theory, and beyond. IEEE Transactions on Pattern +Analysis and Machine Intelligence, 44(10), 7128–7148. +https://doi.org/10.1109/TPAMI.2021.3097011 + + + MCMC Methods for Functions: Modifying Old +Algorithms to Make Them Faster + Cotter + Statistical Science + 3 + 28 + 10.1214/13-STS421 + 2013 + Cotter, S. L., Roberts, G. O., +Stuart, A. 
M., & White, D. (2013). MCMC Methods for Functions: +Modifying Old Algorithms to Make Them Faster. Statistical Science, +28(3), 424–446. +https://doi.org/10.1214/13-STS421 + + + The random walk metropolis: Linking theory +and practice through a case study + Sherlock + Statistical Science + 2 + 25 + 10.1214/10-sts327 + 2010 + Sherlock, C., Fearnhead, P., & +Roberts, G. O. (2010). The random walk metropolis: Linking theory and +practice through a case study. Statistical Science, 25(2), 172–190. +https://doi.org/10.1214/10-sts327 + + + Calibration and uncertainty quantification of +convective parameters in an idealized GCM + Dunbar + Journal of Advances in Modeling Earth +Systems + 9 + 13 + 10.1029/2020MS002454 + 2021 + Dunbar, O. R. A., Garbuno-Inigo, A., +Schneider, T., & Stuart, A. M. (2021). Calibration and uncertainty +quantification of convective parameters in an idealized GCM. Journal of +Advances in Modeling Earth Systems, 13(9), e2020MS002454. +https://doi.org/10.1029/2020MS002454 + + + Parameter uncertainty quantification in an +idealized GCM with a seasonal cycle + Howland + Journal of Advances in Modeling Earth +Systems + 3 + 14 + 10.1029/2021MS002735 + 2022 + Howland, M. F., Dunbar, O. R. A., +& Schneider, T. (2022). Parameter uncertainty quantification in an +idealized GCM with a seasonal cycle. Journal of Advances in Modeling +Earth Systems, 14(3), e2021MS002735. +https://doi.org/10.1029/2021MS002735 + + + Ensemble-based experimental design for +targeting data acquisition to inform climate models + Dunbar + Journal of Advances in Modeling Earth +Systems + 9 + 14 + 10.1029/2022MS002997 + 2022 + Dunbar, O. R. A., Howland, M. F., +Schneider, T., & Stuart, A. M. (2022). Ensemble-based experimental +design for targeting data acquisition to inform climate models. Journal +of Advances in Modeling Earth Systems, 14(9), e2022MS002997. 
+https://doi.org/10.1029/2022MS002997 + + + Scikit-learn: Machine learning in +Python + Pedregosa + Journal of Machine Learning +Research + 12 + 2011 + Pedregosa, F., Varoquaux, G., +Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., +Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., +Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, E. (2011). +Scikit-learn: Machine learning in Python. Journal of Machine Learning +Research, 12, 2825–2830. + + + GaussianProcesses. Jl: A nonparametric bayes +package for the julia language + Fairbrother + Journal of Statistical +Software + 102 + 10.18637/jss.v102.i01 + 2022 + Fairbrother, J., Nemeth, C., +Rischard, M., Brea, J., & Pinder, T. (2022). GaussianProcesses. Jl: +A nonparametric bayes package for the julia language. Journal of +Statistical Software, 102, 1–36. +https://doi.org/10.18637/jss.v102.i01 + + + GlobalSensitivity.jl: Performant and parallel +global sensitivity analysis with julia + Dixit + Journal of Open Source +Software + 76 + 7 + 10.21105/joss.04561 + 2022 + Dixit, V. K., & Rackauckas, C. +(2022). GlobalSensitivity.jl: Performant and parallel global sensitivity +analysis with julia. Journal of Open Source Software, 7(76), 4561. +https://doi.org/10.21105/joss.04561 + + + Affine invariant interacting Langevin +dynamics for Bayesian inference + Garbuno-Inigo + SIAM Journal on Applied Dynamical +Systems + 3 + 19 + 10.1137/19M1304891 + 2020 + Garbuno-Inigo, A., Nüsken, N., & +Reich, S. (2020). Affine invariant interacting Langevin dynamics for +Bayesian inference. SIAM Journal on Applied Dynamical Systems, 19(3), +1633–1658. https://doi.org/10.1137/19M1304891 + + + GpABC: a Julia package for approximate +Bayesian computation with Gaussian process emulation + Tankhilevich + Bioinformatics + 10.1093/bioinformatics/btaa078 + 1367-4803 + 2020 + Tankhilevich, E., Ish-Horowicz, J., +Hameed, T., Roesch, E., Kleijn, I., Stumpf, M. P. H., & He, F. +(2020). 
GpABC: a Julia package for approximate Bayesian computation with +Gaussian process emulation. Bioinformatics. +https://doi.org/10.1093/bioinformatics/btaa078 + + + Efficient derivative-free bayesian inference +for large-scale inverse problems + Huang + Inverse Problems + 12 + 38 + 10.1088/1361-6420/ac99fa + 2022 + Huang, D. Z., Huang, J., Reich, S., +& Stuart, A. M. (2022). Efficient derivative-free bayesian inference +for large-scale inverse problems. Inverse Problems, 38(12), 125006. +https://doi.org/10.1088/1361-6420/ac99fa + + + Calibration and uncertainty quantification of +a gravity wave parameterization: A case study of the Quasi-Biennial +Oscillation in an intermediate complexity climate model + Mansfield + Journal of Advances in Modeling Earth +Systems + 11 + 14 + 10.1029/2022MS003245 + 1942-2466 + 2022 + Mansfield, L. A., & Sheshadri, A. +(2022). Calibration and uncertainty quantification of a gravity wave +parameterization: A case study of the Quasi-Biennial Oscillation in an +intermediate complexity climate model. Journal of Advances in Modeling +Earth Systems, 14(11). +https://doi.org/10.1029/2022MS003245 + + + Bayesian history matching applied to the +calibration of a gravity wave parameterization + King + 10.22541/essoar.170365299.96491153/v1 + 2023 + King, R. C., Mansfield, L. A., & +Sheshadri, A. (2023). Bayesian history matching applied to the +calibration of a gravity wave parameterization [Preprint]. +https://doi.org/10.22541/essoar.170365299.96491153/v1 + + + Equation of state calculations by fast +computing machines + Metropolis + The journal of chemical +physics + 6 + 21 + 10.1063/1.1699114 + 1953 + Metropolis, N., Rosenbluth, A. W., +Rosenbluth, M. N., Teller, A. H., & Teller, E. (1953). Equation of +state calculations by fast computing machines. The Journal of Chemical +Physics, 21(6), 1087–1092. 
+https://doi.org/10.1063/1.1699114 + + + PyVBMC: Efficient bayesian inference in +python + Huggins + Journal of Open Source +Software + 86 + 8 + 10.21105/joss.05428 + 2023 + Huggins, B., Li, C., Tobaben, M., +Aarnos, M. J., & Acerbi, L. (2023). PyVBMC: Efficient bayesian +inference in python. Journal of Open Source Software, 8(86), 5428. +https://doi.org/10.21105/joss.05428 + + + Fast and robust bayesian inference using +gaussian processes with GPry + Gammal + Journal of Cosmology and Astroparticle +Physics + 10 + 2023 + 10.1088/1475-7516/2023/10/021 + 2023 + Gammal, J. E., Schöneberg, N., +Torrado, J., & Fidler, C. (2023). Fast and robust bayesian inference +using gaussian processes with GPry. Journal of Cosmology and +Astroparticle Physics, 2023(10), 021. +https://doi.org/10.1088/1475-7516/2023/10/021 + + + The barker proposal: Combining robustness and +efficiency in gradient-based MCMC + Livingstone + Journal of the Royal Statistical Society +Series B: Statistical Methodology + 2 + 84 + 10.1111/rssb.12482 + 2022 + Livingstone, S., & Zanella, G. +(2022). The barker proposal: Combining robustness and efficiency in +gradient-based MCMC. Journal of the Royal Statistical Society Series B: +Statistical Methodology, 84(2), 496–523. +https://doi.org/10.1111/rssb.12482 + + + The no-u-turn sampler: Adaptively setting +path lengths in hamiltonian monte carlo. + Hoffman + J. Mach. Learn. Res. + 1 + 15 + 2014 + Hoffman, M. D., Gelman, A., & +others. (2014). The no-u-turn sampler: Adaptively setting path lengths +in hamiltonian monte carlo. J. Mach. Learn. Res., 15(1), +1593–1623. + + + + + + diff --git a/joss.06372/10.21105.joss.06372.jats b/joss.06372/10.21105.joss.06372.jats new file mode 100644 index 0000000000..f467834b6a --- /dev/null +++ b/joss.06372/10.21105.joss.06372.jats @@ -0,0 +1,1052 @@ + + +
+ + + + +Journal of Open Source Software +JOSS + +2475-9066 + +Open Journals + + + +6372 +10.21105/joss.06372 + +CalibrateEmulateSample.jl: Accelerated Parametric +Uncertainty Quantification + + + +https://orcid.org/0000-0001-7374-0382 + +Dunbar +Oliver R. A. + + +* + + + +Bieli +Melanie + + + + +https://orcid.org/0000-0003-3279-619X + +Garbuno-Iñigo +Alfredo + + + + +https://orcid.org/0000-0002-2878-3874 + +Howland +Michael + + + + +https://orcid.org/0000-0002-9906-7824 + +de Souza +Andre Nogueira + + + + +https://orcid.org/0000-0002-6285-6045 + +Mansfield +Laura Anne + + + + +https://orcid.org/0000-0001-5317-2445 + +Wagner +Gregory L. + + + + + +Efrat-Henrici +N. + + + + + +Geological and Planetary Sciences, California Institute of +Technology + + + + +Swiss Re Ltd. + + + + +Department of Statistics, Mexico Autonomous Institute of +Technology + + + + +Civil and Environmental Engineering, Massachusetts +Institute of Technology + + + + +Earth, Atmospheric, and Planetary Sciences, Massachusetts +Institute of Technology + + + + +Earth System Science, Doerr School of Sustainability, +Stanford University + + + + +* E-mail: + + +2 +1 +2024 + +9 +97 +6372 + +Authors of papers retain copyright and release the +work under a Creative Commons Attribution 4.0 International License (CC +BY 4.0) +2022 +The article authors + +Authors of papers retain copyright and release the work under +a Creative Commons Attribution 4.0 International License (CC BY +4.0) + + + +machine learning +optimization +bayesian +data assimilation + + + + + + Summary +

We present a Julia language (Bezanson et al., 2017) package providing a practical and modular implementation of "Calibrate, Emulate, Sample" (Cleary et al., 2021), hereafter CES, an accelerated workflow for obtaining model parametric uncertainty, a task also known as Bayesian inversion or uncertainty quantification. To apply CES one requires a computer model (written in any programming language) that depends on free parameters, a prior distribution encoding prior knowledge about those parameters, and some data with which to constrain this prior distribution. The pipeline has three stages, most easily explained in reverse:

+ + +

The goal of the workflow is to draw samples (Sample) from the + Bayesian posterior distribution, that is, the prior distribution + conditioned on the observed data,

+
+ +

To accelerate and regularize sampling we train statistical + emulators to represent the user-provided parameter-to-data map + (Emulate),

+
+ +

The training points for these emulators are generated by the + computer model, and selected adaptively around regions of high + posterior mass (Calibrate).

+
+
+

We describe CES as an accelerated workflow, as it is often able to + use dramatically fewer evaluations of the computer model when compared + with applying sampling algorithms, such as Markov chain Monte Carlo + (MCMC), directly.
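As a toy illustration of the three stages run in sequence, consider the following NumPy sketch. Every function and value here is an assumed stand-in for demonstration, not the package API: the "calibration" is faked by jittering the truth, and the "emulator" is a nearest-neighbour lookup rather than a Gaussian process.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the user's setup (assumed for illustration).
G = lambda theta: np.array([theta[0] + theta[1], theta[0] * theta[1]])  # "expensive" model
theta_true = np.array([1.0, 2.0])
Gamma = 0.01 * np.eye(2)                                                # noise covariance
y_obs = G(theta_true) + rng.multivariate_normal(np.zeros(2), Gamma)

# Calibrate: ~10^2 model runs concentrated near high posterior mass
# (faked here by jittering the truth; CES uses ensemble Kalman methods).
train_theta = theta_true + 0.1 * rng.standard_normal((50, 2))
train_G = np.array([G(t) for t in train_theta])

# Emulate: a cheap surrogate fit to the training pairs
# (nearest neighbour here; CES uses Gaussian processes or random features).
def G_emul(theta):
    return train_G[np.argmin(np.sum((train_theta - theta) ** 2, axis=1))]

# Sample: MCMC evaluates only the emulator, never the expensive model again.
def log_posterior(theta):  # flat prior assumed for brevity
    r = y_obs - G_emul(theta)
    return -0.5 * r @ np.linalg.solve(Gamma, r)
```

The point of the structure is that the sampling stage, which needs many evaluations, touches only the cheap surrogate.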

+ + +

Calibration tools: We recommend choosing adaptive training points with ensemble Kalman methods such as ensemble Kalman inversion (EKI) (Iglesias et al., 2013) and its variants (Huang et al., 2022); CES provides explicit utilities from the codebase EnsembleKalmanProcesses.jl (Dunbar, Lopez-Gomez, et al., 2022).

+
+ +

Emulation tools: CES can integrate any statistical emulator. Currently implemented are Gaussian processes (GP) (Williams & Rasmussen, 2006), provided through the packages SciKitLearn.jl (Pedregosa et al., 2011) and GaussianProcesses.jl (Fairbrother et al., 2022), and random features (Liu et al., 2022; Rahimi et al., 2007; Rahimi & Recht, 2008), provided through RandomFeatures.jl, which offer additional flexibility and scalability, particularly in higher dimensions.

+
+ +

Sampling tools: The regularized and accelerated sampling problem is solved with MCMC; CES provides variants of random walk Metropolis (Metropolis et al., 1953; Sherlock et al., 2010) and preconditioned Crank-Nicolson (Cotter et al., 2013), using APIs from Turing.jl. Some emulator mean functions are differentiable, and incorporating derivative-based MCMC accelerations into CES (e.g., NUTS, Hoffman et al., 2014; the Barker proposal, Livingstone & Zanella, 2022) is an active direction of work.

+
+
+

To highlight code accessibility, we also provide a suite of detailed, scientifically inspired examples, with documentation that walks users through some use cases. These use cases not only demonstrate the capability of the CES pipeline, but also familiarize users with the typical interface and workflow.

+
+ + Statement of need +

Computationally expensive computer codes for predictive modelling are ubiquitous across science and engineering disciplines. Free parameter values that exist within these modelling frameworks are typically constrained by observations to produce accurate and robust predictions about the system being approximated numerically. In a Bayesian setting, this is viewed as evolving an initial parameter distribution (based on prior information), through the input of observed data, into a more informative, data-consistent distribution (the posterior). Unfortunately, this task is intensely computationally expensive, commonly requiring over 10⁵ evaluations of the expensive computer code (e.g., with random walk Metropolis), and accelerations typically rely on intrusive model information, such as a derivative of the parameter-to-data map. CES is able to approximate and accelerate this process in a non-intrusive fashion, requiring only on the order of 10² evaluations of the original computer model. This opens the door to quantifying parametric uncertainty for a class of numerically intensive computer codes for which it was previously infeasible.

+
+ + State of the field +

In Julia there are a few tools for performing non-accelerated uncertainty quantification, from classical sensitivity-analysis approaches, for example, UncertaintyQuantification.jl and GlobalSensitivity.jl (Dixit & Rackauckas, 2022), to MCMC, for example, Mamba.jl or Turing.jl. For computational efficiency, ensemble methods also provide approximate sampling (e.g., the ensemble Kalman sampler; Garbuno-Inigo et al., 2020; Dunbar, Lopez-Gomez, et al., 2022), though these provide only Gaussian approximations of the posterior.

+

Accelerated uncertainty quantification tools also exist for the related approach of Approximate Bayesian Computation (ABC), for example, GpABC (Tankhilevich et al., 2020) or ApproxBayes.jl; both tools approximately sample from the posterior distribution. In ABC, the approximation comes from bypassing the likelihood that is usually required in sampling methods such as MCMC; instead, ABC replaces the likelihood with a scalar-valued sampling objective that compares model and data. In CES, the approximation comes from learning the parameter-to-data map, after which an explicit likelihood is calculated and sampled exactly via MCMC. Some ABC algorithms also make use of statistical emulators to further accelerate sampling (GpABC). Although flexible, ABC encounters challenges due to the subjectivity of summary statistics and distance metrics, which may lead to approximation errors, particularly in high-dimensional settings (Nott et al., 2018). CES is more restrictive due to its use of an explicit Gaussian likelihood, but it also leverages this structure to deal with high-dimensional data.

+

Several other tools are available in other languages for the purpose of accelerating the learning of the posterior distribution or posterior sampling. Two such examples, written in Python, approximate the log-posterior distribution directly with a Gaussian process: PyVBMC (Huggins et al., 2023), which additionally uses variational approximations to calculate the normalization constant, and GPry (Gammal et al., 2023), which iteratively trains the GP with an active training-point selection algorithm. Such algorithms are distinct from CES, which approximates the parameter-to-data map with the Gaussian process, and advocates ensemble Kalman methods to select training points.

+
+ + A simple example from the code documentation +

We sketch an end-to-end example of the pipeline, with a fully detailed walkthrough given in the online documentation.

+

We have a model of a sinusoidal signal that is a function of parameters θ = (A, v), where A is the amplitude of the signal and v is the vertical shift of the signal: f(A, v) = A sin(ϕ + t) + v, t ∈ [0, 2π]. Here, ϕ is the random phase of each signal. The goal is to estimate not just point values of the parameters θ = (A, v), but their entire probability distributions, given some noisy observations. We will use the range and mean of a signal as our observables: G(θ) = [range(f(θ)), mean(f(θ))]. Then our noisy observations, y_obs, can be written as y_obs = G(θ) + η, where η ∼ 𝒩(0, Γ) and Γ is the observational covariance matrix. We will assume the noise to be independent for each observable, giving us a diagonal covariance matrix.
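This forward map can be sketched directly in NumPy. The structure mirrors the description above; the specific noise covariance values are assumptions of this sketch, not the documented example's values:

```python
import numpy as np

rng = np.random.default_rng(42)
t = np.linspace(0, 2 * np.pi, 200)

def f(amplitude, vert_shift, phi):
    # f(A, v) = A sin(phi + t) + v on t in [0, 2*pi]
    return amplitude * np.sin(phi + t) + vert_shift

def G(theta, phi):
    # Observable map: the range and the mean of the signal
    signal = f(theta[0], theta[1], phi)
    return np.array([signal.max() - signal.min(), signal.mean()])

theta_true = np.array([3.0, 7.0])        # (A, v) used in the example
Gamma = np.diag([0.2 ** 2, 0.1 ** 2])    # assumed diagonal noise covariance
phi = rng.uniform(0, 2 * np.pi)          # random phase
y_obs = G(theta_true, phi) + rng.multivariate_normal(np.zeros(2), Gamma)
```

Note that with A = 3 the range observable is close to 6 and the mean observable close to v, regardless of the random phase, which is what makes the pair informative about (A, v).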

+ +

The true and observed range and mean. +

+ +
+

For this experiment + + θ=(A,v)=(3.0,7.0), + and the noisy observations are displayed in blue in + [fig:signal].

+

We define prior distributions on the two parameters. For the + amplitude, we define a prior with mean 2 and standard deviation 1. It + is additionally constrained to be nonnegative. For the vertical shift + we define a prior with mean 0 and standard deviation 5.

+ const PD = CalibrateEmulateSample.ParameterDistributions +prior_u1 = PD.constrained_gaussian("amplitude", 2, 1, 0, Inf) +prior_u2 = PD.constrained_gaussian("vert_shift", 0, 5, -Inf, Inf) +prior = PD.combine_distributions([prior_u1, prior_u2]) + +

Marginal distributions of the prior +

+ +
+

The prior is displayed in + [fig:prior].
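For intuition, a nonnegativity-constrained prior with a requested mean and standard deviation can be mimicked by a moment-matched lognormal. This construction is an assumption of the sketch, made for illustration; `constrained_gaussian`'s exact recipe in the package may differ:

```python
import numpy as np

def lognormal_with_moments(mean, std, n, rng):
    # Moment-match exp(N(mu, s2)) so that samples on (0, inf) have the
    # requested mean and std. Assumed illustration only -- not necessarily
    # the package's constrained_gaussian construction.
    s2 = np.log(1.0 + (std / mean) ** 2)
    mu = np.log(mean) - 0.5 * s2
    return np.exp(rng.normal(mu, np.sqrt(s2), size=n))

rng = np.random.default_rng(1)
amplitude = lognormal_with_moments(2.0, 1.0, 100_000, rng)  # nonnegative prior
vert_shift = rng.normal(0.0, 5.0, size=100_000)             # unconstrained prior
```

The samples are strictly positive while still having mean 2 and standard deviation 1, matching the constraint and moments stated above.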

+

We now adaptively find input-output pairs from our map G in a region of interest using an inversion method (an ensemble Kalman process). This is the Calibrate stage: it iteratively generates parameter combinations that refine around a region of high posterior mass.
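Conceptually, each iteration applies an ensemble Kalman update. The following NumPy sketch shows a basic perturbed-observation EKI step in the spirit of Iglesias et al. (2013); the actual update implemented in EnsembleKalmanProcesses.jl may differ in its details:

```python
import numpy as np

def eki_update(thetas, Gs, y_obs, Gamma, rng):
    """One perturbed-observation EKI step (illustrative sketch only).

    thetas: (J, p) parameter ensemble; Gs: (J, d) forward-model outputs G(theta_j).
    """
    J = thetas.shape[0]
    dt = thetas - thetas.mean(axis=0)
    dg = Gs - Gs.mean(axis=0)
    C_tg = dt.T @ dg / J          # parameter-output cross-covariance
    C_gg = dg.T @ dg / J          # output covariance
    # Kalman-style gain, regularized by the observational noise covariance.
    K = C_tg @ np.linalg.inv(C_gg + Gamma)
    # Perturb the observations so the ensemble retains spread.
    y_pert = y_obs + rng.multivariate_normal(np.zeros(len(y_obs)), Gamma, size=J)
    return thetas + (y_pert - Gs) @ K.T
```

Because the update uses only ensemble statistics of G, no derivatives of the forward map are needed, which is what makes the Calibrate stage non-intrusive.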

+ const EKP = CalibrateEmulateSample.EnsembleKalmanProcesses +N_ensemble = 10 +N_iterations = 5 +initial_ensemble = EKP.construct_initial_ensemble(prior, N_ensemble) +ensemble_kalman_process = EKP.EnsembleKalmanProcess( + initial_ensemble, y_obs, Γ, EKP.Inversion(); +) +for i in 1:N_iterations + params_i = EKP.get_phi_final(prior, ensemble_kalman_process) + G_ens = hcat([G(params_i[:, i]) for i in 1:N_ensemble]...) + EKP.update_ensemble!(ensemble_kalman_process, G_ens) +end + +

The resulting ensemble from a calibration. +

+ +
+

The adaptively refined training points from EKP are displayed in [fig:eki]. We now build a basic Gaussian process emulator from the GaussianProcesses.jl package to emulate the map G using these points.

+ const UT = CalibrateEmulateSample.Utilities +const EM = CalibrateEmulateSample.Emulators + +input_output_pairs = UT.get_training_points( + ensemble_kalman_process, N_iterations, +) +gppackage = EM.GPJL() +gauss_proc = EM.GaussianProcess(gppackage, noise_learn = false) +emulator = EM.Emulator( + gauss_proc, input_output_pairs, normalize_inputs = true, obs_noise_cov = Γ, +) +EM.optimize_hyperparameters!(emulator) # train the emulator + +

The Gaussian process emulator of the range and mean + maps, trained on the re-used calibration pairs +

+ +
+

We evaluate the mean of this emulator on a grid, and also show the + value of the true + + G + at training point locations in + [fig:GP_emulator].
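For intuition, the posterior mean and variance formulas underlying a GP emulator are compact (Williams & Rasmussen, 2006, ch. 2). Here is a minimal NumPy sketch with a squared-exponential kernel and fixed, assumed hyperparameters, not the GaussianProcesses.jl implementation (which also optimizes the hyperparameters):

```python
import numpy as np

def rbf(X1, X2, ell=0.6, sf=1.0):
    # Squared-exponential kernel k(x, x') = sf^2 exp(-|x - x'|^2 / (2 ell^2)).
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
    return sf ** 2 * np.exp(-0.5 * d2 / ell ** 2)

def gp_predict(X_train, y_train, X_test, noise_var=1e-6):
    # Posterior mean and variance of a zero-mean GP regression.
    K = rbf(X_train, X_train) + noise_var * np.eye(len(X_train))
    Ks = rbf(X_test, X_train)
    mean = Ks @ np.linalg.solve(K, y_train)
    cov_reduction = np.einsum("ij,ji->i", Ks, np.linalg.solve(K, Ks.T))
    var = rbf(X_test, X_test).diagonal() + noise_var - cov_reduction
    return mean, var
```

The predictive variance grows back toward the prior variance away from the training points, which is exactly the regularizing behaviour CES exploits when the MCMC sampler wanders outside the calibrated region.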

+

We can then sample with this emulator using an MCMC scheme. We first choose a good step size (an algorithm parameter) by running some short sampling runs (each of 2,000 steps). Then we run a 100,000-step sampling run to generate samples of the joint posterior distribution.

+ const MC = CalibrateEmulateSample.MarkovChainMonteCarlo +mcmc = MC.MCMCWrapper( + MC.RWMHSampling(), y_obs, prior, emulator, +) +# choose a step size +new_step = MC.optimize_stepsize( + mcmc; init_stepsize = 0.1, N = 2000, +) +# Now begin the actual MCMC +chain = MC.sample( + mcmc, 100_000; stepsize = new_step, discard_initial = 2_000, +) + +

The joint posterior distribution histogram +

+ +
+

A histogram of the samples from the CES algorithm is displayed in + [fig:GP_2d_posterior]. + We see that the posterior distribution contains the true value + + + (3.0,7.0) + with high probability.
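The random walk Metropolis accept/reject rule used above (Metropolis et al., 1953) is compact enough to sketch directly; this is an illustration of the algorithm, not the MarkovChainMonteCarlo module:

```python
import numpy as np

def rwmh(log_post, theta0, step, n, rng):
    # Random walk Metropolis: isotropic Gaussian proposal, accepted with
    # probability min(1, posterior ratio). Illustrative sketch only.
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    chain = [theta.copy()]
    for _ in range(n):
        proposal = theta + step * rng.standard_normal(theta.shape)
        lp_prop = log_post(proposal)
        if np.log(rng.uniform()) < lp_prop - lp:  # Metropolis accept/reject
            theta, lp = proposal, lp_prop
        chain.append(theta.copy())
    return np.array(chain)
```

In CES, `log_post` would query the emulator rather than the original computer model, so the 100,000 steps above cost only emulator evaluations.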

+
+ + Research projects using the package +

Some research projects that use this codebase, or modifications of + it, are

+ + +

(Dunbar + et al., 2021)

+
+ +

(Bieli + et al., 2022)

+
+ +

(Hillier, + 2022)

+
+ +

(Howland + et al., 2022)

+
+ +

(Dunbar, + Howland, et al., 2022)

+
+ +

(Mansfield + & Sheshadri, 2022)

+
+ +

(King + et al., 2023)

+
+
+
+ + Acknowledgements +

We acknowledge contributions from several others who played a role + in the evolution of this package. These include Adeline Hillier, + Ignacio Lopez Gomez and Thomas Jackson. The development of this + package was supported by the generosity of Eric and Wendy Schmidt by + recommendation of the Schmidt Futures program, National Science + Foundation Grant AGS-1835860, the Defense Advanced Research Projects + Agency (Agreement No. HR00112290030), the Heising-Simons Foundation, + Audi Environmental Foundation, and the Cisco Foundation.

+
+ + + + + + + BezansonJeff + EdelmanAlan + KarpinskiStefan + ShahViral B. + + Julia: A fresh approach to numerical computing + SIAM Review + Society for Industrial & Applied Mathematics (SIAM) + 201701 + 59 + 1 + 10.1137/141000671 + 65 + 98 + + + + + + NottDavid J. + OngVictor M.-H. + FanY. + SissonS. A. + + High-Dimensional ABC + Handbook of Approximate Bayesian Computation + CRC Press + 2018 + 978-1-315-11719-5 + 10.1201/9781315117195-8 + 211 + 241 + + + + + + ClearyEmmet + Garbuno-InigoAlfredo + LanShiwei + SchneiderTapio + StuartAndrew M. + + Calibrate, emulate, sample + Journal of Computational Physics + 2021 + 424 + 0021-9991 + 10.1016/j.jcp.2020.109716 + 109716 + + + + + + + DunbarOliver R. A. + Lopez-GomezIgnacio + Garbuno-IñigoAlfredo Garbuno-Iñigo + HuangDaniel Zhengyu + BachEviatar + WuJin-long + + EnsembleKalmanProcesses.jl: Derivative-free ensemble-based model calibration + Journal of Open Source Software + The Open Journal + 2022 + 7 + 80 + 10.21105/joss.04869 + 4869 + + + + + + + BieliMelanie + DunbarOliver R. A. + JongEmily K. de + JarugaAnna + SchneiderTapio + BischoffTobias + + An efficient Bayesian approach to learning droplet collision kernels: Proof of concept using “Cloudy,” a new n-moment bulk microphysics scheme + Journal of Advances in Modeling Earth Systems + 2022 + 14 + 8 + 10.1029/2022MS002994 + e2022MS002994 + + + + + + + HillierAdeline + + Supervised calibration and uncertainty quantification of subgrid closure parameters using ensemble Kalman inversion + Massachusetts Institute of Technology. 
Department of Electrical Engineering; Computer Science + 2022 + 1721.1/145140 + + + + + + WilliamsChristopher KI + RasmussenCarl Edward + + Gaussian processes for machine learning + MIT press Cambridge, MA + 2006 + 2 + 10.1142/S0129065704001899 + + + + + + IglesiasMarco A + LawKody JH + StuartAndrew M + + Ensemble kalman methods for inverse problems + Inverse Problems + IOP Publishing + 2013 + 29 + 4 + 10.1088/0266-5611/29/4/045001 + 045001 + + + + + + + RahimiAli + RechtBenjamin + others + + Random features for large-scale kernel machines. + NIPS + 2007 + 3 + https://proceedings.neurips.cc/paper_files/paper/2007/file/013a006f03dbc5392effeb8f18fda755-Paper.pdf + 5 + + + + + + + RahimiAli + RechtBenjamin + + Uniform approximation of functions with random bases + 2008 46th annual allerton conference on communication, control, and computing + IEEE + 2008 + 10.1109/allerton.2008.4797607 + 555 + 561 + + + + + + LiuFanghui + HuangXiaolin + ChenYudong + SuykensJohan A. K. + + Random features for kernel approximation: A survey on algorithms, theory, and beyond + IEEE Transactions on Pattern Analysis and Machine Intelligence + 2022 + 44 + 10 + 10.1109/TPAMI.2021.3097011 + 7128 + 7148 + + + + + + CotterS. L. + RobertsG. O. + StuartA. M. + WhiteD. + + MCMC Methods for Functions: Modifying Old Algorithms to Make Them Faster + Statistical Science + Institute of Mathematical Statistics + 2013 + 28 + 3 + 10.1214/13-STS421 + 424 + 446 + + + + + + SherlockChris + FearnheadPaul + RobertsGareth O. + + The random walk metropolis: Linking theory and practice through a case study + Statistical Science + Institute of Mathematical Statistics + 2010 + 25 + 2 + 10.1214/10-sts327 + 172 + 190 + + + + + + DunbarOliver R. A. + Garbuno-InigoAlfredo + SchneiderTapio + StuartAndrew M. 
+ + Calibration and uncertainty quantification of convective parameters in an idealized GCM + Journal of Advances in Modeling Earth Systems + 2021 + 13 + 9 + 10.1029/2020MS002454 + e2020MS002454 + + + + + + + HowlandMichael F. + DunbarOliver R. A. + SchneiderTapio + + Parameter uncertainty quantification in an idealized GCM with a seasonal cycle + Journal of Advances in Modeling Earth Systems + 2022 + 14 + 3 + 10.1029/2021MS002735 + e2021MS002735 + + + + + + + DunbarOliver R. A. + HowlandMichael F. + SchneiderTapio + StuartAndrew M. + + Ensemble-based experimental design for targeting data acquisition to inform climate models + Journal of Advances in Modeling Earth Systems + 2022 + 14 + 9 + 10.1029/2022MS002997 + e2022MS002997 + + + + + + + PedregosaF. + VaroquauxG. + GramfortA. + MichelV. + ThirionB. + GriselO. + BlondelM. + PrettenhoferP. + WeissR. + DubourgV. + VanderplasJ. + PassosA. + CournapeauD. + BrucherM. + PerrotM. + DuchesnayE. + + Scikit-learn: Machine learning in Python + Journal of Machine Learning Research + 2011 + 12 + 2825 + 2830 + + + + + + FairbrotherJamie + NemethChristopher + RischardMaxime + BreaJohanni + PinderThomas + + GaussianProcesses. 
Jl: A nonparametric bayes package for the julia language + Journal of Statistical Software + 2022 + 102 + 10.18637/jss.v102.i01 + 1 + 36 + + + + + + DixitVaibhav Kumar + RackauckasChristopher + + GlobalSensitivity.jl: Performant and parallel global sensitivity analysis with julia + Journal of Open Source Software + The Open Journal + 2022 + 7 + 76 + 10.21105/joss.04561 + 4561 + + + + + + + Garbuno-InigoAlfredo + NüskenNikolas + ReichSebastian + + Affine invariant interacting Langevin dynamics for Bayesian inference + SIAM Journal on Applied Dynamical Systems + SIAM + 2020 + 19 + 3 + 10.1137/19M1304891 + 1633 + 1658 + + + + + + TankhilevichEvgeny + Ish-HorowiczJonathan + HameedTara + RoeschElisabeth + KleijnIstvan + StumpfMichael P H + HeFei + + GpABC: a Julia package for approximate Bayesian computation with Gaussian process emulation + Bioinformatics + 202002 + 1367-4803 + 10.1093/bioinformatics/btaa078 + + + + + + HuangDaniel Zhengyu + HuangJiaoyang + ReichSebastian + StuartAndrew M + + Efficient derivative-free bayesian inference for large-scale inverse problems + Inverse Problems + IOP Publishing + 202210 + 38 + 12 + 10.1088/1361-6420/ac99fa + 125006 + + + + + + + MansfieldL. A. + SheshadriA. 
+ + Calibration and uncertainty quantification of a gravity wave parameterization: A case study of the Quasi-Biennial Oscillation in an intermediate complexity climate model + Journal of Advances in Modeling Earth Systems + 2022 + 14 + 11 + 1942-2466 + 10.1029/2022MS003245 + + + + + + KingRobert C + MansfieldLaura A + SheshadriAditi + + Bayesian history matching applied to the calibration of a gravity wave parameterization + Preprints + 202312 + 10.22541/essoar.170365299.96491153/v1 + + + + + + MetropolisNicholas + RosenbluthArianna W + RosenbluthMarshall N + TellerAugusta H + TellerEdward + + Equation of state calculations by fast computing machines + The journal of chemical physics + American Institute of Physics + 1953 + 21 + 6 + 10.1063/1.1699114 + 1087 + 1092 + + + + + + HugginsBobby + LiChengkun + TobabenMarlon + AarnosMikko J. + AcerbiLuigi + + PyVBMC: Efficient bayesian inference in python + Journal of Open Source Software + The Open Journal + 2023 + 8 + 86 + 10.21105/joss.05428 + 5428 + + + + + + + GammalJonas El + SchönebergNils + TorradoJesús + FidlerChristian + + Fast and robust bayesian inference using gaussian processes with GPry + Journal of Cosmology and Astroparticle Physics + IOP Publishing + 202310 + 2023 + 10 + 10.1088/1475-7516/2023/10/021 + 021 + + + + + + + LivingstoneSamuel + ZanellaGiacomo + + The barker proposal: Combining robustness and efficiency in gradient-based MCMC + Journal of the Royal Statistical Society Series B: Statistical Methodology + Oxford University Press + 2022 + 84 + 2 + 10.1111/rssb.12482 + 496 + 523 + + + + + + HoffmanMatthew D + GelmanAndrew + others + + The no-u-turn sampler: Adaptively setting path lengths in hamiltonian monte carlo. + J. Mach. Learn. Res. + 2014 + 15 + 1 + 1593 + 1623 + + + + +
diff --git a/joss.06372/10.21105.joss.06372.pdf b/joss.06372/10.21105.joss.06372.pdf new file mode 100644 index 0000000000..f944f9d61d Binary files /dev/null and b/joss.06372/10.21105.joss.06372.pdf differ diff --git a/joss.06372/media/sinusoid_GP_emulator_contours.png b/joss.06372/media/sinusoid_GP_emulator_contours.png new file mode 100644 index 0000000000..8eb67bf43a Binary files /dev/null and b/joss.06372/media/sinusoid_GP_emulator_contours.png differ diff --git a/joss.06372/media/sinusoid_MCMC_hist_GP.png b/joss.06372/media/sinusoid_MCMC_hist_GP.png new file mode 100644 index 0000000000..10e1f7ef1a Binary files /dev/null and b/joss.06372/media/sinusoid_MCMC_hist_GP.png differ diff --git a/joss.06372/media/sinusoid_eki_pairs.png b/joss.06372/media/sinusoid_eki_pairs.png new file mode 100644 index 0000000000..229f75d1c6 Binary files /dev/null and b/joss.06372/media/sinusoid_eki_pairs.png differ diff --git a/joss.06372/media/sinusoid_prior.png b/joss.06372/media/sinusoid_prior.png new file mode 100644 index 0000000000..ae7e41d1f7 Binary files /dev/null and b/joss.06372/media/sinusoid_prior.png differ diff --git a/joss.06372/media/sinusoid_true_vs_observed_signal.png b/joss.06372/media/sinusoid_true_vs_observed_signal.png new file mode 100644 index 0000000000..d143ac7c50 Binary files /dev/null and b/joss.06372/media/sinusoid_true_vs_observed_signal.png differ