diff --git a/joss.05735/10.21105.joss.05735.crossref.xml b/joss.05735/10.21105.joss.05735.crossref.xml new file mode 100644 index 0000000000..8f2eeb61c7 --- /dev/null +++ b/joss.05735/10.21105.joss.05735.crossref.xml @@ -0,0 +1,363 @@ + + + + 20231218T180814-34a5f04fecfc0befd24ee48f5ac11a6bef80e6e6 + 20231218180814 + + JOSS Admin + admin@theoj.org + + The Open Journal + + + + + Journal of Open Source Software + JOSS + 2475-9066 + + 10.21105/joss + https://joss.theoj.org + + + + + 12 + 2023 + + + 8 + + 92 + + + + parafields: A generator for distributed, stationary +Gaussian processes + + + + Dominic + Kempf + https://orcid.org/0000-0002-6140-2332 + + + Ole + Klein + https://orcid.org/0000-0002-3295-7347 + + + Robert + Kutri + https://orcid.org/0009-0004-8123-4673 + + + Robert + Scheichl + https://orcid.org/0000-0001-8493-4393 + + + Peter + Bastian + + + + 12 + 18 + 2023 + + + 5735 + + + 10.21105/joss.05735 + + + http://creativecommons.org/licenses/by/4.0/ + http://creativecommons.org/licenses/by/4.0/ + http://creativecommons.org/licenses/by/4.0/ + + + + Software archive + 10.5281/zenodo.10355636 + + + GitHub review issue + https://github.com/openjournals/joss-reviews/issues/5735 + + + + 10.21105/joss.05735 + https://joss.theoj.org/papers/10.21105/joss.05735 + + + https://joss.theoj.org/papers/10.21105/joss.05735.pdf + + + + + + Fast and exact simulation of stationary +Gaussian processes through circulant embedding of the covariance +matrix + Dietrich + SIAM Journal on Scientific +Computing + 4 + 18 + 10.1137/s1064827592240555 + 1997 + Dietrich, C. R., & Newsam, G. N. +(1997). Fast and exact simulation of stationary Gaussian processes +through circulant embedding of the covariance matrix. SIAM Journal on +Scientific Computing, 18(4), 1088–1107. +https://doi.org/10.1137/s1064827592240555 + + + parafields-core + Klein + GitHub repository + 2022 + Klein, O., & Kempf, D. (2022). +parafields-core. In GitHub repository. GitHub. 
+https://github.com/parafields/parafields-core + + + mpi4py: Status update after 12 years of +development + Dalcin + Computing in Science & +Engineering + 4 + 23 + 10.1109/MCSE.2021.3083216 + 2021 + Dalcin, L., & Fang, Y.-L. L. +(2021). mpi4py: Status update after 12 years of development. Computing +in Science & Engineering, 23(4), 47–54. +https://doi.org/10.1109/MCSE.2021.3083216 + + + The dune framework: Basic concepts and recent +developments + Bastian + Computers & Mathematics with +Applications + 81 + 10.1016/j.camwa.2020.06.007 + 0898-1221 + 2021 + Bastian, P., Blatt, M., Dedner, A., +Dreier, N.-A., Engwer, C., Fritze, R., Gräser, C., Grüninger, C., Kempf, +D., Klöfkorn, R., Ohlberger, M., & Sander, O. (2021). The dune +framework: Basic concepts and recent developments. Computers & +Mathematics with Applications, 81, 75–112. +https://doi.org/10.1016/j.camwa.2020.06.007 + + + pybind11 – seamless operability between C++11 +and Python + Jakob + 2017 + Jakob, W., Rhinelander, J., & +Moldovan, D. (2017). pybind11 – seamless operability between C++11 and +Python. + + + Dune-randomfield - generation of Gaussian +random fields in arbitrary dimensions, based on circulant +embedding + Klein + 2017 + Klein, O. (2017). Dune-randomfield - +generation of Gaussian random fields in arbitrary dimensions, based on +circulant embedding. + + + FakeMPI - a sequential MPI +stub + Kempf + 2022 + Kempf, D., & PetSc Developers, +the. (2022). FakeMPI - a sequential MPI stub. + + + jcfr/scipy_2018_scikit-build_talk: SciPy 2018 +talk | scikit-build: A build system generator for CPython +C/C++/Fortran/Cython extensions + Fillion-Robin + 10.5281/zenodo.2565368 + 2018 + Fillion-Robin, J.-C., McCormick, M., +Padron, O., Smolens, M., Grauer, M., & Sarahan, M. (2018). +jcfr/scipy_2018_scikit-build_talk: SciPy 2018 talk | scikit-build: A +build system generator for CPython C/C++/Fortran/Cython extensions +(Version v1.0). Zenodo. 
+https://doi.org/10.5281/zenodo.2565368 + + + Simulation of stationary Gaussian processes +in [0, 1] d + Wood + Journal of computational and graphical +statistics + 4 + 3 + 10.2307/1390903 + 1994 + Wood, A. T., & Chan, G. (1994). +Simulation of stationary Gaussian processes in [0, 1] d. Journal of +Computational and Graphical Statistics, 3(4), 409–432. +https://doi.org/10.2307/1390903 + + + On circulant embedding for Gaussian random +fields in R + Davies + Journal of Statistical +Software + 9 + 55 + 10.18637/jss.v055.i09 + 2013 + Davies, T. M., & Bryant, D. +(2013). On circulant embedding for Gaussian random fields in R. Journal +of Statistical Software, 55(9), 1–21. +https://doi.org/10.18637/jss.v055.i09 + + + GaussianRandomFields.jl: A Julia package to +generate and sample from Gaussian random fields + Robbe + Journal of Open Source +Software + 89 + 8 + 10.21105/joss.05595 + 2023 + Robbe, P. (2023). +GaussianRandomFields.jl: A Julia package to generate and sample from +Gaussian random fields. Journal of Open Source Software, 8(89), 5595. +https://doi.org/10.21105/joss.05595 + + + GSTools v1.3: A toolbox for geostatistical +modelling in Python + Müller + Geoscientific Model +Development + 7 + 15 + 10.5194/gmd-15-3161-2022 + 2022 + Müller, S., Schüler, L., Zech, A., +& Heße, F. (2022). GSTools v1.3: A toolbox for geostatistical +modelling in Python. Geoscientific Model Development, 15(7), 3161–3182. +https://doi.org/10.5194/gmd-15-3161-2022 + + + Spatio-temporal modeling of particulate +matter concentration through the SPDE approach + Cameletti + AStA Advances in Statistical +Analysis + 97 + 10.1007/s10182-012-0196-3 + 2013 + Cameletti, M., Lindgren, F., Simpson, +D., & Rue, H. (2013). Spatio-temporal modeling of particulate matter +concentration through the SPDE approach. AStA Advances in Statistical +Analysis, 97, 109–131. 
+https://doi.org/10.1007/s10182-012-0196-3 + + + A spatial analysis of multivariate output +from regional climate models + Sain + The Annals of Applied +Statistics + 10.1214/10-AOAS369 + 2011 + Sain, S. R., Furrer, R., & +Cressie, N. (2011). A spatial analysis of multivariate output from +regional climate models. The Annals of Applied Statistics, 150–175. +https://doi.org/10.1214/10-AOAS369 + + + A hierarchical multilevel Markov chain Monte +Carlo algorithm with applications to uncertainty quantification in +subsurface flow + Dodwell + SIAM/ASA Journal on Uncertainty +Quantification + 1 + 3 + 10.1137/130915005 + 2015 + Dodwell, T. J., Ketelsen, C., +Scheichl, R., & Teckentrup, A. L. (2015). A hierarchical multilevel +Markov chain Monte Carlo algorithm with applications to uncertainty +quantification in subsurface flow. SIAM/ASA Journal on Uncertainty +Quantification, 3(1), 1075–1108. +https://doi.org/10.1137/130915005 + + + Bayesian fMRI time series analysis with +spatial priors + Penny + NeuroImage + 2 + 24 + 10.1016/j.neuroimage.2004.08.034 + 2005 + Penny, W. D., Trujillo-Barreto, N. +J., & Friston, K. J. (2005). Bayesian fMRI time series analysis with +spatial priors. NeuroImage, 24(2), 350–362. +https://doi.org/10.1016/j.neuroimage.2004.08.034 + + + Random heterogeneous materials: +Microstructure and macroscopic properties + Torquato + Appl. Mech. Rev. + 4 + 55 + 10.1115/1.1483342 + 2002 + Torquato, S., & Haslach Jr, H. +(2002). Random heterogeneous materials: Microstructure and macroscopic +properties. Appl. Mech. Rev., 55(4), B62–B63. +https://doi.org/10.1115/1.1483342 + + + Quasi-monte carlo and multilevel Monte Carlo +methods for computing posterior expectations in elliptic inverse +problems + Scheichl + SIAM/ASA Journal on Uncertainty +Quantification + 1 + 5 + 10.1137/16m1061692 + 2017 + Scheichl, R., Stuart, A. M., & +Teckentrup, A. L. (2017). 
Quasi-monte carlo and multilevel Monte Carlo +methods for computing posterior expectations in elliptic inverse +problems. SIAM/ASA Journal on Uncertainty Quantification, 5(1), 493–518. +https://doi.org/10.1137/16m1061692 + + + + + + diff --git a/joss.05735/10.21105.joss.05735.jats b/joss.05735/10.21105.joss.05735.jats new file mode 100644 index 0000000000..e59732c2dc --- /dev/null +++ b/joss.05735/10.21105.joss.05735.jats @@ -0,0 +1,557 @@ + + +
+ + + + +Journal of Open Source Software +JOSS + +2475-9066 + +Open Journals + + + +5735 +10.21105/joss.05735 + +parafields: A generator for distributed, stationary +Gaussian processes + + + +https://orcid.org/0000-0002-6140-2332 + +Kempf +Dominic + + + +* + + +https://orcid.org/0000-0002-3295-7347 + +Klein +Ole + + + + +https://orcid.org/0009-0004-8123-4673 + +Kutri +Robert + + + + + +https://orcid.org/0000-0001-8493-4393 + +Scheichl +Robert + + + + + + +Bastian +Peter + + + + + +Scientific Software Center, Heidelberg University, +Heidelberg, Germany + + + + +Interdisciplinary Center for Scientific Computing, +Heidelberg University, Heidelberg, Germany + + + + +Institute for Mathematics, Heidelberg University, +Heidelberg, Germany + + + + +Independent Researcher, Heidelberg, Germany + + + + +* E-mail: + + +11 +5 +2023 + +8 +92 +5735 + +Authors of papers retain copyright and release the +work under a Creative Commons Attribution 4.0 International License (CC +BY 4.0) +2022 +The article authors + +Authors of papers retain copyright and release the work under +a Creative Commons Attribution 4.0 International License (CC BY +4.0) + + + +Python +MPI +scientific computing +high performance computing +uncertainty quantification +random field generation +circulant embedding + + + + + + Summary +

Parafields is a Python package for the generation of stationary + Gaussian random fields with well-defined, known statistical + properties. Such fields are a key ingredient of simulation + workflows that involve uncertain, spatially heterogeneous parameters. + Gaussian random fields therefore play a dominant role in geostatistics, + e.g., in the modelling of particulate matter concentration, + temperature distributions and subsurface flow + (Cameletti + et al., 2013) + (Sain + et al., 2011) + (Dodwell + et al., 2015). Outside these traditional applications, Gaussian + random fields are also used in biomedical imaging + (Penny + et al., 2005), materials science + (Torquato + & Haslach Jr, 2002) and within Markov chain Monte Carlo + methods in Bayesian estimation + (Scheichl + et al., 2017).

+

Parafields can also run in parallel using the Message + Passing Interface (MPI) standard through mpi4py + (Dalcin + & Fang, 2021). In this case, the computational domain is + split and each process generates only the part of the random field + relevant to it. The generation process is + implemented in a performance-oriented C++ + backend library and exposed through an intuitive Python + interface.
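The domain split described above can be pictured with a small sketch. The helper `local_range` is hypothetical and not part of parafields. It assumes a one-dimensional grid cut into contiguous blocks, which may differ from the partitioning the C++ backend actually uses.

```python
def local_range(n_cells, n_ranks, rank):
    """Return the half-open cell range [start, stop) owned by `rank`.

    Hypothetical helper for illustration only: the cells are divided into
    contiguous blocks whose sizes differ by at most one.
    """
    base, extra = divmod(n_cells, n_ranks)
    # The first `extra` ranks each receive one additional cell.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


# Each rank would generate only the cells inside its own range.
ranges = [local_range(10, 3, rank) for rank in range(3)]
```

In an MPI run, `rank` and `n_ranks` would come from the communicator, for example `comm.Get_rank()` and `comm.Get_size()` in mpi4py.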

+
+ + Statement of need +

The simulation of large-scale Gaussian random fields is a + computationally challenging task, particularly if the correlation + length of the field is short compared to the size of its + computational domain.

+

However, when the random field in question is stationary, that is, + its covariance function is translation invariant, fast and exact + simulation methods based on the Fast Fourier Transform have been + proposed by Dietrich & Newsam + (1997) + and Wood & Chan + (1994). + These can outperform more traditional, factorization-based methods in + both scaling and absolute performance.
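To make the FFT-based approach concrete, the following is a minimal one-dimensional sketch of circulant embedding written with NumPy. It is not the parafields implementation. The function name `sample_stationary_gp`, the grid setup, and the exponential covariance are assumptions chosen for illustration.

```python
import numpy as np


def sample_stationary_gp(n, h, cov, rng):
    """Draw two independent zero-mean samples on n >= 2 grid points with spacing h.

    `cov` maps an array of non-negative lags to covariance values.
    Illustrative sketch only, not the parafields backend.
    """
    m = 2 * (n - 1)  # size of the minimal circulant embedding
    lags = np.arange(m)
    # First row of the circulant matrix: lags 0..n-1 followed by their mirror image.
    row = cov(np.minimum(lags, m - lags) * h)
    # The eigenvalues of a circulant matrix are the FFT of its first row.
    eigenvalues = np.fft.fft(row).real
    if eigenvalues.min() < -1e-10 * eigenvalues.max():
        raise ValueError("embedding is not non-negative definite; enlarge it")
    eigenvalues = np.clip(eigenvalues, 0.0, None)
    # Complex white noise yields two independent real samples per transform.
    noise = rng.standard_normal(m) + 1j * rng.standard_normal(m)
    z = np.fft.fft(np.sqrt(eigenvalues / m) * noise)
    return z.real[:n], z.imag[:n]


rng = np.random.default_rng(0)
exponential = lambda r: np.exp(-r / 0.2)  # assumed covariance, length 0.2
field_a, field_b = sample_stationary_gp(65, 1.0 / 64, exponential, rng)
```

Each draw costs one FFT of length `m`, that is O(m log m) operations, whereas a Cholesky factorization of the dense covariance matrix scales cubically in the number of grid points.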

+

By combining an efficient C++ + backend with an easy-to-use Python interface, this package aims to + make these methods accessible for integration into existing workflows. + This separation also allows the package to support large-scale, + performance-oriented applications, as well as providing a means to + quickly generate working prototypes in just a few lines of code.

+

Other packages for the generation of stationary Gaussian processes + exist, e.g., the R package lgcp + (Davies + & Bryant, 2013), the Julia package GaussianRandomFields.jl + (Robbe, + 2023), and the Python package GSTools + (Müller + et al., 2022). In comparison with these alternative packages, + parafields is specifically designed and adapted to the sampling of + very large Gaussian random fields within an HPC workflow. This was a + major concern in the development of the backend and is, among other + things, reflected in the ability to create Gaussian processes in an + MPI-distributed fashion.

+
+ + Implementation +

Parafields has over ten years of development history: it was first + implemented as an extension to the Dune framework + (Bastian + et al., 2021) for the numerical solution of partial + differential equations. This restricted the potential userbase to + users of that software framework, although there was considerable + interest in the software from outside this community. In 2022, we + began a major refactoring: the previous C++ + code base + (Klein, + 2017) was rewritten to have a weaker dependency on Dune, which + included, for example, a rewrite of the CMake build system + (Klein + & Kempf, 2022). In order to open up to a wider userbase, a + Python interface written in pybind11 + (Jakob + et al., 2017) was added.

+

When engineering the Python package, we put special emphasis on the + following usability aspects: installability, customizability and + embedding into existing user workflows.

+

The recommended installation procedure for parafields follows + current Python packaging practice: the package is + installable through pip and automatically + compiles using the CMake build system of the project through + scikit-build + (Fillion-Robin + et al., 2018). Required dependencies of the + C++ library are automatically fetched and built + in the required configuration. For sequential usage we also provide + pre-compiled Python wheels. They are built against the sequential MPI + stub library FakeMPI + (Kempf + & PetSc Developers, 2022), which allows us to build the + sequential and the parallel version from the same code base. Users who + want to leverage MPI through mpi4py will instead build the package + from source against their system MPI library.

+

A goal of the Python API design was to expose as much of + the flexibility of the underlying C++ framework + as possible. In order to do so, we use pybind11’s capabilities to pass + Python callables to the C++ backend. This + allows users to, e.g., implement custom covariance functions or use + different random number generators. Furthermore, because + many Python users write scientific applications within + Jupyter, our fields render as images in Jupyter and field + generation can optionally be configured through an interactive widget + frontend within Jupyter.
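As an illustration of what a user-defined covariance function might look like, the sketch below builds a plain Python callable for an isotropic exponential covariance. The signature, a single lag vector in and a covariance value out, is an assumption made for this example. The signature parafields actually expects is not stated here, so the names `make_exponential_covariance` and `covariance` are hypothetical.

```python
import math


def make_exponential_covariance(variance, corr_length):
    """Return a callable that evaluates an isotropic exponential covariance.

    Assumed signature for illustration: the callable receives the lag vector
    between two points and returns the covariance of the field values there.
    """
    def covariance(lag):
        distance = math.sqrt(sum(x * x for x in lag))
        return variance * math.exp(-distance / corr_length)

    return covariance


cov = make_exponential_covariance(variance=2.0, corr_length=0.1)
```

Writing the function as a closure keeps the parameters `variance` and `corr_length` out of the call signature, so the callable can be handed to a backend that only knows about lag vectors.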

+
+ + Acknowledgments +

The authors thank all contributors to the dune-randomfield project + for their valuable contributions that are now part of the + parafields-core library. Dominic Kempf is employed by the Scientific + Software Center of Heidelberg University, which is funded as part of + the Excellence Strategy of the German Federal and State Governments. + Ole Klein’s work is supported by the Federal Ministry of Education and + Research of Germany (Bundesministerium für Bildung und Forschung) and + the Ministry of Science, Research and Arts of the federal state of + Baden-Württemberg (Ministerium für Wissenschaft, Forschung und Kunst + Baden-Württemberg).

+
+ + + + + + + DietrichClaude R + NewsamGarry N + + Fast and exact simulation of stationary Gaussian processes through circulant embedding of the covariance matrix + SIAM Journal on Scientific Computing + SIAM + 1997 + 18 + 4 + 10.1137/s1064827592240555 + 1088 + 1107 + + + + + + KleinOle + KempfDominic + + parafields-core + GitHub repository + GitHub + 2022 + https://github.com/parafields/parafields-core + + + + + + DalcinLisandro + FangYao-Lung L. + + mpi4py: Status update after 12 years of development + Computing in Science & Engineering + 2021 + 23 + 4 + 10.1109/MCSE.2021.3083216 + 47 + 54 + + + + + + BastianPeter + BlattMarkus + DednerAndreas + DreierNils-Arne + EngwerChristian + FritzeRené + GräserCarsten + GrüningerChristoph + KempfDominic + KlöfkornRobert + OhlbergerMario + SanderOliver + + The dune framework: Basic concepts and recent developments + Computers & Mathematics with Applications + 2021 + 81 + 0898-1221 + https://www.sciencedirect.com/science/article/pii/S089812212030256X + 10.1016/j.camwa.2020.06.007 + 75 + 112 + + + + + + JakobWenzel + RhinelanderJason + MoldovanDean + + pybind11 – seamless operability between C++11 and Python + 2017 + + + + + + KleinOle + + Dune-randomfield - generation of Gaussian random fields in arbitrary dimensions, based on circulant embedding + 2017 + + + + + + KempfDominic + PetSc Developers + + FakeMPI - a sequential MPI stub + 2022 + + + + + + Fillion-RobinJean-Christophe + McCormickMatt + PadronOmar + SmolensMax + GrauerMichael + SarahanMichael + + jcfr/scipy_2018_scikit-build_talk: SciPy 2018 talk | scikit-build: A build system generator for CPython C/C++/Fortran/Cython extensions + Zenodo + 201807 + https://doi.org/10.5281/zenodo.2565368 + 10.5281/zenodo.2565368 + + + + + + WoodAndrew TA + ChanGrace + + Simulation of stationary Gaussian processes in [0, 1] d + Journal of computational and graphical statistics + Taylor & Francis + 1994 + 3 + 4 + 10.2307/1390903 + 409 + 432 + + + + + + DaviesTilman M. 
+ BryantDavid + + On circulant embedding for Gaussian random fields in R + Journal of Statistical Software + 2013 + 55 + 9 + https://www.jstatsoft.org/index.php/jss/article/view/v055i09 + 10.18637/jss.v055.i09 + 1 + 21 + + + + + + RobbePieterjan + + GaussianRandomFields.jl: A Julia package to generate and sample from Gaussian random fields + Journal of Open Source Software + The Open Journal + 2023 + 8 + 89 + https://doi.org/10.21105/joss.05595 + 10.21105/joss.05595 + 5595 + + + + + + + MüllerS. + SchülerL. + ZechA. + HeßeF. + + GSTools v1.3: A toolbox for geostatistical modelling in Python + Geoscientific Model Development + 2022 + 15 + 7 + https://gmd.copernicus.org/articles/15/3161/2022/ + 10.5194/gmd-15-3161-2022 + 3161 + 3182 + + + + + + CamelettiMichela + LindgrenFinn + SimpsonDaniel + RueHåvard + + Spatio-temporal modeling of particulate matter concentration through the SPDE approach + AStA Advances in Statistical Analysis + Springer + 2013 + 97 + 10.1007/s10182-012-0196-3 + 109 + 131 + + + + + + SainStephan R + FurrerReinhard + CressieNoel + + A spatial analysis of multivariate output from regional climate models + The Annals of Applied Statistics + JSTOR + 2011 + 10.1214/10-AOAS369 + 150 + 175 + + + + + + DodwellTim J + KetelsenChristian + ScheichlRobert + TeckentrupAretha L + + A hierarchical multilevel Markov chain Monte Carlo algorithm with applications to uncertainty quantification in subsurface flow + SIAM/ASA Journal on Uncertainty Quantification + SIAM + 2015 + 3 + 1 + 10.1137/130915005 + 1075 + 1108 + + + + + + PennyWilliam D + Trujillo-BarretoNelson J + FristonKarl J + + Bayesian fMRI time series analysis with spatial priors + NeuroImage + Elsevier + 2005 + 24 + 2 + 10.1016/j.neuroimage.2004.08.034 + 350 + 362 + + + + + + TorquatoSalvatore + Haslach JrHW + + Random heterogeneous materials: Microstructure and macroscopic properties + Appl. Mech. Rev. 
+ 2002 + 55 + 4 + 10.1115/1.1483342 + B62 + B63 + + + + + + ScheichlRobert + StuartAndrew M + TeckentrupAretha L + + Quasi-monte carlo and multilevel Monte Carlo methods for computing posterior expectations in elliptic inverse problems + SIAM/ASA Journal on Uncertainty Quantification + SIAM + 2017 + 5 + 1 + 10.1137/16m1061692 + 493 + 518 + + + + +
diff --git a/joss.05735/10.21105.joss.05735.pdf b/joss.05735/10.21105.joss.05735.pdf new file mode 100644 index 0000000000..00aae16c35 Binary files /dev/null and b/joss.05735/10.21105.joss.05735.pdf differ