From b74919aaa22d1c4caf8f2c5ef9e161ad2782f192 Mon Sep 17 00:00:00 2001 From: The Open Journals editorial robot <89919391+editorialbot@users.noreply.github.com> Date: Sun, 2 Jun 2024 16:58:02 +0100 Subject: [PATCH] Creating 10.21105.joss.06617.jats --- .../paper.jats/10.21105.joss.06617.jats | 778 ++++++++++++++++++ 1 file changed, 778 insertions(+) create mode 100644 joss.06617/paper.jats/10.21105.joss.06617.jats diff --git a/joss.06617/paper.jats/10.21105.joss.06617.jats b/joss.06617/paper.jats/10.21105.joss.06617.jats new file mode 100644 index 0000000000..b7d55f7bfb --- /dev/null +++ b/joss.06617/paper.jats/10.21105.joss.06617.jats @@ -0,0 +1,778 @@ + + +
+ + + + +Journal of Open Source Software +JOSS + +2475-9066 + +Open Journals + + + +6617 +10.21105/joss.06617 + +Empirical: A scientific software library for research, +education, and public engagement + + + +https://orcid.org/0000-0001-7216-5283 + +Vostinar +Anya + + + + +https://orcid.org/0000-0003-0994-2718 + +Lalejini +Alexander + + + + +https://orcid.org/0000-0003-2924-1732 + +Ofria +Charles + + + + + + +https://orcid.org/0000-0001-8616-4898 + +Dolson +Emily + + + + + + +https://orcid.org/0000-0003-4726-4479 + +Moreno +Matthew Andres + + + + + + + +BEACON Center for the Study of Evolution in Action, +USA + + + + +Computer Science and Engineering, Michigan State +University, USA + + + + +Ecology, Evolutionary Biology, and Behavior, Michigan State +University, USA + + + + +Computer Science, Carleton College, USA + + + + +Ecology and Evolutionary Biology, University of Michigan, +USA + + + + +Center for the Study of Complex Systems, University of +Michigan, USA + + + + +Michigan Institute for Data Science, University of +Michigan, USA + + + + +Computer Science, Grand Valley State University, +USA + + + + +13 +2 +2024 + +9 +98 +6617 + +Authors of papers retain copyright and release the +work under a Creative Commons Attribution 4.0 International License (CC +BY 4.0) +2022 +The article authors + +Authors of papers retain copyright and release the work under +a Creative Commons Attribution 4.0 International License (CC BY +4.0) + + + +C++ +Simulation +Agent-based modeling +Emscripten + + + + + + Summary +

Empirical is a C++ library designed to promote open science and + facilitate the development of scientific software that is efficient, + reliable, and easily distributable to researchers and non-experts + alike. Specifically, the library sets out to fulfill the following + goals:

+ + +

Utility: Empirical tools streamline common + scientific computing tasks such as configuration, end-to-end data + management, and mathematical manipulations.

+
+ +

Efficiency: Empirical implements general-purpose + data structures and algorithms that emphasize computational + efficiency to support scientific computing workloads.

+
+ +

Reliability: Empirical provides sophisticated + debug-mode instrumentation including audited memory management and + safety-checked versions of standard library containers.

+
+ +

Distributability: Empirical is highly portable, + uses common data formats, and facilitates compile-to-web app + development with object-oriented bindings for + Emscripten/WebAssembly GUI elements, all with the goal of building + broadly accessible scientific software.

+
+
+
+ + Statement of Need +

High-quality open-science tools improve code quality, scientific rigor, and ease of replication or extension for scientific software. Empirical’s debugging suite combats C++ programming pitfalls, such as iterator invalidation, memory leakage, and out-of-bounds indexing. Throughout, the library’s design achieves both performance and safety through compile-time toggling of checks for undefined or incorrect behavior.

+

Unfortunately, in practice, scientific software is often difficult + to obtain, install, or use. Modern web-based interfaces give + computational research the potential to better embody open science + objectives by empowering easier and more complete access + (Woelfle + et al., 2011). Empirical leverages modern web technology to + provide browser-based interactive interfaces for C++ source code.

+
+ + Empirical Features + + Better Code for Scientific Software +

Empirical components are subjected to structured code review, unit testing with coverage tracking, and other best practices detailed in our documentation. Effort invested in optimizing the library’s utilities enables developer-users, especially new developers, to more easily produce safe and efficient software. We provide a template project that streamlines laying out cross-compilation boilerplate.

+

As an example of Empirical’s utility, the library provides a configuration framework that includes utilities to

+ + +

create documented configuration parameters with default + values in a single line of C++ code,

+
+ +

adjust parameters via configuration files, command line + flags, URL query parameters, or in-browser GUIs,

+
+ +

perform on-the-fly configuration adjustments, and

+
+ +

support independent configuration subsystems.

+
+
+
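The configuration workflow listed above can be illustrated with a minimal standard-library sketch. It is not Empirical's configuration API: the class `ConfigSketch` and its members `Add`, `Set`, `Get`, and `Write` are hypothetical names introduced here only to show the idea of registering a documented parameter with a default in one line and then overriding it from text such as a command line flag or a configuration file entry.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical mini-registry for illustration; NOT Empirical's actual API.
class ConfigSketch {
public:
  // Register a parameter, its default value, and its documentation in one call.
  template <typename T>
  void Add(const std::string& name, const T& default_value,
           const std::string& description) {
    assert(entries_.count(name) == 0 && "parameter registered twice");
    std::ostringstream out;
    out << default_value;
    entries_[name] = Entry{out.str(), description};
  }

  // Apply an override written as "NAME=value" (e.g., a flag or file line).
  // Returns false if the text is malformed or the parameter is unknown.
  bool Set(const std::string& assignment) {
    const std::string::size_type eq = assignment.find('=');
    if (eq == std::string::npos) return false;
    const auto it = entries_.find(assignment.substr(0, eq));
    if (it == entries_.end()) return false;
    it->second.value = assignment.substr(eq + 1);
    return true;
  }

  // Read the current value, parsed as the requested type.
  template <typename T>
  T Get(const std::string& name) const {
    std::istringstream in(entries_.at(name).value);
    T result{};
    in >> result;
    return result;
  }

  // Produce a documented listing, one "NAME=value  # description" per line.
  std::string Write() const {
    std::ostringstream out;
    for (const auto& kv : entries_) {
      out << kv.first << '=' << kv.second.value
          << "  # " << kv.second.description << '\n';
    }
    return out.str();
  }

private:
  struct Entry {
    std::string value;        // current value, stored as text
    std::string description;  // human-readable documentation
  };
  std::map<std::string, Entry> entries_;
};
```

In this sketch, a call such as `cfg.Add("SEED", 42, "Random number seed")` documents and defaults a parameter at once, and `cfg.Set("SEED=7")` adjusts it on the fly. Storing values as text keeps a single override path for every input source, at the cost of parsing on each read.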

High-quality software needs a robust, inclusive, and diverse + community of users and contributors. Our + development + practices reflect this priority.

+
+ + Realizing the Promise of Emscripten-based Web UIs +

Educational editions of scientific software promote classroom + learning and citizen science. The Emscripten compiler enables an + existing native codebase to additionally compile to the web + (Zakai, + 2011). Browser-based delivery can yield particularly + effective public-facing apps due to easy access and compelling + interfaces.

+

Empirical amplifies Emscripten by fleshing out its interface for + interaction with browser elements. DOM elements are bound to + corresponding C++ objects (e.g., emp::Button + manages a <button> and + emp::Canvas manages a + <canvas>) and are easily manipulated + from within C++. Empirical also packages collections of + prefabricated web widgets (e.g., configuration managers or + collapsible data displays). These tools simplify generating a + mobile-friendly, web-based GUI.

+

A live demo of Empirical widgets, presented alongside their + source C++ code, is available + here.

+
+ + Runtime Efficiency +

WebAssembly’s runtime efficiency — achieving 50% to 90% of native + performance + (Jangda + et al., 2019) — has driven adoption in web development + (Haas + et al., 2017) and enabled new possibilities for browser-based + scientific computation. For example, + Avida-ED + leverages WebAssembly to incorporate sophisticated agent-based + evolution models into classroom activities.

+

More broadly, Empirical provides optimized tools for + performance-critical tasks. For example, + emp::BitArray and + emp::BitVector are faster drop-in + replacements for their standard library equivalents + (std::bitset and + std::vector<bool>) with extensive + additional functionality. More fundamentally, Empirical’s + header-only design prioritizes ease of use and runtime performance, + albeit at the cost of longer compilation times.

+
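To make the idea of a word-packed bit container concrete, the following is a minimal sketch using only the standard library. The class `PackedBits` and its methods are hypothetical and are not the interface or implementation of `emp::BitArray` or `emp::BitVector`. The sketch shows only how storing bits in 64-bit words lets whole-container operations (here, a bitwise AND and a population count) work on a word at a time; it makes no performance measurement.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration; emp::BitVector's real interface may differ.
class PackedBits {
public:
  // Allocate enough zeroed 64-bit words to hold num_bits bits.
  explicit PackedBits(std::size_t num_bits)
    : num_bits_(num_bits), words_((num_bits + 63) / 64, 0) {}

  std::size_t Size() const { return num_bits_; }

  bool Get(std::size_t i) const {
    assert(i < num_bits_);
    return ((words_[i / 64] >> (i % 64)) & 1u) != 0;
  }

  void Set(std::size_t i, bool value) {
    assert(i < num_bits_);
    const std::uint64_t mask = std::uint64_t{1} << (i % 64);
    if (value) words_[i / 64] |= mask;
    else       words_[i / 64] &= ~mask;
  }

  // Count set bits; each inner iteration clears the lowest set bit of a word.
  std::size_t CountOnes() const {
    std::size_t total = 0;
    for (std::uint64_t w : words_) {
      while (w != 0) { w &= w - 1; ++total; }
    }
    return total;
  }

  // In-place AND with another container of the same size, 64 bits per step.
  PackedBits& AndWith(const PackedBits& other) {
    assert(num_bits_ == other.num_bits_);
    for (std::size_t k = 0; k < words_.size(); ++k) words_[k] &= other.words_[k];
    return *this;
  }

private:
  std::size_t num_bits_;
  std::vector<std::uint64_t> words_;
};
```

Because unused high bits in the last word are never set, `CountOnes` and `AndWith` need no special handling for sizes that are not multiples of 64.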
+ + Debugging +

Although performant, C++’s permissiveness to out-of-bounds + indexing or memory management errors can undermine the validity of + generated data and analyses. Standard library vendors — like + libstdc++, + libc++, + and + stl + — provide some runtime safety features, but these are incomplete and + poorly documented1. Empirical + supplements vendor offerings with debug mode stand-ins for standard + library containers and even raw pointers that can identify memory + leaks and invalid memory access.

+
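A debug-mode stand-in for a standard container, with checks toggled at compile time, might look like the sketch below. It is an illustration under stated assumptions rather than Empirical's implementation: `CheckedVector`, `kChecksEnabled`, and the choice to report failures by throwing `std::out_of_range` are all hypothetical. The checks default to on unless `NDEBUG` is defined, and the `Checked` template parameter lets either mode be selected explicitly.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Checks default to enabled unless NDEBUG is defined (as in release builds).
#ifdef NDEBUG
constexpr bool kChecksEnabled = false;
#else
constexpr bool kChecksEnabled = true;
#endif

// Hypothetical wrapper; not Empirical's actual debug container.
// Not intended for T = bool, since std::vector<bool> returns proxy references.
template <typename T, bool Checked = kChecksEnabled>
class CheckedVector {
public:
  CheckedVector() = default;
  explicit CheckedVector(std::size_t n, const T& value = T()) : data_(n, value) {}

  std::size_t size() const { return data_.size(); }
  void push_back(const T& value) { data_.push_back(value); }

  // Indexing verifies bounds only when Checked is true.
  T& operator[](std::size_t i) { Verify(i); return data_[i]; }
  const T& operator[](std::size_t i) const { Verify(i); return data_[i]; }

private:
  void Verify(std::size_t i) const {
    // Checked is a compile-time constant, so the unchecked instantiation
    // leaves nothing here for the optimizer to keep.
    if (Checked && i >= data_.size()) {
      throw std::out_of_range("CheckedVector: index out of range");
    }
  }

  std::vector<T> data_;
};
```

With this layout, the same source can be built once with checks for development and once without for production runs, so safety instrumentation does not have to be paid for in performance-critical experiments.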

Developers typically compensate for C++’s missing guardrails with + external toolchains like Valgrind, GDB, and sanitizers. Although + mature, such tooling suffers substantial + limitations2, particularly for + WASM compiled with Emscripten. Although Emscripten provides some + sanitizer + support and + other + debugging features, Empirical’s safety features offset + remaining limitations, such as the lack of a steppable debugger.

+
+
+ + Outlook and Future Plans +

Empirical remains under active development. Current priorities + include web-friendly refinements (e.g., file management, rich text + handling) and additional step-by-step tutorials for new users. That + said, Empirical has largely converged to API stability, and releases + are archived on Zenodo for those who depend on them + (Ofria + et al., 2020).

+

Empirical already underlies major projects within digital + evolution, artificial life, and genetic programming. To benefit the + broader scientific software and open science community, we look + forward to welcoming new collaborations and supporting a wider + collection of end-users.

+
+ + Related Software Packages +

Several projects pursue objectives related to Empirical’s.

+ + RepastHPC +

RepastHPC, accessible at + https://repast.github.io/, + is a C++ modeling framework targeted to high-performance computing + (Collier + & North, 2013; + North + et al., 2013). A Java-based counterpart, Repast Simphony, + provides interactive GUI support.

+
+ + Boost C++ Libraries +

Boost C++ Libraries, available at + https://www.boost.org/, + implement a broad portfolio of software components. However, Boost + lacks tools for web-based GUI, configuration management, or data + management tailored to scientific software.

+
+ + Emscripten +

Emscripten provides cross-compilation from C++ to WebAssembly and is available at https://emscripten.org/ (Zakai, 2011). Empirical furnishes a complementary high-level interface to Emscripten intrinsics.

+
+ + Cheerp +

Cheerp, another C++ to WebAssembly compiler, is available at + https://leaningtech.com/cheerp/. + Like Emscripten, Cheerp provides primarily low-level APIs for + browser interaction.

+
+ + Non-C++ Comparable Software + + +

TinyGo +

+
+ +

WebIO +

+
+ +

GWT +

+
+ +

yew +

+
+ +

Pyodide + (Droettboom + & the Pyodide development team, 2021)

+
+ +

Shiny + (Chang + et al., 2020)

+
+
+
+ + Projects Using the Software + + +

AAGOS + (Gillespie + et al., 2018): model to test impact of environmental + change on genetic architecture evolution.

+
+ +

Conduit + (Moreno + & Ofria, 2022): library for best-effort communication + in high-performance computing.

+
+ +

DISHTINY + (Moreno + & Ofria, 2019): agent-based model to study major + transitions in evolution.

+
+ +

ecology + in evolutionary computation explorer + (Dolson + & Ofria, 2018): interactive visualization of + ecological interaction networks in evolutionary computation.

+
+ +

Symbulation + (Vostinar, + 2017): agent-based model for evolution of parasitism, + mutualism, and commensalism.

+
+ +

SignalGP + (Lalejini + & Ofria, 2018; + Moreno + et al., 2021): an event-driven genetic programming + substrate.

+
+ +

PhylotrackPy (Dolson et al., 2024): a phylogeny-tracking tool for agent-based evolution, closely integrated with the Empirical codebase.

+
+ +

Model + of cancer evolution on an oxygen gradient.

+
+
+
+
+ + Acknowledgements +

This research was supported in part by NSF grants DEB-1655715 and + DBI-0939454, by the National Science Foundation Graduate Research + Fellowship under Grant No. DGE-1424871, by Michigan State University + through the computational resources provided by the Institute for + Cyber-Enabled Research, and by the Eric and Wendy Schmidt AI in + Science Postdoctoral Fellowship, a Schmidt Futures program. Any + opinions, findings, and conclusions or recommendations expressed in + this material are those of the author(s) and do not necessarily + reflect the views of the National Science Foundation.

+
+ + + + + + + HaasAndreas + RossbergAndreas + SchuffDerek L + TitzerBen L + HolmanMichael + GohmanDan + WagnerLuke + ZakaiAlon + BastienJF + + Bringing the web up to speed with WebAssembly + Proceedings of the 38th ACM SIGPLAN conference on programming language design and implementation + Association for Computing Machinery + 201706 + http://dx.doi.org/10.1145/3062341.3062363 + 10.1145/3062341.3062363 + 185 + 200 + + + + + + JangdaAbhinav + PowersBobby + BergerEmery D. + GuhaArjun + + Not so fast: Analyzing the performance of webassembly vs. Native code + Proceedings of the 2019 USENIX conference on usenix annual technical conference + USENIX Association + USA + 2019 + 9781939133038 + https://www.usenix.org/conference/atc19/presentation/jangda + 107 + 120 + + + + + + ZakaiAlon + + Emscripten: An LLVM-to-JavaScript compiler + Proceedings of the ACM international conference companion on object oriented programming systems languages and applications companion + Association for Computing Machinery + 201110 + http://dx.doi.org/10.1145/2048147.2048224 + 10.1145/2048147.2048224 + 301 + 312 + + + + + + ChangWinston + ChengJoe + AllaireJJ + XieYihui + McPhersonJonathan + + Shiny: Web application framework for R + 2020 + https://CRAN.R-project.org/package=shiny + + + + + + DroettboomMichael + the Pyodide development team + + Pyodide/pyodide + Zenodo + 202108 + https://doi.org/10.5281/zenodo.5156931 + 10.5281/zenodo.5156931 + + + + + + VostinarAnya E + + Suicide, signals, and symbionts: Evolving cooperation in agent-based systems + Michigan State University + 2017 + 978-0-355-07992-0 + https://www.proquest.com/docview/1929231148 + + + + + + GillespieLauren + DolsonEmily + LalejiniAlexander + OfriaCharles + + Changing environments drive the separation of genes and increased evolvability in NK-inspired landscapes + Late breaking abstract at The 2018 Conference on Artificial Life + 2018 + + + + + + LalejiniAlexander + OfriaCharles + + Evolving event-driven programs with SignalGP 
+ Proceedings of the genetic and evolutionary computation conference on - GECCO ’18 + ACM Press + New York, New York, USA + 2018 + 9781450356183 + http://arxiv.org/abs/1804.05445 http://dx.doi.org/10.1145/3205455.3205523 http://dl.acm.org/citation.cfm?doid=3205455.3205523 + 10.1145/3205455.3205523 + 1135 + 1142 + + + + + + MorenoMatthew Andres + OfriaCharles + + Toward open-ended fraternal transitions in individuality + Artificial Life + 201905 + 25 + 2 + 1064-5462 + https://doi.org/10.1162/artl_a_00284 + 10.1162/artl_a_00284 + 117 + 133 + + + + + + CollierNicholson + NorthMichael + + Parallel agent-based simulation with repast for high performance computing + SIMULATION + 201311 + 89 + 10 + + https://doi.org/10.1177/0037549712462620 + + + 10.1177/0037549712462620 + 1215 + 1235 + + + + + + NorthMichael J + CollierNicholson T + OzikJonathan + TataraEric R + MacalCharles M + BragenMark + SydelkoPam + + Complex adaptive systems modeling with repast simphony + Complex adaptive systems modeling + Springer + 201303 + 1 + 1 + https://doi.org/10.1186/2194-3206-1-3 + 10.1186/2194-3206-1-3 + 1 + 26 + + + + + + DolsonEmily + OfriaCharles + + Ecological theory provides insights about evolutionary computation + Proceedings of the genetic and evolutionary computation conference companion + Association for Computing Machinery + New York, NY, USA + 2018 + 9781450357647 + https://doi.org/10.1145/3205651.3205780 + 10.1145/3205651.3205780 + 105 + 106 + + + + + + OfriaCharles + MorenoMatthew Andres + DolsonEmily + LalejiniAlex + rodsan0 + FentonJake + perryk12 + JorgensenSteven + hoffmanriley + grenewode + al. 
+ + Devosoft/empirical + Zenodo + 202010 + 10.5281/zenodo.2575606 + + + + + + WoelfleMichael + OlliaroPiero + ToddMatthew H + + Open science is a research accelerator + Nature chemistry + Nature Publishing Group UK London + 2011 + 3 + 10 + 10.1038/nchem.1149 + 745 + 748 + + + + + + MorenoMatthew Andres + PapaSantiago Rodriguez + LalejiniAlexander + OfriaCharles + + SignalGP-lite: Event driven genetic programming library for large-scale artificial life applications + arXiv + 2021 + https://arxiv.org/abs/2108.00382 + 10.48550/ARXIV.2108.00382 + + + + + + MorenoMatthew Andres + OfriaCharles + + Best-effort communication improves performance and scales robustly on conventional hardware + arXiv + 2022 + https://arxiv.org/abs/2211.10897 + 10.48550/ARXIV.2211.10897 + + + + + + DolsonEmily + Rodriguez-PapaSantiago + MorenoMatthew Andres + + Phylotrack: C++ and Python libraries for in silico phylogenetic tracking + arXiv + 2024 + https://arxiv.org/abs/2405.09389 + 10.48550/ARXIV.2405.09389 + + + + + +

For example, neither GCC 10.3 nor Clang 12.0.0 detects std::vector iterator invalidation when appending to a std::vector happens to fall within existing allocated buffer space (GCC live example; Clang live example). Clang 12.0.0’s sanitizers also fail to detect this iterator invalidation (live example).

+
+ +

For example, neither GCC 10.3 nor Clang 12.0.0 detects std::vector iterator invalidation when appending to a std::vector happens to fall within existing allocated buffer space (GCC live example; Clang live example). Clang 12.0.0’s sanitizers also fail to detect this iterator invalidation (live example).

+
+
+
+