diff --git a/joss.06156/10.21105.joss.06156.crossref.xml b/joss.06156/10.21105.joss.06156.crossref.xml new file mode 100644 index 0000000000..175c595b38 --- /dev/null +++ b/joss.06156/10.21105.joss.06156.crossref.xml @@ -0,0 +1,390 @@ + + + + 20240328T112659-923a9a4ed5fa96b8fb2d478d3cbf089ac5b4f3c3 + 20240328112659 + + JOSS Admin + admin@theoj.org + + The Open Journal + + + + + Journal of Open Source Software + JOSS + 2475-9066 + + 10.21105/joss + https://joss.theoj.org + + + + + 03 + 2024 + + + 9 + + 95 + + + + simChef: High-quality data science simulations in +R + + + + James + Duncan + https://orcid.org/0000-0003-3297-681X + + + Tiffany + Tang + https://orcid.org/0000-0002-8079-6867 + + + Corrine F. + Elliott + https://orcid.org/0000-0001-7935-9945 + + + Philippe + Boileau + https://orcid.org/0000-0002-4850-2507 + + + Bin + Yu + https://orcid.org/0000-0002-8888-4060 + + + + 03 + 28 + 2024 + + + 6156 + + + 10.21105/joss.06156 + + + http://creativecommons.org/licenses/by/4.0/ + http://creativecommons.org/licenses/by/4.0/ + http://creativecommons.org/licenses/by/4.0/ + + + + Software archive + 10.5281/zenodo.10845638 + + + GitHub review issue + https://github.com/openjournals/joss-reviews/issues/6156 + + + + 10.21105/joss.06156 + https://joss.theoj.org/papers/10.21105/joss.06156 + + + https://joss.theoj.org/papers/10.21105/joss.06156.pdf + + + + + + Veridical data science + Yu + Proceedings of the National Academy of +Sciences + 8 + 117 + 10.1073/pnas.1901326117 + 0027-8424 + 2020 + Yu, B., & Kumbier, K. (2020). +Veridical data science. Proceedings of the National Academy of Sciences, +117(8), 3920–3929. +https://doi.org/10.1073/pnas.1901326117 + + + batchtools: Tools for R to work on batch +systems + Lang + Journal of Open Source +Software + 10 + 2 + 10.21105/joss.00135 + 2475-9066 + 2017 + Lang, M., Bischl, B., & Surmann, +D. (2017). batchtools: Tools for R to work on batch systems. Journal of +Open Source Software, 2(10), 135. 
+https://doi.org/10.21105/joss.00135 + + + Welcome to the Tidyverse + Wickham + Journal of Open Source +Software + 43 + 4 + 10.21105/joss.01686 + 2475-9066 + 2019 + Wickham, H., Averick, M., Bryan, J., +Chang, W., McGowan, L. D., François, R., Grolemund, G., Hayes, A., +Henry, L., Hester, J., Kuhn, M., Pedersen, T. L., Miller, E., Bache, S. +M., Müller, K., Ooms, J., Robinson, D., Seidel, D. P., Spinu, V., … +Yutani, H. (2019). Welcome to the Tidyverse. Journal of Open Source +Software, 4(43), 1686. +https://doi.org/10.21105/joss.01686 + + + A Unifying Framework for Parallel and +Distributed Processing in R using Futures + Bengtsson + The R Journal + 2 + 13 + 10.32614/RJ-2021-048 + 2073-4859 + 2021 + Bengtsson, H. (2021). A Unifying +Framework for Parallel and Distributed Processing in R using Futures. +The R Journal, 13(2), 208. +https://doi.org/10.32614/RJ-2021-048 + + + R6: Encapsulated Classes with Reference +Semantics + Chang + 2022 + Chang, W. (2022). R6: Encapsulated +Classes with Reference Semantics. +https://r6.r-lib.org + + + Writing Effective and Reliable Monte Carlo +Simulations with the SimDesign Package + Chalmers + The Quantitative Methods for +Psychology + 4 + 16 + 10.20982/tqmp.16.4.p248 + 2020 + Chalmers, R. P., & Adkins, +M. C. (2020). Writing Effective and Reliable Monte Carlo Simulations +with the SimDesign Package. The Quantitative Methods for Psychology, +16(4), 248–280. +https://doi.org/10.20982/tqmp.16.4.p248 + + + SimEngine: A Modular Framework for +Statistical Simulations in R + Kenny + 10.48550/arXiv.2403.05698 + 2024 + Kenny, A., & Wolock, C. J. +(2024). SimEngine: A Modular Framework for Statistical Simulations in R. +https://doi.org/10.48550/arXiv.2403.05698 + + + simpr: Flexible ’Tidyverse’-Friendly +Simulations + Brown + 2023 + Brown, E. (2023). simpr: Flexible +’Tidyverse’-Friendly Simulations. 
+https://statisfactions.github.io/simpr/ + + + rsimsum: Summarise results from Monte Carlo +simulation studies + Gasparini + Journal of Open Source +Software + 26 + 3 + 10.21105/joss.00739 + 2018 + Gasparini, A. (2018). rsimsum: +Summarise results from Monte Carlo simulation studies. Journal of Open +Source Software, 3(26), 739. +https://doi.org/10.21105/joss.00739 + + + Declaring and Diagnosing Research +Designs + Blair + American Political Science +Review + 3 + 113 + 10.1017/S0003055419000194 + 2019 + Blair, G., Cooper, J., Coppock, A., +& Humphreys, M. (2019). Declaring and Diagnosing Research Designs. +American Political Science Review, 113(3), 838–859. +https://doi.org/10.1017/S0003055419000194 + + + simhelpers: Helper Functions for Simulation +Studies + Joshi + 2024 + Joshi, M., & Pustejovsky, J. +(2024). simhelpers: Helper Functions for Simulation Studies. +https://meghapsimatrix.github.io/simhelpers/index.html + + + simTool: Conduct Simulation Studies with a +Minimal Amount of Source Code + Scheer + 2020 + Scheer, M. (2020). simTool: Conduct +Simulation Studies with a Minimal Amount of Source Code. +https://CRAN.R-project.org/package=simTool + + + parSim: Parallel Simulation +Studies + Epskamp + 2023 + Epskamp, S. (2023). parSim: Parallel +Simulation Studies. +https://CRAN.R-project.org/package=parSim + + + simitation: Simplified +Simulations + Shilane + 2023 + Shilane, D., Budugutta, S., & +Bansal, M. (2023). simitation: Simplified Simulations. +https://CRAN.R-project.org/package=simitation + + + tidyMC: Monte Carlo Simulations Made Easy and +Tidy + Linner + 2022 + Linner, S., Moreira Lara, I., & +Lehmann, K. (2022). tidyMC: Monte Carlo Simulations Made Easy and Tidy. +https://github.com/stefanlinner/tidyMC + + + simmer: Discrete-event simulation for +R + Ucar + Journal of Statistical +Software + 2 + 90 + 10.18637/jss.v090.i02 + 2019 + Ucar, I., Smeets, B., & Azcorra, +A. (2019). simmer: Discrete-event simulation for R. 
Journal of +Statistical Software, 90(2), 1–30. +https://doi.org/10.18637/jss.v090.i02 + + + MonteCarloSEM: An R Package to Simulate Data +for SEM + Orcan + International Journal of Assessment Tools in +Education + 3 + 8 + 10.21449/ijate.804203 + 2021 + Orcan, F. (2021). MonteCarloSEM: An R +Package to Simulate Data for SEM. International Journal of Assessment +Tools in Education, 8(3), 704–713. +https://doi.org/10.21449/ijate.804203 + + + simMetric: Metrics (with Uncertainty) for +Simulation Studies that Evaluate Statistical Methods + Parsons + 10.25912/RDF_1665114451679 + 2022 + Parsons, R. (2022). simMetric: +Metrics (with Uncertainty) for Simulation Studies that Evaluate +Statistical Methods. Queensland University of Technology. +https://doi.org/10.25912/RDF_1665114451679 + + + The Simulator: An Engine to Streamline +Simulations + Bien + 10.48550/arXiv.1607.00021 + 2016 + Bien, J. (2016). The Simulator: An +Engine to Streamline Simulations. +https://doi.org/10.48550/arXiv.1607.00021 + + + infer: An R package for tidyverse-friendly +statistical inference + Couch + Journal of Open Source +Software + 65 + 6 + 10.21105/joss.03661 + 2021 + Couch, S. P., Bray, A. P., Ismay, C., +Chasnovski, E., Baumer, B. S., & Çetinkaya-Rundel, M. (2021). infer: +An R package for tidyverse-friendly statistical inference. Journal of +Open Source Software, 6(65), 3661. +https://doi.org/10.21105/joss.03661 + + + Parallel and Other Simulations in R Made +Easy: An End-to-End Study + Hofert + Journal of Statistical +Software + 4 + 69 + 10.18637/jss.v069.i04 + 2016 + Hofert, M., & Mächler, M. (2016). +Parallel and Other Simulations in R Made Easy: An End-to-End Study. +Journal of Statistical Software, 69(4), 1–44. +https://doi.org/10.18637/jss.v069.i04 + + + Designing a data science simulation with +MERITS: A primer + Elliott + 10.48550/arXiv.2403.08971 + 2024 + Elliott, C. F., Duncan, J., Tang, T. +M., Behr, M., Kumbier, K., & Yu, B. (2024). 
Designing a data science +simulation with MERITS: A primer. +https://doi.org/10.48550/arXiv.2403.08971 + + + + + + diff --git a/joss.06156/10.21105.joss.06156.jats b/joss.06156/10.21105.joss.06156.jats new file mode 100644 index 0000000000..8753f32231 --- /dev/null +++ b/joss.06156/10.21105.joss.06156.jats @@ -0,0 +1,935 @@ + + +
+ + + + +Journal of Open Source Software +JOSS + +2475-9066 + +Open Journals + + + +6156 +10.21105/joss.06156 + +simChef: High-quality data science +simulations in R + + + +https://orcid.org/0000-0003-3297-681X + +Duncan +James + + + + +https://orcid.org/0000-0002-8079-6867 + +Tang +Tiffany + + +* + + +https://orcid.org/0000-0001-7935-9945 + +Elliott +Corrine F. + + + + +https://orcid.org/0000-0002-4850-2507 + +Boileau +Philippe + + + + +https://orcid.org/0000-0002-8888-4060 + +Yu +Bin + + + + + + + + +Graduate Group in Biostatistics, University of California, +Berkeley, United States of America + + + + +Department of Statistics, University of California, +Berkeley, United States of America + + + + +Department of Electrical Engineering and Computer Sciences, +University of California, Berkeley, United States of +America + + + + +Center for Computational Biology, University of California, +Berkeley, United States of America + + + + +* E-mail: + + +28 +6 +2023 + +9 +95 +6156 + +Authors of papers retain copyright and release the +work under a Creative Commons Attribution 4.0 International License (CC +BY 4.0) +2022 +The article authors + +Authors of papers retain copyright and release the work under +a Creative Commons Attribution 4.0 International License (CC BY +4.0) + + + +simulations +data science +R + + + + + + Summary +

simChef is an R + package that empowers data science practitioners to rapidly plan, + carry out, and summarize statistical simulation studies in a flexible, + efficient, and low-code manner. Drawing substantially from the + Predictability, Computability, and Stability (PCS) framework + (Yu + & Kumbier, 2020), simChef emphasizes + the scientific best practices encompassed by PCS by removing many of + the administrative burdens of simulation design through: (1) an + intuitive + tidy + grammar of data science simulations; (2) powerful + abstractions for distributed simulation processing backed by + future + (Bengtsson, + 2021); and (3) automated generation of interactive + R + Markdown simulation documentation, situating results next + to the workflows needed to reproduce them. Taken together, + simChef’s capabilities overcome many of the + design, computational, and reproducibility hurdles inherent in nearly + every data science simulation study.

+
+ + Statement of need +

Data science simulation studies occupy an important role in + scientific research as a means to gain insight into new and existing + statistical methods. Simulations serve as statistical sandboxes that + open a path toward otherwise inaccessible discoveries. For example, + they can be used to establish comprehensive benchmarks of existing + procedures for a common task; to demonstrate the strengths and + weaknesses of novel methodology applied to synthetic and real-world + data; or to probe the validity of a theoretical analysis.

+

Creating high-quality simulation studies typically involves a + number of repetitive and error-prone coding tasks: implementing + data-generating processes (DGPs) and statistical methods; sampling + from these DGPs; parallelizing computation of simulation replicates; + summarizing metrics; visualizing, documenting, presenting, and saving + results; and so on. While this administrative overhead is necessary, + it is not sufficient for scientific understanding. Data scientists + must navigate a number of important judgment calls such as the choice + of DGPs, baseline statistical methods, associated parameters, and + evaluation metrics for scientific relevancy.

+

While the scientific context may vary drastically from one study to + the next, the simulation scaffolding remains largely similar. Yet + simulation code repositories often lack reusability, both for novel + settings and when new questions arise in the original context. + simChef addresses the need for an intuitive, + extensible, and reusable framework for data science simulations, + allowing data science practitioners to focus their energies on + scientific questions by reducing the burdens of parameterization, + parallelization, and documentation.

+
+ + Core abstractions of data science simulations +

At its core, simChef breaks down a + simulation experiment into four modular components + ([fig:api]), each + implemented as an R6 class + (Chang, + 2022):

+ + +

DGP: the data-generating processes from + which to generate data

+
+ +

Method: the methods (or models) to + fit in the experiment

+
+ +

Evaluator: the evaluation metrics used + to evaluate the methods’ performance

+
+ +

Visualizer: the visualization functions + used to visualize outputs from the method fits or + evaluation results (can be tables, plots, or even + R Markdown snippets to display)

+
+
+ +

Overview of the four core components in a + simChef Experiment. + simChef provides four classes that implement + distinct simulation objects in an intuitive and modular manner: + DGP, Method, + Evaluator, and + Visualizer. Using these classes, users can + easily build a simChef + Experiment using reusable, customizable + functions (i.e., dgp_fun, + method_fun, eval_fun, + and viz_fun). Optional named parameters can + be set in these custom functions via the ... + arguments in the create_*() methods. +

+ +
+

Using these classes, users can create or reuse custom functions + (i.e., dgp_fun, + method_fun, eval_fun, + and viz_fun in + [fig:api]) aligned with + their scientific goals. The custom functions then can be parameterized + and encapsulated in one of the corresponding classes via a + create_* method, together with optional named + parameters (see + [fig:api]).

+
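To make the notion of custom functions concrete, here is a minimal base-R sketch of what a data-generating function and a method function might look like. The function bodies, the linear-model setting, and the argument names `n`, `sd`, and `penalty` are hypothetical, and the sketch assumes (rather than documents) a convention in which the data-generating function returns a named list whose elements are passed as arguments to the method function. It deliberately does not call simChef itself.

```r
# Hypothetical data-generating function: a noisy linear relationship.
# `n` and `sd` are illustrative named parameters.
dgp_fun <- function(n = 100, sd = 1) {
  x <- rnorm(n)
  y <- 2 * x + rnorm(n, sd = sd)
  list(x = x, y = y)  # named list of generated data
}

# Hypothetical method function: ridge-penalized slope without intercept.
# `penalty` is an illustrative tuning parameter.
method_fun <- function(x, y, penalty = 0) {
  slope <- sum(x * y) / (sum(x^2) + penalty)
  list(slope = slope)
}

set.seed(1)
data <- dgp_fun(n = 1000, sd = 0.5)

# Pass the generated data, plus one method parameter, to the method.
fit <- do.call(method_fun, c(data, list(penalty = 0.1)))
```

Keeping the scientific logic in plain functions like these means they can be tested on their own before being wrapped in any simulation framework.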

A fifth R6 class, + Experiment, unites the four components above + and serves as a concrete implementation of the user’s intent to answer + a specific scientific question. Specifically, the + Experiment stores references to the + DGP(s), Method(s), + Evaluator(s), and + Visualizer(s) along with the + DGP and Method + parameters that should be varied and combined during the simulation + run.

+ +

Overview of running a simChef + Experiment. The + Experiment class handles relationships among + the four classes portrayed in + [fig:api]. + Experiments may have multiple DGPs and + Methods, which are combined across the + Cartesian product of their varying parameters (represented by + \*). Once computed, each + Evaluator and + Visualizer takes in the fitted simulation + replicates, while Visualizer additionally + receives evaluation summaries. +

+ +
+
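As a rough illustration of the evaluation step, the sketch below summarizes fitted replicates for each DGP-method pair. The data frame layout and the column names (`rep`, `dgp`, `method`, `estimate`) are assumptions made for this example and are not simChef's actual result format; `eval_fun` is a hypothetical user-defined function written in base R.

```r
# Toy stand-in for fitted simulation replicates (layout is assumed).
set.seed(42)
fit_results <- expand.grid(
  rep = 1:50,
  dgp = c("my_dgp1", "my_dgp2"),
  method = "my_method",
  stringsAsFactors = FALSE
)
fit_results$estimate <- rnorm(nrow(fit_results), mean = 2, sd = 0.1)

# Hypothetical evaluation function: mean squared error around a known
# true value, computed separately for each DGP-method pair.
eval_fun <- function(fit_results, truth = 2) {
  fit_results$sq_err <- (fit_results$estimate - truth)^2
  aggregate(sq_err ~ dgp + method, data = fit_results, FUN = mean)
}

eval_results <- eval_fun(fit_results)
```

A visualization function would follow the same pattern, taking the fitted replicates and, optionally, a summary such as `eval_results` as inputs.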
+ + A powerful grammar of data science simulations +

Inspired by the tidyverse + (Wickham + et al., 2019), simChef develops an + intuitive grammar for running simulation studies using the + aforementioned R6 classes. We provide an + illustrative example usage next.

 + library(simChef) + +dgp1 <- create_dgp(dgp_fun1, "my_dgp1", sd = 0.5) +dgp2 <- create_dgp(dgp_fun2, "my_dgp2") +method <- create_method(method_fun, "my_method") +eval <- create_evaluator(eval_fun) +viz <- create_visualizer(viz_fun) + +exper <- create_experiment(dgp_list = list(dgp1, dgp2)) %>% + add_method(method) %>% + add_vary_across( + list(dgp1, dgp2), + n = c(1e2, 1e3, 1e4) + ) %>% + add_vary_across( + dgp2, + sparse = c(FALSE, TRUE) + ) %>% + add_vary_across( + method, + scalar_valued_param = c(0.1, 1.0, 10.0), + vector_valued_param = list(c(1, 2, 3), c(4, 5, 6)), + list_valued_param = list(list(a1=1, a2=2, a3=3), + list(b1=3, b2=2, b3=1)) + ) %>% + add_evaluator(eval) %>% + add_viz(viz) + +future::plan(multicore, workers = 64) + +results <- exper %>% + run_experiment(n_reps = 100, save = TRUE) + +new_method <- create_method(new_method_fun, "my_new_method") + +exper <- exper %>% + add_method(new_method) + +results <- exper %>% + run_experiment(n_reps = 100, use_cached = TRUE) + +init_docs(exper) +render_docs(exper) +

In the example usage, DGP(s), + Method(s), Evaluator(s), + and Visualizer(s) are first created via + create_*(). These simulation objects can then + be combined into an Experiment using either + create_experiment() and/or + add_*().

+

In an Experiment, + DGP(s) and Method(s) can + also be varied across one or multiple parameters via + add_vary_across(). For instance, in the example + Experiment, there are two + DGP instances, both of which are varied across + three values of n and one of which is + additionally varied across two values of + sparse. This effectively results in nine + distinct configurations for data generation (i.e., 3 variations on + dgp1 + 3x2 variations on + dgp2). For the single + Method in the experiment, we use three values + of scalar_valued_param, two of + vector_valued_param, and another two of + list_valued_param, giving 12 distinct + configurations. Hence, there are a total of 9x12 = 108 + DGP-method-parameter combinations in the + Experiment.

+
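The counting argument above can be checked with a few lines of base R. This sketch only reproduces the arithmetic; the parameter values are copied from the example, the vector- and list-valued parameters are represented by their positions (1 or 2) as a simplification, and nothing here calls simChef.

```r
# Parameter values varied across the DGPs in the example.
n_values      <- c(1e2, 1e3, 1e4)
sparse_values <- c(FALSE, TRUE)

# dgp1 varies over n only; dgp2 varies over n and sparse.
dgp1_grid <- expand.grid(n = n_values)
dgp2_grid <- expand.grid(n = n_values, sparse = sparse_values)
n_dgp_configs <- nrow(dgp1_grid) + nrow(dgp2_grid)

# Method parameters: three scalar values, two vectors, two lists.
method_grid <- expand.grid(
  scalar_valued_param = c(0.1, 1.0, 10.0),
  vector_valued_param = 1:2,  # index into the two vectors
  list_valued_param   = 1:2   # index into the two lists
)
n_method_configs <- nrow(method_grid)

# Cartesian product of DGP and method configurations.
n_combinations <- n_dgp_configs * n_method_configs

# With n_reps = 100 replicates per combination.
n_fits <- n_combinations * 100
```

Here `n_dgp_configs` is 9, `n_method_configs` is 12, and `n_combinations` is 108, so 100 replicates per combination amount to 10,800 individual fits.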

Thus far, we have simply instantiated an + Experiment object (akin to creating a recipe + for an experiment). To compute and run the simulation experiment, we + next call run_experiment with the desired + number of replicates. As summarized in + [fig:run-exper], + running the experiment will (1) fit each + Method on each DGP (and + for each of the varying parameter configurations), (2) + evaluate the experiment according to the given + Evaluator(s), and (3) + visualize the experiment according to the given + Visualizer(s). Furthermore, the number of + replicates per combination of DGP, + Method, and parameters specified via + add_vary_across is determined by the + n_reps argument to + run_experiment. Because replication happens at + the per-combination level, the effective total number of replicates in + the Experiment depends on the number of DGPs, + methods, and varied parameters. In the given example, there are 108 + DGP-method-parameter combinations, each of which is replicated 100 + times. To reduce the computational burden, the + Experiment class flexibly handles the + computation of simulation replicates in parallel using the + future package + (Bengtsson, + 2021). + [fig:exper-schematic] + provides a detailed schematic of the + run_experiment workflow, along with the + expected inputs to and outputs from user-defined functions.

+ +

Detailed schematic of the + run_experiment workflow using + simChef. Expected inputs to and outputs from + user-defined functions are also + provided.

+ +
+
+ + Additional Features +

In addition to the ease of parallelization, + simChef enables caching of results to further + alleviate the computational burden. Here, users can choose to save the + experiment’s results to disk by passing + save = TRUE to + run_experiment. Once saved, the user can add + new DGP and Method + objects to the experiment and compute additional replicates without + re-computing existing results via the + use_cached option. Considering the example + above, when we add new_method and call + run_experiment with + use_cached = TRUE, + simChef finds that the cached results are + missing combinations of new_method, existing + DGPs, and their associated parameters, giving nine new configurations. + Replicates for the new combinations are then appended to the cached + results.

+
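The bookkeeping behind cached results can be pictured as a set difference between the requested and the already-computed combinations. The sketch below is not simChef's implementation; it uses base R with hypothetical column names (`dgp`, `config`, `method`) and, for brevity, ignores the 12 parameter settings of the original method, which does not affect the count of new configurations.

```r
# The nine DGP configurations from the example: 3 for dgp1, 6 for dgp2.
dgp_configs <- data.frame(
  dgp    = c(rep("my_dgp1", 3), rep("my_dgp2", 6)),
  config = c(1:3, 1:6)
)

# Cross every DGP configuration with a set of method names.
cross_with <- function(methods) {
  merge(dgp_configs, data.frame(method = methods), by = NULL)
}

cached    <- cross_with("my_method")                      # already on disk
requested <- cross_with(c("my_method", "my_new_method"))  # after adding a method

# Keep only requested combinations whose key is absent from the cache.
key <- function(df) paste(df$dgp, df$config, df$method, sep = "|")
to_compute <- requested[!key(requested) %in% key(cached), ]
```

In this toy setting `to_compute` has nine rows, all belonging to `my_new_method`, which matches the nine new configurations described above.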

simChef also provides users with a + convenient API to automatically generate an R + Markdown document. This documentation gathers the scientific details, + summary tables, and visualizations side-by-side with the user’s custom + source code and parameters for data-generating processes, statistical + methods, evaluation metrics, and plots. A call to + init_docs generates empty markdown files for + the user to populate with their overarching simulation objectives and + with descriptions of each of the DGP, + Method, Evaluator, and + Visualizer objects included in the + Experiment. Finally, a call to + render_docs prepares the + R Markdown document, either for iterative + design and analysis of the simulation or to provide a high-quality + overview that can be shared easily. We provide an example of the + simulation documentation + here. + Corresponding R source code is available on + GitHub.

+
+ + Related <monospace>R</monospace> packages +

A number of existing R packages and projects + address needs related to simChef’s + functionality. At a higher level of abstraction, the + batchtools package + (Lang + et al., 2017) includes concepts for “problems”, “algorithms”, + and “experiments”, similar to simChef’s + DGP, Method, and + Experiment objects, respectively, but these are less + tailored to the specific needs of data science simulation experiments. + Additionally, batchtools provides a number of + utilities for shared-memory and distributed-memory computations, + including for interacting with high-performance computing cluster + schedulers such as Slurm and Torque. simChef is + able to leverage these utilities for distributed computations via the + backends provided by the future.batchtools + package, which is part of the future ecosystem + of R packages + (Bengtsson, + 2021). Whereas batchtools is a general + tool for distributed mapping operations, + simChef specializes in data science simulations + and provides additional functionality tailored to that setting, + including its tidy grammar of simulation + experiments, the Evaluator and + Visualizer concepts, and the automated + documentation capabilities discussed above.

+

Like simChef, many existing packages + specifically aim to simplify the process of creating simulation + experiments by reducing coding burden through helpful abstractions, + distributed computing helpers, and preset methods for generating, + computing, and summarizing simulation replicates. Of particular note + are the following:

+ + +

SimDesign + (Chalmers + & Adkins, 2020) focuses on Monte Carlo simulation experiments and + provides a function runSimulation that + accepts user-defined generate, + analyse, and + summarise functions, with support for + distributed computation via the parallel + base R package and + future.

+
+ +

simulator + (Bien, + 2016) provides a tidy grammar of + simulation experiments and highly modular helpers for evaluating + and managing simulation outputs, relying on the + parallel package for distributed + computation.

+
+ +

simpr + (Brown, + 2023) defines a tidy simulation + framework for generating data, fitting models, varying parameters, + and aggregating simulation results with user-defined and + purrr-style functions. In addition, it + supports distributed computations backed by the + future framework.

+
+ +

SimEngine + (Kenny + & Wolock, 2024) defines and executes simulation + ‘levels’ (parameters to vary) and ‘scripts’ (functions to execute + a single simulation replicate). It manages the definition and + execution of simulations and calculates summary statistics, with + support for distributed computations in coordination with + high-performance computing cluster schedulers.

+
+
+

A third category of related packages comprises those that share + conceptual similarities with simChef in terms of + providing helpful abstractions for the design and analysis of + simulation experiments, but at a finer level of detail than + simChef intends. For example, the package + DeclareDesign + (Blair + et al., 2019) provides various declare_* + functions for defining and evaluating statistical research questions, + with an emphasis on the social sciences. The package + infer + (Couch + et al., 2021) provides a tidy API for + statistical inference, providing the ability to specify random + variables and their relationships, define a null hypothesis, generate + data under that hypothesis, and calculate distributions of statistics + based on that hypothesis. Both of these packages and many of the + packages below could be employed in a user’s + DGP, Method, + Evaluator, or Visualizer + and deployed via an Experiment to carry out a + large-scale simulation with automated documentation in harmony with + simChef.

+

Finally, many packages provide a small number of well-tailored + helper functions for specific data-generating processes and simulation + settings, with or without distributed computation. In no particular + order these include: simitation + (Shilane + et al., 2023), simhelpers + (Joshi + & Pustejovsky, 2024), simTool + (Scheer, + 2020), parSim + (Epskamp, + 2023), rsimsum + (Gasparini, + 2018), simsalapar + (Hofert + & Mächler, 2016), tidyMC + (Linner + et al., 2022), MonteCarloSEM + (Orcan, + 2021), simMetric + (Parsons, + 2022), and simmer + (Ucar + et al., 2019). To our knowledge, no single existing package + includes simChef’s combination of conceptual + modularity, tidy grammar, computational + flexibility, simulation workflow management, and automated + documentation.

+
+ + Discussion +

While simChef’s core functionality focuses + on computability (C) – encompassing efficient usage of computational + resources, ease of user interaction, reproducibility, and + documentation – we emphasize the importance of predictability (P) and + stability (S) in data science simulations (see + Elliott + et al., 2024, for an in-depth discussion). The principal goal + of simChef is to provide a tool for data + scientists to create simulations that incorporate predictability + (through fit to real-world data) and stability (through sufficient + exploration of uncertainty). In future work, we + intend to provide tools that can be flexibly tailored to a user’s + particular scientific needs and that further these goals through automated + predictability and stability summaries and documentation.

+
+ + Acknowledgements +

The authors gratefully acknowledge partial support from (a) the NSF + under awards DMS-2209975, 1613002, 1953191, 2015341, and IIS 1741340; + and grant 2023505 supporting the Foundations of Data Science Institute + (FODSI); (b) the Weill Neurohub; and (c) the Chan Zuckerberg Biohub + under an Intercampus Research Award. TMT acknowledges support from the + NSF Graduate Research Fellowship Program DGE-2146752.

+
+ + + + + + + YuBin + KumbierKarl + + Veridical data science + Proceedings of the National Academy of Sciences + 202002 + 20210904 + 117 + 8 + 0027-8424 + http://www.pnas.org/lookup/doi/10.1073/pnas.1901326117 + 10.1073/pnas.1901326117 + 3920 + 3929 + + + + + + LangMichel + BischlBernd + SurmannDirk + + batchtools: Tools for R to work on batch systems + Journal of Open Source Software + 201702 + 20230420 + 2 + 10 + 2475-9066 + https://joss.theoj.org/papers/10.21105/joss.00135 + 10.21105/joss.00135 + 135 + + + + + + + WickhamHadley + AverickMara + BryanJennifer + ChangWinston + McGowanLucy D’Agostino + FrançoisRomain + GrolemundGarrett + HayesAlex + HenryLionel + HesterJim + KuhnMax + PedersenThomas Lin + MillerEvan + BacheStephan Milton + MüllerKirill + OomsJeroen + RobinsonDavid + SeidelDana Paige + SpinuVitalie + TakahashiKohske + VaughanDavis + WilkeClaus + WooKara + YutaniHiroaki + + Welcome to the Tidyverse + Journal of Open Source Software + 201911 + 20230420 + 4 + 43 + 2475-9066 + https://joss.theoj.org/papers/10.21105/joss.01686 + 10.21105/joss.01686 + 1686 + + + + + + + BengtssonHenrik + + A Unifying Framework for Parallel and Distributed Processing in R using Futures + The R Journal + 2021 + 20230420 + 13 + 2 + 2073-4859 + https://journal.r-project.org/archive/2021/RJ-2021-048/index.html + 10.32614/RJ-2021-048 + 208 + + + + + + + ChangWinston + + R6: Encapsulated Classes with Reference Semantics + 2022 + https://r6.r-lib.org + + + + + + ChalmersMark C.R. Philip AND Adkins + + Writing Effective and Reliable Monte Carlo Simulations with the SimDesign Package + The Quantitative Methods for Psychology + TQMP + 2020 + 16 + 4 + http://www.tqmp.org/RegularArticles/vol16-4/p248/p248.pdf + 10.20982/tqmp.16.4.p248 + 248 + 280 + + + + + + KennyAvi + WolockCharles J. 
 + + SimEngine: A Modular Framework for Statistical Simulations in R + 2024 + https://doi.org/10.48550/arXiv.2403.05698 + 10.48550/arXiv.2403.05698 + + + + + + BrownEthan + + simpr: Flexible ’Tidyverse’-Friendly Simulations + 2023 + https://statisfactions.github.io/simpr/ + + + + + + GaspariniAlessandro + + rsimsum: Summarise results from Monte Carlo simulation studies + Journal of Open Source Software + The Open Journal + 2018 + 3 + 26 + https://joss.theoj.org/papers/10.21105/joss.00739 + 10.21105/joss.00739 + 739 + + + + + + + BlairGraeme + CooperJasper + CoppockAlexander + HumphreysMacartan + + Declaring and Diagnosing Research Designs + American Political Science Review + 2019 + 113 + 3 + https://doi.org/10.1017/S0003055419000194 + 10.1017/S0003055419000194 + 838 + 859 + + + + + + JoshiMegha + PustejovskyJames + + simhelpers: Helper Functions for Simulation Studies + 2024 + https://meghapsimatrix.github.io/simhelpers/index.html + + + + + + ScheerMarsel + + simTool: Conduct Simulation Studies with a Minimal Amount of Source Code + 2020 + https://CRAN.R-project.org/package=simTool + + + + + + EpskampSacha + + parSim: Parallel Simulation Studies + 2023 + https://CRAN.R-project.org/package=parSim + + + + + + ShilaneDavid + BuduguttaSrivastav + BansalMayur + + simitation: Simplified Simulations + 2023 + https://CRAN.R-project.org/package=simitation + + + + + + LinnerStefan + Moreira LaraIgnacio + LehmannKonstantin + + tidyMC: Monte Carlo Simulations Made Easy and Tidy + 2022 + https://github.com/stefanlinner/tidyMC + + + + + + UcarIñaki + SmeetsBart + AzcorraArturo + + simmer: Discrete-event simulation for R + Journal of Statistical Software + 2019 + 90 + 2 + https://doi.org/10.18637/jss.v090.i02 + 10.18637/jss.v090.i02 + 1 + 30 + + + + + + OrcanFatih + + MonteCarloSEM: An R Package to Simulate Data for SEM + International Journal of Assessment Tools in Education + 2021 + 8 + 3 + https://dergipark.org.tr/en/download/article-file/1323860 + 10.21449/ijate.804203 + 
704 + 713 + + + + + + ParsonsRex + + simMetric: Metrics (with Uncertainty) for Simulation Studies that Evaluate Statistical Methods + Queensland University of Technology + 2022 + https://doi.org/10.25912/RDF_1665114451679 + 10.25912/RDF_1665114451679 + + + + + + BienJacob + + The Simulator: An Engine to Streamline Simulations + 2016 + https://doi.org/10.48550/arXiv.1607.00021 + 10.48550/arXiv.1607.00021 + + + + + + CouchSimon P. + BrayAndrew P. + IsmayChester + ChasnovskiEvgeni + BaumerBenjamin S. + Çetinkaya-RundelMine + + infer: An R package for tidyverse-friendly statistical inference + Journal of Open Source Software + 2021 + 6 + 65 + https://joss.theoj.org/papers/10.21105/joss.03661 + 10.21105/joss.03661 + 3661 + + + + + + + HofertMarius + MächlerMartin + + Parallel and Other Simulations in R Made Easy: An End-to-End Study + Journal of Statistical Software + 2016 + 69 + 4 + https://doi.org/10.18637/jss.v069.i04 + 10.18637/jss.v069.i04 + 1 + 44 + + + + + + ElliottCorrine F + DuncanJames + TangTiffany M + BehrMerle + KumbierKarl + YuBin + + Designing a data science simulation with MERITS: A primer + 2024 + https://arxiv.org/abs/2403.08971 + 10.48550/arXiv.2403.08971 + + + + +
diff --git a/joss.06156/10.21105.joss.06156.pdf b/joss.06156/10.21105.joss.06156.pdf new file mode 100644 index 0000000000..a9c2b2c892 Binary files /dev/null and b/joss.06156/10.21105.joss.06156.pdf differ diff --git a/joss.06156/media/api_overview.png b/joss.06156/media/api_overview.png new file mode 100644 index 0000000000..d139dd593c Binary files /dev/null and b/joss.06156/media/api_overview.png differ diff --git a/joss.06156/media/fit_eval_viz.png b/joss.06156/media/fit_eval_viz.png new file mode 100644 index 0000000000..1f471077ce Binary files /dev/null and b/joss.06156/media/fit_eval_viz.png differ diff --git a/joss.06156/media/run_experiment.png b/joss.06156/media/run_experiment.png new file mode 100644 index 0000000000..bbf0aa60a5 Binary files /dev/null and b/joss.06156/media/run_experiment.png differ