Work space for the reproducibility working group for BONSAI hackathon2019
- Brandon Kuczenski, group coordinator
- Miguel Astudillo
- Carlos Gaete
- Massimo Pizzol
- Tom Millross (observing member)
Please refer to our working groups page
Please review repo issues for currently open tasks.
The goal of this working group is to provide transparency and reproducibility of the work products of the hackathon.
One effective definition of transparency can be found in the FAIR Guiding Principles for "Findable, Accessible, Interoperable, and Reusable" data objects.
Regarding reproducibility, we expect people to be able to reproduce our findings by downloading our GitHub repositories and running the code in them. Beyond that, we are also interested in replicability, which is the broader goal of enabling the findings to be independently confirmed (see a set of recommendations from the American Statistical Association on reproducible research).
To that end, we adopt the following general goals:
- Accessible Deliverables: Deliver outputs in a standard, external format. The datapackage library from Frictionless Data has been proposed.
- Operable Workflows: Procedures that demonstrate the achievement of project tasks. Probably in the form of unit tests.
- Provide Provenance: the original data sources used for input information should be reported, and the steps required to convert input data to output data should be documented.
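As a rough illustration of how the three goals above could fit together, here is a minimal sketch in plain Python. It does not use the datapackage library itself; it only writes a `datapackage.json` descriptor by hand, following the Frictionless Data Package layout (`name`, `sources`, `resources`). The function names, the `flows.csv` file, the column names, and the `raw/input.csv` source path are all hypothetical and chosen only for the example.

```python
import csv
import hashlib
import json
import tempfile
from pathlib import Path


def build_descriptor(name, csv_path, sources):
    """Return a Data Package style descriptor (a plain dict) for one CSV file."""
    data = Path(csv_path).read_bytes()
    return {
        "name": name,
        # Provenance: where the input data came from.
        "sources": sources,
        "resources": [
            {
                "name": Path(csv_path).stem,
                "path": Path(csv_path).name,
                "format": "csv",
                # Checksum so that a re-run can be compared byte for byte.
                "hash": "sha256:" + hashlib.sha256(data).hexdigest(),
            }
        ],
    }


def export_flows(rows, out_dir):
    """Hypothetical workflow step: write rows to flows.csv plus datapackage.json."""
    out_dir = Path(out_dir)
    csv_path = out_dir / "flows.csv"
    with csv_path.open("w", newline="") as handle:
        writer = csv.writer(handle, lineterminator="\n")
        writer.writerow(["flow", "amount", "unit"])
        writer.writerows(rows)
    descriptor = build_descriptor(
        "example-flows",
        csv_path,
        # Assumed source entry, for illustration only.
        sources=[{"title": "Hypothetical input table", "path": "raw/input.csv"}],
    )
    (out_dir / "datapackage.json").write_text(json.dumps(descriptor, indent=2))
    return descriptor


def test_export_is_reproducible():
    """Operable workflow check: two runs on the same input give the same hash."""
    rows = [("electricity", 1.5, "kWh"), ("steel", 2.0, "kg")]
    with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
        a = export_flows(rows, first)
        b = export_flows(rows, second)
    assert a["resources"][0]["hash"] == b["resources"][0]["hash"]
    assert a["sources"], "provenance must be recorded"


if __name__ == "__main__":
    test_export_is_reproducible()
    print("export is reproducible")
```

The test function can be collected by a test runner such as pytest or called directly. A real deliverable would likely let the datapackage library generate and validate the descriptor instead of building the dict by hand.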
These can be grouped into "before", "during", and "stretch" goals.
Before the hackathon: We need a set of loose guidelines for the other groups about what their data products will look like.
During the hackathon: Follow the progress of the other groups and ensure that their works in progress meet our transparency and reproducibility standards. By the end of the week, we should have operable standards that demonstrate the reproducibility of data products.
Stretch goals:
- Operable standards that demonstrate "interoperable life cycle inventory models"
- Independent replication of one or more results
- Hasty notes from the kickoff call 2019-03-18