diff --git a/CHANGE_LOG.md b/CHANGE_LOG.md index ffcbd79..b6f9aa8 100644 --- a/CHANGE_LOG.md +++ b/CHANGE_LOG.md @@ -19,6 +19,8 @@ Details of the changes made since the last version - Added more instructions to individual files (`CITATION.cff`, `licence` etc.) - When using the guidelines as a template, the `README.md` should now be deleted when ready, and a replacement selected from the `documentation_templates` repository. - Add example `codemeta.json` +- Update `workflows.md` structure and content (add install, dependencies) +- Update `tools.md` structure and content ## v1.4 (first official release) diff --git a/CITATION.cff b/CITATION.cff index a4c0188..dd52fb0 100644 --- a/CITATION.cff +++ b/CITATION.cff @@ -3,9 +3,11 @@ ### This is an example CITATION.cff file completed for the documentation guidelines template repository ### ########################################################################################################### ### You can create your own using the blank template provided at the bottom of this file" -### Simply remove the "#" and complete each field provided: for more information please see Druskat, S., Spaaks, J. H., Chue Hong, N., Haines, R., Baker, J., Bliven, S., Willighagen, E., Pérez-Suárez, D., & Konovalov, O. (2021). Citation File Format (Version 1.2.0) [Computer software]. https://doi.org/10.5281/zenodo.5171937 +### Simply +#### 1. Delete the existing content (from line 10 - 46) +#### 2. Remove the "#" from the blank template (lines 53 - 62) and complete each field provided: for more information please see Druskat, S., Spaaks, J. H., Chue Hong, N., Haines, R., Baker, J., Bliven, S., Willighagen, E., Pérez-Suárez, D., & Konovalov, O. (2021). Citation File Format (Version 1.2.0) [Computer software]. https://doi.org/10.5281/zenodo.5171937 -cff-version: 0.0.1 +cff-version: 1.5.0 message: "If you use these documentation guidelines, please cite as below." 
authors: - family-names: Gustafsson @@ -35,6 +37,9 @@ authors: - family-names: Samaha given-names: Georgina orcid: https://orcid.org/0000-0003-0419-1476 + - family-names: Al Bkhetan + given-names: Ziad + orcid: https://orcid.org/0000-0002-4032-5331 title: "Australian BioCommons Documentation Guidelines" version: 1.5.0 doi: @@ -54,4 +59,4 @@ date-released: 2023-MM-DD #title: "Title of repository goes here" #version: 0.0.0 #doi: [DOI goes here] -#date-released: YYYY-MM-DD \ No newline at end of file +#date-released: YYYY-MM-DD diff --git a/README.md b/README.md index 2c9f39b..7e8bb45 100644 --- a/README.md +++ b/README.md @@ -18,7 +18,7 @@ You can use this repository as: > **If you use these guidelines, please cite this work :)** > -> Gustafsson, J., Davis, B., de la Pierre, M., Stott, A., Beecroft, S., Downton, M., Edwards, R., Chew, T., & Samaha, G. (2023). Australian BioCommons Documentation Guidelines (Version 1.5.0) [Computer software] +> Gustafsson, J., Davis, B., de la Pierre, M., Stott, A., Beecroft, S., Downton, M., Edwards, R., Chew, T., Samaha, G., & Al Bkhetan, Z. (2023). Australian BioCommons Documentation Guidelines (Version 1.5.0) [Computer software] ## Quick start guide (using repository as a template) @@ -62,30 +62,41 @@ These are the current templates that are available in `documentation_templates/` The files that we recommend you always include are detailed below. -| File | Purpose | What you need to do! | -|------|----------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -|LICENSE.md| The license that indicates how someone can reuse your software or project. | Select a license (https://choosealicense.com/) and copy the license text into this file. | -|CHANGE_LOG.md| A log of the changes made for each version / release. | Update this file when you make changes to your software or project. 
| -|CITATION.cff| A standard file type that indicates how someone should cite your software or project. | Update this file with the citation metadata for your software or project. GitHub will auto-detect this file and create a citation export option for you. | +| File | Purpose | What you need to do! | +|------|----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +|LICENSE.md| The license that indicates how someone can reuse your software or project. | Select a license (https://choosealicense.com/) and copy the license text into this file. | +|CHANGE_LOG.md| A log of the changes made for each version / release. | Update this file when you make changes to your software or project. | +|CITATION.cff| A standard file type that indicates how someone should cite your software or project. | Update this file with the citation metadata for your software or project. GitHub will auto-detect this file and create a citation export option for you. You can easily generate your own `CITATION.cff` using this resource https://citation-file-format.github.io/cff-initializer-javascript/#/ | ### 3. Update the optional but useful file(s) This folder contains useful files that you can include in your repository. -`codemeta.json`: this is a standard metadata file type from the [CodeMeta Project](https://codemeta.github.io/). You can easily generate your own `codemeta.json` using this resource https://codemeta.github.io/create/ - +- `codemeta.json`: this is a standard metadata file type from the [CodeMeta Project](https://codemeta.github.io/). You can easily generate your own `codemeta.json` using this resource: https://codemeta.github.io/create/ + ### 4. 
Delete files and directories you do not need These are guidelines only, and that means you can modify, update or delete elements of the file and directory structure to suit your specific use case. +### 5. Register your software + +Below are some suggestions for where to register, based on the type of software you have created. + +#### Tools +- [bio.tools](https://bio.tools/) + +#### Workflows +- [WorkflowHub](https://workflowhub.eu/) +- [Dockstore](https://dockstore.org/) + ## Citing this repository > If you use this template repository, or any of its documentation elements, please use the following citation: > -> Gustafsson, J., Davis, B., de la Pierre, M., Stott, A., Beecroft, S., Downton, M., Edwards, R., Chew, T., & Samaha, G. (2023). Australian BioCommons Documentation Guidelines (Version 1.5.0) [Computer software] +> Gustafsson, J., Davis, B., de la Pierre, M., Stott, A., Beecroft, S., Downton, M., Edwards, R., Chew, T., Samaha, G., & Al Bkhetan, Z. (2023). Australian BioCommons Documentation Guidelines (Version 1.5.0) [Computer software] ## Contributing @@ -100,25 +111,28 @@ Anyone is welcome to contribute to these documentation guidelines in the followi # Acknowledgements & attributions -The guideline template is supported by the Australian BioCommons via Bioplatforms Australia funding, the Australian Research Data Commons (https://doi.org/10.47486/PL105) and the Queensland Government RICF programme. Bioplatforms Australia and the Australian Research Data Commons are enabled by the National Collaborative Research Infrastructure Strategy (NCRIS). +The documentation guidelines template repository is supported by the Australian BioCommons via Bioplatforms Australia funding, the Australian Research Data Commons (https://doi.org/10.47486/PL105) and the Queensland Government RICF programme. Bioplatforms Australia and the Australian Research Data Commons are enabled by the National Collaborative Research Infrastructure Strategy (NCRIS). 
The BioCommons would also like to acknowledge the contributions of the following individuals and institutions to these documentation guidelines: -- Johan Gustafsson (Australian BioCommons, University of Melbourne) [@supernord](https://github.com/supernord) -- Brian Davis (National Computational Infrastructure) [@Davisclan](https://github.com/Davisclan) -- Marco de la Pierre (Pawsey Supercomputing Centre) [@marcodelapierre](https://github.com/marcodelapierre) -- Audrey Stott (Pawsey Supercomputing Centre) [@audreystott](https://github.com/audreystott) -- Sarah Beecroft (Pawsey Supercomputing Centre) [@SarahBeecroft](https://github.com/SarahBeecroft) -- Matthew Downton (National Computational Infrastructure) [@mattdton](https://github.com/mattdton) -- Richard Edwards (University of New South Wales) [@cabbagesofdoom](https://github.com/cabbagesofdoom) -- Tracy Chew (University of Sydney) [@tracychew](https://github.com/tracychew) -- Georgina Samaha (University of Sydney) [@georgiesamaha](https://github.com/georgiesamaha) +- **Johan Gustafsson** (Australian BioCommons, University of Melbourne) [@supernord](https://github.com/supernord) +- **Brian Davis** (National Computational Infrastructure) [@Davisclan](https://github.com/Davisclan) +- **Marco de la Pierre** (Pawsey Supercomputing Centre) [@marcodelapierre](https://github.com/marcodelapierre) +- **Audrey Stott** (Pawsey Supercomputing Centre) [@audreystott](https://github.com/audreystott) +- **Sarah Beecroft** (Pawsey Supercomputing Centre) [@SarahBeecroft](https://github.com/SarahBeecroft) +- **Matthew Downton** (National Computational Infrastructure) [@mattdton](https://github.com/mattdton) +- **Richard Edwards** (University of New South Wales) [@cabbagesofdoom](https://github.com/cabbagesofdoom) +- **Tracy Chew** (University of Sydney) [@tracychew](https://github.com/tracychew) +- **Georgina Samaha** (University of Sydney) [@georgiesamaha](https://github.com/georgiesamaha) +- **Ziad Al Bkhetan** (Australian 
BioCommons, University of Melbourne) [@ziadbkh](https://github.com/ziadbkh) # Citations - Druskat, S., Spaaks, J. H., Chue Hong, N., Haines, R., Baker, J., Bliven, S., Willighagen, E., Pérez-Suárez, D., & Konovalov, O. (2021). Citation File Format (Version 1.2.0) [Computer software]. https://doi.org/10.5281/zenodo.5171937 -- Jon Ison and others, Tools and data services registry: a community effort to document bioinformatics resources, Nucleic Acids Research, Volume 44, Issue D1, 4 January 2016, Pages D38–D47, https://doi.org/10.1093/nar/gkv1116 +- Spaaks, J. H., Verhoeven, S., Diblen, F., Druskat, S., Soares Siqueira, A., Garcia Gonzalez, J., & Cushing, R. (2023). cffinit (Version 2.3.1) [Computer software]. https://github.com/citation-file-format/cff-initializer-javascript - Carole Goble, Stian Soiland-Reyes, Finn Bacall, Stuart Owen, Alan Williams, Ignacio Eguinoa, Bert Droesbeke, Simone Leo, Luca Pireddu, Laura Rodríguez-Navas, José Mª Fernández, Salvador Capella-Gutierrez, Hervé Ménager, Björn Grüning, Beatriz Serrano-Solano, Philip Ewels, & Frederik Coppens. (2021). Implementing FAIR Digital Objects in the EOSC-Life Workflow Collaboratory. Zenodo. https://doi.org/10.5281/zenodo.4605654 +- Jon Ison and others, Tools and data services registry: a community effort to document bioinformatics resources, Nucleic Acids Research, Volume 44, Issue D1, 4 January 2016, Pages D38–D47, https://doi.org/10.1093/nar/gkv1116 - Matthew B. Jones, Carl Boettiger, Abby Cabunoc Mayes, Arfon Smith, Peter Slaughter, Kyle Niemeyer, Yolanda Gil, Martin Fenner, Krzysztof Nowak, Mark Hahnel, Luke Coy, Alice Allen, Mercè Crosas, Ashley Sands, Neil Chue Hong, Patricia Cruse, Daniel S. Katz, Carole Goble. 2017. CodeMeta: an exchange schema for software metadata. Version 2.0. KNB Data Repository. doi:10.5063/schema/codemeta-2.0 +- O'Connor BD, Yuen D, Chung V et al. 
The Dockstore: enabling modular, community-focused sharing of Docker-based genomics tools and workflows [version 1; peer review: 2 approved]. F1000Research 2017, 6:52 (https://doi.org/10.12688/f1000research.10137.1) diff --git a/codemeta.json b/codemeta.json index 350f00c..9af9266 100644 --- a/codemeta.json +++ b/codemeta.json @@ -103,6 +103,15 @@ "@type": "Organization", "name": "University of Sydney" } + }, + { + "@type": "Person", + "givenName": "Ziad", + "familyName": "Al Bkhetan", + "affiliation": { + "@type": "Organization", + "name": "Australian BioCommons, University of Melbourne" + } } ] } \ No newline at end of file diff --git a/documentation_templates/tools.md b/documentation_templates/tools.md index e6987b1..77d86c7 100644 --- a/documentation_templates/tools.md +++ b/documentation_templates/tools.md @@ -3,156 +3,112 @@ - [Description](#description) - [Diagram](#diagram) + - [How to cite this software](#how-to-cite-this-software) - [User guide](#user-guide) - [Install](#install) - [Quick start guide](#quick-start-guide) - - [Infrastructure usage and - recommendations](#infrastructure-usage-and-recommendations) - - [Compute resource usage across tested - infrastructures](#compute-resource-usage-across-tested-infrastructures) - - [Benchmarking](#benchmarking) - - [Workflow summaries](#workflow-summaries) - - [Metadata](#metadata) - [Required (minimum) inputs/parameters](#required-minimum-inputsparameters) + - [Dependencies & third party tools](#dependencies--third-party-tools) + - [Recommendations for use on specific compute systems](#recommendations-for-use-on-specific-compute-systems) + - [Compute resource usage on tested + infrastructures](#compute-resource-usage-across-tested-infrastructures) + - [Benchmarking (compute resource usage on tested infrastructures)](#benchmarking-compute-resource-usage-on-tested-infrastructures) - [Additional notes](#additional-notes) - [Help/FAQ/Troubleshooting](#helpfaqtroubleshooting) - [3rd party 
Tutorials](#3rd-party-tutorials) - [License(s)](#licenses) - [Acknowledgements/citations/credits](#acknowledgementscitationscredits) ---- - -# Description - -``` -Introduction of tool, including its input(s)/output(s) as well as a list (or link to) of available shell commands (useful when building interfaces/wrappers for containers) - -Table with embedded registry links, if suitable. -``` -| Repository / Registry | Available? | -|-------------|:--------:| -| GitHub | [●]()| -| bio.tools | [●]()| -| BioContainers | [●]()| -| bioconda | [●]()| +## Description ---- +> Introduction of tool, including its input(s)/output(s) as well as a list (or link to) of available shell commands (useful when building interfaces/wrappers for containers) -# Diagram +> Table with embedded registry links, if suitable. -> Logical visual description of processing steps for tool, e.g. for pipelines shipped as packages (Falcon, Cellranger, ..) +| Repository / Registry | Available? | +|:----------------------:|-------------| +| bio.tools | Add URL here | +| bioconda | Add URL here | +| BioContainers | Add URL here | ---- -# User guide -## Install +## Diagram -> General installation guide. +> Logical visual description of processing steps for tool, e.g. for pipelines shipped as packages (Falcon, Cellranger, ..) -> If there are different installation requirements based on infrastructures you could indicate these here, or in the individual infrastructure documentation template: https://github.com/AustralianBioCommons/doc_guidelines/blob/master/infrastructure_optimisation.md ---- +## How to cite this software -## Quick start guide +> Add citation instructions here. -> General guide for deployment across multiple infrastructures (distinct from specific infrastructure quick start guide). 
---- +## User guide -## Infrastructure usage and recommendations -> You could include: -> + a link to installation instructions for each computational infrastructure -> + recommendations for use on a specific computational infrastructure +### Install -> Documentation for a specific infrastructure should go into a infrastructure documentation template https://github.com/AustralianBioCommons/doc_guidelines/blob/master/infrastructure_optimisation.md - ---- - -## Compute resource usage across tested infrastructures +> General installation guide. -> Table with high level compute resource usage information for standalone runs or testing of specific versions on specific computational infrastructures. +> If there are different installation requirements for specific compute infrastructures you could indicate these here, or in an individual infrastructure documentation template: https://github.com/AustralianBioCommons/doc_guidelines/blob/master/infrastructure_optimisation.md -| Tool | Version | Sample description | Wall time | Cores | Peak RAM in GB (requested) | Drive (GB) | HPC-HTC | If HPC-HTC is other, specify | Scheduler | Year-Month | -| ----- | ------- | ------------------ | --------- | ----- | -------------------------- | ---------- | ------- | ---------------------------- | --------- | ---------- | -| | | | | | | | | | | | ---- +### Quick start guide -## Benchmarking +> General guide for deployment across multiple infrastructures (distinct from specific infrastructure quick start guide). -> Benchmarking for a specific infrastructure should go here: if this document is complicated it should go into a benchmarking template, or be provided elsewhere (e.g. Zenodo). ---- +### Required (minimum) inputs / parameters -# Tool summary +> The minimum inputs required for the workflow to run. 
-## Metadata -> Example table below - **based on requirements for [bio.tools](https://bio.tools/)** +### Dependencies & third party tools -> **To discover appropriate [EDAM](https://github.com/edamontology/edamontology) ontology terms**, you can use [EDAM browser](https://edamontology.github.io/edam-browser/) +> Add / list known dependencies, or link to a list of these dependencies. -> bio.tools citation: Ison, J. *et al.* (2015). [Tools and data services registry: a community effort to document bioinformatics resources.](http://nar.oxfordjournals.org/content/early/2015/11/03/nar.gkv1116.long) _Nucleic Acids Research_. - doi: [10.1093/nar/gkv1116](http://dx.doi.org/10.1093/nar/gkv1116) PMID: [26538599 ](http://www.ncbi.nlm.nih.gov/pubmed/26538599) -> EDAM citation: Ison, J., Kalaš, M., Jonassen, I., Bolser, D., Uludag, M., McWilliam, H., Malone, J., Lopez, R., Pettifer, S. and Rice, P. (2013). [EDAM: an ontology of bioinformatics operations, types of data and identifiers, topics and formats](http://bioinformatics.oxfordjournals.org/content/29/10/1325.full). _Bioinformatics_, **29**(10): 1325-1332. 
-[![10.1093/bioinformatics/btt113](https://zenodo.org/badge/DOI/10.1093/bioinformatics/btt113.svg)](https://doi.org/10.1093/bioinformatics/btt113) PMID: [23479348](http://www.ncbi.nlm.nih.gov/pubmed/23479348) _Open access_ +### Recommendations for use on specific compute systems +> You could include: +> + a link to installation instructions for each computational infrastructure, if the requirements for each system are unique +> + recommendations for use on a specific computational infrastructure or system -|metadata field | value | -|-------------------|:---------------------------------:| -|Tool name | | -|Description | | -|Homepage URL | | -|Software version(s)| | -|EDAM topic(s) | | -|EDAM operation(s) | | -|Maturity | | -|Creators | | -|License | | -|Container | | -|Install method | | -|GitHub | | -|bio.tools | | -|BioContainers | | -|bioconda | | +> Documentation for a specific infrastructure could also go into an infrastructure documentation template https://github.com/AustralianBioCommons/doc_guidelines/blob/master/infrastructure_optimisation.md ---- -## Required (minimum) inputs / parameters +### Benchmarking (compute resource usage on tested infrastructures) -> The minimum inputs required for the workflow to run. +> Minimal information to include here: +> - Max threads +> - Peak RAM used +> - Operating system compatibility ---- +> Benchmarking for a specific infrastructure could also go here: if the benchmarking information is complicated it could go into a benchmarking template, or be provided elsewhere (e.g. Zenodo). -## Third party tools / dependencies +> Example table with high level compute resource usage information for standalone runs or testing of specific versions on specific computational infrastructures. -> Add / list known dependencies, or link to a list of these dependencies. +| Tool | Version | Sample description | Wall time | Cores | Peak RAM usage | Total size of all files (GB) | Compute system (e.g. 
Pawsey Setonix HPC, AWS) | Scheduler | Year-Month | +| ----- | ------- | ------------------ | --------- | ----- | -------------------------- |------------------------------| ------- | --------- | ---------- | +| | | | | | | | | | | ---- -# Additional notes +## Additional notes > Any comment on major features being introduced, or default/API changes that might result in unexpected behaviours. ---- -# Help / FAQ / Troubleshooting +## Help / FAQ / Troubleshooting ---- -# 3rd party Tutorials +## 3rd party Tutorials ---- -# License(s) +## [License(s)](../LICENSE.md) ---- -# Acknowledgements / citations / credits +## Acknowledgements / citations / credits > Any attribution information that is relevant to the tool being documented. \ No newline at end of file diff --git a/documentation_templates/workflows.md b/documentation_templates/workflows.md index f64cead..9af93d3 100644 --- a/documentation_templates/workflows.md +++ b/documentation_templates/workflows.md @@ -3,15 +3,17 @@ - [Description](#description) - [Diagram](#diagram) + - [How to cite this workflow](#how-to-cite-this-workflow) - [User guide](#user-guide) - [Quick start guide](#quick-start-guide) + - [Install instructions](#install) + - [Dependencies & third party tools](#dependencies--third-party-tools) - [Required (minimum) inputs/parameters](#required-minimum-inputsparameters) - - [Infrastructure usage and - recommendations](#infrastructure-usage-and-recommendations) - - [Compute resource usage across tested + - [Recommendations for use on specific compute systems](#recommendations-for-use-on-specific-compute-systems) + - [Compute resource usage on tested infrastructures](#compute-resource-usage-across-tested-infrastructures) - - [Benchmarking](#benchmarking) + - [Benchmarking (compute resource usage on tested infrastructures)](#benchmarking-compute-resource-usage-on-tested-infrastructures) - [Additional notes](#additional-notes) - [Help/FAQ/Troubleshooting](#helpfaqtroubleshooting) - [3rd party 
Tutorials](#3rd-party-tutorials) @@ -31,6 +33,11 @@ Logical visual description of processing steps for workflow +## How to cite this workflow + +> Add citation instructions here. + + ## User guide @@ -39,10 +46,14 @@ Logical visual description of processing steps for workflow > General guide for deployment across multiple infrastructures (distinct from specific infrastructure quick start guide) -### Install instructions +### Install +> General installation guide. -### Dependencies +> If there are different installation requirements for specific compute infrastructures you could indicate these here, or in an individual infrastructure documentation template: https://github.com/AustralianBioCommons/doc_guidelines/blob/master/infrastructure_optimisation.md + + +### Dependencies & third party tools ### Required (minimum) inputs/parameters @@ -50,7 +61,7 @@ Logical visual description of processing steps for workflow > The minimum inputs required for the workflow to run. -### Infrastructure usage and recommendations +### Recommendations for use on specific compute systems > + link to installation instructions for each infrastructure > + recommendations @@ -59,17 +70,15 @@ Logical visual description of processing steps for workflow https://github.com/AustralianBioCommons/doc_guidelines/blob/master/infrastructure_optimisation.md -### Compute resource usage across tested infrastructures +### Benchmarking (compute resource usage on tested infrastructures) > Table with high level compute resource usage information for standalone runs or testing of specific versions on specific computational infrastructures. -| Title | Version | Sample description | Wall time | Cores | Peak RAM in GB (requested) | Drive (GB) | Compute system (e.g. Pawsey Setonix HPC, AWS) | Scheduler | Year-Month | +| Title | Version | Sample description | Wall time | Cores | Peak RAM usage | Total size of all files (GB) | Compute system (e.g. 
Pawsey Setonix HPC, AWS) | Scheduler | Year-Month | | ----- | ------- | ------------------ | --------- | ----- | -------------------------- | ---------- | ------- | --------- | ---------- | | | | | | | | | | | | -### Benchmarking - > Benchmarking for a specific infrastructure should go here: if this document is complicated it should go into a benchmarking template, or be provided elsewhere (e.g. Zenodo). @@ -82,7 +91,7 @@ https://github.com/AustralianBioCommons/doc_guidelines/blob/master/infrastructur ## 3rd party Tutorials -## License(s) +## [License(s)](../LICENSE.md) ## Acknowledgements/citations/credits