
Target audience for this module? #27

Open
jdkent opened this issue Apr 28, 2019 · 3 comments

Comments

@jdkent

jdkent commented Apr 28, 2019

Thinking about this on my own, it appears the target audience for this module substantially overlaps with dataprocessing, such that concepts covered here could be introduced as needed into that module to solve the person's data processing problems. I can see how FAIR-data, statistics, and dataprocessing help solve inter-related but separable problems, but I'm having a harder time placing what problem reproducible-basics solves.

What problems I think the modules solve

FAIR-data: how do I share/find my data?
dataprocessing: how do I preprocess/analyze my data reproducibly?
statistics: how do I make appropriate models/interpret my data?
reproducible-basics: understand reproducibility???

I'm trying to think from the perspective of someone who would like to attend a workshop, and I'm having trouble seeing what concrete problem "understanding reproducibility" solves, or whether the module solves another problem that translates easily to someone's goals.

I do think this module covers worthwhile concepts, not found in git-novice or shell-novice, that would add to the dataprocessing module. I'm curious what other people think about merging these two modules (and redistributing/reformatting lessons that don't fit into dataprocessing into the introduction/FAIR-data modules)?

@yarikoptic
Member

Thanks @jdkent for the feedback. I would say that reproducible-basics "solves the problem" (or addresses the question) of "what did/do I do to achieve X?" at an elementary level (not at the level of formalized provenance tracking).
Since that usually entails installing and running software tools, particular emphasis is given to code/software/data distributions, to the actual execution of the tools, and to what affects it (e.g. environment variables).
Indeed it is related to data processing (and to FAIR-data to the extent of managing data), and I think that is good -- we do want modules to relate to each other.
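
To make the "what did/do I do to achieve X?" question concrete, here is a minimal illustrative sketch, not taken from the module itself. It runs a command and records which executable was actually resolved and the values of a few environment variables that commonly affect execution. The helper name `run_and_record` and the default choice of variables are assumptions made for this example.

```python
import json
import os
import shutil
import subprocess
import sys


def run_and_record(cmd, env_keys=("PATH", "PYTHONPATH", "LD_LIBRARY_PATH")):
    """Run `cmd` (a list of strings) and return a small record of what was run.

    `run_and_record` is a hypothetical helper for illustration; it is not
    formal provenance tracking, only an elementary note of the invocation.
    """
    # Which file on disk does the first word of the command resolve to?
    resolved = shutil.which(cmd[0])
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return {
        "cmd": list(cmd),
        "resolved_executable": resolved,
        # Unset variables are recorded as None (null in JSON).
        "env": {key: os.environ.get(key) for key in env_keys},
        "returncode": proc.returncode,
        "stdout": proc.stdout,
    }


if __name__ == "__main__":
    record = run_and_record([sys.executable, "-c", "print('hello')"])
    print(json.dumps(record, indent=2))
```

Saving such a record next to the outputs is one simple way to answer "what did I do?" later; which variables matter depends on the tools involved.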

@jdkent
Author

jdkent commented May 1, 2019

Thank you for the clarification. If I understand correctly, I see this module as a sort of "glue" that doesn't directly solve a problem for a potential learner, but does provide a foundation for further learning. If we take the Carpentries handbook as a guide, I want to remain cognizant of how many skills we assume learners have and how many we need to teach before the learner has "enough" practical knowledge to apply to their own problems. As in, if someone comes to a workshop on reproducible basics, how many takeaways could they bring back to their lab right away? I see a number of takeaways in the shell/git/right to share/other episodes, but I've survived (but perhaps not truly lived 😜) while not understanding the difference between package managers and distributions.

But I believe ascertaining the value of package managers and distributions would be a separate issue, and if I understood your explanation correctly, I've been swayed to think this module serves a good purpose overall.

@yarikoptic
Member

> but I've survived (but perhaps not truly lived) while not understanding the difference between package managers and distributions.

;-) indeed... something to think about and possibly refactor, e.g. by moving apt/conda/... into a less prominent supplementary submodule. With containerization approaches it becomes a bit less important, but I do feel that at least a cursory overview of those "computing environment building blocks" could be of benefit. In your experience -- how did you install software, and were you "comfortable", e.g. knowing that you are actually running what you think you are running? And how often did you encounter WTF cases due to installation oddities (e.g. local python modules installed under ~/.local on one box but absent on another, conflicting with system-wide installations, your ~/.local stuff being picked up within your docker/singularity container environments, etc.)?
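
One way to check for the ~/.local situation described above is to ask Python where it would import a module from. The following is a minimal sketch assuming a standard CPython 3 installation; `describe_module` is a hypothetical helper name introduced for this example, and some older virtualenv setups may not provide `site.getusersitepackages()`.

```python
import importlib.util
import site
from pathlib import Path


def describe_module(name):
    """Report where top-level module `name` would be imported from.

    `describe_module` is a hypothetical helper for illustration only.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return {"name": name, "found": False, "origin": None, "in_user_site": False}

    origin = spec.origin  # a file path, "built-in", "frozen", or None
    in_user_site = False
    if origin not in (None, "built-in", "frozen"):
        # Per-user site-packages directory (under ~/.local on typical Linux setups).
        user_site = Path(site.getusersitepackages()).resolve()
        try:
            Path(origin).resolve().relative_to(user_site)
            in_user_site = True
        except ValueError:
            # The module file does not live under the per-user directory.
            in_user_site = False

    return {"name": name, "found": True, "origin": origin, "in_user_site": in_user_site}


if __name__ == "__main__":
    print("user site enabled:", site.ENABLE_USER_SITE)
    for module_name in ("json", "sys"):
        print(describe_module(module_name))
```

If a module unexpectedly reports `in_user_site` as true, starting Python with the `-s` option or setting the `PYTHONNOUSERSITE` environment variable keeps the per-user directory off `sys.path`, which can help when the home directory is visible inside a container.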
