better version/build management #8
🤔... One potential solution: name recipes and tables based on species, so Previous versions could be specified by appending the version number. Most users will (probably) want the most up-to-date info and only need to type

What's your opinion on providing previous genome versions? We could maintain recipes for older builds and provide a function that allows users to build them locally. That way they're still easily accessible for reproducibility purposes without causing the package size to explode.
I do think there's a need to be able to maintain or recreate older versions. I operate a core facility, and I've had folks I did analysis for years ago using, e.g., Galgal4, but if I created or recreated the data now, it'd be galgal5. Also, for human specifically, lots of folks (me included) are still using GRCh37.

There might be a few ways to manage this. I think you'd need to know which Ensembl archive version to go after to get the build you're interested in. Also, maybe there's some way to retrieve and record this information from the BioMart query.

I do like the idea of just typing hsapiens... I'm sure there's a way to "alias" different names to the same dataset. I'm not very experienced with R data package creation. This is my first/only.
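To make the aliasing idea concrete, here is a minimal sketch in base R. Everything in it is an assumption for illustration: the `registry` table, the `species_build` naming pattern, and the `resolve_build()` helper are hypothetical and not part of the package. It only shows how a bare species name could resolve to the current build while a suffixed name selects an older one.

```r
# Hypothetical registry of species and builds (illustrative values only;
# which build counts as "current" would really depend on Ensembl).
registry <- data.frame(
  species = c("hsapiens", "hsapiens", "ggallus", "ggallus"),
  build   = c("GRCh38", "GRCh37", "galgal5", "galgal4"),
  current = c(TRUE, FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Resolve a user-facing name to a build (hypothetical helper).
#   "hsapiens"        -> the build flagged as current for that species
#   "hsapiens_GRCh37" -> that specific older build
resolve_build <- function(name) {
  stopifnot(is.character(name), length(name) == 1, nzchar(name))
  parts <- strsplit(name, "_", fixed = TRUE)[[1]]
  rows <- registry[registry$species == parts[1], , drop = FALSE]
  if (nrow(rows) == 0) stop("unknown species: ", parts[1])
  if (length(parts) == 1) {
    # Bare species name: fall back to the current build.
    return(rows$build[rows$current])
  }
  # Suffix present: match the requested build case-insensitively.
  hit <- rows$build[tolower(rows$build) == tolower(parts[2])]
  if (length(hit) == 0) stop("unknown build for ", parts[1], ": ", parts[2])
  hit
}

resolve_build("hsapiens")
resolve_build("hsapiens_GRCh37")
```

A lookup like this would keep the short name stable for most users while leaving older builds addressable by an explicit suffix, at the cost of maintaining the registry by hand.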
This is a good point. Attaching GRCh38 data to an object called

I'm also in a bioinformatics core and frequently switching between different projects that require different genomes/builds, so I loved the idea of annotables. It can be a real time saver!
With the changes in #6 it's much easier to recreate annotation tables. The files are named e.g. galgal5, but which version/build is actually used depends on what's current in Ensembl. E.g., when I first built this package, chicken was on galgal4. I had to manually update the filenames, and I probably did the wrong thing by just deleting (rather than deprecating) the old datasets. Maybe that's okay since it's still versioned in a release. Not sure how to best handle these issues.