This repository collects various helpful links, training materials, courses, code snippets and information for all things computational biology...
I have used many of these sites myself to learn, or have come across them when looking for good teaching materials, and have collated them here not only as a useful reminder to myself but also as a helpful resource for those starting out on their journey into bioinformatics, genomics, programming and computational biology.
There is a lot of overlap between sections, but links are split into broad sections.
- Bioconductor resources
- Bioc2020 conference resources - overview of Bioconductor - recorded workshops and vignettes (beginner to advanced) on key Bioconductor packages (e.g. annotations) as well as more specific single cell and HiC packages - keynote lectures and short talks on recent developments in the field
- Biostars Handbook - Introductory guide to bioinformatics, giving an overview of the theory and step-by-step instructions for getting started with many bioinformatic tasks and analyses
- Curated list of bioinformatic resources from Edinburgh XDF fellowship programme
- Harvard Chan Bioinformatic Core training repository and website - introductory tutorials on several bioinformatic topics (ATAC-seq, RNA-seq, scRNA-seq)
- CSAMA - Intermediate course focusing on statistical analysis of genomic datasets
- Hemberg lab scRNA-seq course
- EMBL-EBI online training resources - introductory courses covering the use of many online databases and resources
- Stats Quest Videos - Small but thorough video tutorials on many statistical concepts related to genomics and broader statistical concepts - has nice DESeq2 and edgeR explainers, e.g. DESeq2 library normalisation
- Modern Statistics for Modern Biology by Susan Holmes and Wolfgang Huber - Excellent coverage of beginner to intermediate statistical concepts geared towards biological problems - The authors are two top experts in the field of statistics and biological data science; this book really is a must-read.
- MIT Deep Learning in the Life Sciences Course Materials
- Harvard Introduction to Bioinformatics and Computational Biology by Shirley Liu - An amazing collection of lectures by Shirley Liu, one of the world leaders in bioinformatics, covering fundamental and key algorithms in the field in excellent detail.
- Harvard X - open computational biology lectures and resources
- Computational Genomics With R
- Canadian Bioinformatics Workshops - Excellent series of short workshops on a wide range of bioinformatics topics.
- Learn2Discover course in python, data handling, ML / Networks
- Intro to genomics for engineers by St. Jude Children’s Research Hospital, Inc.
- AI learning resources curated and produced by Deepmind - series of introductory lectures and podcasts
- Overview of the history of genetics (http://www.dnaftb.org/)
- Computational Genomics with R book from Dr. Altuna Akalin (a bioinformatics scientist and the head of the Bioinformatics and Omics Data Science Platform at the Berlin Institute of Medical Systems Biology, Max Delbrück Center in Berlin) - covers topics from genomics to machine learning
- Broad Institute Models Inference and Algorithms meeting
- Broad Institute Other uploads
- Fragile Nucleosome Transcription/Gene regulation lecture series
- Single Cell Omics Germany - Single cell focused lecture series
- Turing institute talks and lectures - Lectures on computational/AI/mathematical topics in research, data science, health data, etc.
- Washu genome browser
- IGV
- UCSC
- awesome-expression-browser repository
- Nice guide and tutorial from Rockefeller University on using Gviz (Bioconductor) and IGV to visualise genomic data
- Nimbus Image - Suite of tools from Arjun Raj lab for interactively analysing and investigating microscopy data
- Squidpy - Spatial Single Cell Analysis in Python
- The paper: Squidpy: a scalable framework for spatial omics analysis. Nature Methods, 19(2):171–178, Feb 2022. [doi:10.1038/s41592-021-01358-2].
- Repository collecting single cell tools - awesome-single-cell
- The Human Cell Atlas holds regular open-access conferences that are worth watching
- Satijalab Single cell genomics day held annually (usually March)
- Center for Integrated Cellular Analysis Online lecture series - here
- Tools
- Azimuth - Map your counts table to reference single cell datasets
- Batch correction methods being poorly calibrated link
- ScVI
- Thread regarding benchmarking (https://x.com/_canergen/status/1772190381871907122)
- Nice review/guide through the whole process and visualisation: Reimand et al., 2019, Nature Protocols
- includes gProfiler, GSEA, Cytoscape and Enrichment Map
- guide for using the gprofiler2 R package
- SetRank - package for further refining GSEA analysis and removing false positives
- More detailed description for selecting an appropriate background (https://sci-hub.se/10.1007/978-1-60761-175-2_6)
- General data science topics - general data science website with interesting articles relating to all things data science, lots of machine learning topics
- datacamp - various online courses covering a broad range of data science topics - many introductory
- Introduction to Research Data Science - by the Turing Institute
- Software and Datacarpentries courses
- Linux Foundation Course - Thorough intro to Linux and the command line
- R for data science - Introductory book that covers R, Rstudio, Tidyverse for data wrangling, plotting in ggplot, general data science approaches and methods.
- Software carpentry R for genomics - basic introductory R course
- rstudio webinars - webinars on various topics related to R and rstudio
- R graph gallery - gallery of various R-based graphs and how to make them
- ggplot extensions gallery - nice overview of ggplot extensions you can use - good for plotting inspiration
- R Markdown Cookbook - includes a useful section on how to parameterise Rmds
- Computational Genomics with R book from Dr. Altuna Akalin (a bioinformatics scientist and the head of the Bioinformatics and Omics Data Science Platform at the Berlin Institute of Medical Systems Biology, Max Delbrück Center in Berlin) - covers topics from genomics to machine learning
- Python Data Science Handbook - very detailed textbook on using Python for data science, includes: IPython, NumPy, pandas, Matplotlib and scikit-learn for machine learning
- Python graph gallery - gallery of various python-based graphs and how to make them
- equivalent of setting PYTHONPATH so conda installs of python can access your own repos
- Research Software Engineering with Python - From the Turing institute
- The Art of Statistics: Learning from Data by David Spiegelhalter - A short general book on the importance of visualising data and many key statistical concepts that crop up time and time again in biological data science - a really good and interesting read for those a bit rusty in statistics
- Modern Statistics for Modern Biology by Susan Holmes and Wolfgang Huber - Excellent coverage of beginner to intermediate statistical concepts geared towards biological problems - The authors are two top experts in the field of statistics and biological data science; this book really is a must-read.
- Think Stats
- Think Bayes
- Stats Quest Videos - Small but thorough video tutorials on many statistical concepts related to genomics and broader statistical concepts - has nice DESeq2 and edgeR explainers, e.g. DESeq2 library normalisation
- Chartmaker - summary of different types of plot and how to make them using different software
- Python graph gallery - gallery of various python-based graphs and how to make them
- R graph gallery - gallery of various R-based graphs and how to make them
- ggplot extensions gallery - nice overview of ggplot extensions you can use - good for plotting inspiration
- ComplexHeatmaps (in R) - manual to create complex heatmaps to assimilate info
- Biorender - software for publication quality drawing of cells/biological graphics
- Friends Don't Let Friends Make Bad Graphs - GitHub repo of bad graphs, reasons not to use them and how to make them better.
- Nature Review: Guide to Machine Learning for Biologists, Greener et al., 2021.
- Python Machine Learning by Sebastian Raschka and Vahid Mirjalili - Nice introductory textbook going over the fundamentals of machine learning concepts and algorithms, and using the Python scikit-learn library
- Encoding of categorical variables
- Sklearn
- MIT Deep Learning in the Life Sciences Course Materials
- 2021 ML in Genomics course focused on ML using R
- Computational Genomics with R book from Dr. Altuna Akalin (a bioinformatics scientist and the head of the Bioinformatics and Omics Data Science Platform at the Berlin Institute of Medical Systems Biology, Max Delbrück Center in Berlin) - covers topics from genomics to a basic intro to machine learning
- Optuna - Python framework to automate hyperparameter search
- Ray Tune - Python framework to automate hyperparameter search
- Git
- Conda
- tmux
- intro to tmux - brief 2-minute guide to getting started with tmux, a program you can run on a server that keeps your terminal (and the processes in it) open even if your SSH connection to the server drops
- Atom - a text editor you can use to write scripts for R, Python or any text document. Atom guide - guide to getting started with the Atom text editor
These notes are pretty rough and mainly for my own use, so bear with them...
- Useful software to set up on a new computer
- How to add your own directories to the equivalent of PYTHONPATH so conda installs of Python can access your own repos
- Setting up cgatpipelines on a new cluster
- Using Jupyter on an HPC cluster
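
As a rough illustration of the PYTHONPATH note above: one common approach (not necessarily the one used in the linked note) is to drop a `.pth` file into the environment's `site-packages` directory, since Python adds each existing directory listed in such a file to `sys.path` at startup. The sketch below assumes that approach; the function names, the `my_repos.pth` file name and the example repo path are all hypothetical.

```python
import sysconfig
from pathlib import Path


def default_site_packages():
    """Return the site-packages directory of the currently running Python.

    Inside an activated conda environment this is that environment's
    site-packages directory.
    """
    return Path(sysconfig.get_paths()["purelib"])


def add_repo_to_site_packages(repo_dir, site_packages_dir, pth_name="my_repos.pth"):
    """Record repo_dir in a .pth file inside site_packages_dir.

    Each line of a .pth file that names an existing directory is appended
    to sys.path when Python starts. Calling this twice with the same repo
    does not add a duplicate line. The file name is an arbitrary choice.
    """
    repo = str(Path(repo_dir).expanduser().resolve())
    pth_file = Path(site_packages_dir) / pth_name
    lines = pth_file.read_text().splitlines() if pth_file.exists() else []
    if repo not in lines:
        lines.append(repo)
        pth_file.write_text("\n".join(lines) + "\n")
    return pth_file


# Example usage (hypothetical repo path; run inside the target conda env):
# add_repo_to_site_packages("~/repos/my_tools", default_site_packages())
```

Unlike exporting `PYTHONPATH` in a shell profile, a `.pth` file only affects the one environment it is written into, so other conda environments are left untouched.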
Here is a list of publicly available datasets/resources for bioinformatic analysis - document here