HeardLibrary/cpbp8306-dataanalysis


Title: Course files for CPBP 8306: Data analysis

Repository name: cpbp8306-dataanalysis

Year: Fall 2024

Institution: Vanderbilt University

Instructors: Joshua Borycz and Daniel Genkins

Description

This course focuses on practical coding for research using Python and R. We will begin by discussing the version control tool Git, the command line, and installing the tools needed to code in these languages. Then we will talk about how to use Jupyter notebooks and how to use AI responsibly to help with the coding process. Next, we will read, write, and analyze data and images with Python. We will then shift to R and RStudio, beginning with how to organize a project, load and manage R packages, and set up the AI Copilot tool within RStudio. This will lead into data types and structures and data cleaning. We will then focus on analyzing these data with univariate and multivariate statistical methods. Finally, we will show how to visualize data within Python and R.

Folders

Folder: Description
S01_20240822_python_session: What is Python and installing Python.
S02_20240829_python_session: Installing the Python environment (Miniconda, VS Code, Jupyter notebooks).
S03_20240905_python_session: Basic Python variable types and functions.
S04_20240912_python_session: Basic Python data manipulation with pandas, image analysis, and univariate statistics.
S05_20240919_r_basics_1: R and RStudio and how to use them. Loading packages. Basic 1D and 2D data types and manipulation.
S06_20240926_r_data_cleaning_2: Cleaning data using R. Using data cleaning packages like dplyr, janitor, ...
S07_20241003_r_data_cleaning_3: Long versus wide data. Combining data sets.
S08_20241017_r_statistics_viz_4: Base R stat functions and plotting. Basics of ggplot2.
S09_20241024_r_statistics_viz_5: Principal component analysis and diagnostic plots.
S10_20241031_r_modeling_6: Tidymodels, linear regression, decision trees, random forest, k-nearest neighbors.
S11_20241107_r_cluster_7: Clustering algorithms. K-means clustering, Gaussian mixture modeling, hierarchical clustering, DBSCAN.
S12_20241114_python_session: Image analysis, pandas, linear regression continued.
S13_20241121_r_animations_8: Creating animations by looping through data and generating images.
S14_20241205_presentations_9: Student presentations.
