The HDS_practice project is where I keep track of my learning process in health data science. This includes hands-on practice, code snippets, notes, and small projects related to areas like data cleaning, analysis, visualization, and more.

Wen2429/HDS_practice

HDS_practice

Welcome to my Health Data Science (HDS) Practice project! This repository is dedicated to documenting my journey as I learn and practice various concepts, tools, and techniques in the field of health data science.

Project Overview

The HDS_practice repository is my space for learning and practicing health data science. I plan to cover data analysis, machine learning, and data visualization, applying each skill to health data. The repository will hold code examples, notes, and small projects that reflect my progress.

Courses and Topics

This repository will cover work from the following courses, each focusing on different aspects of health data science:

  • Statistical Foundations for Health Data Science (R): Biostatistics, probability theory, hypothesis testing, and statistical methods for health data.
  • Computing for Health Data Science (Python): Python programming, including data cleaning, manipulation, and basic data science workflows.
  • Management and Curation of Health Data (SAS): Health data storage, cleaning, management, and data curation using SAS.
  • Data Structures and Algorithms (C): Implementation of data structures and algorithms relevant to health data.
  • Context of Health Data Science (Notes): Notes and insights into the broader context and challenges of health data science.
  • Health Data Analytics: Machine Learning (Python): Implementation of machine learning algorithms for health data.
  • Health Data Analytics: Statistical Modelling (R): Statistical modeling approaches for health data, including linear and logistic regression.
  • Database Systems (RDBMS): Relational databases, SQL, and database management for health data.
  • Visualization and Communication of Health Data (R, Python): Best practices in visualizing health data for effective communication.
  • Big Data Management (Hadoop): Handling large-scale health data using Hadoop for big data analytics.

Skills and Tools

In this project, I will use various tools and technologies, including:

  • Programming languages: Python, R, SAS, C
  • Data management: SAS, SQL, Hadoop, RDBMS
  • Libraries and frameworks: pandas, NumPy, matplotlib, seaborn, scikit-learn, R packages
  • Statistical methods: Hypothesis testing, regression models, biostatistics
  • Machine learning: Supervised and unsupervised learning, model evaluation
  • Data visualization: Effective communication of insights using R and Python
  • Big data management: Hadoop and related technologies for managing large health datasets

Structure

The repository will be organized into the following sections:

  • data/: Contains datasets used for practice and projects.
  • notebooks/: Jupyter Notebooks or RMarkdown files documenting analysis and projects.
  • scripts/: Python, R, or SAS scripts for specific analyses or tasks.
  • projects/: Larger projects applying a range of skills to real-world health data problems.
  • notes/: Course notes and reflections.
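
The layout above could be set up with a short script. The following is only a minimal sketch: the folder names come from the list above, while the `scaffold` function, the `SECTIONS` mapping, and the `.gitkeep` placeholder files are assumptions introduced for illustration and are not part of this repository.

```python
from __future__ import annotations

from pathlib import Path

# Folder names are taken from the Structure list above.
# The descriptions are shortened paraphrases of the same list.
SECTIONS = {
    "data": "Datasets used for practice and projects",
    "notebooks": "Jupyter Notebooks or RMarkdown files",
    "scripts": "Python, R, or SAS scripts",
    "projects": "Larger projects on health data problems",
    "notes": "Course notes and reflections",
}


def scaffold(root: str | Path) -> list[Path]:
    """Create the top-level folders under `root` and return their paths.

    Hypothetical helper: safe to run repeatedly, because existing folders
    and placeholder files are left untouched.
    """
    root = Path(root)
    created = []
    for name in SECTIONS:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        # Git does not track empty directories, so add an empty placeholder
        # file (a common convention, assumed here).
        (folder / ".gitkeep").touch()
        created.append(folder)
    return created
```

Calling `scaffold(".")` from the repository root would create the five folders if they do not already exist.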

Resources

I will rely on various learning materials, including:

  • Course materials from my health data science courses.
  • Public health datasets from reputable sources.
  • Online documentation and best practices for the tools and languages mentioned above.

Contributing

This is a personal learning project. Contributions are not expected, but feel free to provide feedback or open issues if you have suggestions for improvement.

License

This project is licensed under the MIT License. See the LICENSE file for more information.
