$ git clone https://github.com/georgetown-analytics/XBUS-504-01.Data_Analysis_I_Statistics.git
$ cd XBUS-504-01.Data_Analysis_I_Statistics/
$ pip install -r requirements.txt
$ cd notebooks/
$ jupyter notebook 0_Config_Test.ipynb
The fields of statistics and probability were founded on empirical analysis of data (e.g. human height). Data scientists must possess a strong foundation in statistics and probability to uncover patterns and build models, algorithms, and simulations. This course reviews the basics of descriptive and inferential statistics, distributions, probability, and regression with a specific focus on application to real data sets.
Upon successful completion of the course, students will:
- Explain descriptive and inferential statistics
- Compute measures of central tendency, variance, and probabilities
- Produce and interpret meaningful and accurate summary statistics for a given data set
- Conduct hypothesis tests and understand the difference between Type I and Type II errors
- Develop single-variable and multivariate regression models
- Differentiate between correlation and causation
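As a rough illustration of the second and third objectives, here is a minimal sketch that computes a few summary statistics with Python's standard-library `statistics` module. The `heights_cm` values are made up for illustration and are not taken from the course materials; the course notebooks may use other libraries.

```python
import statistics

# Hypothetical sample of heights in centimetres (illustrative values only,
# not drawn from any course data set).
heights_cm = [160.0, 165.0, 170.0, 175.0, 180.0]

# Measures of central tendency.
mean_height = statistics.mean(heights_cm)
median_height = statistics.median(heights_cm)

# Measures of spread. variance() and stdev() use the sample (n - 1)
# denominator; pvariance() uses the population (n) denominator.
sample_variance = statistics.variance(heights_cm)
sample_stdev = statistics.stdev(heights_cm)
population_variance = statistics.pvariance(heights_cm)

print(f"mean = {mean_height}, median = {median_height}")
print(f"sample variance = {sample_variance}, sample std dev = {sample_stdev:.2f}")
print(f"population variance = {population_variance}")
```

The sample and population variances differ because of the denominator, a distinction that matters when moving from descriptive to inferential statistics.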
Enrollment in this course is restricted. Students must submit an application and be accepted into the Certificate in Data Science in order to register for this course.
Current Georgetown students must create an application using their Georgetown NetID and password. New students will be prompted to create an account.
Course prerequisites include:
- A bachelor's degree or equivalent
- Completion of at least two college-level math courses (e.g., statistics or calculus)
- Successful completion of Data Wrangling (XBUS-503)
- Basic familiarity with programming or a programming language
- A laptop for class meetings and coursework
Statistical Rethinking Digital Textbook
The Hacker's Guide to Uncertainty Estimates
First analysis of ‘pre-registered’ studies shows sharp rise in null findings
Statistics Versus Machine Learning
Translating Between Statistics and Machine Learning