This repository is a collection of Jupyter Notebook tutorials that teach data analysis with Python, based on OpenIntro Statistics, a free and open-source textbook. The notebooks have been tested with Python 3.11.
-
- This notebook introduces fundamental concepts in data analysis, including types of data (categorical and numerical), data visualisation techniques, and measures of central tendency and variability. It emphasises the importance of understanding data characteristics to make informed decisions.
-
- This notebook covers the basics of probability theory, including definitions, rules (addition and multiplication), conditional probability, Bayes' theorem, and probability distributions. It provides practical examples and visualisations to illustrate these concepts.
-
- This notebook explains the properties of the normal distribution, its significance in statistics, and how to use it for data analysis. It includes discussions on the empirical rule, z-scores, and applications of the normal distribution in real-world scenarios.
-
- This notebook focuses on constructing and interpreting confidence intervals for population parameters. It explains the concept of confidence levels, margin of error, and the importance of sample size, providing examples and visualisations for better understanding.
-
- This notebook introduces the concept of sampling distributions and their role in statistical inference. It covers the Central Limit Theorem, properties of different sampling distributions, and how they are used to estimate population parameters.
- Inference for Categorical Data
- This notebook covers statistical methods for analysing categorical data, including chi-square tests, tests for independence, and goodness-of-fit tests. It provides practical examples and step-by-step calculations to illustrate these methods.
-
- This notebook explores statistical inference techniques for numerical data, such as t-tests, ANOVA, and regression analysis. It emphasises hypothesis testing, p-values, and confidence intervals, with examples and visualisations to aid understanding.
-
- This notebook introduces simple linear regression analysis, including model fitting, interpretation of regression coefficients, and evaluation of model performance. It provides examples using Python to demonstrate the application of these techniques.
-
- This notebook extends the concepts of regression analysis to multiple predictors. It covers model fitting, interpretation of coefficients, multicollinearity, and model diagnostics. Examples using Python illustrate the practical application of multiple regression analysis.
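
As a rough taste of the kind of calculation the earlier notebooks describe (measures of central tendency and variability, z-scores, and a confidence interval for a mean), here is a minimal sketch that uses only the Python standard library. The sample values are made up for illustration and do not come from the notebooks or the textbook, and the interval uses a normal critical value, which is an approximation for a sample this small.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical sample, invented for illustration only.
sample = [4.1, 5.3, 4.8, 6.0, 5.5, 4.9, 5.2, 5.8]
n = len(sample)

# Measures of central tendency and variability.
mean = statistics.mean(sample)
median = statistics.median(sample)
sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)

# z-score of each observation: distance from the mean in standard deviations.
z_scores = [(x - mean) / sd for x in sample]

# Approximate 95% confidence interval for the population mean.
# Assumption: a normal critical value is used here; for a small sample a
# t critical value (for example from scipy.stats) would be more appropriate.
z_star = NormalDist().inv_cdf(0.975)
margin = z_star * sd / math.sqrt(n)
ci = (mean - margin, mean + margin)

print(f"mean={mean:.3f}, median={median:.3f}, sd={sd:.3f}")
print(f"largest |z| = {max(abs(z) for z in z_scores):.3f}")
print(f"approx. 95% CI for the mean: ({ci[0]:.3f}, {ci[1]:.3f})")
```

The margin of error shrinks in proportion to the square root of the sample size, which is why the confidence interval notebook stresses sample size.
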

To get started with these tutorials, clone the repository to your local machine:
git clone https://github.com/akmand/statististics_tutorials.git
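
After cloning, it may help to confirm that your interpreter matches the version the notebooks were tested with. The snippet below is a small sketch for that purpose. The helper name `matches_tested_version` is hypothetical and not part of the repository, and other Python versions may well work even though they are not the tested one.

```python
import sys

# Version the notebooks are reported to have been tested with.
TESTED_VERSION = (3, 11)


def matches_tested_version(version_info=sys.version_info):
    """Return True if the major.minor version equals the tested version.

    Hypothetical helper for illustration; not part of the repository.
    """
    return tuple(version_info[:2]) == TESTED_VERSION


current = f"{sys.version_info.major}.{sys.version_info.minor}"
if matches_tested_version():
    print(f"Python {current} matches the tested version.")
else:
    print(f"Python {current} differs from the tested 3.11; notebooks may still run.")
```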