UCLA Extension Introduction to Data Science (COM SCI X450.1) class materials. Use this repo to access supplemental learning resources, eBooks, and other handouts. Additional content may be added throughout the course. Past students are always welcome!
SPECIAL OFFER! My book publisher has offered to sell the textbook for our class at a special 25% student discount. Just click HERE and use coupon code "Gutierrez2019".
Course materials are categorized in the following folders:
- code
- handouts
- homeworks
- quizzes
- slides
- Becoming a machine learning company means investing in foundational technologies - Companies successfully adopt machine learning either by building on existing data products and services, or by modernizing existing models and algorithms
- Scientists rise up against statistical significance - Replacing p-values with confidence intervals?
- Op-Ed: The real reason we’re afraid of robots - Cogent Op-Ed relating to the "Killer AI" meme
- UVA Data Points: a podcast exploring the world of data science - Cool podcast from School of Data Science, University of Virginia
- Data Scientist vs Machine Learning Engineer - A great distinction!
- Data Is An Art, Not Just A Science—And Storytelling Is The Key - Data storytelling is key to successful data science
- What Is Causal Inference? An Introduction for Data Scientists - A great intro to this growing area
- My Journey into AI Webinar - Panel discussion hosted by DeepLearning.AI
- R vs Python! Which one should you choose for data science? - A balanced comparison of the two most popular data science languages
- Data Con LA 2022 Videos - Video presentations from the Data Con LA virtual conference
- Data Con LA 2022 Slides - Slide presentations from the Data Con LA virtual conference
- Data Science Jobs Report 2020 - Useful employment research from 365DataScience
- Trends in Data Science 2019/2020 - Important industry trends from ODSC
- Data Scientist Resume: Template, Examples and Complete Guide - What a successful data science resume looks like
- insideBIGDATA "Ask a Data Scientist" Series - My popular educational series sponsored by Intel
- All my opendatascience.com articles - Many articles keeping pace with the field of data science
- How to Get Your Data Science Career Started - Nice Forbes article on how to jump-start a career in data science
- Google Dataset Search - NEW! Resource for data scientists
- The Importance of SQL in Practicing Data Science - Reinforcing my advice in class!
- What is Data Science 'Impostor Syndrome'? - Avoid the fear of what you don't know
- Becoming a Data Scientist - Important pointers by head of Kaggle Learn, Dan Becker, Ph.D.
- Industrial Research in Applied Statistics- AMS - Nice article about being a data scientist.
- 6 Reasons Why Data Science Projects Fail - A report from down in the trenches.
- The Difference Between Data Scientists and Data Engineers - A guide to becoming a unicorn.
- forester: an R package for automated building of tree-based models
- A UseR’s Introduction to Machine Learning in AWS - Nice YouTube presentation on doing ML on AWS with R
- The Hitchhiker’s Guide to Responsible Machine Learning - Educational comic book
- Feature Engineering and Selection: A Practical Approach for Predictive Models - Nice learning resource by Max Kuhn and Kjell Johnson
- 2020 Outlook on AutoML Updates & Latest Recent Advances - Latest authoritative list of AutoML tools and frameworks
- Data Science Meetup (Feb. 26, 2020) Gradient Boosting Machines (GBM): From Zero to Hero - Slides from a great Meetup
- Data Science Meetup (Feb. 26, 2020) Gradient Boosting Machines (GBM): From Zero to Hero - GitHub repos with R and Python code
- 10 Tips for Choosing the Optimal Number of Clusters - Great article that drills down into unsupervised machine learning clustering
- NGBoost: Natural Gradient Boosting for Probabilistic Prediction - HOT new machine learning algorithm using boosting
- Preventing undesirable behavior of intelligent machines - Cool research paper addressing the debate over machine learning bias
- VIDEO presentation from LA West R Meetup group - Better Than Deep Learning Gradient Boosting Machine 2019
- SLIDES from LA West R Meetup group - Better Than Deep Learning Gradient Boosting Machine 2019
- Linear Regression with Healthcare Data for Beginners in R - Nice starter exercise for newbie data scientists
- Book Review: Deep Learning Revolution - Nice deep learning book for a general audience.
- Evaluate your R model with MLmetrics - Using R’s MLmetrics to evaluate machine learning models. MLmetrics provides several functions to calculate common metrics for ML models, including AUC, precision, recall, accuracy, etc.
- Assessment Metrics for Clustering Algorithms - Metrics for clustering and unsupervised machine learning
- Nice self-contained data project example - CO2 Emissions Comparing and Modeling for Global Warming
- Presentations from So Cal R User Group Meetup - YouTube channel for the So Cal R Users Group
- Frustration: One Year With R - One coder's experience with R
- How to Create a Dataframe in R with 30 Code Examples - Useful data frame tutorial in R
- Getting started simulating data in R: some helpful functions and how to use them - Nice blog article on simulated data in R
- R for Data Science - Official Tidyverse book by Hadley Wickham
- TidyverseSkeptic - Tidyverse Skeptic essay by Norm Matloff
- useR! Conference - Premier global R conference
- Book R code - R code for my book "Machine Learning and Data Science: An Introduction to Statistical Learning Methods with R"
- R vs Python: Different similarities and similar differences - Nice cross-comparison of the R and Python data science programming languages
- R Tutorials - Good R programming tutorials to read in parallel with weeks 1-4 of class
- Descriptive Statistics in R - Good resource for Week 7 of class on EDA
- Demystifying Regular Expressions in R - Intro to text analytics
- Vignette: data.table - The data.table package provides an enhanced version of R's data.frame
- How Tidyverse Guides R Programmers Through Data Science Workflows - A cohesive way of approaching data science projects
- Type conversion and you (or and R) - More examples on type conversion and coercion in R
- Essential list of useful R packages for data scientists - Great list of important R packages for data scientists
- R plot pch symbols : The different point shapes available in R - An examination of all the popular pch argument values for data visualizations
- R color names
- A Complete Tutorial on Time Series Modeling in R
- Intro to Python Programming - Kaggle course on very basics of Python language
- An Introduction to Statistical Learning: with Applications in R... with Python! - Solutions to ISL exercises in Python
- Python Programming Tutorials - Many tutorial resources for Python coding for data science and machine learning
- Talk Python - A podcast on Python and related technologies
- Theoretical Foundations of Data Science— Should I Care or Simply Focus on Hands-on Skills? - YES, you should care!
- Linear Algebra via MIT OpenCourseWare - Learn linear algebra from Gil Strang, the best of the best!
- Calculus — Multivariate Calculus And Machine Learning -- A Must Know Concept For Every Professional - Here is the bare minimum Calculus necessary for machine learning.
- The Book of R: A First Course in Programming and Statistics - As mentioned in class, a great way to learn statistics.
- Do my data follow a normal distribution? - A note on the most widely used distribution and how to test for normality in R.
- Fisher's exact test in R: independence test for a small sample - Focuses on Fisher’s exact test. Independence tests are used to determine whether there is a significant relationship between two categorical variables.
- Introduction to Statistical Learning - Great book to use following this class.
- Elements of Statistical Learning - The "Machine Learning Bible"
- Mathematics for Machine Learning - Wonderful all-in-one text for getting up to speed with the mathematics required for machine learning
- Bookdown - A collection of free eBooks on data science and R.
- Probabilistic Machine Learning: An Introduction - Machine learning built on a foundation of probability theory.
- Using NFL Analytics to Contextualize Player Performance & Predict Playoff Probability - Work done by student Jesse Lubell (Summer 2021)
- Analysis of Reptile & Amphibian Observations in Los Angeles County - Work done by student Timothy Stegman (Fall 2020)
- Analysis of Match Statistics and Team Performances in the Premier League From Season 2015/16 to Season 2019/20 - Work done by student Tara Nguyen (Fall 2020)
- What Makes Us Happy - Using Kaggle "Young People Survey" data set - Work done by student Alexander Fichtl, taking course from Germany (Spring 2020)
- Data Analysis Evolution of Popular Music - Work done by student William Toth (Spring 2020)
- Airbnb Price Prediction for different areas in NYC - Work done by student Hashneet Kaur (Winter 2020)
- Predicting-Hotel-booking-demand-and-cancellation - Work done by student Elaine Kuang (Winter 2020)
- Analysis-of-Coronavirus-COVID-19-New-Confirmed-Cases - Work done by student Micky Lee (Winter 2020)
- Data Analysis for 2019 Indian General Election - Work done by student Junhui Yang (Winter 2020)
- FIVB Beach Volleyball Historic Top 8 Teams Analysis - Work done by student Tyler Widdison (Fall 2019)
- Data Analysis for PM2.5 in Beijing - Work done by student Xiaozhu Zhang (Spring 2019)
- Daniel D. Gutierrez - LinkedIn
This project is licensed under the MIT License - see the LICENSE.md file for details