Check out our onboarding website with centralized resources here!
Our Fall 2024 project lists are here!
If there are any issues or areas for improvement you would like us to know about, please create a new entry in "Issues".
If you haven't already, fill out this form and join our mailing list. This will keep you up-to-date on the club.
1. Download the files in this repo by clicking Code (the green button near the top) -> Download ZIP and unzip the files into a folder. You can of course also fork the repo if you have experience with Git.
2. Follow the general setup guide.
3. Follow the Git setup guide.
For most people, (3) is the hardest part of the tutorial! If you feel frustrated, know it is normal. Come see us at tutorials or office hours and we will help you out.
If you have trouble with the General Setup, you can follow the Google Colab setup guide and use Colab to complete the tutorials.
If you have trouble with the Git Setup, you can upload your files directly on GitHub by going to your GitHub repository and selecting Add file -> Upload files.
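If you downloaded the ZIP and would rather unpack it from Python than from your file manager, the following is a minimal sketch using only the standard library. The function name `unzip_repo` and any paths you pass to it are illustrative assumptions, not part of this repo.

```python
import zipfile
from pathlib import Path


def unzip_repo(zip_path, dest_dir):
    """Extract a downloaded repo ZIP into dest_dir and return its top-level names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        # ZIP entries always use "/" as the separator, so the first path
        # component is the top-level file or folder name.
        top_level = sorted({name.split("/")[0] for name in archive.namelist()})
    return top_level
```

For example, `unzip_repo("Downloads/repo.zip", "onboarding")` (both paths hypothetical) would extract the archive into an `onboarding` folder and return the names found at the top of the archive.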
Get started with tutorial0 and checkpoint0 in the tutorial0 folder, and then move on to tutorial1 and checkpoint1 in the tutorial1 folder. We recommend working through each tutorial before attempting the corresponding checkpoint. However, if you have prior experience, feel free to skip part or all of a tutorial.
The Data-Visualization folder contains materials for those who want to get a head start. pandas.ipynb is a very brief introduction to Pandas' built-in data visualization tools. The AnatomyofMatplotlib folder contains a comprehensive tutorial for the Matplotlib library, which most beginner projects use and which is foundational to other data visualization packages such as seaborn.
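To give a flavor of what those materials cover, here is a small sketch of Pandas' built-in plotting, which draws with Matplotlib under the hood. The data and column names are made up for illustration and do not come from the notebooks.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up numbers, purely for illustration
df = pd.DataFrame({"week": [1, 2, 3, 4], "attendance": [30, 42, 38, 45]})

# DataFrame.plot is Pandas' built-in plotting wrapper; it returns a Matplotlib Axes
ax = df.plot(x="week", y="attendance", kind="line", marker="o", title="Attendance by week")

# Because the result is a Matplotlib Axes, the usual Matplotlib calls still apply
ax.set_xlabel("Week")
ax.set_ylabel("Attendance")
fig = ax.get_figure()
fig.tight_layout()
```

In a Jupyter or Colab notebook you can drop the `matplotlib.use("Agg")` line and the figure will render inline; in a script, `fig.savefig("attendance.png")` writes it to a file.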
We also highly recommend looking into Python virtual environments. You can do this at the beginning or after you complete the checkpoints. Our members have made resources explaining them here.
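As a rough sketch of what a virtual environment is, the standard-library `venv` module can create one, and comparing `sys.prefix` with `sys.base_prefix` tells you whether you are currently running inside one. The helper names and the `.venv` folder name below are assumptions chosen for illustration; the resources linked above remain the recommended guide.

```python
import sys
import venv
from pathlib import Path


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def create_env(env_dir, with_pip=True):
    """Create a virtual environment at env_dir, like `python -m venv env_dir`."""
    env_path = Path(env_dir)
    venv.create(env_path, with_pip=with_pip)
    return env_path
```

After calling `create_env(".venv")`, you would activate the environment from a terminal with `source .venv/bin/activate` on macOS/Linux or `.venv\Scripts\activate` on Windows, and then install packages with `pip` as usual.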
There are three optional challenges available to you: Machine Learning, Deep Learning, and RvF. They are located in three separate folders under Optional-Challenges, and you will write your code in the notebooks ending in .ipynb.
You can choose to complete any one or several of them. We usually put new members on beginner or intermediate projects for their very first semester, but you may want to work on advanced projects right away if you are experienced with data science. In that case, completion of at least one challenge will be required.
Machine Learning - Loan Approval Prediction
Deep Learning - Titanic
RvF - Computer Vision: Fake Face Detection
These checkpoints are not meant to be selective. Their sole purpose is to give you sufficient foundational knowledge about Python and some important packages so you can start contributing to a project.
The definition of success for us is to have everyone who begins the tutorials finish them. Thus, we will offer support in two ways:
- Tutorials: We will host one live tutorial introducing the command line, Python, and its related packages. The session will be a combination of short presentations and Q&A.
- Office Hours: We will offer 2 in-person office hours where you can ask questions and receive feedback.
The exact date, time, and location of the tutorials and office hours can be found on the onboarding resource website here.
Neither tutorials nor office hours are mandatory.
We have also created a forum where you can ask questions.
Due: 9/16/2024 11:59pm EST
Submit a link to your GitHub repo that contains all your completed work when you sign up for projects: project signup form
We are looking for:
- [REQUIRED] Checkpoint 0 and Checkpoint 1. These are assessed by completion and effort, not accuracy.
- [OPTIONAL] Any additional challenges you completed. These are assessed by merit.
Please make sure your link works. Due to the immense application volume, we will not send out emails if your link is invalid; submissions with invalid links will be automatically filtered out.
All technical or logistical questions MUST be posted on the ED forum. We will not answer those questions over email.
If you have a personal question, email us at [email protected].
Below is a list of relevant Python libraries that are used extensively throughout the checkpoints, challenges, MDST projects, and beyond.
Numpy: https://numpy.org/doc/stable/
Pandas: https://pandas.pydata.org/docs/
Matplotlib: https://matplotlib.org/stable/gallery/index
Scikit-Learn: https://scikit-learn.org/stable/user_guide.html
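If you want a quick sanity check that these libraries are available in your environment after setup, a sketch like the following may help. It relies only on the standard library; the `LIBRARIES` mapping and the `check_libraries` helper are illustrative names, not part of this repo. Note that scikit-learn is imported under the name `sklearn`.

```python
from importlib import util

# Display name -> import name for the libraries listed above
LIBRARIES = {
    "Numpy": "numpy",
    "Pandas": "pandas",
    "Matplotlib": "matplotlib",
    "Scikit-Learn": "sklearn",
}


def check_libraries(libraries=LIBRARIES):
    """Map each display name to True if the module can be found in this environment."""
    return {
        name: util.find_spec(module) is not None
        for name, module in libraries.items()
    }


for name, installed in check_libraries().items():
    print(f"{name}: {'installed' if installed else 'missing'}")
```

Anything reported as missing can typically be installed with `pip install` followed by the package name (for example `pip install scikit-learn`), ideally inside a virtual environment.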