A good data scientist not only has extensive knowledge of machine learning and deep learning, but can also extract and gather data from various sources and store it in a usable format. This task introduces you to the first step of every data science project: data collection. One method of data collection is web scraping, which is what you will be working on in this task.
Problem Statement
This project involves collecting data from various online sources. You are asked to collect relevant news data on different stocks and to collect financial news headlines. The second part of the project is data cleaning and preprocessing: you are asked to present a clean and usable dataset.
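The two parts described above, collecting headlines and then cleaning them, can be illustrated with a minimal sketch. It is not the expected solution: the HTML snippet, the `headline` class name, the company names, and the helper functions `extract_headlines` and `clean_headlines` are all hypothetical, and a real news page will need its own selectors. The sketch parses an inline string instead of a live site so that it runs without network access.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded news page; a real page
# will use different tags and class names.
SAMPLE_HTML = """
<html><body>
  <h3 class="headline">  Acme Corp shares   rise after earnings </h3>
  <h3 class="headline">Acme Corp shares rise after earnings</h3>
  <h3 class="headline"></h3>
  <h3 class="headline">Globex announces stock split</h3>
  <p class="ad">Sponsored content</p>
</body></html>
"""


def extract_headlines(html):
    """Return the raw text of every <h3 class="headline"> element."""
    soup = BeautifulSoup(html, "html.parser")
    return [tag.get_text() for tag in soup.find_all("h3", class_="headline")]


def clean_headlines(raw_headlines):
    """Collapse whitespace, then drop empty and duplicate headlines."""
    cleaned = []
    seen = set()
    for text in raw_headlines:
        # Split on any whitespace and rejoin with single spaces.
        normalized = " ".join(text.split())
        # Compare case-insensitively so near-identical copies are dropped.
        key = normalized.lower()
        if normalized and key not in seen:
            seen.add(key)
            cleaned.append(normalized)
    return cleaned


headlines = clean_headlines(extract_headlines(SAMPLE_HTML))
print(headlines)
```

With the sample above, the four raw headline elements reduce to two clean entries, since one is empty and one is a duplicate that differs only in whitespace.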
- If you run into a problem, refer to Beautiful Soup's online documentation or to YouTube videos instead of using ChatGPT
- Do not alter any prewritten code or comments
- Be sure to add comments to make your code legible and to let the mentors understand what approach you have taken
- Only use Google Colab to run the code
- Fork and clone this repository onto your local device
- Open the .ipynb file in Google Colab
- Once you are done with the task, download the notebook as an .ipynb file and store it in a folder along with any required files
- Name the file after your enrollment number
- Push this file to your forked repo and then open a pull request (PR)
- Your code will be reviewed by the mentors. Points will be granted once the PR is accepted and merged
For any queries, feel free to contact [email protected]. You can also interact with the mentors and the GeekHaven community on Discord.