Preprocessing:
- Download the CSV file from https://drive.google.com/file/d/1x7Jj5HKiqG083I3VSHDJK4U-uzAoMaiZ/view?usp=drive_link
- Read the file as a data frame.
- Remove the first column, which is just a row number.
- Keep the second and remaining columns.
- Fill the missing values in the data frame with 0.
- Replace values greater than 200 with 0.
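The preprocessing steps above can be sketched as a small reusable class. This is a minimal sketch, not a reference solution: it assumes pandas is installed, that the file has already been downloaded (the Drive link opens a viewer page, not the raw CSV), and that any non-numeric column such as a timestamp should be left untouched. The class name `SensorPreprocessor`, its parameters, and the sample column names are hypothetical.

```python
import io

import pandas as pd


class SensorPreprocessor:
    """Load a sensor CSV and apply the cleaning steps listed above."""

    def __init__(self, max_valid=200, fill_value=0):
        # Both defaults come from the task text; they stay configurable.
        self.max_valid = max_valid
        self.fill_value = fill_value

    def run(self, source):
        """Return a cleaned DataFrame; `source` is a path or a file-like object."""
        df = pd.read_csv(source)
        # Drop the first column (row numbers) and keep the remaining ones.
        df = df.iloc[:, 1:]
        # Fill missing values with the fill value (0 by default).
        df = df.fillna(self.fill_value)
        # Replace values above the limit; only numeric columns are touched
        # (ASSUMPTION: text columns such as a timestamp must stay as they are).
        numeric = df.select_dtypes(include="number")
        df[numeric.columns] = numeric.mask(numeric > self.max_valid, self.fill_value)
        return df


# Usage with an in-memory sample instead of a hardcoded file name.
sample = io.StringIO(
    "row,TimeStamp,sensorA,sensorB\n"
    "0,2024-01-01 00:00,12.5,\n"
    "1,2024-01-01 01:00,250,40\n"
)
cleaned = SensorPreprocessor().run(sample)
print(cleaned)
```

Passing the source as an argument keeps the class generic, so the same code works for any CSV with the same layout.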
Understanding the Statistics:
- Given a user-supplied condition (say >, <, <=, >=, or ==) and a threshold value (say 15), count, for each sensor point, the number of rows that satisfy the condition, and output the result as a hashmap with key = sensorPoint and value = numberOfRowsSatisfyingTheCondition.
- Using Plotly Express with the OpenStreetMap ('open-street-map') map style, draw a heatmap of the sensor points.
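The two steps above could look roughly like the sketch below. It rests on loud assumptions: the function names are hypothetical, and the heatmap part assumes each sensor column is named like "Point(139.08 36.37)" with longitude first, which the task text does not state. `px.density_mapbox` is used for the map; newer Plotly releases may prefer `px.density_map` instead.

```python
import operator
import re

import pandas as pd

# Map the condition symbols a user may type to comparison functions.
OPERATORS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
}

# ASSUMPTION: column names look like "Point(139.08 36.37)", longitude first.
POINT_PATTERN = re.compile(r"\(\s*(-?[\d.]+)\s+(-?[\d.]+)\s*\)")


def count_rows(df, condition, threshold):
    """Return {sensorPoint: number of rows where value <condition> threshold}."""
    if condition not in OPERATORS:
        raise ValueError(f"unsupported condition: {condition!r}")
    sensors = df.select_dtypes(include="number")
    matches = OPERATORS[condition](sensors, threshold)  # boolean data frame
    return {name: int(count) for name, count in matches.sum().items()}


def counts_to_frame(counts):
    """Turn the count dict into lon/lat/count rows; unparseable names are skipped."""
    rows = []
    for name, count in counts.items():
        found = POINT_PATTERN.search(str(name))
        if found:
            rows.append({"lon": float(found.group(1)),
                         "lat": float(found.group(2)),
                         "count": count})
    return pd.DataFrame(rows, columns=["lon", "lat", "count"])


def draw_heatmap(counts, radius=20, zoom=4):
    """Draw the counts on OpenStreetMap tiles and return the Plotly figure."""
    import plotly.express as px  # imported here so counting works without plotly

    return px.density_mapbox(
        counts_to_frame(counts), lat="lat", lon="lon", z="count",
        radius=radius, zoom=zoom, mapbox_style="open-street-map",
    )


# Usage on a tiny made-up frame.
demo = pd.DataFrame({
    "Point(139.08 36.37)": [10, 20, 30],
    "Point(140.10 35.60)": [5, 16, 0],
})
counts = count_rows(demo, ">", 15)
print(counts)
```

Calling `draw_heatmap(counts).show()` would then open the map; the `radius` and `zoom` defaults are arbitrary starting values.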
Coding tips:
- Try to write generic Python class files without hardcoding the downloaded CSV file.
- Document your Python programs properly.
- Distribute the work among yourselves properly, and mention the tasks that each student will carry out.
Task and binderLink:
- Heatmap >15
- Frequency_heatmap >35
- K-long pattern
Students, mention the four tasks assigned by me today:
- Check sensor values; delete columns with values less than a user-specified value.
- Impute missing values using linear regression or neural network (deep learning).
- Build ML models for each sensor to predict future learning.
- Mine and visualize longest patterns using FP Growth and Plotly Express.
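The first two tasks in this list might be sketched as follows. Both functions are hypothetical and rest on stated assumptions: "values less than a user-specified value" is read here as "the column's largest value is below the limit", and the linear regression is fitted against the row position (a plain time trend) using `numpy.polyfit`. Imputation of this kind would have to run on the raw data, before missing values are filled with 0.

```python
import numpy as np
import pandas as pd


def drop_low_columns(df, min_value):
    """Drop numeric columns whose largest value is below `min_value`.

    ASSUMPTION: this is only one possible reading of the task text.
    """
    numeric = df.select_dtypes(include="number")
    low = [name for name in numeric.columns if numeric[name].max() < min_value]
    return df.drop(columns=low)


def impute_linear(series):
    """Fill gaps in one sensor column with a line fitted against row position."""
    x = np.arange(len(series), dtype=float)
    known = series.notna().to_numpy()
    if known.sum() < 2 or known.all():
        return series.copy()  # too little data to fit, or nothing to fill
    values = series.to_numpy(dtype=float, copy=True)
    # Least-squares straight line through the known points.
    slope, intercept = np.polyfit(x[known], values[known], 1)
    values[~known] = slope * x[~known] + intercept
    return pd.Series(values, index=series.index, name=series.name)


# Usage on made-up readings.
readings = pd.DataFrame({
    "sensorA": [1.0, 2.0, 3.0, 4.0],
    "sensorB": [50.0, np.nan, 70.0, 80.0],
})
kept = drop_low_columns(readings, 10)
filled = impute_linear(kept["sensorB"])
print(filled)
```

Regressing each sensor on its neighbouring sensors instead of the row position is an equally plausible design; the row-position fit is chosen here only because it needs no extra libraries.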