
PLEX-GR00T/Energy_Usage_Intensity_ML_Project


Tabular Data Processing with different models.

  1. Perform Exploratory Data Analysis (EDA), connect to BigQuery, and use Data Studio.
  2. Accelerate the model training.
  3. Train and compare different models.

1) EDA and Data Studio

  • Upload the dataset to BigQuery using the subscription provided by the professor.
  • You can find my Exploratory Data Analysis here.
  • The visualization of the live dashboard built with Data Studio is shown below.

[Image: Data Studio live dashboard]

2) Accelerate the model training.

  • In this section, to get used to the available models, I tried a few basic ones.
  • Linear Regression, Lasso, Ridge, ElasticNet, and Random Forest Regression.
  • I have shown the training acceleration and accuracy comparison for Linear Regression and Random Forest Regression.
  • For this acceleration I used Intel's oneAPI to accelerate the data mining pipeline.
  1. Linear Regression training acceleration

  2. Random Forest Regression training acceleration

  • As we can see for linear regression, using oneAPI for acceleration makes no difference to the accuracy.
  • For random forest regression, however, there is a significant decrease in the accuracy of the model after acceleration.
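A comparison like this needs the training time and the accuracy of each run measured the same way. Below is a minimal sketch of such a harness, not the code used in this project. It assumes that "accuracy" means the R² score, which this README does not state, and the helper names `r2_score`, `timed_fit`, and `speedup` are hypothetical.

```python
import time
from typing import Callable, Sequence, Tuple


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumption: R^2 is the "accuracy" being compared.
    Undefined (division by zero) when y_true is constant.
    """
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def timed_fit(fit: Callable[[], object]) -> Tuple[object, float]:
    """Run a zero-argument training callable and return (model, seconds)."""
    start = time.perf_counter()
    model = fit()
    return model, time.perf_counter() - start


def speedup(baseline_seconds: float, accelerated_seconds: float) -> float:
    """How many times faster the accelerated run was than the baseline."""
    return baseline_seconds / accelerated_seconds
```

With scikit-learn, the callable could be something like `lambda: LinearRegression().fit(X_train, y_train)`, run once with the stock library and once with acceleration enabled. One oneAPI-based option is Intel Extension for Scikit-learn, which provides `patch_sklearn()`, but this README does not say which package was actually used.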

3) Train XGBoost, LightGBM, and CatBoost.

Model      Score
XGBoost    0.974
CatBoost   0.795
LightGBM   0.729
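A score table like the one above could be produced by fitting each model on the same training split and scoring it on the same held-out split. The sketch below is one way to do that, under the assumption that every model exposes scikit-learn-style `fit` and `score` methods. The function name `rank_models` is hypothetical, and this README does not state which metric the scores represent.

```python
from typing import Any, Dict, List, Tuple


def rank_models(
    models: Dict[str, Any], X_train, y_train, X_test, y_test
) -> List[Tuple[str, float]]:
    """Fit each model, score it on held-out data, and sort best first.

    Assumption: each model has fit(X, y) and score(X, y) methods,
    and a higher score is better.
    """
    results = []
    for name, model in models.items():
        model.fit(X_train, y_train)
        results.append((name, float(model.score(X_test, y_test))))
    # Highest score first, matching the ordering of the table.
    return sorted(results, key=lambda item: item[1], reverse=True)
```

The `models` dictionary might map names such as "XGBoost", "LightGBM", and "CatBoost" to the corresponding regressor objects. Using identical splits for all three keeps the scores comparable.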
