GitHub - jiyoulee/vhost-cpu-requirements-prediction

This project experiments with predicting the CPU quota requirements of the Linux virtual machine I/O thread vHost using machine learning.

Development Environment

Python 3.8
PyCharm Community Edition 2020.2

Machine Learning Concepts

The following three supervised regression algorithms are used:

Linear Regression Model (LR)
Support Vector Machine Regression (SVM)
Random Forest Regression (RF)
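As a rough illustration of the three algorithms, the sketch below fits the scikit-learn estimators that usually correspond to these names. The data is synthetic (four random feature columns and a made-up target), and the hyperparameters are assumptions chosen only to keep the example small; the scripts in this repository may configure the models differently.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR

# Synthetic stand-in data (assumption): 4 feature columns and 1 target column.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 4))
y = 50.0 + 30.0 * X[:, 0] + 20.0 * X[:, 3] + rng.normal(0.0, 1.0, size=200)

# Hyperparameters here are illustrative, not the values used by the project.
models = {
    "LR": LinearRegression(),
    "SVM": SVR(kernel="rbf", C=100.0),
    "RF": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Fit every model on the same data and predict the first five samples.
predictions = {}
for name, model in models.items():
    model.fit(X, y)
    predictions[name] = model.predict(X[:5])

for name, pred in predictions.items():
    print(name, np.round(pred, 2))
```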

Each model takes the packet size, TX bandwidth, TX pps, and vCPU usage as input values (the data sets are located in the data folder in CSV format). Both the predicted values and the target values represent the optimal CPU quota value. Either the Root Mean Squared Logarithmic Error (RMSLE) or the Root Mean Squared Error (RMSE) can be selected as the evaluation metric.
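To make the difference between the two metrics concrete, here is a minimal NumPy sketch of both. The function names rmse and rmsle and the sample quota values are hypothetical. RMSLE is written with log(1 + x), the common convention, which assumes non-negative values.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes absolute differences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error: penalizes relative differences.

    Uses log(1 + x), so inputs are assumed to be non-negative.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2)))


# Hypothetical CPU quota values, for illustration only.
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 190.0, 440.0]

print("RMSE :", rmse(y_true, y_pred))
print("RMSLE:", rmsle(y_true, y_pred))
```

Because RMSLE works on logarithms, an error of 10 on a target of 100 weighs about as much as an error of 40 on a target of 400, whereas RMSE is dominated by the larger absolute error.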

Getting Started

Install the following Python packages: openpyxl, numpy, pandas, scikit-learn, joblib.
Navigate to the folder that includes the desired evaluation metric in its name. For example, if RMSLE is desired, follow the rest of this guide using the files in the folder cpu_quota_rmsle.
Open the Python file that has the desired supervised regression algorithm as its name. For example, if Linear Regression is desired, open the Python file linear.py.

Training and Cross-validating a Model

(Cross-validation is only available for SVM and RF.)

Uncomment the Cross validate and Save model sections.
Comment out the Test, Evaluate, and Save sections.
Execute the Python file.

The cross-validated model will be saved in the model folder under the name given by the variable model_name.
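The steps above can be sketched roughly as follows. This is not the repository's code: it assumes a grid search with k-fold cross-validation (scikit-learn's GridSearchCV) on synthetic data, and it writes to a temporary directory in place of the model folder. The parameter grid and the file name are hypothetical.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (assumption): 4 feature columns and 1 target column.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(120, 4))
y = 50.0 + 30.0 * X[:, 0] + 20.0 * X[:, 3] + rng.normal(0.0, 1.0, size=120)

# Cross validate: try each parameter combination with 3-fold cross-validation.
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [10, 30], "max_depth": [3, None]},
    scoring="neg_root_mean_squared_error",  # negated RMSE, higher is better
    cv=3,
)
search.fit(X, y)

# Save model: persist the best estimator, refit on all of the training data.
model_name = "rf_example.joblib"  # hypothetical file name
model_dir = tempfile.mkdtemp()    # stands in for the "model" folder
model_path = os.path.join(model_dir, model_name)
joblib.dump(search.best_estimator_, model_path)

print("best parameters:", search.best_params_)
print("cross-validated RMSE:", -search.best_score_)
```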

Testing a Saved (Validated) Model

Comment out the Cross validate and Save model sections.
Uncomment the Test, Evaluate, and Save sections.
Execute the Python file.

Results will be saved to the path given as the argument to wb.save(), located at the end of the Python file.
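A minimal sketch of the test, evaluate, and save flow is shown below, assuming the model was stored with joblib and the results are written through an openpyxl workbook. The data is synthetic, the column layout and file names are hypothetical, and a temporary directory replaces the real model and results paths.

```python
import os
import tempfile

import joblib
import numpy as np
from openpyxl import Workbook
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 4 feature columns and 1 target column.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(100, 4))
y = 50.0 + 30.0 * X[:, 0] + 20.0 * X[:, 3] + rng.normal(0.0, 1.0, size=100)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Create a saved model so that the example is self-contained.
work_dir = tempfile.mkdtemp()
model_path = os.path.join(work_dir, "example_model.joblib")  # hypothetical name
joblib.dump(LinearRegression().fit(X_train, y_train), model_path)

# Test: load the saved model and predict on the held-out samples.
model = joblib.load(model_path)
y_pred = model.predict(X_test)

# Evaluate: RMSE is the square root of the mean squared error.
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))

# Save: one row per test sample, then the metric, in an Excel workbook.
wb = Workbook()
ws = wb.active
ws.append(["target", "predicted"])
for target, predicted in zip(y_test, y_pred):
    ws.append([float(target), float(predicted)])
ws.append(["RMSE", rmse])
results_path = os.path.join(work_dir, "results.xlsx")  # hypothetical path
wb.save(results_path)
```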

References

Keras

https://machinelearningmastery.com/tutorial-first-neural-network-python-keras/

MLP

https://machinelearningmastery.com/how-to-configure-the-number-of-layers-and-nodes-in-a-neural-network/

Supervised Regression Algorithms

Random Forest

Linear Regression

https://scikit-learn.org/stable/auto_examples/linear_model/plot_ols.html (code)

Support Vector Regression

https://scikit-learn.org/stable/auto_examples/svm/plot_svm_regression.html (code)

Python Libraries

Import

Split

https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html

Normalize

Train

Evaluate (RMSE)

https://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_squared_error.html

Save

https://machinelearningmastery.com/save-load-machine-learning-models-python-scikit-learn/

