Android Malware Detection Using Machine Learning

Project Overview

This project builds a classification model that labels a mobile application as Benign or Malware. We evaluate multiple classification models on several metrics and select the one that performs best on our dataset. Finally, we deploy the selected model as a REST API using FastAPI.

Dataset

The dataset used in this project, hosted on FigShare, contains feature vectors of 215 distinct attributes gathered from 15,036 mobile applications: 5,560 classified as malware from the Drebin project and 9,476 as benign. It is structured with 215 columns and 15,036 rows and is designed for binary classification, where the target variable distinguishes Malware (S) from Benign (B) apps. Each attribute is encoded in binary format: 0 indicates the attribute's absence, while 1 denotes its presence. The class distribution is therefore roughly 37% malware and 63% benign.

The 215 features of the dataset are divided into four categories: API Call Signature, Manifest Permission, Intent, and Command Signature.
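
As a rough illustration of the encoding described above, the sketch below reads binary attribute rows plus an S/B label and reports the class distribution. It uses only the standard library. The label column name `class`, the two attribute names, and the in-memory sample are assumptions made for the example, not details taken from the FigShare file.

```python
import csv
import io
from collections import Counter

# S = malware, B = benign, as described in the Dataset section.
LABEL_TO_INT = {"S": 1, "B": 0}


def load_feature_rows(csv_file, label_column="class"):
    """Read rows of 0/1 attributes plus a label column into (X, y).

    `label_column` is an assumed name; adjust it to the real header.
    """
    reader = csv.DictReader(csv_file)
    X, y = [], []
    for row in reader:
        y.append(LABEL_TO_INT[row.pop(label_column)])
        X.append([int(value) for value in row.values()])
    return X, y


def class_distribution(y):
    """Return the fraction of samples per encoded label."""
    counts = Counter(y)
    return {label: count / len(y) for label, count in counts.items()}


# Tiny in-memory stand-in for the real file (illustrative values only).
sample = io.StringIO(
    "SEND_SMS,READ_PHONE_STATE,class\n"
    "1,1,S\n"
    "0,1,B\n"
    "0,0,B\n"
)
X, y = load_feature_rows(sample)
dist = class_distribution(y)
```

With the real dataset, the same function would be called on an opened file handle instead of the `io.StringIO` sample.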

Machine Learning Models

Several machine learning models were tested, including:

Random Forest
XGBoost
LightGBM
Extra Tree Classifier
Logistic Regression
Support Vector Machine
AdaBoost
Decision Tree
Bagging
Bayesian

Model Comparison

The models were evaluated on accuracy, precision, recall, F1 score, and ROC AUC. The XGBoost model emerged as the best performer, with the following metrics:

Accuracy: 0.986698
Precision: 0.98914
Recall: 0.975022
F1 Score: 0.982031
ROC AUC: 0.998764
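
For readers less familiar with these metrics, here is a minimal sketch of how the threshold-based ones follow from confusion-matrix counts, with malware treated as the positive class. The counts passed in are hypothetical and are not the project's results. ROC AUC is left out because it needs predicted scores rather than counts. The last line checks that the reported F1 score is consistent with the reported precision and recall.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0


def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = classification_metrics(tp=90, fp=5, fn=10, tn=195)

# Precision and recall as reported above for XGBoost.
reported_f1 = f1_from(0.98914, 0.975022)
```

Recall is the metric that counts missed malware (false negatives), which is why it matters most for a detector of this kind.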

Fine-tuning

Using GridSearchCV, the XGBoost hyperparameters were fine-tuned to maximize recall. The optimal parameters were:

colsample_bytree: 0.8
learning_rate: 0.2
max_depth: 7
n_estimators: 200
subsample: 1.0
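
To make the idea of an exhaustive grid search concrete, the following is a plain-Python stand-in, not the project's GridSearchCV call. It tries every combination in a grid and keeps the one with the highest score. The grid values are assumptions apart from the reported optimum, and `toy_score` is a placeholder: in the real setup the scoring function would train XGBoost with the given parameters and return its mean cross-validated recall.

```python
from itertools import product

# Assumed search space. Only the reported optimum values come from the
# README; the other candidates are illustrative.
PARAM_GRID = {
    "colsample_bytree": [0.8, 1.0],
    "learning_rate": [0.1, 0.2],
    "max_depth": [5, 7],
    "n_estimators": [100, 200],
    "subsample": [0.8, 1.0],
}


def grid_search(param_grid, score_fn):
    """Evaluate every parameter combination and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:  # ties keep the first combination seen
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Placeholder scorer, NOT a real recall measurement."""
    return params["max_depth"] * params["learning_rate"]


n_candidates = len(list(product(*PARAM_GRID.values())))
best_params, best_score = grid_search(PARAM_GRID, toy_score)
```

Even this small grid has 32 candidates, each of which would be trained once per cross-validation fold, so the search cost grows quickly as more values are added.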

Deployment

To deploy the model, we package everything in a Docker container and expose the model as an API. To get a prediction, a user submits an APK to the API. The API first reverse-engineers the APK to extract the features needed for the prediction, then passes those features to the model to classify the application. The complete workflow is illustrated in the figure:

To access the application, follow these steps:

Install Docker on your computer.
Run the following command: docker run -p 8080:8000 tderick/android-malware-detection
Go to http://localhost:8080/docs to test the application.

The following pictures show the analysis of the WhatsApp APK:

You can download APK files of mobile apps from https://apkpure.com to test the API.
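
The feature-extraction step described above has to end with a vector in the same binary format as the training data. The sketch below shows one way that final step could look, assuming the reverse-engineering stage has already produced a set of attribute names found in the APK. The function name, the four feature names, and the extracted set are illustrative assumptions. The deployed service would use the dataset's 215 columns in their original order, and its actual code may differ.

```python
def build_feature_vector(extracted, feature_names):
    """Encode extracted attributes as 0/1 values in training-column order.

    Attributes that are not among the model's features are ignored.
    """
    present = set(extracted)
    return [1 if name in present else 0 for name in feature_names]


# Illustrative subset of feature columns (one per category).
FEATURE_NAMES = [
    "SEND_SMS",
    "READ_PHONE_STATE",
    "android.intent.action.BOOT_COMPLETED",
    "chmod",
]

# Hypothetical output of the APK analysis step.
extracted = {"READ_PHONE_STATE", "chmod", "INTERNET"}

vector = build_feature_vector(extracted, FEATURE_NAMES)
```

Keeping the column order identical to the training data matters because the model identifies features by position, not by name.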

Build the Docker image

docker build -t tderick/android-malware-detection:latest .

Run the image

docker run -p 8080:8000 tderick/android-malware-detection:latest

Push to Docker Hub

docker push tderick/android-malware-detection:latest
