Goal: To build a classification methodology to determine whether a customer is placing a fraudulent vehicle insurance claim.
The client sent the data in batches, as multiple sets of files at a given location. The data was extracted from the census bureau. The dataset has 38 features (including the target feature).
The client also sent a schema file containing the relevant information about the training files. Data validation was performed as the initial step, followed by data insertion into a database (SQLite). During validation, the data is divided into good and bad data based on the schema file and sent to the respective folders. The rest of the data science project lifecycle was then followed: exporting data from the database, data preprocessing (the imbalanced dataset was handled using imblearn's RandomOverSampler), clustering using KMeans, model selection (multiple models were tested and the top two were selected based on accuracy score and AUC score), model building (XGBoost for the first cluster and SVC for the second), hyperparameter optimization (using GridSearchCV), and finally model deployment (to GCP). API testing was done using Postman. Logs were maintained at every step. A similar set of steps was performed for the prediction data. The code follows object-oriented programming (OOP) concepts.
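The validation and insertion steps described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual code: the schema layout, the column names, the table name `good_raw_data`, and the helper functions are all hypothetical, in-memory dictionaries stand in for the good and bad folders, and it assumes that only the good files are inserted into SQLite.

```python
import csv
import io
import sqlite3

# Hypothetical schema: the real schema file's layout is not shown in this README.
SCHEMA = {
    "n_columns": 3,
    "columns": {"age": "INTEGER", "policy_state": "TEXT", "fraud_reported": "TEXT"},
}


def is_valid(header, schema):
    """A file counts as 'good' when its header matches the schema's column count and names."""
    return len(header) == schema["n_columns"] and list(header) == list(schema["columns"])


def split_good_bad(files, schema):
    """Sort {filename: csv_text} into good and bad batches (stand-ins for the two folders)."""
    good, bad = {}, {}
    for name, text in files.items():
        rows = list(csv.reader(io.StringIO(text)))
        target = good if rows and is_valid(rows[0], schema) else bad
        target[name] = rows
    return good, bad


def insert_good_data(conn, good, schema, table="good_raw_data"):
    """Create the table from the schema and insert the data rows of every good file."""
    cols = ", ".join(f'"{c}" {t}' for c, t in schema["columns"].items())
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({cols})')
    placeholders = ", ".join("?" for _ in schema["columns"])
    for rows in good.values():
        # rows[0] is the header, so only rows[1:] are inserted.
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows[1:])
    conn.commit()


# Two made-up batch files: the second one is missing the target column.
files = {
    "batch_1.csv": "age,policy_state,fraud_reported\n48,OH,Y\n42,IN,N\n",
    "batch_2.csv": "age,policy_state\n29,IL\n",
}
good, bad = split_good_bad(files, SCHEMA)
conn = sqlite3.connect(":memory:")
insert_good_data(conn, good, SCHEMA)
row_count = conn.execute("SELECT COUNT(*) FROM good_raw_data").fetchone()[0]
print(sorted(good), sorted(bad), row_count)
```

In this sketch `batch_1.csv` passes validation and its two rows end up in the table, while `batch_2.csv` is set aside as bad data. A real implementation would likely also check file names and column data types, and move files on disk rather than keep them in memory.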
For more details about the project click here
The code is written in Python 3.7.3. If you don't have Python installed you can find it here. If you are using an older version of Python, upgrade it and make sure you have the latest version of pip. To install the required packages and libraries, run these commands in the project directory after cloning the repository:
conda create -n myenv python=3.7
conda activate myenv
pip install -r requirements.txt
python main.py
Log in or sign up to create a virtual app, among other things. A free tier account on the Google Cloud console provides $300 of credit for one year. For application deployment, download the Google Cloud SDK installer.
Demo: https://insurancefraud.de.r.appspot.com/
The app is currently disabled because GCP is chargeable.
- Deploying the Web Application on Cloud.
- Heroku
- Azure
- AWS EC2 Instance