This project focuses on the development of machine learning models for rainfall prediction in major cities across Australia.
Project Objective: Develop machine learning models for rainfall prediction in major cities to enhance timely forecasting and reduce human and financial losses from extreme weather events.
Data and Methods: Use diverse weather data, including temperature, humidity, wind speed, atmospheric pressure, and historical precipitation records, to train and evaluate classification algorithms such as logistic regression, decision trees, random forests, and other ensemble methods.
Evaluation Metrics: Models are assessed on accuracy, precision, recall, and F1-score for predicting rain occurrence, alongside ROC AUC and Cohen's Kappa, so that approaches can be compared and tailored to different cities.
Implications: Improved rainfall prediction benefits agriculture, water resource management, disaster preparedness, and urban planning, aiding farmers, water authorities, and emergency management agencies in optimizing resources and responding to extreme weather events proactively.
Objective: Develop accurate machine learning models for rainfall prediction in major cities, handling class imbalance, missing data, outliers, and feature selection along the way.
Aim: Enhance forecasting by preprocessing data and comparing models like Logistic Regression, Decision Trees, Neural Networks, Random Forest, and LightGBM.
Motivation: Timely and precise rainfall forecasts reduce losses from extreme weather events, benefiting agriculture, water management, and emergency planning in Australia.
Techniques
Class Imbalance: Addressed with minority class oversampling.
Missing Data: Imputed using Multiple Imputation by Chained Equations (MICE).
Outlier Detection: Identified outliers using the Interquartile Range (IQR) method.
Feature Selection: Used filter and wrapper methods for selecting relevant features.
Machine Learning Models: Trained and compared Logistic Regression, Decision Tree, Neural Network, Random Forest, and LightGBM classifiers.
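
The write-up does not show how the oversampling or the IQR check was implemented. As a minimal sketch, assuming plain random oversampling (duplicating minority rows with replacement) and the common 1.5 × IQR fence, the two steps might look like the following. The function names `iqr_outlier_mask` and `oversample_minority`, the `k` and `seed` parameters, and the toy data are hypothetical and not taken from the project.

```python
import random
import statistics
from collections import Counter


def iqr_outlier_mask(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: k=1.5 is the conventional fence; the project's value is not stated.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until every class
    has as many rows as the largest one.

    Assumption: simple random oversampling with replacement.
    Duplicated rows are references to the original row objects.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        extra = rng.choices(pool, k=target - n)  # empty for the largest class
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Toy usage with made-up numbers (not project data).
rainfall_mm = [1, 2, 3, 4, 5, 100]
print(iqr_outlier_mask(rainfall_mm))

X = [[1], [2], [3], [4]]
y = [0, 0, 0, 1]
X_bal, y_bal = oversample_minority(X, y)
print(Counter(y_bal))
```

Whether flagged outliers are dropped, capped, or kept is a separate decision that the write-up does not specify, so the sketch only returns a mask.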
Accuracy = 0.8050146850864789
ROC Area under Curve = 0.805039737453916
Cohen's Kappa = 0.6100470056991374
Time taken = 4.293061256408691
                precision    recall  f1-score   support
0.0               0.79882   0.81390   0.80629     27501
1.0               0.81141   0.79618   0.80372     27657
accuracy                              0.80501     55158
macro avg         0.80512   0.80504   0.80501     55158
weighted avg      0.80513   0.80501   0.80500     55158

Accuracy = 0.8666195293520432
ROC Area under Curve = 0.8665334987138236
Cohen's Kappa = 0.733191586808682
Time taken = 0.7531166076660156
                precision    recall  f1-score   support
0.0               0.88972   0.83612   0.86209     27501
1.0               0.84625   0.89695   0.87086     27657
accuracy                              0.86662     55158
macro avg         0.86799   0.86653   0.86648     55158
weighted avg      0.86793   0.86662   0.86649     55158

Accuracy = 0.8937053555241307
ROC Area under Curve = 0.8936717401423054
Cohen's Kappa = 0.7873950784904907
Time taken = 484.4572079181671
                precision    recall  f1-score   support
0.0               0.90276   0.88179   0.89215     27501
1.0               0.88511   0.90556   0.89522     27657
accuracy                              0.89371     55158
macro avg         0.89393   0.89367   0.89368     55158
weighted avg      0.89391   0.89371   0.89369     55158

Accuracy = 0.9234562529460821
ROC Area under Curve = 0.9233814961038466
Cohen's Kappa = 0.8468885765844757
Time taken = 48.85527777671814
                precision    recall  f1-score   support
0.0               0.94673   0.89695   0.92117     27501
1.0               0.90262   0.94981   0.92562     27657
accuracy                              0.92346     55158
macro avg         0.92467   0.92338   0.92339     55158
weighted avg      0.92461   0.92346   0.92340     55158

Accuracy = 0.8728017694622721
ROC Area under Curve = 0.8727177880255684
Cohen's Kappa = 0.7455592851796542
Time taken = 9.703380107879639
                precision    recall  f1-score   support
0.0               0.89572   0.84302   0.86857     27501
1.0               0.85254   0.90241   0.87677     27657
accuracy                              0.87280     55158
macro avg         0.87413   0.87272   0.87267     55158
weighted avg      0.87407   0.87280   0.87268     55158
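
To make the relationship between the accuracy and Cohen's Kappa figures in these reports concrete, here is a small sketch that computes both from the four cells of a binary confusion matrix. The function name `accuracy_and_kappa` and the example counts are hypothetical. The confusion matrices themselves are not preserved in this write-up, so the numbers below are illustrative only.

```python
def accuracy_and_kappa(tn, fp, fn, tp):
    """Return (accuracy, Cohen's kappa) for a 2x2 confusion matrix.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    (accuracy) and p_e is the agreement expected by chance from the
    marginal frequencies of true and predicted labels.
    """
    total = tn + fp + fn + tp
    if total == 0:
        raise ValueError("confusion matrix must contain at least one sample")
    p_o = (tn + tp) / total
    p_true_pos = (tp + fn) / total   # share of samples that are truly positive
    p_pred_pos = (tp + fp) / total   # share of samples predicted positive
    p_e = p_true_pos * p_pred_pos + (1 - p_true_pos) * (1 - p_pred_pos)
    if p_e == 1:
        # Only one class is present and predicted; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement is 1")
    return p_o, (p_o - p_e) / (1 - p_e)


# Illustrative counts only (not taken from the project).
acc, kappa = accuracy_and_kappa(tn=40, fp=10, fn=10, tp=40)
print(round(acc, 3), round(kappa, 3))
```

When the two classes have equal support, chance agreement is exactly 0.5 and kappa reduces to 2 × accuracy − 1. The supports in the reports are nearly equal (27501 and 27657), which is consistent with the reported pairs, for example an accuracy of about 0.805 next to a kappa of about 0.610.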
Evaluation: Assessed performance with metrics like accuracy, ROC-AUC, and Cohen’s Kappa.
Final Output: ['Rain' 'No Rain' 'Rain' ... 'No Rain' 'No Rain' 'Rain']
Binary Output: ['Rain' 'No Rain' 'Rain' ... 'No Rain' 'No Rain' 'Rain']
Majority Vote: Rain
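
The write-up does not say whether the majority vote is taken across the predicted labels of one model or across several models' predictions. As a hedged sketch of the simpler reading, assuming the vote is the most frequent label in a sequence of 'Rain' / 'No Rain' predictions, it could be computed as follows. The function name `majority_vote` and the sample list are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label in `predictions`.

    Ties are resolved in favour of the label that appears first,
    which is how Counter.most_common orders equal counts.
    """
    if len(predictions) == 0:
        raise ValueError("predictions must not be empty")
    return Counter(predictions).most_common(1)[0][0]


# Made-up predictions for illustration.
sample = ["Rain", "No Rain", "Rain", "No Rain", "Rain"]
print(majority_vote(sample))
```

The same function would work for voting across models by passing one predicted label per model for a given day.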