Merge pull request #1 from fspzar123/fspzar123-new_branch
First Commit
fspzar123 authored Jun 7, 2024
2 parents 21f5561 + a56a8c5 commit 6c8a221
Showing 19 changed files with 7,324 additions and 0 deletions.
@@ -0,0 +1,37 @@
# About Dataset
Explore the intricate dance between gold prices and key economic events across major global players – Canada, Japan, USA, Russia, European Union, and China. This comprehensive dataset spans from January 2019 to December 2023, offering a nuanced analysis of how economic news from these influential regions impacts the ever-volatile gold market. Delve into the ebb and flow of financial landscapes, uncovering trends, correlations, and invaluable insights for strategic decision-making in the dynamic world of investments.

## Historical Gold Price Dataset:

* Day: The day of the week when the data was recorded.
* Date: The specific date corresponding to the recorded gold price.
* Hour: The time of day when the gold price was recorded.
* Country: The country associated with the economic event or news affecting gold prices.
* Event: The economic event or news that potentially influenced gold prices.
* Actual: The actual reported value or figure related to the economic event.
* Previous: The previously reported value or figure for the same economic event.
* Consensus: The consensus forecast or expected value for the economic event.
* Forecast: The forecasted value or figure for the economic event.

## Economic Calendar Dataset:

* Day: The day of the week when the economic event is scheduled.
* Date: The specific date when the economic event is expected to occur.
* Hour: The time of day when the economic event is scheduled.
* Country: The country associated with the economic event.
* Event: The specific economic event or news scheduled to take place.
* Actual: The actual reported value or figure related to the economic event.
* Previous: The previously reported value or figure for the same economic event.
* Consensus: The consensus forecast or expected value for the economic event.
* Forecast: The forecasted value or figure for the economic event.
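
As a rough illustration of the column layout above, the following sketch loads a calendar table with pandas and coerces the columns to workable types. The sample rows, the ISO date format, and the helper name `load_calendar` are assumptions made for this example and are not taken from the real dataset.

```python
import io

import pandas as pd

# Two made-up rows in the column layout described above. The values are
# illustrative only and do not come from the real dataset.
SAMPLE_CSV = """Day,Date,Hour,Country,Event,Actual,Previous,Consensus,Forecast
Friday,2019-01-04,13:30,USA,Non Farm Payrolls,312,176,177,180
Monday,2019-01-07,15:00,Canada,Ivey PMI,59.7,57.2,56.0,56.5
"""


def load_calendar(source):
    """Read a calendar CSV (hypothetical helper) and fix up column dtypes."""
    df = pd.read_csv(source)
    # Assumes dates are stored in a format pandas can parse, e.g. YYYY-MM-DD.
    df["Date"] = pd.to_datetime(df["Date"])
    numeric_cols = ["Actual", "Previous", "Consensus", "Forecast"]
    # errors="coerce" turns unparseable entries into NaN instead of raising.
    df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric, errors="coerce")
    return df


calendar = load_calendar(io.StringIO(SAMPLE_CSV))
print(calendar.dtypes)
```

In practice `source` would be the path to the downloaded CSV file rather than an in-memory string.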

## Sentiment Labeled Data:

* Date: The specific date when the economic event is expected to occur.
* Price: The price of gold in US dollars on that date.
* Vol_K: The volume of gold traded.
* Change_percent: The percentage change between the Previous and Actual values.
* Country: The country associated with the economic event.
* Event: The specific economic event or news scheduled to take place.
* D_Consensus: The difference between the Previous Consensus and the Actual Consensus.
* D_Forecast: The difference between the Previous Forecast and the Actual Forecast.
* Sentiment: The sentiment (Positive, Negative, Neutral) assigned to the event based on Change_percent.
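
The Sentiment column is described as being derived from Change_percent. A minimal sketch of such a labeling function is shown here; the function name `assign_sentiment` and the `threshold` parameter are hypothetical, and the actual cut-offs used to build the dataset are not stated in this README.

```python
import pandas as pd


def assign_sentiment(change_percent, threshold=0.0):
    """Label a percentage change as Positive, Negative, or Neutral.

    `threshold` is a hypothetical dead band around zero; the cut-off used
    for the real dataset is not documented here.
    """
    if change_percent > threshold:
        return "Positive"
    if change_percent < -threshold:
        return "Negative"
    return "Neutral"


# Made-up values, used only to show the function being applied to a column.
df = pd.DataFrame({"Change_percent": [0.85, -0.42, 0.0]})
df["Sentiment"] = df["Change_percent"].apply(assign_sentiment)
print(df)
```

A non-zero `threshold` would treat small moves in either direction as Neutral.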

Large diffs are not rendered by default.

@@ -0,0 +1,83 @@
## **The Effect of Economic News on Gold Prices Analysis**

### 🎯 **Goal**
To explore the intricate dance between gold prices and key economic events across major global players – Canada, Japan, USA, Russia, European Union, and China.

### 🧵 **Dataset**
Link for the dataset used in the project: [https://www.kaggle.com/datasets/fekihmea/the-effect-of-economic-news-on-gold-prices](https://www.kaggle.com/datasets/fekihmea/the-effect-of-economic-news-on-gold-prices)

### 🧾 **Description**
We start with *Exploratory Data Analysis (EDA)* and *Data Visualization* to gain insights from the dataset. Then we apply several machine learning algorithms to predict the change in the price of gold based on the sentiment of the event. Finally, we compare the accuracies of these algorithms to identify the best-performing model.

### 🧮 **What I had done!**
- Imported essential libraries for data manipulation and machine learning.
- Conducted Exploratory Data Analysis (EDA) to comprehend the dataset.
- Visualized data to extract meaningful patterns and insights.
- Set up a function to assign a Sentiment to an Event.
- Assessed feature correlations to understand interdependencies.
- Converted categorical features into numerical formats via feature mapping.
- Split the dataset into training and testing sets and applied scaling techniques.
- Implemented and trained four machine learning models: **Random Forest**, **SVM**, **Logistic Regression**, and **Gradient Booster**.
- Evaluated the models using classification report and compared their accuracies to determine the best-performing model.
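
The steps above can be sketched end to end with scikit-learn. This is a sketch under stated assumptions, not the project's notebook: the data is synthetic, the feature set (`Country`, `Vol_K`, `Change_percent`) and all hyperparameters are chosen for illustration, and the label is a direct function of one feature, so the scores it prints say nothing about the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the sentiment-labeled data (assumption).
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "Country": rng.choice(["USA", "Canada", "Japan"], size=n),
    "Vol_K": rng.uniform(50, 400, size=n),
    "Change_percent": rng.normal(0, 1, size=n),
})
df["Sentiment"] = np.where(df["Change_percent"] > 0, "Positive", "Negative")

# Feature mapping: turn the categorical column into integer codes.
country_map = {name: i for i, name in enumerate(sorted(df["Country"].unique()))}
df["Country"] = df["Country"].map(country_map)

X = df[["Country", "Vol_K", "Change_percent"]]
y = df["Sentiment"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Fit the scaler on the training split only, then reuse it on the test split.
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

models = {
    "Random Forest": RandomForestClassifier(random_state=42),
    "SVM": SVC(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Gradient Booster": GradientBoostingClassifier(random_state=42),
}

accuracies = {}
for name, model in models.items():
    model.fit(X_train_s, y_train)
    pred = model.predict(X_test_s)
    accuracies[name] = accuracy_score(y_test, pred)
    print(name)
    print(classification_report(y_test, pred))
```

Fitting the scaler on the training split alone keeps information from the test split out of the preprocessing step.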

### 🚀 **Models Implemented**
Model Building: We implemented the following algorithms for their distinct advantages in handling various aspects of the dataset:

- Random Forest Classifier: A random forest is an ensemble of decision trees, each trained on a bootstrap sample of the data, whose individual predictions are combined by majority vote into a single output.
- SVM: Support vector machines are supervised max-margin models that can be used for classification and regression analysis.
- Logistic Regression: Logistic regression is a supervised machine learning algorithm used for classification tasks, where the goal is to predict the probability that an instance belongs to a given class.
- Gradient Booster: Gradient boosting combines several weak learners into a strong learner. Models are added one at a time, and each new model is trained to reduce the loss (such as mean squared error or cross-entropy) of the ensemble built so far by fitting the negative gradient of that loss.
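
To make the gradient boosting description more concrete, here is a hand-rolled sketch for the squared-error case, where the negative gradient of the loss is simply the residual. The synthetic data, tree depth, learning rate, and number of rounds are all assumptions chosen for illustration; the project itself may simply use scikit-learn's built-in implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic one-dimensional regression problem (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, size=200)

learning_rate = 0.1
# Start from a constant model: the mean of the targets.
prediction = np.full_like(y, y.mean())
initial_mse = np.mean((y - prediction) ** 2)

trees = []
for _ in range(50):
    # For the loss 0.5 * (y - F)^2, the negative gradient w.r.t. F is y - F.
    residuals = y - prediction
    tree = DecisionTreeRegressor(max_depth=2, random_state=0)
    tree.fit(X, residuals)
    # Take a small step in the direction suggested by the new weak learner.
    prediction += learning_rate * tree.predict(X)
    trees.append(tree)

final_mse = np.mean((y - prediction) ** 2)
print(f"training MSE: {initial_mse:.4f} -> {final_mse:.4f}")
```

Each shallow tree on its own is a weak learner; it is the sum of many small corrections that drives the training error down.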

### 📚 **Libraries Needed**
- Language Used
  - Python
- Libraries Used
  - Pandas
  - Seaborn
  - Numpy
  - Matplotlib
  - Sklearn

### 📊 **Exploratory Data Analysis Results**

<table>
<tr>
<td><img src="https://github.com/fspzar123/ML-Crate/blob/48ce7189e52355c9ef268a87d9d4d160dbc55b28/The%20Effect%20of%20Economic%20News%20on%20Gold%20Prices%20Analysis/Images/Weekly-Gold.png" alt="Weekly Gold Prices"></td>
<td><img src="https://github.com/fspzar123/ML-Crate/blob/22465f62c53b71fd029d4c1b350ae3bd1eda3f1f/The%20Effect%20of%20Economic%20News%20on%20Gold%20Prices%20Analysis/Images/Avg%20Price%20by%20country.png" alt="Average Price by Country"></td>
</tr>
<tr>
<td><img src="https://github.com/fspzar123/ML-Crate/blob/22465f62c53b71fd029d4c1b350ae3bd1eda3f1f/The%20Effect%20of%20Economic%20News%20on%20Gold%20Prices%20Analysis/Images/Dist%20of%20price%20by%20country.png" alt="Distribution of Price by Country"></td>
<td><img src="https://github.com/fspzar123/ML-Crate/blob/22465f62c53b71fd029d4c1b350ae3bd1eda3f1f/The%20Effect%20of%20Economic%20News%20on%20Gold%20Prices%20Analysis/Images/Trend%20of%20Price%26Vol_K.png" alt="Trend of Price and Volume"></td>
</tr>
<tr>
<td colspan="2" style="text-align: center;"><img src="https://github.com/fspzar123/ML-Crate/blob/22465f62c53b71fd029d4c1b350ae3bd1eda3f1f/The%20Effect%20of%20Economic%20News%20on%20Gold%20Prices%20Analysis/Images/HeatMap%20Of%20Correlation.png" alt="Heatmap of Correlation" style="width: 70%;"></td>
</tr>
<tr>
<td colspan="2" style="text-align: center;"><img src="https://github.com/fspzar123/ML-Crate/blob/22465f62c53b71fd029d4c1b350ae3bd1eda3f1f/The%20Effect%20of%20Economic%20News%20on%20Gold%20Prices%20Analysis/Images/Dist%20of%20events%20per%20date.png" alt="Distribution of events per date" style="width: 70%;"></td>
</tr>
</table>
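
The correlation heatmap above is built from a pairwise correlation matrix. A minimal sketch of that computation with pandas follows; the column values are synthetic and only the column names (`Price`, `Vol_K`, `Change_percent`) come from the dataset description.

```python
import numpy as np
import pandas as pd

# Synthetic numeric columns standing in for the real data (assumption).
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "Price": rng.normal(1800, 100, size=100),
    "Vol_K": rng.uniform(50, 400, size=100),
    "Change_percent": rng.normal(0, 1, size=100),
})

# Pearson correlation between every pair of numeric columns.
corr = df.corr()
print(corr.round(2))
```

Passing `corr` to `seaborn.heatmap(corr, annot=True)` would render it as an annotated heatmap.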

### 📈 **Performance of the Models based on the Accuracy Scores**

<table>
<tr>
<td style="padding-right: 20px; vertical-align: top;">
<ul style="list-style-type: disc; margin: 0;">
<li>Random Forest Classifier - 99.5%</li>
<li>Support Vector Machines - 97.5%</li>
<li>Logistic Regression - 97%</li>
<li>Gradient Booster - 100%</li>
</ul>
</td>
<td style="vertical-align: top;">
<img src="https://github.com/fspzar123/ML-Crate/blob/22465f62c53b71fd029d4c1b350ae3bd1eda3f1f/The%20Effect%20of%20Economic%20News%20on%20Gold%20Prices%20Analysis/Images/Accuracies%20of%20Models.png" alt="Accuracies of Models" style="max-width: 200px; max-height: 200px;">
</td>
</tr>
</table>
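
The comparison can be expressed in a few lines of Python. The scores below are the accuracies reported in the list above; the variable names are chosen for this example.

```python
# Accuracy scores (in percent) as reported above.
reported_accuracy = {
    "Random Forest Classifier": 99.5,
    "Support Vector Machines": 97.5,
    "Logistic Regression": 97.0,
    "Gradient Booster": 100.0,
}

# Sort from best to worst and pick the top entry.
ranking = sorted(reported_accuracy.items(), key=lambda item: item[1], reverse=True)
best_model, best_score = ranking[0]

for name, score in ranking:
    print(f"{name}: {score}%")
```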


### 📢 **Conclusion**
Among all the models tested, the **Gradient Booster** achieved the highest accuracy, **100%**, making it the best-performing model for predicting the change in gold prices on this dataset.

### ✒️ **Your Signature**
Created by [Filbert Shawn](https://github.com/fspzar123) as a part of SSOC'24 Season 3.
Binary file not shown.