This project is a demo that applies Principal Component Analysis (PCA) to the Iris dataset using Python and the Scikit-learn library. PCA is a technique that reduces a multidimensional dataset to fewer dimensions while preserving as much of the data's variance as possible. The project loads the dataset, performs PCA, and visualizes the results in a graph.
- iris_pca.py: Python code implementing the PCA.
- README.md: This file, containing information about the project.
The code requires the following:
- Python 3.x
- Scikit-learn
- Pandas
- Matplotlib
You can use the following commands to install the necessary libraries:
pip install scikit-learn
pip install pandas
pip install matplotlib
This project uses the Iris dataset, which contains 150 flower samples from three iris species, each described by four measurements (sepal length, sepal width, petal length, and petal width). The dataset ships with the Scikit-learn library, so no external source is needed to obtain it.
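As a rough illustration of how the bundled dataset can be inspected, the sketch below loads it with Scikit-learn's `load_iris` and wraps it in a Pandas DataFrame. The variable names (`features`, `species`) are chosen here for illustration and are not taken from the project's script.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Load the dataset that ships with Scikit-learn (no download needed).
iris = load_iris()

# Put the four measurements into a DataFrame with readable column names.
features = pd.DataFrame(iris.data, columns=iris.feature_names)

# Add the species name for each sample (illustrative column name).
features["species"] = [iris.target_names[i] for i in iris.target]

print(features.shape)   # rows and columns of the table
print(features.head())  # first few samples
```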
The project consists of the following steps:
- The Iris dataset is loaded.
- PCA is applied and the dataset is reduced from 4 dimensions to 2 dimensions.
- The results are visualized on a graph.
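The three steps above can be sketched roughly as follows. This is a minimal example, not necessarily the project's actual script: it assumes the features are passed to PCA without standardization, and the plot labels and variable names are illustrative.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Step 1: load the dataset (150 samples, 4 features).
iris = load_iris()
X, y = iris.data, iris.target

# Step 2: reduce the 4 features to 2 principal components.
# Assumption: no scaling is applied before PCA.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

# Step 3: scatter plot of the projected samples, one color per species.
fig, ax = plt.subplots()
for label, name in enumerate(iris.target_names):
    mask = y == label
    ax.scatter(X_2d[mask, 0], X_2d[mask, 1], label=name)
ax.set_xlabel("Principal component 1")
ax.set_ylabel("Principal component 2")
ax.set_title("Iris dataset projected onto 2 principal components")
ax.legend()
# plt.show()  # uncomment to open the plot window
```

PCA centers the data internally, so the projected points are spread around the origin of the plot.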
You can follow the steps below to use the project:
- Clone the project to your computer:
git clone https://github.com/Prometheussx/Iris-Data-PCA-Exploration.git
- Go to the project folder:
cd Iris-Data-PCA-Exploration
- Run the code:
python PCA.py
- Examine the graph on the screen to see the results.
- The line below prints the share of variance explained by each component. The first component explains 92.46% of the variance, and the second explains 5.30%.
print("Variance ratio:", pca.explained_variance_ratio_)
- The line below prints the total. The two components together preserve 97.76% of the variance, so only 2.24% of the variance (not of the data itself) is lost in the reduction.
print("Sum:", sum(pca.explained_variance_ratio_))
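The arithmetic behind these figures can be reproduced with a short sketch like the one below. It assumes PCA is fitted on the raw, unscaled Iris features, which is the setting that yields the ratios quoted above; the variable names `retained` and `not_captured` are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Fit a 2-component PCA on the raw (unscaled) Iris features.
X = load_iris().data
pca = PCA(n_components=2).fit(X)

# Fraction of total variance captured by each kept component.
ratios = pca.explained_variance_ratio_

# Variance kept by both components, and the share left out.
retained = ratios.sum()
not_captured = 1.0 - retained

for i, ratio in enumerate(ratios, start=1):
    print(f"Component {i}: {ratio:.2%}")
print(f"Retained: {retained:.2%}")
print(f"Not captured: {not_captured:.2%}")
```

Standardizing the features before fitting would change these ratios, so results from a scaled pipeline should not be expected to match the numbers above.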
For any questions, feedback, or requests to contribute to the project, you can use the contact information below:
- LinkedIn: [https://www.linkedin.com/in/erdem-taha-sokullu/]
- Email: [[email protected]]
- Kaggle: [https://www.kaggle.com/erdemtaha]
You can report any issues or request new features using the Issues section of the project on GitHub. Try to be as detailed as possible when describing issues and requests.
Bug reports and requests to contribute are also handled through the project's GitHub repository.
This project is licensed under the MIT license. For more information, see LICENSE.