Fraud Detection in Bitcoin Transactions using Graph Convolutional Networks (GCNs)

Project Description

bitcoin_fraud_detection detects fraudulent Bitcoin transactions using Graph Convolutional Networks (GCNs). The project uses the Elliptic dataset and splits the work between two languages: C++ parses and cleans the raw transaction data, and Python handles graph construction, GCN training, and visualization.

Features

Data Preprocessing in C++: Efficient parsing and cleaning of transaction data.
Graph Construction: Creation of a transaction graph using NetworkX.
Graph Neural Network (GNN): Implementation of a GNN using PyTorch Geometric for fraud detection.
Visualization: Visualization of transaction graphs and model performance metrics using Plotly.
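
To make the graph construction step concrete, here is a minimal pure-Python sketch of turning a transaction edge list into the index-based form a GNN library consumes. PyTorch Geometric stores connectivity as an edge_index with one row of source indices and one row of target indices, so transaction IDs first have to be mapped to contiguous node indices. The function name build_edge_index and the toy transaction IDs are hypothetical and are not taken from the project's notebooks.

```python
def build_edge_index(edges):
    """Map arbitrary transaction IDs to 0..N-1 and return [sources, targets]."""
    # Collect every transaction ID that appears on either end of an edge.
    node_ids = sorted({tx for edge in edges for tx in edge})
    index_of = {tx: i for i, tx in enumerate(node_ids)}
    # Two parallel lists: row 0 holds source indices, row 1 holds target indices.
    sources = [index_of[src] for src, _ in edges]
    targets = [index_of[dst] for _, dst in edges]
    return index_of, [sources, targets]

# Toy edge list (hypothetical IDs, not real Elliptic transactions).
toy_edges = [(1003, 1001), (1001, 1002), (1003, 1002)]
index_of, edge_index = build_edge_index(toy_edges)
print(index_of)
print(edge_index)
```

In a PyTorch Geometric workflow the two lists would typically be wrapped in a long tensor before being passed to the model.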

Project Structure

bitcoin_fraud_detection/
│
├── data/
│   ├── filtered/
│   │   ├── filtered_classes.csv
│   │   ├── filtered_edgelist.csv
│   │   └── filtered_features.csv
│   └── unfiltered/
│       ├── elliptic_txs_classes.csv
│       ├── elliptic_txs_edgelist.csv
│       └── elliptic_txs_features.csv
│
├── src/
│   ├── data_preprocessing.cpp
│   └── CMakeLists.txt
│
├── training/
│   ├── data_preparation.ipynb
│   ├── gcn_model_weights.pth
│   ├── graph_data.pt
│   └── training.ipynb
│
├── visualization/
│   ├── data_plot.png
│   ├── data_predictions_plot.png
│   └── data_visualization.ipynb
│
├── README.md
└── LICENSE

Setup Instructions

C++ Environment Setup

Compile the C++ Code:

cd src
mkdir build
cd build
cmake ..
make
./data_preprocessing

Python Environment Setup

Install Python Dependencies:

cd training
pip install -r requirements.txt

Required Libraries:
- torch
- torch-geometric
- pandas
- matplotlib
- scipy
- networkx
- plotly

Running the Project

1. Data Preprocessing (C++):

Navigate to the src/build directory and run the compiled preprocessing executable.

cd src/build
./data_preprocessing

This reads the raw files in data/unfiltered/ and writes the filtered datasets to data/filtered/. The raw files are too large to host on GitHub, so you may need to download the Elliptic dataset from Kaggle and copy it into data/unfiltered/ manually.
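
The README does not spell out what the filtering step does. One plausible reading, assumed here, is that it drops transactions labeled "unknown" and any edge that touches them. The sketch below illustrates that idea in Python on in-memory toy CSV text; it is not the logic of data_preprocessing.cpp. The column names (txId, class, txId1, txId2) follow the Kaggle release of the Elliptic dataset as best I recall and should be checked against the actual files, and the function name filter_labeled is hypothetical.

```python
import csv
import io

def filter_labeled(classes_csv, edgelist_csv):
    """Keep labeled transactions and the edges whose endpoints are both labeled."""
    classes = csv.DictReader(io.StringIO(classes_csv))
    # Assumption: unlabeled transactions carry the class value "unknown".
    kept = {row["txId"]: row["class"] for row in classes if row["class"] != "unknown"}
    edges = csv.DictReader(io.StringIO(edgelist_csv))
    kept_edges = [
        (e["txId1"], e["txId2"])
        for e in edges
        if e["txId1"] in kept and e["txId2"] in kept
    ]
    return kept, kept_edges

# Toy data: transaction 3 is unlabeled, so it and the edge 2 -> 3 are dropped.
classes_csv = "txId,class\n1,1\n2,2\n3,unknown\n4,2\n"
edgelist_csv = "txId1,txId2\n1,2\n2,3\n4,1\n"
labels, edges = filter_labeled(classes_csv, edgelist_csv)
print(labels)
print(edges)
```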

2. Training the GNN Model (Python):

Navigate to the training directory and run the training.ipynb notebook.

cd training
jupyter notebook training.ipynb

This will train the GNN model and save the model weights to gcn_model_weights.pth.

3. Visualizing the Results (Python):

Navigate to the visualization directory and run the data_visualization.ipynb notebook.

cd visualization
jupyter notebook data_visualization.ipynb

Visualizations

Data Plot (visualization/data_plot.png)

Data Predictions Plot (visualization/data_predictions_plot.png)

Usage

Training: The GNN model can be trained using the training.ipynb notebook. Adjust hyperparameters as needed within the notebook.
Visualization: Use the data_visualization.ipynb notebook to generate visualizations of the transaction graph and model performance metrics.
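
The README does not say which performance metrics the notebooks report. For fraud detection, where fraudulent transactions are usually a small minority, precision, recall, and F1 on the fraud class are a common choice. The sketch below computes them in plain Python, assuming label 1 marks the fraud class; the function name and the toy labels are hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for one class, with 0.0 for empty denominators."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy labels: 1 = fraud (assumed), 0 = not fraud.
y_true = [1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```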

Conclusion

In this model, each of the two GCN layers aggregates information only from a node's first-order (direct) neighbors. The second layer, however, aggregates neighbor features that the first layer has already updated with information from those neighbors' own neighbors. As a result, each node's final representation indirectly incorporates second-order neighbor information, even though direct aggregation within any single layer is limited to first-order neighbors.
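
The two-hop effect can be checked with a small pure-Python sketch. It deliberately simplifies a GCN layer to a plain mean over each node and its direct neighbors, with no learned weights, nonlinearity, or the symmetric normalization a real GCN layer uses, so it illustrates only how information spreads, not the project's model. The graph and the function name aggregate are hypothetical.

```python
def aggregate(adjacency, values):
    """One round of first-order aggregation: mean over a node and its direct neighbors."""
    out = []
    for node, neighbors in enumerate(adjacency):
        group = [node] + neighbors
        out.append(sum(values[i] for i in group) / len(group))
    return out

# Path graph 0 - 1 - 2 - 3, with a signal placed on node 0 only.
adjacency = [[1], [0, 2], [1, 3], [2]]
signal = [1.0, 0.0, 0.0, 0.0]

after_one = aggregate(adjacency, signal)     # node 2 (two hops away) still sees nothing
after_two = aggregate(adjacency, after_one)  # node 2 now carries part of node 0's signal
print(after_one)
print(after_two)
```

After one round only node 1 has picked up the signal from node 0. After two rounds node 2 has it as well, while node 3, three hops away, is still at zero.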

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

Contributions are welcome! Please fork the repository and submit a pull request for any enhancements or bug fixes.

Acknowledgments

The Elliptic dataset: Kaggle
PyTorch Geometric: PyTorch Geometric

Contact

For any questions or suggestions, please open an issue or contact the project maintainers.
