Introducing our SC1015 Mini-Project

About

This is our mini-project for SC1015 (Intro to Data Science and Artificial Intelligence).

Contributors (SC7 Group 10)

@nghochi123
@Mel-NLY

Understanding our project

Here's the sequence of files that we'd recommend for you to look through. There are more insights and explanations in the Jupyter Notebooks:

Dataset

The dataset is called the Global Terrorism Database, obtained from the National Consortium for the Study of Terrorism and Responses to Terrorism (START).

The database is maintained by researchers headquartered at the University of Maryland and contains information on more than 200,000 terrorist attacks worldwide.

Problem Definition + Motivation

Using this dataset, we are trying to answer the following questions:

What determines a successful terrorist attack?
What are terrorists really after?

We decided that determining what makes or breaks a terrorist attack was important, because it shows which combination of features matters most. Solutions can then target those features to prevent attacks from succeeding and hurting many others in the process.

Following the same methodology, picking out the motives that are the most harmful and the most common could also help reduce the number of successful terror attacks.

Models Used

Random Forest Classifier
Logistic Regression
K-Nearest Neighbours Classifier
Support Vector Classification (SVC)
Neural Networks
Stochastic Gradient Descent Classifier
Kernel SVM
Decision Tree Classifier
AdaBoost
Gradient Boosting
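
As a rough illustration of how several of the models above can be compared side by side, here is a minimal sketch using scikit-learn. It runs on synthetic, imbalanced stand-in data rather than the GTD, and the hyperparameters and the choice of three models are assumptions made for the example, not the settings used in the notebooks.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data (roughly 90% positive class); not the GTD.
X, y = make_classification(
    n_samples=1000, n_features=8, weights=[0.1, 0.9], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Three of the listed model families, with illustrative settings.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Fit each model and record two metrics on the held-out split.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "balanced_accuracy": balanced_accuracy_score(y_test, pred),
    }

for name, scores in results.items():
    print(f"{name}: {scores}")
```

Reporting balanced accuracy next to plain accuracy is one way to see whether a model is simply favouring the majority class.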

What We Discovered

What determines a successful terrorist attack?
This was the best combination of features we found for predicting whether a terrorist attack succeeds:
- Number of Kills
- Timestamp
- AttackType
- Number of Wounded
- WeaponType
- Month
What are terrorists really after?
Retaliation was found to be the most common motive among the terrorists.

Tools Used

Folium - Interactive Leaflet Map
Keplergl - Geospatial Analytic Visualizations
Plotly - Interactive Web-based Visualizations
Pickle - Serializing and deserializing object structures
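
For readers unfamiliar with Pickle, the sketch below shows the basic save-and-load round trip. The helper names, the file name, and the stand-in object are hypothetical; a fitted scikit-learn estimator could be saved the same way.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, path):
    """Serialize any picklable object (e.g. a fitted estimator) to disk."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load an object previously written by save_model."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for a trained model, used only to keep the example self-contained.
stand_in_model = {"name": "random_forest", "n_estimators": 100}

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.pkl"  # hypothetical file name
    save_model(stand_in_model, model_path)
    restored = load_model(model_path)

print(restored == stand_in_model)
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.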

Lessons Learnt

Here are some of the lessons we learnt while developing this project. More of our insights and realisations can be found in the Jupyter Notebooks and the presentation.

Initially, we cleaned the dataset across all of its columns, which resulted in us dropping too many data points. Instead, we first picked out the relevant columns and then cleaned those values, which gave us a fuller dataset.
The dataset was imbalanced between failed and successful attacks, so we had to use weights for rebalancing.
Library versions may have been affecting the results obtained on different operating systems (Mac/Windows). To investigate, we checked requirements.txt, tried shifting to a Windows device, and made use of Google Colab.
Initially, we had all of our code in a single file, which left our neural network without enough memory to run (and caused the kernel to fail). Hence, we split the code into separate Jupyter Notebook files.
We split the neural network model into its own notebook.
When comparing models, the score is sometimes not the best heuristic to use.
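
The first two lessons can be sketched with pandas as follows. The tiny data frame and its column names are made up for illustration and are not taken from the GTD. The weights use the "balanced" heuristic, n_samples / (n_classes * class_count), which is the same formula scikit-learn applies for class_weight="balanced"; whether the notebooks use exactly this scheme is an assumption.

```python
import pandas as pd

# Tiny made-up frame standing in for the real data; column names are illustrative.
df = pd.DataFrame(
    {
        "success": [1, 1, 1, 0, 1, 1],
        "nkill": [3, 0, None, 0, 1, 2],
        "attacktype": ["Bombing", "Armed Assault", "Bombing", "Bombing", None, "Hijacking"],
        "summary": [None, None, "text", None, None, None],  # sparse, irrelevant column
    }
)

# Dropping rows with any missing value across ALL columns keeps almost nothing.
cleaned_all = df.dropna()

# Selecting the relevant columns first, then dropping, keeps far more rows.
relevant = ["success", "nkill", "attacktype"]
cleaned_relevant = df[relevant].dropna()

# "Balanced" class weights: n_samples / (n_classes * class_count),
# so the rarer class (failed attacks here) gets the larger weight.
counts = cleaned_relevant["success"].value_counts()
class_weights = {
    label: len(cleaned_relevant) / (len(counts) * count)
    for label, count in counts.items()
}

print(len(cleaned_all), len(cleaned_relevant))
print(class_weights)
```

A dictionary like class_weights can be passed to the class_weight parameter that many scikit-learn classifiers accept.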

Also Do Check Out the Other Project Folders

cool_resources
- Interactive Maps,
- GTD CodeBook, and
- Slide Deck
saved
- Saved Trained Models, and
- Images obtained from analysis

References

https://www.start.umd.edu/gtd/analysis/
https://ourworldindata.org/terrorism
https://realpython.com/python-statistics/
https://machinelearningmastery.com/metrics-evaluate-machine-learning-algorithms-python/
https://medcraveonline.com/FRCIJ/motivation-leading-to-radicalization-in-terrorists.html
https://en.wikipedia.org/wiki/Tf%E2%80%93idf
https://towardsdatascience.com/topic-modelling-in-python-with-nltk-and-gensim-4ef03213cd21
https://towardsdatascience.com/text-classification-supervised-unsupervised-learning-approaches-9fd5e01a036
https://www.mdpi.com/2076-0760/11/1/23#
https://monkeylearn.com/topic-analysis/
