This repository contains the code for the application presented at flxw.de/code-repository-mining. It is a Python 3 client-server application. Please follow the subsections below to set up the individual parts:
The directory contents are as follows:

- `client`: contains the code that runs client-side
- `server`: contains the code running server-side, providing an API
- `data`: contains the scripts needed for creating the necessary tables and views
- `docs`: contains the project website, several artifacts and raw Jupyter notebooks
- Change directories to the `client/` folder and install the dependencies: `pip install -r requirements.txt`

Then, simply scan your system for vulnerable packages via `./checksystem.py`. Currently, only the apt and pacman package managers are supported, which translates to most Debian- or Arch-based Linux distributions.

To simulate the results that this application could potentially give, run `./checksystem.py --test`. It will show results for the openssl package as affected by Heartbleed.
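The scan itself is not spelled out in this README; as a rough sketch under stated assumptions, querying apt or pacman for installed package versions could look like the following. The function names and the `dpkg-query` format string are illustrative, not taken from `checksystem.py`.

```python
import shutil
import subprocess

def parse_package_list(output):
    """Parse 'name version' lines, the format emitted by `pacman -Q`
    (and by the dpkg-query call below)."""
    packages = {}
    for line in output.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            packages[parts[0]] = parts[1].strip()
    return packages

def installed_packages():
    """Query the system package manager (apt or pacman) for installed packages."""
    if shutil.which("dpkg-query"):   # Debian-based systems
        cmd = ["dpkg-query", "-W", "-f=${Package} ${Version}\n"]
    elif shutil.which("pacman"):     # Arch-based systems
        cmd = ["pacman", "-Q"]
    else:
        raise RuntimeError("no supported package manager (apt/pacman) found")
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_package_list(result.stdout)
```

The resulting name-to-version mapping is what a client would then compare against the vulnerability data served by the API.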
This setup assumes the GHtorrent database dump at the HPI chair for software architecture. Furthermore, a MongoDB instance needs to be running and you need to have access to it. The data procurement and setup are time-consuming:
- Change directories to `data/`
- Copy `config.py.smpl` to `config.py` and edit it so it works for your installation
- Copy `config.py` to `server/` as well
- Run `create-cve-search-view.sql`. Wait for completion.
- Install `scrapy` via `pip install scrapy`
- Download my TweetScraper fork
- Configure the TweetScraper via its `settings.py` to reflect your PostgreSQL settings and have the TweetScraper use it
- Run `./crawl-cve-tweets-from-github-subset` from inside the TweetScraper project directory. You can go ahead with the next step while the crawler is doing its thing.
- Download and set up cve-search. Wait for completion here.
- Run `mine-cve-search-into-postgres.py`. Wait for completion.
- Run `create-reference-url-extraction-view.sql`, `create-tweet-extracted-views.sql`, `create-cwe-nist-reference-ranking.sql` and `create-twitter-user-ranking.sql`, in that order.
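The four view-creation scripts in the last step must run in the listed order. A small driver like the following can enforce that; the `psql` invocation and the `database`/`data_dir` parameters are assumptions and must match your `config.py` and PostgreSQL setup, not something this repository provides.

```python
import subprocess
from pathlib import Path

# The view-creation scripts listed above, in their required order.
SQL_SCRIPTS = [
    "create-reference-url-extraction-view.sql",
    "create-tweet-extracted-views.sql",
    "create-cwe-nist-reference-ranking.sql",
    "create-twitter-user-ranking.sql",
]

def run_view_scripts(database, data_dir="data"):
    """Execute each script with psql, stopping at the first failure.

    `database` must name the PostgreSQL database configured in config.py;
    host and user fall back to psql's usual environment defaults.
    """
    for script in SQL_SCRIPTS:
        subprocess.run(
            ["psql", "--dbname", database, "--file", str(Path(data_dir) / script)],
            check=True,
        )
```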
The API server setup is straightforward and can be summarized in three commands:
```
cd server
pip install -r requirements.txt
hug -f api.py
```
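Once `hug -f api.py` is running (hug's development server listens on port 8000 by default), the API can be queried over plain HTTP. The `/vulnerabilities` route and its parameters below are illustrative assumptions; consult `api.py` for the actual endpoints.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_BASE = "http://localhost:8000"  # hug's default development port

def vulnerability_url(package, version):
    """Build a query URL; the /vulnerabilities route and its parameters
    are hypothetical, not read from api.py."""
    return API_BASE + "/vulnerabilities?" + urlencode({"package": package, "version": version})

def check_package(package, version):
    """Fetch and decode the JSON response for one package."""
    with urllib.request.urlopen(vulnerability_url(package, version)) as resp:
        return json.load(resp)
```

A client-side scan would call something like `check_package("openssl", "1.0.1f")` for each installed package and report any hits.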