This repository contains the code for the application presented at flxw.de/code-repository-mining. It is a Python 3 client-server application. Please follow the subsections below to set up the individual parts:
The directory contents are as follows:

- `client`: contains the code that runs client-side
- `server`: contains the code running server-side, providing an API
- `data`: contains the scripts needed for creating the necessary tables and views
- `docs`: contains the project website, several artifacts and raw Jupyter notebooks
- Change directories to the `client/` folder and install the dependencies: `pip install -r requirements.txt`

Then, simply scan your system for vulnerable packages via `./checksystem.py`. Currently, only the apt and pacman package managers are supported, which translates to most Debian- or Arch-based Linux distributions.

To simulate the results that this application could potentially give, run `./checksystem.py --test`. It will show results for the openssl package as affected by Heartbleed.
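The scan itself is not spelled out in this README; as a rough sketch under stated assumptions, querying apt or pacman for installed package versions could look like the following. The function names and the `dpkg-query` format string are illustrative, not taken from `checksystem.py`.

```python
import shutil
import subprocess

def parse_package_list(output):
    """Parse 'name version' lines, the format emitted by `pacman -Q`
    (and by the dpkg-query call below)."""
    packages = {}
    for line in output.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            packages[parts[0]] = parts[1].strip()
    return packages

def installed_packages():
    """Query the system package manager (apt or pacman) for installed packages."""
    if shutil.which("dpkg-query"):   # Debian-based systems
        cmd = ["dpkg-query", "-W", "-f=${Package} ${Version}\n"]
    elif shutil.which("pacman"):     # Arch-based systems
        cmd = ["pacman", "-Q"]
    else:
        raise RuntimeError("no supported package manager (apt/pacman) found")
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_package_list(result.stdout)
```

The resulting name-to-version mapping is what a client would then compare against the vulnerability data served by the API.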
This setup assumes the GHtorrent database dump at the HPI chair for software architecture. Furthermore, a MongoDB instance needs to be running and you need to have access to it. The data procurement and setup are time-consuming:
- Change directories to `data/`
- Copy `config.py.smpl` to `config.py` and edit it so it works for your installation
- Copy `config.py` to `server/` as well
- Run `create-cve-search-view.sql`. Wait for completion.
- Install `scrapy` via `pip install scrapy`
- Download my TweetScraper fork
- Configure the TweetScraper via its `settings.py` to reflect your PostgreSQL settings and have the TweetScraper use it
- Run `./crawl-cve-tweets-from-github-subset` from inside the TweetScraper project directory. You can go ahead with the next step while the crawler is doing its thing.
- Download and set up cve-search. Wait for completion here.
- Run `mine-cve-search-into-postgres.py`. Wait for completion.
- Run `create-reference-url-extraction-view.sql`, `create-tweet-extracted-views.sql`, `create-cwe-nist-reference-ranking.sql` and `create-twitter-user-ranking.sql`, in that order.
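The four view-creation scripts in the last step must run in the listed order. A small driver like the following can enforce that; the `psql` invocation and the `database`/`data_dir` parameters are assumptions and must match your `config.py` and PostgreSQL setup, not something this repository provides.

```python
import subprocess
from pathlib import Path

# The view-creation scripts listed above, in their required order.
SQL_SCRIPTS = [
    "create-reference-url-extraction-view.sql",
    "create-tweet-extracted-views.sql",
    "create-cwe-nist-reference-ranking.sql",
    "create-twitter-user-ranking.sql",
]

def run_view_scripts(database, data_dir="data"):
    """Execute each script with psql, stopping at the first failure.

    `database` must name the PostgreSQL database configured in config.py;
    host and user fall back to psql's usual environment defaults.
    """
    for script in SQL_SCRIPTS:
        subprocess.run(
            ["psql", "--dbname", database, "--file", str(Path(data_dir) / script)],
            check=True,
        )
```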
The API server setup is straightforward and can be summarized in three commands:
```
cd server
pip install -r requirements.txt
hug -f api.py
```
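Once `hug -f api.py` is running (hug's development server listens on port 8000 by default), the API can be queried over plain HTTP. The `/vulnerabilities` route and its parameters below are illustrative assumptions; consult `api.py` for the actual endpoints.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_BASE = "http://localhost:8000"  # hug's default development port

def vulnerability_url(package, version):
    """Build a query URL; the /vulnerabilities route and its parameters
    are hypothetical, not read from api.py."""
    return API_BASE + "/vulnerabilities?" + urlencode({"package": package, "version": version})

def check_package(package, version):
    """Fetch and decode the JSON response for one package."""
    with urllib.request.urlopen(vulnerability_url(package, version)) as resp:
        return json.load(resp)
```

A client-side scan would call something like `check_package("openssl", "1.0.1f")` for each installed package and report any hits.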