Scripts for acquiring our MegaFoss dataset - a curated list of top open source projects that represent modern software development
To regenerate the repo list, run:
python src/github/get_repo_list.py
git clone https://github.com/CVEProject/cvelistV5 cves
- Ensure PostgreSQL is installed and running
- Configure your database connection in
src/cve/config/postgres.ini
- Ensure you have a Python environment installed and activated
- Install the required packages by running
pip install -r requirements.txt
- Run the following command to create the database schema:
python src/cve/create_db_tables.py
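As a rough illustration of the configuration step above, a file such as src/cve/config/postgres.ini could be read with Python's standard configparser module. The section name `postgresql`, the keys, and the helper `load_db_params` are assumptions made for this sketch; check the actual file and scripts for the names they expect.

```python
import configparser

# Hypothetical contents of src/cve/config/postgres.ini. The section and key
# names here are assumptions, not necessarily what the scripts read.
SAMPLE_INI = """
[postgresql]
host = localhost
port = 5432
database = megafoss
user = postgres
password = change-me
"""


def load_db_params(ini_text, section="postgresql"):
    """Return the settings found in `section` as a plain dict of strings."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    if not parser.has_section(section):
        raise KeyError(f"section [{section}] not found")
    return dict(parser.items(section))


params = load_db_params(SAMPLE_INI)
print(params["host"], params["port"])
```

All values come back as strings, so a port number would need converting before being passed to a database driver that expects an integer.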
python src/cve/repos_to_nvd.py
This produces three files: a CSV file of mapped repos, a file listing repos that need manual mapping, and a file listing repos that were not found in the NVD database.
python src/cve/list_patches.py
Prints a list of patches.
python src/cve/nvd_to_cve_id_assigner_name.py
Prints tuples of (cve_id, vendor).
python src/cve/cve_no_cwe.py
Writes the file cve/output/cve_no_cwe.txt
containing a list of CVEs that have no CWEs.
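To make "CVEs with no CWEs" concrete, here is a minimal sketch that checks records in the CVE JSON 5 format used by the cvelistV5 repository. It is not the logic of cve_no_cwe.py, which may query the database instead. The helper names and the sample records (with placeholder IDs) are made up for illustration, and only the CNA container is inspected.

```python
def has_cwe(record):
    """Return True if the record's CNA container lists at least one CWE ID."""
    cna = record.get("containers", {}).get("cna", {})
    for problem_type in cna.get("problemTypes", []):
        for description in problem_type.get("descriptions", []):
            if description.get("cweId"):
                return True
    return False


def cves_without_cwe(records):
    """Return the CVE IDs of the records that carry no CWE ID."""
    return [r["cveMetadata"]["cveId"] for r in records if not has_cwe(r)]


# Made-up sample records with placeholder IDs, trimmed to the relevant fields.
SAMPLE_RECORDS = [
    {
        "cveMetadata": {"cveId": "CVE-0000-0001"},
        "containers": {"cna": {"problemTypes": [
            {"descriptions": [{"lang": "en", "cweId": "CWE-79",
                               "description": "CWE-79", "type": "CWE"}]}
        ]}},
    },
    {
        "cveMetadata": {"cveId": "CVE-0000-0002"},
        "containers": {"cna": {"problemTypes": [
            {"descriptions": [{"lang": "en", "description": "n/a"}]}
        ]}},
    },
]

print(cves_without_cwe(SAMPLE_RECORDS))
```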
Ensure you have the 'Master' and 'CWE_Relative_Map' tables from the spreadsheet downloaded
as lists/rust_to_cwe.csv
and lists/cwe_child_map.csv
respectively.
The former can be downloaded using python src/cve/download_rust_cve_sheet.py
python src/cve/generate_pi_chart.py
Output can be configured to print to the console or save to a file, and can also list CVEs with no CWEs. The script prints tab-separated data to copy into the spreadsheet, which automatically updates the pie chart. It also prints data for specific projects and lists the CWEs that had no vote mapping.
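For reference, tab-separated output that pastes cleanly into spreadsheet columns could be produced along these lines. This is a sketch only: the column names, the helper `cwe_counts_tsv`, and the sample CWE list are assumptions, and the real script's columns may differ.

```python
import csv
import io
from collections import Counter


def cwe_counts_tsv(cwe_ids):
    """Count CWE IDs and return the counts as tab-separated text."""
    counts = Counter(cwe_ids)
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["cwe", "count"])
    # Most frequent first; ties are broken alphabetically for stable output.
    for cwe, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        writer.writerow([cwe, count])
    return buffer.getvalue()


# Made-up sample input for illustration.
sample = ["CWE-787", "CWE-416", "CWE-787", "CWE-125"]
tsv_text = cwe_counts_tsv(sample)
print(tsv_text, end="")
```

Using csv.writer rather than joining strings by hand keeps fields that contain tabs or quotes correctly escaped.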