A set of scripts for analyzing commit messages on GitHub.
- verb form
  - imperative (add)
  - gerund (adding)
  - past simple (added)
  - present simple, 3rd person (adds)
- word frequency
- commit message length, number of lines
- beginning with a capital letter
- ending with a dot
- containing only ASCII characters
- written with CAPS LOCK on
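The checks listed above can be sketched in a few lines of Python. This is only an illustration, not the code from `analyze.py`: the names `VERB_FORMS`, `STOP_WORDS`, `describe` and `word_frequency` are hypothetical, the conjugation table is a four-entry stand-in for the one the scripts generate, and the choice to test the capital letter and the trailing dot against the first line of the message is an assumption.

```python
import re
from collections import Counter

# Hypothetical stand-in for the generated conjugation table.
VERB_FORMS = {
    "add": "imperative",
    "adding": "gerund",
    "added": "past simple",
    "adds": "present simple, 3rd person",
}

# Hypothetical stand-in for the real stop word list.
STOP_WORDS = {"the", "a", "an", "to", "in", "of"}


def describe(message):
    """Return the per-message properties for one commit message."""
    lines = message.splitlines()
    subject = lines[0] if lines else ""
    words = re.findall(r"[A-Za-z]+", message)
    letters = [c for c in message if c.isalpha()]
    return {
        # Form of the first word, or None if it is not in the table.
        "verb_form": VERB_FORMS.get(words[0].lower()) if words else None,
        "length": len(message),
        "lines": len(lines),
        # Assumption: both checks look at the first line only.
        "starts_with_capital": subject[:1].isupper(),
        "ends_with_dot": subject.endswith("."),
        # ord() check instead of str.isascii() to stay compatible with 3.6.
        "ascii_only": all(ord(c) < 128 for c in message),
        # True only if there is at least one letter and all are uppercase.
        "all_caps": bool(letters) and all(c.isupper() for c in letters),
    }


def word_frequency(messages):
    """Count lowercase words across messages, skipping stop words."""
    counts = Counter()
    for message in messages:
        for word in re.findall(r"[A-Za-z]+", message.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts
```

For example, `describe("Added tests.")["verb_form"]` yields `"past simple"` with the table above, and `word_frequency(...).most_common(10)` gives the ten most frequent words.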
- Python 3.6+
- pyquery
- matplotlib
- numpy
Install the dependencies with:
pip install -r requirements.txt
Run the scripts in the following order:
./process_verbs.py && ./process_irregular_verbs.py && ./generate_conjugations.py && ./fetch_commits.py yyyy-mm-dd-hh && ./analyze.py && ./plot.py
You can substitute ./fetch_commits.py yyyy-mm-dd-hh with ./fetch_commits_for_month.py yyyy-mm or ./fetch_commits_for_year.py yyyy.
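To show how a yyyy-mm argument relates to the hourly yyyy-mm-dd-hh one, here is a minimal sketch that expands a month into one key per hour. It is an assumption about how the monthly script might work, not its actual code; the function name `hours_in_month` is hypothetical, and the two-digit hour simply follows the `hh` placeholder above.

```python
import calendar


def hours_in_month(month):
    """Expand 'yyyy-mm' into a list of 'yyyy-mm-dd-hh' keys, one per hour."""
    year, mon = (int(part) for part in month.split("-"))
    # monthrange returns (weekday of first day, number of days in month).
    days = calendar.monthrange(year, mon)[1]
    return [
        f"{year:04d}-{mon:02d}-{day:02d}-{hour:02d}"
        for day in range(1, days + 1)
        for hour in range(24)
    ]
```

Each resulting key could then be passed to the hourly script in turn, for example with `subprocess.run(["./fetch_commits.py", key], check=True)`.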
I analyzed commits from the whole of 2017; you can find the charts in the results-2017 folder.
- GitHub Archive – commits
- WordNet – list of verbs
- Ted Pedersen – stop words
- Wikipedia – list of irregular verbs