This repository contains our final project submission for APMA4990: Modeling Social Data. The project was completed by Zach Gleicher, Matt Piccolella, and Edo Roth.
The repository is organized into the following directories and files:
├── analysis/ - R code to do our analysis
├── bin/ - various executables, mostly for data pruning
├── data/ - data which we analyze
├── output/ - graphs/data output by our analysis
├── v11n/ - d3.js visualizations of our data
| └── data/ - data specifically for our web visualizations
├── report.{md,pdf} - final project report
└── run.sh - executable to run all of our code
These directories and files contain all of the code and writing we produced for the project.
To run all of our code, simply run:
$ ./run.sh
Note: since our project has several parts, some of which are quite time-intensive, this may take between 10 and 15 minutes.
Our output is written to the output/ directory, which looks as follows:
├── output/
| ├── clustering-map.pdf - map of NYC with cluster centers plotted
| ├── preliminary-analysis.pdf - graphs from our preliminary analysis
| ├── lasso_data.txt - description/results of our lasso model
| ├── nb_data.txt - description/results of our naive bayes model
| ├── lr_data.txt - description/results of our logistic regression model
| ├── nb_graphs.pdf - graphs of our naive bayes results
| └── lr_graphs.pdf - graphs of our logistic regression results
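Since a full run takes a while, it can help to confirm afterwards that every file listed above was actually produced. The following is a minimal sketch, not part of the repository: the function name check_outputs is hypothetical, and it assumes the file names match the listing above exactly.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with this project).
# Prints each expected output file that is missing from the given
# directory (default: output) and returns non-zero if any are missing.
check_outputs() {
    dir="${1:-output}"
    missing=0
    for f in clustering-map.pdf preliminary-analysis.pdf \
             lasso_data.txt nb_data.txt lr_data.txt \
             nb_graphs.pdf lr_graphs.pdf; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    return "$missing"
}
```

After defining the function in your shell, one possible usage is to chain it after the run from the repository root: ./run.sh && check_outputs output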