- A data science pipeline applied to a classic machine learning benchmark dataset: MNIST handwritten digits
- The confusion matrix shows the prediction accuracy for each digit and which digits are most commonly mistaken for one another
- The weighted multiclass F1 score is 98.8%
- 3 is most commonly mispredicted as 5 and 4 is most commonly mispredicted as 9
- Installing the pip package python-mnist puts the data download script emnist_get_data.sh in your Python bin directory, e.g.
ls PYTHON_BIN_DIR | grep mnist
- Run tests with
pytest tests/*
- View the JSON metadata with the jq utility
jq . models/*.json
- The project plan is in plan.md
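
For readers unfamiliar with the metrics mentioned above, here is a minimal sketch of how a confusion matrix, a support-weighted F1 score, and the "most commonly mispredicted as" pairs can be computed. It is not the project's own code: the function names `confusion_matrix`, `weighted_f1`, and `most_common_confusions` are hypothetical, the labels are toy data rather than MNIST results, and the pipeline itself may well use a library such as scikit-learn instead.

```python
def confusion_matrix(y_true, y_pred, n_classes=10):
    """Count predictions: rows are true digits, columns are predicted digits."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def weighted_f1(matrix):
    """Per-class F1, averaged with weights proportional to each class's support."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        return 0.0
    score = 0.0
    for k in range(len(matrix)):
        tp = matrix[k][k]
        support = sum(matrix[k])                   # TP + FN
        predicted = sum(row[k] for row in matrix)  # TP + FP
        denom = support + predicted                # 2*TP + FP + FN
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * support / total
    return score


def most_common_confusions(matrix):
    """For each true digit with errors, the wrong digit it is most often predicted as."""
    result = {}
    for k, row in enumerate(matrix):
        errors = [(count, j) for j, count in enumerate(row) if j != k and count > 0]
        if errors:
            result[k] = max(errors, key=lambda e: e[0])[1]
    return result


# Toy labels for illustration only (not MNIST output).
y_true = [3, 3, 3, 4, 4, 5, 9]
y_pred = [3, 5, 5, 9, 4, 5, 9]

cm = confusion_matrix(y_true, y_pred)
print(most_common_confusions(cm))
print(round(weighted_f1(cm), 4))
```

Weighting each class's F1 by its support means frequent digits influence the score more than rare ones, which matters little on a roughly balanced dataset like MNIST but is the reason the score is described as "weighted".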