plan.md

Goal

  1. Model a classic benchmark dataset
  2. Reusable script design
  3. Good metadata logging

Backlog

  1. DONE - grids of outlier digits
  2. DONE - grid of digit means
  3. DONE - cumulative distribution plot
  4. DONE - neural net modeling

High level procedure

  1. Tune hyperparameters on 150000 samples
  2. Fit the final model on 240000 samples
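
A minimal sketch of how the tuning subset might be drawn. It assumes the 150000 tuning rows are a random sample of the 240000 modeling rows, which the plan does not state; `tuning_indices` and the `seed` parameter are hypothetical names used only for illustration.

```python
import random

N_TUNE = 150_000   # tuning size from the plan
N_MODEL = 240_000  # modeling size from the plan


def tuning_indices(n_total, n_tune, seed=0):
    """Return a sorted, reproducible sample of row indices for tuning.

    Assumption: the tuning set is a random subset of the modeling set.
    """
    if n_tune > n_total:
        raise ValueError("tuning subset cannot exceed the full dataset")
    rng = random.Random(seed)  # fixed seed so reruns pick the same rows
    return sorted(rng.sample(range(n_total), n_tune))


idx = tuning_indices(N_MODEL, N_TUNE)
print(len(idx))
```

Fixing the seed keeps the tuning rows stable across runs, so it can be recorded in the metadata alongside the chosen hyperparameters.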

File structure

  1. Exploratory data analysis script
  2. Tuning script
  3. Modeling script
  4. figures/
  5. tests/

Models

  1. MLP
    • VarianceThreshold, MinMaxScaler, grid search over hidden_layer_sizes and learning_rate_init
  2. RandomForest
    • VarianceThreshold, grid search over min_samples_leaf
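
The MLP entry can be sketched as a scikit-learn pipeline, assuming the project uses scikit-learn's `Pipeline` and `GridSearchCV`. The step names, grid values, `cv=3`, and the small `load_digits` sample are all stand-ins chosen for illustration, not the project's actual data or settings.

```python
from sklearn.datasets import load_digits
from sklearn.feature_selection import VarianceThreshold
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Stand-in data: a small slice of scikit-learn's bundled digits set.
X, y = load_digits(return_X_y=True)
X, y = X[:300], y[:300]

# Drop constant pixels, rescale to [0, 1], then fit the network.
mlp_pipeline = Pipeline([
    ("variance", VarianceThreshold()),
    ("scale", MinMaxScaler()),
    ("mlp", MLPClassifier(max_iter=200, random_state=0)),
])

# Hypothetical grid; "<step>__<param>" routes values to a pipeline step.
param_grid = {
    "mlp__hidden_layer_sizes": [(16,), (32,)],
    "mlp__learning_rate_init": [0.001, 0.01],
}

grid_result = GridSearchCV(mlp_pipeline, param_grid, cv=3)
grid_result.fit(X, y)
print(grid_result.best_params_)
```

A random forest variant would presumably drop the scaling step and search `min_samples_leaf` on a `RandomForestClassifier` step instead.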

Saved data

| Contents | Filetype |
| --- | --- |
| metadata | `meta_*.json` |
| fitted estimator | `grid_result*.pkl` |
| images | `*.png` |
| log data | `log_*.md` |
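
One way the metadata and fitted estimator could be written with these filename patterns is sketched below. The `save_run` helper, the `run_id` suffix, and the metadata fields are hypothetical; the plan only specifies the patterns themselves.

```python
import json
import pickle
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def save_run(out_dir, run_id, estimator, metadata):
    """Write meta_<run_id>.json and grid_result_<run_id>.pkl to out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    meta_path = out_dir / f"meta_{run_id}.json"
    pkl_path = out_dir / f"grid_result_{run_id}.pkl"
    # Add a UTC timestamp so each metadata file records when it was saved.
    record = dict(metadata, saved_at=datetime.now(timezone.utc).isoformat())
    meta_path.write_text(json.dumps(record, indent=2))
    with pkl_path.open("wb") as fh:
        pickle.dump(estimator, fh)
    return meta_path, pkl_path


# Demo in a temporary directory; a plain dict stands in for the estimator.
with tempfile.TemporaryDirectory() as tmp:
    meta_path, pkl_path = save_run(
        tmp, "demo", {"stand_in": "estimator"}, {"n_rows": 150000}
    )
    saved_names = sorted(p.name for p in Path(tmp).iterdir())
    loaded_meta = json.loads(meta_path.read_text())

print(saved_names)
```

Sharing one `run_id` between the two files makes it easy to match a pickled estimator to the metadata that describes it.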

Future Work

  1. @ROBUSTNESS Improve checks for whether files exist
  2. @SIMPLIFY Reduce the number of config parameters in tuning.py and model.py
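
For the @ROBUSTNESS item, one possible shape for the check is sketched here with `pathlib`. The helper names `missing_files` and `require_files` are hypothetical and do not come from the existing scripts.

```python
import tempfile
from pathlib import Path


def missing_files(paths):
    """Return the paths (as strings) that are not existing regular files."""
    return [str(p) for p in map(Path, paths) if not p.is_file()]


def require_files(paths):
    """Raise one error listing every missing file, not only the first."""
    missing = missing_files(paths)
    if missing:
        raise FileNotFoundError("missing required files: " + ", ".join(missing))


# Demo: one file that exists and one that does not.
with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp) / "meta_demo.json"
    present.write_text("{}")
    absent = Path(tmp) / "grid_result_demo.pkl"
    report = missing_files([present, absent])
    try:
        require_files([present, absent])
        raised = False
    except FileNotFoundError:
        raised = True

print(report)
```

Reporting all missing files at once avoids fixing one path, rerunning, and then hitting the next missing one.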