This repository has been archived by the owner on Nov 1, 2024. It is now read-only.
Commit
Sweep code for studying model population stats (2 of 2) (#144)
Summary: This is a *major update* that introduces new functionality to pycls. The pycls codebase now supports studying *design spaces* and, more generally, *population statistics* of models as introduced in [On Network Design Spaces for Visual Recognition](https://arxiv.org/abs/1905.13214) and [Designing Network Design Spaces](https://arxiv.org/abs/2003.13678). The idea is that instead of planning a single pycls job (e.g., testing a specific model configuration), one can study the behavior of an entire population of models. This allows for powerful and succinct experimental design, and elevates the study of individual model behavior to the study of the behavior of model populations. Please see [`SWEEP_INFO`](docs/SWEEP_INFO.md) for details.

This is commit 2 of 2 for the sweep code. It is focused on sweep analysis, sweep examples, and documentation.

Pull Request resolved: #144
Reviewed By: rajprateek
Differential Revision: D28586390
Pulled By: pdollar
fbshipit-source-id: 55856f9aaf7ae49243f4870c787a144b03e5d2a9
Co-authored-by: Raj Prateek Kosaraju <[email protected]>
Co-authored-by: Piotr Dollar <[email protected]>
1 parent bd65938, commit 2d71381
Showing 12 changed files with 1,046 additions and 11 deletions.
@@ -0,0 +1,87 @@
DESC:
  Example CIFAR sweep 3 of 3 (trains the best model from cifar_regnet sweep).
  Train the best RegNet-125M from cifar_regnet sweep for variable epoch lengths.
  Trains 3 copies of every model (to obtain mean and std of the error).
  The purpose of this sweep is to show how to train FINAL version of a model.
NAME: cifar/cifar_best
SETUP:
  # Number of configs to sample
  NUM_CONFIGS: 12
  # SAMPLERS for optimization parameters
  SAMPLERS:
    OPTIM.MAX_EPOCH:
      TYPE: value_sampler
      VALUES: [50, 100, 200, 400]
    RNG_SEED:
      TYPE: int_sampler
      RAND_TYPE: uniform
      RANGE: [1, 3]
      QUANTIZE: 1
  CONSTRAINTS:
    REGNET:
      NUM_STAGES: [2, 2]
  # BASE_CFG is RegNet-125MF (best model from cifar_regnet sweep)
  BASE_CFG:
    MODEL:
      TYPE: regnet
      NUM_CLASSES: 10
    REGNET:
      STEM_TYPE: res_stem_cifar
      SE_ON: True
      STEM_W: 16
      DEPTH: 12
      W0: 96
      WA: 19.5
      WM: 2.942
      GROUP_W: 8
    OPTIM:
      BASE_LR: 1.0
      LR_POLICY: cos
      MAX_EPOCH: 50
      MOMENTUM: 0.9
      NESTEROV: True
      WARMUP_EPOCHS: 5
      WEIGHT_DECAY: 0.0005
      EMA_ALPHA: 0.00025
      EMA_UPDATE_PERIOD: 32
    BN:
      USE_CUSTOM_WEIGHT_DECAY: True
    TRAIN:
      DATASET: cifar10
      SPLIT: train
      BATCH_SIZE: 1024
      IM_SIZE: 32
      MIXED_PRECISION: True
      LABEL_SMOOTHING: 0.1
      MIXUP_ALPHA: 0.5
    TEST:
      DATASET: cifar10
      SPLIT: test
      BATCH_SIZE: 1000
      IM_SIZE: 32
    NUM_GPUS: 1
    DATA_LOADER:
      NUM_WORKERS: 4
    LOG_PERIOD: 25
    VERBOSE: False
# Launch config options
LAUNCH:
  PARTITION: devlab
  NUM_GPUS: 1
  PARALLEL_JOBS: 12
  TIME_LIMIT: 180
# Analyze config options
ANALYZE:
  PLOT_METRIC_VALUES: False
  PLOT_COMPLEXITY_VALUES: False
  PLOT_CURVES_BEST: 3
  PLOT_CURVES_WORST: 0
  PLOT_MODELS_BEST: 1
  METRICS: []
  COMPLEXITY: [flops, params, acts, memory, epoch_fw_bw, epoch_time]
  PRE_FILTERS: {done: [0, 1, 1]}
  SPLIT_FILTERS:
    epochs=050: {cfg.OPTIM.MAX_EPOCH: [ 50, 50, 50]}
    epochs=100: {cfg.OPTIM.MAX_EPOCH: [100, 100, 100]}
    epochs=200: {cfg.OPTIM.MAX_EPOCH: [200, 200, 200]}
    epochs=400: {cfg.OPTIM.MAX_EPOCH: [400, 400, 400]}
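As a quick sanity check on the cifar_best sweep above: its two samplers span 4 epoch settings and 3 seeds, so the search space holds exactly 12 distinct combinations, which matches `NUM_CONFIGS: 12`. The sketch below only enumerates that grid in plain Python. It does not use pycls, and it assumes the `int_sampler` `RANGE: [1, 3]` is inclusive at both ends (consistent with the DESC text, which says 3 copies of every model are trained).

```python
from itertools import product

# Values copied from the cifar_best SAMPLERS section.
max_epochs = [50, 100, 200, 400]  # OPTIM.MAX_EPOCH value_sampler VALUES
seed_lo, seed_hi = 1, 3           # RNG_SEED int_sampler RANGE (assumed inclusive)

# Every (epochs, seed) pair the two samplers can produce.
seeds = list(range(seed_lo, seed_hi + 1))
grid = list(product(max_epochs, seeds))

for epochs, seed in grid:
    print(f"OPTIM.MAX_EPOCH={epochs:3d} RNG_SEED={seed}")
print(len(grid), "distinct configs")
```

Each `SPLIT_FILTERS` entry in the config then corresponds to one row of this grid: a fixed `OPTIM.MAX_EPOCH` with its three seeds.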
@@ -0,0 +1,76 @@
DESC:
  Example CIFAR sweep 1 of 3 (find lr and wd for cifar_regnet and cifar_best sweeps).
  Tunes the learning rate (lr) and weight decay (wd) for ResNet-56 at 50 epochs.
  The purpose of this sweep is to show how to optimize OPTIM parameters.
NAME: cifar/cifar_optim
SETUP:
  # Number of configs to sample
  NUM_CONFIGS: 64
  # SAMPLERS for optimization parameters
  SAMPLERS:
    OPTIM.BASE_LR:
      TYPE: float_sampler
      RAND_TYPE: log_uniform
      RANGE: [0.25, 5.0]
      QUANTIZE: 1.0e-10
    OPTIM.WEIGHT_DECAY:
      TYPE: float_sampler
      RAND_TYPE: log_uniform
      RANGE: [5.0e-5, 1.0e-3]
      QUANTIZE: 1.0e-10
  # BASE_CFG is R-56 with large batch size and stronger augmentation
  BASE_CFG:
    MODEL:
      TYPE: anynet
      NUM_CLASSES: 10
    ANYNET:
      STEM_TYPE: res_stem_cifar
      STEM_W: 16
      BLOCK_TYPE: res_basic_block
      DEPTHS: [9, 9, 9]
      WIDTHS: [16, 32, 64]
      STRIDES: [1, 2, 2]
    OPTIM:
      BASE_LR: 1.0
      LR_POLICY: cos
      MAX_EPOCH: 50
      MOMENTUM: 0.9
      NESTEROV: True
      WARMUP_EPOCHS: 5
      WEIGHT_DECAY: 0.0005
      EMA_ALPHA: 0.00025
      EMA_UPDATE_PERIOD: 32
    BN:
      USE_CUSTOM_WEIGHT_DECAY: True
    TRAIN:
      DATASET: cifar10
      SPLIT: train
      BATCH_SIZE: 1024
      IM_SIZE: 32
      MIXED_PRECISION: True
      LABEL_SMOOTHING: 0.1
      MIXUP_ALPHA: 0.5
    TEST:
      DATASET: cifar10
      SPLIT: test
      BATCH_SIZE: 1000
      IM_SIZE: 32
    NUM_GPUS: 1
    DATA_LOADER:
      NUM_WORKERS: 4
    LOG_PERIOD: 25
    VERBOSE: False
# Launch config options
LAUNCH:
  PARTITION: devlab
  NUM_GPUS: 1
  PARALLEL_JOBS: 32
  TIME_LIMIT: 60
# Analyze config options
ANALYZE:
  PLOT_CURVES_BEST: 3
  PLOT_METRIC_VALUES: True
  PLOT_COMPLEXITY_VALUES: True
  METRICS: [lr, wd, lr_wd]
  COMPLEXITY: [flops, params, acts, memory, epoch_fw_bw, epoch_time]
  PRE_FILTERS: {done: [1, 1, 1]}
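The cifar_optim sweep above draws `OPTIM.BASE_LR` and `OPTIM.WEIGHT_DECAY` with `RAND_TYPE: log_uniform`, which spreads samples evenly across orders of magnitude rather than evenly across the raw range. The following is a minimal sketch of that idea in plain Python, not the pycls sampler itself: the helper name `sample_log_uniform`, the round-to-nearest-multiple reading of `QUANTIZE`, and the final clamp to the range are all assumptions made for illustration.

```python
import math
import random


def sample_log_uniform(low, high, q, rng):
    """Draw a value uniformly in log space from [low, high], rounded to a multiple of q.

    Hypothetical helper: the rounding and clamping rules are assumptions,
    not taken from the pycls source.
    """
    v = math.exp(rng.uniform(math.log(low), math.log(high)))
    v = round(v / q) * q
    # Clamp so rounding or floating-point error never leaves the range.
    return min(max(v, low), high)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Ranges, quantization steps, and the sample count are copied from the config.
samples = [
    {
        "OPTIM.BASE_LR": sample_log_uniform(0.25, 5.0, 1.0e-10, rng),
        "OPTIM.WEIGHT_DECAY": sample_log_uniform(5.0e-5, 1.0e-3, 1.0e-10, rng),
    }
    for _ in range(64)
]

for s in samples[:3]:
    print(s)
```

With a log-uniform draw, roughly as many learning rates fall in [0.25, 0.5] as in [2.5, 5.0], since both intervals span the same ratio.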
@@ -0,0 +1,78 @@
DESC:
  Example CIFAR sweep 2 of 3 (uses lr and wd found by cifar_optim sweep).
  This sweep searches for a good RegNet-125MF model on cifar (same flops as R56).
  The purpose of this sweep is to show how to optimize REGNET parameters.
NAME: cifar/cifar_regnet
SETUP:
  # Number of configs to sample
  NUM_CONFIGS: 32
  # SAMPLER for RegNet
  SAMPLERS:
    REGNET:
      TYPE: regnet_sampler
      DEPTH: [6, 16]
      GROUP_W: [1, 32]
  # CONSTRAINTS for complexity (roughly based on R-56)
  CONSTRAINTS:
    CX:
      FLOPS: [0.12e+9, 0.13e+9]
      PARAMS: [0, 2.0e+6]
      ACTS: [0, 1.0e+6]
    REGNET:
      NUM_STAGES: [2, 2]
  # BASE_CFG is R-56 with large batch size and stronger augmentation
  BASE_CFG:
    MODEL:
      TYPE: regnet
      NUM_CLASSES: 10
    REGNET:
      STEM_TYPE: res_stem_cifar
      SE_ON: True
      STEM_W: 16
    OPTIM:
      BASE_LR: 1.0
      LR_POLICY: cos
      MAX_EPOCH: 50
      MOMENTUM: 0.9
      NESTEROV: True
      WARMUP_EPOCHS: 5
      WEIGHT_DECAY: 0.0005
      EMA_ALPHA: 0.00025
      EMA_UPDATE_PERIOD: 32
    BN:
      USE_CUSTOM_WEIGHT_DECAY: True
    TRAIN:
      DATASET: cifar10
      SPLIT: train
      BATCH_SIZE: 1024
      IM_SIZE: 32
      MIXED_PRECISION: True
      LABEL_SMOOTHING: 0.1
      MIXUP_ALPHA: 0.5
    TEST:
      DATASET: cifar10
      SPLIT: test
      BATCH_SIZE: 1000
      IM_SIZE: 32
    NUM_GPUS: 1
    DATA_LOADER:
      NUM_WORKERS: 4
    LOG_PERIOD: 25
    VERBOSE: False
# Launch config options
LAUNCH:
  PARTITION: devlab
  NUM_GPUS: 1
  PARALLEL_JOBS: 32
  TIME_LIMIT: 60
# Analyze config options
ANALYZE:
  PLOT_METRIC_VALUES: True
  PLOT_COMPLEXITY_VALUES: True
  PLOT_CURVES_BEST: 3
  PLOT_CURVES_WORST: 0
  PLOT_MODELS_BEST: 8
  PLOT_MODELS_WORST: 0
  METRICS: [regnet_depth, regnet_w0, regnet_wa, regnet_wm, regnet_gw]
  COMPLEXITY: [flops, params, acts, memory, epoch_fw_bw, epoch_time]
  PRE_FILTERS: {done: [0, 1, 1]}
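The `CONSTRAINTS.CX` block in the cifar_regnet sweep above restricts sampled models to a narrow complexity band (about 0.12 to 0.13 GFLOPs, at most 2.0M parameters and 1.0M activations). A rough sketch of how such a check could work is shown below. It is not the pycls implementation: the function name `satisfies_constraints`, the lowercase dictionary keys, the inclusive bounds, and the candidate complexity values are all hypothetical and chosen only to illustrate the filtering step.

```python
# Ranges copied from the cifar_regnet CONSTRAINTS.CX section.
CX_CONSTRAINTS = {
    "flops": (0.12e9, 0.13e9),
    "params": (0, 2.0e6),
    "acts": (0, 1.0e6),
}


def satisfies_constraints(cx, constraints=CX_CONSTRAINTS):
    """Return True if every constrained complexity value lies in its [low, high] range.

    Bounds are treated as inclusive, which is an assumption of this sketch.
    """
    return all(lo <= cx[key] <= hi for key, (lo, hi) in constraints.items())


# Made-up complexity values for illustration only (not measured from any model).
candidate_ok = {"flops": 0.125e9, "params": 1.5e6, "acts": 0.8e6}
candidate_too_big = {"flops": 0.20e9, "params": 1.5e6, "acts": 0.8e6}

print(satisfies_constraints(candidate_ok))       # True
print(satisfies_constraints(candidate_too_big))  # False
```

A sampler built this way would keep drawing candidates and discard those that fail the check until it has collected `NUM_CONFIGS` valid ones.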