Learn rate scheduler #207
Conversation
Looks good. I wonder how robust the threshold settings will be to choice of ALMA dataset?
Yeah, for now the scheduler has a threshold that's a relative factor of the loss, so I think it should be pretty flexible. The schedule factor I'm less certain will be flexible, but because of that I made it an arg of `TrainTest`.
NOTE: This PR should only be reviewed after #206 is merged into `main` and `main` is merged into this branch (it was branched from there).

Choice and defaults of scheduler:
Of the several `torch.optim.lr_scheduler` schedulers, `ReduceLROnPlateau` is one of the few that updates the learning rate according to some metric (doc here, not well-written), rather than just at some user-supplied number of epochs (which would not be at all general). I use this scheduler and give it the loss as the metric.

The scheduler has a threshold below which it judges the metric to no longer be changing, triggering a decrease in the learning rate; I keep this threshold at the default of 1e-4. For the factor by which to reduce the learning rate, I found 0.995 is a good choice (it reduces the learning rate to 99.5% of its previous value); the factor is an arg in the `TrainTest` class. This choice (for the 1 DSHARP dataset I tested) keeps gradually reducing the brightness scale of the gradient image after the loss has plateaued by eye, while avoiding transient spikes in the loss at large iteration. The learning rate update is done at each iteration in the training loop, after `optimizer.step()`.
Because the scheduler gradually improves the gradient image even when the loss appears to plateau, I've also tested tightening the convergence tolerance for the loss in the training loop. I've found the best result by setting the tolerance to 1 part in 10^5 (the loss must change by less than this for 10 consecutive iterations to be considered converged; previously 1 part in 10^3). This tolerance is an arg in `TrainTest`.
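For reference, a sketch of what such a tolerance check could look like (the names `tol`, `n_converge`, and `converged` are hypothetical, not the `TrainTest` API):

```python
# Stopping criterion described above: stop once the relative change in the
# loss stays below `tol` for `n_converge` consecutive iterations.
tol = 1e-5       # loss must change by less than 1 part in 10^5 (previously 1e-3)
n_converge = 10  # ...for this many consecutive iterations

def converged(losses, tol=tol, n_converge=n_converge):
    """Return True once the last `n_converge` relative loss changes are all below `tol`."""
    if len(losses) < n_converge + 1:
        return False
    recent = losses[-(n_converge + 1):]
    changes = [abs(recent[i + 1] - recent[i]) / abs(recent[i]) for i in range(n_converge)]
    return all(c < tol for c in changes)
```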