A replayable fit() method - diff/patch attached #305

Open
srimalj opened this issue Nov 3, 2016 · 1 comment

srimalj commented Nov 3, 2016

Hi Daniel

First of all, nice library. Thanks.

I am training a large job on a remote supercomputing cluster which gives me fixed time limits for each job. Once the time limit is up, the job terminates.

Simply saving the learned parameters at the end using save_params_to() does not help when the job terminates before fit() is completed. All learned parameters are lost and training has to be done from the beginning.

So I modified the fit() method to save the learned parameters periodically, every given number of epochs. If the job is terminated before fit() completes, fit() can be invoked again by re-running the same job. It then resumes training where it stopped (a warm start) by loading the saved parameters using load_params_from().
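A minimal sketch of the resume step, not the attached patch itself: before calling fit(), check whether a checkpoint file exists and load it if so. The helper name `warm_start` and the `StubNet` class are hypothetical; `StubNet` only stands in for a network object that offers save_params_to() and load_params_from(), so the example runs without the library installed.

```python
import os
import pickle


class StubNet:
    """Stand-in for the real network; mimics only the two methods used here."""

    def __init__(self):
        self.params = None

    def save_params_to(self, path):
        # Write the current parameters to disk.
        with open(path, "wb") as f:
            pickle.dump(self.params, f)

    def load_params_from(self, path):
        # Read parameters back from disk.
        with open(path, "rb") as f:
            self.params = pickle.load(f)


def warm_start(net, path):
    """Restore parameters from `path` if a checkpoint exists.

    Returns True when parameters were loaded, False for a cold start.
    """
    if os.path.exists(path):
        net.load_params_from(path)
        return True
    return False
```

With something like this in the job script, re-submitting the same job either starts from scratch (no checkpoint yet) or picks up the last saved parameters before fit() is called.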

Attached is a diff/patch of what I did.

patch-replay.txt

If you find it useful, feel free to upstream the code, or modify it as you see fit.

Would be happy to contribute more at some point.

Cheers

Srimal.

Collaborator

BenjaminBossan commented Nov 5, 2016

Hi Srimal,

Thank you for your contribution. I did not look at your code, but it seems to me that what you described is already possible with the current version of nolearn. Have a look at this callback, which allows you to save the model weights to a file every n epochs. Just add an instance of SaveWeights to on_epoch_finished.
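To illustrate the idea, here is a hand-rolled sketch of such a callback. It is not the library's own SaveWeights; the class name `SaveEveryN` and its arguments are hypothetical, and it assumes that entries in on_epoch_finished are called as `callback(nn, train_history)` with one history entry per finished epoch.

```python
class SaveEveryN:
    """Hypothetical callback: save the parameters every `n` epochs."""

    def __init__(self, path, n=10):
        self.path = path
        self.n = n

    def __call__(self, nn, train_history):
        # Assumption: train_history holds one entry per completed epoch.
        epoch = len(train_history)
        if epoch % self.n == 0:
            nn.save_params_to(self.path)
```

An instance would then be passed in the list given to on_epoch_finished, for example `on_epoch_finished=[SaveEveryN('weights.pkl', n=10)]`.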

Good luck
