Altering other training parameters when changing batch size #2643

Open
oliverjm1 opened this issue Dec 13, 2024 · 0 comments
I've seen a few other open issues related to batch size increases, but this one isn't about troubleshooting, so I thought it warranted a separate issue. I'm training on multiple GPUs on an HPC and have been experimenting with increasing the batch size by editing the nnUNetv2 plans file. I'm seeing a linear increase in epoch time, which you'd expect since the number of iterations per epoch doesn't change, so each epoch processes proportionally more samples.
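For reference, the batch size change is just an edit to the plans file along these lines (the dataset name and path are placeholders, and the "configurations" → "3d_fullres" → "batch_size" keys are my reading of the nnUNetPlans.json layout, so double-check against your own nnUNet_preprocessed folder):

```python
import json
from pathlib import Path

# Hypothetical dataset/path; adjust to your own nnUNet_preprocessed layout.
plans_path = Path("nnUNet_preprocessed/Dataset001_Example/nnUNetPlans.json")

plans = json.loads(plans_path.read_text())
# Bump the batch size for the configuration being trained (here 3d_fullres).
plans["configurations"]["3d_fullres"]["batch_size"] = 8
plans_path.write_text(json.dumps(plans, indent=2))
```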

It is a little annoying that the benefits of increasing the batch size are offset by the linearly increasing total training time: with default settings, my total training time is around 30 hours at a batch size of 2, so scaling to a batch size of 8 pushes it to roughly 5 days. Are there any recommendations for how the number of iterations per epoch, the total number of epochs, or the learning rate should be adjusted when using a custom batch size? A rough sketch of one option I've considered is included below.
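One option I've considered is a custom trainer that rescales these values from the batch size: keep the number of samples seen per epoch roughly constant (so epoch time stays flat) and scale the learning rate linearly with batch size. To be clear, the linear LR scaling is just the common heuristic from Goyal et al. (2017), not something nnU-Net recommends, and the attribute names (`num_iterations_per_epoch`, `num_val_iterations_per_epoch`, `initial_lr`, `configuration_manager.batch_size`) are what I believe nnUNetTrainer uses, so they should be verified against the installed version:

```python
import torch
from nnunetv2.training.nnUNetTrainer.nnUNetTrainer import nnUNetTrainer


class nnUNetTrainer_LargeBatch(nnUNetTrainer):
    # Batch size the default schedule was presumably tuned for.
    REFERENCE_BATCH_SIZE = 2

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

        # Assumed accessor for the batch size read from the plans file.
        batch_size = self.configuration_manager.batch_size
        scale = batch_size / self.REFERENCE_BATCH_SIZE

        # Keep samples per epoch roughly constant so epoch time does not
        # grow linearly with batch size.
        self.num_iterations_per_epoch = max(1, round(self.num_iterations_per_epoch / scale))
        self.num_val_iterations_per_epoch = max(1, round(self.num_val_iterations_per_epoch / scale))

        # Linear LR scaling heuristic; a starting point rather than a rule.
        self.initial_lr = self.initial_lr * scale
```

If I understand the trainer mechanism correctly, something like this could then be selected via the `-tr` argument of `nnUNetv2_train`, but I'd appreciate confirmation on whether this kind of rescaling is sensible at all.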
