✨ DataParallel and DistributedDataParallel to speed up training. #43
DataParallel: multiple threads for a single machine with multiple GPUs. Here is the code (change .txt to .py): the kernel code is FullModel, which writes the loss function into the model to solve the memory usage imbalance problem (a sketch of such a wrapper follows below).

DistributedDataParallel: multiple processes for a single machine or multiple machines with multiple GPUs. Here is the code (change .txt to .py): the kernel code is also FullModel.
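The attached kernel code is not reproduced here, so the following is only a minimal sketch of what such a FullModel wrapper typically looks like; the class name matches the issue, but the constructor signature and return format are assumptions:

```python
import torch.nn as nn

class FullModel(nn.Module):
    """Wrap a network and its loss so the loss is computed on each GPU.

    With plain DataParallel the outputs of all replicas are gathered onto
    GPU 0 before the loss is computed, so GPU 0 runs out of memory first.
    Computing the loss inside forward() keeps that work distributed.
    """

    def __init__(self, model, loss_fn):
        super().__init__()
        self.model = model
        self.loss_fn = loss_fn

    def forward(self, inputs, targets):
        outputs = self.model(inputs)
        loss = self.loss_fn(outputs, targets)
        # Return the loss with a leading dimension so DataParallel can
        # concatenate the per-replica losses; the caller averages them.
        return loss.unsqueeze(0), outputs
```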
It is very easy to add DataParallel into the code, but DataParallel brings less speedup.
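For reference, wrapping the model in DataParallel is essentially a one-line change. This is a hedged sketch that reuses the FullModel sketch above; the stand-in model, loss, and tensors are placeholders, not the repository's code:

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(10, 10))           # stand-in model
criterion = nn.MSELoss()                         # stand-in loss

full_model = FullModel(net, criterion)           # loss lives inside the module
full_model = nn.DataParallel(full_model).cuda()  # replicate across all visible GPUs

optimizer = torch.optim.SGD(full_model.parameters(), lr=0.01)

inputs = torch.randn(32, 10).cuda()
targets = torch.randn(32, 10).cuda()

loss, outputs = full_model(inputs, targets)      # one loss value per GPU replica
loss = loss.mean()                               # average the per-replica losses
optimizer.zero_grad()
loss.backward()
optimizer.step()
```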
DistributedDataParallel is a little tricky to use because it needs to be started from the command line, but it gives a significant speedup with 4 GPUs on a single machine when GPU memory usage is high.
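A hedged sketch of the single-machine DistributedDataParallel setup follows; it uses the standard torch.distributed pattern (one process per GPU, NCCL backend), not necessarily the attached script, and the file name in the launch command is hypothetical:

```python
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # The launcher (torchrun) sets LOCAL_RANK for each spawned process.
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    dist.init_process_group(backend="nccl")

    net = nn.Sequential(nn.Linear(10, 10)).cuda(local_rank)  # stand-in model
    net = DDP(net, device_ids=[local_rank])

    criterion = nn.MSELoss()
    optimizer = torch.optim.SGD(net.parameters(), lr=0.01)

    # In real training, use a DistributedSampler so each process
    # sees a distinct shard of the dataset.
    inputs = torch.randn(32, 10).cuda(local_rank)
    targets = torch.randn(32, 10).cuda(local_rank)

    loss = criterion(net(inputs), targets)
    optimizer.zero_grad()
    loss.backward()   # gradients are all-reduced across processes here
    optimizer.step()

if __name__ == "__main__":
    main()
```

Launched from the command line with one process per GPU, e.g.:

```
torchrun --nproc_per_node=4 train_ddp.py
```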