v0.2.0

SeaOfOcean released this 27 Aug 04:50

What's Changed

Support Llama 2 based on the official Megatron-LM repo
Refactor tutorial docs and add tools to convert Megatron checkpoints to Hugging Face format
Fix parameter sync when src_pipe != tgt_pipe and tgt_pipe != 1
Reduce the number of ports required
Refine resume training and doc
Add node address to error message and exit with error code
Show the log in each worker node and refine docs
Add docs for continuing training and check applied device
Join log thread with timeout and trigger when the process exits
Support custom model flow
Feat: support optimizer offload
Doc: add FAQ

Full Changelog: v0.1.0...v0.2.0
