Releases: alibaba/ChatLearn
v1.0.2
What's Changed
- Hotfix: update parameters that are reordered/concatenated in vLLM parameter sync by @SeaOfOcean in #120
Full Changelog: v1.0.1...v1.0.2
v1.0.1
What's Changed
- [doc] refine performance image size by @SeaOfOcean in #39
- Update issue templates by @SeaOfOcean in #40
- Add unit test and daily build by @SeaOfOcean in #41
- raise error if parameter sync breaks by @charles9304 in #42
- Refine the description of docs. by @adoda in #45
- Trigger UT when the pull request is approved and the approval count is 2 by @SeaOfOcean in #52
- fix none src model and skip load ckpt for vllm by @stcy07 in #50
- refine log output for trainer model by @stcy07 in #49
- make the *penalty options of sampling_params configurable. by @charles9304 in #54
- set env concurrent. by @adoda in #55
- Refine vllm inference and keep the API same as non-vllm by @SeaOfOcean in #46
- Speedup ut && format. by @adoda in #57
- Refine efficient memory sharing by @SeaOfOcean in #58
- fix vllm_module InferenceMemoryManager args error by @SeaOfOcean in #61
- [UT] rm duplicate ray stop by @SeaOfOcean in #62
- fix onload offload in save_checkpoint by @SeaOfOcean in #63
- fix exit with log_monitor error by @SeaOfOcean in #60
- disable onload/offload when not colocated. by @charles9304 in #65
- Parameter sync fallback to P2P when TP size is odd by @SeaOfOcean in #64
- fix cpu_per_process and gpu_per_process when num_gpu/num_cpu is 1 by @SeaOfOcean in #67
- Reverse DP replicas in parameter sync when tp size is odd by @SeaOfOcean in #68
- Upload Python Package when release is published by @SeaOfOcean in #69
- stop the container from the previous run when running UT by @SeaOfOcean in #73
- Support get tp/pp for torch_module/deepspeed_module and fix ut. by @adoda in #72
- Add DingTalk group to README. by @adoda in #74
- fix policy generation OOM when continuing training by @SeaOfOcean in #77
- Increase the num of episodes to allow the model to converge more fully by @adoda in #76
- set build time to 00:30 am UTC+8 by @SeaOfOcean in #75
- feat: add and use a multi-thread tokenize tool in VLLMPromptPipeline by @stcy07 in #56
- add load ckpt for value model and warnings by @stcy07 in #78
- Be compatible with grouped-query attention for QWen2. by @charles9304 in #79
- fix missing import in example by @SeaOfOcean in #80
- Upgrade version number by @SeaOfOcean in #81
- Revert "fix exit with log_monitor error (#60)" by @SeaOfOcean in #82
- fix dp_rank not in dp2send_actors when the inference replica num is less than the training replica num by @SeaOfOcean in #83
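One change above makes the *penalty options of sampling_params configurable. As background on what a repetition penalty does, here is a generic pure-Python sketch (illustrative only, not ChatLearn's or vLLM's implementation; the function name is hypothetical):

```python
def apply_repetition_penalty(logits, generated_ids, penalty):
    """CTRL-style repetition penalty: for tokens already generated,
    divide positive logits by the penalty and multiply negative logits
    by it, making repeats less likely in both cases."""
    out = list(logits)
    for tok in set(generated_ids):
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out

logits = [2.0, -1.0, 0.5]
# Tokens 0 and 1 were already generated; token 2 is untouched.
penalized = apply_repetition_penalty(logits, [0, 1], penalty=2.0)
# token 0: 2.0 -> 1.0, token 1: -1.0 -> -2.0
```

A penalty of 1.0 is a no-op; values above 1.0 discourage repetition.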
New Contributors
- @charles9304 made their first contribution in #42
- @adoda made their first contribution in #45
- @stcy07 made their first contribution in #50
Full Changelog: v1.0.0...v1.0.1
v1.0.0
What's Changed
- Support vLLM as generation engine.
- Support custom flow.
- Support efficient memory sharing.
- Support CPU module.
- Add Llama2 DPO/OnlineDPO/GRPO example based on Megatron-LM.
- Add QWen2 DPO example based on DeepSpeed.
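The DPO examples listed above optimize the standard Direct Preference Optimization objective. A minimal sketch of the per-pair loss (a generic illustration using only the published formula, not ChatLearn's code; all names are hypothetical):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair:
    -log sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r))),
    where pi_* are policy log-probs and ref_* are reference log-probs."""
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Policy prefers the chosen response more than the reference does,
# so the loss drops below -log(0.5) ~= 0.693.
loss = dpo_loss(-10.0, -12.0, -11.0, -11.0, beta=0.5)
```

beta controls how strongly the policy is pulled away from the reference model; the implicit-reward margin replaces an explicit reward model.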
Full Changelog: v0.2.0...v1.0.0
v0.2.0
What's Changed
- Support Llama2 based on the official Megatron-LM repo
- Refactor tutorial docs and add tools to convert Megatron ckpt to HF
- Fix parameter sync when src_pipe != tgt_pipe and tgt_pipe != 1
- Reduce the number of ports required
- Refine resume training and doc
- Add node address to error message and exit with error code
- Show the log in each worker node and refine docs
- Add continue train docs and check applied device
- Join log thread with timeout and trigger when process exit
- Support custom model flow
- Feat: support optimizer offload
- Doc: add faq
Full Changelog: v0.1.0...v0.2.0
v0.1.0
First release of ChatLearn
- Enable RLHF with Megatron-LM.
- Support Llama/GPT/Bloom/Baichuan models with SFT / Reward / RLHF.
- Add docs.
- Support resume training.