Releases: modelscope/dash-infer
v2.0.0-rc3
Bug fixes and other changes:
- fix UUID crash issue
- update LoRA implementation
- set page size via parameter
- delete deprecated files
v2.0.0-rc2
release script: reduce python wheel size (#46)
v1.3.0
Highlight
- Support Baichuan-7B and Baichuan2-7B & 13B by @WangNorthSea in #38
Full Changelog: v1.2.1...v1.3.0
v1.2.1
v1.2.0
Expand context length to 32K and support flash attention on the Intel AVX-512 platform
- remove currently unsupported cache mode
- examples: update qwen prompt template, add print func to examples
- support glm-4-9b-chat by
- change to size_t to avoid overflow when the sequence is long
- update README since we support 32k context length
- add flash attention on the Intel AVX-512 platform
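
To illustrate why the size_t change above matters at long sequence lengths, here is a minimal sketch. It is not the project's code: the helper names `element_count_32` and `element_count`, and the `[batch, seq_len, hidden]` buffer shape, are hypothetical, and a 64-bit `size_t` is assumed. It shows how an element count computed in 32-bit arithmetic wraps around, while the same product in `size_t` does not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from dash-infer): element count of a
// [batch, seq_len, hidden] buffer in 32-bit unsigned arithmetic.
// The product wraps modulo 2^32 once it exceeds 4294967295.
constexpr std::uint32_t element_count_32(std::uint32_t batch,
                                         std::uint32_t seq_len,
                                         std::uint32_t hidden) {
    return batch * seq_len * hidden;
}

// Same computation in size_t. Assumption: size_t is 64 bits wide,
// as on typical x86-64 and AArch64 targets.
constexpr std::size_t element_count(std::size_t batch,
                                    std::size_t seq_len,
                                    std::size_t hidden) {
    return batch * seq_len * hidden;
}

// Illustrative shape only: 48 sequences of 32K tokens, hidden size 4096.
constexpr std::uint32_t kWrapped = element_count_32(48, 32768, 4096);
constexpr std::size_t kExact = element_count(48, 32768, 4096);
```

With these illustrative numbers the true count is 48 × 2^27, which does not fit in 32 bits, so `kWrapped` and `kExact` differ. Any buffer size or offset derived from the wrapped value would be too small.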