Releases: modelscope/dash-infer

v2.0.0-rc3

20 Dec 13:10
Bug fixes

- Fix a UUID-related crash
- Update the LoRA implementation
- Allow setting the page size via a parameter
- Delete deprecated files

v2.0.0-rc2

17 Dec 12:29
Release script: reduce Python wheel size (#46)

v1.3.0

27 Aug 03:33

Highlight

Full Changelog: v1.2.1...v1.3.0

v1.2.1

01 Jul 03:28
5ceddf9

What's Changed

  • Add llama.cpp benchmark steps
  • Fix: fall back to MHA when AVX-512F is not supported (see the sketch after this list)
  • Resolve a security issue; helper: bug fixes and a CPU platform check
  • Add a release-package workflow
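A minimal sketch of the fallback idea, assuming a Linux x86_64 host; `cpu_supports_avx512f` and `choose_attention_backend` are hypothetical names used for illustration here, not DashInfer APIs:

```python
import platform
from pathlib import Path


def cpu_supports_avx512f() -> bool:
    """Best-effort check for the AVX-512F CPU flag (Linux only)."""
    if platform.system() != "Linux":
        return False
    try:
        cpuinfo = Path("/proc/cpuinfo").read_text()
    except OSError:
        return False
    return "avx512f" in cpuinfo


def choose_attention_backend() -> str:
    # Hypothetical selection logic: use the AVX-512 flash attention kernel
    # only when the CPU advertises AVX-512F, otherwise fall back to plain MHA.
    return "flash_attention" if cpu_supports_avx512f() else "mha"


print(choose_attention_backend())
```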

v1.2.0

24 Jun 05:32
3a0417b

Expand context length to 32K and support flash attention on the Intel AVX-512 platform

  • Remove the currently unsupported cache mode
  • Examples: update the Qwen prompt template; add a print function to the examples
  • Support glm-4-9b-chat
  • Switch to size_t to avoid overflow when the sequence is long (see the sketch after this list)
  • Update the README now that 32K context length is supported
  • Add flash attention on the Intel AVX-512 platform
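For a sense of why 32-bit indexing breaks at 32K context, consider a per-head attention score buffer of seq_len × seq_len elements: its byte count already exceeds what a 32-bit size can represent. A minimal arithmetic check with illustrative numbers, not DashInfer code:

```python
SEQ_LEN = 32 * 1024        # 32K context length
BYTES_PER_ELEM = 4         # fp32 attention scores

# Per-head attention score buffer: seq_len x seq_len elements.
num_elements = SEQ_LEN * SEQ_LEN            # 1,073,741,824
num_bytes = num_elements * BYTES_PER_ELEM   # 4,294,967,296

print(num_bytes > 2**31 - 1)   # True: overflows a signed 32-bit size
print(num_bytes > 2**32 - 1)   # True: overflows an unsigned 32-bit size too
# A 64-bit size type (size_t on 64-bit platforms) represents this without overflow.
```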

v1.1.0

29 May 08:32

Support Qwen2; change DashInfer model file extensions

  • Support Qwen2; add model_type Qwen_v20
  • Change DashInfer model file extensions (.asgraph, .asparam -> .dimodel, .ditensors); a migration sketch follows this list
  • Python example: remove the xxx_quantize.json config file; use a command-line argument instead
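A minimal migration sketch for existing model directories, assuming the change is purely a file-extension rename; `migrate_model_files` and the directory path are hypothetical and not part of the DashInfer tooling:

```python
from pathlib import Path

# Legacy extensions (pre-v1.1.0) mapped to the new ones.
EXTENSION_MAP = {".asgraph": ".dimodel", ".asparam": ".ditensors"}


def migrate_model_files(model_dir: str) -> None:
    """Rename legacy DashInfer model files in place to the new extensions."""
    for path in Path(model_dir).iterdir():
        new_suffix = EXTENSION_MAP.get(path.suffix)
        if new_suffix is not None:
            path.rename(path.with_suffix(new_suffix))


if __name__ == "__main__":
    migrate_model_files("./models/qwen2-7b")  # hypothetical model directory
```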

v1.0.4

14 May 05:50

First official release.