Supported Models and Datasets

The tables below list the models integrated with ms-swift (a brief loading sketch follows the column descriptions):

  • Model ID: Model ID of the model on ModelScope
  • HF Model ID: Model ID of the model on Hugging Face
  • Model Type: Model type name registered in ms-swift
  • Default Template: Default chat template
  • Requires: Additional dependencies required to use the model
  • Tags: Tags associated with the model
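
The Model ID column is what you pass to ModelScope to download the weights, and the Requires column lists the extra dependencies (for example, transformers>=4.37 for qwen2_5 models) that must be satisfied first. The following is a minimal, illustrative sketch — not part of ms-swift itself — that fetches one of the listed models by its ModelScope Model ID and loads it with transformers; the chosen model and generation settings are only examples.

```python
# Illustrative sketch (not ms-swift's own API): download a model from the table by
# its ModelScope "Model ID" and load it with transformers. The "Requires" column for
# qwen2_5 models asks for transformers>=4.37, so a recent transformers is assumed.
from modelscope import snapshot_download
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-7B-Instruct"       # "Model ID" column (ModelScope)
model_dir = snapshot_download(model_id)     # downloads into the local ModelScope cache

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir, torch_dtype="auto", device_map="auto"
)

inputs = tokenizer("Hello!", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The HF Model ID column gives the corresponding Hugging Face repository for users who prefer to download from the Hugging Face Hub instead.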

Large Language Models

Model ID Model Type Default Template Requires Tags HF Model ID
Qwen/Qwen-1_8B-Chat qwen qwen - - Qwen/Qwen-1_8B-Chat
Qwen/Qwen-7B-Chat qwen qwen - - Qwen/Qwen-7B-Chat
Qwen/Qwen-14B-Chat qwen qwen - - Qwen/Qwen-14B-Chat
Qwen/Qwen-72B-Chat qwen qwen - - Qwen/Qwen-72B-Chat
Qwen/Qwen-1_8B qwen qwen - - Qwen/Qwen-1_8B
Qwen/Qwen-7B qwen qwen - - Qwen/Qwen-7B
Qwen/Qwen-14B qwen qwen - - Qwen/Qwen-14B
Qwen/Qwen-72B qwen qwen - - Qwen/Qwen-72B
Qwen/Qwen-1_8B-Chat-Int4 qwen qwen - - Qwen/Qwen-1_8B-Chat-Int4
Qwen/Qwen-7B-Chat-Int4 qwen qwen - - Qwen/Qwen-7B-Chat-Int4
Qwen/Qwen-14B-Chat-Int4 qwen qwen - - Qwen/Qwen-14B-Chat-Int4
Qwen/Qwen-72B-Chat-Int4 qwen qwen - - Qwen/Qwen-72B-Chat-Int4
Qwen/Qwen-1_8B-Chat-Int8 qwen qwen - - Qwen/Qwen-1_8B-Chat-Int8
Qwen/Qwen-7B-Chat-Int8 qwen qwen - - Qwen/Qwen-7B-Chat-Int8
Qwen/Qwen-14B-Chat-Int8 qwen qwen - - Qwen/Qwen-14B-Chat-Int8
Qwen/Qwen-72B-Chat-Int8 qwen qwen - - Qwen/Qwen-72B-Chat-Int8
TongyiFinance/Tongyi-Finance-14B-Chat qwen qwen - financial jxy/Tongyi-Finance-14B-Chat
TongyiFinance/Tongyi-Finance-14B qwen qwen - financial -
TongyiFinance/Tongyi-Finance-14B-Chat-Int4 qwen qwen - financial jxy/Tongyi-Finance-14B-Chat-Int4
Qwen/Qwen1.5-0.5B-Chat qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-0.5B-Chat
Qwen/Qwen1.5-1.8B-Chat qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-1.8B-Chat
Qwen/Qwen1.5-4B-Chat qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-4B-Chat
Qwen/Qwen1.5-7B-Chat qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-7B-Chat
Qwen/Qwen1.5-14B-Chat qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-14B-Chat
Qwen/Qwen1.5-32B-Chat qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-32B-Chat
Qwen/Qwen1.5-72B-Chat qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-72B-Chat
Qwen/Qwen1.5-110B-Chat qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-110B-Chat
Qwen/Qwen1.5-0.5B qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-0.5B
Qwen/Qwen1.5-1.8B qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-1.8B
Qwen/Qwen1.5-4B qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-4B
Qwen/Qwen1.5-7B qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-7B
Qwen/Qwen1.5-14B qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-14B
Qwen/Qwen1.5-32B qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-32B
Qwen/Qwen1.5-72B qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-72B
Qwen/Qwen1.5-110B qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-110B
Qwen/Qwen1.5-0.5B-Chat-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-0.5B-Chat-GPTQ-Int4
Qwen/Qwen1.5-1.8B-Chat-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-1.8B-Chat-GPTQ-Int4
Qwen/Qwen1.5-4B-Chat-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-4B-Chat-GPTQ-Int4
Qwen/Qwen1.5-7B-Chat-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-7B-Chat-GPTQ-Int4
Qwen/Qwen1.5-14B-Chat-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-14B-Chat-GPTQ-Int4
Qwen/Qwen1.5-32B-Chat-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-32B-Chat-GPTQ-Int4
Qwen/Qwen1.5-72B-Chat-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-72B-Chat-GPTQ-Int4
Qwen/Qwen1.5-110B-Chat-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-110B-Chat-GPTQ-Int4
Qwen/Qwen1.5-0.5B-Chat-GPTQ-Int8 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-0.5B-Chat-GPTQ-Int8
Qwen/Qwen1.5-1.8B-Chat-GPTQ-Int8 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-1.8B-Chat-GPTQ-Int8
Qwen/Qwen1.5-4B-Chat-GPTQ-Int8 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-4B-Chat-GPTQ-Int8
Qwen/Qwen1.5-7B-Chat-GPTQ-Int8 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-7B-Chat-GPTQ-Int8
Qwen/Qwen1.5-14B-Chat-GPTQ-Int8 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-14B-Chat-GPTQ-Int8
Qwen/Qwen1.5-72B-Chat-GPTQ-Int8 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-72B-Chat-GPTQ-Int8
Qwen/Qwen1.5-0.5B-Chat-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-0.5B-Chat-AWQ
Qwen/Qwen1.5-1.8B-Chat-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-1.8B-Chat-AWQ
Qwen/Qwen1.5-4B-Chat-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-4B-Chat-AWQ
Qwen/Qwen1.5-7B-Chat-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-7B-Chat-AWQ
Qwen/Qwen1.5-14B-Chat-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-14B-Chat-AWQ
Qwen/Qwen1.5-32B-Chat-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-32B-Chat-AWQ
Qwen/Qwen1.5-72B-Chat-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-72B-Chat-AWQ
Qwen/Qwen1.5-110B-Chat-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-110B-Chat-AWQ
Qwen/CodeQwen1.5-7B qwen2 qwen transformers>=4.37 coding Qwen/CodeQwen1.5-7B
Qwen/CodeQwen1.5-7B-Chat qwen2 qwen transformers>=4.37 coding Qwen/CodeQwen1.5-7B-Chat
Qwen/CodeQwen1.5-7B-Chat-AWQ qwen2 qwen transformers>=4.37 coding Qwen/CodeQwen1.5-7B-Chat-AWQ
Qwen/Qwen2-0.5B-Instruct qwen2 qwen transformers>=4.37 - Qwen/Qwen2-0.5B-Instruct
Qwen/Qwen2-1.5B-Instruct qwen2 qwen transformers>=4.37 - Qwen/Qwen2-1.5B-Instruct
Qwen/Qwen2-7B-Instruct qwen2 qwen transformers>=4.37 - Qwen/Qwen2-7B-Instruct
Qwen/Qwen2-72B-Instruct qwen2 qwen transformers>=4.37 - Qwen/Qwen2-72B-Instruct
Qwen/Qwen2-0.5B qwen2 qwen transformers>=4.37 - Qwen/Qwen2-0.5B
Qwen/Qwen2-1.5B qwen2 qwen transformers>=4.37 - Qwen/Qwen2-1.5B
Qwen/Qwen2-7B qwen2 qwen transformers>=4.37 - Qwen/Qwen2-7B
Qwen/Qwen2-72B qwen2 qwen transformers>=4.37 - Qwen/Qwen2-72B
Qwen/Qwen2-0.5B-Instruct-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen2-0.5B-Instruct-GPTQ-Int4
Qwen/Qwen2-1.5B-Instruct-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen2-1.5B-Instruct-GPTQ-Int4
Qwen/Qwen2-7B-Instruct-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen2-7B-Instruct-GPTQ-Int4
Qwen/Qwen2-72B-Instruct-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen2-72B-Instruct-GPTQ-Int4
Qwen/Qwen2-0.5B-Instruct-GPTQ-Int8 qwen2 qwen transformers>=4.37 - Qwen/Qwen2-0.5B-Instruct-GPTQ-Int8
Qwen/Qwen2-1.5B-Instruct-GPTQ-Int8 qwen2 qwen transformers>=4.37 - Qwen/Qwen2-1.5B-Instruct-GPTQ-Int8
Qwen/Qwen2-7B-Instruct-GPTQ-Int8 qwen2 qwen transformers>=4.37 - Qwen/Qwen2-7B-Instruct-GPTQ-Int8
Qwen/Qwen2-72B-Instruct-GPTQ-Int8 qwen2 qwen transformers>=4.37 - Qwen/Qwen2-72B-Instruct-GPTQ-Int8
Qwen/Qwen2-0.5B-Instruct-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen2-0.5B-Instruct-AWQ
Qwen/Qwen2-1.5B-Instruct-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen2-1.5B-Instruct-AWQ
Qwen/Qwen2-7B-Instruct-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen2-7B-Instruct-AWQ
Qwen/Qwen2-72B-Instruct-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen2-72B-Instruct-AWQ
Qwen/Qwen2-Math-1.5B-Instruct qwen2 qwen transformers>=4.37 math Qwen/Qwen2-Math-1.5B-Instruct
Qwen/Qwen2-Math-7B-Instruct qwen2 qwen transformers>=4.37 math Qwen/Qwen2-Math-7B-Instruct
Qwen/Qwen2-Math-72B-Instruct qwen2 qwen transformers>=4.37 math Qwen/Qwen2-Math-72B-Instruct
Qwen/Qwen2-Math-1.5B qwen2 qwen transformers>=4.37 math Qwen/Qwen2-Math-1.5B
Qwen/Qwen2-Math-7B qwen2 qwen transformers>=4.37 math Qwen/Qwen2-Math-7B
Qwen/Qwen2-Math-72B qwen2 qwen transformers>=4.37 math Qwen/Qwen2-Math-72B
Qwen/Qwen2.5-0.5B-Instruct qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-0.5B-Instruct
Qwen/Qwen2.5-1.5B-Instruct qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-1.5B-Instruct
Qwen/Qwen2.5-3B-Instruct qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-3B-Instruct
Qwen/Qwen2.5-7B-Instruct qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-7B-Instruct
Qwen/Qwen2.5-14B-Instruct qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-14B-Instruct
Qwen/Qwen2.5-32B-Instruct qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-32B-Instruct
Qwen/Qwen2.5-72B-Instruct qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-72B-Instruct
Qwen/Qwen2.5-0.5B qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-0.5B
Qwen/Qwen2.5-1.5B qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-1.5B
Qwen/Qwen2.5-3B qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-3B
Qwen/Qwen2.5-7B qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-7B
Qwen/Qwen2.5-14B qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-14B
Qwen/Qwen2.5-32B qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-32B
Qwen/Qwen2.5-72B qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-72B
Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-1.5B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-1.5B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-3B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-3B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-7B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-7B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-14B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-14B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-32B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-32B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-72B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-72B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-1.5B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-1.5B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-3B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-3B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-7B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-7B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-14B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-14B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-32B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-32B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-72B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-72B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-0.5B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-0.5B-Instruct-AWQ
Qwen/Qwen2.5-1.5B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-1.5B-Instruct-AWQ
Qwen/Qwen2.5-3B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-3B-Instruct-AWQ
Qwen/Qwen2.5-7B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-7B-Instruct-AWQ
Qwen/Qwen2.5-14B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-14B-Instruct-AWQ
Qwen/Qwen2.5-32B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-32B-Instruct-AWQ
Qwen/Qwen2.5-72B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-72B-Instruct-AWQ
Qwen/Qwen2.5-Math-1.5B-Instruct qwen2_5 qwen2_5 transformers>=4.37 math Qwen/Qwen2.5-Math-1.5B-Instruct
Qwen/Qwen2.5-Math-7B-Instruct qwen2_5 qwen2_5 transformers>=4.37 math Qwen/Qwen2.5-Math-7B-Instruct
Qwen/Qwen2.5-Math-72B-Instruct qwen2_5 qwen2_5 transformers>=4.37 math Qwen/Qwen2.5-Math-72B-Instruct
Qwen/Qwen2.5-Math-1.5B qwen2_5 qwen2_5 transformers>=4.37 math Qwen/Qwen2.5-Math-1.5B
Qwen/Qwen2.5-Math-7B qwen2_5 qwen2_5 transformers>=4.37 math Qwen/Qwen2.5-Math-7B
Qwen/Qwen2.5-Math-72B qwen2_5 qwen2_5 transformers>=4.37 math Qwen/Qwen2.5-Math-72B
Qwen/Qwen2.5-Coder-0.5B-Instruct qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-0.5B-Instruct
Qwen/Qwen2.5-Coder-1.5B-Instruct qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-1.5B-Instruct
Qwen/Qwen2.5-Coder-3B-Instruct qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-3B-Instruct
Qwen/Qwen2.5-Coder-7B-Instruct qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-7B-Instruct
Qwen/Qwen2.5-Coder-14B-Instruct qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-14B-Instruct
Qwen/Qwen2.5-Coder-32B-Instruct qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-32B-Instruct
Qwen/Qwen2.5-Coder-0.5B qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-0.5B
Qwen/Qwen2.5-Coder-1.5B qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-1.5B
Qwen/Qwen2.5-Coder-3B qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-3B
Qwen/Qwen2.5-Coder-7B qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-7B
Qwen/Qwen2.5-Coder-14B qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-14B
Qwen/Qwen2.5-Coder-32B qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-32B
Qwen/Qwen2.5-Coder-0.5B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-0.5B-Instruct-AWQ
Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ
Qwen/Qwen2.5-Coder-3B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-3B-Instruct-AWQ
Qwen/Qwen2.5-Coder-7B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-7B-Instruct-AWQ
Qwen/Qwen2.5-Coder-14B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-14B-Instruct-AWQ
Qwen/Qwen2.5-Coder-32B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-32B-Instruct-AWQ
Qwen/Qwen2.5-Coder-0.5B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-0.5B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-Coder-0.5B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-0.5B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-Coder-1.5B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-1.5B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-Coder-1.5B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-1.5B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-Coder-3B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-3B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-Coder-3B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-3B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-Coder-14B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-14B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-Coder-14B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-14B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-Coder-32B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-32B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-Coder-32B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-32B-Instruct-GPTQ-Int8
Qwen/Qwen1.5-MoE-A2.7B-Chat qwen2_moe qwen transformers>=4.40 - Qwen/Qwen1.5-MoE-A2.7B-Chat
Qwen/Qwen1.5-MoE-A2.7B qwen2_moe qwen transformers>=4.40 - Qwen/Qwen1.5-MoE-A2.7B
Qwen/Qwen1.5-MoE-A2.7B-Chat-GPTQ-Int4 qwen2_moe qwen transformers>=4.40 - Qwen/Qwen1.5-MoE-A2.7B-Chat-GPTQ-Int4
Qwen/Qwen2-57B-A14B-Instruct qwen2_moe qwen transformers>=4.40 - Qwen/Qwen2-57B-A14B-Instruct
Qwen/Qwen2-57B-A14B qwen2_moe qwen transformers>=4.40 - Qwen/Qwen2-57B-A14B
Qwen/Qwen2-57B-A14B-Instruct-GPTQ-Int4 qwen2_moe qwen transformers>=4.40 - Qwen/Qwen2-57B-A14B-Instruct-GPTQ-Int4
Qwen/QwQ-32B-Preview qwq qwq transformers>=4.37 - Qwen/QwQ-32B-Preview
codefuse-ai/CodeFuse-QWen-14B codefuse_qwen codefuse - coding codefuse-ai/CodeFuse-QWen-14B
iic/ModelScope-Agent-7B modelscope_agent modelscope_agent - - -
iic/ModelScope-Agent-14B modelscope_agent modelscope_agent - - -
AIDC-AI/Marco-o1 marco_o1 marco_o1 transformers>=4.37 - AIDC-AI/Marco-o1
modelscope/Llama-2-7b-ms llama llama - - meta-llama/Llama-2-7b-hf
modelscope/Llama-2-13b-ms llama llama - - meta-llama/Llama-2-13b-hf
modelscope/Llama-2-70b-ms llama llama - - meta-llama/Llama-2-70b-hf
modelscope/Llama-2-7b-chat-ms llama llama - - meta-llama/Llama-2-7b-chat-hf
modelscope/Llama-2-13b-chat-ms llama llama - - meta-llama/Llama-2-13b-chat-hf
modelscope/Llama-2-70b-chat-ms llama llama - - meta-llama/Llama-2-70b-chat-hf
AI-ModelScope/chinese-llama-2-1.3b llama llama - - hfl/chinese-llama-2-1.3b
AI-ModelScope/chinese-llama-2-7b llama llama - - hfl/chinese-llama-2-7b
AI-ModelScope/chinese-llama-2-7b-16k llama llama - - hfl/chinese-llama-2-7b-16k
AI-ModelScope/chinese-llama-2-7b-64k llama llama - - hfl/chinese-llama-2-7b-64k
AI-ModelScope/chinese-llama-2-13b llama llama - - hfl/chinese-llama-2-13b
AI-ModelScope/chinese-llama-2-13b-16k llama llama - - hfl/chinese-llama-2-13b-16k
AI-ModelScope/chinese-alpaca-2-1.3b llama llama - - hfl/chinese-alpaca-2-1.3b
AI-ModelScope/chinese-alpaca-2-7b llama llama - - hfl/chinese-alpaca-2-7b
AI-ModelScope/chinese-alpaca-2-7b-16k llama llama - - hfl/chinese-alpaca-2-7b-16k
AI-ModelScope/chinese-alpaca-2-7b-64k llama llama - - hfl/chinese-alpaca-2-7b-64k
AI-ModelScope/chinese-alpaca-2-13b llama llama - - hfl/chinese-alpaca-2-13b
AI-ModelScope/chinese-alpaca-2-13b-16k llama llama - - hfl/chinese-alpaca-2-13b-16k
AI-ModelScope/Llama-2-7b-AQLM-2Bit-1x16-hf llama llama - - ISTA-DASLab/Llama-2-7b-AQLM-2Bit-1x16-hf
LLM-Research/Meta-Llama-3-8B-Instruct llama3 llama3 - - meta-llama/Meta-Llama-3-8B-Instruct
LLM-Research/Meta-Llama-3-70B-Instruct llama3 llama3 - - meta-llama/Meta-Llama-3-70B-Instruct
LLM-Research/Meta-Llama-3-8B llama3 llama3 - - meta-llama/Meta-Llama-3-8B
LLM-Research/Meta-Llama-3-70B llama3 llama3 - - meta-llama/Meta-Llama-3-70B
swift/Meta-Llama-3-8B-Instruct-GPTQ-Int4 llama3 llama3 - - study-hjt/Meta-Llama-3-8B-Instruct-GPTQ-Int4
swift/Meta-Llama-3-8B-Instruct-GPTQ-Int8 llama3 llama3 - - study-hjt/Meta-Llama-3-8B-Instruct-GPTQ-Int8
swift/Meta-Llama-3-8B-Instruct-AWQ llama3 llama3 - - study-hjt/Meta-Llama-3-8B-Instruct-AWQ
swift/Meta-Llama-3-70B-Instruct-GPTQ-Int4 llama3 llama3 - - study-hjt/Meta-Llama-3-70B-Instruct-GPTQ-Int4
swift/Meta-Llama-3-70B-Instruct-GPTQ-Int8 llama3 llama3 - - study-hjt/Meta-Llama-3-70B-Instruct-GPTQ-Int8
swift/Meta-Llama-3-70B-Instruct-AWQ llama3 llama3 - - study-hjt/Meta-Llama-3-70B-Instruct-AWQ
ChineseAlpacaGroup/llama-3-chinese-8b-instruct llama3 llama3 - - hfl/llama-3-chinese-8b-instruct
ChineseAlpacaGroup/llama-3-chinese-8b llama3 llama3 - - hfl/llama-3-chinese-8b
LLM-Research/Meta-Llama-3.1-8B-Instruct llama3_1 llama3_2 transformers>=4.43 - meta-llama/Meta-Llama-3.1-8B-Instruct
LLM-Research/Meta-Llama-3.1-70B-Instruct llama3_1 llama3_2 transformers>=4.43 - meta-llama/Meta-Llama-3.1-70B-Instruct
LLM-Research/Meta-Llama-3.1-405B-Instruct llama3_1 llama3_2 transformers>=4.43 - meta-llama/Meta-Llama-3.1-405B-Instruct
LLM-Research/Meta-Llama-3.1-8B llama3_1 llama3_2 transformers>=4.43 - meta-llama/Meta-Llama-3.1-8B
LLM-Research/Meta-Llama-3.1-70B llama3_1 llama3_2 transformers>=4.43 - meta-llama/Meta-Llama-3.1-70B
LLM-Research/Meta-Llama-3.1-405B llama3_1 llama3_2 transformers>=4.43 - meta-llama/Meta-Llama-3.1-405B
LLM-Research/Meta-Llama-3.1-70B-Instruct-FP8 llama3_1 llama3_2 transformers>=4.43 - meta-llama/Meta-Llama-3.1-70B-Instruct-FP8
LLM-Research/Meta-Llama-3.1-405B-Instruct-FP8 llama3_1 llama3_2 transformers>=4.43 - meta-llama/Meta-Llama-3.1-405B-Instruct-FP8
LLM-Research/Meta-Llama-3.1-8B-Instruct-BNB-NF4 llama3_1 llama3_2 transformers>=4.43 - hugging-quants/Meta-Llama-3.1-8B-Instruct-BNB-NF4
LLM-Research/Meta-Llama-3.1-70B-Instruct-bnb-4bit llama3_1 llama3_2 transformers>=4.43 - unsloth/Meta-Llama-3.1-70B-Instruct-bnb-4bit
LLM-Research/Meta-Llama-3.1-405B-Instruct-BNB-NF4 llama3_1 llama3_2 transformers>=4.43 - hugging-quants/Meta-Llama-3.1-405B-Instruct-BNB-NF4
LLM-Research/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4 llama3_1 llama3_2 transformers>=4.43 - hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4
LLM-Research/Meta-Llama-3.1-70B-Instruct-GPTQ-INT4 llama3_1 llama3_2 transformers>=4.43 - hugging-quants/Meta-Llama-3.1-70B-Instruct-GPTQ-INT4
LLM-Research/Meta-Llama-3.1-405B-Instruct-GPTQ-INT4 llama3_1 llama3_2 transformers>=4.43 - hugging-quants/Meta-Llama-3.1-405B-Instruct-GPTQ-INT4
LLM-Research/Meta-Llama-3.1-8B-Instruct-AWQ-INT4 llama3_1 llama3_2 transformers>=4.43 - hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4
LLM-Research/Meta-Llama-3.1-70B-Instruct-AWQ-INT4 llama3_1 llama3_2 transformers>=4.43 - hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4
LLM-Research/Meta-Llama-3.1-405B-Instruct-AWQ-INT4 llama3_1 llama3_2 transformers>=4.43 - hugging-quants/Meta-Llama-3.1-405B-Instruct-AWQ-INT4
AI-ModelScope/Llama-3.1-Nemotron-70B-Instruct-HF llama3_1 llama3_2 transformers>=4.43 - nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
LLM-Research/Llama-3.2-1B llama3_2 llama3_2 transformers>=4.45 - meta-llama/Llama-3.2-1B
LLM-Research/Llama-3.2-3B llama3_2 llama3_2 transformers>=4.45 - meta-llama/Llama-3.2-3B
LLM-Research/Llama-3.2-1B-Instruct llama3_2 llama3_2 transformers>=4.45 - meta-llama/Llama-3.2-1B-Instruct
LLM-Research/Llama-3.2-3B-Instruct llama3_2 llama3_2 transformers>=4.45 - meta-llama/Llama-3.2-3B-Instruct
LLM-Research/Llama-3.3-70B-Instruct llama3_2 llama3_2 transformers>=4.45 - meta-llama/Llama-3.3-70B-Instruct
unsloth/Llama-3.3-70B-Instruct-bnb-4bit llama3_2 llama3_2 transformers>=4.45 - unsloth/Llama-3.3-70B-Instruct-bnb-4bit
LLM-Research/Reflection-Llama-3.1-70B reflection reflection transformers>=4.43 - mattshumer/Reflection-Llama-3.1-70B
01ai/Yi-6B yi chatml - - 01-ai/Yi-6B
01ai/Yi-6B-200K yi chatml - - 01-ai/Yi-6B-200K
01ai/Yi-6B-Chat yi chatml - - 01-ai/Yi-6B-Chat
01ai/Yi-6B-Chat-4bits yi chatml - - 01-ai/Yi-6B-Chat-4bits
01ai/Yi-6B-Chat-8bits yi chatml - - 01-ai/Yi-6B-Chat-8bits
01ai/Yi-9B yi chatml - - 01-ai/Yi-9B
01ai/Yi-9B-200K yi chatml - - 01-ai/Yi-9B-200K
01ai/Yi-34B yi chatml - - 01-ai/Yi-34B
01ai/Yi-34B-200K yi chatml - - 01-ai/Yi-34B-200K
01ai/Yi-34B-Chat yi chatml - - 01-ai/Yi-34B-Chat
01ai/Yi-34B-Chat-4bits yi chatml - - 01-ai/Yi-34B-Chat-4bits
01ai/Yi-34B-Chat-8bits yi chatml - - 01-ai/Yi-34B-Chat-8bits
01ai/Yi-1.5-6B yi chatml - - 01-ai/Yi-1.5-6B
01ai/Yi-1.5-6B-Chat yi chatml - - 01-ai/Yi-1.5-6B-Chat
01ai/Yi-1.5-9B yi chatml - - 01-ai/Yi-1.5-9B
01ai/Yi-1.5-9B-Chat yi chatml - - 01-ai/Yi-1.5-9B-Chat
01ai/Yi-1.5-9B-Chat-16K yi chatml - - 01-ai/Yi-1.5-9B-Chat-16K
01ai/Yi-1.5-34B yi chatml - - 01-ai/Yi-1.5-34B
01ai/Yi-1.5-34B-Chat yi chatml - - 01-ai/Yi-1.5-34B-Chat
01ai/Yi-1.5-34B-Chat-16K yi chatml - - 01-ai/Yi-1.5-34B-Chat-16K
AI-ModelScope/Yi-1.5-6B-Chat-GPTQ yi chatml - - modelscope/Yi-1.5-6B-Chat-GPTQ
AI-ModelScope/Yi-1.5-6B-Chat-AWQ yi chatml - - modelscope/Yi-1.5-6B-Chat-AWQ
AI-ModelScope/Yi-1.5-9B-Chat-GPTQ yi chatml - - modelscope/Yi-1.5-9B-Chat-GPTQ
AI-ModelScope/Yi-1.5-9B-Chat-AWQ yi chatml - - modelscope/Yi-1.5-9B-Chat-AWQ
AI-ModelScope/Yi-1.5-34B-Chat-GPTQ yi chatml - - modelscope/Yi-1.5-34B-Chat-GPTQ
AI-ModelScope/Yi-1.5-34B-Chat-AWQ yi chatml - - modelscope/Yi-1.5-34B-Chat-AWQ
01ai/Yi-Coder-1.5B yi_coder yi_coder - coding 01-ai/Yi-Coder-1.5B
01ai/Yi-Coder-9B yi_coder yi_coder - coding 01-ai/Yi-Coder-9B
01ai/Yi-Coder-1.5B-Chat yi_coder yi_coder - coding 01-ai/Yi-Coder-1.5B-Chat
01ai/Yi-Coder-9B-Chat yi_coder yi_coder - coding 01-ai/Yi-Coder-9B-Chat
SUSTC/SUS-Chat-34B sus sus - - SUSTech/SUS-Chat-34B
codefuse-ai/CodeFuse-CodeLlama-34B codefuse_codellama codefuse_codellama - coding codefuse-ai/CodeFuse-CodeLlama-34B
langboat/Mengzi3-13B-Base mengzi3 mengzi - - Langboat/Mengzi3-13B-Base
Fengshenbang/Ziya2-13B-Base ziya ziya - - IDEA-CCNL/Ziya2-13B-Base
Fengshenbang/Ziya2-13B-Chat ziya ziya - - IDEA-CCNL/Ziya2-13B-Chat
AI-ModelScope/NuminaMath-7B-TIR numina numina - math AI-MO/NuminaMath-7B-TIR
FlagAlpha/Atom-7B atom atom - - FlagAlpha/Atom-7B
FlagAlpha/Atom-7B-Chat atom atom - - FlagAlpha/Atom-7B-Chat
ZhipuAI/chatglm2-6b chatglm2 chatglm2 - - THUDM/chatglm2-6b
ZhipuAI/chatglm2-6b-32k chatglm2 chatglm2 - - THUDM/chatglm2-6b-32k
ZhipuAI/codegeex2-6b chatglm2 chatglm2 - coding THUDM/codegeex2-6b
ZhipuAI/chatglm3-6b chatglm3 glm4 transformers<4.42 - THUDM/chatglm3-6b
ZhipuAI/chatglm3-6b-base chatglm3 glm4 transformers<4.42 - THUDM/chatglm3-6b-base
ZhipuAI/chatglm3-6b-32k chatglm3 glm4 transformers<4.42 - THUDM/chatglm3-6b-32k
ZhipuAI/chatglm3-6b-128k chatglm3 glm4 transformers<4.42 - THUDM/chatglm3-6b-128k
ZhipuAI/glm-4-9b-chat glm4 glm4 transformers>=4.42 - THUDM/glm-4-9b-chat
ZhipuAI/glm-4-9b glm4 glm4 transformers>=4.42 - THUDM/glm-4-9b
ZhipuAI/glm-4-9b-chat-1m glm4 glm4 transformers>=4.42 - THUDM/glm-4-9b-chat-1m
ZhipuAI/LongWriter-glm4-9b glm4 glm4 transformers>=4.42 - THUDM/LongWriter-glm4-9b
ZhipuAI/glm-edge-1.5b-chat glm_edge glm4 transformers>=4.46 - THUDM/glm-edge-1.5b-chat
ZhipuAI/glm-edge-4b-chat glm_edge glm4 transformers>=4.46 - THUDM/glm-edge-4b-chat
codefuse-ai/CodeFuse-CodeGeeX2-6B codefuse_codegeex2 codefuse transformers<4.34 coding codefuse-ai/CodeFuse-CodeGeeX2-6B
ZhipuAI/codegeex4-all-9b codegeex4 codegeex4 transformers<4.42 coding THUDM/codegeex4-all-9b
ZhipuAI/LongWriter-llama3.1-8b longwriter_llama3_1 longwriter_llama transformers>=4.43 - THUDM/LongWriter-llama3.1-8b
Shanghai_AI_Laboratory/internlm-chat-7b internlm internlm - - internlm/internlm-chat-7b
Shanghai_AI_Laboratory/internlm-7b internlm internlm - - internlm/internlm-7b
Shanghai_AI_Laboratory/internlm-chat-7b-8k internlm internlm - - -
Shanghai_AI_Laboratory/internlm-20b internlm internlm - - internlm/internlm-20b
Shanghai_AI_Laboratory/internlm-chat-20b internlm internlm - - internlm/internlm-chat-20b
Shanghai_AI_Laboratory/internlm2-chat-1_8b internlm2 internlm2 transformers>=4.38 - internlm/internlm2-chat-1_8b
Shanghai_AI_Laboratory/internlm2-1_8b internlm2 internlm2 transformers>=4.38 - internlm/internlm2-1_8b
Shanghai_AI_Laboratory/internlm2-chat-1_8b-sft internlm2 internlm2 transformers>=4.38 - internlm/internlm2-chat-1_8b-sft
Shanghai_AI_Laboratory/internlm2-base-7b internlm2 internlm2 transformers>=4.38 - internlm/internlm2-base-7b
Shanghai_AI_Laboratory/internlm2-7b internlm2 internlm2 transformers>=4.38 - internlm/internlm2-7b
Shanghai_AI_Laboratory/internlm2-chat-7b internlm2 internlm2 transformers>=4.38 - internlm/internlm2-chat-7b
Shanghai_AI_Laboratory/internlm2-chat-7b-sft internlm2 internlm2 transformers>=4.38 - internlm/internlm2-chat-7b-sft
Shanghai_AI_Laboratory/internlm2-base-20b internlm2 internlm2 transformers>=4.38 - internlm/internlm2-base-20b
Shanghai_AI_Laboratory/internlm2-20b internlm2 internlm2 transformers>=4.38 - internlm/internlm2-20b
Shanghai_AI_Laboratory/internlm2-chat-20b internlm2 internlm2 transformers>=4.38 - internlm/internlm2-chat-20b
Shanghai_AI_Laboratory/internlm2-chat-20b-sft internlm2 internlm2 transformers>=4.38 - internlm/internlm2-chat-20b-sft
Shanghai_AI_Laboratory/internlm2-math-7b internlm2 internlm2 transformers>=4.38 math internlm/internlm2-math-7b
Shanghai_AI_Laboratory/internlm2-math-base-7b internlm2 internlm2 transformers>=4.38 math internlm/internlm2-math-base-7b
Shanghai_AI_Laboratory/internlm2-math-base-20b internlm2 internlm2 transformers>=4.38 math internlm/internlm2-math-base-20b
Shanghai_AI_Laboratory/internlm2-math-20b internlm2 internlm2 transformers>=4.38 math internlm/internlm2-math-20b
Shanghai_AI_Laboratory/internlm2_5-1_8b-chat internlm2 internlm2 transformers>=4.38 - internlm/internlm2_5-1_8b-chat
Shanghai_AI_Laboratory/internlm2_5-1_8b internlm2 internlm2 transformers>=4.38 - internlm/internlm2_5-1_8b
Shanghai_AI_Laboratory/internlm2_5-7b internlm2 internlm2 transformers>=4.38 - internlm/internlm2_5-7b
Shanghai_AI_Laboratory/internlm2_5-7b-chat internlm2 internlm2 transformers>=4.38 - internlm/internlm2_5-7b-chat
Shanghai_AI_Laboratory/internlm2_5-7b-chat-1m internlm2 internlm2 transformers>=4.38 - internlm/internlm2_5-7b-chat-1m
Shanghai_AI_Laboratory/internlm2_5-20b internlm2 internlm2 transformers>=4.38 - internlm/internlm2_5-20b
Shanghai_AI_Laboratory/internlm2_5-20b-chat internlm2 internlm2 transformers>=4.38 - internlm/internlm2_5-20b-chat
deepseek-ai/deepseek-llm-7b-base deepseek deepseek - - deepseek-ai/deepseek-llm-7b-base
deepseek-ai/deepseek-llm-7b-chat deepseek deepseek - - deepseek-ai/deepseek-llm-7b-chat
deepseek-ai/deepseek-llm-67b-base deepseek deepseek - - deepseek-ai/deepseek-llm-67b-base
deepseek-ai/deepseek-llm-67b-chat deepseek deepseek - - deepseek-ai/deepseek-llm-67b-chat
deepseek-ai/deepseek-math-7b-base deepseek deepseek - math deepseek-ai/deepseek-math-7b-base
deepseek-ai/deepseek-math-7b-instruct deepseek deepseek - math deepseek-ai/deepseek-math-7b-instruct
deepseek-ai/deepseek-math-7b-rl deepseek deepseek - math deepseek-ai/deepseek-math-7b-rl
deepseek-ai/deepseek-coder-1.3b-base deepseek deepseek - coding deepseek-ai/deepseek-coder-1.3b-base
deepseek-ai/deepseek-coder-1.3b-instruct deepseek deepseek - coding deepseek-ai/deepseek-coder-1.3b-instruct
deepseek-ai/deepseek-coder-6.7b-base deepseek deepseek - coding deepseek-ai/deepseek-coder-6.7b-base
deepseek-ai/deepseek-coder-6.7b-instruct deepseek deepseek - coding deepseek-ai/deepseek-coder-6.7b-instruct
deepseek-ai/deepseek-coder-33b-base deepseek deepseek - coding deepseek-ai/deepseek-coder-33b-base
deepseek-ai/deepseek-coder-33b-instruct deepseek deepseek - coding deepseek-ai/deepseek-coder-33b-instruct
deepseek-ai/deepseek-moe-16b-chat deepseek_moe deepseek - - deepseek-ai/deepseek-moe-16b-chat
deepseek-ai/deepseek-moe-16b-base deepseek_moe deepseek - - deepseek-ai/deepseek-moe-16b-base
deepseek-ai/DeepSeek-Coder-V2-Instruct deepseek_v2 deepseek transformers>=4.39.3 - deepseek-ai/DeepSeek-Coder-V2-Instruct
deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct deepseek_v2 deepseek transformers>=4.39.3 - deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct
deepseek-ai/DeepSeek-Coder-V2-Base deepseek_v2 deepseek transformers>=4.39.3 - deepseek-ai/DeepSeek-Coder-V2-Base
deepseek-ai/DeepSeek-Coder-V2-Lite-Base deepseek_v2 deepseek transformers>=4.39.3 - deepseek-ai/DeepSeek-Coder-V2-Lite-Base
deepseek-ai/DeepSeek-V2-Lite deepseek_v2 deepseek transformers>=4.39.3 - deepseek-ai/DeepSeek-V2-Lite
deepseek-ai/DeepSeek-V2-Lite-Chat deepseek_v2 deepseek transformers>=4.39.3 - deepseek-ai/DeepSeek-V2-Lite-Chat
deepseek-ai/DeepSeek-V2 deepseek_v2 deepseek transformers>=4.39.3 - deepseek-ai/DeepSeek-V2
deepseek-ai/DeepSeek-V2-Chat deepseek_v2 deepseek transformers>=4.39.3 - deepseek-ai/DeepSeek-V2-Chat
deepseek-ai/DeepSeek-V2.5 deepseek_v2_5 deepseek_v2_5 transformers>=4.39.3 - deepseek-ai/DeepSeek-V2.5
OpenBuddy/openbuddy-llama-65b-v8-bf16 openbuddy_llama openbuddy - - OpenBuddy/openbuddy-llama-65b-v8-bf16
OpenBuddy/openbuddy-llama2-13b-v8.1-fp16 openbuddy_llama openbuddy - - OpenBuddy/openbuddy-llama2-13b-v8.1-fp16
OpenBuddy/openbuddy-llama2-70b-v10.1-bf16 openbuddy_llama openbuddy - - OpenBuddy/openbuddy-llama2-70b-v10.1-bf16
OpenBuddy/openbuddy-deepseek-67b-v15.2 openbuddy_llama openbuddy - - OpenBuddy/openbuddy-deepseek-67b-v15.2
OpenBuddy/openbuddy-llama3-8b-v21.1-8k openbuddy_llama3 openbuddy2 - - OpenBuddy/openbuddy-llama3-8b-v21.1-8k
OpenBuddy/openbuddy-llama3-70b-v21.1-8k openbuddy_llama3 openbuddy2 - - OpenBuddy/openbuddy-llama3-70b-v21.1-8k
OpenBuddy/openbuddy-llama3.1-8b-v22.1-131k openbuddy_llama3 openbuddy2 - - OpenBuddy/openbuddy-llama3.1-8b-v22.1-131k
OpenBuddy/openbuddy-mistral-7b-v17.1-32k openbuddy_mistral openbuddy transformers>=4.34 - OpenBuddy/openbuddy-mistral-7b-v17.1-32k
OpenBuddy/openbuddy-zephyr-7b-v14.1 openbuddy_mistral openbuddy transformers>=4.34 - OpenBuddy/openbuddy-zephyr-7b-v14.1
OpenBuddy/openbuddy-mixtral-7bx8-v18.1-32k openbuddy_mixtral openbuddy transformers>=4.36 - OpenBuddy/openbuddy-mixtral-7bx8-v18.1-32k
baichuan-inc/Baichuan-13B-Chat baichuan baichuan transformers<4.34 - baichuan-inc/Baichuan-13B-Chat
baichuan-inc/Baichuan-13B-Base baichuan baichuan transformers<4.34 - baichuan-inc/Baichuan-13B-Base
baichuan-inc/baichuan-7B baichuan baichuan transformers<4.34 - baichuan-inc/Baichuan-7B
baichuan-inc/Baichuan2-7B-Chat baichuan2 baichuan - - baichuan-inc/Baichuan2-7B-Chat
baichuan-inc/Baichuan2-7B-Base baichuan2 baichuan - - baichuan-inc/Baichuan2-7B-Base
baichuan-inc/Baichuan2-13B-Chat baichuan2 baichuan - - baichuan-inc/Baichuan2-13B-Chat
baichuan-inc/Baichuan2-13B-Base baichuan2 baichuan - - baichuan-inc/Baichuan2-13B-Base
baichuan-inc/Baichuan2-7B-Chat-4bits baichuan2 baichuan - - baichuan-inc/Baichuan2-7B-Chat-4bits
baichuan-inc/Baichuan2-13B-Chat-4bits baichuan2 baichuan - - baichuan-inc/Baichuan2-13B-Chat-4bits
OpenBMB/MiniCPM-2B-sft-fp32 minicpm minicpm transformers>=4.36.0 - openbmb/MiniCPM-2B-sft-fp32
OpenBMB/MiniCPM-2B-dpo-fp32 minicpm minicpm transformers>=4.36.0 - openbmb/MiniCPM-2B-dpo-fp32
OpenBMB/MiniCPM-1B-sft-bf16 minicpm minicpm transformers>=4.36.0 - openbmb/MiniCPM-1B-sft-bf16
OpenBMB/MiniCPM-2B-128k minicpm_chatml chatml transformers>=4.36 - openbmb/MiniCPM-2B-128k
OpenBMB/MiniCPM3-4B minicpm3 chatml transformers>=4.36 - openbmb/MiniCPM3-4B
OpenBMB/MiniCPM-MoE-8x2B minicpm_moe minicpm transformers>=4.36 - openbmb/MiniCPM-MoE-8x2B
TeleAI/TeleChat-7B telechat telechat - - Tele-AI/telechat-7B
TeleAI/TeleChat-12B telechat telechat - - Tele-AI/TeleChat-12B
TeleAI/TeleChat-12B-v2 telechat telechat - - Tele-AI/TeleChat-12B-v2
swift/TeleChat-12B-V2-GPTQ-Int4 telechat telechat - - -
TeleAI/TeleChat2-3B telechat2 telechat2 - - Tele-AI/TeleChat2-3B
TeleAI/TeleChat2-7B telechat2 telechat2 - - Tele-AI/TeleChat2-7B
TeleAI/TeleChat2-35B-Nov telechat2 telechat2 - - Tele-AI/TeleChat2-35B-Nov
TeleAI/TeleChat2-35B telechat2_115b telechat2_115b - - Tele-AI/TeleChat2-35B
TeleAI/TeleChat2-115B telechat2_115b telechat2_115b - - Tele-AI/TeleChat2-115B
AI-ModelScope/Mistral-7B-Instruct-v0.1 mistral llama transformers>=4.34 - mistralai/Mistral-7B-Instruct-v0.1
AI-ModelScope/Mistral-7B-Instruct-v0.2 mistral llama transformers>=4.34 - mistralai/Mistral-7B-Instruct-v0.2
LLM-Research/Mistral-7B-Instruct-v0.3 mistral llama transformers>=4.34 - mistralai/Mistral-7B-Instruct-v0.3
AI-ModelScope/Mistral-7B-v0.1 mistral llama transformers>=4.34 - mistralai/Mistral-7B-v0.1
AI-ModelScope/Mistral-7B-v0.2-hf mistral llama transformers>=4.34 - alpindale/Mistral-7B-v0.2-hf
swift/Codestral-22B-v0.1 mistral llama transformers>=4.34 - mistralai/Codestral-22B-v0.1
modelscope/zephyr-7b-beta zephyr zephyr transformers>=4.34 - HuggingFaceH4/zephyr-7b-beta
AI-ModelScope/Mixtral-8x7B-Instruct-v0.1 mixtral llama - - mistralai/Mixtral-8x7B-Instruct-v0.1
AI-ModelScope/Mixtral-8x7B-v0.1 mixtral llama - - mistralai/Mixtral-8x7B-v0.1
AI-ModelScope/Mixtral-8x22B-v0.1 mixtral llama - - mistral-community/Mixtral-8x22B-v0.1
AI-ModelScope/Mixtral-8x7b-AQLM-2Bit-1x16-hf mixtral llama - - ISTA-DASLab/Mixtral-8x7b-AQLM-2Bit-1x16-hf
AI-ModelScope/Mistral-Small-Instruct-2409 mistral_nemo mistral_nemo - - mistralai/Mistral-Small-Instruct-2409
LLM-Research/Mistral-Large-Instruct-2407 mistral_nemo mistral_nemo - - mistralai/Mistral-Large-Instruct-2407
AI-ModelScope/Mistral-Nemo-Base-2407 mistral_nemo mistral_nemo - - mistralai/Mistral-Nemo-Base-2407
AI-ModelScope/Mistral-Nemo-Instruct-2407 mistral_nemo mistral_nemo - - mistralai/Mistral-Nemo-Instruct-2407
AI-ModelScope/Ministral-8B-Instruct-2410 mistral_nemo mistral_nemo - - mistralai/Ministral-8B-Instruct-2410
AI-ModelScope/WizardLM-2-7B-AWQ wizardlm2 wizardlm2 transformers>=4.34 - MaziyarPanahi/WizardLM-2-7B-AWQ
AI-ModelScope/WizardLM-2-8x22B wizardlm2_moe wizardlm2_moe transformers>=4.36 - alpindale/WizardLM-2-8x22B
AI-ModelScope/phi-2 phi2 default - - microsoft/phi-2
LLM-Research/Phi-3-small-8k-instruct phi3_small phi3 transformers>=4.36 - microsoft/Phi-3-small-8k-instruct
LLM-Research/Phi-3-small-128k-instruct phi3_small phi3 transformers>=4.36 - microsoft/Phi-3-small-128k-instruct
LLM-Research/Phi-3-mini-4k-instruct phi3 phi3 transformers>=4.36 - microsoft/Phi-3-mini-4k-instruct
LLM-Research/Phi-3-mini-128k-instruct phi3 phi3 transformers>=4.36 - microsoft/Phi-3-mini-128k-instruct
LLM-Research/Phi-3-medium-4k-instruct phi3 phi3 transformers>=4.36 - microsoft/Phi-3-medium-4k-instruct
LLM-Research/Phi-3-medium-128k-instruct phi3 phi3 transformers>=4.36 - microsoft/Phi-3-medium-128k-instruct
LLM-Research/Phi-3.5-mini-instruct phi3 phi3 transformers>=4.36 - microsoft/Phi-3.5-mini-instruct
LLM-Research/Phi-3.5-MoE-instruct phi3_moe phi3 transformers>=4.36 - microsoft/Phi-3.5-MoE-instruct
AI-ModelScope/gemma-2b-it gemma gemma transformers>=4.38 - google/gemma-2b-it
AI-ModelScope/gemma-2b gemma gemma transformers>=4.38 - google/gemma-2b
AI-ModelScope/gemma-7b gemma gemma transformers>=4.38 - google/gemma-7b
AI-ModelScope/gemma-7b-it gemma gemma transformers>=4.38 - google/gemma-7b-it
LLM-Research/gemma-2-2b-it gemma2 gemma transformers>=4.42 - google/gemma-2-2b-it
LLM-Research/gemma-2-2b gemma2 gemma transformers>=4.42 - google/gemma-2-2b
LLM-Research/gemma-2-9b gemma2 gemma transformers>=4.42 - google/gemma-2-9b
LLM-Research/gemma-2-9b-it gemma2 gemma transformers>=4.42 - google/gemma-2-9b-it
LLM-Research/gemma-2-27b gemma2 gemma transformers>=4.42 - google/gemma-2-27b
LLM-Research/gemma-2-27b-it gemma2 gemma transformers>=4.42 - google/gemma-2-27b-it
IEITYuan/Yuan2.0-2B-hf yuan2 yuan - - IEITYuan/Yuan2-2B-hf
IEITYuan/Yuan2.0-51B-hf yuan2 yuan - - IEITYuan/Yuan2-51B-hf
IEITYuan/Yuan2.0-102B-hf yuan2 yuan - - IEITYuan/Yuan2-102B-hf
IEITYuan/Yuan2-2B-Janus-hf yuan2 yuan - - IEITYuan/Yuan2-2B-Janus-hf
IEITYuan/Yuan2-M32-hf yuan2 yuan - - IEITYuan/Yuan2-M32-hf
OrionStarAI/Orion-14B-Chat orion orion - - OrionStarAI/Orion-14B-Chat
OrionStarAI/Orion-14B-Base orion orion - - OrionStarAI/Orion-14B-Base
xverse/XVERSE-7B-Chat xverse xverse - - xverse/XVERSE-7B-Chat
xverse/XVERSE-7B xverse xverse - - xverse/XVERSE-7B
xverse/XVERSE-13B xverse xverse - - xverse/XVERSE-13B
xverse/XVERSE-13B-Chat xverse xverse - - xverse/XVERSE-13B-Chat
xverse/XVERSE-65B xverse xverse - - xverse/XVERSE-65B
xverse/XVERSE-65B-2 xverse xverse - - xverse/XVERSE-65B-2
xverse/XVERSE-65B-Chat xverse xverse - - xverse/XVERSE-65B-Chat
xverse/XVERSE-13B-256K xverse xverse - - xverse/XVERSE-13B-256K
xverse/XVERSE-MoE-A4.2B xverse_moe xverse - - xverse/XVERSE-MoE-A4.2B
damo/nlp_seqgpt-560m seggpt default - - DAMO-NLP/SeqGPT-560M
vivo-ai/BlueLM-7B-Chat-32K bluelm bluelm - - vivo-ai/BlueLM-7B-Chat-32K
vivo-ai/BlueLM-7B-Chat bluelm bluelm - - vivo-ai/BlueLM-7B-Chat
vivo-ai/BlueLM-7B-Base-32K bluelm bluelm - - vivo-ai/BlueLM-7B-Base-32K
vivo-ai/BlueLM-7B-Base bluelm bluelm - - vivo-ai/BlueLM-7B-Base
AI-ModelScope/c4ai-command-r-v01 c4ai c4ai transformers>=4.39 - CohereForAI/c4ai-command-r-v01
AI-ModelScope/c4ai-command-r-plus c4ai c4ai transformers>=4.39 - CohereForAI/c4ai-command-r-plus
AI-ModelScope/dbrx-base dbrx dbrx transformers>=4.36 - databricks/dbrx-base
AI-ModelScope/dbrx-instruct dbrx dbrx transformers>=4.36 - databricks/dbrx-instruct
colossalai/grok-1-pytorch grok default - - hpcai-tech/grok-1
AI-ModelScope/mamba-130m-hf mamba default transformers>=4.39.0 - state-spaces/mamba-130m-hf
AI-ModelScope/mamba-370m-hf mamba default transformers>=4.39.0 - state-spaces/mamba-370m-hf
AI-ModelScope/mamba-390m-hf mamba default transformers>=4.39.0 - state-spaces/mamba-390m-hf
AI-ModelScope/mamba-790m-hf mamba default transformers>=4.39.0 - state-spaces/mamba-790m-hf
AI-ModelScope/mamba-1.4b-hf mamba default transformers>=4.39.0 - state-spaces/mamba-1.4b-hf
AI-ModelScope/mamba-2.8b-hf mamba default transformers>=4.39.0 - state-spaces/mamba-2.8b-hf
damo/nlp_polylm_13b_text_generation polylm default - - DAMO-NLP-MT/polylm-13b
skywork/Skywork-13B-base skywork skywork - - -
skywork/Skywork-13B-chat skywork skywork - - -
AI-ModelScope/aya-expanse-8b aya aya transformers>=4.44.0 - CohereForAI/aya-expanse-8b
AI-ModelScope/aya-expanse-32b aya aya transformers>=4.44.0 - CohereForAI/aya-expanse-32b

Multimodal Large Models

Model ID Model Type Default Template Requires Tags HF Model ID
Qwen/Qwen-VL-Chat qwen_vl qwen_vl - vision Qwen/Qwen-VL-Chat
Qwen/Qwen-VL qwen_vl qwen_vl - vision Qwen/Qwen-VL
Qwen/Qwen-VL-Chat-Int4 qwen_vl qwen_vl - vision Qwen/Qwen-VL-Chat-Int4
Qwen/Qwen-Audio-Chat qwen_audio qwen_audio - audio Qwen/Qwen-Audio-Chat
Qwen/Qwen-Audio qwen_audio qwen_audio - audio Qwen/Qwen-Audio
Qwen/Qwen2-VL-2B-Instruct qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-2B-Instruct
Qwen/Qwen2-VL-7B-Instruct qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-7B-Instruct
Qwen/Qwen2-VL-72B-Instruct qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-72B-Instruct
Qwen/Qwen2-VL-2B qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-2B
Qwen/Qwen2-VL-7B qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-7B
Qwen/Qwen2-VL-72B qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-72B
Qwen/Qwen2-VL-2B-Instruct-GPTQ-Int4 qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-2B-Instruct-GPTQ-Int4
Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int4 qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int4
Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int4 qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int4
Qwen/Qwen2-VL-2B-Instruct-GPTQ-Int8 qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-2B-Instruct-GPTQ-Int8
Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int8 qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int8
Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int8 qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int8
Qwen/Qwen2-VL-2B-Instruct-AWQ qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-2B-Instruct-AWQ
Qwen/Qwen2-VL-7B-Instruct-AWQ qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-7B-Instruct-AWQ
Qwen/Qwen2-VL-72B-Instruct-AWQ qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-72B-Instruct-AWQ
Qwen/Qwen2-Audio-7B-Instruct qwen2_audio qwen2_audio transformers>=4.45, librosa audio Qwen/Qwen2-Audio-7B-Instruct
Qwen/Qwen2-Audio-7B qwen2_audio qwen2_audio transformers>=4.45, librosa audio Qwen/Qwen2-Audio-7B
AIDC-AI/Ovis1.6-Gemma2-9B ovis1_6 ovis1_6 transformers>=4.42 vision AIDC-AI/Ovis1.6-Gemma2-9B
ZhipuAI/glm-4v-9b glm4v glm4v transformers>=4.42 - THUDM/glm-4v-9b
ZhipuAI/glm-edge-v-2b glm_edge_v glm_edge_v transformers>=4.46 vision THUDM/glm-edge-v-2b
ZhipuAI/glm-edge-v-5b glm_edge_v glm_edge_v transformers>=4.46 vision THUDM/glm-edge-v-5b
ZhipuAI/cogvlm-chat cogvlm cogvlm transformers<4.42 - THUDM/cogvlm-chat-hf
ZhipuAI/cogagent-vqa cogagent_vqa cogagent_vqa transformers<4.42 - THUDM/cogagent-vqa-hf
ZhipuAI/cogagent-chat cogagent_chat cogagent_chat transformers<4.42, timm - THUDM/cogagent-chat-hf
ZhipuAI/cogvlm2-llama3-chat-19B cogvlm2 cogvlm2 transformers<4.42 - THUDM/cogvlm2-llama3-chat-19B
ZhipuAI/cogvlm2-llama3-chinese-chat-19B cogvlm2 cogvlm2 transformers<4.42 - THUDM/cogvlm2-llama3-chinese-chat-19B
ZhipuAI/cogvlm2-video-llama3-chat cogvlm2_video cogvlm2_video decord, pytorchvideo, transformers>=4.42 video THUDM/cogvlm2-video-llama3-chat
OpenGVLab/Mini-InternVL-Chat-2B-V1-5 internvl internvl transformers>=4.35, timm vision OpenGVLab/Mini-InternVL-Chat-2B-V1-5
AI-ModelScope/InternVL-Chat-V1-5 internvl internvl transformers>=4.35, timm vision OpenGVLab/InternVL-Chat-V1-5
AI-ModelScope/InternVL-Chat-V1-5-int8 internvl internvl transformers>=4.35, timm vision OpenGVLab/InternVL-Chat-V1-5-int8
OpenGVLab/Mini-InternVL-Chat-4B-V1-5 internvl_phi3 internvl_phi3 transformers>=4.35,<4.42, timm vision OpenGVLab/Mini-InternVL-Chat-4B-V1-5
OpenGVLab/InternVL2-1B internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-1B
OpenGVLab/InternVL2-2B internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-2B
OpenGVLab/InternVL2-8B internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-8B
OpenGVLab/InternVL2-26B internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-26B
OpenGVLab/InternVL2-40B internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-40B
OpenGVLab/InternVL2-Llama3-76B internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-Llama3-76B
OpenGVLab/InternVL2-2B-AWQ internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-2B-AWQ
OpenGVLab/InternVL2-8B-AWQ internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-8B-AWQ
OpenGVLab/InternVL2-26B-AWQ internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-26B-AWQ
OpenGVLab/InternVL2-40B-AWQ internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-40B-AWQ
OpenGVLab/InternVL2-Llama3-76B-AWQ internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-Llama3-76B-AWQ
OpenGVLab/InternVL2-4B internvl2_phi3 internvl2_phi3 transformers>=4.36,<4.42, timm vision, video OpenGVLab/InternVL2-4B
OpenGVLab/InternVL2_5-1B internvl2_5 internvl2_5 transformers>=4.36, timm vision, video OpenGVLab/InternVL2_5-1B
OpenGVLab/InternVL2_5-2B internvl2_5 internvl2_5 transformers>=4.36, timm vision, video OpenGVLab/InternVL2_5-2B
OpenGVLab/InternVL2_5-4B internvl2_5 internvl2_5 transformers>=4.36, timm vision, video OpenGVLab/InternVL2_5-4B
OpenGVLab/InternVL2_5-8B internvl2_5 internvl2_5 transformers>=4.36, timm vision, video OpenGVLab/InternVL2_5-8B
OpenGVLab/InternVL2_5-26B internvl2_5 internvl2_5 transformers>=4.36, timm vision, video OpenGVLab/InternVL2_5-26B
OpenGVLab/InternVL2_5-38B internvl2_5 internvl2_5 transformers>=4.36, timm vision, video OpenGVLab/InternVL2_5-38B
OpenGVLab/InternVL2_5-78B internvl2_5 internvl2_5 transformers>=4.36, timm vision, video OpenGVLab/InternVL2_5-78B
Shanghai_AI_Laboratory/internlm-xcomposer2-7b xcomposer2 ixcomposer2 - vision internlm/internlm-xcomposer2-7b
Shanghai_AI_Laboratory/internlm-xcomposer2-4khd-7b xcomposer2_4khd ixcomposer2 - vision internlm/internlm-xcomposer2-4khd-7b
Shanghai_AI_Laboratory/internlm-xcomposer2d5-7b xcomposer2_5 xcomposer2_5 decord vision internlm/internlm-xcomposer2d5-7b
LLM-Research/Llama-3.2-11B-Vision-Instruct llama3_2_vision llama3_2_vision transformers>=4.45 vision meta-llama/Llama-3.2-11B-Vision-Instruct
LLM-Research/Llama-3.2-90B-Vision-Instruct llama3_2_vision llama3_2_vision transformers>=4.45 vision meta-llama/Llama-3.2-90B-Vision-Instruct
LLM-Research/Llama-3.2-11B-Vision llama3_2_vision llama3_2_vision transformers>=4.45 vision meta-llama/Llama-3.2-11B-Vision
LLM-Research/Llama-3.2-90B-Vision llama3_2_vision llama3_2_vision transformers>=4.45 vision meta-llama/Llama-3.2-90B-Vision
ICTNLP/Llama-3.1-8B-Omni llama3_1_omni llama3_1_omni whisper, openai-whisper audio ICTNLP/Llama-3.1-8B-Omni
swift/llava-1.5-7b-hf llava1_5_hf llava1_5_hf transformers>=4.36 vision llava-hf/llava-1.5-7b-hf
swift/llava-1.5-13b-hf llava1_5_hf llava1_5_hf transformers>=4.36 vision llava-hf/llava-1.5-13b-hf
swift/llava-v1.6-mistral-7b-hf llava1_6_mistral_hf llava1_6_mistral_hf transformers>=4.39 vision llava-hf/llava-v1.6-mistral-7b-hf
swift/llava-v1.6-vicuna-7b-hf llava1_6_vicuna_hf llava1_6_vicuna_hf transformers>=4.39 vision llava-hf/llava-v1.6-vicuna-7b-hf
swift/llava-v1.6-vicuna-13b-hf llava1_6_vicuna_hf llava1_6_vicuna_hf transformers>=4.39 vision llava-hf/llava-v1.6-vicuna-13b-hf
swift/llava-v1.6-34b-hf llava1_6_yi_hf llava1_6_yi_hf transformers>=4.39 vision llava-hf/llava-v1.6-34b-hf
swift/llama3-llava-next-8b-hf llama3_llava_next_hf llama3_llava_next_hf transformers>=4.39 vision llava-hf/llama3-llava-next-8b-hf
AI-ModelScope/llava-next-72b-hf llava_next_qwen_hf llava_next_qwen_hf transformers>=4.39 vision llava-hf/llava-next-72b-hf
AI-ModelScope/llava-next-110b-hf llava_next_qwen_hf llava_next_qwen_hf transformers>=4.39 vision llava-hf/llava-next-110b-hf
swift/LLaVA-NeXT-Video-7B-DPO-hf llava_next_video_hf llava_next_video_hf transformers>=4.42, av video llava-hf/LLaVA-NeXT-Video-7B-DPO-hf
swift/LLaVA-NeXT-Video-7B-32K-hf llava_next_video_hf llava_next_video_hf transformers>=4.42, av video llava-hf/LLaVA-NeXT-Video-7B-32K-hf
swift/LLaVA-NeXT-Video-7B-hf llava_next_video_hf llava_next_video_hf transformers>=4.42, av video llava-hf/LLaVA-NeXT-Video-7B-hf
swift/LLaVA-NeXT-Video-34B-hf llava_next_video_yi_hf llava_next_video_hf transformers>=4.42, av video llava-hf/LLaVA-NeXT-Video-34B-hf
AI-ModelScope/llava-onevision-qwen2-0.5b-ov-hf llava_onevision_hf llava_onevision_hf transformers>=4.45 vision, video llava-hf/llava-onevision-qwen2-0.5b-ov-hf
AI-ModelScope/llava-onevision-qwen2-7b-ov-hf llava_onevision_hf llava_onevision_hf transformers>=4.45 vision, video llava-hf/llava-onevision-qwen2-7b-ov-hf
AI-ModelScope/llava-onevision-qwen2-72b-ov-hf llava_onevision_hf llava_onevision_hf transformers>=4.45 vision, video llava-hf/llava-onevision-qwen2-72b-ov-hf
01ai/Yi-VL-6B yi_vl yi_vl transformers>=4.34 vision 01-ai/Yi-VL-6B
01ai/Yi-VL-34B yi_vl yi_vl transformers>=4.34 vision 01-ai/Yi-VL-34B
swift/llava-llama3.1-8b llava_llama3_1_hf llava_llama3_1_hf transformers>=4.41 vision -
AI-ModelScope/llava-llama-3-8b-v1_1-transformers llava_llama3_hf llava_llama3_hf transformers>=4.36 vision xtuner/llava-llama-3-8b-v1_1-transformers
AI-ModelScope/llava-v1.6-mistral-7b llava1_6_mistral llava1_6_mistral transformers>=4.34 vision liuhaotian/llava-v1.6-mistral-7b
AI-ModelScope/llava-v1.6-34b llava1_6_yi llava1_6_yi transformers>=4.34 vision liuhaotian/llava-v1.6-34b
AI-Modelscope/llava-next-72b llava_next_qwen llava_next_qwen transformers>=4.42, av vision lmms-lab/llava-next-72b
AI-Modelscope/llava-next-110b llava_next_qwen llava_next_qwen transformers>=4.42, av vision lmms-lab/llava-next-110b
AI-Modelscope/llama3-llava-next-8b llama3_llava_next llama3_llava_next transformers>=4.42, av vision lmms-lab/llama3-llava-next-8b
deepseek-ai/deepseek-vl-1.3b-chat deepseek_vl deepseek_vl - vision deepseek-ai/deepseek-vl-1.3b-chat
deepseek-ai/deepseek-vl-7b-chat deepseek_vl deepseek_vl - vision deepseek-ai/deepseek-vl-7b-chat
deepseek-ai/Janus-1.3B deepseek_janus deepseek_janus - vision deepseek-ai/Janus-1.3B
OpenBMB/MiniCPM-V minicpmv minicpmv timm, transformers<4.42 vision openbmb/MiniCPM-V
OpenBMB/MiniCPM-V-2 minicpmv minicpmv timm, transformers<4.42 vision openbmb/MiniCPM-V-2
OpenBMB/MiniCPM-V-2_6 minicpmv2_6 minicpmv2_6 timm, transformers>=4.36, decord vision, video openbmb/MiniCPM-V-2_6
OpenBMB/MiniCPM-Llama3-V-2_5 minicpmv2_5 minicpmv2_5 timm, transformers>=4.36 vision openbmb/MiniCPM-Llama3-V-2_5
iic/mPLUG-Owl2 mplug_owl2 mplug_owl2 transformers<4.35, icecream vision MAGAer13/mplug-owl2-llama2-7b
iic/mPLUG-Owl2.1 mplug_owl2_1 mplug_owl2 transformers<4.35, icecream vision Mizukiluke/mplug_owl_2_1
iic/mPLUG-Owl3-1B-241014 mplug_owl3 mplug_owl3 transformers>=4.36, icecream, decord vision, video mPLUG/mPLUG-Owl3-1B-241014
iic/mPLUG-Owl3-2B-241014 mplug_owl3 mplug_owl3 transformers>=4.36, icecream, decord vision, video mPLUG/mPLUG-Owl3-2B-241014
iic/mPLUG-Owl3-7B-240728 mplug_owl3 mplug_owl3 transformers>=4.36, icecream, decord vision, video mPLUG/mPLUG-Owl3-7B-240728
iic/mPLUG-Owl3-7B-241101 mplug_owl3_241101 mplug_owl3_241101 transformers>=4.36, icecream vision, video mPLUG/mPLUG-Owl3-7B-241101
BAAI/Emu3-Gen emu3_gen emu3_gen - t2i BAAI/Emu3-Gen
BAAI/Emu3-Chat emu3_chat emu3_chat transformers>=4.44.0 vision BAAI/Emu3-Chat
stepfun-ai/GOT-OCR2_0 got_ocr2 got_ocr2 - vision stepfun-ai/GOT-OCR2_0
LLM-Research/Phi-3-vision-128k-instruct phi3_vision phi3_vision transformers>=4.36 vision microsoft/Phi-3-vision-128k-instruct
LLM-Research/Phi-3.5-vision-instruct phi3_vision phi3_vision transformers>=4.36 vision microsoft/Phi-3.5-vision-instruct
AI-ModelScope/Florence-2-base-ft florence florence - vision microsoft/Florence-2-base-ft
AI-ModelScope/Florence-2-base florence florence - vision microsoft/Florence-2-base
AI-ModelScope/Florence-2-large florence florence - vision microsoft/Florence-2-large
AI-ModelScope/Florence-2-large-ft florence florence - vision microsoft/Florence-2-large-ft
AI-ModelScope/Idefics3-8B-Llama3 idefics3 idefics3 transformers>=4.45 vision HuggingFaceM4/Idefics3-8B-Llama3
AI-ModelScope/paligemma-3b-pt-224 paligemma paligemma transformers>=4.41 vision google/paligemma-3b-pt-224
AI-ModelScope/paligemma-3b-pt-448 paligemma paligemma transformers>=4.41 vision google/paligemma-3b-pt-448
AI-ModelScope/paligemma-3b-pt-896 paligemma paligemma transformers>=4.41 vision google/paligemma-3b-pt-896
AI-ModelScope/paligemma-3b-mix-224 paligemma paligemma transformers>=4.41 vision google/paligemma-3b-mix-224
AI-ModelScope/paligemma-3b-mix-448 paligemma paligemma transformers>=4.41 vision google/paligemma-3b-mix-448
LLM-Research/Molmo-7B-O-0924 molmo molmo transformers>=4.45 vision allenai/Molmo-7B-O-0924
LLM-Research/Molmo-7B-D-0924 molmo molmo transformers>=4.45 vision allenai/Molmo-7B-D-0924
LLM-Research/Molmo-72B-0924 molmo molmo transformers>=4.45 vision allenai/Molmo-72B-0924
LLM-Research/MolmoE-1B-0924 molmoe molmo transformers>=4.45 vision allenai/MolmoE-1B-0924
AI-ModelScope/pixtral-12b pixtral pixtral transformers>=4.45 vision mistral-community/pixtral-12b

Datasets

The table below lists the datasets integrated with ms-swift:

  • Dataset ID: ModelScope dataset ID
  • HF Dataset ID: Hugging Face dataset ID
  • Subset Name: Name of the subset
  • Dataset Size: Number of samples in the dataset
  • Statistic: Token-count statistics for the dataset, useful when setting the max_length hyperparameter. Tokens are counted with the qwen2.5 tokenizer, so the counts vary with the tokenizer; statistics for another model's tokenizer can be obtained with the provided script (a minimal stand-in sketch follows this list).
  • Tags: Tags associated with the dataset
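
The snippet below is only an illustrative stand-in for that script, assuming a local JSONL file whose samples carry a messages list — adjust the file path and field names to your data.

```python
# Illustrative stand-in (not the repository's script): token-count statistics for a
# dataset under a chosen tokenizer. Assumes a local JSONL file where each line holds
# a "messages" list of {"role": ..., "content": ...} dicts.
import json
import numpy as np
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")  # any tokenizer

lengths = []
with open("dataset.jsonl", encoding="utf-8") as f:
    for line in f:
        sample = json.loads(line)
        text = "\n".join(m["content"] for m in sample["messages"])
        lengths.append(len(tokenizer(text)["input_ids"]))

lengths = np.array(lengths)
# Same format as the Statistic column: mean±std, min, max
print(f"{lengths.mean():.1f}±{lengths.std():.1f}, min={lengths.min()}, max={lengths.max()}")
```
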
Dataset ID Subset Name Dataset Size Statistic (token) Tags HF Dataset ID
AI-ModelScope/COIG-CQIA chinese_traditional, coig_pc, exam, finance, douban, human_value, logi_qa, ruozhiba, segmentfault, wiki, wikihow, xhs, zhihu 44694 331.2±693.8, min=34, max=19288 general, 🔥 -
AI-ModelScope/CodeAlpaca-20k default 20022 99.3±57.6, min=30, max=857 code, en HuggingFaceH4/CodeAlpaca_20K
AI-ModelScope/DISC-Law-SFT default 166758 1799.0±474.9, min=769, max=3151 chat, law, 🔥 ShengbinYue/DISC-Law-SFT
AI-ModelScope/DISC-Med-SFT default 464885 426.5±178.7, min=110, max=1383 chat, medical, 🔥 Flmc/DISC-Med-SFT
AI-ModelScope/Duet-v0.5 default 5000 1157.4±189.3, min=657, max=2344 CoT, en G-reen/Duet-v0.5
AI-ModelScope/GuanacoDataset default 31563 250.3±70.6, min=95, max=987 chat, zh JosephusCheung/GuanacoDataset
AI-ModelScope/LLaVA-Instruct-150K default 623302 630.7±143.0, min=301, max=1166 chat, multi-modal, vision -
AI-ModelScope/LLaVA-Pretrain default huge dataset - chat, multi-modal, quality liuhaotian/LLaVA-Pretrain
AI-ModelScope/LaTeX_OCR default, synthetic_handwrite 162149 117.6±44.9, min=41, max=312 chat, ocr, multi-modal, vision linxy/LaTeX_OCR
AI-ModelScope/LongAlpaca-12k default 11998 9941.8±3417.1, min=4695, max=25826 long-sequence, QA Yukang/LongAlpaca-12k
AI-ModelScope/M3IT coco, vqa-v2, shapes, shapes-rephrased, coco-goi-rephrased, snli-ve, snli-ve-rephrased, okvqa, a-okvqa, viquae, textcap, docvqa, science-qa, imagenet, imagenet-open-ended, imagenet-rephrased, coco-goi, clevr, clevr-rephrased, nlvr, coco-itm, coco-itm-rephrased, vsr, vsr-rephrased, mocheg, mocheg-rephrased, coco-text, fm-iqa, activitynet-qa, msrvtt, ss, coco-cn, refcoco, refcoco-rephrased, multi30k, image-paragraph-captioning, visual-dialog, visual-dialog-rephrased, iqa, vcr, visual-mrc, ivqa, msrvtt-qa, msvd-qa, gqa, text-vqa, ocr-vqa, st-vqa, flickr8k-cn huge dataset - chat, multi-modal, vision -
AI-ModelScope/Magpie-Qwen2-Pro-200K-Chinese default 200000 448.4±223.5, min=87, max=4098 chat, sft, 🔥, zh Magpie-Align/Magpie-Qwen2-Pro-200K-Chinese
AI-ModelScope/Magpie-Qwen2-Pro-200K-English default 200000 609.9±277.1, min=257, max=4098 chat, sft, 🔥, en Magpie-Align/Magpie-Qwen2-Pro-200K-English
AI-ModelScope/Magpie-Qwen2-Pro-300K-Filtered default 300000 556.6±288.6, min=175, max=4098 chat, sft, 🔥 Magpie-Align/Magpie-Qwen2-Pro-300K-Filtered
AI-ModelScope/MathInstruct default 262040 253.3±177.4, min=42, max=2193 math, cot, en, quality TIGER-Lab/MathInstruct
AI-ModelScope/MovieChat-1K-test default 162 39.7±2.0, min=32, max=43 chat, multi-modal, video Enxin/MovieChat-1K-test
AI-ModelScope/Open-Platypus default 24926 389.0±256.4, min=55, max=3153 chat, math, quality garage-bAInd/Open-Platypus
AI-ModelScope/OpenO1-SFT default 125894 1080.7±622.9, min=145, max=11637 chat, general, o1 O1-OPEN/OpenO1-SFT
AI-ModelScope/OpenOrca default, 3_5M huge dataset - chat, multilingual, general -
AI-ModelScope/OpenOrca-Chinese default huge dataset - QA, zh, general, quality yys/OpenOrca-Chinese
AI-ModelScope/SFT-Nectar default 131201 441.9±307.0, min=45, max=3136 cot, en, quality AstraMindAI/SFT-Nectar
AI-ModelScope/ShareGPT-4o image_caption 57289 599.8±140.4, min=214, max=1932 vqa, multi-modal OpenGVLab/ShareGPT-4o
AI-ModelScope/ShareGPT4V ShareGPT4V, ShareGPT4V-PT huge dataset - chat, multi-modal, vision -
AI-ModelScope/SkyPile-150B default huge dataset - pretrain, quality, zh Skywork/SkyPile-150B
AI-ModelScope/WizardLM_evol_instruct_V2_196k default 109184 483.3±338.4, min=27, max=3735 chat, en WizardLM/WizardLM_evol_instruct_V2_196k
AI-ModelScope/alpaca-cleaned default 51760 170.1±122.9, min=29, max=1028 chat, general, bench, quality yahma/alpaca-cleaned
AI-ModelScope/alpaca-gpt4-data-en default 52002 167.6±123.9, min=29, max=607 chat, general, 🔥 vicgalle/alpaca-gpt4
AI-ModelScope/alpaca-gpt4-data-zh default 48818 157.2±93.2, min=27, max=544 chat, general, 🔥 llm-wizard/alpaca-gpt4-data-zh
AI-ModelScope/blossom-math-v2 default 10000 175.4±59.1, min=35, max=563 chat, math, 🔥 Azure99/blossom-math-v2
AI-ModelScope/captcha-images default 8000 47.0±0.0, min=47, max=47 chat, multi-modal, vision -
AI-ModelScope/databricks-dolly-15k default 15011 199.0±268.8, min=26, max=5987 multi-task, en, quality databricks/databricks-dolly-15k
AI-ModelScope/deepctrl-sft-data default, en huge dataset - chat, general, sft, multi-round -
AI-ModelScope/egoschema Subset 101 191.6±80.7, min=96, max=435 chat, multi-modal, video lmms-lab/egoschema
AI-ModelScope/firefly-train-1.1M default 1649399 204.3±365.3, min=28, max=9306 chat, general YeungNLP/firefly-train-1.1M
AI-ModelScope/generated_chat_0.4M default 396004 272.7±51.1, min=78, max=579 chat, character-dialogue BelleGroup/generated_chat_0.4M
AI-ModelScope/guanaco_belle_merge_v1.0 default 693987 133.8±93.5, min=30, max=1872 QA, zh Chinese-Vicuna/guanaco_belle_merge_v1.0
AI-ModelScope/hh-rlhf helpful-base, helpful-online, helpful-rejection-sampled huge dataset - rlhf, dpo -
AI-ModelScope/hh_rlhf_cn hh_rlhf, harmless_base_cn, harmless_base_en, helpful_base_cn, helpful_base_en 362909 142.3±107.5, min=25, max=1571 rlhf, dpo, 🔥 -
AI-ModelScope/lawyer_llama_data default 21476 224.4±83.9, min=69, max=832 chat, law Skepsun/lawyer_llama_data
AI-ModelScope/leetcode-solutions-python default 2359 723.8±233.5, min=259, max=2117 chat, coding, 🔥 -
AI-ModelScope/lmsys-chat-1m default 166211 545.8±3272.8, min=22, max=219116 chat, en lmsys/lmsys-chat-1m
AI-ModelScope/ms_agent_for_agentfabric default, addition 30000 615.7±198.7, min=251, max=2055 chat, agent, multi-round, 🔥 -
AI-ModelScope/orpo-dpo-mix-40k default 43666 938.1±694.2, min=36, max=8483 dpo, orpo, en, quality mlabonne/orpo-dpo-mix-40k
AI-ModelScope/pile default huge dataset - pretrain EleutherAI/pile
AI-ModelScope/ruozhiba post-annual, title-good, title-norm 85658 40.0±18.3, min=22, max=559 pretrain, 🔥 -
AI-ModelScope/school_math_0.25M default 248481 158.8±73.4, min=39, max=980 chat, math, quality BelleGroup/school_math_0.25M
AI-ModelScope/sharegpt_gpt4 default, V3_format, zh_38K_format 103329 3476.6±5959.0, min=33, max=115132 chat, multilingual, general, multi-round, gpt4, 🔥 -
AI-ModelScope/sql-create-context default 78577 82.7±31.5, min=36, max=282 chat, sql, 🔥 b-mc2/sql-create-context
AI-ModelScope/stack-exchange-paired default huge dataset - hfrl, dpo, pairwise lvwerra/stack-exchange-paired
AI-ModelScope/starcoderdata default huge dataset - pretrain, quality bigcode/starcoderdata
AI-ModelScope/synthetic_text_to_sql default 100000 221.8±69.9, min=64, max=616 nl2sql, en gretelai/synthetic_text_to_sql
AI-ModelScope/texttosqlv2_25000_v2 default 25000 277.3±328.3, min=40, max=1971 chat, sql Clinton/texttosqlv2_25000_v2
AI-ModelScope/the-stack default huge dataset - pretrain, quality bigcode/the-stack
AI-ModelScope/tigerbot-law-plugin default 55895 104.9±51.0, min=43, max=1087 text-generation, law, pretrained TigerResearch/tigerbot-law-plugin
AI-ModelScope/train_0.5M_CN default 519255 128.4±87.4, min=31, max=936 common, zh, quality BelleGroup/train_0.5M_CN
AI-ModelScope/train_1M_CN default huge dataset - common, zh, quality BelleGroup/train_1M_CN
AI-ModelScope/train_2M_CN default huge dataset - common, zh, quality BelleGroup/train_2M_CN
AI-ModelScope/tulu-v2-sft-mixture default 326154 523.3±439.3, min=68, max=2549 chat, multilingual, general, multi-round allenai/tulu-v2-sft-mixture
AI-ModelScope/ultrafeedback-binarized-preferences-cleaned-kto default 230720 471.5±274.3, min=27, max=2232 rlhf, kto -
AI-ModelScope/webnovel_cn default 50000 1455.2±12489.4, min=524, max=490480 chat, novel zxbsmk/webnovel_cn
AI-ModelScope/wikipedia-cn-20230720-filtered default huge dataset - pretrain, quality pleisto/wikipedia-cn-20230720-filtered
AI-ModelScope/zhihu_rlhf_3k default 3460 594.5±365.9, min=31, max=1716 rlhf, dpo, zh liyucheng/zhihu_rlhf_3k
DAMO_NLP/jd default 45012 66.9±87.0, min=41, max=1699 text-generation, classification, 🔥 -
- default huge dataset - pretrain, quality HuggingFaceFW/fineweb
- auto_math_text, khanacademy, openstax, stanford, stories, web_samples_v1, web_samples_v2, wikihow huge dataset - multi-domain, en, qa HuggingFaceTB/cosmopedia
OmniData/Zhihu-KOL default huge dataset - zhihu, qa wangrui6/Zhihu-KOL
OmniData/Zhihu-KOL-More-Than-100-Upvotes default 271261 1003.4±1826.1, min=28, max=52541 zhihu, qa bzb2023/Zhihu-KOL-More-Than-100-Upvotes
TIGER-Lab/MATH-plus train 893929 301.4±196.7, min=50, max=1162 qa, math, en, quality TIGER-Lab/MATH-plus
Tongyi-DataEngine/SA1B-Dense-Caption default huge dataset - zh, multi-modal, vqa -
Tongyi-DataEngine/SA1B-Paired-Captions-Images default 7736284 106.4±18.5, min=48, max=193 zh, multi-modal, vqa -
YorickHe/CoT default 74771 141.6±45.5, min=58, max=410 chat, general -
YorickHe/CoT_zh default 74771 129.1±53.2, min=51, max=401 chat, general -
ZhipuAI/LongWriter-6k default 6000 5009.0±2932.8, min=117, max=30354 long, chat, sft, 🔥 THUDM/LongWriter-6k
- default huge dataset - pretrain, quality allenai/c4
- default huge dataset - pretrain, quality cerebras/SlimPajama-627B
codefuse-ai/CodeExercise-Python-27k default 27224 337.3±154.2, min=90, max=2826 chat, coding, 🔥 -
codefuse-ai/Evol-instruction-66k default 66862 440.1±208.4, min=46, max=2661 chat, coding, 🔥 -
damo/MSAgent-Bench default, mini 638149 859.2±460.1, min=38, max=3479 chat, agent, multi-round -
damo/nlp_polylm_multialpaca_sft ar, de, es, fr, id, ja, ko, pt, ru, th, vi 131867 101.6±42.5, min=30, max=1029 chat, general, multilingual -
damo/zh_cls_fudan-news default 4959 3234.4±2547.5, min=91, max=19548 chat, classification -
damo/zh_ner-JAVE default 1266 118.3±45.5, min=44, max=223 chat, ner -
hjh0119/shareAI-Llama3-DPO-zh-en-emoji zh, en 2449 334.0±162.8, min=36, max=1801 rlhf, dpo -
huangjintao/AgentInstruct_copy alfworld, db, kg, mind2web, os, webshop 1866 1144.3±635.5, min=206, max=6412 chat, agent, multi-round -
iic/100PoisonMpts default 906 150.6±80.8, min=39, max=656 poison-management, zh -
iic/MSAgent-MultiRole default 543 413.0±79.7, min=70, max=936 chat, agent, multi-round, role-play, multi-agent -
iic/MSAgent-Pro default 21910 1978.1±747.9, min=339, max=8064 chat, agent, multi-round, 🔥 -
iic/ms_agent default 30000 645.8±218.0, min=199, max=2070 chat, agent, multi-round, 🔥 -
iic/ms_bench default 316820 353.4±424.5, min=29, max=2924 chat, general, multi-round, 🔥 -
- default huge dataset - multi-modal, en, vqa, quality lmms-lab/GQA
- 0_30_s_academic_v0_1, 0_30_s_youtube_v0_1, 1_2_m_academic_v0_1, 1_2_m_youtube_v0_1, 2_3_m_academic_v0_1, 2_3_m_youtube_v0_1, 30_60_s_academic_v0_1, 30_60_s_youtube_v0_1 1335486 273.7±78.8, min=107, max=638 chat, multi-modal, video lmms-lab/LLaVA-Video-178K
lvjianjin/AdvertiseGen default 97484 130.9±21.9, min=73, max=232 text-generation, 🔥 shibing624/AdvertiseGen
mapjack/openwebtext_dataset default huge dataset - pretrain, zh, quality -
modelscope/DuReader_robust-QG default 17899 242.0±143.1, min=75, max=1416 text-generation, 🔥 -
modelscope/chinese-poetry-collection default 1710 58.1±8.1, min=31, max=71 text-generation, poetry -
modelscope/clue cmnli 391783 81.6±16.0, min=54, max=157 text-generation, classification clue
modelscope/coco_2014_caption train, validation 454617 389.6±68.4, min=70, max=587 chat, multi-modal, vision, 🔥 -
shenweizhou/alpha-umi-toolbench-processed-v2 backbone, caller, planner, summarizer huge dataset - chat, agent, 🔥 -
simpleai/HC3 finance, medicine 11021 296.0±153.3, min=65, max=2267 text-generation, classification, 🔥 Hello-SimpleAI/HC3
simpleai/HC3-Chinese baike, baike_cls, open_qa, open_qa_cls, nlpcc_dbqa, nlpcc_dbqa_cls, finance, finance_cls, medicine, medicine_cls, law, law_cls, psychology, psychology_cls 39781 179.9±70.2, min=90, max=1070 text-generation, classification, 🔥 Hello-SimpleAI/HC3-Chinese
speech_asr/speech_asr_aishell1_trainsets train, validation, test 141600 40.8±3.3, min=33, max=53 chat, multi-modal, audio -
swift/A-OKVQA default 18201 43.5±7.9, min=27, max=94 multi-modal, en, vqa, quality HuggingFaceM4/A-OKVQA
swift/ChartQA default 28299 36.8±6.5, min=26, max=74 en, vqa, quality HuggingFaceM4/ChartQA
swift/GRIT caption, grounding, vqa huge dataset - multi-modal, en, caption-grounding, vqa, quality zzliang/GRIT
swift/GenQA default huge dataset - qa, quality, multi-task tomg-group-umd/GenQA
swift/Infinity-Instruct default huge dataset - qa, quality, multi-task BAAI/Infinity-Instruct
swift/Mantis-Instruct birds-to-words, chartqa, coinstruct, contrastive_caption, docvqa, dreamsim, dvqa, iconqa, imagecode, llava_665k_multi, lrv_multi, multi_vqa, nextqa, nlvr2, spot-the-diff, star, visual_story_telling 988115 619.9±156.6, min=243, max=1926 chat, multi-modal, vision -
swift/MideficsDataset default 3800 201.3±70.2, min=60, max=454 medical, en, vqa WinterSchool/MideficsDataset
swift/Multimodal-Mind2Web default 1009 293855.4±331149.5, min=11301, max=3577519 agent, multi-modal osunlp/Multimodal-Mind2Web
swift/OCR-VQA default 186753 32.3±5.8, min=27, max=80 multi-modal, en, ocr-vqa howard-hou/OCR-VQA
swift/OK-VQA_train default 9009 31.7±3.4, min=25, max=56 multi-modal, en, vqa, quality Multimodal-Fatima/OK-VQA_train
swift/OpenHermes-2.5 default huge dataset - cot, en, quality teknium/OpenHermes-2.5
swift/RLAIF-V-Dataset default 83132 99.6±54.8, min=30, max=362 rlhf, dpo, multi-modal, en openbmb/RLAIF-V-Dataset
swift/RedPajama-Data-1T default huge dataset - pretrain, quality togethercomputer/RedPajama-Data-1T
swift/RedPajama-Data-V2 default huge dataset - pretrain, quality togethercomputer/RedPajama-Data-V2
swift/ScienceQA default 16967 101.7±55.8, min=32, max=620 multi-modal, science, vqa, quality derek-thomas/ScienceQA
swift/SlimOrca default 517982 405.5±442.1, min=47, max=8312 quality, en Open-Orca/SlimOrca
swift/TextCaps default huge dataset - multi-modal, en, caption, quality HuggingFaceM4/TextCaps
swift/ToolBench default 124345 2251.7±1039.8, min=641, max=9451 chat, agent, multi-round -
swift/VQAv2 default huge dataset - en, vqa, quality HuggingFaceM4/VQAv2
swift/VideoChatGPT Generic, Temporal, Consistency 3206 87.4±48.3, min=31, max=398 chat, multi-modal, video, 🔥 lmms-lab/VideoChatGPT
swift/WebInstructSub default huge dataset - qa, en, math, quality, multi-domain, science TIGER-Lab/WebInstructSub
swift/aya_collection aya_dataset 202364 474.6±1539.1, min=25, max=71312 multi-lingual, qa CohereForAI/aya_collection
swift/chinese-c4 default huge dataset - pretrain, zh, quality shjwudp/chinese-c4
swift/cinepile default huge dataset - vqa, en, youtube, video tomg-group-umd/cinepile
swift/classical_chinese_translate default 6655 349.3±77.1, min=61, max=815 chat, play-ground -
swift/cosmopedia-100k default 100000 1037.0±254.8, min=339, max=2818 multi-domain, en, qa HuggingFaceTB/cosmopedia-100k
swift/dolma v1_7 huge dataset - pretrain, quality allenai/dolma
swift/dolphin flan1m-alpaca-uncensored, flan5m-alpaca-uncensored huge dataset - en cognitivecomputations/dolphin
swift/github-code default huge dataset - pretrain, quality codeparrot/github-code
swift/gpt4v-dataset default huge dataset - en, caption, multi-modal, quality laion/gpt4v-dataset
swift/llava-data llava_instruct 624255 369.7±143.0, min=40, max=905 sft, multi-modal, quality TIGER-Lab/llava-data
swift/llava-instruct-mix-vsft default 13640 178.8±119.8, min=34, max=951 multi-modal, en, vqa, quality HuggingFaceH4/llava-instruct-mix-vsft
swift/llava-med-zh-instruct-60k default 56649 207.9±67.7, min=42, max=594 zh, medical, vqa, multi-modal BUAADreamer/llava-med-zh-instruct-60k
swift/lnqa default huge dataset - multi-modal, en, ocr-vqa, quality vikhyatk/lnqa
swift/longwriter-6k-filtered default 666 4108.9±2636.9, min=1190, max=17050 long, chat, sft, 🔥 -
swift/medical_zh en, zh 2068589 256.4±87.3, min=39, max=1167 chat, medical -
swift/moondream2-coyo-5M-captions default huge dataset - caption, pretrain, quality isidentical/moondream2-coyo-5M-captions
swift/no_robots default 9485 300.0±246.2, min=40, max=6739 multi-task, quality, human-annotated HuggingFaceH4/no_robots
swift/orca_dpo_pairs default 12859 364.9±248.2, min=36, max=2010 rlhf, quality Intel/orca_dpo_pairs
swift/path-vqa default 19654 34.2±6.8, min=28, max=85 multi-modal, vqa, medical flaviagiammarino/path-vqa
swift/pile-val-backup default 214661 1831.4±11087.5, min=21, max=516620 text-generation, awq mit-han-lab/pile-val-backup
swift/pixelprose default huge dataset - caption, multi-modal, vision tomg-group-umd/pixelprose
swift/refcoco caption, grounding 92430 45.4±3.0, min=37, max=63 multi-modal, en, grounding jxu124/refcoco
swift/refcocog caption, grounding 89598 50.3±4.6, min=39, max=91 multi-modal, en, grounding jxu124/refcocog
swift/self-cognition default 108 58.9±20.3, min=32, max=131 chat, self-cognition, 🔥 modelscope/self-cognition
swift/sharegpt common-zh, unknow-zh, common-en 194063 820.5±366.1, min=25, max=2221 chat, general, multi-round -
swift/swift-sft-mixture sharegpt, firefly, codefuse, metamathqa huge dataset - chat, sft, general, 🔥 -
swift/tagengo-gpt4 default 76437 468.1±276.8, min=28, max=1726 chat, multi-lingual, quality lightblue/tagengo-gpt4
swift/train_3.5M_CN default huge dataset - common, zh, quality BelleGroup/train_3.5M_CN
swift/ultrachat_200k default 207843 1188.0±571.1, min=170, max=4068 chat, en, quality HuggingFaceH4/ultrachat_200k
swift/wikipedia default huge dataset - pretrain, quality wikipedia
- default huge dataset - pretrain, quality tiiuae/falcon-refinedweb
wyj123456/GPT4all default 806199 97.3±20.9, min=62, max=414 chat, general -
wyj123456/code_alpaca_en default 20022 99.3±57.6, min=30, max=857 chat, coding sahil2801/CodeAlpaca-20k
wyj123456/finance_en default 68912 264.5±207.1, min=30, max=2268 chat, financial ssbuild/alpaca_finance_en
wyj123456/instinwild default, subset 103695 125.1±43.7, min=35, max=801 chat, general -
wyj123456/instruct default 888970 271.0±333.6, min=34, max=3967 chat, general -