# Supported Models and Datasets

## Models

The table below describes the models integrated with ms-swift:

- Model ID: the ModelScope model ID
- HF Model ID: the Hugging Face model ID
- Model Type: the model type
- Default Template: the default chat template
- Requires: additional dependencies needed to use the model (a minimal loading sketch follows this list)
- Tags: the model's tags

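As a quick orientation to these columns, here is a minimal, hedged sketch (plain modelscope/transformers, not the ms-swift API): it downloads one checkpoint by its Model ID, loads it, and renders a prompt with the tokenizer's built-in chat template. The model name is just an example taken from the table below, and the Requires column (e.g. transformers>=4.37 for qwen2-type models) tells you which versions to install first.

```python
# Hedged sketch: verify that a listed checkpoint loads outside ms-swift.
from modelscope import snapshot_download          # resolves the ModelScope "Model ID"
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any Model ID from the table works the same way; to pull from Hugging Face
# instead, pass the "HF Model ID" directly to the transformers from_pretrained calls.
model_dir = snapshot_download("Qwen/Qwen2.5-7B-Instruct")

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir, torch_dtype="auto", device_map="auto"
)

# "Default Template" names the chat template ms-swift registers for the model type;
# outside ms-swift, the tokenizer's own chat template is a close stand-in.
messages = [{"role": "user", "content": "Hello"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```
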
### Large Language Models

Model ID Model Type Default Template Requires Tags HF Model ID
Qwen/Qwen-1_8B-Chat qwen qwen - - Qwen/Qwen-1_8B-Chat
Qwen/Qwen-7B-Chat qwen qwen - - Qwen/Qwen-7B-Chat
Qwen/Qwen-14B-Chat qwen qwen - - Qwen/Qwen-14B-Chat
Qwen/Qwen-72B-Chat qwen qwen - - Qwen/Qwen-72B-Chat
Qwen/Qwen-1_8B qwen qwen - - Qwen/Qwen-1_8B
Qwen/Qwen-7B qwen qwen - - Qwen/Qwen-7B
Qwen/Qwen-14B qwen qwen - - Qwen/Qwen-14B
Qwen/Qwen-72B qwen qwen - - Qwen/Qwen-72B
Qwen/Qwen-1_8B-Chat-Int4 qwen qwen - - Qwen/Qwen-1_8B-Chat-Int4
Qwen/Qwen-7B-Chat-Int4 qwen qwen - - Qwen/Qwen-7B-Chat-Int4
Qwen/Qwen-14B-Chat-Int4 qwen qwen - - Qwen/Qwen-14B-Chat-Int4
Qwen/Qwen-72B-Chat-Int4 qwen qwen - - Qwen/Qwen-72B-Chat-Int4
Qwen/Qwen-1_8B-Chat-Int8 qwen qwen - - Qwen/Qwen-1_8B-Chat-Int8
Qwen/Qwen-7B-Chat-Int8 qwen qwen - - Qwen/Qwen-7B-Chat-Int8
Qwen/Qwen-14B-Chat-Int8 qwen qwen - - Qwen/Qwen-14B-Chat-Int8
Qwen/Qwen-72B-Chat-Int8 qwen qwen - - Qwen/Qwen-72B-Chat-Int8
TongyiFinance/Tongyi-Finance-14B-Chat qwen qwen - financial jxy/Tongyi-Finance-14B-Chat
TongyiFinance/Tongyi-Finance-14B qwen qwen - financial -
TongyiFinance/Tongyi-Finance-14B-Chat-Int4 qwen qwen - financial jxy/Tongyi-Finance-14B-Chat-Int4
Qwen/Qwen1.5-0.5B-Chat qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-0.5B-Chat
Qwen/Qwen1.5-1.8B-Chat qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-1.8B-Chat
Qwen/Qwen1.5-4B-Chat qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-4B-Chat
Qwen/Qwen1.5-7B-Chat qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-7B-Chat
Qwen/Qwen1.5-14B-Chat qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-14B-Chat
Qwen/Qwen1.5-32B-Chat qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-32B-Chat
Qwen/Qwen1.5-72B-Chat qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-72B-Chat
Qwen/Qwen1.5-110B-Chat qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-110B-Chat
Qwen/Qwen1.5-0.5B qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-0.5B
Qwen/Qwen1.5-1.8B qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-1.8B
Qwen/Qwen1.5-4B qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-4B
Qwen/Qwen1.5-7B qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-7B
Qwen/Qwen1.5-14B qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-14B
Qwen/Qwen1.5-32B qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-32B
Qwen/Qwen1.5-72B qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-72B
Qwen/Qwen1.5-110B qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-110B
Qwen/Qwen1.5-0.5B-Chat-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-0.5B-Chat-GPTQ-Int4
Qwen/Qwen1.5-1.8B-Chat-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-1.8B-Chat-GPTQ-Int4
Qwen/Qwen1.5-4B-Chat-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-4B-Chat-GPTQ-Int4
Qwen/Qwen1.5-7B-Chat-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-7B-Chat-GPTQ-Int4
Qwen/Qwen1.5-14B-Chat-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-14B-Chat-GPTQ-Int4
Qwen/Qwen1.5-32B-Chat-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-32B-Chat-GPTQ-Int4
Qwen/Qwen1.5-72B-Chat-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-72B-Chat-GPTQ-Int4
Qwen/Qwen1.5-110B-Chat-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-110B-Chat-GPTQ-Int4
Qwen/Qwen1.5-0.5B-Chat-GPTQ-Int8 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-0.5B-Chat-GPTQ-Int8
Qwen/Qwen1.5-1.8B-Chat-GPTQ-Int8 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-1.8B-Chat-GPTQ-Int8
Qwen/Qwen1.5-4B-Chat-GPTQ-Int8 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-4B-Chat-GPTQ-Int8
Qwen/Qwen1.5-7B-Chat-GPTQ-Int8 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-7B-Chat-GPTQ-Int8
Qwen/Qwen1.5-14B-Chat-GPTQ-Int8 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-14B-Chat-GPTQ-Int8
Qwen/Qwen1.5-72B-Chat-GPTQ-Int8 qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-72B-Chat-GPTQ-Int8
Qwen/Qwen1.5-0.5B-Chat-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-0.5B-Chat-AWQ
Qwen/Qwen1.5-1.8B-Chat-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-1.8B-Chat-AWQ
Qwen/Qwen1.5-4B-Chat-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-4B-Chat-AWQ
Qwen/Qwen1.5-7B-Chat-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-7B-Chat-AWQ
Qwen/Qwen1.5-14B-Chat-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-14B-Chat-AWQ
Qwen/Qwen1.5-32B-Chat-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-32B-Chat-AWQ
Qwen/Qwen1.5-72B-Chat-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-72B-Chat-AWQ
Qwen/Qwen1.5-110B-Chat-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen1.5-110B-Chat-AWQ
Qwen/CodeQwen1.5-7B qwen2 qwen transformers>=4.37 coding Qwen/CodeQwen1.5-7B
Qwen/CodeQwen1.5-7B-Chat qwen2 qwen transformers>=4.37 coding Qwen/CodeQwen1.5-7B-Chat
Qwen/CodeQwen1.5-7B-Chat-AWQ qwen2 qwen transformers>=4.37 coding Qwen/CodeQwen1.5-7B-Chat-AWQ
Qwen/Qwen2-0.5B-Instruct qwen2 qwen transformers>=4.37 - Qwen/Qwen2-0.5B-Instruct
Qwen/Qwen2-1.5B-Instruct qwen2 qwen transformers>=4.37 - Qwen/Qwen2-1.5B-Instruct
Qwen/Qwen2-7B-Instruct qwen2 qwen transformers>=4.37 - Qwen/Qwen2-7B-Instruct
Qwen/Qwen2-72B-Instruct qwen2 qwen transformers>=4.37 - Qwen/Qwen2-72B-Instruct
Qwen/Qwen2-0.5B qwen2 qwen transformers>=4.37 - Qwen/Qwen2-0.5B
Qwen/Qwen2-1.5B qwen2 qwen transformers>=4.37 - Qwen/Qwen2-1.5B
Qwen/Qwen2-7B qwen2 qwen transformers>=4.37 - Qwen/Qwen2-7B
Qwen/Qwen2-72B qwen2 qwen transformers>=4.37 - Qwen/Qwen2-72B
Qwen/Qwen2-0.5B-Instruct-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen2-0.5B-Instruct-GPTQ-Int4
Qwen/Qwen2-1.5B-Instruct-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen2-1.5B-Instruct-GPTQ-Int4
Qwen/Qwen2-7B-Instruct-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen2-7B-Instruct-GPTQ-Int4
Qwen/Qwen2-72B-Instruct-GPTQ-Int4 qwen2 qwen transformers>=4.37 - Qwen/Qwen2-72B-Instruct-GPTQ-Int4
Qwen/Qwen2-0.5B-Instruct-GPTQ-Int8 qwen2 qwen transformers>=4.37 - Qwen/Qwen2-0.5B-Instruct-GPTQ-Int8
Qwen/Qwen2-1.5B-Instruct-GPTQ-Int8 qwen2 qwen transformers>=4.37 - Qwen/Qwen2-1.5B-Instruct-GPTQ-Int8
Qwen/Qwen2-7B-Instruct-GPTQ-Int8 qwen2 qwen transformers>=4.37 - Qwen/Qwen2-7B-Instruct-GPTQ-Int8
Qwen/Qwen2-72B-Instruct-GPTQ-Int8 qwen2 qwen transformers>=4.37 - Qwen/Qwen2-72B-Instruct-GPTQ-Int8
Qwen/Qwen2-0.5B-Instruct-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen2-0.5B-Instruct-AWQ
Qwen/Qwen2-1.5B-Instruct-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen2-1.5B-Instruct-AWQ
Qwen/Qwen2-7B-Instruct-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen2-7B-Instruct-AWQ
Qwen/Qwen2-72B-Instruct-AWQ qwen2 qwen transformers>=4.37 - Qwen/Qwen2-72B-Instruct-AWQ
Qwen/Qwen2-Math-1.5B-Instruct qwen2 qwen transformers>=4.37 math Qwen/Qwen2-Math-1.5B-Instruct
Qwen/Qwen2-Math-7B-Instruct qwen2 qwen transformers>=4.37 math Qwen/Qwen2-Math-7B-Instruct
Qwen/Qwen2-Math-72B-Instruct qwen2 qwen transformers>=4.37 math Qwen/Qwen2-Math-72B-Instruct
Qwen/Qwen2-Math-1.5B qwen2 qwen transformers>=4.37 math Qwen/Qwen2-Math-1.5B
Qwen/Qwen2-Math-7B qwen2 qwen transformers>=4.37 math Qwen/Qwen2-Math-7B
Qwen/Qwen2-Math-72B qwen2 qwen transformers>=4.37 math Qwen/Qwen2-Math-72B
Qwen/Qwen2.5-0.5B-Instruct qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-0.5B-Instruct
Qwen/Qwen2.5-1.5B-Instruct qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-1.5B-Instruct
Qwen/Qwen2.5-3B-Instruct qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-3B-Instruct
Qwen/Qwen2.5-7B-Instruct qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-7B-Instruct
Qwen/Qwen2.5-14B-Instruct qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-14B-Instruct
Qwen/Qwen2.5-32B-Instruct qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-32B-Instruct
Qwen/Qwen2.5-72B-Instruct qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-72B-Instruct
Qwen/Qwen2.5-0.5B qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-0.5B
Qwen/Qwen2.5-1.5B qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-1.5B
Qwen/Qwen2.5-3B qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-3B
Qwen/Qwen2.5-7B qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-7B
Qwen/Qwen2.5-14B qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-14B
Qwen/Qwen2.5-32B qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-32B
Qwen/Qwen2.5-72B qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-72B
Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-1.5B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-1.5B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-3B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-3B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-7B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-7B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-14B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-14B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-32B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-32B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-72B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-72B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-1.5B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-1.5B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-3B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-3B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-7B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-7B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-14B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-14B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-32B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-32B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-72B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-72B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-0.5B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-0.5B-Instruct-AWQ
Qwen/Qwen2.5-1.5B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-1.5B-Instruct-AWQ
Qwen/Qwen2.5-3B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-3B-Instruct-AWQ
Qwen/Qwen2.5-7B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-7B-Instruct-AWQ
Qwen/Qwen2.5-14B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-14B-Instruct-AWQ
Qwen/Qwen2.5-32B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-32B-Instruct-AWQ
Qwen/Qwen2.5-72B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 - Qwen/Qwen2.5-72B-Instruct-AWQ
Qwen/Qwen2.5-Math-1.5B-Instruct qwen2_5 qwen2_5 transformers>=4.37 math Qwen/Qwen2.5-Math-1.5B-Instruct
Qwen/Qwen2.5-Math-7B-Instruct qwen2_5 qwen2_5 transformers>=4.37 math Qwen/Qwen2.5-Math-7B-Instruct
Qwen/Qwen2.5-Math-72B-Instruct qwen2_5 qwen2_5 transformers>=4.37 math Qwen/Qwen2.5-Math-72B-Instruct
Qwen/Qwen2.5-Math-1.5B qwen2_5 qwen2_5 transformers>=4.37 math Qwen/Qwen2.5-Math-1.5B
Qwen/Qwen2.5-Math-7B qwen2_5 qwen2_5 transformers>=4.37 math Qwen/Qwen2.5-Math-7B
Qwen/Qwen2.5-Math-72B qwen2_5 qwen2_5 transformers>=4.37 math Qwen/Qwen2.5-Math-72B
Qwen/Qwen2.5-Coder-0.5B-Instruct qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-0.5B-Instruct
Qwen/Qwen2.5-Coder-1.5B-Instruct qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-1.5B-Instruct
Qwen/Qwen2.5-Coder-3B-Instruct qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-3B-Instruct
Qwen/Qwen2.5-Coder-7B-Instruct qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-7B-Instruct
Qwen/Qwen2.5-Coder-14B-Instruct qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-14B-Instruct
Qwen/Qwen2.5-Coder-32B-Instruct qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-32B-Instruct
Qwen/Qwen2.5-Coder-0.5B qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-0.5B
Qwen/Qwen2.5-Coder-1.5B qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-1.5B
Qwen/Qwen2.5-Coder-3B qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-3B
Qwen/Qwen2.5-Coder-7B qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-7B
Qwen/Qwen2.5-Coder-14B qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-14B
Qwen/Qwen2.5-Coder-32B qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-32B
Qwen/Qwen2.5-Coder-0.5B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-0.5B-Instruct-AWQ
Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ
Qwen/Qwen2.5-Coder-3B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-3B-Instruct-AWQ
Qwen/Qwen2.5-Coder-7B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-7B-Instruct-AWQ
Qwen/Qwen2.5-Coder-14B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-14B-Instruct-AWQ
Qwen/Qwen2.5-Coder-32B-Instruct-AWQ qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-32B-Instruct-AWQ
Qwen/Qwen2.5-Coder-0.5B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-0.5B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-Coder-0.5B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-0.5B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-Coder-1.5B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-1.5B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-Coder-1.5B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-1.5B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-Coder-3B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-3B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-Coder-3B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-3B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-7B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-Coder-14B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-14B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-Coder-14B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-14B-Instruct-GPTQ-Int8
Qwen/Qwen2.5-Coder-32B-Instruct-GPTQ-Int4 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-32B-Instruct-GPTQ-Int4
Qwen/Qwen2.5-Coder-32B-Instruct-GPTQ-Int8 qwen2_5 qwen2_5 transformers>=4.37 coding Qwen/Qwen2.5-Coder-32B-Instruct-GPTQ-Int8
Qwen/Qwen1.5-MoE-A2.7B-Chat qwen2_moe qwen transformers>=4.40 - Qwen/Qwen1.5-MoE-A2.7B-Chat
Qwen/Qwen1.5-MoE-A2.7B qwen2_moe qwen transformers>=4.40 - Qwen/Qwen1.5-MoE-A2.7B
Qwen/Qwen1.5-MoE-A2.7B-Chat-GPTQ-Int4 qwen2_moe qwen transformers>=4.40 - Qwen/Qwen1.5-MoE-A2.7B-Chat-GPTQ-Int4
Qwen/Qwen2-57B-A14B-Instruct qwen2_moe qwen transformers>=4.40 - Qwen/Qwen2-57B-A14B-Instruct
Qwen/Qwen2-57B-A14B qwen2_moe qwen transformers>=4.40 - Qwen/Qwen2-57B-A14B
Qwen/Qwen2-57B-A14B-Instruct-GPTQ-Int4 qwen2_moe qwen transformers>=4.40 - Qwen/Qwen2-57B-A14B-Instruct-GPTQ-Int4
Qwen/QwQ-32B-Preview qwq qwq transformers>=4.37 - Qwen/QwQ-32B-Preview
codefuse-ai/CodeFuse-QWen-14B codefuse_qwen codefuse - coding codefuse-ai/CodeFuse-QWen-14B
iic/ModelScope-Agent-7B modelscope_agent modelscope_agent - - -
iic/ModelScope-Agent-14B modelscope_agent modelscope_agent - - -
AIDC-AI/Marco-o1 marco_o1 marco_o1 transformers>=4.37 - AIDC-AI/Marco-o1
modelscope/Llama-2-7b-ms llama llama - - meta-llama/Llama-2-7b-hf
modelscope/Llama-2-13b-ms llama llama - - meta-llama/Llama-2-13b-hf
modelscope/Llama-2-70b-ms llama llama - - meta-llama/Llama-2-70b-hf
modelscope/Llama-2-7b-chat-ms llama llama - - meta-llama/Llama-2-7b-chat-hf
modelscope/Llama-2-13b-chat-ms llama llama - - meta-llama/Llama-2-13b-chat-hf
modelscope/Llama-2-70b-chat-ms llama llama - - meta-llama/Llama-2-70b-chat-hf
AI-ModelScope/chinese-llama-2-1.3b llama llama - - hfl/chinese-llama-2-1.3b
AI-ModelScope/chinese-llama-2-7b llama llama - - hfl/chinese-llama-2-7b
AI-ModelScope/chinese-llama-2-7b-16k llama llama - - hfl/chinese-llama-2-7b-16k
AI-ModelScope/chinese-llama-2-7b-64k llama llama - - hfl/chinese-llama-2-7b-64k
AI-ModelScope/chinese-llama-2-13b llama llama - - hfl/chinese-llama-2-13b
AI-ModelScope/chinese-llama-2-13b-16k llama llama - - hfl/chinese-llama-2-13b-16k
AI-ModelScope/chinese-alpaca-2-1.3b llama llama - - hfl/chinese-alpaca-2-1.3b
AI-ModelScope/chinese-alpaca-2-7b llama llama - - hfl/chinese-alpaca-2-7b
AI-ModelScope/chinese-alpaca-2-7b-16k llama llama - - hfl/chinese-alpaca-2-7b-16k
AI-ModelScope/chinese-alpaca-2-7b-64k llama llama - - hfl/chinese-alpaca-2-7b-64k
AI-ModelScope/chinese-alpaca-2-13b llama llama - - hfl/chinese-alpaca-2-13b
AI-ModelScope/chinese-alpaca-2-13b-16k llama llama - - hfl/chinese-alpaca-2-13b-16k
AI-ModelScope/Llama-2-7b-AQLM-2Bit-1x16-hf llama llama - - ISTA-DASLab/Llama-2-7b-AQLM-2Bit-1x16-hf
LLM-Research/Meta-Llama-3-8B-Instruct llama3 llama3 - - meta-llama/Meta-Llama-3-8B-Instruct
LLM-Research/Meta-Llama-3-70B-Instruct llama3 llama3 - - meta-llama/Meta-Llama-3-70B-Instruct
LLM-Research/Meta-Llama-3-8B llama3 llama3 - - meta-llama/Meta-Llama-3-8B
LLM-Research/Meta-Llama-3-70B llama3 llama3 - - meta-llama/Meta-Llama-3-70B
swift/Meta-Llama-3-8B-Instruct-GPTQ-Int4 llama3 llama3 - - study-hjt/Meta-Llama-3-8B-Instruct-GPTQ-Int4
swift/Meta-Llama-3-8B-Instruct-GPTQ-Int8 llama3 llama3 - - study-hjt/Meta-Llama-3-8B-Instruct-GPTQ-Int8
swift/Meta-Llama-3-8B-Instruct-AWQ llama3 llama3 - - study-hjt/Meta-Llama-3-8B-Instruct-AWQ
swift/Meta-Llama-3-70B-Instruct-GPTQ-Int4 llama3 llama3 - - study-hjt/Meta-Llama-3-70B-Instruct-GPTQ-Int4
swift/Meta-Llama-3-70B-Instruct-GPTQ-Int8 llama3 llama3 - - study-hjt/Meta-Llama-3-70B-Instruct-GPTQ-Int8
swift/Meta-Llama-3-70B-Instruct-AWQ llama3 llama3 - - study-hjt/Meta-Llama-3-70B-Instruct-AWQ
ChineseAlpacaGroup/llama-3-chinese-8b-instruct llama3 llama3 - - hfl/llama-3-chinese-8b-instruct
ChineseAlpacaGroup/llama-3-chinese-8b llama3 llama3 - - hfl/llama-3-chinese-8b
LLM-Research/Meta-Llama-3.1-8B-Instruct llama3_1 llama3_2 transformers>=4.43 - meta-llama/Meta-Llama-3.1-8B-Instruct
LLM-Research/Meta-Llama-3.1-70B-Instruct llama3_1 llama3_2 transformers>=4.43 - meta-llama/Meta-Llama-3.1-70B-Instruct
LLM-Research/Meta-Llama-3.1-405B-Instruct llama3_1 llama3_2 transformers>=4.43 - meta-llama/Meta-Llama-3.1-405B-Instruct
LLM-Research/Meta-Llama-3.1-8B llama3_1 llama3_2 transformers>=4.43 - meta-llama/Meta-Llama-3.1-8B
LLM-Research/Meta-Llama-3.1-70B llama3_1 llama3_2 transformers>=4.43 - meta-llama/Meta-Llama-3.1-70B
LLM-Research/Meta-Llama-3.1-405B llama3_1 llama3_2 transformers>=4.43 - meta-llama/Meta-Llama-3.1-405B
LLM-Research/Meta-Llama-3.1-70B-Instruct-FP8 llama3_1 llama3_2 transformers>=4.43 - meta-llama/Meta-Llama-3.1-70B-Instruct-FP8
LLM-Research/Meta-Llama-3.1-405B-Instruct-FP8 llama3_1 llama3_2 transformers>=4.43 - meta-llama/Meta-Llama-3.1-405B-Instruct-FP8
LLM-Research/Meta-Llama-3.1-8B-Instruct-BNB-NF4 llama3_1 llama3_2 transformers>=4.43 - hugging-quants/Meta-Llama-3.1-8B-Instruct-BNB-NF4
LLM-Research/Meta-Llama-3.1-70B-Instruct-bnb-4bit llama3_1 llama3_2 transformers>=4.43 - unsloth/Meta-Llama-3.1-70B-Instruct-bnb-4bit
LLM-Research/Meta-Llama-3.1-405B-Instruct-BNB-NF4 llama3_1 llama3_2 transformers>=4.43 - hugging-quants/Meta-Llama-3.1-405B-Instruct-BNB-NF4
LLM-Research/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4 llama3_1 llama3_2 transformers>=4.43 - hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4
LLM-Research/Meta-Llama-3.1-70B-Instruct-GPTQ-INT4 llama3_1 llama3_2 transformers>=4.43 - hugging-quants/Meta-Llama-3.1-70B-Instruct-GPTQ-INT4
LLM-Research/Meta-Llama-3.1-405B-Instruct-GPTQ-INT4 llama3_1 llama3_2 transformers>=4.43 - hugging-quants/Meta-Llama-3.1-405B-Instruct-GPTQ-INT4
LLM-Research/Meta-Llama-3.1-8B-Instruct-AWQ-INT4 llama3_1 llama3_2 transformers>=4.43 - hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4
LLM-Research/Meta-Llama-3.1-70B-Instruct-AWQ-INT4 llama3_1 llama3_2 transformers>=4.43 - hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4
LLM-Research/Meta-Llama-3.1-405B-Instruct-AWQ-INT4 llama3_1 llama3_2 transformers>=4.43 - hugging-quants/Meta-Llama-3.1-405B-Instruct-AWQ-INT4
AI-ModelScope/Llama-3.1-Nemotron-70B-Instruct-HF llama3_1 llama3_2 transformers>=4.43 - nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
LLM-Research/Llama-3.2-1B llama3_2 llama3_2 transformers>=4.45 - meta-llama/Llama-3.2-1B
LLM-Research/Llama-3.2-3B llama3_2 llama3_2 transformers>=4.45 - meta-llama/Llama-3.2-3B
LLM-Research/Llama-3.2-1B-Instruct llama3_2 llama3_2 transformers>=4.45 - meta-llama/Llama-3.2-1B-Instruct
LLM-Research/Llama-3.2-3B-Instruct llama3_2 llama3_2 transformers>=4.45 - meta-llama/Llama-3.2-3B-Instruct
LLM-Research/Llama-3.3-70B-Instruct llama3_2 llama3_2 transformers>=4.45 - meta-llama/Llama-3.3-70B-Instruct
unsloth/Llama-3.3-70B-Instruct-bnb-4bit llama3_2 llama3_2 transformers>=4.45 - unsloth/Llama-3.3-70B-Instruct-bnb-4bit
LLM-Research/Reflection-Llama-3.1-70B reflection reflection transformers>=4.43 - mattshumer/Reflection-Llama-3.1-70B
01ai/Yi-6B yi chatml - - 01-ai/Yi-6B
01ai/Yi-6B-200K yi chatml - - 01-ai/Yi-6B-200K
01ai/Yi-6B-Chat yi chatml - - 01-ai/Yi-6B-Chat
01ai/Yi-6B-Chat-4bits yi chatml - - 01-ai/Yi-6B-Chat-4bits
01ai/Yi-6B-Chat-8bits yi chatml - - 01-ai/Yi-6B-Chat-8bits
01ai/Yi-9B yi chatml - - 01-ai/Yi-9B
01ai/Yi-9B-200K yi chatml - - 01-ai/Yi-9B-200K
01ai/Yi-34B yi chatml - - 01-ai/Yi-34B
01ai/Yi-34B-200K yi chatml - - 01-ai/Yi-34B-200K
01ai/Yi-34B-Chat yi chatml - - 01-ai/Yi-34B-Chat
01ai/Yi-34B-Chat-4bits yi chatml - - 01-ai/Yi-34B-Chat-4bits
01ai/Yi-34B-Chat-8bits yi chatml - - 01-ai/Yi-34B-Chat-8bits
01ai/Yi-1.5-6B yi chatml - - 01-ai/Yi-1.5-6B
01ai/Yi-1.5-6B-Chat yi chatml - - 01-ai/Yi-1.5-6B-Chat
01ai/Yi-1.5-9B yi chatml - - 01-ai/Yi-1.5-9B
01ai/Yi-1.5-9B-Chat yi chatml - - 01-ai/Yi-1.5-9B-Chat
01ai/Yi-1.5-9B-Chat-16K yi chatml - - 01-ai/Yi-1.5-9B-Chat-16K
01ai/Yi-1.5-34B yi chatml - - 01-ai/Yi-1.5-34B
01ai/Yi-1.5-34B-Chat yi chatml - - 01-ai/Yi-1.5-34B-Chat
01ai/Yi-1.5-34B-Chat-16K yi chatml - - 01-ai/Yi-1.5-34B-Chat-16K
AI-ModelScope/Yi-1.5-6B-Chat-GPTQ yi chatml - - modelscope/Yi-1.5-6B-Chat-GPTQ
AI-ModelScope/Yi-1.5-6B-Chat-AWQ yi chatml - - modelscope/Yi-1.5-6B-Chat-AWQ
AI-ModelScope/Yi-1.5-9B-Chat-GPTQ yi chatml - - modelscope/Yi-1.5-9B-Chat-GPTQ
AI-ModelScope/Yi-1.5-9B-Chat-AWQ yi chatml - - modelscope/Yi-1.5-9B-Chat-AWQ
AI-ModelScope/Yi-1.5-34B-Chat-GPTQ yi chatml - - modelscope/Yi-1.5-34B-Chat-GPTQ
AI-ModelScope/Yi-1.5-34B-Chat-AWQ yi chatml - - modelscope/Yi-1.5-34B-Chat-AWQ
01ai/Yi-Coder-1.5B yi_coder yi_coder - coding 01-ai/Yi-Coder-1.5B
01ai/Yi-Coder-9B yi_coder yi_coder - coding 01-ai/Yi-Coder-9B
01ai/Yi-Coder-1.5B-Chat yi_coder yi_coder - coding 01-ai/Yi-Coder-1.5B-Chat
01ai/Yi-Coder-9B-Chat yi_coder yi_coder - coding 01-ai/Yi-Coder-9B-Chat
SUSTC/SUS-Chat-34B sus sus - - SUSTech/SUS-Chat-34B
codefuse-ai/CodeFuse-CodeLlama-34B codefuse_codellama codefuse_codellama - coding codefuse-ai/CodeFuse-CodeLlama-34B
langboat/Mengzi3-13B-Base mengzi3 mengzi - - Langboat/Mengzi3-13B-Base
Fengshenbang/Ziya2-13B-Base ziya ziya - - IDEA-CCNL/Ziya2-13B-Base
Fengshenbang/Ziya2-13B-Chat ziya ziya - - IDEA-CCNL/Ziya2-13B-Chat
AI-ModelScope/NuminaMath-7B-TIR numina numina - math AI-MO/NuminaMath-7B-TIR
FlagAlpha/Atom-7B atom atom - - FlagAlpha/Atom-7B
FlagAlpha/Atom-7B-Chat atom atom - - FlagAlpha/Atom-7B-Chat
ZhipuAI/chatglm2-6b chatglm2 chatglm2 - - THUDM/chatglm2-6b
ZhipuAI/chatglm2-6b-32k chatglm2 chatglm2 - - THUDM/chatglm2-6b-32k
ZhipuAI/codegeex2-6b chatglm2 chatglm2 - coding THUDM/codegeex2-6b
ZhipuAI/chatglm3-6b chatglm3 glm4 transformers<4.42 - THUDM/chatglm3-6b
ZhipuAI/chatglm3-6b-base chatglm3 glm4 transformers<4.42 - THUDM/chatglm3-6b-base
ZhipuAI/chatglm3-6b-32k chatglm3 glm4 transformers<4.42 - THUDM/chatglm3-6b-32k
ZhipuAI/chatglm3-6b-128k chatglm3 glm4 transformers<4.42 - THUDM/chatglm3-6b-128k
ZhipuAI/glm-4-9b-chat glm4 glm4 transformers>=4.42 - THUDM/glm-4-9b-chat
ZhipuAI/glm-4-9b glm4 glm4 transformers>=4.42 - THUDM/glm-4-9b
ZhipuAI/glm-4-9b-chat-1m glm4 glm4 transformers>=4.42 - THUDM/glm-4-9b-chat-1m
ZhipuAI/LongWriter-glm4-9b glm4 glm4 transformers>=4.42 - THUDM/LongWriter-glm4-9b
ZhipuAI/glm-edge-1.5b-chat glm_edge glm4 transformers>=4.46 - THUDM/glm-edge-1.5b-chat
ZhipuAI/glm-edge-4b-chat glm_edge glm4 transformers>=4.46 - THUDM/glm-edge-4b-chat
codefuse-ai/CodeFuse-CodeGeeX2-6B codefuse_codegeex2 codefuse transformers<4.34 coding codefuse-ai/CodeFuse-CodeGeeX2-6B
ZhipuAI/codegeex4-all-9b codegeex4 codegeex4 transformers<4.42 coding THUDM/codegeex4-all-9b
ZhipuAI/LongWriter-llama3.1-8b longwriter_llama3_1 longwriter_llama transformers>=4.43 - THUDM/LongWriter-llama3.1-8b
Shanghai_AI_Laboratory/internlm-chat-7b internlm internlm - - internlm/internlm-chat-7b
Shanghai_AI_Laboratory/internlm-7b internlm internlm - - internlm/internlm-7b
Shanghai_AI_Laboratory/internlm-chat-7b-8k internlm internlm - - -
Shanghai_AI_Laboratory/internlm-20b internlm internlm - - internlm/internlm-20b
Shanghai_AI_Laboratory/internlm-chat-20b internlm internlm - - internlm/internlm-chat-20b
Shanghai_AI_Laboratory/internlm2-chat-1_8b internlm2 internlm2 transformers>=4.38 - internlm/internlm2-chat-1_8b
Shanghai_AI_Laboratory/internlm2-1_8b internlm2 internlm2 transformers>=4.38 - internlm/internlm2-1_8b
Shanghai_AI_Laboratory/internlm2-chat-1_8b-sft internlm2 internlm2 transformers>=4.38 - internlm/internlm2-chat-1_8b-sft
Shanghai_AI_Laboratory/internlm2-base-7b internlm2 internlm2 transformers>=4.38 - internlm/internlm2-base-7b
Shanghai_AI_Laboratory/internlm2-7b internlm2 internlm2 transformers>=4.38 - internlm/internlm2-7b
Shanghai_AI_Laboratory/internlm2-chat-7b internlm2 internlm2 transformers>=4.38 - internlm/internlm2-chat-7b
Shanghai_AI_Laboratory/internlm2-chat-7b-sft internlm2 internlm2 transformers>=4.38 - internlm/internlm2-chat-7b-sft
Shanghai_AI_Laboratory/internlm2-base-20b internlm2 internlm2 transformers>=4.38 - internlm/internlm2-base-20b
Shanghai_AI_Laboratory/internlm2-20b internlm2 internlm2 transformers>=4.38 - internlm/internlm2-20b
Shanghai_AI_Laboratory/internlm2-chat-20b internlm2 internlm2 transformers>=4.38 - internlm/internlm2-chat-20b
Shanghai_AI_Laboratory/internlm2-chat-20b-sft internlm2 internlm2 transformers>=4.38 - internlm/internlm2-chat-20b-sft
Shanghai_AI_Laboratory/internlm2-math-7b internlm2 internlm2 transformers>=4.38 math internlm/internlm2-math-7b
Shanghai_AI_Laboratory/internlm2-math-base-7b internlm2 internlm2 transformers>=4.38 math internlm/internlm2-math-base-7b
Shanghai_AI_Laboratory/internlm2-math-base-20b internlm2 internlm2 transformers>=4.38 math internlm/internlm2-math-base-20b
Shanghai_AI_Laboratory/internlm2-math-20b internlm2 internlm2 transformers>=4.38 math internlm/internlm2-math-20b
Shanghai_AI_Laboratory/internlm2_5-1_8b-chat internlm2 internlm2 transformers>=4.38 - internlm/internlm2_5-1_8b-chat
Shanghai_AI_Laboratory/internlm2_5-1_8b internlm2 internlm2 transformers>=4.38 - internlm/internlm2_5-1_8b
Shanghai_AI_Laboratory/internlm2_5-7b internlm2 internlm2 transformers>=4.38 - internlm/internlm2_5-7b
Shanghai_AI_Laboratory/internlm2_5-7b-chat internlm2 internlm2 transformers>=4.38 - internlm/internlm2_5-7b-chat
Shanghai_AI_Laboratory/internlm2_5-7b-chat-1m internlm2 internlm2 transformers>=4.38 - internlm/internlm2_5-7b-chat-1m
Shanghai_AI_Laboratory/internlm2_5-20b internlm2 internlm2 transformers>=4.38 - internlm/internlm2_5-20b
Shanghai_AI_Laboratory/internlm2_5-20b-chat internlm2 internlm2 transformers>=4.38 - internlm/internlm2_5-20b-chat
deepseek-ai/deepseek-llm-7b-base deepseek deepseek - - deepseek-ai/deepseek-llm-7b-base
deepseek-ai/deepseek-llm-7b-chat deepseek deepseek - - deepseek-ai/deepseek-llm-7b-chat
deepseek-ai/deepseek-llm-67b-base deepseek deepseek - - deepseek-ai/deepseek-llm-67b-base
deepseek-ai/deepseek-llm-67b-chat deepseek deepseek - - deepseek-ai/deepseek-llm-67b-chat
deepseek-ai/deepseek-math-7b-base deepseek deepseek - math deepseek-ai/deepseek-math-7b-base
deepseek-ai/deepseek-math-7b-instruct deepseek deepseek - math deepseek-ai/deepseek-math-7b-instruct
deepseek-ai/deepseek-math-7b-rl deepseek deepseek - math deepseek-ai/deepseek-math-7b-rl
deepseek-ai/deepseek-coder-1.3b-base deepseek deepseek - coding deepseek-ai/deepseek-coder-1.3b-base
deepseek-ai/deepseek-coder-1.3b-instruct deepseek deepseek - coding deepseek-ai/deepseek-coder-1.3b-instruct
deepseek-ai/deepseek-coder-6.7b-base deepseek deepseek - coding deepseek-ai/deepseek-coder-6.7b-base
deepseek-ai/deepseek-coder-6.7b-instruct deepseek deepseek - coding deepseek-ai/deepseek-coder-6.7b-instruct
deepseek-ai/deepseek-coder-33b-base deepseek deepseek - coding deepseek-ai/deepseek-coder-33b-base
deepseek-ai/deepseek-coder-33b-instruct deepseek deepseek - coding deepseek-ai/deepseek-coder-33b-instruct
deepseek-ai/deepseek-moe-16b-chat deepseek_moe deepseek - - deepseek-ai/deepseek-moe-16b-chat
deepseek-ai/deepseek-moe-16b-base deepseek_moe deepseek - - deepseek-ai/deepseek-moe-16b-base
deepseek-ai/DeepSeek-Coder-V2-Instruct deepseek_v2 deepseek transformers>=4.39.3 - deepseek-ai/DeepSeek-Coder-V2-Instruct
deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct deepseek_v2 deepseek transformers>=4.39.3 - deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct
deepseek-ai/DeepSeek-Coder-V2-Base deepseek_v2 deepseek transformers>=4.39.3 - deepseek-ai/DeepSeek-Coder-V2-Base
deepseek-ai/DeepSeek-Coder-V2-Lite-Base deepseek_v2 deepseek transformers>=4.39.3 - deepseek-ai/DeepSeek-Coder-V2-Lite-Base
deepseek-ai/DeepSeek-V2-Lite deepseek_v2 deepseek transformers>=4.39.3 - deepseek-ai/DeepSeek-V2-Lite
deepseek-ai/DeepSeek-V2-Lite-Chat deepseek_v2 deepseek transformers>=4.39.3 - deepseek-ai/DeepSeek-V2-Lite-Chat
deepseek-ai/DeepSeek-V2 deepseek_v2 deepseek transformers>=4.39.3 - deepseek-ai/DeepSeek-V2
deepseek-ai/DeepSeek-V2-Chat deepseek_v2 deepseek transformers>=4.39.3 - deepseek-ai/DeepSeek-V2-Chat
deepseek-ai/DeepSeek-V2.5 deepseek_v2_5 deepseek_v2_5 transformers>=4.39.3 - deepseek-ai/DeepSeek-V2.5
OpenBuddy/openbuddy-llama-65b-v8-bf16 openbuddy_llama openbuddy - - OpenBuddy/openbuddy-llama-65b-v8-bf16
OpenBuddy/openbuddy-llama2-13b-v8.1-fp16 openbuddy_llama openbuddy - - OpenBuddy/openbuddy-llama2-13b-v8.1-fp16
OpenBuddy/openbuddy-llama2-70b-v10.1-bf16 openbuddy_llama openbuddy - - OpenBuddy/openbuddy-llama2-70b-v10.1-bf16
OpenBuddy/openbuddy-deepseek-67b-v15.2 openbuddy_llama openbuddy - - OpenBuddy/openbuddy-deepseek-67b-v15.2
OpenBuddy/openbuddy-llama3-8b-v21.1-8k openbuddy_llama3 openbuddy2 - - OpenBuddy/openbuddy-llama3-8b-v21.1-8k
OpenBuddy/openbuddy-llama3-70b-v21.1-8k openbuddy_llama3 openbuddy2 - - OpenBuddy/openbuddy-llama3-70b-v21.1-8k
OpenBuddy/openbuddy-llama3.1-8b-v22.1-131k openbuddy_llama3 openbuddy2 - - OpenBuddy/openbuddy-llama3.1-8b-v22.1-131k
OpenBuddy/openbuddy-mistral-7b-v17.1-32k openbuddy_mistral openbuddy transformers>=4.34 - OpenBuddy/openbuddy-mistral-7b-v17.1-32k
OpenBuddy/openbuddy-zephyr-7b-v14.1 openbuddy_mistral openbuddy transformers>=4.34 - OpenBuddy/openbuddy-zephyr-7b-v14.1
OpenBuddy/openbuddy-mixtral-7bx8-v18.1-32k openbuddy_mixtral openbuddy transformers>=4.36 - OpenBuddy/openbuddy-mixtral-7bx8-v18.1-32k
baichuan-inc/Baichuan-13B-Chat baichuan baichuan transformers<4.34 - baichuan-inc/Baichuan-13B-Chat
baichuan-inc/Baichuan-13B-Base baichuan baichuan transformers<4.34 - baichuan-inc/Baichuan-13B-Base
baichuan-inc/baichuan-7B baichuan baichuan transformers<4.34 - baichuan-inc/Baichuan-7B
baichuan-inc/Baichuan2-7B-Chat baichuan2 baichuan - - baichuan-inc/Baichuan2-7B-Chat
baichuan-inc/Baichuan2-7B-Base baichuan2 baichuan - - baichuan-inc/Baichuan2-7B-Base
baichuan-inc/Baichuan2-13B-Chat baichuan2 baichuan - - baichuan-inc/Baichuan2-13B-Chat
baichuan-inc/Baichuan2-13B-Base baichuan2 baichuan - - baichuan-inc/Baichuan2-13B-Base
baichuan-inc/Baichuan2-7B-Chat-4bits baichuan2 baichuan - - baichuan-inc/Baichuan2-7B-Chat-4bits
baichuan-inc/Baichuan2-13B-Chat-4bits baichuan2 baichuan - - baichuan-inc/Baichuan2-13B-Chat-4bits
OpenBMB/MiniCPM-2B-sft-fp32 minicpm minicpm transformers>=4.36.0 - openbmb/MiniCPM-2B-sft-fp32
OpenBMB/MiniCPM-2B-dpo-fp32 minicpm minicpm transformers>=4.36.0 - openbmb/MiniCPM-2B-dpo-fp32
OpenBMB/MiniCPM-1B-sft-bf16 minicpm minicpm transformers>=4.36.0 - openbmb/MiniCPM-1B-sft-bf16
OpenBMB/MiniCPM-2B-128k minicpm_chatml chatml transformers>=4.36 - openbmb/MiniCPM-2B-128k
OpenBMB/MiniCPM3-4B minicpm3 chatml transformers>=4.36 - openbmb/MiniCPM3-4B
OpenBMB/MiniCPM-MoE-8x2B minicpm_moe minicpm transformers>=4.36 - openbmb/MiniCPM-MoE-8x2B
TeleAI/TeleChat-7B telechat telechat - - Tele-AI/telechat-7B
TeleAI/TeleChat-12B telechat telechat - - Tele-AI/TeleChat-12B
TeleAI/TeleChat-12B-v2 telechat telechat - - Tele-AI/TeleChat-12B-v2
swift/TeleChat-12B-V2-GPTQ-Int4 telechat telechat - - -
TeleAI/TeleChat2-3B telechat2 telechat2 - - Tele-AI/TeleChat2-3B
TeleAI/TeleChat2-7B telechat2 telechat2 - - Tele-AI/TeleChat2-7B
TeleAI/TeleChat2-35B-Nov telechat2 telechat2 - - Tele-AI/TeleChat2-35B-Nov
TeleAI/TeleChat2-35B telechat2_115b telechat2_115b - - Tele-AI/TeleChat2-35B
TeleAI/TeleChat2-115B telechat2_115b telechat2_115b - - Tele-AI/TeleChat2-115B
AI-ModelScope/Mistral-7B-Instruct-v0.1 mistral llama transformers>=4.34 - mistralai/Mistral-7B-Instruct-v0.1
AI-ModelScope/Mistral-7B-Instruct-v0.2 mistral llama transformers>=4.34 - mistralai/Mistral-7B-Instruct-v0.2
LLM-Research/Mistral-7B-Instruct-v0.3 mistral llama transformers>=4.34 - mistralai/Mistral-7B-Instruct-v0.3
AI-ModelScope/Mistral-7B-v0.1 mistral llama transformers>=4.34 - mistralai/Mistral-7B-v0.1
AI-ModelScope/Mistral-7B-v0.2-hf mistral llama transformers>=4.34 - alpindale/Mistral-7B-v0.2-hf
swift/Codestral-22B-v0.1 mistral llama transformers>=4.34 - mistralai/Codestral-22B-v0.1
modelscope/zephyr-7b-beta zephyr zephyr transformers>=4.34 - HuggingFaceH4/zephyr-7b-beta
AI-ModelScope/Mixtral-8x7B-Instruct-v0.1 mixtral llama - - mistralai/Mixtral-8x7B-Instruct-v0.1
AI-ModelScope/Mixtral-8x7B-v0.1 mixtral llama - - mistralai/Mixtral-8x7B-v0.1
AI-ModelScope/Mixtral-8x22B-v0.1 mixtral llama - - mistral-community/Mixtral-8x22B-v0.1
AI-ModelScope/Mixtral-8x7b-AQLM-2Bit-1x16-hf mixtral llama - - ISTA-DASLab/Mixtral-8x7b-AQLM-2Bit-1x16-hf
AI-ModelScope/Mistral-Small-Instruct-2409 mistral_nemo mistral_nemo - - mistralai/Mistral-Small-Instruct-2409
LLM-Research/Mistral-Large-Instruct-2407 mistral_nemo mistral_nemo - - mistralai/Mistral-Large-Instruct-2407
AI-ModelScope/Mistral-Nemo-Base-2407 mistral_nemo mistral_nemo - - mistralai/Mistral-Nemo-Base-2407
AI-ModelScope/Mistral-Nemo-Instruct-2407 mistral_nemo mistral_nemo - - mistralai/Mistral-Nemo-Instruct-2407
AI-ModelScope/Ministral-8B-Instruct-2410 mistral_nemo mistral_nemo - - mistralai/Ministral-8B-Instruct-2410
AI-ModelScope/WizardLM-2-7B-AWQ wizardlm2 wizardlm2 transformers>=4.34 - MaziyarPanahi/WizardLM-2-7B-AWQ
AI-ModelScope/WizardLM-2-8x22B wizardlm2_moe wizardlm2_moe transformers>=4.36 - alpindale/WizardLM-2-8x22B
AI-ModelScope/phi-2 phi2 default - - microsoft/phi-2
LLM-Research/Phi-3-small-8k-instruct phi3_small phi3 transformers>=4.36 - microsoft/Phi-3-small-8k-instruct
LLM-Research/Phi-3-small-128k-instruct phi3_small phi3 transformers>=4.36 - microsoft/Phi-3-small-128k-instruct
LLM-Research/Phi-3-mini-4k-instruct phi3 phi3 transformers>=4.36 - microsoft/Phi-3-mini-4k-instruct
LLM-Research/Phi-3-mini-128k-instruct phi3 phi3 transformers>=4.36 - microsoft/Phi-3-mini-128k-instruct
LLM-Research/Phi-3-medium-4k-instruct phi3 phi3 transformers>=4.36 - microsoft/Phi-3-medium-4k-instruct
LLM-Research/Phi-3-medium-128k-instruct phi3 phi3 transformers>=4.36 - microsoft/Phi-3-medium-128k-instruct
LLM-Research/Phi-3.5-mini-instruct phi3 phi3 transformers>=4.36 - microsoft/Phi-3.5-mini-instruct
LLM-Research/Phi-3.5-MoE-instruct phi3_moe phi3 transformers>=4.36 - microsoft/Phi-3.5-MoE-instruct
AI-ModelScope/gemma-2b-it gemma gemma transformers>=4.38 - google/gemma-2b-it
AI-ModelScope/gemma-2b gemma gemma transformers>=4.38 - google/gemma-2b
AI-ModelScope/gemma-7b gemma gemma transformers>=4.38 - google/gemma-7b
AI-ModelScope/gemma-7b-it gemma gemma transformers>=4.38 - google/gemma-7b-it
LLM-Research/gemma-2-2b-it gemma2 gemma transformers>=4.42 - google/gemma-2-2b-it
LLM-Research/gemma-2-2b gemma2 gemma transformers>=4.42 - google/gemma-2-2b
LLM-Research/gemma-2-9b gemma2 gemma transformers>=4.42 - google/gemma-2-9b
LLM-Research/gemma-2-9b-it gemma2 gemma transformers>=4.42 - google/gemma-2-9b-it
LLM-Research/gemma-2-27b gemma2 gemma transformers>=4.42 - google/gemma-2-27b
LLM-Research/gemma-2-27b-it gemma2 gemma transformers>=4.42 - google/gemma-2-27b-it
IEITYuan/Yuan2.0-2B-hf yuan2 yuan - - IEITYuan/Yuan2-2B-hf
IEITYuan/Yuan2.0-51B-hf yuan2 yuan - - IEITYuan/Yuan2-51B-hf
IEITYuan/Yuan2.0-102B-hf yuan2 yuan - - IEITYuan/Yuan2-102B-hf
IEITYuan/Yuan2-2B-Janus-hf yuan2 yuan - - IEITYuan/Yuan2-2B-Janus-hf
IEITYuan/Yuan2-M32-hf yuan2 yuan - - IEITYuan/Yuan2-M32-hf
OrionStarAI/Orion-14B-Chat orion orion - - OrionStarAI/Orion-14B-Chat
OrionStarAI/Orion-14B-Base orion orion - - OrionStarAI/Orion-14B-Base
xverse/XVERSE-7B-Chat xverse xverse - - xverse/XVERSE-7B-Chat
xverse/XVERSE-7B xverse xverse - - xverse/XVERSE-7B
xverse/XVERSE-13B xverse xverse - - xverse/XVERSE-13B
xverse/XVERSE-13B-Chat xverse xverse - - xverse/XVERSE-13B-Chat
xverse/XVERSE-65B xverse xverse - - xverse/XVERSE-65B
xverse/XVERSE-65B-2 xverse xverse - - xverse/XVERSE-65B-2
xverse/XVERSE-65B-Chat xverse xverse - - xverse/XVERSE-65B-Chat
xverse/XVERSE-13B-256K xverse xverse - - xverse/XVERSE-13B-256K
xverse/XVERSE-MoE-A4.2B xverse_moe xverse - - xverse/XVERSE-MoE-A4.2B
damo/nlp_seqgpt-560m seggpt default - - DAMO-NLP/SeqGPT-560M
vivo-ai/BlueLM-7B-Chat-32K bluelm bluelm - - vivo-ai/BlueLM-7B-Chat-32K
vivo-ai/BlueLM-7B-Chat bluelm bluelm - - vivo-ai/BlueLM-7B-Chat
vivo-ai/BlueLM-7B-Base-32K bluelm bluelm - - vivo-ai/BlueLM-7B-Base-32K
vivo-ai/BlueLM-7B-Base bluelm bluelm - - vivo-ai/BlueLM-7B-Base
AI-ModelScope/c4ai-command-r-v01 c4ai c4ai transformers>=4.39 - CohereForAI/c4ai-command-r-v01
AI-ModelScope/c4ai-command-r-plus c4ai c4ai transformers>=4.39 - CohereForAI/c4ai-command-r-plus
AI-ModelScope/dbrx-base dbrx dbrx transformers>=4.36 - databricks/dbrx-base
AI-ModelScope/dbrx-instruct dbrx dbrx transformers>=4.36 - databricks/dbrx-instruct
colossalai/grok-1-pytorch grok default - - hpcai-tech/grok-1
AI-ModelScope/mamba-130m-hf mamba default transformers>=4.39.0 - state-spaces/mamba-130m-hf
AI-ModelScope/mamba-370m-hf mamba default transformers>=4.39.0 - state-spaces/mamba-370m-hf
AI-ModelScope/mamba-390m-hf mamba default transformers>=4.39.0 - state-spaces/mamba-390m-hf
AI-ModelScope/mamba-790m-hf mamba default transformers>=4.39.0 - state-spaces/mamba-790m-hf
AI-ModelScope/mamba-1.4b-hf mamba default transformers>=4.39.0 - state-spaces/mamba-1.4b-hf
AI-ModelScope/mamba-2.8b-hf mamba default transformers>=4.39.0 - state-spaces/mamba-2.8b-hf
damo/nlp_polylm_13b_text_generation polylm default - - DAMO-NLP-MT/polylm-13b
skywork/Skywork-13B-base skywork skywork - - -
skywork/Skywork-13B-chat skywork skywork - - -
AI-ModelScope/aya-expanse-8b aya aya transformers>=4.44.0 - CohereForAI/aya-expanse-8b
AI-ModelScope/aya-expanse-32b aya aya transformers>=4.44.0 - CohereForAI/aya-expanse-32b

### Multimodal Large Language Models

Model ID Model Type Default Template Requires Tags HF Model ID
Qwen/Qwen-VL-Chat qwen_vl qwen_vl - vision Qwen/Qwen-VL-Chat
Qwen/Qwen-VL qwen_vl qwen_vl - vision Qwen/Qwen-VL
Qwen/Qwen-VL-Chat-Int4 qwen_vl qwen_vl - vision Qwen/Qwen-VL-Chat-Int4
Qwen/Qwen-Audio-Chat qwen_audio qwen_audio - audio Qwen/Qwen-Audio-Chat
Qwen/Qwen-Audio qwen_audio qwen_audio - audio Qwen/Qwen-Audio
Qwen/Qwen2-VL-2B-Instruct qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-2B-Instruct
Qwen/Qwen2-VL-7B-Instruct qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-7B-Instruct
Qwen/Qwen2-VL-72B-Instruct qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-72B-Instruct
Qwen/Qwen2-VL-2B qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-2B
Qwen/Qwen2-VL-7B qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-7B
Qwen/Qwen2-VL-72B qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-72B
Qwen/Qwen2-VL-2B-Instruct-GPTQ-Int4 qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-2B-Instruct-GPTQ-Int4
Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int4 qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int4
Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int4 qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int4
Qwen/Qwen2-VL-2B-Instruct-GPTQ-Int8 qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-2B-Instruct-GPTQ-Int8
Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int8 qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int8
Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int8 qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int8
Qwen/Qwen2-VL-2B-Instruct-AWQ qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-2B-Instruct-AWQ
Qwen/Qwen2-VL-7B-Instruct-AWQ qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-7B-Instruct-AWQ
Qwen/Qwen2-VL-72B-Instruct-AWQ qwen2_vl qwen2_vl transformers>=4.45, qwen_vl_utils, pyav vision, video Qwen/Qwen2-VL-72B-Instruct-AWQ
Qwen/Qwen2-Audio-7B-Instruct qwen2_audio qwen2_audio transformers>=4.45, librosa audio Qwen/Qwen2-Audio-7B-Instruct
Qwen/Qwen2-Audio-7B qwen2_audio qwen2_audio transformers>=4.45, librosa audio Qwen/Qwen2-Audio-7B
AIDC-AI/Ovis1.6-Gemma2-9B ovis1_6 ovis1_6 transformers>=4.42 vision AIDC-AI/Ovis1.6-Gemma2-9B
ZhipuAI/glm-4v-9b glm4v glm4v transformers>=4.42 - THUDM/glm-4v-9b
ZhipuAI/glm-edge-v-2b glm_edge_v glm_edge_v transformers>=4.46 vision THUDM/glm-edge-v-2b
ZhipuAI/glm-edge-v-5b glm_edge_v glm_edge_v transformers>=4.46 vision THUDM/glm-edge-v-5b
ZhipuAI/cogvlm-chat cogvlm cogvlm transformers<4.42 - THUDM/cogvlm-chat-hf
ZhipuAI/cogagent-vqa cogagent_vqa cogagent_vqa transformers<4.42 - THUDM/cogagent-vqa-hf
ZhipuAI/cogagent-chat cogagent_chat cogagent_chat transformers<4.42, timm - THUDM/cogagent-chat-hf
ZhipuAI/cogvlm2-llama3-chat-19B cogvlm2 cogvlm2 transformers<4.42 - THUDM/cogvlm2-llama3-chat-19B
ZhipuAI/cogvlm2-llama3-chinese-chat-19B cogvlm2 cogvlm2 transformers<4.42 - THUDM/cogvlm2-llama3-chinese-chat-19B
ZhipuAI/cogvlm2-video-llama3-chat cogvlm2_video cogvlm2_video decord, pytorchvideo, transformers>=4.42 video THUDM/cogvlm2-video-llama3-chat
OpenGVLab/Mini-InternVL-Chat-2B-V1-5 internvl internvl transformers>=4.35, timm vision OpenGVLab/Mini-InternVL-Chat-2B-V1-5
AI-ModelScope/InternVL-Chat-V1-5 internvl internvl transformers>=4.35, timm vision OpenGVLab/InternVL-Chat-V1-5
AI-ModelScope/InternVL-Chat-V1-5-int8 internvl internvl transformers>=4.35, timm vision OpenGVLab/InternVL-Chat-V1-5-int8
OpenGVLab/Mini-InternVL-Chat-4B-V1-5 internvl_phi3 internvl_phi3 transformers>=4.35,<4.42, timm vision OpenGVLab/Mini-InternVL-Chat-4B-V1-5
OpenGVLab/InternVL2-1B internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-1B
OpenGVLab/InternVL2-2B internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-2B
OpenGVLab/InternVL2-8B internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-8B
OpenGVLab/InternVL2-26B internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-26B
OpenGVLab/InternVL2-40B internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-40B
OpenGVLab/InternVL2-Llama3-76B internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-Llama3-76B
OpenGVLab/InternVL2-2B-AWQ internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-2B-AWQ
OpenGVLab/InternVL2-8B-AWQ internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-8B-AWQ
OpenGVLab/InternVL2-26B-AWQ internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-26B-AWQ
OpenGVLab/InternVL2-40B-AWQ internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-40B-AWQ
OpenGVLab/InternVL2-Llama3-76B-AWQ internvl2 internvl2 transformers>=4.36, timm vision, video OpenGVLab/InternVL2-Llama3-76B-AWQ
OpenGVLab/InternVL2-4B internvl2_phi3 internvl2_phi3 transformers>=4.36,<4.42, timm vision, video OpenGVLab/InternVL2-4B
OpenGVLab/InternVL2_5-1B internvl2_5 internvl2_5 transformers>=4.36, timm vision, video OpenGVLab/InternVL2_5-1B
OpenGVLab/InternVL2_5-2B internvl2_5 internvl2_5 transformers>=4.36, timm vision, video OpenGVLab/InternVL2_5-2B
OpenGVLab/InternVL2_5-4B internvl2_5 internvl2_5 transformers>=4.36, timm vision, video OpenGVLab/InternVL2_5-4B
OpenGVLab/InternVL2_5-8B internvl2_5 internvl2_5 transformers>=4.36, timm vision, video OpenGVLab/InternVL2_5-8B
OpenGVLab/InternVL2_5-26B internvl2_5 internvl2_5 transformers>=4.36, timm vision, video OpenGVLab/InternVL2_5-26B
OpenGVLab/InternVL2_5-38B internvl2_5 internvl2_5 transformers>=4.36, timm vision, video OpenGVLab/InternVL2_5-38B
OpenGVLab/InternVL2_5-78B internvl2_5 internvl2_5 transformers>=4.36, timm vision, video OpenGVLab/InternVL2_5-78B
Shanghai_AI_Laboratory/internlm-xcomposer2-7b xcomposer2 ixcomposer2 - vision internlm/internlm-xcomposer2-7b
Shanghai_AI_Laboratory/internlm-xcomposer2-4khd-7b xcomposer2_4khd ixcomposer2 - vision internlm/internlm-xcomposer2-4khd-7b
Shanghai_AI_Laboratory/internlm-xcomposer2d5-7b xcomposer2_5 xcomposer2_5 decord vision internlm/internlm-xcomposer2d5-7b
LLM-Research/Llama-3.2-11B-Vision-Instruct llama3_2_vision llama3_2_vision transformers>=4.45 vision meta-llama/Llama-3.2-11B-Vision-Instruct
LLM-Research/Llama-3.2-90B-Vision-Instruct llama3_2_vision llama3_2_vision transformers>=4.45 vision meta-llama/Llama-3.2-90B-Vision-Instruct
LLM-Research/Llama-3.2-11B-Vision llama3_2_vision llama3_2_vision transformers>=4.45 vision meta-llama/Llama-3.2-11B-Vision
LLM-Research/Llama-3.2-90B-Vision llama3_2_vision llama3_2_vision transformers>=4.45 vision meta-llama/Llama-3.2-90B-Vision
ICTNLP/Llama-3.1-8B-Omni llama3_1_omni llama3_1_omni whisper, openai-whisper audio ICTNLP/Llama-3.1-8B-Omni
swift/llava-1.5-7b-hf llava1_5_hf llava1_5_hf transformers>=4.36 vision llava-hf/llava-1.5-7b-hf
swift/llava-1.5-13b-hf llava1_5_hf llava1_5_hf transformers>=4.36 vision llava-hf/llava-1.5-13b-hf
swift/llava-v1.6-mistral-7b-hf llava1_6_mistral_hf llava1_6_mistral_hf transformers>=4.39 vision llava-hf/llava-v1.6-mistral-7b-hf
swift/llava-v1.6-vicuna-7b-hf llava1_6_vicuna_hf llava1_6_vicuna_hf transformers>=4.39 vision llava-hf/llava-v1.6-vicuna-7b-hf
swift/llava-v1.6-vicuna-13b-hf llava1_6_vicuna_hf llava1_6_vicuna_hf transformers>=4.39 vision llava-hf/llava-v1.6-vicuna-13b-hf
swift/llava-v1.6-34b-hf llava1_6_yi_hf llava1_6_yi_hf transformers>=4.39 vision llava-hf/llava-v1.6-34b-hf
swift/llama3-llava-next-8b-hf llama3_llava_next_hf llama3_llava_next_hf transformers>=4.39 vision llava-hf/llama3-llava-next-8b-hf
AI-ModelScope/llava-next-72b-hf llava_next_qwen_hf llava_next_qwen_hf transformers>=4.39 vision llava-hf/llava-next-72b-hf
AI-ModelScope/llava-next-110b-hf llava_next_qwen_hf llava_next_qwen_hf transformers>=4.39 vision llava-hf/llava-next-110b-hf
swift/LLaVA-NeXT-Video-7B-DPO-hf llava_next_video_hf llava_next_video_hf transformers>=4.42, av video llava-hf/LLaVA-NeXT-Video-7B-DPO-hf
swift/LLaVA-NeXT-Video-7B-32K-hf llava_next_video_hf llava_next_video_hf transformers>=4.42, av video llava-hf/LLaVA-NeXT-Video-7B-32K-hf
swift/LLaVA-NeXT-Video-7B-hf llava_next_video_hf llava_next_video_hf transformers>=4.42, av video llava-hf/LLaVA-NeXT-Video-7B-hf
swift/LLaVA-NeXT-Video-34B-hf llava_next_video_yi_hf llava_next_video_hf transformers>=4.42, av video llava-hf/LLaVA-NeXT-Video-34B-hf
AI-ModelScope/llava-onevision-qwen2-0.5b-ov-hf llava_onevision_hf llava_onevision_hf transformers>=4.45 vision, video llava-hf/llava-onevision-qwen2-0.5b-ov-hf
AI-ModelScope/llava-onevision-qwen2-7b-ov-hf llava_onevision_hf llava_onevision_hf transformers>=4.45 vision, video llava-hf/llava-onevision-qwen2-7b-ov-hf
AI-ModelScope/llava-onevision-qwen2-72b-ov-hf llava_onevision_hf llava_onevision_hf transformers>=4.45 vision, video llava-hf/llava-onevision-qwen2-72b-ov-hf
01ai/Yi-VL-6B yi_vl yi_vl transformers>=4.34 vision 01-ai/Yi-VL-6B
01ai/Yi-VL-34B yi_vl yi_vl transformers>=4.34 vision 01-ai/Yi-VL-34B
swift/llava-llama3.1-8b llava_llama3_1_hf llava_llama3_1_hf transformers>=4.41 vision -
AI-ModelScope/llava-llama-3-8b-v1_1-transformers llava_llama3_hf llava_llama3_hf transformers>=4.36 vision xtuner/llava-llama-3-8b-v1_1-transformers
AI-ModelScope/llava-v1.6-mistral-7b llava1_6_mistral llava1_6_mistral transformers>=4.34 vision liuhaotian/llava-v1.6-mistral-7b
AI-ModelScope/llava-v1.6-34b llava1_6_yi llava1_6_yi transformers>=4.34 vision liuhaotian/llava-v1.6-34b
AI-Modelscope/llava-next-72b llava_next_qwen llava_next_qwen transformers>=4.42, av vision lmms-lab/llava-next-72b
AI-Modelscope/llava-next-110b llava_next_qwen llava_next_qwen transformers>=4.42, av vision lmms-lab/llava-next-110b
AI-Modelscope/llama3-llava-next-8b llama3_llava_next llama3_llava_next transformers>=4.42, av vision lmms-lab/llama3-llava-next-8b
deepseek-ai/deepseek-vl-1.3b-chat deepseek_vl deepseek_vl - vision deepseek-ai/deepseek-vl-1.3b-chat
deepseek-ai/deepseek-vl-7b-chat deepseek_vl deepseek_vl - vision deepseek-ai/deepseek-vl-7b-chat
deepseek-ai/Janus-1.3B deepseek_janus deepseek_janus - vision deepseek-ai/Janus-1.3B
OpenBMB/MiniCPM-V minicpmv minicpmv timm, transformers<4.42 vision openbmb/MiniCPM-V
OpenBMB/MiniCPM-V-2 minicpmv minicpmv timm, transformers<4.42 vision openbmb/MiniCPM-V-2
OpenBMB/MiniCPM-V-2_6 minicpmv2_6 minicpmv2_6 timm, transformers>=4.36, decord vision, video openbmb/MiniCPM-V-2_6
OpenBMB/MiniCPM-Llama3-V-2_5 minicpmv2_5 minicpmv2_5 timm, transformers>=4.36 vision openbmb/MiniCPM-Llama3-V-2_5
iic/mPLUG-Owl2 mplug_owl2 mplug_owl2 transformers<4.35, icecream vision MAGAer13/mplug-owl2-llama2-7b
iic/mPLUG-Owl2.1 mplug_owl2_1 mplug_owl2 transformers<4.35, icecream vision Mizukiluke/mplug_owl_2_1
iic/mPLUG-Owl3-1B-241014 mplug_owl3 mplug_owl3 transformers>=4.36, icecream, decord vision, video mPLUG/mPLUG-Owl3-1B-241014
iic/mPLUG-Owl3-2B-241014 mplug_owl3 mplug_owl3 transformers>=4.36, icecream, decord vision, video mPLUG/mPLUG-Owl3-2B-241014
iic/mPLUG-Owl3-7B-240728 mplug_owl3 mplug_owl3 transformers>=4.36, icecream, decord vision, video mPLUG/mPLUG-Owl3-7B-240728
iic/mPLUG-Owl3-7B-241101 mplug_owl3_241101 mplug_owl3_241101 transformers>=4.36, icecream vision, video mPLUG/mPLUG-Owl3-7B-241101
BAAI/Emu3-Gen emu3_gen emu3_gen - t2i BAAI/Emu3-Gen
BAAI/Emu3-Chat emu3_chat emu3_chat transformers>=4.44.0 vision BAAI/Emu3-Chat
stepfun-ai/GOT-OCR2_0 got_ocr2 got_ocr2 - vision stepfun-ai/GOT-OCR2_0
LLM-Research/Phi-3-vision-128k-instruct phi3_vision phi3_vision transformers>=4.36 vision microsoft/Phi-3-vision-128k-instruct
LLM-Research/Phi-3.5-vision-instruct phi3_vision phi3_vision transformers>=4.36 vision microsoft/Phi-3.5-vision-instruct
AI-ModelScope/Florence-2-base-ft florence florence - vision microsoft/Florence-2-base-ft
AI-ModelScope/Florence-2-base florence florence - vision microsoft/Florence-2-base
AI-ModelScope/Florence-2-large florence florence - vision microsoft/Florence-2-large
AI-ModelScope/Florence-2-large-ft florence florence - vision microsoft/Florence-2-large-ft
AI-ModelScope/Idefics3-8B-Llama3 idefics3 idefics3 transformers>=4.45 vision HuggingFaceM4/Idefics3-8B-Llama3
AI-ModelScope/paligemma-3b-pt-224 paligemma paligemma transformers>=4.41 vision google/paligemma-3b-pt-224
AI-ModelScope/paligemma-3b-pt-448 paligemma paligemma transformers>=4.41 vision google/paligemma-3b-pt-448
AI-ModelScope/paligemma-3b-pt-896 paligemma paligemma transformers>=4.41 vision google/paligemma-3b-pt-896
AI-ModelScope/paligemma-3b-mix-224 paligemma paligemma transformers>=4.41 vision google/paligemma-3b-mix-224
AI-ModelScope/paligemma-3b-mix-448 paligemma paligemma transformers>=4.41 vision google/paligemma-3b-mix-448
LLM-Research/Molmo-7B-O-0924 molmo molmo transformers>=4.45 vision allenai/Molmo-7B-O-0924
LLM-Research/Molmo-7B-D-0924 molmo molmo transformers>=4.45 vision allenai/Molmo-7B-D-0924
LLM-Research/Molmo-72B-0924 molmo molmo transformers>=4.45 vision allenai/Molmo-72B-0924
LLM-Research/MolmoE-1B-0924 molmoe molmo transformers>=4.45 vision allenai/MolmoE-1B-0924
AI-ModelScope/pixtral-12b pixtral pixtral transformers>=4.45 vision mistral-community/pixtral-12b

## Datasets

The table below describes the datasets integrated with ms-swift:

- Dataset ID: the ModelScope dataset ID
- HF Dataset ID: the Hugging Face dataset ID
- Subset Name: the name of the subset
- Dataset Size: the size of the dataset
- Statistic: dataset statistics. We report token counts, which helps when tuning the max_length hyperparameter. The datasets are tokenized with the qwen2.5 tokenizer; different tokenizers yield different statistics. If you need token statistics under another model's tokenizer, you can compute them yourself with a script (see the sketch after this list).
- Tags: the dataset's tags
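
The sketch below shows one way to compute that statistic yourself. It assumes (hypothetically) that the dataset has been exported to a local train.jsonl file in which each sample carries a messages list of role/content dicts; swap in any tokenizer you care about.

```python
# Hedged sketch for reproducing the Statistic column on a local JSONL export.
import json

import numpy as np
from transformers import AutoTokenizer

# The table uses the qwen2.5 tokenizer; substitute another model's tokenizer as needed.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

lengths = []
with open("train.jsonl", "r", encoding="utf-8") as f:   # hypothetical local path
    for line in f:
        sample = json.loads(line)
        # Render the conversation with the chat template, then count tokens.
        text = tokenizer.apply_chat_template(sample["messages"], tokenize=False)
        lengths.append(len(tokenizer(text)["input_ids"]))

lengths = np.array(lengths)
# Same mean±std, min, max format as the Statistic column.
print(f"{lengths.mean():.1f}±{lengths.std():.1f}, min={lengths.min()}, max={lengths.max()}")
```
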
Dataset ID Subset Name Dataset Size Statistic (token) Tags HF Dataset ID
AI-ModelScope/COIG-CQIA chinese_traditional, coig_pc, exam, finance, douban, human_value, logi_qa, ruozhiba, segmentfault, wiki, wikihow, xhs, zhihu 44694 331.2±693.8, min=34, max=19288 general, 🔥 -
AI-ModelScope/CodeAlpaca-20k default 20022 99.3±57.6, min=30, max=857 code, en HuggingFaceH4/CodeAlpaca_20K
AI-ModelScope/DISC-Law-SFT default 166758 1799.0±474.9, min=769, max=3151 chat, law, 🔥 ShengbinYue/DISC-Law-SFT
AI-ModelScope/DISC-Med-SFT default 464885 426.5±178.7, min=110, max=1383 chat, medical, 🔥 Flmc/DISC-Med-SFT
AI-ModelScope/Duet-v0.5 default 5000 1157.4±189.3, min=657, max=2344 CoT, en G-reen/Duet-v0.5
AI-ModelScope/GuanacoDataset default 31563 250.3±70.6, min=95, max=987 chat, zh JosephusCheung/GuanacoDataset
AI-ModelScope/LLaVA-Instruct-150K default 623302 630.7±143.0, min=301, max=1166 chat, multi-modal, vision -
AI-ModelScope/LLaVA-Pretrain default huge dataset - chat, multi-modal, quality liuhaotian/LLaVA-Pretrain
AI-ModelScope/LaTeX_OCR default, synthetic_handwrite 162149 117.6±44.9, min=41, max=312 chat, ocr, multi-modal, vision linxy/LaTeX_OCR
AI-ModelScope/LongAlpaca-12k default 11998 9941.8±3417.1, min=4695, max=25826 long-sequence, QA Yukang/LongAlpaca-12k
AI-ModelScope/M3IT coco, vqa-v2, shapes, shapes-rephrased, coco-goi-rephrased, snli-ve, snli-ve-rephrased, okvqa, a-okvqa, viquae, textcap, docvqa, science-qa, imagenet, imagenet-open-ended, imagenet-rephrased, coco-goi, clevr, clevr-rephrased, nlvr, coco-itm, coco-itm-rephrased, vsr, vsr-rephrased, mocheg, mocheg-rephrased, coco-text, fm-iqa, activitynet-qa, msrvtt, ss, coco-cn, refcoco, refcoco-rephrased, multi30k, image-paragraph-captioning, visual-dialog, visual-dialog-rephrased, iqa, vcr, visual-mrc, ivqa, msrvtt-qa, msvd-qa, gqa, text-vqa, ocr-vqa, st-vqa, flickr8k-cn huge dataset - chat, multi-modal, vision -
AI-ModelScope/Magpie-Qwen2-Pro-200K-Chinese default 200000 448.4±223.5, min=87, max=4098 chat, sft, 🔥, zh Magpie-Align/Magpie-Qwen2-Pro-200K-Chinese
AI-ModelScope/Magpie-Qwen2-Pro-200K-English default 200000 609.9±277.1, min=257, max=4098 chat, sft, 🔥, en Magpie-Align/Magpie-Qwen2-Pro-200K-English
AI-ModelScope/Magpie-Qwen2-Pro-300K-Filtered default 300000 556.6±288.6, min=175, max=4098 chat, sft, 🔥 Magpie-Align/Magpie-Qwen2-Pro-300K-Filtered
AI-ModelScope/MathInstruct default 262040 253.3±177.4, min=42, max=2193 math, cot, en, quality TIGER-Lab/MathInstruct
AI-ModelScope/MovieChat-1K-test default 162 39.7±2.0, min=32, max=43 chat, multi-modal, video Enxin/MovieChat-1K-test
AI-ModelScope/Open-Platypus default 24926 389.0±256.4, min=55, max=3153 chat, math, quality garage-bAInd/Open-Platypus
AI-ModelScope/OpenO1-SFT default 125894 1080.7±622.9, min=145, max=11637 chat, general, o1 O1-OPEN/OpenO1-SFT
AI-ModelScope/OpenOrca default, 3_5M huge dataset - chat, multilingual, general -
AI-ModelScope/OpenOrca-Chinese default huge dataset - QA, zh, general, quality yys/OpenOrca-Chinese
AI-ModelScope/SFT-Nectar default 131201 441.9±307.0, min=45, max=3136 cot, en, quality AstraMindAI/SFT-Nectar
AI-ModelScope/ShareGPT-4o image_caption 57289 599.8±140.4, min=214, max=1932 vqa, multi-modal OpenGVLab/ShareGPT-4o
AI-ModelScope/ShareGPT4V ShareGPT4V, ShareGPT4V-PT huge dataset - chat, multi-modal, vision -
AI-ModelScope/SkyPile-150B default huge dataset - pretrain, quality, zh Skywork/SkyPile-150B
AI-ModelScope/WizardLM_evol_instruct_V2_196k default 109184 483.3±338.4, min=27, max=3735 chat, en WizardLM/WizardLM_evol_instruct_V2_196k
AI-ModelScope/alpaca-cleaned default 51760 170.1±122.9, min=29, max=1028 chat, general, bench, quality yahma/alpaca-cleaned
AI-ModelScope/alpaca-gpt4-data-en default 52002 167.6±123.9, min=29, max=607 chat, general, 🔥 vicgalle/alpaca-gpt4
AI-ModelScope/alpaca-gpt4-data-zh default 48818 157.2±93.2, min=27, max=544 chat, general, 🔥 llm-wizard/alpaca-gpt4-data-zh
AI-ModelScope/blossom-math-v2 default 10000 175.4±59.1, min=35, max=563 chat, math, 🔥 Azure99/blossom-math-v2
AI-ModelScope/captcha-images default 8000 47.0±0.0, min=47, max=47 chat, multi-modal, vision -
AI-ModelScope/databricks-dolly-15k default 15011 199.0±268.8, min=26, max=5987 multi-task, en, quality databricks/databricks-dolly-15k
AI-ModelScope/deepctrl-sft-data default, en huge dataset - chat, general, sft, multi-round -
AI-ModelScope/egoschema Subset 101 191.6±80.7, min=96, max=435 chat, multi-modal, video lmms-lab/egoschema
AI-ModelScope/firefly-train-1.1M default 1649399 204.3±365.3, min=28, max=9306 chat, general YeungNLP/firefly-train-1.1M
AI-ModelScope/generated_chat_0.4M default 396004 272.7±51.1, min=78, max=579 chat, character-dialogue BelleGroup/generated_chat_0.4M
AI-ModelScope/guanaco_belle_merge_v1.0 default 693987 133.8±93.5, min=30, max=1872 QA, zh Chinese-Vicuna/guanaco_belle_merge_v1.0
AI-ModelScope/hh-rlhf helpful-base, helpful-online, helpful-rejection-sampled huge dataset - rlhf, dpo -
AI-ModelScope/hh_rlhf_cn hh_rlhf, harmless_base_cn, harmless_base_en, helpful_base_cn, helpful_base_en 362909 142.3±107.5, min=25, max=1571 rlhf, dpo, 🔥 -
AI-ModelScope/lawyer_llama_data default 21476 224.4±83.9, min=69, max=832 chat, law Skepsun/lawyer_llama_data
AI-ModelScope/leetcode-solutions-python default 2359 723.8±233.5, min=259, max=2117 chat, coding, 🔥 -
AI-ModelScope/lmsys-chat-1m default 166211 545.8±3272.8, min=22, max=219116 chat, en lmsys/lmsys-chat-1m
AI-ModelScope/ms_agent_for_agentfabric default, addition 30000 615.7±198.7, min=251, max=2055 chat, agent, multi-round, 🔥 -
AI-ModelScope/orpo-dpo-mix-40k default 43666 938.1±694.2, min=36, max=8483 dpo, orpo, en, quality mlabonne/orpo-dpo-mix-40k
AI-ModelScope/pile default huge dataset - pretrain EleutherAI/pile
AI-ModelScope/ruozhiba post-annual, title-good, title-norm 85658 40.0±18.3, min=22, max=559 pretrain, 🔥 -
AI-ModelScope/school_math_0.25M default 248481 158.8±73.4, min=39, max=980 chat, math, quality BelleGroup/school_math_0.25M
AI-ModelScope/sharegpt_gpt4 default, V3_format, zh_38K_format 103329 3476.6±5959.0, min=33, max=115132 chat, multilingual, general, multi-round, gpt4, 🔥 -
AI-ModelScope/sql-create-context default 78577 82.7±31.5, min=36, max=282 chat, sql, 🔥 b-mc2/sql-create-context
AI-ModelScope/stack-exchange-paired default huge dataset - rlhf, dpo, pairwise lvwerra/stack-exchange-paired
AI-ModelScope/starcoderdata default huge dataset - pretrain, quality bigcode/starcoderdata
AI-ModelScope/synthetic_text_to_sql default 100000 221.8±69.9, min=64, max=616 nl2sql, en gretelai/synthetic_text_to_sql
AI-ModelScope/texttosqlv2_25000_v2 default 25000 277.3±328.3, min=40, max=1971 chat, sql Clinton/texttosqlv2_25000_v2
AI-ModelScope/the-stack default huge dataset - pretrain, quality bigcode/the-stack
AI-ModelScope/tigerbot-law-plugin default 55895 104.9±51.0, min=43, max=1087 text-generation, law, pretrained TigerResearch/tigerbot-law-plugin
AI-ModelScope/train_0.5M_CN default 519255 128.4±87.4, min=31, max=936 common, zh, quality BelleGroup/train_0.5M_CN
AI-ModelScope/train_1M_CN default huge dataset - common, zh, quality BelleGroup/train_1M_CN
AI-ModelScope/train_2M_CN default huge dataset - common, zh, quality BelleGroup/train_2M_CN
AI-ModelScope/tulu-v2-sft-mixture default 326154 523.3±439.3, min=68, max=2549 chat, multilingual, general, multi-round allenai/tulu-v2-sft-mixture
AI-ModelScope/ultrafeedback-binarized-preferences-cleaned-kto default 230720 471.5±274.3, min=27, max=2232 rlhf, kto -
AI-ModelScope/webnovel_cn default 50000 1455.2±12489.4, min=524, max=490480 chat, novel zxbsmk/webnovel_cn
AI-ModelScope/wikipedia-cn-20230720-filtered default huge dataset - pretrain, quality pleisto/wikipedia-cn-20230720-filtered
AI-ModelScope/zhihu_rlhf_3k default 3460 594.5±365.9, min=31, max=1716 rlhf, dpo, zh liyucheng/zhihu_rlhf_3k
DAMO_NLP/jd default 45012 66.9±87.0, min=41, max=1699 text-generation, classification, 🔥 -
- default huge dataset - pretrain, quality HuggingFaceFW/fineweb
- auto_math_text, khanacademy, openstax, stanford, stories, web_samples_v1, web_samples_v2, wikihow huge dataset - multi-domain, en, qa HuggingFaceTB/cosmopedia
OmniData/Zhihu-KOL default huge dataset - zhihu, qa wangrui6/Zhihu-KOL
OmniData/Zhihu-KOL-More-Than-100-Upvotes default 271261 1003.4±1826.1, min=28, max=52541 zhihu, qa bzb2023/Zhihu-KOL-More-Than-100-Upvotes
TIGER-Lab/MATH-plus train 893929 301.4±196.7, min=50, max=1162 qa, math, en, quality TIGER-Lab/MATH-plus
Tongyi-DataEngine/SA1B-Dense-Caption default huge dataset - zh, multi-modal, vqa -
Tongyi-DataEngine/SA1B-Paired-Captions-Images default 7736284 106.4±18.5, min=48, max=193 zh, multi-modal, vqa -
YorickHe/CoT default 74771 141.6±45.5, min=58, max=410 chat, general -
YorickHe/CoT_zh default 74771 129.1±53.2, min=51, max=401 chat, general -
ZhipuAI/LongWriter-6k default 6000 5009.0±2932.8, min=117, max=30354 long, chat, sft, 🔥 THUDM/LongWriter-6k
- default huge dataset - pretrain, quality allenai/c4
- default huge dataset - pretrain, quality cerebras/SlimPajama-627B
codefuse-ai/CodeExercise-Python-27k default 27224 337.3±154.2, min=90, max=2826 chat, coding, 🔥 -
codefuse-ai/Evol-instruction-66k default 66862 440.1±208.4, min=46, max=2661 chat, coding, 🔥 -
damo/MSAgent-Bench default, mini 638149 859.2±460.1, min=38, max=3479 chat, agent, multi-round -
damo/nlp_polylm_multialpaca_sft ar, de, es, fr, id, ja, ko, pt, ru, th, vi 131867 101.6±42.5, min=30, max=1029 chat, general, multilingual -
damo/zh_cls_fudan-news default 4959 3234.4±2547.5, min=91, max=19548 chat, classification -
damo/zh_ner-JAVE default 1266 118.3±45.5, min=44, max=223 chat, ner -
hjh0119/shareAI-Llama3-DPO-zh-en-emoji zh, en 2449 334.0±162.8, min=36, max=1801 rlhf, dpo -
huangjintao/AgentInstruct_copy alfworld, db, kg, mind2web, os, webshop 1866 1144.3±635.5, min=206, max=6412 chat, agent, multi-round -
iic/100PoisonMpts default 906 150.6±80.8, min=39, max=656 poison-management, zh -
iic/MSAgent-MultiRole default 543 413.0±79.7, min=70, max=936 chat, agent, multi-round, role-play, multi-agent -
iic/MSAgent-Pro default 21910 1978.1±747.9, min=339, max=8064 chat, agent, multi-round, 🔥 -
iic/ms_agent default 30000 645.8±218.0, min=199, max=2070 chat, agent, multi-round, 🔥 -
iic/ms_bench default 316820 353.4±424.5, min=29, max=2924 chat, general, multi-round, 🔥 -
- default huge dataset - multi-modal, en, vqa, quality lmms-lab/GQA
- 0_30_s_academic_v0_1, 0_30_s_youtube_v0_1, 1_2_m_academic_v0_1, 1_2_m_youtube_v0_1, 2_3_m_academic_v0_1, 2_3_m_youtube_v0_1, 30_60_s_academic_v0_1, 30_60_s_youtube_v0_1 1335486 273.7±78.8, min=107, max=638 chat, multi-modal, video lmms-lab/LLaVA-Video-178K
lvjianjin/AdvertiseGen default 97484 130.9±21.9, min=73, max=232 text-generation, 🔥 shibing624/AdvertiseGen
mapjack/openwebtext_dataset default huge dataset - pretrain, zh, quality -
modelscope/DuReader_robust-QG default 17899 242.0±143.1, min=75, max=1416 text-generation, 🔥 -
modelscope/chinese-poetry-collection default 1710 58.1±8.1, min=31, max=71 text-generation, poetry -
modelscope/clue cmnli 391783 81.6±16.0, min=54, max=157 text-generation, classification clue
modelscope/coco_2014_caption train, validation 454617 389.6±68.4, min=70, max=587 chat, multi-modal, vision, 🔥 -
shenweizhou/alpha-umi-toolbench-processed-v2 backbone, caller, planner, summarizer huge dataset - chat, agent, 🔥 -
simpleai/HC3 finance, medicine 11021 296.0±153.3, min=65, max=2267 text-generation, classification, 🔥 Hello-SimpleAI/HC3
simpleai/HC3-Chinese baike, baike_cls, open_qa, open_qa_cls, nlpcc_dbqa, nlpcc_dbqa_cls, finance, finance_cls, medicine, medicine_cls, law, law_cls, psychology, psychology_cls 39781 179.9±70.2, min=90, max=1070 text-generation, classification, 🔥 Hello-SimpleAI/HC3-Chinese
speech_asr/speech_asr_aishell1_trainsets train, validation, test 141600 40.8±3.3, min=33, max=53 chat, multi-modal, audio -
swift/A-OKVQA default 18201 43.5±7.9, min=27, max=94 multi-modal, en, vqa, quality HuggingFaceM4/A-OKVQA
swift/ChartQA default 28299 36.8±6.5, min=26, max=74 en, vqa, quality HuggingFaceM4/ChartQA
swift/GRIT caption, grounding, vqa huge dataset - multi-modal, en, caption-grounding, vqa, quality zzliang/GRIT
swift/GenQA default huge dataset - qa, quality, multi-task tomg-group-umd/GenQA
swift/Infinity-Instruct default huge dataset - qa, quality, multi-task BAAI/Infinity-Instruct
swift/Mantis-Instruct birds-to-words, chartqa, coinstruct, contrastive_caption, docvqa, dreamsim, dvqa, iconqa, imagecode, llava_665k_multi, lrv_multi, multi_vqa, nextqa, nlvr2, spot-the-diff, star, visual_story_telling 988115 619.9±156.6, min=243, max=1926 chat, multi-modal, vision -
swift/MideficsDataset default 3800 201.3±70.2, min=60, max=454 medical, en, vqa WinterSchool/MideficsDataset
swift/Multimodal-Mind2Web default 1009 293855.4±331149.5, min=11301, max=3577519 agent, multi-modal osunlp/Multimodal-Mind2Web
swift/OCR-VQA default 186753 32.3±5.8, min=27, max=80 multi-modal, en, ocr-vqa howard-hou/OCR-VQA
swift/OK-VQA_train default 9009 31.7±3.4, min=25, max=56 multi-modal, en, vqa, quality Multimodal-Fatima/OK-VQA_train
swift/OpenHermes-2.5 default huge dataset - cot, en, quality teknium/OpenHermes-2.5
swift/RLAIF-V-Dataset default 83132 99.6±54.8, min=30, max=362 rlhf, dpo, multi-modal, en openbmb/RLAIF-V-Dataset
swift/RedPajama-Data-1T default huge dataset - pretrain, quality togethercomputer/RedPajama-Data-1T
swift/RedPajama-Data-V2 default huge dataset - pretrain, quality togethercomputer/RedPajama-Data-V2
swift/ScienceQA default 16967 101.7±55.8, min=32, max=620 multi-modal, science, vqa, quality derek-thomas/ScienceQA
swift/SlimOrca default 517982 405.5±442.1, min=47, max=8312 quality, en Open-Orca/SlimOrca
swift/TextCaps default huge dataset - multi-modal, en, caption, quality HuggingFaceM4/TextCaps
swift/ToolBench default 124345 2251.7±1039.8, min=641, max=9451 chat, agent, multi-round -
swift/VQAv2 default huge dataset - en, vqa, quality HuggingFaceM4/VQAv2
swift/VideoChatGPT Generic, Temporal, Consistency 3206 87.4±48.3, min=31, max=398 chat, multi-modal, video, 🔥 lmms-lab/VideoChatGPT
swift/WebInstructSub default huge dataset - qa, en, math, quality, multi-domain, science TIGER-Lab/WebInstructSub
swift/aya_collection aya_dataset 202364 474.6±1539.1, min=25, max=71312 multi-lingual, qa CohereForAI/aya_collection
swift/chinese-c4 default huge dataset - pretrain, zh, quality shjwudp/chinese-c4
swift/cinepile default huge dataset - vqa, en, youtube, video tomg-group-umd/cinepile
swift/classical_chinese_translate default 6655 349.3±77.1, min=61, max=815 chat, play-ground -
swift/cosmopedia-100k default 100000 1037.0±254.8, min=339, max=2818 multi-domain, en, qa HuggingFaceTB/cosmopedia-100k
swift/dolma v1_7 huge dataset - pretrain, quality allenai/dolma
swift/dolphin flan1m-alpaca-uncensored, flan5m-alpaca-uncensored huge dataset - en cognitivecomputations/dolphin
swift/github-code default huge dataset - pretrain, quality codeparrot/github-code
swift/gpt4v-dataset default huge dataset - en, caption, multi-modal, quality laion/gpt4v-dataset
swift/llava-data llava_instruct 624255 369.7±143.0, min=40, max=905 sft, multi-modal, quality TIGER-Lab/llava-data
swift/llava-instruct-mix-vsft default 13640 178.8±119.8, min=34, max=951 multi-modal, en, vqa, quality HuggingFaceH4/llava-instruct-mix-vsft
swift/llava-med-zh-instruct-60k default 56649 207.9±67.7, min=42, max=594 zh, medical, vqa, multi-modal BUAADreamer/llava-med-zh-instruct-60k
swift/lnqa default huge dataset - multi-modal, en, ocr-vqa, quality vikhyatk/lnqa
swift/longwriter-6k-filtered default 666 4108.9±2636.9, min=1190, max=17050 long, chat, sft, 🔥 -
swift/medical_zh en, zh 2068589 256.4±87.3, min=39, max=1167 chat, medical -
swift/moondream2-coyo-5M-captions default huge dataset - caption, pretrain, quality isidentical/moondream2-coyo-5M-captions
swift/no_robots default 9485 300.0±246.2, min=40, max=6739 multi-task, quality, human-annotated HuggingFaceH4/no_robots
swift/orca_dpo_pairs default 12859 364.9±248.2, min=36, max=2010 rlhf, quality Intel/orca_dpo_pairs
swift/path-vqa default 19654 34.2±6.8, min=28, max=85 multi-modal, vqa, medical flaviagiammarino/path-vqa
swift/pile-val-backup default 214661 1831.4±11087.5, min=21, max=516620 text-generation, awq mit-han-lab/pile-val-backup
swift/pixelprose default huge dataset - caption, multi-modal, vision tomg-group-umd/pixelprose
swift/refcoco caption, grounding 92430 45.4±3.0, min=37, max=63 multi-modal, en, grounding jxu124/refcoco
swift/refcocog caption, grounding 89598 50.3±4.6, min=39, max=91 multi-modal, en, grounding jxu124/refcocog
swift/self-cognition default 108 58.9±20.3, min=32, max=131 chat, self-cognition, 🔥 modelscope/self-cognition
swift/sharegpt common-zh, unknow-zh, common-en 194063 820.5±366.1, min=25, max=2221 chat, general, multi-round -
swift/swift-sft-mixture sharegpt, firefly, codefuse, metamathqa huge dataset - chat, sft, general, 🔥 -
swift/tagengo-gpt4 default 76437 468.1±276.8, min=28, max=1726 chat, multi-lingual, quality lightblue/tagengo-gpt4
swift/train_3.5M_CN default huge dataset - common, zh, quality BelleGroup/train_3.5M_CN
swift/ultrachat_200k default 207843 1188.0±571.1, min=170, max=4068 chat, en, quality HuggingFaceH4/ultrachat_200k
swift/wikipedia default huge dataset - pretrain, quality wikipedia
- default huge dataset - pretrain, quality tiiuae/falcon-refinedweb
wyj123456/GPT4all default 806199 97.3±20.9, min=62, max=414 chat, general -
wyj123456/code_alpaca_en default 20022 99.3±57.6, min=30, max=857 chat, coding sahil2801/CodeAlpaca-20k
wyj123456/finance_en default 68912 264.5±207.1, min=30, max=2268 chat, financial ssbuild/alpaca_finance_en
wyj123456/instinwild default, subset 103695 125.1±43.7, min=35, max=801 chat, general -
wyj123456/instruct default 888970 271.0±333.6, min=34, max=3967 chat, general -
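The first column of the table above is the ModelScope dataset id; it is the value you typically pass to ms-swift via the `--dataset` argument, or load directly with the ModelScope SDK for inspection. Below is a minimal sketch, assuming the `modelscope` Python SDK's `MsDataset.load` interface; the example dataset and subset names are taken from the table, but the exact subsets and splits should be verified on each dataset card:

```python
# Minimal sketch: inspect one of the datasets listed above with the ModelScope SDK.
# Assumes `pip install modelscope`; dataset id and subset name are taken from the table,
# while the available splits should be confirmed on the dataset card.
from modelscope.msdatasets import MsDataset

ds = MsDataset.load(
    'AI-ModelScope/alpaca-gpt4-data-zh',  # Dataset ID column
    subset_name='default',                # Subsets column
    split='train',
)

# Print a few samples to check the schema before training.
for i, row in enumerate(ds):
    print(row)
    if i >= 2:
        break
```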