diff --git a/README.md b/README.md index 8fee231..bf7546f 100644 --- a/README.md +++ b/README.md @@ -29,7 +29,9 @@ ## 新闻 -**[2024/05/07] 添加预训练脚本、指令精调脚本。详情查看:[📚v1.1版本发布日志](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/releases/tag/v1.1)** +**[2024/05/08] 发布Llama-3-Chinese-8B-Instruct-v2版指令模型,直接采用500万条指令数据在 [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) 上进行精调。详情查看:[📚v2.0版本发布日志](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/releases/tag/v2.0)** + +[2024/05/07] 添加预训练脚本、指令精调脚本。详情查看:[📚v1.1版本发布日志](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/releases/tag/v1.1) [2024/04/30] 发布Llama-3-Chinese-8B基座模型和Llama-3-Chinese-8B-Instruct指令模型。详情查看:[📚v1.0版本发布日志](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/releases/tag/v1.0) @@ -78,13 +80,13 @@ 以下是本项目的模型对比以及建议使用场景。**如需聊天交互,请选择Instruct版。** -| 对比项 | Llama-3-Chinese | Llama-3-Chinese-Instruct | +| 对比项 | Llama-3-Chinese-8B | Llama-3-Chinese-8B-Instruct | | :-------------------- | :----------------------------------------------------: | :----------------------------------------------------------: | | 模型类型 | 基座模型 | 指令/Chat模型(类ChatGPT) | | 模型大小 | 8B | 8B | | 训练类型 | Causal-LM (CLM) | 指令精调 | | 训练方式 | LoRA + 全量emb/lm-head | LoRA + 全量emb/lm-head | -| 初始化模型 | [原版Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 中文Llama-3 | +| 初始化模型 | [原版Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | v1: Llama-3-Chinese-8B
v2: [原版Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | | 训练语料 | 无标注通用语料(约120GB) | 有标注指令数据(约500万条) | | 词表大小 | 原版词表(128,256) | 原版词表(128,256) | | 支持上下文长度 | 8K | 8K | @@ -96,13 +98,16 @@ | 模型名称 | 完整版 | LoRA版 | GGUF版 | | :------------------------ | :----------------------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: | -| **Llama-3-Chinese-8B**
(基座模型) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b)
[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-lora)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-lora)
[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-lora) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-gguf)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-gguf) | +| **Llama-3-Chinese-8B-Instruct-v2**
(指令模型) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-v2)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v2)
[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v2) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-v2-lora)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v2-lora)
[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v2-lora) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-v2-gguf)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v2-gguf) | | **Llama-3-Chinese-8B-Instruct**
(指令模型) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct)
[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-lora)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-lora)
[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-lora) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-gguf)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-gguf) | +| **Llama-3-Chinese-8B**
(基座模型) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b)
[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-lora)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-lora)
[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-lora) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-gguf)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-gguf) | 模型类型说明: - **完整模型**:可直接用于训练和推理,无需其他合并步骤 -- **LoRA模型**:需要与原版[Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)合并才能转为完整版模型,合并方法:[**💻 模型合并步骤**](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/wiki/model_conversion_zh) +- **LoRA模型**:需要与基模型合并才能转为完整版模型,合并方法:[**💻 模型合并步骤**](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/wiki/model_conversion_zh) + - v1基模型:原版[Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) + - v2基模型:原版[Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) - **GGUF模型**:[llama.cpp](https://github.com/ggerganov/llama.cpp)推出的量化格式,适配ollama等常见推理工具,推荐只需要做推理部署的用户下载;模型名后缀为`-im`表示使用了importance matrix进行量化,通常具有更低的PPL,建议使用(用法与常规版相同) > [!NOTE] > 若无法访问HF,可考虑一些镜像站点(如[hf-mirror.com](hf-mirror.com)),具体方法请自行查找解决。 @@ -138,59 +143,67 @@ [C-Eval](https://cevalbenchmark.com)是一个全面的中文基础模型评估套件,其中验证集和测试集分别包含1.3K和12.3K个选择题,涵盖52个学科。C-Eval推理代码请参考本项目:[📖GitHub Wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/wiki/ceval_zh) -| Models | 参数量 | Valid (0-shot) | Valid (5-shot) | Test (0-shot) | Test (5-shot) | -| ------------------------ | :------------: | :-----------: | :-----------: | :-----------: | :-----------: | -| **Llama-3-Chinese-8B-Instruct** | 8B | 49.3 | 51.5 | 48.3 | 49.4 | -| **Llama-3-Chinese-8B** | 8B | 47.0 | 50.5 | 46.1 | 49.0 | -| [Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 8B | 49.3 | 51.2 | 46.1 | 49.4 | -| [Chinese-Mixtral-Instruct](https://github.com/ymcui/Chinese-Mixtral) | 8x7B | 51.7 | 55.0 | 50.0 | 51.5 | -| [Chinese-Mixtral](https://github.com/ymcui/Chinese-Mixtral) | 8x7B | 45.8 | 54.2 | 43.1 | 49.1 | -| [Chinese-Alpaca-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 13B | 44.3 | 45.9 | 42.6 | 44.0 | -| [Chinese-LLaMA-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 13B | 40.6 | 42.7 | 38.0 | 41.6 | +| Models | Valid (0-shot) | Valid (5-shot) | Test (0-shot) | 
Test (5-shot) | +| ------------------------ | :-----------: | :-----------: | :-----------: | :-----------: | +| **Llama-3-Chinese-8B-Instruct-v2** | 51.6 | 51.6 | 49.7 | 49.8 | +| **Llama-3-Chinese-8B-Instruct** | 49.3 | 51.5 | 48.3 | 49.4 | +| **Llama-3-Chinese-8B** | 47.0 | 50.5 | 46.1 | 49.0 | +| [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | 51.3 | 51.3 | 49.5 | 51.0 | +| [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 49.3 | 51.2 | 46.1 | 49.4 | +| [Chinese-Mixtral-Instruct](https://github.com/ymcui/Chinese-Mixtral) (8x7B) | 51.7 | 55.0 | 50.0 | 51.5 | +| [Chinese-Mixtral](https://github.com/ymcui/Chinese-Mixtral) (8x7B) | 45.8 | 54.2 | 43.1 | 49.1 | +| [Chinese-Alpaca-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 44.3 | 45.9 | 42.6 | 44.0 | +| [Chinese-LLaMA-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 40.6 | 42.7 | 38.0 | 41.6 | #### CMMLU [CMMLU](https://github.com/haonan-li/CMMLU)是另一个综合性中文评测数据集,专门用于评估语言模型在中文语境下的知识和推理能力,涵盖了从基础学科到高级专业水平的67个主题,共计11.5K个选择题。CMMLU推理代码请参考本项目:[📖GitHub Wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/wiki/cmmlu_zh) -| Models | 参数量 | Test (0-shot) | Test (5-shot) | -| ------------------------ | :------------: | :-----------: | :-----------: | -| **Llama-3-Chinese-8B-Instruct** | 8B | 49.7 | 51.5 | -| **Llama-3-Chinese-8B** | 8B | 48.0 | 50.9 | -| [Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 8B | 47.8 | 50.8 | -| [Chinese-Mixtral-Instruct](https://github.com/ymcui/Chinese-Mixtral) | 8x7B | 50.0 | 53.0 | -| [Chinese-Mixtral](https://github.com/ymcui/Chinese-Mixtral) | 8x7B | 42.5 | 51.0 | -| [Chinese-Alpaca-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 13B | 43.2 | 45.5 | -| [Chinese-LLaMA-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 13B | 38.9 | 42.5 | +| Models | Test (0-shot) | Test (5-shot) | +| ------------------------ | :-----------: | :-----------: | +| **Llama-3-Chinese-8B-Instruct-v2** 
| 51.8 | 52.4 | +| **Llama-3-Chinese-8B-Instruct** | 49.7 | 51.5 | +| **Llama-3-Chinese-8B** | 48.0 | 50.9 | +| [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | 53.0 | 53.5 | +| [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 47.8 | 50.8 | +| [Chinese-Mixtral-Instruct](https://github.com/ymcui/Chinese-Mixtral) (8x7B) | 50.0 | 53.0 | +| [Chinese-Mixtral](https://github.com/ymcui/Chinese-Mixtral) (8x7B) | 42.5 | 51.0 | +| [Chinese-Alpaca-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 43.2 | 45.5 | +| [Chinese-LLaMA-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 38.9 | 42.5 | #### MMLU [MMLU](https://github.com/hendrycks/test)是一个用于评测自然语言理解能力的英文评测数据集,是当今用于评测大模型能力的主要数据集之一,其中验证集和测试集分别包含1.5K和14.1K个选择题,涵盖57个学科。MMLU推理代码请参考本项目:[📖GitHub Wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/wiki/mmlu_zh) -| Models | 参数量 | Valid (0-shot) | Valid (5-shot) | Test (0-shot) | Test (5-shot) | -| ------------------------ | :------------: | :-----------: | :-----------: | :-----------: | :-----------: | -| **Llama-3-Chinese-8B-Instruct** | 8B | 60.1 | 61.3 | 59.8 | 61.8 | -| **Llama-3-Chinese-8B** | 8B | 55.5 | 58.5 | 57.3 | 61.1 | -| [Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 8B | 58.6 | 62.5 | 60.5 | 65.0 | -| [Chinese-Mixtral-Instruct](https://github.com/ymcui/Chinese-Mixtral) | 8x7B | 65.1 | 69.6 | 67.5 | 69.8 | -| [Chinese-Mixtral](https://github.com/ymcui/Chinese-Mixtral) | 8x7B | 63.2 | 67.1 | 65.5 | 68.3 | -| [Chinese-Alpaca-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 13B | 49.6 | 53.2 | 50.9 | 53.5 | -| [Chinese-LLaMA-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 13B | 46.8 | 50.0 | 46.6 | 51.8 | +| Models | Valid (0-shot) | Valid (5-shot) | Test (0-shot) | Test (5-shot) | +| ------------------------ | :-----------: | :-----------: | :-----------: | :-----------: | +| **Llama-3-Chinese-8B-Instruct-v2** | 62.1 | 63.9 | 62.6 | 63.7 | +| 
**Llama-3-Chinese-8B-Instruct** | 60.1 | 61.3 | 59.8 | 61.8 | +| **Llama-3-Chinese-8B** | 55.5 | 58.5 | 57.3 | 61.1 | +| [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | 63.4 | 64.8 | 65.1 | 66.4 | +| [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 58.6 | 62.5 | 60.5 | 65.0 | +| [Chinese-Mixtral-Instruct](https://github.com/ymcui/Chinese-Mixtral) (8x7B) | 65.1 | 69.6 | 67.5 | 69.8 | +| [Chinese-Mixtral](https://github.com/ymcui/Chinese-Mixtral) (8x7B) | 63.2 | 67.1 | 65.5 | 68.3 | +| [Chinese-Alpaca-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 49.6 | 53.2 | 50.9 | 53.5 | +| [Chinese-LLaMA-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 46.8 | 50.0 | 46.6 | 51.8 | #### LongBench [LongBench](https://github.com/THUDM/LongBench)是一个大模型长文本理解能力的评测基准,由6大类、20个不同的任务组成,多数任务的平均长度在5K-15K之间,共包含约4.75K条测试数据。以下是本项目模型在该中文任务(含代码任务)上的评测效果。LongBench推理代码请参考本项目:[📖GitHub Wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/wiki/longbench_zh) -| Models | 参数量 | 单文档QA | 多文档QA | 摘要 | FS学习 | 代码 | 合成 | 平均 | -| ------------------------------------------------------------ | :----: | :------: | :------: | :--: | :----: | :--: | :--: | :--: | -| **Llama-3-Chinese-8B-Instruct** | 8B | 44.1 | 24.0 | 12.4 | 33.5 | 51.8 | 11.5 | 29.6 | -| **Llama-3-Chinese-8B** | 8B | 16.4 | 19.3 | 4.3 | 28.7 | 14.3 | 4.6 | 14.6 | -| [Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 8B | 21.2 | 22.9 | 2.7 | 35.8 | 65.9 | 40.8 | 31.6 | -| [Chinese-Mixtral-Instruct](https://github.com/ymcui/Chinese-Mixtral) | 8x7B | 50.3 | 34.2 | 16.4 | 42.0 | 56.1 | 89.5 | 48.1 | -| [Chinese-Mixtral](https://github.com/ymcui/Chinese-Mixtral) | 8x7B | 32.0 | 23.7 | 0.4 | 42.5 | 27.4 | 14.0 | 23.3 | -| [Chinese-Alpaca-2-13B-16K](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 13B | 47.9 | 26.7 | 13.0 | 22.3 | 46.6 | 21.5 | 29.7 | -| [Chinese-LLaMA-2-13B-16K](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 13B | 36.7 | 17.7 | 3.1 | 
29.8 | 13.8 | 3.0 | 17.3 | -| [Chinese-Alpaca-2-7B-64K](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 7B | 44.7 | 28.1 | 14.4 | 39.0 | 44.6 | 5.0 | 29.3 | -| [Chinese-LLaMA-2-7B-64K](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 7B | 27.2 | 16.4 | 6.5 | 33.0 | 7.8 | 5.0 | 16.0 | +| Models | 单文档QA | 多文档QA | 摘要 | FS学习 | 代码 | 合成 | 平均 | +| ------------------------------------------------------------ | :------: | :------: | :--: | :----: | :--: | :--: | :--: | +| **Llama-3-Chinese-8B-Instruct-v2** | 57.3 | 27.1 | 13.9 | 30.3 | 60.6 | 89.5 | 46.4 | +| **Llama-3-Chinese-8B-Instruct** | 44.1 | 24.0 | 12.4 | 33.5 | 51.8 | 11.5 | 29.6 | +| **Llama-3-Chinese-8B** | 16.4 | 19.3 | 4.3 | 28.7 | 14.3 | 4.6 | 14.6 | +| [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | 55.1 | 15.1 | 0.1 | 24.0 | 51.3 | 94.5 | 40.0 | +| [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 21.2 | 22.9 | 2.7 | 35.8 | 65.9 | 40.8 | 31.6 | +| [Chinese-Mixtral-Instruct](https://github.com/ymcui/Chinese-Mixtral) (8x7B) | 50.3 | 34.2 | 16.4 | 42.0 | 56.1 | 89.5 | 48.1 | +| [Chinese-Mixtral](https://github.com/ymcui/Chinese-Mixtral) (8x7B) | 32.0 | 23.7 | 0.4 | 42.5 | 27.4 | 14.0 | 23.3 | +| [Chinese-Alpaca-2-13B-16K](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 47.9 | 26.7 | 13.0 | 22.3 | 46.6 | 21.5 | 29.7 | +| [Chinese-LLaMA-2-13B-16K](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 36.7 | 17.7 | 3.1 | 29.8 | 13.8 | 3.0 | 17.3 | +| [Chinese-Alpaca-2-7B-64K](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 44.7 | 28.1 | 14.4 | 39.0 | 44.6 | 5.0 | 29.3 | +| [Chinese-LLaMA-2-7B-64K](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 27.2 | 16.4 | 6.5 | 33.0 | 7.8 | 5.0 | 16.0 | ### 量化效果评测 @@ -253,7 +266,7 @@ 问题5:为什么不对模型做全量预训练而是用LoRA? 问题6:为什么Llama-3-Chinese对话效果不好? 问题7:为什么指令模型会回复说自己是ChatGPT? -问题8:为什么没有在Meta-Llama-3-Instruct上训练? +问题8:Instruct模型的v1(原版)和v2有什么区别? 
``` ## 免责声明 diff --git a/README_EN.md b/README_EN.md index 58e7f9b..e2e1f72 100644 --- a/README_EN.md +++ b/README_EN.md @@ -29,7 +29,9 @@ This project is developed based on Meta's newly released next-generation open-so ## News -**[2024/05/07] Add pre-training and SFT scripts. For details, see: [📚Version 1.1 Release Log](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/releases/tag/v1.1)** +**[2024/05/08] Release Llama-3-Chinese-8B-Instruct-v2, which is directly tuned on [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) with 5M instructions. For details, see: [📚Version 2.0 Release Log](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/releases/tag/v2.0)** + +[2024/05/07] Add pre-training and SFT scripts. For details, see: [📚Version 1.1 Release Log](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/releases/tag/v1.1) [2024/04/30] Released the Llama-3-Chinese-8B base model and Llama-3-Chinese-8B-Instruct instruction model. For details, see: [📚Version 1.0 Release Log](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/releases/tag/v1.0) @@ -78,13 +80,13 @@ This project has launched the Chinese open-source large models Llama-3-Chinese a Here's a comparison of the models in this project and recommended usage scenarios. 
**For chat interactions, please choose the Instruct version.** -| Comparison Item | Llama-3-Chinese | Llama-3-Chinese-Instruct | +| Comparison Item | Llama-3-Chinese-8B | Llama-3-Chinese-8B-Instruct | | ----------------------- | :-------------------------------------: | :----------------------------------------------: | | Model Type | Base Model | Instruction/Chat Model (similar to ChatGPT) | | Model Size | 8B | 8B | | Training Type | Causal-LM (CLM) | Instruction Fine-Tuning | | Training Method | LoRA + Full emb/lm-head | LoRA + Full emb/lm-head | -| Initial Model | [Original Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | Llama-3-Chinese | +| Initial Model | [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | v1: Llama-3-Chinese-8B
v2: [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | | Training Corpus | Unlabeled general corpus (approx. 120GB) | Labeled instruction data (approx. 5 million entries) | | Vocabulary Size | Original vocabulary (128,256) | Original vocabulary (128,256) | | Supported Context Length | 8K | 8K | @@ -94,15 +96,19 @@ Here's a comparison of the models in this project and recommended usage scenario ### Download Links -| Model Name | Full Version | LoRA Version | GGUF Version | -| ------------------------------------- | :----------: | :----------: | :----------: | -| **Llama-3-Chinese-8B**
(base model) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b)
[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-lora)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-lora)
[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-lora) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-gguf)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-gguf) | -| **Llama-3-Chinese-8B-Instruct**
(chat model) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct)
[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-lora)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-lora)
[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-lora) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-gguf)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-gguf) | +| Model Name | Full Version | LoRA Version | GGUF Version | +| --------------------------------------------------- | :----------------------------------------------------------: | :----------------------------------------------------------: | :----------------------------------------------------------: | +| **Llama-3-Chinese-8B-Instruct-v2**
(chat model) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-v2)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v2)
[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v2) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-v2-lora)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v2-lora)
[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v2-lora) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-v2-gguf)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-v2-gguf) | +| **Llama-3-Chinese-8B-Instruct**
(chat model) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct)
[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-lora)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-lora)
[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-lora) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-instruct-gguf)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct-gguf) | +| **Llama-3-Chinese-8B**
(base model) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b)
[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-lora)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-lora)
[[wisemodel]](https://wisemodel.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-lora) | [[🤗Hugging Face]](https://huggingface.co/hfl/llama-3-chinese-8b-gguf)
[[🤖ModelScope]](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-gguf) | + Model Type Description: - **Full Model**: Can be used directly for training and inference, no other merging steps required. -- **LoRA Model**: Needs to be merged with the original [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) to convert into a full version, merging steps: [**💻 Model Merging Steps**](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/wiki/model_conversion_en) +- **LoRA Model**: Needs to be merged with the original base model to convert into a full version, merging steps: [**💻 Model Merging Steps**](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/wiki/model_conversion_en) + - v1 base model: [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) + - v2 base model: [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) - **GGUF Model**: Quantization format released by [llama.cpp](https://github.com/ggerganov/llama.cpp), compatible with common large model inference tools like ollama, recommended for users who only need to perform inference deployment. The model name with `-im` suffix is generated with important matrix, which has generally better performance. > [!NOTE] > If HF access is blocked, consider using mirror sites (like [hf-mirror.com](hf-mirror.com)), please find the specific methods and solutions on your own. @@ -138,59 +144,67 @@ To evaluate the effectiveness of the related models, this project conducted both [C-Eval](https://cevalbenchmark.com) is a comprehensive Chinese fundamental model evaluation suite, with its validation and test sets comprising 1.3K and 12.3K multiple-choice questions respectively, covering 52 subjects. 
For C-Eval inference code, please refer to this project: [📖GitHub Wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/wiki/ceval_en) -| Models | Size | Valid (0-shot) | Valid (5-shot) | Test (0-shot) | Test (5-shot) | -| --- | :---: | :---: | :---: | :---: | :---: | -| **Llama-3-Chinese-8B-Instruct** | 8B | 49.3 | 51.5 | 48.3 | 49.4 | -| **Llama-3-Chinese-8B** | 8B | 47.0 | 50.5 | 46.1 | 49.0 | -| [Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 8B | 49.3 | 51.2 | 46.1 | 49.4 | -| [Chinese-Mixtral-Instruct](https://github.com/ymcui/Chinese-Mixtral) | 8x7B | 51.7 | 55.0 | 50.0 | 51.5 | -| [Chinese-Mixtral](https://github.com/ymcui/Chinese-Mixtral) | 8x7B | 45.8 | 54.2 | 43.1 | 49.1 | -| [Chinese-Alpaca-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 13B | 44.3 | 45.9 | 42.6 | 44.0 | -| [Chinese-LLaMA-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 13B | 40.6 | 42.7 | 38.0 | 41.6 | +| Models | Valid (0-shot) | Valid (5-shot) | Test (0-shot) | Test (5-shot) | +| ------------------------ | :-----------: | :-----------: | :-----------: | :-----------: | +| **Llama-3-Chinese-8B-Instruct-v2** | 51.6 | 51.6 | 49.7 | 49.8 | +| **Llama-3-Chinese-8B-Instruct** | 49.3 | 51.5 | 48.3 | 49.4 | +| **Llama-3-Chinese-8B** | 47.0 | 50.5 | 46.1 | 49.0 | +| [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | 51.3 | 51.3 | 49.5 | 51.0 | +| [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 49.3 | 51.2 | 46.1 | 49.4 | +| [Chinese-Mixtral-Instruct](https://github.com/ymcui/Chinese-Mixtral) (8x7B) | 51.7 | 55.0 | 50.0 | 51.5 | +| [Chinese-Mixtral](https://github.com/ymcui/Chinese-Mixtral) (8x7B) | 45.8 | 54.2 | 43.1 | 49.1 | +| [Chinese-Alpaca-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 44.3 | 45.9 | 42.6 | 44.0 | +| [Chinese-LLaMA-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 40.6 | 42.7 | 38.0 | 41.6 | #### CMMLU [CMMLU](https://github.com/haonan-li/CMMLU) is another 
comprehensive Chinese evaluation dataset specifically designed to assess language models' knowledge and reasoning capabilities in a Chinese context, covering topics from basic subjects to advanced professional levels, with a total of 11.5K multiple-choice questions. For CMMLU inference code, please refer to this project: [📖GitHub Wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/wiki/cmmlu_en) -| Models | Size | Test (0-shot) | Test (5-shot) | -| --- | :---: | :---: | :---: | -| **Llama-3-Chinese-8B-Instruct** | 8B | 49.7 | 51.5 | -| **Llama-3-Chinese-8B** | 8B | 48.0 | 50.9 | -| [Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 8B | 47.8 | 50.8 | -| [Chinese-Mixtral-Instruct](https://github.com/ymcui/Chinese-Mixtral) | 8x7B | 50.0 | 53.0 | -| [Chinese-Mixtral](https://github.com/ymcui/Chinese-Mixtral) | 8x7B | 42.5 | 51.0 | -| [Chinese-Alpaca-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 13B | 43.2 | 45.5 | -| [Chinese-LLaMA-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 13B | 38.9 | 42.5 | +| Models | Test (0-shot) | Test (5-shot) | +| ------------------------ | :-----------: | :-----------: | +| **Llama-3-Chinese-8B-Instruct-v2** | 51.8 | 52.4 | +| **Llama-3-Chinese-8B-Instruct** | 49.7 | 51.5 | +| **Llama-3-Chinese-8B** | 48.0 | 50.9 | +| [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | 53.0 | 53.5 | +| [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 47.8 | 50.8 | +| [Chinese-Mixtral-Instruct](https://github.com/ymcui/Chinese-Mixtral) (8x7B) | 50.0 | 53.0 | +| [Chinese-Mixtral](https://github.com/ymcui/Chinese-Mixtral) (8x7B) | 42.5 | 51.0 | +| [Chinese-Alpaca-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 43.2 | 45.5 | +| [Chinese-LLaMA-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 38.9 | 42.5 | #### MMLU [MMLU](https://github.com/hendrycks/test) is an English evaluation dataset for assessing natural language understanding 
capabilities, one of the main datasets used today for evaluating large models' capabilities, with its validation and test sets comprising 1.5K and 14.1K multiple-choice questions respectively, covering 57 subjects. For MMLU inference code, please refer to this project: [📖GitHub Wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/wiki/mmlu_en) -| Models | Size | Valid (0-shot) | Valid (5-shot) | Test (0-shot) | Test (5-shot) | -| --- | :---: | :---: | :---: | :---: | :---: | -| **Llama-3-Chinese-8B-Instruct** | 8B | 60.1 | 61.3 | 59.8 | 61.8 | -| **Llama-3-Chinese-8B** | 8B | 55.5 | 58.5 | 57.3 | 61.1 | -| [Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 8B | 58.6 | 62.5 | 60.5 | 65.0 | -| [Chinese-Mixtral-Instruct](https://github.com/ymcui/Chinese-Mixtral) | 8x7B | 65.1 | 69.6 | 67.5 | 69.8 | -| [Chinese-Mixtral](https://github.com/ymcui/Chinese-Mixtral) | 8x7B | 63.2 | 67.1 | 65.5 | 68.3 | -| [Chinese-Alpaca-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 13B | 49.6 | 53.2 | 50.9 | 53.5 | -| [Chinese-LLaMA-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 13B | 46.8 | 50.0 | 46.6 | 51.8 | +| Models | Valid (0-shot) | Valid (5-shot) | Test (0-shot) | Test (5-shot) | +| ------------------------ | :-----------: | :-----------: | :-----------: | :-----------: | +| **Llama-3-Chinese-8B-Instruct-v2** | 62.1 | 63.9 | 62.6 | 63.7 | +| **Llama-3-Chinese-8B-Instruct** | 60.1 | 61.3 | 59.8 | 61.8 | +| **Llama-3-Chinese-8B** | 55.5 | 58.5 | 57.3 | 61.1 | +| [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | 63.4 | 64.8 | 65.1 | 66.4 | +| [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 58.6 | 62.5 | 60.5 | 65.0 | +| [Chinese-Mixtral-Instruct](https://github.com/ymcui/Chinese-Mixtral) (8x7B) | 65.1 | 69.6 | 67.5 | 69.8 | +| [Chinese-Mixtral](https://github.com/ymcui/Chinese-Mixtral) (8x7B) | 63.2 | 67.1 | 65.5 | 68.3 | +| 
[Chinese-Alpaca-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 49.6 | 53.2 | 50.9 | 53.5 | +| [Chinese-LLaMA-2-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 46.8 | 50.0 | 46.6 | 51.8 | #### LongBench [LongBench](https://github.com/THUDM/LongBench) is a benchmark for evaluating large models' long-text understanding capabilities, composed of 6 categories and 20 different tasks. Most tasks have an average length between 5K-15K, totaling approximately 4.75K test data entries. Below are the evaluation results of this project's models on these Chinese tasks (including code tasks). For LongBench inference code, please refer to this project: [📖GitHub Wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/wiki/longbench_en) -| Models | Size | Single-doc QA | Multi-doc QA | Summary | Few-Shot Learning | Code | Synthesis | Average | -| --- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | -| **Llama-3-Chinese-8B-Instruct** | 8B | 44.1 | 24.0 | 12.4 | 33.5 | 51.8 | 11.5 | 29.6 | -| **Llama-3-Chinese-8B** | 8B | 16.4 | 19.3 | 4.3 | 28.7 | 14.3 | 4.6 | 14.6 | -| [Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 8B | 21.2 | 22.9 | 2.7 | 35.8 | 65.9 | 40.8 | 31.6 | -| [Chinese-Mixtral-Instruct](https://github.com/ymcui/Chinese-Mixtral) | 8x7B | 50.3 | 34.2 | 16.4 | 42.0 | 56.1 | 89.5 | 48.1 | -| [Chinese-Mixtral](https://github.com/ymcui/Chinese-Mixtral) | 8x7B | 32.0 | 23.7 | 0.4 | 42.5 | 27.4 | 14.0 | 23.3 | -| [Chinese-Alpaca-2-13B-16K](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 13B | 47.9 | 26.7 | 13.0 | 22.3 | 46.6 | 21.5 | 29.7 | -| [Chinese-LLaMA-2-13B-16K](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 13B | 36.7 | 17.7 | 3.1 | 29.8 | 13.8 | 3.0 | 17.3 | -| [Chinese-Alpaca-2-7B-64K](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 7B | 44.7 | 28.1 | 14.4 | 39.0 | 44.6 | 5.0 | 29.3 | -| [Chinese-LLaMA-2-7B-64K](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 7B | 27.2 | 16.4 | 6.5 | 33.0 | 7.8 | 5.0 | 16.0 
| +| Models | Single-doc QA | Multi-doc QA | Summarization | Few-Shot Learning | Code | Synthesis | Average | +| --- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | +| **Llama-3-Chinese-8B-Instruct-v2** | 57.3 | 27.1 | 13.9 | 30.3 | 60.6 | 89.5 | 46.4 | +| **Llama-3-Chinese-8B-Instruct** | 44.1 | 24.0 | 12.4 | 33.5 | 51.8 | 11.5 | 29.6 | +| **Llama-3-Chinese-8B** | 16.4 | 19.3 | 4.3 | 28.7 | 14.3 | 4.6 | 14.6 | +| [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) | 55.1 | 15.1 | 0.1 | 24.0 | 51.3 | 94.5 | 40.0 | +| [Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) | 21.2 | 22.9 | 2.7 | 35.8 | 65.9 | 40.8 | 31.6 | +| [Chinese-Mixtral-Instruct](https://github.com/ymcui/Chinese-Mixtral) (8x7B) | 50.3 | 34.2 | 16.4 | 42.0 | 56.1 | 89.5 | 48.1 | +| [Chinese-Mixtral](https://github.com/ymcui/Chinese-Mixtral) (8x7B) | 32.0 | 23.7 | 0.4 | 42.5 | 27.4 | 14.0 | 23.3 | +| [Chinese-Alpaca-2-13B-16K](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 47.9 | 26.7 | 13.0 | 22.3 | 46.6 | 21.5 | 29.7 | +| [Chinese-LLaMA-2-13B-16K](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 36.7 | 17.7 | 3.1 | 29.8 | 13.8 | 3.0 | 17.3 | +| [Chinese-Alpaca-2-7B-64K](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 44.7 | 28.1 | 14.4 | 39.0 | 44.6 | 5.0 | 29.3 | +| [Chinese-LLaMA-2-7B-64K](https://github.com/ymcui/Chinese-LLaMA-Alpaca-2) | 27.2 | 16.4 | 6.5 | 33.0 | 7.8 | 5.0 | 16.0 | ### Quantitative Performance Evaluation @@ -253,7 +267,7 @@ Question 4: Can the models from this repository be used commercially? Question 5: Why not perform full pre-training instead of using LoRA? Question 6: Why is the conversational performance of Llama-3-Chinese not good? Question 7: Why does the instruction model reply saying it is ChatGPT? -Question 8: Why not train from Meta-Llama-3-Instruct? +Question 8: What are the differences between v1 and v2 of the Instruct model? 
``` ## Disclaimer diff --git a/scripts/ollama/Modelfile b/scripts/ollama/Modelfile index 08f2dde..26b3b23 100644 --- a/scripts/ollama/Modelfile +++ b/scripts/ollama/Modelfile @@ -6,7 +6,7 @@ TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|> {{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|> {{ .Response }}<|eot_id|>""" -SYSTEM """""" +SYSTEM """You are a helpful assistant. 你是一个乐于助人的助手。""" PARAMETER num_keep 24 PARAMETER stop <|start_header_id|> PARAMETER stop <|end_header_id|> diff --git a/scripts/ollama/README.md b/scripts/ollama/README.md new file mode 100644 index 0000000..c4e7c74 --- /dev/null +++ b/scripts/ollama/README.md @@ -0,0 +1,3 @@ +## Ollama + +### ⚠️ 请务必使用v0.1.33以上版本,否则会出现无限生成等异常问题。