Qwen

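Tongyi Qianwen (Qwen) is a family of large language models developed by Alibaba Cloud, available in two sizes: 7 billion and 14 billion parameters. Qwen is a Transformer-based large language model trained on very large-scale pretraining data. The pretraining data is diverse and broad in coverage, including large amounts of web text, professional books, and code.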

Supported model weights:

Qwen1.5 is an upgraded version of the Tongyi Qianwen model series developed by Alibaba Cloud. It comprises Base and Chat models in 9 sizes: 0.5B, 1.8B, 4B, 7B, 14B, 32B, 72B, 110B, and MoE.

Supported model weights:

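Qwen2 is an upgraded version of the Tongyi Qianwen model series developed by Alibaba Cloud. It comprises Base and Chat models in 5 sizes: 0.5B, 1.5B, 7B, 72B, and MoE.

Supported model weights: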