The following items must be checked before submitting
Issue type
Model training and fine-tuning
Base model
Llama-3-Chinese-8B-Instruct (instruction model)
Operating system
Linux
Describe the issue in detail
I only modified some of the QA pairs in the data and then ran training. The result is that the original QA pairs are still answered correctly, but the QA pairs I modified are not answered correctly. For example:
Is this related to the data I modified?
Dependencies (must be provided for code-related issues)
The environment was installed according to the requirements file; Python is 3.12.
Run log or screenshots
Run log
/root/miniconda3/bin/conda run -p /root/miniconda3 --no-capture-output python /root/autodl-tmp/pycharm/Chinese-LLaMA-Alpaca-3-main/scripts/training/run_clm_sft_with_peft.py --model_name_or_path /root/autodl-tmp/pycharm/llama3 --tokenizer_name_or_path /root/autodl-tmp/pycharm/llama3 --dataset_dir /root/autodl-tmp/pycharm/Chinese-LLaMA-Alpaca-3-main/data --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --do_train 1 --do_eval 1 --seed 42 --bf16 1 --num_train_epochs 10 --lr_scheduler_type cosine --learning_rate 1e-4 --warmup_ratio 0.05 --weight_decay 0.1 --logging_strategy steps --logging_steps 10 --save_strategy steps --save_total_limit 3 --evaluation_strategy steps --eval_steps 100 --save_steps 200 --gradient_accumulation_steps 8 --preprocessing_num_workers 8 --max_seq_length 1024 --output_dir /root/autodl-tmp/pycharm/llama3-lora --overwrite_output_dir 1 --ddp_timeout 30000 --logging_first_step True --lora_rank 64 --lora_alpha 128 --trainable q_proj,v_proj,k_proj,o_proj,gate_proj,down_proj,up_proj --lora_dropout 0.05 --modules_to_save embed_tokens,lm_head --torch_dtype bfloat16 --validation_file /root/autodl-tmp/pycharm/Chinese-LLaMA-Alpaca-3-main/eval/ruozhiba_qa2449_gpt4turbo.json --load_in_kbits 16
/root/miniconda3/lib/python3.12/site-packages/transformers/training_args.py:1568: FutureWarning: `evaluation_strategy` is deprecated and will be removed in version 4.46 of 🤗 Transformers. Use `eval_strategy` instead
  warnings.warn(
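This FutureWarning is harmless for the run itself, but it can be avoided by switching to the new argument name. Since the script parses TrainingArguments with HfArgumentParser, passing `--eval_strategy steps` instead of `--evaluation_strategy steps` should be accepted (an assumption about the script's argument parsing, not shown in the log); the equivalent in Python, as a minimal runnable sketch:

```python
from transformers import TrainingArguments

# Minimal sketch: the same evaluation settings under the new name (transformers >= 4.41).
args = TrainingArguments(
    output_dir="/root/autodl-tmp/pycharm/llama3-lora",  # output_dir from the command above
    eval_strategy="steps",   # replaces the deprecated `evaluation_strategy`
    eval_steps=100,
    save_steps=200,
)
print(args.eval_strategy)    # IntervalStrategy.STEPS
```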
11/16/2024 16:43:39 - WARNING - main - Process rank: 0, device: cuda:0, n_gpu: 1, distributed training: True, 16-bits training: True
[INFO|configuration_utils.py:677] 2024-11-16 16:43:39,149 >> loading configuration file /root/autodl-tmp/pycharm/llama3/config.json
[INFO|configuration_utils.py:746] 2024-11-16 16:43:39,150 >> Model config LlamaConfig {
"_name_or_path": "/root/autodl-tmp/pycharm/llama3",
"architectures": [
"LlamaForCausalLM"
],
"attention_bias": false,
"attention_dropout": 0.0,
"bos_token_id": 128000,
"eos_token_id": 128009,
"head_dim": 128,
"hidden_act": "silu",
"hidden_size": 4096,
"initializer_range": 0.02,
"intermediate_size": 14336,
"max_position_embeddings": 8192,
"mlp_bias": false,
"model_type": "llama",
"num_attention_heads": 32,
"num_hidden_layers": 32,
"num_key_value_heads": 8,
"pretraining_tp": 1,
"rms_norm_eps": 1e-05,
"rope_scaling": null,
"rope_theta": 500000.0,
"tie_word_embeddings": false,
"torch_dtype": "bfloat16",
"transformers_version": "4.46.2",
"use_cache": true,
"vocab_size": 128256
}
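As a quick sanity check (not part of the original log), the reported config values are internally consistent: head_dim is hidden_size divided by the number of attention heads, and the 8 KV heads mean grouped-query attention with 4 query heads per KV head.

```python
# Values copied from the LlamaConfig printed above.
hidden_size = 4096
num_attention_heads = 32
num_key_value_heads = 8

print(hidden_size // num_attention_heads)          # 128 -> matches "head_dim": 128
print(num_attention_heads // num_key_value_heads)  # 4 query heads per KV head (GQA)
```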
[INFO|tokenization_utils_base.py:2209] 2024-11-16 16:43:39,151 >> loading file tokenizer.json
[INFO|tokenization_utils_base.py:2209] 2024-11-16 16:43:39,151 >> loading file tokenizer.model
[INFO|tokenization_utils_base.py:2209] 2024-11-16 16:43:39,151 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2209] 2024-11-16 16:43:39,151 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2209] 2024-11-16 16:43:39,151 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2475] 2024-11-16 16:43:39,484 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
11/16/2024 16:43:39 - INFO - main - Training files: /root/autodl-tmp/pycharm/Chinese-LLaMA-Alpaca-3-main/data/ruozhiba_qa2449_gpt4turbo.json
11/16/2024 16:43:39 - WARNING - root - building dataset...
11/16/2024 16:43:39 - INFO - name - training datasets-/root/autodl-tmp/pycharm/Chinese-LLaMA-Alpaca-3-main/data/ruozhiba_qa2449_gpt4turbo.json has been loaded from disk
11/16/2024 16:43:39 - INFO - main - Num train_samples 25
11/16/2024 16:43:39 - INFO - main - Training example:
11/16/2024 16:43:39 - INFO - main - <|start_header_id|>system<|end_header_id|>
You are a helpful assistant. 你是一个乐于助人的助手。<|eot_id|><|start_header_id|>user<|end_header_id|>
鸡柳是鸡身上哪个部位啊?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
鸡柳并不是鸡身上的一个特定部位,它是指用鸡胸肉切成的细长条形的肉块。在中国餐饮中,鸡柳是非常常见的一种食材,经常被用来炒菜或是做成油炸小吃。鸡胸肉是鸡身上的肉质较为细嫩的部分,含脂肪少且蛋白质含量高,适合切成条状做成多种菜式。
11/16/2024 16:43:39 - INFO - main - Evaluation files: /root/autodl-tmp/pycharm/Chinese-LLaMA-Alpaca-3-main/eval/ruozhiba_qa2449_gpt4turbo.json
11/16/2024 16:43:39 - WARNING - root - building dataset...
11/16/2024 16:43:39 - INFO - name - training datasets-/root/autodl-tmp/pycharm/Chinese-LLaMA-Alpaca-3-main/eval/ruozhiba_qa2449_gpt4turbo.json has been loaded from disk
11/16/2024 16:43:39 - INFO - main - Num eval_samples 36
11/16/2024 16:43:39 - INFO - main - Evaluation example:
11/16/2024 16:43:39 - INFO - main - <|start_header_id|>system<|end_header_id|>
You are a helpful assistant. 你是一个乐于助人的助手。<|eot_id|><|start_header_id|>user<|end_header_id|>
鸡柳是鸡身上哪个部位啊?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
鸡柳并不是鸡身上的一个特定部位,它是指用鸡胸肉切成的细长条形的肉块。在中国餐饮中,鸡柳是非常常见的一种食材,经常被用来炒菜或是做成油炸小吃。鸡胸肉是鸡身上的肉质较为细嫩的部分,含脂肪少且蛋白质含量高,适合切成条状做成多种菜式。
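Only 25 training samples are loaded here even though the file name suggests 2449 QA pairs, so it is worth confirming that the modified pairs actually ended up in the training file. A minimal check, assuming the repo's Alpaca-style instruction/input/output fields (an assumption about the data format, not shown in the log; the path and the sample question are taken from the log above):

```python
import json

train_file = "/root/autodl-tmp/pycharm/Chinese-LLaMA-Alpaca-3-main/data/ruozhiba_qa2449_gpt4turbo.json"

with open(train_file, encoding="utf-8") as f:
    data = json.load(f)

print("number of QA pairs:", len(data))  # the log reports Num train_samples 25

# Look for one of the edited questions to confirm the edit is really in the training set.
hits = [ex for ex in data if "鸡柳" in ex.get("instruction", "")]
for ex in hits:
    print(ex["instruction"], "->", ex["output"][:50])
```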
[INFO|modeling_utils.py:3934] 2024-11-16 16:43:39,520 >> loading weights file /root/autodl-tmp/pycharm/llama3/model.safetensors.index.json
[INFO|modeling_utils.py:1670] 2024-11-16 16:43:39,520 >> Instantiating LlamaForCausalLM model under default dtype torch.bfloat16.
[INFO|configuration_utils.py:1096] 2024-11-16 16:43:39,521 >> Generate config GenerationConfig {
"bos_token_id": 128000,
"eos_token_id": 128009
}
Loading checkpoint shards: 100%|██████████████████| 4/4 [00:03<00:00, 1.03it/s]
[INFO|modeling_utils.py:4800] 2024-11-16 16:43:43,453 >> All model checkpoint weights were used when initializing LlamaForCausalLM.
[INFO|modeling_utils.py:4808] 2024-11-16 16:43:43,453 >> All the weights of LlamaForCausalLM were initialized from the model checkpoint at /root/autodl-tmp/pycharm/llama3.
If your task is similar to the task the model of the checkpoint was trained on, you can already use LlamaForCausalLM for predictions without further training.
[INFO|configuration_utils.py:1049] 2024-11-16 16:43:43,455 >> loading configuration file /root/autodl-tmp/pycharm/llama3/generation_config.json
[INFO|configuration_utils.py:1096] 2024-11-16 16:43:43,455 >> Generate config GenerationConfig {
"bos_token_id": 128000,
"do_sample": true,
"eos_token_id": [
128001,
128009
],
"max_length": 4096,
"temperature": 0.6,
"top_p": 0.9
}
11/16/2024 16:43:43 - INFO - main - Model vocab size: 128256
11/16/2024 16:43:43 - INFO - main - len(tokenizer):128256
11/16/2024 16:43:43 - INFO - main - Init new peft model
11/16/2024 16:43:43 - INFO - main - target_modules: ['q_proj', 'v_proj', 'k_proj', 'o_proj', 'gate_proj', 'down_proj', 'up_proj']
11/16/2024 16:43:43 - INFO - main - lora_rank: 64
trainable params: 1,218,445,312 || all params: 9,248,706,560 || trainable%: 13.174223920885215
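The roughly 1.22 B trainable parameters come mostly from `modules_to_save embed_tokens,lm_head` rather than from the LoRA adapters themselves. A back-of-the-envelope check (shapes taken from the LlamaConfig above) reproduces the printed count exactly:

```python
# Rank-64 LoRA on the seven listed projections, plus fully trainable
# embed_tokens and lm_head (modules_to_save). Shapes from the config above.
vocab, hidden, inter, layers, r = 128256, 4096, 14336, 32, 64
kv_dim = 8 * 128  # num_key_value_heads * head_dim

lora_per_layer = sum(r * (i + o) for i, o in [
    (hidden, hidden),  # q_proj
    (hidden, kv_dim),  # k_proj
    (hidden, kv_dim),  # v_proj
    (hidden, hidden),  # o_proj
    (hidden, inter),   # gate_proj
    (hidden, inter),   # up_proj
    (inter, hidden),   # down_proj
])
embeddings = 2 * vocab * hidden                  # embed_tokens + lm_head
trainable = layers * lora_per_layer + embeddings
print(f"{trainable:,}")                          # 1,218,445,312 -> matches the log
```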
/root/autodl-tmp/pycharm/Chinese-LLaMA-Alpaca-3-main/scripts/training/run_clm_sft_with_peft.py:456: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `Trainer.__init__`. Use `processing_class` instead.
  trainer = Trainer(
11/16/2024 16:43:47 - WARNING - accelerate.utils.other - Detected kernel version 4.19.90, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
[INFO|trainer.py:699] 2024-11-16 16:43:48,066 >> Using auto half precision backend
[INFO|trainer.py:2314] 2024-11-16 16:43:48,246 >> ***** Running training *****
[INFO|trainer.py:2315] 2024-11-16 16:43:48,246 >> Num examples = 25
[INFO|trainer.py:2316] 2024-11-16 16:43:48,246 >> Num Epochs = 10
[INFO|trainer.py:2317] 2024-11-16 16:43:48,246 >> Instantaneous batch size per device = 1
[INFO|trainer.py:2320] 2024-11-16 16:43:48,246 >> Total train batch size (w. parallel, distributed & accumulation) = 8
[INFO|trainer.py:2321] 2024-11-16 16:43:48,246 >> Gradient Accumulation steps = 8
[INFO|trainer.py:2322] 2024-11-16 16:43:48,246 >> Total optimization steps = 30
[INFO|trainer.py:2323] 2024-11-16 16:43:48,250 >> Number of trainable parameters = 1,218,445,312
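The step and epoch numbers follow directly from these settings: 25 examples with an effective batch of 8 (1 per device × 8 accumulation steps) give 3 optimizer steps per epoch, hence 30 steps over 10 epochs, and the fractional epochs in the loss lines below are just step × 8 / 25. A quick check with the values from the log:

```python
num_examples, effective_batch, epochs = 25, 1 * 8, 10
steps_per_epoch = num_examples // effective_batch  # 3: the trailing partial accumulation window is not a full step
total_steps = steps_per_epoch * epochs             # 30, matching "Total optimization steps = 30"
print(steps_per_epoch, total_steps)

for step in (1, 10, 20, 30):
    print(step, round(step * effective_batch / num_examples, 2))  # 0.32, 3.2, 6.4, 9.6
```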
0%| | 0/30 [00:00<?, ?it/s]/root/miniconda3/lib/python3.12/site-packages/transformers/data/data_collator.py:657: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ../torch/csrc/utils/tensor_new.cpp:274.)
batch["labels"] = torch.tensor(batch["labels"], dtype=torch.int64)
{'loss': 1.6754, 'grad_norm': 5.90625, 'learning_rate': 5e-05, 'epoch': 0.32}
{'loss': 1.214, 'grad_norm': 7.46875, 'learning_rate': 8.117449009293668e-05, 'epoch': 3.2}
{'loss': 0.1101, 'grad_norm': 0.369140625, 'learning_rate': 2.8305813044122097e-05, 'epoch': 6.4}
{'loss': 0.0148, 'grad_norm': 0.1513671875, 'learning_rate': 0.0, 'epoch': 9.6}
100%|███████████████████████████████████████████| 30/30 [00:34<00:00, 1.12s/it][INFO|trainer.py:3812] 2024-11-16 16:44:22,505 >> Saving model checkpoint to /root/autodl-tmp/pycharm/llama3-lora/checkpoint-30
[INFO|tokenization_utils_base.py:2646] 2024-11-16 16:44:28,336 >> tokenizer config file saved in /root/autodl-tmp/pycharm/llama3-lora/checkpoint-30/tokenizer_config.json
[INFO|tokenization_utils_base.py:2655] 2024-11-16 16:44:28,336 >> Special tokens file saved in /root/autodl-tmp/pycharm/llama3-lora/checkpoint-30/special_tokens_map.json
[INFO|trainer.py:2591] 2024-11-16 16:44:37,717 >>
Training completed. Do not forget to share your model on huggingface.co/models =)
{'train_runtime': 49.4677, 'train_samples_per_second': 5.054, 'train_steps_per_second': 0.606, 'train_loss': 0.4617110307017962, 'epoch': 9.6}
100%|███████████████████████████████████████████| 30/30 [00:49<00:00, 1.65s/it]
[INFO|trainer.py:3812] 2024-11-16 16:44:37,719 >> Saving model checkpoint to /root/autodl-tmp/pycharm/llama3-lora
[INFO|tokenization_utils_base.py:2646] 2024-11-16 16:44:43,289 >> tokenizer config file saved in /root/autodl-tmp/pycharm/llama3-lora/tokenizer_config.json
[INFO|tokenization_utils_base.py:2655] 2024-11-16 16:44:43,289 >> Special tokens file saved in /root/autodl-tmp/pycharm/llama3-lora/special_tokens_map.json
***** train metrics *****
epoch = 9.6
total_flos = 2177307GF
train_loss = 0.4617
train_runtime = 0:00:49.46
train_samples = 25
train_samples_per_second = 5.054
train_steps_per_second = 0.606
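The throughput numbers are mutually consistent: train_samples_per_second is the 25 examples times 10 scheduled epochs divided by the runtime, and train_steps_per_second is the 30 optimization steps over the same runtime.

```python
runtime = 49.4677         # train_runtime in seconds, from the metrics above
print(25 * 10 / runtime)  # ≈ 5.054  (train_samples_per_second)
print(30 / runtime)       # ≈ 0.606  (train_steps_per_second)
```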
11/16/2024 16:44:43 - INFO - main - *** Evaluate ***
[INFO|trainer.py:4128] 2024-11-16 16:44:43,469 >>
***** Running Evaluation *****
[INFO|trainer.py:4130] 2024-11-16 16:44:43,470 >> Num examples = 36
[INFO|trainer.py:4133] 2024-11-16 16:44:43,470 >> Batch size = 1
100%|███████████████████████████████████████████| 36/36 [00:01<00:00, 24.79it/s]
***** eval metrics *****
epoch = 9.6
eval_loss = 0.7792
eval_runtime = 0:00:01.50
eval_samples = 36
eval_samples_per_second = 23.972
eval_steps_per_second = 23.972
perplexity = 2.1797
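The reported perplexity is simply the exponential of the eval loss:

```python
import math
print(math.exp(0.7792))  # ≈ 2.1797, matching the perplexity above
```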
Process finished with exit code 0
/root/miniconda3/bin/conda run -p /root/miniconda3 --no-capture-output python /root/autodl-tmp/pycharm/Chinese-LLaMA-Alpaca-3-main/scripts/merge_llama3_with_chinese_lora_low_mem.py --base_model /root/autodl-tmp/pycharm/llama3 --lora_model /root/autodl-tmp/pycharm/llama3-lora --output_dir /root/autodl-tmp/pycharm/llama3-lora-merge
/root/autodl-tmp/pycharm/Chinese-LLaMA-Alpaca-3-main/scripts/merge_llama3_with_chinese_lora_low_mem.py:197: SyntaxWarning: invalid escape sequence '\d'
ckpt_filenames = sorted([f for f in os.listdir(output_dir) if re.match('L(\d+)-consolidated.(\d+).pth',f)])
/root/autodl-tmp/pycharm/Chinese-LLaMA-Alpaca-3-main/scripts/merge_llama3_with_chinese_lora_low_mem.py:200: SyntaxWarning: invalid escape sequence '\d'
shards_filenames = sorted([f for f in ckpt_filenames if re.match(f'L(\d+)-consolidated.0{i}.pth',f)])
/root/autodl-tmp/pycharm/Chinese-LLaMA-Alpaca-3-main/scripts/merge_llama3_with_chinese_lora_low_mem.py:273: SyntaxWarning: invalid escape sequence '\d'
ckpt_filenames = sorted([f for f in os.listdir(base_model_path) if re.match('model-(\d+)-of-(\d+).safetensors',f)])
/root/autodl-tmp/pycharm/Chinese-LLaMA-Alpaca-3-main/scripts/merge_llama3_with_chinese_lora_low_mem.py:275: SyntaxWarning: invalid escape sequence '\d'
ckpt_filenames = sorted([f for f in os.listdir(base_model_path) if re.match('pytorch_model-(\d+)-of-(\d+).bin',f)])
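These SyntaxWarnings are harmless for the merge itself (the patterns still match under Python 3.12), but they could be silenced by making the regex patterns raw strings in merge_llama3_with_chinese_lora_low_mem.py. An illustrative sketch of the change, not an official patch:

```python
import os
import re

# Raw string literals keep '\d' as a regex digit class instead of an invalid string escape.
base_model_path = "/root/autodl-tmp/pycharm/llama3"  # path taken from the log below
ckpt_filenames = sorted(
    f for f in os.listdir(base_model_path)
    if re.match(r"model-(\d+)-of-(\d+).safetensors", f)
)
print(ckpt_filenames)
```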
Base model: /root/autodl-tmp/pycharm/llama3
LoRA model: /root/autodl-tmp/pycharm/llama3-lora
Loading /root/autodl-tmp/pycharm/llama3-lora
Loading ckpt model-00001-of-00004.safetensors
Merging...
Saving ckpt model-00001-of-00004.safetensors to /root/autodl-tmp/pycharm/llama3-lora-merge in HF format...
Loading ckpt model-00002-of-00004.safetensors
Merging...
Saving ckpt model-00002-of-00004.safetensors to /root/autodl-tmp/pycharm/llama3-lora-merge in HF format...
Loading ckpt model-00003-of-00004.safetensors
Merging...
Saving ckpt model-00003-of-00004.safetensors to /root/autodl-tmp/pycharm/llama3-lora-merge in HF format...
Loading ckpt model-00004-of-00004.safetensors
Merging...
Saving ckpt model-00004-of-00004.safetensors to /root/autodl-tmp/pycharm/llama3-lora-merge in HF format...
Saving tokenizer
Saving config.json from /root/autodl-tmp/pycharm/llama3
Saving generation_config.json from /root/autodl-tmp/pycharm/llama3
Saving model.safetensors.index.json from /root/autodl-tmp/pycharm/llama3
Done.
Check output dir: /root/autodl-tmp/pycharm/llama3-lora-merge
Process finished with exit code 0
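To answer the question at the top of the issue (whether the modified QA pairs actually changed the model's behaviour), the merged model can be queried directly. A minimal sketch, assuming the merged weights in the output directory above and the chat template shipped with the tokenizer; the test question is one of the training questions printed in the log:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "/root/autodl-tmp/pycharm/llama3-lora-merge"  # output_dir from the merge step
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir, torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "system", "content": "You are a helpful assistant. 你是一个乐于助人的助手。"},
    {"role": "user", "content": "鸡柳是鸡身上哪个部位啊?"},
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```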