diff --git a/docs/source/en/perf_infer_cpu.md b/docs/source/en/perf_infer_cpu.md
index c0e017c020870e..06b50f0b15e10d 100644
--- a/docs/source/en/perf_infer_cpu.md
+++ b/docs/source/en/perf_infer_cpu.md
@@ -42,7 +42,6 @@ Enable BetterTransformer with the [`PreTrainedModel.to_bettertransformer`] metho
 from transformers import AutoModelForCausalLM
 
 model = AutoModelForCausalLM.from_pretrained("bigcode/starcoder")
-model.to_bettertransformer()
 ```
 
 ## TorchScript
@@ -54,7 +53,7 @@ For a gentle introduction to TorchScript, see the [Introduction to PyTorch Torch
 With the [`Trainer`] class, you can enable JIT mode for CPU inference by setting the `--jit_mode_eval` flag:
 
 ```bash
-python run_qa.py \
+python examples/pytorch/question-answering/run_qa.py \
 --model_name_or_path csarron/bert-base-uncased-squad-v1 \
 --dataset_name squad \
 --do_eval \
@@ -86,7 +85,7 @@ pip install intel_extension_for_pytorch
 Set the `--use_ipex` and `--jit_mode_eval` flags in the [`Trainer`] class to enable JIT mode with the graph optimizations:
 
 ```bash
-python run_qa.py \
+python examples/pytorch/question-answering/run_qa.py \
 --model_name_or_path csarron/bert-base-uncased-squad-v1 \
 --dataset_name squad \
 --do_eval \
diff --git a/docs/source/en/perf_train_cpu.md b/docs/source/en/perf_train_cpu.md
index 7ef98932d537ac..ab2f735ecbdd50 100644
--- a/docs/source/en/perf_train_cpu.md
+++ b/docs/source/en/perf_train_cpu.md
@@ -51,7 +51,7 @@ To enable auto mixed precision with IPEX in Trainer, users should add `use_ipex`
 Take an example of the use cases on [Transformers question-answering](https://github.com/huggingface/transformers/tree/main/examples/pytorch/question-answering)
 
 - Training with IPEX using BF16 auto mixed precision on CPU:
-python run_qa.py \
+python examples/pytorch/question-answering/run_qa.py \
 --model_name_or_path google-bert/bert-base-uncased \
 --dataset_name squad \
 --do_train \
diff --git a/docs/source/en/perf_train_cpu_many.md b/docs/source/en/perf_train_cpu_many.md
index ed782caca3b1f1..d6a029c471de08 100644
--- a/docs/source/en/perf_train_cpu_many.md
+++ b/docs/source/en/perf_train_cpu_many.md
@@ -75,7 +75,7 @@ The following command enables training with 2 processes on one Xeon node, with o
  export CCL_WORKER_COUNT=1
  export MASTER_ADDR=127.0.0.1
  mpirun -n 2 -genv OMP_NUM_THREADS=23 \
- python3 run_qa.py \
+ python3 examples/pytorch/question-answering/run_qa.py \
  --model_name_or_path google-bert/bert-large-uncased \
  --dataset_name squad \
  --do_train \
@@ -104,7 +104,7 @@ Now, run the following command in node0 and **4DDP** will be enabled in node0 an
  export MASTER_ADDR=xxx.xxx.xxx.xxx #node0 ip
  mpirun -f hostfile -n 4 -ppn 2 \
  -genv OMP_NUM_THREADS=23 \
- python3 run_qa.py \
+ python3 examples/pytorch/question-answering/run_qa.py \
  --model_name_or_path google-bert/bert-large-uncased \
  --dataset_name squad \
  --do_train \