Commit

Merge branch 'main' into angelayi/serialize_pytree

angelayi committed Dec 18, 2023
2 parents 4d19a6e + 08a6e7a commit 0590ca1
Showing 55 changed files with 857 additions and 550 deletions.
4 changes: 3 additions & 1 deletion docs/source/en/_toctree.yml
@@ -135,6 +135,8 @@
title: Overview
- local: quantization
title: Quantization
+- local: trainer
+  title: Trainer
- sections:
- local: perf_train_gpu_one
title: Methods and tools for efficient training on a single GPU
@@ -149,7 +151,7 @@
- local: perf_train_tpu_tf
title: Training on TPU with TensorFlow
- local: perf_train_special
-title: Training on Specialized Hardware
+title: PyTorch training on Apple silicon
- local: perf_hardware
title: Custom hardware for training
- local: hpo_train
396 changes: 6 additions & 390 deletions docs/source/en/main_classes/trainer.md

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/source/en/model_doc/mixtral.md
@@ -20,7 +20,7 @@ rendered properly in your Markdown viewer.

Mixtral-8x7B is Mistral AI's second Large Language Model (LLM).

-The Mixtral model was proposed in the by the [Mistral AI](https://mistral.ai/) team.
+The Mixtral model was proposed by the [Mistral AI](https://mistral.ai/) team.

It was introduced in the [Mixtral of Experts blogpost](https://mistral.ai/news/mixtral-of-experts/) with the following introduction:

4 changes: 2 additions & 2 deletions docs/source/en/model_doc/vipllava.md
@@ -37,13 +37,13 @@ Tips:
- For better results, we recommend prompting the model with the correct prompt format:

```bash
"USER: <image>\n<prompt>ASSISTANT:"
A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.###Human: <image>\n<prompt>###Assistant:
```
For a conversation with multiple turns:
```bash
"USER: <image>\n<prompt1>ASSISTANT: <answer1>USER: <prompt2>ASSISTANT: <answer2>USER: <prompt3>ASSISTANT:"
A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.###Human: <image>\n<prompt1>###Assistant: <answer1>###Human: <prompt2>###Assistant:
```
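As an illustration, filling the single-turn template in Python might look like the following sketch (the `llava-hf/vip-llava-7b-hf` checkpoint name and the image URL are assumptions, not part of this diff):

```python
import requests
from PIL import Image
from transformers import AutoProcessor, VipLlavaForConditionalGeneration

model_id = "llava-hf/vip-llava-7b-hf"  # assumed checkpoint name
model = VipLlavaForConditionalGeneration.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

question = "What is shown in this image?"
prompt = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's questions."
    f"###Human: <image>\n{question}###Assistant:"
)

# Placeholder image URL; any RGB image works.
image = Image.open(requests.get("https://example.com/cat.png", stream=True).raw)
inputs = processor(text=prompt, images=image, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=50)
print(processor.decode(output[0], skip_special_tokens=True))
```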

The original code can be found [here](https://github.com/mu-cai/ViP-LLaVA).
47 changes: 43 additions & 4 deletions docs/source/en/perf_train_special.md
@@ -13,12 +13,51 @@ rendered properly in your Markdown viewer.
-->

-# Training on Specialized Hardware
+# PyTorch training on Apple silicon

-<Tip>
+Previously, training models on a Mac was limited to the CPU only. With the release of PyTorch v1.12, you can take advantage of Apple silicon GPUs for significantly faster training. This is powered in PyTorch by integrating Apple's Metal Performance Shaders (MPS) as a backend. The [MPS backend](https://pytorch.org/docs/stable/notes/mps.html) implements PyTorch operations as custom Metal shaders and places these modules on an `mps` device.
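In practice, "placing these modules on an `mps` device" is ordinary PyTorch device handling; a minimal sketch (not part of this diff):

```python
import torch

# Use the MPS device when available, otherwise fall back to the CPU.
device = torch.device("mps") if torch.backends.mps.is_available() else torch.device("cpu")

model = torch.nn.Linear(128, 64).to(device)  # module weights now live in unified memory
x = torch.randn(8, 128, device=device)       # tensors can be created on the device directly
print(model(x).device)                       # mps:0 on Apple silicon
```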

-Note: Most of the strategies introduced in the [single GPU section](perf_train_gpu_one) (such as mixed precision training or gradient accumulation) and [multi-GPU section](perf_train_gpu_many) are generic and apply to training models in general so make sure to have a look at it before diving into this section.
+<Tip warning={true}>

Some PyTorch operations are not implemented in MPS yet and will throw an error. To avoid this, you should set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU kernels instead (you'll still see a `UserWarning`).

<br>

If you run into any other errors, please open an issue in the [PyTorch](https://github.com/pytorch/pytorch/issues) repository because the [`Trainer`] only integrates the MPS backend.

</Tip>
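The fallback can also be enabled from Python; a sketch, assuming the variable is set before any MPS kernels run (exporting it in the shell before launching works just as well):

```python
import os

# Assumption: set the variable before importing torch so it is in place
# before the first MPS operation dispatches.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch  # ops without an MPS kernel now fall back to CPU (with a UserWarning)
```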

-This document will be completed soon with information on how to train on specialized hardware.
With the `mps` device set, you can:

* train larger networks or batch sizes locally
* reduce data retrieval latency because the GPU's unified memory architecture allows direct access to the full memory store
* reduce costs because you don't need to train on cloud-based GPUs or add additional local GPUs

Get started by making sure you have PyTorch installed. MPS acceleration is supported on macOS 12.3+.

```bash
pip install torch torchvision torchaudio
```
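To confirm your install actually exposes the backend, a quick check (a sketch, not part of the merged docs):

```python
import torch

print(torch.backends.mps.is_built())      # True if this PyTorch build includes MPS support
print(torch.backends.mps.is_available())  # True if macOS and the hardware can use it
```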

[`TrainingArguments`] uses the `mps` device by default if it's available, which means you don't need to explicitly set the device. For example, you can run the [run_glue.py](https://github.com/huggingface/transformers/blob/main/examples/pytorch/text-classification/run_glue.py) script with the MPS backend automatically enabled without making any changes.

```diff
export TASK_NAME=mrpc

python examples/pytorch/text-classification/run_glue.py \
--model_name_or_path bert-base-cased \
--task_name $TASK_NAME \
- --use_mps_device \
--do_train \
--do_eval \
--max_seq_length 128 \
--per_device_train_batch_size 32 \
--learning_rate 2e-5 \
--num_train_epochs 3 \
--output_dir /tmp/$TASK_NAME/ \
--overwrite_output_dir
```
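The same default applies when constructing [`TrainingArguments`] in code; a minimal sketch (the output directory is a placeholder):

```python
from transformers import TrainingArguments

# No device flag needed: with MPS available, the device resolves to mps automatically.
args = TrainingArguments(output_dir="/tmp/mrpc")
print(args.device)  # expected: device(type='mps') on Apple silicon with PyTorch >= 1.12
```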

Backends for [distributed setups](https://pytorch.org/docs/stable/distributed.html#backends) like `gloo` and `nccl` are not supported by the `mps` device, which means you can only train on a single GPU with the MPS backend.

You can learn more about the MPS backend in the [Introducing Accelerated PyTorch Training on Mac](https://pytorch.org/blog/introducing-accelerated-pytorch-training-on-mac/) blog post.
3 changes: 1 addition & 2 deletions docs/source/en/tasks/semantic_segmentation.md
@@ -276,8 +276,7 @@ You could also create and use your own dataset if you prefer to train with the [
"label": sorted(label_paths)})
dataset = dataset.cast_column("image", Image())
dataset = dataset.cast_column("label", Image())

return dataset
return dataset

# step 1: create Dataset objects
train_dataset = create_dataset(image_paths_train, label_paths_train)
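For context, the helper this hunk patches looks roughly like the following; a reconstruction from the visible lines, so treat the signature as an assumption:

```python
from datasets import Dataset, Image

def create_dataset(image_paths, label_paths):
    """Pair image files with segmentation label files in a `datasets.Dataset`."""
    dataset = Dataset.from_dict({"image": sorted(image_paths),
                                 "label": sorted(label_paths)})
    dataset = dataset.cast_column("image", Image())
    dataset = dataset.cast_column("label", Image())
    return dataset
```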