Byebye torch 1.10 (huggingface#28207)

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>
wgifford · Jan 21, 2024 · b0f19a7
1 parent c5d4d69

Showing 31 changed files with 39 additions and 194 deletions.
diff --git a/.github/workflows/build-past-ci-docker-images.yml b/.github/workflows/build-past-ci-docker-images.yml
@@ -15,7 +15,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        version: ["1.13", "1.12", "1.11", "1.10"]
+        version: ["1.13", "1.12", "1.11"]
     runs-on: ubuntu-22.04
     steps:
       -

diff --git a/.github/workflows/self-nightly-past-ci-caller.yml b/.github/workflows/self-nightly-past-ci-caller.yml
@@ -56,21 +56,10 @@ jobs:
       sha: ${{ github.sha }}
     secrets: inherit
 
-  run_past_ci_pytorch_1-10:
-    name: PyTorch 1.10
-    if: (cancelled() != true) && ((github.event_name == 'schedule') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci')))
-    needs: [run_past_ci_pytorch_1-11]
-    uses: ./.github/workflows/self-past.yml
-    with:
-      framework: pytorch
-      version: "1.10"
-      sha: ${{ github.sha }}
-    secrets: inherit
-
   run_past_ci_tensorflow_2-11:
     name: TensorFlow 2.11
     if: (cancelled() != true) && ((github.event_name == 'push') && startsWith(github.ref_name, 'run_past_ci'))
-    needs: [run_past_ci_pytorch_1-10]
+    needs: [run_past_ci_pytorch_1-11]
     uses: ./.github/workflows/self-past.yml
     with:
       framework: tensorflow

diff --git a/README.md b/README.md
@@ -250,7 +250,7 @@ The model itself is a regular [Pytorch `nn.Module`](https://pytorch.org/docs/sta
 
 ### With pip
 
-This repository is tested on Python 3.8+, Flax 0.4.1+, PyTorch 1.10+, and TensorFlow 2.6+.
+This repository is tested on Python 3.8+, Flax 0.4.1+, PyTorch 1.11+, and TensorFlow 2.6+.
 
 You should install 🤗 Transformers in a [virtual environment](https://docs.python.org/3/library/venv.html). If you're unfamiliar with Python virtual environments, check out the [user guide](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/).
 

diff --git a/README_es.md b/README_es.md
@@ -225,7 +225,7 @@ El modelo en si es un [Pytorch `nn.Module`](https://pytorch.org/docs/stable/nn.h
 
 ### Con pip
 
-Este repositorio está probado en Python 3.8+, Flax 0.4.1+, PyTorch 1.10+ y TensorFlow 2.6+.
+Este repositorio está probado en Python 3.8+, Flax 0.4.1+, PyTorch 1.11+ y TensorFlow 2.6+.
 
 Deberías instalar 🤗 Transformers en un [ambiente virtual](https://docs.python.org/3/library/venv.html). Si no estas familiarizado con los entornos virtuales de Python, consulta la [guía de usuario](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/).
 

diff --git a/README_hd.md b/README_hd.md
@@ -201,7 +201,7 @@ checkpoint: जाँच बिंदु
 
 ### पिप का उपयोग करना
 
-इस रिपॉजिटरी का परीक्षण Python 3.8+, Flax 0.4.1+, PyTorch 1.10+ और TensorFlow 2.6+ के तहत किया गया है।
+इस रिपॉजिटरी का परीक्षण Python 3.8+, Flax 0.4.1+, PyTorch 1.11+ और TensorFlow 2.6+ के तहत किया गया है।
 
 आप [वर्चुअल एनवायरनमेंट](https://docs.python.org/3/library/venv.html) में 🤗 ट्रांसफॉर्मर इंस्टॉल कर सकते हैं। यदि आप अभी तक पायथन के वर्चुअल एनवायरनमेंट से परिचित नहीं हैं, तो कृपया इसे [उपयोगकर्ता निर्देश](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/) पढ़ें।
 

diff --git a/README_ja.md b/README_ja.md
@@ -259,7 +259,7 @@ Hugging Faceチームによって作られた **[トランスフォーマーを
 
 ### pipにて
 
-このリポジトリは、Python 3.8+, Flax 0.4.1+, PyTorch 1.10+, TensorFlow 2.6+ でテストされています。
+このリポジトリは、Python 3.8+, Flax 0.4.1+, PyTorch 1.11+, TensorFlow 2.6+ でテストされています。
 
 🤗Transformersは[仮想環境](https://docs.python.org/3/library/venv.html)にインストールする必要があります。Pythonの仮想環境に慣れていない場合は、[ユーザーガイド](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/)を確認してください。
 

diff --git a/README_ko.md b/README_ko.md
@@ -176,7 +176,7 @@ limitations under the License.
 
 ### pip로 설치하기
 
-이 저장소는 Python 3.8+, Flax 0.4.1+, PyTorch 1.10+, TensorFlow 2.6+에서 테스트 되었습니다.
+이 저장소는 Python 3.8+, Flax 0.4.1+, PyTorch 1.11+, TensorFlow 2.6+에서 테스트 되었습니다.
 
 [가상 환경](https://docs.python.org/3/library/venv.html)에 🤗 Transformers를 설치하세요. Python 가상 환경에 익숙하지 않다면, [사용자 가이드](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/)를 확인하세요.
 

diff --git a/README_pt-br.md b/README_pt-br.md
@@ -258,7 +258,7 @@ O modelo em si é um [Pytorch `nn.Module`](https://pytorch.org/docs/stable/nn.ht
 
 ### Com pip
 
-Este repositório é testado no Python 3.8+, Flax 0.4.1+, PyTorch 1.10+ e TensorFlow 2.6+.
+Este repositório é testado no Python 3.8+, Flax 0.4.1+, PyTorch 1.11+ e TensorFlow 2.6+.
 
 Você deve instalar o 🤗 Transformers em um [ambiente virtual](https://docs.python.org/3/library/venv.html). Se você não está familiarizado com ambientes virtuais em Python, confira o [guia do usuário](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/).
 

diff --git a/README_ru.md b/README_ru.md
@@ -248,7 +248,7 @@ Hugging Face Hub. Мы хотим, чтобы Transformers позволил ра
 
 ### С помощью pip
 
-Данный репозиторий протестирован на Python 3.8+, Flax 0.4.1+, PyTorch 1.10+ и TensorFlow 2.6+.
+Данный репозиторий протестирован на Python 3.8+, Flax 0.4.1+, PyTorch 1.11+ и TensorFlow 2.6+.
 
 Устанавливать 🤗 Transformers следует в [виртуальной среде](https://docs.python.org/3/library/venv.html). Если вы не знакомы с виртуальными средами Python, ознакомьтесь с [руководством пользователя](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/).
 

diff --git a/README_te.md b/README_te.md
@@ -251,7 +251,7 @@ limitations under the License.
 
 ### పిప్ తో
 
-ఈ రిపోజిటరీ పైథాన్ 3.8+, ఫ్లాక్స్ 0.4.1+, PyTorch 1.10+ మరియు TensorFlow 2.6+లో పరీక్షించబడింది.
+ఈ రిపోజిటరీ పైథాన్ 3.8+, ఫ్లాక్స్ 0.4.1+, PyTorch 1.11+ మరియు TensorFlow 2.6+లో పరీక్షించబడింది.
 
 మీరు [వర్చువల్ వాతావరణం](https://docs.python.org/3/library/venv.html)లో 🤗 ట్రాన్స్‌ఫార్మర్‌లను ఇన్‌స్టాల్ చేయాలి. మీకు పైథాన్ వర్చువల్ పరిసరాల గురించి తెలియకుంటే, [యూజర్ గైడ్](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/) చూడండి.
 

diff --git a/README_zh-hans.md b/README_zh-hans.md
@@ -201,7 +201,7 @@ checkpoint: 检查点
 
 ### 使用 pip
 
-这个仓库已在 Python 3.8+、Flax 0.4.1+、PyTorch 1.10+ 和 TensorFlow 2.6+ 下经过测试。
+这个仓库已在 Python 3.8+、Flax 0.4.1+、PyTorch 1.11+ 和 TensorFlow 2.6+ 下经过测试。
 
 你可以在[虚拟环境](https://docs.python.org/3/library/venv.html)中安装 🤗 Transformers。如果你还不熟悉 Python 的虚拟环境，请阅此[用户说明](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/)。
 

diff --git a/README_zh-hant.md b/README_zh-hant.md
@@ -213,7 +213,7 @@ Tokenizer 為所有的預訓練模型提供了預處理，並可以直接轉換
 
 ### 使用 pip
 
-這個 Repository 已在 Python 3.8+、Flax 0.4.1+、PyTorch 1.10+ 和 TensorFlow 2.6+ 下經過測試。
+這個 Repository 已在 Python 3.8+、Flax 0.4.1+、PyTorch 1.11+ 和 TensorFlow 2.6+ 下經過測試。
 
 你可以在[虛擬環境](https://docs.python.org/3/library/venv.html)中安裝 🤗 Transformers。如果你還不熟悉 Python 的虛擬環境，請閱此[使用者指引](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/)。
 

diff --git a/setup.py b/setup.py
@@ -175,7 +175,7 @@
     "timeout-decorator",
     "timm",
     "tokenizers>=0.14,<0.19",
-    "torch>=1.10,!=1.12.0",
+    "torch>=1.11,!=1.12.0",
     "torchaudio",
     "torchvision",
     "pyctcdecode>=0.4.0",

diff --git a/src/transformers/convert_graph_to_onnx.py b/src/transformers/convert_graph_to_onnx.py
@@ -273,40 +273,22 @@ def convert_pytorch(nlp: Pipeline, opset: int, output: Path, use_external_format
     import torch
     from torch.onnx import export
 
-    from transformers.pytorch_utils import is_torch_less_than_1_11
-
     print(f"Using framework PyTorch: {torch.__version__}")
 
     with torch.no_grad():
         input_names, output_names, dynamic_axes, tokens = infer_shapes(nlp, "pt")
         ordered_input_names, model_args = ensure_valid_input(nlp.model, tokens, input_names)
 
-        # PyTorch deprecated the `enable_onnx_checker` and `use_external_data_format` arguments in v1.11,
-        # so we check the torch version for backwards compatibility
-        if is_torch_less_than_1_11:
-            export(
-                nlp.model,
-                model_args,
-                f=output.as_posix(),
-                input_names=ordered_input_names,
-                output_names=output_names,
-                dynamic_axes=dynamic_axes,
-                do_constant_folding=True,
-                use_external_data_format=use_external_format,
-                enable_onnx_checker=True,
-                opset_version=opset,
-            )
-        else:
-            export(
-                nlp.model,
-                model_args,
-                f=output.as_posix(),
-                input_names=ordered_input_names,
-                output_names=output_names,
-                dynamic_axes=dynamic_axes,
-                do_constant_folding=True,
-                opset_version=opset,
-            )
+        export(
+            nlp.model,
+            model_args,
+            f=output.as_posix(),
+            input_names=ordered_input_names,
+            output_names=output_names,
+            dynamic_axes=dynamic_axes,
+            do_constant_folding=True,
+            opset_version=opset,
+        )
 
 
 def convert_tensorflow(nlp: Pipeline, opset: int, output: Path):

diff --git a/src/transformers/dependency_versions_table.py b/src/transformers/dependency_versions_table.py
@@ -80,7 +80,7 @@
     "timeout-decorator": "timeout-decorator",
     "timm": "timm",
     "tokenizers": "tokenizers>=0.14,<0.19",
-    "torch": "torch>=1.10,!=1.12.0",
+    "torch": "torch>=1.11,!=1.12.0",
     "torchaudio": "torchaudio",
     "torchvision": "torchvision",
     "pyctcdecode": "pyctcdecode>=0.4.0",

diff --git a/src/transformers/modeling_utils.py b/src/transformers/modeling_utils.py
@@ -97,7 +97,6 @@
     is_torchdynamo_compiling,
 )
 from .utils.quantization_config import AwqConfig, BitsAndBytesConfig, GPTQConfig, QuantizationMethod
-from .utils.versions import require_version_core
 
 
 XLA_USE_BF16 = os.environ.get("XLA_USE_BF16", "0").upper()
@@ -2898,10 +2897,6 @@ def from_pretrained(
                 raise ValueError("Passing along a `device_map` requires `low_cpu_mem_usage=True`")
 
         if low_cpu_mem_usage:
-            if device_map is not None:
-                # The max memory utils require PyTorch >= 1.10 to have torch.cuda.mem_get_info.
-                require_version_core("torch>=1.10")
-
             if is_deepspeed_zero3_enabled():
                 raise ValueError(
                     "DeepSpeed Zero-3 is not compatible with `low_cpu_mem_usage=True` or with passing a `device_map`."

diff --git a/src/transformers/models/pix2struct/image_processing_pix2struct.py b/src/transformers/models/pix2struct/image_processing_pix2struct.py
@@ -43,23 +43,10 @@
 if is_torch_available():
     import torch
 
-    from transformers.pytorch_utils import is_torch_greater_or_equal_than_1_11
-else:
-    is_torch_greater_or_equal_than_1_11 = False
-
-
 logger = logging.get_logger(__name__)
 DEFAULT_FONT_PATH = "ybelkada/fonts"
 
 
-def _check_torch_version():
-    if is_torch_available() and not is_torch_greater_or_equal_than_1_11:
-        raise ImportError(
-            f"You are using torch=={torch.__version__}, but torch>=1.11.0 is required to use "
-            "Pix2StructImageProcessor. Please upgrade torch."
-        )
-
-
 # adapted from: https://discuss.pytorch.org/t/tf-image-extract-patches-in-pytorch/171409/2
 def torch_extract_patches(image_tensor, patch_height, patch_width):
     """
@@ -75,7 +62,6 @@ def torch_extract_patches(image_tensor, patch_height, patch_width):
             The width of the patches to extract.
     """
     requires_backends(torch_extract_patches, ["torch"])
-    _check_torch_version()
 
     image_tensor = image_tensor.unsqueeze(0)
     patches = torch.nn.functional.unfold(image_tensor, (patch_height, patch_width), stride=(patch_height, patch_width))
@@ -262,7 +248,6 @@ def extract_flattened_patches(
                 A sequence of `max_patches` flattened patches.
         """
         requires_backends(self.extract_flattened_patches, "torch")
-        _check_torch_version()
 
         # convert to torch
         image = to_channel_dimension_format(image, ChannelDimension.FIRST, input_data_format)

diff --git a/src/transformers/onnx/convert.py b/src/transformers/onnx/convert.py
@@ -33,7 +33,6 @@
 
 if is_torch_available():
     from ..modeling_utils import PreTrainedModel
-    from ..pytorch_utils import is_torch_less_than_1_11
 
 if is_tf_available():
     from ..modeling_tf_utils import TFPreTrainedModel
@@ -167,49 +166,16 @@ def export_pytorch(
 
             config.patch_ops()
 
-            # PyTorch deprecated the `enable_onnx_checker` and `use_external_data_format` arguments in v1.11,
-            # so we check the torch version for backwards compatibility
-            if is_torch_less_than_1_11:
-                # export can work with named args but the dict containing named args
-                # has to be the last element of the args tuple.
-                try:
-                    onnx_export(
-                        model,
-                        (model_inputs,),
-                        f=output.as_posix(),
-                        input_names=list(config.inputs.keys()),
-                        output_names=onnx_outputs,
-                        dynamic_axes=dict(chain(config.inputs.items(), config.outputs.items())),
-                        do_constant_folding=True,
-                        use_external_data_format=config.use_external_data_format(model.num_parameters()),
-                        enable_onnx_checker=True,
-                        opset_version=opset,
-                    )
-                except RuntimeError as err:
-                    message = str(err)
-                    if (
-                        message
-                        == "Exporting model exceed maximum protobuf size of 2GB. Please call torch.onnx.export without"
-                        " setting use_external_data_format parameter."
-                    ):
-                        message = (
-                            "Exporting model exceed maximum protobuf size of 2GB. Please call torch.onnx.export"
-                            " without setting use_external_data_format parameter or try with torch 1.10+."
-                        )
-                        raise RuntimeError(message)
-                    else:
-                        raise err
-            else:
-                onnx_export(
-                    model,
-                    (model_inputs,),
-                    f=output.as_posix(),
-                    input_names=list(config.inputs.keys()),
-                    output_names=onnx_outputs,
-                    dynamic_axes=dict(chain(config.inputs.items(), config.outputs.items())),
-                    do_constant_folding=True,
-                    opset_version=opset,
-                )
+            onnx_export(
+                model,
+                (model_inputs,),
+                f=output.as_posix(),
+                input_names=list(config.inputs.keys()),
+                output_names=onnx_outputs,
+                dynamic_axes=dict(chain(config.inputs.items(), config.outputs.items())),
+                do_constant_folding=True,
+                opset_version=opset,
+            )
 
             config.restore_ops()
 

diff --git a/src/transformers/pytorch_utils.py b/src/transformers/pytorch_utils.py
@@ -32,9 +32,6 @@
 is_torch_greater_or_equal_than_2_0 = parsed_torch_version_base >= version.parse("2.0")
 is_torch_greater_or_equal_than_1_13 = parsed_torch_version_base >= version.parse("1.13")
 is_torch_greater_or_equal_than_1_12 = parsed_torch_version_base >= version.parse("1.12")
-is_torch_greater_or_equal_than_1_11 = parsed_torch_version_base >= version.parse("1.11")
-is_torch_less_than_1_11 = parsed_torch_version_base < version.parse("1.11")
-is_torch_1_8_0 = parsed_torch_version_base == version.parse("1.8.0")
 
 
 def softmax_backward_data(parent, grad_output, output, dim, self):
@@ -45,10 +42,7 @@ def softmax_backward_data(parent, grad_output, output, dim, self):
 
     from torch import _softmax_backward_data
 
-    if is_torch_less_than_1_11:
-        return _softmax_backward_data(grad_output, output, parent.dim, self)
-    else:
-        return _softmax_backward_data(grad_output, output, parent.dim, self.dtype)
+    return _softmax_backward_data(grad_output, output, parent.dim, self.dtype)
 
 
 def prune_linear_layer(layer: nn.Linear, index: torch.LongTensor, dim: int = 0) -> nn.Linear:

diff --git a/src/transformers/trainer.py b/src/transformers/trainer.py
@@ -64,7 +64,7 @@
 from .modeling_utils import PreTrainedModel, load_sharded_checkpoint, unwrap_model
 from .models.auto.modeling_auto import MODEL_FOR_CAUSAL_LM_MAPPING_NAMES, MODEL_MAPPING_NAMES
 from .optimization import Adafactor, get_scheduler
-from .pytorch_utils import ALL_LAYERNORM_LAYERS, is_torch_less_than_1_11
+from .pytorch_utils import ALL_LAYERNORM_LAYERS
 from .tokenization_utils_base import PreTrainedTokenizerBase
 from .trainer_callback import (
     CallbackHandler,
@@ -1794,7 +1794,7 @@ def _inner_training_loop(
                 if version.parse(accelerate_version) > version.parse("0.23.0"):
                     sampler_kinds.append(SeedableRandomSampler)
                 is_random_sampler = isinstance(sampler, tuple(sampler_kinds))
-                if is_torch_less_than_1_11 or not is_random_sampler:
+                if not is_random_sampler:
                     # We just need to begin an iteration to create the randomization of the sampler.
                     for _ in train_dataloader:
                         break

diff --git a/src/transformers/training_args.py b/src/transformers/training_args.py
@@ -1439,15 +1439,6 @@ def __post_init__(self):
                         raise ValueError(
                             "Your setup doesn't support bf16/gpu. You need torch>=1.10, using Ampere GPU with cuda>=11.0"
                         )
-                    elif is_torch_npu_available():
-                        # npu
-                        from .pytorch_utils import is_torch_greater_or_equal_than_1_11
-
-                        if not is_torch_greater_or_equal_than_1_11:
-                            raise ValueError(
-                                "Your setup doesn't support bf16/npu. You need torch>=1.11, using Ascend NPU with "
-                                "`torch_npu` installed"
-                            )
                     elif not is_torch_xpu_available():
                         # xpu
                         from .pytorch_utils import is_torch_greater_or_equal_than_1_12

diff --git a/src/transformers/utils/import_utils.py b/src/transformers/utils/import_utils.py
@@ -64,6 +64,7 @@ def _is_package_available(pkg_name: str, return_version: bool = False) -> Union[
 
 FORCE_TF_AVAILABLE = os.environ.get("FORCE_TF_AVAILABLE", "AUTO").upper()
 
+# `transformers` requires `torch>=1.11` but this variable is exposed publicly, and we can't simply remove it.
 # This is the version of torch required to run torch.fx features and torch.onnx with dictionary inputs.
 TORCH_FX_REQUIRED_VERSION = version.parse("1.10")