Remove tests and support for torch <2.1 (#787)
dakinggg authored Dec 8, 2023
1 parent ef60e8e commit 2017c02
Showing 14 changed files with 20 additions and 61 deletions.
6 changes: 0 additions & 6 deletions .github/workflows/docker.yaml
@@ -17,12 +17,6 @@ jobs:
     strategy:
       matrix:
         include:
-        - name: '1.13.1_cu117'
-          base_image: mosaicml/pytorch:1.13.1_cu117-python3.10-ubuntu20.04
-          dep_groups: '[gpu]'
-        - name: '2.0.1_cu118'
-          base_image: mosaicml/pytorch:2.0.1_cu118-python3.10-ubuntu20.04
-          dep_groups: '[gpu]'
         - name: '2.1.0_cu121'
           base_image: mosaicml/pytorch:2.1.0_cu121-python3.10-ubuntu20.04
           dep_groups: '[gpu]'
4 changes: 0 additions & 4 deletions .github/workflows/pr-cpu.yaml
@@ -19,10 +19,6 @@ jobs:
     strategy:
       matrix:
         include:
-        - name: 'cpu-1.13.1'
-          container: mosaicml/pytorch:1.13.1_cpu-python3.10-ubuntu20.04
-          markers: 'not gpu'
-          pytest_command: 'coverage run -m pytest'
         - name: 'cpu-2.1.0'
           container: mosaicml/pytorch:2.1.0_cpu-python3.10-ubuntu20.04
           markers: 'not gpu'
5 changes: 0 additions & 5 deletions .github/workflows/pr-gpu.yaml
@@ -19,11 +19,6 @@ jobs:
     strategy:
       matrix:
         include:
-        - name: 'gpu-1.13.1'
-          container: mosaicml/pytorch:1.13.1_cu117-python3.10-ubuntu20.04
-          markers: 'gpu'
-          pytest_command: 'coverage run -m pytest'
-          deps_group: 'all'
         - name: 'gpu-2.1.0'
           container: mosaicml/pytorch:2.1.0_cu121-python3.10-ubuntu20.04
           markers: 'gpu'
15 changes: 2 additions & 13 deletions README.md
@@ -85,21 +85,14 @@ Something missing? Contribute with a PR!


 # Hardware and Software Requirements
-This codebase has been tested with PyTorch 1.13.1 and PyTorch 2.0.1 on systems with NVIDIA A100s and H100s.
+This codebase has been tested with PyTorch 2.1 with NVIDIA A100s and H100s.
 This codebase may also work on systems with other devices, such as consumer NVIDIA cards and AMD cards, but we are not actively testing these systems.
 If you have success/failure using LLM Foundry on other systems, please let us know in a Github issue and we will update the support matrix!

 | Device | Torch Version | Cuda Version | Status |
 | -------------- | ------------- | ------------ | ---------------------------- |
-| A100-40GB/80GB | 1.13.1 | 11.7 | :white_check_mark: Supported |
-| A100-40GB/80GB | 2.0.1 | 11.7, 11.8 | :white_check_mark: Supported |
-| A100-40GB/80GB | 2.1.0 | 11.8, 12.1 | :white_check_mark: Supported |
-| H100-80GB | 1.13.1 | 11.7 | :x: Not Supported |
-| H100-80GB | 2.0.1 | 11.8 | :white_check_mark: Supported |
+| A100-40GB/80GB | 2.1.0 | 12.1 | :white_check_mark: Supported |
 | H100-80GB | 2.1.0 | 12.1 | :white_check_mark: Supported |
-| A10-24GB | 1.13.1 | 11.7 | :construction: In Progress |
-| A10-24GB | 2.0.1 | 11.7, 11.8 | :construction: In Progress |
-| MI250 | 2.0.1 | ROCm 5.4 | :construction: In Progress |

 ## MosaicML Docker Images
 We highly recommend using our prebuilt Docker images. You can find them here: https://hub.docker.com/orgs/mosaicml/repositories.
@@ -113,11 +106,7 @@ You can select a specific commit hash such as `mosaicml/llm-foundry:1.13.1_cu117

 | Docker Image | Torch Version | Cuda Version | LLM Foundry dependencies installed? |
 | ------------------------------------------------------ | ------------- | ----------------- | ----------------------------------- |
-| `mosaicml/pytorch:1.13.1_cu117-python3.10-ubuntu20.04` | 1.13.1 | 11.7 (Infiniband) | No |
-| `mosaicml/pytorch:2.0.1_cu118-python3.10-ubuntu20.04` | 2.0.1 | 11.8 (Infiniband) | No |
 | `mosaicml/pytorch:2.1.0_cu121-python3.10-ubuntu20.04` | 2.1.0 | 12.1 (Infiniband) | No |
-| `mosaicml/llm-foundry:1.13.1_cu117-latest` | 1.13.1 | 11.7 (Infiniband) | Yes |
-| `mosaicml/llm-foundry:2.0.1_cu118-latest` | 2.0.1 | 11.8 (Infiniband) | Yes |
 | `mosaicml/llm-foundry:2.1.0_cu121-latest` | 2.1.0 | 12.1 (Infiniband) | Yes (flash attention v1) |
 | `mosaicml/llm-foundry:2.1.0_cu121_flash2-latest` | 2.1.0 | 12.1 (Infiniband) | Yes (flash attention v2) |
 | `mosaicml/llm-foundry:2.1.0_cu121_aws-latest` | 2.1.0 | 12.1 (EFA) | Yes (flash attention v1) |
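
Reviewer note (not part of this commit): the `llm-foundry` image tags in the README table appear to follow a `<torch>_cu<cuda>[_<variant>]-latest` pattern. The sketch below shows one way to split such a tag into its Torch version, CUDA version, and optional variant. The helper `parse_llm_foundry_tag`, the `ImageTag` type, and the regular expression are hypothetical and inferred only from the `-latest` tags listed above; commit-hash tags and the `mosaicml/pytorch` base-image tags are not handled.

```python
import re
from typing import NamedTuple, Optional


class ImageTag(NamedTuple):
    """Pieces of an llm-foundry image tag (hypothetical type, for illustration)."""
    torch: str
    cuda: str
    variant: Optional[str]


# Assumed pattern, inferred from the README table: <torch>_cu<digits>[_<variant>]-latest
_TAG_RE = re.compile(
    r'^(?P<torch>\d+\.\d+\.\d+)_cu(?P<cuda>\d+)(?:_(?P<variant>[a-z0-9]+))?-latest$')


def parse_llm_foundry_tag(image: str) -> ImageTag:
    """Split 'mosaicml/llm-foundry:2.1.0_cu121_flash2-latest' into its parts."""
    # Everything after the first ':' is the tag.
    _, _, tag = image.partition(':')
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError(f'Unrecognized image tag: {image!r}')
    digits = match['cuda']
    # 'cu121' encodes CUDA 12.1: the last digit is the minor version.
    cuda = f'{digits[:-1]}.{digits[-1]}'
    return ImageTag(match['torch'], cuda, match['variant'])
```

Under these assumptions, `parse_llm_foundry_tag('mosaicml/llm-foundry:2.1.0_cu121_aws-latest')` yields Torch `2.1.0`, CUDA `12.1`, and variant `aws`, matching the corresponding table row.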
3 changes: 1 addition & 2 deletions llmfoundry/data/packing.py
@@ -5,7 +5,6 @@

 import numpy as np
 import torch
-from composer.utils import using_torch_2
 from omegaconf import DictConfig
 from transformers import PreTrainedTokenizerBase

@@ -348,7 +347,7 @@ def profile_packing(
 dataloader_cfg.dataset.packing_ratio = None
 dataloader_cfg.drop_last = False
 dataloader_cfg.num_workers = 0
-dataloader_cfg.prefetch_factor = None if using_torch_2() else 2
+dataloader_cfg.prefetch_factor = None
 dataloader_cfg.persistent_workers = False

 # Determine the packing_ratio values we'll try
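
Reviewer note (not part of this commit): the removed expression `None if using_torch_2() else 2` existed because the `DataLoader` default for `prefetch_factor` changed from `2` in torch 1.13 to `None` in torch 2.x, and with `num_workers=0` each version accepts only its own default. The pure-Python sketch below mirrors that conditional. It assumes `using_torch_2()` amounts to a check for major version 2 or later; `legacy_prefetch_factor` is a hypothetical name, not a function in Composer or LLM Foundry.

```python
from typing import Optional


def legacy_prefetch_factor(torch_version: str) -> Optional[int]:
    """Mimic the removed `None if using_torch_2() else 2` expression.

    Hypothetical helper: it takes the version as a string instead of
    inspecting the installed torch, so it runs without torch present.
    """
    # Assumption: "torch 2" means a major version of 2 or higher.
    major = int(torch_version.split('.')[0])
    # With num_workers == 0, torch 1.x expects the old default (2),
    # while torch 2.x expects None.
    return None if major >= 2 else 2
```

With torch older than 2.1 no longer supported, the first branch is the only one that can be taken, which is why the commit can hard-code `None`.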
2 changes: 1 addition & 1 deletion mcli/mcli-hf-eval.yaml
@@ -16,7 +16,7 @@ gpu_num: 8
 # gpu_type:
 # cluster: # replace with your cluster here!

-image: mosaicml/llm-foundry:2.0.1_cu118-latest
+image: mosaicml/llm-foundry:2.1.0_cu121_flash2-latest

 # The below is injected as a YAML file: /mnt/config/parameters.yaml
 parameters:
2 changes: 1 addition & 1 deletion mcli/mcli-openai-eval.yaml
@@ -16,7 +16,7 @@ run_name: openai-eval
 # gpu_type: #
 cluster: # replace with your cluster here!

-image: mosaicml/llm-foundry:2.0.1_cu118-latest
+image: mosaicml/llm-foundry:2.1.0_cu121_flash2-latest

 # The below is injected as a YAML file: /mnt/config/parameters.yaml
 parameters:
2 changes: 1 addition & 1 deletion setup.py
@@ -51,7 +51,7 @@
 'accelerate>=0.20,<0.21', # for HF inference `device_map`
 'transformers>=4.34.1,<4.35',
 'mosaicml-streaming>=0.7.1,<0.8',
-'torch>=1.13.1,<2.1.1',
+'torch>=2.1,<2.1.1',
 'datasets>=2.14.5,<2.15',
 'fsspec==2023.6.0', # newer version results in a bug in datasets that duplicates data
 'sentencepiece==0.1.97',
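
Reviewer note (not part of this commit): the new pin `torch>=2.1,<2.1.1` accepts only 2.1.0-series releases. The sketch below checks a version string against that range using plain tuple comparison. It is a simplification, not a full PEP 440 implementation: it drops a local suffix such as `+cu121` and does not handle pre-releases like `2.1.0rc1`. The names `parse_release` and `satisfies_torch_pin` are hypothetical.

```python
from typing import Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Turn '2.1.0+cu121' into (2, 1, 0), dropping any local '+...' suffix."""
    release = version.split('+')[0]
    parts = tuple(int(p) for p in release.split('.'))
    # Pad short versions such as '2.1' to three components.
    return parts + (0,) * (3 - len(parts))


def satisfies_torch_pin(version: str) -> bool:
    """Simplified check for the range torch>=2.1,<2.1.1."""
    return (2, 1, 0) <= parse_release(version) < (2, 1, 1)
```

Under this simplified reading, `2.1`, `2.1.0`, and `2.1.0+cu121` are accepted, while `1.13.1`, `2.0.1`, and `2.1.1` are rejected.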
4 changes: 2 additions & 2 deletions tests/a_scripts/inference/test_convert_composer_to_hf.py
@@ -14,7 +14,7 @@
 import transformers
 from composer import Trainer
 from composer.loggers import MLFlowLogger
-from composer.utils import dist, get_device, using_torch_2
+from composer.utils import dist, get_device
 from omegaconf import DictConfig
 from omegaconf import OmegaConf as om
 from torch.utils.data import DataLoader
@@ -497,7 +497,7 @@ def test_huggingface_conversion_callback(
 'drop_last': False,
 'num_workers': 0,
 'pin_memory': False,
-'prefetch_factor': None if using_torch_2() else 2,
+'prefetch_factor': None,
 'persistent_workers': False,
 'timeout': 0
 }
5 changes: 2 additions & 3 deletions tests/a_scripts/train/test_train.py
@@ -6,7 +6,6 @@

 import pytest
 from composer.loggers import InMemoryLogger
-from composer.utils import using_torch_2
 from omegaconf import DictConfig, ListConfig
 from omegaconf import OmegaConf as om

@@ -36,10 +35,10 @@ def test_train_gauntlet(averages: Optional[dict], tmp_path: pathlib.Path):
 test_cfg.icl_subset_num_batches = 1
 test_cfg.eval_subset_num_batches = 2
 test_cfg.train_loader.num_workers = 0
-test_cfg.train_loader.prefetch_factor = None if using_torch_2() else 2
+test_cfg.train_loader.prefetch_factor = None
 test_cfg.train_loader.persistent_workers = False
 test_cfg.eval_loader.num_workers = 0
-test_cfg.eval_loader.prefetch_factor = None if using_torch_2() else 2
+test_cfg.eval_loader.prefetch_factor = None
 test_cfg.eval_loader.persistent_workers = False

 test_cfg.eval_gauntlet = DictConfig({
8 changes: 4 additions & 4 deletions tests/data/test_dataloader.py
@@ -13,7 +13,7 @@
 import pytest
 import torch
 import transformers
-from composer.utils import dist, using_torch_2
+from composer.utils import dist
 from omegaconf import DictConfig
 from omegaconf import OmegaConf as om
 from streaming import MDSWriter
@@ -272,7 +272,7 @@ def test_finetuning_dataloader(decoder_only_format: bool,
 'drop_last': False,
 'num_workers': 0,
 'pin_memory': False,
-'prefetch_factor': None if using_torch_2() else 2,
+'prefetch_factor': None,
 'persistent_workers': False,
 'timeout': 0
 }
@@ -569,7 +569,7 @@ def test_malformed_data(
 },
 'drop_last': False,
 'num_workers': 0,
-'prefetch_factor': None if using_torch_2() else 2,
+'prefetch_factor': None,
 'pin_memory': False,
 'persistent_workers': False,
 'timeout': 0
@@ -679,7 +679,7 @@ def test_token_counting_func_dataloader_setting(
 common_args = {
 'drop_last': False,
 'num_workers': 0,
-'prefetch_factor': None if using_torch_2() else 2,
+'prefetch_factor': None,
 'pin_memory': False,
 'persistent_workers': False,
 'timeout': 0
4 changes: 2 additions & 2 deletions tests/data/test_packing.py
@@ -6,7 +6,7 @@

 import pytest
 import torch
-from composer.utils import dist, reproducibility, using_torch_2
+from composer.utils import dist, reproducibility
 from omegaconf import DictConfig
 from pytest import approx
 from torch.utils.data import DataLoader
@@ -172,7 +172,7 @@ def test_packing_with_dataloader(packing_ratio: Any):
 # Gets copied per worker and we cannot check the waste for child processes.
 'num_workers': 0,
 'pin_memory': False,
-'prefetch_factor': None if using_torch_2() else 2,
+'prefetch_factor': None,
 'persistent_workers': False,
 'timeout': 0,
 })
5 changes: 1 addition & 4 deletions tests/models/test_fsdp_act_checkpoint.py
@@ -5,7 +5,7 @@

 import pytest
 from composer import Trainer
-from composer.utils import get_device, using_torch_2
+from composer.utils import get_device
 from omegaconf import OmegaConf as om
 from torch.distributed.algorithms._checkpoint.checkpoint_wrapper import \
     CheckpointWrapper
@@ -65,9 +65,6 @@ def test_fsdp_act_checkpoint(activation_checkpointing: bool,
 ]:
     module = trainer.state.model.model._fsdp_wrapped_module.transformer.blocks[
         0]._fsdp_wrapped_module
-    if not using_torch_2():
-        module = trainer.state.model.model._fsdp_wrapped_module.transformer.blocks[
-            0]._fsdp_wrapped_module._fpw_module
     assert isinstance(module, CheckpointWrapper)
 elif activation_checkpointing_target == [
     'grouped_query_attention'
16 changes: 3 additions & 13 deletions tests/optim/test_lion8b.py
@@ -6,23 +6,15 @@
 import warnings

 import numpy as np
-import packaging.version as version
 import pytest
 import torch
 import torch.distributed as dist
 import torch.nn as nn
 from torch.distributed import fsdp
 from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
-
-if version.parse(torch.__version__) >= version.parse('2.0.1'):
-    from torch.distributed.fsdp.api import ( # type:ignore .api not in public API
-        FullOptimStateDictConfig, LocalOptimStateDictConfig,
-        ShardedOptimStateDictConfig)
-else:
-    from unittest.mock import MagicMock # for pyright so vars aren't None
-    FullOptimStateDictConfig = MagicMock()
-    LocalOptimStateDictConfig = MagicMock()
-    ShardedOptimStateDictConfig = MagicMock()
+from torch.distributed.fsdp.api import ( # type:ignore .api not in public API
+    FullOptimStateDictConfig, LocalOptimStateDictConfig,
+    ShardedOptimStateDictConfig)

 from llmfoundry.optim import DecoupledLionW
 from llmfoundry.optim import DecoupledLionW_8bit as Lion8bit
@@ -420,8 +412,6 @@ def test_fsdp_save_load(dtype: torch.dtype, use_errors: bool,
 device = 'cuda'
 if torch.cuda.device_count() < 2:
     pytest.skip(f'This test requires 2+ GPUs.')
-if version.parse(torch.__version__) < version.parse('2.0.1'):
-    pytest.skip(f'This test requires torch 2.0.1 or greater.')

 torch.cuda.set_device(f'cuda:{os.environ["RANK"]}') # needed for fsdp
 if not dist.is_initialized():
