Flexible config validation function #1475

Status: Closed (wants to merge 1 commit)
9 changes: 6 additions & 3 deletions llmfoundry/command_utils/train.py
@@ -5,7 +5,7 @@
 import os
 import time
 import warnings
-from typing import Any, Optional, Union
+from typing import Any, Callable, Optional, Union
 
 import torch
 import torch.distributed
@@ -187,7 +187,10 @@ def _initialize_dist_with_barrier(dist_timeout: Union[int, float]):
     log.debug('Barrier test passed with device.')
 
 
-def train(cfg: DictConfig) -> Trainer:
+def train(
+    cfg: DictConfig,
+    config_validation_fn: Callable = validate_config,
+) -> Trainer:
     code_paths = cfg.get('code_paths', [])
     # Import any user provided code
     for code_path in code_paths:
@@ -226,7 +229,7 @@ def train(cfg: DictConfig) -> Trainer:
     )
 
     # Check for incompatibilities between the model and data loaders
-    validate_config(train_cfg)
+    config_validation_fn(train_cfg)
 
     cuda_alloc_conf = []
     # Get max split size mb
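The change above is a dependency-injection pattern: the hard-coded call to `validate_config` becomes a `config_validation_fn` parameter that defaults to the original function, so existing callers are unaffected while downstream users can swap in their own checks. A minimal, self-contained sketch of the pattern (plain `dict` configs and stand-in function names like `default_validate` and `no_op_validate` are illustrative, not part of llmfoundry's API):

```python
from typing import Any, Callable, Mapping

def default_validate(cfg: Mapping[str, Any]) -> None:
    """Stand-in for llmfoundry's validate_config: raise on a bad config."""
    if 'model' not in cfg:
        raise ValueError('config must define a model')

def train(
    cfg: Mapping[str, Any],
    config_validation_fn: Callable[[Mapping[str, Any]], None] = default_validate,
) -> str:
    # Validation is now injectable; the default preserves old behavior.
    config_validation_fn(cfg)
    return 'trainer'

def no_op_validate(cfg: Mapping[str, Any]) -> None:
    """A permissive validator that skips all checks."""

# Default path behaves exactly as before:
train({'model': 'mpt-7b'})
# A custom validator relaxes (or extends) the checks:
train({}, config_validation_fn=no_op_validate)
```

Note that the PR annotates the parameter as a bare `Callable`; a narrower signature such as `Callable[[TrainConfig], None]` would document the expected argument, at the cost of a tighter contract for callers.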