Auto tokenizer name path fix #59

Open: wants to merge 1 commit into master

@krmst commented Oct 3, 2024

Changes

  • Auto tokenizer name path fix

Why submit this change

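  • On a fresh clone, ./minimind-v1-small is not a valid name or path: the local directory does not exist, so transformers falls back to treating the string as a Hub repo id, which fails validation and raises the error below.
  • After changing it to jingyaogong/minimind-v1-small, the download finished correctly.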
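
For reference, a possible way to keep both setups working is to prefer the local folder when it exists and fall back to the Hub repo id otherwise. This is only a sketch, not what this PR changes: the names LOCAL_DIR, HUB_REPO_ID, resolve_tokenizer_source and load_tokenizer are hypothetical, and the keyword arguments are copied from the from_pretrained call shown in the traceback.

```python
import os

# Hypothetical constants: the two values discussed in this PR.
LOCAL_DIR = "./minimind-v1-small"
HUB_REPO_ID = "jingyaogong/minimind-v1-small"


def resolve_tokenizer_source(local_dir: str = LOCAL_DIR,
                             hub_repo_id: str = HUB_REPO_ID) -> str:
    """Return local_dir if it is an existing directory, else the Hub repo id."""
    return local_dir if os.path.isdir(local_dir) else hub_repo_id


def load_tokenizer():
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoTokenizer

    # Same keyword arguments as the call in 4-lora_sft.py (see the traceback).
    return AutoTokenizer.from_pretrained(
        resolve_tokenizer_source(),
        trust_remote_code=True,
        use_fast=False,
    )
```

With this, a fresh clone would download from the Hub, while a checkout that already contains the ./minimind-v1-small folder would keep loading it locally.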

Error detail

Traceback (most recent call last):
  File "/Users/lazy/.pyenv/versions/3.9.20/lib/python3.9/site-packages/transformers/utils/hub.py", line 402, in cached_file
    resolved_file = hf_hub_download(
  File "/Users/lazy/.pyenv/versions/3.9.20/lib/python3.9/site-packages/huggingface_hub/utils/_deprecation.py", line 101, in inner_f
    return f(*args, **kwargs)
  File "/Users/lazy/.pyenv/versions/3.9.20/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 106, in _inner_fn
    validate_repo_id(arg_value)
  File "/Users/lazy/.pyenv/versions/3.9.20/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 160, in validate_repo_id
    raise HFValidationError(
huggingface_hub.errors.HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: './minimind-v1-small'.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/lazy/MyLLMsource/minimind/4-lora_sft.py", line 163, in <module>
    model, tokenizer = init_model()
  File "/Users/lazy/MyLLMsource/minimind/4-lora_sft.py", line 107, in init_model
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_name_or_path, trust_remote_code=True, use_fast=False)
  File "/Users/lazy/.pyenv/versions/3.9.20/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 834, in from_pretrained
    tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
  File "/Users/lazy/.pyenv/versions/3.9.20/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 666, in get_tokenizer_config
    resolved_config_file = cached_file(
  File "/Users/lazy/.pyenv/versions/3.9.20/lib/python3.9/site-packages/transformers/utils/hub.py", line 466, in cached_file
    raise EnvironmentError(
OSError: Incorrect path_or_model_id: './minimind-v1-small'. Please provide either the path to a local folder or the repo_id of a model on the Hub.
