
fix custom cache dir #2226

Merged · merged 4 commits into main on Jul 15, 2024
Conversation

@ErikKaum (Member) commented Jul 12, 2024

What does this PR do?

How to reproduce the bug

  • the env vars HUGGINGFACE_HUB_CACHE=/some/other/dir and HF_HUB_OFFLINE=1 have to be set
  • launch with a tokenizer, e.g. text-generation-router --tokenizer-name meta-llama/Meta-Llama-3-8B-Instruct
  • result: tokenizer_file_name is None, and so is the config
2024-07-12T12:40:29.724257Z  WARN text_generation_router: router/src/main.rs:328: Could not find tokenizer config locally and no API specified
2024-07-12T12:40:29.724292Z  INFO text_generation_router: router/src/main.rs:353: Using config None
2024-07-12T12:40:29.724307Z  WARN text_generation_router: router/src/main.rs:355: Could not find a fast tokenizer implementation for meta-llama/Meta-Llama-3-8B-Instruct

Fix

  • check whether HUGGINGFACE_HUB_CACHE is set and, if so, use it
  • otherwise fall back to Cache::default() (see the sketch below)

This most likely only comes up when running TGI in Docker, since there ENV HUGGINGFACE_HUB_CACHE=/data is set to something other than the default.
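A minimal sketch of that lookup, assuming the hf_hub crate exposes Cache::new(PathBuf) alongside Cache::default(); the helper name resolve_cache is illustrative only and not taken from this PR:

    use std::env;
    use std::path::PathBuf;

    use hf_hub::Cache;

    // Prefer HUGGINGFACE_HUB_CACHE when it is set and non-empty,
    // otherwise fall back to the default hub cache location.
    fn resolve_cache() -> Cache {
        match env::var("HUGGINGFACE_HUB_CACHE") {
            Ok(dir) if !dir.is_empty() => Cache::new(PathBuf::from(dir)),
            _ => Cache::default(),
        }
    }

With this, an offline launch against a custom cache dir resolves the tokenizer and config files from that directory instead of silently falling back to the default path.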

@OlivierDehaene (Member) left a comment:

Thanks

@ErikKaum merged commit 457fb0a into main on Jul 15, 2024
9 checks passed
@ErikKaum deleted the fix/hf-cache-dir branch on July 15, 2024 at 13:17
ErikKaum added a commit that referenced this pull request Jul 25, 2024
* fix to not ignore HUGGINGFACE_HUB_CACHE in cache

* delete printlns

* delete newlines

* maybe fix trailing whitespace
ErikKaum added a commit that referenced this pull request Jul 26, 2024
* fix to not ignore HUGGINGFACE_HUB_CACHE in cache

* delete printlns

* delete newlines

* maybe fix trailing whitespace
yuanwu2017 pushed a commit to yuanwu2017/tgi-gaudi that referenced this pull request Sep 26, 2024
* fix to not ignore HUGGINGFACE_HUB_CACHE in cache

* delete printlns

* delete newlines

* maybe fix trailing whitespace
Successfully merging this pull request may close these issues.

loading fast tokenizer implementation for cached model with HF_HUB_OFFLINE=1 fails