Runtime Error in Python 3.8 #1691

Closed
msharp9 opened this issue Nov 26, 2024 · 9 comments

Comments

@msharp9

msharp9 commented Nov 26, 2024

My GitHub Actions pipeline failed when importing transformers after upgrading tokenizers from 0.20.3 to 0.20.4. The failure happens early, on this line in my pytest code: from transformers import AutoModel, AutoTokenizer

It passed on Python 3.9, 3.10, and 3.11, but failed on Python 3.8 with this error:

RuntimeError: Failed to import transformers.models.auto because of the following error (look up to see its traceback): 
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/tokenizers/tokenizers.abi3.so: undefined symbol: PyInterpreterState_Get

Looking at the release notes, there was a change to abi3: https://github.com/huggingface/tokenizers/releases/tag/v0.20.4. I assume it's related.
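
For anyone who wants to confirm the mismatch locally: PyInterpreterState_Get is documented as new in Python 3.9, which would explain why an extension built against that API cannot resolve the symbol on 3.8. Below is a minimal sketch, assuming a standard CPython build that exports its C API symbols; the helper name has_c_api_symbol is made up for illustration and is not part of tokenizers or transformers.

```python
import ctypes
import sys


def has_c_api_symbol(name):
    """Return True if the running interpreter exports the C API symbol `name`.

    ctypes.pythonapi resolves symbols lazily; a missing symbol raises
    AttributeError, which hasattr() turns into False.
    """
    return hasattr(ctypes.pythonapi, name)


if __name__ == "__main__":
    # The symbol named in the traceback above.
    symbol = "PyInterpreterState_Get"
    print(sys.version_info[:2], symbol, has_c_api_symbol(symbol))
```

Running this under each interpreter in the CI matrix should show which versions can satisfy the symbol the shared object asks for.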

@danielyan86129

Same for us.

@kopyl

kopyl commented Nov 27, 2024

I can't even install it on Python 3.8 😭
Installing the latest version from git with pip install git+.... does not work either.

This prevents installing one of the most important ML libraries, transformers.

I was able to install transformers by installing the previous version of tokenizers with
pip install tokenizers==0.20.3
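
For projects that declare dependencies in Python, the same pin can be limited to the affected interpreter with an environment marker, so newer Python versions are not held back. This is only a sketch: the INSTALL_REQUIRES name and the surrounding setup.py are hypothetical, and the upper bound mirrors the workaround above rather than anything the maintainers recommend.

```python
# Hypothetical excerpt from a project's setup.py (names are illustrative).
# The environment marker after ";" applies the pin only on Python < 3.9,
# where tokenizers 0.20.4 was reported to fail in this thread.
INSTALL_REQUIRES = [
    "transformers",
    'tokenizers<=0.20.3; python_version < "3.9"',
]

if __name__ == "__main__":
    for requirement in INSTALL_REQUIRES:
        print(requirement)
```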


@praykabam

Same for me

@xxxjjhhh

Same for me. Build output from my Docker environment:

43.94 Downloading tokenizers-0.20.4.tar.gz (343 kB)
43.99 Installing build dependencies: started
46.34 Installing build dependencies: finished with status 'done'
46.35 Getting requirements to build wheel: started
46.40 Getting requirements to build wheel: finished with status 'done'
46.41 Preparing wheel metadata: started
46.48 Preparing wheel metadata: finished with status 'error'
46.48 ERROR: Command errored out with exit status 1:
46.48 command: /usr/bin/python3 /tmp/tmp5xdpfc7d prepare_metadata_for_build_wheel /tmp/tmpxat8knxf
46.48 cwd: /tmp/pip-install-ok9by77y/tokenizers
46.48 Complete output (6 lines):
46.48
46.48 Cargo, the Rust package manager, is not installed or is not on PATH.
46.48 This package requires Rust and Cargo to compile extensions. Install it through
46.48 the system's package manager or via https://rustup.rs/
46.48
46.48 Checking for Rust toolchain....
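
The log shows pip downloading the source tarball and then stopping because Cargo is not on PATH, presumably because no prebuilt wheel matched this interpreter. A quick pre-flight check along those lines might look like the sketch below; the function name missing_rust_tools is hypothetical, and checking for rustc alongside cargo is an assumption beyond what the log itself mentions.

```python
import shutil


def missing_rust_tools(which=shutil.which):
    """Return the Rust tools that cannot be found on PATH.

    `which` is injectable so the check can be exercised without
    touching the real environment.
    """
    return [tool for tool in ("cargo", "rustc") if which(tool) is None]


if __name__ == "__main__":
    missing = missing_rust_tools()
    if missing:
        print("Source build would fail, missing:", ", ".join(missing))
    else:
        print("Rust toolchain found on PATH")
```

Having the toolchain present only removes this particular error; it does not by itself mean a source build on Python 3.8 will succeed.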

@xxxjjhhh

The problem only occurs on Python 3.8.

Test results:
3.9: ok
3.10: ok

@ArthurZucker
Collaborator

Yep, I'll try to push the 3.8 version. This is kind of a breaking change that comes from the recent ABI update that we needed. Sorry all!

@ArthurZucker
Collaborator

abi3 wheels were introduced in #1674 to reduce the crazy number of wheels we had.
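
As background on why one abi3 wheel can replace many: a wheel tagged cpXY-abi3 is meant to load on CPython X.Y and every later 3.x release, but not on earlier ones. The sketch below illustrates that rule only. The function name is hypothetical, and the "cp39" tag is an assumption inferred from the missing symbol in this issue, not taken from the actual 0.20.4 wheel filenames.

```python
import re
import sys


def abi3_tag_supports(python_tag, version_info=sys.version_info):
    """Return True if an abi3 wheel tagged `python_tag` can load on `version_info`.

    An abi3 wheel tagged "cp39" targets the stable ABI as of CPython 3.9,
    so it is compatible with 3.9 and newer, but not with 3.8.
    """
    match = re.fullmatch(r"cp(\d)(\d+)", python_tag)
    if match is None:
        raise ValueError(f"unexpected python tag: {python_tag!r}")
    minimum = (int(match.group(1)), int(match.group(2)))
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    # "cp39" is an assumed tag used for illustration only.
    print(abi3_tag_supports("cp39"))
```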

@ArthurZucker
Collaborator

We are going to yank the release and have 0.21.0 instead.

@ArthurZucker
Collaborator

Closing as the release is yanked!

abuccts pushed a commit to microsoft/superbenchmark that referenced this issue Nov 28, 2024
Added llama benchmark - training and inference in accordance with the
existing pytorch models implementation like gpt2, lstm etc.

- added llama fp8 unit test for better code coverage, to reduce memory
required
- updated transformers version >= 4.28.0 for LLamaConfig
- set tokenizers version <= 0.20.3 to avoid 0.20.4 version
[issues](huggingface/tokenizers#1691) with
py3.8
- added llama2 to tensorrt
- llama2 tests not added to test_tensorrt_inference_performance.py due
to large memory requirement for worker gpu. tests validated separately
on gh200

---------

Co-authored-by: dpatlolla <[email protected]>