Fail to load tokenizer for checkpoints #741

Open
tresiwald opened this issue Oct 24, 2024 · 0 comments
Labels
type/bug An issue about a bug

Comments

@tresiwald

🐛 Describe the bug

I converted OLMo checkpoints to the Hugging Face (HF) format. When I tried to load the converted tokenizer with AutoTokenizer, I got the following error:

Exception: data did not match any variant of untagged enum ModelWrapper at line 250629 column 3

Could I use the tokenizer from the allenai/OLMo-7B-hf model instead?
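
To help narrow this down, here is a minimal sketch for inspecting the converted tokenizer.json without going through the tokenizers library at all. It rests on an assumption that this issue does not confirm: that the ModelWrapper error comes from a serialization-format mismatch, for example a tokenizer.json whose BPE merges are stored as pairs while the installed tokenizers version (0.19.1 here) expects space-joined strings. The function name `describe_tokenizer_json` and the command-line usage are hypothetical and not part of OLMo or transformers.

```python
import json
import sys
from pathlib import Path


def describe_tokenizer_json(path):
    """Summarise the fields of a tokenizer.json that may differ between tokenizers versions.

    Assumption: the file follows the usual Hugging Face layout, with a top-level
    "model" object that has a "type" and, for BPE, a "merges" list.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    model = data.get("model") or {}
    merges = model.get("merges") or []

    if not merges:
        merges_format = "none"
    elif isinstance(merges[0], str):
        merges_format = "string"  # each merge looks like "a b"
    else:
        merges_format = "pair"    # each merge looks like ["a", "b"]

    return {
        "model_type": model.get("type"),
        "merges_format": merges_format,
        "num_merges": len(merges),
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python inspect_tokenizer.py <converted-checkpoint>/tokenizer.json
    print(describe_tokenizer_json(sys.argv[1]))
```

If the reported merges format is "pair", or the model type is missing, that would be consistent with the version-mismatch assumption, and comparing against the tokenizer.json shipped with allenai/OLMo-7B-hf might show where the two files differ. It would not prove the cause on its own.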

Versions

Python 3.10.12
accelerate==0.28.0
aiohappyeyeballs==2.4.3
aiohttp==3.10.10
aiosignal==1.3.1
ansi2html==1.9.2
anyio==4.4.0
appdirs==1.4.4
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asttokens==2.4.1
async-lru==2.0.4
async-timeout==4.0.3
attrs==24.2.0
babel==2.16.0
beautifulsoup4==4.12.3
bert-score==0.3.13
bleach==6.1.0
blinker==1.8.2
cachetools==5.5.0
certifi==2024.8.30
cffi==1.17.1
charset-normalizer==3.3.2
cleantext==1.1.4
click==8.1.6
comm==0.2.2
contourpy==1.3.0
cryptography==3.4.8
cycler==0.12.1
dash==2.18.0
dash-core-components==2.0.0
dash-html-components==2.0.0
dash-table==5.0.0
datasets==2.21.0
dbus-python==1.2.18
debugpy==1.8.5
decorator==5.1.1
defusedxml==0.7.1
dill==0.3.8
distro==1.7.0
distro-info==1.1+ubuntu0.2
docker-pycreds==0.4.0
exceptiongroup==1.2.2
executing==2.1.0
fastjsonschema==2.20.0
filelock==3.16.1
Flask==3.0.3
fonttools==4.54.1
fqdn==1.5.1
frozenlist==1.4.1
fsspec==2024.6.1
gdown==5.2.0
gitdb==4.0.11
GitPython==3.1.43
google-api-core==2.21.0
google-api-python-client==2.142.0
google-auth==2.35.0
google-auth-httplib2==0.2.0
googleapis-common-protos==1.65.0
h11==0.14.0
httpcore==1.0.5
httplib2==0.20.2
httpx==0.27.2
huggingface-hub==0.25.2
idna==3.8
importlib-metadata==4.6.4
ipykernel==6.29.5
ipython==8.27.0
isoduration==20.11.0
itsdangerous==2.2.0
jedi==0.19.1
jeepney==0.7.1
Jinja2==3.1.4
joblib==1.4.2
json5==0.9.25
jsonpointer==3.0.0
jsonschema==4.23.0
jsonschema-specifications==2023.12.1
jupyter-dash==0.4.2
jupyter-events==0.10.0
jupyter-lsp==2.2.5
jupyter_client==8.6.2
jupyter_core==5.7.2
jupyter_server==2.14.2
jupyter_server_terminals==0.5.3
jupyterlab==4.2.5
jupyterlab_pygments==0.3.0
jupyterlab_server==2.27.3
keyring==23.5.0
kiwisolver==1.4.7
launchpadlib==1.10.16
lazr.restfulclient==0.14.4
lazr.uri==1.0.6
lightning-utilities==0.11.8
MarkupSafe==2.1.5
matplotlib==3.9.2
matplotlib-inline==0.1.7
mistune==3.0.2
more-itertools==8.10.0
mpmath==1.3.0
msgpack==1.1.0
multidict==6.1.0
multiprocess==0.70.16
nbclient==0.10.0
nbconvert==7.16.4
nbformat==5.10.4
nest-asyncio==1.6.0
networkx==3.4.2
nltk==3.8.1
notebook_shim==0.2.4
numpy==1.23.5
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.18.1
nvidia-nvjitlink-cu12==12.6.77
nvidia-nvtx-cu12==12.1.105
oauthlib==3.2.0
overrides==7.7.0
packaging==24.1
pandas==2.0.3
pandocfilters==1.5.1
parso==0.8.4
pathtools==0.1.2
pexpect==4.9.0
pillow==11.0.0
platformdirs==4.3.2
plotly==5.24.0
prometheus_client==0.20.0
prompt_toolkit==3.0.47
propcache==0.2.0
proto-plus==1.24.0
protobuf==4.25.5
psutil==6.0.0
ptyprocess==0.7.0
pure_eval==0.2.3
pyarrow==17.0.0
pyasn1==0.6.1
pyasn1_modules==0.4.1
pycparser==2.22
Pygments==2.18.0
PyGObject==3.42.1
PyJWT==2.3.0
pyparsing==2.4.7
pyphen==0.16.0
PySocks==1.7.1
python-apt==2.4.0+ubuntu3
python-dateutil==2.9.0.post0
python-json-logger==2.0.7
pytorch-lightning==2.4.0
pytz==2024.2
PyYAML==6.0.1
pyzmq==26.2.0
ray==2.35.0
referencing==0.35.1
regex==2024.9.11
requests==2.32.3
retrying==1.3.4
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rouge==1.0.1
rpds-py==0.20.0
rsa==4.9
safetensors==0.4.5
scikit-learn==1.3.0
scipy==1.11.1
SecretStorage==3.3.1
Send2Trash==1.8.3
sentence-transformers==2.2.2
sentencepiece==0.2.0
sentry-sdk==2.17.0
setproctitle==1.3.3
six==1.16.0
smmap==5.0.1
sniffio==1.3.1
soupsieve==2.6
ssh-import-id==5.11
stack-data==0.6.3
sympy==1.13.3
tenacity==9.0.0
terminado==0.18.1
textstat==0.7.4
threadpoolctl==3.5.0
tinycss2==1.3.0
tokenizers==0.19.1
tomli==2.0.1
torch==2.1.0
torchmetrics==1.4.1
torchvision==0.16.0
tornado==6.4.1
tqdm==4.66.3
traitlets==5.14.3
transformers==4.44.2
triton==2.1.0
types-python-dateutil==2.9.0.20240906
typing_extensions==4.12.2
tzdata==2024.2
unattended-upgrades==0.1
uri-template==1.3.0
uritemplate==4.1.1
urllib3==2.2.2
vaderSentiment==3.3.2
wadllib==1.3.6
wandb==0.15.8
wcwidth==0.2.13
webcolors==24.8.0
webencodings==0.5.1
websocket-client==1.8.0
Werkzeug==3.0.4
xxhash==3.5.0
yarl==1.16.0
zipp==1.0.0

@tresiwald tresiwald added the type/bug An issue about a bug label Oct 24, 2024