Loading GGUF files support #30391

Merged May 15, 2024 (37 commits, gguf-support branch merged into main).
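For orientation before the commit list: after the final rename in this PR (from_gguf to gguf_file, commits 1b5ae54 and d6b67c6), loading a GGUF checkpoint looks roughly like the sketch below. The repo and file names are placeholders, not taken from the PR; the GGUF weights are dequantized into regular torch tensors at load time.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF"  # placeholder repo id
gguf_file = "tinyllama-1.1b-chat-v1.0.Q8_0.gguf"     # placeholder file name

# Both the tokenizer and the model can be built straight from the GGUF file.
tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=gguf_file)

inputs = tokenizer("Hello, GGUF!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))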
Commits (37; the file diffs further below show changes from a single commit):

- fb00288 Adds support for loading GGUF files (LysandreJik, Apr 19, 2024)
- 81e4324 add q2_k q3_k q5_k support from @99991 (younesbelkada, Apr 22, 2024; see the dequantization sketch after this list)
- 8a0d5b8 fix tests (younesbelkada, Apr 22, 2024)
- 08534f3 Update doc (LysandreJik, Apr 22, 2024)
- ebd9944 Style (LysandreJik, Apr 22, 2024)
- 5c913ec Docs (LysandreJik, Apr 22, 2024)
- 8b81bfb Merge remote-tracking branch 'upstream/main' into HEAD (younesbelkada, Apr 22, 2024)
- c49f1a8 fix CI (younesbelkada, Apr 22, 2024)
- 7fa538b Update docs/source/en/gguf.md (younesbelkada, Apr 22, 2024)
- 5485327 Update docs/source/en/gguf.md (younesbelkada, Apr 22, 2024)
- 074f05e Merge branch 'main' into gguf-support (younesbelkada, Apr 23, 2024)
- ca8363e Compute merges (LysandreJik, Apr 23, 2024)
- 2a0c9b0 Merge branch 'main' into gguf-support (younesbelkada, Apr 25, 2024)
- fac7bb3 Merge branch 'main' into gguf-support (younesbelkada, Apr 25, 2024)
- 45983db Merge remote-tracking branch 'upstream/main' into HEAD (younesbelkada, Apr 30, 2024)
- e6c6f6c change logic (younesbelkada, Apr 30, 2024)
- a6cd08c add comment for clarity (younesbelkada, Apr 30, 2024)
- 6611877 add comment for clarity (younesbelkada, Apr 30, 2024)
- 455163b Update src/transformers/models/auto/tokenization_auto.py (younesbelkada, Apr 30, 2024)
- 42d5815 change logic (younesbelkada, Apr 30, 2024)
- 1d3acec Update src/transformers/modeling_utils.py (younesbelkada, Apr 30, 2024)
- af3c42c change (younesbelkada, Apr 30, 2024)
- a27db0c Merge branch 'gguf-support' of https://github.com/lysandrejik/transfo… (younesbelkada, Apr 30, 2024)
- 14ad10c Apply suggestions from code review (younesbelkada, Apr 30, 2024)
- ab621a7 Update src/transformers/modeling_gguf_pytorch_utils.py (younesbelkada, Apr 30, 2024)
- 207820a put back comment (younesbelkada, Apr 30, 2024)
- 1fef8ad add comment about mistral (younesbelkada, Apr 30, 2024)
- 9ae7363 comments and added tests (younesbelkada, Apr 30, 2024)
- 3ed384f fix merge (younesbelkada, May 14, 2024)
- 55eb860 fix unconsistent type (younesbelkada, May 14, 2024)
- f754335 more (younesbelkada, May 14, 2024)
- a449078 Merge remote-tracking branch 'origin/main' into HEAD (younesbelkada, May 14, 2024)
- 3bdbb2e fix tokenizer (younesbelkada, May 15, 2024)
- 0ab79f6 Update src/transformers/modeling_utils.py (younesbelkada, May 15, 2024)
- 65433c4 address comments about tests and tokenizer + add added_tokens (younesbelkada, May 15, 2024)
- 1b5ae54 from_gguf -> gguf_file (younesbelkada, May 15, 2024)
- d6b67c6 replace on docs too (younesbelkada, May 15, 2024)
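Commit 81e4324 above extends dequantization to the k-quant formats (Q2_K, Q3_K, Q5_K), whose block layouts are fairly intricate. As a flavor of what GGUF dequantization involves, here is a minimal NumPy sketch for the much simpler Q8_0 format, where each 34-byte block is one fp16 scale followed by 32 int8 quants; this is an illustrative reimplementation of the GGML layout, not code from the PR.

import numpy as np

def dequantize_q8_0(raw: bytes) -> np.ndarray:
    # GGML Q8_0 block: fp16 scale `d`, then 32 int8 quants; weight = d * q.
    blocks = np.frombuffer(raw, dtype=np.uint8).reshape(-1, 34)
    scales = blocks[:, :2].copy().view(np.float16).astype(np.float32)  # (n_blocks, 1)
    quants = blocks[:, 2:].copy().view(np.int8).astype(np.float32)     # (n_blocks, 32)
    return (scales * quants).reshape(-1)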
9 changes: 9 additions & 0 deletions src/transformers/integrations/ggml.py
@@ -513,6 +513,9 @@ def __init__(self, dict_):
         else:
             self.merges = [tuple(merge.split(" ")) for merge in self.merges]
 
+        if not hasattr(self, "added_tokens"):
+            self.added_tokens = []
+
 
 class GGUFLlamaConverter(LlamaConverter):
     def __init__(self, tokenizer_dict):
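An aside on the hasattr guard added above: GGUFTokenizerSkeleton is populated from the tokenizer metadata stored in the GGUF file, and the added-tokens field (tokenizer.ggml.added_tokens in the GGUF spec) is optional, so the attribute may simply never be set. A hypothetical example of the dict the skeleton receives; the field names follow the spec, the values are made up:

tokenizer_dict = {
    "tokens": ["<s>", "</s>", "▁Hello"],  # tokenizer.ggml.tokens
    "scores": [0.0, 0.0, -1.5],           # tokenizer.ggml.scores
    "token_type": [3, 3, 1],              # tokenizer.ggml.token_type (3 = control)    # "added_tokens" may be missing entirely, hence the default to [] above
}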
@@ -539,6 +542,12 @@ def tokenizer(self, proto):
                 AddedToken("</s>", normalized=False, special=True),
             ]
         )
+
+        if len(self.proto.added_tokens) != 0:
+            tokenizer.add_special_tokens(
+                [AddedToken(added_token, normalized=False, special=False) for added_token in self.proto.added_tokens]
+            )
Review thread on lines +547 to +549:

Collaborator: Not all of them are special here. You can add them all as special.

Collaborator: @younesbelkada this just means that added tokens that are not special will be skipped when decoding.
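For context on this thread: whether a token survives decode(..., skip_special_tokens=True) is governed by its special flag. A quick self-contained illustration with an arbitrary fast tokenizer (gpt2 here is just a convenient stand-in, not the PR's test model):

from transformers import AddedToken, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # any fast tokenizer works

tok.add_tokens([AddedToken("<keep>", special=False)])  # plain added token
tok.add_special_tokens({"additional_special_tokens": [AddedToken("<drop>", special=True)]})

ids = tok.encode("a <keep> b <drop> c")
# skip_special_tokens drops tokens flagged special=True; the non-special
# added token stays in the decoded text.
print(tok.decode(ids, skip_special_tokens=True))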


return tokenizer

def decoder(self, replacement, add_prefix_space):
Review thread on def decoder:

Collaborator: add prefix space is defined in the gguf? Might not be good to always take it from the class (which is what's happening now).

Contributor: It is not defined, from what I read in the GGML docs and when inspecting various checkpoints from the Hub.

Collaborator: So it's always adding a prefix space, I suppose?
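For context on this exchange: the GGUF converter inherits its decoder from LlamaConverter, which in transformers' convert_slow_tokenizer.py looks roughly like the sketch below (paraphrased for illustration, not quoted from the PR). When add_prefix_space is set, the space that the "▁" replacement produces at the very start of the text is stripped back out on decode, which is why always inheriting the flag from the class matters.

from tokenizers import decoders

def decoder(self, replacement, add_prefix_space):
    sequence = [
        decoders.Replace(replacement, " "),  # replacement is typically "▁"
        decoders.ByteFallback(),
        decoders.Fuse(),
    ]
    if add_prefix_space:
        # strip the artificial leading space introduced by the prefix
        sequence += [decoders.Strip(content=" ", left=1)]
    return decoders.Sequence(sequence)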

29 changes: 24 additions & 5 deletions tests/quantization/ggml/test_ggml.py
@@ -15,7 +15,7 @@
 import tempfile
 import unittest
 
-from transformers import AutoModelForCausalLM, AutoTokenizer
+from transformers import AddedToken, AutoModelForCausalLM, AutoTokenizer
 from transformers.testing_utils import require_gguf, require_torch_gpu, slow, torch_device
 from transformers.utils import is_torch_available
 
@@ -179,7 +179,7 @@ def test_tokenization_xnli(self):
 
         dataset = load_dataset("xnli", "all_languages")
 
-        for i, item in enumerate(tqdm.tqdm(dataset["train"])):
+        for i, item in enumerate(tqdm.tqdm(dataset["train"].select(range(100)))):
             for string in item["premise"].values():
                 encoded1 = gguf_tokenizer.encode(string)
                 encoded2 = original_tokenizer.encode(string)
@@ -191,6 +191,25 @@ def test_tokenization_xnli(self):
 
                 self.assertEqual(decoded1, decoded2)
 
-            # Otherwise the test takes too long
-            if i > 100:
-                break
+        # With special tokens
+        gguf_tokenizer = AutoTokenizer.from_pretrained(self.model_id, from_gguf=self.q8_0_gguf_model_id)
+        original_tokenizer = AutoTokenizer.from_pretrained(self.original_model_id)
+
+        gguf_tokenizer.add_special_tokens(
+            {"additional_special_tokens": [AddedToken("<token>", rstrip=False, lstrip=False)]}
+        )
+        original_tokenizer.add_special_tokens(
+            {"additional_special_tokens": [AddedToken("<token>", rstrip=False, lstrip=False)]}
+        )
+
+        text = "Hello <token>. <token> Hello"
+
+        encoded1 = gguf_tokenizer.encode(text)
+        encoded2 = original_tokenizer.encode(text)
+
+        self.assertEqual(encoded1, encoded2)
+
+        decoded1 = gguf_tokenizer.decode(encoded1, skip_special_tokens=True)
+        decoded2 = original_tokenizer.decode(encoded2, skip_special_tokens=True)
+
+        self.assertEqual(decoded1, decoded2)