Releases: cohere-ai/tokenizers

v0.9.1

19 Sep 22:51

Cohere customizations:

  • use github.com/cohere-ai/tokenizers as the module name
  • update cgo to expect libtokenizers.a in the module directory

v0.8.0

15 Aug 19:39
d503b5b

Following: https://github.com/daulet/tokenizers/releases/tag/v0.8.0

Breaking change:

The path to the compiled Rust library must now be specified via -ldflags. I found it most convenient to set the CGO_LDFLAGS environment variable instead, so the flag does not have to be passed on every build. See daulet#18 for more details.
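
As a minimal sketch of the CGO_LDFLAGS approach: the directory below is a hypothetical location for libtokenizers.a, and the TOKENIZERS_LIB_DIR variable name is made up for illustration, so adjust both to wherever the library actually lives.

```shell
# Hypothetical directory holding the compiled Rust static library (libtokenizers.a).
# TOKENIZERS_LIB_DIR is an illustrative name, not something the module reads.
TOKENIZERS_LIB_DIR="${TOKENIZERS_LIB_DIR:-$PWD/lib}"

# cgo passes CGO_LDFLAGS to the linker; -L adds a library search directory.
export CGO_LDFLAGS="-L${TOKENIZERS_LIB_DIR}"

# Show the value that later go commands in this shell will inherit.
echo "CGO_LDFLAGS=${CGO_LDFLAGS}"
```

With the variable exported, go build and go test commands run in the same shell should pick up the library path without extra per-command flags.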

What's Changed

Update to allow for platform-dependent libraries in cgo

v0.7.1

15 Aug 19:26

Following: https://github.com/daulet/tokenizers/releases/tag/v0.7.1

What's Changed

  • Update core tokenizers library to latest: v0.15.2
  • Expose an init-time parameter controlling whether special tokens are encoded

v0.9.0

15 Aug 19:45

Following: https://github.com/daulet/tokenizers/releases/tag/v0.9.0

What's Changed

  • feat: add option to retrieve offsets from tokenizer
  • Update to huggingface/tokenizers v0.20.0