v2.5.1

@rspeer rspeer released this 02 Sep 21:55

Version 2.5.1 (2021-09-02)

  • Import ftfy and use its uncurl_quotes function to turn curly quotes into
    straight ones, so that the different forms of apostrophes are handled
    consistently.

  • Set minimum version requirements on regex, jieba, and langcodes
    so that tokenization will give consistent results.

  • Work around an inconsistency in the msgpack API regarding
    strict_map_key=False.
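
As a rough illustration of the first item, the sketch below shows why straightening quotes matters: "don’t" typed with a curly apostrophe and "don't" typed with a straight one should normalize to the same string. This is a standard-library approximation only, not ftfy's code and not the code used in this release; the helper name `straighten_quotes` and the exact set of characters it maps are assumptions made for the example.

```python
# Illustrative sketch only: approximates quote "uncurling" with the
# standard library. The character sets below are assumptions chosen for
# this example, not copied from ftfy.
SINGLE_QUOTES = "\u2018\u2019\u201a\u201b\u02bc"  # curly/low single quotes, modifier apostrophe
DOUBLE_QUOTES = "\u201c\u201d\u201e\u201f"        # curly/low double quotes

# Build one translation table mapping each variant to its ASCII form.
_TABLE = {ord(ch): "'" for ch in SINGLE_QUOTES}
_TABLE.update({ord(ch): '"' for ch in DOUBLE_QUOTES})


def straighten_quotes(text: str) -> str:
    """Replace curly single and double quotes with straight ASCII quotes."""
    return text.translate(_TABLE)


curly = "don\u2019t"   # typed with a curly apostrophe
straight = "don't"     # typed with a straight apostrophe

# Both spellings normalize to the same string.
assert straighten_quotes(curly) == straighten_quotes(straight)
```

In real code, ftfy's own `uncurl_quotes` (importable from `ftfy.fixes`) would be the function to call rather than a hand-rolled table like this one.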

Version 2.5 (2021-04-15)

  • Incorporate data from the OSCAR corpus.