Releases: theodore-s-beers/feruca

v0.10.1

14 Dec 21:33

In this patch release:

  • Miscellaneous dependency updates
  • Rebuilt bincode

Full changelog: v0.10.0...v0.10.1

v0.10.0

07 Feb 12:29

In this minor release:

  • Support for Unicode version 15.1 and, correspondingly, CLDR version 44
  • A significant performance optimization, thanks to a PR from @TS60. By avoiding certain repeated allocations, the collator now runs as much as 3x faster than before!
  • Dependency updates and various other small changes

Full changelog: v0.9.0...v0.10.0

v0.9.0

14 Mar 10:04

In this minor release:

  • There is now a tiebreak option as part of Collator, which determines the behavior of the collate method. The separate collate_no_tiebreak method has been removed. I should have done this long ago.

  • It used to be that collate would start by running input through UTF-8 validation and creating vectors of u32 code points. Only then would it check whether the two strings start with different runs of ASCII characters, which allows a result to be returned very quickly. I've made a change whereby the possibility of ASCII comparison is checked while the code point vectors are being generated; that is, collate can sometimes return even earlier. I was surprised to find that this change yielded a ~40% speedup in some benchmarks! It's one of the best optimizations in my entire time working on this library.
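
To make the two changes above concrete, here is a minimal sketch of the general idea, not feruca's actual code. `ToyCollator`, its `tiebreak` field, and the `fold` helper are hypothetical names invented for illustration. The "full" comparison is a stand-in (ASCII case folding rather than real collation weights), and the tiebreak is assumed to fall back to plain byte order.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for a collator; NOT feruca's real type or API.
struct ToyCollator {
    /// Assumption: when true, strings that tie on the toy key are ordered by raw bytes.
    tiebreak: bool,
}

impl ToyCollator {
    fn collate(&self, a: &str, b: &str) -> Ordering {
        let mut a_points: Vec<u32> = Vec::new();
        let mut b_points: Vec<u32> = Vec::new();
        let (mut a_chars, mut b_chars) = (a.chars(), b.chars());
        let mut fast_path = true;

        // Build both code point vectors in lockstep, checking for an
        // ASCII-only answer along the way instead of afterwards.
        loop {
            let (x, y) = (a_chars.next(), b_chars.next());
            if x.is_none() && y.is_none() {
                break;
            }
            if fast_path {
                match (x, y) {
                    (Some(p), Some(q))
                        if p.is_ascii_alphanumeric() && q.is_ascii_alphanumeric() =>
                    {
                        let (fp, fq) = (p.to_ascii_lowercase(), q.to_ascii_lowercase());
                        if fp != fq {
                            // Early return: the rest of the input is never visited.
                            return fp.cmp(&fq);
                        }
                    }
                    // Anything else ends the fast path for this call.
                    _ => fast_path = false,
                }
            }
            a_points.extend(x.map(|c| c as u32));
            b_points.extend(y.map(|c| c as u32));
        }

        // Toy "full" comparison on the completed vectors.
        let result = fold(&a_points).cmp(&fold(&b_points));
        if result == Ordering::Equal && self.tiebreak {
            // One method, with behavior selected by the option.
            return a.cmp(b);
        }
        result
    }
}

/// Fold ASCII uppercase letters to lowercase; a placeholder for real collation keys.
fn fold(points: &[u32]) -> Vec<u32> {
    points
        .iter()
        .map(|&c| match c {
            0x41..=0x5A => c + 0x20,
            _ => c,
        })
        .collect()
}
```

With `tiebreak: false`, this sketch treats "Resume" and "resume" as equal; with `tiebreak: true`, the byte-order fallback puts "Resume" first. Comparing "apple" with "Banana" returns on the very first character in either mode.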

v0.8.0

23 Feb 16:10

In this minor release:

  • I've tried to limit allocation by having the collator object hold onto two collation element arrays—one for each string being compared in a given call of the collate method—which can be repeatedly overwritten. Previously, a new vector was initialized every time that a collation element array was generated. This change might have produced a small performance benefit? I don't think it's a bad idea, anyway. I have been surprised by how "stuck" the library seems to be at its current level of performance. Did the compiler already find all these optimizations?
  • The API has changed a bit: the collator object once again needs to be declared as mutable. This is because of the aforementioned collation element arrays.
  • Other miscellaneous dependency updates, refactoring, and general housekeeping
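
The buffer-reuse idea described in this release can be sketched roughly as follows. This is an illustration under stated assumptions rather than the library's implementation: `BufferedCollator`, its field names, and the `fill` helper are hypothetical, and the "collation elements" here are just lowercased code points.

```rust
use std::cmp::Ordering;

/// Hypothetical collator that owns two reusable scratch buffers,
/// one per string being compared. Not feruca's real type.
struct BufferedCollator {
    a_cea: Vec<u32>,
    b_cea: Vec<u32>,
}

impl BufferedCollator {
    fn new() -> Self {
        Self {
            a_cea: Vec::with_capacity(32),
            b_cea: Vec::with_capacity(32),
        }
    }

    /// Takes `&mut self` because both buffers are overwritten on every call,
    /// which is why the collator must be declared as mutable by the caller.
    fn collate(&mut self, a: &str, b: &str) -> Ordering {
        fill(&mut self.a_cea, a);
        fill(&mut self.b_cea, b);
        self.a_cea.cmp(&self.b_cea)
    }
}

/// Overwrite `buf` with toy "collation elements" for `s`.
/// `clear` drops the old contents but keeps the allocated capacity,
/// so repeated calls usually avoid a fresh allocation.
fn fill(buf: &mut Vec<u32>, s: &str) {
    buf.clear();
    buf.extend(s.chars().map(|c| c.to_ascii_lowercase() as u32));
}
```

A caller would write `let mut collator = BufferedCollator::new();` and could then sort with `words.sort_by(|a, b| collator.collate(a, b));`, since `sort_by` accepts a closure that mutates captured state.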

v0.7.1

09 Feb 18:17

In this patch:

  • Even fewer direct dependencies
  • Significantly smaller package size
  • Refactoring of some core logic
  • No apparent impact on performance
  • API unchanged

v0.7.0

05 Feb 21:47

Over time, I grew unhappy with two aspects of the development of this library: the increasingly complicated API and logic, and the growing dependency tree. This version is intended to pare things back somewhat. I've found through benchmarking that the performance impact of these simplifications is negligible. I still hope to find ways of making the library significantly more performant, but going forward, I'll try to be careful in weighing benefits against costs.

v0.6.0

20 Sep 10:49

As of this version, feruca targets Unicode 15.

v0.5.2

08 Sep 11:40

This is another patch that reflects refactoring (some of it significant) and dependency updates—most importantly the move to bstr 1.0.0.

v0.5.1

15 Aug 21:40

This is a small patch that reflects some refactoring, as well as dependency updates.

v0.5.0

05 Aug 03:03

This release is, again, all about perf. I added an LRU cache to Collator, with the idea that calculation of some collation element arrays could be avoided. This may help a bit, but not much—and I'm not thrilled with the API changes that were necessary to get it working. I may remove it later. On a more productive note, I also found ways of making the "ASCII path" faster. Now feruca is more like 5–7x slower than ucol (as best I can tell).
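
For readers unfamiliar with the caching idea, here is a small, self-contained sketch of an LRU cache that skips recomputing a per-string array on a hit. It is not the cache that was added to Collator: `TinyLru`, `get_or_compute`, and `toy_elements` are hypothetical names, and the recency tracking is deliberately naive (linear time) to keep the example short.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical, deliberately simple LRU cache keyed by input string.
struct TinyLru {
    capacity: usize,
    map: HashMap<String, Vec<u32>>,
    /// Keys ordered by recency; the front is the least recently used.
    order: VecDeque<String>,
}

impl TinyLru {
    fn new(capacity: usize) -> Self {
        Self {
            capacity,
            map: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    /// Return the cached array for `key`, or compute, store, and return it.
    fn get_or_compute(
        &mut self,
        key: &str,
        compute: impl FnOnce(&str) -> Vec<u32>,
    ) -> Vec<u32> {
        if let Some(hit) = self.map.get(key) {
            let hit = hit.clone();
            // Mark the key as most recently used.
            self.order.retain(|k| k.as_str() != key);
            self.order.push_back(key.to_string());
            return hit;
        }
        let value = compute(key);
        if self.capacity == 0 {
            return value; // caching disabled
        }
        if self.map.len() >= self.capacity {
            // Evict the least recently used entry to make room.
            if let Some(oldest) = self.order.pop_front() {
                self.map.remove(&oldest);
            }
        }
        self.map.insert(key.to_string(), value.clone());
        self.order.push_back(key.to_string());
        value
    }
}

/// Placeholder for the expensive step: here, just the string's code points.
fn toy_elements(s: &str) -> Vec<u32> {
    s.chars().map(|c| c as u32).collect()
}
```

Even in this toy form, a hit still costs a hash lookup and a clone of the cached array, which may be one reason a cache like this helps less than expected when the computation it avoids is already cheap.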