Replace json format with more efficient binary file format #188

Open
pemistahl opened this issue May 29, 2023 · 2 comments
Labels
enhancement New feature or request

Comments

Owner

pemistahl commented May 29, 2023

Currently, the language models are stored as JSON files. JSON is somewhat slow to deserialize. Let's investigate whether there is a more efficient binary file format that can be deserialized faster.

Promising candidates could be MessagePack or Protobuf.
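
To make the idea concrete, here is a minimal sketch of a length-prefixed binary layout written with only the Rust standard library. It is not MessagePack, not Protobuf, and not Lingua's actual model format: the n-gram to frequency map and the names `encode_model`, `decode_model` and `read_u32` are assumptions made up for illustration. It only shows why such a format can be cheap to read back: the decoder copies fixed-width fields and never has to parse numbers from text or unescape strings.

```rust
use std::collections::HashMap;

/// Encode a hypothetical n-gram -> frequency table (an assumed model shape).
/// Layout, all little-endian:
/// [u32 entry count] then per entry [u32 key length in bytes][key bytes][f64 value].
pub fn encode_model(model: &HashMap<String, f64>) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(model.len() as u32).to_le_bytes());
    for (ngram, freq) in model {
        let key = ngram.as_bytes();
        out.extend_from_slice(&(key.len() as u32).to_le_bytes());
        out.extend_from_slice(key);
        out.extend_from_slice(&freq.to_le_bytes());
    }
    out
}

/// Decode the layout written by `encode_model`.
/// Returns `None` if the input is truncated or a key is not valid UTF-8.
pub fn decode_model(bytes: &[u8]) -> Option<HashMap<String, f64>> {
    let mut pos = 0usize;
    let count = read_u32(bytes, &mut pos)? as usize;
    // No `with_capacity(count)`: the count comes from untrusted input.
    let mut model = HashMap::new();
    for _ in 0..count {
        // Read the key as a length-prefixed UTF-8 string.
        let key_len = read_u32(bytes, &mut pos)? as usize;
        let key_end = pos.checked_add(key_len)?;
        let key = std::str::from_utf8(bytes.get(pos..key_end)?).ok()?;
        pos = key_end;

        // Read the value as 8 raw bytes; no text-to-float parsing is needed.
        let value_end = pos.checked_add(8)?;
        let mut raw = [0u8; 8];
        raw.copy_from_slice(bytes.get(pos..value_end)?);
        pos = value_end;

        model.insert(key.to_owned(), f64::from_le_bytes(raw));
    }
    Some(model)
}

/// Read one little-endian u32 at `*pos` and advance the cursor past it.
fn read_u32(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let end = (*pos).checked_add(4)?;
    let mut raw = [0u8; 4];
    raw.copy_from_slice(bytes.get(*pos..end)?);
    *pos = end;
    Some(u32::from_le_bytes(raw))
}
```

Calling `decode_model(&encode_model(&model))` should give back an equal map. A real migration would more likely use an established format through a serialization crate, and whether it is actually faster than JSON for these models would need to be measured.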

pemistahl added the enhancement label May 29, 2023
pemistahl modified the milestones: Lingua 1.5.0, Lingua 1.6.0 May 29, 2023

getreu commented Jun 10, 2023

What about postcard (a Rust crate)?


xxaier commented Jul 31, 2023

Why not use https://github.com/apache/arrow-rs?

Also, here is a Rust serialization benchmark:

https://github.com/djkoloski/rust_serialization_benchmark

I recommend https://crates.io/crates/speedy

pemistahl modified the milestones: Lingua 1.6.0, Lingua 1.7.0 Oct 30, 2023
pemistahl removed this from the Lingua 1.7.0 milestone Aug 14, 2024

3 participants