Examples: Add text compression example. #9633
Conversation
Force-pushed from ed2c292 to b9a32f4.
Is this the same method as https://arxiv.org/pdf/2306.04050 ?
Interesting, thanks for sharing. At first glance, this does look similar to what I'm doing. At least the part about the ranks is the same.
Just for reference, I found an interesting implementation of arithmetic coding using llama_cpp_python:
This PR adds an example text compression scheme that uses a language model. The scheme is not optimal, but it's not far from it.
Performance:
Tested on the source file against classical compression schemes.
Usage:
Compression
./compress --mode compress -m path/to/your/model.gguf -f path/to/the/text/file.txt -o output.bin
Decompression
./compress --mode expand -m path/to/your/model.gguf -f output.bin -o output.txt
Drawbacks
It's very slow compared to traditional compression schemes.
It needs the exact same setup for compression and decompression (just changing the number of offloaded GPU layers can change the behavior enough to introduce errors).
How it works
TODO (I'm bad at explaining things, but please read the code)
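Since the explanation is still a TODO, here is a minimal sketch of the rank-transform idea, under stated assumptions: it assumes the scheme is LLMZip-style rank coding (the comment above about the ranks suggests as much), and it substitutes a toy adaptive byte-frequency model for the actual GGUF model. `ToyModel`, `compress`, and `expand` are illustrative names, not the PR's code.

```cpp
// Sketch of rank-transform compression with a toy predictor (NOT the PR's
// llama.cpp implementation). The predictor orders symbols from most to
// least probable; the encoder emits the rank of the actual next symbol.
// A good model produces mostly small ranks, which a generic entropy coder
// then squeezes well. The decoder replays the identical model to map
// ranks back to symbols.
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <numeric>
#include <string>
#include <vector>

struct ToyModel {
    std::array<uint32_t, 256> freq{}; // adaptive byte frequencies

    // Symbols ordered by estimated probability (ties broken by value),
    // standing in for the token ranking a language model would provide.
    std::array<uint8_t, 256> ranking() const {
        std::array<uint8_t, 256> order{};
        std::iota(order.begin(), order.end(), 0);
        std::stable_sort(order.begin(), order.end(),
            [&](uint8_t a, uint8_t b) { return freq[a] > freq[b]; });
        return order;
    }

    void update(uint8_t sym) { freq[sym]++; }
};

std::vector<uint8_t> compress(const std::string & text) {
    ToyModel model;
    std::vector<uint8_t> ranks;
    for (unsigned char c : text) {
        auto order = model.ranking();
        uint8_t rank = std::find(order.begin(), order.end(), c) - order.begin();
        ranks.push_back(rank); // feed these to an entropy coder in practice
        model.update(c);
    }
    return ranks;
}

std::string expand(const std::vector<uint8_t> & ranks) {
    ToyModel model;
    std::string text;
    for (uint8_t rank : ranks) {
        uint8_t c = model.ranking()[rank]; // invert the rank transform
        text.push_back((char) c);
        model.update(c);
    }
    return text;
}

int main() {
    const std::string text = "hello hello hello compression";
    auto packed = compress(text);
    assert(expand(packed) == text); // round-trip must be exact
    size_t small = std::count_if(packed.begin(), packed.end(),
                                 [](uint8_t r) { return r < 16; });
    printf("%zu/%zu ranks below 16\n", small, packed.size());
}
```

The round-trip only works because both sides replay the identical model state at every step, which is also why the drawback above applies: any nondeterminism in the predicted distribution (for example, from a different number of offloaded GPU layers) changes the ranking and breaks decoding.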