Hi all, I've been trying to compress large language models using nvcomp but haven't succeeded. I only managed to compress the model's tokenizer.json and config.json files; I was unable to compress the .safetensors or .gguf model files.
Does nvcomp currently support this? If so, how can I do it?
Much appreciated
@Iyzyman Thanks for the question. Can you share a bit more about which LLM you're trying to compress and for what use case? Is it mainly to reduce the checkpoint size on disk, or are you looking to do compression in memory, or something else?
@akshaysubr Thanks for the response. I was trying to compress Meta-Llama-3-8B-Instruct and its quantized versions, mainly to reduce the size on disk. Would that be possible?
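For context on why the JSON files compressed fine but the weight files didn't: trained (and especially quantized) weight tensors are close to maximum entropy, so any generic lossless codec, GPU or CPU, has almost nothing to remove. Here's a minimal CPU-side sketch using zlib as a stand-in (not nvcomp itself, just to illustrate the entropy limit) comparing a simulated fp16 weight blob against repetitive JSON-like text:

```python
# Sketch: why generic lossless compression barely shrinks model weights.
# zlib stands in for a GPU codec here; the entropy limit is the same.
import zlib
import numpy as np

rng = np.random.default_rng(0)

# Simulated fp16 weight tensor (roughly normal-distributed, like trained weights).
weights = rng.standard_normal(1 << 20).astype(np.float16).tobytes()

# Repetitive JSON-like text, standing in for tokenizer.json / config.json.
text = b'{"token": "hello", "id": 42} ' * 40000

w_ratio = len(zlib.compress(weights, 9)) / len(weights)
t_ratio = len(zlib.compress(text, 9)) / len(text)

print(f"fp16 weights ratio:   {w_ratio:.2f}")   # near 1.0: little gain
print(f"json-like text ratio: {t_ratio:.3f}")   # far below 1.0: big gain
```

The same asymmetry explains the behavior in the original question: the text files shrink dramatically, while .safetensors/.gguf payloads (raw or quantized tensors) stay roughly the same size regardless of codec. Meaningful checkpoint size reduction on disk usually comes from lossy techniques (quantization, pruning) rather than lossless compression.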