I can only load q4_0 and q4_1 models. The newer q4_2, q5_0 and q5_1 models don't work. Since I recently upgraded my RAM to 64 GB to run LLMs on my machine, I'd like to be able to use the newer models.
For context, I use the latest release. Since it was last updated a month ago, I don't know whether more recent commits have already added support for 5-bit quantisation.
I have been using some q5_1 models with no problems after compiling llama.cpp and putting the resulting main.exe in place of Alpaca Electron's chat.exe. You can follow "(OPTIONAL) Building llama.cpp from source" in the README here, although note that the second cmake command didn't work for me and should be `cmake --build . --config Release`, per the llama.cpp README.
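Roughly, the steps I followed look like this (a sketch based on the llama.cpp README; the exact output folder for main.exe may differ between llama.cpp versions):

```
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build
cd build
cmake ..
# this is the step where the command from the Alpaca Electron README failed for me;
# use --build with a Release config instead, as the llama.cpp README says
cmake --build . --config Release
# the compiled binary lands under build\Release\ or build\bin\Release\
# depending on the llama.cpp version; copy main.exe over Alpaca Electron's chat.exe
```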
Hi, I don't know much about AI, but I've seen a lot of models popping up on HuggingFace recently advertising 5-bit quantisation. Here is an example: https://huggingface.co/TheBloke/OpenAssistant-SFT-7-Llama-30B-GGML