Releases: teleprint-me/llama.cpp
b3550
gguf-py : simplify support for quant types (#8838)
* gguf-py : use classes for quants
* convert_hf : simplify internal quantization type selection
* gguf-py : fix flake8 lint
* gguf-py : fix BF16 numpy view type
* gguf-py : remove LlamaFileTypeMap
  Too specific to 'llama.cpp', and would be a maintenance burden to keep up to date.
* gguf-py : add generic quantize and dequantize functions
  The quant classes no longer need to be known, only the target or the source type, for 'quantize' and 'dequantize', respectively.
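To illustrate the "generic quantize and dequantize" idea from the last bullet, here is a minimal pure-Python sketch in which the caller names only the type and a lookup table picks the implementation. It is not the gguf-py code: the function names, the `QUANTIZERS`/`DEQUANTIZERS` tables, and the `(scale, quants)` block representation are assumptions made for illustration. Only the Q8_0 scheme itself (blocks of 32 values, one scale per block, signed 8-bit quants) follows the well-known format, and the scale is kept as a Python float here rather than a 16-bit float.

```python
BLOCK_SIZE = 32  # Q8_0 groups values into blocks of 32


def quantize_q8_0(values):
    """Quantize a flat list of floats into a list of (scale, quants) blocks."""
    if len(values) % BLOCK_SIZE != 0:
        raise ValueError("length must be a multiple of the block size")
    blocks = []
    for start in range(0, len(values), BLOCK_SIZE):
        chunk = values[start:start + BLOCK_SIZE]
        amax = max(abs(v) for v in chunk)
        scale = amax / 127.0  # largest magnitude maps to +/-127
        if scale == 0.0:
            quants = [0] * BLOCK_SIZE  # all-zero block, avoid dividing by zero
        else:
            quants = [max(-127, min(127, round(v / scale))) for v in chunk]
        blocks.append((scale, quants))
    return blocks


def dequantize_q8_0(blocks):
    """Rebuild approximate floats from (scale, quants) blocks."""
    out = []
    for scale, quants in blocks:
        out.extend(scale * q for q in quants)
    return out


# Hypothetical dispatch tables: callers name the type, not the implementation.
QUANTIZERS = {"Q8_0": quantize_q8_0}
DEQUANTIZERS = {"Q8_0": dequantize_q8_0}


def quantize(values, qtype):
    """Generic entry point: only the target type needs to be known."""
    if qtype not in QUANTIZERS:
        raise NotImplementedError(f"quantization to {qtype} is not implemented")
    return QUANTIZERS[qtype](values)


def dequantize(blocks, qtype):
    """Generic entry point: only the source type needs to be known."""
    if qtype not in DEQUANTIZERS:
        raise NotImplementedError(f"dequantization from {qtype} is not implemented")
    return DEQUANTIZERS[qtype](blocks)
```

With this shape, supporting another type means adding one entry to each table, and calling code such as `dequantize(quantize(data, "Q8_0"), "Q8_0")` does not change.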
b3542
make : use C compiler to build metal embed object (#8899)
* make : use C compiler to build metal embed object
* use rm + rmdir to avoid -r flag in rm
b3506
ggml : reading the runtime sve config of the cpu (#8709)
* ggml : reading the runtime sve config of the cpu
* change to one time init to prevent performance drop
* prefix variable to avoid possible conflicts
* revert xxhash fix and add brackets
Co-authored-by: domke
b3503
[SYCL] Fixing wrong VDR iq4nl value (#8812)
b3493
py: add_array() will not add to kv store if value is an empty array (…
b3484
chore : Fix vulkan related compiler warnings, add help text, improve …
b3468
ggml : reduce hash table reset cost (#8698)
* ggml : reduce hash table reset cost
* fix unreachable code warnings after GGML_ASSERT(false)
* GGML_ASSERT(false) -> GGML_ABORT("fatal error")
* GGML_ABORT use format string
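The release note does not say how the reset cost was reduced. One common way to make resetting an open-addressing hash set cheap is to track occupancy in a separate bitset, so a reset clears one bit per slot and leaves the key array untouched. The sketch below shows that general technique only; the `HashSet` class and its method names are hypothetical and are not taken from ggml.

```python
class HashSet:
    """Open-addressing hash set whose reset only clears an occupancy bitset."""

    def __init__(self, size):
        self.size = size
        self.keys = [None] * size               # never cleared; may hold stale keys
        self.used = bytearray((size + 7) // 8)  # one occupancy bit per slot

    def _is_used(self, i):
        return (self.used[i >> 3] >> (i & 7)) & 1

    def _set_used(self, i):
        self.used[i >> 3] |= 1 << (i & 7)

    def reset(self):
        # Cheap reset: replace the bitset (size / 8 bytes), leave keys as they are.
        self.used = bytearray(len(self.used))

    def _find(self, key):
        """Return the slot holding key, or the first free slot, or None if full."""
        start = hash(key) % self.size
        i = start
        while self._is_used(i) and self.keys[i] != key:
            i = (i + 1) % self.size  # linear probing
            if i == start:
                return None          # wrapped around: table is full
        return i

    def insert(self, key):
        """Insert key; return True if it was newly added."""
        i = self._find(key)
        if i is None:
            raise RuntimeError("hash set is full")
        if self._is_used(i):
            return False             # already present
        self.keys[i] = key
        self._set_used(i)
        return True

    def contains(self, key):
        i = self._find(key)
        return i is not None and bool(self._is_used(i))
```

A stale key left behind in `keys` is harmless because a slot is only trusted when its occupancy bit is set.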
b3467
llama : fix order of parameters (#8706)
usage of `aclrtGetMemInfo` is correct: https://www.hiascend.com/doc_center/source/zh/canncommercial/63RC2/inferapplicationdev/aclcppdevg/aclcppdevg_03_0103.html
Co-authored-by: Judd
b3466
server : add Speech Recognition & Synthesis to UI (#8679)
* server : add Speech Recognition & Synthesis to UI
* server : add Speech Recognition & Synthesis to UI (fixes)
b3448
server : fix URL.parse in the UI (#8646)