Releases · ngxson/llama.cpp
b4393
b4392
examples, ggml : fix GCC compiler warnings (#10983)
Warning types fixed (observed under MSYS2 GCC 14.2.0):
* format '%ld' expects argument of type 'long int', but argument has type 'size_t'
* llama.cpp/ggml/src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp:81:46: warning: missing initializer for member '_STARTUPINFOA::lpDesktop' [-Wmissing-field-initializers] (emitted for all struct fields except the first)
b4391
server : add support for "encoding_format": "base64" to the */embeddi…
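A minimal sketch of using this option, assuming a llama.cpp server with an embedding model is listening on localhost:8080 and that, as in the OpenAI convention, a base64-encoded embedding decodes to little-endian float32 values; the URL, port, and response shape below are assumptions, not part of this release note.

```python
# Request base64-encoded embeddings from a running llama.cpp server and decode
# them back into floats. Assumes an OpenAI-compatible /v1/embeddings endpoint
# at localhost:8080 (adjust for your setup).
import base64
import json
import struct
import urllib.request

req = urllib.request.Request(
    "http://localhost:8080/v1/embeddings",
    data=json.dumps({
        "input": "Hello, world!",
        "encoding_format": "base64",  # instead of the default list of floats
    }).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

# Each embedding arrives as a base64 string; unpack it into float32 values.
raw = base64.b64decode(body["data"][0]["embedding"])
embedding = struct.unpack(f"<{len(raw) // 4}f", raw)
print(len(embedding), "dimensions")
```

Base64 encoding keeps large embedding responses considerably smaller than a JSON array of floats, at the cost of a decode step on the client.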
b4390
ggml : more performance with llamafile tinyblas on x86_64 (#10714)
* more performance with llamafile tinyblas on x86_64:
  - add bf16 support
  - change the dispatch strategy (thanks: https://github.com/ikawrakow/ik_llama.cpp/pull/71)
  - reduce memory bandwidth with a simpler, more cache-friendly tinyblas dispatch
* tinyblas dynamic dispatching
* sgemm: add M blocks
* git 2.47 uses short ids of length 9; show-progress is not part of GNU Wget2
* remove unstable test
b4389
server : allow filtering llama server response fields (#10940)
* llama_server_response_fields
* llama_server_response_fields_fix_issues
* params fixes
* fix
* clarify docs
* change to "response_fields"
Co-authored-by: Xuan Son Nguyen <[email protected]>
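A minimal sketch of the new filtering parameter, assuming a llama.cpp server on localhost:8080; the field names requested below ("content", "tokens_predicted") are illustrative assumptions, so check the server README of your build for the exact fields it supports.

```python
# Ask the llama.cpp server's /completion endpoint to return only selected
# fields via the "response_fields" parameter introduced in this release.
import json
import urllib.request

req = urllib.request.Request(
    "http://localhost:8080/completion",
    data=json.dumps({
        "prompt": "Write a haiku about autumn.",
        "n_predict": 64,
        # Only these fields should appear in the response body.
        "response_fields": ["content", "tokens_predicted"],
    }).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

print(body)  # trimmed-down response containing only the requested fields
```

Filtering the response this way keeps clients from parsing the full completion payload when they only need one or two fields.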
b4388
llama : the WPM vocabs use the CLS token as BOS (#10930)
* llama : the WPM vocabs use the CLS token as BOS (ggml-ci)
* llama : add comment
b4387
ggml : use wstring for backend search paths (#10960) ggml-ci
b4384
server : fix missing model id in /model endpoint (#10957)
* server : fix missing model id in /model endpoint
* fix ci
b4382
rpc-server : add support for the SYCL backend (#10934)
b4381
llama : support InfiniAI Megrez 3b (#10893)
* Support InfiniAI Megrez 3b
* Fix tokenizer_clean_spaces for Megrez