Releases: ngxson/llama.cpp

b4393

26 Dec 16:31
d79d8f3
vulkan: multi-row k quants (#10846)

* multi row k quant shaders!

* better row selection

* more row choices

* readjust row selection

* rm_kq=2 by default

b4392

26 Dec 14:42
d283d02
examples, ggml : fix GCC compiler warnings (#10983)

Warning types fixed (observed under MSYS2 GCC 14.2.0):
* format '%ld' expects argument of type 'long int', but argument has type 'size_t'
* llama.cpp/ggml/src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp:81:46: warning: missing initializer for member '_STARTUPINFOA::lpDesktop' [-Wmissing-field-initializers]  (emitted for all struct fields except the first)

b4391

24 Dec 21:15
9ba399d
server : add support for "encoding_format": "base64" to the */embeddi…
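The release above adds `"encoding_format": "base64"` to the server's embeddings endpoints. In the OpenAI-style convention this option follows, the embedding vector is returned as a base64 string of little-endian float32 bytes instead of a JSON float array. A minimal decoding sketch (illustrative only, not llama.cpp code; the round-trip encode here stands in for a real server response):

```python
import base64
import struct

def decode_embedding(b64: str) -> list[float]:
    """Decode a base64 string of little-endian float32 bytes into floats."""
    raw = base64.b64decode(b64)
    count = len(raw) // 4  # 4 bytes per float32
    return list(struct.unpack(f"<{count}f", raw))

# Round-trip demo: pack three float32 values, base64-encode, decode back.
vec = [0.25, -1.5, 3.0]
encoded = base64.b64encode(struct.pack(f"<{len(vec)}f", *vec)).decode()
print(decode_embedding(encoded))  # -> [0.25, -1.5, 3.0]
```

The base64 form roughly quarters the JSON payload size for large embedding matrices compared with printing each float as decimal text.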

b4390

24 Dec 18:33
2cd43f4
ggml : more performance with llamafile tinyblas on x86_64 (#10714)

* more performance with llamafile tinyblas on x86_64.

- add bf16 support
- change dispatch strategy (thanks:
https://github.com/ikawrakow/ik_llama.cpp/pull/71 )
- reduce memory bandwidth

simpler tinyblas dispatch and more cache friendly

* tinyblas dynamic dispatching

* sgemm: add M blocks.

* - git 2.47 uses short ids of length 9.
- show-progress is not part of GNU Wget2

* remove unstable test

b4389

24 Dec 17:19
09fe2e7
server : allow filtering llama server response fields (#10940)

* llama_server_response_fields

* llama_server_response_fields_fix_issues

* params fixes

* fix

* clarify docs

* change to "response_fields"

---------

Co-authored-by: Xuan Son Nguyen <[email protected]>

b4388

24 Dec 08:22
30caac3
llama : the WPM vocabs use the CLS token as BOS (#10930)

* llama : the WPM vocabs use the CLS token as BOS

ggml-ci

* llama : add comment

b4387

24 Dec 03:47
60cfa72
ggml : use wstring for backend search paths (#10960)

ggml-ci

b4384

23 Dec 12:31
14b699e
server : fix missing model id in /model endpoint (#10957)

* server : fix missing model id in /model endpoint

* fix ci

b4382

23 Dec 10:08
86bf31c
rpc-server : add support for the SYCL backend (#10934)

b4381

23 Dec 02:13
b92a14a
llama : support InfiniAI Megrez 3b (#10893)

* Support InfiniAI Megrez 3b

* Fix tokenizer_clean_spaces for Megrez