Releases: ngxson/llama.cpp

b2368

09 Mar 10:42
e1fa956
server : add SSL support (#5926)

* add cmake build toggle to enable ssl support in server

Signed-off-by: Gabe Goodhart <[email protected]>

* add flags for ssl key/cert files and use SSLServer if set

All SSL setup is hidden behind CPPHTTPLIB_OPENSSL_SUPPORT in the same
way that the base httplib hides the SSL support

Signed-off-by: Gabe Goodhart <[email protected]>

* Update readme for SSL support in server

Signed-off-by: Gabe Goodhart <[email protected]>

* Add LLAMA_SERVER_SSL variable setup to top-level Makefile

Signed-off-by: Gabe Goodhart <[email protected]>

---------

Signed-off-by: Gabe Goodhart <[email protected]>
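
The pattern described in this commit, where the TLS code path exists only when CPPHTTPLIB_OPENSSL_SUPPORT is defined and is selected only when key and cert files are given, might look roughly like the sketch below. The `Server` and `SSLServer` structs are hypothetical stand-ins for the cpp-httplib classes of the same names so that the example compiles without the library, and `make_server` is an illustrative helper, not the function used in the server.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for httplib::Server (plain HTTP).
struct Server {
    virtual ~Server() = default;
    virtual bool is_tls() const { return false; }
};

#ifdef CPPHTTPLIB_OPENSSL_SUPPORT
// Hypothetical stand-in for httplib::SSLServer; it is only declared when
// SSL support is compiled in, mirroring how the library guards its own class.
struct SSLServer : Server {
    SSLServer(const std::string & cert_file, const std::string & key_file)
        : cert(cert_file), key(key_file) {}
    bool is_tls() const override { return true; }
    std::string cert;
    std::string key;
};
#endif

// Illustrative helper: returns a TLS server only if SSL support is compiled
// in AND both files were supplied; otherwise falls back to plain HTTP.
std::unique_ptr<Server> make_server(const std::string & key_file,
                                    const std::string & cert_file) {
#ifdef CPPHTTPLIB_OPENSSL_SUPPORT
    if (!key_file.empty() && !cert_file.empty()) {
        return std::unique_ptr<Server>(new SSLServer(cert_file, key_file));
    }
#else
    (void) key_file;   // unused when SSL support is compiled out
    (void) cert_file;
#endif
    return std::unique_ptr<Server>(new Server());
}
```

Because callers only hold a `Server` pointer, the rest of the program does not need its own `#ifdef` blocks.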

b2364

08 Mar 14:24
76e8688
server: metrics: add llamacpp:prompt_seconds_total and llamacpp:token…

b2361

08 Mar 10:32
581ed5c
log : fix MSVC compile errors (#5643)

MSVC gives the following error with the existing macros:
`Error C2059 : syntax error: ','`

This patch adds `##` as a prefix to `__VA_ARGS__` to address this error.
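
As an illustration of what the `##` prefix does, the sketch below uses a hypothetical `LOG_SKETCH` macro (not one of the macros from the patched header). One common way a stray-comma syntax error arises is when a variadic macro is invoked with no variadic arguments; whether that is the exact trigger in this case is an assumption. `##__VA_ARGS__` is a compiler extension accepted by GCC, Clang, and MSVC that drops the preceding comma when the argument list is empty.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical logging macro for illustration only.
// With a plain __VA_ARGS__, LOG_SKETCH(buf, "model loaded") could expand to
//   std::snprintf((buf), sizeof(buf), "model loaded", )
// leaving a dangling comma. The ## prefix removes that comma when no
// variadic arguments are passed.
#define LOG_SKETCH(buf, fmt, ...) \
    std::snprintf((buf), sizeof(buf), fmt, ##__VA_ARGS__)

// Invocation with no variadic arguments.
std::string log_no_args() {
    char buf[64];
    const int written = LOG_SKETCH(buf, "model loaded");
    assert(written >= 0);  // snprintf returns a negative value on error
    (void) written;
    return buf;
}

// Invocation with one variadic argument.
std::string log_with_args(int n_tokens) {
    char buf[64];
    const int written = LOG_SKETCH(buf, "tokens: %d", n_tokens);
    assert(written >= 0);
    (void) written;
    return buf;
}
```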

b2311

02 Mar 10:12
9bf297a
workflows : remove nocleanup arg for check-requirements.sh (#5826)

Reduces peak tmpfs usage and should prevent the check from failing from
running out of space.

Fixes the 'No space left on device' issue mentioned in #5703.

b2296

29 Feb 14:10
d5ab297
llama : constified `llama_set_state_data`'s `src` (#5774)

b2295

28 Feb 21:10
87c91c0
ci : reduce 3b ppl chunks to 1 to avoid timeout (#5771)

ggml-ci

b2282

27 Feb 20:58
cb49e0f
Attempt to fix android build (#5752)

Co-authored-by: Iwan Kawrakow <[email protected]>

b2271

26 Feb 14:19
67fd331
unicode : reuse iterator (#5726)

b2264

25 Feb 21:02
bf08e00
llama : refactor k-shift implementation + KV defragmentation (#5691)

* llama : refactor k-shift implementation

ggml-ci

* llama : rename llama_kv_cache_seq_shift to llama_kv_cache_seq_add

* llama : cont k-shift refactoring + normalize type names

ggml-ci

* minor : fix MPI builds

* llama : reuse n_rot from the build context

ggml-ci

* llama : revert enum name changes from this PR

ggml-ci

* llama : update llama_rope_type

* llama : add comment about rope values

* llama : fix build

* passkey : apply kv cache updates explicitly

ggml-ci

* llama : change name to llama_kv_cache_update()

* llama : add llama_kv_cache_seq_pos_max()

* passkey : fix llama_kv_cache_seq_pos_max() usage

* llama : some llama_kv_cell simplifications

* llama : add llama_kv_cache_compress (EXPERIMENTAL)

* llama : add alternative KV cache merging (EXPERIMENTAL)

* llama : add llama_kv_cache_defrag

* llama : comments

* llama : remove llama_kv_cache_compress

will add in a separate PR

ggml-ci

* llama : defragment via non-overlapping moves

* llama : ggml_graph based defrag implementation

ggml-ci

* llama : switch the loop order in build_defrag

* llama : add comments
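
To give a feel for what "defragment via non-overlapping moves" can mean, here is a minimal sketch of one way to plan such moves over a cell-occupancy mask. It is not the implementation from this PR: `plan_defrag` and the `used` mask are hypothetical, and the real code operates on KV cache cells and builds a ggml graph to perform the copies. The idea shown is that holes in the front of the buffer are filled with used cells taken from the tail, so every source index lies beyond every destination index and no move can overwrite another move's source.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Plans (src, dst) moves that compact all used cells into [0, n_used).
// Hypothetical helper for illustration; `used[i]` marks an occupied cell.
std::vector<std::pair<std::size_t, std::size_t>>
plan_defrag(const std::vector<bool> & used) {
    std::size_t n_used = 0;
    for (bool u : used) {
        n_used += u ? 1 : 0;
    }

    std::vector<std::pair<std::size_t, std::size_t>> moves;
    std::size_t src = used.size();  // scans backwards over the tail

    for (std::size_t dst = 0; dst < n_used; ++dst) {
        if (used[dst]) {
            continue;  // already in place, nothing to move
        }
        // The number of holes in [0, n_used) equals the number of used cells
        // in [n_used, size), so this backwards scan always finds a source
        // at an index >= n_used.
        do {
            assert(src > 0);
            --src;
        } while (!used[src]);
        moves.emplace_back(src, dst);
    }
    return moves;
}
```

For the mask `{1, 0, 1, 0, 0, 1, 1}` this plans two moves, 6 to 1 and 5 to 3, after which the four used cells occupy indices 0 through 3.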

b2259

25 Feb 14:21
930b178
server: logs - unified format and --log-format option (#5700)

* server: logs - always use JSON logger, add thread_id in message, log task_id and slot_id

* server : skip GH copilot requests from logging

* server : change message format of server_log()

* server : no need to repeat log in comment

* server : log style consistency

* server : fix compile warning

* server : fix tests regex patterns on M2 Ultra

* server: logs: PR feedback on log level

* server: logs: allow to choose log format in json or plain text

* server: tests: output server logs in text

* server: logs switch init logs to server logs macro

* server: logs ensure JSON value does not raise an error

* server: logs reduce level VERBOSE to VERB to max 4 chars

* server: logs lower case as other log messages

* server: logs avoid static in general

Co-authored-by: Georgi Gerganov <[email protected]>

* server: logs PR feedback: change text log format to: LEVEL [function_name] message | additional=data

---------

Co-authored-by: Georgi Gerganov <[email protected]>
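
The plain-text layout named in the last bullet, `LEVEL [function_name] message | additional=data`, could be produced by something like the sketch below. `format_text_log` and `LogFields` are hypothetical names, and the exact spacing, padding, and quoting used by the server are assumptions here; only the overall shape comes from the commit message.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Extra key/value pairs appended after the message (hypothetical type).
using LogFields = std::vector<std::pair<std::string, std::string>>;

// Builds one line shaped like: LEVEL [function_name] message | key=value ...
// The " |" separator is omitted when there is no additional data (assumption).
std::string format_text_log(const std::string & level,
                            const std::string & function_name,
                            const std::string & message,
                            const LogFields & extra) {
    assert(!level.empty());
    std::string line = level + " [" + function_name + "] " + message;
    if (!extra.empty()) {
        line += " |";
        for (const auto & kv : extra) {
            line += " " + kv.first + "=" + kv.second;
        }
    }
    return line;
}
```

Called with level `INFO`, function `main`, message `HTTP server listening`, and one field `port=8080`, this returns `INFO [main] HTTP server listening | port=8080`.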