Releases: teleprint-me/llama.cpp

b3164

17 Jun 05:50
df68d4f
[SYCL] Update README-sycl.md for Chapter "Recommended release" and "N…

b3159

16 Jun 17:51
bc6c457
flake.lock: Update (#7951)

b3154

16 Jun 06:01
7c7836d
Vulkan Shader Refactor, Memory Debugging Option (#7947)

* Refactor shaders, extract GLSL code from ggml_vk_generate_shaders.py into vulkan-shaders directory

* Improve debug log code

* Add memory debug output option

* Fix flake8

* Fix unnecessary high llama-3 VRAM use

b3151

14 Jun 19:04
f8ec887
ci : fix macos x86 build (#7940)

To keep the environment the old `macos-latest` runner provided, we should pin `macos-12` explicitly.

Potentially will fix: https://github.com/ggerganov/llama.cpp/issues/6975
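
The pin described above would look roughly like this in the workflow file. This is a hedged sketch, not the actual diff from #7940: the job name and build steps are illustrative, and only the `runs-on` line reflects the change the release note describes.

```yaml
# Hypothetical excerpt of a GitHub Actions workflow (job name and steps
# are illustrative, not copied from llama.cpp's actual workflow).
jobs:
  macos-x86-build:
    # Pin a specific runner image instead of the `macos-latest` alias,
    # which GitHub periodically repoints to newer images and can break
    # builds that assume the older environment.
    runs-on: macos-12
    steps:
      - uses: actions/checkout@v4
      - name: Build
        run: |
          cmake -B build
          cmake --build build --config Release
```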

b3150

14 Jun 17:37
76d66ee
CUDA: faster q2_K, q3_K MMQ + int8 tensor cores (#7921)

* CUDA: faster q2_K, q3_K MMQ + int8 tensor cores

* try CI fix

* try CI fix

* try CI fix

* fix data race

* revert q2_K precision-related changes

b3149

14 Jun 17:23
66ef1ce
metal : utilize max shared memory for mul_mat_id (#7935)

b3141

12 Jun 18:41
9635529
CUDA: fix broken oob check for FA vec f32 kernel (#7904)

b3092

05 Jun 17:04
7d1a378
CUDA: refactor mmq, dmmv, mmvq (#7716)

* CUDA: refactor mmq, dmmv, mmvq

* fix out-of-bounds write

* struct for qk, qr, qi

* fix cmake build

* mmq_type_traits

b3084

04 Jun 18:35
5ca0944
readme : remove obsolete Zig instructions (#7471)

b3078

03 Jun 17:52
bde7cd3
llama : offload to RPC in addition to other backends (#7640)

* llama : offload to RPC in addition to other backends

* fix copy_tensor being called on the src buffer instead of the dst buffer

- always initialize views in the view_src buffer

- add RPC backend to Makefile build

- add endpoint to all RPC object names

* add rpc-server to Makefile

* Update llama.cpp

Co-authored-by: slaren <[email protected]>
