Releases · teleprint-me/llama.cpp
b3164
[SYCL] Update README-sycl.md for Chapter "Recommended release" and "N…
b3159
flake.lock: Update (#7951)
b3154
Vulkan Shader Refactor, Memory Debugging Option (#7947)
* Refactor shaders, extract GLSL code from ggml_vk_generate_shaders.py into a vulkan-shaders directory
* Improve debug log code
* Add memory debug output option
* Fix flake8
* Fix unnecessarily high llama-3 VRAM use
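For context on the memory debug output option mentioned above: such an option typically logs device-memory allocations as they happen. The following is only a minimal sketch of that idea under assumed names (`vk_alloc_logged`, `vk_memory_debug`, the log format); it is not the Vulkan backend code from this repository.

```cpp
// Hypothetical sketch: log Vulkan device-memory allocations when a debug flag is set.
// Not the actual llama.cpp/ggml Vulkan backend implementation.
#include <vulkan/vulkan.h>
#include <atomic>
#include <cstdio>

static bool vk_memory_debug = true;               // would be driven by a build or runtime option
static std::atomic<size_t> vk_total_allocated{0}; // running total of allocated bytes

static VkResult vk_alloc_logged(VkDevice device, const VkMemoryAllocateInfo * info,
                                VkDeviceMemory * memory) {
    VkResult res = vkAllocateMemory(device, info, nullptr, memory);
    if (vk_memory_debug && res == VK_SUCCESS) {
        vk_total_allocated += info->allocationSize;
        fprintf(stderr, "[vk-mem] +%llu bytes (total %llu)\n",
                (unsigned long long) info->allocationSize,
                (unsigned long long) vk_total_allocated.load());
    }
    return res;
}
```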
b3151
ci : fix macos x86 build (#7940)
To keep the behaviour of the old `macos-latest` runner we should pin to `macos-12`.
This may fix: https://github.com/ggerganov/llama.cpp/issues/6975
b3150
CUDA: faster q2_K, q3_K MMQ + int8 tensor cores (#7921)
* CUDA: faster q2_K, q3_K MMQ + int8 tensor cores
* try CI fix
* try CI fix
* try CI fix
* fix data race
* revert q2_K precision-related changes
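For context on the int8 path named above: MMQ-style kernels multiply int8-quantized weights and activations and accumulate the products in 32-bit integers, which hardware executes via int8 tensor cores or packed dot-product instructions. The sketch below is a purely illustrative scalar emulation of that accumulation in C++, with made-up values; it is not taken from the actual CUDA kernels.

```cpp
// Illustrative only: scalar emulation of an int8 dot product with int32 accumulation,
// the basic building block that quantized matrix multiplication (MMQ) kernels rely on.
#include <cstdint>
#include <cstdio>

// Dot product of n int8 values, accumulated in int32
// (what int8 tensor cores / packed dot-product instructions do in hardware).
static int32_t dot_i8(const int8_t * a, const int8_t * b, int n) {
    int32_t acc = 0;
    for (int i = 0; i < n; ++i) {
        acc += (int32_t) a[i] * (int32_t) b[i];
    }
    return acc;
}

int main() {
    const int8_t w[4] = { 12, -3, 7, 100 }; // quantized weights (made-up values)
    const int8_t x[4] = { -5, 20, 1,  2  }; // quantized activations (made-up values)
    // The int32 result is later rescaled with the per-block float scales of the quantized format.
    printf("int32 accumulator = %d\n", dot_i8(w, x, 4));
    return 0;
}
```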
b3149
metal : utilize max shared memory for mul_mat_id (#7935)
b3141
CUDA: fix broken oob check for FA vec f32 kernel (#7904)
b3092
CUDA: refactor mmq, dmmv, mmvq (#7716)
* CUDA: refactor mmq, dmmv, mmvq
* fix out-of-bounds write
* struct for qk, qr, qi
* fix cmake build
* mmq_type_traits
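The "struct for qk, qr, qi" and `mmq_type_traits` items refer to gathering the per-quantization-type constants (values per block, packing ratio, 32-bit ints of quantized data per block) into a traits struct instead of scattering them across macros. Below is a hedged C++ sketch of that pattern; the struct and enum names and the listed values are illustrative, not the exact definitions in the repository.

```cpp
// Sketch of the type-traits idea: one struct per quantization format carries its constants,
// so kernels can be written once and specialized per type at compile time.
// Names and values are illustrative, not the repository's definitions.
#include <cstdio>

enum ggml_type_example { EX_Q4_0, EX_Q8_0 };

template <ggml_type_example T> struct quant_traits;

template <> struct quant_traits<EX_Q4_0> {
    static constexpr int qk = 32;  // values per block
    static constexpr int qr = 2;   // values packed per byte
    static constexpr int qi = 4;   // 32-bit ints of quantized data per block
};

template <> struct quant_traits<EX_Q8_0> {
    static constexpr int qk = 32;
    static constexpr int qr = 1;
    static constexpr int qi = 8;
};

template <ggml_type_example T>
void print_traits(const char * name) {
    printf("%s: qk=%d qr=%d qi=%d\n", name,
           quant_traits<T>::qk, quant_traits<T>::qr, quant_traits<T>::qi);
}

int main() {
    print_traits<EX_Q4_0>("Q4_0");
    print_traits<EX_Q8_0>("Q8_0");
    return 0;
}
```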
b3084
readme : remove obsolete Zig instructions (#7471)
b3078
llama : offload to RPC in addition to other backends (#7640)
* llama : offload to RPC in addition to other backends
* fix copy_tensor being called on the src buffer instead of the dst buffer
* always initialize views in the view_src buffer
* add RPC backend to Makefile build
* add endpoint to all RPC object names
* add rpc-server to Makefile
* Update llama.cpp

Co-authored-by: slaren <[email protected]>
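The entry above lets llama offload work to a remote ggml RPC backend alongside local backends. As a minimal hedged sketch of the client side, the snippet below connects to an endpoint via `ggml_backend_rpc_init`; the endpoint string and error handling are assumptions, and ggml-rpc.h in the tree is the authoritative interface.

```cpp
// Hedged sketch: connect to a running rpc-server and create an RPC backend handle.
// Assumes the ggml RPC backend header/API from this tree; details may differ.
#include "ggml-rpc.h"
#include <cstdio>

int main() {
    // endpoint of a previously started rpc-server instance ("host:port" here is an assumption)
    const char * endpoint = "192.168.1.10:50052";

    ggml_backend_t rpc_backend = ggml_backend_rpc_init(endpoint);
    if (rpc_backend == nullptr) {
        fprintf(stderr, "failed to connect to RPC endpoint %s\n", endpoint);
        return 1;
    }

    // ... build a ggml graph and schedule it across the RPC backend and local backends ...

    ggml_backend_free(rpc_backend);
    return 0;
}
```

On the server side, the `rpc-server` binary added to the Makefile in this release would be started first and listen on that endpoint.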