
Releases: teleprint-me/llama.cpp

b3290

04 Jul 01:46
d23287f
Define and optimize RDNA1 (#8085)

b3284

02 Jul 19:04
3e2618b
Adding step to `clean` target to remove legacy binary names to reduce…

b3268

01 Jul 07:40
d0a7145
flake.lock: Update (#8218)

b3266

30 Jun 18:23
1c5eba6
llama: Add attention and final logit soft-capping, update scaling fac…

b3264

28 Jun 23:01
8748d8a
json: attempt to skip slow tests when running under emulator (#8189)

b3209

24 Jun 02:16
95f57bb
ggml : remove ggml_task_type and GGML_PERF (#8017)

* ggml : remove ggml_task_type and GGML_PERF

* check abort_callback on main thread only

* vulkan : remove usage of ggml_compute_params

* remove LLAMA_PERF

b3203

23 Jun 06:19
b5a5f34
Remove extra blank lines that were breaking lint (#8067)

b3196

21 Jun 07:18
7d5e877
ggml : AVX IQ quants (#7845)

* initial iq4_xs

* fix ci

* iq4_nl

* iq1_m

* iq1_s

* iq2_xxs

* iq3_xxs

* iq2_s

* iq2_xs

* iq3_s before sllv

* iq3_s

* iq3_s small fix

* iq3_s sllv can be safely replaced with sse multiply

b3184

19 Jun 17:58
9c77ec1
ggml : synchronize threads using barriers (#7993)

b3182

19 Jun 02:33
623494a
[SYCL] refactor (#6408)

* separate lower-precision GEMM from the main files

* fix workgroup size hardcode