
Releases: teleprint-me/llama.cpp

b3290

04 Jul 01:46
d23287f
Define and optimize RDNA1 (#8085)

b3284

02 Jul 19:04
3e2618b
Adding step to `clean` target to remove legacy binary names to reduce…

b3268

01 Jul 07:40
d0a7145
flake.lock: Update (#8218)

b3266

30 Jun 18:23
1c5eba6
llama: Add attention and final logit soft-capping, update scaling fac…

b3264

28 Jun 23:01
8748d8a
json: attempt to skip slow tests when running under emulator (#8189)

b3209

24 Jun 02:16
95f57bb
ggml : remove ggml_task_type and GGML_PERF (#8017)

* ggml : remove ggml_task_type and GGML_PERF

* check abort_callback on main thread only

* vulkan : remove usage of ggml_compute_params

* remove LLAMA_PERF

b3203

23 Jun 06:19
b5a5f34
Remove extra blank lines that were breaking lint (#8067)

b3196

21 Jun 07:18
7d5e877
ggml : AVX IQ quants (#7845)

* initial iq4_xs

* fix ci

* iq4_nl

* iq1_m

* iq1_s

* iq2_xxs

* iq3_xxs

* iq2_s

* iq2_xs

* iq3_s before sllv

* iq3_s

* iq3_s small fix

* iq3_s sllv can be safely replaced with sse multiply

b3184

19 Jun 17:58
9c77ec1
ggml : synchronize threads using barriers (#7993)

b3182

19 Jun 02:33
623494a
[SYCL] refactor (#6408)

* separate lower-precision GEMM from the main files

* fix workgroup size hardcode