Releases · teleprint-me/llama.cpp

03 Jun 08:13

549279d

b3072

llama : avoid double token-to-piece cache (#7654)

ggml-ci

02 Jun 02:45

github-actions

e141ce6

b3066

Fix FlashAttention debug test, FP32 assert (#7684)

01 Jun 19:19

github-actions

2ac95c9

b3064

SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, S…

30 May 17:21

github-actions

5921b8f

b3051

llama : cache llama_token_to_piece (#7587)

* llama : cache llama_token_to_piece

ggml-ci

* llama : use vectors and avoid has_cache

ggml-ci

* llama : throw on unknown tokenizer types

ggml-ci

* llama : print a log of the total cache size

30 May 00:33

github-actions

55d6226

b3040

metal : remove invalid asserts (#7617)

29 May 17:08

github-actions

cce3dcf

b3037

cuda : non-cont concat support (#7610)

* tests : add non-cont concat tests

* cuda : non-cont concat support

ggml-ci

28 May 23:56

github-actions

b864b50

b3029

[SYCL] Align GEMM dispatch (#7566)

* align GEMM dispatch

26 May 18:34

github-actions

dff451c

b3004

flake.lock: Update (#7540)

Flake lock file updates:

• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/4a6b83b05df1a8bd7d99095ec4b4d271f2956b64?narHash=sha256-%2BNpbZRCRisUHKQJZF3CT%2Bxn14ZZQO%2BKjxIIanH3Pvn4%3D' (2024-05-17)
  → 'github:NixOS/nixpkgs/bfb7a882678e518398ce9a31a881538679f6f092?narHash=sha256-4zSIhSRRIoEBwjbPm3YiGtbd8HDWzFxJjw5DYSDy1n8%3D' (2024-05-24)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

26 May 00:00

github-actions

9588f19

b2998

train : change default FA argument (#7528)

23 May 17:15

github-actions

74f33ad

b2986

readme : remove trailing space (#7469)
