Releases: teleprint-me/llama.cpp

b3072

03 Jun 08:13
549279d
llama : avoid double token-to-piece cache (#7654)

ggml-ci

b3066

02 Jun 02:45
e141ce6
Fix FlashAttention debug test, FP32 assert (#7684)

b3064

01 Jun 19:19
2ac95c9
SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, S…

b3051

30 May 17:21
5921b8f
llama : cache llama_token_to_piece (#7587)

* llama : cache llama_token_to_piece

ggml-ci

* llama : use vectors and avoid has_cache

ggml-ci

* llama : throw on unknown tokenizer types

ggml-ci

* llama : print a log of the total cache size

b3040

30 May 00:33
55d6226
metal : remove invalid asserts (#7617)

b3037

29 May 17:08
cce3dcf
cuda : non-cont concat support (#7610)

* tests : add non-cont concat tests

* cuda : non-cont concat support

ggml-ci

b3029

28 May 23:56
b864b50
[SYCL] Align GEMM dispatch (#7566)

* align GEMM dispatch

b3004

26 May 18:34
dff451c
flake.lock: Update (#7540)

Flake lock file updates:

• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/4a6b83b05df1a8bd7d99095ec4b4d271f2956b64?narHash=sha256-%2BNpbZRCRisUHKQJZF3CT%2Bxn14ZZQO%2BKjxIIanH3Pvn4%3D' (2024-05-17)
  → 'github:NixOS/nixpkgs/bfb7a882678e518398ce9a31a881538679f6f092?narHash=sha256-4zSIhSRRIoEBwjbPm3YiGtbd8HDWzFxJjw5DYSDy1n8%3D' (2024-05-24)

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

b2998

26 May 00:00
9588f19
train : change default FA argument (#7528)

b2986

23 May 17:15
74f33ad
readme : remove trailing space (#7469)