Releases: teleprint-me/llama.cpp
b3072
llama : avoid double token-to-piece cache (#7654) ggml-ci
b3066
Fix FlashAttention debug test, FP32 assert (#7684)
b3064
SimpleChat: Simple histogram/repeatMatching driven garbageTrimming, S…
b3051
llama : cache llama_token_to_piece (#7587)
* llama : cache llama_token_to_piece ggml-ci
* llama : use vectors and avoid has_cache ggml-ci
* llama : throw on unknown tokenizer types ggml-ci
* llama : print a log of the total cache size
b3040
metal : remove invalid asserts (#7617)
b3037
cuda : non-cont concat support (#7610)
* tests : add non-cont concat tests
* cuda : non-cont concat support ggml-ci
b3029
[SYCL] Align GEMM dispatch (#7566)
* align GEMM dispatch
b3004
flake.lock: Update (#7540)
Flake lock file updates:
• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/4a6b83b05df1a8bd7d99095ec4b4d271f2956b64?narHash=sha256-%2BNpbZRCRisUHKQJZF3CT%2Bxn14ZZQO%2BKjxIIanH3Pvn4%3D' (2024-05-17)
  → 'github:NixOS/nixpkgs/bfb7a882678e518398ce9a31a881538679f6f092?narHash=sha256-4zSIhSRRIoEBwjbPm3YiGtbd8HDWzFxJjw5DYSDy1n8%3D' (2024-05-24)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
b2998
train : change default FA argument (#7528)
b2986
readme : remove trailing space (#7469)