Releases: teleprint-me/llama.cpp
b2972
CUDA: fix FA out-of-bounds reads (#7479)
b2970
build : remove zig (#7471)
b2961
llama : add phi3 128K model support (#7225)
* add phi3 128k support in convert-hf-to-gguf
* add phi3 128k support in cuda
* address build warnings on llama.cpp
* adjust index value in cuda long rope freq factors
* add long rope support in ggml cpu backend
* make freq factors only depend on ctx size
* remove unused rope scaling type 'su' from gguf converter
* fix lint warnings on convert-hf-to-gguf.py
* use the short freq factor when the context size is smaller than the trained context size
* add one line of comments
* metal : support rope freq_factors
* ggml : update ggml_rope_ext API to support freq. factors
* backends : add dev messages to support rope freq. factors
* minor : style
* tests : update to use new rope API
* backends : fix pragma semicolons
* minor : cleanup
* llama : move rope factors from KV header to tensors
* llama : remove tmp assert
* cuda : fix compile warning
* convert : read/write n_head_kv
* llama : fix uninitialized tensors

Co-authored-by: Georgi Gerganov <[email protected]>
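The "rope freq factors" items above refer to per-dimension scaling of the rotary-embedding frequencies used by phi3's long-rope scheme. A rough sketch of the idea (illustrative only; `rope_freqs` is a hypothetical helper, not llama.cpp's implementation):

```python
def rope_freqs(head_dim, base=10000.0, factors=None):
    """Base rotary frequencies theta_i = base^(-2i/d); long-rope divides
    each frequency by its factor, stretching that dimension's rotation
    period and so extending the usable context length."""
    freqs = [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]
    if factors is not None:
        freqs = [f / s for f, s in zip(freqs, factors)]
    return freqs

# factors > 1 slow the rotations; "short" vs "long" factor sets are chosen
# by comparing the context size against the trained context size
print(rope_freqs(4, factors=[2.0, 2.0]))
```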
b2958
CUDA: fix unused warning in mmq.cu (#7442)
b2953
Tokenizer SPM fixes for phi-3 and llama-spm (#7375)
* Update brute force test: special tokens
* Fix added tokens
  - Try to read 'added_tokens.json'.
  - Try to read 'tokenizer_config.json'.
  - Try to read 'tokenizer.json'.
* Fix special tokens rtrim
* server : fix test regexes

Co-authored-by: Georgi Gerganov <[email protected]>
b2941
Add provisions for windows support for BF16 code including CMake prov…
b2939
quantize : fix --keep-split check (#7374)
b2929
Capture CUDA logging output (#7298)
* logging: output capture in cuda module
* fix compile error
* fix: vsnprintf writes a terminating 0; correct the string usage
* post-review fixes
* Update llama.cpp
* Update llama.cpp

Co-authored-by: slaren <[email protected]>
b2916
Unicode codepoint flags for custom regexes (#7245)
* Replace CODEPOINT_TYPE_* with codepoint_flags
* Update and bugfix brute force random test
* Deterministic brute force random test
* Unicode normalization NFD
* Get rid of BOM
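The NFD normalization the list mentions can be illustrated with Python's stdlib (this is only a demonstration of the normalization form; llama.cpp implements it in C++):

```python
import unicodedata

s = "\u00e9"  # precomposed 'é' (LATIN SMALL LETTER E WITH ACUTE)
nfd = unicodedata.normalize("NFD", s)
# NFD decomposes it into the base letter plus a combining mark
print([hex(ord(c)) for c in nfd])  # ['0x65', '0x301']
```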
b2915
CUDA: faster large batch FA without tensor cores (#7314)