Releases: teleprint-me/llama.cpp
Releases · teleprint-me/llama.cpp
b2082
Update README.md (#5366) Add some links to quantization related PRs
b2074
make: Use ccache for faster compilation (#5318) * make: Use ccache for faster compilation
b2061
flake.lock: Update Flake lock file updates: • Updated input 'flake-parts': 'github:hercules-ci/flake-parts/07f6395285469419cf9d078f59b5b49993198c00' (2024-01-11) → 'github:hercules-ci/flake-parts/b253292d9c0a5ead9bc98c4e9a26c6312e27d69f' (2024-02-01) • Updated input 'flake-parts/nixpkgs-lib': 'github:NixOS/nixpkgs/b0d36bd0a420ecee3bc916c91886caca87c894e9?dir=lib' (2023-12-30) → 'github:NixOS/nixpkgs/97b17f32362e475016f942bbdfda4a4a72a8a652?dir=lib' (2024-01-29) • Updated input 'nixpkgs': 'github:NixOS/nixpkgs/ae5c332cbb5827f6b1f02572496b141021de335f' (2024-01-25) → 'github:NixOS/nixpkgs/b8b232ae7b8b144397fdb12d20f592e5e7c1a64d' (2024-01-31)
b2058
make: add nvcc info print (#5310)
b2055
Vulkan Intel Fixes, Optimizations and Debugging Flags (#5301) * Fix Vulkan on Intel ARC Optimize matmul for Intel ARC Add Vulkan dequant test * Add Vulkan debug and validate flags to Make and CMakeLists.txt * Enable asynchronous transfers in Vulkan backend * Fix flake8 * Disable Vulkan async backend functions for now * Also add Vulkan run tests command to Makefile and CMakeLists.txt
b2050
perplexity : fix KL divergence calculations on Windows (#5273)
b2042
add --no-mmap in llama-bench (#5257) * add --no-mmap, show sycl backend * fix conflict * fix code format, change print for --no-mmap * ren no_mmap to mmap, show mmap when not default value in printer * update guide for mmap * mv position to reduce model reload
b2039
make : generate .a library for static linking (#5205)
b2035
llama : remove LLAMA_MAX_DEVICES and LLAMA_SUPPORTS_GPU_OFFLOAD (#5240) * llama : remove LLAMA_MAX_DEVICES from llama.h ggml-ci * Update llama.cpp Co-authored-by: slaren <[email protected]> * server : remove LLAMA_MAX_DEVICES ggml-ci * llama : remove LLAMA_SUPPORTS_GPU_OFFLOAD ggml-ci * train : remove LLAMA_SUPPORTS_GPU_OFFLOAD * readme : add deprecation notice * readme : change deprecation notice to "remove" and fix url * llama : remove gpu includes from llama.h ggml-ci --------- Co-authored-by: slaren <[email protected]>
b2029
Fix typos of IQ2_XXS and IQ3_XXS in llama.cpp (#5231)