Skip to content

Releases: teleprint-me/llama.cpp

b3922

15 Oct 19:54
755a9b2
Compare
Choose a tag to compare
llama : add infill sampler (#9896)

ggml-ci

b3907

12 Oct 04:59
9677640
Compare
Choose a tag to compare
ggml : move more prints to the ggml log system (#9839)

* ggml : move more prints to the ggml log system

* show BLAS OpenMP warnings in all builds using debug print

b3902

10 Oct 05:14
c81f3bb
Compare
Choose a tag to compare
cmake : do not build common library by default when standalone (#9804)

b3867

02 Oct 20:40
a39ab21
Compare
Choose a tag to compare
llama : reduce compile time and binary size (#9712)

* llama : speed up compile time

* fix build

* fix build (2)

b3615

22 Aug 06:51
1731d42
Compare
Choose a tag to compare
[SYCL] Add oneDNN primitive support (#9091)

* add onednn

* add sycl_f16

* add dnnl stream

* add engine map

* use dnnl for intel only

* use fp16fp16fp16

* update doc

b3581

14 Aug 01:10
06943a6
Compare
Choose a tag to compare
ggml : move rope type enum to ggml.h (#8949)

* ggml : move rope type enum to ggml.h

This commit moves the `llama_rope_type` enum from `llama.h` to
`ggml.h` and changes its name to `ggml_rope_type`.

The motivation for this change is to address the TODO in `llama.h` and
use the enum in ggml.

Note: This commit does not change the `mode` parameter to be of type
`enum ggml_rope_type`. The name `mode` and its usage suggest that it
might be more generic and possibly used as a bit field for multiple
flags. Further investigation/discussion may be needed to determine
if `mode` should be restricted to RoPE types.

* squash! ggml : move rope type enum to ggml.h

This commit removes GGML_ROPE_TYPE_NONE and GGML_ROPE_TYPE_GLM from
ggml.h, and back the llama_rope_type enum.

I've kept the assert for GGML_ROPE_TYPE_GLM as I'm not sure if it is
safe to remove it yet.

* squash! ggml : move rope type enum to ggml.h

This commit removes the enum ggml_rope_type from ggml.h and replaces it
with a define (GGML_ROPE_TYPE_NEOX). This define is used in the code to
check if the mode is set to GPT-NeoX. Also the enum llama_rope_type has
been updated to reflect this change.

* squash! ggml : move rope type enum to ggml.h

This commit contains a suggestion enable the GGML_ROPE_TYPE_NEOX
macro/define to be passed to the shader compiler.

* squash! ggml : move rope type enum to ggml.h

This commit fixes the editorconfig-checker warnings.

* squash! ggml : move rope type enum to ggml.h

Update comment for ggml_rope function.

* Revert "squash! ggml : move rope type enum to ggml.h"

This reverts commit 6261222bd0dc0efd51f0fb0435ad3f16a5b52fd6.

* squash! ggml : move rope type enum to ggml.h

Add GGML_ROPE_TYPE_NEOX to rope_common.comp.

* remove extra line

---------

Co-authored-by: slaren <[email protected]>

b3579

12 Aug 17:50
fc4ca27
Compare
Choose a tag to compare
ci : fix github workflow vulnerable to script injection (#9008)

Signed-off-by: Diogo Teles Sant'Anna <[email protected]>

b3576

12 Aug 16:19
84eb2f4
Compare
Choose a tag to compare
docs: introduce gpustack and gguf-parser (#8873)

* readme: introduce gpustack

GPUStack is an open-source GPU cluster manager for running large
language models, which uses llama.cpp as the backend.

Signed-off-by: thxCode <[email protected]>

* readme: introduce gguf-parser

GGUF Parser is a tool to review/check the GGUF file and estimate the
memory usage without downloading the whole model.

Signed-off-by: thxCode <[email protected]>

---------

Signed-off-by: thxCode <[email protected]>

b3565

11 Aug 03:58
6e02327
Compare
Choose a tag to compare
metal : fix uninitialized abort_callback (#8968)

b3561

09 Aug 23:59
b72942f
Compare
Choose a tag to compare
Merge commit from fork