Releases · teleprint-me/llama.cpp

15 Oct 19:54

755a9b2

b3922

llama : add infill sampler (#9896)

ggml-ci

12 Oct 04:59

github-actions

b3907

9677640

ggml : move more prints to the ggml log system (#9839)

* ggml : move more prints to the ggml log system

* show BLAS OpenMP warnings in all builds using debug print

10 Oct 05:14

github-actions

b3902

c81f3bb

cmake : do not build common library by default when standalone (#9804)

02 Oct 20:40

github-actions

b3867

a39ab21

llama : reduce compile time and binary size (#9712)

* llama : speed up compile time

* fix build

* fix build (2)

22 Aug 06:51

github-actions

b3615

1731d42

[SYCL] Add oneDNN primitive support (#9091)

* add onednn

* add sycl_f16

* add dnnl stream

* add engine map

* use dnnl for intel only

* use fp16fp16fp16

* update doc

14 Aug 01:10

github-actions

b3581

06943a6

ggml : move rope type enum to ggml.h (#8949)

* ggml : move rope type enum to ggml.h

This commit moves the `llama_rope_type` enum from `llama.h` to
`ggml.h` and changes its name to `ggml_rope_type`.

The motivation for this change is to address the TODO in `llama.h` and
use the enum in ggml.

Note: This commit does not change the `mode` parameter to be of type
`enum ggml_rope_type`. The name `mode` and its usage suggest that it
might be more generic and possibly used as a bit field for multiple
flags. Further investigation/discussion may be needed to determine
if `mode` should be restricted to RoPE types.

* squash! ggml : move rope type enum to ggml.h

This commit removes GGML_ROPE_TYPE_NONE and GGML_ROPE_TYPE_GLM from
ggml.h, and brings back the llama_rope_type enum.

I've kept the assert for GGML_ROPE_TYPE_GLM as I'm not sure if it is
safe to remove it yet.

* squash! ggml : move rope type enum to ggml.h

This commit removes the enum ggml_rope_type from ggml.h and replaces it
with a define (GGML_ROPE_TYPE_NEOX). This define is used in the code to
check if the mode is set to GPT-NeoX. Also the enum llama_rope_type has
been updated to reflect this change.

* squash! ggml : move rope type enum to ggml.h

This commit contains a suggestion to enable the GGML_ROPE_TYPE_NEOX
macro/define to be passed to the shader compiler.

* squash! ggml : move rope type enum to ggml.h

This commit fixes the editorconfig-checker warnings.

* squash! ggml : move rope type enum to ggml.h

Update comment for ggml_rope function.

* Revert "squash! ggml : move rope type enum to ggml.h"

This reverts commit 6261222bd0dc0efd51f0fb0435ad3f16a5b52fd6.

* squash! ggml : move rope type enum to ggml.h

Add GGML_ROPE_TYPE_NEOX to rope_common.comp.

* remove extra line

---------

Co-authored-by: slaren
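
The end state described in the commit message above (a define in place of an enum, with `mode` still treated as a possible bit field) can be sketched roughly as follows. This is only an illustration: the value `2` is an assumption and should be checked against `ggml.h`, and `example_rope_type` and `rope_mode_is_neox` are hypothetical names that are not part of ggml or llama.cpp.

```c
#include <stdbool.h>

/* Assumed value for illustration; consult ggml.h for the real definition. */
#define GGML_ROPE_TYPE_NEOX 2

/* Hypothetical enum showing how a higher-level rope type enum could
 * reuse the ggml define instead of duplicating the constant. */
enum example_rope_type {
    EXAMPLE_ROPE_TYPE_NORM = 0,
    EXAMPLE_ROPE_TYPE_NEOX = GGML_ROPE_TYPE_NEOX,
};

/* Hypothetical helper: `mode` stays a plain int and is tested as a bit
 * field, so other flag bits may be set alongside the NeoX bit. */
static bool rope_mode_is_neox(int mode) {
    return (mode & GGML_ROPE_TYPE_NEOX) != 0;
}
```

Testing the bit rather than comparing for equality matches the commit's note that `mode` may carry several flags. A negative sentinel such as `-1` would also pass this test, so any "none" value would need to be handled before the check.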

12 Aug 17:50

github-actions

b3579

fc4ca27

ci : fix github workflow vulnerable to script injection (#9008)

Signed-off-by: Diogo Teles Sant'Anna

12 Aug 16:19

github-actions

b3576

84eb2f4

docs: introduce gpustack and gguf-parser (#8873)

* readme: introduce gpustack

GPUStack is an open-source GPU cluster manager for running large
language models, which uses llama.cpp as the backend.

Signed-off-by: thxCode

* readme: introduce gguf-parser

GGUF Parser is a tool for inspecting a GGUF file and estimating its
memory usage without downloading the whole model.

Signed-off-by: thxCode

---------

Signed-off-by: thxCode

11 Aug 03:58

github-actions

b3565

6e02327

metal : fix uninitialized abort_callback (#8968)

09 Aug 23:59

github-actions

b3561

b72942f

Merge commit from fork
