Releases · teleprint-me/llama.cpp
b4431
llama-run : fix context size (#11094)

Set `n_ctx` equal to `n_batch` in the `Opt` class. The context size is now a more reasonable 2048.

Signed-off-by: Eric Curtin <[email protected]>
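For context, a minimal sketch of the idea using the public llama.cpp API (this is not the actual llama-run `Opt` code; the helper below is illustrative):

```cpp
#include "llama.h"

// Illustrative sketch only: mirrors the fix's intent of keeping the
// context size in sync with the batch size. The real change lives in
// llama-run's Opt class; this just uses the public context params.
llama_context_params make_ctx_params() {
    llama_context_params params = llama_context_default_params();
    params.n_batch = 2048;           // logical batch size
    params.n_ctx   = params.n_batch; // context size follows the batch size -> 2048
    return params;
}
```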
b4404
ggml : fixes for AVXVNNI instruction set with MSVC and Clang (#11027)

* Fixes for Clang AVX VNNI
* Enable AVX VNNI and Alder Lake build for MSVC
* Apply suggestions from code review

Co-authored-by: slaren <[email protected]>
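The MSVC/Clang differences largely come down to how the AVX-VNNI instructions are spelled and gated. A rough sketch of the pattern, assuming GCC/Clang's AVX-VNNI intrinsic spelling versus the AVX512-VL spelling that MSVC also accepts (the helper name is made up):

```cpp
#include <immintrin.h>

// Hypothetical helper, not ggml code: accumulate an unsigned-by-signed
// int8 dot product into 32-bit lanes with VNNI.
static inline __m256i mad_u8s8(__m256i acc, __m256i u8, __m256i s8) {
#if defined(__AVXVNNI__)
    // GCC/Clang spelling for the AVX-VNNI (non-AVX512) encoding.
    return _mm256_dpbusd_avx_epi32(acc, u8, s8);
#else
    // AVX512-VNNI + AVX512-VL spelling; MSVC accepts this name for the
    // AVX-VNNI encoding as well.
    return _mm256_dpbusd_epi32(acc, u8, s8);
#endif
}
```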
b4381
llama : support InfiniAI Megrez 3b (#10893)

* Support InfiniAI Megrez 3b
* Fix tokenizer_clean_spaces for megrez
b4349
tests: add tests for GGUF (#10830)
b4334
llava : Allow locally downloaded models for QwenVL (#10833)

* Allow locally downloaded models for QwenVL
* Define model_path
* Remove trailing space

Co-authored-by: Xuan Son Nguyen <[email protected]>
b4318
ggml : Fix compilation issues on ARM platform when building without f…
b4302
ggml: load all backends from a user-provided search path (#10699)

* feat: load all backends from a user-provided search path
* fix: Windows search path
* refactor: rename `ggml_backend_load_all_in_search_path` to `ggml_backend_load_all_from_path`
* refactor: rename `search_path` to `dir_path`
* fix: change `NULL` to `nullptr`

Co-authored-by: Diego Devesa <[email protected]>
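The renamed entry point, `ggml_backend_load_all_from_path`, scans a caller-chosen directory for backend shared libraries. A small usage sketch (the directory path is made up; error handling omitted):

```cpp
#include <cstdio>
#include "ggml-backend.h"

int main() {
    // Load every ggml backend found in this directory (path is
    // illustrative) instead of the default search locations.
    ggml_backend_load_all_from_path("./my-backends");

    // List what got registered.
    for (size_t i = 0; i < ggml_backend_reg_count(); i++) {
        ggml_backend_reg_t reg = ggml_backend_reg_get(i);
        printf("backend: %s\n", ggml_backend_reg_name(reg));
    }
    return 0;
}
```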
b4229
readme : remove old badge
b4215
ggml : remove redundant copyright notice + update authors
b4201
Add some minimal optimizations for CDNA (#10498)

* Add some minimal optimizations for CDNA
* ggml_cuda: set launch bounds also for GCN, as it helps there too
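Launch bounds hint the occupancy the compiler should target when allocating registers per kernel. An illustrative CUDA C++ fragment, not the actual ggml-cuda kernels (kernel name and numbers are made up; under HIP the same attribute applies to GCN/CDNA):

```cpp
// Illustrative only: cap threads-per-block at 256 and request at least
// 2 resident blocks per SM/CU, so the compiler limits register usage
// accordingly. ggml's CUDA/HIP backend sets bounds like this per kernel.
__global__ void __launch_bounds__(256, 2) scale_kernel(float * x, float s, int n) {
    const int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        x[i] *= s;
    }
}
```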