[pull] master from ggerganov:master #157

pull · 2024-11-29T04:12:03Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.1)

This is an incremental improvement over #9118 to get work to the GPU a bit sooner. The first part is to start with a smaller number of nodes before the first submit, and ramp it up to the current 100 nodes/submit. The second part is to reduce the dryrun overhead for all the nodes that just need to request descriptor space. With these changes I get around 1-2% speedup on RTX 4070 combined with my old Haswell-era CPU.
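The ramp-up described above can be illustrated with a minimal sketch. This is not the code from the PR: the helper name `plan_submits`, the starting batch of 25 nodes, and the doubling schedule are assumptions chosen for illustration. Only the cap of 100 nodes per submit comes from the description.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of ggml): plans where command buffers would be
// submitted for a graph of n_nodes nodes. Each entry is the index one past the
// last node of a submit batch. The first batch is small so the GPU starts
// working early; later batches grow until they reach max_batch.
// Assumptions: first_batch = 25 and the doubling schedule are illustrative.
std::vector<std::size_t> plan_submits(std::size_t n_nodes,
                                      std::size_t first_batch = 25,
                                      std::size_t max_batch = 100) {
    // Guard against zero-sized batches, which would never make progress.
    max_batch = std::max<std::size_t>(max_batch, 1);
    std::size_t batch = std::min(std::max<std::size_t>(first_batch, 1), max_batch);

    std::vector<std::size_t> boundaries;
    std::size_t start = 0;
    while (start < n_nodes) {
        const std::size_t end = std::min(start + batch, n_nodes);
        boundaries.push_back(end);               // submit nodes [start, end)
        start = end;
        batch = std::min(batch * 2, max_batch);  // ramp up toward the cap
    }
    return boundaries;
}
```

With these assumed defaults, a 300-node graph would be submitted in batches of 25, 50, 100, 100 and 25 nodes, so the first submit happens after 25 nodes rather than 100.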

* subgroup 64 version with subgroup add. 15% faster scalable version tested for subgroup sizes 16-128
* check for subgroup multiple of 16 and greater than 16
* subgroup sizes are always a power of 2 (KhronosGroup/GLSL#45)
* force 16 sequential threads per block
* make 16 subgroup size a constant

llava: return false instead of exit (#10546)

678d799

pull bot added the ⤵️ pull label Nov 29, 2024

github-actions bot added the examples label Nov 29, 2024

github-actions bot added ggml Vulkan labels Nov 29, 2024

noemotiovon and others added 2 commits November 29, 2024 14:46

CANN: RoPE operator optimization (#10563)

938f608

* [cann] RoPE operator optimization
* [CANN]Code Formatting

---------

Co-authored-by: noemotiovon <[email protected]>

sycl : Reroute permuted mul_mats through oneMKL (#10408)

266b851

This PR fixes the failing MUL_MAT tests for the sycl backend.

github-actions bot added the SYCL label Nov 29, 2024

Alcpz and others added 3 commits November 29, 2024 20:38

sycl : offload of get_rows set to 0 (#10432)

0f77aae

ggml-cpu: fix typo in gemv/gemm iq4_nl_4_4 (#10580)

4b3242b

ggml : fix I8MM Q4_1 scaling factor conversion (#10562)

f0678c5

ggml-ci

github-actions bot added the testing label Nov 29, 2024

slaren and others added 3 commits November 29, 2024 17:45

cleanup UI link list (#10577)

a3a3048

* cleanup UI link list
* sort list alphabetically
* add missing licenses

imatrix : support combine-only (#10492)

3a8e9af

* imatrix-combine-only idea
* ensured that behavior consistent with log

server : add more test cases (#10569)

b782e5c

* server : add split model test
* add test speculative
* add invalid cases

github-actions bot added python server labels Nov 29, 2024

ggml : move AMX to the CPU backend (#10570)

7cc2d2c

* ggml : move AMX to the CPU backend

---------

Co-authored-by: Georgi Gerganov <[email protected]>

github-actions bot added the devops label Nov 29, 2024

netrunnereve and others added 3 commits November 30, 2024 08:00

readme : refresh (#10587)

abadba0

* readme : refresh
* readme : move section [no ci]
* readme : clarify [no ci]
* readme : fixes [no ci]
* readme : more fixes [no ci]
* readme : simplify [no ci]
* readme : clarify GGUF

readme : remove old badge

3e0ba0e

teleprint-me closed this Nov 30, 2024
