[pull] master from ggerganov:master #19

pull · 2024-01-14T13:15:20Z

See Commits and Changes for more details.



* imatrix: load
* imatrix: WIP
* imatrix: Add Q2_K quantization
* imatrix: also guard against Q2_K_S quantization without importance matrix
* imatrix: guard even more against low-bit quantization misuse

---------

Co-authored-by: Iwan Kawrakow <[email protected]>


* Fix ffn_down quantization mix for MoE models

  In #4872 I did not consider the part where every third tensor is quantized with more bits. For MoE this leads to tensors of the same layer being quantized with different numbers of bits, which the inference implementation does not handle (it assumes all experts use the same quantization).
* Fix the fix
* Review suggestion

---------

Co-authored-by: Iwan Kawrakow <[email protected]>
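The problem described in this commit message can be illustrated with a small sketch. It is not the llama.cpp code: the function names, the exact "more bits" rule, and the layer and expert counts below are assumptions chosen for illustration. It only shows how an "every third tensor" rule driven by a per-tensor counter can give experts of one layer different bit widths, while a per-layer index cannot.

```python
def use_more_bits(i: int, n: int) -> bool:
    # Hypothetical rule (an assumption, not taken from the source):
    # first and last eighth get more bits, plus every third index in between.
    return i < n // 8 or i >= 7 * n // 8 or (i - n // 8) % 3 == 2


def ffn_down_plan(n_layers: int, n_expert: int, per_tensor_counter: bool):
    """For each layer, list the more-bits decision of every expert's ffn_down tensor."""
    plan = []
    counter = 0  # running index over all ffn_down tensors, across experts
    for layer in range(n_layers):
        row = []
        for _ in range(n_expert):
            if per_tensor_counter:
                # Problematic variant: the index advances once per expert tensor.
                row.append(use_more_bits(counter, n_layers * n_expert))
            else:
                # Per-layer variant: all experts of a layer share one decision.
                row.append(use_more_bits(layer, n_layers))
            counter += 1
        plan.append(row)
    return plan


def mixed_layers(plan):
    """Layers whose experts did not all receive the same decision."""
    return [i for i, row in enumerate(plan) if len(set(row)) > 1]


# Illustrative sizes only (32 layers, 8 experts).
print(mixed_layers(ffn_down_plan(32, 8, per_tensor_counter=True))[:3])
print(mixed_layers(ffn_down_plan(32, 8, per_tensor_counter=False)))
```

With the per-tensor counter, some layers end up with a mix of decisions across their experts; with the per-layer index, every layer is uniform, which matches the assumption the commit message attributes to the inference implementation.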



ikawrakow and others added 8 commits January 14, 2024 09:44

Make Q3_K_S be the same as old Q3_K_L for Mixtral-8x7B (#4906)

807179e

Co-authored-by: Iwan Kawrakow <[email protected]>

llama : support WinXP build with MinGW 8.1.0 (#3419)

ac32902

metal : correctly set SIMD support flags on iOS (#4923)

5f5fe1b

* Correctly set support_simdgroup_reduction and support_simdgroup_mm on iPhone/iPad
* log a little bit more info on iOS

llama : use LLAMA_LOG_ macros for logging

03c5267

scripts : sync-ggml-am.sh option to skip commits

9408cfd

llama : check LLAMA_TRACE env for extra logging (#4929)

bb0c139

* llama : minor fix indent
* llama : check LLAMA_TRACE env for extra logging

ggml-ci

pull bot added the ⤵️ pull label Jan 14, 2024

ikawrakow and others added 2 commits January 14, 2024 16:21

Add ability to use importance matrix for all k-quants (#4930)

467a882

Co-authored-by: Iwan Kawrakow <[email protected]>

llama : fix missing quotes (#4937)

a836c8f

teleprint-me closed this Jan 14, 2024
