unsupported op 'MUL_MAT' #4998

Closed
NeevJewalkar opened this issue Jan 17, 2024 · 12 comments

Comments

@NeevJewalkar

ggml_metal_graph_compute_block_invoke: error: unsupported op 'MUL_MAT'
GGML_ASSERT: ggml-metal.m:779: !"unsupported op"

System: MacBook Air (Intel)
Happens when I try to run phi-2.

@XieWeikai

I encountered the same problem

@ggerganov
Owner

Likely your device is missing the Apple7 family feature set (more info: #4794)

Show the logs of ggml_metal_init to confirm:

ggml_metal_init: allocating
ggml_metal_init: found device: Apple M2 Ultra
ggml_metal_init: picking default device: Apple M2 Ultra
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: GGML_METAL_PATH_RESOURCES = nil
ggml_metal_init: loading '/Users/ggerganov/development/github/llama.cpp/ggml-metal.metal'
ggml_metal_init: GPU name:   Apple M2 Ultra
ggml_metal_init: GPU family: MTLGPUFamilyApple8  (1008)
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3  (5001)
ggml_metal_init: simdgroup reduction support   = true      <---- this is required for llama.cpp
ggml_metal_init: simdgroup matrix mul. support = true
ggml_metal_init: hasUnifiedMemory              = true
ggml_metal_init: recommendedMaxWorkingSetSize  = 154618.82 MB

@NeevJewalkar
Author

NeevJewalkar commented Jan 18, 2024

yep, that value is set to false:

ggml_metal_init: allocating
ggml_metal_init: found device: Intel(R) UHD Graphics 617
ggml_metal_init: picking default device: Intel(R) UHD Graphics 617
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: GGML_METAL_PATH_RESOURCES = nil
ggml_metal_init: loading '/Users/neevjewalkar/Documents/Dev/llama/llama.cpp/ggml-metal.metal'
ggml_metal_init: GPU name:   Intel(R) UHD Graphics 617
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: simdgroup reduction support   = false            <----
ggml_metal_init: simdgroup matrix mul. support = false
ggml_metal_init: hasUnifiedMemory              = true
ggml_metal_init: recommendedMaxWorkingSetSize  =  1610.61 MB

is there any fix?

@MBarti

MBarti commented Jan 18, 2024

Same problem on a MacBook Pro 2018, 16 GB (Intel, AMD Radeon Pro 555X)
ML Model: mistral-7b-dpo-v5.Q6_K.gguf

ggml_metal_graph_compute_block_invoke: error: unsupported op 'MUL_MAT'
GGML_ASSERT: ggml-metal.m:779: !"unsupported op"
Abort trap: 6

also:
ggml_metal_graph_compute_block_invoke: error: unsupported op 'RMS_NORM'

@ggerganov
Owner

The only way is to implement the respective Metal kernels without using simd_ calls. It's not very difficult, but I don't plan on officially supporting it as it will increase the Metal code by a lot and I'm not convinced it will result in significant gains compared to CPU-only for these machines.

If somebody implements the kernels, we can put them in ggml-metal-intel.metal and have them built as a separate backend for Intel machines.
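
For illustration, a minimal sketch of what such a simdgroup-free kernel could look like (hypothetical code, not part of ggml; the real kernels also handle quantized types, broadcasting and non-contiguous tensors): one thread per output element, accumulating with a plain scalar loop instead of simd_* matrix/reduction intrinsics.

#include <metal_stdlib>
using namespace metal;

// hypothetical naive f32 mat-mul: dst[M x N] = src0[M x K] * src1[K x N]
// one thread computes one element of dst, no simd_* calls anywhere
kernel void kernel_mul_mat_f32_naive(
        device const float * src0 [[buffer(0)]],
        device const float * src1 [[buffer(1)]],
        device       float * dst  [[buffer(2)]],
        constant       uint & M   [[buffer(3)]],
        constant       uint & N   [[buffer(4)]],
        constant       uint & K   [[buffer(5)]],
        uint2 tpig [[thread_position_in_grid]]) {
    const uint row = tpig.y;
    const uint col = tpig.x;
    if (row >= M || col >= N) {
        return;
    }

    // plain scalar accumulation; runs on MTLGPUFamilyCommon3 devices,
    // but much slower than the simdgroup path
    float sum = 0.0f;
    for (uint k = 0; k < K; ++k) {
        sum += src0[row*K + k] * src1[k*N + col];
    }

    dst[row*N + col] = sum;
}

Each unsupported op (MUL_MAT, RMS_NORM, ...) would need a similar fallback, which is why the Metal code would grow considerably.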

@NeevJewalkar
Author

NeevJewalkar commented Jan 20, 2024

From what I understood, this error occurs because all of this is running on the GPU. This may seem dumb, but how do I run llama.cpp on the CPU instead?
Edit: I tried running the model on LangChain using llamacpp and it works, so why doesn't it work when I try to run the model using llama.cpp in the terminal?

@ggerganov
Owner

Most of the examples support the -ngl 0 argument, which makes llama.cpp not use the GPU
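
For example, a typical CPU-only invocation would look something like this (the main example binary and the model path below are just placeholders):

./main -m ./models/phi-2.Q4_K_M.gguf -p "Hello" -ngl 0

With -ngl 0 no layers are offloaded, so the missing Metal kernels are never reached.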

@0xez

0xez commented Feb 13, 2024

Most of the examples support the -ngl 0 argument, which makes llama.cpp not use the GPU

This worked for me, thank you!

I was running llama-2-7b-chat.Q4_K_M on my MacBook Pro 2016 (8 GB RAM) and got an error saying:
ggml_metal_graph_compute_block_invoke: error: unsupported op 'RMS_NORM'
The -ngl 0 param solved the problem.

It is running successfully now, but very slowly: it takes almost 2~3 minutes to predict each word. I could see very high I/O load via the iostat command, and the CPU sys time was high too, which means it was swapping data between disk and memory.

Conclusion: I definitely need a new MacBook!

@github-actions
Contributor

This issue is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale label Mar 18, 2024
Contributor

github-actions bot commented Apr 3, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.

@github-actions github-actions bot closed this as completed Apr 3, 2024
@lalitya-sawant

Passing the -ngl 0 param solved the problem.

@Umkus

Umkus commented Jun 18, 2024

This used to help me on my mid-2015 MacBook, but no more 😢
