Unit test "test-backend-ops" crashed on macOS #4672
Comments
There is a lot of missing output that would hint at the issue. I assume this is because the buffer allocation failed. I will add more checks so that these cases are detected and reported instead of crashing, but actually fixing this would require someone with an Intel Mac to figure out what the issue is. |
LastTest.log |
Thank you. The out-of-memory issue in the MoE test is not really a concern: it requires a larger buffer than can be allocated on your system. The log also shows that many |
So, to debug this issue, should I look at the first MUL_MAT failure? |
All the failed |
@nguoithichkhampha Please check out #4794 and try again:
make clean
make -j tests && ./tests/test-backend-ops -b Metal
If the matrix multiplication tests continue to fail, please run the following and post the output:
MTL_DEBUG_LAYER=1 ./tests/test-backend-ops -b Metal |
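For anyone who wants to attach the full output to the issue, here is a minimal sketch of a wrapper around the debug command above. It assumes it is run from the llama.cpp build directory; the helper names and the log file name are made up for illustration and are not part of the project.

```python
import os
import subprocess


def debug_run_spec(binary="./tests/test-backend-ops", backend="Metal"):
    """Return (argv, env) for a run with the Metal debug layer enabled."""
    env = dict(os.environ)
    env["MTL_DEBUG_LAYER"] = "1"  # same variable as in the command quoted above
    return [binary, "-b", backend], env


def run_and_save(log_path="test-backend-ops-metal.log"):
    """Run the test binary and write stdout and stderr to one file.

    log_path is a hypothetical name chosen for this sketch.
    """
    argv, env = debug_run_spec()
    with open(log_path, "w") as log:
        proc = subprocess.run(argv, env=env, stdout=log, stderr=subprocess.STDOUT)
    return proc.returncode
```

Calling `run_and_save()` should leave a single file that can be attached to a comment. Merging stderr into stdout keeps the Metal validation messages in order with the test output.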
test-metal-backend.txt
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
It seems there is an assertion from the OS that prevents allocating a buffer larger than |
I think this makes sense when my GPU only |
Yes, the MOE test is expected to fail due to out of memory - that's not a big concern.
https://developer.apple.com/metal/Metal-Feature-Set-Tables.pdf
However, we currently fail to detect that:
Backend 2/2 (Metal)
ggml_metal_init: allocating
2024-01-07 17:36:34.077 test-backend-ops[2294:105408] Metal API Validation Enabled
ggml_metal_init: found device: Intel(R) Iris(TM) Plus Graphics 650
ggml_metal_init: picking default device: Intel(R) Iris(TM) Plus Graphics 650
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: GGML_METAL_PATH_RESOURCES = nil
ggml_metal_init: loading '/Users/Emotiv/llama.cpp/build/bin/ggml-metal.metal'
ggml_metal_init: GPU name: Intel(R) Iris(TM) Plus Graphics 650
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: simdgroup reduction support = false
ggml_metal_init: hasUnifiedMemory = true
ggml_metal_init: recommendedMaxWorkingSetSize = 1610.61 MB
ggml_metal_init: maxTransferRate = built-in GPU
There should be a log message stating:
ggml_metal_init: GPU family: MTLGPUFamilyMetal3 (5001)
I just pushed another change to #4794 that would hopefully fix this. |
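To compare logs from different machines, a small parser can pull the capability fields out of the `ggml_metal_init` lines. The sketch below is only an illustration: the sample text is copied from the log quoted in this thread, while the function names and the idea of comparing a requested size against `recommendedMaxWorkingSetSize` are assumptions, not how ggml itself decides whether an allocation will succeed.

```python
import re

# Sample lines copied from the log quoted in this thread.
SAMPLE_LOG = """\
ggml_metal_init: found device: Intel(R) Iris(TM) Plus Graphics 650
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: simdgroup reduction support = false
ggml_metal_init: hasUnifiedMemory = true
ggml_metal_init: recommendedMaxWorkingSetSize = 1610.61 MB
"""


def parse_metal_init(log_text):
    """Collect capability fields printed with the ggml_metal_init prefix."""
    info = {"gpu_families": []}
    for line in log_text.splitlines():
        if not line.startswith("ggml_metal_init:"):
            continue
        body = line.split(":", 1)[1].strip()
        m = re.match(r"GPU family:\s+(\w+)\s+\((\d+)\)", body)
        if m:
            info["gpu_families"].append((m.group(1), int(m.group(2))))
            continue
        m = re.match(r"simdgroup reduction support\s+=\s+(true|false)", body)
        if m:
            info["simdgroup_reduction"] = m.group(1) == "true"
            continue
        m = re.match(r"recommendedMaxWorkingSetSize\s+=\s+([\d.]+) MB", body)
        if m:
            info["max_working_set_mb"] = float(m.group(1))
    return info


def fits_working_set(info, requested_mb):
    """Rough check (an assumption): is the request within the reported working set?"""
    limit = info.get("max_working_set_mb")
    return limit is not None and requested_mb <= limit
```

For example, `fits_working_set(parse_metal_init(SAMPLE_LOG), 4096)` returns `False` for a hypothetical 4096 MB request, since the log reports 1610.61 MB. The working set size is a recommendation reported by the driver, so treat the result as a hint rather than a guarantee.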
Tried with the latest commit. I see the message
|
Thanks! I think it should work now. When you get the chance - please give it another try with the latest version and if it fails, post the output again. It will be more verbose now |
OK, I read your code change. It seems that my GPU does not support mul_mat. |
Yup, it is unexpected that the |
I'm using macOS 13.6 (Intel chip). Here is the stack trace