Build Llama.cpp with Vulkan for Android Device (Magic Leap 2). #8874
-
@XinyuGroceryStore I have built this program successfully for Android and run it on my Android phone. Here are two key points: There are still other problems, though: on Qualcomm Adreno GPUs the program crashes, and on ARM Mali GPUs it is even slower than on the CPU. I hope my answer helps.
-
If you just want it to work, use Termux: pkg install vulkan-loader-android vulkan-headers glfw. Check vulkaninfo to see your device's info. Download and install the aarch64 glslang from the AUR site. Now you can build the Vulkan branch directly on your phone. That said, OpenBLAS is twice as fast as Vulkan with my driver (Mali G57).
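The Termux route above could be scripted roughly as follows. This is a minimal sketch, not a tested recipe: the function name `build_llama_vulkan_termux` is hypothetical, the checkout is assumed to be in `./llama.cpp`, `glslc` is assumed to be on `PATH` already (from the glslang step), and `-DGGML_VULKAN=ON` is assumed to be the CMake switch used by the tree being built.

```shell
# Sketch of the Termux steps; run inside Termux on the device.
# Assumptions: ./llama.cpp is an existing checkout and glslc is already installed.
build_llama_vulkan_termux() {
    # Packages named in the comment above.
    pkg install -y vulkan-loader-android vulkan-headers glfw || return 1
    # Confirm the loader can see a Vulkan device before spending time on a build.
    vulkaninfo || return 1
    # Configure with the Vulkan backend enabled (flag name assumed for a recent tree).
    cmake -S llama.cpp -B llama.cpp/build -DGGML_VULKAN=ON || return 1
    # Compile everything that was configured.
    cmake --build llama.cpp/build --config Release
}
```

Calling `build_llama_vulkan_termux` from the directory that contains the checkout runs the steps in order and stops at the first one that fails.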
-
Thank you. I chose Vulkan because it seemed to let llama.cpp use GPUs
on both Android and iOS.
Does OpenBLAS offer the same flexibility?
On Mon, Jan 6, 2025, 8:31 a.m. FNsi ***@***.***> wrote:
If you just want it to work, use Termux:
pkg install vulkan-loader-android vulkan-headers clvk glfw
Check vulkaninfo to see your device's info.
Then compile and install glslang from source.
Now you can build the Vulkan branch directly on your phone.
That said, OpenBLAS is twice as fast as Vulkan with my driver (Mali G57).
-
I succeeded in building llama.cpp for Magic Leap 2 by following the instructions for building on Android. Magic Leap 2 is an Android device with an x86-64 CPU. Commands below:
Then I ran
ninja
to generate the binary files. After that, I used Android Debug Bridge (ADB) to copy the .so and binary files to the Magic Leap 2 and used adb shell to execute llama-cli.
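The copy-and-run step could look something like the sketch below. The function name `push_and_run`, the remote directory `/data/local/tmp/llama`, the prompt, and the token count are all illustrative assumptions; the local paths of `llama-cli` and the `.so` files depend on the build tree and are passed in by the caller.

```shell
# usage: push_and_run MODEL_FILE FILE...   (FILE... = llama-cli plus any .so files)
# Sketch only: the remote directory and the llama-cli arguments are assumptions.
push_and_run() {
    run_model="$1"; shift
    run_remote=/data/local/tmp/llama          # writable, executable location on most devices
    run_model_name="$(basename "$run_model")"
    adb shell "mkdir -p $run_remote" || return 1
    # Push the model, the executable, and the shared libraries into one directory.
    for run_file in "$run_model" "$@"; do
        adb push "$run_file" "$run_remote/" || return 1
    done
    # LD_LIBRARY_PATH=. lets llama-cli find the .so files pushed next to it.
    adb shell "cd $run_remote && chmod +x llama-cli && LD_LIBRARY_PATH=. ./llama-cli -m $run_model_name -p 'Hello' -n 32"
}
```

For example, `push_and_run models/model.gguf build/bin/llama-cli build/bin/libllama.so` would push three files and start a short generation, assuming the binaries really are in `build/bin`.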
All of the above steps are feasible and work well.
Now I plan to accelerate inference on the Magic Leap 2's GPU with Vulkan, so I ran the commands below:
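For readers reproducing this step, a cross-compile configure for an x86-64 Android target generally has the shape sketched here. It is an illustration and not necessarily the exact command used in this post: `configure_android` is a hypothetical helper, `android-29` is an assumed API level that should be adjusted to the device, and `GGML_VULKAN` is assumed to be the backend switch for the tree in use. The toolchain file path is the standard one shipped inside the Android NDK.

```shell
# usage: configure_android NDK_DIR [ON|OFF]
# Sketch only: API level and flag name are assumptions, not taken from the post.
configure_android() {
    cfg_ndk="$1"
    cfg_vulkan="${2:-OFF}"                    # second argument toggles the Vulkan backend
    cmake -S . -B build -G Ninja \
        -DCMAKE_TOOLCHAIN_FILE="$cfg_ndk/build/cmake/android.toolchain.cmake" \
        -DANDROID_ABI=x86_64 \
        -DANDROID_PLATFORM=android-29 \
        -DGGML_VULKAN="$cfg_vulkan"
}
```

Running `configure_android "$ANDROID_NDK" ON` from the llama.cpp source directory would configure a Vulkan-enabled build into `build`, after which `ninja` is run there; `ANDROID_NDK` is assumed to point at the NDK root.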
Log shown below:
-- The C compiler identification is Clang 18.0.1
-- The CXX compiler identification is Clang 18.0.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: C:/Users/Xinyu/AppData/Local/Android/Sdk/ndk/27.0.12077973/toolchains/llvm/prebuilt/windows-x86_64/bin/clang.exe - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: C:/Users/Xinyu/AppData/Local/Android/Sdk/ndk/27.0.12077973/toolchains/llvm/prebuilt/windows-x86_64/bin/clang++.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: C:/Users/Xinyu/Git/cmd/git.exe (found version "2.45.2.windows.1")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Found Threads: TRUE
-- Found OpenMP_C: -fopenmp=libomp
-- Found OpenMP_CXX: -fopenmp=libomp
-- Found OpenMP: TRUE
-- OpenMP found
-- Using llamafile
-- Found Vulkan: C:/VulkanSDK/1.3.283.0/Lib/vulkan-1.lib (found version "1.3.283") found components: glslc glslangValidator
-- Vulkan found
-- ccache found, compilation results will be cached. Disable with GGML_CCACHE=OFF.
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - not found
-- Configuring done (6.1s)
-- Generating done (0.3s)
-- Build files have been written to: C:/Users/Xinyu/llama-adb-vulkan/llama.cpp/build
Then I ran
ninja
The build stopped for the reason below:
If you have any ideas about how to build this project, please share them with me. Any suggestions on how to accelerate inference with the GPU on Magic Leap are also welcome.
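One thing that may be worth checking in the configure log above is the line `Found Vulkan: C:/VulkanSDK/1.3.283.0/Lib/vulkan-1.lib`: a `.lib` file is a Windows import library from the desktop SDK, whereas an Android target links against a `libvulkan.so`. Whether this is what stopped the build here is an assumption, not something the post confirms. The hypothetical helper below just flags that pattern in a saved CMake log.

```shell
# usage: check_vulkan_lib CMAKE_LOG_FILE
# Sketch: reports which Vulkan library CMake picked and flags a Windows .lib.
check_vulkan_lib() {
    # Take the path printed right after "Found Vulkan:" (first match only).
    chk_path="$(sed -n 's/^-- Found Vulkan: \([^ ]*\).*/\1/p' "$1" | head -n 1)"
    case "$chk_path" in
        "")    echo "no Vulkan library reported"; return 2 ;;
        *.lib) echo "host (Windows) library: $chk_path"; return 1 ;;
        *)     echo "ok: $chk_path"; return 0 ;;
    esac
}
```

If the helper reports a host library, one option to try is pointing CMake's `Vulkan_LIBRARY` cache variable at the `libvulkan.so` shipped in the NDK sysroot and reconfiguring in a clean build directory.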