When cross-compiling for Android using the NDK toolchain, Flash Attention fails to build in CPU-only mode but succeeds when the Vulkan backend is enabled, despite being documented as a CPU-only feature.
Environment:

- Android NDK: 28.0.12433566
- Target: arm64-v8a (Android 28)
- Build system: CMake with Ninja
- Host OS: Windows

Build command that fails:

cmake .. -G "Ninja" -DCMAKE_TOOLCHAIN_FILE=D:\Android_Studio_SDK\ndk\28.0.12433566\build\cmake\android.toolchain.cmake -DANDROID_ABI=arm64-v8a -DANDROID_PLATFORM=android-28 -DCMAKE_MAKE_PROGRAM=D:\Android_Studio_SDK\cmake\3.6.4111459\bin\ninja.exe -DSD_BUILD_SHARED_LIBS=ON -DSD_FLASH_ATTN=ON
Error:

D:/Building_test/stable-diffusion.cpp/ggml_extend.hpp:679:31: error: use of undeclared identifier 'ggml_flash_attn'; did you mean 'ggml_hash_set'?
  679 | struct ggml_tensor* kqv = ggml_flash_attn(ctx, q, k, v, false);
      |                           ^
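For context, one plausible reading of this error (hedged; based on older ggml headers, not verified against this checkout): `ggml_flash_attn` used to be part of ggml's public API before upstream replaced it with `ggml_flash_attn_ext`, so a build that pairs a current ggml with an older call site would no longer see the declaration. The legacy entry point looked roughly like this:

```cpp
// Sketch of the legacy ggml API as it appeared in older ggml.h
// (assumption: the failing call site in ggml_extend.hpp still
// targets this since-removed function).
struct ggml_tensor * ggml_flash_attn(
        struct ggml_context * ctx,
        struct ggml_tensor  * q,
        struct ggml_tensor  * k,
        struct ggml_tensor  * v,
        bool                  masked);
```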
The same build command succeeds when adding `-DSD_VULKAN=ON`.
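For reference, the working configuration is presumably the same invocation with the Vulkan flag appended:

cmake .. -G "Ninja" -DCMAKE_TOOLCHAIN_FILE=D:\Android_Studio_SDK\ndk\28.0.12433566\build\cmake\android.toolchain.cmake -DANDROID_ABI=arm64-v8a -DANDROID_PLATFORM=android-28 -DCMAKE_MAKE_PROGRAM=D:\Android_Studio_SDK\cmake\3.6.4111459\bin\ninja.exe -DSD_BUILD_SHARED_LIBS=ON -DSD_FLASH_ATTN=ON -DSD_VULKAN=ON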
Expected behavior: Flash Attention should build successfully in CPU-only mode since it's documented as a CPU-only feature.
Actual behavior: Flash Attention only builds when the Vulkan backend is enabled, suggesting the implementation may be incorrectly tied to GPU backend definitions (see the sketch below).
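A minimal sketch of the kind of preprocessor guard that would produce exactly this behavior (macro names are assumptions for illustration, not quoted from the repo): with no GPU backend defined, the build takes the `ggml_flash_attn` branch and fails once ggml drops that symbol; defining the Vulkan macro flips it onto the fallback path, which compiles fine.

```cpp
// Illustrative guard, NOT the actual ggml_extend.hpp source.
// SD_USE_FLASH_ATTENTION / SD_USE_VULKAN are assumed macro names;
// ctx, q, k, v, and d_head are assumed in scope at the call site.
#if defined(SD_USE_FLASH_ATTENTION) && !defined(SD_USE_VULKAN)
    // CPU-only path: references the legacy symbol, so the build breaks
    // if the linked ggml no longer declares ggml_flash_attn.
    struct ggml_tensor * kqv = ggml_flash_attn(ctx, q, k, v, false);
#else
    // GPU-backend path: plain scaled-dot-product attention built from
    // ggml primitives; compiles regardless of the flash-attention symbol.
    struct ggml_tensor * kq = ggml_mul_mat(ctx, k, q);
    kq = ggml_scale_inplace(ctx, kq, 1.0f / sqrtf((float)d_head));
    kq = ggml_soft_max_inplace(ctx, kq);
    struct ggml_tensor * kqv = ggml_mul_mat(ctx, v, kq);
#endif
```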
EDIT: Never mind, I just came across PR #386 (comment).