Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement AbsKernelOp for WebGPU backend #896

Merged
merged 16 commits into from
Jan 3, 2024

Conversation

favilo
Copy link
Contributor

@favilo favilo commented Dec 3, 2023

This also creates a broken, default implementation for all the other UnaryKernels.

@favilo favilo marked this pull request as ready for review December 3, 2023 02:48
@favilo
Copy link
Contributor Author

favilo commented Dec 3, 2023

Ridiculous amount of warnings, and I'll probably clean up a lot of them, but the unused variables mostly have to do with the unimplemented kernels, I think

@favilo favilo marked this pull request as draft December 3, 2023 02:51
@favilo
Copy link
Contributor Author

favilo commented Dec 3, 2023

Actually, I'll keep this a draft, I need to test it still.

@favilo
Copy link
Contributor Author

favilo commented Dec 28, 2023

So I went through several iterations of this.

wgpu supports compiling code from WGSL and GLSL into Spir-V automatically.

Originally, I wanted to go with WGSL, since it's a language that was made for WebGPU specifically, but the issue I came across with it was that it doesn't have support for f64 types. So after writing all the code for abs and getting it working, I realized I had to jump into GLSL if I wanted double word floating point numbers.

Then I replaced the code with GLSL, and had wgpu compile it by itself. This worked fine, until I decided I wanted to get the abs test case working for the backward phase, which uses mean(). In order to support that, I needed to write the sum_to action, at least for the forward phase. But it was here that I realized wgpu doesn't support atomic operations yet! I really didn't feel like forcing that into wgpu since that would potentially be a lot more work, so I decided to utilize the third method.

wgpu has the ability to use precompiled SpirV binary files without validation, and the glslc compiler exists. So I decided to go that route. It worked fine, but still had trouble with the atomic operations, until I found the make_spirv_raw() function. That is supposed to take it and actually do no validation on the code. However, when I attempted to run it I didn't realize I had a bug in my code and it was just segfaulting everywhere.

For my fourth iteration through this, I got frustrated and decided, "Hey, this is rust! I should be using rust for these kernels." So I moved to rust-gpu by Embark Studios. I actually really like this method, and it fixes some of the weird issues with the GLSL compilation, like forcing me to use #define TYPENAME and gives me multiple entry points into a kernel, instead of having only main. The issue that cropped up here was that it requires a very specific version of the nightly rust compiler, and our framework doesn't listen to rust-toolchain.toml correctly just yet. It was also here that I discovered my bug of not using a pipeline layout for constructing my compute shaders, which was causing the segfaults.

So I went back to GLSL, and with the segfaults now fixed, I was able to power through and get it working how you see it now. Of course there is still a very strange problem with the sum_to kernel that I'm going to have to figure out before too long. It has the wrong magic number when I'm creating the compute shader, but when I look at the specific files, and the bytes that are directly read by the library, it is the correct magic number...

So I'm going to be spending some time with a debugger and stepping through code to figure that out, but in the mean time, I have it working for the abs kernel for both f32 and f64.

@favilo favilo marked this pull request as ready for review December 28, 2023 21:59
@coreylowman coreylowman merged commit 630514f into coreylowman:main Jan 3, 2024
4 checks passed
@caelunshun
Copy link
Contributor

Have you tested this on backends besides Vulkan? AFAIK wgpu only supports the SPIRV_PASSTHROUGH feature on Vulkan, meaning we lose support for any platform that doesn't have Vulkan (including WebGPU itself).

I feel a better approach is to just use WGSL, as it has the best support across wgpu backends. Losing f64 support is unfortunate but can be addressed in the future (gpuweb/gpuweb#2805). Atomics should work fine in WGSL.

@favilo favilo deleted the webgpu-abs branch January 4, 2024 04:12
@favilo
Copy link
Contributor Author

favilo commented Jan 4, 2024

Have you tested this on backends besides Vulkan? AFAIK wgpu only supports the SPIRV_PASSTHROUGH feature on Vulkan, meaning we lose support for any platform that doesn't have Vulkan (including WebGPU itself).

I feel a better approach is to just use WGSL, as it has the best support across wgpu backends. Losing f64 support is unfortunate but can be addressed in the future (gpuweb/gpuweb#2805). Atomics should work fine in WGSL.

Actually, I think wgpu just doesn't support atomics at all yet, that is why I went with passthrough.

I can get behind WGSL, but it wouldn't support the preprocessor at all, so the code will have a lot of copied boilerplate, or we'd need to use templates of some sort.

@caelunshun
Copy link
Contributor

Actually, I think wgpu just doesn't support atomics at all yet, that is why I went with passthrough.

That's not the case; projects like vello are using atomics extensively in WGSL shaders.

I can get behind WGSL, but it wouldn't support the preprocessor at all, so the code will have a lot of copied boilerplate, or we'd need to use templates of some sort.

True. Bevy and Vello each have their own ad-hoc solutions for shader preprocessing, and I guess dfdx would need to do something similar.

@DonIsaac
Copy link
Contributor

DonIsaac commented Jan 7, 2024

@favilo I've got a working binary kernel here: #904

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants