Implement AbsKernelOp for WebGPU backend #896

favilo · 2023-12-03T02:15:32Z

This also creates a broken, default implementation for all the other UnaryKernels.

favilo · 2023-12-03T02:50:31Z

There's a ridiculous number of warnings, and I'll probably clean up a lot of them, but I think the unused variables mostly come from the unimplemented kernels.

favilo · 2023-12-03T02:52:01Z

Actually, I'll keep this as a draft; I still need to test it.

favilo · 2023-12-28T21:59:05Z

So I went through several iterations of this.

wgpu supports compiling code from WGSL and GLSL into SPIR-V automatically.

Originally, I wanted to go with WGSL, since it's a language that was made specifically for WebGPU, but the issue I ran into was that it doesn't support f64 types. So after writing all the code for abs and getting it working, I realized I had to jump into GLSL if I wanted double-precision floating-point numbers.

Then I replaced the code with GLSL, and had wgpu compile it by itself. This worked fine, until I decided I wanted to get the abs test case working for the backward phase, which uses mean(). In order to support that, I needed to write the sum_to action, at least for the forward phase. But it was here that I realized wgpu doesn't support atomic operations yet! I really didn't feel like forcing that into wgpu since that would potentially be a lot more work, so I decided to utilize the third method.

wgpu can load precompiled SPIR-V binary files without validation, and the glslc compiler exists, so I decided to go that route. It worked fine, but I still had trouble with the atomic operations until I found the make_spirv_raw() function, which is supposed to take the binary and do no validation on it at all. However, when I tried to run it, I didn't realize I had a bug in my code, and it was just segfaulting everywhere.

For my fourth iteration through this, I got frustrated and decided, "Hey, this is Rust! I should be using Rust for these kernels." So I moved to rust-gpu by Embark Studios. I actually really like this method: it fixes some of the weird issues with the GLSL compilation, like being forced to use #define TYPENAME, and it gives me multiple entry points into a kernel instead of only main. The issue that cropped up here was that it requires a very specific version of the nightly Rust compiler, and our framework doesn't respect rust-toolchain.toml correctly just yet. It was also here that I discovered my bug of not using a pipeline layout when constructing my compute shaders, which was causing the segfaults.

So I went back to GLSL, and with the segfaults now fixed, I was able to power through and get it working how you see it now. Of course there is still a very strange problem with the sum_to kernel that I'm going to have to figure out before too long. It has the wrong magic number when I'm creating the compute shader, but when I look at the specific files, and the bytes that are directly read by the library, it is the correct magic number...
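For reference, a SPIR-V binary has to begin with the magic word 0x07230203, so one way to rule out a bad or stale file is to check the header bytes before handing them to wgpu. Below is a minimal sketch in plain Rust. `check_spirv_header` and `SpirvHeaderError` are hypothetical names made up for illustration (they are not part of wgpu or dfdx), and this is only a diagnostic aid, not a diagnosis of the problem described above:

```rust
/// SPIR-V magic number, as defined by the SPIR-V specification.
const SPIRV_MAGIC: u32 = 0x0723_0203;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
enum SpirvHeaderError {
    /// Fewer than four bytes, so there is no complete first word.
    TooShort,
    /// SPIR-V is a stream of 32-bit words; this length is not a multiple of 4.
    NotWordAligned(usize),
    /// The magic word is present but in the opposite byte order.
    ByteSwapped,
    /// The first word is something else entirely (for example, GLSL source text).
    BadMagic(u32),
}

/// Hypothetical helper: inspects the first word of a byte buffer that is
/// expected to hold a little-endian SPIR-V module.
fn check_spirv_header(bytes: &[u8]) -> Result<(), SpirvHeaderError> {
    if bytes.len() < 4 {
        return Err(SpirvHeaderError::TooShort);
    }
    if bytes.len() % 4 != 0 {
        return Err(SpirvHeaderError::NotWordAligned(bytes.len()));
    }
    let word = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    match word {
        SPIRV_MAGIC => Ok(()),
        w if w.swap_bytes() == SPIRV_MAGIC => Err(SpirvHeaderError::ByteSwapped),
        w => Err(SpirvHeaderError::BadMagic(w)),
    }
}
```

Running something like this on the exact bytes that get embedded or read at runtime can help tell apart "the file on disk is fine" from "the buffer that reaches the shader loader is fine", which are not necessarily the same thing.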

So I'm going to be spending some time with a debugger and stepping through code to figure that out, but in the meantime, I have it working for the abs kernel for both f32 and f64.

caelunshun · 2024-01-04T04:04:03Z

Have you tested this on backends besides Vulkan? AFAIK wgpu only supports the SPIRV_SHADER_PASSTHROUGH feature on Vulkan, meaning we lose support for any platform that doesn't have Vulkan (including WebGPU itself).

I feel a better approach is to just use WGSL, as it has the best support across wgpu backends. Losing f64 support is unfortunate but can be addressed in the future (gpuweb/gpuweb#2805). Atomics should work fine in WGSL.
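One caveat worth keeping in mind for sum_to: WGSL atomics only come in atomic<u32> and atomic<i32> flavors, so a floating-point accumulation is usually emulated with a compare-exchange loop over the bit pattern (atomicCompareExchangeWeak plus bitcast on the shader side). The following is a CPU-side Rust sketch of that same loop, just to illustrate the technique. `atomic_add_f32` and `sum_to_scalar` are hypothetical names and are not dfdx or wgpu APIs:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Adds `value` to the f32 stored (as raw bits) in `cell` and returns the
/// previous value. Retries whenever another thread changed the cell between
/// our read and our compare-exchange.
fn atomic_add_f32(cell: &AtomicU32, value: f32) -> f32 {
    let mut old_bits = cell.load(Ordering::Relaxed);
    loop {
        let new_bits = (f32::from_bits(old_bits) + value).to_bits();
        match cell.compare_exchange_weak(
            old_bits,
            new_bits,
            Ordering::AcqRel,
            Ordering::Relaxed,
        ) {
            Ok(_) => return f32::from_bits(old_bits),
            // Someone else won the race (or the weak CAS failed spuriously):
            // retry with the value that is actually stored.
            Err(actual) => old_bits = actual,
        }
    }
}

/// Sums a slice into a single accumulator from several threads, standing in
/// for many GPU invocations reducing into one output element.
fn sum_to_scalar(values: &[f32]) -> f32 {
    let acc = AtomicU32::new(0.0f32.to_bits());
    std::thread::scope(|s| {
        for chunk in values.chunks(4) {
            let acc = &acc;
            s.spawn(move || {
                for &v in chunk {
                    atomic_add_f32(acc, v);
                }
            });
        }
    });
    f32::from_bits(acc.load(Ordering::Acquire))
}
```

Note that the order of additions is not deterministic with this approach, so results can differ in the last bits between runs when the inputs are not exactly representable sums.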

favilo · 2024-01-04T04:15:54Z

> Have you tested this on backends besides Vulkan? AFAIK wgpu only supports the SPIRV_SHADER_PASSTHROUGH feature on Vulkan, meaning we lose support for any platform that doesn't have Vulkan (including WebGPU itself).
>
> I feel a better approach is to just use WGSL, as it has the best support across wgpu backends. Losing f64 support is unfortunate but can be addressed in the future (gpuweb/gpuweb#2805). Atomics should work fine in WGSL.

Actually, I think wgpu just doesn't support atomics at all yet, that is why I went with passthrough.

I can get behind WGSL, but it wouldn't support the preprocessor at all, so the code will have a lot of copied boilerplate, or we'd need to use templates of some sort.

caelunshun · 2024-01-04T04:19:25Z

> Actually, I think wgpu just doesn't support atomics at all yet, that is why I went with passthrough.

That's not the case; projects like vello are using atomics extensively in WGSL shaders.

> I can get behind WGSL, but it wouldn't support the preprocessor at all, so the code will have a lot of copied boilerplate, or we'd need to use templates of some sort.

True. Bevy and Vello each have their own ad-hoc solutions for shader preprocessing, and I guess dfdx would need to do something similar.
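As a rough idea of what the simplest version of that could look like, here is a sketch that uses plain string substitution to stamp out a unary WGSL kernel per element type, in the spirit of the #define TYPENAME trick from the GLSL version. Everything here is an assumption for illustration: the `__TYPENAME__` and `__OP__` placeholders, the `UNARY_TEMPLATE` constant, the binding layout, and `render_unary_kernel` are all hypothetical and not taken from dfdx, Bevy, or Vello:

```rust
/// Hypothetical WGSL template for an elementwise unary kernel.
/// `__TYPENAME__` and `__OP__` are placeholder tokens invented for this sketch.
const UNARY_TEMPLATE: &str = r#"
@group(0) @binding(0) var<storage, read> inp: array<__TYPENAME__>;
@group(0) @binding(1) var<storage, read_write> out: array<__TYPENAME__>;

@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) id: vec3<u32>) {
    let i = id.x;
    if (i < arrayLength(&out)) {
        out[i] = __OP__(inp[i]);
    }
}
"#;

/// Fills in the element type and the builtin to apply, returning WGSL source
/// that could then be passed to a shader module constructor.
fn render_unary_kernel(typename: &str, op: &str) -> String {
    UNARY_TEMPLATE
        .replace("__TYPENAME__", typename)
        .replace("__OP__", op)
}
```

For example, `render_unary_kernel("f32", "abs")` would yield an abs kernel over `array<f32>`. A real solution would probably want conditionals and includes as well, which is where a proper templating or preprocessing layer starts to pay off.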

DonIsaac · 2024-01-07T01:42:46Z

@favilo I've got a working binary kernel here: #904

favilo marked this pull request as ready for review December 3, 2023 02:48

favilo marked this pull request as draft December 3, 2023 02:51

favilo force-pushed the webgpu-abs branch from 5439341 to 7950ff8 Compare December 3, 2023 17:16

favilo force-pushed the webgpu-abs branch from 773ae9b to c1b440b Compare December 27, 2023 22:53

favilo added 13 commits December 27, 2023 14:55

Removed some of the more low level commands in favor of a wrapper struct

92c8fe5

Also added tests for higher code coverage.

AtomicPtr unsound fix

8a35784

Partial implementation of Device<E> for Webgpu

d867cd8

Remove foolish Mutex

9694c1c

Add Mutex back, since evidently it was causing issues.

7ef20dc

Hopefully I can figure out a way to remove it again.

Removed num_traits::Num requirement from Zeros.

f5a2b3d

Had to figure out a way to store zeros in place

Implement abs kernel, and use broken unary operation for all the comp…

1e2d1ec

…iler errors

cargo fmt

bd75762

disable f16, since we don't support it yet

fde8d50

no-std

e3f2113

Added test for abs on webgpu. Also added backward implementation,

b699340

though I won't be able to test that until I fix `mean`.

cargo fmt

e25553a

Managed to get built spirv working as long as we go through the

7c686a1

non-passthrough route. Can't get sum_to working until wgpu supports atomic operations. Which is super unfortunate. Maybe I'll work on that soon...

favilo force-pushed the webgpu-abs branch from c1b440b to 7c686a1 Compare December 27, 2023 22:55

favilo added 3 commits December 27, 2023 16:36

Have the code work correctly, almost got sum_to working, too

afa3a1a

Weird magic number issue that I can't figure out...

Cargo fmt

29668c6

Do we need to skip webgpu features?

5a2e4ad

favilo marked this pull request as ready for review December 28, 2023 21:59

coreylowman merged commit 630514f into coreylowman:main Jan 3, 2024
4 checks passed

favilo deleted the webgpu-abs branch January 4, 2024 04:12
