Open question: How to deal with buffers that need lots of reads/writes #2

Open
MendyBerger opened this issue Mar 1, 2024 · 10 comments
@MendyBerger
Collaborator

There are a few cases where we'll have buffers that need to take lots of reads/writes, and are not in linear memory.
Examples include gpu-buffer when mapped to CPU, and frame-buffers.
How should we go about those?
Here are some options I've thought of; none of them seems particularly good to me:

  • Just copy everything over as list<u8>. Obviously slow.
  • Have an external resource with .get()/.set() methods. Probably slow, since every .get()/.set() has to cross the wasm boundary, but it might still be the best option with current capabilities.
  • Find some magic way to map it to linear memory. Don't think this is actually possible.

Thoughts anyone?
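
To make the second option concrete, here is a minimal host-side sketch of a buffer resource with per-byte get/set. Everything in it (`HostBuffer`, `fill_with_set`) is a hypothetical name invented for illustration, and a plain `Vec<u8>` stands in for the mapped memory; it is not taken from this project's code.

```rust
/// Hypothetical stand-in for a mapped buffer that lives outside the
/// guest's linear memory. A Vec<u8> plays the role of the mapping.
pub struct HostBuffer {
    bytes: Vec<u8>,
}

impl HostBuffer {
    /// Creates a zero-filled buffer of `len` bytes.
    pub fn new(len: usize) -> Self {
        Self { bytes: vec![0; len] }
    }

    /// Reads one byte; returns None when `offset` is out of bounds.
    pub fn get(&self, offset: usize) -> Option<u8> {
        self.bytes.get(offset).copied()
    }

    /// Writes one byte; returns false when `offset` is out of bounds.
    pub fn set(&mut self, offset: usize, value: u8) -> bool {
        match self.bytes.get_mut(offset) {
            Some(slot) => {
                *slot = value;
                true
            }
            None => false,
        }
    }
}

/// Tries to set offsets 0..len one byte at a time and returns how many
/// set() calls succeeded. In a real guest, each of the `len` calls would
/// be one boundary crossing, which is the cost the option worries about.
pub fn fill_with_set(buf: &mut HostBuffer, len: usize, value: u8) -> usize {
    let mut successful_calls = 0;
    for offset in 0..len {
        if buf.set(offset, value) {
            successful_calls += 1;
        }
    }
    successful_calls
}
```

The call count grows linearly with the buffer size, so whether this is acceptable depends on how cheap a single host call is.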

@MendyBerger MendyBerger changed the title Open question: How to deal with buffers that need lots of of reads/writes Open question: How to deal with buffers that need lots of reads/writes Mar 1, 2024
@tareksander

Source languages don't support this directly, since they have no concept of linear memory, but WebAssembly allows multiple memories. Mapped buffers could be exposed to WASM as additional linear memories. That would make access fast, but it would require inline assembly in the client languages to use it. I don't think this is in the spirit of the component model (AFAIK it's all about serializing and deserializing data rather than intruding on core WASM primitives like memories), but it's the only performant solution I can see. It would also mean that this proposal MUST be implemented by the host, as the component model doesn't give a component the capability to modify other components' memories.

@tareksander

One caveat: I don't think instructions in WASM can choose a memory to act on dynamically, so you'd need to choose a memory index to use in the ABI (e.g. memory 2) and use host calls to "bind" a buffer to the memory slot. Alternatively, the component gets to decide which memory index to bind, so different client implementations can use different indices, e.g. if they're already using multiple memories.

@tareksander

I did a bit more research: wasmtime has no mechanism for converting a user-provided buffer into a linear memory object, and you can't use ArrayBuffer objects as WASM memories in JS.

In that case, I think having get/set and bulk copy to/from methods is the best option. I just asked in the wasmtime stream on Zulip about the performance of host calls in wasmtime.
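
As a rough illustration of the bulk copy to/from idea, the sketch below adds bounds-checked range copies to a host-side buffer. The names `BulkBuffer`, `copy_to`, `copy_from`, and `OutOfBounds` are assumptions made up for this example, and a `Vec<u8>` again stands in for the mapped memory.

```rust
/// Error returned when a requested range does not fit in the buffer.
#[derive(Debug, PartialEq)]
pub struct OutOfBounds;

/// Hypothetical host-side buffer; the Vec<u8> stands in for mapped memory.
pub struct BulkBuffer {
    bytes: Vec<u8>,
}

impl BulkBuffer {
    /// Creates a zero-filled buffer of `len` bytes.
    pub fn new(len: usize) -> Self {
        Self { bytes: vec![0; len] }
    }

    /// Copies `dst.len()` bytes starting at `offset` out of the buffer.
    pub fn copy_to(&self, offset: usize, dst: &mut [u8]) -> Result<(), OutOfBounds> {
        // checked_add guards against offset + len overflowing usize.
        let end = offset.checked_add(dst.len()).ok_or(OutOfBounds)?;
        let src = self.bytes.get(offset..end).ok_or(OutOfBounds)?;
        dst.copy_from_slice(src);
        Ok(())
    }

    /// Copies all of `src` into the buffer starting at `offset`.
    pub fn copy_from(&mut self, offset: usize, src: &[u8]) -> Result<(), OutOfBounds> {
        let end = offset.checked_add(src.len()).ok_or(OutOfBounds)?;
        let dst = self.bytes.get_mut(offset..end).ok_or(OutOfBounds)?;
        dst.copy_from_slice(src);
        Ok(())
    }
}
```

One call moves a whole range, so the number of boundary crossings no longer scales with the number of bytes touched.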

@tareksander

The resulting conversation:
[image: conversation]

And I know browser developers have also worked hard to speed up host calls in their wasm engines, so that approach should be fine.

@seanisom
Collaborator

seanisom commented Mar 6, 2024

Wasmtime does have memory read / write functions to raw offsets. This is how we get data into and out of wasm guests today in wander: https://docs.wasmtime.dev/api/wasmtime/struct.Memory.html#method.write

It's a copy, not a mapping of host memory, but is a safe approach.

@tareksander

> Wasmtime does have memory read / write functions to raw offsets. This is how we get data into and out of wasm guests today in wander: https://docs.wasmtime.dev/api/wasmtime/struct.Memory.html#method.write
>
> It's a copy, not a mapping of host memory, but is a safe approach.

Yes, that's how I imagined the get/set and bulk copy primitives would work under the hood. With host calls being basically free, you'd only pay for a function call and a memory access (which can probably be cached; I don't know whether caches are bypassed on GPU mappings).

@tareksander

The framebuffer can probably use get/set with acceptable performance, but wgpu (the main WebGPU implementation I know of) expects the buffer to be a contiguous memory range, because that range is also exposed to user code. That means for buffers we'll probably have to let WASM allocate a buffer in linear memory and, when the buffer mapping is dropped, copy it back into the actual mapped buffer. That's an additional copy, but I don't think it's avoidable currently. There are proposals for a more relaxed memory model for WASM, but those are probably still a long way off.
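
The copy-back-on-drop idea can be sketched with a guard type. This is only an illustration under assumptions: `HostMapped` and `MappedView` are hypothetical names, the host buffer is simulated with a `Vec<u8>`, and a real guest would replace the direct slice copies with bulk copy host calls.

```rust
/// Hypothetical stand-in for the real mapped buffer on the host side.
pub struct HostMapped {
    bytes: Vec<u8>,
}

/// A contiguous working copy of a mapped range, held in guest memory.
/// Its contents are written back to the host buffer when it is dropped.
pub struct MappedView<'a> {
    host: &'a mut HostMapped,
    offset: usize,
    local: Vec<u8>,
}

impl HostMapped {
    /// Creates a zero-filled host buffer of `len` bytes.
    pub fn new(len: usize) -> Self {
        Self { bytes: vec![0; len] }
    }

    /// Copies `len` bytes at `offset` into a local view (first copy).
    /// Returns None when the range does not fit in the buffer.
    pub fn map(&mut self, offset: usize, len: usize) -> Option<MappedView<'_>> {
        let end = offset.checked_add(len)?;
        let local = self.bytes.get(offset..end)?.to_vec();
        Some(MappedView { host: self, offset, local })
    }

    /// Read-only access to the host bytes, for inspection.
    pub fn bytes(&self) -> &[u8] {
        &self.bytes
    }
}

impl MappedView<'_> {
    /// The contiguous slice that an API expecting a mapped range can use.
    pub fn as_mut_slice(&mut self) -> &mut [u8] {
        &mut self.local
    }
}

impl Drop for MappedView<'_> {
    /// Writes the local copy back to the host buffer (second copy).
    fn drop(&mut self) {
        // The range was validated in map(), so this indexing cannot panic.
        let end = self.offset + self.local.len();
        self.host.bytes[self.offset..end].copy_from_slice(&self.local);
    }
}
```

Tying the write-back to `Drop` means client code cannot forget it, at the cost of always paying for the copy even when the view was only read.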

@MendyBerger
Collaborator Author

MendyBerger commented Mar 7, 2024

@tareksander I'm not using wgpu, I'm using wgpu-core directly.
I was actually able to do it with get/set methods.
Here I create a slice from the mapped memory.
And here I use the slice for the get/set methods.

@tareksander

> @tareksander I'm not using wgpu, I'm using wgpu-core directly.

I mean for the clients in WASM that actually use the WebGPU API. They'll likely use wgpu, and even if they use something else, that API probably expects a mapped buffer in memory. If you're not providing a buffer region, then the API must provide it, and you can't avoid a copy in that case. The easiest approach would probably be to add a bulk copy operation to your implementation; clients would then bulk copy into linear memory to read, and bulk copy back to the host when they're finished writing.

@lygstate

We need to move https://github.com/WebAssembly/memory-control forward.
