Introducing struct.get_indirect and struct.set_indirect opcode #397
Comments
Continued discussion
As @rossberg wrote in WebAssembly/meetings#1279 (comment):
@rossberg Thanks for the explanation. I agree with you that the idea of If this is not a good idea in WebAssembly, do you have any suggestions about how to support the TypeScript's |
Another way to implement this reflection in user space would be to generate a function for each struct type that takes an index and uses a |
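A minimal TypeScript sketch of the user-space approach suggested above, assuming a plain `switch` on the index as the dispatch mechanism. `RectangleStruct` and `getRectangleField` are hypothetical names used only for illustration; a compiler would emit one such accessor per struct type, which is the source of the code-size concern raised in the reply.

```typescript
// Hypothetical stand-in for one WasmGC struct type with three fields.
interface RectangleStruct {
  width: number;
  height: number;
  label: string;
}

// One generated accessor per struct type: it maps a runtime field index to a
// statically indexed field read (the analogue of struct.get with an immediate).
function getRectangleField(obj: RectangleStruct, index: number): number | string {
  switch (index) {
    case 0:
      return obj.width;
    case 1:
      return obj.height;
    case 2:
      return obj.label;
    default:
      // An index outside the struct's field range is a runtime error.
      throw new RangeError(`RectangleStruct has no field ${index}`);
  }
}

const rectStruct: RectangleStruct = { width: 3, height: 4, label: "r1" };
console.log(getRectangleField(rectStruct, 1)); // reads the height field
```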
Thanks for the suggestion. Yes, this is a feasible approach, but it seems it would greatly increase the footprint of the generated wasm module. So in my opinion, compared to this user-space approach, the proposed |
@tlively I think that idea can be extended to consider the index static. That is, when you get a property on an interface, you know which property at every use:
interface Rectangle {
  width: number;
  height: number;
}
// Use the interface Rectangle's property "width".
foo(rectangle.width)
That last line's read of That still means doing an itable call (read itable object, do indirect call) on each access, which is slow, but if the VM does runtime inlining it should be fast enough in the monomorphic case. (In the polymorphic case I think linear memory can do much better with a table of offsets. That would avoid any indirect call or even call entirely.) |
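The itable access described in the comment above can be sketched roughly as follows. This is only a model in TypeScript, not Wasm: the `Obj`, `Box`, and `Tile` types, the slot numbering, and the `readWidth`/`readHeight` helpers are all assumptions made for illustration. Each object carries its class's itable, and a property read on the interface becomes "load itable, call the getter in a statically known slot".

```typescript
// Every object in this model carries the itable of its concrete class.
interface Obj {
  itable: RectangleItable;
}
type Getter = (self: Obj) => number;
// Assumed slot layout for interface Rectangle: slot 0 = width, slot 1 = height.
type RectangleItable = readonly [Getter, Getter];

// Two hypothetical concrete layouts that both satisfy Rectangle.
interface Box extends Obj { w: number; h: number; }
interface Tile extends Obj { id: number; height: number; width: number; }

// The casts stand in for the downcast a Wasm getter would perform on its receiver.
const boxItable: RectangleItable = [
  (self) => (self as Box).w,
  (self) => (self as Box).h,
];
const tileItable: RectangleItable = [
  (self) => (self as Tile).width,
  (self) => (self as Tile).height,
];

// What `rectangle.width` could lower to: the slot index is known statically.
function readWidth(rectangle: Obj): number {
  return rectangle.itable[0](rectangle);
}
function readHeight(rectangle: Obj): number {
  return rectangle.itable[1](rectangle);
}

const box: Box = { itable: boxItable, w: 3, h: 4 };
const tile: Tile = { itable: tileItable, id: 7, height: 5, width: 6 };
console.log(readWidth(box), readWidth(tile));
```

Every access still pays for one indirect call, which matches the cost noted in the comment; the slot index itself never has to be computed at runtime.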
@xujuntwt95329, in general, the idea is that Wasm essentially is a virtual CPU, not a high-level VM. So the two questions to ask are:
In other words, you generally shouldn't expect any more support from Wasm than you get from a native CPU. Certain exceptions apply where we make something more complex into a Wasm feature, but the bar for that is very high, and usually has to do with working around unavoidable restrictions of Wasm. |
@rossberg, for your two questions:
On a real CPU, we definitely need to implement a runtime to support such a feature.
As you mentioned, Wasm is essentially a virtual CPU, so with the current MVP wasm instruction set, we need to implement a runtime and compile it to wasm opcodes to support such a feature. However, I have a somewhat different opinion about |
Yes, this can also support the interface feature, but in TypeScript, |
@xujuntwt95329, I've used the slogan "As low-level as possible, but no lower" to describe Wasm. As I said, there are exceptions where we have to raise the abstraction level somewhat, and GC is one of them. But that is no free pass to arbitrary high-level operations. Even with GC, we have been very careful to (almost) keep it as low-level as we can. That is why there are only plain tuples and arrays, no object system, no classes, no methods, no reflection, no hidden allocations, etc. I have my concerns about how we failed on that front wrt casts, but other than that, we managed to stay clear of complex operations. That doesn't mean you can't use Wasm GC. It only means that you still have to build a language-specific runtime and custom data structures on top of it. Wasm takes care of GC for you, but nothing else. GC types merely describe the layout of your runtime's data structures to the engine. |
@rossberg I understand that we should try to avoid complex operations in the current GC proposal; thanks for all your detailed explanations! I understand that supporting this kind of reflection seems beyond the scope of the GC MVP proposal, but it does have some engineering benefit (e.g. it helps reduce module size). So do you think it could be considered Post-MVP? And if so, would it be meaningful if we provided some data about the |
Good point, yeah, if there are commonly-used names then we'd end up with many classes needing itables to support them. And maybe It might be interesting to measure these overheads. However, I agree with @rossberg that these proposed new opcodes are significantly higher-level than anything else in WasmGC so far, so there is a high bar, I think, for considering them. But I do agree that supporting dynamic languages like TypeScript in wasm is a very important goal! My personal suggestions would be:
|
I agree with @kripken that a key feature to enable the implementation of high-level runtimes in Wasm would be a general mechanism for user-space jitting. That has long been an assumption, but I don't think we have a particularly clear idea yet what it should look like. |
@kripken Thanks for the suggestions!
Yeah, currently we use host APIs to emulate the
Yeah, we started this TypeScript-to-wasm work because we think it can bring more developers to WebAssembly, and we think the current
That would be really helpful! Thanks for your help. Is there a link to the J2Wasm project or any documents about the analyses in binaryen as you mentioned? |
@rossberg @kripken Yes, I agree that in the long term, user-space JITting should be a great solution for dynamic languages, but JIT also has some disadvantages:
So even with JIT capability, maybe we should still explore more approaches to directly support some languages (like TypeScript) more efficiently. |
For compiler writers to WasmGC, maybe the most useful links are:
Overall Binaryen's GC optimizations are fairly complete at this point, and include devirtualization, escape analysis, inlining, etc. (full list here), but there are probably many more things we could add. With Java, Kotlin, and Dart it's been useful for them to find cases that look like they should be optimized by Binaryen but are not, so @xujuntwt95329 if you find something like that, or if you have any other questions, let me know! I'm not sure if reading J2Wasm itself would be that helpful, except maybe for the list of Binaryen optimizations it runs. J2Wasm for the most part does just a few Java-specific optimizations and then leaves all the rest to Binaryen.
I do agree with this. But I think there is going to be a high bar for higher-level instructions like these, especially when they add VM complexity and overhead. And, OTOH, I think other approaches (like @tlively and I mentioned) can be just as fast at least in the monomorphic case, which is hopefully the common one. So I am skeptical that the high bar can be reached here. (But I could be wrong!) Btw, two more ideas I've had:
|
Hi @kripken thanks for the links, they are really helpful! If we find any code pattern that can be optimized, we will send them to you :)
I agree that it's not suitable to introduce such a high-level opcode in the GC MVP proposal, but I hope it can be considered in the Post-MVP phase. I'll try to compare it with the solution without such an opcode and see whether there are more use cases. 😀
Go has its own runtime, it also has some concept like
Yes! Actually I think the only way to implement something like |
My understanding is that that is what is meant by https://github.com/WebAssembly/gc/blob/main/proposals/gc/Post-MVP.md#weak-references - so, yes, I expect that to be possible some time after the MVP. I am looking forward to it myself for some use cases. Edit: I guess you mean externref in particular? Perhaps there would be questions about a finalizer for an object created outside of the wasm, but I'd expect that to work too. |
Purpose of this issue
Hi, in the GC-05-16 meeting we introduced our idea of adding struct.get_indirect and struct.set_indirect opcodes to support language semantics like TypeScript's interface in WasmGC. We had some discussion during the meeting and in the PR for the meeting notes, and this issue is to continue the discussion.
Background
WasmGC opened the door to supporting high-level managed languages more efficiently than compiling the whole language runtime to WebAssembly, and we are interested in exploring a solution to compile TypeScript to WasmGC. Our strategy is:
any type, they are represented as an externref, and we introduced a libdyntype to support dynamic access and type checking on them; operations on these objects are converted to API calls to libdyntype
This provides flexibility to the developer:
any is also supported, but they will pay the performance cost
Problem
Most types work well with the strategy above, but we encountered a problem with interface. An interface does not describe the layout of a concrete type; it's just an unordered collection of field names and types. This means the actual type of the object held by the interface is not known at compile time, so we have two solutions:
1. Treat interface as any (interface is heavily used in TypeScript; if we treat it as any, the performance impact may be huge)
2. field index according to the metadata and field name (preferred)
Option 2 is preferred because we can still represent objects using WasmGC types, and we can optimize further to avoid searching the metadata in some scenarios (e.g. by checking a pre-assigned type-id). Option 2 needs to calculate the field index at runtime, but currently the struct.get/set opcodes require the field index to be an immediate.
Proposed Opcode
To accept a field index calculated at runtime, we propose indirect versions of the struct access opcodes:
This requires some runtime type checking:
Then we can use this opcode to access the field of an interface:
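A minimal sketch, in TypeScript rather than Wasm, of how such an indirect access could behave. It is not the listing from the proposal: `StructMeta`, `StructObj`, `findFieldIndex`, `structGetIndirect`, and `structSetIndirect` are hypothetical names, and the two checks shown (index in bounds, field type matches the expected type) are an assumption about what the runtime type checking would involve.

```typescript
// Field value kinds in this model (a small stand-in for Wasm value types).
type FieldType = "i32" | "f64";

// Hypothetical runtime metadata: field names and types in layout order.
interface StructMeta {
  names: readonly string[];
  types: readonly FieldType[];
}

// Hypothetical object: a reference to its defined type plus field storage.
interface StructObj {
  meta: StructMeta;
  fields: number[];
}

// Option 2 from the text: resolve the field index from metadata and field name.
function findFieldIndex(meta: StructMeta, name: string): number {
  return meta.names.indexOf(name); // -1 when no such field exists
}

// Assumed runtime checks: the index is in bounds and the field has the expected type.
function checkField(obj: StructObj, index: number, expected: FieldType): void {
  if (!Number.isInteger(index) || index < 0 || index >= obj.fields.length) {
    throw new RangeError(`field index ${index} is out of bounds`);
  }
  if (obj.meta.types[index] !== expected) {
    throw new TypeError(`field ${index} is not of type ${expected}`);
  }
}

// Model of an indirect get: the index is a runtime operand, not an immediate.
function structGetIndirect(obj: StructObj, index: number, expected: FieldType): number {
  checkField(obj, index, expected);
  return obj.fields[index] as number;
}

// Model of an indirect set with the same checks.
function structSetIndirect(obj: StructObj, index: number, expected: FieldType, value: number): void {
  checkField(obj, index, expected);
  obj.fields[index] = value;
}

// Accessing the "width" property of an interface-typed value.
const rectMeta: StructMeta = { names: ["id", "width", "height"], types: ["i32", "f64", "f64"] };
const rect: StructObj = { meta: rectMeta, fields: [1, 10, 20] };
const widthIndex = findFieldIndex(rect.meta, "width");
structSetIndirect(rect, widthIndex, "f64", 12);
console.log(structGetIndirect(rect, widthIndex, "f64"));
```

In this model the name lookup is the expensive part; caching the resolved index per call site, keyed on a pre-assigned type-id as the text suggests, would skip it when the same concrete type is seen again.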
Influence to runtime implementation
performance:
The proposed opcode will be slow due to several runtime checks, but this will not influence the performance of any other opcodes
memory usage:
To apply runtime checking, every GC object must have a reference to its defined type. Since we already have RTT now, the only thing we need to add is a type index in the RTT object, so the impact on memory should be low
Is there any workaround without the proposed opcode?
Treat interface as any, as mentioned in option 1 above
Compile the whole language runtime into WebAssembly, and execute the source code in that "vm inside vm"