-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Backtrace support, part 2: Shadow VMRuntimeLimits for each stack #99
Conversation
201fdd0
to
3a5a6d2
Compare
6903eff
to
241c741
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few quick comments.
There's currently an issue with the CI, because I originally created this PR against a base branch other than |
I think closing and opening the PR again ought to do the trick. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
It recently became necessary to regenerate the expected output of most (all?) of the files in tests/disas. Concretely, we were generating slightly different CLIF than upstream even on modules not involving any WasmFX instructions. This PR rectifies this situation, making sure that we generate identical CLIF again. This required two changes: ## Global CLIF variable for runtime limits pointer Since #99, we called the function `wasmtime_cranelift::func_environ::FuncEnvironment::declare_vmruntime_limits_ptr` unconditionally for every compiled function, meaning that every function prelude now contained a definition of the corresponding global variable. This PR changes this behavior: I restored the original logic for when to call `declare_vmruntime_limits_ptr`. Instead of using the global variable `vmruntime_limits` to access the `VMRuntimeLimits`, we now load this pointer ourselves once per `resume` instruction. Since the load instruction is marked as `readonly`, it should be coalesced into a single load, even if multiple `resume`s appear in the same function. I've benchmarked this, using a micro-benchmark with 2 resume instructions in the same function, and as expected there is no observable difference in performance. ## Order of libcall declarations In `builtin.rs`, we put the declarations of our own libcalls into the middle of the list of all libcalls in the `foreach_builtin_function` macro. This shifted the indices of some existing libcalls, which caused the CLIF to differ when using these. I've moved all the `tc_` libcalls to the end of the list, thus leaving the indices of existing libcalls unchanged. As a result, I've replaced all files in tests/disas with the versions from upstream (bytecodealliance/wasmtime@afaf1c7, the most recent upstream commit that we've merged)
…ns (#136) Currently, we can overflow the stack while running inside a continuation, without the runtime having any way of detecting this. This PR partially rectifies this, by making the existing stack limit checks that get emitted by cranelift in every wasm function prelude work correctly while running inside a continuation. All that was required to enable the stack limit checks was the following: 1. Stop zero-ing out the `stack_limit` value in `VMRuntimeLimits` whenever we `resume` a continuation. 2. When creating a continuation, set a reasonable value for the `stack_limits` value in its `StackLimits` object. Note that all the required infrastructure to make sure that whenever we switch stacks, we save and restore the `stack_limits` value inside `VMRuntimeLimits` and the `StackLimits` object of the involved stacks was already implemented in #98 and #99. In this sense, enabling these checks is "free": The limits were already checked, but previously using a limit of 0. The only remaining question is what the "reasonable value" for the stack limits value mentioned above is. As discussed in #122, the stack limit checks that cranelift emits in function preludes are rather limited, and these limitations are reflected in the checks that this PR provides: When entering a wasm function, they check that the current stack pointer is larger than the `stack_limit` value in `VMRuntimeLimits`. They do not take into account how much stack space the function itself will occupy. No stack limit checks are performed when calling a host function. Thus, this PR defines a config option `wasmfx_red_zone_size`. The idea is that we define the stack limit as `bottom_of_fiber_stack` + `wasmfx_red_zone_size`. Thus, the stack checks boil down to the following: Whenever we enter a wasm function while inside a continuation, we ensure that there are at least `wasmfx_red_zone_size` bytes of stack space left. I've set the default value for `wasmfx_red_zone_size` to 32k. To get a rough idea for a sensible value, I determined that a call to the `fd_write` WASI function occupies ~21k of stack space, and generously rounded this up to 32k. **Important**: This means that these stack limit checks are incomplete: Calling a wasm or host function that occupies more than `wasmfx_red_zone_size` of stack space may still result in an undetected stack overflow!
As of PR #98, each
ContinuationObject
contains an object of typeStackLimits
. In addition, there exists such an object for the main stack, it is stored in theVMContext
and theStackChain::MainStack
variant at the list of currently active continuations points to it. TheseStackLimits
objects contain counterparts of the stack-related fields inVMRuntimeLimits
, namelystack_limit
and the variouslast_wasm_*
fields.As of PR #98, the actual contents of these
StackLimits
are unused (and not updated). This changes in this PR:While the
VMRuntimeLimits
continue to contain the stack-related information for the currently executing stack (either the main stack or a continuation), we ensure that for stacks that are not currently running, their correspondingStackLimits
object now contains accurate values about their stack limits. The doc comment onwasmtime_continuations::StackChain
describes the exact invariants that we maintain.To ensure that these invariants hold, we need to copy some fields between the
VMRuntimeLimits
andStackLimits
objects when stack switching occurs. In particular, thetc_resume
libcall now takes an additional argument: A pointer to theStackLimits
object of the parent of the continuation that is about to be resume. Note that this needs to happen in the libcall itself, in order to obtain accurate values for thelast_wasm_exit_*
values in theVMRuntimeLimits
.