Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backtrace support, part 2: Shadow VMRuntimeLimits for each stack #99

Merged
merged 2 commits into from
Feb 15, 2024

Conversation

frank-emrich
Copy link

@frank-emrich frank-emrich commented Feb 12, 2024

As of PR #98, each ContinuationObject contains an object of type StackLimits. In addition, there exists such an object for the main stack, it is stored in the VMContext and the StackChain::MainStack variant at the list of currently active continuations points to it. These StackLimits objects contain counterparts of the stack-related fields in VMRuntimeLimits, namely stack_limit and the various last_wasm_* fields.

As of PR #98, the actual contents of these StackLimits are unused (and not updated). This changes in this PR:

While the VMRuntimeLimits continue to contain the stack-related information for the currently executing stack (either the main stack or a continuation), we ensure that for stacks that are not currently running, their corresponding StackLimits object now contains accurate values about their stack limits. The doc comment on wasmtime_continuations::StackChain describes the exact invariants that we maintain.

To ensure that these invariants hold, we need to copy some fields between the VMRuntimeLimits and StackLimits objects when stack switching occurs. In particular, the tc_resume libcall now takes an additional argument: A pointer to the StackLimits object of the parent of the continuation that is about to be resume. Note that this needs to happen in the libcall itself, in order to obtain accurate values for the last_wasm_exit_* values in the VMRuntimeLimits.

@frank-emrich frank-emrich changed the base branch from backtrace-infra1 to main February 14, 2024 17:34
Copy link
Member

@dhil dhil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few quick comments.

crates/cranelift/src/func_environ.rs Show resolved Hide resolved
crates/runtime/src/continuation.rs Show resolved Hide resolved
@frank-emrich
Copy link
Author

There's currently an issue with the CI, because I originally created this PR against a base branch other than main. I've tested things locally and will create a fresh PR after this PR is approved so that the CI gets run before merging.

@dhil
Copy link
Member

dhil commented Feb 15, 2024

There's currently an issue with the CI, because I originally created this PR against a base branch other than main. I've tested things locally and will create a fresh PR after this PR is approved so that the CI gets run before merging.

I think closing and opening the PR again ought to do the trick.

Copy link
Member

@dhil dhil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@frank-emrich frank-emrich enabled auto-merge (squash) February 15, 2024 14:11
@frank-emrich frank-emrich merged commit e8078a5 into main Feb 15, 2024
21 checks passed
@frank-emrich frank-emrich deleted the backtrace-infra2 branch March 12, 2024 18:21
frank-emrich added a commit that referenced this pull request Mar 19, 2024
It recently became necessary to regenerate the expected output of most
(all?) of the files in tests/disas. Concretely, we were generating
slightly different CLIF than upstream even on modules not involving any
WasmFX instructions.

This PR rectifies this situation, making sure that we generate identical
CLIF again. This required two changes:

## Global CLIF variable for runtime limits pointer
Since #99, we called the function
`wasmtime_cranelift::func_environ::FuncEnvironment::declare_vmruntime_limits_ptr`
unconditionally for every compiled function, meaning that every function
prelude now contained a definition of the corresponding global variable.

This PR changes this behavior: I restored the original logic for when to
call `declare_vmruntime_limits_ptr`. Instead of using the global
variable `vmruntime_limits` to access the `VMRuntimeLimits`, we now load
this pointer ourselves once per `resume` instruction. Since the load
instruction is marked as `readonly`, it should be coalesced into a
single load, even if multiple `resume`s appear in the same function.

I've benchmarked this, using a micro-benchmark with 2 resume
instructions in the same function, and as expected there is no
observable difference in performance.

## Order of libcall declarations
In `builtin.rs`, we put the declarations of our own libcalls into the
middle of the list of all libcalls in the `foreach_builtin_function`
macro. This shifted the indices of some existing libcalls, which caused
the CLIF to differ when using these. I've moved all the `tc_` libcalls
to the end of the list, thus leaving the indices of existing libcalls
unchanged.

As a result, I've replaced all files in tests/disas with the versions
from upstream
(bytecodealliance/wasmtime@afaf1c7, the
most recent upstream commit that we've merged)
frank-emrich added a commit that referenced this pull request Mar 21, 2024
…ns (#136)

Currently, we can overflow the stack while running inside a
continuation, without the runtime having any way of detecting this.
This PR partially rectifies this, by making the existing stack limit
checks that get emitted by cranelift in every wasm function prelude work
correctly while running inside a continuation.

All that was required to enable the stack limit checks was the
following:
1. Stop zero-ing out the `stack_limit` value in `VMRuntimeLimits`
whenever we `resume` a continuation.
2. When creating a continuation, set a reasonable value for the
`stack_limits` value in its `StackLimits` object.

Note that all the required infrastructure to make sure that whenever we
switch stacks, we save and restore the `stack_limits` value inside
`VMRuntimeLimits` and the `StackLimits` object of the involved stacks
was already implemented in #98 and #99. In this sense, enabling these
checks is "free": The limits were already checked, but previously using
a limit of 0.

The only remaining question is what the "reasonable value" for the stack
limits value mentioned above is. As discussed in #122, the stack limit
checks that cranelift emits in function preludes are rather limited, and
these limitations are reflected in the checks that this PR provides:
When entering a wasm function, they check that the current stack pointer
is larger than the `stack_limit` value in `VMRuntimeLimits`. They do not
take into account how much stack space the function itself will occupy.
No stack limit checks are performed when calling a host function.

Thus, this PR defines a config option `wasmfx_red_zone_size`. The idea
is that we define the stack limit as `bottom_of_fiber_stack` +
`wasmfx_red_zone_size`. Thus, the stack checks boil down to the
following:
Whenever we enter a wasm function while inside a continuation, we ensure
that there are at least `wasmfx_red_zone_size` bytes of stack space
left.

I've set the default value for `wasmfx_red_zone_size` to 32k. To get a
rough idea for a sensible value, I determined that a call to the
`fd_write` WASI function occupies ~21k of stack space, and generously
rounded this up to 32k.

**Important**: This means that these stack limit checks are incomplete:
Calling a wasm or host function that occupies more than
`wasmfx_red_zone_size` of stack space may still result in an undetected
stack overflow!
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants