Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Have callees set up their shadow stack frame. #220

Merged
merged 1 commit into from
Dec 20, 2024

Conversation

vext01
Copy link

@vext01 vext01 commented Dec 19, 2024

There is some nuance to this, so raising as a draft for now. I will comment on an item I want to discuss before we merge

Before this change we'd have caller's allocate shadow space. Besides being an unconventional strategy, this also means that we allocate shadow space for things which will never require shadow space. We even had to conservatively allocate shadow space before external calls just in case they call-back into functions that do need shadow space.

This change makes each callee requiring a shadow frame allocate it itself.

Read the comment at the top of ShadowStack.cpp for implementation details.

Measuring about a 22% speedup on bigloop.

// interpreter loop (which is long-lived) contains a commonly jumped-to
// setjump(), so we probably do want to fix this soon.
//
// YKFIXME: If you wanted to fix this you'd have to reload the shadow stack
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I wanted to discuss.

I wonder if we should try to fix this right off the bat. That's assuming you agree with the reasoning outlined in the comment.

It requires a little bit of implementation effort, but should be doable.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From memory when we last discussed this, I believe all we have to do to reset the shadow stack after a longjump is to duplicate this line
store ptr %1, ptr @shadowstack_0, align 8
and put it right after the setjump call.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, so I think what this PR does is already "correct" but (at least temporarily) can leak arbitrary amounts of shadow stack space. I'm fine with that temporarily: I think the fix for the leak is for setjmp (not longjmp!) to be nobbled in the way Lukas is suggesting.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and put it right after the setjump call.

my idea was to reload the shadow pointer immediately before and then update all later uses to the reload, but in hindsight, I think Lukas' solution is simpler to implement.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but (at least temporarily) can leak arbitrary amounts of shadow stack space

Indeed. That's what I concluded too.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my idea was to reload the shadow pointer immediately before and then update all later uses to the reload

I think that would only reset the local shadow stack pointer, but as you correctly pointed out in case 1) this is already done automatically by the long jump. Storing that value back into the shadow stack global ensures subsequent calls also load the correct pointer value from the global.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you can easily load it back before you jump: you don't know where the right value will be in the jmpbuf. But, if you could do that, it would I think be equivalent to doing it after the setjmp. Either way, the basic idea is solid!

//
// Returns a pointer to the result of the call to malloc that was used to
// heap allocate memory for a shadow stack.
CallInst *insertMainPrologue(Function *Main, GlobalVariable *SSGlobal,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this for now but I suspect the "right" long term fix will also fix threading. That probably means dynamically intercepting process / thread creation, rather than inserting code into main directly.

@ltratt
Copy link

ltratt commented Dec 19, 2024

I think this should go in as-is: there are things we could improve, yes, but this will already give us an important boost.

@ltratt
Copy link

ltratt commented Dec 19, 2024

I suggest we undraft this, and get it in. We can of course do an even better version in the coming days!

@vext01
Copy link
Author

vext01 commented Dec 19, 2024

Are there any little nitpicks I need to fix? Typos? Style?

I'd be surprised if you couldn't find anything at all :)

@vext01 vext01 marked this pull request as ready for review December 19, 2024 23:05
@ltratt
Copy link

ltratt commented Dec 19, 2024

If there were, I'd have said so, I promise!

@ltratt ltratt added this pull request to the merge queue Dec 19, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 19, 2024
Before this change we'd have caller's allocate shadow space. Besides
being an unconventional strategy, this also means that we allocate
shadow space for things which will never require shadow space. We even
had to conservatively allocate shadow space before external calls just
in case they call-back into functions that do need shadow space.

This change makes each callee requiring a shadow frame allocate it
itself.

Read the comment at the top of ShadowStack.cpp for implementation
details.

Measuring about a 22% speedup on bigloop.
@vext01
Copy link
Author

vext01 commented Dec 20, 2024

This was just a name clash. Force pushed a fix.

@ltratt ltratt added this pull request to the merge queue Dec 20, 2024
Merged via the queue into ykjit:main with commit 4b7a0d5 Dec 20, 2024
2 checks passed
@vext01 vext01 deleted the new-sstack-allocs branch December 20, 2024 13:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants