Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use SSA def descriptors in copy propagation #65250

Merged
merged 2 commits into from
Feb 25, 2022

Conversation

SingleAccretion
Copy link
Contributor

@SingleAccretion SingleAccretion commented Feb 12, 2022

The next round of copy propagation improvements: push SSA defs on the stacks instead of nodes and optimize the main loop.

The main benefit is performance: my PIN tells me this change reduces the number of instruction retired by ~0.40% when replaying benchmarks.run.

Additionally, the way this change is structured will allows easily replicating previous behavior once the LHS locals no longer have meaningful VNs.

There some small diffs with this change, they're caused by the slightly different rebalancing order of the hash table as we no longer "overwrite" the stack pointer when pushing a def and don't push defs for shadowed locals.

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Feb 12, 2022
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Feb 12, 2022
@ghost
Copy link

ghost commented Feb 12, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

The next round of copy propagation improvements: push SSA defs on the stacks instead of nodes and optimize the main loop.

The main benefit is performance: my PIN tells me this change reduces the number of instruction retired by ~0.40% when replaying benchmarks.run.

Additionally, the way this change is structured will allow me to easily replicate previous behavior once the LHS locals no longer have meaningful VNs.

There some small diffs with this change, they're caused by the slightly different rebalancing order of the hash table as we no longer "overwrite" the stack pointer when pushing a def and because we don't push defs for shadowed locals.

Author: SingleAccretion
Assignees: -
Labels:

area-CodeGen-coreclr, community-contribution

Milestone: -

@SingleAccretion SingleAccretion changed the title Directly use SSA defs in copy propagation Use SSA defs in copy propagation Feb 12, 2022
@SingleAccretion SingleAccretion changed the title Use SSA defs in copy propagation Use SSA def descriptors in copy propagation Feb 12, 2022
@SingleAccretion SingleAccretion force-pushed the Copy-Propagation-Ssa-Defs branch 3 times, most recently from f098e85 to 31d148f Compare February 12, 2022 11:40
@SingleAccretion SingleAccretion marked this pull request as ready for review February 12, 2022 16:52
@SingleAccretion
Copy link
Contributor Author

Linux musl x64 failure looks like #59542. Windows ARM64 timeout does not seem related (change affects all targets equally).

@dotnet/jit-contrib

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM.

Suggested a few extra asserts.

src/coreclr/jit/copyprop.cpp Show resolved Hide resolved
src/coreclr/jit/copyprop.cpp Outdated Show resolved Hide resolved
Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

src/coreclr/jit/copyprop.cpp Show resolved Hide resolved
else
{
assert((lclNode->gtFlags & GTF_VAR_DEF) != 0);
// This will be "RESERVED_SSA_NUM" for promoted struct fields assigned using the parent struct.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you planning to add an assert for this? Also, what about the TODO-CQ? Is that something that is planned to address in subsequent PR?

Copy link
Contributor Author

@SingleAccretion SingleAccretion Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will assert, seems like that would catch unexpected cases.

I am personally not planning on fixing it (well, #61049 would fix it, but it requires more design work), this is just a general TODO-CQ to make the reader aware of this inefficiency (the fact that it is not "by design").

src/coreclr/jit/copyprop.cpp Show resolved Hide resolved
@JulieLeeMSFT JulieLeeMSFT requested a review from TIHan February 16, 2022 03:12
@JulieLeeMSFT JulieLeeMSFT added this to the 7.0.0 milestone Feb 16, 2022
@SingleAccretion SingleAccretion force-pushed the Copy-Propagation-Ssa-Defs branch from d5c0db9 to a552f20 Compare February 23, 2022 13:30
@kunalspathak
Copy link
Member

push SSA defs on the stacks instead of nodes and optimize the main loop.

I was looking at the changes closely prior to merging and didn't quite follow what part of (which) loop is getting optimized by using defs instead of nodes. Could you please elaborate?

@SingleAccretion
Copy link
Contributor Author

I was looking at the changes closely prior to merging and didn't quite follow what part of (which) loop is getting optimized by using defs instead of nodes. Could you please elaborate?

@kunalspathak We are moving work from the propagation loop to the "push" one: previously, we were fetching the SSA number and the VN from the node in the propagation loop (which is not that cheap for partial definitions especially), and checking it then. Now we effectively check it only once (and push "unavailable" defs in case the checks failed that then short-circuit the propagation loop).

This also reduces the number of indirections once we start pushing defs based not the LHS's VN but the SSA def itself (my next change - has some good CQ benefits too) from lclNode -> SSA number -> SSA def -> VN to SSA def -> VN.

Notably, this is not all that could have been done: we're still traversing the IR two times (instead of just one like SSA does), and we could push a {ssaNum, VN} tuple as a def instead of a pointer to the SSA def to get rid of one more indirection. But that's future work.

@kunalspathak
Copy link
Member

Thanks for the explanation!

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kunalspathak kunalspathak merged commit 9012b23 into dotnet:main Feb 25, 2022
@SingleAccretion SingleAccretion deleted the Copy-Propagation-Ssa-Defs branch February 25, 2022 18:18
BruceForstall added a commit to BruceForstall/runtime that referenced this pull request Mar 2, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Mar 27, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants