Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

onepagers for MSBuildServer and RAR caching #11005

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

SimaTian
Copy link
Contributor

Onepagers as per our internal team discussion.


### Risks

The project was already attempted once, however it was postponed because

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the risk of teammates being loaned to divisional initiatives threatens completion and delivery of relevant work

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, but this is an implicit risk in every project, so I don't think it's necessary to call that out.

Getting closer to the possibility of decoupling from Visual Studio.

### Stakeholders

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we coordinate with other teams such as run time when they would enable this?

Some communication overhead

## Plan

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thhorough testing should be a priority in the effort.


Small performance improvement in the short term. Enabling further
optimizations in the long term. (these improvements are for the Dev Kit
and inner loop CLI scenarios)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have some estimation of the performance gain? Eventually, are there any scenarios that would significantly benefit from the build server?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently the performance gain is in not having to start the MSBuild process over and over again for every build. E.g. the perfomance gain is one quite large process being kept alive, which admittedly is not all that much.
However for the future, we can leverage it to cache some additional stuff more agressively. We can also use it to monitor the folders of the last build and for example re-evaluate in the background - and evaluations are costly.
For the concrete numbers @rainersigwald, can you elaborate please? I have the overall knowledge of the stuff we discussed, but not enough data.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I don't have concrete numbers. It's likely not huge for most scenarios, though incremental build of a single project (new-to-.NET scenario) will likely see a solid change.

We should collect numbers as part of this work. IMO unless the numbers show things getting worse, it's worth going forward to

  1. get this feature to "done", and
  2. give us a place to hang future background processing.

optimizations in the long term. (these improvements are for the Dev Kit
and inner loop CLI scenarios)

Getting closer to the possibility of decoupling from Visual Studio.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate a bit on how would the Build Server help in this decoupling?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Visual Studio is currently acting almost like a MSBuild server would in some ways. It is a persistent process, that is invoking portions of MSBuild as required. It handles caching, has some file monitoring that it uses for knowing what to rebuild and what to keep and probably some other things I'm not aware of.
This creates a tension since the behavior is almost the same as MSBuild, yet different enough to confuse and sometimes cause issues.
In a perfect world, we would like Visual Studio to instead communicate with MSBuild as a process so that the behavior is the same. Currently, they have no reason to do that. We would like to set up the MSBuild server and then start shifting from "VS uses portions of MSBuild directly" to "VS calls MSBuild server with requests".
@JanKrivanek is this close enough description or did I miss something please?


Tomas Bartonek, Rainer Sigwald. Successful handover means turning on the
feature, dogfooding it for long enough to ensure we have reasonable
certainty that nothing breaks and then rolling it out.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any other partners that could dogfood the feature? Specifically, Jared mentioned that we could learn from their experiences with compiler server so perhaps they could help us dogfood the feature?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should go to our close-friend repos like SDK and roslyn to get them to opt in before we opt in for everyone.


### Stakeholders

Tomas Bartonek, Rainer Sigwald. Successful handover means turning on the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"MSBuild team" would be probably more descriptive for readers who are not familiar with the team members.


### Risks

The project was already attempted once, however it was postponed because
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, but this is an implicit risk in every project, so I don't think it's necessary to call that out.


### Goals and motivations

1ES team wants to isolate their File I/O related to the which is causing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this sentence, is it missing a part? The scenario needs more clarification. What is I/O isolation in this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems likely. I'm sorry, my bad. I was editing it and missed a word. I will take a closer look in the next iteration.
If I understood correctly they have complicated infrastructure, and the current MSBuild caching is pulling files from all nodes at once. This creates a tangled mess of IOs together with their other stuff. They want to separate the caching IO to a separate process so that it is separate from the other File IO related stuff so that it stops polluting their logs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ccastanedaucf that sounds accurate to me, right?


1ES team will provide the initial cache implementation. We will review
their PRs and do the performance evaluations. Handover will be
successful if nothing breaks and we meet our performance requirements.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"we meet our performance requirements" - what are they?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No regression at the minimum. I mentioned it earlier so I ommited it here. I can remedy.


### Impact

The only impact we’re concerned about is the performance. There will be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who will benefit from this feature? 1ES only or external customers as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mostly depends on the performance impact. If the performance is neutral, it will be mostly 1ES who will benefit (even though I imagine there can be a customer who would benefit in a similar way to the 1ES)
If the performance is positive, then it could have a broader impact.

Also, one of our discussion items was to transition from multiprocess to multithreaded - if we can pull that one off, this will lose the burden of IPC, further improving the perf. However that is more of a "wishlist/longerm" sort of plan.

### Cost

Week for reviewing the provided PR. Additional two weeks for performance
testing conditional on the Perfstar infrastructure being functional.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is PerfStar a prerequisite for this then? Could this be done without PerfStar by testing the performance manually?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do it manually, it would be more difficult and costly. Also, we want to have Perfstar for setting the baselines as far as I can tell.


1ES team creates the PR wih the RAR cache implementation.

We review the PR with a special emphasis on the performance side of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens after the review? What is the expected cost of supporting 1ES? Are we expecting some follow-ups on this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we merge, it works and it's done. There is no expected follow up work besided the usual maintenance.
This should mostly be "only" enabling of a neat feature that will help 1ES and possibly improve performance.

@SimaTian
Copy link
Contributor Author

Hello @tkapin @donJoseLuis, I went over the comments and answered what I could.
That being said I've got kind of a technical question - how much more in depth should I do in the next iteration please? (I will fill in the requested information that I put into the comments for now at the very least)
I'm aware I was probably too vague at some places and didn't explain deeply enough at others. Part of this I attribute to the fact that I was sticking to the "one pager" format. I was actively cutting some stuff away to keep the stuff as concise as possible with the hope to keep it contained on, well, one page.
I have no issue writing more so this is more of a check - what is an expected length of one pager please? I probably took it too literally.

that we would communicate with via a thin client. We want to get from
the current state of “spawn a complete process for every CLI invocation”
to “we have a server process in the background and we only spawn a small
CLI handler that will tell the server what to build”.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


Small performance improvement in the short term. Enabling further
optimizations in the long term. (these improvements are for the Dev Kit
and inner loop CLI scenarios)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I don't have concrete numbers. It's likely not huge for most scenarios, though incremental build of a single project (new-to-.NET scenario) will likely see a solid change.

We should collect numbers as part of this work. IMO unless the numbers show things getting worse, it's worth going forward to

  1. get this feature to "done", and
  2. give us a place to hang future background processing.


Tomas Bartonek, Rainer Sigwald. Successful handover means turning on the
feature, dogfooding it for long enough to ensure we have reasonable
certainty that nothing breaks and then rolling it out.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should go to our close-friend repos like SDK and roslyn to get them to opt in before we opt in for everyone.


The project was already attempted once, however it was postponed because
it surfaced a group of bugs that weren’t previously visible due to the
processes not being persistent. Most of those bugs should be solved by
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is a situation where our change here can reveal latent bugs in other code. We hope that the last attempt flushed out the most critical ones, but I don't know of a way to detect them in advance.

@@ -0,0 +1,54 @@
## RAR caching

During every build we need to gather the graph of references and pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assembly references

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants