Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate generation step from HW build #214

Merged
merged 8 commits into from
Aug 7, 2024
Merged

Conversation

rgantonio
Copy link

@rgantonio rgantonio commented Aug 7, 2024

This PR separates the RTL generation made from wrappers and chisel from the build.

Problem: The intermittent bug is caused by parallel build but sequential RTL generation causes CI to fail. This happens because sometimes the timing of the generation file does not finish before the build starts. This happens because sometimes the Chisel sbt or mill takes too long to download whatever it needs from a server.

Solution: Sequentially build all RTL files first before starting parallel builds. This ensures that all the RTL files are ready before the build happens.

Benefits:

  • This takes away the worry about intermittent fails due to the timing of sequential and parallel parts.
  • Useful for HW designers to see which signals need to be connected to because they only need to run the make rtl-gen command and it generates the files that need to be checked.
  • It's a step to make integration in Occammy better. The problem with the high-level integration is that we only need to separate the generation of RTL (Chisel and wrappers) first before building. Previously we generate and build. In the top-level integration we generate per cluster first before building.

What changes in CI?:

  • We have an extra make rtl-gen step before the build.

@rgantonio rgantonio added bug Something isn't working clean up To keep repos clean labels Aug 7, 2024
@rgantonio rgantonio self-assigned this Aug 7, 2024
@rgantonio
Copy link
Author

rgantonio commented Aug 7, 2024

Check it out! I'm waiting for the docker build of PR #215 to finish. Will run the test and the data reshuffle should pass.

@rgantonio rgantonio marked this pull request as ready for review August 7, 2024 11:01
Copy link

@IveanEx IveanEx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way the rtl being generated is not friendly to system integration, as occamy will provide the cfg from elsewhere instead of the one located at cfg/ inside snax repo. To avoid confusion, the cfg file is renamed to other names in occamy. Is it possible that we find a way that the generation does not rely on the file name of cfg, but rely on the information inside cfg? I think the integration would be easier.

@IveanEx
Copy link

IveanEx commented Aug 7, 2024

At the same time, the problem on data reshufflers still exists...

@rgantonio rgantonio mentioned this pull request Aug 7, 2024
5 tasks
@rgantonio rgantonio force-pushed the ry/separate-rtl-gen branch from b530dd7 to d0d8448 Compare August 7, 2024 11:52
@rgantonio
Copy link
Author

@IveanEx

Is it possible that we find a way that the generation does not rely on the file name of cfg, but rely on the information inside cfg?

Let's fix with issue #216 to and see if it works nicely with the "occamy" version.

At the same time, the problem on data reshufflers still exists...

This is the verilator version problem hehe PR #215 fixes it. See the main now 😄

@rgantonio
Copy link
Author

rgantonio commented Aug 7, 2024

Note this also benefits from making runs faster by a few seconds! Or should I say nothing changes really hehe. For example...

Old run:

image

New run:

image

@IveanEx IveanEx self-requested a review August 7, 2024 13:13
Copy link

@IveanEx IveanEx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good! We finished one task in #216 !

@rgantonio
Copy link
Author

Thanks @IveanEx !!

@rgantonio rgantonio changed the title [WIP] Separate generation step from HW build Separate generation step from HW build Aug 7, 2024
@rgantonio rgantonio merged commit 4a366d9 into main Aug 7, 2024
27 checks passed
@rgantonio rgantonio deleted the ry/separate-rtl-gen branch August 7, 2024 13:20
@JosseVanDelm
Copy link

Hi,

Sorry I was on holiday, so this review appears a bit too late.
This is not a fix in my opinion.

Okay for me to provide a seperate step to generate RTL, but this means that
make -C target/snitch_cluster bin/snitch_cluster.vlt -j$(nproc)
Will fail without having run make rtl-gen first?

I put quite a bit of effort in #136 to make this work, and this PR just removes the original "fix"?

The original problem is that make does not work with multiple outputs being generated in one build command. This PR does not fix that issue.
Make itself does not start building a thing unless all dependencies are ready and by that regard, it will never have this parallel issue, unless one build command generates multiple things.

I.e. people should be able to just run make whatever -j and it should work (without running anything prior). If it does not work, there is an issue with the makefiles. I'm being a bit strict here, because if we start requiring people to run additional commands we open the gates for letting everything become a giant mess in no time. This PR also does not adapt the documentation for the additional make step.

Either you:

  • adapt clustergen/rtlgen/whatevergen to not be a script that generates all files in one go, but then at some point we would be making our own version of a build system and of an RTL generation. Are we making our own version of Chisel 🤔 ?
  • Add again the thing to force a serialized build as per the GNU documentation, which I already added in build: Fix make dependencies for multiple concurrent jobs #136 . The GNU documentation also mentions more convoluted fixes, but I think the one I picked before works quite well.
# Force a serialized build in parallel make through a circular dependency
# Implemented as documented in:
# https://www.gnu.org/software/automake/manual/html_node/Multiple-Outputs.html
# Please note the limitations of this method over there

@IveanEx
Copy link

IveanEx commented Aug 12, 2024

Hi,

Sorry I was on holiday, so this review appears a bit too late. This is not a fix in my opinion.

Okay for me to provide a seperate step to generate RTL, but this means that make -C target/snitch_cluster bin/snitch_cluster.vlt -j$(nproc) Will fail without having run make rtl-gen first?

I put quite a bit of effort in #136 to make this work, and this PR just removes the original "fix"?

The original problem is that make does not work with multiple outputs being generated in one build command. This PR does not fix that issue. Make itself does not start building a thing unless all dependencies are ready and by that regard, it will never have this parallel issue, unless one build command generates multiple things.

I.e. people should be able to just run make whatever -j and it should work (without running anything prior). If it does not work, there is an issue with the makefiles. I'm being a bit strict here, because if we start requiring people to run additional commands we open the gates for letting everything become a giant mess in no time. This PR also does not adapt the documentation for the additional make step.

Either you:

  • adapt clustergen/rtlgen/whatevergen to not be a script that generates all files in one go, but then at some point we would be making our own version of a build system and of an RTL generation. Are we making our own version of Chisel 🤔 ?
  • Add again the thing to force a serialized build as per the GNU documentation, which I already added in build: Fix make dependencies for multiple concurrent jobs #136 . The GNU documentation also mentions more convoluted fixes, but I think the one I picked before works quite well.
# Force a serialized build in parallel make through a circular dependency
# Implemented as documented in:
# https://www.gnu.org/software/automake/manual/html_node/Multiple-Outputs.html
# Please note the limitations of this method over there

Maybe the easiest way is to add rtl-gen dependency to bin/snitch_cluster.vlt target? I also do not know whether there is a parallel compilation problem, and I think it should not happen.

Occamy's integration flow requires snax cluster to only generate rtl, and I think decoupling rtl generation and vlt compilation is quite good as rtl source can be used by multiple targets later.

@rgantonio
Copy link
Author

After a discussion with @JosseVanDelm I think I understand better what is needed.

Technically, this fix is not a problem but it takes away the parallel build or parallel execution features of Make.
Our limitation is that snaxgen is generating all the files in one run. We need to generate 1 output at a time so that parallel generation in snaxgen is possible.

Also it helps with setting dependencies for "proper" dependency tracing. For example, if the streamer is missing then Makefile can report or do something to make sure that the streamer is generated or so.

Well to be fair, we had in mind originally to separate the rtl generation from the build because we need it for our FPGA/ Synthesis flow because we just literally need the files generated first and skip the builds.

What @JosseVanDelm points out is the proper way of using Makefile, and of course, using its parallel execution features.

There's a way to make this work! So I made issue #228 to track this problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working clean up To keep repos clean
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants