Improve the performance of includes scanning #1735

shaoyie · 2022-05-18T16:22:07Z

Please check if the PR fulfills these requirements

The PR has no duplicates (please search among the Pull Requests before creating one)
The PR follows our contributing guidelines
Tests for the changes have been added (for bug fixes / features)
Docs have been added / updated (for bug fixes / features)
UPGRADING.md has been updated with a migration guide (for breaking changes)

What kind of change does this PR introduce?
Enhancement

What is the current behavior?
Scanning the includes of a large project is very slow when the cache is invalid. The cache is also easily invalidated, because it records the includes as a strict chain. That same chain forces the scan to run on a single thread.

What is the new behavior?
By replacing the array with a sync.Map, we can run the scan in multiple goroutines, which improves performance considerably.
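
As a rough illustration of the idea (not the PR's actual code), the sketch below scans several sources in parallel and records every header they mention in a sync.Map. The function name collectIncludes and the input map, which stands in for a real preprocessor run, are assumptions made for this example.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collectIncludes is a hypothetical helper: sources maps a source file to the
// headers it includes (a stand-in for running the preprocessor on the file).
// Each source is handled in its own goroutine and results land in a sync.Map.
func collectIncludes(sources map[string][]string) []string {
	var seen sync.Map
	var wg sync.WaitGroup
	for _, headers := range sources {
		wg.Add(1)
		go func(headers []string) {
			defer wg.Done()
			for _, h := range headers {
				// LoadOrStore is safe for concurrent use and keeps one entry per header.
				seen.LoadOrStore(h, struct{}{})
			}
		}(headers)
	}
	wg.Wait()

	var out []string
	seen.Range(func(k, _ interface{}) bool {
		out = append(out, k.(string))
		return true
	})
	sort.Strings(out) // sync.Map iteration order is unspecified, so sort for stable output
	return out
}

func main() {
	fmt.Println(collectIncludes(map[string][]string{
		"x.cpp": {"a.h", "b.h"},
		"y.cpp": {"b.h", "c.h"},
	})) // prints [a.h b.h c.h]
}
```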

Does this PR introduce a breaking change, and is titled accordingly?
No breaking change.

Other information:

See how to contribute

CLAassistant · 2022-05-18T16:22:15Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

Sawyer seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

facchinm · 2022-05-23T15:47:33Z

I'm testing this PR and the speedup is amazing 🤩
I still haven't found any corner case where it fails to resolve the includes correctly.

matthijskooijman · 2022-05-24T12:40:41Z

@shaoyie, can you expand a little bit more on how this changes the include scanning process? The old caching code was intended to ensure that the results are always the same, with or without caching. I haven't dug into your new code deeply yet, but I have the impression that this might end up with different results depending on goroutine timing in some cases (I'm thinking of a case where one library exposes a.h and b.h and another exposes just b.h, inclusion of the latter can depend on whether a.h or b.h is resolved first).

In any case, speeding up this process is much welcomed, since it indeed is quite slow currently.

For a related (but distinctly different) improvement to this process, see arduino/arduino-builder#217, which improves caching when compile errors are present by letting the include caching generate .d files (in addition to the compilation process). I never finished that PR properly, since it probably requires changes to recipes/platform.txt, but I never took the time to figure out what is needed exactly and how to do that in a backward compatible way. As I said, it's distinct from this improvement, but if you're interested in also working on that, feel free to work from my code.
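
To make the a.h/b.h concern concrete, here is a toy model of order-dependent resolution. It is only a sketch: the library names, the fixed priority order, and resolveAll are invented for illustration and do not reflect arduino-cli's real resolution rules.

```go
package main

import "fmt"

// library is a hypothetical stand-in for an installed library.
type library struct {
	name    string
	headers []string
}

// libraries is listed in priority order. This is a toy tie-break only.
var libraries = []library{
	{"LibTwo", []string{"b.h"}},
	{"LibOne", []string{"a.h", "b.h"}},
}

func provides(l library, header string) bool {
	for _, h := range l.headers {
		if h == header {
			return true
		}
	}
	return false
}

// resolveAll resolves headers in the given order and returns the names of the
// libraries that end up imported. A header already provided by an imported
// library does not trigger a new import.
func resolveAll(headers []string) []string {
	var imported []library
	var names []string
	for _, h := range headers {
		satisfied := false
		for _, l := range imported {
			if provides(l, h) {
				satisfied = true
				break
			}
		}
		if satisfied {
			continue
		}
		for _, l := range libraries {
			if provides(l, h) {
				imported = append(imported, l)
				names = append(names, l.name)
				break
			}
		}
	}
	return names
}

func main() {
	fmt.Println(resolveAll([]string{"a.h", "b.h"})) // prints [LibOne]
	fmt.Println(resolveAll([]string{"b.h", "a.h"})) // prints [LibTwo LibOne]
}
```

In this model the set of imported libraries depends on which header is resolved first, which is exactly what goroutine timing could vary.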

shaoyie · 2022-05-27T14:47:55Z


Your impression is right: with multiple goroutines, the scanning result may differ depending on the goroutine execution order.
My judgement is that this is acceptable as long as the scanning result doesn't affect the compile result. The new strategy therefore ignores the scanning sequence, which is what limited the old code to a single thread. Instead, it validates the cache by comparing the timestamps of the target file and the source file, which lets us use multiple goroutines.
For an extreme case like the one you mentioned, I leave it to the ResolveLibrary() method to decide which library is chosen. Also, when two files have conflicting filenames, even the original behavior cannot ensure that the right file is chosen; it still takes manual work to find the files with the same name and handle them carefully.

shaoyie · 2022-05-27T16:31:52Z

It is like stable versus unstable sorting: the result may vary but is still valid, depending on your expectations.

cmaglie · 2022-06-15T08:35:58Z

legacy/builder/container_find_includes.go

+	}
+
+	// Loop until all files are handled
+	for (!sourceFilePaths.Empty() || unhandled != 0 ) {


Isn't this unhandled variable redundant here? The WaitGroup should already take care of waiting for all the processes to finish.

shaoyie · 2022-06-19

The unhandled variable is not redundant here. During scanning we may discover new source files. For example, suppose we start with only 4 files to scan but have 8 cores. sourceFilePaths will be empty as soon as the first goroutines fetch their tasks, and the loop would end. But scanning those 4 files may append newly found files to sourceFilePaths, and these would never be handled. The unhandled variable ensures the loop ends only when sourceFilePaths is empty and all goroutines have finished their jobs.
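
The termination problem described here can be sketched with a small worker pool in which finished work may enqueue more work. This is an illustrative reimplementation under assumptions, not the PR's loop: scanAll, the found map (a stand-in for real include scanning), and the condition variable are all invented for the example; only the role of the unhandled counter mirrors the discussion.

```go
package main

import (
	"fmt"
	"sync"
)

// scanAll scans every file reachable from roots. found maps a file to the
// files its scan discovers. unhandled counts files that are queued or still
// being scanned, so a momentarily empty queue does not stop the workers.
func scanAll(roots []string, found map[string][]string, workers int) []string {
	var (
		mu        sync.Mutex
		cond      = sync.NewCond(&mu)
		queue     = append([]string(nil), roots...)
		unhandled = len(roots)
		seen      = map[string]bool{}
		scanned   []string
		wg        sync.WaitGroup
	)
	for _, r := range roots {
		seen[r] = true
	}
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				mu.Lock()
				// Empty queue but work in flight: wait, new files may still arrive.
				for len(queue) == 0 && unhandled > 0 {
					cond.Wait()
				}
				if unhandled == 0 {
					mu.Unlock()
					return
				}
				file := queue[0]
				queue = queue[1:]
				mu.Unlock()

				next := found[file] // the "scan"; the map is only read here

				mu.Lock()
				scanned = append(scanned, file)
				for _, n := range next {
					if !seen[n] {
						seen[n] = true
						queue = append(queue, n)
						unhandled++
					}
				}
				unhandled-- // this file is done
				cond.Broadcast()
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return scanned
}

func main() {
	found := map[string][]string{
		"sketch.ino.cpp": {"a.cpp", "b.cpp"},
		"a.cpp":          {"c.cpp"},
	}
	// One root and eight workers: seven workers see an empty queue at first.
	fmt.Println(len(scanAll([]string{"sketch.ino.cpp"}, found, 8))) // prints 4
}
```

A WaitGroup alone only tells the caller when the goroutines have exited; the counter is what lets each worker tell "nothing to do yet" apart from "nothing left to do".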

cmaglie · 2022-06-15T08:51:06Z

legacy/builder/container_find_includes.go

+
+	// The first source file is the main .ino.cpp
+	// handle it first to setup environment for other files
+	findIncludesUntilDone(ctx, cache, sourceFilePaths.Pop())


Why is main.ino.cpp processed separately? If the processing order of the files doesn't matter, this should not be needed.

shaoyie · 2022-06-19

In my original design, I also treated main.ino.cpp no differently from the other files, and that works in most cases.
But it fails in some special cases. For example, as the commit history of this PR shows, the test case
assert run_command(["compile", "-b", "adafruit:samd:adafruit_feather_m4", sketch_path])
in test_compile_part_1.py fails.
Because of how arduino-cli finds libraries, main.ino.cpp has to be the root of the scan, and the other scans start from it.

cmaglie · 2022-06-15T09:11:15Z

legacy/builder/container_find_includes.go

+		if(ok) {
+			// Ignore the pre-added folder
+			if(!strings.HasPrefix(header, "no_resolve")) {
+				library, imported := ResolveLibrary(ctx, header)
+				if library == nil {
+					if !imported {
+						// Cannot find the library and it is not imported, is it gone? Remove it later
+						entryToRemove = append(entryToRemove, header)
+						cache.valid = false
+					}
+				} else {
+
+					// Add this library to the list of libraries, the
+					// include path and queue its source files for further
+					// include scanning
+					ctx.ImportedLibraries = append(ctx.ImportedLibraries, library)
+					// Since it is already in cache, append the include folder only
+					appendIncludeFolderWoCache(ctx, header, library.SourceDir)
+					sourceDirs := library.SourceDirs()
+					for _, sourceDir := range sourceDirs {
+						queueSourceFilesFromFolder(ctx, ctx.CollectedSourceFiles, library, sourceDir.Dir, sourceDir.Recurse)
+					}
+				}
+			}
+		}
+		return true
+	})


If I read this correctly, you're basically reusing all the previous set of include paths, but without keeping the history of how you obtained that set. This won't work in all cases, in particular, if the cache is invalidated you must redo all the work again to be sure to not pick libraries that are not needed anymore.

shaoyie · 2022-06-19

Yes, the strategy now only cares about missing files; it ignores redundant entries in the cache as long as they don't break the build.
In the particular case you mentioned, clearing out the libraries that are no longer needed requires a clean build. Most build tools offer a "Clean build" alongside "Build", and this case can rely on the difference between the two.
I'm not sure whether there is a better way to handle this case that strikes a better balance between performance and accuracy.

cmaglie · 2024-08-08T15:23:13Z

Superseded by #2625.

Sawyer added 2 commits May 18, 2022 23:20

refactor: Improve the performance of includes scanning

20b98f4

refactor: Simplify the cache related code

aa48ffe

per1234 added the type: enhancement and topic: code labels May 18, 2022

per1234 assigned cmaglie May 18, 2022

fix: Handle .ino.cpp first to prepare for other sources

a233f85

cmaglie reviewed Jun 15, 2022

cmaglie mentioned this pull request Jun 6, 2023

Sketch re-compiled when unnecessary #1996

Open

3 tasks

per1234 mentioned this pull request Jun 18, 2023

"Upload" also does a build/verify arduino/arduino-ide#2103

Closed

3 tasks

cmaglie closed this Aug 8, 2024

per1234 added the conclusion: duplicate label Aug 8, 2024
