Make listener broadcast thread safe #31692

abstratt · 2024-12-12T19:50:50Z

Context

Reviewing cheatsheet

Before merging the PR, comments starting with

❌ ❓ must be fixed
🤔 💅 should be fixed
💭 may be fixed
🎉 celebrate happy things

6hundreds · 2024-12-13T12:19:44Z

Have other synchronization techniques been considered? Say, making the broadcast field an AtomicReference to avoid unnecessary blocking?

bot-gradle · 2024-12-13T13:27:42Z

The following builds have passed:

All Performance Tests (Trigger)
- Build Scan

mlopatkin

Left some comments inline.

mlopatkin · 2024-12-13T13:53:15Z

...gradle/internal/cc/impl/isolated/IsolatedProjectsParallelConfigurationIntegrationTest.groovy

+                new Thread() {
+                    @Override
+                    void run() {
+                        ${server.callFromBuildUsingExpression("'configure-' + project.name")}


❓ Why do we need this thread? We start it and block it, but it just lives in the background; I don't see how it affects the configuration of the projects.

Doing it straight from the script body would mean that if we had more projects (p) than workers (w), we would never see more than w requests, as there would be no workers left to run the other scripts. The result would be a timeout waiting for the other (p - w) requests that we were expecting.
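The worker-starvation scenario described above can be reproduced outside Gradle with a plain thread pool. This is only a sketch under stated assumptions, not the test from this PR: the class name `WorkerStarvationDemo` is hypothetical, the fixed pool stands in for Gradle workers, and the latch stands in for the blocking HTTP server.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in: the pool plays the role of Gradle workers and each
// task plays the role of a project script blocking on the test server.
class WorkerStarvationDemo {

    /** Returns how many "projects" checked in before the timeout expired. */
    static int arrivalsBeforeTimeout(int projects, int workers, long timeoutMillis)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        final AtomicInteger arrived = new AtomicInteger();
        final CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < projects; i++) {
                pool.execute(new Runnable() {
                    @Override
                    public void run() {
                        arrived.incrementAndGet();
                        try {
                            release.await(); // occupies the worker while blocked
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    }
                });
            }
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            while (arrived.get() < projects && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
            return arrived.get(); // captured before the tasks are released below
        } finally {
            release.countDown(); // unblock everything so the pool can drain
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("arrived: " + arrivalsBeforeTimeout(4, 2, 500));
    }
}
```

With 4 projects and 2 workers, only the 2 tasks holding a worker can check in, so the wait for the remaining (p - w) arrivals runs into the timeout.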

TBH, I don't think we would need to block on the HTTP server anyways. As long as we can verify the taskgraph.whenReady listeners being invoked via the console output, we should be good. Right, @alllex ?

It's a good catch about the number of workers, Rafael.

The test itself is not designed to prove that the implementation is thread-safe but to expose a case when an implementation is more obviously unsafe. So, it goes by ensuring multiple project evaluations (running in parallel) reach the same point in build logic and then start spamming the callback registration requests. The number of projects does not matter as much -- the important point is to get contention on callback registration. To that end, we still should have the blocking server, ensuring all projects are being configured in parallel and all of them are aligned before the "start line" and start spamming together.

Rather than blocking in an anonymous thread, I would resolve the issue with workers by making the number of workers match the number of projects (--max-workers ? and +1 for the root). It's okay if we want to reduce the number of projects for this, as we can, for instance, increase the number of listeners per project.

I would also leave a comment somewhere that this test is more of a soak test than a usual integration test, as it does not guarantee correctness, but instead detects some cases of unsafe implementations.
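The "start line" idea can be sketched in plain Java with a `CyclicBarrier`: every thread waits at the barrier, and only once all have arrived do they start registering listeners together. The names below (`StartLineSoakDemo`, `contend`) are hypothetical, and a synchronized list stands in for the listener registry; this is not the integration test from the PR.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CyclicBarrier;

// Hypothetical soak-style check: align all threads, then contend on registration.
class StartLineSoakDemo {

    /** Returns the number of registrations observed after all threads finish. */
    static int contend(int threads, final int perThread) {
        final List<String> registry = Collections.synchronizedList(new ArrayList<String>());
        final CyclicBarrier startLine = new CyclicBarrier(threads);
        List<Thread> workers = new ArrayList<Thread>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread worker = new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        startLine.await(); // nobody proceeds until everyone is here
                    } catch (Exception e) {
                        throw new IllegalStateException(e);
                    }
                    for (int i = 0; i < perThread; i++) {
                        registry.add("listener-" + id + "-" + i);
                    }
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return registry.size();
    }

    public static void main(String[] args) {
        System.out.println("registered: " + contend(8, 200));
    }
}
```

A count lower than threads × perThread would reveal lost registrations in an unsafe registry. A matching count only means no problem was detected in that run, which is why such a check behaves like a soak test.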

Thanks for the clarification, Alex. Indeed, this test passing once does not guarantee correctness, but at the same time I think that is the case with many of our concurrency tests.

I changed the test not to use ad hoc background threads

mlopatkin · 2024-12-13T13:56:23Z

platforms/core-runtime/messaging/src/main/java/org/gradle/internal/event/ListenerBroadcast.java

❌ There is a data race on reading the broadcast field. It should be volatile or its reads should be synchronized.

You can use the @GuardedBy annotation to catch issues if you decide to go with the synchronized approach.

I suspect source should be massaged to become thread-safe too (e.g. by wrapping it in Lazy).

What is the data race for broadcast, Mike? What concrete case could cause a race?

I figured that since broadcast objects (BroadcastDispatch) are immutable, and are private to the owning ListenerBroadcast, we would be good.

What is the data race for broadcast, Mike? What concrete case could cause a race?

As the reads aren't synchronized, there is nothing that makes an updated broadcast value visible to other threads. This means that we may still miss notifications, because the sender thread will see some obsolete list of listeners. With mutable classes it can also cause the reading thread to observe partially constructed object, but for immutable things like BroadcastDispatch it isn't an issue.
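The visibility argument above can be illustrated with a simplified holder of an immutable listener snapshot. This is a sketch, not the real `ListenerBroadcast`: the class `SnapshotBroadcast` and its methods are hypothetical. Writers are serialized with `synchronized`, and the `volatile` field is what lets a later unsynchronized read observe the latest write.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical simplification; not the actual Gradle class.
class SnapshotBroadcast {
    // volatile: a write by one thread happens-before subsequent reads by others.
    // Without it, a reader could keep seeing an obsolete list of listeners.
    private volatile List<Runnable> listeners = Collections.emptyList();

    // Writers are serialized so concurrent adds cannot overwrite each other.
    synchronized void add(Runnable listener) {
        List<Runnable> copy = new ArrayList<Runnable>(listeners);
        copy.add(listener);
        listeners = Collections.unmodifiableList(copy);
    }

    // Readers do a single volatile read and iterate the snapshot without locking.
    void broadcast() {
        for (Runnable listener : listeners) {
            listener.run();
        }
    }

    public static void main(String[] args) {
        SnapshotBroadcast broadcast = new SnapshotBroadcast();
        broadcast.add(new Runnable() {
            @Override
            public void run() {
                System.out.println("notified");
            }
        });
        broadcast.broadcast();
    }
}
```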

Went with volatile. Did nothing about source. Is the issue about us ending up with multiple ProxyDispatchAdapter instances created for a single ListenerBroadcast? If so, is that really a problem?

If so, is that really a problem?

Well, it isn't particularly cheap to create. And I personally prefer to follow safe publication rules unless I have a very good reason not to do so, like prohibitive performance cost. Saves brain cycles by not thinking whether it is actually safe, and helps with future proofing the code.

With unsafe publication we have three options:

1. One thread writes source, other threads see the fully constructed written instance. This is a good outcome.

2. Multiple threads write source, then they and other threads see the fully constructed but different instances. This is fine, though we pay the price of creating several sources.

3. One or multiple threads write source, other threads see partially constructed instances. Then we see various Heisenbugs with weird stacktraces. For now the ProxyDispatchAdapter is immutable, so it won't happen because of the freezing action in the constructor, but if someone adds something lazily initialized there, the code will break.

Regarding source, I just made it lock on the ListenerBroadcast as well. Looked into using Lazy, however it is not compatible with clients that are stuck with Java 6 source level (static methods on interfaces are not supported).
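Locking on the owning object for lazy creation can be sketched as follows. The names are hypothetical (`LazySourceHolder`, with `ExpensiveProxy` standing in for `ProxyDispatchAdapter`), and the counter exists only to make the single creation observable. It compiles at Java 6 source level since it uses no lambdas or interface static methods.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of lazy creation guarded by the owner's monitor.
class LazySourceHolder {
    // Stand-in for an object that is not cheap to create.
    static final class ExpensiveProxy { }

    // Test aid only: counts how many proxies were ever created.
    final AtomicInteger created = new AtomicInteger();

    private ExpensiveProxy source; // guarded by "this"

    // Both the null check and the write happen under the same lock, so at most
    // one instance is created and every caller sees it fully constructed.
    synchronized ExpensiveProxy getSource() {
        if (source == null) {
            created.incrementAndGet();
            source = new ExpensiveProxy();
        }
        return source;
    }

    public static void main(String[] args) {
        LazySourceHolder holder = new LazySourceHolder();
        System.out.println(holder.getSource() == holder.getSource());
    }
}
```

This rules out the second and third outcomes listed above, at the cost of taking the lock on every call to `getSource()`.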

...re-runtime/messaging/src/main/java/org/gradle/internal/event/AnonymousListenerBroadcast.java

...gradle/internal/cc/impl/isolated/IsolatedProjectsParallelConfigurationIntegrationTest.groovy

abstratt · 2024-12-17T15:17:36Z

Have other synchronization techniques been considered? Say, making the broadcast field an AtomicReference to avoid unnecessary blocking?

@6hundreds I looked into it, however that project is using Java 6 source compatibility, so no lambdas, meaning the code would become much more verbose.

Have you seen AtomicReference pushed as a faster alternative to synchronized? The only comparison I saw showed Atomic being faster with a couple of threads, but slower with a higher number of threads.
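For comparison, the AtomicReference alternative would look roughly like the sketch below when written without lambdas. `AtomicReference.updateAndGet` only exists since Java 8, so at Java 6 source level the update has to be spelled out as an explicit `compareAndSet` retry loop. The class `CasBroadcast` is hypothetical and is not what the PR ended up with.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical lock-free variant using an explicit compare-and-set loop.
class CasBroadcast {
    private final AtomicReference<List<String>> listeners =
        new AtomicReference<List<String>>(Collections.<String>emptyList());

    void add(String listener) {
        while (true) {
            List<String> current = listeners.get();
            List<String> updated = new ArrayList<String>(current);
            updated.add(listener);
            // Succeeds only if no other thread replaced the list in the meantime.
            if (listeners.compareAndSet(current, Collections.unmodifiableList(updated))) {
                return;
            }
            // Lost the race: loop and rebuild the copy from the fresh value.
        }
    }

    List<String> snapshot() {
        return listeners.get();
    }

    public static void main(String[] args) {
        CasBroadcast broadcast = new CasBroadcast();
        broadcast.add("first");
        broadcast.add("second");
        System.out.println(broadcast.snapshot());
    }
}
```

Under contention each failed attempt throws away a copied list and retries, which is one plausible reason such a loop may stop paying off as the number of threads grows.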

Changes performed

abstratt · 2024-12-17T23:14:21Z

@bot-gradle test APT

bot-gradle · 2024-12-18T00:35:58Z

The following builds have passed:

All Performance Tests (Trigger)
- Build Scan

mlopatkin

❌ I believe we should address thread-safety of the source too.

mlopatkin · 2024-12-18T12:09:22Z

...re-runtime/messaging/src/main/java/org/gradle/internal/event/AnonymousListenerBroadcast.java

@@ -29,7 +29,7 @@ public AnonymousListenerBroadcast(Class<T> type, Dispatch<MethodInvocation> forw
    }

    @Override
-    public void removeAll() {
+    public synchronized void removeAll() {
        super.removeAll();
        add(forwardingDispatch);


❌ As we don't use synchronized when reading the broadcast, we may see it between removeAll and add(forwardingDispatch), so we may miss broadcasting to forwardingDispatch.

Introduced a method in the base class to replace broadcast atomically
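The difference between the two-step reset and an atomic replacement can be sketched like this. `ForwardingBroadcast` and its method names are hypothetical and strings stand in for dispatch targets; the actual base-class method added in the PR is not shown in this thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: resetting to "forwarding only" in two writes vs. one.
class ForwardingBroadcast {
    private final String forwarding;
    private volatile List<String> targets;

    ForwardingBroadcast(String forwarding) {
        this.forwarding = forwarding;
        this.targets = Collections.singletonList(forwarding);
    }

    synchronized void add(String target) {
        List<String> copy = new ArrayList<String>(targets);
        copy.add(target);
        targets = Collections.unmodifiableList(copy);
    }

    // Two writes: readers do not lock, so one of them can observe the empty
    // list in between and skip the forwarding target entirely.
    synchronized void removeAllInTwoSteps() {
        targets = Collections.emptyList();
        targets = Collections.singletonList(forwarding);
    }

    // One write: readers see either the old list or the new one, never empty.
    synchronized void removeAllAtomically() {
        targets = Collections.singletonList(forwarding);
    }

    List<String> targets() {
        return targets;
    }

    public static void main(String[] args) {
        ForwardingBroadcast broadcast = new ForwardingBroadcast("forwarding");
        broadcast.add("extra");
        broadcast.removeAllAtomically();
        System.out.println(broadcast.targets());
    }
}
```

Making the method `synchronized` only excludes other writers. It is the single assignment that removes the window for unsynchronized readers.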

abstratt · 2024-12-19T04:18:25Z

❌ I believe we should address thread-safety of the source too.

Please take another look, Mike

mlopatkin

LGTM, thank you for your patience!

mlopatkin · 2024-12-19T05:42:10Z

...re-runtime/messaging/src/main/java/org/gradle/internal/event/AnonymousListenerBroadcast.java

-    public void removeAll() {
-        super.removeAll();
-        add(forwardingDispatch);
+    public synchronized void removeAll() {


💭 synchronized on this method should have been enough, but this is fine too.

Hmm... I got it wrong then - I thought the fact we are reading without locking would allow the broadcast to be seen between these two mutating operations, leading to the problem you described. No?

#31692 (comment)

Hmm... I got it wrong then - I thought the fact we are reading without locking would allow the broadcast to be seen between these two mutating operations, leading to the problem you described. No?

#31692 (comment)

Shouldn't write comments at 6am; I somehow thought that you'd made reads synchronized too.

alllex

LGTM!
I left some smaller questions to double-check before merging but not critical

alllex · 2024-12-19T08:19:27Z

...gradle/internal/cc/impl/isolated/IsolatedProjectsParallelConfigurationIntegrationTest.groovy

+        where:
+        it << (1..10)


🤔 Can you explain what this does?

alllex · 2024-12-19T08:20:18Z

platforms/core-runtime/messaging/src/main/java/org/gradle/internal/event/ListenerBroadcast.java

- * <p>Implementations are not thread-safe.</p>
- *


💭 Should we annotate the class with @ThreadSafe ?

alllex · 2024-12-19T08:22:17Z

...ore-runtime/messaging/src/test/groovy/org/gradle/internal/event/ListenerBroadcastTest.groovy

+        where:
+        n << (1..4)


🤔 Is the n used anywhere or does it play some other role?

Just to produce multiple runs of the test. Not sure there is another way to do that?

If we are trying to compensate for the lack of certainty (which this test cannot provide), it's better to make the test itself more "intensive" than to run it multiple times. It's similar to why we run each performance test once, but it internally does everything it can to provide certainty, e.g. runs multiple batches of work.

I don't feel strongly about this, but it does feel weird to have an unused variable that forces the test to run multiple times

abstratt · 2024-12-19T11:56:36Z

LGTM, thank you for your patience!

Likewise!

Issue: #31537

* Annotate `ListenerBroadcast` as thread-safe
* Ensure `ListenerBroadcast.getSource()` never creates more than one `ProxyDispatchAdapter`
* Ensure `AnonymousListenerBroadcast.removeAll()` updates `broadcast` atomically
* Remove note on thread-unsafety of `AnonymousListenerBroadcast`
* Ensure reads reflect writes on `broadcast`

Also: Add task-graph listener registration test for Isolated Projects

abstratt self-assigned this Dec 12, 2024

abstratt mentioned this pull request Dec 12, 2024

Add task-graph listener registration test for Isolated Projects #31540

Closed

abstratt force-pushed the rchaves/master/make-listener-broadcast-thread-safe branch from c6d93e5 to 4ccac50 Compare December 13, 2024 03:54

abstratt marked this pull request as ready for review December 13, 2024 11:59

abstratt requested review from a team as code owners December 13, 2024 11:59

abstratt requested review from mlopatkin and 6hundreds and removed request for a team December 13, 2024 11:59

abstratt added this to the 8.13 RC1 milestone Dec 13, 2024

mlopatkin previously requested changes Dec 13, 2024

mlopatkin reviewed Dec 13, 2024

abstratt force-pushed the rchaves/master/make-listener-broadcast-thread-safe branch 2 times, most recently from 25678f6 to 037fff6 Compare December 17, 2024 15:16

abstratt closed this Dec 17, 2024

abstratt reopened this Dec 17, 2024

abstratt force-pushed the rchaves/master/make-listener-broadcast-thread-safe branch from 037fff6 to 4bcedd6 Compare December 17, 2024 16:14

mlopatkin previously requested changes Dec 18, 2024

mlopatkin reviewed Dec 18, 2024

mlopatkin approved these changes Dec 19, 2024

alllex approved these changes Dec 19, 2024

abstratt force-pushed the rchaves/master/make-listener-broadcast-thread-safe branch from a7a2ebb to 816ec0b Compare December 20, 2024 00:50

abstratt added this pull request to the merge queue Dec 20, 2024

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 20, 2024

abstratt and others added 2 commits December 20, 2024 00:13

Add a concurrency test for ListenerBroadcast

dd77fc4

Also: Add task-graph listener registration test for Isolated Projects

abstratt force-pushed the rchaves/master/make-listener-broadcast-thread-safe branch from 816ec0b to dd77fc4 Compare December 20, 2024 03:13

abstratt added this pull request to the merge queue Dec 20, 2024

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 20, 2024

blindpirate added this pull request to the merge queue Dec 20, 2024

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 20, 2024

blindpirate added this pull request to the merge queue Dec 20, 2024

Merged via the queue into master with commit 67a0768 Dec 20, 2024
22 checks passed

blindpirate deleted the rchaves/master/make-listener-broadcast-thread-safe branch December 20, 2024 08:27
