Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flink: Fix TestIcebergSourceWithWatermarkExtractor flakiness #9309

Merged
merged 2 commits into from
Dec 18, 2023

Conversation

pvary
Copy link
Contributor

@pvary pvary commented Dec 16, 2023

Fix most of the flakiness in TestIcebergSourceWithWatermarkExtractor.
Still #9308 is needed for the full fix

@ajantha-bhat
Copy link
Member

Please also handle for other flink modules. Flakiness can come from any of the modules.

Copy link
Contributor

@stevenzwu stevenzwu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pvary feel free to add 1.16 and 1.17 in the same PR. it is a smaller unit test change. it is ok to include all 3 versions in one PR to speed up the turn-around.

@@ -95,8 +97,7 @@ public class TestIcebergSourceWithWatermarkExtractor implements Serializable {
.setRpcServiceSharing(RpcServiceSharing.DEDICATED)
.setConfiguration(
reporter.addToConfiguration(
// disable classloader check as Avro may cache class in the serializers.
new Configuration().set(CoreOptions.CHECK_LEAKED_CLASSLOADER, false)))
new Configuration().set(PipelineOptions.ALLOW_UNALIGNED_SOURCE_SPLITS, true)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we probably need both options here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I run the test 1000 times without the classloader issue to appear. Any idea how to check? Since the same issue appears with the configuration set, I suspect that the error message is a red herring

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I run the test 1000 times without the classloader issue to appear. Any idea how to check?

not really. in the past, this tends to happen during CI build.

I will leave the decision to you. if it happen again, we would need another PR.

@@ -304,6 +322,11 @@ public void testThrottling() throws Exception {

try (CloseableIterator<RowData> resultIterator = stream.collectAsync()) {
JobClient jobClient = env.executeAsync("Iceberg Source Throttling Test");
CommonTestUtils.waitForAllTaskRunning(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the main fix to make sure all tasks running first?

other code refactoring/reordering is just cosmetic/style thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed to reorder the Iceberg commit after starting the job. So the first records are moved. To make the test more consistent, moved the record generation to the beginning of the test

@pvary pvary merged commit 9342f64 into apache:main Dec 18, 2023
13 checks passed
@pvary
Copy link
Contributor Author

pvary commented Dec 18, 2023

@stevenzwu: The v1.16 does not have the ALLOW_UNALIGNED_SOURCE_SPLITS yet, so I had to remove this setting. I still hope that the fix would be enough itself. Merged the changes.
Is there a way to monitor the results of a specific test case?

@stevenzwu
Copy link
Contributor

@pvary you can monitor the actions/workflows for Flink CI. https://github.com/apache/iceberg/actions/workflows/flink-ci.yml

@pvary
Copy link
Contributor Author

pvary commented Dec 18, 2023

@pvary you can monitor the actions/workflows for Flink CI. https://github.com/apache/iceberg/actions/workflows/flink-ci.yml

Thanks @stevenzwu, I will try to follow what happens

lisirrx pushed a commit to lisirrx/iceberg that referenced this pull request Jan 4, 2024
geruh pushed a commit to geruh/iceberg that referenced this pull request Jan 26, 2024
devangjhabakh pushed a commit to cdouglas/iceberg that referenced this pull request Apr 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants