Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump Spark 3.5.4 #11731

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open

Bump Spark 3.5.4 #11731

wants to merge 6 commits into from

Conversation

pan3793
Copy link
Member

@pan3793 pan3793 commented Dec 9, 2024

Bump Spark 3.5.4, with minimal necessary changes related to SPARK-50235 and SPARK-50463.

Note: this PR targets to 1.7.2 and 1.6.2 (for Java8 users)

@github-actions github-actions bot added the build label Dec 9, 2024
@pan3793
Copy link
Member Author

pan3793 commented Dec 10, 2024

the upgrading causes a core dump in iceberg-spark-3.5's tests

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f8529a34563, pid=83775, tid=83793
#
# JRE version: OpenJDK Runtime Environment Zulu17.54+21-CRaC-CA (17.0.13+11) (build 17.0.13+11-LTS)
# Java VM: OpenJDK 64-Bit Server VM Zulu17.54+21-CRaC-CA (17.0.13+11-LTS, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0xc34563]  ProtectionDomainEntry::object_no_keepalive()+0x3
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /home/runner/work/iceberg/iceberg/spark/v3.5/spark/core.83775)
#
# An error report file with more information is saved as:
# /home/runner/work/iceberg/iceberg/spark/v3.5/spark/hs_err_pid83775.log
#
# Compiler replay data is saved as:
# /home/runner/work/iceberg/iceberg/spark/v3.5/spark/replay_pid83775.log
#
# If you would like to submit a bug report, please visit:
#   http://www.azul.com/support/
#
> Task :iceberg-spark:iceberg-spark-3.5_2.12:test FAILED

@pan3793
Copy link
Member Author

pan3793 commented Dec 10, 2024

the failure seems related to arrow

minimal reproduce command

./gradlew -DsparkVersions=3.5 -DscalaVersion=2.12 -DhiveVersions= -DflinkVersions= -DkafkaVersions= \
    :iceberg-spark:iceberg-spark-3.5_2.12:test --tests=TestRewriteDataFilesAction \
    -Pquick=true -x javadoc

@pan3793
Copy link
Member Author

pan3793 commented Dec 10, 2024

update: the failure is related to SPARK-50235, and the test passed on Spark 3.5.4 RC1 with reverting that patch

@viirya
Copy link
Member

viirya commented Dec 10, 2024

Please check for some discussion there apache/spark#49131 (comment)

@LuciferYang
Copy link

Please check for some discussion there apache/spark#49131 (comment)

Yes, I tested it locally and found that after adding a no-op closeIfFreeable method to the IcebergArrowColumnVector.java in version 3.5, all the test cases in TestRewriteDataFilesAction were able to pass. We can further test it after the release of 3.5.4 RC2.

@pan3793
Copy link
Member Author

pan3793 commented Dec 11, 2024

@viirya @LuciferYang thank you for the guidance, let me try

@github-actions github-actions bot added the spark label Dec 11, 2024
// If a column vector is writable or constant, it should override this method and do nothing.
// See more details at SPARK-50235, SPARK-50463 (Fixed in Spark 3.5.4)
@Override
public void closeIfNotWritable() {}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will rename to closeIfFreeable after RC2 available

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a column vector is writable or constant, it should override this method and do nothing.

we have ConstantColumnVector as well, should we do it vector classes which absolutely requires it ?

Copy link
Member Author

@pan3793 pan3793 Dec 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the codebase, other vectors overwrite void close() {}, so they are not affected

Copy link
Contributor

@singhpk234 singhpk234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch @pan3793 !

// If a column vector is writable or constant, it should override this method and do nothing.
// See more details at SPARK-50235, SPARK-50463 (Fixed in Spark 3.5.4)
@Override
public void closeIfNotWritable() {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a column vector is writable or constant, it should override this method and do nothing.

we have ConstantColumnVector as well, should we do it vector classes which absolutely requires it ?

@pan3793
Copy link
Member Author

pan3793 commented Dec 11, 2024

FYI @LuciferYang @viirya @dongjoon-hyun @wForget

CI is green now, and I think the current Spark branch-3.5 is in good shape for Iceberg.

@dongjoon-hyun
Copy link
Member

Thank you for checking Apache Iceberg side, @pan3793 .

@pan3793 pan3793 changed the title Bump Spark 3.5.4 RC1 Bump Spark 3.5.4 RC2 Dec 16, 2024
@Fokko
Copy link
Contributor

Fokko commented Dec 17, 2024

Thanks @pan3793 for catching this early. Could you create a separate PR for just the patch? We can include this in 1.7.2.

@pan3793
Copy link
Member Author

pan3793 commented Dec 17, 2024

@Fokko I opened #11802

@pan3793 pan3793 changed the title Bump Spark 3.5.4 RC2 Bump Spark 3.5.4 Dec 20, 2024
@pan3793 pan3793 marked this pull request as ready for review December 20, 2024 12:31
@pan3793
Copy link
Member Author

pan3793 commented Dec 20, 2024

Spark 3.5.4 RC3 passed the vote and the jars were available on Maven Central a few minutes ago, I removed the staging repo and it's ready to go.

cc @nastra @Fokko and @jbonofre

@Fokko Fokko added this to the Iceberg 1.7.2 milestone Dec 20, 2024
@@ -59,6 +59,12 @@ public void close() {
accessor.close();
}

@Override
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we land on adding the @Override? Now it doesn't compile against Spark 3.5.2. Leaving out the annotation is safer (it will still work with ≥3.5.4).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we land on adding the @Override? Now it doesn't compile against Spark 3.5.2. Leaving out the annotation is safer (it will still work with ≥3.5.4).

+1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed.

both keeping and removing are fine to me, it has no difference in runtime

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants