
Release 6.1.1 #1365

Merged
merged 7 commits into master
Oct 9, 2024
Conversation

github-actions[bot] (Contributor) commented Oct 9, 2024

Jira ref: PDP-1453

oguzhanunlu and others added 7 commits October 1, 2024 15:29
Iglu Scala Client has a new lookupSchemasUntil function that fetches the list of schemas up to a given schema key.
If we replace the listSchemasLike function with lookupSchemasUntil, RDB Loader will no longer rely on the list endpoint of Iglu Server.

This commit makes the necessary changes to use the new lookupSchemasUntil function instead of listSchemasLike.
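The difference between the two lookups can be sketched roughly as follows. This is a minimal illustration, not the real Iglu Scala Client API: the actual lookupSchemasUntil resolves schemas through the client, and the SchemaKey and helper below are simplified stand-ins.

```scala
object SchemaLookupSketch {
  // Simplified stand-in for Iglu's SchemaKey (SchemaVer: model-revision-addition)
  final case class SchemaKey(vendor: String, name: String, model: Int, revision: Int, addition: Int)

  // Return the ordered schemas of the same model up to (and including) the
  // given key, instead of listing every schema of the model via the
  // Iglu Server list endpoint.
  def lookupSchemasUntil(until: SchemaKey, available: List[SchemaKey]): List[SchemaKey] =
    available
      .filter(k => k.vendor == until.vendor && k.name == until.name && k.model == until.model)
      .sortBy(k => (k.revision, k.addition))
      .takeWhile(k =>
        k.revision < until.revision ||
          (k.revision == until.revision && k.addition <= until.addition))
}
```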
Some of the tests were failing because windows didn't have the expected timestamps.
To solve this, this commit makes the necessary changes to read
window timestamps from the shredded message instead of using hard-coded values.
After starting to use lookupSchemasUntil in fetchSchemasWithSameModel, we only get the
schemas up to the given schema key, for every schema key. Previously, we got all the
schemas of the same schema model.

This changed the behavior when a message contains multiple schema keys for the same schema model:
RDB Loader tried to create the same table multiple times. To solve this problem,
this commit creates the migration only for the max schema key of each schema model.
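The dedup step described above can be sketched like this. Names are illustrative (not the actual RDB Loader code): the idea is simply to keep the highest schema key per (vendor, name, model) so the migration for each table is created once.

```scala
object MaxKeySketch {
  // Simplified stand-in for Iglu's SchemaKey
  final case class SchemaKey(vendor: String, name: String, model: Int, revision: Int, addition: Int)

  // Group keys by schema model and keep only the max key per group, so a
  // message carrying several keys of the same model yields one migration.
  def maxKeyPerModel(keys: List[SchemaKey]): List[SchemaKey] =
    keys
      .groupBy(k => (k.vendor, k.name, k.model))
      .values
      .map(_.maxBy(k => (k.revision, k.addition)))
      .toList
}
```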
We've seen exceptions in Spark executors like:

```
java.lang.NullPointerException: Cannot invoke "scala.collection.mutable.Set.isEmpty()" because the return value of "com.snowplowanalytics.snowplow.rdbloader.transformer.batch.spark.TypesAccumulator.accum()" is null
```

The error is coming from our Spark Accumulator for accumulating Iglu
types. This is similar to [an issue previously seen][1] in Spark's own
`CollectionAccumulator`. That issue [was fixed in Spark][2] by making
the accumulator's internal state non-final, and synchronizing access to
the internal state. So here we make the exact same change to our own
Accumulator.

It is a rare race condition which is hard to reproduce.

[1]: https://issues.apache.org/jira/browse/SPARK-20977
[2]: apache/spark#31540
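The shape of the fix can be sketched as below, mirroring the Spark CollectionAccumulator change: the internal state is a `var` (non-final) and every access is synchronized, so a reset or copy racing with an add never observes a null set. The class and method names here are simplified stand-ins, not the real TypesAccumulator.

```scala
import scala.collection.mutable

// Accumulator sketch with non-final, synchronized internal state.
class SynchronizedSetAccumulator[A] {
  // var, not val: reset() replaces the set rather than mutating a final field
  private var state: mutable.Set[A] = mutable.Set.empty[A]

  def add(a: A): Unit = synchronized { state += a }

  def isZero: Boolean = synchronized { state.isEmpty }

  def reset(): Unit = synchronized { state = mutable.Set.empty[A] }

  // Snapshot as an immutable Set so callers never touch the mutable state
  def value: Set[A] = synchronized { state.toSet }
}
```

Synchronizing every accessor is cheap relative to the work an executor does per record, and it removes the window in which another thread could see partially initialized state.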
@spenes spenes merged commit 0b447df into master Oct 9, 2024
12 checks passed
4 participants