Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Pinot Connector] Support null predicate push down for Pinot connector #15406

Closed

Conversation

xiangfu0
Copy link
Contributor

@xiangfu0 xiangfu0 commented Dec 14, 2022

Description

Support null predicate push down for Pinot connector

Release notes

# Pinot Connector
* Support null predicate push down for Pinot connector

@cla-bot cla-bot bot added the cla-signed label Dec 14, 2022
@xiangfu0 xiangfu0 changed the title Support pushdown pinot null predicate Support null predicate push down for Pinot connector Dec 14, 2022
@xiangfu0 xiangfu0 marked this pull request as draft December 14, 2022 23:45
@ebyhr
Copy link
Member

ebyhr commented Jan 17, 2023

Trino community member looks facing this issue. Can you add a test so that we can start the review process?
https://trinodb.slack.com/archives/CGB0QHWSW/p1673940425810719

@etadelta222
Copy link

Hi @xiangfu0 , could you add a test per the comment above from ebyhr?

@gp-stackid
Copy link

Any update? facing similar issue

@xiangfu0 xiangfu0 force-pushed the support-null-check-predicate branch 6 times, most recently from 9b67866 to 86d406c Compare November 28, 2023 02:31
@xiangfu0 xiangfu0 marked this pull request as ready for review November 28, 2023 05:41
@xiangfu0 xiangfu0 requested a review from elonazoulay November 28, 2023 05:41
@xiangfu0 xiangfu0 changed the title Support null predicate push down for Pinot connector [Pinot Connector] Support null predicate push down for Pinot connector Nov 28, 2023
@xiangfu0 xiangfu0 requested a review from ebyhr December 1, 2023 18:50
@xiangfu0
Copy link
Contributor Author

xiangfu0 commented Dec 1, 2023

Rebased and added a few tests

.matches("VALUES (BIGINT '0')")
.isNotFullyPushedDown(FilterNode.class);
WHERE string_col IS NULL OR string_col = 'null'"""))
.matches("VALUES (BIGINT '1')")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This table has 'null' string in string_col, so pinot will return 1 row after pushing down.

Copy link
Member

@elonazoulay elonazoulay Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the table config be checked for null handling enabled?

@xiangfu0 xiangfu0 force-pushed the support-null-check-predicate branch from 86d406c to abe5078 Compare December 1, 2023 22:31
@xiangfu0 xiangfu0 force-pushed the support-null-check-predicate branch from abe5078 to e61f56a Compare December 2, 2023 00:58
@etadelta222
Copy link

@ebyhr looks ready to be reviewed :)

@mosabua
Copy link
Member

mosabua commented Jan 11, 2024

@xiangfu0 @elonazoulay and @ebyhr can you continue the collaboration on this?

@@ -1727,6 +1728,18 @@ SELECT COUNT(*)
WHERE double_col NOT IN (0.0, -16.33, -17.33)"""))
.matches("VALUES (BIGINT '7')")
.isNotFullyPushedDown(FilterNode.class);

assertThat(query("SELECT price, vendor FROM " + JSON_TABLE + " WHERE price IS NOT NULL"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This table has null handling disabled, should it check for all values? and/or just use the count query below?

WHERE string_col IS NULL"""))
.matches("VALUES (BIGINT '0')")
.isNotFullyPushedDown(FilterNode.class);
WHERE string_col IS NULL OR string_col = 'null'"""))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since null handling is enabled for alltypes should this be AND string_col = 'null' to verify null handling? i.e. this would have the same result as if the table had null handling disabled.

.matches("VALUES (BIGINT '11')")
.isNotFullyPushedDown(FilterNode.class);
WHERE string_col IS NOT NULL AND string_col != 'null'"""))
.matches("VALUES (BIGINT '10')")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would the behavior be different for a table with null handling disabled?

WHERE long_col IS NULL"""))
.matches("VALUES (BIGINT '0')")
.isNotFullyPushedDown(FilterNode.class);
WHERE long_col IS NULL OR long_col = 0"""))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, should this be tested: if null handling is disabled, shouldn't IS NULL be always false, and if enabled it would be min value?

WHERE long_col IS NOT NULL"""))
.matches("VALUES (BIGINT '11')")
.isNotFullyPushedDown(FilterNode.class);
WHERE long_col IS NOT NULL AND long_col != 0"""))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, maybe have an explicit test for null handling enabled vs disabled in a table w the same data, and check for MIN_VALUE and/or the default null value. lmk, can talk offline about it when you have some time.


assertThat(query("SELECT COUNT(*) FROM " + JSON_TABLE + " WHERE price IS NULL"))
.matches("VALUES (BIGINT '0')")
.isFullyPushedDown();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe do this for a copy of the table w null handling enabled? Or you can choose a simpler table (hybrid is pretty simple).
Also: what is the behavior of a hybrid table where null handling is enabled for realtime but disabled for offline or vice versa?

return Optional.of(format("(%s != %s)", predicateArgument, predicateArgument));
}
if (valueSet.isAll()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iirc this is related to apache/pinot#10600 and 601, since the connector supports 11 and 12.1 should this be put back until we upgrade to 1.0.0 (and remove support for all previous versions)?

@@ -131,16 +130,11 @@ private static Optional<String> toPredicate(PinotColumnHandle pinotColumnHandle,
String predicateArgument = pinotColumnHandle.isAggregate() ? pinotColumnHandle.getExpression() : quoteIdentifier(pinotColumnHandle.getColumnName());
ValueSet valueSet = domain.getValues();
if (valueSet.isNone()) {
verify(!domain.isNullAllowed(), "IS NULL is not supported due to different null handling semantics. See https://docs.pinot.apache.org/developers/advanced/null-value-support");
if (domain.isNullAllowed()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be done only if the table has null handling enabled in the config? How would it know the difference between 'null', Long.MIN_VALUE, etc. vs a null value? Is that handled in v1.0.0?

@ebyhr ebyhr removed their request for review May 20, 2024 03:22
Copy link

github-actions bot commented Sep 4, 2024

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

@github-actions github-actions bot added the stale label Sep 4, 2024
@mosabua
Copy link
Member

mosabua commented Sep 4, 2024

👋 @xiangfu0 and @elonazoulay could you continue work on this PR or should we close it?

@github-actions github-actions bot removed the stale label Sep 5, 2024
Copy link

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

@github-actions github-actions bot added the stale label Sep 26, 2024
@ebyhr ebyhr closed this Oct 11, 2024
@ebyhr
Copy link
Member

ebyhr commented Oct 11, 2024

Closing as there was no response. Please feel free to reopen if you continue the work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

6 participants