Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Spark: Support creating views via SQL #9423
Spark: Support creating views via SQL #9423
Changes from 4 commits
92eba8f
5153b4b
6559f72
ffe9fba
7b86876
71a00a4
f348f8b
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this has to be done in
RewriteViewCommands
because by the time the check rules run, the analyzer would have substituted temporary views into the logical query plan already. Since we're already rewriting unresolved relations in the plan, we can just keep track of the temporary view names at the same time and throw if there were any.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I think you implementation here looks great. I think we just need to run it earlier before temp views might be substituted into the plan in place of
UnresolvedRelation
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nastra, this is what makes me wonder whether we support subqueries properly in the code that rewrites
UnresolvedRelation
in views (qualifyTableIdentifiers
). We may need to supportUnresolvedRelation
withinSubqueryExpression
like this does.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, this doesn't need to be fixed for view resolution in this PR. The relevant part for this PR is that we catch temporary views that are nested in subquery expressions.
Here's an example of a subquery expression in a view:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a test that makes sure using a temp view in a subquery expression fails. I'll also follow-up separately and will look into supporting subqueries properly for views
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nastra, this can't be done here. It needs to be done before any of Spark's rules run. Otherwise, a temporary view may already have been substituted into the plan and could allow the checks to pass even though the view is invalid.
This rule is only responsible for applying the column aliases and comments. Rewriting the identifiers should be done in
RewriteViewCommands
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd also be fine with not tracking this and passing an empty array through. That's not correct for Spark, but it would work for us since Iceberg doesn't store these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
technically we don't need to track it in this PR but we'll eventually need it when we show the properties of views and such. I'd probably keep it here but let me know if you'd like me to remove it here and introduce it in an upcoming PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems reasonable to include it. Let's just leave it as-is since you even have a test for it.