Commit
feat(optimizer): Add scaffolding to create join graphs from logical plans (#3501)

This PR adds the functionality to create join graphs from logical plans. Daft does not use this yet; it is a building block for join reordering.

More concretely, this PR allows us to take a query tree, e.g.

```
                InnerJoin (a = d)
                 /            \
     InnerJoin (a = b)     InnerJoin (c = d)
        /        \           /         \
     Agg       Scan(b)   Project      Scan(d)
  (Count(a))            (c_prime)
      |                      |
   Project                   |
(a <- a_prime)               |
      |                      |
 Scan(a_prime)          Scan(c_prime)
```

and create a join graph. A join graph contains sufficient information to reconstruct a query tree that is equivalent to the query tree used to create the join graph. In the example above, this entails storing information about how relations are connected to each other, e.g.

```
l_a#Aggregate(a) <-> r_b#Source(b)
l_c#Source(c_prime) <-> r_d#Source(d)
l_a#Aggregate(a) <-> r_d#Source(d)
```

A relation here is determined by a "non-reorderable" node. The simplest example is a Source node, which must be a leaf in the query tree. Aside from inner joins, filters, and some projects (those that do not perform computation on join keys), most logical operators are non-reorderable.

In addition to edges between relations, the join graph also maintains a pre-order record of the projections and filters that it encountered in the query tree, which it reapplies at the top of the reconstructed query tree. The very last projection to apply is the output schema of the root logical plan that was used to construct the join graph.
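To make the idea concrete, here is a minimal Python sketch of the traversal described above. It is not Daft's implementation (which lives in the Rust optimizer); the classes `Scan`, `Agg`, `Project`, `InnerJoin` and the function `build_join_graph` are hypothetical names invented for illustration. It also assumes that the `Project` above `Scan(c_prime)` simply renames `c_prime` to `c`, which the example's edge `l_c#Source(c_prime)` suggests but does not state, and it omits filters for brevity.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple, Union


# Hypothetical toy plan nodes, for illustration only (not Daft's types).
@dataclass(frozen=True)
class Scan:
    name: str
    columns: Tuple[str, ...]


@dataclass(frozen=True)
class Agg:
    name: str
    columns: Tuple[str, ...]  # output columns of the aggregation
    child: "Plan"


@dataclass(frozen=True)
class Project:
    renames: Tuple[Tuple[str, str], ...]  # (new_name, old_name) pairs
    child: "Plan"


@dataclass(frozen=True)
class InnerJoin:
    left_key: str
    right_key: str
    left: "Plan"
    right: "Plan"


Plan = Union[Scan, Agg, Project, InnerJoin]
Relations = List[Tuple[str, Set[str]]]  # (relation label, visible columns)


def build_join_graph(root: Plan):
    edges: List[Tuple[str, str]] = []
    projections: List[Tuple[Tuple[str, str], ...]] = []

    def owner(key: str, relations: Relations) -> str:
        # Find the relation on this side of the join that exposes the key.
        for label, cols in relations:
            if key in cols:
                return label
        raise ValueError(f"join key {key!r} not found")

    def visit(plan: Plan) -> Relations:
        if isinstance(plan, InnerJoin):
            # Reorderable: recurse into both sides, then record an edge.
            left, right = visit(plan.left), visit(plan.right)
            edges.append((
                f"l_{plan.left_key}#{owner(plan.left_key, left)}",
                f"r_{plan.right_key}#{owner(plan.right_key, right)}",
            ))
            return left + right
        if isinstance(plan, Project):
            # Assumed rename-only, hence reorderable: record it in
            # pre-order (before descending) and rename visible columns.
            projections.append(plan.renames)
            return [
                (label, {new for new, old in plan.renames if old in cols})
                for label, cols in visit(plan.child)
            ]
        if isinstance(plan, Agg):
            # Non-reorderable: the whole subtree becomes one relation.
            return [(f"Aggregate({plan.name})", set(plan.columns))]
        return [(f"Source({plan.name})", set(plan.columns))]

    visit(root)
    return edges, projections


# The query tree from the example above.
plan = InnerJoin(
    "a", "d",
    InnerJoin(
        "a", "b",
        Agg("a", ("a",), Project((("a", "a_prime"),), Scan("a_prime", ("a_prime",)))),
        Scan("b", ("b",)),
    ),
    InnerJoin(
        "c", "d",
        Project((("c", "c_prime"),), Scan("c_prime", ("c_prime",))),
        Scan("d", ("d",)),
    ),
)
edges, projections = build_join_graph(plan)
for left, right in edges:
    print(f"{left} <-> {right}")
```

Note that the `Project` under `Agg` is never recorded: because `Agg` is treated as non-reorderable, everything beneath it stays inside the `Aggregate(a)` relation. Only the rename above `Scan(c_prime)` ends up in `projections`, where a reconstruction step could reapply it at the top of the rebuilt tree.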