merge develop into main before release of 5.1 #5203

Merged
merged 62 commits into from
Nov 21, 2024

Conversation

hmottestad
Contributor

merge develop into main before release of 5.1

hmottestad and others added 30 commits June 5, 2024 20:03
Signed-off-by: Håvard Ottestad <[email protected]>
Signed-off-by: Håvard Ottestad <[email protected]>
Signed-off-by: Håvard Ottestad <[email protected]>
Refactor the existing logic for executing bind joins into a reusable
base class.

This change mostly moves the implementation logic from the existing
ControlledWorkerBindJoin class to a new intermediate implementation
(with the goal to make it reusable in a second step for left joins).

Note that the new bind join implementation no longer uses the
ControlledWorkerJoin as base class, i.e. the decision of which join
implementation to use is moved to the strategy.

For backwards code compatibility the "ControlledWorkerBoundJoin" is
kept, but no longer used. Instead the new code is in
ControlledWorkerBindJoin.
Prepare to execute a specific left join implementation through the
federation strategy.
This change provides the implementation and activation for the left bind
join operator.

The algorithm is as follows:

- execute the left bind join using the regular bound join query
- process the result iteration similarly to BoundJoinVALUESConversionIteration
- remember the set of seen bindings (using the index) and add the original
bindings to those; all non-seen bindings are returned to the result
directly from the input
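
The steps above can be illustrated with a minimal, self-contained sketch. It is not the FedX implementation: the class `LeftBindJoinSketch`, the `RemoteRow` type, and the use of plain string maps as binding sets are all assumptions made for illustration. It only shows how an index per input binding lets the operator pass unmatched input bindings through unchanged.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of left bind join result processing (not FedX code).
class LeftBindJoinSketch {

    // One row from the remote endpoint: the index of the input binding it
    // belongs to, plus the values bound by the OPTIONAL clause.
    static final class RemoteRow {
        final int index;
        final Map<String, String> values;

        RemoteRow(int index, Map<String, String> values) {
            this.index = index;
            this.values = values;
        }
    }

    static List<Map<String, String>> leftBindJoin(
            List<Map<String, String>> input, List<RemoteRow> remote) {
        List<Map<String, String>> result = new ArrayList<>();
        Set<Integer> seen = new HashSet<>();

        // Merge each remote row with the original binding it refers to
        // and remember that this input index produced a match.
        for (RemoteRow row : remote) {
            seen.add(row.index);
            Map<String, String> merged = new LinkedHashMap<>(input.get(row.index));
            merged.putAll(row.values);
            result.add(merged);
        }

        // Input bindings without any match are returned unchanged,
        // which gives the OPTIONAL (left join) semantics.
        for (int i = 0; i < input.size(); i++) {
            if (!seen.contains(i)) {
                result.add(input.get(i));
            }
        }
        return result;
    }
}
```

With an empty remote result, the second loop returns every input binding as-is.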

Note that the terminology in the literature has changed to "bind joins".
Hence, for new classes and methods I try to follow that.

The change is covered by some unit tests.
Bind left joins for OPTIONAL can be disabled using the
"enableOptionalAsBindJoin" flag in the federation config

Integrate the switch between implementations in the unit test as a
parameterized test
- use for-each loop for iterating bindingset
- use IntHashSet
- use Literal#intValue instead of Integer#parseInt
…squashed commit)

Squashed commits:
[9aa87b594c] GH-5124 introduce number of connections and also reduce timeouts
For quite some time, the implementation has used a VALUES clause query
to evaluate bind joins.

The one exception was the code path for bind joins in which all join
arguments are bound: it was still using the old UNION query approach.
That approach is error-prone and no longer required, i.e. the check join
can be executed with the same logic as the regular VALUES clause.
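
As a rough illustration of the VALUES clause approach, the sketch below builds a query string for a batch of bindings. It is not the FedX query builder: the class `ValuesClauseSketch`, the method `buildQuery`, and the `?__index` variable name are assumptions for illustration. The pattern in the usage example has no free variables other than the bound one, which corresponds to the check join case described above.

```java
import java.util.List;

// Hypothetical sketch of a VALUES-based bind join query (not FedX code).
class ValuesClauseSketch {

    // pattern:     a SPARQL graph pattern that uses ?var
    // var:         name of the variable bound by the incoming bindings
    // boundValues: already serialized RDF terms, one per input binding
    static String buildQuery(String pattern, String var, List<String> boundValues) {
        StringBuilder sb = new StringBuilder();
        sb.append("SELECT * WHERE { VALUES (?").append(var).append(" ?__index) { ");
        for (int i = 0; i < boundValues.size(); i++) {
            // The index lets the caller map each result row back
            // to the input binding that produced it.
            sb.append("(").append(boundValues.get(i))
              .append(" \"").append(i).append("\") ");
        }
        sb.append("} ").append(pattern).append(" }");
        return sb.toString();
    }
}
```

For example, `buildQuery("?s <http://example.org/p> <http://example.org/o> .", "s", List.of("<http://example.org/a>", "<http://example.org/b>"))` yields a single query whose result rows carry the index of each input binding for which the fully bound pattern matches.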

Note: an additional unit test for covering bind joins with no free vars
is added.

This change also marks a number of methods and classes used for the old
UNION based approach as deprecated. The implementations are internal to
the FedX engine and can be removed in the next major release.
hmottestad and others added 23 commits November 9, 2024 13:48
Signed-off-by: Håvard Ottestad <[email protected]>
This change adds preparatory infrastructure for supporting different
scheduler implementations. Configuration is prepared by defining a
"SchedulerFactory" interface together with a default implementation
(which essentially mimics the current behavior).

Note that for ease of development some aspects of
ControlledWorkerScheduler are made accessible to sub-classes. The idea
is that the final version has an abstract scheduler class providing
shared functionality and different implementations (e.g. the current
FIFO one and a fair implementation).
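
The factory idea can be sketched as follows. This is a simplified illustration only: the `Scheduler` and `SchedulerFactory` interfaces shown here, their method names, and the `FifoScheduler` class are hypothetical and do not reproduce the signatures of the actual FedX types.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a pluggable scheduler factory (not FedX code).
class SchedulerFactorySketch {

    interface Scheduler {
        Future<?> schedule(Runnable task);

        void shutdown();
    }

    // Configuration point: callers ask the factory for a scheduler
    // instead of instantiating a concrete class themselves.
    interface SchedulerFactory {
        Scheduler create(int workers);
    }

    // Default behavior: a fixed worker pool fed from a FIFO queue.
    static final class FifoScheduler implements Scheduler {
        private final ExecutorService executor;

        FifoScheduler(int workers) {
            this.executor = new ThreadPoolExecutor(workers, workers, 0L,
                    TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        }

        @Override
        public Future<?> schedule(Runnable task) {
            return executor.submit(task);
        }

        @Override
        public void shutdown() {
            executor.shutdown();
        }
    }

    static final SchedulerFactory DEFAULT = FifoScheduler::new;
}
```

A fair scheduler could then be supplied as another `SchedulerFactory` without touching the code that submits tasks.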
- for minor version compatibility the type of the "_taskQueue" field in
the scheduler cannot be changed (to non-final). Hence, for now we use a
dedicated protected initialization method. In the future (next major
release) the idea is to leave the queue entirely managed by the executor
service.
- refinements and clarifications to the javadoc
Previously we introduced support for left bind joins in FedX. The case
of empty left bind joins (i.e. where the clause inside the OPTIONAL does
not provide any statements) was not handled and resulted in an exception.

This change adds support for empty optional joins and passes the
results from the left-hand side through.
…t values when applied to the union of multiple graphs
@hmottestad hmottestad enabled auto-merge November 21, 2024 13:47
@hmottestad hmottestad disabled auto-merge November 21, 2024 13:55
@hmottestad hmottestad enabled auto-merge November 21, 2024 14:11
@hmottestad hmottestad merged commit dd8c9bc into main Nov 21, 2024
11 checks passed
5 participants