Timeout while evaluating FedX-Query #4785
Comments
Did you try adjusting the timeout?
Yes, but a 120-second timeout seems more than sufficient for evaluating a relatively simple query on such a small graph. The other queries were evaluated in fractions of a second.
Does it still time out when setting it to 0?
Yes, after about 20 s it throws the TimeoutException and the QueryInterruptedException.
Does it also happen with NativeStore or LmdbStore?
Yes, with both. No difference.
Try setting the timeout directly on the Query object. Can you also update your example code to show how you are setting the timeout on the FedX repo?
I had set the timeout on the Query object as you wrote, and did not change it globally.
Were you able to reproduce it with my example code?
I haven't had a chance to test it out yet. It might be a little while before I have time.
…SPARQL endpoints.
I've created a test to show that the issue only happens against remote endpoints. I also added some "debug" output. @aschwarte10, do you have some time to look into this issue? I think it is critical, as it leads to deadlocks when used with remote endpoints.
This commit provides a unit test that reproduces a deadlock scenario. The issue is caused somewhere in the join with NUnion, specifically when the union parts have relevant statements in multiple sources. Note that the dataset is really small (42 triples, 40 of them relevant and contributing to the result).
Thanks for reporting the issue. It looks like you have indeed stumbled into a deadlock scenario (which fortunately at least recovers through timeout exceptions).
Unfortunately, I only have very limited time available to work on RDF4J. However, I tried to do a bit of an assessment and have some findings that narrow it down (but I do not yet understand the cause). I have also pushed a PR that reproduces it with just 40 triples, which should make it easier to debug.
Some insights to share: the issue occurs in a join with an NUnion node, where the statements of the union themselves are relevant at multiple sources. An example FedX query plan looks like this:
Note: due to the NUnion, the join is executed in …
It looks like the scheduler executing the unions is populated in some order that causes the deadlock, i.e. the results cannot be piped through. It might also be the main executor that blocks.
Note: the design idea is that everything scheduled in the UNION scheduler can be executed and does not have dependencies, while the join scheduler may have dependencies and block.
If somebody finds the time to look further, please go ahead. I will try to look further once my time allows.
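To illustrate why the "no dependencies in the UNION scheduler" design idea matters, here is a minimal, generic sketch of how a bounded thread pool can starve itself when its tasks wait on subtasks queued in the same pool, and how a timeout then surfaces the problem. This is not FedX code and makes no claim about the actual cause of this issue, which is still unknown; the class `SchedulerStarvationSketch`, the method `completes`, and the pool sizes are all hypothetical and only use `java.util.concurrent`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; none of these names come from FedX.
class SchedulerStarvationSketch {

    static final int WORKERS = 2;

    /**
     * Runs WORKERS "join" tasks that each block on one "union" subtask.
     * Returns true if all results arrive within timeoutMillis, false on timeout.
     */
    static boolean completes(boolean sharedPool, long timeoutMillis) {
        ExecutorService joinPool = Executors.newFixedThreadPool(WORKERS);
        // Either reuse the join pool for subtasks or give them their own pool.
        ExecutorService unionPool = sharedPool ? joinPool : Executors.newFixedThreadPool(WORKERS);
        // Makes sure every join worker is busy before any subtask is submitted.
        CountDownLatch allStarted = new CountDownLatch(WORKERS);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < WORKERS; i++) {
                final int id = i;
                results.add(joinPool.submit(() -> {
                    allStarted.countDown();
                    allStarted.await();
                    Future<Integer> union = unionPool.submit(() -> id * 2);
                    // Blocks this worker thread until the subtask has run.
                    return union.get();
                }));
            }
            for (Future<Integer> result : results) {
                result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            }
            return true;
        } catch (TimeoutException e) {
            // All workers are waiting on subtasks that no thread is free to run.
            return false;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            // Interrupts any blocked workers so the JVM can exit.
            joinPool.shutdownNow();
            unionPool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("shared pool completes: " + completes(true, 500));
        System.out.println("separate pools complete: " + completes(false, 5000));
    }
}
```

With a shared pool, both workers wait on subtasks that sit in the queue behind them, so the call only returns through the timeout. With a separate pool for the dependency-free subtasks, the same workload finishes.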
Current Behavior
When using a FedX repository with a specific query and graph (see the repo linked in "Steps To Reproduce"),
((FedXTupleQuery) query).evaluate().stream().collect(Collectors.toList());
(or
((FedXTupleQuery) query).evaluate().hasNext();
) throws
org.eclipse.rdf4j.query.QueryInterruptedException: Query evaluation has run into a timeout
caused by java.util.concurrent.TimeoutException
Expected Behavior
The evaluation provides a correct query result as a List<BindingSet>.
Steps To Reproduce
clone FedX-Query-Test-Repo
run Main.main()
get Exception at line 101
(delete lines 8-48 in db.ttl, then the bug will not occur)
Version
4.3.6
Are you interested in contributing a solution yourself?
None
Anything else?