union of targets should be DISTINCT #143
Comments
The spec clearly states that we are talking about sets of terms, and sets are mathematical constructs where the union is another set. "The target of a target declaration is the set of RDF terms". There would be no harm in making this clearer by adding a word here, but I am not sure why people refer to SPARQL's UNION keyword here. SPARQL is about bindings, not sets of terms.
I agree it is clear at the moment. A union of things that are sets is a set hence unique terms. Each of the target definitions says "set" as well, except sh:targetNode which is a singleton.
SPARQL evaluation works with multi-sets -- set + cardinality of each element. "union" of multi-sets sums the cardinality of elements.
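To make the difference concrete, here is a minimal Python sketch of the two readings of "union". The node names and the two target lists are hypothetical, and `collections.Counter` only stands in for a multiset; this is not how any SPARQL engine is implemented.

```python
from collections import Counter

# Hypothetical focus nodes produced by two target declarations of one shape.
target_a = ["ex:Alice", "ex:Bob"]
target_b = ["ex:Alice", "ex:Carol"]

# Set semantics (the mathematical reading): each term appears once.
set_union = set(target_a) | set(target_b)

# Multiset semantics (the SPARQL reading): cardinalities are summed.
bag_union = Counter(target_a) + Counter(target_b)

print(sorted(set_union))        # three distinct terms
print(bag_union["ex:Alice"])    # cardinality of the shared term
```

Under the set reading ex:Alice is a target once; under the multiset reading it carries cardinality 2 and would be visited twice.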
He posted eclipse-rdf4j/rdf4j#3584
In preparation for a potential future SHACL WG I would like to close GitHub issues that were mainly just questions. Please reopen if you disagree.
(Thread: https://lists.w3.org/Archives/Public/public-shacl/2022Jan/)
https://www.w3.org/TR/shacl/#targets says: "union of terms produced by the individual targets that are declared by the shape".
Say I have a shape with the following targeting:
Say a node matches all of these conditions: will it be selected for validation once and not 4 times?
I.e., is the "union of terms" supposed to be DISTINCT? (UNION in mathematics is distinct, but not in SPARQL)
@HolgerKnublauch> (TQ API) is using a Set which means each target node will only be validated once even if in multiple targets at the same shape.
I believe this is following the intention of the spec. Does any implementer here disagree?
Vladimir: Agreed. But still, the spec should mention DISTINCT.
I'll post this here as an "SHACL Erratum", as per #103
Ashley Sommer> PySHACL does the same. The final collection of targets is a Set object, which deduplicates any identical nodes that are added.
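The behaviour described by both implementers can be sketched roughly as follows. This is an illustration only, not code from PySHACL or the TQ API: `collect_focus_nodes` is a hypothetical helper, and the per-target node lists are made-up data for a shape with four target declarations.

```python
from itertools import chain


def collect_focus_nodes(per_target_nodes):
    """Union the nodes of every target declaration, keeping each node once."""
    return set(chain.from_iterable(per_target_nodes))


# Hypothetical results of four target declarations on one shape.
per_target_nodes = [
    ["ex:n1"],           # e.g. sh:targetNode
    ["ex:n1", "ex:n2"],  # e.g. sh:targetClass
    ["ex:n1"],           # e.g. sh:targetSubjectsOf
    ["ex:n1", "ex:n3"],  # e.g. sh:targetObjectsOf
]

focus_nodes = collect_focus_nodes(per_target_nodes)

# Number of validation runs with and without deduplication.
validations_with_set = len(focus_nodes)
validations_without_dedup = sum(len(nodes) for nodes in per_target_nodes)
```

Here ex:n1 is matched by all four targets but is validated once, so the shape runs 3 times instead of 6.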
Irene Polikoff> To me, this sounds more like an implementation question, rather than a standards question.
Vladimir: The number of Validation Results will be different (unless targets are distinct, there will be duplicate results).
Even if one stored Validation Results in a repo, they would not be deduplicated, since Results are unlikely to use deterministic URLs (as opposed to blank nodes or UUID URNs).
The impact on performance will be a linear slowdown.
If that shape causes a lot of other shapes to be invoked, that can be very significant.