Does SPARQL TSV Results make sense? #48
Comments
I think this is again the same problem that @pmaria mentioned and wanted to 'document' in a Note: kg-construct/rml-core#113. Basically, we would need to properly define a better reference formulation here. Same for the others.
I see. I believe we need test cases, as only CSV is covered and CSV boils down to iterating over CSV documents. The other formats have quirks. I disagree with the use of BN identifiers: one query can generate different blank node identifiers across executions. SPARQL stipulates that you should at least support CSV and XML (among others); in other words, we could technically limit it to two: one with data type information and CSV for easier processing.
A reference formulation specifies which grammar one can use to access the data of a logical source, not the format.
@chrdebru do you want us to include a description of a reference formulation that indicates the iteration pattern to be per row?
Why not?
@chrdebru could you please clarify this?
Couldn't the delimiter be specified in a CSVW description of the result?
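For illustration, a minimal sketch of what such a CSVW description might look like, assuming the result set is stored in a hypothetical file results.tsv (the dialect description is where the tab delimiter would go):

```json
{
  "@context": "http://www.w3.org/ns/csvw",
  "url": "results.tsv",
  "dialect": {
    "delimiter": "\t",
    "header": true
  }
}
```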
@chrdebru I do not understand this; what would the first iterator be?
I'm using the following data and query as an example:
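As a concrete illustration (the IRIs, property names, and values below are assumed for this example, not taken from any test case), consider a small dataset with one IRI, a string literal, an integer literal, and a blank node, together with a query that selects all of them:

```turtle
@prefix ex: <http://example.org/> .

ex:alice ex:name  "Alice" ;
         ex:age   30 ;
         ex:knows _:someone .
```

```sparql
PREFIX ex: <http://example.org/>
SELECT ?person ?name ?age ?friend
WHERE { ?person ex:name ?name ; ex:age ?age ; ex:knows ?friend . }
```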
CSV:
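Assuming that data and query, the SPARQL 1.1 Query Results CSV serialization would look roughly like this: IRIs lose their angle brackets, literals lose their quotes and datatypes, and blank nodes keep a _:label form, so every cell is just a string.

```csv
person,name,age,friend
http://example.org/alice,Alice,30,_:b0
```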
TSV:
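The TSV serialization of the same (assumed) result keeps a SPARQL-like encoding: the header carries the question marks and the terms are written as in Turtle/SPARQL. The cells below are tab-separated, and the explicit datatype notation is one of the encodings the format allows.

```tsv
?person	?name	?age	?friend
<http://example.org/alice>	"Alice"	"30"^^<http://www.w3.org/2001/XMLSchema#integer>	_:b0
```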
JSON and XML:
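The JSON and XML result formats make the term type and datatype of every binding explicit; for the same assumed result they would look roughly like this:

```json
{
  "head": { "vars": ["person", "name", "age", "friend"] },
  "results": {
    "bindings": [
      {
        "person": { "type": "uri", "value": "http://example.org/alice" },
        "name":   { "type": "literal", "value": "Alice" },
        "age":    { "type": "literal", "datatype": "http://www.w3.org/2001/XMLSchema#integer", "value": "30" },
        "friend": { "type": "bnode", "value": "b0" }
      }
    ]
  }
}
```

```xml
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="person"/>
    <variable name="name"/>
    <variable name="age"/>
    <variable name="friend"/>
  </head>
  <results>
    <result>
      <binding name="person"><uri>http://example.org/alice</uri></binding>
      <binding name="name"><literal>Alice</literal></binding>
      <binding name="age"><literal datatype="http://www.w3.org/2001/XMLSchema#integer">30</literal></binding>
      <binding name="friend"><bnode>b0</bnode></binding>
    </result>
  </results>
</sparql>
```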
So we are iterating over solutions then, right? What, then, is the point of having those formats if we know that all SPARQL implementations must support XML and CSV (at least)? So would supporting only those two not suffice? The following are details that are not relevant anymore if the answer to the above is "yes".
The TSV representation of SPARQL results prescribes how terms are encoded (e.g., the angle brackets around IRIs). The variable names also carry question marks; should references then use the names with or without the question mark? When you retrieve the results as CSV, on the other hand, that encoding is gone: with CSV, we cannot distinguish blank node identifiers from literals (and the same holds for IRIs).
No. I'm talking about the TSV serialization of SPARQL result sets, which must use tabs.
The iterator for SPARQL queries is the SPARQL query itself, so one iterates over the result set. The problem is that, I believe, the community assumed that CSV result sets can be processed as regular CSV files. This is true, but there are unfortunate corner cases. However, we iterate over the solutions in a result set (which are dictionaries), and not over a CSV file.

There is much more information in TSV (a more constrained format), JSON, and XML (explicit data types, resource types, etc.). TSV also uses a different variable naming convention. For CSV and TSV, the lines correspond to iterations. For XML and JSON, however, the returned documents need a different iterator, e.g. something like $.results.bindings[*] for the JSON results or an XPath over the result elements for the XML results. As such, I am questioning the added value of rml:SPARQL_RESULT_TSV.

The test cases for SPARQL queries are a bit naïve, as they only look at CSV without corner cases (e.g., there are no IRIs in the result set).
Neither the documentation nor the test cases provide such examples (the same goes for XML and JSON results, by the way). But I question the usefulness of rml:SPARQL_RESULT_TSV. Taking the example of 0003, we would have the following TSV (sketched below for illustration). How should we iterate over it? We cannot treat it as regular TSV: the angle brackets should be removed from IRIs, literals should be "cast" to their datatypes, and I have no idea what to do with blank node identifiers. Is it possible the group thought that the TSV output would be the same as the CSV output, but with tabs?
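For illustration only (the variables and values here are assumptions, not the actual output of test case 0003), a result row with an IRI, a typed literal, and a blank node would look like this in TSV (tab-separated):

```tsv
?s	?v	?b
<http://example.org/alice>	"30"^^<http://www.w3.org/2001/XMLSchema#integer>	_:b0
```

A processor cannot take these cells verbatim: the first must be stripped of its angle brackets to yield an IRI, the second must be parsed and cast to an xsd:integer literal, and the third is a blank node label for which it is unclear what term to generate.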
Same question for the JSON and XML representations of SPARQL results: do they have bespoke iterations (i.e., not the same iterations as for "regular" JSON or XML files), or would iterating over them require a second iterator?