
convert complete service

timrdf edited this page Feb 14, 2012 · 29 revisions

background: https://github.com/timrdf/csv2rdf4lod-automation/wiki/A-quick-and-easy-conversion was written years ago. It sets up a conversion from an RDFa encoding of the conversion parameters.

client URL: http://gemini.tw.rpi.edu/dev/lebot/elixir/services/start

The client page accepts UI changes and reflects the results in the DOM's RDFa. It also uses jqueryrdf to parse its own RDFa and POST the result to:

service URL: http://gemini.tw.rpi.edu/dev/elixir/services/convert-complete

accept POST:

Include directly (we're using this approach now):

<http://logd.tw.rpi.edu/source/data-gov/dataset/4383/version/2010-Oct-22>
    a dcat:Dataset ;
    dcterms:source <http://explore.data.gov/download/wfna-38ey/XLS> ;
    a conversion:VersionedDataset ;
    conversion:base_uri           "http://logd.tw.rpi.edu" ;
    conversion:source_identifier  "data-gov" ;
    conversion:dataset_identifier "4383" ;
    conversion:version_identifier "2010-Oct-22" ;
    # we need delimiter
    .
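A minimal Python sketch of how a client might assemble this POST body from the four identifiers. The function names, the `text/turtle` Content-Type, and the prefix declarations are assumptions added for illustration; the page does not say which media type the service expects. The request is only built here, not sent.

```python
from urllib.request import Request

# Service URL as given on this page.
SERVICE_URL = "http://gemini.tw.rpi.edu/dev/elixir/services/convert-complete"

# Prefix declarations are an addition; the snippet on this page omits them.
PREFIXES = (
    "@prefix dcat:       <http://www.w3.org/ns/dcat#> .\n"
    "@prefix dcterms:    <http://purl.org/dc/terms/> .\n"
    "@prefix conversion: <http://purl.org/twc/vocab/conversion/> .\n"
)


def version_uri(base_uri, source_id, dataset_id, version_id):
    """Build the versioned dataset URI following the pattern used on this page."""
    return f"{base_uri}/source/{source_id}/dataset/{dataset_id}/version/{version_id}"


def build_turtle(base_uri, source_id, dataset_id, version_id, download_url):
    """Serialize the conversion parameters as a Turtle description."""
    subject = version_uri(base_uri, source_id, dataset_id, version_id)
    return (
        PREFIXES
        + f"\n<{subject}>\n"
        + "    a dcat:Dataset, conversion:VersionedDataset ;\n"
        + f"    dcterms:source <{download_url}> ;\n"
        + f'    conversion:base_uri           "{base_uri}" ;\n'
        + f'    conversion:source_identifier  "{source_id}" ;\n'
        + f'    conversion:dataset_identifier "{dataset_id}" ;\n'
        + f'    conversion:version_identifier "{version_id}" .\n'
    )


def build_request(turtle):
    """Wrap the Turtle body in a POST request (Content-Type is an assumption)."""
    return Request(
        SERVICE_URL,
        data=turtle.encode("utf-8"),
        headers={"Content-Type": "text/turtle"},
        method="POST",
    )


if __name__ == "__main__":
    body = build_turtle(
        "http://logd.tw.rpi.edu", "data-gov", "4383", "2010-Oct-22",
        "http://explore.data.gov/download/wfna-38ey/XLS",
    )
    print(body)
```

Passing the result of `build_request` to `urllib.request.urlopen` would send it; that step is left out because the service's availability and response format are not confirmed here.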

If the dataset is Linked Data and provides appropriate access information (in dcat or dcterms:source):

    a dcat:Dataset;
  • set up directory structure
  • retrieve URL into source
  • create conversion trigger
  • convert raw
  • publish raw to endpoint
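The first of the steps above, setting up the directory structure, could look roughly like the following Python sketch. The layout (`source/SOURCE/DATASET/version/VERSION/` with `source`, `manual`, `automatic`, and `publish` subdirectories) is an assumption modeled on csv2rdf4lod conventions, and `setup_version_dirs` is a hypothetical helper, not part of the service.

```python
import os
import tempfile

# Assumed subdirectories of a version directory; adjust to the real layout.
SUBDIRS = ("source", "manual", "automatic", "publish")


def setup_version_dirs(root, source_id, dataset_id, version_id):
    """Create the (assumed) conversion directory tree and return the version dir."""
    version_dir = os.path.join(
        root, "source", source_id, dataset_id, "version", version_id
    )
    for sub in SUBDIRS:
        # exist_ok=True makes a repeated call harmless.
        os.makedirs(os.path.join(version_dir, sub), exist_ok=True)
    return version_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        created = setup_version_dirs(root, "data-gov", "4383", "2010-Oct-22")
        print(sorted(os.listdir(created)))
```

The retrieved file from dcterms:source would then be written into the `source` subdirectory before the conversion trigger is created.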

response:

# some provenance and void rooted on the VersionedDataset that associates the dataset to 
# a named graph in an endpoint.

@prefix sd:         <http://www.w3.org/ns/sparql-service-description#> .
@prefix void:       <http://rdfs.org/ns/void#> .
@prefix conversion: <http://purl.org/twc/vocab/conversion/> .

<http://logd.tw.rpi.edu/sparql>
  sd:url <http://logd.tw.rpi.edu/sparql>;
  a sd:Service;
  sd:hasDefaultDatasetDescription ... get to the NamedGraph..
.
[ a sd:NamedGraph;
  sd:name  <http://logd.tw.rpi.edu/source/data-gov/dataset/4383/version/2010-Oct-22>;
  sd:graph <http://logd.tw.rpi.edu/source/data-gov/dataset/4383/version/2010-Oct-22/conversion/raw>;
] .

<http://logd.tw.rpi.edu/source/data-gov/dataset/4383/version/2010-Oct-22>
   a conversion:RetrievedDataset, void:Dataset;
   conversion:base_uri           "http://logd.tw.rpi.edu" ;
   conversion:source_identifier  "data-gov" ;
   conversion:dataset_identifier "4383" ;
   conversion:version_identifier "2010-Oct-22" ;
   void:subset <http://logd.tw.rpi.edu/source/data-gov/dataset/4383/version/2010-Oct-22/conversion/raw>;
.
<http://logd.tw.rpi.edu/source/data-gov/dataset/4383/version/2010-Oct-22/conversion/raw>
   a conversion:LayerDataset;
.

RePOSTing will reperform the same operations (as if it hadn't been done before) UNLESS there is some indication that the conversion has been successful (e1 exists ==> successful).
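A small Python sketch of the "e1 exists ==> successful" check. It assumes that e1 is named by appending `/conversion/enhancement/1` to the versioned dataset URI (the pattern that appears in the convert-sample response on this page) and that the set of existing graph names is obtained elsewhere; `e1_uri`, `should_convert`, and `e1_ask_query` are hypothetical names.

```python
def e1_uri(version_uri):
    """Name of the first enhancement layer for a versioned dataset (assumed pattern)."""
    return f"{version_uri}/conversion/enhancement/1"


def should_convert(version_uri, existing_graphs):
    """Redo the conversion unless the e1 graph is already present."""
    return e1_uri(version_uri) not in existing_graphs


def e1_ask_query(version_uri):
    """SPARQL ASK query that is true when the e1 graph holds at least one triple."""
    return f"ASK {{ GRAPH <{e1_uri(version_uri)}> {{ ?s ?p ?o }} }}"


if __name__ == "__main__":
    uri = "http://logd.tw.rpi.edu/source/data-gov/dataset/4383/version/2010-Oct-22"
    print(should_convert(uri, set()))
    print(e1_ask_query(uri))
```

Running the ASK query against the endpoint is one way to populate `existing_graphs`; checking for the e1 files on disk would be another.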

Possible errors that we'll need to handle:

  • csv file does not exist (404)
  • the e1 of versioned dataset already exists

The client follows that provenance and queries the raw named graph to populate the first 10 rows.
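One way the client's sample query could be formed, sketched in Python. The graph name is whatever the client reads from the response's sd:NamedGraph description and is passed in by the caller. Note that `LIMIT 10` here bounds triples, not table rows, so a real client would likely need a query shaped around row resources; that shape is not specified on this page. `sample_query` and `query_url` are hypothetical helpers.

```python
from urllib.parse import urlencode


def sample_query(graph_uri, limit=10):
    """SPARQL SELECT that pulls a bounded sample of triples from one named graph."""
    return (
        "SELECT ?s ?p ?o\n"
        f"WHERE {{ GRAPH <{graph_uri}> {{ ?s ?p ?o }} }}\n"
        f"LIMIT {int(limit)}"
    )


def query_url(endpoint, graph_uri, limit=10):
    """Encode the query as a SPARQL Protocol GET URL (the `query` parameter)."""
    return f"{endpoint}?{urlencode({'query': sample_query(graph_uri, limit)})}"


if __name__ == "__main__":
    print(query_url(
        "http://logd.tw.rpi.edu/sparql",
        "http://logd.tw.rpi.edu/source/data-gov/dataset/4383/version/2010-Oct-22",
    ))
```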

UI to let user select different/additional exemplars

dom tricks to build RDFa tree

parse the raw sample rows' RDFa (and eparams) from the DOM and POST to the convert-sample service

service URL: http://gemini.tw.rpi.edu/dev/elixir/services/convert-sample

  • make a csv out of the raw RDF
  • build /conversion-roots/UUID/source/.... source/the.csv
  • run it and load into triple store
  • rm /conversion-roots/UUID/source/
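The steps above can be sketched in Python as follows. This assumes the raw RDF has already been reduced to `(row, column, value)` tuples, which is a simplification of real raw-layer triples; `triples_to_csv` and `conversion_root` are hypothetical helpers, and the parent directory is a parameter rather than the literal `/conversion-roots`. The conversion run itself is not shown.

```python
import csv
import io
import os
import shutil
import tempfile
import uuid
from contextlib import contextmanager


def triples_to_csv(triples):
    """Pivot (row, column, value) tuples into CSV text, one line per row."""
    columns, rows = [], {}
    for row, column, value in triples:
        if column not in columns:
            columns.append(column)  # keep first-seen column order
        rows.setdefault(row, {})[column] = value
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for cells in rows.values():  # dicts preserve insertion order
        writer.writerow(cells)  # missing cells are written as empty strings
    return out.getvalue()


@contextmanager
def conversion_root(parent):
    """Create PARENT/UUID/source/ and remove that source directory afterwards."""
    root = os.path.join(parent, str(uuid.uuid4()), "source")
    os.makedirs(root)
    try:
        yield root
    finally:
        shutil.rmtree(root)


if __name__ == "__main__":
    sample = [("r1", "name", "a"), ("r1", "age", "1"), ("r2", "name", "b")]
    with tempfile.TemporaryDirectory() as parent:
        with conversion_root(parent) as root:
            with open(os.path.join(root, "the.csv"), "w", newline="") as f:
                f.write(triples_to_csv(sample))
            # The converter would run here and load its output into the triple store.
```

Using a fresh UUID per request keeps concurrent sample conversions from writing into each other's directories.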

response:

# some provenance and void rooted on the VersionedDataset that associates the dataset to 
# a named graph in an endpoint.

@prefix sd:         <http://www.w3.org/ns/sparql-service-description#> .
@prefix void:       <http://rdfs.org/ns/void#> .
@prefix conversion: <http://purl.org/twc/vocab/conversion/> .

<http://logd.tw.rpi.edu/sparql>
  sd:url <http://logd.tw.rpi.edu/sparql>;
  a sd:Service;
  sd:hasDefaultDatasetDescription ... get to the NamedGraph..
.
[ a sd:NamedGraph;
  sd:name  <http://logd.tw.rpi.edu/source/data-gov/dataset/4383/version/2010-Oct-22/conversion/enhancement/1>;
  sd:graph <http://logd.tw.rpi.edu/source/data-gov/dataset/4383/version/2010-Oct-22/conversion/enhancement/1>;
] .

<http://logd.tw.rpi.edu/source/data-gov/dataset/4383/version/2010-Oct-22>
   a conversion:RetrievedDataset, void:Dataset;
   conversion:base_uri           "http://logd.tw.rpi.edu" ;
   conversion:source_identifier  "data-gov" ;
   conversion:dataset_identifier "4383" ;
   conversion:version_identifier "2010-Oct-22" ;
   void:subset <http://logd.tw.rpi.edu/source/data-gov/dataset/4383/version/2010-Oct-22/conversion/enhancement/1>;
.
<http://logd.tw.rpi.edu/source/data-gov/dataset/4383/version/2010-Oct-22/conversion/enhancement/1>
   a conversion:LayerDataset;
.