From 6febd356802e9feeb708a7e693605d616e634184 Mon Sep 17 00:00:00 2001 From: "M. Sonntag" Date: Fri, 19 Jun 2020 11:35:26 +0200 Subject: [PATCH 01/19] [README] Add pip git install option --- README.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/README.md b/README.md index 4ea3455e..2bcfbee8 100644 --- a/README.md +++ b/README.md @@ -33,6 +33,14 @@ examples can be found at our odML [project page](https://g-node.github.io/python pip install odml ``` +To install the latest development version of odml you can use the git installation option of pip: + +``` +pip install git+https://github.com/G-Node/python-odml +``` + +Please note that this version might not be stable. + ## Tutorial and examples - We have assembled a set of From 32cb6ddc44cae72aa63558acd8d3f2cf57a5ba79 Mon Sep 17 00:00:00 2001 From: "M. Sonntag" Date: Fri, 19 Jun 2020 11:41:41 +0200 Subject: [PATCH 02/19] [README] Fix contributing link --- README.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 2bcfbee8..0e212df7 100644 --- a/README.md +++ b/README.md @@ -134,7 +134,8 @@ working as expected. Use the release tags instead. # Contributing and Governance -See the [CONTRIBUTING](CONTIBUTING.md) document for more information on this. +See the [CONTRIBUTING](https://github.com/G-Node/python-odml/blob/master/CONTRIBUTING.md) document +for more information on this. # Bugs & Questions From 9fdbf237bbafc46d7f464d375a80e1b6907d4b49 Mon Sep 17 00:00:00 2001 From: "M. 
Sonntag" Date: Tue, 23 Jun 2020 16:20:52 +0200 Subject: [PATCH 03/19] [info] Update version number to 1.5.1 --- odml/info.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/odml/info.json b/odml/info.json index 8f1cd9c4..bd17d7f1 100644 --- a/odml/info.json +++ b/odml/info.json @@ -1,5 +1,5 @@ { - "VERSION": "1.5.0", + "VERSION": "1.5.1", "FORMAT_VERSION": "1.1", "AUTHOR": "Hagen Fritsch, Jan Grewe, Christian Kellner, Achilleas Koutsou, Michael Sonntag, Lyuba Zehl", "COPYRIGHT": "(c) 2011-2020, German Neuroinformatics Node", From 8560274129095075b3903794c9fde181cab49c72 Mon Sep 17 00:00:00 2001 From: "M. Sonntag" Date: Tue, 23 Jun 2020 18:54:12 +0200 Subject: [PATCH 04/19] [CHANGELOG] Add latest changes --- CHANGELOG.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 2e623cca..178bd13a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -3,6 +3,13 @@ Used to document all changes from previous releases and collect changes until the next release. +# Version 1.5.1 + +# Minor changes and updates +- Section properties can now be reordered. See PR #398 for details. +- Property values can now be inserted at a specified index. See PR #398 for details. +- Tuples can now be assigned using a list instead of the `"(x;x;...)"` syntax as well. See PR #393 and issue #392 for details. + # Version 1.5.0 # Python 2 deprecation warning From 57e1b2dbf3c3b4af8d39485cc1eda8ef5363fd46 Mon Sep 17 00:00:00 2001 From: "M. Sonntag" Date: Fri, 17 Jul 2020 11:22:52 +0200 Subject: [PATCH 05/19] [CHANGELOG] Add RDF subclassing changes --- CHANGELOG.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 178bd13a..ab0ec5c8 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,9 @@ until the next release. # Version 1.5.1 +# RDF Subclassing feature +RDF subclasses are now properly added by default to any written RDF document. 
The RDF document will now also include RDF Subclass definitions in addition to the actual data to enable Subclass specific queries. See PR #400 and issue #397 for details. + # Minor changes and updates - Section properties can now be reordered. See PR #398 for details. - Property values can now be inserted at a specified index. See PR #398 for details. From d290cc65390905dc98f558460ddbc14c91de5c53 Mon Sep 17 00:00:00 2001 From: "M. Sonntag" Date: Fri, 17 Jul 2020 13:13:25 +0200 Subject: [PATCH 06/19] Refactor odml RDF documentation --- doc/{ => rdf}/RDF_example_graph.png | Bin doc/{ => rdf}/RDF_tools.ipynb | 0 doc/{example_rdfs => rdf}/generated_rdf.xml | 0 .../odml_RDF_example_A.ttl} | 0 .../drosophila_2.ttl => rdf/odml_RDF_example_B.ttl} | 0 .../drosophila_4.ttl => rdf/odml_RDF_example_C.ttl} | 0 .../drosophila_8.ttl => rdf/odml_RDF_example_D.ttl} | 0 doc/{example_rdfs => rdf}/rdf_generator.py | 0 doc/{example_rdfs => rdf}/sparql_example_queries.py | 2 +- 9 files changed, 1 insertion(+), 1 deletion(-) rename doc/{ => rdf}/RDF_example_graph.png (100%) rename doc/{ => rdf}/RDF_tools.ipynb (100%) rename doc/{example_rdfs => rdf}/generated_rdf.xml (100%) rename doc/{example_rdfs/example_data/2010-04-16-ab_cutoff_300_contrast_20%.ttl => rdf/odml_RDF_example_A.ttl} (100%) rename doc/{example_rdfs/example_data/drosophila_2.ttl => rdf/odml_RDF_example_B.ttl} (100%) rename doc/{example_rdfs/example_data/drosophila_4.ttl => rdf/odml_RDF_example_C.ttl} (100%) rename doc/{example_rdfs/example_data/drosophila_8.ttl => rdf/odml_RDF_example_D.ttl} (100%) rename doc/{example_rdfs => rdf}/rdf_generator.py (100%) rename doc/{example_rdfs => rdf}/sparql_example_queries.py (96%) diff --git a/doc/RDF_example_graph.png b/doc/rdf/RDF_example_graph.png similarity index 100% rename from doc/RDF_example_graph.png rename to doc/rdf/RDF_example_graph.png diff --git a/doc/RDF_tools.ipynb b/doc/rdf/RDF_tools.ipynb similarity index 100% rename from doc/RDF_tools.ipynb rename to 
doc/rdf/RDF_tools.ipynb diff --git a/doc/example_rdfs/generated_rdf.xml b/doc/rdf/generated_rdf.xml similarity index 100% rename from doc/example_rdfs/generated_rdf.xml rename to doc/rdf/generated_rdf.xml diff --git a/doc/example_rdfs/example_data/2010-04-16-ab_cutoff_300_contrast_20%.ttl b/doc/rdf/odml_RDF_example_A.ttl similarity index 100% rename from doc/example_rdfs/example_data/2010-04-16-ab_cutoff_300_contrast_20%.ttl rename to doc/rdf/odml_RDF_example_A.ttl diff --git a/doc/example_rdfs/example_data/drosophila_2.ttl b/doc/rdf/odml_RDF_example_B.ttl similarity index 100% rename from doc/example_rdfs/example_data/drosophila_2.ttl rename to doc/rdf/odml_RDF_example_B.ttl diff --git a/doc/example_rdfs/example_data/drosophila_4.ttl b/doc/rdf/odml_RDF_example_C.ttl similarity index 100% rename from doc/example_rdfs/example_data/drosophila_4.ttl rename to doc/rdf/odml_RDF_example_C.ttl diff --git a/doc/example_rdfs/example_data/drosophila_8.ttl b/doc/rdf/odml_RDF_example_D.ttl similarity index 100% rename from doc/example_rdfs/example_data/drosophila_8.ttl rename to doc/rdf/odml_RDF_example_D.ttl diff --git a/doc/example_rdfs/rdf_generator.py b/doc/rdf/rdf_generator.py similarity index 100% rename from doc/example_rdfs/rdf_generator.py rename to doc/rdf/rdf_generator.py diff --git a/doc/example_rdfs/sparql_example_queries.py b/doc/rdf/sparql_example_queries.py similarity index 96% rename from doc/example_rdfs/sparql_example_queries.py rename to doc/rdf/sparql_example_queries.py index 0eef989c..4645d572 100644 --- a/doc/example_rdfs/sparql_example_queries.py +++ b/doc/rdf/sparql_example_queries.py @@ -1,7 +1,7 @@ from rdflib import Graph, Namespace, RDF from rdflib.plugins.sparql import prepareQuery -resource = "./python-odml/doc/example_rdfs/example_data/2010-04-16-ab_cutoff_300_contrast_20%.ttl" +resource = "./python-odml/doc/rdf/example_data/odml_RDF_example_A.ttl" g = Graph() g.parse(resource, format='turtle') From b40ca7b5df819e066d2421af0e4516ce9a6a4ef6 Mon 
Sep 17 00:00:00 2001 From: "M. Sonntag" Date: Fri, 17 Jul 2020 13:21:08 +0200 Subject: [PATCH 07/19] [doc/rdf] Cleanup code examples --- doc/rdf/rdf_generator.py | 36 +++++++++++++++---------------- doc/rdf/sparql_example_queries.py | 17 ++++++++------- 2 files changed, 27 insertions(+), 26 deletions(-) diff --git a/doc/rdf/rdf_generator.py b/doc/rdf/rdf_generator.py index e78172e7..4f2b9d40 100644 --- a/doc/rdf/rdf_generator.py +++ b/doc/rdf/rdf_generator.py @@ -1,7 +1,7 @@ -from rdflib import Graph, BNode, Literal, Namespace +from rdflib import Graph, BNode, Literal from rdflib.namespace import XSD -odml = Namespace("http://g-node/odml#") +from odml.tools.rdf_converter import ODML_NS g = Graph() @@ -9,25 +9,25 @@ s1 = BNode("s1") p12 = BNode("p1") -g.add((doc, odml.version, Literal(1.1))) -g.add((doc, odml.docversion, Literal(42))) -g.add((doc, odml.author, Literal('D. N. Adams'))) -g.add((doc, odml.date, Literal('1979-10-12', datatype=XSD.date))) -g.add((doc, odml.hasSection, s1)) +g.add((doc, ODML_NS.version, Literal(1.1))) +g.add((doc, ODML_NS.docversion, Literal(42))) +g.add((doc, ODML_NS.author, Literal('D. N. 
Adams'))) +g.add((doc, ODML_NS.date, Literal('1979-10-12', datatype=XSD.date))) +g.add((doc, ODML_NS.hasSection, s1)) -g.add((s1, odml.property, p12)) -g.add((s1, odml.type, Literal('crew'))) -g.add((s1, odml.description, Literal('Information on the crew'))) -g.add((s1, odml.name, Literal('TheCrew'))) +g.add((s1, ODML_NS.property, p12)) +g.add((s1, ODML_NS.type, Literal('crew'))) +g.add((s1, ODML_NS.description, Literal('Information on the crew'))) +g.add((s1, ODML_NS.name, Literal('TheCrew'))) -g.add((p12, odml.hasValue, Literal('[Arthur Philip Dent,Zaphod Beeblebrox,Tricia Marie McMillan,Ford Prefect]'))) -g.add((p12, odml.description, Literal('List of crew members names'))) -g.add((p12, odml.dtype, Literal('person'))) -g.add((p12, odml.name, Literal('NameCrewMembers'))) +content = '[Arthur Philip Dent,Zaphod Beeblebrox,Tricia Marie McMillan,Ford Prefect]' +g.add((p12, ODML_NS.hasValue, Literal(content))) +g.add((p12, ODML_NS.description, Literal('List of crew members names'))) +g.add((p12, ODML_NS.dtype, Literal('person'))) +g.add((p12, ODML_NS.name, Literal('NameCrewMembers'))) res = g.serialize(format='application/rdf+xml').decode("utf-8") print(res) -f = open("generated_ex1.xml", "w") -f.write(res) -f.close() \ No newline at end of file +with open("generated_odml_rdf.xml", "w") as f: + f.write(res) diff --git a/doc/rdf/sparql_example_queries.py b/doc/rdf/sparql_example_queries.py index 4645d572..7e17b16b 100644 --- a/doc/rdf/sparql_example_queries.py +++ b/doc/rdf/sparql_example_queries.py @@ -1,7 +1,11 @@ -from rdflib import Graph, Namespace, RDF +from rdflib import Graph, RDF from rdflib.plugins.sparql import prepareQuery -resource = "./python-odml/doc/rdf/example_data/odml_RDF_example_A.ttl" +from odml.tools.rdf_converter import ODML_NS + +rdf_namespace = {"odml": ODML_NS, "rdf": RDF} + +resource = "./odml_RDF_example_A.ttl" g = Graph() g.parse(resource, format='turtle') @@ -18,8 +22,7 @@ ?p odml:hasUnit "%" . ?v rdf:type rdf:Bag . ?v rdf:li "20.0" . 
- }""", initNs={"odml": Namespace("https://g-node.org/odml-rdf#"), - "rdf": RDF}) + }""", initNs=rdf_namespace) g = Graph() g.parse(resource, format='turtle') @@ -46,8 +49,7 @@ ?p1 odml:hasName "CellType" . ?p1 odml:hasValue ?v1 . ?v1 rdf:li "P-unit" . - }""", initNs={"odml": Namespace("https://g-node.org/odml-rdf#"), - "rdf": RDF}) + }""", initNs=rdf_namespace) # select d.* from dataset d, CellProperties s, EOD Frequency c where c.unit = 'Hz' g = Graph() @@ -64,8 +66,7 @@ ?p odml:hasUnit "Hz" . ?v rdf:type rdf:Bag . ?v rdf:li ?value . - }""", initNs={"odml": Namespace("https://g-node.org/odml-rdf#"), - "rdf": RDF}) + }""", initNs=rdf_namespace) print("q1") for row in g.query(q1): From e7b494b749fb36ab47b0a1d366b1ce98e03dd5d8 Mon Sep 17 00:00:00 2001 From: "M. Sonntag" Date: Fri, 17 Jul 2020 13:22:18 +0200 Subject: [PATCH 08/19] [doc/rdf] Cleanup and update example jupyter --- .../{RDF_tools.ipynb => odml_RDF_tools.ipynb} | 386 ++++++------------ 1 file changed, 131 insertions(+), 255 deletions(-) rename doc/rdf/{RDF_tools.ipynb => odml_RDF_tools.ipynb} (63%) diff --git a/doc/rdf/RDF_tools.ipynb b/doc/rdf/odml_RDF_tools.ipynb similarity index 63% rename from doc/rdf/RDF_tools.ipynb rename to doc/rdf/odml_RDF_tools.ipynb index 1c62f9b0..7e26e006 100644 --- a/doc/rdf/RDF_tools.ipynb +++ b/doc/rdf/odml_RDF_tools.ipynb @@ -4,7 +4,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# What is Semantic Web and RDF?" + "# What is the Semantic Web and RDF?" ] }, { @@ -13,11 +13,11 @@ "source": [ "**RDF (Resource Description Framework)** is one of the three foundational [Semantic Web](https://en.wikipedia.org/wiki/Semantic_Web) technologies, the other two being SPARQL and OWL.\n", "\n", - "In particular, RDF is the data model of the Semantic Web. That means that all data in Semantic Web technologies is represented as RDF. If you store Semantic Web data, it's in RDF. If you query Semantic Web data (typically using SPARQL), it's RDF data. 
If you send Semantic Web data to your friend, it's RDF.\n", + "In particular, RDF is the data model of the Semantic Web. That means that all data in Semantic Web technologies are represented as RDF. If you store Semantic Web data, it's in RDF. If you query Semantic Web data (typically using the SPARQL query language), it's RDF data. If you send Semantic Web data to your friend, it's RDF.\n", "\n", "RDF data model is based upon the idea of making statements about resources (in particular web resources) in the form of *subject–predicate–object* expressions, known as [*triples*](https://en.wikipedia.org/wiki/Semantic_triple). The *subject* denotes the resource, and the *predicate* denotes traits or aspects of the resource, and expresses a relationship between the *subject* and the *object*.\n", "\n", - "For example, one way to represent the notion \"The sky has the color blue\" in RDF is as the triple: a **subject** denoting *\"the sky\"*, a **predicate** denoting *\"has the color\"*, and an **object** denoting *\"blue\"*. Therefore, RDF uses subject instead of object(or entity) in contrast to the typical approach of an entity–attribute–value model in object-oriented design: entity (sky), attribute (color), and value (blue).
\n", + "For example, one way to represent the notion \"The sky has the color blue\" in RDF is as the triple: a **subject** denoting *\"the sky\"*, a **predicate** denoting *\"has the color\"*, and an **object** denoting *\"blue\"*. Therefore, RDF uses subject instead of object(or entity) in contrast to the typical approach of an entity–attribute–value model in object-oriented design: entity (sky), attribute (color), and value (blue).
\n", "(Resource Description Framework, Wikipedia, 2017)" ] }, @@ -32,11 +32,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Find out more:
\n", - "- http://fast.wistia.net/embed/iframe/8nm9xf4jip?popover=true
\n", - "- https://en.wikipedia.org/wiki/Resource_Description_Framework
\n", - "- https://www.cambridgesemantics.com/semantic-university/rdf-101
\n", - "- http://www.cambridgesemantics.com/semantic-university/introduction-semantic-web-0" + "Find out more:\n", + "- https://en.wikipedia.org/wiki/Resource_Description_Framework\n", + "- https://www.cambridgesemantics.com/blog/semantic-university/learn-rdf/\n" ] }, { @@ -45,17 +43,16 @@ "collapsed": true }, "source": [ - "# RDF<->odML converter" + "# odML to RDF converter" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Here we will explore RDF-odML and odML-RDF conversion in `odml/tools/rdf_converter.py` module.\n", + "Here we will explore odML to RDF conversion using the `odml/tools/rdf_converter.py` module.\n", "\n", - "If you are new python odML please read the tutorial first:\n", - "https://g-node.github.io/python-odml/tutorial.html" + "If you are new python odML please read the [tutorial](https://python-odml.readthedocs.io/en/latest/tutorial.html) first to familiarize yourself with odML." ] }, { @@ -67,25 +64,20 @@ }, { "cell_type": "code", - "execution_count": 1, - "metadata": { - "collapsed": true - }, + "execution_count": 18, + "metadata": {}, "outputs": [], "source": [ - "import os\n", - "os.chdir('..')\n", + "import datetime\n", "\n", "import odml\n", - "import datetime\n", "\n", - "doc = odml.Document(author=\"D. N. Adams\",\n", - " date=datetime.date(1979, 10, 12))\n", + "doc = odml.Document(author=\"D. N. 
Adams\", date=datetime.date(1979, 10, 12))\n", "\n", "# CREATE AND APPEND THE MAIN SECTIONs\n", "doc.append(odml.Section(name=\"Arthur Philip Dent\",\n", - " type=\"crew/person\",\n", - " definition=\"Information on Arthur Dent\"))\n", + " type=\"crew/person\",\n", + " definition=\"Information on Arthur Dent\"))\n", "\n", "# SET NEW PARENT NODE\n", "parent = doc['Arthur Philip Dent']\n", @@ -95,30 +87,31 @@ "parent.append(odml.Property(name=\"Species\",\n", " value=\"Human\",\n", " dtype=odml.DType.string,\n", - " definition=\"Species to which subject belongs to\"))" + " definition=\"Species to which subject belongs to\"))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "## RDFWriter class" + "## The RDFWriter class" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "RDFWriter class is used for conversion documents from odML to one of the supported RDF formats:
\n", - "'xml', 'pretty-xml', 'trix', 'n3', 'turtle', 'ttl', 'ntriples', 'nt', 'nt11', 'trig', 'json-ld'.
\n", - "Both one document or list of multiple documents can be passed to `RDFWriter()` constructor.\n", + "The RDFWriter class is used to convert odML documents to one of the supported RDF formats:

\n", + "'xml', 'pretty-xml', 'trix', 'n3', 'turtle', 'ttl', 'ntriples', 'nt', 'nt11', 'trig'.
\n", + "\n", + "'turtle' is the format that is best suited for storage and human readability which is why we will use it in our tutorial. For cross-tool usage, saving RDF in its 'XML' variant is probably the safest choice.\n", "\n", - "It's possible to get the output as a string." + "The output can be returned as a string." ] }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 5, "metadata": {}, "outputs": [ { @@ -127,31 +120,30 @@ "text": [ "@prefix odml: .\n", "@prefix rdf: .\n", - "@prefix rdfs: .\n", - "@prefix xml: .\n", "@prefix xsd: .\n", "\n", - "odml:Hub odml:hasDocument .\n", + "odml:Hub odml:hasDocument odml:40797785-2e1a-435e-b905-aeeac2ba2b3e .\n", + "\n", + "odml:220489b8-2043-452b-863b-8ba6a4b5e536 a odml:Section ;\n", + " odml:hasDefinition \"Information on Arthur Dent\" ;\n", + " odml:hasName \"Arthur Philip Dent\" ;\n", + " odml:hasProperty odml:40ede84a-650b-4aab-af81-b4136c833e58 ;\n", + " odml:hasType \"crew/person\" .\n", "\n", - " a odml:Document ;\n", + "odml:40797785-2e1a-435e-b905-aeeac2ba2b3e a odml:Document ;\n", " odml:hasAuthor \"D. N. 
Adams\" ;\n", " odml:hasDate \"1979-10-12\"^^xsd:date ;\n", - " odml:hasSection odml:f3de1e21-f6f5-4eae-8f58-db94ee10f812 .\n", - "\n", - " a rdf:Bag ;\n", - " rdf:li \"Human\" .\n", + " odml:hasFileName \"None\" ;\n", + " odml:hasSection odml:220489b8-2043-452b-863b-8ba6a4b5e536 .\n", "\n", - "odml:c46a5ee8-811a-4947-8e4b-7f164fbf4c8a a odml:Property ;\n", + "odml:40ede84a-650b-4aab-af81-b4136c833e58 a odml:Property ;\n", " odml:hasDefinition \"Species to which subject belongs to\" ;\n", " odml:hasDtype \"string\" ;\n", " odml:hasName \"Species\" ;\n", - " odml:hasValue .\n", + " odml:hasValue odml:4425ade2-5d03-4484-a272-764c1e933933 .\n", "\n", - "odml:f3de1e21-f6f5-4eae-8f58-db94ee10f812 a odml:Section ;\n", - " odml:hasDefinition \"Information on Arthur Dent\" ;\n", - " odml:hasName \"Arthur Philip Dent\" ;\n", - " odml:hasProperty odml:c46a5ee8-811a-4947-8e4b-7f164fbf4c8a ;\n", - " odml:hasType \"crew/person\" .\n", + "odml:4425ade2-5d03-4484-a272-764c1e933933 a rdf:Seq ;\n", + " rdf:_1 \"Human\" .\n", "\n", "\n" ] @@ -160,14 +152,14 @@ "source": [ "from odml.tools.rdf_converter import RDFWriter\n", "\n", - "print(RDFWriter(doc).get_rdf_str('turtle'))" + "print(RDFWriter(doc).get_rdf_str('turtle'))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Or write the output to the specified file." + "Or the output can be written to a specified file." ] }, { @@ -181,30 +173,29 @@ "text": [ "@prefix odml: .\n", "@prefix rdf: .\n", - "@prefix rdfs: .\n", - "@prefix xml: .\n", "@prefix xsd: .\n", "\n", - "odml:Hub odml:hasDocument .\n", + "odml:Hub odml:hasDocument odml:08f8c7fa-4ea0-4512-8927-ff73c117644d .\n", "\n", - " a odml:Document ;\n", + "odml:08f8c7fa-4ea0-4512-8927-ff73c117644d a odml:Document ;\n", " odml:hasAuthor \"D. N. 
Adams\" ;\n", " odml:hasDate \"1979-10-12\"^^xsd:date ;\n", - " odml:hasSection odml:f3de1e21-f6f5-4eae-8f58-db94ee10f812 .\n", + " odml:hasFileName \"None\" ;\n", + " odml:hasSection odml:3c86174b-b183-47aa-9e0b-58dfc066a76d .\n", "\n", - "odml:c46a5ee8-811a-4947-8e4b-7f164fbf4c8a a odml:Property ;\n", + "odml:15eb4c32-73fe-4da1-8cba-3fac965d4d17 a odml:Property ;\n", " odml:hasDefinition \"Species to which subject belongs to\" ;\n", " odml:hasDtype \"string\" ;\n", " odml:hasName \"Species\" ;\n", - " odml:hasValue odml:ddde531a-663a-46f5-b474-edbc73254077 .\n", + " odml:hasValue odml:1ad9c2d6-6055-465b-b281-51943569338b .\n", "\n", - "odml:ddde531a-663a-46f5-b474-edbc73254077 a rdf:Bag ;\n", - " rdf:li \"Human\" .\n", + "odml:1ad9c2d6-6055-465b-b281-51943569338b a rdf:Seq ;\n", + " rdf:_1 \"Human\" .\n", "\n", - "odml:f3de1e21-f6f5-4eae-8f58-db94ee10f812 a odml:Section ;\n", + "odml:3c86174b-b183-47aa-9e0b-58dfc066a76d a odml:Section ;\n", " odml:hasDefinition \"Information on Arthur Dent\" ;\n", " odml:hasName \"Arthur Philip Dent\" ;\n", - " odml:hasProperty odml:c46a5ee8-811a-4947-8e4b-7f164fbf4c8a ;\n", + " odml:hasProperty odml:15eb4c32-73fe-4da1-8cba-3fac965d4d17 ;\n", " odml:hasType \"crew/person\" .\n", "\n", "\n" @@ -213,7 +204,6 @@ ], "source": [ "import tempfile\n", - "import os\n", "\n", "# Create temporary file\n", "f = tempfile.NamedTemporaryFile(mode='w', suffix=\".ttl\")\n", @@ -223,155 +213,35 @@ "\n", "with open(path) as ff:\n", " data = ff.read()\n", - " print(data)\n", - "\n", - "f.close()" + " print(data)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "## RDFReader class" + "Please note at this point, that RDF does not respect order. Everytime an unchanged file is written, the content will be identical, but the order of the statements will differ." 
] }, { "cell_type": "markdown", - "metadata": { - "collapsed": true - }, - "source": [ - "RDFReader class enables RDF to odML conversion.\n", - "\n", - "There are 2 ways to obtain objects with converted odML documents:\n", - "- from **RDF file** ( `RDFReader().from_file(\"/path_to_input_rdf\", \"rdf_format\")` )\n", - "- from **RDF string** ( `RDFReader().from_string(\"rdf file as a string\", \"rdf_format\")` )" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[]\n" - ] - } - ], - "source": [ - "from odml.tools.rdf_converter import RDFReader\n", - "\n", - "rdf_file = RDFWriter(doc).get_rdf_str('turtle')\n", - "odml_doc = RDFReader().from_string(rdf_file, \"turtle\")\n", - "\n", - "print(odml_doc)" - ] - }, - { - "cell_type": "code", - "execution_count": 5, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "[]\n" - ] - } - ], "source": [ - "# Create temporary file\n", - "rdf_file = tempfile.NamedTemporaryFile(mode='w', suffix=\".ttl\")\n", - "rdf_path = rdf_file.name\n", - "RDFWriter(doc).write_file(rdf_path, \"turtle\")\n", - "\n", - "odml_doc = RDFReader().from_file(rdf_path, \"turtle\")\n", - "\n", - "print(odml_doc)" + "## Quering the data with rdflib and SPARQL" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Another option is to write the output to one or multiple files.
\n", - "`RDFReader().write_file(\"/input_path\", \"rdf_format\", \"/output_path_to_file\")`" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "\n", - "\n", - "\n", - "
\n", - " Arthur Philip Dent\n", - " f3de1e21-f6f5-4eae-8f58-db94ee10f812\n", - " \n", - " Species\n", - " c46a5ee8-811a-4947-8e4b-7f164fbf4c8a\n", - " [Human]\n", - " Species to which subject belongs to\n", - " string\n", - " \n", - " Information on Arthur Dent\n", - " crew/person\n", - "
\n", - " 02e1d29e-937d-4de7-a83e-3e756d954c92\n", - " 1979-10-12\n", - " D. N. Adams\n", - "
\n", - "\n" - ] - } - ], - "source": [ - "# If RDF file contains one odML document, specify output path as file\n", - "odml_file = tempfile.NamedTemporaryFile(mode='w', suffix=\".odml\")\n", - "odml_path = odml_file.name\n", - "\n", - "RDFReader().write_file(rdf_path, \"turtle\", odml_path)\n", + "The following example depends on specific example files. If you do not already have these files\\ you can find and download them from https://github.com/G-Node/python-odml/tree/master/doc/example_rdfs/example_data.\n", "\n", - "with open(odml_path) as ff:\n", - " data = ff.read()\n", - " print(data)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "collapsed": true - }, - "source": [ - "If RDF file contains several odML docs, specify output path as a directory.
\n", - "`RDFReader().write_file(\"/input_path\", \"rdf_format\", \"/output_path_to_directory\")`\n", - "\n", - "Module creates files in specified directory and writes parsed docs to them.\n", - "Example of created file: `//doc_.odml`\n", - "(`` - id of the document)." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Quering the data with rdflib and SPARQL" + "The example will load RDF triples from multiple files and load them into a single, connected graph." ] }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 10, "metadata": {}, "outputs": [ { @@ -383,46 +253,41 @@ } ], "source": [ - "# please run the first code snipet to change working directory if you have\n", - "# [Errno 2] No such file or directory: '/home/rick/g-node/python-odml/doc/doc/example_rdfs/example_data/'\n", - "# or insert this line after `import os`: `os.chdir('..')` below\n", + "from glob import glob\n", + "\n", "from rdflib import Graph\n", - "import os\n", "\n", "graph = Graph()\n", - "input_dir = os.path.join(os.getcwd(), 'doc/example_rdfs/example_data/')\n", - "for file_name in os.listdir(input_dir):\n", - " f = os.path.join(input_dir, file_name)\n", - " if os.path.isfile(f):\n", - " graph.parse(f, format=\"turtle\")\n", - "print('Total number of triples: ', len(graph))" + "for file_name in glob(\"odml_RDF_example_*.ttl\"):\n", + " graph.parse(file_name, format=\"turtle\")\n", + "\n", + "print('Total number of triples: ', len(graph))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Quick video about what is SPARQL: https://www.youtuboe.com/watch?v=FvGndkpa4K0

\n", - "Example query using rdflib tool to find each section with type `Recording`, that has property with the name `Recording duration` and prints its value:" + "The example query uses an rdflib tool to find each Section with type `Recording` also featuring a Property with the name `Recording duration`. The result prints the Values of the returned Properties." ] }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ + "Doc: https://g-node.org/odml-rdf#cc66e78a-3742-490a-9fdb-1c66761d7652, Sec: https://g-node.org/odml-rdf#5365f7e5-603c-4154-a5ea-33bb1a07a956, \n", + "Prop: https://g-node.org/odml-rdf#41316903-80f1-45a3-9b06-400a02903531, Val:11.25\n", "Doc: https://g-node.org/odml-rdf#cd24b60f-1d5e-4040-9881-5e5a597baef7, Sec: https://g-node.org/odml-rdf#782bd29d-e4b0-4c14-a417-1772a4851ffd, \n", "Prop: https://g-node.org/odml-rdf#9aeede78-678c-4db8-acb5-fbd6d408b762, Val:13.9\n", "Doc: https://g-node.org/odml-rdf#537c6cc8-7dfe-4d53-a111-24b3ce0f3c1a, Sec: https://g-node.org/odml-rdf#346773f2-abee-4892-b052-840ddcff35ee, \n", "Prop: https://g-node.org/odml-rdf#1636af03-8e97-4ef2-9d7d-6c7db23dcd02, Val:11.88\n", "Doc: https://g-node.org/odml-rdf#24066355-1ee8-4eb5-a715-96bbb6231cd5, Sec: https://g-node.org/odml-rdf#bbd44815-5016-49e0-9f4b-5b83778d00de, \n", - "Prop: https://g-node.org/odml-rdf#0ed215a2-5d20-48eb-b744-bf3b731459fc, Val:0.33\n", - "Doc: https://g-node.org/odml-rdf#cc66e78a-3742-490a-9fdb-1c66761d7652, Sec: https://g-node.org/odml-rdf#5365f7e5-603c-4154-a5ea-33bb1a07a956, \n", - "Prop: https://g-node.org/odml-rdf#41316903-80f1-45a3-9b06-400a02903531, Val:11.25\n" + "Prop: https://g-node.org/odml-rdf#0ed215a2-5d20-48eb-b744-bf3b731459fc, Val:0.33\n" ] } ], @@ -430,6 +295,10 @@ "from rdflib import Graph, Namespace, RDF\n", "from rdflib.plugins.sparql import prepareQuery\n", "\n", + "from odml.tools.rdf_converter import ODML_NS\n", + "\n", + 
"rdf_namespace = {\"odml\": ODML_NS, \"rdf\": RDF}\n", + "\n", "q = prepareQuery(\"\"\"SELECT ?d ?s ?p ?value WHERE {\n", " ?d odml:hasSection ?s .\n", " ?s rdf:type odml:Section .\n", @@ -439,12 +308,11 @@ " ?p odml:hasName \"Recording duration\" .\n", " ?p odml:hasValue ?v .\n", " ?v rdf:type rdf:Bag .\n", - " ?v rdf:li ?value .}\"\"\", initNs={\"odml\": Namespace(\"https://g-node.org/odml-rdf#\"),\n", - " \"rdf\": RDF})\n", + " ?v rdf:li ?value .}\"\"\", initNs=rdf_namespace)\n", "\n", "for row in graph.query(q):\n", " print(\"Doc: {0}, Sec: {1}, \\n\"\n", - " \"Prop: {2}, Val:{3}\".format(row.d, row.s, row.p, row.value))" + " \"Prop: {2}, Val:{3}\".format(row.d, row.s, row.p, row.value))\n" ] }, { @@ -458,7 +326,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**FuzzyFinder** is the tool for querying graph through *fuzzy* queries. The finder executes multiple queries to better match input parameters and returns sets of triples, prioritized from more to less amount of matched parameters.
\n", + "**FuzzyFinder** is a tool for querying an RDF graph through so called *fuzzy* queries. The finder executes multiple queries to better match input parameters. It returns sets of triples and prioritized from more to fewer matched parameters.\n", "\n", "The function `find()` accepts several oprtional parameters.\n", "- `graph`: rdflib graph object\n", @@ -466,13 +334,14 @@ "- `q_params`: dict object with parameters of a query\n", "- `mode`: default 'fuzzy' and 'match'\n", "\n", - "Each mode works with specific type of fuzzy query (`q_str`).\n", - "Let's see on the `match` mode in the example:" + "Each mode works with specific a type of fuzzy query (`q_str`).\n", + "\n", + "Let's check the `match` mode in an example." ] }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 13, "metadata": {}, "outputs": [ { @@ -488,17 +357,17 @@ "?p odml:hasName \"Date\" .\n", "}\n", "Document: https://g-node.org/odml-rdf#cc66e78a-3742-490a-9fdb-1c66761d7652\n", - "Property: https://g-node.org/odml-rdf#f1699eb6-4cab-4dd0-9327-120eab2089ae\n", "Section: https://g-node.org/odml-rdf#5365f7e5-603c-4154-a5ea-33bb1a07a956\n", + "Property: https://g-node.org/odml-rdf#f1699eb6-4cab-4dd0-9327-120eab2089ae\n", + "Document: https://g-node.org/odml-rdf#24066355-1ee8-4eb5-a715-96bbb6231cd5\n", + "Section: https://g-node.org/odml-rdf#bbd44815-5016-49e0-9f4b-5b83778d00de\n", + "Property: https://g-node.org/odml-rdf#fadffec7-6b23-454e-bfd1-9d5884802abb\n", "Document: https://g-node.org/odml-rdf#537c6cc8-7dfe-4d53-a111-24b3ce0f3c1a\n", - "Property: https://g-node.org/odml-rdf#138f08f7-23c7-4722-8577-85a6fa633ae1\n", "Section: https://g-node.org/odml-rdf#346773f2-abee-4892-b052-840ddcff35ee\n", + "Property: https://g-node.org/odml-rdf#138f08f7-23c7-4722-8577-85a6fa633ae1\n", "Document: https://g-node.org/odml-rdf#cd24b60f-1d5e-4040-9881-5e5a597baef7\n", - "Property: https://g-node.org/odml-rdf#1d6db4ce-87f3-4e9c-b221-e76ba05b2759\n", "Section: 
https://g-node.org/odml-rdf#782bd29d-e4b0-4c14-a417-1772a4851ffd\n", - "Document: https://g-node.org/odml-rdf#24066355-1ee8-4eb5-a715-96bbb6231cd5\n", - "Property: https://g-node.org/odml-rdf#fadffec7-6b23-454e-bfd1-9d5884802abb\n", - "Section: https://g-node.org/odml-rdf#bbd44815-5016-49e0-9f4b-5b83778d00de\n", + "Property: https://g-node.org/odml-rdf#1d6db4ce-87f3-4e9c-b221-e76ba05b2759\n", "\n", "SELECT * WHERE {\n", "?d odml:hasSection ?s .\n", @@ -514,14 +383,14 @@ "?p rdf:type odml:Property .\n", "?p odml:hasName \"Date\" .\n", "}\n", - "Property: https://g-node.org/odml-rdf#1d6db4ce-87f3-4e9c-b221-e76ba05b2759\n", - "Section: https://g-node.org/odml-rdf#782bd29d-e4b0-4c14-a417-1772a4851ffd\n", - "Property: https://g-node.org/odml-rdf#fadffec7-6b23-454e-bfd1-9d5884802abb\n", "Section: https://g-node.org/odml-rdf#bbd44815-5016-49e0-9f4b-5b83778d00de\n", - "Property: https://g-node.org/odml-rdf#f1699eb6-4cab-4dd0-9327-120eab2089ae\n", + "Property: https://g-node.org/odml-rdf#fadffec7-6b23-454e-bfd1-9d5884802abb\n", + "Section: https://g-node.org/odml-rdf#782bd29d-e4b0-4c14-a417-1772a4851ffd\n", + "Property: https://g-node.org/odml-rdf#1d6db4ce-87f3-4e9c-b221-e76ba05b2759\n", "Section: https://g-node.org/odml-rdf#5365f7e5-603c-4154-a5ea-33bb1a07a956\n", - "Property: https://g-node.org/odml-rdf#138f08f7-23c7-4722-8577-85a6fa633ae1\n", + "Property: https://g-node.org/odml-rdf#f1699eb6-4cab-4dd0-9327-120eab2089ae\n", "Section: https://g-node.org/odml-rdf#346773f2-abee-4892-b052-840ddcff35ee\n", + "Property: https://g-node.org/odml-rdf#138f08f7-23c7-4722-8577-85a6fa633ae1\n", "\n", "SELECT * WHERE {\n", "?d odml:hasSection ?s .\n", @@ -538,19 +407,19 @@ "}\n", "Document: https://g-node.org/odml-rdf#cc66e78a-3742-490a-9fdb-1c66761d7652\n", "Section: https://g-node.org/odml-rdf#5365f7e5-603c-4154-a5ea-33bb1a07a956\n", + "Document: https://g-node.org/odml-rdf#24066355-1ee8-4eb5-a715-96bbb6231cd5\n", + "Section: 
https://g-node.org/odml-rdf#bbd44815-5016-49e0-9f4b-5b83778d00de\n", "Document: https://g-node.org/odml-rdf#537c6cc8-7dfe-4d53-a111-24b3ce0f3c1a\n", "Section: https://g-node.org/odml-rdf#346773f2-abee-4892-b052-840ddcff35ee\n", "Document: https://g-node.org/odml-rdf#cd24b60f-1d5e-4040-9881-5e5a597baef7\n", "Section: https://g-node.org/odml-rdf#782bd29d-e4b0-4c14-a417-1772a4851ffd\n", - "Document: https://g-node.org/odml-rdf#24066355-1ee8-4eb5-a715-96bbb6231cd5\n", - "Section: https://g-node.org/odml-rdf#bbd44815-5016-49e0-9f4b-5b83778d00de\n", "\n", "\n" ] } ], "source": [ - "from odml.tools.fuzzy_finder import FuzzyFinder\n", + "from odml.rdf.fuzzy_finder import FuzzyFinder\n", "\n", "query_string = 'prop(name:Date) section(name:Recording-2013-02-08-ak, type:Recording)'\n", "\n", @@ -562,22 +431,22 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "As you can see from the output, finder builds multiple sparql queries from 'match' queries, executes them and returns some matched results. The first result always represents the most specific query (the biggest combination of input parameters that returned at least one triple).\n", + "As you can see from the output, the finder builds multiple SPARQL queries from `match` queries, executes them and returns some matched results. The first result always represents the most specific query (the biggest combination of input parameters that returned at least one triple).\n", "\n", - "The query syntax is pretty straightforward. Just write the name of the entity `property`, `section` or `document` (also possible to use shortened names `prop`, `sec` and `doc`) and add attributes with their values inside the parentheses divided by colon.\n", + "The query syntax is pretty straightforward. 
Just write the name of the entity `property`, `section` or `document` (also possible to use shortened names `prop`, `sec` and `doc`) and add attributes with their values inside the parentheses separated by a colon.\n", "\n", - "Example from code: `prop(name:Date) section(name:Recording-2013-02-08-ak, type:Recording)`.\n", - "Here we search for sections and properties that `property` has attribute `name` and its value is `Date`.\n", + "As a code example: `prop(name:Date) section(name:Recording-2013-02-08-ak, type:Recording)`.\n", + "Here we search for Sections and Properties where `property` has the attribute `name` and its value is `Date`.\n", "\n", - "For building 'match' queries you should need to know exactly for which odML attribute the value(subject) is related. So if you write `prop(name:Date) section(name:Recording, type:Recording-2013-02-08-ak)` the `find()` method would not return any triples with section parameters. Because it's likely that there is no section with type `Recording-2013-02-08-ak`.\n", + "For building `match` queries you should know exactly to which odML attribute the value (subject) is related. If you write `prop(name:Date) section(name:Recording, type:Recording-2013-02-08-ak)` the `find()` method would not return any triples with Section parameters, because it is unlikely that there is a Section with type `Recording-2013-02-08-ak`.\n", "\n", - "Non-odML entities' attributes here also will be ignored (e.g. only `id, author, date, version, repository, sections` can exist in the `Document` object).\n", - "In the example `section(not-odml-name:Recording-2013-02-08-ak, record:Recording)` the find method return nothing." + "Non-odML entity attributes will also be ignored (e.g. only `id, author, date, version, repository, sections` can exist in the `Document` object).\n", + "In the example `section(not-odml-name:Recording-2013-02-08-ak, record:Recording)` the `find` method returns nothing." 
] }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 14, "metadata": {}, "outputs": [ { @@ -589,7 +458,7 @@ } ], "source": [ - "from odml.tools.fuzzy_finder import FuzzyFinder\n", + "from odml.rdf.fuzzy_finder import FuzzyFinder\n", "\n", "query_string = 'section(not-odml-name:Recording-2013-02-08-ak, record:Recording)'\n", "\n", @@ -601,27 +470,27 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "This is often inconvinient if you do not know exactly what the information is related to in the graph. For situations like this *'fuzzy'* mode comes into play. It is also set by default.\n", + "This is often inconvenient if you do not know exactly how the diverse data in the graph is related. For situations like this *'fuzzy'* mode comes into play. It is also set by default.\n", "\n", - "The output logic is similair to the previous mode, but there you can provide more broad information, the finder will match the parameters and create meaningful queries based on the input.\n", + "The output logic is similar to the previous mode, but there you can provide more broad information, the finder will match the parameters and create meaningful queries based on the input.\n", "\n", "The query string consists of two parts: *FIND* and *HAVING*.\n", "\n", "In the *FIND* part a user specifies the set of odML objects and its attributes. \n", "e.g. 
`FIND prop(name) section(name, type)`\n", "\n", - "In the *HAVING* part a user specifies set of search values which could relate to the attributes in *FIND* part.\n", + "In the *HAVING* part a user specifies a set of search values which could relate to the attributes in the *FIND* part.\n", "e.g `HAVING Recording, Recording-2012-04-04-ab, Date`\n", "\n", "Finally, the complete query will look like this:\n", "`FIND sec(name, type) prop(name) HAVING Recording, Recording-2012-04-04-ab, Date`\n", "\n", - "As you can see in the example you should not really know to which attribute search values in *HAVING* part relates to, the finder can do it for you." + "As you can see in the example you do not need to know to which attribute search values in the *HAVING* part relate to, the finder can do it for you." ] }, { "cell_type": "code", - "execution_count": 6, + "execution_count": 17, "metadata": {}, "outputs": [ { @@ -637,17 +506,17 @@ "?p odml:hasName \"Date\" .\n", "}\n", "Document: https://g-node.org/odml-rdf#cc66e78a-3742-490a-9fdb-1c66761d7652\n", - "Property: https://g-node.org/odml-rdf#f1699eb6-4cab-4dd0-9327-120eab2089ae\n", "Section: https://g-node.org/odml-rdf#5365f7e5-603c-4154-a5ea-33bb1a07a956\n", + "Property: https://g-node.org/odml-rdf#f1699eb6-4cab-4dd0-9327-120eab2089ae\n", + "Document: https://g-node.org/odml-rdf#24066355-1ee8-4eb5-a715-96bbb6231cd5\n", + "Section: https://g-node.org/odml-rdf#bbd44815-5016-49e0-9f4b-5b83778d00de\n", + "Property: https://g-node.org/odml-rdf#fadffec7-6b23-454e-bfd1-9d5884802abb\n", "Document: https://g-node.org/odml-rdf#537c6cc8-7dfe-4d53-a111-24b3ce0f3c1a\n", - "Property: https://g-node.org/odml-rdf#138f08f7-23c7-4722-8577-85a6fa633ae1\n", "Section: https://g-node.org/odml-rdf#346773f2-abee-4892-b052-840ddcff35ee\n", + "Property: https://g-node.org/odml-rdf#138f08f7-23c7-4722-8577-85a6fa633ae1\n", "Document: https://g-node.org/odml-rdf#cd24b60f-1d5e-4040-9881-5e5a597baef7\n", - "Property: 
https://g-node.org/odml-rdf#1d6db4ce-87f3-4e9c-b221-e76ba05b2759\n", "Section: https://g-node.org/odml-rdf#782bd29d-e4b0-4c14-a417-1772a4851ffd\n", - "Document: https://g-node.org/odml-rdf#24066355-1ee8-4eb5-a715-96bbb6231cd5\n", - "Property: https://g-node.org/odml-rdf#fadffec7-6b23-454e-bfd1-9d5884802abb\n", - "Section: https://g-node.org/odml-rdf#bbd44815-5016-49e0-9f4b-5b83778d00de\n", + "Property: https://g-node.org/odml-rdf#1d6db4ce-87f3-4e9c-b221-e76ba05b2759\n", "\n", "SELECT * WHERE {\n", "?d odml:hasSection ?s .\n", @@ -663,14 +532,14 @@ "?p rdf:type odml:Property .\n", "?p odml:hasName \"Date\" .\n", "}\n", - "Property: https://g-node.org/odml-rdf#1d6db4ce-87f3-4e9c-b221-e76ba05b2759\n", - "Section: https://g-node.org/odml-rdf#782bd29d-e4b0-4c14-a417-1772a4851ffd\n", - "Property: https://g-node.org/odml-rdf#fadffec7-6b23-454e-bfd1-9d5884802abb\n", "Section: https://g-node.org/odml-rdf#bbd44815-5016-49e0-9f4b-5b83778d00de\n", - "Property: https://g-node.org/odml-rdf#f1699eb6-4cab-4dd0-9327-120eab2089ae\n", + "Property: https://g-node.org/odml-rdf#fadffec7-6b23-454e-bfd1-9d5884802abb\n", + "Section: https://g-node.org/odml-rdf#782bd29d-e4b0-4c14-a417-1772a4851ffd\n", + "Property: https://g-node.org/odml-rdf#1d6db4ce-87f3-4e9c-b221-e76ba05b2759\n", "Section: https://g-node.org/odml-rdf#5365f7e5-603c-4154-a5ea-33bb1a07a956\n", - "Property: https://g-node.org/odml-rdf#138f08f7-23c7-4722-8577-85a6fa633ae1\n", + "Property: https://g-node.org/odml-rdf#f1699eb6-4cab-4dd0-9327-120eab2089ae\n", "Section: https://g-node.org/odml-rdf#346773f2-abee-4892-b052-840ddcff35ee\n", + "Property: https://g-node.org/odml-rdf#138f08f7-23c7-4722-8577-85a6fa633ae1\n", "\n", "SELECT * WHERE {\n", "?d odml:hasSection ?s .\n", @@ -687,25 +556,32 @@ "}\n", "Document: https://g-node.org/odml-rdf#cc66e78a-3742-490a-9fdb-1c66761d7652\n", "Section: https://g-node.org/odml-rdf#5365f7e5-603c-4154-a5ea-33bb1a07a956\n", + "Document: 
https://g-node.org/odml-rdf#24066355-1ee8-4eb5-a715-96bbb6231cd5\n", + "Section: https://g-node.org/odml-rdf#bbd44815-5016-49e0-9f4b-5b83778d00de\n", "Document: https://g-node.org/odml-rdf#537c6cc8-7dfe-4d53-a111-24b3ce0f3c1a\n", "Section: https://g-node.org/odml-rdf#346773f2-abee-4892-b052-840ddcff35ee\n", "Document: https://g-node.org/odml-rdf#cd24b60f-1d5e-4040-9881-5e5a597baef7\n", "Section: https://g-node.org/odml-rdf#782bd29d-e4b0-4c14-a417-1772a4851ffd\n", - "Document: https://g-node.org/odml-rdf#24066355-1ee8-4eb5-a715-96bbb6231cd5\n", - "Section: https://g-node.org/odml-rdf#bbd44815-5016-49e0-9f4b-5b83778d00de\n", "\n", "\n" ] } ], "source": [ - "from odml.tools.fuzzy_finder import FuzzyFinder\n", + "from odml.rdf.fuzzy_finder import FuzzyFinder\n", "\n", "query_string = 'FIND sec(name, type) prop(name) HAVING Recording, Recording-2012-04-04-ab, Date, Some_value'\n", "\n", "f = FuzzyFinder(graph)\n", - "print(f.find(mode='fuzzy', q_str=query_string))" + "print(f.find(mode='fuzzy', q_str=query_string))\n" ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] } ], "metadata": { @@ -717,16 +593,16 @@ "language_info": { "codemirror_mode": { "name": "ipython", - "version": 3.0 + "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.5.2" + "version": "3.8.1" } }, "nbformat": 4, - "nbformat_minor": 0 -} \ No newline at end of file + "nbformat_minor": 1 +} From 4165932e190de8f0396a8dee5074574ffdd807a5 Mon Sep 17 00:00:00 2001 From: "M. 
Sonntag" Date: Fri, 17 Jul 2020 13:23:08 +0200 Subject: [PATCH 09/19] [doc/rdf] Remove unused file --- doc/rdf/generated_rdf.xml | 25 ------------------------- 1 file changed, 25 deletions(-) delete mode 100644 doc/rdf/generated_rdf.xml diff --git a/doc/rdf/generated_rdf.xml b/doc/rdf/generated_rdf.xml deleted file mode 100644 index a01ed55e..00000000 --- a/doc/rdf/generated_rdf.xml +++ /dev/null @@ -1,25 +0,0 @@ - - - - person - [Arthur Philip Dent,Zaphod Beeblebrox,Tricia Marie McMillan,Ford Prefect] - List of crew members names - NameCrewMembers - - - TheCrew - - crew - Information on the crew - - - 1979-10-12 - 42 - D. N. Adams - 1.1 - - - From 18653e362c6f7c7f469395c1a34e3fcbadb90ca6 Mon Sep 17 00:00:00 2001 From: "M. Sonntag" Date: Fri, 17 Jul 2020 18:33:19 +0200 Subject: [PATCH 10/19] [doc/rdf] Add RDF to readthedocs --- doc/index.rst | 1 + doc/odmltordf.rst | 15 +++++++++++++++ 2 files changed, 16 insertions(+) create mode 100644 doc/odmltordf.rst diff --git a/doc/index.rst b/doc/index.rst index 6f0ef7a5..854841b7 100644 --- a/doc/index.rst +++ b/doc/index.rst @@ -12,6 +12,7 @@ Contents: :maxdepth: 2 tutorial + odmltordf reference Indices and tables diff --git a/doc/odmltordf.rst b/doc/odmltordf.rst new file mode 100644 index 00000000..7f886ca6 --- /dev/null +++ b/doc/odmltordf.rst @@ -0,0 +1,15 @@ +================== +odML to RDF export +================== + +Opening odML to graph database searches +======================================= + +Searches within odML documents are part of the library implementation and imports from linked, external sources into odML documents are possible. +With the option to export odML documents to the RDF format, users also gain the option to search across multiple documents using tools from the Semantic Web technology. + +If you are unfamiliar with it, we linked additional information to the `Semantic web` and `RDF` for your convenience and give a brief introduction below. 
+ +RDF was designed by the World Wide Web Consortium (W3C) as a standard model for data representation and exchange on the web with the heterogeneity of data in mind. Even tough the RDF file format might vary, the underlying concept features two key points. The first is that information is structured in subject-predicate-object triples e.g. "apple hasColor red". The second key point is that multiple subjects and objects can be connected to form a graph e.g. "tree hasFruit apple" can be combined with the previous example to form a minimal graph. These graphs can contain very heterogeneous data, but can still be queried due to the semantic structure of the underlying data. + + From 6f2d43e94b1731f357ab272541d60b5507278403 Mon Sep 17 00:00:00 2001 From: "M. Sonntag" Date: Wed, 12 Aug 2020 15:51:10 +0200 Subject: [PATCH 11/19] [doc/rdf] Add basic save description --- doc/odmltordf.rst | 62 +++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 55 insertions(+), 7 deletions(-) diff --git a/doc/odmltordf.rst b/doc/odmltordf.rst index 7f886ca6..1bb918b0 100644 --- a/doc/odmltordf.rst +++ b/doc/odmltordf.rst @@ -1,15 +1,63 @@ -================== -odML to RDF export -================== +===================== +Exporting odML to RDF +===================== -Opening odML to graph database searches -======================================= +Opening odML to the Semantic Web and graph database searches +============================================================ -Searches within odML documents are part of the library implementation and imports from linked, external sources into odML documents are possible. +Searches within odML documents are part of the library implementation and imports from linked, external sources into odML documents can be easily done with the core library functionality. With the option to export odML documents to the RDF format, users also gain the option to search across multiple documents using tools from the Semantic Web technology. 
-If you are unfamiliar with it, we linked additional information to the `Semantic web` and `RDF` for your convenience and give a brief introduction below. +If you are unfamiliar with it, we linked additional information to the `Semantic web`_ and `RDF`_ for your convenience and give the briefest introduction below. RDF was designed by the World Wide Web Consortium (W3C) as a standard model for data representation and exchange on the web with the heterogeneity of data in mind. Even tough the RDF file format might vary, the underlying concept features two key points. The first is that information is structured in subject-predicate-object triples e.g. "apple hasColor red". The second key point is that multiple subjects and objects can be connected to form a graph e.g. "tree hasFruit apple" can be combined with the previous example to form a minimal graph. These graphs can contain very heterogeneous data, but can still be queried due to the semantic structure of the underlying data. +odML to RDF usage +================= +Without further ado the next sections will expose you to the range of odML to RDF features the core library provides. + +Saving an odML document to an RDF format file +--------------------------------------------- + +Using odml.save to export to default XML RDF +******************************************** + +Once an odML document is available, it can most easily be exported to RDF by the odml.save feature. + +Given below is a minimal example:: + + import odml + + doc = odml.Document() + sec = odml.Section(name="rdf_export_section", parent=doc) + prop = odml.Property(name="rdf_export_property", parent=sec) + + odml.save(doc, "./rdf_export", "RDF") + +This will export the odML document to the RDF format in the XML flavor and will save it to the file `./rdf_export.RDF`. +The content of the file will look something like this (the UUIDs of the individual nodes will differ):: + + + + + rdf_export_property + + + + + None + + + + rdf_export_section + n.s. 
+ + + + + + + From 70cc9408982aaae8458bb1285d20288d1fa4e3b9 Mon Sep 17 00:00:00 2001 From: "M. Sonntag" Date: Wed, 12 Aug 2020 16:15:48 +0200 Subject: [PATCH 12/19] [doc/rdf] Add RDFWriter description --- doc/odmltordf.rst | 43 ++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 42 insertions(+), 1 deletion(-) diff --git a/doc/odmltordf.rst b/doc/odmltordf.rst index 1bb918b0..07e2b5df 100644 --- a/doc/odmltordf.rst +++ b/doc/odmltordf.rst @@ -8,7 +8,7 @@ Opening odML to the Semantic Web and graph database searches Searches within odML documents are part of the library implementation and imports from linked, external sources into odML documents can be easily done with the core library functionality. With the option to export odML documents to the RDF format, users also gain the option to search across multiple documents using tools from the Semantic Web technology. -If you are unfamiliar with it, we linked additional information to the `Semantic web`_ and `RDF`_ for your convenience and give the briefest introduction below. +If you are unfamiliar with it, we linked additional information to the `Semantic web `_ and `RDF `_ for your convenience and give the briefest introduction below. RDF was designed by the World Wide Web Consortium (W3C) as a standard model for data representation and exchange on the web with the heterogeneity of data in mind. Even tough the RDF file format might vary, the underlying concept features two key points. The first is that information is structured in subject-predicate-object triples e.g. "apple hasColor red". The second key point is that multiple subjects and objects can be connected to form a graph e.g. "tree hasFruit apple" can be combined with the previous example to form a minimal graph. These graphs can contain very heterogeneous data, but can still be queried due to the semantic structure of the underlying data. 
@@ -61,3 +61,44 @@ The content of the file will look something like this (the UUIDs of the individu + +Using the RDFWriter class to export to a specific RDF format +************************************************************ + +The RDFWriter class is used to convert odML documents to one of the supported RDF formats: + +``xml, pretty-xml, trix, n3, turtle, ttl, ntriples, nt, nt11, trig`` + +``turtle`` is the format that is best suited for storage and human readability which is why we will use it in our tutorial. For cross-tool usage, saving RDF in its ``XML`` variant is probably the safest choice. + +The output can also be returned as a string instead of saving it to a file:: + + from odml.tools.rdf_converter import RDFWriter + + print(RDFWriter(doc).get_rdf_str('turtle')) + +This will print the content of the odML document in the Turtle flavor of RDF:: + + @prefix odml: . + + odml:Hub odml:hasDocument odml:08c6e31a-533f-443b-acd2-8e961215d38e . + + odml:08c6e31a-533f-443b-acd2-8e961215d38e a odml:Document ; + odml:hasFileName "None" ; + odml:hasSection odml:eebe4bf7-af10-4321-87ec-2cdf77289478 . + + odml:281c5aa7-8fea-4852-85ec-db127f753647 a odml:Property ; + odml:hasName "rdf_export_property" . + + odml:eebe4bf7-af10-4321-87ec-2cdf77289478 a odml:Section ; + odml:hasName "rdf_export_section" ; + odml:hasProperty odml:281c5aa7-8fea-4852-85ec-db127f753647 ; + odml:hasType "n.s." . + +The output can of course also be written to a file with a specified RDF output format; the output file will autmatically be assigned the appropriate file ending.:: + + from odml.tools.rdf_converter import RDFWriter + + RDFWriter(doc).write_file("./rdf_export_turtle", "turtle") + +All available RDF output formats can be viewed via ``odml.tools.parser_utils.RDF_CONVERSION_FORMATS.keys()``. From b4e174eb827826899cb3459b359583dea83bafcd Mon Sep 17 00:00:00 2001 From: "M. 
Sonntag" Date: Wed, 12 Aug 2020 16:31:30 +0200 Subject: [PATCH 13/19] [doc/rdf] Add CL script description --- doc/odmltordf.rst | 45 +++++++++++++++++++++++++++------------------ 1 file changed, 27 insertions(+), 18 deletions(-) diff --git a/doc/odmltordf.rst b/doc/odmltordf.rst index 07e2b5df..efc47975 100644 --- a/doc/odmltordf.rst +++ b/doc/odmltordf.rst @@ -1,9 +1,9 @@ -===================== -Exporting odML to RDF -===================== +=============================== +odML and RDF - Export and usage +=============================== -Opening odML to the Semantic Web and graph database searches -============================================================ +Semantic Web and graph database searches +======================================== Searches within odML documents are part of the library implementation and imports from linked, external sources into odML documents can be easily done with the core library functionality. With the option to export odML documents to the RDF format, users also gain the option to search across multiple documents using tools from the Semantic Web technology. @@ -12,16 +12,13 @@ If you are unfamiliar with it, we linked additional information to the `Semantic RDF was designed by the World Wide Web Consortium (W3C) as a standard model for data representation and exchange on the web with the heterogeneity of data in mind. Even tough the RDF file format might vary, the underlying concept features two key points. The first is that information is structured in subject-predicate-object triples e.g. "apple hasColor red". The second key point is that multiple subjects and objects can be connected to form a graph e.g. "tree hasFruit apple" can be combined with the previous example to form a minimal graph. These graphs can contain very heterogeneous data, but can still be queried due to the semantic structure of the underlying data. 
-odML to RDF usage -================= +odML to RDF export +================== Without further ado the next sections will expose you to the range of odML to RDF features the core library provides. -Saving an odML document to an RDF format file ---------------------------------------------- - -Using odml.save to export to default XML RDF -******************************************** +Default odML to XML RDF export +------------------------------ Once an odML document is available, it can most easily be exported to RDF by the odml.save feature. @@ -62,16 +59,17 @@ The content of the file will look something like this (the UUIDs of the individu -Using the RDFWriter class to export to a specific RDF format -************************************************************ -The RDFWriter class is used to convert odML documents to one of the supported RDF formats: +Specific RDF format export +-------------------------- + +The ``RDFWriter`` class is used to convert odML documents to one of the supported RDF formats: ``xml, pretty-xml, trix, n3, turtle, ttl, ntriples, nt, nt11, trig`` -``turtle`` is the format that is best suited for storage and human readability which is why we will use it in our tutorial. For cross-tool usage, saving RDF in its ``XML`` variant is probably the safest choice. +``turtle`` is the format that is best suited for storage and human readability while for cross-tool usage, saving RDF in its ``XML`` variant is probably the safest choice. -The output can also be returned as a string instead of saving it to a file:: +The exported output can be returned as a string:: from odml.tools.rdf_converter import RDFWriter @@ -95,10 +93,21 @@ This will print the content of the odML document in the Turtle flavor of RDF:: odml:hasProperty odml:281c5aa7-8fea-4852-85ec-db127f753647 ; odml:hasType "n.s." . 
-The output can of course also be written to a file with a specified RDF output format; the output file will autmatically be assigned the appropriate file ending.:: +The output can of course also be written to a file with a specified RDF output format; the output file will automatically be assigned the appropriate file ending:: from odml.tools.rdf_converter import RDFWriter RDFWriter(doc).write_file("./rdf_export_turtle", "turtle") All available RDF output formats can be viewed via ``odml.tools.parser_utils.RDF_CONVERSION_FORMATS.keys()``. + +Bulk export to XML RDF +---------------------- + +Existing odML files can be exported to XML RDF in bulk using the ``odmltordf`` command line tool that is automatically installed with the core library. + +odmlToRDF searches for odML files within a provided SEARCHDIR and converts them to the newest odML format version and exports all found and resulting odML files to XML formatted RDF. Original files will never be overwritten. New files will be written either to a new directory at the current or a specified location. + +Usage: odmltordf [-r] [-o OUT] SEARCHDIR + +The command line option ``-r`` enables recursive search, ``-o OUT`` specifies a dedicated output folder for the created output files. From ef4f84ef33687e207ecc8c24406b0121b9f17fd3 Mon Sep 17 00:00:00 2001 From: "M. 
Sonntag" Date: Wed, 12 Aug 2020 17:08:06 +0200 Subject: [PATCH 14/19] [doc/rdf] RDF subclassing --- doc/odmltordf.rst | 103 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 103 insertions(+) diff --git a/doc/odmltordf.rst b/doc/odmltordf.rst index efc47975..a9141a75 100644 --- a/doc/odmltordf.rst +++ b/doc/odmltordf.rst @@ -111,3 +111,106 @@ odmlToRDF searches for odML files within a provided SEARCHDIR and converts them Usage: odmltordf [-r] [-o OUT] SEARCHDIR The command line option ``-r`` enables recursive search, ``-o OUT`` specifies a dedicated output folder for the created output files. 
+ + +Advanced features +================= + +RDF subclassing of odml.Section.type +------------------------------------ + +By default a set of pre-defined odml.Section.types will export Sections not as an odml:Section but as a specific RDF subclass of an odml:Section. This is meant to simplify SPARQL query searches on graph databases that contain odML-specific RDF. + +As an example an odml.Section normally gets exported as RDF class type odml-rdf:Section:: + + + +An odml.Section with the odml.Section.type="protocol" will by default be exported as a different RDF class type:: + + + +In an RDF query this can now be searched for directly by asking for RDF class "odml-rdf:Protocol" instead of asking for RDF class "odml-rdf:Section" with type "Protocol". + +On install the core library already provides a list of odml.Section.type mappings to RDF subclasses. On initialisation the ``RDFWriter`` loads all subclasses that are available and uses them by default when exporting an odML document to RDF. The available terms and the mappings of odml.Section.types to RDF subclasses can be viewed by accessing the ``section_subclasses`` attribute of an initialised ``RDFWriter``:: + + rdf_export = RDFWriter(doc) + rdf_export.section_subclasses + +This export also adds all used subclass definitions to the resulting file to enable query reasoners to make sense of the introduced subclasses upon a query. 
+ +Currently the following mappings of ``odml.Section.type`` values to odml-rdf:Section subclass are available:: + + analysis: Analysis + analysis/power_spectrum: PowerSpectrum + analysis/psth: PSTH + cell: Cell + datacite/alternate_identifier: AlternateIdentifier + datacite/contributor: Contributer + datacite/contributor/affiliation: Affiliation + datacite/contributor/named_identifier: NamedIdentifier + datacite/creator: Creator + datacite/creator/affiliation: Affiliation + datacite/creator/named_identifier: NamedIdentifier + datacite/date: Date + datacite/description: Description + datacite/format: Format + datacite/funding_reference: FundingReference + datacite/geo_location: GeoLocation + datacite/identifier: Identifier + datacite/related_identifier: RelatedIdentifier + datacite/resource_type: ResourceType + datacite/rights: Rights + datacite/size: Size + datacite/subject: Subject + datacite/title: Title + dataset: Dataset + data_reference: DataReference + blackrock: Blackrock + electrode: Electrode + event: Event + event_list: EventList + experiment: Experiment + experiment/behavior: Behavior + experiment/electrophysiology: Electrophysiology + experiment/imaging: Imaging + experiment/psychophysics: Psychophysics + hardware_properties: HardwareProperties + hardware_settings: HardwareSettings + hardware: Hardware + hardware/amplifier: Amplifier + hardware/attenuator: Attenuator + hardware/camera_objective: CameraObjective + hardware/daq: DataAcquisition + hardware/eyetracker: Eyetracker + hardware/filter: Filter + hardware/filter_set: Filterset + hardware/iaq: ImageAcquisition + hardware/light_source: Lightsource + hardware/microscope: Microscope + hardware/microscope_objective: MicroscopeObjective + hardware/scanner: Scanner + hardware/stimulus_isolator: StimulusIsolator + model/lif: LeakyIntegrateAndFire + model/pif: PerfectIntegrateAndFire + model/multi_compartment: MultiCompartmentModel + model/single_compartment: SingleCompartmentModel + person: Person + 
preparation: Preparation + project: Project + protocol: Protocol + recording: Recording + setup: Setup + stimulus: Stimulus + stimulus/dc: DC + stimulus/gabor: Gabor + stimulus/grating: Grating + stimulus/pulse: Pulse + stimulus/movie: Movie + stimulus/ramp: Ramp + stimulus/random_dot: RandomDot + stimulus/sawtooth: Sawtooth + stimulus/sine_wave: Sinewave + stimulus/square_wave: Squarewave + stimulus/white_noise: Whitenoise + subject: Subject + From 9c71317143a191f2b3ec989c80e6b3fc3bf4136e Mon Sep 17 00:00:00 2001 From: "M. Sonntag" Date: Wed, 12 Aug 2020 17:08:27 +0200 Subject: [PATCH 15/19] [doc/rdf] Custom RDF subclassing --- doc/odmltordf.rst | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/doc/odmltordf.rst b/doc/odmltordf.rst index a9141a75..b0152e26 100644 --- a/doc/odmltordf.rst +++ b/doc/odmltordf.rst @@ -214,3 +214,21 @@ Currently the following mappings of ``odml.Section.type`` values to odml-rdf:Sec stimulus/white_noise: Whitenoise subject: Subject +Custom RDF subclassing +---------------------- + +The default list of odml.Section.types can be supplemented or even replaced by custom type to RDF subclass mappings. + +All that is required is to provide a dictionary of the format ``{"odml.Section.type value": "RDF subclass value"}``. Please note that the ``odml.Section.type`` value should be provided lower case, while the ``RDF subclass value`` should be provided upper case:: + + custom_class_dict = {"species": "Species", "cell": "Neuron"} + rdf_export = RDFWriter(doc, custom_subclasses=custom_class_dict) + +Please note that entries in a custom subclass dictionary will overwrite entries in the default subclass dictionary. + +Disable RDF subclassing +----------------------- +The subclassing feature can be disabled to export all odml.Sections as plain odml-rdf:Sections instead. This might be necessary if, e.g., 
a graph database is used that does not provide proper SPARQL reasoning and cannot make sense of RDF subclasses:: + + rdf_export = RDFWriter(doc, rdf_subclassing=False) + From e713fc8331dd984a70669320a7846cb2d310b385 Mon Sep 17 00:00:00 2001 From: "M. Sonntag" Date: Wed, 12 Aug 2020 18:44:57 +0200 Subject: [PATCH 16/19] [doc/rdf] Query introduction --- doc/odmltordf.rst | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/doc/odmltordf.rst b/doc/odmltordf.rst index b0152e26..73816d36 100644 --- a/doc/odmltordf.rst +++ b/doc/odmltordf.rst @@ -8,14 +8,14 @@ Semantic Web and graph database searches Searches within odML documents are part of the library implementation and imports from linked, external sources into odML documents can be easily done with the core library functionality. With the option to export odML documents to the RDF format, users also gain the option to search across multiple documents using tools from the Semantic Web technology. -If you are unfamiliar with it, we linked additional information to the `Semantic web `_ and `RDF `_ for your convenience and give the briefest introduction below. +If you are unfamiliar with it, we linked additional information to the `Semantic web `_, `RDF `_ and `SPARQL `_ for your convenience and give the briefest introduction below. -RDF was designed by the World Wide Web Consortium (W3C) as a standard model for data representation and exchange on the web with the heterogeneity of data in mind. Even tough the RDF file format might vary, the underlying concept features two key points. The first is that information is structured in subject-predicate-object triples e.g. "apple hasColor red". The second key point is that multiple subjects and objects can be connected to form a graph e.g. "tree hasFruit apple" can be combined with the previous example to form a minimal graph. 
These graphs can contain very heterogeneous data, but can still be queried due to the semantic structure of the underlying data. +RDF was designed by the World Wide Web Consortium (W3C) as a standard model for data representation and exchange on the web with the heterogeneity of data in mind. Even though the RDF file format might vary, the underlying concept features two key points. The first is that information is structured in subject-predicate-object triples e.g. "apple hasColor red". The second key point is that multiple subjects and objects can be connected to form a graph e.g. "tree hasFruit apple" can be combined with the previous example to form a minimal graph. These graphs can contain very heterogeneous data, but can still be queried using the SPARQL query language due to the semantic structure of the underlying data. odML to RDF export ================== -Without further ado the next sections will expose you to the range of odML to RDF features the core library provides. +Without further ado the next sections will expose you to the range of odML to RDF features the core library provides. To check how to create a graph database from exported odML documents and how to query such a database please refer to the section below. Default odML to XML RDF export ------------------------------ @@ -30,9 +30,9 @@ Given below is a minimal example:: sec = odml.Section(name="rdf_export_section", parent=doc) prop = odml.Property(name="rdf_export_property", parent=sec) - odml.save(doc, "./rdf_export", "RDF") + odml.save(doc, "./rdf_export.rdf", "RDF") -This will export the odML document to the RDF format in the XML flavor and will save it to the file `./rdf_export.RDF`. +This will export the odML document to the RDF format in the XML flavor and will save it to the file `./rdf_export.rdf`.
The content of the file will look something like this (the UUIDs of the individual nodes will differ):: @@ -214,6 +214,7 @@ Currently the following mappings of ``odml.Section.type`` values to odml-rdf:Sec stimulus/white_noise: Whitenoise subject: Subject + Custom RDF subclassing ---------------------- @@ -232,3 +233,11 @@ The subclassing feature can be disabled to export all odml.Sections as plain odm rdf_export = RDFWriter(doc, rdf_subclassing=False) + +odML RDF graph search +===================== + +The following section gives a basic example how multiple odML RDF files can be loaded into a single graph database (a so called "triple store") and how queries can be done to retrieve information from such a database. + +Please note, that the `rdflib `_ library provides just basic implementation of a triple store and query features via SPARQL. To make full use of SPARQL additional RDF reasoning libraries are required. In our case the `owlrl `_ library is used to provide proper reasoning and enable searches for the RDF subclassing feature. + From 63e37653c79024a7dee0b8e295936e1e8605ba8d Mon Sep 17 00:00:00 2001 From: "M. Sonntag" Date: Wed, 12 Aug 2020 18:45:34 +0200 Subject: [PATCH 17/19] [doc/rdf] Graph example --- doc/odmltordf.rst | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) diff --git a/doc/odmltordf.rst b/doc/odmltordf.rst index 73816d36..09b181d6 100644 --- a/doc/odmltordf.rst +++ b/doc/odmltordf.rst @@ -241,3 +241,36 @@ The following section gives a basic example how multiple odML RDF files can be l Please note, that the `rdflib `_ library provides just basic implementation of a triple store and query features via SPARQL. To make full use of SPARQL additional RDF reasoning libraries are required. In our case the `owlrl `_ library is used to provide proper reasoning and enable searches for the RDF subclassing feature. 
+The following is a basic example of how to load odML RDF documents into a single graph and provide the required namespace to make the odML-specific content of the graph accessible:: + + import odml + + file_A = "./rdf_recordings.rdf" + file_B = "./rdf_protocols.rdf" + + doc_A = odml.Document(author="MS") + sec_A = odml.Section(name="recording_A", type="paradigm_A", parent=doc_A) + _ = odml.Property(name="protocol", values="recording_protocol_A", parent=sec_A) + sec_B = odml.Section(name="recording_B", type="paradigm_A", parent=doc_A) + _ = odml.Property(name="protocol", values="recording_protocol_B", parent=sec_B) + _ = odml.Section(name="analysis_A", type="paradigm_A", parent=doc_A) + + odml.save(doc_A, file_A, "RDF") + + doc_B = odml.Document(author="MS") + _ = odml.Section(name="recording_protocol_A", type="protocol", parent=doc_B) + _ = odml.Section(name="recording_protocol_B", type="protocol", parent=doc_B) + + odml.save(doc_B, file_B, "RDF") + +Please note that every odML Document exported to RDF has a special ``odml-rdf:Hub`` node at the very root of the document. This node is identical in every exported odML Document and is used as the root Node connecting all individual odML RDF documents into a single graph. + +The documents saved above can now be loaded into a single graph:: + + from rdflib import Graph + + curr_graph = Graph() + curr_graph.parse(file_A) + curr_graph.parse(file_B) + + From 9613f6533de0bb799e43370d6cd6a9c92c078e1a Mon Sep 17 00:00:00 2001 From: "M. Sonntag" Date: Wed, 12 Aug 2020 18:46:27 +0200 Subject: [PATCH 18/19] [doc/rdf] Add query example --- doc/odmltordf.rst | 42 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+) diff --git a/doc/odmltordf.rst b/doc/odmltordf.rst index 09b181d6..b3636261 100644 --- a/doc/odmltordf.rst +++ b/doc/odmltordf.rst @@ -274,3 +274,45 @@ The documents saved above can now be loaded into a single graph:: curr_graph.parse(file_B) +The graph is now ready to accept simple SPARQL queries.
Queries need the odML RDF namespace though to process the odML-specific entries:: + + from odml.tools.rdf_converter import ODML_NS + + from rdflib import Namespace, RDF, RDFS + from rdflib.plugins.sparql import prepareQuery + + # preparing the query namespace + NAMESPACE_MAP = {"odml": Namespace(ODML_NS), "rdf": RDF, "rdfs": RDFS} + + # preparing a query requesting the name of all sections in the graph + q_string = "SELECT * WHERE {?s rdf:type odml:Section . ?s odml:hasName ?sec_name .}" + sec_query = prepareQuery(q_string, initNs=NAMESPACE_MAP) + + for row in curr_graph.query(sec_query): + print("Section name: '%s'" % row.sec_name) + +The query returns:: + + Section name: 'recording_A' + Section name: 'recording_B' + Section name: 'analysis_A' + + +This query returns only the sections from the first file, since reasoning is not yet enabled. This can be changed by applying reasoning to the graph before running the query again:: + + from owlrl import DeductiveClosure, RDFS_Semantics + + DeductiveClosure(RDFS_Semantics).expand(curr_graph) + + for row in curr_graph.query(sec_query): + print("Section name: '%s'" % row.sec_name) + + +This query now returns the sections from both files:: + + Section name: 'recording_B' + Section name: 'recording_A' + Section name: 'recording_protocol_B' + Section name: 'recording_protocol_A' + Section name: 'analysis_A' + From 11434b963c986cdcca7df376999b2626ae152bd2 Mon Sep 17 00:00:00 2001 From: "M. Sonntag" Date: Mon, 17 Aug 2020 16:09:21 +0200 Subject: [PATCH 19/19] [doc/features] Move advanced features The sections about advanced odml features were moved from the tutorial to their own dedicated section.
--- doc/advanced_features.rst | 243 +++++++++++++++++++++++++++++++++++++ doc/index.rst | 1 + doc/tutorial.rst | 244 -------------------------------------- 3 files changed, 244 insertions(+), 244 deletions(-) create mode 100644 doc/advanced_features.rst diff --git a/doc/advanced_features.rst b/doc/advanced_features.rst new file mode 100644 index 00000000..faada02e --- /dev/null +++ b/doc/advanced_features.rst @@ -0,0 +1,243 @@ +====================== +Advanced odML features +====================== + +Working with odML Validations +============================= + +odML Validations are a set of pre-defined checks that are run against an odML document automatically when it is saved or loaded. A document cannot be saved if a Validation fails a check that is classified as an Error. Most validation checks are Warnings that are supposed to raise the overall data quality of the odML Document. + +When an odML document is saved or loaded, the automatic validation will print a short report of encountered Validation Warnings and it is up to the user whether they want to resolve the Warnings. The odML document provides the ``validate`` method to gain easy access to the default validations. A Validation in turn provides not only a specific description of all encountered warnings or errors within an odML document, but it also provides direct access to each and every odML entity i.e. an ``odml.Section`` or an ``odml.Property`` where an issue has been found. This enables the user to quickly access and fix an encountered issue. + +A minimal example shows how a workflow using default validations might look: + + >>> # Create a minimal document with Section issues: name and type are not assigned + >>> doc = odml.Document() + >>> sec = odml.Section(parent=doc) + >>> odml.save(doc, "validation_example.odml.xml") + +This minimal example document will be saved, but will also print the following Validation report: + + >>> UserWarning: The saved Document contains unresolved issues.
Run the Documents 'validate' method to access them. + >>> Validation found 0 errors and 2 warnings in 1 Sections and 0 Properties. + +To fix the encountered warnings, users can access the validation via the documents' ``validate`` method: + + >>> validation = doc.validate() + >>> for issue in validation.errors: + >>> print(issue) + +This will show that the validation has encountered two Warnings and also displays the offending odml entity. + + >>> ValidationWarning: Section[73f29acd-16ae-47af-afc7-371d57898e28] 'Section type not specified' + >>> ValidationWarning: Section[73f29acd-16ae-47af-afc7-371d57898e28] 'Name not assigned' + +To fix the "Name not assigned" warning the Section can be accessed via the validation entry and used to directly assign a human readable name to the Section in the original document. Re-running the validation will show, that the warning has been removed. + + >>> validation.errors[1].obj.name = "validation_example_section" + >>> # Check that the section name has been changed in the document + >>> print(doc.sections) + >>> # Re-running validation + >>> validation = doc.validate() + >>> for issue in validation.errors: + >>> print(issue) + +Similarly the second validation warning can be resolved before saving the document again. + +Please note that the automatic validation is run whenever a document is saved or loaded using the ``odml.save`` and ``odml.load`` functions as well as the ``ODMLWriter`` or the ``ODMLReader`` class. The validation is not run when using any of the lower level ``xmlparser``, ``dict_parser`` or ``rdf_converter`` classes. + +List of available default validations +------------------------------------- + +The following contains a list of the default odml validations, their message and the suggested course of action to resolve the issue. 
+ +| Validation: ``object_required_attributes`` +| Message: "Missing required attribute 'xyz'" +| Applies to: ``Document``, ``Section``, ``Property`` +| Course of action: Add an appropriate value to attribute 'xyz' for the reported odml entity. + +| Validation: ``section_type_must_be_defined`` +| Message: "Section type not specified" +| Applies to: ``Section`` +| Course of action: Fill in the ``type`` attribute of the reported Section. + +| Validation: ``section_unique_ids`` +| Message: "Duplicate id in Section 'secA' and 'secB'" +| Applies to: ``Section`` +| Course of action: IDs have to be unique and a duplicate id was found. Assign a new id for the reported Section. + +| Validation: ``property_unique_ids`` +| Message: "Duplicate id in Property 'propA' and 'propB'" +| Applies to: ``Property`` +| Course of action: IDs have to be unique and a duplicate id was found. Assign a new id for the reported Property + +| Validation: ``section_unique_name_type`` +| Message: "name/type combination must be unique" +| Applies to: ``Section`` +| Course of action: The combination of Section.name and Section.type has to be unique on the same level. Change either name or type of the reported Section. + +| Validation: ``object_unique_name`` +| Message: "Object names must be unique" +| Applies to: ``Document``, ``Section``, ``Property`` +| Course of action: Property name has to be unique on the same level. Change the name of the reported Property. + +| Validation: ``object_name_readable`` +| Message: "Name not assigned" +| Applies to: ``Section``, ``Property`` +| Course of action: When Section or Property names are left empty on creation or set to None, they are automatically assigned the entities uuid. Assign a human readable name to the reported entity. 
+ +| Validation: ``property_terminology_check`` +| Message: "Property 'prop' not found in terminology" +| Applies to: ``Property`` +| Course of action: The reported entity is linked to a repository but the repository is not available. Check if the linked content has moved. + +| Validation: ``property_dependency_check`` +| Message: "Property refers to a non-existent dependency object" or "Dependency-value is not equal to value of the property's dependency" +| Applies to: ``Property`` +| Course of action: The reported entity depends on another Property, but this dependency has not been satisfied. Check the referenced Property and its value to resolve the issue. + +| Validation: ``property_values_check`` +| Message: "Tuple of length 'x' not consistent with dtype 'dtype'!" or "Property values not of consistent dtype!". +| Applies to: ``Property`` +| Course of action: Adjust the values or the dtype of the referenced Property. + +| Validation: ``property_values_string_check`` +| Message: "Dtype of property "prop" currently is "string", but might fit dtype "dtype"!" +| Applies to: ``Property`` +| Course of action: Check if the datatype of the referenced Property.values has been loaded correctly and change the Property.dtype if required. + +| Validation: ``section_properties_cardinality`` +| Message: "cardinality violated x values, y found)" +| Applies to: ``Section`` +| Course of action: A cardinality defined for the number of Properties of a Section does not match. Add or remove Properties until the cardinality has been satisfied or adjust the cardinality. + +| Validation: ``section_sections_cardinality`` +| Message: "cardinality violated x values, y found)" +| Applies to: ``Section`` +| Course of action: A cardinality defined for the number of Sections of a Section does not match. Add or remove Sections until the cardinality has been satisfied or adjust the cardinality.
+ +| Validation: ``property_values_cardinality`` +| Message: "cardinality violated x values, y found)" +| Applies to: ``Property`` +| Course of action: A cardinality defined for the number of Values of a Property does not match. Add or remove Values until the cardinality has been satisfied or adjust the cardinality. + +| Validation: ``section_repository_present`` +| Message: "A section should have an associated repository" or "Could not load terminology" or "Section type not found in terminology" +| Applies to: ``Section`` +| Course of action: Optional validation. Will report any section that does not specify a repository. Add a repository to the reported Section to resolve. + +Custom validations +------------------ + +Users can write their own validation and register them either with the default validation or add it to their own validation class instance. + +A custom validation handler needs to ``yield`` a ``ValidationError``. See the ``validation.ValidationError`` class for details. + +Custom validation handlers can be registered to be applied on "odML" (the odml Document), "section" or "property". 
+ + >>> import odml + >>> import odml.validation as oval + >>> + >>> # Create an example document + >>> doc = odml.Document() + >>> sec_valid = odml.Section(name="Recording-20200505", parent=doc) + >>> sec_invalid = odml.Section(name="Movie-20200505", parent=doc) + >>> subsec = odml.Section(name="Sub-Movie-20200505", parent=sec_valid) + >>> + >>> # Define a validation handler that yields a ValidationError if a section name does not start with 'Recording-' + >>> def custom_validation_handler(obj): + >>> validation_id = oval.IssueID.custom_validation + >>> msg = "Section name does not start with 'Recording-'" + >>> if not obj.name.startswith("Recording-"): + >>> yield oval.ValidationError(obj, msg, oval.LABEL_ERROR, validation_id) + >>> + >>> # Create a custom, empty validation with an odML document 'doc' + >>> custom_validation = oval.Validation(doc, reset=True) + >>> # Register a custom validation handler that should be applied on all Sections of a Document + >>> custom_validation.register_custom_handler("section", custom_validation_handler) + >>> # Run the custom validation and return a report + >>> custom_validation.report() + >>> # Display the errors reported by the validation + >>> print(custom_validation.errors) + +Defining and working with feature cardinality +============================================= + +The odML format allows users to define a cardinality for +the number of subsections and properties of Sections and +the number of values a Property might have. + +A cardinality is checked when it is set, when its target is +set and when a document is saved or loaded. If a specific +cardinality is violated, a corresponding warning will be printed. + +Setting a cardinality +--------------------- + +A cardinality can be set for sections or properties of sections +or for values of properties. By default every cardinality is None, +but it can be set to a defined minimal and/or a maximal number of +an element. 
+ +A cardinality is set via its convenience method: + + >>> # Set the cardinality of the properties of a Section 'sec' to + >>> # a maximum of 5 elements. + >>> sec = odml.Section(name="cardinality", type="test") + >>> sec.set_properties_cardinality(max_val=5) + + >>> # Set the cardinality of the subsections of Section 'sec' to + >>> # a minimum of one and a maximum of 2 elements. + >>> sec.set_sections_cardinality(min_val=1, max_val=2) + + >>> # Set the cardinality of the values of a Property 'prop' to + >>> # a minimum of 1 element. + >>> prop = odml.Property(name="cardinality") + >>> prop.set_values_cardinality(min_val=1) + + >>> # Re-set the cardinality of the values of a Property 'prop' to not set. + >>> prop.set_values_cardinality() + >>> # or + >>> prop.val_cardinality = None + +Please note that a set cardinality is not enforced. Users can set fewer or more entities than a cardinality allows. Instead, whenever a cardinality is not met, a warning message is displayed and any unmet cardinality will show up as a Validation warning message whenever a document is saved or loaded. + +View odML documents in a web browser +==================================== + +By default all odML files are saved in the XML format without the capability to view +the plain files in a browser. By default you can use the command line tool ``odmlview`` +to view saved odML files locally. Since this requires the start of a local server, +there is another option to view odML XML files in a web browser.
+ +You can use an additional feature of the ``odml.tools.XMLWriter`` to save an odML +document with an embedded default stylesheet for local viewing: + + >>> import odml + >>> from odml.tools import XMLWriter + >>> doc = odml.Document() # minimal example document + >>> filename = "viewable_document.xml" + >>> XMLWriter(doc).write_file(filename, local_style=True) + +Now you can open the resulting file 'viewable_document.xml' in any current web-browser +and it will render the content of the odML file. + +If you want to use a custom style sheet to render an odML document instead of the default +one, you can provide it as a string to the XML writer. Please note, that it cannot be a +full XSL stylesheet, the outermost tag of the XSL code has to be +`` [your custom style here] ``: + + >>> import odml + >>> from odml.tools import XMLWriter + >>> doc = odml.Document() # minimal example document + >>> filename = "viewable_document.xml" + >>> own_template = """ [your custom style here] """ + >>> XMLWriter(doc).write_file(filename, custom_template=own_template) + +Please note that if the file is saved using the '.odml' extension and you are using +Chrome, you will need to map the '.odml' extension to the browsers Mime-type database as +'application/xml'. + +Also note that any style that is saved with an odML document will be lost, when this +document is loaded again and changes to the content are added. In this case the required +style needs to be specified again when saving the changed file as described above. 
diff --git a/doc/index.rst b/doc/index.rst index 854841b7..768879e5 100644 --- a/doc/index.rst +++ b/doc/index.rst @@ -12,6 +12,7 @@ Contents: :maxdepth: 2 tutorial + advanced_features odmltordf reference diff --git a/doc/tutorial.rst b/doc/tutorial.rst index 096eafdb..a2362d82 100644 --- a/doc/tutorial.rst +++ b/doc/tutorial.rst @@ -899,250 +899,6 @@ format option when loading the document: ------------------------------------------------------------------------------- -Advanced odML-Features -====================== - -View odML documents in a web browser ------------------------------------- - -By default all odML files are saved in the XML format without the capability to view -the plain files in a browser. By default you can use the command line tool ``odmlview`` -to view saved odML files locally. Since this requires the start of a local server, -there is another option to view odML XML files in a web browser. - -You can use an additional feature of the ``odml.tools.XMLWriter`` to save an odML -document with an embedded default stylesheet for local viewing: - - >>> import odml - >>> from odml.tools import XMLWriter - >>> doc = odml.Document() # minimal example document - >>> filename = "viewable_document.xml" - >>> XMLWriter(doc).write_file(filename, local_style=True) - -Now you can open the resulting file 'viewable_document.xml' in any current web-browser -and it will render the content of the odML file. - -If you want to use a custom style sheet to render an odML document instead of the default -one, you can provide it as a string to the XML writer. 
Please note, that it cannot be a -full XSL stylesheet, the outermost tag of the XSL code has to be -`` [your custom style here] ``: - - >>> import odml - >>> from odml.tools import XMLWriter - >>> doc = odml.Document() # minimal example document - >>> filename = "viewable_document.xml" - >>> own_template = """ [your custom style here] """ - >>> XMLWriter(doc).write_file(filename, custom_template=own_template) - -Please note that if the file is saved using the '.odml' extension and you are using -Chrome, you will need to map the '.odml' extension to the browsers Mime-type database as -'application/xml'. - -Also note that any style that is saved with an odML document will be lost, when this -document is loaded again and changes to the content are added. In this case the required -style needs to be specified again when saving the changed file as described above. - - -Defining and working with feature cardinality ---------------------------------------------- - -The odML format allows users to define a cardinality for -the number of subsections and properties of Sections and -the number of values a Property might have. - -A cardinality is checked when it is set, when its target is -set and when a document is saved or loaded. If a specific -cardinality is violated, a corresponding warning will be printed. - -Setting a cardinality -********************* - -A cardinality can be set for sections or properties of sections -or for values of properties. By default every cardinality is None, -but it can be set to a defined minimal and/or a maximal number of -an element. - -A cardinality is set via its convenience method: - - >>> # Set the cardinality of the properties of a Section 'sec' to - >>> # a maximum of 5 elements. - >>> sec = odml.Section(name="cardinality", type="test") - >>> sec.set_properties_cardinality(max_val=5) - - >>> # Set the cardinality of the subsections of Section 'sec' to - >>> # a minimum of one and a maximum of 2 elements. 
- >>> sec.set_sections_cardinality(min_val=1, max_val=2) - - >>> # Set the cardinality of the values of a Property 'prop' to - >>> # a minimum of 1 element. - >>> prop = odml.Property(name="cardinality") - >>> prop.set_values_cardinality(min_val=1) - - >>> # Re-set the cardinality of the values of a Property 'prop' to not set. - >>> prop.set_values_cardinality() - >>> # or - >>> prop.val_cardinality = None - -Please note that a set cardinality is not enforced. Users can set less or more entities than are specified allowed via a cardinality. Instead whenever a cardinality is not met, a warning message is displayed and any unment cardinality will show up as a Validation warning message whenever a document is saved or loaded. - -Working with Validations ------------------------- - -odML Validations are a set of pre-defined checks that are run against an odML document automatically when it is saved or loaded. A document cannot be saved, if a Validation fails a check that is classified as an Error. Most validation checks are Warnings that are supposed to raise the overall data quality of the odml Document. - -When an odML document is saved or loaded, tha automatic validation will print a short report of encountered Validation Warnings and it is up to the user whether they want to resolve the Warnings. The odML document provides the ``validate`` method to gain easy access to the default validations. A Validation in turn provides not only a specific description of all encountered warnings or errors within an odML document, but it also provides direct access to each and every odML entity i.e. an ``odml.Section`` or an ``odml.Property`` where an issue has been found. This enables the user to quickly access and fix an encountered issue. 
- -A minimal example shows how a workflow using default validations might look like: - - >>> # Create a minimal document with Section issues: name and type are not assigned - >>> doc = odml.Document() - >>> sec = odml.Section(parent=doc) - >>> odml.save(doc, "validation_example.odml.xml") - -This minimal example document will be saved, but will also print the following Validation report: - - >>> UserWarning: The saved Document contains unresolved issues. Run the Documents 'validate' method to access them. - >>> Validation found 0 errors and 2 warnings in 1 Sections and 0 Properties. - -To fix the encountered warnings, users can access the validation via the documents' ``validate`` method: - - >>> validation = doc.validate() - >>> for issue in validation.errors: - >>> print(issue) - -This will show that the validation has encountered two Warnings and also displays the offending odml entity. - - >>> ValidationWarning: Section[73f29acd-16ae-47af-afc7-371d57898e28] 'Section type not specified' - >>> ValidationWarning: Section[73f29acd-16ae-47af-afc7-371d57898e28] 'Name not assigned' - -To fix the "Name not assigned" warning the Section can be accessed via the validation entry and used to directly assign a human readable name to the Section in the original document. Re-running the validation will show, that the warning has been removed. - - >>> validation.errors[1].obj.name = "validation_example_section" - >>> # Check that the section name has been changed in the document - >>> print(doc.sections) - >>> # Re-running validation - >>> validation = doc.validate() - >>> for issue in validation.errors: - >>> print(issue) - -Similarly the second validation warning can be resolved before saving the document again. - -Please note that the automatic validation is run whenever a document is saved or loaded using the ``odml.save`` and ``odml.load`` functions as well as the ``ODMLWriter`` or the ``ODMLReader`` class. 
The validation is not run when using any of the lower level ``xmlparser``, ``dict_parser`` or ``rdf_converter`` classes. - -List of available default validations -************************************* - -The following contains a list of the default odml validations, their message and the suggested course of action to resolve the issue. - -| Validation: ``object_required_attributes`` -| Message: "Missing required attribute 'xyz'" -| Applies to: ``Document``, ``Section``, ``Property`` -| Course of action: Add an appropriate value to attribute 'xyz' for the reported odml entity. - -| Validation: ``section_type_must_be_defined`` -| Message: "Section type not specified" -| Applies to: ``Section`` -| Course of action: Fill in the ``type`` attribute of the reported Section. - -| Validation: ``section_unique_ids`` -| Message: "Duplicate id in Section 'secA' and 'secB'" -| Applies to: ``Section`` -| Course of action: IDs have to be unique and a duplicate id was found. Assign a new id for the reported Section. - -| Validation: ``property_unique_ids`` -| Message: "Duplicate id in Property 'propA' and 'propB'" -| Applies to: ``Property`` -| Course of action: IDs have to be unique and a duplicate id was found. Assign a new id for the reported Property - -| Validation: ``section_unique_name_type`` -| Message: "name/type combination must be unique" -| Applies to: ``Section`` -| Course of action: The combination of Section.name and Section.type has to be unique on the same level. Change either name or type of the reported Section. - -| Validation: ``object_unique_name`` -| Message: "Object names must be unique" -| Applies to: ``Document``, ``Section``, ``Property`` -| Course of action: Property name has to be unique on the same level. Change the name of the reported Property. 
- -| Validation: ``object_name_readable`` -| Message: "Name not assigned" -| Applies to: ``Section``, ``Property`` -| Course of action: When Section or Property names are left empty on creation or set to None, they are automatically assigned the entities uuid. Assign a human readable name to the reported entity. - -| Validation: ``property_terminology_check`` -| Message: "Property 'prop' not found in terminology" -| Applies to: ``Property`` -| Course of action: The reported entity is linked to a repository but the repository is not available. Check if the linked content has moved. - -| Validation: ``property_dependency_check`` -| Message: "Property refers to a non-existent dependency object" or "Dependency-value is not equal to value of the property's dependency" -| Applies to: ``Property`` -| Course of action: The reported entity depends on another Property, but this dependency has not been satisfied. Check the referenced Property and its value to resolve the issue. - -| Validation: ``property_values_check`` -| Message: "Tuple of length 'x' not consistent with dtype 'dtype'!" or "Property values not of consistent dtype!". -| Applies to: ``Property`` -| Course of action: Adjust the values or the dtype of the referenced Propery. - -| Validation: ``property_values_string_check`` -| Message: "Dtype of property "prop" currently is "string", but might fit dtype "dtype"!" -| Applies to: ``Property`` -| Course of action: Check if the datatype of the referenced Property.values has been loaded correctly and change the Property.dtype if required. - -| Validation: ``section_properties_cardinality`` -| Message: "cardinality violated x values, y found)" -| Applies to: ``Section`` -| Course of action: A cardinality defined for the number of Properties of a Section does not match. Add or remove Properties until the cardinality has been satisfied or adjust the cardinality. 
- -| Validation: ``section_sections_cardinality`` -| Message: "cardinality violated x values, y found)" -| Applies to: ``Section`` -| Course of action: A cardinality defined for the number of Sections of a Section does not match. Add or remove Sections until the cardinality has been satisfied or adjust the cardinality. - -| Validation: ``property_values_cardinality`` -| Message: "cardinality violated x values, y found)" -| Applies to: ``Property`` -| Course of action: A cardinality defined for the number of Values of a Property does not match. Add or remove Values until the cardinality has been satisfied or adjust the cardinality. - -| Validation: ``section_repository_present`` -| Message: "A section should have an associated repository" or "Could not load terminology" or "Section type not found in terminology" -| Applies to: ``Section`` -| Course of action: Optional validation. Will report any section that does not specify a repository. Add a repository to the reported Section to resolve. - -Custom validations -****************** - -Users can write their own validation and register them either with the default validation or add it to their own validation class instance. - -A custom validation handler needs to ``yield`` a ``ValidationError``. See the ``validation.ValidationError`` class for details. - -Custom validation handlers can be registered to be applied on "odML" (the odml Document), "section" or "property". 
- - >>> import odml - >>> import odml.validation as oval - >>> - >>> # Create an example document - >>> doc = odml.Document() - >>> sec_valid = odml.Section(name="Recording-20200505", parent=doc) - >>> sec_invalid = odml.Section(name="Movie-20200505", parent=doc) - >>> subsec = odml.Section(name="Sub-Movie-20200505", parent=sec_valid) - >>> - >>> # Define a validation handler that yields a ValidationError if a section name does not start with 'Recording-' - >>> def custom_validation_handler(obj): - >>> validation_id = oval.IssueID.custom_validation - >>> msg = "Section name does not start with 'Recording-'" - >>> if not obj.name.startswith("Recording-"): - >>> yield oval.ValidationError(obj, msg, oval.LABEL_ERROR, validation_id) - >>> - >>> # Create a custom, empty validation with an odML document 'doc' - >>> custom_validation = oval.Validation(doc, reset=True) - >>> # Register a custom validation handler that should be applied on all Sections of a Document - >>> custom_validation.register_custom_handler("section", custom_validation_handler) - >>> # Run the custom validation and return a report - >>> custom_validation.report() - >>> # Display the errors reported by the validation - >>> print(custom_validation.errors) - Advanced Value features -----------------------