diff --git a/doc/advanced_features.rst b/doc/advanced_features.rst new file mode 100644 index 00000000..faada02e --- /dev/null +++ b/doc/advanced_features.rst @@ -0,0 +1,243 @@ +====================== +Advanced odML features +====================== + +Working with odML Validations +============================= + +odML Validations are a set of pre-defined checks that are run against an odML document automatically when it is saved or loaded. A document cannot be saved if a Validation fails a check that is classified as an Error. Most validation checks are Warnings that are supposed to raise the overall data quality of the odML Document. + +When an odML document is saved or loaded, the automatic validation prints a short report of the encountered Validation Warnings, and it is up to the user whether to resolve them. The odML document provides the ``validate`` method to gain easy access to the default validations. A Validation in turn provides not only a specific description of all encountered warnings or errors within an odML document, but also direct access to each and every odML entity, i.e. an ``odml.Section`` or an ``odml.Property``, where an issue has been found. This enables the user to quickly access and fix an encountered issue. + +A minimal example shows what a workflow using default validations might look like: + + >>> # Create a minimal document with Section issues: name and type are not assigned + >>> doc = odml.Document() + >>> sec = odml.Section(parent=doc) + >>> odml.save(doc, "validation_example.odml.xml") + +This minimal example document will be saved, but will also print the following Validation report: + + >>> UserWarning: The saved Document contains unresolved issues. Run the Documents 'validate' method to access them. + >>> Validation found 0 errors and 2 warnings in 1 Sections and 0 Properties. 
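The report above carries a ``UserWarning:`` prefix, which suggests that it is emitted through Python's standard ``warnings`` module. Assuming that holds, a script could capture the report instead of letting it print. The following is only a sketch: ``save_with_report`` is a hypothetical stand-in for ``odml.save`` so that the snippet runs without odML installed, and ``capture_reports`` is likewise an illustrative helper, not part of the odML API.

```python
import warnings


def save_with_report():
    """Hypothetical stand-in for odml.save: emits its report as a UserWarning."""
    warnings.warn("The saved Document contains unresolved issues.", UserWarning)


def capture_reports(func):
    """Run 'func' and return the messages of all UserWarnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones that were already shown once.
        warnings.simplefilter("always")
        func()
    return [str(w.message) for w in caught if issubclass(w.category, UserWarning)]


reports = capture_reports(save_with_report)
print(reports)
```

Collecting the messages this way leaves the decision of whether to act on a report to the calling script, which matches the "up to the user" policy described above.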
+ +To fix the encountered warnings, users can access the validation via the document's ``validate`` method: + + >>> validation = doc.validate() + >>> for issue in validation.errors: + >>>     print(issue) + +This will show that the validation has encountered two Warnings and displays the offending odML entity for each. + + >>> ValidationWarning: Section[73f29acd-16ae-47af-afc7-371d57898e28] 'Section type not specified' + >>> ValidationWarning: Section[73f29acd-16ae-47af-afc7-371d57898e28] 'Name not assigned' + +To fix the "Name not assigned" warning, the Section can be accessed via the validation entry and used to directly assign a human-readable name to the Section in the original document. Re-running the validation will show that the warning has been removed. + + >>> validation.errors[1].obj.name = "validation_example_section" + >>> # Check that the section name has been changed in the document + >>> print(doc.sections) + >>> # Re-running validation + >>> validation = doc.validate() + >>> for issue in validation.errors: + >>>     print(issue) + +Similarly, the second validation warning can be resolved before saving the document again. + +Please note that the automatic validation is run whenever a document is saved or loaded using the ``odml.save`` and ``odml.load`` functions as well as the ``ODMLWriter`` or the ``ODMLReader`` class. The validation is not run when using any of the lower-level ``xmlparser``, ``dict_parser`` or ``rdf_converter`` classes. + +List of available default validations +------------------------------------- + +The following list contains the default odML validations, their messages and the suggested course of action to resolve each issue. + +| Validation: ``object_required_attributes`` +| Message: "Missing required attribute 'xyz'" +| Applies to: ``Document``, ``Section``, ``Property`` +| Course of action: Add an appropriate value to attribute 'xyz' for the reported odML entity. 
+ +| Validation: ``section_type_must_be_defined`` +| Message: "Section type not specified" +| Applies to: ``Section`` +| Course of action: Fill in the ``type`` attribute of the reported Section. + +| Validation: ``section_unique_ids`` +| Message: "Duplicate id in Section 'secA' and 'secB'" +| Applies to: ``Section`` +| Course of action: IDs have to be unique and a duplicate id was found. Assign a new id for the reported Section. + +| Validation: ``property_unique_ids`` +| Message: "Duplicate id in Property 'propA' and 'propB'" +| Applies to: ``Property`` +| Course of action: IDs have to be unique and a duplicate id was found. Assign a new id for the reported Property. + +| Validation: ``section_unique_name_type`` +| Message: "name/type combination must be unique" +| Applies to: ``Section`` +| Course of action: The combination of Section.name and Section.type has to be unique on the same level. Change either name or type of the reported Section. + +| Validation: ``object_unique_name`` +| Message: "Object names must be unique" +| Applies to: ``Document``, ``Section``, ``Property`` +| Course of action: Property name has to be unique on the same level. Change the name of the reported Property. + +| Validation: ``object_name_readable`` +| Message: "Name not assigned" +| Applies to: ``Section``, ``Property`` +| Course of action: When Section or Property names are left empty on creation or set to None, they are automatically assigned the entity's UUID. Assign a human-readable name to the reported entity. + +| Validation: ``property_terminology_check`` +| Message: "Property 'prop' not found in terminology" +| Applies to: ``Property`` +| Course of action: The reported entity is linked to a repository but the repository is not available. Check if the linked content has moved. 
+ +| Validation: ``property_dependency_check`` +| Message: "Property refers to a non-existent dependency object" or "Dependency-value is not equal to value of the property's dependency" +| Applies to: ``Property`` +| Course of action: The reported entity depends on another Property, but this dependency has not been satisfied. Check the referenced Property and its value to resolve the issue. + +| Validation: ``property_values_check`` +| Message: "Tuple of length 'x' not consistent with dtype 'dtype'!" or "Property values not of consistent dtype!" +| Applies to: ``Property`` +| Course of action: Adjust the values or the dtype of the referenced Property. + +| Validation: ``property_values_string_check`` +| Message: "Dtype of property "prop" currently is "string", but might fit dtype "dtype"!" +| Applies to: ``Property`` +| Course of action: Check if the datatype of the referenced Property.values has been loaded correctly and change the Property.dtype if required. + +| Validation: ``section_properties_cardinality`` +| Message: "cardinality violated (x values, y found)" +| Applies to: ``Section`` +| Course of action: A cardinality defined for the number of Properties of a Section does not match. Add or remove Properties until the cardinality has been satisfied or adjust the cardinality. + +| Validation: ``section_sections_cardinality`` +| Message: "cardinality violated (x values, y found)" +| Applies to: ``Section`` +| Course of action: A cardinality defined for the number of Sections of a Section does not match. Add or remove Sections until the cardinality has been satisfied or adjust the cardinality. + +| Validation: ``property_values_cardinality`` +| Message: "cardinality violated (x values, y found)" +| Applies to: ``Property`` +| Course of action: A cardinality defined for the number of Values of a Property does not match. Add or remove Values until the cardinality has been satisfied or adjust the cardinality. 
+ +| Validation: ``section_repository_present`` +| Message: "A section should have an associated repository" or "Could not load terminology" or "Section type not found in terminology" +| Applies to: ``Section`` +| Course of action: Optional validation. Will report any section that does not specify a repository. Add a repository to the reported Section to resolve. + +Custom validations +------------------ + +Users can write their own validation handlers and register them either with the default validation or with their own ``Validation`` instance. + +A custom validation handler needs to ``yield`` a ``ValidationError``. See the ``validation.ValidationError`` class for details. + +Custom validation handlers can be registered to be applied to "odML" (the odML Document), "section" or "property". + + >>> import odml + >>> import odml.validation as oval + >>> + >>> # Create an example document + >>> doc = odml.Document() + >>> sec_valid = odml.Section(name="Recording-20200505", parent=doc) + >>> sec_invalid = odml.Section(name="Movie-20200505", parent=doc) + >>> subsec = odml.Section(name="Sub-Movie-20200505", parent=sec_valid) + >>> + >>> # Define a validation handler that yields a ValidationError if a section name does not start with 'Recording-' + >>> def custom_validation_handler(obj): + >>>     validation_id = oval.IssueID.custom_validation + >>>     msg = "Section name does not start with 'Recording-'" + >>>     if not obj.name.startswith("Recording-"): + >>>         yield oval.ValidationError(obj, msg, oval.LABEL_ERROR, validation_id) + >>> + >>> # Create a custom, empty validation with an odML document 'doc' + >>> custom_validation = oval.Validation(doc, reset=True) + >>> # Register a custom validation handler that should be applied on all Sections of a Document + >>> custom_validation.register_custom_handler("section", custom_validation_handler) + >>> # Run the custom validation and return a report + >>> custom_validation.report() + >>> # Display the errors reported by the validation 
+ >>> print(custom_validation.errors) + +Defining and working with feature cardinality +============================================= + +The odML format allows users to define a cardinality for +the number of subsections and properties of Sections and +the number of values a Property might have. + +A cardinality is checked when it is set, when its target is +set and when a document is saved or loaded. If a specific +cardinality is violated, a corresponding warning will be printed. + +Setting a cardinality +--------------------- + +A cardinality can be set for sections or properties of sections +or for values of properties. By default every cardinality is None, +but it can be set to a defined minimum and/or maximum number of +elements. + +A cardinality is set via its convenience method: + + >>> # Set the cardinality of the properties of a Section 'sec' to + >>> # a maximum of 5 elements. + >>> sec = odml.Section(name="cardinality", type="test") + >>> sec.set_properties_cardinality(max_val=5) + + >>> # Set the cardinality of the subsections of Section 'sec' to + >>> # a minimum of one and a maximum of 2 elements. + >>> sec.set_sections_cardinality(min_val=1, max_val=2) + + >>> # Set the cardinality of the values of a Property 'prop' to + >>> # a minimum of 1 element. + >>> prop = odml.Property(name="cardinality") + >>> prop.set_values_cardinality(min_val=1) + + >>> # Reset the cardinality of the values of a Property 'prop' to unset. + >>> prop.set_values_cardinality() + >>> # or + >>> prop.val_cardinality = None + +Please note that a set cardinality is not enforced. Users can add fewer or more entities than a cardinality specifies. Whenever a cardinality is not met, a warning message is displayed, and any unmet cardinality will show up as a Validation warning whenever a document is saved or loaded. 
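The "checked but not enforced" behaviour described above can be sketched in plain Python. This is an illustration under stated assumptions only: ``check_cardinality`` and ``add_value`` are hypothetical helpers, not part of the odML API, and the message wording is illustrative rather than odML's actual output.

```python
import warnings


def check_cardinality(count, min_val=None, max_val=None):
    """Return a warning message if 'count' violates the bounds, else None.

    Hypothetical helper for illustration only; not part of odML.
    A bound of None means that side of the cardinality is not set.
    """
    if min_val is not None and count < min_val:
        return "cardinality violated (minimum %s values, %s found)" % (min_val, count)
    if max_val is not None and count > max_val:
        return "cardinality violated (maximum %s values, %s found)" % (max_val, count)
    return None


def add_value(values, new_value, min_val=None, max_val=None):
    """Append a value; warn, but do not refuse, if the cardinality is unmet."""
    values.append(new_value)
    msg = check_cardinality(len(values), min_val, max_val)
    if msg:
        warnings.warn(msg)
    return values
```

The value is always appended, and the warning is the only consequence of a violation, which mirrors the policy that a cardinality documents an expectation rather than blocking edits.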
+ +View odML documents in a web browser +==================================== + +By default, all odML files are saved in the XML format without the capability to view +the plain files in a browser. The command line tool ``odmlview`` can be used +to view saved odML files locally. Since this requires starting a local server, +there is another option to view odML XML files in a web browser. + +You can use an additional feature of the ``odml.tools.XMLWriter`` to save an odML +document with an embedded default stylesheet for local viewing: + + >>> import odml + >>> from odml.tools import XMLWriter + >>> doc = odml.Document() # minimal example document + >>> filename = "viewable_document.xml" + >>> XMLWriter(doc).write_file(filename, local_style=True) + +Now you can open the resulting file 'viewable_document.xml' in any current web browser +and it will render the content of the odML file. + +If you want to use a custom style sheet to render an odML document instead of the default +one, you can provide it as a string to the XML writer. Please note that it cannot be a +full XSL stylesheet; the outermost tag of the XSL code has to be +`` [your custom style here] ``: + + >>> import odml + >>> from odml.tools import XMLWriter + >>> doc = odml.Document() # minimal example document + >>> filename = "viewable_document.xml" + >>> own_template = """ [your custom style here] """ + >>> XMLWriter(doc).write_file(filename, custom_template=own_template) + +Please note that if the file is saved using the '.odml' extension and you are using +Chrome, you will need to map the '.odml' extension to the browser's MIME type database as +'application/xml'. + +Also note that any style that is saved with an odML document will be lost when this +document is loaded again and changes to the content are added. In this case the required +style needs to be specified again when saving the changed file as described above. 
diff --git a/doc/index.rst b/doc/index.rst index 854841b7..768879e5 100644 --- a/doc/index.rst +++ b/doc/index.rst @@ -12,6 +12,7 @@ Contents: :maxdepth: 2 tutorial + advanced_features odmltordf reference diff --git a/doc/tutorial.rst b/doc/tutorial.rst index 096eafdb..a2362d82 100644 --- a/doc/tutorial.rst +++ b/doc/tutorial.rst @@ -899,250 +899,6 @@ format option when loading the document: ------------------------------------------------------------------------------- -Advanced odML-Features -====================== - -View odML documents in a web browser ------------------------------------- - -By default all odML files are saved in the XML format without the capability to view -the plain files in a browser. By default you can use the command line tool ``odmlview`` -to view saved odML files locally. Since this requires the start of a local server, -there is another option to view odML XML files in a web browser. - -You can use an additional feature of the ``odml.tools.XMLWriter`` to save an odML -document with an embedded default stylesheet for local viewing: - - >>> import odml - >>> from odml.tools import XMLWriter - >>> doc = odml.Document() # minimal example document - >>> filename = "viewable_document.xml" - >>> XMLWriter(doc).write_file(filename, local_style=True) - -Now you can open the resulting file 'viewable_document.xml' in any current web-browser -and it will render the content of the odML file. - -If you want to use a custom style sheet to render an odML document instead of the default -one, you can provide it as a string to the XML writer. 
Please note, that it cannot be a -full XSL stylesheet, the outermost tag of the XSL code has to be -`` [your custom style here] ``: - - >>> import odml - >>> from odml.tools import XMLWriter - >>> doc = odml.Document() # minimal example document - >>> filename = "viewable_document.xml" - >>> own_template = """ [your custom style here] """ - >>> XMLWriter(doc).write_file(filename, custom_template=own_template) - -Please note that if the file is saved using the '.odml' extension and you are using -Chrome, you will need to map the '.odml' extension to the browsers Mime-type database as -'application/xml'. - -Also note that any style that is saved with an odML document will be lost, when this -document is loaded again and changes to the content are added. In this case the required -style needs to be specified again when saving the changed file as described above. - - -Defining and working with feature cardinality ---------------------------------------------- - -The odML format allows users to define a cardinality for -the number of subsections and properties of Sections and -the number of values a Property might have. - -A cardinality is checked when it is set, when its target is -set and when a document is saved or loaded. If a specific -cardinality is violated, a corresponding warning will be printed. - -Setting a cardinality -********************* - -A cardinality can be set for sections or properties of sections -or for values of properties. By default every cardinality is None, -but it can be set to a defined minimal and/or a maximal number of -an element. - -A cardinality is set via its convenience method: - - >>> # Set the cardinality of the properties of a Section 'sec' to - >>> # a maximum of 5 elements. - >>> sec = odml.Section(name="cardinality", type="test") - >>> sec.set_properties_cardinality(max_val=5) - - >>> # Set the cardinality of the subsections of Section 'sec' to - >>> # a minimum of one and a maximum of 2 elements. 
- >>> sec.set_sections_cardinality(min_val=1, max_val=2) - - >>> # Set the cardinality of the values of a Property 'prop' to - >>> # a minimum of 1 element. - >>> prop = odml.Property(name="cardinality") - >>> prop.set_values_cardinality(min_val=1) - - >>> # Re-set the cardinality of the values of a Property 'prop' to not set. - >>> prop.set_values_cardinality() - >>> # or - >>> prop.val_cardinality = None - -Please note that a set cardinality is not enforced. Users can set less or more entities than are specified allowed via a cardinality. Instead whenever a cardinality is not met, a warning message is displayed and any unment cardinality will show up as a Validation warning message whenever a document is saved or loaded. - -Working with Validations ------------------------- - -odML Validations are a set of pre-defined checks that are run against an odML document automatically when it is saved or loaded. A document cannot be saved, if a Validation fails a check that is classified as an Error. Most validation checks are Warnings that are supposed to raise the overall data quality of the odml Document. - -When an odML document is saved or loaded, tha automatic validation will print a short report of encountered Validation Warnings and it is up to the user whether they want to resolve the Warnings. The odML document provides the ``validate`` method to gain easy access to the default validations. A Validation in turn provides not only a specific description of all encountered warnings or errors within an odML document, but it also provides direct access to each and every odML entity i.e. an ``odml.Section`` or an ``odml.Property`` where an issue has been found. This enables the user to quickly access and fix an encountered issue. 
- -A minimal example shows how a workflow using default validations might look like: - - >>> # Create a minimal document with Section issues: name and type are not assigned - >>> doc = odml.Document() - >>> sec = odml.Section(parent=doc) - >>> odml.save(doc, "validation_example.odml.xml") - -This minimal example document will be saved, but will also print the following Validation report: - - >>> UserWarning: The saved Document contains unresolved issues. Run the Documents 'validate' method to access them. - >>> Validation found 0 errors and 2 warnings in 1 Sections and 0 Properties. - -To fix the encountered warnings, users can access the validation via the documents' ``validate`` method: - - >>> validation = doc.validate() - >>> for issue in validation.errors: - >>> print(issue) - -This will show that the validation has encountered two Warnings and also displays the offending odml entity. - - >>> ValidationWarning: Section[73f29acd-16ae-47af-afc7-371d57898e28] 'Section type not specified' - >>> ValidationWarning: Section[73f29acd-16ae-47af-afc7-371d57898e28] 'Name not assigned' - -To fix the "Name not assigned" warning the Section can be accessed via the validation entry and used to directly assign a human readable name to the Section in the original document. Re-running the validation will show, that the warning has been removed. - - >>> validation.errors[1].obj.name = "validation_example_section" - >>> # Check that the section name has been changed in the document - >>> print(doc.sections) - >>> # Re-running validation - >>> validation = doc.validate() - >>> for issue in validation.errors: - >>> print(issue) - -Similarly the second validation warning can be resolved before saving the document again. - -Please note that the automatic validation is run whenever a document is saved or loaded using the ``odml.save`` and ``odml.load`` functions as well as the ``ODMLWriter`` or the ``ODMLReader`` class. 
The validation is not run when using any of the lower level ``xmlparser``, ``dict_parser`` or ``rdf_converter`` classes. - -List of available default validations -************************************* - -The following contains a list of the default odml validations, their message and the suggested course of action to resolve the issue. - -| Validation: ``object_required_attributes`` -| Message: "Missing required attribute 'xyz'" -| Applies to: ``Document``, ``Section``, ``Property`` -| Course of action: Add an appropriate value to attribute 'xyz' for the reported odml entity. - -| Validation: ``section_type_must_be_defined`` -| Message: "Section type not specified" -| Applies to: ``Section`` -| Course of action: Fill in the ``type`` attribute of the reported Section. - -| Validation: ``section_unique_ids`` -| Message: "Duplicate id in Section 'secA' and 'secB'" -| Applies to: ``Section`` -| Course of action: IDs have to be unique and a duplicate id was found. Assign a new id for the reported Section. - -| Validation: ``property_unique_ids`` -| Message: "Duplicate id in Property 'propA' and 'propB'" -| Applies to: ``Property`` -| Course of action: IDs have to be unique and a duplicate id was found. Assign a new id for the reported Property - -| Validation: ``section_unique_name_type`` -| Message: "name/type combination must be unique" -| Applies to: ``Section`` -| Course of action: The combination of Section.name and Section.type has to be unique on the same level. Change either name or type of the reported Section. - -| Validation: ``object_unique_name`` -| Message: "Object names must be unique" -| Applies to: ``Document``, ``Section``, ``Property`` -| Course of action: Property name has to be unique on the same level. Change the name of the reported Property. 
- -| Validation: ``object_name_readable`` -| Message: "Name not assigned" -| Applies to: ``Section``, ``Property`` -| Course of action: When Section or Property names are left empty on creation or set to None, they are automatically assigned the entities uuid. Assign a human readable name to the reported entity. - -| Validation: ``property_terminology_check`` -| Message: "Property 'prop' not found in terminology" -| Applies to: ``Property`` -| Course of action: The reported entity is linked to a repository but the repository is not available. Check if the linked content has moved. - -| Validation: ``property_dependency_check`` -| Message: "Property refers to a non-existent dependency object" or "Dependency-value is not equal to value of the property's dependency" -| Applies to: ``Property`` -| Course of action: The reported entity depends on another Property, but this dependency has not been satisfied. Check the referenced Property and its value to resolve the issue. - -| Validation: ``property_values_check`` -| Message: "Tuple of length 'x' not consistent with dtype 'dtype'!" or "Property values not of consistent dtype!". -| Applies to: ``Property`` -| Course of action: Adjust the values or the dtype of the referenced Propery. - -| Validation: ``property_values_string_check`` -| Message: "Dtype of property "prop" currently is "string", but might fit dtype "dtype"!" -| Applies to: ``Property`` -| Course of action: Check if the datatype of the referenced Property.values has been loaded correctly and change the Property.dtype if required. - -| Validation: ``section_properties_cardinality`` -| Message: "cardinality violated x values, y found)" -| Applies to: ``Section`` -| Course of action: A cardinality defined for the number of Properties of a Section does not match. Add or remove Properties until the cardinality has been satisfied or adjust the cardinality. 
- -| Validation: ``section_sections_cardinality`` -| Message: "cardinality violated x values, y found)" -| Applies to: ``Section`` -| Course of action: A cardinality defined for the number of Sections of a Section does not match. Add or remove Sections until the cardinality has been satisfied or adjust the cardinality. - -| Validation: ``property_values_cardinality`` -| Message: "cardinality violated x values, y found)" -| Applies to: ``Property`` -| Course of action: A cardinality defined for the number of Values of a Property does not match. Add or remove Values until the cardinality has been satisfied or adjust the cardinality. - -| Validation: ``section_repository_present`` -| Message: "A section should have an associated repository" or "Could not load terminology" or "Section type not found in terminology" -| Applies to: ``Section`` -| Course of action: Optional validation. Will report any section that does not specify a repository. Add a repository to the reported Section to resolve. - -Custom validations -****************** - -Users can write their own validation and register them either with the default validation or add it to their own validation class instance. - -A custom validation handler needs to ``yield`` a ``ValidationError``. See the ``validation.ValidationError`` class for details. - -Custom validation handlers can be registered to be applied on "odML" (the odml Document), "section" or "property". 
- - >>> import odml - >>> import odml.validation as oval - >>> - >>> # Create an example document - >>> doc = odml.Document() - >>> sec_valid = odml.Section(name="Recording-20200505", parent=doc) - >>> sec_invalid = odml.Section(name="Movie-20200505", parent=doc) - >>> subsec = odml.Section(name="Sub-Movie-20200505", parent=sec_valid) - >>> - >>> # Define a validation handler that yields a ValidationError if a section name does not start with 'Recording-' - >>> def custom_validation_handler(obj): - >>> validation_id = oval.IssueID.custom_validation - >>> msg = "Section name does not start with 'Recording-'" - >>> if not obj.name.startswith("Recording-"): - >>> yield oval.ValidationError(obj, msg, oval.LABEL_ERROR, validation_id) - >>> - >>> # Create a custom, empty validation with an odML document 'doc' - >>> custom_validation = oval.Validation(doc, reset=True) - >>> # Register a custom validation handler that should be applied on all Sections of a Document - >>> custom_validation.register_custom_handler("section", custom_validation_handler) - >>> # Run the custom validation and return a report - >>> custom_validation.report() - >>> # Display the errors reported by the validation - >>> print(custom_validation.errors) - Advanced Value features -----------------------