Add the text content for the subsection on ICAT data XML files
RKrahl committed Jan 3, 2024
1 parent af0ee8d commit 4ad8e9e
Showing 1 changed file with 72 additions and 1 deletion.
73 changes: 72 additions & 1 deletion doc/src/file-icatdata.rst
@@ -70,13 +70,83 @@ ICAT data XML files
~~~~~~~~~~~~~~~~~~~

In this section we describe the ICAT data file format using the XML
backend. Consider the following example:

.. literalinclude:: ../examples/icatdump-simple-1.xml
   :language: xml

The root element of ICAT data XML files is ``icatdata``. It may
optionally have one ``head`` subelement and one or more ``data``
subelements.
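
The top-level structure described above can be checked with a few
lines of Python. The following is only a minimal sketch using the
standard library, not part of python-icat. The sample document is a
hypothetical, heavily abbreviated stand-in for a real ICAT data file,
and the ``generator`` subelement of ``head`` is an assumption made for
illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily abbreviated ICAT data XML document.
SAMPLE = """
<icatdata>
  <head>
    <generator>illustrative example</generator>
  </head>
  <data>
    <user id="User_1"><name>jdoe</name></user>
  </data>
  <data>
    <user id="User_2"><name>asmith</name></user>
  </data>
</icatdata>
"""


def summarize(xml_text):
    """Return (has_head, number_of_chunks) for an ICAT data XML string."""
    root = ET.fromstring(xml_text)
    if root.tag != "icatdata":
        raise ValueError("root element must be 'icatdata', got %r" % root.tag)
    # The head element is optional and purely informational.
    has_head = root.find("head") is not None
    # Each data element holds one chunk.
    chunks = root.findall("data")
    return has_head, len(chunks)


print(summarize(SAMPLE))
```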

The ``head`` element will be ignored by :ref:`icatingest`. It serves
to provide some information on the context of the creation of the data
file, which may be useful for debugging in case of issues.

The content of each ``data`` element is one chunk according to the
logical structure explained above. The present example contains two
chunks. Each element within the ``data`` element corresponds to an
ICAT object according to the ICAT schema. In the present example, the
first chunk contains five User objects and three Grouping objects.
The second chunk only contains one Investigation.

These object elements should have an ``id`` attribute that may be used
to reference the object in relations later on. The ``id`` value has
no meaning beyond this file-internal referencing between objects.
The subelements of the object elements correspond to the object's
attributes and relations in the ICAT schema. All many-to-one
relations must be provided and must reference already existing
objects, i.e. the related objects must either have existed before the
ingestion started or appear earlier in the ICAT data file than the
referencing object, so that they will be created first. The related
object may be referenced either by id, using the special attribute
``ref``, or by the related object's attribute values, using XML
attributes of the same name. In the latter case, the attribute values
must uniquely identify the related object.
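
To illustrate the two ways of referencing a related object, the
following sketch walks over one chunk in document order and classifies
each many-to-one reference. It is a simplified model for illustration
only, not the python-icat implementation. The element and attribute
values in the sample chunk are hypothetical, and the sketch only looks
at direct subelements of each object. A ``ref`` value that is not found
among the ids seen so far is reported separately, since the related
object would then have to exist already before the ingestion.

```python
import xml.etree.ElementTree as ET

# Hypothetical chunk: one reference by ``ref``, one by attribute values.
CHUNK = """
<data>
  <facility id="Facility_1"><name>ESNF</name></facility>
  <instrument id="Instrument_1">
    <name>E2</name>
    <facility ref="Facility_1"/>
  </instrument>
  <investigationType id="InvestigationType_1">
    <name>Experiment</name>
    <facility name="ESNF"/>
  </investigationType>
</data>
"""


def classify_references(chunk_xml):
    """Return (by_ref, by_attrs, not_in_file) lists for one chunk."""
    seen_ids = set()      # file-internal ids, collected in document order
    by_ref, by_attrs, not_in_file = [], [], []
    for obj in ET.fromstring(chunk_xml):
        for sub in obj:
            if "ref" in sub.attrib:
                entry = (obj.tag, sub.tag, sub.get("ref"))
                # A ref must point to an object defined earlier in the file
                # or to an object that already exists outside of the file.
                (by_ref if sub.get("ref") in seen_ids else not_in_file).append(entry)
            elif sub.attrib:
                # Reference by the related object's attribute values.
                by_attrs.append((obj.tag, sub.tag, dict(sub.attrib)))
        if "id" in obj.attrib:
            seen_ids.add(obj.get("id"))
    return by_ref, by_attrs, not_in_file


by_ref, by_attrs, not_in_file = classify_references(CHUNK)
print(by_ref)
print(by_attrs)
print(not_in_file)
```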

The object elements may include one-to-many relations. In this case,
the related objects will be created along with the parent in one
single cascading call. Alternatively, these related objects may be
added separately as subelements of the ``data`` element later in the
file. In the present example, the Grouping objects include their
related UserGroup objects. Note that these UserGroups include their
relation to the User: each User object is referenced by its id in the
``ref`` attribute. But the UserGroups do not include their relation
to the Grouping. That relation is implied by the position of the
UserGroup element within its parent Grouping element in the file.
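
The implied parent relation can be made explicit with a short sketch.
The sample chunk below is hypothetical and abbreviated. The element
names ``userGroups``, ``role``, and ``user`` are assumed here to follow
the relation and attribute names of the ICAT schema. The code is only
an illustration of the nesting, not how python-icat processes the file.

```python
import xml.etree.ElementTree as ET

# Hypothetical chunk: a Grouping that includes its UserGroups.
CHUNK = """
<data>
  <user id="User_a"><name>db/ahau</name></user>
  <user id="User_b"><name>db/jbotu</name></user>
  <grouping id="Grouping_1">
    <name>investigation_owner</name>
    <userGroups>
      <role>member</role>
      <user ref="User_a"/>
    </userGroups>
    <userGroups>
      <role>member</role>
      <user ref="User_b"/>
    </userGroups>
  </grouping>
</data>
"""


def memberships(chunk_xml):
    """Return (grouping name, user name) pairs from nested UserGroups."""
    data = ET.fromstring(chunk_xml)
    # Map the file-internal ids of the User objects to their names.
    names = {u.get("id"): u.findtext("name") for u in data.findall("user")}
    pairs = []
    for grouping in data.findall("grouping"):
        for user_group in grouping.findall("userGroups"):
            # The Grouping is not referenced inside the UserGroup element;
            # it is taken from the enclosing parent element.
            ref = user_group.find("user").get("ref")
            pairs.append((grouping.findtext("name"), names[ref]))
    return pairs


print(memberships(CHUNK))
```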

In a similar way, the Investigation in the second chunk includes
related InvestigationGroups that will be created along with the
Investigation. The InvestigationGroup objects include a reference to
the corresponding Grouping. Note that these references go across
chunk boundaries. The index that caches the object ids to resolve
object relations did contain the ids of the Groupings while the first
chunk was processed, but it will already have been discarded from
memory when the second chunk is read. The references still work
because they use the key that can be passed to
:meth:`icat.client.Client.searchUniqueKey` to search these Groupings
from ICAT.
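
The interplay between the per-chunk index and the lookup by key can be
modelled with a toy example. This is a deliberately simplified sketch
under stated assumptions: a plain dictionary stands in for the ICAT
server, the keys are made-up strings, and the function ``ingest`` is
hypothetical. It does not call or reproduce
:meth:`icat.client.Client.searchUniqueKey`; it only shows why a
reference across a chunk boundary cannot be served from the index.

```python
def ingest(chunks):
    """Toy model: process chunks of (key, ref) pairs, log how refs resolve."""
    server = {}   # stand-in for objects stored in ICAT, by unique key
    log = []
    for chunk in chunks:
        index = {}   # per-chunk cache of object ids, discarded afterwards
        for key, ref in chunk:
            if ref is not None:
                if ref in index:
                    source, target = "index", index[ref]
                else:
                    # Not cached in this chunk: fall back to a lookup by key.
                    source, target = "server", server[ref]
                log.append((key, source, target))
            obj_id = len(server) + 1   # pretend the server assigns an id
            server[key] = obj_id
            index[key] = obj_id
    return log


CHUNKS = [
    [("Grouping_owner", None), ("UserGroup_1", "Grouping_owner")],
    [("Investigation_1", "Grouping_owner")],
]
print(ingest(CHUNKS))
```

In this model the reference from within the first chunk is resolved
from the index, while the reference from the second chunk has to be
resolved by a lookup, which is the role the key plays in the real file
format.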

Finally, note that the file format also depends on the ICAT schema
version: the present example can only be ingested into ICAT server 5.0
or newer, because the attributes fileCount and fileSize have been
added to Investigation in this version. With older ICAT versions, the
ingestion will fail because these attributes are not defined.

Consider a second example that defines a subset of the same content
as the previous example:

.. literalinclude:: ../examples/icatdump-simple-2.xml
   :language: xml
   :lines: 1-9,28-52,56-58,70-82,108

The difference is that we now add the UserGroup objects separately as
direct subelements of ``data`` instead of including them in the
related Grouping objects.

You will find more extensive examples in the source distribution of
python-icat. The distribution also provides XML Schema Definition
files for the ICAT data XML file format corresponding to various ICAT
schema versions.

ICAT data YAML files
~~~~~~~~~~~~~~~~~~~~
@@ -89,6 +159,7 @@ backend.

.. literalinclude:: ../examples/icatdump-simple-2.yaml
   :language: yaml
   :lines: 1-7,10-11,14,23-45,52-60


.. [#dc] There is one exception: DataCollections don't have a