
XML 2.0 specification

Yann Cointepas edited this page Jan 29, 2016 · 3 revisions



Processes

The XML process specification makes it possible to use a standard Python function and to associate it with an XML string that enables the creation of a Process instance. This XML string will define the type and behaviour of function parameters and return value(s).

To create a Process instance for a function, Capsul needs some information about each parameter of the function and about its return value. This information is defined in the XML string, except for the default values of the parameters, which are extracted from the function definition.
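As a rough illustration of how default values can be read from a function definition, the sketch below uses the standard inspect module. It is not Capsul's implementation: parameter_defaults is a hypothetical helper, and the threshold stub only mirrors the signature used in the examples further down this page.

```python
import inspect


def parameter_defaults(function):
    """Return {parameter name: default value} for parameters that declare one."""
    signature = inspect.signature(function)
    return {
        name: parameter.default
        for name, parameter in signature.parameters.items()
        if parameter.default is not inspect.Parameter.empty
    }


# Stub with the same signature as the threshold example on this page.
def threshold(input_image, method='gt', threshold=0, output_location=None):
    pass


defaults = parameter_defaults(threshold)
print(defaults)
```

Parameters without a default (here input_image) are simply absent from the result, which is one way to tell mandatory parameters from optional ones.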

The process XML string contains a single <process> element, which may carry some global properties for the process. <process> may have the following attributes:

  • capsul_xml (optional): version of the Capsul XML specification this process definition is compatible with. If omitted, the process definition is assumed to be compatible with the latest Capsul XML specification available.
  • role (optional): A role that is attached to the process. See "Process roles" below.

In the <process> element, there is one <input> element per parameter of the function. For a parameter that is a file name, if the file is used for output, the tag to use is <output>. Using <output> clearly indicates that Capsul will create an output parameter that can naturally be connected to any input parameter in a pipeline. <input> (or <output>) has the following attributes:

  • name: the name of the function parameter
  • type: the type of the parameter. See possible parameter types below.
  • doc (optional): the documentation of the parameter

The value returned by the function can be described in a <return> element. If <return> is not defined, the value returned by the Python function is ignored and cannot be used in pipelines. The function can return either one or several values. If there is only one return value, the content and attributes of the <return> element are the same as those of the <input> element described above. For instance:

from capsul.process.xml import xml_process

@xml_process('''
<process capsul_xml="2.0">
    <input name="a" type="int" doc="An integer"/>
    <input name="b" type="int" doc="Another integer"/>
    <return name="addition" type="int" doc="a + b"/>
</process>
''')
def add(a, b):
    return a + b
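
To make the structure of such an XML string concrete, here is a minimal sketch that reads it with the standard xml.etree.ElementTree module. It only illustrates the layout described above; describe_process is a hypothetical helper and says nothing about how xml_process works internally.

```python
import xml.etree.ElementTree as ET

PROCESS_XML = '''
<process capsul_xml="2.0">
    <input name="a" type="int" doc="An integer"/>
    <input name="b" type="int" doc="Another integer"/>
    <return name="addition" type="int" doc="a + b"/>
</process>
'''


def describe_process(xml_string):
    """Summarize the parameters declared in a <process> XML string."""
    root = ET.fromstring(xml_string)
    if root.tag != 'process':
        raise ValueError('expected a <process> root element')
    returned = root.find('return')
    return {
        'version': root.get('capsul_xml'),
        'inputs': [(e.get('name'), e.get('type')) for e in root.findall('input')],
        'outputs': [(e.get('name'), e.get('type')) for e in root.findall('output')],
        # None when <return> is absent: the returned value is then ignored.
        'return': None if returned is None
                  else (returned.get('name'), returned.get('type')),
    }


info = describe_process(PROCESS_XML)
print(info)
```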

If there are several output values, each one is described in an <output> element contained in <return>. The function must return either a list or a dictionary. If it is a list, the order of the <output> elements is used to match the values in the list with the parameter names. If it is a dictionary, each key must correspond to a name attribute in an <output> element. For instance:

from capsul.process.xml import xml_process

@xml_process('''
<process capsul_xml="2.0">
    <input name="a" type="int" doc="An integer"/>
    <input name="b" type="int" doc="Another integer"/>
    <return>
        <output name="quotient" type="int" doc="Quotient of a / b"/>
        <output name="remainder" type="int" doc="Remainder of a / b"/>
    </return>
</process>
''')
def divide(a, b):
    # Floor division keeps the quotient an int, as declared in the XML.
    return {
        'quotient': a // b,
        'remainder': a % b,
    }
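
The matching rule for several return values can be sketched as follows. This is an illustration of the rule stated above, not Capsul code: name_return_values is a hypothetical helper, and raising ValueError on a mismatch is an assumption about error handling.

```python
import xml.etree.ElementTree as ET

DIVIDE_XML = '''
<process capsul_xml="2.0">
    <input name="a" type="int"/>
    <input name="b" type="int"/>
    <return>
        <output name="quotient" type="int"/>
        <output name="remainder" type="int"/>
    </return>
</process>
'''


def name_return_values(xml_string, result):
    """Associate returned values with the <output> names found in <return>."""
    return_element = ET.fromstring(xml_string).find('return')
    names = [output.get('name') for output in return_element.findall('output')]
    if isinstance(result, dict):
        # Dictionary: every key must be a declared output name.
        unknown = sorted(set(result) - set(names))
        if unknown:
            raise ValueError('unknown output name(s): %s' % ', '.join(unknown))
        return dict(result)
    # List: values are matched to names by the order of the <output> elements.
    values = list(result)
    if len(values) != len(names):
        raise ValueError('expected %d values, got %d' % (len(names), len(values)))
    return dict(zip(names, values))


from_list = name_return_values(DIVIDE_XML, [3, 1])
from_dict = name_return_values(DIVIDE_XML, {'quotient': 3, 'remainder': 1})
print(from_list, from_dict)
```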

Parameter types

For <input> and <output> elements, the type attribute can have the following values:

  • int
  • float
  • string
  • unicode
  • file
  • directory
  • enum: when this type is used, there must be a values attribute that contains a Python literal representing a list of possible values for the parameter.
  • list_int
  • list_float
  • list_string
  • list_unicode
  • list_file
  • list_directory

When a parameter accepts multiple types, they must be separated by a | character. For instance, a parameter accepting either a file or a list of files would use type="file|list_file".
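The type syntax above can be decoded with a few lines of Python. The sketch below is only an illustration: parse_type and the tuples it returns are hypothetical, and the values attribute is read with ast.literal_eval because it is described as a Python literal.

```python
import ast

# Scalar type names listed above; each also exists with a "list_" prefix.
BASE_TYPES = {'int', 'float', 'string', 'unicode', 'file', 'directory'}


def parse_type(type_attribute, values_attribute=None):
    """Split a type attribute on "|" and decode each alternative."""
    alternatives = []
    for item in type_attribute.split('|'):
        item = item.strip()
        if item == 'enum':
            if values_attribute is None:
                raise ValueError('enum requires a values attribute')
            alternatives.append(('enum', ast.literal_eval(values_attribute)))
        elif item.startswith('list_') and item[len('list_'):] in BASE_TYPES:
            alternatives.append(('list', item[len('list_'):]))
        elif item in BASE_TYPES:
            alternatives.append((item, None))
        else:
            raise ValueError('unknown parameter type: %r' % item)
    return alternatives


file_or_files = parse_type('file|list_file')
method = parse_type('enum', "['gt', 'ge', 'lt', 'le']")
print(file_or_files, method)
```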

Process roles

The role of a process gives information about the expected execution context. It can be used to decide whether a process should be executed in a given context or not. The role can also be used to propose a specific GUI for the process. For instance, the role "viewer" indicates that the execution of the process will display something to the user. There is no need to execute such a process on a remote computer that is disconnected from the user environment.

The possible process roles are:

  • viewer: the process is used to display something to the user. It cannot be executed outside the user graphical environment. A viewer is not supposed to be blocking. It should terminate immediately and let the view live independently of the rest of the process. If blocking is required, use the dialog role.
  • dialog: a dialog is used to show something to the user and waits for a user action before ending its execution. Like a viewer, it cannot be executed outside the user graphical environment. The expected user action can be as simple as clicking on a single "ok" button; in that case, the process should have no output. But it can also be a complete form whose result must be returned via the process output parameter(s).
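
A scheduler could use the role along the following lines. This is a minimal sketch under the assumption that the execution context is summarized by a single boolean; should_execute and waits_for_user are hypothetical names, not part of Capsul.

```python
# Roles that require the user's graphical environment, as described above.
GRAPHICAL_ROLES = {'viewer', 'dialog'}


def should_execute(role, has_user_display):
    """Decide whether a process with this role can run in the given context."""
    if role in GRAPHICAL_ROLES:
        return has_user_display
    # No role (None or empty string): nothing restricts the execution context.
    return True


def waits_for_user(role):
    """Only a dialog blocks until the user acts; a viewer returns immediately."""
    return role == 'dialog'


print(should_execute('viewer', has_user_display=False))
print(waits_for_user('dialog'))
```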

Association between a Python function and an XML string

There are two ways to associate a function with its XML. The recommended method is to use a decorator to explicitly define the XML string associated with the function. Here is an example:

from capsul.process.xml import xml_process

@xml_process('''
<process capsul_xml="2.0">
    <input name="input_image" type="file" doc="Path of a NIFTI-1 image file."/>
    <input name="method" type="enum" values="['gt', 'ge', 'lt', 'le']" doc="Method for thresholding."/>
    <input name="threshold" type="float" doc="Threshold value."/>
    <input name="output_location" type="file"
     doc="If set, defines the output file name. Otherwise, the name is generated using a 'threshold_' prefix on the input file name."/>
    <return name="output_image" type="file" doc="Name of the output image."/>
</process>
''')
def threshold(input_image, method='gt', threshold=0, output_location=None):
    pass

It is also possible to put the XML in the docstring of the function. However, this method is not recommended and should be avoided if possible. Example:

def threshold(input_image, method='gt', threshold=0, output_location=None):
    '''
    <process capsul_xml="2.0">
        <input name="input_image" type="file" doc="Path of a NIFTI-1 image file."/>
        <input name="method" type="enum" values="['gt', 'ge', 'lt', 'le']" doc="Method for thresholding."/>
        <input name="threshold" type="float" doc="Threshold value."/>
        <input name="output_location" type="file"
          doc="If set, defines the output file name. Otherwise, the name is generated using a 'threshold_' prefix on the input file name."/>
        <return name="output_image" type="file" doc="Name of the output image."/>
    </process>
    '''
    pass
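
For illustration, XML stored in a docstring can be recovered with inspect.getdoc, which removes the common indentation before parsing. The sketch below is an assumption about one possible approach, not a description of Capsul's loader; xml_from_docstring and the scale function are hypothetical.

```python
import inspect
import xml.etree.ElementTree as ET


def xml_from_docstring(function):
    """Parse the <process> element stored in a function docstring."""
    doc = inspect.getdoc(function)  # dedented, without surrounding blank lines
    if not doc:
        raise ValueError('function has no docstring')
    root = ET.fromstring(doc)
    if root.tag != 'process':
        raise ValueError('docstring does not contain a <process> element')
    return root


def scale(value, factor=2.0):
    '''
    <process capsul_xml="2.0">
        <input name="value" type="float" doc="Value to scale."/>
        <input name="factor" type="float" doc="Scaling factor."/>
        <return name="scaled" type="float" doc="value * factor"/>
    </process>
    '''
    return value * factor


root = xml_from_docstring(scale)
input_names = [element.get('name') for element in root.findall('input')]
print(input_names)
```

A docstring used this way can no longer hold ordinary documentation, which is one practical reason to prefer the decorator.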

Process examples

from capsul.process.xml import xml_process

@xml_process('''
<process capsul_xml="2.0">
    <input name="input_image" type="file" doc="Path of a NIFTI-1 image file."/>
    <input name="method" type="enum" values="['gt', 'ge', 'lt', 'le']"
     doc="Method for thresholding."/>
    <input name="threshold" type="float" doc="Threshold value."/>
    <output name="output_image" type="file" doc="Output file name."/>
    <return name="output_image" type="file" doc="Name of the output image."/>
</process>
''')
def threshold(input_image, output_image, method='gt', threshold=0):
    pass

@xml_process('''
<process capsul_xml="2.0">
    <input name="input_image" type="file" doc="Path of a NIFTI-1 image file."/>
    <input name="mask" type="file" doc="Path of mask binary image."/>
    <output name="output_image" type="file" doc="Output file name."/>
</process>
''')
def mask(input_image, mask, output_image=None):
    pass

Pipelines

An XML pipeline is an XML document containing a single <pipeline> element that may contain some global properties for the pipeline. Since a pipeline is also a process, the <pipeline> element may have the same attributes as the <process> element (see above).

An XML pipeline contains a series of processes that are defined by <process> elements. The inputs and outputs of processes are connected by links that are defined in <link> elements. A pipeline may allow a user to select one group of processes among a series of process groups. The processes that are not selected are disabled (they will not be executed) whereas the selected processes are enabled. The <processes_selection> element is used to define a set of selectable process groups.

The <doc> element

This element has no attributes and contains the documentation of the process in a Sphinx-compatible format.

The <process> element

A <process> element adds a new process instance to the pipeline. This instance is given a name that other XML elements can use to reference it. The process instance references a module, which is the function called when the instance is run. The <process> element can have the following attributes:

  • name: a string that can be used to reference the process instance. This must be a valid Python variable name. It should use the variable naming convention of Python's PEP 8.
  • module: a valid Capsul process identifier. This is typically a fully qualified Python object name (i.e. one containing the absolute dotted path of the Python module), but any string value accepted by capsul.loadre.get_process_instance() can be used.
  • role (optional): sets the role of the process instance (see "Process roles" above). If a role has been defined on the process module, it is ignored and replaced by the one declared in the pipeline. An empty string can be used to force the process instance in the pipeline to have no role.
  • iteration (optional): when this attribute is used, the process instance will be an iteration process. The iteration attribute contains a comma-separated list of parameter names (for instance "input1,input2,output1"). This list indicates the process parameters on which the iteration will be performed. For each of these parameters, the actual type of the process instance parameter is replaced by a list whose elements must have the process parameter type.
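
The effect of the iteration attribute can be pictured with the sketch below. It rests on assumptions that this page does not state: the iterated lists are walked in lockstep (element i of each list is used for call i), they must have equal lengths, and the results are collected in a list. run_iteration is a hypothetical helper, not Capsul's iteration process.

```python
def run_iteration(function, iteration, **parameters):
    """Call function once per element of the iterated parameters."""
    names = [name.strip() for name in iteration.split(',') if name.strip()]
    if not names:
        raise ValueError('iteration must name at least one parameter')
    lengths = {len(parameters[name]) for name in names}
    if len(lengths) != 1:
        raise ValueError('iterated parameters must be lists of equal length')
    results = []
    for index in range(lengths.pop()):
        # Non-iterated parameters keep the same value for every call.
        call_parameters = dict(parameters)
        for name in names:
            call_parameters[name] = parameters[name][index]
        results.append(function(**call_parameters))
    return results


def add(a, b):
    return a + b


# Only "a" is iterated, so it becomes a list of int while "b" stays an int.
sums = run_iteration(add, 'a', a=[1, 2, 3], b=10)
print(sums)
```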

The <process> element can contain the following elements:

<set>

The <set> element is used to assign a fixed value to a parameter. It has only two attributes:

  • name: the name of the parameter
  • value: the value of the parameter expressed as a Python literal. Using the Python literal format makes it possible to represent structured values such as lists. Some examples of values:
    • integer: <set name="x" value="42"/>
    • float: <set name="x" value="4.2"/>
    • string: <set name="x" value="'a value'"/>
    • None (i.e. JSON null): <set name="x" value="None"/>
    • list: <set name="x" value="['one', 'two', 'three']"/>

When a value is set on a parameter, it becomes an optional parameter.
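Because the value attribute holds a Python literal, it can be decoded safely with ast.literal_eval from the standard library. The sketch below only illustrates that idea; fixed_values is a hypothetical helper and the module name in the XML is a placeholder.

```python
import ast
import xml.etree.ElementTree as ET

# "some.module.function" is a placeholder, not a real Capsul process.
PROCESS_ELEMENT = '''
<process name="example" module="some.module.function">
    <set name="count" value="42"/>
    <set name="ratio" value="4.2"/>
    <set name="label" value="'a value'"/>
    <set name="nothing" value="None"/>
    <set name="items" value="['one', 'two', 'three']"/>
</process>
'''


def fixed_values(process_xml):
    """Return {parameter name: decoded value} for every <set> child."""
    root = ET.fromstring(process_xml)
    return {
        element.get('name'): ast.literal_eval(element.get('value'))
        for element in root.findall('set')
    }


values = fixed_values(PROCESS_ELEMENT)
print(values)
```

Unlike eval, ast.literal_eval accepts only literals (numbers, strings, None, lists, dictionaries and the like), so a value attribute cannot run arbitrary code.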

<nipype>

Capsul can use Nipype interfaces as process modules. These interfaces use traits types that have some parameters that need to be set in some contexts. The Nipype-specific <nipype> element contains a name attribute to identify a process parameter. For more information about these parameters, see the Nipype interface specification. The following attributes can be used to customize Nipype traits:

  • usedefault: can be set to "true" or "false". Omitting the attribute is equivalent to "false".
  • copyfile: can be set to "true" or "false". Omitting the attribute is equivalent to "false". If the special value "discard" is used, the Nipype interface copyfile parameter will be set to True but the copied file will be deleted when the process terminates. This prevents software (such as SPM) from modifying the input image: only the original image is kept at the end of the execution, and the modified copy is deleted.

The <link> element

This element adds a link between an input parameter of a process and an output parameter of another process. It can also be used to "export" a process parameter. Exporting a process parameter means making it visible in the parameters of the pipeline. Unlike the default Pipeline behaviour in Capsul's API, a pipeline defined in Capsul XML 2.0 does not automatically export the unconnected parameters of its processes. The <link> element contains no child elements and must have exactly two attributes:

  • source: the parameter where the link starts.
  • dest: the parameter where the link ends.

The value of these attributes can be either a single identifier (e.g. "parameter_name") or two identifiers separated by a dot (e.g. "process_name.parameter_name"). A single identifier corresponds to a pipeline parameter, whereas two identifiers designate a process parameter: they must correspond to the name of a process and the name of one parameter of this process.
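The two accepted forms of a source or dest value can be told apart with a short function. This is an illustrative sketch; parse_endpoint and is_export are hypothetical names, and rejecting anything that is not a Python identifier is an assumption based on the naming rule given for process names.

```python
def parse_endpoint(value):
    """Return (process_name, parameter_name); process_name is None for a
    pipeline parameter."""
    parts = value.split('.')
    if not all(part.isidentifier() for part in parts):
        raise ValueError('invalid identifier in %r' % value)
    if len(parts) == 1:
        return (None, parts[0])
    if len(parts) == 2:
        return (parts[0], parts[1])
    raise ValueError('expected at most one dot in %r' % value)


def is_export(source, dest):
    """A link exports a process parameter when one end is a pipeline parameter."""
    return parse_endpoint(source)[0] is None or parse_endpoint(dest)[0] is None


print(parse_endpoint('input_image'))
print(parse_endpoint('mask_1.mask'))
print(is_export('input_image', 'threshold_gt_1.input_image'))
```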

The <processes_selection> element

The <processes_selection> element defines a series of process groups. Each process group is composed of a series of processes added to the pipeline with the <process> element. Only one of these process groups can be executed in the pipeline. Therefore, a new parameter is added to the pipeline that allows the user to select the group to execute. All processes in the selected group are activated (i.e. will be executed) whereas all processes in other groups are disabled (i.e. will not be executed).

The <processes_selection> element has a single name attribute, which is the name of the parameter added to the pipeline. It must contain two or more <processes_group> elements. Each <processes_group> contains one or more <process> elements having only a single name attribute. This attribute is the name of a process defined in the pipeline (see "The <process> element" above).
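The selection mechanism can be sketched as follows, using the element names that appear in the pipeline example at the end of this page. read_selection and enabled_state are hypothetical helpers that only mirror the rule described above.

```python
import xml.etree.ElementTree as ET

SELECTION_XML = '''
<processes_selection name="select_method">
    <processes_group name="greater than">
        <process name="threshold_gt_1"/>
        <process name="threshold_gt_10"/>
    </processes_group>
    <processes_group name="lower than">
        <process name="threshold_lt_1"/>
        <process name="threshold_lt_10"/>
    </processes_group>
</processes_selection>
'''


def read_selection(xml_string):
    """Return (pipeline parameter name, {group name: [process names]})."""
    root = ET.fromstring(xml_string)
    groups = {
        group.get('name'): [p.get('name') for p in group.findall('process')]
        for group in root.findall('processes_group')
    }
    if len(groups) < 2:
        raise ValueError('a selection needs two or more groups')
    return root.get('name'), groups


def enabled_state(groups, selected):
    """Map each process name to True if its group is the selected one."""
    if selected not in groups:
        raise ValueError('unknown group: %r' % selected)
    return {
        process: group == selected
        for group, processes in groups.items()
        for process in processes
    }


parameter, groups = read_selection(SELECTION_XML)
state = enabled_state(groups, 'lower than')
print(parameter, state)
```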

The <gui> element

The <gui> element defines the position of nodes for a graphical representation. The position of a node is given by a <position> element that has three attributes:

  • name: The name of the process (as given in the process element).
  • x: The x coordinate of the process.
  • y: The y coordinate of the process.

A single global zoom level can be given to the GUI with a <zoom> element that has a single level attribute whose value is a floating-point number.
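Reading these layout hints takes only a few lines. In the sketch below, read_gui is a hypothetical helper, and falling back to a zoom level of 1.0 when <zoom> is absent is an assumption, not something this specification states.

```python
import xml.etree.ElementTree as ET

GUI_XML = '''
<gui>
    <position name="inputs" x="50.0" y="50.0"/>
    <position name="mask_1" x="815.0" y="153.0"/>
    <zoom level="1.0"/>
</gui>
'''


def read_gui(xml_string):
    """Return ({node name: (x, y)}, zoom level) from a <gui> element."""
    root = ET.fromstring(xml_string)
    positions = {
        position.get('name'): (float(position.get('x')), float(position.get('y')))
        for position in root.findall('position')
    }
    zoom = root.find('zoom')
    # Assumed default: no zoom element means a zoom level of 1.0.
    level = 1.0 if zoom is None else float(zoom.get('level'))
    return positions, level


positions, level = read_gui(GUI_XML)
print(positions, level)
```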

Pipeline example

<pipeline capsul_xml="2.0">
    <process name="threshold_gt_1" 
     module="capsul.process.test.test_load_from_description.threshold">
        <set name="threshold" value="1"/>
        <set name="method" value="'gt'"/>
    </process>
    <process name="threshold_gt_10" 
     module="capsul.process.test.test_load_from_description.threshold">
        <set name="threshold" value="10"/>
        <set name="method" value="'gt'"/>
    </process>
    <process name="threshold_gt_100" 
     module="capsul.process.test.test_load_from_description.threshold">
        <set name="threshold" value="100"/>
        <set name="method" value="'gt'"/>
    </process>
    <process name="threshold_lt_1" 
     module="capsul.process.test.test_load_from_description.threshold">
        <set name="threshold" value="1"/>
        <set name="method" value="'lt'"/>
    </process>
    <process name="threshold_lt_10" 
     module="capsul.process.test.test_load_from_description.threshold">
        <set name="threshold" value="10"/>
        <set name="method" value="'lt'"/>
    </process>
    <process name="threshold_lt_100" 
     module="capsul.process.test.test_load_from_description.threshold">
        <set name="threshold" value="100"/>
        <set name="method" value="'lt'"/>
    </process>
    <process name="mask_1" 
     module="capsul.process.test.test_load_from_description.mask">
    </process>
    <process name="mask_10" 
     module="capsul.process.test.test_load_from_description.mask">
    </process>
    <process name="mask_100" 
     module="capsul.process.test.test_load_from_description.mask">
    </process>

    <link source="input_image" dest="threshold_gt_1.input_image"/>
    <link source="input_image" dest="threshold_gt_10.input_image"/>
    <link source="input_image" dest="threshold_gt_100.input_image"/>
    
    <link source="input_image" dest="threshold_lt_1.input_image"/>
    <link source="input_image" dest="threshold_lt_10.input_image"/>
    <link source="input_image" dest="threshold_lt_100.input_image"/>

    <link source="input_image" dest="mask_1.input_image"/>
    <link source="input_image" dest="mask_10.input_image"/>
    <link source="input_image" dest="mask_100.input_image"/>

    <link source="threshold_gt_1.output_image" dest="mask_1.mask"/>
    <link source="threshold_gt_10.output_image" dest="mask_10.mask"/>
    <link source="threshold_gt_100.output_image" dest="mask_100.mask"/>
    <link source="threshold_lt_1.output_image" dest="mask_1.mask"/>
    <link source="threshold_lt_10.output_image" dest="mask_10.mask"/>
    <link source="threshold_lt_100.output_image" dest="mask_100.mask"/>

    <link source="mask_1.output_image" dest="output_1"/>
    <link source="mask_10.output_image" dest="output_10"/>
    <link source="mask_100.output_image" dest="output_100"/>

    <processes_selection name="select_method">
        <processes_group name="greater than">
            <process name="threshold_gt_1"/>
            <process name="threshold_gt_10"/>
            <process name="threshold_gt_100"/>
        </processes_group>
        <processes_group name="lower than">
            <process name="threshold_lt_1"/>
            <process name="threshold_lt_10"/>
            <process name="threshold_lt_100"/>
        </processes_group>
    </processes_selection>
    
    <gui>
        <position name="threshold_gt_100" x="386.0" y="403.0"/>
        <position name="inputs" x="50.0" y="50.0"/>
        <position name="mask_1" x="815.0" y="153.0"/>
        <position name="threshold_gt_10" x="374.0" y="242.0"/>
        <position name="threshold_lt_100" x="556.0" y="314.0"/>
        <position name="threshold_gt_1" x="371.0" y="88.0"/>
        <position name="mask_10" x="820.0" y="293.0"/>
        <position name="mask_100" x="826.0" y="451.0"/>
        <position name="threshold_lt_1" x="570.0" y="6.0"/>
        <position name="threshold_lt_10" x="568.0" y="145.0"/>
        <zoom level="1.0"/>
    </gui>
</pipeline>
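
As a quick sanity check on a pipeline like the one above, the sketch below parses a reduced version of it and verifies that every link refers to a declared process. It is an illustration only: summarize_pipeline is a hypothetical helper, the module names are placeholders, and treating a single-identifier source as a pipeline input and a single-identifier dest as a pipeline output is a reading of the example rather than a stated rule.

```python
import xml.etree.ElementTree as ET

# Reduced pipeline; "my_package.threshold" and "my_package.mask" are placeholders.
PIPELINE_XML = '''
<pipeline capsul_xml="2.0">
    <process name="threshold_gt_1" module="my_package.threshold">
        <set name="threshold" value="1"/>
        <set name="method" value="'gt'"/>
    </process>
    <process name="mask_1" module="my_package.mask"/>
    <link source="input_image" dest="threshold_gt_1.input_image"/>
    <link source="input_image" dest="mask_1.input_image"/>
    <link source="threshold_gt_1.output_image" dest="mask_1.mask"/>
    <link source="mask_1.output_image" dest="output_1"/>
</pipeline>
'''


def summarize_pipeline(xml_string):
    """Return sorted (process names, exported inputs, exported outputs)."""
    root = ET.fromstring(xml_string)
    # findall() only looks at direct children, so <process> elements nested
    # in a <processes_selection> are not counted as new instances.
    processes = {process.get('name') for process in root.findall('process')}
    inputs, outputs = set(), set()
    for link in root.findall('link'):
        source, dest = link.get('source'), link.get('dest')
        for endpoint in (source, dest):
            if '.' in endpoint and endpoint.split('.')[0] not in processes:
                raise ValueError('link refers to unknown process: %r' % endpoint)
        if '.' not in source:
            inputs.add(source)
        if '.' not in dest:
            outputs.add(dest)
    return sorted(processes), sorted(inputs), sorted(outputs)


processes, pipeline_inputs, pipeline_outputs = summarize_pipeline(PIPELINE_XML)
print(processes, pipeline_inputs, pipeline_outputs)
```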