In cobra.io neither sbml.py nor sbml3.py seem to import or export notes. #541

Closed
ChristianLieven opened this issue Jul 6, 2017 · 4 comments
Labels
SBML Related to reading and writing SBML models.

Comments

@ChristianLieven
Contributor

ChristianLieven commented Jul 6, 2017

Problem description

I am currently reconstructing a metabolic model, for which I am adding confidence scores, comments, and literature references in the notes attribute of reactions, metabolites and genes. The importance of confidence scores and related qualitative annotation parameters is discussed in the publications linked above.

I tried importing simple notes by adding the following notes field to the RECON1 model from BiGG:
<notes>
  <body xmlns="http://www.w3.org/1999/xhtml">
    <center><h2>This is a TEST</h2></center>
    <p>I am wondering if COBRApy is able to import this.</p>
  </body>
</notes>
I was quite surprised that the RECON1 model did not contain the confidence scores on which some of the results of this research are based.

I was not able to find the keywords 'confidence', 'score' or 'confidence_score' in either cobra.io.sbml or cobra.io.sbml3. If I read it correctly, the legacy import looks specifically for charge, GPR, and subsystem in the notes field but doesn't account for the confidence score.

Code Sample

You can find my modified example SBML3+FBC RECON1 file here. The modification is at R_EX_dopa_e.

Discussion

It seems the community hasn't yet decided what exactly the notes field should contain or how it should be formatted. Personally, I'd find it most useful if there were a clever way of allowing both short, human-readable comment entries and optional, machine-readable, DOI-style literature references tied to them. In the model object, I suppose this could be a nested dictionary looking something like this:
some_model.reactions.SOME_RXN.notes = {"confidence_score": {"value": 4, "reference": "some_doi"}}

Based on the referenced publications above, another useful key of the notes-field/attribute would be a simple 'comment' option, which would be limited in length (50 chars? 70 chars? 80 chars?).

some_model.metabolites.some_metabolite.notes = {"comment": {"value": "Short string outlining a hypothesis or specific decision for this metabolite", "optional_reference": "some_doi"}}
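
A minimal sketch of how such a length-limited comment entry could be built. The helper name `make_comment_note` and the 80-character limit are assumptions for illustration only; the thread leaves the limit open and nothing like this exists in cobrapy.

```python
MAX_COMMENT_LENGTH = 80  # assumption: the proposal leaves the limit open (50, 70 or 80)


def make_comment_note(text, reference=None, max_length=MAX_COMMENT_LENGTH):
    """Build the proposed nested 'comment' entry (hypothetical helper)."""
    if len(text) > max_length:
        raise ValueError(
            f"comment has {len(text)} characters, limit is {max_length}"
        )
    entry = {"value": text}
    if reference is not None:
        # Key name taken from the proposal above.
        entry["optional_reference"] = reference
    return {"comment": entry}


# Combine a confidence score and a short comment in one notes dictionary.
notes = {"confidence_score": {"value": 4, "reference": "some_doi"}}
notes.update(make_comment_note("Transport inferred from homology", "some_doi"))
```

Rejecting over-long comments at construction time keeps the limit in one place, so it can be changed once the community settles on a value.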

I don't doubt that there could be a feasible, simple implementation on the Python side of things; however, I am unfamiliar with the options on the XML side, specifically SBML. According to the SBML specification, a notes field is allowed to contain...

Almost any wellformed content permitted in XHTML subject to a few restrictions

...which seem pretty straightforward, namely the notes field...

must not contain an XML declaration or a DOCTYPE declaration.

Hence, I think a solution here could be to use <ul> from HTML?
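
As a rough sketch of the `<ul>` idea, the nested dictionary could be written out with Python's standard `xml.etree.ElementTree`. The function name `notes_to_xhtml` and the `key: value` layout of each list item are assumptions made for illustration, not an agreed format.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def _append_list(parent, mapping):
    """Add one <ul> to parent, with one <li> per dictionary key."""
    ul = ET.SubElement(parent, "ul")
    for key, value in mapping.items():
        li = ET.SubElement(ul, "li")
        if isinstance(value, dict):
            # Nested dictionaries become nested lists.
            li.text = f"{key}:"
            _append_list(li, value)
        else:
            li.text = f"{key}: {value}"


def notes_to_xhtml(notes):
    """Serialise a notes dict as <notes><body><ul>...</ul></body></notes>."""
    root = ET.Element("notes")
    body = ET.SubElement(root, "body", {"xmlns": XHTML_NS})
    _append_list(body, notes)
    # encoding="unicode" returns a str without an XML declaration,
    # which the notes field must not contain.
    return ET.tostring(root, encoding="unicode")


xhtml_notes = notes_to_xhtml(
    {"confidence_score": {"value": 4, "reference": "some_doi"}}
)
```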

What do you think?

@cdiener
Member

cdiener commented Jul 6, 2017

That is a good point and one that pops up every once in a while for discussion. There is some ongoing discussion about the meaning of the SBML spec regarding the notes field. SBML only says:

It is intended to serve as a place for storing optional information intended to be seen by humans.

and comparing to annotation:

Whereas Notes is a container for content to be shown directly to humans, Annotation is a container for optional software-generated content not meant to be shown to humans.

The interpretation of the cobrapy maintainers in the past was that, since notes should not be "consumed by a machine", they would not be written or read by cobrapy except to support the SBML 2 cobra annotations. The argument was that all annotation should go into the annotation tag as described in the spec. For the particular use case of DOI annotations, this is the recommended solution. There is a MIRIAM tag for DOIs, so you can just use that. For instance, the following is valid SBML and would be read into model.metabolites.h_c.annotation in cobrapy:

<annotation>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:vCard="http://www.w3.org/2001/vcard-rdf/3.0#"
    xmlns:bqbiol="http://biomodels.net/biology-qualifiers/"
    xmlns:bqmodel="http://biomodels.net/model-qualifiers/">
    <rdf:Description rdf:about="#M_h_c">
      <bqbiol:is>
        <rdf:Bag>
          <rdf:li rdf:resource="http://identifiers.org/kegg.compound/C00080"/>
          <rdf:li rdf:resource="http://identifiers.org/doi/10.1038/nbt1156"/>
        </rdf:Bag>
      </bqbiol:is>
    </rdf:Description>
  </rdf:RDF>
</annotation>
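
To illustrate what a reader has to do with such a block, here is a minimal sketch that pulls the identifiers.org resources out of the RDF with the standard library. It is not cobrapy's parser; the function name `identifiers_from_annotation` and the returned provider-to-identifiers mapping are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PREFIX = "http://identifiers.org/"

ANNOTATION = """<annotation>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
    <rdf:Description rdf:about="#M_h_c">
      <bqbiol:is>
        <rdf:Bag>
          <rdf:li rdf:resource="http://identifiers.org/kegg.compound/C00080"/>
          <rdf:li rdf:resource="http://identifiers.org/doi/10.1038/nbt1156"/>
        </rdf:Bag>
      </bqbiol:is>
    </rdf:Description>
  </rdf:RDF>
</annotation>"""


def identifiers_from_annotation(xml_text):
    """Map each identifiers.org provider to the identifiers listed for it."""
    root = ET.fromstring(xml_text)
    found = {}
    for li in root.iter(f"{{{RDF_NS}}}li"):
        resource = li.get(f"{{{RDF_NS}}}resource", "")
        if not resource.startswith(PREFIX):
            continue
        # Split only at the first slash: DOIs contain slashes themselves.
        provider, _, identifier = resource[len(PREFIX):].partition("/")
        found.setdefault(provider, []).append(identifier)
    return found


identifiers = identifiers_from_annotation(ANNOTATION)
```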

However, that only works for direct annotations and not for adding data. For instance, if I want to add some other quantity to the species or reaction (confidence scores, charge in various conditions, etc.), there is no way to do that with annotations. This is a shortcoming of SBML, IMHO. So I would be in favour of reading and writing the notes field. It could be just raw text, or it could be a dictionary that is read from and written to <ul> tags as you suggested, with a plain string written into a <p> tag instead. But that would depend on how others interpret the SBML spec here.
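
The reading side of that idea could look roughly like the sketch below: a `<ul>` becomes a dictionary and a `<p>` becomes a string. The function name `read_notes`, the `key: value` convention inside each `<li>`, and the sample inputs are all assumptions; values are returned as strings, with no type conversion.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"

LIST_NOTES = (
    '<notes><body xmlns="http://www.w3.org/1999/xhtml"><ul>'
    "<li>confidence_score: 4</li><li>comment: checked by hand</li>"
    "</ul></body></notes>"
)
TEXT_NOTES = (
    '<notes><body xmlns="http://www.w3.org/1999/xhtml">'
    "<p>Free text.</p></body></notes>"
)


def read_notes(xml_text):
    """Return a dict for <ul> notes, a string for <p> notes, else None."""
    body = ET.fromstring(xml_text).find(f"{XHTML}body")
    if body is None:
        return None
    ul = body.find(f"{XHTML}ul")
    if ul is not None:
        entries = {}
        for li in ul.findall(f"{XHTML}li"):
            # Split at the first colon so values may contain colons.
            key, _, value = (li.text or "").partition(":")
            entries[key.strip()] = value.strip()
        return entries
    p = body.find(f"{XHTML}p")
    return p.text if p is not None else None


as_dict = read_notes(LIST_NOTES)
as_text = read_notes(TEXT_NOTES)
```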

@cdiener cdiener added the SBML Related to reading and writing SBML models. label Jul 10, 2017
@ChristianLieven
Contributor Author

#534 Referencing this issue because @draeger, @Midnighter and @hredestig came up with this solution, which I consider quite optimal:

We are not aware of any existing schema or documentation of the annotation tags used in cobra. Our suggestion is to create a new repository under the opencobra organization. That way, any member of the opencobra community (most importantly of the Matlab COBRA Toolbox) can feel free to contribute to the schema, there can be versioned releases of the schema, and for the time being it can be hosted on https://opencobra.github.io/annotations/schema or whatever is decided for the name and URL.

We would then implement in cobrapy whatever is dictated by the schema and there's a chance for other tools in the opencobra community to do the same.

@draeger

draeger commented Jul 19, 2017

Well, there is, of course, another way of storing confidence scores for reactions in a standard-compliant form. You could use Parameter objects for this. These are objects in the listOfParameters directly within the model and have an id, an optional name, and a value. You could prefix each parameter's id with the id of the reaction that the confidence score refers to. However, this would again not be the best solution for storing that sort of information because it is not obvious what these parameters are.
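
A small sketch of this Parameter approach, using only the standard library to build and read the XML fragment. The `_confidence_score` suffix and both function names are hypothetical conventions; the `constant` attribute is included on the assumption that the model is SBML Level 3, where it is required on parameters.

```python
import xml.etree.ElementTree as ET

SUFFIX = "_confidence_score"  # hypothetical naming convention


def confidence_parameters(scores):
    """Build a <listOfParameters> element with one parameter per reaction."""
    parameters = ET.Element("listOfParameters")
    for reaction_id, score in scores.items():
        ET.SubElement(parameters, "parameter", {
            # The parameter id starts with the reaction id it refers to.
            "id": f"{reaction_id}{SUFFIX}",
            "name": f"Confidence score of {reaction_id}",
            "value": str(score),
            "constant": "true",
        })
    return parameters


def scores_from_parameters(parameters):
    """Recover {reaction_id: score} from parameters following the convention."""
    return {
        p.get("id")[: -len(SUFFIX)]: float(p.get("value"))
        for p in parameters.findall("parameter")
        if p.get("id", "").endswith(SUFFIX)
    }


parameter_list = confidence_parameters({"R_EX_dopa_e": 4})
recovered = scores_from_parameters(parameter_list)
```

The round trip only works because both sides share the suffix convention, which shows the drawback mentioned above: nothing in the SBML itself says what these parameters mean.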

@Midnighter
Member

This issue was moved to opencobra/schema#4
