Missing GPRs #214
Comments
Niko and I had started discussing this on the manuscript, but I will just paste the conversation here, so that the main discussion can happen here now.
My response:
I think spontaneous does not cover all cases. There could simply be reactions that are known (or supposed) to occur, but the respective protein has not been identified. I guess one could call it a gap-filling reaction. Anyway, my point was that having no GPR should not be considered wrong. I think this is something that could just be given as part of a summary report of the model.
In the manuscript Ines mentioned that:
My response was:
And I agree @cdanielmachado, we'd need to be more precise than just naming every reaction 'spontaneous', but I disagree with moving this check to the summary/statistics category. After all, the majority of metabolic reactions are catalysed enzymatically, and thus we'd lose this level of control. Here's what I propose:
I should add that I don't know if SBO terms exist for any of these follow-up checks.
I doubt that SBO terms exist for those, and I highly doubt that most models contain this information.
You can always request new SBO terms as needed using this form: https://sourceforge.net/p/sbo/term-request/
I see memote primarily as a tool to bring models to the same level of quality. While we do have the opportunity to influence trends or even set new conventions with it, I don't want to implement something that isn't backed by the community. So what do you think: would the above suggestion be a sensible addition for classifying reactions without a GPR by giving them an explicit justification? We do check for an EC-number annotation, but in a different context.
I would:
If all reactions are defined or clearly associated: Succeed, otherwise fail that quality test. If I get your intention, your tool should serve as a quality check for models. As such you can set some standards, and I think it makes sense to have reaction types associated/properly annotated. |
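A minimal sketch of the pass/fail rule described above, assuming a reaction counts as "clearly associated" when it has either a GPR or some explicit annotation. The `Reaction` class, its fields, and the function names are hypothetical stand-ins for illustration; they are not memote's or cobrapy's API, and the SBO value is a placeholder rather than a real term.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Reaction:
    """Hypothetical stand-in for a model reaction (not the cobrapy class)."""
    id: str
    gpr: str = ""                   # gene-protein-reaction rule, empty if none
    sbo_term: Optional[str] = None  # explicit justification, if annotated


def unjustified_reactions(reactions: List[Reaction]) -> List[str]:
    """Return ids of reactions that have neither a GPR nor an annotation."""
    return [r.id for r in reactions if not r.gpr.strip() and not r.sbo_term]


def check_reactions_justified(reactions: List[Reaction]) -> bool:
    """Succeed only if every reaction has a GPR or an explicit annotation."""
    return not unjustified_reactions(reactions)


# Toy data: gene ids and the SBO value are placeholders, not real identifiers.
reactions = [
    Reaction("R_enzymatic", gpr="gene_a and gene_b"),
    Reaction("R_annotated", sbo_term="SBO:XXXXXXX"),
    Reaction("R_unknown"),
]

print(unjustified_reactions(reactions))
print(check_reactions_justified(reactions))
```

Returning the offending ids alongside the boolean would let the same check feed a summary report as well as a hard pass/fail test, which are the two options debated in this thread.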
In response to @tpfau: I am still a bit reluctant to rely on them because when skimming through a model, one has to 'decode' the terms. I'm a big fan of explicit 'tags' or extra attributes because then any user (with a bit of a background in biology) can look at a model's components and immediately know what's going on. However, I don't want to reinvent the wheel either, and sboTerms seem to have all cases covered and, if not, are extensible. It may be redundant, but I'm fond of having an overview as opposed to storing the information in several containers (GPR in But then again, I'm not convinced of that either, because the confidence score, to me, should mark the 'trustworthiness' of a reaction based on hard evidence with literature references, something like (Score: "4" translates to "Purified and characterised enzyme" with attribute @draeger: Would this point back to the genes from the GPR? The
It is definitely possible to annotate a |
The problem with confidenceScores is that they barely exist for anything that's not a model organism, as often enough we just "don't know". With regard to the modifierSpecies:
Well, you can, of course, write any resource in a MIRIAM annotation that you like, such as a gene identifier within the model. However, the idea here is also to point to external databases. Sorry if this wasn't clear.
@draeger Essentially, two items in the SBML would indicate that they are identified by the same external object. And, in contrast to the current situation, it would be necessary to check these cross-references. |
I suppose requiring them in a well-defined format through tests in memote could mitigate this i.e. make users aware and encourage them to start providing information wherever they can. The lowest category could just be "don't know" or "no data available" by default, but then it is at least explicit that this is the case.
That issue I didn't consider, but I agree. File sizes are exploding enough already. I shall try to approach this issue as we've discussed above, then, by piecing together the information from different places. SBO seems like a powerful way of doing it, and once we've extended it I may be able to rely primarily on that. With regard to that, I was wondering: what is currently the most effective way to add SBO terms to a model? I'm not sure if cobrapy internally supports that, but I assume the COBRAToolbox does?
I like this idea. Do I understand it correctly that in the context of reactions this could be used for cofactors that aren't consumed in the actual reaction, or would it be for regulation, or both? |
SBML modifierSpecies are only indicators that a specific reaction is in some way modified by the indicated species (each of which has to point to a species in the model). So this includes regulatory events as well as catalysts which are not consumed.
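To make the modifier mechanism concrete, here is a small sketch that lists the modifiers of each reaction in an SBML Level 3 Version 1 core fragment, using only the Python standard library. The toy model, its ids, and the function name `modifiers_by_reaction` are made up for illustration; a real workflow would more likely use libSBML or cobrapy.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 1 core.
SBML_NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}

# Hypothetical toy fragment, trimmed to the elements needed here.
SBML_SNIPPET = """
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy">
    <listOfSpecies>
      <species id="A"/>
      <species id="B"/>
      <species id="E1"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="R1">
        <listOfReactants><speciesReference species="A"/></listOfReactants>
        <listOfProducts><speciesReference species="B"/></listOfProducts>
        <listOfModifiers><modifierSpeciesReference species="E1"/></listOfModifiers>
      </reaction>
      <reaction id="R2">
        <listOfReactants><speciesReference species="B"/></listOfReactants>
        <listOfProducts><speciesReference species="A"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""


def modifiers_by_reaction(sbml_text):
    """Map each reaction id to its modifier species ids; also return all species ids."""
    root = ET.fromstring(sbml_text)
    species_ids = {
        s.get("id")
        for s in root.iterfind(".//sbml:listOfSpecies/sbml:species", SBML_NS)
    }
    modifiers = {}
    for rxn in root.iterfind(".//sbml:listOfReactions/sbml:reaction", SBML_NS):
        refs = rxn.iterfind(
            "sbml:listOfModifiers/sbml:modifierSpeciesReference", SBML_NS
        )
        modifiers[rxn.get("id")] = [ref.get("species") for ref in refs]
    return modifiers, species_ids


modifiers, species_ids = modifiers_by_reaction(SBML_SNIPPET)
print(modifiers)
```

Note that the modifier entry itself does not say whether `E1` acts as a catalyst or a regulator; that distinction would have to come from an additional annotation such as an SBO term.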
Yeah, a decent way to express confidence scores is definitely something that should find its way into the test. If not to improve model confidence, then to raise awareness. |
It seems that a decent solution, resorting to the use of ECO terms, has been proposed and generally accepted here: I believe it only needs to be implemented in COBRApy for memote to start checking for this. @matthiaskoenig will this already be part of the new SBML I/O functionality that you've built for COBRApy?
memote gives an error for all intracellular reactions without GPRs. However, it is not clear to me why a GPR should be mandatory. There may be evidence that a biochemical reaction occurs inside the cell even though its mechanism is not known (it could be spontaneous or enzymatic) or the respective enzyme has not yet been identified.