BEP 001 -- Representation of Parts and Devices for Build Planning #2
I'm seeing "construct" used along with some modifiers (final, simpler, assembled, DNA, etc.). I don't know the SBOL terms, but does construct = designed nucleotide sequence? |
Yes, I was using "construct" to mean "designed nucleotide sequence", but it is not an SBOL term --- I was specifically trying to avoid using "part", "device", or the SBOL term "Component" in the English definitions. |
Three questions.
It just feels like there is such a large space of what measurement can mean, so I'm having trouble wrapping my head around how exactly we are representing in a way that will enable useful tools/predictions. As an example, one could imagine that if you were combining a promoter, 5' UTR and coding sequence, you could do some basic math and determine a predicted amount of protein product. You could then imagine that if there is some sort of known interaction between the UTR and CDS, that could be found and then calculated. A question I would have is what is the best way to store this interaction? Should it be associated with each component? Should it be associated with a |
@ethanj801 Good questions; I'm adding my thoughts here and have updated the SEP with clarifications.
I think in many cases it can be basically the same as for insertion into a plasmid backbone. The only question is whether there are better ways to represent an insertion locus than an index into a genome sequence. I don't know the answer to that. I've added the following to the discussion section:
Most multi-part devices will be agnostic about whether they end up on one plasmid or multiple plasmids. For example, a TetR/pTet repressor device could end up with TetR on either the same plasmid or a different plasmid from pTet, depending on the design. We allow this simply by not specifying anything about the plasmid at the level of the device. When it gets incorporated into a larger system, we indicate it with constraints of locations in the composite part or parts that include the device. I added the following to the draft:
The SBOL specification has an explanation of the hasMeasure property and some examples. As for broader usage recommendations on what, exactly, to record and in what contexts, I think we're not yet ready to commit on that and thus believe it's out of scope for this document. I've added a note to this effect in the discussion section. |
Perhaps I am misinterpreting what you are saying, but I'm not sure this is true. How many plasmids you use (and which plasmid origins you use) has a big impact on the functionality of the device. One could imagine quite different looking transfer functions between a pTet GFP that is being repressed by a constitutively driven TetR on a medium-low copy number plasmid vs a high copy number plasmid. Using one vs two plasmids could also impact what other devices are able to be integrated along with it (due to origin or antibiotic resistance incompatibility). By my reckoning, even the simple choice of using two different plasmids should fundamentally increase the noise of the device due to plasmid copy number fluctuations (though maybe this isn't a problem for the more tightly regulated plasmids). |
That's a very good point, and comes back to the question of what models and measurements we will find useful to associate with the devices, and how sensitive the devices are to particular variations. For example, in transient transfection of mammalian systems, we've found that one plasmid vs. multiple plasmids has little effect on device behavior (the plasmids aren't replicating and are delivered in high numbers). I think the right way to deal with this is likely to be to have devices include whatever context information we believe is important in as abstract a form as possible. For example, if the TetR/pTet repressor device is defined in terms of a high-copy plasmid only, then it might include two abstract plasmids, each marked with information about copy-count and containing one of the parts, but without its identity specified. Then when the device is used, the abstract plasmids would be given an identity constraint with either one or two real plasmids. In a one-plasmid system, they get identified with the same real plasmid; in a two-plasmid system, they get identified with different plasmids.
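To make the abstract-plasmid idea concrete, here is a minimal sketch in plain Python. It is not SBOL and not the representation in the SEP; the class names (`AbstractPlasmid`, `DeviceUse`), the copy-number label `"high"`, and the plasmid names `pReal1`/`pReal2` are all hypothetical, chosen only to illustrate identifying two abstract plasmids with either one or two real plasmids.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AbstractPlasmid:
    """Placeholder plasmid: copy-number class is known, identity is not."""
    name: str
    copy_number: str  # hypothetical vocabulary, e.g. "high"
    part: str         # the part carried on this plasmid


@dataclass
class DeviceUse:
    """One use of a device, binding each abstract plasmid to a real one."""
    slots: list
    bindings: dict = field(default_factory=dict)  # abstract name -> real name

    def identify(self, abstract_name, real_plasmid):
        # Identity constraint: this abstract plasmid *is* that real plasmid.
        if abstract_name not in {slot.name for slot in self.slots}:
            raise KeyError(abstract_name)
        self.bindings[abstract_name] = real_plasmid

    def layout(self):
        """Group parts by the real plasmid they end up on."""
        result = {}
        for slot in self.slots:
            result.setdefault(self.bindings[slot.name], []).append(slot.part)
        return result


def tet_device():
    # Hypothetical TetR/pTet repressor device defined for high-copy plasmids.
    return [
        AbstractPlasmid("plasmid_a", "high", "TetR"),
        AbstractPlasmid("plasmid_b", "high", "pTet"),
    ]


# One-plasmid system: both abstract plasmids map to the same real plasmid.
one = DeviceUse(tet_device())
one.identify("plasmid_a", "pReal1")
one.identify("plasmid_b", "pReal1")

# Two-plasmid system: they map to different real plasmids.
two = DeviceUse(tet_device())
two.identify("plasmid_a", "pReal1")
two.identify("plasmid_b", "pReal2")

print(one.layout())
print(two.layout())
```

The device definition itself never changes between the two uses; only the bindings differ, which is the flexibility being argued for here.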
Absolutely, and that's why I want to leave the plasmid locations agnostic when possible, to allow flexibility in choosing how to organize functional units in a larger system that uses the device. |
Where would a linear fragment (PCR or synthesis) composed of a unitary part with 5' and 3' flanking sequences for restriction-digest-based or Gibson assembly fall? Based on the previous discussion I think it would go in 'part in backbone'. |
@GC-repeat I believe it depends on the specifics of the flanking sequences.
Were you thinking of one of those scenarios, or something else? |
That makes sense. Yes, I was thinking about those scenarios. A part for Gibson assembly would correspond to scenario 1, and one for restriction-digest-based assembly to scenario 2. |
If I'm understanding correctly, an example of such a construct for scenario 2 could be a 1000-bp linear double-stranded DNA construct with the following structure: 5'padding-BioBrickprefix-CDS-BioBricksuffix-3'padding. In this case, then you are right: it wouldn't be a replicon because it has no origin of replication --- it's just something that we've produced an aliquot of via synthesis. It's a vector in backbone, but the backbone is not a replicon. We would then need to fall back to the generic |
@GC-repeat I've put an update in a pull request; can you please take a look and see if this handles this case well? SynBioDex/SEPs#114 |
That sounds good to me. |
Thank you; I've merged the update in. |
I prefer to use 'construct' to refer to something that has been fully constructed. I commonly use 'fragment' to refer to the 'parts'; I used 'segment' before, but I am not really sure about the meaning of 'segment'. From a genetic engineering point of view, there should be attribution: HasParentID (this can be a single ID or a combination of IDs) and GeneratedbyAction (e.g., 'digestion of HasParentID using enzyme B, purified size = xxx bp', 'PCR amplification from HasParentID using oligo ID A and oligo ID B', 'assembly from HasParentID'). If these are not relevant, please pardon me. |
@pengbingyin In many cases, a part may actually be a fully constructed system as well (particularly composite parts): it all depends on the design goals and whether a person later chooses to combine it with something else. With regards to the representation of attribution: this document recommends a representation using the |
There are several ways to do cloning. (1) Non-Golden Gate methods: the backbone is usually digested into an intermediate form, a linear DNA fragment. In this case, 'Insertion Sites and Drop-Out Sequences' cannot be pre-defined, except when the backbone comes from a commercial cloning kit and is used for one-step cloning. Most cloning work requires a process of sequence analysis and decision making. This could be challenging. (1.1) The part can be a PCR fragment: this requires defining the oligos (annealing sequence + overhang sequence) and the template. (2) Golden Gate methods: the parts and backbones are circular plasmids. In this case, it is necessary to define the Golden Gate levels; see https://www.researchgate.net/publication/310780764_Editing_of_the_urease_gene_by_CRISPR-Cas_in_the_diatom_Thalassiosira_pseudonana/figures?lo=1 (3) A new method for plasmid cloning has been reported: https://www.biorxiv.org/content/10.1101/2021.12.31.474679v1.full.pdf |
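For case (1.1), the information needed to define a PCR-fragment part (two oligos, each with an overhang and an annealing sequence, plus a template) can be sketched as follows. This is only an illustration in plain Python, not the SEP's representation: the `Oligo` class, the `pcr_product` helper, and the toy sequences are hypothetical, and annealing is simplified to an exact match on a linear template.

```python
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class Oligo:
    overhang: str   # 5' tail that does not anneal to the template
    annealing: str  # 3' portion that anneals to the template

    @property
    def sequence(self):
        return self.overhang + self.annealing


def pcr_product(template, forward, reverse):
    """Top-strand PCR product, assuming exact-match annealing on a linear template."""
    start = template.find(forward.annealing)
    if start < 0:
        raise ValueError("forward primer annealing site not found")
    # The reverse primer anneals to the top strand at its reverse complement.
    rev_site = reverse_complement(reverse.annealing)
    end = template.find(rev_site, start)
    if end < 0:
        raise ValueError("reverse primer annealing site not found")
    amplified = template[start:end + len(rev_site)]
    return forward.overhang + amplified + reverse_complement(reverse.overhang)


# Toy example with made-up sequences.
template = "TTTTATGGCCAAACCCGGGTAACCCC"
fwd = Oligo(overhang="GGAA", annealing="ATGGCC")
rev = Oligo(overhang="AAGG", annealing="TTACCC")
product = pcr_product(template, fwd, rev)
print(product)
```

The point of the sketch is that the fragment is fully determined by the template and the two oligos, so recording those three items is enough to regenerate the part's sequence.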
@pengbingyin You are not really answering my question. Are any of these unable to be represented with the current proposal? |
With regards to the representation of attribution: this document recommends a representation using the prov:wasGeneratedBy property to link to a prov:Activity representing an assembly plan, with the assembly plan represented by a network of reactions (e.g., digestion and ligation). I believe that this model can represent the sort of structures that you are describing. Can you please take a look more deeply and say if you see aspects that you believe are important that are unable to be represented under the proposal? prov:wasGeneratedBy and prov:Activity should be sufficient. |
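As a rough illustration of that attribution pattern, the links can be written out as subject/predicate/object triples in plain Python. The `prov:` terms are from the W3C PROV-O vocabulary; the `https://example.org/` identifiers are hypothetical, the use of `prov:used` for inputs is an assumption for this sketch rather than something taken from the SEP, and the network of digestion and ligation reactions inside the assembly plan is omitted.

```python
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "https://example.org/"  # hypothetical namespace for this sketch

triples = {
    # The assembled product points at the activity that generated it.
    (EX + "assembled_plasmid", PROV + "wasGeneratedBy", EX + "assembly_plan"),
    # The assembly plan is a prov:Activity.
    (EX + "assembly_plan", RDF_TYPE, PROV + "Activity"),
    # Assumption: inputs (the "parents") are linked with prov:used.
    (EX + "assembly_plan", PROV + "used", EX + "part_in_backbone"),
    (EX + "assembly_plan", PROV + "used", EX + "acceptor_backbone"),
}


def generating_activities(entity):
    """Activities that the given entity was generated by."""
    return sorted(
        o for s, p, o in triples
        if s == entity and p == PROV + "wasGeneratedBy"
    )


def inputs_of(activity):
    """Entities that the given activity used."""
    return sorted(
        o for s, p, o in triples
        if s == activity and p == PROV + "used"
    )


for activity in generating_activities(EX + "assembled_plasmid"):
    print(activity, "used", inputs_of(activity))
```

Read this way, HasParentID roughly corresponds to following `prov:wasGeneratedBy` and then the activity's inputs, while GeneratedbyAction corresponds to the activity itself.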
@pengbingyin Thank you for your contribution and for taking the time to make a careful assessment! If you want to contribute examples for inclusion, a pull request including them would be welcome as well! |
@nroehner has added a bunch of diagrams to the examples section of the SEP to illustrate the representations. |
Thank you @nroehner this is much needed I think. |
I realized that the current proposal doesn't take advantage of the I have set up a pull request for this change: SynBioDex/SEPs#117 , and would appreciate folks commenting there if they like or dislike this change. |
This SEP proposes a set of terminology and practices for representing genetic parts and functional devices at various stages of design, synthesis, and assembly. These practices are intended to represent any of the wide array of approaches based on embedding parts in carrier vectors, such as BioBricks, Gateway, MoClo, GoldenBraid, PhytoBricks, and other Type IIS methods.
Draft at: https://github.com/SynBioDex/SEPs/blob/master/sep_055.md