Refine colab notebook (#52)
* refine notebook

* rephrase

* rephrase

* rephrase

* rephrase

* rm output

* rephrase

* rephrase

Co-authored-by: Guolin Ke <[email protected]>
ZiyaoLi and guolinke authored Sep 22, 2022
1 parent 1624ef2 commit 0220958
Showing 1 changed file with 15 additions and 15 deletions.
30 changes: 15 additions & 15 deletions notebooks/unifold.ipynb
@@ -6,13 +6,11 @@
"id": "jMGcXXPabEN4"
},
"source": [
"# Uni-Fold Colab\n",
"# Uni-Fold Notebook\n",
"\n",
"This Colab notebook provides an online runnable version of [Uni-Fold](https://github.com/dptech-corp/Uni-Fold/) for users to predict the structure of a protein, single chain or multimer, with custom settings.\n",
"This notebook provides protein structure prediction service of [Uni-Fold](https://github.com/dptech-corp/Uni-Fold/) as well as [UF-Symmetry](https://www.biorxiv.org/content/10.1101/2022.08.30.505833v1). Predictions of both protein monomers and multimers are supported. The homology search process in this notebook is enabled with the [MMSeqs2](https://github.com/soedinglab/MMseqs2.git) server provided by [ColabFold](https://github.com/sokrypton/ColabFold). For more consistent results with the original AlphaFold(-Multimer), please refer to the open-source repository of [Uni-Fold](https://github.com/dptech-corp/Uni-Fold/), or our convenient web server at [Hermite™](https://hermite.dp.tech/).\n",
"\n",
"Thanks to [MMSeqs2](https://github.com/soedinglab/MMseqs2.git) and the server provided by [ColabFold](https://github.com/sokrypton/ColabFold), the homogeneous searching in this notebook is very fast and is comparable with the original AlphaFold(-Multimer). If you want more consistent results with the original AlphaFold(-Multimer), you can use the [full open source Uni-Fold](https://github.com/dptech-corp/Uni-Fold/), or the convenient web server at [Hermite™](https://hermite.dp.tech/).\n",
"\n",
"Please note that this Colab notebook is not a finished product and is provided as an early-access prototype. It is provided for theoretical modeling only and caution should be exercised in its use. \n",
"Please note that this notebook is provided as an early-access prototype, and is NOT an official product of DP Technology. It is provided for theoretical modeling only and caution should be exercised in its use. \n",
"\n",
"**Licenses**\n",
"\n",
@@ -23,16 +21,15 @@
"\n",
"Please cite the following papers if you use this notebook:\n",
"\n",
"* Jumper et al. \"[Highly accurate protein structure prediction with AlphaFold.](https://doi.org/10.1038/s41586-021-03819-2)\" Nature (2021)\n",
"* Evans et al. \"[Protein complex prediction with AlphaFold-Multimer.](https://www.biorxiv.org/content/10.1101/2021.10.04.463034v1)\" biorxiv (2021)\n",
"* Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S and Steinegger M. \"[ColabFold: Making protein folding accessible to all.](https://www.nature.com/articles/s41592-022-01488-1)\" Nature Methods (2022) \n",
"* Ziyao Li, Xuyang Liu, Weijie Chen, Fan Shen, Hangrui Bi, Guolin Ke, Linfeng Zhang. \"[Uni-Fold: An Open-Source Platform for Developing Protein Folding Models beyond AlphaFold.](https://www.biorxiv.org/content/10.1101/2022.08.04.502811v1)\" biorxiv (2022)\n",
"* Ziyao Li, Shuwen Yang, Xuyang Liu, Weijie Chen, Han Wen, Fan Shen, Guolin Ke, Linfeng Zhang. \"[Uni-Fold Symmetry: Harnessing Symmetry in Folding Large Protein Complexes.](https://www.biorxiv.org/content/10.1101/2022.08.30.505833v1)\" bioRxiv (2022)\n",
"\n",
"* Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S and Steinegger M. \"[ColabFold: Making protein folding accessible to all.](https://www.nature.com/articles/s41592-022-01488-1)\" Nature Methods (2022)\n",
"\n",
"**Acknowledgements**\n",
"\n",
"We thank [@sokrypton](https://twitter.com/sokrypton) for many helpful suggestions to this notebook.\n"
"The model architecture of Uni-Fold is largely based on [AlphaFold](https://doi.org/10.1038/s41586-021-03819-2) and [AlphaFold-Multimer](https://www.biorxiv.org/content/10.1101/2021.10.04.463034v1). The design of this notebook refers directly to [ColabFold](https://www.nature.com/articles/s41592-022-01488-1). We specially thank [@sokrypton](https://twitter.com/sokrypton) for his helpful suggestions to this notebook.\n",
"\n",
"Copyright © 2022 DP Technology. All rights reserved."
]
},
{
@@ -127,7 +124,6 @@
"output_dir_base = \"./prediction\"\n",
"os.makedirs(output_dir_base, exist_ok=True)\n",
"\n",
"\n",
"def clean_and_validate_sequence(\n",
" input_sequence: str, min_length: int, max_length: int) -> str:\n",
" \"\"\"Checks that the input sequence is ok and returns a clean version of it.\"\"\"\n",
@@ -203,21 +199,25 @@
"def add_hash(x,y):\n",
" return x+\"_\"+hashlib.sha1(y.encode()).hexdigest()[:5]\n",
"\n",
"jobname = 'unifold_colab' #@param {type:\"string\"}\n",
"\n",
"sequence_1 = 'LILNLRGGAFVSNTQITMADKQKKFINEIQEGDLVRSYSITDETFQQNAVTSIVKHEADQLCQINFGKQHVVCTVNHRFYDPESKLWKSVCPHPGSGISFLKKYDYLLSEEGEKLQITEIKTFTTKQPVFIYHIQVENNHNFFANGVLAHAMQVSI' #@param {type:\"string\"}\n",
"sequence_2 = '' #@param {type:\"string\"}\n",
"sequence_3 = '' #@param {type:\"string\"}\n",
"sequence_4 = '' #@param {type:\"string\"}\n",
"\n",
"#@markdown Use symmetry group `C1` for default Uni-Fold predictions.\n",
"#@markdown Or, specify a **cyclic** symmetry group (e.g. `C4``) and\n",
"#@markdown the sequences of the asymmetric unit (i.e. **do not copy\n",
"#@markdown them multiple times**) to predict with UF-Symmetry.\n",
"\n",
"symmetry_group = 'C1' #@param {type:\"string\"}\n",
"\n",
"use_templates = True #@param {type:\"boolean\"}\n",
"msa_mode = \"MMseqs2\" #@param [\"MMseqs2\",\"single_sequence\"]\n",
"\n",
"input_sequences = [sequence_1, sequence_2, sequence_3, sequence_4]\n",
"\n",
"jobname = 'unifold_colab' #@param {type:\"string\"}\n",
"\n",
"basejobname = \"\".join(input_sequences)\n",
"basejobname = re.sub(r'\\W+', '', basejobname)\n",
"target_id = add_hash(jobname, basejobname)\n",
@@ -1046,7 +1046,7 @@
},
"gpuClass": "standard",
"kernelspec": {
"display_name": "Python 3.8.10 ('ProteinMD')",
"display_name": "Python 3.8.10 64-bit",
"language": "python",
"name": "python3"
},
@@ -1056,7 +1056,7 @@
},
"vscode": {
"interpreter": {
"hash": "af92dc656850d97b5469b75c9ef2009aaa936e713f0093b069a7ff14eeb2ca8d"
"hash": "916dbcbb3f70747c44a77c7bcd40155683ae19c65e1c03b4aa3499c5328201f1"
}
}
},
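
The job-naming step touched by this diff (`add_hash`, `basejobname`, `target_id`) can be tried outside the notebook. The following is a minimal sketch, assuming only what the hunk shows: the sequences are concatenated, non-word characters are stripped, and the first five hex digits of the SHA-1 digest are appended to `jobname`. The wrapper name `make_target_id` is hypothetical and does not appear in the notebook, which computes `target_id` inline.

```python
import hashlib
import re
from typing import List


def add_hash(x: str, y: str) -> str:
    # Same body as the notebook's add_hash: x + "_" + first 5 hex digits of SHA-1(y).
    return x + "_" + hashlib.sha1(y.encode()).hexdigest()[:5]


def make_target_id(jobname: str, input_sequences: List[str]) -> str:
    # Hypothetical wrapper around the inline logic shown in the diff.
    basejobname = "".join(input_sequences)          # chains are joined with no separator
    basejobname = re.sub(r"\W+", "", basejobname)   # drop whitespace and punctuation
    return add_hash(jobname, basejobname)


if __name__ == "__main__":
    # Empty slots (unused sequence_2..sequence_4) do not change the result.
    print(make_target_id("unifold_colab", ["ACDEFGHIK", "", "", ""]))
```

Two properties follow from this construction. Because the chains are joined without a separator, inputs such as `["AC", "D"]` and `["ACD"]` produce the same `target_id`. The suffix is only five hex digits, so it distinguishes about a million values and should be read as a convenience label rather than a unique identifier.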