Graph Neural Network + Attention mechanism to predict scoring functions (i-RMSD) for protein complexes and decoys.
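
For context, i-RMSD (interface RMSD) is conventionally the root-mean-square deviation between the interface backbone atoms of a decoy and those of the reference complex, measured after optimal superposition. The sketch below shows only the superposition and RMSD step, using the Kabsch algorithm in NumPy. It assumes the interface atoms have already been selected and matched one-to-one; the function name `kabsch_rmsd` is illustrative and is not taken from this repository.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between two matched (N, 3) coordinate sets after optimal superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("expected two matching (N, 3) arrays")

    # Remove the translation by centring both sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)

    # Kabsch: SVD of the 3x3 covariance matrix gives the optimal rotation.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so the result is a proper rotation
    # (determinant +1) rather than a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate P onto Q and measure the remaining deviation.
    diff = Pc @ R.T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))
```

In practice `P` and `Q` would hold the matched interface backbone coordinates of a decoy and of the reference complex, in Å.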

Installation :

Create a dedicated conda environment as follows:

conda env create --name <YOUR_ENV_NAME> --file=environment_graph_predictions.yml

Tutorials for data preparation, grid-search training, testing, and inference are available in this repository:

1. Fully automated data preparation pipeline that creates balanced graph datasets from PDB files of protein complexes and decoys.

2. Automated grid search covering graph neural network architecture selection (Convolution, Node Attention, Edge Attention, Node+Edge Attention, or a custom architecture), optimizer selection, training mode (from scratch, resumed training, or transfer learning), and feature selection.

3. Automated testing pipeline that returns a summary of the output, the predictions, and the metrics.

4. Inference / scoring pipeline that returns predictions for raw PDB files.
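
The grid search in step 2 can be pictured with the minimal sketch below, which enumerates every combination of a search space and keeps the configuration with the lowest validation score. Everything here is an assumption made for illustration: the keys and values in `SEARCH_SPACE`, the helper names `iter_configs` and `grid_search`, and the `evaluate` callback (which would stand in for training and validating one model) are hypothetical and do not reflect the repository's actual options or API.

```python
from itertools import product

# Hypothetical search space; the repository's real option names may differ.
SEARCH_SPACE = {
    "architecture": [
        "convolution",
        "node_attention",
        "edge_attention",
        "node_edge_attention",
    ],
    "optimizer": ["adam", "sgd"],
    "learning_rate": [1e-3, 1e-4],
}


def iter_configs(space):
    """Yield one dict per combination of the values listed in `space`."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


def grid_search(space, evaluate):
    """Return (best_config, best_score); lower scores are treated as better.

    `evaluate` is a caller-supplied function that trains and validates one
    configuration and returns a single number, e.g. a validation loss.
    """
    best_config, best_score = None, float("inf")
    for config in iter_configs(space):
        score = evaluate(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

With the space above, `iter_configs` yields 4 x 2 x 2 = 16 configurations. An exhaustive grid grows multiplicatively with each added option, which is why such runs are usually distributed as separate cluster jobs.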

Conclusion : Precision within 2 Å is reached by applying attention at both the node and the edge level, which lets the model exploit complex interaction patterns between nodes. Further training and further exploration of architectures and hyperparameters are still required and may improve performance.
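
To make "attention at both the node and the edge level" concrete, here is a minimal NumPy sketch of one attention layer in the style of graph attention networks, extended so that edge features enter both the attention score and the message. It is a generic illustration under stated assumptions, not the layer implemented in this repository: the function name `node_edge_attention`, the weight shapes, the LeakyReLU slope of 0.2, and the choice to add node and edge projections in the message are all hypothetical.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def node_edge_attention(h, edges, e, W, We, a):
    """One attention layer over a directed graph (messages flow src -> dst).

    h     : (N, F) node features
    edges : list of (src, dst) index pairs
    e     : (E, Fe) edge features, aligned with `edges`
    W     : (F, D) node projection
    We    : (Fe, D) edge projection
    a     : (3 * D,) attention vector
    Returns (new_h, alpha) with shapes (N, D) and (E,).
    """
    hp = np.asarray(h, dtype=float) @ W    # projected node features
    ep = np.asarray(e, dtype=float) @ We   # projected edge features
    src = np.array([s for s, _ in edges], dtype=int)
    dst = np.array([d for _, d in edges], dtype=int)

    # Raw score per edge from the target node, the source node and the edge.
    z = np.concatenate([hp[dst], hp[src], ep], axis=1) @ a
    z = np.where(z > 0, z, 0.2 * z)        # LeakyReLU

    alpha = np.zeros(len(edges))
    new_h = np.zeros_like(hp)
    for node in range(hp.shape[0]):
        mask = dst == node
        if not mask.any():
            continue                       # no incoming edges: row stays zero
        # Normalise the scores over the incoming edges of this node.
        alpha[mask] = softmax(z[mask])
        # Each message combines the source node and the edge features.
        messages = hp[src[mask]] + ep[mask]
        new_h[node] = (alpha[mask][:, None] * messages).sum(axis=0)
    return new_h, alpha
```

The attention weights `alpha` sum to 1 over the incoming edges of each node, so they can be read as the relative importance of each neighbouring residue and contact.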

Future work :

augmented training dataset
introduction of more features (PSSM, depth, HSE)
hyperparameter gridsearch exploration
use pretrained model as a feature embedding
deeper version of Edge + Node attention network
energy scoring functions

yanistazi/Graph_Neural_Net_Protein-Protein-Complexes