VoroCNN

Convolutional neural network trained on Voronoi tessellation of 3D protein structures.

Usage

VoroCNN uses Voronota by Kliment Olechnovic (kliment@ibt.lt) to construct the tessellation and build the graph, so Voronota must be installed before running VoroCNN. Precompiled executables are available for macOS and Linux. Once Voronota is installed, run VoroCNN by executing the precompiled file vorocnn, as shown in the examples below.

Make sure that vorocnn is executable. If it is not, change its access permissions with chmod +x vorocnn.

Basic example

To process a single model, specify the path to the model PDB file after the -i flag and the path to the Voronota executable after the -v flag:

./vorocnn -i /path/to/model.pdb -v /path/to/voronota

To process multiple models, specify the path to the directory with the model PDB files after the -i flag and the path to the Voronota executable after the -v flag:

./vorocnn -i /path/to/models/ -v /path/to/voronota

VoroCNN creates a folder vorocnn_output/ with the results in the current directory.

It is recommended to score multiple structures in one run, because starting the vorocnn executable may take up to 30 seconds.
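A minimal sketch of such a batch run, assuming a POSIX shell. The function name score_models and the VOROCNN environment variable are illustrative and not part of VoroCNN; only the -i, -v and -o flags come from this README. The helper checks that both executables can be run, so that the start-up cost is paid once for the whole directory rather than lost on a failed call:

```shell
# Hypothetical helper: score one PDB file or a directory of PDB files in one run.
# VOROCNN (path to the vorocnn executable) is an assumed variable, default ./vorocnn.
score_models() {
    input="$1"
    voronota="$2"
    output="${3:-./}"
    vorocnn="${VOROCNN:-./vorocnn}"

    # Both executables must be runnable (chmod +x if needed).
    [ -x "$vorocnn" ] || { echo "not executable: $vorocnn" >&2; return 1; }
    [ -x "$voronota" ] || { echo "not executable: $voronota" >&2; return 1; }
    # The input is either a single PDB file or a directory of PDB files.
    [ -e "$input" ] || { echo "input not found: $input" >&2; return 1; }

    "$vorocnn" -i "$input" -v "$voronota" -o "$output"
}

# Example call with placeholder paths:
# score_models /path/to/models/ /path/to/voronota /path/to/results/
```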

Command line arguments

Name                 Type    Description                                                     Default
-------------------- ------- --------------------------------------------------------------- ------------------
-i, --input          string  path to the input PDB file or to the directory with PDB files      
-v, --voronota       string  path to Voronota executable file
-o, --output         string  path to the output directory                                    ./
-m, --model-version  string  name of VoroCNN version (from vorocnn/versions)                 vorocnn_casp_8_12    
-k, --keep-graph     flag    flag to keep graph data for each model in the output directory  False 
-V, --verbose        flag    flag to print all logs to stdout (including warnings)           False
-h, --help           flag    flag to print usage help to stdout and exit
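For illustration, the options above can be combined in a single call. The sketch below only prints the command line instead of running it. The function name vorocnn_cmd and its positional arguments are hypothetical, and paths containing spaces are not handled:

```shell
# Hypothetical helper: print a vorocnn command line built from the documented flags.
# Arguments: INPUT VORONOTA OUTPUT MODEL_VERSION [keep] [verbose]
vorocnn_cmd() {
    cmd="./vorocnn -i $1 -v $2 -o $3 -m $4"
    # Optional flags from the table: -k keeps graph data, -V prints all logs.
    if [ "${5:-}" = "keep" ]; then cmd="$cmd -k"; fi
    if [ "${6:-}" = "verbose" ]; then cmd="$cmd -V"; fi
    printf '%s\n' "$cmd"
}

# Placeholder paths; pass "" as the fifth argument to get -V without -k.
vorocnn_cmd /path/to/models/ /path/to/voronota ./results/ vorocnn_casp_8_11 keep verbose
```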

Output

VoroCNN creates a folder vorocnn_output/ in the output directory. By default, this folder contains a file log with all logs and a file with the extension .scores with the predicted local scores of the input model. For multiple input models, VoroCNN writes a separate .scores file for each model. If the flag -k is enabled, VoroCNN creates a separate folder for each model, where it writes the .scores file and keeps a folder graph/ with the following content:

  • graph/model.pdb - preprocessed input PDB file (H-atoms removed, chain ID added if missing)
  • graph/x.txt - file with atom features (one line per atom, the order of atoms coincides with the order in the input PDB file)
  • graph/x_res.txt - file with residue features (one line per residue, the order of residues coincides with the order in the input PDB file)
  • graph/adj_b.txt - file with atom-level covalent edges of the graph in format first_atom second_atom contact_area*
  • graph/adj_c.txt - file with atom-level contact edges of the graph in format first_atom second_atom contact_area*
  • graph/adj_res.txt - file with residue-level contact edges of the graph in format first_residue second_residue contact_area*
  • graph/covalent_types.txt - file with covalent bond types between atoms in format first_atom second_atom covalent_type*
  • graph/sequence_separation.txt - file with atom sequence separation values in format first_atom second_atom sequence_separation_value*
  • graph/aggr.txt - file that contains number of atoms in residues (the order of residues coincides with the order in the input PDB file)

In addition, VoroCNN writes the execution status and the predicted global CAD-score of the input model to stdout. If the flag -V is enabled, VoroCNN also writes all warnings to stdout (they are always written to the log file).

* Due to the symmetry of the adjacency matrix, we keep only one edge for each pair of atoms.
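
Because each edge is stored only once, per-atom totals have to credit every edge to both of its endpoints. The sketch below sums the contact area per atom from an edge file such as graph/adj_c.txt. It assumes whitespace-separated columns in the order listed above and numeric atom identifiers, neither of which this README confirms. The function name per_atom_contact_area is illustrative:

```shell
# Hypothetical helper: total contact area per atom from an edge file with
# lines "first_atom second_atom contact_area" (whitespace-separated, assumed).
per_atom_contact_area() {
    LC_ALL=C awk '
        # Each undirected edge appears once, so add its area to both atoms.
        NF >= 3 { area[$1] += $3; area[$2] += $3 }
        END { for (a in area) printf "%s %.3f\n", a, area[a] }
    ' "$1" | LC_ALL=C sort -n
}

# Example call (placeholder path):
# per_atom_contact_area /path/to/graph/adj_c.txt
```

The same pattern applies to graph/adj_res.txt at the residue level, under the same format assumptions.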

VoroCNN versions

Four versions are available: vorocnn_casp_8_11 and vorocnn_conv_casp_8_11 were trained on CASP[8-11], and vorocnn_casp_8_12 and vorocnn_conv_casp_8_12 were trained on CASP[8-12]. Results of these versions on CASP12 and CASP13 data:

  • Local predictions of vorocnn_casp_8_11 on models from CASP12 are available here
  • Local predictions of vorocnn_casp_8_12 on models from CASP13 are available here
  • Local predictions of vorocnn_conv_casp_8_11 on models from CASP12 are available here
  • Local predictions of vorocnn_conv_casp_8_12 on models from CASP13 are available here

New version used for publication:

  • Local predictions of vorocnn_geometric (trained on CASP[8-11]) on CASP12 and CASP13 are available here.