VoroCNN
Convolutional neural network trained on Voronoi tessellation of 3D protein structures.
Usage
VoroCNN uses Voronota by Kliment Olechnovic (kliment@ibt.lt) in order to construct the tessellation and build the graph.
Voronota must be installed before running VoroCNN. Precompiled executables are available for macOS and Linux.
Once Voronota is installed, VoroCNN can be run by executing the precompiled file vorocnn.
Make sure the file is executable first, for example by running chmod +x vorocnn.
Basic example
To process a single model, pass the path to the model PDB file after the -i flag and the path to the Voronota executable after the -v flag:
./vorocnn -i /path/to/model.pdb -v /path/to/voronota
To process multiple models, pass the path to the directory containing the model PDB files after the -i flag and the path to the Voronota executable after the -v flag:
./vorocnn -i /path/to/models/ -v /path/to/voronota
VoroCNN creates the folder vorocnn_output/ with the results in the current directory.
Scoring multiple structures in one run is recommended, because start-up of the vorocnn executable can take up to 30 seconds.
Command line arguments
Name                 Type    Description                                                     Default
-------------------- ------- --------------------------------------------------------------- ------------------
-i, --input          string  path to the input PDB file or to the directory with PDB files
-v, --voronota       string  path to Voronota executable file
-o, --output         string  path to the output directory                                    ./
-m, --model-version  string  name of VoroCNN version (from vorocnn/versions)                 vorocnn_casp_8_12
-k, --keep-graph     flag    flag to keep graph data for each model in the output directory  False
-V, --verbose        flag    flag to print all logs to stdout (including warnings)           False
-h, --help           flag    flag to print usage help to stdout and exit
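The flags above can also be assembled from a script when many runs need to be automated. The following Python sketch is only an illustration: the helper names build_vorocnn_command and run_vorocnn are hypothetical and not part of VoroCNN, and it assumes that vorocnn returns a non-zero exit code on failure, which this README does not state.

```python
import subprocess


def build_vorocnn_command(vorocnn, input_path, voronota, output_dir=None,
                          model_version=None, keep_graph=False, verbose=False):
    """Assemble the argument list for one vorocnn run (hypothetical helper)."""
    # -i and -v are the two paths used in every example in this README.
    cmd = [str(vorocnn), "-i", str(input_path), "-v", str(voronota)]
    if output_dir is not None:
        cmd += ["-o", str(output_dir)]      # otherwise the default ./ applies
    if model_version is not None:
        cmd += ["-m", model_version]        # otherwise vorocnn_casp_8_12 applies
    if keep_graph:
        cmd.append("-k")
    if verbose:
        cmd.append("-V")
    return cmd


def run_vorocnn(cmd):
    """Run the command and return its stdout as text.

    check=True raises CalledProcessError on a non-zero exit code; whether
    vorocnn signals failures this way is an assumption.
    """
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

Passing a directory as input_path keeps all models in a single run, which avoids paying the start-up time once per model.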
Output
VoroCNN creates a folder vorocnn_output/ in the output directory.
By default, VoroCNN writes two kinds of files to the vorocnn_output/ folder: a file log containing all logs, and a file with the extension .scores containing the predicted local scores of the input model.
For multiple input models, VoroCNN creates a separate .score file for each model.
If the flag -k is enabled, VoroCNN creates a separate folder for each model, where it writes a .score file and keeps a folder graph/ with the following content:
- graph/model.pdb: preprocessed input PDB file (H-atoms removed, chain ID added if missing)
- graph/x.txt: file with atom features (one line per atom, the order of atoms coincides with the order in the input PDB file)
- graph/x_res.txt: file with residue features (one line per residue, the order of residues coincides with the order in the input PDB file)
- graph/adj_b.txt: file with atom-level covalent edges of the graph in format first_atom second_atom contact_area *
- graph/adj_c.txt: file with atom-level contact edges of the graph in format first_atom second_atom contact_area *
- graph/adj_res.txt: file with residue-level contact edges of the graph in format first_residue second_residue contact_area *
- graph/covalent_types.txt: file with atom covalent bond types in format first_atom second_atom covalent_type *
- graph/sequence_separation.txt: file with atom sequence separation values in format first_atom second_atom sequence_separation_value *
- graph/aggr.txt: file that contains the number of atoms in each residue (the order of residues coincides with the order in the input PDB file)
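Since graph/aggr.txt lists the number of atoms per residue in input order, the counts can presumably be expanded into a per-atom residue index, for example to pool atom-level values to residues. The Python sketch below assumes that the file holds whitespace-separated integers and that indices start at 0; neither is stated in this README, and the function names are hypothetical.

```python
def parse_counts(text):
    """Read atom counts per residue from text (assumed whitespace-separated integers)."""
    return [int(token) for token in text.split()]


def atom_to_residue(counts):
    """Expand per-residue atom counts into a residue index for every atom.

    Residue indices start at 0 here, which is an assumption of this sketch.
    """
    mapping = []
    for residue_index, n_atoms in enumerate(counts):
        # Atoms are assumed to be listed residue by residue, in input order.
        mapping.extend([residue_index] * n_atoms)
    return mapping


# Example with three residues of 2, 3 and 1 atoms.
counts = parse_counts("2\n3\n1\n")
residue_of_atom = atom_to_residue(counts)
```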
In addition, VoroCNN writes the status of the execution and the predicted global CAD-score of the input model to stdout.
If the flag -V is enabled, VoroCNN also writes all warnings to stdout (they are always written to the log file).
* Due to the symmetry of the adjacency matrix, we keep only one edge for each pair of atoms.
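Because each pair is stored only once, code that needs the full adjacency has to add the reverse direction itself. The Python sketch below shows one way to do that for the three-column edge files. It assumes whitespace-separated columns, integer indices, and a numeric third column (as for contact_area); these details and the function names are assumptions, not part of VoroCNN.

```python
def parse_edges(text):
    """Parse lines of the form 'first second value' into a dict keyed by (first, second)."""
    edges = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        first, second = int(parts[0]), int(parts[1])
        edges[(first, second)] = float(parts[2])
    return edges


def symmetrize(edges):
    """Return a copy of the edges that also contains every reversed pair."""
    full = dict(edges)
    for (first, second), value in edges.items():
        # Keep an existing reverse entry if the file happens to contain one.
        full.setdefault((second, first), value)
    return full


# Made-up sample in the assumed layout, not real VoroCNN output.
sample = "0 1 12.5\n1 2 3.0\n"
full_edges = symmetrize(parse_edges(sample))
```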
VoroCNN versions
Four versions are available: vorocnn_casp_8_11 and vorocnn_conv_casp_8_11 were trained on CASP[8-11], and vorocnn_casp_8_12 and vorocnn_conv_casp_8_12 were trained on CASP[8-12].
Check out the results of these versions on CASP12 and CASP13 data:
- Local predictions of vorocnn_casp_8_11 on models from CASP12 are available here
- Local predictions of vorocnn_casp_8_12 on models from CASP13 are available here
- Local predictions of vorocnn_conv_casp_8_11 on models from CASP12 are available here
- Local predictions of vorocnn_conv_casp_8_12 on models from CASP13 are available here
New version used for publication:
- Local predictions of vorocnn_geometric (trained on CASP[8-11]) on CASP12 and CASP13 are available here.