Assembly_report is an assembly analyzer and debugger for Canu and mini{map|asm}.

You can find an example of the output [here](https://gitlab.inria.fr/pmarijon/assembly_report/raw/master/demo/report.html).

# Input

The tool takes as input an existing Canu assembly. Optionally, it can also take as input a set of reads and generate the Canu assembly itself (with default parameters), as well as a miniasm assembly for comparison.

# Output

Assembly_report produces an HTML file with several components:

* PAF: the read overlap graph seen by Canu and miniasm
* BOG: the best overlap graph of Canu (an intermediate representation between the PAF and the assembly)
* graph projection: Canu's contigs projected onto the minimap PAF graph
* contig extremities analysis: analysis of read overlaps at each contig extremity
* paths between contig extremities: paths found in the Canu PAF graph, as well as in the minimap PAF graph, between reads at contig extremities

![graph projection explain](images/graph_projection_explain.png)

You can find an example output here: [report.html](https://gitlab.inria.fr/pmarijon/assembly_report/raw/master/demo/report.html).
Save this file and open it in a browser (GitLab will not display it directly).

![gif present basic report out](images/ex_data_assembly_report.gif)

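To make the "read overlap graph" listed among the report components more concrete, here is a minimal Python sketch that turns PAF records into an adjacency structure. It is an illustration only and is not taken from Assembly_report's own code: the function name `paf_to_overlap_graph` and the sample records are hypothetical, and only the query name (column 1) and target name (column 6) of the 12 mandatory PAF columns are used.

```python
from collections import defaultdict


def paf_to_overlap_graph(paf_lines):
    """Build an undirected read overlap graph from PAF lines.

    A PAF record has at least 12 tab-separated columns; column 1 is the
    query name and column 6 is the target name.
    """
    graph = defaultdict(set)
    for line in paf_lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore empty lines
        fields = line.split("\t")
        if len(fields) < 12:
            raise ValueError(f"malformed PAF record: {line!r}")
        query, target = fields[0], fields[5]
        if query == target:
            continue  # skip self-overlaps
        graph[query].add(target)
        graph[target].add(query)
    # Sort neighbours so the result is deterministic.
    return {read: sorted(neighbours) for read, neighbours in graph.items()}


# Hypothetical records: read1 overlaps read2, and read2 overlaps read3.
example = [
    "read1\t1000\t600\t1000\t+\tread2\t1200\t0\t400\t380\t400\t255",
    "read2\t1200\t800\t1200\t+\tread3\t900\t0\t400\t390\t400\t255",
]
graph = paf_to_overlap_graph(example)
print(graph)
```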
# Replicability

An additional GitLab page is available to enable replication of the results in our manuscript: [Assembly graph analysis of fragmented long read bacterial genome assemblies](https://gitlab.inria.fr/pmarijon/assembly_graph_analysis_of_fragmented_long_read_bacterial_genome_assemblies_repetition)

# Installation and Usage

* [Installation](#installation)
    * [Pre-requisites](#pre-requisites)
    * [With conda](#main-installation-with-conda) *(recommended)*
    * [Without conda](#main-installation-without-conda)
* [Usage](#usage)
    * [Basic usage](#basic-usage)
    * [Demo dataset](#demo-dataset)
    * [Warning](#warning)
* [Update](#update)
    * [Conda installation](#conda-installation)
    * [Non-conda installation](#non-conda-installation)
* [Contact](#contact)

# Installation

## Pre-requisites

- [Bandage](https://github.com/rrwick/Bandage) (note: development branch).
Until a new version of Bandage that contains recent developments is released, the installation instructions for Bandage are as follows:

```
git clone https://github.com/rrwick/Bandage.git
cd Bandage
git checkout development
qmake
make -j 16
```

Then make sure that the `Bandage` binary is in your `PATH`.

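One way to confirm that the executables are visible on your `PATH` is a small Python check based on the standard library's `shutil.which`. This is only a convenience sketch, not part of Assembly_report: the helper name `missing_executables` is hypothetical, and the list of names (`Bandage`, plus `canu`, which this README lists as a prerequisite for the non-conda installation) is an assumption about what you want to verify.

```python
import shutil


def missing_executables(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


# Executables this README lists as prerequisites (adjust as needed).
required = ["Bandage", "canu"]
missing = missing_executables(required)
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("All required executables were found on PATH.")
```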
## Main installation, with conda

Recommended solution (takes about 10 minutes but is fairly automatic):

```
wget https://gitlab.inria.fr/pmarijon/assembly_report/raw/master/conda_env.yml
conda env create -f conda_env.yml
```

### Manage the Conda environment

Activate the environment:

```
source activate assembly_report
```

Deactivate the environment (the command takes no environment name):

```
source deactivate
```

## Main installation, without Conda

### Pre-requisites

- Python 3
- [Bandage](https://github.com/rrwick/Bandage) (same instructions as above)
- graphviz
- pygraphviz
- Canu (with all executables in your `PATH`)

### Instructions

```
pip3 install -r https://gitlab.inria.fr/pmarijon/assembly_report/raw/master/install.txt
```

Replace `pip3` with `pip` if your default Python version is 3.

# Usage

## Basic usage

1) Assume that Canu was run with `canu -p my_assembly -d assembly_folder`.

2) `cd assembly_folder/`

3) `assembly_report`

This will generate a report in the `assembly_report/` directory.

For a more complete report, please specify the minimap PAF file
and the miniasm GFA file using the `-m` and `-M` options respectively.


Full command line usage:

```
usage: assembly_report [-h] [-i INPUT] [-o OUTPUT] [-p PROJECT] [-m MINIMAP]
                       [-M MINIASM] [-v] [-c] [-t THREAD]

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        canu directory (default: ./)
  -o OUTPUT, --output OUTPUT
                        output directory (default: ./assembly_report)
  -p PROJECT, --project PROJECT
                        project name give to canu, default auto detect use
                        option to overide (default: None)
  -m MINIMAP, --minimap MINIMAP
                        path to minimap paf output (default: None)
  -M MINIASM, --miniasm MINIASM
                        path to miniasm gfa output (default: None)
  -v, --verbose         verbose (default: False)
  -c, --clean           clean all file generated (default: False)
  -t THREAD, --thread THREAD
                        number of thread usable (default: 1)
```

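If you want to launch Assembly_report from a script, for example to process several Canu directories, the options listed above can be assembled programmatically. The following Python sketch only builds the argument list from the documented `-i`, `-o`, `-m`, `-M` and `-t` options; the helper name `build_command` and the file paths are placeholders chosen for illustration, not something shipped with the tool.

```python
import shlex


def build_command(canu_dir, output_dir=None, minimap_paf=None,
                  miniasm_gfa=None, threads=1):
    """Assemble an assembly_report command line from the documented options."""
    command = ["assembly_report", "-i", canu_dir]
    if output_dir is not None:
        command += ["-o", output_dir]
    if minimap_paf is not None:
        command += ["-m", minimap_paf]
    if miniasm_gfa is not None:
        command += ["-M", miniasm_gfa]
    if threads != 1:
        command += ["-t", str(threads)]
    return command


# Placeholder paths; replace them with your own files.
command = build_command("canu", minimap_paf="minimap/minimap.paf",
                        miniasm_gfa="miniasm/miniasm.gfa", threads=4)
print(shlex.join(command))
```

Passing the resulting list to `subprocess.run(command, check=True)` would execute it, provided `assembly_report` is installed and on your `PATH`.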

## Demo dataset

You can download an HTML report generated from this demo dataset [here](https://gitlab.inria.fr/pmarijon/assembly_report/raw/master/demo/report.html).

You can download a [test dataset](https://gitlab.inria.fr/pmarijon/assembly_report/raw/master/demo/demo_dataset.tar.bz2), and try assembly_report with these commands:

```
tar xvfj demo_dataset.tar.bz2
cd demo_dataset/assembly/canu
assembly_report # Canu-only run; results are stored in canu/assembly_report
cd ..
assembly_report -i canu -m minimap/minimap.paf # Canu and minimap run
assembly_report -i canu -m minimap/minimap.paf -M miniasm/miniasm.gfa # Canu, minimap and miniasm run
```

## Warning

Currently, assembly_report does not handle read names that contain spaces.

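To check a read file for this limitation before running the tool, a short Python sketch such as the following can list the affected headers. It is not part of Assembly_report: the function name `headers_with_spaces` is hypothetical, it assumes that the whole header line after the leading `>` or `@` counts as the read name, and for FASTQ it assumes plain 4-line records without blank lines.

```python
def headers_with_spaces(lines, fmt="fasta"):
    """Return header lines whose read name contains whitespace.

    fmt="fasta": headers are lines starting with ">".
    fmt="fastq": headers are every 4th line (4-line records assumed).
    """
    offending = []
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        if fmt == "fasta":
            is_header = line.startswith(">")
        else:
            is_header = index % 4 == 0
        # Skip the leading ">" or "@" and look for any whitespace.
        if is_header and any(ch.isspace() for ch in line[1:]):
            offending.append(line)
    return offending


# Hypothetical FASTA content: only the first header contains spaces.
fasta_example = [">read 1 length=10", "ACGTACGTAC", ">read_2", "ACGT"]
fasta_hits = headers_with_spaces(fasta_example)
print(fasta_hits)
```

With a real file, the lines can be supplied by passing an open file handle, for example `headers_with_spaces(open("reads.fasta"))`, where `reads.fasta` is a placeholder name.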
# Update

## Conda installation

The recommended way to update this tool is to remove the conda environment and reinstall it:

```
source deactivate
conda env remove -n assembly_report
wget https://gitlab.inria.fr/pmarijon/assembly_report/raw/master/conda_env.yml
conda env create -f conda_env.yml
```

## Non-conda installation

Run:

```
pip3 install --upgrade git+https://gitlab.inria.fr/pmarijon/paf2gfa.git#egg=paf2gfa
pip3 install --upgrade git+https://gitlab.inria.fr/pmarijon/path_in_gfa.git#egg=path_in_gfa
pip3 install --upgrade git+https://gitlab.inria.fr/pmarijon/assembly_report.git#egg=assembly_report
```

Replace `pip3` with `pip` if your default Python version is 3.

# Contact

For questions or bug reports, send an e-mail to pierre.marijon@inria.fr