Commit 9e0b8730 authored by MARIJON Pierre

Add dataset demo

parent f1f14f4b
H VN:Z:1.0
S tig00000001 * LN:i:2765793
S tig00000004 * LN:i:1743974
S tig00000008 * LN:i:256602
# About dataset
This is a 20x synthetic dataset generated by LongISLND, with a PacBio error model, from *Terriglobus roseus* (NC_108014.1).
It was assembled with canu 1.7.1 using `genomeSize=5m --corOutCoverage=20`.
# File description
- assembly_contig.fasta: contig sequences generated by canu
- assembly_graph.gfa: contig graph generated by canu
- reads_raw.fasta: reads generated by LongISLND
- reads_corrected.fasta: reads corrected by canu
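The contig graph records each contig's length in an `LN:i:` tag on its `S` line, so the assembly size can be compared with the `genomeSize=5m` setting. The following is a minimal sketch of that check, not part of knot: `contig_lengths` is a hypothetical helper, and the three `S` lines are copied from the demo graph, assuming the usual tab-separated GFA layout.

```python
import io


def contig_lengths(gfa_lines):
    """Return {segment_name: length} for GFA 'S' lines that carry an LN:i: tag."""
    lengths = {}
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] != "S":
            continue  # skip header (H), link (L) and other record types
        # Optional tags start after the name and sequence columns.
        for tag in fields[3:]:
            if tag.startswith("LN:i:"):
                lengths[fields[1]] = int(tag[len("LN:i:"):])
    return lengths


# Segment lines from the demo contig graph (assumed to be tab-separated).
demo_gfa = (
    "H\tVN:Z:1.0\n"
    "S\ttig00000001\t*\tLN:i:2765793\n"
    "S\ttig00000004\t*\tLN:i:1743974\n"
    "S\ttig00000008\t*\tLN:i:256602\n"
)

lengths = contig_lengths(io.StringIO(demo_gfa))
total = sum(lengths.values())
print(f"{len(lengths)} contigs, {total} bp in total")
```

To run it on the file itself, pass an open file handle instead of the `io.StringIO` object.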
# How to run knot
On raw reads:
```
knot -r demo/reads_raw.fasta -c demo/assembly_contig.fasta -g demo/assembly_graph.gfa -o raw
```
knot outputs are prefixed with `raw_`:
```
raw_AAG.csv # AAG format, described in the Readme
raw_knot # knot working directory
├── contigs.fasta # symbolic link to the contig sequences provided as input
├── contigs_filtred.fasta # contigs kept in the analysis, filtered on length
├── contigs_filtred.gfa # contig graph generated by fpa from the contig mapping (contigs_filtred.paf)
├── contigs_filtred.paf # mapping of the filtered contigs with minimap
├── contigs_graph.gfa # symbolic link to the contig graph provided as input
├── ext_search.csv # reads associated with each contig extremity
├── raw_reads.fasta # symbolic link to the raw reads provided as input
├── raw_reads.paf # self-mapping of raw_reads
├── raw_reads_splited.fasta # raw reads with the uncovered regions removed by yacrd
├── raw_reads_splited.gfa # overlap graph generated by fpa from the raw_reads_splited self-mapping
├── raw_reads_splited.paf # self-mapping of raw_reads_splited
├── raw_reads.yacrd # yacrd output on raw_reads
└── read2asm.paf # mapping of the reads on contigs_filtred
```
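One way to confirm that a run completed is to check that the outputs listed above exist. This is a minimal sketch, not part of knot: `missing_outputs` is a hypothetical helper, and it assumes knot was run with `-o raw` from the current directory.

```python
from pathlib import Path

# File names copied from the raw-read working directory listing above.
RAW_OUTPUTS = [
    "contigs.fasta",
    "contigs_filtred.fasta",
    "contigs_filtred.gfa",
    "contigs_filtred.paf",
    "contigs_graph.gfa",
    "ext_search.csv",
    "raw_reads.fasta",
    "raw_reads.paf",
    "raw_reads_splited.fasta",
    "raw_reads_splited.gfa",
    "raw_reads_splited.paf",
    "raw_reads.yacrd",
    "read2asm.paf",
]


def missing_outputs(prefix, expected, root="."):
    """Return the expected output paths (relative to root) that do not exist."""
    root = Path(root)
    workdir = root / f"{prefix}_knot"
    missing = []
    if not (root / f"{prefix}_AAG.csv").exists():
        missing.append(f"{prefix}_AAG.csv")
    for name in expected:
        # Path.exists() follows symbolic links, so a dangling link counts as missing.
        if not (workdir / name).exists():
            missing.append(f"{prefix}_knot/{name}")
    return missing


print(missing_outputs("raw", RAW_OUTPUTS))
```

An empty list means every listed file is present. A dangling symbolic link is reported as missing, which also catches input files that were moved after the run.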
On corrected reads:
```
knot -C demo/reads_corrected.fasta -c demo/assembly_contig.fasta -g demo/assembly_graph.gfa -o corrected
```
knot outputs are prefixed with `corrected_`:
```
corrected_AAG.csv # AAG format, described in the Readme
corrected_knot # knot working directory
├── contigs.fasta # symbolic link to the contig sequences provided as input
├── contigs_filtred.fasta # contigs kept in the analysis, filtered on length
├── contigs_filtred.gfa # contig graph generated by fpa from the contig mapping (contigs_filtred.paf)
├── contigs_filtred.paf # mapping of the filtered contigs with minimap
├── contigs_graph.gfa # symbolic link to the contig graph provided as input
├── ext_search.csv # reads associated with each contig extremity
├── raw_reads_splited.fasta # symbolic link to the corrected reads provided as input
├── raw_reads_splited.gfa # overlap graph generated by fpa from the raw_reads_splited self-mapping
├── raw_reads_splited.paf # self-mapping of raw_reads_splited
└── read2asm.paf # mapping of the reads on contigs_filtred
```