Commit a7042415 authored by Thonier Florian, committed by Mathieu Giraud

tools/pear_structured_log: add and update tests

link to #3054
parent c9d5d2d0
Pipeline #44252 passed with stages in 6 minutes and 15 seconds
=== Pre-process 2 ===
python pear.py /home/vidjil-ci/opt sequence_R1.fastq sequence_R2.fastq demo_set.fastq -r2
===============
Output log in demo_set.fastq.pre.log
 ____  _____    _    ____
|  _ \| ____|  / \  |  _ \
| |_) |  _|   / _ \ | |_) |
|  __/| |___ / ___ \|  _ <
|_|   |_____/_/   \_\_| \_\
PEAR v0.9.10 [May 30, 2016]
Citation - PEAR: a fast and accurate Illumina Paired-End reAd mergeR
Zhang et al (2014) Bioinformatics 30(5): 614-620 | doi:10.1093/bioinformatics/btt593
Forward reads file.................: sequence_R1.fastq
Reverse reads file.................: sequence_R2.fastq
PHRED..............................: 33
Using empirical frequencies........: YES
Statistical method.................: OES
Maximum assembly length............: 999999
Minimum assembly length............: 50
p-value............................: 0.010000
Quality score threshold (trimming).: 0
Minimum read size after trimming...: 1
Maximal ratio of uncalled bases....: 1.000000
Minimum overlap....................: 10
Scoring method.....................: Scaled score
Threads............................: 1
Allocating memory..................: 200,000,000 bytes
Computing empirical frequencies....: DONE
A: 0.269874
C: 0.257653
G: 0.238808
T: 0.233665
9000 uncalled bases
Assemblying reads: 0%Assemblying reads: 100%
Assembled reads ...................: 2,972 / 3,000 (99.067%)
Discarded reads ...................: 0 / 3,000 (0.000%)
Not assembled reads ...............: 28 / 3,000 (0.933%)
Assembled reads file...............: demo_set.fastq.assembled.fastq
Discarded reads file...............: demo_set.fastq.discarded.fastq
Unassembled forward reads file.....: demo_set.fastq.unassembled.forward.fastq
Unassembled reverse reads file.....: demo_set.fastq.unassembled.reverse.fastq
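The settings section of the log above follows a regular `Label.....: value` layout, which makes it straightforward to turn into a dictionary. The following is a minimal sketch of that idea, not the actual code of `pear_structured_log.py`: the function name `parse_settings`, the regular expression, and the way keys are derived from labels are all assumptions made for illustration, so the resulting keys do not necessarily match the JSON keys checked by the tests.

```python
import re

# Hypothetical pattern: a label, a run of dots, a colon, then the value.
LINE_RE = re.compile(r"^(?P<key>[^.:]+?)\s*\.+\s*:\s*(?P<value>.*)$")


def parse_settings(lines):
    """Collect 'Label.....: value' pairs from PEAR log lines into a dict."""
    settings = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            # Assumed key convention: lower-case label, spaces to underscores.
            key = match.group("key").strip().lower().replace(" ", "_")
            settings[key] = match.group("value").strip()
    return settings


sample = [
    "PEAR v0.9.10 [May 30, 2016]",
    "PHRED..............................: 33",
    "Minimum overlap....................: 10",
    "Scoring method.....................: Scaled score",
    "A: 0.269874",
]
parsed = parse_settings(sample)
```

Lines without a dotted leader, such as the version banner or the base frequencies, are skipped by this pattern and would need their own handling.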
=== Pre-process 2 ===
python pear.py /home/vidjil-ci/opt sequence_R1.fastq sequence_R2.fastq demo_set.fastq -r2
===============
Output log in /mnt/vda/prod/result/tmp/pre/out-039723//demo_set_R_6.fastq.pre.log
 ____  _____    _    ____
|  _ \| ____|  / \  |  _ \
| |_) |  _|   / _ \ | |_) |
|  __/| |___ / ___ \|  _ <
|_|   |_____/_/   \_\_| \_\
PEAR v0.9.10 [May 30, 2016]
Citation - PEAR: a fast and accurate Illumina Paired-End reAd mergeR
Zhang et al (2014) Bioinformatics 30(5): 614-620 | doi:10.1093/bioinformatics/btt593
Forward reads file.................: sequence_R1.fastq
Reverse reads file.................: sequence_R2.fastq
PHRED..............................: 33
Using empirical frequencies........: YES
Statistical method.................: OES
Maximum assembly length............: 999999
Minimum assembly length............: 50
p-value............................: 0.010000
Quality score threshold (trimming).: 0
Minimum read size after trimming...: 1
Maximal ratio of uncalled bases....: 1.000000
Minimum overlap....................: 10
Scoring method.....................: Scaled score
Threads............................: 1
Allocating memory..................: 200,000,000 bytes
Computing empirical frequencies....: DONE
A: 0.269874
C: 0.257653
G: 0.238808
T: 0.233665
9000 uncalled bases
Assemblying reads: 0%Assemblying reads: 100%
Assembled reads ...................: 972 / 3,000 (32.400%)
Discarded reads ...................: 500 / 3,000 (16.666%)
Not assembled reads ...............: 1528 / 3,000 (50.933%)
Assembled reads file...............: demo_set.fastq.assembled.fastq
Discarded reads file...............: demo_set.fastq.discarded.fastq
Unassembled forward reads file.....: demo_set.fastq.unassembled.forward.fastq
Unassembled reverse reads file.....: demo_set.fastq.unassembled.reverse.fastq
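PEAR prints the read percentages with three decimals (for example `16.666%`), while the tests in this commit expect full floating-point values, which suggests the percentages are recomputed from the raw counts. A minimal sketch of that step, assuming hypothetical helper names `parse_counts` and `percentage` that are not taken from `pear_structured_log.py`:

```python
import re

# Matches the 'count / total' pair after the colon; thousands separators allowed.
COUNT_RE = re.compile(r":\s*([\d,]+)\s*/\s*([\d,]+)")


def parse_counts(line):
    """Return (count, total) as integers from a PEAR summary line."""
    match = COUNT_RE.search(line)
    if match is None:
        raise ValueError("no 'count / total' pair in line: %r" % line)
    count, total = (int(group.replace(",", "")) for group in match.groups())
    return count, total


def percentage(count, total):
    """Percentage of count over total, or 0.0 when total is zero."""
    return 100.0 * count / total if total else 0.0


line = "Discarded reads ...................: 500 / 3,000 (16.666%)"
count, total = parse_counts(line)
pct = percentage(count, total)
```

Recomputing from the counts avoids carrying over the rounding applied by PEAR when it formats its own log.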
!LAUNCH: python ../../pear_structured_log.py -i pear_log.log
!LAUNCH: python ../../pear_structured_log.py -i ../data/pear_log.log -o ../data/pear_strucured.json; cat ../data/pear_strucured.json
$ Clone id-1 has a different number of reads
1:Not the same number or reads: id-1
$ Correct number of assembled reads
1:"reads_assembled_number": 2972
$ Clone id-2 is not in the second file
1:[-] .* Clone not present: .* id-2
1:"reads_total_number": 3000
$ Clone id-3 is not in the first file
1:[+] .* Clone not present: .* id-3
$ Correct number of unassembled reads
1:"reads_not_assembled_number": 28
$ Correct information on output files written by pear
1:"assembled_reads": "demo_set.fastq.assembled.fastq",
1:"discarded_reads": "demo_set.fastq.discarded.fastq",
1:"unassembled_forward": "demo_set.fastq.unassembled.forward.fastq",
1:"unassembled_reverse": "demo_set.fastq.unassembled.reverse.fastq"
$ Correct parameters returned
1:"forward_file": "sequence_R1.fastq",
1:"minimum_overlap": "10",
1:"phred": "33",
1:"reverse_file": "sequence_R2.fastq",
1:"scoring_methode": "Scaled score",
1:"version": "PEAR v0.9.10 \[May 30, 2016\]"
$ Correct nucleotide frequencies
1:"base_frequency_a": "0.269874",
1:"base_frequency_c": "0.257653",
1:"base_frequency_g": "0.238808",
1:"base_frequency_t": "0.233665",
\ No newline at end of file
!LAUNCH: python ../../pear_structured_log.py -i ../data/pear_log_warning.log -o ../data/pear_strucured_waring.json; cat ../data/pear_strucured_waring.json
$ Correct percentage of reads
1:"percentage_assembled": 32.4,
1:"percentage_discarded": 16.666666666666668,
1:"percentage_not_assembled": 50.93333333333333,
$ Correct number of reads
1:"reads_assembled_number": 972,
1:"reads_discarded_number": 500,
1:"reads_not_assembled_number": 1528,
1:"reads_total_number": 3000
$ Correct addition of warnings to the structured log
1:"Very few reads assembled"
1:"High level of discarded reads"
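The last test checks that two warning messages end up in the structured log for the fixture where only 32.4% of reads are assembled and about 16.7% are discarded. The commit page does not show the thresholds that trigger them, so the sketch below uses hypothetical cut-offs (`MIN_ASSEMBLED_PCT`, `MAX_DISCARDED_PCT`) chosen only so that this fixture raises both warnings and the first fixture raises none. The function name and the `"warnings"` key are likewise assumptions.

```python
import json

# Hypothetical thresholds; the real values used by the script are not shown here.
MIN_ASSEMBLED_PCT = 50.0
MAX_DISCARDED_PCT = 10.0


def collect_warnings(pct_assembled, pct_discarded):
    """Return warning messages for suspicious PEAR merge statistics."""
    warnings = []
    if pct_assembled < MIN_ASSEMBLED_PCT:
        warnings.append("Very few reads assembled")
    if pct_discarded > MAX_DISCARDED_PCT:
        warnings.append("High level of discarded reads")
    return warnings


report = {
    "percentage_assembled": 32.4,
    "reads_assembled_number": 972,
    "reads_discarded_number": 500,
    "reads_total_number": 3000,
    # "warnings" is an assumed key name for illustration.
    "warnings": collect_warnings(32.4, 100.0 * 500 / 3000),
}
print(json.dumps(report, indent=2, sort_keys=True))
```

Keeping the thresholds as named constants makes them easy to adjust once the actual limits are known.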