Commit bf0c408a authored by Mathieu Giraud

tests: stanford-*.should_get, faster tests

The tests using Stanford-S22 mostly test things on the extracted windows,
sometimes on the representatives, but almost never on the fine segmentation.
We thus add '-y 0', '-z 0', or similar options to these tests.
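
The edit described above is mechanical, so it could be scripted. The sketch below is only an illustration: `add_option` is a hypothetical helper, not part of the Vidjil repository, and it assumes that every call to patch appears as `../../vidjil` on a `!LAUNCH:` line, as in the tests of this commit.

```python
import re

# Hypothetical helper, not part of the Vidjil repository.
# Matches the program path only when it is followed by whitespace,
# so paths such as ../../germline/IGH are left alone.
VIDJIL_CALL = re.compile(r"(\.\./\.\./vidjil)(?=\s)")


def add_option(line: str, option: str) -> str:
    """Insert `option` right after each ../../vidjil call on a !LAUNCH: line."""
    if not line.startswith("!LAUNCH:"):
        return line  # expectations and comments are left untouched
    return VIDJIL_CALL.sub(lambda m: f"{m.group(1)} {option}", line)


before = "!LAUNCH: ../../vidjil -X 100 -G ../../germline/IGH ../../data/Stanford_S22.fasta"
print(add_option(before, "-y 0"))
```

Lines that launch the program twice (for instance the forward and reverse-complement comparison) get the option on both calls, since the substitution applies to every match.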

On one laptop, running all *.should_get tests now takes 2'34 instead of 3'48.
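
To put the two timings in perspective, the following sketch converts them to seconds and computes the relative saving. The only assumption is that 2'34 and 3'48 read as minutes and seconds.

```python
def to_seconds(stamp: str) -> int:
    """Convert a M'SS timing such as 2'34 into seconds."""
    minutes, seconds = stamp.split("'")
    return int(minutes) * 60 + int(seconds)


old = to_seconds("3'48")   # 228 s before the change
new = to_seconds("2'34")   # 154 s after the change
saved = old - new          # 74 s
reduction = saved / old    # about 0.32

print(f"{saved} s saved, {reduction:.0%} less wall-clock time")
# prints: 74 s saved, 32% less wall-clock time
```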
parent cbb8afc2
-!LAUNCH: ../../vidjil -X 100 -G ../../germline/IGH ../../data/Stanford_S22.fasta
+!LAUNCH: ../../vidjil -y 0 -X 100 -G ../../germline/IGH ../../data/Stanford_S22.fasta
$ Skip the good number of reads
1:Processing every 131th read
......
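
The expected line "Processing every 131th read" is consistent with the read count that another test in this commit reports for the same file (`"total": [13153]`). The sketch below checks the arithmetic under the assumption, not confirmed by this diff, that `-X 100` samples reads with a stride obtained by integer division. `sampling_stride` is a hypothetical name, not a Vidjil function.

```python
def sampling_stride(total_reads: int, max_reads: int) -> int:
    """Stride between sampled reads, assuming plain integer division."""
    return total_reads // max_reads


total_reads = 13153  # "total": [13153], as expected for Stanford_S22.fasta
max_reads = 100      # value passed to -X

print(f"Processing every {sampling_stride(total_reads, max_reads)}th read")
# prints: Processing every 131th read
```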
-!LAUNCH: ../../vidjil -G ../../germline/IGH -w 60 -r 5 -b data ../../data/Stanford_S22.fasta ; cat out/data.vidjil | python ../../tools/format_json.py -1
+!LAUNCH: ../../vidjil -z 0 -G ../../germline/IGH -w 60 -r 5 -b data ../../data/Stanford_S22.fasta ; cat out/data.vidjil | python ../../tools/format_json.py -1
$ Number of reads
e1:"total": [13153]
......
-!LAUNCH: ../../vidjil -w 60 -r 5 -o out2 -u -U -v -G ../../germline/IGH ../../data/Stanford_S22.fasta ; tail out2/Stanford_S22.segmented.vdj.fa ; grep UNSEG out2/Stanford_S22.unsegmented.vdj.fa
+!LAUNCH: ../../vidjil -z 0 -w 60 -r 5 -o out2 -u -U -v -G ../../germline/IGH ../../data/Stanford_S22.fasta ; tail out2/Stanford_S22.segmented.vdj.fa ; grep UNSEG out2/Stanford_S22.unsegmented.vdj.fa
# Testing uncommon and debug options
$ verbose (-v)
......
-!LAUNCH: ../../vidjil -r 5 -a -G ../../germline/IGH ../../data/Stanford_S22.fasta ; cat out/seq/clone.fa-2
+!LAUNCH: ../../vidjil -z 2 -r 5 -a -G ../../germline/IGH ../../data/Stanford_S22.fasta ; cat out/seq/clone.fa-2
# Testing detailed clone output (-a)
$ Detailed clone output (out/seq/clone.fa-2), germline
......
-!LAUNCH: ../../vidjil -w 60 -G ../../germline/IGH ../../data/Stanford_S22.fasta ; python ../../tools/fuse.py out/Stanford_S22.vidjil out/Stanford_S22.vidjil -o out/fused.data ; cat out/fused.data | python ../../tools/format_json.py -1
+!LAUNCH: ../../vidjil -z 0 -w 60 -G ../../germline/IGH ../../data/Stanford_S22.fasta ; python ../../tools/fuse.py out/Stanford_S22.vidjil out/Stanford_S22.vidjil -o out/fused.data ; cat out/fused.data | python ../../tools/format_json.py -1
$ Points list
e1:"original_names": ["../../data/Stanford_S22.fasta", "../../data/Stanford_S22.fasta"]
......
-!LAUNCH: ../../vidjil -k 14 -G ../../germline/IGH ../../data/Stanford_S22.fasta
+!LAUNCH: ../../vidjil -y 0 -k 14 -G ../../germline/IGH ../../data/Stanford_S22.fasta
!LOG: stanford-k14.log
$ Find the good number of windows in Stanford S22 (contiguous seed 14)
......
-!LAUNCH: ../../vidjil -G ../../germline/IGH -r 5 -W GAGAGATGGACGGGATACGTAAAACGACATATGGTTCGGGGTTTGGTGCT ../../data/Stanford_S22.fasta
+!LAUNCH: ../../vidjil -z 0 -G ../../germline/IGH -r 5 -W GAGAGATGGACGGGATACGTAAAACGACATATGGTTCGGGGTTTGGTGCT ../../data/Stanford_S22.fasta
$ Some clone has only one read, bypassing the -r 5 option, and the good label
1: clone-00..*0001-.* -W
......
-!LAUNCH: ../../vidjil -G ../../germline/IGH -r 5 -l ../../data/Stanford_S22.label ../../data/Stanford_S22.fasta
+!LAUNCH: ../../vidjil -z 0 -G ../../germline/IGH -r 5 -l ../../data/Stanford_S22.label ../../data/Stanford_S22.fasta
$ Some clone has only one read, bypassing the -r 5 option, and the good label
1: clone-00..*0001-.* my-clone
......
-!LAUNCH: ../../vidjil -s '#####-#####' -w 100 -G ../../germline/IGH ../../data/Stanford_S22.fasta
+!LAUNCH: ../../vidjil -y 0 -s '#####-#####' -w 100 -G ../../germline/IGH ../../data/Stanford_S22.fasta
!LOG: stanford-w100.log
$ Find the good number of "too short sequences" for windows of size 100
......
-!LAUNCH: ../../vidjil -x 100 -G ../../germline/IGH ../../data/Stanford_S22.fasta
+!LAUNCH: ../../vidjil -y 0 -x 100 -G ../../germline/IGH ../../data/Stanford_S22.fasta
$ Analyze the good number of sequences in Stanford S22
1: found 98 ..-windows in 99 reads .99. of 100 reads
-!LAUNCH: ../../vidjil -V ../../germline/IGHV.fa -D ../../germline/IGHD.fa -J ../../germline/IGHJ.fa -s \\\\#\\\\#\\\\#\\\\#\\\\#\\\\#-\\\\#\\\\#\\\\#\\\\#\\\\#\\\\# ../../data/Stanford_S22.fasta
+!LAUNCH: ../../vidjil -z 0 -V ../../germline/IGHV.fa -D ../../germline/IGHD.fa -J ../../germline/IGHJ.fa -s \\\\#\\\\#\\\\#\\\\#\\\\#\\\\#-\\\\#\\\\#\\\\#\\\\#\\\\#\\\\# ../../data/Stanford_S22.fasta
$ Parses IGHV.fa germline
1: 101627 bp in 348 sequences
......
-!LAUNCH: $LAUNCHER ../../vidjil -k 9 -G ../../germline/IGH -% 0.001 -r 2 -x 1000 -y 1 -c clones ../../data/Stanford_S22.fasta | sed 's/--IGH--.*VDJ\\(.*\\).$/\\1/' > vidjil_s22.log && $LAUNCHER ../../vidjil -k 9 -G ../../germline/IGH -% 0.001 -r 2 -x 1000 -y 1 -c clones ../../data/Stanford_S22.rc.fasta | sed 's/--IGH--.*VDJ\\(.*\\).$/\\1/' > vidjil_s22_rc.log && diff vidjil_s22.log vidjil_s22_rc.log
+!LAUNCH: $LAUNCHER ../../vidjil -z 1 -k 9 -G ../../germline/IGH -% 0.001 -r 2 -x 1000 -y 1 -c clones ../../data/Stanford_S22.fasta | sed 's/--IGH--.*VDJ\\(.*\\).$/\\1/' > vidjil_s22.log && $LAUNCHER ../../vidjil -z 1 -k 9 -G ../../germline/IGH -% 0.001 -r 2 -x 1000 -y 1 -c clones ../../data/Stanford_S22.rc.fasta | sed 's/--IGH--.*VDJ\\(.*\\).$/\\1/' > vidjil_s22_rc.log && diff vidjil_s22.log vidjil_s22_rc.log
$ Same number segmented
0:==> segmented
......