Commit a05819e0 authored by Mathieu Giraud

Merge branch 'feature-a/3008-default_J_downstream' into 'dev'

Feature a/3008 default j downstream

See merge request !148
parents 751992b2 edd001ee
Pipeline #33011 passed with stages in 45 minutes and 37 seconds
......@@ -2,17 +2,17 @@
# one half is first 60 bp of some V genes
# the other half is a 40 bp di-mer sequence
>TRAV1-1*01--cgcg
>TRAV1-1*01--caca
ggacaaagccttgagcagccctctgaagtgacagctgtggaaggagccattgtccagata
cgcgcgcgcgcgcgcgcgcgcgcgcgcgcgcgcgcgcgcg
cacacacacacacacacacacacacacacacacacacaca
>TRGV1*01--cgcg
>TRGV1*01--caca
tcttccaacttggaagggagaacgaagtcagtcaccaggctgactgggtcatctgctgaa
cgcgcgcgcgcgcgcgcgcgcgcgcgcgcgcgcgcgcgcg
cacacacacacacacacacacacacacacacacacacaca
>IGHV1-18*01--cgcg
>IGHV1-18*01--caca
caggttcagctggtgcagtctggagctgaggtgaagaagcctggggcctcagtgaaggtc
cgcgcgcgcgcgcgcgcgcgcgcgcgcgcgcgcgcgcgcg
cacacacacacacacacacacacacacacacacacacaca
>atat--TRBV1*01
......
......@@ -3,7 +3,7 @@
# Sequences outside any V(D)J locus
>too_few_vj-1
CTAGGCATGGCTCCTCTCCACAGGAAAACTCCACTCCAGTGCTCAGCTTGCACCCTGGCACAGGCCAGCAGTTGCTGGAAGTCAGACACCTGAGAAGAAC
CTAGGCATGGCTCCTCTCCACAGGAAAACTCCACTCCAGTGCTCAGCTTGCACCCTGGCACAGGCCAGCAGTTGCTGGAAGTCAGACACCTGTGAAGAAC
>too_few_vj-2
GCCTCAGGCCAGCCTTCCGCTCCTTGAAGCTGGTCTCCGCACAGTGCTGGTTCCGTCACCCCCACCCAGGGAAGCAGGTCTGAGCAGCTTGTCCTGGCTG
......
......@@ -7,7 +7,7 @@ $ The FineSegmenter gives the locus information (-2)
1:Unexpected [+]TRBV/[+]TRGJ
1:Unexpected [+]TRDV/[+]IGHJ
1:Unexpected [+]TRDV/[-]IGKV
1:Unexpected [-]IGLJ/[+]TRGJ
1:Unexpected [-]IGLJ.down/[+]TRGJ
$ The FineSegmenter gives the locus information (-2) and takes the best strand
1:Unexpected [+]IGKV/[+]IGLJ
......
......@@ -4,14 +4,14 @@ $ number of reads and kmers
1:13153 reads, 3020179 kmers
$ k-mers, IGHV
1:13115 .* 1222984 .*IGHV
1:13116 .* 1219953 .*IGHV
$ k-mers, IGHJ
1:38 .* 435867 .*IGHJ
1:37 .* 435550 .*IGHJ
$ k-mers, ambiguous
1:47648 .*\\?
1:53426 .*\\?
$ k-mers, unknown
1:1251567 .*_
1:1248938 .*_
......@@ -10,7 +10,7 @@ CTACTACTACATGGACGTCTGGGGCAAAGGGACCCTGGTCACCGTCTCCTCAGGT
>TRGV2*01 1/C/0 TRGJ1*02 [TRG] {CATWDG!YYKKLF}
GGAAGGCCCCACAGCGTCTTCAGTACTATGACTCCTACAACTCCAAGGTTGTGTTGGAATCAGGAGTCAGTCCAGGGAAGTATTATACTTACGCAAGCACAAGGAACAACTTGAGATTGATACTGCGAAATCTAATTGAAAATGACTCTGGGGTCTATTACTGTGCCACCTGGGACGGCTTATTATAAGAAACTCTTTGGCAGTGGAACAACAC
>TRGV3*01 0/CC/0 TRGJ1*01 [TRG] {CATWDRPNYYKKLF}
>TRGV3*01 0/CC/0 (TRGJ1*01, TRGJ2*01) [TRG] {CATWDRPNYYKKLF}
GGAAGGCCCCACAGCGTCTTCTGTACTATGACGTCTCCACCGCAAGGGATGTGTTGGAATCAGGACTCAGTCCAGGAAAGTATTATACTCATACACCCAGGAGGTGGAGCTGGATATTGAGACTGCAAAATCTAATTGAAAATGATTCTGGGGTCTATTACTGTGCCACCTGGGACAGGCCGAATTATTATAAGAAACTCTTTGGCAGTGGAACAACAC
# PCR dimer (V4-K5), should not be segmented
#
>dimer__UNSEG
>dimer [unexpected]
GAGCTCTGTGACCGCCGCGGACACGACTGGAGATTAAACGTAAGTGTAGAACCATGTCGTCAGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAAAAA
>IGKV1-5*03 9/4/1 IGKJ1*01 [IGK] {CQQYNRLWTF}
>IGKV1-5*03 (9/4/1 IGKJ1*01, 9/7/4 IGKJ4*02) [IGK] {CQQYNRLWTF}
CTCCTGCTACTCTGGCTCCCAGGTGCCAAATGTGACATCCAGATGACCCAGTCTCCTTCCACCCTGTCTGCGTCTGTAGGAGACAGAGTCACCATCACCTGCCGGGCCAGTCAGAGTATTAATAACAACTTGGCCTGGTATCAGGAGAAGCCAGGGAAAGCCCCTAAGGTCCTGATCTATAAGGCGTCTAGTTTAGAAAGTGGGGTCCCATCAAGGTTCAGCGGCAGTGGATCTGGGACAGAATTCACTCTCACCATCAGCAGCCTGCAGCCTGATGATTTTGCAACCTATTACTGCCAACAATATAATAGACTTTGGACGTTCGGCCAAGGGACCAAGGTGGAAGTCAAACGAACTGTGGCTGCACCATCT
# TRGV3*01 TGCCACCTGGGACAGG
# |||||||||||||
# seq GGGGTCTATTACTGTGCCACCTGGGACTCTTCTGTTGTCACAGGTAAGTATC
# ||.||.|| .|||.||....|||||||||||||||||||||
# TRGJ2*01 GAATTATTATAAGAAACTCTTTGGCA GTGGAACAACACTTGTTGTCACAGGTAAGTATC
# ^ ^ ^
# 0 18 37
>TRGV3*01 3/TCTTC/39 TRGJ2*01 [TRG] BUG
CTACACCAGGAGGGGAAGGCCCCACAGCGTCTTCTGTACTATGACGTCTACACCGCAAGGGATGTGTTGGAATCAGGACTCAGTCCAGGAAAGTATTATACTCATACACCAAGGAGGTGGAGCTGGATATTGAGACTGCAAAATCTAATTGAAAATGATTCTGGGGTCTATTACTGTGCCACCTGGGAC
TCTTC
TGTTGTCACAGGTAAGTATCGGAAGAAT
\ No newline at end of file
......@@ -8,7 +8,8 @@ tccccctgatcctggagtcgcccagccccaaccagacccaaag
ctacgagcagtacttcgggccgggcaccaggctcacggttacaggtaag
#target 0447GG
>TRBD2*02 1/AATG/0 TRBJ2-3*01 [TRB+] BUG
#See #3145, this sequence may be a TRBJ2-2 ... TRBD2 ... TRBJ2-3 !
>TRBD2*02 1/AATG/0 TRBJ2-3*01 [TRB+] TODO
CAGGGCTGAGCGAACACCGGGGAGCTGTTTTTTGGAGAAGGCTCTAGGCTGACCGTACTGGGTAAGGAGGCG
GTTGGGGCTCCGGAGAGCTCCGAGAGGGCGGGATGGGCAGAGGTAAGCAGCTGCCCCACTCTGAGAGGGGCT
GTGCTGAGAGGTTTTTCCAAGCCCCACACAGTCAGACTAACCTCTGCCACCTGCGCTTCCTGCCGCTGCCCA
......
......@@ -10,8 +10,8 @@ TGCCGGGCAGTCAGGGCATTAGAAATGATTTAGGCTGGTATCAGCAGAAACCAGGGNAAAGCCCCTAAGCTCCTGATCTA
>TRDV2*03 2/4/3 TRAJ48*01 [TRA+D]
ACTGGTACAGGAAGACCCAAGGTAACACAATGACTTTCATATACCGAGAAAAGGACATCTATGGCCCTGGTTTCAAAGACAATTTCCAAGGTGACATTGATATTGCAAAGAACCTGGCTGTACTTAAGATACTTGCACCATCAGAGAGAGATGAAGGGTCTTACTACTGTGCCTGTGACAGAGACTAACTTTGGAAATGAGAAATTAACCTTTGGGACTGGAACAAGACTCACCATCATACCCAGTAAGTTCTTCATCCTTGGTCAGGAAATCAGCCTGCATAAGATTCTGGGGAA
>TRGV5*01 (5/4/3 TRGJ1*01, 5/4/0 TRGJ1*02) [TRG]
>TRGV5*01 (5/4/3 TRGJ2*01, 5/4/0 TRGJ1*02) [TRG]
TTCCGTTCTNCCAACTNCAANGGNNGGTNGTTGGGAATCAGGNACTCAGTNCCAGGNAAAGTATTATACTCATACACCCAGGAGGTGGAGCTGGATATTGATACTACGAAATCTAATTGAAAATGATTCTGGGGTCTATTACTGTGCCACCTGGGCCTTTTATTATAAGAAACTCTTTGGCAGTGGAACAACACTTGTTGTCACAGGTAAGTATCGGAAGAAA
>TRGV4*02 (4/4/4 TRGJ1*01, 4/4/1 TRGJ1*02, 1/1/1 TRGJ1*02) [TRG]
>TRGV4*02 (4/4/1 TRGJ2*01, 1/1/1 TRGJ2*01, 4/4/1 TRGJ1*02, 1/1/1 TRGJ1*02) [TRG]
GGTTTTNTTCGTTGCCNTCCNTTTTTTGNTGGAATCAGGAATCAGCCCAGGGAAAGTATGATACTTACGGAAGCACAAGGTAAGAACTTGAGAATGATACTGCGAAATCTTATTGAAAATGACTCTGGAGTCTATTACTGTGCCACCTGGGAAGGCTATTATAAGAAACTCTTTGGCAGTGGAACAACACTTGTTGTCACAGGTAAGTATCGGAAGATATTCC
\ No newline at end of file
>TRGV1*01, TRGV5*01 0//0 TRGJ1*01 [TRG]
>(TRGV1*01, TRGV3*01, TRGV5*01) 0//0 (TRGJ1*01, TRGJ2*01) [TRG]
NNNNNNNNNNNN
tactgtgccacctgggacagg
gaattattataagaaactctt
......
......@@ -80,7 +80,7 @@ GCTGGCAGATTCTGCAGGCATCTGGATACACCTTCACCAGCTACTATATGCACTGGGTGCGACAGGCCCCTGGACAAGGG
CGGGGTAGTCTCCTGCAGCTTCTGGAGGCACCTTCAGCAGCTATGCTATCAGCTGGGTGCGACAGGCCCCTGGACAAGGGCTTGAGTGGATGGGAGGGATCATCCCTATCTTTGGTACAGCAAACTACGCACAGAAGTTCCAGGGCAGAGTCACGATTACCGCGGACGAATCCACGAGCACAGCCTACATGGAGCTGAGCAGCCTGAGATCTGAGGACACGGCCGTGTATTACTGTGCGAGGGGGGGCTGGAACTACTACTACGGTATGGACGTCTGGGGCCAAGGGACCACGGTCACCGTCTCCTCAGGTAAAACTTATCCTTGGAAAGACGTCATCCGTAGTAGTAGTTCAGCCCCCCTGCCGTAATACAGGCCGTGTCTCGATCTCAGCGGCTCACTCCTGGTGGGCTGTGCTTGTGGAATTCTCCCGGTAATCGTGACTTTGCCCTGGAAATTCTATGCCTATTGTTGCTGGACC
#1428 vh2
>(IGHV2-70*04 IGHJ6*02, IGHV2-70D*04 IGHJ6*02) [IGH]
>(IGHV2-70*04 IGHJ6*02, IGHV2-70D*04 IGHJ6*02) [IGH]
CCACAGACCCTCACACTGACCTGCACCTTCTCTGGGTTCTCACTCAGCACTAGTGGAATGCGTGTGAGCTGGATCCGTCAGCCCCCAGGGAAGGCCCTGGAGTGGCTTGCACGCATTGATTGGGATGATGATAAATTCTACAGCACATCTCTGAAGACCAGGCTCACCATCTCCAAGGACACCTCCAAAAACCAGGTGGTCCTTACAATGACCAACATGGACCCTGTGGACACAGCCACGTATTACTGTGTTGGTATGGACGTCTGGGGCCAAGGGACCACGGTCACCGTCTCCTCAGGTAAAACCACGTCCTTGGCCCGACTCCATACCACACCTAATACCTGGCTGTGTCCACGAGGATCCATGGTTGGGCATTGTAGGACCACTGGTTTTTGGAAGTTCCCTTGAGAAAAGAAAAACAGAATCTTCAAAAAA
#1428 vh3
......
>IGHV1-18*01, IGHV7-4-1*02 0//0 IGHD1-1*01 0//0 IGHJ1*01 [IGH]
>IGHV 0//0 IGHD1-1*01 0//0 IGHJ1*01 [IGH]
actgtgcgagaga
ggtacaactggaacgac
gctgaatacttcc
>IGHV1-18*01 0//0 IGHD3-16*01 0//0 IGHJ4*01 [IGH]
>(IGHV1-18*01, IGHV1-18*04) 0//0 IGHD3-16*01 0//0 IGHJ4*01 [IGH]
agcctacatggagctgaggagcctgagatctgacgacacggccgtgtattactgtgcgagaga
gtattatgattacgtttgggggagttatgcttatacc
actactttgactactggggccaaggaaccctggtcaccgtctcctcag
# Same sequence, but with only the 20bp start of J
>IGHV1-18*01 0//0 IGHD3-16*01 0//0 IGHJ4*01 [IGH] BUG
>(IGHV1-18*01, IGHV1-18*04) 0//0 IGHD3-16*01 0//0 (IGHJ4*01, IGHJ4*02) [IGH] BUG
agcctacatggagctgaggagcctgagatctgacgacacggccgtgtattactgtgcgagaga
gtattatgattacgtttgggggagttatgcttatacc
actactttgactactggggcc
# Same sequence, but with only the 15bp start of J
>IGHV1-18*01 0//0 IGHD3-16*01 0//0 IGHJ4*01 [IGH] BUG-LOCUS
>(IGHV1-18*01, IGHV1-18*04) 0//0 IGHD3-16*01 0//0 (IGHJ4*01, IGHJ4*02) [IGH] BUG-LOCUS
agcctacatggagctgaggagcctgagatctgacgacacggccgtgtattactgtgcgagaga
gtattatgattacgtttgggggagttatgcttatacc
actactttgactactg
\ No newline at end of file
>IGLV1-36*01 0//25 IGLJ1*01
cagtctgtgctgactcagccaccctcggtgtctgaagcccccaggcagagggtcaccatctcctgttctggaagcagctccaacatcggaaataatgctgtaaactggtaccagcagctcccaggaaaggctcccaaactcctcatctattatgatgatctgctgccctcaggggtctctgaccgattctctggctccaagtctggcacctcagcctccctggccatcagtgggctccagtctgaggatgaggctgattattactgtgcagcatgggatgacagcctgaatggtcc
gtcaccgtcctaggagtctgctgtctggggatagcggggagccaggtgtactg
\ No newline at end of file
......@@ -27,7 +27,7 @@ TACTTCCAGCACTGGGGCCAGGGCACCCTGGTCACCGTCTCCTCAGGTAAG
### IGK: VJ, V-KDE, Intron-KDE
# 1043-IGK
>IGKV1-5*03 9/4/1 IGKJ1*01 [IGK] {CQQYNRLWTF}
>IGKV1-5*03 (9/4/1 IGKJ1*01, 9/7/4 IGKJ4*02) [IGK] {CQQYNRLWTF}
CTCCTGCTACTCTGGCTCCCAGGTGCCAAATGTGACATCCAGATGACCCAGTCTCCTTCCACCCTGTCTGCGTCTGTAGGAGACAGAGTCACCATCACCTGCCGGGCCAGTCAGAGTATTAATAACAACTTGGCCTGGTATCAGGAGAAGCCAGGGAAAGCCCCTAAGGTCCTGATCTATAAGGCGTCTAGTTTAGAAAGTGGGGTCCCATCAAGGTTCAGCGGCAGTGGATCTGGGACAGAATTCACTCTCACCATCAGCAGCCTGCAGCCTGATGATTTTGCAACCTATTACTGCCAACAATATAATAGACTTTGGACGTTCGGCCAAGGGACCAAGGTGGAAGTCAAACGAACTGTGGCTGCACCATCT
# 0119-lil-IGK+-TRA+D-TRD+-TRG
......
......@@ -14,7 +14,7 @@
"description": "Human T-cell receptor, alpha locus (14q11.2)",
"recombinations": [ {
"5": ["TRAV.fa"],
"3": ["TRAJ.fa"]
"3": ["TRAJ+down.fa"]
} ],
"parameters": {
"seed": "13s"
......@@ -28,7 +28,7 @@
"recombinations": [ {
"5": ["TRBV.fa"],
"4": ["TRBD.fa"],
"3": ["TRBJ.fa"]
"3": ["TRBJ+down.fa"]
} ],
"parameters": {
"seed": "12s"
......@@ -41,7 +41,7 @@
"follows": "TRB",
"recombinations": [ {
"5": ["TRBD+up.fa"],
"3": ["TRBJ.fa"]
"3": ["TRBJ+down.fa"]
} ],
"parameters": {
"seed": "12s"
......@@ -54,7 +54,7 @@
"description": "Human T-cell receptor, gamma locus (7p14)",
"recombinations": [ {
"5": ["TRGV.fa"],
"3": ["TRGJ.fa"]
"3": ["TRGJ+down.fa"]
} ],
"parameters": {
"seed": "10s"
......@@ -68,7 +68,7 @@
"recombinations": [ {
"5": ["TRDV.fa"],
"4": ["TRDD.fa"],
"3": ["TRDJ.fa"]
"3": ["TRDJ+down.fa"]
} ],
"parameters": {
"seed": "10s"
......@@ -81,10 +81,10 @@
"recombinations": [ {
"5": ["TRDV.fa"],
"4": ["TRDD.fa"],
"3": ["TRAJ.fa"]
"3": ["TRAJ+down.fa"]
}, {
"5": ["TRDD+up.fa"],
"3": ["TRAJ.fa"]
"3": ["TRAJ+down.fa"]
} ],
"parameters": {
"seed": "13s"
......@@ -101,7 +101,7 @@
}, {
"5": ["TRDD2+up.fa"],
"4": ["TRDD.fa"],
"3": ["TRDJ.fa"]
"3": ["TRDJ+down.fa"]
}, {
"5": ["TRDD2+up.fa"],
"3": ["TRDD3+down.fa"]
......@@ -118,7 +118,7 @@
"recombinations": [ {
"5": ["IGHV.fa"],
"4": ["IGHD.fa"],
"3": ["IGHJ.fa"]
"3": ["IGHJ+down.fa"]
} ],
"parameters": {
"seed": "12s"
......@@ -131,7 +131,7 @@
"follows": "IGH",
"recombinations": [ {
"5": ["IGHD+up.fa"],
"3": ["IGHJ.fa"]
"3": ["IGHJ+down.fa"]
} ],
"parameters": {
"seed": "12s"
......@@ -144,7 +144,7 @@
"description": "Human immunoglobulin, kappa locus (2p11.2)",
"recombinations": [ {
"5": ["IGKV.fa"],
"3": ["IGKJ.fa"]
"3": ["IGKJ+down.fa"]
} ],
"parameters": {
"seed": "10s"
......@@ -170,7 +170,7 @@
"description": "Human immunoglobulin, lambda locus (22q11.2)",
"recombinations": [ {
"5": ["IGLV.fa"],
"3": ["IGLJ.fa"]
"3": ["IGLJ+down.fa"]
} ],
"parameters": {
"seed": "10s"
......
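The germline hunks above all make the same substitution: each 3' entry naming a J repertoire (`TRAJ.fa`, `IGHJ.fa`, ...) becomes its `+down` variant, while entries that already carry a suffix (such as `TRDD3+down.fa`) are left alone. A minimal sketch of that rewrite follows, assuming the `"5"`/`"4"`/`"3"` layout shown in the hunks. The function name `use_downstream_j` and the `sample` list are hypothetical illustrations, not part of the repository.

```python
import json
import re


def use_downstream_j(recombinations):
    """Return a copy where each 3' file 'xxxJ.fa' becomes 'xxxJ+down.fa'.

    Hypothetical helper: it only mirrors the renaming visible in the diff.
    """
    updated = []
    for reco in recombinations:
        reco = dict(reco)  # shallow copy, so the input list is not modified
        if "3" in reco:
            # Only names ending in 'J.fa' are rewritten; others pass through.
            reco["3"] = [re.sub(r"J\.fa$", "J+down.fa", name) for name in reco["3"]]
        updated.append(reco)
    return updated


# Sample entries copied from the pre-change side of the hunks above.
sample = [
    {"5": ["TRDV.fa"], "4": ["TRDD.fa"], "3": ["TRAJ.fa"]},
    {"5": ["TRDD+up.fa"], "3": ["TRAJ.fa"]},
    {"5": ["TRDD2+up.fa"], "3": ["TRDD3+down.fa"]},
]

print(json.dumps(use_downstream_j(sample), indent=2))
```

The anchored pattern `J\.fa$` is a deliberate choice in this sketch: it leaves the 5' and 4' entries and any already-suffixed file untouched, which matches what the hunks show.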