Commit a6ef6d3a authored by Mikaël Salson

Merge branch 'feature-a/3569-more-airr-fields' into 'dev'

Feature a/3569 more AIRR fields

See merge request !715
parents 7d09552a 70872468
Pipeline #199630 passed with stages
in 10 minutes and 29 seconds
......@@ -167,6 +167,18 @@ map <string, string> CloneOutputAIRR::fields()
fields["d_call"] = get(KEY_SEG, "4", "name");
fields["j_call"] = get(KEY_SEG, "3", "name");
fields["v_sequence_start"] = get(KEY_SEG, "5", "start");
fields["v_sequence_end"] = get(KEY_SEG, "5", "stop");
fields["d_sequence_start"] = get(KEY_SEG, "4", "start");
fields["d_sequence_end"] = get(KEY_SEG, "4", "stop");
fields["j_sequence_start"] = get(KEY_SEG, "3", "start");
fields["j_sequence_end"] = get(KEY_SEG, "3", "stop");
fields["cdr3_sequence_start"] = get(KEY_SEG, "cdr3", "start");
fields["cdr3_sequence_end"] = get(KEY_SEG, "cdr3", "stop");
fields["v_support"] = get(KEY_SEG, "evalue_left", "val");
fields["j_support"] = get(KEY_SEG, "evalue_right", "val");
fields["cdr3_aa"] = get(KEY_SEG, "cdr3", "aa");
fields["junction"] = NULL_VAL;
fields["junction_aa"] = get(KEY_SEG, "junction", "aa");
......@@ -192,6 +204,15 @@ void SampleOutputAIRR::out(ostream &s)
"junction",
"cdr3_aa",
"warnings",
"v_sequence_start", "v_sequence_end",
"d_sequence_start", "d_sequence_end",
"j_sequence_start", "j_sequence_end",
"cdr3_sequence_start", "cdr3_sequence_end",
"v_support", "j_support",
"rev_comp",
"sequence_alignment",
"germline_alignment",
......
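The two hunks above pair a per-clone map of field values with an ordered list of column names. A minimal sketch of how such a map could be turned into one tab-separated row is given below. It assumes that a column absent from the map is written as an empty cell. `airr_row` is a hypothetical helper introduced for illustration only; it is not the body of `SampleOutputAIRR::out`.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not part of vidjil-algo: builds one tab-separated row
// by walking the columns in order. A column with no entry in `fields` is
// emitted as an empty cell (an assumption made for this sketch).
std::string airr_row(const std::map<std::string, std::string> &fields,
                     const std::vector<std::string> &columns)
{
  std::ostringstream out;
  for (std::size_t i = 0; i < columns.size(); i++) {
    if (i > 0)
      out << '\t';                      // separator between cells, none before the first
    auto it = fields.find(columns[i]);
    if (it != fields.end())
      out << it->second;                // known field: write its value
  }
  return out.str();
}
```

Calling `airr_row` with the columns `v_sequence_start`, `v_sequence_end`, `rev_comp` and a map holding only the first two would give a row whose last cell is empty, which is why a new field has to be added both to the map and to the column list.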
......@@ -43,6 +43,16 @@ $ Junction of the first clone appears once, but CDR3 twice (it is also included
1:CTREEQYSSWYFDFW
w2:TREEQYSSWYFDF
$ V/D/J start/end positions of the first clone
b1: 72 91 102 103
bf1: 1 72 91 102 103 147
$ cdr3 start/end positions of the first clone
b1: 69 137
$ V/J e-values
rb1: 0[.][0-9]*e.00 7[.][0-9]*e[-]76
$ The first clone has one warning
1:TATTACTGTACCCGGGAGGAACAATATAGCAGCTGGTACTTTGACTTCTG .* W69
......
......@@ -717,9 +717,12 @@ Using `-c designations` trigger a separate analysis for each read, but this is u
| sequence | string | The query nucleotide sequence. Usually, this is the unmodified input sequence, which may be reverse complemented if necessary. In some cases, this field may contain consensus sequences or other types of collapsed input sequences if these steps are performed prior to alignment. <br />*This contains the consensus/representative sequence of each clone.*
| rev_comp | boolean | True if the alignment is on the opposite strand (reverse complemented) with respect to the query sequence. If True then all output data, such as alignment coordinates and sequences, are based on the reverse complement of 'sequence'. <br />*Set to null, as vidjil-algo gathers reads from both strands in clones* |
| v_call, d_call, j_call | string | V/D/J gene with allele. For example, IGHV4-59\*01. <br /> *implemented. In the case of incomplete/unexpected recombinations (locus with a `+`), we still use `v/d/j_call`. Note that this value can be null on clones beyond the `--max-designations` option.* |
| v_sequence_start, v_sequence_end <br />d_sequence_start, d_sequence_end <br /> j_sequence_start, j_sequence_end | number | Start/end position of the V/D/J genes in the query sequence (1-based closed interval). <br />*implemented* |
| v_support, j_support | number | V/J gene alignment E-value, p-value, likelihood. <br />*implemented* |
| junction | string | Junction region nucleotide sequence, where the junction is defined as the CDR3 plus the two flanking conserved codons. <br />*null*
| junction_aa | string | Junction region amino acid sequence. <br />*implemented*
| cdr3_aa | string | Amino acid translation of the cdr3 field. <br />*implemented*
| cdr3_sequence_start, cdr3_sequence_end | number | Start/end position of the CDR3 in the query sequence (1-based closed interval). <br />*implemented* |
| productive | boolean | True if the V(D)J sequence is predicted to be productive. <br /> *true, false, or null when no CDR3 has been detected* |
| sequence_alignment | string | Aligned portion of query sequence, including any indel corrections or numbering spacers, such as IMGT-gaps. Typically, this will include only the V(D)J region, but that is not a requirement. <br /> *null* |
| germline_alignment | string | Assembled, aligned, fully length inferred germline sequence spanning the same region as the sequence_alignment field (typically the V(D)J region) and including the same set of corrections and spacers (if any). <br />*null*
......
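The `*_sequence_start` / `*_sequence_end` rows above use a 1-based closed interval, so both bounds are included and the first nucleotide has position 1. A minimal sketch of recovering the corresponding substring from the `sequence` field follows. `extract_region` is a hypothetical helper written for this illustration and is not part of vidjil-algo.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the part of `sequence` covered by the
// 1-based closed interval [start, end].
std::string extract_region(const std::string &sequence,
                           std::size_t start, std::size_t end)
{
  if (start < 1 || end < start || end > sequence.size())
    throw std::out_of_range("invalid 1-based closed interval");
  // Shift to a 0-based offset; a closed interval holds end - start + 1 characters.
  return sequence.substr(start - 1, end - start + 1);
}
```

For instance, on the sequence `ACGTACGT`, the interval 2 to 4 selects `CGT`.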