Commit bfd4bc79 authored by Mathieu Giraud

doc/algo.org: document shifted/shortened windows

See #2982.
parent c5e0973b
@@ -311,9 +311,9 @@ on the CDR3. No sequencing errors are corrected inside this window.
The center of the "window", predicted by the high-throughput heuristic, may
be shifted by a few bases from the actual "center" of the CDR3 (for TRG,
less than 15 bases compared to the IMGT/V-QUEST or IgBlast prediction
-in >99% of cases). The extracted window should be large enough to
-fully contain the CDR3 as well as some part of the end of the V and
-the start of the J, or at least some specific N region, to uniquely identify a clone.
+in >99% of cases when the reads are large enough). Usually, a 50 bp-window
+fully contains the CDR3 as well as some part of the end of the V and
+the start of the J, or at least some specific N region to uniquely identify the clone.
Setting =-w= to higher values (such as =-w 60= or =-w 100=) makes the clone clustering
even more conservative, making it possible to split clones with low specificity (such as IGH with very
@@ -331,6 +331,10 @@ different clones.
For VJ recombinations, the =-w 40= option is usually safe, and =-w 30= can also be tested.
Setting =-w= to lower values is not recommended.
+When the read is too short to extract the requested length, the window can be shifted
+(at most 10 bp) or shortened (down to 30 bp) by increments of 5 bp. Such reads
+are counted in =SEG changed w= and the corresponding clones are output with the =Wxx= warning.
The =-e= option sets the maximal e-value accepted for segmenting a sequence.
It is an upper bound on the expected number of windows found by chance by the seed-based heuristic.
The e-value computation takes into account both the number of reads in the
......
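The shift-or-shorten rule documented in the second hunk can be illustrated with a small sketch. This is not Vidjil's implementation: the function name =fit_window=, its parameters, and the order in which shifts and shorter lengths are tried are all assumptions made for illustration. Only the bounds (shift of at most 10 bp, length down to 30 bp, increments of 5 bp) come from the text above.

```python
from typing import Optional, Tuple


def fit_window(read_len: int, center: int, w: int = 50,
               max_shift: int = 10, min_w: int = 30,
               step: int = 5) -> Optional[Tuple[int, int, bool]]:
    """Hypothetical helper: place a window of length w around `center`.

    Returns (start, length, changed), or None when no window fits.
    `changed` is True when the window was shifted or shortened,
    which is the situation the text associates with `SEG changed w`.
    Assumption: shifts are tried before shortening.
    """
    # Candidate shifts: 0, -5, +5, -10, +10 (for the default bounds).
    shifts = [0]
    for k in range(step, max_shift + 1, step):
        shifts.extend([-k, k])

    length = w
    while length >= min_w:
        for shift in shifts:
            start = center - length // 2 + shift
            # The window must lie entirely inside the read.
            if start >= 0 and start + length <= read_len:
                changed = shift != 0 or length != w
                return start, length, changed
        length -= step  # shorten by one increment and retry
    return None


# A long read: the requested 50 bp window fits unchanged.
print(fit_window(read_len=200, center=100))
# A read whose center is close to the 5' end: the window is shifted.
print(fit_window(read_len=60, center=20))
# A 40 bp read: the window is shortened to 40 bp.
print(fit_window(read_len=40, center=20))
```

Under these assumptions, a read shorter than 30 bp yields no window at all, which matches the stated lower bound on the window length.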