Commit 8dab1120 authored by Vidjil Team, committed by Mathieu Giraud

doc/algo.org: unsegmentation causes

Started by a mail from @mikael-s to P. Wu, and edited and completed by @magiraud.
parent c45e0b8c
......@@ -307,6 +307,48 @@ two windows that must be clustered.
* Output
** Unsegmentation causes
Vidjil outputs detailed statistics on the reads that are not segmented (not analyzed).
Basically, *an unsegmented read is a read where Vidjil cannot identify a window at the junction of V and J genes*.
To properly analyze a read, Vidjil needs the sequence to span enough of both the V region and the J region.
The following unsegmentation causes are reported:
| | |
|---------------------+---------------------------------------------------------------------------------------------------------------------|
| =UNSEG too short= | Reads are too short, shorter than the seed (by default between 9 and 13 bp). |
|---------------------+---------------------------------------------------------------------------------------------------------------------|
| =UNSEG strand= | The strand is mixed in the read, with some similarities both with the =+= and the =-= strand. |
|---------------------+---------------------------------------------------------------------------------------------------------------------|
| =UNSEG too few (0)= | No information has been found on the read: there are not enough similarities with either a V gene or a J gene. |
|---------------------+---------------------------------------------------------------------------------------------------------------------|
| =UNSEG too few V= | Some similarities have been found with some J but not enough with any V. |
|---------------------+---------------------------------------------------------------------------------------------------------------------|
| =UNSEG too few J= | Some similarities have been found with some V but not enough with any J. |
|---------------------+---------------------------------------------------------------------------------------------------------------------|
| =UNSEG ambiguous= | Vidjil finds some V and J similarities mixed together, which makes the situation ambiguous and hard to resolve. |
|---------------------+---------------------------------------------------------------------------------------------------------------------|
| =UNSEG too short w= | The junction can be identified, but the read is too short for Vidjil to extract the window (by default 50 bp). |
| | It often means the junction is very close to one end of the read. |
|---------------------+---------------------------------------------------------------------------------------------------------------------|
Some datasets may yield many =UNSEG too few= reads:
- =UNSEG too few (0)= usually happens when reads share almost nothing with the V(D)J region.
This is expected when the PCR or capture-based approach included other regions, such as in whole RNA-seq.
- =UNSEG too few V= and =UNSEG too few J= happen when reads do not span enough of the junction zone.
Vidjil detects a “window” including the CDR3. By default this window is 50 bp long,
so the read needs to be at least that long, centered on the junction.
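The window constraint above can be illustrated with a small Python sketch. It is not Vidjil's actual implementation: the function name =window_fits=, the =junction_pos= parameter (a 0-based position of the junction in the read) and the exact way the window is centered are assumptions made for illustration only. The 50 bp default is the one mentioned above.

```python
def window_fits(read_length: int, junction_pos: int, window_size: int = 50) -> bool:
    """Return True if a window of `window_size` bp centered on `junction_pos`
    lies entirely inside a read of `read_length` bp.

    Hypothetical helper: Vidjil's real window extraction may center or
    trim the window differently.
    """
    half = window_size // 2
    start = junction_pos - half      # first position covered by the window
    end = start + window_size        # one past the last covered position
    return start >= 0 and end <= read_length


# A 150 bp read with the junction in the middle leaves room for the window.
print(window_fits(150, 75))
# Same read, junction 10 bp from the start: the window would overflow the read.
print(window_fits(150, 10))
# A 40 bp read can never hold a 50 bp window.
print(window_fits(40, 20))
```

Under these assumptions, the second and third cases correspond to reads where the junction is too close to one end, or where the read is simply shorter than the window.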
See [[http://git.vidjil.org/blob/master/doc/browser.org][browser.org]] for information on the biological or sequencing causes that can lead to few segmented reads.
* Examples of use
All the following examples are on IGH VDJ recombinations: they thus
......
......@@ -246,7 +246,7 @@ A first control is to check the number of “segmented reads” in the info pane
For each point, this shows the number of reads where Vidjil found a CDR3.
Ratios above 90% usually mean very good results. Smaller ratios, especially under 60%, often mean that something went wrong.
The “info” button further details the causes of non-segmentation (UNSEG).
The “info” button further details the causes of non-segmentation (=UNSEG=, see details in [[http://git.vidjil.org/blob/master/doc/algo.org][algo.org]]).
There can be several causes leading to bad ratios:
*** Analysis or biological causes
......