- 28 Feb, 2016 2 commits
-
-
Mathieu Giraud authored
-
Mathieu Giraud authored
This was implemented after prototyping and tests by @flothoni. As in IMGT/JunctionAnalysis, the detection relies on the positions of Cys104 and Phe118/Trp118. The detection runs here in O(n), taking advantage of the already aligned V and J segments. The current implementation will not give precise positions when there are insertions or deletions between Cys104 and the end of the V segment (or between the start of the J segment and Phe118/Trp118). This could be improved by backtracking through the DP matrix.
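A minimal sketch of the idea, assuming that each aligned segment is summarized by its read interval and its reference interval (the `AlignedSegment` struct and both function names are hypothetical, not the actual code). The anchor position is projected from the end of V (or the start of J) by a plain offset, which is exactly why indels between the anchor and the segment boundary give imprecise positions.

```cpp
#include <cassert>

// Hypothetical summary of one aligned segment: the interval of the read that
// was aligned, and the interval of the germline reference it was aligned to
// (0-based, inclusive positions).
struct AlignedSegment {
  int read_start, read_end;
  int ref_start, ref_end;
};

// Read position of an anchor such as Cys104, given its position in the V
// reference. Projects backwards from the end of the aligned V, assuming no
// insertion or deletion between the anchor and that end.
int project_from_end(const AlignedSegment &v, int anchor_ref_pos) {
  assert(anchor_ref_pos <= v.ref_end);
  return v.read_end - (v.ref_end - anchor_ref_pos);
}

// Read position of an anchor such as Phe118/Trp118, given its position in the
// J reference. Projects forwards from the start of the aligned J, under the
// same no-indel assumption.
int project_from_start(const AlignedSegment &j, int anchor_ref_pos) {
  assert(anchor_ref_pos >= j.ref_start);
  return j.read_start + (anchor_ref_pos - j.ref_start);
}
```

With a V aligned on read positions 0..99 against reference positions 180..279, an anchor at reference position 270 is projected to read position 90.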
-
- 17 Feb, 2016 1 commit
-
-
Mathieu Giraud authored
-
- 06 Feb, 2016 5 commits
-
-
Mathieu Giraud authored
-
Mathieu Giraud authored
This further cleans up the code and makes it possible to output D1/D2 boxes as well.
-
Mathieu Giraud authored
One day, we may change the '5del' and '3del' fields, as their naming is not very consistent.
-
Mathieu Giraud authored
This was not used and not symmetrical.
-
Mathieu Giraud authored
Bug detected thanks to a full Valgrind run.
-
- 05 Feb, 2016 8 commits
-
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
Discussion with @mikael-s and @flothoni. Once the overlap has been handled, we now run another dynamic programming pass, only on the best reference sequence, to check the actual e-value of the D segment.
-
Mathieu Giraud authored
-
Vidjil Team authored
We do not want to detect the same D gene twice. Note that we do not currently forbid different alleles of the same gene. Discussion between @flothoni, @mikael-s, and @magiraud.
-
Vidjil Team authored
When a D has already been detected, we do not want to detect anything inside this D. Before this commit, spurious D detections could happen in the EXTEND_D_ZONE. Discussion between @flothoni, @mikael-s, and @magiraud.
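The two rules discussed above (no second detection of the same D, nothing detected inside an already detected D) could be sketched as a filter on candidates. This is only an illustration: `DBox`, `lies_inside` and `accept_candidate` are hypothetical names, and references are compared by index, so two alleles of the same gene count as different references, as noted in the previous commit.

```cpp
#include <vector>

// Hypothetical description of a D detection: index of the reference
// sequence (allele) and the interval it covers in the read (inclusive).
struct DBox {
  int ref_nb;
  int start;
  int end;
};

// True when `inner` is entirely contained in `outer`.
bool lies_inside(const DBox &inner, const DBox &outer) {
  return inner.start >= outer.start && inner.end <= outer.end;
}

// A candidate is rejected when it reuses an already detected reference,
// or when it lies inside an already detected D.
bool accept_candidate(const DBox &candidate, const std::vector<DBox> &detected) {
  for (const DBox &d : detected) {
    if (d.ref_nb == candidate.ref_nb) return false;
    if (lies_inside(candidate, d)) return false;
  }
  return true;
}
```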
-
Mathieu Giraud authored
-
Mathieu Giraud authored
When a D segment has been detected, we now try to detect an additional D between V/D or between D/J, possibly detecting VDDJ (or even some VDDDJ) recombinations. Note that this detection is not optimal. A chaining algorithm would be preferable here. Moreover, statistics should be refined, as the only filter is currently applied before check_and_remove_overlap.
-
- 02 Feb, 2016 8 commits
-
-
Mathieu Giraud authored
-
Mathieu Giraud authored
We need to store sequence_or_rc in the Segmenter.
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
Previously, we had in many places things like "int *del_DD_left, int *DD_start, int *best_DD, int *DD_end, int *del_DD_right". This was not very clean and was error-prone. Now all these parameters are stored in an 'AlignBox' object. This will lead to further simplifications of the code, better code maintenance, and allow some extensions.
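To illustrate the refactoring, here is a hedged sketch of what grouping these five values could look like. The struct is deliberately named `AlignBoxSketch`: its field names, the `key` member and the helper functions are assumptions made for this example, not the actual 'AlignBox' class.

```cpp
#include <string>

// Hypothetical grouping of the values that were previously passed around
// as five separate "int *" parameters.
struct AlignBoxSketch {
  std::string key;    // which box this is, for example "D"
  int del_left = 0;   // nucleotides deleted on the left of the reference
  int start = 0;      // first position of the box in the read
  int ref_nb = -1;    // index of the best reference sequence, -1 if none
  int end = 0;        // last position of the box in the read (inclusive)
  int del_right = 0;  // nucleotides deleted on the right of the reference

  int length() const { return end - start + 1; }
};

// A function that used to take five pointers now takes one object.
std::string describe(const AlignBoxSketch &box) {
  return box.key + "[" + std::to_string(box.start) + "," +
         std::to_string(box.end) + "]";
}
```

Passing one object also makes it straightforward to handle several boxes of the same kind, which separate pointer parameters made awkward.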
-
Mathieu Giraud authored
We would like to call this in places other than between the V and the J.
-
- 01 Feb, 2016 6 commits
-
-
Mathieu Giraud authored
We factor out some computations (seq_left, seq_right, seg_N). This is the last commit of the day sponsored by the CERNA.
-
Mathieu Giraud authored
-
Mathieu Giraud authored
This enables in particular the analysis of +Vk/-Vk recombinations. This commit is again sponsored by the CERNA.
-
Mathieu Giraud authored
-
Mathieu Giraud authored
Until now, the FineSegmenter tested both strands, resulting in code duplication and unnecessary computations. This improvement is sponsored by the CERNA.
-
Mathieu Giraud authored
-
- 22 Dec, 2015 2 commits
-
-
Mathieu Giraud authored
This has not been used for at least a year.
-
Mathieu Giraud authored
There was some duplicate code, neither tested nor documented, to generate 'code_short'. This code is now removed. The 'code_short' value was only used in the json output; it will now be computed directly by the web application.
-
- 18 Dec, 2015 2 commits
-
-
Mathieu Giraud authored
For the MAX_12 pseudo-germline, the FineSegmenter now calls override_rep5_rep3_from_labels and then continues in the regular way. Note that IKmerStore:getLabel() returns *one* Fasta file, even when several files were used for the same KmerAffect, such as in TRD+ or IGK+. In these cases, the FineSegmenter will probably fail when the wrong Fasta file is returned.
-
Mathieu Giraud authored
Labels were introduced in df19d79c, for -c germlines, and were later used for -2. Even if they were previously strings, they always designated some file.
-
- 12 Dec, 2015 1 commit
-
-
Mathieu Giraud authored
There are two places where the segmentation can fail with UNSEG_ONLY_V/J. The first one, when there is no segmentation point, previously returned UNSEG_ONLY_V/J even when there was only one (possibly noisy) V/J k-mer. This is now corrected: UNSEG_ONLY_V/J is triggered only when there are at least DETECT_THRESHOLD k-mers (now 5). Ideally, we should use an e-value check here, but the segmentation point returned by kaa->getMaximum() is not really meaningful in these cases and may lead to false statistics computations.
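The thresholding described above can be sketched as follows. Only the name DETECT_THRESHOLD and its value 5 come from this commit message; the `Cause` enum, the function and the way ties are resolved are assumptions made for the illustration.

```cpp
// Value taken from the commit message above.
const int DETECT_THRESHOLD = 5;

// Hypothetical outcomes when no segmentation point was found.
enum class Cause { TooFew, OnlyV, OnlyJ };

// Report "only V" or "only J" solely when the dominant side has at least
// DETECT_THRESHOLD k-mers; a single, possibly noisy, k-mer is not enough.
Cause classify_without_segmentation_point(int nb_v_kmers, int nb_j_kmers) {
  bool v_dominates = nb_v_kmers >= nb_j_kmers;
  int dominant = v_dominates ? nb_v_kmers : nb_j_kmers;
  if (dominant < DETECT_THRESHOLD) return Cause::TooFew;
  return v_dominates ? Cause::OnlyV : Cause::OnlyJ;
}
```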
-
- 09 Nov, 2015 2 commits
-
-
Mathieu Giraud authored
Starting from the full read, we cannot limit the DP computations to a k-band around the diagonal without first knowing exactly where the junction is. Nevertheless, we can avoid computing about one half of the DP matrix, as the end of the V / the start of the J (minus some deletions) must be matched. The BOTTOM_TRIANGLE_SHIFT is now set to 20, and this should be large enough to handle V/J deletions up to ~30 bp (see comment in segment.h). (The current tests were even passing with BOTTOM_TRIANGLE_SHIFT set to 10.) Now the FineSegmenter (as launched by 'make shouldvdj_with_rc_merged') is about 35% faster.
-
Mathieu Giraud authored
-
- 07 Oct, 2015 3 commits
-
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
The 4 sequences added in 6503f6ae now give correct results. All 72 test sequences (previously 68) now pass on both strands.
-