- 05 Feb, 2016 3 commits
-
-
Vidjil Team authored
When a D has already been detected, we do not want to detect anything inside this D. Before this commit, spurious D detections could happen in the EXTEND_D_ZONE. Discussion between @flothoni, @mikael-s, and @magiraud.
-
Mathieu Giraud authored
-
Mathieu Giraud authored
When a D segment has been detected, we now try to detect an additional D between V/D or between D/J, possibly detecting VDDJ (or even some VDDDJ) recombinations. Note that this detection is not optimal: a chaining algorithm would be preferable here. Moreover, the statistics should be refined, as the only filter is currently applied before check_and_remove_overlap.
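The search for an additional D on either side of an already detected D can be sketched as follows. This is only an illustration under stated assumptions: `Interval`, `extra_d_zones`, and `min_len` are hypothetical names that do not come from the Vidjil code, and positions are assumed to be 0-based, inclusive read coordinates.

```cpp
#include <cassert>
#include <vector>

// Hypothetical closed interval [start, end] on the read (0-based positions).
struct Interval {
    int start;
    int end;
    int length() const { return end - start + 1; }
};

// Returns the zones where an additional D could be searched: between the end
// of the V and the start of the detected D, and between the end of the
// detected D and the start of the J. No position inside the detected D is
// ever returned. Zones shorter than min_len are skipped (min_len is an
// illustrative filter, not a Vidjil parameter).
std::vector<Interval> extra_d_zones(int v_end, Interval d, int j_start, int min_len) {
    assert(v_end < d.start && d.start <= d.end && d.end < j_start);
    std::vector<Interval> zones;
    Interval left{v_end + 1, d.start - 1};
    Interval right{d.end + 1, j_start - 1};
    if (left.length() >= min_len) zones.push_back(left);
    if (right.length() >= min_len) zones.push_back(right);
    return zones;
}
```

Each returned zone could then be searched again in the same way, which is how a VDDDJ could show up in such a scheme.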
-
- 02 Feb, 2016 8 commits
-
-
Mathieu Giraud authored
-
Mathieu Giraud authored
We need to store sequence_or_rc in the Segmenter.
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
Previously, many places in the code had parameter lists like "int *del_DD_left, int *DD_start, int *best_DD, int *DD_end, int *del_DD_right". This was not very clean and was error-prone. Now all these parameters are stored in an 'AlignBox' object. This will lead to further simplifications of the code, easier maintenance, and will allow some extensions.
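A minimal sketch of this kind of refactoring, assuming a box with fields modeled on the parameter list quoted above. The struct `AlignBoxSketch`, its field names, and the helper `overlap_length` are hypothetical; the actual 'AlignBox' class may be organized differently.

```cpp
#include <cassert>
#include <string>

// Hypothetical box grouping what used to be five separate output parameters
// (left deletion, start, best gene, end, right deletion).
struct AlignBoxSketch {
    std::string key;     // illustrative label such as "V", "D" or "J"
    int del_left = 0;    // nucleotides deleted on the left of the gene
    int start = -1;      // first position of the box on the read
    int end = -1;        // last position of the box on the read
    int del_right = 0;   // nucleotides deleted on the right of the gene
    int ref_nb = -1;     // index of the best matching germline gene

    int length() const { return end - start + 1; }
};

// With boxes, a helper takes two objects instead of ten pointers.
// Returns how many read positions are covered by both boxes.
int overlap_length(const AlignBoxSketch& left, const AlignBoxSketch& right) {
    assert(left.start <= right.start);
    int overlap = left.end - right.start + 1;
    return overlap > 0 ? overlap : 0;
}
```

Passing one object per segment also makes it harder to swap two `int *` arguments by mistake, which is the kind of error the old signatures invited.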
-
Mathieu Giraud authored
We would like to call this in places other than between the V and the J.
-
- 01 Feb, 2016 6 commits
-
-
Mathieu Giraud authored
We factor out some computations (seq_left, seq_right, seg_N). This is the last commit of the day sponsored by the CERNA.
-
Mathieu Giraud authored
-
Mathieu Giraud authored
This enables in particular the analysis of +Vk/-Vk recombinations. This commit is again sponsored by the CERNA.
-
Mathieu Giraud authored
-
Mathieu Giraud authored
Until now, the FineSegmenter tested both strands, resulting in code duplication and unnecessary computations. This improvement is sponsored by the CERNA.
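One way to avoid testing both strands is to pick the working sequence once and run a single code path on it. The sketch below assumes the strand is already known from an earlier step; `complement`, `reverse_complement`, and the free function `sequence_or_rc` are illustrative helpers, not the Vidjil implementation.

```cpp
#include <algorithm>
#include <string>

// Complement of one uppercase nucleotide; any other character becomes 'N'.
char complement(char c) {
    switch (c) {
        case 'A': return 'T';
        case 'C': return 'G';
        case 'G': return 'C';
        case 'T': return 'A';
        default:  return 'N';
    }
}

// Reverses the sequence, then complements each nucleotide.
std::string reverse_complement(const std::string& seq) {
    std::string rc(seq.rbegin(), seq.rend());
    std::transform(rc.begin(), rc.end(), rc.begin(), complement);
    return rc;
}

// Hypothetical helper: choose the working sequence once, so that the
// alignment code that follows does not need a second, duplicated branch.
std::string sequence_or_rc(const std::string& seq, bool reversed) {
    return reversed ? reverse_complement(seq) : seq;
}
```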
-
Mathieu Giraud authored
-
- 22 Dec, 2015 2 commits
-
-
Mathieu Giraud authored
This has not been used for at least one year.
-
Mathieu Giraud authored
There was some duplicate code, neither tested nor documented, to generate 'code_short'. This code is now removed. The 'code_short' value was only used in the json output; it will now be computed directly by the web application.
-
- 18 Dec, 2015 2 commits
-
-
Mathieu Giraud authored
For the MAX_12 pseudo-germline, the FineSegmenter now calls override_rep5_rep3_from_labels, and then continues in the regular way. Note that IKmerStore::getLabel() returns *one* Fasta file, even when several files were used for the same KmerAffect, such as in TRD+ or IGK+. In these cases, the FineSegmenter will probably fail when the wrong Fasta file is returned.
-
Mathieu Giraud authored
Labels were introduced in df19d79c, for -c germlines, and were later used for -2. Even if they were previously strings, they always designated some file.
-
- 12 Dec, 2015 1 commit
-
-
Mathieu Giraud authored
There are two places where the segmentation can fail with UNSEG_ONLY_V/J. The first one, when there is no segmentation point, previously returned UNSEG_ONLY_V/J even when there was only one (possibly noisy) V/J k-mer. This is now corrected: UNSEG_ONLY_V/J is triggered only when there are at least DETECT_THRESHOLD k-mers (now 5). Ideally, we should use an e-value check here, but the segmentation point returned by kaa->getMaximum() is not really meaningful in these cases and may lead to false statistics computations.
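The threshold check described above can be sketched as follows. The value 5 comes from the commit message; the enum `Cause`, the function `classify_without_segmentation_point`, and in particular the placeholder `Cause::TooFew` for reads below the threshold are assumptions made for illustration, since the message does not say what is reported in that case.

```cpp
#include <cassert>

const int DETECT_THRESHOLD = 5;  // value quoted in the commit message

// Hypothetical outcome codes for a read without a segmentation point.
enum class Cause { OnlyV, OnlyJ, TooFew };

// family is 'V' or 'J': the only gene family whose k-mers were seen.
// nb_kmers is the number of k-mers of that family found in the read.
Cause classify_without_segmentation_point(char family, int nb_kmers) {
    assert(family == 'V' || family == 'J');
    assert(nb_kmers >= 0);
    // A single, possibly noisy, k-mer is no longer enough.
    if (nb_kmers < DETECT_THRESHOLD) return Cause::TooFew;
    return family == 'V' ? Cause::OnlyV : Cause::OnlyJ;
}
```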
-
- 09 Nov, 2015 2 commits
-
-
Mathieu Giraud authored
Starting from the full read, we cannot limit the DP computations to a k-band around the diagonal without first knowing exactly where the junction is. Nevertheless, we can avoid computing about one half of the DP matrix, as the end of the V / the start of the J (minus some deletions) must be matched. The BOTTOM_TRIANGLE_SHIFT is now set to 20, and this should be large enough to handle V/J deletions of up to ~30 bp (see comment in segment.h). (The current tests were even passing with BOTTOM_TRIANGLE_SHIFT set to 10.) Now the FineSegmenter (as launched by 'make shouldvdj_with_rc_merged') is about 35% faster.
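To see why skipping a triangle saves about one half of the matrix, here is a small counting sketch. It assumes that a cell (i, j) is computed only when j <= i + shift; this orientation is an assumption for illustration, and the actual condition used with BOTTOM_TRIANGLE_SHIFT may be written differently. The names `is_computed` and `count_computed_cells` are hypothetical.

```cpp
#include <cassert>

// Assumed rule for this sketch: cell (i, j) is computed only when
// j <= i + shift; cells further above the diagonal are skipped.
bool is_computed(int i, int j, int shift) {
    return j <= i + shift;
}

// Counts how many cells of an n x m DP matrix are computed under that rule.
long count_computed_cells(int n, int m, int shift) {
    assert(n >= 0 && m >= 0 && shift >= 0);
    long count = 0;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < m; ++j)
            if (is_computed(i, j, shift)) ++count;
    return count;
}
```

For a square n x n matrix the count is roughly n*n/2 plus shift*n, so the computed fraction gets closer to one half as the sequences get longer, while the shift keeps a margin around the diagonal for deletions.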
-
Mathieu Giraud authored
-
- 07 Oct, 2015 3 commits
-
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
The 4 sequences added in 6503f6ae now give correct results. All 68 (now 72) test sequences now pass on both strands.
-
- 03 Oct, 2015 2 commits
-
-
Mathieu Giraud authored
This was a bug hidden since the first release of Vidjil. It is now corrected, and the test added in 002adbd2 is now passing.
-
Mathieu Giraud authored
-
- 25 Sep, 2015 1 commit
-
-
Mathieu Giraud authored
-
- 18 Sep, 2015 3 commits
-
-
Mathieu Giraud authored
-
Mathieu Giraud authored
core/segment.cpp: Karlin-Altschul statistics, e-value computation with sequence sizes in the FineSegmenter
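For reference, the standard Karlin-Altschul estimate of the number of chance alignments reaching a score S between sequences of lengths m and n is E = K * m * n * exp(-lambda * S). The sketch below only encodes this formula; the function name `e_value` is hypothetical, and K and lambda are left as parameters because they depend on the scoring scheme and are not given here.

```cpp
#include <cassert>
#include <cmath>

// Karlin-Altschul e-value: expected number of chance alignments with a
// score of at least `score` between sequences of lengths m and n.
// K and lambda depend on the scoring scheme; the caller must supply them.
double e_value(double K, double lambda, int m, int n, int score) {
    assert(K > 0 && lambda > 0 && m > 0 && n > 0);
    return K * m * n * std::exp(-lambda * score);
}
```

Taking the sequence sizes into account matters because the e-value grows linearly with m and n: the same score is less significant on a long read than on a short one.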
-
Mathieu Giraud authored
-
- 16 Sep, 2015 4 commits
-
-
Mathieu Giraud authored
There were several problems in the previous implementation. Now:
  - We handle separately the numbers of nucleotides in seq_{left,right} and in ref_{left,right};
  - There is no need to handle equality cases (it is the responsibility of the segment_cost);
  - There is no longer any need to call dp.backtrack() (we directly use dp.best_score_on_i);
  - We also properly update del_D_left and del_D_right for VDJ recombinations.
The tests added in 3c138657 are now passing.
-
Mathieu Giraud authored
-
Mathieu Giraud authored
no-modification
-
Mathieu Giraud authored
-
- 15 Sep, 2015 2 commits
-
-
Mathieu Giraud authored
score_D may not be properly initialized, leading to spurious output of D genes in the .vidjil file (as for TRB-VJ recombinations in bug20150909.fa). The correct way to check for D is to use isDSegmented.
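The difference between testing a score and testing an explicit flag can be sketched as follows. Only the names isDSegmented and score_D come from the commit message; the struct `SegmentResultSketch`, the field `d_name`, and the helper `d_field` are hypothetical.

```cpp
#include <string>

// Hypothetical result record. In this sketch score_D is initialized, but
// its value is only meaningful when isDSegmented is true.
struct SegmentResultSketch {
    bool isDSegmented = false;  // explicit flag: was a D actually detected?
    int score_D = 0;            // may hold a stale value from earlier work
    std::string d_name;         // name of the D gene, if any
};

// Returns the D gene to write to the output, or an empty string.
// The decision relies on the flag, never on the value of score_D.
std::string d_field(const SegmentResultSketch& r) {
    return r.isDSegmented ? r.d_name : "";
}
```

With this pattern a VJ recombination never reports a D gene, whatever value is left in score_D.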
-
Mathieu Giraud authored
The test was more harmful than useful. Corrects what was identified as bug20150909.fa.
-
- 03 Jul, 2015 1 commit
-
-
Mathieu Giraud authored
-