- 02 Feb, 2016 7 commits
-
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
Previously, many functions took parameter lists like "int *del_DD_left, int *DD_start, int *best_DD, int *DD_end, int *del_DD_right". This was unclean and error-prone. Now all these parameters are stored in an ‘AlignBox’ object. This will lead to further simplifications of the code, easier maintenance, and will allow some extensions.
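As an illustration of the refactoring described above, here is a minimal sketch of how five pointer parameters can be grouped into one object. The field names are derived from the parameter names quoted in the message; the `key` field, the `length()` method and the `shift_box` helper are hypothetical and are not taken from the actual AlignBox class.

```cpp
#include <string>

// Hypothetical sketch only: the real AlignBox class may have other members.
struct AlignBox {
    std::string key;   // assumed label of the boxed segment, e.g. "D"
    int del_left = 0;  // deletions at the left end of the segment
    int start = 0;     // first position of the segment in the read
    int best = -1;     // index of the best germline gene, -1 if none
    int end = 0;       // last position of the segment in the read
    int del_right = 0; // deletions at the right end of the segment

    // Number of positions covered by the box (inclusive bounds).
    int length() const { return end - start + 1; }
};

// Instead of passing five int* arguments, a function now takes one reference.
// shift_box is a hypothetical helper that moves the box along the read.
void shift_box(AlignBox &box, int offset) {
    box.start += offset;
    box.end += offset;
}
```

With this layout, a caller declares one `AlignBox` per segment and passes it by reference, so the compiler can no longer silently accept two position arguments given in the wrong order.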
-
Mathieu Giraud authored
We would like to call this in places other than between the V and the J.
-
- 01 Feb, 2016 9 commits
-
-
Mathieu Giraud authored
This test is renamed 'testOverlap'.
-
Mathieu Giraud authored
We factor out some computations (seq_left, seq_right, seg_N). This is the last commit of the day sponsored by the CERNA.
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
In particular, this enables the analysis of +Vk/-Vk recombinations. This commit is again sponsored by the CERNA.
-
Mathieu Giraud authored
As the KmerSegmenter is used in the FineSegmenter, a meaningful seed has to be chosen here.
-
Mathieu Giraud authored
-
Mathieu Giraud authored
Until now, the FineSegmenter tested both strands, resulting in code duplication and unnecessary computations. This improvement is sponsored by the CERNA.
-
Mathieu Giraud authored
-
- 28 Jan, 2016 5 commits
-
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
The diversity measures are computed before taking into account the '-r', '-y' and '-z' options, and before any further clustering.
-
Mathieu Giraud authored
-
- 26 Jan, 2016 4 commits
-
-
Mathieu Giraud authored
As the comparison is now strict, some ratioMin values of 1.0 are changed to 0.9.
-
Mathieu Giraud authored
The condition on ratioMin is now strict (the default ratioMin is now 1.9 instead of 2.0), and obscure conditions are removed. A few borderline cases could now pass here (max_found), but they should be discarded anyway by the e-value tests that follow (and yield the same unsegmentation cause as before). The whole test is now cleaner and more symmetrical.
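To see why the thresholds move from 2.0 to 1.9 (and from 1.0 to 0.9) when the comparison becomes strict, consider the following sketch. The function `passes_ratio` and its two count arguments are hypothetical; only the name ratioMin and the threshold values come from the messages above, and the actual test in the code may involve other terms.

```cpp
// Hypothetical sketch of a strict ratio test on two integer counts.
// Returns true when `main_count` strictly exceeds ratio_min * other_count.
bool passes_ratio(int main_count, int other_count, double ratio_min) {
    return main_count > ratio_min * other_count;
}

// With integer counts, a strict test at 2.0 rejects the exact 2:1 case,
// whereas a strict test at 1.9 accepts it, as a non-strict test at 2.0 would.
bool two_to_one_at_2_0 = passes_ratio(2, 1, 2.0); // false
bool two_to_one_at_1_9 = passes_ratio(2, 1, 1.9); // true
bool one_to_one_at_1_0 = passes_ratio(1, 1, 1.0); // false
bool one_to_one_at_0_9 = passes_ratio(1, 1, 0.9); // true
```

Under these assumptions, lowering the threshold slightly keeps the exact-ratio cases accepted while letting the code use a single strict comparison.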
-
Mathieu Giraud authored
Two conditions were always met.
-
Mathieu Giraud authored
This changes nothing in results.max_found or in the segmentation, but it ensures that first_pos_max and last_pos_max are always between 0 and the length of affectations. This is more symmetrical.
-
- 21 Jan, 2016 7 commits
-
-
Mikaël Salson authored
We prefix the e-values with a string "e-value:" to make sure that we are matching e-values.
-
Mathieu Giraud authored
With the new estimation of the p-value in affectanalyser.cpp, the following sequence has a p-value on the J side of about 3.58e-04, yielding an e-value of about 4.71 (there are 13,153 reads in Stanford-S22). Setting -e 10 thus allows this sequence to still be segmented. Changing the seeds might change these results.
===
>lcl|FLN1FA001D7OE0.1 GGCCTGGAGTGGATTGGGTACATCTATTACAGTGGGAGCACCTACTACAACCCGTCCCTCAAGAGTCGAGTTGCCATATCGGTAGACACGTCTAAGAACCAGTTCTCCCTGAAGTTGAGCTCTGTGACTGCCGCGGACACGGCCGTGTATTATTGTGCGAGAGTAGCAGCGGCTGCTCTTGACTCCTTGGGGCCAGGGAAGCCTGGTCACCTCTCCTCAGG 73 + VJ 0 158 187 220 seed IGH SEG_+ 3.583786e-04 2.531831e-182/3.583786e-04 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X _ _ _ _ _ _+X _ _ _ _ _ _ _+X _ _ _ _ _ _+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X _ _ _ _ _ _+X _ _ _ _ _ _+X+X ?+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X _ _ _ _+X+X+X _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _+x _ _ _ _ _ _+x _ _ _ _ _ _ _ _ _ _ _ _ _ _
-
Mathieu Giraud authored
This test should be changed to actually check the debug options rather than borderline (un)segmentation cases.
-
Mathieu Giraud authored
The TODO comment: "a more precise modeling should give a e-value computation that could make this work even with -e 1" is still very pertinent.
-
Mathieu Giraud authored
This test should check whether the e-values are the same, and not their actual value.
-
Mathieu Giraud authored
As detected by @mikael-s in the parent commit, the region between V and J affectations was not taken into account in the p-values, yielding erroneous segmentations when this region was very large. Now this region is counted *both* for the computation of left and right p-values, solving the bug of the parent commit. This could sometimes be over-conservative: are we counting things twice? In regular situations, the answer is no, as the p-values are eventually computed by getProbabilityAtLeastOrAbove in kmerstore.h, which takes into account the length of the seed. A more exact option could have been to use something like (first_pos_max + last_pos_max / 2) + getS()/2, but it would raise symmetry problems. The selected option should improve the estimation in most cases anyway.
-
Mikaël Salson authored
When we have a few kmers at the start, the sequence will be segmented (because first_pos_max is at the start of the sequence). On the contrary, if we add some kmers of the same locus further along in the sequence (in our case it is TGCTCCCCTA), the sequence won't be segmented, because first_pos_max is much larger and having so few kmers of the locus in such a large sequence is likely to happen by chance. Clearly there is a problem with the first sequence, which should not be segmented: maybe the e-value should not be computed from position 1 to position 1+first_pos_max?
-
- 15 Jan, 2016 5 commits
-
-
Mathieu Giraud authored
The default value for -y is 100. Before this commit, selecting -z 200 was really useful only with an additional -y 200 option.
-
Mikaël Salson authored
Reindent the region modified in the previous commit to ensure readability.
-
Mikaël Salson authored
Previously the sequence, coverages, affects and e-values were not output if the FineSegmentation was not computed. The code is now reordered to ensure that this information is output in the .vidjil.
-
Mikaël Salson authored
Make sure that the sequence (and the e-value) is output even when we have no FineSegmentation.
-
florian.thonier authored
-
- 22 Dec, 2015 3 commits
-
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-