- 03 Feb, 2016 5 commits
-
-
Mikaël Salson authored
-
Mikaël Salson authored
Changes in IGHV: 1. Three new genes (IGHV3-23D*02, IGHV3-30-5*01 and IGHV3-30-5*02). Welcome on board. 2. One gene deleted (IGHV4-28*05). You left us too early, we had a good time together… 3. One gene (IGHV3-30-3*01) lengthened by two nucleotides. Congratulations! Functionality changes in two genes: one in IGLV (IGLV10-54*03) and one in TRAV (TRAV11*01).
-
Mikaël Salson authored
-
Mikaël Salson authored
We do not yet know the meaning of this letter. Depending on the answer, we may add some processing.
-
Mikaël Salson authored
For consistency, we need to specify which rule creates the data files.
-
- 02 Feb, 2016 9 commits
-
-
Mathieu Giraud authored
-
Mathieu Giraud authored
We need to store sequence_or_rc in the Segmenter.
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
Previously, we had in many places parameter lists like "int *del_DD_left, int *DD_start, int *best_DD, int *DD_end, int *del_DD_right". This was neither clean nor safe. Now all these parameters are stored in an ‘AlignBox’ object. This will lead to further simplifications of the code, easier maintenance, and will allow some extensions.
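To illustrate the refactoring described above, here is a minimal sketch of how a group of output pointers can be replaced by a single object. It is not the actual ‘AlignBox’ class: the name AlignBoxSketch, its fields, the inclusive start/end convention, and the helper set_bounds are all assumptions made for illustration, loosely mirroring the parameter names quoted in the message.

```cpp
#include <string>

// Hypothetical sketch, NOT the real AlignBox: one object replaces the five
// output pointers (del_DD_left, DD_start, best_DD, DD_end, del_DD_right).
struct AlignBoxSketch {
    std::string key;    // label of the aligned segment, e.g. "D" (assumed)
    int del_left = 0;   // nucleotides deleted on the left side
    int start = 0;      // first aligned position in the read
    int best = -1;      // index of the best matching gene, -1 if none
    int end = 0;        // last aligned position in the read
    int del_right = 0;  // nucleotides deleted on the right side

    // Number of covered positions, assuming start and end are inclusive.
    int length() const { return end - start + 1; }
};

// A function now takes one reference instead of five pointers.
void set_bounds(AlignBoxSketch &box, int start, int end) {
    box.start = start;
    box.end = end;
}
```

With such a grouping, adding a new per-segment value means adding one field rather than changing every function signature that passes the values around.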
-
Mathieu Giraud authored
We would like to call this in places other than between the V and the J.
-
- 01 Feb, 2016 9 commits
-
-
Mathieu Giraud authored
This test is renamed 'testOverlap'.
-
Mathieu Giraud authored
We factor out some computations (seq_left, seq_right, seg_N). This is the last commit of the day sponsored by the CERNA.
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
This enables in particular the analysis of +Vk/-Vk recombinations. This commit is again sponsored by the CERNA.
-
Mathieu Giraud authored
As the KmerSegmenter is used in the FineSegmenter, a meaningful seed has to be chosen here.
-
Mathieu Giraud authored
-
Mathieu Giraud authored
Until now, the FineSegmenter tested both strands, resulting in code duplication and unnecessary computations. This improvement is sponsored by the CERNA.
-
Mathieu Giraud authored
-
- 28 Jan, 2016 5 commits
-
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
The diversity measures are computed before taking into account the '-r', '-y' and '-z' options, and before any further clustering.
-
Mathieu Giraud authored
-
- 26 Jan, 2016 6 commits
-
-
-
HERBERT Ryan authored
Simply a list of pitfalls and steps to ensure that deploying to production is a smooth process.
-
Mathieu Giraud authored
As the comparison is now strict, some ratioMin values of 1.0 are changed to 0.9.
-
Mathieu Giraud authored
The condition on ratioMin is now strict (the default ratioMin is now 1.9 instead of 2.0), and obscure conditions are removed. A few borderline cases could now pass here (max_found), but they should be discarded anyway by the following e-value tests (and yield the same unsegmentation cause as before). The whole test is now cleaner and more symmetrical.
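The effect of moving from a non-strict to a strict comparison can be sketched as follows. This is only an illustration under assumptions: the message does not say which quantities are compared, so the function dominates and its arguments major and minor (two integer counts) are hypothetical.

```cpp
// Hypothetical helper: returns true when `major` exceeds `minor` by strictly
// more than a factor `ratio_min`. The real test may compare other quantities.
bool dominates(int major, int minor, double ratio_min = 1.9) {
    // Strict comparison: equality with ratio_min * minor does not pass.
    return major > ratio_min * minor;
}
```

Under these assumptions, an exact 2:1 case that passed the former non-strict test with 2.0 still passes with the strict default of 1.9, but would fail a strict test against 2.0. The same reasoning explains why a threshold of 1.0 has to be lowered (for instance to 0.9) for an exact 1:1 case to keep passing.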
-
Mathieu Giraud authored
Two conditions were always met.
-
Mathieu Giraud authored
This changes nothing in results.max_found or in the segmentation, but it ensures that first_pos_max and last_pos_max are always between 0 and the length of affectations. This is more symmetrical.
-
- 21 Jan, 2016 6 commits
-
-
Mikaël Salson authored
We prefix the e-values with a string "e-value:" to make sure that we are matching e-values.
-
Mathieu Giraud authored
With the new estimation of the p-value in affectanalyser.cpp, the following sequence has a p-value on the J side of about 3.58e-04, yielding an e-value of about 4.71 (there are 13,153 reads in Stanford-S22). Setting -e 10 thus allows this sequence to still be segmented. Changing the seeds might change these results. === >lcl|FLN1FA001D7OE0.1 GGCCTGGAGTGGATTGGGTACATCTATTACAGTGGGAGCACCTACTACAACCCGTCCCTCAAGAGTCGAGTTGCCATATCGGTAGACACGTCTAAGAACCAGTTCTCCCTGAAGTTGAGCTCTGTGACTGCCGCGGACACGGCCGTGTATTATTGTGCGAGAGTAGCAGCGGCTGCTCTTGACTCCTTGGGGCCAGGGAAGCCTGGTCACCTCTCCTCAGG 73 + VJ 0 158 187 220 seed IGH SEG_+ 3.583786e-04 2.531831e-182/3.583786e-04 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X _ _ _ _ _ _+X _ _ _ _ _ _ _+X _ _ _ _ _ _+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X _ _ _ _ _ _+X _ _ _ _ _ _+X+X ?+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X _ _ _ _+X+X+X _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _+x _ _ _ _ _ _+x _ _ _ _ _ _ _ _ _ _ _ _ _ _
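The figures quoted above are consistent with an e-value obtained by multiplying the p-value by the number of reads. The sketch below reproduces that arithmetic. The function names e_value and passes_threshold are hypothetical, and the non-strict comparison against the threshold is an assumption (it makes no difference for the values discussed here).

```cpp
// Assumption: e-value = p-value * number of reads, which matches the
// figures quoted in the message (3.583786e-04 and 13,153 reads).
double e_value(double p_value, long nb_reads) {
    return p_value * static_cast<double>(nb_reads);
}

// Hypothetical check against the threshold given with the -e option.
// Whether the real comparison is strict is not stated in the message.
bool passes_threshold(double e, double threshold) {
    return e <= threshold;
}
```

With p = 3.583786e-04 and 13,153 reads, the product is about 4.71, which is above a threshold of 1 but below 10: this is why the sequence needs -e 10 to remain segmented.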
-
Mathieu Giraud authored
This test should be changed to actually check the debug options rather than borderline (un)segmentation cases.
-
Mathieu Giraud authored
The TODO comment: "a more precise modeling should give a e-value computation that could make this work even with -e 1" is still very pertinent.
-
Mathieu Giraud authored
This test should check whether the e-values are the same, and not their actual value.
-
Mathieu Giraud authored
As detected by @mikael-s in the parent commit, the region between the V and J affectations was not taken into account in the p-values, yielding erroneous segmentations when this region was very large. Now this region is counted *both* for the computation of the left and of the right p-values, solving the bug of the parent commit. This could sometimes be over-conservative: are we counting things twice? In regular situations, the answer is no, as the p-values are eventually computed by getProbabilityAtLeastOrAbove in kmerstore.h, which takes into account the length of the seed. A more exact option could have been to use something like (first_pos_max + last_pos_max / 2) + getS()/2, but it would raise symmetry problems. The selected option should improve the estimation in most cases anyway.
-