1. 28 Jan, 2016 1 commit
  2. 26 Jan, 2016 6 commits
  3. 21 Jan, 2016 7 commits
      revcomp.should_get: Test that the same e-value appears twice · d3efe5e2
      Mikael Salson authored
      We prefix the e-values with a string "e-value:" to make sure that we are matching e-values.
      tests: update tests, -e 10 for Stanford_S22.fasta · 88c3213c
      Mathieu Giraud authored
      With the new estimation of the p-value in affectanalyser.cpp, the following sequence
      has a p-value on the J side of about 3.58e-04, yielding an e-value of about 4.71 (there are 13,153 reads in Stanford-S22).
      
      Setting -e 10 thus allows this sequence to still be segmented.
      Changing the seeds might change these results.
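
The arithmetic in this message can be checked with a minimal sketch. It assumes the e-value is simply the p-value multiplied by the number of reads, and that a read is kept when its e-value does not exceed the `-e` threshold; the function names are hypothetical and are not taken from the Vidjil sources.

```python
def e_value(p_value: float, n_reads: int) -> float:
    """E-value as p-value times the number of reads (assumed model)."""
    return p_value * n_reads


def is_kept(e: float, threshold: float) -> bool:
    """Assumed rule: keep the read when its e-value is within the threshold."""
    return e <= threshold


# Figures quoted in the commit message above.
p_j_side = 3.583786e-04   # p-value on the J side
n_reads = 13153           # reads in Stanford-S22

e = e_value(p_j_side, n_reads)
print(f"e-value: {e:.2f}")
print("kept with -e 10:", is_kept(e, 10))
print("kept with -e 1: ", is_kept(e, 1))
```

Under these assumptions the read passes a threshold of 10 but not a threshold of 1, which matches the change made to the test.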
      
      ===
      
      >lcl|FLN1FA001D7OE0.1
      GGCCTGGAGTGGATTGGGTACATCTATTACAGTGGGAGCACCTACTACAACCCGTCCCTCAAGAGTCGAGTTGCCATATCGGTAGACACGTCTAAGAACCAGTTCTCCCTGAAGTTGAGCTCTGTGACTGCCGCGGACACGGCCGTGTATTATTGTGCGAGAGTAGCAGCGGCTGCTCTTGACTCCTTGGGGCCAGGGAAGCCTGGTCACCTCTCCTCAGG
      73 + VJ      0 158 187 220    seed IGH SEG_+ 3.583786e-04 2.531831e-182/3.583786e-04
       _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X _ _ _ _ _ _+X _ _ _ _ _ _ _+X _ _ _ _ _ _+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X _ _ _ _ _ _+X _ _ _ _ _ _+X+X ?+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X _ _ _ _+X+X+X _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _+x _ _ _ _ _ _+x _ _ _ _ _ _ _ _ _ _ _ _ _ _
      tests: update tests · 96eb0706
      Mathieu Giraud authored
      This test should be changed to actually check the debug options rather than borderline (un)segmentation cases.
      tests: update tests, chimera-fake-half.should_get · 53896876
      Mathieu Giraud authored
      The TODO comment: "a more precise modeling should give a e-value computation that could make this work even with -e 1"
      is still very pertinent.
      tests: update tests · 40e94451
      Mathieu Giraud authored
      This test should check whether the e-values are the same, and not their actual value.
      core/affectanalyser.cpp: safer p-value estimation, taking into account the central region · 2e23e720
      Mathieu Giraud authored
      As detected by @mikael-s in the parent commit, the region between V and J affectations
      was not taken into account in the p-values, yielding erroneous segmentations
      when this region was very large.
      
      Now this region is counted *both* for the computation of left and right p-values, solving
      the bug of the parent commit.
      
      This could sometimes be over-conservative: are we counting things twice?
      In regular situations, the answer is no, as the p-values
      are eventually computed by getProbabilityAtLeastOrAbove in kmerstore.h,
      which takes into account the length of the seed.
      
      A more exact option could have been to use something like
      (first_pos_max + last_pos_max) / 2 + getS()/2, but it would raise symmetry problems.
      In any case, the selected option should improve the estimation in most cases.
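
The change described in this message can be pictured as a change in the window lengths over which each p-value is estimated. The sketch below is an illustration only: the two functions are hypothetical, the positions are made-up numbers, and the real logic lives in core/affectanalyser.cpp.

```python
def windows_ignoring_center(first_pos_max: int, last_pos_max: int, length: int):
    """Assumed previous behaviour: the region between the V and J
    affectations is left out of both windows."""
    left = first_pos_max
    right = length - last_pos_max
    return left, right


def windows_counting_center(first_pos_max: int, last_pos_max: int, length: int):
    """Assumed new behaviour: the central region is counted on both sides."""
    left = last_pos_max
    right = length - first_pos_max
    return left, right


# Hypothetical read of 500 positions with a very large central region.
first_pos_max, last_pos_max, length = 50, 450, 500
print(windows_ignoring_center(first_pos_max, last_pos_max, length))  # (50, 50)
print(windows_counting_center(first_pos_max, last_pos_max, length))  # (450, 450)
```

With a large central region, both windows grow substantially, so the same number of k-mer hits is less surprising and the estimate becomes more conservative.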
      bug: adding some kmers prevents the sequence from being segmented · cd0d2609
      Mikael Salson authored
      When we have few kmers at the start, the sequence will be segmented
      (because first_pos_max is at the start of the sequence).
      
      Conversely, if we add some kmers of the same locus further in the sequence
      (in our case it is TGCTCCCCTA), the sequence won't be segmented because
      first_pos_max is much larger and having so few kmers of the
      locus in such a large sequence is likely to occur by chance.
      
      Clearly there is a problem with the first sequence, which should not be
      segmented: maybe the e-value should not be computed from position 1 to position
      1+first_pos_max?
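
The intuition in this message, that a handful of k-mers is more likely to occur by chance in a longer stretch, can be illustrated with a plain binomial tail. This is a sketch under a simplified model (independent positions, a fixed per-position hit probability chosen arbitrarily); it is not the computation done in Vidjil, and `prob_at_least` is a hypothetical helper.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed as 1 - P(X < k)."""
    below = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below


p_hit = 0.01      # assumed chance that a position matches a k-mer of the locus
k_hits = 3        # the same small number of matching k-mers in both cases

short_window = prob_at_least(k_hits, 20, p_hit)    # hits close to the start
long_window = prob_at_least(k_hits, 500, p_hit)    # hits spread over a long stretch

print(f"short window: {short_window:.4f}")
print(f"long window:  {long_window:.4f}")
```

In this toy model, three hits in a short window are rare under chance, while three hits in a long window are common, which is why a larger first_pos_max makes segmentation less likely.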
  4. 20 Jan, 2016 9 commits
  5. 15 Jan, 2016 6 commits
  6. 14 Jan, 2016 2 commits
  7. 11 Jan, 2016 9 commits