Commit 2e23e720 authored by Mathieu Giraud

core/affectanalyser.cpp: safer p-value estimation, taking into account the central region

As detected by @mikael-s in the parent commit, the region between V and J affectations
was not taken into account in the p-values, yielding erroneous segmentations
when this region was very large.

Now this region is counted *both* for the computation of left and right p-values, solving
the bug of the parent commit.
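
The change in window lengths can be sketched as follows. This is an illustration only, not code from the repository: the `Positions` and `Windows` structs and both helper functions are hypothetical names, and the sketch assumes first_pos_max <= last_pos_max < seq_size, with the central region lying between the two positions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the two positions used in the diff.
// Assumption: first_pos_max <= last_pos_max < seq_size.
struct Positions {
  std::size_t first_pos_max;
  std::size_t last_pos_max;
};

// Lengths of the windows passed to the left and right p-value estimations.
struct Windows {
  std::size_t left;
  std::size_t right;
};

// Parent commit: the central region belongs to neither window.
Windows windows_before_fix(const Positions &p, std::size_t seq_size) {
  return {1 + p.first_pos_max, seq_size - 1 - p.last_pos_max};
}

// This commit: the central region is counted in both windows.
Windows windows_after_fix(const Positions &p, std::size_t seq_size) {
  return {1 + p.last_pos_max, seq_size - 1 - p.first_pos_max};
}
```

With a sequence of 300 positions, first_pos_max = 100 and last_pos_max = 250, the windows grow from (101, 49) to (251, 199): the central positions now enter both estimations.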

This could sometimes be over-conservative: are we counting things twice?
In regular situations, the answer is no, as the p-values
are eventually computed by getProbabilityAtLeastOrAbove in kmerstore.h,
which takes into account the length of the seed.
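
To see why a longer window is the conservative direction, here is a minimal sketch under a plain binomial model. This is an assumption made for illustration: the actual getProbabilityAtLeastOrAbove in kmerstore.h is not shown on this page and also accounts for the seed length, which this sketch ignores. The function name `binomial_tail` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: P(X >= at_least) for X ~ Binomial(length, p),
// with 0 < p < 1. Terms are computed in log space to avoid overflow
// in the binomial coefficients.
double binomial_tail(int at_least, int length, double p) {
  if (at_least < 0) at_least = 0;
  double tail = 0.0;
  for (int k = at_least; k <= length; ++k) {
    double log_choose = std::lgamma(length + 1.0)
                      - std::lgamma(k + 1.0)
                      - std::lgamma(length - k + 1.0);
    tail += std::exp(log_choose
                     + k * std::log(p)
                     + (length - k) * std::log1p(-p));
  }
  return tail;
}
```

For a fixed number of matching k-mers, the tail probability grows with the window length, so enlarging the window can only make a segmentation harder to accept, never easier.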

A more accurate option could have been to use something like
(first_pos_max + last_pos_max) / 2 + getS()/2, but it would raise symmetry problems.
The selected option should nevertheless improve the estimation in most cases.
parent cd0d2609
@@ -167,10 +167,10 @@ affect_infos KmerAffectAnalyser::getMaximum(const KmerAffect &before,
   left_evalue = kms.getProbabilityAtLeastOrAbove(before,
-                                                 1 + results.first_pos_max);
+                                                 1 + results.last_pos_max);
   right_evalue = kms.getProbabilityAtLeastOrAbove(after,
-                                                  seq.size() - 1 - results.last_pos_max);
+                                                  seq.size() - 1 - results.first_pos_max);
   /* Main test:
      1) do we have enough affectations in good positions ('before' at the left and 'after' at the right) ?