- 18 Feb, 2021 40 commits
-
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
Mathieu Giraud authored
see #3342
-
Mathieu Giraud authored
see #3342
-
Mikaël Salson authored
Having null KmerAffects requires creating objects that are actually useless. Not creating them saves time. However, this requires some adaptations to the code, as a node may now have no information associated with it.
-
Mikaël Salson authored
-
Mikaël Salson authored
The index load is queried many times but once the automaton is built, it doesn't change any more. Thus we store it to avoid recomputation when querying index loads.
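A minimal sketch of this caching idea, assuming a hypothetical `CachedLoadIndex` class (the names, the `finish_building()` step, and the load definition as "labelled nodes / all nodes" are illustrative assumptions, not the project's actual API):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an automaton-backed index.
class CachedLoadIndex {
public:
    // Record one node; `labelled` says whether it carries information.
    void insert(bool labelled) { nodes_.push_back(labelled); }

    // Called once when building is over: the structure no longer changes,
    // so the load can be computed here and stored.
    void finish_building() {
        std::size_t labelled = 0;
        for (bool b : nodes_) {
            if (b) ++labelled;
        }
        load_ = nodes_.empty()
                    ? 0.0f
                    : static_cast<float>(labelled) / static_cast<float>(nodes_.size());
        ++computations_;
    }

    // Queried many times: returns the stored value without any traversal.
    float index_load() const { return load_; }

    // Number of times the load was actually computed (for illustration only).
    int computations() const { return computations_; }

private:
    std::vector<bool> nodes_;
    float load_ = 0.0f;
    int computations_ = 0;
};
```

With this layout, any number of `index_load()` calls after `finish_building()` costs a single member read.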
-
Mikaël Salson authored
getLabel() is way too slow as it creates a string.
-
Mikaël Salson authored
There was still a loop that was not precomputed, and it was taking a significant amount of time (about 1/3 of the execution time). However, the parameters of this function take few distinct values. Thus we can save a significant amount of time by storing those values.
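The commit does not name the function, so the following is only a sketch of the general technique (memoization), assuming a hypothetical costly function `slow_sum` and a hypothetical wrapper `MemoizedSum`:

```cpp
#include <map>
#include <utility>

// Hypothetical costly function; its loop stands in for the one described above.
// `step` is assumed to be strictly positive.
inline long slow_sum(int n, int step) {
    long total = 0;
    for (int i = 0; i < n; i += step) {
        total += i;
    }
    return total;
}

// Stores each result the first time a (n, step) pair is seen.
class MemoizedSum {
public:
    long operator()(int n, int step) {
        const auto key = std::make_pair(n, step);
        const auto it = cache_.find(key);
        if (it != cache_.end()) {
            return it->second;  // already computed: no loop
        }
        ++misses_;
        const long value = slow_sum(n, step);
        cache_.emplace(key, value);
        return value;
    }

    // Number of times the loop was really run (for illustration only).
    int misses() const { return misses_; }

private:
    std::map<std::pair<int, int>, long> cache_;
    int misses_ = 0;
};
```

When the parameters take only a handful of distinct values, the cache stays tiny and almost every call becomes a lookup.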
-
Mikaël Salson authored
Instead of querying a map for each annotation at each position, we use a static array, instantiated just once. This requires our hash function to be collision-free. The elements are put in the map only at the end.
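As an illustration of the approach, here is a sketch that assumes annotations are single characters, so that the character code itself serves as a trivially collision-free hash. The function name `count_annotations` and this encoding are assumptions, not the project's code:

```cpp
#include <array>
#include <map>
#include <string>

// Count how often each one-character annotation occurs.
std::map<char, int> count_annotations(const std::string& annotations) {
    // Function-local static: the storage is instantiated just once and
    // reused across calls (which also makes this sketch not thread-safe).
    static std::array<int, 256> counts;
    counts.fill(0);

    // Hot loop: one array access per position, no map lookup.
    for (unsigned char c : annotations) {
        ++counts[c];
    }

    // The map is filled only at the end, once per distinct annotation.
    std::map<char, int> result;
    for (int i = 0; i < 256; ++i) {
        if (counts[i] > 0) {
            result[static_cast<char>(i)] = counts[i];
        }
    }
    return result;
}
```

The map is then touched once per distinct annotation rather than once per position.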
-
Mikaël Salson authored
We insert a given k-mer just once to save time. As a result, we no longer have a count of the k-mers, but that does not matter.
-
Mikaël Salson authored
We should probably remove some code, as this KMER_INDEX will no longer be supported for segmentation (though it is still useful for representative computation).
-
Mikaël Salson authored
-
Mikaël Salson authored
Either the unexpected one or the single one in the list.
-
Mikaël Salson authored
-
Mikaël Salson authored
-
Mikaël Salson authored
We no longer need to use different seeds just to prevent ambiguities. These choices seem to be a good trade-off between sensitivity and specificity.
-
Mikaël Salson authored
-
Mikaël Salson authored
It helps to get the object corresponding to a specific germline.
-
Mikaël Salson authored
-
Mikaël Salson authored
-
Mikaël Salson authored
It allows getting all the affectations when we want to store several affectations per position. As a result, we get a BitSet per affectation.
-
Mikaël Salson authored
To store bit sets… The C++ bitset wasn't suited to our needs, as some bitwise operations can't be applied without creating a new bitset.
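The project's own class is not shown on this page. As a rough sketch of what such a structure can look like, here is a hypothetical `SimpleBitSet` whose size is chosen at run time and whose union and intersection modify the left operand in place:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical run-time-sized bit set with in-place bitwise operations.
class SimpleBitSet {
public:
    explicit SimpleBitSet(std::size_t nbits)
        : nbits_(nbits), words_((nbits + 63) / 64, 0) {}

    std::size_t size() const { return nbits_; }

    void set(std::size_t i) { words_[i / 64] |= (std::uint64_t{1} << (i % 64)); }

    bool test(std::size_t i) const { return ((words_[i / 64] >> (i % 64)) & 1u) != 0; }

    // In-place union; both operands are assumed to have the same size.
    SimpleBitSet& operator|=(const SimpleBitSet& other) {
        for (std::size_t k = 0; k < words_.size(); ++k) words_[k] |= other.words_[k];
        return *this;
    }

    // In-place intersection; both operands are assumed to have the same size.
    SimpleBitSet& operator&=(const SimpleBitSet& other) {
        for (std::size_t k = 0; k < words_.size(); ++k) words_[k] &= other.words_[k];
        return *this;
    }

    // Number of bits set, clearing the lowest set bit at each step.
    std::size_t count() const {
        std::size_t total = 0;
        for (std::uint64_t w : words_) {
            while (w != 0) {
                w &= w - 1;
                ++total;
            }
        }
        return total;
    }

private:
    std::size_t nbits_;
    std::vector<std::uint64_t> words_;
};
```

Neither operator allocates: each one only rewrites the words already owned by the left operand.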
-
Mikaël Salson authored
This is actually debatable. Should we consider the two strands as one germline and count all the k-mers together, or as two independent germlines? Previously, the index load shown for each locus on stdout was not accurate because it reflected the index load from only one strand.
-
Mikaël Salson authored
Otherwise, we should keep the germline as unexpected.
-
Mikaël Salson authored
Otherwise, as they share the same V gene, we will end up with ambiguous affectations.
-
Mikaël Salson authored
-
Mikaël Salson authored
Otherwise, TRA/TRD can't be detected because they appear too ambiguous.
-
Mikaël Salson authored
-
Mikaël Salson authored
-
Mikaël Salson authored
The # marks a comment in should-vdj.fa, and it may be useful to see it in the output (because we may put a sequence ID there).
-
Mikaël Salson authored
We prefer having 2 occurrences of a rare gene rather than 4 occurrences of a frequent one.
-
Mikaël Salson authored
-
Mikaël Salson authored
This reverts commit 9738069a, as promised in that commit: it no longer makes sense with the new heuristic.
-
Mikaël Salson authored
This is useful for the SEG_12 method, not in the general case (but the general case was deactivated by a previous commit). Instead of keeping an undetermined germline, we change it to something else when we have a “classical” germline. See #3584
-
Mikaël Salson authored
There could be ambiguity: for instance, there was no way of distinguishing labels coming from IGH and IGH+. While that could make sense for the J genes, as they are the same, it was more annoying for the 5' genes, as they differ between the two loci. This is somewhat solved, but there are limitations. We cannot easily solve the case where some of the 5' (for instance) See #3584
-
Mikaël Salson authored
It could help to bind a label to a Germline. We just store the first one encountered: we don't deal with collisions. See #3584
-
Mikaël Salson authored
And use that to override the filter for X germline
-