How to remove "possibly IGH" from algorithm
Looking at the case of these samples, we only have "possibly XXX" clonotypes.
And if I take a closer look at affecValues, the matches are very poor.
Some examples:
"_H_H___________________________________________________________________________________________________________________________________H_____________________________________________________H____________________________h_________________________________________________________________________hhhhhh________________"
"HHHHHHHHHHHHHH_________________________________________________________________H_______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________hhhhhhh________________"
In this case, we should probably report that this is only a primer error (off-target), since almost nothing matches outside the ends of the sequence.
We could add a warning/detection using one of these filters:
- at least X matches in the sequence, excluding the primer portions
- at least a ratio X of matches over the sequence
- subtract, or do not count, the primer matches when computing the e-value
- or a mix of these 3 proposals.
This may have side effects, but we have many examples like this in our data. Could we try some of these in the algorithm?
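
To make the first two filters concrete, here is a rough Python sketch, not the actual implementation. It assumes that each character of the affect string is one position, that `_` means "no match" and any other character (`H`, `h`) means a match, and that the primers sit in the first and last `primer_len` positions. The function names, `primer_len` and both thresholds are hypothetical and would need tuning on real data. The third filter (e-value) is not covered, since it depends on how the e-value is computed.

```python
def affect_stats(affects: str, primer_len: int = 25) -> dict:
    """Count matching positions overall and outside the assumed primer regions."""
    n = len(affects)
    # Assumption: primers occupy the first and last `primer_len` positions.
    inner = affects[primer_len:n - primer_len] if n > 2 * primer_len else ""
    # Assumption: '_' is "no match", anything else ('H', 'h') is a match.
    total_hits = sum(c != "_" for c in affects)
    inner_hits = sum(c != "_" for c in inner)
    return {
        "total_hits": total_hits,
        "inner_hits": inner_hits,
        "inner_ratio": inner_hits / len(inner) if inner else 0.0,
    }


def is_probably_primer_only(
    affects: str,
    primer_len: int = 25,
    min_inner_hits: int = 10,       # hypothetical threshold ("at least X matches")
    min_inner_ratio: float = 0.05,  # hypothetical threshold ("at least X ratio")
) -> bool:
    """Flag a sequence whose matches are concentrated in the primer regions.

    Sequences shorter than 2 * primer_len have no inner part and are flagged.
    """
    stats = affect_stats(affects, primer_len)
    return (
        stats["inner_hits"] < min_inner_hits
        or stats["inner_ratio"] < min_inner_ratio
    )


# Synthetic string shaped like the examples above: matches only near both ends.
example = "H" * 14 + "_" * 300 + "h" * 7 + "_" * 16
print(affect_stats(example))
print(is_probably_primer_only(example))
```

A flagged sequence could then get a warning instead of being reported as "possibly IGH". Combining both thresholds with `or` is the strictest mix; whether that is too aggressive is exactly the side effect to check on real samples.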