Commit fbbfb919 authored by Mathieu Giraud

core/tools.{h,cpp}: extractGeneName(), only take the gene name into account, not the reference sequence

Now M99685|IGHV4-55*01 and X92223|IGHV4-55*02 will both be mapped to IGHV4-55.
parent ac7a3663
@@ -470,16 +470,14 @@ void sigintHandler(int sig_num)
}
#pragma GCC diagnostic pop
/*
Return the part of label before the star
For example:
IGHV5-51*01 -> IGHV5-51
If there is no star in the name, the whole label is returned.
IGHV10-40 -> IGHV10-40
*/
string extractGeneName(string label){
string result;
size_t pipe_pos = label.find("|");
if (pipe_pos != string::npos) {
label = label.substr(pipe_pos+1);
}
size_t star_pos;
star_pos = label.rfind("*");
if(star_pos != string::npos){
......
@@ -108,9 +108,10 @@ extern bool global_interrupted;
void sigintHandler(int sig_num);
/*
Extract the gene name from a label. This takes the whole part
before the star and returns it. If there is no star in the
name the whole label is returned.
Extract the gene name from a label.
If there is a pipe '|', consider only what is after the (first) pipe.
If there is a star '*', consider only what is before the star.
M99686|IGHV5-51*01|Homo sapiens|... -> IGHV5-51
IGHV-01*05 -> IGHV-01
IGHV-7500AB -> IGHV-7500AB
*/
......
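
The rule documented in the header comment (drop everything up to the first pipe, then drop everything from the star onward) can be sketched as a standalone function. This is only an illustration: the name `extractGeneNameSketch` is hypothetical, and because the diff elides the end of the real function, the final `substr` and the fallback `return` are assumptions about how it finishes.

```cpp
#include <cstddef>
#include <string>

// Illustrative sketch, not the project's code: the name is hypothetical and
// the two return statements are assumed, since the diff elides that part.
std::string extractGeneNameSketch(std::string label) {
    // If there is a pipe, keep only what follows the first one.
    std::size_t pipe_pos = label.find('|');
    if (pipe_pos != std::string::npos) {
        label = label.substr(pipe_pos + 1);
    }

    // If there is a star, keep only what precedes it. rfind() mirrors the
    // call visible in the diff, so the last star in the label is the one used.
    std::size_t star_pos = label.rfind('*');
    if (star_pos != std::string::npos) {
        return label.substr(0, star_pos);
    }

    // No star: the (possibly pipe-trimmed) label is returned unchanged.
    return label;
}
```

Under these assumptions, `M99685|IGHV4-55*01` and `X92223|IGHV4-55*02` both yield `IGHV4-55`, and a label with neither a pipe nor a star, such as `IGHV-7500AB`, comes back unchanged.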