Commit 977ed0a8 authored by Mikaël Salson

repseq_vdj.py: Ensures that input respects the expected format

Each header line (starting with '>') should contain a + or - when the sequence
is segmented, or an exclamation mark when it is unsegmented.
At least one tab must separate the sequence name from the VDJ designation.
parent 50238f19
Pipeline #66967 passed with stages in 33 minutes and 18 seconds
@@ -267,9 +267,15 @@ def should_results_from_vidjil_output(f_log):
        if l[0] == '>':
            l = l.strip()
            pos = l.find(' + ') if ' + ' in l else l.find(' - ')
            if pos == -1:
                pos = l.find(' ! ')
                if pos == -1:
                    raise ValueError("No [+-!] in the line: {}".format(l))
            should = l[1:pos].replace('_', ' ')
            pos = l.find('\t')
            if pos == -1:
                raise ValueError("I expected a tabulation to separate the sequence name from the remainder")
            result = l[pos+1:] + ' '
            yield (should, result)
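
The checks in this hunk can be tried in isolation with a small standalone sketch. The function name `parse_header` and the sample line are hypothetical and are not part of repseq_vdj.py. The sketch only assumes what the hunk shows: a header starts with '>', carries ' + ', ' - ' or ' ! ' after the expected designation, and uses a tab before the result.

```python
def parse_header(line):
    """Split one '>' header line into (should, result).

    Hypothetical helper mirroring the checks shown in the hunk;
    it is not the function from repseq_vdj.py.
    """
    line = line.strip()

    # Try the markers in the same order as the hunk: ' + ', then ' - ', then ' ! '.
    pos = -1
    for marker in (' + ', ' - ', ' ! '):
        pos = line.find(marker)
        if pos != -1:
            break
    if pos == -1:
        raise ValueError("No [+-!] in the line: {}".format(line))

    # Everything between '>' and the marker is the expected designation.
    should = line[1:pos].replace('_', ' ')

    # A tab must separate the sequence name from the remainder.
    tab = line.find('\t')
    if tab == -1:
        raise ValueError("Expected a tab after the sequence name")

    result = line[tab + 1:] + ' '
    return should, result


# Made-up header line, for illustration only.
SAMPLE = ">TRGV1_TRGJ1 + VJ\tTRGV1*01 0/CC/2 TRGJ1*01"
```

Calling `parse_header(SAMPLE)` returns the expected designation with underscores turned into spaces, plus everything after the tab with a trailing space appended. A header with no marker or no tab raises `ValueError` instead of silently producing a wrong slice, which is the behavior this commit introduces.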