Commit 698d47f0 authored by Mathieu Giraud, committed by Vidjil Team

split-from-imgt.py: get germline from 'Mus musculus'

parent 7bde694b
@@ -31,24 +31,33 @@ def verbose_open_w(name):
 SPECIAL_SEQUENCES = [
 ]
+SPECIES = {
+    "Homo sapiens": './',
+    "Mus musculus": 'mus-musculus/',
+}
 for l in sys.stdin:
     if ">" in l:
         current_file = None
         current_special = None
-        if "Homo sapiens" in l and ("V-REGION" in l or "D-REGION" in l or "J-REGION" in l):
+        species = l.split('|')[2].strip()
+        if species in SPECIES and ("V-REGION" in l or "D-REGION" in l or "J-REGION" in l):
             seq = l.split('|')[1]
+            path = SPECIES[species]
             system = seq[:4]
+            key = path + system
             if system.startswith('IG') or system.startswith('TR'):
-                if system in open_files:
-                    current_file = open_files[system]
+                if key in open_files:
+                    current_file = open_files[key]
                 else:
-                    name = '%s.fa' % system
+                    name = '%s%s.fa' % (path, system)
                     current_file = verbose_open_w(name)
-                    open_files[system] = current_file
+                    open_files[key] = current_file
             if seq in SPECIAL_SEQUENCES:
                 name = '%s.fa' % seq.replace('*', '-')
......
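
The routing this commit introduces (species field, region filter, then a per-species output path) can be sketched as a standalone function. This is a minimal illustration and not the code in split-from-imgt.py: the function name `target_name`, the `REGIONS` tuple, and the sample header lines are assumptions made for the example, and the sketch returns None for headers with fewer than three '|'-separated fields, where the loop in the diff would raise an IndexError.

```python
# Mapping taken from the diff: species name -> output directory prefix.
SPECIES = {
    "Homo sapiens": './',
    "Mus musculus": 'mus-musculus/',
}

# Region labels tested in the diff (grouped in a tuple here for brevity).
REGIONS = ("V-REGION", "D-REGION", "J-REGION")


def target_name(header, species_paths=SPECIES):
    """Return the FASTA file name a header line would be routed to, or None.

    Hypothetical helper: it mirrors the conditions in the diff but does not
    open any file.
    """
    fields = header.split('|')
    if len(fields) < 3:
        # Guard added for this sketch only.
        return None

    species = fields[2].strip()
    if species not in species_paths:
        return None
    if not any(region in header for region in REGIONS):
        return None

    # First four characters of the gene name, e.g. 'IGHV' or 'TRBV'.
    system = fields[1][:4]
    if not (system.startswith('IG') or system.startswith('TR')):
        return None

    return '%s%s.fa' % (species_paths[species], system)


# Illustrative header lines (accession fields are placeholders).
print(target_name(">X00000|IGHV1-18*01|Homo sapiens|F|V-REGION|"))
print(target_name(">X00000|TRBV1*01|Mus musculus|F|V-REGION|"))
print(target_name(">X00000|IGHV1-1*01|Rattus norvegicus|F|V-REGION|"))
```

Keying `open_files` on path plus system, as the commit does, keeps the human and mouse files for the same system (for example IGHV) from sharing one file handle.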