WIP: DO NOT MERGE: Feature a/3295 option names
See #3295 (closed).
Usage: ./vidjil-algo [OPTIONS] reads_file
reads_file reads file, in one of the following formats:
- FASTA (.fa/.fasta, .fa.gz/.fasta.gz)
- FASTQ (.fq/.fastq, .fq.gz/.fastq.gz)
- BAM (.bam)
Paired-end reads should be merged before being given as input to vidjil-algo.
Command selection
-c COMMAND command
clones locus detection, window extraction, clone clustering (default command, most efficient, all outputs)
windows locus detection, window extraction
segment detailed V(D)J designation (not recommended)
germlines statistics on k-mers in different germlines
Input
-x, --first-reads INT maximal number of reads to process ('all': no limit, default), only first reads
-X, --sampled-reads INT maximal number of reads to process ('all': no limit, default), sampled reads
Germline presets (at least one -g or -V/(-D)/-J option must be given)
-g, --germline GERMLINES ...
-g <.g FILE>(:FILTER)
multiple locus/germlines, with tuned parameters.
Common values are '-g germline/homo-sapiens.g' or '-g germline/mus-musculus.g'
The list of locus/recombinations can be restricted, such as in '-g germline/homo-sapiens.g:IGH,IGK,IGL'
-g PATH
multiple locus/germlines, shortcut for '-g PATH/homo-sapiens.g',
processes human TRA, TRB, TRG, TRD, IGH, IGK and IGL loci, possibly with incomplete/unusual recombinations
-V FILE ... custom V germline multi-fasta file(s)
-D FILE ... custom D germline multi-fasta file(s), analyze into V(D)J components
-J FILE ... custom J germline multi-fasta file(s)
-2 try to detect unexpected recombinations
Limits to report and to analyze clones (second pass)
-r, --min-reads INT=5 minimal number of reads supporting a clone
--min-ratio FLOAT=0 minimal percentage of reads supporting a clone
--max-clones INT maximal number of output clones ('all': no maximum, default)
-y, --max-consensus INT=100 maximal number of clones computed with a consensus sequence ('all': no limit)
-z, --max-designations INT=100
maximal number of clones to be analyzed with a full V(D)J designation ('all': no limit, do not use)
--all reports and analyzes all clones
(--min-reads 1 --min-ratio 0 --max-clones all --max-consensus all --max-designations all),
to be used only on very small datasets (for example --all -X 20)
Clone analysis (second pass)
-d, --several-D try to detect several D (experimental)
-3, --cdr3 CDR3/JUNCTION detection (requires gapped V/J germlines)
Detailed output per read (generally not recommended, large files, but may be used for filtering, as in -uu -X 1000)
-U, --out-analyzed output analyzed reads (in .segmented.vdj.fa file)
-u, --out-unanalyzed
-u output unanalyzed reads, gathered by cause, except for very short and 'too few V/J' reads (in *.fa files)
-uu output unanalyzed reads, gathered by cause, all reads (in *.fa files) (use only for debug)
-uuu output unanalyzed reads, all reads, including a .unsegmented.vdj.fa file (use only for debug)
--out-reads output all reads by clones (clone.fa-*), to be used only on small datasets
Output
-o, --out-dir PATH=./out/ output directory
-b, --out-base STRING output basename (by default basename of the input file)
-v, --verbose verbose mode
Help
-h, --help help
-H, --help-advanced help, including advanced and experimental options
The full help is available in the doc/algo.org file.
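As a rough illustration of how the options above combine, the following sketch assembles two command lines using only flags listed in this help. The input file name `reads.fastq.gz` is hypothetical, and the germline path is the one quoted in the help text. The commands are printed rather than executed, so the sketch does not require vidjil-algo to be installed.

```shell
# Sketch only: build vidjil-algo command lines from options listed in the help above.
# 'reads.fastq.gz' is a hypothetical input file, not one shipped with vidjil.
reads_file="reads.fastq.gz"
germline="germline/homo-sapiens.g:IGH,IGK,IGL"

# Default command (clones), restricted to IGH/IGK/IGL, first 1000 reads only.
cmd_clones="./vidjil-algo -c clones -g $germline -x 1000 -o ./out/ $reads_file"

# Very small dataset: report and analyze all clones on 20 sampled reads.
cmd_all="./vidjil-algo -g $germline --all -X 20 $reads_file"

# Print the commands instead of running them.
echo "$cmd_clones"
echo "$cmd_all"
```

To actually run one of them, the printed line can be pasted into a shell from the directory containing the vidjil-algo binary and the germline files.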
Edited by Mathieu Giraud