Commit 462e5218, authored Mar 06, 2019 by Mathieu Giraud
doc/vidjil-algo.md: designations
parent d4725752
Showing 1 changed file with 12 additions and 11 deletions
doc/vidjil-algo.md
@@ -299,7 +299,7 @@ Recombination detection ("window" prediction, first pass)
   (all these options, except -w, are overriden when using -g)
   -k INT k-mer size used for the V/J affectation (default: 10, 12, 13, depends on germline)
   -w INT w-mer size used for the length of the extracted window ('all': use all the read, no window clustering)
-  -e FLOAT=1 maximal e-value for determining if a V-J segmentation can be trusted
+  -e FLOAT=1 maximal e-value for determining if a V-J designation can be trusted
   -t INT trim V and J genes (resp. 5' and 3' regions) to keep at most <INT> nt (0: no trim)
   -s SEED=10s seed, possibly spaced, used for the V/J affectation (default: depends on germline), given either explicitely or by an alias
   10s:#####-##### 12s:######-###### 13s:#######-###### 9c:#########
@@ -349,7 +349,7 @@ It is an upper bound on the number of exepcted windows found by chance by the se
 The e-value computation takes into account both the number of reads in the
 input sequence and the number of locus searched for.
 The default value is 1.0, but values such as 1000, 1e-3 or even less can be used
-to have a more or less permissive segmentation.
+to have a more or less permissive designation.
 The threshold can be disabled with `-e all`.
 The `-t` option sets the maximal number of nucleotides that will be indexed in
@@ -643,7 +643,7 @@ We export all required fields, some optional fields, as also some custom fields
 Note that Vidjil-algo is designed to efficiently gather reads from large datasets into clones.
 By default (`-c clones`), we thus report in the AIRR format *clones*.
 See also [What is a clone ?](vidjil-format/#what-is-a-clone).
-Using `-c segment` trigger a separate analysis for each read, but this is usually not advised for large datasets.
+Using `-c designations` trigger a separate analysis for each read, but this is usually not advised for large datasets.
 | Name | Type | AIRR 1.2 Description <br /> *vidjil-algo implementation* |
@@ -669,14 +669,14 @@ Our implementation of .tsv may evolve in future versions.
 Contact us if a particular feature does interest you.
-## Segmentation and .vdj format
+## The .vdj format
-Vidjil output includes segmentation of V(D)J recombinations. This happens
+Vidjil output includes analysis of V(D)J recombinations. This happens
 in the following situations:
 - in a first pass, when requested with `-U` option, in a `.segmented.vdj.fa` file.
-The goal of this ultra-fast segmentation, based on a seed
+The goal of this ultra-fast analysis, based on a seed
 heuristics, is only to identify the locus and to locate the w-window overlapping the
 CDR3. This should not be taken as a real V(D)J designation, as
 the center of the window may be shifted up to 15 bases from the
@@ -686,7 +686,8 @@ in the following situations:
 - at the end of the clones detection (default command `-c clones`,
 on a number of clones limited by the `-z` option)
-- or directly when explicitly requiring segmentation (`-c segment`)
+- or directly when explicitly requiring V(D)J designation for each read
+(`-c designations`)
 These V(D)J designations are obtained by full comparison (dynamic programming)
 with all germline sequences.
@@ -698,7 +699,7 @@ in the following situations:
 To check the quality of these designations, the automated test suite include
 sequences with manually curated V(D)J designations (see [should-vdj.org](http://git.vidjil.org/blob/master/doc/should-vdj.org)).
-Segmentations of V(D)J recombinations are displayed using a dedicated
+Designations of V(D)J recombinations are displayed using a dedicated
 `.vdj` format. This format is compatible with FASTA format. A line starting
 with a \> is of the following form:
@@ -707,7 +708,7 @@ with a \> is of the following form:
 name sequence name (include the number of occurrences in the read set and possibly other information)
 + strand on which the sequence is mapped
-VDJ type of segmentation (can be "VJ", "VDJ", "VDDJ", "53"...
+VDJ type of designation (can be "VJ", "VDJ", "VDDJ", "53"...
 or shorter tags such as "V" for incomplete sequences).
 The following line are for "VDJ" recombinations :
@@ -804,9 +805,9 @@ This file will be relatively small (a few kB or MB) and can be taken again as an
 ```
 ``` bash
-./vidjil-algo -c segment -g germline/homo-sapiens.g -2 -3 -d -x 50 demo/Stanford_S22.fasta
+./vidjil-algo -c designations -g germline/homo-sapiens.g -2 -3 -d -x 50 demo/Stanford_S22.fasta
 # Detailed V(D)J designation, including multiple D, and CDR3 detection on the first 50 reads, without clone clustering
-# (this is slow and should only be used for testing, or on a small file)
+# (this is not as efficient as '-c clones')
 ```
 ``` bash