Commit 38f4833c authored by Mathieu Giraud

Merge branch 'doc/aligner' into 'dev'

Documentation updates for Web 2021.05

See merge request !962
parents 3978e689 45d97340
Pipeline #251532 passed with stages
in 87 minutes and 43 seconds
......@@ -327,10 +327,10 @@
<div class="menu_box">
primer set</br>
<label for="primerBiomed2" class="buttonSelector" onclick="m.switchPrimersSet('biomed2')"><input type="radio" id="primerBiomed2" name="primers" value="biomed2" />biomed2</label>
<label for="primerEcngs" class="buttonSelector" onclick="m.switchPrimersSet('ecngs')"><input type="radio" id="primerEcngs" name="primers" value="ecngs" />ecngs</label>
<label for="primerEcngsFR1" class="buttonSelector" onclick="m.switchPrimersSet('ecngs_FR1')"><input type="radio" id="primerEcngsFR1" name="primers" value="ecngs_FR1" />ecngs FR1</label>
primer set for interpolated length<br/>(experimental)
<label for="primerBiomed2" class="buttonSelector" onclick="m.switchPrimersSet('biomed2')"><input type="radio" id="primerBiomed2" name="primers" value="biomed2" />EuroClonality/BIOMED-2</label>
<label for="primerEcngs" class="buttonSelector" onclick="m.switchPrimersSet('ecngs')"><input type="radio" id="primerEcngs" name="primers" value="ecngs" />EuroClonality-NGS</label>
<label for="primerEcngsFR1" class="buttonSelector" onclick="m.switchPrimersSet('ecngs_FR1')"><input type="radio" id="primerEcngsFR1" name="primers" value="ecngs_FR1" />EuroClonality-NGS FR1</label>
<!-- TODO: build the list from the data available/loaded in the model.
TODO: turn this into a drop-down list? -->
</div>
......
......@@ -266,8 +266,8 @@ AXIS_DEFAULT = {
isInAligner:true
},
"primers": {
name: "primers",
doc: "interpolated length, between BIOMED2 primers (inclusive)",
name: "interpolated length between primers",
doc: "interpolated length, between selected primer set (inclusive)",
fct: function(clone) {return clone.getSegLengthDoubleFeature('primer5', 'primer3')},
autofill: true
},
......
......@@ -2,9 +2,28 @@
This changelog concerns the Vidjil web application, client and server.
As we are using continuous integration and deployment, some features are pushed on our servers between these releases.
## 2020-12-09
## 2021-05-05
Improve analysis
* New sequence aligner, flexible display of nucleotide/AA sequences, VDJ, CDR, and FR features
* New detection of primer positions and computation of interpolated length (BIOMED-2 and EuroClonality-NGS primer sets)
* New report and axis, detailed cause of non-productivity
Improve ergonomics
* Better handling of saved settings (color by, naming choices)
* New button to open one sample with its analysis from the patient/sample database
* Color settings are used in report
Improve quality testing
* The tutorial is better tested, including interactions with the sample database
Other points
* Fix awkward position in graph
* Add more explicit warning about healthcare compliance
* Fix logout error for some instances (app.vidjil.org)
## 2020-12-09
* Possibility to create at once multiple patients/runs/sets by pasting data from the clipboard
Add new warnings
......
# Analysis axes
Information computed on each clonotype is detailed on the clone information panel (`🛈` button).
They can be shown as *axes* on the grid view.
Some of them can be shown as *data columns* in the aligner (`‖` button)
and used by the *color by* menu.
## Basic axes
* **size**: Ratio of the number of reads of the clone to the total number of reads in the selected locus
* **size (other sample)**: Ratio of the number of reads of the clone to the total number of reads in the selected locus, on a second sample
(applicable when there are several samples)
* **locus**: Locus or recombination system, as detailed [here](locus.md)
* **V/5' gene, D gene, J/3' gene**: V, D, and J genes (or 5' and 3' segments for [incomplete or special recombinations](locus.md)), regardless of the allele
* **V/5' allele, D allele, J/3' allele**: Same as above, but taking into account each allele
* **clone consensus length**: Length of the consensus sequence
* **clone average read length**: Average length of the reads belonging to each clone
* **clone consensus coverage**: Ratio of the length of the clone consensus sequence to the median read length of the clone. Coverage values between 0.85 and 1.0 (or more) are good. See [clone coverage](user.md#clone-coverage)
* **GC content**: %GC content of the consensus sequence
* **number of samples**: Number of samples sharing the clone
* **tag**: Tag, as defined by the user with the `★` button in the [list of clones](user.md#the-list-of-clones-left-panel)
* **VIdentity IMGT**: V identity (as computed by IMGT/V-QUEST, available when the clonotypes have been submitted there)
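
The ratio-based axes above reduce to simple arithmetic. The following is a minimal sketch of these definitions, not the Vidjil client code: the function names and their plain numeric inputs are assumptions made for illustration.

```js
// Hypothetical helpers illustrating how the ratio-based axes are defined.
// None of these names come from the Vidjil code base.

// size: reads of the clone / total reads in the selected locus
function cloneSize(cloneReads, locusReads) {
  return locusReads > 0 ? cloneReads / locusReads : 0;
}

// clone consensus coverage: consensus length / median read length of the clone
function consensusCoverage(consensusLength, medianReadLength) {
  return medianReadLength > 0 ? consensusLength / medianReadLength : 0;
}

// GC content: percentage of G and C nucleotides in the consensus sequence
function gcContent(sequence) {
  if (sequence.length === 0) return 0;
  const gc = (sequence.match(/[GC]/gi) || []).length;
  return 100 * gc / sequence.length;
}

console.log(cloneSize(8200, 10000));
console.log(consensusCoverage(238, 250));
console.log(gcContent("ACGTGGCC"));
```

With these definitions, a coverage of 238/250 falls in the 0.85 to 1.0 range described above as good.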
### N-region / CDR3 analysis
* **V/5' deletions in 3'**: Number of deleted nucleotides at the 3' side of the V/5' segment
* **J/3' deletions in 5'**: Number of deleted nucleotides at the 5' side of the J/3' segment
* **N length**: N length, from the end of the V/5' segment to the start of the J/3' segment (excluded)
* **CDR3 length (nt)**: CDR3 length, in nucleotides, from Cys104 to Phe118/Trp118 (excluded)
* **productivity**: Productivity as computed by vidjil-algo (`no CDR3 detected`, `productive`, or `unproductive`)
* **productivity detailed**: Same as above, but with further detail on the non-productivity cause: `stop-codon`, `out-of-frame`, `no-{WP}GxG-pattern`,
following ERIC guidelines ([Rosenquist et al., 2017](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5508071/)).
* **productivity IMGT**: Productivity (as computed by IMGT/V-QUEST, available when the clonotypes have been submitted there)
## Other axes
Some of these values are only available on server instances that have been set up accordingly.
* **cloneDB occurrences**: number of occurrences in cloneDB
* **cloneDB patients/runs/sets occurrences**: number of patients/runs/sets sharing clones in cloneDB
* **primers**: interpolated length, between primers (inclusive)
## Availability of axes
| name | axes (grid view) | aligner | color_by |
| :----------------------- | :---------: | :--------: | :------: |
| V/5' gene | x | | x |
| V/5' allele | x | | |
| D gene | x | | |
| D allele | x | | |
| J/3' gene | x | | x |
| J/3' allele | x | | |
| clone consensus length | x | x | |
| clone average read length| x | x | |
| GC content | x | x | |
| N length | x | | x |
| CDR3 length (nt) | x | | |
| productivity | x | x | x |
| productivity detailed | x | x | |
| productivity IMGT | x | x | |
| VIdentity IMGT | x | x | |
| tag | x | | x |
| clone consensus coverage | x | | |
| locus | x | | x |
| size | x | x | |
| size (other sample) | x | | |
| number of samples | x | x | |
| primers | x | | |
| V/5' deletions in 3' | x | | |
| J/3' deletions in 5' | x | | |
| cloneDB occurrences | x | | |
| cloneDB patients/runs/sets occurrences| | | |
......@@ -308,6 +308,64 @@ following.
Finally modify the [index.html](../browser/index.html) file to add the new color method in the
select box (which is under the `color_menu` ID).
## Sequence panel
### Add a sequence feature
A sequence feature can be used to highlight a specific part of a sequence.
For example, here is the sequence feature that highlights the V region, as defined in `aligner_layer.js`:
``` js
'V':
{
    'title': function (s,c) { return c.seg["5"].name;},
    'start': function (s,c) { return c.getSegStart("5"); },
    'stop': function (s,c) { return c.getSegStop("5"); },
    'className': "seq_layer_highlight",
    'style': { 'background': "#4c4" },
    'enabled': true
}
```
Each sequence feature contains fields used to customize and locate the feature on the sequence.
- title : [text] the content of the html title field of the feature.
- start : [int] the position of the first nucleotide of the selected region.
- stop : [int] the position of the last nucleotide of the selected region.
- text : [text] (optional) text to overlay on top of the sequence.
- condition : [boolean] (optional) the sequence feature is displayed only if true.
- className : [text] (optional) html class name used to customize the look of the sequence feature.
- style : [object] (optional) additional css properties to further customize the sequence feature.
- enabled : [boolean] default visibility
Most fields can take either a static value or a function that returns a specific value for each clone:
``` js
function (s,c) { ...}
```
- s : the aligner_sequence object (check aligner_sequence.js to see available functions)
- c : the clone object (check clone.js to see functions / data available)
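
As an illustration of mixing static and function-valued fields, here is a minimal sketch of a custom feature. The `N_region` feature, the `resolveField` helper, the `fakeClone` stand-in and the use of `getSegStart("3")` are assumptions made for this example; they are not taken from `aligner_layer.js` or `clone.js`.

```js
// Hypothetical feature highlighting the region between the V/5' and J/3' segments.
// 'N_region' and resolveField() are illustrative names only.
const myLayers = {
  'N_region': {
    'title': 'N region',                                        // static value
    'start': function (s, c) { return c.getSegStop("5") + 1; }, // computed for each clone
    'stop': function (s, c) { return c.getSegStart("3") - 1; },
    'condition': function (s, c) { return c.getSegStart("3") - c.getSegStop("5") > 1; },
    'className': "seq_layer_highlight",
    'style': { 'background': "#c44" },
    'enabled': false
  }
};

// Resolve a field that is either a static value or a function (s, c).
function resolveField(field, s, c) {
  return (typeof field === "function") ? field(s, c) : field;
}

// Minimal stand-in for a clone object, only for this example.
const fakeClone = {
  getSegStart: function (seg) { return { "5": 0, "3": 120 }[seg]; },
  getSegStop: function (seg) { return { "5": 100, "3": 160 }[seg]; }
};

const layer = myLayers['N_region'];
console.log(resolveField(layer.title, null, fakeClone));
console.log(resolveField(layer.start, null, fakeClone),
            resolveField(layer.stop, null, fakeClone));
```

Using a `condition` function keeps the feature hidden for clones where the two segments are adjacent or overlap.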
### How to add a sequence feature in the menu
You can set the `enabled` field of a sequence feature to `true` to always display it. Alternatively, you can edit `aligner_menu.js` to add an entry to the sequence panel menu, so that the feature can be enabled or disabled with a checkbox.
Example: the `aligner_menu.js` entry that enables/disables the V/D/J regions of the sequence:
``` js
{
    'text': 'V/D/J genes',
    'title': 'Highlight V/D/J genes',
    'layers': ["V","D","J"],
    'enabled': true
}
```
- text : [text] checkbox text to display in the sequence menu panel
- title : [text] the content of the html title field of the checkbox.
- layers : [array] a list of sequence feature names defined in aligner_layer.js to enable/disable
- enabled : [boolean] default checkbox value
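
The sketch below shows how such an entry could drive the visibility of the features it lists. It assumes a hypothetical `N_region` feature, and `applyMenuEntry` is an illustrative helper, not a function of `aligner_menu.js`.

```js
// Hypothetical menu entry toggling a custom 'N_region' feature
// (assumed to be defined elsewhere, in the same way as 'V', 'D' and 'J').
const menuEntry = {
  'text': 'N region',
  'title': 'Highlight the N region',
  'layers': ["N_region"],
  'enabled': false
};

// Illustrative only: apply a checkbox state to every feature named by an entry.
function applyMenuEntry(entry, layers, checked) {
  entry.layers.forEach(function (name) {
    if (layers[name]) layers[name].enabled = checked;
  });
}

// Minimal stand-in for the set of defined features.
const layers = { 'N_region': { 'enabled': false }, 'V': { 'enabled': true } };
applyMenuEntry(menuEntry, layers, true);
console.log(layers['N_region'].enabled, layers['V'].enabled);
```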
## Classes
### Clone
......
......@@ -124,6 +124,17 @@ to learn the essential features of Vidjil.
By hovering the mouse, one also sees the *total*
number of reads for that sample.
<figure> <p style="text-align:center">
<img src="..//pictures/panel_info.png"/>
</figure>
<i>
The information panel.
The patient/run/set or sample information may contain tags such as `#T-ALL`.
In this sample,
V(D)J recombinations were detected
in about 82% of the reads.</p>
</i>
## The list of clones (left panel)
When they were processed by **vidjil-algo**, clones are described with identifiers
......@@ -159,6 +170,22 @@ then followed by the J gene `TRGJ1*02`, with `6` nucleotides deleted at its star
- A clone with a minus symbol `−` has not been detected in that sample,
but has been detected in another sample that is not currently displayed.
<figure> <p style="text-align:center">
<img src="..//pictures/panel_list.png"/>
</figure>
<i>The list of clones.
The main clonotype is
`IGHV3-9*01 7/CCCGGA/17 IGHJ6*02`,
with 7 deletions on the 3' side of the V,
17 deletions on the 5' side of the J,
and an insertion of `CCCGGA` in the N region.
Here the settings shorten this
name by not showing the `*01` allele.
This clonotype is actually a cluster (+)
of sub-clones.
The `TRGV10 4//8 JP2` clonotype has a warning.
</i>
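
The structure of such clonotype names can be made explicit with a small parser. This is a sketch under assumptions: it only handles names of the form `V del/insertion/del J` as in the caption above, and `parseClonotypeName` is a hypothetical helper, not part of Vidjil.

```js
// Hypothetical parser for names such as "IGHV3-9*01 7/CCCGGA/17 IGHJ6*02".
// Returns null when the name does not follow this simple pattern.
function parseClonotypeName(name) {
  const m = /^(\S+)\s+(\d+)\/([ACGTN]*)\/(\d+)\s+(\S+)$/i.exec(name.trim());
  if (m === null) return null;
  return {
    v: m[1],                         // V/5' gene, possibly with its allele
    deletionsV3: parseInt(m[2], 10), // deletions on the 3' side of the V
    insertion: m[3],                 // inserted nucleotides (may be empty)
    deletionsJ5: parseInt(m[4], 10), // deletions on the 5' side of the J
    j: m[5]                          // J/3' gene, possibly with its allele
  };
}

console.log(parseClonotypeName("IGHV3-9*01 7/CCCGGA/17 IGHJ6*02"));
console.log(parseClonotypeName("TRGV10 4//8 JP2"));
```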
### Detailed information on each clone
The “🛈” button opens a window showing detailed information (V(D)J designation,
e-value, number of reads) about each clone.
......@@ -206,7 +233,14 @@ It shows the most frequent clones of each sample, tracked into every sample.
- If your dataset contains sampling dates (for example for diagnosis/follow-up samples), you can switch between sample keys and dates in “settings \> sample key”
<figure> <p style="text-align:center">
<img src="..//pictures/panel_graph.png"/>
</figure>
<i>
This sample graph shows the evolution of a T-ALL patient relapsing at D+268/D+308 with a clonotype
that was not the main one at diagnosis. The view was filtered to show only clonotypes of interest.
</i>
</figure>
## The plot view and the plot presets
......@@ -217,12 +251,25 @@ When there is only one sample, two such views are shown.
All the analyzed loci are on the right of the grid. You can select another locus by clicking on it or by using the associated shortcuts (see *Keyboard shortcuts* below).
- The “plot“ menu allows customizing the plots, by selecting the X and Y axes and also by switching between grid and bar plots.
There are [20+ available axes](axes.md) to study the clones.
Some presets are available.
For example, the preset 4, similar to a "Genescan analysis", shows a bar plot of the clones according to the length of their consensus sequence,
and the preset 7 shows the distribution of CDR3 lengths.
- On the bar plots, the Y axis corresponds to the order of clones inside each bar.
<figure> <p style="text-align:center">
<img src="..//pictures/panel_scatterplot.png"/>
</p>
</figure>
<i>
Grid view with the default axes (V/5' and J/3' gene) focusing on the TRG locus. The TRGV10/TRGJP10 clonotype appears in red because it has been tagged as `clone 1` from the clonotype list. Clicking on IGH focuses on the IGH locus.
</i>
## Status bar
- At the bottom of the plot view, the “status bar“ displays information
on the selected clone.
......@@ -237,35 +284,105 @@ or their “N length” (that is N1-D-N2 in the case of VDJ recombinations).
## The sequence panel (bottom panel)
The sequence panel displays nucleotide sequences from selected clones.
The sequence panel shows, for the selected clones:
- the nucleotide or amino acid *sequences* -- see below "[What is the sequence displayed for each clone ?](#what-is-the-sequence-displayed-for-each-clone)"
- some *features* on these sequences
<figure> <p style="text-align:center">
<img src="..//pictures/panel_sequence.png"/>
<p style="text-align:center">For each clonotype, the name and sequences are shown. Other features can also be added.</p>
</p>
</figure>
### Selecting clones for inspection
Clones can be (un)selected in several ways:
- Select one clone: click on its representative element in any panel (a plot in the grid panel, a line in the graph panel, or an entry in the list panel)
- Select multiple clones at once: click-and-drag a rectangular selection of an area of the grid panel
- Add a clone to the selection : Ctrl+click
- Remove a clone from the selection : click on the 'X' at the left
- Remove all selected clones : click on the background of the grid panel
### Cluster: regroup clones
The `cluster` button will create a cluster with the selected clones.
Such a cluster will appear as a single clone,
with the first (largest) selected clone acting as its representative.
<figure> <p style="text-align:center">
<img src="..//pictures/panel_list_merge_2.png"/>
</figure>
<i>The top clonotype is actually a cluster of several sub-clonotypes. All the information on each sub-clonotype remains accessible. Clicking on "x" removes a sub-clonotype from the cluster.</p>
</i>
### Align
The `align` button aligns all the selected sequences,
with the sequence of the first (largest) clone used as the reference.
- See "[What is the sequence displayed for each clone ?](#what-is-the-sequence-displayed-for-each-clone)" below
- Sequences can be aligned together (“align” button), identifying substitutions, insertions and deletions. Silent mutations are identified, as soon as a CDR3 is detected, and represented with a double border in blue.
- You can remove sequences from the aligner (and the selection) by clicking on the “X” at the left.
- You can unselect all sequences by clicking on the background of the grid.
- `*` is a match
- `-` is a gap
- a single line under a character is a nucleotide mismatch
- a double line under a character is a silent nucleotide mismatch (not impacting the resulting amino acid sequence)
- `#` in an amino acid sequence indicates a frameshift in the junction (and thus an unproductive sequence)
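
To make this legend concrete, the sketch below marks an aligned sequence against a reference using the same `*` and `-` symbols. It is only an illustration of the notation, assuming two already-aligned strings of equal length; it is not the Vidjil aligner, and `markAgainstReference` is a hypothetical name.

```js
// Illustrative only: render an aligned sequence relative to a reference.
// '*' = match, '-' = gap in the aligned sequence, otherwise the mismatching character is kept.
function markAgainstReference(reference, aligned) {
  let out = "";
  for (let i = 0; i < aligned.length; i++) {
    const a = aligned[i];
    if (a === "-") {
      out += "-";
    } else if (i < reference.length && a.toUpperCase() === reference[i].toUpperCase()) {
      out += "*";
    } else {
      out += a; // mismatch, or insertion relative to the reference
    }
  }
  return out;
}

console.log(markAgainstReference("ACGTAC-GT", "ACCTA-TGT"));
```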
## Further sequence analysis with external software
The alignment settings `⚙` menu allows customizing such alignments, by
The sequence panel displays buttons to further analyze the selected sequences
with other software useful for RepSeq studies.
These buttons open another window/tab.
- highlighting mismatches
- hiding matches
- switching between amino acid and nucleotide sequences
- [`❯ IMGT/V-QUEST`](http://www.imgt.org/IMGT_vquest):
The reference analysis from IMGT®, including subset #2 and #8 search.
The `▼` button further allows retrieving results from IMGT/V-QUEST
and to display them within Vidjil.
- [`❯ IgBlast`](https://www.ncbi.nlm.nih.gov/igblast/):
Nucleotide alignment with IG/TR germline sequences
### Data Columns
- `❯ CloneDB`. See [above](#detailed-information-from-clonedb)
The analysis software, on some configurations, may provide additional [data
axes](axes.md) for each clone.
The data columns `‖` menu allows to select such data.
- [`❯ Blast`](http://www.ensembl.org/Multi/Tools/Blast):
Nucleotide alignment against the Homo sapiens genome and other nucleotide collections
- [`❯ AssignSubsets`](https://station1.arrest.tools/subsets) (available for clones with IGH recombinations):
Assignment to the [19 known major subsets](https://www.ncbi.nlm.nih.gov/pubmed/22415752)
of stereotyped antigen receptor sequences for CLL
### External Analysis: Further sequence analysis with external software
This submenu displays a range of other analysis software available online and used for RepSeq studies.
These buttons will send the sequences of selected clones to them for analysis and open the resulting page in another window/tab.
- [`❯ IMGT/V-QUEST`](http://www.imgt.org/IMGT_vquest):
The reference analysis from IMGT®, including search for subset `#2` and `#8`.
See [below](#imgt-sequence-features)
- [`❯ IgBlast`](https://www.ncbi.nlm.nih.gov/igblast/):
Nucleotide alignment with IG/TR germline sequences
- `❯ CloneDB`. See [above](#detailed-information-from-clonedb)
- [`❯ Blast`](http://www.ensembl.org/Multi/Tools/Blast):
Nucleotide alignment against the Homo sapiens genome and other nucleotide collections
- [`❯ AssignSubsets`](https://station1.arrest.tools/subsets) (available for clones with IGH recombinations):
Assignment to the [19 known major subsets](https://www.ncbi.nlm.nih.gov/pubmed/22415752)
of stereotyped antigen receptor sequences for CLL
### Sequence Features
Depending on the analysis software and on its configuration, there can be positions of genes or specific regions of interest that can be highlighted.
The sequence feature `☰` menu usually contains at least the following genes/regions:
- V/D/J genes
- CDR3 position
### IMGT Sequence Features
The `☰ IMGT` menu further allows selecting features provided by IMGT/V-QUEST:
- V/D/J genes
- FR1/FR2/FR3/FR4
- CDR1/CDR2/CDR3
To avoid overloading the IMGT servers that provide us this feature,
after adding new clones to the selection,
one has to click on the refresh `↻` button to request the features for the new sequences.
# The sample database and the server
......@@ -282,11 +399,20 @@ you can process your data and save the results of your analysis.
## Patients
<i>
<figure> <p style="text-align:center">
<img src="..//pictures/table_db_content_patient_list.png"/>
</figure>
<i>
The main page of the sample database shows a list of patients, or runs or sets,
with links to the samples and the results.
</i>
<b>
⚠️ The public <http://app.vidjil.org/> server is for Research Use Only
and is not compliant for clinical use.
Clinical data have to be uploaded on a [certified healthcare server](http://www.vidjil.org/doc/healthcare).
</i>
</b>
Once you are authenticated, this page shows the patient list. Here you
can see your patients and patients whose permission has been given to you.
......@@ -357,6 +483,16 @@ which is not the case for the results (unless the user wants so).
You can see which samples have been processed with the selected
process, and access the results (`See results`, bottom right).
<figure> <p style="text-align:center">
<img src="..//pictures/table_db_content_patient_0_multi_config.png"/>
</figure>
<i>
The demo patient LIL-L3, available
from the demo account, has 5 samples here analyzed
with the default `multi+inc+xxx` configuration.</p>
</i>
### Adding a sample
To add a sample (`+ add samples`), you must add at least one sample file. Each sample file must
......@@ -522,11 +658,13 @@ by clicking on the links just above the sequence panel (bottom left).
This opens another window/tab.
## How is productivity computed? Why do I have some discrepancies with other software?
Vidjil-algo computes the productivity by checking that the CDR3 comes from
an in-frame recombination and that there is no stop codon in the full
sequence.
The productivitiy as computed by Vidjil-algo may differ from what computes
Vidjil-algo reports CDR3 as *productive* when they come from
an in-frame recombination, the sequence does not contain any in-frame stop codons,
and, for IGH recombinations, when the FR4 begins with the `{WP}-GxG` pattern.
This follows the ERIC guidelines ([Rosenquist et al., 2017](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5508071/)).
The productivity as computed by Vidjil-algo may differ from what computes
other software. For instance, as of September 2019, IMGT/V-QUEST removes by default
insertions and deletions from the sequences to compute the productivity, as it
considers them as sequencing errors.
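
The three criteria above can be sketched as a small classifier. This is an illustration under assumptions, not the vidjil-algo implementation: the record fields (`cdr3Length`, `frame`, `fr4AA`), the order of the checks, the use of "CDR3 length is a multiple of 3" as the in-frame test, and the reading of `{WP}-GxG` as the pattern `[WP]G.G` are all hypothetical choices made for this example.

```js
// Illustrative classifier; field names and check order are assumptions.
const STOP_CODONS = ["TAA", "TAG", "TGA"];

// Scan the sequence codon by codon, starting at the given reading frame offset.
function hasInFrameStop(sequence, frame) {
  const seq = sequence.toUpperCase();
  for (let i = frame; i + 3 <= seq.length; i += 3) {
    if (STOP_CODONS.includes(seq.substring(i, i + 3))) return true;
  }
  return false;
}

function productivityDetailed(rec) {
  if (rec.cdr3Length === undefined) return "no CDR3 detected";
  if (rec.cdr3Length % 3 !== 0) return "out-of-frame";
  if (hasInFrameStop(rec.sequence, rec.frame)) return "stop-codon";
  // Assumed reading of the {WP}-GxG pattern: W or P, then G, any residue, G.
  if (rec.locus === "IGH" && !/^[WP]G.G/.test(rec.fr4AA)) return "no-{WP}GxG-pattern";
  return "productive";
}

const example = {
  locus: "IGH",
  sequence: "TGTGCGAGATGGGGCCAAGGG", // hypothetical sequence, already in frame 0
  frame: 0,
  cdr3Length: 9,
  fr4AA: "WGQG"
};
console.log(productivityDetailed(example));
```

Software that preprocesses the sequence differently (for example by removing insertions and deletions first) would feed different inputs to such checks, which is one source of the discrepancies discussed here.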
......@@ -792,7 +930,7 @@ The settings menu allows to set:
- the format for clone junction [junction length / AA sequence / mixed (display AA sequence only for short junction)]
- the format for clone alleles [hide alleles / display alleles / mixed (display only for marginal alleles)]
These settings are kept in your web browser ``localStorage'' between several sessions.
These settings, together with the color option, are kept in your web browser ``localStorage'' between several sessions.
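
As a rough illustration of how settings can persist between sessions, the sketch below stores them as JSON through the standard `getItem`/`setItem` storage interface. The key name, the setting fields and the helper functions are assumptions for this example and do not reflect the actual keys used by the Vidjil client.

```js
// Illustrative only: key name and setting fields are assumptions.
const SETTINGS_KEY = "exampleSettings";

function saveSettings(settings, storage) {
  storage.setItem(SETTINGS_KEY, JSON.stringify(settings));
}

// Saved values override the defaults; missing or corrupted data falls back to the defaults.
function loadSettings(defaults, storage) {
  const raw = storage.getItem(SETTINGS_KEY);
  if (raw === null) return Object.assign({}, defaults);
  try {
    return Object.assign({}, defaults, JSON.parse(raw));
  } catch (e) {
    return Object.assign({}, defaults);
  }
}

// In-memory stand-in exposing the same getItem/setItem interface as localStorage.
function memoryStorage() {
  const data = new Map();
  return {
    getItem: function (k) { return data.has(k) ? data.get(k) : null; },
    setItem: function (k, v) { data.set(k, String(v)); }
  };
}

const storage = memoryStorage(); // in a browser, window.localStorage could be passed instead
const defaults = { colorBy: "tag", alleles: "mixed" };
saveSettings({ colorBy: "productivity" }, storage);
console.log(loadSettings(defaults, storage));
```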
# Keyboard shortcuts
......
......@@ -66,6 +66,7 @@ Vidjil-algo is systematically tested with the following compilers :
- clang 3.4, 4.0, 6.0, 7.0, 11.0
These compilers are available on recent OS X and on the following Linux distributions:
- CentOS 7, 8
- Debian Jessie 8.0, Stretch 9.0, Buster 10.0, Bullseye 11
- FreeBSD 9.2, 10, 11, 12
......
......@@ -4,6 +4,11 @@ We publish here notes to help to update these images.
See <http://www.vidjil.org/doc/server>
next version
-
vidjil/server:2021-04-05-
- vidjil-algo updated to 2021.04 (from 2020.06)
- Default backup service now uses a volume for initialisation
Please make sure to update both your docker-compose.yml and backup/Dockerfile if you want to rebuild the backup service
- New variable in defs.py: `HEALTHCARE_COMPLIANCE`
vidjil/server:2020-06-22-6ec207d2
- vidjil-algo updated to 2020.06 (from 2019.05)
......