
Commit d445eca5 authored by Vidjil Team, committed by Mikaël Salson

tutorial: Provide examples on the application usages.

English and French versions
parent 2ea3cb83
TEX=$(wildcard *.tex)
all: $(PDF)
$(PDF): %.pdf: %.tex
	pdflatex $^ && pdflatex $^
\usepackage[margin=3.2cm, top=1.5cm, bottom=2cm,left=1.5cm,a4paper]{geometry}
\setlist[itemize]{noitemsep, topsep=0pt}
\section*{Mastering the Vidjil web application:\\ Browsing and analysing clones}
\textit{The goal of this practical session is to learn
common ways to visualise, filter, analyse and merge clones
on the Vidjil web application.
These clones may have been computed by the Vidjil algorithm or by any other algorithm.}
\section{Assessing the quality of the run and of the analysis}
\question{Load the provided files: \textit{Demo LIL-L3}}
\marginpar{The percentage of analysed reads can range from 0.01\,\% (for
RNA-Seq or capture data) to 98--99\,\% (for very high-quality runs mostly on
\question{How many reads have been analysed in the current sample?}
Now we will try to assess the reason why some reads were not analysed in our
This might reflect a problem in the sequencing protocol\dots{} or it could
be normal.
To find out, display the information box by clicking on the
\textit{i} in the upper left part.
\question{What are the average read lengths on IGH? and on TRG?}
The lines starting with \texttt{UNSEG} display the reasons why some reads have
not been analysed.
\question{What are the main reasons why reads have not been
analysed? Also have a look at the average read length for each reason. Do
you notice anything about these lengths?}
\section{Filtering clones}
By default Vidjil displays the 50 most abundant clones at each time point.
With five time points, we may therefore have from 50 to 250 clones displayed,
depending on whether the top 50 are always the same or always different.
This number can be increased by going to the \com{filter} menu and moving the
slider to its right end.
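
As a rough illustration of where the range of 50 to 250 displayed clones comes from, the following Python sketch takes the union of the top clones of each time point. The function name and the data layout (one dictionary of read counts per time point) are assumptions made for this example only. They do not describe how Vidjil is implemented.

```python
def displayed_clones(samples, top=50):
    """Return the ids of clones that are among the `top` most abundant
    ones in at least one sample.

    `samples` is a list of dicts mapping a clone id to its read count
    (hypothetical layout, chosen for this illustration).
    """
    shown = set()
    for counts in samples:
        # Rank the clone ids of this sample by decreasing read count.
        ranked = sorted(counts, key=counts.get, reverse=True)
        shown.update(ranked[:top])
    return shown


# Five time points whose top 50 clones are always the same.
same = [{f"clone{i}": 100 - i for i in range(50)} for _ in range(5)]
# Five time points whose top 50 clones are always different.
different = [{f"t{t}_clone{i}": 100 - i for i in range(50)} for t in range(5)]

print(len(displayed_clones(same)))       # 50
print(len(displayed_clones(different)))  # 250
```

Real data usually falls between these two extremes, since some clones stay abundant across time points while others appear or vanish.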
\question{Notice how the IGH smaller clones percentage changes. What was its
initial value? What is it now?}
The \textit{smaller clones} correspond to clones that are not displayed
because they are never among the most abundant ones.
Consider the first real clone in the list. It is the most abundant.
Usually we may want to tag it in order to remember it later on.
\question{Click on the star and choose a color to tag it. Notice how the
color applies throughout all the views.}
Later you may want to filter clones depending on the tags you have chosen.
\question{In the upper left part, click on the little gray square (at the
right of the coloured squares). What happens? What if you click again?}
This is a way of filtering some clones. This may be useful when we want to
focus on some specific clones. Another way of doing so is to filter them by
their gene names or by their DNA sequences.
\question{In the search box,
enter \texttt{GGAGTCGGGG} and validate with \texttt{Enter}. How many sequences are
Note that the search is performed both on the forward and the reverse strand.
\question{Check that this is true by searching for the reverse complement of the
sequence: \texttt{CCCCGACTCC}. Do you find the same results as previously?}
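
To see why both queries are equivalent, here is a minimal Python sketch of a reverse complement and of a search on both strands. The helper names are chosen for this example and are not taken from the Vidjil code base.

```python
# Translation table mapping each nucleotide to its complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement every nucleotide, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def matches_either_strand(query, sequence):
    """True if `query` occurs in `sequence` on the forward or reverse strand."""
    return query in sequence or reverse_complement(query) in sequence


print(reverse_complement("GGAGTCGGGG"))  # CCCCGACTCC
```

A clone sequence containing \texttt{GGAGTCGGGG} is therefore also found when searching for \texttt{CCCCGACTCC}.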
\question{How can you cancel this filter and view all the clones again?}
Another way to mark a specific clone is to rename it.
\question{Double click on the name of a clone (in the list of clones) and
choose another name (\textit{e.g.} interesting clone) and validate using
After this rename, you can see that the clone is still selected.
\question{Click on several clones by holding the \texttt{Ctrl} key to select
more of them. Each time you add a new clone to the selection, its sequence
is added in the bottom part.}
\question{How many clones are selected? How many reads do those clones
represent? (see the bottom part to the right)}
\question{When you want to focus on the selected clones, you can click on the
focus link on the right, next to the number of selected clones.
This feature is useful when you want to analyse some clones more thoroughly
without being annoyed by other clones.}
\question{To remove this focus, click on the cross next to the search box,
above the list.}
\question{To unselect them all, you can click in an empty area on the top or
bottom plot.}
\section{Analysing clone populations}
\subsection{Merging clones through inspection of their sequences}
The first thing to be done is to see if some clones should be merged (because
of sequencing or PCR errors for instance). This step could be automated
but, in any case, the automatic merge would need to be checked by an expert
By default in the bottom plot (the \textit{grid}), the clones
are displayed according to their V and J genes (or more generally to their
5' and 3' genes).
\question{Identify in the grid the clones with an
\textit{IGHV-3-13}-\textit{IGHJ6} recombination and select them
all. You can do so either by holding \texttt{Ctrl} or by drawing a rectangle around the clones while
holding down the left mouse button.}
The sequences of the clones now appear in the bottom part of the browser (the
\textit{segmenter}). If many clones are selected you can view more sequences
by moving the mouse above the segmenter.
The sequences in the segmenter can be visually compared but you can also align
them to see more easily their similarities.
\question{Click on the \com{align} button on the left-hand side. The differences are
emphasized in bold.}
It is now up to the user's expertise to determine whether the sequences are sufficiently
similar, depending on the application. If some sequences don't appear to be similar enough, you can remove
them from the segmenter by clicking on the cross in front of the sequence in
the segmenter.
\question{Remove all the sequences that are not similar enough with the first
Now all the sequences in the segmenter should be highly similar. All their
differences should be due to sequencing or PCR errors.
These artifacts (mutations, homopolymers, insertions, deletions)
depend on the sequencer and the PCR technique.
\question{Merge all those clones into a single clone by clicking on the merge
button, next to the align button.}
All the clustered sequences now appear within a single clone. That can be seen
in the list: the clone which hosts the subclones appears with a $+$ on its
left. You can click on the $+$ to see the subclones that have been merged into
the main one.
\question{Click on the $+$ and observe the changes in the grid.}
As you may have noticed the subclones appear again in the grid. You can
compare their sequences again if you'd like (for example to double check that
you were right to merge them). You can also remove some subclones from the
cluster by clicking on the cross at their left in the list.
\question{For the sake of the exercise, remove the last clone of the cluster.}
\subsection{Other metrics and analysis on the clones}
As a proxy to sequence similarity we used the V and J genes, however there are
other ways to assess sequence similarity that may be more pertinent.
Moreover you may want to plot other metrics on the lymphocyte population.
For instance we can choose to plot the V genes versus the length of the N
\question{Go to the \com{plot} menu (in the upper left corner of the grid),
and in the preset box choose \com{V/N length}.}
Then you can continue aligning and merging clones if necessary.
\question{You can also try the preset \com{clone consensus length/GC content}
which tends to separate quite nicely the distinct clones.}
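
For reference, the two quantities used by this preset can be sketched in a few lines of Python. This only illustrates the definitions of length and GC content. The function names are hypothetical and the way Vidjil computes the clone consensus is not covered here.

```python
def gc_content(seq):
    """Fraction of G and C nucleotides in `seq` (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def plot_coordinates(seq):
    """Return (length, GC content), i.e. the x and y values for one sequence."""
    return len(seq), gc_content(seq)


print(plot_coordinates("GGAGTCGGGG"))  # (10, 0.8)
```

Two sequences that differ by only a few sequencing errors have almost the same length and GC content, which is why they tend to land close together on this plot.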
Note that you can choose any axis to be plotted: just go to the \com{plot} menu and
select any value you would like for the $x$ axis and for the $y$ axis.
(For bar charts, the box size always reflects the clone size,
and the $y$ axis determines the order of the boxes sharing the same $x$.)
%% \item Look at the available stats, use no. 7 (read size)
\question{In the \com{plot} menu, switch between the ``bubble plot'' and the ``bar plot''.
In the bar plot mode, move the mouse over the bars. What happens?}
Another possibility is to request Vidjil to compute the similarity between
\question{Now select the preset ``plot by similarity'' or even ``plot
similarity by locus'' to plot similarity for the current locus.}
Now the most similar clones should be close together. However note that it is
theoretically impossible to achieve such a representation in 2 dimensions. So
it is possible that two dissimilar clones are close together or, conversely,
that two similar clones are far apart.
\question{Press the keys \texttt{0} to \texttt{9} on the numeric keypad. What happens?}
There is still a feature to help you analyse your data that we have not
explored yet.
You can change the colors to make them represent some variables of interest
with the \com{color by} menu.
\question{First choose the preset ``plot by similarity and by locus'' and
then color by N length (in the box at the top of the screen).}
\marginpar{We apologize to color-blind users: the colors are not yet color-blind friendly.}
Clones that are close on the grid with similar colors are likely to
be similar.
Using those different features you should be able to analyse how similar your
sequences are, and potentially you could cluster them if you'd like.
\textit{This part is specific to samples analysed with the Vidjil algorithm.}
Some clones may be less trustworthy than others\dots{} Let's see how to spot them.
\question{In the clone list, look for clones with an orange warning on the
right side. Click on the warning. What are the warnings due to?}
There may be two reasons:
\begin{itemize}
\item average coverage: in that case the clonal sequence displayed is short
  compared to the reads in the clone. This may happen when sequences that are
  too different have been put in the same clone. The value is generally $\geq 80\,\%$.
\item $e$-value: a statistical value computed to ensure that
  recombinations have not been spotted by chance. This value is generally much
  lower than 1 ($<10^{-5}$).
\end{itemize}
You can view those values for any clone by clicking the \textit{i} icon on the
right side, in the list of clones.
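
The two criteria can be summarised by the following Python sketch. The thresholds are simply the typical values quoted above ($80\,\%$ and $10^{-5}$) and are an assumption of this example: the thresholds that actually trigger a warning in Vidjil may differ. The function name is hypothetical.

```python
def clone_warnings(coverage, e_value, min_coverage=0.80, max_e_value=1e-5):
    """List the reasons why a clone may deserve a closer look.

    `coverage` is the average coverage as a fraction between 0 and 1.
    Both default thresholds are illustrative, not Vidjil's actual settings.
    """
    warnings = []
    if coverage < min_coverage:
        warnings.append("low average coverage")
    if e_value > max_e_value:
        warnings.append("high e-value")
    return warnings


print(clone_warnings(0.95, 1e-30))  # []
print(clone_warnings(0.55, 1e-2))   # ['low average coverage', 'high e-value']
```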
\subsection{Analysing recombinations from several loci}
If you want to focus on a specific locus, you can click on the locus name in
the upper left part. One click will make the locus disappear, another one will
make it appear again.
If you hold the \texttt{Shift} key (the one which is usually above the left
\texttt{Ctrl} key) while clicking it will hide all the loci but the one you
clicked on.
\question{Click on \com{IGH}, while holding the \texttt{Shift} key. Now what is the
number of analysed reads? Why did it change?}
\question{Now click on \com{TRG}, to filter it in again.}
\question{Press the \texttt{g} key. What happens? Now press the
\texttt{h} key. Press \texttt{g} again (you can do that anytime you
like :)). Let's stick to the TRG locus.}
You can also change the current locus by clicking on the locus name in the
right part of the grid.
\subsection{Tracking clones on several samples}
%Load now some data with several samples.
The \textit{time graph} shows the evolution of the top clones of each sample across all the samples.
Bear in mind that to ensure readability at most 50 curves are displayed in this graph.
\question{Move the mouse over the bubbles in the grid or over the lines in the time graph.
Click on a clone. What happens?}
\question{Click on the label of the time graph to select another sample.
What happens to the number of analysed reads? To the size of the top clones
When switching the time point, the views update dynamically, which makes it
easy to track changes over time. Also note that the number of analysed
reads differs from the previous point. We can again analyse the reason why some
reads were unsegmented.
We will now look at how the V gene distribution evolves over time.
\question{In the grid, select the preset \com{V distribution}. Then click
on the \com{play} icon in the upper left part (below the \textit{i} icon).}
By doing so you can look at how the V distribution changes over time.
Of course you can also change the data displayed in the grid to look at
the evolution of other information.
Remember that by default at most 50 clones are displayed
on the time graph. However the rest of the application usually displays
the 50 \textit{most abundant clones} of each sample (which can amount to hundreds of
clones when there are several samples).
\question{Select a sample, order the list by size, and move the mouse through the list
of top 50 clones. What happens when selecting clones that are not in the top 20?}
If you have many samples, you may wish to reorder the samples.
\question{Drag the label of one sample to reorder the samples.}
\question{Drag one label to the box with the pin icon to hide this sample.}
You may also want to compare two samples, either to check a replicate, to check for possible contamination, or to
compare different research or medical situations.
\question{In the \com{color by} menu, choose \com{by abundance}. Select a different
sample. What happens? Are there clones with significantly different concentrations in the two samples?
Revert the color by choosing ``by tag''.}
Another option is to directly plot a log-log curve comparing two samples.
\question{In the \com{plot} menu, choose the preset \com{compare two samples}. Click
successively on two labels in the time graph to select the samples to be compared.
Are there again clones with significantly different concentrations in the two samples?}
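
On a log-log plot, a clone with the same concentration in both samples lies on the diagonal, and the distance to the diagonal is the logarithm of the ratio between its two concentrations. The following Python sketch lists the clones whose concentration changes by at least a given factor. The ten-fold threshold, the function names and the data layout are assumptions made for this illustration, not a criterion defined by Vidjil.

```python
import math


def log_ratio(fraction_a, fraction_b):
    """log10 of the concentration ratio (sample B over sample A).

    Both fractions must be strictly positive.
    """
    return math.log10(fraction_b / fraction_a)


def differing_clones(sample_a, sample_b, min_fold=10.0):
    """Ids of clones present in both samples whose concentration differs
    by at least `min_fold` (illustrative threshold)."""
    threshold = math.log10(min_fold)
    shared = sample_a.keys() & sample_b.keys()
    return sorted(
        clone for clone in shared
        if abs(log_ratio(sample_a[clone], sample_b[clone])) >= threshold
    )


# Hypothetical clone fractions in two samples.
sample_a = {"c1": 0.20, "c2": 0.01, "c3": 0.05}
sample_b = {"c1": 0.18, "c2": 0.15, "c3": 0.004}
print(differing_clones(sample_a, sample_b))  # ['c2', 'c3']
```

Clones found in only one of the two samples are ignored by this sketch, since their ratio is undefined.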
\subsection{Caring about VDJ designations}
For some studies, VDJ designations are very important.
In the list and in the segmenter, those designations are written in their
short form.
\question{Put the mouse cursor over a clone. In the status bar (between the
grid and the segmenter), the complete designation appears.}
We can double check this designation.
\question{Select a few clones.}
\marginpar{This requires an internet connection.}
\question{Click on the down arrow to the right of ``IMGT/V-QUEST''. The
clone sequences are sent to IMGT/V-QUEST.}
\question{Then tick the checkbox 5'V/D/3'J. In the segmenter the boundaries of
the V(D)J genes as computed by IMGT/V-QUEST are underlined.}
\question{You can also directly send the sequences to IMGT/V-QUEST or IgBlast
by clicking the corresponding buttons. This opens a new page with the
corresponding websites.}
The software may sometimes make a mistake in the VDJ designation.
In such a case you are very welcome to report the problem to us
and we will try to improve the designation algorithm.
You can also change the designation shown on the web application.
\question{Click on the \textit{i} icon in the list of clones for the clone whose
designation you want to change. In the segmentation part, click the edit
button. Choose what you would like to modify.}
Beware: none of the modifications you make (name changes, clone merges, clone
tagging, sample reordering\dots) will be automatically saved. You have to save
your changes yourself, either by clicking on \com{save patient} in the top left menu (where the
``patient'' name is written) or by using the \texttt{Ctrl+S} keyboard shortcut.
% TODO: automatically create a should-vdj!
\section{Exporting data}
\question{In the export menu, generate printable reports by clicking on both entries starting with \com{export
report}. What differs between the two?}
\question{Select some clones and then, in the export menu, choose \com{export
fasta}. What happens?}
\flushright \it Aurélie Caillault, Mathieu Giraud, Tatiana Rocher, Mikaël Salson
\\ \texttt{contact@vidjil.org}