Commit 93409bae authored by Mathieu Giraud

tutorial: split into several files

parent 734def88
\section{Working with external software and exporting data}
\subsection{Checking VDJ designations with other software}
For some studies, VDJ designations are very important.
In the list and in the sequence panel, those designations are written in their
short form.
\question{Put the mouse cursor over a clone. In the status bar (between the
grid and the sequence panel), the complete designation appears.}
We can double check this designation with other popular software.
\question{Select a few clones.}
\marginpar{This requires an internet connection.}
\question{Click on the down triangle to the right of \com{IMGT/V-QUEST}. The
  clone sequences are sent to IMGT/V-QUEST.}
\question{Then tick the checkbox 5'V/D/3'J. In the sequence panel the boundaries of
the V(D)J genes as computed by IMGT/V-QUEST are underlined.}
Note that data returned by IMGT/V-QUEST is available by clicking on the \textit{i} icon of analyzed clones,
enabling you to compare the annotations made by the original software and by IMGT/V-QUEST.
\question{You can also send the sequences directly to IMGT/V-QUEST or IgBlast
  by clicking the corresponding buttons. This opens the corresponding
  website in a new page.}
\bigskip
The software may occasionally make a mistake in the VDJ designation.
In such a case, you are very welcome to report the problem to us
and we will try to improve the designation algorithm.
\question{Go to the \com{Help} menu and click on \com{get
    support}. It opens your mailer with a pre-composed email
  describing the data you are viewing as well as the clones you selected.}
Even if you do not use the \com{get support} button, it is good practice
to send the complete address shown in your web browser, such
as \url{http://app.vidjil.org/?set=3241&config=39&plot=v,size,bar},
when you want to discuss your data or your analyses with colleagues or with us.
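
As a rough sketch of what such an address carries, the query string of the example above can be split into its parameters with the Python standard library. Reading \texttt{set}, \texttt{config} and \texttt{plot} as a sample set, an analysis configuration and plot settings is only an assumption based on their names, not a documented interface.

```python
from urllib.parse import urlsplit, parse_qs


def describe_address(address):
    """Return the query parameters of a web application address as a dict."""
    query = urlsplit(address).query
    # parse_qs maps each key to a list of values; keep the first value only.
    return {key: values[0] for key, values in parse_qs(query).items()}


# Address taken from the tutorial text; the meaning of each key is assumed.
params = describe_address(
    "http://app.vidjil.org/?set=3241&config=39&plot=v,size,bar"
)
print(params)
```

Sending the whole address, rather than only the page name, is what preserves these parameters for the person who opens the link.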
\bigskip
Suppose that you would like to change the VDJ designation shown on the web application.
\question{In the list of clones, click on the \textit{i} icon of the clone
  whose designation you want to change. In the segmentation part, click the edit
  button. Choose what you would like to modify.}
Beware: the modifications you make (name changes, clusters, clone
tagging, sample reordering\dots) will \textbf{not} be saved automatically. You have to save
your changes yourself, either by clicking on \com{save patient} in the top left menu (where the
``patient'' name is written) or by using the \texttt{Ctrl+S} keyboard
shortcut.
For this demonstration data, you cannot save your changes as you do not have
the rights to modify this patient.
% TODO: create a should-vdj automatically!
\subsection{Exporting data}
\question{In the export menu, generate printable reports by clicking on each of the two entries starting with \com{export
    report}. How do the two reports differ?}
\question{Select some clones and then, in the export menu, choose \com{export
fasta}. What happens?}
\question{Open the \com{import/export} menu, and click on \com{export csv}.
The resulting file describes all visible clones (V(D)J designation, size for each sample).
It can be opened by any spreadsheet software such as LibreOffice Calc or Excel for further analysis.}
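
Outside a spreadsheet, the exported file can also be inspected with a short script. The sketch below only lists the column names and counts the rows, because the exact columns and delimiter of the export are not described here. The sample content and its column names are hypothetical and serve only to show the call.

```python
import csv
import io


def summarize_csv(stream, delimiter=","):
    """Return (column names, number of data rows) for a CSV text stream."""
    reader = csv.DictReader(stream, delimiter=delimiter)
    rows = list(reader)
    return reader.fieldnames, len(rows)


# Hypothetical content: a real export has its own columns and values.
example = io.StringIO(
    "name,sample_1,sample_2\n"
    "clone A,0.5,0.1\n"
    "clone B,0.2,0.3\n"
)
columns, count = summarize_csv(example)
print(columns, count)
```

For a file on disk, the same function can be given a handle obtained with \texttt{open(path, newline="")}, where \texttt{path} is wherever the export was saved.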
\question{Open the \com{import/export} menu again, and click on the
  \com{export bottom graph} button.
  This exports the current view of the plot.}
\question{\new Select some clones and align them. The alignment can be
exported with the \com{export aligned fasta} button in the
\com{import/export} menu.}
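
The exported sequences can be read back by any tool that understands the FASTA format. As a minimal sketch, assuming only the standard layout (a header line starting with \texttt{>} followed by one or more sequence lines), the following reader yields each header with its sequence. The two records are hypothetical, and the gap character in the first one merely imitates what an aligned export might contain.

```python
import io


def read_fasta(stream):
    """Yield (header, sequence) pairs from a FASTA text stream."""
    header, chunks = None, []
    for line in stream:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical records, not taken from a real export.
example = io.StringIO(">clone_1\nACGT\nAC-T\n>clone_2\nGGCC\n")
records = list(read_fasta(example))
print(records)
```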
\section{Assessing the quality of the run and of the analysis}
The Vidjil web application allows you to run several ``RepSeq'' (immune repertoire analysis) algorithms.
Each RepSeq algorithm has its own definition of what a clone is (or, more precisely
a clonotype), how to output its sequence and how to assign a V(D)J designation.
The number of analyzed reads will depend on the algorithm used.
This sample has been processed using the Vidjil algorithm.
\marginpar{The percentage of analyzed reads can range from 0.01\,\% (for
  RNA-Seq or capture data) to 98--99\,\% (for very high-quality runs, mostly on
  Illumina).}
\question{How many reads have been analyzed in the current sample with the embedded algorithm?}
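
The percentage of analyzed reads is simply the number of analyzed reads divided by the total number of reads. A minimal sketch of this arithmetic follows; the function name and the read counts are hypothetical and do not come from the demonstration sample.

```python
def analyzed_percentage(analyzed_reads, total_reads):
    """Return the percentage of reads that were analyzed."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * analyzed_reads / total_reads


# Hypothetical counts, chosen only to illustrate the computation.
print(analyzed_percentage(985, 1000))
```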
We will now try to understand why some reads were not analyzed in our
sample.
This might reflect a problem in the sequencing protocol\dots or it may
be perfectly normal.
To do so, display the information box by clicking on the
\textit{i} in the upper left part.
\question{What are the average read lengths on IGH? and on TRG?}
The lines starting with \texttt{UNSEG} display the reasons why some reads have
not been analyzed.
These reasons are explained in the online documentation of the
algorithm: \href{http://www.vidjil.org/doc/vidjil-algo\#unsegmentation-causes}{vidjil.org/doc/vidjil-algo\#unsegmentation-causes}.
\question{What are the main reasons why reads have not been
  analyzed? Also have a look at the average read lengths for these causes. Do
  you notice anything about them?}
\section{Dealing with samples and patients}
We will now see how to make the best use of the patient and sample database.
For this you need an account with the rights to create new patients,
runs and sets, to upload data and, preferably, to run analyses.
The demo account is therefore not suitable.
\question{ Retrieve the toy dataset at
\href{http://vidjil.org/seqs/tutorial_dataset.zip}{vidjil.org/seqs/tutorial\_dataset.zip}
and extract the files from the archive.}
You should now have three files. We will imagine that those three files are
the results of a single sequencing run, and that each one corresponds to
a single patient. We therefore want to upload those files, assign all of
them to the same \com{run} and assign each of them to its own \com{patient}.
\question{
Go to the main page of the Vidjil platform (by default
\href{https://app.vidjil.org}{app.vidjil.org}).
You should be on the \com{patients} page.
  Go to the bottom of the page and click on \com{+ new patients} to create the
  three patients.
}
Note that you should usually first check whether the patient has
already been created by searching for her/his name in the search box at the
upper left corner.
\question{
You are now on the creation page for patients, runs, and sets.
You can create as many patients, runs and sets as you want.
  \marginpar{Patients, runs and sets are just different ways to
    group samples.
    The names only add some semantics, so that you
    know that your patients will be on the patient page, your runs on the run
    page and your other sets (that is, any set of samples you want to make) on the
    set page.}
Here we already have a line to create one patient.
We want to create two additional patients and one run.
Thus click twice on \com{add patient} and once on \com{add run}.
}
Now you should have three lines with Patient 1, Patient 2, Patient 3 and one
line with Run 1.
If you created too many lines, you can remove some by clicking on the cross on
the right-hand side.
\question{
For instance click on the cross corresponding to Patient 3.
The line has now been removed.
Click again on \com{add patient} so that the line appears again (it is now
called Patient 4).
}
\question{ Now you can fill in the mandatory fields (outlined in red) and,
  optionally, the other fields.}
The last field, called
\com{patient/run information (\#tags can be used)}, is optional but very important.
Here you can enter any information relevant to this set of samples.
More specifically, you can enter tags (starting with a \#) that will allow
you to find, easily and quickly, all the patients/runs/sets sharing
this tag.
By default when you enter a \# in this field, some tags appear and the
suggestions are updated while you enter other characters.
Note that a tag cannot contain any space.
Also note that you can create new tags just by entering whatever you would
like in the field, preceded by a \#. Any tag you enter is saved (and
can be suggested later on).
\question{For patient 1, enter \texttt{\#diagnosis of
    patient with \#B-ALL} in this last field. For patient 2, enter \texttt{\#blood sample \#CLL}.
  For patient 4, enter \texttt{bone \#marrow \#B-ALL}.}
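
The tag rules above (a tag starts with a \# and contains no space) can be sketched in a few lines of Python. This is only an illustration of how such tags could be extracted and searched, not the way the server implements it. The function names are hypothetical, and the exact, case-sensitive matching is an assumption.

```python
import re

# A tag is a '#' followed by characters up to the next whitespace.
TAG_PATTERN = re.compile(r"#(\S+)")


def extract_tags(text):
    """Return the tags found in an information field, without the '#'."""
    return TAG_PATTERN.findall(text)


def search_by_tag(notes, tag):
    """Return the names whose information field contains the given tag."""
    wanted = tag.lstrip("#")
    # Exact, case-sensitive comparison: an assumption of this sketch.
    return [name for name, text in notes.items() if wanted in extract_tags(text)]


# Information fields entered in the question above.
notes = {
    "Patient 1": "#diagnosis of patient with #B-ALL",
    "Patient 2": "#blood sample #CLL",
    "Patient 4": "bone #marrow #B-ALL",
}
print(extract_tags(notes["Patient 2"]))
print(search_by_tag(notes, "#B-ALL"))
```

In the third field only \texttt{marrow} and \texttt{B-ALL} are tags, since \texttt{bone} is not preceded by a \#.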
Now the three patients and the run have been created but we have not uploaded
the sequence files yet.
\question{Now go to the \com{runs} page. You should see the run you have just
created. Click on it. Then click on \com{+ add samples}.}
Similarly to the patient/run creation page, we can add as many samples as we
want on this page.
\question{As we need to upload three samples, click twice on the \com{add
other sample} button so that you have three lines to add a sample.}
\question{For sample 1, choose the file corresponding to patient 1, and
  do likewise for the other two samples. You can also add extra information, with
  tags, as previously.}
Note the \com{common sets} field. This field means that all the samples will
be added to this run (the one you created). If you would like to add \textbf{all}
the samples to another patient/run/set, you should specify it here.
In our case we want to add each sample to a different patient, so we do not
need to modify this field.
\question{Instead we need to modify the last field on each line. Click on
it. A list should appear with the last patients/runs/sets you created.
Either click on the correct patient or type the first letters of her/his
name. Then validate with \com{Enter} or by clicking on the correct entry.}
\question{When you have associated each sample with its corresponding patient,
  you can upload the samples by clicking on the \com{Submit samples} button.}
Now you are back on the page of the run where you should see the three samples
that are being uploaded.
\question{When the upload is finished, you can launch the analysis by selecting the
  configuration in the drop-down list on the right (\com{multi+inc+xxx}) and then
  clicking on the gearwheel.}
You can have a coffee, a tea, or something else, while the process is
running.
\question{To regularly check the status of your job, you can click
  on the \com{reload} button at the bottom left of the page.
  Your process usually goes through the following stages: \com{QUEUED},
  (possibly \com{STOPPED}), \com{ASSIGNED}, \com{RUNNING}, \com{COMPLETED} (or
  \com{FAILED} when there is an issue; in such a case, please contact us).}
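
The succession of stages above can be pictured as a small polling loop that stops once a final state is reached. The sketch below is not the server's interface: \texttt{get\_status} is a hypothetical callable standing in for a click on \com{reload}, and the job is simulated with a fixed list of states.

```python
import time

# States after which the job no longer changes, as listed above.
FINAL_STATES = {"COMPLETED", "FAILED"}


def wait_for_job(get_status, delay=0.0, max_checks=100):
    """Poll get_status() until a final state; return the distinct states seen."""
    seen = []
    for _ in range(max_checks):
        status = get_status()
        if not seen or seen[-1] != status:
            seen.append(status)  # record each change of state once
        if status in FINAL_STATES:
            return seen
        time.sleep(delay)
    raise TimeoutError("the job did not reach a final state")


# Simulated job: get_status is hypothetical, no real server is contacted.
states = iter(["QUEUED", "QUEUED", "ASSIGNED", "RUNNING", "RUNNING", "COMPLETED"])
history = wait_for_job(lambda: next(states))
print(history)
```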
You could then view the results as explained before,
but for now we will remain on the server.
\question{Now go back to the \com{patients} page. You can filter the page
using the tags you entered previously.
Enter \texttt{\#B-ALL} in the search box (notice the autocompletion that helps you)
and validate with \com{Enter}.}
\section{Tracking clones on several samples}
\label{sec:tracking}
%Load now some data with several samples.
The \textit{time graph} shows the evolution of the top clones of each sample across all the samples.
Bear in mind that, to ensure readability, at most 50 curves are displayed in this graph.
\marginpar{When loading data with only one sample, the time graph is replaced by a second bar/grid plot.}
\question{Move the mouse over the bubbles in the grid or over the lines in the time graph.
  Click on a clone. What happens?}
\question{Click on the label of the time graph to select another sample.
  What happens to the number of analyzed reads? To the size of the top clones?}
When you switch to another time point, the views update dynamically, which makes it
easy to track changes over time. Also note that the number of analyzed
reads differs from the previous point. We can again investigate why some
reads were unsegmented.
\bigskip
We will now look at how the V gene distribution evolves over time.
\question{In the grid, select the preset \com{V distribution}. Then click
on the \com{play} icon in the upper left part (below the \textit{i} icon).}
By doing so, you can see how the V distribution changes over time.
Of course, you can also change the data displayed in the grid to follow
the evolution of other information.
\bigskip
Recall that, by default, at most 50 clones are displayed
on the time graph. However, the rest of the application usually displays
the 50 \textit{most abundant clones} of each sample (which can amount to hundreds of
clones when there are several samples).
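
To see why keeping the top clones of each sample can add up to many more clones than the limit itself, consider the union of the per-sample top lists. The sketch below uses a limit of 2 instead of 50 and hypothetical clone names and sizes. It illustrates the counting argument only, not the selection code of the application.

```python
def top_clones(sizes, n):
    """Return the names of the n most abundant clones of one sample."""
    return sorted(sizes, key=sizes.get, reverse=True)[:n]


def displayed_clones(samples, n=50):
    """Return the union of the top-n clones taken over all samples."""
    shown = set()
    for sizes in samples:
        shown.update(top_clones(sizes, n))
    return shown


# Hypothetical clone sizes (fractions of reads) in two samples.
samples = [
    {"A": 0.40, "B": 0.30, "C": 0.01},
    {"A": 0.02, "C": 0.50, "D": 0.20},
]
print(sorted(displayed_clones(samples, n=2)))
```

With a limit of 2 per sample, four distinct clones end up being kept, because the two samples do not share their top clones.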
\question{Select a sample, order the list by size, and move the mouse over the list
  of top 50 clones. What happens in the graph when hovering over clones that are not in the top 50?}
\bigskip
If you have many samples, you may wish to reorder the samples.
\question{Drag the label of one sample to reorder the samples.}
\question{Drag one label to the box with the pin icon to hide this sample.}
\bigskip
You may also want to compare two samples, either to check a replicate, to check for possible contaminations, or to
compare different research or medical situations.
\question{In the \com{color by} menu, choose \com{by abundance}. Select a different
  sample. What happens? Are there clones with significantly different concentrations in the two samples?
  Revert the color by choosing \com{by tag}.}
Another option is to directly plot a log-log curve comparing two samples.
\question{In the \com{plot} menu, choose the preset \com{compare two samples}. Click
  successively on two labels in the time graph to select the samples to be compared.
  Are there again clones with significantly different concentrations in the two samples?}
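
On a log-log plot, a clone whose concentration is similar in both samples lies close to the diagonal, and its distance from the diagonal reflects the logarithm of the ratio between its two sizes. As a minimal sketch of that idea, the code below lists clones whose size changes by at least a factor of ten. The threshold, the floor used for absent clones and the clone sizes are all hypothetical, and this is not a statistical test performed by the application.

```python
import math


def log10_ratio(size_a, size_b, floor=1e-6):
    """Return log10(size_b / size_a), flooring sizes to avoid log of zero."""
    a = max(size_a, floor)
    b = max(size_b, floor)
    return math.log10(b / a)


def changed_clones(sample_a, sample_b, threshold=1.0):
    """Return clones whose size differs by at least 10**threshold fold."""
    names = sorted(set(sample_a) | set(sample_b))
    return [
        name
        for name in names
        if abs(log10_ratio(sample_a.get(name, 0.0), sample_b.get(name, 0.0)))
        >= threshold
    ]


# Hypothetical clone sizes (fractions of reads) in two samples.
first = {"A": 0.1, "B": 0.2}
second = {"A": 0.001, "B": 0.25}
print(changed_clones(first, second))
```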