Commit fe70e5d0 authored by Mathieu Giraud

doc/dev-{algo,client,server}.md: converted and split from dev.org

see #3932

pandoc --wrap=preserve --atx-headers -s -r org doc/dev.org -t gfm > doc/dev.md
parent bac84808
Here are aggregated notes forming the developer documentation of vidjil-algo.
This documentation is a work in progress; it is far from being as polished as the user documentation.
Help can also be found in the source code and in the commit messages.
# Algorithm
## Code organisation
The algorithm roughly follows these steps:
1. The germlines are read. Germlines are in the FASTA format and are read
by the Fasta class (`core/fasta.h`). Germlines are built using the
Germline (or MultiGermline) class (`core/germline.h`).
2. The input sequence file (.fasta, .fastq, .gz) is read by an OnlineFasta
(`core/fasta.h`). Unlike the Fasta class, it does not store all the
data in memory: the file is read online, keeping only
the current entry.
3. Windows must be extracted from the read, which is done by the
WindowExtractor class (`core/windowExtractor.h`). This class has an
`extract` method which returns a WindowsStorage object
(`core/windows.h`) in which windows are stored.
4. To save memory, not all the reads linked to a given window are
stored. Only the longest ones are kept. The BinReadStorage class is
used for that purpose (`core/read_storage.h`).
5. In the WindowsStorage, we now have the information on the clusters and on
the abundance of each cluster. However, we still lack a representative sequence
for each cluster. For that purpose the class provides a
`getRepresentativeComputer` method that provides a
KmerRepresentativeComputer (`core/representative.h`). This class can
compute a representative sequence using the (long) reads that were
stored for a given window.
6. The representative can then be segmented to determine what V, D and J
genes are at play. This is done by the FineSegmenter (`core/segment.h`).
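Steps 3 to 5 can be illustrated with a small Python sketch. It is not the C++ implementation: `toy_window` is a hypothetical stand-in for the WindowExtractor (it simply takes a fixed slice of the read), and `max_reads_per_window` is an assumed parameter. The sketch only shows the bookkeeping: reads sharing a window are counted together, and only the longest reads of each window are kept for the later computation of a representative.

```python
import heapq
from collections import defaultdict


def toy_window(read, start=2, length=4):
    """Hypothetical window extractor: a fixed slice, or None if the read is too short."""
    if len(read) < start + length:
        return None
    return read[start:start + length]


def cluster_reads(reads, extract_window=toy_window, max_reads_per_window=2):
    """Count the reads of each window and keep only the longest reads per window."""
    abundance = defaultdict(int)
    kept = defaultdict(list)  # window -> min-heap of (read length, read)
    for read in reads:
        window = extract_window(read)
        if window is None:
            continue  # no window could be extracted from this read
        abundance[window] += 1
        heapq.heappush(kept[window], (len(read), read))
        if len(kept[window]) > max_reads_per_window:
            heapq.heappop(kept[window])  # drop the shortest stored read
    longest = {w: [r for _, r in sorted(h, reverse=True)] for w, h in kept.items()}
    return dict(abundance), longest


abundance, longest = cluster_reads(["AACCGGTT", "AACCGGTTAA", "AACCGG", "TTTT"])
```

Here the first three reads share the same toy window, so the cluster has an abundance of 3 while only its two longest reads are stored.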
## The xxx germline
- All germlines are inserted in one index using `build_with_one_index()` and
the segmentation method is set to `SEG_METHOD_MAX12` to indicate that the
segmentation must be handled differently.
- So that the FineSegmenter correctly segments the sequence, the `rep_5` and
`rep_3` members (class `Fasta`) of the xxx germline are modified by the
FineSegmenter. The `override_rep5_rep3_from_labels()` method of the
Germline class overwrites those members with the Fasta
corresponding to the assignment found by the KmerSegmenter.
## Tests
### Unit
Unit tests are managed using an internal lightweight poorly-designed
library that outputs a TAP file. They are organised in the directory
[algo/tests](../algo/tests).
All the tests are defined in the [tests.cpp](../algo/tests/tests.cpp) file. But, for the sake of
clarity, this file includes other `cpp` files that incorporate all the
tests. A call to `make` compiles and launches the `tests.cpp` file, which
outputs a TAP file (in case of total success) and creates a `tests.cpp.tap`
file (in every case).
1. Tap test library
The library is defined in the [testing.h](../algo/tests/testing.h) file.
Tests must be declared in the [tests.h](../algo/tests/tests.h) file:
1. Define a new macro (in the enum) corresponding to the test name
2. In `declare_tests()` use `RECORD_TAP_TEST` to associate the macro with a
description (that will be displayed in the TAP output file).
Then testing can be done using the `TAP_TEST` macro. The macro takes three
arguments. The first one is a boolean that is supposed to be true, the
second is the test name (using the macro defined in `tests.h`) and the
third one (which can be an empty string) is something which is displayed
when the test fails.
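To make the three arguments concrete, here is a minimal Python sketch of what a TAP report looks like for a list of (boolean, test name, failure message) triples. It follows the generic TAP text format; the exact output of `testing.h` may differ, and the test names used here are made up for illustration.

```python
def tap_report(results):
    """Render (passed, description, message) triples as a generic TAP report."""
    lines = ["1..%d" % len(results)]  # the TAP plan: number of tests
    for number, (passed, description, message) in enumerate(results, start=1):
        status = "ok" if passed else "not ok"
        lines.append("%s %d - %s" % (status, number, description))
        if not passed and message:
            lines.append("# " + message)  # diagnostic shown only on failure
    return "\n".join(lines)


report = tap_report([
    (True, "Fasta size", ""),                          # hypothetical test names
    (False, "Germline load", "expected 3 sequences"),
])
print(report)
```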
Here are aggregated notes forming the developer documentation on the Vidjil web client.
This documentation is a work in progress; it is far from being as polished as the user documentation.
Help can also be found in the source code and in the commit messages.
# Client
## Installation
Run `make` in `browser/` to get the necessary files.
In particular, this fetches the germline files as well as the icon files.
Opening the `browser/index.html` file is then enough to get a functioning client,
able to open `.vidjil` files with the `import/export` menu.
To work with actual data, the easiest way is to copy `js/conf.js.sample` to `js/conf.js`.
This will unlock the `patients` menu and allow your local client
to access the public server at <http://app.vidjil.org/>.
## Client API and permanent URLs
The client can be opened on a data file specified by a `data` URL parameter,
and optionally on an analysis file specified by an `analysis` URL parameter,
as in the following URLs on our test server:
- <http://app.vidjil.org/browser/?data=test.vidjil>
- <http://app.vidjil.org/browser/?data=test.vidjil&analysis=test.analysis>
- <http://app.vidjil.org/browser/?data=http://app.vidjil.org/browser/test.vidjil>
Both GET and POST requests are accepted.
Note that the `browser/index.html` file and the `.vidjil/.analysis` files should be hosted on the same server.
Otherwise, the server hosting the `.vidjil/.analysis` files must accept cross-domain queries.
The client can also load data from a server (see below, requires logging in), as in <http://app.vidjil.org/?set=3241&config=39>
| Parameter | Meaning |
| ----------- | ------------- |
| `set=xx` | sample set id |
| `config=xx` | config id |
Older formats (patients, run…) are also supported for compatibility but deprecated.
Moreover, the state of the client can be encoded in the URL, as in <http://app.vidjil.org/?set=3241&config=39&plot=v,size,bar&clone=11,31>
| Parameter | Meaning |
| ---------------- | --------------------- |
| `plot=x,y,m` | plot (x axis, y axis, optional plot type) |
| `clone=xx,xx,xx` | selected clone ids |
For `plot` the axis names are found in `browser/js/axes.js`. `m` is optional, and defines the type of plot (either `grid` or `bar`).
We intend to encode more parameters in the URL.
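As an illustration, the following Python sketch builds and parses such URLs with the standard library. The helper names `client_url` and `client_state` are hypothetical; only the `set`, `config`, `plot` and `clone` parameters come from the tables above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def client_url(base, set_id, config_id, plot=None, clones=None):
    """Build a permanent URL; plot is (x, y) or (x, y, m), clones a list of ids."""
    params = {"set": set_id, "config": config_id}
    if plot:
        params["plot"] = ",".join(plot)
    if clones:
        params["clone"] = ",".join(str(c) for c in clones)
    # Keep commas readable instead of percent-encoding them.
    return base + "?" + urlencode(params, safe=",")


def client_state(url):
    """Parse the query string back into a dictionary of strings and lists."""
    query = parse_qs(urlsplit(url).query)
    state = {name: values[0] for name, values in query.items()}
    for name in ("plot", "clone"):
        if name in state:
            state[name] = state[name].split(",")
    return state


url = client_url("http://app.vidjil.org/", 3241, 39,
                 plot=("v", "size", "bar"), clones=[11, 31])
state = client_state(url)
```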
## Architecture
The Vidjil client is a set of *views* linked to the same *model*.
The model keeps the views in sync on some global properties,
most notably dealing with the selection of clones, with the clone filtering,
as well as with the locus selection.
- The model (`js/model.js`) is the main object of the Vidjil client.
It loads and saves `.vidjil` JSON data (either directly from data, or from a local file, or from some URL).
It provides functions to access and edit information on the clones and on the global parameters.
It keeps all the views in sync.
- Each of the views (`Graph`, `ScatterPlot`, `List`, `Segment`) is rendered inside one or several `<div>` elements,
and kept in sync with the model. All the views are optional, and several views of the same type can be added.
See `js/main.js` for the invocation.
- The link with the patient database/server is done with the `Database` object (`js/database.js`).
- Other objects: `Report`, `Shortcut`.
They extend the functionality but require elements from the full `index.html`.
## Integrating the client
### HTML and CSS
- The `index.html` contains the `<div>` for all views and the menus
- The CSS (`css/light.css`) is generated by `less` from `css/vidjil.less`
- The `small_example.html` is a minimal example embedding basic HTML, CSS, as well as some data.
As the menus are not embedded in this file, functionalities should be provided by direct calls to the models and the views.
### Javascript
- The wonderful library `require.js` is used, so there is only one file to include:
`<script data-main="js/app.js" src="js/lib/require.js"></script>`
- `js/main.js` creates the different views and binds them to the model.
Another option is to directly define a function named `main()`, as in `small_example.html`.
### JSON .vidjil data
Clone lists can be passed to the model through several ways:
- directly by the user (import/export)
- from a patient database (needs a database)
- through the API (see below)
- or by directly providing data through Javascript (as in `small_example.html`)
The first three solutions need some further elements from the full `index.html`.
## Notifications
### Priority
\#<span id="browser:priority"></span>
The priority determines how notifications are shown and what action the
user should take. Priorities range from 0 to 3.
- 0
The notification is not shown
- 1
The notification is shown (usually on green background) and
automatically disappears
- 2
The notification is shown (usually on yellow background) and
automatically disappears
- 3
The notification is shown (usually on red background) and doesn't
disappear until the user clicks on it.
In the `console.log`, the field `priority` takes one of those priorities.
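The four levels can be summarised as a lookup table. The Python sketch below only restates the list above; the names `Display` and `display_for` are hypothetical and do not exist in the client code.

```python
from typing import NamedTuple, Optional


class Display(NamedTuple):
    shown: bool                 # is the notification displayed at all?
    background: Optional[str]   # usual background colour, if displayed
    needs_click: bool           # stays until the user clicks on it


DISPLAY_BY_PRIORITY = {
    0: Display(False, None, False),
    1: Display(True, "green", False),
    2: Display(True, "yellow", False),
    3: Display(True, "red", True),
}


def display_for(priority):
    """Return how a notification of the given priority is displayed."""
    if priority not in DISPLAY_BY_PRIORITY:
        raise ValueError("priority must be between 0 and 3, got %r" % (priority,))
    return DISPLAY_BY_PRIORITY[priority]
```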
## Plots
### How to add something to be plotted
You want to add a dimension in the scatterplot or as a color? Read the
following.
1. Scatterplot
In [scatterPlot.js](../browser/js/scatterPlot.js), the `available_axis` object defines the dimensions that
can be displayed. Adding an entry there is enough for it to be offered
for the X and Y axes. This approach should be generalized to
the other components.
The presets are defined in the `preset` object.
2. Color
Adding a color needs slightly more work than adding a dimension in the
scatterplot.
The function `updateColor` in file [clone.js](../browser/js/clone.js) must be modified to add our color method.
The variable `this.color` must contain a color (either in HTML or RGB, or…).
Then a legend must be displayed to understand what the color represents.
To do so, modify the `build_info_color` method in the [info.js](../browser/js/info.js) file. By
default four spans are defined (that can be used) to display the legend:
`span0`, …, `span3`.
Finally modify the [index.html](../browser/index.html) file to add the new color method in the
select box (which is under the `color_menu` ID).
## Classes
### Clone
1. Info box
The info box displays all the fields starting with a `_`. All the
fields under the `seg` field are also displayed as soon as they have a `start` and
`stop`. Some of them can be explicitly excluded by filling the
`exclude_seg_info` array in `getHtmlInfo`.
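The selection rule can be sketched in Python as follows. This is only an illustration of the rule described above, not the JavaScript code of `getHtmlInfo`; the function name `info_box_fields` and the toy clone are assumptions.

```python
def info_box_fields(clone, exclude_seg_info=()):
    """List the field names that the rule above would put in the info box."""
    fields = [name for name in clone if name.startswith("_")]
    for name, value in clone.get("seg", {}).items():
        if name in exclude_seg_info:
            continue  # explicitly hidden
        if isinstance(value, dict) and "start" in value and "stop" in value:
            fields.append(name)
    return fields


# Toy clone, made up for the example.
toy_clone = {
    "id": 11,
    "_average_read_length": 120,
    "seg": {
        "5": {"name": "IGHV1", "start": 1, "stop": 90},
        "cdr3": {"start": 80, "stop": 110},
        "junction": {"aa": "CAR"},  # no start/stop: not displayed
    },
}
fields = info_box_fields(toy_clone, exclude_seg_info=("cdr3",))
```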
## Tests
### Code Quality
Quality of code is checked using [JSHint](http://jshint.com/), by
running `make quality` from the `browser` directory.
Install with `npm install -g jshint`
### Unit
The unit tests in the client are managed by QUnit and launched using
[nightmare](http://www.nightmarejs.org/), by launching `make unit` from the `browser/test` directory.
The tests are organised in the directory
[browser/test/QUnit/testFiles](../browser/test/QUnit/testFiles). The file [data_test.js](../browser/test/QUnit/testFiles/data_test.js) contains a toy
dataset that is used in the tests.
Unit tests can be launched using a real client (instead of nightmare). It
suffices to open the file [test_Qunit.html](../browser/test/QUnit/test_Qunit.html). This HTML page also
shows the coverage. It is important that all possible functions
are covered by unit tests. For security reasons, Firefox only displays the
coverage when the page is served by a web server. Under Chromium/Chrome
this should work fine by just opening the
webpage.
1. Installation
Nightmare is distributed within `node` and can therefore be installed with it.
``` bash
apt-get install nodejs-legacy npm
# `make -C browser/test unit` will automatically link
# to the global nightmare installation
npm install nightmare -g
```
Note that using `nightmare` for our unit testing
requires the installation of `xvfb`.
2. Debugging
If there is a problem with nightmare or electron (a nightmare
dependency), you may encounter a lack of output or error messages.
To address this issue, run:
``` bash
cd browser/test/QUnit
DEBUG=nightmare*,electron:* node nightmare.js
```
### Functional
1. Architecture
The client functional testing is done in the directory
`browser/tests/functional`, with Watir.
The functional tests are built using two base files:
- `vidjil_browser.rb`
abstracts the vidjil browser (avoid using IDs or
class names that could change in the test). The tests must rely as
much as possible on `vidjil_browser.rb`. If access to some
data/input/menus is missing, it must be added there.
- `browser_test.rb`
prepares the environment for the tests. Each test
file will extend this class (as can be seen in `test_multilocus.rb`).
The file `segmenter_test.rb` extends the class in `browser_test.rb` to adapt
it to the purpose of testing the analyze autonomous app.
The tests are in the files whose name matches the pattern `test*.rb`. The
tests are launched by the script in `../launch_functional_tests` which launches
all the files matching the previous pattern. It also backs up the test
reports as `ci_reporter` removes them before each file is run.
2. Installation
The following instructions are for Ubuntu.
For OS X, see <https://github.com/watir/watirbook/blob/master/manuscript/installation/mac.md>.
1. Install rvm
``` bash
\curl -sSL https://get.rvm.io | bash
```
Afterwards you may need to launch:
``` bash
source /etc/profile.d/rvm.sh
```
2. Install ruby 2.6.1
``` bash
rvm install 2.6.1
```
3. Switch to ruby 2.6.1
``` bash
rvm use 2.6.1
```
4. Install necessary gems
``` bash
gem install minitest minitest-ci watir test-unit
```
5. Install web browsers
The Firefox version used can be set with an environment variable (see
below). By default, the tests only work with Firefox ≤ 45. All Firefox releases are [available here](https://download-installer.cdn.mozilla.net/pub/firefox/releases/%20).
One can instead choose to launch functional tests using Chrome. You should
then install `chromium-browser` as well as `chromium-chromedriver`. On old Chrome
versions the `chromedriver` package may not exist. In such a case you should
[download ChromeDriver yourself](https://chromedriver.storage.googleapis.com/index.html?path=2.11/) (the supported Chrome versions are
listed in the `notes.txt` file of each version).
3. Launch client tests
``` bash
make functional
```
1. Environment variables
By default the tests are launched on the Firefox installed on the system.
This can be modified by providing the `FUNCTIONAL_CLIENT_BROWSER_PATH`
environment variable (which can contain several paths, separated with
spaces) to the `launch_functional_tests` script. To launch
individual test scripts instead, set the `WATIR_BROWSER_PATH` environment
variable.
Other environment variables can be specified to tune the default behaviour:
- `WATIR_CHROME`
should be set to something evaluated to `True` (*e.g.* 1) if the
tests must be launched using Chrome
- `WATIR_MARIONETTE`
should be set to something evaluated to `True` (*e.g.* 1)
if the tests must be launched on a recent Firefox version (\> 45)
4. Headless mode
On servers without an X server the client tests can be launched in headless
mode.
To do so, a few more dependencies must be installed:
``` bash
gem install headless
```
The virtual framebuffer X server (`xvfb`) must also be installed. Depending
on the operating system the command will be different:
``` bash
# On Debian/Ubuntu
apt-get install xvfb
# On Fedora/CentOS
yum install xorg-x11-server-Xvfb
```
If you have set a configuration file (`browser/js/conf.js`), you should remove it during the headless testing. The easiest way to do so is to run these commands before and after the tests:
``` bash
# before tests
mv browser/js/conf.js browser/js/conf.js.bak
# after tests
mv browser/js/conf.js.bak browser/js/conf.js
```
Then the client tests can be launched in headless mode with:
``` bash
make headless
```
It is possible to view the framebuffer content of `Xvfb` using `vnc`. To do so,
launch:
1. `x11vnc -display :99 -localhost`
2. `vncviewer :0`
5. Interactive mode
For debugging purposes, it may be useful to launch Watir in interactive
mode. In that case, you should launch `irb` in the `browser/tests/functional`
directory.
Then load the file `browser_test.rb` and create a `BrowserTest`:
``` ruby
load 'browser_test.rb'
bt = BrowserTest.new "toto"
# Load the Vidjil client with the given .vidjil file
bt.set_browser("/doc/analysis-example.vidjil")
```
Finally you can directly interact with the `VidjilBrowser` using the `$b`
variable.
Another way of debugging interactively is by using (and installing) the
`ripl` gem. Then you should add, in the `.rb` file to debug:
``` ruby
require 'ripl'
```
Then, to stop and launch an interactive session at a given point in the
code, insert the following command at that point:
``` ruby
Ripl.start :binding => binding
```
If you have to launch `irb` on a remote server without X (only using `Xvfb`)
you may want to use the [redirection over SSH](https://en.wikipedia.org/wiki/Xvfb#Remote_control_over_SSH).
Here are aggregated notes forming the developer documentation on the Vidjil server,
on client-server interaction as well as on packaging.
This documentation is a work in progress; it is far from being as polished as the user or maintainer documentation.
Help can also be found in the source code and in the commit messages.
# Server
## Notifications
The news system is a means of propagating messages to the users of a vidjil server installation.
Messages are propagated in near-realtime for users interacting directly with the server and at a slightly slower rate for users simply using the browser but for which the server is configured.
### Message Retrieval
By default, the browser periodically queries the server to retrieve any new messages, which are displayed on a per-user basis. This means that any message already viewed by the user is not displayed again in the browser.
Older messages can be viewed from the index of news items.
### Caching
News items are kept in cache in order to relieve the database from a potentially large amount of queries.
The cache is stored for each user and is updated only when a change occurs (message read, message created or message edited).
### Formatting
Messages can be formatted by using the Markdown syntax. Syntax details are
available here: <http://commonmark.org/help/>
### Priority
The priority determines how the notification is shown (see [here for more details](browser:priority)). From the server there are two ways of modifying the priority:
either by setting the `success` field to `'true'` or to `'false'`, or
by explicitly specifying the priority in the field `priority`.
For more details see commit 35054e4.
## Getting data and analysis
How are the data files (.vidjil) and analysis files retrieved from the server?
### Retrieving the data file
This is done in the `default.py` controller under the `get_data` function.
However, the .vidjil file is not served as an exact copy of the file stored on the
server. Several pieces of information coming from the DB are added to the file
(original filename, time stamps, information on each point, …).
### Retrieving the analysis file
This is done in the `default.py` controller under the `get_analysis` function.
Actually the real work is done in the `analysis_file.py` model, in the
`get_analysis_data` function.
## Permissions
Permissions are handled by Web2py's authentication mechanism which is
specialised to Vidjil's characteristics through the `VidjilAuth` class.
### Login
1. Redirect after login
The URL reached after login is defined in the controllers
`sample_set/all` and `default/home`.
## Database
### Export
`mysqldump -u <user> -p <database> -c --no-create-info > <file>`
### Import
In order to import the data from another server, you need to ensure
there will be no key collision, or the import will fail.
If the database contains data, the easiest is to drop the database and
create a new empty database.
This will require you to delete the .table file in `web2py/applications/vidjil/databases`.
In order to create the tables you should then load a page from the
webapp, but DO NOT init the database, because this will raise the problem
of colliding primary keys again.
Then run:
`mysql -u <user> -p <database> < <file>`
### VidjilAuth
One VidjilAuth is launched for a given user when a controller is called.
During that call, we cache as much as possible the calls to the DB. For
doing so the `get_permission` method is defined (overriding the native
`has_permission`). It calls the native `has_permission` only when that call
hasn't already been done (this is particularly useful for DB-intensive
queries, such as comparing patients).
Also some user characteristics are preloaded (groups and whether the person
is an admin), which also prevents many DB calls.
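The caching idea can be sketched in a few lines of Python. This is a simplified illustration, not the actual `VidjilAuth` code: `CachedAuth` is a hypothetical class, and the `has_permission` callable passed to it stands in for the native check, with an assumed `(action, table, record_id)` signature.

```python
class CachedAuth:
    """Hypothetical wrapper that remembers permission checks during one call."""

    def __init__(self, has_permission):
        self._has_permission = has_permission  # stand-in for the native check
        self._cache = {}

    def get_permission(self, action, table, record_id=0):
        key = (action, table, record_id)
        if key not in self._cache:
            # Only the first request for a given key reaches the database.
            self._cache[key] = self._has_permission(action, table, record_id)
        return self._cache[key]


db_calls = []


def fake_has_permission(action, table, record_id):
    """Fake DB-backed check that records every call it receives."""
    db_calls.append((action, table, record_id))
    return action == "read"


auth = CachedAuth(fake_has_permission)
answers = [auth.get_permission("read", "patient", 42) for _ in range(3)]
denied = auth.get_permission("admin", "patient", 42)
```

The three identical requests result in a single call to the underlying check; a different action triggers one more.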
# Tests
# Packaging
## Script driven building
In order to make packaging Vidjil simple and to facilitate releases, scripts
have been written, and all the metadata files required for the Debian packages
can be found in the packaging directory, in each package's subdirectory.
The packaging directory also contains the scripts for building each of
the vidjil packages: germline, algo (named vidjil) and server.
Note: build-generic.sh is a helper script that is used by the other
build-\* scripts to build a package.
Executing one of the scripts will copy the necessary files to the
corresponding packaging subdirectory (germline, vidjil and server)
and build the package in the /tmp folder, along with all the files needed
to add the package to a repository.
It is worth noting that while all packages can be built directly from the
project sources, the algorithm is actually built from the releases found
at <http://www.vidjil.org/releases>.
## Packaging Vidjil into a Debian Binary Package
In this section we will explain how to package a pre-compiled version of
Vidjil that will allow easy installation although it will not meet all the
requirements for a full Debian package and therefore cannot be added to the
default Debian repositories.
In this document we will not go over the fine details of debian packaging
and the use of each file. For more information you can refer to this page
from which this document was inspired:
<http://www.tldp.org/HOWTO/html_single/Debian-Binary-Package-Building-HOWTO/>
Being a binary package it will simply contain the vidjil binary which will
be copied to the chosen location on installation.
### Let's Get Started
You will first and foremost need to compile vidjil. Refer to \#TODO for
more information.
Create a base directory for the package and the folders to which the binary
will be installed. Let's call our folder debian and copy the binary to *usr/bin*
``` bash
mkdir -p debian/usr/bin
```
And copy the vidjil binary
``` bash
cp vidjil debian/usr/bin
```
Now create the necessary control file. It should look something like this:
``` example
Package: vidjil
Version: <version> (e.g. 2016.03-1)
Section: misc
Priority: optional
Architecture: all
Depends: bash (>= 2.05a-11)
Maintainer: Vidjil Team <team@vidjil.org>
Description: Count lymphocyte clones
 vidjil parses a fasta or fastq file and produces an output with a list
 of clones and meta-data concerning these clones
```
And place it in the correct folder.
``` bash
mkdir -p debian/DEBIAN
cp control debian/DEBIAN/
```
Now build the package and rename it.
``` bash
dpkg-deb --build debian
mv debian.deb vidjil_<version>_all.deb
```
It can be installed by running
``` bash
sudo dpkg -i vidjil_<version>_all.deb
```
## Packaging Vidjil into a Debian Source Package
Note: This document is currently incomplete. This process will not produce a
working debian package. The package build will fail when attempting to
emulate `make install`.
### Requirements
- The release version of Vidjil you wish to package
- Knowledge of Debian packaging
In this documentation we will not go over all the specifics of creating a
debian package. You can find the required information here:
<https://wiki.debian.org/HowToPackageForDebian>
and <https://wiki.debian.org/Packaging/Intro?action=show&redirect=IntroDebianPackaging>
### Creating the orig archive
In order to build a debian package, it is required to have a folder named
debian with several files required for the package which contain meta
data and permit users to have information on packages and updates for
packages.
In order to generate this folder run the following from the source base
directory.
``` bash
dh_make -n
```
You can remove all files from the debian folder that match the patterns `*.ex`, `*.EX` and `README*`.
Update debian/changelog, debian/control and debian/copyright to contain the correct
information to reflect the most recent changes and metadata of Vidjil.
Vidjil has no install rule so we need to use a debian packaging feature.
Create a file named debian/install with the following line:
``` example
vidjil usr/bin/
```
Vidjil currently depends on some unpackaged files that need to be
downloaded before compiling.
``` bash
mkdir browser
make germline
make data
```
Debian packaging also requires archives of the original source. This is
to manage people packaging software they haven't developed with changes
they have made. To make things simpler, we simply package the current
source as the reference archive and build the package with the script
that can be obtained here: <https://people.debian.org/~wijnen/mkdeb> (Thanks
to Bas Wijnen \<wijnen@debian.org\> for this script)
From the source directory, run that script to create the package.
You're done! You can now install the debian package with:
``` bash
sudo dpkg -i path/to/package
```
# Docker
The vidjil Docker environment is managed by Docker Compose. Since it is
composed of several different services, this allows us to easily start and
stop individual services.
The services are as follows:
- `mysql`: the database
- `uwsgi`: the Web2py backend server
- `fuse`: the XmlRPCServer that handles custom fuses (for comparing
samples)
- `nginx`: the web server
- `workers`: the Web2py Scheduler workers in charge of executing vidjil on
users' samples
- `backup`: starts a cron job to schedule regular backups
- `reporter`: a monitoring utility that can be configured to send
monitoring information to a remote server
For more information about Docker Compose and how to install it check out
<https://docs.docker.com/compose/>
## Starting the environment
Ensure your docker-compose.yml contains the correct reference to the
vidjil image you want to use. Usually this will be vidjil/vidjil:latest,
but more tags are available at <https://hub.docker.com/r/vidjil/vidjil/tags/>.
You may also want to uncomment the volume `- ./vidjil/conf:/etc/vidjil`
in the fuse volume block. This will provide easier access to all of the
configuration files, allowing for tweaks.
Running the following command will automatically download any missing
images and start the environment:
``` bash
docker-compose up
```