This is the help of the Vidjil server.
This help is intended for server administrators.
Users should consult the [web application manual](http://www.vidjil.org/doc/user/).
Other documentation can also be found in [doc/](http://www.vidjil.org/doc/).
Finally, developer documentation

# Docker containers or Plain installation

There are two ways to install and run a Vidjil server:

  - Since 2018, we have been developing and deploying **Docker containers** to ease installation and maintenance.
    These Docker containers are used on the public server (<https://app.vidjil.org>) as well as in some partner hospitals.
    We recommend this installation for new instances of Vidjil.
    We also provide support and remote maintenance
    of such in-hospital servers through
    the [VidjilNet consortium](http://www.vidjil.net/index.en.html).

  - The **plain installation of the server** should run on any Linux/Unix server with Nginx (recommended) or Apache.
    We provide below detailed instructions for Ubuntu 14.04 LTS.
    We used this installation on the public server between 2014 and 2018.

# Requirements

## CPU, RAM

### Minimal

vidjil-algo typically uses
approx. 1.2GB of RAM to run on a 1GB `.fastq` and will take approx. 5+ minutes.
Therefore in order to process requests from a single user with a few samples,
any standard multi-core processor with 2GB RAM will be enough.

### Recommended

When choosing hardware for your server it is important to know the scale
of usage you require.
If you have many users that use the app on a daily basis, you will need to
have multiple cores to ensure the worker queues don't build up.
One worker will occupy one core completely when running vidjil-algo (which is
currently single-threaded).

For reference, here are various setups of our public
testing server <https://app.vidjil.org>:

#### 2016 -- 2017 (40+ users, including 15 regular users)
  - Processor: Quad core Intel 2.4GHz
  - 3 workers
  - RAM: 16GB


#### since 2018 (100+ users, including 30+ regular users)
  - Virtual Machine: 8 virtual CPUs
  - 6 workers
  - RAM: 28GB
  
  
We create fewer workers for executing vidjil-algo than there are (virtual) CPUs available,
always keeping one CPU core dedicated to the web server, even when the workers run at full capacity.
Running other RepSeq programs through the Vidjil server may require additional CPU and RAM.
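
The rule above (one core per worker, and at least one core kept free for the web server) gives an upper bound on the number of workers. A minimal sketch of that arithmetic, assuming a POSIX shell; `suggest_workers` is a hypothetical helper name and is not part of Vidjil:

```sh
#!/bin/sh
# Hypothetical helper (not part of Vidjil): upper bound on the number of
# workers for a given CPU count, keeping one core free for the web server.
suggest_workers() {
    n="$1"
    if [ "$n" -le 1 ]; then
        echo 1              # a single-core machine still needs one worker
    else
        echo $((n - 1))
    fi
}

# Number of online CPUs as reported by the system.
cpus="$(getconf _NPROCESSORS_ONLN)"
echo "CPUs: $cpus, at most $(suggest_workers "$cpus") workers"
```

The public server setups listed above stay at or below this bound (3 workers on 4 cores, 6 workers on 8 virtual CPUs).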

## Storage

### Full upload of sequences

As with many high-throughput sequencing pipelines, **disk storage for input data (`.fastq`, `.fasta`, `.fastq.gz` or `.fasta.gz`)
is now the main constraint** in our environment.

Depending on the sequencer, files can weigh several GB.
Depending on the number of users, a full installation's total storage should thus be several hundred GB, or even several TB
(as of the end of 2018, 4 TB for the public server).
We recommend a RAID setup of at least 2x2TB to allow for user files and at least one backup.

User files (results, annotations) as well as the metadata database are much smaller
(as of the end of 2016, on the public server, 3 GB for all user files of 40+ users).
Note that even when the input sequences are deleted, the server is still able to display the results of previous analyses.

### Remote access on a mounted filesystem

Moreover, it is possible to access `.fastq` files on a mounted filesystem.
See `FILE_SOURCE` below.

## Authentication

Accounts are currently local to the Vidjil server.
We intend to implement some LDAP access at some point in 2020.

## Network

Once installed, the server can run on a private network.
However, the following network accesses are recommended:

  - outbound access
      - for users: several features using external platforms (IgBlast, IMGT/V-QUEST…)
      - for server maintainers: upgrades and reports to a monitor server
  - inbound access
      - through the VidjilNet consortium (<http://www.vidjil.net>),
        the team in Lille may help local server maintainers with some monitoring, maintenance and upgrade tasks,
        provided SSH access can be arranged, possibly over VPN.


# Docker -- Installation

All our images are hosted on DockerHub in the [vidjil/](https://hub.docker.com/r/vidjil) repositories.
The latest images are tagged `vidjil/server:latest` and `vidjil/client:latest`.

Individual services are started by docker-compose (<https://docs.docker.com/compose/>).


## Before installation

Install `docker-compose`. See <https://docs.docker.com/compose/install/#install-compose>

If it doesn't exist yet, you should create a `docker` group.
The users needing to access `docker` must belong to this group.

Install `git`.
Clone the [Vidjil git](https://gitlab.inria.fr/vidjil/vidjil) with `git clone https://gitlab.inria.fr/vidjil/vidjil.git`,
and go to the directory [vidjil/docker](https://gitlab.inria.fr/vidjil/vidjil/tree/dev/docker).
This directory contains [docker-compose.yml](http://gitlab.vidjil.org/blob/dev/docker/docker-compose.yml) as well as configuration files.

## Docker environment

The Vidjil Docker environment is managed by `docker-compose`, which launches the following services:

From image `vidjil/client`:

  - `nginx` The web server, containing the client web application

From image `vidjil/server`:

  - `mysql` The database
  - `uwsgi` The Web2py backend server
  - `workers` The Web2py Scheduler workers in charge of executing vidjil users' samples
  - `fuse` The XmlRPCServer that handles queries for comparing samples
  - `backup` Starts a cron job to schedule regular backups
  - `reporter` A monitoring utility that can be configured to send monitoring information to a remote server
  - `postfix` A mail relay to allow `uwsgi` to send error notifications



## Network usage and SSL certificates

*If you are simply using Vidjil from your computer for testing purposes you can skip the next two steps.*

  - Step 1: Change the hostname in the nginx configuration `vidjil-client/conf/nginx_web2py`,
    replacing `$hostname` with your FQDN.
  - Step 2: Edit `vidjil-client/conf/conf.js`,
    changing all occurrences of `localhost` to the FQDN.

*You will need the following step whether you are using Vidjil locally or not.*

Vidjil uses HTTPS by default, and will therefore require SSL certificates.
You can achieve this with the following steps:

  - Configure the SSL certificates
     - A fast option is to create a self-signed SSL certificate.
       Note that it will trigger security warnings when accessing the client.
       From the `docker/` directory:
       ```
       openssl genrsa 4096 > web2py.key
       openssl req -new -x509 -nodes -sha1 -days 1780 -key web2py.key > web2py.crt
       openssl x509 -noout -fingerprint -text < web2py.crt
       mkdir -p vidjil-client/ssl
       mv web2py.* vidjil-client/ssl/
       ```

     - If you are using the `postfix` container you may want to generate certificates (using the same process) and place them in `postfix/ssl`.
       The certificates must bear the name of your mail domain (`<maildomain>.crt` and `<maildomain>.key`).

  - A better option is to use other certificates, for example by configuring free [Let's Encrypt](https://letsencrypt.org/) certificates.
    One solution is to use `certbot` on the host to generate the certificates and to copy them to the right directory so that the container
    can access them.
    See [Nginx and Let’s Encrypt with Docker](https://medium.com/@pentacent/nginx-and-lets-encrypt-with-docker-in-less-than-5-minutes-b4b8a60d3a71).
    To check the integrity of the host, `certbot` needs to set up a challenge.
    Thus, Nginx needs to provide specific files that are generated by `certbot`.
    To do so, you should tell `certbot` to put those files in the `/opt/vidjil/certs`
    directory (this can be changed in the `docker-compose.yml` file).
    You can generate the certificates with the command `certbot certonly --webroot -w /opt/vidjil/certs -d myvidjil.org`.
    You'll need to update the Nginx configuration in `docker/vidjil-client/conf/nginx_web2py`.
    Then copy the certificates (replacing `myvidjil.org` with your own domain):
    ```shell
    cp /etc/letsencrypt/live/myvidjil.org/fullchain.pem vidjil-client/ssl/web2py.crt
    cp /etc/letsencrypt/live/myvidjil.org/privkey.pem vidjil-client/ssl/web2py.key
    ```
    The certificates can be renewed with `certbot renew`, but remember to copy the certificates again afterwards.
    Instead of copying the certificates, you may wish to mount `/etc/letsencrypt` in the Docker image as a volume (*e.g.* `/etc/letsencrypt:/etc/nginx/ssl`).
    However beware, because you will not be able to start Nginx until the certificates are in place.
    On certificate renewal (with `certbot`), you then need to restart the Nginx server.
    
If necessary, in `docker-compose.yml`, update `nginx.volumes`, line `./vidjil-client/ssl:/etc/nginx/ssl`, to set the directory with the certificates.
The same can be done for the `postfix` container.
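
Since the copy has to be repeated after every Let's Encrypt renewal, it can be scripted. The following is a sketch only: `install_cert` is a hypothetical helper that is not shipped with Vidjil, and the target file names are the `web2py.crt` and `web2py.key` names used above. Calling it from a `certbot renew --deploy-hook` script is one possible way to trigger it, not something the Vidjil images set up for you.

```sh
#!/bin/sh
# Hypothetical helper (not shipped with Vidjil): copy a Let's Encrypt
# certificate and key to the file names expected by the nginx container.
#   $1: certbot "live" directory, e.g. /etc/letsencrypt/live/<your domain>
#   $2: directory mounted as /etc/nginx/ssl, e.g. vidjil-client/ssl
install_cert() {
    live_dir="$1"
    ssl_dir="$2"
    mkdir -p "$ssl_dir" || return 1
    cp "$live_dir/fullchain.pem" "$ssl_dir/web2py.crt" || return 1
    cp "$live_dir/privkey.pem" "$ssl_dir/web2py.key" || return 1
}

# Possible use after a renewal, from the docker/ directory:
#   install_cert /etc/letsencrypt/live/<your domain> vidjil-client/ssl
#   docker-compose restart nginx
```

The function stops at the first failing step, so a missing certificate leaves the files already in place untouched.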


If you would prefer to use Vidjil over HTTP (not recommended outside of testing purposes), you can
use the provided configuration files in `docker/vidjil-server/conf` and `docker/vidjil-client/conf`. You will find several files
that contain "http" in their name. Simply replace the existing config files with their HTTP counterparts (for safety reasons, don't
forget to make a backup of any file you replace).
 
## First configuration and first launch

  - Set the SSL certificates (see above)
  - Change the mysql root password and the web2py admin password in `docker-compose.yml`
  - Change the mysql vidjil password in `mysql/create_db.sql` and set it also in `DB_ADDRESS` in `vidjil-server/conf/defs.py`
  - Set the desired mail domain and credentials for the `postfix` container and update `vidjil-server/conf/defs.py`
    `SMTP_CREDENTIALS` and `FROM_EMAIL` to match

  - Comment out the `reporter` service in `docker-compose.yml`

  - It is advised to first launch with `docker-compose up mysql`.
    The first time, this container creates the database and it takes some time.

  - When `mysql` is launched,
    you can safely launch `docker-compose up`.
    Then `docker ps` should display five running containers:
    `docker_nginx_1`, `docker_uwsgi_1`, `docker_workers_1`, `docker_fuse_1`, `docker_mysql_1`


  - Vidjil also needs germline files.
      - You can use IMGT germline files if you accept the IMGT licence.
        For this, from the `vidjil` directory (root of the git repository),
        run `make germline` to create `germline/` while checking the licence.
      - These germlines are included in the server container with a volume in the fuse block
        in your `docker-compose.yml`: `../germline:/usr/share/vidjil/germline`.
      - Also copy the generated `browser/js/germline.js` into the `docker/vidjil-client/conf/` directory.


  - Open a web browser to `https://localhost`, or to your FQDN if you configured it (see above).
    Click on `init database` and create a first account by entering an email.
    This account is the main root account of the server. Other administrators can then be created.
    It will also be the web2py admin password.
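
The two-step launch described in this list (database first, then the remaining services) can be scripted. This is a sketch under assumptions: `wait_until` is a hypothetical helper, and using `mysqladmin ping` inside the `mysql` container as a readiness check is our assumption, not a documented Vidjil procedure. It may report the database as ready slightly early during the very first initialization.

```sh
#!/bin/sh
# Hypothetical helper: wait_until TRIES DELAY CMD...
# Run CMD until it succeeds, at most TRIES times, sleeping DELAY seconds
# between attempts. Returns 0 on success, 1 if CMD never succeeded.
wait_until() {
    tries="$1"
    delay="$2"
    shift 2
    i=0
    while [ "$i" -lt "$tries" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Possible use, from the docker/ directory (readiness check is an assumption):
#   docker-compose up -d mysql
#   wait_until 60 5 docker-compose exec -T mysql mysqladmin ping --silent \
#       && docker-compose up -d
```

If the check never succeeds, the remaining services are not started, and `docker-compose logs mysql` shows what the database container is doing.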

*Notice*: by default, the Nginx HTTP server listens for incoming connections and binds to port 80 on the host. If port 80 is already in use, you will encounter the following error message:
```
ERROR: for nginx
Cannot start service nginx: driver failed programming external
connectivity on endpoint docker_nginx_1
(236d0696ed5077c002718541a9703adeee0dfac66fb880d193690de6fa5c462e):
Error starting userland proxy: listen tcp 0.0.0.0:80: bind: address already in use
```

You can resolve it either by changing the port used by Vidjil in the `nginx.ports`
section of the `docker-compose.yml` file or by stopping the service using port
80.

  

## Further configuration

The following configuration files are found in the `vidjil/docker` directory:

  - `conf/conf.js` various variables for the vidjil browser
  - `conf/defs.py` various variables for the vidjil server
  - `conf/gzip.conf` configuration for gzip in nginx
  - `conf/gzip_static.conf` same as the previous but for static resources
  - `conf/uwsgi.ini` configuration required to run vidjil with uwsgi
  - `sites/nginx` configuration required when running vidjil with nginx
  - `scripts/nginx-entrypoint.sh` entrypoint for the nginx
    service (not currently in use)
  - `scripts/uwsgi-entrypoint.sh` entrypoint for the uwsgi
    service. Ensures the owners of some relevant volumes are correct within
    the container and starts uwsgi

Here are some notable configuration changes you should consider:

  - mysql root password (`mysql.environment` in `docker-compose.yml`), mysql vidjil password (`docker-compose.yml` and `vidjil-server/conf/defs.py`),
    as mentioned above

  - Change the `FROM_EMAIL` and `ADMIN_EMAILS` variables in `vidjil-server/conf/defs.py`.
    They are used for admin emails monitoring the server and reporting errors.

  - <a name='healthcare'></a>
    If, according to your local regulations, the server is suitable for hosting clinical data,
    you may update the `HEALTHCARE_COMPLIANCE` variable to remove warnings related to non-healthcare compliance.
    Updating this variable is the sole responsibility of the institution responsible for the server,
    and should be done in accordance with the regulations that apply in your country.
    See also the [hosting options](healthcare.md) offered by the VidjilNet consortium.

  - To allow users to select files from a mounted volume,
    set `FILE_SOURCE` and `FILE_TYPES` in `vidjil-server/conf/defs.py`.
    In this case, the `DIR_SEQUENCES` directory will be populated with links to the selected files.
    Users will still be allowed to upload their own files.

  - By default all files that
    require saving outside of the containers (the database, uploads, vidjil
    results and log files) are stored in `/opt/vidjil`.
    This can be changed by editing the paths in the `volumes` in `docker-compose.yml`.
    See also <a href="#storage">Requirements / Storage</a> above.

  - Configure the reporter. Ideally this container should be positioned
    on a remote server in order to be able to report on a down server,
    but we have packed it here for convenience.
    You will also
    need to change the `DB_ADDRESS` in `conf/defs.py` to match it.



# Docker -- Adding external software

Some software can be added to Vidjil for pre-processing or even processing if the
software outputs data compatible with the `.vidjil` or AIRR format.
We recommend you add software by adding a volume to your `docker-compose.yml`.
By default we add our external files to `/opt/vidjil` on the host machine. You can then
reference the executable in `vidjil-server/conf/defs.py`.

When the software has compatible inputs and outputs, it will be enough
to then configure the appropriate `pre process` or `analysis config` (to be documented).
In some cases, using the software may require development such as wrappers.
Contact us (<mailto:contact@vidjil.org>) for more information and help.

# Docker -- Troubleshooting

## Error "Can't connect to MySQL server on 'mysql'"

The mysql container is not fully launched. This can happen especially at the first launch.
You may relaunch the containers.

If restarting the containers does not resolve the issue, there are a couple of things
you can look into:

 - Ensure the database password in `vidjil-server/conf/defs.py` matches the password for
   the mysql user: `vidjil`.
   If you are not sure, you can check with the following:
   ```sh
   docker exec -it docker_mysql_1 bash
   mysql -u vidjil -p vidjil
   ```
   or reset it:
   ```sh
   docker exec -it docker_mysql_1 bash
   mysql -u root -p
   # then, at the mysql prompt:
   SET PASSWORD FOR vidjil = PASSWORD('<new password>');
   ```

 - Ensure the database was created correctly. This should have been done automatically,
   but just in case, you can check the console output, or check the database:
   ```sh
   docker exec -it docker_mysql_1 bash
   mysql -u vidjil -p vidjil
   ```
   If the database does not exist, mysql will display an error after logging in.

## Launching manually the backup

Backups should be handled by the `backup` container, see [*Making backups* below](#makingbackups). Otherwise, you can run the `backup.sh` script manually by connecting to the `backup` or `uwsgi` container. The following command makes a full backup; add the `-i` option when
running `backup.sh` if you do not want a full backup:

```sh
cd /usr/share/vidjil/server
sh backup.sh vidjil /mnt/backup >> /var/log/cron.log 2>&1
```

## I can't connect to the web2py administration site

The URL to this site is `https://mywebsite/admin/default/`.
The password should be given in the `docker-compose.yml` file.
Otherwise a random password is generated. You can still modify
this password by connecting to the server (in the `uwsgi` container).
Go to the `/usr/share/vidjil/server/web2py` directory and then
launch Python.
```python
from gluon.main import save_password
save_password(PASSWORD, 443)
```
This password will not persist when the container is restarted.
For a persistent password, please use the environment variable.

# Docker -- Updating a Docker installation

## Before the update

We post news on image updates at <http://gitlab.vidjil.org/tree/dev/docker/CHANGELOG>.
Check there whether the new images require any configuration change.

For safety, please always make a backup (see "Backups", below) before starting this process.
It is especially important to back up the database, as the update process may transform it.

## Pulling the new images

``` bash
docker pull vidjil/server:latest
docker pull vidjil/client:latest
```

This will pull the latest version of the images.
More tags are available at <https://hub.docker.com/r/vidjil/server/tags/>.

If you do not have access to `hub.docker.com` on your server, then you
should pull the images on a machine that does and extract them to a file,
send this file to your server with your favourite method, and finally import
the images on the server.

Extract:
``` sh
docker save -o <output_file> vidjil/server[:<version>] vidjil/client[:<version>]
```

Import:
```sh
docker load -i <input_file>
```
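
The extract and import steps can be wrapped in two small functions. This is a sketch only: `run`, `export_images` and `import_images` are hypothetical names, and the `DRY_RUN` variable is an illustration device that prints the `docker` commands instead of running them.

```sh
#!/bin/sh
# Print the command when DRY_RUN is set, run it otherwise.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# On a machine with access to hub.docker.com:
#   export_images <archive> [<version>]   (version defaults to "latest")
export_images() {
    archive="$1"
    version="${2:-latest}"
    run docker pull "vidjil/server:$version" &&
    run docker pull "vidjil/client:$version" &&
    run docker save -o "$archive" "vidjil/server:$version" "vidjil/client:$version"
}

# On the Vidjil server, once the archive has been transferred:
#   import_images <archive>
import_images() {
    run docker load -i "$1"
}

# Preview the commands without touching Docker:
DRY_RUN=1 export_images vidjil-images.tar
DRY_RUN=1 import_images vidjil-images.tar
```

Saving both images into a single archive keeps the server and client versions together during the transfer.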

## Launch the new containers

In some cases, you may need to update your `docker-compose.yml` file or some
of the configuration files. We will describe the changes in the `CHANGELOG` file.
The latest versions of these files are available on our
[Gitlab](http://gitlab.vidjil.org/).

Once the images are pulled, you can relaunch the containers:
```sh
docker-compose down
docker-compose up
```

By default, all previous volumes will be reused and no data will be lost.
If the database schema was updated, web2py will update it on your database.
Check that the containers run well, and that you can still log in to Vidjil,
access the database, and see a result from a sample.

If something is not working properly, you still have the option to roll back
to the previous images (for example by tagging a previous image as `latest`),
and possibly to restore your last database backup if something went wrong.

### Launching a single container

When an update occurs on a single container, one may not want to relaunch all
the containers, to save time. With `docker-compose` it is possible to do so.

Stop the desired container (for instance the client):
```
docker-compose stop nginx
```

Then launch it again

```
docker-compose up -d nginx
```

## Knowing what docker image version is running

As our latest image is always tagged `latest`, you may have trouble knowing
which version is currently running on your server. To determine that, you can
use the *digest* of the image. You can view it, for example, with
`docker images --digests vidjil/server`. Then you can compare it with the digests shown [on
the Dockerhub page](https://hub.docker.com/r/vidjil/server/tags/).

# Plain server installation

This installation is not supported anymore.
We advise using the Docker containers instead (see above).

## Requirements (for Ubuntu 16.04)

``` bash
apt-get install git
apt-get install g++
apt-get install make
apt-get install unzip
apt-get install python-dev python-pip
apt-get install libyajl2 libyajl-dev
pip install unittest2
pip install unittest-xml-reporting
pip install enum34
pip install ijson cffi
```

## Server installation and initialization

Go to the `server/` directory.

If you just want to do some tests without installing a real web server,
then launch `make install_web2py_standalone`. Otherwise, launch
`make install_web2py`.


## Detailed manual server installation and browser linking

Requirements:
ssh, zip, unzip, tar, openssh-server, build-essential, python, python-dev,
mysql, python2.5-psycopg2, postfix, wget, python-matplotlib, python-reportlab,
python-enum34, mercurial, git

If you want to run Vidjil with an Apache webserver you will also need:
apache2, libapache2-mod-wsgi

Or if you want to use Nginx:
nginx-full, fcgiwrap

For simplicity this guide will assume you are installing to `/home/www-data`

Clone <https://github.com/vidjil/vidjil.git>

Download and unzip web2py. Copy the contents of web2py to the server/web2py
folder of your Vidjil installation
(in this case /home/www-data/vidjil/server/web2py) and give ownership to www-data:

``` bash
chown -R www-data:www-data /home/www-data/vidjil
```

If you are using apache, you can run the following commands to make sure all the apache modules you need
are activated:

``` bash
a2enmod ssl
a2enmod proxy
a2enmod proxy_http
a2enmod headers
a2enmod expires
a2enmod wsgi
a2enmod rewrite  # for 14.04
```

In order to set up SSL encryption, the web server needs a key and a certificate. The safest option
is to get a certificate from a trusted Certificate Authority, but for testing
purposes you can generate your own:

``` bash
mkdir /etc/<webserver>/ssl
openssl genrsa 1024 > /etc/<webserver>/ssl/self_signed.key
chmod 400 /etc/<webserver>/ssl/self_signed.key
# Self-signed certificate, valid for one year
openssl req -new -x509 -nodes -sha1 -days 365 \
    -key /etc/<webserver>/ssl/self_signed.key > /etc/<webserver>/ssl/self_signed.cert
# Human-readable description of the certificate
openssl x509 -noout -fingerprint -text \
    < /etc/<webserver>/ssl/self_signed.cert > /etc/<webserver>/ssl/self_signed.info
```

\<webserver\> should be replaced with the appropriate webserver name
(i.e. apache2 or nginx).

Given that Vidjil is a two-part application, one that serves routes from a server
and one that is served statically, we need to configure Apache accordingly.
Therefore we tell Apache to:

  - Start web2py as a wsgi daemon (allows Apache to serve the application).
  - Reserve two virtual hosts (one to be served with SSL encryption, and one not).
  - Configure the first host to serve static content, to prevent overriding
    by the server (otherwise all routes are redirected through web2py), and to follow symlinks.
    This allows us to symlink to our browser app in the /var/www directory and keep both parts
    of Vidjil together.
  - Configure the second host to use SSL encryption, and only serve very specific folders statically (such
    as javascript files and images because we don't want to create a controller to serve that kind of data).

You can replace your Apache default config with the following
(/etc/apache2/sites-available/default.conf - remember to make a backup just in case):

``` example
WSGIDaemonProcess web2py user=www-data group=www-data processes=1 threads=1

<VirtualHost *:80>

  DocumentRoot /var/www
  <Directory />
    Options FollowSymLinks
    AllowOverride None
  </Directory>

  <Directory /var/www/>
    Options Indexes FollowSymLinks MultiViews
    AllowOverride all
    Order allow,deny
    allow from all
  </Directory>

  ScriptAlias /cgi/ /usr/lib/cgi-bin/

  <Directory /usr/lib/cgi-bin/>
    Options Indexes FollowSymLinks
    Options +ExecCGI
    #AllowOverride None
    Require all granted
    AddHandler cgi-script cgi pl
  </Directory>

  <Directory /home/www-data/vidjil/browser>
    AllowOverride None
  </Directory>

  CustomLog /var/log/apache2/access.log common
  ErrorLog /var/log/apache2/error.log
</VirtualHost>


<VirtualHost *:443>
  SSLEngine on
  SSLCertificateFile /etc/apache2/ssl/self_signed.cert
  SSLCertificateKeyFile /etc/apache2/ssl/self_signed.key

  WSGIProcessGroup web2py
  WSGIScriptAlias / /home/www-data/vidjil/server/web2py/wsgihandler.py
  WSGIPassAuthorization On

  <Directory /home/www-data/vidjil/server/web2py>
    AllowOverride None
    Require all denied
    <Files wsgihandler.py>
      Require all granted
    </Files>
  </Directory>

  AliasMatch ^/([^/]+)/static/(?:_[\d]+.[\d]+.[\d]+/)?(.*) \
        /home/www-data/vidjil/server/web2py/applications/$1/static/$2

  <Directory /home/www-data/vidjil/server/web2py/applications/*/static/>
    Options -Indexes
    ExpiresActive On
    ExpiresDefault "access plus 1 hour"
    Require all granted
  </Directory>

  CustomLog /var/log/apache2/ssl-access.log common
  ErrorLog /var/log/apache2/error.log
</VirtualHost>
```

Now we want to activate some more apache mods:

``` bash
a2ensite default                   # FOR 14.04
a2enmod cgi
```

Restart the server in order to make sure the config is taken into account.

And create some symlinks to avoid splitting our app:

``` bash
ln -s /home/www-data/vidjil/browser /var/www/browser
ln -s /home/www-data/vidjil/browser/cgi/align.cgi /usr/lib/cgi-bin
ln -s /home/www-data/vidjil/germline /var/www/germline
ln -s /home/www-data/vidjil/data /var/www/data
```

If you are using Nginx, the configuration is the following:

``` example
server {
    listen 80;
    server_name $hostname;
    return 301 https://$hostname$request_uri;

}
server {
        listen 443 default_server ssl;
        server_name     $hostname;
        ssl_certificate         /etc/nginx/ssl/web2py.crt;
        ssl_certificate_key     /etc/nginx/ssl/web2py.key;
        ssl_prefer_server_ciphers on;
        ssl_session_cache shared:SSL:10m;
        ssl_session_timeout 10m;
        ssl_ciphers ECDHE-RSA-AES256-SHA:DHE-RSA-AES256-SHA:DHE-DSS-AES256-SHA:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA;
        ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
        keepalive_timeout    70;
        location / {
            #uwsgi_pass      127.0.0.1:9001;
            uwsgi_pass      unix:///tmp/web2py.socket;
            include         uwsgi_params;
            uwsgi_param     UWSGI_SCHEME $scheme;
            uwsgi_param     SERVER_SOFTWARE    nginx/$nginx_version;
            ###remove the comments to turn on if you want gzip compression of your pages
            # include /etc/nginx/conf.d/web2py/gzip.conf;
            ### end gzip section

            proxy_read_timeout 600;
            client_max_body_size 20G;
            ###
        }
        ## if you serve static files through https, copy here the section
        ## from the previous server instance to manage static files

        location /browser {
            root /home/www-data/vidjil/;
            expires 1h;

            error_page 405 = $uri;
        }

        location /germline {
            root $CWD/../;
            expires 1h;

            error_page 405 = $uri;
        }

        ###to enable correct use of response.static_version
        #location ~* ^/(\w+)/static(?:/_[\d]+\.[\d]+\.[\d]+)?/(.*)$ {
        #    alias /home/www-data/vidjil/server/web2py/applications/$1/static/$2;
        #    expires max;
        #}
        ###

        location ~* ^/(\w+)/static/ {
            root /home/www-data/vidjil/server/web2py/applications/;
            expires max;
            ### if you want to use pre-gzipped static files (recommended)
            ### check scripts/zip_static_files.py and remove the comments
            # include /etc/nginx/conf.d/web2py/gzip_static.conf;
            ###
        }

        client_max_body_size 20G;

        location /cgi/ {
            gzip off;
            root  /home/www-data/vidjil/browser/;
            # Fastcgi socket
            fastcgi_pass  unix:/var/run/fcgiwrap.socket;
            # Fastcgi parameters, include the standard ones
            include /etc/nginx/fastcgi_params;
            # Adjust non standard parameters (SCRIPT_FILENAME)
            fastcgi_param SCRIPT_FILENAME  \$document_root\$fastcgi_script_name;
        }

}
```

We also do not create symlinks since all references are managed
correctly.

Now we need to configure the database connection parameters:

  - create a file called conf.js in /home/www-data/vidjil/browser/js containing:
    
    ``` example
    var config = {
        /*cgi*/
        "cgi_address" : "default",
    
        /*database */
        "use_database" : true,
        "db_address" : "default",
    
        "debug_mode" : false
    }
    ```

This tells the browser to access the server on the current domain.
You may also add a variable called `server_id` in order to name different
instances and environments; it will be displayed in the top menu.
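
As an illustration, the file can be generated rather than written by hand. The sketch below is only an assumption about layout: it supposes that `server_id` is a key of the same `config` object as the other settings, and the helper name `render_conf_js` is hypothetical.

```python
import json


def render_conf_js(server_id=None):
    """Return the text of a conf.js file for the Vidjil browser."""
    config = {
        "cgi_address": "default",
        "use_database": True,
        "db_address": "default",
        "debug_mode": False,
    }
    if server_id is not None:
        # Assumption: server_id sits next to the other keys of `config`.
        config["server_id"] = server_id
    # A JSON object literal is also a valid JavaScript object literal.
    return "var config = " + json.dumps(config, indent=4) + "\n"
```

Writing the result of `render_conf_js("preprod")` to `/home/www-data/vidjil/browser/js/conf.js` would then name that instance `preprod`.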

  - copy `vidjil/server/web2py/applications/vidjil/modules/defs.py.sample`
    to `vidjil/server/web2py/applications/vidjil/modules/defs.py`
    and change the value of `DB_ADDRESS` to reference your database.
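
The web2py database layer takes connection strings of the form `engine://user:password@host/database`. A minimal sketch of such a value is given here; the helper `build_db_address` and all credentials are hypothetical, and the other settings found in `defs.py` are not shown.

```python
def build_db_address(user, password, host, database, engine="mysql"):
    """Build a web2py DAL connection string such as mysql://user:pw@host/db."""
    return f"{engine}://{user}:{password}@{host}/{database}"


# Hypothetical values: replace them with the credentials of your own database.
DB_ADDRESS = build_db_address("vidjil", "change-me", "localhost", "vidjil")
```

In practice the string can simply be written literally in `defs.py`; the helper only makes the four components explicit.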

You can now access your app.
All that is left to do is to click on the *init database* link above the login page.
This creates a default admin user (`plop@plop.com`, password `1234`; make sure to
remove this user in your production environment) and the default configurations
for files and results.

# Testing the server

If you develop on the server, or just want to check if everything is ok, you
should launch the server tests.

First, you should have a working fuse server by launching
`make launch_fuse_server` (launch it only once: it then runs in the
background and can be killed with `make kill_fuse_server`).

Then you can launch the tests with `make unit`.

# Troubleshooting

## Web2py runs but does not allow any connection

Check whether the relevant disks are properly mounted.
Disk failures or other events could have caused a partition to be remounted read-only.
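
The read-only check can be scripted. This is a minimal sketch for Linux/Unix systems, relying on `os.statvfs`; the function names are hypothetical and the directories to check depend on your installation.

```python
import os


def is_read_only(path):
    """Return True if the filesystem holding `path` is mounted read-only."""
    flags = os.statvfs(path).f_flag
    return bool(flags & os.ST_RDONLY)


def read_only_paths(paths):
    """Return the subset of `paths` that sit on a read-only filesystem."""
    return [path for path in paths if is_read_only(path)]
```

Calling `read_only_paths` with the upload and result directories of the server returns an empty list when all of them are writable mounts.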

## Jobs stay in `QUEUED`, workers seem to be stuck

For reasons that are not clear yet, workers are sometimes not assigned any
additional jobs even though they have no ongoing jobs.

In such a (rare) case, it may be useful to restart the workers by clicking on
the *reset workers* link in the Vidjil administration interface. Restarting
workers won't be performed if jobs are currently running or assigned.


## Debugging Web2py workers

One can launch the workers by hand (see the `/etc/init` script) and add a
`-D 0` option. It prints debugging information on what the workers are doing.

The most useful information comes from the TICKER worker: the one that
assigns jobs to workers. It is therefore better to first kill all the workers and
then launch one by hand, to be sure that it will be the ticker.

## Restarting web2py

Just touch the file `/etc/uwsgi/web2py.ini`.

Another way of restarting it is to touch the file
`server/web2py/applications/vidjil/modules/defs.py`.
This will tell `uwsgi` to restart web2py (including the workers).
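
From a maintenance script, the same effect as `touch` can be obtained as follows. This is only a sketch: the function name is hypothetical, and it refuses to create a missing file so that a mistyped path does not leave an empty file behind.

```python
from pathlib import Path


def touch_to_reload(path):
    """Update the modification time of an existing file, as `touch` would."""
    target = Path(path)
    if not target.is_file():
        raise FileNotFoundError(path)
    # exist_ok=True keeps the content and only bumps the modification time.
    target.touch(exist_ok=True)
    return target.stat().st_mtime
```

For instance, `touch_to_reload("/etc/uwsgi/web2py.ini")` updates the timestamp of the file mentioned above.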

## Restarting uwsgi

When one modifies a uwsgi config file (usually in the `/etc/uwsgi` directory), it
may be necessary to restart uwsgi so that the modifications are taken into
account. This can be done using

``` bash
initctl restart uwsgi-emperor
```

## Logging database queries

### MySQL

See this [insightful Stack Overflow post](https://stackoverflow.com/questions/650238/how-to-show-the-last-queries-executed-on-mysql).
To summarize, this can either be done at runtime:

``` sql
SET GLOBAL log_output = "FILE";
SET GLOBAL general_log_file = "/path/to/your/logfile.log";
SET GLOBAL general_log = 'ON';
```

Or directly in the configuration file (less recommended):

``` conf
general_log_file        = /var/log/mysql/mysql.log
general_log             = 1
```

In that case the server must be restarted afterwards.

# Running the server in a production environment

## Introduction

When manipulating a production environment, it is important to take certain
precautionary measures, to ensure that production can be rolled
back to a previous version and that any incurred loss of data can be
recovered.

Web2py and Vidjil are no exception to this rule.

## <a name="makingbackups"></a> Making backups

The top priority is to back up *files created during the analysis*
(either by software or by a human).
Should this data be lost, valuable man-hours would be lost.
To prevent this, we make incremental backups of the data stored on the
public Vidjil servers several times a day.

This does not apply to uploaded files. We inform users that they should
keep a backup of their original sequence files.

To ease the backup, the `backup.sh` script provides an example.
It can be used through the backup container, for which you have two configuration files to update.

The `docker/backup/conf/backup.cnf` file gives the authentication information for the database, so that a backup user (only read rights are required) can connect to it.

The backup strategy can then be configured in the `docker/backup/conf/backup-cron` file. The cron file states how often the backup script will be called. There are three options: backing up all results/analyses since yesterday, since the start of the month, or since forever. On top of that, the database is exported in two formats (CSV and SQL).
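
The three backup windows (since yesterday, since the start of the month, since forever) can be pictured as a selection on file modification times. The sketch below is not the logic of `backup.sh`: the function names, the period keywords and the use of modification times are assumptions made for illustration.

```python
import os
from datetime import datetime, timedelta


def cutoff_for(period, now=None):
    """Translate 'day', 'month' or 'all' into the oldest date to back up."""
    now = now or datetime.now()
    if period == "day":
        return now - timedelta(days=1)
    if period == "month":
        return now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if period == "all":
        return datetime.min
    raise ValueError(f"unknown period: {period}")


def files_modified_since(root, cutoff):
    """Return the sorted paths under `root` modified at or after `cutoff`."""
    selected = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            modified = datetime.fromtimestamp(os.path.getmtime(path))
            if modified >= cutoff:
                selected.append(path)
    return sorted(selected)
```

A daily cron entry would use the `day` window, while a less frequent one could use `month` or `all` to bound how much data a single lost backup can cost.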

## Autodelete and Permissions

Web2py has a handy feature called `AutoDelete` which allows the administrator
to state that file reference deletions should be cascaded if no other
references to the file exist.
When deploying to production, one needs to make sure `AutoDelete` is
deactivated.
This is the case for the default Vidjil installation (see `server/web2py/applications/vidjil/models/db.py`).

As a second precaution it is also wise to temporarily restrict web2py's
access to referenced files.

Taking two measures to prevent file loss might seem like overkill, but
securing data is more important than the small amount of extra time spent
putting these measures into place.

# Plain server installation -- updating the server

**(information to be updated)**

Currently deploying changes to production is analogous to merging into the
rbx branch and pulling from the server.

Once this has been done, it is important to check that any database migrations
have been applied.
This can be verified by refreshing the server (calling a controller) and
then looking at the database.

## Step by Step

  - Check permissions on the uploads folder (set to 100)
      - you can also check the amount of files present at this point for future
        reference
  - Backup database: Archive old backup.csv and then from admin page: backup
    db
  - pull rbx (if already merged dev)
  - Check the database (for missing data or to ensure migrations have been
    applied)
  - Check files to ensure no files are missing
  - Reset the folder permissions on uploads (755 seems to be the minimum
    requirement for web2py)
  - Run unit tests (Simply a precaution: Continuous Integration renders this
    step redundant but it's better to be sure)
  - Check site functionality
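
The first and sixth steps (restricting the permissions on the uploads folder, then resetting them) can be wrapped so that the previous permissions are restored even if a step in between fails. This is a sketch under assumptions: it reads the value 100 as an octal mode, and the name `restricted_permissions` is hypothetical.

```python
import os
import stat
from contextlib import contextmanager


@contextmanager
def restricted_permissions(path, mode=0o100):
    """Temporarily set `path` to `mode`, restoring the previous mode on exit."""
    previous = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode)
    try:
        # The caller receives the mode that will be restored afterwards.
        yield previous
    finally:
        os.chmod(path, previous)
```

The backup, pull and database checks would then run inside a `with restricted_permissions(uploads_dir):` block, where `uploads_dir` stands for the uploads folder of your installation.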

# Resetting user passwords

Currently there is no easy way of resetting a user's password.
The current method is the following:
```bash
cd server/web2py
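python web2py.py -S vidjil -M
# Then type, in the web2py shell that opens:
#   db.auth_user[<user-id>].update_record(password=CRYPT(key=auth.settings.hmac_key)('<password>')[0],reset_password_key='')
#   db.commit()   # the web2py shell does not commit automatically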
```

# Migrating Data


Usually, when extracting data for a given user or group, the whole database should not be
copied over.
The `migrator` script allows the selective export and import of data,
whether it be a single patient/run/set or a list of them, or even all the sample sets
associated with a group (or with a user).
The script takes care of the database as well as of the results and analysis files (see below for sequence files).

See `server/web2py/applications/vidjil/scripts/migrator.py --help`.

## Exporting an archive

(to be detailed)

## Importing an archive

### Step 1 : extract the archive on your server

The export directory must be on your server and accessible from your vidjil-server docker container.
You can define a new shared volume, or simply put the export directory in an already accessible location such as `[DOCKER DIRECTORY]/vidjil-server/conf/export/`.

### Step 2 : prepare the group that will own the data

The permissions on a Vidjil server are *group* based. Users and groups may differ from one server to another. Before importing data on a server, one must have a group ready to receive the permissions to manage the imported files.

From the admin web interface, you can create a new group ("groups" -> "+new group" -> "add group"). The group ID is displayed in parentheses next to its name on the group page; you will need it later. If you create such a group on a blank Vidjil server, its ID is *4*.

### Step 3 : prepare your server analysis configs

*This step may require bioinformatics support depending on your data, the configs previously used, and the ones you intend to use on your new installation. We can offer support via the [VidjilNet consortium](http://www.vidjil.net) to help set this up.*

Vidjil analysis configs should not be directly transferred between servers. Indeed, they depend on the setup of each server (software, paths...) and can collide with existing configs on your installation. Before importing, you thus need to create the missing analysis configs on your server and edit the `config.json` file provided in the export folder.

This `config.json` file initially contains a list of the analysis configs from the original public server, such as:

```
  "2": {
      "description": [
        "IGH",
        "vidjil",
        "-c clones -3 -z 100 -r 1 -g germline/homo-sapiens.g:IGH,IGK,IGL,TRA,TRB,TRG,TRD -e 1 -w 50 -d -y all",
        "-t 100 -d lenSeqAverage",
        "multi-locus"
      ],
      "link_local": 6
  },
```

- `"2"`           :  the original config ID on the server from which the data was exported
- `"description"` :  the original config parameters (only for information, they are ignored during the import)
- `"link_local"`  :  the config ID that will be used on the new server

In the `config.json` file, you have to replace all `link_local` values with the corresponding config ID
of a similar config on your server (if you don't have a similar one, you should create one).

If much of your imported data was on old configs that you do not intend to run anymore,
one solution is to create a generic `legacy` config for this old data.

Below is an example of such a `config.json`, linking actual configs of the public <https://app.vidjil.org> server to configs of a newly installed server.
This should be completed by a mapping of the other configs that were used in the migrated data.

```
{
  "2": {
    "description": [ "IGH", "vidjil",  "-c clones -3 -z 100 -r 1 -g germline/homo-sapiens.g:IGH,IGK,IGL,TRA,TRB,TRG,TRD -e 1 -w 50 -d -y all", "-t 100 -d lenSeqAverage",  "multi-locus" ],
    "link_local": 6
  },
  "25": {
    "description": [ "multi+inc+xxx",  "vidjil",  "-c clones -3 -z 100 -r 1 -g germline/homo-sapiens.g -e 1 -2 -d -w 50 -y all",  "-t 100 -d lenSeqAverage",  "default: multi-locus, with some incomplete/unusual/unexpected recombinations"
    ],
    "link_local": 2
  },
  "26": {
    "description": [ "multi+inc", "vidjil", "c clones -3 -z 100 -r 1 -g germline/homo-sapiens.g -e 1 -d -w 50",  "-t 100",  "multi-locus, with some incomplete/unusual recombinations" ],
    "link_local": 3
  },
  "30": {
    "description": [
      "TRG", "vidjil", "-c clones -3 -z 100 -r 1 -g germline/homo-sapiens.g:TRG -y all", "-t 100 -d lenSeqAverage", "TRG, VgJg"
    ],
    "link_local": 5
  },
  "32": {
    "description": [ "multi", "vidjil", "-c clones -3 -z 100 -r 1 -g germline/homo-sapiens.g:IGH,IGK,IGL,TRA,TRB,TRG,TRD -e 1 -w 50 -d -y all", "-t 100 -d lenSeqAverage", "multi-locus" ],
    "link_local": 4
  }
}
```
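
When many configs have to be remapped, editing `link_local` by hand is error-prone. The following sketch applies a mapping from original config IDs to local config IDs and lists the entries left unmapped. It is not part of the `migrator` script; the function names and the mapping argument are hypothetical.

```python
import json


def relink_configs(exported, mapping):
    """Return (relinked copy of `exported`, list of original IDs without mapping)."""
    relinked = {}
    unmapped = []
    for original_id, entry in exported.items():
        entry = dict(entry)  # shallow copy, the input is left untouched
        if original_id in mapping:
            entry["link_local"] = mapping[original_id]
        else:
            unmapped.append(original_id)
        relinked[original_id] = entry
    return relinked, unmapped


def relink_file(path, mapping):
    """Rewrite the JSON file at `path` in place; refuse if an ID is unmapped."""
    with open(path) as handle:
        exported = json.load(handle)
    relinked, unmapped = relink_configs(exported, mapping)
    if unmapped:
        raise ValueError("no local config for: " + ", ".join(unmapped))
    with open(path, "w") as handle:
        json.dump(relinked, handle, indent=2)
```

Failing on unmapped IDs, rather than keeping the exported `link_local` values, avoids silently attaching imported results to an unrelated local config.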



### Step 4 : prepare your server pre-process configs

Proceed as in step 3 for pre-process configs. The file to edit is named `pprocess.json`.


### Step 5 : import

The import takes place inside the vidjil-server container:
```sh
docker exec -it docker_uwsgi_1 bash
cd /usr/share/vidjil/server/web2py/applications/vidjil/scripts/
sh migrator.sh -p [RESULTS DIRECTORY] -s [EXPORT DIRECTORY] import --config [CONFIG.JSON FILE] --pre-process [PPROCESS.JSON FILE] [GROUP ID]
```

- `[RESULTS DIRECTORY]`: the results directory path inside the container; it should be defined in your `docker-compose.yml` and is `/mnt/result/results/` by default
- `[EXPORT DIRECTORY]`: the export directory you installed in step 1; if you set it up in `docker/vidjil-server/conf/export/`, its location inside the container should be `/etc/vidjil/export/`
- `[CONFIG.JSON FILE]`: this file is located in the export folder; you should have edited it during step 3
- `[PPROCESS.JSON FILE]`: this file is located in the export folder; you should have edited it during step 4
- `[GROUP ID]`: the ID of the group you created or selected during step 2

Usually, the command is thus:
```
sh migrator.sh -p /mnt/result/results/ -s /etc/vidjil/export/XXXX/ import --config /etc/vidjil/export/XXXX/config.json --pre-process /etc/vidjil/export/XXXX/pprocess.json 4
```
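
When the import is scripted, the argument list can be assembled programmatically, which avoids missing spaces or slashes in the paths. The sketch below only builds the command. It assumes that `config.json` and `pprocess.json` sit at the root of the export directory, and the function names are hypothetical.

```python
import shlex


def migrator_import_command(results_dir, export_dir, group_id,
                            config_name="config.json",
                            pprocess_name="pprocess.json"):
    """Build the argument list of the `migrator.sh ... import` call."""
    export_dir = export_dir.rstrip("/")
    return [
        "sh", "migrator.sh",
        "-p", results_dir,
        "-s", export_dir + "/",
        "import",
        "--config", f"{export_dir}/{config_name}",
        "--pre-process", f"{export_dir}/{pprocess_name}",
        str(group_id),
    ]


def format_command(arguments):
    """Return the command as a single, safely quoted shell line."""
    return shlex.join(arguments)
```

The resulting list can be passed to `subprocess.run` from inside the container, or printed with `format_command` for review before running it by hand.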



## Exporting/importing input sequence files

Note that web2py and the Vidjil server are robust to missing *input* files.
These files are not backed up and may be removed from the server at any time.
Most of the time, these large files won't be migrated along with the database, the results and the analysis files.

However, they can simply be copied over to the new installation. Their filenames
are stored in the database and should therefore be accessible as long as
they are in the correct directories.
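
After copying, one may want to know which referenced files did not make it to the new server. In this sketch, the list of filenames is assumed to have been obtained from the database beforehand, and all files are assumed to live in a single directory; both the function name and these assumptions are illustrative.

```python
import os


def missing_input_files(filenames, directory):
    """Return the names in `filenames` that are not regular files in `directory`."""
    return [name for name in filenames
            if not os.path.isfile(os.path.join(directory, name))]
```

Since the server tolerates missing input files, a non-empty result is informative rather than an error.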



## Exporting/importing a full database

When a full database migration is needed, it can be done with the following command:

``` bash
mysqldump -u <user> -p <db> -c --no-create-info > <file>
```

The `--no-create-info` option is important because web2py needs to be allowed to create tables itself.
Indeed, it keeps track of database migrations, and errors will occur if
tables that it expects to create already exist.

In order to import the data into an installation, you first need to ensure
that the tables have been created by web2py. This can be achieved by simply
accessing a non-static page.

/\!\\ If the database has been initialised from the interface you will
likely encounter primary key collisions or duplicated data, so it is best
to skip the initialisation altogether.

Once the tables have been created, the data can be imported as follows:

``` bash
mysql -u <user> -p <db> < <file>
```

The database only stores references to files: at least the results and analysis
files should therefore also be copied to the new installation.

Please note that with this method you should have at least one admin user
that is accessible in the imported data. Since the initialization is being
skipped, the usual admin account won't be present.
It is also possible to create a user directly from the database although
this is not the recommended course of action.


# Using CloneDB [Under development]
The [CloneDB](https://gitlab.inria.fr/vidjil/clonedb) has to be installed
independently of the Vidjil platform.

Then one can easily extract data to be used with CloneDB. A script is provided
(`server/web2py/applications/vidjil/scripts/create_clone_db.py`) which
produces a FASTA file to be indexed with CloneDB. This script takes as
parameter the FASTA output file and one (or many) group IDs, which correspond
to the groups having access to the datasets. Note that for the moment the Vidjil platform only allows per-group access to CloneDB.

The FASTA output filename must follow the format `clonedb_XXX.fa` where `XXX`
is replaced with the group ID.
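
The naming convention can be checked before indexing. This sketch assumes that group IDs are integers, as displayed in the administration interface; the function names are hypothetical and are not part of `create_clone_db.py`.

```python
import re

# clonedb_<group id>.fa, with an integer group ID (assumption).
CLONEDB_FASTA = re.compile(r"^clonedb_(\d+)\.fa$")


def clonedb_fasta_name(group_id):
    """Return the FASTA filename expected for `group_id`."""
    return f"clonedb_{int(group_id)}.fa"


def group_id_from_fasta_name(filename):
    """Return the group ID encoded in `filename`, or None if it does not match."""
    match = CLONEDB_FASTA.match(filename)
    return int(match.group(1)) if match else None
```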

Make sure that the `DIR_CLONEDB` variable is set in `defs.py` and points to
the CloneDB server directory. Make sure that in this directory the
`clonedb_defs.py` has been filled correctly.

Then index the created FASTA file with the CloneDB index (follow the
instructions from CloneDB).