#+TITLE: Vidjil -- Server Manual -- Installation and Maintenance
#+HTML_HEAD: <link rel="stylesheet" type="text/css" href="org-mode.css" />

This is the preliminary documentation of the Vidjil server.
It is intended for server administrators.
Users should consult the web application manual.


* Roadmap: plain installation / Docker containers

There are two ways to install and run a Vidjil server:

 - The *plain installation of the server* should run on any Linux/Unix server with Nginx (recommended) or Apache.
   We provide below detailed instructions for Ubuntu 14.04 LTS.
   This is currently the recommended installation.
   We have used this installation on the public server ([[https://app.vidjil.org]]) since October 2014.

 - We are developing Debian packages as well as *Docker containers* to ease the installation and the maintenance.
   The Docker containers are currently tested in some partner hospitals,
   and we intend to release ready-to-use Docker containers in Q3 2017.
   This will then be the recommended way to install and use Vidjil.

We recommend that people interested in installing a Vidjil server wait until Q3 2017
and use the public test server in the meantime.

* Requirements

** CPU, RAM

*** Minimal
   vidjil-algo typically uses approx. 1.2GB of RAM to run on a 1GB =.fastq= file
   and takes 5 minutes or more.
   Therefore, to process requests from a single user with a few samples,
   any standard multi-core processor with 2GB RAM will be enough.


*** Recommended
   When choosing hardware for your server it is important to know the scale
   of usage you require.
   If you have many users that use the app on a daily basis, you will need to
   have multiple cores to ensure the worker queues don't build up.
   One worker will occupy one core completely when running vidjil-algo (which is
   currently single-threaded).

   For reference, here is the setup we used on our public
   testing server [[https://app.vidjil.org]] for two years (40+ users, including 15 regular users):
      - Processor: Quad core Intel 2.4GHz
      - RAM: 16GB

   Given that the CPU is quad-core, we have 3 workers for executing Vidjil,
   always keeping one CPU core dedicated to the web server,
   even when the workers run at full capacity.
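
   As a small illustration of this rule (one worker per core, minus one core
   kept for the web server), the number of workers can be derived from the
   core count. The helper name =workers_for_cores= is ours and is not part of
   Vidjil; it is only a sketch of the sizing rule.

   #+BEGIN_SRC sh
     # Hypothetical helper: number of workers for a machine with N cores,
     # keeping one core free for the web server (but at least one worker).
     workers_for_cores() {
         cores=$1
         if [ "$cores" -gt 1 ]; then
             echo $((cores - 1))
         else
             echo 1
         fi
     }
   #+END_SRC

   For example, =workers_for_cores "$(nproc)"= prints 3 on a quad-core machine
   (=nproc= is the GNU coreutils command that prints the number of available cores).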

   As of the end of 2016, we use for the public server a virtual machine with similar capabilities.

   Running other RepSeq programs through the Vidjil server may require additional CPU and RAM.

** Storage

   As for many high-throughput sequencing pipelines, *disk storage for input data (=.fastq=, =.fasta=, =.fastq.gz= or =.fasta.gz=)
   is now the main constraint* in our environment.

   Depending on the sequencer, files can weigh several GB.
   Depending on the number of users, a full installation's total storage should thus be several hundred GB, or even several TB.
   We recommend a RAID setup of at least 2x2TB to allow for user files and at least one backup.

   User files (results, annotations) as well as the metadata database are much smaller (as of the end of 2016, on the public server, 3 GB for all user files of 40+ users).
   Note that even when the input sequences are deleted, the server is still able to display the results of previous analyses.
   Moreover, a future release in 2017 will allow access to =.fastq= files on a mounted filesystem.

** Authentication

   The accounts are currently local to the Vidjil server.
   We intend to implement LDAP access in 2017.

** Network

Once installed, the server can run on a private network.
However, the following network accesses are recommended:

 - outbound access
    - for users: several features use external platforms (IgBlast, IMGT/V-QUEST...)
    - for server maintainers: upgrades and reports to a monitoring server

 - inbound access
    - The team in Lille may help local server maintainers with some monitoring, maintenance and upgrade tasks,
      provided SSH access can be arranged, possibly over a VPN.

* Installing and running the Vidjil server

These installation instructions are for Ubuntu server 14.04.
They are preliminary; other documentation can also be found in [[http://git.vidjil.org/blob/dev/doc/dev.org][dev.org]].

** With Docker
   All our images are hosted on DockerHub and can be retrieved from the
   repository [[https://hub.docker.com/r/vidjil/vidjil/][vidjil/vidjil]].
   Our Docker environment makes use of docker-compose. All Vidjil components
   are currently packaged into a single Docker image. Individual services are
   started by docker-compose, as in this [[http://gitlab.vidjil.org/blob/master/docker/docker-compose.yml][example]].
   
*** Versions
   - 1.1
   - 1.2
   - 1.3
   - 1.3.1
   - 1.3.2
   - tmp_fix
     This image is identical to 1.3.2 except it has a manual bugfix on pyDAL
     version 17.11 that was affecting the scheduler_workers.


*** Configuring the Vidjil container
   If you use this environment on localhost, everything should
   work out of the box.

   However you may need to further configure the setup in order to make it
   available to a whole network.
   Here is a list of the configuration files found in the vidjil directory:
     - =conf/conf.js= :: contains various variables for the Vidjil browser
     - =conf/defs.py= :: contains various variables for the Vidjil server
     - =conf/gzip.conf= :: configuration for gzip in nginx
     - =conf/gzip_static.conf= :: same as the previous one, but for static resources
     - =conf/uwsgi.ini= :: configuration required to run Vidjil with uwsgi
     - =sites/nginx= :: configuration required when running Vidjil with nginx
     - =scripts/nginx-entrypoint.sh= :: entrypoint for the nginx service (not currently in use)
     - =scripts/uwsgi-entrypoint.sh= :: entrypoint for the uwsgi service. Ensures that the owner
       of some relevant volumes is correct within the container and starts uwsgi

  Here are some notable configuration changes you should consider:
    - Change the mysql user/password in docker-compose.yml. You will also
      need to change the DB_ADDRESS in conf/defs.py to match it.

    - Change the hostname in the nginx configuration vidjil/sites/nginx_conf.
      If you are using vidjil on a network, then this might be required.

    - Change the default admin password. Log in as plop@plop.com with password 1234
      and go to the following URL:
      https://<your hostname>/vidjil/default/user/change_password

    - Change the SSL certificates. Building the vidjil-server image
      creates a self-signed certificate for convenience, so that
      HTTPS queries work from the start, but this may not be
      acceptable for a production environment.
      To replace the certificates, the current method is to mount them
      on /etc/nginx/ssl with Docker volumes in
      docker-compose.yml.

    - Change the FROM_EMAIL and ADMIN_EMAILS variables in conf/defs.py. These
      represent the sender email address and the destination email addresses,
      used in reporting patient milestones and server errors.

    - Change the database password. In the mysql directory you will find an
      entrypoint script which creates the database and the user, and sets that
      user's password.
      This is the password you need to match in the defs.py file in the
      Vidjil configuration.

    - Change the volumes in docker-compose.yml. By default all files that
      need to be saved outside of the containers (the database, uploads, Vidjil
      results and log files) are stored in /opt/vidjil, but you can change
      this by editing the paths in the volumes.

    - Configure the reporter. Ideally this container should be positioned
      on a remote server in order to be able to report on a down server, but we have packed it here for convenience.

*** Updating a Docker installation
   Usually our docker installation will only require the following:

   #+BEGIN_SRC sh
     docker pull vidjil/vidjil:latest
   #+END_SRC

   In some cases you may need to update your docker-compose.yml file or some
   of the configuration files. The latest versions are available on our
   [[https://github.com/vidjil/vidjil][GitHub]].
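
   Pulling a new image does not by itself replace the running containers:
   they keep using the old image until they are recreated. A minimal sketch
   of the whole update, assuming that the services are managed with
   =docker-compose=; the directory =/opt/vidjil/docker= and the helper names
   are hypothetical and should be adapted to your installation.

   #+BEGIN_SRC sh
     # Hypothetical location of docker-compose.yml; adjust to your installation.
     COMPOSE_DIR=${COMPOSE_DIR:-/opt/vidjil/docker}

     # Print the command instead of running it when DRY_RUN=1.
     run() {
         if [ "${DRY_RUN:-0}" = "1" ]; then
             echo "+ $*"
         else
             "$@"
         fi
     }

     # Fetch the latest image, then recreate the containers that use it.
     update_vidjil() {
         run docker pull vidjil/vidjil:latest &&
         run docker-compose -f "$COMPOSE_DIR/docker-compose.yml" up -d
     }
   #+END_SRC

   Setting =DRY_RUN=1= before calling =update_vidjil= only prints the two
   commands, which is a convenient way to check the paths first.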

** Requirements
   #+BEGIN_SRC sh
    apt-get install git
    apt-get install g++
    apt-get install make
    apt-get install unzip
    apt-get install python-dev python-pip
    apt-get install libyajl2 libyajl-dev
    pip install unittest2
    pip install unittest-xml-reporting
    pip install enum34
    pip install ijson cffi
   #+END_SRC

** Vidjil server installation and initialization
   Go to the =server/= directory.

   If you just want to do some tests without installing a real web server,
   then launch =make install_web2py_standalone=. Otherwise, launch
   =make install_web2py=.

   The process for installing Vidjil server together with a real web server
   will be detailed in the future.

** Detailed manual server installation and browser linking
	
	Requirements:
		ssh, zip, unzip, tar, openssh-server, build-essential, python, python-dev,
		mysql, python2.5-psycopg2, postfix, wget, python-matplotlib, python-reportlab,
		python-enum34, mercurial, git

	If you want to run Vidjil with an Apache web server you will also need:
		apache2, libapache2-mod-wsgi

	Or if you want to use Nginx:
		nginx-full, fcgiwrap


	For simplicity this guide assumes that you are installing to =/home/www-data=.

	Clone https://github.com/vidjil/vidjil.git

	Download and unzip web2py. Copy the contents of web2py to the server/web2py
	folder of your Vidjil installation
	(in this case /home/www-data/vidjil/server/web2py) and give ownership to www-data:

        #+BEGIN_SRC sh
	chown -R www-data:www-data /home/www-data/vidjil
        #+END_SRC
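
        The clone, download and copy steps above can be sketched as a single
        function. The web2py archive URL, the use of =/tmp= and the assumption
        that the archive unpacks into a =web2py/= directory are ours; check them
        against the web2py download page before running this.

        #+BEGIN_SRC sh
        # Target directory assumed by this guide.
        PREFIX=${PREFIX:-/home/www-data}
        # Assumed download location of the web2py source archive.
        WEB2PY_URL=${WEB2PY_URL:-http://www.web2py.com/examples/static/web2py_src.zip}

        install_vidjil_sources() {
            git clone https://github.com/vidjil/vidjil.git "$PREFIX/vidjil" &&
            wget -O /tmp/web2py_src.zip "$WEB2PY_URL" &&
            # Assumption: the archive unpacks into /tmp/web2py
            unzip -q /tmp/web2py_src.zip -d /tmp &&
            mkdir -p "$PREFIX/vidjil/server/web2py" &&
            cp -R /tmp/web2py/. "$PREFIX/vidjil/server/web2py/" &&
            chown -R www-data:www-data "$PREFIX/vidjil"
        }
        #+END_SRC

        Each step only runs if the previous one succeeded, so a failed download
        does not leave a half-copied web2py directory owned by =www-data=.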

	If you are using Apache, run the following commands to make sure that all the Apache modules you need are activated:

        #+BEGIN_SRC sh
		a2enmod ssl
		a2enmod proxy
		a2enmod proxy_http
		a2enmod headers
		a2enmod expires
		a2enmod wsgi
		a2enmod rewrite  # for 14.04
        #+END_SRC

	To set up SSL encryption, the web server needs a key and a certificate. The safest option
	is to get a certificate from a trusted Certificate Authority, but for testing
	purposes you can generate your own:

        #+BEGIN_SRC sh
		mkdir /etc/<webserver>/ssl
		# 2048-bit key and SHA-256: 1024-bit keys and SHA-1 are rejected by recent clients
		openssl genrsa 2048 > /etc/<webserver>/ssl/self_signed.key
		chmod 400 /etc/<webserver>/ssl/self_signed.key
		openssl req -new -x509 -nodes -sha256 -days 365 \
		    -key /etc/<webserver>/ssl/self_signed.key > /etc/<webserver>/ssl/self_signed.cert
		openssl x509 -noout -fingerprint -text \
		    < /etc/<webserver>/ssl/self_signed.cert > /etc/<webserver>/ssl/self_signed.info
        #+END_SRC

        =<webserver>= should be replaced with the appropriate web server name
        (i.e. apache2 or nginx).
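
        For testing purposes, the key and the certificate can also be created
        in a single non-interactive =openssl req= call. This is a sketch under
        the assumption that OpenSSL 1.0 or later is installed; the function
        name, its arguments and the =/CN= subject are illustrative.

        #+BEGIN_SRC sh
        # Hypothetical helper: create self_signed.key and self_signed.cert in DIR
        # for host name CN (valid 365 days, for testing only).
        make_self_signed() {
            dir=$1
            cn=$2
            mkdir -p "$dir" &&
            openssl req -x509 -newkey rsa:2048 -nodes -sha256 -days 365 \
                -subj "/CN=$cn" \
                -keyout "$dir/self_signed.key" \
                -out "$dir/self_signed.cert" 2>/dev/null &&
            chmod 400 "$dir/self_signed.key"
        }
        #+END_SRC

        For example, =make_self_signed /etc/nginx/ssl "$(hostname)"= writes both
        files into the Nginx SSL directory.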


	Given that Vidjil is a two-part application, one that serves routes from a server
	and one that is served statically, we need to configure Apache accordingly.
	Therefore we tell Apache to:
		- Start web2py as a WSGI daemon (allows Apache to serve the application).
		- Reserve two virtual hosts (one served with SSL encryption, and one without).
		- Serve static content on the first host, prevent overriding
			by the server (otherwise all routes are redirected through web2py) and follow symlinks.
			This allows us to symlink to our browser app in the /var/www directory and keep both parts
			of Vidjil together.
		- Use SSL encryption on the second host, and only serve very specific folders statically (such
			as JavaScript files and images, because we don't want to create a controller to serve that kind of data).

	You can replace your Apache default config with the following
	(/etc/apache2/sites-available/default.conf - remember to make a backup just in case):

        #+BEGIN_EXAMPLE
		WSGIDaemonProcess web2py user=www-data group=www-data processes=1 threads=1

		<VirtualHost *:80>

		  DocumentRoot /var/www
		  <Directory />
		    Options FollowSymLinks
		    AllowOverride None
		  </Directory>

		  <Directory /var/www/>
		    Options Indexes FollowSymLinks MultiViews
		    AllowOverride all
		    Order allow,deny
		    allow from all
		  </Directory>

		  ScriptAlias /cgi/ /usr/lib/cgi-bin/

		  <Directory /usr/lib/cgi-bin/>
		    Options Indexes FollowSymLinks
		    Options +ExecCGI
		    #AllowOverride None
		    Require all granted
		    AddHandler cgi-script cgi pl
		  </Directory>

		  <Directory /home/www-data/vidjil/browser>
		    AllowOverride None
		  </Directory>

		  CustomLog /var/log/apache2/access.log common
		  ErrorLog /var/log/apache2/error.log
		</VirtualHost>


		<VirtualHost *:443>
		  SSLEngine on
		  SSLCertificateFile /etc/apache2/ssl/self_signed.cert
		  SSLCertificateKeyFile /etc/apache2/ssl/self_signed.key

		  WSGIProcessGroup web2py
		  WSGIScriptAlias / /home/www-data/vidjil/server/web2py/wsgihandler.py
		  WSGIPassAuthorization On

		  <Directory /home/www-data/vidjil/server/web2py>
		    AllowOverride None
		    Require all denied
		    <Files wsgihandler.py>
		      Require all granted
		    </Files>
		  </Directory>

		  AliasMatch ^/([^/]+)/static/(?:_[\d]+.[\d]+.[\d]+/)?(.*) \
		        /home/www-data/vidjil/server/web2py/applications/$1/static/$2

		  <Directory /home/www-data/vidjil/server/web2py/applications/*/static/>
		    Options -Indexes
		    ExpiresActive On
		    ExpiresDefault "access plus 1 hour"
		    Require all granted
		  </Directory>

		  CustomLog /var/log/apache2/ssl-access.log common
		  ErrorLog /var/log/apache2/error.log
		</VirtualHost>
        #+END_EXAMPLE

	Now enable the site and one more Apache module:
        #+BEGIN_SRC sh
		a2ensite default                   # FOR 14.04
		a2enmod cgi
        #+END_SRC

	Restart the server in order to make sure the config is taken into account.

	And create some symlinks to avoid splitting our app:
        #+BEGIN_SRC sh
		ln -s /home/www-data/vidjil/browser /var/www/browser
		ln -s /home/www-data/vidjil/browser/cgi/align.cgi /usr/lib/cgi-bin
		ln -s /home/www-data/vidjil/germline /var/www/germline
		ln -s /home/www-data/vidjil/data /var/www/data
        #+END_SRC

      If you are using Nginx, the configuration is the following:
        #+BEGIN_EXAMPLE
            server {
                listen 80;
                server_name $hostname;
                return 301 https://$hostname$request_uri;

            }
            server {
                    listen 443 default_server ssl;
                    server_name     $hostname;
                    ssl_certificate         /etc/nginx/ssl/web2py.crt;
                    ssl_certificate_key     /etc/nginx/ssl/web2py.key;
                    ssl_prefer_server_ciphers on;
                    ssl_session_cache shared:SSL:10m;
                    ssl_session_timeout 10m;
                    ssl_ciphers ECDHE-RSA-AES256-SHA:DHE-RSA-AES256-SHA:DHE-DSS-AES256-SHA:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA;
                    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
                    keepalive_timeout    70;
                    location / {
                        #uwsgi_pass      127.0.0.1:9001;
                        uwsgi_pass      unix:///tmp/web2py.socket;
                        include         uwsgi_params;
                        uwsgi_param     UWSGI_SCHEME $scheme;
                        uwsgi_param     SERVER_SOFTWARE    nginx/$nginx_version;
                        ###remove the comments to turn on if you want gzip compression of your pages
                        # include /etc/nginx/conf.d/web2py/gzip.conf;
                        ### end gzip section

                        proxy_read_timeout 600;
                        client_max_body_size 20G;
                        ###
                    }
                    ## if you serve static files through https, copy here the section
                    ## from the previous server instance to manage static files

                    location /browser {
                        root /home/www-data/vidjil/;
                        expires 1h;

                        error_page 405 = $uri;
                    }

                    location /germline {
                        root /home/www-data/vidjil/;
                        expires 1h;

                        error_page 405 = $uri;
                    }

                    ###to enable correct use of response.static_version
                    #location ~* ^/(\w+)/static(?:/_[\d]+\.[\d]+\.[\d]+)?/(.*)$ {
                    #    alias /home/www-data/vidjil/server/web2py/applications/$1/static/$2;
                    #    expires max;
                    #}
                    ###

                    location ~* ^/(\w+)/static/ {
                        root /home/www-data/vidjil/server/web2py/applications/;
                        expires max;
                        ### if you want to use pre-gzipped static files (recommended)
                        ### check scripts/zip_static_files.py and remove the comments
                        # include /etc/nginx/conf.d/web2py/gzip_static.conf;
                        ###
                    }

                    client_max_body_size 20G;

                    location /cgi/ {
                        gzip off;
                        root  /home/www-data/vidjil/browser/;
                        # Fastcgi socket
                        fastcgi_pass  unix:/var/run/fcgiwrap.socket;
                        # Fastcgi parameters, include the standard ones
                        include /etc/nginx/fastcgi_params;
                        # Adjust non standard parameters (SCRIPT_FILENAME)
                        fastcgi_param SCRIPT_FILENAME  $document_root$fastcgi_script_name;
                    }

            }
        #+END_EXAMPLE

        With Nginx, no symlinks are needed since the locations above
        reference the directories directly.

	Now we need to configure the database connection parameters:
		- create a file called conf.js in /home/www-data/vidjil/browser/js containing:
                  #+BEGIN_EXAMPLE
			var config = {
			    /*cgi*/
			    "cgi_address" : "default",
			    
			    /*database */
			    "use_database" : true,
			    "db_address" : "default",
			    
			    "debug_mode" : false
			}
                  #+END_EXAMPLE
		This tells the browser to access the server on the current domain.

		- copy vidjil/server/web2py/applications/vidjil/modules/defs.py.sample
			to vidjil/server/web2py/applications/vidjil/modules/defs.py
		  and change the value of DB_ADDRESS to reference your database.
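
        A minimal sketch of the second step, assuming that DB_ADDRESS holds a
        web2py DAL connection string of the form
        =mysql://user:password@host/database= and that =defs.py.sample=
        contains a line starting with =DB_ADDRESS=. The helper names and the
        credentials in the example are placeholders.

        #+BEGIN_SRC sh
        # Build a DAL connection string for MySQL: dal_uri USER PASSWORD HOST DB
        dal_uri() {
            printf 'mysql://%s:%s@%s/%s\n' "$1" "$2" "$3" "$4"
        }

        # Copy the sample configuration and set DB_ADDRESS in the copy.
        # MODULES is .../web2py/applications/vidjil/modules
        # Note: a password containing "|" or "&" would need escaping for sed.
        configure_defs() {
            modules=$1
            uri=$2
            cp "$modules/defs.py.sample" "$modules/defs.py" &&
            sed -i "s|^DB_ADDRESS *=.*|DB_ADDRESS = '$uri'|" "$modules/defs.py"
        }
        #+END_SRC

        For example: =configure_defs /home/www-data/vidjil/server/web2py/applications/vidjil/modules "$(dal_uri vidjil secret localhost vidjil)"=.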

	You can now access your app.
	All that is left to do is to click on the "init database" link above the login page.
	This creates a default admin user (plop@plop.com, password 1234; make sure to
	remove this user in your production environment) and the default configurations
	for files and results.

	
* Testing the server
  If you develop on the server, or just want to check if everything is ok, you
  should launch the server tests.

  First, you should have a working fuse server by launching =make
  launch_fuse_server= (just launch it once, then it is running in the
  background and can be killed with =make kill_fuse_server=).

  Then you can launch the tests with =make unit=.


* Troubleshooting

** Web2py runs but does not allow any connection

Check whether the relevant disks are properly mounted.
Disk failures or other events could have caused a partition to be remounted read-only.
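
On Linux, read-only mounts can be spotted in =/proc/mounts=, whose fourth
field holds the mount options. A small sketch (the function name is ours):

#+BEGIN_SRC sh
# Print the mount points whose options include "ro".
# Reads /proc/mounts by default; another file can be given, e.g. for testing.
readonly_mounts() {
    awk '$4 ~ /(^|,)ro(,|$)/ { print $2 }' "${1:-/proc/mounts}"
}
#+END_SRC

Some pseudo file systems are legitimately read-only, so only the partitions
holding the uploads, the results and the database matter here.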

** Workers seem to be stuck
   For reasons that are not clear yet, workers may stop being
   assigned jobs even though they have no ongoing jobs.

   In such a (rare) case, it may be useful to restart the web2py schedulers:
   #+BEGIN_SRC sh
   initctl restart web2py-scheduler
   #+END_SRC
** Debugging Web2py workers
   One can launch the workers by hand (see the =/etc/init= script) and add a
   =-D 0= option. It prints debugging information on what the workers are doing.

   The most useful information comes from the TICKER worker: the one that
   assigns jobs to workers. So it is best to first kill all the workers and
   then launch one by hand to be sure that it will be the ticker.

** Restarting web2py
   Just touch the file =/etc/uwsgi/web2py.ini=.

   Another way of restarting it is to touch the file
   =server/web2py/applications/vidjil/modules/defs.py=.
   This will tell =uwsgi= to restart web2py (including the workers).

** Restarting uwsgi
   When one modifies a uwsgi config file (usually in the =/etc/uwsgi= directory), it
   may be necessary to restart uwsgi so that the modifications are taken into
   account. This can be done using:
   #+BEGIN_SRC sh
   initctl restart uwsgi-emperor
   #+END_SRC
** Logging database queries
*** MySQL
    See this [[https://stackoverflow.com/questions/650238/how-to-show-the-last-queries-executed-on-mysql][insightful Stack Overflow post]].
    To summarize, this can either be done at runtime:
    #+BEGIN_SRC sql
    SET GLOBAL log_output = "FILE";
    SET GLOBAL general_log_file = "/path/to/your/logfile.log";
    SET GLOBAL general_log = 'ON';
    #+END_SRC
    Or directly in the configuration file (less recommended):
    #+BEGIN_SRC conf
    general_log_file        = /var/log/mysql/mysql.log
    general_log             = 1
    #+END_SRC
    In that case the server must be restarted afterwards.
* Running the server in a production environment

** Introduction
  When manipulating a production environment it is important to take certain
  precautionary measures, in order to ensure that production can be rolled
  back to a previous version and that any lost data can be
  recovered.

  Web2py and Vidjil are no exception to this rule.

** Making backups
  Performing an analysis in Vidjil is time-consuming; should the
  data be lost, valuable man-hours are lost too.
  To prevent this, we make regular incremental backups of the
  data stored on the Vidjil servers.
  This applies to the files created during the analysis (either by software or by a human).
  This does not apply to uploaded files.


  To ease the backup, the =backup.sh= script provides an example. For this
  script to be run automatically, =mysqldump= must not ask
  for a password. The credentials should be provided in a
  =~/.my.cnf= file (for MySQL):
  #+BEGIN_SRC conf
[client]
user = backup
password = "strongpassword"
host = localhost
  #+END_SRC
  It is also advised that the backup user has read-only access to the database.
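
  As an illustration only (this is not the =backup.sh= script shipped with
  Vidjil), a database dump relying on =~/.my.cnf= can be sketched as follows.
  The function name, the database name and the destination directory used in
  the example are assumptions.

  #+BEGIN_SRC sh
  # dump_database DB DEST
  # Dump database DB into DEST as a dated, compressed SQL file and print its path.
  # Credentials are read from ~/.my.cnf, so no password is requested.
  dump_database() {
      db=$1
      dest=$2
      out="$dest/$db-$(date +%Y-%m-%d).sql"
      mkdir -p "$dest" &&
      mysqldump --single-transaction "$db" > "$out" &&
      gzip -f "$out" &&
      echo "$out.gz"
  }
  #+END_SRC

  For example, =dump_database vidjil /var/backups/vidjil= could be called from a
  daily cron job. The dump is written before being compressed so that a failing
  =mysqldump= is not hidden behind a successful =gzip=.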
** Autodelete and Permissions
  Web2py has a handy feature called AutoDelete which allows the administrator
  to state that file reference deletions should be cascaded if no other
  references to the file exist.
  When deploying to production one needs to make sure AutoDelete is
  deactivated.
  As a second precaution it is also wise to temporarily restrict web2py's
  access to referenced files.

  Taking two measures to prevent file loss might seem like overkill, but
  securing data is more important than the small amount of extra time spent
  putting these measures into place.

** Deploying the server
  Currently, deploying changes to production amounts to merging into the
  rbx branch and pulling that branch on the server.

  Once this has been done, it is important that any database migrations have
  been applied.
  This can be verified by refreshing the server (calling a controller) and
  then looking at the database.


** Step by Step
  - Set AutoDelete to False
  - Check permissions on the uploads folder (set to 100)
    - you can also check the number of files present at this point for future
      reference
  - Back up the database: archive the old backup.csv and then, from the admin page,
    back up the db
  - Pull rbx (if dev has already been merged)
  - Check the database (for missing data or to ensure migrations have been
    applied)
  - Check files to ensure no files are missing
  - Reset the folder permissions on uploads (755 seems to be the minimum
    requirement for web2py)
  - Run unit tests (simply a precaution: Continuous Integration renders this
    step redundant but it's better to be sure)
  - Check site functionality
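
  The permission and file-count steps of this checklist can be sketched with
  three small helpers. Their names are ours, and the location of the uploads
  folder depends on your installation.

  #+BEGIN_SRC sh
  # Count the regular files under DIR (to compare before and after deployment).
  count_files() {
      find "$1" -type f | wc -l | tr -d ' '
  }

  # Restrict access to the uploads folder during the deployment...
  lock_uploads() {
      chmod 100 "$1"
  }

  # ...and restore the minimal permissions required by web2py afterwards.
  unlock_uploads() {
      chmod 755 "$1"
  }
  #+END_SRC

  A typical use is to record the output of =count_files= before calling
  =lock_uploads=, and to compare it with the count obtained after
  =unlock_uploads=. While the folder is locked, only root can list its content.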

* Resetting user passwords
  Currently there is no easy way of resetting a user's password.
  The current method is to open a web2py shell and update the record by hand
  (in the web2py shell, changes must be committed explicitly):
  : cd server/web2py
  : python web2py.py -S vidjil -M
  : db.auth_user[<user-id>].update_record(password=CRYPT(key=auth.settings.hmac_key)('<password>')[0],reset_password_key='')
  : db.commit()

* Migrating Data
** Database
   The easiest way to perform a database migration is to first extract the
   data with the following command:

   #+BEGIN_SRC sh
     mysqldump -u <user> -p <db> -c --no-create-info > <file>
   #+END_SRC

   An important element to note here is the =--no-create-info= option. We add this
   parameter because web2py needs to be allowed to create the tables itself:
   it keeps track of database migrations, and errors will occur if
   tables that it considers it needs to create already exist.

   In order to import the data into an installation you first need to ensure
   that the tables have been created by web2py. This can be achieved by simply
   accessing a non-static page.

   /!\ If the database has been initialised from the interface you will
   likely encounter primary key collisions or duplicated data, so it is best
   to skip the initialisation altogether.

   Once the tables have been created, the data can be imported as follows:

   #+BEGIN_SRC sh
     mysql -u <user> -p <db> < <file>
   #+END_SRC

   Please note that with this method you should have at least one admin user
   that is accessible in the imported data. Since the initialisation is being
   skipped, you will not have the usual admin account present.
   It is also possible to create a user directly from the database although
   this is not the recommended course of action.

** Files
   Files can simply be copied over to the new installation. Their filenames
   are stored in the database and should therefore be accessible as long as
   they are in the correct directories.

** Filtering data (soon deprecated)
   When extracting data for a given user, the whole database should not be
   copied over.
   There are two courses of action:
     - create a copy of the existing database and remove the users that are
       irrelevant. The cascading delete should remove any unwanted data
       barring a few exceptions (notably fused_file, groups and sample_set_membership)

     - export the relevant data directly from the database. This method
       requires multiple queries which will not be detailed here.

  Once the database has been correctly extracted, a list of files can be
  obtained from sequence_file, fused_file, results_file and analysis_file
  with the following query:

  #+BEGIN_SRC sql
    SELECT <filename field>
    FROM <table name>
    INTO OUTFILE 'filepath'
    FIELDS TERMINATED BY ','
    ENCLOSED BY ''
    LINES TERMINATED BY '\n'
  #+END_SRC

  Note: We are managing filenames here which should not contain any
  character such as quotes or commas so we can afford to refrain from
  enclosing the data with quotes.

  This query will output a csv file containing a filename on each line.
  Copying the files is now just a matter of running the following script:

#+BEGIN_SRC sh
  sh copy_files <file source> <file destination> <input file>
#+END_SRC
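
  The =copy_files= script itself is not reproduced in this manual. As an
  illustration of what such a script has to do, a minimal equivalent could
  look like the sketch below. It is an assumption about the behaviour, not
  the shipped script, and the function name is ours.

#+BEGIN_SRC sh
  # copy_listed_files SRC DEST LIST
  # Copy every file named in LIST (one filename per line) from SRC to DEST,
  # reporting on stderr the names that are missing in SRC.
  copy_listed_files() {
      src=$1
      dest=$2
      list=$3
      mkdir -p "$dest" || return 1
      # "|| [ -n ... ]" also handles a last line without a trailing newline
      while IFS= read -r name || [ -n "$name" ]; do
          [ -n "$name" ] || continue            # skip empty lines
          if [ -f "$src/$name" ]; then
              cp -p "$src/$name" "$dest/"
          else
              echo "missing: $name" >&2
          fi
      done < "$list"
  }
#+END_SRC

  The arguments follow the same order as above: source directory,
  destination directory, then the file produced by the SQL query.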

** Exporting sample sets
   The migrator script allows the export and import of data, whether it be a
   single patient/run/set or a list of them, or even all the sample sets
   associated to a group.

   #+BEGIN_EXAMPLE
    usage: migrator.py [-h] [-f FILENAME] [--debug] {export,import} ...

    Export and import data

    positional arguments:
    {export,import}  Select operation mode
      export         Export data from the DB into a JSON file
      import         Import data from JSON into the DB

    optional arguments:
      -h, --help       show this help message and exit
      -f FILENAME      Select the file to be read or written to
      --debug          Output debug information
   #+END_EXAMPLE

   Export:
   #+BEGIN_EXAMPLE
    usage: migrator.py export [-h] {sample_set,group} ...

    positional arguments:
      {sample_set,group}  Select data selection method
        sample_set        Export data by sample-set ids
        group             Extract data by groupid

    optional arguments:
      -h, --help          show this help message and exit
   #+END_EXAMPLE

   #+BEGIN_EXAMPLE
    usage: migrator.py export sample_set [-h] {patient,run,generic} ID [ID
    ...]

    positional arguments:
      {patient,run,generic}
                              Type of sample
        ID                    Ids of sample sets to be extracted

      optional arguments:
        -h, --help            show this help message and exit
   #+END_EXAMPLE

   #+BEGIN_EXAMPLE
    usage: migrator.py export group [-h] groupid

    positional arguments:
      groupid     The long ID of the group

    optional arguments:
      -h, --help  show this help message and exit
   #+END_EXAMPLE

   Import:
   #+BEGIN_EXAMPLE
    usage: migrator.py import [-h] [--dry-run] [--config CONFIG] groupid

    positional arguments:
      groupid     The long ID of the group

    optional arguments:
      -h, --help  show this help message and exit
      --dry-run   With a dry run, the data will not be saved to the database
      --config CONFIG  Select the config mapping file
   #+END_EXAMPLE
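
   Based on the usage messages above, typical invocations might look as
   follows. The file name =export.json=, the ids and the wrapper functions are
   placeholders of ours. How the script is actually launched (directly with
   =python=, as assumed here, or through the web2py shell) depends on your
   installation.

   #+BEGIN_SRC sh
     # Print (when DRY_RUN=1) or run a migrator.py command.
     migrator() {
         if [ "${DRY_RUN:-0}" = "1" ]; then
             echo "python migrator.py $*"
         else
             python migrator.py "$@"
         fi
     }

     # Export the patients whose sample set ids are given into export.json.
     export_patients() {
         migrator -f export.json export sample_set patient "$@"
     }

     # Check an import into group GROUPID without writing to the database.
     check_import() {
         migrator -f export.json import --dry-run "$1"
     }
   #+END_SRC

   Note that =-f= is an option of the main command and therefore comes before
   =export= or =import=, whereas =--dry-run= belongs to the =import= subcommand.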