This changelog concerns vidjil-algo, the algorithmic part (C++) of the Vidjil platform.

2018-02-02  The Vidjil Team
	* Renamed program to 'vidjil-algo'
	* Improved analysis of large deletions in V or J genes (core/segment.cpp) #2767
	* Improved analysis of reads where V/J junction is close to an end, slightly shifting or shortening
	  the window (core/segment.cpp, core/windowExtractor.cpp) #2913
	* Removed default homopolymer cost, improving detection on the majority of sequencers (core/dynprog.h) #2851
	* Added warnings in the json output on some clones #2916
	* Bug closed (clean some germinal sequences #2598)
	* Corrected memory leaks #3018 #3031
	* Cleaned folders, with sources in src/, demo data in demo/, test data in algo/tests/data/ #2635 #2611
	* Changed option processing library, trying CLI11 instead of docopt (lib/CLI11.hpp, tools/align.cpp) #926
	* Streamlined packaging and Makefile, based on git #2255 #2630
	* Updated build process (supporting g++ 4.8 to 7.2, clang from 3.3), warns on non-supported compilers #2614 #2615 #2631
	* Updated help and online help, added demo files and commands, -f option #2635 #2076
	* Refactored unit tests with TEST_TAP_EQUAL #2989
	* New and updated unit and functional tests, including tests from documentation #1284 #2634

2017-09-08  The Vidjil Team
	* New BAM input parser, refactored input parsers (core/bioreader.cpp, core/bam.cpp) #2016
	* Streamlined seed options (-m, -k, -s), overriding -g presets (vidjil.cpp, core/germlines.cpp) #1673 #2156
	* New and updated unit and functional tests #2016 #2552
	* Faster tests (tests/Makefile, tests/*) #2254
	* Bug closed (halting when a representative is too short) #2224
	* Documented dependencies and libraries (lib/info.txt)
	* Updated help, correcting -h documentation on -u/-uu/-uuu options
	* Moved continuous integration to Gitlab CI (.gitlab-ci.yml)

2017-03-14  The Vidjil Team
	* Better support for germlines from several species (germlines/*.g) #1987 #2132
	* Streamlined option -g, including filtering, removed old -i/-G options (vidjil.cpp, core/germline.cpp) #2134
	* Better productivity estimation, handling in-frame stop codons (core/segment.cpp) #2220
	* Better unsegmentation causes, 'UNSEG only V/J' needs really significant V/J fragments (core/segment.cpp) #2107
	* New experimental faster heuristic with Aho-Corasick automaton (core/automaton.hpp) #1366
	* New option to disable window clustering (-w all) (core/windowExtractor.cpp) #1642
	* Updated germline genes from IMGT/GENE-DB
	* Bugs closed (json output, consensus computation, -y option) #2214 #2217 #2224 #2239
	* Refactored test structure for should-vdj tests (tests/should-vdj-to-tap.py, tests/repseq_vdj.py)
	* New and updated unit and functional tests, including more solved former bugs
	* Moved code to gitlab, better project management, commit messages are now linked to issues
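
Note on the productivity entry above: the changelog does not say how core/segment.cpp looks for in-frame stop codons, so the following is only a minimal sketch of the general idea. The function name and the assumption that the reading frame offset is already known are hypothetical, and the input is assumed to be uppercase DNA.

```cpp
#include <string>

// Hypothetical helper, not taken from core/segment.cpp.
// Returns true if a stop codon (TAA, TAG or TGA) appears in the reading
// frame that starts at offset `frame_start` of the uppercase DNA sequence.
bool has_in_frame_stop(const std::string& dna, std::size_t frame_start) {
    for (std::size_t i = frame_start; i + 3 <= dna.size(); i += 3) {
        const std::string codon = dna.substr(i, 3);
        if (codon == "TAA" || codon == "TAG" || codon == "TGA")
            return true;
    }
    return false;
}
```

For instance, "ATGTAAGGG" contains a stop codon in frame 0 (ATG TAA GGG) but not in frame 1 (TGT AAG), so a junction read in frame 0 would be flagged under this simplified rule.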

2016-09-30  The Vidjil Team
	* New default trim option (-t 0) for germlines, leading to better results on some special recombinations (vidjil.cpp)
	* More flexible option to keep sequences of interest (-W), accepting sequences of any size (core/windows.cpp)
	* Better generation of consensus sequences, marking ambiguous positions with Ns (core/representative.cpp)
	* Better recognition of standard recombinations by lowering incomplete germline attractiveness (core/segment.cpp)
	* Better TRD locus analysis, including some Dd-Dd-Jd recombinations (data/germlines.data)
	* Reduced unused similarity section of .json file output (vidjil.cpp)
	* New json output with -c segment (vidjil.cpp)
	* Updated help (doc/algo.org)
	* New and updated unit and functional tests, updated test process
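
Note on the consensus entry above: the changelog only states that ambiguous positions are marked with Ns. One simple way to obtain such a behavior is a column-wise majority vote, sketched below. This is not the method of core/representative.cpp; the function name, the 0.75 threshold, and the assumption that reads are already aligned column by column are all hypothetical.

```cpp
#include <array>
#include <string>
#include <vector>

// Hypothetical sketch: column-wise majority vote over pre-aligned reads.
// A position is emitted as 'N' when the most frequent base does not reach
// `min_share` of the reads covering that column.
std::string consensus_with_n(const std::vector<std::string>& reads,
                             double min_share = 0.75) {
    if (reads.empty()) return "";
    const std::string bases = "ACGT";
    std::string consensus;
    for (std::size_t col = 0; col < reads[0].size(); ++col) {
        std::array<std::size_t, 4> counts{};  // zero-initialized A, C, G, T counts
        std::size_t covered = 0;
        for (const std::string& read : reads) {
            if (col >= read.size()) continue;            // read too short
            const std::size_t b = bases.find(read[col]);
            if (b == std::string::npos) continue;        // skip non-ACGT symbols
            ++counts[b];
            ++covered;
        }
        std::size_t best = 0;
        for (std::size_t b = 1; b < 4; ++b)
            if (counts[b] > counts[best]) best = b;
        const bool clear = covered > 0 && counts[best] >= min_share * covered;
        consensus += clear ? bases[best] : 'N';
    }
    return consensus;
}
```

With the four reads "ACGT", "ACGT", "ACGA" and "ACTA", the third column is still called G (3 reads out of 4), while the last column is split evenly between T and A and becomes N, giving "ACGN".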

2016-08-06  The Vidjil Team
	* New computation of average quality for each representative sequence, inside its window (core/representative.cpp)
	* New option (-E) to set the e-value for FineSegmenter D segment detection (vidjil.cpp)
	* Safer e-value estimation for D segments (vidjil.cpp)
	* Raised capacity to 2*10^9 reads (core/segment.cpp, core/fasta.cpp)
	* Better filtering/debug options (-u/-uu/-uuu), keeping interesting reads (vidjil.cpp, core/windowExtractor.cpp)
	* Better .vdj.fa header for irregular/incomplete recombinations, 1-based positions (core/segment.cpp)
	* Bugs closed (-u files, similarity in .vidjil)
	* New and updated unit and functional tests, checking again .should-vdj tests
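
Note on the average quality entry above: as an illustration of what "average quality inside a window" can mean, the sketch below averages the Phred scores of one FASTQ quality string over a given window. The function name, the Phred+33 encoding, and the restriction to a single read are assumptions; the changelog does not describe how core/representative.cpp combines qualities.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: mean Phred score of quality[start, start + length),
// assuming the usual Phred+33 ASCII encoding of FASTQ files.
double average_quality(const std::string& quality, std::size_t start,
                       std::size_t length, int phred_offset = 33) {
    assert(length > 0 && start + length <= quality.size());
    double sum = 0;
    for (std::size_t i = start; i < start + length; ++i)
        sum += quality[i] - phred_offset;
    return sum / length;
}
```

With the quality string "!!II" ('!' encodes 0 and 'I' encodes 40), the window starting at position 2 with length 2 has an average quality of 40.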

2016-07-13  The Vidjil Team
	* New json 2016b format, renamed fields 'seg.{5,4,3,evalue}' and 'diversity', 1-based positions (core/segment.cpp)
	* New tool to display and debug alignments between a read and selected V/J genes (tools/vdj_assign.cpp)
	* Better test structure and process. The should-vdj tests will be soon moved to their own repository.
	* Bugs closed (build process, affine gaps (core/dynprog.cpp), JUNCTION/CDR3 detection (core/segment.cpp))
	* New and updated functional tests

2016-03-04  The Vidjil Team
	* Better JUNCTION/CDR3 detection (-3), based on positions of Cys104 and Phe118/Trp118 (core/segment.cpp)
	* Bugs closed (.vidjil output of revcomp'd sequences, computation of Simpson index Ds)
	* Streamlined structure of tests directory
	* New and updated unit and functional tests

2016-02-08  The Vidjil Team
	* New default threshold (-z 100), fully analyzing more clones
	* New computation of diversity indices (Shannon, Simpson) (core/windows.cpp)
	* Streamlined KmerSegmenter test (core/affectanalyser.cpp)
	* Safer p-value estimation in KmerSegmenter, taking into account the central region (core/affectanalyser.cpp)
	* New experimental VDDJ detection analysis (-d) (core/segment.cpp)
	* Better FineSegmenter V(D)J designation for inverted genes in unexpected recombination analysis (core/segment.cpp)
	  The unexpected recombination analysis (-2) can now safely be used.
	* Streamlined FineSegmenter handling of V, D and J segments (core/segment.cpp)
	* Faster FineSegmenter (~30%), relying on the KmerSegmenter to select the correct strand (core/segment.cpp)
	* Updated germline genes from IMGT/GENE-DB
	* Bugs closed (.vidjil output for sequences beyond the -z threshold)
	* New and updated unit and functional tests
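
Note on the diversity indices entry above: the changelog names the Shannon and Simpson indices without giving formulas. The sketch below uses common textbook definitions, H = -sum(p_i ln p_i) and Ds = 1 - sum(n_i (n_i - 1)) / (N (N - 1)), where n_i is the number of reads of clone i and N the total. Whether core/windows.cpp uses exactly these variants is an assumption, and the type and function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical result type, not taken from core/windows.cpp.
struct DiversityIndices {
    double shannon;  // H  = -sum p_i * ln(p_i)
    double simpson;  // Ds = 1 - sum n_i (n_i - 1) / (N (N - 1))
};

// Computes both indices from the read count of each clone.
DiversityIndices diversity_indices(const std::vector<unsigned long>& clone_sizes) {
    double total = 0;
    for (unsigned long n : clone_sizes) total += n;
    assert(total >= 2);  // Ds is undefined for fewer than two reads

    double shannon = 0, dominance = 0;
    for (unsigned long n : clone_sizes) {
        if (n == 0) continue;  // empty clones contribute nothing
        const double p = n / total;
        shannon -= p * std::log(p);
        dominance += static_cast<double>(n) * (n - 1.0);
    }
    return {shannon, 1.0 - dominance / (total * (total - 1.0))};
}
```

Two clones of 5 reads each give H = ln 2 and Ds = 5/9, whereas a single clone holding all reads gives 0 for both indices.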

2015-12-22  The Vidjil Team
	* Better KmerSegmenter, rejecting reads with large alternating V/J zones (core/segment.cpp)
	* Better unsegmentation causes, 'UNSEG only V/J' needs significant V/J fragments (core/segment.cpp)
	* Renamed unsegmentation messages to "only V/5'" and "only J/3'" to avoid confusion (core/segment.h)
	* Better unexpected recombination analysis (-2), with a FineSegmenter V(D)J assignation (core/segment.cpp)
	* Faster FineSegmenter (~35%), computing roughly half of the dynamic programming matrix (core/segment.cpp)
	* More flexible handling of badly formatted .fastq files (core/fasta.cpp)
	* New filtering/debug option (-uu), sort unsegmented reads (core/windowExtractor.cpp)
	* Streamlined json output, removing short codes (vidjil.cpp, core/segment.cpp)
	* Updated and refactored help (doc/algo.org)
	* Updated unit and functional tests
	* Bugs closed (build process, -X with large numbers, -x/-X combined with -c segment, stdout messages)

2015-10-08  The Vidjil Team
	* Bug closed (V(D)J assignation when V/D/J segments are close on the negative strand) (core/segment.cpp)

2015-10-05  The Vidjil Team
	* Better FineSegmenter V(D)J assignation, especially when V/D/J segments are close (core/segment.cpp)
	* Better e-value computation for FineSegmenter V(D)J assignation (core/segment.cpp)
	* Renamed 'UNSEG too few J/V' to 'UNSEG only V/J' to better reflect the actual detection of one part
	* Refactored tests for V(D)J assignation, allowing flexible patterns, new documentation (doc/should-vdj.org)
	* New tests on sequences with manually curated V(D)J assignations (tests/should-vdj-tests)
	* Bugs closed (no more spurious D in some VJ recombinations, corrected number of deletions around D regions)

2015-07-21  The Vidjil Team
        * New flexible parameterization of analyzed recombinations through a json file (germline/germlines.data)
        * New experimental unexpected recombination analysis (-4 -e 10) (core/segment.cpp)
        * New threshold for FineSegmenter VJ assignation, with at least 10 matches (core/germline.cpp)
        * Streamlined handling of segmentation methods (core/germline.h, core/segment.cpp)
        * Updated distance matrix computation between all clones (core/similarityMatrix.cpp)
        * New nlohmann json library to parse and write json files (lib/json.h)
        * Updated build process, now requiring a C++11 compiler
        * New draft developer documentation (doc/dev.org), updated help and user documentation
        * New and updated unit and functional tests, bugs closed in shouldvdj tests

2015-06-05  The Vidjil Team
        * New default trim option (-t 100), considering only the relevant ends of the germline genes
        * Better segmentation heuristic when there are few k-mers (core/segment.cpp)
        * Better TRA/TRD locus analysis (-i), including both Vd-(Dd)-Ja and Dd-Ja recombinations (core/germline.cpp)
        * Better incomplete TRD and Dh/Jh analysis (-i), including up/downstream region of D genes (core/germline.cpp)
        * Better computation of the reference length for the coverage information, considering all reads of each clone
        * Better unexpected recombination analysis (-2), with information on the locus used (core/segment.cpp)
        * Streamlined again unsegmentation causes and removed delta_{min,max} for the heuristic (core/segment.cpp)
        * Refactored stats computation (core/read_storage.cpp)
        * Updated build process
          The next release will require a C++11 compiler. Static binaries will also be distributed.
        * Updated help (unsegmentation causes, clustering options, -m option)
        * Bugs closed (segmentation with k-mers on the negative strand, e-value computation, clone output on stdout)
        * New and updated unit and functional tests, and faster functional tests

2015-05-08  The Vidjil Team
        * New default e-value threshold (-e 1.0), improving the segmentation heuristic
        * New default for window length (-w 50), even with -D or with -g, streamlining the window handling
        * Better multi-germline analysis (-g), selecting the best locus on the e-value (core/segment.cpp)
        * New experimental trim option (-t), considering only the relevant ends of the germline genes (core/kmerstore.h)
        * Streamlined unsegmentation causes, including 'too short for w(indow)'
        * Updated main and debug output
        * Updated help (algo.org, and new locus.org)
        * New option to keep only reads with labeled windows (-F), new combo (-FaW) to filter reads by window
        * Bugs closed (non symmetrical seeds and revcomp)
        * New and updated unit and functional tests

2015-04-09  The Vidjil Team
        * New experimental e-value threshold (-e) (core/segment.cpp)
        * New experimental unexpected recombination analysis (-2) (core/segment.cpp)
        * New preview/debug options to stop after a given number of reads (-x), possibly sampled throughout the file (-X)
        * New preview/debug option to output unsegmented reads as clones (-!)
        * New progress bar during computation
        * Better memory management for the reads taken into account for the representative (core/read_storage.cpp)
        * Updated .json output, with k-mer affectation results
        * Updated .json output, with the 'coverage' of the representative (core/representative.cpp)
        * Removed unused code parts as well as some files
        * Updated help, separating basic (-h) and advanced/experimental options (-H)
        * Bugs closed (extended nucleotides and revcomp)
        * New and updated unit and functional tests

2015-03-04  The Vidjil Team
	* Better multi-germline analysis (-g), returning the best locus for each read (core/segment.cpp)
	  The incomplete rearrangement analysis (-i) can now safely be used.
	  The speed of this multi-germline analysis will be improved in a next release.
	* Faster representative computation (core/read_chooser.cpp)
	* Included tools (tools/*.py) to process .vidjil files
	* New statistics on the number of clones (core/germline.cpp)
	* New experimental CDR3 detection (-3). We still advise to use IMGT/V-QUEST for better and complete results.
	* Better debug option -K (.affect), especially in the case of multi-germline analysis
	* Refactored dynamic programming computations (core/dynprog.cpp), experimental affine gaps
	* Removed unused code parts as well as some files
	* Streamlined flag processing in Makefiles
	* New mechanism for some functional tests (make shouldvdj)
	* New experimental tests from generated recombinations (make shouldvdj_generated)
	* New and updated unit and functional tests

2015-01-31  The Vidjil Team
	* Better TRG and TRD+ parameters (-g). The multi-germline analysis will again be improved in a next release.
	* New experimental option (-I) to discard common kmers between different germlines (core/germline.cpp)
	* Updated outputs for better traceability (version in .json, germlines on stdout)
	* New mechanism to retrieve germline databases (germline/get-saved-germline)

2014-12-22  The Vidjil Team
	* Better multi-germline analysis (-g). This will again be improved in a next release.
	* New experimental incomplete rearrangement analysis (-i)
	* New and updated unit and functional tests
	* Bugs closed (-w 40 when no D germline)

2014-11-28  The Vidjil Team
	* New input method, now accepts compressed fasta files with gzip (core/fasta.cpp, gzstream/zlib)
	* Better multi-germline analysis (-g) and documentation. This analysis can now safely be used.
	* Streamlined input. Option -d is removed, and a germline is required (-V/(-D)/-J, or -G, or -g)
	* Removed unused code parts as well as some files
	* New and updated unit and functional tests - now more than 80% code coverage
	* New public continuous integration - travis, coveralls
	* Bugs closed (-l, large -r)

2014-10-22  The Vidjil Team
	* Streamlined filtering options (-r/-y/-z), better documented (doc/algo.org)
	* Streamlined output files, option to fix their basename (-b)
	* Updated .data .json output, now in the better documented 2014.10 format (doc/format-analysis.org)
	* New experimental multi-germline analysis (-g). This will be improved and documented in a next release.
	* Faster FineSegmenter with a better memory allocation (core/dynprog.cpp)
	* Refactored main vidjil.cpp, objects storing germlines and statistics (core/germline.cpp, core/stats.cpp)
	* Transferred clustering from clone output to information in .data, again simplifying vidjil.cpp
	* Removed unused code parts as well as some files
	* New and updated unit and functional tests
	* Bugs closed

2014-09-23  The Vidjil Team
	* Export cause of non-segmentation in the .data
	* New option to output segmented reads (-U), now by default segmented reads are not output one by one
	* Updated .data .json output (the format will change again in a next release)
	* Updated tests

2014-07-28  The Vidjil Team
	* Better heuristic, segment more reads (core/affectanalyser.h, core/segment.cpp)
	  This improved heuristic was designed to implement a multi-germline analysis in a next release.
	* Improved computation of the heuristic affectation. Halves the time of -c windows (core/kmerstore.h)
	* New command '-c germlines', discovering germlines (vidjil.cpp)
	* New unit tests, updated some tests
	* Updated .json output (experimental distance graph)
	* Bugs closed

2014-03-27  The Vidjil Team
	* Better default seed selection, depending on the germline, segments more reads (vidjil.cpp)
	* Better selection of representative read (core/representative.cpp)
	* New option to output all clones (-A), for testing purposes
	* Updated debug option (-u) to display k-mer affectation (core/windowExtractor.cpp)
	* New unit tests, updated some tests
	* Improved management of dependencies (Makefile)
	* Improved documentation and comments on main stdout

2014-02-20  The Vidjil Team
	* Refactored main vidjil.cpp (core/windows.cpp, core/windowExtractor.cpp)
	* Removed unused html output
	* Better json output (core/json.cpp)
	* Updated main stdout, with representative sequence for each clone
	* Updated parameters for FineSegmenter (delta_max) and dynprog (substitution cost)
	* Bugs closed

2013-10-07  The Vidjil Team
	* Better heuristic, segments more reads (core/segment.cpp)
	* Better and faster selection of representative read (vidjil.cpp, core/read_chooser.cpp)
	* Better display of reason of non-segmenting reads
	* New normalization against a standard (-Z) (core/labels.cpp)
	* New experimental lazy_msa multiple aligner
	* New .json output
	* New unit tests
	* Bugs closed

2013-07-03  The Vidjil Team
	* New selection of representative read (core/read_chooser.cpp)
	* Faster spaced seed computation (core/tools.cpp)
	* New unit tests
	* Bugs closed
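
Note on the spaced seed entry above: a spaced seed is a pattern whose positions are either kept or ignored when a k-mer is read from a sequence. The sketch below is a naive illustration of that idea, not the optimized code of core/tools.cpp. The function names are hypothetical, and the convention that '#' marks a kept position and any other character an ignored one is an assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: keeps the characters of `window` located at the
// '#' positions of `seed`; other seed positions are ignored.
std::string apply_spaced_seed(const std::string& window, const std::string& seed) {
    std::string key;
    for (std::size_t i = 0; i < seed.size() && i < window.size(); ++i)
        if (seed[i] == '#') key += window[i];
    return key;
}

// Slides the seed along the read and returns one key per position.
std::vector<std::string> spaced_kmers(const std::string& read, const std::string& seed) {
    std::vector<std::string> keys;
    if (seed.empty() || read.size() < seed.size()) return keys;
    for (std::size_t pos = 0; pos + seed.size() <= read.size(); ++pos)
        keys.push_back(apply_spaced_seed(read.substr(pos, seed.size()), seed));
    return keys;
}
```

With the seed "##-##", the read "ACGTAC" yields the keys "ACTA" and "CGAC": the third nucleotide of each window is skipped, so a mismatch at that position does not change the key.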

2013-04-18  The Vidjil Team
	* First public release