- 19 Apr, 2019 2 commits
-
-
Mathieu Giraud authored
see #3762
-
Mathieu Giraud authored
see #3206
-
- 18 Apr, 2019 1 commit
-
-
Mathieu Giraud authored
see #3206
-
- 28 Feb, 2019 1 commit
-
-
Mikaël Salson authored
With Aho-Corasick we now segment them all!
-
- 12 Jul, 2018 1 commit
-
-
Mikaël Salson authored
Changes in IMGT:
  - Addition of 8 IGHV genes and alleles (8 F): IGHV1-8*03, IGHV1-69*15, IGHV1-69*16, IGHV2-26*02, IGHV2-26*03, IGHV2-70*15, IGHV2-70*16, IGHV2-70*17.
  - Update of reference sequences of IGHV1-45*03 and IGHV2-70*04 (partial sequences were replaced with complete V-REGION).
  - TRAJ13*02 reference sequence is AB258131.
See http://www.imgt.org/IMGTgenedbdoc/dataupdates.html
Other changes in MD5 should be due to changes in headers, but this is not always true:
  - IGKV2D-26*02 has gained 2 nucleotides at its end
  - TRGV2*03 has appeared
We note that the new IGHV genes reduce the number of ambiguous k-mers on the S22 dataset.
-
- 31 Jan, 2018 1 commit
-
-
Mathieu Giraud authored
-
- 29 Jan, 2018 1 commit
-
-
Mathieu Giraud authored
See #3011.
-
- 28 Jan, 2018 1 commit
-
-
Mathieu Giraud authored
Some sequences in stanford-w100 are not clustered anymore.
-
- 17 Jan, 2018 2 commits
-
-
Mathieu Giraud authored
-
Mathieu Giraud authored
See #2987.
-
- 16 Jan, 2018 2 commits
-
-
Mathieu Giraud authored
We add a `-r 1` parameter to output all the windows in the JSON. This helps us to ensure that all “tuned” windows are properly warned in the JSON output. See #2913, #2916. (edited by @magiraud following 6e18874d)
-
Mikaël Salson authored
We make sure that the output specifies that some sequences are segmented with a “tuned” window. See #2913
-
- 10 Jan, 2018 1 commit
-
-
Mathieu Giraud authored
-
- 15 Sep, 2017 1 commit
-
-
Mathieu Giraud authored
See #2611.
-
- 01 Feb, 2017 3 commits
-
-
Mathieu Giraud authored
This implies a change in the -m value. See #1673.
-
Mathieu Giraud authored
They are not happy with -g, as they use specific parameters. See #2156.
-
Mathieu Giraud authored
-
- 31 Jan, 2017 1 commit
-
-
Mathieu Giraud authored
-
- 23 Sep, 2016 1 commit
-
-
Mikaël Salson authored
This reverts bf2f96fe and d317535b. After more than one year of experience with -t 100, we realised that trimming leads to:
  - missing special cases where the V gene is heavily trimmed;
  - reporting misleading unsegmentation causes: UNSEG only J appeared in cases where it did not make sense. Those unsegmentation causes were replaced by UNSEG only V when changing -t to 0 (which also suggests that we could have been missing some V genes).
However, -t 0 does have a drawback that should be addressed: spurious hits of V genes in the N-region are more likely (particularly with VDJ recombinations, where the N-region is longer). Those spurious hits tend to shift the window towards the J gene. In the tests, the modifications are due either to slight changes in the number of windows (a small increase due to the previous reason) or to an increase in unsegmentation with really large windows (since windows are shifted, there may not be enough space left to place a window).
-
- 13 Jul, 2016 1 commit
-
-
Mikaël Salson authored
This will help testing the effect of some parameter on all the functional tests.
-
- 01 Mar, 2016 2 commits
-
-
Mathieu Giraud authored
This will be more flexible.
-
Mathieu Giraud authored
-
- 21 Jan, 2016 1 commit
-
-
Mathieu Giraud authored
With the new estimation of the p-value in affectanalyser.cpp, the following sequence has a p-value on the J side of about 3.58e-04, yielding an e-value of about 4.71 (there are 13,153 reads in Stanford-S22). Setting -e 10 thus allows this sequence to still be segmented. Changing seeds might change these results.
===
>lcl|FLN1FA001D7OE0.1 GGCCTGGAGTGGATTGGGTACATCTATTACAGTGGGAGCACCTACTACAACCCGTCCCTCAAGAGTCGAGTTGCCATATCGGTAGACACGTCTAAGAACCAGTTCTCCCTGAAGTTGAGCTCTGTGACTGCCGCGGACACGGCCGTGTATTATTGTGCGAGAGTAGCAGCGGCTGCTCTTGACTCCTTGGGGCCAGGGAAGCCTGGTCACCTCTCCTCAGG 73 + VJ 0 158 187 220 seed IGH SEG_+ 3.583786e-04 2.531831e-182/3.583786e-04 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X _ _ _ _ _ _+X _ _ _ _ _ _ _+X _ _ _ _ _ _+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X _ _ _ _ _ _+X _ _ _ _ _ _+X+X ?+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X _ _ _ _+X+X+X _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _+x _ _ _ _ _ _+x _ _ _ _ _ _ _ _ _ _ _ _ _ _
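The e-value quoted in this message can be checked by hand. The sketch below assumes that the e-value is simply the p-value multiplied by the number of reads, which is consistent with the figures above; the function names, the default threshold of 1.0, and the "at most the threshold" acceptance rule are illustrative assumptions, not taken from affectanalyser.cpp.

```python
# Assumption: e-value = p-value * number of reads (matches the figures in the message).
def e_value(p_value: float, n_reads: int) -> float:
    return p_value * n_reads


# Assumption: a read is kept when its e-value does not exceed the -e threshold.
def passes(e: float, threshold: float) -> bool:
    return e <= threshold


p_j = 3.583786e-04  # p-value on the J side, copied from the output above
n_reads = 13_153    # number of reads in Stanford-S22

e = e_value(p_j, n_reads)
print(f"e-value = {e:.2f}")

# 1.0 is a hypothetical stricter threshold; 10 is the value set with -e 10.
for threshold in (1.0, 10.0):
    verdict = "segmented" if passes(e, threshold) else "not segmented"
    print(f"-e {threshold:g}: {verdict}")
```

With these numbers the read is rejected at a threshold of 1.0 and kept at 10, which is why the test sets -e 10.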
-
- 28 May, 2015 2 commits
-
-
Mathieu Giraud authored
There are common k-mers (with the non-default seed #####-#####) between IGHV and IGHD. Interestingly, the number of detected junctions is the same in the new version, but there are more segmented reads, as the large '-w 100' option needs a properly positioned window.
-
Mikaël Salson authored
More sequences are segmented in Stanford_S22
-
- 11 May, 2015 3 commits
-
-
Mathieu Giraud authored
The tests using Stanford-S22 mostly test things on the extracted windows, sometimes on the representatives, but almost never on the fine segmentation. We thus add '-y 0', '-z 0', or similar options to these tests. On one laptop, running all *.should_get tests now takes 2'34 instead of 3'48.
-
Mikaël Salson authored
Interestingly, with -w 100, the number of found windows is much larger because the window is now better centered around the junction. Before, we could have spurious hits in the N-REGION which would have shifted the window towards the 3' end. Additionally, a few windows seem to be factorised. (updated/merged by @magiraud)
-
Mikaël Salson authored
4 genes added in IGHV (welcome on board): IGHV3-30-22, IGHV3-30-33, IGHV3-30-42, IGHV3-30-52. 5 IGKV genes removed (IGKV3-NL[1-5]) and 2 added (IGKV3D-11, IGKV3D-20)
-
- 17 Apr, 2015 2 commits
-
-
Mathieu Giraud authored
-
Mathieu Giraud authored
-
- 11 Apr, 2015 2 commits
-
-
Mathieu Giraud authored
-
Mathieu Giraud authored
  - stanford*: one sequence in S22 is not segmented
  - trd-dd2-dd3*: the short Dd2-Dd3 example has an e-value slightly above 1.0, we set -e 10
  - chimera: one failed test is now passing thanks to the e-value
-
- 11 Mar, 2015 1 commit
-
-
Mathieu Giraud authored
As k-mers with extended nucleotides are now ignored, the count of k-mers in '-c germline' has slightly changed (< 0.05% variation in 'stanford-germlines.should_get'). There are also slight changes in the number of windows found in 'stanford-w100.should_get'.
-
- 26 Feb, 2015 1 commit
-
-
Mathieu Giraud authored
-
- 25 Nov, 2014 1 commit
-
-
Mathieu Giraud authored
Additionally, one test has a different output due to a previously forgotten '-d' option.
-
- 09 Sep, 2014 1 commit
-
-
Mathieu Giraud authored
-
- 25 Jul, 2014 1 commit
-
-
Mikaël Salson authored
N is now transformed to T (because of bitwise operations) instead of A and it appears that a window already exists with a T, so the number of distinct windows slightly decreases.
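A toy illustration of why the count drops. It does not model the bitwise encoding itself, only its consequence: the replacement letter chosen for N decides whether a window merges with one that already exists. The window strings and the `distinct_windows` helper are made up for the example.

```python
def distinct_windows(windows, n_replacement):
    """Count distinct windows once every N is replaced by the given nucleotide."""
    return len({w.replace("N", n_replacement) for w in windows})


# Illustrative windows: the second one differs from the first only by an N.
windows = ["ACGTTCA", "ACGNTCA", "GGGTACA"]

with_a = distinct_windows(windows, "A")  # N -> A keeps the second window distinct
with_t = distinct_windows(windows, "T")  # N -> T merges it with the first window
print(with_a, with_t)
```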
-
- 07 Jul, 2014 1 commit
-
-
Mikaël Salson authored
No worries: this comes from the fact that this test uses a very large window. A slight modification of the segmentation may therefore shift it to the right and prevent a window from being extracted. There were several cases with an affectation like this one: +V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V ?+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V ?+V+V+V+V+V+V+V+V ? ? ?+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V ?+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V ?+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _+V _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _+V _ _ _ _ _ _ _ _ _+J+J ?+J+J+J+J+J+J+J+J+J+J+J+J+J+J+J+J+J+J+J+J+J+J+J ?+J+J+J+J where the last +V was not taken into account by the previous heuristic, but is by the current one.
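The effect described here can be sketched with a few lines of arithmetic. The sketch assumes that a window of length w is centered on the estimated junction position and is dropped when it does not fit entirely inside the read; `extract_window`, the centering rule, and the numbers are hypothetical and only illustrate the reasoning.

```python
from typing import Optional, Tuple


def extract_window(read_length: int, center: int, w: int) -> Optional[Tuple[int, int]]:
    """Return (start, end) of a window of length w centered on `center`,
    or None when the window would fall outside the read."""
    start = center - w // 2
    end = start + w
    if start < 0 or end > read_length:
        return None
    return (start, end)


# Illustrative values: a 200 bp read and a large window (w = 100).
print(extract_window(200, 140, 100))  # the window fits inside the read
print(extract_window(200, 155, 100))  # center shifted right by 15: no window
```

With a small window the same shift would be harmless, which is why only the tests using a very large window are affected.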
-
- 16 Apr, 2014 1 commit
-
-
Mikaël Salson authored
When a k-mer, for the same label, is seen on both strands, we were (actually, I was) putting the + strand by default. But that is not robust to reverse-complemented sequences. This explains some differing results between strand + and strand - in very specific cases (mainly with small k). Results from the stanford-w100 test are updated accordingly.
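A minimal sketch of the problem, not the actual index code. The message does not say how the ambiguity is resolved; the sketch assumes that a k-mer seen on both strands is recorded with an unknown strand ('?') instead of '+'. The names `build_index`, `kmers`, and `revcomp` and the toy sequence are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def kmers(seq: str, k: int):
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(germline: str, label: str, k: int) -> dict:
    """Map each k-mer to (label, strand). Assumption: a k-mer seen on both
    strands for the same label gets strand '?' rather than '+' by default."""
    index = {}
    for strand, seq in (("+", germline), ("-", revcomp(germline))):
        for km in kmers(seq, k):
            if km in index and index[km][1] != strand:
                index[km] = (label, "?")
            else:
                index[km] = (label, strand)
    return index


germline = "ACGTTG"  # toy sequence; ACGT is its own reverse complement
index = build_index(germline, "V", 4)
index_rc = build_index(revcomp(germline), "V", 4)
print(index["ACGT"], index["CGTT"], index["AACG"])
```

Indexing the reverse-complemented sequence then yields the same k-mers with '+' and '-' swapped and '?' unchanged. With a "+ by default" rule, the ambiguous k-mer would be '+' in both indexes, so the two orientations would not be treated symmetrically.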
-
- 15 Apr, 2014 1 commit
-
-
Mikaël Salson authored
-