1. 19 Apr, 2019 2 commits
  2. 18 Apr, 2019 1 commit
  3. 28 Feb, 2019 1 commit
  4. 12 Jul, 2018 1 commit
    • germline, algo/tests: New germlines taken into account · a79ff8de
      Mikaël Salson authored
      Changes in IMGT:
      - Addition of 8 IGHV genes and alleles (8 F): IGHV1-8*03, IGHV1-69*15, IGHV1-69*16, IGHV2-26*02, IGHV2-26*03, IGHV2-70*15, IGHV2-70*16, IGHV2-70*17.
      - Update of reference sequences of IGHV1-45*03 and IGHV2-70*04 (partial sequences were replaced with complete V-REGION).
      - TRAJ13*02 reference sequence is AB258131.
      
      See http://www.imgt.org/IMGTgenedbdoc/dataupdates.html
      
      Other MD5 changes should be due to changes in headers, but this is not always the case:
      - IGKV2D-26*02 has gained 2 nucleotides at its end
      - TRGV2*03 has appeared
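
To tell a header-only change from a real sequence change, one option is to hash the sequences separately from the whole file. The following is a minimal sketch under stated assumptions: the function names and the toy FASTA records are hypothetical, and this is not the checksum procedure used by the project.

```python
import hashlib


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def file_digest(text):
    """MD5 of the whole file: changes when headers OR sequences change."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def sequence_digest(text):
    """MD5 of the sequences only: unaffected by header edits."""
    md5 = hashlib.md5()
    for _, seq in parse_fasta(text):
        md5.update(seq.upper().encode("ascii"))
        md5.update(b"\n")  # record separator
    return md5.hexdigest()


# Toy records (hypothetical, not real germline data).
old = ">toy_gene*01|note A\nacgt\nacgt\n"
header_changed = ">toy_gene*01|note B\nacgt\nacgt\n"
sequence_changed = ">toy_gene*01|note A\nacgt\nacgtgg\n"  # 2 nucleotides gained at the end

print(file_digest(old) != file_digest(header_changed))          # True
print(sequence_digest(old) == sequence_digest(header_changed))  # True
print(sequence_digest(old) != sequence_digest(sequence_changed))  # True
```

A file whose whole-file digest changes while its sequence digest stays the same has only had its headers modified.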
      
      We note that the new IGHV genes reduce the number of ambiguous k-mers on the S22 dataset.
  5. 31 Jan, 2018 1 commit
  6. 29 Jan, 2018 1 commit
  7. 28 Jan, 2018 1 commit
  8. 17 Jan, 2018 2 commits
  9. 16 Jan, 2018 2 commits
  10. 10 Jan, 2018 1 commit
  11. 15 Sep, 2017 1 commit
  12. 01 Feb, 2017 3 commits
  13. 31 Jan, 2017 1 commit
  14. 23 Sep, 2016 1 commit
    • vidjil.cpp (tests and doc): -t option defaults to 0 again · cd292aa0
      Mikaël Salson authored
      This reverts bf2f96fe and d317535b.
      
      After more than one year of experience with -t 100, we realised that trimming leads to:
      – missing special cases where the V gene is heavily trimmed
      – reporting misleading unsegmentation causes: UNSEG only J appeared in cases where it did not make sense;
        those causes were replaced by UNSEG only V when changing -t to 0 (which also suggests that we could be missing some V genes)
      
      However, -t 0 does have a drawback that should be addressed: spurious hits
      of V genes are more likely in the N-region (particularly with VDJ recombinations,
      where the N-region is longer). Those spurious hits tend to shift the window towards
      the J gene.
      
      In the tests, the modifications are due either to slight changes in the number of windows
      (a small increase due to the previous reason) or to an increase in unsegmentation
      with really large windows (since windows are shifted, there may not be
      enough space left to place a window).
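
The effect of a shifted window on large -w values can be illustrated with a small sketch. The function `window_bounds`, the rule that a window is simply centered on one position, and the toy read length are all assumptions made for illustration; the actual window placement in vidjil.cpp may differ.

```python
def window_bounds(read_length, center, w):
    """Return (start, end) of a length-w window centered on `center`,
    or None when the window does not fit inside the read.

    Hypothetical helper: a plain centered window, no repositioning.
    """
    start = center - w // 2
    end = start + w
    if start < 0 or end > read_length:
        return None
    return start, end


# Toy values: a 200-nt read and a window of 100.
print(window_bounds(200, 150, 100))  # (100, 200): just fits
print(window_bounds(200, 155, 100))  # None: shifted 5 nt towards the 3' end
print(window_bounds(200, 155, 50))   # (130, 180): a smaller window still fits
```

With a large window, a shift of a few nucleotides towards the J gene is enough to run past the end of the read, which is consistent with the increase in unsegmented reads seen only in the large-window tests.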
  15. 13 Jul, 2016 1 commit
  16. 01 Mar, 2016 2 commits
  17. 21 Jan, 2016 1 commit
    • tests: update tests, -e 10 for Stanford_S22.fasta · 88c3213c
      Mathieu Giraud authored
      With the new estimation of the p-value in affectanalyser.cpp, the following sequence
      has a p-value on the J side of about 3.58e-04, yielding an e-value of about 4.71 (there are 13,153 reads in Stanford-S22).
      
      Setting -e 10 thus allows this sequence to still be segmented.
      Changing seeds might change these results.
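
The figures above are consistent with an e-value obtained by multiplying the p-value by the number of reads. A short check, assuming that relation (the function name is hypothetical and this is not the code of affectanalyser.cpp):

```python
def e_value(p_value, n_reads):
    """Expected number of reads reaching this p-value by chance (assumed: p * n)."""
    return p_value * n_reads


p_j = 3.583786e-04   # p-value on the J side, from the output line of this commit
n_reads = 13153      # reads in Stanford-S22

e = e_value(p_j, n_reads)
print(f"{e:.2f}")    # 4.71
print(e < 10)        # True: accepted with -e 10
```

Under this assumption, any threshold below about 4.71 would leave the sequence unsegmented.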
      
      ===
      
      >lcl|FLN1FA001D7OE0.1
      GGCCTGGAGTGGATTGGGTACATCTATTACAGTGGGAGCACCTACTACAACCCGTCCCTCAAGAGTCGAGTTGCCATATCGGTAGACACGTCTAAGAACCAGTTCTCCCTGAAGTTGAGCTCTGTGACTGCCGCGGACACGGCCGTGTATTATTGTGCGAGAGTAGCAGCGGCTGCTCTTGACTCCTTGGGGCCAGGGAAGCCTGGTCACCTCTCCTCAGG
      73 + VJ      0 158 187 220    seed IGH SEG_+ 3.583786e-04 2.531831e-182/3.583786e-04
       _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X _ _ _ _ _ _+X _ _ _ _ _ _ _+X _ _ _ _ _ _+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X _ _ _ _ _ _+X _ _ _ _ _ _+X+X ?+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X+X _ _ _ _+X+X+X _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _+x _ _ _ _ _ _+x _ _ _ _ _ _ _ _ _ _ _ _ _ _
  18. 28 May, 2015 2 commits
  19. 11 May, 2015 3 commits
    • tests: stanford-*.should_get, faster tests · bf0c408a
      Mathieu Giraud authored
      The tests using Stanford-S22 mostly test things on the extracted windows,
      sometimes on the representatives, but almost never on the fine segmentation.
      We thus add '-y 0', '-z 0', or similar options to these tests.
      
      On one laptop, running all *.should_get tests now takes 2'34 instead of 3'48.
    • tests: Update tests with new default -t option · d317535b
      Mikaël Salson authored
      Interestingly, with -w 100, the number of windows found is much larger
      because the window is now better centered around the junction.
      Before, we could have spurious hits in the N-REGION, which would
      have shifted the window towards the 3' end.
      
      Additionally, a few windows seem to be factorised.
      
      (updated/merged by @magiraud)
    • germline: Changes in IMGT germlines. · 9ef93faa
      Mikaël Salson authored
      4 genes added in IGHV (welcome on board): IGHV3-30-22, IGHV3-30-33, IGHV3-30-42, IGHV3-30-52.
      5 IGKV genes removed (IGKV3-NL[1-5]) and 2 added (IGKV3D-11, IGKV3D-20)
  20. 17 Apr, 2015 2 commits
  21. 11 Apr, 2015 2 commits
  22. 11 Mar, 2015 1 commit
    • tests: update other tests · 9026fd39
      Mathieu Giraud authored
      As k-mers with extended nucleotides are now ignored, the count of k-mers
      in '-c germline' has slightly changed (< 0.05% variation in 'stanford-germlines.should_get').
      There are also slight changes in the number of windows found in 'stanford-w100.should_get'.
  23. 26 Feb, 2015 1 commit
  24. 25 Nov, 2014 1 commit
  25. 09 Sep, 2014 1 commit
  26. 25 Jul, 2014 1 commit
  27. 07 Jul, 2014 1 commit
    • Tests: Fewer windows found in Stanford S22 with -w 100 · c50c44c2
      Mikaël Salson authored
      No worries: this comes from the fact that this test uses a very large
      window. Therefore a slight modification of the segmentation may shift the
      segmentation to the right and may prevent a window from being extracted.
      
      There were several cases with an affectation like this one:
      +V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V ?+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V ?+V+V+V+V+V+V+V+V ? ? ?+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V ?+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V ?+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V+V _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _+V _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _+V _ _ _ _ _ _ _ _ _+J+J ?+J+J+J+J+J+J+J+J+J+J+J+J+J+J+J+J+J+J+J+J+J+J+J ?+J+J+J+J
      
      Here the last +V was not taken into account by the previous heuristic, but
      is by the current one.
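
An affectation string like the one above can be inspected with a short script. This sketch assumes that each k-mer is rendered as exactly two characters (`+V`, `+J`, ` _` for no affectation, ` ?` for an ambiguous one); the helper name and the toy string are hypothetical, and the script does not reproduce either segmentation heuristic.

```python
from collections import Counter


def tokenize(affects):
    """Split an affectation string into two-character tokens (assumed layout)."""
    return [affects[i:i + 2] for i in range(0, len(affects), 2)]


# Toy string with an isolated +V inside a run of unaffected k-mers.
toy = "+V+V+V ?+V _ _+V _ _+J+J ?+J"
tokens = tokenize(toy)

counts = Counter(tokens)
last_v = max(i for i, t in enumerate(tokens) if t == "+V")
first_j = min(i for i, t in enumerate(tokens) if t == "+J")

print(counts["+V"], counts["+J"], counts[" _"], counts[" ?"])  # 5 3 4 2
print(last_v, first_j)  # 7 10
```

Whether the isolated `+V` at index 7 counts or not moves the apparent end of the V region by three k-mers in this toy case, which is the kind of shift that matters with a very large window.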
  28. 16 Apr, 2014 1 commit
    • Affect: Bug correction regarding different strands with Kmer(String)Affect · 5f63020a
      Mikaël Salson authored
      When a k-mer, for the same label, is seen on both strands, we were (actually, I
      was) assigning the + strand by default. But that is not robust to
      reverse-complemented sequences. That explains some differing results between
      strand + and strand - in very specific cases (mainly with small k).
      
      Results from the stanford-w100 test are updated accordingly.
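
Why a default + strand breaks the symmetry between a read and its reverse complement can be shown on a toy index. Everything in this sketch is an assumption made for illustration: the function names, the toy germline, and in particular the remedy of marking such k-mers with `?`, since the commit message does not say which fix was adopted in Kmer(String)Affect.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
FLIP = {"+": "-", "-": "+"}


def revcomp(seq):
    return seq.translate(COMPLEMENT)[::-1]


def kmers(seq, k):
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(germline, k, both_strands_as):
    """Map each k-mer to a strand; k-mers seen on both strands get `both_strands_as`."""
    forward = set(kmers(germline, k))
    reverse = set(kmers(revcomp(germline), k))
    index = {}
    for km in forward | reverse:
        if km in forward and km in reverse:
            index[km] = both_strands_as
        elif km in forward:
            index[km] = "+"
        else:
            index[km] = "-"
    return index


def affect(read, index, k):
    """Strand of each k-mer of the read, '_' when the k-mer is not indexed."""
    return [index.get(km, "_") for km in kmers(read, k)]


def is_revcomp_consistent(read, index, k):
    """True when the reverse-complemented read gets the mirrored, strand-flipped result."""
    expected = [FLIP.get(s, s) for s in reversed(affect(read, index, k))]
    return affect(revcomp(read), index, k) == expected


germline = "GGACGTCA"  # toy sequence; contains the palindromic 4-mer ACGT
k = 4

print(is_revcomp_consistent(germline, build_index(germline, k, "+"), k))  # False
print(is_revcomp_consistent(germline, build_index(germline, k, "?"), k))  # True
```

Short k-mers are more likely to occur on both strands of the same germline, which matches the observation that the problem showed up mainly with small k.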
  29. 15 Apr, 2014 1 commit