Rayan Chikhi authored
- Graph simplifications are moved from Minia to GATB-core. Call them using: graph.simplify()
- New data structure for faster neighborhood queries. Enable it using: graph.precomputeAdjacency()
- As a consequence of these optimizations, if you are only interested in degrees, it is now much faster to call indegree(node), outdegree(node) and degree(node,size_t &in,size_t &out) than neighbors(node, direction).size().
- The LargeInt (the type of kmer) constructor has been removed, for speed reasons. Be warned that this might break existing code that implicitly relies on 0-initialization of kmers, but such problems can hopefully be detected using valgrind.
- Graph becomes GraphTemplate<Node,Edge,GraphDataVariant>, and compatibility is preserved via typedefs.
- This makes it possible to define GraphFast<span>, a graph object which only holds Nodes and Edges for a single k-mer size, as opposed to a boost::variant of multiple kmer sizes as before. It is faster.
- Graph API has been changed: neighbors<Node> becomes neighbors, neighbors<Edge> becomes neighborsEdge, iterator<Node> becomes iterator, iterator<BranchingNode> becomes iteratorBranching.
- The change above was necessary because it is difficult to specialize nested templates in C++. Not all templated graph functions have been un-templated yet (because some aren't used in conjunction with GraphFast). There is still work to do.
- Due to the graph template, the following classes have also been templatized: BranchingAlgorithm, all Frontline classes, all Terminator classes. Typedefs have been created to preserve compatibility.
- For speed of tools, it is now advised to follow Minia.cpp's functor technique and use GraphFast<span> instead of Graph. However, Graph should still work and offer the same performance as before.
- GraphData is moved from Graph.cpp to Graph.hpp
- MPHF index of a node is now cached in the Node object. Because of that, 'const Node&' should now be just 'Node&', everywhere.
- added a function graph.disableNodeState() to disable recording node state (normal, deleted, marked). Graph then avoids making MPHF queries when checking if a node exists (a check also involved in neighbors() queries). This makes the bloom flavor of graphs faster, but it is no longer relevant once precomputeAdjacency() has been called.
- added scripts/parse_gcc_output.py for visual inspection of gcc compilation/link errors involving graph templates
- slightly modified src/CMakeLists so that tools may set their own KSIZE_DEFAULT_LIST (e.g. Minia)
- speedup to LargeInt::hash1()
- a few unit tests have been added, and one benchmark has been significantly improved: bench_graph.cpp
- added minimizer stuff that was missing from last commit (some bugfixes, and also specialization to LargeInt<1>)
- LargeInts are not instances of ArrayData anymore. Instead, ArrayData is a member. This is faster.
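
On the LargeInt constructor removal noted above: a minimal sketch of the kind of code that can silently break, and one way to make the zeroing explicit. ToyLargeInt and makeZeroKmer are hypothetical names invented for this illustration; this is not the GATB-core LargeInt class, only a type that likewise has no user-provided constructor.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a k-mer integer type. This is NOT GATB-core's
// LargeInt; it only mimics "no user-provided constructor".
template <std::size_t precision>
struct ToyLargeInt {
    std::uint64_t value[precision];

    // Explicitly set the integer to a 64-bit value, clearing the upper words.
    void setVal(std::uint64_t v) {
        value[0] = v;
        for (std::size_t i = 1; i < precision; ++i) value[i] = 0;
    }

    // True if every word is zero.
    bool isZero() const {
        for (std::size_t i = 0; i < precision; ++i)
            if (value[i] != 0) return false;
        return true;
    }
};

// Returns a k-mer that is guaranteed to be zero.
template <std::size_t precision>
ToyLargeInt<precision> makeZeroKmer() {
    // "ToyLargeInt<precision> k;" would leave k.value indeterminate, and
    // reading it is undefined behavior. Empty braces zero every word.
    ToyLargeInt<precision> k{};
    return k;
}
```

Code that previously wrote the equivalent of "ToyLargeInt<2> k;" and then read k without assigning it is the pattern to look for; replacing it with an explicit zeroing such as the one above removes the dependence on the old constructor.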
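
On the advice above to use a functor together with GraphFast<span>: a rough sketch of the general idea, written with toy types only. ToyGraphFast, ToyTool and dispatchOnKmerSize are hypothetical names, and the span values 32/64/96/128 are assumed for illustration; the actual code in Minia.cpp may be organized differently. The point is that the tool logic is written once as a template on span, and the k-mer size chosen at runtime selects which instantiation runs, so no variant lookup is needed inside the hot loops.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a graph specialized for a single k-mer size
// class. This is NOT GATB-core's GraphFast.
template <std::size_t span>
struct ToyGraphFast {
    std::size_t kmerSize;
};

// The tool logic, written once and templated on span.
template <std::size_t span>
struct ToyTool {
    std::string operator()(std::size_t kmerSize) const {
        ToyGraphFast<span> graph{kmerSize};
        return "span=" + std::to_string(span) +
               " k=" + std::to_string(graph.kmerSize);
    }
};

// Runtime-to-compile-time dispatch: pick the smallest assumed span that
// can hold the requested k-mer size and run the matching instantiation.
template <template <std::size_t> class Functor>
std::string dispatchOnKmerSize(std::size_t kmerSize) {
    if (kmerSize < 32)  return Functor<32>()(kmerSize);
    if (kmerSize < 64)  return Functor<64>()(kmerSize);
    if (kmerSize < 96)  return Functor<96>()(kmerSize);
    if (kmerSize < 128) return Functor<128>()(kmerSize);
    throw std::invalid_argument("k-mer size not covered by the compiled spans");
}
```

With these toy definitions, dispatchOnKmerSize<ToyTool>(31) runs the span-32 instantiation and dispatchOnKmerSize<ToyTool>(32) runs the span-64 one.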
223f9743