Newer
Older
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
* ☄️ A fast converter from Newick trees to Phylip distance matrices
Typical computation of distance between leaves in a tree takes \(O(n)\) time.
Hence, computing pairwise distance between leaves of the tree would take
\(O(n^3)\) time. While polynomial, it becomes already impractical for moderate
collections (eg. 1000 genomes).
The `nwk2phy` relies on the folklore construction of datastructure able to
answer Lowest Common Ancestor in the tree in constant time. This comes at the
price of extra linear space, build during a linear time preprocessing phase.
Afterward, the distance queries can be answered in \(O(1)\) time, leading to an
(asymptotically) optimal complexity of \(O(n^2)\).
*Note.* This program uses an advanced approach, supplying its speed, but is no
further optimized.
** 🖥️ Usage
#+begin_src text
┓ ┏┓ ┓
┏┓┓┏┏┃┏┏┛┏┓┣┓┓┏
┛┗┗┻┛┛┗┗━┣┛┛┗┗┫ v.0.1
┛ ┛
A simple converter from Newick trees to Phylip distance matrices in O(n^2) time
Copyright (c) Léo Ackermann 2024
USAGE
nwk2phy -h/--help
⟹ Display this message
nwk2phy <newick file> --out=<outfile> [--ord=<order>]
⟹ Convert the input Newick file into a Phylip distance matrix
<order> is the order of the matrix rows, the leftmost being the smallest
with respect to the order
LEAVES is the left-to-right leaves order of the tree (default)
LEXICO is the lexicographic order
#+end_src