* ☄️ A fast converter from Newick trees to Phylip distance matrices

Computing the distance between two leaves of a tree typically takes \(O(n)\)
time. Computing all pairwise distances between leaves this way therefore takes
\(O(n^3)\) time. Although polynomial, this is already impractical for
moderately sized collections (e.g. 1000 genomes).

`nwk2phy` relies on a folklore data structure that answers Lowest Common
Ancestor (LCA) queries on the tree in constant time. This comes at the price of
linear extra space, built during a linear-time preprocessing phase. Afterwards,
each distance query is answered in \(O(1)\) time, giving an overall complexity
of \(O(n^2)\), which is asymptotically optimal since the output matrix itself
has \(n^2\) entries.
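
The idea can be sketched in Python as follows. This is only an illustration,
not the code of `nwk2phy`: the function name and the dictionary-based tree
representation are hypothetical, and the sketch uses an Euler tour with a
sparse table, which needs \(O(n \log n)\) preprocessing rather than the linear
time mentioned above. Queries are still answered in constant time, using
\(d(u, v) = d(root, u) + d(root, v) - 2\,d(root, LCA(u, v))\).

#+begin_src python
def build_distance_oracle(children, length, root):
    """Return distance(u, v), answering each query in O(1).

    children: dict mapping a node to the list of its children (hypothetical layout)
    length:   dict mapping a non-root node to the length of the branch above it
    """
    # Euler tour: (depth, node) at every visit, plus first visit index per node
    # and the weighted distance from the root to each node.
    euler = [(0, root)]
    first = {root: 0}
    dist = {root: 0.0}
    stack = [(root, 0, iter(children.get(root, ())))]
    while stack:
        node, depth, it = stack[-1]
        child = next(it, None)
        if child is None:
            stack.pop()
            if stack:  # coming back up: record the parent again
                parent, parent_depth, _ = stack[-1]
                euler.append((parent_depth, parent))
        else:
            dist[child] = dist[node] + length[child]
            first[child] = len(euler)
            euler.append((depth + 1, child))
            stack.append((child, depth + 1, iter(children.get(child, ()))))

    # Sparse table: table[j][i] is the minimum of euler[i : i + 2**j].
    table = [euler]
    j = 1
    while (1 << j) <= len(euler):
        prev, half = table[-1], 1 << (j - 1)
        size = len(euler) - (1 << j) + 1
        table.append([min(prev[i], prev[i + half]) for i in range(size)])
        j += 1

    def lca(u, v):
        lo, hi = sorted((first[u], first[v]))
        k = (hi - lo + 1).bit_length() - 1
        # Two overlapping windows of length 2**k cover euler[lo : hi + 1].
        return min(table[k][lo], table[k][hi - (1 << k) + 1])[1]

    def distance(u, v):
        return dist[u] + dist[v] - 2 * dist[lca(u, v)]

    return distance
#+end_src

For the tree =((A:1,B:2)X:3,C:4)R;=, calling
=build_distance_oracle({"R": ["X", "C"], "X": ["A", "B"]}, {"X": 3, "C": 4, "A": 1, "B": 2}, "R")=
yields a function for which the distance between =A= and =C= is 1 + 3 + 4 = 8.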

*Note.* The speed of this program comes from this algorithmic approach alone;
the implementation is not otherwise optimized.

** 🖥️ Usage

#+begin_src text
     ┓ ┏┓  ┓
┏┓┓┏┏┃┏┏┛┏┓┣┓┓┏
┛┗┗┻┛┛┗┗━┣┛┛┗┗┫  v.0.1
         ┛    ┛

A simple converter from Newick trees to Phylip distance matrices in O(n^2) time
Copyright (c) Léo Ackermann 2024


USAGE
    nwk2phy -h/--help
    ⟹ Display this message

    nwk2phy <newick file> --out=<outfile> [--ord=<order>]
    ⟹ Convert the input Newick file into a Phylip distance matrix

    <order>   is the order of the matrix rows, the leftmost being the smallest
              with respect to the order
              LEAVES is the left-to-right leaves order of the tree (default)
              LEXICO is the lexicographic order
#+end_src
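
To illustrate the effect of the two row orders, here is a minimal Python
sketch that writes a square Phylip-style distance matrix (a first line with
the number of taxa, then one row per taxon). It is not the output routine of
`nwk2phy`: the function name, the 10-character name padding, the six-decimal
formatting, and the use of Python's default string ordering for =LEXICO= are
all assumptions made for the example.

#+begin_src python
def format_phylip(leaves, distance, order="LEAVES"):
    """Return a square Phylip-style distance matrix as a string.

    leaves:   leaf names in left-to-right tree order
    distance: function (u, v) -> number
    order:    "LEAVES" keeps the tree order, "LEXICO" sorts the names
    """
    if order == "LEXICO":
        names = sorted(leaves)  # assumption: plain string comparison
    elif order == "LEAVES":
        names = list(leaves)
    else:
        raise ValueError(f"unknown order: {order}")

    lines = [str(len(names))]
    for u in names:
        row = " ".join(f"{distance(u, v):.6f}" for v in names)
        lines.append(f"{u:<10} {row}")
    return "\n".join(lines) + "\n"
#+end_src

With leaves given as =["C", "A"]=, the =LEAVES= order keeps =C= on the first
row, whereas =LEXICO= puts =A= first.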