Publication

MPBoot: fast phylogenetic maximum parsimony tree inference and bootstrap approximation

Jan 1, 2018 | By: Admin

Introduction

MPBoot is an open-source and efficient program to reconstruct maximum parsimony phylogenetic trees for large DNA and protein sequence alignments. Importantly, MPBoot provides a fast approximation for maximum parsimony bootstrap, inspired by a similar methodology for maximum likelihood (Minh et al., 2013). If you use MPBoot in a paper please cite:

D.T. Hoang, L.S. Vinh, T. Flouri, A. Stamatakis, A. von Haeseler, and B.Q. Minh (2018) MPBoot: fast phylogenetic maximum parsimony tree inference and bootstrap approximation. BMC Evol. Biol., 18, 11. https://doi.org/10.1186/s12862-018-1131-3

For feedback or bug reports please post a question to the IQ-TREE forum:

https://groups.google.com/d/forum/iqtree

Download

Precomplied executables are provided for computers with SSE4 or AVX support:

  1. Extract the files (e.g., with tar xvzf mpboot-avx-1.1.0-Linux.tar.gz under Unix). This should create a directory mpboot-avx-1.1.0-Linux.
  2. You will find the executable mpboot or mpboot-avx in bin folder inside the extracted folder. You can rename it to mpboot and copy to your system search path such that you can invoke it by entering mpboot from the terminal.

Compilation from source code

In case you cannot find executables for your platforms or want to compile the source code:

  1. Download the source code mpboot-1.1.0-Source.zip or mpboot-1.1.0-Source.tar.gz. You can also get the latest code from https://github.com/diepthihoang/mpboot.
  2. Uncompress it.
  3. Create folder build in the uncompressed source code folder: mkdir build.
  4. Change directory to build: cd build.
  5. Run cmake:

     cmake .. -DIQTREE_FLAGS=sse4 -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++
    
  6. Replace sse4 by avx in above command if you decide to run MPBoot on AVX-supported machines.
  7. Run make.
  8. You will find the executable named mpboot (or mpboot-avx) once the make command is done.

Minimal command-line examples

(replace mpboot with actual path to executable)

  1. Reconstruct maximum parsimony tree from a sequence alignment (example.phy):

     mpboot -s example.phy
    
  2. Reconstruct MP tree and assess branch supports with the MPBoot method (1000 replicates):

     mpboot -s example.phy -bb 1000
    
  3. Display all usage options:

     mpboot -h
    

Input/output files

Input alignment file example.phy is in PHYLIP format. MPBoot also supports FASTA or NEXUS format.

Output files:

  • MPBoot report file: example.phy.mpboot
  • Maximum-parsimony tree file: example.phy.treefile
  • Split support values file: example.phy.splits.nex
  • Consensus tree file: example.phy.contree
  • Screen log file: example.phy.log

Available options

OptionUsage and meaning
-mulhitsStore multiple equally parsimonious trees per bootstrap replicate
-ratchet_iter Number of non-ratchet iterations before each ratchet iteration (default: 1)
-ratchet_wgt Weight to add to each site selected for perturbation during ratchet (default: 1)
-ratchet_percent Percentage of informative sites selected for perturbation during ratchet (default: 50)
-ratchet_offTurn of ratchet, i.e. Only use tree perturbation
-spr_rad Maximum radius of SPR (default: 6)
-cand_cutoff <#s>Use top #s percentile as cutoff for selecting bootstrap candidates (default: 10)
-opt_btree_offTurn off refinement step on the final bootstrap tree set
-nni_parsHill-climb by NNI instead of SPR
-cost Read for the square matrix of transition cost between character states. First line contains one integer implying the number of character states (denoted by N). Each of the following N lines comprises N integers implying the cost for transitioning between character states. For DNA, set N=4. For protein, set N=20.

Credits and Acknowledgement

The code was originally derived from the IQ-TREE software and the Phylogenetic likelihood library.

D.T. Hoang and L.S. Vinh were financially supported by Vietnam National Foundation for Science and Technology Development (Grant #102.01-2013.04). B.Q. Minh and A. von Haeseler were supported by the Austrian Science Fund (FWF I-2805-B29), T. Flouri and A. Stamatakis by the German Science Foundation (DFG STA860-6/1), and the Klaus Tschira Foundation.

Share this:

Twitter acebook Google+