Tutorial

Last update: Dec 1, 2019, Contributors: Jana Trifinopoulos, M Bui, Minh Bui, Nltung

Introduction

Why IQ-TREE?

Thanks to the recent advent of next-generation sequencing techniques, the amount of phylogenomic and transcriptomic data has grown rapidly. This greatly facilitates resolving many “deep phylogenetic” questions in the tree of life. At the same time, analyzing such big data poses major computational challenges that most phylogenetic software cannot handle. Moreover, there is a need to develop more complex probabilistic models to adequately capture realistic aspects of genomic sequence evolution.

These trends motivated us to develop the IQ-TREE software with a strong emphasis on phylogenomic inference. Our goals are:

  • Accuracy: Proposing novel computational methods that perform better than existing approaches.
  • Speed: Allowing fast analysis on big data sets and utilizing high performance computing platforms.
  • Flexibility: Facilitating the inclusion of new (phylogenomic) models and sequence data types.
  • Versatility: Implementing a broad range of commonly used maximum likelihood analyses.

IQ-TREE has been developed since 2011 and is freely available at http://www.iqtree.org/ as open-source software under the GNU GPL license version 2. It is actively maintained by the core development team (see below) and a number of collaborators.

The name IQ-TREE comes from the fact that it is the successor of the IQPNNI and TREE-PUZZLE software.

Key features

  • Efficient search algorithm: Fast and effective stochastic algorithm to reconstruct phylogenetic trees by maximum likelihood. IQ-TREE compares favorably to RAxML and PhyML in terms of likelihood while requiring a similar amount of computing time (Nguyen et al., 2015).
  • Ultrafast bootstrap: An ultrafast bootstrap approximation (UFBoot) to assess branch supports. UFBoot is 10 to 40 times faster than RAxML rapid bootstrap and obtains less biased support values (Minh et al., 2013; Hoang et al., 2018).
  • Ultrafast model selection: An ultrafast and automatic model selection (ModelFinder) which is 10 to 100 times faster than jModelTest and ProtTest. ModelFinder also finds the best-fit partitioning scheme, like PartitionFinder.
  • Big Data Analysis: Supporting huge datasets with thousands of sequences or millions of alignment sites via checkpointing, safe numerical mode, and low-memory mode. Multicore CPUs and parallel MPI systems are used to speed up analysis.
  • Phylogenetic testing: Several fast branch tests such as the SH-aLRT and aBayes tests (Anisimova et al., 2011) and tree topology tests such as the approximately unbiased (AU) test (Shimodaira, 2002).

A particular strength of IQ-TREE is the availability of a wide variety of phylogenetic models.

Free web server

For a quick start, you can also try the IQ-TREE web server, which performs online computation using a dedicated computing cluster. It is very easy to use, requiring as few as three clicks. Try it out at:

http://iqtree.cibiv.univie.ac.at

User support

Please refer to the user documentation and frequently asked questions. If you have further questions, feedback, feature requests, or bug reports, please sign up for the following Google group (if you have not done so already) and post a topic there:

https://groups.google.com/d/forum/iqtree

The average response time is two working days.

Documentation

IQ-TREE has extensive documentation, including several tutorials and a manual.

How to cite IQ-TREE?

To maintain IQ-TREE, support users, and secure funding, it is important for us that you cite the following papers whenever the corresponding features were used in your analysis.

  • Example 1: We obtained branch supports with the ultrafast bootstrap (Hoang et al., 2018) implemented in the IQ-TREE software (Nguyen et al., 2015).

  • Example 2: We inferred the maximum-likelihood tree using the edge-linked partition model in IQ-TREE (Chernomor et al., 2016; Nguyen et al., 2015).

If you performed the tests of symmetry, please cite:

  • S. Naser-Khdour, B.Q. Minh, W. Zhang, E.A. Stone, R. Lanfear (2019) The prevalence and impact of model violations in phylogenetic analysis, Genome Biol. Evol., in press. https://doi.org/10.1093/gbe/evz193

If you used the polymorphism-aware models please cite:

  • D. Schrempf, B.Q. Minh, A. von Haeseler, and C. Kosiol (2019) Polymorphism-aware species trees with advanced mutation models, bootstrap, and rate heterogeneity. Mol. Biol. Evol., 36:1294-1301. https://doi.org/10.1093/molbev/msz043

If you used the heterotachy model (GHOST) please cite:

  • S.M. Crotty, B.Q. Minh, N.G. Bean, B.R. Holland, J. Tuke, L.S. Jermiin, A. von Haeseler (2019) GHOST: Recovering historical signal from heterotachously-evolved sequence alignments. Syst. Biol., in press. https://doi.org/10.1093/sysbio/syz051

If you performed the ultrafast bootstrap (UFBoot) please cite:

  • D.T. Hoang, O. Chernomor, A. von Haeseler, B.Q. Minh, and L.S. Vinh (2018) UFBoot2: Improving the ultrafast bootstrap approximation. Mol. Biol. Evol., 35:518–522. https://doi.org/10.1093/molbev/msx281

If you used the posterior mean site frequency model please cite:

  • H.C. Wang, B.Q. Minh, E. Susko and A.J. Roger (2018) Modeling site heterogeneity with posterior mean site frequency profiles accelerates accurate phylogenomic estimation. Syst. Biol., 67:216-235. https://doi.org/10.1093/sysbio/syx068

If you used ModelFinder please cite:

  • S. Kalyaanamoorthy, B.Q. Minh, T.K.F. Wong, A. von Haeseler, and L.S. Jermiin (2017) ModelFinder: Fast Model Selection for Accurate Phylogenetic Estimates, Nature Methods, 14:587–589. https://doi.org/10.1038/nmeth.4285

If you performed tree reconstruction please cite:

  • L.-T. Nguyen, H.A. Schmidt, A. von Haeseler, and B.Q. Minh (2015) IQ-TREE: A fast and effective stochastic algorithm for estimating maximum likelihood phylogenies. Mol. Biol. Evol., 32:268-274. https://doi.org/10.1093/molbev/msu300

If you used partition models e.g., for phylogenomic analysis please cite:

  • O. Chernomor, A. von Haeseler, and B.Q. Minh (2016) Terrace aware data structure for phylogenomic inference from supermatrices. Syst. Biol., 65:997-1008. https://doi.org/10.1093/sysbio/syw037

If you used the IQ-TREE web server please cite:

  • J. Trifinopoulos, L.-T. Nguyen, A. von Haeseler, and B.Q. Minh (2016) W-IQ-TREE: a fast online phylogenetic tool for maximum likelihood analysis. Nucleic Acids Res., 44 (W1):W232-W235. https://doi.org/10.1093/nar/gkw256

Development team

IQ-TREE is actively developed by:

Bui Quang Minh, Team leader, Designs and implements software core, tree search, ultrafast bootstrap, model selection.

Robert Lanfear, Co-leader, Model selection.

Olga Chernomor, Developer, Implements partition models.

Heiko A. Schmidt, Developer, Integrates TREE-PUZZLE features.

Dominik Schrempf, Developer, Implements polymorphism-aware models (PoMo).

Michael Woodhams, Developer, Implements Lie Markov models.

Diep Thi Hoang, Developer, Improves ultrafast bootstrap.

Arndt von Haeseler, Advisor.

Past members:

Lam Tung Nguyen, Developer, Implemented tree search algorithm.

Jana Trifinopoulos, Developer, Implemented web service.

Credits and acknowledgements

Some parts of the code were taken from the following packages/libraries: Phylogenetic likelihood library, TREE-PUZZLE, BIONJ, Nexus Class Library, Eigen library, SPRNG library, Zlib library, gzstream library, vectorclass library, GNU scientific library.

IQ-TREE was funded by the Austrian Science Fund (grant no. I760-B17 from 2012-2015 and I 2508-B29 from 2016-2017), the University of Vienna (Initiativkolleg I059-N from 2012-2015), the Australian National University (2018 onwards), and the Chan-Zuckerberg Initiative (2020).