Tutorial

Last update: Jun 27, 2024, Contributors: Jana Trifinopoulos, Minh Bui, Nltung

Introduction

Why IQ-TREE?

Thanks to the recent advent of next-generation sequencing techniques, the amount of phylogenomic/transcriptomic data has grown rapidly. This greatly facilitates resolving many “deep phylogenetic” questions in the tree of life. At the same time, analyzing such big data poses major computational challenges that most phylogenetic software cannot handle. Moreover, there is a need to develop more complex probabilistic models to adequately capture realistic aspects of genomic sequence evolution.

These trends motivated us to develop the IQ-TREE software with a strong emphasis on phylogenomic inference. Our goals are:

  • Accuracy: Proposing novel computational methods that perform better than existing approaches.
  • Speed: Allowing fast analysis on big data sets and utilizing high performance computing platforms.
  • Flexibility: Facilitating the inclusion of new (phylogenomic) models and sequence data types.
  • Versatility: Implementing a broad range of commonly-used maximum likelihood analyses.

IQ-TREE has been developed since 2011 and is freely available at http://www.iqtree.org/ as open-source software under the GNU-GPL license version 2. It is actively maintained by the core development team (see below) and a number of collaborators.

The name IQ-TREE comes from the fact that it is the successor of the IQPNNI and TREE-PUZZLE software.

Key features

  • Efficient search algorithm: Fast and effective stochastic algorithm to reconstruct phylogenetic trees by maximum likelihood. IQ-TREE compares favorably to RAxML and PhyML in terms of likelihood while requiring a similar amount of computing time (Nguyen et al., 2015).
  • Ultrafast bootstrap: An ultrafast bootstrap approximation (UFBoot) to assess branch supports. UFBoot is 10 to 40 times faster than RAxML rapid bootstrap and obtains less biased support values (Minh et al., 2013; Hoang et al., 2018).
  • Ultrafast model selection: An ultrafast and automatic model selection (ModelFinder) which is 10 to 100 times faster than jModelTest and ProtTest. ModelFinder also finds the best-fit partitioning scheme, like PartitionFinder (Kalyaanamoorthy et al., 2017).
  • Simulating sequences: A fast sequence alignment simulator (AliSim) which is much more realistic than Seq-Gen and INDELible (Ly-Trong et al., 2023).
  • Big Data Analysis: Supporting huge datasets with thousands of sequences or millions of alignment sites via checkpointing, a safe numerical mode, and a low-memory mode. Multicore CPUs and parallel MPI systems are used to speed up analysis.
  • Phylogenetic testing: Several fast branch tests like SH-aLRT and aBayes test (Anisimova et al., 2011) and tree topology tests like the approximately unbiased (AU) test (Shimodaira, 2002).
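As a rough illustration of how several of these features combine in one run, the following Python sketch assembles an IQ-TREE 2 command line that requests ModelFinder (-m MFP), ultrafast bootstrap (-B), the SH-aLRT branch test (-alrt) and automatic thread selection (-T AUTO). The helper name build_iqtree_command, the executable name iqtree2, and the file name example.phy are assumptions made for this example; the sketch only builds and prints the command and does not run IQ-TREE.

```python
import shlex


def build_iqtree_command(alignment, ufboot=1000, alrt=1000,
                         threads="AUTO", prefix=None):
    """Return an argument list for an IQ-TREE 2 run.

    Hypothetical helper for illustration only; it is not part of IQ-TREE.
    Assumes the executable is called 'iqtree2' and is on the PATH.
    """
    cmd = ["iqtree2", "-s", alignment]   # input alignment file
    cmd += ["-m", "MFP"]                 # ModelFinder, then tree inference
    cmd += ["-B", str(ufboot)]           # ultrafast bootstrap replicates
    cmd += ["-alrt", str(alrt)]          # SH-aLRT branch test replicates
    cmd += ["-T", str(threads)]          # number of CPU cores, or AUTO
    if prefix is not None:
        cmd += ["--prefix", prefix]      # prefix for all output files
    return cmd


if __name__ == "__main__":
    # 'example.phy' is a placeholder alignment name.
    print(shlex.join(build_iqtree_command("example.phy")))
```

The returned list can be passed to subprocess.run once IQ-TREE is installed; keeping the arguments as a list avoids shell-quoting problems with file names that contain spaces.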

The strength of IQ-TREE is the availability of a wide variety of phylogenetic models:

Free web server

For a quick start you can also try the IQ-TREE web server, which performs online computation using a dedicated computing cluster. It is very easy to use and needs as few as 3 clicks. Try it out at

http://iqtree.cibiv.univie.ac.at

User support

Please refer to the user documentation and frequently asked questions.

If you find a bug (e.g. when IQ-TREE prints a crash message) or want to request a new feature, please post an issue on GitHub: https://github.com/iqtree/iqtree2/issues. For other questions and feedback, please ask in GitHub discussions: https://github.com/iqtree/iqtree2/discussions

Documentation

IQ-TREE has extensive documentation with several tutorials and a manual:

How to cite IQ-TREE?

To maintain IQ-TREE, support users and secure funding, it is important for us that you cite the following papers whenever the corresponding features are used in your analysis.

Example 1: We obtained branch supports with the ultrafast bootstrap (Hoang et al., 2018) implemented in the IQ-TREE 2 software (Minh et al., 2020).

Example 2: We used IQ-TREE 2 (Minh et al., 2020) to infer the maximum-likelihood tree using the edge-linked partition model (Chernomor et al., 2016).

General citation for IQ-TREE 2:

  • B.Q. Minh, H.A. Schmidt, O. Chernomor, D. Schrempf, M.D. Woodhams, A. von Haeseler, R. Lanfear (2020) IQ-TREE 2: New models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol., 37:1530-1534. https://doi.org/10.1093/molbev/msaa015

When using tree mixture models (MAST) please cite:

  • T.K.F. Wong, C. Cherryh, A.G. Rodrigo, M.W. Hahn, B.Q. Minh, R. Lanfear (2024) MAST: Phylogenetic Inference with Mixtures Across Sites and Trees. Syst. Biol., in press. https://doi.org/10.1093/sysbio/syae008

When computing concordance factors please cite:

When using AliSim to simulate alignments please cite:

When estimating amino-acid Q matrix please cite:

  • B.Q. Minh, C. Cao Dang, L.S. Vinh, R. Lanfear (2021) QMaker: Fast and accurate method to estimate empirical models of protein evolution. Syst. Biol., 70:1046–1060. https://doi.org/10.1093/sysbio/syab010

When using the heterotachy GHOST model “+H” please cite:

  • S.M. Crotty, B.Q. Minh, N.G. Bean, B.R. Holland, J. Tuke, L.S. Jermiin, A. von Haeseler (2020) GHOST: Recovering Historical Signal from Heterotachously Evolved Sequence Alignments. Syst. Biol., 69:249-264. https://doi.org/10.1093/sysbio/syz051

When using the tests of symmetry please cite:

  • S. Naser-Khdour, B.Q. Minh, W. Zhang, E.A. Stone, R. Lanfear (2019) The Prevalence and Impact of Model Violations in Phylogenetic Analysis. Genome Biol. Evol., 11:3341-3352. https://doi.org/10.1093/gbe/evz193

When using polymorphism-aware models please cite:

  • D. Schrempf, B.Q. Minh, A. von Haeseler, C. Kosiol (2019) Polymorphism-aware species trees with advanced mutation models, bootstrap, and rate heterogeneity. Mol. Biol. Evol., 36:1294–1301. https://doi.org/10.1093/molbev/msz043

For the ultrafast bootstrap (UFBoot) please cite:

  • D.T. Hoang, O. Chernomor, A. von Haeseler, B.Q. Minh, L.S. Vinh (2018) UFBoot2: Improving the ultrafast bootstrap approximation. Mol. Biol. Evol., 35:518–522. https://doi.org/10.1093/molbev/msx281

When using posterior mean site frequency model (PMSF) please cite:

  • H.C. Wang, B.Q. Minh, E. Susko, A.J. Roger (2018) Modeling site heterogeneity with posterior mean site frequency profiles accelerates accurate phylogenomic estimation. Syst. Biol., 67:216–235. https://doi.org/10.1093/sysbio/syx068

When using ModelFinder please cite:

  • S. Kalyaanamoorthy, B.Q. Minh, T.K.F. Wong, A. von Haeseler, L.S. Jermiin (2017) ModelFinder: Fast model selection for accurate phylogenetic estimates. Nat. Methods, 14:587-589. https://doi.org/10.1038/nmeth.4285

When using partition models please cite:

  • O. Chernomor, A. von Haeseler, B.Q. Minh (2016) Terrace aware data structure for phylogenomic inference from supermatrices. Syst. Biol., 65:997-1008. https://doi.org/10.1093/sysbio/syw037

When using IQ-TREE web server please cite:

  • J. Trifinopoulos, L.-T. Nguyen, A. von Haeseler, B.Q. Minh (2016) W-IQ-TREE: a fast online phylogenetic tool for maximum likelihood analysis. Nucleic Acids Res., 44:W232-W235. https://doi.org/10.1093/nar/gkw256

When using IQ-TREE version 1 please cite:

  • L.-T. Nguyen, H.A. Schmidt, A. von Haeseler, B.Q. Minh (2015) IQ-TREE: A Fast and Effective Stochastic Algorithm for Estimating Maximum-Likelihood Phylogenies. Mol. Biol. Evol., 32:268-274. https://doi.org/10.1093/molbev/msu300

Development team

IQ-TREE is actively developed by:

Bui Quang Minh, Team leader, Designs and implements software core, tree search, ultrafast bootstrap, model selection.

Robert Lanfear, Co-leader, Co-leading various projects (incl. new model selection) and documentation, acting as a bridge with biologists.

Thomas Wong, Developer, Mixture models, model selection.

Nhan Ly-Trong, Developer, sequence simulator, pathogen phylogeny reconstruction.

Piyumal Demotte, time tree inference.

Olga Chernomor, Developer, Implements partition models.

Heiko A. Schmidt, Developer, Integrates TREE-PUZZLE features, user support and documentation.

Dominik Schrempf, Developer, Implements polymorphism-aware models (PoMo).

Michael Woodhams, Developer, Implements Lie Markov models.

Diep Thi Hoang, Developer, Improves ultrafast bootstrap.

Arndt von Haeseler, Advisor.

Past members:

Lam Tung Nguyen, Developer, Implemented tree search algorithm.

Jana Trifinopoulos, Developer, Implemented web service.

James Barbetti, Developer, Code optimization.

Credits and acknowledgements

Some parts of the code were taken from the following packages/libraries: Phylogenetic likelihood library, TREE-PUZZLE, BIONJ, Nexus Class Library, Eigen library, SPRNG library, Zlib library, gzstream library, vectorclass library, GNU scientific library.

IQ-TREE was funded by the Austrian Science Fund (grant nos. I760-B17 from 2012-2015 and I 2508-B29 from 2016-2017), the University of Vienna (Initiativkolleg I059-N from 2012-2015), the Australian National University (2018 onwards), and the Chan-Zuckerberg Initiative (2020).