News

Should the rate variation across sites be Gamma?

Sep 26, 2017 Opinion

It is often hard to predict the significance and long-term impact of innovations, but the introduction of ModelFinder (1), a novel model-selection method for accurate phylogenetic estimates, is likely to be on par with three earlier milestones: Yang’s ground-breaking discovery, in 1994, of a simple one-parameter Gamma model (2) of rate heterogeneity across sites; the first model-selection method (3) to be widely used in molecular phylogenetics; and the rise of Bayesian phylogenetic methods (4), (5).

Yang’s one-parameter model has been used in most molecular phylogenetic research since its publication in 1994, but its success has perhaps also led researchers to overlook his other discovery, in 1995, of a more versatile model (6) of rate heterogeneity across sites. ModelFinder includes both models, and the first results show that the more flexible model is often needed to obtain accurate estimates of phylogenetic trees and evolutionary processes.

These results call into question much of the last two decades of phylogenetic research, which relied on phylogenetic methodology that ignored the more versatile model of rate heterogeneity across sites.

The advent of ModelFinder opens a new era of opportunities, where accurate phylogenetic estimates can be obtained and used to answer important biological questions, and where controversial as well as long-standing evolutionary hypotheses can be tested using novel ways of modelling sequence evolution.

References

  1. Kalyaanamoorthy, S. et al. (2017) ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods 14, 587-589. https://doi.org/10.1038/nmeth.4285
  2. Yang, Z. (1994) Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods. J Mol Evol 39 (3), 306-314. http://dx.doi.org/10.1007/BF00160154
  3. Posada, D. and Crandall, K.A. (1998) MODELTEST: testing the model of DNA substitution. Bioinformatics 14, 817-818. https://www.ncbi.nlm.nih.gov/pubmed/9918953
  4. Larget, B. and Simon, D. (1999) Markov chain Monte Carlo algorithms for the Bayesian analysis of phylogenetic trees. Mol Biol Evol 16, 750-759. https://doi.org/10.1093/oxfordjournals.molbev.a026160
  5. Huelsenbeck, J.P. and Ronquist, F. (2001) MrBayes: Bayesian inference of phylogenetic trees. Bioinformatics 17, 754-755. https://doi.org/10.1093/bioinformatics/17.8.754
  6. Yang, Z. (1995) A space-time process model for the evolution of DNA sequences. Genetics 139, 993-1005. http://www.genetics.org/content/139/2/993.abstract

IQ-TREE beta version 1.6.beta3

Jun 2, 2017 Release

We are pleased to release the beta version 1.6.beta (available at http://www.iqtree.org) with many cool new features. During this beta-testing phase, feedback is much appreciated.

Notable new features:

  • Polymorphism-aware models accounting for incomplete lineage sorting (code contributed by Dominik Schrempf).
  • Lie Markov and non-reversible models (code contributed by Michael Woodhams).
  • Heterotachy models accounting for rate variation across sites and lineages.
  • Xeon Phi Knights Landing (AVX-512) support with 2X or more speedup.
  • New option -fast to match the speed of the FastTree program while still obtaining better trees [Experimental].

New features:

  • -wql option now prints quartet area and corner in .quartetlh file (requested by Karen Meusemann).
  • Support GENE resampling (-bspec GENE) and GENESITE resampling (-bspec GENESITE) for standard bootstrap with partition models.
  • Sequential and multicore versions are merged, so the iqtree-omp executable becomes iqtree.
Download latest version 1.6.beta3

IQ-TREE version 1.5.5

Jun 2, 2017 Release

New features:

  • Support gene-resampling (-bsam GENE) and gene-site-resampling (-bsam GENESITE) for standard bootstrap with partition models.
  • Support and treat polymorphic characters in (...) or {...} notation as missing data (requested by Steven Heritage).
  • Improved numerical stability for codon models (reported by Giorgio Matassi, Sarah Mathews, Ricardo Alves). Note that numerics may still fail if many codons are absent in the data.
  • Do not test ascertainment bias correction (+ASC) for codon models by default (thanks Ricardo Alves).
  • Only reduce the minimal branch length for long alignments and unpartitioned models (thanks to Steven Mussmann).
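
The two resampling schemes named above can be illustrated with a minimal sketch. It assumes that GENE means drawing whole partitions with replacement and that GENESITE means drawing partitions and then resampling sites within each drawn partition. The function names and toy data are hypothetical and are not taken from the IQ-TREE code.

```python
import random

def resample_gene(partitions, rng):
    """GENE (assumed meaning): draw whole partitions with replacement."""
    return [rng.choice(partitions) for _ in partitions]

def resample_genesite(partitions, rng):
    """GENESITE (assumed meaning): draw partitions, then resample sites within each."""
    replicate = []
    for part in resample_gene(partitions, rng):
        n_sites = len(part)
        # Sites are drawn with replacement from the chosen partition only.
        replicate.append([part[rng.randrange(n_sites)] for _ in range(n_sites)])
    return replicate

# Toy data: each partition is a list of site columns (one string per site).
partitions = [["AA", "AC", "GG"], ["TT", "TA"], ["CC", "CG", "CC", "GG"]]
rng = random.Random(1)  # fixed seed so the replicate is reproducible
gene_rep = resample_gene(partitions, rng)
genesite_rep = resample_genesite(partitions, rng)
```

In both schemes a bootstrap replicate has as many partitions as the original data; only GENESITE changes the site composition inside a partition.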

Bugfixes:

  • Initial tree generation problem with constrained tree search option -g.
  • Likelihood underflow for large multifurcating trees (e.g. consensus tree) (reported by Giap Nguyen).
  • Crash with -minsupnew option (reported by Longzhi Tan).
  • Compilation with gcc under Mac (thanks @ilovezfs).
  • Crash with -nni1 for partition model -spp (reported by Diep Thi Hoang).
Download version 1.5.5 from GitHub

ModelFinder: Fast Model Selection for Accurate Phylogenetic Estimates

May 8, 2017 Publication

This page contains data for the paper:

Subha Kalyaanamoorthy, Bui Quang Minh, Thomas KF Wong, Arndt von Haeseler, and Lars S Jermiin. ModelFinder: Fast Model Selection for Accurate Phylogenetic Estimates, Nature Methods. DOI: 10.1038/nmeth.4285.

The IQ-TREE version 1.4.2 used to analyze the data in the manuscript can be downloaded from here.

3D structure of human hemoglobin showing amino acids with high (in red) and low (in yellow) evolutionary rates determined by ModelFinder (Download PDB file here). Picture rendered by NGL viewer.


IQ-TREE version 1.5.4

Apr 3, 2017 Release

Important changes:

  • If no model is specified via -m, IQ-TREE will perform the new model selection, ModelFinder (MF). Accordingly, two new options are introduced: -m MF (equivalent to -m TESTNEWONLY) and -m MFP (equiv. to -m TESTNEW). For backward compatibility TESTNEW will still be available but might be removed in a future release.
  • Combining standard bootstrap (-b) and the constraint tree option (-g) will now impose the constraint on bootstrap trees (previously it did not). Thanks to Matthew Prebus for discussions.
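
The option equivalences stated above can be captured in a small wrapper script. This is only a sketch: the helper names are hypothetical, and it assumes the executable is called iqtree and that the alignment is passed with -s.

```python
# Equivalences as stated in the release note; everything else here is illustrative.
LEGACY_TO_MODELFINDER = {"TESTNEWONLY": "MF", "TESTNEW": "MFP"}

def modelfinder_name(model_option):
    """Return the ModelFinder spelling of a -m value, leaving other values untouched."""
    return LEGACY_TO_MODELFINDER.get(model_option, model_option)

def iqtree_args(alignment, model_option=None):
    """Build an argument list; omitting -m leaves model selection to the default."""
    args = ["iqtree", "-s", alignment]  # assumed executable name and input flag
    if model_option is not None:
        args += ["-m", modelfinder_name(model_option)]
    return args

default_run = iqtree_args("example.phy")
legacy_run = iqtree_args("example.phy", "TESTNEW")
```

Here default_run carries no -m at all, which according to the note above triggers ModelFinder, while legacy_run is rewritten to use -m MFP.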

New features:

  • The precompiled Linux executables are now backward compatible with the old Linux kernel 2.X, which resolves the error message “FATAL: kernel too old”.
  • Support input files with different newline formats to resolve conflicts between Mac, Windows or Linux files.
  • For data sets with identical sequences, redundant sequences are ignored. However, IQ-TREE will now keep two sequences from each set of identical sequences (e.g. if five sequences A, B, C, D, E are identical to each other, then A and B are kept and C, D, E are ignored). This avoids incompatibility between bootstrap and non-bootstrap runs.
  • Warning about too many threads for short alignments (reported by Joran Martijn).
  • New option -wbsf to print individual bootstrap alignments and sitefreq files for standard bootstrap (requested by Huaichun Wang).
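
The rule for identical sequences described above can be sketched as follows. The function name and the toy alignment are hypothetical; the sketch assumes that "identical" means exact string equality and that the first two copies in input order are the ones kept.

```python
def split_identical(alignment, keep=2):
    """Keep the first `keep` copies of each distinct sequence; ignore the rest."""
    copies_kept = {}  # sequence string -> number of copies kept so far
    kept, ignored = [], []
    for name, seq in alignment:
        count = copies_kept.get(seq, 0)
        if count < keep:
            kept.append(name)
            copies_kept[seq] = count + 1
        else:
            ignored.append(name)
    return kept, ignored

# Five identical sequences A to E plus one distinct sequence F.
alignment = [("A", "ACGT"), ("B", "ACGT"), ("C", "ACGT"),
             ("D", "ACGT"), ("E", "ACGT"), ("F", "ACGA")]
kept, ignored = split_identical(alignment)
```

With this input, A, B and F are kept and C, D and E are ignored, matching the example in the note.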

Bug fixes:

  • Segfault caused by combining standard bootstrap, partition model and constraint tree (reported by Matthew Prebus).
  • Crash with -nni1 option (reported by Carlos Rivera).
  • Illegal instruction on older Macs that do not support the AVX instruction set (reported by Richard Moir and Matthew Fullmer).
  • Crash when combining -mtree and -bb during model selection (reported by Chris Buddenhagen).
Download version 1.5.4 from GitHub

IQ-TREE version 1.5.3

Jan 16, 2017 Release

Version 1.5.3 improves software stability. We thank all users mentioned below for their reports.

Bug fixes:

  • Crash for +R+ASC model (reported by olaf.thalmann).
  • Improper multiple restart for I+G model optimization (reported by 98w8h1).
  • For large data sets with many sequences:
    • Incorrect handling of numerical underflow when all state likelihoods are zero (reported by Gerhard Jaeger).
    • Numerical underflow for invariant sites (reported by ledum_laconicum, kelly.schiro).

Other changes:

  • Invariable (+I) site model now considers ambiguous constant sites.
  • -wba (write bootstrap alignments) works now with standard bootstrap (-b).
Download version 1.5.3 from GitHub

IQ-TREE version 1.5.2

Dec 3, 2016 Release

This version improves software stability. We thank all users mentioned below for their reports.

Bug fixes:

  • Incorrect likelihood computation under safe mode for rate homogeneity models (thanks to Ricardo Alves).
  • Bug when finally merging partitions (-m TESTMERGE) (reported by Olivier Navaud).
  • Crash when computing distance with consensus tree in presence of identical sequences (-bb option) (reported by Julien).
  • Crash for I+G model when p-invar close to 0 (reported by liqiangj, Frank Wright).
  • Bug in likelihood scaling for ASC model (reported by lgrismer).

Other changes:

  • Fix misleading message about multifurcating trees (reported by Noah Simons).
  • Fix incompatibility with older Macs by switching back to libstdc++ instead of libc++ (reported by Matthew Fullmer).
  • Fix compilation issue for BSD and newer GCC 5.4 (thanks to @njoly).
  • Improved -nt AUTO option, e.g. to work with model selection (reported by Remi Denise).
Download version 1.5.2 from GitHub

IQ-TREE version 1.5.1

Nov 8, 2016 Release

We are pleased to announce version 1.5.1, with a special focus on huge data sets and supercomputing and the following new features:

  • Merged the parallel MPI version with much better parallel efficiency and scalability. The old MPI version is deprecated.
  • Memory-saving mode via a new -mem option to restrict RAM usage, helpful for complex mixture models. For example, use -mem 64G to use at most 64 GB. By default, IQ-TREE will try to fit within the computer's RAM. Note that this mode does not work with partition models yet.
  • Safe numerical mode for huge data sets to avoid “Numerical underflow” errors. This mode is automatically turned on for data sets with more than 2000 sequences. It can be turned on manually via the -safe option.
  • New option -nt AUTO to automatically determine the best number of threads in the multicore version.
  • Support AVX2 instructions.
Download version 1.5.1 from GitHub
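
Why huge data sets can trigger “Numerical underflow” is easy to reproduce in a few lines. The sketch below is a generic illustration of the problem and of the standard remedy of working with logarithms. It is not IQ-TREE's actual scaling scheme, and the numbers are toy values.

```python
import math

# Toy factors standing in for the many small probabilities that are
# multiplied together when a tree contains thousands of sequences.
factors = [1e-5] * 2000

naive = 1.0
for value in factors:
    naive *= value  # falls below the smallest representable double and becomes 0.0

# Summing logarithms keeps the same quantity in a representable range.
log_likelihood = sum(math.log(value) for value in factors)
```

The naive product collapses to exactly zero, whereas the log-scale sum stays finite (about -23026 for these toy values).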

IQ-TREE version 1.5.0a

Oct 31, 2016 Release

This is a hot fix for the -g option in version 1.5.0:

  • A bug when computing initial constrained trees (-g option) and recovering from checkpoint (thanks to Xingxing Shen for the report).
Download version 1.5.0a from GitHub

IQ-TREE version 1.5.0

Oct 24, 2016 Release

We are pleased to announce IQ-TREE version 1.5.0 with the following major updates:

Major new features:

  • A new posterior mean site frequency (PMSF) model as a rapid approximation to the time- and memory-consuming CAT profile mixture models C10 to C60 (Le et al., 2008a). The PMSF model is much faster and requires much less RAM than the mixture models, regardless of the number of mixture classes. This makes it possible, for the first time, to conduct nonparametric bootstrap under such complex models. Our extensive simulations and empirical deep-phylogeny data analyses demonstrate that the PMSF models can effectively ameliorate long-branch attraction artefacts. For details see http://www.iqtree.org/doc/Complex-Models#site-specific-frequency-models

  • New option -g to supply a user-defined constraint tree, which will guide the subsequent tree search. The constraint tree can be multifurcating and need not include all taxa.
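
The idea behind the PMSF model can be sketched for a single site. The sketch assumes the usual reading of "posterior mean site frequency": each mixture class's frequency profile is weighted by the posterior probability of that class at the site. The function name and all numbers are hypothetical toy values, not output from IQ-TREE.

```python
def posterior_mean_frequencies(class_weights, class_site_likelihoods, class_profiles):
    """Average the class frequency profiles using posterior class probabilities."""
    joint = [w * lik for w, lik in zip(class_weights, class_site_likelihoods)]
    total = sum(joint)
    posterior = [j / total for j in joint]  # posterior probability of each class at this site
    n_states = len(class_profiles[0])
    return [sum(p * profile[s] for p, profile in zip(posterior, class_profiles))
            for s in range(n_states)]

# Toy example: two mixture classes over a four-state alphabet.
weights = [0.5, 0.5]          # prior class weights
likelihoods = [0.03, 0.01]    # likelihood of the site under each class
profiles = [[0.7, 0.1, 0.1, 0.1],
            [0.1, 0.1, 0.1, 0.7]]
freqs = posterior_mean_frequencies(weights, likelihoods, profiles)
```

The result is a single frequency vector per site, which is why the cost afterwards no longer depends on the number of mixture classes.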

Bug fixes:

  • Crash with zero weights of mixture models for short alignments (thanks to Laura Eme for the report).
  • Incorrect site rate file (-wsr option) in the presence of identical sequences (thanks to Brian Foley for the report).
  • Memory overflow for tree topology testing for extremely long alignments (>500,000 sites) (thanks to Karen Meusemann for the report).
  • Rare issue with multifurcating trees and partition model (thanks to Xingxing for the report).

Other changes:

  • A new biologist-familiar example data file example.phy, which contains mitochondrial DNAs of human, gorilla, dog, mouse, etc. The data set was taken from the phylogenetic handbook (thanks to Brian Foley for the suggestion).
  • Print an alignment with suffix .varsite containing only variable sites if ascertainment bias correction (ASC) is not applicable.
  • New option -wpl to write partition-specific log-likelihoods to .partlh file (requested by Karen Meusemann).
Download version 1.5.0 from GitHub
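
What a variable-sites-only alignment contains can be shown with a minimal sketch. It assumes that a site counts as variable when its column holds more than one distinct character; gaps and ambiguity codes are not treated specially here, so IQ-TREE's own rule may differ. The function name and data are hypothetical.

```python
def variable_sites_only(sequences):
    """Restrict equal-length sequences to columns with more than one distinct character."""
    columns = list(zip(*sequences))
    keep = [i for i, column in enumerate(columns) if len(set(column)) > 1]
    return ["".join(seq[i] for i in keep) for seq in sequences]

# Columns 2 and 4 (1-based) vary; the other three are constant.
sequences = ["ACGTA", "ACGAA", "ATGTA"]
filtered = variable_sites_only(sequences)
```

For this toy alignment only the two varying columns survive, giving sequences of length two.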