Checking Phylogenetic Decisiveness in Theory and in Practice
Suppose we have a set X consisting of n taxa and we are given information from k loci from which to construct a phylogeny for X. Each locus offers information for only a fraction of the taxa. The question is whether this data suffices to construct a reliable phylogeny. The decisiveness problem expresses this question combinatorially. Although a precise characterization of decisiveness is known, the complexity of the problem is open. Here we relate decisiveness to a hypergraph coloring problem. We use this idea to (1) obtain lower bounds on the amount of coverage needed to achieve decisiveness, (2) devise an exact algorithm for decisiveness, (3) develop problem reduction rules, and use them to obtain efficient algorithms for inputs with few loci, and (4) devise an integer linear programming formulation of the decisiveness problem, which allows us to analyze data sets that arise in practice.
READ FULL TEXT