MAF: Maximum Agreement Forests for nonbinary trees

Implements an approximation and an FPT algorithm from this paper.

The approximation algorithm computes an agreement forest of two trees that is at most a factor 4 from optimum. This is the best known approximation algorithm that works for nonbinary (not completely resolved) trees.

The FPT algorithm computes a maximum agreement forest (MAF).

For binary trees, the program rSPR is faster and has a better approximation algorithm.

Version 1.2 (October 2012)

Download

Download the source file "MAF.java"

Download some example input files.

Installation

Type "javac MAF.java" in a terminal window to compile the source on your computer.

(Optional) Download Graphviz to be able to see graphical respresentations of the constructed agreement forest. (Graphviz might already be installed on your computer.)

Usage

Type "java MAF trees.txt" to run the program.

Example

For this input file, the program constructs this (optimal) maximum agreement forest.

Graphical representation of the forest.

 

The forest as an edge-colouring of the first tree. The forest as an edge-colouring of the second tree.

Larger examples

For larger examples, see here.

Input and output files

  • The inputfile should contain two trees in Newick format, one tree per line.

The program will produce the following output files for the approximation algorithm:

  • app_maf.txt contains the constructed forest as a partitioning of the taxa.

  • app_T1forest.dot contains the constructed forest as an edge-colouring of the first tree in dot format. (Actual colours will be used if the number of components is at most 12. Otherwise, the edges will only be labelled by a component number.)

  • app_T2forest.dot contains the constructed forest as an edge-colouring of the second tree in dot format.

  • app_forest.dot contains a graphical representation of the forest in dot format.

The program will produce the following output files for the FPT algorithm:

  • maf.txt contains the constructed forest as a partitioning of the taxa.

  • T1forest.dot contains the constructed forest as an edge-colouring of the first tree in dot format.

  • T2forest.dot contains the constructed forest as an edge-colouring of the second tree in dot format.

  • forest.dot contains a graphical representation of the forest in dot format.

While the algorithm is running, these files will contain the best solution found so far, unless cluster reductions have been applied.

All dot files will also be converted to pdf files if Graphviz is installed.

Options

--nofptOnly run the approximation, not the FPT algorithm.
--infoShow all steps of the algorithm.
--noredDo not apply cluster reductions.
--it kSearch for agreement forest of size 1,2,...,k before starting Branch and Bound (by default branch and bound is used immediately).
--pruneUse the approximation algorithm for pruning (disabled by default; might slow down the algorithm).

Notes

The input trees are not required to have identical leaf sets. Leaves that appear in only one tree will not be put in any component of the constructed agreement forests.

The FPT algorithm seems to be fast enough for trees that have, after applying cluster reductions, an agreement forest of size at most 14. For other data, the approximation algorithm can be used.

Citation

Leo van Iersel, Steven Kelk, Nela Lekić and Leen Stougie, Computing nonbinary agreement forests, arXiv:1210.3211 [math.CO] (2012).

Contact

For any questions, do not hesitate to contact: