URLs for the software packages are provided at www.bioinf.manchester.ac.uk/recombination/
A dot in a cell indicates the same content as for the first software package implementing the same approach.
1 For these approaches, RDP3 allows the user to perform a ‘manual’ analysis by assigning a query and parental sequences or to perform an ‘automated’ analysis that evaluates every possible quartet or triplet. The latter setting provides a useful approach to identify putative recombinant sequences in the data set.
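The 'automated' setting described above amounts to an exhaustive enumeration of sequence combinations. As a minimal sketch of that idea (not RDP3's actual implementation), the Python fragment below lists every triplet and lets each member take the query role once. The function name `candidate_triplets` and the sequence labels are hypothetical.

```python
from itertools import combinations


def candidate_triplets(names):
    """Yield (query, parent_a, parent_b) for every triplet of sequences.

    Each member of a triplet is assigned the query role once, so a data
    set of n sequences yields 3 * C(n, 3) candidate assignments.
    """
    for trio in combinations(names, 3):
        for i, query in enumerate(trio):
            parents = trio[:i] + trio[i + 1:]
            yield (query, parents[0], parents[1])


# Hypothetical sequence labels, for illustration only.
names = ["A", "B", "C", "D"]
triplets = list(candidate_triplets(names))
```

The number of candidates grows cubically with the number of sequences, which is why an exhaustive scan is practical mainly as a first screen for putative recombinants.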
2 The bootscan approach analyses all sequences simultaneously (phylogenetic tree inference), but uses the ‘query vs. reference’ scheme a posteriori to trace the clustering of a particular query sequence.
3 In TOPALi, a modified difference in sums of squares (DSS) method can be used to find which sequence(s) appear to be responsible for the recombination breakpoints. This “leave-one-out” method uses as windows the homogeneous regions between the breakpoints, identified using any method. The DSS scores for each breakpoint are calculated, leaving out one sequence at a time. To assess significance, 100 alignments are simulated. A sequence is a putative recombinant if removing it results in a non-significant recombination breakpoint. This algorithm can be applied after a recombinant pattern is identified using any method implemented in TOPALi.
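The leave-one-out logic can be sketched as follows. This is only an outline of the control flow, not TOPALi's code: the DSS calculation and the simulation of 100 alignments are replaced by a user-supplied callable, here the hypothetical `breakpoint_is_significant`, and the toy stand-in exists purely so the example runs.

```python
def leave_one_out_recombinants(names, breakpoint_is_significant):
    """Return the sequences whose removal makes a breakpoint non-significant.

    `breakpoint_is_significant` is a hypothetical callable that takes the
    list of remaining sequence names and returns True when the breakpoint
    is still significant (in TOPALi this would involve DSS scores and
    simulated alignments, which are not reproduced here).
    """
    putative = []
    for left_out in names:
        remaining = [n for n in names if n != left_out]
        if not breakpoint_is_significant(remaining):
            putative.append(left_out)
    return putative


def toy_test(remaining):
    """Toy stand-in: the breakpoint is 'significant' only while R is present."""
    return "R" in remaining


names = ["A", "B", "R", "C"]
putative = leave_one_out_recombinants(names, toy_test)
```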
4 The methods in TOPALi are generally slow when run on a single processor, but analyses run significantly faster when distributed across multiple CPUs.
5 Although these software packages compare a query sequence against all the other sequences, they can repeat this comparison with each sequence in the data set assigned in turn as the query. In PhylPro, phylogenetic correlations, which are based on pairwise genetic distances, are computed for each individual sequence in the alignment at every position using sliding-window techniques. Bellerophon evaluates, for each sequence, its contribution to the absolute deviation between the distance matrices of two adjacent windows. RAT has an ‘auto search’ option to evaluate the similarity of every sequence to all other sequences. These approaches can therefore be useful for identifying putative recombinants.
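To illustrate the kind of sliding-window correlation described for PhylPro, the sketch below correlates a query's distances to all other sequences in the window to the left of a position with its distances in the window to the right. It is a simplified illustration, not PhylPro's algorithm: it assumes an ungapped alignment, uses the plain proportion of differing sites as the distance, and all function names and sequences are hypothetical.

```python
import math


def p_distance(a, b):
    """Proportion of differing sites between two equal-length strings."""
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pearson(xs, ys):
    """Pearson correlation coefficient; NaN if either vector is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def window_correlation(alignment, query, pos, width):
    """Correlate the query's distances left and right of position `pos`."""
    others = [name for name in alignment if name != query]
    q = alignment[query]
    left = [p_distance(q[pos - width:pos], alignment[o][pos - width:pos])
            for o in others]
    right = [p_distance(q[pos:pos + width], alignment[o][pos:pos + width])
             for o in others]
    return pearson(left, right)


# Toy alignment: Q resembles P1 on the left half and P2 on the right half.
alignment = {
    "Q": "AAAACCCC",
    "P1": "AAAAGGGG",
    "P2": "TTTTCCCC",
    "P3": "AATTCCGG",
}
r = window_correlation(alignment, "Q", pos=4, width=4)
```

A strongly negative or low correlation at a position suggests that the query's closest relatives differ on either side of it, which is the signal such methods look for.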
6 SERAD is the MATLAB precursor of BARCE (a C++ program). Both BARCE and JAMBE have been integrated into TOPALi, which provides a user-friendly GUI and several on-line monitoring diagnostic tools. Note that the BARCE method may predict erroneous recombination events when a DNA sequence alignment shows strong rate heterogeneity. An improved method that addresses this problem via a factorial hidden Markov model has been described by Husmeier (2005). The method has been implemented in a MATLAB program (http://www.bioss.ac.uk/~dirk/Supplements/phyloFHMM/), but this implementation is currently slow and computationally inefficient.
7 The stepwise approach can be applied to any recombination detection method that uses a permutation test and provides estimates of breakpoints. The criteria depend on the method that is used in the stepwise approach.
5 Sliding MinDP implements three different methods to identify recombinants: a percentage identity method (as implemented in the Recombinant Identification Program, RIP), a standard bootscanning method and a distance bootscanning method. Only the standard bootscanning method infers trees for each alignment window.
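A percentage identity method of this kind can be sketched in a few lines. The example below is a generic sliding-window identity scan written for illustration, not the code of Sliding MinDP or RIP. It assumes ungapped, equal-length sequences, and the function names and sequences are hypothetical.

```python
def window_identity(query, reference, width, step=1):
    """Return (start, percent_identity) for each window along the alignment."""
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(query) - width + 1, step):
        q = query[start:start + width]
        r = reference[start:start + width]
        matches = sum(a == b for a, b in zip(q, r))
        results.append((start, 100.0 * matches / width))
    return results


def best_match_per_window(query, references, width, step=1):
    """Return (start, name) of the most similar reference in each window."""
    profiles = {name: window_identity(query, seq, width, step)
                for name, seq in references.items()}
    starts = [start for start, _ in next(iter(profiles.values()))]
    best = []
    for i, start in enumerate(starts):
        name = max(profiles, key=lambda n: profiles[n][i][1])
        best.append((start, name))
    return best


# Toy data: the query matches P1 in the first window and P2 in the second.
query = "AAAACCCC"
references = {"P1": "AAAAGGGG", "P2": "TTTTCCCC"}
best = best_match_per_window(query, references, width=4, step=4)
```

A change in the best-matching reference between adjacent windows marks a candidate breakpoint. No trees are inferred at any point, in contrast to standard bootscanning.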
Philippe Lemey
A Practical Approach to phylogenetic analysis and hypothesis testing
Second Edition