@ARTICLE{TreeBASE2Ref19003,
author = {Sven Buerki and F{\'e}lix Forest and Nicolas Salamin and Nadir Alvarez},
title = {Comparative Performance of Supertree Algorithms in Large Datasets using the Soapberry Family (Sapindaceae) as a case study},
year = {2010},
keywords = {Heuristic search; Matrix Representation with Parsimony; MinCut; MinFlip; Sapindaceae; Supertree.},
journal = {Systematic Biology},
abstract = {For the last two decades, supertree reconstruction has been an active field of research and has seen the development of a large number of major algorithms. Because of the growing popularity of the supertree methods, it has become necessary to evaluate the performance of these algorithms to determine which are the best options (especially in regard of the supermatrix approach which is widely used). In this study, seven of the most commonly used supertree methods are investigated by using a large empirical dataset (in terms of number of taxa and molecular markers) from the worldwide flowering plant family Sapindaceae. Supertree methods were evaluated using several criteria: similarity of the supertrees with the input trees, similarity between the supertrees and the total evidence tree, level of resolution of the supertree and the computational time required by the algorithm. Additional analyses were also conducted on a reduced dataset to test if the performance levels were affected by the heuristic searches rather than the algorithms themselves. Based on our results, two main groups of supertree methods were identified: on one hand, the Matrix Representation with Parsimony (MRP), MinFlip and MinCut methods performed well according to our criteria, whereas the Average Consensus, Split Fit and Most Similar Supertree methods showed a poorer performance, or at least did not behave the same way as the total evidence tree. Results for the Super Distance Matrix (SDM), i.e., the most recent approach tested here, were promising with at least one derived method performing as well as MRP, MinFlip and MinCut. The output of each method was only slightly improved when applied to the reduced dataset, suggesting a correct behaviour of the heuristic searches and a relatively low sensitivity of the algorithms to dataset sizes and missing data. 
Results also showed that the MRP analyses could reach a high level of quality even when using a simple heuristic search strategy, with the exception of MRP with Purvis coding scheme and reversible parsimony. The future of supertrees lies in the implementation of a standardised heuristic search for all methods and the increase in computing power to handle large datasets. The latter would prove to be particularly useful for promising approaches such as the Maximum Quartet fit method that yet requires substantial computing power.}
}
Citation for Study 10580
Citation title:
"Comparative Performance of Supertree Algorithms in Large Datasets using the Soapberry Family (Sapindaceae) as a case study".
Study name:
"Comparative Performance of Supertree Algorithms in Large Datasets using the Soapberry Family (Sapindaceae) as a case study".
This study is part of submission 10570 (Status: Published).
Citation
Buerki S., Forest F., Salamin N., & Alvarez N. 2010. Comparative Performance of Supertree Algorithms in Large Datasets using the Soapberry Family (Sapindaceae) as a case study. Systematic Biology.
Authors
- Buerki S. (submitter)
- Forest F.
- Salamin N.
- Alvarez N.
Abstract
For the last two decades, supertree reconstruction has been an active field of research and has seen the development of a large number of major algorithms. Because of the growing popularity of supertree methods, it has become necessary to evaluate the performance of these algorithms to determine which are the best options (especially with regard to the supermatrix approach, which is widely used). In this study, seven of the most commonly used supertree methods are investigated using a large empirical dataset (in terms of number of taxa and molecular markers) from the worldwide flowering plant family Sapindaceae. Supertree methods were evaluated using several criteria: similarity of the supertrees to the input trees, similarity between the supertrees and the total evidence tree, level of resolution of the supertree, and the computational time required by the algorithm. Additional analyses were also conducted on a reduced dataset to test whether the performance levels were affected by the heuristic searches rather than the algorithms themselves. Based on our results, two main groups of supertree methods were identified: the Matrix Representation with Parsimony (MRP), MinFlip and MinCut methods performed well according to our criteria, whereas the Average Consensus, Split Fit and Most Similar Supertree methods performed more poorly, or at least did not behave the same way as the total evidence tree. Results for the Super Distance Matrix (SDM), the most recent approach tested here, were promising, with at least one derived method performing as well as MRP, MinFlip and MinCut. The output of each method was only slightly improved when applied to the reduced dataset, suggesting correct behaviour of the heuristic searches and a relatively low sensitivity of the algorithms to dataset sizes and missing data.
Results also showed that the MRP analyses could reach a high level of quality even when using a simple heuristic search strategy, with the exception of MRP with the Purvis coding scheme and reversible parsimony. The future of supertrees lies in the implementation of a standardised heuristic search for all methods and in the increase in computing power to handle large datasets. The latter would prove particularly useful for promising approaches such as the Maximum Quartet fit method, which still requires substantial computing power.
Keywords
Heuristic search; Matrix Representation with Parsimony; MinCut; MinFlip; Sapindaceae; Supertree.
About this resource
- Canonical resource URI:
http://purl.org/phylo/treebase/phylows/study/TB2:S10580
RIS reference
TY - JOUR
ID - 19003
AU - Buerki,Sven
AU - Forest,Félix
AU - Salamin,Nicolas
AU - Alvarez,Nadir
T1 - Comparative Performance of Supertree Algorithms in Large Datasets using the Soapberry Family (Sapindaceae) as a case study
PY - 2010
KW - Heuristic search
KW - Matrix Representation with Parsimony
KW - MinCut
KW - MinFlip
KW - Sapindaceae
KW - Supertree
N2 - For the last two decades, supertree reconstruction has been an active field of research and has seen the development of a large number of major algorithms. Because of the growing popularity of the supertree methods, it has become necessary to evaluate the performance of these algorithms to determine which are the best options (especially in regard of the supermatrix approach which is widely used). In this study, seven of the most commonly used supertree methods are investigated by using a large empirical dataset (in terms of number of taxa and molecular markers) from the worldwide flowering plant family Sapindaceae. Supertree methods were evaluated using several criteria: similarity of the supertrees with the input trees, similarity between the supertrees and the total evidence tree, level of resolution of the supertree and the computational time required by the algorithm. Additional analyses were also conducted on a reduced dataset to test if the performance levels were affected by the heuristic searches rather than the algorithms themselves. Based on our results, two main groups of supertree methods were identified: on one hand, the Matrix Representation with Parsimony (MRP), MinFlip and MinCut methods performed well according to our criteria, whereas the Average Consensus, Split Fit and Most Similar Supertree methods showed a poorer performance, or at least did not behave the same way as the total evidence tree. Results for the Super Distance Matrix (SDM), i.e., the most recent approach tested here, were promising with at least one derived method performing as well as MRP, MinFlip and MinCut. The output of each method was only slightly improved when applied to the reduced dataset, suggesting a correct behaviour of the heuristic searches and a relatively low sensitivity of the algorithms to dataset sizes and missing data. 
Results also showed that the MRP analyses could reach a high level of quality even when using a simple heuristic search strategy, with the exception of MRP with Purvis coding scheme and reversible parsimony. The future of supertrees lies in the implementation of a standardised heuristic search for all methods and the increase in computing power to handle large datasets. The latter would prove to be particularly useful for promising approaches such as the Maximum Quartet fit method that yet requires substantial computing power.
JF - Systematic Biology
ER -