Award Abstract # 1355071
Collaborative Research: Bayesian Model Checking for Phylogenetics in the Post-Genomic Era

NSF Org: DEB
Division Of Environmental Biology
Recipient: LOUISIANA STATE UNIVERSITY
Initial Amendment Date: June 14, 2014
Latest Amendment Date: June 14, 2014
Award Number: 1355071
Award Instrument: Standard Grant
Program Manager: Simon Malcomber
smalcomb@nsf.gov
(703)292-8227
DEB Division Of Environmental Biology
BIO Directorate for Biological Sciences
Start Date: August 1, 2014
End Date: July 31, 2019 (Estimated)
Total Intended Award Amount: $418,252.00
Total Awarded Amount to Date: $418,252.00
Funds Obligated to Date: FY 2014 = $418,252.00
History of Investigator:
  • Jeremy Brown (Principal Investigator)
    jembrown@lsu.edu
Recipient Sponsored Research Office: Louisiana State University
202 HIMES HALL
BATON ROUGE
LA  US  70803-0001
(225)578-2760
Sponsor Congressional District: 06
Primary Place of Performance: Louisiana State University & Agricultural and Mechanical College
202 Himes Hall
Baton Rouge
LA  US  70803-2701
Primary Place of Performance Congressional District: 06
Unique Entity Identifier (UEI): ECQEYCHRNKJ4
Parent UEI:
NSF Program(s): PHYLOGENETIC SYSTEMATICS
Primary Program Source: 01001415DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1171, 9150, 9169, EGCH
Program Element Code(s): 117100
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.074

ABSTRACT

Diagrams of evolutionary relationships (phylogenetic trees) for species and genes are widely employed in biological research, including the fields of medicine, epidemiology, forensics, conservation, evolutionary biology and agriculture. This research project will explore new ideas and develop new software tools to improve the accuracy with which phylogenetic relationships are determined; in this way the research will contribute to improved understanding and decision-making for a broad range of scientific disciplines and practical applications. Results from this research will be broadly disseminated, including through in-person and online training opportunities to familiarize researchers in the relevant disciplines with these newly developed computer-based analytical tools. Further, the research activities will involve the participation and training of a postdoctoral scholar, a graduate student, and several undergraduates at Louisiana State University (LSU) and the University of Hawaii at Manoa. This project will be incorporated into a seminar series at LSU focused on increasing awareness of computational biology among undergraduate students.

Phylogenetic trees are now routinely inferred from enormous genome-scale data sets, revealing extensive variation in apparent phylogenetic signal across loci. However, no general tools currently exist to objectively and quantitatively assess how much of this variation is due to biological processes and how much is caused by methodological error. Distinguishing between true variation and error is the problem to be studied in this project, as resolving this issue is essential for robustly resolving the Tree of Life and for understanding genomic evolution. The goal of this work is to give researchers the tools to identify and avoid situations where phylogenetic inferences are unreliable. These tools will be implemented in open-source software (RevBayes and R), and will be easily extensible to many types of phylogenetic inference beyond those in this project. This research will implement suites of existing, alternative statistical approaches employing Bayesian posterior prediction to rigorously assess absolute fit of phylogenetic models to evolutionary data, and how this fit impacts the reliability of inference. Simulations comparing performance of alternative models will focus on three types of inferences: (i) estimation of individual gene trees, (ii) estimation of species trees from many genes, and (iii) comparative analysis of continuous traits. These approaches will be applied to exemplar empirical questions, including the placement of turtles among amniotes using several recently published genome-scale data sets. These data contain surprising and massive heterogeneity in phylogenetic signal regarding the placement of turtles, and thus form an excellent case study.
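
As a rough illustration of the posterior predictive idea described above, the following minimal Python sketch checks the absolute fit of a deliberately simple toy model. It is not the RevBayes implementation and does not use a phylogenetic model: it assumes each alignment site is independently variable with probability theta under a uniform prior, and it uses the longest run of consecutive variable sites as the test statistic. The function names, the toy model, and the choice of statistic are all assumptions made for this example.

```python
import random


def longest_run(sites):
    """Length of the longest run of consecutive variable (truthy) sites."""
    best = current = 0
    for s in sites:
        current = current + 1 if s else 0
        best = max(best, current)
    return best


def posterior_predictive_p(observed, n_draws=2000, seed=1):
    """Posterior predictive p-value for the longest-run statistic.

    Toy model (an assumption for illustration only): each site is variable
    independently with probability theta, with a Beta(1, 1) prior on theta.
    """
    rng = random.Random(seed)
    n = len(observed)
    k = sum(observed)
    t_obs = longest_run(observed)
    extreme = 0
    for _ in range(n_draws):
        # Draw theta from its conjugate posterior, Beta(1 + k, 1 + n - k).
        theta = rng.betavariate(1 + k, 1 + n - k)
        # Simulate a replicate data set of the same size under the model.
        simulated = [rng.random() < theta for _ in range(n)]
        # Count replicates at least as extreme as the observed data.
        if longest_run(simulated) >= t_obs:
            extreme += 1
    return extreme / n_draws


# Two hypothetical data sets with the same number of variable sites:
clustered = [1] * 20 + [0] * 80                        # variation in one block
spread = [1 if i % 5 == 0 else 0 for i in range(100)]  # variation spread evenly

p_clustered = posterior_predictive_p(clustered)
p_spread = posterior_predictive_p(spread)
```

A very small p-value, as expected for the clustered data, suggests that the model cannot reproduce a feature of the observed data, which is the kind of poor absolute fit the project aims to detect. In a real analysis the replicate data sets would be simulated from posterior samples of trees and substitution-model parameters rather than from this toy model.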

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 17)
Jeremy M. Brown and Robert C. Thomson "Bayes factors unmask highly variable information content, bias, and extreme influence in phylogenomic analyses" Systematic Biology , v.66 , 2017 , p.517 10.1093/sysbio/syw101
Anthony J. Barley, Jeremy M. Brown, and Robert C. Thomson "Impact of model violations on the inference of species boundaries under the multispecies coalescent" Systematic Biology , v.67 , 2018 , p.269 10.1093/sysbio/syx073
Emilie J. Richards, Jeremy M. Brown, Anthony J. Barley, Rebecca A. Chong, and Robert C. Thomson "Variation across mitochondrial gene trees provides evidence for systematic error: how much gene tree variation is biological?" Systematic Biology , v.67 , 2018 , p.847 10.1093/sysbio/syy013
Jeremy M. Brown and Robert C. Thomson "Evaluating model performance in evolutionary biology" Annual Review of Ecology, Evolution, and Systematics , v.49 , 2018 , p.95 10.1146/annurev-ecolsys-110617-062249
Jeremy M. Brown and Robert C. Thomson "The behavior of Metropolis-coupled Markov chains when sampling rugged phylogenetic distributions" Systematic Biology , v.67 , 2018 , p.729 10.1093/sysbio/syy008
Robert C. Thomson, David C. Plachetzki, D. Luke Mahler, and Brian R. Moore "A critical appraisal of the use of microRNA data in phylogenetics" Proceedings of the National Academy of Sciences of the United States of America , v.111 , 2014 , p.E3659 10.1073/pnas.1407207111

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The importance of robust and flexible statistical methods is now well established in evolutionary biology and particularly in the field of phylogenetics, which focuses on understanding the historical relationships among species. Despite great progress over the past four decades in expanding the scope and rigor of phylogenetic statistical methods, there are still critical holes that need to be filled. All statistical methods rely on certain assumptions and a crucial step in a robust inference framework is to critically examine those assumptions in light of the data. While methods for doing so are now standard in traditional statistical applications (e.g., linear regression, ANOVAs, and t-tests), there has been no widely agreed upon approach for doing so in phylogenetics. This project developed a suite of statistical tools and associated software for investigating how well phylogenetic models fit empirical data. These tools allow researchers to investigate why phylogenetic signal (i.e., the information about genealogical relationships among organisms or their genes) varies across different parts of a genome, allowing them to assess the reliability of conclusions regarding how genes, genomes, and other characteristics of organisms have evolved over time. We implemented these tools in the popular and freely available RevBayes software package in order to make them easily accessible for users. The project has resulted in 10 publications to date.

This project facilitated the training and full collaborative participation of two postdoctoral researchers, four graduate students, and two undergraduate students. In addition, this project supported the development of training resources that were incorporated into several phylogenetic workshops that together have provided training to dozens of graduate students. This work also served as the primary foundation for a new phylogenomics workshop that the PIs offered for the first time in 2019. To facilitate widespread training and adoption, the project also developed new tutorials that are available on the RevBayes website.


Last Modified: 11/07/2019
Modified by: Jeremy M Brown
