
NSF Org: | DEB Division Of Environmental Biology |
Recipient: |
|
Initial Amendment Date: | November 28, 2012 |
Latest Amendment Date: | November 29, 2012 |
Award Number: | 1240049 |
Award Instrument: | Standard Grant |
Program Manager: | Simon Malcomber, smalcomb@nsf.gov, (703)292-8227, DEB Division Of Environmental Biology, BIO Directorate for Biological Sciences |
Start Date: | December 1, 2012 |
End Date: | March 31, 2019 (Estimated) |
Total Intended Award Amount: | $650,121.00 |
Total Awarded Amount to Date: | $650,121.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: | 3 RUTGERS PLZ NEW BRUNSWICK NJ US 08901-8559 (848)932-0150 |
Sponsor Congressional District: |
|
Primary Place of Performance: | NJ US 08901-8559 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | ASSEMBLING THE TREE OF LIFE |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.074 |
ABSTRACT
Viruses affect all forms of cellular life; however, it is extremely difficult to trace their evolutionary history. This project will reconstruct the phylogeny of an understudied branch of the viral tree of life: that of circular eukaryotic single-stranded DNA (CESS) viruses. CESS viruses are the smallest known viruses, yet they include pathogens with devastating impacts on agriculture worldwide. Recently, general environmental sampling has uncovered a diversity of CESS viruses that are only distantly related to these pathogens and has shown that they infect a wider range of hosts than previously thought, including insects and fungi. This project systematically tests previously unexplored hosts to discover, sequence, and classify their CESS viruses. Phylogenetic analyses will then determine how all CESS viruses are related to one another, providing a framework for classifying the CESS viral tree of life.
Broader impacts of this research include training of graduate students, undergraduates, and postdocs, as well as K-12 outreach and creation of web-based resources for the scientific and general public. Middle school students will be involved in collecting invertebrate samples, which will be processed during a virology training workshop for graduate students and postdocs on CESS virus genomics. Project results will be displayed on a website to increase public knowledge about CESS viruses, and bioinformatics tools will be shared freely with the scientific community.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
We determined how a large and diverse group of viruses, those with circular single-stranded DNA genomes that encode a particular Rep protein to help replicate their genomes (CRESS DNA viruses), are related to each other. Using Rep, the only protein shared by all of these viruses, we built a family tree of over 900 species of CRESS DNA viruses. We found that the standard tools evolutionary biologists use to describe protein evolution in bacteria and eukaryotes did not describe the evolution of Rep proteins in these viruses very well. This motivated us to create a new model for Rep protein evolution. Compared to standard models, this bespoke model improved our ability to see deep relationships among families of CRESS DNA viruses.
The most exciting conclusion from our tree (image 1) is that the Reps of families Geminiviridae (green) and Genomoviridae (pink) are not separate groups. Instead, it is clear that genomovirus Rep evolved from a geminivirus Rep, specifically a Rep that has to be made in multiple parts and spliced together to function. This splice-form Rep appears to have evolved only once among these families, and it is used by four genera of geminiviruses, almost all genomoviruses, and the unclassified Rep sequences (black) that fall in the same group on the tree. It is logical that the more complex splice-form Rep would have evolved once rather than multiple times in a similar way, but our study was the first time this mixed relationship among geminivirus and genomovirus Reps has been observed. Our high confidence in this relationship comes from strong statistical support in analyses using our bespoke model for CRESS DNA virus evolution. We have made our model freely available and incorporated it into ProtTest, a common tool for determining which evolutionary model to use (screenshot in image 2). The model and much more information about CRESS DNA viruses are available at cressdna.org.
We also used four types of machine learning to help automate the task of assigning newly sequenced genomes to a CRESS DNA virus family and to find new patterns in CRESS DNA viruses that can help us understand their evolution. Surprisingly, one simple metric proved useful for classifying CRESS DNA viruses: the amino acid content of the Rep (or of the coat protein, another protein common in CRESS DNA viruses). This is a useful, fast and independent complement to current classification methods, all of which rely on comparing the order of amino acids (or DNA) between sequences. One of our machine learning approaches also identified features of each genus of CRESS DNA viruses that were useful for classification, and we plan to investigate whether these features have biological significance to these viral groups.
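To illustrate what an amino acid content metric can look like, here is a minimal Python sketch. It is not the project's code: the report does not say how amino acid content was computed or which machine learning methods were used, so the function names (aa_composition, composition_distance), the choice of Euclidean distance, and the toy sequences are all assumptions made for illustration. The sketch turns a protein sequence into a vector of 20 amino acid fractions and shows that the vector does not depend on the order of residues, which is why such a metric is independent of alignment-based comparisons.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(protein_seq):
    """Return the fraction of each standard amino acid in a protein sequence.

    Hypothetical helper: non-standard characters (gaps, X, *) are ignored.
    """
    seq = protein_seq.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def composition_distance(seq_a, seq_b):
    """Euclidean distance between two composition vectors (an assumed choice)."""
    a = aa_composition(seq_a)
    b = aa_composition(seq_b)
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Toy sequences, invented for illustration; these are not real Rep proteins.
toy_a = "MKKLAAGG"
toy_b = "GGAALKKM"  # same residues as toy_a in a different order
toy_c = "WWYYFFHH"  # entirely different residues

print(composition_distance(toy_a, toy_b))      # 0.0, since order is ignored
print(composition_distance(toy_a, toy_c) > 0)  # True
```

Composition vectors like these could then be passed as features to a general-purpose classifier; which classifier suits CRESS DNA virus families is not something this sketch addresses.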
This project trained three graduate students, two postdocs and two undergraduates in computational biology skills, data analysis and scientific communication. Members of the lab participated in outreach activities at local middle schools, including teaching over one hundred 7th graders about how protein evolution occurs.
Last Modified: 07/05/2019
Modified by: Siobain M Duffy