Award Abstract # 1240049
Collaborative Research: AToL: ACCESS DNA viruses: A Comprehensive survey of Circular Eukaryotic Single-Stranded DNA viruses in Invertebrates and Fungi

NSF Org: DEB
Division Of Environmental Biology
Recipient: RUTGERS, THE STATE UNIVERSITY
Initial Amendment Date: November 28, 2012
Latest Amendment Date: November 29, 2012
Award Number: 1240049
Award Instrument: Standard Grant
Program Manager: Simon Malcomber
smalcomb@nsf.gov
(703)292-8227
DEB Division Of Environmental Biology
BIO Directorate for Biological Sciences
Start Date: December 1, 2012
End Date: March 31, 2019 (Estimated)
Total Intended Award Amount: $650,121.00
Total Awarded Amount to Date: $650,121.00
Funds Obligated to Date: FY 2013 = $650,121.00
History of Investigator:
  • Siobain Duffy (Principal Investigator)
    siobain@sebs.rutgers.edu
Recipient Sponsored Research Office: Rutgers University New Brunswick
3 RUTGERS PLZ
NEW BRUNSWICK
NJ  US  08901-8559
(848)932-0150
Sponsor Congressional District: 12
Primary Place of Performance: Rutgers University New Brunswick
NJ  US  08901-8559
Primary Place of Performance Congressional District: 12
Unique Entity Identifier (UEI): M1LVPE5GLSD9
Parent UEI:
NSF Program(s): ASSEMBLING THE TREE OF LIFE
Primary Program Source: 01001314DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7689, 9169, EGCH
Program Element Code(s): 768900
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.074

ABSTRACT

Viruses affect all forms of cellular life; however, it is extremely difficult to trace their evolutionary history. This project will reconstruct the phylogeny of an understudied branch of the viral tree of life: that of circular eukaryotic single-stranded DNA (CESS) viruses. CESS viruses are the smallest known viruses, yet they include pathogens with devastating impacts on agriculture worldwide. Recently, general environmental sampling has uncovered a diversity of CESS viruses that are only distantly related to these pathogens and has shown that they infect a wider range of hosts than previously thought, including insects and fungi. This project systematically tests previously unexplored hosts to discover, sequence, and classify their CESS viruses. Phylogenetic analyses will then determine how all CESS viruses are related to one another, providing a framework for classifying the CESS viral tree of life.

Broader impacts of this research include training of graduate students, undergraduates, and postdocs, as well as K-12 outreach and creation of web-based resources for the scientific and general public. Middle school students will be involved in collecting invertebrate samples, which will be processed during a virology training workshop for graduate students and postdocs on CESS virus genomics. Project results will be displayed on a website to increase public knowledge about CESS viruses, and bioinformatics tools will be shared freely with the scientific community.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Zhao, L and Rosario, K and Breitbart, M and Duffy, S. "Eukaryotic Circular Rep-Encoding Single-Stranded DNA (CRESS DNA) Viruses: Ubiquitous Viruses With Small Genomes and a Diverse Host Range" Advances in Virus Research, v.103, 2019
Zhao, Lele and Lavington, Erik and Duffy, Siobain. "Truly ubiquitous CRESS DNA viruses scattered across the eukaryotic tree of life" Journal of Evolutionary Biology, v.34, 2021 https://doi.org/10.1111/jeb.13927

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

We determined how a large and diverse group of viruses are related to each other: those with circular single-stranded DNA genomes that encode a particular Rep protein to help replicate their genomes (CRESS DNA viruses). Using Rep, the only protein shared by all of these viruses, we built a family tree of over 900 species of CRESS DNA viruses. We found that the standard tools evolutionary biologists use to describe protein evolution in bacteria and eukaryotes did not describe the evolution of Rep proteins in these viruses very well. This motivated us to create a new model for Rep protein evolution. Compared to standard models, this bespoke model improved our ability to see deep relationships among families of CRESS DNA viruses.

The most exciting conclusion from our tree (image 1) is that the Reps of families Geminiviridae (green) and Genomoviridae (pink) do not form separate groups. Instead, it is clear that the genomovirus Rep evolved from a geminivirus Rep, specifically a Rep that has to be made in multiple parts and spliced together to function. This splice-form Rep appears to have evolved only once among these families, and it is used by four genera of geminiviruses, almost all genomoviruses, and the unclassified Rep sequences (black) that fall in the same group on the tree. It is logical that the more complex splice-form Rep would have evolved once rather than multiple times in a similar way, but our study was the first to observe this mixed relationship among geminivirus and genomovirus Reps. Our high confidence in this relationship comes from strong statistical support in analyses using our bespoke model for CRESS DNA virus evolution. We have made our model freely available and incorporated it into ProtTest, a common tool for determining which evolutionary model to use (screenshot in image 2). The model, and much more information about CRESS DNA viruses, is available at cressdna.org.
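As a rough illustration of what a model-selection tool does, candidate models of protein evolution can be ranked by an information criterion such as the Akaike information criterion (AIC), computed from each model's maximized log-likelihood and its number of free parameters. The sketch below is an assumption made for illustration, not the project's code or ProtTest's implementation; the function names are hypothetical, and the caller would supply log-likelihoods obtained from an actual phylogenetic fit.

```python
def aic(log_likelihood, num_params):
    """Akaike information criterion: lower values indicate a better trade-off
    between fit (log-likelihood) and complexity (number of free parameters)."""
    return 2 * num_params - 2 * log_likelihood


def best_model_by_aic(fits):
    """Return the name of the model with the lowest AIC.

    `fits` maps a model name to a (log_likelihood, num_params) pair.
    The names and values are whatever the caller provides; nothing here
    is tied to a specific substitution model.
    """
    if not fits:
        raise ValueError("no model fits supplied")
    return min(fits, key=lambda name: aic(*fits[name]))
```

A caller would pass one entry per candidate model, for example a standard model and a custom Rep model fitted to the same alignment, and compare the returned name against expectations.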

We also used four types of machine learning to help automate the task of determining which CRESS DNA virus family a newly sequenced genome belongs to, and to find new patterns in CRESS DNA viruses that can help us understand their evolution. Surprisingly, a simple metric that proved useful for classifying CRESS DNA viruses was the amino acid content of the Rep (or of the coat protein, another protein common in CRESS DNA viruses). This is a useful, fast and independent complement to current classification methods, all of which rely on comparing the order of amino acids (or DNA) between sequences. One of our machine learning approaches also identified features of each genus of CRESS DNA viruses that were useful for classification, and we plan to investigate whether these features have biological significance to these viral groups.
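To make the amino acid content metric concrete, the following minimal sketch turns a protein sequence into a 20-element vector of amino acid frequencies, which could then serve as input features for a classifier. It is an illustration under stated assumptions, not the project's pipeline: the function name and the choice to ignore non-standard characters are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every vector is comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(protein):
    """Return the fraction of each standard amino acid in `protein`.

    Characters outside the 20 standard one-letter codes (gaps, stops,
    ambiguity codes) are ignored; this is an assumption of the sketch.
    The result does not depend on the order of residues in the sequence.
    """
    counts = Counter(ch for ch in protein.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]
```

Because the vector ignores residue order, two sequences that cannot be aligned reliably can still be compared, which is what makes this kind of feature independent of alignment-based methods.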

This project trained three graduate students, two postdocs and two undergraduates in computational biology skills, data analysis and scientific communication. Members of the lab participated in outreach activities at local middle schools, including teaching over one hundred 7th graders about how protein evolution occurs.

 


Last Modified: 07/05/2019
Modified by: Siobain M Duffy

Please report errors in award information by writing to: awardsearch@nsf.gov.
