Award Abstract # 1356288
Collaborative Research: ABI Development: "Beyond Ribosomal RNA genes: Community Tools for Analysis of Whole-Genomes and Metagenomes"

NSF Org: DBI
Division of Biological Infrastructure
Recipient: GEORGIA TECH RESEARCH CORP
Initial Amendment Date: July 8, 2014
Latest Amendment Date: December 5, 2017
Award Number: 1356288
Award Instrument: Continuing Grant
Program Manager: Jen Weller
DBI Division of Biological Infrastructure
BIO Directorate for Biological Sciences
Start Date: July 1, 2014
End Date: December 31, 2018 (Estimated)
Total Intended Award Amount: $823,371.00
Total Awarded Amount to Date: $823,371.00
Funds Obligated to Date: FY 2014 = $239,126.00
FY 2015 = $313,723.00
FY 2016 = $270,522.00
History of Investigator:
  • Konstantinos Konstantinidis (Principal Investigator)
    kostas@ce.gatech.edu
Recipient Sponsored Research Office: Georgia Tech Research Corporation
926 DALNEY ST NW
ATLANTA
GA  US  30318-6395
(404)894-4819
Sponsor Congressional District: 05
Primary Place of Performance: Georgia Institute of Technology
505 10th Street, NW
Atlanta
GA  US  30332-0002
Primary Place of Performance Congressional District: 05
Unique Entity Identifier (UEI): EMW9FC8J3HN4
Parent UEI: EMW9FC8J3HN4
NSF Program(s): ADVANCES IN BIO INFORMATICS
Primary Program Source: 01001415DB NSF RESEARCH & RELATED ACTIVIT
01001516DB NSF RESEARCH & RELATED ACTIVIT
01001617DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 9178, 9179
Program Element Code(s): 116500
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.074

ABSTRACT

The genetic diversity of bacteria and archaea (the prokaryotes) is by far the largest among all living organisms. Whether in soils, waters, human guts, or the atmosphere, prokaryotes affect, if not control, all life-sustaining processes on Earth, but how these microbes interact with and change their environment is not fully understood. Current incomplete understanding is, at least in part, due to the fact that the great majority of microorganisms resist cultivation in the laboratory, i.e., they represent the uncultivable majority, and thus, cannot be studied efficiently. In the past few years, there has been an explosion of culture-independent genomic techniques (a.k.a. metagenomics), which allow the analysis of microorganisms and their communities in their natural habitat by sequencing their entire genomes or transcriptomes, bypassing the need for lab cultivation. However, the development of computational tools and algorithms to analyze metagenomic data is lagging behind developments in sequencing technologies. To advance the understanding of the uncultivable majority of microorganisms, and take full advantage of the investment of society in genomic technologies, new quantitative approaches are needed. The goals of this project are: 1) to develop new computational tools that fulfill critical research needs and thus, help scientists understand the composition, functions and values of the microbial communities, and 2) to train faculty from undergraduate colleges, including community colleges, in new metagenomics techniques, which are positioned at the interface of microbiology, genomics, bioinformatics, and computational biology, a pivotal area of contemporary research and education that is inadequately covered in traditional curricula. Therefore, these activities are expected to provide important infrastructure for training the future workforce and to facilitate contemporary research.

The small subunit ribosomal RNA gene (SSU rRNA) has been successfully used to catalogue and study the diversity of microorganisms for the last two decades. This work has been facilitated by the development of dedicated resources (databases and tool repositories) such as the Ribosomal Database Project (RDP; http://rdp.cme.msu.edu). However, rRNA gene-based studies have important limitations that techniques based on genome sequences do not. For instance, genomic techniques can better resolve microbial communities at the levels where the SSU rRNA gene provides inadequate resolution, namely the species and finer levels, and can catalogue whole-genome diversity and fluidity, which are relevant for nutrient cycling, bioremediation efforts, and the emergence of microbial antibiotic resistance. This project seeks to develop tools that overcome several of the limitations of the rRNA gene-based approaches and allow the efficient analysis of microbiomes. Robust implementations will be provided of both well-accepted existing methods, such as genome-aggregate average nucleotide identity (gANI) for delineating closely related species and strains, and newer methods, including the recently developed Nonpareil method for estimating the coverage of a microbial community obtained by a metagenomic dataset and the MyTaxa method for examining horizontal gene transfer events between microbial lineages. The overarching objective is to develop the genome equivalent of the RDP that will enable the scientific community to perform classification and diversity studies at the genome level.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Note:  When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

(Showing: 1 - 10 of 21)
J. C. Castro, L. M. Rodriguez-R, M. R. Weigand, J. K. Hatt, M. Q. Carter, and K. T. Konstantinidis. "imGLAD: metagenome-based accurate detection and quantification of target organisms in environmental samples." PeerJ, v.6, 2018, p.e5882. doi:10.7717/peerj.5882
C. Jain, L. M. Rodriguez-R, A. M. Phillippy, K. T. Konstantinidis, and S. Aluru. "High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries." Nat Commun., v.9, 2018, p.5114. doi:10.1038/s41467-018-07641-9
D. Tsementzi, J. Castro, M. Mahagna, Y. Gottlieb, and K. T. Konstantinidis. "Comparison of closely related, uncultivated Coxiella tick endosymbiont population genomes reveals clues about the mechanisms of symbiosis." Environmental Microbiology, v.20, 2018, p.1751. doi:10.1111/1462-2920.14104
E. R. Johnston, M. Kim, J. K. Hatt, J. R. Phillips, Q. Yao, Y. Song, T. C. Hazen, M. A. Mayes, and K. T. Konstantinidis. "Phosphate addition increases tropical forest soil respiration primarily by deconstraining microbial population growth." Soil Biology and Biochemistry, v.130, 2019, p.43. doi:10.1016/j.soilbio.2018.11.026
K. T. Konstantinidis, R. Rosselló-Móra, and R. Amann. "Uncultivated microbes in need of their own taxonomy." The ISME Journal, v.11, 2017, p.2399. doi:10.1038/ismej.2017.113
K. T. Konstantinidis, R. Rosselló-Móra, and R. Amann. "Reply to the commentary 'Uncultivated microbes - in need of their own nomenclature?'." The ISME Journal, v.12, 2018, p.653. doi:10.1038/s41396-017-0011-y
L. H. Orellana, J. C. Chee-Sanford, R. A. Sanford, F. E. Loffler, and K. T. Konstantinidis. "Year-round shotgun metagenomes reveal stable microbial communities in agricultural soils and novel ammonia oxidizers responding to fertilization." Applied and Environmental Microbiology, v.84, 2018, p.e01646. doi:10.1128/AEM.01646-17
L. M. Rodriguez-R and K. T. Konstantinidis. "The enveomics collection: a toolbox for specialized analyses of microbial genomes and metagenomes." PeerJ Preprints, 2016
L. M. Rodriguez-R, J. C. Castro, N. C. Kyrpides, J. R. Cole, J. M. Tiedje, and K. T. Konstantinidis. "How much do rRNA gene surveys underestimate extant bacterial diversity?" Applied and Environmental Microbiology, v.84, 2018, p.e00014. doi:10.1128/AEM.00014-18
L. M. Rodriguez-R, S. Gunturu, J. Cole, J. M. Tiedje, and K. T. Konstantinidis. "Nonpareil 3: Fast Estimation of Metagenomic Coverage and Sequence Diversity." mSystems, v.3, 2018, p.e00039. doi:10.1128/mSystems.00039-18

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Intellectual merit

The small subunit ribosomal RNA gene (16S rRNA) has been successfully used to catalogue and study the diversity of microbial species and their communities to date. Accordingly, several 16S rRNA gene-based websites and tools are available. Nonetheless, several aspects of the rRNA gene-based studies remain problematic. Most importantly, it remains unclear how to better resolve microbial communities at levels where the 16S rRNA gene provides inadequate resolution, namely the species and finer levels, and how to best catalogue whole-genome diversity and fluidity. Additionally, an explosion in the use of culture-independent genomic approaches (a.k.a. metagenomics) has recently occurred. However, the tools to analyze metagenomic data are clearly lagging behind the developments in sequencing technologies (and data) and are typically limited to the assembly and gene annotation of the metagenomic sequences. To advance the study of microbial species and their communities and take full advantage of the capabilities provided by metagenomics, quantitative whole-genome approaches are clearly needed. It is also important for such approaches to scale with high volumes of data in order to accommodate the geometrically increasing number of genomes and metagenomes that become available.

To address these challenges, we developed and released to the public (November 2016) a webserver called the Microbial Genomes Atlas (MiGA; available at www.microbial-genomes.org). MiGA allows one to classify query complete or partial genomes against a reference database of genomes of all isolated and classified microorganisms using the genome-average nucleotide identity (ANI) and amino-acid identity (AAI) concepts. Therefore, MiGA allows external users to perform classification and diversity studies at the genome level and represents the “genome equivalent” of rRNA webservers. The number of new registered users of the MiGA webserver has grown from a couple per month in 2016, when the server was first launched, to more than 50 new users/month currently (>500 registered users in total), while the total number of search queries processed by the webserver has exceeded 8,000, which attests that MiGA fulfills a critical need of contemporary research. Furthermore, we have developed new algorithms, or optimized previously developed ones, for big data analysis as part of this project. The tools enable various important analyses, such as assessing the extent of species diversity and the amount of sequencing required to cover the diversity in an environmental sample (Nonpareil 3), calculating ANI with a k-mer-based high-throughput algorithm (FastANI), and detecting target genomes (imGLAD) or reads encoding a gene of interest (ROCker and Xander) in complex metagenomes. Finally, we have applied these tools to obtain answers to questions related to several important microbial systems, such as what the microbiome of tick arthropods provides to its host, and how soil microbial communities respond to climate perturbations and agricultural activities.
Our bioinformatics approaches and findings were published in 13 articles (Wang 2015, Rodriguez-R 2016, Konstantinidis 2017, Orellana 2017, Orellana 2017, Castro 2018, Jain 2018, Pena-Gonzalez 2018, Rodriguez 2018, Rodriguez 2018, Rodriguez 2018, Tsementzi 2018, Johnston 2019). 
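
As a rough illustration of the k-mer idea behind alignment-free genome comparison, the Python sketch below computes the Jaccard similarity of the k-mer sets of two sequences and converts it to an approximate identity using the Mash distance formula. It is a toy example under stated assumptions and is not the FastANI or MiGA implementation: the function names (kmers, jaccard, approx_identity), the default k, and the use of exact k-mer sets instead of sketches are all hypothetical choices made for illustration.

```python
import math
from typing import Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all overlapping k-mers in seq (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: str, b: str, k: int = 4) -> float:
    """Jaccard similarity of the k-mer sets of two sequences."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:
        return 0.0  # both sequences are shorter than k
    return len(ka & kb) / len(union)


def approx_identity(a: str, b: str, k: int = 4) -> float:
    """Approximate identity as 1 - D, where D is the Mash distance
    D = -(1/k) * ln(2j / (1 + j)) and j is the k-mer Jaccard similarity.
    Assumption: a toy stand-in, not how FastANI computes ANI."""
    j = jaccard(a, b, k)
    if j == 0.0:
        return 0.0  # no shared k-mers; the formula is undefined at j = 0
    distance = -math.log(2.0 * j / (1.0 + j)) / k
    return max(0.0, 1.0 - distance)


if __name__ == "__main__":
    genome_a = "ACGTACGTTGCAACGT"  # made-up sequences, for illustration only
    genome_b = "ACGTACGTTGCATCGT"
    print(jaccard(genome_a, genome_b))
    print(approx_identity(genome_a, genome_b))
```

Real genomes would call for a much larger k and for sketching (for example MinHash) rather than full k-mer sets; the small k here only keeps the toy sequences comparable.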

Broader impacts

Our work provided long-needed tools for high-throughput analysis of microbial genomes and metagenomes, and an associated webserver that makes these tools freely available for online analysis. The tools will help microbial scientists to significantly advance our understanding of the diversity and function of microbial communities, and are applicable to a variety of microbiome studies across the fields of ecology, systematics, evolution, engineering and medicine. A lecture-based workshop with at least 100 participating faculty from undergraduate colleges, including community colleges, was organized during the 2018 American Society for Microbiology’s Conference for Undergraduate Educators (ASMCUE); it disseminated our tools and knowledge of metagenomics to non-experts and undergraduates. Further, we held workshops with graduate students and their professors to train them on MiGA usage in Atlanta (GA), East Lansing (MI), Puerto Rico, Germany, Greece, China and Brazil. The workshops were met with great success based on participant exit interviews. The project trained 2 post-doctoral associates, 5 Ph.D. students, 4 Master's students, and 7 undergraduate computer science and engineering majors; about half of our students were female. All former students and post-docs have won awards and distinctions for their work, such as three Sigma Xi best PhD thesis awards in different years (only 8 to 10 such awards are given by the Sigma Xi chapter of Georgia Tech per year among all PhD theses published within the year), and secured positions at institutions such as Emory University, Oak Ridge National Laboratory, and the Max Planck Institute in Bremen, Germany. Therefore, this project provided multifaceted learning experiences to both national and international undergraduate and graduate students at the interface of microbiology, evolution, genomics, bioinformatics, and computational biology, a pivotal area of contemporary research and education.


Last Modified: 01/18/2019
Modified by: Konstantinos T Konstantinidis

Please report errors in award information by writing to: awardsearch@nsf.gov.
