
NSF Org: | DBI Division of Biological Infrastructure |
Recipient: |
|
Initial Amendment Date: | July 26, 2019 |
Latest Amendment Date: | July 26, 2019 |
Award Number: | 1940422 |
Award Instrument: | Standard Grant |
Program Manager: | Jean Gao, DBI Division of Biological Infrastructure, BIO Directorate for Biological Sciences |
Start Date: | September 15, 2019 |
End Date: | August 31, 2021 (Estimated) |
Total Intended Award Amount: | $299,698.00 |
Total Awarded Amount to Date: | $299,698.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: |
107 S INDIANA AVE BLOOMINGTON IN US 47405-7000 (317)278-3473 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
535 W Michigan St., IT 475 Indianapolis IN US 46202-2915 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | Infrastructure Innovation for |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.074 |
ABSTRACT
A basic question in cell biology is to understand the driving mechanisms that control how and when genes are expressed, and to identify the active switches in those processes. The first step of gene expression is "transcription", the production of an RNA molecule from the genomic DNA. As instruments become available that allow detection of the original RNA molecules from cells, it is becoming possible to identify sites where RNA bases have been chemically modified after their initial transcription. This is important because some of these post-transcriptional modifications play a role in how the expressed RNA is translated into expressed protein. Little is known as yet about the molecular players involved in the myriad steps that govern expression patterns, including localization, splicing, stability and folded structure of the RNA. This project aims to detect, identify and quantify the extent of modifications on RNA molecules as measured on the Oxford Nanopore platform, as a required first step in understanding those biological functions. Gold-standard calibration sets of synthetic oligonucleotides will be designed, produced and tested as part of the experimental design, and new algorithms and subsequent software will provide single-nucleotide resolution of the type and locations of robustly detected modifications in natural transcripts in yeast and human data sets.
Lack of efficient high throughput detection methods has plagued the emerging field of epitranscriptomics, which is focused on the role of chemical modifications on RNA bases in modulating the biological function and structure of RNA molecules. The overarching research goal of this project is to develop computational methods to map RNA modification sites for 5-methyl cytosine (5mC), 1-methyl adenosine (m1A) and methylation of the backbone of the RNA nucleotides (Nm) at a single nucleotide resolution. Experiments will employ synthetic calibration oligonucleotides as well as use newly developed algorithms to probe natural yeast and human transcripts, using the long-read direct RNA sequencing data resulting from Oxford Nanopore sequencing technology. The project will complement current transcriptomic reference maps of these modification events with additional data needed to train computational methods, from gold-standard calibration sets composed of synthetic RNA oligonucleotides. The resulting Oxford Nanopore signatures of modification sites will be analyzed using deep learning for signal analysis and statistical methods for robustness in precision and accuracy. All resulting methods, databases and maps of RNA modification types across species will be made publicly available from the project web site. The research program involves a team whose expertise intersects several domains of science, including engineering, bioinformatics, genomics and computational science, providing an excellent environment and experience for developing a new generation of inter-disciplinary scientists. Data, code and other infrastructure resources will be reported at http://www.iupui.edu/~jangalab/.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
One of the fundamental problems in the post-genomic era is to understand how genes are expressed and what driving mechanisms control their expression. Although increasing evidence now points to a much larger role for post-transcriptional mechanisms in controlling the transition from gene to protein, the molecular players that mediate and control this transition from RNA to protein, which governs expression patterns, localization, splicing, stability and structure of RNA, have been relatively under-appreciated. Recent research has led to the discovery of dynamic and often reversible chemical modifications of nucleotide bases on RNA, which are increasingly recognized as key switches in RNA metabolism. Detecting such RNA modifications at the transcriptome level has been challenging because many of the known RNA modifications are either silent in reverse transcription or do not pass their modification marks on to the cDNA. In addition, most current methods, although high-throughput, do not provide single-nucleotide, single-molecule resolution for mapping RNA modifications. Hence, the lack of efficient high-throughput detection methods has held back the field of epitranscriptomics. Over the course of this EAGER project, my lab has developed an open-access, gold-standard database of RNA modifications titled Epitomy (https://epitomy.soic.iupui.edu/), which houses RNA modification loci in the human and mouse genomes for ten different RNA modifications. Employing such curated datasets, we published an interactive visual analytics platform called sequoia for interpretation and feature extraction from nanopore sequencing datasets. These resources and computational tools have also enabled us to develop multiple machine learning methods for predicting RNA modifications from nanopore direct RNA-sequencing datasets, including the penguin tool, which detects pseudouridine sites in single-molecule sequencing datasets.
The resulting methods, databases and maps of RNA modification types across species have so far led to four publications and will continue to accelerate our understanding of the epitranscriptome code and open new frontiers in single-molecule direct mapping of RNA modifications. Because the research program has intersected several computational (data mining, algorithms, machine learning and software development), engineering (nanotechnology), chemical and biological (genomics, gene regulation and RNA bioinformatics) areas of research, it has provided an excellent environment for developing the next generation of inter-disciplinary scientists.
Last Modified: 12/30/2021
Modified by: Sarath C Janga