
NSF Org: | DBI Division of Biological Infrastructure |
Recipient: |
|
Initial Amendment Date: | March 12, 2018 |
Latest Amendment Date: | August 16, 2022 |
Award Number: | 1750532 |
Award Instrument: | Continuing Grant |
Program Manager: | Reed Beaman, rsbeaman@nsf.gov, (703)292-7163; DBI Division of Biological Infrastructure, BIO Directorate for Biological Sciences |
Start Date: | July 1, 2018 |
End Date: | June 30, 2024 (Estimated) |
Total Intended Award Amount: | $574,068.00 |
Total Awarded Amount to Date: | $574,068.00 |
Funds Obligated to Date: | FY 2019 = $108,934.00; FY 2020 = $111,495.00; FY 2021 = $112,594.00; FY 2022 = $113,725.00 |
History of Investigator: |
|
Recipient Sponsored Research Office: | 4505 S MARYLAND PKWY, LAS VEGAS, NV, US 89154-9900; (702)895-1357 |
Sponsor Congressional District: |
|
Primary Place of Performance: | NV US 89154-1055 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | ADVANCES IN BIO INFORMATICS |
Primary Program Source: | 01001920DB NSF RESEARCH & RELATED ACTIVIT; 01002021DB NSF RESEARCH & RELATED ACTIVIT; 01002122DB NSF RESEARCH & RELATED ACTIVIT; 01002223DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.074 |
ABSTRACT
Adjacent protein domains interact to fold into a functional protein structure. Adjacent binding sites in the DNA interact with the multiprotein complex for efficient binding. Yet, both protein domains and clusters of binding sites are encoded as one-dimensional arrays in the genome. This project tests the hypothesis that, in order for specific and correct interactions to occur, there are optimal distances between these functional units and that they are maintained across evolutionary time. Since insertions and deletions of DNA (indels) change the distance between these functional units, the genome will be under evolutionary constraint against indel mutations that affect the distance. To test this hypothesis, the investigator will develop software to systematically estimate and compare the changes in distance between functional elements and statistical models to test the likelihood of the events observed. Upon project completion, the scientific community will have tools that identify distances that are conserved, and will be able to predict the effect and importance of indel mutations occurring in these genomic regions. The investigator will hold workshops for girls in grades 6-12 to develop software games that model the concept of evolutionary constraint. She will also develop undergraduate and graduate classes with hands-on activities on molecular evolution and computational sequence analysis.
The goal of this project is to use the variation in rates of indels to infer the evolutionary constraint on the distance between functional elements in the genome. Experimental evidence has been accumulating on selection against indels in the loops and linkers within proteins, and in the space between binding sites of regulatory elements. However, studies on the evolution of these sequences are almost nonexistent because the sequences are difficult to align. This project addresses these challenges by applying methods that model variable indel rates across sites, or methods that model length instead of relying on alignments. Using these methods, the investigator will produce phylogenetic and quantitative estimates of indel rates for a significant proportion of the genome that has been neglected so far. In Objective 1, using new software the investigator has developed, variable site-specific indel rates will be estimated across loops between protein motifs to identify structural motifs with strong constraints on their distance. In Objective 2, the investigator will develop new software based on birth-death processes to estimate indel rates without relying on alignments. Using this software, she will test the hypothesis that there is stronger constraint on the distance between tandem homologous domains than on the distance between non-homologous domains. In Objective 3, the investigator will use the software described above to test the hypothesis that there is stronger constraint on the distance between binding sites of homodimers than on the distance between binding sites of heterodimers. This study will integrate the knowledge gained in the fields of structural biology and developmental biology into a phylogenomic context, and provide tools for the community to test specific evolutionary hypotheses on the distance between functional elements of interest. The results of the project will be presented at https://github.com/HanLabUNLV.
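As an illustration of the length-based idea in Objective 2, the following is a minimal sketch of a linear birth-death process acting on the length of a linker, where each site gains an insertion or suffers a deletion at a constant rate. It is not the investigator's software; the function names, parameters, and rate values are hypothetical and chosen only for illustration. A lower indel rate leaves lengths closer to their starting value, which is the signal one would expect under stronger constraint on distance.

```python
import math
import random


def expected_length(n0, ins_rate, del_rate, t):
    """Mean length under a linear birth-death process: n0 * exp((ins - del) * t)."""
    return n0 * math.exp((ins_rate - del_rate) * t)


def simulate_length(n0, ins_rate, del_rate, t, rng):
    """Gillespie-style simulation of one linker length after time t.

    Each of the n current sites independently gains a site at ins_rate
    and loses a site at del_rate (hypothetical per-site rates).
    """
    n, clock = n0, 0.0
    while n > 0:
        total = (ins_rate + del_rate) * n
        if total == 0:
            break  # no events can occur
        clock += rng.expovariate(total)  # waiting time to the next indel
        if clock > t:
            break
        if rng.random() < ins_rate / (ins_rate + del_rate):
            n += 1  # insertion
        else:
            n -= 1  # deletion
    return n


def length_spread(n0, rate, t, replicates=200, seed=0):
    """Variance of simulated lengths when insertion and deletion rates are equal."""
    rng = random.Random(seed)
    lengths = [simulate_length(n0, rate, rate, t, rng) for _ in range(replicates)]
    mean = sum(lengths) / len(lengths)
    return sum((x - mean) ** 2 for x in lengths) / len(lengths)
```

Comparing `length_spread` for a low and a high rate shows how length variation across lineages could, in principle, be used to rank regions by constraint without aligning them.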
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
The project aimed to understand the evolutionary constraint on distances between elements in the genome. We especially focused on understanding the distance between regulatory elements in the genome.
Based on the accumulating evidence that 3D contact between promoters and enhancers is important in gene regulation, we built a supervised machine learning model to predict functional enhancer-promoter relationships from genomic features. We found that 3D chromatin contact and 1D genomic distance are the two most important features in predicting functional enhancer-promoter pairs. In addition, we found many other genomic features that distinguish promoters regulated by distal enhancers from self-sufficient promoters.
To understand how the chromatin landscape and chromatin contact change across cell types, we built an unsupervised method to integrate the 1D chromatin annotation with 3D chromatin contact. Using this approach, we were able to summarize the types of chromatin that a genomic region is in contact with and find clusters of common contact patterns. This allowed us to observe that a genomic region can dynamically change the types of chromatin that it is in contact with, while maintaining its own chromatin mark.
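The summary step described above can be pictured with a small sketch: for each region, tally the contact weight it shares with partners of each chromatin type, normalize to fractions, and then group regions with similar profiles. This is only an illustration under stated assumptions, not the method built in the project. The function names, the input layout, and the grouping by dominant partner type (a simple stand-in for a real clustering method) are all hypothetical.

```python
from collections import defaultdict


def contact_profiles(contacts, state_of):
    """Summarize which chromatin types each region contacts.

    contacts: iterable of (region_a, region_b, weight) tuples (assumed layout).
    state_of: dict mapping region -> chromatin type label (assumed annotation).
    Returns region -> {chromatin type: fraction of that region's contact weight}.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for a, b, weight in contacts:
        totals[a][state_of[b]] += weight  # a sees b's chromatin type
        totals[b][state_of[a]] += weight  # b sees a's chromatin type
    profiles = {}
    for region, by_state in totals.items():
        total = sum(by_state.values())
        if total == 0:
            continue  # skip regions with no contact weight
        profiles[region] = {s: w / total for s, w in by_state.items()}
    return profiles


def group_by_dominant_state(profiles):
    """Group regions by the chromatin type they contact most (stand-in for clustering)."""
    groups = defaultdict(list)
    for region, profile in sorted(profiles.items()):
        dominant = max(sorted(profile), key=profile.get)
        groups[dominant].append(region)
    return dict(groups)
```

Because the profile depends only on the partners' annotations, a region's profile can shift between cell types even when its own chromatin type stays the same.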
In studying the regulatory regions, we found that a significant proportion of them originate from transposable elements, so we studied how transposable elements can be a source of regulatory elements. First, we showed that the majority of old transposable elements in the human genome have high locus-level mappability. Then we studied the tissue-specific expression of individual transposable elements and found that host (human) immune genes tend to show repressed expression in samples that have higher transposable element expression. We also reviewed the literature on HERV-H and its potential as a regulatory element in pluripotent stem cells.
For broader impact, we ran a summer coding camp every year for four years, with the exception of 2020. The camp was designed to introduce the visual programming language Scratch to middle school girls. About 15 to 20 students participated each year and developed art projects and games using Scratch, including a game that simulated natural selection.
Last Modified: 12/11/2024
Modified by: Mira Han