
NSF Org: | IIS Division of Information & Intelligent Systems |
Recipient: |
|
Initial Amendment Date: | May 7, 2020 |
Latest Amendment Date: | May 7, 2020 |
Award Number: | 2030722 |
Award Instrument: | Standard Grant |
Program Manager: | Sylvia Spengler, sspengle@nsf.gov, (703)292-7347, IIS Division of Information & Intelligent Systems, CSE Directorate for Computer and Information Science and Engineering |
Start Date: | June 1, 2020 |
End Date: | May 31, 2022 (Estimated) |
Total Intended Award Amount: | $100,276.00 |
Total Awarded Amount to Date: | $100,276.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: | 321-A INGRAM HALL AUBURN AL US 36849-0001 (334)844-4438 |
Sponsor Congressional District: |
|
Primary Place of Performance: | AL US 36849-0001 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | Info Integration & Informatics |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
This project will advance current capabilities in protein structure prediction. Only about a third of known protein families with no experimentally available structures are amenable to homology modeling; that is, they have related proteins with sufficiently similar sequence profiles whose structures have been resolved in experimental laboratories. For the majority of known protein families, the so-called dark proteome, this is not the case: structures are missing. Being able to obtain them experimentally or computationally is key to understanding the roles of proteins in key cellular processes, obtaining a detailed view of molecular mechanisms, guiding efforts on therapeutic development, engineering proteins with specific functions, and more. This project will advance such efforts for the dark proteome with novel informatics techniques capable of harnessing useful signals hidden in protein sequences.
The project will evaluate the hypothesis that covariational signals in multiple sequence alignments can be harnessed to advance free modeling. Deep neural network architectures will be utilized for this purpose. Research activities are organized in two thrusts: (1) development of distant-homology fold recognition methods by alignment of inter-residue distance bounds predicted using 2D deep fully residual networks (FRNs); and (2) development of protein model quality estimation methods driven by per-residue distance errors predicted using 1D deep residual neural networks (ResNets). The project benefits researchers in diverse communities that are working at the interface of computing and biology. Planned activities include free dissemination of novel bioinformatics tools and research results, broadening the participation of K-12 students in computing through creative mentoring and outreach, and increasing public understanding of interdisciplinary science via the Samuel Ginn College of Engineering's GINNing podcast series.
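As a rough illustration of what a covariational signal in a multiple sequence alignment can look like, the sketch below computes the mutual information between two alignment columns. This is a classical, much simpler measure than the deep networks named above and is not the project's method; the function name `column_mutual_information` and the toy alignment are assumptions made for this example only.

```python
import math
from collections import Counter


def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `msa` is a list of equal-length aligned sequences (hypothetical toy input).
    High values suggest the two positions vary together across the family.
    """
    n = len(msa)
    col_i = [seq[i] for seq in msa]
    col_j = [seq[j] for seq in msa]
    count_i = Counter(col_i)               # residue counts in column i
    count_j = Counter(col_j)               # residue counts in column j
    count_ij = Counter(zip(col_i, col_j))  # joint counts of residue pairs

    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts to limit rounding
        ratio = count * n / (count_i[a] * count_j[b])
        mi += p_ab * math.log2(ratio)
    return mi


# Toy alignment (an assumption): columns 0 and 1 co-vary perfectly,
# column 2 is fully conserved, column 3 varies independently of column 0.
toy_msa = ["ACDE", "ACDF", "GHDE", "GHDF"]

mi_covarying = column_mutual_information(toy_msa, 0, 1)
mi_conserved = column_mutual_information(toy_msa, 0, 2)
mi_independent = column_mutual_information(toy_msa, 0, 3)
```

In this toy case the co-varying pair scores 1 bit while the conserved and independent pairs score 0, which is the kind of pairwise contrast that covariation-based methods build on.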
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
The major goal of this project is to evaluate the hypothesis that covariational signals in multiple sequence alignments can be harnessed to advance protein structure prediction. In pursuit of this goal, the PI's laboratory has developed and freely disseminated multiple computational algorithms for covariational protein modeling and quality estimation. These methods have been extensively tested in the Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiments and were officially ranked among the best methods in the tertiary structure prediction, model quality estimation, and structure refinement categories of CASP. The project has resulted in several peer-reviewed journal and conference papers. The research results have been presented as invited and highlight talks at various venues, including universities, international conferences, and workshops such as Purdue University, the University of Alabama at Birmingham School of Medicine, Saint Louis University, the University of Alabama, Intelligent Systems for Molecular Biology (ISMB), the ACM Conference on Bioinformatics, Computational Biology and Biomedical Informatics (ACM-BCB), the Midsouth Computational Biology and Bioinformatics Society (MCBIOS) annual conference, and the Workshop on Data Mining in Bioinformatics (BIOKDD). The project has also provided training opportunities at the interface of computing and biology to a number of graduate and undergraduate students. The project partially supported the successful completion of a Ph.D. dissertation at Auburn University, whose author subsequently obtained a tenure-track assistant professor position in the United States. Additional project activities include participating in K-12 outreach during the Auburn University Engineering Day to present research demos to several hundred middle school students in Alabama and contributing episodes to the Samuel Ginn College of Engineering's GINNing podcast series.
Last Modified: 10/10/2022
Modified by: Debswapna Bhattacharya