Award Abstract # 2030722
EAGER: Covariational Deep Learning for Protein Structure Prediction

NSF Org: IIS
Division of Information & Intelligent Systems
Recipient: AUBURN UNIVERSITY
Initial Amendment Date: May 7, 2020
Latest Amendment Date: May 7, 2020
Award Number: 2030722
Award Instrument: Standard Grant
Program Manager: Sylvia Spengler
sspengle@nsf.gov
 (703)292-7347
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: June 1, 2020
End Date: May 31, 2022 (Estimated)
Total Intended Award Amount: $100,276.00
Total Awarded Amount to Date: $100,276.00
Funds Obligated to Date: FY 2020 = $100,276.00
History of Investigator:
  • Debswapna Bhattacharya (Principal Investigator)
    dbhattacharya@vt.edu
Recipient Sponsored Research Office: Auburn University
321-A INGRAM HALL
AUBURN
AL  US  36849-0001
(334)844-4438
Sponsor Congressional District: 03
Primary Place of Performance: Auburn University
AL  US  36849-0001
Primary Place of Performance Congressional District: 03
Unique Entity Identifier (UEI): DMQNDJDHTDG4
Parent UEI: DMQNDJDHTDG4
NSF Program(s): Info Integration & Informatics
Primary Program Source: 01002021DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7364, 7916, 9150
Program Element Code(s): 736400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

This project will advance current capabilities in protein structure prediction. Only about a third of known protein families with no experimentally-available structures are amenable to homology modeling; that is, have other proteins with sufficiently similar sequence profiles for which structures have been resolved in experimental laboratories. For the majority of known protein families, the so-called dark proteome, this is not the case, and structures are missing. Being able to obtain them experimentally or computationally is essential to understanding the roles of proteins in key cellular processes, obtaining a detailed view of molecular mechanisms, guiding efforts on therapeutic development, engineering proteins with specific functions, and more. This project will advance such efforts for the dark proteome with novel informatics techniques that are capable of harnessing useful signals hidden in protein sequences.

The project will evaluate the hypothesis that covariational signals in multiple sequence alignment can be harnessed to advance free modeling. Deep neural network architectures will be utilized for this purpose. Research activities are organized in two thrusts: (1) development of distant-homology fold recognition methods by alignment of inter-residue distance bounds predicted using 2D deep fully residual networks (FRNs); and (2) development of protein model quality estimation methods driven by per-residue distance errors predicted using 1D deep residual neural networks (ResNets). The project benefits researchers in diverse communities that are working at the interface of computing and biology. Planned activities include free dissemination of novel bioinformatics tools and research results, broadening of participation of K-12 students in computing through creative mentoring and outreach, and increasing public understanding of interdisciplinary science via Samuel Ginn College of Engineering's GINNing podcast series.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Note:  When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Bhattacharya, Sutanu and Roche, Rahmatullah and Moussad, Bernard and Bhattacharya, Debswapna "DisCovER: distance- and orientation-based covariational threading for weakly homologous proteins" Proteins: Structure, Function, and Bioinformatics, v.90, 2021 https://doi.org/10.1002/prot.26254
Bhattacharya, Sutanu and Roche, Rahmatullah and Shuvo, Md Hossain and Bhattacharya, Debswapna "Recent Advances in Protein Homology Detection Propelled by Inter-Residue Interaction Map Threading" Frontiers in Molecular Biosciences, v.8, 2021 https://doi.org/10.3389/fmolb.2021.643752
Kryshtafovych, Andriy and Moult, John and Billings, Wendy M. and Della Corte, Dennis and Fidelis, Krzysztof and Kwon, Sohee and Olechnovič, Kliment and Seok, Chaok and Venclovas, Česlovas and Won, Jonghun "Modeling SARS-CoV-2 proteins in the CASP-commons experiment" Proteins: Structure, Function, and Bioinformatics, v.89, 2021 https://doi.org/10.1002/prot.26231
McGehee, Andrew J. and Bhattacharya, Sutanu and Roche, Rahmatullah and Bhattacharya, Debswapna "PolyFold: An interactive visual simulator for distance-based protein folding" PLOS ONE, v.15, 2020 https://doi.org/10.1371/journal.pone.0243331
Roche, Rahmatullah and Bhattacharya, Sutanu and Bhattacharya, Debswapna "Hybridized distance- and contact-based hierarchical structure modeling for folding soluble and membrane proteins" PLOS Computational Biology, v.17, 2021 https://doi.org/10.1371/journal.pcbi.1008753
Roche, Rahmatullah and Bhattacharya, Sutanu and Shuvo, Md. Hossain and Bhattacharya, Debswapna "rrQNet: Protein contact map quality estimation by deep evolutionary reconciliation" Proteins: Structure, Function, and Bioinformatics, 2022 https://doi.org/10.1002/prot.26394
Shuvo, Md Hossain and Bhattacharya, Sutanu and Bhattacharya, Debswapna "QDeep: distance-based protein model quality estimation by residue-level ensemble error classifications using stacked deep residual neural networks" Bioinformatics, v.36, 2020 https://doi.org/10.1093/bioinformatics/btaa455
Shuvo, Md Hossain and Gulfam, Muhammad and Bhattacharya, Debswapna "DeepRefiner: high-accuracy protein structure refinement by deep network calibration" Nucleic Acids Research, 2021 https://doi.org/10.1093/nar/gkab361
Yan, Da and Qin, Steve and Bhattacharya, Debswapna and Chen, Jake and Zaki, Mohammed J. "20th International Workshop on Data Mining in Bioinformatics (BIOKDD 2021)" Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 2021 https://doi.org/10.1145/3447548.3469442

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The major goal of this project is to evaluate the hypothesis that covariational signals in multiple sequence alignment can be harnessed to advance protein structure prediction. In pursuit of this goal, the PI's laboratory has developed and freely disseminated multiple computational algorithms for covariational protein modeling and quality estimation. These methods have been extensively tested in the Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiments and were officially ranked among the best methods in the tertiary structure prediction, model quality estimation, and structure refinement categories of CASP. The project has resulted in several peer-reviewed journal and conference papers. The research results have been presented as invited and highlight talks in various venues including universities, international conferences, and workshops such as Purdue University, the University of Alabama at Birmingham School of Medicine, Saint Louis University, the University of Alabama, Intelligent Systems for Molecular Biology (ISMB), ACM Conference on Bioinformatics, Computational Biology and Biomedical Informatics (ACM-BCB), Midsouth Computational Biology and Bioinformatics Society (MCBIOS) annual conference, and Workshop on Data Mining in Bioinformatics (BIOKDD). The project has also provided training opportunities at the interface of computing and biology to a number of graduate and undergraduate students. The project partially supported the successful completion of a Ph.D. dissertation at Auburn University, followed by a tenure-track assistant professor position in the United States. Additional project activities include participating in K-12 outreach during the Auburn University Engineering Day to present research demos to several hundred middle school students in Alabama and contributing episodes to the Samuel Ginn College of Engineering's GINNing podcast series.

 


Last Modified: 10/10/2022
Modified by: Debswapna Bhattacharya

Please report errors in award information by writing to: awardsearch@nsf.gov.
