Award Abstract # 1943008
CAREER: Physics-Constrained Modeling of Molecular Texts, Graphs, and Images for Deciphering Protein-Protein Interactions

NSF Org: CCF
Division of Computing and Communication Foundations
Recipient: TEXAS A&M ENGINEERING EXPERIMENT STATION
Initial Amendment Date: January 24, 2020
Latest Amendment Date: June 24, 2024
Award Number: 1943008
Award Instrument: Continuing Grant
Program Manager: Stephanie Gage
sgage@nsf.gov
 (703)292-4748
CCF
 Division of Computing and Communication Foundations
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: July 1, 2020
End Date: June 30, 2025 (Estimated)
Total Intended Award Amount: $500,000.00
Total Awarded Amount to Date: $500,000.00
Funds Obligated to Date: FY 2020 = $94,148.00
FY 2021 = $97,369.00
FY 2022 = $100,030.00
FY 2023 = $102,792.00
FY 2024 = $105,661.00
History of Investigator:
  • Yang Shen (Principal Investigator)
    yshen@tamu.edu
Recipient Sponsored Research Office: Texas A&M Engineering Experiment Station
3124 TAMU
COLLEGE STATION
TX  US  77843-3124
(979)862-6777
Sponsor Congressional District: 10
Primary Place of Performance: Texas A&M Engineering Experiment Station
188 Bizzell Street
College Station
TX  US  77843-3128
Primary Place of Performance Congressional District: 10
Unique Entity Identifier (UEI): QD1MX6N5YTN4
Parent UEI: QD1MX6N5YTN4
NSF Program(s): FET-Fndtns of Emerging Tech
Primary Program Source: 01002021DB NSF RESEARCH & RELATED ACTIVIT
01002122DB NSF RESEARCH & RELATED ACTIVIT
01002223DB NSF RESEARCH & RELATED ACTIVIT
01002324DB NSF RESEARCH & RELATED ACTIVIT
01002425DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1045, 7931, 9251
Program Element Code(s): 089Y00
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Proteins are essential components of biological systems and often function through interactions. Data are rapidly accumulating on which proteins and which protein-protein interactions (PPIs) are present in biological systems, but a major barrier to understanding and engineering those systems remains: knowledge of how proteins interact in three-dimensional (3D) space is limited. This project is designed to help fill that gap by developing computational methods that predict the mechanism-revealing 3D structures formed by PPIs. The methods will follow a data-focused yet physics-rationalized approach, which is expected to advance the state of knowledge across natural science and artificial intelligence. The outcomes of the project will facilitate deciphering and engineering genome-wide PPIs for applications such as novel therapeutics, clean energy, and smart materials. The project also includes educational activities that promote awareness of, participation in, training for, and communication of data-driven scientific discovery among students, educators, domain scientists, and the general public. The highly interdisciplinary research and education activities will be integrated to foster a diverse, globally competitive workforce, including historically underrepresented groups, that is ready for the era of big data.

The research goal of this project is to advance the state of the art in structural PPI prediction by rethinking the problem as one of explaining how pairs of proteins, represented in various data forms such as texts, graphs, or images, interact under governing physics. In pursuit of this goal, the research objectives span three levels of PPI structural prediction of increasing resolution and difficulty: residue-level contact maps, residue-level distance distributions, and atom-level 3D structures. Motivated by these objectives, novel machine learning algorithms will be developed that contribute to foundational algorithm research, including the effective integration of and learning from heterogeneous data as well as the flexible representation and incorporation of domain knowledge. Such advances in foundational algorithm research will expand the applicability of PPI structural prediction to the genome scale and allow models to learn the physical principles underlying diverse PPIs rather than "memorizing" patterns in similar PPIs. Moreover, these methodological advances are expected to impact broad application fields beyond PPI structural prediction. The proposed research is integrated with an educational plan by feeding research results and trained personnel into multi-scale education and outreach activities, involving educated students in research, and engaging the general public in citizen science. New curricular and co-curricular activities will be developed to make interdisciplinary data-science training more accessible to a diverse student body and to domain scientists. In addition, multi-level outreach activities, carried out in collaboration with existing programs, will foster awareness of and interest in interdisciplinary data science among diverse middle- and high-school students as well as the general public.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 14)
Irani, Seema and Tan, Wuwei and Li, Qing and Toy, Weiyi and Jones, Catherine and Gadiya, Mayur and Marra, Antonio and Katzenellenbogen, John A and Carlson, Kathryn E and Katzenellenbogen, Benita S and Karimi, Mostafa and Segu Rajappachetty, Ramya and Del_ "Somatic estrogen receptor mutations that induce dimerization promote receptor activity and breast cancer proliferation" Journal of Clinical Investigation, v.134, 2024 https://doi.org/10.1172/JCI163242
Karimi, Mostafa and Wu, Di and Wang, Zhangyang and Shen, Yang "Explainable Deep Relational Networks for Predicting Compound–Protein Affinities and Contacts" Journal of Chemical Information and Modeling, v.61, 2021 https://doi.org/10.1021/acs.jcim.0c00866
Lensink, Marc F. and Brysbaert, Guillaume and Mauri, Théo and Nadzirin, Nurul and Velankar, Sameer and Chaleil, Raphael A. and Clarence, Tereza and Bates, Paul A. and Kong, Ren and Liu, Bin and Yang, Guangbo and Liu, Ming and Shi, Hang and Lu, Xufeng and "Prediction of protein assemblies, the next frontier: The CASP14–CAPRI experiment" Proteins: Structure, Function, and Bioinformatics, v.89, 2021 https://doi.org/10.1002/prot.26222
Lensink, Marc F and Brysbaert, Guillaume and Raouraoua, Nessim and Bates, Paul A and Giulini, Marco and Honorato, Rodrigo V and van Noort, Charlotte and Teixeira, Joao M C and Bonvin, Alexandre M J J and Kong, Ren and Shi, Hang and Lu, Xufeng and Chang, S "Impact of AlphaFold on structure prediction of protein complexes: The CASP15–CAPRI experiment" Proteins: Structure, Function, and Bioinformatics, v.91, 2023 https://doi.org/10.1002/prot.26609
Ramos, Erika K and Tsai, Chia-Feng and Jia, Yuzhi and Cao, Yue and Manu, Megan and Taftaf, Rokana and Hoffmann, Andrew D and El-Shennawy, Lamiaa and Gritsenko, Marina A and Adorno-Cruz, Valery and Schuster, Emma J and Scholten, David and Patel, Dhwani and "Machine learning-assisted elucidation of CD81–CD44 interactions in promoting cancer stemness and extracellular vesicle integrity" eLife, v.11, 2022 https://doi.org/10.7554/eLife.82669
Taftaf, Rokana and Liu, Xia and Singh, Salendra and Jia, Yuzhi and Dashzeveg, Nurmaa K. and Hoffmann, Andrew D. and El-Shennawy, Lamiaa and Ramos, Erika K. and Adorno-Cruz, Valery and Schuster, Emma J. and Scholten, David and Patel, Dhwani and Zhang, Youb "ICAM1 initiates CTC cluster formation and trans-endothelial migration in lung metastasis of breast cancer" Nature Communications, v.12, 2021 https://doi.org/10.1038/s41467-021-25189-z
Talukder, A. and Yin, R. and Sun, Y. and Shen, Y. and You, Y. "Does Inter-Protein Contact Prediction Benefit from Multi-Modal Data and Auxiliary Tasks?" Machine Learning in Structural Biology Workshop at the 36th Conference on Neural Information Processing Systems, 2022 https://doi.org/10.1101/2022.11.29.518454
Wei, Tianxin and You, Yuning and Chen, Tianlong and Shen, Yang and He, Jingrui and Wang, Zhangyang "Augmentations in Hypergraph Contrastive Learning: Fabricated and Generative" Advances in Neural Information Processing Systems, v.35, 2022
Xu, Haotian and You, Yuning and Shen, Yang "Multi-Modal Contrastive Learning for Proteins by Combining Domain-Informed Views", 2024
You, Y. and Chen, T. and Wang, Z. and Shen, Y. "Graph Domain Adaptation via Theory-Grounded Spectral Regularization" The Eleventh International Conference on Learning Representations, 2023
You, Y and Zhou, R and Park, J and Xu, H and Tian, C and Wang, Z and Shen, Y "Latent 3D Graph Diffusion", 2024