Award Abstract # 1909536
FET: Small: Tools and Experimental Validation for Predicting Enzymatic Promiscuity and its Products

NSF Org: CCF
Division of Computing and Communication Foundations
Recipient: TRUSTEES OF TUFTS COLLEGE
Initial Amendment Date: July 11, 2019
Latest Amendment Date: August 18, 2022
Award Number: 1909536
Award Instrument: Standard Grant
Program Manager: Stephanie Gage
sgage@nsf.gov
 (703)292-4748
CCF
 Division of Computing and Communication Foundations
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2019
End Date: September 30, 2024 (Estimated)
Total Intended Award Amount: $515,790.00
Total Awarded Amount to Date: $631,393.00
Funds Obligated to Date: FY 2019 = $515,790.00
FY 2020 = $16,000.00
FY 2022 = $99,603.00
History of Investigator:
  • Soha Hassoun (Principal Investigator)
    soha@cs.tufts.edu
  • Nikhil Nair (Co-Principal Investigator)
  • Liping Liu (Co-Principal Investigator)
Recipient Sponsored Research Office: Tufts University
80 GEORGE ST
MEDFORD
MA  US  02155-5519
(617)627-3696
Sponsor Congressional District: 05
Primary Place of Performance: Tufts University School of Engineering
200 College Ave
Medford
MA  US  02155-5530
Primary Place of Performance Congressional District: 05
Unique Entity Identifier (UEI): WL9FLBRVPJJ7
Parent UEI: WL9FLBRVPJJ7
NSF Program(s): FET-Fndtns of Emerging Tech
Primary Program Source: 01002223DB NSF RESEARCH & RELATED ACTIVIT
01001920DB NSF RESEARCH & RELATED ACTIVIT
01002021DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7923, 7931, 9251
Program Element Code(s): 089Y00
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

The engineering of microbial cells through synthetic biology promises to advance the production of high-volume commodity chemicals such as biopolymers, fuels, nutraceuticals, therapeutics, and other specialty products. Increasing understanding of metabolic and regulatory networks underlying microbial physiology hinges on developing metabolic models that capture enzymatic cellular activity. Despite significant progress in sequencing technology and model reconstruction, there are many cellular enzymatic activities that remain unknown. The hypothesis underlying this project is that undocumented promiscuous activity of enzymes results in the formation of unexpected reaction byproducts. This phenomenon is frequently observed by synthetic biologists, and sometimes exploited during design through ad hoc experimental efforts. However, there are currently limited ways to predict, analyze, or mitigate the effects of enzyme promiscuity.

This project develops tools to predict stoichiometrically balanced reactions that reflect undocumented promiscuous enzymatic cellular activities. These reactions can be utilized to augment existing metabolic models and improve design tools. Further, the project experimentally validates the augmented models by engineering a microbial host to synthesize desirable chemicals and then analyzing host-pathway interactions. The proposed work is at the intersection of computer science and biology and will advance the ability to predict and assess the impact of enzyme promiscuity on biological systems. Training will be provided to graduate and undergraduate students and the research will be incorporated into classroom teaching. Underrepresented minorities in computing will be recruited to participate in the research through various established national and local programs.
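
As a rough illustration of what "stoichiometrically balanced" means for a predicted reaction, the sketch below checks that each element occurs in equal numbers on both sides of a reaction. The function names and the simple formula parser (no parentheses, charges, or isotopes) are assumptions made for this example; they are not taken from the project's tools.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C6H12O6'.

    Assumption: no parentheses, charges, or isotope labels.
    """
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def side_totals(side):
    """Sum atom counts over one reaction side given as {formula: coefficient}."""
    totals = Counter()
    for formula, coeff in side.items():
        for element, n in parse_formula(formula).items():
            totals[element] += coeff * n
    return totals


def is_balanced(substrates, products):
    """Return True when both sides contain the same number of each element."""
    return side_totals(substrates) == side_totals(products)


# Glucose fermentation: C6H12O6 -> 2 C2H6O + 2 CO2
glucose = {"C6H12O6": 1}
balanced = is_balanced(glucose, {"C2H6O": 2, "CO2": 2})
unbalanced = is_balanced(glucose, {"C2H6O": 2, "CO2": 1})
```

A check of this kind could serve as a filter on candidate reactions before they are added to a metabolic model; a realistic version would also need to account for charge and for cofactors.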

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Balzerani, Francesco and Blasco, Telmo and Pérez-Burillo, Sergio and Valcarcel, Luis V and Hassoun, Soha and Planes, Francisco J "Extending PROXIMAL to predict degradation pathways of phenolic compounds in the human gut microbiota" npj Systems Biology and Applications, v.10, 2024 https://doi.org/10.1038/s41540-024-00381-1
Chen, X and Wang, Y and Du, Y and Hassoun, S and Liu, L "On Separate Normalization in Self-supervised Transformers", 2023
Jiang, Julie and Liu, Li-Ping and Hassoun, Soha "Learning graph representations of biochemical networks and its application to enzymatic link prediction" Bioinformatics, v.37, 2020 https://doi.org/10.1093/bioinformatics/btaa881
Kalia, Apurva and Krishnan, Dilip and Hassoun, Soha; Martelli, Pier Luigi (ed.) "CSI: Contrastive data Stratification for Interaction prediction and its application to compound-protein interaction prediction" Bioinformatics, v.39, 2023 https://doi.org/10.1093/bioinformatics/btad456
Liu, Linfeng and Hughes, Mike and Hassoun, Soha and Liu, Li-Ping "Stochastic Iterative Graph Matching" International Conference on Machine Learning, 2021
Li, Xinmeng and Liu, Li-Ping and Hassoun, Soha; Valencia, Alfonso (ed.) "Boost-RS: boosted embeddings for recommender systems and its application to enzyme-substrate interaction prediction" Bioinformatics, v.38, 2022 https://doi.org/10.1093/bioinformatics/btac201
Porokhin, Vladimir and Amin, Sara A. and Nicks, Trevor B. and Gopinarayanan, Venkatesh Endalur and Nair, Nikhil U. and Hassoun, Soha "Analysis of metabolic network disruption in engineered microbial hosts due to enzyme promiscuity" Metabolic Engineering Communications, v.12, 2021 https://doi.org/10.1016/j.mec.2021.e00170
Porokhin, Vladimir and Liu, Li-Ping and Hassoun, Soha; Martelli, Pier Luigi (ed.) "Using graph neural networks for site-of-metabolism prediction and its applications to ranking promiscuous enzymatic products" Bioinformatics, v.39, 2023 https://doi.org/10.1093/bioinformatics/btad089
Visani, Gian Marco and Hughes, Michael C and Hassoun, Soha "Enzyme promiscuity prediction using hierarchy-informed multi-label classification" Bioinformatics, v.37, 2021 https://doi.org/10.1093/bioinformatics/btab054

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Although enzymes are traditionally assumed to be specific, each transforming a single substrate, many enzymes, if not all, have promiscuous activities: they act on substrates other than those they evolved to transform. Because current knowledge about enzyme promiscuity is limited, the objective of this project was to advance computational tools for analyzing the promiscuity of enzymes within natural and engineered biological systems. These tools advance our understanding of cellular metabolism and guide experimental work, with applications in energy production and biological discovery. Several computational models were developed. A deep-learning model, termed Enzymatic Link Prediction (ELP), predicts whether an enzymatic link exists between two compounds. A deep-learning hierarchical classifier, termed Enzyme Promiscuity Prediction (EPP), predicts the enzyme classes that can act on a query molecule. A deep-learning recommender system, termed Boost-RS, finds enzymes that act on query substrates, or substrates that interact with query enzymes. A multi-view contrastive learning model, termed Contrastive Data Stratification for Interaction Prediction (CSI), predicts the likelihood of a substrate-enzyme interaction. A deep-learning method, SIGMA, aligns molecular graphs, a task needed to extract transformation rules between substrates and products. A graph neural network model, termed GNN-SOM, predicts sites of metabolism. A promiscuity prediction pipeline, called PROXIMAL2, enhances the prediction of the products of promiscuous enzymes.
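
To give a sense of the label hierarchy that a hierarchical enzyme classifier can exploit, the sketch below assumes enzyme classes are written as four-level Enzyme Commission (EC) numbers and expands each leaf label into all of its ancestor classes. This is only an illustration of hierarchical multi-label targets; the function names are hypothetical and the code is not EPP's implementation.

```python
def ec_ancestors(ec_number):
    """Return an EC number together with all of its ancestor classes.

    Example: '1.1.1.1' belongs to classes '1', '1.1', '1.1.1' and '1.1.1.1'.
    """
    parts = ec_number.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def expand_labels(leaf_labels):
    """Build the full multi-label target set implied by a set of leaf EC numbers."""
    labels = set()
    for ec in leaf_labels:
        labels.update(ec_ancestors(ec))
    return labels


# A molecule acted on by two enzymes that share only the top-level class '1'.
targets = expand_labels({"1.1.1.1", "1.14.13.39"})
```

Under this encoding, a prediction for a fine-grained class is consistent only if its parent classes are also predicted, which is one way a classifier can be informed by the hierarchy.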

Several of the tools were applied to analyze the impact of enzyme promiscuity on cellular systems. A computational workflow, termed Metabolic Disruption Workflow (MDFlow), discovers interactions and network disruption arising from enzyme promiscuity. A method was developed for creating Extended Metabolic Models (EMM) based on enzyme promiscuity. Using biotransformation rules, we developed a technique, BAM, to identify novel molecules arising from enzyme promiscuity within molecular networks based on their spectral signatures.

The project resulted in several tools: ELP, EPP, Boost-RS, CSI, SIGMA, GNN-SOM, PROXIMAL2, MDFlow, EMM, and BAM. All of these tools are supported by peer-reviewed publications and are publicly available through the lab’s GitHub repository, https://github.com/HassounLab/. Further, several graduate and undergraduate students were trained in developing computational methods that advance the state of the art in biological engineering.

Last Modified: 03/12/2025
Modified by: Soha Hassoun
