
NSF Org: | CHE Division Of Chemistry |
Recipient: |
|
Initial Amendment Date: | June 9, 2017 |
Latest Amendment Date: | June 9, 2017 |
Award Number: | 1734082 |
Award Instrument: | Standard Grant |
Program Manager: | Tingyu Li, tli@nsf.gov, (703)292-4949, CHE Division Of Chemistry, MPS Directorate for Mathematical and Physical Sciences |
Start Date: | September 1, 2017 |
End Date: | August 31, 2020 (Estimated) |
Total Intended Award Amount: | $209,734.00 |
Total Awarded Amount to Date: | $209,734.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: |
450 JANE STANFORD WAY STANFORD CA US 94305-2004 (650)723-2300 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
333 Campus Drive Stanford CA US 94305-4401 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | CMFP-Chem Mech Funct, and Prop |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.049 |
ABSTRACT
With support from the Chemical Structure, Dynamics and Mechanisms - B Program in the Division of Chemistry and in response to the Data-Driven Discovery Science in Chemistry (D3SC) Dear Colleague Letter, Professor Richard N. Zare at Stanford University is working on optimizing chemical reactions in microdroplets with deep reinforcement learning. Unoptimized reactions are expensive because they waste time and reagents. A common way for chemists to explore reaction optimization is to change one variable at a time while all other variables remain fixed. This method, however, might not find the best conditions, that is, the global optimum. Another way is to search across all combinations of reaction conditions by using batch chemistry. This approach offers a better chance of finding the globally optimal conditions, but it is time-consuming and expensive. Deep reinforcement learning is believed to be a superior approach in which the computer analyzes a large data set and recognizes the pattern of features that lead to the best reaction outcomes. It is like training a dog: suppose we want the dog to pick up a ball. If the dog does what we want, we say "Good dog!"; if it does not, we say "Bad dog!". Similarly, Professor Zare uses a machine learning method to give the system a positive reward if the reaction reaches a better result than previous ones, or a negative reward if it does not. Repeating this process eventually yields a set of best reaction conditions for a given reaction. Professor Zare and his group apply this approach to microdroplet chemistry, where many reactions can be carried out in small droplets and accelerated by factors of one thousand to one million compared with the same reaction in bulk solution. By combining the efficient deep reinforcement learning method with accelerated microdroplet reactions, Professor Zare and his group seek to find optimal reaction conditions quickly.
This combined approach could represent a significant step toward enabling artificial intelligence to optimize chemical reactions, which should benefit chemical production, drug screening, and materials discovery. The students in the Zare group enjoy the unique opportunity to experience microdroplet chemical synthesis, fast chemical characterization, and deep learning-based complex data analysis.
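The contrast the abstract draws between changing one variable at a time and searching all combinations can be illustrated with a small sketch. Everything here is an assumption made for illustration: the two inputs, the five level indices per input, and the `mock_yield` surface are invented and do not come from the award. The sketch only shows how a single one-variable-at-a-time pass can settle on conditions that are not the global optimum when the inputs interact, while the full grid finds the optimum at the cost of more experiments.

```python
from itertools import product

LEVELS = range(5)  # five hypothetical settings (indices 0..4) for each input


def mock_yield(temp_level, ph_level):
    """Invented yield surface (percent) in which the two inputs interact."""
    return 90 - 4 * (temp_level - ph_level) ** 2 - (8 - temp_level - ph_level) ** 2


def one_variable_at_a_time(start=(0, 0)):
    """Single pass: tune temperature with pH fixed, then pH with temperature fixed."""
    temp, ph = start
    temp = max(LEVELS, key=lambda t: mock_yield(t, ph))
    ph = max(LEVELS, key=lambda p: mock_yield(temp, p))
    return (temp, ph), mock_yield(temp, ph)


def full_grid():
    """Evaluate every combination of levels (25 mock experiments here)."""
    best = max(product(LEVELS, LEVELS), key=lambda c: mock_yield(*c))
    return best, mock_yield(*best)


ovat_conditions, ovat_yield = one_variable_at_a_time()
grid_conditions, grid_yield = full_grid()
print("one variable at a time:", ovat_conditions, ovat_yield)
print("full grid search:      ", grid_conditions, grid_yield)
```

On this invented surface the single pass stops short of the best yield that the grid search finds. Repeating the pass several times would move it closer, which is one reason the example should be read as an illustration of the trade-off rather than a general result.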
A reaction can be thought of as a system that takes multiple inputs (parameters) and provides one or more outputs. Example inputs include temperature, solvent composition, pH, catalyst, droplet size, and time. Example outputs include product yield, selectivity, purity, and cost. The goal of reaction optimization described here is to select the best inputs to achieve a given output, which can be formulated as a reinforcement learning problem. To find the optimal reaction conditions, Professor Zare is searching for the reaction conditions to try at the next step, based on previous reaction conditions and product yields. A recurrent neural network is used to model the policy for reaction optimization. The reinforcement learning system is trained first on mock reactions (random functions) and then on real reactions to improve performance. The approach, if successful, could improve understanding of the fundamental features of reactivity and enable important industrial applications.
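The propose-and-measure loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `make_mock_reaction`, `placeholder_policy`, and `optimize` are hypothetical names, the mock reaction is an arbitrary random Gaussian-shaped function, and the policy is a simple random perturbation standing in for the recurrent neural network, so no learning takes place here. Only the reward rule (positive if the new result beats previous ones, negative otherwise) follows the abstract.

```python
import math
import random


def make_mock_reaction(rng, n_inputs=2):
    """Random smooth function on [0, 1]^n standing in for a real reaction."""
    optimum = [rng.random() for _ in range(n_inputs)]
    width = rng.uniform(0.2, 0.5)

    def reaction(conditions):
        dist2 = sum((c - o) ** 2 for c, o in zip(conditions, optimum))
        return math.exp(-dist2 / (2 * width ** 2))  # mock "yield" in (0, 1]

    return reaction


def placeholder_policy(rng, best_conditions, step=0.1):
    """Propose new conditions near the best seen so far (NOT an RNN policy)."""
    return [min(1.0, max(0.0, c + rng.gauss(0, step))) for c in best_conditions]


def optimize(reaction, rng, n_steps=200, n_inputs=2):
    """Run the loop: propose conditions, measure the yield, assign a reward."""
    conditions = [rng.random() for _ in range(n_inputs)]
    best_conditions, best_yield = conditions, reaction(conditions)
    rewards = []
    for _ in range(n_steps):
        candidate = placeholder_policy(rng, best_conditions)
        observed = reaction(candidate)
        # Positive reward if the result beats all previous ones, else negative.
        reward = 1 if observed > best_yield else -1
        rewards.append(reward)
        if reward > 0:
            best_conditions, best_yield = candidate, observed
    return best_conditions, best_yield, rewards


rng = random.Random(0)  # fixed seed so the sketch is repeatable
reaction = make_mock_reaction(rng)
best_conditions, best_yield, rewards = optimize(reaction, rng)
print("best conditions:", best_conditions)
print("best mock yield:", best_yield)
print("positive rewards:", rewards.count(1), "of", len(rewards))
```

In a learned version, the reward sequence would be used to update the policy's parameters, and the history of conditions and yields would be fed to the recurrent network instead of only the best point. Those parts are deliberately left out of this sketch.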
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Various machine learning techniques were combined with the collection of large mass spectrometry data sets to predict whether tissue samples were benign or cancerous. Of particular importance was the application to oral squamous cell carcinoma (OSCC), a serious cancer of the mouth. This work used saliva (spit) to build metabolic profiles using conductive polymer spray ionization mass spectrometry (CPSI-MS), in which the sample is sprayed from a conductive substrate. We also performed desorption electrospray ionization mass spectrometry imaging on tissue samples. Saliva samples from 373 volunteers (124 who are healthy, 124 who have premalignant lesions, and 125 who are OSCC patients) were collected for discovering and validating dysregulated metabolites and determining altered metabolic pathways. With the aid of machine learning (ML), OSCC and premalignant lesions can be distinguished from the normal physical condition in real time with an accuracy of 86.7% on a person-by-person basis. These results suggest that the combination of CPSI-MS and ML is a feasible tool for accurate, automated diagnosis of OSCC in clinical practice. This study is being continued elsewhere in a hospital setting, which is a most encouraging outcome. Beyond this particular disease, the work demonstrates that machine learning combined with reliable mass spectrometric data can be applied in a much shorter time than traditional approaches to make medical diagnoses of high statistical significance. As mass spectrometers become portable and compact, it is anticipated that this type of analysis will become an important new tool for the treatment of many different medical conditions.
Last Modified: 01/04/2021
Modified by: Richard N Zare