
NSF Org: | OAC Office of Advanced Cyberinfrastructure (OAC) |
Recipient: |
|
Initial Amendment Date: | February 8, 2018 |
Latest Amendment Date: | February 8, 2018 |
Award Number: | 1751161 |
Award Instrument: | Standard Grant |
Program Manager: | Juan Li, jjli@nsf.gov, (703)292-2625, OAC Office of Advanced Cyberinfrastructure (OAC), CSE Directorate for Computer and Information Science and Engineering |
Start Date: | March 1, 2018 |
End Date: | September 30, 2024 (Estimated) |
Total Intended Award Amount: | $561,685.00 |
Total Awarded Amount to Date: | $561,685.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: | 520 LEE ENTRANCE STE 211 AMHERST NY US 14228-2577 (716)645-2634 |
Sponsor Congressional District: |
|
Primary Place of Performance: | 612 Furnas Hall Buffalo NY US 14260-4200 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | CAREER: FACULTY EARLY CAR DEV, Chem Thry, Mdls & Cmptnl Mthds |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
Innovation in chemistry and materials is a key driver of economic development, prosperity, and a rising standard of living. It also offers solutions to pressing problems on energy, environmental sustainability, and resources that shape our society. This research program is designed to boost the chemistry community's capacity to address these challenges by transforming the process that creates underlying innovation. The research promotes a shift away from trial-and-error searches and towards rational design. This approach combines traditional chemical research with modern data science by introducing tools such as machine learning into the chemical context. This project enables and advances this emerging field by building a cyberinfrastructure that makes data-driven research a viable and widely accessible proposition for the chemistry community, and thereby an integral part of the chemical enterprise. Tools and methods developed in this research provide the means for the large-scale exploration of chemical space and for a better understanding of the hidden mechanisms that determine the behavior of complex chemical systems. These insights can potentially accelerate, streamline, and ultimately transform the chemical development process. The project also tackles the concomitant need to adapt education to this new research landscape in order to adequately equip the next generation of scientists and engineers, to build a competent and skilled workforce for the cutting-edge R&D of the future, and to ensure the competitiveness of US students in the international job market. By promoting minority participation in this promising field, it contributes to a sustained push towards equal opportunity in our society. This project thus promotes the progress of science and advances prosperity and welfare as stated by NSF's mission.
While there is growing agreement on the value of data-driven discovery and rational design, this approach is still far from being a mainstay of everyday research in the chemistry community. This work addresses three key obstacles: (i) data-driven research is beyond the scope and reach of most chemists due to a lack of available and accessible tools, (ii) many fundamental and practical questions on how to make data science work for chemical research remain unresolved, and (iii) data science is not part of the formal training of chemists, and much of the community thus lacks the necessary experience and expertise to utilize it. This research centers around the creation of an open, general-purpose software ecosystem that fuses in silico modeling, virtual high-throughput screening, and big data analytics (i.e., the use of machine learning, informatics, and database technology for the validation, mining, and modeling of resulting data sets) into an integrated research infrastructure. A key consideration is to make this ecosystem as comprehensive, robust, and user-friendly as possible, so that it can readily be employed by interested researchers without the need for extensive expert knowledge. It also serves as a development platform and testbed for innovation in the underlying methods, algorithms, and protocols, i.e., it allows the community to systematically and efficiently evaluate the utility and performance of different techniques, including new ones that are being introduced as part of this project. A meta machine learning approach is being developed to establish guidelines and best practices that provide added value to the cyberinfrastructure. The work is driven by concrete molecular design problems, which serve to demonstrate the efficacy of the overall approach. 
The educational challenges that arise from the qualitative novelty of data-driven research and its inherent interdisciplinarity are addressed by leveraging a new graduate program in Computational and Data-Enabled Science and Engineering for cross-cutting course and curricular developments, the creation of interactive teaching materials, and a skill-building hackathon initiative. This award is jointly made with the Division of Chemistry's Chemical Theory, Models and Computational Methods Program.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Artificial intelligence (AI), machine learning (ML), and data science are playing an increasingly important role in chemical and materials research.
However, data-driven research is still beyond the reach of most chemists and materials scientists, for three reasons: questions on how to make AI/ML work for chemical and materials problems remain unresolved, available and accessible software tools are lacking, and data science is typically not part of the formal training (and thus the experience and skill set) of this research community.
In this project, we addressed these issues and advanced solutions to help overcome them. We created new AI/ML methods and open-source software for use on chemical and materials problems, as well as training initiatives to prepare the workforce of the future. This work includes techniques to automate the use of AI/ML as far as possible, to make it easy and safe to use, to tailor it to the specific requirements of chemical and materials studies, and to open the AI/ML 'black-box' in order to learn what its predictions can tell us.
We tested these new techniques and tools on a number of specific research questions (e.g., the design of new polymer materials for lenses and active compounds in energy storage devices) and demonstrated how they could accelerate the generation of new findings and provide additional insights. The AI/ML prediction models created in this context allow us to make predictions about the properties of new and previously unknown compounds in a fraction of the time that would be needed using traditional modeling or experimental approaches. In addition, we used data mining to gain a better understanding of the connections between the structures of molecular and materials compounds and their properties. This improved understanding allows us to pursue a more purposeful and targeted design of new compounds with desirable properties.
Many of these efforts have led to collaborations and partnerships with other academic and industry researchers, who found our techniques and tools valuable.
Last Modified: 04/18/2025
Modified by: Johannes Hachmann