
Award Abstract # 1949013
Cracking the chemical code: a data-science approach to deciphering the chemical information stored in environmental samples

NSF Org: CBET
Division of Chemical, Bioengineering, Environmental, and Transport Systems
Recipient: OREGON STATE UNIVERSITY
Initial Amendment Date: May 5, 2020
Latest Amendment Date: September 18, 2023
Award Number: 1949013
Award Instrument: Standard Grant
Program Manager: Karl Rockne
krockne@nsf.gov
 (703)292-7293
NSF Directorate: ENG
Directorate for Engineering
Start Date: May 15, 2020
End Date: April 30, 2024 (Estimated)
Total Intended Award Amount: $329,002.00
Total Awarded Amount to Date: $329,002.00
Funds Obligated to Date: FY 2020 = $329,002.00
History of Investigator:
  • Gerrad Jones (Principal Investigator)
    gerrad.jones@oregonstate.edu
  • Rebecca Hutchinson (Co-Principal Investigator)
Recipient Sponsored Research Office: Oregon State University
1500 SW JEFFERSON AVE
CORVALLIS
OR  US  97331-8655
(541)737-4933
Sponsor Congressional District: 04
Primary Place of Performance: Oregon State University
A312 Kerr Administration Bldg
Corvallis
OR  US  97331-2140
Primary Place of Performance Congressional District: 04
Unique Entity Identifier (UEI): MZ4DYXE1SL98
Parent UEI:
NSF Program(s): EnvE-Environmental Engineering
Primary Program Source: 01002021DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):
Program Element Code(s): 144000
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.041

ABSTRACT

Numerous natural and anthropogenic processes release organic chemicals into the environment. The underlying hypothesis of this project is that these processes have distinct chemical markers that can be used to uniquely identify each source. Over time, chemicals are transported through the environment and are often captured in lakes, rivers, and the ocean. These water bodies thus contain a chemical record of all processes occurring upstream. This project will develop tools to test water bodies for these chemical markers, using artificial intelligence (AI) tools to screen for all processes that occur in a watershed. This approach holds great promise to efficiently collect more data than existing methods. Successful development of this AI approach will have wide-ranging applications to detect and identify sources of pollution from local to global scales. Broader impacts to society will result from the training of underrepresented communities and development of STEM curricula for high school students. These will lead to diversifying the STEM workforce and increasing scientific literacy.

Lakes and other bodies of water can be considered as systems that store chemical information recorded in the form of tens of thousands of molecules in the water and sediment. The goal of this research project is to use artificial intelligence (AI) algorithms to translate the chemical data stored in environmental samples into knowledge about ecosystem processes. This goal will be achieved through specific objectives to: 1) develop diagnostic chemical fingerprints associated with multiple anthropogenic pollution sources, 2) quantify environmental processes that affect the chemical composition in receiving water bodies, and 3) identify pollution sources through source identification in various water bodies including lakes and groundwater. Although current application of chemical forensics is focused on pollution source tracking, the goal of this research is substantially broader. Hundreds to thousands of ecosystem processes occur across the landscape, and successful completion of the research holds promise to track environmental processes through fingerprint identification within a single water sample.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Dávila-Santiago, Emmanuel and Shi, Cheng and Mahadwar, Gouri and Medeghini, Bridgette and Insinga, Logan and Hutchinson, Rebecca and Good, Stephen and Jones, Gerrad D. "Machine Learning Applications for Chemical Fingerprinting and Environmental Source Tracking Using Non-target Chemical Data" Environmental Science & Technology, 2022. https://doi.org/10.1021/acs.est.1c06655
Joseph, Nayantara T. and Schwichtenberg, Trever and Cao, Dunping and Jones, Gerrad D. and Rodowa, Alix E. and Barlaz, Morton A. and Charbonnet, Joseph A. and Higgins, Christopher P. and Field, Jennifer A. and Helbling, Damian E. "Target and Suspect Screening Integrated with Machine Learning to Discover Per- and Polyfluoroalkyl Substance Source Fingerprints" Environmental Science & Technology, v.57, 2023. https://doi.org/10.1021/acs.est.3c03770
Shi, Cheng and Mahadwar, Gouri and Dávila-Santiago, Emmanuel and Bambakidis, Ted and Crump, Byron C. and Jones, Gerrad D. "Nontarget Chemical Composition of Surface Waters May Reflect Ecosystem Processes More than Discrete Source Contributions" Environmental Science & Technology, 2023. https://doi.org/10.1021/acs.est.2c08540

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Throughout this project, we focused on developing innovative ways to trace pollution sources in surface waters using chemical fingerprints. Traditionally, water quality monitoring targets specific chemicals, but our research aimed to identify broad pollution sources based on unique chemical patterns, or “fingerprints,” rather than focusing on individual contaminants.

1. Diagnostic Chemical Fingerprints for Pollution Sources

We successfully developed chemical fingerprints that could detect specific pollution sources in surface waters. This method allows watershed managers to determine which pollution sources are influencing water quality without targeting specific chemicals. For example, rather than assuming a particular source is contributing to pollution, our approach enables managers to pinpoint which sources—whether agricultural runoff, industrial waste, or urban pollution—are responsible. This helps direct resources toward the most impactful mitigation strategies.
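
As a rough illustration of what matching a water sample against source fingerprints can look like, the sketch below ranks candidate sources by cosine similarity between a sample's feature intensities and each fingerprint. This is not the project's method (the publications describe machine learning models trained on non-target chemical data); the feature values, the three source names used as dictionary keys, and the functions `cosine_similarity` and `rank_sources` are all hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector carries no pattern to compare
    return dot / (norm_a * norm_b)


def rank_sources(sample, fingerprints):
    """Return (source, similarity) pairs, most similar source first."""
    scores = {name: cosine_similarity(sample, fp)
              for name, fp in fingerprints.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical fingerprints: relative intensities of four chemical features.
FINGERPRINTS = {
    "agricultural runoff": [0.9, 0.1, 0.0, 0.2],
    "industrial waste": [0.0, 0.8, 0.6, 0.1],
    "urban pollution": [0.1, 0.2, 0.1, 0.9],
}

# Hypothetical water sample measured on the same four features.
sample = [0.7, 0.2, 0.1, 0.3]
ranking = rank_sources(sample, FINGERPRINTS)

for source, score in ranking:
    print(f"{source}: {score:.2f}")
```

A real fingerprint would span thousands of detected features rather than four, and a trained classifier would normally replace the similarity score, but the idea of comparing a whole pattern instead of a single contaminant is the same.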

2. Landscape-Based Fingerprinting Challenges

While our approach worked well for direct pollution sources, landscapes (such as forests or agricultural fields) contribute indirectly to surface water chemistry, complicating fingerprint development. We aimed to identify diagnostic chemical fingerprints representing entire landscapes, sampling across gradients like elevation or land use. Surprisingly, we found that the dominant chemical signatures in surface water were more closely tied to natural bacterial gradients than to landscape features. This revealed a limitation in our ability to develop fingerprints based purely on landscape characteristics.

3. Alternative Pattern-Based Approach

Recognizing the challenge of fingerprinting landscapes, we pivoted to a novel approach. Instead of relying solely on predefined sources, we let the data reveal dominant chemical patterns across multiple water samples. We combined this with satellite, climate, and general water quality data to interpret these patterns. From 78 weeks of data, we identified five major sources that most likely influenced surface water composition: agriculture, wastewater, snow runoff, groundwater intrusion, and natural ecological processes. This method provides a more flexible way to monitor water quality, allowing us to discover unknown sources and interpret their impact based on additional environmental data.
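
One common way to let data reveal dominant patterns without predefined sources is non-negative matrix factorization (NMF), which splits a samples-by-features table into a small number of patterns and a per-sample weight for each pattern. The report does not say which algorithm was used, so the sketch below is only an assumed stand-in: the `nmf` function, the synthetic `WEEKLY_SAMPLES` table, and the choice of two patterns are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def frobenius_error(x, w, h):
    """Sum of squared differences between x and its approximation w @ h."""
    approx = matmul(w, h)
    return sum((x[i][j] - approx[i][j]) ** 2
               for i in range(len(x)) for j in range(len(x[0])))


def nmf(x, n_patterns, n_iter=200, seed=0, eps=1e-9):
    """Factor x (samples x features) into non-negative w @ h using
    Lee-Seung multiplicative updates for squared error."""
    rng = random.Random(seed)
    n_samples, n_features = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(n_patterns)]
         for _ in range(n_samples)]
    h = [[rng.random() + 0.1 for _ in range(n_features)]
         for _ in range(n_patterns)]
    for _ in range(n_iter):
        # Update patterns: H <- H * (W^T X) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, x)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps)
              for j in range(n_features)] for k in range(n_patterns)]
        # Update weights: W <- W * (X H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(x, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps)
              for k in range(n_patterns)] for i in range(n_samples)]
    return w, h


# Synthetic table: six weekly samples (rows) by four chemical features
# (columns), built as mixtures of two made-up patterns.
WEEKLY_SAMPLES = [
    [4.0, 1.0, 0.0, 0.0],
    [3.2, 0.8, 0.4, 0.6],
    [2.0, 0.5, 1.0, 1.5],
    [0.8, 0.2, 1.6, 2.4],
    [0.0, 0.0, 2.0, 3.0],
    [2.4, 0.6, 0.8, 1.2],
]

weights, patterns = nmf(WEEKLY_SAMPLES, n_patterns=2)
print("residual error:", frobenius_error(WEEKLY_SAMPLES, weights, patterns))
```

In a workflow like the one described above, each column of `weights` would be a weekly time series for one pattern, which could then be compared against satellite, climate, or water quality records to suggest what the pattern represents.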

Broader Impacts

This research integrates environmental water quality monitoring with advanced data science techniques, paving the way for new methods to understand complex chemical interactions in nature. The results can help communities and watershed managers more effectively identify and mitigate both known and unknown pollution sources, improving environmental management and resource allocation.

 


Last Modified: 09/18/2024
Modified by: Gerrad D Jones

Please report errors in award information by writing to: awardsearch@nsf.gov.
