Award Abstract # 2046873
CAREER: Detecting, Understanding, and Fixing Vulnerabilities in Natural Language Processing Models

NSF Org: IIS
Division of Information & Intelligent Systems
Recipient: UNIVERSITY OF CALIFORNIA IRVINE
Initial Amendment Date: March 24, 2021
Latest Amendment Date: July 31, 2024
Award Number: 2046873
Award Instrument: Continuing Grant
Program Manager: Eleni Miltsakaki
emiltsak@nsf.gov
(703)292-2972
IIS: Division of Information & Intelligent Systems
CSE: Directorate for Computer and Information Science and Engineering
Start Date: July 1, 2021
End Date: June 30, 2026 (Estimated)
Total Intended Award Amount: $499,984.00
Total Awarded Amount to Date: $389,121.00
Funds Obligated to Date: FY 2021 = $88,376.00
FY 2022 = $94,017.00
FY 2023 = $100,113.00
FY 2024 = $106,615.00
History of Investigator:
  • Sameer Singh (Principal Investigator)
    sameer@uci.edu
Recipient Sponsored Research Office: University of California-Irvine
160 ALDRICH HALL
IRVINE
CA  US  92697-0001
(949)824-7295
Sponsor Congressional District: 47
Primary Place of Performance: University of California-Irvine
CA  US  92697-3425
Primary Place of Performance Congressional District: 47
Unique Entity Identifier (UEI): MJC5FCYQTPE6
Parent UEI: MJC5FCYQTPE6
NSF Program(s): Robust Intelligence
Primary Program Source: 01002526DB NSF RESEARCH & RELATED ACTIVIT
01002425DB NSF RESEARCH & RELATED ACTIVIT
01002324DB NSF RESEARCH & RELATED ACTIVIT
01002223DB NSF RESEARCH & RELATED ACTIVIT
01002122DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1045, 7495
Program Element Code(s): 749500
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

With recent advances in machine learning, models have achieved high accuracy on many challenging natural language processing (NLP) tasks such as question answering, machine translation, and dialog, sometimes approaching or exceeding human performance on these benchmarks. However, these NLP models are often brittle in many ways: they latch onto spurious artifacts, do not handle natural variations in language, are not robust to adversarial attacks, and work in only a few domains. Existing pipelines for developing NLP models offer little support for surfacing such issues, and identifying bugs requires considerable effort from experts in both machine learning and the application domain. This CAREER project develops techniques to make NLP training and evaluation pipelines more robust, providing easy-to-use, scalable, and accurate mechanisms for identifying, understanding, and addressing vulnerabilities in NLP models. The developed methods will support diverse application areas such as conversational agents, sentiment classifiers, and abuse/hate-speech detection. Further, the team engages with developers of NLP models in academia and industry and develops a data science curriculum for K-12 education, particularly for students from underrepresented communities.

Based on the notion of a vulnerability as unexpected behavior under certain input transformations, the team will contribute across three thrusts. The first thrust identifies vulnerabilities by testing user-defined behaviors and searching over many possible vulnerabilities. The second thrust develops methods to understand a model's vulnerabilities by tracing the causes of errors to individual training data points and data artifacts. The third thrust develops approaches to address vulnerabilities by injecting the vulnerability definitions directly into the model during training and by using explanation-based annotations to supervise the models. These thrusts build on the goals of behavioral testing, explanation-based interaction, and architecture agnosticism, so that they support most current and future NLP models and applications.
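The project's core notion, a vulnerability as unexpected behavior under an input transformation, can be illustrated with a minimal behavioral (invariance) test. This is only a sketch: the toy classifier and helper names below are hypothetical stand-ins, not the project's actual code or tooling.

```python
# Minimal sketch of behavioral testing: a "vulnerability" is flagged when a
# label-preserving input transformation changes the model's prediction.
# `toy_sentiment` and `invariance_test` are hypothetical names for this sketch.

def toy_sentiment(text: str) -> str:
    """Stand-in classifier: keyword lookup, deliberately brittle."""
    negative = {"terrible", "awful", "bad"}
    words = set(text.lower().replace(".", "").split())
    return "neg" if words & negative else "pos"

def invariance_test(model, pairs):
    """Apply each (original, transformed) pair and collect prediction flips."""
    failures = []
    for original, transformed in pairs:
        if model(original) != model(transformed):
            failures.append((original, transformed))
    return failures

# Simple typo transformations should not change the predicted sentiment.
pairs = [
    ("The movie was terrible.", "The movie was terrrible."),
    ("A bad film.", "A baad film."),
]
failures = invariance_test(toy_sentiment, pairs)
# Each flip is a candidate vulnerability to trace and fix in later thrusts.
```

A real test suite would pair such transformations with an actual NLP model and search over many transformation families rather than a fixed list.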

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


(Showing: 1 - 10 of 16)
Ahmed, Kareem; Li, Tao; Ton, Thy; Guo, Quan; Chang, Kai-Wei; Kordjamshidi, Parisa; Srikumar, Vivek; Van den Broeck, Guy; Singh, Sameer. "Pylon: A PyTorch Framework for Learning with Constraints." Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, v.176, 2022. https://doi.org/10.1609/aaai.v36i11.21711
Belem, Catarina; Seshadri, Preethi; Razeghi, Yasaman; Singh, Sameer. "Are Models Biased on Text without Gender-related Language?" 2024.
Gardner, Matt; Merrill, William; Dodge, Jesse; Peters, Matthew; Ross, Alexis; Singh, Sameer; Smith, Noah A. "Competency Problems: On Finding and Removing Artifacts in Language Data." Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021. https://doi.org/10.18653/v1/2021.emnlp-main.135
Gupta, Shivanshu; Gardner, Matt; Singh, Sameer. "Coverage-based Example Selection for In-Context Learning." 2023. https://doi.org/10.18653/v1/2023.findings-emnlp.930
Hossain, Tamanna; Dev, Sunipa; Singh, Sameer. "MISGENDERED: Limits of Large Language Models in Understanding Pronouns." Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023. https://doi.org/10.18653/v1/2023.acl-long.293
Hossain, Tamanna; Dev, Sunipa; Singh, Sameer. "MisgenderMender: A Community-Informed Approach to Interventions for Misgendering." 2024.
Mekala, Raja Sekhar Reddy; Razeghi, Yasaman; Singh, Sameer. "EchoPrompt: Instructing the Model to Rephrase Queries for Improved In-context Learning." 2024.
Nottingham, Kolby; Razeghi, Yasaman; Kim, Kyungmin; Lanier, J. B.; Baldi, Pierre; Fox, Roy; Singh, Sameer. "Selective Perception: Learning Concise State Descriptions for Language Model Actors." 2024.
Pezeshkpour, Pouya; Jain, Sarthak; Singh, Sameer; Wallace, Byron. "Combining Feature and Instance Attribution to Detect Artifacts." Findings of the Association for Computational Linguistics: ACL 2022, 2022. https://doi.org/10.18653/v1/2022.findings-acl.153
Razeghi, Yasaman; Ivison, Hamish; Singh, Sameer; Elazar, Yanai. "Backtracking Mathematical Reasoning of Language Models to the Pretraining Data." 2024.
Razeghi, Yasaman; Logan IV, Robert L.; Gardner, Matt; Singh, Sameer. "Impact of Pretraining Term Frequencies on Few-Shot Numerical Reasoning." Findings of the Association for Computational Linguistics: EMNLP 2022, 2022. https://doi.org/10.18653/v1/2022.findings-emnlp.59
