Award Abstract # 2225225
Collaborative Research: SaTC: CORE: Small: Towards Label Enrichment and Refinement to Harden Learning-based Security Defenses

NSF Org: CNS
Division of Computer and Network Systems
Recipient: NORTHWESTERN UNIVERSITY
Initial Amendment Date: April 14, 2022
Latest Amendment Date: April 14, 2022
Award Number: 2225225
Award Instrument: Standard Grant
Program Manager: Dan Cosley
  dcosley@nsf.gov
  (703) 292-8832
  CNS Division of Computer and Network Systems
  CSE Directorate for Computer and Information Science and Engineering
Start Date: November 15, 2021
End Date: September 30, 2025 (Estimated)
Total Intended Award Amount: $247,903.00
Total Awarded Amount to Date: $247,903.00
Funds Obligated to Date: FY 2021 = $247,903.00
History of Investigator:
  • Xinyu Xing (Principal Investigator)
    xinyu.xing@northwestern.edu
Recipient Sponsored Research Office: Northwestern University
633 CLARK ST
EVANSTON
IL  US  60208-0001
(312) 503-7955
Sponsor Congressional District: 09
Primary Place of Performance: Northwestern University at Chicago
750 N. Lake Shore Drive
Chicago
IL  US  60611-4579
Primary Place of Performance Congressional District: 05
Unique Entity Identifier (UEI): EXZVPWZBLUE8
Parent UEI:
NSF Program(s): Secure & Trustworthy Cyberspace
Primary Program Source: 01002122DB NSF RESEARCH & RELATED ACTIVITIES
Program Reference Code(s): 025Z, 7923
Program Element Code(s): 806000
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

This project aims to harden machine learning-based security defenses by improving their ability to handle dynamic changes. From data breaches to ransomware infections, increasingly sophisticated attacks pose a serious threat to Internet-enabled systems and their users. While machine learning has shown great promise for building the next generation of defenses, these defense systems are vulnerable to dynamic changes (or concept drift) in the data, caused by the evolution of attackers and by behavior changes of benign players. Traditionally, detecting and mitigating the impact of concept drift requires significant effort to label new data, which is difficult to scale. In this project, the team of researchers will design novel schemes that improve the adaptability and resilience of learning-based defenses while requiring minimal labeling capacity. The core idea is to use self-supervised learning models, which exploit unlabeled data and obtain supervision from the data itself. If successful, the project will provide much-needed tools to measure, detect, and mitigate concept drift for security applications, including malware analysis, network intrusion detection, and bot detection.
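The self-supervision idea can be illustrated with a minimal, hypothetical sketch: training pairs are derived from the unlabeled data itself, with no human labels involved. The augmentation and pairing scheme below (small random perturbations of feature vectors) is an illustrative stand-in, not the project's actual method.

```python
# Hypothetical sketch: obtain supervision from unlabeled data itself.
# Two views of the same sample form a positive pair (label 1); views of
# different samples form a negative pair (label 0). These derived labels
# could then train a contrastive model -- no human labeling required.
import random

def augment(sample, rng):
    """Return a lightly perturbed copy of a feature vector (toy augmentation)."""
    return [x + rng.uniform(-0.01, 0.01) for x in sample]

def make_pairs(unlabeled, rng):
    """Build (a, b, label) pairs whose labels come from the data itself."""
    pairs = []
    for i, s in enumerate(unlabeled):
        pairs.append((s, augment(s, rng), 1))          # positive: same sample
        other = unlabeled[(i + 1) % len(unlabeled)]
        pairs.append((s, augment(other, rng), 0))      # negative: different samples
    return pairs

rng = random.Random(0)
data = [[0.2, 0.7], [0.9, 0.1], [0.4, 0.4]]  # unlabeled feature vectors
pairs = make_pairs(data, rng)
print(len(pairs), sum(lbl for _, _, lbl in pairs))  # 6 pairs, 3 of them positive
```

In a real pipeline the perturbation would be replaced by a domain-appropriate transformation (e.g., semantics-preserving changes to a malware sample or network flow), but the labeling principle is the same.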

The team of researchers will first focus on measuring concept drift over longitudinal data. Focusing on real-world malware samples, the team will develop measurement tools to extract and characterize different types of concept drift and understand their patterns. In the next stage, the team will develop reactive methods to detect drifting samples via contrastive learning (a form of self-supervision), along with methods to select drifting samples for efficient labeling. Finally, the team will move from reactive defense to proactive approaches. The plan is to use adversarial generative models (another form of self-supervision) to synthesize richer data and labels that mimic future mutations of attackers, which will then be used to harden the defenses at the training stage. The proposed techniques are expected to reduce the data labeling costs of learning-based defenses and improve their long-term sustainability, protecting users, organizations, and critical infrastructures. The team will also leverage this project to recruit and mentor underrepresented students, develop new course materials, and perform technology transfer.
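As a purely illustrative sketch of reactive drift detection, a sample can be flagged as drifting when it lies far from every known class in an embedding space learned by contrastive training. The centroid-distance rule, the toy 2-D embeddings, and the threshold below are hypothetical placeholders, not the project's eventual method.

```python
# Hypothetical sketch: flag drifting samples by their distance to the
# nearest class centroid in a (pre-trained) contrastive embedding space.
# Flagged samples would be prioritized for human labeling.
import math

def centroid(vectors):
    """Mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def detect_drift(sample_emb, class_embs, threshold):
    """True if the sample is farther than `threshold` from every known
    class centroid -- a candidate drifting sample worth labeling."""
    d = min(dist(sample_emb, centroid(e)) for e in class_embs.values())
    return d > threshold

# Toy embeddings for two known classes (e.g., benign vs. one malware family)
classes = {
    "benign":  [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]],
    "familyA": [[5.0, 5.0], [5.1, 5.0], [5.0, 5.1]],
}
in_dist  = detect_drift([0.05, 0.05], classes, threshold=1.0)  # near benign
drifting = detect_drift([2.5, 2.5], classes, threshold=1.0)    # far from both
print(in_dist, drifting)  # False True
```

The design choice here is that contrastive training pulls same-class samples together and pushes different classes apart, so a large distance to all known classes is evidence of a new or mutated behavior rather than of a known one.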

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

