
NSF Org: | CNS Division Of Computer and Network Systems |
Recipient: | |
Initial Amendment Date: | April 14, 2022 |
Latest Amendment Date: | April 14, 2022 |
Award Number: | 2225225 |
Award Instrument: | Standard Grant |
Program Manager: | Dan Cosley dcosley@nsf.gov (703)292-8832 CNS Division Of Computer and Network Systems CSE Directorate for Computer and Information Science and Engineering |
Start Date: | November 15, 2021 |
End Date: | September 30, 2025 (Estimated) |
Total Intended Award Amount: | $247,903.00 |
Total Awarded Amount to Date: | $247,903.00 |
Funds Obligated to Date: | |
History of Investigator: | |
Recipient Sponsored Research Office: | 633 CLARK ST EVANSTON IL US 60208-0001 (312)503-7955 |
Sponsor Congressional District: | |
Primary Place of Performance: | 750 N. Lake Shore Drive Chicago IL US 60611-4579 |
Primary Place of Performance Congressional District: | |
Unique Entity Identifier (UEI): | |
Parent UEI: | |
NSF Program(s): | Secure & Trustworthy Cyberspace |
Primary Program Source: | |
Program Reference Code(s): | |
Program Element Code(s): | |
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
This project aims to harden machine-learning-based security defenses by improving their ability to handle dynamic changes. From data breaches to ransomware infections, increasingly sophisticated attacks pose a serious threat to Internet-enabled systems and their users. While machine learning has shown great promise for building the next generation of defenses, these defense systems are vulnerable to dynamic changes (or concept drift) in the data, caused by attacker evolution and by behavior changes among benign players. Traditionally, detecting and mitigating the impact of concept drift requires significant effort to label new data, which is difficult to scale. In this project, the team of researchers will design novel schemes to improve the adaptability and resilience of learning-based defenses while requiring minimal labeling capacity. The core idea is to use self-supervised learning models, which utilize unlabeled data and derive supervision from the data itself. If successful, the project will provide much-needed tools to measure, detect, and mitigate concept drift for security applications, including malware analysis, network intrusion detection, and bot detection.
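To make the concept-drift problem concrete, the following is a minimal, hypothetical sketch of drift measurement: it compares a feature's distribution in a new window of samples against a reference (training) window, flagging drift when the mean shifts far relative to the reference spread. The function names, the one-dimensional feature, and the threshold are illustrative assumptions, not the project's actual method.

```python
# Hypothetical sketch: flag concept drift when a feature's mean in a new
# window shifts far (relative to the reference standard deviation) from
# the training-time distribution. Threshold and names are illustrative.
import math

def mean_shift_score(reference, window):
    """Absolute mean difference, scaled by the reference std deviation."""
    n = len(reference)
    mu = sum(reference) / n
    var = sum((x - mu) ** 2 for x in reference) / n
    sigma = math.sqrt(var) or 1.0  # avoid division by zero
    mu_new = sum(window) / len(window)
    return abs(mu_new - mu) / sigma

def drifted(reference, window, threshold=3.0):
    """True when the new window deviates beyond the chosen threshold."""
    return mean_shift_score(reference, window) > threshold

# Example data: a stable window scores low; a shifted one scores high.
ref = [0.1 * i for i in range(100)]              # reference feature values
same = [0.1 * i for i in range(100)]             # same distribution
shifted = [0.1 * i + 50.0 for i in range(100)]   # drifted distribution
```

Real deployments would track many features and use distribution-level tests rather than a single mean, but the monitoring loop has the same shape: score new data against a training-time baseline and alert past a threshold.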
The team of researchers will first focus on measuring concept drift over longitudinal data. With a focus on real-world malware samples, the team will develop measurement tools to extract and characterize different types of concept drift and understand their patterns. In the next stage, the team will develop reactive methods to detect drifting samples via contrastive learning (a form of self-supervision), along with methods to select drifting samples for efficient labeling. Finally, the team will move from reactive defense to proactive approaches. The plan is to use adversarial generative models (another form of self-supervision) to synthesize richer data and labels that mimic future mutations of attackers, which will be used to harden the defenses at the training stage. The proposed techniques are expected to reduce data labeling costs for learning-based defenses and improve their long-term sustainability in protecting users, organizations, and critical infrastructures. The team will also leverage this project to recruit and mentor underrepresented students, develop new course materials, and perform technology transfer.
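The reactive step described above can be sketched as follows, under the assumption (common in contrastive-learning-based drift detection) that a learned embedding places known classes in tight clusters, so a sample far from every class centroid is a drift candidate worth prioritizing for labeling. The embedding coordinates, distance threshold, and function names below are illustrative, not the project's actual design.

```python
# Hypothetical sketch: in a (contrastively) learned embedding space,
# score each sample by its distance to the nearest known-class centroid;
# high scores suggest drift. Data, threshold, and names are illustrative.
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    dims = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dims)]

def drift_score(sample, class_embeddings):
    """Distance to the nearest class centroid; large means likely drift."""
    return min(euclidean(sample, centroid(pts))
               for pts in class_embeddings.values())

def select_for_labeling(samples, class_embeddings, budget, threshold=1.0):
    """Return up to `budget` samples whose nearest-centroid distance
    exceeds the threshold, most drifted first (labeling prioritization)."""
    scored = [(drift_score(s, class_embeddings), s) for s in samples]
    flagged = sorted((d, s) for d, s in scored if d > threshold)
    return [s for _, s in reversed(flagged)][:budget]

# Toy 2-D "embeddings" for two known classes.
class_embeddings = {
    "benign":  [[0.0, 0.0], [0.2, 0.0]],
    "malware": [[5.0, 5.0], [5.2, 5.0]],
}
```

The budgeted selection mirrors the labeling-efficiency goal: instead of labeling all new data, analysts label only the samples the model is least equipped to explain.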
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.