
NSF Org: CNS Division Of Computer and Network Systems
Recipient:
Initial Amendment Date: March 1, 2016
Latest Amendment Date: March 2, 2017
Award Number: 1563843
Award Instrument: Standard Grant
Program Manager: Dan Cosley, dcosley@nsf.gov, (703) 292-8832, CNS Division Of Computer and Network Systems, CSE Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2016
End Date: October 31, 2018 (Estimated)
Total Intended Award Amount: $599,931.00
Total Awarded Amount to Date: $615,931.00
Funds Obligated to Date: FY 2017 = $0.00
History of Investigator:
Recipient Sponsored Research Office: 526 BRODHEAD AVE, BETHLEHEM, PA, US 18015-3008, (610) 758-3021
Sponsor Congressional District:
Primary Place of Performance: 19 Memorial Drive West, Bethlehem, PA, US 18015-3085
Primary Place of Performance Congressional District:
Unique Entity Identifier (UEI):
Parent UEI:
NSF Program(s): Special Projects - CNS, Secure & Trustworthy Cyberspace
Primary Program Source: 01001718DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):
Program Element Code(s):
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070
ABSTRACT
Today, individuals and organizations leverage machine learning systems to adjust room temperature, provide recommendations, detect malware, predict earthquakes, forecast weather, maneuver vehicles, and turn Big Data into insights. Unfortunately, these systems are vulnerable to a variety of malicious attacks with potentially disastrous consequences. For example, an attacker might trick an Intrusion Detection System into ignoring the warning signs of a future attack by injecting carefully crafted samples into the training set of the machine learning model (i.e., "polluting" the model). This project is creating an approach to machine unlearning, along with the algorithms, techniques, and systems needed to efficiently and effectively repair a learning system after it has been compromised. Machine unlearning provides a last resort against various attacks on learning systems and is complementary to other existing defenses.
The key insight in machine unlearning is that most learning systems can be converted into a form that can be updated incrementally without costly retraining from scratch. For instance, several common learning techniques (e.g., the naive Bayes classifier) can be converted to the non-adaptive statistical query learning form, which depends only on a constant number of summations, each of which is a sum of some efficiently computable transformation of the training data samples. To repair a compromised learning system in this form, operators add or remove the affected training samples and recompute the trained model by updating a constant number of summations. This approach yields a substantial speedup: asymptotically, the speedup over retraining equals the size of the training set. With unlearning, operators can efficiently correct a polluted learning system by removing the injected samples from the training set, strengthen an evaded learning system by adding evasive samples to the training set, and prevent system inference attacks by forgetting samples stolen by the attacker so that no future attacks can infer anything about those samples. A sketch of the summation-based approach follows.
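To make the summation idea concrete, below is a minimal Python sketch of summation-based unlearning for a naive Bayes classifier over binary features. The class and method names, the use of NumPy, and the Laplace smoothing are illustrative assumptions, not the project's actual implementation. The only model state is a constant number of summations (per-class sample counts and per-class feature-count sums), so forgetting a sample is a constant-size update to those summations rather than a pass over the full training set.

    import numpy as np

    class UnlearnableNaiveBayes:
        """Bernoulli naive Bayes kept entirely as summations over the training data."""

        def __init__(self, n_features, n_classes):
            # The only model state: a constant number of summations.
            self.class_counts = np.zeros(n_classes)                   # samples seen per class
            self.feature_counts = np.zeros((n_classes, n_features))   # per-class sums of feature vectors

        def learn(self, x, y):
            # Add one sample's contribution to each summation.
            self.class_counts[y] += 1
            self.feature_counts[y] += x

        def unlearn(self, x, y):
            # Remove one sample's contribution. Cost is O(n_features),
            # independent of the training-set size, versus retraining
            # from scratch over the entire training set.
            self.class_counts[y] -= 1
            self.feature_counts[y] -= x

        def predict(self, x):
            # Rebuild the model from the summations (Laplace smoothing).
            priors = (self.class_counts + 1) / (self.class_counts.sum() + len(self.class_counts))
            thetas = (self.feature_counts + 1) / (self.class_counts[:, None] + 2)
            log_post = np.log(priors) + (x * np.log(thetas) + (1 - x) * np.log(1 - thetas)).sum(axis=1)
            return int(np.argmax(log_post))

    # Example: train, then forget an injected (polluted) sample.
    model = UnlearnableNaiveBayes(n_features=3, n_classes=2)
    model.learn(np.array([1, 0, 1]), 0)
    polluted = (np.array([1, 1, 1]), 1)
    model.learn(*polluted)
    model.unlearn(*polluted)   # summations now match never having seen the sample

In this setting, correcting a polluted model amounts to calling unlearn on each injected sample once it is identified; because the resulting summations are identical to those of a model retrained on the cleaned training set, the predictions match exact retraining at a fraction of the cost.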