
NSF Org: | CNS Division Of Computer and Network Systems |
Recipient: | |
Initial Amendment Date: | August 18, 2011 |
Latest Amendment Date: | August 18, 2011 |
Award Number: | 1116991 |
Award Instrument: | Standard Grant |
Program Manager: | Sylvia Spengler, sspengle@nsf.gov, (703) 292-7347, CNS Division Of Computer and Network Systems, CSE Directorate for Computer and Information Science and Engineering |
Start Date: | September 1, 2011 |
End Date: | August 31, 2016 (Estimated) |
Total Intended Award Amount: | $439,227.00 |
Total Awarded Amount to Date: | $439,227.00 |
Funds Obligated to Date: | |
History of Investigator: | |
Recipient Sponsored Research Office: | 2550 NORTHWESTERN AVE # 1100, WEST LAFAYETTE, IN, US 47906-1332, (765) 494-1055 |
Sponsor Congressional District: | |
Primary Place of Performance: | 2550 NORTHWESTERN AVE # 1100, WEST LAFAYETTE, IN, US 47906-1332 |
Primary Place of Performance Congressional District: | |
Unique Entity Identifier (UEI): | |
Parent UEI: | |
NSF Program(s): | TRUSTWORTHY COMPUTING |
Primary Program Source: | |
Program Reference Code(s): | |
Program Element Code(s): | |
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
Data is a key resource in this information age. The availability of data, however, often causes privacy concerns. Many data sharing scenarios require data be anonymized for privacy protection. Most existing data anonymization techniques, however, satisfy only weak privacy notions that rely on particular assumptions about the adversaries, and provide inadequate protection. In recent years, the elegant notion of differential privacy has gradually been accepted as the privacy notion of choice for answering statistical queries. Most research on differential privacy, however, focuses on answering interactive queries, and there are several negative results on publishing microdata while satisfying differential privacy. Regardless, many data sharing scenarios require sharing of microdata, and research is needed to bridge this gap.
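For reference, the standard textbook definition of epsilon-differential privacy, the notion referred to throughout this abstract (this formulation is not quoted from the award text), is: a randomized mechanism \mathcal{M} satisfies \epsilon-differential privacy if, for all datasets D and D' that differ in one individual's record and all sets S of possible outputs,
\[ \Pr[\mathcal{M}(D) \in S] \;\le\; e^{\epsilon} \cdot \Pr[\mathcal{M}(D') \in S]. \]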
This project aims at bridging the gap between the elegant notion of differential privacy and the practical difficulty of publishing microdata while preserving utility. Building on the PI's preliminary results on using random sampling together with "safe" k-anonymization to satisfy differential privacy, the project seeks to advance both the scientific understanding of and the specific techniques for privacy-preserving microdata publishing. Research activities include developing (1) practical anonymization methods that can be proven to satisfy differential privacy while remaining capable of handling high-dimensional data; (2) relaxations of differential privacy that are more suitable for microdata publishing; (3) privacy theory and techniques that apply to a family of data sanitization algorithms called localized algorithms, enabling the use of input perturbation techniques for provably private microdata publishing; and (4) privacy notions and techniques for publishing social network data and network trace data.
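The exact guarantee obtained by combining random sampling with "safe" k-anonymization is stated in the PI's publications; only the standard privacy-amplification-by-sampling fact that underlies this approach is sketched here, as an illustration rather than the project's theorem: if a mechanism satisfies \epsilon-differential privacy, then running it on a random sample that includes each record independently with probability \beta satisfies \epsilon'-differential privacy with
\[ \epsilon' = \ln\!\bigl(1 + \beta\,(e^{\epsilon} - 1)\bigr), \]
which is strictly smaller than \epsilon whenever \beta < 1.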
Advances in data anonymization techniques will benefit society by providing a better balance between the need to release data in the public interest and the need to protect individuals' privacy. The project also involves developing a graduate seminar course on data privacy and supports two graduate students.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
This project has developed new data privacy notions as well as better techniques for privacy-preserving data publishing and analysis. It demonstrated the relationship between two major privacy notions: k-anonymity and differential privacy. It also resulted in the development of the membership privacy framework, which defines privacy as preventing any adversary from learning that an individual's data is included in the input dataset. The membership privacy framework generalizes differential privacy, makes the underlying assumptions of differential privacy explicit, and points out principled ways to relax differential privacy so that more useful information can be learned from the data.
This project also resulted in several state-of-the-art techniques for performing data analysis tasks while satisfying differential privacy. While providing the same level of privacy guarantee, some algorithms perform analysis tasks more accurately than others. The project produced algorithms for publishing histograms of low-dimensional datasets, mining frequent itemsets and publishing item counts for transactional datasets, k-means clustering, classification using decision trees, logistic regression, and support vector machines, answering marginal queries, and publishing graph node degree distributions. Algorithms introduced at the beginning of the project (published in 2013) were still found to be the best performing by a 2016 study in which other researchers compared a large number of competing algorithms, even though this area has been under intensive study in recent years.
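As an illustration of the simplest of these tasks, the sketch below publishes a histogram under epsilon-differential privacy using the standard Laplace mechanism. This is the well-known baseline that such algorithms build on, not the project's own method; the function name dp_histogram and its parameters are purely illustrative.

import numpy as np

def dp_histogram(values, bins, epsilon, rng=None):
    # Standard Laplace-mechanism baseline, not the project's algorithm.
    # Each individual contributes to exactly one bin, so adding or removing
    # one record changes the histogram by at most 1 (sensitivity 1), and
    # adding Laplace(1/epsilon) noise to every count satisfies epsilon-DP.
    rng = np.random.default_rng() if rng is None else rng
    counts, edges = np.histogram(values, bins=bins)
    noisy_counts = counts + rng.laplace(0.0, 1.0 / epsilon, size=counts.shape)
    return noisy_counts, edges

# Usage: a synthetic column of ages, published with epsilon = 0.1.
ages = np.random.default_rng(0).integers(0, 100, size=10_000)
noisy, edges = dp_histogram(ages, bins=20, epsilon=0.1)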
This project also supported the studies of two PhD students. One recently defended his PhD dissertation and joined the IBM T.J. Watson Research Center; the other plans to defend in December 2016 and will join Google Inc.
Last Modified: 10/21/2016
Modified by: Ninghui Li