NSF Award Search: Award # 0964094 - TC: Medium: Dissemination and Analysis of Private Network Data

Award Abstract # 0964094

TC: Medium: Dissemination and Analysis of Private Network Data

NSF Org:	CNS Division Of Computer and Network Systems
Recipient:	UNIVERSITY OF MASSACHUSETTS
Initial Amendment Date:	May 6, 2010
Latest Amendment Date:	April 29, 2011
Award Number:	0964094
Award Instrument:	Standard Grant
Program Manager:	Sylvia Spengler, sspengle@nsf.gov, (703) 292-7347; CNS Division Of Computer and Network Systems; CSE Directorate for Computer and Information Science and Engineering
Start Date:	May 1, 2010
End Date:	April 30, 2014 (Estimated)
Total Intended Award Amount:	$873,125.00
Total Awarded Amount to Date:	$889,125.00
Funds Obligated to Date:	FY 2010 = $873,125.00; FY 2011 = $16,000.00
History of Investigator:	Gerome Miklau (Principal Investigator), miklau@cs.umass.edu; Donald Towsley (Co-Principal Investigator); David Jensen (Co-Principal Investigator)
Recipient Sponsored Research Office:	University of Massachusetts Amherst 101 COMMONWEALTH AVE AMHERST MA US 01003-9252 (413)545-0698
Sponsor Congressional District:	02
Primary Place of Performance:	University of Massachusetts Amherst 101 COMMONWEALTH AVE AMHERST MA US 01003-9252
Primary Place of Performance Congressional District:	02
Unique Entity Identifier (UEI):	VGJHK59NMPK9
Parent UEI:	VGJHK59NMPK9
NSF Program(s):	Special Projects - CNS, TRUSTWORTHY COMPUTING
Primary Program Source:	01001011DB NSF RESEARCH & RELATED ACTIVIT; 01001112DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):	7924, 9178, 9218, 9251, HPCC
Program Element Code(s):	171400, 779500
Award Agency Code:	4900
Fund Agency Code:	4900
Assistance Listing Number(s):	47.070

ABSTRACT

The goal of this research project is to enable statistical analysis and knowledge discovery on networks without violating the privacy of participating entities. Network data sets record the structure of computer, communication, social, or organizational networks, but they often contain highly sensitive information about individuals. The availability of network data is crucial for analyzing, modeling, and predicting the behavior of networks.

The team's approach is based on model-based generation of synthetic data, in which a model of the network is released under strong privacy conditions and samples from that model are studied directly by analysts. Output perturbation techniques are used to privately compute the parameters of popular network models. The resulting "noisy" model parameters are released, satisfying a strong, quantifiable privacy guarantee, but still preserving key properties of the networks. Analysts can use the released models to sample individual networks or to reason about properties of the implied ensemble of networks.
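The output-perturbation idea described above can be sketched in a few lines. This is a minimal illustration, not the project's actual algorithms: it assumes the Laplace mechanism, edge-level privacy (so the edge count has sensitivity 1), a publicly known node count, and the simplest edge-only random graph model, chosen for brevity. The function names and the privacy parameter value are hypothetical.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and scale `scale`.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_edge_count(edges, epsilon, rng):
    # Assumption: edge-level privacy. Adding or removing one edge changes
    # the count by exactly 1, so the sensitivity is 1 and the noise scale
    # is 1 / epsilon.
    return len(edges) + laplace_noise(1.0 / epsilon, rng)

def sample_graph(n, noisy_edge_count, rng):
    # Fit the edge-only model: each possible edge appears independently
    # with probability p, estimated from the noisy count and clamped to [0, 1].
    max_edges = n * (n - 1) / 2
    p = min(1.0, max(0.0, noisy_edge_count / max_edges))
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]

rng = random.Random(0)
true_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 4)]  # the sensitive network
noisy = private_edge_count(true_edges, epsilon=1.0, rng=rng)  # released parameter
synthetic = sample_graph(5, noisy, rng)  # what the analyst studies
```

Only the noisy parameter leaves the data owner; the analyst can draw as many synthetic graphs from it as needed without further privacy cost, since sampling never touches the original edges.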

By synthesizing versions of networks that would otherwise remain hidden, this research can advance the study of topics such as disease transmission, network resiliency, and fraud detection. The project will result in publicly available privacy tools, a repository for derived models and sample networks, and contributions to workforce development in the field of information assurance. The experimental research is linked to educational efforts including undergraduate involvement in research through a Research Experience for Undergraduates site, as well as interdisciplinary seminars.

For further information see the project web site at the URL:
http://dbgroup.cs.umass.edu/private-network-data

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

This project has developed a set of algorithms for protecting personal privacy while supporting the release of networked data sets. Data privacy research has most commonly been focused on tabular data, in which an individual is described by a set of attributes contained in a single record. Networked data poses a special challenge because it describes a graph in which an edge relation represents connections, interactions, or communication between named nodes. Protecting privacy is more complicated for this type of data: revealing the properties of connected individuals may constitute dangerous disclosures, and revealing information about one individual is more likely to lead to inferences about other connected individuals.

This project has developed conceptual and technological advancements for modeling networked data sets under the rigorous model of differential privacy. Our basic approach is the model-based generation of synthetic data, in which a model of the networked data set is released under strong privacy conditions and samples from that model are studied directly by analysts. The data received by analysts must be perturbed or distorted to preserve privacy; however, analysts receive measures of estimated error along with the synthesized data. The main contributions include the following:

We developed algorithms for privately estimating a number of key statistics used with a popular model of network formation (the exponential random graph model). For these statistics, our method allows an analyst to fit this model to the data with improved accuracy.
We developed a method for constructing synthetic multi-relational data sets (which generalize networked data beyond a single relationship) also with a rigorous privacy guarantee and improved accuracy.
We investigated foundational issues in the statistical modeling of networked data, developing new modeling approaches that increase correctness and descriptive power.
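
The report notes that analysts receive measures of estimated error along with the synthesized data. As a hedged illustration of what such a measure could look like, the sketch below assumes the released statistic was perturbed with Laplace noise of scale sensitivity / epsilon (the report does not name the mechanism) and computes a bound that the absolute error stays under with a chosen probability. The function name and parameters are hypothetical.

```python
import math

def laplace_error_bound(sensitivity, epsilon, confidence=0.95):
    # For Laplace noise X with scale b, P(|X| > t) = exp(-t / b).
    # Solving exp(-t / b) = 1 - confidence gives t = b * ln(1 / (1 - confidence)).
    scale = sensitivity / epsilon
    return scale * math.log(1.0 / (1.0 - confidence))

# Error bars that could accompany a noisy statistic of sensitivity 1.
bound_eps_1 = laplace_error_bound(sensitivity=1.0, epsilon=1.0)
bound_eps_half = laplace_error_bound(sensitivity=1.0, epsilon=0.5)
```

Because the bound depends only on the public sensitivity and epsilon, not on the data, publishing it costs no additional privacy. Halving epsilon (stronger privacy) doubles the bound.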

The project enhanced cyber-security curricula at the undergraduate and graduate levels and added to the cyber-security workforce, and our results were disseminated both nationally and internationally.

Last Modified: 07/29/2014
Modified by: Gerome Miklau

