Award Abstract # 1115234
TC: Small: Robust Anonymization on Social Networks

NSF Org: CNS
Division Of Computer and Network Systems
Recipient: UNIVERSITY OF ILLINOIS
Initial Amendment Date: July 23, 2011
Latest Amendment Date: July 23, 2011
Award Number: 1115234
Award Instrument: Standard Grant
Program Manager: Nan Zhang
CNS
 Division Of Computer and Network Systems
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: August 1, 2011
End Date: July 31, 2016 (Estimated)
Total Intended Award Amount: $495,935.00
Total Awarded Amount to Date: $495,935.00
Funds Obligated to Date: FY 2011 = $495,935.00
History of Investigator:
  • Philip Yu (Principal Investigator)
    psyu@uic.edu
Recipient Sponsored Research Office: University of Illinois at Chicago
809 S MARSHFIELD AVE M/C 551
CHICAGO
IL  US  60612-4305
(312)996-2862
Sponsor Congressional District: 07
Primary Place of Performance: University of Illinois at Chicago
809 S MARSHFIELD AVE M/C 551
CHICAGO
IL  US  60612-4305
Primary Place of Performance Congressional District: 07
Unique Entity Identifier (UEI): W8XEAJDKMXH3
Parent UEI:
NSF Program(s): TRUSTWORTHY COMPUTING
Primary Program Source: 01001112DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7923, 7795
Program Element Code(s): 779500
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Massive graphs arise in many social media applications, such as social networks, e-commerce recommendation systems, e-mail communication patterns, and other collaborative applications. Such data is often sensitive from a privacy point of view. Many privacy-preserving schemes have recently been proposed to protect released network data. The question is how effective these schemes are at preventing the re-identification of nodes, i.e., at preserving the anonymity of the network nodes.

This project examines the inadequacy of current network anonymization schemes for massive, sparse graphs. It is important to understand the theoretical properties that make such graphs susceptible to re-identification attacks. Through a systematic study of the re-identification risks of existing approaches and the development of new principles for anonymizing network data, we will deepen our understanding of these problems and be better able to protect data privacy. By designing a new type of attack algorithm and exposing the privacy risks of current network anonymization schemes, the work can lead to fundamentally different thinking about how to perform privacy-preserving data publishing on network data. It provides new insights into how to devise anonymization schemes that protect the privacy of social network data.

One of the biggest obstacles to sharing information is concern about privacy. This project has the potential to make fundamental, disruptive advances in protecting the privacy of network data. It provides new insights into the inadequacy of current anonymization schemes. Many researchers need access to sensitive data, e.g., social network data and e-mail and communication patterns. By advancing knowledge of privacy-preserving data publishing, the project will lower the barriers to sharing data and thereby facilitate scientific research.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

B. Liu, Y. Xiao, P.S. Yu, L. Cao, Y. Zhang, and Z. Hao "Uncertain One-Class Learning and Concept Summarization Learning on Uncertain Data Streams" IEEE Trans. on Knowledge and Data Engineering, v.26, 2014, p.468
B. Liu, Y. Xiao, P.S. Yu, Z. Hao, and L. Cao "An Efficient Approach for Outlier Detection with Imperfect Data Labels" IEEE Trans. on Knowledge and Data Engineering, v.26, 2014, p.1602
C. Aggarwal, Y. Li, and P.S. Yu "On the Anonymizability of Graphs" Knowledge and Information Systems, v.45, 2015, p.2
C. Sun, P.S. Yu, X. Kong, and Y. Fu "Privacy Preserving Social Network Publication Against Mutual Friend Attacks" Transactions on Data Privacy, v.7, 2014, p.71
C. Tai, P.S. Yu, D. Yang, and M.S. Chen "Structural Diversity for Resisting Community Identification in Published Social Networks" IEEE Trans. on Knowledge and Data Engineering, v.26, 2014, p.235
C. Tai, P. Tseng, P.S. Yu, and M.S. Chen "Identity Protection in Sequential Releases of Dynamic Social Networks" IEEE Trans. on Knowledge and Data Engineering, v.26, 2014, p.635
C. Wang, J. Lai, and P.S. Yu "NEIWalk: Community Discovery in Dynamic Content-based Networks" IEEE Trans. on Knowledge and Data Engineering, v.26, 2014, p.1734
H. Shuai, D. Yang, P.S. Yu, and M.S. Chen "A Comprehensive Study on Willingness Maximization for Social Activity Planning with Quality Guarantee" IEEE Trans. on Knowledge and Data Engineering, v.28, 2016, p.2
J. Wu, X. Zhu, C. Zhang, and P.S. Yu "Bag Constrained Structure Pattern Mining for Multi-Graph Classification" IEEE Trans. on Knowledge and Data Engineering, v.26, 2014, p.2382
M. Yuan, L. Chen, T. Yu, and P.S. Yu "Protecting Sensitive Labels in Social Network Data Anonymization" IEEE Trans. on Knowledge and Data Engineering, v.25, 2013, p.633-647
S. Pan, J. Wu, X. Zhu, C. Zhang, and P.S. Yu "Joint Structure Feature Exploration and Regularization for Multi-Task Graph Classification" IEEE Trans. on Knowledge and Data Engineering, v.28, 2016, p.715

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Many applications, such as social networks, email communication patterns, and other collaborative applications, are built on top of graph/network structures. These network data offer tremendous opportunities for mining useful information for advanced services, such as recommendation, personalized medicine, and location-based services. However, they also pose a threat to privacy, because data in raw form often contain sensitive information about individuals. Privacy-preserving data publishing (PPDP) studies how to transform raw data into a version that is immunized against privacy attacks but still supports effective data analysis. A straightforward solution is to remove identifying information from the nodes and perturb the graph structure so that re-identification becomes more difficult, but this is generally insufficient.

Group-based anonymization is the most widely studied approach for privacy-preserving data publishing. This project is the first to identify its vulnerabilities in network anonymization. Because typical graphs encountered in real applications are sparse, this work shows that sparse graphs have certain theoretical properties that make them susceptible to re-identification attacks, and it provides effective solutions to address these potential attacks. Various network anonymization attacks have been identified and addressed, including attacks on sequential releases of dynamic networks and friendship-type attacks based on the number of mutual friends between two persons or on the degrees (i.e., numbers of friends) of two vertices (persons) connected by a friendship link. Furthermore, in the era of big data, many data sources can be integrated to help de-anonymize the data. For example, the potential exposure from aligning multiple anonymized social networks is studied. ε-differential privacy is an alternative approach designed for an interactive querying model. A novel data publishing approach based on ε-differential privacy is proposed for the non-interactive setting.
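
To illustrate the friendship-type attacks mentioned above, the following is a minimal sketch, not the project's actual algorithm. It assumes an attacker knows, for a target friendship link, the degrees of its two endpoints and their number of mutual friends, and it reports which links in a published graph have a unique combination of those values. The function names and the toy graph are hypothetical.

```python
from collections import Counter


def edge_signatures(edges):
    """Map each undirected edge to ((low_degree, high_degree), mutual_friend_count)."""
    # Build an adjacency map from the edge list.
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    signatures = {}
    for u, v in edges:
        # Degrees of the two endpoints, sorted so orientation does not matter.
        degree_pair = tuple(sorted((len(neighbors[u]), len(neighbors[v]))))
        # Mutual friends are vertices adjacent to both endpoints.
        mutual = len(neighbors[u] & neighbors[v])
        signatures[(u, v)] = (degree_pair, mutual)
    return signatures


def uniquely_identifiable(edges):
    """Return the edges whose signature is shared by no other edge."""
    signatures = edge_signatures(edges)
    counts = Counter(signatures.values())
    return [edge for edge, sig in signatures.items() if counts[sig] == 1]


# Hypothetical toy graph: node labels stand in for anonymized identifiers.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
print(uniquely_identifiable(edges))
```

In this toy graph three of the five links carry a unique signature, so an attacker with that background knowledge could pinpoint them even after names are removed. Real attacks and defenses studied in the literature are more involved than this sketch.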

The work raises awareness of the weaknesses of current privacy-preserving data publishing schemes for network data and proposes approaches to alleviate them. This will facilitate the sharing of data to advance data-driven research.

The PI is the recipient of the ACM SIGKDD 2016 Innovation Award for his influential research and scientific contributions on mining, fusion and anonymization of big data, and of the IEEE Computer Society’s 2013 Technical Achievement Award for “pioneering and fundamentally innovative contributions to the scalable indexing, querying, searching, mining and anonymization of big data”.

Last Modified: 08/06/2016
Modified by: Philip S Yu

Please report errors in award information by writing to: awardsearch@nsf.gov.
