Award Abstract # 2338772
CAREER: Theory and Practice of Privacy-Utility Tradeoffs in Enterprise Data Sharing

NSF Org: CCF
Division of Computing and Communication Foundations
Recipient: CARNEGIE MELLON UNIVERSITY
Initial Amendment Date: December 29, 2023
Latest Amendment Date: May 2, 2025
Award Number: 2338772
Award Instrument: Continuing Grant
Program Manager: Alfred Hero
ahero@nsf.gov
(703)292-0000
CCF Division of Computing and Communication Foundations
CSE Directorate for Computer and Information Science and Engineering
Start Date: March 1, 2024
End Date: February 28, 2029 (Estimated)
Total Intended Award Amount: $597,299.00
Total Awarded Amount to Date: $228,248.00
Funds Obligated to Date: FY 2024 = $113,507.00
FY 2025 = $114,741.00
History of Investigator:
  • Giulia Fanti (Principal Investigator)
    gfanti@andrew.cmu.edu
Recipient Sponsored Research Office: Carnegie-Mellon University
5000 FORBES AVE
PITTSBURGH
PA  US  15213-3815
(412)268-8746
Sponsor Congressional District: 12
Primary Place of Performance: Carnegie-Mellon University
5000 FORBES AVE
PITTSBURGH
PA  US  15213-3815
Primary Place of Performance Congressional District: 12
Unique Entity Identifier (UEI): U3NKNFLNQ613
Parent UEI: U3NKNFLNQ613
NSF Program(s): Comm & Information Foundations
Primary Program Source: 01002526DB NSF RESEARCH & RELATED ACTIVIT
01002425DB NSF RESEARCH & RELATED ACTIVIT
01002728DB NSF RESEARCH & RELATED ACTIVIT
01002627DB NSF RESEARCH & RELATED ACTIVIT
01002829DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 079Z, 9102, 1045
Program Element Code(s): 779700
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Addressing many of today's biggest global challenges and societal problems, including cybercrime, climate change, and public health, will require cooperative and coordinated effort across organizations. Data sharing enables organizations in academia, government, and industry to collaborate more effectively on such problems. In practice, however, data sharing among organizations is hindered by two significant factors. (1) Organizations often fear inadvertently leaking trade secrets, such as business strategies, and therefore do not share data; this concern is called trade-secret privacy. (2) When organizations do have access to shared data, they often lack the in-house resources to evaluate the quality and usefulness of the data source; this problem is called data-source utility. This project addresses both factors quantitatively. Specifically, it will develop quantitative methods for measuring trade-secret privacy and data-source utility, assess the tradeoffs between the two, and develop algorithms that come close to optimizing this tradeoff. This research will help encourage greater data sharing among organizations, thereby enhancing society's ability to address global challenges through informed collaboration. Several outreach and education activities complement and integrate the research. These include working with companies to develop privacy protections in their applications and services, contributing to an open-source library for trade-secret privacy and utility, and organizing research internship programs for students in Africa.

This project aims to design novel privacy and utility metrics and frameworks to help organizations make more informed choices about data sharing. Both of the above problems (trade-secret privacy and data-source utility) can be framed as a study of divergences between probability distributions. Building on the investigator's prior work on divergences in the context of deep generative models, this project will study how to select divergence measures that (a) satisfy enterprise use cases and (b) provide strong theoretical guarantees of privacy and utility. The project will proceed in four thrusts. Thrust 1 will define and analyze a metric for trade-secret privacy. This metric will be based on the notion of maximal leakage from information theory; maximal leakage captures the maximum amount of information that an adversary can gain about any secret quantity after seeing released, obfuscated data. The metric proposed in this project will differ by modeling the leakage of specific trade secrets rather than of an arbitrary secret. Thrust 2 will propose and theoretically analyze a metric for data-source utility, based on statistical divergences over probability distributions. This work will build on the expansive literature on data valuation. Thrust 3 will study fundamental tradeoffs between these metrics; the goal will be to identify algorithms that approach the fundamental bounds. Thrust 4 will analyze downstream performance guarantees, which connect the proposed privacy and utility metrics to enterprise use cases motivated by the investigator's ongoing industry collaborations. In summary, the project will contribute a formal methodology for modeling and mitigating common data sharing problems in enterprise settings.
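
As a rough illustration of the maximal leakage notion mentioned under Thrust 1, the sketch below computes the standard information-theoretic quantity for a discrete release mechanism: the logarithm of the sum, over released values y, of the largest conditional probability P(y | x) over inputs x with nonzero prior. This is the general definition from the literature, not the trade-secret-specific metric the project proposes, which the abstract does not specify. The function name `maximal_leakage` and the example channels are hypothetical and chosen only for illustration.

```python
import math


def maximal_leakage(channel, prior=None):
    """Maximal leakage (in bits) of a discrete channel.

    channel[x][y] is assumed to hold P(Y = y | X = x); each row sums to 1.
    prior, if given, is P(X = x); only inputs with prior[x] > 0 are counted.
    """
    # Keep only the rows for inputs that can actually occur.
    rows = [
        row for i, row in enumerate(channel)
        if prior is None or prior[i] > 0
    ]
    n_outputs = len(rows[0])
    # For each released value y, take the largest P(y | x) over the inputs.
    total = sum(max(row[y] for row in rows) for y in range(n_outputs))
    return math.log2(total)


# Illustrative (made-up) release mechanisms over a binary secret.
identity = [[1.0, 0.0], [0.0, 1.0]]   # releases the secret exactly
constant = [[0.5, 0.5], [0.5, 0.5]]   # output is independent of the secret
noisy = [[0.9, 0.1], [0.1, 0.9]]      # flips the secret with probability 0.1

print(maximal_leakage(identity))  # 1.0 bit: the secret is fully revealed
print(maximal_leakage(constant))  # 0.0 bits: nothing is revealed
print(maximal_leakage(noisy))     # log2(1.8), approximately 0.848 bits
```

The two extremes bracket the noisy case: an obfuscation mechanism trades some of the 1 bit revealed by exact release for lower leakage, which is the kind of privacy-utility tension the thrusts above set out to quantify.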

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Dong, Wei and Luo, Qiyao and Fanti, Giulia and Shi, Elaine and Yi, Ke "Almost Instance-optimal Clipping for Summation Problems in the Shuffle Model of Differential Privacy", 2024. https://doi.org/10.1145/3658644.3690225
Lin, Zinan and Wang, Shuaiqi and Sekar, Vyas and Fanti, Giulia "Summary Statistic Privacy in Data Sharing", IEEE Journal on Selected Areas in Information Theory, v.5, 2024. https://doi.org/10.1109/JSAIT.2024.3403811
Wang, Shuaiqi and Lin, Zinan and Fanti, Giulia "Statistic Maximal Leakage", 2024. https://doi.org/10.1109/ISIT57864.2024.10619258
Xu, Xinyi and Wang, Shuaiqi and Foo, Chuan-Sheng and Low, Brian Kian Hsiang and Fanti, Giulia "Data Distribution Valuation", 2024.

Please report errors in award information by writing to: awardsearch@nsf.gov.