
NSF Org: | CCF Division of Computing and Communication Foundations |
Recipient: |
|
Initial Amendment Date: | December 29, 2023 |
Latest Amendment Date: | May 2, 2025 |
Award Number: | 2338772 |
Award Instrument: | Continuing Grant |
Program Manager: | Alfred Hero, ahero@nsf.gov, (703)292-0000, CCF Division of Computing and Communication Foundations, CSE Directorate for Computer and Information Science and Engineering |
Start Date: | March 1, 2024 |
End Date: | February 28, 2029 (Estimated) |
Total Intended Award Amount: | $597,299.00 |
Total Awarded Amount to Date: | $228,248.00 |
Funds Obligated to Date: | FY 2025 = $114,741.00 |
History of Investigator: |
|
Recipient Sponsored Research Office: | 5000 FORBES AVE PITTSBURGH PA US 15213-3815 (412)268-8746 |
Sponsor Congressional District: |
|
Primary Place of Performance: | 5000 FORBES AVE PITTSBURGH PA US 15213-3815 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | Comm & Information Foundations |
Primary Program Source: | 01002425DB NSF RESEARCH & RELATED ACTIVIT; 01002728DB NSF RESEARCH & RELATED ACTIVIT; 01002627DB NSF RESEARCH & RELATED ACTIVIT; 01002829DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
Effectively addressing many of today's biggest global challenges and societal problems, including cybercrime, climate change, and public health, will require a cooperative and coordinated effort across organizations. Data sharing will enable organizations, including academic researchers, governments, and industry, to collaborate more effectively on solving such problems. In practice, however, data sharing among organizations is hindered by two significant factors. (1) Organizations often fear inadvertently leaking trade secrets, such as business strategies, and therefore do not share data; this concern is called trade-secret privacy. (2) When organizations have access to shared data, they often lack the in-house resources to evaluate the quality and usefulness of the data source; this problem is called data-source utility. This project addresses data sharing quantitatively in terms of these two factors. Specifically, the project will develop quantitative methods for measuring trade-secret privacy and data-source utility, assess the tradeoffs between these two factors, and develop algorithms that come close to optimizing this tradeoff. This research will help encourage greater data sharing among organizations, thereby enhancing society's ability to address global challenges through informed collaboration. Several outreach and education activities complement and integrate the research. These include working with companies to develop privacy protections in their applications and services, working on an open-source library for trade-secret privacy and utility, and organizing research internship programs for students in Africa.
This project aims to design novel privacy and utility metrics and frameworks to help organizations make more informed choices regarding data sharing. Both of the above problems (trade-secret privacy and data-source utility) can be framed as a study of divergences between probability distributions. Building on the investigator's prior work studying divergences in the context of deep generative models, this project will study how to select divergence measures that (a) satisfy enterprise use cases, and (b) provide strong theoretical guarantees of privacy and utility. The project will proceed in four thrusts. Thrust 1 will define and analyze a metric for trade-secret privacy. This metric will be based on the notion of maximal leakage from information theory; maximal leakage captures the maximum amount of information that an adversary can gain about any secret quantity after seeing released, obfuscated data. The metric proposed in this project will differ by modeling the information leakage of specific trade secrets, rather than of any arbitrary secret. Thrust 2 will propose and theoretically analyze a metric for data-source utility, based on statistical divergences over probability distributions. This work will build on the expansive literature on data valuation. Thrust 3 will study fundamental tradeoffs between these metrics; the goal will be to identify algorithms that approach the fundamental bounds. Thrust 4 will analyze downstream performance guarantees, which connect the proposed privacy and utility metrics to enterprise use cases motivated by the investigator's ongoing industry collaborations. In summary, the project will contribute a formal methodology for modeling and mitigating common data sharing problems in enterprise settings.
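As a rough illustration of the maximal-leakage notion that Thrust 1 builds on, the sketch below computes the standard information-theoretic quantity for a finite channel from the original data X to the released data Y, namely log2 of the sum over outputs y of the largest P(y | x). This is the generic definition, not the project's trade-secret metric, which the abstract says will differ by targeting specific secrets. The function names and the randomized-response mechanism are hypothetical choices made for this example, and the code assumes every value of X has positive probability.

```python
import math


def maximal_leakage(channel):
    """Maximal leakage, in bits, of a discrete channel P(Y | X).

    channel[x][y] = P(Y = y | X = x). Each row is assumed to sum to 1,
    and every input x is assumed to occur with positive probability.
    """
    num_outputs = len(channel[0])
    # For each output y, take the largest conditional probability over
    # all inputs x, then sum those maxima across the outputs.
    column_max_sum = sum(
        max(row[y] for row in channel) for y in range(num_outputs)
    )
    return math.log2(column_max_sum)


def randomized_response(flip_prob):
    """Hypothetical obfuscation: release a binary value, flipped with
    probability flip_prob (an assumption made for this illustration)."""
    return [
        [1 - flip_prob, flip_prob],
        [flip_prob, 1 - flip_prob],
    ]


# Releasing the data unchanged leaks the full bit; a fair coin flip leaks nothing.
leak_unchanged = maximal_leakage(randomized_response(0.0))
leak_noisy = maximal_leakage(randomized_response(0.25))
leak_coin_flip = maximal_leakage(randomized_response(0.5))
```

In this toy setting, raising the flip probability toward 0.5 lowers the leakage but also makes the released data less informative, which is the kind of privacy-utility tension that Thrust 3 proposes to study.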
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.