Award Abstract # 2149315
Collaborative Research: The Value of Data

NSF Org: SES
Division of Social and Economic Sciences
Recipient: THE TRUSTEES OF COLUMBIA UNIVERSITY IN THE CITY OF NEW YORK
Initial Amendment Date: March 22, 2022
Latest Amendment Date: March 22, 2022
Award Number: 2149315
Award Instrument: Standard Grant
Program Manager: Nancy Lutz
nlutz@nsf.gov
(703)292-7280
SES: Division of Social and Economic Sciences
SBE: Directorate for Social, Behavioral and Economic Sciences
Start Date: March 15, 2022
End Date: February 29, 2024 (Estimated)
Total Intended Award Amount: $123,115.00
Total Awarded Amount to Date: $123,115.00
Funds Obligated to Date: FY 2022 = $123,115.00
History of Investigator:
  • Jacopo Perego (Principal Investigator)
    jacopo.perego@columbia.edu
Recipient Sponsored Research Office: Columbia University
615 W 131ST ST
NEW YORK
NY  US  10027-7922
(212)854-6851
Sponsor Congressional District: 13
Primary Place of Performance: Trustees of Columbia in the City of New York
615 West 131st Street, Room 254
New York
NY  US  10027-6902
Primary Place of Performance Congressional District: 13
Unique Entity Identifier (UEI): F4N1QNPB95M4
Parent UEI:
NSF Program(s): Economics
Primary Program Source: 01002223DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7434, 9179
Program Element Code(s): 132000
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.075

ABSTRACT

This research will investigate the value that the data of a consumer has for the firm that uses it. In many digital platforms, consumer data is often used to intermediate the needs of various agents with conflicting interests, such as buyers and sellers, drivers and riders, or social-media users and advertisers. This makes determining the value of data especially complex. This research will tackle the complexity issue and provide a more complete account of the value of data, especially its dependence on privacy-protection policies. By doing so, it will advance our understanding of the demand for data in the digital economy and provide insights into how data markets work and may be affected by policy interventions. This research will also shed light on the debate about how to individually compensate consumers for their data, which many scholars and policymakers believe to be an essential aspect of a functioning data market.

More specifically, this research project will study what determines the value of an individual consumer's data record for the intermediary that uses it as an input in its business. For instance, such a record can be the characteristics of a buyer that an e-commerce platform stores on its servers. When data is used by a third party (like a platform) to strategically direct interactions between multiple agents (like buyers and sellers), assessing its value is complicated and calls for a new approach. The project shows that this value is not just the payoff the intermediary derives directly from a record (like a platform's transaction fee). It involves other components, which can significantly bias our assessments if ignored. They capture externalities between the records of, say, different buyers not because of a statistical correlation, but because of how the platform partitions its knowledge of the buyers so as to direct sellers' responses (e.g., by pooling buyers into market segments). Such externalities can render the record of a low-spending buyer more valuable than that of a high-spending buyer. The first part of the project will study contexts where the intermediary already owns the data and can use it without people's consent. Its core contribution is to show how to properly assess the value of individual records and characterize all its components. The second part of the project will study how the value of data changes when each consumer can withhold their data from the platform. One key insight is that privacy rights may not only shift wealth from data-users to data-sources (i.e., from intermediaries to consumers), but also change the value of data records itself. For instance, it can increase the value of some people's records at the expense of others. Thus, privacy can have redistributive effects across data-sources, which may contribute to social inequality and should be taken into account by privacy-protection policies.
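
The pooling externality described above can be illustrated with a toy calculation. The sketch below is not taken from the project's papers; it assumes a hypothetical platform that maximizes consumer surplus, a single seller who picks a price of 1 or 2 for each market segment, and buyers who value the good at either 1 (low) or 2 (high). All function names and numbers are illustrative assumptions, and the value of a record is approximated here as the change in the platform's best achievable payoff when one more record of that type is added.

```python
from itertools import product


def platform_payoff(q_low: int, q_high: int) -> int:
    """Best consumer surplus the (hypothetical) platform can reach.

    Assumed setup: low buyers value the good at 1, high buyers at 2.
    The platform splits its records into a "price 1" segment and a
    "price 2" segment; the seller must find each suggested price optimal.
    """
    best = 0
    # low_1 / high_1 are the records placed in the "price 1" segment;
    # all remaining records go to the "price 2" segment.
    for low_1, high_1 in product(range(q_low + 1), range(q_high + 1)):
        low_2, high_2 = q_low - low_1, q_high - high_1
        # Price 1 sells to everyone in the segment; price 2 sells only to highs.
        seller_accepts_price_1 = (low_1 + high_1) * 1 >= high_1 * 2
        seller_accepts_price_2 = high_2 * 2 >= (low_2 + high_2) * 1
        if seller_accepts_price_1 and seller_accepts_price_2:
            # Only high buyers charged price 1 earn surplus (2 - 1 = 1 each).
            best = max(best, high_1)
    return best


def record_values(q_low: int, q_high: int) -> dict:
    """Marginal value of one extra record of each type (illustrative measure)."""
    base = platform_payoff(q_low, q_high)
    return {
        "low": platform_payoff(q_low + 1, q_high) - base,
        "high": platform_payoff(q_low, q_high + 1) - base,
    }


# Assumed database: 3 low-spending records and 5 high-spending records.
values = record_values(q_low=3, q_high=5)
print(values)
```

In this assumed database, an extra low-spending record is worth 1 to the platform and an extra high-spending record is worth 0, even though a low buyer's own trade never generates any surplus. The low record is valuable only because pooling it with a high buyer keeps the seller willing to charge the low price in that segment. With the counts reversed (5 low, 3 high), the ranking flips.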

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Galperti, Simone and Levkun, Aleksandr and Perego, Jacopo. "The Value of Data Records." The Review of Economic Studies, 2024. https://doi.org/10.1093/restud/rdad044
Galperti, Simone and Liu, Tianhao and Perego, Jacopo. "Competitive Markets for Personal Data." ACM EC, 2024.
Galperti, Simone and Perego, Jacopo. "Privacy and the Value of Data." AEA Papers and Proceedings, v.113, 2023. https://doi.org/10.1257/pandp.20231084

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

This project resulted in the production of three research papers, each focusing on distinct aspects of data economics: the value of data, data privacy, and data markets. These papers were presented at numerous venues, including invited seminars at leading universities, conferences, and workshops. They enhance our understanding of data economics, a booming field. The grant also funded the professional development of young PhD scholars. Finally, this work inspired further research on the topic, which is now in progress.

In "The Value of Data Records" (2024, The Review of Economic Studies), we addressed a key question: How much value does each individual's data contribute to multi-billion-dollar industries? Personal data fuels industries like search engines, social media, e-commerce, and job-matching platforms. It is often compared to the "new oil" of modern economies. However, the specific value of each individual's data remains unclear, especially when data is pooled to maximize revenues from ad auctions, obscuring individual contributions.

To address this, our paper proposes a new approach rooted in the information design literature. We characterize the value of each data record for intermediaries that use data to influence the behavior of strategic agents. Our analysis reveals that the value of a data record comprises two components. The first is the direct payoff the intermediary obtains (e.g., when a user trades with an advertiser). The second is a novel externality that arises when the intermediary pools records to withhold information from agents. This externality is a hallmark of intermediation problems due to inherent conflicts of interest.

Our analysis has two main implications. First, we show that the values of data records are a useful benchmark for compensating individuals for their data, contributing to policy debates about data dividends and data unions. Second, we draw an analogy between how an intermediary values data records and consumer theory, enabling the application of analytical tools to study data markets and optimal data acquisition strategies.

In "Privacy and the Value of Data" (2023, American Economic Association, Papers & Proceedings), we examined how data-privacy laws affect the value of personal data for firms and impact consumer welfare. We extend the model in "The Value of Data Records" to incorporate consumer privacy protection by introducing elicitation constraints. In our model, an e-commerce platform intermediates between a monopolistic seller and heterogeneous consumers, aiming to maximize consumer surplus by influencing the seller's pricing based on consumer information.

Our analysis yields three main insights. First, protecting consumer privacy can increase or decrease the value of some consumers' data while leaving others unaffected. Second, privacy protection impacts data usage by platforms and consumer payoffs, benefiting some consumers but harming others, especially those with no reason to withhold data. Third, privacy protection raises average transaction prices but limits trade, negatively impacting the platform.

In "Competitive Markets for Personal Data" (2024, working paper, with extended abstract in the 2024 Proceedings of the ACM EC Conference), we investigate whether competitive markets for personal data promote efficient allocations and maximize consumer welfare. This question is motivated by policy debates on data market design. Currently, consumers have limited control over their data use and are often imperfectly compensated. This can lead to inefficiencies and market failure.

We model an economy where consumers own their data and can sell it to a platform at a given price, assuming a competitive market. The platform provides a service, intermediating consumers with a third-party merchant from whom they can buy products. The platform uses consumers' data to inform the merchant about their willingness to pay, affecting merchant profits, consumer surplus, and data prices.

Our main result shows that the efficiency of this competitive economy relies on the platform's objective. When the platform's and merchant's objectives align, the equilibrium allocation is efficient, maximizing consumer welfare. Conversely, when the platform's objective aligns more with consumers, the equilibrium can be inefficient due to the externality discussed in "The Value of Data Records."

We propose three alternative market designs to correct this inefficiency. The first involves a "data union," an intermediary managing consumers' data on their behalf and returning proceeds as data dividends. The second involves a "data tax" on consumers when they trade their data. The third involves making markets "more complete," by letting the price of a data record also depend on its intended use.  


Last Modified: 06/05/2024
Modified by: Jacopo Perego
