Award Abstract # 1054911
CAREER: Learning- and Incentives-Based Techniques for Aggregating Community-Generated Data

NSF Org: IIS
Division of Information & Intelligent Systems
Recipient: UNIVERSITY OF CALIFORNIA, LOS ANGELES
Initial Amendment Date: January 19, 2011
Latest Amendment Date: April 8, 2015
Award Number: 1054911
Award Instrument: Continuing Grant
Program Manager: Maria Zemankova
IIS Division of Information & Intelligent Systems
CSE Directorate for Computer and Information Science and Engineering
Start Date: June 1, 2011
End Date: March 31, 2015 (Estimated)
Total Intended Award Amount: $496,671.00
Total Awarded Amount to Date: $238,627.00
Funds Obligated to Date: FY 2011 = $80,830.00
FY 2012 = $144,266.00
FY 2013 = $13,530.00
History of Investigator:
  • Jennifer Vaughan (Principal Investigator)
    jenn@cs.ucla.edu
Recipient Sponsored Research Office: University of California-Los Angeles
10889 WILSHIRE BLVD STE 700
LOS ANGELES
CA  US  90024-4200
(310)794-0102
Sponsor Congressional District: 36
Primary Place of Performance: University of California-Los Angeles
10889 WILSHIRE BLVD STE 700
LOS ANGELES
CA  US  90024-4200
Primary Place of Performance Congressional District: 36
Unique Entity Identifier (UEI): RN64EPNH8JC6
Parent UEI:
NSF Program(s): Info Integration & Informatics,
Robust Intelligence,
Algorithmic Foundations
Primary Program Source: 01001112DB NSF RESEARCH & RELATED ACTIVIT
01001213DB NSF RESEARCH & RELATED ACTIVIT
01001314DB NSF RESEARCH & RELATED ACTIVIT
01001415DB NSF RESEARCH & RELATED ACTIVIT
01001516DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1045, 1187, 7364, 7926
Program Element Code(s): 736400, 749500, 779600
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

The Internet has led to the availability of novel sources of data on the preferences, behaviors, and beliefs of massive communities of users.
Both researchers and engineers are eager to aggregate and interpret this data. However, websites sometimes fail to incentivize high-quality contributions, leading to data of variable quality. Furthermore, the assumptions made by traditional theories of learning break down in these settings.

This project seeks to create foundational machine learning models and algorithms to address and explain the issues that arise when aggregating local beliefs across large communities, and to advance the state-of-the-art understanding of how to motivate high quality contributions. The research can be split into three directions:

1. Developing mathematical foundations and algorithms for learning from community-labeled data. This direction involves building learning models for data from disparate (and potentially self-interested or malicious) sources and using insights from these models to design efficient learning algorithms.

2. Understanding and designing better incentives for crowdsourcing. This direction involves modeling crowdsourcing contributions to determine which features to include in systems to encourage the highest quality contributions.

3. Introducing novel economically-motivated mechanisms for opinion aggregation. This involves formalizing the properties a prediction market should satisfy and making use of ideas from machine learning and optimization to derive tractable market mechanisms satisfying these properties.

This research will have clear impact on industry, especially for web-based crowdsourcing. The PI will pursue her long-term goal of attracting and retaining women in computer science via her involvement in workshops and mentoring programs. Results will be disseminated at http://www.cs.ucla.edu/~jenn/projects/CAREER.html.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


Jacob Abernethy, Yiling Chen, and Jennifer Wortman Vaughan, "Efficient Market Making via Convex Optimization, and a Connection to Online Learning," ACM Transactions on Economics and Computation, v.1, 2013
Nicolas S. Lambert, John Langford, Jennifer Wortman Vaughan, Yiling Chen, Daniel Reeves, Yoav Shoham, and David M. Pennock, "An Axiomatic Characterization of Wagering Mechanisms," Journal of Economic Theory, v.156, 2015

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The primary goal of this project was to advance the state-of-the-art understanding of how to elicit and aggregate high quality information and beliefs from (online) crowds.

The results can be broken down into three primary directions.

First, this research produced novel online learning algorithms to assign crowdworkers to tasks. Crowdsourcing markets provide a way for requesters to inexpensively obtain distributed labor and information, and have recently become popular among researchers who use them to conduct user studies, run behavioral experiments, and collect data that is easy for humans to generate but difficult for computers. Unlike in traditional labor markets, requesters interact with many workers rapidly and can potentially adjust their behavior as they learn more about salient features of the environment, such as workers’ skill levels, the difficulty of their tasks, and workers’ willingness to accept their tasks at given prices. We addressed the challenge that a requester faces when assigning heterogeneous tasks to workers with unknown, heterogeneous skills. We formalized this “online task assignment problem,” provided a task assignment algorithm that is provably near-optimal given that workers return repeatedly, and evaluated this algorithm on data collected from the popular crowdsourcing platform Amazon Mechanical Turk. We then extended this line of work to cover classification or labeling tasks in which the quality of work cannot be judged immediately.
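
The online task assignment setting described above can be sketched with a simple upper-confidence-bound (UCB) heuristic. This is an illustrative simplification, not the project's published algorithm: the function names and worker model below are hypothetical, and the actual work additionally handles heterogeneous task types, budget constraints, and delayed quality feedback.

```python
import math

def assign_tasks(workers, tasks, observe_quality):
    """Assign each arriving task to the worker with the highest
    upper confidence bound on estimated skill (UCB1-style sketch)."""
    counts = {w: 0 for w in workers}    # tasks completed per worker
    means = {w: 0.0 for w in workers}   # running mean of observed quality
    assignments = []
    for t, task in enumerate(tasks, start=1):
        def ucb(w):
            if counts[w] == 0:
                return float("inf")     # try every worker at least once
            return means[w] + math.sqrt(2 * math.log(t) / counts[w])
        w = max(workers, key=ucb)
        q = observe_quality(w, task)    # observed work quality in [0, 1]
        counts[w] += 1
        means[w] += (q - means[w]) / counts[w]
        assignments.append((task, w))
    return assignments
```

Over time the sketch routes most tasks to high-skill workers while still occasionally exploring the others, which is the basic explore/exploit tension the online task assignment problem formalizes.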

Second, this research contributed to a more thorough understanding of performance-based incentives in crowdsourcing systems. We designed and ran randomized behavioral experiments on Amazon Mechanical Turk with the goal of understanding when, where, and why performance-based payments improve work quality, identifying properties of the payment, payment structure, and the task itself that make them most effective. Based on our findings, we proposed a new model of worker behavior that extends the standard principal-agent model from economics to include a worker’s subjective beliefs about his likelihood of being paid. We also designed an algorithm using multi-armed bandit techniques for optimally setting the amounts of performance-based payments for tasks assuming workers strategically choose their level of effort.
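
The idea of using multi-armed bandit techniques to set performance-based payment amounts can be sketched as follows. This is a simplified illustration, not the project's algorithm: the discrete grid of bonus levels and the requester-utility model (observed quality minus bonus paid) are assumptions made here for concreteness.

```python
import math

def choose_bonus(history, bonus_levels, t):
    """UCB over a discrete grid of bonus amounts; history[b] holds
    (count, mean_utility) for each bonus level b tried so far."""
    best, best_ucb = None, -float("inf")
    for b in bonus_levels:
        count, mean = history.get(b, (0, 0.0))
        if count == 0:
            return b  # explore any untried bonus level first
        bound = mean + math.sqrt(2 * math.log(t) / count)
        if bound > best_ucb:
            best, best_ucb = b, bound
    return best

def update(history, bonus, utility):
    """Record the requester's realized utility for a chosen bonus."""
    count, mean = history.get(bonus, (0, 0.0))
    count += 1
    mean += (utility - mean) / count
    history[bonus] = (count, mean)
```

Repeated over many tasks, this concentrates payments on the bonus level with the best observed quality-per-dollar trade-off, which is the optimization the project studied under strategic worker effort.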

Finally, this research contributed a thorough study of several key research questions surrounding the design of prediction market mechanisms that elicit and aggregate beliefs from crowds of traders. A prediction market is a market in which traders buy and sell securities with values that are contingent on the outcome of a future event. For example, a security may pay off $1 if a Democrat wins the 2016 US Presidential election and $0 otherwise. The market price of such a security is thought to reflect the traders’ collective belief about the likelihood that a Democrat will win. To facilitate trade, a prediction market can be operated by an automated market maker, an algorithmic agent that offers to buy or sell securities at some current market price determined by a pricing mechanism that makes use of the trade history. The market maker provides liquidity in the market, effectively subsidizing the market and rewarding traders for their private information. This is especially useful in “combinatorial markets” which offer securities defined on a large or infinite outcome space and propagate information (in the form of prices) appropriately across logically-related securities.
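
As a concrete classical example of such an automated market maker, Hanson's logarithmic market scoring rule (LMSR) prices securities through a convex cost function. LMSR predates this project and is shown here only as a standard instance of the cost-function-based framework the research studies; the liquidity parameter b below is a conventional choice.

```python
import math

def lmsr_cost(q, b=100.0):
    """LMSR cost function C(q) = b * log(sum_i exp(q_i / b)),
    computed with the log-sum-exp trick for numerical stability."""
    m = max(q)
    return m + b * math.log(sum(math.exp((qi - m) / b) for qi in q))

def lmsr_prices(q, b=100.0):
    """Instantaneous price of each security: the gradient of C,
    which always forms a probability distribution over outcomes."""
    m = max(q)
    exps = [math.exp((qi - m) / b) for qi in q]
    z = sum(exps)
    return [e / z for e in exps]

def trade_cost(q, delta, b=100.0):
    """What a trader pays to buy delta[i] shares of each security:
    the change in the cost function."""
    new_q = [qi + di for qi, di in zip(q, delta)]
    return lmsr_cost(new_q, b) - lmsr_cost(q, b)
```

Buying shares in an outcome raises that outcome's price, so prices track the traders' aggregated beliefs, and the market maker's worst-case loss (its subsidy) is bounded in terms of b.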

We proposed a general framework for the design of efficient pricing mechanisms over very large or infinite outcome spaces. We took an axiomatic approach, specifying a set of formal mathematical properties that any reasonable market should satisfy, and fully characterizing the set of pricing mechanisms that satisfy these properties....
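
In simplified form (omitting the exact axioms and regularity conditions, which are in the ACM Transactions on Economics and Computation article listed above), the resulting cost-function framework prices trades via a convex potential:

```latex
C(\mathbf{q}) \;=\; \sup_{\mathbf{x} \in \mathcal{H}} \bigl( \langle \mathbf{q}, \mathbf{x} \rangle - R(\mathbf{x}) \bigr),
\qquad
\mathbf{p}(\mathbf{q}) \;=\; \nabla C(\mathbf{q}),
```

where q is the vector of outstanding shares, H is the convex hull of the securities' payoff vectors, and R is a convex regularization term; a trade moving the share vector from q to q' costs C(q') - C(q). This convex-duality view is also what connects market making to online learning in the cited work.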

