Award Abstract # 0845059
Collaborative Research: Bayesian Cue Integration in Probability-Sensitive Language Processing

NSF Org: BCS  Division of Behavioral and Cognitive Sciences
Recipient: UNIVERSITY OF ROCHESTER
Initial Amendment Date: June 24, 2009
Latest Amendment Date: June 24, 2009
Award Number: 0845059
Award Instrument: Standard Grant
Program Manager: William Badecker
  BCS  Division of Behavioral and Cognitive Sciences
  SBE  Directorate for Social, Behavioral and Economic Sciences
Start Date: July 1, 2009
End Date: June 30, 2012 (Estimated)
Total Intended Award Amount: $97,654.00
Total Awarded Amount to Date: $97,654.00
Funds Obligated to Date: FY 2009 = $97,654.00
ARRA Amount: $97,654.00
History of Investigator:
  • T. Florian Jaeger (Principal Investigator)
    fjaeger@bcs.rochester.edu
Recipient Sponsored Research Office: University of Rochester
910 GENESEE ST
ROCHESTER
NY  US  14611-3847
(585)275-4031
Sponsor Congressional District: 25
Primary Place of Performance: University of Rochester
910 GENESEE ST
ROCHESTER
NY  US  14611-3847
Primary Place of Performance Congressional District: 25
Unique Entity Identifier (UEI): F27KDXZMF9Y8
Parent UEI:
NSF Program(s): Linguistics
Primary Program Source: 01R00910DB RRA RECOVERY ACT
Program Reference Code(s): 0000, 6890, OTHR
Program Element Code(s): 131100
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.075

ABSTRACT

This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5).

The process of sentence comprehension involves incrementally accessing the meaning of individual words and combining them into larger representations. In this process, readers and listeners use probabilistic cues to guide their expectations about upcoming words and about the syntactic roles those words will play. For example, previous research has demonstrated that people are sensitive to word frequency (more frequent words or word meanings are easier to process than less frequent ones), syntactic frequency (more frequent rules are easier to process than less frequent rules), and world knowledge (more likely events are easier to process than less likely events). However, a complete theory of language processing must not only identify the cues that people are sensitive to, but also specify how a reader or listener combines them. Toward this end, this project investigates the extent to which readers' syntactic processing mechanisms fit several cue combination models based on different Bayesian inference methods. Bayesian models provide a formalism specifying how any set of probabilistic information sources can be optimally weighted and combined. In this research, each of the cue combination models is applied to a wide range of language cues on which people have been shown to rely. The models will then be compared in terms of their fit to human reading-time data. The project will use existing reading-time data sets from previous experiments, as well as new reading-time data collected for materials consisting of single sentences in null contexts and in supportive contexts, systematically varying several cues that have been shown to have measurable reading-time effects in the previous literature. This will reveal which of the inference methods provides the most accurate description of the computations that human readers perform in order to understand sentences.
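As a minimal sketch of the kind of Bayesian cue combination described above, independent cue likelihoods can be multiplied into a prior over candidate interpretations and renormalized to yield a posterior. The interpretation labels, cue names, and probability values below are hypothetical and are included only for illustration; they are not the project's actual models or data.

# Minimal sketch of naive Bayesian cue combination over candidate analyses.
# Interpretation labels, cue names, and probabilities are hypothetical.

def combine_cues(prior, cue_likelihoods):
    """Multiply a prior over interpretations by independent cue likelihoods
    and renormalize, returning a posterior over interpretations."""
    posterior = dict(prior)
    for likelihood in cue_likelihoods:
        for interpretation in posterior:
            posterior[interpretation] *= likelihood[interpretation]
    total = sum(posterior.values())
    return {interpretation: p / total for interpretation, p in posterior.items()}

# Two candidate analyses of an ambiguous string: a prior from syntactic
# frequency, combined with likelihoods from two further cues.
prior = {"main_clause": 0.7, "relative_clause": 0.3}
cues = [
    {"main_clause": 0.2, "relative_clause": 0.6},  # word-frequency cue
    {"main_clause": 0.4, "relative_clause": 0.9},  # world-knowledge cue
]
print(combine_cues(prior, cues))
# e.g. {'main_clause': 0.256..., 'relative_clause': 0.743...}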

Previous work in the field of sentence comprehension has established that people use a diverse range of probabilistic cues when they interpret a sentence. However, several important questions remain unanswered, including (a) how much each cue matters in typical texts, and (b) how the cues are combined in the course of comprehension. The main advance of this research is to use a combination of experimental and computational modeling methods to answer these two questions. The techniques and results developed will be broadly useful in at least three general areas: (1) cognitive science; (2) engineering; and (3) human applications of language research. First, the project will help researchers by providing a database of reading times for a large corpus of English text that any researcher will be able to use to evaluate theories of language processing. In addition, the project will provide open-source software that researchers can use or modify to evaluate related theoretical questions in language and in other fields of cognitive science. Second, the project will give computer engineers who work on language a way to investigate the effects that human readers are sensitive to, indicating potential directions for fruitful research. And third, an understanding of how language is processed will, in the long run, aid in developing better diagnostic tools and treatments for people with developmental and acquired language disorders.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Jaeger, T. F. "Redundancy and Reduction: Speakers Manage Syntactic Information Density." Cognitive Psychology, v.61(1), 2010, p. 23. doi:10.1016/j.cogpsych.2010.02.002
Jaeger, T. F.; Tily, H. "Language Processing Complexity and Communicative Efficiency." WIREs Cognitive Science, v.2, 2011, p. 323. doi:10.1002/wcs.126
Qian, T.; Jaeger, T. F. "Cue Effectiveness in Communicatively Efficient Discourse Production." Cognitive Science, v.36(7), 2012, p. 1312. doi:10.1111/j.1551-6709.2012.01256.x
