NSF Award Search: Award # 0951831

Award Abstract # 0951831

Doctoral Dissertation Research: Temporal and Spatial Properties of Speech at the Intersection of Production and Perception

NSF Org:	BCS Division of Behavioral and Cognitive Sciences
Recipient:	NEW YORK UNIVERSITY
Initial Amendment Date:	March 15, 2010
Latest Amendment Date:	September 7, 2011
Award Number:	0951831
Award Instrument:	Standard Grant
Program Manager:	William Badecker BCS Division of Behavioral and Cognitive Sciences SBE Directorate for Social, Behavioral and Economic Sciences
Start Date:	March 15, 2010
End Date:	August 31, 2012 (Estimated)
Total Intended Award Amount:	$4,778.00
Total Awarded Amount to Date:	$4,778.00
Funds Obligated to Date:	FY 2010 = $4,778.00
History of Investigator:	Lisa Davidson (Principal Investigator) lisa.davidson@nyu.edu Kevin Roon (Co-Principal Investigator) Adamantios Gafos (Former Principal Investigator)
Recipient Sponsored Research Office:	New York University 70 WASHINGTON SQ S NEW YORK NY US 10012-1019 (212)998-2121
Sponsor Congressional District:	10
Primary Place of Performance:	New York University 70 WASHINGTON SQ S NEW YORK NY US 10012-1019
Primary Place of Performance Congressional District:	10
Unique Entity Identifier (UEI):	NX9PXMKW5KW8
Parent UEI:
NSF Program(s):	Linguistics
Primary Program Source:	01001011DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):	1311, 9179, SMET
Program Element Code(s):	131100
Award Agency Code:	4900
Fund Agency Code:	4900
Assistance Listing Number(s):	47.075

ABSTRACT

This project will address fundamental questions about the nature of the link between producing and perceiving speech. Existing experimental evidence points to the automatic and involuntary activation of the speech production system during speech perception. One goal of this project is to investigate what properties of speech are important in the perception-production link, a question that has not been systematically addressed. Speech production involves spatial properties (which speech articulators make what constrictions where in the vocal tract) and temporal properties (how those constrictions are arranged relative to one another in time). A series of cue-distractor experiments will be conducted to elucidate the conditions under which perception-production interactions occur in speech. Such interactions will be detected by measuring participants' reaction times in producing a syllable in response to a visual cue while another syllable (the distractor) is being perceived. Subjects' reaction times will be compared in cases where the cued syllable and the distractor share spatial properties (for example, "pa" and "ba", which both involve a closing of the upper and lower lips) or temporal properties (for example, "ba" and "da", which both involve similar timing between the oral closure and vowel voicing). By analyzing the reaction times in such tasks, we will determine whether and how the involuntary activation resulting from the perceived distractor combines with the intended response to the visual cue. A second goal of the project is to develop a formal computational model of the perception-production link to account for and unify existing and new evidence for this link.

Action and perception are known to be closely linked in behavior. This study investigates a specific instance of the link between action and perception in speech, thereby sharpening our understanding of the mechanisms underlying human communication. The project will provide an explicit computational model of the link between speech perception and production in specific experimental paradigms involving reaction time data. Such a model has not been developed despite many years of work on the topic of production-perception links. This model will be of value to the field because it will help to refine further predictions, and thus to design new experiments for sharpening our understanding of perceptuo-motor interactions in speech. A better understanding of the relation between perception and production may shed light on the ways in which disorders of one domain may be related to disorders of the other. For example, individuals with non-fluent aphasia have difficulty with speech production but generally not with speech perception, and they have been shown to improve on certain production tasks through multi-sensory speech perception training. This research therefore has potential long-term benefits for revealing clinically relevant insights into the mechanisms of new treatment approaches. In addition, the methods used to collect and analyze response time data, along with the computational modeling component of this project, will add important new components to an existing training framework that introduces undergraduates to integrative methods in theoretical and experimental linguistics.

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

This project addressed fundamental questions about the nature of the link between producing and perceiving speech. Experimental evidence points to the automatic and involuntary activation of the speech production system during speech perception. One such type of evidence comes from response times of subjects when they are instructed to say some syllable based on a visual cue. As they are preparing to speak the required syllable, they hear a distractor. Subjects’ spoken responses start sooner when the auditory distractor is the same as their response, and later when it is different.

One goal of this project was to investigate what properties of speech are important in the perception-production link. The literature has raised this question but not addressed it systematically. Speech production involves spatial properties, i.e., which speech articulators make what constrictions where in the vocal tract, and temporal properties, i.e., how those constrictions are arranged in time. We conducted experiments to elucidate the conditions under which perception-production interactions occur in speech, which we detected by measuring response times for syllables produced while another syllable was perceived. Subjects’ response times were compared in cases where the response and distractor shared spatial properties (e.g., “pa” and “ba” both involve a closing of the upper and lower lips) or temporal properties (e.g., “ba” and “da” both involve similar timing between the oral closure and vowel voicing). The experiments showed that both the spatial and temporal properties of speech play a role in the perception-production link. In one experiment, we found that subjects responded faster when the response and distractor matched in temporal properties than when they differed in them. In another, we found qualitatively similar results for spatial properties. This project provides the first concrete experimental evidence that the same properties of speech that are fundamental in traditional linguistic description are also active in the link between speech perception and production.

Another goal of the project was to develop a formal computational model of the perception-production link to account for disparate sets of results that have been used as evidence of this link. The model we developed focuses on the process by which speech properties are set during production. The perception-production link is formalized as the properties of a perceived utterance serving obligatorily as input to that property-setting process. In our experiments, response times were affected by whether the response and distractor (mis)matched on the properties we tested, but response times in both cases were slower than when subjects heard a tone distractor. Galantucci, Fowler, and Goldstein (2009) found that when the response and distractor were identical, response times were faster than with a tone distractor. Treating the tone as a baseline, we argue that the process of property setting requires both excitation and inhibition of the activation levels associated with the speech properties. When the response and distractor are identical, activation-level excitation is maximized due to fully congruent inputs to the property-setting process. The model predicts response times to be faster in this case than with a tone distractor, as Galantucci et al. found. Each mismatching property between a response and a distractor introduces inhibition to the process, predicting increasingly slower response times based on the number of mismatching properties, as we found in our experiments. Our model provides a unified account of results from these two different studies, formalizes the perception-production link, and identifies the computational principles that are involved in the process where that link is active.
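The qualitative ordering of response times described above can be illustrated with a small sketch. This is not the model developed in the project: it is a crude additive stand-in in which each matching property subtracts a fixed amount of time (excitation) and each mismatching property adds a larger amount (inhibition). The property labels, the function name `predicted_rt`, and every millisecond value are hypothetical and were chosen only so that the ordering matches the report (identical distractor faster than tone; each mismatch slower than tone).

```python
# Toy illustration only. The property labels and all millisecond values
# below are assumptions, not figures from the project.
SYLLABLES = {
    "pa": {"spatial": "lips", "temporal": "voiceless"},
    "ba": {"spatial": "lips", "temporal": "voiced"},
    "da": {"spatial": "tongue tip", "temporal": "voiced"},
}

BASELINE_MS = 500.0    # hypothetical response time with a tone distractor
EXCITATION_MS = 10.0   # hypothetical speed-up per matching property
INHIBITION_MS = 40.0   # hypothetical slow-down per mismatching property


def predicted_rt(response, distractor=None):
    """Return a toy response time in ms; distractor=None means a tone."""
    if distractor is None:
        return BASELINE_MS
    resp = SYLLABLES[response]
    dist = SYLLABLES[distractor]
    # Count the properties on which response and distractor agree.
    matches = sum(resp[name] == dist[name] for name in resp)
    mismatches = len(resp) - matches
    return BASELINE_MS - EXCITATION_MS * matches + INHIBITION_MS * mismatches


if __name__ == "__main__":
    for response, distractor in [("ba", None), ("ba", "ba"),
                                 ("ba", "pa"), ("ba", "da"), ("pa", "da")]:
        label = distractor if distractor is not None else "tone"
        print(f"response={response} distractor={label}: "
              f"{predicted_rt(response, distractor):.0f} ms")
```

In this sketch the inhibition value is set larger than the excitation value so that a single mismatching property is enough to push the predicted time above the tone baseline, which is the pattern the report describes. Any other choice of constants with that relationship would give the same ordering.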

Action and perception are known to be closely li...

