
NSF Org: | IIS Division of Information & Intelligent Systems |
Initial Amendment Date: | September 10, 2003 |
Latest Amendment Date: | April 4, 2005 |
Award Number: | 0329009 |
Award Instrument: | Continuing Grant |
Program Manager: |
Ephraim Glinert
IIS Division of Information & Intelligent Systems CSE Directorate for Computer and Information Science and Engineering |
Start Date: | September 15, 2003 |
End Date: | August 31, 2007 (Estimated) |
Total Intended Award Amount: | $749,999.00 |
Total Awarded Amount to Date: | $755,999.00 |
Funds Obligated to Date: | FY 2004 = $250,000.00; FY 2005 = $255,999.00 |
Recipient Sponsored Research Office: | 1 SILBER WAY, BOSTON, MA 02215-1703, US, (617) 353-4365 |
Primary Place of Performance: | 1 SILBER WAY, BOSTON, MA 02215-1703, US |
NSF Program(s): | ITR SMALL GRANTS, HUMAN COMPUTER INTER PROGRAM, UNIVERSAL ACCESS, HUMAN LANGUAGE & COMMUNICATION |
Primary Program Source: | app-0104, app-0105 |
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
Research on recognition and generation of signed languages and the gestural component of spoken languages has been hindered by the unavailability of large-scale linguistically annotated corpora of the kind that led to significant advances in the area of spoken language. In addition, the complexity of simultaneous expression of linguistic information on the hands, the face, and the upper body creates special challenges for both linguistic analysis and computer-based recognition. A major goal of this project is the development of pattern analysis algorithms for discovery of the co-occurrence, overlap, relative timing, frequency, and magnitude of linguistically significant movements of the hands, face, and upper body. These will be tested against the PI's recently developed corpus collected from native signers of American Sign Language (ASL).
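As a rough illustration of the kind of co-occurrence and relative-timing statistics at issue, the sketch below compares annotated events on two gestural channels and reports their temporal overlap and onset offsets. The interval format and channel labels are hypothetical and do not reflect the project's actual implementation or SignStream's data model.

```python
# Hypothetical sketch: co-occurrence and relative timing of annotated events
# on two gestural channels (e.g., manual glosses vs. non-manual markers).
# The (start_ms, end_ms, label) interval format is assumed for illustration.

def overlap_ms(a, b):
    """Length of temporal overlap (ms) between two (start, end, label) intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

def co_occurrences(channel_a, channel_b, min_overlap=1):
    """Yield pairs of events that overlap in time, with overlap and onset offset."""
    for a in channel_a:
        for b in channel_b:
            ov = overlap_ms(a, b)
            if ov >= min_overlap:
                yield {
                    "a": a[2],
                    "b": b[2],
                    "overlap_ms": ov,
                    "onset_offset_ms": b[0] - a[0],  # positive: b starts after a
                }

if __name__ == "__main__":
    manual = [(0, 400, "IX-3p"), (450, 900, "ARRIVE")]             # illustrative gloss tier
    nonmanual = [(380, 920, "eyebrow raise"), (0, 350, "head nod")]  # illustrative non-manual tier
    for record in co_occurrences(manual, nonmanual):
        print(record)
```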
The high-quality video data consist of multiple synchronized movie files, showing the signing from multiple angles (including a close-up of the face). Annotations were produced using SignStream (an application developed by the PIs), which enables transcription of parallel streams of information (e.g., movements of the hands, eyes, eyebrows, etc., that convey critical grammatical information in signed languages). The video data and annotations provide a basis for analyzing gestures occurring in the multiple manual and non-manual channels. The goal is to recognize temporal associations both within and across channels. Time-series analysis algorithms will be developed for comparing the degree of similarity of gestural components.
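One standard technique for this kind of comparison is dynamic time warping (DTW). The minimal sketch below measures similarity between two sampled measurement streams using made-up head-orientation samples; it is offered only as an illustration and does not represent the algorithms the project will ultimately adopt.

```python
# A minimal dynamic time warping (DTW) sketch for comparing two sampled
# measurement streams (e.g., head-orientation angles from two utterances).
# DTW is one common distance for such comparisons; other measures are possible.

def dtw_distance(x, y):
    """DTW distance between two 1-D sequences of floats."""
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]

if __name__ == "__main__":
    head_pitch_a = [0.0, 2.1, 4.0, 3.8, 1.0, 0.2]        # illustrative samples
    head_pitch_b = [0.1, 1.9, 3.9, 4.1, 3.7, 1.2, 0.0]
    print("DTW distance:", round(dtw_distance(head_pitch_a, head_pitch_b), 3))
```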
Clustering and indexing algorithms will be developed for identification of groups of similar gestural components from mixed discrete annotation event labels (e.g., gloss, eyebrow raise, hand-shape, etc.) and sampled measurement data (e.g., head orientation, direction of eye gaze, hand motion). Since several gestural channels include periodic motions of varying frequency and magnitude (e.g., head nods and shakes), periodicity analysis modules will also be developed. Linguists and computer scientists will collaborate to determine how best to exploit and combine the information available from these different sources. Moreover, the information emerging from the computer science research will be of enormous benefit for the ongoing linguistic research being conducted by the PI's research team on the syntax of ASL. Syntactic research on signed languages has been hindered by the daunting task of attempting to uncover these patterns solely through observation, without the aid of tools for analyzing large amounts of data.
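As an illustration of what a periodicity analysis module might compute, the following sketch estimates the dominant frequency of a sampled head-orientation channel with a discrete Fourier transform. The sampling rate and signal are synthetic, and the project's actual modules may differ.

```python
# Hedged sketch: estimate the dominant frequency of a periodic gestural channel
# (e.g., a head shake) from sampled measurements via a discrete Fourier transform.
import numpy as np

def dominant_frequency(signal, sample_rate_hz):
    """Return the frequency (Hz) with the largest spectral magnitude, ignoring DC."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()                 # remove DC offset
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate_hz)
    return freqs[np.argmax(spectrum[1:]) + 1]       # skip the zero-frequency bin

if __name__ == "__main__":
    rate = 30.0                                     # e.g., 30 fps video
    t = np.arange(0, 2.0, 1.0 / rate)
    head_yaw = np.sin(2 * np.pi * 3.0 * t)          # synthetic 3 Hz head shake
    print("Dominant frequency (Hz):", dominant_frequency(head_yaw, rate))
```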
To support time-series pattern analysis in SignStream, computer vision algorithms will be developed for analysis of video to extract measurements of the head, face, eyes/eyebrows, arms, and upper body, as well as the hands. These algorithms will model and exploit joint statistics; i.e., they will explicitly model correlations and associations across gestural channels. The vision algorithms will also make use of information available from the existing annotations to acquire models via supervised learning. These algorithms will allow a feedback loop in which the dynamical models estimated during data analysis/clustering can be used to "tune" the tracking modules to specific gestures.
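The sketch below illustrates, under simplifying assumptions, one way joint statistics across channels might be summarized: a correlation matrix over synchronized measurement streams, which a tracker could consult as a prior when evidence in one channel is weak. The channel names and data are hypothetical and stand in for the project's actual vision pipeline.

```python
# Illustrative sketch (assumed channel names and synthetic data): summarize joint
# statistics across gestural channels as a correlation matrix over synchronized
# measurement streams. Such statistics could inform priors for per-channel trackers.
import numpy as np

def cross_channel_correlation(channels):
    """channels: dict mapping name -> equal-length 1-D measurement array.
    Returns (sorted channel names, correlation matrix)."""
    names = sorted(channels)
    data = np.vstack([np.asarray(channels[n], dtype=float) for n in names])
    return names, np.corrcoef(data)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.linspace(0, 2 * np.pi, 200)
    head_yaw = np.sin(t) + 0.1 * rng.standard_normal(t.size)
    eye_gaze_x = 0.8 * np.sin(t) + 0.2 * rng.standard_normal(t.size)   # tracks head
    hand_height = np.cos(3 * t) + 0.1 * rng.standard_normal(t.size)    # independent
    names, corr = cross_channel_correlation(
        {"head_yaw": head_yaw, "eye_gaze_x": eye_gaze_x, "hand_height": hand_height})
    for name, row in zip(names, corr):
        print(name, np.round(row, 2))
```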
This approach is also expected to prove useful in analyzing gestural patterns in other HCI applications. As part of the effort, the SignStream system will be employed in user studies of vision-based interfaces for the severely handicapped, in collaboration with colleagues at Boston College. Video will be captured via multiple cameras and partially annotated by the HCI researcher and the clinicians, supplemented with performance data (speed, accuracy, fatigue) and the output of the vision analysis algorithms.
Broader Impacts: Algorithms and software developed in this effort will be made available to the research community via FTP, as extensions to SignStream. Thus, these tools will be available to the established, diverse group of researchers who already use SignStream in linguistics and human-computer interface research. The gestural pattern analysis tools developed in this project should accelerate linguistic research on the critical role of non-manual channels in signed languages and gestural communication. Moreover, a better understanding of the combined role that non-manual and manual channels play should lead to improved accuracy in computer-based sign language recognition systems, as well as in speech recognition systems that model gestural components observed in video. Conversely, systems for synthesis of signed languages and gestural communication would be able to model and generate these non-manual movements over the appropriate linguistic domains, in order to achieve better realism. Finally, it is hoped that the pattern analysis algorithms will be useful in the study of gestural interfaces more generally. Insights gained via such analysis tools should lead to improved video-based interface systems (e.g., for the severely handicapped) that allow greater comfort, accuracy, and ease of use.