Award Abstract # 1059218
Collaborative Research: CI-ADDO-EN: Development of Publicly Available, Easily Searchable, Linguistically Analyzed, Video Corpora for Sign Language and Gesture Research

NSF Org: CNS (Division of Computer and Network Systems)
Recipient: TRUSTEES OF BOSTON UNIVERSITY
Initial Amendment Date: July 27, 2011
Latest Amendment Date: June 14, 2016
Award Number: 1059218
Award Instrument: Standard Grant
Program Manager: Ephraim Glinert
CNS (Division of Computer and Network Systems)
CSE (Directorate for Computer and Information Science and Engineering)
Start Date: August 1, 2011
End Date: July 31, 2017 (Estimated)
Total Intended Award Amount: $368,205.00
Total Awarded Amount to Date: $368,205.00
Funds Obligated to Date: FY 2011 = $368,205.00
History of Investigator:
  • Carol Neidle (Principal Investigator)
    carol@bu.edu
  • Stan Sclaroff (Co-Principal Investigator)
Recipient Sponsored Research Office: Trustees of Boston University
1 SILBER WAY
BOSTON
MA  US  02215-1703
(617)353-4365
Sponsor Congressional District: 07
Primary Place of Performance: Trustees of Boston University
1 SILBER WAY
BOSTON
MA  US  02215-1703
Primary Place of Performance Congressional District: 07
Unique Entity Identifier (UEI): THL6A6JLE1S7
Parent UEI:
NSF Program(s): CCRI-CISE Cmnty Rsrch Infrstrc
Primary Program Source: 01001112DB NSF RESEARCH & RELATED ACTIVITIES
Program Reference Code(s): 7359
Program Element Code(s): 735900
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

The goal of this project is to create a linguistically annotated, publicly available, and easily searchable corpus of video from American Sign Language (ASL). This will constitute an important piece of infrastructure, enabling new kinds of research in both linguistics and vision-based recognition of ASL. In addition, a key goal is to make this corpus easily accessible to the broader ASL community, including users and learners of ASL. As a result of our long-term efforts, we have an extensive collection of linguistically annotated video data from native signers of ASL. However, the potential value of these corpora has been largely untapped, notwithstanding their extensive and productive use by our team and others. Existing limitations in our hardware and software infrastructure make it cumbersome to search and identify data of interest, and to share data among our institutions and with other researchers. In this project, we propose hardware and software innovations that will constitute a major qualitative upgrade in the organization, searchability, and public availability of the existing (and expanding) corpus.

The enhancement and improved Web-accessibility of these corpora will be invaluable for linguistic research, enabling new kinds of discoveries and the testing of hypotheses that would otherwise have been difficult to investigate. On the computer vision side, the proposed new annotations will provide an extensive public dataset for training and benchmarking a variety of computer vision algorithms. This will facilitate research and expedite progress in gesture recognition, hand pose estimation, human tracking, and large-vocabulary, continuous ASL recognition. Furthermore, this dataset will be useful as training and benchmarking data for algorithms in the broader areas of computer vision, machine learning, and similarity-based indexing.
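As a purely illustrative sketch (not part of the proposed annotation or recognition pipeline), the Python fragment below shows one way annotated data of this kind could be used for benchmarking: a nearest-neighbor baseline that classifies an isolated sign by dynamic time warping over hand-position trajectories. The trajectory format, the gloss labels, and the synthetic data are assumptions made only for this example.

    # Hedged sketch: a minimal dynamic-time-warping (DTW) nearest-neighbor
    # baseline for isolated sign classification over hand-position trajectories.
    # The arrays and gloss labels below are synthetic placeholders; a real
    # experiment would extract trajectories from the annotated corpus video.
    import numpy as np

    def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
        """DTW distance between two trajectories of shape (T, D)."""
        n, m = len(a), len(b)
        cost = np.full((n + 1, m + 1), np.inf)
        cost[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                d = np.linalg.norm(a[i - 1] - b[j - 1])
                cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
        return cost[n, m]

    def classify(query: np.ndarray, exemplars: list[tuple[str, np.ndarray]]) -> str:
        """Return the gloss of the exemplar nearest to the query under DTW."""
        return min(exemplars, key=lambda ex: dtw_distance(query, ex[1]))[0]

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        # Synthetic stand-ins for (gloss, trajectory) pairs drawn from a corpus.
        exemplars = [("BOOK", rng.normal(size=(40, 2))),
                     ("HOUSE", rng.normal(loc=3.0, size=(55, 2)))]
        query = rng.normal(loc=3.0, size=(48, 2))
        print(classify(query, exemplars))  # expected: HOUSE

A real benchmark would substitute trajectories extracted from the corpus video and report accuracy over a held-out test set rather than a single synthetic query.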

The advances in linguistic knowledge about ASL and in computer-based ASL recognition that will be accelerated by the availability of resources of the kind proposed here will contribute to development of technologies for education and universal access. For example, tools for searching collections of ASL video for occurrences of specific signs, or converting ASL signing to English, are still far from attaining the level of functionality and usability to which users are accustomed for spoken/written languages. Our corpora will enable research that aims to bring such vision-based ASL recognition applications closer to reality. Moreover, these resources will afford important opportunities to individuals who would not otherwise be in a position to conduct such research (e.g., for lack of access to native ASL signers or high-quality synchronized video equipment, or lack of resources/expertise to carry out extensive linguistic annotations). Making our corpora available online will also allow the broader community of ASL users to access our data directly. Students of ASL will be able to retrieve video showing examples of a specific sign used in actual sentences, or examples of a grammatical construction. ASL instructors and teachers of the Deaf will also have easy access to video examples of lexical items and grammatical constructions as used by a variety of native signers for use in language instruction and evaluation. Thus, the proposed web interface to our data collection will be a useful educational resource for users, teachers, and learners of ASL.
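As a purely illustrative sketch of the kind of retrieval described above, the Python fragment below looks up every annotated occurrence of a given sign gloss in a small, invented flat index. The column names, sample rows, and CSV layout are assumptions made for the example and do not reflect the actual corpus format or web interface.

    # Hedged sketch: find all utterance videos in which a given sign occurs,
    # using a hypothetical flat annotation index (gloss + time span per row).
    import csv
    import io
    from collections import defaultdict

    # Invented sample data standing in for a corpus-wide annotation export.
    SAMPLE = """video,gloss,start_ms,end_ms
    story_017.mp4,TEACHER,1200,1680
    story_017.mp4,BOOK,1700,2050
    story_042.mp4,TEACHER,300,760
    """

    def build_gloss_index(csv_text: str) -> dict[str, list[dict]]:
        """Map each gloss to the list of annotated occurrences that contain it."""
        index = defaultdict(list)
        for row in csv.DictReader(io.StringIO(csv_text.replace("    ", ""))):
            index[row["gloss"].upper()].append(row)
        return index

    if __name__ == "__main__":
        for occ in build_gloss_index(SAMPLE)["TEACHER"]:
            print(f'{occ["video"]}: {occ["start_ms"]}-{occ["end_ms"]} ms')

In a real deployment, the same kind of lookup would presumably run against the full annotation database behind the web interface, returning time-aligned video segments rather than rows of text.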

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Dilsizian, Mark, Polina Yanovich, Shu Wang, Carol Neidle, and Dimitris Metaxas "A New Framework for Sign Language Recognition based on 3D Handshape Identification and Linguistic Modeling" LREC 2014. Reykjavik, Iceland , 2014 , p.1924-1929 http://www.lrec-conf.org/proceedings/lrec2014/pdf/1138_Paper.pdf
Dilsizian, Mark, Zhiqiang Tang, Dimitris Metaxas, Matt Huenerfauth, and Carol Neidle "The Importance of 3D Motion Trajectories for Computer-based Sign Recognition" 7th Workshop on the Representation and Processing of Sign Languages: Corpus Mining. LREC 2016, Portorož (Slovenia), May 2016 , 2016 , p.53 http://www.lrec-conf.org/proceedings/lrec2016/workshops/LREC2016Workshop-SignLanguage_Proceedings.pdf
Gavrilov, Zoya, Stan Sclaroff, Carol Neidle, and Sven Dickinson "Detecting Reduplication in Videos of American Sign Language" Proceedings of LREC 2012. Istanbul, Turkey, May 2012 , 2012 , p.3768-3773 http://www.lrec-conf.org/proceedings/lrec2012/pdf/199_Paper.pdf
Jingjing Liu, Bo Liu, Shaoting Zhang, Fei Yang, Peng Yang, Dimitris N. Metaxas, and Carol Neidle "Non-manual grammatical marker recognition based on multi-scale, spatio-temporal analysis of head pose and facial expressions" Image and Vision Computing (Special issue: "The Best of Face and Gesture 2013," invited submission) , v.32 , 2014 , p.671-681 10.1016/j.imavis.2014.02.009
Joshi, Ajjen, Camille Monnier, Margrit Betke, and Stan Sclaroff. "Comparing random forest approaches to segmenting and classifying gestures" Image and Vision Computing , v.58 , 2016 , p.86-95 10.1016/j.imavis.2016.06.001
Joshi, Ajjen, Soumya Ghosh, Margrit Betke, Stan Sclaroff, and Hanspeter Pfister "Personalizing gesture recognition using hierarchical Bayesian neural networks" Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2017. , 2017 http://openaccess.thecvf.com/content_cvpr_2017/papers/Joshi_Personalizing_Gesture_Recognition_CVPR_2017_paper.pdf
Kacorri, Hernisa, Ali Syed, Matt Huenerfauth and Carol Neidle "Centroid-Based Exemplar Selection of ASL Non-Manual Expressions using Multidimensional Dynamic Time Warping and MPEG4 Features" 7th Workshop on the Representation and Processing of Sign Languages: Corpus Mining, LREC 2016, Portorož (Slovenia) , 2016 , p.105-110 http://www.lrec-conf.org/proceedings/lrec2016/workshops/LREC2016Workshop-SignLanguage_Proceedings.pdf
Liu, Bo, Jingjing Liu, Xiang Yu, Dimitris Metaxas and Carol Neidle "3D Face Tracking and Multi-Scale, Spatio-temporal Analysis of Linguistically Significant Facial Expressions and Head Positions in ASL" LREC 2014, Reykjavik, Iceland , 2014 http://www.lrec-conf.org/proceedings/lrec2014/pdf/370_Paper.pdf
Liu, Jingjing, Bo Liu, Shaoting Zhang, Fei Yang, Peng Yang, Dimitris N. Metaxas, and Carol Neidle "Recognizing Eyebrow and Periodic Head Gestures Using CRFs for Non-Manual Grammatical Marker Detection in ASL" Special Session on Sign Language, FG 2013: 10th IEEE International Conference on Automatic Face and Gesture Recognition. Shanghai, China , 2013 , p.1-6 10.1109/FG.2013.6553781

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The goal of this project was to create a linguistically annotated, publicly available, and easily searchable corpus of high-quality video (including multiple synchronized views) from American Sign Language (ASL). 

Intellectual merits

This constitutes an important piece of infrastructure, enabling new kinds of research in both linguistics and computer vision-based recognition of ASL. In addition, a key goal has been to make this corpus easily accessible to the broader ASL community, including users and learners of ASL. In this project, we have delivered hardware and software innovations that constitute a major qualitative upgrade in the organization, searchability, and public availability of our existing and expanding corpus.

     Specific accomplishments:

  • We purchased file servers at Boston and Rutgers Universities and established mirroring between the two sites. This enables distribution of our video data, software tools, and linguistic annotations (a purely illustrative sketch of one possible mirroring job appears after this list).
  • We released a new version of our software for linguistic annotation of video language data, SignStream® 3, in August 2017. It was released under the MIT license; our trademark was also renewed.
  • SignStream® was used at Gallaudet and Boston Universities for annotation of a substantial set of ASL data (some of which had been collected in conjunction with other NSF-funded projects: "HCC: Collaborative Research: Medium: Generating Accurate, Understandable Sign Language Animations Based on Analysis of Human Signing" and "III: Medium: Collaborative Research: Linguistically Based ASL Sign Recognition as a Structured Multivariate Learning Problem").
  • We designed, developed, and released our Data Access Interface (DAI) as well as an updated version thereof to deal with SignStream® 3 files (DAI 2).   This allows for easy browsing, searching, and downloading of our linguistically annotated ASL video corpora.
  • The original DAI provides access to (1) our National Center for Sign Language and Gesture Resources (NCSLGR) continuous signing corpus, and (2) our American Sign Language Lexicon Video Dataset (ASLLVD) of citation-form signs (made possible by prior NSF funding, grant # 0705749).  DAI 2 provides access to a subset of our new ASLLRP SignStream® 3 corpus, still under development; the remainder will be added as soon as verifications are complete.
  • DAI 2 includes a new ASLLRP Sign Bank, built initially off of the ASLLVD.  The Sign Bank will be expanded as new data sets are uploaded to DAI 2.
  • We have used these resources to advance linguistic and computer science research, thereby leading to a better understanding of how language works in the visual-gestural modality and to new approaches to sign language recognition from video by computer.  The results of this research have been shared through publications and conference presentations. 
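Purely as an illustration of the mirroring mentioned in the first bullet above, the Python fragment below drives rsync over SSH to push a corpus tree to a remote mirror. The host name, paths, and the choice of rsync itself are assumptions made for the example, not the project's actual configuration.

    # Hedged sketch: keep two corpus file servers in sync by invoking rsync
    # over SSH from a scheduled job. Paths and host are placeholders.
    import subprocess

    SOURCE = "/data/asllrp/"                                 # local corpus tree (placeholder)
    MIRROR = "mirror@corpus.example.edu:/data/asllrp/"       # remote mirror (placeholder)

    def mirror_corpus(dry_run: bool = True) -> None:
        """Push new and changed files to the mirror; delete files removed at the source."""
        cmd = ["rsync", "-az", "--delete"]
        if dry_run:
            cmd.append("--dry-run")   # preview the transfer without applying it
        cmd += [SOURCE, MIRROR]
        subprocess.run(cmd, check=True)

    if __name__ == "__main__":
        mirror_corpus(dry_run=True)

In practice a job like this would typically run on a schedule, with the --dry-run flag removed once the file selection has been verified.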

Broader impacts

These software tools and corpora are invaluable resources for linguistic research; we expect they will enable new kinds of discoveries and the testing of hypotheses that would otherwise have been difficult to investigate. For computer vision, the linguistically annotated video corpora offer an extensive public dataset for training and benchmarking a variety of computer vision algorithms. This will facilitate research and expedite progress in gesture recognition, hand pose estimation, human tracking, and large-vocabulary, continuous ASL recognition. Furthermore, these datasets will be useful as training and benchmarking data for algorithms in the broader areas of computer vision, machine learning, and similarity-based indexing.

Already our data sets are being used by researchers, educators, and students around the world for linguistic and computer science research and doctoral training.  Our work has also influenced the development of corpora for other signed languages (e.g., Arabic Sign Language, Italian Sign Language, and Yiddish Sign Language).

These materials will also be invaluable for those teaching/studying ASL and Deaf culture, ASL literature, and interpreting. Making our corpora available online will also allow the broader community of ASL users to access our data directly, and to analyze new data with the software tools we are sharing. Students of ASL will be able to retrieve video examples of a specific sign used in actual sentences, or examples of a grammatical construction. ASL instructors and teachers of the Deaf will also have easy access to video examples of lexical items and grammatical constructions as used by a variety of native signers for use in language instruction and evaluation. Thus, the new web interfaces to our data collections will be a useful educational resource for users, teachers, and learners of ASL. Moreover, the system will have educational benefits for deaf students, in helping them to learn English vocabulary and to connect ASL signs to English words.

The research made possible by these resources holds great promise for leading to technologies that will benefit the deaf community. These include tools for language learning, mobile sign language dictionaries and retrieval, and tools for searching for signs by example. Ultimately, this resource also is likely to contribute to systems for automated machine translation and human-computer interaction.

 


Last Modified: 10/29/2017
Modified by: Carol J Neidle

