NSF Award Search: Award # 1244713

Award Abstract # 1244713

Collaborative Research: Automatically Annotated Repository of Digital Video and Audio Resources Community (AARDVARC)

NSF Org:	BCS Division of Behavioral and Cognitive Sciences
Recipient:	EASTERN MICHIGAN UNIVERSITY
Initial Amendment Date:	September 23, 2012
Latest Amendment Date:	April 8, 2013
Award Number:	1244713
Award Instrument:	Standard Grant
Program Manager:	William Badecker; BCS Division of Behavioral and Cognitive Sciences; SBE Directorate for Social, Behavioral and Economic Sciences
Start Date:	September 15, 2012
End Date:	February 28, 2015 (Estimated)
Total Intended Award Amount:	$84,982.00
Total Awarded Amount to Date:	$84,982.00
Funds Obligated to Date:	FY 2012 = $47,734.00
History of Investigator:	Damir Cavar (Principal Investigator), dcavar@indiana.edu; Helen Aristar-Dry (Former Principal Investigator); Anthony Aristar (Former Co-Principal Investigator); Damir Cavar (Former Co-Principal Investigator)
Recipient Sponsored Research Office:	Eastern Michigan University 203 PIERCE HALL YPSILANTI MI US 48197-2264 (734)487-3090
Sponsor Congressional District:	06
Primary Place of Performance:	Eastern Michigan University Institute for Language Info & Te Ypsilanti MI US 48197-2250
Primary Place of Performance Congressional District:	06
Unique Entity Identifier (UEI):	STFNT4KCCDU3
Parent UEI:
NSF Program(s):	Data Infrastructure
Primary Program Source:	01001213DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):	7433
Program Element Code(s):	806800
Award Agency Code:	4900
Fund Agency Code:	4900
Assistance Listing Number(s):	47.075

ABSTRACT

Audio and video data from understudied languages is useful to linguists, anthropologists, educators, and computer scientists interested in visual action extraction, speech technology, or software localization. Terabytes of such data exist, collected by documentary linguists since handheld devices made digital recording easy. As records of vanishing languages and cultures, video and audio recordings are far richer and more captivating than paper records, but they must be indexed and transcribed to reach their full potential as research tools. The current project, AARDVARC (Automatically Annotated Repository of Digital Video and Audio Resources Community), will address the problem of untranscribed, and therefore unavailable, documentation of understudied languages by building an interdisciplinary community of linguists, anthropologists, and computer scientists to share knowledge and collaborate on the specification of a repository and a suite of tools to facilitate transcription. It will provide for two workshops and a symposium to design a "take one, leave one" repository and to explore recent advances in speech and video processing that will allow anthropologists and linguists to break the "transcription bottleneck" for language data. Even partial automation will greatly facilitate the work of the analyst and dramatically increase the amount of transcribed audio and video available to researchers in multiple disciplines.

