Award Abstract # 1244713
Collaborative Research: Automatically Annotated Repository of Digital Video and Audio Resources Community (AARDVARC)

NSF Org: BCS
Division of Behavioral and Cognitive Sciences
Recipient: EASTERN MICHIGAN UNIVERSITY
Initial Amendment Date: September 23, 2012
Latest Amendment Date: April 8, 2013
Award Number: 1244713
Award Instrument: Standard Grant
Program Manager: William Badecker
BCS
 Division of Behavioral and Cognitive Sciences
SBE
 Directorate for Social, Behavioral and Economic Sciences
Start Date: September 15, 2012
End Date: February 28, 2015 (Estimated)
Total Intended Award Amount: $84,982.00
Total Awarded Amount to Date: $84,982.00
Funds Obligated to Date: FY 2012 = $47,734.00
History of Investigator:
  • Damir Cavar (Principal Investigator)
    dcavar@indiana.edu
  • Helen Aristar-Dry (Former Principal Investigator)
  • Anthony Aristar (Former Co-Principal Investigator)
  • Damir Cavar (Former Co-Principal Investigator)
Recipient Sponsored Research Office: Eastern Michigan University
203 PIERCE HALL
YPSILANTI
MI  US  48197-2264
(734)487-3090
Sponsor Congressional District: 06
Primary Place of Performance: Eastern Michigan University
Institute for Language Info & Te
Ypsilanti
MI  US  48197-2250
Primary Place of Performance Congressional District: 06
Unique Entity Identifier (UEI): STFNT4KCCDU3
Parent UEI:
NSF Program(s): Data Infrastructure
Primary Program Source: 01001213DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7433
Program Element Code(s): 806800
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.075

ABSTRACT

Audio and video data from understudied languages are useful to linguists, anthropologists, educators, and computer scientists interested in visual action extraction, speech technology, or software localization. Terabytes of such data exist, collected by documentary linguists since handheld devices made digital recording easy. As records of vanishing languages and cultures, video and audio recordings are far richer and more captivating than paper records, but they must be indexed and transcribed to reach their full potential as research tools. The current project, AARDVARC (Automatically Annotated Repository of Digital Audio and Video Resources Community), will address the problem of untranscribed, and therefore unavailable, documentation of understudied languages by building an interdisciplinary community of linguists, anthropologists, and computer scientists to share knowledge and collaborate on the specification of a repository and a suite of tools to facilitate transcription. It will fund two workshops and a symposium to design a "take one, leave one" repository and to explore recent advances in speech and video processing that will allow anthropologists and linguists to break the "transcription bottleneck" for language data. Even partial automation will greatly ease the analyst's work and dramatically increase the amount of transcribed audio and video available to researchers in multiple disciplines.