
NSF Org: | BCS Division of Behavioral and Cognitive Sciences |
Recipient: |
|
Initial Amendment Date: | August 26, 2019 |
Latest Amendment Date: | December 27, 2023 |
Award Number: | 1911603 |
Award Instrument: | Standard Grant |
Program Manager: | Rachel M. Theodore, rtheodor@nsf.gov, (703)292-4770, BCS Division of Behavioral and Cognitive Sciences, SBE Directorate for Social, Behavioral and Economic Sciences |
Start Date: | August 15, 2019 |
End Date: | January 31, 2025 (Estimated) |
Total Intended Award Amount: | $197,424.00 |
Total Awarded Amount to Date: | $197,424.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: |
201 PRESIDENTS CIR SALT LAKE CITY UT US 84112-9049 (801)581-6903 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
260 S Central Campus, GC4525 Salt Lake City UT US 84112-9199 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | DEL |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.075 |
ABSTRACT
The Native American Languages Act, passed by the U.S. Congress in 1990, recognizes the unique status and value of Native American languages. Shoshoni [ISO 639-3 shh] is the northernmost member of the Uto-Aztecan language family, whose languages are spoken from Wyoming to Central America. The Shoshoni language continues to be an important component of Goshute and Shoshone tribal identity. In the 1960s and 1970s, the late Wick R. Miller of the University of Utah tape-recorded speakers of several different varieties of Shoshoni (born between about 1875 and 1920). These recordings represent the most extensive documentary corpus of any Great Basin language and are of vital cultural, historical, and linguistic importance to several tribal communities in the Western states. Past linguistic studies of Shoshoni have largely focused on the structure of words and on the internal structure of sentences in isolation; this project will instead focus on the language's sound system and discourse-level structure. Broader impacts include making the two corpora developed by the project available as free online resources through the Marriott Library (University of Utah) and the California Language Archive (UC Berkeley). The project will also give undergraduates from Shoshoni-speaking tribal communities valuable experience on a computational linguistic research project and will enhance interactions between these young people and the two native-speaker elders collaborating on the project. The team will also produce a print version and an easy-to-read electronic version of a subset of the traditional stories from the Wick R. Miller Collection and disseminate them to the three communities collaborating on the project: the South Fork Band Council of the Te-Moak Tribe, the Confederated Tribes of the Goshute Reservation, and the Ely Shoshone Tribe.
While Shoshoni is fairly well documented for a Native American language, its discourse structure and its phonetics and phonology are relatively understudied. The project will address these gaps by developing two corpora. First, the 36 stories will be marked up to produce an electronically searchable database valuable for sentence-level as well as discourse-level linguistic studies. Second, the project will build a corpus valuable for phonological and phonetic research, consisting of audio-TextGrid pairs of word-sized and sentence-sized recordings that will be force-aligned and fine-tuned. In the resulting corpus, the phonemes representing each vowel and consonant will be aligned with the corresponding part of the sound file, allowing researchers to automate the acoustic phonetic analysis of each sound. Such text-to-audio aligned corpora already exist for majority languages such as English, German, Japanese, and Spanish, which makes their sound systems relatively easy to study and has led to the development of electronic products that can quickly process spoken language. These majority-language corpora are prepared using costly, language-specific computational tools called forced aligners. Our project will train the Montreal Forced Aligner to align the text of 4,000 to 5,000 Shoshoni words and short sentences to sound. Doing so will provide a model of how to inexpensively use a generic forced aligner to align text-to-audio data for any small, understudied language. The resulting forced-aligned Shoshoni corpus will greatly speed up the acoustic analysis of this phonologically complex language and enable many relatively inexpensive but in-depth, scientifically sound research studies.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.