Award Abstract # 2140826
Representing and learning stress: Grammatical constraints and neural networks

NSF Org: BCS
Division of Behavioral and Cognitive Sciences
Recipient: UNIVERSITY OF MASSACHUSETTS
Initial Amendment Date: April 6, 2022
Latest Amendment Date: April 6, 2022
Award Number: 2140826
Award Instrument: Standard Grant
Program Manager: Rachel M. Theodore
rtheodor@nsf.gov
(703)292-4770
SBE Directorate for Social, Behavioral and Economic Sciences
Start Date: April 15, 2022
End Date: September 30, 2026 (Estimated)
Total Intended Award Amount: $386,226.00
Total Awarded Amount to Date: $386,226.00
Funds Obligated to Date: FY 2022 = $386,226.00
History of Investigator:
  • Joseph Pater (Principal Investigator)
    pater@linguist.umass.edu
  • Gaja Jarosz (Co-Principal Investigator)
Recipient Sponsored Research Office: University of Massachusetts Amherst
101 COMMONWEALTH AVE
AMHERST
MA  US  01003-9252
(413)545-0698
Sponsor Congressional District: 02
Primary Place of Performance: University of Massachusetts Amherst
100 Venture Way Suite 201
Hadley
MA  US  01035-9450
Primary Place of Performance Congressional District: 02
Unique Entity Identifier (UEI): VGJHK59NMPK9
Parent UEI: VGJHK59NMPK9
NSF Program(s): Linguistics, Human Networks & Data Sci Res
Primary Program Source: 01002223DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 9178, SMET, 9179, 1311, 104Z, 9251
Program Element Code(s): 131100, 147Y00
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.075

ABSTRACT

Languages are systems of remarkable complexity. Linguists and computer scientists have devoted considerable effort to developing methods for representing these systems, as well as computational methods for learning the system of a given language. This effort is driven by the desire to better understand human cognition and to build better language technologies. This project draws on the theories and methods of both linguistics and computer science to study the learning of word stress, the pattern of relative prominence of the syllables in a word. The stress systems of the world's languages are relatively well described, and there are competing linguistic theories of how they are represented. This project applies learning methods from computer science to find new evidence that distinguishes among these competing theories. It also examines neural networks, systems of language representation that were developed in computer science and have received relatively little attention from linguists. The research will engage undergraduate and graduate linguistics students at a public university. Linguistics has a much higher proportion of female students than computer science, and this project aims to address gender imbalance in STEM.

From a linguistic perspective, learning stress involves learning hidden structure: parts of the representation that are not present in the observed data and that the learner must infer. A given pattern of prominence over syllables is often consistent with multiple prosodic representations. The approach to hidden structure learning used in this project applies the general technique of Expectation Maximization, which in pilot work achieved good results on a standard test set. Intriguingly, many of the test-set languages that this learner failed to learn are in fact cross-linguistically unattested. This project expands the set of tested languages to cover more of the range of systems found cross-linguistically, and further explores the possibility that typological gaps have learning explanations. It evaluates hypotheses about the constraints responsible for stress placement by comparing how well they support the learning of attested systems and whether they can help explain typological gaps. Pilot work also found indications that a neural network could learn generalizable representations of the data; the project is further testing this method. All of the software developed in this project is being made freely available, as is a database of the stress systems of the world's languages.
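
To make the hidden structure problem concrete, the following minimal Python sketch enumerates the foot structures compatible with one observed stress pattern. It is an illustration only, not the project's software: the function name `foot_parses` and the foot inventory (unfooted syllables, unary feet, trochees, and iambs) are assumptions chosen for the example, and the project's actual representations and constraints may differ.

```python
def foot_parses(stresses):
    """Return every footing of a word that yields the given overt stress pattern.

    stresses: sequence of 0/1 flags, one per syllable (1 = stressed).
    Assumed, illustrative inventory: an unstressed syllable is either left
    unfooted or is the weak member of a two-syllable foot; a stressed
    syllable heads a unary foot (S), a trochee (S s), or an iamb (s S).
    """
    n = len(stresses)

    def parse(i):
        # All ways to foot syllables i..n-1, each as a list of chunks.
        if i == n:
            return [[]]
        results = []
        nxt = stresses[i + 1] if i + 1 < n else None
        if stresses[i]:
            # Stressed syllable: unary foot, or trochee with a following unstressed syllable.
            results += [["(S)"] + rest for rest in parse(i + 1)]
            if nxt == 0:
                results += [["(S s)"] + rest for rest in parse(i + 2)]
        else:
            # Unstressed syllable: unfooted, or weak member of an iamb.
            results += [["s"] + rest for rest in parse(i + 1)]
            if nxt == 1:
                results += [["(s S)"] + rest for rest in parse(i + 2)]
        return results

    return [" ".join(chunks) for chunks in parse(0)]


# A three-syllable word with stress on the middle syllable.
for candidate in foot_parses([0, 1, 0]):
    print(candidate)
```

Under these assumptions, the single observed pattern "unstressed, stressed, unstressed" has three candidate footings: `s (S) s`, `s (S s)`, and `(s S) s`. The observed data alone do not say which one is correct, so the learner has to infer it.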

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Lee, Seung Suk, Alessa Farinella, Cerys Hughes, and Joe Pater. "Learning Stress with Feet and Grids." Proceedings of the 2022 Annual Meeting on Phonology, 2023.