
NSF Org: DRL Division of Research on Learning in Formal and Informal Settings (DRL)
Recipient:
Initial Amendment Date: August 4, 2022
Latest Amendment Date: August 4, 2022
Award Number: 2219843
Award Instrument: Standard Grant
Program Manager: Gregg Solomon, gesolomo@nsf.gov, (703) 292-8333, DRL Division of Research on Learning in Formal and Informal Settings (DRL), EDU Directorate for STEM Education
Start Date: August 15, 2022
End Date: July 31, 2026 (Estimated)
Total Intended Award Amount: $1,000,000.00
Total Awarded Amount to Date: $1,000,000.00
Funds Obligated to Date:
History of Investigator:
Recipient Sponsored Research Office: 633 CLARK ST, EVANSTON, IL, US 60208-0001, (312) 503-7955
Sponsor Congressional District:
Primary Place of Performance: 2016 Sheridan Rd., Evanston, IL, US 60208-4090
Primary Place of Performance Congressional District:
Unique Entity Identifier (UEI):
Parent UEI:
NSF Program(s): IntgStrat Undst Neurl&Cogn Sys, ECR-EDU Core Research
Primary Program Source: 04002223DB NSF Education & Human Resource
Program Reference Code(s):
Program Element Code(s):
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.075, 47.076
ABSTRACT
You can guess a lot about a person from the way they pronounce words. Remarkably, human listeners can tell if it is likely that talkers learned English as a first language or a second language, or if the talkers might have a brain injury that makes it difficult for them to speak. Such intuitions rely on human listeners' holistic pattern recognition abilities; these allow us to perceive the important, meaningful, yet subtle differences between pronunciations. However, the methods scientists currently use to measure speech objectively (based on a small number of properties of speech sounds) fail to capture these differences, hampering our ability to use speech to learn about the mind and brain. This project brings together speech scientists, computer scientists, and neuroscientists to test a radically different approach to this problem. Machine learning will be used to discover a new method for quantifying differences between spoken utterances based on holistic pattern recognition. This will be tested against new and existing data from bilingual speakers. If successful, this will yield a fully general method that can be applied to speech from any language or any domain of language usage, allowing scientists to capitalize on the wealth of information in speech to develop powerful new insights into the mind and brain. Improved detection of subtle problems with pronunciation, such as occurs with Alzheimer's disease, will advance our understanding of the brain mechanisms that humans use to produce speech. The results of this testing will also allow computer scientists to advance our understanding of how machine learning algorithms process sounds, driving improvements in the algorithms and supporting applications in any area of speech and language technology that relies on spoken language processing.
Speech variability across talkers provides a treasure trove of information for cognitive neuroscientists, leading to important insights into the cognitive mechanisms underlying language processing and potentially providing early signs of brain dysfunction. Current studies of speech are hamstrung by analyses that require preselecting specific temporal scales and acoustic dimensions. We propose a radically different approach: using unsupervised deep learning to discover a representational space for analysis of acoustic variation. To test this highly general approach, this method will be compared to current state-of-the-art methods for analyzing individual variation in bilingual speech. This includes using the acoustic variation in second language speech to predict intelligibility and to detect difficulties in code-switching, particularly the challenges faced by individuals with Alzheimer's disease. The results will inform the development of both deep learning and cognitive neuroscience. The machine learning algorithm is fully general; it can be applied to speech from any language or any domain of language usage, expanding the range of populations and contexts that can be served by speech technology or studied by cognitive neuroscientists. The project's integrative approach will allow computer scientists to advance our understanding of the extent to which modern deep learning architectures do or do not approximate human speech processing and allow cognitive neuroscientists to further our understanding of how meaningful acoustic distinctions are represented in speech perception and production.
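To make the core idea concrete, the sketch below illustrates what "discovering a representational space for acoustic variation" means in the simplest possible terms. The project proposes unsupervised deep learning; here, purely for illustration, a linear unsupervised method (PCA via SVD) is applied to synthetic spectrogram-like features. All data, dimensions, and variable names are hypothetical and are not taken from the project's actual methods.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: each row stands in for one utterance, described by
# flattened spectrogram-like features. Real speech data would replace this.
n_utterances, n_features = 200, 64
latent = rng.normal(size=(n_utterances, 3))      # hidden low-dim structure
mixing = rng.normal(size=(3, n_features))        # maps latent -> features
X = latent @ mixing + 0.1 * rng.normal(size=(n_utterances, n_features))

# Unsupervised discovery of a representational space: center the data and
# project onto the top principal components (a linear stand-in for the
# deep-learning embedding the project proposes).
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 3
embeddings = Xc @ Vt[:k].T    # each utterance -> point in a k-dim space

# Distances in the learned space quantify holistic differences between
# utterances without preselecting specific acoustic dimensions.
d = np.linalg.norm(embeddings[0] - embeddings[1])
explained = (S[:k] ** 2).sum() / (S ** 2).sum()
print(f"explained variance by {k} components: {explained:.2f}")
```

In this toy example the learned space recovers nearly all the variance because the synthetic data are genuinely low-dimensional; the proposed deep networks target the analogous nonlinear structure in real speech.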
This project is funded by the Integrative Strategies for Understanding Neural and Cognitive Systems (NCS) program, which is jointly supported by the Directorates for Computer and Information Science and Engineering (CISE), Education and Human Resources (EHR), Engineering (ENG), and Social, Behavioral, and Economic Sciences (SBE).
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.