Award Abstract # 1734938
NCS-FO: Neuroimaging to Advance Computer Vision, NLP, and AI

NSF Org: IIS, Division of Information & Intelligent Systems
Recipient: PURDUE UNIVERSITY
Initial Amendment Date: August 7, 2017
Latest Amendment Date: June 30, 2022
Award Number: 1734938
Award Instrument: Standard Grant
Program Manager: Kenneth Whang
  kwhang@nsf.gov
  (703) 292-5149
  IIS, Division of Information & Intelligent Systems
  CSE, Directorate for Computer and Information Science and Engineering
Start Date: August 15, 2017
End Date: September 30, 2023 (Estimated)
Total Intended Award Amount: $1,000,000.00
Total Awarded Amount to Date: $1,000,000.00
Funds Obligated to Date: FY 2017 = $1,000,000.00
History of Investigator:
  • Jeffrey Siskind (Principal Investigator)
  • Ronnie Wilbur (Co-Principal Investigator)
  • Evguenia Malaia (Co-Principal Investigator)
Recipient Sponsored Research Office: Purdue University
2550 NORTHWESTERN AVE # 1100
WEST LAFAYETTE
IN  US  47906-1332
(765)494-1055
Sponsor Congressional District: 04
Primary Place of Performance: Purdue University
465 Northwestern Ave.
West Lafayette
IN  US  47907-2035
Primary Place of Performance Congressional District: 04
Unique Entity Identifier (UEI): YRXVL4JYCEF5
Parent UEI: YRXVL4JYCEF5
NSF Program(s): GVF - Global Venture Fund, ECR-EDU Core Research, IntgStrat Undst Neurl&Cogn Sys
Primary Program Source: 01001718DB NSF RESEARCH & RELATED ACTIVITIES
04001718DB NSF Education & Human Resources
Program Reference Code(s): 5946, 5980, 6869, 8089, 8091, 8551, 8817
Program Element Code(s): 054Y00, 798000, 862400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

It is often said that a picture is worth a thousand words. Yet to search for images, or for objects within those images, one usually needs words. Obtaining accurate labels to support efficient search is a longstanding goal of computer vision, but progress has been slow. This project employs new methods to significantly change how picture-word labeling is accomplished by taking advantage of the best picture recognizer available: the human brain. Brain activity of humans viewing pictures and videos is recorded through functional magnetic resonance imaging and electroencephalography and then used to improve performance on artificial intelligence (AI) tasks involving computer vision and natural language processing. Current systems use machine learning to train computers to recognize objects (nouns) and activities (verbs) in images and video, which are then used to describe events; the resulting descriptions can in turn support reasoning tasks (e.g., solving math problems). These systems are trained on specially prepared datasets containing samples of nouns for objects, verbs for activities, sentences describing events, and exam questions and answers. In a novel paradigm, humans perform the same tasks while their brains are scanned, allowing determination of the neural patterns associated with those tasks. Those brain activity patterns, in turn, are used to train better computer systems.

The central hypothesis is that understanding how humans process grounded language involving predication, and how they use it during reasoning, will materially improve engineered computer vision, natural language processing, and AI systems that perform image/video captioning, visual question answering, and problem solving. The scientific and engineering goals are to develop models of human language grounding and reasoning that are consistent with neuroimaging evidence, and to use those models to improve engineered systems that integrate language and vision in support of automated reasoning. The main scientific question is to understand the mechanisms by which predicates and arguments are identified, linked, and used for reasoning by the human brain. The hypothesis that predicate-argument linking is accomplished similarly in visual and linguistic representations, and that this linking then supports reasoning and problem solving, will be tested using multiple neuroimaging modalities and machine learning algorithms that decode "who did what to whom" from brain scans of subjects processing linguistic and visual stimuli. The iterative approach involves understanding information integration at the neural level and using that understanding to improve machine learning performance on AI tasks, training computers to perform increasingly complex tasks with neuroimaging data collected from stimuli derived from large-scale natural tasks. Using identical datasets for human and machine performance will support translation of scientific advances into engineering practice that integrates computer vision and natural language processing.
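
The following minimal sketch illustrates, at a very high level, the kind of decoding analysis described above: a linear classifier trained to recover one element of "who did what to whom" (here, which of two entities acted as the agent) from per-trial neuroimaging feature vectors. Everything in it is a synthetic placeholder; the data shapes, labels, and classifier choice are assumptions for illustration, not the project's actual pipeline.

    # Hedged sketch: decode the agent role from flattened per-trial
    # neuroimaging features (e.g., EEG channels x time windows).
    # All data below are synthetic; chance accuracy is 0.5.
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)
    n_trials, n_features = 200, 64
    X = rng.normal(size=(n_trials, n_features))  # placeholder brain responses
    y = rng.integers(0, 2, size=n_trials)        # which entity was the agent (0 or 1)

    clf = make_pipeline(StandardScaler(), LinearSVC(dual=False))
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"mean decoding accuracy: {scores.mean():.2f}")  # ~0.5 on random data

With real recordings, above-chance cross-validated accuracy on held-out trials would constitute evidence that the brain responses carry information about predicate-argument structure.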

This award is cofunded by the Office of International Science and Engineering.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


(Showing: 1 - 10 of 20)
Ahmed, Hamad and Wilbur, Ronnie B. and Bharadwaj, Hari M. and Siskind, Jeffrey Mark "Confounds in the Data: Comments on 'Decoding Brain Representations by Multimodal Learning of Neural Activity and Visual Features'" IEEE Transactions on Pattern Analysis and Machine Intelligence, v.44, 2022. https://doi.org/10.1109/TPAMI.2021.3121268
Ahmed, Hamad and Wilbur, Ronnie B. and Bharadwaj, Hari M. and Siskind, Jeffrey Mark "Object Classification From Randomized EEG Trials" Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
Baydin, A.G. and Pearlmutter, B.A. and Radul, A.A. and Siskind, J.M. "Automatic Differentiation in Machine Learning: A Survey" Journal of Machine Learning Research, v.18, 2018.
Bharadwaj, Hari M. and Wilbur, Ronnie B. and Siskind, Jeffrey Mark "Still an Ineffective Method With Supertrials/ERPs: Comments on 'Decoding Brain Representations by Multimodal Learning of Neural Activity and Visual Features'" IEEE Transactions on Pattern Analysis and Machine Intelligence, v.45, 2023. https://doi.org/10.1109/TPAMI.2023.3292062
Bradley, Chuck and Malaia, Evie A. and Siskind, Jeffrey Mark and Wilbur, Ronnie B. "Visual form of ASL verb signs predicts non-signer judgment of transitivity" PLOS ONE, v.17, 2022. https://doi.org/10.1371/journal.pone.0262098
Ford, Linda K. and Borneman, Joshua D. and Krebs, Julia and Malaia, Evguenia A. and Ames, Brendan P. "Classification of visual comprehension based on EEG data using sparse optimal scoring" Journal of Neural Engineering, v.18, 2021. https://doi.org/10.1088/1741-2552/abdb3b
Ilyevsky, Thomas Victor and Johansen, Jared Sigurd and Siskind, Jeffrey Mark "Talk the talk and walk the walk: Dialogue-driven navigation in unknown indoor environments" 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021.
Krebs, Julia and Malaia, Evie and Wilbur, Ronnie B. and Roehm, Dietmar "Interaction between topic marking and subject preference strategy in sign language processing" Language, Cognition and Neuroscience, v.35, 2020. https://doi.org/10.1080/23273798.2019.1667001
Krebs, Julia and Malaia, Evie and Wilbur, Ronnie B. and Roehm, Dietmar "Psycholinguistic mechanisms of classifier processing in sign language" Journal of Experimental Psychology: Learning, Memory, and Cognition, v.47, 2021. https://doi.org/10.1037/xlm0000958
Krebs, Julia and Malaia, Evie and Wilbur, Ronnie B. and Roehm, Dietmar "Subject preference emerges as cross-modal strategy for linguistic processing" Brain Research, v.1691, 2018. https://doi.org/10.1016/j.brainres.2018.03.029
Krebs, Julia and Roehm, Dietmar and Wilbur, Ronnie B. and Malaia, Evie A. "Age of sign language acquisition has lifelong effect on syntactic preferences in sign language users" International Journal of Behavioral Development, 2020. https://doi.org/10.1177/0165025420958193

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

We studied how the human brain processes and represents people and objects (nouns) and activities (verbs) across different modalities, including vision (video) and language (both spoken and written). In particular, we studied how brain activity supports forming compound concepts by combining simpler concepts, such as nouns and verbs, into sentence structure (subject-verb-object). We developed methods for computer programs to decode both simple and compound concepts from neuroimaging data (EEG and fMRI) recorded from subjects viewing short video clips, hearing spoken sentences, and reading sentences, all depicting compound concepts. Our methods demonstrate two key findings: first, the pattern of brain activity evoked by the same concepts (stimuli) is similar, but not identical, across different people; second, the pattern of brain activity evoked by the same concepts (stimuli) is similar, but not identical, across different modalities. The pattern of brain activity associated with combining simple concepts into compound concepts is likewise similar, but not identical, across different people and modalities.
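
A minimal sketch of the cross-person/cross-modality comparison described above, under the assumption that responses from different people (or modalities) have been aligned into a shared feature space: a classifier is trained on one person's responses to a set of concepts and tested on another's responses to the same concepts. The shared "signature plus individual noise" model below is an illustrative stand-in for the similar-but-not-identical patterns reported, not the project's actual analysis.

    # Hedged sketch: train on subject/modality A, test on subject/modality B.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    n_concepts, reps, n_features = 8, 20, 50

    # Shared per-concept signatures model the "similar" part; per-subject
    # noise models the "not identical" part.
    signatures = rng.normal(size=(n_concepts, n_features))

    def subject_data(noise):
        X = np.vstack([signatures[c] + noise * rng.normal(size=(reps, n_features))
                       for c in range(n_concepts)])
        y = np.repeat(np.arange(n_concepts), reps)
        return X, y

    X_a, y_a = subject_data(noise=1.0)  # e.g., person A, or video stimuli
    X_b, y_b = subject_data(noise=1.0)  # e.g., person B, or spoken stimuli

    clf = LogisticRegression(max_iter=1000).fit(X_a, y_a)
    print(f"transfer accuracy: {clf.score(X_b, y_b):.2f} (chance = {1 / n_concepts:.2f})")

Transfer accuracy above chance, but below within-subject accuracy, is the signature of patterns that are similar, but not identical, across people and modalities.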

In another major study, we discovered a flaw in the experimental method used to collect data in a large number of prominent published papers. This flaw invalidates their conclusions and claims. We published our findings in the same venues so that others would be aware of the problem.
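
The comment papers listed above (e.g., "Confounds in the Data" and "Object Classification From Randomized EEG Trials") document a block-design confound: when all trials of a class are recorded contiguously, slow nonstimulus drifts in the signal correlate with the class label, so a classifier appears to decode the stimulus even from data that carry no stimulus information. The sketch below illustrates that effect on synthetic data; the drift model and magnitudes are assumptions for illustration only.

    # Hedged sketch: pure noise plus a slow session-long drift, with no
    # stimulus information at all. Block-design labels correlate with the
    # drift and "decode" far above chance; randomized labels do not.
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(2)
    n_classes, trials_per_class, n_features = 5, 40, 30
    n_trials = n_classes * trials_per_class

    drift = np.linspace(0, 5, n_trials)[:, None]  # slow drift over the session
    X = rng.normal(size=(n_trials, n_features)) + drift

    y_block = np.repeat(np.arange(n_classes), trials_per_class)  # one block per class
    y_rand = rng.permutation(y_block)                            # randomized trial order

    for name, y in [("block design", y_block), ("randomized", y_rand)]:
        acc = cross_val_score(LinearSVC(dual=False), X, y, cv=5).mean()
        print(f"{name}: {acc:.2f} (chance = {1 / n_classes:.2f})")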


Last Modified: 01/27/2024
Modified by: Jeffrey M Siskind
