Award Abstract # 1054133
CAREER: Toward a General Framework for Words and Pictures

NSF Org: IIS
Division of Information & Intelligent Systems
Recipient: THE RESEARCH FOUNDATION FOR THE STATE UNIVERSITY OF NEW YORK
Initial Amendment Date: January 10, 2011
Latest Amendment Date: June 3, 2013
Award Number: 1054133
Award Instrument: Continuing Grant
Program Manager: Jie Yang
 jyang@nsf.gov
 (703) 292-4768
 IIS Division of Information & Intelligent Systems
 CSE Directorate for Computer and Information Science and Engineering
Start Date: June 1, 2011
End Date: June 30, 2014 (Estimated)
Total Intended Award Amount: $458,967.00
Total Awarded Amount to Date: $269,605.00
Funds Obligated to Date: FY 2011 = $88,018.00
FY 2012 = $89,850.00
FY 2013 = $11,173.00
History of Investigator:
  • Tamara Berg (Principal Investigator)
    tlberg@cs.unc.edu
Recipient Sponsored Research Office: SUNY at Stony Brook
W5510 FRANKS MELVILLE MEMORIAL LIBRARY
STONY BROOK
NY  US  11794-0001
(631)632-9949
Sponsor Congressional District: 01
Primary Place of Performance: SUNY at Stony Brook
W5510 FRANKS MELVILLE MEMORIAL LIBRARY
STONY BROOK
NY  US  11794-0001
Primary Place of Performance Congressional District: 01
Unique Entity Identifier (UEI): M746VC6XMNH9
Parent UEI: M746VC6XMNH9
NSF Program(s): Robust Intelligence
Primary Program Source: 01001112DB NSF RESEARCH & RELATED ACTIVIT
01001213DB NSF RESEARCH & RELATED ACTIVIT
01001314DB NSF RESEARCH & RELATED ACTIVIT
01001415DB NSF RESEARCH & RELATED ACTIVIT
01001516DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1045, 1187
Program Element Code(s): 749500
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Pictures convey a visual description of the world directly to their viewers. Computer vision strives to design algorithms that extract the underlying world state captured in the camera's eye, with the overarching goal of general computational image understanding. To date, much vision research has approached image understanding by focusing on object detection, which is only one perspective on the image understanding problem. This project examines an additional, complementary way to collect information about the visual world: directly analyzing the enormous amount of visually descriptive text on the web to reveal what information is useful to attach to, and extract from, pictures. The project presents a comprehensive research program geared toward modeling and exploiting the complementary nature of words and pictures. One main goal is to study the connection between text and images to learn about depiction, the communication of meaning through pictures. This goal is addressed through three broad challenges: 1) developing a richer vocabulary to describe the information provided by depiction; 2) developing image representations that can visually capture this more nuanced vocabulary; and 3) constructing a comprehensive joint words-and-pictures framework.

This project has direct significance to many concrete tasks that access images on the internet, including image search, browsing, and organization, as well as commercial applications such as product search and societally important applications such as web assistance for the blind. Additionally, outputs of this project, including progress toward a natural vocabulary and structure for visual description, have great potential for cross-cutting impact in both the computer vision and natural language processing communities.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


Girish Kulkarni, Visruth Premraj, Vicente Ordonez, Sagnik Dhar, Siming Li, Yejin Choi, Alexander C. Berg, Tamara L. Berg, "BabyTalk: Understanding and Generating Simple Image Descriptions," IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), v.35, 2013, p.2891. doi:10.1109/TPAMI.2012.162
Kiwon Yun, Yifan Peng, Dimitris Samaras, Greg Zelinsky, Tamara L. Berg, "Exploring the role of gaze behavior and object detection in scene understanding," Frontiers in Psychology, Perception Science, v.4, 2013, p.1. doi:10.3389/fpsyg.2013.00917
