Award Abstract # 1208390
NRI-Small: Contextually Grounded Collaborative Discourse for Mediating Shared Basis in Situated Human Robot Dialogue

NSF Org: IIS (Division of Information & Intelligent Systems)
Recipient: MICHIGAN STATE UNIVERSITY
Initial Amendment Date: September 11, 2012
Latest Amendment Date: February 22, 2016
Award Number: 1208390
Award Instrument: Standard Grant
Program Manager: Tatiana Korelsky
IIS Division of Information & Intelligent Systems
CSE Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2012
End Date: September 30, 2017 (Estimated)
Total Intended Award Amount: $957,000.00
Total Awarded Amount to Date: $981,000.00
Funds Obligated to Date: FY 2012 = $957,000.00; FY 2014 = $8,000.00; FY 2016 = $16,000.00
History of Investigator:
  • Joyce Chai (Principal Investigator)
    chaijy@umich.edu
  • Ning Xi (Former Co-Principal Investigator)
Recipient Sponsored Research Office: Michigan State University
426 AUDITORIUM RD RM 2
EAST LANSING
MI  US  48824-2600
(517)355-5040
Sponsor Congressional District: 07
Primary Place of Performance: Michigan State University, MI US 48824-1046
Primary Place of Performance Congressional District: 07
Unique Entity Identifier (UEI): R28EKN92ZTZ9
Parent UEI: VJKZC4D1JN36
NSF Program(s): NRI-National Robotics Initiative
Primary Program Source: 01001213DB NSF RESEARCH & RELATED ACTIVITIES
01001415DB NSF RESEARCH & RELATED ACTIVITIES
01001617DB NSF RESEARCH & RELATED ACTIVITIES
Program Reference Code(s): 7923, 8086, 9251
Program Element Code(s): 801300
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

In human-robot dialogue, although human partners and robots are co-present in a shared environment, they have completely mismatched capabilities for perceiving and reasoning about that environment, and their knowledge and representations of the shared world are drastically different. In addition, the shared environment is full of uncertainties and unexpected events, and humans and robots may differ in their abilities to attend and respond to them. All of these factors contribute to a misaligned perceptual basis between a human and a robot, which jeopardizes their collaborative activities and task performance. A critical component of enabling situated human-robot dialogue is therefore to develop techniques that support mediating the shared perceptual basis for effective conversation and task completion.

The objective of this National Robotics Initiative project is to develop a novel framework that tightly integrates high-level language and dialogue processing with low-level sensing and control systems, and that contextually grounds the collaborative discourse to mediate a shared perceptual basis. By capturing grounded symbolic representations together with continuous representations of the robotic system's internal configuration and continuous information sensed from the changing environment, the framework allows the robot to promptly modify its execution without interrupting ongoing tasks. It further enables humans and robots to collaborate in mediating a shared perceptual basis and supports efficient interaction in a highly dynamic environment.
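
As a rough illustration of the kind of representation such a framework might maintain, the minimal sketch below pairs grounded symbols with the continuous state they are grounded in, so that a change in sensed values can trigger re-planning without discarding the discourse-level grounding. All names and thresholds here are hypothetical assumptions, not the project's actual implementation.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: pair symbolic groundings (from dialogue) with the
# continuous state they refer to, so perception updates can trigger
# re-planning without restarting the ongoing task or the conversation.

@dataclass
class GroundedSymbol:
    phrase: str        # e.g. "the red block" from the collaborative discourse
    object_id: int     # index into the robot's perceived-object table
    confidence: float  # perceptual confidence of this grounding

@dataclass
class RobotState:
    joint_angles: list[float]            # continuous internal configuration
    perceived_objects: dict[int, dict]   # object_id -> continuous features (pose, color, ...)
    groundings: list[GroundedSymbol] = field(default_factory=list)

    def needs_replanning(self, object_id: int, new_pose: list[float], tol: float = 0.05) -> bool:
        """Flag re-planning when a grounded object's sensed pose drifts beyond a tolerance."""
        old_pose = self.perceived_objects[object_id]["pose"]
        return any(abs(a - b) > tol for a, b in zip(old_pose, new_pose))
```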

This project will provide insight into how the misaligned perceptual basis between a human and a robot should be mediated through a collaborative process, and how such a process should be integrated to produce intelligent and collaborative robot behaviors. The expected results will benefit many applications such as manufacturing, service, assistive technology, and search and rescue.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


C. Liu, S. Yang, S. Sadiya, N. Shukla, Y. He, S. Zhu, and J. Y. Chai. "Jointly Learning Task Structures from Language Instruction and Visual Demonstration." Conference on Empirical Methods in Natural Language Processing (EMNLP), 2016.
J. Y. Chai, R. Fang, C. Liu, and L. She. "Collaborative Language Grounding towards Situated Human-Robot Dialogue." AI Magazine, 2016.
L. She and J. Y. Chai. "Interactive Learning of Grounded Verb Semantics towards Human-Robot Communication." Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL), 2017.
L. She and J. Y. Chai. "Incremental Acquisition of Verb Hypothesis Space towards Physical World Interaction." Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL), 2016.
Q. Gao, L. She, and J. Y. Chai. "Interactive Learning of State Representation through Natural Language Instruction and Explanation." AAAI Fall Symposium on Natural Communication for Human-Robot Collaboration, 2017.
Q. Gao, M. Doering, S. Yang, and J. Y. Chai. "Physical Causality of Action Verbs in Grounded Language Understanding." Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL), 2016.
S. Yang, Q. Gao, C. Liu, C. Xiong, S. Zhu, and J. Y. Chai. "Grounded Semantic Role Labeling." The 2016 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2016, p. 149.
Y. Cheng, J. Bao, Y. Jia, Z. Deng, Z. Sun, S. Bi, C. Li, and N. Xi. "Modeling Robotic Operations Controlled by Natural Language." Control Theory and Technology, v.15, 2017.
Y. Jia, L. She, Y. Cheng, J. Bao, J. Y. Chai, and N. Xi. "Program Robots Manufacturing Tasks by Natural Language Instructions." The 12th Annual IEEE International Conference on Automation Science and Engineering (CASE), 2016.
Y. Jia, N. Xi, S. Liu, Y. Wang, X. Li, and S. Bi. "Quality of Teleoperator (QoT) Adaptive Control for Telerobotic Operation." International Journal of Robotics Research (IJRR), 2014.

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

In human-robot communication, robots need to ground human language in their own representations of perception and action. However, although humans and robots are co-present, their representations of the shared world and joint tasks are significantly mismatched due to disparities in their knowledge and perceptual capabilities. Such disparities make language grounding rather difficult. To address this issue, this project conducted a systematic investigation of the role of collaboration in language communication between humans and robots and developed several collaborative models for grounded language processing.

Our studies have shown that when robots have limited perceptual capabilities, grounding language to perception often cannot succeed in a single attempt; it is instead achieved through a collaborative process between humans and robots over multiple episodes. To establish a shared perceptual basis, humans and robots need to make extra effort to collaborate with each other and strive for a common ground for language grounding. Based on these observations, this project has developed several collaborative models using graph-based approaches, where a language graph captures the collaborative linguistic discourse and a vision graph represents the perceived environment. Language grounding then becomes the problem of finding the best match between the language graph and the vision graph, i.e., the match that maximizes the overall compatibility between the two graphs. Our empirical results have shown that graphs can capture rich semantic relations (e.g., spatial relations, group relations) between linguistic expressions as the collaborative discourse unfolds. By taking advantage of these relations, the graph-based approaches can compensate for visual recognition errors and mitigate perceptual differences between humans and robots.

Robots also need the ability to generate linguistic expressions that refer to objects in the environment. Our experiments have shown that traditional approaches, which generate one single description, often fail because humans cannot identify the corresponding target objects due to mismatched representations. To address this problem, this project has developed collaborative referring expression generation models that generate smaller episodes one at a time, based on human feedback, to mediate perceptual differences. These collaborative models allow humans to identify target objects with significantly higher accuracy.
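
Returning to the graph-based grounding models described above, the following is a minimal sketch of the matching formulation: phrases in a language graph are assigned to objects in a vision graph so that the sum of node and edge compatibilities is maximized. The graph contents and scoring functions here are illustrative assumptions, not the project's actual model.

```python
from itertools import permutations

# Illustrative sketch: ground a small "language graph" (referring expressions
# plus relations such as spatial relations) to a "vision graph" (detected
# objects plus relations between them) by exhaustive matching.

lang_nodes = ["the red block", "the cup"]
lang_edges = {("the red block", "the cup"): "left_of"}

vis_nodes = [0, 1, 2]                                   # detected object ids
vis_node_feats = {0: "red block", 1: "mug", 2: "pen"}   # toy recognition labels
vis_edges = {(0, 1): "left_of", (1, 2): "left_of"}      # toy spatial relations

def node_score(phrase, obj_id):
    """Toy lexical overlap between a phrase and a detected object's label."""
    return len(set(phrase.split()) & set(vis_node_feats[obj_id].split()))

def edge_score(rel, obj_pair):
    """1 if the relation asserted in language holds between the two objects."""
    return 1.0 if vis_edges.get(obj_pair) == rel else 0.0

best, best_score = None, float("-inf")
for assignment in permutations(vis_nodes, len(lang_nodes)):
    mapping = dict(zip(lang_nodes, assignment))
    score = sum(node_score(p, o) for p, o in mapping.items())
    score += sum(edge_score(rel, (mapping[a], mapping[b]))
                 for (a, b), rel in lang_edges.items())
    if score > best_score:
        best, best_score = mapping, score

print(best)  # {'the red block': 0, 'the cup': 1}
```

Even when a node label is uninformative (the "cup"/"mug" mismatch above), the relational edge score can still pull the assignment toward the correct object, which is the sense in which relations compensate for recognition errors.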

This project has also explored connections between language and robotic actions. Most robotic arms are programmed with primitive operations such as move to, open gripper, and close gripper. To perform an action, a discrete controller is typically applied first to find a sequence of primitive operations; these operations are then passed to a continuous planner and translated into trajectories for the arm motors. A critical question is therefore how to connect language commands with the corresponding sequences of primitive operations. To address this question, the project has focused on the representation of grounded verb semantics for concrete result action verbs (i.e., verbs that denote changes to the world) and on the acquisition of such representations through human-robot communication. Humans can teach robots new actions through step-by-step language instructions that build on known actions and operations. At the end of teaching, the robot captures the changes in world state it experienced during teaching and represents the corresponding new verb by the expected goal state of the world. This state-based representation allows a planner to automatically compute a sequence of low-level primitive operations when a learned verb is applied in new situations. To handle uncertainties in the environment, the project has also developed collaborative approaches that allow the robot to ask intelligent questions and acquire more reliable grounded representations.
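
A minimal sketch of the state-based idea follows: a learned verb is stored as the goal state it should bring about, and a simple planner searches over primitive operations to reach that goal in a new situation. The predicates, primitives, and search strategy here are hypothetical placeholders, not the project's actual planner.

```python
from itertools import product

# Hypothetical sketch of state-based verb semantics: a learned verb is
# represented by a goal predicate over world state, and a tiny
# breadth-first planner finds a primitive-operation sequence reaching it.

PRIMITIVES = {
    "open_gripper":  lambda s: {**s, "gripper_open": True},
    "close_gripper": lambda s: {**s, "gripper_open": False,
                                "holding": s["near"] if s["gripper_open"] else s["holding"]},
    "move_to_block": lambda s: {**s, "near": "block"},
}

def grasp_block_goal(state):
    """Learned representation of 'grasp the block': the expected goal state."""
    return state["holding"] == "block"

def plan(state, goal, max_depth=4):
    """Search primitive-operation sequences of increasing length (toy planner)."""
    for depth in range(1, max_depth + 1):
        for seq in product(PRIMITIVES, repeat=depth):
            s = dict(state)
            for op in seq:
                s = PRIMITIVES[op](s)
            if goal(s):
                return list(seq)
    return None

start = {"gripper_open": False, "holding": None, "near": None}
print(plan(start, grasp_block_goal))
# ['open_gripper', 'move_to_block', 'close_gripper']
```

Because the verb is stored as a goal state rather than a fixed action sequence, the same representation can yield different primitive sequences when the starting situation differs (e.g., if the gripper is already open).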

 


Last Modified: 01/29/2018
Modified by: Joyce Y Chai
