Award Abstract # 1764091
RI: Medium: Collaborative Research: Developing a Uniform Meaning Representation for Natural Language Processing

NSF Org: IIS
Division of Information & Intelligent Systems
Recipient: UNIVERSITY OF NEW MEXICO
Initial Amendment Date: July 23, 2018
Latest Amendment Date: July 23, 2018
Award Number: 1764091
Award Instrument: Standard Grant
Program Manager: Tatiana Korelsky
IIS: Division of Information & Intelligent Systems
CSE: Directorate for Computer and Information Science and Engineering
Start Date: August 1, 2018
End Date: December 31, 2022 (Estimated)
Total Intended Award Amount: $399,821.00
Total Awarded Amount to Date: $399,821.00
Funds Obligated to Date: FY 2018 = $399,821.00
History of Investigator:
  • William Croft (Principal Investigator)
    wcroft@unm.edu
Recipient Sponsored Research Office: University of New Mexico
1 UNIVERSITY OF NEW MEXICO
ALBUQUERQUE
NM  US  87131-0001
(505)277-4186
Sponsor Congressional District: 01
Primary Place of Performance: University of New Mexico
1700 Lomas Blvd
Albuquerque
NM  US  87131-0001
Primary Place of Performance Congressional District: 01
Unique Entity Identifier (UEI): F6XLTRUQJEN4
Parent UEI:
NSF Program(s): Robust Intelligence
Primary Program Source: 01001819DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7924, 7495, 9150
Program Element Code(s): 749500
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

The use of intelligent agents that can communicate with us in human language has become an essential part of our daily lives. Today's intelligent agents can respond appropriately to many things we say or text to them, but they cannot yet communicate fully like humans. They lack our general ability to arrive quickly at accurate and relevant interpretations of what others communicate to us and to form appropriate responses, particularly in sustained interactions. The typical way we teach a machine to acquire such ability is to provide it with approximations of the meanings of utterances in the contexts in which they have occurred in the past. Over the years these approximations have become increasingly rich and detailed, enabling ever more sophisticated systems for interacting with computers using natural language, such as searching for information, getting up-to-date recommendations for products and services, and translating foreign languages. The goal of this project is to bring together linguists and computer scientists to jointly develop a practical meaning representation formalism based on these rich approximations that can be applied to a much more diverse set of languages. This will allow us to use machine learning to develop techniques to automatically translate human utterances into our meaning formalism. In turn, this will enable intelligent agents to acquire more advanced communication capabilities, and for a wider range of languages. The languages considered for the project include those spoken by large populations, such as English, Chinese, and Arabic, as well as languages of smaller communities, such as Norwegian, and Arapaho and Kukama-Kukamiria, two indigenous languages of the Americas. As such, this project will help bring modern technology to smaller groups so that all people can benefit equally from technological advancement.
The project will also contribute to the development of the US workforce by training a new generation of researchers on cutting-edge technologies in artificial intelligence.

This project brings together an interdisciplinary team of linguists and computer scientists from three institutions to jointly develop a Uniform Meaning Representation (UMR). UMR is a practical, formal, computationally tractable, and cross-linguistically valid meaning representation of natural language that can impact a wide range of downstream applications requiring deep natural language understanding (NLU). UMR will extend existing meaning representations to include quantifier types and relations, modality, negation, tense and aspect, and be tested on a typologically diverse set of languages. Methods and techniques for UMR annotation, parsing and generation, and evaluation will be uniform across languages. The project will also develop novel algorithms and models for UMR-based broad-coverage and general-purpose multilingual semantic parsers. Students participating in the project will receive training in the full cycle of conceptualizing, producing, processing, and consuming meaning representations at the sites of participating institutions. This project will help to build a community of NLP researchers that will contribute to the development of UMR-based data and tools and advance the state of the art in Natural Language Processing (NLP) in particular, and Artificial Intelligence (AI) in general.
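To make the graph-based style of meaning representation that UMR builds on concrete, the sketch below encodes "The dog did not bark" as a rooted graph of concept/relation triples and renders it in the Penman-style bracket notation used by AMR. This is an illustrative sketch only, not the official UMR specification: the role labels follow AMR conventions, while the `:aspect` attribute is a hypothetical stand-in for the kind of tense/aspect and modality information UMR adds on top of AMR.

```python
# Minimal sketch of an AMR/UMR-style rooted graph as concept/relation triples.
# Node labels, roles, and the ":aspect" value are illustrative, not the
# official UMR inventory.

from collections import defaultdict

# "The dog did not bark." Negation is marked with :polarity (as in AMR);
# :aspect is a hypothetical UMR-style extension.
triples = [
    ("b", "instance", "bark-01"),   # event concept (PropBank-style sense)
    ("b", "ARG0", "d"),             # the barker
    ("d", "instance", "dog"),
    ("b", "polarity", "-"),         # negation
    ("b", "aspect", "performance"), # hypothetical aspect annotation
]

def to_penman(root, triples):
    """Render triples rooted at `root` in Penman-like bracket notation."""
    inst = {s: o for s, r, o in triples if r == "instance"}
    edges = defaultdict(list)
    for s, r, o in triples:
        if r != "instance":
            edges[s].append((r, o))
    def render(var):
        parts = [f"({var} / {inst[var]}"]
        for role, tgt in edges[var]:
            val = render(tgt) if tgt in inst else tgt
            parts.append(f" :{role} {val}")
        return "".join(parts) + ")"
    return render(root)

print(to_penman("b", triples))
# → (b / bark-01 :ARG0 (d / dog) :polarity - :aspect performance)
```

The triple form is what makes such representations computationally tractable: parsers can be scored by comparing predicted and gold triple sets, and the same graph machinery can be reused across typologically different languages.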

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


Van Gysel, Jens E. and Vigus, Meagan and Denk, Lukas and Cowell, Andrew and Vallejos, Rosa and O'Gorman, Tim and Croft, William. "Theoretical and Practical Issues in the Semantic Annotation of Four Indigenous Languages." Proceedings of the Joint 15th Linguistic Annotation Workshop (LAW) and 3rd Designing Meaning Representations (DMR) Workshop, 2021. https://doi.org/10.18653/v1/2021.law-1.2
Van Gysel, Jens E. and Vigus, Meagan and Kalm, Pavlína and Lee, Sook-kyung and Regan, Michael and Croft, William. "Cross-linguistic semantic annotation: reconciling the language-specific and the universal." Proceedings of the First International Workshop on Designing Meaning Representations (DMR 2019), 2019. https://doi.org/10.18653/v1/W19-3301
Vigus, Meagan and Van Gysel, Jens E. and Croft, William. "A Dependency Structure Annotation for Modality." Proceedings of the First International Workshop on Designing Meaning Representations (DMR 2019), 2019. https://doi.org/10.18653/v1/W19-3321
Vigus, Meagan and Van Gysel, Jens E. and O'Gorman, Tim and Cowell, Andrew and Vallejos, Rosa and Croft, William. "Cross-lingual annotation: a road map for low- and no-resource languages." Proceedings of the Second International Workshop on Designing Meaning Representations (DMR 2020), 2020.

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Intelligent agents have increasingly become part of our everyday lives, and they can now respond appropriately to many things that we say or text to them. Many of them do this by first translating human language into a meaning formalism that can be understood or executed by the computer. As this meaning formalism gets richer and more detailed, intelligent agents are capable of doing more things that are valuable to humans, from answering our questions about the weather to playing the music that we want, from providing information about flights to translating one language into another. 


The goal of the Uniform Meaning Representation (UMR) project was to develop a meaning formalism that can mediate communication between humans and computers in a wide range of languages of the world. Developing a meaning formalism that all languages can translate into is a complicated undertaking, however. It has required expertise from both computer scientists and linguists who understand the similarities and differences between the world’s languages. Towards this goal, researchers from Brandeis University, the University of Colorado at Boulder, and the University of New Mexico jointly developed the Uniform Meaning Representation in partnership with an international group of scientists. 


The researchers started from a popular existing meaning formalism called Abstract Meaning Representation (AMR). They enriched AMR to cover a wider range of meanings, and generalized it so that it can work for a comprehensive set of the world's languages. In this process, they tested UMR on diverse languages that include those spoken by large populations, such as Arabic, Chinese, and English, as well as native languages of smaller groups, such as Arapaho, Kukama-Kukamiria, Navajo, and Sanapaná. As part of this effort, they organized workshops to invite feedback from fellow researchers on the representation, and held tutorials at computational linguistics conferences to disseminate the research. They also developed tools that fellow researchers can use to produce UMRs for their own languages.


Last Modified: 05/26/2023
Modified by: William Croft

