
NSF Org: IIS Division of Information & Intelligent Systems
Initial Amendment Date: August 13, 2017
Latest Amendment Date: August 13, 2017
Award Number: 1748642
Award Instrument: Standard Grant
Program Manager: D. Langendoen, IIS Division of Information & Intelligent Systems, CSE Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2017
End Date: October 31, 2019 (Estimated)
Total Intended Award Amount: $150,000.00
Total Awarded Amount to Date: $150,000.00
Recipient Sponsored Research Office: 5000 Forbes Ave, Pittsburgh, PA 15213-3815, US; (412) 268-8746
Primary Place of Performance: 5000 Forbes Ave, Pittsburgh, PA 15213-3815, US
NSF Program(s): Robust Intelligence
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070
ABSTRACT
Interpretation, the task of translating speech from one language to another, is an important tool for facilitating communication in multilingual settings such as international meetings, travel, or diplomacy. However, simultaneous interpretation, in which the translation must be produced while the speaker is still speaking, is an extremely difficult task requiring a high level of experience and training. In particular, simultaneous interpreters often find certain content, such as technical terms, names of people and organizations, and numbers, particularly hard to translate correctly. This EArly-concept Grant for Exploratory Research (EAGER) project aims to create automatic interpretation assistants that help interpreters with this difficult-to-translate content by recognizing it in the original language and displaying translations on a heads-up display (similar to a teleprompter) that interpreters can consult if they wish. This will make simultaneous interpretation more effective and accessible, making conversations across languages and cultures more natural and more common, and joining communities and cultures across the world in trade, cooperation, and friendship.
Creating such systems is a technically challenging problem that has not previously been attempted. One challenge is that simultaneous interpretation is already cognitively taxing, and any interface must not unduly increase the cognitive load on the interpreter by being too intrusive. Reducing this load requires an interface that can decide when to provide translation suggestions and when to refrain from doing so. To achieve this goal, this project will develop methods that are robust to speech recognition errors and that learn what to display by observing interpreters' interpretation results. The utility of the proposed framework will be evaluated with respect to how much it improves interpreters' ability to produce fluent, accurate interpretations, as well as the cognitive load the additional interface imposes on them.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Interpretation, the task of translating speech from one language to another, is an important tool for facilitating communication in multilingual settings such as international meetings, travel, or diplomacy. However, simultaneous interpretation, in which the translation must be produced while the speaker is still speaking, is an extremely difficult task requiring a high level of experience and training. In particular, simultaneous interpreters often find certain content, such as technical terms, names of people and organizations, and numbers, particularly hard to translate correctly. This EArly-concept Grant for Exploratory Research (EAGER) project aimed to create automatic interpretation assistants that help interpreters with this difficult-to-translate content by recognizing it in the original language and displaying translations on a heads-up display (similar to a teleprompter) that interpreters can consult if they wish. These interfaces have the potential to make simultaneous interpretation more effective and accessible, making conversations across languages and cultures more natural and more common, and joining communities and cultures across the world in trade, cooperation, and friendship.
Within this overall goal, this project tackled several technical challenges that must be faced in the creation of such automatic interpretation assistants.
1. Creation of offline translation assistants: We created static aids that convey useful information to interpreters, automating the process of creating “cheat sheets”: given a short description of the material to be interpreted, the system automatically builds a lexicon specific to that domain. This included discovering salient terms and finding translations for them. Using bilingual speakers as a proxy for interpreters, we found a technique for selecting difficult-to-translate phrases that was significantly more effective than expert curation. In particular, selecting terms that have many different translations helped these novice translators more than random help, no help, or expert help.
2. Creation of machine-in-the-loop translation assistants: We created a display that listened to the speaker (in the source language), helping the interpreter to create fluent translations. Because these interfaces must not overwhelm the interpreter with irrelevant material, we performed automatic tagging of terminology that the interpreter may have trouble translating. We proposed a number of specially designed features that increase tagging accuracy by 2-10 points over alternative methods.
3. Creation of methods for robust prediction: Noise manifests itself as MT errors when models are applied to bilingual text, or as ASR errors when they are applied to speech, and models for interpretation assistance must be able to handle it. We devised a number of methods to handle this noise, for example: (1) methods for integrating the confidence of upstream recognition systems into downstream neural network models, significantly improving accuracy; (2) methods for synthesizing noise in training data, which make it possible to create downstream systems resilient to naturally occurring noise and partially mitigate the resulting loss in accuracy.
4. Learning from explicit and implicit feedback: To create models that learn when and how to give suggestions to interpreters, we need a training signal about which suggestions are appropriate given a particular interpretation context. We thus experimented with adapting existing systems for Quality Estimation (QE) in Machine Translation (MT) to measure the quality of interpreter output, as a means of measuring when and how much assistance is required. We adapted an existing system using a range of features specific to SI, such as filled pauses, hesitations, and incomplete words (among others), and demonstrated that our adapted model outperforms the baseline implementation in five experimental settings across three language pairs (English-French, English-Italian, and English-Japanese), leading to an average gain in accuracy of approximately 14.8% and indicating that these features are effective in improving the quality estimation of interpreter output.
5. Creation of an initial design and elicitation of interpreter feedback: We performed participatory design sessions and conducted a survey of interpreters that received 17 responses. We synthesized these responses, which largely confirmed our suspicions about the importance of difficult terminology and numbers in interpreting, and also pointed to the potential of our computer-assistance method.
6. Evaluation of the proposed interpretation interface: We deployed the system in a simulated interpretation setting and collected assessments with respect to both objective measures of translation quality and users’ subjective experience in using the system. We found that automatic term assistance indeed had a beneficial effect on the overall accuracy of term translation: in the automatic-assistance setting, our analysis showed about a 10% gain in term translation accuracy over the setting with no assistance.
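The term-selection heuristic described in point 1 above can be sketched in a few lines. This is an illustrative toy, not the project's actual implementation; the function name, the example German lexicon entries, and the "rank by number of distinct translations" rule are assumptions standing in for the real system.

```python
from collections import defaultdict

def select_ambiguous_terms(bilingual_pairs, top_k=10):
    """Rank source terms by how many distinct translations they have.

    bilingual_pairs: iterable of (source_term, translation) pairs, e.g.
    extracted from a domain-specific lexicon. Terms with many competing
    translations are treated as likely to be difficult, so they would go
    on the interpreter's "cheat sheet".
    """
    translations = defaultdict(set)
    for src, tgt in bilingual_pairs:
        translations[src].add(tgt)
    ranked = sorted(translations, key=lambda t: len(translations[t]), reverse=True)
    return ranked[:top_k]

pairs = [
    ("bank", "Bank"), ("bank", "Ufer"), ("bank", "Damm"),
    ("treaty", "Vertrag"),
    ("tariff", "Zoll"), ("tariff", "Tarif"),
]
print(select_ambiguous_terms(pairs, top_k=2))  # → ['bank', 'tariff']
```

Here "bank" outranks "tariff" because it has three candidate translations rather than two, mirroring the finding that highly ambiguous terms are where novice translators benefited most from help.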
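The noise-synthesis idea in point 3 can be illustrated as follows. This is a minimal sketch under simplifying assumptions: the token-level deletion/substitution model, the rates, and the function name are hypothetical choices for illustration, not the project's actual noising procedure.

```python
import random

def add_synthetic_noise(tokens, sub_rate=0.05, del_rate=0.03, vocab=None, seed=0):
    """Corrupt a token sequence to emulate upstream ASR/MT errors.

    Training a downstream model on a mix of clean and corrupted inputs
    (keeping the clean labels) can make it more resilient to the noise
    it will see at test time.
    """
    rng = random.Random(seed)
    vocab = vocab or list(tokens)
    noisy = []
    for tok in tokens:
        r = rng.random()
        if r < del_rate:
            continue  # simulated deletion error: drop the token
        if r < del_rate + sub_rate:
            noisy.append(rng.choice(vocab))  # simulated substitution error
        else:
            noisy.append(tok)  # token survives unchanged
    return noisy

clean = "the delegates discussed the new tariff schedule".split()
print(add_synthetic_noise(clean, sub_rate=0.2, del_rate=0.1, seed=1))
```

A fixed seed makes the corruption reproducible, which is convenient when regenerating the same noisy training set across experiments.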
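The SI-specific QE features mentioned in point 4 (filled pauses, hesitations, incomplete words) can be sketched as simple counts over an interpreter's transcript. The marker word list and the trailing-hyphen convention for incomplete words are illustrative assumptions; the actual system used a richer feature set inside a learned QE model.

```python
def si_quality_features(transcript_tokens):
    """Count simple disfluency cues in an interpreter's output.

    Returns a small feature dictionary that a QE model could consume
    alongside standard MT quality-estimation features.
    """
    filled_pauses = {"uh", "um", "er", "ah"}  # assumed marker list
    return {
        "n_tokens": len(transcript_tokens),
        "n_filled_pauses": sum(t.lower() in filled_pauses for t in transcript_tokens),
        # convention assumed here: incomplete words are transcribed with a trailing hyphen
        "n_incomplete_words": sum(t.endswith("-") for t in transcript_tokens),
    }

tokens = "um the the delega- delegates um agreed".split()
print(si_quality_features(tokens))
# → {'n_tokens': 7, 'n_filled_pauses': 2, 'n_incomplete_words': 1}
```

Features like these carry signal about when the interpreter is struggling, which is exactly the cue an assistance interface needs for deciding when to offer suggestions.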
These results were published in peer-reviewed papers in academic conferences, and reported in several translation industry press pieces.
Last Modified: 01/21/2020
Modified by: Graham Neubig