
NSF Org: IIS Division of Information & Intelligent Systems
Initial Amendment Date: August 13, 2017
Latest Amendment Date: August 13, 2017
Award Number: 1748642
Award Instrument: Standard Grant
Program Manager: D. Langendoen, IIS Division of Information & Intelligent Systems, CSE Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2017
End Date: October 31, 2019 (Estimated)
Total Intended Award Amount: $150,000.00
Total Awarded Amount to Date: $150,000.00
Recipient Sponsored Research Office: 5000 Forbes Ave, Pittsburgh, PA 15213-3815, US; (412) 268-8746
Primary Place of Performance: 5000 Forbes Ave, Pittsburgh, PA 15213-3815, US
NSF Program(s): Robust Intelligence
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070
ABSTRACT
Interpretation, the task of translating speech from one language to another, is an important tool for facilitating communication in multilingual settings such as international meetings, travel, or diplomacy. However, simultaneous interpretation, in which the translation must be produced while the speaker is still speaking, is an extremely difficult task requiring a high level of experience and training. In particular, simultaneous interpreters often find certain content, such as technical terms, names of people and organizations, and numbers, particularly hard to translate correctly. This EArly-concept Grant for Exploratory Research (EAGER) project aims to create automatic interpretation assistants that help interpreters with this difficult-to-translate content by recognizing it in the original language and displaying translations on a heads-up display (similar to a teleprompter) that interpreters can consult if they wish. This will make simultaneous interpretation more effective and accessible, making conversations across languages and cultures more natural and more common, and joining communities and cultures across the world in trade, cooperation, and friendship.
Creating such systems is a technically challenging problem that has not previously been attempted. One challenge is that simultaneous interpretation is already cognitively taxing, and any interface must not unduly increase the cognitive load on the interpreter by being too intrusive. Reducing this load requires an interface that can decide when to provide translation suggestions and when to refrain from doing so. To achieve this goal, this project will develop methods that are robust to speech recognition errors and that learn what to display by observing interpreters' interpretation results. The utility of the proposed framework will be evaluated with respect to how much it improves interpreters' ability to produce fluent, accurate interpretations, as well as the cognitive load the additional interface imposes on them.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Interpretation, the task of translating speech from one language to another, is an important tool for facilitating communication in multilingual settings such as international meetings, travel, or diplomacy. However, simultaneous interpretation, in which the translation must be produced while the speaker is still speaking, is an extremely difficult task requiring a high level of experience and training. In particular, simultaneous interpreters often find certain content, such as technical terms, names of people and organizations, and numbers, particularly hard to translate correctly. This EArly-concept Grant for Exploratory Research (EAGER) project aimed to create automatic interpretation assistants that help interpreters with this difficult-to-translate content by recognizing it in the original language and displaying translations on a heads-up display (similar to a teleprompter) that interpreters can consult if they wish. These interfaces have the potential to make simultaneous interpretation more effective and accessible, making conversations across languages and cultures more natural and more common, and joining communities and cultures across the world in trade, cooperation, and friendship.
Within this overall goal, this project tackled several technical challenges that must be faced in the creation of such automatic interpretation assistants.
1. Creation of offline translation assistants: We created static aids that convey useful information to interpreters, automating the process of creating “cheat sheets”: given a short description of the material to be interpreted, the system automatically builds a lexicon specific to that domain. This included discovering salient terms and finding translations for them. Using bilingual speakers as a proxy for interpreters, we found a technique for selecting difficult-to-translate phrases that was significantly more effective than expert curation. In particular, selecting terms that have many different translations helped these novice translators more than random help, no help, or expert help.
2. Creation of machine-in-the-loop translation assistants: We created a display that listened to the speaker (in the source language), helping the interpreter to create fluent translations. Because these interfaces must not overwhelm the interpreter with irrelevant material, we performed automatic tagging of terminology that the interpreter may have trouble translating. We proposed a number of specially designed features that increase tagging accuracy by 2-10 points over alternative methods.
3. Creation of methods for robust prediction: Noise manifests itself as MT errors when models are applied to bilingual text, or as ASR errors when they are applied to speech, and models for interpretation assistance must be able to handle it. We devised a number of methods to handle this noise, for example: (1) methods for integrating the confidence of upstream recognition systems into downstream neural network models, significantly improving accuracy; (2) methods for synthesizing noise in training data, which make it possible to create downstream systems resilient to naturally occurring noise and partially mitigate the resulting loss in accuracy.
4. Learning from explicit and implicit feedback: To create models that learn when and how to give suggestions to interpreters, we need a training signal about which suggestions are appropriate given a particular interpretation context. We thus experimented with adapting existing systems for Quality Estimation (QE) in Machine Translation (MT) to measure the quality of interpreter output, as a means of measuring when and how much assistance is required. We adapted an existing system using a range of features specific to SI, such as filled pauses, hesitations, and incomplete words (among others), and demonstrated that our adapted model outperforms the baseline implementation in five experimental settings across three language pairs (English-French, English-Italian, and English-Japanese), leading to an average gain in accuracy of approximately 14.8% and indicating that these features are effective in improving the quality estimation of interpreter output.
5. Creation of an initial design and elicitation of interpreter feedback: We performed participatory design sessions and conducted a survey of interpreters that received 17 responses. We synthesized these responses, which largely confirmed our suspicions about the importance of difficult terminology and numbers in interpreting, and also pointed to the potential of our computer-assistance method.
6. Evaluation of the proposed interpretation interface: We deployed the system in a simulated interpretation setting and collected assessments with respect to both objective measures of translation quality and users’ subjective experience in using the system. We found that automatic term assistance indeed had a beneficial effect on the overall accuracy of term translation: in the automatic-assistance setting, our analysis showed about a 10% gain in term translation accuracy over the setting with no assistance.
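The term-selection heuristic described in point 1 above can be sketched in a few lines. This is an illustrative toy, not the project's actual implementation; the function name, the example German lexicon entries, and the "rank by number of distinct translations" rule are assumptions standing in for the real system.

```python
from collections import defaultdict

def select_ambiguous_terms(bilingual_pairs, top_k=10):
    """Rank source terms by how many distinct translations they have.

    bilingual_pairs: iterable of (source_term, translation) pairs, e.g.
    extracted from a domain-specific lexicon. Terms with many competing
    translations are treated as likely to be difficult, so they would go
    on the interpreter's "cheat sheet".
    """
    translations = defaultdict(set)
    for src, tgt in bilingual_pairs:
        translations[src].add(tgt)
    ranked = sorted(translations, key=lambda t: len(translations[t]), reverse=True)
    return ranked[:top_k]

pairs = [
    ("bank", "Bank"), ("bank", "Ufer"), ("bank", "Damm"),
    ("treaty", "Vertrag"),
    ("tariff", "Zoll"), ("tariff", "Tarif"),
]
print(select_ambiguous_terms(pairs, top_k=2))  # → ['bank', 'tariff']
```

Here "bank" outranks "tariff" because it has three candidate translations rather than two, mirroring the finding that highly ambiguous terms are where novice translators benefited most from help.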
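The noise-synthesis idea in point 3 can be illustrated as follows. This is a minimal sketch under simplifying assumptions: the token-level deletion/substitution model, the rates, and the function name are hypothetical choices for illustration, not the project's actual noising procedure.

```python
import random

def add_synthetic_noise(tokens, sub_rate=0.05, del_rate=0.03, vocab=None, seed=0):
    """Corrupt a token sequence to emulate upstream ASR/MT errors.

    Training a downstream model on a mix of clean and corrupted inputs
    (keeping the clean labels) can make it more resilient to the noise
    it will see at test time.
    """
    rng = random.Random(seed)
    vocab = vocab or list(tokens)
    noisy = []
    for tok in tokens:
        r = rng.random()
        if r < del_rate:
            continue  # simulated deletion error: drop the token
        if r < del_rate + sub_rate:
            noisy.append(rng.choice(vocab))  # simulated substitution error
        else:
            noisy.append(tok)  # token survives unchanged
    return noisy

clean = "the delegates discussed the new tariff schedule".split()
print(add_synthetic_noise(clean, sub_rate=0.2, del_rate=0.1, seed=1))
```

A fixed seed makes the corruption reproducible, which is convenient when regenerating the same noisy training set across experiments.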
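The SI-specific QE features mentioned in point 4 (filled pauses, hesitations, incomplete words) can be sketched as simple counts over an interpreter's transcript. The marker word list and the trailing-hyphen convention for incomplete words are illustrative assumptions; the actual system used a richer feature set inside a learned QE model.

```python
def si_quality_features(transcript_tokens):
    """Count simple disfluency cues in an interpreter's output.

    Returns a small feature dictionary that a QE model could consume
    alongside standard MT quality-estimation features.
    """
    filled_pauses = {"uh", "um", "er", "ah"}  # assumed marker list
    return {
        "n_tokens": len(transcript_tokens),
        "n_filled_pauses": sum(t.lower() in filled_pauses for t in transcript_tokens),
        # convention assumed here: incomplete words are transcribed with a trailing hyphen
        "n_incomplete_words": sum(t.endswith("-") for t in transcript_tokens),
    }

tokens = "um the the delega- delegates um agreed".split()
print(si_quality_features(tokens))
# → {'n_tokens': 7, 'n_filled_pauses': 2, 'n_incomplete_words': 1}
```

Features like these carry signal about when the interpreter is struggling, which is exactly the cue an assistance interface needs for deciding when to offer suggestions.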
These results were published in peer-reviewed papers in academic conferences, and reported in several translation industry press pieces.
Last Modified: 01/21/2020
Modified by: Graham Neubig