
NSF Org: IIS Division of Information & Intelligent Systems
Recipient: University of California-San Diego
Initial Amendment Date: July 31, 2018
Latest Amendment Date: July 31, 2018
Award Number: 1816701
Award Instrument: Standard Grant
Program Manager: Hector Munoz-Avila, hmunoz@nsf.gov, (703) 292-4481, IIS Division of Information & Intelligent Systems, CSE Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2018
End Date: September 30, 2022 (Estimated)
Total Intended Award Amount: $500,000.00
Total Awarded Amount to Date: $500,000.00
Recipient Sponsored Research Office: 9500 Gilman Dr, La Jolla, CA, US 92093-0021, (858) 534-4896
Primary Place of Performance: La Jolla, CA, US 92093-0934
NSF Program(s): Info Integration & Informatics
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070
ABSTRACT
Modern automatic speech recognition (ASR) tools offer near-human accuracy in many scenarios. This has increased the popularity of speech-driven input in many applications on modern device environments such as tablets and smartphones, while also enabling personal conversational assistants. In this context, this project will study a seemingly simple but important fundamental question: how should one design a speech-driven system to query structured data? Structured data querying is ubiquitous in the enterprise, healthcare, and other domains. Typing queries in the Structured Query Language (SQL) is the gold standard for such querying. But typing SQL is painful or impossible in the above environments, which restricts when and how users can consume their data. SQL also has a learning curve. Existing alternatives such as typed natural language interfaces help improve usability but sacrifice query sophistication substantially. For instance, conversational assistants today support queries mainly over curated vendor-specific datasets, not arbitrary database schemas, and they often fail to understand query intent. This has widened the gap with SQL's high query sophistication and unambiguity. This project will bridge this gap by enabling users to interact with structured data using spoken queries over arbitrary database schemas. It will lead to prototype systems on popular tablet, smartphone, and conversational assistant environments. This could help many data professionals such as data analysts, business reporters, and database administrators, as well as non-technical data enthusiasts. For instance, nurse informaticists can retrieve patient details more easily and unambiguously to assist doctors, while analysts can slice and dice their data even on the move. The research will be disseminated as publications in database and natural language processing conferences. The research and artifacts produced will be integrated into graduate and undergraduate courses on database systems. The PIs will continue supporting students from under-represented groups as part of this project.
This project will create three new systems for spoken querying at three levels of "naturalness." The first level targets a tractable and meaningful subset of SQL. This research will exploit three powerful properties of SQL that regular English speech lacks--unambiguous context-free grammar, knowledge of the database schema queried, and knowledge of tokens from the database instance queried--to support arbitrary database schemas and tokens not present in the ASR vocabulary. The PIs will synthesize and innovate upon ideas from information retrieval, natural language processing, and database indexing and combine them with human-in-the-loop query correction to improve accuracy and efficiency. The second version will make SQL querying even more natural and stateful by changing its grammar. This will lead to the first speech-oriented dialect of SQL. The third version will apply the lessons from the previous versions to two state-of-the-art typed natural language interfaces for databases. This will lead to a redesign of such interfaces that exploits both the properties of speech and the database instance queried.
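To make the first level concrete, here is a minimal Python sketch of one way such properties could be exploited: tokens in an ASR transcript of a SQL query are fuzzy-matched against known schema identifiers to repair misrecognitions. This is an illustrative assumption, not the project's actual method; the schema, function names, and similarity cutoff are all invented for exposition.

```python
import difflib

# Hypothetical sketch: repair misrecognized identifiers in an ASR transcript
# of a SQL query by fuzzy-matching tokens against the known database schema.
# The schema, names, and cutoff below are invented for illustration.

SCHEMA_TOKENS = ["patients", "visits", "patient_id", "visit_date", "diagnosis"]
SQL_KEYWORDS = {"select", "from", "where", "and", "or", "join", "on", "*", "=", ";"}

def repair_token(token: str, cutoff: float = 0.75) -> str:
    """Map a transcribed token to its closest schema identifier, if close enough."""
    if token.lower() in SQL_KEYWORDS:
        return token  # keywords come from SQL's fixed grammar; leave them alone
    matches = difflib.get_close_matches(token.lower(), SCHEMA_TOKENS, n=1, cutoff=cutoff)
    return matches[0] if matches else token  # no close identifier: keep as heard

def repair_transcript(transcript: str) -> str:
    """Repair an ASR transcript of a SQL query, token by token."""
    return " ".join(repair_token(t) for t in transcript.split())

# The ASR engine heard "patience" instead of the table name "patients".
print(repair_transcript("select diagnosis from patience where visit_date = '2018-10-01'"))
# -> select diagnosis from patients where visit_date = '2018-10-01'
```

Because SQL's grammar is unambiguous and the schema is known in advance, the set of plausible identifiers at any point in the query is small, which is what makes this style of correction tractable even for tokens absent from the ASR vocabulary.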
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
This project studied the principles, methods, and tools for bringing relational database querying into the speech-first era of computing. Relational databases are the most common form of business-critical data in practice. Our findings and artifacts could enable easier anytime-anywhere data retrieval and analytics for both data professionals and lay users on newer devices such as tablets, smartphones, and conversational assistants. We broke the problem down into three regimes of ease of use and query sophistication.
The first regime focused on regular SQL, the most popular query language for relational databases. It targeted data professionals familiar with SQL, such as business analysts, nurse informatics practitioners, and system administrators. We devised the first speech-first multimodal querying interface for a non-trivial subset of SQL. In user studies, we found that our system lets people specify their queries on tablets significantly faster than typing SQL.
For the second regime, we created the first speech-oriented dialect of SQL, also targeted at data professionals. Building on the first system's lessons about the rigidity of SQL for dictation, it infuses a small set of focused syntax relaxations that are easy for SQL users to pick up. In user studies, we found that our dialect is significantly easier to speak than regular SQL, while still preserving SQL's exactness and correctness guarantees.
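For intuition only, here is a toy Python sketch of how a few relaxations of this general kind could be normalized back into regular SQL. The spoken forms and rewrite rules below are invented for exposition; they are not the actual dialect this project designed.

```python
import re

# Toy illustration: normalize a relaxed, speakable query back into regular
# SQL by rewriting a few spoken forms. These rules are invented for
# exposition and are not the dialect actually designed in this project.
RELAXATIONS = [
    (r"\bshow me\b", "SELECT"),
    (r"\bof\b", "FROM"),          # "of <table>" in place of "FROM <table>"
    (r"\bsuch that\b", "WHERE"),
    (r"\bequals\b", "="),
]

def normalize(spoken: str) -> str:
    """Rewrite a relaxed spoken query into regular SQL, one rule at a time."""
    sql = spoken
    for pattern, replacement in RELAXATIONS:
        sql = re.sub(pattern, replacement, sql, flags=re.IGNORECASE)
    return sql + ";"

print(normalize("show me name, age of employees such that dept equals 'Sales'"))
# -> SELECT name, age FROM employees WHERE dept = 'Sales';
```

The key property such relaxations would need to preserve is that every speakable phrasing maps deterministically back to SQL, so the exactness of the query's meaning is never sacrificed.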
For the third regime, we studied the intersection of speech-first querying and natural language interfaces to databases, which convert regular English queries to SQL. These can be useful even for lay users of relational data, e.g., for asking about restaurants or general facts on conversational assistants. We showed that being aware of audio features when translating natural language to SQL can make the translation more accurate.
Last Modified: 11/08/2022
Modified by: Arun K Kumar