
NSF Org: |
IIS Division of Information & Intelligent Systems |
Recipient: |
|
Initial Amendment Date: | September 9, 2012 |
Latest Amendment Date: | June 13, 2013 |
Award Number: | 1219263 |
Award Instrument: | Standard Grant |
Program Manager: |
Maria Zemankova
IIS Division of Information & Intelligent Systems CSE Directorate for Computer and Information Science and Engineering |
Start Date: | October 1, 2012 |
End Date: | September 30, 2016 (Estimated) |
Total Intended Award Amount: | $500,000.00 |
Total Awarded Amount to Date: | $516,000.00 |
Funds Obligated to Date: |
FY 2013 = $16,000.00 |
History of Investigator: |
|
Recipient Sponsored Research Office: |
9500 GILMAN DR LA JOLLA CA US 92093-0021 (858)534-4896 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
CA US 92093-0404 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | Info Integration & Informatics |
Primary Program Source: |
01001314DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
The goal of this project is to provide an efficient platform for browser-based, data-driven web application computation. The platform enables cost-effective development of fast-responding applications that adjust well to accesses from mobile clients (e.g., smartphones, tablets). The project achieves its goal using the following approaches: (1) designing novel high-level, location-transparent, declarative, data-driven languages that require much lower coding effort than direct HTML5 coding in order to specify the business process and data access of the applications; (2) developing an optimizer for low-latency query execution plans that utilize browser-based storage and asynchronous computation; (3) developing an action scheduler that optimizes the location and execution order of the actions described in the declarative language; (4) developing a user/action concurrency control theory and a dependency analysis algorithm so that the user can view and act while prior actions are still being computed; (5) prototyping an application-enabling platform that encompasses the developed languages, algorithms and optimizations; and (6) evaluating the effectiveness of the platform in two aspects: how much it reduces latency and how much it reduces the coding effort.
The project's research will have a great impact on mobile-accessible, data-driven web applications, which, by being written in the proposed automatically optimized, declarative languages, will enjoy both low latency and low development cost. The project supports graduate and undergraduate students. Lectures on the research results will be incorporated into the PI's undergraduate-level course on web application development. Publications, software, an online service and experimental data from this research will be disseminated via the project web site (http://www.db.ucsd.edu/browserbasedforward).
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
It is well known that implementing a data-driven application requires time and money. The result is that organizations, corporations and the government often lack the data analytics and management applications that they need since the budget limitations and the speed of business requirements do not allow for the needed applications to be built on time.
A major source of the cost of building data-driven applications is the fact that the application has to collect data from multiple sources, combine them and provide them in the appropriate format to the visualization components. It is both a blessing and a curse of the Big Data era that this source of cost is increasing. The Big Data era is characterized by a great diversity of databases. Besides the customary SQL databases, nowadays data are also found in NoSQL, NewSQL and SQL-on-Hadoop databases. Furthermore, interesting data are also found on the client device (typically a smartphone) and its browser. Finally, the plethora of visualization components creates the need for easily adjusting the results to the formats that the visualizations need.
Given the importance of semistructured data such as JSON, both as input in NoSQL databases and as the logical representation of the visualization input in MVVM architectures, the project created the SQL++ query language, which seamlessly accesses both SQL and semistructured data. While the SQL++ idea (an SQL extension for semistructured data) pre-existed, the project focused on providing the full semistructured data functionality required: accounting for the potential lack of schema and enabling arbitrary inputs, outputs and transformations as powerful as the ones that XQuery achieved. Unlike XQuery, this project was committed to producing an SQL-compatible language, since this is what the majority of developers understand.
Per the original project objective, the project created a SQL++ distributed query processing engine, including the ability to refer to data on either the browser or server. Furthermore, the project addressed the case of live data, by developing Incremental View Maintenance that ensures that the views offered to the users are up-to-date, i.e., they reflect the state of the underlying databases.
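The idea behind incremental view maintenance can be illustrated with a minimal Python sketch. It is not the project's implementation: the class `CountByKeyView`, the helper `recompute` and the sample rows are hypothetical names and data chosen for illustration. The sketch keeps a per-key row count up to date by applying each insert or delete to the stored result, rather than recomputing the view from the base data.

```python
from collections import defaultdict


class CountByKeyView:
    """Hypothetical materialized view: number of rows per key value."""

    def __init__(self, key):
        self.key = key
        self.counts = defaultdict(int)

    def apply_insert(self, row):
        # An inserted base row changes exactly one group of the view.
        self.counts[row[self.key]] += 1

    def apply_delete(self, row):
        # A deleted base row decrements its group; empty groups disappear.
        k = row[self.key]
        self.counts[k] -= 1
        if self.counts[k] == 0:
            del self.counts[k]

    def snapshot(self):
        return dict(self.counts)


def recompute(rows, key):
    """Full recomputation, used here only to check the incremental result."""
    result = defaultdict(int)
    for row in rows:
        result[row[key]] += 1
    return dict(result)


# Illustrative base data (assumed, not from the project).
base = [{"city": "La Jolla"}, {"city": "San Diego"}]
view = CountByKeyView("city")
for row in base:
    view.apply_insert(row)

# Live updates: each change touches the view without rescanning `base`.
added = {"city": "La Jolla"}
base.append(added)
view.apply_insert(added)

removed = base.pop(1)  # the "San Diego" row
view.apply_delete(removed)
```

After the updates, the incrementally maintained view agrees with what a full recomputation over the current base data would return.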
The project expanded SQL++ into a Configurable SQL++ that formally captures the differences across the semistructured query languages of today's many NoSQL, NewSQL and SQL-on-Hadoop databases. Thus the Configurable SQL++ became a useful tool for formally surveying the semantic differences between the multiple languages for NoSQL, NewSQL and SQL-on-Hadoop. We disseminated the Configurable SQL++ to the industry in order to create a dialog and a common understanding that will eventually lead to clean and formal extensions of the SQL standard that are appropriate for NoSQL, NewSQL and SQL-on-Hadoop. The response of the database industry to SQL++ and Configurable SQL++ has been excellent. A number of NoSQL databases are adopting SQL++, and we anticipate that SQL++ will further influence the database industry.
Finally, Configurable SQL++ plays an internal role in the distributed query processor that accesses multiple sources of semistructured data. Recall that these sources are very diverse in the query languages that they use. Thus the distributed query processor needs to be able to interact with these diverse languages and rewrite the application requests into the languages they understand. Configurable SQL++ brings in a formal definition of the diversity aspects of the various languages, which makes it much easier for the distributed query processor to translate across these languages.
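One way to picture how configuration options can capture semantic differences between languages is the following minimal Python sketch. It is an illustration under stated assumptions, not the actual Configurable SQL++ definition: the option name `on_missing_attribute`, its three values, the `MISSING` sentinel and the function `navigate` are all hypothetical. The sketch models a single point of variation, namely what a language returns when a query navigates to an attribute that a schemaless record does not have.

```python
from dataclasses import dataclass

# Sentinel for "attribute absent", kept distinct from None (null).
MISSING = object()


@dataclass(frozen=True)
class DialectConfig:
    """Hypothetical configuration with one semantic option."""
    on_missing_attribute: str  # one of "error", "null", "missing"


def navigate(record, attr, config):
    """Evaluate record.attr under the semantics selected by `config`."""
    if attr in record:
        return record[attr]
    if config.on_missing_attribute == "error":
        raise KeyError(attr)
    if config.on_missing_attribute == "null":
        return None
    if config.on_missing_attribute == "missing":
        return MISSING
    raise ValueError(f"unknown option: {config.on_missing_attribute}")


# The same navigation yields different results under different settings.
record = {"name": "sensor-1"}  # illustrative record with no "unit" attribute
as_null = navigate(record, "unit", DialectConfig("null"))
as_missing = navigate(record, "unit", DialectConfig("missing"))
```

With differences expressed as explicit option values like this, a translator can compare the settings of a source and a target language to decide whether a query needs rewriting at that point.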
Last Modified: 08/11/2017
Modified by: Yannis Papakonstantinou