NSF Award Search: Award # 1219263

Award Abstract # 1219263

III: Small: Low latency browser-based web computation

NSF Org:	IIS Division of Information & Intelligent Systems
Recipient:	UNIVERSITY OF CALIFORNIA, SAN DIEGO
Initial Amendment Date:	September 9, 2012
Latest Amendment Date:	June 13, 2013
Award Number:	1219263
Award Instrument:	Standard Grant
Program Manager:	Maria Zemankova IIS Division of Information & Intelligent Systems CSE Directorate for Computer and Information Science and Engineering
Start Date:	October 1, 2012
End Date:	September 30, 2016 (Estimated)
Total Intended Award Amount:	$500,000.00
Total Awarded Amount to Date:	$516,000.00
Funds Obligated to Date:	FY 2012 = $500,000.00 FY 2013 = $16,000.00
History of Investigator:	Yannis Papakonstantinou (Principal Investigator) yannis@cs.ucsd.edu
Recipient Sponsored Research Office:	University of California-San Diego 9500 GILMAN DR LA JOLLA CA US 92093-0021 (858)534-4896
Sponsor Congressional District:	50
Primary Place of Performance:	University of California-San Diego CA US 92093-0404
Primary Place of Performance Congressional District:	50
Unique Entity Identifier (UEI):	UYTTZT6G9DT1
Parent UEI:
NSF Program(s):	Info Integration & Informatics
Primary Program Source:	01001213DB NSF RESEARCH & RELATED ACTIVIT 01001314DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):	7923, 7364, 9251
Program Element Code(s):	736400
Award Agency Code:	4900
Fund Agency Code:	4900
Assistance Listing Number(s):	47.070

ABSTRACT

The goal of this project is to provide an efficient platform for browser-based, data-driven web application computation. The platform enables cost-effective development of fast-responding applications that adjust well to accesses from mobile clients (e.g., smartphones, tablets). The project achieves its goal using the following approaches: (1) designing novel high-level, location-transparent, declarative, data-driven languages that require much lower coding effort than direct HTML5 coding in order to specify the business process and data access of the applications; (2) developing an optimizer for low latency query execution plans that utilize browser-based storage and asynchronous computation; (3) developing an action scheduler that optimizes the location and execution order of the actions described in the declarative language; (4) developing a user/action concurrency control theory and a dependency analysis algorithm so that the user can view and act while prior actions are still being computed; (5) prototyping an application-enabling platform that encompasses the developed languages, algorithms and optimizations; and (6) evaluating the effectiveness of the platform in two aspects: how much it reduces latency and how much it reduces the coding effort.

The project's research will have a significant impact on mobile-accessible, data-driven web applications, which, by being written in the proposed automatically optimized, declarative languages, will enjoy both low latency and low development cost. The project supports graduate and undergraduate students. Lectures on the research results will be incorporated into the PI's undergraduate-level course on web application development. Publications, software, an online service and experimental data from this research will be disseminated via the project web site (http://www.db.ucsd.edu/browserbasedforward).

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

H. V. Jagadish, Johannes Gehrke, Alexandros Labrinidis, Yannis Papakonstantinou, Jignesh M. Patel, Raghu Ramakrishnan, Cyrus Shahabi "Big data and its technical challenges" Communications of the ACM, 2014

Papakonstantinou, Yannis "Polystore Query Rewriting: The Challenges of Variety" CEUR 2016 (workshop of EDBT/ICDT), 2016

Papakonstantinou, Yannis "Semistructured Models, Queries and Algebras in the Big Data Era (Tutorial)" ACM SIGMOD, 2016

Yannis Katsis, Kian Win Ong, Yannis Papakonstantinou, Kevin Keliang Zhao "Utilizing IDs to Accelerate Incremental View Maintenance" ACM SIGMOD, 2015, p.1985

Yupeng Fu, Kian Win Ong, Yannis Papakonstantinou, Erick Zamora "FORWARD: Data-Centric UIs using Declarative Templates that Efficiently Wrap Third-Party JavaScript Components" PVLDB, 2014

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

It is well known that implementing a data-driven application requires time and money. As a result, organizations, corporations and the government often lack the data analytics and management applications they need, because budget limitations and the pace of business requirements do not allow those applications to be built on time.

A major source of the cost of building data-driven applications is the fact that the application has to collect data from multiple sources, combine them and provide them in the appropriate format to the visualization components. It is both a blessing and a curse of the Big Data era that this source of cost is increasing. The Big Data era is characterized by a great diversity of databases. Besides the customary SQL databases, nowadays data are also found in NoSQL, NewSQL and SQL-on-Hadoop databases. Furthermore, interesting data are also found on the client device (typically a smartphone) and its browser. Finally, the plethora of visualization components creates the need for easily adjusting the results to the formats that the visualizations need.

Semistructured data (such as JSON) are important both as input in NoSQL databases and, in the case of JSON in particular, as the logical representation of the visualization input in MVVM architectures. The project therefore created the SQL++ query language, which seamlessly accesses both SQL and semistructured data. While the SQL++ idea (as an SQL extension for semistructured data) pre-existed, the project focused on providing the full semistructured data functionality required: namely, accounting for the potential lack of schema and enabling arbitrary inputs, outputs and transformations as powerful as the ones that XQuery achieved. Unlike XQuery, this project was committed to producing an SQL-compatible language, since this is what the majority of developers understand.

Per the original project objective, the project created an SQL++ distributed query processing engine, including the ability to refer to data on either the browser or the server. Furthermore, the project addressed the case of live data by developing Incremental View Maintenance techniques that ensure that the views offered to the users are up-to-date, i.e., they reflect the state of the underlying databases.
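
As a rough illustration of the incremental view maintenance idea, the following is a minimal Python sketch, not the project's algorithm. It assumes a toy view that stores the total amount per group, and every name in it (SumByGroupView, insert, delete, update, recompute) is hypothetical. Each change to the base data is applied to the view as a small delta, so the view stays consistent with the data without being recomputed from scratch.

```python
from collections import defaultdict


class SumByGroupView:
    """Toy materialized view: total amount per group, maintained by deltas.

    Hypothetical illustration only; not the project's implementation.
    """

    def __init__(self):
        self.rows = {}                  # base data: row id -> (group, amount)
        self.totals = defaultdict(int)  # view contents: group -> sum of amounts
        self.counts = defaultdict(int)  # rows per group, used to drop empty groups

    def insert(self, row_id, group, amount):
        if row_id in self.rows:
            raise KeyError(f"row {row_id!r} already exists")
        self.rows[row_id] = (group, amount)
        self.totals[group] += amount    # apply only the delta to the view
        self.counts[group] += 1

    def delete(self, row_id):
        group, amount = self.rows.pop(row_id)
        self.totals[group] -= amount
        self.counts[group] -= 1
        if self.counts[group] == 0:     # the group no longer has any rows
            del self.totals[group]
            del self.counts[group]

    def update(self, row_id, new_amount):
        group, old_amount = self.rows[row_id]
        self.rows[row_id] = (group, new_amount)
        self.totals[group] += new_amount - old_amount

    def view(self):
        """Return the incrementally maintained view."""
        return dict(self.totals)

    def recompute(self):
        """Rebuild the view from the base data, for comparison only."""
        fresh = defaultdict(int)
        for group, amount in self.rows.values():
            fresh[group] += amount
        return dict(fresh)
```

In this sketch, recompute scans all of the base data, while insert, delete and update each touch a single group. The row ids are what let delete and update locate the affected view entry directly.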

The project expanded SQL++ into a Configurable SQL++ that formally captures the differences among the semistructured query languages of today's many NoSQL, NewSQL and SQL-on-Hadoop databases. Configurable SQL++ thus became a useful tool for formally surveying the semantic differences between these languages. We disseminated Configurable SQL++ to the industry in order to create a dialog and a common understanding that will eventually lead to clean and formal extensions of the SQL standard that are appropriate for NoSQL, NewSQL and SQL-on-Hadoop. The response of the database industry to SQL++ and Configurable SQL++ has been excellent. A number of NoSQL databases have adopted SQL++, and we anticipate that it will further influence the database industry.

Finally, Configurable SQL++ plays an internal role in the distributed query processor that accesses multiple sources of semistructured data. Recall that these sources are very diverse in the query languages that they use. The distributed query processor therefore needs to be able to interact with these diverse languages and rewrite the application requests into the languages the sources understand. Configurable SQL++ brings in a formal definition of the aspects in which the various languages differ, which makes it much easier for the distributed query processor to translate across these languages.
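
To make the notion of a configurable semantic difference concrete, here is a minimal Python sketch under stated assumptions. It is not the Configurable SQL++ definition: the option name on_absent_attribute, its three values, and the names DialectConfig, MISSING and navigate are all hypothetical. The sketch captures one plausible point of divergence between languages, namely what a path expression yields when a schemaless record lacks the requested attribute.

```python
from dataclasses import dataclass

MISSING = object()  # hypothetical sentinel for "attribute absent", distinct from null


@dataclass(frozen=True)
class DialectConfig:
    """Hypothetical description of one semantic choice made by a query language."""
    name: str
    on_absent_attribute: str  # one of "null", "missing", "error"

    def __post_init__(self):
        if self.on_absent_attribute not in ("null", "missing", "error"):
            raise ValueError(f"unknown option: {self.on_absent_attribute!r}")


def navigate(record, path, config):
    """Follow a list of attribute names through nested dicts.

    The behavior on an absent attribute is taken from the configuration
    rather than hard-coded into the evaluator.
    """
    current = record
    for key in path:
        if isinstance(current, dict) and key in current:
            current = current[key]
            continue
        if config.on_absent_attribute == "null":
            return None
        if config.on_absent_attribute == "missing":
            return MISSING
        raise KeyError(f"attribute {key!r} not found under dialect {config.name}")
    return current
```

For example, with the record {"name": "Ann", "address": {"city": "La Jolla"}}, navigating to ["address", "zip"] returns None under a configuration set to "null", returns MISSING under "missing", and raises KeyError under "error". A translator that knows which option each target language uses can decide whether a rewritten request needs extra checks to preserve the intended result.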

Last Modified: 08/11/2017
Modified by: Yannis Papakonstantinou

Please report errors in award information by writing to: awardsearch@nsf.gov.
