
NSF Org: |
IIS Division of Information & Intelligent Systems |
Recipient: |
|
Initial Amendment Date: | July 27, 2011 |
Latest Amendment Date: | July 27, 2011 |
Award Number: | 1144985 |
Award Instrument: | Standard Grant |
Program Manager: | Sylvia Spengler, sspengle@nsf.gov, (703)292-7347, IIS Division of Information & Intelligent Systems, CSE Directorate for Computer and Information Science and Engineering |
Start Date: | September 1, 2011 |
End Date: | August 31, 2014 (Estimated) |
Total Intended Award Amount: | $100,000.00 |
Total Awarded Amount to Date: | $100,000.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: |
1523 UNION RD RM 207 GAINESVILLE FL US 32611-1941 (352)392-3516 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
FL US 32611-2002 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): |
Info Integration & Informatics, Software Institutes |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
Modern multicore architectures, which deliver raw performance in the gigaflops-to-teraflops range, have deep memory hierarchies and low-overhead threading capabilities. Lack of support for directly exploiting these capabilities leads to severe under-utilization, especially for data-intensive applications. This project expects to develop methods that use the available computational power efficiently and thereby reduce the cost of large-scale data processing systems.
This project will develop a highly efficient computation framework called GLADE that will support a large class of data-intensive applications and will be based on a novel computational model called Generalized Linear Aggregates. The commutative and associative properties of Generalized Linear Aggregates facilitate highly efficient parallel and distributed computation as well as exploitation of deep memory hierarchies, especially when multiple queries are executed simultaneously, as is typical in many data-processing tasks. The resulting one to two orders of magnitude improvement in computational efficiency can be expected to yield a corresponding reduction in the cost and energy requirements of data processing tasks, which in turn will make it feasible to analyze much larger data sets than is currently possible.
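To illustrate why commutativity and associativity matter here, the following is a minimal Python sketch of an aggregate whose partial states can be merged in any order. It is not GLADE's actual interface: the class name AverageGLA, the method names accumulate, merge, and finalize, and the helper functions are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Iterable, List


@dataclass
class AverageGLA:
    """Hypothetical aggregate state for a mean; all names are illustrative."""
    total: float = 0.0
    count: int = 0

    def accumulate(self, value: float) -> None:
        # Fold one input value into this partition's local state.
        self.total += value
        self.count += 1

    def merge(self, other: "AverageGLA") -> "AverageGLA":
        # Combine two partial states. Addition is commutative and
        # associative, so merge order and grouping do not matter.
        return AverageGLA(self.total + other.total, self.count + other.count)

    def finalize(self) -> float:
        # Turn the merged state into the final answer.
        return self.total / self.count if self.count else 0.0


def run_partition(chunk: Iterable[float]) -> AverageGLA:
    # Each partition (thread, core, or node) builds its own state.
    state = AverageGLA()
    for value in chunk:
        state.accumulate(value)
    return state


def aggregate(chunks: List[List[float]]) -> float:
    # Partitions are processed sequentially here for simplicity; because
    # the states are independent, they could be computed in parallel.
    partials = [run_partition(chunk) for chunk in chunks]
    return reduce(lambda a, b: a.merge(b), partials, AverageGLA()).finalize()
```

Because each partition touches only its own state, the per-partition work needs no coordination, and the final result is the same regardless of how the data is split or in what order the partial states are merged.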
The proposed work will make the synergistic combination of high-performance computing and large-scale data analysis widely available to researchers and other interested groups in government, industry, and education. Enabling a large number of data-intensive applications on inexpensive computers that cost in the low tens of thousands of dollars will broaden the use of data analysis, exploration, and mining for a wide variety of existing and emerging applications. Examples of such applications include network intrusion detection, social network analysis, climate data analysis, ecosystem analysis, and customer relationship management. Additional information about the project can be found at: http://sites.google.com/site/sanjayranka/glade.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
The main goal of this project was to add advanced data processing and mining capabilities to DataPath in the form of an add-on set of libraries called GLADE. Secondary goals included the use of GLADE for general database research to advance the state of the art of exact and approximate query processing.
Intellectual Merit
GLADE significantly enhanced the data processing capabilities of DataPath. It is now possible to combine database processing, linear algebra, and data mining using a sophisticated set of abstractions such as Generalized Linear Aggregates, Generalized Transformers, and Generalized Iterative State Transformation. In terms of impact on database research, GLADE allowed us to pursue significant work on large Markov Chain Monte Carlo (MCMC) systems and sampling-based approximate query processing.
Broader Impact
GLADE, together with DataPath, the framework in which GLADE is implemented, forms the basis of GrokIt, a data processing framework developed by Tera Insights, LLC, a company founded by the PI (Dobra). GrokIt is already being used at the University of Florida to allow students to process large amounts of stock market data (a detailed history of the last 10 years of stock transactions containing 56.8 billion tuples) and at Infinite Energy for energy usage prediction. GLADE, through its commercial incarnation GrokIt, already has a significant impact on education and is used in several classes at the University of Florida (Database System Implementation, Advanced Data Science, Independent Study).
Last Modified: 01/09/2015
Modified by: Alin V Dobra