Award Abstract # 9978567
Efficient Query Processing for Data Integration

NSF Org: IIS Division of Information & Intelligent Systems
Recipient: UNIVERSITY OF WASHINGTON
Initial Amendment Date: September 29, 1999
Latest Amendment Date: July 5, 2001
Award Number: 9978567
Award Instrument: Continuing Grant
Program Manager: Maria Zemankova
IIS Division of Information & Intelligent Systems
CSE Directorate for Computer and Information Science and Engineering
Start Date: October 1, 1999
End Date: September 30, 2002 (Estimated)
Total Intended Award Amount: $226,000.00
Total Awarded Amount to Date: $226,000.00
Funds Obligated to Date: FY 1999 = $148,000.00
FY 2001 = $78,000.00
History of Investigator:
  • Alon Halevy (Principal Investigator)
    alon@cs.washington.edu
Recipient Sponsored Research Office: University of Washington
4333 BROOKLYN AVE NE
SEATTLE
WA  US  98195-1016
(206)543-4043
Sponsor Congressional District: 07
Primary Place of Performance: University of Washington
4333 BROOKLYN AVE NE
SEATTLE
WA  US  98195-1016
Primary Place of Performance Congressional District: 07
Unique Entity Identifier (UEI): HD1WMN6945W6
Parent UEI:
NSF Program(s): INFORMATION & KNOWLEDGE MANAGE
Primary Program Source: app-0101 
app-0199 
Program Reference Code(s): 9216, HPCC
Program Element Code(s): 685500
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

The goal of this research project is to develop efficient query optimization and query execution methods for data integration systems. Data integration systems provide uniform access to a multitude of autonomous data sources within an enterprise or on the World-Wide Web. Unlike in traditional database applications, a query execution engine for data integration must cope with limited availability of statistics on the underlying data and with unexpected network delays during query execution. The approach consists of developing an adaptive query execution engine for data integration. Two kinds of adaptivity are considered: (1) interleaving query optimization and execution, and (2) developing novel query execution operators that are tailored to data integration. For the first, algorithms for determining appropriate points at which to suspend query optimization are considered. For the second, novel join implementations are considered (e.g., the double-pipelined join), as well as operators that are needed only in the data integration context (e.g., dynamic collectors that perform unions over large collections of sources). Issues involving the integration of semi-structured data (e.g., XML) are also addressed. The results of the research include the implemented Tukwila data integration system, which will be made available to other researchers in the field. The impact of the research will be to remove the performance bottleneck that hinders fielding data integration systems in the WWW and enterprise contexts. We will be able to process data integration queries involving tens of megabytes of data coming from external sources in real time.
http://data.cs.washington.edu/integration/tukwila
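
The abstract names the double-pipelined join without describing it. As an illustration only, the sketch below assumes the commonly described symmetric hash join formulation: each input keeps its own hash table, and every arriving tuple is first stored and then probed against the other input's table, so results are emitted as soon as both matching tuples have arrived and a slow source does not block output. The function name, the ("left"/"right", row) arrival encoding, and the sample rows are hypothetical and are not taken from Tukwila.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Iterator, Tuple

Row = Dict[str, Any]


def double_pipelined_join(
    arrivals: Iterable[Tuple[str, Row]], key: str
) -> Iterator[Tuple[Row, Row]]:
    """Join two inputs on `key`, consuming tuples in arrival order.

    `arrivals` yields ("left" | "right", row) pairs in whatever order the
    two sources happen to deliver them (a hypothetical encoding).
    """
    # One in-memory hash table per input, keyed by the join attribute.
    tables = {"left": defaultdict(list), "right": defaultdict(list)}
    for side, row in arrivals:
        other = "right" if side == "left" else "left"
        k = row[key]
        # Store the new tuple so later arrivals on the other side can find it.
        tables[side][k].append(row)
        # Probe the other side's table and emit matches immediately.
        for match in tables[other].get(k, []):
            yield (row, match) if side == "left" else (match, row)


# Tuples from the two sources arrive interleaved.
arrivals = [
    ("left", {"id": 1, "name": "a"}),
    ("right", {"id": 1, "city": "x"}),
    ("right", {"id": 2, "city": "y"}),
    ("left", {"id": 2, "name": "b"}),
]
results = list(double_pipelined_join(arrivals, "id"))
```

This sketch holds both hash tables entirely in memory; what to do when the inputs exceed available memory is not covered here.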
