Award Abstract # 0712836
CAREER: Evolving and Self-Managing Data Integration Systems

NSF Org: IIS
Division of Information & Intelligent Systems
Recipient: UNIVERSITY OF WISCONSIN SYSTEM
Initial Amendment Date: March 19, 2007
Latest Amendment Date: June 16, 2009
Award Number: 0712836
Award Instrument: Continuing Grant
Program Manager: Xiaoyang Wang
IIS
 Division of Information & Intelligent Systems
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: August 31, 2006
End Date: May 31, 2010 (Estimated)
Total Intended Award Amount: $0.00
Total Awarded Amount to Date: $238,115.00
Funds Obligated to Date: FY 2006 = $38,115.00
FY 2007 = $100,000.00
FY 2008 = $100,000.00
History of Investigator:
  • AnHai Doan (Principal Investigator)
    anhai@cs.wisc.edu
Recipient Sponsored Research Office: University of Wisconsin-Madison
21 N PARK ST STE 6301
MADISON
WI  US  53715-1218
(608)262-3822
Sponsor Congressional District: 02
Primary Place of Performance: University of Wisconsin-Madison
21 N PARK ST STE 6301
MADISON
WI  US  53715-1218
Primary Place of Performance Congressional District: 02
Unique Entity Identifier (UEI): LCLSJAGTNZQ7
Parent UEI:
NSF Program(s): INFORMATION & KNOWLEDGE MANAGE,
Info Integration & Informatics
Primary Program Source: app-0106
app-0107
01000809DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1045, 9218, HPCC
Program Element Code(s): 685500, 736400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Data integration systems provide uniform access to a multitude of data sources. They have the potential to revolutionize the way we access data, and to provide a basis on which to build even more advanced information processing architectures. However, such systems are still extremely hard to build and costly to maintain. They must be told in tedious detail how to interact with data sources, and must be constantly modified to deal with changes at the sources. To address this problem, the project envisions building data integration systems that learn to evolve and self-manage over time, with minimal human intervention. To make fundamental contributions toward realizing this vision, the project employs database and artificial intelligence (especially machine learning) techniques to attack the following central challenges: (a) effectively automating key labor-intensive tasks, including schema matching, global schema creation, and duplicate detection; (b) detecting system failures due to changes at the sources, with minimal human intervention; and (c) further reducing the tremendous data integration burden on system administrators by spreading that burden thinly over the mass of users.

The education plan leverages the research to prepare students and the broader community for the novel data management challenges raised by the Internet world. In terms of intellectual merit, the project takes a next logical step in data integration research. It brings conceptually novel solutions to fundamental issues underlying virtually any data integration or sharing effort. The project results have the potential for autonomic-computing applications. In terms of broader impacts, the project will facilitate the widespread deployment of data integration systems, thus resulting in more effective information management and access for society. It plays an integral part in educating next-generation professional workers and researchers. The research will also help integrate data for rural Illinois firefighters, and train them to access and use the integrated information systems. The project information will be disseminated via publications, workshops, tutorials, and the Web site http://www.cs.wisc.edu/~anhai/projects/career.html, which will include the research results, data, and system artifacts.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

A. Doan, P. Bohannon, R. Ramakrishnan, X. Chai, P. DeRose, B. Gao, W. Shen "User-Centric Research Challenges in Community Information Management Systems" IEEE Data Engineering Bulletin, special issue on data management in social networks, 2007
A. Doan, R. Ramakrishnan, F. Chen, P. DeRose, Y. Lee, R. McCann, M. Sayyadian, W. Shen "Community Information Management" IEEE Data Engineering Bulletin, special issue on probabilistic databases, v.29, 2006, p.64
Yoonkyong Lee, Mayssam Sayyadian, AnHai Doan, Arnon Rosenthal "eTuner: Tuning Schema Matching Software using Synthetic Scenarios" VLDB Journal, special issue on best papers of VLDB-05, 2006
