Award Abstract # 1143807
EAGER: Discovering Emerging Events in Social Media

NSF Org: IIS
Division of Information & Intelligent Systems
Recipient: UNIVERSITY OF WISCONSIN SYSTEM
Initial Amendment Date: July 27, 2011
Latest Amendment Date: July 30, 2012
Award Number: 1143807
Award Instrument: Continuing Grant
Program Manager: Maria Zemankova
IIS
 Division of Information & Intelligent Systems
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2011
End Date: August 31, 2014 (Estimated)
Total Intended Award Amount: $150,000.00
Total Awarded Amount to Date: $150,000.00
Funds Obligated to Date: FY 2011 = $75,000.00
FY 2012 = $75,000.00
History of Investigator:
  • AnHai Doan (Principal Investigator)
    anhai@cs.wisc.edu
Recipient Sponsored Research Office: University of Wisconsin-Madison
21 N PARK ST STE 6301
MADISON
WI  US  53715-1218
(608)262-3822
Sponsor Congressional District: 02
Primary Place of Performance: Department of Computer Sciences
1210 W. Dayton St
Madison
WI  US  53706-1685
Primary Place of Performance Congressional District: 02
Unique Entity Identifier (UEI): LCLSJAGTNZQ7
Parent UEI:
NSF Program(s): Info Integration & Informatics
Primary Program Source: 01001112DB NSF RESEARCH & RELATED ACTIVIT
01001213DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7364, 7916
Program Element Code(s): 736400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Social media have become increasingly critical in many domains, such as commerce, disaster management, science, and national security. In these domains, applications often have to integrate social media to detect emerging events. Today, however, few solutions for event detection have been developed, and those that exist suffer from several important limitations. This exploratory project addresses these limitations and develops a solution that effectively integrates social media to detect emerging events. The solution will focus on the Twittersphere and will address the following three key challenges: (a) how to exploit characteristics unique to social media to improve the accuracy of detecting events, (b) how to design the solution so that it scales to high-speed streams of social media (such as 1500 tweets per second), and (c) how to leverage crowdsourcing to find truly interesting events and extract attributes of these events.

The project will be among the first to explore in depth how to integrate social media to detect emerging events, taking into account social media characteristics. As such, it is a high-risk/high-payoff project that can open the door to novel research directions and help accelerate research into social media integration, an increasingly critical problem that affects many areas of society. If successful, the project can also help build practical event discovery tools that can have immediate impact. Finally, the project will help train a Ph.D. student for two years, and help build and release a set of infrastructure tools and testbeds that can accelerate subsequent research into social media integration, for both the PI's group and other research groups in social media. The project information will be disseminated via publications, workshops, tutorials, and the Web site (http://www.cs.wisc.edu/~anhai/projects/event-detection.html), which will include the research results, data, and system artifacts.

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Social media (e.g., Twitter, Facebook, YouTube, blogs) has now become ubiquitous. It plays an increasingly critical role in many domains, including commerce, disaster management, science, and national security. For example, it fostered the revolutions of the Arab Spring, transformed Groupon into a social commerce phenomenon, and helps scientists monitor water quality, among others. In these domains, applications often have to integrate social media data to detect emerging events. Examples of such events include a planned protest in a city square, a discovered defect of a newly released product, an earthquake that just happened in a remote area, and an emerging algae bloom in a lake. Despite the obvious importance of detecting such events, today few solutions for event detection have been proposed, and these solutions often do not work well because they do not take into account the unique characteristics of social media.


In this project we have developed a solution that effectively integrates social media data to detect emerging events. This solution focuses on the Twittersphere, and addresses the following three key challenges:

First, social media data is often noisy, making it hard to accurately detect events. To address this challenge, the solution exploits characteristics unique to social media. Examples include temporal dynamics (e.g., certain keywords suddenly become popular), correlations across social media sites (e.g., the same keywords are mentioned in both Twitter and YouTube), and current contexts (e.g., a tweet mentioning Assad is likely to refer to the Syrian president, given the current Syrian protests). Prior solutions have not exploited such characteristics.
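
As an illustration of the temporal-dynamics signal mentioned above (keywords that suddenly become popular), the following is a minimal sketch, not the project's actual method. It assumes tweets have already been tokenized and grouped into fixed time windows; the function name `bursty_keywords` and the parameters `history`, `threshold`, and `min_count` are hypothetical choices made for this example.

```python
from collections import Counter, deque


def bursty_keywords(windows, history=3, threshold=3.0, min_count=5):
    """Flag keywords whose count in a window jumps well above their recent average.

    windows:   list of token lists, one list per consecutive time window (assumed input format)
    history:   number of preceding windows used as the baseline (hypothetical parameter)
    threshold: required ratio between the current count and the smoothed baseline
    min_count: ignore keywords that are too rare to be meaningful
    Returns a list of (window_index, keyword) pairs.
    """
    past = deque(maxlen=history)  # counts from the most recent `history` windows
    bursts = []
    for i, tokens in enumerate(windows):
        counts = Counter(tokens)
        if len(past) == history:  # only score once a full baseline is available
            for word, current in counts.items():
                baseline = sum(p[word] for p in past) / history
                # The +1 smooths the baseline so brand-new words do not divide by zero
                if current >= min_count and current >= threshold * (baseline + 1):
                    bursts.append((i, word))
        past.append(counts)
    return bursts


steady = ["game"] * 2 + ["quake"] * 2
windows = [steady, steady, steady, ["game"] * 2 + ["quake"] * 12]
result = bursty_keywords(windows)
```

Here "quake" is flagged in the last window because its count rises from 2 per window to 12, while "game" stays flat. A real detector would also need the cross-site and context signals described above, which this sketch does not attempt.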

The second challenge is that social media data often comes as fast streams (e.g., 1500 tweets per second). Prior solutions do not scale to such streams. We have developed a solution that scales, using two key ideas: moving as much of the computation as possible offline, and processing the streams on a cluster of machines in a distributed Map-Reduce style.
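
The two scaling ideas above can be sketched roughly as follows. This is an illustrative single-process sketch under stated assumptions, not the published system: the precomputed vocabulary `TRACKED` stands in for whatever work is done offline, and the functions `map_partition`, `reduce_counts`, and `count_tracked_terms` are hypothetical names. In a real deployment each partition would be handled by a separate machine.

```python
from collections import Counter
from functools import reduce

# Offline step (assumed): a vocabulary of terms worth tracking is built ahead of time,
# so the online path only needs a cheap set lookup per token.
TRACKED = frozenset({"earthquake", "protest", "recall"})


def map_partition(tweets):
    """Map step: count tracked terms within one partition of the stream."""
    counts = Counter()
    for text in tweets:
        for token in text.lower().split():
            if token in TRACKED:
                counts[token] += 1
    return counts


def reduce_counts(left, right):
    """Reduce step: merge the partial counts from two partitions."""
    return left + right


def count_tracked_terms(partitions):
    """Run the map step on every partition, then fold the partial results together."""
    return reduce(reduce_counts, map(map_partition, partitions), Counter())


partitions = [
    ["Earthquake near coast", "protest downtown"],
    ["earthquake felt here"],
]
totals = count_tracked_terms(partitions)
```

Because the reduce step is a plain sum of counters, partial results can be merged in any order, which is what makes this style of computation easy to distribute.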

This project was among the first to explore in depth how to integrate social media to detect emerging events, taking into account social media characteristics. As such, it is a high-risk/high-payoff project that can open the door to novel research directions and help accelerate research into social media integration, an increasingly critical problem that affects many areas of society. The project can also help build practical event discovery tools that can have immediate impact.

Concrete outcomes of this project are: 

+ We have developed solutions to detect and monitor events in social media. These solutions are being described in two papers to be submitted soon.

+ We have developed a solution to process data streams (such as tweet streams) at high speed. Such processing is necessary for effective event detection and monitoring in social media. This solution has been published at VLDB 2012.

+ A paper summarizing our approach to social media analytics was published in IEEE Data Engineering Bulletin in 2013. Another paper describing our approach to process tweets was published in VLDB 2013.

+ A textbook on data integration (discussing also social media integration topics) was published by Morgan Kaufmann in 2012.

+ Two students were trained on this grant.

 


Last Modified: 06/03/2015
Modified by: Anhai Doan

Please report errors in award information by writing to: awardsearch@nsf.gov.
