Award Abstract # 1512932
RAPID: Modeling Ebola Spread and Developing Decision Support System Using Big Data Analytics

NSF Org: CNS
Division Of Computer and Network Systems
Recipient: FLORIDA ATLANTIC UNIVERSITY
Initial Amendment Date: April 13, 2015
Latest Amendment Date: April 13, 2015
Award Number: 1512932
Award Instrument: Standard Grant
Program Manager: Rita Rodriguez
CNS
 Division Of Computer and Network Systems
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: April 15, 2015
End Date: August 31, 2016 (Estimated)
Total Intended Award Amount: $100,000.00
Total Awarded Amount to Date: $100,000.00
Funds Obligated to Date: FY 2015 = $100,000.00
History of Investigator:
  • Borko Furht (Principal Investigator)
    borko@cse.fau.edu
  • Hari Kalva (Co-Principal Investigator)
  • Ankur Agarwal (Co-Principal Investigator)
Recipient Sponsored Research Office: Florida Atlantic University
777 GLADES RD
BOCA RATON
FL  US  33431-6424
(561)297-0777
Sponsor Congressional District: 23
Primary Place of Performance: Florida Atlantic University
777 Glades Road
Boca Raton
FL  US  33431-0991
Primary Place of Performance Congressional District: 23
Unique Entity Identifier (UEI): Q266L2NDAVP1
Parent UEI:
NSF Program(s): Special Projects - CNS,
IUCRC-Indust-Univ Coop Res Ctr
Primary Program Source: 01001516DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 001Z, 1640, 5761, 7914
Program Element Code(s): 171400, 576100
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

This project aims to help address urgent public health problems of national and global significance, specifically the spread of the Ebola virus, by advancing the state of the art in computer science, big data analytics, data visualization techniques, and decision support systems. The effort develops computational models that predict the spread of Ebola in two directions: 'forward simulation' follows the propagation of the infection from a given patient into the community, while backward tracing aims to trace a number of the verified infections to patient 'zero.' The work utilizes big data analytic techniques together with data about underlying personal relationships, health center locations, and the known mechanisms for the spread of the Ebola virus.

The project connects directly to Florida International University (FIU)'s TerraFly system, a web-enabled system designed to aid in the visualization of spatial and remotely sensed data. The system allows users to 'fly' with fine resolution over the surface of the earth to explore various kinds of data (e.g., local information, street maps, aerial photography, and satellite imagery). The Ebola spread patterns are then fed into a Decision Support System (DSS). The inputs to the DSS also include information about social groups or individual persons. Based on spread patterns, the DSS will then calculate probabilities for a social group or a given person to get infected with Ebola. The system will be able to present data mashups to operators responding to hotline calls and to field workers encountering patients and deciding about triage. The data will also be presented in report form to responsible government agencies. This time-sensitive project necessitates prompt collection and analysis of data on the spread of the Ebola virus in order to enable the development of the correct models.

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

In this project, we developed a model of Ebola spread by using innovative big data analytics techniques and tools. We used massive amounts of data from various sources, including Twitter feeds, Facebook, and Google. This data is fed into a decision support system that models the spread pattern of the Ebola virus and creates dynamic graphs and predictive diffusion models of the outcome and impact on either a specific person or a specific community. As a result of this research, computational spread models for Ebola in the U.S. were created, potentially leading to more precise forward predictions of the disease propagation and to tools that help identify individuals who are possibly infected and perform trace-back analysis to locate the possible source of infection for a particular social group. Besides collaborating with FIU, we also collaborated closely with LexisNexis (LN), a leading big data company and a member of our I/UCRC for Advanced Knowledge Enablement. LexisNexis provided a large amount of data about relationships among people in the U.S., and we combined it with data analytics techniques and tools to model disease spread patterns. In this part of the research we used the cloud computing system located in our College at FAU as well as LN's High-Performance Computing Cluster (HPCC), which is intended for big data applications. We performed modeling, analytics, and development of a Decision Support System (DSS), which provides a probabilistic outcome of Ebola impact on either a specific person or a community at a specific location. This information is then fed to FIU's TerraFly system for geospatial mapping and other services.

Jointly with the LN research team, we created people clusters based on proximity and built a model using weighted scores that approximate physical contacts. In creating the people clusters, we used a public records graph to calculate distances between an affected person and his or her relatives and friends. Based on this model, we developed a disease propagation path.
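As a rough illustration of the clustering idea above, the following Python sketch scores people by their distance from an affected person in a relationship graph. It is a minimal sketch under stated assumptions, not the project's model: the graph, the names, the edge weights, the multiplicative decay along a path, and the threshold are all hypothetical.

```python
# Hypothetical relationship graph: person -> {neighbor: contact weight in (0, 1]}.
# Names and weights are illustrative only, not taken from the project data.
GRAPH = {
    "patient": {"spouse": 0.9, "coworker": 0.4},
    "spouse": {"patient": 0.9, "sibling": 0.6},
    "coworker": {"patient": 0.4, "friend": 0.5},
    "sibling": {"spouse": 0.6},
    "friend": {"coworker": 0.5},
}


def contact_scores(graph, source, max_hops=2):
    """Score everyone within max_hops of source.

    Assumption: a person's score is the product of edge weights along a
    shortest-hop path from the source, keeping the best such path.
    """
    scores = {source: 1.0}
    frontier = {source}
    for _ in range(max_hops):
        next_frontier = {}
        for person in frontier:
            for neighbor, weight in graph.get(person, {}).items():
                if neighbor in scores:
                    continue  # already reached by a shorter path
                candidate = scores[person] * weight
                if candidate > next_frontier.get(neighbor, 0.0):
                    next_frontier[neighbor] = candidate
        scores.update(next_frontier)
        frontier = set(next_frontier)
    del scores[source]
    return scores


def people_cluster(graph, source, threshold=0.5, max_hops=2):
    """People whose contact score meets a (hypothetical) threshold."""
    scores = contact_scores(graph, source, max_hops)
    return {person for person, score in scores.items() if score >= threshold}
```

With the sample graph, people_cluster(GRAPH, "patient") keeps the spouse and the sibling and drops the weaker coworker and friend links.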

This work represents an improvement over the previous state of the art because we used innovative data analytics techniques and the latest HPCC technology to develop models of Ebola spread. Mathematical compartmental models have been applied to predict the behavior of disease outbreaks in many studies. These models aim to understand the dynamics of a disease propagation process and focus on partitioning the population into several health states. With information from multiple sources indicating infected individuals and their personal relationships and social groups, dynamic graphs can be created, and predictive diffusion models can be used to study key issues of Ebola epidemics, e.g., the location, time, and number of expected new cases. Two fundamental diffusion models are the Independent Cascade (IC) model and the Linear Threshold (LT) model, both of which follow an iterative diffusion process in which infected nodes infect their uninfected neighbors with certain probabilities. Building on these fundamental models, we developed advanced propagation models that estimate an influence function by examining past and newly infected nodes and predict subsequent infections. Our program is designed to identify and visualize families and tightly connected social groups who have had some contact with Ebola patients. Tracking and containing this disease requires enormous resources. Our system provides a proactive approach to reducing the risk of Ebola exposure and spread within a community or a geographic location.
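The Independent Cascade model mentioned above can be sketched in a few lines of Python. This is the textbook form of the model, not the advanced propagation model developed in the project; the single uniform infection probability and the contact graph passed in are assumptions made for illustration.

```python
import random


def independent_cascade(graph, seeds, prob, rng=None):
    """Simulate one run of the Independent Cascade model.

    graph: dict mapping each node to an iterable of its neighbors.
    seeds: initially infected nodes.
    prob:  assumed uniform probability that a newly infected node infects
           a given uninfected neighbor (each pair is tried only once).
    Returns a list of sets, one per round, with the nodes infected in it.
    """
    rng = rng or random.Random()
    infected = set(seeds)
    frontier = sorted(infected)
    rounds = [set(infected)]
    while frontier:
        newly_infected = set()
        for node in frontier:
            for neighbor in graph.get(node, ()):
                if neighbor in infected or neighbor in newly_infected:
                    continue
                if rng.random() < prob:
                    newly_infected.add(neighbor)
        if not newly_infected:
            break
        infected |= newly_infected
        rounds.append(newly_infected)
        # Only nodes infected in this round may spread in the next one.
        frontier = sorted(newly_infected)
    return rounds
```

Averaging the number of infected nodes over many runs with different seeds of the random generator gives an estimate of the expected outbreak size for a given starting patient.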

There are several end-user products based on our research. The main objective is to feed the Ebola spread patterns obtained from our models into a Decision Support System (DSS). The DSS also takes as input information about social groups or individual persons. Social groups can be, for example, nurses and doctors who had contact with an Ebola patient, passengers who travelled on the same plane as an Ebola patient, or family members of an Ebola patient. Based on spread patterns, the DSS then calculates the probability that a social group or a given person becomes infected with Ebola.
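The report does not state how the DSS combines exposures into a probability. One common simplification, shown here purely as an assumption, treats each exposure as an independent chance of transmission; the function names and input values are hypothetical.

```python
import math


def person_probability(exposure_probs):
    """Probability that a person is infected by at least one exposure.

    Assumption: exposures are independent, so the person escapes infection
    only if every single exposure fails to transmit.
    """
    return 1.0 - math.prod(1.0 - p for p in exposure_probs)


def group_summary(member_probs):
    """Summarize a social group from its members' infection probabilities.

    Returns (probability that at least one member is infected,
             expected number of infected members), again assuming
    independence between members.
    """
    at_least_one = 1.0 - math.prod(1.0 - p for p in member_probs)
    expected_cases = sum(member_probs)
    return at_least_one, expected_cases
```

For example, a nurse with two independent exposures of probability 0.5 each would get person_probability([0.5, 0.5]) = 0.75 under this simplification.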

The mobile interface will further allow people to enter signs of Ebola such as fever (specifically above 100.4°F), chills, headache, vomiting, myalgia, and intense weakness. Because these signs are also common in other diseases such as malaria and typhoid, overlaying the reported symptoms on geo-coordinates that show whether the person is in geographic proximity to a person or community affected by Ebola will provide more focused results with higher accuracy.
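The overlay of symptoms and location described above could look roughly like the following Python sketch. It uses the standard haversine great-circle distance; the 10 km radius, the function names, and the three result labels are hypothetical choices for illustration, and only the 100.4°F threshold and the list of signs come from the report.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius
FEVER_THRESHOLD_F = 100.4  # fever threshold quoted in the report
OTHER_SIGNS = {"chills", "headache", "vomiting", "myalgia", "intense weakness"}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def screen_report(temp_f, signs, location, case_locations, radius_km=10.0):
    """Combine reported signs with proximity to known cases.

    location and case_locations are (latitude, longitude) pairs;
    radius_km is a hypothetical proximity cutoff.
    """
    has_signs = temp_f > FEVER_THRESHOLD_F or bool(OTHER_SIGNS & set(signs))
    near_case = any(
        haversine_km(*location, *case) <= radius_km for case in case_locations
    )
    if has_signs and near_case:
        return "elevated"
    if has_signs:
        return "symptoms only"
    return "no flag"
```

A report with fever but no nearby case is labeled "symptoms only", which reflects the point that the same signs also occur with malaria or typhoid.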

Predicting outbreaks of the deadly Ebola virus (or any other epidemic) and directing potential victims of the disease to the nearest suitable medical facility can have a tremendous economic impact. This research can also help indicate the areas in which new facilities should be opened, where disease outbreaks are beginning to occur, and how they are likely to expand.

Last Modified: 11/02/2016
Modified by: Borko Furht

Please report errors in award information by writing to: awardsearch@nsf.gov.
