Award Abstract # 1643623
EAGER: Collaborative Research: Combining Community and Clinical Data for Augmenting Influenza Modeling

NSF Org: IIS
Division of Information & Intelligent Systems
Recipient: THE TRUSTEES OF COLUMBIA UNIVERSITY IN THE CITY OF NEW YORK
Initial Amendment Date: August 31, 2016
Latest Amendment Date: August 31, 2016
Award Number: 1643623
Award Instrument: Standard Grant
Program Manager: Wendy Nilsen
wnilsen@nsf.gov
(703)292-2568
IIS Division of Information & Intelligent Systems
CSE Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2016
End Date: August 31, 2018 (Estimated)
Total Intended Award Amount: $119,000.00
Total Awarded Amount to Date: $119,000.00
Funds Obligated to Date: FY 2016 = $119,000.00
History of Investigator:
  • Jeffrey Shaman (Principal Investigator)
    jls106@columbia.edu
Recipient Sponsored Research Office: Columbia University
615 W 131ST ST
NEW YORK
NY  US  10027-7922
(212)854-6851
Sponsor Congressional District: 13
Primary Place of Performance: Columbia University
630 W 168th Street
New York
NY  US  10032-3702
Primary Place of Performance Congressional District: 13
Unique Entity Identifier (UEI): F4N1QNPB95M4
Parent UEI:
NSF Program(s): Smart and Connected Health
Primary Program Source: 01001617DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7916, 8018
Program Element Code(s): 801800
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

This EAGER represents timely and essential exploratory work assessing the value of community-sourced data in infectious disease modeling efforts. Community-generated data can suffer from a lack of information about the reference population, which hinders prevalence estimates. In theory, real-time and near-real-time community-sourced data offer an important opportunity to improve the timeliness and scope of infectious disease modeling efforts, but there are still fundamental questions regarding the value of community infection data for understanding, monitoring and forecasting. To address these questions, this work will study how community and clinically generated data compare regarding measures of disease incidence, contributing population demographics, and spatio-temporal coverage in influenza dynamics. Public dissemination of our research and findings will help expose and educate the community in data generation and forecasting efforts.

This project involves a rigorous and systematic comparison between contemporaneous community and clinical data on acute respiratory infections. The first goal of this work is to generate a diverse community-sourced data set with a defined reference population. We will then assess the significance of outcomes between groups in community and clinical data, accounting for demographic and epidemiological factors. Dynamical modeling and Bayesian inference methods will be used to develop and augment disease forecasts. Normalized and municipal-scale estimates from the community samples will be integrated, and the data generation and modeling efforts will together be used to assess the impact of community data on real-time and near-real-time simulations and forecasts. This high-risk work is potentially paradigm shifting regarding how we collect and use data in forecasting methods for disease as well as a broader range of societal issues.

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The major goals of this project were to assess: (1) How community and clinically generated data compare regarding disease incidence, contributing population demographics, and spatio-temporal coverage; and (2) The value of community infection data for understanding, monitoring and forecasting seasonal and pandemic influenza dynamics.

Major activities in the first year included development of the means for data collection from the community and completing data collection and laboratory testing.  The second year focused on examining the collected data and developing analyses to identify how community infection data can contribute to understanding disease incidence and patterns. To accomplish the second goal, we also reached out to other groups performing influenza research and obtained data from their studies to combine with ours and perform a more comprehensive analysis.

At Columbia, our primary role was to process community samples and perform laboratory testing. Over the course of the project, we received 314 self-collected nasopharyngeal swab specimens. All specimens were analyzed using the GenMark respiratory virus panel (RVP) assay for 18 common respiratory viruses. Of the 314 samples, 84 were positive for 1 or more respiratory viruses, including single infections from adenovirus C (n=3), coronavirus 229E (n=9), coronavirus OC43 (n=9), coronavirus NL63 (n=2), rhinovirus (n=41), human metapneumovirus (n=1), influenza A/H3N2 (n=7), influenza B (n=1), influenza A/H1N1 (n=1), parainfluenza 1 (n=1), parainfluenza 3 (n=1), parainfluenza 4 (n=2), respiratory syncytial virus (RSV) A (n=1), and RSV B (n=3). There were 2 coinfections: RSV B and coronavirus 229E; RSV B and adenovirus C.
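The counts above can be cross-checked with a short tally. The following is a minimal Python sketch, not part of the project's analysis; the variable and function names (`single_infections`, `summarize`) are illustrative assumptions, and only the numbers come from the report.

```python
# Single-infection counts as reported in the paragraph above.
single_infections = {
    "adenovirus C": 3,
    "coronavirus 229E": 9,
    "coronavirus OC43": 9,
    "coronavirus NL63": 2,
    "rhinovirus": 41,
    "human metapneumovirus": 1,
    "influenza A/H3N2": 7,
    "influenza B": 1,
    "influenza A/H1N1": 1,
    "parainfluenza 1": 1,
    "parainfluenza 3": 1,
    "parainfluenza 4": 2,
    "RSV A": 1,
    "RSV B": 3,
}
coinfections = 2        # RSV B + coronavirus 229E; RSV B + adenovirus C
total_specimens = 314   # self-collected swabs received


def summarize(single, coinfected, total):
    """Return (number of positive specimens, fraction of specimens positive)."""
    positives = sum(single.values()) + coinfected
    return positives, positives / total


positives, positivity = summarize(single_infections, coinfections, total_specimens)
print(f"{positives} of {total_specimens} specimens positive ({positivity:.1%})")
```

The single infections sum to 82, which together with the 2 coinfections matches the 84 positive specimens reported, or roughly 27% of the 314 specimens.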

The Columbia team is additionally supporting the analyses by the NYU team on 2 fronts. Firstly, we are participating in and providing feedback on the ongoing analyses, which aim to demonstrate that different and complementary data provide a more comprehensive picture of global disease incidence than a single data form in isolation. The intent is to use different observations along the morbidity surveillance pyramid to understand how loss of information accumulates during surveillance. For example, the difference between the number of reported symptomatic infections and the number of those who actually seek health care represents ‘under-healthcare-reporting’, whereas the proportion of infections that are not symptomatic represents the ‘under-ascertained’. Analysis of these issues directly addresses the primary aims of this project.

Secondly, we have provided results from an alternate active surveillance project carried out separately with DoD funding by Dr. Shaman’s team.  Specifically, we have provided the results from nasopharyngeal swab samples taken during April-June 2016 and January-April 2017 from adult visitors at a major New York City tourist institution. This sampling generated specimens from 2685 participants; all samples were assayed using the same GenMark RVP protocol (185 were virus positive).
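For a rough sense of scale, the positivity in this visitor cohort can be set beside that of the 314 community specimens reported earlier. This is a minimal, purely descriptive Python sketch; the helper name `positivity` is an assumption, and because the report does not describe how recruitment differed between the two groups, the comparison should not be read as a statistical test.

```python
def positivity(positive, tested):
    """Fraction of tested specimens that were virus positive."""
    if tested <= 0:
        raise ValueError("tested must be a positive count")
    if not 0 <= positive <= tested:
        raise ValueError("positive must lie between 0 and tested")
    return positive / tested


# Counts taken from the report: community self-collected swabs and
# the separately funded visitor cohort, both assayed with the same panel.
community = positivity(84, 314)
visitors = positivity(185, 2685)

print(f"community specimens: {community:.1%} positive")
print(f"visitor cohort:      {visitors:.1%} positive")
```

With the reported counts, this gives about 26.8% for the community specimens and about 6.9% for the visitor cohort.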

Initial findings, led by NYU, have been published at the social media and health workshop at ICWSM 2017 and in a paper in BMC Research Notes; these papers explored the predictive power of community (social media) data for infection, as well as the comparison of community and clinical respiratory infection data in a unique setting (Nigeria). We also have two larger papers in preparation that explore each of the objectives, which we anticipate will be of interest to both the computational and public health communities.

Dr. Chunara, PI at NYU, has requested a no-cost extension (NCE) for the NYU portion of this collaborative project. Though not requesting an NCE, Dr. Shaman’s group at Columbia University plans to continue participating in these analyses and supporting the development and submission of manuscripts.

 


Last Modified: 11/29/2018
Modified by: Jeffrey L Shaman

Please report errors in award information by writing to: awardsearch@nsf.gov.
