
NSF Org: | IIS Division of Information & Intelligent Systems |
Recipient: |
|
Initial Amendment Date: | June 28, 2013 |
Latest Amendment Date: | June 28, 2013 |
Award Number: | 1251151 |
Award Instrument: | Standard Grant |
Program Manager: | Sylvia Spengler, sspengle@nsf.gov, (703)292-7347, IIS Division of Information & Intelligent Systems, CSE Directorate for Computer and Information Science and Engineering |
Start Date: | July 1, 2013 |
End Date: | June 30, 2017 (Estimated) |
Total Intended Award Amount: | $688,969.00 |
Total Awarded Amount to Date: | $688,969.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: | 10889 WILSHIRE BLVD STE 700 LOS ANGELES CA US 90024-4200 (310)794-0102 |
Sponsor Congressional District: |
|
Primary Place of Performance: | Office of Contract and Grant Administration Los Angeles CA US 90095-1406 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | Information Technology Researc, Big Data Science &Engineering |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
Massive longitudinal healthcare data, such as administrative claims and electronic health records, provide an opportunity to greatly enhance the accuracy and clinical impact of patient-level predictions across a wide range of outcomes. This research targets the national priority domain of healthcare IT and showcases the advances that Big Data afford in helping patients make informed healthcare decisions leading to improved outcomes. Other involved stakeholders include healthcare providers, insurers and governmental agencies, and the databases this proposed grant employs encompass diverse and vulnerable patient populations, including the young, the poor and the elderly. Within this context, this grant seeks to predict patient-level health events based upon personal characteristics and conditions. Accurate and well-calibrated predictions could significantly improve the well-being of patients and populations. This grant proposes to derive predictive models from massive observational data and then, for example, predict that a particular patient has an 18% chance of experiencing a stroke in the next 12 months. With this prediction in hand, caregivers and patients can optimize medical interventions and implement behavioral changes with the aim of preventing the predicted event. Further, this grant integrates two graduate student researchers, whose mentored experiences begin to rectify the shortage of data scientists trained at the intersection of statistics and medicine, and provides general statistical software tools for building large-scale predictive models from massive data across scientific domains.
From a technical perspective, the proposed grant aims first to evaluate the performance and applicability of an existing predictive model across five administrative claims and electronic health record databases covering over 80 million lives, using CHADS2 stroke risk as a motivating example. Then the grant will develop an innovative data-driven process for building patient-level predictive models from longitudinal observational data, and initially apply the process to predicting stroke in patients with atrial fibrillation for comparison of performance against CHADS2. Finally, the grant aims to explore characteristics of the process and resulting models, such as: evaluation of out-of-sample predictive performance in different databases; consideration of how models change over time; and assessment of which clinical variables most substantially contribute to patient-level predictions. Together, this research will focus on identifying heuristics to extract clinically relevant predictors from longitudinal electronic healthcare data, developing algorithms to use this information in multivariate modeling through massive parallelization using graphics processing units, optimized for data sparsity, and evaluating performance based on accuracy in predicting outcomes at the patient level. As a proof-of-concept, the grant will develop an approach to predict stroke risk and apply this approach across five disparate data sources (80+ million patients, including drugs, lab values, procedures, emergency room visits, primary care visits, inpatient encounters, etc.) that reflect diverse patient populations across the US, including the privately insured, Medicare-eligible, and Medicaid beneficiaries. The underlying goal of the grant is to apply innovative statistical and machine learning techniques using advancing computer technology to large-scale observational data to develop accurate and well-calibrated patient-level predictive models enabling the prediction of future medical events for individual patients.
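For readers unfamiliar with the CHADS2 baseline mentioned above, the following is a minimal sketch of how such a score could be computed and how its discrimination might be checked on held-out patients. It is an illustration only, not the project's software: the `Patient` record, its field names, and the `auc` helper are hypothetical, and the scoring follows the standard published CHADS2 definition (one point each for congestive heart failure, hypertension, age 75 or older, and diabetes; two points for prior stroke or transient ischemic attack).

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical minimal patient record; field names are illustrative."""
    congestive_heart_failure: bool
    hypertension: bool
    age: int
    diabetes: bool
    prior_stroke_or_tia: bool


def chads2_score(p: Patient) -> int:
    """Standard CHADS2 points: C, H, A (age >= 75), D = 1 each; S2 = 2."""
    score = 0
    score += 1 if p.congestive_heart_failure else 0
    score += 1 if p.hypertension else 0
    score += 1 if p.age >= 75 else 0
    score += 1 if p.diabetes else 0
    score += 2 if p.prior_stroke_or_tia else 0
    return score


def auc(scores, outcomes):
    """Area under the ROC curve, computed pairwise.

    Returns the probability that a randomly chosen patient with the
    outcome has a higher score than one without it (ties count half).
    Quadratic in sample size, so suitable only for small illustrations.
    """
    cases = [s for s, y in zip(scores, outcomes) if y]
    controls = [s for s, y in zip(scores, outcomes) if not y]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))
```

A data-driven model of the kind the abstract proposes could be compared against this baseline by passing its predicted risks to the same `auc` helper on the same held-out patients; calibration would need a separate check, since a rank-based measure does not assess whether a predicted 18% risk corresponds to an 18% observed event rate.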
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
To enable predictive modeling from healthcare data, this grant supported the founding and growth of the Observational Health Data Sciences and Informatics (OHDSI) program, a multi-stakeholder, interdisciplinary collaborative to create open-source solutions that bring out the value of observational health data through large-scale analytics. Major achievements of OHDSI over the course of this grant are an international demonstration of the characterization of treatment pathways for three major chronic diseases in over 250 million patients, a global network study of clinical predictive importance, and the completion of a large-scale analysis involving over 17,000 comparative effectiveness and drug safety studies.
Scientifically, the grant also advanced computational and statistical techniques to extract clinically relevant predictors from longitudinal electronic healthcare data, to develop algorithms to use this information in multivariate modeling through massive parallelization using graphics processing units, optimized for data sparsity, and to evaluate performance based on accuracy in predicting outcomes at the patient level. To communicate this work, the grant generated 25 peer-reviewed publications.
Finally, this grant targeted the national priority domain of healthcare IT and showcased the advances that Big Data afford in helping patients make informed healthcare decisions leading to improved outcomes. Other involved stakeholders included healthcare providers, insurers and governmental agencies, and the databases this grant employed encompassed diverse and vulnerable patient populations, including the young, the poor and the elderly. Within this context, the grant yielded improved abilities to predict patient-level health events (for example, will I have a stroke?) based upon personal characteristics and conditions. Accurate and well-calibrated predictions could significantly improve the well-being of patients and populations.
Last Modified: 08/31/2017
Modified by: Marc A Suchard