
NSF Org: | IIS Division of Information & Intelligent Systems |
Recipient: |
|
Initial Amendment Date: | April 21, 2020 |
Latest Amendment Date: | April 21, 2020 |
Award Number: | 2029095 |
Award Instrument: | Standard Grant |
Program Manager: | Amarda Shehu, IIS Division of Information & Intelligent Systems, CSE Directorate for Computer and Information Science and Engineering |
Start Date: | May 1, 2020 |
End Date: | April 30, 2021 (Estimated) |
Total Intended Award Amount: | $100,000.00 |
Total Awarded Amount to Date: | $100,000.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: | 910 GENESEE ST ROCHESTER NY US 14611-3847 (585)275-4031 |
Sponsor Congressional District: |
|
Primary Place of Performance: | 500 Wilson Blvd., Box 270171 Rochester NY US 14627-0171 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | COVID-19 Research |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
The current COVID-19 pandemic is prompting the scientific community to improve the epidemiological models used to understand the spread of infectious diseases. Current models divide a population into compartments such as susceptible, exposed but non-infectious, asymptomatic but infectious, symptomatic and infectious, recovered, and deceased. To simplify the mathematics, the prevailing assumption is that individuals in the same compartment behave identically. One way to add sophistication to these so-called compartmental models is to categorize individuals in a more fine-grained manner, which adds compartments and therefore parameters. Estimating these parameters requires more data, and more data increases the computational cost of the estimation. In addition, as this pandemic is showing, access to data is uneven: different countries and municipalities pursue different sampling strategies. This project lowers the computational cost of setting up complex compartmental epidemiological models and of updating them as more data becomes available. By doing so, the project improves the ability of the scientific community to predict the spread of the virus more accurately and to assess the effectiveness of local mitigation policies.
The investigators adopt a category of model fitting that has seen recent success in molecular dynamics simulations: maximum entropy biasing methods. These methods replace model parameter optimization with a minimal biasing term that is independent of the model parameters. This makes the runtime complexity of model optimization linear in the amount of data and independent of the number of unknown parameters. The activities will enable rapid optimization of complex models that additionally account for spatial resolution and sampling bias. The reduced cost of optimization will permit frequent updates of compartmental models without a full parameter optimization each time new data is observed. The investigators will collaborate closely with others in the rapidly coalescing COVID-19 research community by releasing code, data, and findings.
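As a point of reference, the textbook form of a maximum entropy bias can be written as follows. This is a generic statement of the principle, not the investigators' specific formulation; the symbols $P_0$ (the distribution of trajectories produced by the unfitted model), $g_k$ (an observable, such as the infected fraction on a given day), $\bar{g}_k$ (its observed value), and $\lambda_k$ (a Lagrange multiplier) are notation assumed here for illustration.

```latex
% Choose the distribution closest to the prior that reproduces the data on average:
P^{*} = \arg\min_{P} \sum_{x} P(x)\,\ln\frac{P(x)}{P_0(x)}
\quad \text{subject to} \quad
\sum_{x} P(x)\, g_k(x) = \bar{g}_k , \qquad k = 1,\dots,K .

% Solving with Lagrange multipliers gives an exponential reweighting of the prior:
P^{*}(x) = \frac{P_0(x)\,\exp\!\Big(-\sum_{k=1}^{K} \lambda_k\, g_k(x)\Big)}{Z(\lambda)} ,
\qquad
Z(\lambda) = \sum_{x} P_0(x)\,\exp\!\Big(-\sum_{k=1}^{K} \lambda_k\, g_k(x)\Big) .
```

In this form there is one multiplier $\lambda_k$ per observation and none per model parameter, which is one way to read the claim that the bias is independent of the model parameters.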
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Quantitatively accurate epidemiological models are powerful tools for both understanding disease outbreaks and informing public health policy. However, as models become more sophisticated, there are more parameters to estimate, which makes predictions less accurate and increases the amount of data required. This is especially true of COVID-19 because the disease parameters are, in most cases, unknown. Indeed, the emergence of new variants such as Delta and Lambda, with markedly different characteristics, demonstrates that even accurate measurements of parameters can quickly become obsolete. Another challenge is both the amount of data and the bias in it. Different sampling strategies are being pursued in each region, so the observed case statistics must be treated carefully.
To circumvent these difficulties, we proposed using a methodology that has seen recent success in molecular dynamics simulations: maximum entropy (MaxEnt) biasing methods. Such an approach replaces model parameter optimization, which requires intensive data collection, with a biasing term that is independent of the model parameters. One advantage is computational speedup: the runtime complexity of model optimization becomes linear in the amount of data. More importantly, the formulation is independent of the unknown parameters, allowing one to rapidly adapt the simulations to new realities. One can therefore make reasonable estimates, both projecting disease dynamics forward and inferring past history from present observations, without precise estimates of the numerous parameters governing epidemic spread.
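The reweighting idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's released code: the function name `maxent_weights`, the plain gradient iteration, the step size `lr`, and the synthetic Gaussian "observables" are all hypothetical choices made for the example. Given an ensemble of simulated trajectories, each summarized by the values of a few observables, it finds weights of the form exp(-g·λ) so that the weighted ensemble matches the observed values on average.

```python
import numpy as np

def _weights(g, lam):
    """Normalised weights w_i proportional to exp(-g_i . lam)."""
    logits = -(g @ lam)
    logits -= logits.max()          # stabilise the exponentials
    w = np.exp(logits)
    return w / w.sum()

def maxent_weights(g, targets, lr=0.5, steps=2000, tol=1e-9):
    """Reweight an ensemble so weighted averages of observables match targets.

    g:       (N, K) array; observable k evaluated on simulated trajectory i
    targets: (K,) array; observed value for each observable
    lr, steps, tol: illustrative solver settings (assumptions, not tuned)
    """
    g = np.asarray(g, dtype=float)
    targets = np.asarray(targets, dtype=float)
    lam = np.zeros(g.shape[1])      # one multiplier per observation
    for _ in range(steps):
        w = _weights(g, lam)
        residual = w @ g - targets  # mismatch between ensemble and data
        if np.max(np.abs(residual)) < tol:
            break
        # Gradient step on the convex dual; raising lam_k lowers <g_k>.
        lam += lr * residual
    return _weights(g, lam), lam

# Synthetic stand-in for an ensemble: 2000 trajectories, 2 observables.
rng = np.random.default_rng(0)
g = rng.normal(size=(2000, 2))
targets = np.array([0.3, -0.2])
w, lam = maxent_weights(g, targets)
reweighted_means = w @ g            # should sit close to `targets`
```

Each iteration costs one pass over the N-by-K array of observables, so the work grows linearly with the number of observations K, and no parameter of the simulator that generated the ensemble appears in the update. The fixed step size assumes observables of roughly unit variance; real observables would likely need rescaling or a more careful solver.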
The first aim of our proposal was to adapt the MaxEnt framework to families of compartmental models that describe the evolution of various types of pathogens, and to validate our approach on known data and outcomes. We considered a model with five states, Susceptible, Exposed, Asymptomatic, Infected, and Recovered (SEAIR), that has been used extensively in the literature to forecast the spread of COVID-19. We used the deterministic model as the ground truth and simulated uncertainty in the data by adding noise to the temporal evolution of each state. Inputs to the MaxEnt framework were discrete observations in time, serving as a proxy for imperfect knowledge of the evolution of the pandemic. We demonstrated that despite noisy and sparse inputs, and with no knowledge of the disease parameters, the MaxEnt framework accurately captured the complete trajectory of each disease state. This work, titled "Simulation-Based Inference with Approximately Correct Parameters via Maximum Entropy", is available at https://arxiv.org/abs/2104.09668 and is currently under review at Nature Computational Science. Figure 1 illustrates the process of discrete sampling and projecting the disease dynamics forward.
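The validation setup described above (a deterministic SEAIR run as ground truth, noise added to every state, and a handful of discrete observations) might look roughly like the following sketch. The compartment flow S to E to A to I to R, the rate values, the forward-Euler integrator, the Gaussian noise level, and the observation schedule are all assumptions chosen for illustration; the report does not specify them.

```python
import numpy as np

def simulate_seair(beta=0.3, eps=0.5, sigma=0.25, alpha=0.2, gamma=0.1,
                   e0=1e-3, days=150, dt=0.1):
    """Integrate a simple SEAIR model with forward Euler; returns daily states.

    States are population fractions [S, E, A, I, R]. All rates are
    hypothetical: beta (transmission), eps (relative infectiousness of
    asymptomatic cases), sigma (E->A), alpha (A->I), gamma (I->R).
    """
    steps_per_day = int(round(1 / dt))
    y = np.array([1.0 - e0, e0, 0.0, 0.0, 0.0])
    traj = np.empty((days + 1, 5))
    traj[0] = y
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            s, e, a, i, r = y
            new_exposed = beta * s * (i + eps * a)
            dy = np.array([-new_exposed,
                           new_exposed - sigma * e,
                           sigma * e - alpha * a,
                           alpha * a - gamma * i,
                           gamma * i])
            y = y + dt * dy
        traj[day] = y
    return traj

rng = np.random.default_rng(1)
truth = simulate_seair()                       # deterministic ground truth

# Simulated uncertainty: Gaussian noise on every state, clipped to [0, 1].
noisy = np.clip(truth + rng.normal(scale=0.01, size=truth.shape), 0.0, 1.0)

# Sparse, discrete observations: the infected fraction every 20 days.
obs_days = np.arange(10, 151, 20)
observations = noisy[obs_days, 3]
```

Values such as `observations` would then serve as the targets that a maximum entropy reweighting of many simulated trajectories is asked to reproduce.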
Having validated our methodology, we turned to our second aim: deploying the framework in real-world settings of COVID-19 spread. To that end, we conducted an analysis of disease spread in New York State. Despite sampling trajectories from only a subset of counties, we were able to accurately reproduce the infection curves in other parts of the state. An important result was that we could infer the origin of the pandemic in the state by sampling curves from much later in the pandemic. Indeed, while the current belief is that the efficacy of contact tracing in determining the source of the pandemic diminishes rapidly as the epidemic progresses, we demonstrate the existence of a second temporal window, much later in the spread of the disease, in which one can recover knowledge about the origins of the pandemic. This work is described in the manuscript "Maximum Entropy Epidemiology: Predicting Future Trajectories and Inferring Origin of Patient-Zero" and is being prepared for submission to the Proceedings of the National Academy of Sciences. Figure 2 illustrates the sampling of counties in New York and the relative accuracy of inferring the origin of the infection.
Finally, in addition to the stated aims of our proposal, we also worked on phenomenological modeling. Specifically, we asked: "What is it that makes urban centers vulnerable to epidemics in the first place?" We proposed a metric that combines population density and mobility flow between high-population centers and demonstrated that it is an accurate predictor of future outbreaks. We validated our metric in about 160 cities worldwide and showed that it predicted the severity of the initial outbreak in multiple countries. Based on this, we proposed a number of non-pharmaceutical intervention measures, much less severe than draconian lockdowns, that appeared to suppress the spread of the pandemic. This work is published in Communications Physics 4, 191 (2021).
Last Modified: 08/23/2021
Modified by: Gourab Ghoshal