Award Abstract # 1633720
BIGDATA: Collaborative Research: F: Efficient Distributed Computation of Large-Scale Graph Problems in Epidemiology and Contagion Dynamics

NSF Org: IIS
Division of Information & Intelligent Systems
Recipient: UNIVERSITY OF HOUSTON SYSTEM
Initial Amendment Date: August 30, 2016
Latest Amendment Date: June 22, 2019
Award Number: 1633720
Award Instrument: Standard Grant
Program Manager: Sylvia Spengler
sspengle@nsf.gov
 (703)292-7347
IIS
 Division of Information & Intelligent Systems
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2016
End Date: August 31, 2022 (Estimated)
Total Intended Award Amount: $549,877.00
Total Awarded Amount to Date: $565,877.00
Funds Obligated to Date: FY 2016 = $549,877.00
FY 2019 = $16,000.00
History of Investigator:
  • Gopal Pandurangan (Principal Investigator)
    gopalpandurangan@gmail.com
Recipient Sponsored Research Office: University of Houston
4300 MARTIN LUTHER KING BLVD
HOUSTON
TX  US  77204-3067
(713)743-5773
Sponsor Congressional District: 18
Primary Place of Performance: University of Houston
TX  US  77204-2015
Primary Place of Performance
Congressional District:
18
Unique Entity Identifier (UEI): QKWEF8XLMTT3
Parent UEI:
NSF Program(s): Big Data Science &Engineering
Primary Program Source: 01001617DB NSF RESEARCH & RELATED ACTIVIT
01001920DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7934, 7433, 9251, 8083
Program Element Code(s): 808300
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

A number of phenomena of societal importance, such as the spread of diseases and
contagion processes, can be modeled by stochastic processes on networks. The analysis
and control of such network phenomena involve, at their heart, fundamental graph-theoretic problems.
The graphs encountered are typically of large-scale (having tens of millions of nodes);
further, typical experimental analyses involve large designs with a number of parameters,
leading to hundreds of thousands of graph computations. Novel methods for solving these problems
are needed, since fast response times are critical to effective decision making.
The overarching goal of this project is to develop efficient distributed algorithms
and associated lower bounds for graph-theoretic problems that arise in computational
epidemiology and contagion dynamics. This will have a significant impact on these specific
applications, through more efficient algorithmic tools for enabling complex analyses.
The project will also make fundamental contributions to the design and analysis of
distributed algorithms for graph problems in large-scale networks, and will
result in an algorithmic toolkit with building blocks for performing large-scale
distributed graph computation. The project will lead to significant curriculum development
for undergraduate as well as graduate students, as well as public health analysts.
Finally, the project will help in involving minority and underrepresented students in research.

The technical focus of the project will be on distributed algorithms for fundamental topics
in graph algorithms such as graph connectivity, distances, subgraph analysis, and different
kinds of centrality measures. These topics underlie some of the recurring problems in the
modeling, simulation and analysis and control of different kinds of contagion processes.
For all these problems, the project will focus on developing provably efficient distributed
algorithms and showing lower bounds under a message-passing distributed computing model.
The PIs will also develop efficient implementations of these algorithms, and evaluate their
performance and solution quality in real-world graphs arising in epidemiology. The graphs
that arise in these applications have several novel characteristics, which will present new
challenges as well as opportunities for distributed computing.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Note:  When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

(Showing: 1 - 10 of 15)
Ajieren, Eric and Hourani, Khalid and Moses Jr., William K. and Pandurangan, Gopal "Distributed Algorithms for Connectivity and MST in Large Graphs with Efficient Local Computation" ICDCN 2022: 23rd International Conference on Distributed Computing and Networking , 2022 https://doi.org/10.1145/3491003.3491011 Citation Details
Augustine, John and Hourani, Khalid and Molla, Anisur Rahaman and Pandurangan, Gopal and Pasic, Adi "Scheduling mechanisms to control the spread of COVID-19" PLOS ONE , v.17 , 2022 https://doi.org/10.1371/journal.pone.0272739 Citation Details
Augustine, John and Kothapalli, Kishore and Pandurangan, Gopal "Efficient Distributed Algorithms in the k-machine model via PRAM Simulations" 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS) , 2021 https://doi.org/10.1109/IPDPS49936.2021.00031 Citation Details
Augustine, John and Molla, Anisur Rahaman and Pandurangan, Gopal and Vasudev, Yadu "Byzantine Connectivity Testing in the Congested Clique" Leibniz international proceedings in informatics , 2022 Citation Details
Chatterjee, Soumyottam and Fathi, Reza and Pandurangan, Gopal and Pham, Nguyen Dinh "Fast and Efficient Distributed Computation of Hamiltonian Cycles in Random Graphs" 38th IEEE International Conference on Distributed Computing Systems (ICDCS) , 2018 10.1109/ICDCS.2018.00079 Citation Details
Chatterjee, Soumyottam and Pandurangan, Gopal and Pham, Nguyen Dinh "Distributed MST: A Smoothed Analysis" ICDCN 2020: Proceedings of the 21st International Conference on Distributed Computing and Networking , 2020 10.1145/3369740.3369778 Citation Details
Farouzi, Abir and Bellatreche, Ladjel and Ordonez, Carlos and Pandurangan, Gopal and Malki, Mimoun "A Scalable Randomized Algorithm for Triangle Enumeration on Graphs Based on SQL Queries" International Conference on Big Data Analytics and Knowledge Discovery (DaWaK) , 2020 https://doi.org/10.1007/978-3-030-59065-9_12 Citation Details
Fathi, Reza and Molla, Anisur and Pandurangan, Gopal "Efficient Distributed Community Detection in the Stochastic Block Model" 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS) , 2019 10.1109/ICDCS.2019.00048 Citation Details
Fathi, Reza and Molla, Anisur Rahaman and Pandurangan, Gopal "Efficient Distributed Algorithms for the K-Nearest Neighbors Problem" SPAA '20: Proceedings of the 32nd ACM Symposium on Parallelism in Algorithms and Architectures , 2020 10.1145/3350755.3400268 Citation Details
Gmyr, Robert and Pandurangan, Gopal "Time-Message Trade-Offs in Distributed Algorithms" 32nd International Symposium on Distributed Computing (DISC 2018) , v.121 , 2018 10.4230/LIPIcs.DISC.2018.32 Citation Details
Khan, Maleq and Pandurangan, Gopal and Dinh Pham, Nguyen and Vullikanti, Anil and Zhang, Qin. "A Multi-criteria Approximation Algorithm for Influence Maximization with Probabilistic Guarantees" SIAM Symposium on Algorithm Engineering and Experiments , 2020 https://doi.org/10.1137/1.9781611976007.7 Citation Details
(Showing: 1 - 10 of 15)

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The results of this project has significantly advanced our ability to design provably good distributed algorithms for processing large-scale data, especially graph data. It has also led to applying these algorithms to networked epidemiology which can help in understanding and containing the spread of diseases.

Intellectual Merit:

The project has resulted in the design  and analysis of new distributed algorithms  for various fundamental  algorithmic problems in the k-machine model, a model for large-scale distributed computation. These problems include PageRank, triangle enumeration, community detection, K-nearest neighbors, minimum spanning tree, and several other graph problems. These problems have wide applicability and are used as primitives in several applications including  epidemiology.

A major contribution of the  project is  developing a general technique to show lower bounds for large-scale  distributed graph algorithms. These lower bounds help in quantifying the best possible algorithms that one can achieve for distributively solving large-scale problems.

The project also developed efficient distributed algorithms for the problem of influence maximation in a social network, a well-studied problem with applications to many areas such as viral marketing, epidemiology among others.

The project also initiates the study distributed fault-tolerant graph algorithms.  It develops a distributed algorithm for the fundamental graph connectivity problem that can output meaningful results even under the presence of malicious processors.

The project makes an important contribution to controlling the Covid-19 pandemic, by developing scheduling mechanisms that   trade-off between containing the spread of COVID-19 and performing in-person activity in organizations. 

 

Broader Impacts:

This project has resulted in the discovery of (essentially) optimum algorithms for two fundamental graph problems, namely PageRank and triangle enumeration in the k-machine model, a message passing model that models large-scale distributed computing. The project has also implemented the triangle enumeration algorithm in a SQL based database system. The project has also developed efficient distributed algorithms for various fundamental graph problems in the k-machine model.

The project has also resulted in the development of new techniques for showing lower bounds for distributed graph algorithms in a message passing model, which is of fundamental interest. Other works that have appeared subsequently have been influenced by this work.

The development of efficient algorithms for fundamental graph problems will have a direct impact in designing efficient distributed algorithms for large-scale graphs which is  important to Big Graph Data computing.

The project also initiated the study of distributed graph algorithms that are tolerant to even malicous faults of processors executing the algorithm. 

Our work on scheduling mechanisms for Covid-19 can have an impact on strategies to operate organizations like hospitals, schools, universities and businesses safely. It develops and analyzes mechanisms called group scheduling that trade off  in-person activity for controlling the spread of Covid-19 and vice versa. This can be useful for public health experts.

The results of this project have been published in top peer-reviewed conferences and journals in computer science.

The project has also provided opportunities to train graduate and undergraduate students and post doctoral fellows in Big Data computing, distributed algorithms,  and analyzing models and algorithms in epidemiology.

The theory and algorithms developed as part of this project, have been incorporated in  courses taught by the PI and his collaborators. In particular, it is part of the Distributed Algorithms course taught by the PI at the University of Houston and IIT Madras. The PI has given  tutorials on the topic  in universities and conferences.

The project has contributed to opportunities for research and mentoring of seven Ph.D. students ---  Soumyottam Chatterjee, Reza Fathi, Nguyen Dinh Pham. Khalid Hourani, Aayush Gupta, Duy Pham, and Krishnamoorthy Iyer (all at the University of Houston, a minority serving Tier 1 public research institution) and four postdoctoral scholars --- Michele Scquizzato, Robert Gmyr, William Moses and Fabien Dufoluon (all at the University of Houston).

The project has also contributed to collaborations with Profs. John Augustine  and Yadu Vasudev (IIT Madras), Prof. Kishore Kothapalli (IIIT, Hyderabad),  Prof. Anisur Molla (Indian Statistical Institute, Kolkatta), Prof. Sriram Pemmaraju (University of Iowa), and  Prof. Peter Robinson (Augusta University),

Prof. Shay Kutten (Technion), and Prof. David Peleg (Weizmann).  These collaborations have resulted in more broadly disseminating the research topics among the researchers in the respective universities.

 

 


Last Modified: 12/31/2022
Modified by: Gopal Pandurangan

Please report errors in award information by writing to: awardsearch@nsf.gov.

Print this page

Back to Top of page