
NSF Org: |
CCF Division of Computing and Communication Foundations |
Recipient: |
|
Initial Amendment Date: | March 21, 2016 |
Latest Amendment Date: | August 6, 2019 |
Award Number: | 1562041 |
Award Instrument: | Continuing Grant |
Program Manager: |
A. Funda Ergun
CCF Division of Computing and Communication Foundations CSE Directorate for Computer and Information Science and Engineering |
Start Date: | September 1, 2016 |
End Date: | August 31, 2021 (Estimated) |
Total Intended Award Amount: | $774,121.00 |
Total Awarded Amount to Date: | $774,121.00 |
Funds Obligated to Date: |
FY 2017 = $195,550.00 FY 2019 = $201,188.00 |
History of Investigator: |
|
Recipient Sponsored Research Office: |
150 MUNSON ST NEW HAVEN CT US 06511-3572 (203)785-4689 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
17 Hillhouse Ave New Haven CT US 06511-8965 |
Primary Place of
Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | Algorithmic Foundations |
Primary Program Source: |
01001718DB NSF RESEARCH & RELATED ACTIVIT 01001920DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
The PI will develop new methods for analyzing, reasoning about, and making predictions about graphs and networks. Graphs and networks appear throughout society and science. They include road and transportation networks, communication networks, power networks and social networks. They are one of the dominant abstractions of data in Computer Science, and are used to model abstract interactions in fields ranging from Genomics to Image Processing and Machine Learning.
The project has two major research thrusts. The first is the development of faster algorithms for performing existing analyses. The second is the development of new approaches to understanding the structure of graphs and networks. During the project, the PI will also develop and distribute course materials to teach recent developments in the field, will give public lectures on related material, will train graduate and undergraduate students in research, and will develop software that others can use to perform these analyses.
The fundamental object to be studied in this project are graph structured block matrices (GSBMs)---block matrices whose nonzero structure corresponds to the edges of a graph. The first part of the project will involve the development of fast algorithms for the solution of systems of linear equations in GSBMs that can be written as a sum of positive semindefinte matrices with each matrix corresponding to one edge of the graph. These GSBMs are generalizations of Laplacian matrices and arise in many application areas, including Optimization, Computational Science, and Image Processing. The second part of the project will involve the generalization of spectral graph theory to the study of the expected characteristic polynomials of GSBMs with randomly chosen block matrices. Spectral graph theory has been one of the most useful tools for analyzing graphs and networks. The extension of the theory to random GSBMs should enable analyses that are not possible with the standard approach.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
Note:
When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external
site maintained by the publisher. Some full text articles may not yet be available without a
charge during the embargo (administrative interval).
Some links on this page may take you to non-federal websites. Their policies may differ from
this site.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
The most important research advances made by this project have been the development of faster algorithms for solving systems of linear equations and the improvement of the design of randomized controlled trials (RCTs).
RCTs are the principal way we estimate the effectiveness of new medications, procedures, policies, and interventions. In a typical medical trial, the subjects are divided into two (or more) groups. One group of subjects is given the new medicine, and the other is given a placebo or old medicine. Randomization is used to ensure that these test and control groups are probably similar, and enables us to make statistical statements about the estimated treatment effects. When we know nothing about the experimental subjects, a uniform random assignment of subjects to groups is the best we can do. However, when we have information about the subjects that could potentially be correlated with the outcome of the trial, we usually want to balance those properties of the subjects between the two groups.
We began working on RCTs when a colleague requested our help in designing a trial to evaluate interventions to improve maternal health outcomes. Our initial solution, which was used in the trial, built upon the expected characteristic polynomial techniques we developed to solve Weaver's problem. But, a better technique with stronger guarantees is required in general.
In our paper, "Balancing covariates in randomized experiments using the Gram-Schmidt walk," we show how to use the recent Gram-Schmidt Walk algorithm of Bansal, Dadush, Garg, and Lovett, to create balanced partitions for RCTs. We begin by cleanly formulating exactly what we require of the partition into test and control groups, and show that what is really required is a random partition that balances subject covariates on average. We then show that the Gram-Schmidt Walk provides a nearly optimal solution to the design of RCTs that compensate for linear influences of covariates on treatment outcomes. This involves developing an optimal analysis of the Gram-Schmidt Walk algorithm, as well as techniques for computing confidence intervals after the experiment has been performed and estimates of the sizes of those confidence intervals.
The problem of solving systems of linear equations arises in many scientific disciplines, including engineering, scientific simulation, and machine learning. Many of the systems of linear equations that must be solved in practice have special structure that can enable us to solve them quickly. In this project, we developed algorithms that can solve a large family of these, called Block Diagonally Dominant Matrices, in nearly optimal time.
This project also supported teaching, mentoring, and outreach activities. 16 undergraduate students worked on research related to this project, as did one postdoc and 3 graduate students. Three new courses were developed as part of this project, and the material of a fourth was incorporated into a 400 page draft of a textbook. The PI gave numerous talks to audiences in the field, to general audiences in mathematics, computer science, and statistics, to general scientific audiences, and a few to the general public.
Last Modified: 12/29/2021
Modified by: Daniel A Spielman
Please report errors in award information by writing to: awardsearch@nsf.gov.