Award Abstract # 1524852
SHF: Small: Transformations for Synergistic Analysis of Large Evolving Graphs

NSF Org: CCF
Division of Computing and Communication Foundations
Recipient: REGENTS OF THE UNIVERSITY OF CALIFORNIA AT RIVERSIDE
Initial Amendment Date: June 30, 2015
Latest Amendment Date: June 30, 2015
Award Number: 1524852
Award Instrument: Standard Grant
Program Manager: Almadena Chtchelkanova
achtchel@nsf.gov
 (703)292-7498
CCF
 Division of Computing and Communication Foundations
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: July 1, 2015
End Date: June 30, 2020 (Estimated)
Total Intended Award Amount: $400,000.00
Total Awarded Amount to Date: $400,000.00
Funds Obligated to Date: FY 2015 = $400,000.00
History of Investigator:
  • Rajiv Gupta (Principal Investigator)
    gupta@cs.ucr.edu
Recipient Sponsored Research Office: University of California-Riverside
200 UNIVERSTY OFC BUILDING
RIVERSIDE
CA  US  92521-0001
(951)827-5535
Sponsor Congressional District: 39
Primary Place of Performance: University of California-Riverside
CA  US  92521-0001
Primary Place of Performance
Congressional District:
39
Unique Entity Identifier (UEI): MR5QC5FCAVH5
Parent UEI:
NSF Program(s): Software & Hardware Foundation
Primary Program Source: 01001516DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7923, 7942
Program Element Code(s): 779800
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

The importance of graph processing has grown with the popularity of graph analytics. An important feature of real-world graphs is that they are constantly evolving (e.g., social networks, networks modeling spreading of a disease etc.). Graph analytics over an evolving graph entails repeating analysis over snapshots of a graph taken at different points in time to observe how features of interest change over time. For large real-world graphs with tens of billions of edges, evolving graph analysis is both highly compute- and memory-intensive. By developing transformations that reorganize the computation and data, techniques for rapid evolving graph analytics on modern computing platforms are being considered. Many students are being trained and educated in this important field.

Graph analysis can greatly benefit from cores and storage available on modern parallel machines. However, effectively exploiting the resources remains an enormous challenge due to irregular nature of parallelism and lack of data locality in graph computations. This work is leveraging two key characteristics, overlapping working sets and computed value stability, to develop techniques for speeding up graph analytics. The techniques being considered include: optimization of reading and writing of large graphs on disk, optimizing inter-node communication on a cluster, and optimizing computation over multiple versions of an evolving graph. These optimizations are being used to greatly enhance the performance of multiple popular graph processing systems. Public dissemination of these software enhancements are also planned.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Note:  When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Keval Vora, Guoqing Xu, Rajiv Gupta "Synergistic Analysis of Evolving Graphs" ACM Transactions on Architecture and Code Optimization, Article No. 32, 27 pages , v.13 , 2016
S-C. Koduru, K. Vora, R. Gupta "Software Speculation on Caching DSMs" International Journal of Parallel Programming (IJPP) , v.46 , 2018 , p.313

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The use of Graph Analytics has become pervasive because large data sets across many domains can be readily represented as graphs where nodes represent entities and edges represent pairwise relationships among them. Although graph analytics workloads contain vast amounts of data level parallelism, exploiting this parallelism is an enormous challenge due to their large memory footprint, computation that is active across this footprint, and the irregular nature of computation that lacks all forms of data locality. In this work we have developed new analytics algorithms that are designed to address the challenges of utilizing the major parallel computing platforms available today. The key outcomes of this project are as follows:

1. Graph Analytics on multicores: For shared-memory platforms we have developed parallel algorithms for evaluating a large number of queries either one by one or simultaneously. To handle very large graphs that do not fit in memory, we have advanced the state-of-the-art in out-of-core graph analytics.

2. Graph Analytics on clusters: In context of clusters we developed distributed algorithms that simultaneously evaluate a query on multiple versions of an evolving graph as well as simultaneously evaluate batches of queries on the same version of a graph. In addition, we have considered distributed algorithms that work on streaming graphs, that is, graphs that are dynamically changing.

3. Graph Analytics on a GPGPU: Finally this project also developed parallel algorithms for graph analytics that are suitable for GPGPUs which have tremendous amount of hardware parallelism but limited amount of memory. We have developed techniques that dynamically limit loading to subgraphs corresponding to active computations and manage vast amounts of generated intermediate data by orchestrating loading and unloading of this data to and from GPGPU memory.

Broader Impacts: The results of this research have been widely disseminated by publication in high quality venues including ASPLOS (3 papers), IEEE BigData (2 papers), EuroSys, HPDC, USENIX ATC, ACM TACO, LCPC, IJPP, CC, ISMM and others. A number of PhD students participated in this project. Keval Vora (Simon Fraser Univ.), Sai Charan Koduru (Microsoft), and Amlan Kusum (Oracle) have graduated. Abbas Mazloumi, Chengshuo Xu, Xiaolin Jiang, and Gurneet Kaur have made substantial progress in their dissertation work and are expected to graduate over the next two years.


Last Modified: 07/05/2020
Modified by: Rajiv Gupta

Please report errors in award information by writing to: awardsearch@nsf.gov.

Print this page

Back to Top of page