Award Abstract # 0926692
Collaborative Research: Topology-Aware MPI Communication and Scheduling for Petascale Systems

NSF Org: OAC
Office of Advanced Cyberinfrastructure (OAC)
Recipient: UNIVERSITY OF CALIFORNIA, SAN DIEGO
Initial Amendment Date: August 13, 2009
Latest Amendment Date: August 13, 2009
Award Number: 0926692
Award Instrument: Standard Grant
Program Manager: Kevin Thompson
kthompso@nsf.gov
 (703)292-4220
OAC
 Office of Advanced Cyberinfrastructure (OAC)
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2009
End Date: September 30, 2013 (Estimated)
Total Intended Award Amount: $460,000.00
Total Awarded Amount to Date: $460,000.00
Funds Obligated to Date: FY 2009 = $460,000.00
ARRA Amount: $460,000.00
History of Investigator:
  • Amitava Majumdar (Principal Investigator)
    majumdar@sdsc.edu
Recipient Sponsored Research Office: University of California-San Diego
9500 GILMAN DR
LA JOLLA
CA  US  92093-0021
(858)534-4896
Sponsor Congressional District: 50
Primary Place of Performance: University of California-San Diego
9500 GILMAN DR
LA JOLLA
CA  US  92093-0021
Primary Place of Performance Congressional District: 50
Unique Entity Identifier (UEI): UYTTZT6G9DT1
Parent UEI:
NSF Program(s): CESER-Cyberinfrastructure for
Primary Program Source: 01R00910DB RRA RECOVERY ACT
Program Reference Code(s): 6890, 7684, 9215, HPCC
Program Element Code(s): 768400
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5).

Modern networks (like InfiniBand and 10GigE) have the capability to provide topology, routing and network status information at run-time. This leads to the following broad challenge: Can the next generation petascale systems provide topology-aware MPI communication, mapping and scheduling which can improve performance and scalability for a range of applications? This challenge leads to the following research questions: 1) What are the topology-aware communication and scheduling requirements of petascale applications? 2) How can a network topology and state management framework be designed with static and dynamic network information? 3) How can topology-aware point-to-point and collective communication schemes (such as broadcast, all-to-all and all-reduce) be designed in an MPI library? 4) How can topology-aware task mapping and scheduling schemes be designed? and 5) How can a flexible topology information interface be defined and designed? A synergistic and comprehensive research plan, involving computer scientists from The Ohio State University (OSU) and computational scientists from the Texas Advanced Computing Center (TACC) and the San Diego Supercomputer Center (SDSC) at the University of California, San Diego, is proposed to address the above challenges. The research will be driven by a set of applications (PSDNS, UCSDH3D, AWM-Olsen and MPCUGLES) from established NSF computational science researchers running large scale simulations on the Ranger system and other NSF HEC systems. The transformative impact of the proposed research is to develop topology-aware MPI software and a framework for using derived topology information for scheduling integration in order to maximize petascale application performance.
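To give a flavor of the task-mapping problem in research question 4, the following is a minimal illustrative sketch (not part of the award, and not the project's actual algorithm): a greedy heuristic that places heavily communicating MPI ranks onto nearby nodes, given a rank-to-rank traffic matrix and pairwise node hop distances. All names and the data layout are hypothetical.

```python
# Hypothetical sketch of greedy topology-aware rank-to-node mapping.
# comm[i][j] is the traffic volume between MPI ranks i and j (symmetric);
# hops[a][b] is the network distance between nodes a and b.
# One rank is placed per node; heavy pairs are handled first so they
# end up on nearby nodes.

def greedy_map(comm, hops):
    """Return a dict mapping each rank to a node index."""
    n = len(comm)
    # Order rank pairs by traffic volume, heaviest first.
    pairs = sorted(
        ((comm[i][j], i, j) for i in range(n) for j in range(i + 1, n)),
        reverse=True,
    )
    placement = {}           # rank -> node
    free = set(range(n))     # unoccupied nodes
    for _, i, j in pairs:
        if i not in placement and j not in placement:
            # Seed: drop the first rank of a fresh pair on any free node.
            node = min(free)
            placement[i] = node
            free.remove(node)
        for rank, partner in ((i, j), (j, i)):
            if rank not in placement and partner in placement:
                # Put this rank on the free node closest to its partner.
                node = min(free, key=lambda a: hops[placement[partner]][a])
                placement[rank] = node
                free.remove(node)
    return placement


# Example: 4 ranks on a 4-node line (hop distance = index difference).
# Ranks 0-1 and 2-3 exchange heavy traffic, so each pair should land
# on adjacent nodes.
comm = [[0, 100, 1, 1],
        [100, 0, 1, 1],
        [1, 1, 0, 90],
        [1, 1, 90, 0]]
hops = [[abs(a - b) for b in range(4)] for a in range(4)]
mapping = greedy_map(comm, hops)
```

A production scheme would additionally use dynamic network state (link load, routing) and handle multiple ranks per node, which is precisely the framework-level integration the proposal targets.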


The proposed research is a collaborative and synergistic activity between computer scientists and computational scientists and thus will have significant impact in deriving guidelines for designing, deploying and using next generation petascale systems. The proposed research directions and their solutions will be incorporated into the investigators' curricula to train graduate and undergraduate students. The established national-scale training and outreach programs at TACC and SDSC will be used to disseminate the results of this research to HEC users and developers. Research results will also be disseminated to the multiple collaborating organizations of the investigators (national laboratories and industry) to enable impact on their software products and applications. The modified MVAPICH2 library (currently used by more than 840 organizations) and the SGE scheduler plug-in will be made available to the HEC community as open source. Case studies from this research will be presented at the MPI Forum (OSU is a member of this forum) to influence the design of the upcoming MPI-3 standard and other MPI libraries.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


Dmitry Pekurovsky "P3DFFT: A Framework for Parallel Computations of Fourier Transforms in Three Dimensions" SIAM Journal on Scientific Computing , v.34 , 2012 , p. C192 http://epubs.siam.org/doi/abs/10.1137/11082748X
Unat D, Zhou J, Cui Y, Cai X, Baden S "Accelerating an Earthquake Simulation with a C-to-CUDA Translator" Journal of Computing in Science and Engineering , v.14 , 2012 , p.48
