Award Abstract # 0905509
SHF: Medium: Hardware/Software Partitioning for Hybrid Shared Memory Multiprocessors

NSF Org: CCF Division of Computing and Communication Foundations
Recipient: REGENTS OF THE UNIVERSITY OF CALIFORNIA AT RIVERSIDE
Initial Amendment Date: August 10, 2009
Latest Amendment Date: July 17, 2014
Award Number: 0905509
Award Instrument: Standard Grant
Program Manager: Sankar Basu
sabasu@nsf.gov
(703)292-7843
CCF Division of Computing and Communication Foundations
CSE Directorate for Computer and Information Science and Engineering
Start Date: September 1, 2009
End Date: August 31, 2015 (Estimated)
Total Intended Award Amount: $800,000.00
Total Awarded Amount to Date: $800,000.00
Funds Obligated to Date: FY 2009 = $800,000.00
History of Investigator:
  • Laxmi Bhuyan (Principal Investigator)
    bhuyan@cs.ucr.edu
  • Walid Najjar (Co-Principal Investigator)
  • Rajiv Gupta (Co-Principal Investigator)
Recipient Sponsored Research Office: University of California-Riverside
200 UNIVERSTY OFC BUILDING
RIVERSIDE
CA  US  92521-0001
(951)827-5535
Sponsor Congressional District: 39
Primary Place of Performance: University of California-Riverside
200 UNIVERSTY OFC BUILDING
RIVERSIDE
CA  US  92521-0001
Primary Place of Performance Congressional District: 39
Unique Entity Identifier (UEI): MR5QC5FCAVH5
Parent UEI:
NSF Program(s): DES AUTO FOR MICRO & NANO SYST, COMPILERS, COMPUTER ARCHITECTURE
Primary Program Source: 01000910DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 9218, HPCC
Program Element Code(s): 794500, 732900, 794100
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Hybrid multiprocessor architectures present an unprecedented opportunity for high-performance computing through the seamless integration of a large number of processors and hardware accelerators. This project addresses the research challenges in the design and exploitation of hybrid multiprocessors through innovations that span architectures, compilers, and high-performance computing. A hybrid cache-coherent non-uniform memory access (CC-NUMA) architecture is designed that clusters CPUs, hardware accelerators, and memories to preserve locality and reduce memory latency. Partitioning models are developed to enable optimal partitioning of data among CPUs and hardware accelerators. Compiler techniques are developed for detecting parallelism, partitioning it, and assigning it across CPUs and hardware accelerators. The project enables the coexistence of data streaming (push data) and data fetching (pull data) mechanisms. The research benefits from detailed measurements using a 64-processor SGI Altix 4700 CC-NUMA machine with FPGAs, the Intel FSB-FPGA architecture accelerator, and a Niveus 4000 workstation with NVIDIA GPUs.

The research has an impact on large-scale scientific computing. The hybrid multiprocessor technology is likely to be transferred to industry, while the developed software (compilers, simulators, and Hybrid SPLASH-2 benchmarks) will be distributed to researchers. The project also has an impact on education and research. The SGI Altix machine is already being used in our graduate classes, and further projects on hybrid parallel computing are being introduced in architecture, parallel processing, and compiler classes. The project contributes to minority undergraduate education in Computer Science, since UCR is recognized for its large Hispanic undergraduate population.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

K.K. Pusukuri, R. Gupta, L.N. Bhuyan "ADAPT: A Framework for Coscheduling Multithreaded Programs" ACM Transactions on Architecture and Code Optimization (TACO), v.9, 2013
K.K. Pusukuri, R. Gupta, L.N. Bhuyan "Thread Tranquilizer: Dynamically Reducing Performance Variation" ACM Transactions on Architecture and Code Optimization (TACO), v.8, 2012
K.K. Pusukuri, R. Gupta, L.N. Bhuyan "Tumbler: An Effective Load Balancing Technique for MultiCPU Multicore Systems" ACM Transactions on Architecture and Code Optimization (TACO), 2015
M.E. Belviranli, L.N. Bhuyan, R. Gupta "A Dynamic Self Scheduling Scheme for Heterogeneous Multiprocessor Architectures" ACM Transactions on Architecture and Code Optimization (TACO), v.9, 2013
M. Feng, C. Lin, R. Gupta "PLDS: Partitioning Linked Data Structures for Parallelism" ACM Transactions on Architecture and Code Optimization (TACO), v.8, 2012

