Award Abstract # 0953246
CAREER: QoS-Aware, High-Performance, and Scalable Many-Core Memory Systems

NSF Org: CCF
Division of Computing and Communication Foundations
Recipient: CARNEGIE MELLON UNIVERSITY
Initial Amendment Date: March 5, 2010
Latest Amendment Date: August 21, 2014
Award Number: 0953246
Award Instrument: Continuing Grant
Program Manager: Tao Li
CCF Division of Computing and Communication Foundations
CSE Directorate for Computer and Information Science and Engineering
Start Date: March 1, 2010
End Date: February 29, 2016 (Estimated)
Total Intended Award Amount: $549,306.00
Total Awarded Amount to Date: $715,428.00
Funds Obligated to Date: FY 2010 = $102,435.00
FY 2011 = $122,021.00
FY 2012 = $125,731.00
FY 2013 = $243,919.00
FY 2014 = $121,322.00
History of Investigator:
  • Onur Mutlu (Principal Investigator)
    onur@cmu.edu
Recipient Sponsored Research Office: Carnegie-Mellon University
5000 FORBES AVE
PITTSBURGH
PA  US  15213-3890
(412)268-8746
Sponsor Congressional District: 12
Primary Place of Performance: Carnegie-Mellon University
5000 FORBES AVE
PITTSBURGH
PA  US  15213-3890
Primary Place of Performance Congressional District: 12
Unique Entity Identifier (UEI): U3NKNFLNQ613
Parent UEI: U3NKNFLNQ613
NSF Program(s): Information Technology Researc,
Software & Hardware Foundation,
COMPUTER ARCHITECTURE
Primary Program Source: 01001011DB NSF RESEARCH & RELATED ACTIVIT
01001112DB NSF RESEARCH & RELATED ACTIVIT
01001213DB NSF RESEARCH & RELATED ACTIVIT
01001314DB NSF RESEARCH & RELATED ACTIVIT
01001415DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 1045, 1187, 1504, 7941, 9218, 9251, HPCC
Program Element Code(s): 164000, 779800, 794100
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Computer science and engineering is undergoing a revolution. Many-core systems are rapidly becoming the foundation of computing systems that are well-integrated into every aspect of our lives and society. Their unpredictable performance and performance misbehavior adversely affect productivity, efficiency, and profit in all domains that make use of computers. Unfortunately, existing many-core systems are largely designed based on assumptions made for single-core systems, i.e., that there are no shared resources between cores, even though the memory system, a major performance and power bottleneck, is shared. As a result, many-core systems are severely vulnerable to denial of service and are uncontrollable, unscalable, and low-performance. To enable the efficient and productive use of many-core systems, there is an urgent need to design them to ensure high quality-of-service (QoS), performance robustness, and scalability.

This research focuses on developing fundamental breakthroughs that enable scalable, controllable, and high-performance many-core memory systems. It aims to change the design paradigm of many-core processors to treat QoS and scalability in shared resources as first-class design goals, and educate future engineers to design systems with these goals as fundamental design objectives. The central approach is to develop hardware/software cooperative techniques to enable flexible QoS, partitioning, and performance mechanisms in memory systems and interconnects. The project develops fundamental techniques, targeting a very wide range of applications in cloud computing, data centers, client systems, mobile systems, and sensor environments. It is expected that research ideas developed in this project will enable controllable, robust, and therefore usable and efficient many-core systems, making our daily lives better and more productive, and taking a large step in making computing green.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 66)
Rachata Ausavarungnirun, Kevin Kai-Wei Chang, Lavanya Subramanian, Gabriel H. Loh, and Onur Mutlu "Staged Memory Scheduling: Achieving High Performance and Scalability in Heterogeneous Systems" 39th Annual International Symposium on Computer Architecture (ISCA) , v.1 , 2012 , p.416-427
Benjamin C. Lee, Engin Ipek, Onur Mutlu, and Doug Burger, "Phase Change Memory Architecture and the Quest for Scalability" Communications of the ACM (CACM) , 2010
Boris Grot, Joel Hestness, Stephen W. Keckler, and Onur Mutlu "Kilo-NOC: A Heterogeneous Network-on-Chip Architecture for Scalability and Service Guarantees" IEEE Micro, Special Issue: Micro's Top Picks from 2011 Computer Architecture Conferences (MICRO TOP PICKS) , 2012
Boris Grot, Joel Hestness, Stephen W. Keckler, and Onur Mutlu "Kilo-NOC: A Heterogeneous Network-on-Chip Architecture for Scalability and Service Guarantees" Proceedings of the 38th International Symposium on Computer Architecture (ISCA) , 2011
Boris Grot, Stephen W. Keckler, and Onur Mutlu, "Topology-aware Quality-of-Service Support in Highly Integrated Chip Multiprocessors" Proceedings of the 6th Annual Workshop on the Interaction between Operating Systems and Computer Architecture (WIOSCA) , 2010
Chang Joo Lee, Onur Mutlu, Veynu Narasiman, and Yale N. Patt "Prefetch-Aware Memory Controllers" IEEE Transactions on Computers (TC) , v.60 , 2011 , p.1406-1430
Chris Fallin, Chris Craik, and Onur Mutlu, "CHIPPER: A Low-Complexity Bufferless Deflection Router" Proceedings of the 17th International Symposium on High-Performance Computer Architecture (HPCA) , 2011
Chris Fallin, Greg Nazario, Xiangyao Yu, Kevin Chang, Rachata Ausavarungnirun, and Onur Mutlu, "MinBD: Minimally-Buffered Deflection Routing for Energy-Efficient Interconnect" Proceedings of the 6th ACM/IEEE International Symposium on Networks on Chip (NOCS) , v.x , 2012 , p.1-12
Donghyuk Lee, Lavanya Subramanian, Rachata Ausavarungnirun, Jongmoo Choi, and Onur Mutlu "Decoupled Direct Memory Access: Isolating CPU and IO Traffic by Leveraging a Dual-Data-Port DRAM" 24th International Conference on Parallel Architectures and Compilation Techniques (PACT) , 2015
Donghyuk Lee, Saugata Ghose, Gennady Pekhimenko, Samira Khan, and Onur Mutlu "Simultaneous Multi-Layer Access: Improving 3D-Stacked Memory Bandwidth at Low Cost" ACM Transactions on Architecture and Code Optimization (TACO) , v.12 , 2016

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Many-core systems are rapidly becoming the foundation of computing systems that are well-integrated into every aspect of our lives and society. Existing many-core systems are largely designed based on the assumptions made for single-core systems, ignoring the fact that cores in a many-core system interact and interfere with each other through shared resources.  As some of these shared resources, such as the memory and storage system, are major performance and power bottlenecks, ignoring the interaction between cores can lead to poor and unpredictable performance, hurting quality-of-service (QoS) and scalability as the number of cores increases.

This research aimed to change the design paradigm of many-core processors to treat quality-of-service and scalability in shared resources as first-class design goals, and educate future engineers to design systems with these goals as fundamental design objectives.  It focused on developing fundamental breakthroughs that enable scalable, controllable, and high-performance many-core memory and storage systems.  The central approach was to develop hardware/software cooperative techniques to enable flexible QoS, partitioning, and performance mechanisms in memory systems, storage, and interconnects. The project developed fundamental techniques, targeting a very wide range of applications in cloud computing, data centers, client systems, mobile systems, and sensor environments.

The project resulted in many new discoveries and techniques, including:

- more predictable multi-core memory systems, with novel quality-of-service (QoS) mechanisms at memory controllers and interconnects

- more robust and higher performance memory systems, by rigorously analyzing reliability and retention issues and developing new methods to overcome such issues

- more robust and higher performance solid-state drives, by rigorously analyzing reliability and retention issues and developing new methods to overcome such issues

- discovery of RowHammer, a widespread failure mechanism in modern DRAM memory systems that affects a majority of computing systems today, which in turn enabled the demonstration of the relationship between memory reliability and system security

- higher performance memory systems with improved latency and bandwidth using new mechanisms

- new research infrastructure, in the form of highly-detailed simulators and hardware test platforms, that can be used by the research community to extensively characterize both DRAM and flash memory (the main component of solid-state drives)

- improved approaches to implementing and managing memory systems that are partially or completely made up of emerging memory technologies

- practical frameworks and memory architectures to enable data processing directly within memory

- practical mechanisms for improving memory capacity and bandwidth using new memory compression techniques

In the end, this project educated and trained at least 16 graduate, 5 undergraduate, and 3 high school students, and produced 4 Ph.D. dissertations. It also trained 2 postdocs and 11 visiting students from other universities.  Our findings produced more than 85 papers at top venues, 8 of which received best paper awards.  The project also resulted in new open course lectures and materials that were released online via YouTube, free of charge to anyone.  As part of the project, there were many open source code releases, which are available for anyone to use at http://www.ece.cmu.edu/~safari/tools.html and https://github.com/CMU-SAFARI. We also delivered more than 100 lectures worldwide, including at least 8 keynote speeches and 2 distinguished lectures, in both industry and academia, to disseminate the results of the project and educate students and researchers.

Fo...

Please report errors in award information by writing to: awardsearch@nsf.gov.