Award Abstract # 1305375
II-NEW: An Experimental Platform for Investigating Energy-Performance Tradeoffs for Systems with Deep Memory Hierarchies

NSF Org: CNS (Division of Computer and Network Systems)
Recipient: RUTGERS, THE STATE UNIVERSITY
Initial Amendment Date: August 30, 2013
Latest Amendment Date: August 30, 2013
Award Number: 1305375
Award Instrument: Standard Grant
Program Manager: Almadena Chtchelkanova
 achtchel@nsf.gov
 (703) 292-7498
 CNS: Division of Computer and Network Systems
 CSE: Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2013
End Date: September 30, 2016 (Estimated)
Total Intended Award Amount: $300,000.00
Total Awarded Amount to Date: $300,000.00
Funds Obligated to Date: FY 2013 = $300,000.00
History of Investigator:
  • Manish Parashar (Principal Investigator)
    manish.parashar@utah.edu
  • Dario Pompili (Co-Principal Investigator)
  • Ivan Rodero (Co-Principal Investigator)
Recipient Sponsored Research Office: Rutgers University New Brunswick
3 RUTGERS PLZ
NEW BRUNSWICK
NJ  US  08901-8559
(848)932-0150
Sponsor Congressional District: 12
Primary Place of Performance: Rutgers Discovery Informatics Institute
94 Brett Road
Piscataway
NJ  US  08854-8058
Primary Place of Performance Congressional District: 06
Unique Entity Identifier (UEI): M1LVPE5GLSD9
Parent UEI:
NSF Program(s): CCRI-CISE Cmnty Rsrch Infrstrc
Primary Program Source: 01001314DB NSF RESEARCH & RELATED ACTIVITIES
Program Reference Code(s): 7359
Program Element Code(s): 735900
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

As the scale and complexity of the computing and data infrastructures supporting science and engineering grow, power is becoming an important concern in terms of cost, reliability, and overall sustainability. As a result, it is increasingly important to understand power/performance behaviors and tradeoffs from an application perspective for emerging system configurations, i.e., those with multiple cores, deep memory hierarchies, and accelerators. This project builds an instrumented experimental platform that supports such an understanding and enables research and training activities in this area. Specifically, the platform is composed of nodes with a deep memory architecture spanning four levels: DRAM, PCIe-based non-volatile memory, solid-state drive, and spinning hard disk, in addition to accelerators. Power metering is deployed as part of the infrastructure.

The experimental platform enables exploration of the power/performance behaviors of large-scale computing systems and datacenters, as well as of the compute- and data-intensive applications they support, and uniquely supports research toward understanding the management and optimization of these systems and applications. It also enables research in multiple areas, including application-aware cross-layer management, power-performance tradeoffs for data-intensive scientific workflows, and the thermal implications of deep memory hierarchies in virtualized Cloud environments.

Data- and compute-intensive applications are becoming increasingly critical to a wide range of domains, and the ability to develop large-scale, sustainable platforms and software infrastructure to support these applications will have significant impact in driving research and innovation in those domains. The developed experimental platform enables key research activities toward this end and provides insights that will impact the realization and sustainability of the very large-scale infrastructures needed by current and emerging data- and compute-intensive applications. It also serves as an infrastructure for education and training in areas related to power management, energy efficiency, data management, memory management, and virtualization.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


(Showing: 1 - 10 of 15)
Shouwei Chen and Ivan Rodero, "Exploring the Potential of Next Generation Software-Defined In-Memory Frameworks," 30th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2018. doi:10.1109/SBAC-PAD.2018.00042
Shouwei Chen and Ivan Rodero, "Understanding Behavior Trends of Big Data Frameworks in Ongoing Software-Defined Cyber-Infrastructure," BDCAT '17: Proceedings of the Fourth IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, 2017. doi:10.1145/3148055.3148079
Georgiana Haldeman, Ivan Rodero, Manish Parashar, Sabela Ramos, Eddy Z. Zhang, and Ulrich Kremer, "Exploring energy-performance-quality tradeoffs for scientific workflows with in-situ data analyses," Computer Science - Research and Development, v.30, 2015, p.207. doi:10.1007/s00450-014-0268-6
Ivan Rodero, Manish Parashar, Aditya G. Landge, Sidharth Kumar, Valerio Pascucci, and Peer-Timo Bremer, "Evaluation of In-Situ Analysis Strategies at Scale for Power Efficiency and Scalability," IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), 2016. doi:10.1109/CCGrid.2016.95
Javier Diaz-Montes, Yu Xie, Ivan Rodero, Jaroslaw Zola, Baskar Ganapathysubramanian, and Manish Parashar, "Federated Computing for the Masses: Aggregating Resources to Tackle Large-Scale Engineering Problems," Computing in Science and Engineering, v.16, 2014, p.62. doi:10.1109/MCSE.2013.134
Juan J. Villalobos, Ivan Rodero, and Manish Parashar, "Energy-Aware Autonomic Framework for Cloud Protection and Self-Healing," International Conference on Cloud and Autonomic Computing (ICCAC), 2014. doi:10.1109/ICCAC.2014.27
Mengsong Zou, Javier Diaz-Montes, Ivan Rodero, Manish Parashar, Ioan Petri, Omer F. Rana, Xin Qi, and David J. Foran, "Collaborative marketplaces for eScience: a medical imaging use case," International Conference on Collaboration Technologies and Systems (CTS), 2014. doi:10.1109/CTS.2014.6867615
M. Gamell, K. Teranishi, M. A. Heroux, J. Mayo, H. Kolla, J. Chen, and M. Parashar, "Exploring Failure Recovery for Stencil-based Applications at Extreme Scales," International Symposium on High-Performance Parallel and Distributed Computing (HPDC), 2015. doi:10.1145/2749246.2749260
M. Gamell, K. Teranishi, M. A. Heroux, J. Mayo, H. Kolla, J. Chen, and M. Parashar, "Local recovery and failure masking for stencil-based applications at extreme scales," International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2015. doi:10.1145/2807591.2807672
Yubo Qin, Ivan Rodero, Pradeep Subedi, Manish Parashar, and Sandro Rigo, "Exploring Power Budget Scheduling Opportunities and Tradeoffs for AMR-based Applications," 30th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2018. doi:10.1109/SBAC-PAD.2018.00023

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Understanding power/performance behaviors and tradeoffs from an application perspective for emerging system configurations, i.e., those with multiple cores, deep memory hierarchies, and accelerators, has become a critical concern. The major goal of this project was to develop an instrumented experimental platform that enables such an understanding and fundamentally supports research and training activities in this area. The resulting infrastructure enables experimental exploration of the power/performance behaviors of large-scale computing systems and datacenters, as well as of the compute- and data-intensive applications they support, and supports research toward understanding the management and optimization of these systems and applications. Central to these efforts are understanding energy/performance tradeoffs for data-intensive scientific workflows and developing smarter application-aware cross-layer power management at different levels.

The project produced many research accomplishments under these goals; some of its outcomes are listed below.

  • We designed and constructed CAPER (Computational and dAta Platform for Energy efficiency Research), a unique and flexible experimental platform composed of nodes with a deep memory architecture spanning four levels: DRAM, PCIe-based NVRAM, solid-state drive (SSD), and spinning hard disk. CAPER combines high-performance Intel Xeon processors with a complete deep memory hierarchy, latest-generation coprocessors (i.e., Intel Xeon Phi), high-performance network interconnects, and comprehensive system power instrumentation.
  • We implemented a comprehensive and scalable monitoring platform using big data technologies such as HDFS and Apache Hadoop (a minimal ingestion sketch follows this list).
  • We studied the costs incurred by different in-situ analysis strategies in terms of execution time, scalability, and power consumption, a fundamental issue for large-scale workflows.
  • We characterized the performance and energy behaviors of each level of the memory hierarchy and evaluated the performance and energy consumption of different data management strategies and data exchange patterns, as well as the energy/performance tradeoffs associated with data placement, data movement and data processing.
  • We studied performance and power/energy tradeoffs of different data processing configurations and data movement strategies, and how to balance these tradeoffs with the quality of solution.
  • We evaluated the costs incurred by different in-situ data analysis strategies for large-scale applications (e.g., combustion simulations) in terms of execution time, scalability, and power consumption.
  • We explored local recovery for stencil-based parallel applications, which represent a significant class of physical simulations.
  • We explored how to manage power budgets dynamically by leveraging the properties of AMR (Adaptive Mesh Refinement) algorithms, a new approach to power and workload management (see the power-capping sketch after this list).
  • We supported other research activities that required CAPER's power instrumentation and/or its deep memory hierarchy, such as studies of energy-efficient autonomic cyber-security strategies, large-scale data analytics, and imaging-based medical research.
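The monitoring item above mentions HDFS and Apache Hadoop. As a hedged illustration of what such a pipeline could look like, the sketch below appends per-node power samples to an HDFS log over WebHDFS using the third-party `hdfs` Python package; the endpoint, paths, node names, and sample schema are assumptions, not details of the project's actual implementation.

```python
"""Minimal sketch: stream power samples into HDFS over WebHDFS.
Endpoint, user, paths, and the sample schema are hypothetical."""
import json
import time

from hdfs import InsecureClient   # third-party package: pip install hdfs

client = InsecureClient("http://namenode:9870", user="monitor")  # hypothetical NameNode
LOG = "/monitoring/power/samples.jsonl"                          # hypothetical HDFS path

def read_power_sample(node):
    # Placeholder: a real collector would query the node's power meter
    # (e.g., IPMI sensors or a metered PDU) here.
    return {"node": node, "ts": time.time(), "watts": 0.0}

def ingest(nodes):
    # One JSON line per node per interval; downstream Hadoop jobs can
    # then scan the log for analysis.
    lines = "".join(json.dumps(read_power_sample(n)) + "\n" for n in nodes)
    client.write(LOG, data=lines, encoding="utf-8", append=True)

if __name__ == "__main__":
    # Create the log once so that subsequent appends succeed.
    if client.status(LOG, strict=False) is None:
        client.write(LOG, data="", encoding="utf-8")
    while True:
        ingest(["node01", "node02"])
        time.sleep(10)            # arbitrary 10 s sampling interval
```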
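For the dynamic power-budget item, one plausible enforcement mechanism on Intel hardware is the Linux powercap (intel_rapl) sysfs interface. The sketch below sets a per-phase package power cap; the phase names and cap values are hypothetical stand-ins for the project's actual AMR-driven scheduling policy.

```python
"""Minimal sketch: dynamic power capping via the Linux powercap
(intel_rapl) sysfs interface. Requires root and the intel_rapl driver;
the AMR phase policy below is hypothetical."""
RAPL = "/sys/class/powercap/intel-rapl:0"   # CPU package 0

def read_energy_uj():
    # Cumulative package energy counter, in microjoules.
    with open(f"{RAPL}/energy_uj") as f:
        return int(f.read())

def set_power_cap_w(watts):
    # constraint_0 is the long-term package power limit, in microwatts.
    with open(f"{RAPL}/constraint_0_power_limit_uw", "w") as f:
        f.write(str(int(watts * 1e6)))

# Hypothetical policy: give compute-heavy AMR phases more of the node
# budget than regridding/communication phases, keeping the time-averaged
# draw under the overall cap.
PHASE_CAPS_W = {"integrate": 120, "regrid": 80, "exchange": 70}

def on_phase_change(phase):
    set_power_cap_w(PHASE_CAPS_W[phase])
```

The appeal of AMR here is that its phase structure is known to the runtime, so caps can be raised and lowered at phase boundaries rather than reactively.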

In addition, this project provided training to postdocs and graduate students. The work resulted in numerous publications, including refereed journal and conference papers, and many presentations. Keynotes and invited talks were given at prominent forums such as the ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, the IEEE International Parallel & Distributed Processing Symposium, and the IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, in many countries (e.g., Japan, Colombia, Italy, Germany, Spain).

Some of the research findings were also disseminated to the general public via educational outreach programs such as the Aresty Research Center (Division of Undergraduate Academic Affairs at Rutgers University) and the New Jersey Governor’s School of Engineering and Technology.

In summary, the project developed an experimental platform that provides an instrument for conducting research in areas related to power/energy efficiency, deep memory hierarchies, and data analysis. This research has also enabled other NSF-funded research projects and has influenced the design of large-scale computing platforms such as Caliburn at Rutgers.


Last Modified: 01/27/2017
Modified by: Manish Parashar
