
NSF Org: CNS Division Of Computer and Network Systems
Recipient:
Initial Amendment Date: August 30, 2013
Latest Amendment Date: August 30, 2013
Award Number: 1305375
Award Instrument: Standard Grant
Program Manager: Almadena Chtchelkanova (achtchel@nsf.gov, (703) 292-7498), CNS Division Of Computer and Network Systems, CSE Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2013
End Date: September 30, 2016 (Estimated)
Total Intended Award Amount: $300,000.00
Total Awarded Amount to Date: $300,000.00
Funds Obligated to Date:
History of Investigator:
Recipient Sponsored Research Office: 3 RUTGERS PLZ, NEW BRUNSWICK, NJ 08901-8559, US, (848) 932-0150
Sponsor Congressional District:
Primary Place of Performance: 94 Brett Road, Piscataway, NJ 08854-8058, US
Primary Place of Performance Congressional District:
Unique Entity Identifier (UEI):
Parent UEI:
NSF Program(s): CCRI-CISE Cmnty Rsrch Infrstrc
Primary Program Source:
Program Reference Code(s):
Program Element Code(s):
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070
ABSTRACT
As the scale and complexity of the computing and data infrastructures supporting science and engineering grow, power consumption is becoming an important concern in terms of cost, reliability and overall sustainability. As a result, it is increasingly important to understand power/performance behaviors and tradeoffs from an application perspective for emerging system configurations, i.e., those with multiple cores, deep memory hierarchies and accelerators. This project builds an instrumented experimental platform that supports such an understanding and enables research and training activities in this area. Specifically, the proposed experimental platform is composed of nodes with a deep memory architecture containing four levels: DRAM, PCIe-based non-volatile memory, solid-state drive and spinning hard disk, in addition to accelerators. Power metering is deployed as part of the infrastructure.
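As one illustration of node-level power metering, the sketch below samples CPU-package energy counters exposed through the Linux RAPL powercap interface and converts the deltas to average watts. This is a minimal, hypothetical example rather than the project's actual instrumentation (which the abstract describes only as deployed power metering); the sysfs paths and sampling interval are assumptions, and some platforms restrict access to these counters.

    # Minimal sketch (not the project's actual metering): estimate average
    # CPU-package power by sampling Linux RAPL energy counters over an interval.
    import time
    from pathlib import Path

    RAPL_ROOT = Path("/sys/class/powercap")   # standard Linux powercap location

    def read_energy_uj():
        """Sum energy_uj over top-level RAPL package domains (e.g. intel-rapl:0)."""
        total = 0
        for domain in RAPL_ROOT.glob("intel-rapl:*"):
            if domain.name.count(":") == 1:    # skip sub-domains like intel-rapl:0:0
                total += int((domain / "energy_uj").read_text())
        return total                           # may require root; counters also wrap

    def average_watts(interval_s=1.0):
        """Average package power (W) over one sampling interval."""
        before = read_energy_uj()
        time.sleep(interval_s)
        after = read_energy_uj()
        return (after - before) / 1e6 / interval_s   # microjoules -> watts

    if __name__ == "__main__":
        print(f"approx. package power: {average_watts():.1f} W")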
The platform enables experimental exploration of the power/performance behaviors of large-scale computing systems and datacenters, as well as of the compute- and data-intensive applications they support, and uniquely supports research toward understanding the management and optimization of these systems and applications. It also enables research in multiple areas, including application-aware cross-layer management, power-performance tradeoffs for data-intensive scientific workflows, and the thermal implications of deep memory hierarchies in virtualized cloud environments.
Data- and compute-intensive applications are becoming increasingly critical to a wide range of domains, and the ability to develop large-scale, sustainable platforms and software infrastructure to support them will have a significant impact in driving research and innovation in these domains. The developed experimental platform enables key research activities toward this goal and provides insights that will influence the realization and sustainability of the very large-scale infrastructures required by current and emerging data- and compute-intensive applications. It also serves as an important resource for education and training in areas related to power management, energy efficiency, data management, memory management, and virtualization.
PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH
Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full-text articles may not yet be available without a charge during the embargo (administrative interval). Some links on this page may take you to non-federal websites, whose policies may differ from those of this site.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Understanding power/performance behaviors and tradeoffs from an application perspective for emerging system configurations, i.e., those with multiple cores, deep memory hierarchies and accelerators, has become a critical concern. The major goal of this project was to develop an instrumented experimental platform that enables such an understanding and fundamentally supports research and training activities in this area. This experimental infrastructure enables exploration of the power/performance behaviors of large-scale computing systems and datacenters, as well as of the compute- and data-intensive applications they support, and it supports research toward understanding the management and optimization of these systems and applications. Central to these efforts are understanding energy/performance tradeoffs for data-intensive scientific workflows and developing smarter application-aware cross-layer power management at different levels.
There are many research accomplishments under these goals. A list of some of the outcomes of the project is provided below.
- We designed and constructed CAPER (Computational and dAta Platform for Energy efficiency Research), a unique and flexible experimental platform composed of nodes with a deep memory architecture spanning four levels: DRAM, PCIe-based NVRAM, solid-state drive (SSD) and spinning hard disk. CAPER combines high-performance Intel Xeon processors with a complete deep memory hierarchy, latest-generation coprocessors (i.e., Intel Xeon Phi), high-performance network interconnects, and comprehensive system power instrumentation.
- We implemented a comprehensive and scalable monitoring platform using big data technologies such as HDFS and Apache Hadoop (see the monitoring sketch after this list).
- We studied the costs incurred by different in-situ analysis strategies in terms of execution time, scalability and power consumption, which is a fundamental issue for large-scale workflows.
- We characterized the performance and energy behaviors of each level of the memory hierarchy and evaluated the performance and energy consumption of different data management strategies and data exchange patterns, as well as the energy/performance tradeoffs associated with data placement, data movement and data processing (see the memory-tier sketch after this list).
- We studied performance and power/energy tradeoffs of different data processing configurations and data movement strategies, and how to balance these tradeoffs with the quality of solution.
- We evaluated the costs incurred by different in-situ data analysis strategies for large-scale simulations (e.g., combustion simulations) in terms of execution time, scalability and power consumption (see the in-situ analysis sketch after this list).
- We explored local recovery for stencil-based parallel applications, which represent a significant class of physical simulations (see the stencil-recovery sketch after this list).
- We explored how to manage power budgets dynamically by leveraging the properties of AMR (Adaptive Mesh Refinement) algorithms, a new approach to power and workload management (see the power-budget sketch after this list).
- We supported other research activities that required the use of CAPER's power instrumentation and/or deep memory hierarchies, such as studies of energy-efficient autonomic cyber-security strategies, large-scale data analytics, and imaging-based medical research.
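Monitoring sketch. The monitoring-platform item above mentions HDFS and Apache Hadoop. The following sketch is a hypothetical illustration of the general idea, not the project's actual monitoring software: it batches node power samples into a local JSON-lines file and pushes the batch into HDFS with the standard `hdfs dfs -put` command. The HDFS directory, file naming and sampling details are assumptions.

    # Hypothetical monitoring sketch: batch power samples locally, then push to HDFS.
    import json, socket, subprocess, time

    HDFS_DIR = "/monitoring/power"               # assumed HDFS target directory

    def collect_samples(n=60, interval_s=1.0, sample_fn=None):
        """Collect n power samples; sample_fn would wrap the node's power meter
        (e.g., the RAPL reader sketched earlier)."""
        samples = []
        for _ in range(n):
            watts = sample_fn() if sample_fn else 0.0   # placeholder reading
            samples.append({"host": socket.gethostname(),
                            "ts": time.time(), "watts": watts})
            time.sleep(interval_s)
        return samples

    def push_to_hdfs(samples):
        """Write one JSON-lines batch and upload it with the Hadoop CLI."""
        local = f"/tmp/power-{socket.gethostname()}-{int(time.time())}.jsonl"
        with open(local, "w") as f:
            for s in samples:
                f.write(json.dumps(s) + "\n")
        subprocess.run(["hdfs", "dfs", "-put", local, HDFS_DIR], check=True)

    if __name__ == "__main__":
        push_to_hdfs(collect_samples(n=5))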
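Memory-tier sketch. The memory-hierarchy characterization item can be illustrated with a small microbenchmark. The sketch below times a sequential write and read of a fixed-size buffer on file systems assumed to be backed by the different tiers (a tmpfs mount standing in for DRAM, plus mount points for NVRAM, SSD and HDD), and pairs each timing with a RAPL energy delta as in the earlier power-metering sketch. The mount points are placeholders; this is not the project's actual benchmark.

    # Hypothetical tier microbenchmark: time a sequential write+read per tier and
    # record the CPU-package energy consumed, to compare bandwidth vs. energy.
    import os, time
    from pathlib import Path

    TIERS = {                        # placeholder mount points for each tier
        "dram":  "/dev/shm/bench",   # tmpfs-backed, approximates DRAM
        "nvram": "/mnt/nvram/bench",
        "ssd":   "/mnt/ssd/bench",
        "hdd":   "/mnt/hdd/bench",
    }
    SIZE = 256 * 1024 * 1024         # 256 MiB test buffer

    def package_energy_uj():
        """Total energy_uj over RAPL package domains (see power-metering sketch)."""
        total = 0
        for d in Path("/sys/class/powercap").glob("intel-rapl:*"):
            if d.name.count(":") == 1:
                total += int((d / "energy_uj").read_text())
        return total

    def run_tier(path):
        data = os.urandom(SIZE)
        e0, t0 = package_energy_uj(), time.time()
        with open(path, "wb") as f:
            f.write(data)
            f.flush(); os.fsync(f.fileno())    # force data to the device
        with open(path, "rb") as f:
            f.read()                           # note: may be served from page cache
        t1, e1 = time.time(), package_energy_uj()
        os.remove(path)
        return t1 - t0, (e1 - e0) / 1e6        # seconds, joules

    if __name__ == "__main__":
        for tier, path in TIERS.items():
            if os.path.isdir(os.path.dirname(path)):
                secs, joules = run_tier(path)
                print(f"{tier:5s}: {secs:6.2f} s, {joules:7.1f} J")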
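In-situ analysis sketch. For the in-situ analysis cost studies, the sketch below contrasts, on synthetic data, an in-situ strategy (analysis performed inline with the simulation step) with a post-hoc strategy (data written out and analyzed later), measuring wall-clock time for each. The simulation and analysis kernels are stand-ins, not the combustion codes used in the project.

    # Hypothetical comparison of in-situ vs. post-hoc analysis cost on synthetic data.
    import os, tempfile, time
    import numpy as np

    def simulate_step(n=1_000_000):
        return np.random.rand(n)               # stand-in for one simulation step

    def analyze(field):
        return float(field.mean()), float(field.std())   # stand-in analysis kernel

    def in_situ(steps=10):
        t0 = time.time()
        for _ in range(steps):
            analyze(simulate_step())            # analyze inline, no I/O
        return time.time() - t0

    def post_hoc(steps=10):
        t0, files = time.time(), []
        for _ in range(steps):                  # write every step to disk ...
            with tempfile.NamedTemporaryFile(suffix=".npy", delete=False) as f:
                np.save(f, simulate_step())
                files.append(f.name)
        for name in files:                      # ... then read back and analyze
            analyze(np.load(name))
            os.unlink(name)
        return time.time() - t0

    if __name__ == "__main__":
        print(f"in-situ : {in_situ():.2f} s")
        print(f"post-hoc: {post_hoc():.2f} s")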
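Stencil-recovery sketch. For the local-recovery work on stencil-based applications, the sketch below shows the basic idea on a toy 1-D heat-equation stencil: the subdomain is periodically checkpointed locally, and after a simulated failure only that local block is restored and recomputed forward, rather than rolling the whole run back. This is a single-process illustration of the concept, not the project's implementation; in a real parallel run, neighbor ranks would supply halo data during replay.

    # Toy illustration of local recovery for a stencil code: checkpoint one
    # subdomain locally and restore only that block after a simulated failure.
    import numpy as np

    N, STEPS, CKPT_EVERY, ALPHA = 64, 100, 10, 0.1

    def step(u):
        """One explicit heat-equation (3-point stencil) update with fixed ends."""
        v = u.copy()
        v[1:-1] = u[1:-1] + ALPHA * (u[2:] - 2 * u[1:-1] + u[:-2])
        return v

    u = np.linspace(0.0, 1.0, N)
    checkpoints = {}                      # step -> copy of this rank's subdomain

    for s in range(1, STEPS + 1):
        u = step(u)
        if s % CKPT_EVERY == 0:
            checkpoints[s] = u.copy()     # local (per-subdomain) checkpoint
        if s == 55:                       # simulated failure of this subdomain
            last = max(checkpoints)
            u = checkpoints[last].copy()  # restore only the local block ...
            for _ in range(s - last):     # ... and recompute forward to step s
                u = step(u)

    print("final mean:", float(u.mean()))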
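Power-budget sketch. The dynamic power-budgeting item leverages the fact that an AMR workload shifts as the mesh refines. The sketch below shows one simple, hypothetical policy (not the project's algorithm): divide a total power budget among nodes in proportion to the number of refined cells each currently owns, clamped to assumed per-node limits. Applying the cap is left as a stub, since the enforcement mechanism (e.g., RAPL limits or vendor tools) is platform specific.

    # Hypothetical AMR-aware power budgeting: split a total budget across nodes in
    # proportion to their current refined-cell counts, within per-node cap limits.
    TOTAL_BUDGET_W = 2000.0
    MIN_CAP_W, MAX_CAP_W = 80.0, 300.0

    def allocate_caps(cells_per_node):
        """Return {node: power cap in watts} proportional to AMR cell counts.
        Clamping can leave the sum above/below the budget; a fuller policy
        would redistribute the remainder."""
        total_cells = sum(cells_per_node.values()) or 1
        caps = {}
        for node, cells in cells_per_node.items():
            share = TOTAL_BUDGET_W * cells / total_cells
            caps[node] = min(MAX_CAP_W, max(MIN_CAP_W, share))
        return caps

    def apply_cap(node, watts):
        """Stub: a real system would set a RAPL or vendor power limit on `node`."""
        print(f"{node}: cap set to {watts:.0f} W")

    if __name__ == "__main__":
        # Cell counts change after each AMR regrid; reallocate whenever they do.
        workload = {"node01": 120_000, "node02": 40_000, "node03": 5_000}
        for node, cap in allocate_caps(workload).items():
            apply_cap(node, cap)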
In addition, this project provided training to postdocs and graduate students. The work resulted in a large number of publications, including refereed journal and conference papers, and numerous presentations. Keynotes and invited talks were given at esteemed forums such as the ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC), the IEEE International Parallel & Distributed Processing Symposium (IPDPS), and the IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), in many countries throughout the world (e.g., Japan, Colombia, Italy, Germany, Spain).
Some of the research findings were also disseminated to the general public via educational outreach programs such as the Aresty Research Center (Division of Undergraduate Academic Affairs at Rutgers University) and the New Jersey Governor's School of Engineering and Technology.
In summary, an instrumented experimental platform has been developed that supports research in areas related to power/energy efficiency, deep memory hierarchies and data analysis. Further, this research has enabled other NSF-funded research projects and has influenced the design of large-scale computing platforms such as Caliburn at Rutgers.
Last Modified: 01/27/2017
Modified by: Manish Parashar