
NSF Org: |
OAC Office of Advanced Cyberinfrastructure (OAC) |
Recipient: |
|
Initial Amendment Date: | May 29, 2020 |
Latest Amendment Date: | June 7, 2024 |
Award Number: | 2005632 |
Award Instrument: | Cooperative Agreement |
Program Manager: | Robert Chadduck, rchadduc@nsf.gov, (703)292-2247, OAC Office of Advanced Cyberinfrastructure (OAC), CSE Directorate for Computer and Information Science and Engineering |
Start Date: | October 1, 2020 |
End Date: | September 30, 2027 (Estimated) |
Total Intended Award Amount: | $9,952,154.00 |
Total Awarded Amount to Date: | $29,426,684.00 |
Funds Obligated to Date: | FY 2021 = $10,449,761.00; FY 2022 = $2,038,429.00; FY 2023 = $48,000.00; FY 2024 = $4,947,910.00 |
History of Investigator: |
|
Recipient Sponsored Research Office: |
2550 NORTHWESTERN AVE # 1100 WEST LAFAYETTE IN US 47906-1332 (765)494-1055 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
155 South Grant Street West Lafayette IN US 47907-2114 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | Innovative HPC |
Primary Program Source: |
01002324DB NSF RESEARCH & RELATED ACTIVIT 01002425DB NSF RESEARCH & RELATED ACTIVIT 01002021DB NSF RESEARCH & RELATED ACTIVIT 01002122DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
As computing permeates nearly all fields of science and engineering, computing needs are growing exponentially in both the traditional computing-intensive domains and the emerging, more diverse fields of research. The rise of machine learning and artificial intelligence applications has accelerated and broadened the use of computational resources, from research on new and more environmentally friendly materials to improving medicine in our fight against deadly diseases. There are three main challenges to meeting this rapidly evolving landscape of national computational needs: a shortage of capacity, increasingly diverse applications, and the need for computational literacy and training. This project aims to meet these challenges and transform the way computing is delivered by developing and deploying a composable advanced computing resource, Anvil, for the national research community, significantly increasing both computing capacity and accessibility. Anvil integrates a large-capacity high-performance computing (HPC) cluster with a comprehensive ecosystem of software, access interfaces, programming environments, and composable services to form a seamless environment able to support a broad range of current and future science and engineering applications. Through a carefully designed student training program and partnerships with regional and other universities, XSEDE, and Women in HPC programs, this project will develop computing competency in the next-generation workforce, and engage and train a broader audience including underrepresented students at minority-serving and EPSCoR (Established Program to Stimulate Competitive Research) institutions.
Built with a forward-looking architecture with a high core count, and improved memory bandwidth and I/O, Anvil can effectively support traditional HPC with fast turnaround for high throughput, mid-scale computation jobs. Anvil consists of 1000 128-core computing nodes based on the next-generation AMD Epyc "Milan" architecture that can deliver a total peak performance of 5.3 Petaflops. Each node has 256 GB of memory, and a 100 gigabits/second bandwidth from the Mellanox HDR InfiniBand interconnect, allowing multiple jobs of up to 1024 cores to be run at full speed over the interconnect fabric. These nodes are complemented by 32 large-memory nodes with 1 TB of RAM each, and 16 Nvidia GPU nodes with 4 "Volta Next" GPUs per node. The GPU nodes are capable of 1.57 petaflops of single-precision performance to support machine learning and a wide range of current and future science and engineering applications. Anvil's multiple tiers of storage systems include a long-term archive, persistent file and campaign storage, a 10 PB scratch file system, a 3 PB flash burst buffer, and object storage to support a variety of workflows and storage needs.
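The hardware figures quoted above imply a few aggregate numbers that the abstract does not state directly. The short Python sketch below works them out. The constant and function names are illustrative only and do not come from the award. The per-core and per-GPU values are simple averages of the quoted peak figures, not measured or vendor-specified performance.

```python
# Figures quoted in the abstract. All names below are illustrative assumptions.
CPU_NODES = 1000
CORES_PER_NODE = 128
MEM_PER_NODE_GB = 256
CPU_PEAK_PFLOPS = 5.3            # total peak quoted for the CPU nodes
LARGE_MEM_NODES = 32
LARGE_MEM_PER_NODE_TB = 1
GPU_NODES = 16
GPUS_PER_NODE = 4
GPU_PEAK_PFLOPS_SP = 1.57        # single-precision peak quoted for the GPU nodes
FULL_SPEED_JOB_CORES = 1024      # largest job size quoted as running at full speed


def derived_figures():
    """Return aggregate values implied by the quoted per-node figures."""
    total_cores = CPU_NODES * CORES_PER_NODE
    total_gpus = GPU_NODES * GPUS_PER_NODE
    return {
        "total_cpu_cores": total_cores,
        "total_cpu_memory_gb": CPU_NODES * MEM_PER_NODE_GB,
        "large_mem_total_tb": LARGE_MEM_NODES * LARGE_MEM_PER_NODE_TB,
        "total_gpus": total_gpus,
        # How many whole nodes a 1024-core job occupies.
        "nodes_per_full_speed_job": FULL_SPEED_JOB_CORES // CORES_PER_NODE,
        # Averages of the quoted peaks (1 PFLOPS = 1e6 GFLOPS = 1e3 TFLOPS).
        "avg_peak_gflops_per_core": CPU_PEAK_PFLOPS * 1e6 / total_cores,
        "avg_peak_tflops_sp_per_gpu": GPU_PEAK_PFLOPS_SP * 1e3 / total_gpus,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value}")
```

Read this way, a 1024-core job spans eight of the 128-core nodes, and the 16 GPU nodes hold 64 GPUs in total.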
Anvil will lower the barrier to entry to advanced computing cyberinfrastructure (CI) by providing interactive computing and desktop environments that ease the transition for users from diverse domains who are new to HPC. Feature-rich interactive environments such as Open OnDemand and ThinLinc let users rapidly become productive on Anvil through Linux and Windows desktops, or through familiar browser-based tools (e.g., Jupyter, RStudio). Complex scientific software environments and application stacks will be supported via containers orchestrated within a powerful composable subsystem. Anvil supports cloud-bursting of computational workloads as well as the use of public cloud machine learning platforms, including GPU and FPGA accelerators and software tools that automate hyperparameter tuning and algorithm selection for exploratory ML research. An existing production-quality science gateway at Purdue will help XSEDE researchers share their data and tools online and will facilitate easy access to Anvil and other XSEDE resources in classroom instruction and training activities.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.