
NSF Org: | Office of Advanced Cyberinfrastructure (OAC) |
Recipient: |
|
Initial Amendment Date: | August 20, 2023 |
Latest Amendment Date: | August 20, 2023 |
Award Number: | 2321123 |
Award Instrument: | Standard Grant |
Program Manager: | Sharmistha Bagchi-Sen, shabagch@nsf.gov, (703)292-8104, Office of Advanced Cyberinfrastructure (OAC), Directorate for Computer and Information Science and Engineering (CSE) |
Start Date: | September 1, 2023 |
End Date: | August 31, 2025 (Estimated) |
Total Intended Award Amount: | $299,587.00 |
Total Awarded Amount to Date: | $299,587.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: | 5200 N LAKE RD MERCED CA US 95343-5001 (209)201-2039 |
Sponsor Congressional District: |
|
Primary Place of Performance: | 5200 N LAKE RD MERCED CA US 95343-5001 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | CyberTraining - Training-based |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
High-Performance Computing (HPC) has revolutionized various scientific fields, including climate research, wildlife health, agricultural sciences, and scientific simulations and modeling. With the emergence of HPC-accelerated deep learning (HPC-DL) systems and applications, there is a pressing need for comprehensive cross-layer training materials to educate the research workforce on these advanced technologies. The primary objective of this pilot project is to address this need by providing comprehensive cross-layer HPC-DL training to a wide range of cyberinfrastructure (CI) users. The target audience includes undergraduate and graduate students, postdocs, faculty, and research staff who can benefit from enhanced knowledge and skills in utilizing HPC-DL CI technologies and resources. By equipping them with the necessary training, the project aims to improve their research efficiency and maximize the potential of HPC-DL in their respective fields. In addition, the project has a specific focus on fostering inclusivity and expanding opportunities for underrepresented communities in the Central Valley area of California. This will contribute to the national interest by empowering individuals with the knowledge and skills necessary to excel in the HPC-DL field.
This project addresses the critical training needs of the converged HPC-DL field by developing comprehensive training materials, fostering peer consultant programs, conducting workshops, and building an inclusive learning culture. It integrates scientific applications, HPC technologies, and DL in a cross-layer approach. The training program covers several important CI topics, including Remote Direct Memory Access (RDMA), GPU-based distributed computing, Slurm, MPI, and NCCL, which are critical to achieving high performance for HPC-DL workloads. The training will also cover distributed DL training frameworks such as PyTorch, TensorFlow, and Horovod, enabling participants to use these tools effectively in their research. Moreover, the training incorporates practical DL application case studies, offering real-world examples and insights. The short-term goal is to empower individuals with HPC-DL knowledge and cross-layer optimization skills to maximize the utilization of HPC-DL CI resources and improve research efficiency. This project will also examine the effectiveness of practice-central models and HPC-DL-centered workshops in promoting HPC-DL adoption in underrepresented communities. The project's long-term aim is to cultivate a robust research workforce with a deep understanding of HPC-DL CIs. By establishing a learning culture and targeting a significant number of CI users, this project addresses workforce shortages and extends its impact beyond the Central Valley. Through collaborations and the dissemination of open-source training materials, it will contribute to advancing compute- and data-intensive scientific simulations and knowledge discovery.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.