Award Abstract # 1341698
Gateways to Discovery: Cyberinfrastructure for the Long Tail of Science

NSF Org: OAC (Office of Advanced Cyberinfrastructure)
Recipient: UNIVERSITY OF CALIFORNIA, SAN DIEGO
Initial Amendment Date: September 27, 2013
Latest Amendment Date: February 9, 2021
Award Number: 1341698
Award Instrument: Cooperative Agreement
Program Manager: Edward Walker
edwalker@nsf.gov
(703) 292-4863
OAC, Office of Advanced Cyberinfrastructure
CSE, Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2013
End Date: July 31, 2021 (Estimated)
Total Intended Award Amount: $12,000,000.00
Total Awarded Amount to Date: $27,313,476.00
Funds Obligated to Date: FY 2013 = $12,000,000.00
FY 2014 = $9,599,963.00
FY 2015 = $2,399,881.00
FY 2016 = $21,000.00
FY 2017 = $906,388.00
FY 2018 = $2,386,245.00
History of Investigator:
  • Michael Norman (Principal Investigator)
    mlnorman@ucsd.edu
  • Amitava Majumdar (Co-Principal Investigator)
  • Robert Sinkovits (Co-Principal Investigator)
  • Shawn Strande (Co-Principal Investigator)
  • Chaitanya Baru (Former Co-Principal Investigator)
  • Philip Papadopoulos (Former Co-Principal Investigator)
  • Richard Moore (Former Co-Principal Investigator)
  • Nancy Wilkins-Diehr (Former Co-Principal Investigator)
Recipient Sponsored Research Office: University of California-San Diego
9500 GILMAN DR
LA JOLLA
CA  US  92093-0021
(858)534-4896
Sponsor Congressional District: 50
Primary Place of Performance: University of California-San Diego
9500 Gilman Drive
San Diego
CA  US  92093-0934
Primary Place of Performance Congressional District: 50
Unique Entity Identifier (UEI): UYTTZT6G9DT1
Parent UEI:
NSF Program(s): XD-Extreme Digital, Innovative HPC
Primary Program Source: 01001314DB NSF RESEARCH & RELATED ACTIVITIES
01001415DB NSF RESEARCH & RELATED ACTIVITIES
01001516DB NSF RESEARCH & RELATED ACTIVITIES
01001617DB NSF RESEARCH & RELATED ACTIVITIES
01001718DB NSF RESEARCH & RELATED ACTIVITIES
01001819DB NSF RESEARCH & RELATED ACTIVITIES
Program Reference Code(s): 7619, 9251
Program Element Code(s): 747600, 761900
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

The University of California at San Diego will provide a ground-breaking new computing facility, Wildfire, that will be made available both to well-established users of high-end computing (HEC) and, especially, to new user communities that are less familiar with how HEC can advance their scientific and engineering goals.
The distinguishing features of Wildfire are:
(i) Deliver 1.8-2.0 Petaflop/s of long-sought capacity for the 98% of XSEDE jobs (50% of XSEDE core-hours) that use fewer than 1,000 cores, while also supporting larger jobs. The exact figure will depend on the speed of the processor delivered by Intel but cannot be less than 1.8 Petaflop/s.
(ii) Provide 7 PB of Lustre-based Performance Storage at 200 GB/s bandwidth for both scratch and allocated storage, as well as 6 PB of Durable Storage.
(iii) Ensure high throughput and responsiveness through allocation and scheduling policies proven on earlier deployed systems such as Trestles and Gordon.
(iv) Establish a rapid-access queue that provides new accounts within one day of the request.
(v) Enable community-supported custom software stacks via virtualization for communities that are unfamiliar with HPC environments. These virtual clusters will perform at or near native InfiniBand bandwidth/latency.

Wildfire will provide novel approaches to resource allocation, scheduling, and user support: queues with quicker response for high-throughput computing, medium-term storage allocations, virtualized environments with customized software stacks, dedicated allocations of physical and virtual machines, support for Science Gateways, and bandwidth reservations on high-speed networks. Wildfire has been designed to efficiently serve the 98% of XSEDE jobs that need fewer than 1,000 cores, while also supporting larger jobs. The award leverages, but also enhances, the services available through the XSEDE project.

The Wildfire acquisition will work to increase the diversity of researchers able to make effective use of advanced computational resources and to establish a pipeline of potential users through virtualization, science gateways, and educational activities at the undergraduate, graduate, and post-graduate levels.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 10 of 21)
Alm, K., Faria, A., Moghekar, A., Pettigrew, C., Soldan, A., et al. "Medial temporal lobe white matter pathway variability is associated with individual differences in episodic memory in cognitively normal older adults" Neurobiology of Aging , v.87 , 2020 , p.78 DOI: 10.1016/j.neurobiolaging.2019.11.011
Ardekani, S., Jain, S., Sanzi, A., Corona-Villalobos, C., Abraham, T., et al. "Shape analysis of hypertrophic and hypertensive heart disease using MRI-based 3D surface models of left ventricular geometry" Medical Image Analysis , v.29 , 2016 , p.12 DOI: 10.1016/j.media.2015.11.004
Bai, Z., and X. Zhong "Very High-Order Upwind Multi-Layer Compact (MLC) Schemes with Spectral-Like Resolution II: Two-Dimensional Case" AIAA Scitech 2019 Forum , 2019 DOI: 10.2514/6.2019-1398
Balaban, M., N. Moshiri, U. Mai, X. Jia, and S. Mirarab "TreeCluster: Clustering biological sequences using phylogenetic trees" PLOS ONE , 2019 DOI: 10.1371/journal.pone.0221068
Boyd, B. M., J. M. Allen, N.-P. Nguyen, P. Vachaspati, Z. S. Quicksall, T. Warnow, L. Mugisha, K. P. Johnson, and D. L. Reed "Primates, Lice and Bacteria: Speciation and Genome Evolution in the Symbionts of Hominid Lice" Molecular Biology and Evolution , v.34 , 2017 , p.1743 DOI: 10.1093/molbev/msx117
Grushow, A., and M. S. Reeves "Using Computational Methods To Teach Chemical Principles: Overview" ACS Symposium Series , 2019 DOI: 10.1021/bk-2019-1312.ch001
Guo, X., M. Dave, and M. Sayeed "HPCmatlab: A Framework for Fast Prototyping of Parallel Applications in Matlab" Procedia Computer Science , 2016 DOI: 10.1016/j.procs.2016.05.467
Haley, C. L., and X. Zhong "Direct Numerical Simulation of Hypersonic Flow over a Blunt Cone with Axisymmetric Isolated Roughness" 47th AIAA Fluid Dynamics Conference , 2017 DOI: 10.2514/6.2017-4514
Haley, C. L., and X. Zhong "Mode F/S Wave Packet Interference And Acoustic-like Emissions in a Mach 8 Flow Over a Cone" AIAA Scitech 2020 Forum , 2020 DOI: 10.2514/6.2020-1579
He, S., and X. Zhong "Numerical Study of Hypersonic Boundary Layer Receptivity over a Blunt Cone to Freestream Pulse Disturbances" AIAA AVIATION 2020 Forum , 2020 DOI: 10.2514/6.2020-2996

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

The San Diego Supercomputer Center at the University of California, San Diego deployed the Comet supercomputer as a national resource in 2015. It was operated for allocated access by academic researchers, educators, and students through the NSF XSEDE project from May 2015 to July 2021. Following its decommissioning as an NSF-funded resource, Comet transitioned to serve as a resource for the Center for Western Weather and Water Extremes (CW3E), a research and service project of the Scripps Institution of Oceanography. During its 75 months of operation as an XSEDE resource, Comet ran over 28 million jobs; provided over 2 billion core-hours and 13 million GPU-hours of compute time; served over 100,000 unique users, most of whom gained access via a science gateway rather than the command line; enabled publication of over 2,000 scientific papers; and supported research across virtually every domain of science and engineering.

Comet had a peak speed of 2.8 Pflop/s, delivered by 48,784 cores in 1,944 compute nodes, each with two 12-core Intel Haswell processors and 128 GB of DRAM. It also had 72 GPU nodes, half with four NVIDIA K80s and half with four NVIDIA P100s, plus four large-memory (1.5 TB) nodes. Like its predecessor Gordon, Comet featured a large amount of flash storage, with solid-state drives on every compute and GPU node. SDSC designed Comet in collaboration with vendor partners Dell, Intel, NVIDIA, Mellanox, and Aeon Computing.
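
As a rough plausibility check on these figures, here is a minimal sketch of the theoretical-peak arithmetic for the CPU partition. The clock rate and per-core throughput are assumptions not stated in this report (typical values for the Haswell generation), so the result is an estimate, not an official specification.

    # Back-of-the-envelope theoretical peak for Comet's CPU partition (Python).
    # Assumptions (not from the award page): 2.5 GHz clock and 16
    # double-precision FLOPs per core per cycle (AVX2 fused multiply-add).
    NODES = 1944          # compute nodes, from the report
    CORES_PER_NODE = 24   # two 12-core Haswell processors
    CLOCK_GHZ = 2.5       # assumed clock rate
    FLOPS_PER_CYCLE = 16  # assumed AVX2 FMA throughput per core

    peak_gflops = NODES * CORES_PER_NODE * CLOCK_GHZ * FLOPS_PER_CYCLE
    print(f"CPU-only theoretical peak: {peak_gflops / 1e6:.2f} Pflop/s")
    # ~1.87 Pflop/s, consistent with the award's 1.8-2.0 Pflop/s target;
    # the GPU nodes account for the rest of the 2.8 Pflop/s system peak.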

Comet was designed explicitly to serve the long tail of science, defined as the large number of researchers who require only modest numbers of cores or GPUs. Such users also benefited from Comet's optimized scheduling and allocation policies, which lowered the barrier to accessing a complex high-performance computer. The design incorporated several significant technology and policy innovations:

-A heterogeneous architecture of CPUs, GPUs, and large-memory nodes, along with a rich storage hierarchy, supported a broad range of science and engineering research.

-One compute rack of 2,016 cores, connected by a non-blocking FDR InfiniBand fat tree, supported a wide range of job sizes, from single-core jobs to modest-scale, fully coupled applications.

-Virtual Cluster (VC) software, developed by SDSC in partnership with collaborators at Indiana University, provided a low-overhead virtualization layer that allowed customized software to run side-by-side with the standard cluster software stack.

-Capping an individual PI's allocation at 10M core-hours allowed Comet to support more projects, while a higher limit of 20M core-hours for science gateways provided access for many more users without the overhead of requesting their own allocations (see the sketch below).
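
To make the allocation caps concrete, here is a minimal sketch of what they represent relative to a year of capacity. It assumes the CPU partition runs fully utilized around the clock, which is an idealization, not a figure from the report.

    # Rough share of Comet's annual CPU capacity covered by each cap (Python).
    CORES = 1944 * 24                  # 46,656 cores in the compute partition
    HOURS_PER_YEAR = 24 * 365
    annual_core_hours = CORES * HOURS_PER_YEAR      # ~409M core-hours/year

    pi_cap = 10_000_000                # individual PI cap, from the report
    gateway_cap = 20_000_000           # science-gateway cap, from the report
    print(f"annual capacity: {annual_core_hours / 1e6:.0f}M core-hours")
    print(f"PI cap: {100 * pi_cap / annual_core_hours:.1f}% of a year")
    print(f"gateway cap: {100 * gateway_cap / annual_core_hours:.1f}% of a year")
    # A 10M core-hour cap is roughly 2.4% of a year's capacity, so dozens of
    # maximal awards, plus many smaller ones, can run concurrently.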

Comet's Virtual Cluster interface was used by researchers from the Laser Interferometer Gravitational-Wave Observatory (LIGO) in support of the confirmation of the landmark detection of gravitational waves, predicted by Albert Einstein over 100 years ago. LIGO researchers consumed nearly 630,000 hours of computational time on Comet via the Open Science Grid (OSG), using a VC that integrated OSG's high-throughput scheduler directly into Comet's batch scheduler. Comet also became one of the first NSF national resources to offer Singularity, which allowed users to run containerized application software that would otherwise not be feasible under a standard cluster software stack.
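
As an illustration of the container workflow, here is a minimal Python sketch of launching a containerized application through Singularity. The "singularity exec" command form is real; the image and workload names are hypothetical, and on a system like Comet such a command would normally run inside a batch job rather than interactively.

    # Minimal sketch: run a command inside a Singularity container from Python.
    import subprocess

    result = subprocess.run(
        ["singularity", "exec",        # execute a command inside a container
         "mycontainer.sif",            # hypothetical container image
         "python3", "analyze.py"],     # hypothetical containerized workload
        capture_output=True, text=True, check=True,
    )
    print(result.stdout)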

Comet set out to reach 10,000 unique users over its lifetime, a goal that was achieved within the first year of operations. Notable science gateways included CIPRES, I-TASSER, and the Neuroscience Gateway. Across these and Comet's other gateways, over 100,000 unique users accessed the system to study a wide range of physical, chemical, and biological systems.

During its lifetime, Comet became a primary source of GPUs for the community. A research team led by UCSD's Rommie Amaro and Arvind Ramanathan, a computational biologist at Argonne National Laboratory, explored the movement of SARS-CoV-2's spike protein to understand how it behaves and gains entry to human cells. Using Comet's GPU resources as part of the scaling work, the team built a workflow based on artificial intelligence (AI) to simulate the spike protein more efficiently. The work earned a special Gordon Bell Prize at the annual Supercomputing Conference. In 2020, Comet joined the COVID-19 HPC Consortium, adding resources to help understand the spread of COVID-19 and to aid the search for treatments and vaccines.

Outreach activities exposed thousands of researchers, educators, and students to the benefits of Comet's unique features and ease of use. SDSC staff hosted tutorials at scientific meetings, workshops at SDSC and on other university campuses, and annual summer institutes. When COVID-19 constrained travel, programs were conducted virtually; SDSC used that opportunity to improve remote training processes and tools, ultimately raising participation above pre-pandemic levels. In the final years of service there was a marked increase in interest in machine learning and AI, and targeted outreach to meet this demand produced a body of training materials now in use with SDSC's Expanse system.


Last Modified: 11/26/2021
Modified by: Michael L Norman
