Award Abstract # 2223704
SHF: Small: Methods, Workflows, and Data Commons for Reducing Training Costs in Neural Architecture Search on High-Performance Computing Platforms

NSF Org: CCF
Division of Computing and Communication Foundations
Recipient: UNIVERSITY OF TENNESSEE
Initial Amendment Date: July 13, 2022
Latest Amendment Date: September 11, 2023
Award Number: 2223704
Award Instrument: Standard Grant
Program Manager: Almadena Chtchelkanova
achtchel@nsf.gov
(703) 292-7498
CCF Division of Computing and Communication Foundations
CSE Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2022
End Date: September 30, 2026 (Estimated)
Total Intended Award Amount: $623,999.00
Total Awarded Amount to Date: $623,999.00
Funds Obligated to Date: FY 2022 = $623,999.00
History of Investigator:
  • Michela Taufer (Principal Investigator)
    taufer@utk.edu
  • Catherine Schuman (Co-Principal Investigator)
  • Silvina Caino-Lores (Former Co-Principal Investigator)
Recipient Sponsored Research Office: University of Tennessee Knoxville
201 ANDY HOLT TOWER
KNOXVILLE
TN  US  37996-0001
(865)974-3466
Sponsor Congressional District: 02
Primary Place of Performance: University of Tennessee Knoxville
1331 CIR PARK DR
Knoxville
TN  US  37916-3801
Primary Place of Performance Congressional District: 02
Unique Entity Identifier (UEI): FN2YCS2YAUW3
Parent UEI: LXG4F9K8YZK5
NSF Program(s): Software & Hardware Foundations
Primary Program Source: 01002223DB NSF RESEARCH & RELATED ACTIVITIES
Program Reference Code(s): 7942, 7923, 9102
Program Element Code(s): 779800
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Neural networks are powerful artificial-intelligence models that automatically capture knowledge embedded in scientific data. Scientists can use this knowledge to solve problems in domains such as physics, materials science, neuroscience, and medical imaging. Finding an accurate neural network for a specific scientific dataset or problem comes at a high training cost: it requires searching among thousands of candidate neural networks on a large number of high-performance-computing resources. This project delivers methods, workflows, and a data commons for reducing the training cost of neural networks. The methods are based on parametric modeling and enable termination of unpromising candidates early in the training process, making the search faster and cheaper. The workflows decouple the search from the accuracy prediction of neural networks for different datasets and problems. The data commons shares the full provenance of the neural networks so other scientists can deploy them in their own research. Advances in neural-network research have a far-reaching impact on many scientific applications: accurate neural networks can extract structural information from raw microscopy data, predict the performance of business processes, analyze cancer pathology data, map protein sequences to folds, and predict soil moisture or crop yield. The researchers' efforts to build a broader community of high-performance-computing experts also have a far-reaching impact on the efficient design and use of artificial-intelligence products. The team promotes increased participation of underrepresented students, particularly women, by mentoring students in Systers (the organization for women in Electrical Engineering and Computer Science at the University of Tennessee Knoxville). The researchers also develop curricula tailored for a diverse population of graduate and undergraduate students across scientific domains beyond computer science.
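
To make the early-termination idea concrete, the sketch below fits a parametric curve to the first few epochs of a candidate network's validation accuracy and extrapolates its final fitness, discarding the candidate if the prediction falls below a cutoff. This is a minimal illustration, not the project's actual method: the power-law model, the synthetic accuracy values, and the cutoff are all assumptions for demonstration.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(epoch, a, b, c):
    """Assumed fitness model: accuracy rises toward an asymptote c."""
    return c - a * np.power(epoch, -b)

# Hypothetical validation accuracy observed over the first 10 epochs.
epochs = np.arange(1, 11, dtype=float)
observed = np.array([0.42, 0.55, 0.61, 0.66, 0.69,
                     0.71, 0.73, 0.74, 0.75, 0.76])

# Fit the parametric model to the partial learning curve.
params, _ = curve_fit(power_law, epochs, observed,
                      p0=(0.5, 0.5, 0.9), maxfev=10_000)

# Extrapolate fitness at the full training budget (e.g., 100 epochs).
predicted_final = power_law(100.0, *params)
print(f"predicted final accuracy: {predicted_final:.3f}")

# Terminate training early if the prediction is unpromising.
FITNESS_CUTOFF = 0.80  # hypothetical threshold
if predicted_final < FITNESS_CUTOFF:
    print("early termination: candidate unlikely to reach the cutoff")
```

In a neural-architecture search, a check like this would run after a handful of epochs for each candidate, so that only networks whose extrapolated fitness clears the cutoff consume a full training budget.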

This project addresses the urgent need to reduce the use of high-performance-computing resources for the training of neural networks while ensuring explainable, reproducible, and nearly optimal neural networks. To this end, the researchers propose a flexible fitness-prediction method that uses parametric modeling to predict the future fitness of neural networks and allow early termination of the training process. Through this project, the researchers create an index of effective parametric functions for a diverse suite of fitness curves, including edge cases in the modeling (e.g., neural networks that never learn or that experience a learning delay). The researchers transform neural-architecture search implementations from tightly coupled, monolithic software tools that embed both search and prediction into flexible, modular workflows in which search and prediction are decoupled. These workflows enable users to reduce training cost, increase neural-architecture search throughput, and adapt fitness predictions to different fitness measurements, datasets, and problems. The researchers build a searchable and reusable neural-network data commons of record trails that capture each neural network's lifespan through the generation, training, and validation stages, recording the network architecture, the training dataset, and loss and accuracy values throughout each stage. The data commons enables users to study the evolution of neural-network performance during training and to identify relationships between a network's architecture and its performance on a given dataset with specific properties, ultimately supporting effective searches for accurate neural networks across a spectrum of real-world scientific datasets. The data commons also provides the scientific community with a resource to study the relationships between datasets, network architectures, and performance. To assess robustness across datasets, the project considers both well-known benchmark datasets and real-world scientific datasets: protein diffraction patterns from x-ray free-electron laser beams in protein structural analysis, crop-scouting drone images in precision farming, and forestry-scouting drone images for wildfire prevention.
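
As an illustration of what a record trail in such a data commons might contain, the sketch below defines a minimal per-network record that captures the architecture, the training dataset, the lifespan stage, and per-epoch loss and accuracy, serialized to JSON so it can be indexed and searched. The field names and structure are hypothetical, not the project's actual schema.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class EpochMetrics:
    epoch: int
    train_loss: float
    val_loss: float
    val_accuracy: float

@dataclass
class RecordTrail:
    """Hypothetical record of one neural network's lifespan (illustrative schema)."""
    network_id: str
    dataset: str            # name of the training dataset
    architecture: dict      # layer-by-layer description of the network
    stage: str = "generated"            # generated -> trained -> validated
    history: list = field(default_factory=list)  # per-epoch metrics

    def log_epoch(self, metrics: EpochMetrics) -> None:
        self.history.append(metrics)

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)

# Usage: record a hypothetical candidate network's training trajectory.
trail = RecordTrail(
    network_id="nas-000417",
    dataset="protein-diffraction-patterns",
    architecture={"layers": [{"type": "conv2d", "filters": 32},
                             {"type": "dense", "units": 10}]},
)
trail.stage = "trained"
trail.log_epoch(EpochMetrics(epoch=1, train_loss=1.92, val_loss=1.88, val_accuracy=0.41))
trail.log_epoch(EpochMetrics(epoch=2, train_loss=1.41, val_loss=1.45, val_accuracy=0.55))
print(trail.to_json())
```

Records of this shape, accumulated across thousands of candidate networks, are what would let users query how performance evolves during training and relate architectures to dataset properties.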

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


Channing, Georgia and Patel, Ria and Olaya, Paula and Rorabaugh, Ariel and Miyashita, Osamu and Caino-Lores, Silvina and Schuman, Catherine and Tama, Florence and Taufer, Michela "Composable Workflow for Accelerating Neural Architecture Search Using In Situ Analytics for Protein Classification" 52nd International Conference on Parallel Processing (ICPP), 2023. https://doi.org/10.1145/3605573.3605636
Roa, Camila and Olaya, Paula and Llamas, Ricardo and Vargas, Rodrigo and Taufer, Michela "GEOtiled: A Scalable Workflow for Generating Large Datasets of High-Resolution Terrain Parameters" 2023. https://doi.org/10.1145/3588195.3595941
Rorabaugh, Ariel Keller and Caíno-Lores, Silvina and Johnston, Travis and Taufer, Michela "High frequency accuracy and loss data of random neural networks trained on image datasets" Data in Brief, v.40, 2022. https://doi.org/10.1016/j.dib.2021.107780
Roa, Camila and Rynge, Mats and Olaya, Paula and Vahi, Karan and Miller, Todd and Griffioen, James and Knuth, Shelley and Goodhue, John and Hudak, David and Romanella, Alana and Llamas, Ricardo and Vargas, Rodrigo and Livny, Miron and Deelman, Ewa and Taufer, Michela "End-to-end Integration of Scientific Workflows on Distributed Cyberinfrastructures: Challenges and Lessons Learned with an Earth Science Application" Proceedings of the 15th IEEE/ACM International Conference on Utility and Cloud Computing (UCC), 2023. https://doi.org/10.1145/3603166.3632142
Tan, Nigel and Luettgau, Jakob and Marquez, Jack and Teranishi, Keita and Morales, Nicolas and Bhowmick, Sanjukta and Cappello, Franck and Taufer, Michela and Nicolae, Bogdan "Scalable Incremental Checkpointing using GPU-Accelerated De-Duplication" 52nd International Conference on Parallel Processing (ICPP), 2023. https://doi.org/10.1145/3605573.3605639
Patel, Ria and Rorabaugh, Ariel Keller and Olaya, Paula and Caino-Lores, Silvina and Channing, Georgia and Schuman, Catherine and Miyashita, Osamu and Tama, Florence and Taufer, Michela "A Methodology to Generate Efficient Neural Networks for Classification of Scientific Datasets" 18th IEEE International Conference on e-Science (eScience), 2022. https://doi.org/10.1109/eScience55777.2022.00052
Olaya, Paula and Luettgau, Jakob and Roa, Camila and Llamas, Ricardo and Vargas, Rodrigo and Wen, Sophia and Chung, I-Hsin and Seelam, Seetharami and Park, Yoonho and Lofstead, Jay and Taufer, Michela "Enabling Scalability in the Cloud for Scientific Workflows: An Earth Science Use Case" IEEE 16th International Conference on Cloud Computing (CLOUD), 2023. https://doi.org/10.1109/CLOUD60044.2023.00052
Olaya, Paula and Caino-Lores, Silvina and Lama, Vanessa and Patel, Ria and Rorabaugh, Ariel Keller and Miyashita, Osamu and Tama, Florence and Taufer, Michela "Identifying Structural Properties of Proteins from X-ray Free Electron Laser Diffraction Patterns" 18th IEEE International Conference on e-Science (eScience), 2022. https://doi.org/10.1109/eScience55777.2022.00017
Keller Rorabaugh, Ariel and Caino-Lores, Silvina and Johnston, Travis and Taufer, Michela "Building High-throughput Neural Architecture Search Workflows via a Decoupled Fitness Prediction Engine" IEEE Transactions on Parallel and Distributed Systems, 2022. https://doi.org/10.1109/TPDS.2022.3140681
Cranganore, Sandeep Suresh and De Maio, Vincenzo and Brandic, Ivona and Deelman, Ewa "Paving the way to hybrid quantum-classical scientific workflows" Future Generation Computer Systems, v.158, 2024. https://doi.org/10.1016/j.future.2024.04.030
