Award Abstract # 1643056
EAGER: Application-driven Data Precision Selection Methods

NSF Org: CCF / Division of Computing and Communication Foundations
Recipient: UNIVERSITY OF UTAH
Initial Amendment Date: July 21, 2016
Latest Amendment Date: July 21, 2016
Award Number: 1643056
Award Instrument: Standard Grant
Program Manager: Almadena Chtchelkanova
achtchel@nsf.gov
 (703)292-7498
CCF / Division of Computing and Communication Foundations
CSE / Directorate for Computer and Information Science and Engineering
Start Date: August 1, 2016
End Date: July 31, 2018 (Estimated)
Total Intended Award Amount: $299,970.00
Total Awarded Amount to Date: $299,970.00
Funds Obligated to Date: FY 2016 = $299,970.00
History of Investigator:
  • Ganesh Gopalakrishnan (Principal Investigator)
    ganesh@cs.utah.edu
  • Mary Hall (Co-Principal Investigator)
  • Zvonimir Rakamaric (Co-Principal Investigator)
  • Hari Sundar (Co-Principal Investigator)
  • Vivek Srikumar (Co-Principal Investigator)
Recipient Sponsored Research Office: University of Utah
201 PRESIDENTS CIR
SALT LAKE CITY
UT  US  84112-9049
(801)581-6903
Sponsor Congressional District: 01
Primary Place of Performance: University of Utah
UT  US  84112-9205
Primary Place of Performance Congressional District: 01
Unique Entity Identifier (UEI): LL8GLEVH6MG3
Parent UEI:
NSF Program(s): Software & Hardware Foundations
Primary Program Source: 01001617DB NSF RESEARCH & RELATED ACTIVITIES
Program Reference Code(s): 7916, 7942, 8206
Program Element Code(s): 779800
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Numerical algorithms used in cyber-physical systems, decision-making systems, financial processing, and other HPC applications that operate on real numbers are prone to computational errors for a well-known reason: real numbers cannot be represented exactly in computers, so computations on them must be approximated using floating-point data types. Because data movement costs energy, the lowest floating-point precision that does not compromise computational integrity should be allocated. This project implements methods to reduce the energy consumed by numerical computations on computing devices at all scales, from supercomputers used for scientific research to the embedded and mobile devices found in many walks of life, including medical devices and robots. A key thrust of the work is to reduce energy through reduced data transfers between computing units. The project studies how the number of bits used to represent data introduces errors into computations, and whether these errors affect the correctness of results.
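To make the roundoff phenomenon concrete, here is a minimal sketch (added for illustration; it assumes Python with NumPy and is not part of the award record): accumulating the value 0.1, which has no exact binary representation, at three different precisions shows how the bit width of the data type governs the error in the result.

    # Minimal sketch (assumes NumPy): accumulate 0.1 one thousand times at
    # three precisions. The exact answer is 100.0; narrower floating-point
    # types show visibly larger accumulated roundoff error.
    import numpy as np

    for dtype in (np.float16, np.float32, np.float64):
        acc = dtype(0.0)
        step = dtype(0.1)  # 0.1 is not exactly representable in binary
        for _ in range(1000):
            acc = dtype(acc + step)
        print(dtype.__name__, float(acc), "error:", abs(float(acc) - 100.0))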

The PIs propose to develop new formal-methods tools that automatically estimate error bounds, auto-tuning compilers that carefully select precision, and new superoptimizers that generate more efficient code. These technologies will be applied to improve software in the domains of machine learning and high-performance computing. The PIs will develop suitable error criteria for high-performance computing systems and machine learning systems, along with tools that allocate precision optimally while keeping answers within acceptable bounds. Their tools will be released to communities of researchers working toward exascale computing and deploying machine learning applications in safety-critical devices. The work represents a synergistic combination of PI expertise spanning high-performance computing, machine learning, formal methods, and compiler technologies.
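For background (a standard textbook fact added here for context, not a result of this project), the error bounds such tools estimate rest on the rounding model of IEEE floating-point arithmetic: each basic operation returns the exact result perturbed by a small relative error,

    \mathrm{fl}(x \circ y) = (x \circ y)(1 + \delta), \qquad |\delta| \le \varepsilon, \qquad \circ \in \{+, -, \times, \div\},

where the unit roundoff \varepsilon is 2^{-24} in single precision and 2^{-53} in double precision. Rigorous analyzers compose these per-operation bounds into a guaranteed bound on the roundoff error of an entire expression, which is what makes automatic precision selection against a given error tolerance possible.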

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


Baranowski, Marek and Briggs, Ian and Chiang, Wei-Fan and Gopalakrishnan, Ganesh and Rakamaric, Zvonimir and Solovyev, Alexey. "Moving the Needle on Rigorous Floating-Point Precision Tuning." Kalpa Publications in Computing, v.5, 2018. doi:10.29007/f4f3
Carlson, Max and Sundar, Hari. "Utilizing GPU Parallelism to Improve Fast Spherical Harmonic Transforms." IEEE High Performance Extreme Computing Conference (HPEC), 2018.
Fernando, Isuru Dilanka and Jayasena, Sanath and Fernando, Milinda and Sundar, Hari. "A Scalable Hierarchical Semi-Separable Library for Heterogeneous Clusters." 46th International Conference on Parallel Processing (ICPP), 2017. doi:10.1109/ICPP.2017.60
Fernando, Milinda and Duplyakin, Dmitry and Sundar, Hari. "Machine and Application Aware Partitioning for Adaptive Mesh Refinement Applications." Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing (HPDC), 2017. doi:10.1145/3078597.3078610
Rasouli, Majid and Zala, Vidhi. "Improving Performance and Scalability of Algebraic Multigrid through a Specialized MATVEC." IEEE High Performance Extreme Computing Conference (HPEC), 2018.
Tirpankar, Nishith and Sundar, Hari. "Towards Triangle Counting on GPU using Stable Radix Binning." IEEE High Performance Extreme Computing Conference (HPEC), 2018.

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

When developing computer representations of continuous quantities (real numbers), a designer allocates enough data bits to model the data with the requisite precision. Allocating more than the required number of bits, however, leads to excessive memory consumption (especially of cache memory, a precious resource) and increases energy consumption, which is proportional to the amount of data moved. In this research project, we studied the precision allocation problem from a number of different perspectives. We chose two primary application domains: machine learning and high performance computing.
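The memory-and-movement half of this trade-off is easy to see in a small sketch (ours, assuming NumPy; the array size is arbitrary): halving the precision of an array halves the bytes that must be stored in cache and moved between computing units.

    # Minimal sketch (assumes NumPy): one million elements stored at three
    # precisions. Bytes stored -- and hence bytes moved through the memory
    # hierarchy -- scale directly with the chosen bit width.
    import numpy as np

    x64 = np.ones(1_000_000, dtype=np.float64)
    for dtype in (np.float64, np.float32, np.float16):
        arr = x64.astype(dtype)
        print(f"{dtype.__name__}: {arr.nbytes / 1e6:.1f} MB")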


Our first major finding is that rigorous precision allocation requires tools for accurate roundoff error analysis. Our second major finding is that the nature of the application -- machine learning or high performance computing -- strongly influences how precision allocation affects application behavior. Our third major finding is that even within a domain such as high performance computing, the nature of the task -- whether it involves discrete decisions or primarily arithmetic operations -- determines how one can optimally trim precision to reduce energy costs without affecting application behavior. In the domain of machine learning, the decision space is even more complex, and depends on whether precision tuning is done during training or during deployment of a trained system.
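The discrete-versus-arithmetic distinction in the third finding can be illustrated with a small sketch (ours, assuming NumPy; the threshold is arbitrary): viewed as arithmetic, the float32 and float64 representations of 0.1 differ by about 1.5e-9, a negligible perturbation, yet a branch that tests against 0.1 takes different directions under the two precisions.

    # Illustrative sketch: the same constant at two precisions. The numeric
    # difference is tiny, but a threshold test on it flips, changing
    # control flow -- a discrete decision, not a small perturbation.
    import numpy as np

    x32 = np.float32(0.1)  # stored as ~0.100000001490116
    x64 = np.float64(0.1)  # stored as ~0.100000000000000006
    print("arithmetic difference:", float(x32) - float(x64))  # ~1.5e-9

    threshold = 0.1  # Python float, i.e., the float64 value of 0.1
    print("float32 branch taken:", bool(x32 > threshold))  # True
    print("float64 branch taken:", bool(x64 > threshold))  # False (equal)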


The major tool-related contribution of this project is FPTaylor, a tool that has been engineered to high standards and has been downloaded and studied by a number of groups. A comprehensive journal article presents a detailed comparative study of FPTaylor against other tools in its class. The project was instrumental in training five PhD students and one MS student in this area of critical national importance.




Last Modified: 09/01/2018
Modified by: Ganesh L Gopalakrishnan
