NSF Award Search: Award # 1320444

Award Abstract # 1320444

SHF: Small: Reliable Data Processing by Dynamic Program Analysis

NSF Org:	CCF Division of Computing and Communication Foundations
Recipient:	PURDUE UNIVERSITY
Initial Amendment Date:	June 28, 2013
Latest Amendment Date:	June 28, 2013
Award Number:	1320444
Award Instrument:	Standard Grant
Program Manager:	Sol Greenspan, sgreensp@nsf.gov, (703)292-7841, CCF Division of Computing and Communication Foundations, CSE Directorate for Computer and Information Science and Engineering
Start Date:	July 1, 2013
End Date:	June 30, 2018 (Estimated)
Total Intended Award Amount:	$400,000.00
Total Awarded Amount to Date:	$400,000.00
Funds Obligated to Date:	FY 2013 = $400,000.00
History of Investigator:	Xiangyu Zhang (Principal Investigator)
Recipient Sponsored Research Office:	Purdue University, 2550 NORTHWESTERN AVE # 1100, WEST LAFAYETTE, IN, US 47906-1332, (765)494-1055
Sponsor Congressional District:	04
Primary Place of Performance:	Purdue University IN US 47907-2107
Primary Place of Performance Congressional District:	04
Unique Entity Identifier (UEI):	YRXVL4JYCEF5
Parent UEI:	YRXVL4JYCEF5
NSF Program(s):	SOFTWARE ENG & FORMAL METHODS
Primary Program Source:	01001314DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s):	7923, 7944
Program Element Code(s):	794400
Award Agency Code:	4900
Fund Agency Code:	4900
Assistance Listing Number(s):	47.070

ABSTRACT

Computational Science involves computer modeling and simulation of natural phenomena, and the validity of scientific inquiry depends on the way computers are used to do numerical computation. Numeric errors pose a serious threat to output validity for modern scientific data processing. Raw inputs are acquired by physical instruments that have limited precision, leading to input errors. Parameters used in data processing may be provided by human scientists based on their experience, leading to uncertainty. Data may not be represented exactly due to the limited precision of the machine used. Once these errors creep into a computation, they may get propagated and magnified by the sequence of operations conducted, producing unreliable output. Such instability problems may ultimately have substantial impact on scientific research and even the economy.
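As a small illustration of how representation error can be magnified, consider the following sketch. It is not taken from the project; the function names `naive` and `rewritten` are hypothetical. Both functions compute (1 - cos x) / x^2, whose true value approaches 0.5 as x approaches 0, but the direct form subtracts two nearly equal numbers and loses every significant digit for small x.

```python
import math

def naive(x):
    # Direct formula: for small x, cos(x) rounds to (nearly) 1.0,
    # so the subtraction cancels almost all significant digits.
    return (1.0 - math.cos(x)) / (x * x)

def rewritten(x):
    # Algebraically identical form, 2*sin(x/2)^2 / x^2,
    # which avoids subtracting nearly equal quantities.
    s = math.sin(x / 2.0)
    return 2.0 * s * s / (x * x)

x = 1e-8
naive_result = naive(x)
rewritten_result = rewritten(x)
print("naive:    ", naive_result)
print("rewritten:", rewritten_result)
```

The two functions are mathematically equivalent, yet only the rewritten one returns a value close to 0.5 for this input. Nothing in the naive program signals that its output is wrong, which is the kind of silent instability the abstract describes.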

This project aims to develop dynamic program analysis tools to address instability problems caused by errors. These tools will automatically analyze the data processing programs provided by the users and transform them to allow online representation of and reasoning about errors. The user runs the transformed programs on the original input data as usual, with the option of providing additional input/coefficient error ranges. The execution will produce regular output as before, together with an indication of whether the output is stable in the presence of errors, including input errors, uncertain coefficients, and internal representation errors. If the execution is determined to be unstable, the technique will automatically report the possible consequences induced by the errors. Another option is to automatically switch to executing a high-precision version of the program, which is also generated by the project's tool set.
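The idea of checking stability and falling back to a high-precision version can be sketched by hand as follows. This is only an illustrative approximation, not the project's tool set, which transforms programs automatically. The names `program` and `run_with_stability_check` and the tolerance `rel_tol` are assumptions made for this example, and exact rational arithmetic from Python's standard `fractions` module stands in for the high-precision version.

```python
from fractions import Fraction

def program(values):
    # Stand-in for a user's data processing program: a left-to-right sum.
    # It works for any numeric type that supports + and -.
    total = values[0] - values[0]  # zero of the same numeric type
    for v in values:
        total = total + v
    return total

def run_with_stability_check(inputs, rel_tol=1e-9):
    # Run once in ordinary double precision.
    fast = program([float(v) for v in inputs])
    # Run again in exact rational arithmetic (the "high-precision version").
    exact = program([Fraction(v) for v in inputs])
    # Compare the two results relative to the exact one.
    scale = max(abs(exact), Fraction(1, 10**300))
    stable = abs(Fraction(fast) - exact) / scale <= rel_tol
    # If unstable, fall back to the high-precision result.
    value = fast if stable else float(exact)
    return value, stable

print(run_with_stability_check([1.0, 2.0, 3.0]))
print(run_with_stability_check([1e16, 1.0, -1e16]))
```

In the second call the double-precision sum absorbs the 1.0 into 1e16 and then cancels to zero, while the exact run returns 1, so the sketch flags the execution as unstable and returns the high-precision value. Running every program twice is costly; it is used here only because it keeps the example short.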

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Enyi Tang, Xiangyu Zhang, Norbert Th. Muller, Zhenyu Chen, Xuandong Li "Software Numerical Instability Detection and Diagnosis by Combining Stochastic and Infinite-precision Testing" IEEE Transactions on Software Engineering, 2017

Shiqing Ma, Yousra Aafer, Zhaogui Xu, Xiangyu Zhang "Data Provenance for Graph Based Machine Learning Algorithms Through Derivative Computation" ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE), 2017

Yingqi Liu, Shiqing Ma, Yousra Aafer, Wen-Chuan Lee, Juan Zhai, Weihang Wang, Xiangyu Zhang "Trojaning Attack on Neural Networks" Proceedings of the 25th Network and Distributed System Security Symposium (NDSS), 2018

Zhaogui Xu, Shiqing Ma, Xiangyu Zhang, Shuofei Zhu, Baowen Xu "Debugging with Intelligence via Probabilistic Inference" Proceedings of the 40th International Conference on Software Engineering (ICSE), 2018

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Errors pose a serious threat to output validity for modern scientific
data processing, which is often performed by computer programs.
Raw inputs are acquired by physical instruments that have precision
limitations, leading to input errors. Parameters used in data processing
may be provided by human scientists based on their experience, leading
to uncertainty. Data may not be precisely represented due to the limited
precision of the machine used, leading to representation errors.
Once these errors enter a computation, they may be propagated and magnified
by the operations performed, producing unreliable output.

In this project, the PI developed techniques to address the problems that floating-point errors can cause during execution. The resulting tools are compiler-based: they automatically analyze data processing programs and transform them so that errors can be represented and reasoned about at runtime. The user runs the transformed programs on the original input data as usual, with the option of explicitly providing additional input/coefficient error bounds. The execution produces the regular output as before, together with an indication of whether the output is stable in the presence of different kinds of errors, including input errors, uncertain coefficients, and internal representation errors. If the execution is determined to be unstable, the techniques automatically report the different possible consequences induced by the errors, to support better human decisions.
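To make "possible consequences induced by the errors" concrete, the sketch below evaluates a small program at the nominal inputs and at every corner of a user-supplied error box, then collects the distinct outcomes. It is a simplified stand-in, not the project's technique: the names `count_real_roots` and `possible_outcomes` are hypothetical, and sampling only the corners is a heuristic that can miss outcomes reachable inside the box.

```python
from itertools import product

def count_real_roots(a, b, c):
    # Example program: number of real roots of a*x^2 + b*x + c,
    # decided by the sign of the discriminant.
    disc = b * b - 4.0 * a * c
    if disc > 0:
        return 2
    if disc == 0:
        return 1
    return 0

def possible_outcomes(program, inputs, bounds):
    # For each input, consider its low and high end within the error bound;
    # inputs with a zero bound are treated as exact.
    candidates = [(x - e, x + e) if e > 0 else (x,)
                  for x, e in zip(inputs, bounds)]
    outcomes = {program(*inputs)}  # outcome at the nominal inputs
    for corner in product(*candidates):
        outcomes.add(program(*corner))
    return outcomes

# Coefficient b is known only to within 1e-6; a and c are exact.
unstable = possible_outcomes(count_real_roots, (1.0, 2.0, 1.0), (0.0, 1e-6, 0.0))
stable = possible_outcomes(count_real_roots, (1.0, 5.0, 1.0), (0.0, 1e-6, 0.0))
print(unstable, stable)
```

When more than one outcome appears, as in the first call, the input error alone is enough to change the program's decision, and the set of outcomes is the report a user would inspect. A single outcome, as in the second call, suggests the result is insensitive to the stated bounds at the points that were sampled.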

Through this project, the PI graduated two PhD students. One joined industry and the other joined academia as an assistant professor. Research outcomes were published in top venues such as OOPSLA, ICSE, FSE, and TSE.

Last Modified: 10/14/2018
Modified by: Xiangyu Zhang
