Award Abstract # 1518776
SHF:Large:Collaborative Research: Inferring Software Specifications from Open Source Repositories by Leveraging Data and Collective Community Expertise

NSF Org: CCF
Division of Computing and Communication Foundations
Recipient: BOWLING GREEN STATE UNIVERSITY
Initial Amendment Date: June 17, 2015
Latest Amendment Date: June 17, 2015
Award Number: 1518776
Award Instrument: Standard Grant
Program Manager: Sol Greenspan
sgreensp@nsf.gov
 (703)292-7841
CCF
 Division of Computing and Communication Foundations
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: July 1, 2015
End Date: June 30, 2019 (Estimated)
Total Intended Award Amount: $214,843.00
Total Awarded Amount to Date: $214,843.00
Funds Obligated to Date: FY 2015 = $214,843.00
History of Investigator:
  • Robert Dyer (Principal Investigator)
    rdyer@unl.edu
Recipient Sponsored Research Office: Bowling Green State University
1851 N RESEARCH DR
BOWLING GREEN
OH  US  43403-4401
(419)372-2481
Sponsor Congressional District: 05
Primary Place of Performance: Bowling Green State University
106 University Hall
Bowling Green
OH  US  43403-0230
Primary Place of Performance
Congressional District:
05
Unique Entity Identifier (UEI): SLT3EB6G3FA9
Parent UEI:
NSF Program(s): Software & Hardware Foundation
Primary Program Source: 01001516DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7925, 7944
Program Element Code(s): 779800
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Today individuals, society, and the nation critically depend on software to manage critical infrastructures for power, banking and finance, air traffic control, telecommunication, transportation, national defense, and healthcare. Specifications are critical for communicating the intended behavior of software systems to software developers and users and to make it possible for automated tools to verify whether a given piece of software indeed behaves as intended. Safety critical applications have traditionally enjoyed the benefits of such specifications, but at a great cost. Because producing useful, non-trivial specifications from scratch is too hard, time consuming, and requires expertise that is not broadly available, such specifications are largely unavailable. The lack of specifications for core libraries and widely used frameworks makes specifying applications that use them even more difficult. The absence of precise, comprehensible, and efficiently verifiable specifications is a major hurdle to developing software systems that are reliable, secure, and easy to maintain and reuse.

This project brings together an interdisciplinary team of researchers with complementary expertise in formal methods, software engineering, machine learning and big data analytics to develop automated or semi-automated methods for inferring the specifications from code. The resulting methods and tools combine analytics over large open source code repositories to augment and improve upon specifications by program analysis-based specification inference through synergistic advances across both these areas.

The broader impacts of the project include: transformative advances in specification inference and synthesis, with the potential to dramatically reduce, the cost of developing and maintaining high assurance software; enhanced interdisciplinary expertise at the intersection of formal methods software engineering, and big data analytics; Contributions to research-based training of a cadre of scientists and engineers with expertise in high assurance software.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Note:  When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Aria Khademi, Sanghack Lee, S., David Foley, and Vasant Honavar "Fairness in Algorithmic Decision Making: An Excursion Through the Lens of Causality" Proceedings of the Web Conference , 2019
José Sánchez and Gary T. Leavens "Reasoning Tradeoffs in Languages with Enhanced Modularity Features" Proceedings of the 15th International Conference on Modularity (Modularity 2016) , 2016 , p.13 10.1145/2889443.2889447
Junjie Liang, Jinlong Hu, and Vasant Honavar "Top-N-Rank: A Truncated List-wise Ranking Approach for Large-scale Top-N Recommendation" ACM Conference on Information and Knowledge Management , 2018
Peter W. V. Tran-Jorgensen, Peter Gorm Larsen, and Gary T. Leavens "Automated translation of VDM to JML-annotated Java" International Journal on Software Tools for Technology Transfer , 2017
Samantha Syeda Khairunnesa, Hoan Anh Nguyen, Tien N. Nguyen and Hridesh Rajan "Exploiting Implicit Beliefs to Resolve Sparse Usage Problem in Usage-based Specification Mining" ACM SIGPLAN conference on Systems, Programming, Languages and Applications: Software for Humanity (SPLASH 2017) , 2017
Saravanan Kandasamy, Arnab Bhattacharyya, A. and Vasant Honavar "Minimum Intervention Cover of a Causal Graph" Proceedings of the 33rd AAAI Conference on Artificial Intelligence , 2019
Yimin Zhou, Yiwei Sun and Vasant Honavar "Improving Image Captioning by Leveraging Knowledge Graphs" IEEE Winter Conference on Applications of Computer Vision (WACV) , 2019
Yiwei Sun, Tsung-Yu Hsieh, and Vasant Honavar "Multi-View Network Embedding Via Graph Factorization Clustering and Co-Regularized Multi-View Agreement" IEEE Conference on Data Mining , 2018
Yuyan Bao, Gary T. Leavens, and Gidon Ernst "Unifying separation logic and region logic to allow interoperability" Formal Aspects of Computing , 2018
Yuyan Bao, Gary T. Leavens, and Gidon Ernst "Unifying separation logic and region logic to allow interoperability" Formal Aspects of Computing , 2018

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

Our focus is on behavioral interface specifications, which are specifications of the behavior of methods (using pre- and post- conditions) and modules (e.g., using invariants in classes and interfaces). Methods and modules together constitute application programming interfaces (APIs). Building on an infrastructure for mining code and its evolution in open source repositories at a large-scale, we made groundbreaking advances in specification inference and synthesis, and utilized them to specify the highly used Java Development Kit and other common APIs that can drastically reduce the cost of specifying programs. Such fundamental breakthroughs were achieved by leveraging the collective intelligence available in software artifacts from millions of open source projects with their software evolution history.

Several tools were either created or extended during the course of the project. All of those tools were made available as open source software for other researchers to use and extend. The results were disseminated in the form of research publications at major conferences and in top journals. Several PhD and master's theses were also created based on the tools and algorithms created during the project period. 

BGSU's focus was on scaling the specification inference techniques and providing capabilities for inference within the Boa infrastructure for mining open source software repositories. To this end, Boa was extended with several new extensive features for program analysis, including the ability to generate and traverse control, data, and program dependency graphs.  The algorithms created allow inference of method preconditions using a notion of consensus from millions of open source projects as well as inference of method postconditions.  The extensions to Boa were all made publicly available and the inference algorithms will be published in the near future.


Last Modified: 08/09/2019
Modified by: Robert Dyer

Please report errors in award information by writing to: awardsearch@nsf.gov.

Print this page

Back to Top of page