
NSF Org: | CCF Division of Computing and Communication Foundations |
Recipient: |
|
Initial Amendment Date: | January 30, 2015 |
Latest Amendment Date: | June 11, 2019 |
Award Number: | 1452959 |
Award Instrument: | Continuing Grant |
Program Manager: | Sol Greenspan, sgreensp@nsf.gov, (703)292-7841, CCF Division of Computing and Communication Foundations, CSE Directorate for Computer and Information Science and Engineering |
Start Date: | September 1, 2015 |
End Date: | August 31, 2020 (Estimated) |
Total Intended Award Amount: | $450,000.00 |
Total Awarded Amount to Date: | $450,000.00 |
Funds Obligated to Date: | FY 2017 = $89,891.00; FY 2018 = $92,405.00; FY 2019 = $95,070.00 |
History of Investigator: |
|
Recipient Sponsored Research Office: | 940 GRACE HALL NOTRE DAME IN US 46556-5708 (574)631-7432 |
Sponsor Congressional District: |
|
Primary Place of Performance: | 940 Grace Hall Notre Dame IN US 46556-5708 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | Software & Hardware Foundation |
Primary Program Source: | 01001718DB NSF RESEARCH & RELATED ACTIVIT; 01001819DB NSF RESEARCH & RELATED ACTIVIT; 01001920DB NSF RESEARCH & RELATED ACTIVIT |
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
The objective of this research project is 1) to create a model of program comprehension for how software development professionals write software documentation, and 2) to use this model to design algorithms to automate the process of writing documentation. The process of writing documentation is a major expense in software development projects, and is often neglected. By automating key components of the process, this research helps programmers to avoid this expense and therefore to be more productive.
The project studies the process that programmers follow when reading source code to write documentation. Then, the project proposes algorithms to mimic that process. These algorithms are integrated with novel natural language generation systems to create descriptions of software behavior. These descriptions are then integrated into documentation of the source code. A key broader impact of this project is to increase the workforce participation of persons with visual disabilities. First, the descriptions generated by the research can be used in accessibility technologies for blind programmers, to help those programmers read source code. Second, an outreach program to state K-12 schools for the blind and visually impaired helps students in these schools prepare for a career in the software development industry.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
This research project targets the problem of automatic documentation generation for software. Programmers are notorious for lacking time and resources to write good software documentation for others, even while seeking high-quality software documentation for themselves. The dilemma is essentially that programmers are under intense time pressure during development, and often cannot devote energy to documentation. But then later, other programmers struggle to read their code because the documentation is sparse or out of date. For decades, a dream of software engineering research has been to design algorithms that write this documentation automatically. This proposal targets four research questions towards this long-term dream:
RQ1: How do programmers read source code when creating documentation? This question targets the physical process that programmers follow when writing documentation, such as eye movements, keyboard and mouse activity, and stress indicators. The purpose of studying this information is to understand what programmers do, so that the rote portions can be automated for them.
RQ2: What information from source code do programmers prioritize for documentation? The purpose of this research question is to determine what information programmers tend to include in documentation after they have obtained an understanding of the source code. Programmers who write documentation may obtain this understanding by reading code from someone else or by writing the code themselves. In either case, they decide which information is important enough for other programmers to know. Automated documentation tools would benefit from including this information.
RQ3: How can solutions from text summarization technologies be adapted to solve code summarization research problems? The rationale behind this research question is that several effective text summarization techniques have been proposed over more than two decades, but they are difficult to adapt to code summarization because code and text communicate information differently. However, by mimicking the process followed by humans in writing documentation, which we study in RQ1 and RQ2, it is possible to adapt text summarization to source code more effectively than current approaches. We have published several research papers demonstrating how to adapt text summarization to code summarization, which have helped establish code summarization as a research area at the intersection of software engineering and natural language processing.
RQ4: Do code summarization technologies assist blind programmers in comprehending code as quickly as sighted programmers? The intent of this question is to blend this proposal's intellectual merit and broader impacts. Through this project, a collaborative teaching program between the University of Notre Dame and the Illinois School for the Blind has flourished. Impact is demonstrated through several research papers. A landmark paper funded by this project demonstrated zero difference in the quality of code comprehension between sighted and blind programmers, which has strong implications for the employment and education in computer science of persons who are blind or have low vision.
Last Modified: 09/01/2020
Modified by: Collin Mcmillan
Please report errors in award information by writing to: awardsearch@nsf.gov.