Award Abstract # 2211428
Collaborative Research: SHF: Medium: Towards More Human-like AI Models of Source Code

NSF Org: CCF (Division of Computing and Communication Foundations)
Recipient: UNIVERSITY OF NOTRE DAME DU LAC
Initial Amendment Date: June 15, 2022
Latest Amendment Date: August 19, 2024
Award Number: 2211428
Award Instrument: Continuing Grant
Program Manager: Andrian Marcus
 amarcus@nsf.gov
 (703)292-0000
 CCF: Division of Computing and Communication Foundations
 CSE: Directorate for Computer and Information Science and Engineering
Start Date: June 15, 2022
End Date: May 31, 2026 (Estimated)
Total Intended Award Amount: $864,000.00
Total Awarded Amount to Date: $992,000.00
Funds Obligated to Date: FY 2022 = $645,938.00
 FY 2023 = $128,000.00
 FY 2024 = $218,062.00
History of Investigator:
  • Collin McMillan (Principal Investigator)
    collin.mcmillan@nd.edu
  • Toby Li (Co-Principal Investigator)
Recipient Sponsored Research Office: University of Notre Dame
940 GRACE HALL
NOTRE DAME
IN  US  46556-5708
(574)631-7432
Sponsor Congressional District: 02
Primary Place of Performance: University of Notre Dame
940 Grace Hall
NOTRE DAME
IN  US  46556-5708
Primary Place of Performance Congressional District: 02
Unique Entity Identifier (UEI): FPU6XGFXMBE9
Parent UEI: FPU6XGFXMBE9
NSF Program(s): CISE Education and Workforce,
Software & Hardware Foundation
Primary Program Source: 01002223DB NSF RESEARCH & RELATED ACTIVIT
 01002324DB NSF RESEARCH & RELATED ACTIVIT
 01002425DB NSF RESEARCH & RELATED ACTIVIT
 01002526DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7482, 7924, 7943, 7944
Program Element Code(s): 055Y00, 779800
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

The research objective of this project is to design novel artificial intelligence-based models of software that learn from and are informed by human behavior. The frontier of many areas of Software Engineering (SE) research involves applications of AI-based models to SE tasks. Many tasks in SE research rely on the same basic underpinning technology, often a neural representation of source code that is trained to find features in code, which are then used for various tasks, e.g., to predict words for a document or to flag areas of code likely to contain a bug. While the first applications of recurrent neural network-based encoder-decoder models were a paradigm shift over the manually crafted heuristics and rules that they replaced, subsequent changes have yielded diminishing improvement despite increasing sophistication.
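To make the idea of a reusable "neural representation of code" concrete, the toy sketch below maps code tokens to vectors and pools them into one fixed-size vector that downstream tasks could consume. Everything here is illustrative and hypothetical (random, untrained embeddings; a tiny vocabulary); real systems learn these vectors jointly with the downstream task.

```python
import math
import random

random.seed(0)

# Hypothetical, untrained token embeddings; a real model learns these.
VOCAB = ["def", "sum", "(", "a", "b", ")", "return", "+"]
DIM = 4
embeddings = {tok: [random.gauss(0.0, 1.0) for _ in range(DIM)] for tok in VOCAB}

def encode(tokens):
    """Mean-pool token vectors into one fixed-size code representation."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(DIM)]

def cosine(u, v):
    """Cosine similarity, a common way downstream tasks compare code vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Two code fragments reduced to comparable fixed-size vectors.
code_a = encode(["def", "sum", "(", "a", "b", ")"])
code_b = encode(["return", "a", "+", "b"])
print(round(cosine(code_a, code_b), 3))
```

The same pooled vector could feed a bug predictor or a summary-word predictor, which is why improvements to the representation propagate to many SE tasks.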

The vision of this project is to achieve a breakthrough in more human-like neural models of source code. Its aim is to advance a broad spectrum of SE research tasks that rely on neural models by improving the neural models of code that underpin many downstream tasks. The research plan is three-fold: First, the project will characterize human behavior during different SE tasks via eye-tracking and IDE-based experiments. Second, the project will design models that predict or even mimic human behavior. Third, the project will use those models to augment and improve neural representations of source code, and evaluate these new representations in a variety of SE tasks.
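The third step above, augmenting a code representation with human behavior, can be sketched in miniature: instead of pooling token vectors uniformly, weight them by how long developers fixated on each token in an eye-tracking study. All names and gaze durations below are hypothetical placeholders, not data from this project.

```python
# Hypothetical per-token gaze durations (ms) from an eye-tracking study.
tokens = ["def", "max", "(", "a", "b", ")"]
gaze_ms = {"def": 120, "max": 540, "(": 40, "a": 260, "b": 250, ")": 30}

# Toy one-hot embeddings keyed by token (a real model learns dense vectors).
embed = {t: [float(i == j) for j in range(len(tokens))] for i, t in enumerate(tokens)}

def attention_weights(tokens, gaze):
    """Normalize gaze durations into weights that sum to 1."""
    total = sum(gaze[t] for t in tokens)
    return [gaze[t] / total for t in tokens]

def attended_representation(tokens, gaze):
    """Weighted sum of token vectors: heavily fixated tokens dominate."""
    w = attention_weights(tokens, gaze)
    dim = len(next(iter(embed.values())))
    return [sum(wi * embed[t][j] for wi, t in zip(w, tokens)) for j in range(dim)]

rep = attended_representation(tokens, gaze_ms)
# The most-fixated token ("max", index 1) contributes the largest component.
print(rep.index(max(rep)))  # → 1
```

The design choice this illustrates: human attention acts as a prior on which tokens matter, replacing the uniform pooling a plain encoder would use.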

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


(Showing: 1 - 10 of 12)
Bansal, Aakash and Eberhart, Zachary and Karas, Zachary and Huang, Yu and McMillan, Collin. "Function Call Graph Context Encoding for Neural Source Code Summarization." IEEE Transactions on Software Engineering, v.49, 2023. https://doi.org/10.1109/TSE.2023.3279774
Bansal, Aakash and Su, Chia-Yi and Karas, Zachary and Zhang, Yifan and Huang, Yu and Li, Toby Jia-Jun and McMillan, Collin. "Modeling Programmer Attention as Scanpath Prediction." 2023. https://doi.org/10.1109/ASE56229.2023.00092
Bansal, A. and Sharif, B. and McMillan, C. "Towards Modeling Human Attention from Eye Movements for Neural Source Code Summarization." 15th ACM Symposium on Eye Tracking Research & Applications, 2023. https://doi.org/10.1145/3591136
Jain, Vijayanta and Ghanavati, Sepideh and Peddinti, Sai Teja and McMillan, Collin. "Towards Fine-Grained Localization of Privacy Behaviors." 2023. https://doi.org/10.1109/EuroSP57164.2023.00024
Karas, Zachary and Bansal, Aakash and Zhang, Yifan and Li, Toby and McMillan, Collin and Huang, Yu. "A Tale of Two Comprehensions? Analyzing Student Programmer Attention during Code Summarization." ACM Transactions on Software Engineering and Methodology, v.33, 2024. https://doi.org/10.1145/3664808
Li, J. and Zhang, Y. and Karas, Z. and Leach, K. and Huang, Y. "Do Machines and Humans Focus on Similar Code? Exploring Explainability of Large Language Models in Code Summarization." 32nd IEEE/ACM International Conference on Program Comprehension, RENE, 2024.
Su, Chia-Yi and Bansal, Aakash and McMillan, Collin. "Revisiting file context for source code summarization." Automated Software Engineering, v.31, 2024. https://doi.org/10.1007/s10515-024-00460-x
Su, Chia-Yi and McMillan, Collin. "Semantic similarity loss for neural source code summarization." Journal of Software: Evolution and Process, v.36, 2024. https://doi.org/10.1002/smr.2706
Tang, N. and An, J. and Chen, M. and Bansal, A. and Huang, Y. and McMillan, C. and Li, T. "CodeGRITS: A Research Toolkit for Developer Behavior and Eye Tracking in IDE." 46th International Conference on Software Engineering, Demonstrations, 2024.
Tang, N. and Chen, M. and Ning, Z. and Bansal, A. and Huang, Y. and McMillan, C. and Li, T. "An Empirical Study of Developer Behaviors for Validating and Repairing AI-Generated Code." 13th Workshop on the Intersection of HCI and PL, 2023.
Tang, Ningzhi and Chen, Meng and Ning, Zheng and Bansal, Aakash and Huang, Yu and McMillan, Collin and Li, Toby. "A Study on Developer Behaviors for Validating and Repairing LLM-Generated Code Using Eye Tracking and IDE Actions." 2024.
