Award Abstract # 2107008
EAGER: Natural Language Processing for Teaching and Research in Engineering Education

NSF Org: EEC (Division of Engineering Education and Centers)
Recipient: VIRGINIA POLYTECHNIC INSTITUTE & STATE UNIVERSITY
Initial Amendment Date: February 25, 2022
Latest Amendment Date: February 25, 2022
Award Number: 2107008
Award Instrument: Standard Grant
Program Manager: Alice Pawley
apawley@nsf.gov
(703) 292-7286
EEC (Division of Engineering Education and Centers)
ENG (Directorate for Engineering)
Start Date: March 1, 2022
End Date: February 28, 2026 (Estimated)
Total Intended Award Amount: $299,647.00
Total Awarded Amount to Date: $299,647.00
Funds Obligated to Date: FY 2022 = $299,647.00
History of Investigator:
  • Andrew Katz (Principal Investigator)
    akatz4@vt.edu
  • Hoda Eldardiry (Co-Principal Investigator)
Recipient Sponsored Research Office: Virginia Polytechnic Institute and State University
300 TURNER ST NW
BLACKSBURG
VA  US  24060-3359
(540) 231-5281
Sponsor Congressional District: 09
Primary Place of Performance: Virginia Polytechnic Institute and State University
300 Turner Street NW, Suite 4200
Blacksburg
VA  US  24060-0001
Primary Place of Performance Congressional District: 09
Unique Entity Identifier (UEI): QDE5UHE5XD16
Parent UEI: X6KEFGLHSJX7
NSF Program(s): EngEd-Engineering Education
Primary Program Source: 01002223DB NSF RESEARCH & RELATED ACTIVITIES
Program Reference Code(s): 1340, 7916, 110E
Program Element Code(s): 134000
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.041

ABSTRACT

In the ecosystems that form professional engineers, community members produce text through many activities, such as end-of-semester feedback to instructors, transcripts of instruction, open-ended survey items, and interviews. In each case, abundant text is available to educators and researchers that could provide insight into how we form engineers. Unfortunately, while these texts have the potential to provide novel insights, traditional analytic techniques do not scale well: time investment, bias, interrater reliability, and intrarater reliability each present significant challenges. To address this problem, we aim to develop and characterize approaches for human-in-the-loop (HITL) natural language processing (NLP) systems that augment human analysis, facilitating and enhancing the work of one person (or team). Such systems can reduce the time needed to analyze texts by grouping similar texts together; the human user can then analyze those groupings further and identify meanings in ways only a human could. The system will also improve consistency by analyzing the entire collection of texts simultaneously and grouping similar items together. By contrast, a single person or team analyzing responses sequentially risks introducing inconsistencies over time.
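To make the grouping step concrete: one common approach is to embed each response with an off-the-shelf sentence encoder and then cluster the embeddings, so that a human reviews a handful of groups instead of every individual text. The Python sketch below illustrates the idea; the model name, cluster count, and sample responses are illustrative assumptions, not the project's actual configuration.

    from sentence_transformers import SentenceTransformer
    from sklearn.cluster import KMeans

    # Toy open-ended course feedback; in practice these would be the
    # project's survey or interview texts.
    responses = [
        "The labs helped me connect theory to practice.",
        "Hands-on projects made the concepts click for me.",
        "Lectures moved too fast to follow the derivations.",
        "The pace of the lecture material was hard to keep up with.",
    ]

    # Any sentence encoder would do; this model name is an assumption.
    model = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = model.encode(responses)

    # Group semantically similar responses; k=2 is illustrative.
    kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(embeddings)
    for label, text in sorted(zip(kmeans.labels_, responses)):
        print(label, text)

A pipeline along these lines turns a sequential reading task into a review of candidate groupings, which is where the anticipated gains in time and consistency would come from.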

We will accomplish this work in three phases. In Phase 1, we will conduct a series of experiments to test potential system configurations. The goal will be to identify optimal components and parameter settings for four of the steps in the proposed pipeline. We will use datasets from (i) students' written responses to an instrument for assessing their systems thinking and (ii) students' responses to open-ended course feedback surveys. We will measure performance based on the consistency of thematic clusters, using standard metrics for homogeneity in text clustering and classification tasks.

In Phase 2, we will study system performance on a series of five datasets. These datasets will come from multiple sources: extant NSF-funded projects, longitudinal data from the Virginia Tech College of Engineering, current data from engineering courses, and freshly collected data from online outlets. These represent important areas of the broader ecosystem that supports how we form future engineers. We will test the system for thematic clusters, employing metrics similar to those in Phase 1 to identify potential inconsistencies in how different datasets are handled. We will specifically look for homogeneity of texts within a cluster and shared semantic meaning. We will also update the original system designs in the event of systematic differences (e.g., if longer texts require a different system configuration).

In Phase 3, we will study how the system affects human performance. Because we anticipate significant improvements in human efficiency and consistency, it is important to conduct analyses that can rigorously test that proposition. These studies will assess the HITL aspect of the process, since many relevant applications of the system will require additional interpretation of the raw output. To accomplish this, we will collect data on differences in human performance when analyzing 1,500 student responses with and without the system's assistance. We will look at differences when (a) one person alone codes the data and (b) a team of three researchers codes the data (i.e., two studies: one person with vs. without the system, and a team with vs. without the system). We will measure differences in coding (whether different themes emerge), reliability (how consistently similar texts are grouped together), time needed to code the data, and differential treatment of student responses associated with student group characteristics. We will host all code, along with any new datasets where appropriate, in public repositories and notebooks for easy access, copying, and application by other engineering education researchers and teachers.
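The consistency and reliability measurements described in Phases 1 and 3 map onto standard metrics with off-the-shelf implementations: homogeneity scores whether each cluster contains texts from only one human-assigned theme, and Cohen's kappa scores agreement between two coders. The minimal Python sketch below uses toy labels; these particular metric choices are common defaults assumed here for illustration, not the project's confirmed instruments.

    from sklearn.metrics import cohen_kappa_score, homogeneity_score

    # Toy labels: themes a human assigned vs. clusters the system proposed.
    human_themes = [0, 0, 1, 1, 2, 2]
    system_clusters = [0, 0, 1, 1, 1, 2]

    # Homogeneity is 1.0 when every cluster contains only one human theme.
    print("homogeneity:", round(homogeneity_score(human_themes, system_clusters), 3))

    # Inter-rater reliability for the with/without-system coding studies.
    coder_a = [0, 1, 1, 2, 0, 2]
    coder_b = [0, 1, 1, 2, 1, 2]
    print("kappa:", round(cohen_kappa_score(coder_a, coder_b), 3))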

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing 1-10 of 19)
Yuan, Chenhan and Rossi, Ryan and Katz, Andrew and Eldardiry, Hoda "A Reinforcement Learning Framework for N-Ary Document-Level Relation Extraction" IEEE Transactions on Big Data, v.11, 2025 https://doi.org/10.1109/TBDATA.2024.3410099
Alsharif, A. and Katz, A. "From Manual Coding to Machine Understanding: Students' Feedback Analysis", 2024
Anakok, I. and Huerta, M. and Katz, A. "Exploring the Impact of Engineering Projects in Community Service on Students' Perspectives About Engineering as a Major" Frontiers in Education, 2023 https://doi.org/10.1109/FIE58773.2023.10342965
Anakok, I. and Woods, J. and Huerta, M. and Schoepf, J. and Murzi, H. and Katz, A. "Students' Feedback About Their Experiences in EPICS Using Natural Language Processing" IEEE Frontiers in Education Conference, 2022 https://doi.org/10.1109/FIE56618.2022.9962557
Chew, K. and Ross, A. and Katz, A. and Matusovich, H. "Defining Assessment: Foundation Knowledge Toward Exploring Engineering Faculty's Assessment Mental Models" IEEE Frontiers in Education Conference, 2022 https://doi.org/10.1109/FIE56618.2022.9962667
Fleming, Gabriella Coloyan and Klopfer, Michelle and Katz, Andrew and Knight, David "What engineering employers want: An analysis of technical and professional skills in engineering job advertisements" Journal of Engineering Education, v.113, 2024 https://doi.org/10.1002/jee.20581
Hingle, A. and Katz, A. and Johri, A. "Exploring NLP-Based Methods for Generating Engineering Ethics Assessment Qualitative Codebooks" Proceedings of the Frontiers in Education Conference, 2023
Johnson, B. and Main, J. and Katz, A. "How Participating in Extracurricular Activities Supports Dimensions of Student Wellness" Frontiers in Education, 2023 https://doi.org/10.1109/FIE58773.2023.10343274
Johri, Aditya and Katz, Andrew S. and Qadir, Junaid and Hingle, Ashish "Generative artificial intelligence and engineering education" Journal of Engineering Education, v.112, 2023 https://doi.org/10.1002/jee.20537
Katz, A. and Paretti, M. and Shealy, T. and Bilow, F. "Board 236: Design for Sustainability: How Mental Models of Social-Ecological Systems Shape Engineering Design Decisions", 2024
Katz, Andrew and Gerhardt, Mitch and Soledad, Michelle "Using Generative Text Models to Create Qualitative Codebooks for Student Evaluations of Teaching" International Journal of Qualitative Methods, v.23, 2024 https://doi.org/10.1177/16094069241293283
