Award Abstract # 1920796
Advancing Computational Grounded Theory for Audiovisual Data from STEM Classrooms

NSF Org: DRL
Division of Research on Learning in Formal and Informal Settings (DRL)
Recipient: UNIVERSITY OF ILLINOIS
Initial Amendment Date: August 8, 2019
Latest Amendment Date: January 13, 2020
Award Number: 1920796
Award Instrument: Standard Grant
Program Manager: Jonathan Bostic
jdbostic@nsf.gov
 (703)292-2296
EDU
 Directorate for STEM Education
Start Date: September 1, 2019
End Date: August 31, 2024 (Estimated)
Total Intended Award Amount: $1,313,855.00
Total Awarded Amount to Date: $1,313,855.00
Funds Obligated to Date: FY 2019 = $1,313,855.00
History of Investigator:
  • Christina Krist (Principal Investigator)
    stinakrist@stanford.edu
  • Cynthia D'Angelo (Co-Principal Investigator)
  • Elizabeth Dyer (Co-Principal Investigator)
  • Joshua Rosenberg (Co-Principal Investigator)
  • Nigel Bosch (Co-Principal Investigator)
Recipient Sponsored Research Office: University of Illinois at Urbana-Champaign
506 S WRIGHT ST
URBANA
IL  US  61801-3620
(217)333-2187
Sponsor Congressional District: 13
Primary Place of Performance: University of Illinois at Urbana-Champaign
506 S. Wright Street
Urbana
IL  US  61801-3620
Primary Place of Performance Congressional District: 13
Unique Entity Identifier (UEI): Y8CWNJRCNN91
Parent UEI: V2PHZ2CSCH63
NSF Program(s): ECR-EDU Core Research
Primary Program Source: 04001920DB NSF Education & Human Resource
Program Reference Code(s): 8817
Program Element Code(s): 798000
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.076

ABSTRACT

This proposal was submitted in response to EHR Core Research (ECR) program announcement NSF 19-508. The ECR program of fundamental research in STEM education provides funding in critical research areas that are essential, broad and enduring. EHR seeks proposals that will help synthesize, build and/or expand research foundations in the following focal areas: STEM learning, STEM learning environments, STEM workforce development, and broadening participation in STEM. The ECR program is distinguished by its emphasis on the accumulation of robust evidence to inform efforts to (a) understand, (b) build theory to explain, and (c) suggest interventions (and innovations) to address persistent challenges in STEM interest, education, learning, and participation.

This EHR Core Research project is conducting methodological research on the computational analysis of video data focused on the social and spatial dimensions of STEM learning in classrooms. Video data are complex. They involve visual, acoustic, spatial, and temporal features that can be reduced in several ways. To date, analysis of video data of STEM classrooms has not been able to leverage computational power to take advantage of their richness. However, recent advancements in data science, coupled with existing speech analytics methods, make it possible to computationally identify important features from video in ways that preserve complexity and nuance. These advancements will improve research replicability. The methods developed through this project will facilitate use of sophisticated computational analysis with video data by more researchers. Application of these new methods will help increase the scale and generalizability of video research and lead to the building of new theory.

This research project builds on state-of-the-art computer vision and speech analytics methods tested on video data collected in STEM classrooms. It does so within a computational grounded theory methodological framework, which leverages the interpretive power of grounded analytical approaches with the processing power of computational methods. Specifically, two types of computational analysis procedures will be produced: (a) extracting meaningful features from video and audio data of STEM classrooms, and (b) conducting exploratory pattern identification using these extracted features. To develop these procedures, existing large-scale video datasets of STEM classrooms will be used to test and refine increasingly sophisticated analyses, which will also be used to demonstrate the application of these methods to investigate the social and spatial dimensions of STEM classrooms. The project focuses on integrating these methods to improve their power and leverages existing large-scale datasets of STEM classrooms, such that the methods developed can be tested on realistic data. The datasets are extensive enough to support the investigation of a wide range of research questions, including high-inference questions about students' participation in disciplinary practices. Finally, by pairing computational and grounded analytical methods, the project is developing methods that have the potential to enhance and test construct validity of the patterns found in the data.
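As a minimal sketch of what step (b), exploratory pattern identification, could look like in its simplest form: given one extracted feature value per time window, flag the windows that depart most from the rest so that a human analyst can review them. The function name `flag_outlier_windows`, the z-score threshold, and the sample values are illustrative assumptions, not the project's actual procedure.

```python
from statistics import mean, pstdev

def flag_outlier_windows(values, threshold=2.0):
    """Return indices of windows whose value lies more than `threshold`
    population standard deviations from the mean (hypothetical helper)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all windows identical: nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

# Made-up per-window loudness values; the last window is unusually loud.
loudness = [1.0] * 9 + [10.0]
flagged = flag_outlier_windows(loudness)
print(flagged)
```

The flagged indices are only candidates for closer viewing of the corresponding video segments; in a grounded approach, interpreting them remains a human task.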

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH


(Showing: 1 - 10 of 13)
D'Angelo, C. and Dyer, E. and Krist, C. and Rosenberg, J. and Bosch, N. "Advancing computational grounded theory for audiovisual data from mathematics classrooms" The Interdisciplinarity of the Learning Sciences, 14th International Conference of the Learning Sciences (ICLS) 2020, v.4, 2020
Dyer, E. B. and Parr, E. D. and Machaka, N. and Krist, C. "Understanding Joint Exploration: The Epistemic Positioning in Collaborative Activity in a Secondary Mathematics Classroom" Proceedings of the 43rd Annual Conference of PME-NA, 2021
Hur, Paul and Bosch, Nigel "Tracking Individuals in Classroom Videos via Post-processing OpenPose Data" LAK22: 12th International Learning Analytics and Knowledge Conference, 2022 https://doi.org/10.1145/3506860.3506888
Hur, Paul and Machaka, Nessrine and Krist, Christina and Bosch, Nigel "Informing Expert Feature Engineering through Automated Approaches: Implications for Coding Qualitative Classroom Video Data" LAK23: 13th International Learning Analytics and Knowledge Conference (LAK 2023), 2023 https://doi.org/10.1145/3576050.3576090
Kubsch, Marcus and Krist, Christina and Rosenberg, Joshua M. "Distributing epistemic functions and tasks: A framework for augmenting human analytic power with machine learning in science education research" Journal of Research in Science Teaching, v.60, 2022 https://doi.org/10.1002/tea.21803
Machaka, N. and Parr, E. D. and Dyer, E. B. and Krist, C. "Shifts in Positions, Epistemic Authority, and Epistemic Agency in a Secondary Mathematics Classroom" Proceedings of the 16th International Conference of the Learning Sciences, ICLS 2022, 2022
Palaguachi, C. and Cox, E. and D'Angelo, C. "Audio Analysis of Teacher Interactions with Small Groups in Classrooms" General Proceedings of the 15th International Conference on Computer-Supported Collaborative Learning 2022, 2022
Palaguachi, Chris and D'Angelo, Cynthia and Dyer, Elizabeth and Machaka, Nessrine "Audio Analysis of Group Discussion Patterns in Noisy Classrooms Before, During, and After Teacher-Group Interactions", 2024 https://doi.org/10.22318/cscl2024.727401
Parr, E. D. "Making space for joint exploration: The embodiment of social and epistemic positioning in student-teacher interaction." Proceedings of the 15th International Conference of the Learning Sciences, ICLS 2021, 2021
Parr, E. D. and Dyer, E. B. "Spin-Ups: How Teachers Scaffold Group Work with Whole Class Prompts and the Messages They Contain" Proceedings of the 43rd Annual Conference of PME-NA, 2021

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

This project developed computationally-infused methodologies for qualitative analysis of video data of STEM classrooms. Specifically, we developed (1) computational techniques to extract meaningful features from video and audio data; and (2) methodological frameworks and approaches that integrated computational tools in ways that supported the goals of qualitative analysis. Our case examples showcasing our use of these tools and approaches have also contributed to our understanding of learning in the context of mathematical problem solving.

Intellectual Merit. Our project’s commitment to centering the goals and values of qualitative research contrasts with predominant uses of computational tools to automate analyses, such as training a computer to code large amounts of qualitative data. While automating analysis might make qualitative research more efficient, we argue that such efficiency is in direct opposition to the broader epistemologies driving qualitative research. Accordingly, instead of using computational tools to code more data more quickly, we have used them to construct increasingly rich descriptions of (large amounts of) complex data in ways that allow us to understand the particularities, nuances, and complex processes evident within these data. The goals of such uses of computational tools are to develop new insights; to trouble existing understandings; and to generate new theoretical connections. To this end, we have shown how “errors” in computational detection, differences in human vs. computer noticing, and “outlier” moments of particular sets of audio and visual features have been especially important components of our methodological approaches. We position these approaches as explicitly feminist and anti-capitalist in that they expand beyond automation and seek to resist making data analysis more efficient at the expense of particularity, nuance, richness, and complexity.

Technical developments. Building from existing open-source tools for extraction of audio features (openSMILE) and visual aspects of human skeletal movement (OpenPose), we built custom tools for processing audiovisual data recorded in high school mathematics classrooms. These tools include an algorithm to track unique individuals across frames of video data within OpenPose (Hur & Bosch, 2022); bespoke methods for feature engineering (Hur, Machaka, Krist, & Bosch, 2023); and protocols for merging and syncing extracted audio feature data from openSMILE (Palaguachi, D’Angelo, Dyer, & Machaka, 2024; Cox, Dyer, & Krist, forthcoming).
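To illustrate the general problem of tracking individuals across frames, the sketch below links pose detections from one frame to the next by greedy nearest-centroid matching. It is not the algorithm published in Hur & Bosch (2022); the input layout (each person as a list of (x, y, confidence) keypoints), the function names, and the `max_dist` threshold are assumptions made for illustration.

```python
from math import hypot

def centroid(keypoints):
    """Mean (x, y) of keypoints with nonzero confidence, or None if none."""
    pts = [(x, y) for x, y, conf in keypoints if conf > 0]
    if not pts:
        return None
    return (sum(x for x, _ in pts) / len(pts),
            sum(y for _, y in pts) / len(pts))

def track(frames, max_dist=50.0):
    """Assign a persistent ID to every detected person in every frame.

    frames: list of frames; each frame is a list of people; each person
    is a list of (x, y, confidence) keypoints (assumed layout).
    Returns a list of ID lists parallel to `frames`.
    """
    next_id = 0
    last_seen = {}  # person ID -> most recent centroid
    all_ids = []
    for people in frames:
        ids, used = [], set()
        for person in people:
            c = centroid(person)
            best_id, best_d = None, max_dist
            if c is not None:
                # Greedy match: closest previously seen person not yet
                # claimed in this frame, within max_dist pixels.
                for pid, prev in last_seen.items():
                    if pid in used:
                        continue
                    d = hypot(c[0] - prev[0], c[1] - prev[1])
                    if d < best_d:
                        best_id, best_d = pid, d
            if best_id is None:
                best_id = next_id  # no match: treat as a new individual
                next_id += 1
            used.add(best_id)
            if c is not None:
                last_seen[best_id] = c
            ids.append(best_id)
        all_ids.append(ids)
    return all_ids

# Two people whose detection order swaps between frames (made-up data).
frames = [
    [[(10, 10, 0.9), (12, 14, 0.8)], [(200, 100, 0.9), (204, 108, 0.7)]],
    [[(203, 105, 0.9), (205, 111, 0.8)], [(11, 12, 0.9), (13, 16, 0.8)]],
]
ids = track(frames)
print(ids)
```

Greedy matching is the simplest choice here and can mislabel people who cross paths or are occluded; real classroom video typically needs more robust post-processing.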

Methodological developments. We developed a suite of methodological frameworks that provide epistemological and ideological guidance for integrating computational tools to augment qualitative analyses. These frameworks are not uniform; instead, they highlight the breadth of possibilities for careful and creative uses of computational methods in qualitative research. First, and most broadly, the DEFT framework (Distributing Epistemic Functions and Tasks; Kubsch, Krist, & Rosenberg, 2023) challenges researchers to consider the alignment between the goals of their research and the role of the computational tools they choose to use as part of an integrated human-computational system. Second, the CADAs framework (Computationally Assisted Descriptive Approaches; D’Angelo, Krist, & Dyer, forthcoming) provides a rationale and guidance for using audio and visual data as additional layers of rich description within analysis, making a dataset more complex and nuanced. Third, two papers present empirical examples of variations on Laura Nelson’s (2020) computational grounded theory (CGT) method, “remixing” the pattern exploration and data interpretation phases of CGT (Hur, Palaguachi, Machaka, Dyer, D’Angelo, & Bosch, forthcoming; Krist, Dyer, & Hall, forthcoming). Finally, in a forthcoming manuscript and 2025 AERA Presidential Session (Krist, Dyer, Rosenberg, & Cox), we articulate the ideological tensions we encountered throughout the project. Namely, we note that most computational tools and their associated metrics for rigor, reliability, accuracy, etc., have been developed with goals of automation and efficiency in mind. When we intentionally sought to subvert those goals, we were faced with new tensions and sets of decisions around when, why, and how to deploy various computational tools. We discuss those dilemmas and provide guidance for others as they grapple with similar tensions and decisions.

Broader Impacts. Our work has been disseminated broadly through a range of venues, including 8 empirical journal articles; 12 terminal conference proceedings; 14 conference presentations; 2 invited keynote presentations; 4 workshops; and an open-access textbook on integrating machine learning in science education research co-edited by PI Krist and including 5 chapters contributed by project team members. Our project website includes all workshop materials and a link to our GitHub repository.

Our frameworks hold broad applicability beyond the integration of computational tools focused on audio and visual features of data. For example, ChatGPT and generative AI rapidly emerged and gained popularity midway through the life of the project. We have continually responded to the AI “boom” with critical optimism, presenting our approaches and frameworks to challenge the field to proceed with caution (or to avoid altogether) when attempting automation in research. Our responses demonstrate our frameworks’ relevance and applicability to developing methodologies that infuse a broad range of computational tools into qualitative research.


Last Modified: 01/02/2025
Modified by: Christina Krist

Please report errors in award information by writing to: awardsearch@nsf.gov.
