
NSF Org: | Division of Research on Learning in Formal and Informal Settings (DRL) |
Recipient: |
|
Initial Amendment Date: | August 8, 2019 |
Latest Amendment Date: | January 13, 2020 |
Award Number: | 1920796 |
Award Instrument: | Standard Grant |
Program Manager: | Jonathan Bostic, jdbostic@nsf.gov, (703)292-2296, DRL Division of Research on Learning in Formal and Informal Settings, EDU Directorate for STEM Education |
Start Date: | September 1, 2019 |
End Date: | August 31, 2024 (Estimated) |
Total Intended Award Amount: | $1,313,855.00 |
Total Awarded Amount to Date: | $1,313,855.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: |
506 S WRIGHT ST URBANA IL US 61801-3620 (217)333-2187 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
506 S. Wright Street Urbana IL US 61801-3620 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | ECR-EDU Core Research |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.076 |
ABSTRACT
This proposal was submitted in response to EHR Core Research (ECR) program announcement NSF 19-508. The ECR program of fundamental research in STEM education provides funding in critical research areas that are essential, broad and enduring. EHR seeks proposals that will help synthesize, build and/or expand research foundations in the following focal areas: STEM learning, STEM learning environments, STEM workforce development, and broadening participation in STEM. The ECR program is distinguished by its emphasis on the accumulation of robust evidence to inform efforts to (a) understand, (b) build theory to explain, and (c) suggest interventions (and innovations) to address persistent challenges in STEM interest, education, learning, and participation.
This EHR Core Research project is conducting methodological research on the computational analysis of video data focused on the social and spatial dimensions of STEM learning in classrooms. Video data are complex. They involve visual, acoustic, spatial, and temporal features that can be reduced in several ways. To date, analysis of video data of STEM classrooms has not been able to leverage computational power to take advantage of their richness. However, recent advancements in data science, coupled with existing speech analytics methods, make it possible to computationally identify important features from video in ways that preserve complexity and nuance. These advancements will improve research replicability. The methods developed through this project will facilitate use of sophisticated computational analysis with video data by more researchers. Application of these new methods will help increase the scale and generalizability of video research and lead to the building of new theory.
This research project builds on state-of-the-art computer vision and speech analytics methods tested on video data collected in STEM classrooms. It does so within a computational grounded theory methodological framework, which combines the interpretive power of grounded analytical approaches with the processing power of computational methods. Specifically, two types of computational analysis procedures will be produced: (a) extracting meaningful features from video and audio data of STEM classrooms, and (b) conducting exploratory pattern identification using these extracted features. To develop these procedures, existing large-scale video datasets of STEM classrooms will be used to test and refine increasingly sophisticated analyses. These analyses will also demonstrate how the methods can be applied to investigate the social and spatial dimensions of STEM classrooms. The project focuses on integrating these methods to improve their power and leverages existing large-scale datasets of STEM classrooms, such that the methods developed can be tested on realistic data. The datasets are extensive enough to support the investigation of a wide range of research questions, including high-inference questions about students' participation in disciplinary practices. Finally, by pairing computational and grounded analytical methods, the project is developing methods that have the potential to enhance and test construct validity of the patterns found in the data.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
This project developed computationally-infused methodologies for qualitative analysis of video data of STEM classrooms. Specifically, we developed (1) computational techniques to extract meaningful features from video and audio data; and (2) methodological frameworks and approaches that integrated computational tools in ways that supported the goals of qualitative analysis. Our case examples showcasing our use of these tools and approaches have also contributed to our understanding of learning in the context of mathematical problem solving.
Intellectual Merit. Our project’s commitment to centering the goals and values of qualitative research contrasts with predominant uses of computational tools to automate analyses, such as training a computer to code large amounts of qualitative data. While automating analysis might make qualitative research more efficient, we argue that such efficiency is in direct opposition to the broader epistemologies driving qualitative research. Accordingly, instead of using computational tools to code more data more quickly, we have used them to construct increasingly rich descriptions of (large amounts of) complex data in ways that allow us to understand the particularities, nuances, and complex processes evident within these data. The goals of such uses of computational tools are to develop new insights; to trouble existing understandings; and to generate new theoretical connections. To this end, we have shown how “errors” in computational detection, differences in human vs. computer noticing, and “outlier” moments of particular sets of audio and visual features have been especially important components of our methodological approaches. We position these approaches as explicitly feminist and anti-capitalist in that they expand beyond automation and seek to resist making data analysis more efficient at the expense of particularity, nuance, richness, and complexity.
Technical developments. Building from existing open-source tools for extraction of audio features (openSMILE) and visual aspects of human skeletal movement (OpenPose), we built custom tools for processing audiovisual data recorded in high school mathematics classrooms. These tools include an algorithm to track unique individuals across frames of video data within OpenPose (Hur & Bosch, 2022); bespoke methods for feature engineering (Hur, Machaka, Krist, & Bosch, 2023); and protocols for merging and syncing extracted audio feature data from openSMILE (Palaguachi, D’Angelo, Dyer, & Machaka, 2024; Cox, Dyer, & Krist, forthcoming).
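To make the idea of tracking individuals across frames concrete, the following is a minimal illustrative sketch and not the algorithm published by the project team (Hur & Bosch, 2022), whose details are not given in this report. It assumes each detected skeleton is a list of (x, y, confidence) keypoint triples, as pose estimators such as OpenPose commonly report, and it links skeletons between frames by greedy nearest-centroid matching. The function names `centroid` and `track` and the `max_dist` threshold are hypothetical choices made for this example.

```python
from math import hypot


def centroid(keypoints):
    """Mean (x, y) of detected keypoints; confidence 0 is treated as missing."""
    pts = [(x, y) for x, y, conf in keypoints if conf > 0]
    if not pts:
        return None
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)


def track(frames, max_dist=50.0):
    """Assign a persistent integer ID to every skeleton in every frame.

    frames: list of frames; each frame is a list of skeletons; each skeleton
            is a list of (x, y, confidence) triples (an assumed input format).
    max_dist: hypothetical pixel threshold beyond which a detection is
              considered a new person rather than a known one.
    Returns a list of ID lists parallel to `frames`; a skeleton with no
    usable keypoints gets None.
    """
    next_id = 0
    last_seen = {}  # track ID -> most recent centroid
    all_ids = []
    for skeletons in frames:
        cents = [centroid(s) for s in skeletons]
        # Every (detection, known track) pair, closest pairs first.
        pairs = sorted(
            (hypot(c[0] - p[0], c[1] - p[1]), i, tid)
            for i, c in enumerate(cents) if c is not None
            for tid, p in last_seen.items()
        )
        ids = [None] * len(skeletons)
        used = set()
        for dist, i, tid in pairs:
            if dist > max_dist:
                break  # remaining pairs are even farther apart
            if ids[i] is None and tid not in used:
                ids[i] = tid
                used.add(tid)
        # Unmatched detections start new tracks; all matches update positions.
        for i, c in enumerate(cents):
            if c is None:
                continue
            if ids[i] is None:
                ids[i] = next_id
                next_id += 1
            last_seen[ids[i]] = c
        all_ids.append(ids)
    return all_ids
```

A greedy matcher like this is easy to inspect by hand, which suits exploratory qualitative work, but it can swap identities when people cross paths or are occluded. Those are the kinds of computational "errors" that a human analyst would then need to review.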
Methodological developments. We developed a suite of methodological frameworks that provide epistemological and ideological guidance for integrating computational tools to augment qualitative analyses. These frameworks are not uniform; instead, they highlight the breadth of possibilities for careful and creative uses of computational methods in qualitative research. First, and most broadly, the DEFT framework (Distributing Epistemic Functions and Tasks; Kubsch, Krist, & Rosenberg, 2023) challenges researchers to consider the alignment between the goals of their research and the role of the computational tools they choose to use as part of an integrated human-computational system. Second, the CADAs framework (Computationally Assisted Descriptive Approaches; D’Angelo, Krist, & Dyer, forthcoming) provides a rationale and guidance for using audio and visual data as additional layers of rich description within analysis, making a dataset more complex and nuanced. Third, two papers present empirical examples of variations on Laura Nelson’s (2020) computational grounded theory (CGT) method, “remixing” the pattern exploration and data interpretation phases of CGT (Hur, Palaguachi, Machaka, Dyer, D’Angelo, & Bosch, forthcoming; Krist, Dyer, & Hall, forthcoming). Finally, in a forthcoming manuscript and 2025 AERA Presidential Session (Krist, Dyer, Rosenberg, & Cox), we articulate the ideological tensions we encountered throughout the project. Namely, we note that most computational tools and their associated metrics for rigor, reliability, accuracy, etc., have been developed with goals of automation and efficiency in mind. When we intentionally sought to subvert those goals, we were faced with new tensions and sets of decisions around when, why, and how to deploy various computational tools. We discuss those dilemmas and provide guidance for others as they grapple with similar tensions and decisions.
Broader Impacts. Our work has been disseminated broadly through a range of venues, including 8 empirical journal articles; 12 terminal conference proceedings; 14 conference presentations; 2 invited keynote presentations; 4 workshops; and an open-access textbook on integrating machine learning in science education research co-edited by PI Krist and including 5 chapters contributed by project team members. Our project website includes all workshop materials and a link to our GitHub repository.
Our frameworks hold broad applicability beyond the integration of computational tools focused on audio and visual features of data. For example, ChatGPT and generative AI rapidly emerged and gained popularity midway through the life of the project. We have continually responded to the AI “boom” with critical optimism, presenting our approaches and frameworks to challenge the field to proceed with caution when attempting automation in research, or to avoid it altogether. Our responses demonstrate our frameworks’ relevance and applicability to developing methodologies that infuse a broad range of computational tools into qualitative research.
Last Modified: 01/02/2025
Modified by: Christina Krist