Award Abstract # 1740765
Collaborative Research: Community-Building and Infrastructure Design for Data-Intensive Research in Computer Science Education

NSF Org: DRL
Division of Research on Learning in Formal and Informal Settings (DRL)
Recipient: VIRGINIA POLYTECHNIC INSTITUTE & STATE UNIVERSITY
Initial Amendment Date: August 15, 2017
Latest Amendment Date: August 15, 2017
Award Number: 1740765
Award Instrument: Standard Grant
Program Manager: Wu He
wuhe@nsf.gov
 (703)292-7593
DRL
 Division of Research on Learning in Formal and Informal Settings (DRL)
EDU
 Directorate for STEM Education
Start Date: September 1, 2017
End Date: December 31, 2021 (Estimated)
Total Intended Award Amount: $268,941.00
Total Awarded Amount to Date: $268,941.00
Funds Obligated to Date: FY 2017 = $268,941.00
History of Investigator:
  • Clifford Shaffer (Principal Investigator)
    shaffer@vt.edu
  • Stephen Edwards (Co-Principal Investigator)
Recipient Sponsored Research Office: Virginia Polytechnic Institute and State University
300 TURNER ST NW
BLACKSBURG
VA  US  24060-3359
(540)231-5281
Sponsor Congressional District: 09
Primary Place of Performance: Virginia Polytechnic Institute and State University
Sponsored Programs 0170
Blacksburg
VA  US  24061-0001
Primary Place of Performance Congressional District: 09
Unique Entity Identifier (UEI): QDE5UHE5XD16
Parent UEI: X6KEFGLHSJX7
NSF Program(s): ECR-EDU Core Research
Primary Program Source: 04001718DB NSF Education & Human Resource
Program Reference Code(s): 8083, 7433
Program Element Code(s): 798000
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.076

ABSTRACT

The Building Community and Capacity in Data Intensive Research in Education program seeks to enable research communities to develop visions, teams, and capabilities dedicated to creating new, large-scale, next-generation data resources and relevant analytic techniques to advance fundamental research for areas of research covered by the Education and Human Resources Directorate. Successful proposals will outline activities that will have significant impacts across multiple fields by enabling new types of data-intensive research. Online educational systems, and the large-scale data streams that they generate, have the potential to transform education as well as our scientific understanding of learning. Computer Science Education (CSE) researchers are increasingly making use of large collections of data generated by the click streams coming from eTextbooks, interactive programming environments, and other smart content. However, CSE research faces barriers that slow progress: 1) Computer science learning process and outcome data collected by one system are not compatible with data from other systems. 2) Data from computer science problem solving and learning (e.g., open-ended coding solutions to complex problems) are quite different from the type of data (e.g., discrete answers to questions or verbal responses) that current educational data mining focuses on. This project will build community and capacity among CSE researchers, data scientists, and learning scientists toward reducing these barriers and realizing the full potential of data-intensive research on learning and improving computer science education. The project will bring together CSE tool-building communities with learning science and technology researchers to develop a software infrastructure that supports scaled and sustainable data-intensive research in CSE and contributes to the basic science of human learning of complex problem solving.
The project will support community-building and infrastructure capacity-building whose ultimate goal is to develop and disseminate infrastructure that facilitates three aspects of CSE research: (1) development and broader re-use of innovative learning content that is instrumented for rich data collection, (2) formats and tools for analysis of learner data, and (3) best practices to make large collections of learner data and associated analytics available to researchers in CSE, data science, or learning science. To achieve these goals, a large community of researchers will be engaged to define, develop, and use critical elements of this infrastructure toward addressing specific data-intensive research questions. The project will host workshops, meetings, and online forums leveraging existing communities and building new capacities toward significant research outcomes and lasting infrastructure support.

This project will provide an infrastructure that can support various kinds of research in the CSE domain as a one-stop shop, and will be the first to focus on full-cycle educational research infrastructure in any domain. CSE tool developers and educators will become more productive at creating and integrating advanced technologies and novel analytics. Learning researchers will have better tools for analyzing the huge amounts of learner data that modern digital education software produces. Data scientists will have rich new datasets in which to explore new machine learning and statistical techniques. Collectively, these efforts will reduce barriers to educational innovation and support scientific discoveries about the nature of complex learning and how best to enhance it. The project will support scientific investigations through community meetings and mini-grants to others addressing questions such as: What is the optimal ratio of solution examples and problem-solving practice? How do computational thinking skills emerge? In what quanta are programming skills acquired? Can automated tutoring of programming be effective at scale in enhancing student learning? Many of the innovations developed under this project will directly impact learning in any discipline. In the future, educational software will be developed more quickly and will more easily generate meaningful learner data, which in turn can be more easily analyzed.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Price, Thomas W. and Hovemeyer, David and Rivers, Kelly and Gao, Ge and Bart, Austin Cory and Kazerouni, Ayaan M. and Becker, Brett A. and Petersen, Andrew and Gusukuma, Luke and Edwards, Stephen H. and Babcock, David "ProgSnap2: A Flexible Format for Programming Process Data" Proceedings of the 2020 ACM Conference on Innovation and Technology in Computer Science Education, 2020 https://doi.org/10.1145/3341525.3387373
Manzoor, Hamza and Naik, Amit and Shaffer, Clifford A. and North, Chris and Edwards, Stephen H. "Auto-Grading Jupyter Notebooks" SIGCSE '20: Proceedings of the 51st ACM Technical Symposium on Computer Science Education, 2020 https://doi.org/10.1145/3328778.3366947
Mohammed, Mostafa and Shaffer, Clifford A. "Clickstream Data from a Formal Languages eTextbook" Proceedings of the 5th Educational Data Mining in Computer Science Education (CSEDM) Workshop, 2021
Mohammed, Mostafa and Shaffer, Clifford A. "Increasing Student Interaction with an eTextbook using Programmed Instruction" Proceedings of Third Workshop on Intelligent Textbooks (iTextbooks), 2021
Mohammed, Mostafa and Shaffer, Clifford A. and Rodger, Susan H. "Teaching Formal Languages with Visualizations and Auto-Graded Exercises" SIGCSE '21: Proceedings of the 52nd ACM Technical Symposium on Computer Science Education, 2021 https://doi.org/10.1145/3408877.3432398
Farghally, M. F. "Student Perceptions of the Complete Online Transition of Two CS Courses in Response to the COVID-19 Pandemic" 2021 ASEE Virtual Annual Conference Content Access, Virtual Conference, 2021
Hicks, A. and Akhuseyinoglu, K. and Shaffer, C. and Brusilovsky, P. "Live Catalog of Smart Learning Objects for Computer Science Education" Proceedings of Sixth SPLICE Workshop "Building an Infrastructure for Computer Science Education Research and Practice at Scale" at ACM Learning at Scale 2020, Virtual, 2020
Hicks, Alexander and Shaffer, Clifford A. "Containerizing an eTextbook Infrastructure" Proceedings of the 5th Educational Data Mining in Computer Science Education (CSEDM) Workshop, 2021
Ellis, Margaret and Edwards, Stephen H. and Shaffer, Clifford A. and Amelink, Catherine T. "Incorporating Practical Computing Skills Into a Supplemental CS2 Problem Solving Course" Journal of Higher Education Theory and Practice, v.20, 2020 https://doi.org/10.33423/jhetp.v20i11.3771

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

SPLICE is a community-building effort that brings together Computer Science Education (CSEd) researchers, data scientists, and learning scientists to reduce barriers that educators historically have faced in adopting innovative educational software. We developed the means for CS educators to more easily adopt and integrate new smart tools for supporting learning. This infrastructure is, in turn, supporting scaled and sustainable data-intensive research in CSEd that contributes to the basic science of human learning of complex problem-solving. Over the life of our project, we engaged the CSEd community through some 20 workshops, training sessions, and community events; websites containing developer resources, tutorials, and software demonstrations; working groups that focused intensively on key standards and infrastructure issues; and data set collection and distribution.

Over the past four years, SPLICE has championed and helped popularize software interoperability standards within the CSEd community, such that use of Learning Tools Interoperability (LTI) is now normal and expected within the community. We have worked to make it easy for our community of innovative educational tool developers to distribute their projects within the broader CS Education community. Our working groups helped to create and distribute standards for collecting data related to small programming exercises, an important and widespread artifact in CS education, and we have made progress on standards to allow faculty to package course curriculum for sharing with others. We have helped to standardize data collection from eTextbooks and associated interactive exercises. We have encouraged researchers to collect and then share nearly 60 CSEd datasets so that other researchers can harness the massive quantities of student analytics data that modern education systems generate, but which historically has gone to waste.

The SPLICE Live Catalog is an innovative way to allow potential adopters of CS Education software to try out the software themselves within the context of an actual Learning Management System. As the community continues to contribute interoperable software to the live catalog, it encourages wider adoption of LTI and similar standards. We directly interacted with hundreds of members of the community through a series of over a dozen workshops, plus special events such as the weeklong LearnLab Summer Schools and a series of CSEDM Data Challenges at the annual Educational Data Mining conference. We supported a number of development efforts through mini-grants and through working groups.

Last Modified: 05/30/2022
Modified by: Clifford A Shaffer