Proceedings

LAK '24: Proceedings of the 14th Learning Analytics and Knowledge Conference

SESSION: Research Articles

Equity-Forward Learning Analytics: Designing a Dashboard to Support Marginalized Student Success

Jay Sloan-Lynch
Robert Morse

Student outcomes in US higher education exhibit deep and persistent inequities. The continued underperformance of historically marginalized students remains a serious concern across higher education, reflected in increasing efforts among institutions to infuse diversity, equity, and inclusion into their academic and social communities. Yet despite widespread recognition of these inequities, few studies in the learning analytics literature engage in practical ways with issues of educational equity or DEI considerations. In this paper, we share our work supporting a large college's strategic DEI goals through the creation of a Course Diversity Dashboard informed by research into how students’ study behaviors and performance interact with their gender and ethnic identities to impact course outcomes. The dashboard enables users to explore inequalities in course outcomes and take concrete actions to improve student study strategies, time management, and prior knowledge. Results from our research revealed the existence of previously hidden learner inequities in all courses included in our study as well as critical differences in underrepresented minority students’ prior knowledge. And while we did not find evidence of meaningful differences in the study behaviors of student subgroups, our findings further validate the effectiveness of evidence-informed study strategies in an authentic educational setting.

Automating Human Tutor-Style Programming Feedback: Leveraging GPT-4 Tutor Model for Hint Generation and GPT-3.5 Student Model for Hint Validation

Tung Phung
Victor-Alexandru Pădurean
Anjali Singh
Christopher Brooks
José Cambronero
Sumit Gulwani
Adish Singla
Gustavo Soares

Generative AI and large language models hold great promise in enhancing programming education by automatically generating individualized feedback for students. We investigate the role of generative AI models in providing human tutor-style programming hints to help students resolve errors in their buggy programs. Recent works have benchmarked state-of-the-art models for various feedback generation scenarios; however, their overall quality is still inferior to human tutors and not yet ready for real-world deployment. In this paper, we seek to push the limits of generative AI models toward providing high-quality programming hints and develop a novel technique, GPT4HINTS-GPT3.5VAL. As a first step, our technique leverages GPT-4 as a “tutor” model to generate hints – it boosts the generative quality by using symbolic information of failing test cases and fixes in prompts. As a next step, our technique leverages GPT-3.5, a weaker model, as a “student” model to further validate the hint quality – it performs an automatic quality validation by simulating the potential utility of providing this feedback. We show the efficacy of our technique via extensive evaluation using three real-world datasets of Python programs covering a variety of concepts ranging from basic algorithms to regular expressions and data analysis using the pandas library.

SLADE: A Method for Designing Human-Centred Learning Analytics Systems

Riordan Alfredo
Vanessa Echeverria
Yueqiao Jin
Zachari Swiecki
Dragan Gašević
Roberto Martinez-Maldonado

There is a growing interest in creating Learning Analytics (LA) systems that incorporate student perspectives. Yet, many LA systems still lean towards a technology-centric approach, potentially overlooking human values and the necessity of human oversight in automation. Although some recent LA studies have adopted a human-centred design stance, there is still limited research on establishing safe, reliable, and trustworthy systems during the early stages of LA design. Drawing from a newly proposed framework for human-centred artificial intelligence, we introduce SLADE, a method for ideating and identifying features of human-centred LA systems that balance human control and computer automation. We illustrate SLADE’s application in designing LA systems to support collaborative learning in healthcare. Twenty-one third-year students participated in design sessions through SLADE’s four steps: i) identifying challenges and corresponding LA systems; ii) prioritising these LA systems; iii) ideating human control and automation features; and iv) refining features emphasising safety, reliability, and trustworthiness. Our results demonstrate SLADE’s potential to assist researchers and designers in: 1) aligning authentic student challenges with LA systems through both divergent ideation and convergent prioritisation; 2) understanding students’ perspectives on personal agency and delegation to teachers; and 3) fostering discussions about the safety, reliability, and trustworthiness of LA solutions.

Novice programmers inaccurately monitor the quality of their work and their peers’ work in an introductory computer science course

Elizabeth B. Cloude
Pranshu Kumar
Ryan S. Baker
Eric Fouh

A student’s ability to accurately evaluate the quality of their work holds significant implications for their self-regulated learning and problem-solving proficiency in introductory programming. A widespread cognitive bias that frequently impedes accurate self-assessment is overconfidence, which often stems from a misjudgment of contextual and task-related cues, including students’ judgment of their peers’ competencies. Little research has explored the role of overconfidence in novice programmers’ ability to accurately monitor their own work in comparison to their peers’ work and its impact on performance in introductory programming courses. The present study examined whether novice programmers exhibited a common cognitive bias called the "hard-easy effect", where students believe their work is better than their peers’ on easier tasks (overplace) but worse than their peers’ on harder tasks (underplace). Results showed a reversal of the hard-easy effect, where novices tended to overplace themselves on harder tasks, yet underplace themselves on easier ones. Remarkably, underplacers performed better on an exam compared to overplacers. These findings advance our understanding of relationships between the hard-easy effect, monitoring accuracy across multiple tasks, and grades within introductory programming. Implications of this study can be used to guide instructional decision making and design to improve novices’ metacognitive awareness and performance in introductory programming courses.

Improving Model Fairness with Time-Augmented Bayesian Knowledge Tracing

Jake Barrett
Alasdair Day
Kobi Gal

Modelling student performance is an increasingly popular goal in the learning analytics community. A common method for this task is Bayesian Knowledge Tracing (BKT), which predicts student performance and topic mastery using the student’s answer history. While BKT has strong qualities and good empirical performance, like many machine learning approaches it can be prone to bias. In this study we demonstrate an inherent bias in BKT with respect to students’ income support levels and gender, using publicly available data. We find that this bias is likely a result of the model’s ‘slip’ parameter disregarding answer speed when deciding if a student has lost mastery status. We propose a new BKT model variation that directly considers answer speed, resulting in a significant fairness increase without sacrificing model performance. We discuss the role of answer speed as a potential cause of BKT model bias, as well as a method to minimise bias in future implementations.
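
For readers unfamiliar with BKT, the following is a minimal sketch of the standard (textbook) update, not the time-augmented variant the abstract proposes. The function and parameter names (`bkt_update`, `predict_correct`, `p_learn`, `p_slip`, `p_guess`) are illustrative assumptions. It shows where the 'slip' parameter enters: the probability that a student who has mastered a topic still answers incorrectly is fixed, regardless of how quickly the answer was given.

```python
def predict_correct(p_mastery, p_slip, p_guess):
    """Probability of a correct answer given the current mastery estimate."""
    return p_mastery * (1 - p_slip) + (1 - p_mastery) * p_guess


def bkt_update(p_mastery, correct, p_learn, p_slip, p_guess):
    """One standard BKT step: condition on the answer, then apply learning."""
    if correct:
        # Correct answer: either mastered and did not slip, or guessed.
        numerator = p_mastery * (1 - p_slip)
        denominator = numerator + (1 - p_mastery) * p_guess
    else:
        # Incorrect answer: either mastered but slipped, or did not guess.
        numerator = p_mastery * p_slip
        denominator = numerator + (1 - p_mastery) * (1 - p_guess)
    posterior = numerator / denominator
    # Chance of moving from unmastered to mastered after this opportunity.
    return posterior + (1 - posterior) * p_learn
```

In this standard form an incorrect answer lowers the mastery estimate by the same amount whether it was given quickly or slowly; a variant that considers answer speed would have to change how `p_slip` is used.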

Feedback on Feedback: Comparing Classic Natural Language Processing and Generative AI to Evaluate Peer Feedback

Stephen Hutt
Allison DePiro
Joann Wang
Sam Rhodes
Ryan S Baker
Grayson Hieb
Sheela Sethuraman
Jaclyn Ocumpaugh
Caitlin Mills

Peer feedback can be a powerful tool as it presents learning opportunities for both the learner receiving feedback and the learner providing feedback. Despite its utility, it can be difficult to implement effectively, particularly for younger learners, who are often novices at providing feedback. It can be difficult for students to learn what constitutes “good” feedback – particularly in open-ended problem-solving contexts. To address this gap, we investigate both classical natural language processing techniques and large language models, specifically ChatGPT, as potential approaches to devise an automated detector of feedback quality (including both student progress towards goals and next steps needed). Our findings indicate that the classical detectors are highly accurate and, through feature analysis, we elucidate the pivotal elements influencing their decision process. We find that ChatGPT is less accurate than classical NLP but illustrate the potential of ChatGPT in evaluating feedback, by generating explanations for ratings, along with scores. We discuss how the detector can be used for automated feedback evaluation and to better scaffold peer feedback for younger learners.

Long-Term Prediction from Topic-Level Knowledge and Engagement in Mathematics Learning

Andres Felipe Zambrano
Ryan S. Baker

During middle school, students' learning experiences begin to influence their future decisions about college enrollment and career selection. Prior research indicates that both knowledge gained and the disengagement and affect experienced during this period are predictors of these future outcomes. However, this past research has investigated affect, disengagement, and knowledge in an overall fashion – looking at the average manifestation of these constructs across all topics studied across a year of mathematics. It may be that some mathematics topics are more associated with these outcomes than others. In this study, we use data from middle school students interacting with a digital mathematics learning platform, to analyze the interplay of these features across different topic areas. Our findings show that mastering Functions is the most important predictor of both college enrollment and STEM career selection, while the importance of knowing other topic areas varies across the two outcomes. Furthermore, while subject knowledge tends to be the most relevant predictor for general college enrollment, affective states, especially confusion and engaged concentration, become more important for predicting STEM career selection.

Towards Comprehensive Monitoring of Graduate Attribute Development: A Learning Analytics Approach in Higher Education

Abhinava Barthakur
Jelena Jovanovic
Andrew Zamecnik
Vitomir Kovanovic
Gongjun Xu
Shane Dawson

In response to the evolving demands of the contemporary workplace, higher education (HE) institutions are increasingly emphasising the development of transversal skills and graduate attributes (GAs). The development of GAs, such as effective communication, collaboration, and lifelong learning, is non-linear and follows distinct trajectories for individual learners. The ability to trace and measure the progression of GAs remains a significant challenge. While previous studies have focused on empirical methods for measuring GAs in individual courses, a notable gap exists in understanding their longitudinal development within HE programs. To address this research gap, our study focuses on measuring and tracing the development of GAs in an Initial Teacher Education (ITE) undergraduate program at a large public university in Australia. By combining learning analytics (LA) with psychometric models, we analysed students’ assessment grades to measure learners’ GA development in each year of the ITE program. The resulting measurements enabled the identification of distinct profiles of GA attainment, as demonstrated by learners and their distinct pathways. The overall approach allows for a comprehensive representation of a learner's progress throughout the program of study. As such, the developed approach sets the grounds for more personalised learning support, program evaluation, and improvement of students’ GA attainment.

Epistemic Network Analysis for End-users: Closing the Loop in the Context of Multimodal Analytics for Collaborative Team Learning

Linxuan Zhao
Vanessa Echeverria
Zachari Swiecki
Lixiang Yan
Riordan Alfredo
Xinyu Li
Dragan Gasevic
Roberto Martinez-Maldonado

Effective collaboration and team communication are critical across many sectors. However, the complex dynamics of collaboration in physical learning spaces, with overlapping dialogue segments and varying participant interactions, pose assessment challenges for educators and self-reflection difficulties for students. Epistemic network analysis (ENA) is a relatively novel technique that has been used in learning analytics (LA) to unpack salient aspects of group communication. Yet, most LA works based on ENA have primarily sought to advance research knowledge rather than directly aid teachers and students by closing the LA loop. We address this gap by conducting a study in which we i) engaged teachers in designing human-centred versions of epistemic networks; ii) formulated an NLP methodology to code physically distributed dialogue segments of students based on multimodal (audio and positioning) data, enabling automatic generation of epistemic networks; and iii) deployed the automatically generated epistemic networks in 28 authentic learning sessions and investigated how they can support teaching. The results indicate the viability of completing the analytics loop through the design of streamlined epistemic network representations that enable teachers to support students’ reflections.

Generative Artificial Intelligence in Learning Analytics: Contextualising Opportunities and Challenges through the Learning Analytics Cycle

Lixiang Yan
Roberto Martinez-Maldonado
Dragan Gasevic

Generative artificial intelligence (GenAI), exemplified by ChatGPT, Midjourney, and other state-of-the-art large language models and diffusion models, holds significant potential for transforming education and enhancing human productivity. While the prevalence of GenAI in education has motivated numerous research initiatives, integrating these technologies within the learning analytics (LA) cycle and their implications for practical interventions remain underexplored. This paper delves into the prospective opportunities and challenges GenAI poses for advancing LA. We present a concise overview of the current GenAI landscape and contextualise its potential roles within Clow’s generic framework of the LA cycle. We posit that GenAI can play pivotal roles in analysing unstructured data, generating synthetic learner data, enriching multimodal learner interactions, advancing interactive and explanatory analytics, and facilitating personalisation and adaptive interventions. As the lines blur between learners and GenAI tools, a renewed understanding of learners is needed. Future research can delve deep into frameworks and methodologies that advocate for human-AI collaboration. The LA community can play a pivotal role in capturing data about human and AI contributions and exploring how they can collaborate most effectively. As LA advances, it is essential to consider the pedagogical implications and broader socioeconomic impact of GenAI for ensuring an inclusive future.

TeamSlides: a Multimodal Teamwork Analytics Dashboard for Teacher-guided Reflection in a Physical Learning Space

Vanessa Echeverria
Lixiang Yan
Linxuan Zhao
Sophie Abel
Riordan Alfredo
Samantha Dix
Hollie Jaggard
Rosie Wotherspoon
Abra Osborne
Simon Buckingham Shum
Dragan Gasevic
Roberto Martinez-Maldonado

Advancements in Multimodal Learning Analytics (MMLA) have the potential to enhance the development of effective teamwork skills and foster reflection on collaboration dynamics in physical learning environments. Yet, only a few MMLA studies have closed the learning analytics loop by making MMLA solutions immediately accessible to educators to support reflective practices, especially in authentic settings. Moreover, deploying MMLA solutions in authentic settings can bring new challenges beyond logistic and privacy issues. This paper reports the design and use of TeamSlides, a multimodal teamwork analytics dashboard to support teacher-guided reflection. We conducted an in-the-wild classroom study involving 11 teachers and 138 students. Multimodal data were collected from students working in team healthcare simulations. We examined how teachers used the dashboard in 22 debrief sessions to aid their reflective practices. We also interviewed teachers to discuss their perceptions of the dashboard’s value and the challenges faced during its use. Our results suggest that the dashboard effectively reinforced discussions and augmented teacher-guided reflection practices. However, teachers encountered interpretation conflicts, sometimes leading to mistrust or misrepresenting the information. We discuss the considerations needed to overcome these challenges in MMLA research.

Adaptation of the Multi-Concept Multivariate Elo Rating System to Medical Students' Training Data

Erva Nihan Kandemir
Jill-Jênn Vie
Adam Sanchez-Ayte
Olivier Palombi
Franck Ramus

Accurate estimation of question difficulty and prediction of student performance play key roles in optimizing educational instruction and enhancing learning outcomes within digital learning platforms. The Elo rating system is widely recognized for its proficiency in predicting student performance by estimating both question difficulty and student ability while providing computational efficiency and real-time adaptivity. This paper presents an adaptation of a multi-concept variant of the Elo rating system to the data collected by a medical training platform—a platform characterized by a vast knowledge corpus, substantial inter-concept overlap, a huge question bank with significant sparsity in user-question interactions, and a highly diverse user population, presenting unique challenges. Our study is driven by two primary objectives: firstly, to comprehensively evaluate the Elo rating system’s capabilities on this real-life data, and secondly, to tackle the issue of imprecise early-stage estimations when implementing the Elo rating system for online assessments. Our findings suggest that the Elo rating system exhibits comparable accuracy to the well-established logistic regression model in predicting final exam outcomes for users within our digital platform. Furthermore, results underscore that initializing Elo rating estimates with historical data remarkably reduces errors and enhances prediction accuracy, especially during the initial phases of student interactions.
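
As background, here is a minimal sketch of the basic single-concept Elo update for educational data, not the multi-concept multivariate variant the abstract adapts. The names `expected_correct` and `elo_update`, and the step size `k=0.4`, are illustrative assumptions. It shows how one answer jointly updates a student's ability estimate and a question's difficulty estimate.

```python
import math


def expected_correct(ability, difficulty):
    """Logistic probability that the student answers the question correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def elo_update(ability, difficulty, correct, k=0.4):
    """Update both estimates from one observed answer (k is an assumed step size)."""
    error = (1.0 if correct else 0.0) - expected_correct(ability, difficulty)
    # A surprising correct answer raises ability and lowers difficulty;
    # a surprising incorrect answer does the opposite.
    return ability + k * error, difficulty - k * error
```

Because each answer triggers only a constant-time update, estimates can be maintained in real time. Starting `ability` and `difficulty` from values fitted on historical data rather than from zero is one way to read the abstract's point about reducing early-stage error.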

Millions of Views, But Does It Promote Learning? Analyzing Popular SciComm Production Styles Regarding Learning Success, User Behavior and Perception

Hendrik Steinbeck
Mohamed Elhayany
Christoph Meinel

With a rising amount of highly successful educational content on major video platforms, science communication (SciComm) can be considered mainstream. Although its success in terms of social media metrics (e.g. followers and watch time) is beyond doubt, the learning mechanisms of these production styles are under-researched. Through a between-subjects design with 980 adult learners in a MOOC about data science, this study analyzes how much of a difference four popular SciComm production styles about relational databases make with regard to perceived quality, learning success and technical user behavior. Testing the isolated effect of production style showed no statistically significant difference overall. Additionally, a multivariate regression model estimating the overall course points with robust standard errors showed six significant variables; the time spent with the material and the number of exercise submissions are particularly noteworthy. Based on our results, an underlying (video) script is more relevant than the actual production style. Prioritizing the preparation of this material instead of following a specific, pre-existing video production style is recommended.

Mirror mirror on the wall, what is missing in my pedagogical goals? The Impact of an AI-Driven Feedback System on the Quality of Teacher-Created Learning Designs

Gerti Pishtari
Edna Sarmiento-Márquez
María Jesús Rodríguez-Triana
Marlene Wagner
Tobias Ley

Given the rising prominence of Artificial Intelligence (AI) in education, understanding its impact on teacher practices is essential. This paper presents an ABAB reversal design study conducted during a teacher training, where an AI-driven feedback system helped 19 teachers to create learning designs. It investigates the impact that the AI-driven feedback had on the quality of designs and assesses pre- and post-training shifts in teachers’ perceptions of the technology. We observed statistical differences between designs crafted without (in phase A1) and with AI (B1). Notably, a small positive influence persisted even after AI withdrawal (A2). This hints that specialized AI algorithms for learning design can assist teachers in effectively achieving their design objectives. Despite noticeable shifts in teachers’ perceived understanding and usefulness of AI, their trust and intention to use remained unchanged. For a successful teacher-AI partnership, future research should explore the long-term impact that AI usage can have on learning design practices and strategies to nurture trust.

Neural Epistemic Network Analysis: Combining Graph Neural Networks and Epistemic Network Analysis to Model Collaborative Processes

Zheng Fang
Weiqing Wang
Guanliang Chen
Zachari Swiecki

We report on the design and evaluation of a novel technique for analysing the sociocognitive nature of collaborative problem-solving—neural epistemic network analysis (NENA). NENA combines the computational power and representational ability of graph neural networks (GNNs) to naturally incorporate social and cognitive features in the analysis with the interpretative advantages of epistemic network analysis (ENA). Comparing NENA and ENA on two datasets from collaborative problem-solving contexts, we found that NENA improves upon ENA’s ability to distinguish between known subgroups in CPS data, while also improving the interpretability and explainability of GNN results.

Data Storytelling in Learning Analytics? A Qualitative Investigation into Educators’ Perceptions of Benefits and Risks

Mikaela Elizabeth Milesi
Roberto Martinez-Maldonado

Emerging research has begun to explore the incorporation of data storytelling (DS) elements to enhance the design of learning analytics (LA) dashboards. This involves using visual features, such as text annotations and visual highlights, to help educators and learners focus their attention on key insights derived from data and act upon them. Previous studies have often overlooked the perspectives of educators and other stakeholders on the potential value and risks associated with implementing DS in LA to guide attention. We address this gap by presenting a case study examining how educators perceive the: i) potential value of DS features for teaching and learning design; ii) role of the visualisation designer in delivering a contextually appropriate data story; and iii) ethical implications of utilising DS to communicate insights. We asked educators from a first-year undergraduate program to explore and discuss DS and the visualisation designer by reviewing sample data stories using their students’ data and crafting their own data stories. Our findings suggest that educators were receptive to DS features, especially meaningful use of annotations and highlighting important data points to easily identify critical information. Every participant acknowledged the potential for DS features to be exploited for harmful or self-serving purposes.

Evidence-centered Assessment for Writing with Generative AI

Yixin Cheng
Kayley Lyons
Guanliang Chen
Dragan Gašević
Zachari Swiecki

We propose a learning analytics-based methodology for assessing the collaborative writing of humans and generative artificial intelligence. Framed by evidence-centered design, we used elements of knowledge-telling, knowledge transformation, and cognitive presence to identify assessment claims; we used data collected from the CoAuthor writing tool as potential evidence for these claims; and we used epistemic network analysis to make inferences from the data about the claims. Our findings revealed significant differences in the writing processes of different groups of CoAuthor users, suggesting that our method is a plausible approach to assessing human-AI collaborative writing.

The Relation Among Gender, Language, and Posting Type in Online Chemistry Course Discussion Forums

Genevieve Henricks
Michelle Perry
Suma Bhat

This study explored gendered language used in an online chemistry course’s discussion forums, to understand how using gendered language might help or hinder learning outcomes, while considering the goal of various posting structures required in the course. Findings revealed that although gendered-language use did not differ between men and women, gendered forms of language were widely used throughout the forums. The use of gendered language appeared strategic, however, and reliably varied by the goal of the discussion post (i.e., posting a solution to a homework problem, asking a question, or answering a question). Ultimately, gender, language and posting type were found to be related to final grade.

Synthetic Dataset Generation for Fairer Unfairness Research

Lan Jiang
Clara Belitz
Nigel Bosch

Recent research has made strides toward fair machine learning. Relatively few datasets, however, are commonly examined to evaluate these fairness-aware algorithms, and even fewer in education domains, which can lead to a narrow focus on particular types of fairness issues. In this paper, we describe a novel dataset modification method that utilizes a genetic algorithm to induce many types of unfairness into datasets. Additionally, our method can generate an unfairness benchmark dataset from scratch (thus avoiding data collection in situations that might exploit marginalized populations), or modify an existing dataset used as a reference point. Our method can increase the unfairness by 156.3% on average across datasets and unfairness definitions while preserving AUC scores for models trained on the original dataset (just 0.3% change, on average). We investigate the generalization of our method across educational datasets with different characteristics and evaluate three common unfairness mitigation algorithms. The results show that our method can generate datasets with different types of unfairness, large and small datasets, different types of features, and which affect models trained with different classifiers. Datasets generated with this method can be used for benchmarking and testing for future research on the measurement and mitigation of algorithmic unfairness.

Hierarchical Dependencies in Classroom Settings Influence Algorithmic Bias Metrics

Clara Belitz
HaeJin Lee
Nidhi Nasiar
Stephen E. Fancsali
Steve Ritter
Husni Almoubayyed
Ryan S. Baker
Jaclyn Ocumpaugh
Nigel Bosch

Measuring algorithmic bias in machine learning has historically focused on statistical inequalities pertaining to specific groups. However, the most common metrics (i.e., those focused on individual- or group-conditioned error rates) are not currently well-suited to educational settings because they assume that each individual observation is independent from the others. This is not statistically appropriate when studying certain common educational outcomes, because such metrics cannot account for the relationship between students in classrooms or multiple observations per student across an academic year. In this paper, we present novel adaptations of algorithmic bias measurements for regression for both independent and nested data structures. Using hierarchical linear models, we rigorously measure algorithmic bias in a machine learning model of the relationship between student engagement in an intelligent tutoring system and year-end standardized test scores. We conclude that classroom-level influences had a small but significant effect on models. Examining significance with hierarchical linear models helps determine which inequalities in educational settings might be explained by small sample sizes rather than systematic differences.

The relationships among school engagement, students' emotions, and academic performance in an elementary online learning

Jae H. Paik
Igor Himelfarb
Seung Hee Yoo
KyoungMi Yoo
Hoyong Ha

This study investigated the relationship among school engagement, students’ emotions, and academic performance of students in grades 3-6 in South Korea. A random sampling approach was used to extract data from 1,075 students out of a total of 141,926 students who used the educational learning platform, I-TokTok, adopted as the primary Learning Management System (LMS) at the provincial level. The present study aimed to identify dimensions of school engagement by exploring the behaviors consistent with IMS Caliper Analytics Specifications, a common standard utilized for collecting learning data from digital resources. Exploratory and Confirmatory Factor Analyses revealed a three-factor model of school engagement among the 13 learning behavioral indicators: behavioral engagement, social engagement, and cognitive engagement. Students’ emotions were measured through voluntary daily activities in the platform involving reflecting on, recognizing, and recording their emotions. Students’ academic performance was assessed with performance in math tests administered within the platform. Consistent with current literature, results demonstrated that dimensions of school engagement (i.e., behavioral and social engagement) and students’ emotions positively predicted their math performance. Lastly, school engagement mediated the relationship between students’ emotions and math performance. The present study emphasizes the importance of investigating the underlying mechanisms through which elementary students’ emotions and school engagement predict academic achievement in an online learning environment. This relatively new area of educational research deserves attention in the field of learning analytics. We highlight the importance of considering ways to improve both students’ emotions and their school engagement to maximize student learning outcomes.

The Unspoken Aspect of Socially Shared Regulation in Collaborative Learning: AI-Driven Learning Analytics Unveiling ‘Silent Pauses’

Belle Dang
Andy Nguyen
Sanna Järvelä

Socially Shared Regulation (SSRL) contributes to collaborative learning success. Recent advancements in Artificial Intelligence (AI) and Learning Analytics (LA) have enabled examination of this phenomenon’s temporal and cyclical complexities. However, most of these studies focus on students’ verbalised interactions, not accounting for the intertwined ‘silent pauses’ that can index learners’ internal cognitive and emotional processes, potentially offering insight into regulation’s core mental processes. To address this gap, we employed AI-driven LA to explore the deliberation tactics among ten triads of secondary students during a face-to-face collaborative task (2,898 events). Discourse was coded for deliberative interactions for SSRL. With the micro-annotation of ‘silent pause’ added, sequences were analysed with the Optimal Matching algorithm, Ward’s Clustering and Lag Sequential Analysis. Three distinct deliberation tactics with different patterns and characteristics involving silent pauses emerged: i) Elaborated deliberation, ii) Coordinated deliberation, and iii) Solitary deliberation. Our findings highlight the role of ‘silent pauses’ in revealing not only the pattern but also the dynamics and characteristics of each deliberative interaction. This study illustrates the potential of AI-driven LA to tap into granular data points that enrich discourse analysis, presenting theoretical, methodological, and practical contributions and implications.

Exploring Confusion and Frustration as Non-linear Dynamical Systems

Elizabeth B. Cloude
Anabil Munshi
J. M. Alexandra Andres
Jaclyn Ocumpaugh
Ryan S. Baker
Gautam Biswas

Numerous studies aim to enhance learning in digital environments through emotionally-sensitive interventions. The D’Mello and Graesser (2012) model of affect dynamics hypothesizes that when a learner encounters confusion, the degree to which it is prolonged (and transitions into frustration) or resolved significantly affects their learning outcomes in digital environments. However, studies yield inconclusive results regarding relations between confusion, frustration, and learning. More research is needed to explore how confusion and frustration manifest during learning and how they relate to outcomes. We go beyond past work looking at the rate, duration, and transitions of confusion and frustration by treating these affective states as non-linear dynamical systems consisting of expressive and behavioral components. We examined the frequency and recurrence of facial expressions associated with basic emotions (as automatically labeled by AffDex, a standard tool for analyzing emotions with video data) during confused and frustrated states (as automatically labeled with BROMP-based detectors applied to students’ interaction data). We compare these co-occurring patterns to learning outcomes (pre-tests, post-tests, and learning gains) within a digital learning environment, Betty’s Brain. Results showed that the frequency and recurrence rate of basic emotions expressed during confusion and frustration are complex and remain incompletely understood. Specifically, we show that confusion and frustration have different relationships with learning outcomes, depending on which basic emotion expressions they co-occur with. Implications of this study open avenues for better understanding these emotions as complex and non-linear dynamical systems, in the long term enabling personalized feedback and emotional support within digital learning environments that enhance learning outcomes.

Does Difficulty even Matter? Investigating Difficulty Adjustment and Practice Behavior in an Open-Ended Learning Task

Anan Schütt
Tobias Huber
Jauwairia Nasir
Cristina Conati
Elisabeth André

Difficulty adjustment in practice exercises has been shown to be beneficial for learning. However, previous research has mostly investigated closed-ended tasks, which do not offer the students multiple ways to reach a valid solution. In contrast, in order to learn in an open-ended learning task, students need to effectively explore the solution space as there are multiple ways to reach a solution. For this reason, the effects of difficulty adjustment could be different for open-ended tasks. To investigate this, as our first contribution, we compare different methods of difficulty adjustment in a user study conducted with 86 participants. Furthermore, as the practice behavior of the students is expected to influence how well the students learn, we additionally look at their practice behavior as a post-hoc analysis. Therefore, as a second contribution, we identify different types of practice behavior and how they link to students’ learning outcomes and subjective evaluation measures, and we explore the influence that the difficulty adjustment methods have on the practice behaviors. Our results suggest the usefulness of taking into account the practice behavior in addition to only using the practice performance to inform adaptive intervention and difficulty adjustment methods.

The Sequence Matters in Learning - A Systematic Literature Review

Manuel Valle Torre
Catharine Oertel
Marcus Specht

Describing and analysing learner behaviour using sequential data and analysis is becoming more and more popular in Learning Analytics. Nevertheless, we found a variety of definitions of learning sequences, as well as choices regarding data aggregation and the methods implemented for analysis. Furthermore, sequences are used to study different educational settings and serve as a base for various interventions. In this literature review, the authors aim to generate an overview of these aspects to describe the current state of using sequence analysis in educational support and learning analytics. The 74 included articles were selected based on the criteria that they conduct empirical research on an educational environment using sequences of learning actions as the main focus of their analysis. The results enable us to highlight different learning tasks where sequences are analysed, identify data mapping strategies for different types of sequence actions, differentiate techniques based on purpose and scope, and identify educational interventions based on the outcomes of sequence analysis.

Learner Modeling and Recommendation of Learning Resources using Personal Knowledge Graphs

Qurat Ul Ain
Mohamed Amine Chatti
Paul Arthur Meteng Kamdem
Rawaa Alatrash
Shoeb Joarder
Clara Siepmann

Educational recommender systems (ERS) are playing a pivotal role in providing recommendations of personalized resources and activities to students, tailored to their individual learning needs. A fundamental part of generating recommendations is the learner modeling process that identifies students’ knowledge state. Current ERSs, however, have limitations mainly related to the lack of transparency and scrutability of the learner models as well as capturing the semantics of learner models and learning materials. To address these limitations, in this paper we empower students to control the construction of their personal knowledge graphs (PKGs) based on the knowledge concepts that they actively mark as ‘did not understand (DNU)’ while interacting with learning materials. We then use these PKGs to build semantically-enriched learner models and provide personalized recommendations of external learning resources. We conducted offline experiments and an online user study (N=31), demonstrating the benefits of a PKG-based recommendation approach compared to a traditional content-based one, in terms of several important user-centric aspects including perceived accuracy, novelty, diversity, usefulness, user satisfaction, and use intentions. In particular, our results indicate that the degree of control students are able to exert over the learner modeling process has positive consequences on their satisfaction with the ERS and their intention to accept its recommendations.

Large language model augmented exercise retrieval for personalized language learning

Austin Xu
Will Monroe
Klinton Bicknell

We study the problem of zero-shot exercise retrieval in the context of online language learning, to give learners the ability to explicitly request personalized exercises via natural language. Using real-world data collected from language learners, we observe that vector similarity approaches poorly capture the relationship between exercise content and the language that learners use to express what they want to learn. This semantic gap between queries and content dramatically reduces the effectiveness of general-purpose retrieval models pretrained on large-scale information retrieval datasets like MS MARCO [2]. We leverage the generative capabilities of large language models to bridge the gap by synthesizing hypothetical exercises based on the learner’s input, which are then used to search for relevant exercises. Our approach, which we call mHyER, overcomes three challenges: (1) lack of relevance labels for training, (2) unrestricted learner input content, and (3) low semantic similarity between input and retrieval candidates. mHyER outperforms several strong baselines on two novel benchmarks created from crowdsourced data and publicly available data.

Have Learning Analytics Dashboards Lived Up to the Hype? A Systematic Review of Impact on Students' Achievement, Motivation, Participation and Attitude

Rogers Kaliisa
Kamila Misiejuk
Sonsoles López-Pernas
Mohammad Khalil
Mohammed Saqr

While learning analytics dashboards (LADs) are the most common form of LA intervention, there is limited evidence regarding their impact on students’ learning outcomes. This systematic review synthesizes the findings of 38 research studies to investigate the impact of LADs on students' learning outcomes, encompassing achievement, participation, motivation, and attitudes. As it currently stands, there is no evidence to support the conclusion that LADs have lived up to the promise of improving academic achievement. Most studies reported negligible or small effects, with limited evidence from well-powered controlled experiments. Many studies merely compared users and non-users of LADs, confounding the dashboard effect with student engagement levels. Similarly, the impact of LADs on motivation and attitudes appeared modest, with only a few exceptions demonstrating significant effects. Small sample sizes in these studies highlight the need for larger-scale investigations to validate these findings. Notably, LADs showed a relatively substantial impact on student participation. Several studies reported medium to large effect sizes, suggesting that LADs can promote engagement and interaction in online learning environments. However, methodological shortcomings, such as reliance on traditional evaluation methods, self-selection bias, the assumption that access equates to usage, and a lack of standardized assessment tools, emerged as recurring issues. To advance the research line for LADs, researchers should use rigorous assessment methods and establish clear standards for evaluating learning constructs. Such efforts will advance our understanding of the potential of LADs to enhance learning outcomes and provide valuable insights for educators and researchers alike.

Understanding engagement through game learning analytics and design elements: Insights from a word game case study

Katerina Mangaroska
Kristine Larssen
Andreas Amundsen
Boban Vesin
Michail Giannakos

Educational games have become an efficient and engaging way to enhance learning. Analytics have played a critical role in designing contemporary educational games, with most game design elements leveraging analytics produced during gameplay and learning. The presented study tackles the complex construct of engagement, which has been the central piece behind the success of educational games, by investigating the role of analytics-driven game elements on players’ engagement. To do so, we implemented a casual word game incorporating game design elements relevant to learning and conducted a within-subjects study where 39 participants played the game for two weeks. We found that the frequency of use of different game elements contributed to different dimensions of engagement. Our findings show that five of the eight game elements implemented in the word game engage players on an emotional, motivational, and cognitive level, thus emphasizing the importance of engagement as a multidimensional construct in designing educational casual games that offer highly engaging experiences.

How do visualizations and automated personalized feedback engage professional learners in a Learning Analytics Dashboard?

Sarah Alcock
Bart Rienties
Maria Aristeidou
Soraya Kouadri Mostéfaoui

Learning Analytics Dashboards (LADs) are the subject of research in a multitude of schools and higher education institutions, but a lack of research into learner-facing dashboards in professional learning has been identified. This study took place in an authentic professional learning context and aims to contribute insights into LAD design by using an academic approach in a practice-based environment. An existing storytelling LAD created to support 81 accountants was evaluated using the Technology Acceptance Model, finding a learner expectation for clarity, conciseness, understanding and guidance on next steps. High usage levels and a ‘take what you need’ approach were identified, with all visualizations and automated personalized feedback being considered useful, although to varying degrees. Professional learners in this study focus on understanding and acting upon weaknesses rather than celebrating strengths. The lessons for LAD design are to offer choice and create elements which support learners to take action to improve performance at a multitude of time points and levels of success.

Analytics of scaffold compliance for self-regulated learning

John Saint
Yizhou Fan
Dragan Gasevic

The shift toward digitally-based education has emphasised the need for learners to have strong skills for self-regulated learning (SRL). The use of scaffolding prompts is seen as an effective way to stimulate SRL and enhance academic outcomes. A key aspect of SRL scaffolding prompts is the degree to which students comply with them. Compliance is a complex concept, one that is further complicated by the nature of scaffold design in the context of adaptability. These nuances notwithstanding, scaffold compliance demands specific exploration. To that end, we conducted a study in which we: 1) focused specifically on scaffolding interaction behaviour in a timed online assessment task, as opposed to the broader interaction with non-scaffolding artefacts; 2) identified distinct scaffold interaction patterns in the context of compliance and non-compliance with scaffold design; 3) analysed how groups of learners traverse compliant and non-compliant interaction behaviours and engage in SRL processes in response to a sequence of timed and personalised SRL-informed scaffold prompts. We found that scaffold interactions fell into two categories of compliance and non-compliance, and whilst there was a healthy engagement with compliance, it ebbed and flowed during an online timed assessment.

Student Effort and Progress Learning Analytics Data Inform Teachers’ SEL Discussions in Math Class

Natalie Brezack
Wynnie Chan
Mingyu Feng

Investigating Algorithmic Bias on Bayesian Knowledge Tracing and Carelessness Detectors

Andres Felipe Zambrano
Jiayi Zhang
Ryan S. Baker

In today's data-driven educational technologies, algorithms have a pivotal impact on student experiences and outcomes. Therefore, it is critical to take steps to minimize biases, to avoid perpetuating or exacerbating inequalities. In this paper, we investigate the degree to which algorithmic biases are present in two learning analytics models: knowledge estimates based on Bayesian Knowledge Tracing (BKT) and carelessness detectors. Using data from a learning platform used across the United States at scale, we explore algorithmic bias following three different approaches: 1) analyzing the performance of the models on every demographic group in the sample, 2) comparing performance across intersectional groups of these demographics, and 3) investigating whether the models trained using specific groups can be transferred to demographics that were not observed during the training process. Our experimental results show that the performance of these models is close to equal across all the demographic and intersectional groups. These findings establish the feasibility of validating educational algorithms for intersectional groups and indicate that these algorithms can be fairly used for diverse students at scale.
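
For readers unfamiliar with BKT, the standard update rule behind such knowledge estimates can be sketched as follows. This is a minimal illustration of textbook BKT, not the authors' implementation; the function name and the parameter values in the usage note are assumptions chosen only for demonstration.

```python
def bkt_update(p_know, correct, p_slip, p_guess, p_learn):
    """One step of standard Bayesian Knowledge Tracing (hypothetical helper).

    p_know  : prior probability that the skill is already known
    correct : True if the observed response was correct
    p_slip  : probability of answering incorrectly despite knowing the skill
    p_guess : probability of answering correctly without knowing the skill
    p_learn : probability of learning the skill after this opportunity
    """
    if correct:
        known_and_observed = p_know * (1 - p_slip)
        evidence = known_and_observed + (1 - p_know) * p_guess
    else:
        known_and_observed = p_know * p_slip
        evidence = known_and_observed + (1 - p_know) * (1 - p_guess)
    # Bayes' rule: probability the skill was known, given the response.
    posterior = known_and_observed / evidence
    # Learning transition: the student may acquire the skill after practice.
    return posterior + (1 - posterior) * p_learn
```

With illustrative parameters (prior 0.5, slip 0.1, guess 0.2, learn 0.1), a correct response raises the estimate and an incorrect one lowers it. A bias audit of the kind described above would compare how well such estimates predict performance within each demographic group.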

Which Planning Tactics Predict Online Course Completion?

Ji Yong Cho
Yan Tao
Michael Yeomans
Dustin Tingley
René F. Kizilcec

Planning is a self-regulated learning strategy and widely used behavior change technique that can help learners achieve academic goals (e.g., pass an exam, apply to college, or complete an online course). Numerous studies have tested the effects of planning interventions, but few have examined the content of learners’ plans and how it relates to their academic outcomes. Building on a large-scale intervention study, we conducted a qualitative content analysis of 650 learner plans sampled from 15 massive open online courses (MOOCs). We identified a number of planning tactics, compared their prevalence, and examined which ones significantly predict course progress and completion using regression analyses. We found that learners whose plans specify a time of day (e.g., morning, afternoon, night) are significantly more likely to complete a MOOC, but only 25% of the learners in our sample used this tactic. The high degree of variation in the effectiveness of planning tactics may contribute to mixed intervention findings in scale-up studies. Models of plan effectiveness can be used to provide feedback on the quality of learners’ plans and encourage them to use effective tactics to achieve their learning goals.

Revealing Networks: Understanding Effective Teacher Practices in AI-Supported Classrooms using Transmodal Ordered Network Analysis

Conrad Borchers
Yeyu Wang
Shamya Karumbaiah
Muhammad Ashiq
David Williamson Shaffer
Vincent Aleven

Learning analytics research increasingly studies classroom learning with AI-based systems through rich contextual data from outside these systems, especially student-teacher interactions. One key challenge in leveraging such data is generating meaningful insights into effective teacher practices. Quantitative ethnography bears the potential to close this gap by combining multimodal data streams into networks of co-occurring behavior that drive insight into favorable learning conditions. The present study uses transmodal ordered network analysis to understand effective teacher practices in relationship to traditional metrics of in-system learning in a mathematics classroom working with AI tutors. Incorporating teacher practices captured by position tracking and human observation codes into modeling significantly improved the inference of how efficiently students improved in the AI tutor beyond a model with tutor log data features only. Comparing teacher practices by student learning rates, we find that students with low learning rates exhibited more hint use after monitoring. However, after an extended visit, students with low learning rates showed learning behavior similar to their high learning rate peers, achieving repeated correct attempts in the tutor. Observation notes suggest conceptual and procedural support differences can help explain visit effectiveness. Taken together, offering early conceptual support to students with low learning rates could make classroom practice with AI tutors more effective. This study advances the scientific understanding of effective teacher practice in classrooms learning with AI tutors and methodologies to make such practices visible.

Shaping and evaluating a system for affective computing in online higher education using a participatory design and the system usability scale

Krist Shingjergji
Corrie Urlings
Deniz Iren
Roland Klemke

Online learning’s popularity has surged. However, teachers face the challenge of the lack of non-verbal communication with students, making it difficult to perceive their learning-centered affective states (LCAS), leading to missed intervention opportunities. Addressing this challenge requires a system that detects students’ LCAS from their non-verbal cues and informs teachers in an actionable way. To design such a system, it is essential to explore field experts’ needs and requirements. Therefore, we conducted design-based research focus groups with teachers to determine which LCAS they find important to know during online lectures and their preferred communication methods. The results indicated that confusion, engagement, boredom, frustration, and curiosity are the most important LCAS and that the proposed system should take into account teachers’ cognitive load and give them autonomy in the choice of content and frequency of the information. Considering the obtained feedback, a prototype of two versions was developed. The prototype was evaluated by teachers utilizing the System Usability Scale (SUS). Results indicated average SUS scores of 80.5 and 74.5 for the two versions, suggesting acceptable usability. These findings can guide the design and development of a system that can help teachers recognize students’ LCAS, thus improving synchronous online learning.
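
As context for the reported scores, the conventional SUS scoring procedure can be sketched as follows. This is a minimal illustration of the standard ten-item scoring rule, not the authors' analysis code; the function name is hypothetical.

```python
def sus_score(responses):
    """Score one respondent's System Usability Scale questionnaire.

    responses: ten integers from 1 (strongly disagree) to 5 (strongly agree),
    given in questionnaire order (item 1 first).
    """
    if len(responses) != 10:
        raise ValueError("SUS has exactly ten items")
    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 5:
            raise ValueError("each rating must be between 1 and 5")
        if item_number % 2 == 1:
            # Odd-numbered items are positively worded: contribution is rating - 1.
            total += rating - 1
        else:
            # Even-numbered items are negatively worded: contribution is 5 - rating.
            total += 5 - rating
    # The summed contributions (0 to 40) are rescaled to the 0 to 100 range.
    return total * 2.5
```

A study-level figure such as those reported above is the mean of these per-respondent scores.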

Harnessing Transparent Learning Analytics for Individualized Support through Auto-detection of Engagement in Face-to-Face Collaborative Learning

Qi Zhou
Wannapon Suraworachet
Mutlu Cukurova

Using learning analytics to investigate and support collaborative learning has been explored for many years. Recently, automated approaches using various artificial intelligence techniques have provided promising results for modelling and predicting student engagement and performance in collaborative learning tasks. However, due to the lack of transparency and interpretability caused by the use of “black box” approaches in learning analytics design and implementation, guidance for teaching and learning practice may become a challenge. On the one hand, the black box created by machine learning algorithms and models prevents users from obtaining educationally meaningful learning and teaching suggestions. On the other hand, focusing on group and cohort level analysis only can make it difficult to provide specific support for individual students working in collaborative groups. This paper proposes a transparent approach to automatically detect students’ individual engagement in the process of collaboration. The results show that the proposed approach can reflect students’ individual engagement and can be used as an indicator to distinguish students with different collaborative learning challenges (cognitive, behavioural and emotional) and learning outcomes. The potential of the proposed collaboration analytics approach for scaffolding collaborative learning practice in face-to-face contexts is discussed and future research suggestions are provided.

Improving Student Learning with Hybrid Human-AI Tutoring: A Three-Study Quasi-Experimental Investigation

Danielle R Thomas
Jionghao Lin
Erin Gatz
Ashish Gurung
Shivang Gupta
Kole Norberg
Stephen E Fancsali
Vincent Aleven
Lee Branstetter
Emma Brunskill
Kenneth R Koedinger

Artificial intelligence (AI) applications to support human tutoring have potential to significantly improve learning outcomes, but engagement issues persist, especially among students from low-income backgrounds. We introduce an AI-assisted tutoring model that combines human and AI tutoring and hypothesize this synergy will have positive impacts on learning processes. To investigate this hypothesis, we conduct a three-study quasi-experiment across three urban and low-income middle schools: 1) 125 students in a Pennsylvania school; 2) 385 students (50% Latinx) in a California school, and 3) 75 students (100% Black) in a Pennsylvania charter school, all implementing analogous tutoring models. We compare learning analytics of students engaged in human-AI tutoring with those of students using math software only. We find human-AI tutoring has positive effects, particularly in students’ proficiency and usage, with evidence suggesting lower achieving students may benefit more compared to higher achieving students. We illustrate the use of quasi-experimental methods adapted to the particulars of different schools and data-availability contexts so as to achieve the rapid data-driven iteration needed to guide an inspired creation into effective innovation. Future work focuses on improving the tutor dashboard and optimizing tutor-student ratios, while maintaining costs of approximately $700 per student annually.

Can Crowdsourcing Platforms Be Useful for Educational Research?

Karen D. Wang
Zhongzhou Chen
Carl Wieman

A growing number of social science researchers, including educational researchers, have turned to online crowdsourcing platforms such as Prolific and MTurk for their experiments. However, there is a lack of research investigating the quality of data generated by online subjects and how it compares with that of traditional subject pools of college students in studies that involve cognitively demanding tasks. Using an interactive problem-solving task embedded in an educational simulation, we compare the task engagement and performance, based on interaction log data, of college students recruited from Prolific with those of students from an introductory physics course. Results show that Prolific participants performed on par with participants from the physics class in obtaining the correct solutions. Furthermore, the physics course students who submitted incorrect answers were more likely than Prolific participants to make rushed cursory attempts to solve the problem. These results suggest that with thoughtful study design and advanced learning analytics and data mining techniques, crowdsourcing platforms can be a viable tool for conducting research on teaching and learning in higher education.

Finding Paths for Explainable MOOC Recommendation: A Learner Perspective

Jibril Frej
Neel Shah
Marta Knezevic
Tanya Nazaretsky
Tanja Käser

The increasing availability of Massive Open Online Courses (MOOCs) has created a necessity for personalized course recommendation systems. These systems often combine neural networks with Knowledge Graphs (KGs) to achieve richer representations of learners and courses. While these enriched representations allow more accurate and personalized recommendations, explainability remains a significant challenge which is especially problematic for certain domains with significant impact such as education and online learning. Recently, a novel class of recommender systems that uses reinforcement learning and graph reasoning over KGs has been proposed to generate explainable recommendations in the form of paths over a KG. Despite their accuracy and interpretability on e-commerce datasets, these approaches have scarcely been applied to the educational domain and their use in practice has not been studied. In this work, we propose an explainable recommendation system for MOOCs that uses graph reasoning. To validate the practical implications of our approach, we conducted a user study examining user perceptions of our new explainable recommendations. We demonstrate the generalizability of our approach by conducting experiments on two educational datasets: COCO and Xuetang.

Analytics of Planning Behaviours in Self-Regulated Learning: Links with Strategy Use and Prior Knowledge

Tongguang Li
Yizhou Fan
Namrata Srivastava
Zijie Zeng
Xinyu Li
Hassan Khosravi
Yi-Shan Tsai
Zachari Swiecki
Dragan Gašević

A sophisticated grasp of self-regulated learning (SRL) skills has become essential for learners in computer-based learning environments (CBLEs). One aspect of SRL is the plan-making process, which, although emphasized in many SRL theoretical frameworks, has attracted little research attention. Few studies have investigated the extent to which learners complied with their planned strategies, and whether making a strategic plan is associated with actual strategy use. Few studies have examined the role of prior knowledge in predicting planned and actual strategy use. In this study, we developed a CBLE to collect trace data, which were analyzed to investigate learners’ plan-making process and its association with planned and actual strategy use. Analysis of prior knowledge and trace data of 202 participants indicated that 1) learners tended to adopt strategies that significantly deviated from their planned strategies, 2) the level of prior knowledge was associated with planned strategies, and 3) neither the act of plan-making nor prior knowledge predicted actual strategy use. These insights bear implications for educators and educational technologists to recognise the dynamic nature of strategy adoption and to devise approaches that inspire students to continually revise and adjust their plans, thereby strengthening SRL.

Gaining Insights into Course Difficulty Variations Using Item Response Theory

Frederik Baucks
Robin Schmucker
Laurenz Wiskott

Curriculum analytics (CA) studies curriculum structure and student data to ensure the quality of educational programs. To gain statistical robustness, most existing CA techniques rely on the assumption of time-invariant course difficulty, preventing them from capturing variations that might occur over time. However, ensuring low temporal variation in course difficulty is crucial to warrant fairness in treating individual student cohorts and consistency in degree outcomes. We introduce item response theory (IRT) as a CA methodology that enables us to address the open problem of monitoring course difficulty variations over time. We use statistical criteria to quantify the degree to which course performance data meets IRT’s theoretical assumptions and verify validity and reliability of IRT-based course difficulty estimates. Using data from 664 Computer Science and 1,355 Mechanical Engineering undergraduate students, we show how IRT can yield valuable CA insights: First, by revealing temporal variations in course difficulty over several years, we find that course difficulty has systematically shifted downward during the COVID-19 pandemic. Second, time-dependent course difficulty and cohort performance variations confound conventional course pass rate measures. We introduce IRT-adjusted pass rates as an alternative to account for these factors. Our findings affect policymakers, student advisors, accreditation, and course articulation.

The Effect of Assistance on Gamers: Assessing The Impact of On-Demand Hints & Feedback Availability on Learning for Students Who Game the System

Kirk Vanacore
Ashish Gurung
Adam Sales
Neil T. Heffernan

Gaming the system, characterized by attempting to progress through a learning activity without engaging in essential learning behaviors, remains a persistent problem in computer-based learning platforms. This paper examines a simple intervention to mitigate the harmful effects of gaming the system by evaluating the impact of immediate feedback on students prone to gaming the system. Using a randomized controlled trial comparing two conditions - one with immediate hints and feedback and another with delayed access to such resources - this study employs a Fully Latent Principal Stratification model to determine whether students inclined to game the system would benefit more from the delayed hints and feedback. The results suggest differential effects on learning, indicating that students prone to gaming the system may benefit from restricted or delayed access to on-demand support. However, removing immediate hints and feedback did not fully alleviate the learning disadvantage associated with gaming the system. Additionally, this paper highlights the utility of combining detection methods and causal models to comprehend and effectively respond to students’ behaviors. Overall, these findings contribute to our understanding of effective intervention design that addresses gaming the system behaviors, consequently enhancing learning outcomes in computer-based learning platforms.

Predicting challenge moments from students' discourse: A comparison of GPT-4 to two traditional natural language processing approaches

Wannapon Suraworachet
Jennifer Seon
Mutlu Cukurova

Effective collaboration requires groups to strategically regulate themselves to overcome challenges. Research has shown that groups may fail to regulate due to differences in members’ perceptions of challenges, a situation that may benefit from external support. In this study, we investigated the potential of leveraging three distinct natural language processing models, namely an expert knowledge rule-based model, a supervised machine learning (ML) model, and a large language model (LLM), for challenge detection and challenge dimension identification (cognitive, metacognitive, emotional and technical/other challenges) from student discourse. The results show that the supervised ML and the LLM approaches performed well in both tasks, in contrast to the rule-based approach, whose efficacy heavily relies on the features engineered by experts. The paper provides an extensive discussion of the three approaches’ performance for automated detection and support of students’ challenge moments in collaborative learning activities. It argues that, although LLMs provide many advantages, they are unlikely to be the panacea to issues of the detection and feedback provision of socially shared regulation of learning due to their lack of reliability, as well as issues of validity evaluation, privacy and confabulation. We conclude the paper with a discussion on additional considerations, including model transparency to explore feasible and meaningful analytical feedback for students and educators using LLMs.

Temporal and Between-Group Variability in College Dropout Prediction

Dominik Glandorf
Hye Rin Lee
Gabe Avakian Orona
Marina Pumptow
Renzhe Yu
Christian Fischer

Large-scale administrative data is a common input in early warning systems for college dropout in higher education. Still, the terminology and methodology vary significantly across existing studies, and the implications of different modeling decisions are not fully understood. This study provides a systematic evaluation of contributing factors and predictive performance of machine learning models over time and across different student groups. Drawing on twelve years of administrative data at a large public university in the US, we find that dropout prediction at the end of the second year has a 20% higher AUC than at the time of enrollment in a Random Forest model. Also, most predictive factors at the time of enrollment, including demographics and high school performance, are quickly superseded in predictive importance by college performance and in later stages by enrollment behavior. Regarding variability across student groups, college GPA has more predictive value for students from traditionally disadvantaged backgrounds than their peers. These results can help researchers and administrators understand the comparative value of different data sources when building early warning systems and optimizing decisions under specific policy goals.
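
For readers less familiar with the metric reported above, AUC can be read as the probability that a randomly chosen student who dropped out receives a higher risk score than a randomly chosen student who did not. The sketch below illustrates that definition only; the helper name `auc` and all labels and scores are made up for illustration and are not taken from the study.

```python
def auc(labels, scores):
    """Rank-based AUC: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = dropped out, 0 = retained.
labels = [1, 0, 1, 0, 0]
# Hypothetical risk scores from a model using only enrollment-time features.
scores_at_enrollment = [0.6, 0.5, 0.4, 0.7, 0.2]
# Hypothetical risk scores from a model that also sees two years of college data.
scores_after_year_two = [0.9, 0.3, 0.7, 0.4, 0.1]

auc_early = auc(labels, scores_at_enrollment)
auc_late = auc(labels, scores_after_year_two)
```

Comparing `auc_early` with `auc_late` mirrors, in miniature, the kind of comparison across prediction times that the abstract describes.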

Discovering Players’ Problem-Solving Behavioral Characteristics in a Puzzle Game through Sequence Mining

Karen D. Wang
Haoyu Liu
David DeLiema
Nick Haber
Shima Salehi

Digital games offer promising platforms for assessing student higher-order competencies such as problem-solving. However, processing and analyzing the large volume of interaction log data generated in these platforms to uncover meaningful behavioral patterns remain a complex research challenge. In this study, we employ sequence mining and clustering techniques to examine students’ log data in an interactive puzzle game that requires players to change rules to win the game. Our goal is to identify behavioral characteristics associated with the problem-solving practices adopted by individual students. The findings indicate that the most effective problem solvers made fewer rule changes and took more time to make those changes across both an introductory and a more advanced level of the game. Conversely, rapid rule change actions were linked to ineffective problem-solving. This research underscores the potential of sequence mining and cluster analysis as generalizable methods for understanding student higher-order competencies through log data in digital gaming and learning environments. It also suggests future directions on how to provide just-in-time, in-game feedback to enhance student problem-solving competences.

Multiple Choice vs. Fill-In Problems: The Trade-off Between Scalability and Learning

Ashish Gurung
Kirk Vanacore
Andrew A. Mcreynolds
Korinn S. Ostrow
Eamon Worden
Adam C. Sales
Neil T. Heffernan

Learning experience designers consistently balance the trade-off between open and close-ended activities. The growth and scalability of Computer Based Learning Platforms (CBLPs) have only magnified the importance of these design trade-offs. CBLPs often utilize close-ended activities (i.e. Multiple-Choice Questions [MCQs]) due to feasibility constraints associated with the use of open-ended activities. MCQs offer certain affordances, such as immediate grading and the use of distractors, setting them apart from open-ended activities. Our current study examines the effectiveness of Fill-In problems as an alternative to MCQs for middle school mathematics. We report on a randomized study conducted from 2017 to 2022, with a total of 6,768 students from middle schools across the US. We observe that, on average, Fill-In problems lead to better post-test performance than MCQs, although deeper explorations indicate that the differences between the two design paradigms are more nuanced. We find evidence that students with higher math knowledge benefit more from Fill-In problems than those with lower math knowledge.

Prompt-based and Fine-tuned GPT Models for Context-Dependent and -Independent Deductive Coding in Social Annotation

Chenyu Hou
Gaoxia Zhu
Juan Zheng
Lishan Zhang
Xiaoshan Huang
Tianlong Zhong
Shan Li
Hanxiang Du
Chin Lee Ker

GPT has demonstrated impressive capabilities in executing various natural language processing (NLP) and reasoning tasks, showcasing its potential for deductive coding in social annotations. This research explored the effectiveness of prompt engineering and fine-tuning approaches of GPT for deductive coding of context-dependent and context-independent dimensions. Coding context-dependent dimensions (i.e., Theorizing, Integration, Reflection) requires a contextualized understanding that connects the target comment with reading materials and previous comments, whereas coding context-independent dimensions (i.e., Appraisal, Questioning, Social, Curiosity, Surprise) relies more on the comment itself. Utilizing strategies such as prompt decomposition, multi-prompt learning, and a codebook-centered approach, we found that prompt engineering can achieve fair to substantial agreement with expert-labeled data across various coding dimensions. These results affirm GPT's potential for effective application in real-world coding tasks. Compared to context-independent coding, context-dependent dimensions had lower agreement with expert-labeled data. To enhance accuracy, GPT models were fine-tuned using 102 pieces of expert-labeled data, with an additional 102 cases used for validation. The fine-tuned models demonstrated substantial agreement with ground truth in context-independent dimensions and elevated the inter-rater reliability of context-dependent categories to moderate levels. This approach represents a promising path for significantly reducing human labor and time, especially with large unstructured datasets, without sacrificing the accuracy and reliability of deductive coding tasks in social annotation. The study marks a step toward optimizing and streamlining coding processes in social annotation. Our findings suggest the promise of using GPT to analyze qualitative data and provide detailed, immediate feedback for students to elicit deepening inquiries.

Using Think-Aloud Data to Understand Relations between Self-Regulation Cycle Characteristics and Student Performance in Intelligent Tutoring Systems

Conrad Borchers
Jiayi Zhang
Ryan S. Baker
Vincent Aleven

Numerous studies demonstrate the importance of self-regulation during learning by problem-solving. Recent work in learning analytics has largely examined students’ use of SRL concerning overall learning gains. Limited research has related SRL to in-the-moment performance differences among learners. The present study investigates SRL behaviors in relationship to learners’ moment-by-moment performance while working with intelligent tutoring systems for stoichiometry chemistry. We demonstrate the feasibility of labeling SRL behaviors based on AI-generated think-aloud transcripts, identifying the presence or absence of four SRL categories (processing information, planning, enacting, and realizing errors) in each utterance. Using the SRL codes, we conducted regression analyses to examine how the use of SRL in terms of presence, frequency, cyclical characteristics, and recency relate to student performance on subsequent steps in multi-step problems. A model considering students’ SRL cycle characteristics outperformed a model only using in-the-moment SRL assessment. In line with theoretical predictions, students’ actions during earlier, process-heavy stages of SRL cycles exhibited lower moment-by-moment correctness during problem-solving than later SRL cycle stages. We discuss system re-design opportunities to add SRL support during stages of processing and paths forward for using machine learning to speed research depending on the assessment of SRL based on transcription of think-aloud data.

Analyzing Students Collaborative Problem-Solving Behaviors in Synergistic STEM+C Learning

Caitlin Snyder
Nicole M Hutchins
Clayton Cohn
Joyce Horn Fonteles
Gautam Biswas

This study introduces a methodology to investigate students’ collaborative behaviors as they work in pairs to build computational models of scientific processes. We expand the Self-Regulated Learning (SRL) framework—specifically, Planning, Enacting, and Reflection—proposed in the literature, applying it to examine students’ collaborative problem-solving (CPS) behaviors in a computational modeling task. We analyze these behaviors by employing a Markov Chain (MC) modeling approach that scrutinizes students’ model construction and model debugging behaviors during CPS. This involves interpreting their actions in the system collected through computer logs and analyzing their conversations using a Large Language Model (LLM) as they progress through their modeling task in segments. Our analytical framework assesses the behaviors of high- and low-performing students by evaluating their proficiency in completing the specified computational model for a kinematics problem. We employ a mixed-methods approach, combining Markov Chain analysis of student problem-solving transitions with qualitative interpretations of their conversation segments. The results highlight distinct differences in behaviors between high- and low-performing groups, suggesting potential for developing adaptive scaffolds in future work to enhance support for students in collaborative problem-solving.
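
As a rough illustration of the Markov Chain idea mentioned above, the sketch below estimates first-order transition probabilities from sequences of coded behaviors. It is a minimal example under stated assumptions, not the authors' pipeline: the function name `transition_probabilities` is hypothetical, the sequences are invented, and only the three state labels are borrowed from the abstract.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate first-order Markov transition probabilities from state sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count each observed (current state -> next state) transition.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    probs = {}
    for state, next_counts in counts.items():
        total = sum(next_counts.values())
        # Normalize counts so outgoing probabilities from each state sum to 1.
        probs[state] = {nxt: c / total for nxt, c in next_counts.items()}
    return probs


# Invented behavior sequences for two hypothetical student pairs.
sequences = [
    ["Planning", "Enacting", "Reflection", "Enacting"],
    ["Planning", "Enacting", "Enacting", "Reflection"],
]

probs = transition_probabilities(sequences)
```

Estimating one such table per performance group and comparing the tables is one simple way to contrast behavior patterns between groups; the actual study may differ in how states and segments are defined.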

Quantifying Collaborative Complex Problem Solving in Classrooms using Learning Analytics

Megan Taylor
Abhinava Barthakur
Arslan Azad
Srecko Joksimovic
Xuwei Zhang
George Siemens

Complex problem solving (CPS) is a critical skill with far-reaching implications for personal success and professional development. While CPS research has made extensive progress, additional investigation is needed to explore CPS processes beyond online contexts and performance outcomes. This study, conducted with Year 9 students aged between thirteen and fourteen, focuses on collaborative CPS. It utilises audio and video recordings to capture group communications during a CPS classroom activity. To analyse these interactions, we introduce a novel CPS framework as a dynamic, cognitive and social process involving interrelated main skills, sub-skills, and indicators. Through sequential pattern mining, we identify recurring subskill patterns that reflect CPS processes in an educational environment. Our research underscores the importance of employing diverse patterns before plan execution, particularly building shared knowledge, planning, and negotiation. We uncover patterns related to groups going off-task and highlight the significance of effective communication and maintaining focus for keeping groups on track. Furthermore, we indicate patterns following the detection of emergent issues, recognising the value of cultivating clarity and adaptability among team members. Our CPS framework, combined with our research results, offers practical implications for teaching, learning, and assessment approaches in educational, professional and industry sectors.

Measurement of Self-regulated Learning: Strategies for mapping trace data to learning processes and downstream analysis implications

Ikenna Osakwe
Guanliang Chen
Yizhou Fan
Mladen Rakovic
Shaveen Singh
Inge Molenaar
Dragan Gašević

Trace data provides opportunities to study self-regulated learning (SRL) processes as they unfold. However, raw trace data must be translated into meaningful SRL constructs to enable analysis. This typically involves developing a pattern dictionary that maps trace sequences to SRL processes, and a trace parser to implement the mappings. While much attention focuses on the pattern dictionary, trace parsing methodology remains under-investigated. This study explores how trace parsers affect extracted processes and downstream analysis. Four methods were compared: Disconnected, Connected, Lookahead, and Synonym Matching. Statistical analysis of medians and process mining networks showed that parsing choices significantly impacted SRL process identification and sequencing. Disconnected parsing isolated metacognitive processes while Connected approaches showed greater connectivity between metacognitive and cognitive events. Furthermore, Connected methods provided process maps more aligned with cyclical theoretical models of SRL. The results demonstrate that trace parser design critically affects the validity of extracted SRL processes, with implications for SRL measurement in learning analytics.

Computational Modeling of Collaborative Discourse to Enable Feedback and Reflection in Middle School Classrooms

Chelsea Chandler
Thomas Breideband
Jason G. Reitman
Marissa Chitwood
Jeffrey B. Bush
Amanda Howard
Sarah Leonhart
Peter W. Foltz
William R. Penuel
Sidney K. D'Mello

Collaboration analytics has the potential to empower teachers and students with valuable insights to facilitate more meaningful and engaging collaborative learning experiences. Towards this end, we developed computational models of student speech during small group work, identifying instances of uplifting behavior related to three Community Agreements: community building, moving thinking forward, and being respectful. Pre-trained RoBERTa language models were fine-tuned and evaluated on human annotated data (N = 9,607 student utterances from 100 unique 5-minute classroom recordings). The models achieved moderate accuracies (AUROCs between 0.67-0.84) and were robust to speech recognition errors. Preliminary generalizability studies indicated that the models generalized well to two other domains (transfer ratios between 0.46-0.85; with 1.0 indicating perfect transfer). We also developed four approaches to provide qualitative feedback in the form of noticings (i.e., specific exemplars) of positive instances of the Community Agreements, finding moderate alignment with human ratings. This research contributes to the computational modeling of the relationship dimension of collaboration from noisy classroom data, selection of positive examples for qualitative feedback, and towards the empowerment of teachers to support diverse learners during collaborative learning.

Heterogenous Network Analytics of Small Group Teamwork: Using Multimodal Data to Uncover Individual Behavioral Engagement Strategies

Shihui Feng
Lixiang Yan
Linxuan Zhao
Roberto Martinez Maldonado
Dragan Gašević

Individual behavioral engagement is an important indicator of active learning in collaborative settings, encompassing multidimensional behaviors mediated through various interaction modes. Little existing work has explored the use of multimodal process data to understand individual behavioral engagement in face-to-face collaborative learning settings. In this study we bridge this gap, for the first time, introducing a heterogeneous tripartite network approach to analyze the interconnections among multimodal process data in collaborative learning. Students’ behavioral engagement strategies are analyzed based on their interaction patterns with various spatial locations and verbal communication types using a heterogeneous tripartite network. The multimodal collaborative learning process data were collected from 15 teams of four students. We conducted stochastic blockmodeling on a projection of the heterogeneous tripartite network to cluster students into groups that shared similar spatial and oral engagement patterns. We found two distinct clusters of students, whose characteristic behavioural engagement strategies were identified by extracting interaction patterns that were statistically significant relative to a multinomial null model. The two identified clusters also exhibited a statistically significant difference regarding students’ perceived collaboration satisfaction and teacher-assessed team performance level. This study advances collaboration analytics methodology and provides new insights into personalized support in collaborative learning.

Comparing Authoring Experiences with Spreadsheet Interfaces vs GUIs

Shreya Sheel
Ioannis Anastasopoulos
Zach A. Pardos

There is little consensus over whether graphical user interfaces (GUIs) or programmatic systems are better for word processing. Even less is known about each interface’s affordances and limitations in the context of creating content for adaptive tutoring systems. In order to afford instructors the use of such systems with their own or adapted pedagogies, we must study their experiences in inputting their content. In this study, we conduct a between-subjects A/B test with two content authoring interfaces, a GUI and a spreadsheet, to explore 32 instructors’ experiences in authoring algebra content with hints, scaffolds, images, and special characters. We study their experiences by measuring time taken, accuracy, and their perceptions of each interface’s usability. Our findings indicate no significant relationship between interface used and time taken authoring problems but significantly more accuracy in authoring problems in the spreadsheet interface than in the GUI. Although both interfaces performed reasonably well in time taken and accuracy, both were perceived as average to low in usability, highlighting a dissonance between instructors’ perceptions and actual performances. Since both interfaces are reasonable in authoring content, other factors can be explored, such as cost and author incentive, when deciding which interface approach to take for authoring tutor content.

Unveiling Goods and Bads: A Critical Analysis of Machine Learning Predictions of Standardized Test Performance in Early Childhood Education

Lin Li
Namrata Srivastava
Jia Rong
Gina Pianta
Raju Varanasi
Dragan Gašević
Guanliang Chen

Learning analytics (LA) holds a promise to transform education by utilizing data for evidence-based decision-making. Yet, its application in early childhood education (ECE) remains relatively under-explored. ECE plays a crucial role in fostering fundamental numeracy and literacy skills. While standardized tests were intended to be used to monitor student progress, they have increasingly been treated as summative and high-stakes due to their substantial impact. The pressures of succeeding in such standardized tests have been well-documented to negatively affect both students and teachers. Attempting to ease such stress and better support students and teachers, the current study delved into the potential of LA for predicting standardized test performance using formative assessments. Beyond predictive accuracy, the study addressed ethical considerations related to fairness to uncover potential risks associated with LA adoption. Our findings revealed a promising opportunity to empower teachers and schools with more time and room to help students prepare better, based on predictions obtained before standardized tests. Notably, significant bias can be observed in predictions for students with disabilities even when they have the same actual competence as students without disabilities. In addition, we noticed that the inclusion of demographic attributes had no significant impact on predictive accuracy and did not necessarily exacerbate the overall predictive bias, but may significantly affect the predictions received by certain demographic subgroups (e.g., students with different types of disability).

Scaling While Privacy Preserving: A Comprehensive Synthetic Tabular Data Generation and Evaluation in Learning Analytics

Qinyi Liu
Mohammad Khalil
Jelena Jovanovic
Ronas Shakya

Privacy poses a significant obstacle to the progress of learning analytics (LA), presenting challenges like inadequate anonymization and data misuse that current solutions struggle to address. Synthetic data emerges as a potential remedy, offering robust privacy protection. However, prior LA research on synthetic data lacks thorough evaluation, essential for assessing the delicate balance between privacy and data utility. Synthetic data must not only enhance privacy but also remain practical for data analytics. Moreover, diverse LA scenarios come with varying privacy and utility needs, making the selection of an appropriate synthetic data approach a pressing challenge. To address these gaps, we propose a comprehensive evaluation of synthetic data, which encompasses three dimensions of synthetic data quality, namely resemblance, utility, and privacy. We apply this evaluation to three distinct LA datasets, using three different synthetic data generation methods. Our results show that synthetic data can maintain similar utility (i.e., predictive performance) as real data, while preserving privacy. Furthermore, considering different privacy and data utility requirements in different LA scenarios, we make customized recommendations for synthetic data generation. This paper not only presents a comprehensive evaluation of synthetic data but also illustrates its potential in mitigating privacy concerns within the field of LA, thus contributing to a wider application of synthetic data in LA and promoting a better practice for open science.

Does Feedback on Talk Time Increase Student Engagement? Evidence from a Randomized Controlled Trial on a Math Tutoring Platform

Dorottya Demszky
Rose Wang
Sean Geraghty
Carol Yu

Providing ample opportunities for students to express their thinking is pivotal to their learning of mathematical concepts. We introduce the Talk Meter, which provides in-the-moment automated feedback on student-teacher talk ratios. We conduct a randomized controlled trial on a virtual math tutoring platform (n=742 tutors) to evaluate the effectiveness of the Talk Meter at increasing student talk. In one treatment arm, we show the Talk Meter only to the tutor, while in the other arm we show it to both the student and the tutor. We find that the Talk Meter increases student talk ratios in both treatment conditions by 13-14%; this trend is driven by the tutor talking less in the tutor-facing condition, whereas in the student-facing condition it is driven by the student expressing significantly more mathematical thinking. Through interviews with tutors, we find the student-facing Talk Meter was more motivating to students, especially those with introverted personalities, and was effective at encouraging joint effort towards balanced talk time. These results demonstrate the promise of in-the-moment joint talk time feedback to both teachers and students as a low cost, engaging, and scalable way to increase students’ mathematical reasoning.

CTAM4SRL: A Consolidated Temporal Analytic Method for Analysis of Self-Regulated Learning

Debarshi Nath
Dragan Gasevic
Yizhou Fan
Ramkumar Rajendran

Temporality in Self-Regulated Learning (SRL) has two perspectives: one as a passage of time and the other as an ordered sequence of events. Each of these conceptions is distinct and requires independent consideration. A single analytic method is not sufficient to adequately capture both facets of temporality. Yet, most temporally-focused SRL research uses a single method, and studies that use multiple methods do not address both aspects of temporality. We propose CTAM4SRL, a consolidated temporal analytic method which combines advanced data visualisation, network analysis and pattern mining to capture both facets of temporality. We employ CTAM4SRL in a cohort of 36 learners engaged in a reading-writing activity. Using CTAM4SRL, we were able to provide a rich temporal explanation of the interplay of the self-regulatory processes of the learners. We were further able to identify differences in SRL behaviours in high and low performers in terms of their approach to learning comprising deep and surface strategies. High performers were able to more selectively and strategically combine deep and surface learning strategies when compared to low performers, a behaviour which was only hypothesised in SRL literature previously, but now has empirical support provided by our consolidated analytic method.

Towards Improving Rhetorical Categories Classification and Unveiling Sequential Patterns in Students' Writing

Sehrish Iqbal
Mladen Rakovic
Guanliang Chen
Tongguang Li
Jasmine Bajaj
Rafael Ferreira Mello
Yizhou Fan
Naif Radi Aljohani
Dragan Gasevic

To meet the growing demand for future professionals who can present information to an audience and create quality written products, educators are increasingly assigning writing assignments that require students to gather information from multiple sources, reorganise and reinterpret knowledge from source materials, and plan for rhetorical structure goals in order to meet the task requirements. When evaluating an essay’s coherence, scorers manually look for the presence of required rhetorical categories, which takes time. Supervised Machine Learning (ML) techniques have proven to be an effective tool for automatic detection of rhetorical categories that approximate students’ cognitive engagement with source information. Previous studies that addressed this problem used relatively small datasets and reported relatively low kappa scores for accuracy, limiting the use of such models in real-world scenarios. Moreover, to empower educators to effectively evaluate the overall quality of students’ writing, the associations between the sequential patterns of rhetorical categories in students’ writing and writing performance must be examined, which remains largely unexplored in the educational domain. Therefore, to fill these gaps, our study aimed to (i) investigate the impact of data augmentation approaches on the performance of deep learning algorithms in classifying rhetorical categories in student essays according to Bloom’s taxonomy and (ii) explore the sequential patterns of rhetorical categories in students’ writing that can influence writing performance. Our findings showed that the deep learning-based model BERT on data augmented with Easy Data Augmentation (EDA) achieved 20% higher Cohen’s kappa than on normal (non-augmented) data, and we discovered that students in different performance groups were statistically different in terms of rhetorical patterns. Our proposed study is valuable in terms of building a data analytic foundation that can be used to create formative feedback on students’ writings based on the patterns of rhetorical categories to improve essay quality.
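
Because the abstract above reports results in terms of Cohen’s kappa, a short sketch of how that statistic is computed may help: it corrects the observed agreement between two labelers for the agreement expected by chance. The function name `cohens_kappa` and the labels below are hypothetical and purely illustrative; they are not the study’s rhetorical categories or data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Invented labels: a human scorer versus a classifier on four sentences.
human = ["claim", "claim", "evidence", "evidence"]
model = ["claim", "evidence", "evidence", "evidence"]

kappa = cohens_kappa(human, model)
```

Here the two labelers agree on three of four items (0.75), chance agreement is 0.5, and kappa is therefore (0.75 - 0.5) / (1 - 0.5) = 0.5.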

Effecti-Net: A Multimodal Framework and Database for Educational Content Effectiveness Analysis

Deep Dwivedi
Ritik Garg
Shiva Baghel
Rushil Thareja
Ritvik Kulshrestha
Mukesh Mohania
Jainendra Shukla

Amid the evolving landscape of education, evaluating the impact of educational video content on students remains a challenge. Existing methods for assessment often rely on heuristics and self-reporting, leaving room for subjectivity and limited insight. This study addresses this issue by leveraging physiological sensor data to predict student-perceived content effectiveness. Within the realm of educational content evaluation, prior studies focused on conventional approaches, leaving a gap in understanding the nuanced responses of students to educational materials. To bridge this gap, our research introduces a novel perspective, building upon previous work in multimodal physiological data analysis. Our primary contributions encompass two key elements. First, we present the ‘Effecti-Net’ architecture, a sophisticated deep learning model that integrates data from multiple sensor modalities, including Electroencephalogram (EEG), Eye Tracker, Galvanic Skin Response (GSR), and Photoplethysmography (PPG). Second, we introduce the ‘DECEP’ dataset, a repository comprising 597 minutes of multimodal sensor data. To assess the effectiveness of our approach, we benchmark it against conventional methods. Remarkably, our model achieves a lowest MSE of 0.1651 and MAE of 0.3544 on the DECEP dataset. It offers educators and content creators a comprehensive framework that promotes the development of more engaging educational content.

Data Storytelling Editor: A Teacher-Centred Tool for Customising Learning Analytics Dashboard Narratives

Gloria Milena Fernandez-Nieto
Roberto Martinez-Maldonado
Vanessa Echeverria
Kirsty Kitto
Dragan Gašević
Simon Buckingham Shum

Dashboards are increasingly used in education to provide teachers and students with insights into learning. Yet, existing dashboards are often criticised for their failure to provide the contextual information or explanations necessary to help students interpret these data. Data Storytelling (DS) is emerging as an alternative way to communicate insights, providing guidance and context to facilitate students’ interpretations. However, while data stories have proven effective in prompting students’ reflections, to date, it has been necessary for researchers to craft the stories rather than enabling teachers to do this by themselves. Enabling teachers to craft their own stories can make this approach more feasible and scalable while also respecting teachers’ agency. Based on the notion of DS, this paper presents a DS editor for teachers. A study was conducted in two universities to examine whether the editor could enable teachers to create stories adapted to their learning designs. Results showed that teachers appreciated how the tool enabled them to contextualise automated feedback to their teaching needs, generating data stories to support student reflection.

Feedback, Control, or Explanations? Supporting Teachers With Steerable Distractor-Generating AI

Maxwell Szymanski
Jeroen Ooge
Robin De Croon
Vero Vanden Abeele
Katrien Verbert

Recent advancements in Educational AI have focused on models for automatic question generation. Yet, these advancements face challenges: (1) their "black-box" nature limits transparency, thereby obscuring the decision-making process; and (2) their novelty sometimes causes inaccuracies due to limited feedback systems. Explainable AI (XAI) aims to address the first limitation by clarifying model decisions, while Interactive Machine Learning (IML) emphasises user feedback and model refinement. However, both XAI and IML solutions primarily serve AI experts, often neglecting novices like teachers. Such oversights lead to issues like misaligned expectations and reduced trust. Following the user-centred design method, we collaborated with teachers and ed-tech experts to develop an AI-aided system for generating multiple-choice question distractors, which incorporates feedback, control, and visual explanations. Evaluating these through semi-structured interviews with 12 teachers, we found a strong preference for the feedback feature, enabling teacher-guided AI improvements. Control and explanations’ usefulness was largely dependent on model performance: they were valued when the model performed well. If the model did not perform well, teachers sought context over AI-centric explanations, suggesting a tilt towards data-centric explanations. Based on these results, we propose guidelines for creating tools that enable teachers to steer and interact with question-generating AI models.

Measuring Affective and Motivational States as Conditions for Cognitive and Metacognitive Processing in Self-Regulated Learning

Mladen Raković
Yuheng Li
Navid Mohammadi Foumani
Mahsa Salehi
Levin Kuhlmann
Geoffrey Mackellar
Roberto Martinez-Maldonado
Gholamreza Haffari
Zachari Swiecki
Xinyu Li
Guanliang Chen
Dragan Gašević

Even though engagement in self-regulated learning (SRL) has been shown to boost academic performance, the SRL skills of many learners remain underdeveloped. They often struggle to productively navigate multiple cognitive, affective, metacognitive and motivational (CAMM) processes in SRL. To provide learners with the required SRL support, it is essential to understand how learners enact CAMM processes as they study. More research is needed to advance the measurement of affective and motivational processes within SRL, and investigate how these processes influence learners’ cognition and metacognition. With this in mind, we conducted a lab study involving 22 university students who worked on a 45-minute reading and writing task in a digital learning environment. We used a wearable electroencephalogram device to record learners’ academic emotional and motivational states, and digital trace data to record learners’ cognitive and metacognitive processes. We harnessed time series prediction and explainable artificial intelligence methods to examine how learners’ emotional and motivational states influence their choice of cognitive and metacognitive processes. Our results indicate that emotional and motivational states can predict learners’ use of low cognitive, high cognitive and metacognitive processes with considerable classification accuracy (F1 > 0.73), and that higher values of interest, engagement and excitement promote cognitive processing.

Contexts Matter but How? Course-Level Correlates of Performance and Fairness Shift in Predictive Model Transfer

Zhen Xu
Joseph Olson
Nicole Pochinki
Zhijian Zheng
Renzhe Yu

Learning analytics research has highlighted that contexts matter for predictive models, but little research has explicated how contexts matter for models’ utility. Such insights are critical for real-world applications where predictive models are frequently deployed across instructional and institutional contexts. Building upon administrative records and behavioral traces from 37,089 students across 1,493 courses, we provide a comprehensive evaluation of performance and fairness shifts of predictive models when transferred across different course contexts. We specifically quantify how differences in various contextual factors moderate model portability. Our findings indicate an average decline in model performance and inconsistent directions in fairness shifts, without a direct trade-off, when models are transferred across different courses within the same institution. Among the course-to-course contextual differences we examined, differences in admin features account for the largest portion of both performance and fairness loss. Differences in student composition can simultaneously amplify drops in performance and fairness while differences in learning design have a greater impact on performance degradation. Given these complexities, our results highlight the importance of considering multiple dimensions of course contexts and evaluating fairness shifts in addition to performance loss when conducting transfer learning of predictive models in education.

Human-tutor Coaching Technology (HTCT): Automated Discourse Analytics in a Coached Tutoring Model

Brandon M. Booth
Jennifer Jacobs
Jeffrey B. Bush
Brent Milne
Tom Fischaber
Sidney K. D’Mello

High-dosage tutoring has become an effective strategy for bolstering K-12 academic performance and combating education declines accelerated by the COVID-19 pandemic. To achieve high-dosage tutoring at scale, tutoring programs often rely on paraprofessional tutors—recruited tutors with college degrees who lack formal training in education—however, these tutors may require consistent and targeted feedback from instructional coaches for improvement. Accordingly, we developed a human-tutor coaching technology (HTCT) system to automatically extract discourse analytics pertaining to accountable talk moves (or academically productive talk) from tutoring sessions and provide feedback visualizations to coaches to aid their coaching sessions with tutors. We deployed HTCT in a user study using a virtual tutoring platform with 11 real coaches, 40 tutors, and their students to investigate coaches’ usage patterns with HTCT, perceptions of its utility, and changes in tutors’ talk. Overall, we found that coaches had positive perceptions of the system. We also observed an increase in accountable talk from tutors whose coaches used HTCT compared to tutors whose coaches did not. We discuss implications for AI-based applications which offer coaches a promising way to provide personalized, automated, and data-driven feedback to scale high-dosage tutoring.

SESSION: Short Papers

Demonstrating the impact of study regularity on academic success using learning analytics

Marie-Luce Bourguet

Students can be described as self-regulated learners when they are meta-cognitively, motivationally, and behaviourally active participants in their own learning. Flipping the classroom requires good self-regulated learning skills from students, primarily time management and study regularity, as they must have engaged in learning activities prior to attending live classes. In this short paper, we describe our approach of using learning analytics to demonstrate the impact of study regularity on academic success in a flipped learning environment. Our key contribution is the definition of a measure of study regularity that can uncover students’ various learning profiles during flipped learning. We show that the regularity measure correlates strongly with academic success and that it can be used to predict students’ performance. We then discuss how such a measure can also be used to raise students’ awareness of their learning behaviour and lack of appropriate strategy, to nudge students into modifying their learning behaviour, and to monitor class behaviour, such as detecting a worrying trend of student disengagement.

Bridging Learnersourcing and AI: Exploring the Dynamics of Student-AI Collaborative Feedback Generation

Anjali Singh
Christopher Brooks
Xu Wang
Warren Li
Juho Kim
Deepti Wilson

This paper explores the space of optimizing feedback mechanisms in complex domains such as data science, by combining two prevailing approaches: Artificial Intelligence (AI) and learnersourcing. Towards addressing the challenges posed by each approach, this work compares traditional learnersourcing with an AI-supported approach. We report on the results of a randomized controlled experiment conducted with 72 Master’s level students in a data visualization course, comparing two conditions: students writing hints independently versus revising hints generated by GPT-4. The study aimed to evaluate the quality of learnersourced hints, examine the impact of student performance on hint quality, gauge learner preference for writing hints with versus without AI support, and explore the potential of the student-AI collaborative exercise in fostering critical thinking about LLMs. Based on our findings, we provide insights for designing learnersourcing activities leveraging AI support and optimizing students’ learning as they interact with LLMs.

Using Multimodal Learning Analytics to Examine Learners’ Responses to Different Types of Background Music during Reading Comprehension

Ying Que
Jeremy Tzi Dong Ng
Xiao Hu
Mitchell Kam Fai Mak
Peony Tsz Yan Yip

Previous studies have evaluated the affordances and challenges of performing cognitively demanding learning tasks with background music (BGM), yet the effects of various types of BGM on learning remain an open question. This study aimed to examine the impacts of different music genres and fine-grained music characteristics on learners’ emotional, physiological, and pupillary responses during reading comprehension. Leveraging multimodal learning analytics (MmLA) methods of collecting data in multiple modalities from learners, a user experiment was conducted with 102 participants, half of whom read with self-selected BGM (i.e., the experimental group), while the other half read without BGM (i.e., the control group). Results of statistical analyses and interviews revealed significant differences between the two groups in their self-reported emotions and automatically measured physiological responses when the experimental group was exposed to classical, easy-listening, rebellious and rhythmic music. Fine-grained music characteristics (e.g., instrumentation, tempo) could predict learners’ emotional, pupillary, and physiological responses during reading comprehension. The expected contributions of this study include: 1) providing empirical evidence for understanding affective dimensions of learning with BGM, 2) applying MmLA methods for examining the impacts of BGM on learning, and 3) yielding practical implications on how to improve learning with BGM.

Architectural Adaptation and Regularization of Attention Networks for Incremental Knowledge Tracing

Cheryl Sze Yin Wong
Savitha Ramasamy

EdTech platforms continuously refresh their databases with new questions and concepts as course syllabi evolve. State-of-the-art knowledge tracing models are unable to adapt to these changes, as the size of the question embedding layers is typically fixed. In this work, we propose an incremental learning algorithm for knowledge tracing that is capable of adapting itself to a growing pool of concepts and questions through its architectural adaptation and regularization strategies. The algorithm, referred to as "Architectural adaptation and Regularization of Attention network for Incremental Knowledge Tracing (ARAIKT)", is capable of adapting the embeddings to a growing set of concepts and a growing question bank, while preserving representations of the previous concepts and question banks. Furthermore, it is robust to distributional drifts in the data, and is capable of preserving privacy of data across study centers and EdTech platforms. We demonstrate the effectiveness of ARAIKT by evaluating its performance on subsets of study centers/academic years within the ASSISTment2009 and ASSISTment2017 data sets, respectively.

Automated Feedback for Student Math Responses Based on Multi-Modality and Fine-Tuning

Hai Li
Chenglu Li
Wanli Xing
Sami Baral
Neil Heffernan

Open-ended mathematical problems are commonly used by teachers to assess students’ abilities. In previous automated assessments, natural language processing focusing on students’ textual answers has been the primary approach. However, mathematical questions often involve answers containing images, such as number lines, geometric shapes, and charts. Several existing computer-based learning systems allow students to upload their handwritten answers for grading. Yet, there are limited methods available for automated scoring of these image-based responses, with even fewer multi-modal approaches that can simultaneously handle both texts and images. In addition to scoring, comments are another valuable scaffold for supporting students procedurally and conceptually, yet they also lack automation. In this study, we developed a multi-task model to simultaneously output scores and comments using students’ multi-modal artifacts (texts and images) as inputs by extending BLIP, a multi-modal visual reasoning model. Benchmarked with three baselines, we fine-tuned and evaluated our approach on a dataset related to open-ended questions as well as students’ responses. We found that incorporating images with text inputs enhances feedback performance compared to using texts alone. Meanwhile, our model can effectively provide coherent and contextual feedback in mathematical settings.

A Fair Clustering Approach to Self-Regulated Learning Behaviors in a Virtual Learning Environment

Yukyeong Song
Chenglu Li
Wanli Xing
Shan Li
Hakeoung Hannah Lee

While virtual learning environments (VLEs) are widely used in K-12 education for classroom instruction and self-study, young students’ success in VLEs highly depends on their self-regulated learning (SRL) skills. Therefore, it is important to provide personalized support for SRL. One important precursor of designing personalized SRL support is to understand students’ SRL behavioral patterns. Extensive studies have clustered SRL behaviors and prescribed personalized support for each cluster. However, limited attention has been paid to the algorithmic bias and fairness of clustering results. In this study, we “fairly” clustered the behavioral patterns of SRL using fair-capacitated clustering (FCC), an algorithm that incorporates constraints to ensure fairness in the assignment of data points. We used data from 14,251 secondary school learners in a virtual math learning environment. The results showed that FCC could capture six clusters of SRL behaviors in a fair way: three clusters in high-performing groups (i.e., H-1) Help-provider, H-2) Active SRL learner, and H-3) Active onlooker) and three clusters in low-performing groups (i.e., L-1) Quiz-taker, L-2) Dormant learner, and L-3) Inactive onlooker). The findings provide a better understanding of SRL patterns in online learning and can potentially guide the design of personalized support for SRL.

Understanding Knowledge Convergence in a Cross-cultural Online Context: An Individual and Collective Approach

Mengtong Xiang
Jingjing Zhang
Yue Li

The concept of knowledge convergence refers to building a shared cognitive understanding among individuals through social interaction. It is considered a crucial aspect of collaborative learning and plays a significant role in the process of consensus building. However, there is a lack of research exploring knowledge convergence in the context of online learning, especially in cross-cultural settings. Collaborative learning primarily focuses on constructing cognitive knowledge representations at the individual level, while online learning emphasizes the social mechanism of knowledge diffusion and flow at the collective level. This study aims to investigate individual online knowledge convergence through content analysis of social annotations within a cross-cultural course, and uses Simulation Investigation for Empirical Network Analysis (SIENA) to depict the collective social interaction. The findings reveal that online knowledge convergence exhibits distinct characteristics: quick consensus building could foster a harmonious community, and similar experiences compensated for limited interactions, triggering deep consensus. Individual convergence leads to emergent properties such as reciprocity and transitivity within a dynamic collective interactive network, which can serve as novel indicators for evaluating knowledge convergence at the collective level. By approaching knowledge convergence from multifaceted perspectives, this study contributes to a comprehensive understanding of the concept across diverse learning contexts.

Field report for Platform mBox: Designing an Open MMLA Platform

Zaibei Li
Martin Thoft Jensen
Alexander Nolte
Daniel Spikol

Multimodal Learning Analytics (MMLA) is an evolving sector within learning analytics that has become increasingly useful for examining complex learning and collaboration dynamics for group work across all educational levels. The availability of low-cost sensors and affordable computational power allows researchers to investigate different modes of group work. However, the field faces challenges stemming from the complexity and specialization of the systems required for capturing diverse interaction modalities, with commercial systems often being expensive or narrow in scope and researcher-developed systems often being highly specialized and difficult to deploy. Therefore, more user-friendly, adaptable, affordable, open-source, and easy-to-deploy systems are needed to advance research and application in the MMLA field. The paper presents a field report on the design of mBox, which aims to support group work across different contexts. We share the progress of mBox, a low-cost, easy-to-use platform grounded in learning theories to investigate collaborative learning settings. Our approach has been guided by iterative design processes that let us rapidly prototype different solutions for these settings.

What Fairness Metrics Can Really Tell You: A Case Study in the Educational Domain

Lea Cohausz
Jakob Kappenberger
Heiner Stuckenschmidt

Recently, discussions on fairness and algorithmic bias have gained prominence in the learning analytics and educational data mining communities. To quantify algorithmic bias, researchers and practitioners often use popular fairness metrics, e.g., demographic parity, without discussing their choices. This can be considered problematic, as the choices should strongly depend on the underlying data generation mechanism, the potential application, and normative beliefs. Likewise, whether and how one should deal with the indicated bias depends on these aspects. This paper presents and discusses several theoretical cases to highlight precisely this. By providing a set of examples, we hope to facilitate a practice where researchers discuss potential fairness concerns by default.
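As an illustration of the kind of metric this abstract refers to, demographic parity compares the rate of positive predictions across groups. The sketch below is a minimal, assumed formulation (largest minus smallest positive-prediction rate); the function name and the toy data are hypothetical and are not taken from the paper.

```python
def demographic_parity_difference(predictions, groups):
    """Gap between the highest and lowest positive-prediction rate across groups.

    predictions: list of 0/1 model outputs (1 = positive prediction).
    groups: list of group labels, aligned with predictions.
    """
    rates = {}
    for label in set(groups):
        members = [p for p, g in zip(predictions, groups) if g == label]
        rates[label] = sum(members) / len(members)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: 1 = predicted to pass, 0 = predicted to fail.
preds = [1, 1, 0, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = demographic_parity_difference(preds, groups)
print(f"demographic parity difference: {gap:.2f}")
```

A gap of zero means every group receives positive predictions at the same rate. Whether that is the right target is exactly the question the abstract raises, since the answer depends on the data generation mechanism and the intended application.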

Bringing Collaborative Analytics using Multimodal Data to Masses: Evaluation and Design Guidelines for Developing a MMLA System for Research and Teaching Practices in CSCL

Pankaj Chejara
Reet Kasepalu
Luis Prieto
María Jesús Rodríguez-Triana
Adolfo Ruiz-Calleja

The Multimodal Learning Analytics (MMLA) research community has significantly grown in the past few years. Researchers in this field have harnessed diverse data collection devices such as eye-trackers, motion sensors, and microphones to capture rich multimodal data about learning. This data, when analyzed, has proven highly valuable for understanding learning processes across a variety of educational settings. Notwithstanding this progress, the ubiquitous use of MMLA in education is still limited by challenges such as technological complexity and high costs. In this paper, we introduce CoTrack, an MMLA system for capturing the multimodality of a group’s interaction in terms of audio, video, and writing logs in online and co-located collaborative learning settings. The system offers a user-friendly interface, designed to cater to the needs of teachers and students without specialized technical expertise. Our usability evaluation with 2 researchers, 2 teachers and 24 students has yielded promising results regarding the system’s ease of use. Furthermore, this paper offers design guidelines for the development of more user-friendly MMLA systems. These guidelines have significant implications for the broader aim of making MMLA tools accessible to a wider audience, particularly for non-expert MMLA users.

The Role of Gender in Citation Practices of Learning Analytics Research

Oleksandra Poquet
Srecko Joksimovic
Pernille Brams

Mounting evidence indicates that modern citation practices contribute to inequalities in who receives citations. In response to this evidence, our paper investigates citation practices in learning analytics (LA). We analyse citations in papers published over ten years at the Learning Analytics and Knowledge conference (LAK). Our analysis examines the gender composition of authored and cited papers in LA, estimating various factors that explain why one paper cites another, and if the citation rates differ across different author teams. Results indicate an overall increase in the number of women authors at LAK, while the ratio of men to women remains stable. Citation patterns in LAK are influenced by the seniority of authors, paper age, topic, and team size. We found that LAK papers with women as the last author are under-cited, but papers where the first author is a woman and the last author is a man are over-cited. Author teams with different gender composition also vary in who they over- and under-cite. Upon presenting the empirical results, the paper reflects on the role of mindful citation practices and reviews existing measures proposed to promote diversity in citations.

Automated Discourse Analysis via Generative Artificial Intelligence

Ryan Garg
Jaeyoung Han
Yixin Cheng
Zheng Fang
Zachari Swiecki

Coding discourse data is critical to many learning analytics studies. To code their data, researchers may use manual techniques, automated techniques, or a combination thereof. Manual coding can be time-consuming and error-prone; automated coding can be difficult to implement for non-technical users. Generative artificial intelligence (GAI) offers a user-friendly alternative to automated discourse coding via prompting and APIs. We assessed the ability of GAI, specifically the GPT class of models, to automatically code discourse in the context of a learning analytics study using a variety of prompting and training strategies. We found that fine-tuning approaches produced the best results; however, no results achieved standard thresholds for reliability in our field.

Kattis vs ChatGPT: Assessment and Evaluation of Programming Tasks in the Age of Artificial Intelligence

Nora Dunder
Saga Lundborg
Jacqueline Wong
Olga Viberg

AI-powered education technologies can support students and teachers in computer science education. However, despite recent developments in generative AI, and especially the growing popularity of ChatGPT, the effectiveness of using large language models for solving programming tasks remains underexplored. The present study examines ChatGPT’s ability to generate code solutions at different difficulty levels for introductory programming courses. We conducted an experiment where ChatGPT was tested on 127 randomly selected programming problems provided by Kattis, an automatic software grading tool for computer science programs, often used in higher education. The results showed that ChatGPT could independently solve 19 out of 127 programming tasks generated and assessed by Kattis. Further, ChatGPT was found to be able to generate accurate code solutions for simple problems but encountered difficulties with more complex programming tasks. The results contribute to the ongoing debate on the utility of AI-powered tools in programming education.

Code Soliloquies for Accurate Calculations in Large Language Models

Shashank Sonkar
Xinghe Chen
Myco Le
Naiming Liu
Debshila Basu Mallick
Richard Baraniuk

High-quality conversational datasets are crucial for the successful development of Intelligent Tutoring Systems (ITS) that utilize a Large Language Model (LLM) backend. Synthetic student-teacher dialogues, generated using advanced GPT-4 models, are a common strategy for creating these datasets. However, subjects like physics that entail complex calculations pose a challenge. While GPT-4 presents impressive language processing capabilities, its limitations in fundamental mathematical reasoning curtail its efficacy for such subjects. To tackle this limitation, we introduce in this paper an innovative stateful prompt design. Our design orchestrates a mock conversation where both student and tutorbot roles are simulated by GPT-4. Each student response triggers an internal monologue, or ‘code soliloquy’ in the GPT-tutorbot, which assesses whether its subsequent response would necessitate calculations. If a calculation is deemed necessary, it scripts the relevant Python code and uses the Python output to construct a response to the student. Our approach notably enhances the quality of synthetic conversation datasets, especially for subjects that are calculation-intensive. The preliminary Subject Matter Expert evaluations reveal that our Higgs model, a fine-tuned LLaMA model, effectively uses Python for computations, which significantly enhances the accuracy and computational reliability of Higgs’ responses.

Analyzing Student Attention and Acceptance of Conversational AI for Math Learning: Insights from a Randomized Controlled Trial

Chenglu Li
Wangda Zhu
Wanli Xing
Rui Guo

The significance of nurturing a deep conceptual understanding in math learning cannot be overstated. Grounded in the pedagogical strategies of induction, concretization, and exemplification (ICE), we designed and developed a conversational AI using both rule- and generation-based techniques to facilitate math learning. Serving as a preliminary step, this study employed an experimental design involving 151 U.S.-based college students to reveal students’ attention patterns, technology acceptance model, and qualitative feedback when using the developed ConvAI. Our findings suggest that participants in the ConvAI group generally exhibit higher attention levels than those in the control group, aside from the initial stage where the control group was more attentive. Meanwhile, participants appreciated their experience with the ConvAI, particularly valuing the ICE support features. Finally, qualitative analysis of participants’ feedback was conducted to inform future refinement and to inspire educational researchers and practitioners.

Estimating the Causal Treatment Effect of Unproductive Persistence

Amelia Leon
Allen Nie
Yash Chandak
Emma Brunskill

There has been considerable work in classifying and predicting unproductive persistence, but much less in understanding its causal impact on downstream outcomes of interest, like external assessments. In general, it is experimentally challenging to understand the causal impact because, unlike in many other settings, we cannot directly intervene (to conduct a randomized controlled trial) and cause students to struggle unproductively in an authentic manner. In this work, we use data from a prior study that used virtual reality headsets to direct teachers’ attention to students who were unproductively struggling. We show that we can use this as an instrumental variable, and use a two-stage least squares analysis to provide a causal estimate of the treatment effect of unproductive persistence on post-test performance. Our results further strengthen the importance of unproductive struggle and highlight the potential of leveraging instruments to identify causal treatment effects of student behaviors during the use of educational technology.
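The two-stage least squares idea mentioned in this abstract can be sketched for the simplest case of one instrument and one treatment variable. This is a generic textbook illustration, not the authors' analysis: the variable names (z for the instrument, x for the treatment, y for the outcome, u for an unobserved confounder) and the toy numbers are assumptions chosen so that the true effect of x on y is -2.

```python
def ols(x, y):
    """Simple least-squares fit of y on x; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(z, x, y):
    """Estimate the effect of x on y using z as an instrument."""
    # Stage 1: predict the treatment from the instrument.
    a0, a1 = ols(z, x)
    x_hat = [a0 + a1 * zi for zi in z]
    # Stage 2: regress the outcome on the predicted treatment.
    _, beta = ols(x_hat, y)
    return beta


# Hypothetical toy data: u confounds x and y but is uncorrelated with z.
z = [0, 0, 1, 1]
u = [1, -1, 1, -1]
x = [zi + ui for zi, ui in zip(z, u)]
y = [-2 * xi + 3 * ui for xi, ui in zip(x, u)]

naive_slope = ols(x, y)[1]                   # biased by the confounder u
iv_slope = two_stage_least_squares(z, x, y)  # recovers the effect of -2
print(naive_slope, iv_slope)
```

In this toy setup the naive regression gives a slope of about 0.4, with the wrong sign, while the instrumented estimate recovers -2, because only the variation in x that is driven by z is used in the second stage.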

An Investigation of Automatically Generated Feedback on Student Behavior and Learning

Rachel Van Campenhout
Murray Kimball
Michelle Clark
Jeffrey S. Dittel
Bill Jerome
Benny G. Johnson

Decades of research have focused on the feedback delivered to students after answering questions: when to deliver feedback and what kind of feedback is most beneficial for learning. While there is a well-established body of research on feedback, new advances in technology have led to new methods for developing feedback, and large-scale usage provides new data for understanding how feedback impacts learners. This paper focuses on feedback that was developed using artificial intelligence for an automatic question generation system. The automatically generated questions were placed alongside text as a formative learning tool in an e-reader platform. Three types of feedback were randomized across the questions: outcome feedback, context feedback, and common answer feedback. In this study, we investigate the effect of different feedback types on student behavior. This analysis contributes to the expanding body of research on automatic question generation, where little has been reported on automatically generated feedback specifically, and illustrates the additional insights that microlevel data can reveal about the relationship between feedback and student learning behaviors.

Extracting Course Similarity Signal using Subword Embeddings

Yinuo Xu
Zach A. Pardos

Several studies have shown the utility of neural network models in learning course similarities and providing insightful course recommendations from enrollment data. In this study, we explore whether additional signals can be found in the morphological structure of course names. We train skip-gram, FastText, and other combination models on course sequence data from the past nine years and compare results with state-of-the-art models. We find a 97.95% improvement in model performance (as measured by recall @ 10 in similarity-based course recommendations) from skip-gram to FastText, and an 80.75% improvement of the current best combination model over the previous state-of-the-art model, indicating that the naming convention of courses (e.g., PHYS_H101) carries valuable signals. We define attributes with which to categorize course pairs from our validation set and present an analysis of which models are strongest and weakest at predicting the similarity of which categories of course pairs. We also explore course-taking culture, analyzing whether courses with the same demographic features are learned to be more similar. Our approach could help students find alternatives to full courses, improve existing course recommendation systems and course articulations between institutions, and assist institutions in course policy-making.
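To illustrate why a course naming convention can carry signal for a subword model, the sketch below extracts FastText-style character n-grams (the token wrapped in angle brackets, n-gram lengths 3 to 6 as in FastText's defaults) and compares two course names. The helper function is a simplified stand-in for illustration, and the second course name, PHYS_101, is a hypothetical example rather than one from the paper.

```python
def char_ngrams(token, min_n=3, max_n=6):
    """Character n-grams of a token wrapped in boundary markers < and >."""
    wrapped = f"<{token}>"
    return [
        wrapped[i:i + n]
        for n in range(min_n, max_n + 1)
        for i in range(len(wrapped) - n + 1)
    ]


# PHYS_H101 appears in the abstract; PHYS_101 is a made-up sibling course.
honors = set(char_ngrams("PHYS_H101"))
regular = set(char_ngrams("PHYS_101"))
shared = honors & regular

print(sorted(shared))
```

Because the two names share subword units such as the department prefix and the course number, a subword model can place them near each other even when one of them is rare in the enrollment sequences.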

Expert Features for a Student Support Recommendation Contextual Bandit Algorithm

Morgan Lee
Abubakir Siedahmed
Neil Heffernan

Contextual multi-armed bandits have previously been used to personalize student support messages given to learners by supplying a model with relevant context about the user, problem, and available student supports. In this work, we propose using careful feature selection with relevant domain knowledge to improve the quality of student support recommendations. By providing Bayesian Knowledge Tracing mastery estimates to a contextual multi-armed bandit as user-level context in a simulated environment, we demonstrate that using domain knowledge to engineer contextual features results in higher average cumulative reward, and significant improvement over randomly selecting student supports. The data used to simulate sequential recommendations are available at https://osf.io/sfyzv/?view_only=351fb8781d2c4f3bbc9d7486762d563a.
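A Bayesian Knowledge Tracing mastery estimate of the kind described here can be computed with the standard BKT update. The sketch below uses the textbook equations; the parameter values, the starting prior, and the response sequence are hypothetical and are not the ones used in the paper.

```python
def bkt_update(p_mastery, correct, p_guess=0.2, p_slip=0.1, p_transit=0.15):
    """One step of the standard Bayesian Knowledge Tracing update.

    p_mastery: prior probability that the skill is mastered.
    correct: whether the student's response was correct.
    Returns the mastery probability after observing the response and
    allowing for learning on this opportunity.
    """
    if correct:
        numerator = p_mastery * (1 - p_slip)
        denominator = numerator + (1 - p_mastery) * p_guess
    else:
        numerator = p_mastery * p_slip
        denominator = numerator + (1 - p_mastery) * (1 - p_guess)
    posterior = numerator / denominator
    return posterior + (1 - posterior) * p_transit


# Hypothetical response sequence for one student on one skill.
mastery = 0.3
trace = []
for response in [True, False, True, True]:
    mastery = bkt_update(mastery, response)
    trace.append(mastery)

# The latest estimate could then be passed to a bandit as one user-level feature.
context_feature = trace[-1]
print(trace)
```

Using the running estimate as a context feature gives the bandit a compact summary of the student's history on the skill, which is one plausible reading of "domain knowledge to engineer contextual features".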

Probing Actionability in Learning Analytics: The Role of Routines, Timing, and Pathways

Yeonji Jung
Alyssa Friend Wise

Actionability is a critical, but understudied, issue in learning analytics for driving impact on learning. This study investigated access and action-taking of 91 students in an online undergraduate statistics course who received analytics designed for actionability twice a week for five weeks in the semester. Findings showed high levels of access, but little direct action through the provided links. The major contribution of the study was the identification of unexpected indirect actions taken by students in response to the analytics which requires us to think (and look for evidence of impact) more broadly than has been done previously. The study also found that integrating analytics into existing learning tools and routines can increase access rates to the analytics, but may not guarantee meaningful engagement without better strategies to manage analytic timing. Together, this study advances an understanding of analytic actionability, calling for a broader examination of both direct and indirect actions within a larger learning ecosystem.

To what extent do responses to a single survey question provide insights into students' sense of belonging?

Sriram Ramanathan
Simon Buckingham Shum
Lisa-Angelique Lim

A student's “sense of belonging” is critical to retention and success in higher education. However, belonging is a multifaceted and dynamic concept, making monitoring and supporting it with timely action challenging. Conventional approaches to researching belonging depend on lengthy surveys and/or focus groups, and while often insightful, these are resource-intensive, slow, and cannot be repeated too often. “Belonging Analytics” is an emerging concept pointing to the potential of learning analytics to address this challenge. To illustrate this concept, this paper investigates the feasibility of asking students a single question about what promotes their sense of belonging. To validate this, responses were analysed using a form of topic modelling, and these were triangulated by examining alignment with (i) students’ responses to Likert scale items in a belonging scale and (ii) the literature on the drivers of belonging. These alignments support our proposal that this is a practical tool to gain timely insight into a cohort's sense of belonging. Reflecting our focus on practical tools, the approach is implemented using analytics products readily available to educational institutions: Linguistic Inquiry and Word Count (LIWC) and Statistical Package for the Social Sciences (SPSS).

Minds and Machines Unite: Deciphering Social and Cognitive Dynamics in Collaborative Problem Solving with AI

Mohammad Amin Samadi
Spencer Jaquay
Yiwen Lin
Elham Tajik
Seehee Park
Nia Nixon

We investigated the feasibility of automating the modeling of collaborative problem-solving skills encompassing both social and cognitive aspects. Leveraging a diverse array of cutting-edge techniques, including machine learning, deep learning, and large language models, we embarked on the classification of qualitatively coded interactions within groups. These groups were composed of four undergraduate students, each randomly assigned to tackle a decision-making challenge. Our dataset comprises contributions from 514 participants distributed across 129 groups. Employing a suite of prominent machine learning methods such as Random Forest, Support Vector Machines, Naive Bayes, Recurrent and Convolutional Neural Networks, BERT, and GPT-2 language models, we undertook the intricate task of classifying peer interactions. Notably, we introduced a novel task-based train-test split methodology, allowing us to assess classification performance independently of task-related context. This research carries significant implications for the learning analytics field by demonstrating the potential for automated modeling of collaborative problem-solving skills, offering new avenues for understanding and enhancing group learning dynamics.

Exploring Student Expectations and Confidence in Learning Analytics

Hayk Asatryan
Basile Tousside
Janis Mohr
Malte Neugebauer
Hildo Bijl
Paul Spiegelberg
Claudia Frohn-Schauf
Jörg Frochte

Learning Analytics (LA) is now ubiquitous in many educational systems, providing the ability to collect and analyze student data in order to understand and optimize learning and the environments in which it occurs. At the same time, the collection of data must comply with the growing demands of privacy legislation. In this paper, we use the Student Expectation of Learning Analytics Questionnaire (SELAQ) to analyze the expectations and confidence of students from different faculties regarding the processing of their data for Learning Analytics purposes. This allows us to identify four clusters of students through clustering algorithms: Enthusiasts, Realists, Cautious and Indifferents. This structured analysis provides valuable insights into the acceptance and criticism of Learning Analytics among students.

Gamification and Deadending: Unpacking Performance Impacts in Algebraic Learning

Siddhartha Pradhan
Ashish Gurung
Erin Ottmar

This study explores the effects of varying problem-solving strategies on students’ future performance within the gamified algebraic learning platform From Here To There! (FH2T). The study focuses on the procedural pathways students adopted, transitioning from a start state to a goal state in solving algebraic problems. By dissecting the nature of these pathways (optimal, sub-optimal, incomplete, and dead-end), we sought correlations with post-test outcomes. A striking observation was that frequent engagement in what we term ‘regular dead-ending behavior’ was significantly correlated with higher post-test performance. This finding underscores the potential of exploratory learner behavior within a low-stakes gamified framework in bolstering algebraic comprehension. The implications of our findings are twofold: they accentuate the significance of tailoring gamified platforms to student behaviors and highlight the potential benefits of fostering an environment that promotes exploration without retribution. Moreover, our insights suggest that fostering exploratory behavior could be instrumental in cultivating mathematical flexibility.

Navigating (Dis)agreement: AI Assistance to Uncover Peer Feedback Discrepancies

M Parvez Rashid
Edward Gehringer
Hassan Khosravi

Engaging students in the peer review process has been recognized as a valuable educational tool. It not only nurtures a collaborative learning environment where reviewees receive timely and rich feedback but also enhances the reviewer’s critical thinking skills and encourages reflective self-evaluation. However, a common concern arises when students encounter misaligned or conflicting feedback. Not only can such feedback confuse students, but it can also make it difficult for the instructor to rely on the reviews when assigning a score to the work. Addressing this issue, our paper introduces an AI-assisted approach designed to detect and highlight disagreements within formative feedback. We analyzed 15,500 instances of peer feedback from 170 students in a software development course. By combining clustering techniques with natural language processing (NLP) models, we transform feedback into distinct feature vectors to pinpoint disagreements. The findings from our study underscore the effectiveness of our approach in enhancing text representations to significantly boost the capability of clustering algorithms in discerning disagreements in feedback. These insights bear implications for educators and software development courses, offering a promising route to streamline and refine the peer review process and improve student learning outcomes.

Needs Analysis of Learning Analytics Dashboard for College Teacher Online Professional Learning in an International Training Initiative for the Global South

Chao Wang
Xiao Hu
Nora Patricia Hernández López
Jeremy Tzi Dong Ng

Online courses enable wide access to educational resources and thus provide a feasible platform for cross-regional teacher professional learning. Learning analytics dashboards (LAD) can support online learners by providing fine-grained feedback generated from learners’ interactions with platforms. Nevertheless, most studies on teacher online professional learning focus on resource-rich and technology-advanced regions, with scarce attention to the Global South. Furthermore, existing studies on LAD design mainly target students’ learning, rather than teachers’ professional learning. There is therefore a clear need to develop LADs for teacher-learners’ online professional learning in the Global South. Contextualized in an international online professional training initiative, this study conducted in-depth interviews with 42 teacher-learners from 19 countries in the Global South, aiming to identify their needs for 1) support on their self-regulated learning (SRL), and 2) potential LA components in dashboards. Findings indicated that teacher-learners needed support for self-regulated learning strategies, including motivation maintenance, time management, environment structuring, help-seeking, and self-evaluation. Nine LA features were identified as a preliminary basis for designing LADs that support SRL. This LAD study, co-designed with interviewees, improved our understanding of the needs of college teachers in the Global South for LA support during their online professional learning, generating practical insights into needs-driven LAD designs.

Unveiling Synchrony of Learners’ Multimodal Data in Collaborative Maker Activities

Zuo Wang
Jeremy Tzi Dong Ng
Ying Que
Xiao Hu

While current evaluation of maker activities has rarely explored students’ learning processes, the multi-perspective and multi-level nature of collaboration adds complexity to the learning processes of collaborative maker activities. With group dynamics being an important indicator of collaboration quality, extant studies have shown the benefits of synchrony between learners’ actions during collaborative learning processes. However, synchrony of learners’ cognitive processes and visual attention in collaborative maker activities remains under-explored. Leveraging the multimodal learning analytics (MMLA) approach, this pilot study examines learners’ synchrony patterns from multiple modalities of data in the collaborative maker activity of virtual reality (VR) content creation. We conducted a user experiment with five pairs of students, and collected and analyzed their electroencephalography (EEG) signals, eye movement and system log data. Results showed that the five pairs of collaborators demonstrated diverse synchrony patterns. We also discovered that, while some groups exhibited synchrony in one modality of data before becoming unsynchronized in another modality, other groups started with a lack of synchrony and then maintained synchrony. This study is expected to make methodological and practical contributions to MMLA research and assessment of collaborative maker activities.

Places to intervene in complex learning systems

Kirsty Kitto
Andrew Gibson

Responding to recent questioning of Learning Analytics (LA) as a field that is achieving its aim of understanding and optimising learning and the environments in which it occurs, this paper argues that there is a need to genuinely embrace the complexity of learning when considering the impact of LA. Rather than focusing upon ‘optimisation’, we propose that LA should seek to understand and improve the complex socio-technical system in which it operates. We adopt a framework from systems theory to propose 12 different intervention points for learning systems, and apply it to two case studies. We conclude with an invitation to the community to critique and extend this proposed framework.

A Case Study on University Student Online Learning Patterns Across Multidisciplinary Subjects

Yige Song
Eduardo Oliveira
Michael Kirley
Pauline Thompson

This case study explores the online learning patterns of a cohort of first-year university students in two subjects: a compulsory science subject and an introductory programming subject, by analysing trace data from the Learning Management Systems (LMS). The methodology extends existing learning analytics techniques to incorporate temporal aspects of students’ learning, such as session duration and weekly online behaviours. By examining over 82,000 learning actions, the research unveils significant variations in students’ online learning strategies between subjects, offering deeper insights into these differences and their associated challenges. The study seeks to initiate broader discussions in learning analytics, emphasising the need to comprehend students’ diverse online learning experiences and encouraging further exploration in future research.
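
One common way to derive a temporal feature such as session duration from LMS trace data is to group a student's timestamped actions into sessions separated by an inactivity gap. The sketch below illustrates that general idea only; the abstract does not state how sessions were defined, so the 30-minute gap, the function names, and the example timestamps are assumptions.

```python
from datetime import datetime, timedelta


def split_sessions(timestamps, gap=timedelta(minutes=30)):
    """Group timestamps into sessions; a new session starts after `gap` of inactivity.

    The 30-minute default is an assumed threshold, not one taken from the study.
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)  # still within the current session
        else:
            sessions.append([ts])  # inactivity gap exceeded: start a new session
    return sessions


def session_durations(timestamps, gap=timedelta(minutes=30)):
    """Return the duration (last action minus first action) of each session."""
    return [s[-1] - s[0] for s in split_sessions(timestamps, gap)]


# Hypothetical learning actions for one student on one day.
actions = [
    datetime(2023, 3, 1, 9, 0),
    datetime(2023, 3, 1, 9, 10),
    datetime(2023, 3, 1, 9, 25),
    datetime(2023, 3, 1, 11, 0),
    datetime(2023, 3, 1, 11, 5),
]

durations = session_durations(actions)
# Two sessions: 09:00-09:25 (25 minutes) and 11:00-11:05 (5 minutes).
```

A session containing a single action has a duration of zero under this definition, so analyses that rely on it may need an additional rule for isolated actions.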