Student models are typically evaluated by predicting the correctness of the next answer. This approach is insufficient in the problem-solving context, especially for student models that use performance data beyond binary correctness. We propose more comprehensive methods for validating student models and illustrate them in the context of introductory programming. We demonstrate the insufficiency of the next answer correctness prediction task, as it is neither able to reveal the low validity of student models that use just binary correctness, nor does it show the increased validity of models that use other performance data. The key message is that the prevalent use of next answer correctness for validating student models, and of binary correctness as the only input to the models, is not always warranted and limits progress in learning analytics.
The provision of comparative feedback is a promising approach in digital learning environments to support learners’ self-regulated learning. Yet, empirical evidence suggests that such feedback can sometimes backfire or may only help learners with relatively high self-regulated learning skills, potentially exacerbating educational inequality. In this paper, we try to overcome such drawbacks by re-evaluating a feedback system based on social norms theory that has previously led to intriguing results: a social comparison component embedded into the learning platform of a blended learning course (elective module, 58 participants) considerably encouraged online learning during the semester. Moreover, there was no heterogeneity in the behavioral response, suggesting that all subgroups responded similarly to the feedback. To further shed light on the generalizability of these results, this paper presents a follow-up study. Specifically, we conducted a second experiment during the COVID-19 pandemic with a different university course (compulsory module, 118 participants) and a non-overlapping sample, and found similar results. The feedback shifted students’ online learning from the end towards the middle of the semester. Overall, the findings suggest that our feedback system has a large impact on students’ online learning and that this desirable impact is present in all subgroup analyses.
Online forums are an integral part of modern day courses, but motivating students to participate in educationally beneficial discussions can be challenging. Our proposed solution is to initialize (or “seed”) a new course forum with comments from past instances of the same course that are intended to trigger discussion that is beneficial to learning. In this work, we develop methods for selecting high-quality seeds and evaluate their impact over one course instance of a 186-student biology class. We designed a scale for measuring the “seeding suitability” score of a given thread (an opening comment and its ensuing discussion). We then constructed a supervised machine learning (ML) model for predicting the seeding suitability score of a given thread. This model was evaluated in two ways: first, by comparing its performance to the expert opinion of the course instructors on test/holdout data; and second, by embedding it in a live course, where it was actively used to facilitate seeding by the course instructors. For each reading assignment in the course, we presented a ranked list of seeding recommendations to the course instructors, who could review the list and filter out seeds with inconsistent or malformed content. We then ran a randomized controlled study, in which one group of students was shown seeds that were recommended by the ML model, and another group was shown seeds that were recommended by an alternative model that ranked seeds purely by the length of discussion that was generated in previous course instances. We found that the group of students that received posts from either seeding model generated more discussion than a control group in the course that did not get seeded posts. Furthermore, students who received seeds selected by the ML-based model showed higher levels of engagement, as well as greater learning gains, than those who received seeds ranked by length of discussion.
Learnersourcing is emerging as a viable learner-centred and pedagogically justified approach for harnessing the creativity and evaluation power of learners as experts-in-training. Despite the increasing adoption of learnersourcing in higher education, understanding students’ behaviour while engaged in learnersourcing and best practices for the design and development of learnersourcing systems are still largely under-researched. This paper offers data-driven reflections and lessons learned from the development and deployment of a learnersourcing adaptive educational system called RiPPLE, which to date has been used in more than 50 course offerings with over 12,000 students. Our reflections are categorised into examples and best practices on (1) assessing the quality of students’ contributions using accurate, explainable and fair approaches to data analysis, (2) incentivising students to develop high-quality contributions and (3) empowering instructors with actionable and explainable insights to guide student learning. We discuss the implications of these findings and how they may contribute to the growing literature on the development of effective learnersourcing systems and, more broadly, technological educational solutions that support learner-centred learning at scale.
Schools are increasingly becoming complex learning spaces where students interact with various physical and digital resources, educators, and peers. Although the field of learning analytics has advanced in analysing logs captured from digital tools, less progress has been made in understanding the social dynamics that unfold in physical learning spaces. Among the various rapidly emerging sensing technologies, position tracking may hold promise to reveal salient aspects of activities in physical learning spaces, such as the formation of interpersonal ties among students. This paper explores how granular x-y physical positioning data can be analysed to model social interactions among students and teachers. We conducted an 8-week longitudinal study in which positioning traces of 98 students and six teachers were automatically captured every day in an open-plan public primary school. Positioning traces were analysed using social network analytics (SNA) to extract a set of metrics that characterise students’ positioning behaviours and social ties at cohort and individual levels. Results illustrate how analysing positioning traces through the lens of SNA can enable the identification of certain pedagogical approaches that may either promote or discourage in-class social interaction, and of students who may be socially isolated.
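As a rough illustration of how x-y positioning traces can be turned into a social network, the sketch below builds a co-location graph with networkx and inspects a simple centrality measure; the proximity threshold, sampling format, and identifiers are assumptions for illustration, not the study’s actual pipeline.

```python
# Illustrative sketch (assumed data format and proximity threshold): convert
# x-y positioning snapshots into a weighted co-location network and inspect
# centrality to flag potentially isolated students.
import itertools
import networkx as nx

# positions[t] = {person_id: (x, y)}, e.g., sampled every few seconds
positions = {
    0: {"s1": (1.0, 2.0), "s2": (1.4, 2.2), "t1": (8.0, 5.0)},
    1: {"s1": (1.1, 2.1), "s2": (1.3, 2.0), "t1": (1.2, 2.3)},
}
PROXIMITY_M = 1.0  # assumed distance (metres) that counts as a social tie

G = nx.Graph()
for snapshot in positions.values():
    for (a, (xa, ya)), (b, (xb, yb)) in itertools.combinations(snapshot.items(), 2):
        if ((xa - xb) ** 2 + (ya - yb) ** 2) ** 0.5 <= PROXIMITY_M:
            w = G.get_edge_data(a, b, default={"weight": 0})["weight"]
            G.add_edge(a, b, weight=w + 1)

print(nx.degree_centrality(G))  # low-centrality nodes may indicate isolation
```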
The process of using Learning Analytics (LA) to improve teaching works from the assumption that data should be readily shared between stakeholders in an educational organization. However, the design of LA tools often does not account for considerations such as data privacy, transparency and trust among stakeholders. Research in human-centered design of LA does attend to these questions, specifically with a focus on including direct input from K-12 educators. In this paper, we present a series of design studies to articulate and refine conjectures about how privacy and transparency might support trust-building and data sharing within four school districts in the United States. By presenting the development of four sequential prototypes, our findings illuminate the tensions between designing for existing norms versus potentially challenging these norms by promoting meaningful discussions around the use of data. We conclude with a discussion of the practical and methodological implications of our work for the LA community.
Educators use a wide variety of data to inform their practices. Examples of these data include forms of information that are commonplace in schools, such as student work and paper-based artifacts. One limitation in these situations is that there are few efficient ways to process such everyday varieties of information into analytics that are usable and practical for educators. To explore how to address this constraint, we describe two sets of design experiments that utilize crowdsourced tasks for scoring open-ended assessments. Developing crowdsourced systems and their resulting analytics introduced a variety of challenges, such as attending to the expertise and learning of the crowd. In this paper, we describe the potential efficacy of design decisions such as screening the crowd, providing multimedia instruction, and asking the crowd to explain their answers. We also explore the potential of crowdsourcing as a learning opportunity for those participating in the collective tasks. Our work offers key design implications for leveraging crowdsourcing to process educational data in ways that are relevant to educators, while offering learning experiences for the crowd.
This paper reports the findings of a study that measured the effectiveness of employing automatic text translation methods in the automated classification of online discussion messages according to the categories of social and cognitive presences. Specifically, we examined the classification of 1,500 Portuguese and 1,747 English discussion messages using classifiers trained on the datasets before and after the application of text translation. While the English model, generated with the original and translated texts, achieved results (accuracy and Cohen’s κ) similar to those of previously reported studies, the translation to Portuguese led to a decrease in performance. This indicates the general viability of the proposed approach when converting the text to English. Moreover, this study highlighted the importance of different features and resources, and the limitations of the resources available for Portuguese, as reasons for the results obtained.
Video has become an essential medium for learning. However, there are challenges when using traditional methods to measure how learners attend to lecture videos in video learning analytics, such as difficulty in capturing learners’ attention at a fine-grained level. Therefore, in this paper, we propose a gaze-based metric—“with-me-ness direction”—that can measure how learners’ gaze direction changes as they listen to the instructor’s dialogue in a video lecture. We analyzed the gaze data of 45 participants as they watched a video lecture and measured both the sequences of with-me-ness direction and the proportion of time a participant spent looking in each direction throughout the lecture at different levels. We found that although participants followed the instructor’s dialogue the majority of the time, their behaviour of looking ahead, looking behind or looking outside differed by their prior knowledge. These findings open the possibility of using eye-tracking to measure learners’ video-watching attention patterns and examine factors that can influence their attention, thereby helping instructors to design effective learning materials.
Knowledge tracing (KT) is a research topic which seeks to model the knowledge acquisition process of students by analyzing their past performance in answering questions, based on which their performance in answering future questions is predicted. However, existing KT models only consider whether a student answers a question correctly when the answer is submitted but not the in-question activities. We argue that the interaction involved in the in-question activities can at least partially reveal the thinking process of the student, and hopefully even the competence of acquiring or understanding each piece of the knowledge required for the question.
Based on real student interaction clickstream data collected from an online learning platform on which students solve mathematics problems, we conduct a clustering analysis for each question to show that clickstreams can reflect different student behaviors. We then propose the first clickstream-based KT model, dubbed clickstream knowledge tracing (CKT), which augments a basic KT model by modeling the clickstream activities of students when answering questions. We apply different variants of CKT and compare them with the baseline KT model, which does not use clickstream data. Despite the limited number of questions with clickstream data and its noisy nature, which may compromise data quality, we show that incorporating clickstream data leads to performance improvement. Through this pilot study, we hope to open a new direction in KT research that analyzes finer-grained interaction data of students on online learning platforms.
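As a minimal sketch of what augmenting a recurrent KT model with clickstream information might look like (an illustrative assumption, not the authors’ CKT implementation), the PyTorch snippet below concatenates per-step clickstream summary features with the usual skill-correctness encoding before the recurrent layer.

```python
# Illustrative DKT-style model extended with clickstream summary features.
# `n_skills`, `click_dim`, and the feature encoding are assumed for the sketch.
import torch
import torch.nn as nn

class ClickstreamKT(nn.Module):
    def __init__(self, n_skills, click_dim, hidden=64):
        super().__init__()
        # input: one-hot (skill, correctness) pairs plus a clickstream summary vector
        self.rnn = nn.LSTM(2 * n_skills + click_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_skills)  # predicted P(correct) per skill

    def forward(self, interactions, clicks):
        # interactions: (batch, time, 2*n_skills); clicks: (batch, time, click_dim)
        h, _ = self.rnn(torch.cat([interactions, clicks], dim=-1))
        return torch.sigmoid(self.out(h))

model = ClickstreamKT(n_skills=50, click_dim=8)
preds = model(torch.zeros(4, 10, 100), torch.zeros(4, 10, 8))
```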
This paper examines the impact of COVID-19-induced campus closure on university students’ self-regulated learning behavior by analyzing clickstream data collected from student interactions with 70 online learning modules in a university physics course. To do so, we compared the trend of six types of actions related to the three phases of self-regulated learning before and after campus closure and between two semesters. We found that campus closure changed students’ planning and goal-setting strategies for completing the assignments, but did not have a detectable impact on the outcome or the time of completion, nor did it change students’ self-reflection behavior. The results suggest that most students still managed to complete assignments on time during the pandemic, and that the design of the online learning modules might have provided the flexibility and support for them to do so.
Students using online learning environments need to effectively self-regulate their learning. However, with an absence of teacher-provided structure, students often resort to less effective, passive learning strategies versus constructive ones. We consider the potential benefits of interventions that promote retrieval practice – retrieving learned content from memory – which is an effective strategy for learning and retention. The goal is to nudge students towards completing short, formative quizzes when they are likely to succeed on those assessments. Towards this goal, we developed a machine-learning model using data from 32,685 students who used an online mathematics platform over an entire school year to prospectively predict scores on three-item assessments (N = 210,020) from interaction patterns up to 9 minutes before the assessment as well as Item Response Theory (IRT) estimates of student ability and quiz difficulty. These models achieved a student-independent correlation of 0.55 between predicted and actual scores on the assessments and outperformed IRT-only predictions (r = 0.34). Model performance was largely independent of the length of the analyzed window preceding a quiz. We discuss potential for future applications of the models to trigger dynamic interventions that aim to encourage students to engage with formative assessments rather than more passive learning strategies.
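To make the modeling setup concrete, here is a hedged sketch (with synthetic data and assumed feature names, not the authors’ pipeline) of combining short-window interaction features with IRT-style ability and difficulty estimates to predict three-item quiz scores and report a Pearson correlation.

```python
# Synthetic illustration: interaction features + IRT ability/difficulty
# estimates -> predicted 0-3 quiz score, evaluated with Pearson r.
import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 1000
interaction = rng.normal(size=(n, 5))   # e.g., hints used, attempts, time-on-task
theta = rng.normal(size=n)              # IRT student-ability estimate
b = rng.normal(size=n)                  # IRT quiz-difficulty estimate
X = np.column_stack([interaction, theta, b])
y = np.clip(np.round(1.5 + theta - b + 0.3 * interaction[:, 0]
                     + rng.normal(scale=0.7, size=n)), 0, 3)

model = GradientBoostingRegressor(random_state=0).fit(X[:800], y[:800])
r, _ = pearsonr(model.predict(X[800:]), y[800:])
print(f"held-out Pearson r = {r:.2f}")
```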
Acoustic features and machine learning models have recently been proposed as promising tools to analyze lessons. Furthermore, acoustic patterns, both in the time and spectral domains, have been found to be related to teachers’ pedagogical practices. Nonetheless, most previous work relies on expensive or third-party equipment, limiting its scalability, and is mainly used for diarization. Instead, in this work we present a cost-effective approach to identifying teachers’ practices according to three categories (Presenting, Administration, and Guiding) compiled from the Classroom Observation Protocol for Undergraduate STEM. Specifically, we record teachers’ lessons using low-cost microphones connected to their smartphones. We then compute the mean and standard deviation of the amplitude, Mel spectrogram, and Mel Frequency Cepstral Coefficients of the recordings to train supervised models for the task of predicting these three categories. We found that spectral features perform better at the task of predicting teachers’ activities along the lessons and that our models can predict the presence of the two most common teaching practices with over 80% accuracy and good discriminative power. Finally, with these models, we found that using audio obtained from the teachers’ smartphones it is also possible to automatically discriminate between sessions in which students are or are not using an online platform. This approach is important for teachers and other stakeholders who could use an automatic and cost-effective tool for analyzing teaching practices.
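The snippet below is a minimal sketch (using librosa and scikit-learn; segment boundaries, labels, and parameters are assumptions, not the authors’ code) of the kind of feature extraction described: per-segment means and standard deviations of amplitude, Mel spectrogram, and MFCCs feeding a supervised classifier.

```python
# Hedged sketch of audio feature extraction for lesson segments.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

def segment_features(y, sr):
    """Mean and std of amplitude, Mel spectrogram, and MFCCs for one segment."""
    mel = librosa.feature.melspectrogram(y=y, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    parts = []
    for m in (y[np.newaxis, :], mel, mfcc):
        parts.append(m.mean(axis=1))
        parts.append(m.std(axis=1))
    return np.concatenate(parts)

# Assumed usage with labelled lesson segments on disk:
# X = np.vstack([segment_features(*librosa.load(path)) for path in segment_paths])
# labels in {"Presenting", "Administration", "Guiding"}
# clf = RandomForestClassifier(n_estimators=300).fit(X, labels)
```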
To adapt materials for an individual learner, intelligent tutoring systems must estimate their knowledge or abilities. Depending on the content taught by the tutor, there have historically been different approaches to student modeling. Unlike common skill-based models used by math and science tutors, second language acquisition (SLA) tutors use memory-based models since there are many tasks involving memorization and retrieval, such as learning the meaning of a word in a second language. Based on estimated memory strengths provided by these memory-based models, SLA tutors are able to identify the optimal timing and content of retrieval practices for each learner to improve retention. In this work, we seek to determine whether skill-based models can be combined with memory-based models to improve student modeling and especially retrieval practice performance for SLA. In order to define skills in the context of SLA, we develop methods that can automatically extract multiple types of linguistic features from words. Using these features as skills, we apply skill-based models to a real-world SLA dataset. Our main findings are as follows. First, incorporating lexical features to represent individual words as skills in skill-based models outperforms existing memory-based models in terms of recall probability prediction. Second, incorporating additional morphological and syntactic features of each word via multiple-skill tagging of each word further improves the skill-based models. Third, incorporating semantic features, like word embeddings, to model similarities between words in a learner’s practice history and their effects on memory also improves the models and appears to be a promising direction for future research.
Despite the abundance of data generated from students’ activities in virtual learning environments, the use of supervised machine learning in learning analytics is limited by the availability of labeled data, which can be difficult to collect for complex educational constructs. In a previous study, a subfield of machine learning called Active Learning (AL) was explored to improve the data labeling efficiency. AL trains a model and uses it, in parallel, to choose the next data sample to get labeled from a human expert. Due to the complexity of educational constructs and data, AL has suffered from the cold-start problem where the model does not have access to sufficient data yet to choose the best next sample to learn from. In this paper, we explore the use of past data to warm start the AL training process. We also critically examine the implications of differing contexts (urbanicity) in which the past data was collected. To this end, we use authentic affect labels collected through human observations in middle school mathematics classrooms to simulate the development of AL-based detectors of engaged concentration. We experiment with two AL methods (uncertainty sampling, L-MMSE) and random sampling for data selection. Our results suggest that using past data to warm start AL training could be effective for some methods based on the target population's urbanicity. We provide recommendations on the data selection method and the quantity of past data to use when warm starting AL training in the urban and suburban schools.
Analytics of student learning data are increasingly important for continuous redesign and improvement of tutoring systems and courses. There is still a lack of general guidance on converting analytics into better system design, and on combining multiple methods to maximally improve a tutor. We present a multi-method approach to data-driven redesign of tutoring systems and its empirical evaluation. Our approach systematically combines existing and new learning analytics and instructional design methods. In particular, our methods involve identifying difficult skills and creating focused tasks for learning these difficult skills effectively following content redesign strategies derived from analytics. In our past work, we applied this approach to redesigning an algebraic modeling unit and found initial evidence of its effectiveness. In the current work, we extended this approach and applied it to redesigning two other tutor units in addition to a second iteration of redesigning the previously redesigned unit. We conducted a one-month classroom experiment with 129 high school students. Compared to the original tutor, the redesigned tutor led to significantly higher learning outcomes, with time mainly allocated to focused tasks rather than original full tasks. Moreover, it reduced over- and under-practice, yielded a more effective practice experience, and selected skills progressing from easier to harder to a greater degree. Our work provides empirical evidence of the effectiveness and generality of a multi-method approach to data-driven instructional redesign.
There has been recent interest in the design of collaborative learning activities that are distributed across multiple technology devices for students to engage in scientific inquiry. Emerging research has begun to investigate students’ collaborative behaviors across different device types and students’ shared attention by tracking eye gaze, body posture, and their interactions with the digital environment. Using a 3D astronomy simulation that leverages a VR headset and tablet computers, this paper builds on the ideas described in eye-gaze studies by developing and implementing a metric of shared viewing (SV) across multiple devices. Preliminary findings suggest that a higher level of shared viewing could be related to increased conceptual discussion, and point to an early-stage behavioral pattern of decreased SV that could prompt facilitator intervention to refocus collaborative efforts. We hope this metric will be a promising first step in further understanding and assessing the quality of collaboration across multiple device platforms in a single shared space. This paper provides an in-depth look at a highly exploratory stage of a broader research trajectory to establish a robust, effective way to track screen views, including providing resources to teachers when students engage in similar learning environments, and providing insight from log data to understand how students effectively collaborate.
Investigation of learning tactics and strategies has received increasing attention from the Learning Analytics (LA) community. While previous research efforts have made notable contributions towards identifying and understanding learning tactics from trace data in various blended and online learning settings, there is still a need to deepen our understanding of the learning processes that are activated during the enactment of distinct learning tactics. In order to fill this gap, we propose a learning analytic approach to unveiling and comparing self-regulatory processes in learning tactics detected from trace data. Following this approach, we detected four learning tactics (Reading with Quiz Tactic, Assessment and Interaction Tactic, Short Login and Interact Tactic and Focus on Quiz Tactic) as used by 728 learners in an undergraduate course. We then theorised and detected five micro-level processes of self-regulated learning (SRL) through an analysis of trace data. We analysed how these micro-level SRL processes were activated during the enactment of the four learning tactics in terms of their frequency of occurrence and temporal sequencing. We found significant differences across the four tactics regarding the five micro-level SRL processes based on multivariate analysis of variance and comparison of process models. In summary, the proposed LA approach allows for meaningful interpretation and distinction of learning tactics in terms of the underlying SRL processes. More importantly, this approach shows the potential to overcome the limitations in the interpretation of LA results which stem from the context-specific nature of learning. Specifically, the study has demonstrated how the interpretation of LA results and recommendation of pedagogical interventions can also be provided at the level of learning processes rather than only in terms of a specific course design.
Peer reviews offer many learning benefits. Understanding students’ engagement in them can help design effective practices. Although learning analytics can be effective in generating such insights, its application in peer reviews is scarce. Theory can provide the necessary foundations to inform the design of learning analytics research and the interpretation of its results. In this paper, we followed a theory-based learning analytics approach to identifying students’ engagement patterns in a peer review activity facilitated via a web-based tool called Synergy. Process mining was applied to temporal learning data traced by Synergy. The theory about peer review helped determine relevant data points and guided the top-down approach employed for their analysis: moving from the global phases to regulation of learning, and then to micro-level actions. The results suggest that theory and learning analytics should mutually inform each other. Mainly, theory played a critical role in identifying a priori engagement patterns, which provided an informed perspective when interpreting the results. In return, the results of the learning analytics offered critical insights about student behavior that was not expected by the theory (i.e., low levels of co-regulation). The findings provide important implications for refining the grounding theory and its operationalization in Synergy.
Little is known about the online learning behaviors of students traditionally underrepresented in STEM fields (i.e., UR-STEM students), as well as how those behaviors impact important learning outcomes. The present study examined the relationship between online discussion forum engagement and success for UR-STEM and non-UR-STEM students, using the Community of Inquiry (CoI) model as our theoretical framework. Social network analysis and nested regression models were used to explore how three different measures of forum engagement—1) total number of posts written, 2) number of help-seeking posts written and replied to, and 3) level of connectivity—were related to improvement (i.e., relative performance gains) for 70 undergraduate students enrolled in an online introductory STEM course. We found a significant positive relationship between help-seeking and improvement and nonsignificant effects of general posting and connectivity; these results held for UR-STEM and non-UR-STEM students alike. Our findings suggest that online help-seeking has benefits for course improvement beyond what can be predicted by posting alone and that one need not be well connected in a class network to achieve positive learning outcomes. Finally, UR-STEM students demonstrated greater grade improvement than their non-UR-STEM counterparts, which suggests that the online environment has the potential to combat barriers to success that disproportionately affect underrepresented students.
Large courses act as gateways for college students and often have poor outcomes, particularly in STEM fields where the pace of improvement has been glacial. Students encounter barriers to persistence like low grades, competitive cultures, and a lack of motivation and belonging. Tailored technology systems offer one promising path forward. In this observational study, we report on the use of one such system, called ECoach, that provides students resources based on their psychosocial profile, performance metrics, and pattern of ECoach usage. We investigated ECoach efficacy in five courses enrolling 3,599 students using a clustering method to group users by engagement level and subsequent regression analyses. We present results showing significant positive relationships with small effect sizes between ECoach engagement and final course grade as well as grade anomaly, a performance measure that takes into account prior course grades. The courses with the strongest relationship between ECoach engagement and performance offered nominal extra credit incentives yet show improved grades well above this “investment” from instructors. Such small incentives may act as a catalyst that spurs deeper engagement with the platform. The impact of specific ECoach features and areas for future study are discussed.
Consistency of learning behaviors is known to play an important role in learners’ engagement in a course and impact their learning outcomes. Despite significant advances in the area of learning analytics (LA) in measuring various self-regulated learning behaviors, using LA to measure consistency of online course engagement patterns remains largely unexplored. This study focuses on modeling consistency of learners in online courses to address this research gap. Toward this, we propose a novel unsupervised algorithm that combines sequence pattern mining and ideas from information retrieval with a clustering algorithm to first extract engagement patterns of learners, represent learners in a vector space of these patterns and finally group them into groups with similar consistency levels. Using clickstream data recorded in a popular learning management system over two offerings of a STEM course, we validate our proposed approach to detect learners that are inconsistent in their behaviors. We find that our method not only groups learners by consistency levels, but also provides reliable instructor support at an early stage in a course.
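A rough sketch of this kind of pipeline is shown below, with sequence pattern mining simplified to per-learner action patterns, TF-IDF weighting borrowed from information retrieval, and k-means clustering; the data and parameter choices are illustrative assumptions rather than the paper’s algorithm.

```python
# Hedged sketch: represent each learner by the session patterns they enact,
# weight patterns TF-IDF style, and cluster learners by similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# One "document" per learner: space-separated session-level action patterns
learner_patterns = {
    "s1": "view-quiz view-quiz read-view read-view view-quiz",
    "s2": "read-view quiz-only read-view view-quiz quiz-only",
    "s3": "quiz-only quiz-only quiz-only quiz-only",
}

vec = TfidfVectorizer(token_pattern=r"\S+")
X = vec.fit_transform(learner_patterns.values())        # learners x patterns
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(dict(zip(learner_patterns, labels)))               # consistency-level groups
```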
This paper reports on a study aimed at identifying training requirements for both staff and students in higher education to enable more widespread use of learning analytics. Opinions of staff and students were captured through ten focus groups (37 students; 40 staff) and two surveys (1,390 students; 160 staff). Participants were predominantly from two higher education institutions in Ireland. Analysis of the results informed a framework for continuous professional development in learning analytics focusing on aspects of using data, legal and ethical considerations, policy, and workload. The framework presented here differentiates between the training needs of students, academic staff and professional services staff.
Asynchronous online discussions within a community of learners can improve learning outcomes through social knowledge construction, but the depth and quality of student contributions often vary widely. Approaches to assessing critical discourse typically use content analysis to identify indicators that correspond to framework constructs, which in turn serve as measures of depth and quality. In the literature, content analysis often addresses only a single construct, although recent work has used both social presence and cognitive presence constructs from the Community of Inquiry (CoI) framework. Nevertheless, there is no effective, commonly used analytic approach to combining insights from multiple perspectives about the quality and depth of online discussions. This paper addresses the gap by proposing the combined use of cognitive engagement (the ICAP framework) and cognitive presence (CoI), and by proposing a network analytic approach that quantifies the associations between the two frameworks and measures the moderation effects of two instructional interventions on those associations. The present study found that these associations were moderated by one intervention but not the other, and that messages labelled with the most common phase of cognitive presence could be usefully assigned to smaller meaningful subgroups by also considering the mode of cognitive engagement.
Research has emphasized that self-regulated learning (SRL) is critically important for learning. However, students differ in their capabilities for regulating their learning processes and in their individual needs. To help students improve their SRL capabilities, we need to identify their current behaviors. Specifically, we applied instructional design to create visible and meaningful markers of student learning at different points in time in LMS logs. We adopted knowledge engineering to develop a framework of proximal indicators representing SRL phases and evaluated them in a quasi-experiment in two different learning activities. A comparison of two sources of collected students’ SRL data, self-reported and trace data, revealed relatively high agreement between our classifications (weighted kappa, κ = .74 and κ = .68). However, our indicators did not always discriminate adjacent SRL phases, particularly the enactment and adapting phases, compared with students’ real-time self-reported behaviors. Our behavioral indicators were also comparably successful at classifying SRL phases for different self-regulatory engagement levels. This study demonstrated how the triangulation of various sources of students’ self-regulatory data could help to unravel the complex nature of metacognitive processes.
The accelerated adoption of digital technologies by people and communities results in a close relation between, on one hand, the state of individual and societal well-being and, on the other hand, the state of the digital technologies that underpin our life experiences. The ethical concerns and questions about the impact of such technologies on human well-being become more crucial when data analytics and intelligent competences are integrated. To investigate how learning technologies could impact human well-being considering the promising and concerning roles of learning analytics, we apply the initial phase of the recently produced IEEE P7010 Well-being Impact Assessment, a methodology and a set of metrics, to allow the digital well-being of a set of educational technologies to be more comprehensively tackled and evaluated. We posit that the use of IEEE P7010 well-being metrics could help identify where educational technologies supported by learning analytics would increase or decrease well-being, providing new routes to future technological innovation in Learning Analytics research.
Reflection plays a critical role in learning by encouraging students to contemplate their knowledge and previous learning experiences to inform their future actions and higher-order thinking, such as reasoning and problem solving. Reflection is particularly important in inquiry-driven learning scenarios where students have the freedom to set goals and regulate their own learning. However, despite the importance of reflection in learning, there are significant theoretical, methodological, and analytical challenges posed by measuring, modeling, and supporting reflection. This paper presents results from a classroom study to investigate middle-school students’ reflection during inquiry-driven learning with Crystal Island, a game-based learning environment for middle-school microbiology. To collect evidence of reflection during game-based learning, we used embedded reflection prompts to elicit written reflections during students’ interactions with Crystal Island. Results from analysis of data from 105 students highlight relationships between features of students’ reflections and learning outcomes related to both science content knowledge and problem solving. We consider implications for building adaptive support in game-based learning environments to foster deep reflection and enhance learning, and we identify key features in students’ problem-solving actions and reflections that are predictive of reflection depth. These findings present a foundation for providing adaptive support for reflection during game-based learning.
Many teachers have come to rely on the affordances that computer-based learning platforms offer in regard to aiding in student assessment, supplementing instruction, and providing immediate feedback and help to students as they work through assigned content. Similarly, researchers commonly utilize the large datasets of clickstream logs describing students’ interactions with the platform to study learning. For the teachers that use this information to monitor student progress, as well as for researchers, this data provides limited insights into the learning process; this is particularly the case as it pertains to observing and understanding the effort that students are applying to their work. From the perspective of teachers, it is important for them to know which students are attending to and using computer-provided aid and which are taking advantage of the system to complete work without effectively learning the material. In this paper, we conduct a series of analyses based on response time decomposition (RTD) to explore student help-seeking behavior in the context of on-demand hints within a computer-based learning platform with particular focus on examining which students appear to be exhibiting effort to learn while engaging with the system. Our findings are then leveraged to examine how our measure of student effort correlates with later student performance measures.
Teachers, like everyone else, need objective reliable feedback in order to improve their effectiveness. However, developing a system for automated teacher feedback entails many decisions regarding data collection procedures, automated analysis, and presentation of feedback for reflection. We address the latter two questions by comparing two different machine learning approaches to automatically model seven features of teacher discourse (e.g., use of questions, elaborated evaluations). We compared a traditional open-vocabulary approach using n-grams and Random Forest classifiers with a state-of-the-art deep transfer learning approach for natural language processing (BERT). We found a tradeoff between data quantity and accuracy, where deep models had an advantage on larger datasets, but not for smaller datasets, particularly for variables with low incidence rates. We also compared the models based on the level of feedback granularity: utterance-level (e.g., whether an utterance is a question or a statement), class session-level proportions by averaging across utterances (e.g., question incidence score of 48%), and session-level ordinal feedback based on pre-determined thresholds (e.g., question asking score is medium [vs. low or high]) and found that BERT generally provided more accurate feedback at all levels of granularity. Thus, BERT appears to be the most viable approach to providing automatic feedback on teacher discourse provided there is sufficient data to fine tune the model.
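For concreteness, here is a toy sketch of the traditional open-vocabulary baseline described (n-gram features with a Random Forest classifier); the utterances and labels are placeholders, and the BERT comparison would be fine-tuned separately.

```python
# Sketch of the open-vocabulary n-gram + Random Forest baseline on toy data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

utterances = ["What do you think causes this?", "Turn to page twelve.",
              "Good, can you explain why?", "Open your books."]
is_question = [1, 0, 1, 0]   # utterance-level label, e.g., question vs. statement

clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)),
                    RandomForestClassifier(n_estimators=200, random_state=0))
clf.fit(utterances, is_question)
print(clf.predict(["Why does the ice melt?"]))
```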
Although online courses can provide students with a high-quality and flexible learning experience, one of the caveats is that they require high levels of self-regulation. This added hurdle may have negative consequences for first-generation college students. In order to better understand and support students’ self-regulated learning, we examined a fully online Chemistry course with high enrollment (N = 312) and a high percentage of first-generation college students (65.70%). Using students’ lecture video clickstream data, we created two indicators of self-regulated learning: lecture video completion and time management. Performing a k-means clustering on these indicators uncovered four distinct self-regulated learning patterns: (1) Early Planning, (2) Planning, (3) Procrastination, and (4) Low Engagement. Early Planning behaviors were especially important for course success—they consistently predicted higher final course grades, even after controlling for important demographic variables. Interestingly, first-generation college students classified as Early Planners achieved at similar levels as their non-first-generation peers, but first-generation students in the Low Engagement group had the lowest average grades among students. Overall, our results show that self-regulation may be an important skill for determining first-generation students’ STEM achievement, and targeting these skills may serve as a useful way to support their specific learning needs.
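The clustering step can be illustrated with a small sketch (toy values; the indicator definitions follow the paper’s description): standardize the two clickstream-derived indicators and apply k-means with four clusters.

```python
# Toy sketch of clustering learners on two SRL indicators:
# lecture-video completion and time management (days before deadline).
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

completion = np.array([0.95, 0.90, 0.40, 0.85, 0.20, 0.70])   # fraction of videos watched
days_early = np.array([5.0, 1.0, 0.5, 4.0, 0.2, 1.5])         # avg. days before deadline

X = StandardScaler().fit_transform(np.column_stack([completion, days_early]))
pattern = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
print(pattern)   # e.g., early planning / planning / procrastination / low engagement
```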
Deep Knowledge Tracing (DKT), which traces a student’s knowledge change using deep recurrent neural networks, is widely adopted in student cognitive modeling. Current DKT models only predict a student’s performance based on the observed learning history. However, a student’s learning processes often contain latent events not directly observable in the learning history, such as partial understanding, making slips, and guessing answers. Current DKT models fail to model this kind of stochasticity in the learning process. To address this issue, we propose Variational Deep Knowledge Tracing (VDKT), a latent variable DKT model that incorporates stochasticity into DKT through latent variables. We show that VDKT outperforms both a sequence-to-sequence DKT baseline and previous SoTA methods on MAE, F1, and AUC by evaluating our approach on two Duolingo language learning datasets. We also draw various interpretable analyses from VDKT and offer insights into students’ stochastic behaviors in language learning.
The conceptualisation of self-regulated learning (SRL) as a process that unfolds over time has influenced the way in which researchers approach analysis. This gave rise to the use of process mining in contemporary SRL research to analyse data about temporal and sequential relations of processes that occur in SRL. However, little attention has been paid to the choice and combinations of process mining algorithms to achieve the nuanced needs of SRL research. We present a study that 1) analysed four process mining algorithms that are most commonly used in the SRL literature – Inductive Miner, Heuristics Miner, Fuzzy Miner, and pMineR; and 2) examined how the metrics produced by the four algorithms complement each other. The study looked at micro-level processes that were extracted from trace data collected in an undergraduate course (N=726). The study found that Fuzzy Miner and pMineR offered better insights into SRL than the other two algorithms. The study also found that a combination of metrics produced by several algorithms improved interpretation of temporal and sequential relations between SRL processes. Thus, it is recommended that future studies of SRL combine the use of process mining algorithms and work on new tools and algorithms specifically created for SRL research.
Despite growing implementation of teacher-facing analytics in higher education, relatively little is known about the detailed processes through which instructors make sense of analytics in their teaching practices beyond their initial encounters with tools. This study unpacked the sensemaking process of thirteen instructors with analytic experience, using interviews that included walkthroughs of their analytics use. Qualitative inductive analysis was used to identify themes related to (1) the questions they asked of the analytics, (2) the techniques they used to interpret them, and (3) the challenges they encountered. Findings indicated that instructors went beyond a general curiosity to develop three types of questions of the analytics (goal-oriented, problem-oriented, and instruction modification questions). Instructors also used specific techniques to read and explain data by (a) developing expectations about the answers the analytics would provide, and (b) making comparisons to reveal student diversity, identify effects of instructional revision and diagnose issues. The study found instructors faced an initial learning curve when seeking and making use of relevant information, but also continued to revisit these challenges when they were not able to develop a routine of analytics use. These findings both contribute to a conceptual understanding of instructor analytic sensemaking and have practical implications for its systematic support.
Assessment and reporting of skills is a central feature of many digital learning platforms. With students often using multiple platforms, cross-platform assessment has emerged as a new challenge. While technologies such as Learning Tools Interoperability (LTI) have enabled communication between platforms, reconciling the different skill taxonomies they employ has not been solved at scale. In this paper, we introduce and evaluate a methodology for finding and linking equivalent skills between platforms by utilizing problem content as well as the platforms’ clickstream data. We propose six models to represent skills as continuous real-valued vectors, and leverage machine translation to map between skill spaces. The methods are tested on three digital learning platforms: ASSISTments, Khan Academy, and Cognitive Tutor. Our results demonstrate reasonable accuracy in skill equivalency prediction from a fine-grained taxonomy to a coarse-grained one, achieving an average recall@5 of 0.8 between the three platforms. Our skill translation approach has implications for aiding the tedious, manual process of taxonomy-to-taxonomy mapping, also called crosswalks, in both the tutoring and standardized testing worlds.
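A hedged sketch of the “machine translation” idea follows: learn a least-squares linear map from one platform’s skill-vector space to another using a handful of anchor pairs, rank candidate skills by cosine similarity, and compute recall@5. All dimensions and data here are synthetic illustrations, not the six proposed models.

```python
# Synthetic illustration of mapping between two skill embedding spaces.
import numpy as np

rng = np.random.default_rng(0)
d_src, d_tgt, n_anchor = 50, 40, 30
S = rng.normal(size=(n_anchor, d_src))            # source-platform skill vectors
T = rng.normal(size=(n_anchor, d_tgt))            # matched target-platform skill vectors

W, *_ = np.linalg.lstsq(S, T, rcond=None)         # least-squares translation matrix

def top_k(src_vec, target_vectors, k=5):
    proj = src_vec @ W
    sims = (target_vectors @ proj) / (
        np.linalg.norm(target_vectors, axis=1) * np.linalg.norm(proj) + 1e-9)
    return np.argsort(-sims)[:k]

# recall@5 over the anchor pairs themselves (for illustration only)
hits = sum(i in top_k(S[i], T) for i in range(n_anchor))
print(f"recall@5 = {hits / n_anchor:.2f}")
```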
Learning analytics (LA) has been presented as a viable solution for scaling timely and personalised feedback to support students’ self-regulated learning (SRL). Research is emerging that shows some positive associations between personalised feedback with students’ learning tactics and strategies as well as time management strategies, both important aspects of SRL. However, the definitive role of feedback on students’ SRL adaptations is under-researched; this requires an examination of students’ recalled experiences with their personalised feedback. Furthermore, an important consideration in feedback impact is the course context, comprised of the learning design and delivery modality. This mixed-methods study triangulates learner trace data from two different course contexts, with students’ qualitative data collected from focus group discussions, to more fully understand the impact of their personalised feedback and to explicate the role of this feedback on students’ SRL adaptations. The quantitative analysis showed the contextualised impact of the feedback on students’ learning and time management strategies in the different courses, while the qualitative analysis highlighted specific ways in which students used their feedback to adjust these and other SRL processes.
Game-Design (GD) environments show promise in fostering Computational Thinking (CT) skills at a young age. However, such environments can be challenging for some students due to their highly open-ended nature. We propose to alleviate this difficulty by learning interpretable student models from data that can drive personalization of a real-world GD learning environment to the student’s needs. We apply our approach to a dataset collected in ecological settings and evaluate the ability of the generated student models to predict ineffective learning behaviors over the course of the interaction. We then discuss how these behaviors can be used to define personalized support in GD learning activities, by conducting extensive interviews with experienced instructors.
In nursing education through team simulations, students must learn to position themselves correctly in coordination with colleagues. However, with multiple student teams in action, it is difficult for teachers to give detailed, timely feedback on these spatial behaviours to each team. Indoor-positioning technologies can now capture student spatial behaviours, but relatively little work has focused on giving meaning to student activity traces, transforming low-level x/y coordinates into language that makes sense to teachers. Even less research has investigated if teachers can make sense of that feedback. This paper therefore makes two contributions. (1) Methodologically, we document the use of Epistemic Network Analysis (ENA) as an approach to model and visualise students’ movements. To our knowledge, this is the first application of ENA to analyse human movement. (2) We evaluated teachers’ responses to ENA diagrams through qualitative analysis of video-recorded sessions. Teachers constructed consistent narratives about ENA diagrams’ meaning, and valued the new insights ENA offered. However, ENA’s abstract visualisation of spatial behaviours was not intuitive, and caused some confusions. We propose, therefore, that the power of ENA modelling can be combined with other spatial representations such as a classroom map, by overlaying annotations to create a more intuitive user experience.
Finding an optimal learning trajectory is an important question in educational systems. Existing Artificial Intelligence in Education (AiEd) technologies mostly use indirect methods to make the learning process efficient, such as recommending content based on difficulty adjustment, weakness analysis, learning theory, psychometric analysis, or domain-specific rules.
In this study, we propose a recommender system that optimizes the learning trajectory of a student preparing for a standardized exam by recommending the learning content (question) that directly maximizes the expected score after the consumption of the content. In particular, the proposed RCES model computes the expected score of a user by effectively capturing educational effects. To validate the proposed model in an end-to-end system, we conducted an A/B test on 1,713 real students by deploying four recommenders in a real mobile application. Results show that RCES has better educational efficiency than traditional methods such as expert-designed models and item response theory-based models.
Collaborative game-based learning environments offer significant promise for creating engaging group learning experiences. Online chat plays a pivotal role in these environments by providing students with a means to freely communicate during problem solving. These chat-based discussions and negotiations support the coordination of students’ in-game learning activities. However, this freedom of expression comes with the possibility that some students might engage in undesirable communicative behavior. A key challenge posed by collaborative game-based learning environments is how to reliably detect disruptive talk that purposefully disrupts team dynamics and problem-solving interactions. Detecting disruptive talk during collaborative game-based learning is particularly important because, if it is allowed to persist, it can generate frustration and significantly impede the learning process for students. This paper analyzes disruptive talk in a collaborative game-based learning environment for middle school science education to investigate how such behaviors influence students’ learning outcomes and vary across gender and students’ prior knowledge. We present a disruptive talk detection framework that automatically detects disruptive talk in chat-based group conversations. We further investigate both classic machine learning and deep learning models for the framework, utilizing a range of dialogue representations as well as supplementary information such as student gender. Findings show that long short-term memory network (LSTM)-based disruptive talk detection models outperform competitive baseline models, indicating that the LSTM-based disruptive talk detection framework offers significant potential for supporting effective collaborative game-based learning through the identification of disruptive talk.
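To ground the modeling description, the following is a compact sketch (not the authors’ architecture) of an LSTM-based chat-message classifier that combines token embeddings with a supplementary feature such as a gender flag; the vocabulary size and dimensions are assumptions.

```python
# Illustrative LSTM classifier for labelling chat messages as disruptive or not.
import torch
import torch.nn as nn

class DisruptiveTalkLSTM(nn.Module):
    def __init__(self, vocab_size, extra_dim=1, emb=64, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb, padding_idx=0)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.head = nn.Linear(hidden + extra_dim, 1)   # disruptive vs. not

    def forward(self, token_ids, extra):
        # token_ids: (batch, seq_len); extra: (batch, extra_dim), e.g., gender flag
        _, (h, _) = self.lstm(self.emb(token_ids))
        return torch.sigmoid(self.head(torch.cat([h[-1], extra], dim=-1)))

model = DisruptiveTalkLSTM(vocab_size=5000)
probs = model(torch.randint(1, 5000, (2, 20)), torch.tensor([[0.], [1.]]))
```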
Learning analytics dashboards (LADs) are designed as feedback tools for learners, but until recently, learners rarely have had a say in how LADs are designed and what information they receive through LADs. To overcome this shortcoming, we have developed a customisable LAD for Coursera MOOCs on which learners can set goals and choose indicators to monitor. Following a mixed-methods approach, we analyse 401 learners’ indicator selection behaviour in order to understand the decisions they make on the LAD and whether learner goals and self-regulated learning skills influence these decisions. We found that learners overwhelmingly chose indicators about completed activities. Goals are not associated with indicator selection behaviour, while help-seeking skills predict learners’ choice of monitoring their engagement in discussions and time management skills predict learners’ interest in procrastination indicators. The findings have implications for our understanding of learners’ use of LADs and their design.
Collaborative problem-solving (CPS) is one of the most essential 21st-century skills for success across educational and professional settings. The hidden-profile paradigm is one of the most prominent avenues for studying group decision making and underlying issues in information sharing. Previous research on the hidden-profile paradigm has primarily focused on static constructs (e.g., group size, group expertise), or on the information itself (whether certain pieces of information are being shared). In the current study, we propose a lens on individuals’ and groups’ collaborative problem-solving skills to explore the relationships between dynamic discourse processes and decision making in a distributed information environment. Specifically, we sought to examine CPS skills in association with decision change and productive decision-making. Our results suggest that while sharing information has a significantly positive association with decision change and effective decision-making, other aspects of social processes appear to be negatively correlated with these outcomes. Cognitive CPS skills, however, exhibit a strong positive relationship with making a (productive) change in students’ final decisions. We also find that these results are more pronounced at the group level, particularly with cognitive CPS skills. Our study sheds light on a more nuanced picture of how social and cognitive CPS interactions are related to effective information sharing and decision making in collaborative problem-solving interactions.
Researchers have been struggling with the measurement of Self-Regulated Learning (SRL) for decades. Instrumentation tools have been proposed to help capture SRL processes that are otherwise difficult to observe. The aim of the present study was to improve measurement of SRL by embedding instrumentation tools in a learning environment and validating the measurement of SRL with these instrumentation tools using think aloud. Synchronizing log data and concurrent think-aloud data helped identify which SRL processes were captured by particular instrumentation tools. One tool was associated with a single SRL process: the timer co-occurred with monitoring. Other tools co-occurred with a number of SRL processes, i.e., the highlighter and note taker captured superficial writing down, organizing, and monitoring, whereas the search and planner tools revealed planning and monitoring. When specific learner actions with the tool were analyzed, a clearer picture emerged of the relation between the highlighter and note taker and SRL processes. By aligning log data with think-aloud data, we showed that instrumentation tool use indeed reflects SRL processes. The main contribution is that this paper is the first to show that SRL processes that are difficult to measure with trace data, such as high cognition and metacognition, can indeed be captured by instrumentation tools. Future challenges are to collect and process log data in real time with learning analytics techniques to measure ongoing SRL processes and support learners during learning with personalized SRL scaffolds.
Many areas of educational research require the analysis of data that have an inherent sequential or temporal ordering. In certain cases, researchers are specifically interested in the transitions between different states—or events—in these sequences, with the goal being to understand the significance of these transitions; one notable example is the study of affect dynamics, which aims to identify important transitions between affective states. Unfortunately, a recent study has revealed a statistical bias with several metrics used to measure and compare these transitions, possibly causing these metrics to return unexpected and inflated values. This issue then causes extra difficulties when interpreting the results of these transition metrics. Building on this previous work, in this study we look in more detail at the specific mechanisms that are responsible for the bias with these metrics. After giving a theoretical explanation for the issue, we present an alternative procedure that attempts to address the problem with the use of marginal models. We then analyze the effectiveness of this procedure, both by running simulations and by applying it to actual student data. The results indicate that the marginal model procedure seemingly compensates for the bias observed in other transition metrics, thus resulting in more accurate estimates of the significance of transitions between states.
Workplace learning often requires workers to learn new perceptual and motor skills. The future of work will increasingly feature human users who cooperate with machines, both to learn the tasks and to perform them. In this paper, we examine workplace learning in Materials Recovery Facilities (MRFs), i.e., recycling plants, where workers separate waste items on conveyor belts before the items are formed into bales and reprocessed. Using a simulated MRF, we explored the benefit of machine learning assistants (MLAs) that help workers, and help train them, to sort objects efficiently by providing automated perceptual guidance. In a randomized experiment (n = 140), we found: (1) A low-accuracy MLA is worse than no MLA at all, both in terms of task performance and learning. (2) A perfect MLA led to the best task performance, but was no better in helping users to learn than having no MLA at all. (3) Users tended to follow the MLA's judgments too often, even when those judgments were incorrect. Finally, (4) we devised a novel learning analytics algorithm to assess the worker's accuracy, with the goal of obtaining additional training labels that can be used for fine-tuning the machine. A simulation study illustrates how even noisy labels can increase the machine's accuracy.
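As one way to picture the final step, the sketch below (not the paper's algorithm) estimates a worker's labeling accuracy from a small gold-labeled subset and uses it to weight that worker's noisy labels when fine-tuning a simple online classifier; all data, names, and the weighting rule are hypothetical.

```python
# Illustrative sketch: weight a worker's noisy labels by their estimated accuracy
# before using those labels to fine-tune the machine. Data and names are hypothetical.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

def estimate_worker_accuracy(worker_labels, gold_labels):
    """Fraction of a worker's labels that agree with gold labels."""
    worker_labels, gold_labels = np.asarray(worker_labels), np.asarray(gold_labels)
    return float((worker_labels == gold_labels).mean())

# Pretend data: item features, worker-provided labels, and a small gold-labeled subset.
X_new = rng.normal(size=(200, 5))
worker_y = rng.integers(0, 2, size=200)            # noisy labels from one worker
gold_idx = rng.choice(200, size=20, replace=False)
gold_y = worker_y[gold_idx].copy()
gold_y[:5] = 1 - gold_y[:5]                        # the worker is wrong on some gold items

acc = estimate_worker_accuracy(worker_y[gold_idx], gold_y)

# Fine-tune an online classifier, down-weighting labels from less accurate workers.
clf = SGDClassifier(random_state=0)
clf.partial_fit(X_new, worker_y, classes=[0, 1],
                sample_weight=np.full(len(worker_y), acc))
```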
Prediction of undergraduate grades before course enrollment can inform students' learning plans for elective courses and provide failure warnings for compulsory courses in Chinese higher education. This study proposes a deep learning-based model composed of sparse attention layers, convolutional neural layers, and a fully connected layer, called Sparse Attention Convolutional Neural Networks (SACNN), to predict undergraduate grades. Concretely, the sparse attention layers account for the fact that courses contribute differently to the grade prediction of the target course; the convolutional neural layers capture one-dimensional temporal features across courses organized by term; and the fully connected layer performs the final classification based on the extracted features. We collected a dataset including grade records, students' demographics, and course descriptions from our institution over the past five years. The dataset contained about 54k grade records from 1307 students and 137 courses, and all of the methods mentioned were evaluated using hold-out evaluation. The results show that SACNN achieves 81% precision and 85% accuracy on failure prediction, outperforming the compared methods. In addition, SACNN offers a potential explanation for its predictions, thanks to the sparse attention layer. This study provides a useful technique for personalized learning and course relationship discovery in undergraduate education.
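A simplified sketch of the described pipeline is given below; it replaces the sparse attention layer with standard softmax attention and uses illustrative dimensions, so it should be read as a rough outline of the architecture rather than the paper's SACNN implementation.

```python
# Simplified sketch: attention over course records, 1-D convolution over terms,
# fully connected classifier. Dimensions and the number of grade classes are illustrative.
import torch
import torch.nn as nn

class GradePredictor(nn.Module):
    def __init__(self, n_courses, emb_dim=32, n_terms=8, n_classes=5):
        super().__init__()
        self.course_emb = nn.Embedding(n_courses, emb_dim)
        # Standard softmax attention stands in for the sparse attention layer.
        self.attn = nn.MultiheadAttention(emb_dim, num_heads=4, batch_first=True)
        self.conv = nn.Conv1d(emb_dim, 64, kernel_size=3, padding=1)
        self.fc = nn.Linear(64 * n_terms, n_classes)

    def forward(self, course_ids, grades):
        # course_ids, grades: (batch, n_terms) -- one (aggregated) course record per term
        x = self.course_emb(course_ids) * grades.unsqueeze(-1)  # grade-scaled embeddings
        x, _ = self.attn(x, x, x)          # weight courses by relevance to the target course
        x = self.conv(x.transpose(1, 2))   # capture temporal patterns across terms
        return self.fc(torch.relu(x).flatten(1))

model = GradePredictor(n_courses=137)
logits = model(torch.randint(0, 137, (4, 8)), torch.rand(4, 8))
```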
Peer effects, an influence that peers can have on one’s learning and development, have been shown to affect student achievement and attitudes. A large-scale analysis of social influences in digital online interactions showed that students interact in online university forums with peers of similar performance. The mechanisms driving this observed similarity remain unclear. To shed light on why similar peers interact online, the current study examined the role of organizing factors in the formation of similarity patterns in online university forums, using four years of forum interaction data from a university cohort. In the study, experiments randomized the timing of student activity, the relationship between student activity levels within specific courses, and the relationship between student activity and performance. The analysis suggests that similarity between students interacting online is shaped by the implications of course design for individual student behaviour, and less so by social processes of selection. Social selection may drive observed similarity in later years of the student experience, but its role is relatively small compared to other factors. The results highlight the need to consider what social influences are enacted by the course design and technological scaffolding of learner behaviour in online interactions, towards diversifying student social influences.
As Learning Analytics (LA) in higher education increasingly transitions from a field of research to an established part of the learner's experience, the demand for practical guidelines to support its development is rising. LA policies bring together different perspectives, such as the ethical and legal dimensions, into frameworks to guide the way. Usually, the first time learners come into contact with LA is when consenting to the LA tool. Utilising an ethical framework (TRUESSEC) and a legal framework (GDPR), we question whether sincere consent is possible in the higher education setting. Drawing upon this premise, we then show how it might be possible to recognise the autonomy of the learner by providing LA as a service rather than an intervention. This could indicate a paradigm shift towards the learner as an empowered demander. Finally, we show how this might be incorporated within the GDPR while also recognising the demand of higher education institutions to use learners' data. These considerations will inform the future development of our own LA policy: an LA criteria catalogue.
In online learning, teachers need constant feedback about their students’ progress and regulation needs. Learning Analytics Dashboards for process-oriented feedback can be valuable tools for this purpose. However, few such dashboards have been proposed in the literature, and most of them lack empirical validation or grounding in learning theories. We present a teacher-facing dashboard for process-oriented feedback in online learning, co-designed and evaluated through an iterative design process involving teachers and visualization experts. We also reflect on our design process by discussing the challenges, pitfalls, and successful strategies for building this type of dashboard.
We propose SAINT+, a successor of the Transformer-based knowledge tracing model SAINT, which separately processes exercise information and student response information. Following the architecture of SAINT, SAINT+ has an encoder-decoder structure where the encoder applies self-attention layers to a stream of exercise embeddings, and the decoder alternately applies self-attention layers and encoder-decoder attention layers to streams of response embeddings and encoder output. Moreover, SAINT+ incorporates two temporal feature embeddings into the response embeddings: elapsed time, the time taken for a student to answer, and lag time, the time interval between adjacent learning activities. We empirically evaluate the effectiveness of SAINT+ on EdNet, the largest publicly available benchmark dataset in the education domain. Experimental results show that SAINT+ achieves state-of-the-art performance in knowledge tracing, with an improvement of 1.25% in area under the receiver operating characteristic curve over SAINT, the previous state-of-the-art model on the EdNet dataset.
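The sketch below illustrates only the decoder-input construction described above, combining response, elapsed-time, and lag-time embeddings; the discretization into buckets and the cap values are assumptions for illustration, not details taken from the paper.

```python
# Sketch of combining a response embedding with elapsed-time and lag-time embeddings
# (here via discretized, capped buckets). Dimensions and caps are illustrative.
import torch
import torch.nn as nn

class ResponseEmbedding(nn.Module):
    def __init__(self, d_model=128, n_elapsed_buckets=301, n_lag_buckets=1441):
        super().__init__()
        self.resp = nn.Embedding(2, d_model)                     # correct / incorrect
        self.elapsed = nn.Embedding(n_elapsed_buckets, d_model)  # seconds, capped
        self.lag = nn.Embedding(n_lag_buckets, d_model)          # minutes, capped

    def forward(self, correct, elapsed_sec, lag_min):
        elapsed_sec = elapsed_sec.clamp(max=300)   # cap elapsed time at 300 s
        lag_min = lag_min.clamp(max=1440)          # cap lag time at one day
        return self.resp(correct) + self.elapsed(elapsed_sec) + self.lag(lag_min)

emb = ResponseEmbedding()
out = emb(torch.tensor([[1, 0]]), torch.tensor([[23, 310]]), torch.tensor([[5, 2000]]))
# 'out' would feed the decoder's self-attention stream alongside positional encodings.
```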
The Elo rating system has been recognised as an effective method for modelling students and items within adaptive educational systems. A common characteristic across Elo-based learner models is that they are not sensitive to the lag time between two consecutive interactions of a student within the system. Implicitly, this characteristic assumes that students neither learn nor forget between two consecutive interactions. However, this assumption seems insufficient in the context of adaptive learning systems, where students may have improved their mastery by practising outside the system or their mastery may have declined due to forgetting. In this paper, we extend existing work on the use of rating systems for modelling learners in adaptive educational systems by proposing a new learner model, called MV-Glicko, that builds on the Glicko rating system. MV-Glicko is sensitive to the lag time between two consecutive interactions of a student within the system and models it through a parameter that captures the confidence of the system in the current inferred rating. We apply MV-Glicko to three public data sets and three data sets obtained from an adaptive learning system and provide evidence that MV-Glicko outperforms other conventional models in estimating students’ knowledge mastery.
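To illustrate the general mechanism (uncertainty about a learner grows with the lag time since their last interaction), the following sketch uses the standard Glicko deviation-inflation step together with a deliberately simplified Elo-like update; it is not the authors' MV-Glicko model.

```python
# Illustrative sketch: in Glicko-style systems, the rating deviation (the system's
# uncertainty about a learner) is inflated with the lag time since the last interaction,
# before the rating is updated from a new response. The update rule here is simplified.
import math

def inflate_deviation(rd, lag_periods, c=30.0, rd_max=350.0):
    """Grow uncertainty with time away from the system (standard Glicko step)."""
    return min(math.sqrt(rd ** 2 + (c ** 2) * lag_periods), rd_max)

def update(student_rating, student_rd, item_rating, correct, lag_periods):
    rd = inflate_deviation(student_rd, lag_periods)
    expected = 1.0 / (1.0 + 10 ** ((item_rating - student_rating) / 400.0))
    k = 0.5 * rd                      # larger uncertainty -> larger update (simplified)
    new_rating = student_rating + k * (correct - expected)
    new_rd = max(rd * 0.9, 30.0)      # observing a response shrinks uncertainty (simplified)
    return new_rating, new_rd

# A student returning after a long break gets a bigger correction for the same response:
print(update(1500, 50, 1500, correct=1, lag_periods=1))
print(update(1500, 50, 1500, correct=1, lag_periods=100))
```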
Predictions from early alert systems are increasingly being used by institutions to assist decision-making and support at-risk individuals. Concept drifts caused by the 2020 SARS-CoV-2 pandemic are threatening the performance and usefulness of the machine learning models that power these systems. In this paper, we present an analytical framework that uses imputation-based simulations to perform a preliminary evaluation of the extent to which data quality and availability issues impact the performance of machine learning models. Guided by this framework, we studied how these issues would impact the performance of the high school dropout prediction model implemented in the Early Warning System (EWS). Results show that despite the disruptions, this model can still be reasonably useful in assisting decision-making. We discuss the implications of these findings in more general educational contexts and recommend steps for countering the challenges of using predictions from imperfect machine learning models in early alert systems and, more broadly, in learning analytics research that uses longitudinal data.
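A minimal sketch of an imputation-based simulation of this kind is shown below, using synthetic data and a stand-in logistic regression model: one feature is progressively masked, imputed, and the resulting drop in AUC is tracked.

```python
# Minimal sketch: progressively mask a feature to mimic data availability issues,
# impute it, and track how much predictive performance degrades. Data are synthetic.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
imputer = SimpleImputer(strategy="mean").fit(X_tr)

for missing_rate in (0.0, 0.3, 0.6, 0.9):
    X_sim = X_te.copy()
    mask = rng.random(len(X_sim)) < missing_rate
    X_sim[mask, 0] = np.nan                      # e.g., attendance no longer recorded
    auc = roc_auc_score(y_te, model.predict_proba(imputer.transform(X_sim))[:, 1])
    print(f"missing={missing_rate:.0%}  AUC={auc:.3f}")
```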
Feedback plays a crucial role in student learning. Learning analytics (LA) has demonstrated potential in addressing prominent challenges with feedback practice, such as enabling timely feedback based on insights obtained from large data sets. However, there is insufficient research examining the relations between student expectations of feedback and their experience with LA-based feedback. This paper presents a pilot study that examined students’ experience of LA-based feedback, offered with the OnTask system, taking into consideration the factors of students’ self-efficacy and self-regulation skills. Two surveys were carried out at a Brazilian university, and the results highlighted important implications for LA-based feedback practice, including leveraging the ‘partnership’ between the human teacher and the computer, and developing feedback literacy among learners.
Despite the potential of spatial displays for supporting teachers’ classroom orchestration through real-time classroom analytics, the process of designing these displays is a challenging and under-explored topic in the learning analytics (LA) community. This paper proposes a mid-fidelity Virtual Prototyping method (VPM), which involves simulating a classroom environment and candidate designs in virtual space to address these challenges. VPM allows for rapid prototyping of spatial features, requires no specialized hardware, and enables teams to conduct remote evaluation sessions. We report observations and findings from an initial exploration with five potential users through a design process utilizing VPM to validate designs for an AR-based spatial display in the context of middle-school orchestration tools. We found that designs created using virtual prototyping sufficiently conveyed a sense of three-dimensionality to address subtle design issues like occlusion and depth perception. We discuss the opportunities and limitations of applying virtual prototyping, particularly its potential to allow for more robust co-design with stakeholders earlier in the design process.
Learning how to solve authentic problems is an important goal of education, yet how to assess and teach problem solving are research topics to be further explored. This study examines how interaction log data from a computerized task environment could be used to extract meaningful features in order to automate the assessment of reflective problem-solving practices. We collected survey responses and interaction log data from 40 college students working to determine the mass of a “mystery object” in an interactive physics simulation. The log data were parsed to reveal both the test trials conducted to solve the problem and the pauses in between test trials, during which potential monitoring and reflection on the problem-solving process took place. The results show that reflective problem-solving practices, as indicated by meaningful pauses, can predict problem-solving performance above and beyond participants’ application of physics knowledge. Our approach to log data processing has implications for how we study problem solving using interactive simulations.
Research indicates that makerspaces equip students with the practical skills needed to build their own projects and thrive in the twenty-first-century workforce. While the appeal of makerspaces lies in their spirit of tinkering and community-driven ethos, these same attributes make it difficult to monitor and facilitate learning. Makerspaces also attract students with diverse backgrounds and skill levels, further challenging facilitators to accommodate the needs of each student and their self-directed projects. We propose a dashboard interface that visualizes Kinect sensor data to aid facilitators in monitoring student collaboration. The tool was designed with an iterative and participatory approach. Five facilitators were involved at each phase of the design process, from need-finding to prototyping to implementation and evaluation. Insights derived from interviews were used to inform the design decisions of the final interface. The final evaluation suggests that the use of normalized summary scores and an interactive network graph can successfully support facilitators in tasks related to improving collaboration. Moreover, the use of a red-green color scheme and the inclusion of student photos improved the usability for facilitators, but issues of trustworthiness need to be further examined.
The rising prevalence of blended learning programs has provided educators with an abundance of information about students' specific educational needs through educator portals. Full implementation of blended learning models requires educators to utilize these data to inform their teaching practices, yet most research on blended learning programs focuses solely on student engagement with the digital learning environment. In this paper, we utilize a longitudinal clustering method to identify patterns of educator portal usage and examine the associations between these clusters and student program outcomes. The clusters of educators varied in intensity and consistency of educator portal access across a school year and were associated with significant differences in student usage and progress in the program. The analyses allowed us to identify preferable educator usage patterns based upon their associated students’ program outcomes, which provides novel information about the potential impact of educator engagement on overall implementation fidelity of blended learning programs.
Collaborative learning has been widely used to foster students’ communication and joint knowledge construction. However, the assignment of learners to well-structured groups is one of the most challenging tasks in the field. The aim of this study is to propose a novel method to form intra-heterogeneous and inter-homogeneous groups based on relevant student characteristics. The method allows for the consideration of multiple student characteristics and can handle both numerical and categorical characteristic types simultaneously. It assumes that the teacher provides an order of importance for the characteristics and then solves the grouping problem as a lexicographic optimization problem in that order. We formulate the problem in mixed integer linear programming (MILP) terms and solve it to optimality. A pilot experiment was conducted with 29 college freshmen, considering three general characteristics (i.e., 13 specific features): knowledge level, demographic information, and motivation. The results of this experiment demonstrate the validity and computational feasibility of the algorithmic approach. Large-scale studies are needed to assess the impact of the proposed grouping method on students’ learning experience and academic achievement.
In this paper, we explore a form of teaching-oriented temporal analytics focused on the timing of support in the context of one-on-one math problem-solving coaching. We build the analytical framework upon human-human multimodal interaction data collected in naturalistic environments. We demonstrate the potential utility of leveraging survival analysis, a class of statistical methods for modelling time-to-event data, to gain insight into these timing decisions. We shed light on the heterogeneity of coaching decisions about when to render support in relation to problem-solving stages, coaching dyads, and pre-intervention event characteristics. This work opens future avenues for a different type of human tutoring study supported by multimodal data, computational models, and statistical frameworks. When further developed, this modelling framework may yield useful reflective teaching analytics for tutors, coaches, or teachers. We also envision that such analyses may ultimately inform the design of AI-supported autonomous agents that could learn the tutorial interaction logic from empirical data.
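For readers unfamiliar with survival analysis, the sketch below fits a Cox proportional hazards model to hypothetical time-to-support data with the lifelines package; the column names and covariates are invented for illustration.

```python
# Minimal sketch: modelling time-to-support with a Cox proportional hazards model.
# The data frame and covariates are toy stand-ins, not the study's actual variables.
import pandas as pd
from lifelines import CoxPHFitter

# One row per problem-solving episode: seconds until the coach intervened, whether an
# intervention actually occurred, and pre-intervention event characteristics.
df = pd.DataFrame({
    "seconds_to_support": [35, 120, 80, 200, 60, 150, 45, 90],
    "support_given":      [1,   1,   0,  1,   1,   0,   1,  1],
    "stage_planning":     [1,   0,   0,  1,   0,   1,   1,  0],
    "recent_errors":      [2,   0,   1,  3,   1,   0,   2,  1],
})

cph = CoxPHFitter()
cph.fit(df, duration_col="seconds_to_support", event_col="support_given")
cph.print_summary()   # hazard ratios: which characteristics hasten or delay support
```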
Gameful pedagogy is a novel approach to teaching that has emerged in the past decade that emphasizes intentionally designing curricula to support student motivation. In this study we investigate how five gameful courses at a large R1 university have impacted students when analyzed with an eye towards equity: are men vs women, underrepresented minorities vs majority students, and first-generation students vs traditional students able to achieve similar amounts of success in gameful courses? Results show that for both men and minority students, there is evidence to suggest they are underachieving in the courses studied as compared to women and majority students, but when we control for prior academic performance these trends disappear. For first-generation students, we see conflicting evidence, with cases of both under and over-achievement present. We again see these discrepancies disappear when we control for prior performance.
For nearly a century, pre-college standardized test scores and undergraduate letter grades have been de facto industry standard measures of achievement in US higher education. We examine a sample of millions of grades and half a million pre-college test scores earned by undergraduates between 2006 and 2019 at a large public research university that became increasingly selective, in terms of test scores of matriculated students, over that time. A persistent, moderate correlation between test score and grades within the period motivates us to employ a simple importance sampling model to address the question, “How much is increased selectivity driving up campus grades?”. Of the overall 0.213 rise in mean undergraduate grade points over the thirteen-year period, we find that nearly half, 0.098 ± 0.004, can be ascribed to increased selectivity. The fraction is higher, nearly 70%, in engineering, business and natural science subjects. Removing selectivity’s influence to surface curricular-related grade inflation within academic domains, we find a factor-of-four range, from a low of ∼ 0.05 in business and engineering to a high of 0.18 in the humanities, over the thirteen-year period.
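The importance-sampling idea can be sketched as reweighting a recent cohort's grades so that its test-score distribution matches an earlier cohort's; the code below does this on synthetic data and should not be read as the paper's exact model.

```python
# Minimal sketch: reweight a recent cohort's grades so its test-score distribution
# matches an earlier cohort's, then compare the reweighted mean GPA with the raw mean.
# Data are synthetic; bin widths and the grade-score relationship are illustrative.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
early = pd.DataFrame({"score": rng.normal(1150, 120, 5000)})
late = pd.DataFrame({"score": rng.normal(1250, 100, 5000)})          # more selective cohort
late["gpa"] = 2.0 + 0.0012 * late["score"] + rng.normal(0, 0.3, 5000)

bins = np.arange(800, 1601, 50)
early_p, _ = np.histogram(early["score"], bins=bins, density=True)
late_p, _ = np.histogram(late["score"], bins=bins, density=True)

# Importance weight for each recent-cohort student: early density / late density of their bin.
idx = np.clip(np.digitize(late["score"], bins) - 1, 0, len(bins) - 2)
weights = np.where(late_p[idx] > 0, early_p[idx] / late_p[idx], 0.0)

raw_mean = late["gpa"].mean()
reweighted_mean = np.average(late["gpa"], weights=weights)
print(f"selectivity's contribution to the GPA rise ≈ {raw_mean - reweighted_mean:.3f}")
```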
To support online learners at a large scale, extensive studies have adopted machine learning (ML) techniques to analyze students’ artifacts and predict their learning outcomes automatically. However, limited attention has been paid to the fairness of prediction with ML in educational settings. This study intends to fill the gap by introducing a generic algorithm that can be combined with existing ML algorithms while yielding fairer results. Specifically, we implemented logistic regression with the Seldonian algorithm and compared the fairness-aware model with fairness-unaware ML models. The results show that the Seldonian algorithm can achieve comparable predictive performance while producing notably higher fairness.
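The sketch below illustrates the general Seldonian pattern (candidate selection on one data split, then a high-confidence safety test on another) with a simple positive-rate-gap constraint and a normal-approximation bound; the fairness measure, bound, and thresholds are illustrative choices, not the study's exact setup.

```python
# Simplified sketch of the Seldonian pattern: train a candidate model, then run a
# high-confidence safety test on a held-out split; return None ("No Solution Found")
# if the upper bound on the fairness violation exceeds a tolerance.
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import LogisticRegression

def safety_test(pred, group, epsilon, delta):
    """Check a (1 - delta) upper bound on the |positive-rate gap| between two groups."""
    p1, p0 = pred[group == 1].mean(), pred[group == 0].mean()
    n1, n0 = (group == 1).sum(), (group == 0).sum()
    se = np.sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    return abs(p1 - p0) + norm.ppf(1 - delta) * se <= epsilon

def seldonian_fit(X, y, group, epsilon=0.10, delta=0.05, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    cand, safe = idx[: len(y) // 2], idx[len(y) // 2:]
    model = LogisticRegression(max_iter=1000).fit(X[cand], y[cand])
    if not safety_test(model.predict(X[safe]), group[safe], epsilon, delta):
        return None        # safety test failed: "No Solution Found"
    return model

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(1000, 4))
    group = rng.integers(0, 2, size=1000)
    y = (X[:, 0] + rng.normal(scale=0.5, size=1000) > 0).astype(int)
    print(seldonian_fit(X, y, group))
```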
While Multimodal Learning Analytics (MMLA) is becoming a popular methodology in the LAK community, most educational researchers still rely on traditional instruments for capturing learning processes (e.g., click-stream, log data, self-reports, qualitative observations). MMLA has the potential to complement and enrich traditional measures of learning by providing high-frequency data on learners’ behavior, cognition, and affect. However, there is currently no easy-to-use toolkit for recording multimodal data streams. Existing methodologies rely on the use of physical sensors and custom-written code for accessing sensor data. In this paper, we present the EZ-MMLA toolkit. This toolkit was implemented as a website that provides easy access to the latest machine learning algorithms for collecting a variety of data streams from webcams: attention (eye-tracking), physiological states (heart rate), body posture (skeletal data), hand gestures, emotions (from facial expressions and speech), and lower-level computer vision algorithms (e.g., fiducial / color tracking). This toolkit can run from any browser and does not require special hardware or programming experience. We compare this toolkit with traditional methods and describe a case study where the EZ-MMLA toolkit was used in a classroom context. We conclude by discussing other applications of this toolkit, potential limitations, and future steps.
Providing students in STEM courses the opportunity to write about scientific content can be beneficial to the learning process. However, it is a logistical challenge to provide feedback on students’ written work in large-enrollment courses. Motivated by these challenges, the study presented herein considers a method to identify the depth of students’ scientific reasoning in their written work. A writing-to-learn (WTL) activity was implemented in a large undergraduate general chemistry class. An analytical framework of cognitive operations that characterizes students’ scientific reasoning evidenced in their writing was applied. Engagement in some of the more complex cognitive operations, such as causal reasoning and argumentation, was a sign that students were properly engaging in meaning-making activities. This work considers a method to automate coaching of students in using more complex reasoning in their writing, with the desired outcome of helping students better engage with the science content. We consider a series of new natural language processing models to discern types of reasoning in student essays from the WTL activity.
In recent years, instructional design has become even more challenging for teaching staff in higher education institutions. If instructional design causes student overload, it can lead to superficial learning and decreased student well-being. A strategy to avoid overload is to reflect on the effectiveness of teaching practices in terms of time-on-task. This article presents a Work-In-Progress study conducted to provide teachers with a dashboard to visualize student self-reports of time-on-task for subject activities. A questionnaire was administered to 15 instructors during a set trial period to evaluate the perceived usability and usefulness of the dashboard. Preliminary findings reveal that the dashboard helped instructors become aware of the number of hours students spent outside of class time. Furthermore, data visualizations of time-on-task evidence enabled them to redesign subject activities. Currently, the dashboard has been adopted by 106 engineering instructors. Future work involves the development of a framework to incorporate user-based improvements.
University students select courses for an upcoming term in part based on expected workload. Course credit hours are often the only institution-provided metric of how much work a course will require, but their coarse granularity makes them an imprecise estimate and can lead students to under- or overestimate workload. We define a novel task of predicting relative effective course credit hours, or time load; essentially, determining which courses take more time than others. For this task, we draw from institutional data sources, including course catalog descriptions, student enrollment histories, and ratings from a popular course rating website. To validate this work, we designed a personalized survey for university students to collect ground truth labels, presenting them with pairs of courses they had taken and asking which course took more time per week on average. We evaluate which data sources, combined with which machine representation techniques, best predict these course time-load ratings. We establish a benchmark accuracy of 0.71 on this novel task and find skip-grams applied to enrollment data (i.e., course2vec), not catalog descriptions, to be most useful in predicting the time demands of a course.
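A minimal sketch of the course2vec idea is shown below: each student's term-ordered enrollment history is treated as a sentence of course IDs and skip-gram embeddings are trained with gensim; the course IDs and hyperparameters are toy assumptions.

```python
# Minimal sketch: treat each student's term-ordered enrollment history as a "sentence"
# of course IDs and train skip-gram embeddings, which can then feed a pairwise
# "which course takes more time" classifier. Data here are toy examples.
from gensim.models import Word2Vec

# Each inner list is one student's enrollment sequence (course IDs are hypothetical).
enrollments = [
    ["MATH101", "CS101", "ENG110", "CS201", "STAT250"],
    ["MATH101", "PHYS121", "CS101", "CS201", "CS301"],
    ["ENG110", "HIST200", "STAT250", "CS101"],
]

model = Word2Vec(
    sentences=enrollments,
    vector_size=64,     # embedding dimensionality
    window=3,           # courses taken close together in a student's history
    sg=1,               # skip-gram
    min_count=1,
    epochs=50,
)

vec = model.wv["CS201"]                         # course embedding used as a feature
print(model.wv.most_similar("CS101", topn=3))   # courses with similar enrollment contexts
```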
Understanding students’ misconceptions is important for effective teaching and assessment. However, discovering such misconceptions manually can be time-consuming and laborious. Automated misconception discovery can address these challenges by highlighting patterns in student data, which domain experts can then inspect to identify misconceptions. In this work, we present a novel method for the semi-automated discovery of problem-specific misconceptions from students’ program code in computing courses, using a state-of-the-art code classification model. We trained the model on a block-based programming dataset and used the learned embedding to cluster incorrect student submissions. We found that these clusters correspond to specific misconceptions about the problem that would not have been easily discovered with existing approaches. We also discuss potential applications of our approach and how these misconceptions inform domain-specific insights into students’ learning processes.
By profiling learners’ music usage in everyday learning settings and characterizing their learning experience when studying with a music app powered by a large-scale, real-world music library, this study offers preliminary observations on how background music affects learning under varying task load and reveals intriguing patterns in learners’ music usage and preferences across task-load conditions. Specifically, we piloted a three-day field experiment in students’ everyday learning environments. During the experiment, participants performed learning tasks with music in the background and completed a set of online surveys before and after each learning session. Our results suggest that learners’ self-selected, real-life background music can enhance their learning effectiveness, and that the beneficial effect of background music is more apparent when the learning task is less mentally or temporally demanding. Taking a closer look at the characteristics of preferred music under different task-load conditions, we found that music chosen by participants under high versus low temporal demand differs in a number of characteristics, including speechiness, acousticness, danceability, and energy. This study furthers our understanding of the effects of background music on learning under varying task-load levels and provides implications for context-aware background music selection when designing musically enriched learning environments.
Automated Writing Evaluation systems have been developed to help students improve their writing skills through the automated delivery of both summative and formative feedback. These systems have demonstrated strong potential in a variety of educational contexts; however, they remain limited in their personalization and scope. The purpose of the current study was to begin to address this gap by examining whether individual differences could be modeled in a source-based writing context. Undergraduate students (n=106) wrote essays in response to multiple sources and then completed an assessment of their vocabulary knowledge. Natural language processing tools were used to characterize the linguistic properties of the source-based essays at four levels: descriptive, lexical, syntax, and cohesion. Finally, machine learning models were used to predict students’ vocabulary scores from these linguistic features. The models accounted for approximately 29% of the variance in vocabulary scores, suggesting that the linguistic features of source-based essays are reflective of individual differences in vocabulary knowledge. Overall, this work suggests that automated text analyses can help to understand the role of individual differences in the writing process, which may ultimately help to improve personalization in computer-based learning environments.
Open Educational Resources (OERs) are openly licensed educational materials that are widely used for learning. Nowadays, many online learning repositories provide millions of OERs, making it exceedingly difficult for learners to find the most appropriate OER among these resources. Consequently, precise OER metadata are critical for providing high-quality services such as search and recommendation. Moreover, metadata facilitate automatic OER quality control, as the continuously increasing number of OERs makes manual quality control extremely difficult. This work performs an exploratory analysis of the metadata of 8,887 OERs. Based on this analysis, we propose metadata-based scoring and prediction models to anticipate the quality of OERs. Our results demonstrate that OER metadata and OER content quality are closely related: we could detect high-quality OERs with an accuracy of 94.6%. Our model was also evaluated on 884 educational videos from YouTube to show its applicability to other educational repositories.
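As a rough illustration of metadata-based quality prediction, the sketch below derives simple features from hypothetical OER metadata records and cross-validates a random forest classifier; the field names, features, and quality label are assumptions, not the study's actual schema.

```python
# Minimal sketch: derive simple features from OER metadata records and train a classifier
# to flag likely low-quality resources. Field names and the quality label are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

records = pd.DataFrame({
    "title":       ["Intro to Cells", "", "Linear Algebra Basics", "Photosynthesis"],
    "description": ["A short primer on cell biology.", "", "", "How plants convert light."],
    "n_keywords":  [5, 0, 2, 7],
    "has_license": [1, 0, 1, 1],
    "is_high_quality": [1, 0, 0, 1],     # label from manual review (hypothetical)
})

X = pd.DataFrame({
    "title_length": records["title"].str.len(),
    "has_description": (records["description"].str.len() > 0).astype(int),
    "n_keywords": records["n_keywords"],
    "has_license": records["has_license"],
})
y = records["is_high_quality"]

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, X, y, cv=2))   # with real data, use a proper train/test split
```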
While data science education has gained increased recognition in both academic institutions and industry, there has been a lack of research on automated coding assessment for novice students. Our work presents a first step in this direction by leveraging coding metrics from traditional software engineering (Halstead Volume and Cyclomatic Complexity) in combination with metrics that reflect a data science project’s learning objectives (the number of library calls and the number of library calls in common with the solution code). Through these metrics, we examined the code submissions of 97 students across two semesters of an introductory data science course. Our results indicate that the metrics can identify cases where students wrote overly complicated code and would benefit from scaffolding feedback. The number of library calls, in particular, was also a significant predictor of changes in submission score and submission runtime, which highlights the distinctive nature of data science programming. We conclude with suggestions for extending our analyses towards more actionable intervention strategies, for example by tracking fine-grained grading outputs throughout a student’s submission history, to better model and support students in their data science learning process.
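One of the data-science-specific metrics, the number of library calls, can be computed directly from a submission's abstract syntax tree; the sketch below does this for a toy Python submission (complexity metrics such as Cyclomatic Complexity could be added with a package like radon).

```python
# Minimal sketch: count calls into imported libraries in a student's submission using
# Python's ast module. The submission text is a toy example.
import ast

submission = """
import pandas as pd
import numpy as np

df = pd.read_csv("data.csv")
df = df.dropna()
print(np.mean(df["score"]))
"""

tree = ast.parse(submission)
imported = {a.asname or a.name for n in ast.walk(tree)
            if isinstance(n, ast.Import) for a in n.names}

library_calls = [
    node.func.value.id + "." + node.func.attr
    for node in ast.walk(tree)
    if isinstance(node, ast.Call)
    and isinstance(node.func, ast.Attribute)
    and isinstance(node.func.value, ast.Name)
    and node.func.value.id in imported
]
print(len(library_calls), library_calls)   # e.g., 2 ['pd.read_csv', 'np.mean']
```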
This paper puts forth the idea of a subversive stance on learning analytics as a theoretically-grounded means of engaging with issues of power and equity in education and the ways in which they interact with the usage of data on learning processes. The concept draws on efforts from fields such as socio-technical systems and critical race studies that have a long history of examining the role of data in issues of race, gender and class. To illustrate the value that such a stance offers the field of learning analytics, we provide examples of how taking a subversive perspective can help us to identify tacit assumptions-in-practice, ask generative questions about our design processes and consider new modes of creation to produce tools that operate differently in the world.