Chapter 19

Handbook of Learning Analytics
First Edition

Predictive Modelling of Student Behavior Using Granular Large-Scale Action Data

Steven Tang, Joshua C. Peterson & Zachary A. Pardos


Abstract

Massive open online courses (MOOCs) generate a granular record of the actions learners choose to take as they interact with learning materials and complete exercises towards comprehension. With this high volume of sequential data and choice comes the potential to model student behaviour. Several methods exist for analyzing longitudinal, sequential data like those recorded from learning environments. In the field of language modelling, traditional n-gram techniques and modern recurrent neural network (RNN) approaches have been applied to find structure in language algorithmically and to predict the next word given the previous words in the sentence or paragraph as input. In this chapter, we draw an analogy to this work by treating student sequences of resource views and interactions in a MOOC as the inputs and predicting students’ next interaction as the output. Our approach learns the representation of resources in the MOOC without requiring any explicit feature engineering. This model could potentially be used to generate recommendations for which actions a student ought to take next to achieve success. Additionally, such a model automatically generates a student behavioural state, allowing for inference on performance and affect. Given that the MOOC used in our study had over 3,500 unique resources, predicting the exact resource that a student will interact with next might appear to be a difficult classification problem. We find that the syllabus (the structure of the course) gives, on average, 23% accuracy in making this prediction, followed by the n-gram method with 70.4% and RNN-based methods with 72.2%. This research lays the groundwork for behaviour modelling of fine-grained time series student data using feature-engineering-free techniques.
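As a rough illustration of the n-gram idea described in the abstract, the sketch below counts which resource tends to follow each short run of previous resources and predicts the most frequent follower, backing off to a shorter context when the longer one was never seen. It is a minimal sketch under stated assumptions, not the chapter's implementation: the resource names, the toy sequences, the backoff rule, and the function names `train_ngram` and `predict_next` are all hypothetical.

```python
from collections import Counter, defaultdict


def train_ngram(sequences, n=3):
    """Count followers for every context of 1 to n-1 previous resources."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(1, len(seq)):
            for k in range(1, n):
                if i - k < 0:
                    break  # not enough history for a context of length k
                context = tuple(seq[i - k:i])
                counts[context][seq[i]] += 1
    return counts


def predict_next(counts, history, n=3):
    """Return the most frequent follower, trying the longest context first."""
    for k in range(min(n - 1, len(history)), 0, -1):
        context = tuple(history[-k:])
        if context in counts:
            return counts[context].most_common(1)[0][0]
    return None  # no context of any length was seen in training


# Hypothetical student sequences of resource interactions (illustrative only).
sequences = [
    ["video_1", "problem_1", "video_2", "problem_2"],
    ["video_1", "problem_1", "video_2", "forum"],
    ["video_1", "video_2", "problem_2"],
]
counts = train_ngram(sequences, n=3)
prediction = predict_next(counts, ["video_1", "problem_1"], n=3)
```

An RNN-based model replaces the count table with a learned hidden state, but the input and output framing (previous interactions in, next interaction out) stays the same.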



About this Chapter

Title
Predictive Modelling of Student Behavior Using Granular Large-Scale Action Data

Book Title
Handbook of Learning Analytics

Pages
pp. 223-233

Copyright
2017

DOI
10.18608/hla17.019

ISBN
978-0-9952408-0-3

Publisher
Society for Learning Analytics Research

Authors
Steven Tang
Joshua C. Peterson
Zachary A. Pardos

Author Affiliations
Graduate School of Education, UC Berkeley, USA

Editors
Charles Lang (Teachers College, Columbia University, USA)
George Siemens (LINK Research Lab, University of Texas at Arlington, USA)
Alyssa Wise (Learning Analytics Research Network, New York University, USA)
Dragan Gašević (Schools of Education and Informatics, University of Edinburgh, UK)
