The Viability of Topic Modeling to Identify Participant Motivations for Enrolling in Online Professional Development

Authors

  • Heather Allmond Barker, Elon University
  • Hollylynne S. Lee, North Carolina State University
  • Shaun Kellogg, North Carolina State University
  • Robin Anderson, North Carolina State University

DOI:

https://doi.org/10.24059/olj.v28i1.3571

Keywords:

MOOCs, topic modeling, online professional development, discussion forums, motivation

Abstract

Identifying participants’ motivation for enrolling in MOOCs has been an important way to predict their success rates. However, the motivation themes identified so far largely concern enrolling in any MOOC rather than the specific course being studied. In this study, qualitative coding of discussion forums was combined with topic modeling to identify participants’ motivation for enrolling in two successive online professional development courses in statistics education. Computational text mining, such as topic modeling, is a learning analytics field that has proven effective in analyzing large volumes of text to automatically identify topics or themes. This contrasts with traditional qualitative approaches, in which researchers manually apply labels (or codes) to parts of text to identify common themes. Combining topic modeling and qualitative research may help education researchers and practitioners better understand and improve online learning contexts that feature asynchronous discussion. Three topic modeling approaches were used in this study, including both unsupervised and semi-supervised techniques. The three approaches were validated and compared to determine which one assigned participants motivation themes that most closely aligned with the posts they made in an introductory discussion forum. A discussion of how each technique can be useful for identifying topical themes within discussion forum data is included. Although the three techniques vary in how successfully they identify motivation for enrolling in the MOOCs, they all identify similar motivation themes that are specific to statistics education.

Published

2024-03-01

Section

Massive Open Online Course (MOOC) Research