Using Learning Analytics to Identify Medical Student Misconceptions in an Online Virtual Patient Environment

Eric G Poitras, Laura Naismith, Tenzin Doleck, Susanne P Lajoie


This study aimed to identify misconceptions in medical student knowledge by mining user interactions in the MedU online learning environment. Data from 13,000 attempts at a single virtual patient case were extracted from the MedU MySQL database. A subgroup discovery method was applied to identify patterns in learner-generated annotations and in responses to multiple-choice items on the diagnosis and management of acute myocardial infarction (i.e., heart attack). First, the algorithm generated rules in which single terms from the learner annotations were used to predict incorrect answers to the multiple-choice items. Second, combinations of terms and their relevant synonyms were tested to determine whether including them improved prediction. The second step significantly increased prediction precision and weighted relative accuracy, uncovering four misconceptions at a rate greater than 70%. These findings inform the design of an adaptive system that tailors formative feedback to promote better learning outcomes in clinical reasoning.


Learning Analytics; Online Learning; Misconceptions
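
The two quality measures named in the abstract can be illustrated with a minimal sketch. It scores a rule of the form "annotation mentions a term, therefore the answer is incorrect" using precision and the standard definition of weighted relative accuracy, WRAcc = P(covered) * (P(incorrect | covered) - P(incorrect)). The `Attempt` class, the `rule_quality` function, the substring matching, and the toy attempts are all assumptions made for illustration; they are not the study's data, terms, or implementation. Treating a term and its synonyms as a single "any of these terms" condition is likewise only one possible reading of the second step.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    annotation: str  # free-text learner annotation (hypothetical field)
    incorrect: bool  # True if the multiple-choice item was answered incorrectly


def rule_quality(attempts, terms):
    """Score the rule "annotation mentions any of `terms` -> incorrect answer".

    Returns (precision, wracc), where
      precision = P(incorrect | covered)
      wracc     = P(covered) * (P(incorrect | covered) - P(incorrect))
    """
    n = len(attempts)
    # An attempt is covered if its annotation contains at least one term.
    covered = [a for a in attempts
               if any(t in a.annotation.lower() for t in terms)]
    if n == 0 or not covered:
        return 0.0, 0.0
    precision = sum(a.incorrect for a in covered) / len(covered)
    base_rate = sum(a.incorrect for a in attempts) / n
    wracc = (len(covered) / n) * (precision - base_rate)
    return precision, wracc


# Invented toy attempts, for illustration only (not MedU data).
attempts = [
    Attempt("likely heartburn after meal", True),
    Attempt("acid reflux, give antacid", True),
    Attempt("heartburn but ecg abnormal", False),
    Attempt("st elevation, call cath lab", False),
    Attempt("chest pain radiating to arm", False),
    Attempt("reflux symptoms only", True),
    Attempt("troponin elevated", False),
    Attempt("possible mi, give aspirin", True),
]

# Step 1: a single term as the rule condition.
print(rule_quality(attempts, ["heartburn"]))
# Step 2: the same term widened with an assumed synonym.
print(rule_quality(attempts, ["heartburn", "reflux"]))
```

On this toy data, widening the condition with a synonym raises both measures, which mirrors the direction of the effect reported in the abstract but says nothing about its size in the actual study.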
