Comparing the Factors that Predict Completion and Grades Among For-Credit and Open/MOOC Students in Online Learning

Online education continues to become an increasingly prominent part of higher education, but many students struggle in distance courses. For this reason, there has been considerable interest in predicting which students will succeed in online courses and which will receive poor grades or drop out prior to completion. Effective intervention depends on understanding which students are at risk in terms of actionable factors, and behavior within an online course is one key potential factor for intervention. In recent years, many have suggested that Massive Online Open Courses (MOOCs) are a particularly useful place to conduct research into behavior and interventions, given both their size and the relatively low consequences and costs of experimentation. However, it is not yet clear whether the same factors are associated with student success in open courses, such as MOOCs, as in for-credit courses, an important consideration before transferring research results between these two contexts. While there has been considerable research in each context, differences between course design and population limit our ability to know how broadly findings generalize; differences between studies may have nothing to do with whether students are taking a course for credit or as a MOOC. Do learners behave the same way in MOOCs and for-credit courses? Are the implications for learning different, even for the same behaviors? In this paper, we study these issues by developing models that predict student course success from online interactions, in an online learning platform that serves two distinct student groups (i.e., students who enroll on a for-credit or a noncredit basis). Our findings indicate that our models perform well enough to predict course grades for new students across both of our populations. Furthermore, models trained on one of the two populations were able to generalize to new students in the other population. We find that features related to comments were good predictors of student grades for both groups. Models generated from this research can now be used by instructors and course designers to identify at-risk students among both for-credit and MOOC learners, with an eye toward providing both groups with better support.

Online learning has become increasingly prevalent in higher education, and it is important for educators and researchers to address the challenges associated with this learning setting. In recent years, research on online learning has diverged into two subcategories, with researchers focusing on the efficacy of either for-credit courses or MOOCs. While there are plenty of studies on each, little has been done to investigate how findings from one subcategory can inform the other. Bridging this gap would be valuable, as Massive Online Open Courses (MOOCs) are a particularly useful place to conduct research into behavior and interventions, given both their size and the relatively low consequences and costs of experimentation. At the same time, the different goals held by MOOC learners, many of whom never intend to complete the course they start (Koller, Ng, Do, & Chen, 2013), may result in different patterns of behavior being associated with course completion, a risk for researchers attempting to translate findings between courses.
The present paper asks whether MOOC findings are applicable to for-credit students as well by studying student success across two distinct populations (i.e., students who enroll on a for-credit and a noncredit basis) who are engaging with the same course materials and instructor. By controlling for these key aspects of the learning experience, we can avoid confounds and focus on the differences (or similarities) between for-credit and MOOC learners, comparing online behaviors and student outcomes between these two student groups. Specifically, we develop classification and regression models that predict student course completion in an online learning platform for each of our distinct populations. We then study whether the same models and corresponding features apply across contexts.
In the following sections of our paper, we examine recent research on MOOCs and for-credit courses, emphasizing the gap in the literature about whether findings are comparable across these contexts. The Methods section discusses the data sets we use in our analysis, the course platform, and the algorithms applied to predict student success in our for-credit and open courses. In the Results section, findings from the classification and regression models for each of the student populations are given, followed by results on the generalizability of these models. Lastly, in the Discussion section, we explore the implications of our findings for helping instructors and researchers of different populations identify and support struggling online learners.

Review of Related Literature
Many students struggle in distance and online courses. There is increasing evidence that students enrolled in online courses have higher drop rates (Diaz, 2002; Tyler-Smith, 2008) than in traditional classroom settings where instructors can offer support through face-to-face interactions. The overall low completion rates for online courses raise concerns about eventual student success.
As instructors have fewer opportunities to directly interact with students enrolled in online courses (Beard, 2002; Zhang, 2006), it becomes more difficult for instructors to determine whether all students are on a path to success. One approach that can support students is to use learning analytics. Several previous projects have used analytics to predict whether a student will pass or fail an online course or program. Early projects focused on demographic data and assignment performance (Zafra & Ventura, 2009; Arnold & Pistilli, 2012). However, these features are predictive but not always actionable.
As a result, many researchers have begun to use more granular features of student interaction to predict student achievement in online courses. In one example, Whitmer (2012) found a significant relationship between student usage of a learning management system (LMS) and their achievement. Specifically, they found that participation in online assessment activities, such as completing auto-graded quizzes, had the largest effect in terms of prediction of final grades. In another example, Romero, López, Luna, and Ventura (2013) predicted students' final performance on a course from participation in online discussion forums. They found that students who passed the course actively participated in the forum, posting messages more frequently and writing messages of higher quality as measured by the instructor's evaluation. In a third example, Andergassen, Guerra, Ledermüller, and Neumann (2014) found that the duration of participation in online activities (i.e., the total amount of time spent between the first and last assignments of the course) significantly predicted students' final exam scores. In a fourth example, Baker, Lindrum, Lindrum, and Perkowski (2015) discovered that students with higher grades were more likely to access the course e-book early in the semester, with access to course materials even as early as before the start of the semester being predictive of eventual grade.
Parallel to this research within for-credit courses, there has been increasing work on determining whether students will complete a MOOC, where completion rates are typically lower than in for-credit courses, in part because many learners enter MOOCs without the goal of completing at all (Koller et al., 2013), and grades matter less than whether students complete the course with a certificate. For example, researchers have investigated whether course completion can be predicted by posting behavior and social network participation on course discussion forums (Yang, Sinha, Adamson, & Rose, 2013; Jiang, Warschauer, Williams, O'Dowd, & Schenke, 2014). Students were less likely to drop out when their posts spanned a longer duration between the first and the last post of the current week, suggesting positive impacts from more continual participation (Yang et al., 2013). Students were also less likely to drop out when they interacted with a greater number of students in online discussions. Similarly, Jiang et al. (2014) predicted whether students would earn a certificate of normal completion (basic track) or distinction (scholarly track) from their posting behavior in course discussion forums. Students were more likely to earn a certificate of distinction than normal completion when they actively participated in the forum in the first week. In another example, Morris, Hotchkiss, and Swinnerton (2015) predicted completion rates based on student demographics. Higher completion rates were associated with higher prior educational attainment and prior online experience. Of course, in considering this work, it is important to note that most of the work on student success in MOOCs has focused on completion with a certificate. Using completion rates to indicate student success in a MOOC is neither a perfect nor a comprehensive criterion.
In particular, some have argued that using completion as the metric of student success in open courses obfuscates the various reasons students enroll in a MOOC. Whereas traditional courses take considerable effort to enroll in, cost money, and typically have penalties of various sorts for withdrawing, MOOCs are easy and free to enroll in and have no penalty for withdrawal. As such, it has been argued that many students take MOOCs with more of a goal of sampling or learning specific material (Clow, 2013). However, recent studies suggest that many students who enroll in a MOOC on a noncredit basis have expectations similar to those seen in a traditional course, demonstrating high and sustained levels of engagement. Specifically, Shrader, Wu, Owens-Nicholson, and Santa Ana (2016) found that many noncredit learners watch a considerable percentage of videos and earn high points on quizzes, which are arguably potential indicators of a desire to pass the course. Additionally, while MOOC completion rates can be low, many participants intend to complete. For example, in a study of nine HarvardX courses, precourse surveys indicate that, on average, 58% of students intend to finish the course, 25% intend to audit, 3% intend to browse, and the remaining 14% are uncertain of their goal. Even with many learners having goals other than completion, completion data still offer valuable information to instructors and researchers (Reich, 2014). Completing learners appear to behave differently than other types of disengaged learners, such as "sampling" learners (Clow, 2013) who access content materials only at the beginning of the course. For instance, completing learners post in discussion boards and access in-video assessments significantly more frequently than sampling learners (Kizilcec, Piech, & Schneider, 2013). 
In addition to this, Wang, Paquette, and Baker (2017) indicate that course completion can lead to eventual career advancement, with completers being more likely to submit scientific papers in the field they were studying and being more likely to join scientific societies aligned with the topic of the course.
As can be seen in these two bodies of literature (for-credit courses and MOOCs), many of the same types of features are used when studying MOOCs and for-credit online courses, such as discussion forum participation and use of course resources. However, there has been relatively little work directly comparing for-credit learners and MOOC learners, with each individual study focusing on predicting student achievement within a specific student population: either students who enroll on a for-credit basis or students who enroll on a noncredit (open/MOOC) basis. Simply making comparisons between papers is not sufficient to fully compare these types of environments, because differences between the courses and learning environments studied make it difficult to compare results across papers. As such, there remains a gap in the literature on how students' online interactions correspond to learning outcomes in these student populations. Do the same factors predict success in each of these contexts?
This research question becomes particularly scientifically interesting given recent research suggesting there are important differences between the behaviors of students who enroll in for-credit online courses versus MOOCs. For instance, forum participation appears to be relatively more sustained in for-credit courses (Nistor & Neubauer, 2010) than in MOOCs (Clow, 2013). As such, there appear to be different patterns of engagement between for-credit and MOOC learners, but it is not yet conclusive whether these differences are due to differences between the learners themselves or to some other factor, for example, other features of the courses or platforms investigated across these studies. Previous studies on edX, one of the largest platforms for open courses, consistently show low completion rates, averaging fewer than 10% of participants (Breslow et al., 2014; Ho et al., 2014). As such, it would not be surprising to find lower completion rates for open courses than for for-credit courses. And it is possible that the behaviors associated with successful performance (e.g., good grades) will differ as well. However, there has not yet been work comparing the potential differences in behavior between for-credit and open course learners, and the relationship between these behaviors and student outcomes, when controlling for factors such as platform, course design, and topic. Thus, investigating which factors predict student performance in both groups of learners may also be useful for supporting both of these distinct student populations, as well as for understanding how course format influences student behavior and performance.
In this paper, we therefore compare online behaviors from two groups of learners with different intentions, enrolling on for-credit and noncredit bases, who are using the same course and platform concurrently. Specifically, we develop models based on student interaction and participation to predict final course grades for each population. In doing so, we identify features of student interaction and participation that are intended to be interpretable, both to support instructors in eventual interventions based on these features and for scientific interpretation of our findings. We then examine whether these automated models can generalize from one distinct student population to the other, in order to understand whether the same patterns of interaction are predictive of student success in both populations.

Data Sets
We analyze these issues in the context of data from NextThought, an interactive, online learning platform used to deliver online courses for a range of universities. The NextThought platform emphasizes community and social interaction around course materials, with features such as the ability to create threaded conversations and group-visible notes contextualized within pieces of content, such as videos and readings. Data on online interactions were collected from students who enrolled in an online humanities course at a large public university during the spring and summer terms of 2015. The course enrolled 143 for-credit students and 90 open students (i.e., those enrolled on a noncredit basis). These two groups of students took the course concurrently, sharing the same resources and discussion threads.
The interaction logs of the NextThought platform were distilled to produce the following nine features, grouped into five types. The same features were used to describe the behavior of both for-credit students and open students:

• Readings: number of times a student viewed readings
• Forum: number of times a student viewed a forum
• Video: number of times a student viewed a video
• Comments: number of comments a student created, number of top-level comments a student created (i.e., not replies to other students' comments), number of comments created in response to a student's original comment, and number of comments that were replies to other comments
• Social interaction: number of distinct students who replied to a given student and number of distinct students a given student has replied to

All of these features were included in our predictive analysis of students' final course performance. Students' final average grades were computed based on course requirements for each term. The syllabus specified performance criteria, which the instructor set up to differ between the two student populations. The for-credit students were told that an average grade of 0.70 was equivalent to a C letter grade. The open students were told that completing 200 out of 350 points would result in receiving a badge. We followed the instructor's decision and used different standards of success for the two populations, as applying the same standard to both groups would mean holding one group to a standard it would not normally face outside the context of this study. MOOCs typically allow learners to drop multiple quizzes and have lower cutoffs for obtaining a certificate than for-credit courses.
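As an illustration, distilling raw interaction logs into per-student counts of this kind can be sketched as follows. The event records and event-type names here are hypothetical; the paper does not describe NextThought's actual log schema, which will differ.

```python
from collections import Counter, defaultdict

# Hypothetical event tuples (student_id, event_type); the real
# NextThought log format is not documented in this paper.
events = [
    ("s1", "reading_view"), ("s1", "video_view"), ("s1", "comment"),
    ("s2", "forum_view"), ("s2", "comment"), ("s2", "comment"),
]

def distill_features(event_log):
    """Aggregate raw interaction events into per-student counts,
    one Counter of event types per student."""
    features = defaultdict(Counter)
    for student, event_type in event_log:
        features[student][event_type] += 1
    return features

feats = distill_features(events)
```

Counts such as these correspond to the Readings, Forum, Video, and Comments features above; the social-interaction features would additionally require the reply-thread structure, not just event counts.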
In order to authentically represent whether each student matched the instructor's expectation for successful performance, the instructor's chosen criteria were used as cutoffs in our models to determine whether students passed or failed the course: 0.70 for for-credit students and 0.57 (200 out of 350 points) for open students. Using these differing standards of success represents the authentic expectations set by the instructor for each group and reflects what typically occurs in most online courses, where MOOCs have lower standards for obtaining a certificate than for-credit courses have for passing.
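In code, the pass/fail labeling implied by these cutoffs is a one-line rule. Whether the instructor counted a grade exactly at the cutoff as passing is not stated in the paper, so the greater-than-or-equal comparison below is an assumption.

```python
def passed(grade, for_credit):
    """Label a student as passing using the instructor's cutoffs:
    0.70 for for-credit students, 200/350 (about 0.57) for open students."""
    cutoff = 0.70 if for_credit else 200 / 350
    return grade >= cutoff  # "exactly at cutoff" treated as a pass (assumed)
```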

Data Analysis
We use a predictive analytics/educational data mining paradigm (Romero & Ventura, 2007; Baker & Siemens, 2014) to interpret the performance of these two groups of students. Using this paradigm enables us to combine features of online interactions in more complex patterns and evaluate these patterns in terms of whether they are likely to generalize to new students, rather than in terms of whether findings falsify hypotheses within the current student population (Baker, 2015).
Within this paradigm, we develop a pair of models, one for each population, to predict whether students will pass or fail the course, according to the criterion set by the instructor. Each pass/fail prediction model was built using the J48 implementation of the C4.5 algorithm (Quinlan, 1993). In a J48 decision tree, a sequence of if-then decisions based on features of student interaction culminates in a final prediction of whether the student passed or failed the course, with an associated probability. We assessed the J48 models using two metrics: Cohen's (1960) kappa and the area under the receiver operating characteristic (ROC) curve (AUC) (Hanley & McNeil, 1982). Cohen's kappa assesses the degree to which our models are better than chance at predicting students' final course performance. A kappa of 0 indicates that the model performs at chance, and a kappa of 1 indicates that the model performs perfectly; for example, a kappa of 0.31 would indicate that the model is 31% better than chance. AUC is the probability that the algorithm will correctly distinguish whether a student will pass or fail the online course; it is mathematically equivalent to the Wilcoxon statistic. A model with an AUC of 0.5 performs at chance, and a model with an AUC of 1.0 performs perfectly. Note that AUC was computed at the student level.
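Both metrics are standard and are available as library calls (e.g., scikit-learn's cohen_kappa_score and roc_auc_score). A minimal plain-Python sketch makes their definitions concrete, including AUC's equivalence to the Wilcoxon statistic:

```python
def cohens_kappa(y_true, y_pred):
    """(observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(y_true)
    po = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement from the marginal label frequencies
    pe = sum((y_true.count(c) / n) * (y_pred.count(c) / n)
             for c in set(y_true) | set(y_pred))
    return (po - pe) / (1 - pe)

def auc(y_true, scores):
    """Probability that a randomly chosen positive outscores a randomly
    chosen negative (ties count 0.5); the Wilcoxon/Mann-Whitney form."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```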
We built our models in RapidMiner 5.3 (Mierswa, Wurst, Klinkenberg, Scholz, & Euler, 2006). All of our models were developed using 10-fold cross-validation at the student level (i.e., models are repeatedly trained on nine groups of students and tested on the 10th group), in order to assess how accurate the models and findings will be for new students.
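Student-level fold assignment (as opposed to splitting individual interaction records) can be sketched as follows; RapidMiner's exact fold-assignment procedure is not specified in the paper, so the random partitioning below is illustrative:

```python
import random

def kfold_by_student(student_ids, k=10, seed=0):
    """Yield (train, test) splits of student IDs such that every
    student appears in exactly one test fold."""
    ids = sorted(set(student_ids))
    random.Random(seed).shuffle(ids)
    folds = [ids[i::k] for i in range(k)]  # k near-equal groups
    for i in range(k):
        test = set(folds[i])
        yield set(ids) - test, test
```

All of a student's interaction features then go into whichever side of the split the student's ID landed in, so no student contributes to both training and testing.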
Beyond simply predicting whether a student will pass or fail, we also built models to predict numerical course grades using linear regression with greedy feature selection, again using RapidMiner 5.3. This algorithm repeatedly removes the worst-performing attribute until the Akaike (1973) information criterion (AIC) no longer improves, creating simple models with reduced risk of overfitting (Das, 2008). All predictor variables were standardized using z scores to increase the interpretability of the resulting coefficients; note that this standardization procedure does not influence model goodness or predictive power. Model goodness is assessed using cross-validated correlations (Pearson's r) between the model and the data, and root-mean-square error (RMSE) values. A positive cross-validated correlation indicates the relationship is consistent between the training and test data sets (but has no implication about the relationship's direction); a negative cross-validated correlation indicates that the models obtained from the training data set are worse than chance when applied to new students.
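The general shape of this procedure, backward elimination driven by AIC on an ordinary-least-squares fit, can be sketched with NumPy. RapidMiner's implementation details (and its exact AIC formula) may differ; this is an illustration of the idea, not a reconstruction of the tool.

```python
import numpy as np

def aic(y, X):
    """AIC for an OLS fit with intercept: n * ln(RSS / n) + 2k,
    where k counts the fitted coefficients."""
    Xi = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xi, y, rcond=None)
    rss = float(np.sum((y - Xi @ beta) ** 2))
    return len(y) * np.log(rss / len(y)) + 2 * Xi.shape[1]

def backward_select(y, X, names):
    """Greedy backward elimination: drop the feature whose removal
    lowers AIC the most; stop when no removal improves AIC.
    Predictors may be z-scored first, e.g. X = (X - X.mean(0)) / X.std(0)."""
    keep = list(range(X.shape[1]))
    best = aic(y, X[:, keep])
    while len(keep) > 1:
        scores = [(aic(y, X[:, [j for j in keep if j != i]]), i)
                  for i in keep]
        cand, drop = min(scores)
        if cand >= best:
            break
        best = cand
        keep.remove(drop)
    return [names[i] for i in keep]
```

Because z-scoring is a linear transformation of each predictor, it rescales the coefficients but does not change the model's fit, the AIC values, or which features are selected.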
Beyond testing the models on new students in their original populations, we can test the generalizability of the classification models between the open and for-credit students, building the model on one group and evaluating it on the other. In other words, models trained on the for-credit student population were tested on the open student population, and vice versa; goodness metrics were then calculated to determine how well each of the models generalized to the new population. These tests allow us to compare the degree to which each model (and the patterns it captures) generalizes across the two populations studied here. This also enables us to understand how similar the relationships being modeled are between data sets. In our case, model generalizability tests are important for determining whether the same features are predictive of student success across our two student populations, regardless of any potential preexisting group differences. For example, if a model trained on one population works on the other population, we can conclude that the factors that lead to course completion are similar between the two populations. Testing generalizability provides better evidence than simply looking at whether the models contain the same features, as the exact models found are often only slightly better than several alternate models, which may have different features.
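The evaluation statistics for these cross-population tests are the same ones used within population. For concreteness, Pearson's r and RMSE in plain Python (library equivalents such as scipy.stats.pearsonr work equally well); the model_a/predict names in the comment are hypothetical:

```python
from math import sqrt

def pearson_r(pred, obs):
    """Linear correlation between model predictions and observed outcomes."""
    n = len(pred)
    mp, mo = sum(pred) / n, sum(obs) / n
    cov = sum((p - mp) * (o - mo) for p, o in zip(pred, obs))
    vp = sum((p - mp) ** 2 for p in pred)
    vo = sum((o - mo) ** 2 for o in obs)
    return cov / sqrt(vp * vo)

def rmse(pred, obs):
    """Root-mean-square error of predictions against observations."""
    return sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred))

# Generalizability test: fit a model on population A, then score it,
# unchanged, on population B (and vice versa), e.g.:
#   r_ab = pearson_r(model_a.predict(X_b), y_b)
```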

Results

Descriptives
We found that the average final course grade was considerably higher for students who enrolled in the online course on a for-credit basis (M = 71.48%, SD = 0.30) than for students who enrolled on a noncredit basis (M = 31.61%, SD = 0.32).

Models Predicting Student Pass/Fail
In this section, we examine the overall predictive power of the J48 decision trees predicting whether a student would successfully pass the course, for both groups of students (for-credit and open). We analyze the model predictive power for new students, within group and across groups. We will interpret the features within these models in closer detail in a later section.
The J48 model for open students achieved a cross-validated AUC of 0.674 and a cross-validated kappa of 0.288. For students who enrolled for credit, the J48 model achieved a cross-validated AUC of 0.685 and a cross-validated kappa of 0.470. These results show that both models are successful at predicting whether a student passes or fails, with the for-credit model appearing to perform slightly better than the model developed for the open student population. The finding that the for-credit student model performed better than the open student model is perhaps not surprising, given that open courses likely attract a more diverse collection of students with a wider variety of reasons for enrolling in classes (Clow, 2013). These results suggest that our models are capable of predicting student achievement for new students within each of our two distinct populations.
Across both of our student populations, the J48 models included the following features: the number of comments, the number of replies, the number of forum views, and the number of video views. As shown in Figure 1, the open student model also included additional features not incorporated in the for-credit student model: the number of top-level comments, number of comments created in response to a student's original comment, the number of reading views, and the number of distinct students who replied to a given student. Additionally, as shown in Figure 2, the for-credit model included the number of distinct students a given student has replied to, which was not incorporated in the open student model.
Number of top-level comments a student created <= 3
|   Number of times a student viewed readings <= 37: PASS (100%)
|   Number of times a student viewed readings > 37
|   |   Number of comments a student created = 0
|   |   |   Number of times a student viewed a video <= 7: PASS (100%)
|   |   |   Number of times a student viewed a video > 7
|   |   |   |   Number of times a student viewed a forum <= 10: FAIL (100%)
|   |   |   |   Number of times a student viewed a forum > 10: PASS (75%)
|   |   Number of comments a student created > 0
|   |   |   Number of distinct students who replied to student = 0: PASS (92.86%)
|   |   |   Number of distinct students who replied to student > 0: FAIL (75%)
Number of top-level comments a student created > 3
|   Number of comments created in response to a student's original comment <= 18: FAIL (83.33%)
|   Number of comments created in response to a student's original comment > 18
|   |   Number of comments that were replies to other comments <= 9: PASS (100%)
|   |   Number of comments that were replies to other comments > 9: FAIL (100%)
Figure 1. J48 decision tree predicting whether open students pass or fail the course.

Percentages reflect the model's degree of certainty in an assessment. This tree can be read as follows (for example): If a student makes 3 or fewer top-level comments and views readings 37 or fewer times, the student is predicted to pass 100% of the time. But if a student makes 3 or fewer top-level comments, views readings more than 37 times, creates 0 comments, views videos more than 7 times, and views forums 10 times or fewer, the student is predicted to fail 100% of the time.
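To make the reading rule concrete, the Figure 1 (open-student) tree can be encoded directly as nested conditionals. The function and argument names below are ours, not the platform's, and the model's per-leaf certainty percentages are omitted; each leaf of the printed tree corresponds to one return statement here.

```python
def predict_open(top_level, readings, comments, videos, forums,
                 repliers, replies_received, reply_comments):
    """Nested-if encoding of the open-student J48 tree.
    replies_received = comments created in response to the student's
    original comment; reply_comments = comments that were replies to
    other comments; repliers = distinct students who replied."""
    if top_level <= 3:
        if readings <= 37:
            return "PASS"
        if comments == 0:
            if videos <= 7:
                return "PASS"
            return "FAIL" if forums <= 10 else "PASS"
        return "PASS" if repliers == 0 else "FAIL"
    if replies_received <= 18:
        return "FAIL"
    return "PASS" if reply_comments <= 9 else "FAIL"
```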
Number of comments a student created <= 2: PASS (94.74%)
Number of comments a student created > 2
|   Number of distinct students a given student has replied to <= 2
|   |   Number of comments that were replies to other comments = 0: FAIL (88.89%)
|   |   Number of comments that were replies to other comments > 0
|   |   |   Number of times a student viewed a forum <= 14: PASS (100%)
|   |   |   Number of times a student viewed a forum > 14
|   |   |   |   Number of times a student viewed a video <= 44: FAIL (13.89%)
|   |   |   |   Number of times a student viewed a video > 44: PASS (100%)
|   Number of distinct students a given student has replied to > 2: FAIL (90.52%)
Figure 2. J48 decision tree predicting whether for-credit students pass or fail the course.

Model generalizability.
The J48 model trained on the open population performed considerably better than chance when tested on new for-credit students, with an AUC of 0.745 and a kappa of 0.495. The J48 model trained on the for-credit population was well above chance when applied to new open students, with an AUC of 0.629 and a kappa of 0.400. These numbers, on average, were reasonably close to the numbers the models achieved on their original data sets. Overall, these results suggest that the classification models trained on a single student population generalize well to the other student population, and that there are similarities between the factors associated with passing in the two populations.

Linear Regression Models
After modeling whether students passed or failed their courses, we generated models to predict numerical grades, both because these are useful in their own right and because the components of these models are typically easier to interpret than the internals of decision trees.
Individual-feature models. Our first step was to generate linear regression models for each data feature, taken individually, in order to understand how each feature correlates with student grade. We did this separately for each group of students (for-credit and open). Table 1 shows the relationship of each of the individual features with student achievement among the open students. For this group of students, each of our features was found to significantly and positively predict students' final average grades. In other words, every behavior was positively correlated with student grade. Table 2 provides a summary of the relationship of each individual feature with student grades in the for-credit student population. For these students, as with the students taking the course on an open basis, each of our features had a positive and statistically significant relationship with students' final grades.

Multiple-feature models. When we combined these features into a single model, the resultant model for students who enrolled on an open basis achieved a cross-validated correlation of 0.592, with an RMSE of 0.263. This value of cross-validated correlation was 0.28 higher than that of the best individual-feature model. Given this level of cross-validated correlation, the model is likely to generalize to new students better than chance. Table 3 provides a summary of the best-fitting linear regression model for students who enrolled in online courses without credit. As seen in Table 3, the multifeature model predicting final grades for the open students incorporated several features. The number of times a student viewed the forum and the number of comments a student created were positively correlated with the student's grade, when controlling for other variables in the model. This aligns with the results from the individual-feature models.
However, the relationships of the number of replies a student made and the number of comments created in response to a student's original comment both reversed when controlling for other variables in the full model. This suggests that once we control for how much students view and interact with the forum overall, replying to comments is not characteristic of successful students. This may in part be due to the presence of students who posted many irrelevant comments.
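The individual-feature analysis described above can be sketched as follows: one ordinary least squares regression per behavioral feature, scored by the correlation between cross-validated predictions and final grades. Note that the feature names, the `DataFrame` layout, and the choice of 10-fold cross-validation are illustrative assumptions, not details reported by the paper.

```python
# Sketch of the individual-feature models: fit one single-predictor
# linear regression per behavioral feature and report its
# cross-validated correlation with the final grade.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

# Hypothetical feature names standing in for the interaction counts
# (forum views, reading views, comments created, etc.).
FEATURES = [
    "forum_views", "reading_views", "video_views",
    "comments_created", "top_level_comments", "replies_received",
]

def single_feature_fits(df: pd.DataFrame, target: str = "grade") -> dict:
    """For each feature, fit a one-predictor regression and return the
    correlation between cross-validated predictions and actual grades."""
    results = {}
    y = df[target].to_numpy()
    for feat in FEATURES:
        X = df[[feat]].to_numpy()
        preds = cross_val_predict(LinearRegression(), X, y, cv=10)
        results[feat] = float(np.corrcoef(preds, y)[0, 1])
    return results
```

A positive cross-validated correlation for every feature, as in Tables 1 and 2, corresponds to every entry of the returned dictionary being greater than zero.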
For students who enrolled on a for-credit basis, we again combined the original set of features into a single model using greedy feature selection. The resultant model of for-credit student achievement achieved a cross-validated correlation of 0.681 with an RMSE of 0.221, performing slightly better than the full model for open students. However, this multifeature model does not perform much better than the best individual model, which achieved a cross-validated correlation of 0.678. That feature alone, the number of top-level comments the student created, is very predictive, making it difficult for a combined model to perform much better. In fact, only one other feature, the number of times the student viewed readings, enters into the combined model. Table 4 shows the best-fitting linear regression model for students who enrolled in online courses for credit.
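The greedy feature selection used to build the multiple-feature models can be sketched as a forward search: repeatedly add whichever remaining feature most improves the cross-validated correlation, and stop when no candidate helps. The stopping rule and fold count here are assumptions about the procedure, not details specified in the text.

```python
# Sketch of greedy forward feature selection driven by cross-validated
# correlation, the criterion used to compare the combined models.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

def greedy_select(X: np.ndarray, y: np.ndarray, names: list, cv: int = 10):
    """Add one feature at a time, keeping an addition only if it improves
    the cross-validated correlation with the target. Returns the selected
    feature names and the final score."""
    chosen, remaining = [], list(range(X.shape[1]))
    best_score = -np.inf
    while remaining:
        scored = []
        for j in remaining:
            cols = chosen + [j]
            preds = cross_val_predict(LinearRegression(), X[:, cols], y, cv=cv)
            scored.append((np.corrcoef(preds, y)[0, 1], j))
        score, j = max(scored)
        if score <= best_score:
            break  # no remaining feature improves the model; stop
        best_score = score
        chosen.append(j)
        remaining.remove(j)
    return [names[j] for j in chosen], float(best_score)
```

This mirrors the pattern reported for the for-credit model: when one feature is already highly predictive, later candidates rarely clear the improvement threshold, so the selected set stays small.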

Table 4

Features                                           Coefficient
Number of times a student viewed readings          0.036
Number of top-level comments a student created     0.195****
Constant                                           0.715****

Note. *p < 0.05. **p < 0.01. ***p < 0.001. ****p < 0.0001.

Model generalization. We next applied each model to the other data set to understand whether the patterns that predict better grades in for-credit students also predict better grades in open students, and vice versa. When we did so, we found that the multifeature model trained on the open student population performed well when tested on the population of for-credit students (correlation on the new data set = 0.573, RMSE = 0.472). Similarly, the multifeature model trained on the for-credit student population also performed well when tested on the population of open students (correlation on the new data set = 0.638, RMSE = 0.471). The correlations remain respectable in both cases; however, there is substantial degradation in the RMSE values. This degradation is likely due to the overall higher average grades in the for-credit student population relative to the open student population. As there were different cutoffs for passing in the two populations, it stands to reason that the students would set different goals, and therefore the same behaviors would be associated with different absolute grades (even as they were associated with the same relative grades).
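The generalization check above amounts to fitting the regression on one population and scoring it on the other. A minimal sketch, assuming two feature matrices and grade vectors for the two populations (the function and variable names are illustrative):

```python
# Sketch of the cross-population generalization test: train the
# multiple-feature regression on one student population and report
# correlation and RMSE on the other population.
import numpy as np
from sklearn.linear_model import LinearRegression

def cross_population_eval(X_train, y_train, X_test, y_test):
    """Fit on one population, evaluate on the other. Returns the
    correlation and root-mean-squared error on the held-out population."""
    model = LinearRegression().fit(X_train, y_train)
    preds = model.predict(X_test)
    corr = float(np.corrcoef(preds, y_test)[0, 1])
    rmse = float(np.sqrt(np.mean((preds - y_test) ** 2)))
    return corr, rmse
```

Note how a constant shift in grade levels between populations, like the different passing cutoffs described above, inflates RMSE while leaving the correlation high: correlation is invariant to a shift in the mean, but RMSE is not.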

Discussion and Conclusion
In this paper, we studied which factors in student interaction during an online course predict their success, using classification models to predict whether students pass or fail, and regression models to predict students' numerical grades. One of our core goals was to investigate whether the same features predict student success for students taking the course on a for-credit basis and for those taking it on an open basis. Taken individually, each of our features was a significant predictor of numerical grades within each student group and across both student populations. In other words, the same features were predictive in the two populations. One key finding is that features related to comments were significant predictors of final grades across open and for-credit students. This result corroborates previous findings that show a significant relationship between students' online comments and final course grades (Wang & Newlin, 2000; Wang, Newlin, & Tucker, 2001). More specifically, our results also reveal that top-level comments are associated with better course grades, even when controlling for other variables in the model. This finding generalizes across both open and for-credit learners. With the considerable amount of information in discussion forums, it becomes difficult for instructors of both open and for-credit courses to track and determine which specific types of comments best support students in their online learning (e.g., Mazzolini & Maddison, 2005). But if instructors know to focus on top-level comments, they can concentrate their efforts there. As such, studying the content of top-level comments may be a promising approach to evaluating what is engaging students within noncredit and for-credit courses. This result is also a first step toward identifying which aspects of the online learning experience are general across these two settings.
Within the open student population, both classification and regression models included the following features: number of times a student viewed a forum, number of distinct students who replied to a given student, number of comments a student created, number of comments created in response to a student's original comment, and number of comments that were replies to other comments. In addition to these, the classification model for open students also incorporated the following features that were not included in the for-credit classification model: the number of times a student viewed readings, the number of times a student viewed videos, and the number of top-level comments a student created.
Within the for-credit student group, the classification and regression models differed from each other. For-credit classification models incorporated the number of comments, the number of distinct students a given student replied to, the number of forum views, the number of replies, and the number of video views. By contrast, only the number of reading views and the number of top-level comments were included in the regression model. This finding suggests that the factors important for classifying whether for-credit students pass or fail were not necessarily the same as the predictive factors for finer-grained outcomes of student achievement. In particular, features related to videos, forums, comments, and social interaction are predictive of students' success or failure, while the combination of features related to readings and a certain type of comment is predictive of students' numerical grades.
All models performed well at predicting student grades or pass-fail status within both populations. Their performance was slightly better for the for-credit students for both types of models. The ability to predict student achievement from automated models of student online interactions has potential implications for educational interventions. As our findings indicate that our predictive models of students' success or failure can be applied to new students, this creates the possibility of early remediation for students who are struggling (Arnold & Pistilli, 2012). By identifying struggling students early, instructors can monitor their students' needs and provide scaffolding before these students fall too far behind. However, as the models were not perfect, there is still a possibility of classifying students in the wrong category, suggesting that these models should be used with "fail-soft interventions," where there are limited risks associated with a student incorrectly receiving an intervention. An example of this is seen in Course Signals, an early warning system that triggers interventions designed by instructors to help students at risk of failing the course. While the interventions are recommended by an automated algorithm, they are mediated by instructors who make the final decision on whether to send a message and how to tailor it based on what they know about that specific student. In general, initial results show that the intervention was associated with higher student performance and retention, and that students who receive the intervention feel that instructors care about them (Pistilli & Arnold, 2010; Arnold & Pistilli, 2012).
A similar approach was adopted by Jayaprakash, Moody, Lauría, Regan, and Baron (2014), who sent instructors notifications about at-risk students and scaffolded the instructors in sending struggling students emails regarding consultation hours, web-based resources on online tutoring tools, or additional practice exercises for the course materials, resulting in higher final grades than students in a control group across multiple universities. Each of these systems mitigates the risk of incorrect classification by putting the control over intervention in the instructor's hands, and by scaffolding intervention strategies with low risk of harm even when given to students who ultimately do not need intervention.
In the specific case of this course, our results show a positive relationship between top-level comments and student outcomes. NextThought affords instructors multiple opportunities for facilitating students' creation of top-level posts. For example, by posting conversation prompts within the discussion margins of the course content, instructors can provide a model for top-level comments that students can replicate. These prompts help raise the visibility of course materials that are worthy of reflection and comments. In addition, instructors can encourage students to produce top-level comments by incorporating thought questions within nongraded reading assignments. These questions, accompanied by an invitation to post reflections in the discussion margins, provide a framework for generating top-level student comments within the course forums. Finally, when posting at the top level, students can also be encouraged to consider making comments rather than posting questions, as questions encourage responses to the original post instead of promoting other top-level comments.
More broadly, these results also have potential implications for supporting student success in universities that offer courses like this one. Given our findings, it may be valuable for course instructors to message inactive students and encourage them to start new discussion threads within forums. Sending course-related reminders may be a useful instructional strategy, as students generally respond positively to these types of emails and messages (cf. Arnold & Pistilli, 2014). Based on work by Belcher, Hall, Kelley, and Pressey (2015), another way instructors can encourage students to become thread starters is to praise them for posting a top-level comment and to summarize the content of these posts within the thread. These strategies could potentially motivate students to initiate top-level comments and think more critically about them, as they know their top-level posts will be attended to by the instructor. These instructor behaviors have also been found to promote higher levels of critical thinking within peer interactions (Belcher et al., 2015).
Models predicting student success and grade used many of the same features across both for-credit and open students. Beyond that, a model built for each group of students functioned effectively for the other group of students. This result indicates that the same factors predict course completion even in different groups of students who have enrolled either on a for-credit or noncredit basis. While these two populations likely differ in their motivations, the same factors are associated with completion in both. The relationship between interaction behaviors and success is similar for these two distinct populations of students, in terms of patterns and combinations of features as well as individual features. This finding, in turn, has implications for synthesizing literature on MOOCs and for-credit courses, which have largely been treated by researchers as disparate bodies of work. Our results suggest that many findings from the expanding body of research on success in MOOCs may be relevant to for-credit online courses, a useful finding given the much larger samples of students available in many open courses (if not in the specific course studied here).
In considering these findings, it is essential to note their limitations. One key limitation is that the research presented here was conducted in a single online course, with both for-credit and open learners, a somewhat unusual arrangement (though characteristic of the NextThought platform).
The sample of students was typical for for-credit courses but was smaller than the sample seen in many xMOOCs offered by the largest MOOC vendors, such as edX, Coursera, and FutureLearn. However, not all MOOCs are xMOOCs. In particular, the original use of the term MOOC referred to connectivist MOOCs (cMOOCs), whose design is based more on student-to-student communication, definition of questions, and selection of resources (McAuley, Stewart, Siemens, & Cormier, 2010). While the course studied here is not a cMOOC per se, it shares with cMOOCs the strong focus on student-to-student communication. It is common for cMOOCs to have sizes more comparable to what is seen in this study, rather than the massive sizes seen in xMOOCs from the largest providers. However, this specific atypicality does limit our ability to generalize our findings somewhat. As is the case with most MOOC studies, findings from a specific course are often difficult to generalize beyond the local context. For instance, while for-credit and open learners were treated as discrete sets of learners here, not all online courses will necessarily apply the same categorization. With the rise of MicroMasters, students who initially enroll in a MOOC can also pursue and eventually earn a master's degree. As such, future work is recommended to test whether these models remain valid when extended to a wider sample of learners, as well as to a broader range of MOOCs. Despite these limitations, the findings from our automated models provide information on which factors of student online interactions promote achievement and learning. Additionally, our study shows which findings hold across different groups of students within the current course, and future work can test which replicate across additional courses.
By better understanding the factors in student interaction (whether in for-credit or open courses) that are associated with greater success, we can design interventions that instructors can use to nudge students toward better learning trajectories and better outcomes.