Assessing Readiness for Online Education – Research Models for Identifying Students at Risk

This study explored the interaction between student characteristics and the online environment in predicting course performance and subsequent college persistence among students in a large urban U.S. university system. Multilevel modeling, propensity score matching, and the KHB decomposition method were used. The most consistent pattern observed was that native-born students were at greater risk online than foreign-born students, relative to their face-to-face outcomes. Having a child under 6 years of age also interacted with the online medium to predict lower rates of successful course completion online than would be expected based on face-to-face outcomes. In addition, while students enrolled in online courses were more likely to drop out of college, online course outcomes had no direct effect on college persistence; rather, other characteristics seemed to make students simultaneously both more likely to enroll online and to drop out of college.

Conway, K., Hachey, A., & Wladis, C. (2016). Assessing Readiness for Online Education – Research Models for Identifying Students at Risk. Online Learning, 20(3), 97-109.


Introduction
Higher education is undergoing a virtual transformation as course offerings move from in-person to Internet-based instruction. By 2013, over 40 million college students took online classes worldwide; by 2017, that number should triple (Atkins, 2013). Because of this rapid growth, many institutions have had to make policy decisions about online learning before adequate evidence was available. For example, more institutions are requiring all students to take online courses, despite evidence that this may decrease course and college completion for some students (Jaggars, 2011). Moreover, most colleges use surveys to screen out students at risk online, despite the fact that no online readiness surveys have yet been validated as predictors of differential online versus face-to-face outcomes (Wladis & Samuels, 2016). Online courses can provide increased access to college, but because they often have higher attrition (the reasons for which are not yet well understood), they may also be stumbling blocks to degree completion. On the other hand, restricting access to online courses may impede the college progress of "non-traditional" students who need the flexibility that online learning affords. The rapid growth of online learning will likely change the very nature of higher education over the coming decades. If policies guiding the implementation of online learning are to be grounded in research evidence, education research must keep up with these trends. In particular, it is essential to identify which students are at higher risk online. Additionally, the results of this study suggest that online students are not more likely to drop out of college immediately after, or due to, the outcomes of the online course; rather, it seems that other student characteristics may be significant in determining college persistence.

Research questions
This study explores the relationship among student characteristics, online course-taking, and course and college persistence. Specifically, we ask:
1. Which student characteristics exacerbate or mitigate differences in rates of online versus face-to-face successful course completion?
2. To what extent do online course outcomes explain subsequent college dropout rates?

Theoretical framework and prior research

Online Outcomes
Numerous studies, including a meta-analysis of over 200 studies, have found no significant difference in learning outcomes in online versus face-to-face courses (e.g., Bernard et al., 2004; Bowen, Chingos, Lack, & Nygren, 2012). Yet online course dropout rates range from 20-40% (e.g., Pierrakeas, Xenos, Panagiotakopoulos, & Vergidis, 2004), and online attrition rates have been reported as 7-20 percentage points higher than those for face-to-face courses (e.g., Nora & Snyder, 2009; Patterson & McFadden, 2009). There is little research on the effects of online course-taking on college persistence and completion, and what results are available are mixed (see, e.g., Shea & Bidjerano, 2014; Xu & Jaggars, 2011). However, examining student characteristics may help to predict which students are at highest risk online.

Student characteristics and online enrollment
Online learners are more likely to be female, older, married, active military or to have other responsibilities (e.g., full-time work, children), and are more likely to have other "non-traditional" characteristics (e.g., delayed college enrollment; no high school diploma; part-time enrollment; financially independent) (Shea & Bidjerano, 2014; Wladis, Hachey, & Conway, 2015). Studies have also found that online students tend to have higher academic preparation and higher G.P.A.s, to be white and native English speakers, and to be more likely to have applied for or received financial aid (Conway, Wladis, & Hachey, n.d.; Jaggars & Xu, 2010; Xu & Jaggars, 2011). Online learning also seems to attract a larger proportion of first-generation college students (Athabasca University, 2006). However, research on demographic variables is conflicting (Jones, 2010), and it is unclear how differing characteristics interact to affect student retention in online courses.

Student characteristics and online outcomes
Student skills and psychological attributes may be less predictive of online outcomes than other factors, or no less applicable to success online versus face-to-face. Bernard, Brauer, Abrami, and Surkes (2004) found that self-direction and beliefs were significant positive predictors of online course grade, but that G.P.A. was a stronger predictor of online course outcomes. Waschull (2005) found that self-discipline/motivation was significantly correlated with course grades online, but concluded that the same factors may predict success in both online and face-to-face classes. Aragon and Johnson (2008) found that online completers were more likely to be female, to be enrolled in more classes, and to have a higher G.P.A., but they found no significant difference in academic readiness or self-directed learning.
Other investigations of student characteristics have also been inconclusive. Some gender studies found no differences, whereas others report that females outperform males (for a review, see Xu & Jaggars, 2013). Angiello (2002) and Xu and Jaggars (2013) report differences for Hispanic and Black students in comparison to White students, while Welsh (2007), Aragon and Johnson (2008), and Wladis, Conway, and Hachey (2015) found that ethnicity was not related to online course outcomes more so than face-to-face outcomes. G.P.A. has been identified as a significant factor affecting online course outcomes in some studies (e.g., Xu & Jaggars, 2013), but not others (e.g., Hachey, Wladis, & Conway, 2012).
To accurately assess whether a factor puts a student at greater risk in the online environment specifically, it is essential to analyze the interaction between that factor and course medium, while simultaneously controlling for self-selection into online courses. Only a few studies consider these interactions, and while these studies controlled for some self-selection factors, all of them excluded important predictors. Xu and Jaggars (2013) found that Black students and students with lower G.P.A.s did worse online than would be expected based on their face-to-face performance, and that women and older students did better than expected online. Wladis, Conway, and Hachey (2015) found that older students did significantly better, and that women did significantly worse online, than would be expected based on their outcomes in comparable face-to-face courses, but that there was no significant interaction between the online medium and ethnicity. But neither of these studies controlled for whether a student had children, among other factors.
This study addresses an important gap in the literature by considering which factors may predict differential online versus face-to-face performance while also controlling for a wide array of student characteristics related to self-selection into online courses.

Data source and sample
This study used a sample of 9,663 students with 37,442 course records from the 18 two- and four-year colleges in the City University of New York (CUNY) system. Students were selected if they were enrolled in a course in the sample frame, which consisted of online and comparable face-to-face courses offered during the 2014-2015 fall semester at one of the CUNY colleges. At the end of the semester, students in the sample were invited to participate in an online survey. The survey had a response rate of 12.1%, which is typical for surveys of this type with this population, and responses were weighted to account for potential nonresponse bias (see below for details).
Detailed summary statistics for the survey sample, broken down by course medium, can be found in Table 1.

Measures
This research utilizes two measures of student outcomes: successful course completion, or whether the student successfully completed a course with a C- or higher (the typical standard to receive major or transfer credit), and college persistence, or whether the student re-enrolled in college in the subsequent full semester.
The main independent variable (IV) of interest, course medium, was dichotomized to face-to-face or fully online, based on Sloan Consortium definitions (Allen & Seaman, 2010). Fully online courses have 80% or more of the course content online, and face-to-face courses have 33% or less of the content online. Prior research suggests that students who take hybrid courses (33-80% online content) are substantially similar to students who take face-to-face courses and that the outcomes are similar (Xu & Jaggars, 2011).
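The thresholds above can be expressed as a small classification rule. The following is a minimal sketch only: the function name and the category labels are hypothetical, and the text does not specify how hybrid courses were coded in the study's data, so they are simply returned as a separate category here.

```python
def classify_course_medium(percent_online):
    """Classify a course by the share of its content delivered online.

    Thresholds follow the definitions quoted in the text:
    80% or more -> fully online; 33% or less -> face-to-face.
    Anything in between is labeled hybrid (label is an assumption).
    """
    if not 0 <= percent_online <= 100:
        raise ValueError("percent_online must be between 0 and 100")
    if percent_online >= 80:
        return "fully_online"
    if percent_online <= 33:
        return "face_to_face"
    return "hybrid"


# Hypothetical courses with differing shares of online content
labels = [classify_course_medium(p) for p in (0, 33, 50, 80, 100)]
```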
The other IVs in this study were chosen because there is evidence that they may: 1) predict online course enrollment; 2) be related to course or college outcomes; or 3) be significant predictors of outcomes in the online medium. Covariates included: whether the student had a child (and age of youngest child); gender; race/ethnicity; age; work hours; income; parental education; developmental course placement; marital/cohabitation status; immigration generational status; native speaker status; college level (two-year, four-year, or graduate); G.P.A.; and number of credits/classes taken that semester. During preliminary analyses, different non-linear versions of variables were explored (e.g., converting credits to part-time/full-time status, squaring age), but these did not seem to model the actual distribution of the data any better or to produce significantly different results.
The survey used in this study included scales measuring: motivation to complete the course; course enjoyment/engagement; academic integration (i.e., interaction with faculty/students outside class); self-directed learning skills; time management skills; preference for autonomy; and grit (i.e., perseverance and passion for long-term goals). These scales, to the extent possible, were based on previous instruments already tested for reliability and validity (Duckworth, Peterson, Matthews, & Kelly, 2007; Macan, Shahani, Dipboye, & Phillips, 1990; Pintrich & de Groot, 1990; U.S. Department of Education, Institute of Education Sciences, National Center for Education Statistics, 2009; Vallerand et al., 1992), but were shortened and modified for use in this study. Confirmatory factor analysis using structural equation modeling (SEM) was used to model items for each scale as predictors of a single latent construct. Error covariance terms were added between some individual items based on theory, prior to estimation. Some items from the motivation and grit scales were eliminated because of poor performance during SEM. For the final scales, average variance extracted (AVE) was 0.50 or greater, indicating convergent validity, and composite reliability (CR) ranged from 0.77 to 0.89, indicating good reliability (Hair, Anderson, Tatham, & Black, 1998); the standardized root mean square residual (SRMR) ranged from 0.000 to 0.059, supporting the operationalization of each scale as a single factor structure (Hu & Bentler, 1999).
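For readers unfamiliar with AVE and CR, the sketch below shows how the two quantities are commonly computed from standardized factor loadings. It is illustrative only: the loadings are invented, the function names are hypothetical, and the CR formula assumes uncorrelated item errors, so it would not reproduce values from models that include error covariance terms such as those described above.

```python
def average_variance_extracted(loadings):
    """AVE: the mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances).

    Each item's error variance is taken as 1 - loading^2, which assumes
    standardized loadings and uncorrelated errors.
    """
    total = sum(loadings)
    error = sum(1 - l * l for l in loadings)
    return total ** 2 / (total ** 2 + error)


# Hypothetical standardized loadings for a four-item scale (not from the study)
example_loadings = [0.72, 0.68, 0.81, 0.75]
ave = average_variance_extracted(example_loadings)
cr = composite_reliability(example_loadings)
```

Under the thresholds cited in the text, a scale would be read as showing convergent validity when `ave` is at least 0.50.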

Analytical Approaches
Courses for which valid grades did not exist (e.g., not submitted by instructor, course was audited) were dropped. Multivariate multiple imputation by chained equations was used to impute values for survey questions with missing responses, using all IVs chosen for subsequent analyses. Depending on variable type, binomial, ordered, or multinomial logit models, or predictive mean matching on three nearest neighbors, was used for imputation. A median of 2.6% of data were missing in each imputed variable in the sample. After preliminary tests for stability of model estimates, final imputed datasets contained 35 imputations.
Propensity scores, indicating the probability of online enrollment, were generated for each student using logistic regression and included all of the IVs used in the subsequent analyses; scores were averaged across imputed datasets. Initially, data were weighted prior to propensity score matching to account for survey non-response, but since it is not well established in the research literature how to best perform propensity score matching with weighted data (see, e.g., DuGoff, Schuler, & Stuart, 2014), and since preliminary models with and without weights were substantially similar, subsequent analysis was performed without sample weights. Matched datasets were generated using single nearest-neighbor matching with replacement because this approach yielded the best balance on the covariates, based on the standardized bias for each imputed variable averaged across imputations. The median standardized bias across variables was 2.6%. Based on Rubin's (2001) rule of thumb that standardized bias should be approximately below 25% after matching, the matched dataset achieved good balance on all covariates. Distribution of propensity scores was evaluated before and after matching, and both datasets showed significant overlap in the region of common support (see Figures 1-2).
Each dataset was formatted into two distinct datasets: a student-level dataset, in which each record was a single student and included information about course outcomes for that student for the course in the sample frame, and a course-level dataset, in which each record was the outcome of a course taken by one of the students in the sample (all courses taken in fall/winter by a student in the sample were included). The first dataset was used to run multilevel mixed-effects logistic regression models with student as the first-level and course as the second-level factors, in order to control for unobserved heterogeneity between courses (comparing outcomes in the same course for different students); the second dataset was used to run multilevel models with course as the first-level and student as the second-level factors, in order to control for unobserved heterogeneity between students (comparing outcomes in online versus face-to-face courses taken by the same student). The KHB decomposition method (Kohler, Karlson, & Holm, 2011) was used to calculate direct and indirect effects, in order to explore the relationship between online course outcomes, student characteristics, and subsequent college persistence. Standard errors during KHB decomposition were computed using clustering by course, to account for the multi-level data structure.
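To make the matching and balance-checking steps concrete, the following minimal sketch pairs each online student with the face-to-face student whose propensity score is closest (allowing reuse of controls) and then computes the standardized bias for one covariate. All data values and function names are hypothetical, and the sketch assumes propensity scores have already been estimated; pooling the variances of the two groups being compared is one common convention and may differ from the exact formula used in the study.

```python
from math import sqrt
from statistics import mean, variance


def match_with_replacement(treated_scores, control_scores):
    """For each treated unit, return the index of the control unit with the
    closest propensity score. Controls may be matched more than once."""
    return [
        min(range(len(control_scores)), key=lambda j: abs(control_scores[j] - p))
        for p in treated_scores
    ]


def standardized_bias(treated_values, control_values):
    """Difference in group means as a percentage of the pooled standard deviation."""
    pooled_sd = sqrt((variance(treated_values) + variance(control_values)) / 2)
    return 100 * (mean(treated_values) - mean(control_values)) / pooled_sd


# Hypothetical propensity scores and one covariate (age); not data from the study
online_scores = [0.62, 0.35, 0.80]
f2f_scores = [0.30, 0.58, 0.77, 0.10]
online_age = [31, 24, 38]
f2f_age = [23, 30, 36, 19]

matches = match_with_replacement(online_scores, f2f_scores)
matched_f2f_age = [f2f_age[j] for j in matches]

bias_before = standardized_bias(online_age, f2f_age)
bias_after = standardized_bias(online_age, matched_f2f_age)
```

In this toy example the bias on age shrinks after matching, which is the pattern the balance check is meant to detect; in practice the check would be repeated for every covariate and averaged across imputed datasets.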

Results and Discussion
This section describes factors that had a significant interaction with the online environment in predicting course and college outcomes. This means that the difference in outcomes online versus face-to-face is significantly different for distinct factor values. For example, if we say that being native-born put students at higher risk of dropout online, what we mean is that the change in dropout rates when moving from the face-to-face to the online medium (all other factors being equal) is more positive for native-born than for foreign-born students. This could mean that both foreign- and native-born students do worse online than face-to-face, but that the drop in performance is smaller for foreign- than native-born students. Or it could mean that both foreign- and native-born students do better online than face-to-face, but that the increase in performance is greater for foreign- than native-born students. Or it could mean that native-born students do worse online and foreign-born students do better. In any of these three cases, foreign-born students might drop out of face-to-face or online courses at higher or lower rates than native-born students; the direction or significance of the interaction alone does not provide any information about the relative outcomes of these groups overall, just about how outcomes change for these groups across different course mediums. We note also that an interaction is a contrast between two or more groups (e.g., foreign- versus native-born); for continuous factors, this means that differences in outcomes across mediums are contrasted for higher and lower values of that factor (e.g., older versus younger students).
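The three scenarios just described can be illustrated numerically. The sketch below computes the interaction as a difference in differences on the log-odds scale, which is the scale on which a logistic regression interaction term operates. The completion rates are invented for illustration, the names are hypothetical, and the calculation ignores the covariates that the study's models hold constant.

```python
from math import log


def log_odds(p):
    """Convert a probability to log-odds."""
    return log(p / (1 - p))


def interaction_on_logit_scale(rates):
    """Online-minus-face-to-face change in log-odds of successful completion
    for foreign-born students, minus the same change for native-born students.
    A positive value means native-born students fare relatively worse online."""
    change = {
        group: log_odds(r["online"]) - log_odds(r["face_to_face"])
        for group, r in rates.items()
    }
    return change["foreign_born"] - change["native_born"]


# Hypothetical successful-completion rates (not from the study)
both_worse = {
    "foreign_born": {"face_to_face": 0.80, "online": 0.76},
    "native_born": {"face_to_face": 0.82, "online": 0.70},
}
both_better = {
    "foreign_born": {"face_to_face": 0.70, "online": 0.80},
    "native_born": {"face_to_face": 0.75, "online": 0.77},
}
opposite_directions = {
    "foreign_born": {"face_to_face": 0.75, "online": 0.78},
    "native_born": {"face_to_face": 0.80, "online": 0.74},
}

scenarios = [both_worse, both_better, opposite_directions]
interactions = [interaction_on_logit_scale(s) for s in scenarios]
```

All three scenarios yield an interaction of the same sign even though the groups' absolute completion rates are ordered differently in each, which is why the interaction alone says nothing about overall outcomes.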
The most consistent pattern observed in this study was that native-born students (particularly those with two native-born parents) were at greater risk online than foreign-born students. At CUNY roughly 40% of students are foreign-born. Some research has shown that cultural norms prevent certain immigrant groups from actively participating in face-to-face classroom discussions and that online discussions produced more opportunity for interaction and participation among immigrant students, so this is one possible explanation for these results (e.g., Campbell, 2007; Yildiz & Bichelmeyer, 2003).
Having a child under 6 years of age was a significant predictor of lower rates of successful course completion for both the matched and unmatched student-level datasets. Similar trends were observed for the course-level dataset, but the differences were not significant, perhaps because of relatively small numbers of students with pre-school-aged children in each subgroup. Repeating the analysis with a binary variable indicating whether the student had a child instead of whether they had a child under six produced similar results. It may be that student parents are more likely to enroll in online courses if they have greater time constraints, and that these same students are less likely to successfully complete a course. The fact that this pattern was significant only for the student-level dataset (where unobserved heterogeneity was accounted for by course and not by student), but not in the course-level dataset (where unobserved heterogeneity by student was accounted for), supports this interpretation. These results suggest that without adequate support for student parents (e.g., childcare, financial aid to reduce work hours), the flexibility that online courses offer may not be enough to compensate for the time demands of parenthood.

College persistence
This study also explored the extent to which the subsequent college persistence of online students could be directly related to the outcomes of their online courses, and to what extent it is likely related to other characteristics which also increase the likelihood of taking an online course. Online students in this study were significantly less likely to persist in college by re-enrolling in courses at the university in the subsequent semester, but it is unclear whether this is related to their online course-taking or whether this is the result of factors that make these students both more likely to enroll online and more likely to drop or stop out of college. The KHB decomposition method was used to calculate the direct, indirect, and total effect of taking a fully online course on subsequent college persistence as mediated by successful course completion, while controlling for the variables included as covariates in Tables 1-3. In Table 4, the direct, indirect, and total effects of this model can be seen: there is no significant indirect effect, suggesting that online students are not more likely to drop out of college immediately after, or due to, the outcomes of the online course; rather, it seems that other student characteristics may be significant in determining simultaneously online course enrollment and college persistence.
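As a rough orientation for readers new to the method, the decomposition can be summarized as follows. This is a simplified sketch of the general KHB approach as we understand it, not the exact specification estimated in this study; the variable names (online, completed, persist) and the control vector C are illustrative labels.

```latex
% Full model: includes the mediator (successful course completion)
\operatorname{logit}\Pr(\text{persist}=1)
  = \alpha_F + \beta_F\,\text{online} + \gamma_F\,\text{completed}
  + \boldsymbol{\delta}_F^{\top}\mathbf{C}

% Reduced model: the mediator is replaced by its residual from a linear
% regression on the key variable (online), so that both models are
% measured on the same scale
\operatorname{logit}\Pr(\text{persist}=1)
  = \alpha_R + \beta_R\,\text{online} + \gamma_R\,\widetilde{\text{completed}}
  + \boldsymbol{\delta}_R^{\top}\mathbf{C}

% Decomposition of the effect of the online medium
\text{total} = \beta_R, \qquad
\text{direct} = \beta_F, \qquad
\text{indirect} = \beta_R - \beta_F
```

Using the residualized mediator in the reduced model is what allows the coefficients of the two logit models to be compared; naively comparing coefficients across nested logit models would conflate mediation with rescaling.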

Limitations
This study analyzes data from a large U.S. university system in order to increase generalizability and validity, but still has some limitations. While the sample size in this study was large, not all subgroups were large. When considering interaction effects, as in this study, it is not necessarily the whole sample size that is relevant, but the size of particular subgroups. For example, only 25 students with children under six years old dropped the course. Because of this, there may be important relationships for some of these smaller subgroups that were not identified as significant in this study, but that would be identified as significant in a larger sample.
In addition, while the CUNY system is highly diverse and likely generalizable to a wider U.S. student population, it is not necessarily nationally representative. CUNY does not have rural campuses, so caution should be exercised before extending any results taken from the CUNY dataset to the approximately 18% of U.S. college students who attend rural colleges (IPEDS, 2013). There may be factors that impact U.S. rural online students that are not well captured in this study.
In addition, the CUNY dataset used in this study was also more diverse than the average U.S. college student population, with a higher proportion of ethnic and racial minorities, foreign-born students, first-generation college students, students from lower socio-economic strata, and students requiring developmental coursework. While these features may make the samples used in this study less representative of the U.S. college population as a whole, they also make these data an excellent resource for investigating the relationship of online course-taking and college outcomes for many groups that have been traditionally underrepresented in college and at higher risk of dropout in the U.S.
And finally, while this study has attempted to control for a wide array of factors that may correlate with online course enrollment or college outcomes, it is unlikely that any study could include

Implications and Conclusion
Colleges wanting to target interventions to students at highest risk in the online environment may want to focus on supporting student parents (perhaps by providing financial support and/or assistance accessing childcare), and native-born students in areas where foreign-born students are heavily represented. But while these are the groups found by this study to be most vulnerable in the online environment specifically, these groups are not necessarily the ones with the poorest absolute online outcomes. For example, for the dataset used in this study, household income was strongly correlated with course and college outcomes even though it was not relevant to the online environment specifically. Lower-income students likely still need significant support in online courses, just as they do in face-to-face classes. In addition to targeting student groups that are vulnerable in the online environment specifically, colleges hoping to improve online retention should continue to support student groups that have historically been identified as at-risk generally.
Furthermore, in this study, online course outcomes had no direct effect on college persistence. This suggests that taking online courses likely does not lead directly to lower rates of college persistence on average, but rather that there are characteristics that lead students to both enroll in online courses and drop out of college at higher rates. Further research with specific subgroups (e.g., community college students) and with other samples is necessary in order to confirm the extent to which this pattern is generalizable.

Table 1. Summary statistics by course medium

Table 1 .
Multi-level logistic regression model of successful course completion

Table 4. Direct, indirect, and total effect of fully online course medium on subsequent college persistence as mediated by successful course completion, controlling for all covariates in Tables 1-2

Online students are more likely to have more "complicated" lives that include experiences that are difficult to measure or quantify, but that influence both decisions to enroll online and subsequent course and college outcomes. Further exploring and refining factors that may impact online course enrollment should be a focus of future educational research if we are to conduct well-controlled observational studies about online outcomes.