Does Self-Discipline impact students' knowledge and learning?

Inproceedings

Dovan Rai

Joseph E. Beck

Neil T. Heffernan

Yue Gong

Proceedings of Educational Data Mining, 2009

2009

In this study, we examine the impact of self-discipline on students' knowledge and learning. Self-discipline can influence both learning rate and knowledge accumulation over time. We used a Knowledge Tracing (KT) model to make inferences about students' knowledge and learning. We measured students' level of self-discipline with a widely used questionnaire. When we analyzed the relation between students' self-discipline and their knowledge attributes, we found that high self-discipline students had significantly higher initial knowledge, but there was no consistent relationship between self-discipline and learning while using the tutor. Moreover, higher self-discipline students seemed more careful, making fewer careless mistakes.

1. The questionnaire asked students for their gender. Twelve students gave an incorrect response. Suspecting that they were not taking the survey seriously, we excluded those students from our study.

2. Some students might be picking answers at random, so we checked their answers for consistency. Among the 12 questions in the survey, for 8 of them "Very much like me" implies low self-discipline (e.g. "I have a hard time breaking bad habits"), and for 4 of them "Very much like me" implies high self-discipline (e.g. "I am good at resisting temptation"). For both types of questions we used the scoring system in Section 2.1. If a student answered "Very much like me" for a question of the first type, he received -2 points. If he answered "Not like me at all" for a question of the second type, he received +2 points. The two responses consistently indicate low self-discipline, and their sum is zero. But if he had answered "Very much like me" to the question of the second type as well, the answers would be inconsistent and the sum of the responses would be -4. Similarly, if he had answered "Not like me at all" to both questions, the answers would again be inconsistent and the sum would be +4.

3. For each student, we took the average of the points within each type of question (since the two groups are of unequal size), summed the two averages, and took the absolute value. The resulting value can range from 0 (completely consistent) to 4 (completely inconsistent). Based on the questionnaire composition and the distribution of this value in our data, we found 1.6 to be a reasonable cut point and dropped 11 students with a value greater than 1.6.

4. We selected two pairs of questions that ask about essentially the same trait in opposite ways. For example, "I do certain things that are bad for me, if they are fun" and "I refuse things that are bad for me" address the same trait. We dropped students who answered "Very much like me"/"Mostly like me", or "Not like me at all", to both questions in a pair.
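The consistency check in step 3 above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each response has already been converted to a number between -2 and +2 by the Section 2.1 scoring, and the function and variable names are hypothetical. Only the two endpoint scores and the 1.6 cut point come from the text.

```python
from statistics import mean

CUTOFF = 1.6  # cut point reported in the text


def inconsistency(type1_scores, type2_scores):
    """Absolute value of the sum of the two per-type averages.

    type1_scores: scores on items where "Very much like me" implies LOW
                  self-discipline (8 items in the survey).
    type2_scores: scores on items where "Very much like me" implies HIGH
                  self-discipline (4 items in the survey).
    Scores are assumed to lie in [-2, +2], so the result lies in [0, 4].
    """
    return abs(mean(type1_scores) + mean(type2_scores))


def is_consistent(type1_scores, type2_scores, cutoff=CUTOFF):
    """Keep a student unless the inconsistency value exceeds the cut point."""
    return inconsistency(type1_scores, type2_scores) <= cutoff


# Hypothetical students, for illustration only.
consistent_low = ([-2] * 8, [2] * 4)   # low self-discipline on both item types
all_same_answer = ([-2] * 8, [-2] * 4)  # "Very much like me" everywhere

print(inconsistency(*consistent_low))   # 0: completely consistent
print(inconsistency(*all_same_answer))  # 4: completely inconsistent
```

Averaging within each item type before summing keeps the 8-item group from outweighing the 4-item group, which is the reason the text gives for not summing raw scores.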
There were 19 such students, of whom 5 had already been excluded in step 2. Finally, our dataset narrowed down to 134 students, with their 68285 log records. We excluded 10% of the records and 20% of the students. For each student, we had a 12-dimensional vector representing his responses to the survey questions. We performed a factor analysis to reduce the data dimensions and used the strongest factor as the student's self-discipline score.

2.3 Knowledge tracing model.

We used knowledge tracing, implemented as a Dynamic Bayesian Network (DBN; see Figure 1), to make inferences about a student's knowledge based on his performance.

Figure 1. Knowledge tracing model: Dynamic Bayesian network.

Student performance is assumed to be a noisy reflection of student knowledge, mediated by two performance parameters, guess and slip. The guess parameter represents the fact that a student may sometimes generate a correct response in spite of not knowing the skill. For example, some ASSISTment items are multiple choice, so even a student with no understanding of the question could generate a correct response for those. The slip parameter acknowledges that even students who understand a skill can make an occasional careless mistake [3]. The learning rate parameter estimates the probability that a student learns a skill he did not previously know. We used the Bayes Net Toolkit for Student Modeling (BNT-SM [4]), which takes as input the data and a compact XML specification of a Bayes net model describing the causal relationships among student knowledge and observed behavior. BNT-SM gives us the knowledge parameters (prior knowledge and learning) as well as the performance parameters (guess and slip).

3 Results.

3.1 Knowledge tracing model per skill.

Based on the self-discipline score, we divided students into three equal-sized groups with relatively high, medium and low levels of self-discipline.
For each subgroup, we trained separate knowledge tracing models, and thus estimated knowledge and performance parameters for each group. We trained a knowledge tracing model for each of the 106 skills. That is, for each skill we took all the training data across all students and derived a set of parameters (prior knowledge, learning, guess, slip). Then, for each self-discipline subgroup, we calculated the median values across all the skills (see Table 1). We report median rather than mean values to avoid giving undue weight to outliers. However, in accordance with standard convention, our statistical analyses are based on the means rather than the medians.

Table 1: Knowledge and performance parameters for self-discipline groups.

From Table 1, we see that the prior knowledge of the high self-discipline students is statistically higher than that of the medium group, while the medium and low groups are statistically tied. Meanwhile, high self-discipline students made more correct guesses and fewer slips than their lower self-discipline peers. A higher guess parameter should not be viewed as a bad thing. Recall that a guess means answering a question correctly despite not having mastered the skill. Consider two students with similar partial knowledge, one of whom takes more care to figure out the right answer while the other quickly asks for help. The model will treat the first student's correct answer as a guess. Such behavior seems related to self-discipline. Similarly, students who are more careful and detail-oriented will make fewer slips (keep in mind that a "slip" is defined as making a mistake in spite of the skill being known). The result shows that higher self-discipline students have more prior knowledge and are more careful with their tasks.

However, we found an inconsistent pattern in the learning parameter. The learning rate of the medium self-discipline group is higher than that of both the high and low groups.
We were concerned about the possibility of overestimating the learning parameter in the medium group by giving the guess parameter less weight, while underestimating it in the high group by giving guess more weight. This concern stems from known problems with estimating knowledge tracing parameters [8]. For example, a high "guess" parameter can result in students performing well but allegedly having little knowledge. Since student knowledge is not directly observable, it is hard to validate the parameter estimates, and we are left trusting our model that two groups could perform equally well while one group knows less (see [8] for a fuller discussion of the problems of underdetermined models).

To guard against this concern, we also plotted student performance as a function of practice opportunity, so that we can see the cumulative effect of the knowledge and performance parameters on students' future performance for each level of self-discipline. Using the four parameters of each subgroup and the knowledge tracing equations listed below, we computed the theoretical performance curve for each subgroup. Specifically, we initialize knowledge to K0. After each practice opportunity, we update knowledge using formula I (below) to obtain the new likelihood that the student knows the skill after the previous practice. We also compute performance, the probability that the student will respond correctly at the current practice opportunity, using formula II to combine the estimated knowledge with the slip and guess parameters. Intuitively, the probability of a correct response depends on the student's knowledge, given that he does not slip, and on his probability of guessing correctly in the absence of that knowledge.

Figure 2a. Theoretical performance curve of three self-discipline groups.

Figure 2b. Real performance curve of three self-discipline groups.
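The procedure described above for the theoretical curves can be sketched in a few lines. This is a minimal sketch that assumes formulas I and II take the standard knowledge tracing forms, K_n = K_(n-1) + (1 - K_(n-1)) * learn and P_n = K_n * (1 - slip) + (1 - K_n) * guess, which match the verbal description but are an assumption here. The function name and the parameter values in the usage lines are hypothetical and are not the values from Table 1.

```python
def theoretical_curve(k0, learn, guess, slip, n_opportunities):
    """Return the predicted P(correct) at each practice opportunity.

    Assumes the standard knowledge tracing update with no observed evidence:
    the curve is driven only by the four model parameters.
    """
    knowledge = k0  # probability the skill is known before the first attempt
    curve = []
    for _ in range(n_opportunities):
        # Assumed formula II: correct if the skill is known and the student
        # does not slip, or if it is unknown and the student guesses right.
        p_correct = knowledge * (1 - slip) + (1 - knowledge) * guess
        curve.append(p_correct)
        # Assumed formula I: a skill that is not yet known is learned with
        # probability `learn` at this practice opportunity.
        knowledge = knowledge + (1 - knowledge) * learn
    return curve


# Hypothetical parameters, for illustration only.
curve = theoretical_curve(k0=0.5, learn=0.1, guess=0.2, slip=0.1,
                          n_opportunities=8)
for opportunity, p in enumerate(curve, start=1):
    print(f"opportunity {opportunity}: P(correct) = {p:.3f}")
```

Because guess and slip enter the performance formula directly, two groups with different learning rates can still produce similar curves, which is why the curve is a useful check on the individual parameter estimates.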
From the performance curve in Figure 2a, we see that the combined effects of the four model parameters result in higher self-discipline students performing better. The real performance curve in Figure 2b shows a similar trend. It is also interesting to examine the best-fit power curves for each group. The high self-discipline students perform more lawfully (i.e. with higher R²) than those with low self-discipline, suggesting that students with higher self-discipline are more consistent. Simply looking at the learning parameters does not tell the whole story. Students in the high group might be learning more slowly, but they are better able to use their partial knowledge to perform well; at least that is what our model suggests.

Based on all these findings, we built a causal model that unifies cognitive and non-cognitive aspects of our students. While knowledge parameters like prior knowledge and learning are cognitive attributes, the performance parameters, guess and slip, are more related to non-cognitive attributes. This model accounts for the results in Table 1, and suggests the performance parameters might be an interesting avenue of research in their own right (typically the knowledge parameters are of more interest).

3.2 Knowledge tracing model per student.

While training a KT model per skill is the regular approach, it is also possible to train one model per student instead, by observing his responses to all questions across skills. The model then estimates a set of parameters (prior knowledge, guess, slip and learning) for each student, which represents his aggregate performance across all skills. We then looked for a relationship between the student's self-discipline score and his knowledge parameters (prior knowledge and learning). As seen in Table 2, self-discipline is positively correlated with the student's prior knowledge (K0), but again there is no statistically reliable correlation with the learning parameter.
In other words, students with higher self-discipline have more incoming knowledge than their lower self-discipline classmates. However, self-discipline does not seem to contribute to a student's ability to learn more from each learning opportunity within the tutor. Perhaps higher self-discipline results in having more learning opportunities rather than learning more from each one?

Figure 3: Causal model of cognitive and non-cognitive attributes for academic performance.

Table 2: Correlation of self-discipline and knowledge parameters.

We also observed that self-discipline is highly correlated with the number of problems solved. We were then confronted with two possibilities: either higher self-discipline students are more on task and solve more problems, or students with higher self-discipline have higher knowledge and so need less help and solve problems more quickly. When we computed partial correlations among these three variables, as seen in Table 3, we found evidence for the latter possibility. Once we account for prior knowledge, there is no relationship between self-discipline and number of problems solved. We built a causal model, Figure 4, based on the finding that the higher self-discipline students solved more problems because they were equipped with more knowledge and, perhaps surprisingly, not because they were on task more. The direct correlation between self-discipline and knowledge is 0.29, and between knowledge and number of problems solved it is 0.55. The partial correlations are more interesting. The partial correlation of self-discipline and number of problems solved, partialing out knowledge, was only 0.11. Thus, there is no direct relation between the two. The partial correlation of knowledge and number of problems solved, partialing out self-discipline, is 0.52, i.e. the correlation is relatively unaffected.
Thus, knowledge appears to be the direct causal link to the number of problems solved, and self-discipline is causally upstream of knowledge.

Table 3: Partial correlation of self-discipline, prior knowledge and # of problems solved.

Figure 4: Causal model of self-discipline, knowledge and number of problems solved.

3.3 Model validation.

KT model parameters can be sensitive to sources of error such as wrong priors, insufficient data, etc. Therefore, we wanted to validate our model parameters against external data. To do so, we used results from a pretest and a posttest given to the same set of students. The pretest consisted of a 33-item algebra quiz on the subset of knowledge components that we use in our models. After a month, the students were given a posttest with exactly the same questions as the pretest. The pretest was administered when the students started using the tutor, and the student's score is used to indicate the amount of incoming knowledge before using ASSISTment. It therefore serves as a standard against which to validate the student prior knowledge (K0) that we estimated in our models. We also calculated students' estimated performance after 8 practice opportunities (P8), since they practiced 8 times on average for each skill during the one-month period. We estimate performance from the prior knowledge, learn, guess and slip parameters using the knowledge and performance equations mentioned in 3.1. The correlation of P8 with the posttest can serve as a measure of validation for the other three parameters. There is a strong positive correlation between the student pretest scores and the model's estimate of their prior knowledge. P8 and posttest scores are also reliably correlated even when we partial out initial knowledge (K0), i.e. our performance measure is capturing student learning, not just the student's overall level of knowledge. In addition, Figure 5 shows student learning between pretest and posttest.
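Partialing out a third variable, as done here for K0 and in Section 3.2 for knowledge and self-discipline, can be sketched with the usual first-order partial correlation formula. The text does not say how the authors computed their partial correlations, so this is an illustration of the general technique only; the function name and the numbers in the usage lines are hypothetical.

```python
from math import sqrt


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z.

    Takes the three pairwise Pearson correlations and returns the correlation
    between x and y that remains once the linear effect of z is removed.
    """
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical correlations, for illustration only.
# If x and y are related only through z, the partial correlation vanishes:
print(partial_corr(r_xy=0.25, r_xz=0.5, r_yz=0.5))
# If z is unrelated to both, the correlation is unchanged:
print(partial_corr(r_xy=0.4, r_xz=0.0, r_yz=0.0))
```

A partial correlation close to zero is what supports reading one variable as acting only through another, while a value that barely moves suggests a direct link.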
The gains in Figure 5 are consistent with our KT model results. The high self-discipline group has higher incoming knowledge than both other groups, and their final performance is also the highest. But when it comes to learning, the medium group appears to have the highest gain. We therefore considered possible explanations for the lower learning in the high group. One reason could be that they already have high knowledge, and it is harder to gain more when starting from a higher value. For example, going from 50 to 60 is easier than going from 80 to 90.

Table 4: Correlation of prior knowledge (K0) and P8 vs. pre and post-test respectively.

Figure 5: Comparison of learning gain.

To test this possibility, we divided pretest scores into three bins and performed an ANOVA. We treated pretest and self-discipline as factors in our model, since we did not necessarily expect a linear effect (as would be implied by treating them as covariates). Table 5 shows the estimated marginal means of the gain score for each level of the factors.

Table 5: ANOVA analysis gains by pretest and self-discipline.

From the result in Table 5, we see that the medium group has higher learning in all bins (i.e. they are learning faster no matter what their starting level on the pretest is). Therefore, it appears that the medium group indeed has higher learning, and perhaps a balance of self-discipline and some spontaneity helps produce better learning gains. However, their lower incoming knowledge makes the idea of a higher learning rate difficult to reconcile. We leave this as an open question for future experiments.

4 Contributions.

Psychosocial studies have been based on performance measures like report cards, GPA, income, college admission, etc. [1,8,9]. In contrast, our fine-grained model gives us the tools to measure not only performance but also latent attributes like knowledge, learning, and even guess and slip.
We have found that the impact of self-discipline on students using computers is complex, and that it appears to influence knowledge and performance while using the tutor. We have constructed a causal model of the impacts of cognitive and non-cognitive attributes on performance within an ITS, and showed that the variability in performance depends not only on cognitive attributes, but also on non-cognitive aspects like carefulness and self-discipline. We modified the regular approach of training a KT model with data per skill and instead estimated per-student parameters. Although a per-student model trained on prior users is not useful to ITS designers (since it does not apply to new students using the system!), performing parameter estimation at the individual level opens new ways to analyze other individual characteristics. With this new approach, we were able to correlate students' pretest and posttest scores with their knowledge and performance estimates, thus validating the model parameters.

5 Future work and conclusions.

Our current method of estimating self-discipline relies upon a self-reported survey administered once. There can be problems of both over- and under-reporting. We could take advantage of the continuous data students generate and construct a more robust estimate of self-discipline. It may also be possible to treat self-discipline as a latent construct, similar to what we do for knowledge in knowledge tracing, and simultaneously estimate both parameters. Broadening the stream of ITS information to include observable measures like homework submission, attendance, usage of the tutor, opinions of teachers and parents, etc. would make this possible. In conclusion, high self-discipline students have higher incoming knowledge, as substantiated by both the KT model parameters and the pretest scores.
However, the impacts do not appear to be substantial, and tutor designers probably do not have to explicitly account for self-discipline. The higher self-discipline group makes better guesses and fewer slips, which implies that the higher self-discipline group is more careful and detail-oriented. The cumulative effect of learning, slip and guess makes the performance of higher self-discipline students better than that of their peers.

Acknowledgements.

We would like to thank all of the people associated with creating the ASSISTment system listed at www.ASSISTment.org. We would also like to acknowledge funding from the National Science Foundation, the Fulbright Program for funding the second author, and the US Department of Education and the Office of Naval Research for funding the other authors. All of the opinions expressed in this paper are those solely of the authors and not those of our funding organizations.
