Student Consistency and Implications for Feedback in Online Assessment Systems

InProceedings

Most of the emphasis in mining online assessment logs has been on identifying content-specific errors. However, the pattern of general “consistency” is domain independent, strongly related to performance, and can itself be a target of educational data mining. We demonstrate that simple consistency indicators are related to student outcomes, and suggest how consistency might be used in an online assessment framework to provide scaffolding to help students in need.

The second factor loaded .96 on Final Part2.Q3. This pattern of loadings suggests that the first factor represents general knowledge gained in the course, separate from the format of the exam (e.g., open-ended questions versus multiple choice). We use the first factor as an outcome measure of general class knowledge.

4 Results

Table 1 shows the mean and standard deviation for the outcome measures, the measures of time spent completing the laboratory exercise, and the consistency measures. There was clear variation among the students on all measures, including the consistency measures that may seem obvious (for example, writing the same response on the worksheet that was used in the exercise). This variation is particularly substantial considering that these students are highly selected into a competitive state university and are expected to have developed skills that would result in higher consistency measures than the population as a whole. There is also substantial variation in scores on the final exam. We note that the average score on Final Part2.Q2, the more difficult of the two questions dealing with image processing, is lower than the average score on Final Part2.Q1, which suggests that the questions were difficult enough to avoid ceiling effects.

Table 1. Summary statistics for outcome measures, general logfile measures, and consistency measures. The maximum score or range of scores is given, where appropriate, in parentheses following the measure.

To examine differences in outcome measures based on consistency, we split the students at the median sum score of consistency (7), forming a low consistency group (N=14) and a high consistency group (N=13). We conducted a one-way ANOVA to determine whether the outcome measures differ across the two groups; results are summarized in Table 2. On average, the high consistency group scored higher on all outcome measures, though not all differences were significant. The high consistency group obtained significantly greater worksheet scores (p=.002). This difference is particularly noteworthy because the consistency measures do not assess correct or incorrect responses, whereas the worksheet scoring does. Furthermore, the high consistency group scored marginally significantly higher on Final Part2.Q1 (p=.083) and significantly higher on Final Part2.Q2 (p=.049); these two final-exam questions were designed to measure the same concepts covered by the worksheet and were administered two weeks afterward. The high consistency group also scored significantly higher on the class knowledge measure extracted from the midterm scores (p=.018). The effect size (measured by Cohen’s d) was quite large. Differences in consistency cannot be attributed solely to “carelessness” or to greater or lesser time spent on the assignment: the high consistency group logged fewer events (not significant) with more parse errors (marginally significant, p=.062), and the average time between logged events was virtually identical across the two groups.

Table 2. Analysis of variance.

5 Implications for Feedback

We have proposed some rough indicators of consistency and shown that the low consistency students perform worse than the high consistency students on several important class outcomes. It is particularly remarkable that we find such variation in consistency in such a highly selected population.
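For concreteness, the median split and one-way ANOVA described in the Results section could be carried out along the following lines. This is a minimal Python sketch using numpy and scipy; the data, variable names, and ranges are illustrative placeholders of ours, not the study's actual data or analysis code.

    import numpy as np
    from scipy import stats

    # Sketch of the analysis: a median split on the summed consistency
    # indicators, followed by a one-way ANOVA on one outcome measure.
    # The data below are random placeholders, not the study's data.
    rng = np.random.default_rng(0)
    consistency = rng.integers(3, 11, size=27)    # summed consistency indicators per student
    worksheet_score = rng.normal(20, 4, size=27)  # an outcome measure, e.g. worksheet score

    median = np.median(consistency)               # the paper reports a median sum of 7
    low_group = worksheet_score[consistency <= median]
    high_group = worksheet_score[consistency > median]

    # One-way ANOVA across the two groups (with two groups, equivalent to a t-test).
    f_stat, p_value = stats.f_oneway(low_group, high_group)

    # Cohen's d as an effect-size estimate for the group difference.
    pooled_sd = np.sqrt((low_group.var(ddof=1) + high_group.var(ddof=1)) / 2)
    cohens_d = (high_group.mean() - low_group.mean()) / pooled_sd
    print(f"F = {f_stat:.2f}, p = {p_value:.3f}, d = {cohens_d:.2f}")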
We do not know why students are inconsistent, and we have not demonstrated whether higher consistency produces higher performance or whether it is a side effect of something else (e.g., low engagement or interest in the class); this is a topic for future research. However, the inconsistencies we have coded are rather blatant. When a student repeatedly types in the wrong command and does not recognize discrepancies in the results or attempt to reconcile them, it is reasonable to believe that such events may provide a learning opportunity. Furthermore, addressing the needs of this student in this circumstance does not require a model of how the student interacts with the pedagogical content of the assignment. We suggest that consistency across several dimensions may be dynamically monitored and used to adaptively control scaffolding (additional guidance) for the student. Scaffolding involves (a) reducing the number of steps required to solve a problem by simplifying the task, (b) keeping the student motivated, (c) marking discrepancies between actions taken and the desired solution, (d) controlling frustration, and (e) demonstrating an idealized version of the task [9]. It is a technique used primarily when students are learning new material and cannot yet handle complex problems on their own. Graesser et al. showed that scaffolding, including good questions and answers, could promote deep inquiry, which students tended to avoid without prompting [10].

For example, suppose students did not follow the directions on the worksheet correctly and did not realize it. An ideal student would have recognized that “something was wrong” and taken some action to resolve the cognitive dissonance, such as checking the last command executed or re-entering the command. If still confused, the student could ask a classmate or the instructor for help. As part of an online assessment system, we wish to encourage such behavior. An appropriate first step would be to emphasize the cognitive dissonance. In the assignment used in the experiment, expectations are outlined in the text, e.g., by writing “Apply a formula that makes a monochrome image in which the cedar foliage is white and everything else is black” before providing the formula. However, the discrepancy could be highlighted further by showing an image of the expected result and asking the student, “Does your image look like this?” before proceeding.

In the extreme case when the same error is repeated, we can assume that the mistake is not mere inconsistency but represents some higher-order misconception. For example, the PixelMath interface has buttons for common functions, such as XOR; other functions can be typed into the formula window. One step asked students to apply the BXOR formula to exclusive-OR two images on a bit-by-bit basis, and many incorrectly applied the XOR function instead. Students who do not correct this error even with appropriate scaffolding may not understand the difference between bitwise exclusive-OR and straight exclusive-OR, or may not realize that they can type formulas directly into the PixelMath formula area without clicking buttons. This might require specific formative feedback that reteaches these concepts to the student.

A common misconception about image processing systems such as PixelMath is that the formula tells where to move each pixel of the source image. In reality, the formula describes, for each destination pixel location, where or how to get its value.
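This pull semantics can be made concrete with a small sketch in Python/numpy (our illustration, not PixelMath's implementation): each destination pixel (x, y) fetches its value from a source coordinate computed from (x, y).

    import numpy as np

    def pull(source, coord_map, out_shape):
        # Fill each destination pixel by *pulling* a value from the source image.
        # coord_map(x, y) returns the (sx, sy) source coordinate to read from;
        # reads that fall outside the source are left as zero.
        dest = np.zeros(out_shape, dtype=source.dtype)
        height, width = out_shape
        for y in range(height):
            for x in range(width):
                sx, sy = coord_map(x, y)
                if 0 <= sx < source.shape[1] and 0 <= sy < source.shape[0]:
                    dest[y, x] = source[sy, sx]
        return dest

    img = np.arange(100, dtype=np.uint8).reshape(10, 10)   # toy 10x10 image

    # Halve the image: destination (x, y) pulls from source (2x, 2y), i.e. Source1(x*2, y*2).
    half = pull(img, lambda x, y: (2 * x, 2 * y), (5, 5))

    # Shift the image 5 pixels to the left: destination (x, y) pulls from source (x+5, y).
    shifted = pull(img, lambda x, y: (x + 5, y), img.shape)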
This “push-instead-of-pull” notion is exhibited when students are asked to come up with a formula to, say, reduce the size of an image by a factor of two. Instead of writing Source1(x*2, y*2), which is correct, they write Source1(x/2, y/2). Similarly, to shift an image 5 pixels to the left they should write Source1(x+5, y), but they put down Source1(x-5, y). After seeing a consistent pattern of such incorrect formulas, an automatic feedback system could provide scaffolding that specifically addresses the “push-instead-of-pull” misconception.

6 Conclusions

We have identified a dimension of student performance, consistency, that is content independent, easily mined from educational logs, and related to performance outcomes. We suggest that, because consistency is an assumption underlying many educational interventions, the significance of a lack of consistency may be overlooked as a potential opportunity to provide scaffolding. We give some suggestions for how an adaptive learning system might exploit consistency measures to scaffold instruction, and for identifying when consistency of incorrect responses suggests moving to an intervention based on more sophisticated models of the learner, the content, and their interaction.

Figure 1. The background window contains the original image used by students for the laboratory activity. The fence has been extracted into another window (above, zoomed out by a factor of 2) and downsampled into another (below). The PixelMath calculator can also be seen, with the correct formula for the downsampling: Source1(5*x, 5*y).

Figure 2. Detail within the downsampled picket fence showing the effect of sampling at the Nyquist rate. PixelMath’s display of the RGB values of each pixel can also be seen.
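As a rough illustration of how such an automatic feedback system might combine a running consistency measure with repeated-error detection to choose a scaffolding action, here is a hypothetical Python sketch; the event fields, thresholds, and messages are our own assumptions, not part of any system described in the paper.

    from collections import Counter

    def choose_scaffold(events, consistency_threshold=0.6, repeat_threshold=3):
        # events: hypothetical log entries, each with the command the student
        # entered and the command the worksheet expected at that step.
        mismatches = [e for e in events if e["entered"] != e["expected"]]
        consistency = 1 - len(mismatches) / max(len(events), 1)

        # The same wrong command repeated many times suggests a misconception
        # (e.g. XOR vs. BXOR, or push-instead-of-pull), not mere inconsistency.
        repeated = Counter(e["entered"] for e in mismatches)
        worst, count = repeated.most_common(1)[0] if repeated else (None, 0)

        if count >= repeat_threshold:
            return f"reteach the concept behind the repeated error: {worst}"
        if consistency < consistency_threshold:
            return "highlight the discrepancy: show the expected image and ask 'Does your image look like this?'"
        return "no scaffolding needed"

    events = [
        {"entered": "XOR", "expected": "BXOR"},
        {"entered": "XOR", "expected": "BXOR"},
        {"entered": "XOR", "expected": "BXOR"},
        {"entered": "Source1(x*2, y*2)", "expected": "Source1(x*2, y*2)"},
    ]
    print(choose_scaffold(events))   # -> reteach the concept behind the repeated error: XOR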
