We present a method to simultaneously search for student-level variables constructed from Cognitive Tutor log data and for graphical causal models over those variables. We seek causal explanations of behavior in Cognitive Tutors, including "gaming the system" and off-task behavior, selecting variables by their contribution to learning causal structure and causal strength.
"1. INTRODUCTION. Researchers constructing student-level statistical and causal models from “raw†log data of courseware must construct variables to represent student-level, aggregate features of interest. We propose search for constructed variables, with a focus on “gaming the system†and off-task behavior in Cognitive Tutors. Variables are assessed by their support of inferences about causation and causal strength in graphical causal models [12]. “Feature engineering†has been explored for predictive models of educational data (e.g., [1], [4]). Usually targets (not always, cf. [6]) have been fine-grained outcomes (e.g., success at next tutor interaction) and not student-level outcomes (e.g., exam scores). Graphical causal models have also been used on educational data ([10], [11], etc.) with student-level features. This work develops an approach [9] that combines data-driven variable construction and algorithmic causal discovery to model student behavior. 2. DATA + MOTIVATION. Data are from interactions of 102 non-traditional, adult learners with the Carnegie Learning Algebra Cognitive Tutor in an (online or on-campus) algebra course at the University of Phoenix, specifically data from the last module of the course. Target learning outcomes are students’ course final exam scores. Learners “game the system†by taking advantage of intelligent tutor properties to get through course material without genuinely learning [2]. Off-task learners disengage from the tutor and behave in ways unrelated to learning tasks [7]. Both types of behavior have been associated with decreased learning [3]. Research on gaming describes it as “harmful†in a non-causal way, denoting mere association with negative outcomes. Methods for inferring causal relationships from observational data may help provide evidence for (or against) a causal relationship between gaming behavior and learning. We deploy software “detectors†of gaming [5] and off-task behavior [7] that use a variety of “engineeredâ€/“distilled†features [4] to determine whether a transaction corresponds to gaming or off-task behavior. We treat their output as “fine-grained†observations of behavior and seek variables to represent aggregate behavior over “fine-grained†observations. Recent work [8] considers whether influences of gaming the system and off-task behavior on learning are immediate, aggregate or both. They construct variables over lessons/units of interest and report that gaming is weakly associated with aggregate poorer learning and that off-task behavior is strongly associated with aggregate poorer learning. 3. METHOD + RESULTS. Our data range over several units (with corresponding sections) of algebra material and 32 skills. Student behavior in particular units, sections, or skills is possibly more important for learning than behavior over the entire module. We search over variables constructed at these levels of aggregation. Models in [8] consider variables as counts of gamed/off-task steps. As they suggest, other functions might manifest important behaviors. Our strategy is to search over constructed variables to find those that support causal inferences using algorithmic search for causal models. Graphical causal models are frequently directed acyclic graphs (DAGs) with associated probability distributions (Bayesian networks). DAG nodes represent variables; edges represent causal relationships. 
Graphical causal models are frequently represented as directed acyclic graphs (DAGs) with associated probability distributions (i.e., Bayesian networks). DAG nodes represent variables; edges represent causal relationships. Two assumptions link the causal structure represented by a DAG to the independencies it entails: the Causal Markov Condition and the Causal Faithfulness Condition [12]. Algorithms like FCI [12] learn the equivalence class of graphs compatible with the conditional independence relations among measured variables, allowing for the possibility of unmeasured common causes of measured variables. FCI returns a partial ancestral graph (PAG), which represents the set of causal graphs compatible with the conditional independence relations among measured variables. Edges in a PAG are interpreted as follows: X o→ Y means either X causes Y or X and Y share a latent common cause (or both); X o—o Y means (1) X causes Y, (2) Y causes X, (3) X and Y share a latent common cause, or (4) either (1) and (3) or (2) and (3); X ↔ Y means there is a latent common cause of X and Y; X → Y means X causes Y.

To judge which variables support causal inferences when the causal structure is uncertain, we iterate over the DAGs consistent with a PAG over a set of constructed variables and calculate the "average causal predictability" achieved. For each compatible DAG, we specify a linear regression model for final_exam score with its direct causes as predictors. Treating each DAG as equi-probable, we maximize the average R² value over these models (a sketch of this scoring appears at the end of this section).

Fig. 1 shows a baseline causal model (PAG) for the variables from [8]. The negative association of number_steps_gamed and final_exam is likely induced by a causal relationship. The average causal R² ("causal predictability") for this PAG is .5028.

Fig. 1. Baseline PAG; +/- indicate association.

We search over aggregate variables constructed from characteristics tracked by Cognitive Tutors or calculated by the "detectors": counts (and, where relevant, proportions) of transactions overall and of correct, wrong, help/hint-request, "known bug" (misconception), gamed, and off-task transactions; transaction time taken; average gaming and off-task (numerical) estimates; and counts (and relevant proportions) of steps, gamed steps, and off-task steps. We consider aggregating over the entire module and over the sections, units, and skills within the module. For each level, we consider functions of the step-level characteristics (and of their natural logarithm) to determine constructed variables: sum, average, variance, max, and min. The schema for constructed variable names is LEVEL_level-name_function(characteristic). Applying these functions at the different aggregation levels "explodes" a set of a few hundred variables, which we "prune" by removing uninformative and redundant variables; for highly correlated pairs, we remove the variable with the lower correlation to the target. From the 20 variables with the highest correlation to the target, we randomly select sets of 9 variables, apply FCI, and seek the set that maximizes the average causal R² afforded by the resulting PAG. More work is required to determine the best sizes for these variable sets.

The full search shows the importance of the count of misconceptions in the module: MODULE_sum(count_known_bugs). Augmenting our baseline variables with this variable, we apply FCI (PAG in Fig. 2). Our baseline model suggests that the o—o edge between number_steps_gamed and "misconceptions" can be oriented as →; the latter is a more proximate cause of learning.

Fig. 2. Augmented baseline PAG; average causal R² = .5816.

For the full search, our training set consists of variables computed over 80% of steps, randomly sampled over all students; test-set variables are computed over the remaining steps. The PAG (Fig. 3) that maximizes causal predictability for the training set has average causal R² = .614 (test-set average causal R² = .5137).
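To make the "average causal predictability" score concrete, here is a minimal sketch under stated assumptions; it is not the authors' implementation. It assumes the DAGs compatible with a PAG have already been enumerated and are summarized by the set of direct causes (parents) of the outcome in each DAG; the parent_sets argument stands in for that enumeration, and the variable names in the usage comment come from the models above.

```python
# Sketch only: average R^2 of regressing the outcome on its direct causes,
# one regression per DAG compatible with the PAG (DAGs treated as equi-probable).
from typing import Iterable, List
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

def average_causal_r2(data: pd.DataFrame,
                      outcome: str,
                      parent_sets: Iterable[List[str]]) -> float:
    """Average causal predictability over the DAGs summarized by parent_sets."""
    r2_values = []
    y = data[outcome].to_numpy()
    for parents in parent_sets:
        if not parents:  # outcome has no direct causes in this DAG
            r2_values.append(0.0)
            continue
        X = data[list(parents)].to_numpy()
        model = LinearRegression().fit(X, y)
        r2_values.append(model.score(X, y))  # in-sample R^2
    return float(np.mean(r2_values))

# Hypothetical usage: two DAGs in the equivalence class differ in whether
# number_steps_gamed is a direct cause of final_exam.
# score = average_causal_r2(students, "final_exam",
#                           [["MODULE_sum(count_known_bugs)"],
#                            ["MODULE_sum(count_known_bugs)", "number_steps_gamed"]])
```

The variable-set search then wraps this score in a loop: sample a set of candidate variables from those most correlated with the target, run FCI to obtain a PAG, and keep the set whose PAG yields the highest average causal R² on the training data.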
We establish that at least one variable mediates a (likely) causal link between "gaming" and learning. Future work will extend and generalize this approach and refine the search space.

Fig. 3. PAG for full variable search.