Mining and Visualizing Visited Trails in Web-Based Educational Systems

C. Romero

Manuel Freire

S. Ventura

Sergio Gutierrez

Proceedings of Educational Data Mining, 2008


A data mining and visualization tool for the discovery of student trails in web-based educational systems is presented and described. The tool uses graphs to visualize results, allowing non-expert users, such as course instructors, to interpret its output. Several experiments have been conducted, using real data collected from a web-based intelligent tutoring system. The results of the data mining algorithm can be adjusted by tuning its parameters or filtering to concentrate on specific trails, or to focus only on the most significant paths.

"1. Results showing routes in text mode and in the graph visualization window. In Figure 1, each node represents a web page, and the directed edges (arrows) indicate how the students have moved between them. The toolâ€™s graph visualizations are generated using the CLOVER framework, described in greater detail in [3]. Edge thickness varies according to edge weight; this allows users to quickly focus on the most important edges, ignoring those that have very low weights. In addition to line widths, numerical weights are also available. This information can be useful to a learning designer in different ways. First, it can be used as a limited auditing tool, providing a deeper understanding of the learning paths effectively followed by the students. Additionally, comparing this information with expected a priori paths allows the designer to refine the sequencing strategy. The results in the graph can show information that was not know in the first place, e.g. which activities are the most difficult, which are easier than expected (shown as more common transitions), etc. 3 Experimental Results. This study uses real data collected from a web-based intelligent tutoring system [9] for the domain of Operating Systems. Although the original log file contains sessions from 88 students that used the system in 2006, the study covers only the subset of â€œgood usersâ€ (those with more than two sessions). The study covers data from 67 students, with 754 sessions (using 25 minute timeouts) and 1121 records in total. We have carried out several experiments focused on HPGâ€™s sensibility to its parameter values in order to obtain different configurations of the graph (number of nodes, links, routes, and average route length). Results with varying parameters are displayed in Table 1. Table 1. HPG: Comparison varying alpha and cut-point parameters. 
Support and confidence thresholds give the user control over the quantity and quality of the obtained trails, while α modifies the weight of the first node in a user navigation session. In Table 1, the support must be set very low in order to obtain routes with α = 0. This is because there are few start nodes: students have started their sessions in different nodes, and none of these has a significantly higher probability. This changes as α increases, since there are progressively more visited nodes. The number of routes, nodes and links increases as the support is decreased. On the other hand, the number of resulting nodes, links and routes, and the average route length, increase greatly when the confidence value is decreased. This effect is more evident on links and routes. It can be traced to the fact that the confidence threshold prunes the intermediate transactions that do not have a derivation probability above the cut-point.

It must be noted that the user of the HPG algorithm can use the alpha, support and confidence thresholds, together with the three available filters, in order to obtain a suitable number of trails. The learning designer must work with the course lecturer in order to tune these parameters to a particular community of learners. Then, combining the information in the table with that displayed in the graph, the instructor can focus on the most visited routes in order to make decisions on the organization of the educational web space, or recommend paths and shortcuts to learners.

4 Conclusions

This paper has described a data mining and information visualization tool that helps authors and instructors discover the trails followed by students within web-based educational systems. The resulting networks are then visualized using a graph representation with edges of varying thickness, which is more compelling to non-specialized users than textual output.
Future plans include the addition of other sequential pattern mining algorithms such as AprioriAll and PrefixSpan. Our goal is to use the tool to provide personalized trails to students, delivering on the promise of personalized learning within adaptive e-learning systems.

Acknowledgments. The authors gratefully acknowledge the financial subsidy provided by the Spanish Department of Research under TIN2005-08386-C05-02 and TIN2007-64718, and the British Teaching and Learning Research under grant RES-139-25-0381.
