"Introduction. Increasingly, education and training are delivered beyond the constraints of the classroom environment, and the increasingly widespread availability of online repositories, educational digital libraries, and their associated tools are major catalysts for these changes (Borgman et al., 2008; Choudhury, Hobbs, & Lorie, 2002). Teachers, of course, are a primary intended audience of educational digital libraries. Studies have shown that teachers use digital libraries and web resources in many ways, including lesson planning, curriculum planning (Carlson & Reidy, 2004; Perrault, 2007; Sumner & CCS Team, 2010), and looking for examples, activities as well as illustrations to complement textbook materials (Barker, 2009; Sumner & CCS Team, 2010; Tanni, 2008). Less frequently mentioned ways are learning about teaching areas (Sumner & CCS Team, 2010; Tanni, 2008), networking to find out what other teachers do (Recker, 2006), and conducting research (Recker et al., 2007). These studies, however, were generally conducted in laboratory-like settings, using traditional research methods, such as interview, survey, and observation. Due to the distributed nature of the Web, traditional research methods and data sources do not support a thorough understanding of teachers’ online behaviors in large online repositories. In response, web-based educational applications are increasingly engineered to capture users’ fine-grained behaviors in real-time, and thus provide an exciting opportunity for researchers to analyze these massive datasets, and hence better understand online users (Romero & Ventura, 2007). These records of access patterns can provide an overall picture of digital library users and their usage behaviors. With the help of modern data mining techniques—the discovery and extraction of implicit knowledge from one or more large databases (Han & Kamber, 2006; Pahl & Donnellan, 2002; Romero & Ventura, 2007)—the data can further be analyzed to gain an even deeper understanding of users. Yet, despite the wealth of fine-grained usage data, data mining has seldom been applied to digital library user datasets, especially when studying teacher users. The study reported in this article used a particular digital library tool, called the Instructional Architect (IA.usu.edu), which supports teachers in authoring and sharing instructional activities using online resources (Recker, 2006). The IA was used as a test bed for investigating how the data mining process in general, and clustering methods in particular, can help identify the different and diverse teacher groups based on their online usage patterns. This study built substantially on results from a preliminary study that also used a clustering approach (Xu & Recker, in press). In particular, both studies relied on a clustering approach that used a robust statistical model, latent class analysis (LCA). In addition, this study used more refined user feature space, and frequent itemsets mining was used to clean and extract common patterns from the clusters initially generated. Lastly, as a means of validation the clustering results, we explored the relationship between teachers’ characteristics (comfort level with technology and teaching experience) and the teacher clusters that emerged from the study. This article is organized as follows. The literature review first describes the Knowledge Discovery and Data Mining (KDD) process, and several clustering studies conducted with educational datasets. 
This is followed by a brief introduction to the Instructional Architect tool. We then describe our data mining approach, starting from data collection and selection, through data analysis, interpretation, and inference. Finally, as part of the interpretation process, we triangulated data from teachers' registration profiles to validate the clustering results. We conclude with the implications, contributions, and limitations of this work.

Literature review. This section describes the general data mining approach, and reviews several clustering studies set within educational contexts.

Educational data mining. There is increasing interest in applying data mining (DM) to the evaluation of web-based educational systems, making educational data mining (EDM) a rising and promising research field (Romero & Ventura, 2007). Data mining is the discovery and extraction of implicit knowledge from one or more large databases, data warehouses, and other massive information repositories (Han & Kamber, 2006; Pahl & Donnellan, 2002; Romero & Ventura, 2007). When the context is the Web, it is sometimes explicitly termed web mining (Cooley, Mobasher, & Srivastava, 1997). Educational data mining, as an emerging discipline, is concerned with applying data mining methods to explore the unique types of data that come from educational settings (Baker & Yacef, 2009). As web-based educational applications are able to record users' fine-grained behaviors in real time, a massive amount of data becomes available for researchers to analyze in order to better understand an application's impact, usage, and users (Romero & Ventura, 2007).

The knowledge discovery and data mining (KDD) process typically consists of three phases: 1) preprocessing datasets, 2) applying data mining algorithms to analyze the data, and 3) post-processing results (Cooley et al., 1997; Romero & Ventura, 2007). Data preprocessing refers to all the steps necessary to convert a raw dataset into a form that can be ingested by a data mining algorithm. It may include any of the following tasks: data cleaning, missing value imputation, data transformation, and data integration. The application of data mining algorithms usually has one of two purposes: description and prediction. Description aims at finding human-interpretable patterns that describe the data; prediction attempts to discover relationships between variables in order to predict the unknown or future values of similar variables. Currently, there is no universal standard for post-processing and evaluating data mining results. Typical interpretation techniques draw from a number of fields, such as statistics, data visualization, and usability studies.

Clustering studies in educational settings. The increasing availability of educational datasets and the evolution of data mining algorithms have made educational data mining a major interdisciplinary area, lying between the fields of education and information/computer sciences. Based on Romero and Ventura's (2007) educational data mining survey, the most commonly used data mining techniques include statistical data mining, classification, clustering, association rule mining, and sequential pattern mining. This study focused on using a clustering approach to analyze teachers' online behaviors when using a digital library tool. As such, several clustering studies using educational datasets are reviewed below.
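Before turning to the individual studies, the following minimal sketch illustrates the clustering phase of such a KDD pipeline, using K-means, the algorithm featured in several of the studies below. It assumes scikit-learn and a small synthetic matrix of usage counts; it is not drawn from any of the reviewed datasets.

```python
# Minimal sketch of the clustering phase in a KDD pipeline, using K-means.
# Assumes scikit-learn; the usage counts below are synthetic, for illustration only.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic per-user counts, e.g., pages browsed, resources saved, projects created
X = rng.poisson(lam=[5, 2, 1], size=(200, 3)).astype(float)

X_scaled = StandardScaler().fit_transform(X)                            # preprocessing phase
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_scaled)   # mining phase

# Post-processing phase: inspect cluster sizes and average behavior per cluster
for label in range(3):
    members = X[model.labels_ == label]
    print(f"cluster {label}: n={len(members)}, mean counts={members.mean(axis=0).round(2)}")
```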
Hübscher, Puntambekar, & Nye (2007) used K-means and hierarchical clustering techniques to group students who used CoMPASS, an educational hypermedia system that helps students understand relationships between science concepts and principles. K-means is a cluster analysis method that aims to partition n data points into k clusters, in which each data point belongs to the cluster with the nearest cluster center. Hierarchical clustering is a cluster analysis method that seeks to build a hierarchy of clusters. In CoMPASS, navigation data was collected in the form of navigation events, where each event consisted of a timestamp, a student name, and a science concept. After preprocessing, K-means and hierarchical clustering algorithms were used to find student clusters based on the structural similarity between navigation matrices.

Durfee, Schneberger, & Amoroso (2007) analyzed the relationship between student characteristics and their adoption and use of particular computer-based training software, using factor analysis and self-organizing map (SOM) techniques. Survey responses to questions regarding user demographics, computer skills, and experience with the software were collected from over 40 undergraduate students. They used SOM to cluster and visualize the dataset. By visually analyzing the similarities and differences of the shades and borders, four resulting student clusters were identified. Finally, a t test on performance scores supported the clustering decisions.

Wang, Weng, Su, & Tseng (2004) combined sequential pattern mining with a clustering algorithm to study students' learning portfolios. The authors first defined each student's sequence of learning activities as a learning sequence, LS = <s1, s2, ..., sn>, where each si was a content block. They then applied a sequential pattern mining algorithm to find the set of maximal frequent learning patterns from the learning sequences. The discovered patterns were treated as variables in a feature vector: for each learner, bit i was set to 1 if pattern i was a subsequence of the learner's original learning sequence, and 0 otherwise. After the feature vectors were extracted, a clustering algorithm called ISODATA was used to group users into four clusters.

The literature review identified only one clustering study investigating teachers' use of an educational digital library tool. In that study, a clustering approach was applied to model and discover patterns in teachers' use of an online curriculum planner (Maull, Saldivar, & Sumner, 2010). User sessions were first abstracted, and 27 features were selected for the clustering experiments. The study then used K-means and expectation-maximization (EM) algorithms to cluster the user sessions. The two algorithms identified very similar patterns in the largest clusters, such as clicking on instructional support materials, embedded assessments, and answers and teaching tips. However, the authors acknowledged that their study was preliminary, in that there was not complete agreement between the different algorithms on top cluster features or cluster sizes.

There are other clustering studies documented in the literature on educational web mining; however, the above examples are sufficient to reveal some major considerations in discovering user groups in the context of online environments, as follows:
- A user model must be carefully defined that accounts for the task and domain.
Navigational paths, online performance, user characteristics, and a user's prior knowledge are all good candidates for user features.
- Clustering is a generic term for a certain type of data mining method. Researchers must select the clustering algorithm appropriate for their studies; different approaches may produce different results.
- Other data mining methods, such as rule discovery, dimensionality reduction, and missing value imputation, can be used together with clustering algorithms to achieve a better grouping effect.
- To better understand online user behaviors and produce more useful information, the data mining results should be used in conjunction with other data.
- As an indispensable component of the KDD process, evaluation of the clustering results should be conducted whenever possible.

Teachers' use of digital libraries. As noted, the research context is teachers' use of digital libraries, an area that is seeing explosive growth in educational settings (Borgman et al., 2008). While prior work has examined teacher characteristics (such as teaching experience and information literacy skills) and usage patterns, little work has identified quantitative evidence linking these. For example, prior work has noted that teachers often lack the necessary information seeking and integration skills to effectively use online resources (Perrault, 2007; Tanni, 2008). In a nation-wide survey on teachers' perceived value of the Internet, Barker (2009) found a positive correlation between teacher self-reports of the perceived value of the Internet in teaching and their use of hardware/electronic media. However, this work failed to find any correlation between teachers' perceptions and years of teaching experience.

To examine usage, researchers are increasingly turning to web metrics, a close kin of the EDM family. In a review of four educational digital library projects, Khoo et al. (2008) examined the use and utility of web metrics. Others have examined such metrics in conjunction with other sources of data, thereby seeking triangulation and complementarity in findings (Greene, Caracelli, & Graham, 1989). In an evaluation of a digital library service, the Curriculum Customization Service (CCS), Sumner & CCS Team (2010) reported interview data from middle and high school science teachers, and examined how their experiences were supported and clarified by usage log data. However, web metrics do not always agree with teachers' own accounts. For example, in Shreeves and Kirkham's (2004) usability testing of a search portal, 65% of the users reported using the advanced search features; however, transaction log analyses did not support these claims. As such, these studies raise important questions. Since every research method has limitations, which should be trusted when there are discrepancies? Can data triangulation be conducted to help resolve these discrepancies?

Technology context: The Instructional Architect. This research is set within the context of the Instructional Architect (IA.usu.edu), a lightweight, web-based tool developed to support the authoring of simple instructional activities using online learning resources from the National Science Digital Library (NSDL.org) and the wider Web (Recker, 2006). With the IA, teachers are able to search for, select, sequence, annotate, and reuse online learning resources to create instructional web pages, called IA projects.
These IA projects (or projects, for short) can be kept private (private-view), made available only to students (student-view), or made available to the wider Web (public-view). Anyone can visit a public-view IA project, students can access their teachers' student-view IA projects through their student accounts, and private-view IA projects are viewable only by the author. Any registered teacher can duplicate any public IA project by clicking the copy button at the bottom of the project. In this way, the IA provides a service for supporting a teacher community around creating and sharing instructional resources and activities. To date, the IA has over 7,000 registered users who have created over 16,000 IA projects.

To use the IA, a teacher must first register by creating a free IA account, which provides exclusive access to his/her saved resources and projects. As part of the registration process, teachers are asked two optional profile questions: years of teaching experience and comfort level with technology. After logging in, the IA offers two major usage modes: resource management and project management. In the resource management mode, teachers can search for and store links to NSDL resources, web resources, and other users' IA projects. These links are added to teachers' personal collections within the IA. Within the IA's project management interface, teachers only need to enter an IA project's title, overview, and content for the IA system to dynamically generate a webpage, which can then be published. Figure 1 shows an example of a teacher-created IA project.

Purpose and research questions. As noted above, this study relied on results from a preliminary study organized around the KDD process and using latent class analysis (described below) as the clustering algorithm with the same usage data (Xu & Recker, in press). Preliminary results demonstrated LCA's utility by clustering teachers into seven groups based on thirteen features drawn from teachers' online behaviors. Results, however, also suggested the following improvements: 1) a more parsimonious user feature space, 2) inclusion of a cluster pruning process to make the clustering results less ambiguous, and 3) validation of clustering results by triangulating with teacher profile data. As such, the purpose of this study is to build upon results from the preliminary study to better understand teachers' use of the IA. In particular, by implementing the suggested improvements, what usage patterns and clusters emerge when mining teacher usage data? What inferences can be made about teachers' behaviors from the discovered usage patterns? Finally, how can usage patterns be combined with more traditional user data for triangulation purposes?

Figure 1. Screenshot of a teacher-created IA project.

Results. Phase 1 -- Data preprocessing: Generating the user feature space. The dataset included usage data from 661 teachers who registered in the IA in 2009 and had created either public-view or student-view project(s) (57% of the 1,164 teachers who registered during that period). As outlined above, a teacher can assume three general roles in the IA environment: project authoring, project usage, and navigation. In the preliminary study, we generated an initial list of 13 indicators based on teachers' possible behaviors in each of these three roles (Xu & Recker, in press).
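As a rough illustration of how such behavioral indicators can be derived from raw event logs, the sketch below aggregates per-teacher counts with pandas. The table layout and event names (user_id, event_type, and so on) are hypothetical and are not the IA's actual logging schema.

```python
# Hypothetical sketch: derive per-teacher usage indicators from an event log.
# Column and event names are illustrative, not the IA's actual schema.
import pandas as pd

events = pd.DataFrame({
    "user_id":    [1, 1, 1, 2, 2, 3],
    "event_type": ["create_project", "student_visit", "peer_visit",
                   "create_project", "create_project", "browse_project"],
})

# One row per teacher, one column per behavior count
features = events.groupby(["user_id", "event_type"]).size().unstack(fill_value=0)
print(features)
```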
Clustering results from this preliminary study were used to inform how we reduced the complexity of the feature space, by fine-tuning or removing some indicators (see Table 1). Note that the number of student visits referred to the number of times a teacher's project was viewed by his/her students. The number of peer visits referred to the number of times a teacher's projects were viewed by other IA users.

Our dataset also contained variables that were rather skewed or had outliers. The presence of outliers can lead to inflated variance and error rates, as well as distorted estimation of parameters in statistical models (Zimmerman, 1994). For example, 98% of users had a maximum number of student visits below 150; including the 2% of users above this value increased the mean by a factor of 2.5 (from 4.29 to 10.96) and the standard deviation by almost 4.5 times (from 12.58 to 56.48). Thus, eight features in the original dataset were scaled into three levels using ordinal variables. The remaining feature, number of projects, was segmented into two levels. Generally, equal intervals were used to discretize a continuous variable; for features with extremely skewed distributions, professional judgment informed the segmentation.

Phase 2 -- Applying data mining algorithms. This study also used latent class analysis (LCA) (Magidson & Vermunt, 2004) to classify registered teacher users into groups. LCA is a model-based cluster analysis technique, in that a statistical model (a mixture of probability distributions) is postulated for the population based on a set of sample data. LCA offers several advantages over traditional clustering approaches such as K-means: 1) for each data point, it assigns a probability to cluster membership, instead of relying on the distances to cluster means; 2) it provides various diagnostics, such as the log-likelihood (LL), the Bayesian information criterion (BIC), and p-values, to determine the number of clusters and the significance of variables' effects; 3) it accepts variables of mixed types without the need to standardize or normalize them; and 4) it allows for the inclusion of demographic and other exogenous variables as either active or inactive factors (Magidson & Vermunt, 2004).

Traditional LCA (Goodman, 1974) assumes that each observation belongs to only one of the K latent classes, and that all the manifest variables are locally independent of each other. Local independence means that all associations among the variables are solely explained by the latent classes; there are no external associations between any pair of input variables. An example of an external association is having two survey items with similar wording in the questions (Magidson & Vermunt, 2004). LCA uses the maximum likelihood method for parameter estimation. It starts with an expectation-maximization (EM) algorithm and then switches to the Newton-Raphson algorithm when it is close enough to the final solution. In this way, the advantages of both algorithms, the stability of EM and the speed of Newton-Raphson near the optimum, are exploited.

Table 1. User feature space.

The next section describes how LCA was applied to the user feature space, and how the final user clusters were selected. The user feature space (consisting of nine features across the three roles) was used as the input for the LCA.
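The sketch below illustrates the kind of discretization described above, turning a skewed count feature into three ordinal levels with pandas. The bin edges and labels are illustrative assumptions, not the study's actual cut points.

```python
# Sketch of discretizing a skewed count feature into three ordinal levels.
# Bin edges and labels are assumed for illustration; the study chose its own
# cut points, guided by professional judgment for highly skewed features.
import pandas as pd

student_visits = pd.Series([0, 0, 1, 3, 4, 7, 12, 20, 150, 600])

levels = pd.cut(
    student_visits,
    bins=[-1, 0, 10, float("inf")],          # none / a few / many (assumed edges)
    labels=["none", "a few", "many"],
)
print(pd.concat({"visits": student_visits, "level": levels}, axis=1))
```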
Due to the unsupervised nature of clustering studies, it is hard to determine the number of clusters without any predefined guidelines. Therefore, we explored the clustering problem with different values of k, and then observed the common patterns emerging across the different settings. In this way, the clustering results, as defined by the common patterns, were robust and not contingent on a particular setting. The data analysis consisted of four steps: (1) generating preliminary clusters, (2) deriving user patterns, (3) mining frequent user patterns, and finally (4) selecting the final user clusters. Step 1 was used to generate preliminary LCA models. Steps 2 through 4 were used to extract the common patterns, in other words, the final user clusters.

Step 1: Generating preliminary clusters. LCA models were generated for numbers of clusters ranging from k = 3 to k = 15. For all models, we monitored three criteria (R2, BVR, and BIC) to ensure that the optimal model could be achieved. R2, also called the coefficient of determination, is the proportion of the total variation of scores from the grand mean that is accounted for by group membership (Aron, Aron, & Coups, 2009; Howell, 2007). In the context of LCA, it indicates how much of the variance of each indicator is explained by an LCA model (Statistical Innovations, 2005). If an indicator has a very small R2 value, then it is making little contribution to the current latent class model, and the model needs to be adjusted. The bivariate residual (BVR) is a local measure of model fit that assesses the extent to which the observed association between any pair of indicators is explained by the model (Statistical Innovations, 2005). If we encountered a BVR greater than 1 for any pair of indicators, we manually forced a correlation between them. BIC is a posterior estimate of model fit based on comparing the probabilities that each of the models under consideration is the true model generating the observed data (Kuha, 2004). A model with a lower BIC value is preferred over a model with a higher value. The BIC measure is widely used to aid LCA model selection.

The best LCA models under the different numbers of clusters k were selected using the three measures described above. We found that some resulting clusters were too small to demonstrate a reliable pattern. For instance, some clusters had only 10 users, with several of their indicators distributed across all segmentation levels. This means that, after filtering out the outliers, the few users left did not demonstrate a distinctive cluster-wise pattern. In order to obtain representative user patterns, these small clusters were excluded and only clusters larger than a certain threshold, α, were retained. α was defined as the smaller of: 1) 10% of the total number of users, or 2) N / k, where N was the total number of users and k was the number of clusters. In the end, 59 clusters from models with different k were above their respective thresholds.

Step 2: Deriving user patterns. Each valid cluster was then converted into a user pattern, which was a conjunction of the themes of the individual features within that cluster. As noted in Table 1, each feature was segmented into two or three levels. When deriving user patterns, an individual feature's theme for a given cluster referred to how users within the cluster were distributed among the levels of that particular feature.
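Before the worked example that follows, here is a minimal sketch of how a feature's dominant theme within a cluster might be detected programmatically. The data, function name, and level labels are hypothetical; the 70% threshold mirrors the rule the authors describe below.

```python
# Hypothetical sketch: detect one feature's dominant theme within a cluster.
# 'levels' holds the ordinal level of a single feature for every user in the
# cluster; the 0.70 threshold mirrors the rule described in the text below.
import pandas as pd

def dominant_theme(levels: pd.Series, order=("low", "medium", "high"), threshold=0.70):
    share = levels.value_counts(normalize=True).reindex(order, fill_value=0.0)
    for lvl in order:                              # a single dominant level
        if share[lvl] > threshold:
            return f"{lvl} level is dominant"
    for a, b in zip(order, order[1:]):             # two adjacent dominant levels
        if share[a] + share[b] > threshold:
            return f"{a}/{b} levels are dominant"
    return None                                    # no dominant theme; feature dropped

cluster_levels = pd.Series(["low"] * 85 + ["medium"] * 10 + ["high"] * 5)
print(dominant_theme(cluster_levels))              # -> "low level is dominant"
```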
For example, the number of projects was the only two-level indicator, and it had two themes (one project, and more than one project). All other indicators had three levels and thus, in theory, could produce five themes: 1) the lowest level is dominant, 2) the lowest two levels are dominant, 3) the middle level is dominant, 4) the highest two levels are dominant, and 5) the highest level is dominant. To constitute a dominant level (e.g., the lowest level is dominant) or dominant adjacent levels (e.g., the lowest two levels are dominant), more than 70% of users had to fall into that level or those levels. For instance, when the number of clusters k = 3, 84.6% of teachers in the second cluster had only a few words (the lowest level) of project content; thus, this cluster was labeled with the "lowest level is dominant" theme for the number of project content words feature. The goal of Step 2 was to derive user patterns from the observed dominant themes. The 70% rule was reached after several trial experiments: setting a higher percentage bar left fewer dominant themes from which to make inferences, while a lower bar was too lenient and hardly produced distinctive traces for each cluster. Thus, we settled on 70%. It is worth noting that although we had one 2-level feature and eight 3-level features, which in theory could produce 42 themes in total, only 30 dominant themes emerged from this study. If a feature under a certain setting did not display a dominant theme, it was dropped from that particular cluster. Lastly, the dominant themes for each cluster were combined to represent a usage pattern. Again taking the second cluster when k = 3 as an example, its final usage pattern was: {the number of projects = more than one AND the number of words in project overview = none or a few AND the number of words in project content = none or a few AND the number of resources in project = none or a few AND the number of student visits = a few or many AND the number of projects being copied = none}.

Step 3: Mining frequent user patterns. Frequent itemsets mining (Han & Kamber, 2006) was used to find the user patterns that most often occurred together, in particular identifying the itemsets that appear in more than a certain proportion of the entire dataset. In data mining terms, this proportion threshold is called support. In this study, we set the minimum support at 10%. This means that, in order to be considered a frequent user pattern, a combination of feature themes needed to appear six times or more in the 59 usage patterns generated in Step 2. An open-source data mining tool, Weka, was then used for frequent itemsets mining, and it identified 24 1-itemsets, 110 2-itemsets, 190 3-itemsets, 182 4-itemsets, 102 5-itemsets, 31 6-itemsets, and four 7-itemsets as frequent user patterns. For example, {number of projects = one AND number of words in project overview = high AND number of words in project content = high AND number of project resources = high AND number of student visits = zero} is one of the discovered 5-item frequent user patterns.

Step 4: Selecting final user clusters. The final user clusters were selected from among the frequent itemsets. Selecting meaningful and useful patterns from the large number of frequent itemsets can be a difficult and subjective process. In this study, four principles were used to guide the selection process:
1. Mutual exclusiveness. The selected frequent itemsets should not overlap in any individual feature's theme.
This guaranteed that the final user clusters had no conflicting patterns, and thus any user would belong to only one final cluster.
2. Balance. Balanced cluster sizes (N) were preferred; a cluster that was too small (N < 100) or too large (N > 200) was not selected even if it met all the other principles.
3. Comprehensiveness. Recall that the user feature space covered three roles: project authoring, resource usage, and navigation. Ideally, the final selected frequent itemsets should exhibit distinctive themes in all three aspects of the feature space. If this could not be met, a frequent itemset covering more roles was preferred.
4. Maximum. Given two similar user patterns that both met the other three principles, the pattern containing more items (pairs of features and themes) was preferred. If the two patterns contained the same number of items, the one with more users was preferred.

The four clustering and cluster pruning steps produced three user clusters, as shown in Table 2. Each user cluster represented a distinctive user pattern, and the defining indicators are noted with asterisks. These indicators are the dominant themes of each cluster. As part of the data post-processing phase, the next section provides an interpretation of and labels for the three clusters, based on their overarching characteristics.

Phase 3 -- Data post-processing I: Interpreting the clustering results.

Cluster 1: Key brokers (N = 108). Teachers in this group were frequent browsers, had verbose projects, and created projects that attracted visits from other people. Of all three groups, this group scored relatively high on the other measures, except for the maximum number of student visits, which was lower than in cluster 2. This group did not necessarily share every single project with the public, but was careful in selecting what to share, suggesting that teachers in this group gave serious thought to their IA projects. If the IA is viewed as a learning community, teachers in Cluster 1 were the "stickiest" members and key brokers, because they appeared to be willing to observe and learn from others and also to give back to the community.

Cluster 2: Insular classroom practitioners (N = 114). This group of teachers did not create high-quality projects; their projects were characterized by few resource links, limited overviews, and little content. Meanwhile, they did not visit the IA or browse others' IA projects as often as teachers in cluster 1, nor did they copy other teachers' projects for their own use. In spite of their lack of enthusiasm for creating IA projects, they appeared to implement their IA projects in classroom teaching. Students viewed their projects at least once; 50% of the teachers in this group had projects viewed by students five times or more, and 30% had projects viewed by students 10 times or more. Given their behaviors, this group is dubbed the insular classroom practitioners.

Cluster 3: Inactive islanders (N = 126). This group of teachers published only one IA project each. These published projects appeared good when judged by the three project authoring measures: a medium amount of text in the overview, a relatively verbose project body, and a reasonable number of resource links. In terms of navigation, this group appeared to be relatively inactive, as it was low on all three navigation measures.
We speculate that, because these users did not explore the IA as much as those in cluster 1, their knowledge of how to use the IA, as well as their skills in creating quality IA projects, may have been affected. The IA was designed to allow teachers to collect and reuse web resources, and to borrow curricular ideas from each other. Since this group was isolated from others and showed little navigation, it was dubbed the inactive islanders.

Table 2. Final User Clusters.

Phase 3 -- Data post-processing II: Triangulating the clustering results.

This section describes our second data post-processing effort, in which we validated the cluster interpretations with an additional triangulation study. When teachers first register for their free IA account, they are asked to optionally answer two user profile questions: years of teaching experience (0-3 or 4+) and comfort level with technology (on a scale of 0 "low" to 4 "high"). Of the teachers in the three clusters (N = 348), 116 reported their years of teaching and 292 reported their comfort level with technology. Tables 3 and 4 show how these profile items are distributed across the three user clusters. The tables show that the key brokers cluster had a larger proportion of tech-savvy teachers than the other two groups, and that the insular classroom practitioners group mostly consisted of novice teachers. To test whether this is a random effect, a chi-square test and an exact test were used as preliminary analyses to evaluate the frequency distributions of the demographic profile across the different clusters. The chi-square test is used when the sample size is large, while the exact test is used when any cell has a small (< 5) or zero count.

Table 3. Teacher clusters by teaching experience.

Table 4. Teacher clusters by comfort level with technology.
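As a sketch of the preliminary tests just described, the snippet below runs a chi-square test on a cluster-by-profile contingency table and Fisher's exact test on a 2x2 table, using scipy. The counts are made up for illustration and are not the study's data.

```python
# Illustrative sketch of the preliminary tests: chi-square on a cluster-by-profile
# contingency table, and Fisher's exact test for a 2x2 table with small counts.
# All counts below are invented for illustration only.
from scipy.stats import chi2_contingency, fisher_exact

# Rows: clusters 1-3; columns: teaching experience 0-3 years vs. 4+ years
observed = [[12, 28],
            [25, 15],
            [14, 22]]
chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, dof = {dof}, p = {p:.3f}")

# Fisher's exact test applies to a 2x2 table, e.g., one cluster versus the rest
odds_ratio, p_exact = fisher_exact([[12, 28], [39, 37]])
print(f"Fisher's exact p = {p_exact:.3f}")
```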