A Unified Framework for Multi-Level Analysis of Distributed Learning

InProceedings

Proceedings of 1st Learning Analytics and Knowledge (LAK2011), Feb 28 - Mar 1, 2011

2011

Learning and knowledge creation is often distributed across multiple media and sites in networked environments. Traces of such activity may be fragmented across multiple logs and may not match analytic needs. As a result, the coherence of distributed interaction and emergent phenomena are analytically cloaked. Understanding distributed learning and knowledge creation requires multi-level analysis of the situated accomplishments of individuals and small groups and of how this local activity gives rise to larger phenomena in a network. We have developed an abstract transcript representation that provides a unified analytic artifact of distributed activity, and an analytic hierarchy that supports multiple levels of analysis. Log files are abstracted to directed graphs that record observed relationships (contingencies) between events, which may be interpreted as evidence of interaction and other influences between actors. Contingency graphs are further abstracted to two-mode directed graphs that record how associations between actors are mediated by digital artifacts and summarize sequential patterns of interaction. Transitive closure of these associograms yields sociograms, to which existing network analytic techniques may be applied, yielding aggregate results that can then be interpreted by reference to the other levels of analysis. We discuss how the analytic hierarchy bridges between levels of analysis and theory.

"1 Introduction. The rapid adoption of information and communication technologies (ICT) in support of â€œonline,â€ â€œdistributed,â€ and â€œnetworkedâ€ learning and knowledge creation activities [1], and their blending with face-to-face venues [14] is well known to the research community to which this paper is addressed. In this paper we use learning as shorthand to include any enhancements of individual or collective knowledge or skills, whether or not it occurs in formal educational settings. We include in our scope of interest learning in (for example) online university settings, professional communities, and virtual organizations [2, 4, 8, 29]. We will refer to these collectively as socio-technical networks [19]. A related trend is towards open learning communities. Courses in formal educational settings need no longer isolate participants from others in different courses, but can embed courses in online communities of learners, for example supporting transdisciplinary graduate education [10]. In corporate or other work settings, professional learning communities similarly may cross team contexts rather than being isolated in work teams [42]. The fundamental question of interest in all of these settings is how learning takes place through the interplay between individual and collective agency. All learning activity requires that individuals take actions, but these individual actions are contingent on the actions of others in their socio-technical network contexts, actions that reflexively construct those contexts. The first analytic challenge addressed by this paper is that learning and knowledge creation activities in these networked environments are often distributed across multiple media and sites. As a result, traces of such activity may be fragmented across multiple logs. For example, the networked learning environments we study offer mixtures of threaded discussion, synchronous chats, wikis, whiteboards, profiles, and resource sharing. 
Events in these media may be logged in different formats and recorded in databases and text files, disassociating actions that for participants were part of a single unified activity. This disassociation is exacerbated when activity is distributed across multiple virtual sites or spread over time. Also, the granularity at which events are recorded may not match analytic needs, and media-level events may be the wrong ontology for analyses that begin with relationships rather than individual acts. Translation from log file representations to other levels of description may be required to begin the primary analysis. As a result of these various issues, the coherence of distributed interaction and phenomena that emerge from this interaction are analytically cloaked. Furthermore, understanding distributed learning and knowledge creation requires multi-level analysis of the situated accomplishments of individuals and small groups and of how these local accomplishments give rise to larger phenomena in networks such as the dissemination and transformation of ideas, implicit coordination of the activities of many participants, and the accrual of collective knowledge. Consider the question of how the design of the virtual environment influences emergent phenomena. Everything builds on the existence of multiple successive moments in which an individual is experiencing some presentation of the virtual environment, cares enough to act, and is able to choose an appropriate action. Whether and how this action has implications for network or community level phenomena requires that some trace of the action be given persistent form that other participants might later encounter in their experience of the virtual environment [18]. Appropriate aggregation and availability of such traces can drive dissemination of ideas, align participants, and lead to accrual of collective resources of value. 
Critically, an empirically grounded understanding of this emergence requires analysis at both fine-grained and aggregate levels. The same can be said for understanding the relationship between small group interactions and larger scale phenomena. In summary, since interaction is distributed across space, time, and media, and the data comes in a variety of formats, there is no single transcript to inspect and share, and the available data representations may not make interaction and its consequences apparent.

To address these concerns (and to support the diverse research in our laboratory), we have developed a framework consisting of an abstract transcript representation that collects relevant events into a single analytic artifact, and an analytic hierarchy that supports multiple levels of analysis. This paper describes the framework and discusses its potential roles in unifying multiple sources of data and bridging between levels of analysis and theory. We discuss how the framework addresses several specific analytic needs, including: (a) scaling up microanalysis of interaction to large data sets, (b) enabling the translation of event logs into tie data appropriate for social network analysis, and (c) interpreting results at one level in terms of another (e.g., relating social network analytic results back to their interactional settings). Throughout the paper, a simple example drawn from our prior research illustrates many of the features of the framework.

2 Preview. The analytic hierarchy consists of several abstraction layers of analytic representations that we have found to be useful, summarized in Table 1. Process traces such as log files are abstracted to domain models describing the actors, actions and media objects involved in event models, which are collections of temporally tagged events. These event models can be further elaborated by installing directed graphs of empirical relationships between events called contingencies.
Contingencies can be any observed relationship between events (e.g., two events are by the same actor, involve the same object, are temporally contiguous or proximal, or overlap in content). Contingencies situate participants' acts in relation to other events, hence the name contextualized action model. The analytic utility of contingency graphs is enhanced if focused on those contingencies that may be interpreted as evidence of uptake: interaction and other influences between actors. When such interpretations are made, contingency graphs are abstracted into uptake graphs, representing interaction models.

Table 1. The Analytic Hierarchy.

Interaction can be further abstracted to two-mode directed graphs, called associograms, which record how associations between actors are mediated by their creation and modification of and access to digital artifacts: hence the name mediation models. Associograms also summarize sequential patterns of interaction, making it easier to localize certain patterns. Reduction of associograms by transitive closure into direct ties between actors yields sociograms, representing tie models. Existing network analytic techniques may be applied to sociograms. The results of network analysis can then be interpreted by reference to the other levels of analysis. Thus associograms bridge between interaction analysis and network analysis.

The analytic concepts (e.g., contingencies, uptake, mediated associations, and ties) in this paper are not new. Rather, the value of the framework relies on the fact that they are abstractions of concepts commonly applied in existing analytic practice (e.g., adjacency, edits, replies, etc.) as will be detailed later. Thus the framework is offered to coordinate and augment rather than replace existing analytic practices. The layers are explained in more detail in the following subsections.
The process trace, domain, and event models and transformations between them are likely to be familiar to readers: brief sections on these layers are included for completeness and to provide the foundation and examples for describing subsequent layers. Contingency and uptake graphs and associograms are more distinctive contributions, so are described in some detail here; see also [37] for extensive discussion of motivations for contingency graphs and examples of their use for uptake analysis. The most abstract layer is covered substantially in the social network analysis literature [e.g., 41], so is described here only in relation to how it is derived from the layer below, and what that vertical relationship enables that would not be possible with direct measurement of ties. Throughout this presentation, applications to the study of learning analytics are discussed.

The methods described in this paper have been applied in numerous analyses of data from an online learning environment and from laboratory studies of ICT-mediated collaboration. At present we are using these techniques in analyses of SRI's Tapped-In teacher professional community [12, 32], a virtual organization that hosts many thousands of education professionals annually in more than 8,000 user-created spaces that include IRC, threaded discussions, shared files and URLs, and other tools to support collaborative work.

3 Process Traces. Any analysis of interaction begins with a process trace, or record of activity left in the environment and accessible to the researcher. Examples include software log data (software application or server logs), audio and video recordings, and textual transcripts. The analytic hierarchy described herein was originally designed to support analysis of both software logs and video recordings, sometimes in conjunction (e.g., we have analyzed application logs and screen capture of the same application [25, 26]).
For learning analytic applications and to emphasize the potential for automated analysis, this paper focuses on software logs, and does not touch on issues of video analysis; see [15, 17] for discussion of such issues. The analytic hierarchy is illustrated throughout this paper by building on a simplified example taken from one of our online learning community applications, disCourse.

Figure 1. From Process Trace to Domain and Event Models.

The disCourse environment provides threaded discussions, wiki pages, resource sharing with searchable metadata, and user profiles, organized in a workspace metaphor that collects together tools and resources relevant to a given group, such as a class [36]. The lower portion of Figure 1 shows excerpts (edited for anonymity and simplicity of presentation) of an HTTP server log (1) from disCourse. The example in this paper builds on these logs. See [37] for the full text of the example.

4 Entity-Relations: Domain Model. Prior to or concurrently with the construction of the event model (next section), it is necessary to construct an ontology of the kinds of entities involved in the application domain of interest. Classes of entities and potential structural relationships between them are defined (e.g., actors, discussions, and messages, related by containment, threading and authoring relations). As the trace or log file is processed, new instances of entities and their structural relations are added to the domain model when they are encountered, along with relevant attributes that are expected to be needed for analysis. This is undertaken in conjunction with construction of the event model. For example, the right-hand side of Figure 1 illustrates a domain model fragment representing how messages m1, ... m4 are created by participants P1, P2, P3 (shown by shading), related to each other by a threading relation, and contained in a discussion forum. The content of messages is also recorded in the domain model.
Temporal information is recorded in the event model, discussed next.

(1) disCourse logs events in a database. HTTP server logs of the same events are shown in this example to illustrate the method using log formats familiar to readers.

5 Events: Event Model. The process trace is transformed into a set of events that constitute an analysis's first commitments concerning the relevant units for analyzing processes. This transformation involves the Exploratory Sequential Data Analysis (ESDA) operations of chunking and coding [31]. For example, the first three lines of the log of Figure 1 all are part of the process of posting a message in a system in which each message is previewed before posting. These three traces are chunked together and represented as the single event w1 in the event model, along with information about the actor (P2, indicated by grey), action taken (w for writing), object (message m1), contents and location (recorded in the domain model), and temporal scope of the action. We call this layer the event model because the focus is on individual actions and other events by nonhuman actants such as software display events. (Actant is Latour's [22] term for non-human entities that yet have agency in networks of associations.) The events have not yet been put in relation to each other, other than ordering along a timeline.

Events may be derived from distinct process traces that come from different media, tools or sites, and are recorded in different formats. For example, chat contributions, wiki edits, whiteboard edits, file uploads, etc. can be merged into a single event stream. (To remain faithful to the case example and avoid complicating the figures, this capability is not illustrated in the figures, but it is a simple extension.)
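The chunking step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the coding step has already reduced each trace record to a hypothetical `(timestamp, actor, action, object)` tuple, and it uses an assumed `max_gap` threshold (in seconds) to decide when successive records belong to the same event. The `Event` class, the `chunk_traces` function, and the sample timestamps are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_id: str   # e.g. "w1"; numbered per action type
    actor: str      # e.g. "P2"
    action: str     # "w" for write, "r" for read
    obj: str        # e.g. "m1"
    start: float    # temporal scope of the event
    end: float


def chunk_traces(traces, max_gap=60.0):
    """Chunk coded trace records into events.

    Records with the same (actor, action, object) that follow each other
    within max_gap seconds are treated as one event (an assumed heuristic).
    """
    open_chunks = {}   # (actor, action, obj) -> index of latest chunk
    chunks = []        # each chunk: [actor, action, obj, start, end]
    for t, actor, action, obj in sorted(traces):
        key = (actor, action, obj)
        i = open_chunks.get(key)
        if i is not None and t - chunks[i][4] <= max_gap:
            chunks[i][4] = t                      # extend the open chunk
        else:
            open_chunks[key] = len(chunks)        # start a new chunk
            chunks.append([actor, action, obj, t, t])

    counters = {}
    events = []
    for actor, action, obj, start, end in chunks:
        counters[action] = counters.get(action, 0) + 1
        events.append(Event(f"{action}{counters[action]}",
                            actor, action, obj, start, end))
    return events


# Hypothetical coded traces: three records for P2 posting m1
# (e.g. compose, preview, post), one read by P1, two records for P1 posting m2.
traces = [
    (0.0, "P2", "w", "m1"), (5.0, "P2", "w", "m1"), (9.0, "P2", "w", "m1"),
    (100.0, "P1", "r", "m1"),
    (130.0, "P1", "w", "m2"), (140.0, "P1", "w", "m2"),
]
events = chunk_traces(traces)
for e in events:
    print(e)
```

Traces from several tools could be merged simply by concatenating their coded records before chunking, provided actor and object identifiers have first been made consistent across sources.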
A key concern is persistence of identity across tools and sites: some work may be required to ensure that each given actor is represented by the same identifier in the event model, and likewise for the identity of digital objects shared across tools (ideally persistence of identity should be addressed in mash-ups for the learners' sake [20]). Once this has been accomplished, the event and domain models taken together provide an abstract transcript of the data that re-assembles in one analytic artifact the diverse events that were for their actors a single activity.

If the transcription is complete with respect to the needs of a given analysis, then it is not necessary to retain the original process traces. However, we retain pointers to the original process traces because it may not be possible to identify all needs in advance. We may need to recover other information from the process trace. Also, any transcript includes initial theoretical commitments [11, 28], which may turn out to be faulty, necessitating a return to the original process traces.

A number of analyses can be undertaken on the event and domain models without further analysis. In our research, this is the level at which we answer basic questions about the distribution of activity in the environment: who is participating with whom, in what virtual sites or contexts, and involving what literal content. But to analyze interaction and uncover ties between actors we must relate events to each other.

6 Contingency Graph: Contextualized Action Model. Contingency graphs are an empirically grounded elaboration of the abstract transcript to make analytically relevant relationships between events explicit. We originally called these relationships dependencies, but have renamed them contingencies because they capture relationships between events that may be merely contingent or incidental to the situation, rather than being causal or deterministic.
The graph simply makes relationships that are latent in the data more explicit, and does not constitute a commitment concerning actors' intentions. Human action can be embedded in its context in many ways, including accidental relationships, or opportunistic leveraging of contextual and historical features as well as necessary antecedents for action [6, 22]. Thus, a contingency graph represents how action is embedded in the context of other events. Examples of contingency types we have used are listed in Table 2 (not intended to be a complete taxonomy). A detailed presentation of the motivations and theory behind contingency graphs and their application to interaction analysis may be found in [37].

Construction of a complete graph of the contingencies between events in a process trace is not practical, as it would result in a graph with a low “signal” to “noise” ratio that is too complex for processing. (Imagine a graph in which each event is linked to every one involving the same actor, or the same object, or that has overlap in lexical content, or occurred nearby in time, and so on.) An analyst chooses those contingencies that are relevant for specific analytic purposes as guided by explicit or implicit theory. Therefore a contingency graph reflects further commitments on the part of the analyst. However, even though a contingency graph is theoretically selective, we always base contingencies on empirically observable relationships between events found in the event and domain models, preferably those relationships that are unambiguous and can be detected automatically. If this standard of evidence is followed, a contingency graph can be treated as an abstract transcript that makes the evidence for interaction or other phenomena of interest manifest.

Table 2. Examples of contingency types.

Figure 2. Installing Contingencies.

Contingency graphs can be constructed automatically from the layers below them [see, for example, 26].
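As a minimal sketch of what such automatic construction might look like, the Python below installs two kinds of contingencies over a time-ordered event list: media dependencies (a read depends on the most recent write of the same object, and a threaded reply depends on the write of its parent message) and same-actor temporal proximity. The event tuples, the `reply_to` threading map, the `window` threshold, and the arc labels are all assumptions made for illustration; they are not taken from the authors' software.

```python
from collections import namedtuple

Event = namedtuple("Event", "id actor action obj time")


def install_contingencies(events, reply_to=None, window=300.0):
    """Return arcs (later_event, earlier_event, kind).

    Each arc points from an event to a prior event it is contingent on.
    reply_to maps a message to its parent message (from the domain model).
    window is an assumed proximity threshold in seconds.
    """
    reply_to = reply_to or {}
    last_write = {}      # object -> id of the most recent write event
    last_by_actor = {}   # actor -> that actor's most recent event
    arcs = []
    for e in sorted(events, key=lambda ev: ev.time):
        # Media dependency: a read depends on the last write of the object.
        if e.action == "r" and e.obj in last_write:
            arcs.append((e.id, last_write[e.obj], "media"))
        # Media dependency: a threaded reply depends on its parent's write.
        parent = reply_to.get(e.obj)
        if e.action == "w" and parent in last_write:
            arcs.append((e.id, last_write[parent], "media"))
        # Same actor, temporally proximal prior event.
        prev = last_by_actor.get(e.actor)
        if prev is not None and e.time - prev.time <= window:
            arcs.append((e.id, prev.id, "same-actor-proximity"))
        if e.action == "w":
            last_write[e.obj] = e.id
        last_by_actor[e.actor] = e
    return arcs


# Hypothetical events: P2 writes m1, P1 reads m1, P1 writes reply m2.
events = [
    Event("w1", "P2", "w", "m1", 0.0),
    Event("r1", "P1", "r", "m1", 100.0),
    Event("w2", "P1", "w", "m2", 160.0),
]
arcs = install_contingencies(events, reply_to={"m2": "m1"})
for arc in arcs:
    print(arc)
```

Restricting each rule to the single most recent matching event is one simple way to keep the number of installed contingencies manageable.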
For example, for each event in which an actor accessed an object we might scan back to find the last event in which the object's contents were modified, and install a media dependency. Contingencies can also be installed from a given event to the most recent prior event involving the actor, to prior events in which the actor accessed a media object with similar inscriptions (e.g., lexical phrases or graphical devices), or to temporally recent events in the same spatial site. A challenge with algorithmic installation of contingencies is limiting their number. Temporal and sequential proximity are useful (and computable) heuristics for selecting relevant contingent events, as they follow the local continuity of human attention and goal directed behavior: what actors do at any given moment is likely to be contingent upon their immediately prior act.

For example, Figure 2 shows the events of Figure 1 with contingencies installed. The single arcs represent media dependencies, and the double arcs represent multiple contingencies, such as temporal proximity combined with same actor and possibly inscriptional similarity. The act of reading a message (r1, r2, etc.) is media-dependent on the act of creating the message (w1, w2, etc.). The act of writing a message (e.g., w2) may be media-dependent on the act of creating the message to which it is a threaded reply (e.g., w1) and is contingent on the messages that the author has recently read (e.g., r1). In this example, the message created by w2 contained a noun phrase in common with that created by w1.

Once constructed, various kinds of analytic actions are possible on contingency graphs. For example, suppose a particularly productive session was identified in which participants made significant ideational progress.
One option is to examine the interaction of the session participants more closely to identify the relationship between group processes and their accomplishments, and how participants appropriated the interactional affordances of the available media for these purposes. We have used the contingency graphs in several studies to support this kind of microanalysis of interaction [24, 25, 26, 38]. Recurring patterns of interaction so identified could be searched for in the overall contingency graph to find other sessions that have similar patterns of activity, to see whether they display similar productivity. Such pattern matching techniques are similar to structural equivalence metrics in social network analysis, which can be employed once contingency graphs are converted into sociograms, as discussed in section 9.

Another option is to look outside the session to find influences from or to other sessions. One can trace same-actor and media-dependency contingencies, following the actors and actants respectively. Tracing proceeds forward in time to see whether the new ideas of the session were disseminated elsewhere, or backward in time to identify possible predecessors of the ideational advance. Such an analysis grounds the concept of brokers in actual accomplishments, not relying solely on structural relationships that do not guarantee such accomplishments. At this writing we are constructing a contingency graph of several years of data from Tapped-In in preparation for application of methods such as those just described.

7 Uptake Graph: Interaction Model. As discussed above, contingencies are so named because they can include circumstantial relationships between acts with varying degrees of relevance to interaction. Analytic interpretation is required to identify relationships between events that are not merely circumstantial, but reflect intentional acts.
An act of uptake is one in which an actor takes traces of one or more prior events as having certain significance for an ongoing activity [37]. For example, a speaker takes up some aspect of the prior speaker's utterance, or a message poster in a discussion forum can take up some aspect of the message being replied to. Uptake is a generalization of all interactional relationships used in analysis, such as comment, reply, elaboration. It includes these relationships, but also applies to spatio-temporally distributed associations between actors in which they may not even be aware of each other, let alone be directing their actions towards each other, such as tagging, downloading, etc. Therefore, uptake is more general than transactivity [5], which requires other-directedness. Uptake is an appropriate generalized unit of interaction in networked learning environments, where individuals may benefit from each other's presence without conversing directly. The essential idea is that the trace an actor's actions have left in the environment (e.g., chat contribution, discussion posting, uploaded file, profile, recommendation) is taken up by another actor in some manner. Uptake of traces can result in stigmergic effects, i.e., implicit distributed coordination of collective action [30]. Illuminating these stigmergic effects reveals the contingencies by which an individual's actions are connected to information, actions, and resources from sources that may otherwise not be known to that individual, even if embedded within one's known social network.

Figure 3. From Contingency to Uptake Graphs.

An uptake graph is an interaction model, as it describes the interaction that the analysis claims is taking place. Although all analytic artifacts from process traces on up involve theoretical decisions, the move from contingency graphs to uptake graphs is a move from primarily empirically accountable representations to those more strongly determined by analytic interpretations.
Representationally, an uptake relation is a subgraph of contingencies, as illustrated in Figure 3. An analyst collects contingencies that are considered to be analytically meaningful: a number of contingencies between two or more acts may corroborate the interpretation that the final act is an intentional taking up of traces of the prior ones. For example, w2, in which P1 posts a reply to the message posted by P2 in w1, is contingent on w1 in these ways: there is a media dependency (m2 is linked by threading to m1); lexical overlap (m2 contains phrases also found in m1); and a chain of temporal proximity (w2 took place shortly after read event r1 by the same actor, and r1 is media-dependent on w1 by virtue of reading m1). All of these contingencies are taken as evidence for an intentional relationship of w2 to w1, and collapsed into one uptake arc. Because of this relationship between contingencies and uptake, an uptake graph may be seen as an abstraction of a contingency graph, and many of the same analytic moves (such as pattern matching and tracing actions) apply to both. Contingency and uptake graphs are described more fully in [37].

We have used contingency and uptake graphs to provide interactional accounts of specific accomplishments of participants [24, 25], to trace out information sharing [40], and to detect roles of participants not visible in the final media trace [38]. For example, examining only reply structure (the threading relationship between messages in Figure 1) we might miss the fact that m4 played an integrative role in this discussion. The uptake graph of Figure 3 makes this integration explicit as a structure of uptake converging on w4. Integrative or convergent acts are important to group learning processes such as intersubjective meaning making [35] and community knowledge building [16].

Contingency and uptake graphs represent process models: they focus on how acts relate to each other and constitute a process of interaction.
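The collapsing of corroborating contingencies into one uptake arc, as in the w2 and w1 example above, might be approximated as follows. This sketch is an assumption-laden stand-in for what the text describes as an interpretive step: it counts distinct kinds of evidence linking two write events by different actors, including a chain through a read by the later writer, and proposes an uptake arc when at least `min_evidence` kinds are present. The threshold rule, the function name, and the data layout are hypothetical and are not the authors' method.

```python
from collections import defaultdict


def propose_uptake(events, contingencies, min_evidence=2):
    """Propose uptake arcs between write events by different actors.

    events: iterable of (event_id, actor, action).
    contingencies: iterable of (later_event, earlier_event, kind).
    Returns {(later_write, earlier_write): sorted evidence labels}.
    """
    actor = {eid: a for eid, a, _ in events}
    action = {eid: act for eid, _, act in events}
    evidence = defaultdict(set)

    # Direct contingencies between two writes by different actors.
    for src, dst, kind in contingencies:
        if (action[src] == "w" and action[dst] == "w"
                and actor[src] != actor[dst]):
            evidence[(src, dst)].add(kind)

    # Chains: a write contingent on the same actor's read, where that
    # read is media-dependent on another actor's write.
    for src, mid, _ in contingencies:
        if (action[src] == "w" and action[mid] == "r"
                and actor[src] == actor[mid]):
            for mid2, dst, kind2 in contingencies:
                if (mid2 == mid and kind2 == "media"
                        and action[dst] == "w"
                        and actor[dst] != actor[src]):
                    evidence[(src, dst)].add("read-then-write chain")

    return {pair: sorted(kinds) for pair, kinds in evidence.items()
            if len(kinds) >= min_evidence}


# The contingencies named in the text for w2 and w1 (labels are illustrative).
events = [("w1", "P2", "w"), ("r1", "P1", "r"), ("w2", "P1", "w")]
contingencies = [
    ("r1", "w1", "media"),       # r1 reads m1, created by w1
    ("w2", "w1", "media"),       # m2 is threaded under m1
    ("w2", "w1", "lexical"),     # m2 shares phrases with m1
    ("w2", "r1", "proximity"),   # same actor, shortly after r1
]
uptake = propose_uptake(events, contingencies)
print(uptake)
```

In practice an analyst would review such proposals rather than accept them automatically, since the move to uptake is an interpretive claim about intentional relationships.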
Their basic unit is acts and other events: the actors and entities through which interaction takes place are attributes of these events. Now we turn to an alternative derived representation that makes these actors and entities explicit, rather than the events.

8 Associograms: Mediation Model. In the study of socio-technical networks, we are interested in how the technological infrastructure enables and is utilized by the social actors to interact with each other. The next layer of the analytic hierarchy makes the objects of this technological infrastructure explicit and shows how they mediate interaction between participants. Analysis at this layer provides the mediation model, and is represented by multimodal bipartite graphs in which participants are related to each other via the objects through which they interact. We call these graphs associograms to distinguish them from sociograms in a manner that honors Latour's [22] concept of mediated associations that assemble a social system. Associograms are multimodal because there may be two or more types of nodes (actors and the various types of media through which they interact), and they are bipartite because they are divided into two partitions: actors in one partition and the various types of media objects in the other. Directed arcs represent state-influence (a weaker form of state-dependency): they extend from an object to an actor if the state of the object is influenced by some action of the actor (e.g., writing a message or editing a wiki), and from the actor to the object if the state of the actor has been influenced by accessing the object (e.g., reading a message or wiki).

One can construct associograms from a set of events, whether taken directly from the event model, or events of interest that were selected from the contextualized action or interaction models (contingency graphs or uptake graphs, respectively).
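A minimal sketch of this construction, under stated assumptions, is given below in Python. It follows the arc direction rule above: a write yields an arc from the object to the actor, and a read yields an arc from the actor to the object. The event tuples, the event name r2, and the extra repeated read r3 are hypothetical additions for illustration, and keeping a list of originating event ids on each arc is one assumed way to retain weights and back-pointers.

```python
from collections import defaultdict


def build_associogram(events):
    """Map events to directed arcs between actors and objects.

    events: iterable of (event_id, actor, action, obj), action in {"w", "r"}.
    Returns {(source, target): [event ids]}, so len() gives the arc weight
    and the ids serve as back-pointers to the originating events.
    """
    arcs = defaultdict(list)
    for event_id, actor, action, obj in events:
        if action == "w":
            arcs[(obj, actor)].append(event_id)   # object state depends on actor
        elif action == "r":
            arcs[(actor, obj)].append(event_id)   # actor state depends on object
    return dict(arcs)


# Hypothetical event list: P2 writes m1, P1 reads it (twice) and replies
# with m2, and P2 reads m2.
events = [
    ("w1", "P2", "w", "m1"),
    ("r1", "P1", "r", "m1"),
    ("w2", "P1", "w", "m2"),
    ("r2", "P2", "r", "m2"),
    ("r3", "P1", "r", "m1"),
]
associogram = build_associogram(events)
for (source, target), ids in associogram.items():
    print(f"{source} -> {target}  weight={len(ids)}  events={ids}")
```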
A node in any of these models represents an event, and actors and objects are attributes of the node. This is largely reversed in an associogram: actors and objects are nodes, and events are links between nodes. For example, in Figure 4, w1 (the event of P2 writing m1) becomes a directed association from m1 to P2 (m1's state depends on P2), and r1 (the event of P1 reading m1) becomes a directed association from P1 to m1.

Figure 4. From Events to Associogram.

An associogram can be constructed at different granularities. Object nodes could be created for each individual object (e.g., one node for each message, wiki page, chat, etc., as in Figure 4), or they could be aggregated for object types (e.g., all associations via messages aggregated into a single node, those via wikis in another, etc.) in order to characterize how interaction is distributed across types of media. Some information is lost in either case: all the events involving an actor and an object will fall into the same two nodes and links between them. For example, if P1 reads m1 multiple times there is still only one link from P1 to m1, and if P2 edits a wiki multiple times, there is still one link from the wiki to P2. Some of this information can be preserved by weighting the links with number of occurrences, or by putting backpointers to the originating event nodes. Temporal sequencing is mostly lost, though it can be recovered by following these backpointers to the contingency graph. This information reduction is actually an advantage of associograms: they reduce the clutter of interaction models to expose recurring patterns of mediation. An example is given next.

8.1 Finding Interaction Patterns. Associograms can help expose patterns of interest in contingency or uptake graphs. For example, consider the question of finding which participants are in dialogue with each other.
Dialogue is clearly a prerequisite for learning through argumentation, intersubjective meaning-making and group cognition [3, 33, 35]. A key indicator of the presence of dialogue is what we call a round trip: one participant makes a contribution that is accessed by another participant who then makes a contingent contribution (evidencing uptake) that the first participant then accesses [40]. In a contingency graph one would need to trace out many paths from each participant to find paths that go to another participant via a read and then a write and then back to the first participant. In an associogram one need only find cycles in the graph. If the links are weighted with frequency counts, the minimum weight of the path is taken as a measure of extent of dialogue. For example, in Figure 4 there is a cycle (following the arrows in reverse to trace chronology rather than dependency) P2 ← m1 ← P1 ← m2 ← P2. This corresponds to the round trip in which P2 posts m1, P1 reads it and posts m2 in reply and P2 reads m2, completing the round trip. Note that P2 need not post a reply to m2 to complete the round trip: an analysis that looks only at the threading structure of posted messages and does not include read events would miss this round trip.

8.2 Characterizing Mediation. Degree and path analysis of an associogram can reveal the roles different media play in a socio-technical network. Media objects or media types (in an aggregate associogram) that have high in-degree are accessed by many actors, and hence may be influential sites where an educational intervention can reach many participants in a socio-technical network. Those with high out-degree are modified by many actors, and hence may be sites where ideas are aggregated or consolidated (potential roles as community memory, or locus of knowledge building). In a weighted associogram, heavily weighted links indicate that actors visit the incident objects repeatedly.
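Both the round-trip search and the degree measures can be sketched over a weighted associogram. The Python below is illustrative only: it assumes the associogram is stored as a hypothetical dictionary from `(source, target)` arcs to frequency counts, looks for four-arc cycles that pass through two distinct actors, and reports the minimum arc weight on the best such cycle as the extent of dialogue. Objects on the cycle are not required to be distinct, and the third participant P3 is invented to show a non-dialogic reader.

```python
from collections import Counter, defaultdict


def round_trips(weights, actors):
    """Find actor pairs joined by a cycle actor -> obj -> actor -> obj -> actor.

    weights: {(source, target): count}. Returns {(a, b): extent}, where
    extent is the largest minimum arc weight over the cycles joining a and b.
    """
    successors = defaultdict(list)
    for source, target in weights:
        successors[source].append(target)

    best = {}
    for a in actors:
        for o2 in successors[a]:              # a accessed o2
            for b in successors[o2]:          # o2 was modified by b
                if b == a or b not in actors:
                    continue
                for o1 in successors[b]:      # b accessed o1
                    if (o1, a) in weights:    # o1 was modified by a
                        extent = min(weights[(a, o2)], weights[(o2, b)],
                                     weights[(b, o1)], weights[(o1, a)])
                        pair = tuple(sorted((a, b)))
                        best[pair] = max(best.get(pair, 0), extent)
    return best


def object_degrees(weights, actors):
    """In-degree: number of actors accessing an object.
    Out-degree: number of actors who modified it."""
    in_degree, out_degree = Counter(), Counter()
    for source, target in weights:
        if source in actors:
            in_degree[target] += 1
        else:
            out_degree[source] += 1
    return in_degree, out_degree


# Hypothetical weighted associogram; P3 only reads m1.
actors = {"P1", "P2", "P3"}
weights = {
    ("m1", "P2"): 1, ("P1", "m1"): 2,
    ("m2", "P1"): 1, ("P2", "m2"): 1,
    ("P3", "m1"): 1,
}
trips = round_trips(weights, actors)
in_degree, out_degree = object_degrees(weights, actors)
print(trips)
print(dict(in_degree), dict(out_degree))
```

Because temporal order is not kept in the associogram, a cycle found this way is only an indicator of a round trip; confirming it would mean following back-pointers to the underlying events.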
These measures may be compared between different media types to assess their relative roles. Additional roles can be identified, such as liaison roles, where the media object or type connects other objects or actors that would not otherwise be reachable. For example, we have used associograms constructed from bridging events to assess the roles of different media (discussions, wikis, resources and profiles) in mediating bridging in a socio-technical network [36].

8.3 Characterizing Mediated Relationships. Associograms summarize how objects directionally mediate the interaction between any given two people. The subgraph of all paths of length two (direct mediation) between two persons can be used in at least two ways to characterize the relationships between those persons as mediated by the socio-technical network. First, we can recognize defined patterns, two of which are shown in Figure 5. Second, profiles of mediated interaction between any two people can be represented as vectors of the weights on paths of different types and directions (e.g., P1 to P2 via discussions, P2 to P1 via discussions, P1 to P2 via wikis, etc.). Cluster analysis of these vectors can reveal recurring types of relationships. These approaches are currently being investigated in a dissertation by Kar-Hai Chu, under the authors' direction.

Figure 5. Pairwise Associations (Relationship Model).

9 Sociograms: Tie Model. Finally, we briefly note that associograms can be transformed to conventional sociograms by transitive closure of the paths between actors, or by other computations that interpret patterns of mediated associations as ties. As shown in Figure 6, this results in a directed graph or an asymmetric matrix representing the ties between actors. Well-established methods of social network analysis (SNA) can then be applied [41], but with advantages that would not be realized if one had merely constructed sociograms directly from source data (e.g., surveys about ties).
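One simple computation of this kind, offered here only as a sketch, collapses each length-two path from an actor through an object to another actor into a direct tie. It is a simplification rather than a full transitive closure, since longer paths are ignored, and the weighting rule (summing the minimum of the two arc weights over all mediating objects) is an assumption chosen for illustration. The dictionary layout and sample data are hypothetical.

```python
from collections import defaultdict


def sociogram_from_associogram(weights, actors):
    """Collapse actor -> object -> actor paths into directed actor ties.

    weights: {(source, target): count} for a bipartite associogram.
    A tie (a, b) means a accessed at least one object that b modified,
    i.e. a may have been influenced by b.
    """
    modified_by = defaultdict(list)   # object -> [(actor, weight)]
    for (source, target), w in weights.items():
        if source not in actors:
            modified_by[source].append((target, w))

    ties = defaultdict(int)
    for (source, target), w_access in weights.items():
        if source in actors:
            for b, w_modify in modified_by[target]:
                if b != source:
                    ties[(source, b)] += min(w_access, w_modify)
    return dict(ties)


# Hypothetical weighted associogram; P3 only reads m1.
actors = {"P1", "P2", "P3"}
weights = {
    ("m1", "P2"): 1, ("P1", "m1"): 2,
    ("m2", "P1"): 1, ("P2", "m2"): 1,
    ("P3", "m1"): 1,
}
ties = sociogram_from_associogram(weights, actors)
print(ties)
```

The resulting dictionary can be read as a sparse asymmetric sociomatrix, with each tie traceable back to the mediating objects and events in the layers below.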
A tie in a sociogram or sociomatrix is really shorthand for a complex network of multi"
