APPLYING ARTIFICIAL INTELLIGENCE TO THE EDUCATIONAL DATA: AN EXAMPLE OF SYLLABUS QUALITY ANALYSIS


Developing new courses and updating existing ones are routine activities for an educator. The quality of a new or updated course depends on the course structure as well as on its individual elements. The syllabus defines the structure and the details of the course and thus contributes to its overall quality. This research proposes a new AI-based framework for managing syllabus quality. We apply AI methods to evaluate a syllabus automatically against characteristics such as validity, usability, and efficiency. We report user trials that demonstrate the advantages of the developed approach over the traditional human-based process of syllabus verification and evaluation.

1. INTRODUCTION: THE IMPORTANCE OF SYLLABUS STRUCTURE ANALYSIS

How to increase the efficiency of teaching is one of the top problems of the new information century. For years, the best educators have been developing new approaches that make the process efficient while keeping it affordable. Since the beginning of the computer era, educators have believed that information technologies could significantly improve the outcomes students get from their courses. Some of these expected advantages are now experimentally proven:

- increased student productivity [10] as a result of new teaching tools [3] and more relevant computer-adaptive testing [18];
- increased learning efficiency through new forms of information presentation. For example, the dual-coding approach [2], which combines multimedia channels, lets students perform better on tests than those who learn from animation or text-based materials alone.

In exploring these advantages, many authors pay special attention to structural changes in the learning process itself. IT changes the traditional roles of instructor and student, and these changes are not always positive. An educator should strive to accentuate the positive elements of the course structure and eliminate the negative ones. Among the well-known negative effects are digital plagiarism [19]; loss of learning efficiency due to disruption and information overload [10][11]; visual or mental fatigue [22]; and others.

This paper treats a syllabus as a "good scenario" in which an instructor and a student play their roles to achieve the course objectives. We concentrate on the quality of the syllabus and apply AI methods to eliminate negative elements from its structure.

2. SYLLABUS QUALITY CHARACTERISTICS FOR THE AUTOMATED ANALYSIS

Public and private institutions in different national educational systems use a variety of methods to evaluate syllabus or course quality [21]:

- standardized tests that evaluate students' levels of competence;
- interview-like exams and final course projects;
- student end-of-course evaluations.

Many researchers state that such evaluations provide independent opinions that correlate with students' knowledge and skill sets [5][14], but some claim that students can misuse them or that the administration can misinterpret them [8][15]. The evaluation of quality depends on the evaluator's point of view. The European Foundation for Quality Management defines the following groups of evaluators:

- the corporate world of potential employers;
- students' families and prospective students, who need information on which to base their choice;
- alumni, who may require additional training; and others.

Thus, it is necessary to align the syllabus structure with the sets of criteria and methods provided by the above-mentioned groups of evaluators. This work is obviously labor intensive and requires some level of automation. In practice, the educator adapts the most convenient syllabus prototype from his or her archive or any other accessible source; in computer-related disciplines this could be the community-driven Microsoft Faculty Connect (www.microsoft.com/education/facultyconnection/), MIT's famous OpenCourseWare, or other discipline-specific collections. Unfortunately, such archives offer no "search by quality" function. Recent publications describe methods for quality assessment [7][20], but most of these works are based on an expert-centric approach.
Community-driven resources in most cases have ranking mechanisms based mostly on user opinions. In this research, we choose a statistical approach because it makes the evaluation process more objective. The following quality characteristics are studied in this paper:

- Validity. Each syllabus has an underlying cognitive model of the subject as the basis for the set of topics to be taught [9]. This model should be valid. To evaluate its validity, we check how closely the real outcomes of the course match the expected outcomes declared by the course author.
- Usability. Syllabus usability has two dimensions: usability for the student and usability for the instructor. The first concerns mainly the accuracy of the course description and the clarity of its objectives; the syllabus presentation should facilitate understanding of its content, and a key requirement is that a student be able to understand it. Instructor usability focuses more on how easily the educator can implement, modify, and re-use the syllabus.
- Efficiency. Because all educational institutions have resource constraints, the syllabus should consume the minimum amount of resources while still achieving the course objectives. For example, if one can show that students in a Digital Systems course master the same skills using simulation software instead of expensive circuit boards, then we should recommend the syllabus that uses the simulation software; one structural element of the course has been substituted with another.

Statistical procedures can be used to evaluate a syllabus across the characteristics discussed above. To perform such an evaluation, an institution needs a database with course outcome metrics, information about student performance, estimates of course implementation costs, faculty evaluations, and other data. Fortunately, all this information can be found in course management systems such as Blackboard, Angel, Moodle, etc.

3. ARTIFICIAL INTELLIGENCE FOR QUALITY EVALUATION

A key feature of the proposed approach is a two-step procedure built on "a priori" and "a posteriori" evaluation of syllabus validity and efficiency. The first evaluation detects latent problems in the syllabus structure with the help of a rule-based knowledge base; with it, one can correct a faulty course structure before implementing the syllabus in an actual class. The second evaluation (a posteriori, on the basis of students' examination results and some additional information) measures how close the real student outcomes are to the expected ones. This provides an intelligent criterion for step-by-step quality improvement.

To evaluate "a priori" validity, we analyze the consistency of the syllabus. Even though syllabus structure differs from one educational institution to another, some acknowledged patterns can be found in the vast majority of syllabi [12][6]:

1. basic information (current year and semester, course title, etc.);
2. prerequisites for the course;
3. general learning goals or objectives;
4. conceptual structure of the course, textbooks, assignments, term papers, and exams;
5. activities, grading and evaluation criteria, policies, and others.

These sections are linked to each other, and the relations are not homogeneous (see Figure 1). In fact, the syllabus is a semantic network, and this network can contain structural problems, as the sketch below illustrates.
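To make the semantic-network view concrete, the following sketch encodes a small syllabus fragment as a typed graph in Python. The section names, relation labels, and course content are invented for illustration; they are not the paper's EDL notation, which is introduced later and not reproduced in full.

```python
# Illustrative encoding of a syllabus as a typed graph (a semantic
# network). Names and relation labels are examples for this sketch only.
syllabus_nodes = {
    "objectives": ["O1: explain combinational logic", "O2: design an adder"],
    "topics": ["T1: logic gates", "T2: adders"],
    "tests": ["Quiz 1", "Quiz 2"],
    "textbooks": ["B1: Digital Design, ch. 1-4"],
}

# (source, relation, target) triples: «mapping» ties topics to the
# objectives they fulfil, «refers to» ties topics to textbooks, and a
# test may assess one or more topics (cf. Figure 2).
syllabus_edges = [
    ("T1: logic gates", "mapping", "O1: explain combinational logic"),
    ("T2: adders", "mapping", "O2: design an adder"),
    ("Quiz 1", "assesses", "T1: logic gates"),
    ("Quiz 2", "assesses", "T1: logic gates"),
    ("Quiz 2", "assesses", "T2: adders"),
    ("T1: logic gates", "refers to", "B1: Digital Design, ch. 1-4"),
    ("T2: adders", "refers to", "B2: missing handout"),  # a «dead link»
]
```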
For example, the Course calendar and Textbooks sections are linked by the «refers to» relation. Sometimes the books are listed in one section and repeated in another; this duplication causes problems when the syllabus is modified.

Figure 1. Syllabus as a semantic network.

The «mapping» relation requires explanation: we use it when one element of a syllabus affects the content of another. For example, the same set of learning objectives might be fulfilled by various conceptual structures, and understanding of a topic could be assessed with one or more tests (see Figure 2).

Figure 2. Mapping of tests to topics.

Table 1 shows the most typical relations between elements of a syllabus and highlights the fact that almost every type of relation can go wrong in real syllabi. For example, some sections might be linked to non-existent elements («dead link»), while some elements may not be linked at all («redundant link»). Note that we describe these problems in terms of rules; for instance, a link is "dead" if there is no corresponding book in the Textbooks section.

Table 1. Typical syllabus section relations.

To obtain a complete description of the syllabus structure, we also need to consider internal relations within syllabus elements. For example, the internal structure of the «textbooks» section has the single relation type «alphabetical order». Another example of internal relations is shown in Figure 3.

Figure 3. An internal structure and two corresponding syllabi.

In Figure 3, one can see that two possible pathways through the negotiated syllabus [4] lead to the same final exam. These syllabi are conceptually equal but might demand different amounts of resources (e.g., time, labs, etc.).

To perform the "a priori" statistical evaluation of syllabus quality, we need to count the problems detected in the structure of the syllabus. This is hard to implement with a traditional expert-based evaluation process, but it is easily solved with an ordinary rule-based system and rules such as: count a "limit violation" problem if the time allotted for a test exceeds the class time. A sketch of such a checker follows.
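The paper does not give its rule syntax, so the checker below is only a minimal sketch of the idea: each rule is a predicate over the typed graph from the previous sketch, and the checker tallies every problem it detects. The rule set and the timing figures are invented.

```python
from collections import Counter

def check_consistency(nodes, edges, class_minutes, test_minutes):
    """Count structural problems in a syllabus graph (illustrative rules)."""
    problems = Counter()
    elements = {e for group in nodes.values() for e in group}

    # Rule: «dead link» - a relation points at a non-existent element,
    # e.g. a topic refers to a book absent from the textbooks section.
    for src, _, dst in edges:
        if src not in elements or dst not in elements:
            problems["dead link"] += 1

    # Rule: «redundant link» - an element that no relation touches.
    linked = {e for s, _, d in edges for e in (s, d)}
    problems["redundant link"] += len(elements - linked)

    # Rule: «limit violation» - a test scheduled for more time than
    # the class session allows.
    for test, minutes in test_minutes.items():
        if minutes > class_minutes:
            problems["limit violation"] += 1

    return problems

# Applied to syllabus_nodes / syllabus_edges from the previous sketch;
# Quiz 2 is deliberately scheduled over the 90-minute class limit.
print(check_consistency(syllabus_nodes, syllabus_edges,
                        class_minutes=90,
                        test_minutes={"Quiz 1": 30, "Quiz 2": 120}))
```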
A posteriori validity evaluation is based on a statistical procedure that compares expected student outcomes with examination results (interviews, tests, or practical work). There are several methods to calculate a posteriori validity, and we employ the simplest one, based on the expected grade distribution set by an institution. For example: about 15 percent of students get an «A», about 20 percent get a «B», about 50 percent get a «C» or «D», and up to 15 percent fail the course. If the real results are significantly better than this expected distribution (see Figure 4), we consider the course too simple for the students. If the results are shifted towards poor grades, the course is considered too difficult for them to understand; this could be caused by insufficient methodological supplies, by the professor's errors, or by other reasons. We therefore shape some requirements for the statistical distribution of results.

Figure 4. Exam results for a very difficult course, a very simple course, and a course of the required difficulty.

We believe that the worst results appear when the outcome is far from the expected statistical distribution. For the above-mentioned requirements, the mean ≈ 3.5 and the dispersion ≈ 0.99. For the worst possible outcome (100% of students get an «F»), the mean = 2 and the dispersion = 0. Finally, for the "best possible" result (in fact a very unwanted one), the mean = 5 and the dispersion = 0. We use the chi-square method to determine whether the real course outcomes are close to the declared ones.

The course is reliable if it is valid across a number of offerings. One of the simplest ways to assess the reliability of a course is re-examination: the course is reliable if the result of the re-examination repeats the original examination result. If the re-examination result is better, we conclude that the student obtained some additional materials (not included in the course) that helped him or her improve, so the course is not reliable (perhaps due to insufficiency). If the re-examination result is worse than the first, the course is also not reliable (perhaps due to redundancy, or because it is too hard for students). To calculate the reliability, a simple sign-test method can be applied; it detects typical shifts in the data. Both statistics are sketched below.
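Both procedures are standard statistics and easy to sketch. The fragment below checks a posteriori validity with a chi-square test against the expected grade distribution quoted above, and reliability with a two-sided sign test over paired exam and re-exam grades on a 2-5 scale. The observed counts and grade pairs are invented for illustration, and scipy is assumed to be available.

```python
import math
from scipy.stats import chisquare

# A posteriori validity: compare observed grade counts with the expected
# institutional distribution (15% A, 20% B, 50% C/D, 15% F). The counts
# below are invented for a class of 40 students.
observed = [10, 14, 14, 2]                       # A, B, C/D, F
n = sum(observed)
expected = [0.15 * n, 0.20 * n, 0.50 * n, 0.15 * n]
stat, p = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square p = {p:.3f}")   # small p: outcomes deviate from the
                                   # declared distribution (here: too simple)

# Reliability: sign test over (exam, re-exam) pairs. Under the null
# hypothesis, improvement and decline on re-examination are equally
# likely; ties are discarded.
pairs = [(4, 4), (3, 4), (5, 5), (3, 3), (2, 3), (4, 5), (3, 4)]
plus = sum(1 for a, b in pairs if b > a)         # improved on re-exam
minus = sum(1 for a, b in pairs if b < a)        # declined on re-exam
m = plus + minus
tail = sum(math.comb(m, k) for k in range(min(plus, minus) + 1)) / 2 ** m
print(f"sign test p = {min(1.0, 2 * tail):.3f}")  # small p: a systematic
                                                  # shift, course unreliable
```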
The following section is dedicated to the practical implementation of the discussed theoretical issues in a specially developed software tool called "Chopin."

4. EXPERT SYSTEM PROTOTYPE

A system prototype has been created to test the statistical and rule-based techniques described above. The top-level system structure is presented in Figure 5. The system employs two internal languages to represent its data: the Test Description Language (TDL) and the Expert Description Language (EDL). TDL describes adaptive test structure; it is similar to the well-known QTI language by the IMS Global Learning Consortium [13] used in modern CMSs (e.g., Moodle), but it also has some advanced features [17]. EDL describes the syllabus as a mathematical graph (see Figure 6). An instructor can draw the graph in the EDL editor or fill in an MS Word template that is converted to EDL automatically. A program called VIPES applies the rules, evaluates syllabus quality, and visualizes the syllabus as a graph.

When a student starts the VIPES program, it graphically visualizes the syllabus and the student's achievements. The student chooses the objective for his or her next activities, and the system generates a learning path consisting of virtually any type of activity, from reading to peer review of submitted assignments (one possible path-generation scheme is sketched at the end of this section). The student can choose to generate either the complete path to the end of the course or a path that covers only some topics. If the generated plan ("scenario") does not match the student's expectations, he or she can ask the system to schedule an online chat with the instructor. This is a very effective new teaching arrangement that is hardly feasible in the traditional class format: the instructor concentrates on the core of the student's problem, while the VIPES system helps to diagnose and visualize it.

In creating the syllabus, an instructor develops a set of scenarios to match the needs of as many students as possible. Each syllabus is thus a set of scenarios (in contrast to a traditional single-scenario syllabus), so it has a better chance of fitting the needs of an individual student. On the other hand, a multi-scenario syllabus takes much more time to create and test, which is why this approach demands automatic quality-correction procedures. Using this system, with data collected over more than seven years, a set of experiments has been performed on syllabus quality evaluation and syllabus structure variations. The results of two experiments are presented in the following section.
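The paper does not specify how VIPES generates a learning path, so the sketch below shows just one plausible scheme: treat the EDL graph as a prerequisite relation and emit, in dependency order, the activities needed for the objective the student selected. All names and dependencies here are invented.

```python
def learning_path(prereqs, target):
    """Order the activities needed for one objective (illustrative only).

    prereqs maps each activity to the activities it depends on; a real
    VIPES scenario mixes readings, tests, peer reviews, and so on.
    """
    order, seen = [], set()

    def visit(node):
        # Depth-first post-order emits prerequisites before dependents.
        if node in seen:
            return
        seen.add(node)
        for dep in prereqs.get(node, []):
            visit(dep)
        order.append(node)

    visit(target)
    return order

prereqs = {
    "final exam": ["adders", "sequential logic"],
    "adders": ["logic gates"],
    "sequential logic": ["logic gates"],
}
print(learning_path(prereqs, "final exam"))
# -> ['logic gates', 'adders', 'sequential logic', 'final exam']
```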
5. RESULTS AND DISCUSSION

To study the effectiveness of the proposed approach and to evaluate the scope of its applicability, we discuss two continuous experiments here. The first was aimed at evaluating the quality of many syllabi; in the second, we monitored the structural changes and evaluations of one syllabus over several years.

For the first experiment, we selected a set of 15 syllabi (Altai State Technical University, MIS program) and performed the "a priori" quality estimation (see Table 2). While performing the evaluations, we eliminated the identified problems by restructuring the syllabi. We also checked the correlation between the number of identified problems and the students' results. The experiment shows a significant correlation between syllabus consistency and students' outcomes: as can be seen from Table 2, there is a direct relation between the number of consistency violations and the percentage of unsatisfactory student outcomes. Different consistency problems affect student results differently; for example, "insufficient mapping" affects the students' results more than "non-optimal mapping." In practice, the majority of syllabi that had already been checked by experts still had some level of consistency problems.

Figure 6. A syllabus as an EDL script and its graph visualization.

Table 2. A fragment of "a priori" monitoring of the syllabi database.

Table 3. "A posteriori" monitoring and optimization of the "E-business" syllabus.

The second part of the user trials followed the changes in one syllabus. The E-Business course was selected for this part of the research. It was offered several times a year for different majors, and the students' results showed deviations whose causes were not immediately obvious: sometimes all the students were very successful in examinations, and sometimes the majority got poor results. The course was originally taught by one professor; later, the theoretical and practical parts were separated and taught by two different professors. This change was implemented to make the students' evaluations clearer and to exclude personal preferences from the examination. It was necessary to establish an indicator that would inform the department chair that some problems had to be corrected to improve the quality of the course.

Table 3 shows the results of the validity and reliability monitoring and demonstrates how the validity and reliability characteristics can serve as objective tests of changes to the syllabus. The professors modified the syllabus step by step until its validity and reliability stabilized. Later they updated its contents and optimized it with the same characteristics as the goal function: they proposed elements to be included in the course (such as online materials instead of paper-based ones) and checked the efficiency of these changes against the objective criteria.

The users (instructors, administrators, and students) agreed that this approach has positive effects on the syllabi database. They stated that it helps them make the evaluation more objective and highlight the key factors affecting the quality of teaching. The users also agreed that this AI-based technique is promising for accreditation activities, where the comparison of syllabi and curricula is one of the most critical stages. The approach also contributes to the theory of the "semantic matching" problem, and it is applicable in practice: in the case of a syllabus, we need to compare short parts of documents; in particular, the course goals and the description of the course content should be compared with the appropriate parts of the guide from the accreditation body. This could be considered a possible new direction for the development of these software tools, i.e., an extension of this project. Another promising extension is the further development of the EDL and TDL languages, which promise to be more efficient in their narrow fields than the existing general-purpose ones [17][16]. The existence of well-defined and statistically proven criteria for the evaluation and optimization of test and syllabus structure gives us the prospect of building a complete logical system for this particular case of educational data analysis.
