
A multimodal dialogue interface for flexible tutoring systems

This paper presents an environment for the design and implementation of an Intelligent Tutoring System based on a multimodal dialogue interface and user modelling. The system exploits existing web-based on-line courses and superimposes a rhetorical structure as the underlying model for the course content. In addition, the learner profile is based on the notion of stereotypical ascription, while a multi-agent architecture supports the coordination of multiple interaction modalities.

Esma AIMEUR
Computer Science Department, University of Montreal, Montreal H3C 3J7 (QC), Canada
[email protected]

Vincenzo PALLOTTA, Giovanni CORAY
EPFL, IN-Ecublens, CH-1015 Lausanne, Switzerland
[email protected]

1 Introduction

How can we take Intelligent Tutoring Systems (ITSs) to the next level? ITSs are not yet as effective as human tutors, even in the domains where ITSs have been most successful. Quite possibly, the difference has something to do with the power of tutorial dialogues. A growing number of researchers are therefore investigating whether adding dialogue capabilities to ITSs will result in more effective systems [14]. The current generation of ITSs has successfully produced learning gains without the use of natural language technology, but the goal for the next generation is to add natural language dialogue capabilities [7].

Since adding domain and pedagogical knowledge to the current generation of ITSs is already a tremendous effort, adding natural language dialogue capabilities can further increase the development time by requiring that language knowledge also be engineered. Intelligent Tutoring Systems need to move towards a new model of interaction that could extend both computer-aided learning and distance teaching, as proposed in [y].
Our contribution towards this goal is a model and an architecture that integrate the pedagogical model and the student model into a multimodal, mixed-initiative dialogue system. This means that the student may guide the teaching session by asking questions or by suggesting his/her own path through the pedagogical objective, using different input modalities. In this paper we present a general model for the new generation of intelligent tutoring systems and a preliminary architecture for a tutorial system that supports multimodal natural interaction.

2 Tutoring context

The goal of an intelligent tutoring system (ITS) is to reproduce the behaviour of an intelligent (competent) human tutor who can adapt his teaching strategy to the learning needs of the learner [2]. Initially the control of training was assumed by the tutor (prescriptive approach), not by the learner. More recent ITS developments consider a new pedagogical model based on a co-operative approach between the learner and the system. The latter can simulate various partners, such as a co-learner, a learning companion or a troublemaker, which are called pedagogical agents. This evolution progressively highlighted two fundamental characteristics: (1) learning in an ITS is a constructive process involving several partners, called pedagogical agents, and (2) to improve the performance of an ITS, various co-operative learning strategies can be used, such as one-to-one tutoring, learning with a co-learner, learning by teaching, learning by disturbing [1], etc.

The student model deals with the meta-knowledge about the student’s knowledge. The student model is made of two parts:
1) The cognitive part: an overlay of the student’s capabilities compared to those predicted from the curriculum.
2) The affective part: concerns several parameters such as attention, rapidity, anxiety, motivation, confidence, etc.

In our work we concentrated mainly on the cognitive part, and in particular on how to design a student model manager in our architecture (Fig. 1). For instance, stereotypes are widely used both in ITSs and in a range of other teaching and advisory systems. Yet the notion of stereotypes is very loose (Kay, 2000). An appealing property of the stereotype is that it should enable a system to get started quickly on its customised interaction with the student. That quick start is often based upon a brief initial interaction with the user or, less commonly, on a short period of observation of the user.

We propose to model student stereotypes by means of stereotypical ascriptions [5], [4]: background knowledge and beliefs can be projected by the system onto the student, depending on a previously designed student category, when no evidence to the contrary already exists in the student’s knowledge base. The student’s knowledge base is updated only with the minimal amount of information coming from the interaction. Any concept that appeared during the conversation is assumed to be learned by default, unless the student’s behaviour implies a contradictory or missing grasp of that concept. In the latter case the system explicitly activates a repair strategy in order to re-establish the coherence of the student’s projected knowledge base.

The curriculum component is built on top of the content of the course, which is structured by means of extended rhetorical annotations [8]. Semantic (i.e. between concepts) and rhetorical (i.e. between discourse units) relations are made explicit and used by the planner for generating teaching actions. A curriculum manager browses the course content and interacts with the planner, providing the teaching objectives.

The planner component dynamically generates teaching actions using the course plan (generated by the curriculum manager), the student’s capabilities (provided by the student model), and the available time. Teaching actions also include links to the available resources that are useful for fulfilling the learning goals. The planner is thus guided by rhetorical structures imposed on the curriculum.

The session manager is the core of the tutoring system. It takes all the tutoring decisions and controls the learning process. The first action of the session manager is to ask the planner for a lesson. This leads either to the activation of didactic resources that contain their own management system, or to the selection of a learning strategy and a learning mode from the pedagogical model, which will in turn use some didactic resources. The session manager builds a state of the learner’s progress and dynamically updates the learner model.

The didactic resources are tactical means to support a teaching session. They can be a demonstration, an exercise, a problem, a hypermedia or multimedia document, or an HTML document. The resource manager is linked to the multimodal component of the system through the dialogue manager.

[Figure 1: General architecture of an ITS. Labels recoverable from the figure: Learner model, Planner, Learning Strategies, Learning modes, Session Manager, Pedagogical agents, Tu, Tr, C, Curriculum, Didactic Resources, Adidactic Resources, Multimodal Interface.]

3 Interaction context

One of the innovative features of our work is the way we handle the interaction between the student and the pedagogical agent. The use of a dialogue theory in designing interactive teaching systems is investigated by Bunt in [6], where he outlines how an ITS may benefit from adopting new forms of interaction. We went a step further by investigating how to put his principles into practice. We present here a multimodal dialogue management system and its application to tutoring systems.

3.1 Dialogue Manager

The Dialogue Manager builds on our recent work on the HERALD architecture [13] and on the lessons learned from it.
It is based on the Agent-Oriented Programming (AOP) model [15] and reflects the principles of the Information State approach to dialogue modelling proposed in [9]. The proposed dialogue moves can be coded by means of KQML performatives, whereas update rules are special cases of behavioural rules. AOP poses no restriction on how to structure the agents’ mental state, except that the formalism must allow us to express conditions on the represented objects. A robust semantic parser [3] is used as the natural language interface, and the ViewGen system [5], [4] is used for plan recognition from speech acts through a suitable integration of planning, ascription and inference [11], [12], [10]. Moreover, ViewGen serves as the base for implementing the Discourse Context Manager: since plan recognition in ViewGen makes use of partial plans, it allows dialogue acts to be interpreted incrementally.

3.2 Multimodal interface

Multiple modalities are used both for input and output. An Input-Output manager coordinates a set of devices and converts different signals into messages to be sent to the Dialogue Manager, and vice versa (see Fig. 2). Each device is controlled by a listener agent that wraps the device in order to transform the low-level input into a symbolic object level. All messages are time-stamped, and the I/O manager collects and dispatches them between the devices and the Dialogue Manager.

3.3 Dialogue context and updates

Another source of knowledge affecting the choice of the appropriate dialogue move comes from the rhetorical structure of the course content [x]. The dialogue evolves as a traversal through the rhetorical structure, and the pedagogical goal can be considered fulfilled when all the relevant parts of the structure have been visited.
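The traversal just described can be illustrated with a minimal Python sketch. It is not the representation used in our system: the `DiscourseUnit` class, its `relevant` flag, and the functions `relevant_units`, `goal_fulfilled` and `next_unit` are hypothetical names introduced only for this example, and the rhetorical structure is assumed, for simplicity, to be a tree of discourse units.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DiscourseUnit:
    """One node of the rhetorical structure (hypothetical representation)."""
    name: str
    relevant: bool = True  # assumption: marks units required by the pedagogical goal
    children: list[DiscourseUnit] = field(default_factory=list)


def relevant_units(root: DiscourseUnit) -> set[str]:
    """Collect the names of all relevant units reachable from the root."""
    names: set[str] = set()
    stack = [root]
    while stack:
        unit = stack.pop()
        if unit.relevant:
            names.add(unit.name)
        stack.extend(unit.children)
    return names


def goal_fulfilled(root: DiscourseUnit, visited: set[str]) -> bool:
    """The goal is fulfilled once every relevant unit has been visited."""
    return relevant_units(root) <= visited


def next_unit(root: DiscourseUnit, visited: set[str]) -> str | None:
    """Pick, depth-first, the next relevant unit not yet visited (None if done)."""
    stack = [root]
    while stack:
        unit = stack.pop()
        if unit.relevant and unit.name not in visited:
            return unit.name
        # Reverse so that children are explored in their original order.
        stack.extend(reversed(unit.children))
    return None


# A toy course fragment; the unit names are invented for the example.
course = DiscourseUnit("lesson", children=[
    DiscourseUnit("definition"),
    DiscourseUnit("example", children=[DiscourseUnit("aside", relevant=False)]),
    DiscourseUnit("exercise"),
])

visited: set[str] = set()
order: list[str] = []
while (unit := next_unit(course, visited)) is not None:
    visited.add(unit)  # in a real dialogue, a unit is visited by a dialogue move
    order.append(unit)

print(order)
print(goal_fulfilled(course, visited))
```

In a mixed-initiative session the student may jump to any unit. Under the same assumptions, the system would then add that unit to `visited` and call `next_unit` again, so the system-driven order shown here is only one possible traversal.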
In order to build and update the dialogue context, the system implements update strategies by means of update rules. Update rules are triggered by communicative events and are executed after evaluating their pre-conditions against the current context.

Our system is able to build and manage a dialogue context containing various types of information. The student’s communicative behaviour can affect all or part of the dialogue context, and the decisions taken by the dialogue manager can in turn be affected by it.

Tracking the learning process by simulating the way the student builds his/her knowledge base is helpful, since it allows us to better optimize the information provided by the system. Depending on the student’s category, we can also hypothesize different levels of implicit inferential steps, which can in turn be confirmed or disconfirmed by the ongoing interaction.

[Figure 2: Multimodal interface with a dialogue manager. Labels recoverable from the figure: Dialogue Manager, Input/Output Manager, Input Agent 1 to n, Output Agent 1 to n, Listener 1 to n, Talker 1 to n, Input device 1 to n, Output device 1 to n, KQML, API.]

4 Conclusions

There is a connection between the rhetorical structure of the course provided by the curriculum component and the way the student learns. At the basic level of learning, all the explanatory information may be necessary, while at higher levels of learning skills detailed information is skipped, since the system assumes that the student has sufficient background knowledge and is able to perform the inference steps needed to reach the required explanation.

Information can be presented in various forms depending on the student’s preferences and on pedagogical motivations. Moreover, the student should be able to interact with the system in different manners, depending on his/her notion of naturalness. The system must adapt its behaviour to the student’s choice of interaction modality, and it should be able to map each action into the corresponding intention.
Conversely, the system’s didactic strategy should be mapped into a communicative strategy. This means that the system should be able to recognise which output modality is more suitable for the teaching task, and it should also be able to contextualize the decision depending on the current interaction and the student’s profile. Since we propose a multi-agent architecture as the underlying computational model for our tutoring system, different knowledge sources can be combined to determine the current teaching context.

With this work we would like to respond to the ITS community’s current needs by means of a flexible design environment and a software infrastructure that support multimodal dialogue interaction. Moreover, we would like to support different pedagogical agents and learning modes, and thus investigate how they influence the way the interaction between the system and the student is carried out.

References

1. Aimeur, E. and Frasson, C. Analyzing a new learning strategy according to different knowledge levels. Computers in education, 27 (2), 1996, 115-127.
2. Aimeur, E., Frasson, C. and Dufort, H. Co-operative Learning Strategies for Intelligent Tutoring Systems. Applied Artificial Intelligence: An International Journal, 14 (5), 2000, 465-490.
3. Ballim, A. and Pallotta, V. The role of robust semantic analysis in spoken language dialogue systems. in Proceedings of the 3rd International Workshop on Human-Computer Conversation, (Bellagio, Italy, 2000), 11-16.
4. Ballim, A. and Wilks, Y. Artificial Believers. Lawrence Erlbaum Associates, Hillsdale, New Jersey, 1991.
5. Ballim, A. and Wilks, Y. Beliefs, Stereotypes and Dynamic Agent Modelling. User Modelling and User-Adapted Interaction, 1 (1), 1991, 33-65.
6. Bunt, H. Dialogue control functions and interaction design. in Beun, R.J., Baker, M. and Reiner, M. eds. Dialogue in Instruction, Springer Verlag, Heidelberg, 1995, 197-214.
7. Jordan, P., Rosé, C.P. and VanLehn, K. Tools for Authoring Tutorial Dialogue Knowledge. in Proceedings of the Tenth International Conference on Artificial Intelligence in Education (AI-ED 2001), (San Antonio, Amsterdam, 2001), IOS Press.
X. Ghorbel, H., Ballim, A. and Coray, G. Rosetta: rhetorical and semantic environment for text alignment. in Proceedings of Corpus Linguistics 2001, (Lancaster, UK, April 2001).
8. Kosseim, L. and Lapalme, G. Choosing Rhetorical Structures to Plan Instructional Texts. Computational Intelligence, 16 (3), 2000, 408-455.
9. Larsson, S. and Traum, D. Information state and dialogue management in the TRINDI Dialogue Move Engine Toolkit. Natural Language Engineering, 6 (3&4), 2000, 323-340.
10. Lee, M. Belief ascription in mixed initiative dialogue. in Proceedings of AAAI Spring Symposium on Mixed Initiative Interaction, (1997).
11. Lee, M. Belief, Rationality and Inference: A general theory of Computational Pragmatics. Department of Computer Science, University of Sheffield, Sheffield, 1998.
12. Lee, M. and Wilks, Y. An ascription-based approach to speech acts. in Proceedings of the 16th Conference on Computational Linguistics (COLING-96), (Copenhagen, 1996).
13. Pallotta, V. and Ballim, A. Robust dialogue understanding in HERALD. in Euroconference on Recent Advances in Natural Language Processing (RANLP'01), (Tzigov-Chark, Bulgaria, 2001), 204-209.
Y. Pettenati, M.-C., Abou Khaled, O., Coray, G. and Giuli, D. Development of a distance education environment for university courses: MEDIT. in Proceedings of the EDEN'98 conference, (Bologna, Italy, June 1998), 465-471.
14. Rosé, C.P. and Friedman, R. (eds.). Building Dialogue Systems for Tutorial Applications. Papers from the 2000 AAAI Fall Symposium. AAAI Press, Menlo Park, CA, 2000.
15. Shoham, Y. Agent-Oriented Programming. Artificial Intelligence, 60 (1), 1993, 51-92.