Design of a Conversational Recommender System
in Education
Stefano Valtolina ([email protected])
University of Milan
Ricardo Anibal Matamoros
Social Things SRL
Alberto Battaglia
Social Things SRL
Michelangelo Mascari
Social Things SRL
Francesco Epifania
Social Things SRL
Research Article
Keywords: Conversational interface, Machine learning for education, Learning objects.
Posted Date: June 27th, 2023
DOI: https://doi.org/10.21203/rs.3.rs-3111462/v1
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Abstract
In recent years we have seen a significant proliferation of e-learning platforms. E-learning platforms allow
teachers to create digital courses in a more effective and time-saving way, but several flaws hinder their
actual success. One main problem is that teachers have difficulty finding and combining open-access learning materials that precisely match their specific needs when there are so many to choose from. This
paper proposes a new strategy for creating digital courses that use learning objects (LOs) as primary
elements. The idea consists of using an intelligent chatbot to assist teachers in their activities. Defined
using RASA technology, the chatbot asks for information about the course the teacher has to create
based on her/his profile and needs. It suggests the best LOs and how to combine them according to their
prerequisites and outcomes. A chatbot-based recommendation system provides suggestions through BERT, a machine-learning model based on Transformers, which defines the semantic similarity between the entered data and the LOs' metadata. In addition, the chatbot also suggests how to combine the LOs into a
final learning path. Finally, the paper presents some preliminary results about tests carried out by
teachers in creating their digital courses.
1 Introduction
Mainly due to the COVID-19 pandemic, in these last few years we have witnessed an increase in the production of digital material in education: material that teachers used to create courses accessible even remotely and able to provide students with proper training. The world of e-learning has thus grown exponentially in content and users.
Thanks to developments in innovative fields such as Artificial Intelligence (AI), we can now use
technologies to support teachers in creating and delivering digital material in a way we never imagined.
Among the most used systems based on Artificial Intelligence are learning management systems (LMSs). These systems can provide teachers with a new type of user interaction by using recommendation mechanisms and specific technologies.
Beyond all the technical aspects (kit, production values, graphics, etc.), the main problem in using LMSs is that many teachers are not well-trained to design proper digital courses. Some of them go for a "one size fits all" approach, which is rarely a recipe for success for what they expect students to know, understand or be able to do as a result of taking their course. Other teachers lack enough time to create
digital courses, especially basic or professional ones. Finally, some LMSs lack effective assessment
methods to gauge learning outcomes and the impact of the e-learning program that has been devised.
Based on these considerations, the paper presents a new strategy to support teachers in creating a
digital course for Italian schools and agencies using an e-learning platform named WhoTeach. The
innovative idea consists of endowing the LMS with an intelligent chatbot to assist teachers in their
activities by suggesting learning objects (LOs) as primary recommendation elements. Defined using
RASA technology, the chatbot suggests the best LOs and how to combine them according to their prerequisites. In addition to suggesting how to connect the LOs, the chatbot explains why each module is significant.
Finally, the paper presents some results of the tests carried out on the machine learning models used to predict the actions to be taken by the chatbot according to the semantic understanding of the teacher's messages, and some preliminary results about tests carried out with teachers creating their digital courses. In particular, we defined and implemented a model to evaluate teachers' level of acceptance of and intention to use the chatbot, extending the UTAUT model (the Unified Theory of Acceptance and Use of Technology) presented in [1].
The paper is structured as follows. Section 2 describes the state of the art of using conversational interfaces in education and explains how we used the chatbot to support teachers in creating a course. Section 3 describes the recommendation system the chatbot uses to suggest the best LOs; the Section explains how the chatbot indicates a sequence of proper LOs and the strategies for combining them. Section 4 presents the tests carried out with teachers creating their digital courses; in particular, we defined and implemented a model to evaluate teachers' level of acceptance of and intention to use the chatbot, extending the UTAUT model (the Unified Theory of Acceptance and Use of Technology). Finally, Section 5 sums up conclusions and future work.
2 Conversational Agent in Education
Many studies regarding using chatbots in the educational domain exist in the literature. For example,
authors in [2,3,4] present how to use conversational agents in areas such as teaching and learning,
administrative assistance, assessment, consultancy and research and development.
According to the review [2], chatbots are mainly applied for teaching and learning (66%). They promote
rapid access to materials by students and faculty at any time and place [5,6]. This strategy helps save
time and maximise students' learning abilities and results [7], stimulating and involving them more in
teaching work [8]. Furthermore, they can automate many student activities, including submitting
homework, replying to emails and sending feedback on the courses followed [9,10]. Finally, the chatbot
can be used as a real personal assistant for teachers, assisting them in their daily tasks.
However, to our knowledge, chatbots are rarely used to assist teachers in creating new digital courses.
Teachers can use authoring tools and learning platforms to develop and host online courses, such as
Absorb[1], Learnopoly[2], Elucidat[3], Thinkific[4], Teachable[5], Podia[6], or Learnworlds[7]. These tools
help teachers create, launch and review an e-learning course, and the choice mainly depends on what
teachers want about content to offer, hosting requirements, audience, and budget.
Nevertheless, these solutions cannot support teachers in creating digital courses starting from the learning material repositories available on the net, or from suggestions based on past colleagues' experiences that teachers could build on to create new solutions in new situations. These reusable solutions, which Gamma et al. in [11] named design patterns, aim to help teachers create courses by providing them with a set of design ideas in a structured way [12].
In this field, our idea is to use a conversational agent that assumes the role of a prompter to assist
teachers in finding and selecting proper learning open-access materials available on the internet. One of
the main benefits of integrating a chatbot into a Learning Management System (LMS) is that it can make
educational processes easier for teachers. To validate our idea, we combined the chatbot in a learning
platform, WhoTeach, to support teachers in creating a digital course for Italian schools and agencies. The
platform relies on using learning objects (LOs) as building blocks to compose a course [9,13,14]. Reusing
existing LOs is a valuable solution for helping teachers because they represent helpful material used in
the past by other colleagues to create efficient courses.
Lately, the area of research dealing with finding and recommending a list of LOs that fit specific teachers'
needs and requirements is very active [15, 16]. Nevertheless, teachers highlight difficulties in effectively
combining small chunks of educational material to meet their academic needs [7,17]. It is precisely for this reason that the conversational agent comes into play. To avoid the plethora of material
available on the net becoming a disadvantage by paralysing the creation process, our chatbot-based
recommendation system (RS) aims to find and suggest the right LOs, as explained in the next Section. In
detail, we describe the motivation for using a chatbot to offer LOs and how learning resources can be
appropriately assembled in a course meeting the teachers' objectives and requirements.
2.1 Metadata-based Learning Objects' Suggestions
Several standards have emerged over the years to facilitate the sharing and reuse of learning materials; they establish metadata policies and provide suggestions for using the LOs. SCORM (Sharable Content Object Reference Model) [18] is a well-known example of a reference strategy for describing and cataloguing LOs. SCORM provides users with procedures for aggregating LOs and methods for processing contents on the related learning management system. To ensure interoperability, solutions like the one proposed by SCORM need to leverage sets of metadata that define standard guidelines for the classification of LOs. Examples of these metadata are Dublin Core[8] and IEEE LOM[9].
Research and surveys, such as [19, 20], show how Dublin Core is suitable for describing the bibliographic
side of digital resources, but LOM allows the best representation of the pedagogical aspects. Regardless
of the sets of metadata, these standards do not indicate which are more suitable for describing LOs
according to given teachers' preferences. And even if some studies [20] recommend a minimal metadata
set to describe learning material, they do not report specific rules to follow.
Some attempts in this direction have led to the adaptation of the metadata presented in the various
standards into profiles that can meet community context-specific needs [21]. This approach has led to the
emergence of the metadata Application Profile (AP) concept. An AP takes one or more standards as a
starting point, imposing some restrictions and modifying the vocabularies, definitions or other elements
of the original standard to adapt to the needs of a specific application [22].
For example, the IEEE LOM and Dublin Core standards have been implemented in several Application
Profiles for describing learning resources ([23], UK LOM Core[10], and CanCore[11]), scientific resources
(Darwin Core[12]), cultural resources (ESE[13] and SWAP[14]) and more.
The use of APs has allowed the birth of Learning Object Repositories (LOR) as examples of digital
libraries or electronic databases where educators can share, use and reuse different LOs to build online
learning opportunities for their students. Some of the best-known repositories regarding the number of titles collected are ARIADNE[15], NSDL[16] and MERLOT[17].
ARIADNE was founded in 1996 by the European Commission within the "Telematics for education and
training" programme. The core of this infrastructure is a distributed library of digital and reusable
educational components called the Knowledge Pool System (KPS). ARIADNE uses a set of metadata
extrapolated from the General, Technical and Educational areas of the IEEE LOM. The user can search for
resources through SILO (Search and Index Learning Objects), which allows simple, advanced or federated
searches (through multiple repositories).
NSDL (The National Science, Mathematics, Engineering, and Technology Education Digital Library) was
founded in 2000 by the National Science Foundation to provide a reference library for STEM-type
resources. Users can create their profiles and get recommendations for LOs based on the user's previous
interactions with the repository.
Finally, MERLOT (Multimedia Educational Resources for Learning and Online Teaching), born in 1997, is
an open repository designed primarily for students and teachers. The peculiarity of the LOR is that the
educational resources are reviewed to help users understand whether the resource can function within
their course. Reviews are composed of three dimensions: quality of the content, potential effectiveness as
a teaching tool, and ease of use.
In training our chatbot-based recommendation system, we used these repositories and their related
metadata as bases for implementing the retrieval strategy for creating a new digital course.
2.2 Design of the Conversational Agent's Interaction Strategy
In our work, we advocate using conversational agents not as student assistants but as mentors that can
help teachers create digital courses.
The technology used to develop our chatbot is RASA, an open-source framework that creates text- and
voice-based chatbots. Figure 1 presents the general architecture of the system. The main components are
RASA NLU, i.e. the language comprehension engine, and RASA CORE, which predicts the next action to be performed in response to user input.
The RASA NLU component analyses the grammar and logic of the sentence the teacher enters, from
which it extrapolates the parts that interest RASA CORE to elaborate the answer. RASA NLU is responsible
for classifying the intents and extracting the entities.
Based on the context of the conversation, RASA CORE chooses the following action to take or the response to give to the user. A dialogue management model needs to be trained to determine the next step. The training data of this model are indicated through stories[18] and rules[19]. The rules describe small parts of the conversation that must follow a specific path; on the contrary, stories allow for creating different directions of the conversation, based on whether the user responds in the expected way or gives unexpected answers.
If the following action to be performed is a "custom action", RASA CORE requests the action from the
"RASA Action Server", which executes it and returns the resulting response or event. Once the data relating
to the course to create has been obtained, these are integrated into a JSON object within a "custom
action" and sent to the Recommender System Server. The RS server returns the list of LOs extracted from
the dataset according to the received information.
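As an illustration, such a custom action could be sketched as follows; the action name, the slot names and the RS endpoint URL are assumptions made for the sketch, not the production code.

```python
# Hedged sketch of a RASA custom action that packages the collected slots into
# a JSON object and queries the Recommender System Server; names are assumed.
from typing import Any, Dict, List, Text

import requests
from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher

RS_SERVER_URL = "http://rs-server:8000/recommend"  # assumed endpoint

class ActionRecommendLOs(Action):
    def name(self) -> Text:
        return "action_recommend_los"

    def run(self, dispatcher: CollectingDispatcher, tracker: Tracker,
            domain: Dict[Text, Any]) -> List[Dict[Text, Any]]:
        # Collect the course data gathered during the conversation.
        payload = {slot: tracker.get_slot(slot)
                   for slot in ("topics", "difficulty", "duration", "age", "language")}
        # Send the JSON object to the RS Server and show the returned LOs.
        response = requests.post(RS_SERVER_URL, json=payload, timeout=10)
        for lo in response.json().get("los", []):
            dispatcher.utter_message(text=f"- {lo['title']} (score {lo['score']:.2f})")
        return []
```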
The Tracker Store is the database where the conversations of the virtual assistant are saved, while the Lock Store serves to process the messages in the correct order and to lock the conversations. Messages are processed sequentially within each conversation, allowing multiple RASA servers to operate simultaneously.
The RS Server adopts a machine learning model called BERT [24], which uses Transformers as the deep
learning model, introduced in 2017 by Google, to realise various Natural Language Processing tasks,
including translation, classification and text synthesis. Among the tasks that BERT can perform, one of the most important is Semantic Text Similarity (STS), which calculates the semantic similarity between two input texts. Even if BERT achieves high accuracy in this task, two complete texts (not only metadata, as happens in our case) must be entered into the model, resulting in a substantial computational overhead.
To solve this problem, we adopted Sentence-BERT [25], a modification of BERT that uses Siamese neural
networks to generate sentence embeddings that capture the meaning of the two sentences. Subsequently,
the generated embeddings are compared through Cosine Similarity to obtain the semantic similarity
between the two sentences considered.
In particular, in the context of our work, the chatbot-based recommendation system uses S-BERT to identify the LOs whose metadata are most semantically similar to the data entered by the teacher via the chatbot, i.e. the data the teacher specifies to describe the course to create.
In detail, Figures 2 to 4 present screenshots of the dashboard of the learning platform, WhoTeach, we
used to integrate the chatbot. Figure 2 depicts the situation in which the teacher interacts with the chatbot
to specify information about the course to create, such as its difficulty and the duration of the lessons.
The user responds by clicking on one of the buttons to choose the value of the slot. Regarding topics,
skills and competencies, the chatbot allows adding items to the corresponding list if the user needs it.
Once the teacher has entered the course information, the parsing system checks the orthography and performs the translation into English (since the dataset is in English) via the DeepTranslator library[20]. Subsequently, the topics, difficulty, type and duration fields are transformed into embeddings (tensors) using S-BERT. Later, the similarity between the tensors of this information and the tensors of each
LO's metadata is measured using cosine similarity. The function used to compute this metric is the
semantic_search from the Sentence-Transformers library[21].
All the scores obtained are entered into a data frame, and finally, a final SCORE is calculated for each LO as the average of the scores of its individual metadata. The chatbot recommends the LOs to the user based on their SCORE and presents them via a graphical interface.
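A minimal sketch of this scoring pipeline, assuming a list `los` of metadata dictionaries, illustrative field names and an assumed S-BERT checkpoint, could look like the following.

```python
# Illustrative sketch of the retrieval step with deep-translator and
# sentence-transformers, as named in the text; the field names, the model
# checkpoint and the data layout are assumptions.
import pandas as pd
from deep_translator import GoogleTranslator
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed checkpoint
FIELDS = ["topics", "difficulty", "type", "duration"]

def recommend(course: dict, los: list[dict], top_k: int = 5) -> pd.DataFrame:
    # 1. Translate the (spell-checked) teacher input into English,
    #    since the LO dataset is in English.
    translator = GoogleTranslator(source="auto", target="en")
    course = {f: translator.translate(str(course[f])) for f in FIELDS}
    scores = pd.DataFrame(index=range(len(los)))
    # 2. For each metadata field, embed the teacher's value and all LO values,
    #    then rank the LOs by cosine similarity with semantic_search.
    for f in FIELDS:
        query = model.encode(course[f], convert_to_tensor=True)
        corpus = model.encode([str(lo[f]) for lo in los], convert_to_tensor=True)
        for hit in util.semantic_search(query, corpus, top_k=len(los))[0]:
            scores.loc[hit["corpus_id"], f] = hit["score"]
    # 3. The final SCORE of each LO is the average of its per-field similarities.
    scores["SCORE"] = scores[FIELDS].mean(axis=1)
    return scores.sort_values("SCORE", ascending=False).head(top_k)
```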
Figure 3 shows the checkboxes for the teacher to select the desired LOs and the resources to be included
in the digital course. If a LO contains exercises, the teacher can specify to repeat the activity if the student
fails to finish a LO in the desired time or according to an established rating.
To finish, the teacher has to combine the LOs in a sequence of lessons the students must follow
according to each LO's prerequisites. Once the teacher selects a LO, the chatbot suggests a list of LOs that
can be used as the following lesson according to their prerequisites. Figure 4 describes a situation where the teacher selects a specific LO, and the system shows the possible related learning paths. The branches
show the paths the student has to follow according to the evaluation obtained at the end of the previous
LO.
Implementation Issues. Initially, we used a traditional approach in which the RASA NLU model associates the values of the entities extracted from the user message with the corresponding slots. Since the different intents are made up of data that are very similar to each other or even the same (for example, numerical data such as the age of the students, the level of difficulty of the lessons, the number of lessons, ...), with this solution the NLU model confused them.
To solve this problem, we chose to map user responses to textual intents. We used RASA validators to ensure that the extraction model associates only the significant parts of the message with the corresponding slot. Thanks to these custom actions, it is possible to extract the significant components from the strings entered by the user and to prevent an incorrect value from being associated with the message.
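A hedged sketch of one such validator, with a hypothetical form and slot, is shown below.

```python
# Sketch of a RASA slot validator; the form name, slot name and extraction
# rule are illustrative assumptions, not the exact production code.
import re
from typing import Any, Dict, Text

from rasa_sdk import FormValidationAction, Tracker
from rasa_sdk.executor import CollectingDispatcher

class ValidateCourseForm(FormValidationAction):
    def name(self) -> Text:
        return "validate_course_form"

    def validate_age(self, slot_value: Any, dispatcher: CollectingDispatcher,
                     tracker: Tracker, domain: Dict[Text, Any]) -> Dict[Text, Any]:
        # Keep only the first number in the free-text answer, so that similar
        # numeric intents (age, number of lessons, duration, ...) do not get
        # confused with one another.
        match = re.search(r"\d+", str(slot_value))
        if match and 3 <= int(match.group()) <= 99:
            return {"age": int(match.group())}
        dispatcher.utter_message(text="Please enter a valid age.")
        return {"age": None}
```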
2.3 Conversational Agent Deployment
To make the virtual assistant available to the public, we used Docker[22] and Kubernetes[23], as suggested in the RASA documentation[24].
From a practical point of view, Docker images of the following architectural components were created: Chatbot, RASA Action Server and Recommender System Server. We then used the tools made available by the Google Cloud Platform to make the web page containing the chatbot accessible via IP address. The Docker images created were saved in the Artifact Registry[25], the Google repository service that stores and organises Docker images. Subsequently, within the cluster management system provided by Google, Google Kubernetes Engine (GKE)[26], a Kubernetes cluster was created, and the three PODs containing the containers associated with the three images were placed inside it (each POD contains only one container, as described in Figure 5).
Load Balancer Services have been defined for the chatbot and the Recommender System Server so that they can be accessed from outside the cluster. In Figure 5, which shows the cloud architecture, the chatbot POD is connected to the RS Server POD, as the former's IP address is entered in the chat widget script contained within the HTML page, accessed via a request to the Recommender System Server. Instead, to allow the internal connection between the chatbot and the RASA Action Server, and between the RASA Action Server and the Recommender System Server, two Services have been created with an internalTrafficPolicy of type Local, which uses only local endpoints for traffic internal to the cluster.
[1] https://www.absorblms.com/ Last access: 2023-05-27.
[2] https://learnopoly.com/ Last access: 2023-05-27.
[3] https://www.elucidat.com/ Last access: 2023-05-27.
[4] https://www.thinkific.com/ Last access: 2023-05-27.
[5] https://teachable.com/ Last access: 2023-05-27.
[6] https://www.podia.com/ Last access: 2023-05-27.
[7] https://www.learnworlds.com/ Last access: 2023-05-27.
[8] https://www.dublincore.org/ Last access: 2023-05-27.
[9] https://ieeexplore.ieee.org/document/9262118 Last access: 2023-05-27.
[10] http://www.ukoln.ac.uk/metadata/education/ Last access: 2023-05-27.
[11] cancore.athabascau.ca/en Last access: 2023-05-27.
[12] rs.tdwg.org/dwc Last access: 2023-05-27.
[13] www.europeana.eu/ Last access: 2023-05-27.
[14] www.ukoln.ac.uk/repositories/digirep/index/Scholarly_Works_Application_Profile Last access: 2023-05-27.
[15] https://www.ariadnelearning.it/ Last access: 2023-05-27.
[16] https://nsdl.oercommons.org/ Last access: 2023-05-27.
[17] https://merlot.org/merlot/ Last access: 2023-05-27.
[18] Rasa stories. https://rasa.com/docs/rasa/stories Last access: 2023-05-27.
[19] Rasa rules. https://rasa.com/docs/rasa/rules Last access: 2023-05-27.
[20] Deep translator. https://pypi.org/project/deep-translator/ Last access: 2023-05-27.
[21] Sentence-transformers. https://www.sbert.net Last access: 2023-05-27.
[22] Docker. https://www.docker.com/ Last access: 2023-05-27.
[23] Kubernetes. https://kubernetes.io/ Last access: 2023-05-27.
[24] Rasa deployment. https://rasa.com/docs/rasa/deploy/introduction Last access: 2023-05-27.
[25] Artifact Registry. https://cloud.google.com/artifact-registry Last access: 2023-05-27.
[26] Google Kubernetes Engine. https://cloud.google.com/kubernetes-engine Last access: 2023-05-27.
3 Conversational Recommender System at Work
3.1 How the Conversational Agent Suggests LOs
Providing teachers with a set of LOs helpful in creating a new course is the responsibility of the chatbot-based recommendation system, whose interaction strategy we presented in Section 2.
From a technical point of view, as said before, the recommendation service at the base of the conversational agent generates embedding representations through the application of the S-BERT model. This model works on two input sentences, $a$ and $b$, and through a function $B$ transforms them into two embedding vectors $\vec{a}$ and $\vec{b}$, used for calculating the similarity score $y_{sim}$. S-BERT uses the pre-trained BERT model as the $B$ function for the actual generation of the embeddings and applies a Mean Pooling layer on each output of $B$ to calculate the $y_{sim}$ value. In our context, the following set of metadata $M$ of the learning objects is used:
Duration
Age
Difficulty
Language
Type
Keywords
The part of the conversational assistant implemented with RASA maps the teacher's specifications onto a profile based on the metadata list $M$, representing the user's preferences.
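For concreteness, a purely illustrative container for such a profile over $M$ could be the following (the field types are assumptions).

```python
# Hypothetical representation of the teacher profile built over the metadata
# set M; the field names mirror the list above, the types are assumptions.
from dataclasses import dataclass, field

@dataclass
class TeacherProfile:
    duration: str = ""                      # e.g. "30 minutes per lesson"
    age: str = ""                           # target students' age range
    difficulty: str = ""                    # e.g. "beginner"
    language: str = ""                      # course language
    type: str = ""                          # e.g. "video lesson", "quiz"
    keywords: list[str] = field(default_factory=list)
```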
This set is used to apply three filtering steps on the set of LOs. The first filtering step is defined as:

$$L_1 = \{LO_i \in L \mid \mathrm{similarity}(B(m_{language}), B(m)) > k, \ \forall m \in M_{language}\} \quad (1)$$

where $L_1 \subset L$ is obtained by calculating the similarity between the value $m_{language}$ entered by the teacher and each value $B(m)$ of the language metadata, such that $m \in M_{language}$. The second step applies the $B$ function to the metadata $m_{difficulty}$, $m_{duration}$, $m_{keywords}$ and $m_{age}$ to use S-BERT on the set $L_1$. This process is defined as:

$$L_2 = \{LO_i \in L_1 \mid F\big(f_{avg}(B(m_i)), f_{avg}(B(m)) \mid \Theta\big) > c, \ \forall m \in M \setminus M_{language}\} \quad (2)$$

where $k$ and $c$ are similarity thresholds fixed a priori. In particular, S-BERT applies an average pooling function $f_{avg}$ to the output of $B$ on both input values $m_i$ and $m$. The function $F$ corresponds to a multi-layer perceptron, where $\Theta$ represents the set of learnable parameters of the model; through a $\sigma$ activation function used as the output layer, it returns the similarity score between the two input values $m_i$ and $m$.
The third step consists of applying formula (1) to the metadata values listed as preferred by the teacher to obtain the set of LOs $L_3$. The top-5 LOs from this set are returned to the teacher and displayed via the conversational agent interface.
As explained in the next Section, once the LOs that meet the teacher's needs have been identified, the recommendation system has to leverage the prerequisites that link each LO to the others in order to suggest how to create the final learning path.
3.2 Model for Prerequisite Extraction
Discovering the pedagogical relationships between LOs is a complex and time-consuming practice,
usually performed by domain experts. A prerequisite relationship is a pedagogical relationship that
indicates the order in which concepts can be proposed to the user. In other words, we can say that a
prerequisite relationship exists between two concepts if one is significantly helpful in understanding the
other.
Our recommendation system computes a list of LOs to be recommended, which is sent to the prerequisite analysis model; this model returns an ordering of the LOs according to the concepts they contain.
For this prerequisite extraction model, we used an innovative approach based on deep learning to
automatically identify the prerequisite relationships between concepts to create pedagogically motivated
LO sequences. The model exclusively exploits linguistic characteristics extracted from the LO description
that each LO repository provides. Considering only textual content is perhaps the most complex condition
for inferring relationships between educational concepts since it cannot rely on structured information. At
the same time, this is also the closest condition to a real-world scenario.
The protocol we implemented aims, in the first stage, to extract the five topics that best represent the LO. The topics are inferred using the LO description. In the second phase, we use the five topics to find the five corresponding Wikipedia pages in which each topic is explained.
Then we have to choose the wiki page that exhibits the highest similarity to the original description of the
LO. This allows us to characterise the LO content better with the aim of inferring the pedagogical
relationships that link it to the others. Nevertheless, a single wiki page associated with a LO is insufficient
to understand the LO content well.
For this reason, in phase three, we need to investigate whether other wiki pages can better describe the LO content. To do so, we calculate the similarity of the LO description with all the Wikipedia pages in our dataset, which comprises the wiki pages linked to all topics associated with all the LOs we consider in our prerequisite extraction model. Once this step is finalised, we must choose which wiki page, between the one identified at stage two and the one found at stage three, better describes the LO content. To determine this final mapping, we select the wiki page with the smallest cosine distance (i.e. the highest similarity) with respect to the LO description.
Once each LO is linked to the best wiki page to describe its content in detail, we need to define when a LO
is a prerequisite of another.
In this fourth phase, our model aims to learn if a LO "A" is a prerequisite for a LO "B" by analysing the
related wiki pages. The proposed model implements the identification of prerequisite relationships between concepts in the Italian language and exploits the approach proposed in [26] for the PRELEARN task of the EVALITA 2020 campaign. In the following Sections, we explain in detail each of the four phases that
characterise our prerequisite extraction model.
Phase 1 – Topic Extraction. The first step aims at extracting the topics most representative of each LO.
This task is based on a Topic Modelling approach. Topic Modelling is an unsupervised ML method that receives as input a corpus of documents and extracts the most relevant topics and concepts, allowing the representation of vast volumes of data in a reduced dimension thanks to the identification of hidden concepts, relevant characteristics, latent variables and the semantic structure of the data.
In particular, we used the Latent Dirichlet Allocation proposed by David Blei, Andrew Ng and Michael I. Jordan in [27] as a topic modelling approach to discover the underlying topics to link to each LO in a
collection of text documents. The main idea of the model is the assumption that each document is a
mixture of a fixed number of topics, and each topic is a probability distribution of words. The algorithm
applies an inference process to determine the topic distribution in the documents (in our case, the LO
descriptions) and the word distribution in each topic. We denote V as the vocabulary size, N as the
number of words in a document, and M as the number of documents in the corpus D. Then, the model
assumes the following process for each document w ∈ D.
Choose $N \sim \mathrm{Poisson}(\xi)$
Choose $\theta \sim \mathrm{Dir}(\alpha)$
For each word $w_n$:
    Choose a topic $z_n \sim \mathrm{Multinomial}(\theta)$
    Choose a word $w_n$ from $p(w_n \mid z_n, \beta)$, a multinomial probability conditioned on the topic $z_n$
Then, fixing the hyperparameters $\alpha$ and $\beta$, we can extract the posterior probability of the corpus topics with the following formula, based on a Bayesian process:

$$p(D \mid \alpha, \beta) = \prod_{d=1}^{M} \int p(\theta_d \mid \alpha) \left( \prod_{n=1}^{N_d} \sum_{z_{dn}} p(z_{dn} \mid \theta_d)\, p(w_{dn} \mid z_{dn}, \beta) \right) d\theta_d \quad (3)$$
After completing this step, we can associate each LO with the five main topics that represent it.
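The following sketch illustrates this phase with scikit-learn's LDA implementation; the vectoriser settings and the number of corpus topics are assumptions, while the five-topics-per-LO choice follows the text.

```python
# Minimal sketch of the topic-extraction phase with scikit-learn's LDA;
# the vectoriser settings and the number of corpus topics are assumptions.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

def lo_topics(lo_descriptions: list[str], n_topics: int = 20, per_lo: int = 5):
    # Bag-of-words representation of the LO descriptions (the corpus D).
    vectorizer = CountVectorizer(stop_words="english", max_df=0.95, min_df=2)
    counts = vectorizer.fit_transform(lo_descriptions)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    theta = lda.fit_transform(counts)  # per-document topic mixture (theta)
    words = vectorizer.get_feature_names_out()
    # Label each topic with its most probable word, then keep, for every LO,
    # the five topics with the highest probability in its mixture.
    labels = [words[comp.argmax()] for comp in lda.components_]
    return [[labels[t] for t in doc.argsort()[-per_lo:][::-1]] for doc in theta]
```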
Phase 2 – Wikipedia page extraction. After we have found the five keywords that best represent the most
relevant topics for each LO, we exploit them to search for corresponding five Wikipedia pages that can be
used to explain them.
To accomplish this, we employ the Python library called "Wikipedia-API", which allows us to extract
various information from Wikipedia.
The API allows us to identify the five best wiki pages to associate with a LO, which we can use to characterise the LO content. Moreover, in this phase, we create a dataset of wiki pages related to all the topics of all the LOs in our repository. This approach allows us to narrow our focus to a subset of Wikipedia pages relevant to our domain.
Once each LO is linked to five distinct Wikipedia pages, we need to establish the page that exhibits the highest similarity to the original description of the LO. To do so, we chose to employ the cosine similarity metric. This metric is widely used in Natural Language Processing to gauge the similarity between two vectors based on their angle in a high-dimensional space. Therefore, adopting this approach, we can calculate the distance between the LO description and the associated Wikipedia pages and select the page with the minimum distance as the most similar.
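A possible sketch of this page-selection step, combining the Wikipedia-API library named below with an assumed S-BERT checkpoint for the similarity computation, follows.

```python
# Sketch of the Wikipedia page selection; the user-agent string and the
# similarity model checkpoint are assumptions.
import wikipediaapi
from sentence_transformers import SentenceTransformer, util

wiki = wikipediaapi.Wikipedia(user_agent="lo-mapper (contact@example.org)",
                              language="it")
model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed checkpoint

def best_page(lo_description: str, topics: list[str]) -> str:
    # Fetch one candidate page per topic, skipping topics without a page.
    pages = [p for p in (wiki.page(t) for t in topics) if p.exists()]
    # Pick the page whose summary is closest (cosine similarity) to the LO text.
    desc_emb = model.encode(lo_description, convert_to_tensor=True)
    page_embs = model.encode([p.summary for p in pages], convert_to_tensor=True)
    return pages[int(util.cos_sim(desc_emb, page_embs)[0].argmax())].title
```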
Phase 3 – Similarity Evaluation of LO with all the extracted wiki pages. We cannot consider only a single
wiki page as the best candidate to describe the LO content. We need to find out if, in our dataset, we can use other wiki pages for our purpose. To do so, we calculate the cosine distance between the LO description
and all the Wikipedia pages present in our dataset. This step aims to identify any additional relevant
pages associated with the LO that may have been overlooked previously.
To accomplish this, we must locate the nearest neighbours for each LO description in a high-dimensional
space. However, employing a conventional algorithm like K-Nearest Neighbors for this task can be
computationally demanding. Consequently, we adopted an approach based on Approximate Nearest
Neighbors [28] to address this challenge. This family of methods presents a solution to address the
challenge of high dimensionality by intelligently partitioning the vector space. As a result, we can limit our
analysis to a smaller subset of the original set, easing the computational burden. In general, these
methods can be divided into three groups:
Tree structure
Proximity graph
Hashing methods
In our particular case, we chose to employ a tree-based approach. K-dimensional trees offer a generalised
version of binary trees designed explicitly for searching in high-dimensional spaces. The general
procedure is the following:
Choose a random dimension of the k-dimensional vector
Find the median of that dimension among all the vectors of the current set
Divide the vector space according to the median value
Iterate
Following this process, we construct a tree structure where each leaf represents a subset of vectors from
our entire set. By doing so, we can restrict our nearest neighbour search exclusively to items that belong
to the same leaf, effectively reducing the computational cost.
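As an illustration, the sketch below uses Annoy, a widely used tree-based ANN library; the paper names only the method family, so the specific tool and its parameters are assumptions.

```python
# Sketch of the approximate nearest-neighbour search over the wiki-page
# embeddings; Annoy and its parameters are assumptions (the text specifies
# only a tree-based ANN approach).
from annoy import AnnoyIndex

def build_index(page_embeddings: list[list[float]], n_trees: int = 10) -> AnnoyIndex:
    index = AnnoyIndex(len(page_embeddings[0]), "angular")  # angular ~ cosine
    for i, emb in enumerate(page_embeddings):
        index.add_item(i, emb)
    index.build(n_trees)  # each tree is a recursive partition of the space
    return index

# Usage: instead of comparing a LO description against every page in the
# dataset, the search is restricted to the leaves reached by the query vector.
# ids, dists = index.get_nns_by_vector(lo_embedding, 5, include_distances=True)
```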
Once this step is finalised, we have two options for the best Wikipedia page to associate with a LO: the page identified in phase 2 or the one found at this stage. To determine the final mapping, we select the option with the smallest cosine distance between the LO description and the respective Wikipedia page.
Phase 4 - Prerequisite Extraction based on EVALITA 2020 and BERT. Our proposed model draws
inspiration from the work of Angel et al. in the PRELEARN (Prerequisite Relation Learning) domain shared
task of EVALITA 2020 [26]. EVALITA is a periodic campaign to advance language and speech technology for the Italian language, and the model of Angel et al. achieved the best performance in the task. Specifically, the PRELEARN task focused on classifying whether a pair of concepts exhibits a prerequisite relationship. The dataset analysed in this context was ITA-PREREQ, which contains pairs of Italian concepts connected by a prerequisite relation. In particular, each row of the dataset presents:
Wikipedia page associated with concept A
Wikipedia page associated with concept B
Label: 1 if B is a prerequisite of A, 0 otherwise
Among the proposed solutions, the most effective one involved encoding all the concept pairs using an
Italian model called BERT2 [29], which was fine-tuned on the training dataset of wiki pages mentioned
earlier. Subsequently, a single-layer neural network was utilised to map the 1536 features (768 for each of
the two vectors generated for the wiki pages) produced by the BERT algorithm in the encoding space to
an output space with a dimension of 2. This mapping represents the two possible classes: the existence
of a prerequisite relation or the non-existence of such a relation.
Similarly to the process just described, our approach comprises two steps. First, we fine-tuned BERT on the ITA-PREREQ dataset to obtain the 768-dimensional vector that represents each concept through its Wikipedia description and the related LO. Since our dataset contains LOs that are not labelled according to prerequisite relationships, we could not train the BERT model on it in a supervised way; therefore, we fine-tuned the standard pre-trained BERT model on the EVALITA dataset. Subsequently, the model obtained from this fine-tuning is used to make inferences on our dataset. The aim is to infer, for a pair of LOs, whether A is a prerequisite of B. Specifically, we applied the fine-tuned encoding model alongside a single-layer dense neural network to our LO dataset.
Given a pair of LO, the model receives the two Wikipedia pages associated with them, determined in the
previous steps, then concatenates their embeddings and applies the dense neural network to output the
binary label that represents the existence of a prerequisite relation between LOs.
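A hedged sketch of the inference step is given below; the Italian BERT checkpoint is an assumption, and the fine-tuning on ITA-PREREQ is omitted for brevity.

```python
# Sketch of the prerequisite classifier: an Italian BERT encoder plus a single
# dense layer mapping the concatenated 1536 features onto two classes.
# The checkpoint name is an assumption; training on ITA-PREREQ is omitted.
import torch
from transformers import AutoModel, AutoTokenizer

CHECKPOINT = "dbmdz/bert-base-italian-uncased"  # assumed Italian BERT
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
encoder = AutoModel.from_pretrained(CHECKPOINT)
classifier = torch.nn.Linear(2 * 768, 2)  # prerequisite vs. no prerequisite

def predict(page_a: str, page_b: str) -> int:
    with torch.no_grad():
        embs = []
        for text in (page_a, page_b):
            tokens = tokenizer(text, truncation=True, return_tensors="pt")
            # Use the [CLS] vector as the 768-dimensional page representation.
            embs.append(encoder(**tokens).last_hidden_state[:, 0])
        logits = classifier(torch.cat(embs, dim=-1))
    return int(logits.argmax())  # 1 if B is a prerequisite of A, 0 otherwise
```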
[27] https://pypi.org/project/Wikipedia-API/ Last access: 2023-05-27.
4 Validation of the Chatbot Assistant
4.1 Evaluation of the RASA models
To measure the ability of the virtual assistant to understand what the user writes while interacting with
the chatbot, we tested the NLU model through a 10-fold cross-validation.
The resulting accuracy is 87%. The confusion matrix in Fig. 6 reports how often the intents were confused
with other intents by RASA NLU. As seen in the matrix, the intents relating to skills and abilities, having
similar example sentences, tend to be confused; the same goes for the course topics and names.
The numeric fields (age, number of lessons, formats, video lessons, exercises, quizzes, documents) are very similar to each other; if considered separately, they would lead to unsatisfactory model performance, which would not reflect the actual good functioning of the chatbot. For this very reason, we decided to group them into a single intent to eliminate the risk of confusion and to match the result of the evaluation of the NLU model with the actual efficiency of the chatbot.
4.2 Acceptance and Intention to Use the Chatbot
To evaluate teachers' level of acceptance and intention to use the chatbot, we used the UTAUT model
(The Unified Theory of Acceptance and Use of Technology) [1] as also described in [30]. This model
includes eight user acceptance indicators that are well-validated in the context of many studies. The
UTAUT model presents four significant constructs as direct determinants of user acceptance and intent to
use a new technology: 1. Performance Expectancy (PE); 2. Effort Expectancy (EE); 3. Social Influences
(SI); 4. Facilitating Conditions (FC).
Performance Expectancy measures how much an individual considers a system valuable for improving their job performance. Effort Expectancy measures how easy a system is to use. Social Influence
measures the influence of colleagues, instructors, and friends on the intention to use a new technology
[31,32]. Finally, Facilitating Conditions measure how much external aid can facilitate the adoption and
use of the system.
UTAUT is a generic acceptance analysis model which can be applied to different fields. To obtain a higher
level of detail and a specific adaptation to our context, according to the results of these works [31, 33], we
decided to use an extension model that integrates the primary constructs of the standard UTAUT model
by adding these constructs: 1. Hedonic Motivation (HM); 2. Habit (H); 3. Trust (T).
The first construct measures the degree of appreciation of the system by users and how this could affect
the intention to use it in the future. The second measures how much experience and the habit of using
new technology can be helpful in its more concrete acceptance [33]. Finally, the Trust construct measures
how much trust in the chatbot can affect its acceptance and future use.
Each construct can be related to others, specifying the hypotheses we need to study. As depicted in Fig. 7, each arrow relates one construct to another, defining a hypothesis to check. For example,
Hypothesis 1 (H1) links PE to Behavioural Intention (BI) for evaluating how much the performance
expectancy positively affects teachers' intention to use the suggestions of the digital assistant. Or again,
Hypothesis 6b (H6b), linking HM to PE, measures how much the degree of the chatbot appreciation can
influence how much teachers consider it valuable for improving their job performance. All hypotheses are
presented in Fig. 7.
Twenty-six people participated in the evaluation, mainly chosen among the SocialThingum company
employees and the Computer Science department students of the University of Milan. The idea was to
ask testers to create introductory programming courses in Italian. From a preliminary questionnaire, the participants were aged between 23 and 26 and had a master's or three-year degree in Computer Science. Most of them stated that they occasionally use chatbots, and many have had the opportunity to take advantage of e-learning platforms such as Moodle. Finally, the participants demonstrated solid knowledge of basic programming. During the test, we asked participants to create an introductory course for basic programming. At the end of the test, we provided testers with a questionnaire related to the constructs of our UTAUT model. With a total of 24 questions (5 for measuring PE, 3 each for EE, SI, FC, T, and HM, and 2 each for H and BI), each question investigates how much the user considers the chatbot effective (PE), easy to use (EE), well-rated by colleagues (SI), well-supported (FC), trustworthy (T), pleasant (HM), habitual (H), and finally the intention to use it in the future (BI). The questionnaire uses a 5-point
Likert scale, ranging from 1 (strongly disagree) to 5 (strongly agree). Figure 8-A reports the mean and the standard deviation of the answers for each question of
the constructs.
The average of the responses relating to the "Performance Expectancy" construct is 3.7, therefore
between indecision (3) and agreement (4) on the Likert scale. This score can be considered entirely
satisfactory, as it indicates that users consider the chatbot a valuable tool to facilitate and speed up the
search for LO. The average of the construct "Effort Expectancy" is 4.1. This value suggests that the
chatbot is easy to use and that interacting with the chatbot is clear and understandable. The "Social
Influence" average is 3.7, while the "Facilitating Conditions" is 4.0. Both values imply that the influence of
colleagues is quite relevant and that the user perceives that she/he is well-supported in using the chatbot
and has all the necessary knowledge to use it without problems. The "Trust" construct has a resulting
mean of 3.5. This value means that people have enough trust in the chatbot's recommendations. The
level of trust could grow if more LOs were added to the dataset to return more resources to the teacher's
request.
The average score for the "Hedonic Motivation" construct is 3.4, a good result but not optimal. This result may be due to the limited presence of fun and rewarding components in the chatbot interface.
Unfortunately, RASA restricts the use of various graphical elements, such as animations, which could
make the interface pleasant and attractive. The "Habit" construct has a low average of 2.0, indicating that
the users do not frequently use chatbots. Therefore, previous experience does not affect the degree of
acceptance.
To verify the hypotheses in Fig. 7, we used the structural equation model (SEM) [34], combining factor
analysis and regression. It first constructs latent variables starting from the items that have been defined
and, subsequently, estimates the regressions (specified by the researcher and corresponding to the
hypotheses) using the variables above. Through the results of these regressions, it is possible to verify
which hypotheses are accepted and with which significance level. As can be seen in Fig. 8-B, the SEM
model provides an estimated beta value and a p-value as an output. Beta represents the effect of the
explanatory variable (the antecedent of the hypothesis) on the dependent variable (the consequent of the
hypothesis) and can be either positive or negative. The p-value allows us to derive the significance level
with which the hypothesis is eventually accepted. In this work, the SEM analysis was performed using
Jamovi, an open-source tool for data analysis and the realisation of statistical tests.
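For illustration, an equivalent factor-plus-regression specification could be written in Python with the semopy library (the actual analysis was run in Jamovi; the item names, the CSV file and the subset of hypotheses shown are assumptions).

```python
# Illustrative SEM specification in lavaan-style syntax via semopy; only a
# subset of constructs and hypotheses is shown, and all names are assumptions.
import pandas as pd
import semopy

# Latent constructs measured by their questionnaire items (factor analysis),
# plus regressions encoding hypotheses such as H1 (PE -> BI) and H2b (EE -> BI).
DESC = """
PE =~ pe1 + pe2 + pe3 + pe4 + pe5
EE =~ ee1 + ee2 + ee3
BI =~ bi1 + bi2
BI ~ PE + EE
"""

data = pd.read_csv("utaut_answers.csv")  # assumed file of Likert responses
model = semopy.Model(DESC)
model.fit(data)
print(model.inspect())  # beta estimates and p-values for each regression
```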
From the table, it is possible to observe that habit does not influence the construct of Effort Expectancy,
as hypothesis H7b has not been accepted. This result means that the created assistant is easily usable
even for people who use the chatbot sporadically, like the test participants. The other hypothesis that was not accepted is H2b, which links Effort Expectancy to Behavioural Intention. This result allows us to say that the final intention to use the chatbot does not depend on its level of usability. This result is an
interesting indication because it suggests that the teachers considered the chatbot a helpful assistant
even if its usability can be improved. All the other hypotheses were confirmed, even with a high beta
value.
[28] https://www.socialthingum.it/ Last access: 2023-05-27.
[29] https://moodle.org/ Last access: 2023-05-27.
[30] Jamovi. https://www.jamovi.org/ Last access: 2023-05-27.
5 Conclusion
This paper presents a learning platform that provides teachers with a virtual assistant that helps them create a new digital course. The idea is to use an intelligent assistant to advise teachers about e-learning modules according to their objectives. It can offer an element of flexibility and customisation of the resources by the teachers to meet their needs. These intelligent suggestions are presented through a visualisation designed to offer LOs in an accurate, accountable, transparent and well-explained way.
The chatbot asks the teacher for the main properties of the course, including the age of the students, the
difficulty and the topics covered, necessary to understand the teaching needs of the teacher. Based on the
information obtained, the assistant suggests a series of LOs, which the teacher can view and select.
In developing the chatbot, great attention was paid to usability, ensuring that the teacher can immediately
decide on the LOs and, in the same way, can indicate the data relating to the course. In the design phase,
we defined all possible use cases according to which we developed the virtual assistant's actions and the
system's general architecture. Subsequently, we moved on to implementing the chatbot through the RASA
framework, an open-source framework which, thanks to the use of natural language processing models,
allows the creation of sophisticated chatbots. Then we defined the forms, the RASA components with
which the chatbot asks for the user's requested information, and the custom actions necessary to
integrate customised functions into RASA.
Our recommendation service takes the course data indicated by the teacher and forwards them to the procedure that filters the LOs. For the parsing, we decided to use Sentence-BERT, a machine-learning model based on Transformers, to identify the LOs with the metadata most semantically similar to the data entered by the teacher. Once we completed its development, the virtual assistant was integrated into an HTML page placed on the Google Cloud Platform, using Docker and Kubernetes to make the page accessible on the web by IP address.
In the final testing phase, several experiments were conducted on the NLU model to evaluate the chatbot's
understanding ability.
To evaluate the impact of the virtual assistant on the teachers' activity, we adopted an extended version
of the UTAUT model to study its acceptance and intention to use it. To understand the factors driving the
teachers' intention to use the digital assistant's suggestions, we recruited 26 participants. As discussed in
the paper, the final results of our tests demonstrate sound effects for concerns about the acceptance of
the virtual assistant. Specifically, good values concern the quality of the assistant's ability to
communicate effectively, the level of perceived trust in its suggestions and finally, how the teachers'
experience affects their perception of the ease of use of the assistant. Further researches aim at
extending the studying involving more teachers with a broader range of competencies in other learning.
References
1. Venkatesh, V., Morris, M.G., Davis, G.B., Davis, F.D.: User acceptance of information technology:
Toward a unified view. MIS Q., 425–478. (2003)
2. Okonkwo, C.W., Ade-Ibijola, A.: Chatbots applications in education: A systematic review. Computers
and Education: Artificial Intelligence. 2, 100033 (2021)
3. Medeiros, R.P., Ramalho, G.L., Falcão, T.P.: A systematic literature review on teaching and learning
introductory programming in higher education. IEEE Trans. Educ. 62(2), 77–90 (2018)
4. Smutny, P., Schreiberova, P.: Chatbots for learning: A review of educational chatbots for the Facebook Messenger. Computers & Education 151, 103862 (2020)
5. Alias, S., Sainin, M.S., Fun, S.T., Daut, N.: Identification of conversational intent pattern using pattern-growth technique for academic chatbot. In: Multi-disciplinary Trends in Artificial Intelligence: 13th International Conference, MIWAI 2019, Kuala Lumpur, Malaysia, November 17–19, 2019, Proceedings, pp. 263–270. Springer International Publishing (2019)
6. Wu, E.H.K., Lin, C.H., Ou, Y.Y., Liu, C.Z., Wang, W.K., Chao, C.Y.: Advantages and constraints of a hybrid model K-12 E-Learning assistant chatbot. IEEE Access 8, 77788–77801 (2020)
7. Murad, D.F., Irsan, M., Akhirianto, P.M., Fernando, E., Murad, S.A., Wijaya, M.H.: Learning support system using chatbot in "Kejar C Package" homeschooling program. In: 2019 International Conference on Information and Communications Technology (ICOIACT), pp. 32–37. IEEE (2019)
8. Lam, C.S.N., Chan, L.K., See, C.Y.H.: Converse, connect and consolidate–The development of an
artificial intelligence chatbot for health sciences education. In Frontiers in Medical and Health
Sciences Education Conference. Bau Institute of Medical and Health Sciences Education, Li Ka Shing
Faculty of Medicine, The University of Hong Kong. (2018)
9. Deschênes, M.: Recommender systems to support learners' Agency in a learning Context: a
systematic review. Int. J. Educational Technol. High. Educ. 17(1), 50 (2020)
10. Urdaneta-Ponte, M.C., Mendez-Zorrilla, A., Oleagordia-Ruiz, I.: Recommendation systems for
education: systematic review. Electronics. 10(14), 1611 (2021)
11. Gamma, E., Helm, R., Johnson, R., Vlissides, J.: Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley (1995)
12. Goodyear, P.: Educational design and networked learning: Patterns, pattern languages and design
practice. Australian J. Educational Technol. 21(1), 82–101 (2005)
13. Mushtaha, E., Dabous, S.A., Alsyouf, I., Ahmed, A., Abdraboh, N.R.: The challenges and opportunities
of online learning and teaching at engineering and theoretical colleges during the pandemic. Ain
Shams Engineering Journal. 13(6), 101770 (2022)
14. Ruiz, J.G., Mintzer, M.J., Issenberg, S.B.: Learning objects in medical education. Med. Teach. 28(7),
599–605 (2006)
15. Urdaneta-Ponte, M.C., Mendez-Zorrilla, A., Oleagordia-Ruiz, I.: Recommendation systems for
education: systematic review. Electronics. 10(14), 1611 (2021)
16. Wu, E.H.K., Lin, C.H., Ou, Y.Y., Liu, C.Z., Wang, W.K., Chao, C.Y.: Advantages and constraints of a hybrid model K-12 E-Learning assistant chatbot. IEEE Access 8, 77788–77801 (2020)
17. Campbell, L.M.: Engaging with the learning object economy: Introducing learning objects and the
object economy. In: Reusing Online Resources, pp. 53–63. Routledge (2003)
18. SCORM: Sharable Courseware Object Reference Model, Retrieved June 7, 2007, from, (2003).
http://www.adlnet.gov/downloads/downloadpage.aspx?ID=243
19. Dagienė, V., Jevsikova, T., Kubilinskienė, S.: An integration of methodological resources into learning
object metadata repository. Informatica. 24(1), 13–34 (2013)
20. Hoebelheinrich, N., Biernacka, K., Brazas, M., Castro, L.J., Fiore, N., Hellstrom, M., …, Whyte, A.:
Recommendations for a minimal metadata set to aid harmonised discovery of learning resources
(2022)
21. Palavitsinis, N., Manouselis, N., Sanchez-Alonso, S.: Metadata quality in learning object repositories:
a case study. The Electronic Library (2014)
22. Duval, E., Hodgins, W., Sutton, S., Weibel, S.L.: Metadata principles and practicalities. D-lib Magazine.
8(4), 1–10 (2002)
23. Zschocke, T., Beniest, J., Paisley, C., Najjar, J., Duval, E.: The LOM application profile for agricultural
learning resources of the CGIAR. Int. J. Metadata Semant. Ontol. 4(1–2), 13–23 (2009)
24. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: Bert: Pre-training of deep bidirectional transformers for
language understanding. arXiv preprint arXiv:1810.04805. (2018)
25. Reimers, N., Gurevych, I.: Sentence-bert: Sentence embeddings using siamese bert-networks. (2019).
arXiv preprint arXiv:1908.10084.
26. Angel, J., Aroyehun, S.T., Gelbukh, A.: Nlp-cic@ prelearn: Mastering prerequisites relations, from
handcrafted features to embeddings. (2020). arXiv preprint arXiv:2011.03760.
27. Blei, D., Ng, A., Jordan, M.: Latent Dirichlet allocation. In: Advances in Neural Information Processing Systems, 14 (2001)
28. Indyk, P., Motwani, R.: Approximate nearest neighbors: towards removing the curse of dimensionality.
In Proceedings of the thirtieth annual ACM symposium on Theory of computing (pp. 604–613).
(1998), May
29. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL-HLT, pp. 4171–4186 (2019)
30. Valtolina, S., Matamoros, R.A.: EUD Strategy in the Education Field for Supporting Teachers in
Creating Digital Courses. In End-User Development: 8th International Symposium, IS-EUD 2023,
Cagliari, Italy, Jun 6–8, 2023, to be published in Springer International Publishing. (2023)
31. Venkatesh, V., Davis, F.D.: A theoretical extension of the technology acceptance model: Four
longitudinal field studies. Manage. Sci. 46(2), 186–204 (2000)
32. Warshaw, P.R.: A new model for predicting behavioral intentions: An alternative to Fishbein. J. Mark.
Res. 17(2), 153–172 (1980)
33. Venkatesh, V., Thong, J.Y., Xu, X.: Consumer acceptance and use of information technology:
extending the unified theory of acceptance and use of technology. MIS Q., 157–178. (2012)
34. Fan, Y., Chen, J., Shirkey, G., John, R., Wu, S.R., Park, H., Shao, C.: Applications of structural equation
modeling (SEM) in ecological studies: an updated review. Ecol. Processes. 5, 1–12 (2016)
Figures
Figure 1
General architecture of the e-learning platform
Figure 2
Two screenshots present the chatbot interaction. In the first one, the chatbot asks the teacher for
information about creating a programming course. In detail, it asks to insert: 1. the average time for each
lesson, 2. the number of topics to cover, and 3. if the teacher wants to add new topics. In this example, the
teacher responds "yes" and then inserts a new topic: "conditional structures". On the right, in the final
request, the chatbot asks to insert prerequisites the students need to know before taking the course. The
teacher indicates knowing how to drag and drop and the basics of first-order logic.
Figure 3
In the screenshot on the right, the chatbot asks the teacher to select the LOs to insert into the final course.
Clicking on a LO on the left, it is possible to see further information about it: the topics the LO covers, a
description, the rating and its keywords (including accessibility indications).
Figure 4
The Figure presents possible learning paths the teacher can define for the course. The teacher selects LO
number 2, and the system shows possible related learning paths. The recommendation service suggests
the sequence according to the prerequisites of the LO3 and LO4, and then the teacher has to indicate the
branching direction. The branches show the paths the student has to follow according to the evaluation
obtained at the end of the previous LO.
Figure 5
Cloud architecture of the e-learning platform.
Figure 6
Intent confusion matrix of the NLU model.
Figure 7
Hypotheses schema. Each arrow represents a hypothesis that measures how much a construct can
affect the validation of the other. For example, H6b, linking HM to PE, measures how much the degree of
the chatbot appreciation can influence how much teachers consider it valuable for improving their job
performance.
Figure 8
The Figure in Section A presents a table showing the mean and standard deviation of the answers to the
questions of the UTAUT model. While the Figure in Section B offers a table containing the results of the
SEM analysis, in particular indicating which hypotheses were accepted as the final result of the test.