2011 Fourth International Symposium on Innovation in Information & Communication Technology

OntBot : Ontology based ChatBot


Hadeel Al-Zubaide, and Ayman A. Issa

Hadeel Al-Zubaide, Department of Computer Science, Princess Sumaya
University for Technology, Jordan.
Ayman A. Issa, Software Engineering Department, Philadelphia University,
Jordan.

978-1-61284-675-0/11/$26.00 ©2011 IEEE

Abstract— A new ontology-based approach is proposed to model and operate
chatbots (OntBot). OntBot uses an appropriate mapping technique to
transform ontologies and knowledge into a relational database and then
uses that knowledge to drive its chats. The proposed approach overcomes a
number of drawbacks of traditional chatbots, including the need to learn
and use a chatbot-specific language such as AIML, heavy botmaster
involvement, and the use of immature technology. OntBot has the
additional advantages of easy user interaction in natural language and
seamless support for different application domains. This gives the
proposed approach a number of scalability and interoperability properties
that will be evaluated in future phases of this research project.

I. INTRODUCTION

In recent years the development of ontologies has been moving from the
realm of Artificial-Intelligence laboratories to the desktops of domain
experts. Ontologies have become common on the World Wide Web and a
modeling trend in Information Systems development, where their benefits
can be put to use. Ontologies are human readable, comprehensive, sharable
and formal, which means that they are expressed in a language that has
well-defined semantics. Ontologies are important to application
integration solutions because they provide a shared and common
understanding of the data that exist within an application integration
problem domain. Ontologies also facilitate communication between people
and information systems. However, while today there is an unprecedented
wealth of information available on the Web, fully realizing the power of
ontologies and enabling efficient and flexible information gathering
requires persistent storage of ontologies and their subsequent retrieval.

On the other hand, relational database technology provides mature
facilities for storing, updating and manipulating the information of a
problem domain. Relational databases have also proved capable of coping
with large amounts of data [1], and these data can be represented by the
ontologies themselves. In addition, features of relational database
management systems (e.g. transaction management, security and integrity
control) make them preferable to traditional file systems. This is one of
the strong motivations behind our proposed approach of using relational
databases as a storage candidate. A number of tools and techniques handle
the mapping of one into the other: ontology to relational database and
vice versa.

One of the main issues currently facing the large number of ontologies
stored in databases is the lack of easy-to-use interfaces for data
retrieval, due to the need to use special query languages or
applications. Currently, users who wish to utilize ontology repositories
need to know the contents of the ontology, which means understanding OWL
(Web Ontology Language) or RDF (Resource Description Framework), and know
how to query these ontologies using one of the ontology query languages,
e.g., SPARQL. Such requirements are among the major reasons the Semantic
Web has not become mainstream, as fewer users than expected are utilizing
such knowledge.

A chatbot is a computer program that interacts with users using natural
languages [2]. Chatbot systems make it simple to realize a dialogue
system based on natural language. Therefore, they can be used as
interfaces to a wide range of applications, including entertainment
applications, educational applications, e-learning platforms, search
engines, and e-commerce web-site navigation.

The usefulness and complexity of ontology-based data retrieval, the
features of relational database management systems, the lack of
easy-to-use query interfaces, and the features of chatbots have directed
this research to investigate the possibility of a usable ontology-based
chatbot that interprets and responds to queries.

In this paper, an ontology-based chatbot (OntBot) is proposed to provide
an easy-to-use, domain-independent, scalable, dynamic and smart
conversational agent. In the proposed approach, the ontology is first
converted into a relational database as the basis for a strong chatbot
knowledge base.

Sections II and III present background and related work, respectively.
Section IV details the proposed approach. The conclusions and future work
are outlined in Section V.

II. BACKGROUND

This section describes the components that are not the main contributions
yet are important for the proposed approach: chatbots, ontologies, and
the mapping of ontologies to relational databases.

A. Chatbot

A chatbot (or chatterbot) is a type of conversational agent, a computer
program designed to simulate an intelligent conversation with one or more
human users via auditory or textual methods. Such computer programs are
also known as
Artificial Conversational Entities (ACE). Although many appear to be
intelligently interpreting the human input prior to providing a response,
most chatterbots simply scan for keywords within the input and pull a
reply with the most matching keywords, or the most similar wording
pattern, from a local database.

The intelligent behavior of a chatbot shows in the way it responds, such
that the human becomes convinced of chatting with another human instead
of a computer program. The degree of intelligent behavior depends on the
knowledge base (the information that the bot knows): poor knowledge bases
lead to weak chatbot responses, while strong ones do the opposite. Strong
knowledge bases may require years to create.

Most chatbots rely on fairly simple tricks to appear lifelike. Richard
Wallace, creator of the top-ranked chatbot ALICE [3] (Artificial
Linguistic Internet Computer Entity), has handwritten a database of
thousands of possible conversational gambits. The ALICE software utilizes
AIML (Artificial Intelligence Mark-up Language), an XML-like language
designed for creating stimulus-response chat robots. The ALICE chatbot's
knowledge base is composed of question-answer modules, called categories,
structured with AIML. The model of learning in ALICE is called supervised
learning because a person, called the botmaster, plays a crucial role.
The botmaster monitors the robot's conversations and creates new AIML
content to make the responses more appropriate, accurate, or believable.

Many chatbots have been deployed in strictly limited domains for
information seeking, site guidance, and FAQ answering. Most existing
chatbots consist of dialog management modules that control the
conversation process and chatbot knowledge bases that respond to user
input. A typical implementation of a chatbot knowledge base contains a
set of templates that match user inputs and generate responses. Templates
currently used in chatbots, however, are hand coded. Therefore, the
construction of chatbot knowledge bases is time consuming and difficult
to adapt to new domains.

The proposed OntBot approach does not utilize AIML; rather, it is being
built using a general-purpose programming language such as VB.Net. In
addition, a relational database will be used to store OntBot's knowledge
instead of files. Further, the source of this knowledge is ontologies
from the WWW that will be automatically converted to entities capable of
being stored in the database, so no handwriting is needed. OntBot
therefore requires neither special AIML skills nor knowledge archiving
skills, which overcomes the main drawbacks of traditional chatbots.

B. Ontologies

Ontology is the key enabling technology for the semantic web. An ontology
is a specification of a conceptualization; it can be viewed as a
vocabulary used to describe a world model in the semantic web. The main
purpose of an ontology is to enable communication between computer
systems in a way that is independent of the individual system
technologies, information architectures and application domains.
Ontologies are human readable, comprehensive, sharable and formal, which
means that they are expressed in a language that has well-defined
semantics.

Ontology population has been identified as a key enabler of practical
semantic applications in industry. When an ontology is populated, it
contains not only the schema or definition of the classes/concepts and
relationship names but also a large number of entities that constitute
the instance population of the ontology. Another important factor related
to the population of the ontology is that it should be possible to
capture instances that are highly connected (i.e., the knowledge base
should be deep, with many explicit relationships among the instances).

In our suggested approach, ontologies represent the source of the
knowledge that OntBot knows. Regardless of its domain, an ontology should
first be mapped and transformed into a relational database, as explained
in the next section.

C. Ontology to Relational Database Mapping

To store ontology data in databases and execute queries on it, several
alternatives exist, such as relational, object or object-relational
databases. Storing ontologies in relational databases is less
straightforward than storing them in object or object-relational
databases, because relational database management systems do not support
inheritance. However, relational database management systems have
significant advantages over object or object-relational database
management systems. In particular, they provide maturity, performance,
robustness, reliability, and availability, which is why we chose the
relational database option.

Several studies [10, 11, 12] have been conducted on the mapping between
ontologies and relational databases. Discussing each study in detail is
out of our scope, since the proposed approach works on the result of this
mapping and builds on it.

Transformation of ontologies to relational databases is based on a set of
rules, called mapping rules, that specify how to map constructs of the
ontological model to the relational model. The mapping rules are then
applied to an ontology to produce a relational database. Figure 1 shows
the flow of this transformation.

In OWL, a class can be regarded as a relational table. Properties of a
class can be regarded as the attributes of a relational table. The
inheritance relation between classes can be realized by foreign keys
between relational tables [13].

Transformation of an ontology into a relational database involves a
series of transformations. First, the ontology classes are transformed
into relational database tables. Next come the ontology object
properties: once OWL classes are mapped to tables, object properties may
be transformed into relational database
relations. After this, the ontology data type properties are transformed
into relational database columns. Finally, ontology constraints are
transformed into relational database metadata tables [12].

Fig. 1: Transformation of ontology to relational database.

III. RELATED WORK

A practical approach for enhancing a language-independent conversational
agent for question answering using Semantic Web knowledge has been
developed [14]. It has a cascade-type architecture that is divided into
several components. The first component is a chatbot built on top of
AliceBot; it relies on converting Semantic Web knowledge to AIML format,
which is motivated by the work of Freese [4]. The second component is the
ability to build answers from the ontology by parsing and categorizing
the user's input. This approach uses XML files to store the generic
patterns for each predefined question domain (six types of questions are
predefined). The patterns are generated by querying the ontology graph
with the Protege API and then replacing the template tags defined in the
language files. The process of finding an answer by querying the ontology
is done in three steps.

IV. THE PROPOSED APPROACH: ONTBOT

This section discusses the proposed approach; Figure 2 shows the general
architecture of OntBot.

A. OWL to Database Mapping

This part represents a novel step in the chatbot world: OntBot depends on
any of the available ontology to relational database mapping techniques
to obtain its knowledge. The transformed ontology is stored in tables to
form the base of chatbot knowledge, which is manipulated by an inference
engine to suit chatbot needs and methodologies.

Traditional chatbots are domain dependent: the botmaster is responsible
for statically handwriting thousands or more possible conversational
scenarios, questions and their corresponding answers for a specific
domain, so that the bot appears lifelike to humans. A new chatbot has to
be created, and separate scenarios built by the botmaster, each time the
domain changes. The power of OntBot lies in its capability to handle
different domain knowledge automatically without human intervention:
whatever ontology files are mapped, after the conversion into database
tables OntBot should be able to handle different question-answer
scenarios by itself without the need to predefine them.

Fig. 2: OntBot Architecture.

B. OntBot Knowledge Base

Botmasters use AIML to create questions and answers in the form of
categories and store the resulting scenarios in AIML files. These files
represent the knowledge base of the chatbot [8, 9]. In OntBot, by
contrast, the
knowledge base will contain the tables that result from mapping the
ontology. This gives our chatbot all the facilities and advantages of a
database and DBMS over the file system.

The OntBot knowledge base could be integrated with any existing database.
It may also contain extra optional tables. These tables could be
predefined or added at any time; they may contain scenarios to handle
greetings and out-of-scope topics, or session information to be managed
or used later by the botmaster.

C. Natural Language Processing Module

This module is responsible for processing user input in a way that
facilitates finding the needed answer. Several functions should be
applied before searching for a match in the OntBot knowledge base in
order to succeed in getting the best answer: input tokenization, stopper
filtering, stemming and synonym handling. Figure 3 illustrates the four
functions and their sequence.

Fig. 3: Natural Language Processing Module.

Tokenization, or splitting the input into words, is an important first
step in natural language processing. It involves some operations that are
necessary to facilitate the target matching process. A set of splitters
is used to break the user input down into tokens. Such splitters include
spaces, punctuation, special symbols and others. Both user input and
stored words are tokenized.

Stopper filtering is the phase that removes the set of fluff words that
may exist in user input. The stop list consists of common function words
such as determiners (a, the, this), prepositions (in, from, to),
conjunctions (after, since, as) and coordinators (and, or), as well as
words that occur frequently but contribute little meaning, such as about,
them and only.

Word stemming is an important feature, especially in indexing and search
systems. It can also be thought of as a fault-tolerance technique in our
case, since we match word roots (on both sides: the input word and the
stored word it is matched against). Thus, if an entry in our database,
which comes from the WWW through ontology mapping, is in the present
tense while the user writes it in the past tense, it is not considered a
mismatch. There are many available stemming algorithms [9]; for example,
OntBot can employ one of the most effective and widely used ones, the
Porter stemming algorithm (or 'Porter stemmer').

Synonyms/alternatives of a word are obtained in order to be checked. This
is required if no match is found between a user input token and a stored
word; in this case OntBot assumes that the user may have used a synonym
of the stored word, so it retrieves all synonyms. These synonyms are in
turn treated as new entries and searched one by one against the target
stored words. If again no match is found, the word is considered
mismatched. WordNet [16] is going to be used to find word alternatives.

D. OntBot Inference Engine

This is the main component of the architecture; it represents the brain
of OntBot. The input to this module is the normalized user question that
results from the Natural Language Processing Module, while the final
output is the target answer. This answer is then forwarded to the next
module, which prepares and decorates it for final presentation. Three
sub-modules, as shown in Figure 4, make up the engine: Scope Specifier,
Rule Matcher, and Query Processor.

Fig. 4: OntBot Inference Engine.

1) Scope Specifier

This module attempts to act like a human in understanding what the user
is talking about. It tries to specify the scope of a question in order to
handle the conversation correctly and to supply the next module with all
needed information regarding that scope. When it finds that the user is
talking randomly or out of scope, it helps the user get to the point by
notifying him about the domain of the conversation; he can then ask
OntBot for more details on how to benefit from it, as a secondary service
provided by this module. Technically speaking, given the normalized user
input tokens that result from the previous phase (the Natural Language
Processing Module), this module relies on OntBot's knowledge base (data
and/or metadata) to find any match with the user input, whether it is a
match with a table name, an attribute or a specific instance. If a match
is found, the Scope Specifier feeds the next module, the Rule Matcher,
with the information it needs to continue the job of getting the
response, as described next.

2) Rule Matcher

After the scope of the conversation is specified and the needed
information obtained from the previous module, the Rule Matcher tries to
find a matching rule, given the normalized user input tokens as another
input. The Rule Matcher performs some manipulation over both inputs to
form the basis it relies on when searching for an appropriate rule. Such
rules should be in the following form:

Question Format => Query Format [That = Value].

The left-hand side of each rule represents a possible style of question
users may ask. Questions that OntBot can handle range from simple to
complex. The degree of complexity reflects how deep or detailed the
question is and hence how complicated the corresponding query will be.
User questions are matched against the questions on this left-hand side
of a rule. The right-hand side represents the corresponding query that
should be passed to the Query Processor for execution if a match is
found, while the last part, [That = Value], is used as it is in AIML in
traditional chatbots: to enable the bot to remember what it said in the
previous interaction, so that conversations become more meaningful and
human-like. OntBot keeps track of the current output of the Scope
Specifier module (table names, attributes and instances), so that when
the user uses a pronoun to refer to one of them in the next question,
OntBot uses it to understand and continue the conversation, as in the
following example.

User: who is Eric?
OntBot: He is Doctor.
User: How old is him? (user used him to refer to Eric)
OntBot: 31. (OntBot got it)

Simple question domains are direct questions. Such questions can be
definition questions, measure questions, list questions, comparison
questions, yes/no questions and many others. Complex question domains
include any questions that need extra analysis or indirect manipulation.
Examples of complex questions include those that are mathematically based
or those that need to access more than one table to get the needed
answers. Here, we can benefit from all the facilities of the relational
database and its DBMS. Sample rules are shown in Table I.

OntBot rules are dynamic in terms of the way they are defined and the
domain they cover. The botmaster of OntBot can define a new rule each
time he identifies a suitable one; he can dynamically add any kind or
number of rules, from simple to complex, at any time, as long as they can
be translated into a real SQL query. What distinguishes OntBot is the
ability to deal with various question domains, as long as the questions
can be answered from the knowledge available in the OntBot knowledge
base, whereas the other ontology-based chatbot is limited to fixed,
predefined question domains [14].

3) Query Processor

This module is responsible for the actual execution of queries. It takes
the appropriate queries from the Rule Matcher and checks their
correctness before executing them against the OntBot knowledge base.
Retrieved results are then passed in a suitable form to the Answer
Formalism unit, which takes care of displaying readable and
understandable answers to the user.

E. Answer Formalism

Before displaying answers to users, it is vital to ensure that they are
readable and free of spelling and grammatical errors. The way results are
displayed should also be friendly and understandable to naïve users,
especially since these answers simply represent table entries. We may
need to form the plural of a word; JBoss DNA, which is implemented in
Java, can be used to find the plural of a given word. We can also use
MorphAdorner, implemented in Java, together with Verbix, an online
conjugation resource, to perform verb conjugation. Verbix can be accessed
from within code by sending HTTP requests and parsing the result.
Finally, if we need to know the gender of a word (male or female) in
order to form the answer, we can use the WordNet tool for this purpose.

V. CONCLUSION

A new approach to developing an ontology-based chatbot (OntBot) is
proposed in this paper. In OntBot, the ontology is first mapped
automatically into a relational database to form its knowledge base.
Users can interact with OntBot easily in natural language, so there is no
need to learn any query language or to know the contents of the
underlying ontology.

OntBot provides a friendly, easy-to-use and efficient user interface.
OntBot will be ontology-portable: the ontology is a plug-in component
that can be replaced with another ontology that models a completely
different domain without any change in the system, which makes OntBot
more dynamic and flexible.

The OntBot botmaster can extend the capabilities of OntBot's brain by
defining new rules at any time, which increases the range of
conversations OntBot can handle.
TABLE I
Inference Engine's Sample Rules

Rule 1
  Question format:        What is Xi?
  Scope Specifier output: Xi: instance in Table Y, which has A and B
                          attributes. Xi belongs to A.
  Query:                  Select B from Y where A = Xi
  That:                   Xi
  Example
  Question:               What is Stack?
  Scope Specifier output: Stack: instance in Table Data Structures, which
                          has Name and Definition as attributes. Stack
                          belongs to the Name attribute.
  Query:                  Select Definition from Data_Structures
                          where Name = "Stack"
  That:                   Stack

Rule 2
  Question format:        How many numbers of Y are there?
  Scope Specifier output: Y: table name. P.S. A is the PK of Y.
  Query:                  Select Count(Name) from Y
  That:                   Y
  Example
  Question:               How many number of data structures are there?
  Scope Specifier output: Y: Data Structures table, which has Name and
                          Definition as attributes.
  Query:                  Select Count(Name) from Data_Structures
  That:                   Data Structures

Rule 3
  Question format:        List me all Y's As and Bs who have C more than Xi?
  Scope Specifier output: Y: table name, which has A, B, C and D
                          attributes. Xi: instance.
  Query:                  Select A, B from Y where C > Xi
  That:                   Y, A, B, C and Xi
  Example
  Question:               List me all Employee's Names and Ages who have
                          Salary more than 700$?
  Scope Specifier output: Y: Employee table, which has Name, Address, Job
                          Title, Salary and Age as attributes. 700 belongs
                          to the Salary attribute.
  Query:                  Select Name, Age from Employee where Salary > 700
  That:                   Employee, Name, Age, Salary and 700

Rule 4
  Question format:        If there is any Y where A is Xi, display B
  Scope Specifier output: Y: table name, which has A, B, C and D
                          attributes. Xi: instance.
  Query:                  Select B from Y where exists
                          (select A from Y where A = Xi)
  That:                   Y, A, B and Xi
  Example
  Question:               If there is any Employee where Address is London,
                          display the name?
  Scope Specifier output: Y: Employee table, which has Name, Address, Job
                          Title, Salary and Age as attributes. London
                          belongs to the Address attribute.
  Query:                  Select Name from Employee where exists
                          (select Address from Employee
                          where Address = "London")
  That:                   Employee, Name, Address and London
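
The first two sample rules of Table I can be illustrated with a short
sketch. It is only an illustration under our own assumptions, not the
OntBot prototype (which the paper says is being built in a language such
as VB.Net): Python and SQLite stand in for the implementation language
and the DBMS, the regular expressions, the function names build_demo_kb
and answer, and the sample definitions are hypothetical, and only the
Data_Structures table with its Name and Definition attributes is taken
from the table. Each rule pairs a question pattern with a parameterized
query, and the matched instance is remembered as "That".

```python
import re
import sqlite3


def build_demo_kb():
    """Create an in-memory stand-in for the mapped ontology tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Data_Structures (Name TEXT PRIMARY KEY, Definition TEXT)"
    )
    conn.executemany(
        "INSERT INTO Data_Structures VALUES (?, ?)",
        [
            ("Stack", "A last-in, first-out collection."),
            ("Queue", "A first-in, first-out collection."),
        ],
    )
    return conn


# Each rule: question pattern => parameterized query, plus the captured
# groups that are bound to the query and remembered as "That".
RULES = [
    (
        re.compile(r"what is (?:a |an |the )?(?P<xi>\w+)\s*\?*$", re.IGNORECASE),
        "SELECT Definition FROM Data_Structures WHERE lower(Name) = lower(?)",
        ("xi",),
    ),
    (
        re.compile(
            r"how many (?:numbers? of )?data structures are there\s*\?*$",
            re.IGNORECASE,
        ),
        "SELECT COUNT(Name) FROM Data_Structures",
        (),
    ),
]


def answer(conn, question, context):
    """Find the first matching rule, run its query, and update the context."""
    for pattern, sql, group_names in RULES:
        match = pattern.match(question.strip())
        if match is None:
            continue
        values = [match.group(name) for name in group_names]
        row = conn.execute(sql, values).fetchone()
        if values:
            context["that"] = values[0]  # available for pronoun follow-ups
        return row[0] if row else None
    return None  # no rule matched: the question is out of scope here


kb = build_demo_kb()
context = {}
print(answer(kb, "What is Stack?", context))
print(answer(kb, "How many data structures are there?", context))
```

Binding the instance through a query parameter rather than string
concatenation is a design choice of this sketch; it is one simple way to
keep user input from altering the query before it is executed.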

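Before a question can be matched against rules, the Natural Language
Processing Module of Section IV.C normalizes it. The four steps
(tokenization, stopper filtering, stemming and synonym handling) can be
sketched as follows. This is a simplified illustration under our own
assumptions: the stop list contains only the example words given in the
text, crude_stem is a hypothetical suffix stripper standing in for the
Porter stemmer, and the SYNONYMS dictionary is a made-up stand-in for a
WordNet lookup.

```python
import re

# Stop list built from the example words given in the text.
STOP_WORDS = {"a", "the", "this", "in", "from", "to", "after", "since",
              "as", "and", "or", "about", "them", "only"}

# Hypothetical synonym table; OntBot plans to obtain these from WordNet.
SYNONYMS = {"wage": ["salary", "pay"], "staff": ["employee"]}


def tokenize(text):
    """Split on spaces, punctuation and other non-word symbols."""
    return [t for t in re.split(r"[^\w]+", text.lower()) if t]


def crude_stem(word):
    """Strip a few common suffixes (a simplified stand-in, not Porter)."""
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"
    if word.endswith("sses"):
        return word[:-2]
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    for suffix in ("ing", "ed"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(text):
    """Tokenize, drop stop words, and stem what remains."""
    return [crude_stem(t) for t in tokenize(text) if t not in STOP_WORDS]


def match_token(token, stored_words):
    """Match by stem first, then fall back to synonyms of the token."""
    stems = {crude_stem(w.lower()): w for w in stored_words}
    if crude_stem(token) in stems:
        return stems[crude_stem(token)]
    for synonym in SYNONYMS.get(token, []):
        if crude_stem(synonym) in stems:
            return stems[crude_stem(synonym)]
    return None  # treated as a mismatch


print(normalize("List the names and ages from employees"))
print(match_token("wage", ["Employee", "Salary"]))
```

Stemming both the input token and the stored word is what lets
"employees" match a stored "Employee"; the synonym lookup is tried only
after that direct match fails, following the order described in the text.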
A prototype of OntBot is being developed and evaluated against a number
of application domains to demonstrate its generalizability. Further, a
user evaluation is being planned with users associated with the research
project.

REFERENCES

[1] Goodwin, R., Lee, J., and Stanoi, G., (2005). M.I. Leveraging
    Relational Database Systems for Large-Scale Ontology Management in
    Proceeding of CIDR Conference.
[2] Abu Shawar, B. and Atwell, E., (2007). Chatbots: Are They Really
    Useful? in Proceedings of LDV-Forum 2007 Band 22 (1) pp. 31-50.
[3] ALICE, (2011). A.L.I.C.E. Artificial Intelligence Foundation
    [online]. Available from: http://alice.pandorabots.com/ [Accessed
    05/25/2011].
[4] Freese, E., (2007). Enhancing AIML Bots Using Semantic Web
    Technologies in Proceeding of Extreme Markup Languages.
[5] Falbo, R., Menezes, C., and Rocha, A., (1998). A Systematic Approach
    for Building Ontologies in Proceedings of 6th IberoAmerican
    Conference on AI, number LNCS1484 in Lecture Notes in Artificial
    Intelligence, Lisbon, Portugal pp. 349-360.
[6] Falbo, R., Guizzardi, G., and Duarte, K., (2002). An Ontological
    Approach to Domain Engineering in Proceedings of 14th International
    Conference on Software Engineering and Knowledge Engineering
    (SEKE'02), Ischia, Italy pp. 351-358.
[7] Liao, L., Qu, Y., and Leung, H., (2005). A Software Process Ontology
    and Its Application in Proceedings of Workshop on Semantic Web
    Enabled Software Engineering (SWESE), Galway, Ireland.
[8] Ruy, F., Bertollo, G., and Falbo, R., (2004). Knowledge-Based
    Support to Process Integration in ODE. CLEI Electronic Journal,
    7 (1).
[9] Abran, A., Cuadrado, J., García-Barriocanal, E., Mendes, O.,
    Sánchez-Alonso, S., and Sicilia, M., (2006). Ontologies for Software
    Engineering and Software Technology. Berlin Heidelberg:
    Springer-Verlag.
[10] Gali, A., Chen, C., and Kajal, T., Claypool and Rosario UcedaSosa.
    From Ontology to Relational Databases [online]. Hawthorne, NY 10532,
    U.S.A.: IBM T. J. Watson Research Center.
[11] Astrova, I., Korda, N., and Kalja, A., (2007). Storing OWL
    Ontologies in SQL Relational Databases in Proceedings of World
    Academy of Science, Engineering and Technology.
[12] Vysniauskas, E. and Nemuraite, L., (2006). Transforming Ontology
    Representation From OWL to Relational Database. Information
    Technology and Control, 35 (3A).
[13] Zhuge, H., Xing, Y., and Shi, P., Resource Space Model, OWL and
    Database: Mapping and Integration [online]. Beijing, China: Chinese
    Academy of Sciences.
[14] Alexandru Dobrila, t., (2010). From Semantic Web Knowledge To A
    Functional Conversational Agent: A Practical Approach [online].
    Available from: http://airtudor.com/semantic_chatbot.pdf [Accessed
    05/21/2011].
[15] Fernández, M., Gómez-Pérez, A., Pazos, J., and Pazos, A., (1999).
    Building a Chemical Ontology Using MethOntology and the Ontology
    Design Environment. IEEE Intelligent Systems Applications, 4 (1),
    pp. 37-45.
[16] WordNet, (2010). WordNet Project [online]. WN Team. Available from:
    http://wordnet.princeton.edu/ [Accessed 05/21/2011].
