Georgios Papadimitriou
© The copyright in this thesis is owned by the author. Any quotation from the thesis or use of any
of the information contained in it must acknowledge this thesis as the source of the quotation or
information.
Abstract
This thesis is concerned with autonomous operations with Autonomous Underwater Vehi-
cles (AUVs) and maritime situation awareness in the context of enhancing maritime defence
and security. The problem of autonomous operations with AUVs is one of persistence. That
is, AUVs get stuck due to a lack of cognitive ability to deal with a situation and require in-
tervention from a human operator. This thesis focuses on addressing vehicle subsystem
failures and changes in high level mission priorities in a manner that preserves autonomy
during Mine Countermeasures (MCM) operations in unknown environments. This is not a
trivial task. The approach followed utilizes ontologies for representing knowledge about the
operational environment, the vehicle as well as mission planning and execution. Reasoning
about the vehicle capabilities and consequently the actions it can execute is continuous and
occurs in real time. Vehicle component faults are incorporated into the reasoning process
as a means of driving adaptive planning and execution. Adaptive planning is based on a
Planning Domain Definition Language (PDDL) planner. Adaptive execution is prioritized
over adaptive planning as mission planning can be very demanding in terms of computa-
tional resources. Changes in high level mission priorities are also addressed as part of the
adaptive planning behaviour of the system. The main contribution of this thesis regard-
ing persistently autonomous operations is an ontological framework that drives an adaptive
behaviour for increasing the persistent autonomy of AUVs in unexpected situations, that is,
when vehicle component faults threaten to put the mission at risk and when changes in high
level mission priorities must be incorporated into decision making.
Building maritime situation awareness for maritime security is a very difficult task.
High volumes of information gathered from various sources as well as their efficient fu-
sion taking into consideration any contradictions and the requirement for reliable decision
making and (re)action under potentially multiple interpretations of a situation are the most
prominent challenges. To address those challenges and help alleviate the burden on hu-
mans, who usually undertake such tasks, this thesis is concerned with maritime situation
awareness built with Markov Logic Networks (MLNs) that support humans in their decision
making. However, commonly maritime situation awareness systems rely on human experts
to transfer their knowledge into the system before it can be deployed. In that respect, a
promising alternative for training MLNs with data is presented. In addition, an in-depth
evaluation of their performance is provided during which the significance of interpreting an
unfolding situation in context is demonstrated. To the best of the author’s knowledge, it is
the first time that MLNs are trained with data and evaluated using cross validation in the
context of building maritime situation awareness for maritime security.
This thesis is dedicated to my parents, Konstantinos and Paraskevi, and to my sister, Vaia.
Acknowledgements
First, I would like to thank my supervisor, Prof. David Lane, for giving me the opportunity
to be part of the Ocean Systems Laboratory. His vision, experience and guidance helped
me a lot throughout my PhD studies. A special thank you goes to Dr. Zeyn Saigol for
his constructive advice and guidance during our weekly progress meetings without which
studying to acquire a PhD degree would be a much harder task. I would also like to thank
my colleagues in the Ocean Systems Laboratory for all the fruitful conversations and the
wonderful time spent together. I am extremely grateful to my family for being uncondition-
ally supportive and believing in me. Last but not least, I would like to thank Despoina for
her support all these years and for reminding me that work is not everything in life.
The work presented in this thesis received funding from the Defence Science and Technol-
ogy Laboratory (DSTL) of the UK’s Ministry of Defence under contract number DSTLX-
1000064104.
ACADEMIC REGISTRY
Research Thesis Submission
Declaration
In accordance with the appropriate regulations I hereby submit my thesis and I declare that:
1) the thesis embodies the results of my own work and has been composed by myself
2) where appropriate, I have made acknowledgement of the work of others and have made reference to work carried
out in collaboration with other persons
3) the thesis is the correct version of the thesis for submission and is the same version as any electronic versions
submitted*.
4) my thesis for the award referred to, deposited in the Heriot-Watt University Library, should be made available for
loan or photocopying and be available via the Institutional Repository, subject to such conditions as the Librarian
may require
5) I understand that as a student of the University I am required to abide by the Regulations of the University and to
conform to its discipline.
6) I confirm that the thesis has been verified against plagiarism via an approved plagiarism detection application e.g.
Turnitin.
Contents
I Preliminaries 1
1 Introduction 2
1.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.1 Autonomous Underwater Operations . . . . . . . . . . . . . . . . 2
1.1.2 Maritime Situation Awareness . . . . . . . . . . . . . . . . . . . . 3
1.2 Thesis Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Thesis Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.4 Thesis Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
II Background 8
IV Conclusions 208
10 Conclusions 209
10.1 Future Work Suggestions . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
List of Figures
3.1 Relation of the OWL variants in terms of their expressive power. OWL Lite
is a subset of OWL DL while both are subsets of the OWL Full variant. . . 49
3.2 Ontology classes of entities that can be found in the hypothetical under-
water MCM domain organised in a taxonomy. The is-a notation denotes
a superclass-subclass connection between classes. Thing is a special class
that plays the role of the root class (root superclass) of all classes in a do-
main. This is analogous to root nodes found in tree data structures. . . . . . 51
3.3 Ontology individuals as well as their respective classes. Individuals can be
identified by the purple diamond preceding their name while classes can
be identified by the yellow circle. Blue arcs link individuals to the classes
they belong while purple arcs link classes to each other forming superclass-
subclass connections. MooredMine1, Anchor1 and AnchorChain1 are indi-
viduals (instantiations) of their respective classes. . . . . . . . . . . . . . . 52
3.4 Object property isTiedToChain forms a binary relation (orange arc) between
individuals MooredMine1 and AnchorChain1. Blue arcs link individuals to
the classes they belong while purple arcs link classes to each other forming
superclass-subclass connections. The information illustrated in this figure
was produced by manually creating the isTiedToChain object property in-
side the ontology illustrated in Figure 3.3 in order to form a binary relation. 53
3.5 KnowRob Architecture. Taken from [2]. . . . . . . . . . . . . . . . . . . . 55
5.1 The OODA loop. FF corresponds to feed forward flow of information while
FB to feedback. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
6.1 ROS nodes communicating at runtime over topics and via services. . . . . . 101
6.2 Communication between an action client and server. . . . . . . . . . . . . 102
7.1 High level framework architecture for persistently autonomous operations. . 105
7.2 World ontology class hierarchy. . . . . . . . . . . . . . . . . . . . . . . . . 106
7.3 Four-level bottom-up vehicle modelling approach. The purple boxes repre-
sent ontologies while the arrows indicate the direction of incremental mod-
elling starting from the components ontology up to the vehicle ontology. . . 108
7.4 Components ontology class hierarchy. . . . . . . . . . . . . . . . . . . . . 109
7.5 Capabilities ontology class hierarchy. We have intentionally omitted the
class hierarchy of the components ontology (see Figure 7.4) for better read-
ability. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
7.6 Actions ontology class hierarchy. We have intentionally omitted the class
hierarchy of the components and capabilities ontologies (see Figures 7.4
and 7.5 respectively) for better readability. . . . . . . . . . . . . . . . . . . 112
7.7 Vehicle ontology class hierarchy. We have intentionally omitted the class
hierarchy of the components, capabilities and actions ontologies (see Fig-
ures 7.4, 7.5 and 7.6 respectively) for better readability. . . . . . . . . . . . 113
7.8 Relations between the vehicle, components, capabilities and actions inside
the vehicle ontology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
7.9 System identification and update phases. . . . . . . . . . . . . . . . . . . . 116
7.10 System identification steps. Step 1 handles the identification of vehicle
components, step 2 handles the identification of vehicle capabilities while
step 3 the identification of vehicle actions. . . . . . . . . . . . . . . . . . . 117
7.11 Vehicle system update phase decomposition. . . . . . . . . . . . . . . . . . 121
7.12 Exemplar lawnmower trajectory. . . . . . . . . . . . . . . . . . . . . . . . 125
7.13 Entropy versus Probability . . . . . . . . . . . . . . . . . . . . . . . . . . 136
7.14 Planning ontology class hierarchy. We have intentionally omitted the world,
components, capabilities, actions and vehicle ontologies for better readability. . 145
7.15 Execution ontology class hierarchy. . . . . . . . . . . . . . . . . . . . . . 149
8.12 High level mission priorities and adaptation. Start points for each trajec-
tory, denoted by green stars, are the points where the vehicle finished its
lawnmower pattern and performed classification to estimate MLOs. Black
stars represent mission end points. Red triangles denote the points in which
adaptation occurred. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
8.13 Average total entropy reduction over the number of MLO reacquisition
actions taken by the vehicle: i) under the three different high level mis-
sion priorities (upper graph), ii) when adapting from the Energy-Prob
to the Energy-Ent priority (middle graph), iii) when adapting from the
Energy-Ent to the Energy priority (bottom graph). . . . . . . . . . . . . . 188
8.14 Adaptation due to mine detection (software) module fault (red square). The
mission start point is represented by the green star while the mission end
point is represented by the black one. The geometrical configuration of
mines used in this experiment is the same as the configuration illustrated
throughout this chapter while the high level mission priority chosen is en-
ergy efficiency. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
8.15 Adaptation due to sonar fault (red square). The mission start point is rep-
resented by the green star while the mission end point is represented by the
black one. Purple spheres represent the mines that the vehicle was not able
to detect due to the limited range of the camera while the purple square rep-
resents the recovery of the sonar. The geometrical configuration of mines
used in this experiment is the same as the configuration illustrated through-
out this chapter while the high level mission priority chosen is energy effi-
ciency. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
9.5 Performance of the MLN in identifying vessels that should raise or should
not raise an alarm to a human operator, under context-free conditions. ROC
curves and AUC for the 10-fold cross validation are illustrated in the left
graph. Mean ROC curve and mean AUC are illustrated in the right graph. . 204
9.6 Performance of the MLN in classifying pairs of vessels rendezvousing or
not rendezvousing, under context-free conditions. ROC curves and AUC
for the 10-fold cross validation are illustrated in the left graph. Mean ROC
curve and mean AUC are illustrated in the right graph. . . . . . . . . . . . 205
List of Tables
4.1 Plan solving the problem illustrated in Snippet 3 given the temporal domain
illustrated in Snippet 2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
8.1 Plan for detection of mines. Numbers on the left hand-side represent esti-
mated start time points while numbers on the right hand-side (in brackets)
represent estimated durations for each durative action. By estimated time
points we refer to the points in time after each plan starts executing with a
reference to 0.000 seconds as being the start of the plan execution. . . . . . 171
8.2 Distances between lawnmowerpoints. . . . . . . . . . . . . . . . . . . . . 171
8.3 Plan statistics for the detection plan shown in Table 8.1. . . . . . . . . . . . 172
8.4 Plan for classification of mines. The number on the left hand-side repre-
sents the estimated start time point while the number on the right hand-side
(in brackets) represents the estimated duration for the durative
classification action. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
8.5 Plan statistics for the classification plan shown in Table 8.4. . . . . . . . . . 173
8.6 Plans for reacquisition of MLOs (top) and inspection of mines (bottom) in
an energy-efficient manner. Numbers on the left hand-side represent esti-
mated start time points while numbers on the right hand-side (in brackets)
represent estimated durations for each durative action. By estimated time
points we refer to the points in time after each plan starts executing with a
reference to 0.000 seconds as being the start of the plan execution. . . . . . 176
8.7 Plan statistics for the plans shown in Table 8.6. . . . . . . . . . . . . . . . 178
8.8 Distances between mlopoints (top) and between inspectionpoints (bottom)
for the plans shown in Table 8.6. . . . . . . . . . . . . . . . . . . . . . . . 178
8.9 Plans for reacquisition of MLOs under different high level mission prior-
ities. The plan for energy efficient plus probability efficient reacquisition
is shown at the top while the plan for energy efficient plus entropy effi-
cient reacquisition is shown at the bottom. Numbers on the left hand-side
represent estimated start time points while numbers on the right hand-side
(in brackets) represent estimated durations for each durative action. By
estimated time points we refer to the points in time after each plan starts
executing with a reference to 0.000 seconds as being the start of the plan
execution. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
8.10 Probabilities and entropies of MLOs. . . . . . . . . . . . . . . . . . . . . . 180
8.11 Plan statistics for the energy-efficient plus probability-efficient reacquisi-
tion plan shown at the top of Table 8.9. . . . . . . . . . . . . . . . . . . . . 180
8.12 Plan statistics for the energy-efficient plus entropy-efficient reacquisition
plan shown at the bottom of Table 8.9. . . . . . . . . . . . . . . . . . . . . 182
8.13 Average mission statistics of planning and executing MCM missions under
10 random geometrical configurations of mines using all three high level
mission priorities. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
9.1 MLN formulas used in the rendezvous scenario presented in Snidaro et al.
[5]. Variables v, y represent vessels while OpenSea, IntWaters, Harbour,
NearCoast are constants with which a zone variable can become bound
and Smuggling, Clear constants with which a report variable can become
bound. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
9.2 Adjusted rendezvous scenario MLN formulas. Variables x, y represent ves-
sels while z is a zone variable which can become bound with constants
OpenSea, IntWaters, Harbour, NearCoast and r is a report variable which
can become bound with constants Smuggling, Clear. . . . . . . . . . . . . 198
9.3 Exemplar entries from the artificially generated dataset. . . . . . . . . . . . 200
9.4 Trained MLN with one of the 10 folds. . . . . . . . . . . . . . . . . . . . . 201
9.5 Two exemplar evidence sets with evidence for two vessels: V124 and V496.
The first set contains contextual information illustrated in red while the
second set is generated by stripping all contextual information from the
first set. Also illustrated, are the ground truth situations that correspond to
the evidence. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
9.6 Outcomes of probabilistic inference when querying for whether vessels
V124, V496 are suspicious and for whether an alarm should be raised for
those vessels. That is, given the evidence found in Table 9.5. . . . . . . . . 206
9.7 Outcome of probabilistic inference when querying for the existence of a
rendezvous incident between vessels V124, V496 given the evidence found
in Table 9.5. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
Part I
Preliminaries
Chapter 1
Introduction
potentially more cost effective to deploy, reduce risk to human personnel in the field, pro-
vide enhanced effectiveness through quality of sensor data and length of deployment, can
be deployed organically and with greater agility around a theatre of operation. Current
generations of AUVs, however, are only capable of autonomous operation under human
supervision, or with preprogrammed scripts for behaviours. Many over-the-horizon or
extended-endurance operations required to achieve the benefits above will need greater so-
phistication in automatically responding to unknown and uncertain environments, failure
of vehicle subsystems, and changes in objectives. A specific example of this in the un-
derwater domain is the current Mine Countermeasures, Hydrography and Patrol Capability
(MHPC) programme [6], where off board sensors (i.e. AUVs) will be deployed from metal
hull minesweepers conducting Mine Countermeasures (MCM) operations by the end of the
decade. For the Concept of Operations (ConOps) currently under development, longer en-
durance with greater flexibility, robustness, adaptability and persistence in autonomy than
are available in current commercial AUVs (e.g. REMUS) will be required.
In the future, AUVs must be more persistent. In practice, this means they must make
fewer requests for assistance from an operator when they get stuck due to a lack of cognitive
ability to deal with a situation. This requires that they are capable of adapting their mission
plans on the fly in response to changes observed in the environment around them and in
themselves (i.e. failures), using sensors and by communication. Environments will be par-
tially known a priori, and must therefore be observed and modelled internally as a basis for
decision making in planning. Goal sequences that produce a priori mission plans must be
capable of modification and task sequences adapted while maintaining coherence with entry
and exit states required by other parts of the plan. Consequently, this requires adaptabil-
ity in world modelling, planning and execution in both space and time, in the presence of
vehicle and sensor constraints and limitations (e.g. endurance, speed, range and accuracy).
underwater. The purpose of MSA is to identify incidents that need to be dealt with to avoid
or constrain undesirable effects that can threaten safety and security in the area where such
incidents are observed and/or potentially “linked” areas that can be also affected. Regarding
environmental elements, these can vary from surface vessels such as ships to underwater
ones such as submarines. We call these dynamic elements since they can move. In addition,
environmental elements can be static such as oil rigs and harbours, to name but a few, while
situations, and consequently incidents, can involve various combinations thereof. Incidents
can be either intended or unintended. Unintended incidents are related to safety, e.g. a
cargo vessel being on a collision course with an oil extraction platform. On the other hand,
intended incidents are related to intended illegal activities such as smuggling and terrorism
[7]. In an increasingly uncertain and dangerous global setting, the threat of people and orga-
nizations involved in illegal activities at sea is greater than ever. Small vessels can be used
to board other bigger vessels in order to loot them and/or hold their crew as hostage for ran-
som in acts of piracy and terrorism. Moreover, all sorts of smugglers and traffickers (arms,
drugs, humans, oil, etc.) commonly use naval routes to promote their activities and distribute
their cargo. Irrespective of the situation, building and maintaining awareness of it correctly
and in a robust manner is a non-trivial task. The three most prominent challenges are: i) the
high volumes of information gathered from various sources, ii) their efficient fusion taking
into account any contradictions (observations uncertainty) and iii) the requirement for reli-
able decision making and (re)action in the presence of potentially multiple interpretations
of a situation (situation uncertainty).
Currently, maritime situation awareness is mostly built around human operators, follow-
ing in this manner a human-centric approach, meaning that humans are heavily engaged in
the assessment of an unfolding situation. More specifically, under such a setting, humans
have to evaluate vast amounts of information, some of which may be contradictory, making
its fusion extremely difficult even for operators with years of experience. Other factors such
as fatigue can lead to poor judgement and decisions which at best delay the identification
of incidents and at worst can be catastrophic, that is, major incidents that could put human
lives at risk and pose a great threat to maritime security may be overlooked entirely. The
identification of the aforementioned problems with the human-centric approach led to a gradual
shift towards system-centric maritime situation awareness over the last few years. That is,
the development and deployment of situation awareness systems whose purpose is to ease
the burden on humans and support them. Sometimes system-centric situation awareness is
referred to as next-generation situation awareness [8]. Currently, such systems commonly
rely on human domain experts to transfer and encode their knowledge into the system so
that it can be later deployed. This is mostly a repetitive, trial-and-error, time-consuming
process. There are also studies suggesting that there is loss of information since only a
small portion of knowledge is actually transferred (e.g. see Nilsson et al. [9]). As such,
knowledge acquisition should not only be automated but also come directly from data.
ontological representation upon which reasoning procedures can reassess the AUV’s inter-
nal state dynamically and in real time and enable the vehicle to adapt and recover so that
mission goals are satisfied to the maximum extent possible. Another contribution is that the
framework enables AUVs to respond to high level mission priority changes and adapt in
real time. The application of the framework to MCM which is presented in Chapter 8 is of
particular interest to the maritime defence community not only due to the aforementioned
features but also due to the very nature of the high level mission priorities. That is, further
to the standard MCM missions where energy consumption and/or execution time govern the
mission, we have provided an alternative perspective in which criteria such as probability
and entropy are also considered during mine reacquisition and inspection. Finally, in Chap-
ter 9, to the best of the author’s knowledge, it is the first time that Markov logic networks
are trained with data and evaluated using cross validation for building maritime situation
awareness in the context of maritime security. This thesis has resulted in the following
publications:
mechanisms described within Chapter 7 are better understood. Having said that, Chapter
7 describes the aforementioned persistently autonomous operations framework where the
framework’s architecture is presented and a detailed analysis of the framework’s ontology-
based knowledge representation and reasoning is provided. Finally, Chapter 7 describes the
adaptive mission planning and execution approach followed within the framework. Chapter
8 presents the application of the framework to MCM. The chapter starts with an introduc-
tion to MCM and the phases which it comprises and continues with the few modifications
made to the framework in order to be applicable to the MCM setting. Moreover, experimen-
tal results with respect to conducting MCM under different high level mission priorities are
presented as are experimental results that demonstrate the framework’s adaptive behaviour
and recovery capabilities in the presence of high level mission priority changes and com-
ponent faults respectively. Chapter 9 is concerned with maritime situation awareness in
the context of maritime security like Chapter 5 but from a different perspective. More
specifically, Chapter 9 is concerned with training and evaluating Markov Logic Networks
for maritime situation awareness building on the background knowledge provided in Chap-
ter 5. Generative, likelihood-based weight learning approaches are presented within the
chapter. Experiments and experimental results are provided and analysed in the context of
identifying vessels rendezvousing in order to get involved in illegal activities. Finally, in
Chapter 10, we present our overall conclusions and suggestions for future work.
Part II
Background
Chapter 2

Knowledge Representation and Reasoning

This chapter presents a broad field of Artificial Intelligence (AI) which is concerned
with knowledge representation and reasoning. Conceptually, we divided the chapter into
three main sections in order to present logic-based, probabilistic and hybrid approaches
(see Sections 2.1, 2.2 and 2.3 respectively). Logic-based approaches are fundamental in
understanding both ontologies which are part of Chapters 3, 7, 8 and the logic part of hy-
brid approaches in Section 2.3. In particular, the description logics section (Section 2.1.3),
is relevant for the OWL DL ontologies that lie at the centre of autonomous underwater
operations, while first-order logic in Section 2.1.2 is relevant to Markov logic networks in
Section 2.3.3. Markov logic networks are used in Chapter 9 for maritime situation aware-
ness. In addition, probabilistic approaches in Section 2.2 are useful in understanding the
probabilistic aspect of hybrid approaches in Section 2.3.
based agent might or might not be in [11]. The term model can be used instead of possible
world when we strictly refer to mathematical abstractions that are used to calculate the truth
values of formulas. In order for this to be possible, formulas must contain no occurrences
of free variables, or no variables at all. We will clarify this in Sections 2.1.1 and 2.1.2 where
we talk about the syntax and its elements as well as the semantics of formulas. Now, if a
formula a is true in a model m, then we say that m is a model of a or that m satisfies a [11].
The notation used for denoting all models of a formula a is M(a).
Next, we will talk about reasoning and associated concepts. Reasoning is the act of
drawing/deriving conclusions. A fundamental concept in logical reasoning is entailment
which is also known as logical consequence, that is, a formula follows from another formula
[11]. Therefore, entailment describes the relation between the meanings of formulas. We
will say that a formula a entails formula b (a |= b) iff (if and only if) in every model in
which a is true, b is also true [11]:
KB ⊢i b (2.3)
are entailed in the KB” [11]. This is also known as the correctness property. Complete is
any inference algorithm that can derive all valid outcomes. More formally, an inference
algorithm is called complete when it can derive any formula that is entailed in the KB.
Additionally, an inference algorithm is decidable if it is guaranteed to terminate and
determine whether or not a given formula is entailed in the KB.
So far, we have introduced and analysed several fundamental concepts of logical sys-
tems both in terms of knowledge representation and reasoning. This has hopefully created
a solid basis to which the reader can return for clarification when needed.
2.1.1.1 Syntax
As evidenced by its name, the syntax of propositional logic refers to the set of rules defining
the structure of allowable formulas as well as allowable logical connectives or operators.
Formulas in propositional logic can be either atomic or complex. Atomic formulas, or lit-
erals, are formulas that consist of a single proposition. Single propositions are also known
as propositional variables; they can be either true or false, and the naming convention is that
they are denoted with an uppercase letter. A, N are some examples of single propositions
or atomic formulas or propositional variables. Note that since propositional variables can
be either true or false they are bound to a set of values (true/false). Now, complex formulas
on the contrary consist of atomic formulas connected by logical operators and parentheses.
We present the logical operators following their order of precedence when appearing in
formulas:

1. ¬: Read as negation or not. The negation of a literal produces a complex formula.
For instance, ¬L is the negation of L.

2. ∧: Read as conjunction or and. The conjunction of two or more literals produces
a complex formula. For instance, L ∧ M ∧ N is a conjunction with the conjuncts
being L, M, N.
3. ∨: Read as disjunction or or. The disjunction of two or more literals produces a
complex formula. For instance, L ∨ N is a disjunction with the disjuncts being L, N.
4. ⇒: Read as implies. The implication – as its name suggests – connects two formulas
to denote that some formula implies some other formula or that some implied formula
2.1.1.2 Semantics
The semantics in propositional logic are a set of rules based on which the truth values (true
or false) of formulas are determined given some model (see Section 2.1). To determine
the truth values of formulas one needs to know the truth values of propositions. For n
propositions in a KB the number of possible models is 2^n. Table 2.1 illustrates the possible
models for a KB whose formulas consist of two propositions.
Table 2.1. Possible models mi for a KB whose formulas consist of two propositions L, M.
Let us now introduce two special propositions, the True and the False proposition. True
and False are propositions and they should not be confused with true and false which are
truth values. To distinguish between the two, notice that True and False begin with an
uppercase letter which denotes that they are propositions while true and false begin with
a lowercase letter. True is a special proposition that is always true and False is a special
proposition that is always false. To complete this section, we present Table 2.2, which is a
truth table that illustrates the semantic rules in propositional logic.
Table 2.2. Truth table of propositional logic semantics associating logical connectives (¬, ∧,
∨, ⇒, ⇔) with propositions L, M.
2.1.1.3 Inference
Inference in propositional logic can be achieved through theorem proving or model check-
ing. In theorem proving we apply inference rules to KB formulas trying to derive a proof
of a desired goal. In contrast, in model checking we enumerate all possible models trying
to derive a proof that the desired goal holds in all possible models [11].
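To make the model checking approach concrete, the following is a minimal illustrative sketch in Python (not taken from the thesis); the function name tt_entails and the encoding of the KB and query as Boolean functions over a model dictionary are assumptions made purely for illustration.

    from itertools import product

    def tt_entails(kb, query, symbols):
        # Check KB |= query by enumerating all 2^n possible models.
        # kb and query map a model (dict: symbol -> bool) to a truth value.
        for values in product([True, False], repeat=len(symbols)):
            model = dict(zip(symbols, values))
            # Entailment: query must be true in every model where KB is true.
            if kb(model) and not query(model):
                return False
        return True

    # Example: KB = (L => M) and L; query = M (an instance of Modus Ponens).
    kb = lambda m: ((not m["L"]) or m["M"]) and m["L"]
    query = lambda m: m["M"]
    print(tt_entails(kb, query, ["L", "M"]))  # True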
Let us begin with theorem proving and inference rules. Inference rules are used by
theorem proving inference algorithms and guarantee sound inference. Recall from Section
2.1 that an inference algorithm is sound when it derives only formulas that are entailed in
the KB. Table 2.3 illustrates some of the most famous inference rules in propositional logic;
for a complete list of rules the interested reader should refer to [11].
Modus Ponens: (a ⇒ b, a) / b. Given any formulas of the form a ⇒ b and a, formula b can
be inferred.
Modus Tollens: (a ⇒ b, ¬b) / ¬a. Given any formulas of the form a ⇒ b and ¬b, ¬a can
be inferred.
Hypothetical Syllogism: (a ⇒ b, b ⇒ c) / (a ⇒ c). Given any formulas of the form a ⇒ b
and b ⇒ c, a ⇒ c can be inferred.
Disjunctive Syllogism: (¬a, a ∨ b) / b. Given any formulas of the form ¬a and a ∨ b, b can
be inferred.
And-Elimination: (a ∧ b) / a. Given a conjunction of any formulas a ∧ b, any of the conjuncts
can be inferred.
Resolution: (¬a ∨ b, a ∨ c) / (c ∨ b). Given any formulas of the form ¬a ∨ b and a ∨ c, c ∨ b
can be inferred.

Table 2.3. Sample table of propositional logic inference rules, written as premises / conclusion;
a, b, c are formulas.
Except for the property of soundness another desirable property in inference algorithms
is completeness. Recall from Section 2.1 that an inference algorithm is complete when it
can derive any formula that is entailed in the KB. Soundness can be expressed as follows:
KB ⊢i a =⇒ KB |= a, where KB is a knowledge base, a is a formula and i an inference
algorithm. That is, if a is derived from KB by some i, then KB entails a. In other words, if
we can prove a from KB given some i, then a is true given KB. Now, completeness can be
expressed as follows: KB |= a =⇒ KB ⊢i a. That is, if KB entails a then a can be derived
from KB by some i. In other words, if a is true given KB, then we can prove a from KB
given some i.
Algorithms that couple the resolution inference rule (Table 2.3) with any complete
search algorithm are also complete. To achieve this the KB has to be expressed in Conjunc-
tive Normal Form (CNF). A KB in CNF consists exclusively of conjunctions of clauses. A
clause is a disjunction of literals. Luckily, any formula in propositional logic can be written
as a conjunction of clauses and be logically equivalent to the original. Additional theorem
proving inference algorithms are the forward and backward chaining algorithms.
In order for forward and backward chaining to be applicable, the propositional KB has
to be in definite clausal form. A definite clause is a disjunction of literals with exactly one
positive (i.e. non-negated) literal while a disjunction of literals with no positive literals
is called a goal clause. Both definite and goal clauses are Horn clauses. A Horn clause
is a disjunction of literals with at most one positive literal. Formulas that consist of only
a single positive literal are called facts. A Horn clause can be written as an implication
whose antecedent is a conjunction of positive literals and its consequent a positive literal
(for definite clauses) or False (for goal clauses). For the case of a fact, the antecedent is the
True proposition and the consequent is the single positive literal. For example, ¬ A ∨ ¬ B
∨ C, ¬ A ∨ ¬ B and C can be written as A ∧ B ⇒ C, A ∧ B ⇒ False and True ⇒ C re-
spectively. Having a Horn KB of definite clauses written as a series of implications enables
both the forward and backward chaining algorithms to exploit the Modus Ponens inference
rule (see Table 2.3). Algorithm 1 illustrates the forward chaining inference algorithm in
propositional logic for a Horn KB that consists of definite clauses [11].
Algorithm 1: The forward chaining algorithm in propositional logic for a knowledge base
that consists of definite clauses [11].
The algorithm takes as input a KB consisting of definite clauses and a query which is a
propositional symbol (proposition). The agenda is a queue that contains propositional sym-
bols that are known to be true before the algorithm starts processing the KB. The counter is
a table that contains the number of propositional symbols whose value is unknown for each
antecedent of every implication. Inferred is another table that contains information (true,
false values) about whether a propositional symbol was processed by the algorithm. Ini-
tially all values in inferred are false. The algorithm starts by getting a proposition symbol
from the agenda. If the proposition symbol matches the query then the algorithm terminates
returning true. If not, it checks whether that symbol has already been processed before and
if not then it changes the respective value of inferred to true. The algorithm then continues
by decreasing the counter for every antecedent in which that symbol is present and if the
counter of an antecedent reaches zero then its consequent is added to the agenda. This
process is repeated until the agenda is empty or until a symbol from the agenda matches
the query. Forward chaining is sound and complete and runs in linear time O(n) in the
size of the KB. Also, it is classified as a data-driven inference algorithm because it starts
processing based on known data (true propositional symbols).
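A minimal Python sketch of this agenda-based procedure follows; it is an illustrative reconstruction in the spirit of Algorithm 1, not the thesis's implementation, and the representation of definite clauses as (premises, conclusion) pairs is an assumption made for the sketch.

    from collections import deque

    def pl_fc_entails(clauses, facts, query):
        # clauses: list of (premises, conclusion), e.g. (["A", "B"], "C") for A ∧ B ⇒ C.
        # facts: symbols known to be true; query: the proposition symbol to prove.
        count = {i: len(premises) for i, (premises, _) in enumerate(clauses)}
        inferred = {}          # tracks symbols already processed
        agenda = deque(facts)  # queue of symbols known to be true
        while agenda:
            p = agenda.popleft()
            if p == query:
                return True
            if not inferred.get(p, False):
                inferred[p] = True
                for i, (premises, conclusion) in enumerate(clauses):
                    if p in premises:
                        count[i] -= 1
                        # All premises proved: the consequent becomes known.
                        if count[i] == 0:
                            agenda.append(conclusion)
        return False

    # A ∧ B ⇒ C and C ⇒ D with facts A, B: the query D is derived.
    kb = [(["A", "B"], "C"), (["C"], "D")]
    print(pl_fc_entails(kb, ["A", "B"], "D"))  # True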
The backward chaining algorithm operates differently from the forward chaining algo-
rithm in the sense that it starts from the query working backwards. That is, it does not start
searching the implications in which the initial true propositions are part of the antecedents,
trying in this way to infer the consequents and by extension the query. Backward chaining
instead starts from the query, searching for implications whose consequents are the query,
trying to prove the antecedents. If one of them is proved to be true then the query is true.
If none is true, the query is false. Backward chaining is sound and complete and runs in
linear time O(n) in the size of the KB. In addition, it is classified as a goal-driven inference
algorithm because it starts processing based on known goals.
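For comparison, a correspondingly minimal backward chaining sketch over the same clause representation might look as follows (illustrative only; the cycle detection that a robust implementation needs is omitted for brevity).

    def bc_entails(clauses, facts, goal):
        # Goal-driven: prove the goal by proving the premises of some clause
        # whose consequent is the goal. No cycle detection (sketch only).
        if goal in facts:
            return True
        return any(
            all(bc_entails(clauses, facts, p) for p in premises)
            for premises, conclusion in clauses
            if conclusion == goal
        )

    # Same KB as above: D is proved backwards through C.
    kb = [(["A", "B"], "C"), (["C"], "D")]
    print(bc_entails(kb, {"A", "B"}, "D"))  # True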
So far we have presented the theorem proving reasoning approach. Let us now discuss
two algorithmic families that lie within the model checking reasoning approach. The
first one is backtracking and the second is local search. Both families are used for solving
the satisfiability problem also known as the SAT problem: checking the satisfiability of a
formula. A formula is satisfiable if it is true in at least one model.
Regarding backtracking in logic2 , the general idea behind it is enumeration of partial
possible models in a tree structure given a formula to be checked for satisfiability. Back-
tracking starts with a node that is a partial possible model and at each stage it expands the
node with a more refined child node. That is, it refines the partial possible model by
assigning a value to an unknown propositional symbol. If the refined model can no longer
be a candidate for satisfying the formula, then the algorithm backtracks to the parent node
and chooses a different value assignment for the symbol. This operation is executed
recursively until all solutions (possible models) are enumerated, if any exist. Figure 2.1
illustrates how backtracking works. One of the most famous algorithms that utilise back-
tracking is the Davis-Putnam-Logemann-Loveland (DPLL) algorithm [12, 13]. The input
of the DPLL algorithm is any formula expressed in CNF and the output is the set of models
that satisfy the formula (if any). DPLL is both sound and complete, with a time complexity
of O(2^n) in the size of the CNF formula.
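The following is a bare-bones Python sketch of the backtracking idea behind DPLL (illustrative; the unit propagation and pure-literal heuristics that full DPLL implementations add are deliberately omitted). Encoding clauses as lists of (symbol, is_positive) literals is an assumption of the sketch.

    def backtrack_sat(clauses, symbols, model=None):
        # Returns a satisfying model (dict: symbol -> bool) or None.
        model = model or {}

        def status(clause):
            undecided = False
            for sym, positive in clause:
                if sym not in model:
                    undecided = True
                elif model[sym] == positive:
                    return True              # clause already satisfied
            return None if undecided else False  # False means conflict

        statuses = [status(c) for c in clauses]
        if any(s is False for s in statuses):
            return None                      # dead end: backtrack
        if all(s is True for s in statuses):
            return model                     # all clauses satisfied
        sym = next(s for s in symbols if s not in model)
        for value in (True, False):          # refine the partial model
            result = backtrack_sat(clauses, symbols, {**model, sym: value})
            if result is not None:
                return result
        return None

    # (A ∨ B) ∧ (¬A ∨ B): satisfiable, e.g. with B = True.
    cnf = [[("A", True), ("B", True)], [("A", False), ("B", True)]]
    print(backtrack_sat(cnf, ["A", "B"]))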
In contrast to backtracking algorithms, local search algorithms only provide sound in-
ference. The rationale behind local search is to iteratively change the truth values of propo-
sitional symbols in formula clauses that are expressed in CNF so that all clauses are satis-
fied. By doing so local search algorithms return a satisfying model or failure. Before the
2 Not to be confused with backtracking processes in general.
Figure 2.1. Backtracking as a search tree. Given a formula A ∧ B, the algorithm tries to
enumerate all possible models (solutions) which satisfy the formula. First it creates model
m1 which is a partial possible model. Then expanding in a depth-first manner it generates m2
which is also a partial possible model. Since m2 cannot satisfy the formula the algorithm
backtracks to m1 and it then generates m3. m3 is a partial candidate and it is expanded to
m4 which is a solution (possible model). The algorithm continues by generating m5 and
then backtracks to m1 , it generates m6 but backtracks again since m6 is not a partial possible
model. Finally m7 is generated and it is expanded to m8 (a solution) and then m9 (rejected
model). The solution set is {m4 , m8 }.
iteration process commences all propositional symbols are given a truth value. The first
iteration differs from the initial truth value assignment by only one truth value, as does
every successive iteration from the previous one. Local search algorithms require evalua-
tion functions that operate as terminating criteria. The number of unsatisfied clauses or the
maximum number of truth value flips that a local search algorithm is allowed to perform are
common evaluation functions in propositional logic [11]. The search space can suffer from
local minima; therefore, local search algorithms employ various random assignment strate-
gies to escape such minima. One of the most used algorithms in the family of local search is
WalkSAT [14, 15]. WalkSAT initializes by performing a truth assignment to all proposi-
tional symbols in a CNF formula. Then, for every iteration it chooses an unsatisfied clause
at random (typically according to a uniform distribution) and randomly flips the value of
one of its propositional symbols. The algorithm terminates when the maximum number of
flips is reached and either returns a model that satisfies the formula or failure. Returning
failure though does not necessarily mean that the formula is unsatisfiable. It might also be
the case that the maximum number of flips was not enough.
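A minimal WalkSAT-style sketch in Python follows (illustrative; real implementations such as those in [14, 15] differ in details). It mixes random flips with greedy flips controlled by a noise parameter p, an assumption consistent with standard presentations of the algorithm; the clause encoding matches the earlier sketch.

    import random

    def walksat(clauses, symbols, max_flips=10000, p=0.5):
        # clauses: list of clauses, each a list of (symbol, is_positive) literals.
        # Returns a satisfying model or None; None does NOT prove unsatisfiability.
        model = {s: random.choice([True, False]) for s in symbols}

        def satisfied(clause):
            return any(model[s] == positive for s, positive in clause)

        for _ in range(max_flips):
            unsatisfied = [c for c in clauses if not satisfied(c)]
            if not unsatisfied:
                return model
            clause = random.choice(unsatisfied)   # pick an unsatisfied clause
            if random.random() < p:
                sym = random.choice(clause)[0]    # random walk step
            else:
                # Greedy step: flip the symbol that satisfies the most clauses.
                def score(s):
                    model[s] = not model[s]
                    n = sum(1 for c in clauses if satisfied(c))
                    model[s] = not model[s]
                    return n
                sym = max({s for s, _ in clause}, key=score)
            model[sym] = not model[sym]
        return None

    cnf = [[("A", True), ("B", True)], [("A", False), ("B", True)]]
    print(walksat(cnf, ["A", "B"]))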
2.1.2.1 Syntax
In this section we describe the syntax of first-order logic. Since first-order logic is structured
around objects, relations and functions, its core syntactic symbols represent exactly these
three kinds of entities.
Constant symbols represent objects, predicate symbols represent relations while func-
tion symbols represent functions [11]. By convention, constant, predicate and function
symbols begin with an uppercase letter. When compared to propositional logic, what is
different is that having relations and functions creates the need for arity (i.e. the number
of arguments in a predicate or function). For example, George and David are constants and
StudentOf(George, David) is a predicate with an arity of two denoting that George is a
student of David, whereas UniversityOffice(David) is a function3 with an arity of one.
In first-order logic we also have the notion of terms. A term is a logical expression that
refers to a function or an object. Therefore David (a constant) and UniversityOffice(David)
(a function) are terms. Objects can also be represented by variables instead of constants
when the object we refer to is unknown. Thus variables are also terms. By convention
variables begin with a lowercase letter. For example, StudentOf(x, y) is a predicate with
two variables x, y. In this form the predicate indicates that some person x is a student
of some other person y, in contrast to StudentOf(George, David) where the two variables
have been replaced by two constants (George and David). The process of replacing vari-
ables with constants is called instantiation, also known as grounding. Finally, predicate
symbols applied to a tuple of terms are known as atoms.
Like propositional logic, first-order logic has formulas that can either be atomic or
complex. Atomic formulas in first-order logic are single predicates:
• SisterOf(Despoina, x)
At this point, before we continue with the syntax of first-order logic, it is important that
we clarify something about atomic formulas that contain non-free, also known as bound,
variables or no variables at all as in the case of equality that we just presented. Such atomic
formulas are known as closed atomic formulas and ground atomic formulas respectively
or simply atomic sentences4 . Now, complex formulas consist of combinations of atomic
3 It is a function because a professor normally has only one office.
4 This term can be used instead of closed or ground formulas. However, we will avoid using the term sentence as
it may cause confusion. For instance, the occurrence of the term formula may be interpreted as an open (i.e.
containing occurrences of free variables) formula, which is not necessarily the case. It is like saying that when
we use the term triangle we do not refer to equilateral triangles.
formulas, logical connectives (see Section 2.1.1), functions and symbols known as logical
quantifiers. Quantifiers allow us to express information about collections of objects [11].
First-order logic supports two quantifiers: the universal quantifier, denoted as ∀ and read as
“for all”, and the existential quantifier, denoted as ∃ and read as “exists”. Examples of complex open
formulas are the following:
• ∀x Friends(x, y) ⇒ SimilarHabits(x, y)
As for complex closed formulas, they are complex formulas that contain non-free variables
or no variables at all. For instance, consider the following complex (open) formula from
the above set of complex formulas:
• ∀x Friends(x, y) ⇒ SimilarHabits(x, y)
To form a complex closed formula from this complex formula we need to either substitute
y with a constant, say Peter, or bind it as in the case of x with some quantifier, e.g.:
• ∀ x, y Friends(x, y) ⇒ SimilarHabits(x, y)
To conclude, a formula is closed if all of its variables are bound with quantifiers, also,
quantifiers together with logical connectives form the logical operators of first-order logic.
The precedence of operators is: ¬ > = > ∧ > ∨ > ⇒ > ⇔ > ∀ = ∃.
2.1.2.2 Semantics
Recall from Section 2.1.1 that the semantics of propositional logic define the set of rules
based on which the truth values of propositional formulas are determined given some
model. In first-order logic models are richer than the ones in propositional logic.
Each model has a domain, denoted as D, which is the set of objects it contains [11]. A
domain cannot be empty which means that every possible world must contain at least one
object. In addition, each model consists of an interpretation, denoted as I, that “specifies
exactly which objects, relations and functions are referred to by the constant, predicate and
function symbols” [11]. An interpretation is itself a function:
Now, in order to assign truth values to formulas we need to perform a series of actions:
2. Every term f(t1, t2, ..., tn) must be evaluated to an object from the domain. That is,
the interpretation of the function symbol is applied to the evaluated arguments:
I(f)(d1, d2, ..., dn), where d1, d2, ..., dn are the evaluations of t1, t2, ..., tn.
3. Every atomic formula P(t1, t2, ..., tn) is assigned a truth value depending on whether
⟨u1, u2, ..., un⟩ ∈ I(P) (true) or ⟨u1, u2, ..., un⟩ ∉ I(P) (false), where u1, u2, ..., un are
the evaluations of the terms t1, t2, ..., tn. A formula of the form t1 = t2 is evaluated to
true if both t1 and t2 evaluate to the same object in D.
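As a concrete illustration of steps 2 and 3, the following Python sketch evaluates ground terms and atomic formulas against a toy model built from this section's student/office example; encoding the domain D and interpretation I as Python dictionaries is an assumption made purely for illustration.

    # Toy model for the student/office example (illustrative encoding).
    D = {"george", "david", "office42"}                 # the domain of objects
    I = {
        "George": "george",                             # constant symbols -> objects
        "David": "david",
        "UniversityOffice": {("david",): "office42"},   # function symbol -> mapping
        "StudentOf": {("george", "david")},             # predicate symbol -> relation
    }

    def eval_term(term):
        # A term is a constant name or a (function symbol, argument list) pair.
        if isinstance(term, str):
            return I[term]
        f, args = term
        # Step 2: evaluate the arguments, then apply I(f) to the resulting objects.
        return I[f][tuple(eval_term(a) for a in args)]

    def eval_atom(predicate, args):
        # Step 3: P(t1, ..., tn) is true iff the evaluated tuple belongs to I(P).
        return tuple(eval_term(a) for a in args) in I[predicate]

    print(eval_atom("StudentOf", ["George", "David"]))   # True
    print(eval_term(("UniversityOffice", ["David"])))    # office42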
Except for the standard semantics that we presented above there are also alternative ones
[11]. We will demonstrate those by providing one example. Consider the natural language
expression “David has two PhD students, George and Hashim”. By writing:
we do not capture what is really reflected by the natural language expression. First of all
we would need to add that George and Hashim are not the same person and that David has
no other students. The precise expression in first-order logic would then be:
Judging from the above logical expression one quickly identifies that translation from nat-
ural language is not as straightforward as one might think. Consequently this can lead to
insufficient translations of natural situations in first-order logic and by extension to incom-
prehensible behaviours from first-order logic-based agents.
In the scope of alternative semantics we find the unique names assumption which dic-
tates that every constant symbol must refer to a different object. In our example this as-
sumption relinquishes the need to include George ≠ Hashim. Another alternative seman-
tics assumption is the closed world one which dictates that any atomic formula that is not
known to be true is treated as false. In our example this assumption relinquishes the need
to include:
Yet another assumption that can be made is domain closure which dictates that the only
objects that a model contains are the ones that are referred to by constant symbols [11]. The
unique names, closed world and domain closure assumptions are also known as database
semantics. One final point to be made is that there are no correct and incorrect semantics in
general, only correct and incorrect semantics according to the conventions that we follow
[11].
2.1.2.3 Inference
Inference in first-order logic is a well-researched scientific topic for which many inference
approaches have been developed over the years. In this section we will initially present
inference rules for quantifiers and then move to the more complete approaches of first-
order inference algorithms. Before continuing, remember that in order for inference to
take place, formulas in a KB must not contain free variables.
One of the first approaches for performing inference in first-order logic was ap-
plying a two-stage process in which we first propositionalize the first-order KB and then
reduce first-order to propositional inference [11]. The first stage is realised by applying
inference rules that operate on quantified formulas. Regarding the universally quantified
formulas, we apply the universal instantiation inference rule while in the case of existen-
tially quantified formulas we apply the existential instantiation inference rule (see Table
2.4). Universal instantiation can be applied many times to produce new formulas while
existential instantiation can be applied only once to a formula. In the case of universal in-
stantiation the new formulas are logically equivalent to the original one while in the case
of existential instantiation the new formula is only inferentially equivalent5 to the original
one.
Inference Rule
Formulation Meaning
Name
For any formula a that is universally
∀x a
Universal Instantiation quantified on variable x, substitute every
S UB({x/g}, a)
x in a with a ground term g.
For any formula a that is existentially
∃x a quantified on variable x, substitute every
Existential Instantiation
S UB({x/l}, a) x in a with a constant symbol l that does
not appear anywhere else in the KB.
Two examples (one for each rule) will help us understand the effect of the rules on quantified
formulas. First consider the following KB:
Applying existential instantiation on the above quantified formulas based on the following
substitutions:
where C1 and C2 are new constant symbols that do not appear anywhere else in the KB.
Such constants are called Skolem constants [11]. Now consider the following KB:
Applying universal instantiation based on the ground terms of the above KB we get the
following KB:
The above KBs can be treated as propositional KBs if every ground atomic formula is
treated as a propositional symbol. Consequently, the inference algorithms that were de-
scribed in Section 2.1.1 can be applied.
Reasoning by reducing first-order logic knowledge bases to propositional ones and then
applying propositional inference is sound and depending on the propositional inference
algorithm used it can be complete as well (see Section 2.1.1). This reasoning approach
can be slow in large domains since it can generate many irrelevant instantiations such as
Smokes(Mary) ∧ Alcoholic(Mary) ⇒ Cancer(Mary)6 .
Another, much faster, approach is unification and lifting [11]. Unification [11, 16]
searches for suitable variable substitutions that unify different logical formulas, i.e. make
them syntactically identical:
6 It is obvious that the one getting cancer is George and not Mary given the remainder of the KB.
UNIFY(k, l) = θ
where k and l are formulas and θ a unifier (if one exists) such that SUB(θ, k) = SUB(θ, l).
For example, UNIFY(StudentOf(George, x), StudentOf(y, David)) = {x/David, y/George},
but what about UNIFY(StudentOf(George, x), StudentOf(y, z))? In the latter case there
is more than one unifier. That is, θ = {x/George, y/George, z/George} or θ = {y/George,
x/z}. In such cases we choose the Most General Unifier (MGU), i.e. the one that imposes
the least restrictions on the variable values. Every other unifier is an instantiation of the
MGU [16]. In the above example the MGU is θ = {y/George, x/z}. The complete algo-
rithm for computing MGUs can be found in [11]. By unification we eliminate the need for
unnecessary instantiations but this creates the need for first-order inference algorithms since
we now also have to deal with variables and by extension with first-order constructs. At this
point we perform lifting which is extending propositional algorithms to do just this, operate
on first-order KBs. This includes lifting the inference rules that were presented in Section
2.1.1. Lifted inference is more efficient than propositionalizing a KB and then applying
propositional inference since through unification it performs only necessary substitutions.
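A compact Python sketch of unification in this style follows (illustrative, with the occurs check omitted for brevity); encoding variables as lowercase strings and atoms as tuples is an assumption of the sketch that mirrors the section's notational conventions.

    def is_var(x):
        # Convention mirrored from the text: variables lowercase, constants uppercase.
        return isinstance(x, str) and x[:1].islower()

    def unify(x, y, theta=None):
        # Returns an MGU (dict of substitutions) of x and y, or None if none exists.
        # Terms: constants ('George'), variables ('x'), atoms ('StudentOf', 'George', 'x').
        theta = {} if theta is None else theta
        if x == y:
            return theta
        if is_var(x):
            return unify_var(x, y, theta)
        if is_var(y):
            return unify_var(y, x, theta)
        if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
            for xi, yi in zip(x, y):
                theta = unify(xi, yi, theta)
                if theta is None:
                    return None
            return theta
        return None

    def unify_var(var, term, theta):
        if var in theta:
            return unify(theta[var], term, theta)
        if is_var(term) and term in theta:
            return unify(var, theta[term], theta)
        return {**theta, var: term}   # occurs check omitted in this sketch

    # The section's example: the MGU of StudentOf(George, x) and StudentOf(y, z).
    print(unify(("StudentOf", "George", "x"), ("StudentOf", "y", "z")))
    # {'y': 'George', 'x': 'z'}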
As in propositional logic, inference based on resolution is applicable in first-order logic if
it is first lifted. Lifted resolution is also called generalized resolution and it is given by:

    k1 ∨ k2 ∨ ... ∨ km,   l1 ∨ l2 ∨ ... ∨ ln
    SUB(θ, k1 ∨ ... ∨ ki−1 ∨ ki+1 ∨ ... ∨ km ∨ l1 ∨ ... ∨ lj−1 ∨ lj+1 ∨ ... ∨ ln)

where ki and lj are first-order complementary literals7, which are removed from the
resolvent. Two first-order literals are complementary if one unifies with the negation of
the other: UNIFY(ki, ¬lj) = θ. θ is the MGU of all ki and lj. In order for generalized
resolution to be applicable, the first-order KB needs
to be expressed in CNF. As in the case of propositional logic, every formula in first-order
logic can be expressed in CNF [11]. Generalized resolution yields both sound and
complete inference results.
After describing unification and lifting and presenting the lifted version of resolution it
should not come as a surprise that both forward and backward chaining algorithms are also
applicable in first-order logic if lifted. That is, they are converted to utilize a lifted version
of Modus Ponens which is called Generalized Modus Ponens (GMP) and operates on first-
order KBs of definite clauses. For atomic formulas pi, p′i (whose variables are assumed to
be universally quantified) and a consequent q, where there is a substitution θ such that
SUB(θ, p′i) = SUB(θ, pi) for all i (i.e. we can unify p′i and pi for all i), the formulation of
GMP is the following [11]:

    p′1, p′2, ..., p′n,   (p1 ∧ p2 ∧ ... ∧ pn ⇒ q)
    SUB(θ, q)

which means that given the above conditions we can infer SUB(θ, q). GMP offers sound
and complete inference. Although logical consequence (entailment) in first-order logic
is semidecidable8, for KBs that are function-free entailment is decidable. Algorithm 2
illustrates a simple forward chaining algorithm for first-order KBs with definite clauses
while Algorithm 3 illustrates a simple backward chaining one.
Forward chaining takes as input a first-order KB consisting of definite clauses and a
query which is a first-order atomic formula. The new variable is a set of new formulas
inferred in every iteration of the algorithm's main loop. Initially new is empty and it is
re-initialized to empty in every iteration. In each iteration the algorithm standardizes-apart
(i.e. eliminates clashes in variable names) a formula in the KB and applies the GMP rule. If
the outcome q′ does not unify with any formula already in the KB or new is not empty then it
adds q′ to new and attempts to unify q′ with the query. If unification succeeds it then returns
the outcome of this unification. The next step is to add new to the existing KB. This process
is repeated until new is empty. After exiting the repeat-until loop the algorithm returns
false. Consequently, the algorithm returns substitutions followed by false or just false.
Since forward chaining utilizes the GMP rule it offers both sound and complete inference.
In the case that the first-order KB does not contain any functions, forward chaining runs in
polynomial time. That is, given a KB with p predicates of maximum arity k and n constant
symbols, there can be at most p·n^k distinct ground facts, so forward chaining will "converge"
after at most that many iterations. A minimal Python sketch of the procedure is given below.
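The following sketch (a simplification of our own, not from [11]) implements the spirit of Algorithm 2 for function-free KBs: facts are ground atoms encoded as tuples, rules are (premises, conclusion) pairs standing for p1 ∧ . . . ∧ pn ⇒ q, and the Employee/Person rule at the bottom is a hypothetical example of our own.

import itertools

def is_var(t):
    return isinstance(t, str) and t.startswith('?')

def match(pattern, fact, theta):
    # Match an atom pattern against a ground fact, extending theta;
    # returns the extended substitution, or None on failure.
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if theta.get(p, f) != f:
                return None
            theta = {**theta, p: f}
        elif p != f:
            return None
    return theta

def subst(theta, atom):
    # Apply a substitution to an atom.
    return (atom[0],) + tuple(theta.get(arg, arg) for arg in atom[1:])

def forward_chain(rules, facts, query):
    facts = set(facts)
    while True:
        new = set()
        for premises, conclusion in rules:
            # Try every combination of known facts against the premises.
            for combo in itertools.product(facts, repeat=len(premises)):
                theta = {}
                for p, f in zip(premises, combo):
                    theta = match(p, f, theta)
                    if theta is None:
                        break
                if theta is None:
                    continue
                q = subst(theta, conclusion)
                if q not in facts and q not in new:
                    new.add(q)
                    phi = match(query, q, {})
                    if phi is not None:
                        return phi
        if not new:
            return False
        facts |= new

rules = [([('Employee', '?x')], ('Person', '?x'))]  # Employee(x) => Person(x)
print(forward_chain(rules, {('Employee', 'George')}, ('Person', '?w')))
# {'?w': 'George'}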
Regarding backward chaining, the algorithm takes as input a first-order definite clause
KB, goals, which is a list of conjuncts that constitute a query, and θ, which is the current
substitution, initialized with θ = {}. The algorithm starts by checking whether goals is empty
and, if so, it returns {θ}. It then applies θ to the first element of the goals conjunct list and the
outcome is assigned to q′. Then, for each formula in the KB, the algorithm standardizes it
apart. Given that the standardized-apart formula equals p1 ∧ p2 ∧ . . . ∧ pn ⇒ q and q unifies
with q′ under some unifier θ′, the algorithm appends the rest of the goals conjuncts to the end
of the list [p1, p2, . . . , pn]. The algorithm then assigns the new list to a variable called new
goals, which is an updated list of goal conjuncts. Finally, the algorithm is recursively
called with new inputs: the KB, new goals and the composition of the unifiers θ and
θ′, which accumulates the unification substitutions. The algorithm's
outcome is added to answers along with the answers from previous calls. answers is a set of
substitutions and it is returned when the algorithm terminates (a Python sketch follows below).
Backward chaining is also used in Logic Programming (LP), which we are going to describe next.
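Below is a matching sketch of backward chaining under the same encoding; it reuses is_var, match and subst from the forward chaining sketch and, for simplicity, assumes ground facts and ground queries (a full implementation would standardize variables apart and use full unification, as Algorithm 3 does).

def backward_chain(rules, facts, goals, theta=None):
    # Generate every substitution that proves the conjunction of goals.
    theta = {} if theta is None else theta
    if not goals:
        yield theta
        return
    q = subst(theta, goals[0])
    # Try the first goal directly against the known facts...
    for fact in facts:
        phi = match(q, fact, dict(theta))
        if phi is not None:
            yield from backward_chain(rules, facts, goals[1:], phi)
    # ...and against the conclusion of every rule, prepending its premises
    # to the remaining goals (the new goals list of Algorithm 3).
    for premises, conclusion in rules:
        phi = match(conclusion, q, dict(theta))
        if phi is not None:
            yield from backward_chain(rules, facts,
                                      list(premises) + goals[1:], phi)

rules = [([('Employee', '?x')], ('Person', '?x'))]
facts = {('Employee', 'George')}
print(list(backward_chain(rules, facts, [('Person', 'George')])))
# [{'?x': 'George'}]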
As its name suggests, LP [17, 18] is a programming paradigm based on logic and is
a subcategory of declarative programming. Declarative programming is a programming
paradigm which dictates that systems should be given: i) the means to solve problems of
interest and ii) the requirements for achieving a solution, iii) without being given a control
flow on how to achieve a solution. There are many LP languages including ABSYS [19],
8 Semidecidability in logic is a form of “decidability” in which given an arbitrary formula and a KB, there
exists an algorithm which says always yes to the formula if it is entailed in the KB but there exists no algorithm
which says no to the formula if it is not entailed in the KB.
repeat
    new ← {}
    for each formula in KB do
        (p1 ∧ p2 ∧ . . . ∧ pn ⇒ q) ← STANDARDIZE-APART(formula)
        for each θ such that SUB(θ, p1 ∧ p2 ∧ . . . ∧ pn) = SUB(θ, p′1 ∧ p′2 ∧ . . . ∧ p′n)
        for some p′1, p′2, . . . , p′n in KB do
            q′ ← SUB(θ, q)
            if q′ does not unify with some formula already in KB or new then
                add q′ to new
                ϕ ← UNIFY(q′, query)
                if ϕ is not fail then
                    return ϕ
                end
            end
        end
    end
    add new to KB
until new is empty;
return false

Algorithm 2: The forward chaining algorithm for a first-order knowledge base that consists
of definite clauses [11].
Algorithm 3: The backward chaining algorithm for a first-order knowledge base that consists
of definite clauses [11].
Datalog [20, 21], HiLog [22], Prolog [23, 24] and Mercury [25] among others. In the
remainder of this section we will focus on Prolog as it is one of the most widely used
For example, the Prolog query
?- [X, X] = [1, 2].
will yield false because X cannot have the value of 1 and 2 at the same time, while:
?- [_, _] = [1, 2].
will yield true because the anonymous variables are not instantiated. A compound term is a
term followed by arguments which are also terms. For example, student(X, brother_of(a,
b)) is a compound term with student being the functor and X, brother_of(a, b)9 being the
arguments.
A Prolog program–also called Prolog KB–is a set of terms forming definite clauses
which are also known as Prolog rules. In order to understand the analogy to first-order
logic let us provide the following example:
is a first-order formula in definite clausal form. Recall that a definite clause can be written
as an implication whose antecedent is a conjunction of positive literals and its consequent
a positive literal. Consequently formula 2.4 can be written as:
Continuing with our running example, let us assume that we have set of individual en-
tities that are instantiations of concepts: AUV1, an instantiation of AUV, MooredMine1, an
instantiation of MooredMine and AnchorChain1 an instantiation of AnchorChain. These
instantiations, which are also referred to as concept assertions, are part of the ABox since
an ABox, as its name suggests, contains assertional knowledge, i.e. all individuals (concept
assertions) present in the domain. Apart from concepts and individuals, in DLs we also have
roles. Roles are used for concept definitions as well as for forming binary relations between
individuals and for assigning attributes to individuals [2]. We already saw an example of
a role used in a concept definition as part of an axiom, that is, the objectActedOn role
acting as a restriction when defining the NeutralisingAMooredMine concept. With respect
to binary relations between individuals we can have MooredMine1 isTiedToChain Anchor-
Chain1 with isTiedToChain being a role. Finally a role such as hasWidth can be used to
assign the width dimension to our moored mine individual (e.g. MooredMine1 hasWidth 4
meters). Role assertions concerning individuals, that is, excluding role assertions that are
part of axioms, are also part of the ABox in addition to concept assertions.
The expressive power of a specific DL language can be roughly identified by its name,
since it is a very common convention to name a DL language so that its name is
indicative of the constructors it supports. In this context, the name of a DL language can be
decomposed into two “parts”: the basic logic used in the language and the extensions to that
basic logic. Regarding the basic logic part, AL and FL are the most common and stand
for attributive language and frame-based language respectively. Table 2.5 illustrates the most
common basic logic extensions in DLs as well as their meaning. In this context, ALC, for
example stands for attributive language with complements and supports concepts, concept
intersections, concept unions, concept negations and universal as well as existential restric-
tions [31]. Additional naming conventions are also quite common in DLs. For example S is
a naming convention used to indicate a language which is based on ALC but also supports
transitive roles.
where X and Y are random variables, P(X,Y) is a joint probability and is read as "the proba-
bility of X and Y", P(Y|X) is a conditional probability and is read as "the probability of Y
given X", and P(X) is a marginal probability read as "the probability of X". Based on the prod-
10 Nodes are also known as vertices.
11 Edges are also known as arcs.
uct rule (see Equation 2.7) and the symmetry property P(X,Y) = P(Y,X) we can derive
Bayes' theorem:

    P(Y|X) = P(X|Y)P(Y) / P(X)    (2.8)

By applying the sum rule (see Equation 2.6) to the Bayes' theorem denominator we get:

    P(X) = ∑_Y P(X|Y)P(Y)    (2.9)

which serves as a normalization constant so that the sum of P(Y|X) over the values of Y is
equal to one, a fundamental attribute of probability distributions.
Given two or more random variables, their joint probability distribution is the proba-
bility distribution expressing the probabilities of all possible combinations of their values.
Like in any probability distribution, all these probabilities should sum up to one. Table 2.6
illustrates the joint probability distribution of three binary (True, False) random variables
A, B, C. The joint probability distribution for the above case can be expressed using the
notation P(A, B, C).

    A      B      C      Probability
    True   True   True   0.1
    True   True   False  0.2
    True   False  True   0.1
    True   False  False  0.05
    False  True   True   0.1
    False  True   False  0.1
    False  False  True   0.05
    False  False  False  0.3

Table 2.6. Exemplar joint probability distribution for binary random variables A, B, C.

Based on the joint probability distribution we can compute both the
conditional and the marginal probability distributions. A conditional probability distribu-
tion is a probability distribution that expresses the probabilities of some random variables
given others. Continuing our running example with A, B, C, if we were to calculate the
conditional probability distribution for A, B given C = True we would discard any table
entry in which C ≠ True (rows two, four, six and eight) and then divide each remaining entry
by the sum of all remaining probabilities so that the probabilities are normalized to add up
to one (see Table 2.7). The conditional probability distribution for this case is denoted as
P(A, B|C = True). Finally, a marginal probability distribution is a probability distribution
that expresses the probabilities of a random variable irrespective of the values of the others.
So if we were to compute the marginal probability of A denoted as P(A) we would get Table
2.8.
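As a quick illustration, the following Python snippet (a sketch of our own, with Table 2.6 hard-coded) reproduces Tables 2.7 and 2.8 by conditioning and marginalizing:

joint = {  # P(A, B, C) from Table 2.6, keyed by (A, B, C)
    (True, True, True): 0.1,    (True, True, False): 0.2,
    (True, False, True): 0.1,   (True, False, False): 0.05,
    (False, True, True): 0.1,   (False, True, False): 0.1,
    (False, False, True): 0.05, (False, False, False): 0.3,
}

# P(A, B | C = True): keep the rows with C = True and renormalize (Table 2.7).
kept = {(a, b): p for (a, b, c), p in joint.items() if c}
z = sum(kept.values())  # 0.35
cond = {state: p / z for state, p in kept.items()}
print(cond)  # e.g. P(A=True, B=True | C=True) = 0.1 / 0.35 ≈ 0.286

# P(A): sum out B and C (Table 2.8).
marg = {a: sum(p for (a2, _, _), p in joint.items() if a2 == a)
        for a in (True, False)}
print(marg)  # {True: 0.45, False: 0.55}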
    A      B      Probability
    True   True   0.286
    True   False  0.286
    False  True   0.286
    False  False  0.142

Table 2.7. Exemplar conditional probability distribution for binary random variables A, B
given C = True.

    A      Probability
    True   0.45
    False  0.55

Table 2.8. Exemplar marginal probability distribution for binary random variable A.

There are two different families of PGMs, namely, directed graphical models and undirected
graphical models. As their name suggests, directed graphical models comprise edges
that have a direction. That is, there is an arrow at the end of each edge connecting
two nodes, illustrating the direction of the connection. Regarding directed approaches,
the most commonly used one is Bayesian Networks (BNs) which are described in Sec-
tion 2.2.1. In contrast, undirected graphical models comprise edges that have no direction.
Markov Random Fields (MRFs), also known as Markov Networks (MNs), are the most
commonly used undirected approach and are described in Section 2.2.2. Other types of
PGMs exist but are omitted since they are outside the scope of this thesis.
Inference in PGMs can be primarily categorized based on the accuracy of its outcomes.
In this context we can perform either approximate (e.g. sampling-based) or exact inference.
The general idea behind approximate inference methods is to trade the accuracy of exact
methods with efficiency so that inference can become tractable. Given the above catego-
rization we can perform an additional one which is based on whether we are interested in
the full posterior probability distribution (marginal, conditional, joint) or just interested in
the most probable variable assignment of query variables. For example, in the first case,
given a set of query random variables Q and a set of evidence (observations) E we can com-
pute P(Q|E). In the latter case, by contrast, we compute only the most probable value
assignment of Q, i.e. arg max_Q P(Q|E). This is also known as maximum a-posteriori (MAP)
inference [33]. In the case where Q is the set of all the remaining non-evidence variables in
the network (i.e. Q ∪ E = X, where X is the set of all random variables in the network), then
the process is known as the most probable explanation (MPE) inference [34, 35], a special
case of MAP inference.
2.2.1 Bayesian Networks

Bayesian networks are well suited to application domains in which there can be uncertainty.
In particular, they are a natural representation for expressing causal links between random
variables, a feature stemming from their directed graphical nature.
The graph representation itself allows us to intuitively visualize random variables as well
as their dependencies (relations) and probabilistic influence without having to necessarily
interpret the underlying mathematical formulas. Figure 2.2 illustrates six small exemplar
BNs and the joint probability distributions of their random variables A, B, C:
(a) P(A,C) = P(C|A)P(A); (b) P(C,A) = P(A|C)P(C); (c) P(A,B,C) = P(C|B)P(B|A)P(A);
(d) P(A,B,C) = P(A|B)P(B|C)P(C); (e) P(A,B,C) = P(C|B)P(A|B)P(B);
(f) P(A,B,C) = P(B|A,C)P(A)P(C).

Figure 2.2. Exemplar Bayesian networks and the joint probability distributions of their ran-
dom variables A, B, C.

In Figure 2.2a we can
clearly see that random variable A influences C, i.e. conditioning on A changes our belief
about C. Similarly, it is clear by observing Figure 2.2b that C influences A. In general,
probabilistic influence is symmetrical. That is, if A can influence C then C can influence
A. Now, Figures 2.2c – 2.2f are more “interesting” in the sense that there is an interme-
diate random variable B which affects probabilistic influence. In the case of Figure 2.2c,
A influences C if B is not observed while if B is observed then A does not influence C. In
addition, in the case of Figure 2.2d C influences A if B is not observed while C does not
influence A if B is observed. Moreover, Figure 2.2e “tells” us that A influences C and vice
versa if B is not observed while this is not the case if B is observed. Finally, Figure 2.2f
represents what we call a V-structure. In this case A does not influence C if B and all of
its descendants (if any) are not observed while we say that A influences C if either B or
any of its descendants (if any) are observed. The above (exhaustive) cases of probabilistic
influence also demonstrate two concepts known as d-separation and d-connectedness [38],
also called directional-separation and directional-connectedness respectively. Both con-
cepts are interrelated. In general, we say that a random variable X is d-separated from a
random variable Y if X does not influence Y or if Y does not influence X. In addition we
say that X and Y are d-connected if X influences Y or if Y influences X. Now, given a third
random variable Z, d-connectedness and d-separation are calculated in the same manner as
probabilistic influence in cases of Figures 2.2c – 2.2f. Both concepts are applicable to sets
of random variables in a very straightforward way. That is, two sets of random variables
are d-separated iff (if and only if) each random variable in one set is d-separated from every
other random variable in the other set, and so on. In addition, both concepts are used in BN
inference algorithms which is the topic of the next section.
2.2.1.1 Inference
As was mentioned earlier in this section, inference in PGMs can be primarily categorized as
being exact or approximate based on its accuracy. In this section we will be describing both
types of inference in BNs starting with exact methods and continuing with approximate
ones.
The complexity of exact inference in BNs is related to the tree width of the graph
[39, 11]. Exact inference in BNs is NP-hard. In the case of a Bayesian network being
a polytree, which is also known as a singly connected network [40], time complexity of
exact inference is linear in the size of the network. Clustering algorithms can be used in
order to convert a Bayesian network into a polytree by clustering individual random vari-
ables (nodes) into hyper-nodes, also called mega-nodes. One of the most common exact
inference algorithms is variable elimination, also known as bucket elimination [41], which
is an improvement on the enumeration algorithm [11]. The rationale behind variable
elimination is eliminating random variables that are irrelevant to the inference, i.e. non-
ancestors of the query and evidence variables, and evaluating sums of products of conditional
probabilities based on Equations 2.6 and 2.7. Despite the linear time complexity achieved
on polytrees, exact inference in general remains NP-hard, which practically means that it is
generally intractable. For this reason we need to consider approximate inference approaches.
One of the most common families of approaches for performing approximate inference
in Bayesian networks is randomized sampling based methods, also known as Monte Carlo
(MC) methods. The accuracy of MC methods depends on the number of samples gener-
ated from a known probability distribution. In this family we find direct sampling methods,
which include sampling based on rejection. More specifically, rejection sampling generates
samples from the prior distribution of the BN it is applied to. That is, for each random
variable without a parent it performs a random assignment from its possible values and then
carries on by computing conditional probabilities of the children random variables (nodes)
given the parents until the whole network has been processed. This process is repeated
n times (an algorithm parameter defined by the user). After n sampling iterations the al-
gorithm rejects those samples that are not consistent with the evidence and then estimates
the posterior probability distribution given the evidence. Since rejection sampling first per-
forms sampling and only then rejection based on the evidence, it is rather inefficient: it
wastes time sampling the network even with values that contradict the evidence. To
remedy this, we can utilize a different direct sampling method called likelihood weighting.
Likelihood weighting generates samples by taking the evidence (observations) into con-
sideration and sampling only the non-evidence variables. Each non-evidence variable is
sampled as in the rejection sampling approach, but this time each sample is assigned a
weight, which is initially one. Every time the algorithm encounters a variable that is part of
the evidence, instead of sampling it, it fixes it to the value that it has in the evidence and
multiplies the current weight by the conditional probability of the evidence variable given
its parents. The outcome of this multiplication becomes the new weight. The process con-
tinues until the full network is processed in this manner, yielding a full sampling iteration.
At the end of each iteration we get a full sample of all non-evidence variables and an associ-
ated weight. After n sampling iterations we will have weighted counts of each non-evidence
variable instantiation, based on which we can compute an estimate of the joint posterior
probability distribution and consequently be able to estimate the conditional distribution
given the evidence. However, likelihood weighting is prone to accuracy degradation as the
number of evidence variables increases.
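To make this concrete, here is a minimal likelihood weighting sketch for the same two-node network used in the enumeration example above (it reuses the p_a and p_c_given_a tables from that snippet), estimating P(A = True | C = True):

import random

def weighted_sample(evidence_c=True):
    # Sample the non-evidence variable A from its prior; fix the evidence
    # variable C to its observed value and weight by P(C = evidence | parent).
    weight = 1.0
    a = random.random() < p_a[True]
    weight *= p_c_given_a[a][evidence_c]
    return a, weight

def likelihood_weighting(n=100_000):
    totals = {True: 0.0, False: 0.0}
    for _ in range(n):
        a, w = weighted_sample()
        totals[a] += w
    return totals[True] / (totals[True] + totals[False])

print(likelihood_weighting())  # ≈ 0.659, matching the exact result above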
Another family of MC methods is the family of Markov Chain inference algorithms ab-
breviated as MCMC (Markov Chain Monte Carlo). In contrast to direct sampling methods
MCMC methods do not sample the BN fully randomly each time. Instead they start with a
random sampling of all non evidence variables (same as likelihood weighting without the
weighting part) and then change the sample incrementally by sampling only one variable
at a time at random. This means that each consecutive sample differs from the previous
one by only one assignment. One of the most common and fundamental MCMC inference
methods is Gibbs sampling. The rationale behind Gibbs sampling is that it first samples all
non-evidence variables based on the values of the evidence variables, which are kept fixed.
This constitutes a sampling iteration which yields a complete sample (a value assignment of
all non-evidence variables). For each consecutive sampling iteration the algorithm chooses
one non-evidence variable at random and samples it based on its Markov blanket12.
After n sampling iterations we will have counts of each non-evidence variable instantiation,
based on which we can compute an estimate of the joint posterior probability distribution
and consequently be able to estimate the conditional distribution given the evidence. As
in any sampling algorithm, the accuracy of the estimate is related to the number of samples
n.
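The following sketch illustrates Gibbs sampling on a small chain A → B → C with evidence C = True; all CPT values are invented for the example, and burn-in/thinning are omitted for brevity:

import random

P_A = {True: 0.3, False: 0.7}                                          # P(A)
P_B = {True: {True: 0.8, False: 0.2}, False: {True: 0.1, False: 0.9}}  # P(B|A)
P_C = {True: {True: 0.7, False: 0.3}, False: {True: 0.4, False: 0.6}}  # P(C|B)

def bernoulli(p_true, p_false):
    # Draw True with probability p_true / (p_true + p_false).
    return random.random() < p_true / (p_true + p_false)

def gibbs(n=100_000, c=True):
    a, b = random.random() < 0.5, random.random() < 0.5  # arbitrary initial state
    count_a = 0
    for _ in range(n):
        if random.random() < 0.5:  # pick one non-evidence variable at random
            # P(A | B = b) is proportional to P(A) P(b | A): A's Markov blanket.
            a = bernoulli(P_A[True] * P_B[True][b], P_A[False] * P_B[False][b])
        else:
            # P(B | A = a, C = c) is proportional to P(B | a) P(c | B).
            b = bernoulli(P_B[a][True] * P_C[True][c], P_B[a][False] * P_C[False][c])
        count_a += a
    return count_a / n  # estimate of P(A = True | C = True)

print(gibbs())  # ≈ 0.39 for these made-up numbers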
Variational methods are also an alternative family of approximate inference approaches
but are outside the scope of this thesis. For a description of variational methods the inter-
ested reader is referred to [32].
12 The Markov blanket of a BN variable Qi is the set of random variables (nodes) comprising the children
and the parents of Qi (Ci and Pi respectively), as well as the other parents of the children in Ci.
2.2.1.2 Applications
As was mentioned earlier, BNs are well suited for expressing causal links between random
variables (entities) of an application domain. Therefore, they have found their way into
many applications across diverse fields of the scientific spectrum.
One such field is safety. In this area we find [42], where
the authors propose a BN-based approach for safety risk assessment in steel construction
projects. In [43] Bayesian networks are deployed for dynamic safety analysis of process
systems while in [44] BNs are used for maritime safety management. Another area where
BNs have been utilized is healthcare. An example of such application is the work presented
in [45] where the authors take advantage of a Bayesian network approach for modelling
medical problems for the purpose of personalized healthcare. Moreover, in [46] BNs are
used as a decision making support method for diagnosing dementia. In the work presented
in [47], BNs are used as a meta-analysis tool for studies in patients with coronary artery
disease. Other fields where BNs are applied are: fault diagnosis [48–50], meteorology
[51–53] and neuroscience [54] among others.
Finally, of particular interest are the applications of BNs in the robotics field. In this
area we find the work of [55] on context-aware home service robots. Furthermore, BNs are
applied in sensor planning for mobile robot localization [55], robot mapping [56] and robot
cognition [57] among others.
2.2.2 Markov Networks

Figure 2.3. An exemplar Markov Network consisting of ten nodes (X1 − X10). Nodes of set
A (X2, X3) are conditionally independent of nodes of set B (X6, X7) given a separating set of
nodes D (X4, X5). Sets A, B, D are enclosed in red rectangles. Nodes X6, X7, X9, X10 are the
Markov blanket of node X8. Finally, nodes enclosed in the blue ellipse (X1, X2, X3)
represent one of the cliques of the Markov network.

Consider the sets A, B and D of Figure 2.3: removing the nodes of D along with
the edges connecting them to the rest of the network will cause the sets A, B to completely
disconnect from each other. This is the case in our example and it is denoted as A ⊥⊥ B | D.
This is also known as the global Markov property. In the event that A, B are not completely
disconnected from each other then we say that A, B are conditionally dependent on each
other given D. The same applies for single nodes. That is, any two non-adjacent (non di-
rectly connected) nodes are conditionally independent of each other given all other nodes
and any two adjacent nodes are conditionally dependent on each other given all other nodes.
A Markov blanket of a node Xi is the set of all neighbouring nodes as illustrated in Figure
2.3. One last concept to observe in Figure 2.3 is cliques. Nodes X1 , X2 , X4 constitute a
clique. A clique in a MN is a fully connected subgraph. Another clique in Figure 2.3 is
formed by nodes X2, X4, X5. The node pair X1, X4 is also one of the cliques of the
exemplar MN. The cliques defined by nodes X2, X4, X5 and X1, X2, X4 are also known as
maximal cliques. A maximal clique is a clique that would stop being a clique if any other
node of the network were included in it.
Let X = {X1, X2, . . . , Xn} be the set of all random variables in a MN. The
joint probability distribution of the random variables is given by:

    P(X = x) = (1/Z) ∏_m ϕ_m(x_m)    (2.10)
where P(X = x) is the joint probability distribution, i.e. the probability that the random
variables in set X are in state (value configuration) x, x_m is the state of the random variables
present in the mth clique, ϕ_m(x_m) represents a potential function for the mth clique,
and Z is the partition function also called a normalization constant. The network has one
potential function for each clique in the graph. A potential function is any non-negative
function that takes a value for each state of the clique it represents. The choice for its form
should be such that its value for each clique state reflects what we want to model in our
network. Finally, the partition function Z is given by:

    Z = ∑_x ∏_m ϕ_m(x_m)    (2.11)
and it serves the purpose of normalizing the probabilities in the distribution so that they add
up to one. A small example will help us understand the above concepts better.
Consider a MN with which we model the fact that speeding, fatigue and drinking can
cause accidents while driving a car where random variables Speeding, Fatigue, Drinking
and Accident are binary (True, False). Such a network is shown in Figure 2.4.

Figure 2.4. A Markov network representing the relations (dependencies) among random
variables Speeding, Fatigue, Drinking and Accident in a car driving scenario.

To demonstrate how a potential function can be realized in the network of Figure 2.4, let us take into
consideration the clique defined by nodes Fatigue, Accident. For this clique we are going
to have one potential function with one value for each state of the clique (see Table 2.9).

Table 2.9. States of the clique defined by nodes Fatigue, Accident and the values of its
potential function for each state. Note that the form of the potential function is deliberately
omitted to emphasize that the potential function can be any non-negative function.

Higher values of the potential function for one clique state compared to some other state
of the same clique mean that the higher valued state is more probable than the lower valued
one. By observing Table 2.9 we can see that the state where someone drives under fatigue
and does not have an accident is less probable than someone not driving under fatigue and
having an accident. This is expected since driving tired is one of the major factors causing
car accidents. Similarly we get the values for all states of the potential function for the
clique defined by nodes Speeding, Drinking and Accident. Then from Equations 2.10, 2.11
we compute the joint probability distribution of the network.
One of the problems in MNs and indeed in Bayesian networks is the complexity of
the graph. This is something that can heavily impact inference. In the case of a MN in
particular, complexity is dependent on the sizes of the cliques in the graph. For small
cliques the problem is trivial, but consider a case where a MN has many cliques
comprising 15 nodes each. Even in the binary case we have 2^15 = 32768 states for each such
clique and consequently as many value calculations for the potential function. In order to
mitigate the computational problem we can utilize a log-linear model to compute the joint
probability distribution of a MN as follows [10]:

    P(X = x) = (1/Z) exp( ∑_i w_i f_i(x) )    (2.12)

where f_i(x) is called a state feature and can be any real-valued function of the state, and w_i
is the weight of the feature. In this manner we can have much fewer features than states.
Let us now continue with our running example and replace the potential function ϕ with
the following feature and weight:
    f1(Fatigue, Accident) = 1 if Fatigue = False or Accident = True, and 0 otherwise    (2.13)

    w1 = 1.1    (2.14)
Now given the above we can reconstruct the same potential function values of Table 2.9
for all states of the clique using just one binary feature. By extension we can do the same
for all cliques and finally calculate the joint probability distribution using Equation 2.12
instead of Equation 2.10. This concludes the description of MNs. Next we will be focusing
on inference.
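The following snippet ties Equations 2.10–2.12 together for the Fatigue–Accident clique: it rebuilds a potential function from the feature f1 and weight w1 = 1.1 and, treating this single clique as the whole network for simplicity, computes Z and the resulting distribution (the concrete values of Table 2.9 are not reproduced here, so the numbers are purely illustrative):

from math import exp
from itertools import product

def f1(fatigue, accident):
    # The binary state feature of Equation 2.13: 1 unless the state is
    # (Fatigue = True, Accident = False).
    return 1 if (not fatigue) or accident else 0

w1 = 1.1  # the weight of Equation 2.14

# Log-linear potential for the clique: phi(x) = exp(w1 * f1(x)), as in Eq. 2.12.
phi = {(f, a): exp(w1 * f1(f, a)) for f, a in product((True, False), repeat=2)}
print(phi)  # (True, False) gets e^0 = 1, every other state gets e^1.1 ≈ 3.004

# Treating this single clique as the whole network, Equation 2.11 gives Z and
# Equation 2.10 the joint distribution.
Z = sum(phi.values())
P = {state: value / Z for state, value in phi.items()}
print(P)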
2.2.2.1 Inference
As in the case of BNs, inference in MNs can be either approximate or exact and can
either be concerned with computing marginal, conditional and joint probability distributions
of the random variables or with the most probable assignment of query variables. In the latter
case we say that we perform MAP inference, or MPE inference, which is a special case of
MAP inference (see Section 2.2).
Like in the case of BNs, the complexity of exact inference in MNs is related to the
complexity of the graph structure of the network. Networks with many random variables
and large cliques are the most challenging ones in terms of exact inference which is in
the general case #P-complete. Algorithms such as enumeration and variable elimination
that are applicable in BNs are also applicable in MNs. The junction tree algorithm [59] is
also another alternative for exact inference in MNs. The intractability of exact inference
in the general case makes it feasible only in the simplest of cases. Therefore, approximate
inference approaches are more suited to more complex problems.
With respect to approximate methods, as in the case of BNs, MCMC is very popular
in MNs. Within this spectrum, Gibbs sampling is one of the most famous algorithms for
computing conditional and marginal probability distributions in MNs. Variational methods
[32] are an alternative to MC methods but are outside the scope of this thesis. Another ap-
proximate inference method is belief propagation, also called loopy belief propagation [60]
due to the fact that MNs can contain cycles. The key concept in loopy belief propagation
is the application of the sum-product algorithm [61] for calculating marginal probability
distributions. Loopy belief propagation can fail to converge or can produce poor
results [62], but in most cases it is a very efficient method. Regarding approximate MAP
inference, iterated conditional modes [63] is one of the early approaches. A variation of
loopy belief propagation based on the max-product algorithm can also be used for approx-
imate MAP inference in MNs. Other approximate MAP inference methods are simulated
annealing [64] and graph cuts [65] among others.
2.2.2.2 Applications
Markov networks are well suited models for expressing soft constraints between random
variables. As such, they have been applied to many problems in a diverse group of scientific
fields over the years.
One of the most prominent applications of MNs is in the field of computer vision. More
specifically, MNs have been used for image segmentation [66, 67], image registration and
matching [68, 69], optical flow analysis [70, 71], image denoising and reconstruction [72,
73] as well as for generating high resolution range images [74] among others. Data mining
is another field in which MNs are applied. Within this field we find the work presented in
[75] where the authors model term dependencies and perform information retrieval in text
via MNs. In [76], the authors present an MN-based approach for information retrieval via
verbose queries. In [77], MNs are used as a modelling approach to mine what the authors
call the (social) network value of customers which can be used for improved marketing of
products or services. Social network analysis via MNs is also presented in [78, 79].
based model, i.e. first-order, propositional etc. Bayesian SRL approaches include Bayesian
Logic Networks (BLNs) [81], Logical Bayesian Networks (LBNs) [82, 83], Probabilis-
tic Relational Models (PRMs) [84, 85] and Multi-Entity Bayesian Networks (MEBNs)
[86, 87, 1] among others. On the other hand, Markovian SRL approaches include Rela-
tional Markov Networks (RMNs) [88, 89] and Markov Logic Networks (MLNs) [10, 90–
93] among others.
Statistical relational learning is a very active field of research within machine learning
and artificial intelligence with a lot of real world applications. However, it is outside the
scope of this section to provide a complete survey of the domain. This section rather fo-
cuses on MLNs which are presented and analysed in detail in Section 2.3.3. Multi-entity
Bayesian networks and Bayesian logic networks (see Sections 2.3.1 and 2.3.2 respectively)
are presented as two representative Bayesian SRL approaches but in less detail. The reason
for doing so is that MLNs are the most commonly used and mature approach in the field
and many other models such as MNs and BNs can be thought of as special cases of MLNs.
An additional reason justifying this choice of ours is that as undirected models, MLNs are
better suited for expressing soft constraints compared to directed models.
Figure 2.5. MFrag example modelling the level of danger to which a starship is exposed in
a hypothetical science fiction scenario. Nodes in yellow represent context nodes. Nodes in
dark grey are input nodes while the node in light grey is a resident node. The figure is taken
from [1].
resident nodes represent conditional probability distributions given parent nodes while con-
text nodes, which are logical boolean constraints, have the role of conditions that need to be
satisfied for both dependencies and conditional probability distributions to be applicable.
MFrags can be thought of as candidate templates which, once instantiated, are combined
into what are called MTheories. An MTheory is nothing more than a joint probability distribu-
tion over the random variables represented by the MFrags forming it. In this manner we
can model different scenarios by having different MFrags forming different MTheories all
in one network. In order to compute any query with respect to an entity (random variable)
a Situation-Specific Bayesian Network (SSBN) is constructed. That is, a minimal BN for
answering a query [94] using the information provided. A good introductory tutorial on
MEBNs can be found in [1].
appear in the network, the actual entities and the signatures of functions present in the
network. In this context, logical predicates are declared as logical boolean functions (e.g.
logical boolean takesPartIn(person, course) is a logical predicate indicating that an ab-
stract entity of type person takes part in an abstract entity of type course). On the other
hand, functions that are used for instantiation of random variables are declared as random
(e.g. random gradeDomain grade(student, course) is a function signature that maps from
a student-course pair to the grade domain [81]). A fragment essentially defines a depen-
dency of a random variable (e.g. grade(student, course)) on other random variables which
are the parents while a first-order logical formula is created by a combination of logical
boolean functions. Fragments can be thought of as templates for the probabilistic compo-
nent of the ground mixed network while logical formulas as templates for the deterministic
component. In order to perform inference in BLNs we can apply standard algorithms that
are applicable in mixed networks [95, 96].
Definition 2.1 A Markov logic network L is a set of pairs (Fi , wi ), where Fi is a first-order
logic formula and wi is a real valued weight attached to it. Along with a finite set of
constants C = {c1 , c2 , c3 , . . . , c|C| }, it defines a ground Markov network ML,C as follows:
1. ML,C contains one binary node for each possible grounding of each literal appearing
in L. The node value is 1 if the ground literal is true, and 0 otherwise.
2. ML,C contains one binary feature for each possible grounding of each formula Fi in
L. The value of the feature is 1 if the ground formula is true, and 0 otherwise. The
weight of the feature is the w_i attached to formula F_i in L.
To demonstrate how a Markov logic network is realized and how a ground MN is created
let us expand on the employer-employee example from the beginning of this section.
We add a first-order logical formula in Markov logic which expresses that an employer x
does not work for someone else y: Employer(x) ⇒ ¬ WorksFor(x, y). Finally we com-
plete our knowledge base by adding a formula that represents that if someone x, is part of
a company c and that someone x works for someone else y then y is part of the same com-
pany c: Company(c, x) ∧ WorksFor(x, y) ⇒ Company(c, y). All formulas are assigned
some weight to express how strong the constraint they represent is, as seen in Table 2.10.

    Weights   Formulas
    0.9       Employee(x) ⇒ ¬Employer(x)
    1.7       Employer(x) ⇒ ¬WorksFor(x, y)
    2.5       Company(c, x) ∧ WorksFor(x, y) ⇒ Company(c, y)

Table 2.10. Markov logic employer-employee example knowledge base. The higher the
weight the stronger the constraint. All formulas are considered to be universally quantified.

Let George, John and MicroGalaxy be three constants for the KB shown in Table 2.10,
where George, John are people and MicroGalaxy is a company. Now, the ground MN of
the aforementioned constants and our KB is the one shown in Figure 2.6. The resulting
Figure 2.6. Ground Markov network of the employer-employee running example. Constants
George, John and MicroGalaxy are abbreviated as G, J and MG respectively.
ground MN comprises all the possible ground literals in the domain which are the nodes
in the resulting network. The edge connection between nodes is decided based on ground
clauses. That is, ground literals (nodes) that are part of a clause grounding are connected
by an edge. Formulas in MLNs follow the syntax of first-order logic (see Section 2.1.2). As
in the case of MEBNs and BLNs, MLNs are templates, but for constructing ground MNs.
Depending on the constants as well as their number (quantity) these templates can yield dif-
ferent ground MNs of different sizes and/or structure [10]. Now, since MLNs are templates
for constructing ground MNs, from Equations 2.10, 2.12 and Definition 2.1 it follows that
the probability distribution of possible worlds x represented by the ground Markov network
ML,C is given by [10]:
    P(X = x) = (1/Z) exp( ∑_i w_i n_i(x) ) = (1/Z) ∏_i ϕ_i(x_i)^{n_i(x)}    (2.15)

where n_i(x) is the number of true groundings of F_i present in x, x_i is the truth-value state
of the atoms appearing in F_i, and ϕ_i(x_i) = e^{w_i}. Recall that we talked
about possible worlds in Section 2.1.
Finally, with respect to semantics, MLNs follow the semantics of first-order logic but
make the unique names and domain closure assumptions as well as the known functions
assumption, which dictates that for each function in the MLN, the value of the function
applied to every possible tuple of arguments is known and is an element of the finite
set of constants. The first two assumptions are the alternative semantics assumptions that
we described in the Semantics part of Section 2.1.2. This comes as no surprise since MLNs
are a combination of Markov networks and first-order logic.
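To illustrate Definition 2.1 and Equation 2.15 end to end, the following brute-force sketch grounds the knowledge base of Table 2.10 with the constants G, J and MG, enumerates all 2^10 truth assignments of the ground atoms of Figure 2.6 and computes the probability of an example query (purely for illustration; real MLN implementations never enumerate worlds like this):

from math import exp
from itertools import product

people, companies = ['G', 'J'], ['MG']
atoms = ([('Employee', p) for p in people] + [('Employer', p) for p in people]
         + [('WorksFor', p, q) for p in people for q in people]
         + [('Company', c, p) for c in companies for p in people])

def n_true_groundings(world):
    # Count the true groundings of each formula of Table 2.10 in a world
    # (a dict mapping every ground atom to True/False).
    n1 = sum(not (world[('Employee', x)] and world[('Employer', x)])
             for x in people)
    n2 = sum(not (world[('Employer', x)] and world[('WorksFor', x, y)])
             for x in people for y in people)
    n3 = sum((not (world[('Company', c, x)] and world[('WorksFor', x, y)]))
             or world[('Company', c, y)]
             for c in companies for x in people for y in people)
    return [n1, n2, n3]

weights = [0.9, 1.7, 2.5]
worlds = [dict(zip(atoms, vals))
          for vals in product([False, True], repeat=len(atoms))]
scores = [exp(sum(w * n for w, n in zip(weights, n_true_groundings(x))))
          for x in worlds]
Z = sum(scores)  # the partition function of Equation 2.15
# For example, the probability that George works for John:
p = sum(s for x, s in zip(worlds, scores) if x[('WorksFor', 'G', 'J')]) / Z
print(p)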
2.3.3.1 Inference
Inference in MLNs can be described in the same terms as in the case of PGMs and first-
order logic. Again, this comes as no surprise since as we mentioned in the previous section,
MLNs combine both worlds benefiting from their advantages. This practically creates the
necessity of combining probabilistic inference algorithms with the ones used for first-order
inference so that they can be applicable in this new (combined) setting. The probability
that some first-order formula F1 holds given some other first-order formula F2, a Markov
logic network L and a set of constants C, is given by [98]:

    P(F1 | F2, L, C) = ∑_{x ∈ X_F1 ∩ X_F2} P(X = x | M_L,C) / ∑_{x ∈ X_F2} P(X = x | M_L,C)

where M_L,C is the ground MN, X_F1 is the set of worlds where F1 holds, X_F2 is the set of
worlds where F2 holds and P(X = x | M_L,C) is calculated based on Equation 2.10. Now, in
order to compute the most probable state of a world y given some evidence x (MAP/MPE
current state of the network. The initial state x(0) of the network is the outcome of applying
a SAT solver to all hard clauses. That is, clauses with an infinite weight which, if you recall,
represent deterministic constraints. If at this step the hard clauses are not satisfiable then
the algorithm’s output is undefined [99]. M is a random subset of currently satisfied clauses
that need also to be satisfied in the next network state. In the beginning of each sampling
vars ← variables in L
for i ← 1 to max-tries do
    solution ← random truth assignment to vars
    cost ← ∑ weights of unsatisfied clauses in solution
    for j ← 1 to max-flips do
        if cost ≤ target-cost then
            return "Success, the solution is:" solution
        end
        c ← random unsatisfied clause
        if Uniform(0,1) < p then
            vf ← random variable from c
        else
            for each variable v in c do
                compute DeltaCost(v)
            end
            vf ← v with lowest DeltaCost(v)
        end
        solution ← solution with vf flipped
        cost ← cost + DeltaCost(vf)
    end
end
return "Failure, the best truth assignment is:" best solution found
Unification and lifting extend
propositional inference to the first-order paradigm. In this manner, forward and backward
chaining can also be used for first-order logic. Backward chaining is also used in the logic
programming paradigm of inference in first-order logic. Prolog is the most commonly used
logic programming language which employs the database semantics of first-order logic.
That is, the unique names, closed world and domain closure assumptions of first-order
logic. Inference in first-order logic is sound and complete and logical consequence (en-
tailment) is semidecidable. With respect to description logics, they are a family of logical
languages that are subsets of first-order logic and, in contrast to first-order logic, some of
them are decidable. Description logics are the basis for the OWL DL language, which
is a language for authoring ontologies (see Chapter 3). There are also other types of logic
for representing knowledge such as Fuzzy logic [107] and Modal logic [108] but they are
outside the scope of this thesis.
Regarding probabilistic approaches, they provide the means for representing and rea-
soning about uncertain knowledge as opposed to logic-based representations which do not
deal with uncertain knowledge explicitly. In this chapter we presented two main proba-
bilistic graphical models namely, Bayesian and Markov networks, which are directed and
undirected probabilistic models respectively. Bayesian networks are well suited for ex-
pressing causal links between random variables while Markov networks are better suited
for expressing soft constraints between random variables. For both Bayesian and Markov
networks we analysed their building blocks, looked into several inference approaches (ex-
act and approximate) and provided information on their applications in diverse fields of the
scientific spectrum.
Further to the logic-based and probabilistic approaches we have also presented the do-
main of statistical relational learning which is concerned with hybrid approaches, i.e. ap-
proaches that unify the world of logic and probability (uncertainty) into a single represen-
tation. In this manner, we can benefit from the advantages of both worlds. That is, on
the one hand having the means for encoding complex relations among entities which stems
from the logic aspect of hybrid models while at the same time being able to encode un-
certain knowledge explicitly in a well defined and structured manner. In this category we
presented multi-entity Bayesian networks and Bayesian logic networks which are represen-
tative Bayesian-based hybrid approaches and Markov logic networks which are a represen-
tative Markov-based hybrid approach. However, we focused on Markov logic networks as
being the most mature line of research and due to the fact that both multi-entity Bayesian
networks and Bayesian logic networks can be thought of as special cases of Markov logic
networks. We have also presented various inference procedures (algorithms) both for exact
and approximate inference in the Markov logic setting.
Chapter 3

Knowledge Engineering with Ontologies
Knowledge engineering is a scientific field residing within the artificial intelligence disci-
pline and is concerned with developing knowledge-based systems that aim at solving real
world problems. As such, knowledge engineering provides the means to represent, organise
and manage large amounts of knowledge in a structured and well defined manner.
Ontologies are formal knowledge representation mechanisms that capture relational
knowledge and semantics about domains of interest. The term ontology first appeared in
philosophy denoting the study of what exists, what are the attributes and features of things
that exist and how they relate to each other. Over the years the term crossed the borders of
philosophy and was adopted by a wide range of fields, including but not limited to Infor-
mation Technology (IT), Artificial Intelligence (AI), Computational Linguistics (CL) and
Mechanical Engineering (ME) [109]. The diversity of the aforementioned fields has led to
the absence of a standardized definition for an ontology. However, the most widespread is
the one given by Thomas Robert Gruber in [110]: an ontology is a "formal, explicit speci-
fication of a shared conceptualization of a domain". In an ontological approach for knowl-
edge engineering, logic-based representations and reasoning are the natural choice. First of
all, because logic-based representations are ideal for representing relational knowledge and
capturing interactions between entities. Secondly, because logic-based systems can be very
expressive depending on the kind of logic used. Finally, because logic-based representa-
tions are governed by well defined syntax and semantics and supported by a variety of well
researched inference approaches (see Section 2.1). When developing ontologies we agree
on a vocabulary and a taxonomy based on which we represent and structure knowledge.
The OWL language was designed in a way that provides maximal compatibility with RDF and
RDFS, and its normative syntax is RDF/XML1 with formal semantics2. OWL comes in three vari-
ants: OWL Full, OWL DL and OWL Lite [114]. OWL Full is the most expressive of the
three but is undecidable. OWL DL (Description Logics) is based on the SHOIN(D) fam-
ily of DL languages (see Section 2.1.3). OWL DL provides a very good balance between
the expressiveness of first-order logic and very desirable features such as soundness, complete-
ness and decidability. OWL Lite is based on the SHIF(D) family (see Section 2.1.3); it is
the least expressive of the three and is intended for users that mainly require simple con-
straint features and a classification hierarchy. More information on the three variants of
OWL can be found in [113]. Figure 3.1 illustrates how the three different OWL variants
relate to each other in terms of expressive power.
Figure 3.1. Relation of the OWL variants in terms of their expressive power. OWL Lite is a
subset of OWL DL while both are subsets of the OWL Full variant.
Classes: An ontology class is a representation of a set of entities that exist in a domain and
share common characteristics that allow them to be grouped together. Ontology classes are
Figure 3.2. Ontology classes of entities that can be found in the hypothetical underwater
MCM domain organised in a taxonomy. The is-a notation denotes a superclass-subclass
connection between classes. Thing is a special class that plays the role of the root class (root
superclass) of all classes in a domain. This is analogous to root nodes found in tree data
structures.
very much like classes in the context of object oriented programming. The notion of a class
is equivalent to the notion of a concept as this was described in Section 2.1.3. Let us repli-
cate the example given in the aforementioned section in which an underwater vehicle needs
to locate and neutralize sea mines as part of an MCM mission. In this context, ontology
classes such as PerceptualAgent, NavalMine, MooredMine, AUV, NeutralisingSomething
and NeutralisingAMooredMine are classes that could be used for modelling the hypothetical
domain. Ontology classes are organized into a hierarchy in the form of superclass-subclass
occurrences (subclass is-a superclass) which is also known as a taxonomy. Consequently,
an AUV is-a PerceptualAgent, a MooredMine is-a NavalMine and NeutralisingAMoored-
Mine is-a NeutralisingSomething. Classes should not be thought of only as sets comprising
physical entities but also as sets that can represent non-physical entities as evidenced by
classes such as NeutralisingAMooredMine which refers to an action. Figure 3.2 provides
a graphical representation of the classes discussed above as well as additional hypotheti-
cal domain classes that could be used, organised in a taxonomy, as they are rendered in
Protégé8 , an open source knowledge management system and ontology editor.
Individuals: Individuals are the individual entities (instantiations) of ontology classes. In-
dividuals are also referred to as instances. Again, ontology individuals are very much like
class instances in the context of object oriented programming. Now consider the Moored-
Mine, Anchor and AnchorChain concepts. Individuals of these concepts would be the actual
moored mines, anchors and anchor chains present in the domain (e.g. MooredMine1 for
an individual in the first case, Anchor1 for an individual in the second case, etc.). If for in-
8 http://protege.stanford.edu
Figure 3.3. Ontology individuals as well as their respective classes. Individuals can be iden-
tified by the purple diamond preceding their name while classes can be identified by the
yellow circle. Blue arcs link individuals to the classes they belong while purple arcs link
classes to each other forming superclass-subclass connections. MooredMine1, Anchor1 and
AnchorChain1 are individuals (instantiations) of their respective classes.
stance more moored mines were to be present in the domain then we would have additional
instances of the concept MooredMine and so on. Figure 3.3 illustrates such individuals as
they are rendered in Protégé. The taxonomy presented in Figure 3.3 is the same as the one
presented in Figure 3.2 with the addition of individuals. Moreover, we have used a different
visualization tool within Protégé that enables the user to visualise the connections between
the various building blocks of the ontology.
Properties: Properties are the last component found in OWL DL ontologies. Properties in
OWL DL are equivalent to roles found in DLs. There are two types of properties in OWL
DL, namely, object and data properties. Both object and data properties can be used in
class definitions to restrict their domain and range to individuals with certain attributes (see
Section 2.1.3). Moreover, object properties can be used to link two individuals forming
binary relations. Finally, data properties can be used to link an individual to data values9
such as strings, literals, numeric values etc., acting in this manner as attributes. Figure 3.4
illustrates an object property between two individuals forming a binary relation.
The analogies between DLs and OWL DL do not come as a surprise since OWL DL is
part of the DLs family of languages. Recall from Section 2.1.3 that when using DLs the
available knowledge resides in the DL KB which comprises the TBox and ABox. In the
same manner, the available knowledge in ontologies authored in OWL DL resides in the
so called OWL DL KB which also comprises the TBox and ABox. The TBox contains
all classes as well as class definitions through restrictions while the ABox contains all
9 Strictly speaking, a link between an individual and a data value is also a binary relation between them,
but we reserve this term for describing relations between individuals.
Figure 3.4. Object property isTiedToChain forms a binary relation (orange arc) between in-
dividuals MooredMine1 and AnchorChain1. Blue arcs link individuals to the classes they
belong while purple arcs link classes to each other forming superclass-subclass connections.
The information illustrated in this figure was produced by manually creating the isTied-
ToChain object property inside the ontology illustrated in Figure 3.3 in order to form a binary
relation.
individuals with their attributes as well as all binary relations between them.
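As an illustration of how such an ontology could be authored programmatically, the following sketch uses the third-party owlready2 Python library (an assumption on our part; the thesis examples were built in Protégé) to recreate the moored mine classes, the isTiedToChain object property, a hasWidth data property and the two individuals; the IRI is hypothetical:

from owlready2 import get_ontology, Thing, ObjectProperty, DataProperty

onto = get_ontology("http://example.org/mcm.owl")  # hypothetical IRI

with onto:
    class NavalMine(Thing): pass
    class MooredMine(NavalMine): pass          # MooredMine is-a NavalMine
    class AnchorChain(Thing): pass

    class isTiedToChain(ObjectProperty):       # binary relation between individuals
        domain = [MooredMine]
        range = [AnchorChain]

    class hasWidth(DataProperty):              # data property acting as an attribute
        range = [float]

# ABox: individuals, a binary relation and an attribute.
mine = MooredMine("MooredMine1")
chain = AnchorChain("AnchorChain1")
mine.isTiedToChain = [chain]
mine.hasWidth = [4.0]  # metres

onto.save(file="mcm.owl")  # serializes the ontology to RDF/XML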
in Section 3.3.1. A similar framework is the ORO system [126], which leverages a core
robotics ontology to integrate data from diverse sources such as sensors, domain knowl-
edge, and human input, and to provide useful, structured knowledge that can help the robot
in interactions with humans. An ontology for general situation awareness is given by [124],
with applications to military robotics, and [127] create a general-purpose ontology for spec-
ifying robotic domains.
A particularly relevant work is [128], which describes a planning and control system
for AUVs built on an ontology. Planning and mission data, vehicle capabilities, and vehicle
status information are all stored in the ontology, which allows for much more reactive and
flexible planning. The authors describe an in-water test where the planning system was able
to re-plan the mission to deal with component failures thanks to semantic knowledge en-
coded in the ontology. Other benefits of using an ontology are the ease of using a reasoning
system (for example for problem diagnosis), and the ease of sharing data between modules
on the vehicle.
KnowRob provides the means for interfacing the KB with external data and for implementing
KnowRob reasoning modules, which can be extended based on application needs. This
extensibility is a major advantage of KnowRob, meaning it can be adapted to the
requirements of many robotic projects.
Another benefit of KnowRob is that not all knowledge needs to be immediately avail-
able in the knowledge base: a special class of Prolog predicate called a computable can be
defined that dynamically queries or calculates a result. This can be used, for example, to
query the perception system for the current value of a variable, or calculate the relationship
between two objects using their positions. In this way, the world plays the role of a virtual
knowledge base, meaning that correct information is always used, and the on-demand com-
putation of knowledge reduces the computational load. Further, the knowledge base can
grow and change at run-time. In contrast to many logic-based systems where all available
knowledge must be inserted into the knowledge base at the start of the inference process to
enable all possible queries of interest to be answered, in KnowRob this is not required.
An overview of KnowRob’s architecture is shown in Figure 3.5. This shows the key
KnowRob interfaces and illustrates how modular the system is. Integration with the robot
includes interfacing to the perception and executive systems, as well as the ability to call
ROS services and topics. The KnowRob system itself is actually distributed as a ROS
stack, meaning it can easily be used in existing ROS-based architectures, including the
one developed at Heriot-Watt’s Ocean Systems Laboratory. The interfaces to knowledge
acquisition systems and human interaction systems are most useful for household robots
while the visualization module enables humans to visualise the knowledge a robot holds
about its environment given appropriate CAD models.
The reasoning interfaces at the top of Figure 3.5 are important and include a module
for computing spatial and temporal relations between objects, a module for matching the
robot’s knowledge of its own capabilities to actions needed to perform a task, and a module
interfacing to classification systems. Finally, the ProbCog module provides probabilistic
inference capabilities and is built on BLNs (see Section 2.3.2).
Chapter 4

Artificial Intelligence Planning

The purpose of this chapter is to present an overview of the field of Artificial Intelligence
(AI) planning as it is an important aspect of autonomous underwater operations in general
and by extension of operations concerning maritime defence (see Sections 1.1, 1.1.1 for a
reminder of the problem statement as well as Section 1.2 for a reminder of the objectives of
the thesis). Regarding this chapter, in Section 4.1 we present an overview of AI planning
while in Section 4.2 we present classical planning. Moreover, in Section 4.3 we present
temporal planning while Section 4.4 is concerned with planning under uncertainty. The
Planning Domain Definition Language (PDDL) is the focus of Section 4.5. In Section 4.6 we briefly present a temporal planner called OPTIC. Finally we present research
outputs with respect to planning for autonomy in Section 4.7.
4.1 Overview
Artificial Intelligence (AI) planning, also known as automated planning or simply planning, is a field of AI that is concerned with the formulation and organization of actions in a way that, when executed by some intelligent agent, system, vehicle etc., will enable it to achieve some goals. In other words, planning is the process of solving a planning problem by formulating actions and organizing them in a way that, when executed, will enable an intelligent agent to achieve some goal(s) as dictated by the planning problem at hand [134], [3]. Figure
4.1 illustrates a conceptual model for planning. Before going into detail about Figure 4.1,
let us first define Σ which represents a state-transition system [3]. A state transition system
is a 4-tuple Σ = (S, A, E, γ) where:
• S = {s1, s2, ...} is a finite or recursively enumerable set of states.
• A = {a1, a2, ...} is a finite or recursively enumerable set of actions.
• E = {e1, e2, ...} is a finite or recursively enumerable set of events.
• γ : S × A × E → 2^S is a state-transition function.
Now, judging from Figure 4.1, planning can be described in terms of three main interacting components. First, a planner, which, given a description of Σ, an initial state and some objective1, generates a plan for the controller in order to achieve that objective. Second, a controller, which outputs an action α according to some plan, given as input the state s of the system. At this point it is worth noting that in many cases the information that the controller has about the state of the system is incomplete. That is, the controller has partial knowledge. Partial knowledge can be modelled as an observation function η : S → O where O is a discrete set of observations: O = {o1, o2, ..., on}. As such, the controller's input is the observation o = η(s) that corresponds to the current state [3]. Finally, the third component is a state-transition system Σ that evolves according to its state-transition function γ as well as the events and actions that it receives. One last observation to make by looking at Figure 4.1 is that the execution status is fed into the planner by the controller. This allows interleaving planning and execution as well as the utilization of replanning and plan revision mechanisms in response to discrepancies (differences) between Σ and the real world. A more simplistic alternative would be to omit the execution status arrow from the controller to the planner in Figure 4.1. That would imply that the controller is robust enough to deal with discrepancies on its own which, for the vast majority of cases, is hardly true.
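The interaction between the three components can be summarised in a short sketch; planner, controller, system and objective below are hypothetical placeholders rather than a concrete API, and the loop simply mirrors Figure 4.1, including the execution status feedback.

def run_mission(planner, controller, system, objective):
    """Minimal sketch of the planner/controller/Σ loop of Figure 4.1.
    planner, controller, system and objective are hypothetical placeholders."""
    state = system.initial_state()
    plan = planner.plan(system.description(), state, objective)
    while not objective.satisfied(state):
        observation = system.observe(state)        # o = η(s), possibly partial
        action = controller.next_action(plan, observation)
        state = system.apply(state, action)        # Σ evolves according to γ
        if controller.execution_status().discrepancy:
            # execution status fed back: replan when Σ and the world diverge
            plan = planner.replan(system.description(), state, objective)
    return plan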
With respect to planning in general, there are eight restrictive assumptions that can be
made. The existence or absence of combinations of those assumptions defines the type
of planning that is going to take place [3] (see Sections 4.2, 4.3, 4.4). The following list
presents those assumptions:
• Assumption 1 (A1): Fully observable Σ. That is, we have full knowledge about the
state of Σ.
1 An objective can be a goal state or a set of goal states. In a more general manner, the objective is to satisfy some condition over the sequence of states followed by the system [3].
• Assumption 2 (A2): Deterministic Σ. That is, the application of an action to a state brings the system to a single other state.
• Assumption 3 (A3): Static Σ. That is, Σ has no internal dynamics of its own; it remains in the same state until the controller applies some action.
• Assumption 4 (A4): Restricted goals. This means that the planner only handles goals
that are part of an explicit goal state (sg ) or set of goal states (Sg ).
• Assumption 5 (A5): Sequential plans. This means that a (solution) plan is a finite
sequence of actions that are linearly ordered.
• Assumption 6 (A6): Implicit time. The notion of implicit time dictates that actions
and events have no duration. As a consequence, the application of an action or the
existence of some event incurs an immediate state transition.
• Assumption 7 (A7): Offline planning. That is, while planning, a planner is not con-
cerned with changes that may occur in Σ. As a consequence, it plans only for the
given initial and goal state(s).
A state-transition system that embodies all the aforementioned assumptions is called a re-
stricted state-transition system.
Apart from the different types of planning based on the aforementioned assumptions, planning can also be categorized based on the problem at hand [3]. For instance, path and motion planning is concerned with the formulation of plans that synthesize a path from some starting point in space to some goal point in space and a control trajectory along that path. Perception planning, on the other hand, is concerned with formulating plans that involve sensing actions that are necessary for collecting data. Moreover, manipulation planning is concerned with formulating plans that will enable an acting agent to manipulate objects in its environment. In fact, any problem can be thought of as a planning problem since any problem requires a series of actions to be solved. As such, we could have transportation planning referring to planning for running transportation means (buses, airplanes, etc.). Consequently, the categorization of planning based on the problem at hand is rather wide and in many cases different terms can be used for planning for the same type of problem. In the case of manipulating objects, for example, where we referred to that form of planning as manipulation planning, we could just as easily use the term task planning. That is, manipulation could be thought of as a task that some agent, human or artificial, could undertake.
Having presented a fraction of the diverse domain of planning problems, we would like to make another categorization based on whether the process of planning is dependent on the domain of application or not. In this case we can have domain-independent planning and domain-specific planning [3]. In domain-specific planning, any commonalities, even between diverse domains, are mostly not taken into consideration. As such, there can be an unnecessary fragmentation of planning approaches. On the other hand, in domain-independent planning such commonalities are exploited in a manner that justifies the usage of generic planners. That, of course, does not mean that domain-specific planning should not be employed. According to Ghallab et al. [3], arguing for generic planners to fully replace domain-specific ones would be akin to arguing for replacing the specialized computation techniques in a computer with generic ones, something that would be highly inefficient.
Definition 4.1 Let L = {p1, p2, ..., pn} be a finite set of proposition symbols. A planning domain on L under the set-theoretic representation is a restricted state-transition system Σ = (S, A, γ) such that:
• S ⊆ 2^L, that is, each state s is a subset of L comprising exactly the propositions that hold in s.
• Each action α ∈ A is a triple of subsets of L, written α = (precond(α), effects−(α), effects+(α)). An action α is applicable to a state s if and only if precond(α) ⊆ s, that is, all of its preconditions are satisfied by s.
• S has the property that if s ∈ S, then for every action α that is applicable to s, the set (s − effects−(α)) ∪ effects+(α) ∈ S. More intuitively, whenever an action is applicable to a state, it generates a new state.
Definition 4.2 A planning problem under the set-theoretic representation is a triple P = (Σ, s0, g), where s0 ∈ S is an initial state and g ⊆ L is a set of goal propositions defining the set of goal states Sg = {s ∈ S | g ⊆ s}.
Definition 4.3 A plan is any sequence of actions π = ⟨α1, α2, ..., αk⟩, where k ≥ 0. The length of π is |π| = k. That is, it equals the number of actions.
Definition 4.4 A plan π is a solution plan for a planning problem P if the state that results from applying the actions of π in s0, in the given order, is a goal state. A solution π is redundant if a proper subsequence of π is also a solution plan for P.
With the above, we have now provided all the necessary definitions for planning domains (Definition 4.1), planning problems (Definition 4.2), plans (Definition 4.3) and solution plans (Definition 4.4) under the set-theoretic representation.
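Under the set-theoretic representation, applicability and the transition function reduce to a few set operations, as the following sketch with a hypothetical two-location domain illustrates.

# States are frozensets of proposition symbols; an action is a triple of
# sets (precond, effects_minus, effects_plus), as in Definition 4.1.
def applicable(state, action):
    precond, _, _ = action
    return precond <= state                       # precond(α) ⊆ s

def gamma(state, action):
    _, eff_minus, eff_plus = action
    return (state - eff_minus) | eff_plus         # (s − effects−(α)) ∪ effects+(α)

s0 = frozenset({"at-loc1", "free-loc2"})
move = ({"at-loc1", "free-loc2"},                 # preconditions
        {"at-loc1", "free-loc2"},                 # negative effects
        {"at-loc2", "free-loc1"})                 # positive effects
assert applicable(s0, move)
print(gamma(s0, move))                            # {'at-loc2', 'free-loc1'}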
Let us now provide the corresponding definitions in the classical representation setting. Before doing so, let us first provide some definitions for operators and actions. The requirement to do so stems from the fact that actions in the classical representation are represented by (planning) operators.
Definition 4.5 A planning operator is a triple o = (name(o), precond(o), effects(o)), where:
• name(o) is the name of the operator and is an expression of the form n(x1, x2, ..., xk) with n being a symbol known as an operator symbol, x1, x2, ..., xk are all of the variable symbols that appear in o, and n is unique (that is, no two operators in L have the same operator symbol).
• precond(o) and effects(o) generalize the preconditions and effects of an action: instead of being sets of propositions, they are sets of literals.
Definition 4.6 For any set of literals L, L+ is the set of all atoms in L, and L− is the set of all atoms whose negations are in L. If o is an operator or operator instance, then precond+(o) and precond−(o) are o's positive and negative preconditions respectively, while effects+(o) and effects−(o) are o's positive and negative effects respectively.
In case there is any confusion about operators, actions, preconditions and effects, let us provide an exemplar operator and its instantiation in the following snippet.
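Snippet 1 A move operator and a move action instantiating it (sketched consistently with the analysis that follows).

move(r, l, m)
precond: adjacent(l, m), at(r, l), ¬occupied(m)
effects: at(r, m), occupied(m), ¬occupied(l), ¬at(r, l)

move(robot1, loc1, loc2)
precond: adjacent(loc1, loc2), at(robot1, loc1), ¬occupied(loc2)
effects: at(robot1, loc2), occupied(loc2), ¬occupied(loc1), ¬at(robot1, loc1)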
Snippet 1 illustrates a move operator and a move action. Input r represents a robot while l and m represent locations. What the operator tells us is that in order for a robot r to move from some location l to some location m, l and m must be adjacent, robot r must be at location l and location m must not be occupied. Notice here that we have two positive and one negative precondition. The effects of some robot r moving from some location l to some location m are that robot r is now at location m, location m is now occupied, location l is no longer occupied and, of course, robot r is no longer at location l. Notice that the set of effects comprises two positive and two negative ones. Now, with respect to the corresponding action, it is nothing more than an instantiation of the operator. That is, variable r representing a robot and variables l, m representing locations have been instantiated with a specific robot and specific locations, namely robot1 and loc1, loc2 respectively.
Let us now provide the definitions for planning domains, problems, plans and solutions
under the classical representation [3].
Definition 4.8 Let L be a first-order language that has a finite number of predicate and constant symbols. A classical planning domain in L is a restricted state-transition system
Σ = (S, A, γ ) such that:
• S is closed under γ , that is, if s ∈ S, then for every action α that is applicable to s,
γ (s, α ) ∈ S.
• Sg = {s ∈ S | s satisfies g}
Again, redundancy is defined in the same way as in the case of the set-theoretic representa-
tion (see Section 4.2.1).
• effects(o) is a set of value assignments to state variables, each of the form x(t1, t2, ..., tk) ← tk+1, where each ti is a term in the appropriate range.
Definition 4.12 An action α is a ground operator o that meets the rigid relations in precond(o), meaning that every object variable in o is substituted by a constant of the corresponding class such that such constants meet the rigid relations in precond(o). An action α is applicable in a state s when the values of the state variables in s meet the conditions in precond(α). As such, updating the values of the state variables according to the assignments in effects(α) will generate a new state γ(s, α).
2 Unless we are talking about a domain in which we are able to bend or stretch space.
Let us now provide the definitions for planning domains and problems under the state-
variable representation [3].
Definition 4.13 A planning domain under the state-variable representation, on a set X of ground state variables and a set R of rigid relations, is a restricted state-transition system Σ = (S, A, γ) such that:
• S ⊆ ∏x∈X Dx, where Dx is the range of the ground state variable x; a state s is denoted as s = {(x = c) | x ∈ X}, where c ∈ Dx.
• A = {all ground instances of operators in O that meet the relations in R}, where O is a set of operators; an action α is applicable to a state s if and only if every expression (x = c) in precond(α) is also in s.
• S is closed under γ , that is, if s ∈ S, then for every action α that is applicable to s,
γ (s, α ) ∈ S.
Definition 4.14 A planning problem is a triple P = (Σ, s0 , g), where s0 is an initial state in
S and the goal g is a set of expressions on the state variables in X.
A typical state space planning algorithm is the forward search algorithm, whose input is the statement of a planning problem, that is, a set of operators, an initial state and a goal state (O, s0, g). If the planning problem P is solvable, i.e. there is an action sequence that makes the transition from the initial to the goal state feasible, then the algorithm will return this action sequence (plan). If not, it will return failure. Another state space algorithm is the backward search algorithm which, as its name
suggests, works backwards. That is, starting from the goal state and applying inverses
of the operators so that subgoals can be generated. The algorithm terminates when the set of subgoals is satisfied by the initial state, in which case it returns the sequence of actions, or returns failure otherwise [3]. The returned solution of the backward search algorithm, if any, needs
to be reversed to provide the solution to the planning problem. Note that both forward
and backward search algorithms are sound and complete. Yet another planning algorithm
that lies within the state space planning algorithmic family is STRIPS [3]. The STRIPS algorithm is in its essence similar to a backward search algorithm since it works from the goal to reach the initial state but, as opposed to the backward search algorithm, it is not complete. This incompleteness is due to the branching factor reduction that STRIPS implements, which reduces the size of the search space considerably but at the cost of completeness. We will not go into detail about how STRIPS works; the interested reader is referred to two very detailed textbooks by Ghallab et al. [3], [134].
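For illustration, a minimal forward search over the set-theoretic representation, here with a breadth-first strategy (one of several possible orderings), can be sketched as follows:

from collections import deque

def forward_search(operators, s0, goal):
    """Breadth-first forward state space search, a sketch over the
    set-theoretic representation. operators: list of (name, precond,
    eff_minus, eff_plus) with sets of propositions; s0: set; goal: set.
    Returns a list of action names or None (failure)."""
    frontier = deque([(frozenset(s0), [])])
    visited = {frozenset(s0)}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:                      # goal propositions satisfied
            return plan
        for name, pre, eff_minus, eff_plus in operators:
            if pre <= state:                   # applicable action
                succ = frozenset((state - eff_minus) | eff_plus)
                if succ not in visited:
                    visited.add(succ)
                    frontier.append((succ, plan + [name]))
    return None                                # no plan exists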
With respect to plan space search algorithms, the search space is not one comprising states, as in the case of state space algorithms, but one comprising partially specified plans [3]. As such, plan space algorithms search graphs whose nodes are partially specified plans while their edges (arcs) are plan refinement operations rather than state transitions or actions. The goal of such refinement operations is to incrementally complete a partial plan. That is, fulfilling open3 goals or eradicating potential inconsistencies. A fundamental concept of
plan space planning algorithms is the least commitment principle. This principle dictates
that refinement operations are to avoid adding constraints to the partial plan that are not
strictly required for addressing the refinement purpose. Apart from the search space being different in plan space planning when compared to state space planning, another difference lies in the definition of solution plans. That is, instead of a sequence of actions, two things
are considered: action choices and action ordering. Let us provide some definitions in order
to clarify things before we talk about the algorithms themselves [3].
Definition 4.15 A partial plan is a tuple π = (A, ≺, B, L), where:
• A is a set of partially instantiated planning operators (actions) {α1, ..., αk}.
• ≺ is a set of ordering constraints on A of the form (αi ≺ αj).
• B is a set of binding constraints on the variables of actions in A that have the form of x = y, x ̸= y or x ∈ Dx with Dx being a subset of the domain of x.
3 Not yet fulfilled
• L is a set of causal links of the form ⟨αi →p αj⟩, such that αi and αj are actions in A, the constraint (αi ≺ αj) is in ≺, proposition p is an effect of αi and a precondition of αj, and the binding constraints for the variables of αi and αj appearing in p are in B.
Definition 4.16 A partial plan π = (A, ≺, B, L) is a solution plan for problem P = (Σ, s0, g) if:
• its ordering constraints ≺ and binding constraints B are consistent; and
• every sequence of totally ordered and totally instantiated actions of A satisfying ≺ and B takes the system from s0 to a state satisfying g.
Definition 4.17 An action αk in a partial plan π is a threat on a causal link ⟨αi →p αj⟩ if:
• αk has an effect ¬q that is possibly inconsistent with p, that is, q and p are unifiable;
• the ordering constraints (αi ≺ αk) and (αk ≺ αj) are consistent with ≺; and
• the binding constraints for the unification of q and p are consistent with B.
Definition 4.18 A flaw in a partial plan π is either:
• a subgoal, that is, a precondition of an action in A that is not supported by a causal link; or
• a threat, that is, an action that may interfere with a causal link.
Now, regarding the search algorithms themselves in plan space planning, let us briefly
describe the rationale behind the PSP (Plan Space Planning) algorithm, a generic algorithm
that can be “instantiated” into several variants. PSP is initially provided with a partial plan
that contains two dummy actions init and goal in order to encode the initial state and the
goal condition into the search problem. In addition, the initial partial plan π contains one
ordering constraint of the form init ≺ goal and no variable bindings or causal links. The
algorithm proceeds with identifying the flaws of π and if π contains none then it returns π
as the solution plan. If the plan is flawed then the algorithm selects a flaw and it attempts
to find all possible ways in which it can be resolved. Each such way is called a resolver. If
there is no resolver then it returns failure, i.e. the planning problem cannot be solved. If
there are resolvers it chooses one of them non-deterministically and refines the plan with
that resolver. The new plan (after applying the resolver) is utilized in order to make a
recursive call to the algorithm. PSP is sound and complete. A famous variant of PSP is the POP (Partial-Order Planning) algorithm. The main difference from PSP is that it employs a different approach when resolving flaws. Recall from Definition 4.18 that a flaw is either a subgoal or a threat. While PSP does not distinguish between the two when choosing a flaw non-deterministically at each recursion, POP first refines with regard to a subgoal and then proceeds with solving all threats introduced by the resolver of that subgoal. This
is done for each recursion. POP is also sound and complete.
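The recursion just described can be summarised in the following skeleton, where flaw detection, resolver generation and plan refinement are left abstract since their details depend on the flaw type; the function names are placeholders.

def psp(plan, flaws_of, resolvers_of, refine):
    """Skeleton of the PSP recursion described above. flaws_of(plan) returns
    the flaws of a partial plan, resolvers_of(flaw, plan) the possible
    resolvers, refine(resolver, plan) the refined plan. The deterministic
    loop over resolvers emulates the non-deterministic choice via backtracking."""
    flaws = flaws_of(plan)
    if not flaws:
        return plan                       # flawless partial plan: a solution
    flaw = flaws[0]                       # a heuristic choice point in practice
    for resolver in resolvers_of(flaw, plan):
        result = psp(refine(resolver, plan), flaws_of, resolvers_of, refine)
        if result is not None:
            return result
    return None                           # no resolver works: failure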
Regarding temporal constraints and time instants, they are formed based on the following
set of primitive relation symbols Prinst = {<, =, >} denoting the notions of before, equal
to (or at the same time) and after respectively. Now, the set of all temporal constraints on
time instants is Rinst = 2Prinst = {0, / {<}, {=}, {>}, {<, =}, {<, >}, {>, =}, Prinst }. For
instance, if we have two instants t1 ,t2 and wanted to to express that t1 comes before t2 we
would use t1 < t2 . Alternatively, if we wanted to express that the instants are not equal we
would use [t1 {<, >}t2 ] ([t1 ̸= t2 ] for better readability). The set of temporal constraints on
instants is really self explanatory except for the case of the constraints 0, / Prinst which are
special constraints. Constraint 0/ is the empty constraint denoting a constraint that cannot
be satisfied while Prinst is the universal constraint denoting a constraint that can be satisfied
by every tuple of values of the temporal variables it involves. Each element of Rinst is itself
a set of primitive relation symbols. With respect to temporal constraints on (time) intervals
things are a bit more complicated. Instead of three primitive relation symbols we have the
following thirteen primitive relation symbols: Printer = {b, b′ , m, m′ , o, o′ , s, s′ , d, d ′ , f , f ′ , e},
which stand for before, after, meet, is met by, overlap, is overlapped by, start, is started by,
during, includes, finish, is finished by, equal respectively. An observant reader may have
noticed that, in pairs4 , the primitive relation symbols can form symmetrical constraints be-
tween intervals. Let i, j be intervals. As such, i b′ j when j b i. That is, interval i is after
interval j when interval j is before interval i, and so on. Figure 4.2 illustrates seven of the primitive relations between intervals.
Figure 4.2. Primitive temporal relations between intervals (i, j). The symmetrical relations are omitted but are very straightforward to form.
Note that the length of the intervals in Figure 4.2 is variable. We have made it so in order to demonstrate the various primitive relations without having to resort to the usage of multiple intervals. Again, an observant reader may have noticed that e (equal) does not have a symmetrical relation. The fact is that it does, but it is identical; as such, it is not necessary to include it in the set of primitive relations. Now the set of temporal constraints on intervals is Rinter = 2^Printer = {∅, {b}, {m}, {o}, ..., {b, m}, {b, o}, ..., {b, m, o}, {b, m, s}, ..., Printer}. That is 8192 constraints. Each element of
the Rinter set is itself a set of primitive relation symbols and is interpreted as the disjunction of those primitives [3]. For instance, let i, j be intervals. A constraint of the form [i {b, s} j] is interpreted as [(i b j) ∨ (i s j)].
4 Pairs here refer to pairs of the form (b, b′), (m, m′) and so on; they do not refer to mixed pairs such as (b, m′).
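For concreteness, when interval endpoints are known numeric values the thirteen primitive relations can be computed directly, as in the following sketch:

def interval_relation(i, j):
    """Classify the primitive temporal relation between two intervals.
    i and j are (start, end) pairs with start < end. Returns one of the
    thirteen primitive relation symbols used in this section."""
    (i1, i2), (j1, j2) = i, j
    if i2 < j1:  return "b"    # i before j
    if j2 < i1:  return "b'"   # i after j
    if i2 == j1: return "m"    # i meets j
    if j2 == i1: return "m'"   # i is met by j
    if i1 == j1 and i2 == j2: return "e"            # equal
    if i1 == j1: return "s" if i2 < j2 else "s'"    # starts / is started by
    if i2 == j2: return "f" if i1 > j1 else "f'"    # finishes / is finished by
    if j1 < i1 and i2 < j2: return "d"              # i during j
    if i1 < j1 and j2 < i2: return "d'"             # i includes j
    return "o" if i1 < j1 else "o'"                 # overlap / is overlapped by

print(interval_relation((0, 2), (1, 3)))            # 'o': i overlaps j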
Switching over to binding constraints, they are applied to object variables and are of the
following form: x = y, x ̸= y and x ∈ D, with D being a set of constant symbols. Rigid re-
lations and binding constraints represent time-invariant expressions while flexible relations
and temporal constraints represent time-variant, also known as time-dependent, expres-
sions. A term that is often used when referring to rigid relations and binding constraints is
the term object constraints [3]. Let us now proceed with the definition of temporally qualified expressions, abbreviated as TQEs, which are essential in defining temporal databases.
Definition 4.19 A temporally qualified expression (TQE) is an expression of the form p(ζ1, ..., ζk)@[ts, te), where:
• p is a flexible relation;
• ζ1, ..., ζk are constants or object variables;
• ts, te are temporal variables such that ts < te;
• @[ts, te) represents the time interval [ts, te) at which (@) the flexible relation holds.
In fact, what a temporally qualified expression “tells” us is that ∀t such that ts ≤ t < te, the relation p(ζ1, ..., ζk) holds at time t [3]. Now let us provide the definition of a temporal database (Definition 4.20) [3].
Definition 4.20 A temporal database is a pair Φ = (F, C), where F is a finite set of TQEs and C is a finite set of temporal and object constraints, such that:
• C is required to be consistent, that is, there exist values for the variables that meet all the constraints.
In short, as a construct, a temporal database holds assertions about how the world evolves
over time. Now in order to be able to define temporal planning operators, actions, do-
mains, problems, plans and solutions (see Section 4.3.1.1) we first need to provide defini-
tions about: i) when a set of TQEs supports another TQE and when it supports another set of TQEs (see Definition 4.21), ii) when a temporal database supports a set of TQEs and when it supports another temporal database (see Definition 4.22), and iii) when a temporal database entails another database (see Definition 4.23) [3].
Definition 4.21 A set F of TQEs supports a TQE e = p(ζ1, ..., ζk)@[t1, t2) if and only if there is, in F, a TQE p(ζ1′, ..., ζk′)@[τ1, τ2) and a substitution σ such that σ(p(ζ1, ..., ζk)) = σ(p(ζ1′, ..., ζk′)). An enabling condition for e in F is the conjunction of the two temporal constraints τ1 ≤ t1 and t2 ≤ τ2 together with the binding constraints in σ; the set of all enabling conditions for e in F is denoted by θ(e/F). Similarly, F supports a set of TQEs E when every TQE in E is supported by F under a single consistent substitution, with θ(E/F) denoting the corresponding set of enabling conditions.
Definition 4.22 A temporal database Φ = (F, C) supports a set of TQEs E when F sup-
ports E and there is an enabling condition c ∈ θ (E/F) that is consistent with C. Φ = (F, C)
supports another temporal database (F ′ , C ′ ) when F supports F ′ and there is an enabling
condition c ∈ θ (F ′ , F) such that C ′ ∪ c is consistent with C.
The concept of entailment was introduced and analysed earlier in this thesis, in Section 2.1, when presenting logic-based knowledge representation approaches, and is connected to
inference. The concept of entailment was expanded to KBs which consisted of formulas.
Unsurprisingly, entailment is applicable to temporal databases which instead of formulas
consist of TQEs and constraints.
4.3.1.1 Temporal Operators, Actions, Axioms, Domains, Problems, Plans and Solutions
Let us begin with the definition of temporal planning operators [3].
Definition 4.24 A temporal planning operator is a tuple o = (name(o), precond(o), effects(o), const(o)), where precond(o) and effects(o) are sets of TQEs and const(o) is a conjunction of temporal constraints and object constraints, such that:
• name(o) is an expression of the form o(x1, x2, ..., xk, ts, te) such that o is an operator symbol and x1, x2, ..., xk are all the object variables that appear in o, together with the temporal variables ts, te appearing in const(o). The other unconstrained temporal variables in o are free variables.
Continuing with definitions let us now present the definition for actions and the applicability
of actions [3] (see Definition 4.25).
Definition 4.25 An action α is a partially instantiated operator o such that α = σ (o) for
some substitution σ . An action α is applicable to a temporal database Φ = (F, C) if and
only if precond(α ) is supported by F and there is an enabling condition c in θ (α /F) such
that C ∪ const(α ) ∪ c is a consistent set of constraints.
The result of applying an action to a database is a set of possible databases [3]. An observant reader may have noticed that negative effects are not made explicit in the definition of planning operators. Consequently, the effects of an action state only what holds as an effect of the action and not what does not hold. To achieve the latter we use domain axioms. The definition of a domain axiom is the following [3]:
Definition 4.26 A domain axiom is a conditional expression of the form p : cond(p) → disj(p), where cond(p) is a set of TQEs and disj(p) is a disjunction of temporal and object constraints.
Given domain axioms, temporal planning domains can be defined as follows:
Definition 4.27 A temporal planning domain is a pair D = (ΛΦ, O), where O is a set of temporal planning operators and:
• ΛΦ is the set of all temporal databases that can be defined with the constraints as well as the constant, variable and relation symbols within our representation.
Given the above definition we can define a temporal planning problem as follows:
Definition 4.28 A temporal planning problem in planning domain D is a tuple of the form
P = (D, Φ0, Φg), where:
• Φ0 = (F, C) is a temporal database that represents the initial scenario of the problem.
• Φg = (G, Cg) is a database that represents the goals of the problem as a set G of TQEs together with a set Cg of object and temporal constraints on the variables of G.
Definition 4.29 A stochastic system is a triple Σ = (S, A, P), where S is a finite set of states, A is a finite set of actions, and P is a set of transition probability distributions:
• Pα(s′|s), where α ∈ A, s, s′ ∈ S. That is, for each s ∈ S, if there exists α ∈ A and s′ ∈ S such that Pα(s′|s) ̸= 0, we have ∑s′∈S Pα(s′|s) = 1.
When planning using MDPs domains are expressed as stochastic systems. In the above
definition (Definition 4.29) Pα (s′ |s) is the probability that executing some action α in some
state s will lead the system to transition to some state s′ . Now, not all actions are executable
in a state. The set of executable actions A in some state s is denoted as A(s) and is the set
of actions that have a non zero probability of transitioning the system to some state s′ . Of
course state s′ is not necessarily the same for each action. For instance, consider the case of
a robot being stationary at some position and two non-deterministic actions, move-forward
and move-backward with move-forward related to a non zero probability of moving the
robot forward and move-backward related to a non zero probability of moving the robot
backward. In this case both actions are executable but the execution of each action will lead the system into a different state.
Recall from Section 4.4 that extended goals can be represented with utility functions. In the MDP setting, goals are represented in exactly this way, i.e. by means of utility functions. Utility functions are numerical functions that express preferences about which states should be traversed and/or which actions should be performed, by means of rewards and costs [3]. Let us now provide the definition of an MDP.
Definition 4.30 A Markov Decision Process (MDP) is a 5-tuple (S, A, P, U, γ), where:
• S is a finite set of states and A is a finite set of actions.
• Pα(s′|s), where α ∈ A, s, s′ ∈ S and P is a probability distribution. That is, for each s ∈ S, if there exists α ∈ A and s′ ∈ S such that Pα(s′|s) ̸= 0, we have ∑s′∈S Pα(s′|s) = 1.
• U is a utility function expressed in terms of rewards R : S → R and costs C : S × A → R.
• γ ∈ [0, 1) is a discount factor.
A plan in the MDP setting is the specification of actions that some controller should
execute given a state [3]. As such, plans can be represented as policies, denoted as π. Policies are functions of the form π : S → A, which means that a policy maps states into actions. Executing policies corresponds to infinite sequences of states. Such infinite sequences are
called histories [3]. Histories are Markov chains. Markov chains are sequences of random
events/values in which the probability of each event/value depends only on the previous
state. In our case a history is a Markov chain that comprises sequences of states whose prob-
abilities at a time interval depend only on the probabilities at the previous time interval. The
utility function that we presented as part of Definition 4.30 can be generalized to policies
and histories. For example, the utility of a policy π in a state s is U(s|π) = R(s) − C(s, π(s)), while the utility of a history is:
U(h|π) = ∑i≥0 γ^i [R(si) − C(si, π(si))]    (4.1)
where h is a history, π is the policy that induced that history, si are the states of history h, i.e. h = ⟨s0, s1, s2, ...⟩, and γ is the discount factor. The usage of γ is necessary because without it the utility of h would, in general, never converge to a finite value; in that case we would not be able to use utilities for any meaningful calculation.
Now it is time to put everything together and finalise our analysis of planning with MDPs.
Given the set of all possible histories of a stochastic system Σ, denoted as H, and a policy π for Σ, the expected utility of π, denoted as E(π), is:
E(π) = ∑h∈H P(h|π) U(h|π)
where U(h|π) is the utility of a history h induced by policy π, while P(h|π) is the probability of history h induced by policy π and is given by:
P(h|π) = ∏i≥0 Pπ(si)(si+1|si)
Before closing this section, let us note that costs C : S × A → R are what is known as a transition function, while rewards R : S → R are what is known as a state function. For a moment, forget about the terms costs and rewards and focus on the terms transition function and state function. Costs and rewards are concepts that can be used for both transition functions and state functions. This is important because using the terms costs and rewards can lead to confusion in several ways. For instance, one can restrict the values of the functions to the positive real numbers (R>0) instead of all real numbers (R) and call both the transition and the state function reward functions which, combined, give us the utility function. In this case the utility function is the summation of the two reward functions instead of their subtraction. In addition, when using MDPs we can make simplifications such as assuming that the underlying stochastic system deals only with transition functions and not state functions. As such, we would drop the terms referring to “reward” (state functions) where applicable; for instance, in Equation 4.1 we would drop R(si) and invert the sign. Having said that, it is better to view Definition 4.30 as a definition that provides the formulation of MDPs with both transition and state functions and, by extension, to view what we have presented in this section as formulations that apply to this case.
Continuing from the discussion in the last paragraph of the previous section, in this section
we will briefly describe two main algorithms for solving MDPs whose utility functions are
transition functions. Let us use the term reward here to denote that we want to maximize
the expected utility.
The first algorithm is known as the Policy Iteration algorithm [3]. The rationale behind this algorithm is that, given a stochastic system Σ, a utility function in the form of a transition function representing rewards and a discount factor γ, the algorithm calculates the optimal policy by alternating between two phases. The first phase, known as the value determination phase, calculates the expected reward of the current policy. The second phase, known as the policy improvement phase, refines the current policy into a new policy that has a higher expected reward (maximization). The algorithm terminates, returning the current policy, when there are no alternative actions that can improve the policy further.
The second algorithm is the Value Iteration algorithm [3]. Again, the input to the algorithm is a stochastic system Σ, a utility function in the form of a transition function representing rewards and a discount factor γ. The algorithm starts by randomly assigning an estimated reward to each state s ∈ S. The next step is to iteratively refine the value of each state by selecting an action that maximizes its expected reward. At each iteration step the value of the expected reward is computed for each state based on the previous value, i.e. the expected reward value for that state at the previous step. The algorithm finds an optimal action, i.e. one that maximizes the expected reward, for each state s ∈ S and stores it in the policy. The algorithm terminates when the expected reward between two consecutive iterations differs by less than some arbitrarily small number ε which is chosen by the user.
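A minimal sketch of Value Iteration under the assumptions of this section, with the reward attached to transitions (a hypothetical R(s, α, s′)) and a user-chosen ε, could look as follows:

import random

def value_iteration(S, A, P, R, gamma=0.9, eps=1e-6):
    """Value Iteration sketch for an MDP whose utility is a transition
    (reward) function. S: list of states; A(s): executable actions in s;
    P(s, a): dict mapping successors s2 to Pa(s2|s); R(s, a, s2): reward.
    Returns a policy mapping each state to a reward-maximizing action."""
    V = {s: random.random() for s in S}   # arbitrary initial estimates
    policy = {}
    while True:
        delta = 0.0
        for s in S:
            # expected discounted reward of each executable action in s
            q = {a: sum(p * (R(s, a, s2) + gamma * V[s2])
                        for s2, p in P(s, a).items())
                 for a in A(s)}
            best_a, best_v = max(q.items(), key=lambda kv: kv[1])
            delta = max(delta, abs(best_v - V[s]))
            V[s], policy[s] = best_v, best_a
        if delta < eps:   # consecutive iterations differ by less than eps
            return policy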
Although the initial version of PDDL was successful in establishing PDDL as a standard within the planning community and the International Planning Competition (IPC), the language received criticism for its lack of applicability to real world problems [142]. In the light of this criticism, Fox and Long [142] presented a new version of PDDL which is known as PDDL2.1. Following the PDDL2.1 standard, modelling numeric fluents, and not just binary ones, became possible. In this manner, resources such as time, energy and distance could now be represented and monitored. Moreover, the aforementioned standard introduced the concept of plan metrics, the usage of which allowed the minimization/maximization of fluents. This took the language one step further since planning was no longer just goal-driven but also utility-driven, i.e. the minimization/maximization of fluents could now be used complementarily to planning goals. Since PDDL2.1, which revolutionised PDDL, additional versions have appeared, providing various improvements. It is outside the scope of this thesis to present a full review of PDDL; the interested reader is referred to Ghallab et al. [134]. However, we will provide an intuition and definitions with respect to temporal planning domains, durative actions, planning problems as well as plans and plan metrics of PDDL2.1 (see Section 4.5.1).
Snippet 2 PDDL exemplar domain with types, predicates, functions and a simple durative
action for moving a robot from one location to another.
(define (domain example_1)
(:requirements :typing :fluents :durative-actions)
(:types Robot Location)
(:predicates
(at ?r - Robot ?l - Location)
(reachable ?l1 ?l2 - Location)
)
(:functions
(energy ?r - Robot)
(distance ?l1 ?l2 - Location)
)
(:durative-action move
:parameters (?r - Robot ?from ?to - Location)
:duration (= ?duration (* (distance ?from ?to) 1))
:condition (and (at start (at ?r ?from))
(at start (reachable ?from ?to)))
:effect (and (at start (not (at ?r ?from)))
(at end (at ?r ?to))
(at end (decrease (energy ?r) (distance ?from ?to))))
)
)
The functions energy and distance represent the remaining energy of some robot and the distance between two locations respectively. Let us now analyse the durative action, namely move, illustrated in Snippet 2. The parameters of the action are some robot and two locations. Notice that the parameters are variables of the aforementioned types. As such, ?from is a variable representing the location from which a robot is going to start moving while ?to is a variable representing the location to which the same robot is going to move. Duration represents the duration of the action and is equal to the distance between the two locations multiplied by one. The number 1 here is an arbitrary choice for demonstration purposes; it effectively represents a multiplication factor for estimating duration based on the distance travelled. Now, condition represents the
preconditions of the action. In the case of our example one of the preconditions for the
move action is that the robot needs to be at the location represented by the variable ?from.
The term at start represents a temporal annotation that enforces a temporal constraint.
That is, in the beginning (at start) of the interval of the action the robot represented by the
?r variable needs to be at the location represented by the ?from variable. In PDDL2.1 we
can have three temporal annotations in total that form constraints, namely at start, at
end and over all. Temporal annotations at end and over all refer to the end and the
full interval (from start to end) of the action respectively. Note that over all refers to an
open interval instead of a closed one, i.e. (t1 ,t2 ) instead of [t1 ,t2 ]. As such, precondition
at start (reachable ?from ?to) states that the location towards which the robot is moving
must be accessible at the start of the interval of the action (the point at which the action is
applied). Continuing with our analysis, the effect (postcondition(s)) of the move action
is that immediately after starting the execution of the action the robot is not at its initial
location while at the end of the action execution the robot is at the location it has moved
to and, at the end again, the remaining energy of the robot decreases relatively to the dis-
tance travelled, i.e. the distance between the two locations. At this point we would like to
emphasize that the durative action illustrated in Snippet 2 is a discrete action because its
effects occur only at the start and end points of the action's time interval. The action could be transformed into a continuous one by encoding, for example:
(decrease (energy ?r) (* #t 1))
instead of:
(at end (decrease (energy ?r) (distance ?from ?to)))
where #t refers to the continuously changing time from the start of a durative action during its execution [142] and the number 1 represents the rate at which the remaining energy is decreasing. In our example, 1 is an arbitrarily chosen number for demonstration purposes. Note that the terms discrete and continuous durative actions refer to their effects and not their preconditions.
Let us now present an exemplar problem for the domain illustrated in Snippet 2. The
purpose of the exemplar problem is to demonstrate PDDL problems and their components.
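One such encoding, consistent with the analysis in this section, is sketched below; the distance between location1 and location2 is an arbitrary illustrative value.

Snippet 3 PDDL exemplar problem for the domain illustrated in Snippet 2.

(define (problem example_1_problem)
(:domain example_1)
(:objects
robot1 - Robot
location1 location2 location3 location4 - Location
)
(:init
(at robot1 location1)
(= (energy robot1) 100)
(reachable location1 location2)
(reachable location1 location3)
(reachable location3 location4)
(= (distance location1 location2) 5)
(= (distance location1 location3) 10)
(= (distance location3 location4) 2)
)
(:goal (and
(at robot1 location4)
(>= (energy robot1) 0))
)
(:metric minimize (total-time))
)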
Snippet 3 illustrates such a problem. By observing Snippet 3 we can see that a PDDL
problem consists of three main building blocks, that is, objects, an initial state and a goal
state. Recall from the beginning of this section that we talked about the types of the objects
that are of interest. Those objects and problem objects are the same thing. The difference is that in a PDDL problem the objects themselves reside along with their types.
Table 4.1. Plan solving the problem illustrated in Snippet 3 given the temporal domain illustrated in Snippet 2.
Plan
(0.000: move robot1 location1 location3) [10.000]
(10.000: move robot1 location3 location4) [2.000]
In order
to encode both the initial and the goal states in a PDDL problem we utilize the predicates
and functions that are defined in the PDDL domain, instantiated with the objects. This is logical since the objects are the ones whose states need to change given the goal. In order
to better understand the building blocks of PDDL problems let us analyse the problem
presented in Snippet 3. We have in total five objects of which four are locations and one
is a robot. The initial state of the problem represents that the robot (robot1) is at location1
and the robot has energy equal to 100 units. Moreover, the initial state represents which
locations are reachable from other locations. In our case location2 is reachable only from location1, as is location3, while location4 is reachable from location3. Furthermore, the initial state also encodes the distances between
the locations that are reachable from each other. Regarding the goal state, it encodes that
the robot should be at location4 and its energy at that location must be >= 0. There is one
last thing to observe in Snippet 3 and that is the metric at the end of the problem. PDDL2.1
supports metrics for optimizing criteria in the form of minimization or maximization. Such
criteria are optional as opposed to goals. That is, they can be omitted. In our case the
metric instructs the planner to devise a plan which will be minimizing the total time8 . That
is, the planner will attempt to devise a plan which will satisfy the goal state, given the initial
state of course and the objects but at the same time the devised plan needs to be such that
will minimize the temporal span of the entire plan. Now, given the temporal domain and
the problem illustrated in Snippets 2 and 3 respectively we can use a PDDL planner that
supports PDDL2.1 to formulate a temporal plan (see Table 4.1). The inputs to the planner
are the temporal domain and the problem. In fact, every PDDL planner requires a domain
and a problem as inputs in order to generate a plan. For generating the plan illustrated in
Table 4.1 we used the OPTIC temporal planner. OPTIC is a planner that is briefly described
in Section 4.6. As far as the illustrated plan is concerned, the numbers on the left represent
the time points at which each action will start executing while the numbers on the right represent the duration of each action. As such, one can see that robot1 will start moving from location1 towards location3 at time point t1 = 0 while the time interval for that action (duration) is 10 seconds. The next action will commence at time point t2 = 10, which is the starting time point for the action through which robot1 will transition from location3 to location4. The duration of that second action will be 2 seconds. As such, the duration of the whole plan
will be 12 seconds in total. At this point we would like to emphasize the fact that since we
have durative actions that are situated in time we can have concurrency, i.e. more than one action executing at the same time. Let us now provide some core definitions as well as definitions of what we have presented so far. Note that we will not be providing definitions for continuous durative actions as this is outside the scope of this section, and of this thesis. We have, however, touched upon the concept of continuous durative actions earlier in this section. The definitions are taken directly from the foundational paper of PDDL2.1 [142].
8 The term total-time shown in Snippet 3 is built-in in PDDL2.1 and refers to the temporal span of the entire plan.
For better readability let us first provide a “roadmap” for the definitions so that the interested reader can navigate through them more easily. Definition 4.31 defines simple planning
instances, Definition 4.32 defines logical states and states. Definition 4.33 defines the as-
signment proposition. Definition 4.34 defines the normalization of ground propositions.
Definition 4.35 defines actions and ground actions. Definition 4.36 defines valid ground ac-
tions. Definition 4.37 defines updating functions for valid ground actions. Definition 4.38
defines when ground propositions are satisfied. Definition 4.39 provides a definition for
the applicability of ground actions. Definition 4.40 defines simple plans. Definition 4.41
provides a definition for mutex (mutually exclusive) actions. Definition 4.42 provides a def-
inition for the execution of happenings (see also Definition 4.40). Definition 4.43 provides
a definition for the executability of simple plans while Definition 4.44 provides a definition
for the validity of simple plans. Definition 4.45 provides a definition for discrete durative
and ground durative actions. Definition 4.46 defines plans with discrete durative actions
while Definition 4.47 defines induced simple plans given Definition 4.46 for plans. Finally,
Definition 4.48 provides a definition for the executability of plans containing discrete du-
rative actions while Definition 4.49 defines what it means for a plan containing discrete
durative actions to be valid. As a final remark before proceeding with the definitions we
would like to urge the reader to come back to this section and all the definitions that will be
presented here in the event that he/she has any doubts about the PDDL code presented in
Chapters 7, 8.
Definition 4.31 A simple planning instance is defined to be a pair I = (Dom, Prob) where
Dom = (Fs, Prs, As, arity) is a 4-tuple consisting of finite sets of function symbols, predi-
cate symbols, non-durative actions, and a function arity mapping all of these symbols to
their respective arities. Prob = (Os, Init, G) is a triple consisting of the objects in the do-
main, the initial state specification and the goal state specification respectively. The prim-
itive numeric expressions of a planning instance (PNEs) are the terms constructed from
the function symbols of the domain applied to (an appropriate number of) objects drawn
from Os. The dimension of the planning instance (dim), is the number of distinct primitive
numeric expressions that can be constructed in the instance. The atoms of the planning
instance (Atms), are the (finitely many) expressions formed by applying the predicate sym-
bols in Prs to the objects in Os (respecting arities). Init consists of two parts: Initlogical and Initnumeric. Initlogical is a set of literals formed from the atoms in Atms while Initnumeric is a set of propositions asserting the initial values of a subset of the primitive numeric
expressions of the domain. These assertions each assign to a single primitive numeric ex-
pression a constant real value. The goal condition is a proposition that can include both atoms formed from the relation symbols and objects of the planning instance and numeric comparisons over the primitive numeric expressions of the instance.
Definition 4.32 Given the finite collection of atoms for a planning instance I, AtmsI , a
logical state is a subset of AtmsI . For a planning instance with dimension dim, a state is a
tuple in (R, P(AtmsI), R⊥^dim), where R⊥ = R ∪ {⊥} and ⊥ denotes the undefined value. The
first value is the time of the state, the second is the logical state and the third value is the
vector of the dim values of the dim primitive numeric expressions in the planning instance.
The initial state for a planning instance is (0, Initlogical , X) where X is the vector of values
in R⊥ corresponding to the initial assignments given by Initnumeric (treating unspecified
values as ⊥).
Definition 4.33 The syntactic form of a numeric effect consists of three things. First, one of
the assignment operators (assign, increase, decrease, scale-up, scale-down). The assign op-
erator assigns a value, the increase operator increases a value and it represents the C pro-
gramming language style +=, the decrease operator decreases a value and it represents the
C programming language style -=, the scale-up operator scales up a value and it represents
the C programming language style *= and the scale-down operator scales down a value
and it represents the C programming language style /=. Second, one primitive numeric
expression, referred to as the lvalue. Third, a numeric expression (which is an arithmetic
expression whose terms are numbers and primitive numeric expressions), referred to as the
rvalue. The assignment proposition corresponding to a numeric effect is formed by replacing the assignment operator with its equivalent arithmetic operation (that is, (increase p q) becomes (= p′ (+ p q)) and so on) and then annotating the lvalue with a “prime”. A numeric
effect in which the assignment operator is either increase or decrease is called an additive
assignment effect, one in which the operator is either scale-up or scale-down is called a
scaling assignment effect and all others are called simple assignment effects.
Definition 4.34 Let I be a planning instance of dimension dimI and let indexI : PNEsI → {1, ..., dimI} be an (instance-dependent) correspondence between the primitive numeric expressions and integer indices into the elements of a vector of dimI real values, R⊥^dimI. The normalised form of a ground proposition, p, in I is defined to be the result of substituting for each primitive numeric expression f in p the literal X_indexI(f). The normalised form of p will be referred to as N(p). Numeric effects are normalised by first converting them into assignment propositions. Primed primitive numeric expressions are replaced with their corresponding primed literals. X is used to represent the vector ⟨X1, ..., Xn⟩.
For the next definition of actions and ground actions (Definition 4.35) we are going to
assume that we have no quantifiers and no conditional effects. For a definition considering
quantifiers and conditional effects please refer to Fox and Long [142].
Definition 4.35 Given a planning instance, I, containing an action schema A ∈ AsI , the
set of ground actions for A, GAA , is defined to be the set of all the structures, α , formed
by substituting objects for each of the schema variables in each schema, X, where the
components of α are:
• Name is the name from the action schema, X, together with the values substituted for the parameters of X in forming α.
• Preα, the precondition of α, is the set of ground propositions, both atoms and numeric comparisons, formed from the precondition of X by the same substitution.
• Addα, the positive postcondition of α, is the set of ground atoms that are asserted as positive literals in the effect of α.
• Delα , the negative postcondition of α , is the set of ground atoms that are asserted as
negative literals in the effect of α .
• NPα , the numeric postcondition of α , is the set of all assignment propositions corre-
sponding to the numeric effects of α .
The following sets of primitive numeric expressions are defined for each ground action,
α ∈ GAA :
• Lα = {f | f appears as an lvalue in α}
• Rα = {f | f appears in an rvalue or in a comparison in α}
• Lα∗ = {f | f appears as an lvalue in an additive assignment effect in α}
Definition 4.36 Let α be a ground action. α is valid if no primitive numeric expression ap-
pears as an lvalue in more than one simple assignment effect, or in more than one different
type of assignment effect.
Definition 4.37 Let α be a valid ground action. The updating function for α is the composition of the set of functions {NPFp : R⊥^dim → R⊥^dim | p ∈ NPα} such that NPFp(X) = X′, where for each primitive numeric expression xi′ that does not appear as an lvalue in N(p), xi′ = xi, and N(p)[X′ := X′, X := X] is satisfied. The notation N(p)[X′ := X′, X := X] should be read as the result of normalising p and then substituting the vector of actual values X′ for the primed formal parameters X′ and the vector of actual values X for the formal parameters X.
Definition 4.38 A ground proposition, p, is satisfied in a state s if its normalised form, N(p), evaluates to true given the logical part of s and the values that s assigns to the primitive numeric expressions. Comparisons involving ⊥, including direct equality between two ⊥ values, are all undefined, so that enclosing propositions are also undefined and not satisfied in any state.
Definition 4.39 Let α be a ground action. α is applicable in a state s if Preα is satisfied in s.
Definition 4.40 A simple plan, SP, for a planning instance, I, consists of a finite collection of timed simple actions which are pairs (t, α), where t is a rational-valued time and α is an
action name. The happening sequence, {ti }i=0...k for SP is the ordered sequence of times
in the set of times appearing in the timed simple actions in SP. All ti must be greater than
0. It is possible for the sequence to be empty (an empty plan). The happening at time t, Et ,
where t is in the happening sequence of SP, is the set of (simple) action names that appear
in timed simple actions associated with the time t in SP.
Definition 4.41 Two valid ground actions α and β are mutex (mutually exclusive) if they interfere with each other, that is, if they violate any of the non-interference conditions given in [142]. Among those conditions, the one concerning numeric effects requires that any primitive numeric expression affected by both actions is affected only through additive assignment effects:
• Lα ∩ Lβ ⊆ Lα∗ ∪ Lβ∗
Definition 4.42 Given a state, (t, s, x) and a happening, H, the activity for H is the set of
ground actions AH = {α |the name for α is in H, α is valid and Preα is satisfied in (t, s, x)}.
The result of executing a happening, H, associated with time tH , in a state (t, s, x) is un-
defined if |AH | ̸= |H| or if any pair of actions in AH is mutex. Otherwise, it is the state
(tH , s′ , x′ ) where:
• s′ = (s \ ∪α∈AH Delα) ∪ (∪α∈AH Addα)
• x′ is the result of applying the composition of the updating functions of the actions in AH to x.
Definition 4.43 A simple plan, SP, for a planning instance, I, is executable if it defines a
happening sequence, {ti }i=0...k , and there is a sequence of states, {Si }i=0...k+1 such that
S0 is the initial state for the planning instance and for each i = 0...k, Si+1 is the result of
executing the happening at time ti in SP. The state Sk+1 is called the final state produced
by SP and the state sequence {Si }i=0...k+1 is called the trace of SP. Note that an executable
plan produces a unique trace.
Definition 4.44 A simple plan SP, for a planning instance, I, is valid if it is executable and
produces a final state S, such that the goal specification for I is satisfied in S.
As in the case of the definition of simple ground actions (Definition 4.35), we are going to present a definition for discrete durative ground actions without considering quantifiers or conditional effects (see Definition 4.45). For a definition considering quantifiers and conditional effects please refer to Fox and Long [142].
Definition 4.45 Durative actions are grounded in the same way as simple actions (see
Definition 4.35), by replacing their formal parameters with constants from the planning
instance. The definition of durative actions requires that the condition be a conjunction of
temporally annotated propositions. Each temporally annotated proposition is of the form:
(at start p) or (at end p) or (over all p), where p is a non-annotated proposition. Similarly,
the effects of a durative action (without continuous or conditional effects) are a conjunction
of temporally annotated simple effects. The duration field of DA defines a conjunction of propositions that can be separated into DC_start^DA and DC_end^DA, the duration conditions (DC) for the start and end of DA, with terms being arithmetic expressions and ?duration. The separation is conducted in the obvious way, placing at start conditions into DC_start^DA and at end conditions into DC_end^DA. Each ground durative action, DA, with no continuous effects and no conditional effects defines two parametrised simple actions DAstart and DAend, where the parameter is the ?duration value, and a single additional simple action DAinv, as follows. DAstart (DAend) has precondition equal to the conjunction of the set of all propositions, p, such that (at start p) ((at end p)) is a condition of DA, together with DC_start^DA (DC_end^DA), and effect equal to the conjunction of all the simple effects, e, such that (at start e) ((at end e)) is an effect of DA (respectively). DAinv is defined to be the simple action with precondition equal to the conjunction of all propositions, p, such that (over all p) is a condition of DA. It has an empty effect. Every conjunct in the condition of DA contributes to the precondition of precisely one of DAstart, DAend or DAinv. Every conjunct in the effect of DA contributes to the effect of precisely one of DAstart or DAend. For convenience, DAstart (DAend, DAinv) will be used to refer to both the entire (respective) simple action and also to just its name.
Definition 4.46 A plan, P, with durative actions, for a planning instance, I, consists of a finite collection of timed actions which are pairs, each either of the form (t, α), where t is a rational-valued time and α is a simple action name (an action schema name together with the constants instantiating the arguments of the schema), or of the form (t, α[t′]), where t is a rational-valued time, α is a durative action name and t′ is a non-negative rational-valued duration.
Definition 4.47 If P is a plan then the happening sequence for P is {ti }i=0...k , the ordered
sequence of time points formed from the set of times {t|(t, α ) ∈ P or (t, α [t ′ ]) ∈ P or (t −
t ′ , α [t ′ ]) ∈ P}. The induced simple plan for a plan P, simplify(P), is the set of pairs defined
as follows:
• (t, α) for each timed simple action (t, α) ∈ P;
• (t, αstart) and (t + t′, αend) for each (t, α[t′]) ∈ P;
• ((ti + ti+1)/2, αinv) for each pair (t, α[t′]) ∈ P and for each i such that t ≤ ti < t + t′, where ti and ti+1 are in the happening sequence for P.
Definition 4.48 A plan, P (for a planning instance), is executable if the induced simple
plan for P, simplify(P) is executable, producing the trace {Si = (ti , si , vi )}i=0...k .
Definition 4.49 A plan, P (for a planning instance), is valid if it is executable and if the
goal specification is satisfied in the final state produced by its induced simple plan.
The work presented by Saigol in [145] is concerned with planning for autonomous
hydrothermal vent prospecting using AUVs. In doing so, the author formulated the problem as a POMDP in order to deal with the limited information on the location of vents that results from the uncertainty in vehicle sensor readings. The experimental results demonstrate the efficiency of the system in locating hydrothermal vents.
Another research output is the work presented by Fox et. al. in [146] where plan-based
policy learning is utilised for tracking biological ocean features autonomously using AUVs.
More specifically, the authors have applied the aforementioned approach for tracking the
surfaces of harmful algal blooms. Tracking is modelled as a planning problem where an AUV has to decide how to detect the edges of blooms and navigate their boundaries while at the same time conserving energy and avoiding danger [146]. The rationale behind plan-
based policy learning is to exploit sampling and classification in order to learn a policy for
tracking. As such the problem is modelled as a deterministic one using PDDL. Several
instances of the deterministic problem are chosen (by sampling) in simulation in a way that
they cover a variety of situations likely to be seen in the real world and they are then solved by a PDDL planner. Each solution is used as a training example for learning a policy. That is,
given the set of solutions to the deterministic problems, decision-tree learning is applied, classifying decision variables with action choices and forming in this way the state-action pairs that effectively constitute the policy. In this manner, although uncertainty arising from dispersion and change of the blooms is not directly modelled, the choice of “interesting” problem instances that reflect situations likely to be seen in the real world leads to robust solutions. However, as the authors state, producing robust policies can take several days of training. Of course, the number of days is dependent on the number of samples used for training, which in turn affects the robustness of the learned policy.
T-REX is another research output with respect to planning for autonomy [147]. It was
developed at the Monterey Bay Aquarium Research Institute (MBARI) for adaptive mission
control of AUVs. T-REX is an adaptive constraint-based temporal planning system that is
based on NASA's EUROPA (Extensible Universal Remote Operations Planning Architecture)
version 2, a library and tool set for building planners that follow the constraint-based
temporal planning approach [148]. T-REX comprises control agents (Teleo-Reactors) each
of which follows a Sense-Plan-Act (SPA) approach to control. Each control agent has a
different functional and temporal scope [149]. The functional scope refers to the goals the
agent: i) understands, ii) can plan for, iii) can monitor the progress of. The temporal scope,
on the other hand, refers to how far into the future the agent has to generate a plan.
Experimental results from using the framework over a period of a year in various sea
trials have shown that utilising a partial plan representation, which can be both refined and
extended during mission execution, can minimize replanning occurrences that would otherwise
result from effectively invalidating a mission plan [150].
Of particular interest is the PANDORA project [151] which was focused on persistent
autonomy for AUVs. One of the aspects of the project was to focus on AUVs undertaking
underwater inspection tasks of submerged structures and intervention tasks in the form of
valve-turning and chain-cleaning in the presence of unexpected events and faults. Valve-
turning refers to an intervention task in which a valve in an underwater structure needs
to be turned, for instance in the case of an underwater oil rig, while chain-cleaning refers
to an intervention task in which a chain, e.g. an anchor chain, needs to be cleaned. The
approach followed was deterministic PDDL-based temporal planning [152] with replanning
in the presence of vehicle component faults and action failures demonstrating in this way
an adaptive behaviour. Knowledge about the environment, the vehicle state, its components,
capabilities and available actions is expressed through ontologies, from which the planning
and execution system also benefits in order to devise and execute plans for the
aforementioned tasks. Experimental results have shown the ability of the
framework to successfully adapt under task failures and faults. Earlier we stated that
PANDORA is of particular interest. This interest stems from its conceptual closeness
to our project and as such we will see in Chapter 7 how PANDORA and our project are
related and how they differ in the context of persistently autonomous operations.
One last remark to make about PANDORA is that its planning and execution system was
later materialised as a standalone framework called ROSPlan [153]. ROSPlan, as its name
suggests, is a framework for planning tasks within the Robot Operating System (ROS) (for
ROS see Chapter 6). We will talk more about ROSPlan in Chapter 7.
With respect to temporal planning, we have presented an approach based on temporally
qualifying expressions and temporal databases without formalising a specific language
but rather focusing on the semantics of the underlying representation. In formally
presenting definitions for temporally qualifying expressions, temporal databases, domains,
actions, problems, plans and solutions we have utilised notions and concepts from point and
temporal algebras. As opposed to classical planning, temporal planning does not make the
assumption of implicit time. As such, actions have durations and can overlap since
they are situated in time. Consequently, temporal planning is better suited for real world
applications.
With respect to planning under uncertainty, we do not assume a deterministic state-transition
system. In addition, we do not assume that goals are restricted, as opposed to
classical planning in which we make both assumptions. The lack of the latter assumption
helps us to specify goals of different strengths while the lack of the former allows us to
encode actions that have non-deterministic effects. One of the approaches that we presented
is based on planning with Markov decision processes, where we provided formal definitions
about the process and described planning algorithms. In addition, we have presented a
variation of Markov decision processes (MDPs) called partially observable Markov decision
processes (POMDPs) which can be used when we do not make the assumption of a fully
observable state-transition system. In the case of both MDPs and POMDPs, planning is an
optimization problem over utility functions, and both MDPs and POMDPs utilise probabilities
for handling uncertainty. Finally, we have briefly described the model checking approach
for planning under uncertainty which, as opposed to MDPs and POMDPs, does not utilise
probabilities for encoding uncertainty but rather a general non-deterministic state transition
system in which the execution of an action can lead to many different states. Again, as in
the case of temporal planning, planning under uncertainty approaches are better suited for
real world applications.
Further to the presentation of the classical and temporal planning fields as well as planning
under uncertainty within the AI planning domain, we have focused on PDDL and specifically
PDDL2.1, which supports temporal planning. We have done so because we will be utilising
temporal planning based on PDDL temporal planners. PDDL2.1 was presented
with formal definitions of domains, durative actions, problems etc. However, we focused
our formal definitions on discrete durative actions rather than continuous ones, since
the former will be used in the thesis. In the context of temporal planning and PDDL we have also
briefly described OPTIC, a PDDL temporal planner.
Finally, in this chapter we have presented research outputs with respect to planning for
autonomy. The approaches presented include POMDPs, plan-based policy learning and
temporal planning. Of particular interest are the PANDORA project, due to its conceptual
closeness to our project, and ROSPlan, a standalone framework for task planning within
ROS.
Chapter 5

Maritime Situation Awareness
As was stated early on in the thesis, where we presented our problem statement (see Sec-
tions 1.1, 1.1.2), maritime situation awareness plays a key role in promoting maritime se-
curity. Maritime security is one of the two aspects of the maritime defence and security
domain1 which this thesis is concerned with and as such the purpose of this chapter is to
present an overview of the field of maritime situation awareness. In doing so, we present
common sources of information in building maritime situation awareness in Section 5.1. In
Section 5.2 we analyse the role of context, prior knowledge, and uncertainty in the domain
while in Section 5.3 we present data fusion models that are commonly used in the field.
Finally, in Section 5.4 we present pieces of work on maritime situation awareness systems.
safety [7]. As such, weather conditions are considered another source of information used
in building maritime situation awareness. Outwith information originating from sensors,
valuable information gained from human intelligence reports, if any, is also used. Such
information is gathered from various sources: intelligence agencies that have infiltrated
organizations involved in illegal activities, accessories that have been caught and turned into
informants, as well as “competitors” that seek to destroy rival organizations involved in the
same or similar illegal activity, to name but a few.
Such prior knowledge, in context, can be represented as a rule of the following form (the predicates here are illustrative):

inRestrictedArea(Vessel) ∧ lowSpeed(Vessel) ⇒ abnormalBehaviour(Vessel)

This is nothing more than the representation of a deterministic view of maritime situation
awareness in context. However, prior knowledge under uncertainty would also attach
some sort of weight or probability to the aforementioned representation as follows:

0.85 : inRestrictedArea(Vessel) ∧ lowSpeed(Vessel) ⇒ abnormalBehaviour(Vessel)

denoting that in 85% of the cases the antecedent implies abnormal behaviour while in 15% of
the cases it does not.
Prior knowledge is not easy to obtain. In the case of human operators it is synonymous
with many hours of theoretical training and experience gained from dealing with a significant
number of situations and incidents, while in the case of maritime situation awareness
systems, prior knowledge is either encoded with the contribution of domain experts or gained
by learning from data.
Figure 5.1. The OODA (Observe, Orient, Decide, Act) loop. FF corresponds to feed forward flow of information while FB
to feedback.
During the orientation stage (orient) an understanding of the unfolding situation at hand is obtained. Next, the outcome of the orientation stage
is fed forward into the decision stage (decide), whose purpose is to determine a course of
action by weighing all available options and picking the most suitable. Actions are then
executed during the action stage (act) and the outcome is used as feedback to the observation
stage along with additional, up to date information from all available sources. This
effectively initiates another OODA cycle and the process continues as described above.
Level 1 - Entity Assessment: At the center of this Level lie the entities that are of interest.
More specifically, Level 1 fusion processes are tasked with the “estimation of the identity,
classification (in some given taxonomic scheme), attributes, activities, location, and the
dynamics and potential states of entities”.
Level 3 - Impact Assessment: Given some predefined objectives (e.g. mission goals),
Level 3 fusion processes are concerned with assessing the impact of a previously assessed
situation on those objectives.
Level 4 - Performance Assessment: As its name suggests, Level 4 fusion processes are concerned with assessing the
performance of the data fusion system. This is achieved by combining information to
estimate measures of performance as well as measures of effectiveness, abbreviated as MOP
and MOE respectively, given a set of desired data fusion system states and/or responses.
installations or environmental interest. As such, high level threats are treated as a combina-
tion of lower level anomalous behaviours. The approach is demonstrated using simulated
data. Gaussian processes for maritime anomaly detection are used in [163]. Gaussian
processes are actively learned from AIS data, i.e. by subsampling the data to reduce the
complexity of training the model. Given the trained model the authors calculate a mea-
sure of normality for each newly acquired AIS transmission in order to identify potentially
anomalous behaviours of vessels based on their speed as well as longitude and latitude.
Another research path followed in the field is the usage of ontologies in building mar-
itime situation awareness systems. In the work presented in [164] the authors present an
approach for detecting abnormal ship behaviour based on a spatial ontology which is as-
sociated with a rule-based geographical inference engine. The hierarchy of the ontology
is developed with the help of experts while its population is based on a dataset comprising
the routes of more than 2000 vessels in the Mediterranean Sea. Similarly, rules are defined
based on direct expert knowledge (i.e. expert interviews). In this manner contextual knowl-
edge is taken into consideration. Similarly, a prototype of a maritime situation awareness
system based on ontologies is presented in [165]. More specifically, the developed Mar-
itime Domain Ontology (MDO) as the authors call it, is used in conjunction with descrip-
tion logics inference for detecting behavioural anomalies as well as classifying vessels of
interest. Yet another research output in the direction of placing ontologies in the center of
a maritime situation awareness system is presented in [166]. More specifically, a maritime
situation awareness ontology is utilized for fusing information from various sources includ-
ing other maritime situation awareness systems while a predefined set of rules is used for
identifying abnormal behaviours. One thing that the aforementioned ontology-based ap-
proaches share in common is that the usage of ontologies is ideal for encoding context and
relations among entities. However, another thing that they share is the lack of mechanisms
that will enable them to deal with uncertainty. Identifying this limitation, researchers in the
field started combining ontologies with mechanisms for dealing with uncertainty.
Gómez-Romero et al. in [8] present a context-based multilevel information fusion
framework for harbour surveillance. In their work ontologies are used both for encoding
factual and contextual knowledge. Upon this knowledge the framework initially applies de-
ductive and rule-based reasoning in order to, as the authors state, “extend tracking data and
to classify objects according to their features”. Next the framework combines the Belief
Argumentation System (BAS) which performs logic-based abductive reasoning [167] with
the Transferable Belief Model (TBM) [168] in order for reasoning under uncertainty to be
supported, correct assessment of situations to be achieved and the level of the threat to be
determined. Integration of Bayesian Networks (BNs) and ontologies for identifying pirate
activities in the Gulf of Aden is the approach followed in [169]. More specifically, the
Intelligent Maritime Awareness Situation System (IMASS), as the authors call it, is based
on: i) an ontology for storing observations and for encoding relations among entities in
the environment and ii) a set of logical rules in the form of logical conjunctions for defin-
ing normal or abnormal behaviours. The logical rules however, strangely, are not directly
used in assessing the situation but rather have an activation role, i.e. a deterministic rule
will trigger its BN counterpart to provide probabilistic inference instead of deterministic
inference. According to the authors, deterministic rules are easier for non-experts to understand
as opposed to BNs, justifying in this way their existence. Moving on, Carvalho et al. in
[170] present an approach for modelling probabilistic ontologies for supporting maritime
situation awareness. The modelling approach is based on the Uncertainty Modelling Pro-
cess for the Semantic Web (UMP-SW) [171] while probabilistic ontologies are encoded in
PR-OWL [172], an upper OWL ontology used for representing uncertainty. Laskey et al.
in [173] present a case study of deploying such an ontology in support of a human operator
in identifying vessels demonstrating suspicious behaviour. Reasoning on the probabilistic
ontology is achieved through Multi-Entity Bayesian Networks (MEBNs), hybrid networks
that combine logic with probability theory (see Section 2.3.1). The results demonstrate the
effectiveness of the approach. MEBNs fall under the SRL scientific domain and are one
of the two SRL approaches known to us being used in the maritime situation awareness
field, the other being the work presented in [5] which is based on Markov logic networks.
MLNs are used for fusing sensor information with context and providing reasoning under
uncertainty for assessing the situation in two different scenarios. The first is concerned with
the identification of rendezvous of suspicious vessels while the second is concerned with
cargo vessels approaching a port carrying materials which may be hazardous when
combined. The proposed MLN-based system covers the first three fusion Levels (Levels 0-2) of
the JDL model presented in Section 5.3.2. Both the structure and the weights of the MLNs
are determined by domain experts. The results demonstrate the impact of context in as-
sessing a situation correctly and the overall advantages of combining first-order logic with
probabilistic graphical models in a unified model, i.e. an MLN, in the context of enhancing
maritime situation awareness.
situation. Regarding maritime situation awareness systems based purely on ontologies, they
are very efficient in capturing relational knowledge in the domain of application and con-
sequently for encoding context. Unfortunately, as in the case of expert systems, they lack
mechanisms for addressing uncertainty. However, when combined with such mechanisms
they constitute a powerful tool in providing enhanced maritime situation awareness. The
combination, however, can be either in the form of the two retaining their independence
and combining their outcomes, as in the case of [8, 169], or in the form of truly unifying
them (logic and probability) in a single representation as in the case of MEBNs and MLNs
(see [170, 173] and [5] respectively). Only in this manner can we achieve the best of both
worlds, i.e. logic and probability theory (see Section 2.4).
Chapter 6

General Thesis Background
This rather brief chapter covers general thesis background by presenting common
hardware components found on AUVs (see Section 6.1) as well as the Robot Operating
System (see Section 6.2). The content of the chapter is useful in understanding concepts
presented in Chapters 7 and 8, where we present our work with respect to persistently
autonomous underwater operations (Chapter 7) and application scenarios within the MCM
context (Chapter 8).
• Sonar: Sonar is short for SOund Navigation And Ranging. There are two types of
sonar devices (sensors): i) passive sonar and ii) active sonar. A passive sonar, as
its name suggests, passively listens to the sounds created by objects in the water,
whilst an active sonar creates sound pulses which are emitted into the water in order
to actively detect reflections and consequently the objects that created them. AUVs
commonly use active sonar devices mounted at various angles (e.g. forward sonar,
downward sonar etc.).
• Camera & Range Camera: A camera produces a continuous stream of images while
a range camera does the same but with range images. That is, it produces images
that show the distance to specific points in a scene from another specific point (depth
images). Cameras and range cameras can be placed at any desirable angle (e.g. forward
(range) camera, downward (range) camera).
• GPS: A GPS (Global Positioning System) device provides position
information expressed in geographic latitude and longitude. In order to achieve
this, GPS devices communicate with satellites in orbit. GPS devices are commonly
used by AUVs in order to get a position fix and/or track movement whilst on the
surface or very close to it, since satellite signals usually cannot penetrate more than
about a metre of water.
• Compass: A compass shows the orientation of the vehicle with respect to the directions
of north, east, south and west.
• Doppler Velocity Log: A Doppler Velocity Log (DVL) is a device that provides an
estimate of the linear speed at which an AUV is moving.
• Temperature Sensor: Temperature sensors are commonly used by AUVs for
two main purposes. When placed inside an AUV body, temperature sensors can
provide early warnings of overheating components, while when placed outside they
provide temperature readings of the external environment, i.e. the water.
• Thruster(s) & Battery(ies): Typically AUVs are equipped with a set of thrusters
that enable them to move underwater or on the water surface. Batteries, on the other
hand, are the power source for the thrusters as well as any other component on the AUV.
Figure 6.1. ROS nodes communicating at runtime over topics (publish/subscribe) and via services (service call/response).
In contrast to topics, which provide asynchronous communication, a service offers synchronous communication between nodes using a pair of messages: one
for requesting data and one for receiving a response. The node offering a service is called a
server while the node that sends the request message to the server and awaits a response
is called a client.
Finally, software in ROS is organised in packages. A ROS package can contain ROS
nodes, libraries, datasets or anything else useful for an application. ROS packages can also
be grouped together into ROS stacks.
Figure: actionlib client-server communication at runtime via goal, cancel, result and feedback topics (publish/subscribe).
We have not covered specialized hardware since, based on application needs, such components may vary. On the contrary, the focus was
on presenting common AUV hardware components. That is, components that are most
likely to be found on AUVs irrespective of specialized application needs. In addition,
we have briefly presented ROS, which is one of the most popular platforms for developing
robotic applications. In doing so, we have analysed the main building blocks of ROS and
their functionality. Furthermore, we have presented the actionlib package of ROS, which
provides the tools for creating action servers for executing actions.
Part III
Chapter 7

A Framework for Persistently Autonomous Operations
Figure 7.1. High level framework architecture for persistently autonomous operations: knowledge representation (ontologies) covering the world, mission, actions, capabilities and components; reasoning; perception; and a planning and execution system comprising a mission planner and a mission executor.
The ontologies are internally translated into Prolog terms, forming in this
way a Prolog KB, and all the aforementioned operations on knowledge are performed by
sending Prolog queries via ROS services. Even though the integration with ROS makes the
system easily accessible in principle, it was still required that the user be familiar with
Prolog syntax in order to formulate queries. To address this limitation we have developed
an additional knowledge interface that masks Prolog syntax by providing Python methods
that perform Prolog queries via ROS services behind the scenes.
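To give a flavour of what the interface masks, the following is a minimal sketch of the kind of Prolog queries sent over the ROS services (namespace prefixes and instance names are purely illustrative):

% Is Nessie1 an instance of the Nessie vehicle class?
?- rdfs_individual_of(robot:'Nessie1', robot:'Nessie').

% Which components is the vehicle instance currently linked to?
?- owl_has(robot:'Nessie1', comp:'hasComponent', Comp).

The Python methods of the interface simply construct such query strings, forward them over the corresponding ROS service and return the resulting variable bindings to the caller.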
Regarding reasoning, our framework offers a set of reasoning modules that are imple-
mented in Prolog. These modules interact with the KB (via the knowledge interface) dur-
ing missions driving real time autonomous decision making and adaptation. These cover
the spectrum of vehicle capability, action discovery and fault recovery based on the vehi-
cle's components' health status, classification of detections from the perception system as
well as different behaviours based on high level mission priority changes communicated
to the vehicle. Our reasoning modules constitute adjustments and additions to the existing
KnowRob reasoning capabilities so that autonomous underwater operations can be accom-
modated (see Section 7.2.3).
Finally, the framework is complemented by a mission planning and execution system
which interacts with the framework’s KB (mission planning and execution ontologies) via
the knowledge interface. The architecture of the planning and execution system is based
on the work presented in [176] as part of the PANDORA project [151]. The system com-
prises one planner, one executor and ROS actionlib actions. Given information residing
in the mission planning ontology the planner generates mission plans which are communi-
cated to the executor and are logged into the mission execution ontology. The generated
plans at this stage are only symbolic action representations which are transformed by the
executor into actionlib actions and are executed. Execution outcomes are monitored by
the executor which updates the execution ontology accordingly. Given the execution status
(which is communicated to the planning ontology to update its own status), potential vehi-
cle faults and potentially new high level mission priorities the executor can initiate a new
planning-execution cycle if necessary. This occurs until mission objectives are satisfied to
the maximum extent possible. This adaptive mission planning and execution approach will
be explained in more detail in Section 7.3.
The Feature class is intended for features that are detected during the search for objects
in the environment. As can be observed by looking at Figure 7.2 we have modelled a Circle
subclass for circular features which are represented as instantiations (instances) of the sub-
class and are created by a circle feature detection system. Each circle instantiation is related
(linked) to its radius through a radius float data property. Moreover, each such instance is
related to its position in 3D space through a position object property that connects it to an
instance of the Position3D class. The instance of the Position3D class is linked to a 3D
position in space via north, east, depth float data properties. This interlinking propagates
the information to the Circle class instance. Depending on the nature of autonomous op-
erations and the availability of feature detectors on the vehicle the ontology can be easily
extended with additional feature classes as well as object and data properties necessary for
representing them. Moving on, we have created an Object class which is intended for as-
serting objects in our KB by creating appropriate instantiations given feature instantiations
and an appropriate classification system. Of course, each object instance is associated with
a Position3D instance via a position object property and again, based on application needs,
it may be associated with additional properties such as a probability of being that object
given of course a probabilistic classifier. Finally, Vector subclasses are used for linking
“things” to geometrical information. For instance we mentioned that objects are associ-
ated with positions in 3D space. Another example of using the vector classes is to express
the positions and orientations (poses) of waypoints that the vehicle can use for navigation.
The orientation does not refer to a waypoint per se, but rather to the orientation the vehicle
should have when visiting this waypoint. How waypoints are generated and the different
types of waypoints will be presented in Sections 7.3.1 and 7.3.2.
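As a minimal sketch of what such assertions look like in practice (namespace prefixes and instance names are illustrative, not taken verbatim from our ontology files), a detected circular feature could be asserted as follows:

% Create a circle instance produced by the circle feature detector
rdf_assert(map:'Circle1', rdf:type, map:'Circle'),
% Link it to its radius through the radius float data property
rdf_assert(map:'Circle1', map:'radius', literal(type(xsd:float, '0.5'))),
% Create its 3D position and link the two via the position object property
rdf_assert(map:'Position3D_1', rdf:type, map:'Position3D'),
rdf_assert(map:'Position3D_1', map:'north', literal(type(xsd:float, '10.2'))),
rdf_assert(map:'Position3D_1', map:'east', literal(type(xsd:float, '4.7'))),
rdf_assert(map:'Position3D_1', map:'depth', literal(type(xsd:float, '2.0'))),
rdf_assert(map:'Circle1', map:'position', map:'Position3D_1').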
At the third level lies the actions ontology, which imports the capabilities ontology and encodes dependency relations between actions and capabilities. Finally, at
the very top (fourth) level lies the vehicle ontology. It imports the actions ontology and by
extension the components and capabilities ontology and is capable of modelling different
classes of AUVs. This is achieved by encoding what set of components, capabilities and
actions each AUV class should have. This bottom-up modelling approach follows the work
presented in KnowRob [2] as well as in [177], where the authors utilize ontologies in the
same manner in order to formulate a Semantic Robot Description Language (SRDL) that is
capable of describing robot components, capabilities and actions. However, our approach
differs in that it encodes the health status of vehicle components in their representation
as opposed to [2, 177] where this feature is not present. This provides the ability of re-
assessing the internal state of the vehicle dynamically and in real time in the presence of
component faults, something that is also not present in [2, 177]. This in turn enables
the framework to adapt its mission planning and execution loop if necessary by feeding rel-
evant information to the planning and execution system which is described in Section 7.3.
Figure 7.3 illustrates the four-level bottom-up vehicle modelling approach schematically.
Let us now present the components, capabilities, actions and vehicle ontologies in more detail.
Figure 7.3. Four-level bottom-up vehicle modelling approach: components (level 1), capabilities (level 2), actions (level 3) and vehicle (level 4). The purple boxes represent
ontologies while the arrows indicate the direction of incremental modelling starting from the
components ontology up to the vehicle ontology.
The components ontology models both hardware and software components commonly
found in AUVs. Regarding hardware components the ontology comprises: i) sensor classes
such as a DVL (Doppler Velocity Log) class and a Gyro (gyroscope) class for modelling
sensors, ii) a Thruster class for modelling thrusters and iii) a Battery class for modelling
batteries that power the vehicle. With respect to modelling software components, the ontol-
ogy contains a NavigationModule class, a DetectionModule class, a ClassificationModule
class, a ReacquisitionModule class, a PlanningModule class and an ExecutionModule class.
The class hierarchy of the components ontology is shown in Figure 7.4. In order to be able
to model the health status of each component we use a binary (true/false) data property
named functionalComponent. Now, the existence of components in an actual AUV is rep-
resented by instantiations of the classes that model those components and each instance is
linked to its health status via the functionalComponent data property as appropriate. How-
ever, our framework does not assess the health status of components. We rather assume that
the vehicle is equipped with a fault detection system that provides such information and our
framework asserts it in its KB.
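For example (a sketch with illustrative namespace prefixes), a functional DVL on an actual vehicle would be represented along the following lines, with the fault detection system later updating the data property value if a fault occurs:

% Instantiate the DVL component class and mark the instance as functional
rdf_assert(comp:'DVL1', rdf:type, comp:'DVL'),
rdf_assert(comp:'DVL1', comp:'functionalComponent', literal(type(xsd:boolean, true))).

% Upon a detected fault the old value is retracted and replaced:
% rdf_retractall(comp:'DVL1', comp:'functionalComponent', _),
% rdf_assert(comp:'DVL1', comp:'functionalComponent', literal(type(xsd:boolean, false))).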
A capability implies both the capacity and the ability to perform a set of actions that are
related to that capability. For now, let us focus on the intersection of the capacity and the
ability and forget about the set of actions. A human being for example, has the capacity to
walk if he has legs but might not have the ability to walk because he might not know how (an
infant) or if for example his legs are numb. In these cases we can say that he is not capable
of walking. Similarly, an AUV has the capacity to navigate if it is equipped with thrusters,
a DVL, a gyroscope and a navigation module. It might not be capable of navigating though
because it does not have the ability to do so, that is, one or more of the aforementioned
components are faulty. Alternatively it might not be capable of navigating because it does
not have the capacity to do so in the first place, i.e. it is missing the necessary components.
Using what we have just described as our logical basis we have created a capabili-
ties ontology that imports the components ontology1 and encodes dependency relations
between capabilities and components. These dependency relations are modelled through
dependsOnComponent object properties that relate capability classes to component classes.
Dependency relations on the class level represent the set of required components for a ca-
pability to exist (the capacity). In addition, the fact that each component is linked to its health
status enables us to check whether the ability is also present for that capability to be instan-
tiated. For example, consider the capability class SomeCapability that depends on compo-
nent classes SomeComponent and SomeOtherComponent. The SomeCapability class will
only be instantiated if there are instances of the aforementioned component classes whose
health status is functional. Figure 7.5 illustrates the class hierarchy of the capabilities on-
tology. As can be seen by observing Figure 7.5 there are seven main capability classes.
Figure 7.5. Capabilities ontology class hierarchy. We have intentionally omitted the class
hierarchy of the components ontology (see Figure 7.4) for better readability.
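Continuing the SomeCapability example, a check that both the capacity and the ability are present before a capability is instantiated could be sketched as follows (the rule name and the helper predicate class_depends_on_component are hypothetical; the actual reasoning rules are presented in Section 7.2.3):

% A capability class is available if every component class it depends on
% has at least one functional instance in the KB.
capability_available(CapClass) :-
    forall(class_depends_on_component(CapClass, CompClass),
           ( rdfs_individual_of(CompInst, CompClass),
             owl_has(CompInst, comp:'functionalComponent',
                     literal(type(xsd:boolean, true))) )).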
Modelling component redundancies enables the vehicle to maintain a capability in case there is a problem with the initial component, i.e.
if it becomes faulty. For this purpose we have introduced the notion of primary, secondary,
tertiary ... n-ary capabilities depending on the number of existing redundancies modelled.
We call such capabilities semantically equivalent because they represent alternative ways of
achieving the same type of capability. Characteristic examples of semantically equivalent
capabilities are the primary and secondary capabilities that extend their respective capabil-
ity classes as shown in Figure 7.5. Table 7.1 illustrates the dependency relations between
capabilities and components as well as redundancies where applicable.
Table 7.1. Dependency relations between capabilities and components. Component classes
in the same color (excluding the default black) represent redundancies for capability classes
of the same type.
In the previous section (Section 7.2.2.2) where we presented the capabilities ontology of
our framework we focused on the intersection of the capacity and the ability that equips a
vehicle with capabilities to perform a set of actions. In this section we focus on the actions
themselves.
For the purpose of modelling actions we have created an actions ontology that imports
the capabilities ontology and by extension the components ontology. Actions are modelled
as classes and similarly to the case of the capabilities ontology that encodes dependency
relations between capabilities and components, the actions ontology encodes dependency
relations between actions and capabilities.
Figure 7.6. Actions ontology class hierarchy. We have intentionally omitted the class hierar-
chy of the components and capabilities ontologies (see Figures 7.4 and 7.5 respectively) for
better readability.
There are four main action classes in total: DoHoverDetection, DoClassify, DoReacquire and DoIn-
spect. The subclasses of the DoHoverDetection action class are used for modelling actions
that enable the vehicle to survey areas performing detections of targets (e.g. objects) while
the DoClassify action class is used for modelling actions that classify targets. Moreover,
the subclasses of the DoReacquire action class are intended for vehicle actions that reac-
quire targets while the subclasses of the DoInspect action class are intended for modelling
target inspection actions. Table 7.2 is indicative of action dependency relations as well as
capability redundancies where applicable.
So far we have presented the means for representing and expressing knowledge about AUV
components, capabilities and actions in Sections 7.2.2.1, 7.2.2.2 and 7.2.2.3 respectively.
The last remaining piece in the vehicle modelling puzzle is the ability to model AUVs.
This is exactly the purpose of the vehicle ontology.
More specifically, the vehicle ontology imports the actions and by extension the capa-
bilities and components ontologies and is additionally complemented with a simple Vehicle
Table 7.2. Dependency relations between actions and capabilities. Capability classes in the
same color (excluding the default black) represent redundancies for action classes of the same
type.
Figure 7.7. Vehicle ontology class hierarchy. We have intentionally omitted the class hier-
archy of the components, capabilities and actions ontologies (see Figures 7.4, 7.5 and 7.6
respectively) for better readability.
class hierarchy as illustrated in Figure 7.7. As can be seen by observing Figure 7.7 we
have extended the Vehicle class with a Nessie class since Nessie is the AUV that we have
chosen to model inside the vehicle ontology. Given an AUV class, in our case Nessie, we
manually encode relations of that class to a set of component, capability and action classes
through hasComponent, hasCapability and hasAction object properties respectively. These
relations on the class level denote what that AUV class is in principle equipped with. We
say in principle because there may be a discrepancy between what is expected to be present
or available (in principle) in an AUV in terms of components, capabilities and actions and
what is actually the case due to potentially missing or faulty components. In order to rep-
resent an actual AUV that we want to use for underwater operations we manually create an
instance of its respective class (e.g. Nessie1) as well as instantiations of all of its existing
components (e.g. DVL1, FwdSonar1, Compass1 etc.) which we initially assert as functional
(despite this manual initial assertion, the fault detection system can update the health status
of each component as appropriate). It is then up to the reasoning process to construct the
full vehicle picture by: i) linking component instances to the vehicle instance through hasComponent object properties,
ii) instantiating capabilities and actions and linking them to the vehicle instance through
hasCapability and hasAction object properties respectively, given the health status
of each component and the dependency relations inherited from the capabilities and actions
ontologies. We will explain how this is done in Section 7.2.3. Figure 7.8 illustrates the full
set of relations inside the vehicle ontology.
Figure 7.8. Relations between the vehicle, components, capabilities and actions inside the
vehicle ontology, via the hasComponent, hasCapability, hasAction, dependsOnComponent and dependsOnCapability object properties and the functionalComponent (true/false) data property.
• owl_subclass_of(?SubClass, ?SuperClass): If the SubClass variable is bound the query
predicate returns the superclass(es) of SubClass, while if SuperClass is bound it returns
the subclass(es) of SuperClass. In case both variables are bound it returns true
or false based on whether the bound SubClass is a subclass of the bound SuperClass.
• owl_direct_subclass_of(?SubClass, ?SuperClass): Behaves exactly like the owl_subclass_of(?SubClass, ?SuperClass) predicate except that it returns (as its name suggests)
only the direct subclass(es) or superclass(es).
• rdfs_individual_of(?Resource, ?Class): This query predicate returns true if both the
Resource and Class variables are bound, and Resource is an individual (instance) of
Class. That is, Resource has an rdf:type property that refers to Class or any subclass
thereof. Moreover, by binding only the Resource variable to an individual the predicate
will return the class(es) that individual belongs to, while by binding only the
Class variable to a class the predicate will return the individual(s) of that class (if
any).
• rdf_assert(+S, +P, +O): This query predicate asserts new triples into the KB. For instance,
rdf_assert(robot:'Nessie1', rdf:type, robot:'Nessie') will assert a new individual
(instance) Nessie1 of type (class) Nessie into the KB. This is duplicate safe. That
is, if the triple that is provided in the rdf_assert already exists in the KB the query
predicate will have no effect.
• rdf_retractall(?S, ?P, ?O): This query predicate removes all matching triples from the
KB. If the provided triple does not match any triple then rdf_retractall has no effect.
• owl_has(?S, ?P, ?O): This is the most generic and flexible query predicate since it can
return any information residing in the KB. For instance, if none of the variables are
bound the predicate returns all the information residing in the KB, while if S is bound
to an instance then it returns the property(ies), property value(s) and class(es) of that
instance.
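The following queries illustrate typical uses of these predicates (instance names and namespace prefixes are illustrative):

% All subclasses of the AUV class
?- owl_subclass_of(Sub, robot:'AUV').

% The class(es) a particular individual belongs to
?- rdfs_individual_of(robot:'Nessie1', Class).

% Assert a triple and later remove it
?- rdf_assert(robot:'Nessie1', rdf:type, robot:'Nessie').
?- rdf_retractall(robot:'Nessie1', rdf:type, robot:'Nessie').

% Everything known about an instance
?- owl_has(robot:'Nessie1', P, O).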
Around the query predicates listed above, we have built our own set of query predicates
in the form of Prolog rules that we utilize in order to reason about the vehicle and its inter-
nal state. This is a continuous and real time process that operates on the part of the KB
that is formulated by the components, capabilities, actions and vehicle ontologies described
in Section 7.2.2. The choice of Prolog comes naturally since every ontology is internally
translated into Prolog terms forming in this way a Prolog KB upon which we can execute
queries. The process involves vehicle system identification and system update phases while
the vehicle operates. System identification is tasked with formulating all those instantia-
tions and relations that are required for representing the vehicle and its internal state with
respect to the existence of functional components which give birth to capabilities and ac-
tions while system update is tasked with deleting instantiations and relations in the presence
of non functional components which affect available capabilities and actions. As such, dur-
ing the operation of the vehicle, system identification and system update are executed in
succession and continuously, effectively forming a never ending loop. Figure 7.9 illustrates the phases schematically. In this manner, the full vehicle
picture is represented, monitored and updated in real time.
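A minimal sketch of this loop follows; the predicate names are illustrative placeholders for the rules presented in Sections 7.2.3.1 and 7.2.3.2:

% Alternate system identification and system update indefinitely
monitor_vehicle(RobotInst, RobotClass) :-
    ignore(system_identification(RobotInst, RobotClass)),
    ignore(system_update(RobotInst, RobotClass)),
    monitor_vehicle(RobotInst, RobotClass).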
Recall from Section 7.2.2 that our approach for knowledge representation and mod-
elling of components, capabilities, actions and vehicles demonstrates some similarities to
[2], [177]. This means that our approach for reasoning about the aforementioned concepts
unavoidably demonstrates similarities too. For instance, we too, utilize Prolog and build
on top of existing KnowRob reasoning predicates. At the same time, the fact that in our
ontological modelling we have included the encoding of the health status of components has
transformed the manner in which we reason about the vehicle and its internal state signifi-
cantly. As opposed to KnowRob, where the mere presence or absence of a robot component
would mean that a robot does or does not have a capability or can or cannot execute a task,
in our case we go one step further and additionally reason about a component's health status,
not just its mere presence. Moreover, in KnowRob the dynamic assessment and reassess-
ment of a robot's internal state is achieved by querying for resources on the world wide web.
For instance, in case that the task for a robot is to make pancakes, the reasoning process
will check whether the vehicle has some high level plan to perform the task and if not it
will search on the web for a recipe to fill the gaps. If this proves successful, the vehicle will
thereafter constantly be in a position to “make” pancakes given the existence of the nec-
essary ingredients and utensils irrespective of whether its manipulator, for example, would
break. This of course, leads to inconsistencies. In our case, we deal with such phenomena
which are in many cases the norm rather than the exception. Consequently, on top of not
having the luxury to query for information on the web we need to be able to capture the
internal state of a vehicle more accurately in the presence of hardware or software faults.
Losing a vehicle underwater is always a grave danger and most of the time it is accompanied
by high recovery costs, if recovery is at all possible. Having said that, let us now
present and analyse the vehicle system identification and system update phases in Sections
7.2.3.1 and 7.2.3.2 respectively.
The vehicle system identification phase is divided into three steps that handle components,
capabilities and actions respectively. Figure 7.10 illustrates the realization of the system
identification steps schematically. The first system identification step compares the set of
components that should be present on the vehicle based on its class to what is actually
present on the vehicle instance. It also formulates relations between the vehicle instance
Figure 7.10. System identification steps. Step 1 handles the identification of vehicle compo-
nents, step 2 handles the identification of vehicle capabilities while step 3 the identification
of vehicle actions.
and the components that are found on board the vehicle through hasComponent object prop-
erties as appropriate. The Prolog rule shown in Snippet 4 demonstrates the aforementioned
procedure, expanding the first step of Figure 7.10. The first input variable in the rule head
(RobotInst) is a vehicle instance and the second (RobotClass) is the class of the vehicle
that the instance belongs to. When querying for this rule, input variables must be bound
appropriately. That is, we must provide the instance and the class of the vehicle whose
components we are interested in identifying. Notice that in the rule body each predicate is
separated by a comma (,) which represents the logical AND (∧) in Prolog and means that
body predicates must be true at the same time for the rule to be true. Inside the body, Pro-
log checks whether the provided vehicle instance is actually an instance (individual) of the
provided vehicle class and whether the class is a subclass of AUV. Furthermore, it finds all
the component instances in the KB and puts them in a list L1. At this point the component
instances are not linked to the vehicle instance through hasComponent properties. In addi-
tion, it finds all the classes of components that should be present on the vehicle and whose
instances are not associated with the vehicle instance through hasComponent relations and
puts them in a list L2. Both comp_present_in_KB and missing_comp_from_robot are assistive
rules developed by us and are necessary in the process of searching for all the component
instances present in the KB and missing components from the robot respectively. Resum-
ing our analysis, Prolog takes all the members (elements) of L1 that are instances of L2
members and asserts into the KB that the given robot (vehicle) instance has as a component
each such L1 member. In this manner the framework identifies the vehicle with respect to
its components.
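A sketch of such a rule, consistent with the description above, is the following (the rule name, the arity of the assistive rules and the namespace prefixes are our assumptions, not the author's verbatim code):

system_identification(RobotInst, RobotClass) :-
    % the provided instance must belong to the provided class,
    % which in turn must be a subclass of AUV
    rdfs_individual_of(RobotInst, RobotClass),
    owl_subclass_of(RobotClass, robot:'AUV'),
    % L1: all component instances present in the KB
    findall(C, comp_present_in_KB(C), L1),
    % L2: component classes expected on the vehicle whose instances are
    % not yet linked to it via hasComponent
    findall(CC, missing_comp_from_robot(RobotInst, RobotClass, CC), L2),
    % link every L1 member that instantiates an L2 member to the vehicle
    forall(( member(C, L1), member(CC, L2), rdfs_individual_of(C, CC) ),
           rdf_assert(RobotInst, comp:'hasComponent', C)).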
The second system identification step builds on top of the first one and is responsible
for identifying the capabilities of the vehicle and formulating relations between the vehicle
instance and capability instances, that will be potentially created, via hasCapability object
properties. That is, given the existence of component instances related to the vehicle that
the first identification step yielded, their health status, as well as dependencies among the
capabilities that the vehicle should have and those vehicle components. The Prolog rule
shown in Snippet 5 demonstrates the aforementioned procedure, expanding the second step
of Figure 7.10. Please note that the head of the rule is the same as the one in the first iden-
tification step. What changed is the body of the rule. This is logical since vehicle system
identification is one phase divided into three steps. As such, the head of the rule is the same
so that we have the same input variables bound to the same values. Regarding the rule body,
Prolog finds all the classes of capabilities the vehicle class should have and puts them in a
list L1. In addition, it chooses an element of L1 (L1M) and finds the classes of components
the instances of which are required for instantiating L1M and puts them in a list L2. Prolog
also chooses an element of L2 (L2M). At the same time, Prolog takes the L1M member and
finds all non functional component (component instances) that affect it, and puts them in a
list L3. At this point if L3 is not empty the rule will fail for that L1M,L2M pair and Prolog
will investigate another one. In this manner, capabilities which depend on components that
are not functional are not instantiated. In contrast, if L3 is empty, that means that there are not
any non functional component instances for capability class L1M and Prolog will search
for instances of that capability inside the KB and put them in a list L, if any. Now, if L is
empty, the framework creates a capability instance Inst and asserts it in the KB as well as
formulates relations to the healthy component instances that the capability instance depends
on via dependsonComponent. Moreover, the framework formulates a relation between the
robot (vehicle) instance and the capability instance via a hasCapability object property. Fi-
nally, if L is not empty that means that there is a capability instance inside the KB and the
framework proceeds with the formulation of the dependsonComponent and hasCapability
relations. This process is iterative and continues until all eligible L1M,L2M pairs are inves-
tigated. In this manner the framework identifies the vehicle with respect to its capabilities.
That is, by instantiating all capabilities which depend on available functional components
and linking them to the vehicle. To conclude this second identification step, we would
like to mention that, similar to the case of the first identification step, cap_found_on_robot_class, required_comp_for_capability and non_functional_comp_for_capability are assistive rules
developed by us. They are necessary in the process of searching for all the capability
classes that should be present on the vehicle based on its class, all the required component
classes for a capability to be instantiated and all component instances that are not functional
affecting one or more of the capabilities respectively.
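A sketch consistent with this second step follows (the clause bodies, helper arities and namespace prefixes are again our assumptions; per the remark above, the rule head is the same as in the first step, so this would be a further clause of the same predicate):

system_identification(RobotInst, RobotClass) :-
    % L1: capability classes the vehicle class should have
    findall(CapClass, cap_found_on_robot_class(RobotClass, CapClass), L1),
    forall(member(L1M, L1),
           ( % L2: component classes required for instantiating L1M
             findall(CompClass, required_comp_for_capability(L1M, CompClass), L2),
             % L3: non functional component instances affecting L1M
             findall(Comp, non_functional_comp_for_capability(L1M, Comp), L3),
             ( L3 \= [] ->
                 true % a required component is faulty: do not instantiate
             ;   ensure_capability_instance(L1M, Inst),
                 % link the capability instance to the components it depends on
                 forall(( member(CompClass, L2),
                          rdfs_individual_of(CompInst, CompClass) ),
                        rdf_assert(Inst, cap:'dependsonComponent', CompInst)),
                 rdf_assert(RobotInst, cap:'hasCapability', Inst)
             ) )).

% Reuse an existing capability instance or create a fresh one
ensure_capability_instance(CapClass, Inst) :-
    ( rdfs_individual_of(Inst, CapClass) -> true
    ; rdf_bnode(Inst), rdf_assert(Inst, rdf:type, CapClass) ).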
Finally we have the third system identification step which is responsible for identifying
vehicle actions based on available capabilities. The process is very similar to the second
identification step upon whose outcomes it builds. Therefore we will not go into great
detail analysing it but instead we provide a higher level description of the process which
we illustrate in Snippet 6. Snippet 6 represents an expansion of the third identification step
depicted in Figure 7.10. During this third step, Prolog compares the actions that should
be available on the vehicle based on its class as well as their dependencies on specific
capability classes to what is actually available on the vehicle on the capability level, i.e.
which capabilities are instantiated and linked to the vehicle instance. In the event that for
an action class all capability dependencies are met through the existence of instantiations of
their respective classes, Prolog checks whether the action class is already instantiated and if
it is it asserts dependency relations between the action instance and the capability instances
into the KB via dependsonCapability object properties. In addition, it formulates (asserts)
a relation between the vehicle instance and the instantiated action class via a hasAction
object property. Now, if the action class, whose capability dependencies are met, is not
instantiated in the KB then there is an additional step that comes first, concerned with the
creation of an action class instance, after which all the aforementioned relations are formed.
Finally, in case not all capability dependencies are met
for the action class, that action class is not instantiated and consequently relations are not
formed. This process iterates until all eligible actions have been instantiated and linked to
the vehicle instance as well as to the capability instances. In this manner the
framework identifies the available actions that the vehicle can execute. This concludes the
system identification phase.
In Section 7.2.3.1 we saw how using Prolog the framework is able to build the representa-
tion of the vehicle by formulating appropriate instantiations and relations among compo-
nents, capabilities, actions and the vehicle itself. However, the vehicle system identification
phase in itself is not sufficient for achieving a complete system for assessing and reassessing
the vehicle's internal state. A careful and observant reader may have already identified why.
Nowhere within the system identification phase steps is there anything regarding deleting
instances or relations. For example, consider the case in which our vehicle has a broken
sonar. In that case the first system identification step will associate the component instance
with the vehicle instance but the second step will refrain from instantiating the capabili-
ties that are affected by the non functionality of the sonar component. Ergo, relations among
an existing thing (the vehicle instance) and something non existent (the instance(s) of the
capability(ies) affected by the sonar) cannot be formed. Similarly, during the third system
identification step, actions that require the affected capabilities so that they can be instan-
tiated, will naturally not be instantiated and relations that involve their instances will not
be formed. So far things are in order. In addition, if somehow the sonar becomes opera-
tional again, e.g. it has a problematic connection to the vehicle that sometimes works, then
the system identification steps will immediately handle the new piece of information and
update the KB accordingly. What happens though if a component that has been previously
functional breaks? System identification will just ignore it and will not form instantiations
and relations that are affected by it. However such instances and relations already exist
in the KB from the time where the component was functional. That naturally leads to an
inconsistency and a gap towards truly achieving a reliable system. That is when the vehicle
system update phase comes into play. Snippet 7 illustrates its realization in Prolog while
Figure 7.11 illustrates the schematic decomposition of the system update phase and can be
used as a visual aid to better understand the Prolog snippets that will follow in this section.
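Based on the description that follows, the realization has the shape sketched below (the signatures of system_update3, system_update4 and of the component-finding predicate are our assumptions; the remaining predicates appear in Snippets 9, 10 and 12):

system_update(RobotInst, RobotClass) :-
    % one predicate for finding the non functional components in the KB
    non_functional_components_on_robot(RobotInst, BrokenCompList),
    % six system update predicates
    system_update1(RobotInst, RobotClass, BrokenCompList, LAffectedCap, LAffectedAct),
    system_update2(RobotInst, RobotClass, LAffectedCap),
    system_update3(LAffectedCap, BrokenCompList),
    system_update4(LAffectedAct, LAffectedCap),
    system_update5(RobotClass, LAffectedCap),
    system_update6(RobotClass, LAffectedAct).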
As can be seen by observing Snippet 7 system update is encoded as a Prolog rule whose
body comprises seven predicates, six of which are system update predicates and one pred-
icate for finding the non functional components in the KB. Again, the input variables of
the rule head are the instance of the robot for which we want to perform system update
as well as its class. Predicate non_functional_components_on_robot is realized as a Prolog
rule and is illustrated in Snippet 8. More specifically, the rule simply finds all components whose functionalComponent data property value is false and collects them in a list.
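A plausible realization of that rule, under the assumption that the health status is stored as a boolean literal, is:

non_functional_components_on_robot(RobotInst, BrokenCompList) :-
    findall(Comp,
            ( owl_has(RobotInst, comp:'hasComponent', Comp),
              owl_has(Comp, comp:'functionalComponent',
                      literal(type(xsd:boolean, false))) ),
            BrokenCompList).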
Snippet 9 Calculation of affected capabilities and actions based on non functional compo-
nents. In the event that there are affected actions, hasAction relations between the vehicle
instance and the instances of such actions are deleted.
system_update1(RobotInst,RobotClass,BrokenCompList,LAffectedCap,LAffectedAct):-
    % LAffectedCap: capability instances related to a broken component
    (BrokenCompList \= [] ->
        (member(BrokenCompListM,BrokenCompList),
         findall(AffectedCap,owl_has(AffectedCap,_,BrokenCompListM),LAffectedCap));
     true),
    % LAffectedAct: action instances related to an affected capability
    (LAffectedCap \= [] ->
        (member(LAffectedCapM,LAffectedCap),
         (\+ rdfs_individual_of(LAffectedCapM,RobotClass) ->
             findall(AffectedAct,owl_has(AffectedAct,_,LAffectedCapM),LAffectedAct));
         true); true),
    % delete the hasAction links between the vehicle and each affected action
    (LAffectedAct \= [] ->
        (member(LAffectedActM,LAffectedAct),
         (\+ rdfs_individual_of(LAffectedActM,RobotClass) ->
             rdf_retractall(RobotInst,act:'hasAction',LAffectedActM));
         true); true).
in the LAffectedAct list, if any. Finally, Prolog will check whether LAffectedAct contains
elements, i.e. whether there are affected actions, and for each one, if any, it will delete all
hasAction relations between the vehicle instance and the affected action instances.
This effectively represents that the vehicle is no longer capable of executing the affected
action(s). This is an iterative process until all affected capabilities and actions are identified
and all hasAction relations of those actions affected are deleted, if any.
System update then proceeds with the predicate system_update2, which is illustrated
in Snippet 10. Given the instance of the vehicle, its class, as well as the list of affected
Snippet 10 Deletion of hasCapability relations between the vehicle instance and instances
of affected capabilities, if any.
system_update2(RobotInst,RobotClass,LAffectedCap):-
    (LAffectedCap \= [] ->
        (member(LAffectedCapM,LAffectedCap),
         (\+ rdfs_individual_of(LAffectedCapM,RobotClass) ->
             % delete the vehicle's hasCapability link to each affected capability
             rdf_retractall(RobotInst,cap:'hasCapability',LAffectedCapM));
         true); true).
capabilities (capability instances) from system_update1, Prolog deletes all relations between
the vehicle instance and the instances of the affected capabilities, if any, in an identical
manner to the deletion of hasAction relations shown in Snippet 9. With this,
all relations among the vehicle, the affected capabilities and actions have now been deleted.
Next in line for deletion are the relations between affected capability instances and
non functional component instances as well as relations between affected action instances
and affected capability instances, if any. These are none other than the relations formed via
dependsOnComponent and dependsOnCapability object properties, as shown in Snippet 11.
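Given the pattern of Snippets 9, 10 and 12, a reconstruction along the following lines is plausible (the predicate names system_update3 and system_update4 and the exact property spellings are our assumptions):

system_update3(LAffectedCap,BrokenCompList):-
    (LAffectedCap \= [] ->
        (member(LAffectedCapM,LAffectedCap),
         member(BrokenCompListM,BrokenCompList),
         % remove the capability's dependency on the broken component
         rdf_retractall(LAffectedCapM,cap:'dependsonComponent',BrokenCompListM));
     true).

system_update4(LAffectedAct,LAffectedCap):-
    (LAffectedAct \= [] ->
        (member(LAffectedActM,LAffectedAct),
         member(LAffectedCapM,LAffectedCap),
         % remove the action's dependency on the affected capability
         rdf_retractall(LAffectedActM,act:'dependsonCapability',LAffectedCapM));
     true).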
Finally, what remains is the deletion of capability and action instances from the KB in
the event that there are non functional components that affect them. Snippet 12 illustrates this.
Snippet 12 Deletion of capability and action instances affected by non functional compo-
nent(s), if any.
system_update5(RobotClass,LAffectedCap):-
    % Remove the rdf:type assertion of each affected capability instance,
    % effectively deleting the instance from the KB.
    (LAffectedCap \= [] ->
        (member(LAffectedCapM,LAffectedCap),
         (\+ rdfs_individual_of(LAffectedCapM,RobotClass) ->
            (owl_has(LAffectedCapM,'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',A),
             rdf_retractall(LAffectedCapM,
                 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',A)));
        true); true).

system_update6(RobotClass,LAffectedAct):-
    % Likewise for each affected action instance.
    (LAffectedAct \= [] ->
        (member(LAffectedActM,LAffectedAct),
         (\+ rdfs_individual_of(LAffectedActM,RobotClass) ->
            (owl_has(LAffectedActM,'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',A),
             rdf_retractall(LAffectedActM,
                 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',A)));
        true); true).
Regarding autonomous vehicles, and by extension AUVs intended for real world appli-
cations, mission planning3 can be a very resource demanding and time consuming process
that is affected by the size and the complexity of the domain and the problem that needs
solving. Similarly to planning in the case of humans, the challenges that an autonomous ve-
hicle has to face are domains that are dynamic, partially known and uncertain. Unexpected
changes in the domain and/or the state of the vehicle (e.g. hardware faults) can jeopardize a mission. In addition, changes in mission objectives, imposed by humans, resource availability, or the external environment, should be treated in a meaningful and effective
manner. As such, both hardware faults and changes in high level mission priorities dur-
ing mission execution require adaptive mission planning and execution approaches. These
approaches firstly enable robustness under a variety of unexpected events, and secondly
ensure that mission goals are satisfied to the maximum extent possible. Adaptive mission
planning and execution approaches are the key to robust mission planning and execution
which in turn is one of the key requirements towards persistently autonomous vehicles.
That is, autonomous vehicles that minimize requests for assistance from an operator when
they get stuck through a lack of cognitive ability to deal with a situation.
We have mentioned earlier in this chapter that the architecture of mission planning and
execution is based on the work of the PANDORA project. As in the case of PANDORA,
the system comprises one PDDL planner and one executor. Architecture-wise, as opposed
to PANDORA, we divide planning and execution information into two distinct ontologies,
i.e. a planning and an execution ontology which exchange information (see Sections 7.3.2,
7.3.3). This distinction makes the system modular and more flexible especially in cases
where different planning approaches need to be implemented by other users. Regarding the
planning ontology the full PDDL domains and problems reside in the planning ontology
and as such they can be reconstructed for the planner. This reconstruction is present in PANDORA and, by extension, in the standalone planning and execution framework ROSPlan that materialised from the PANDORA project. However, in ROSPlan PDDL domain files are fed into the system as input and then extracted to be used by a planner. In our case PDDL files can be directly generated from the ontology. We had presented the idea of reconstructing PDDL files from a planning ontology back in 2014 and 2015 [178], [4]. In addition, we do not use ROSPlan; we have developed an implementation of our own together with reasoning modules based on Prolog (see Section 7.2.3) that the executor consults while monitoring the execution of actions, enabling adaptation. More specifically,
in the event of hardware faults our approach favours adaptive execution over adaptive planning, by initially searching for semantically equivalent actions to be executed4. If none
is executable then it resorts to replanning given the current state of both the vehicle and
the world in conjunction with remaining mission goals (adaptive planning). Prioritizing
adaptive execution over adaptive planning is important as mission planning can be very de-
manding in terms of computational resources [179]. However, when a request for changing
3 Mission planning refers to planning missions that involve autonomous underwater operations.
4 Note that the notion of semantic equivalence is present neither in PANDORA nor in ROSPlan.
high level mission priorities is issued to the vehicle, the framework performs replanning, since such a change effectively invalidates the prior plan in the vast majority of cases. For mission planning
and replanning we use the temporal PDDL planner OPTIC [144].
While reading this section it is worthwhile referring to Section 4.5.1 where we presented
the building blocks of temporal domains under the PDDL2.1 specification as well as vari-
ous definitions about planning, durative actions etc. Snippet 13 illustrates various types,
predicates and functions in PDDL syntax in the domain of detecting, classifying, reac-
quiring and inspecting physical objects in the external environment.
In our domain we have four types of objects: lawnmowerpoints, objectpoints,
inspectionpoints and vehicles. Lawnmowerpoints are waypoints that a vehicle visits
when traversing a lawnmower search trajectory5 (see Figure 7.12 for an exemplar lawn-
mower search trajectory) to detect objects in the environment. Objectpoints are waypoints
that a vehicle visits to reacquire (physical) objects, while inspectionpoints are waypoints around (physical) objects that a vehicle visits in order to inspect the objects.
5 There is no formal definition of a lawnmower search trajectory because there are literally infinitely many ways in which someone can mow their lawn. For instance, one can go in straight lines or circles or zigzags or in any other way. However, lawnmower search is an established straight-line search pattern in the underwater domain of the form shown in Figure 7.12.
Regarding the domain predicates shown in Snippet 13, predicate at_op ?v... denotes that some vehicle v is at objectpoint op, while reachable_op ?op1 ?op2... denotes that some objectpoint op2 is reachable from some objectpoint op1. That is, the vehicle is able to transition from objectpoint op1 to objectpoint op2.
Snippet 13 Temporal PDDL domain types, predicates and functions for detecting, classi-
fying, reacquiring and inspecting objects.
(:types Lawnmowerpoint Objectpoint Inspectionpoint Vehicle)
(:predicates
(at_op ?v - Vehicle ?op - Objectpoint)
(reachable_op ?op1 ?op2 - Objectpoint)
(at_ip ?v - Vehicle ?ip - Inspectionpoint)
(reachable_ip ?ip1 ?ip2 - Inspectionpoint)
(at_lp ?v - Vehicle ?lp - Lawnmowerpoint)
(reachable_lp ?lp1 ?lp2 - Lawnmowerpoint)
)
(:functions
(visited_lp ?lp - Lawnmowerpoint)
(visited_op ?op - Objectpoint)
(visited_ip ?ip - Inspectionpoint)
(distance_lp ?lp1 ?lp2 - Lawnmowerpoint)
(distance_op ?op1 ?op2 - Objectpoint)
(distance_ip ?ip1 ?ip2 - Inspectionpoint)
(obj_detection_complete ?lp - Lawnmowerpoint)
(obj_classification_complete ?v - Vehicle)
(reacquired_obj ?op - Objectpoint)
(inspected_obj ?ip - Inspectionpoint)
(cnt_reacquired_obj ?v - Vehicle)
(cnt_visited_lp ?v - Vehicle)
(remaining_energy ?v - Vehicle)
(consumed_energy ?v - Vehicle)
(energy_consumption_rate_moving ?v - Vehicle)
(energy_consumption_rate_still ?v - Vehicle)
(prob_obj ?op - Objectpoint)
(ent_obj ?op - Objectpoint)
(prob_obj_quotient_sum ?v - Vehicle)
(ent_obj_quotient_sum ?v - Vehicle)
(det_obj_time ?v - Vehicle)
(class_obj_time ?v - Vehicle)
(reacq_obj_time ?v - Vehicle)
(insp_obj_time ?v - Vehicle)
(mult_fact_time ?v - Vehicle)
(mission_duration)
)
Vehicle v and objectpoints op, op1 and op2 are variables, also known as parameters, as indicated by the ? notation preceding them. Effectively this means that the variables can be instantiated to represent different vehicles and objectpoints respectively. The same goes for the remaining predicates but with respect to lawnmowerpoints and inspectionpoints.
With respect to the domain functions shown in Snippet 13, functions visited lp
?lp..., visited op ?op... and visited ip ?ip... respectively indicate whether
lawnmowerpoint lp, objectpoint op and inspectionpoint ip have been visited. Further-
more, distance functions distance lp ?lp1 ?lp2..., distance op ?op1 ?op2... and
distance ip ?ip1 ?ip2... return the distance between lawnmowerpoints lp1 and lp2,
objectpoints op1 and op2 and inspectionpoints ip1 and ip2 respectively. Function obj de-
tection... indicates whether the object detection phase is complete up to lawnmower-
point lp whereas function obj classification... indicates whether the object classi-
fication phase has been completed by vehicle v. Function reacquired obj ?op... in-
dicates whether a detected object in the environment has been reacquired from objectpoint
op while function inspected obj ?ip... indicates whether a reacquired object in the
environment has been inspected from inspectionpoint ip. Function cnt reacquired -
obj ?v... is a counter function that returns the number of objects that have been reac-
quired by vehicle v and function cnt visited lp ?v... a counter function that returns
the number of lawnmowerpoints that have been visited by vehicle v. Moreover, function
remaining energy ?v... returns the remaining energy of vehicle v while consumed -
energy ?v... returns the consumed energy of the vehicle. Naturally, as a vehicle operates
it consumes energy (the consumed energy increases and the remaining energy decreases).
Function energy consumption rate moving ?v... represents the (per meter) rate at
which the vehicle consumes energy whilst moving while function energy consumption -
rate still ?v... represents the (per second) rate at which the vehicle consumes energy
while maintaining its position. How these functions are used will be explained later on
in this section where we present durative actions. Now, functions prob obj ?op... and
ent obj ?op... return the probability of a detection being an object of interest and the
entropy related to that detection respectively. Again, we refrain from explaining functions
prob obj quotient... and ent obj quotient... for now as they can be better ex-
plained when describing the actions that use them in the remainder of this section. Finally,
function det obj time ?v... represents the time that the vehicle needs to detect ob-
jects, function class obj time ?v... represents the time that the vehicle needs to clas-
sify objects, function reacq obj time ?v... represents the time that the vehicle needs
to reacquire an object from some location while function insp obj time ?v... repre-
sents the time that the vehicle needs to inspect an object from some location. Function
mult fact time ?v... represents a multiplication factor for time when estimating the
duration of actions. Its usage will be explained in the next paragraphs where we analyse
durative actions. Finally, mission duration is a function that returns the time the vehicle
spent executing domain actions.
We have encoded four distinct durative actions in our domain, one for each of the four phases, i.e. do_hover_detection, do_classify, do_reacquire and do_inspect. Action
do hover detection is an action that is intended for plans where the goal is to detect ob-
jects in the environment and is illustrated in Snippet 14. The parameters of the action are
a vehicle v and two lawnmowerpoints, from and to. The duration of the action is equal to the distance between the two lawnmowerpoints from,to multiplied by a multiplication
factor for time plus the time the vehicle needs to detect objects. The multiplication factor
for time in conjunction with the distance between the points effectively allows the action
to model the time the vehicle needs for moving from one location to another. For instance,
if the vehicle moves with 3 m/s the multiplication factor would be 0.333 and as such if
the distance between locations from,to is, say, 10 meters then the vehicle will need 3.33
seconds to traverse it. However, the action is not only about moving from one location to another; it is also about detecting objects, the duration of which we model with det_obj_time..., which is added to the time the vehicle needs to traverse the distance between the lawnmowerpoints. We will consider this time to be zero (0), however. This is due to the fact that detections occur while the vehicle moves from one lawnmowerpoint to another, and as such the time it needs to detect objects is irrelevant since it is already included within the time the vehicle needs to traverse the distance between the lawnmowerpoints. Another thing that we would like to comment on is that the vehicle is not expected to move constantly at the same speed. There are accelerations and decelerations. Having said that, it is up to the modeller to choose a value for the multiplication factor that will reflect this. Back to the example with the vehicle speed, instead of choosing a value of 0.333, a better choice would be a higher value in order to formulate an upper bound for the duration. Such a value can either be chosen empirically or learned from data. Now, back to the analysis of the do_hover_detection action shown in Snippet 14: the precondition of the
action condition... is that the vehicle v needs to be at lawnmowerpoint from at the
start of the interval of the action (the point at which the action is applied), lawnmowerpoint
to must be reachable from lawnmowerpoint from and lawnmowerpoint to must not have
been visited at the start of the action’s interval (for a reminder about durative actions and
temporal annotations see Section 4.5.1). In addition, at the start of the action’s interval
the object detection phase must not have been completed up to lawnmowerpoint lp while
the remaining energy of the vehicle must be greater than or equal (>=) to the distance travelled between lawnmowerpoints from and to multiplied by the energy consumption rate
that the vehicle has while moving. Recall from earlier in this section where we presented
the energy consumption rate moving... function we stated that it represents the rate
at which the vehicle consumes energy while moving. As such, multiplying this with the
distance travelled will provide an estimate of the consumed energy. As in the case of the
multiplication factor for duration that we analysed earlier, the energy consumption rate can
be chosen empirically or learned from data. The effect of the vehicle executing the ac-
tion is that it (the vehicle) is no longer at lawnmowerpoint from at the start of the durative action, i.e. the effect is that the vehicle leaves its initial location as soon as the action starts executing. Moreover, at the end of the durative action, the effect of the vehicle being at lawnmowerpoint to will take place, while its remaining energy will decrease relative to the distance travelled between lawnmowerpoints from and to multiplied by the energy consumption rate that the vehicle has while moving. Now, in a similar fashion, the action estimates the vehicle's total consumed energy by increasing consumed_energy by the same amount it has decreased the remaining energy. These two representations are redundant and one can be substituted for the other. However, we provide both because they represent different modelling styles. Continuing with the effects of the do_hover_detection action, at the end
of the action lawnmowerpoint to will be marked as visited, the object detection phase will
be marked as completed up to lawnmowerpoint to, the number of visited lawnmowerpoints
will be increased by one and the duration of the mission will increase by the same amount
as the duration of the action.
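Snippet 14 is not reproduced on this page; a sketch of the action, reconstructed from the analysis above and the building blocks of Snippet 13 (the exact encoding may differ in detail), is:

(:durative-action do_hover_detection
  :parameters (?v - Vehicle ?from ?to - Lawnmowerpoint)
  :duration (= ?duration (+ (* (distance_lp ?from ?to) (mult_fact_time ?v))
                            (det_obj_time ?v)))
  :condition (and
    (at start (at_lp ?v ?from))
    (at start (reachable_lp ?from ?to))
    (at start (= (visited_lp ?to) 0))
    (at start (= (obj_detection_complete ?to) 0))
    (at start (>= (remaining_energy ?v)
                  (* (distance_lp ?from ?to) (energy_consumption_rate_moving ?v)))))
  :effect (and
    (at start (not (at_lp ?v ?from)))
    (at end (at_lp ?v ?to))
    (at end (decrease (remaining_energy ?v)
                      (* (distance_lp ?from ?to) (energy_consumption_rate_moving ?v))))
    (at end (increase (consumed_energy ?v)
                      (* (distance_lp ?from ?to) (energy_consumption_rate_moving ?v))))
    (at end (assign (visited_lp ?to) 1))
    (at end (assign (obj_detection_complete ?to) 1))
    (at end (increase (cnt_visited_lp ?v) 1))
    (at end (increase (mission_duration)
                      (+ (* (distance_lp ?from ?to) (mult_fact_time ?v))
                         (det_obj_time ?v))))))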
Let us now proceed with the durative do classify action that is illustrated in Snippet
15. Action do classify is intended for plans where the goal is to classify object de-
tections. The parameters of the action are a vehicle v and a lawnmowerpoint lp. The
duration of the action is estimated to be the time that the vehicle needs to classify the objects. Notice that there is no time multiplication factor as in the case of the do_hover_detection action since, as we will see next, the vehicle just holds its position to classify objects and does not traverse any distance. Continuing with the analysis of the action, its precondition (condition...) is that the vehicle needs to be at lawnmowerpoint lp at the start of the action's interval (the point at which the action is applied), as well as to have visited a number of lawnmowerpoints at the start of the action's interval. The notation # of lawnmowerpoints is just a place-holder for demonstration purposes; it is not valid syntax and, to be valid, it needs to be replaced by a number. Finally, at the start of the action's
interval the object classification phase must not have been completed by the vehicle, while the vehicle must have remaining energy that is greater than or equal (>=) to the rate at which the vehicle consumes energy while maintaining its position multiplied by the time that the vehicle needs to classify the objects; that is, the per second rate. Note that this is different compared to what we presented for the do_hover_detection action. Here we have no transition and as such we will not be using energy_consumption_rate_moving... but energy_consumption_rate_still.... Note that this rate needs to be determined empirically or be learned from data. The effect of executing do_classify is that the vehicle
at the end of the action’s duration remains at its position (at lawnmowerpoint lp) and that
the object classification phase has been completed by the vehicle. Moreover, at the end of
the action the remaining energy will decrease relatively to the time that the vehicle needs
to classify the objects multiplied by the rate at which the vehicle consumes energy while
maintaining its position. In an identical fashion, the do_classify action has the effect of increasing the consumed energy at the end of the action's interval. Finally, at the end of the action's interval the duration of the mission will increase by the same amount as the duration of the action.
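Snippet 15 is likewise not reproduced on this page; a sketch consistent with the analysis above, with 6 standing in for the # of lawnmowerpoints place-holder, is:

(:durative-action do_classify
  :parameters (?v - Vehicle ?lp - Lawnmowerpoint)
  :duration (= ?duration (class_obj_time ?v))
  :condition (and
    (at start (at_lp ?v ?lp))
    (at start (>= (cnt_visited_lp ?v) 6)) ; 6 replaces the place-holder
    (at start (= (obj_classification_complete ?v) 0))
    (at start (>= (remaining_energy ?v)
                  (* (energy_consumption_rate_still ?v) (class_obj_time ?v)))))
  :effect (and
    (at end (at_lp ?v ?lp))
    (at end (assign (obj_classification_complete ?v) 1))
    (at end (decrease (remaining_energy ?v)
                      (* (energy_consumption_rate_still ?v) (class_obj_time ?v))))
    (at end (increase (consumed_energy ?v)
                      (* (energy_consumption_rate_still ?v) (class_obj_time ?v))))
    (at end (increase (mission_duration) (class_obj_time ?v)))))

The next action to present and analyse is the do_reacquire durative action, which is illustrated in Snippet 16.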
Durative action do reacquire is intended for plans where the goal is to reacquire ob-
jects in the environment and has three parameters: a vehicle v and two objectpoints from
and to. The estimation of the duration of the action is identical to the one presented for
the do hover detection action (see Snippet 14) but instead of adding the time that the
vehicle requires to make detections we add the time the vehicle requires to reacquire an
object. The value for the time that the vehicle needs to reacquire an object can be cho-
sen empirically or learned from data. Now, for the action to be applicable, at the start of
the action’s interval (the point at which the action is applied), the vehicle needs to be at
objectpoint (location) from, objectpoint to must be reachable from objectpoint from and
objectpoint to must not have been visited. Moreover, at the start of the action's interval, the object must not have been reacquired. As a final precondition, the vehicle must have energy greater than or equal (>=) to the following: the distance between the objectpoints
from,to multiplied by the energy consumption rate of the vehicle while moving plus the
rate at which the vehicle consumes energy while maintaining its position multiplied by the
time the vehicle needs to reacquire an object. This is the most complex expression so far
in terms of calculating the energy. It has two components: i) the first is the energy that
the vehicle is estimated to require in order to transition from one objectpoint to the other
and ii) the energy it is estimated to consume in order to reacquire the object. That is, when
the vehicle will stop at objectpoint to it will maintain its position for some time in order
to reacquire the object. As opposed to det obj time... that we presented in the detec-
tion action, reacq obj time... is not zero and as such it needs to be determined either
empirically or learned from data. The effect of executing the do reacquire action is that
the vehicle has left its initial location at the start of the action ((at start (not (at_op ?v ?from)))) and moved to the location of objectpoint to. Additionally, at the end of the
action’s interval, the vehicle’s remaining and consumed energy will change according to
what we have presented above as being the energy the vehicle is estimated to require to
perform the action. Furthermore, at the end of the action's interval, the effect of objectpoint to being marked as visited will take place, the effect of the object being marked as reacquired will take place, and the number of reacquired objects will increase by one. Also, the effect (at end (increase (prob_obj_quotient_sum ?v) (/ (prob_obj ?to) (prob_obj ?from)))) increases, at the end of the action's interval, the sum of probability quotients by the quotient of dividing the probability of the object at location (objectpoint) to being an object of interest by the probability of the object at location (objectpoint) from being an object of interest. The effect (at end (increase (ent_obj_quotient_sum ?v) (/ (ent_obj ?to) (ent_obj ?from)))) achieves the same outcome but with entropy instead of probability. Both these
effects regarding probability and entropy are related to high level mission priorities which
are described in Section 7.3.1.2. Finally, at the end of the action’s interval the duration of
the mission will increase by the same amount as the duration of the action.
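Snippet 16 is not reproduced on this page; a sketch consistent with the analysis above is:

(:durative-action do_reacquire
  :parameters (?v - Vehicle ?from ?to - Objectpoint)
  :duration (= ?duration (+ (* (distance_op ?from ?to) (mult_fact_time ?v))
                            (reacq_obj_time ?v)))
  :condition (and
    (at start (at_op ?v ?from))
    (at start (reachable_op ?from ?to))
    (at start (= (visited_op ?to) 0))
    (at start (= (reacquired_obj ?to) 0))
    (at start (>= (remaining_energy ?v)
                  (+ (* (distance_op ?from ?to) (energy_consumption_rate_moving ?v))
                     (* (energy_consumption_rate_still ?v) (reacq_obj_time ?v))))))
  :effect (and
    (at start (not (at_op ?v ?from)))
    (at end (at_op ?v ?to))
    (at end (decrease (remaining_energy ?v)
                      (+ (* (distance_op ?from ?to) (energy_consumption_rate_moving ?v))
                         (* (energy_consumption_rate_still ?v) (reacq_obj_time ?v)))))
    (at end (increase (consumed_energy ?v)
                      (+ (* (distance_op ?from ?to) (energy_consumption_rate_moving ?v))
                         (* (energy_consumption_rate_still ?v) (reacq_obj_time ?v)))))
    (at end (assign (visited_op ?to) 1))
    (at end (assign (reacquired_obj ?to) 1))
    (at end (increase (cnt_reacquired_obj ?v) 1))
    (at end (increase (prob_obj_quotient_sum ?v) (/ (prob_obj ?to) (prob_obj ?from))))
    (at end (increase (ent_obj_quotient_sum ?v) (/ (ent_obj ?to) (ent_obj ?from))))
    (at end (increase (mission_duration)
                      (+ (* (distance_op ?from ?to) (mult_fact_time ?v))
                         (reacq_obj_time ?v))))))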
The last action that is encoded in our domain is the durative action do inspect which
is illustrated in Snippet 17. Do inspect is intended for plans where the goal is to inspect
objects of interest and has three parameters: a vehicle v and two inspectionpoints from
and to. The duration of the action is estimated in the same manner as the duration of the
do reacquire action shown in Snippet 16. The only difference is that instead of using the
time that the vehicle needs to reacquire an object (reacq obj time...) we use the time
that the vehicle needs to inspect an object of interest from an inspectionpoint (insp obj -
time...) and instead of using the distance between two objectpoints we use the distance
between two inspectionpoints. The action precondition (condition...) is that at the start
of the action’s interval (the point at which the action is applied) the vehicle needs to be
at inspectionpoint from, inspectionpoint to must be reachable from inspectionpoint from, and inspectionpoint to must not have been visited. As a final precondition, the vehicle must have energy greater than or equal (>=) to the following: the distance between the
inspectionpoints from,to multiplied by the energy consumption rate of the vehicle while
moving plus the rate at which the vehicle consumes energy while maintaining its posi-
tion multiplied by the time the vehicle needs to inspect an object from an inspectionpoint.
Similar to the case of the do reacquire action the energy that the vehicle is estimated to
consume for executing the inspection action has two components: i) the first is the energy
that the vehicle will require in order to transition from one inspectionpoint to the other and
ii) the energy it will consume in order to inspect the object. That is, when the vehicle stops at inspectionpoint to it will maintain its position for some time in order to inspect the object from that position. Insp_obj_time... needs to be specified by the user. The effect of executing the do_inspect action is that the vehicle has left its initial location at the start of the action ((at start (not (at_ip ?v ?from)))). Moreover, at the end of the ac-
tion’s interval the vehicle moves to the location of the inspectionpoint to and the vehicle’s
remaining and consumed energy change based on the energy estimate that we presented
above while analysing the preconditions of the action. In addition, at the end of the action’s
interval, the effect of marking inspectionpoint to as visited takes place, as does the effect of marking the object as inspected from that location. Finally, at the end of the action's interval the
duration of the mission will increase by the same amount as the duration of the action.
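Snippet 17 is not reproduced on this page; a sketch consistent with the analysis above is:

(:durative-action do_inspect
  :parameters (?v - Vehicle ?from ?to - Inspectionpoint)
  :duration (= ?duration (+ (* (distance_ip ?from ?to) (mult_fact_time ?v))
                            (insp_obj_time ?v)))
  :condition (and
    (at start (at_ip ?v ?from))
    (at start (reachable_ip ?from ?to))
    (at start (= (visited_ip ?to) 0))
    (at start (>= (remaining_energy ?v)
                  (+ (* (distance_ip ?from ?to) (energy_consumption_rate_moving ?v))
                     (* (energy_consumption_rate_still ?v) (insp_obj_time ?v))))))
  :effect (and
    (at start (not (at_ip ?v ?from)))
    (at end (at_ip ?v ?to))
    (at end (decrease (remaining_energy ?v)
                      (+ (* (distance_ip ?from ?to) (energy_consumption_rate_moving ?v))
                         (* (energy_consumption_rate_still ?v) (insp_obj_time ?v)))))
    (at end (increase (consumed_energy ?v)
                      (+ (* (distance_ip ?from ?to) (energy_consumption_rate_moving ?v))
                         (* (energy_consumption_rate_still ?v) (insp_obj_time ?v)))))
    (at end (assign (visited_ip ?to) 1))
    (at end (assign (inspected_obj ?to) 1))
    (at end (increase (mission_duration)
                      (+ (* (distance_ip ?from ?to) (mult_fact_time ?v))
                         (insp_obj_time ?v))))))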
High level mission priorities are taken into consideration for the reacquisition of objects.
Changes in high level mission priorities can occur in real time and vehicles should be able
to accommodate and act upon them. Addressing the aforementioned challenge is a non-
trivial task. We assume the operator is able to communicate changes in mission priorities to
the AUV over the low-bandwidth acoustic channel, for instance in response to a changing
situation that the vehicle cannot monitor. We have considered three high level mission
priorities in total.
The first, which we call energy-efficient reacquisition, is intended for generating mis-
sion plans that minimize the consumed energy of the AUV or equivalently maximize the
remaining energy of the AUV for reacquiring and inspecting objects. That is, plans for reac-
quiring and inspecting objects in a manner that the smallest possible distance is travelled
given that: i) the rate at which the vehicle consumes energy while moving is assumed to be
the same for all actions of the same type, ii) the rate at which the vehicle consumes energy
while maintaining its position is assumed to be the same for all actions and iii) quantities
such as the time that the vehicle needs to classify objects or the time that the vehicle needs
to inspect an object from a position etc. are considered to be the same for their respective
action types. Minimizing energy consumption is achieved by doing two things. The first
has already been described in Section 7.3.1.1 and it involves functions remaining energy
and consumed energy in conjunction with distance functions, energy consumption rate
functions as well as time functions and their role as part of the preconditions and effects
of actions do reacquire and do inspect. The second is to use the following metric within
the PDDL problem definition6 : metric minimize consumed energy ?v or equivalently
metric maximize remaining energy ?v.
The second high level mission priority is intended for generating mission plans in which
the AUV reacquires and inspects detected objects that are the most probable to be objects of interest first (we call this probability-efficient reacquisition), while concurrently minimizing the energy that it consumes (this is the energy-efficient reacquisition that we described as the first high level mission priority). This effectively formulates a multi-objective op-
timization problem in which we want to have an energy-efficient plus probability-efficient
reacquisition which combined, constitute the second high level mission priority. To solve
this problem we employ the weighted sum scalarizing method that scalarizes a set of ob-
jectives into a single objective by multiplying each objective with a weight. The weighted
sum method is given by the following formulas:
\min F(x) = \sum_{i=1}^{n} w_i f_i(x)                    (7.1)

\text{s.t. } w_i \geq 0, \; i \in \{1, \ldots, n\}       (7.2)

\sum_{i=1}^{n} w_i = 1                                   (7.3)
where F(x) is the single scalarized objective, the f_i(x) are the individual objectives and the w_i are the weights. We
have already presented the energy-efficient reacquisition objective. Now, the probability-
efficient reacquisition objective in our case is formulated as minimizing the sum of proba-
bility quotients (prob_obj_quotient_sum ?v), as shown in Section 7.3.1.1. At this point, an example will be helpful in clarifying how this computation is made.
Let the outcome of classification be three objects (A, B, C) with probabilities of A =
0.53, B = 0.69 and C = 0.84, with each probability associated with each object representing
6 PDDL problems will be investigated in Section 7.3.1.3.
how likely it is for the respective object to be an object of interest. Also, let the vehicle be at
some location from which it needs to start moving towards the objects in order to reacquire
them. Probability-efficient reacquisition requires that the objects are reacquired in a prob-
ability efficient manner, that is, higher probability objects should be reacquired first, i.e.
before lower probability objects in a succession of reacquisitions. Even though the vehicle
is at some initial location init which is not associated with any kind of object or probability
we assign an arbitrarily large number to that location that represents a “probability” so that
the computation of probability-efficient reacquisition can take place. That is, minimizing
the sum of probability quotients with each quotient being calculated as shown in effect of
the do reacquire action in Snippet 16. That is, dividing the probability of the object at the
location that the vehicle will transition to with the probability of the object at the location
that the vehicle will transition from. As we mentioned earlier, the initial location of the
vehicle is not associated with any object or probability, however we assign an arbitrarily
large number to represent a “probability”, say 1000. Let us now calculate the sum of prob-
ability quotients for each possible reacquisition order which is shown in Table 7.3. As can
Table 7.3. Sum of probability quotients for each possible reacquisition sequence
(values recomputed here from the probabilities given above, rounded to three decimals).

init → A → B → C : 0.53/1000 + 0.69/0.53 + 0.84/0.69 ≈ 2.520
init → A → C → B : 0.53/1000 + 0.84/0.53 + 0.69/0.84 ≈ 2.407
init → B → A → C : 0.69/1000 + 0.53/0.69 + 0.84/0.53 ≈ 2.354
init → B → C → A : 0.69/1000 + 0.84/0.69 + 0.53/0.84 ≈ 1.849
init → C → A → B : 0.84/1000 + 0.53/0.84 + 0.69/0.53 ≈ 1.934
init → C → B → A : 0.84/1000 + 0.69/0.84 + 0.53/0.69 ≈ 1.590
be observed by looking at Table 7.3 the minimum value for the sum of probability quotients
corresponds to the sequence init → C → B → A in which the vehicle will reacquire targets. This sequence is the one in which higher probability objects are reacquired before lower probability objects; in our case 0.84 > 0.69 > 0.53.
Remember that we need both energy-efficient plus probability-efficient reacquisition
combined in one objective. In order to apply the weighted sum to form a single objective
we need to normalize each objective so that it is expressed in the same range. The linear
normalization is given by:
f_N(x) = \frac{newRange}{oldRange} \, (f(x) - oldMin) + newMin          (7.4)
where f_N(x) is the normalized objective, f(x) is the original one, oldMin is the minimum value of f(x), newRange (newMax − newMin) is the desired range of f_N(x), oldRange (oldMax − oldMin) is the range of f(x), and newMin is the desired minimum value of f_N(x).
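For instance, with illustrative numbers, normalizing a consumed energy value of f(x) = 1750 from an original range of [500, 3000] to [0, 1] gives f_N(x) = (1/2500)(1750 − 500) + 0 = 0.5. Once the objectives are normalized, the weighted sum can be encoded directly as a PDDL plan metric; a sketch with illustrative equal weights, assuming the normalization has been folded into the function values beforehand (the weights actually used may differ), is:

(:metric minimize (+ (* 0.5 (consumed_energy auv))
                     (* 0.5 (prob_obj_quotient_sum auv))))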
The third high level mission priority is intended for generating mission plans in which the AUV reacquires and inspects detected objects that have the highest entropy first (we call this entropy-efficient reacquisition) while concurrently minimizing the energy that it consumes.
Figure 7.13. The relationship between entropy H(x) and probability P(x).
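The probability-to-entropy conversion (equation 7.5, referenced again in Section 7.3.1.3) falls on a page not reproduced here; given the shape of the curve in Figure 7.13, which peaks at P(x) = 0.5 and vanishes at P(x) = 0 and P(x) = 1, it is presumably the standard binary (Shannon) entropy:

H(x) = -P(x) \log_2 P(x) - (1 - P(x)) \log_2 (1 - P(x))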
This third high level mission priority is ideal for cases where we want to prioritize high uncertainty reduction and high information gain first, in an energy-efficient manner. Figure 7.13 illustrates the relationship between entropy and probability. At this point we would like to clarify something: inspection always happens in an energy-efficient manner; entropy and probability are only considered during reacquisition.
Regarding the presentation of high level mission priorities in this section one might
wonder about the minimization of mission execution time since, after all, we have pre-
sented a temporal domain with durative actions in Section 7.3.1.1. The formulation of such
a high level mission priority is not prohibited by the temporal domain and the durative
actions we have presented. However, the minimization of the total execution duration is
already subsumed under the energy-efficient high level mission priority. More specifically,
by observing how we estimate energy consumption and the duration of actions (see Section
7.3.1.1) one can easily identify that the core element for both is the distance travelled. For
instance, in the case of the reacquisition action the multiplication factor for time will be the
same for every action of the same type as will be the estimate for the time that the vehicle
requires to reacquire an object7 . As such, reacquiring objects in different orders will yield
different distances for each action instantiation and different total mission execution dura-
tions. In the same manner, the energy consumption rate for the vehicle while moving is
assumed to be the same for every reacquisition action as is the energy consumption rate for
the vehicle while maintaining its position and the estimated time that the vehicle requires
to reacquire an object. Again, reacquiring objects in different orders will yield different
distances for each action instantiation and different consumed energy for each mission. In
both cases, that of duration and that of consumed energy, it is the minimization of distance
that will yield the optimized reacquisition both in terms of duration and energy. As such,
by minimizing for the estimated consumed energy of the vehicle or by maximizing for the
remaining energy of the vehicle we subsume the minimization of the estimated execution
duration. This also applies for classification actions where there are no transitions in space
for the vehicle and as such we do not utilize distances as the basis for estimating dura-
tions and energy consumptions. By observing the classification action in Section 7.3.1.1
one can identify that both duration and energy consumption depend on the estimated time
that the vehicle requires to classify objects. Regarding energy consumption in particular, it
also depends on the energy consumption rate of the vehicle while it maintains its position.
However, because for every classification action that rate is assumed to be the same, both
duration and energy consumption are driven by the same quantity. As such, minimizing
energy consumption or maximizing the remaining energy subsumes the minimization of
estimated execution duration.
The objects and the initial state of the lawnmower object detection problem can be observed by looking at Snippet 18. The reason for declaring lp7 as a lawnmowerpoint
Snippet 18 PDDL lawnmower object detection problem objects and initial state.
(:objects
lp1 lp2 lp3 lp4 lp5 lp6 lp7 - Lawnmowerpoint
auv - Vehicle
)
(:init
(at_lp auv lp7)
(= (visited_lp lp7) 1)
(reachable_lp lp7 lp1)
(= (distance_lp lp7 lp1) 5.00)
(reachable_lp lp1 lp2)
(= (distance_lp lp1 lp2) 20.00)
(= (visited_lp lp1) 0)
(= (obj_detection_complete lp1) 0)
...
...
(reachable_lp lp5 lp6)
(= (distance_lp lp5 lp6) 20.00)
(= (visited_lp lp5) 0)
(= (obj_detection_complete lp5) 0)
(= (visited_lp lp6) 0)
(= (obj_detection_complete lp6) 0)
(= (cnt_visited_lp auv) 0)
(= (remaining_energy auv) 30000.00)
(= (consumed_energy auv) 0)
(= (energy_consumption_rate_moving auv) 3)
(= (mult_fact_time auv) 4)
(= (det_obj_time auv) 0)
(= (mission_duration) 0)
)
is purely because the planner would not be able to form a plan in which the vehicle could
transition from its starting location to lp1 given that the parameters of the do hover -
detection action are a vehicle and two lawnmowerpoints. An alternative would be to have
included some sort of positioning action in the domain declaration which would take as pa-
rameters a vehicle, a location outside the lawnmower trajectory and a lawnmowerpoint. As
such, that action would be placed first in the action sequence (plan) that the planner would
form and then it would proceed with a sequence of do hover detection actions instead of
formulating detection plans just with do_hover_detection actions. However, this choice of ours does not affect the planning process in any other way, and by no means does it affect the essence or the outcome of the process. Having said that, let us now
dig into the initial state as it is depicted in Snippet 18. Initially the vehicle is at lp7
which has consequently been visited. In order to be able to encode the lawnmower trajec-
tory we use a series of visited lp and obj detection complete functions as well as
reachable lp predicates. More specifically, we have encoded that lp1 is only reachable
from lp7, lp2 from lp1, lp3 from lp2, . . . , lp6 from lp5. In this manner the sequence
of lawnmowerpoints that form the lawnmower pattern will be preserved. Moreover, in or-
der to represent the reality, we have encoded that none of the actual lawnmowerpoints has
been visited and that the object detection phase has not been completed up to any lawn-
mowerpoint by utilizing initial values for the visited lp and obj detection complete
functions at zero. The counter function of visited lawnmowerpoints is also initialized at
zero. Furthermore we have encoded the distance between consecutive lawnmowerpoints
(in meters) as well as the energy at which the vehicle consumes energy while moving so
that the remaining energy and consumed energy functions can calculate the remaining
and consumed energy of the vehicle for the whole plan execution. In addition, we have
chosen a value of four for the time multiplication factor used in estimating the duration
of actions. Note that the aforementioned values are for demonstration purposes and, as we mentioned before, need to be chosen empirically or learned from data. Also, values are expected to differ between vehicles. We are now ready to proceed with
the goal state which is illustrated in Snippet 19. By observing Snippet 19 we can see that,
as expected, the goal is for every lawnmowerpoint to have been visited and object detection to have been completed in the whole search area. In addition, the remaining energy of the vehicle must be greater than or equal (>=) to zero to indicate that, given the initial remaining vehicle energy set at some maximum value, the vehicle is able to perform the lawnmower pattern. If it does not have enough energy to do so, the problem becomes unsolvable.
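Snippet 19 is not reproduced on this page; based on the description just given, its goal state would be along these lines:

(:goal
  (and
    (= (visited_lp lp1) 1)
    (= (visited_lp lp2) 1)
    (= (visited_lp lp3) 1)
    (= (visited_lp lp4) 1)
    (= (visited_lp lp5) 1)
    (= (visited_lp lp6) 1)
    (= (obj_detection_complete lp1) 1)
    (= (obj_detection_complete lp2) 1)
    (= (obj_detection_complete lp3) 1)
    (= (obj_detection_complete lp4) 1)
    (= (obj_detection_complete lp5) 1)
    (= (obj_detection_complete lp6) 1)
    (>= (remaining_energy auv) 0)
  )
)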
Let us now present the problem of classifying detections. This is a simple problem with
respect to its structure as can be observed by looking at Snippet 20. The only objects
required for the problem are the vehicle auv and lawnmowerpoint lp6. Lawnmowerpoint
lp6 is the location where the vehicle stopped when it finished searching for objects. This is
reflected in the problem’s initial state where the number of visited lawnmowerpoints is
six, i.e. the number of lawnmowerpoints that were provided for the detection problem. Fur-
thermore, as part of the initial state we encode that the object classification phase has not been completed before. This is meant to indicate that classification occurs only once. We also encode the time the vehicle needs to classify objects (in secs), the remaining energy of the vehicle as well as the rate at which the vehicle consumes energy
while maintaining its position. Finally, we instantiate the mission duration at zero. The values chosen for the energy consumption rate, the remaining energy and the time required to classify an object are for demonstration purposes. The goal is, as expected, for the vehicle to perform the classification of detections and for the remaining energy of the vehicle after classification to be greater than or equal (>=) to zero. We assume that the vehicle is
equipped with a probabilistic classification system that yields classified objects with some
probability.
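Snippet 20 is not reproduced on this page; a sketch consistent with the description above, with illustrative numeric values, is:

(:objects
  lp6 - Lawnmowerpoint
  auv - Vehicle
)
(:init
  (at_lp auv lp6)
  (= (cnt_visited_lp auv) 6)
  (= (obj_classification_complete auv) 0)
  (= (class_obj_time auv) 60)              ; illustrative, in seconds
  (= (remaining_energy auv) 30000.00)      ; illustrative
  (= (energy_consumption_rate_still auv) 1) ; illustrative
  (= (mission_duration) 0)
)
(:goal
  (and
    (= (obj_classification_complete auv) 1)
    (>= (remaining_energy auv) 0)
  )
)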
Moving on, we have the problem of reacquiring the objects that the classification phase
has yielded. For the sake of our running example let us assume that the outcome of clas-
sification was two objects of interest found at objectpoint locations op1 and op2. With
respect to our problem we have four PDDL objects as illustrated in Snippet 21. The two
objectpoints that we just mentioned plus a vehicle auv and another objectpoint op3 which
in reality is the location from where the vehicle will start its object reacquisition phase.
Similar to the case of the detection problem where lp7 was encoded as a lawnmowerpoint, op3 is encoded as an objectpoint due to the do_reacquire action parameters being a vehicle and two objectpoints. Therefore, the vehicle initially needs to be at an “objectpoint” (op3) to be able to transition to either op1 or op2 to reacquire the first object. Again, this
does not affect the planning essence or outcome in any way. The remainder of the initial
state is that op3 has already been visited while op1 and op2 have not, and consequently
the objects at those locations have not been reacquired. Furthermore, we encode that from
op3 the vehicle can transition to any other objectpoint but it cannot transition back to op3
since op3 is not reachable from any other objectpoint. Distances between objectpoints are
also calculated based on their positions, hence they are known. In addition, the probabili-
ties of each object being an object of interest are also known from the classification phase
and we additionally convert them to entropy based on equation 7.5. Moreover, the values
of prob obj quotient sum auv and ent obj quotient sum auv are initially zero, the
vehicle has not consumed any energy, the remaining energy is at some maximum value, and
the counter function of reacquired objects is initialized at zero. Finally, the rates at which
the vehicle consumes energy while moving and while keeping its position are initialized
as is the time multiplication factor for estimating action duration, the time that the vehicle
requires to reacquire each object and the mission duration. With respect to the goal state
of the problem as well as the high level mission priority, we encode that the remaining energy of the vehicle should be greater than or equal (>=) to zero and that all objects must be reacquired in an energy-efficient manner (see Snippet 22). Also refer to Section 4.5.1 for a
reminder about plan metrics.
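Snippet 21 is not reproduced on this page; a sketch of its objects and initial state consistent with the description above, with illustrative values, is:

(:objects
  op1 op2 op3 - Objectpoint
  auv - Vehicle
)
(:init
  (at_op auv op3)
  (= (visited_op op3) 1)
  ; from op3 the vehicle can reach any objectpoint; op3 itself is unreachable
  (reachable_op op3 op1)
  (reachable_op op3 op2)
  (reachable_op op1 op2)
  (reachable_op op2 op1)
  (= (visited_op op1) 0)
  (= (visited_op op2) 0)
  (= (reacquired_obj op1) 0)
  (= (reacquired_obj op2) 0)
  (= (distance_op op3 op1) 12.00) ; illustrative distances in meters
  (= (distance_op op3 op2) 18.00)
  (= (distance_op op1 op2) 9.00)
  (= (distance_op op2 op1) 9.00)
  (= (prob_obj op3) 1000)         ; arbitrarily large "probability" of the start point
  (= (prob_obj op1) 0.69)         ; illustrative classification probabilities
  (= (prob_obj op2) 0.84)
  (= (ent_obj op3) 1000)          ; analogous arbitrarily large "entropy"
  (= (ent_obj op1) 0.893)         ; illustrative entropy values
  (= (ent_obj op2) 0.634)
  (= (prob_obj_quotient_sum auv) 0)
  (= (ent_obj_quotient_sum auv) 0)
  (= (cnt_reacquired_obj auv) 0)
  (= (remaining_energy auv) 30000.00)
  (= (consumed_energy auv) 0)
  (= (energy_consumption_rate_moving auv) 3)
  (= (energy_consumption_rate_still auv) 1)
  (= (mult_fact_time auv) 4)
  (= (reacq_obj_time auv) 30)
  (= (mission_duration) 0)
)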
Snippet 22 PDDL reacquisition problem goal state and energy efficiency as the high level
mission priority.
(:goal
(and
(>= (remaining_energy auv) 0)
(= (cnt_reacquired_obj auv) 2)
)
)
(:metric minimize (consumed_energy auv))
The last problem to present is the one of inspecting reacquired objects that are deemed to be objects of interest. For inspection we create six equally distributed inspectionpoints around every object of interest, which the vehicle visits in order to inspect the object from each one of them.
each one of them. The PDDL objects as well as the initial state for the inspection prob-
lem of one object of interest is illustrated in Snippet 23. Inspectionpoints ip1-ip6 are the
inspectionpoints around the object while ip7 is the “inspectionpoint” at which the vehicle
starts its mission moving towards the first inspectionpoint similar to the phases of detection
and reacquisition with lp7 and op3 respectively. Regarding the initial state of the in-
spection problem, we encode that the vehicle is at ip7 which is visited. Also, we encode
that: i) every inspectionpoint is reachable from any other inspectionpoint, ii) ip1-ip6 are
not visited, iii) the object has not been inspected from the inspectionpoint positions. Fi-
nally, we encode: i) the distances between all inspectionpoints, ii) the remaining energy
being at some maximum value, iii) the consumed energy being at zero, iv) the rates at
which the vehicle consumes energy while moving and while keeping its position, iv) the
time multiplication factor for estimating action duration and the time that the vehicle re-
quires to inspect each object from each inspectionpoint. Since with the inspection problem
we want to inspect an object, it is only logical to encode in the goal state that the object must be inspected by the vehicle from each inspectionpoint and that the remaining energy of the vehicle must be enough to do so (see Snippet 24). In addition, we instruct the
planner to generate a plan with energy efficiency in mind by utilizing metric minimize
(consumed energy auv).
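Snippet 23 is not reproduced on this page; an abbreviated sketch of its objects and initial state, with illustrative values and most of the pairwise reachability and distance facts elided in comments, is:

(:objects
  ip1 ip2 ip3 ip4 ip5 ip6 ip7 - Inspectionpoint
  auv - Vehicle
)
(:init
  (at_ip auv ip7)
  (= (visited_ip ip7) 1)
  (reachable_ip ip7 ip1)
  (reachable_ip ip1 ip2)
  (reachable_ip ip2 ip1)
  ; ... every inspectionpoint is reachable from any other one
  (= (visited_ip ip1) 0)
  (= (inspected_obj ip1) 0)
  ; ... and likewise for ip2-ip6
  (= (distance_ip ip7 ip1) 6.00) ; illustrative distances in meters
  (= (distance_ip ip1 ip2) 4.00)
  ; ... remaining distance facts omitted for brevity
  (= (remaining_energy auv) 30000.00)
  (= (consumed_energy auv) 0)
  (= (energy_consumption_rate_moving auv) 3)
  (= (energy_consumption_rate_still auv) 1)
  (= (mult_fact_time auv) 4)
  (= (insp_obj_time auv) 20)
  (= (mission_duration) 0)
)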
Snippet 24 PDDL inspection problem goal state and energy efficiency as the high level
mission priority.
(:goal
(and
(= (inspected_obj ip1) 1)
(= (inspected_obj ip2) 1)
(= (inspected_obj ip3) 1)
(= (inspected_obj ip4) 1)
(= (inspected_obj ip5) 1)
(= (inspected_obj ip6) 1)
(>= (remaining_energy auv) 0)
)
)
(:metric minimize (consumed_energy auv))
Figure 7.14. Planning ontology class hierarchy. We have intentionally omitted the world,
components, capabilities, actions and vehicle ontologies for better readability.
As opposed to domain definitions, problem definitions are dynamic. Domain elements such as object types, predicates, action duration estimates, preconditions and effects do not change. As such we can statically assert them inside the ontology before starting a mission so that the system can dynamically reconstruct them and make them available to the planner. What changes, though, are the objects that bind the variables/parameters of domains, and these are part of the problem. If we want to perform reacquisition of objects, for
example, we are not in the position to formulate the reacquisition problem until we have the
objects. In the event that the objects are known beforehand we can assert them inside the
Table 7.4. Relations between the domain instance and all its types, predicates, functions and
actions which are referred to as building block instances.
planning ontology and use them to reconstruct the problem by encoding distances, probabilities etc. In the event, though, that we first need to detect and classify them so that we have all the necessary information, the reacquisition problem cannot be encoded in full inside the planning ontology, and consequently it cannot be reconstructed. An additional example is
when a high level mission priority changes. The definition of the problem in the ontology
must change as well so that it can be reconstructed accurately for the planner. Therefore, on
top of dynamically reconstructing the planning problem from the ontology as in the case of
the planning domain we additionally perform its encoding (dynamically) at runtime. Let us
now proceed with the classes, instances and properties that are related to PDDL planning
problems.
Class PDDLProblem is a generic class and is intended for encoding planning problems.
Given the problems that we are interested in within a domain, we can encode specialized subclasses to describe them, that is, one (sub)class per problem. Equivalently, we can perform a categorization that groups all problems within a domain, encode only one problem class for that domain, and have multiple instances that describe the individual problems. The latter is what we have chosen for our planning ontology with respect to problems. That is, we have one instance of the PDDLAutOpProblem class per problem. Each such in-
stance is related to autonomous operations problem object, initial and goal state as well
as high level mission priority and problem metric instances via hasPDDLProblemObject,
hasPDDLInitialState, hasPDDLGoalState, hasPDDLProblemPriority and hasPDDLProb-
lemMetric object properties respectively. For example, for the initial state of a problem we
have PDDLAutOpProblem1 hasPDDLInitialState PDDLAutOpProblemInitialState1 while
for the goal state we have PDDLAutOpProblem1 hasPDDLGoalState PDDLAutOpProb-
lemGoalState1 and so on. Now, each initial and goal state instance as well as problem
metric instance is related to its syntax and semantics via value string properties. As for
high level mission priority instances, they are related to their names via name string data
properties. Further, high level mission priority instances are related to a priorityChanged
boolean data property. The purpose of the aforementioned data property is to notify the executor, which monitors the data property, of a high level mission priority change. If a change has occurred then the executor initiates a new planning-execution cycle. How a high level
mission priority and the priorityChanged data property are updated within the planning
ontology is explained in Section 7.3.2.1. Moving on with the analysis of the planning on-
tology, class Area is a generic class that is intended for encoding search areas inside the
(planning) ontology and can be specialized based on the type of search we want to perform;
in our case LawnmowerArea. For every lawnmower area we create an instance of the class
and relate it to four area points, i.e. four instances of the Areapoint class, via consistsOf
object properties. These area points correspond to the edges of the rectangular area and are
related to instances of the Position3D8 class via position object properties which in turn are
related to the 3D coordinates of the points via north, east and depth float data properties.
Furthermore each lawnmower area instance has three data properties: index, spacing and
overlap. Index takes integer values in the range of [1, 4] and is used for indicating which of the four area points will be used as the initial point of the first lawnmower leg. More-
over, spacing is used for defining the spacing between the lawnmower legs and is expressed
in meters while with overlap we define the overlap that the legs should have with each other.
Overlap is expressed as a percentage. Lawnmower area related information is passed to a script for generating lawnmowerpoints by instantiating the Lawnmowerpoint class shown in Figure 7.14 as appropriate. That is, each lawnmowerpoint instance is linked to Pose6D instances via pose object properties, while each Pose6D instance is linked to its Position3D and Orientation3D instances via position and orientation object properties respectively. To complete the representation, each Position3D instance is linked to its 3D position via north, east, depth float data properties while each Orientation3D instance is linked to its 3D orientation via roll, pitch, yaw float data properties. Recall from Section 7.2.1 that orien-
tation does not refer to a waypoint (lawnmowerpoint, objectpoint, inspectionpoint) per se,
but rather to the orientation the vehicle should have when visiting this waypoint. Object-
points as well as inspectionpoints are instantiated in the same manner given the detection of
objects that we want to inspect. This concludes our analysis of defining planning problems
within the planning ontology and by extension it concludes the analysis of the planning
ontology itself.
Initially, before any priority change is issued to the vehicle, the planning ontology encodes a
high level mission priority for which the priorityChanged data property is false. Whenever
a high level mission priority change is issued to the vehicle, a set of Prolog rules tasked with updating the priority inside the planning ontology is triggered. The new, desired priority is first checked against the current priority and if it is the same then nothing
8 Recall that we are importing the world ontology in our planning ontology which gives us access to all its
classes and properties.
happens and the priorityChanged data property remains false. As such, the executor that
monitors high level mission priorities continues executing the current plan. However, in the
event that the new priority is different from the one encoded inside the planning ontology
then the new priority becomes the current while at the same time the priorityChanged data
property is switched to true. In this case, the executor will stop executing the current plan
and update the planning problem as appropriate (including metrics) so that a new planning
phase can be initiated and a new plan be created. Once the new plan is created then the
priorityChanged data property is switched back to false. This is done in order for the
executor to be notified that the plan that it currently holds for execution is up to date with
respect to high level mission priorities. The process as a whole is repeated every time a high
level mission priority change is issued to the vehicle and as was mentioned in the beginning
of this section it is realised as a set of Prolog rules.
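A minimal sketch of such a rule, with assumed predicate, class and property names (the actual identifiers in the planning ontology may differ), could look as follows:

% Update the high level mission priority inside the planning ontology.
% If the requested priority differs from the current one, replace it and
% flip priorityChanged to true so that the executor initiates replanning.
update_mission_priority(NewPriority) :-
    rdfs_individual_of(PriorityInst, plan:'PDDLProblemPriority'),
    rdf_has(PriorityInst, plan:'name', literal(Current)),
    ( Current == NewPriority
    ->  true % same priority: priorityChanged remains false
    ;   rdf_retractall(PriorityInst, plan:'name', _),
        rdf_assert(PriorityInst, plan:'name', literal(NewPriority)),
        rdf_retractall(PriorityInst, plan:'priorityChanged', _),
        rdf_assert(PriorityInst, plan:'priorityChanged', literal(true))
    ).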
each action has executed since their actual duration is unknown at the time of generating the
plan that involves them. The same goes for estimated and actual start and end time points.
Regarding actionStatus, initially each action is marked as non executed, during execution
is marked as pending and at the end of the execution as executed. Regarding the outcome
of each action (actionOutcome), it is assigned after each action has executed and it is ei-
ther successful or unsuccessful depending on the outcome. Finally, class MissionPhase is
intended for keeping track of the phase in which mission execution is (e.g. classification,
reacquisition etc.) while class ProblemPhase logs the nature of the problem that gave birth
to the latest plan that was formed by the planner (e.g. inspection problem etc.).
• Extension of the approach followed in KnowRob for modelling components with the inclusion of modelling their health status so that we could model component faults at runtime.
10 An action has underrun if its estimated duration is higher than its actual one when executed.
11 An action has overrun if its estimated duration is lower than its actual one when executed.
Let us now elaborate on the above list. Architecturally, the framework is divided into three
main interacting parts: i) knowledge representation, ii) reasoning and iii) planning and
execution.
Regarding knowledge representation and reasoning, this is based on KnowRob which
utilizes ontologies and Prolog respectively. However, given that KnowRob was developed
in the scope of household robots, we have developed our own set of ontologies and rea-
soning modules in order to accommodate persistently autonomous (underwater) operations, in this manner extending KnowRob to the underwater domain. More specifically, the framework comprises a world ontology for modelling the environment in which an AUV performs autonomous operations. Regarding the modelling of AUVs and their internal state, the framework comprises components, capabilities, actions and vehicle ontologies.
These ontologies are interconnected in a bottom up fashion with the components ontology
lying at the very basis, followed by the capabilities ontology at a higher modelling level,
followed by the actions ontology and finally by the vehicle ontology. This bottom up ap-
proach is based on the bottom up approach of KnowRob. However, we have enriched the
modelling of components by modelling their health status which indicates whether com-
ponents are functional or non functional. In this manner, we are able to model component
faults at runtime. As such, a vehicle may start with a set of capabilities given the existence
and the health status of components but this (set) can change in the presence of component
faults. By extension, the initial set of actions that the vehicle can perform will be affected
given a reduced set of capabilities. Another modelling extension we have implemented
is the existence of semantically equivalent capabilities in the presence of redundant com-
ponents. Semantically equivalent capabilities represent alternative ways of achieving the
same type of capability. The notion of semantic equivalence is also extended to actions
in the presence of semantically equivalent capabilities for the same type of action. This
assessment of the vehicle internal state is dynamic and in real time as the vehicle operates
and is made possible with reasoning modules implemented in Prolog.
Our framework is also complemented by a mission planning and execution system
which is based on the PANDORA project. As in the case of PANDORA, the system com-
prises one PDDL planner and one executor. However as opposed to PANDORA, we divide
planning and execution information into two distinct ontologies, i.e. a planning and an
execution ontology, which exchange information. This distinction makes the system
modular and more flexible, especially in cases where different planning approaches need to be
implemented by other users. In addition, we do not only encode planning problems in-
side the planning ontology but we additionally encode the planning domain which along
with planning problems are dynamically reconstructed and fed into the PDDL planner to
generate mission plans. In this manner the complete information available to the planner is
made accessible to the user through querying the planning ontology. Further, we consider
changes in high level mission priorities during mission execution which the framework can
accommodate by taking action on the planning level (adaptive planning). The framework is
also designed to take action on the planning level in the event that component faults appear
in the vehicle. However, the framework favours adaptive execution over adaptive planning
where possible; that is, in cases where no component fault has deprived the vehicle of a
capability type that is in turn critical for an action type. When such a capability is lost, the
vehicle has to devise alternative plans (adaptive planning) to satisfy mission goals to the
maximum extent possible.
Chapter 8

Persistently Autonomous Mine Countermeasures
Underwater Mine Countermeasures (MCM) lie within the spectrum of underwater opera-
tions and deal with the identification of mined areas to be avoided as well as localization
and neutralization of individual mines [180]. The purpose of MCM operations within the
Unmanned Underwater Vehicles (UUVs) context can be summarized into the following:
“to field a common set of unmanned, modular MCM systems operated from a variety of
platforms or shore sites that can quickly counter the spectrum of threat mines assuring ac-
cess to naval forces with minimum mine risk” [181]. MCM can be broken down into the
following four phases: i) Detect (D), ii) Classify (C), iii) Identify (I) and iv) Neutralize
(N) [181]. These phases can be either performed by one or more vehicles in one or more
passes. For instance, in the case of one vehicle and two passes, the approach to MCM
could be that in the first pass the vehicle performs Detection (D) and Classification (C)
of Mine-Like Objects (MLOs) using appropriate sensors (e.g. side-looking sonar). In the
second pass, the vehicle performs Identification (I) of mines using electro-optic sensors
(e.g. cameras) and neutralization (N) of those deemed to be mines using some neutraliser
(e.g. a stationary bomblet that is placed in the area and is remotely detonated later using an
acoustic command [181]). The aforementioned scenario can be described as DC/IN (four
phases in two passes). In most cases, depending on the number of vehicles and their capa-
bilities, MCM operations can be realized as different combinations of the four phases. An
alternative approach to MCM, which we follow in this chapter, is to perform Detection (D),
Classification (C), Reacquisition (R) and Inspection (In) of mines maintaining however the
four-phase, two-pass approach. That is, DC/RIn.
As was mentioned in the introduction of this thesis (see Section 1.1.1), persistently au-
tonomous vehicles are the key to addressing a series of open challenges in autonomous
underwater operations. These include vehicle failures, partially known and dynamic envi-
ronments, and minimization of human in the loop occurrences. Such challenges are also
dominant in the underwater MCM environment. In addition, the impact of failing to address
them can be very costly, not only with respect to funds (e.g. losing a vehicle) but also
with respect to human lives due to the highly dangerous nature of mines. Consequently,
when using AUVs for autonomous underwater MCM operations, it is of utmost importance
that these vehicles demonstrate persistence in their autonomy [179].
Within the MCM world ontology, the Circle class is used for instantiating circle feature
detections based on which the classification system will create and assert Mlo (Mine-Like
Object) instances into the ontology. Such assertions are associated with some probability
and entropy. The Mine class on the other hand is intended
for instantiating mines based on Mlo instances that are deemed to be mines after the AUV
has reacquired them.
With respect to the components ontology for MCM, it has been extended so as to
instantiate the necessary detectors, classifiers as well as MLO reacquisition modules that
the AUV is equipped with and that are required for the MCM setting.
With respect to the capabilities ontology for MCM, we have extended the DetectionCa-
pability, ClassificationCapability and ReacquisitionCapability classes shown in Figure 7.5
with MineDetectionCapability, MineClassificationCapability and MloReacquisitionCapa-
bility classes respectively. In addition, as in the case of the capabilities ontology shown in
Figure 7.5, we have extended the aforementioned MCM related capability classes with ad-
ditional classes to represent semantically equivalent capabilities in the MCM setting. The
outcome of the extended MCM capabilities ontology is shown in Figure 8.3. This extension
of the capabilities ontology is necessary given that we want to represent capabilities within
an MCM setting and not some generic autonomous operations scenarios. At this point we
would like to emphasize that the dependency relations shown in Table 7.1 are also extended
accordingly. For instance, the primary mine detection capability of the vehicle now de-
pends on the forward looking sonar and the mine detection module while the secondary
mine detection capability on the forward looking camera and the mine detection module
and so on.
Figure 8.3. MCM capabilities ontology class hierarchy. We have intentionally omitted the
extended MCM components ontology for better readability.
Similarly, with respect to the actions ontology for MCM, we have extended the
DoHoverDetection, DoClassify, DoReacquire and DoInspect action classes shown
in Figure 7.6 with DoHoverMineDetection, DoClassifyMine, DoReacquireMlo and DoIn-
spectMine classes respectively. Moreover, as in the case of the actions ontology shown in
Figure 7.6 we have extended the aforementioned MCM related action classes with addi-
tional classes to represent semantically equivalent actions in the MCM setting. Figure 8.4
illustrates the outcome of this extension.

Figure 8.4. MCM actions ontology class hierarchy. We have intentionally omitted the ex-
tended MCM components and capabilities ontologies for better readability.

Again, similar to the case of capabilities, the dependency relations shown in Table 7.2 are
extended accordingly. For instance, the primary MLO reacquisition action now depends on
the primary MLO reacquisition capability and on the primary navigation capability, while
the mine classification action now depends on the mine classification capability and so on.
Finally, regarding the vehicle ontology, we modified it by importing the extended com-
ponents, capabilities and actions ontologies as opposed to the standard ones. In addition,
we updated the hasComponent, hasCapability and hasAction class level relations within the
ontology so that we can accurately represent what the vehicle is expected to have at its
disposal. This was done so that the Prolog vehicle system identification and system update
phases will be able to build, monitor and update the vehicle internal state correctly during
MCM operations.
In comparison to the planning ontology shown in Figure 7.14, the MCM mission planning
ontology shown in Figure 8.5 features some changes to the class names that referred to
autonomous operations in general, so as to reflect the MCM setting.
Figure 8.5. MCM planning ontology class hierarchy. We have intentionally omitted the
extended MCM world, components, capabilities, actions and vehicle ontologies for better
readability.
No core change has been made to the class hierarchy and no change at all has been made
to the nature of ontology properties. What changed though is the domain and problem defi-
nitions inside the planning ontology to support the syntax and semantics of temporal MCM
domains and problems. For instance, in defining the DoReacquireMlo action we create an
instance of the DoReacquireMlo class and relate it to its parameters, duration, precondition
and effect via parameter, duration, precondition and effect string data properties as in the
case of the DoReacquire action reconstructed from the autonomous operations planning on-
tology. However, the parameters are no longer a vehicle and two objectpoints but a vehicle
and two mine-like object points (Mlopoints). The same goes for the duration, precondition
and effect of this and all other actions, as well as for everything else included within the
MCM domain and problems (e.g. predicates, functions, initial and goal states etc.), all of
which we adjust accordingly.
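For concreteness, a sketch of what those string data property values could hold for the
DoReacquireMlo instance follows; the property names track the prose above and the strings
are fragments of Snippet 26, though the actual serialization inside the ontology may differ:

parameter:    "?v - Vehicle ?from ?to - Mlopoint"
duration:     "(= ?duration (+ (* (distance_mlop ?from ?to) (mult_fact_time ?v)) (reacq_mlo_time ?v)))"
precondition: "(at start (at_mlop ?v ?from)) (at start (reachable_mlop ?from ?to)) ..."
effect:       "(at start (not (at_mlop ?v ?from))) (at end (at_mlop ?v ?to)) ..."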
Recall from Section 7.3.2 that the planning ontology, except for defining PDDL do-
mains and problems, is used for reconstructing them so that they become available for the
planner to generate mission plans. Consequently, this also applies to the MCM planning
ontology. Snippet 25 illustrates the PDDL MCM domain types, predicates and functions in
PDDL syntax as they are reconstructed from the MCM planning ontology. When compared
to the types, predicates and functions of the autonomous operations planning ontology the
similarity is very obvious. In fact, the only differences are that: i) instead of objectpoints
we now have mlopoints, which are waypoints that a vehicle visits in order to reacquire
mine-like objects, and ii) instead of predicates and functions related to objects and
objectpoints we now have predicates and functions whose subjects are mine-like objects,
mlopoints and mines. Moreover, inspectionpoints now represent points around mines
that the vehicle visits in order to inspect the mines. Let us now proceed with the temporal
MCM domain actions reconstructed from the MCM planning ontology. As one might ex-
pect, temporal MCM domain durative actions are variations of the autonomous operations
temporal domain durative actions that were presented in Snippets 14, 15, 16 and 17 and
can be found in Section 7.3.1.1. To avoid unnecessary repetition we only present the re-
constructed do reacquire mlo action shown in Snippet 26. The remaining temporal MCM
domain durative actions, i.e. do hover mine detection, do classify mine and do inspect -
mine, can be obtained by substituting the autonomous operations temporal domain types,
predicates and functions inside the do hover detection, do classify and do inspect actions
with the temporal MCM domain types, predicates and functions.
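As an illustration of this substitution, a sketch of the do inspect mine durative action is
given below. It mirrors the structure of Snippet 26 with inspectionpoints in place of
mlopoints; it is our reading of the substitution rather than a verbatim reconstruction from
the ontology, so details may differ from the actual definition.

(:durative-action do_inspect_mine
 :parameters (?v - Vehicle ?from ?to - Inspectionpoint)
 :duration (= ?duration (+(*(distance_ip ?from ?to)
     (mult_fact_time ?v)) (insp_mine_time ?v)))
 :condition (
  and (at start (at_ip ?v ?from))
      (at start (reachable_ip ?from ?to))
      (at start (= (visited_ip ?to) 0))
      (at start (= (inspected_mine ?to) 0))
      (at start (>= (remaining_energy ?v) (+(* (distance_ip ?from ?to)
          (energy_consumption_rate_moving ?v)) (*(insp_mine_time ?v)
          (energy_consumption_rate_still ?v)))))
 )
 :effect (
  and (at start (not (at_ip ?v ?from)))
      (at end (at_ip ?v ?to))
      (at end (decrease (remaining_energy ?v) (+(* (distance_ip ?from ?to)
          (energy_consumption_rate_moving ?v)) (*(insp_mine_time ?v)
          (energy_consumption_rate_still ?v)))))
      (at end (increase (consumed_energy ?v) (+(* (distance_ip ?from ?to)
          (energy_consumption_rate_moving ?v)) (*(insp_mine_time ?v)
          (energy_consumption_rate_still ?v)))))
      (at end (increase (visited_ip ?to) 1))
      (at end (increase (inspected_mine ?to) 1))
      (at end (increase (mission_duration) ?duration))
 )
)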
Durative action do reacquire mlo is intended for plans where the goal is to reacquire
mine-like objects in the environment and has three parameters: a vehicle v and two mlo-
points from and to. The estimated duration of the action is identical to the one presented
for the do reacquire action (see Snippet 16), but instead of adding the time that the vehicle
requires to reacquire an object in general we specialise this as the time the vehicle
requires to reacquire a mine-like object. Again, the value for the time that the vehicle needs to
reacquire a mine-like object can be chosen empirically or learned from data. For the action
to be applicable, at the start of the action’s interval (the point at which the action is ap-
plied), the vehicle needs to be at mlopoint (location) from, mlopoint to must be reachable
from mlopoint from and mlopoint to must not have been visited. Moreover, at the start of
the action's interval, the mine-like object at mlopoint to must not have been reacquired.
As a final precondition the vehicle must have energy greater than or equal to (>=) the
following: the distance between mlopoints from,to multiplied by the energy consumption
rate of the vehicle while moving plus the rate at which the vehicle consumes energy while
maintaining its position multiplied by the time the vehicle needs to reacquire a mine-like
object. The effect of executing the do reacquire mlo action is that the vehicle has left its
Snippet 25 Temporal PDDL MCM domain types, predicates and functions reconstructed
from the MCM planning ontology.
(:types Lawnmowerpoint Mlopoint Inspectionpoint Vehicle)
(:predicates
(at_mlop ?v - Vehicle ?mlop - Mlopoint)
(reachable_mlop ?mlop1 ?mlop2 - Mlopoint)
(at_ip ?v - Vehicle ?ip - Inspectionpoint)
(reachable_ip ?ip1 ?ip2 - Inspectionpoint)
(at_lp ?v - Vehicle ?lp - Lawnmowerpoint)
(reachable_lp ?lp1 ?lp2 - Lawnmowerpoint)
)
(:functions
(visited_lp ?lp - Lawnmowerpoint)
(visited_mlop ?mlop - Mlopoint)
(visited_ip ?ip - Inspectionpoint)
(distance_lp ?lp1 ?lp2 - Lawnmowerpoint)
(distance_mlop ?mlop1 ?mlop2 - Mlopoint)
(distance_ip ?ip1 ?ip2 - Inspectionpoint)
(mine_detection_complete ?lp - Lawnmowerpoint)
(mine_classification_complete ?v - Vehicle)
(reacquired_mlo ?mlop - Mlopoint)
(inspected_mine ?ip - Inspectionpoint)
(cnt_reacquired_mlo ?v - Vehicle)
(cnt_visited_lp ?v - Vehicle)
(remaining_energy ?v - Vehicle)
(consumed_energy ?v - Vehicle)
(energy_consumption_rate_moving ?v - Vehicle)
(energy_consumption_rate_still ?v - Vehicle)
(prob_mlo ?mlop - Mlopoint)
(ent_mlo ?mlop - Mlopoint)
(prob_mlo_quotient_sum ?v - Vehicle)
(ent_mlo_quotient_sum ?v - Vehicle)
(det_obj_time ?v - Vehicle)
(class_mlo_time ?v - Vehicle)
(reacq_mlo_time ?v - Vehicle)
(insp_mine_time ?v - Vehicle)
(mult_fact_time ?v - Vehicle)
(mission_duration)
)
Snippet 26 PDDL temporal MCM domain do reacquire mlo durative action reconstructed
from the MCM planning ontology.
(:durative-action do_reacquire_mlo
:parameters (?v - Vehicle ?from ?to - Mlopoint)
:duration (= ?duration (+(*(distance_mlop ?from ?to)
(mult_fact_time ?v)) (reacq_mlo_time ?v)))
:condition (
and (at start (at_mlop ?v ?from))
(at start (reachable_mlop ?from ?to))
(at start (= (visited_mlop ?to) 0))
(at start (= (reacquired_mlo ?to) 0))
(at start (>= (remaining_energy ?v) (+(* (distance_mlop ?from ?to)
(energy_consumption_rate_moving ?v)) (*(reacq_mlo_time ?v)
(energy_consumption_rate_still ?v)))))
)
:effect (
and (at start (not (at_mlop ?v ?from)))
(at end (at_mlop ?v ?to))
(at end (decrease (remaining_energy ?v) (+(* (distance_mlop ?from ?to)
(energy_consumption_rate_moving ?v)) (*(reacq_mlo_time ?v)
(energy_consumption_rate_still ?v)))))
(at end (increase (consumed_energy ?v) (+(* (distance_mlop ?from ?to)
(energy_consumption_rate_moving ?v)) (*(reacq_mlo_time ?v)
(energy_consumption_rate_still ?v)))))
(at end (increase (visited_mlop ?to) 1))
(at end (increase (reacquired_mlo ?to) 1))
(at end (increase (cnt_reacquired_mlo ?v) 1))
(at end (increase (prob_mlo_quotient_sum ?v) (/(prob_mlo ?to)
(prob_mlo ?from))))
(at end (increase (ent_mlo_quotient_sum ?v) (/(ent_mlo ?to)
(ent_mlo ?from))))
(at end (increase (mission_duration) ?duration))
)
)
initial location at the start of the action ((at start (not (at_mlop ?v ?from)))) and
moved to the location of mlopoint to. Additionally, at the end of the action’s interval, the
vehicle’s remaining and consumed energy will change according to what we have presented
above as being the energy the vehicle requires to perform the action. Furthermore, at the
end of the action's interval, mlopoint to will be marked as visited, the mine-like object will
be marked as reacquired and the number of reacquired objects will increase by one. Also,
at end (increase (prob_mlo_quotient_sum ?v) (/(prob_mlo ?to) (prob_mlo ?from)))
increases, at the end of the action's interval, the sum of probability quotients by the
probability of the mine-like object at mlopoint to being a mine divided by the probability
of the mine-like object at mlopoint from being a mine. Finally, at end (increase
(ent_mlo_quotient_sum ?v) (/(ent_mlo ?to) (ent_mlo ?from))) achieves the same
outcome but with entropy instead of probability.
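Reading the expressions directly off Snippet 26, the estimated duration and energy cost of
a do reacquire mlo action between mlopoints from and to can be summarized as follows
(the symbols are our shorthand, not names used in the thesis):

\[
\Delta t = d(\mathit{from},\mathit{to}) \cdot k_t + t_{\mathit{reacq}}, \qquad
E = d(\mathit{from},\mathit{to}) \cdot r_{\mathit{moving}} + t_{\mathit{reacq}} \cdot r_{\mathit{still}},
\]

where d is distance_mlop, k_t is mult_fact_time, t_reacq is reacq_mlo_time and
r_moving, r_still are the two energy consumption rates.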
Regarding the reconstructed MCM problems from the planning ontology, they are vari-
ations of the detection, classification, reacquisition and inspection problems that were pre-
sented in Snippets 18-24 and can be found in Section 7.3.1.3. As such, and to avoid un-
necessary repetition, we only present an exemplar reconstructed PDDL MLO reacquisition
problem (see Snippet 27). The remaining MCM problems, i.e. detection, classification and
inspection of mines, can be obtained by substituting the autonomous operations objects,
predicates and functions inside Snippets 18-24 with the MCM objects, predicates and func-
tions given the area that needs surveying, the mine-like objects found in the environment
and the desired MCM high level mission priorities. Now, according to the exemplar reac-
quisition problem in Snippet 27, the planner must generate a plan in which the AUV will
reacquire two MLOs (located at their respective mlopoints) without spending more energy
than it has available (goal state). The initial state indicates that the AUV starts at mlop3, the
MLO at this location has been reacquired and the distances between mlopoints (MLOs) are
known. Associated probabilities, entropy, remaining energy as well as consumed energy
are also known. Furthermore, the energy consumption rates for the vehicle while moving
and while maintaining its position are also provided as is the multiplication factor for time
and the time the vehicle requires to reacquire a mine-like object. Again, as we stated before,
the energy consumption rates as well as the multiplication factor for time and the reacquisi-
tion time need to be chosen empirically or learned from data. Finally, the metric minimize
(consumed_energy auv) shown in Snippet 27 indicates that the high level priority of the
mission is for the AUV to reacquire the MLOs located at mlop1 and mlop2 in a way that
the energy consumed is minimized or, equivalently, the remaining energy on the vehicle is
maximized, i.e. the shortest possible distance is travelled.
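As a quick sanity check, applying the duration and energy expressions of Snippet 26 to the
leg from mlop3 to mlop2 of Snippet 27 (distance 4.32, mult_fact_time 10, reacq_mlo_time
2, with the rates 3 and 1.5 given in the initial state) gives:

\[
\Delta t = 4.32 \times 10 + 2 = 45.2 \text{ s}, \qquad
E = 4.32 \times 3 + 2 \times 1.5 = 15.96 \text{ units}.
\]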
As in the case of the MCM planning ontology the MCM execution ontology features some
class name changes when compared to the autonomous operations execution ontology pre-
Snippet 27 Exemplar PDDL MLO reacquisition problem reconstructed from the planning
ontology. Energy efficiency is the chosen high level mission priority.
(:objects
mlop1 mlop2 mlop3 - Mlopoint
auv - Vehicle
)
(:init
(at_mlop auv mlop3)
(= (visited_mlop mlop1) 0)
(= (visited_mlop mlop2) 0)
(= (visited_mlop mlop3) 1)
(= (reacquired_mlo mlop1) 0)
(= (reacquired_mlo mlop2) 0)
(reachable_mlop mlop3 mlop1)
(reachable_mlop mlop3 mlop2)
(reachable_mlop mlop1 mlop2)
(reachable_mlop mlop2 mlop1)
(= (distance_mlop mlop3 mlop1) 6.05)
(= (distance_mlop mlop3 mlop2) 4.32)
(= (distance_mlop mlop1 mlop2) 7.19)
(= (distance_mlop mlop2 mlop1) 7.19)
(= (prob_mlo mlop1) 0.606)
(= (ent_mlo mlop1) 0.967)
(= (prob_mlo mlop2) 0.755)
(= (ent_mlo mlop2) 0.803)
(= (prob_mlo mlop3) 1000.0)
(= (ent_mlo mlop3) 1000.0)
(= (cnt_reacquired_mlo auv) 0)
(= (prob_mlo_quotient_sum auv) 0)
(= (ent_mlo_quotient_sum auv) 0)
(= (remaining_energy auv) 30000)
(= (consumed_energy auv) 0)
(= (energy_consumption_rate_moving auv) 3)
(= (energy_consumption_rate_still auv) 1.5)
(= (mult_fact_time auv) 10)
(= (reacq_mlo_time auv) 2)
(= (mission_duration) 0)
)
(:goal
(and
(>= (remaining_energy auv) 0)
(= (cnt_reacquired_mlo auv) 2)
)
)
(:metric minimize (consumed_energy auv))
sented in Section 7.3.3. Class name changes were performed in order to adjust the au-
tonomous operations execution ontology to the MCM setting. For the same reason, the
MCM execution ontology also features some class extensions.
More specifically, as can be observed by looking at Figure 8.6, we now have the Mcm-
MissionPlan class and its various subclasses adjusted to the MCM setting instead of the Au-
tOpMissionPlan class and its respective subclasses that are shown in Figure 7.15. Moreover
the MCM execution ontology now features a McmMissionAction class instead of an AutOp-
MissionAction class which in turn is extended by subclasses that are intended for holding
information related to the actions executed during MCM missions. In fact, the information
that is being logged is the same as in the case of the autonomous operations execution ontol-
ogy (see Section 7.3.3). Having said that, we would like to emphasize that both the object
and the data properties that were presented in Section 7.3.3 remain unchanged, not only
with respect to the MCM execution ontology action classes but with respect to all classes. They (the
properties) are just applied among instances that originate from the adjusted and extended
classes. Moving on with the changes, classes ProblemPhase and MissionPhase are ex-
tended with McmProblemPhase and McmMissionPhase classes respectively. Finally with
respect to mission points, class MissionPoint is extended with a McmMissionPoint class
which in turn is extended with all the necessary point classes for MCM missions.
The simulated experiments presented in the remainder of this chapter are conducted using
the UWSim underwater simulator, with which researchers can simulate and test their work
regarding not only underwater but also surface2 vehicles and missions.
One of the main features of the simulator is that we can represent any underwater en-
vironment with an XML file comprising all the necessary elements to do so. At the very
basis of the XML lies the inclusion of a 3D scene that represents the core environment and
needs to be in a format that can be read by OpenSceneGraph (OSG) [184] in order to be
rendered by the simulator. Depending on the environment that we are interested in, we can
add additional objects to refine it. For instance, we can have a 3D scene that represents a
shipwreck of archaeological interest and later on add in the XML file references to archae-
ological artifacts such as amphorae in various sizes, shapes and positions, which must be in
a format that can be read by OSG. In addition, features like water visibility and water color
can be chosen and configured at will.
Another main feature of the simulator is that it can support multiple vehicles as well as
manipulators via abstract classes that can be specialized accordingly. We focus on vehicles
since our work is not related to manipulation. Each vehicle consists of two things: i) its
3D model and ii) its description. Regarding its 3D model, the supported formats are the
same as the formats for 3D scenes. With respect to its description, this is written in the
Unified Robot Description Format (URDF) which is an XML format for describing robots.
This, among others, includes the kinematic and dynamic characteristics of the vehicle. The
simulator also offers a series of sensors a fraction of which is listed below [185]:
• Camera & Range Camera: The camera produces a continuous stream of images
and can be placed at any desired angle while the range camera does the same but
with range images. That is, it produces images that show the distance to specific
points in a scene from another specific point (depth images).
• Range Sensor & Multibeam Range Sensor: The range sensor yields a distance
measurement to the nearest object as long as the object is within the direction that
it points and within the range of the sensor. Its multibeam alternative is an array of
range sensors that yields distance measurements at specific angle increments.
• Pressure Sensor: The pressure sensor yields a pressure measurement that represents
the pressure a vehicle receives when submerged.
• Doppler Velocity Log: The Doppler Velocity Log (DVL) estimates the linear speed
at which the vehicle is moving.
• Inertial Measurement Unit: The Inertial Measurement Unit (IMU) estimates the
orientation of the vehicle using the world frame as reference.
• Global Positioning System: The Global Positioning System (GPS) calculates the
coordinates of the vehicle with respect to the world frame. It is important to mention
2 By surface, we mean the surface of the sea.
that in order to get a GPS position measurement the vehicle needs to be very close to
the surface. This is done in order to simulate an actual GPS which would not be able
to get satellite signals when submerged.
• Force Torque Sensor: The force torque sensor provides an estimate of the force and
torque applied/received at the area where it is installed.
Further to the listed sensors, the simulator is equipped with some default position and ve-
locity sensors for vehicles that provide six Degrees Of Freedom (6DOF) poses. That is, x,
y, z or equivalently north, east, depth and r, p, y for roll, pitch and yaw. Yet another feature
of UWSim is that it integrates the Bullet physics engine [186] for simulating physics.
All the aforementioned features are made accessible externally via ROS since UWSim
is distributed as a ROS package. This, for instance, gives us the capability to issue
movement commands to the vehicle, to develop our own vision algorithms based on which the
vehicle will detect something, or to write our own controllers and so on. Finally, the simulator
provides the capability of developing various widgets. Widgets are portions of the simu-
lated environment display that can be used to show important information to the user. For
instance, one can develop a camera widget which will display what the simulated camera
sees or a navigation widget that will display navigation related information to the user.
(a) Simulated Nessie V; attached underneath the vehicle is a forward looking sonar.
(b) Experimental setup illustrating simulated Nessie V and six mines (red spherical objects).
Let us now discuss the values chosen for the multiplication factors, action times and energy
consumption rates that are used as part of temporal domain durative action definitions.
Regarding detection actions and detection problems we will be using a value of 2.5 for the
(per meter) multiplication factor for time. In addition we will be using a value of zero
seconds as the detection time of objects, as we explained in Section 7.3.1.1. The choice of 2.5 is made
empirically in the scope of formulating an upper bound for the estimated duration of de-
tection actions. Regarding classification actions and problems we will be using a value of
25 seconds for class mlo time, which is again chosen empirically so that an upper
bound for the estimated duration of classification actions will be formulated. Regarding
reacquisition actions and problems the multiplication factor for time is chosen empirically
to be again 2.5 while the reacquisition time for an object is chosen to be 2 seconds in the
same scope as in the case of detection actions and problems. Now, in the case of inspection
actions and problems the empirically chosen value for the time multiplication factor is 10
while the value of the inspection time for each object from an inspectionpoint is chosen to
be 1 second in the same scope as in the case of detection actions and problems as well as
reacquisition actions and problems. As far as energy consumption rates are concerned, the
values chosen are 3 and 1.5 for an energy consumption rate of the vehicle whilst moving
and an energy consumption rate for the vehicle whilst maintaining its position respectively.
As opposed to the multiplication factors and action times, the values for the energy
consumption rates were not chosen empirically. They were chosen to reflect that the vehicle
will consume more energy whilst moving than whilst maintaining its position. This
non-empirical choice is due to the fact that we did not have access to data on Nessie's energy
consumption profile and we proceeded with making an assumption. These numbers are ex-
pected to be different for different vehicles. Also, note that multiplication factors for time
are also dependent on the vehicle used. However, an observant reader may have noticed
that the time multiplication factor value for inspection actions is different from the value
chosen for detection actions. This is due to the fact that while searching to detect objects
in a lawnmower pattern a vehicle tends to perform fewer accelerations and decelerations as
well as re-orientations, as it tends to traverse longer distances in straight lines. As such, the
(per meter) multiplication factor for time is going to be smaller when compared to the (per
meter) multiplication factor for time that applies to inspection actions which tend to be over
much smaller distances and at lower speeds. Finally, note that in formulating upper bounds
for action durations we allow each action to underrun but not overrun. Had we chosen
lower bounds, consequently allowing actions to overrun, some plans could potentially fail
in the presence of time deadlines. However, this choice of ours has a
different implication. That is, in the presence of time deadlines some problem instances
may become unsolvable where in fact they may have been solvable in reality. Having said
that, in this thesis we do not focus on time deadlines and as such this discussion is at this
point purely made in the scope of understanding the choices that we made and the rationale
behind them.
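Gathered together, and written as they would appear inside a problem's :init block, the
choices above amount to the following sketch (auv stands for the vehicle instance; the
mult_fact_time value shown is the one for detection and reacquisition problems, with 10
substituted in inspection problems):

(= (mult_fact_time auv) 2.5)                ; 10 for inspection problems
(= (det_obj_time auv) 0)                    ; detection time per object
(= (class_mlo_time auv) 25)                 ; classification time
(= (reacq_mlo_time auv) 2)                  ; reacquisition time per MLO
(= (insp_mine_time auv) 1)                  ; inspection time per inspectionpoint
(= (energy_consumption_rate_moving auv) 3)  ; assumed, not measured
(= (energy_consumption_rate_still auv) 1.5) ; assumed, not measured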
8.5.1 Results
Each MCM phase corresponds to one problem reconstructed from the planning ontology.
As such we have four problems in total. The initial information provided is four lawn-
mower area points instantiated inside the MCM world ontology along with spacing and
overlap information that the framework utilizes in order to generate lawnmower trajectory
points both inside the MCM planning ontology and the MCM execution ontology3. Figure
8.8a illustrates the outcome of this process schematically. Moreover, the initial problem
phase, which is detection, is encoded inside the MCM execution ontology, and the high
level mission priority is set to energy efficiency, i.e. to plan and execute MCM in an
energy-efficient manner.
First pass - detection: Given the detection problem phase and the generated lawnmow-
erpoints the framework reconstructs the detection planning problem which along with the
reconstruction of the MCM domain are fed into the planner. The planner then devises a
plan to traverse the lawnmowerpoints in order to search for mines. Table 8.1 illustrates the
devised plan while Table 8.2 illustrates the distances between lawnmowerpoints. Moreover,
Table 8.3 illustrates the estimated execution duration, estimated energy consumption, plan
generation duration and the distance to be travelled for the detection plan shown in Table
8.1 as well as the number of states evaluated by the planner in formulating the plan. As
can be observed by looking at Table 8.1 the actions are situated in time with a reference
3 For an explanation of this duplicate representation see Section 7.3.3
Table 8.1. Plan for detection of mines. Numbers on the left hand-side represent estimated
start time points while numbers on the right hand-side (in brackets) represent estimated du-
rations for each durative action. By estimated time points we refer to the points in time after
each plan starts executing with a reference to 0.000 seconds as being the start of the plan
execution.
to the start time point for the plan as being 0.000 and they have an estimated duration. In
addition, as can be seen in Table 8.3, the time it took the planner to generate the plan is
0.02 seconds, meaning that the generation was very fast, given also that the number of states
that the planner evaluated in the process of devising the plan was 13. Note that the result
with respect to generation duration is very similar to the results for generating inspection
plans as they were presented by Cashmore et al. in [152]. Now, once generated, the plan is
instantiated inside the MCM execution ontology and linked with its generation time, gen-
eration duration, date as well as estimated execution duration and estimated start and end
time points (see Sections 7.3.3, 8.1.3.2). For each do hover mine detection action an
instance of the DoHoverMineDetection class inside the MCM execution ontology is also
created and linked to the plan instance. Each instantiated action is linked to its parameters
which are the vehicle instance and the instances of its two lawnmowerpoints. Furthermore,
each instantiated action is linked to its execution status, execution outcome, execution du-
ration and execution start and end time points. Initially, each do hover mine detection
action instance is marked as non executed and it has no execution outcome. In addition,
it is linked to the estimated execution duration as well as the estimated execution start and
end time points.
Regarding the actual execution duration as well as the actual start and end time points, they
will be assigned after the action has executed since they are unknown at the time of gen-
erating the plan. The same applies for the execution outcome. Now, before execution, the
framework checks whether these actions can be executed based on the system identification
and system update phases described in Sections 7.2.3.1 and 7.2.3.2 respectively. This pro-
cess is continuous since a component can break at any point during a mission affecting the
vehicle's available capabilities and actions. In this core MCM scenario of ours we do not
simulate any component faults. These are considered in the scenario investigated in Section
8.8. The vehicle is now ready to proceed with the execution of the detection plan. Figure
Table 8.3. Plan statistics for the detection plan shown in Table 8.1.
8.8b illustrates how the detection phase unfolds as do hover mine detection actions are
being executed. As the mission progresses the framework will update each action represen-
tation inside the MCM execution ontology with respect to its actual duration as well as start
and end time points, its execution status and outcome. In addition, the MCM mission phase
now becomes detection. At the same time the vehicle performs mine detections. At the end
of the detection phase the framework updates the plan representation inside the execution
ontology with respect to the actual start and end time points and duration.
Mine detection is a vital part of an MCM mission since the success of the subsequent
classification phase is greatly affected by its quality. Approaches for detecting mines are
beyond the scope of this thesis and as such the assumption made is that for a real world
application, an efficient Automatic Target Recognition (ATR) module is available to the
system. Nevertheless, for the purpose of presenting the usage of our framework in per-
forming MCM, we used a circle detector for detecting circular features based on the Hough
transform provided by the OpenCV library [188]. During this detection phase of the first
pass, as circles are detected, they are asserted into the MCM world ontology (instances of
the Circle class) along with their positions and radii. This leads to multiple detections of
the same objects (circles) being asserted into the KB with slightly different positions, given
that Nessie performs such detections from varying distances. Nonetheless, these slightly
differing duplicate detections will be dealt with by the classification phase that follows.
First pass - classification: After the lawnmower trajectory points have been traversed and
the detection phase has been completed, the vehicle proceeds with the second phase which
is the classification of detections. In order to do so, the MCM problem phase is initially
updated to classification. This effectively instructs the framework to reconstruct a
classification problem from the MCM planning ontology, which along with the MCM domain is
used to generate a simple one-action plan. Table 8.4 illustrates the aforementioned plan while
Table 8.5 illustrates the estimated execution duration, estimated energy consumption and
plan generation duration for the classification (single action) plan shown in Table 8.4 as
well as the number of states evaluated by the planner in formulating the plan. Again as can
be seen by observing Table 8.4, the single action is situated in time with a reference to the
start time point for the plan as being 0.000 and it has an estimated duration. Again, as in the
case of the detection plan the generation time for the single action plan is very small. Now,
once generated, the classification plan is instantiated inside the MCM execution ontology
Table 8.4. Plan for classification of mines. The number on the left hand-side represents the
estimated start time point while the number on the right hand-side (in brackets) represents
the estimated duration for the durative classification action.
Table 8.5. Plan statistics for the classification plan shown in Table 8.4.
and is linked to its generation time, generation duration, date as well as estimated execution
duration and estimated start and end time points (see Sections 7.3.3, 8.1.3.2). Furthermore,
it is linked to the single action it consists of which the framework also instantiates inside
the MCM execution ontology. The classification action is initially marked as non executed
and it has no execution outcome. Moreover, it is linked to its parameters which are the
vehicle instance (nessie) and lp6 which is the position where the detection phase has ended.
Initially, as in the case of detection actions, the classification action is linked to its esti-
mated execution duration as well as the execution start and end time points. Regarding the
actual duration as well as the actual start and end time points, they will be assigned after
the action has executed since they are unknown at the time of generating the plan. The
same applies for the execution outcome. Finally, we assume that the vehicle is equipped
with a classifier that is able to classify MLOs with some confidence level in the form of
a probability associated with every MLO. Each such probability can then be transformed
into an entropy measurement based on equation 7.5. In order to simulate such a classifica-
tion system and given the multiple circle detections inside the KB, the detected circles are
clustered using the affinity propagation clustering method [189]. Affinity propagation is a
method that considers measures of similarity between pairs of data points and is very effi-
cient with uneven cluster sizes. Using affinity propagation, cluster centroids are classified
as MLOs and are instantiated inside the MCM world ontology given that the classification
action is executable, i.e. the classification module is up and running. Except for their posi-
tions, MLOs are assigned a random probability within the interval of [0.50, 0.99] based on
which the framework calculates entropy which is also assigned to each MLO. Moreover,
the reacquisition status of each MLO is initially set to false, representing that no MLO
has been reacquired yet. Figure 8.9 illustrates the results of clustering and consequently
classification of MLOs. In addition, Figure 8.8c illustrates the completion of the detec-
tion phase and the placement of MLOs inside the simulated MCM environment after the
completion of the classification phase. While the vehicle executes its do classify mine
action the MCM mission phase is updated to classification whereas when classification fin-
ishes the representation of the do classify mine action is updated accordingly, i.e. its
(a) Generated lawnmower trajectory points (lp1-lp6). The green star represents the vehicle's
initial position.
(b) Execution of do hover mine detection actions as part of the detection phase.
(c) Mine detection phase completion and classification of MLOs (yellow spheres).
(d) Areas 1-6 represent reacquisition and inspection areas. Red spheres with attached arrows
represent inspection points, yellow spheres are classified MLOs while black spheres represent
mines.
(e) Execution of do reacquire mlo and do inspect mine actions in an energy-efficient manner
as part of the reacquisition and inspection phases respectively.
(f) Completion of the reacquisition and inspection phases in an energy-efficient manner. The
order in which reacquisition and inspection occurred is: 1, 2, 3, 4, 5. The black star represents
the mission end point.
Figure 8.8. MCM mission phases and trajectories followed by the vehicle during an energy-
efficient MCM mission execution.
execution status, outcome, actual duration as well as actual start and end time points. At
the end of the classification phase the framework updates the plan representation inside the
execution ontology with respect to the actual start and end time points and duration.
Second pass - reacquisition & inspection: Nessie is now ready to proceed with plan-
ning the reacquisition phase. As such the MCM problem phase is updated to reacquisition.
Figure 8.9. Circle detections clustering. Crosses represent detected circular objects while
coloured circles represent cluster centroids.
Given this problem phase update and the MLO assertions the framework generates mlo-
points both inside the MCM planning ontology and the MCM execution ontology which
are marked as unvisited. The framework then proceeds with the reconstruction of the reac-
quisition problem which along with the MCM domain are passed on to the planner. The
goal state of the problem is that the vehicle must reacquire all MLOs and have enough
energy to do so. In addition, the reconstructed reacquisition problem contains the follow-
ing metric: minimize (consumed energy auv). This instructs the planner to formulate
a plan in which MLO reacquisition will be executed in an energy-efficient manner. The
reacquisition problem initial state is that none of the MLOs has been reacquired, none of
the mlopoints has been visited, all mlopoints are reachable from each other and so on. The
generated reacquisition plan is illustrated in Table 8.6 (top). Mlopoints mlop1-mlop6 cor-
respond to MLOs 1-6 illustrated in Figure 8.8c while init corresponds to the initial position
of the vehicle from which reacquisition will commence. In our case this is the position
where the vehicle completed its detection and classification phases. The generated plan
will be logged into the MCM execution ontology by the formulation of an instance of the
McmMissionMloReacquisitionPlan class which, as in the case of the detection and
classification phases, is given a generation time, date, generation duration, estimated execution du-
ration as well as estimated start and end time points. In addition, the framework creates six
McmMissionDoReacquireMloAction instances inside the MCM execution ontology, one
for each do reacquire mlo action. Each such action instance is linked to the reacquisition
plan instance and is initially marked as non executed and it has no execution outcome.
Moreover, it is linked to its parameters which are the vehicle instance (Nessie) and the two
Table 8.6. Plans for reacquisition of MLOs (top) and inspection of mines (bottom) in an
energy-efficient manner. Numbers on the left hand-side represent estimated start time points
while numbers on the right hand-side (in brackets) represent estimated durations for each
durative action. By estimated time points we refer to the points in time after each plan starts
executing with a reference to 0.000 seconds as being the start of the plan execution.
mlopoint instances that it comprises (e.g. mlop1 and mlop2). Initially, as in the case of
detection and classification actions, each reacquisition action is linked to its estimated ex-
ecution duration as well as the estimated execution start and end time points. Regarding
the actual execution duration as well as the actual start and end time points, they will be
assigned after each action has executed since they are unknown at the time of generating the
plan. The same applies for the execution outcome of each action. Again, each reacquisition
action is deemed executable or non executable at runtime given the outcome of the system
identification and update phases which are executed continuously and behind the scenes.
As the vehicle starts executing its reacquisition plan the MCM mission phase is updated
to reacquisition. Moreover, every time the vehicle reacquires an MLO that is deemed to
be a mine it needs to inspect it. Before doing so, the execution status, outcome, (actual)
execution duration as well as (actual) start and end time points of the reacquisition action
are updated/instantiated accordingly, the MLO is marked as reacquired and the mlopoint
(marked) as visited. Furthermore, the framework instantiates the Mine class inside the MCM
world ontology, marks it as uninspected and associates it with its position and radius. Here
we make the assumption that there is uncertainty arising from the vehicle sensors. There-
fore, the initial positions of the MLOs are inaccurate compared to the positions of the
reacquired ones due to their detections being performed from a much longer distance than
reacquisition. As such, the positions of the mines are slightly different from the ones of the
MLOs before reacquisition.
In order for the vehicle to proceed with planning the inspection phase for that mine
the framework updates the MCM problem phase to be inspection and generates six Inspec-
tionpoint instances inside the MCM planning ontology and six inside the MCM execu-
tion ontology associated with their geometrical information and marked as unvisited. The
framework then proceeds with the reconstruction of the mine inspection problem and MCM
domain which feeds into the planner. The goal of the problem is to visit each inspection
point and inspect the mine from every inspectionpoint given that there is enough energy to
do so. Table 8.6 (bottom) illustrates the generated plan for inspection. “Inspectionpoint”
init corresponds to the position from where the vehicle starts travelling towards the first
inspection point; in our case this is the position where the vehicle performed reacquisition
of the MLO that was deemed to be a mine. The inspection plan is logged into the MCM
execution ontology by instantiating the McmMissionMineInspectionPlan class and the
generated instance is given a generation time, date, generation duration, estimated execution dura-
tion as well as estimated start and end time points. In addition, the framework creates six
McmMissionDoInspectMineAction instances inside the MCM execution ontology, one for
each do inspect mine action. Each such action instance is linked to the inspection plan
instance and is initially marked as non executed and it has no execution outcome. More-
over, as with every action instance so far, it is linked to its parameters which in this case
are the vehicle instance (Nessie) and the two inspectionpoint instances that it comprises
(e.g. ip1 and ip2). As in the case of detection, classification and reacquisition actions, each
inspection action is linked to its estimated execution duration as well as the estimated ex-
ecution start and end time points. Regarding the actual execution duration as well as the
actual start and end time points, they will be instantiated after each action has executed since
they are unknown at the time of generating the plan. The same applies for the execution
outcome. Again, each inspection action is deemed executable or non executable at runtime
given the outcome of the system identification and update phases. Before the execution of
inspection the framework appends the remaining unexecuted portion of the reacquisition
plan at the end of the generated inspection plan. In this manner, the planner does not have to devise a plan
for continuing reacquiring MLOs since it has already done so once. As a result, we save the
planner from unnecessary computational load, speeding up the planning process altogether.
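Schematically, assuming the vehicle has just reacquired the MLO at, say, mlop1 (the
mlopoint ordering here is illustrative, not taken from a specific run), the plan dispatched
for execution would have the following shape, with the inspection actions first and the
appended remainder of the reacquisition plan after them (time points and durations
omitted):

(do_inspect_mine nessie init ip1)
(do_inspect_mine nessie ip1 ip2)
...
(do_inspect_mine nessie ip5 ip6)
(do_reacquire_mlo nessie mlop1 mlop4)  ; appended remainder of the reacquisition plan
(do_reacquire_mlo nessie mlop4 mlop6)
...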
Nessie is now ready to proceed with inspection and the MCM mission phase is now updated
to inspection. Additionally, every time Nessie visits an inspectionpoint and performs an
inspection from it, that point is marked as visited and inspection is marked as performed from
that inspectionpoint. The mine around which the inspection points are placed is marked as
inspected when all inspectionpoints have been visited and inspection of the mine has taken
place from every single one of them. Once the inspection of the mine finishes the vehicle
resumes with the reacquisition plan from where it left it. This interleaving of reacquisition
and inspection is continued until all MLOs are reacquired and all mines are inspected given
that the vehicle has enough energy to do so. Figure 8.8d illustrates the six reacquisition
and inspection areas that correspond to the six mines that were randomly placed in our
simulation environment, along with the estimated position of the MLOs before reacquisi-
tion and the position of those deemed to be mines after. The slight difference in position
is also visible. Figure 8.8e illustrates the consecutive execution of do reacquire mlo and
do inspect mine actions as the MCM mission unfolds while Figure 8.8f illustrates the
completion of the reacquisition and inspection phases which also constitutes the comple-
tion of the MCM mission. Table 8.7 illustrates the estimated execution duration, estimated
energy consumption, plan generation duration and the distance to be travelled for the reac-
quisition and inspection plans shown in Table 8.6 as well as the number of states evaluated
by the planner in formulating the plans. One interesting observation to make by looking at
Table 8.7. Plan statistics for the plans shown in Table 8.6.
Table 8.7 is that the plan generation times are increased when compared to the generation
times for the detection and classification plans. This increase is due to the increased number
of states that the planner had to evaluate for both devising the reacquisition and inspection
plans. Despite that fact, the time to generate plans is still very low. One thing that we would
also like to emphasize is that the estimated energy consumption for both plans is the min-
imum since we are optimizing for energy consumption. Recall that in Section 7.3.1.2 we
explained that the minimization of energy consumption or equivalently the maximization
of the remaining energy subsumes the minimization of estimated execution duration. As
such the estimated execution durations illustrated in Table 8.7 are minimal. Moreover due
to the manner in which energy and duration are estimated (see Sections 7.3.1.1, 7.3.1.2) the
distance to be travelled is also minimal. The distances between mlopoints and the distances
between inspectionpoints are illustrated in Table 8.8.
Table 8.8. Distances between mlopoints (top) and between inspectionpoints (bottom) for the
plans shown in Table 8.6.
8.6.1 Results
To avoid unnecessary repetition, the phase walk-through that was presented in Section 8.5
will not be presented in this section. In addition, the planning outcomes for the detec-
tion, classification and inspection phases are omitted for the same reason. As such, we
only present the generated plans for the reacquisition phases under the two different high
level mission priorities in Table 8.9, while the probabilities and entropies of each MLO that
yielded the different reacquisition plans are shown in Table 8.10. As was explained in Sec-
tion 8.5, probabilities are random assignments in the interval of [0.50, 0.99] while entropies
are calculated based on Equation 7.5.
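Equation 7.5 itself is defined in Chapter 7 and is not reproduced here. Assuming it is the
binary Shannon entropy of the classification probability, an assumption that is consistent
with the probability/entropy pairs of Table 8.10 (e.g. p = 0.5 maps to 1.0 and p = 0.96
maps to 0.24), each MLO's entropy is computed as:

\[
H(p) = -p \log_2 p - (1 - p) \log_2 (1 - p).
\]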
As can be observed by looking at the reacquisition plan under the energy-efficient plus
probability-efficient (Energy-Prob) high level mission priority, the vehicle tends to visit
higher probability MLOs earlier than in the case of energy-efficient reacquisition (see the
top of Table 8.6 for a comparison between the reacquisition plans). However, due to the
fact that energy efficiency contributes equally to the multi-objective optimization, the plan-
ner is also instructed to generate a plan in which the vehicle will also attempt to minimize
energy consumption. This leads us to the trajectory illustrated in Figure 8.10a where, for
instance, the vehicle does not transition from area 6 to reacquire the MLO in area 3 (MLO
3) but instead transitions to reacquire the MLO in area 5 (MLO 5). That is, despite the
fact that the probability of MLO 3 being a mine is greater than the probability of MLO 5
Table 8.9. Plans for reacquisition of MLOs under different high level mission priorities. The
plan for energy efficient plus probability efficient reacquisition is shown at the top while the
plan for energy efficient plus entropy efficient reacquisition is shown at the bottom. Numbers
on the left hand-side represent estimated start time points while numbers on the right hand-
side (in brackets) represent estimated durations for each durative action. By estimated time
points we refer to the points in time after each plan starts executing with a reference to 0.000
seconds as being the start of the plan execution.
Energy-Prob
0.000 (do reacquire mlo nessie init mlop1) [7.850]
7.850 (do reacquire mlo nessie mlop1 mlop4) [29.950]
37.800 (do reacquire mlo nessie mlop4 mlop6) [29.549]
67.349 (do reacquire mlo nessie mlop6 mlop5) [23.250]
90.599 (do reacquire mlo nessie mlop5 mlop3) [39.150]
129.749 (do reacquire mlo nessie mlop3 mlop2) [13.799]
Energy-Ent
0.000 (do reacquire mlo nessie init mlop2) [18.125]
18.125 (do reacquire mlo nessie mlop2 mlop3) [13.799]
31.924 (do reacquire mlo nessie mlop3 mlop4) [25.875]
57.799 (do reacquire mlo nessie mlop4 mlop5) [26.375]
84.174 (do reacquire mlo nessie mlop5 mlop6) [23.250]
107.424 (do reacquire mlo nessie mlop6 mlop1) [52.575]
Table 8.10. Probabilities and entropies of the MLOs that yielded the reacquisition plans
shown in Table 8.9.

MLO          1     2     3     4     5     6
Probability  0.96  0.50  0.75  0.80  0.62  0.90
Entropy      0.24  1.00  0.81  0.72  0.96  0.47
being a mine (0.75 > 0.62). Table 8.11 illustrates the estimated execution duration, esti-
mated energy consumption, plan generation duration and the distance to be travelled for
the energy-efficient plus probability-efficient reacquisition plan shown at the top of Table
8.9 as well as the number of states evaluated by the planner in formulating the plan. By
Table 8.11. Plan statistics for the energy-efficient plus probability-efficient reacquisition plan
shown at the top of Table 8.9.
Energy-Prob reacquisition
Estimated Execution Duration: 143.548 secs
Estimated Energy Consumption: 175.9 units
Plan Generation Duration: 1.22 secs
Distance to Travel: 52.62 meters
Number of Evaluated States: 3017
comparing the findings shown in Table 8.11 to the findings illustrated at the top of Table 8.7
(effectively comparing energy-efficient reacquisition with energy-efficient plus probability-
efficient reacquisition), we can see that in the second case the vehicle is expected to: i) travel
a longer distance, ii) consume more energy and iii) require more time to reacquire targets.
In addition, the plan generation duration is increased. Regarding the former three4 such
4 distance, energy consumption and execution duration
Figure 8.10. Trajectories followed by the vehicle under different high level mission priorities.
Start points for each trajectory, denoted by green stars, are the points where the vehicle fin-
ished its lawnmower pattern and performed classification to estimate MLOs. Yellow spheres
denote the estimated positions of MLOs after classification while black spheres denote the
positions of mines. Black stars represent mission end points [4].
a finding is expected since we are also considering probability in the optimization; the
findings would agree with the ones presented at the top of Table 8.7 only in the best case
scenario, that is, if the probability of each mine-like object being a mine were such that a
plan considering only probabilities and not energy would coincide with the plan shown at
the top of Table 8.6. With respect to the time the planner requires to generate the energy-
efficient plus probability-efficient reacquisition plan, the increase is due to the fact that the
planner has to evaluate more states.
Regarding the reacquisition plan under the energy-efficient plus entropy-efficient (E-
nergy-Ent) high level mission priority shown at the bottom of Table 8.9 we can observe
that the vehicle tends to visit higher entropy MLOs first as opposed to the energy-efficient
reacquisition (see the top of Table 8.6 for a comparison between the reacquisition plans).
This is also shown schematically in Figure 8.10b. The transition of the vehicle from its
initial position to mlop2 to reacquire the MLO in the second area (MLO 2) first, even
though the MLO in the first area (MLO 1) is much closer, is representative of this behaviour.
This would not occur if the difference in entropy were not so great, because energy
efficiency contributes equally to the generation of the reacquisition plan. Its contribution is
evident in the fact that the vehicle does not transition from mlop2 to mlop5 to reacquire
the MLO in the fifth area but instead visits mlop3 in the third area to reacquire MLO 3.
Table 8.12 illustrates the estimated execution duration, estimated energy consumption, plan
generation duration and the distance to be travelled for the energy-efficient plus entropy-
efficient reacquisition plan shown at the bottom of Table 8.9 as well as the number of states
evaluated by the planner in formulating the plan. By comparing the findings shown in
Table 8.12 to the findings illustrated at the top of Table 8.7 (effectively comparing energy-
efficient reacquisition with energy-efficient plus entropy-efficient reacquisition), we can
again see that the vehicle is expected to travel a longer distance, consume more energy and
require more time to reacquire targets.
Table 8.12. Plan statistics for the energy-efficient plus entropy-efficient reacquisition plan
shown at the bottom of Table 8.9.
Energy-Ent reacquisition
Estimated Execution Duration: 159.999 secs
Estimated Energy Consumption: 195.6 units
Plan Generation Duration: 0.72 secs
Distance to Travel: 59.2 meters
Number of Evaluated States: 1819
To further evaluate the framework under the three different high level mission priorities in MCM, we additionally generated 10 random geo-
metrical configurations of the six mines within our 22 × 12 meter area. For each geo-
metrical configuration, after classification, each MLO is again assigned a random prob-
ability within the interval [0.50, 0.99] which is also converted to entropy. For each such
random geometrical configuration we ran one full MCM mission5 for each high level mis-
sion priority and calculated the average mission statistics that are shown in Table 8.13.
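The conversion from probability to entropy is not spelled out here, but the tabulated values above are consistent with the binary Shannon entropy of each MLO's mine probability. A minimal sketch:

```python
import math

def binary_entropy(p: float) -> float:
    """Shannon entropy (in bits) of a Bernoulli variable with P(mine) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

# Reproduces the probability/entropy pairs listed for MLOs 1-6,
# e.g. H(0.96) ~= 0.24, H(0.5) = 1.0, H(0.75) ~= 0.81.
for p in (0.96, 0.5, 0.75, 0.8, 0.62, 0.9):
    print(f"P(mine) = {p:.2f} -> H = {binary_entropy(p):.2f}")
```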
Table 8.13. Average mission statistics of planning and executing MCM missions under 10
random geometrical configurations of mines using all three high level mission priorities.
As can be seen by observing Table 8.13, the average total estimated plan execution duration
for all plans under the energy-efficient (Energy) high level mission priority is lower
when compared to the average total estimated plan execution duration for all plans un-
der the energy-efficient plus probability-efficient (Energy-Prob) and energy-efficient plus
entropy-efficient (Energy-Ent) high level mission priorities. To avoid any confusion we
would like to clarify that we refer to the average combined (total) estimated execution
duration of all plans in each mission. That is, the detection, classification, reacquisition
and inspection plans. This is expected since in the latter two high level mission priorities
(Energy-Prob, Energy-Ent) we combine energy efficiency with probability and entropy
efficiency respectively when reacquiring MLOs. Recall that in Section 7.3.1.2 we explained
that the minimization of energy consumption subsumes the minimization of execution du-
ration. As such, considering entropy or probability in addition to energy not only has an
impact on the energy consumption of the vehicle (energy consumption is increased) but
also on the execution duration, since different reacquisition plans are considered. That
is, in the first case reacquisition plans that minimize energy consumption; in the second
case reacquisition plans that minimize energy consumption in conjunction with reacquiring
MLOs that are most probable to be mines first; and in the third case reacquisition plans that
minimize energy consumption in conjunction with reacquiring MLOs that yield the highest
information gain first. The results in Table 8.13 clearly show the increase in energy
consumption and execution duration. The increase is in fact an increase in the duration and
the energy consumption of reacquisition actions for which the different high level mission
priorities are considered. That is, irrespective of the high level mission priority chosen for
reacquisition, the durations of detection, classification and inspection actions will not be
affected.
5 One full MCM mission comprises detection, classification, reacquisition and inspection operations.
Another interesting observation to be made by looking at Table 8.13 is that the average
combined (total) estimated plan execution durations are an over-estimate when compared
to the actual ones. That is, the planner estimated that the MCM missions would last longer
than they actually did, irrespective of the high level mission priority chosen. This over-
estimation is due to the formulation of upper duration bounds for actions as explained in
Section 8.4. Needless to say, we could have chosen values for quantities such as the
multiplication factor for time that would reduce this over-estimation. In our case the over-
estimation is about 17.5%. As explained in the aforementioned section, this allows actions
to underrun rather than overrun.
Yet another observation to make when looking at Table 8.13 is that the average combined
(total) generation duration for all plans is lower in the case of reacquiring MLOs in an
energy efficient manner (Energy) as opposed to reacquiring MLOs in an energy-efficient
plus probability-efficient (Energy-Prob) manner or in an energy-efficient plus entropy-
efficient (Energy-Ent) manner. This difference is due to the existence of the different high
level mission priorities for reacquiring MLOs. More intuitively, irrespective of the high
level mission priority chosen for reacquiring MLOs, the average plan generation durations
for detection, classification and inspection plans will not change. What will contribute to the
increase of the average total generation duration for all plans is the process of performing
multi-objective optimizations in the cases of energy-efficient plus probability-efficient and
energy-efficient plus entropy-efficient reacquisitions. That is, the system needs to calculate
the maximum and the minimum of each objective first in order to be able to use them in
the weighted sum approach (see equation 7.4). As such, more planning time is required.
However, the reported average total plan generation durations for each MCM mission under
the three different high level mission priorities for reacquisition are still low in the sense
that durations of the magnitude of a few seconds (3.69, 9.59 and 8.84 seconds) are not
expected to have a noteworthy impact on the utilization of the framework for real world
applications.
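A minimal sketch of the normalization step just described, assuming the weighted sum of equation 7.4 combines two normalized objectives with equal weights (consistent with the earlier statement that energy efficiency contributes equally); the function and variable names below are illustrative, not the thesis implementation:

```python
def normalise(value: float, vmin: float, vmax: float) -> float:
    """Scale an objective to [0, 1] using its precomputed extrema."""
    return (value - vmin) / (vmax - vmin) if vmax > vmin else 0.0

def weighted_sum_cost(energy: float, probability: float,
                      extrema: dict, w_energy: float = 0.5,
                      w_prob: float = 0.5) -> float:
    """Combined cost of a candidate reacquisition plan. Lower energy is
    better; higher mine probability is better, so the normalised
    probability is inverted before being added to the cost."""
    e = normalise(energy, *extrema["energy"])
    p = normalise(probability, *extrema["probability"])
    return w_energy * e + w_prob * (1.0 - p)

# The extrema of each objective must be computed first, which is exactly
# where the extra planning time comes from (values here are hypothetical).
extrema = {"energy": (150.0, 220.0), "probability": (0.5, 0.99)}
print(weighted_sum_cost(175.9, 0.96, extrema))
```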
Finally, in Table 8.13 we report the average total duration of KB-related operations for
each full MCM mission as well as the average total mission duration under each high level
mission priority. The average total duration of KB-related operations refers to operations
such as the creation, deletion, updating and fetching of instances and properties (knowl-
edge) that do not happen concurrently with the execution of actions. For instance, when
traversing a lawnmower trajectory as part of the execution of a detection plan, KB-related
operations in the form of instantiations of circle detections inside the world ontology occur
concurrently with the execution of the detection actions. As such, they are not included in
the average total duration of KB-related operations. In contrast,
KB-related operations such as the reconstruction of a planning domain and problem do not
happen concurrently with action execution. For instance, the reconstruction of a detection
problem from the planning ontology occurs before the execution of a detection plan. In fact,
the framework cannot devise a plan for detection if the detection (planning) problem and
the (planning) domain are not reconstructed from the planning ontology so that they can be
fed into the planner for detection plan generation. Another example of a KB-related oper-
ation that does not occur concurrently with action execution is the assertion inside the KB
that a mine has been inspected from a specific inspectionpoint. This KB-related operation
occurs in-between inspection action executions. Consequently, the reported average total
durations for KB-related operations are added to the average total actual plan execution du-
ration reported in Table 8.13 so that the average total mission duration can be formulated.
The small magnitude of the average total durations of KB-related operations that do not
happen concurrently with the execution of actions is indicative of the speed at which the
KB communicates with the rest of the system.
Returning to the increased energy consumption and duration of MCM missions that
combine energy efficiency with probability efficiency or entropy efficiency, as opposed
to missions that only consider energy efficiency as the high level mission priority, it can
be seen that these increases are compensated in terms of high certainty exploitation and
information gain (uncertainty reduction) respectively, as shown in Figure 8.11. For the
Energy-Ent reacquisitions the graph
Figure 8.11. Average total entropy reduction over the number of MLO reacquisition actions
taken by the vehicle under the three different high level mission priorities.
line is lower compared to the Energy and Energy-Prob reacquisitions. Additionally, for
Energy-Prob the graph line is higher than the rest. This means that for Energy-Ent the
uncertainty about the environment is always lower after every reacquisition compared to the
rest while for Energy-Prob it is always higher. For the Energy reacquisition it is some-
where in-between. The fact that the probabilities of MLOs are in the interval [0.50, 0.99]
means that during Energy-Prob reacquisitions the AUV tends to visit MLOs with higher
probability first (see also Figure 7.13 for a better understanding).
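The quantity plotted in Figure 8.11 can be reproduced with a small sketch, assuming that reacquiring an MLO resolves its identity and therefore drops its entropy to zero (entropies taken from the configuration used throughout this chapter):

```python
def remaining_entropy(entropies: dict, visit_order: list) -> list:
    """Total entropy left in the environment after each reacquisition,
    assuming a reacquired MLO's entropy drops to zero."""
    remaining = sum(entropies.values())
    trace = [round(remaining, 2)]
    for mlo in visit_order:
        remaining -= entropies[mlo]
        trace.append(round(remaining, 2))
    return trace

entropies = {1: 0.24, 2: 1.0, 3: 0.81, 4: 0.72, 5: 0.96, 6: 0.47}
# Energy-Ent visit order from the bottom of Table 8.9: higher-entropy MLOs first,
# so the trace drops steeply early on, as in Figure 8.11.
print(remaining_entropy(entropies, [2, 3, 4, 5, 6, 1]))
# -> [4.2, 3.2, 2.39, 1.67, 0.71, 0.24, 0.0]
```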
8.7.1 Results
The reuse of the same random geometrical configurations as well as the same probabilities
and entropies for each setting enables us to perform a direct comparison between our pre-
vious findings and the newly generated ones. Figure 8.12 is representative of the AUV’s
adaptive behaviour. Figures 8.12a-8.12c illustrate the reacquisition and inspection trajec-
tories followed by the simulated AUV due to planning for reacquisition based on different
high level mission priorities. In the mission illustrated in Figure 8.12d, the initial plan for
the vehicle was to reacquire MLOs with the high level mission priority being energy ef-
ficiency and inspect those deemed to be mines as shown in Figure 8.12a. After visiting
areas 1 and 4 however, a request for changing the high level mission priority into energy
efficiency plus entropy efficiency took place and the vehicle adapted accordingly. This
adaptation was due to replanning caused by invalidating the previously generated plan. A
similar adaptation is shown in Figure 8.12e where a command for switching from energy
efficiency plus entropy efficiency as the high level mission priority (Figure 8.12c) to the
energy efficiency one is issued to the vehicle.
Besides the successful adaptation of planning and execution when issuing high level
mission priority changes, we also tested the impact of such changes on the average total
entropy reduction as the reacquisition of MLOs unfolded during mission execution (see Figure 8.13).
As expected, transitioning from energy-efficient plus probability-efficient reacquisition to
energy-efficient plus entropy-efficient reacquisition leads to a lower average total entropy
(disorder/uncertainty) after each reacquisition action. In contrast, when transitioning from
an energy-efficient plus entropy-efficient reacquisition to just energy-efficient reacquisition
the average total entropy in the environment is higher after each reacquisition. That is, until
entropy becomes zero in the environment for all high level mission priorities which occurs
after all reacquisitions have taken place.
(d) Adaptation from energy-efficient plus probability-efficient reacquisition and inspection
(areas 1, 4) to energy-efficient plus entropy-efficient reacquisition and inspection (areas 2, 3, 5, 6).
(e) Adaptation from energy-efficient plus entropy-efficient reacquisition and inspection
(areas 2, 3) to energy-efficient reacquisition and inspection (areas 1, 4, 5, 6).
Figure 8.12. High level mission priorities and adaptation. Start points for each trajectory,
denoted by green stars, are the points where the vehicle finished its lawnmower pattern and
performed classification to estimate MLOs. Black stars represent mission end points. Red
triangles denote the points in which adaptation occurred.
Figure 8.13. Average total entropy reduction over the number of MLO reacquisition actions
taken by the vehicle: i) under the three different high level mission priorities (upper graph),
ii) when adapting from the Energy-Prob to the Energy-Ent priority (middle graph), iii)
when adapting from the Energy-Ent to the Energy priority (bottom graph).
8.8.1 Results
In order to test our vehicle’s behaviour with respect to component faults we initially sim-
ulated a mine detection (software) module fault during the execution of a lawnmower tra-
jectory for mine detection. Recall from Section 8.1.2 and by extension Section 7.2.2.2 that
the mine detection capability of the vehicle is available when either mine detection capa-
bility (primary and/or secondary) is available to the system. However, both the primary
and secondary mine detection capabilities depend on the availability of the mine detection
module. Hence, simulating a mine detection module fault leaves the vehicle deprived of
its mine detection capability altogether. This in succession, causes both the primary and
the secondary mine detection actions to not be available (inexecutable) since both depend
on the mine detection module. As such, the vehicle cannot resort to adaptive execution
through the execution of a semantically equivalent action. However, due to the the mine
detection module being a critical component only for mine detection the framework causes
the vehicle to resort to adaptation on the planning level in order to satisfy the remaining
mission goals to the maximum extend possible. That is, classification of MLOs given po-
tential circle detections so far as well as reacquisition of MLOs, if any, and inspection of
those deemed to be mines. Figure 8.14 illustrates where we simulated the mine detection
module fault and the adaptation of the vehicle which led to the continuation of the MCM
mission. This adaptation, of course, would not be possible if the vehicle were not equipped
Figure 8.14. Adaptation due to mine detection (software) module fault (red square). The
mission start point is represented by the green star while the mission end point is represented
by the black one. The geometrical configuration of mines used in this experiment is the same
as the configuration illustrated throughout this chapter while the high level mission priority
chosen is energy efficiency.
with the components, capabilities, actions and vehicle ontologies as well as the Prolog-
based reasoning. That is, the continuous assessment of the vehicle's internal state through
the successive repetition of the system identification and system update phases, in real time
and behind the scenes.
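A minimal Python sketch of this dependency reasoning (the thesis implements it in Prolog over the ontologies; the component, capability and action names below are illustrative):

```python
# Health status of (simulated) components; a sonar fault is injected here.
component_health = {"sonar": False, "camera": True, "mine_detection_module": True}

# Capabilities depend on components; actions depend on capabilities.
capability_requires = {
    "primary_mine_detection": ("sonar", "mine_detection_module"),
    "secondary_mine_detection": ("camera", "mine_detection_module"),
}
# Semantically equivalent actions, ordered by preference (primary first).
action_requires = [
    ("do_mine_detection_primary", "primary_mine_detection"),
    ("do_mine_detection_secondary", "secondary_mine_detection"),
]

def capability_available(capability):
    return all(component_health[c] for c in capability_requires[capability])

def select_action():
    """Adaptive execution: pick the first executable semantically equivalent
    action; returning None would trigger adaptive (re)planning instead."""
    for action, capability in action_requires:
        if capability_available(capability):
            return action
    return None

print(select_action())  # -> 'do_mine_detection_secondary' (sonar fault)
```

With a mine detection module fault instead, both capabilities become unavailable, select_action returns None, and the system falls back to adaptive planning, mirroring the two experiments above.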
In another experiment we conducted, we simulated a sonar fault during the lawnmower
trajectory execution as shown in Figure 8.15. Due to the sonar being a critical component
for the primary mine detection capability which in succession is necessary for the primary
Figure 8.15. Adaptation due to sonar fault (red square). The mission start point is repre-
sented by the green star while the mission end point is represented by the black one. Purple
spheres represent the mines that the vehicle was not able to detect due to the limited range
of the camera while the purple square represents the recovery of the sonar. The geometrical
configuration of mines used in this experiment is the same as the configuration illustrated
throughout this chapter while the high level mission priority chosen is energy efficiency.
mine detection action the vehicle was forced to search for semantically equivalent actions
to continue mission execution. This led the vehicle to perform an adaptation on the execu-
tion level since the secondary mine detection capability which depends on the camera was
still available as was, consequently, the secondary mine detection action. However, due to
the limited range of the camera the vehicle was not able to perform circle detections that
corresponded to some of the mines in the environment as shown in Figure 8.15. As such the
classification system classified MLOs only for areas 1-4. Moreover, due to the faulty sonar,
the vehicle started moving towards the first MLO executing secondary MLO reacquisition
actions until we simulated a sonar recovery. This led the vehicle to switch to the execution
of primary MLO reacquisition actions and continue the reacquisitions and inspections until
the end.
• Adaptation on the planning and execution level of missions in response to real time
high level mission priority changes and component faults.
Let us now elaborate on the above contributions and present a summary of the chapter.
We have made some modifications to the framework ontologies in order to accommo-
date MCM. For example, we extended the generic Object class inside the world ontology,
with a Mlo and a Mine class in order to model mine-like objects and mines respectively.
Similarly, we extended the generic ClassificationModule class with a MineClassification-
Module in order to model a mine classifier, and so on. However, due to the flexibility of the
framework, such modifications were minimal. It is also important to emphasize that the
modifications presented throughout this chapter by no means constituted a modification to
the framework's architecture or to how the problem of persistently autonomous operations
is addressed within the framework.
Experiments were conducted in simulation and concerned MCM in the form of DC-RI
(detection-classification followed by reacquisition-inspection) using a simulated version of
the Nessie V AUV; that is, four phases in two passes with one AUV. In the aforementioned
setting we have tested the ability of the framework
to plan and execute MCM missions under different high level mission priorities. As such,
we have moved beyond standard MCM missions, where energy consumption and/or ex-
ecution time govern the mission, and provided an alternative perspective in which criteria
such as probability and entropy are also considered, with promising results. That is, de-
spite an increase in the vehicle’s energy consumption and execution time, the combination
of an energy-efficient plus probability-efficient reacquisition of mine-like objects benefits
MCM in the sense that the vehicle tends to reacquire high probability mine-like objects first
while at the same time attempting to minimize energy consumption and execution time. In
the case of energy-efficient plus entropy-efficient reacquisitions the vehicle’s energy con-
sumption and execution time are also increased as opposed to the standard MCM. However,
this is compensated in terms of information gain (uncertainty reduction). That is, the ve-
hicle conducting MCM tends to reacquire higher entropy mine-like objects first while at
the same time attempting to minimize energy consumption and execution time. Further,
the experimental results demonstrate how an ontology-based knowledge representation and
reasoning approach can drive adaptation both on the planning and execution level of mis-
sions. More specifically, adaptive planning due to high level mission priority changes is an
important feature because it facilitates the need for adaptation due to real time mission re-
quirement changes. The experimental results demonstrate the robustness of the framework
in that respect. Moreover, experimental results demonstrate the efficiency of the framework
in recovering from faults in critical components through adaptive planning. In this manner,
the framework facilitates the continuation of missions and satisfaction of mission goals to
the maximum extent possible. On the other hand, the ability of the framework to recover
from faults in components for which there are redundancies, through adaptive execution,
demonstrates the benefit of encoding semantically equivalent capabilities and actions. That
is, preserving computational resources and resorting to adaptive planning only when neces-
sary. This is a highly desirable feature, especially in the context of autonomous operations.
Chapter 9

Training and Evaluating MLNs for Maritime Situation Awareness
So far in this thesis we have presented our work with respect to persistently autonomous
operations with a focus on MCM in the context of maritime defence. However, maritime
defence is only one of the two aspects of the maritime defence and security domain with
which this thesis is concerned (see Chapter 1 for a reminder about the problem statement
and the thesis objectives). The other being maritime security in which maritime situation
awareness plays a key role.
Recall that in Chapter 2 we presented knowledge representation and reasoning ap-
proaches based on logic, probabilistic graphical models as well as hybrid approaches that
unify logic and probability into a single representation. In addition, in Chapter 5 we pre-
sented an overview of the maritime situation awareness domain including a number of
maritime situation awareness systems. Among the presented pieces of work one that par-
ticularly drew our attention was the work of Snidaro et al. [5]. To the best of our knowledge,
it is the only work in the field that utilizes MLNs for building maritime situation awareness.
However, despite the well researched domain of learning the weights of MLNs [93], [190],
[191], [192], [193], the authors set the weights manually. In addition, they do not evaluate
the performance of the networks in depth since they only provide exemplar query outcomes
without assessing the overall performance of the networks on a test set. As such, in this
chapter we present our work with respect to training and evaluating MLNs for maritime
situation awareness.
More specifically, in Section 9.1 we show how MLN weights can be learned genera-
tively using likelihood while in Section 9.2 we briefly present Alchemy, a framework for
statistical relational learning with MLNs which we used in our experiments. The maritime
situation awareness scenario investigated in this chapter concerns the identification of ves-
sels rendezvousing in order to get involved in illegal activities and is based on the work of
Snidaro et al. [5] (see Section 9.3). In that respect, the experiments also aim at demon-
strating the extent to which contextual information has an impact on the performance of the
network in assessing an evolving situation.
For this chapter, the reader is referred to background Chapters 2 and 5 for a reminder about
MLNs and maritime situation awareness respectively.
9.1.1 Maximum Likelihood
Learning the weights of an MLN generatively amounts to maximizing the log-likelihood of the training data:
\[
f(w) = \log P_w(X = x) = \log\left(\frac{1}{Z}\exp\Big(\sum_i w_i n_i(x)\Big)\right) = \sum_i w_i n_i(x) - \log Z \qquad (9.1)
\]
where \( Z = \sum_{x'} \exp\big(\sum_i w_i n_i(x')\big) \). We assume that the training data are in the form of ground
atoms [193]. Let m be the number of all possible ground atoms. As such, the training
database can be represented as a vector x = {x1 , x2 , ....., xm } where each element represents a
binary truth value (0, 1). Zero corresponds to the respective possible ground atom not being
in the training database while one corresponds to the opposite. This effectively imposes the
closed world assumption [193]. In addition, the sum over x′ in Z ranges over all possible
databases. Now, maximization of function 9.1 can be achieved by applying standard first-
order or second-order gradient-based optimization methods. One of the most used first-order
methods is gradient descent (gradient ascent for maximization), while conjugate gradient,
Newton's method and quasi-Newton methods such as the Broyden-Fletcher-Goldfarb-Shanno
(BFGS) algorithm and its limited-memory variant (L-BFGS) are some of the most used
second-order methods. The interested reader
is referred to [194] for additional (technical) information on the gradient-based methods as
well as on their applicability to optimization problems.
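A toy sketch of one gradient-ascent step on the log-likelihood, enumerating all possible worlds explicitly; this is only feasible for tiny domains, which is precisely why the expected counts are intractable in general (all counts below are hypothetical):

```python
import numpy as np

def log_likelihood_gradient(w, n_x, world_counts):
    """Gradient of log P_w(X = x): observed counts minus expected counts.

    n_x:          vector of true-grounding counts n_j(x) in the training data.
    world_counts: one row of formula counts n_j(x') per possible world x'.
    """
    scores = world_counts @ w                       # sum_i w_i n_i(x') per world
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()                            # P_w(x') for every world
    expected = probs @ world_counts                 # E_w[n_j(x)]
    return n_x - expected

w = np.zeros(3)                                     # one weight per formula
n_x = np.array([4.0, 2.0, 1.0])                     # hypothetical observed counts
worlds = np.array([[4, 2, 1], [3, 2, 2], [1, 0, 5]], dtype=float)
w += 0.1 * log_likelihood_gradient(w, n_x, worlds)  # one gradient-ascent step
```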
The (log-)likelihood is a concave function and as such it does not suffer from local optima.
To be more precise, a local maximum of a concave function is at the same time a global
maximum. The derivative of the log-likelihood (Expression 9.1) with respect to a weight
\( w_j \) (partial derivative) is the following [193]:
\[
\frac{\partial}{\partial w_j} \log P_w(X = x) = n_j(x) - \frac{1}{Z}\sum_{x'} \exp\Big(\sum_i w_i n_i(x')\Big)\, n_j(x') = n_j(x) - E_w[n_j(x)] \qquad (9.2)
\]
where n_j(x) is the number of true groundings of the jth formula in the training database
x, while E_w[n_j(x)] is the expected (predicted) number of true groundings of the jth formula
according to the current model given the current weight vector w = (w_1, w_2, ..., w_j, ...). Ergo,
if the model predicts that the number of true groundings of the jth formula is higher than
it actually is, then w_j needs to decrease, while if the prediction is lower then w_j needs to
increase. Having said that, one can easily deduce that Expression 9.2 represents the jth
component of the gradient. The gradient as a whole is given by Expression 9.2 applied to
all formulas.
The problem that arises however, is that counting the true groundings of a formula in the
training data exactly is generally intractable. According to Richardson and Domingos [193]
this can be solved by counting the true groundings approximately by uniform sampling. In
addition, generally intractable is also the counting of the expected number of true ground-
ings which necessarily requires inference over the MLN [193]. According to Richardson
and Domingos [193] approximate methods can be used but they tend not to converge, as
they state, in reasonable time.
9.1.2 Maximum Pseudo-Likelihood
A more tractable alternative is the maximization of the pseudo-likelihood of the training data [193]:
\[
PL_w(X = x) = \prod_{l=1}^{m} P_w(X_l = x_l \mid MB(X_l)) \qquad (9.3)
\]
where x is again a training database in the form of a vector as in the case of maximum
likelihood (x = {x_1, x_2, ..., x_m}), with x_l representing the binary truth value of the lth possible
ground atom with respect to being or not being in the data. Moreover, MB(X_l) represents
the truth values of the ground atoms that constitute X_l's Markov Blanket¹ (MB). By taking
the logarithm of the pseudo-likelihood (Expression 9.3) we get the log-pseudo-likelihood:
\[
\log PL_w(X = x) = \sum_{l=1}^{m} \log P_w(X_l = x_l \mid MB(X_l)) \qquad (9.4)
\]
¹ Since X_l = x_l we refer to x_l's Markov blanket.
The derivative of the log-pseudo-likelihood (Expression 9.4) with respect to a weight (partial derivative) is the following [193]:
\[
\frac{\partial}{\partial w_j} \log PL_w(X = x) = \sum_{l=1}^{m} \Big[\, n_j(x) - P_w(X_l = 0 \mid MB(X_l))\, n_j(x_{[X_l=0]}) - P_w(X_l = 1 \mid MB(X_l))\, n_j(x_{[X_l=1]}) \,\Big] \qquad (9.5)
\]
where n_j(x) is the number of true groundings of the jth MLN formula in the training data,
n_j(x_[X_l=0]) is the number of true groundings of the jth MLN formula when we force X_l
to be zero (X_l = 0) without changing the remaining data, while n_j(x_[X_l=1]) is the number
of true groundings of the jth MLN formula when we force X_l to be one (X_l = 1) without
changing the remaining data. Compared to the case of the log-likelihood in Section 9.1.1,
computing the derivatives of the log-pseudo-likelihood, or the log-pseudo-likelihood itself, does not
require inference over the MLN. For optimization (maximization) we can utilize any first
or second-order gradient-based technique as in the case of maximum likelihood.
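A sketch of Expression 9.5 as code, with the counting function and the conditional probability passed in as callables; both are assumptions of this sketch, as in practice they are computed from the grounded network:

```python
def pll_gradient_component(x, n_j, p_cond):
    """One component of the log-pseudo-likelihood gradient (Expression 9.5),
    for the formula whose counting function is n_j.

    x:            training database as a list of binary truth values.
    n_j(db):      counts the true groundings of the formula in database db.
    p_cond(l, v): P_w(X_l = v | MB(X_l)) under the current weights.
    """
    total = 0.0
    for l in range(len(x)):
        x0 = list(x); x0[l] = 0                     # force X_l = 0
        x1 = list(x); x1[l] = 1                     # force X_l = 1
        total += n_j(x) - p_cond(l, 0) * n_j(x0) - p_cond(l, 1) * n_j(x1)
    return total
```

Note that no inference over the MLN is needed: each term only requires recounting groundings with a single ground atom flipped.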
# Formula
1 Overlaps(v, y) ⇔ Overlaps(y, v).
2 Meets(v, y) ⇔ Meets(y, v).
3 Proximity(v, y) ⇔ Proximity(y, v).
4 Rendezvous(v, y) ⇔ Rendezvous(y, v).
5 Stopped(v) ∧ (IsIn(v, OpenSea) ∨ IsIn(v, IntWaters)) ⇒ Suspicious(v)
6 Stopped(v) ∧ (IsIn(v, Harbour) ∨ IsIn(v, NearCoast)) ⇒ ¬Suspicious(v)
7 ¬AIS(v) ⇒ Alarm(v).
8 ¬InsideCorridor(v) ⇒ Suspicious(v)
9 Humint(v, Smuggling) ⇒ Suspicious(v)
10 Humint(v,Clear) ⇒ ¬Suspicious(v)
11 Suspicious(v) ⇒ Alarm(v).
12 ¬Suspicious(v) ⇒ ¬Alarm(v)
13 IsIn(v, z) ∧ (z ≠ zp) ⇒ ¬IsIn(v, zp).
14 IsIn(v, z) ∧ IsIn(y, zp) ∧ (z ≠ zp) ⇒ ¬Proximity(v, y).
15 ¬Proximity(v, y) ⇒ ¬Rendezvous(v, y).
16 Suspicious(v) ∧ Suspicious(y) ∧ (Overlaps(v, y) ∨ Meets(v, y)) ∧ Proximity(v, y) ⇒ Rendezvous(v, y).
17 (Overlaps(v, y) ∨ Meets(v, y)) ∧ Proximity(v, y) ⇒ Rendezvous(v, y)
18 (¬Stopped(v) ∨ ¬Stopped(y)) ⇒ ¬Rendezvous(v, y)
19 Before(v, y) ∧ Proximity(v, y) ⇒ ¬Rendezvous(v, y)
Table 9.1. MLN formulas used in the rendezvous scenario presented in Snidaro et al. [5].
Variables v, y represent vessels while OpenSea, IntWaters, Harbour, NearCoast are constants
with which a zone variable can become bound and Smuggling, Clear constants with which a
report variable can become bound.
rendezvousing in order to get involved in smuggling activities (e.g. drugs, oil, arms2 etc.).
According to Snidaro et al. vessels stopping at open sea or international waters (zones)
constitute an indication of suspicion as opposed to vessels stopping near the coast or at har-
bours (zones). This is also the case for vessels moving outside traffic corridors as well as
vessels for which there is a human intelligence report that they are smuggling. In addition,
vessels that do not transmit AIS data or are suspicious should raise an alarm to the human
operator. Finally, in Snidaro et al. [5] the incident of rendezvous is determined by a com-
bination of the following indicators: vessels being suspicious; vessels being in the same
zone; a vessel leaving a zone well before another comes in; vessels meeting in the same
zone, e.g. a vessel leaves the harbour just after another vessel arrives; and vessels over-
lapping temporally in the same zone, e.g. vessels being moored in a harbour at the same
time. The latter three indicators are temporal ones. Table 9.1 illustrates the rendezvous sce-
nario MLN formulas as they are presented in Snidaro et al. [5] while Table 9.2 illustrates
the adjusted rendezvous scenario MLN formulas that we used in our experiments. Before
proceeding with the comparison of the two sets of formulas let us first disambiguate a few
things. Where present, full stops at the end of formulas in both sets denote a deterministic
(hard) constraint which practically means that such formulas are equivalent to the “tradi-
tional” first-order logical formulas. Further, the notations ¬ and ! are equivalent and denote
negation. Finally, where present in the set of formulas of Table 9.2, the + notation preced-
2 For brevity, smuggling of arms is usually referred to as arms trafficking
# Formula
1 Proximity(x, y) ⇔ Proximity(y, x).
2 Rendezvous(x, y) ⇔ Rendezvous(y, x).
3 Stopped(x) ∧ IsIn(x, +z) ⇒ Suspicious(x)
4 !AIS(x) ⇒ Alarm(x)
5 AIS(x) ⇒!Alarm(x)
6 !InsideCorridor(x) ⇒ Suspicious(x)
7 InsideCorridor(x) ⇒!Suspicious(x)
8 Humint(x, +r) ⇒ Suspicious(x)
9 Suspicious(x) ⇒ Alarm(x)
10 !Suspicious(x) ⇒!Alarm(x)
11 IsIn(x, z) ∧ (z != zp) ⇒ !IsIn(x, zp).
12 (Suspicious(x) ∨ Suspicious(y)) ∧ Proximity(x, y) ⇒ Rendezvous(x, y)
13 (!Stopped(x)∨!Stopped(y)) ⇒!Rendezvous(x, y)
14 !Proximity(x, y) ⇒!Rendezvous(x, y)
Table 9.2. Adjusted rendezvous scenario MLN formulas. Variables x, y represent vessels
while z is a zone variable which can become bound with constants OpenSea, IntWaters, Har-
bour, NearCoast and r is a report variable which can become bound with constants Smug-
gling, Clear.
ing a variable represents the grounding of that variable with all available constants for that
variable during training yielding in this manner additional formulas. For instance, given the
OpenSea, IntWaters, Harbour, NearCoast constants, formula #3 will be expanded into the
following four formulas:
Stopped(x) ∧ IsIn(x, OpenSea) ⇒ Suspicious(x)
Stopped(x) ∧ IsIn(x, IntWaters) ⇒ Suspicious(x)
Stopped(x) ∧ IsIn(x, Harbour) ⇒ Suspicious(x)
Stopped(x) ∧ IsIn(x, NearCoast) ⇒ Suspicious(x)
where each one will be assigned a different weight. The aforementioned + notation is
specific to the Alchemy framework.
Comparing the two sets of MLN formulas one can observe that in the adjusted set
we have omitted the explicit representation of temporal relations that refer to vessels, i.e
Overlaps, Meets and Before. The reason for doing so is that we have embedded the relations
in the Proximity predicate. More specifically, in the case of formulas in Table 9.1, predicate
Proximity does not actually refer to proximity of two vessels but instead to vessels being
in the same zone, which can be the OpenSea, IntWaters, Harbour or NearCoast zone.
As such, there exists the need for explicit definition of temporal relations among vessels.
For example, consider formula #16 in Table 9.1, where Proximity needs to be accompanied
by temporal relation Overlaps. In our case we give Proximity its natural meaning which
is vessels being close to each other at the same time. Ergo, there is no need to define an
explicit Overlaps temporal relation. In addition, in this context, we find temporal relations
Meets and Before redundant since according to Snidaro et al. [5], a vessel which is in
some area and then leaves before some other vessel comes in the same area means that the
vessels do not meet. In short, temporal relations as they are defined in Snidaro et al. [5]
add unnecessary complexity to the set of formulas. The existence of a hard constraint in
formula #14 of Table 9.1 is problematic in the sense that it restricts the model in considering
vessels close to each other only when they are in the same zone. However, in reality, vessels
can be close to each other even when they are in different zones. For instance, consider the
case of a vessel being close to the boundaries of the OpenSea zone with the IntWaters zone
but in the OpenSea zone and another vessel being close to the boundaries of the IntWaters
zone with the OpenSea zone but in the IntWaters zone. With the representation of Snidaro
et al. [5] such vessels can never be in proximity because they belong to different zones.
This, however, is hardly the case.
Table 9.3. Two exemplar entries (in ground-atom form) from the artificial dataset.

Ground Atoms (Entry 1)        Ground Atoms (Entry 2)
!Stopped(V60)                 Stopped(V10)
!Stopped(V12)                 Stopped(V22)
IsIn(V60, IntWaters)          IsIn(V10, OpenSea)
IsIn(V12, Harbour)            IsIn(V22, IntWaters)
!InsideCorridor(V60)          Suspicious(V10)
Humint(V60, Clear)            !InsideCorridor(V10)
!Suspicious(V60)              Humint(V10, Smuggling)
InsideCorridor(V12)           Suspicious(V22)
Humint(V12, Clear)            !InsideCorridor(V22)
!Suspicious(V12)              Humint(V22, Smuggling)
!AIS(V60)                     AIS(V10)
!Alarm(V60)                   Alarm(V10)
AIS(V12)                      AIS(V22)
!Alarm(V12)                   Alarm(V22)
!Proximity(V60, V12)          Proximity(V10, V22)
!Proximity(V12, V60)          Proximity(V22, V10)
!Rendezvous(V60, V12)         Rendezvous(V10, V22)
!Rendezvous(V12, V60)         Rendezvous(V22, V10)
Note that the numbers of vessels involved in rendezvous incidents, demonstrating suspicious behaviour and raising an alarm do not add up to the
total number of vessels since the aforementioned assessments are not mutually exclu-
sive. The dataset is in the form of ground atoms and ground atoms that are not present
in the dataset are considered to be false. In this manner we abide to the closed world
assumption that was mentioned in Section 9.1.1. Table 9.3 illustrates two exemplar en-
tries from the artificial dataset. The first entry represents two vessels (V12, V60) that:
are not suspicious (!Suspicious(V12), !Suspicious(V60)), do not raise an alarm to a hu-
man operator (!Alarm(V12), !Alarm(V60)) and are not involved in a rendezvous incident
(!Rendezvous(V60, V12), !Rendezvous(V12, V60)). On the other hand, the second entry
represents two vessels (V10, V22) that: are suspicious (Suspicious(V10), Suspicious(V22)),
raise an alarm to a human operator (Alarm(V10), Alarm(V22)) and are involved in a ren-
dezvous incident (Rendezvous(V10, V22), Rendezvous(V22, V10)).
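A small sketch of how such entries can be generated in ground-atom form; under the closed world assumption, absent atoms are false, although Table 9.3 lists negated atoms explicitly, which the (hypothetical) helper below mirrors:

```python
def vessel_atoms(vessel, zone, stopped, in_corridor, ais, humint):
    """Ground atoms describing one vessel, following the predicates of
    Tables 9.2-9.3; boolean flags control the '!' (negation) prefix."""
    sign = lambda flag, atom: atom if flag else "!" + atom
    return [
        sign(stopped, f"Stopped({vessel})"),
        f"IsIn({vessel}, {zone})",
        sign(in_corridor, f"InsideCorridor({vessel})"),
        sign(ais, f"AIS({vessel})"),
        f"Humint({vessel}, {humint})",
    ]

# Reproduces the V60 portion of entry 1 in Table 9.3.
print("\n".join(vessel_atoms("V60", "IntWaters", stopped=False,
                             in_corridor=False, ais=False, humint="Clear")))
```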
# Weight Formula
1 — Proximity(x, y) ⇔ Proximity(y, x).
2 — Rendezvous(x, y) ⇔ Rendezvous(y, x).
3 -1.61618 Stopped(x) ∧ IsIn(x, Harbour) ⇒ Suspicious(x)
4 -1.48836 Stopped(x) ∧ IsIn(x, NearCoast) ⇒ Suspicious(x)
5 1.92031 Stopped(x) ∧ IsIn(x, OpenSea) ⇒ Suspicious(x)
6 2.00784 Stopped(x) ∧ IsIn(x, IntWaters) ⇒ Suspicious(x)
7 1.87642 !AIS(x) ⇒ Alarm(x)
8 0.689701 AIS(x) ⇒!Alarm(x)
9 1.99037 !InsideCorridor(x) ⇒ Suspicious(x)
10 1.15087 InsideCorridor(x) ⇒!Suspicious(x)
11 1.50073 Humint(x, Smuggling) ⇒ Suspicious(x)
12 -0.66122 Humint(x,Clear) ⇒ Suspicious(x)
13 1.38086 Suspicious(x) ⇒ Alarm(x)
14 1.03365 !Suspicious(x) ⇒!Alarm(x)
15 — IsIn(x, z) ∧ (z! = zp) ⇒!IsIn(x, zp).
16 1.22971 Suspicious(x) ∧ Suspicious(y) ∧ Proximity(x, y) ⇒ Rendezvous(x, y)
17 0.307675 (!Stopped(x)∨!Stopped(y)) ⇒!Rendezvous(x, y)
18 0.6556 !Proximity(x, y) ⇒!Rendezvous(x, y)
Table 9.4 illustrates the outcome of training the MLN with one of the 10 folds.
For each validation set we use the MC-SAT algorithm implemented in the Alchemy frame-
work to perform approximate probabilistic inference querying for the probability of: i)
suspicious behaviours, ii) an alarm being raised and iii) rendezvous incidents, that is,
P(Suspicious(V_i) | M_{L,C}), P(Alarm(V_i) | M_{L,C}) and P(Rendezvous(V_i, V_j) | M_{L,C}) respectively.
The maximum number of steps chosen for the MC-SAT algorithm is 5000. For each of the
aforementioned three queries we produce k = 10 Receiver Operating Characteristic (ROC)
curves and calculate k = 10 Areas Under the Curve (AUC). ROC curves are graphs com-
monly used for assessing the performance of binary classifiers by plotting the True Posi-
tive Rate (TPR) of a (binary) classifier against its False Positive Rate (FPR) under varying
thresholds. In our case, the varying thresholds are probabilities outputted by the queries.
Regarding the AUC, it represents how well the classifier separates the two classes. At the
end of the validation process as a whole, the validation results for each query, i.e. k ROC
curves and k AUC for each query, are averaged to produce a single mean (average) ROC
curve and a single mean (average) AUC metric. Figures 9.1, 9.2 and 9.3 illustrate the ROC
curves and the AUC for each of the k = 10 validation sets as well as the mean ROC curve
and the mean AUC for the three aforementioned queries.
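A sketch of the per-query averaging over the k = 10 folds, using scikit-learn's ROC utilities and resampling each fold's TPR onto a common FPR grid before averaging (function and variable names here are illustrative):

```python
import numpy as np
from sklearn.metrics import auc, roc_curve

def mean_roc(fold_results, grid=np.linspace(0.0, 1.0, 101)):
    """Average ROC curves and AUCs over validation folds.

    fold_results: list of (y_true, y_score) pairs, one per fold, where
    y_score holds the query probabilities returned by MC-SAT inference.
    """
    aucs, tprs = [], []
    for y_true, y_score in fold_results:
        fpr, tpr, _ = roc_curve(y_true, y_score)
        aucs.append(auc(fpr, tpr))
        tprs.append(np.interp(grid, fpr, tpr))  # resample onto common FPR grid
    return grid, np.mean(tprs, axis=0), float(np.mean(aucs))
```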
Let us first interpret our findings with respect to the performance of the MLN in cor-
rectly classifying vessels as being suspicious or non suspicious. As can be observed by
looking at the left graph of Figure 9.1, AUC ranges between 0.926 and 0.973 among the
ten different validation folds with a mean of 0.952 (right graph). This means that the MLN,
on average, has a 95.2% probability of ranking a randomly chosen suspicious vessel higher
than a randomly chosen non suspicious vessel. An AUC value of one means that the clas-
sifier is perfect since this corresponds to a TPR of one and an FPR of zero while an AUC
Figure 9.1. Performance of the MLN in classifying vessels as being suspicious or non suspi-
cious. ROC curves and AUC for the 10-fold cross validation are illustrated in the left graph.
Mean ROC curve and mean AUC are illustrated in the right graph.
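The ranking interpretation of the AUC can be verified directly: it equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one (the Mann-Whitney statistic). A minimal check with hypothetical scores:

```python
import itertools

def auc_as_ranking_probability(pos_scores, neg_scores):
    """Probability that a random positive outranks a random negative
    (ties count as half); equivalent to the area under the ROC curve."""
    pairs = list(itertools.product(pos_scores, neg_scores))
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs)
    return wins / len(pairs)

print(auc_as_ranking_probability([0.9, 0.8, 0.7], [0.6, 0.4]))  # -> 1.0
```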
Regarding the performance of the MLN in correctly classifying (identifying) vessels
that should or should not raise an alarm to a human operator shown in Figure 9.2, the AUC
ranges between 0.761 and 0.854 among the ten different validation folds (left graph) with
a mean of 0.816 (right graph). In addition, we can see that TPRs of one are achieved for
higher values of FPRs when compared to the findings of Figure 9.1. Even though this
performance is worse it is still very good and significantly better compared to a random
classifier.
Finally, with regard to the performance of the MLN in correctly classifying (identify-
ing) pairs of vessels that are rendezvousing or not rendezvousing shown in Figure 9.3, the
AUC ranges between 0.933 and 0.998 among the ten different validation folds (left graph)
with a mean of 0.97 (right graph). The performance is similar to the case of identifying
suspicious vessels but slightly better and much better when compared to the case of iden-
tifying vessels that should raise an alarm to a human operator. In addition we can see that,
in general, TPRs of one are achieved for lower values of FPRs compared to both previous
cases.
Recall from Chapter 5, Section 5.2, that context plays a key role in interpreting a sit-
uation correctly. In order to investigate the impact of context, we used 10-fold cross val-
idation but this time we omitted the predicates that represent context, i.e. IsIn(x, z) and
InsideCorridor(x), from the validation sets and performed the same queries as before.
Training the MLN was performed in exactly the same way as before, that is, with the
predicates that represent context present in the training data. However, the exclusion of
the context predicates from the validation sets, forces the MLN to interpret the situations
without contextual information. Figures 9.4, 9.5 and 9.6 illustrate the performance of the
MLN in: i) classifying vessels as being suspicious or non suspicious, ii) identifying vessels
that should or should not raise an alarm to a human operator and iii) classifying pairs of
vessels rendezvousing or not rendezvousing, respectively, under context-free conditions.
As can be observed by looking at Figure 9.4 the AUC ranges between 0.782 and 0.899
(left graph) with an average of 0.834 (right graph). This constitutes a 0.118 drop on the
mean AUC metric or in other words an 11.8% drop in the probability of the MLN ranking
a randomly chosen suspicious vessel higher than a randomly chosen non suspicious vessel.
Regarding the performance of the network with respect to identifying vessels that should
raise or should not raise an alarm to a human operator under context-free conditions (see
Figure 9.5), the AUC ranges between 0.717 and 0.851 (left graph) with a mean of 0.789
Figure 9.4. Performance of the MLN in classifying vessels as being suspicious or non suspi-
cious, under context-free conditions. ROC curves and AUC for the 10-fold cross validation
are illustrated in the left graph. Mean ROC curve and mean AUC are illustrated in the right
graph.
(right graph). This represents a 2.7 percentage point drop in the probability of the MLN ranking a ran-
domly chosen vessel that should raise an alarm higher than a randomly chosen vessel that
should not do so. A question that naturally comes to mind is why this drop is lower than
in the case of classifying suspicious vessels. This is explained by the fact that in the case
of classifying a vessel as being suspicious or not, contextual information impacts indicator
formulas much more than in the case of an alarm. More specifically, there is high uncer-
tainty in the network arising from formulas #3-#6, #9, #10 (see Table 9.4) with only two
formulas (#11, #12) to reduce uncertainty since human intelligence reports are not excluded
from the validation sets. On the other hand, despite the fact that raising an alarm is affected
by the absence of context through the Suspicious predicate, it also depends on the AIS
formulas (#7, #8 in Table 9.4), which remain unaffected since AIS data are not excluded
from the validation sets.
Figure 9.6. Performance of the MLN in classifying pairs of vessels rendezvousing or not
rendezvousing, under context-free conditions. ROC curves and AUC for the 10-fold cross
validation are illustrated in the left graph. Mean ROC curve and mean AUC are illustrated in
the right graph.
Evidence Set 1                  Evidence Set 2
Ground atoms:
Stopped(V496)                   Stopped(V496)
Stopped(V124)                   Stopped(V124)
IsIn(V496, OpenSea)             —
IsIn(V124, IntWaters)           —
!InsideCorridor(V496)           —
Humint(V496, Smuggling)         Humint(V496, Smuggling)
InsideCorridor(V124)            —
Humint(V124, Clear)             Humint(V124, Clear)
!AIS(V496)                      !AIS(V496)
AIS(V124)                       AIS(V124)
Proximity(V496, V124)           Proximity(V496, V124)
Proximity(V124, V496)           Proximity(V124, V496)
Ground truth:
Suspicious(V496)                Suspicious(V496)
Suspicious(V124)                Suspicious(V124)
Alarm(V496)                     Alarm(V496)
Alarm(V124)                     Alarm(V124)
Rendezvous(V496, V124)          Rendezvous(V496, V124)
Rendezvous(V124, V496)          Rendezvous(V124, V496)
Table 9.5. Two exemplar evidence sets with evidence for two vessels: V 124 and V 496. The
first set contains contextual information illustrated in red while the second set is generated by
stripping all contextual information from the first set. Also illustrated, are the ground truth
situations that correspond to the evidence.
Query                              Probability (Context)   Probability (Context-Free)   Ground Truth
P(Suspicious(V124) | M_{L,C})      0.385                   0.128                        Suspicious(V124)
P(Suspicious(V496) | M_{L,C})      0.995                   0.792                        Suspicious(V496)
P(Alarm(V124) | M_{L,C})           0.374                   0.208                        Alarm(V124)
P(Alarm(V496) | M_{L,C})           0.952                   0.893                        Alarm(V496)
Table 9.6. Outcomes of probabilistic inference when querying for whether vessels V124 and
V496 are suspicious and for whether an alarm should be raised for those vessels, given the
evidence found in Table 9.5.
Nevertheless, in the absence of contextual information there is higher uncertainty since the
probability of the predicates that describe the situation correctly is reduced. This is also the
case when querying for P(Rendezvous(V124, V496) | M_{L,C}), as can be observed by looking
at Table 9.7.
Query                                 Probability (Context)   Probability (Context-Free)   Ground Truth
P(Rendezvous(V124, V496) | M_{L,C})   0.823                   0.708                        Rendezvous(V124, V496)
Table 9.7. Outcome of probabilistic inference when querying for the existence of a ren-
dezvous incident between vessels V124 and V496, given the evidence found in Table 9.5.
To summarize, the main contributions of this chapter are the following:
• The utilization of the maximum pseudo-likelihood weight learning method for train-
ing MLNs for maritime situation awareness based on data instead of hard-coding
weights.
• The evaluation of the performance of an MLN trained with data both in the presence
and absence of contextual information in identifying: i) suspicious vessels at sea, ii)
vessels that should raise an alarm to a human operator and iii) vessels rendezvousing
in order to get involved in illegal activities.
Chapter 10
Conclusions
This thesis was concerned with autonomous underwater operations with AUVs and mar-
itime situation awareness systems with a special interest in their utilization for enhancing
maritime defence and security respectively. Regarding autonomous underwater operations,
we presented a generic, easy to understand and extend, ontology-based framework com-
plemented by a mission planning and execution system. The framework can be deployed
on AUVs with the aim of increasing their persistence in maintaining autonomy in the presence
of mission requirement changes and vehicle subsystem failures during such operations.
In the context of maritime defence, the framework was deployed in an AUV intended for
MCM in simulation. In that respect, an initial set of experiments demonstrated the ability of
the framework in enabling the AUV to plan and execute MCM missions successfully under
three different high level mission priorities. That is, reacquiring and inspecting mines in: i)
an energy-efficient manner, ii) an energy-efficient plus probability-efficient manner, iii) an
energy-efficient plus entropy-efficient manner. This constitutes an advancement from the
standard MCM missions where energy consumption/execution time are prioritized when
trying to fulfil mission goals. Experimental results demonstrate that when considering en-
tropy and probability in addition to energy the vehicle consumes more energy and requires
more time to execute its mission. However, this is compensated in terms of uncertainty
reduction in the first case and in terms of certainty exploitation in the latter. The power
of the ontology-based knowledge representation and reasoning is demonstrated through the
vehicle’s persistence in maintaining autonomy in two cases.
First, in communicating high level mission priority changes to the vehicle during mis-
sion execution to which the vehicle responded successfully with adaptive planning and
continuation of the mission given pending goals. As was mentioned earlier in this thesis,
this is a very useful feature on the grounds that it facilitates the need for adaptation due
to real time mission requirement changes. Most importantly, without the vehicle having to
terminate its mission and be redeployed anew with the new requirements.
Second, in the presence of vehicle component faults (subsystem failures), where the ve-
hicle demonstrated an adaptive behaviour in order to successfully recover from such faults
and carry on with its mission. More specifically, the vehicle benefits from an ontological
representation based on which the interdependencies amongst vehicle components, capa-
bilities and actions are captured and the health status of the vehicle's components is modelled. This
representation is complemented with semantically equivalent capabilities and (semantically
equivalent) actions subject to the existence of redundant functional components for the
same type of capability and the existence of semantically equivalent capabilities required
for the same type of action respectively. Based on this knowledge and reasoning modules
written in Prolog, the vehicle is able to determine its internal state, in the form of capabili-
ties and actions, dynamically at runtime, and to achieve recovery through adaptation when
component faults affect mission execution. The adaptive behaviour in this context comes
in two forms: adaptive planning and adaptive execution. The vehicle resorts to adaptive
planning when the current plan under execution can no longer be executed due to faults in
critical components that deprive the vehicle of all the necessary capabilities and actions to
do so. In that respect, experimental results demonstrate the efficiency of the framework in
enabling the vehicle to adapt to the new reality by devising new plans that allow it to con-
tinue with remaining mission goals. This is a highly desirable feature on the grounds
that the vehicle does not have to abort its mission altogether. Alternatively, the vehicle
resorts to adaptive execution when there are redundancies for a faulty component, whose
usage can be substituted with that of a functional, healthy one, enabling the vehicle to take
advantage of the semantically equivalent capabilities and actions that the healthy component
offers. Adaptive execution is always prioritized over adaptive planning, not only as
a means to save computational resources but also as a means to satisfy all existing mission
goals. Ultimately, seeking and exploiting alternatives to satisfy mission goals is not only
indicative of persistence but also of increased intelligence.
Moving on to maritime situation awareness in the context of maritime security, an
overview of the field was presented, including a review of current approaches for build-
ing awareness through systems. Such systems aim to support human operators in their
challenging work of dealing with high volumes of potentially contradicting information
which should be fused efficiently to facilitate reliable decision making and action under po-
tentially multiple interpretations of a situation. Focusing on maritime situation awareness
built with Markov logic networks, we presented an approach based on maximum pseudo-
likelihood for training them and evaluated their performance using 10-fold cross validation.
Experimental results demonstrated that training Markov logic networks intended for mar-
itime situation awareness is a very promising alternative to hard-coding network weights.
As was explained earlier in this thesis, this is a very important feature on the grounds that
it constitutes a step forward towards disengaging human experts from transferring their
prior knowledge into the system. Instead, prior knowledge is built directly from data and
any potential contradictions present in the data are addressed during the training process
automatically. Moreover, performance evaluation results suggest that Markov logic net-
works constitute a mechanism which offers efficient fusion of information from various
sources and achieve high success rates in identifying and disambiguating between abnor-
mal and normal vessel activities under uncertainty. This is due to their underlying structure
which unifies first-order logic, which is ideal for representing the relations among entities,
with probability, which is ideal for handling uncertainty.
As future work, it would be worthwhile to deploy the framework on a real vehicle where it can be tested under real world conditions. Similarly,
it would be beneficial to acquire real world data for training and evaluating Markov logic
networks for maritime situation awareness.
Bibliography
[2] M. Tenorth, Knowledge Processing for Autonomous Robots. PhD thesis, Technische
Universität München, 2011.
[3] M. Ghallab, D. Nau, and P. Traverso, Automated Planning: theory and practice.
Elsevier, 2004.
[4] G. Papadimitriou, Z. Saigol, and D. M. Lane, “Enabling fault recovery and adap-
tation in mine-countermeasures missions using ontologies,” in OCEANS 2015-
Genova, pp. 1–7, IEEE, 2015.
[5] L. Snidaro, I. Visentini, and K. Bryan, “Fusing uncertain knowledge and evidence
for maritime situational awareness via markov logic networks,” Information Fusion,
vol. 21, pp. 159–172, 2015.
[6] Cabinet Office and G. B. Parliament, Securing Britain in an age of uncertainty: the strate-
gic defence and security review, vol. 7948. The Stationery Office, 2010.
[9] M. Nilsson, J. van Laere, T. Ziemke, and J. Edlund, “Extracting rules from expert
operators to support situation awareness in maritime surveillance,” in Information
Fusion, 2008 11th International Conference on, pp. 1–8, IEEE, 2008.
213
BIBLIOGRAPHY 214
[13] M. Davis and H. Putnam, “A computing procedure for quantification theory,” Journal
of the ACM (JACM), vol. 7, no. 3, pp. 201–215, 1960.
[14] B. Selman, H. A. Kautz, and B. Cohen, “Noise strategies for improving local search,”
in AAAI, vol. 94, pp. 337–343, 1994.
[15] D. McAllester, B. Selman, and H. Kautz, “Evidence for invariants in local search,”
in AAAI/IAAI, pp. 321–326, 1997.
[20] H. Gallaire, J. Minker, and J.-M. Nicolas, “Logic and databases: A deductive ap-
proach,” ACM Computing Surveys (CSUR), vol. 16, no. 2, pp. 153–185, 1984.
[21] S. S. Huang, T. J. Green, and B. T. Loo, “Datalog and emerging applications: An in-
teractive tutorial,” in Proceedings of the 2011 ACM SIGMOD International Confer-
ence on Management of Data, SIGMOD ’11, (New York, NY, USA), pp. 1213–1216,
ACM, 2011.
[22] W. Chen, M. Kifer, and D. S. Warren, “Hilog: A foundation for higher-order logic
programming,” The Journal of Logic Programming, vol. 15, no. 3, pp. 187–230,
1993.
[27] U. A. Acar, G. E. Blelloch, and R. Harper, Selective memoization, vol. 38. ACM,
2003.
[32] C. M. Bishop, Pattern Recognition and Machine Learning (Information Science and
Statistics). Secaucus, NJ, USA: Springer-Verlag New York, Inc., 2006.
[35] H. Chan and A. Darwiche, “On the robustness of most probable explanations,” arXiv
preprint arXiv:1206.6819, 2012.
[37] D. Koller and N. Friedman, Probabilistic graphical models: principles and tech-
niques. MIT press, 2009.
[39] J. Kwisthout, H. L. Bodlaender, and L. C. van der Gaag, “The necessity of bounded
treewidth for efficient inference in bayesian networks.,” in ECAI, vol. 215, pp. 237–
242, 2010.
[40] J. H. Kim and J. Pearl, “A computational model for causal and diagnostic reasoning
in inference systems,” in Proceedings of the Eighth International Joint Conference
on Artificial Intelligence - Volume 1, IJCAI’83, (San Francisco, CA, USA), pp. 190–
193, Morgan Kaufmann Publishers Inc., 1983.
BIBLIOGRAPHY 216
[42] S.-S. Leu and C.-M. Chang, “Bayesian-network-based safety risk assessment for
steel construction projects,” Accident Analysis & Prevention, vol. 54, pp. 122–133,
2013.
[43] N. Khakzad, F. Khan, and P. Amyotte, “Dynamic safety analysis of process sys-
tems by mapping bow-tie into bayesian network,” Process Safety and Environmental
Protection, vol. 91, no. 1, pp. 46–53, 2013.
[48] B. Cai, Y. Liu, Q. Fan, Y. Zhang, Z. Liu, S. Yu, and R. Ji, “Multi-source information
fusion based fault diagnosis of ground-source heat pump using bayesian network,”
Applied Energy, vol. 114, pp. 1–9, 2014.
[49] Z. Yongli, H. Limin, and L. Jinling, “Bayesian networks-based approach for power
systems fault diagnosis,” Power Delivery, IEEE Transactions on, vol. 21, no. 2,
pp. 634–639, 2006.
[50] C. Romessis and K. Mathioudakis, “Bayesian network approach for gas path fault di-
agnosis,” Journal of engineering for gas turbines and power, vol. 128, no. 1, pp. 64–
72, 2006.
[55] H.-S. Park and S.-B. Cho, “A modular design of bayesian networks using expert
knowledge: Context-aware home service robot,” Expert Systems with Applications,
vol. 39, no. 3, pp. 2629–2642, 2012.
[56] D. Kortenkamp and T. Weymouth, “Topological mapping for mobile robots using a
combination of sonar and vision sensing,” in AAAI, vol. 94, pp. 979–984, 1994.
[57] A. Joshi, T. C. Henderson, and W. Wang, “Robot cognition using bayesian symmetry
networks.,” in ICAART (1), pp. 696–702, 2014.
[58] R. Kindermann, J. L. Snell, et al., Markov random fields and their applications,
vol. 1. American Mathematical Society Providence, RI, 1980.
[59] D. Kahle, T. Savitsky, S. Schnelle, and V. Cevher, “Junction tree algorithm,” STAT,
vol. 631, 2008.
[60] B. J. Frey and D. J. MacKay, “A revolution: Belief propagation in graphs with cy-
cles,” Advances in neural information processing systems, pp. 479–485, 1998.
[61] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-
product algorithm,” Information Theory, IEEE Transactions on, vol. 47, no. 2,
pp. 498–519, 2001.
[63] J. Besag, “On the statistical analysis of dirty pictures,” Journal of the Royal Statisti-
cal Society. Series B (Methodological), pp. 259–302, 1986.
[64] P. Brémaud, Markov chains: Gibbs fields, Monte Carlo simulation, and queues,
vol. 31. Springer Science & Business Media, 2013.
[65] P. Kohli and P. H. Torr, “Dynamic graph cuts for efficient inference in markov
random fields,” Pattern Analysis and Machine Intelligence, IEEE Transactions on,
vol. 29, no. 12, pp. 2079–2088, 2007.
BIBLIOGRAPHY 218
[66] Y. Boykov and G. Funka-Lea, “Graph cuts and efficient nd image segmentation,”
International journal of computer vision, vol. 70, no. 2, pp. 109–131, 2006.
[67] D. Singaraju, L. Grady, and R. Vidal, “P-brush: Continuous valued mrfs with normed
pairwise distributions for image segmentation,” in Computer Vision and Pattern
Recognition, 2009. CVPR 2009. IEEE Conference on, pp. 1303–1310, IEEE, 2009.
[68] A. Shekhovtsov, I. Kovtun, and V. Hlaváč, “Efficient mrf deformation model for non-
rigid image matching,” Computer Vision and Image Understanding, vol. 112, no. 1,
pp. 91–99, 2008.
[70] C. Liu, J. Yuen, and A. Torralba, “Sift flow: Dense correspondence across scenes
and its applications,” Pattern Analysis and Machine Intelligence, IEEE Transactions
on, vol. 33, no. 5, pp. 978–994, 2011.
[72] R. Zhang, C. A. Bouman, J.-B. Thibault, and K. D. Sauer, “Gaussian mixture markov
random field for image denoising and reconstruction,” in Global Conference on Sig-
nal and Information Processing (GlobalSIP), 2013 IEEE, pp. 1089–1092, IEEE,
2013.
[74] J. Diebel and S. Thrun, “An application of markov random fields to range sensing,”
in NIPS, vol. 5, pp. 291–298, 2005.
[75] D. Metzler and W. B. Croft, “A markov random field model for term dependen-
cies,” in Proceedings of the 28th Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval, SIGIR ’05, (New York, NY,
USA), pp. 472–479, ACM, 2005.
[76] M. Lease, “An improved markov random field model for supporting verbose
queries,” in Proceedings of the 32nd international ACM SIGIR conference on Re-
search and development in information retrieval, pp. 476–483, ACM, 2009.
BIBLIOGRAPHY 219
[77] P. Domingos and M. Richardson, “Mining the network value of customers,” in Pro-
ceedings of the Seventh ACM SIGKDD International Conference on Knowledge Dis-
covery and Data Mining, KDD ’01, (New York, NY, USA), pp. 57–66, ACM, 2001.
[78] E. Zheleva, L. Getoor, and S. Sarawagi, “Higher-order graphical models for classi-
fication in social and affiliation networks,” in NIPS Workshop on Networks Across
Disciplines: Theory and Applications, vol. 2, Citeseer, 2010.
[79] H. Li, A. Mukherjee, B. Liu, R. Kornfield, and S. Emery, “Detecting campaign pro-
moters on twitter using markov random fields,” in Data Mining (ICDM), 2014 IEEE
International Conference on, pp. 290–299, IEEE, 2014.
[81] D. Jain, S. Waldherr, and M. Beetz, “Bayesian logic networks,” IAS Group, Fakultät
für Informatik, Technische Universität München, Tech. Rep, 2009.
[86] K. B. Laskey, “Mebn: A language for first-order bayesian knowledge bases,” Artifi-
cial intelligence, vol. 172, no. 2, pp. 140–178, 2008.
[88] B. Taskar, P. Abbeel, M.-F. Wong, and D. Koller, “6 relational markov networks,”
STATISTICAL RELATIONAL LEARNING, p. 175, 2007.
[91] S. Sarkhel, D. Venugopal, P. Singla, and V. Gogate, “Lifted map inference for markov
logic networks.,” in AISTATS, pp. 859–867, 2014.
[93] D. Lowd and P. Domingos, “Efficient weight learning for markov logic networks,”
in Knowledge discovery in databases: PKDD 2007, pp. 200–211, Springer, 2007.
[95] R. Mateescu and R. Dechter, “Mixed deterministic and probabilistic networks,” An-
nals of mathematics and artificial intelligence, vol. 54, no. 1-3, pp. 3–51, 2008.
[96] V. Gogate and R. Dechter, “Samplesearch: A scheme that searches for consis-
tent samples,” in International Conference on Artificial Intelligence and Statistics,
pp. 147–154, 2007.
[98] P. Domingos and D. Lowd, “Markov logic: An interface layer for artificial intelli-
gence,” Synthesis Lectures on Artificial Intelligence and Machine Learning, vol. 3,
no. 1, pp. 1–155, 2009.
[99] H. Poon and P. Domingos, “Sound and efficient inference with probabilistic and
deterministic dependencies,” in AAAI, vol. 6, pp. 458–463, 2006.
[100] W. Wei, J. Erenrich, and B. Selman, “Towards efficient sampling: Exploiting random
walk strategies,” in AAAI, vol. 4, pp. 670–676, 2004.
[101] D. Jain, P. Maier, and G. Wylezich, “Markov logic as a modelling language for
weighted constraint satisfaction problems,” Constraint Modelling and Reformula-
tion (ModRef09), p. 60, 2009.
[102] H. Kautz, B. Selman, and Y. Jiang, “A general stochastic approach to solving prob-
lems with hard and soft constraints,” The Satisfiability Problem: Theory and Appli-
cations, vol. 17, pp. 573–586, 1997.
[105] F. Niu, C. Ré, A. Doan, and J. Shavlik, “Tuffy: Scaling up statistical inference in
markov logic networks using an rdbms,” Proceedings of the VLDB Endowment,
vol. 4, no. 6, pp. 373–384, 2011.
[106] K. Beedkar, L. Del Corro, and R. Gemulla, “Fully parallel inference in markov logic
networks.,” in BTW, pp. 205–224, Citeseer, 2013.
[107] L. A. Zadeh, “Fuzzy sets,” Information and control, vol. 8, no. 3, pp. 338–353, 1965.
[108] R. Fagin, J. Y. Halpern, Y. Moses, and M. Vardi, Reasoning about knowledge. MIT
press, 2004.
[109] N. Guarino, Formal Ontology in Information Systems: Proceedings of the 1st Inter-
national Conference June 6-8, 1998, Trento, Italy. Amsterdam, The Netherlands,
The Netherlands: IOS Press, 1st ed., 1998.
[110] T. R. Gruber, “Toward principles for the design of ontologies used for knowledge
sharing,” Int. J. Hum.-Comput. Stud., vol. 43, pp. 907–928, Dec. 1995.
[111] W3C, “Resource description framework (rdf) model and syntax specification.”
http://www.w3.org/TR/1999/REC-rdf-syntax-19990222/, Accessed on 25/4/2016.
[113] I. Horrocks, P. F. Patel-Schneider, and F. V. Harmelen, “From shiq and rdf to owl:
The making of a web ontology language,” Journal of Web Semantics, vol. 1, p. 2003,
2003.
[118] I. Niles and A. Pease, “Towards a standard upper ontology,” in Proceedings of the
international conference on Formal Ontology in Information Systems - Volume 2001,
FOIS ’01, (New York, NY, USA), pp. 2–9, ACM, 2001.
[120] C. Matuszek, J. Cabral, M. Witbrock, and J. Deoliveira, “An introduction to the syn-
tax and content of cyc,” in Proceedings of the 2006 AAAI Spring Symposium on For-
malizing and Compiling Background Knowledge and Its Applications to Knowledge
Representation and Question Answering, pp. 44–49, 2006.
[121] I. 13250-2:2006, “Information technology – topic maps – part 2: Data model,” tech.
rep., International Organization for Standardization, Geneva, Switzerland, 2006.
[123] M. Kifer, “Rules and ontologies in f-logic,” in Reasoning Web (N. Eisinger and
J. Mauszyski, eds.), vol. 3564 of Lecture Notes in Computer Science, pp. 22–34,
Springer Berlin Heidelberg, 2005.
[124] C. J. Matheus, M. M. Kokar, and K. Baclawski, “A core ontology for situation aware-
ness,” in Proceedings of the Sixth International Conference of Information Fusion,
2003, vol. 1, pp. 545–552, 2003.
[125] M. Tenorth and M. Beetz, “Knowrob – knowledge processing for autonomous per-
sonal robots,” in IROS’09 Proceedings of the 2009 IEEE/RSJ international confer-
ence on Intelligent robots and systems, pp. 4261–4266.
[127] G. Lortal, S. Dhouib, and S. Grard, “Integrating ontological domain knowledge into
a robotic dsl,” in Models in Software Engineering (J. Dingel and A. Solberg, eds.),
vol. 6627 of Lecture Notes in Computer Science, pp. 401–414, Springer Berlin /
Heidelberg, 2011.
[131] V. Vassiliadis, A Web Ontology Language - OWL Library for [SWI] Prolog. Release
Notes.
[134] M. Ghallab, D. Nau, and P. Traverso, Automated Planning and Acting. Cambridge
University Press, 2016.
[138] A. Cimatti, M. Pistore, M. Roveri, and P. Traverso, “Weak, strong, and strong cyclic
planning via symbolic model checking,” Artificial Intelligence, vol. 147, no. 1-2,
pp. 35–84, 2003.
[139] M. Pistore, R. Bettin, and P. Traverso, “Symbolic techniques for planning with ex-
tended goals in non-deterministic domains,” in Sixth European Conference on Plan-
ning, 2014.
[140] R. E. Fikes and N. J. Nilsson, “Strips: A new approach to the application of theorem
proving to problem solving,” Artificial intelligence, vol. 2, no. 3-4, pp. 189–208,
1971.
[142] M. Fox and D. Long, “Pddl2. 1: An extension to pddl for expressing temporal plan-
ning domains.,” J. Artif. Intell. Res.(JAIR), vol. 20, pp. 61–124, 2003.
[144] J. Benton, A. J. Coles, and A. I. Coles, “Temporal planning with preferences and
time-dependent continuous costs.,” in Proceedings of the Twenty Second Interna-
tional Conference on Automated Planning and Scheduling (ICAPS-12), June 2012.
[145] Z. A. Saigol, Automated planning for hydrothermal vent prospecting using AUVs.
PhD thesis, University of Birmingham, 2011.
[150] C. McGann, F. Py, K. Rajan, J. P. Ryan, and R. Henthorn, “Adaptive control for
autonomous underwater vehicles.,” in AAAI, pp. 1319–1324, 2008.
[154] O. Sidek and S. Quadri, “A review of data fusion models and systems,” International
Journal of Image and Data Fusion, vol. 3, no. 1, pp. 3–21, 2012.
[155] J. R. Boyd, “The essence of winning and losing,” Unpublished lecture notes, 1996.
[156] E. Shahbazian, D. E. Blodgett, and P. Labbé, “The extended ooda model for data
fusion systems,” in Intl Conf. on Info. Fusion-Fusion01, 2001.
[157] A. N. Steinberg, C. L. Bowman, and F. E. White, “Revisions to the jdl data fusion
model,” in AeroSense’99, pp. 430–441, International Society for Optics and Photon-
ics, 1999.
[158] A. N. Steinberg and C. L. Bowman, “Rethinking the jdl data fusion levels,” NSSDF
JHAPL, vol. 38, p. 39, 2004.
[159] E. Turban, J. Aronson, and T.-P. Liang, Decision Support Systems and Intelligent
Systems 7 Edition. Pearson Prentice Hall, 2005.
[160] J. Roy, “Rule-based expert system for maritime anomaly detection,” in SPIE De-
fense, Security, and Sensing, pp. 76662N–76662N, International Society for Optics
and Photonics, 2010.
[163] K. Kowalska and L. Peel, “Maritime anomaly detection using gaussian process active
learning,” in Information Fusion (FUSION), 2012 15th International Conference on,
pp. 1164–1171, IEEE, 2012.
[164] A. Vandecasteele and A. Napoli, “Spatial ontologies for detecting abnormal maritime
behaviour,” in 2012 Oceans-Yeosu, pp. 1–7, IEEE, 2012.
[165] J. Roy and M. Davenport, “Exploitation of maritime domain ontologies for anomaly
detection and threat analysis,” in 2010 International WaterSide Security Conference,
pp. 1–8, IEEE, 2010.
[166] F. T. Cetin, B. Yilmaz, Y. Kabak, J.-H. Lee, C. Erbas, E. Akagunduz, and S.-J. Lee,
“Increasing maritime situational awareness with interoperating distributed informa-
tion sources,” tech. rep., DTIC Document, 2013.
[168] P. Smets and R. Kennes, “The transferable belief model,” Artificial intelligence,
vol. 66, no. 2, pp. 191–234, 1994.
[177] L. Kunze, T. Roehm, and M. Beetz, “Towards semantic robot description lan-
guages,” in Robotics and Automation (ICRA), 2011 IEEE International Conference
on, pp. 5589–5595, IEEE, 2011.
[179] G. Papadimitriou and D. Lane, “Semantic based knowledge representation and adap-
tive mission planning for mcm missions using auvs,” in OCEANS 2014-TAIPEI,
pp. 1–8, IEEE, 2014.
BIBLIOGRAPHY 227
[181] U. Navy, “The navy unmanned undersea vehicle (uuv) master plan,” US Navy,
November, vol. 9, 2004.
[183] M. Prats, J. Pérez, J. J. Fernández, and P. J. Sanz, “An open source tool for simu-
lation and supervision of underwater intervention missions,” in 2012 IEEE/RSJ In-
ternational Conference on Intelligent Robots and Systems, pp. 2577–2582, IEEE,
2012.
[184] R. Wang and X. Qian, OpenSceneGraph 3 Cookbook. Packt Publishing Ltd, 2012.
[189] B. J. Frey and D. Dueck, “Clustering by passing messages between data points,”
science, vol. 315, no. 5814, pp. 972–976, 2007.
[192] T. N. Huynh and R. J. Mooney, “Max-margin weight learning for markov logic net-
works,” in Joint European Conference on Machine Learning and Knowledge Dis-
covery in Databases, pp. 564–579, Springer, 2009.
[195] J. Besag, “Statistical analysis of non-lattice data,” The statistician, pp. 179–195,
1975.