On Ability to Autonomously Execute Agent Programs with Sensing
Sebastian Sardiña∗
Dept. of Computer Science
University of Toronto
Toronto, Canada
[email protected]

Giuseppe De Giacomo
Dip. Informatica e Sistemistica
Univer. di Roma “La Sapienza”
Roma, Italy
[email protected]

Yves Lespérance
Dept. of Computer Science
York University
Toronto, Canada
[email protected]

Hector J. Levesque
Dept. of Computer Science
University of Toronto
Toronto, Canada
[email protected]

∗ First author is a student.
Abstract
Most existing work in agent programming assumes an
execution model where an agent has a knowledge base (KB)
about the current state of the world, and makes decisions
about what to do in terms of what is entailed or consistent
with this KB. We show that in the presence of sensing, such
a model does not always work properly, and propose an alternative that does. We then discuss how this affects agent
programming language design/semantics.
1. Introduction
There has been considerable work on formal models of
deliberation/planning under incomplete information, where
an agent can perform sensing actions to acquire additional
information. This problem is very important in agent applications such as web information retrieval/management.
However, much of the previous work on formal models
of deliberation—i.e., models of knowing how, ability, epistemic feasibility, executability, etc. such as [14, 3, 9, 11, 6]—
has been set in epistemic logic-based frameworks and is
hard to relate to work on agent programming languages (e.g.
3APL [8], AgentSpeak(L) [17]). In this paper, we develop
new non-epistemic formalizations of deliberation that are
much closer and easier to relate to standard agent programming language semantics based on transition systems.
When doing deliberation/planning under incomplete information, one typically searches over a set of states, each of
which is associated with a knowledge base (KB) or theory
that represents what is known in the state. To evaluate tests
in the program and to determine what transitions/actions are
possible, one looks at what is entailed by the current KB.
To allow for future sensing results, one looks at which of
these are consistent with the current KB. We call this type
of approach to deliberation “entailment and consistency-based” (EC-based). In this paper, we argue that EC-based approaches do not always work, and propose an alternative. Our accounts are formalized within the situation calculus and use a simple programming language based on ConGolog [5] to specify agent programs, as described in Section 2, but we claim that the results generalize to most proposed agent programming languages/frameworks. We point out that this paper is mainly concerned with the semantics of the deliberation process, not so much with the actual algorithms that implement it.
We initially focus on deterministic programs/plans and
how to formalize when an agent knows how to execute
them. For such deterministic programs, what this amounts
to is ensuring that the agent will always know what the next
step to perform is, and no matter what sensing results are
obtained, the agent will eventually get to the point where it
knows it can terminate. In Sections 3 and 4, we develop a
simple EC-based account of knowing how (KHowEC ). We
show that this account gives the wrong results on a simple example involving indefinite iteration. Then, we show
that whenever this account says that a deliberation/planning
problem is solvable, there is a conditional plan (a finite tree
program without loops) that is a solution. It follows that
this account is limited to problems where the total number
of steps needed can be bounded in advance. We claim that
this limitation is not specific to the simple account and applies to all EC-based accounts of deliberation.
The source of the problem with the EC-based account is
the use of local consistency checks to determine which sensing results are possible. This does not correctly distinguish
between the models that satisfy the overall domain specification (for which the plan must work) and those that do not.
To get a correct account of deliberation, one must take into
account what is true in different models of the domain together with what is true in all of them (what is entailed). In
Section 5, we develop such an entailment and truth-based
account (KHowET ), argue that it intuitively does the right
thing, and show how it correctly handles our test examples.
Following this, we consider richer notions of deliberation/planning, how they can be formalized, and how they
can be exploited in an agent programming language. In Section 6, we discuss the notion of ability to achieve a goal,
and show how it can be defined in terms of our notions of
knowing how to execute a deterministic program. We observe that an EC-based definition of ability inherits the limitations of the EC-based definition of knowing how. Then
in Section 7, we examine knowing how to execute a nondeterministic program. We consider two ways of interpreting
this: one (angelic knowing how) where the agent does planning/lookahead to make the right choices, and another (demonic knowing how) where the agent makes choices arbitrarily. We discuss EC-based and ET-based formalizations
of these notions. Finally in Section 8, we show how angelic knowing how can be used to specify a powerful planning construct in the IndiGolog agent programming language. We end by reviewing the paper’s contributions, discussing the lessons for agent programming language design,
and discussing future work.
All proofs can be found at the following address:
http://www.cs.toronto.edu/~ssardina/papers/paamas04.pdf
2. The Situation Calculus and IndiGolog
The technical machinery we use to define program execution in the presence of sensing is based on that of [7, 5].
The starting point in the definition is the situation calculus [12]. We will not go over the language here except to
note the following components: there is a special constant
S0 used to denote the initial situation, namely that situation in which no actions have yet occurred; there is a distinguished binary function symbol do where do(a, s) denotes
the successor situation to s resulting from performing the
action a; relations whose truth values vary from situation to
situation are called (relational) fluents, and are denoted by
predicate symbols taking a situation term as their last argument. There is a special predicate Poss(a, s) used to state that action a is executable in situation s. We assume that actions return binary sensing results, and we use the predicate SF(a, s) to characterize what the action tells the agent about its environment. For example, the axiom

SF(senseDoor(d), s) ≡ Open(d, s)

states that the action senseDoor(d) tells the agent whether the door is open in situation s. For actions with no useful sensing information, we write SF(a, s) ≡ True.
Within this language, we can formulate domain theories
which describe how the world changes as the result of the
available actions. Here, we use basic action theories [18] of
the following form:
• A set of foundational, domain independent axioms for
situations Σ as in [18].
• Axioms describing the initial situation, S0 .
• Action precondition axioms, one for each primitive action a, characterizing P oss(a, s).
• Successor state axioms for fluents of the form
F (~x, do(a, s)) ≡ γ(~x, a, s)
providing the usual solution to the frame problem.
• Sensed fluent axioms, as described above, of the form
SF (A(~x), s) ≡ φ(~x, s)
• Unique names axioms for the primitive actions.
To describe a run of a program which includes both actions and their sensing results, we use the notion of a history, i.e., a sequence of pairs (a, µ) where a is a primitive action and µ is 1 or 0, a sensing result. Intuitively,
the history σ = (a1 , µ1 ) · . . . · (an , µn ) is one where actions a1 , . . . , an happen starting in some initial situation,
and each action ai returns sensing value µi . We use end[σ]
to denote the situation term corresponding to the history σ,
and Sensed[σ] to denote the formula of the situation calculus stating all sensing results of the history σ. Formally,
end[ε] = S0, where ε is the empty history; and
end[σ · (a, µ)] = do(a, end[σ]).
Sensed[ε] = True;
Sensed[σ · (a, 1)] = Sensed[σ] ∧ SF(a, end[σ]);
Sensed[σ · (a, 0)] = Sensed[σ] ∧ ¬SF(a, end[σ]).
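To make this bookkeeping concrete, the following minimal Python sketch represents histories as lists of (action, result) pairs and computes end[σ] and Sensed[σ]; the term encoding and function names are illustrative assumptions, not part of the formalization.

```python
# A minimal sketch of histories and the end[.]/Sensed[.] bookkeeping.
# The term representation (tuples for do(a, s), strings for actions) is
# an illustrative choice, not part of the paper's formalization.

S0 = "S0"                       # the initial situation constant

def do(a, s):
    """The situation term do(a, s)."""
    return ("do", a, s)

def end(history):
    """end[sigma]: the situation term reached by the actions in the history."""
    s = S0
    for action, _result in history:
        s = do(action, s)
    return s

def sensed(history):
    """Sensed[sigma]: one SF literal per action, positive iff the result was 1."""
    literals, s = [], S0
    for action, result in history:
        literals.append(("SF", action, s, result == 1))
        s = do(action, s)
    return literals

# Example: the history (chop, 1).(look, 0)
sigma = [("chop", 1), ("look", 0)]
print(end(sigma))     # ('do', 'look', ('do', 'chop', 'S0'))
print(sensed(sigma))  # [('SF', 'chop', 'S0', True), ('SF', 'look', ..., False)]
```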
Next we turn to programs. We consider a very simple deterministic language with the following constructs:
a,                                  primitive action
δ1 ; δ2,                            sequence
if φ then δ1 else δ2 endIf,         conditional
while φ do δ endWhile,              while loop
This is a small subset of ConGolog [5] and we use its single-step transition semantics in the style of [16]. This semantics introduces two special predicates, Trans and Final: Trans(δ, s, δ′, s′) means that by executing program δ in situation s, one can get to situation s′ in one elementary step with the program δ′ remaining to be executed; Final(δ, s) means that program δ may successfully terminate in situation s.
Offline executions of programs, which are the kind of
executions originally proposed for Golog and ConGolog
[10, 5], are characterized using the Do(δ, s, s′ ) predicate,
which means that there is an execution of program δ that
starts in situation s and terminates in situation s′ . This holds
if there is a sequence of legal transitions from the initial configuration up to a final configuration:
Do(δ, s, s′) =def ∃δ′. Trans*(δ, s, δ′, s′) ∧ Final(δ′, s′),

where Trans* is the reflexive transitive closure of Trans. An offline execution of δ from s is a sequence of actions a1, . . . , an such that:

D ∪ C |= Do(δ, s, do(an, . . . , do(a1, s) . . .)),

where D is an action theory as mentioned above, and C is a set of axioms defining the predicates Trans and Final and the encoding of programs as first-order terms [5].
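As an illustration only, the sketch below gives one possible single-step interpreter for this deterministic subset, together with an offline Do computed by chaining transitions. It evaluates tests against a single given interpretation (holds), whereas the paper's Trans and Final are defined axiomatically by the axioms C; all names are assumptions of the sketch.

```python
# A toy single-step semantics (Trans/Final) for the deterministic subset and
# an offline Do obtained by chaining transitions; tests are evaluated against
# one fixed interpretation `holds`, a simplification of the axiomatic account.

def final(prog, s, holds):
    """Final(prog, s): may prog legally terminate in situation s?"""
    kind = prog[0]
    if kind == "nil":
        return True
    if kind == "seq":
        return final(prog[1], s, holds) and final(prog[2], s, holds)
    if kind == "if":
        _, test, thn, els = prog
        return final(thn if holds(test, s) else els, s, holds)
    if kind == "while":
        _, test, _body = prog
        return not holds(test, s)
    return False                      # a primitive action is never final

def trans(prog, s, holds, do, poss):
    """Trans(prog, s, prog', s'): the single legal step, or None if stuck."""
    kind = prog[0]
    if kind == "act":
        a = prog[1]
        return (("nil",), do(a, s)) if poss(a, s) else None
    if kind == "seq":
        _, p1, p2 = prog
        step = trans(p1, s, holds, do, poss)
        if step is not None:
            return ("seq", step[0], p2), step[1]
        return trans(p2, s, holds, do, poss) if final(p1, s, holds) else None
    if kind == "if":
        _, test, thn, els = prog
        return trans(thn if holds(test, s) else els, s, holds, do, poss)
    if kind == "while":
        _, test, body = prog
        if not holds(test, s):
            return None
        step = trans(body, s, holds, do, poss)
        return None if step is None else (("seq", step[0], prog), step[1])
    return None

def offline_do(prog, s, holds, do, poss):
    """Do(prog, s, s'): follow transitions to a final configuration (may loop
    forever on non-terminating programs); returns s', or None if stuck."""
    while not final(prog, s, holds):
        step = trans(prog, s, holds, do, poss)
        if step is None:
            return None
        prog, s = step
    return s
```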
Observe that an offline executor has no access to sensing results, available only at runtime. IndiGolog, an extension of ConGolog to deal with online executions with sensing, is proposed in [7]. The semantics defines an online execution of a program δ starting from a history σ. We say that
a configuration (δ, σ) may evolve to configuration (δ ′ , σ ′ )
w.r.t. a model M (relative to an underlying theory of action
D) iff¹ (i) M is a model of D ∪ C ∪ {Sensed[σ]}, (ii) D ∪ C ∪ {Sensed[σ]} |= Trans(δ, end[σ], δ′, end[σ′]), and (iii)

σ′ = σ · (a, 1)   if end[σ′] = do(a, end[σ]) and M |= SF(a, end[σ]);
σ′ = σ · (a, 0)   if end[σ′] = do(a, end[σ]) and M ⊭ SF(a, end[σ]);
σ′ = σ            if end[σ′] = end[σ].

¹ This definition is more general than the one in [7], where the sensing results were assumed to come from the actual environment rather than from a model (a model can represent any possible environment). Also, here we deal with non-terminating, i.e., infinite executions.
The model M above is only used to represent a possible environment and, hence, it is just used to generate the sensing results of the corresponding environment. Finally, we
say that a configuration (δ, σ) is final whenever
D ∪ C ∪ {Sensed[σ]} |= F inal(δ, end[σ]).
Using these two concepts of configuration evolution and final configurations, one can define various notions of online, incremental executions of programs as a sequence of legal configuration evolutions, possibly terminating in a final configuration.
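The division of labour in an online step, i.e., the theory decides which transition is known to be legal while a model of the environment supplies the sensing result, can be sketched as follows; `next_transition` and `sensing_outcome` are assumed oracles, not constructs from the paper.

```python
# A sketch of one step of online execution: the theory decides which
# transition is known to be legal, and a model of the environment supplies
# the sensing result that extends the history.

def online_step(prog, history, next_transition, sensing_outcome):
    """Return the successor configuration (prog', history'), or None if stuck.

    next_transition(prog, history): (prog', action) for the transition entailed
        by D ∪ C ∪ {Sensed[history]}, with action None for a test-only step;
        returns None when no transition is known to be possible.
    sensing_outcome(action, history): the result (1 or 0) returned by the
        environment model M for that action.
    """
    step = next_transition(prog, history)
    if step is None:
        return None
    prog2, action = step
    if action is None:                # test/termination step: history unchanged
        return prog2, history
    return prog2, history + [(action, sensing_outcome(action, history))]
```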
3. Deliberation: EC-based Account
Perhaps the first approach to come to mind for defining
when an agent knows how/is able to execute a deterministic
program δ in a history σ goes as follows: the agent must always know what the next action prescribed by the program
is and be able to perform it such that no matter what sensing output is obtained as a result of doing the action, she
can continue this process with what remains of the program
and, eventually, reach a configuration where she knows she
can legally terminate. We can formalize this idea as follows.
We say that a configuration (δ, σ) may evolve to configuration (δ′, σ′) w.r.t. a (background) theory D if and only if (δ, σ) may evolve to (δ′, σ′) w.r.t. some model M (relative to D). Note that we now have two notions of “configuration evolution,” one w.r.t. a particular model (cf. Section 2) and one w.r.t. a theory. Again, the model is used only to obtain the sensing values corresponding to some possible environment and not to determine the truth values of formulas.
An important point is that this alternative version of configuration evolution appeals to consistency in that it considers all evolutions of a configuration in which the sensing outcome (i.e., the environment response) is consistent with the underlying theory.
We define KHowEC (δ, σ) to be the smallest relation
R(δ, σ) such that:
(E1) if (δ, σ) is final, then R(δ, σ);
(E2) if (δ, σ) may evolve to configurations (δ ′ , σ · (a, µi ))
w.r.t. theory D with i = 1..k for some k ≥ 1, and
R(δ ′ , σ · (a, µi )) holds for all i = 1..k, then R(δ, σ).
The first condition states that every terminating configuration is in the relation KHowEC .
The second condition states that if a configuration performs an action transition and for every consistent sensing
result, the resulting configuration is in KHowEC , then this
configuration is also in KHowEC .
Note that, here, the agent’s lack of complete knowledge in a history σ is modeled by the theory D ∪ C ∪
{Sensed[σ]} being incomplete and having many different
models. KHowEC uses entailment to ensure that the information available is sufficient to determine which transition should be performed next. For instance, for a conditional program involving different primitive actions a1
and a2 in the “then” and “else” branches (i.e., such that
D |= a1 6= a2 ), the agent must know whether the test holds
and know how to execute the appropriate branch:
KHowEC (if φ then a1 else a2 endIf, σ) iff
D ∪ C ∪ {Sensed[σ]} |= φ(end[σ]) and KHowEC (a1 , σ)
or
D ∪ C ∪ {Sensed[σ]} |= ¬φ(end[σ]) and KHowEC (a2 , σ).
KHowEC uses consistency to determine which sensing results can occur, for each of which the agent needs to have a subplan that leads to a final configuration. Due to this, we say that KHowEC is an entailment and consistency-based (EC-based) account of knowing how.
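Read operationally, the definition above uses entailment to pick the next known transition and consistency to enumerate the sensing results that the agent must be prepared for. The sketch below is a bounded unfolding of (E1) and (E2), with `is_final` and `consistent_successors` as assumed oracles; the explicit bound anticipates the limitation shown in Section 4.

```python
# A sketch of the EC-based check read operationally: entailment determines
# the next known transition, consistency enumerates the sensing results that
# must all be covered.  `is_final` and `consistent_successors` are assumed
# oracles over D ∪ C ∪ {Sensed[sigma]}; the depth bound reflects that the
# least fixpoint only captures configurations solvable within some bound.

def khow_ec(prog, history, is_final, consistent_successors, bound):
    """Approximate KHowEC(prog, history) by unfolding (E1)/(E2) up to `bound`."""
    if is_final(prog, history):                       # condition (E1)
        return True
    if bound == 0:
        return False
    # consistent_successors returns the configurations (prog', history + [(a, mu)])
    # for the known next transition, one per sensing result mu that is
    # consistent with the theory; empty means no transition is known possible.
    successors = consistent_successors(prog, history)
    if not successors:
        return False
    return all(khow_ec(p2, h2, is_final, consistent_successors, bound - 1)
               for (p2, h2) in successors)            # condition (E2)
```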
This EC-based account of knowing how seems quite intuitive and attractive. However it has a fundamental limitation: it fails on programs involving indefinite iteration. The
following simple example from [9] shows the problem.
Consider a situation in which an agent wants to cut down
a tree. Assume that the agent has a primitive action chop to
chop at the tree, and also assume that she can always find
out whether the tree is down by doing the (binary) sensing action look. If the sensing result is 1, then the tree is
down; otherwise the tree remains up. There is also a fluent RemainingChops(s), which we assume ranges over
the natural numbers N and whose value is unknown to the
agent, and which is meant to represent how many chop actions are still required in s to bring the tree down. The
agent's goal is to bring the tree down, i.e., to bring about a situation s such that Down(s) holds, where

Down(s) =def RemainingChops(s) = 0.
The action theory Dtc is the union of:
1. The foundational axioms for situations Σ.
2. Duna = {chop ≠ look}.
3. Dss contains the following successor state axiom:
   RemainingChops(do(a, s)) = n ≡
     (a = chop ∧ RemainingChops(s) = n + 1) ∨
     (a ≠ chop ∧ RemainingChops(s) = n).
4. Dap contains the following two precondition axioms:
   Poss(chop, s) ≡ RemainingChops(s) > 0,
   Poss(look, s) ≡ True.
5. DS0 = {RemainingChops(S0) ≠ 0}.
6. Dsf contains the following two sensing axioms:
   SF(chop, s) ≡ True,
   SF(look, s) ≡ RemainingChops(s) = 0.
Notice that the sentence ∃n.RemainingChops(S0) = n (where the variable n ranges over N) is entailed by this theory, so "infinitely" hard tree trunks are ruled out. Nonetheless, the theory does not entail the sentence RemainingChops(S0) < k for any constant k ∈ N. Hence, in every model there is some n ∈ N, unknown to the agent and not bounded in advance, such that the tree will fall after n chops. Because of this, intuitively, we should have that the agent can bring the tree down, since if the agent keeps chopping, the tree will eventually come down, and the agent can
find out whether it has come down by looking. Thus, for
the program
δtc = while ¬Down do chop; look endWhile
we should have that KHowEC(δtc, ε) holds (note that δtc is deterministic). However, this is not the case:

Theorem 3.1 Let δtc be the above program to bring the tree down. Then, for all k ∈ N, KHowEC(δtc, [(chop, 1) · (look, 0)]^k) does not hold. In particular, when k = 0, KHowEC(δtc, ε) does not hold.
Thus, the simple EC-based formalization of knowing how gives the wrong result for this example. Why
is this so? Intuitively, it is easy to check that if the
agent knows how (to execute) the initial configuration, i.e., KHowEC(δtc, ε) holds, then she knows how (to execute) every possible finite evolution of it, i.e., for all j ∈ N, KHowEC(δtc, [(chop, 1) · (look, 0)]^j) and KHowEC((look; δtc), [(chop, 1) · (look, 0)]^j · (chop, 1)).
Now consider the hypothetical scenario in which an
agent keeps chopping and looking forever, always seeing that the tree is not down. There is no model of D tc
where δtc yields this scenario, as the tree is guaranteed to come down after a finite number of chops. However,
by the above, we see that KHowEC is, in some way, taking this case into account in determining whether the agent
knows how to execute δtc . This happens because every finite prefix of this never-ending execution is indeed consistent with D tc . The problem is that the set of all of them
together is not. This is why KHowEC fails. In the next section, we show that KHowEC ’s failure on the tree chopping example is due to a general limitation of the KHowEC
formalization. Note that Moore’s original account of ability [14] is closely related to KHowEC and also fails on the
tree chopping example [9].
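To see concretely why fixing the environment matters, the following illustrative sketch simulates the online execution of δtc against a single model of Dtc, i.e., a single hidden value of RemainingChops(S0); in every such model the loop terminates after finitely many iterations, even though no bound on that value is entailed by the theory.

```python
# A concrete illustration of the tree-chopping domain: each environment model
# of D_tc fixes one hidden value n = RemainingChops(S0) >= 1, and the online
# run of delta_tc against that model always terminates after n iterations.
# Illustrative code only.

def run_tree_chopping(n):
    """Online run of 'while not Down do chop; look endWhile' in one model."""
    assert n >= 1                     # D_S0 rules out RemainingChops(S0) = 0
    remaining, history = n, []
    down_sensed = False
    while not down_sensed:            # the loop test is settled by the last look
        remaining -= 1                # chop (Poss holds: remaining was > 0)
        history.append(("chop", 1))   # SF(chop, s) ≡ True: result is always 1
        result = 1 if remaining == 0 else 0
        history.append(("look", result))   # SF(look, s) ≡ RemainingChops(s) = 0
        down_sensed = (result == 1)
    return history

print(run_tree_chopping(3))
# [('chop', 1), ('look', 0), ('chop', 1), ('look', 0), ('chop', 1), ('look', 1)]
```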
4. KHowEC Only Handles Bounded Problems
In this section, we show that whenever KHowEC(δ, σ) holds for some program δ and history σ, there is a simple kind of conditional plan, which we call a TREE program, that can be followed to execute δ in σ. Since for TREE
that can be followed to execute δ in σ. Since for TREE
programs (and conditional plans), the number of steps they
perform can be bounded in advance (there are no loops), it
follows that KHowEC will never be satisfied for programs
whose execution cannot be bounded in advance. Since there
are many such programs (for instance, the one for the tree
chopping example), it follows that KHowEC is fundamentally limited as a formalization of knowing how and can
only be used in contexts where attention can be restricted to
bounded strategies. As in [6], we define the class of (sense-branch) tree programs TREE with the following BNF rule:

dpt ::= nil | a; dpt1 | senseφ; if φ then dpt1 else dpt2

where a is any non-sensing action, and dpt1 and dpt2 are tree programs.
This class includes conditional programs where one can
only test a condition that has just been sensed. Thus as
shown in [6], whenever a TREE program is executable,
it is also epistemically feasible, i.e., the agent can execute
it without ever getting stuck not knowing what transition to
perform next. TREE programs are clearly deterministic.
Let us define a relation KHowByEC : Program × History × TREE. The relation is intended to associate a program δ and history σ for which KHowEC holds with some TREE program(s) that can be used as a strategy for successfully executing δ in σ.
We define KHowByEC(δ, σ, δ^tp) to be the least relation R(δ, σ, δ^tp) such that:
(A) if (δ, σ) is final, then R(δ, σ, nil);
(B) if (δ, σ) may evolve to configurations (δ′, σ · (a, µi)) w.r.t. theory D, 1 ≤ i ≤ 2, and there exist δ_i^tp′ such that R(δ′, σ · (a, µi), δ_i^tp′), then R(δ, σ, (a; if φ then δ_1^tp else δ_2^tp endIf)), where φ is the condition on the right-hand side of the sensed fluent axiom for a, and δ_i^tp is nil if D ∪ C ∪ {Sensed[σ · (a, µi)]} is inconsistent, and δ_i^tp is δ_i^tp′ otherwise.
It is possible to show that whenever KHowByEC(δ, σ, δ^tp) holds, then KHowEC(δ, σ) and KHowEC(δ^tp, σ) hold, and the TREE program δ^tp is guaranteed to terminate in a Final situation of the given program δ (in all models).
Theorem 4.1 For all programs δ, histories σ, and programs δ^tp, if KHowByEC(δ, σ, δ^tp) then we have that
• KHowEC(δ, σ) and KHowEC(δ^tp, σ) hold; and
• D ∪ C ∪ {Sensed[σ]} |= ∃s.Do(δ^tp, end[σ], s) ∧ Do(δ, end[σ], s).
In addition, every configuration captured in KHowEC
can be executed using a TREE program.
Theorem 4.2 For all programs δ and histories σ, if
KHowEC (δ, σ), then there exists a program δ tp such that
KHowByEC (δ, σ, δ tp ).
Since the number of steps a TREE program performs can be bounded in advance, it follows that KHowEC will never hold for programs/problems that are solvable, but whose executions require a number of steps that cannot be bounded in advance, as is the case with the program in the tree chopping example. Thus KHowEC is severely restricted as an account of knowing how; it can only be complete when all possible strategies are bounded.

5. Deliberation: ET-based Account

We saw in Section 3 that the reason KHowEC failed on the tree chopping example was that it required the agent to have a choice of action that guaranteed reaching a final configuration even for histories that were inconsistent with the domain specification, such as the infinite history corresponding to the hypothetical scenario described at the end of Section 3. There was a branch in the configuration tree that corresponded to that history. This occurred because “local consistency” was used to construct the configuration tree. The consistency check kept switching which model of D ∪ C (which may be thought of as representing the environment) was used to generate the next sensing result, forever postponing the observation that the tree has come down. But in the real world, sensing results come from a fixed environment (even if we don't know which environment this is). It seems reasonable that we could correct the problem by fixing the model of D ∪ C used in generating possible configurations in our formalization of knowing how. This is what we will now do.

We define when an agent knows how to execute a program δ in a history σ and a model M (which represents the environment), KHowInM(δ, σ, M), as the smallest relation R(δ, σ) such that:

(T1) if (δ, σ) is final, then R(δ, σ);
(T2) if (δ, σ) may evolve to (δ′, σ · (a, µ)) w.r.t. M and R(δ′, σ · (a, µ)), then R(δ, σ).

The only difference between this and KHowEC is that the sensing results come from the fixed model M. Given this, we obtain the following formalization of when an agent knows how to execute a program δ in a history σ:

KHowET(δ, σ) iff for every model M such that M |= D ∪ C ∪ {Sensed[σ]}, KHowInM(δ, σ, M).

We call this type of formalization entailment and truth-based, since it uses entailment to ensure that the agent knows what transitions she can do, and truth in a model to obtain possible sensing results.

We claim that KHowET is actually correct for programs δ that are deterministic. For instance, it handles the tree chopping example correctly:

Proposition 5.1 KHowET(δtc, ε) holds w.r.t. theory Dtc.

Furthermore, KHowET is strictly more general than KHowEC. Formally,

Theorem 5.2 For any background theory D and any configuration (δ, σ), if KHowEC(δ, σ) holds, then KHowET(δ, σ). Moreover, there is a background theory D* and a configuration (δ*, σ*) such that KHowET(δ*, σ*) holds, but KHowEC(δ*, σ*) does not.

6. Ability and Planning

Using our two notions of knowing how, we can define related notions of ability to achieve a goal [9, 14, 3]. For both KHowEC and KHowET, we formally specify when an agent can achieve a goal φ in a history σ: CanX(φ, σ) (where X = EC or X = ET) iff there exists a deterministic program δ^d such that KHowX(δ^d, σ) and

D ∪ C ∪ {Sensed[σ]} |= ∃s.Do(δ^d, end[σ], s) ∧ φ(s),
i.e., the agent knows how to execute the program and the
program (always) terminates in a situation where the goal
has been achieved. For the tree chopping problem we have
that CanET (Down, S0 ) holds, but CanEC (Down, S0 )
does not. Based on Theorem 5.2, we can also show
that CanET is a strictly more general account of ability than CanEC .
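To make the contrast with the EC-based account concrete, here is a rough operational sketch of the notions above: KHowInM follows the unique known transition using sensing results from one fixed model, KHowET requires success in every model of the theory, and CanET looks for one deterministic strategy that works. The `models` collection and the `is_final`, `known_transition`, `goal_holds_after`, and `candidate_strategies` oracles are assumptions of this illustration, not constructs from the paper; a run that never terminates in some model simply means the check does not succeed.

```python
# A rough operational sketch of the ET-based notions KHowInM, KHowET, Can_ET.

def khow_in_m(prog, history, model, is_final, known_transition):
    """(T1)/(T2): follow the known transitions, with sensing taken from `model`."""
    while not is_final(prog, history):
        step = known_transition(prog, history)     # entailed by D ∪ C ∪ {Sensed}
        if step is None:
            return False                           # the agent is stuck
        prog, action = step
        if action is not None:                     # an action step: ask the model
            history = history + [(action, model.sensing_result(action, history))]
    return True

def khow_et(prog, history, models, is_final, known_transition):
    """KHowET: the per-model check must succeed for every model of the theory."""
    return all(khow_in_m(prog, history, m, is_final, known_transition)
               for m in models)

def can_et(goal_holds_after, candidate_strategies, history, models,
           is_final, known_transition):
    """Can_ET(phi, sigma): some deterministic strategy is known-how executable
    and is entailed to end in a situation where the goal holds."""
    return any(khow_et(d, history, models, is_final, known_transition)
               and goal_holds_after(d, history)
               for d in candidate_strategies)
```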
One can easily use this to define a new construct
achieve(φ) that does planning to achieve a goal and
add it to an agent programming language. But generally, one also wants to be able to specify constraints on the
search for a plan to achieve the goal, constraints on what
sort of plan should be considered. One way to do this is to
specify the task as that of executing a nondeterministic program, which we now turn to.
7. Nondeterministic Programs
When it comes to nondeterministic programs, the notion
of knowing how needs to be extended. In particular there
are two main notions of nondeterministic program execution of interest for agents. The first notion states that choices
at program choice points are under the control of the agent,
and hence can involve reasoning on the part of the agent. In
this case, an agent knows how to execute a nondeterministic program if she is able to make choices along the execution of the program so that, at each step, the action chosen is
known to be executable and such that no matter what sensing output is obtained as a result of doing the action, she can
continue this process and eventually terminate successfully.
This assumes that the agent does some planning/lookahead
to find a strategy for executing the nondeterministic program such that this strategy is guaranteed to succeed.
The second notion of nondeterministic program execution states that choices at its choice points are not under
the control of the agent. Hence, no reasoning, planning or
lookahead, is involved in choosing the next step, and all
choices must be accepted and dealt with by the agent. In
this case, the agent knows how to execute a nondeterministic program if every possible way she may execute it would
eventually terminate successfully.
Suppose that we enlarge our programming language with
the following nondeterministic constructs:
φ?,                 wait for a condition
δ1 | δ2,            nondeterministic branch
π x. δ(x),          nondeterministic choice of argument
δ*,                 nondeterministic iteration
δ1 ‖ δ2,            (interleaved) concurrency
The first (wait/test) construct blocks until its condition becomes true; it is still a deterministic construct and produces transitions involving no action. Such a construct is useful with nondeterministic programs and can be easily accommodated into the already given definitions of KHowEC and KHowET by just adding one extra condition to their corresponding definitions:
(E3) if (δ, σ) may evolve to (δ ′ , σ) w.r.t. theory D and
R(δ ′ , σ), then R(δ, σ) (for KHowEC )
(T3) if (δ, σ) may evolve to (δ ′ , σ) w.r.t. M and R(δ ′ , σ),
then R(δ, σ) (for KHowET )
Let us first focus on knowing how for nondeterministic programs where choices are under the control of the agent, i.e., angelic knowing how. We can define EC/ET versions of this notion. We can take KHow^Ang_ECnd to be just KHowEC with (E3) included. This works just fine for nondeterministic programs, though it still fails to capture (good) programs with an unbounded number of steps. On the other hand, KHowET, as defined and with (T3) included, is too weak. Consider the following example. There is a treasure behind one of two doors but the agent does not know which. We want to know if she knows how to execute the program δtreas:

[(open1; look) | (open2; look)]; AtTreasure?
Intuitively, the agent does not know how to execute δtreas
because she does not know which door to open to get
to the treasure. However, KHowET(δtreas, ε) holds. Indeed, in a model M1 where the treasure is behind door 1, the agent can pick the open1 action, and then we have KHowInM((look; AtTreasure?), [(open1, 1)], M1), and thus KHowInM(δtreas, ε, M1). Similarly, in a model M2 where the treasure is behind door 2, she can pick open2, and thus KHowInM(δtreas, ε, M2).
The problem with KHowET for nondeterministic programs is that the action chosen need not be the same in different models even if they have generated the same sensing results up to that point and are indistinguishable for the
agent. We can solve this problem by requiring that the agent
have a common strategy for all models/environments, i.e.,
that she has a deterministic program δ d that she knows how
to execute (in all models of the theory) and knows δ d will
terminate in a final situation of the given program δ:
KHow^Ang_ETnd(δ, σ) iff there is a deterministic program δ^d such that KHowET(δ^d, σ) and

D ∪ C ∪ {Sensed[σ]} |= ∃s.Do(δ^d, end[σ], s) ∧ Do(δ, end[σ], s).
We do not think it is possible to obtain a much simpler general formalization of knowing how that avoids the existential quantification over deterministic programs/strategies.
Proposition 7.1 For any background theory D and any configuration (δ, σ), if KHow^Ang_ECnd(δ, σ) holds, then KHow^Ang_ETnd(δ, σ). Also, there is a background theory D* and a configuration (δ*, σ*) such that KHow^Ang_ETnd(δ*, σ*) holds, but KHow^Ang_ECnd(δ*, σ*) does not.
An interesting point is that, now, ability to achieve a goal,
as defined in the previous section, can be seen as a special
case of knowing how to execute a (nondeterministic) program. Indeed, we (re)define CanX(φ, σ) as follows:

CanX(φ, σ) iff KHow^Ang_Xnd(while ¬φ do (πa.a) endWhile, σ),
i.e., the agent knows how to execute the program that involves repeatedly choosing and executing some action until
the goal has been achieved.
Next, we turn our attention to knowing how for nondeterministic programs where choices are not under the control of the agent, i.e., demonic knowing how. We start by
defining a relation between a program δa and another program δb that simulates it:
Simulates(δa, δb, s) =def ∃R.(∀δ1, δ2, s. R(δ1, δ2, s) ⊃ ...) ∧ R(δa, δb, s),

where the ellipsis stands for the conjunction of:

Final(δ1, s) ⊃ Final(δ2, s),
¬Final(δ1, s) ∧ ∀δ1′, s′. Trans(δ1, s, δ1′, s′) ⊃ ∃δ2′. Trans(δ2, s, δ2′, s′) ∧ R(δ1′, δ2′, s′).
For instance, if δ = ((a; b) | (a; c)) and δ ′ = (a; (b |
c)), then Simulates(δ, δ ′, s) holds, but Simulates(δ ′, δ, s)
does not (provided all actions are possible from situation s).
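For finite transition systems, the largest simulation relation can be computed by starting from all pairs and repeatedly discarding pairs that violate either condition. The following sketch does this, with step labels standing in for the situation reached; the encoding of configurations is an illustrative assumption, not the paper's second-order definition. It is followed by the example above.

```python
# Computing the largest simulation between two finite transition systems by
# iterated pruning; configurations and step labels (standing in for the
# situation reached) are an illustrative encoding.

def simulates(states_a, states_b, final_a, final_b, trans_a, trans_b):
    """Return the largest simulation R: every pair (a, b) in R satisfies
    (1) if a is final then b is final, and (2) every labelled step of a
    can be matched by an equally labelled step of b into a pair still in R."""
    R = {(a, b) for a in states_a for b in states_b}
    changed = True
    while changed:
        changed = False
        for (a, b) in sorted(R):
            ok_final = (a not in final_a) or (b in final_b)
            ok_step = all(any(lab2 == lab and (a2, b2) in R
                              for (lab2, b2) in trans_b.get(b, set()))
                          for (lab, a2) in trans_a.get(a, set()))
            if not (ok_final and ok_step):
                R.discard((a, b))
                changed = True
    return R

# The example from the text: (a;b)|(a;c) is simulated by a;(b|c), not vice versa.
A, B = {"(a;b)|(a;c)", "b", "c", "nil"}, {"a;(b|c)", "b|c", "nil"}
tA = {"(a;b)|(a;c)": {("a", "b"), ("a", "c")},
      "b": {("b", "nil")}, "c": {("c", "nil")}}
tB = {"a;(b|c)": {("a", "b|c")}, "b|c": {("b", "nil"), ("c", "nil")}}
print(("(a;b)|(a;c)", "a;(b|c)") in simulates(A, B, {"nil"}, {"nil"}, tA, tB))  # True
print(("a;(b|c)", "(a;b)|(a;c)") in simulates(B, A, {"nil"}, {"nil"}, tB, tA))  # False
```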
We say that an agent knows how to execute a nondeterministic program in the demonic sense whenever she knows how to execute every possible deterministic simulation of it:

KHow^Dem_ETnd(δ, σ) iff KHowET(δ^d, σ) holds for every deterministic program δ^d such that
D ∪ C ∪ {Sensed[σ]} |= Simulates(δ^d, δ, end[σ]).

This account of knowing how can be seen as a specification of nondeterministic programs when choices at choice points are not under the control of the agent. This is relevant for agent programming languages in which nondeterministic programs are used to represent the agent behavior but an online/reactive account of execution of these programs is used (3APL [8], AgentSpeak(L) [17], etc.). In those frameworks, the nondeterministic program must work no matter how choice points are resolved.
8. Deliberation in IndiGolog
We can use our formalization of knowing how to provide a better semantics for search/deliberation in agent programming languages. Let’s show how this is done for IndiGolog [7]. In IndiGolog, the programmer controls when
deliberation occurs. By default, there is no deliberation or
lookahead; the interpreter arbitrarily selects a transition and
performs it on-line, until it gets to a final configuration. To
perform deliberation/lookahead, the programmer uses the
search operator Σ(δ), where δ is the part of the program that
needs to be deliberated over. The idea is that this Σ only allows a transition for δ if there exists a sequence of further
transitions that would allow δ to terminate successfully. The
original definition for the search operator in [7] failed to ensure that the plan was epistemically feasible, i.e., allowed
cases where a sequence of transitions must exist, but where
the agent cannot determine what the sequence is; our proposal here corrects this.
We can specify the semantics of this operator by extending our earlier notions of configuration evolution and finalness, used in defining online executions. A configuration (Σ(δ), σ) is final if and only if the configuration (δ, σ) is final. A configuration (Σ(δ), σ) may evolve to a configuration (δ′, σ′) w.r.t. a model M if and only if there exists a deterministic program δ^d such that:

• KHow^Ang_ET(δ^d, σ) holds;
• D ∪ C ∪ {Sensed[σ]} |= ∃s.Do(δ^d, end[σ], s) ∧ Do(δ, end[σ], s) (i.e., δ^d must terminate in a Final situation of the given program δ);
• (δ^d, σ) may evolve to (δ^d′, σ′) w.r.t. M;
• the program that remains afterwards is δ′ = Σ(δ^d′).

This semantics is metatheoretic and provides a rather simpler alternative to that proposed in [6], which is based on an epistemic version of the situation calculus.
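Operationally, a transition of Σ(δ) can be pictured as first finding a deterministic strategy that the agent knows how to execute and that realizes δ, and then performing that strategy's next step, keeping the rest of the strategy under the search operator. The sketch below assumes oracles `find_strategy` and `step_in_model` for these two tasks; it is an illustration of the definition above, not an implementation of IndiGolog.

```python
# A sketch of the refined search operator: a transition of Sigma(delta) is
# allowed only if some deterministic strategy delta_d is known-how executable
# (the ET-based check) and realizes delta; the step actually performed is the
# strategy's step, and the strategy is kept inside the remaining program.

def search_step(delta, history, model, find_strategy, step_in_model):
    """
    find_strategy(delta, history) -> a deterministic program delta_d that the
        agent knows how to execute and that terminates in a Final situation of
        delta (or None if no such strategy exists).
    step_in_model(delta_d, history, model) -> (delta_d', history') as in the
        online semantics of Section 2.
    Returns the next configuration of Sigma(delta), or None if deliberation fails.
    """
    delta_d = find_strategy(delta, history)
    if delta_d is None:
        return None
    step = step_in_model(delta_d, history, model)
    if step is None:
        return None
    delta_d_next, history_next = step
    return ("Sigma", delta_d_next), history_next
```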
9. Discussion and Conclusion

In this paper, we have looked at how to formalize when
an agent knows how to execute a program, which in the
general case, when the program is nondeterministic and the
agent does lookahead and reasons about possible execution strategies, subsumes ability to achieve a goal. First, we
have shown that an intuitively reasonable entailment and
consistency-based approach to formalizing knowing how,
KHowEC , fails on examples like our tree chopping case and
that, in fact, KHowEC can only handle problems that can be
solved in a bounded number of steps, i.e. without indefinite
iteration. Then, we developed an alternative entailment and
truth-based formalization, KHowET , that handles indefinite
iteration examples correctly. Finally, we proposed accounts
of ability and knowing how for nondeterministic programs.
The problems of accounts like KHowEC when they are formalized in epistemic logic, such as Moore’s [14], had been
pointed out before, for instance in [9]. However, the reasons for the problems were not well understood. The results we have presented clarify the source of the problems
and show what is needed for their solution. A simple metatheoretic approach to knowing how fails; one needs to take
entailment and truth into account together. (Even if we use
a more powerful logical language with a knowledge operator, knowledge and truth must be considered together.)
Our non-epistemic accounts of knowing how are easily related to models of agent programming language semantics and our results have important implications for this
area. While most work on agent programming languages
(e.g. 3APL [8], AgentSpeak(L) [17], etc.) has focused on
reactive execution, sensing is acknowledged to be important and there has been interest in providing mechanisms
for run-time planning/deliberation. The semantics of such
languages are usually specified as a transition system. For
instance in 3APL, configurations are pairs involving a program and a belief base, and a transition relation over such
pairs is defined by a set of rules. Evaluating program tests
is done by checking whether they are entailed by the belief base. Checking action preconditions is done by querying the agent’s belief base update relation, which would typically involve determining entailments over the belief base
— the 3APL semantics abstracts over the details of this.
Sensing is not dealt with explicitly, although one can suppose that it could be handled by simply updating the belief
base (AgentSpeak(L) has events for this kind of thing).
As mentioned, most work in the area only deals with
on-line reactive execution, where no deliberation/lookahead
is performed; this type of execution just involves repeatedly selecting some transition allowed in the current configuration and performing it. However, one natural view is
that deliberation can simply be taken as a different control
regime involving search over the agent program’s transition tree. In this view, a deliberating interpreter could first
lookahead and search the program’s transition tree to find a
sequence of transitions that leads to successful termination
and later execute this sequence. This assumes that the agent
can choose among all alternative transitions. Clearly, in the
presence of sensing, this idea needs to be refined. One must
find more than just a path to a final configuration in the transition tree; one needs to find some sort of conditional plan or
subtree where the agent has chosen some transition among
those allowed, but must have branches for all possible sensing results. The natural way of determining which sensing
results are possible is checking their consistency with the
current belief base. Thus, what is considered here is essentially an EC-based approach.
Also in work on planning under incomplete information,
e.g. [2, 15, 4], a similar sort of setting is typically used, and
finding a plan involves searching a (finite) space of knowledge states that are compatible with the planner’s knowledge. The underlying models of all these planners are meant
to represent only the current possible states of the environment, which, in turn, are updated upon the hypothetical execution of an action at planning time. We use models that are
dynamic in the sense that they represent the potential responses of the environment for any future state. In that way,
then, what the above planners are doing is deliberation in
the style of KHowEC .
Our results show that this view of deliberation is fundamentally flawed when sensing is present. It produces an account that only handles problems that can be solved in a
bounded number of actions. As an approach to implementing deliberation, this may be perfectly fine. But as a semantics or specification, it is wrong. What is required is a much
different kind of account, like our ET-based one.
One might argue that results concerning the indistinguishability of unbounded nondeterminism [13, 1] (e.g., a*b being observationally indistinguishable from a*b + a^ω) are
a problem for our approach, but this is not the case because
we are assuming that agents can reason about all possible
program executions/futures.
Finally, we believe that there is a close relationship between KHowETnd and some of the earlier epistemic accounts of knowing how and ability [14, 3, 9, 11, 6]. We hope to get
some correspondence results on this soon.
References
[1] K. Apt and E. Olderog. Verifi cation of Sequential and Concurrent Programs. Springer-Verlag, 1997.
[2] P. Bertoli, A. Cimatti, M. Roveri, and P. Traverso. Planning
in nondeterministic domains under partial observability via
symbolic model checking. In Proc. of IJCAI-01, pages 473–
478, 2001.
[3] E. Davis. Knowledge preconditions for plans. Journal of
Logic and Computation, 4(5):721–766, 1994.
[4] G. De Giaccomo, L. Iocchi, D. Nardi, and R. Rosati. Planning with sensing for a mobile robot. In Proc, of ECP-97,
pages 156–168, 1997.
[5] G. De Giacomo, Y. Lespérance, and H. J. Levesque. ConGolog, a concurrent programming language based on the situation calculus. Artifi cial Intelligence, 121:109–169, 2000.
[6] G. De Giacomo, Y. Lespérance, H. J. Levesque, and
S. Sardiña. On the semantics of deliberation in IndiGolog:
From theory to implementation. In Proc. of KR-02, pages
603–614, 2002.
[7] G. De Giacomo and H. J. Levesque. An incremental interpreter for high-level programs with sensing. In H. J.
Levesque and F. Pirri, editors, Logical Foundations for Cognitive Agents, pages 86–102. Springer-Verlag, 1999.
[8] K. V. Hindriks, F. S. de Boer, W. van der Hoek, and J. J. C.
Meyer. A formal semantics for an abstract agent programming language. In Proc. of ATAL-97, pages 215–229, 1998.
[9] Y. Lespérance, H. J. Levesque, F. Lin, and R. B. Scherl. Ability and knowing how in the situation calculus. Studia Logica, 66(1):165–186, 2000.
[10] H. Levesque, R. Reiter, Y. Lesperance, F. Lin, and R. Scherl.
GOLOG: A logic programming language for dynamic domains. Journal of Logic Programming, 31:59–84, 1997.
[11] F. Lin and H. J. Levesque. What robots can do: Robot
programs and effective achievability. Artifi cial Intelligence,
101:201–226, 1998.
[12] J. McCarthy and P. Hayes. Some philosophical problems
from the standpoint of artifi cial intellig ence. In B. Meltzer
and D. Michie, editors, Machine Intelligence, volume 4,
pages 463–502. Edinburgh University Press, 1979.
[13] R. Milner. Communication and Concurrency. Prentice Hall,
1989.
[14] R. C. Moore. A formal theory of knowledge and action. In
J. R. Hobbs and R. C. Moore, editors, Formal Theories of the
Common Sense World, pages 319–358. 1985.
[15] R. Petrick and F. Bacchus. A knowledge-based approach to
planning with incomplete information and sensing. In Proc.
of AIPS-02, pages 212–221, 2002.
[16] G. Plotkin. A structural approach to operational semantics.
Technical Report DAIMI-FN-19, Computer Science Dept.,
Aarhus University, Denmark, 1981.
[17] A. S. Rao. AgentSpeak(L): BDI agents speak out in a logica computable language. In W. V. Velde and J. W. Perram,
editors, Agents Breaking Away (LNAI), volume 1038, pages
42–55. Springer-Verlag, 1996.
[18] R. Reiter. Knowledge in Action: Logical Foundations for
Specifying and Implementing Dynamical Systems. MIT
Press, 2001.
A. PROOFS
PROOF OF THEOREM 3.1:
Let R be the smallest relation defining KHowEC. Then, R is the smallest relation satisfying conditions (E1)-(E3). Assume, to the contrary, that R(δtc, [chop, (look, 0)]^m) holds for some m ≥ 0. By Lemma A.1, there exists a binary relation R′ ⊆ R such that R′(δtc, [chop, (look, 0)]^k) does not hold for any k ≥ 0 and such that R′ satisfies (E1)-(E3). This means that R′(δtc, [chop, (look, 0)]^m) does not hold and, therefore, R′ ⊂ R (i.e., R′ is a proper subset of R). Therefore, R is not the smallest relation satisfying (E1)-(E3), which contradicts the initial assumption.
Then, ¬R(δtc, [chop, (look, 0)]^k) for all k ≥ 0, and the relation KHowEC does not contain the pair (δtc, [chop, (look, 0)]^k). In particular, (δtc, ε) is not in the KHowEC relation when we take k = 0.

Lemma A.1
Let δtc be the program to bring the tree down. This lemma will prove a more general version of the result by showing that KHow^Ang_ECnd(δtc, σ) does not hold for the relevant histories; it is easy to see that KHowEC is always a subset of KHow^Ang_ECnd.
Let R be a set of program-history pairs (δ, σ) satisfying conditions (E1)-(E3) of the definition of KHow^Ang_ECnd. Then, there exists a set R′ ⊆ R such that R′ satisfies conditions (E1), (E2), and (E3), and such that R′(δtc, [chop, (look, 0)]^k) does not hold for any k ≥ 0.
PROOF:
Intuitively, we will prove that we can "safely" remove all program-history pairs of the form (δtc, [chop, (look, 0)]^k) from the set R. To do that, let us define R′ = R − β, where:

β = {(δtc, σ) : σ = [chop, (look, 0)]^k ∧ k ≥ 0} ∪ {((look; δtc), σ) : σ = [chop, (look, 0)]^k · chop ∧ k ≥ 0}.

Clearly, R′ ⊆ R and ¬R′(δtc, ε). It remains to show that relation R′ does indeed satisfy conditions (E1), (E2), and (E3).

The set R′ satisfies (E1)
This is trivial since all configurations for which β holds are not final. Concretely, if (δ, σ) is final, then R(δ, σ) and ¬β(δ, σ). Therefore, R′(δ, σ).

The set R′ satisfies (E2)
Let δ be any program and σ be any history. Let us consider the following three exhaustive and exclusive cases depending on the form of the history σ:
1. σ ≠ [chop, (look, 0)]^k and σ ≠ [chop, (look, 0)]^k · chop.
   First, note that, because of the form of σ, ¬β(δ, σ) is true. Assume then that there exist configurations (δi, σ · (a, µi)), i ≥ 1, such that (δ, σ) may evolve to them w.r.t. theory Dtc and such that R′(δi, σ · (a, µi)) holds for all i. Since R′ ⊆ R, R(δi, σ · (a, µi)) holds for all i, and, given that R satisfies condition (E2), R(δ, σ) is true. Due to the fact that ¬β(δ, σ), R′(δ, σ) holds as well.
2. σ = [chop, (look, 0)]^k, for some k ≥ 0.
   Assume that there exist configurations (δi, σ · (a, µi)), i ≥ 1, such that (δ, σ) may evolve to them w.r.t. theory Dtc. Let us now consider the following two exhaustive and exclusive cases:
   (a) δ ≠ δtc, i.e., δ ≠ while ¬Down do chop; look endWhile.
       Suppose that R′(δi, σ · (a, µi)) holds for all i. Hence, R(δi, σ · (a, µi)) holds because R′ ⊆ R, and R(δ, σ) holds since R satisfies (E2). Moreover, because of the form of both δ and σ, it is easy to check that ¬β(δ, σ). Thus, R′(δ, σ) holds.
   (b) δ = δtc, i.e., δ = while ¬Down do chop; look endWhile.
       In this case, there can only be one possible configuration (δ′, σ′) to which (δ, σ) may evolve w.r.t. theory D, namely, δ′ = (look; δtc) and σ′ = σ · chop. Since σ′ = [chop, (look, 0)]^k · chop, β(δ′, σ′) holds, and, therefore, ¬R′(δ′, σ′) applies. Thus, condition (E2) is trivially satisfied by R′.
3. σ = [chop, (look, 0)]^k · chop, for some k ≥ 0.
   Here, let us consider the following two exhaustive and exclusive cases depending on the form of the program δ:
   (a) δ ≠ (look; δtc), i.e., δ ≠ (look; while ¬Down do chop; look endWhile).
       Suppose that R′(δi, σ · (a, µi)) holds for all i. Hence, R(δi, σ · (a, µi)) holds because R′ ⊆ R, and R(δ, σ) holds since R satisfies (E2). Moreover, because of the form of both δ and σ, it is easy to check that ¬β(δ, σ). Thus, R′(δ, σ) holds.
   (b) δ = (look; δtc), i.e., δ = (look; while ¬Down do chop; look endWhile).
       Finally, the most interesting case. We can easily verify that (δ, σ) may evolve to (δtc, σ′) w.r.t. theory D, where σ′ = σ · (look, 0). Informally, there is always a possible evolution of the configuration in which the tree was just sensed to be still up. Technically, there is always a model M of D ∪ C ∪ {Sensed[σ · (look, 0)]}.
       Next, we observe that σ′ = [chop, (look, 0)]^{k+1}, and, therefore, β(δtc, σ′) holds. Thus, by the way we defined R′ in terms of R and β, ¬R′(δtc, σ · (look, 0)) applies and condition (E2) is trivially satisfied in this case.
       In words, there is always a possible and consistent sensing outcome for the only legal action-transition (i.e., a look action) such that the resulting configuration is not in the set. Recall that, in order to be forced to include the original configuration in the set, we would require that, no matter how the sensing turns out, the resulting configuration be in the set.
The set R′ satisfies (E3)
Let δ be any program and σ be any history. Let us consider the following two exhaustive and exclusive cases depending on the form of the program δ:
(a) δ ≠ δtc and δ ≠ look; δtc.
    First, note that, due to the form of δ, ¬β(δ, σ) is true. Assume then that there exists a program δ′ such that (δ, σ) may evolve to (δ′, σ) w.r.t. Dtc and such that R′(δ′, σ) holds. Since R′ ⊆ R, then R(δ′, σ) holds as well. Given that R satisfies (E3), R(δ, σ). Finally, because ¬β(δ, σ), it follows that R′(δ, σ) is true.
(b) δ = δtc or δ = look; δtc.
    In this case, there is no program δ′ such that (δ, σ) may evolve to (δ′, σ), as there can be no transition without an action, either a look action or a chop one. Therefore, condition (E3) is trivially satisfied.
PROOF OF PROPOSITION 5.1:
Let M be a model of Dtc ∪ C. We are to prove that KHowInM(δtc, ε, M). Let k be the number of chops required initially to get the tree down in model M, i.e., M |= RemainingChops(S0) = k. It is not difficult to see that the history σ = [(chop, 1) · (look, 0)]^{k−1} · (chop, 1) · (look, 1) completely executes program δtc in model M. Moreover, we can check the following points:
• (δtc, σ) is final and hence, by condition (T1), KHowInM(δtc, σ, M) holds;
• ((look; δtc), [(chop, 1) · (look, 0)]^{k−1} · (chop, 1)) may evolve to (δtc, σ). By condition (T2), KHowInM((look; δtc), [(chop, 1) · (look, 0)]^{k−1} · (chop, 1), M) holds;
• (δtc, [(chop, 1) · (look, 0)]^{k−1}) may evolve to ((look; δtc), [(chop, 1) · (look, 0)]^{k−1} · (chop, 1)). By condition (T2), KHowInM(δtc, [(chop, 1) · (look, 0)]^{k−1}, M) holds;
• ...
• ((look; δtc), (chop, 1)) may evolve to (δtc, [(chop, 1) · (look, 0)]^1). By condition (T2), KHowInM((look; δtc), (chop, 1), M) holds;
• (δtc, ε) may evolve to ((look; δtc), (chop, 1)). By condition (T2), KHowInM(δtc, ε, M) holds.
PROOF OF THEOREM 5.2:
Part (ii) follows directly from Proposition 5.1 and Theorem 3.1, taking Dtc as the background theory.
Let us prove part (i). Suppose that (δ*, σ*) is a configuration such that KHowET(δ*, σ*) does not hold. We shall prove that KHowEC(δ*, σ*) is not true either.
By hypothesis, there is a model M of D ∪ C ∪ {Sensed[σ*]} such that KHowInM(δ*, σ*, M) is false. We define the binary relation R as follows: for all programs δ and all histories σ, R(δ, σ) iff KHowInM(δ, σ, M). We shall prove now that R satisfies conditions (E1) and (E2) in the definition of KHowEC:
(E1) Suppose that (δ, σ) is final. Then, by condition (T1) in the definition of KHowInM, KHowInM(δ, σ, M) is true and R(δ, σ) holds.
(E2) Suppose that (δ, σ) may evolve to configurations (δ′, σ · (a, µi)) w.r.t. theory D with 1 ≤ i ≤ k, and that R(δ′, σ · (a, µi)) for all 1 ≤ i ≤ k.
If σ is inconsistent (i.e., D ∪ C ∪ {Sensed[σ]} is unsatisfiable), then configuration (δ, σ) is final (in the general case), KHowInM(δ, σ, M) holds, and, therefore, R(δ, σ) holds as well.
Assume then that σ is consistent. Then, there should be some µj, 1 ≤ j ≤ k, such that (δ, σ) may evolve to (δ′, σ · (a, µj)) w.r.t. M. Notice that M |= Trans(δ, end[σ], δ′, do(a, end[σ])) and that µj = 1 iff M |= SF(a, end[σ]). Since R(δ′, σ · (a, µj)) holds, so does KHowInM(δ′, σ · (a, µj), M). Then, by condition (T2) in the definition of KHowInM, KHowInM(δ, σ, M) holds and, by the way we have defined R, it follows that R(δ, σ) holds.
Since KHowEC is the smallest relation closed under (E1) and (E2), KHowEC is contained in R; and since R(δ*, σ*) does not hold, KHowEC(δ*, σ*) does not hold either.
To generalize the result to KHow^Ang_ECnd and KHow^Ang_ETnd, we just need to add the following case to the above two:
(E3) Suppose that (δ, σ) may evolve to (δ′, σ) w.r.t. theory D and that R(δ′, σ) holds. Then, since the history remains the same, it is also true that (δ, σ) may evolve to (δ′, σ) w.r.t. model M (the configuration evolution does not depend on the environment). Moreover, by the way R was defined, KHowInM(δ′, σ, M) holds. By condition (T3) in the definition of KHowInM, it follows that KHowInM(δ, σ, M) and, therefore, R(δ, σ) holds as well.
PROOF OF PROPOSITION 7.1:
Assume KHow^Ang_ECnd(δ, σ) holds. By Theorem 4.2, there is a program δ^tp such that KHowByEC(δ, σ, δ^tp). By Theorem 4.1, δ^tp is a tree program and, thus, a deterministic one, and D ∪ C ∪ {Sensed[σ]} |= ∃s.Do(δ^tp, end[σ], s) ∧ Do(δ, end[σ], s). By Theorem 5.2, KHowET(δ^tp, σ) holds and, putting it all together, KHow^Ang_ETnd(δ, σ) applies.
Lemma A.2 (Induction Principle for KHow^Ang_ECnd)
For all relations T(δ, σ) over programs and histories, if T(δ, σ) is closed under assertions (E1), (E2), and (E3) of the definition of KHow^Ang_ECnd, then for all programs δ and histories σ, if KHow^Ang_ECnd(δ, σ), then T(δ, σ).
PROOF:
As for Theorem 3 of [5].
Lemma A.3 (Induction Principle for KHowByEC)
For all relations T(δ, σ, δ′) over programs, histories, and programs, if T(δ, σ, δ′) is closed under assertions (A), (B), and (C) of the definition of KHowByEC, then for all programs δ, δ′, and histories σ, if KHowByEC(δ, σ, δ′), then T(δ, σ, δ′).
PROOF:
As for Theorem 3 of [5].
PROOF OF THEOREM 4.1:
We shall prove a more general result w.r.t. KHow^Ang_ECnd. To that end, we add the following case to the definition of KHowByEC:

(C) if (δ, σ) may evolve to (δ′, σ) w.r.t. theory D and R(δ′, σ, δ^tp′), then R(δ, σ, (True?; δ^tp′)).

In addition, we enrich a bit the notion of TREE programs to include two special test steps:

dpt ::= nil | False? | a; dpt1 | True?; dpt1 | senseφ; if φ then dpt1 else dpt2

where a is any non-sensing action, and dpt1 and dpt2 are tree programs.
The proof goes by induction on KHowByEC. The property that we have to show closed under the assertions in the definition of KHowByEC (so as to be able to apply Lemma A.3) is the following one:

T(δ, σ, δ^tp) = KHow^Ang_ECnd(δ, σ) and δ^tp is a TREE program and D ∪ C ∪ {Sensed[σ]} |= ∃s.Do(δ^tp, end[σ], s) ∧ Do(δ, end[σ], s).
Checking that T(δ, σ, δ^tp) is closed under assertion (A) of the definition of KHowByEC:
Assume that (δ, σ) is a final configuration. Let us show that T(δ, σ, nil). By definition of KHow^Ang_ECnd, we have KHow^Ang_ECnd(δ, σ). As well, nil is a TREE program, and we have D ∪ C ∪ {Sensed[σ]} |= Do(nil, end[σ], end[σ]) ∧ Do(δ, end[σ], end[σ]). Thus, T(δ, σ, nil).
Checking that T(δ, σ, δ^tp) is closed under assertion (B) of the definition of KHowByEC:
Assume that (δ, σ) may evolve to one or more configurations (δ′, σ · (a, µi)) (for some action a) w.r.t. theory D, where 1 ≤ i ≤ 2, and that there exist δ_i^tp′ such that T(δ′, σ · (a, µi), δ_i^tp′). This means that:
(a) if D ∪ C ∪ {Sensed[σ · (a, 1)]} is consistent, then T(δ′, σ · (a, 1), δ_1^tp′), i.e., KHow^Ang_ECnd(δ′, σ · (a, 1)) and δ_1^tp′ is a TREE program and D ∪ C ∪ {Sensed[σ · (a, 1)]} |= ∃s.Do(δ_1^tp′, end[σ · (a, 1)], s) ∧ Do(δ′, end[σ · (a, 1)], s).
(b) if D ∪ C ∪ {Sensed[σ · (a, 0)]} is consistent, then T(δ′, σ · (a, 0), δ_2^tp′), i.e., KHow^Ang_ECnd(δ′, σ · (a, 0)) and δ_2^tp′ is a TREE program and D ∪ C ∪ {Sensed[σ · (a, 0)]} |= ∃s.Do(δ_2^tp′, end[σ · (a, 0)], s) ∧ Do(δ′, end[σ · (a, 0)], s).
Observe that since (δ, σ) may evolve to at least one (δ′, σ · (a, µ)) for some sensing outcome µ, either D ∪ C ∪ {Sensed[σ · (a, 0)]} or D ∪ C ∪ {Sensed[σ · (a, 1)]} is consistent.
Let us show that T(δ, σ, (a; if φ then δ_1^tp else δ_2^tp endIf)), where φ is the condition on the right-hand side of the sensed fluent axiom for a, and where δ_1^tp = nil if D ∪ C ∪ {Sensed[σ · (a, 1)]} is inconsistent and δ_1^tp = δ_1^tp′ otherwise, and δ_2^tp = nil if D ∪ C ∪ {Sensed[σ · (a, 0)]} is inconsistent and δ_2^tp = δ_2^tp′ otherwise.
By the definition of KHow^Ang_ECnd, we have that KHow^Ang_ECnd(δ, σ) holds (again, notice that (δ, σ) may evolve to at least one configuration of the above form).
Given the assumptions and the way δ_1^tp and δ_2^tp are defined, clearly a; if φ then δ_1^tp else δ_2^tp endIf is a TREE program.
It remains to show that

D ∪ C ∪ {Sensed[σ]} |= ∃s.Do((a; if φ then δ_1^tp else δ_2^tp endIf), end[σ], s) ∧ Do(δ, end[σ], s).
Pick a model M of D ∪ C ∪ {Sensed[σ]}. Suppose M satisfies SF(a, end[σ]), and thus D ∪ C ∪ {Sensed[σ · (a, 1)]} is true in M. Then, due to (a) above, Do(δ′, end[σ · (a, 1)], s) must hold for some s in M. In addition, Trans(δ, end[σ], δ′, end[σ · (a, 1)]) is true in M, and we thus have that Do(δ, end[σ], s) is true in M. Also, since

Trans((a; if φ then δ_1^tp else δ_2^tp endIf), end[σ], (nil; if φ then δ_1^tp else δ_2^tp endIf), end[σ · (a, 1)]),
φ(end[σ · (a, 1)]), and
Do(δ_1^tp, end[σ · (a, 1)], s)

are all true in M, we must, therefore, have that Do((a; if φ then δ_1^tp else δ_2^tp endIf), end[σ], s) holds in M.
Suppose, on the other hand, that M does not satisfy SF(a, end[σ]), and thus D ∪ C ∪ {Sensed[σ · (a, 0)]} is true in M. Then, due to (b) above, Do(δ′, end[σ · (a, 0)], s) must hold for some s in M. In addition, Trans(δ, end[σ], δ′, end[σ · (a, 0)]) is true in M, and we thus have that Do(δ, end[σ], s) is true in M. Also, since

Trans((a; if φ then δ_1^tp else δ_2^tp endIf), end[σ], (nil; if φ then δ_1^tp else δ_2^tp endIf), end[σ · (a, 0)]),
¬φ(end[σ · (a, 0)]), and
Do(δ_2^tp, end[σ · (a, 0)], s)

are all true in M, we must, therefore, have that Do((a; if φ then δ_1^tp else δ_2^tp endIf), end[σ], s) holds in M.
Therefore, T(δ, σ, (a; if φ then δ_1^tp else δ_2^tp endIf)) applies.
Checking that T(δ, σ, δ^tp) is closed under assertion (C) of the definition of KHowByEC:
Assume that (δ, σ) may evolve to (δ′, σ) w.r.t. the theory and that T(δ′, σ, δ^tp′), i.e., KHow^Ang_ECnd(δ′, σ) and δ^tp′ is a TREE program and D ∪ C ∪ {Sensed[σ]} |= ∃s.Do(δ^tp′, end[σ], s) ∧ Do(δ′, end[σ], s).
Let us show that T(δ, σ, (True?; δ^tp′)). By the definition of KHow^Ang_ECnd, we have KHow^Ang_ECnd(δ, σ). Since δ^tp′ is a TREE program, so must be True?; δ^tp′.
It remains to show that D ∪ C ∪ {Sensed[σ]} |= ∃s.Do((True?; δ^tp′), end[σ], s) ∧ Do(δ, end[σ], s). Pick a model of D ∪ C ∪ {Sensed[σ]}. Since in this model Do(δ′, end[σ], s) holds for some s and Trans(δ, end[σ], δ′, end[σ]) holds too, Do(δ, end[σ], s) also holds in the model in question. As well, since we have Do(δ^tp′, end[σ], s) and Trans((True?; δ^tp′), end[σ], (nil; δ^tp′), end[σ]), we must also have Do((True?; δ^tp′), end[σ], s). Thus D ∪ C ∪ {Sensed[σ]} |= ∃s.Do((True?; δ^tp′), end[σ], s) ∧ Do(δ, end[σ], s). Therefore, T(δ, σ, (True?; δ^tp′)) applies.
PROOF OF THEOREM 4.2:
We shall prove a more general result w.r.t. KHow^Ang_ECnd. To that end, we add the following case to the definition of KHowByEC:

(C) if (δ, σ) may evolve to (δ′, σ) w.r.t. theory D and R(δ′, σ, δ^tp′), then R(δ, σ, (True?; δ^tp′)).

In addition, we enrich a bit the notion of TREE programs to include two special test steps:

dpt ::= nil | False? | a; dpt1 | True?; dpt1 | senseφ; if φ then dpt1 else dpt2

where a is any non-sensing action, and dpt1 and dpt2 are tree programs.
The proof goes by induction on KHow^Ang_ECnd configurations. The property that we have to show closed under the assertions in the definition of KHow^Ang_ECnd (so as to be able to apply Lemma A.2) is the following one:

T(δ, σ) = there exists δ^tp such that KHowByEC(δ, σ, δ^tp) holds.

So we shall prove that T(δ, σ) is closed under (E1)-(E3) (that is, take T to be one of the possible R in the definition of KHow^Ang_ECnd).

Checking that T(δ, σ) is closed under assertion (E1) of the definition of KHow^Ang_ECnd:
Assume (δ, σ) is final. By definition of KHowByEC, case (A), we have KHowByEC(δ, σ, nil). Thus, T(δ, σ) applies.
Checking that T(δ, σ) is closed under assertion (E2) of the definition of KHow^Ang_ECnd:
Assume that (δ, σ) may evolve to configurations (δ′, σ · (a, µi)) (for some action a) w.r.t. theory D, where 1 ≤ i ≤ 2, and such that T(δ′, σ · (a, µi)) holds. Suppose that φ is the condition on the right-hand side of the sensed fluent axiom for a, that is, SF(a, s) ≡ φ(s) ∈ D. Since (δ, σ) must evolve to at least one configuration (possibly two), D ∪ C ∪ {Sensed[σ]} is consistent and, therefore, either D ∪ C ∪ {Sensed[σ · (a, 0)]} or D ∪ C ∪ {Sensed[σ · (a, 1)]} is consistent.
If (δ, σ) may evolve to (δ′, σ · (a, 1)), then D ∪ C ∪ {Sensed[σ · (a, 1)]} ought to be consistent, and, hence, T(δ′, σ · (a, 1)), i.e., there exists δ_1^tp′ such that KHowByEC(δ′, σ · (a, 1), δ_1^tp′).
Similarly, if (δ, σ) may evolve to (δ′, σ · (a, 0)), then D ∪ C ∪ {Sensed[σ · (a, 0)]} ought to be consistent, and, hence, T(δ′, σ · (a, 0)), i.e., there exists δ_2^tp′ such that KHowByEC(δ′, σ · (a, 0), δ_2^tp′).
Let δ^tp = (a; if φ then δ_1^tp else δ_2^tp endIf), where δ_1^tp = nil if D ∪ C ∪ {Sensed[σ · (a, 1)]} is inconsistent and δ_1^tp = δ_1^tp′ otherwise, and δ_2^tp = nil if D ∪ C ∪ {Sensed[σ · (a, 0)]} is inconsistent and δ_2^tp = δ_2^tp′ otherwise.
By the definition of KHowByEC, case (B), we have that KHowByEC(δ, σ, δ^tp). Thus, T(δ, σ) applies.
Checking that T(δ, σ) is closed under assertion (E3) of the definition of KHow^Ang_ECnd:
Assume that (δ, σ) may evolve to (δ′, σ) w.r.t. D and that T(δ′, σ), i.e., there exists δ^tp′ such that KHowByEC(δ′, σ, δ^tp′). By the definition of KHowByEC, case (C), we have KHowByEC(δ, σ, (True?; δ^tp′)). Thus, T(δ, σ) applies, taking δ^tp = (True?; δ^tp′).