On Ability to Autonomously Execute Agent Programs with Sensing
Sebastian Sardiña∗
Dept. of Computer Science
University of Toronto
Toronto, Canada
[email protected]

Giuseppe De Giacomo
Dip. Informatica e Sistemistica
Univer. di Roma “La Sapienza”
Roma, Italy
[email protected]

Yves Lespérance
Dept. of Computer Science
York University
Toronto, Canada
[email protected]

Hector J. Levesque
Dept. of Computer Science
University of Toronto
Toronto, Canada
[email protected]

∗ First author is a student.
Abstract
Most existing work in agent programming assumes an
execution model where an agent has a knowledge base (KB)
about the current state of the world, and makes decisions
about what to do in terms of what is entailed or consistent
with this KB. We show that in the presence of sensing, such
a model does not always work properly, and propose an alternative that does. We then discuss how this affects agent
programming language design/semantics.
1. Introduction
There has been considerable work on formal models of
deliberation/planning under incomplete information, where
an agent can perform sensing actions to acquire additional
information. This problem is very important in agent applications such as web information retrieval/management.
However, much of the previous work on formal models
of deliberation—i.e., models of knowing how, ability, epistemic feasibility, executability, etc. such as [14, 3, 9, 11, 6]—
has been set in epistemic logic-based frameworks and is
hard to relate to work on agent programming languages (e.g.
3APL [8], AgentSpeak(L) [17]). In this paper, we develop
new non-epistemic formalizations of deliberation that are
much closer and easier to relate to standard agent programming language semantics based on transition systems.
When doing deliberation/planning under incomplete information, one typically searches over a set of states, each of
which is associated with a knowledge base (KB) or theory
that represents what is known in the state. To evaluate tests
in the program and to determine what transitions/actions are
possible, one looks at what is entailed by the current KB.
To allow for future sensing results, one looks at which of
these are consistent with the current KB. We call this type
of approach to deliberation “entailment and consistency-based” (EC-based). In this paper, we argue that EC-based approaches do not always work, and propose an alternative. Our accounts are formalized within the situation calculus and use a simple programming language based on ConGolog [5] to specify agent programs, as described in Section 2, but we claim that the results generalize to most proposed agent programming languages/frameworks. We point out that this paper is mainly concerned with the semantics of the deliberation process, not so much with the actual algorithms that implement it.
We initially focus on deterministic programs/plans and
how to formalize when an agent knows how to execute
them. For such deterministic programs, what this amounts
to is ensuring that the agent will always know what the next
step to perform is, and no matter what sensing results are
obtained, the agent will eventually get to the point where it
knows it can terminate. In Sections 3 and 4, we develop a
simple EC-based account of knowing how (KHowEC ). We
show that this account gives the wrong results on a simple example involving indefinite iteration. Then, we show
that whenever this account says that a deliberation/planning
problem is solvable, there is a conditional plan (a finite tree
program without loops) that is a solution. It follows that
this account is limited to problems where the total number
of steps needed can be bounded in advance. We claim that
this limitation is not specific to the simple account and applies to all EC-based accounts of deliberation.
The source of the problem with the EC-based account is
the use of local consistency checks to determine which sensing results are possible. This does not correctly distinguish
between the models that satisfy the overall domain specification (for which the plan must work) and those that do not.
To get a correct account of deliberation, one must take into
account what is true in different models of the domain together with what is true in all of them (what is entailed). In
Section 5, we develop such an entailment and truth-based
account (KHowET ), argue that it intuitively does the right
thing, and show how it correctly handles our test examples.
Following this, we consider richer notions of deliberation/planning, how they can be formalized, and how they
can be exploited in an agent programming language. In Section 6, we discuss the notion of ability to achieve a goal,
and show how it can be defined in terms of our notions of
knowing how to execute a deterministic program. We observe that an EC-based definition of ability inherits the limitations of the EC-based definition of knowing how. Then
in Section 7, we examine knowing how to execute a nondeterministic program. We consider two ways of interpreting
this: one (angelic knowing how) where the agent does planning/lookahead to make the right choices, and another (demonic knowing how) where the agent makes choices arbitrarily. We discuss EC-based and ET-based formalizations
of these notions. Finally in Section 8, we show how angelic knowing how can be used to specify a powerful planning construct in the IndiGolog agent programming language. We end by reviewing the paper’s contributions, discussing the lessons for agent programming language design,
and discussing future work.
All proofs can be found at the following address:
http://www.cs.toronto.edu/~ssardina/papers/paamas04.pdf
2. The Situation Calculus and IndiGolog
The technical machinery we use to define program execution in the presence of sensing is based on that of [7, 5].
The starting point in the definition is the situation calculus [12]. We will not go over the language here except to
note the following components: there is a special constant
S0 used to denote the initial situation, namely that situation in which no actions have yet occurred; there is a distinguished binary function symbol do where do(a, s) denotes
the successor situation to s resulting from performing the
action a; relations whose truth values vary from situation to
situation are called (relational) fluents, and are denoted by
predicate symbols taking a situation term as their last argument. There is a special predicate Poss(a, s) used to state that action a is executable in situation s. We assume that actions return binary sensing results, and we use the predicate SF(a, s) to characterize what the action tells the agent about its environment. For example, the axiom

SF(senseDoor(d), s) ≡ Open(d, s)

states that the action senseDoor(d) tells the agent whether the door is open in situation s. For actions with no useful sensing information, we write SF(a, s) ≡ True.
Within this language, we can formulate domain theories
which describe how the world changes as the result of the
available actions. Here, we use basic action theories [18] of
the following form:
• A set of foundational, domain independent axioms for
situations Σ as in [18].
• Axioms describing the initial situation, S0 .
• Action precondition axioms, one for each primitive action a, characterizing P oss(a, s).
• Successor state axioms for fluents of the form
F (~x, do(a, s)) ≡ γ(~x, a, s)
providing the usual solution to the frame problem.
• Sensed fluent axioms, as described above, of the form
SF (A(~x), s) ≡ φ(~x, s)
• Unique names axioms for the primitive actions.
To describe a run of a program which includes both actions and their sensing results, we use the notion of a history, i.e., a sequence of pairs (a, µ) where a is a primitive action and µ is 1 or 0, a sensing result. Intuitively,
the history σ = (a1 , µ1 ) · . . . · (an , µn ) is one where actions a1 , . . . , an happen starting in some initial situation,
and each action ai returns sensing value µi . We use end[σ]
to denote the situation term corresponding to the history σ,
and Sensed[σ] to denote the formula of the situation calculus stating all sensing results of the history σ. Formally,
end[ε] = S0, where ε is the empty history; and
end[σ · (a, µ)] = do(a, end[σ]).
Sensed[ε] = True;
Sensed[σ · (a, 1)] = Sensed[σ] ∧ SF(a, end[σ]);
Sensed[σ · (a, 0)] = Sensed[σ] ∧ ¬SF(a, end[σ]).
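To make this bookkeeping concrete, the following minimal Python sketch represents histories as lists of (action, result) pairs and computes end[σ] and Sensed[σ]; the term encoding and function names are illustrative assumptions, not part of the formalization.

```python
# A minimal sketch of histories and the end[.]/Sensed[.] bookkeeping.
# The term representation (tuples for do(a, s), strings for actions) is
# an illustrative choice, not part of the paper's formalization.

S0 = "S0"                       # the initial situation constant

def do(a, s):
    """The situation term do(a, s)."""
    return ("do", a, s)

def end(history):
    """end[sigma]: the situation term reached by the actions in the history."""
    s = S0
    for action, _result in history:
        s = do(action, s)
    return s

def sensed(history):
    """Sensed[sigma]: one SF literal per action, positive iff the result was 1."""
    literals, s = [], S0
    for action, result in history:
        literals.append(("SF", action, s, result == 1))
        s = do(action, s)
    return literals

# Example: the history (chop, 1).(look, 0)
sigma = [("chop", 1), ("look", 0)]
print(end(sigma))     # ('do', 'look', ('do', 'chop', 'S0'))
print(sensed(sigma))  # [('SF', 'chop', 'S0', True), ('SF', 'look', ..., False)]
```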
Next we turn to programs. We consider a very simple deterministic language with the following constructs:
a,                                  primitive action
δ1 ; δ2,                            sequence
if φ then δ1 else δ2 endIf,         conditional
while φ do δ endWhile,              while loop
This is a small subset of ConGolog [5] and we use its single-step transition semantics in the style of [16]. This semantics introduces two special predicates, Trans and Final: Trans(δ, s, δ′, s′) means that by executing program δ in situation s, one can get to situation s′ in one elementary step with the program δ′ remaining to be executed; Final(δ, s) means that program δ may successfully terminate in situation s.
Offline executions of programs, which are the kind of
executions originally proposed for Golog and ConGolog
[10, 5], are characterized using the Do(δ, s, s′ ) predicate,
which means that there is an execution of program δ that
starts in situation s and terminates in situation s′ . This holds
if there is a sequence of legal transitions from the initial configuration up to a final configuration:
Do(δ, s, s′) =def ∃δ′. Trans*(δ, s, δ′, s′) ∧ Final(δ′, s′),

where Trans* is the reflexive transitive closure of Trans. An offline execution of δ from s is a sequence of actions a1, . . . , an such that:

D ∪ C |= Do(δ, s, do(an, . . . , do(a1, s) . . .)),

where D is an action theory as mentioned above, and C is a set of axioms defining the predicates Trans and Final and the encoding of programs as first-order terms [5].
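As an illustration only, the sketch below gives one possible single-step interpreter for this deterministic subset, together with an offline Do computed by chaining transitions. It evaluates tests against a single given interpretation (holds), whereas the paper's Trans and Final are defined axiomatically by the axioms C; all names are assumptions of the sketch.

```python
# A toy single-step semantics (Trans/Final) for the deterministic subset and
# an offline Do obtained by chaining transitions; tests are evaluated against
# one fixed interpretation `holds`, a simplification of the axiomatic account.

def final(prog, s, holds):
    """Final(prog, s): may prog legally terminate in situation s?"""
    kind = prog[0]
    if kind == "nil":
        return True
    if kind == "seq":
        return final(prog[1], s, holds) and final(prog[2], s, holds)
    if kind == "if":
        _, test, thn, els = prog
        return final(thn if holds(test, s) else els, s, holds)
    if kind == "while":
        _, test, _body = prog
        return not holds(test, s)
    return False                      # a primitive action is never final

def trans(prog, s, holds, do, poss):
    """Trans(prog, s, prog', s'): the single legal step, or None if stuck."""
    kind = prog[0]
    if kind == "act":
        a = prog[1]
        return (("nil",), do(a, s)) if poss(a, s) else None
    if kind == "seq":
        _, p1, p2 = prog
        step = trans(p1, s, holds, do, poss)
        if step is not None:
            return ("seq", step[0], p2), step[1]
        return trans(p2, s, holds, do, poss) if final(p1, s, holds) else None
    if kind == "if":
        _, test, thn, els = prog
        return trans(thn if holds(test, s) else els, s, holds, do, poss)
    if kind == "while":
        _, test, body = prog
        if not holds(test, s):
            return None
        step = trans(body, s, holds, do, poss)
        return None if step is None else (("seq", step[0], prog), step[1])
    return None

def offline_do(prog, s, holds, do, poss):
    """Do(prog, s, s'): follow transitions to a final configuration (may loop
    forever on non-terminating programs); returns s', or None if stuck."""
    while not final(prog, s, holds):
        step = trans(prog, s, holds, do, poss)
        if step is None:
            return None
        prog, s = step
    return s
```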
Observe that an offline executor has no access to sensing results, available only at runtime. IndiGolog, an extension of ConGolog to deal with online executions with sensing, is proposed in [7]. The semantics defines an online execution of a program δ starting from a history σ. We say that
a configuration (δ, σ) may evolve to configuration (δ ′ , σ ′ )
w.r.t. a model M (relative to an underlying theory of action
D) iff¹ (i) M is a model of D ∪ C ∪ {Sensed[σ]}, (ii) D ∪ C ∪ {Sensed[σ]} |= Trans(δ, end[σ], δ′, end[σ′]), and (iii)

σ′ = σ · (a, 1)   if end[σ′] = do(a, end[σ]) and M |= SF(a, end[σ]);
σ′ = σ · (a, 0)   if end[σ′] = do(a, end[σ]) and M ⊭ SF(a, end[σ]);
σ′ = σ            if end[σ′] = end[σ].

¹ This definition is more general than the one in [7], where the sensing results were assumed to come from the actual environment rather than from a model (a model can represent any possible environment). Also, here we deal with non-terminating, i.e., infinite executions.
The model M above is only used to represent a possible environment and, hence, it is just used to generate the sensing results of the corresponding environment. Finally, we
say that a configuration (δ, σ) is final whenever
D ∪ C ∪ {Sensed[σ]} |= F inal(δ, end[σ]).
Using these two concepts of configuration evolution and final configurations, one can define various notions of online, incremental executions of programs as a sequence of legal configuration evolutions, possibly terminating in a final configuration.
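The division of labour in an online step, i.e., the theory decides which transition is known to be legal while a model of the environment supplies the sensing result, can be sketched as follows; `next_transition` and `sensing_outcome` are assumed oracles, not constructs from the paper.

```python
# A sketch of one step of online execution: the theory decides which
# transition is known to be legal, and a model of the environment supplies
# the sensing result that extends the history.

def online_step(prog, history, next_transition, sensing_outcome):
    """Return the successor configuration (prog', history'), or None if stuck.

    next_transition(prog, history): (prog', action) for the transition entailed
        by D ∪ C ∪ {Sensed[history]}, with action None for a test-only step;
        returns None when no transition is known to be possible.
    sensing_outcome(action, history): the result (1 or 0) returned by the
        environment model M for that action.
    """
    step = next_transition(prog, history)
    if step is None:
        return None
    prog2, action = step
    if action is None:                # test/termination step: history unchanged
        return prog2, history
    return prog2, history + [(action, sensing_outcome(action, history))]
```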
3. Deliberation: EC-based Account
Perhaps the first approach to come to mind for defining
when an agent knows how/is able to execute a deterministic
program δ in a history σ goes as follows: the agent must always know what the next action prescribed by the program
is and be able to perform it such that no matter what sensing output is obtained as a result of doing the action, she
can continue this process with what remains of the program
and, eventually, reach a configuration where she knows she
can legally terminate. We can formalize this idea as follows.
We say that a configuration (δ, σ) may evolve to configuration (δ′, σ′) w.r.t. a (background) theory D if and only if (δ, σ) may evolve to (δ′, σ′) w.r.t. some model M (relative to D). Note that we now have two notions of “configuration evolution,” one w.r.t. a particular model (cf. Section 2) and one w.r.t. a theory. Again, the model is used only to obtain the sensing values corresponding to some possible environment and not to determine the truth values of formulas.
An important point is that this alternative version of configuration evolution appeals to consistency in that it considers all evolutions of a configuration in which the sensing outcome (i.e., the environment response) is consistent with the underlying theory.
We define KHowEC (δ, σ) to be the smallest relation
R(δ, σ) such that:
(E1) if (δ, σ) is final, then R(δ, σ);
(E2) if (δ, σ) may evolve to configurations (δ ′ , σ · (a, µi ))
w.r.t. theory D with i = 1..k for some k ≥ 1, and
R(δ ′ , σ · (a, µi )) holds for all i = 1..k, then R(δ, σ).
The first condition states that every terminating configuration is in the relation KHowEC .
The second condition states that if a configuration performs an action transition and for every consistent sensing
result, the resulting configuration is in KHowEC , then this
configuration is also in KHowEC .
Note that, here, the agent’s lack of complete knowledge in a history σ is modeled by the theory D ∪ C ∪
{Sensed[σ]} being incomplete and having many different
models. KHowEC uses entailment to ensure that the information available is sufficient to determine which transition should be performed next. For instance, for a conditional program involving different primitive actions a1
and a2 in the “then” and “else” branches (i.e., such that
D |= a1 6= a2 ), the agent must know whether the test holds
and know how to execute the appropriate branch:
KHowEC (if φ then a1 else a2 endIf, σ) iff
D ∪ C ∪ {Sensed[σ]} |= φ(end[σ]) and KHowEC (a1 , σ)
or
D ∪ C ∪ {Sensed[σ]} |= ¬φ(end[σ]) and KHowEC (a2 , σ).
KHowEC uses consistency to determine which sensing results can occur, for each of which the agent needs to have a subplan that leads to a final configuration. Due to this, we say that KHowEC is an entailment and consistency-based (EC-based) account of knowing how.
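Read operationally, the definition above uses entailment to pick the next known transition and consistency to enumerate the sensing results that the agent must be prepared for. The sketch below is a bounded unfolding of (E1) and (E2), with `is_final` and `consistent_successors` as assumed oracles; the explicit bound anticipates the limitation shown in Section 4.

```python
# A sketch of the EC-based check read operationally: entailment determines
# the next known transition, consistency enumerates the sensing results that
# must all be covered.  `is_final` and `consistent_successors` are assumed
# oracles over D ∪ C ∪ {Sensed[sigma]}; the depth bound reflects that the
# least fixpoint only captures configurations solvable within some bound.

def khow_ec(prog, history, is_final, consistent_successors, bound):
    """Approximate KHowEC(prog, history) by unfolding (E1)/(E2) up to `bound`."""
    if is_final(prog, history):                       # condition (E1)
        return True
    if bound == 0:
        return False
    # consistent_successors returns the configurations (prog', history + [(a, mu)])
    # for the known next transition, one per sensing result mu that is
    # consistent with the theory; empty means no transition is known possible.
    successors = consistent_successors(prog, history)
    if not successors:
        return False
    return all(khow_ec(p2, h2, is_final, consistent_successors, bound - 1)
               for (p2, h2) in successors)            # condition (E2)
```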
This EC-based account of knowing how seems quite intuitive and attractive. However it has a fundamental limitation: it fails on programs involving indefinite iteration. The
following simple example from [9] shows the problem.
Consider a situation in which an agent wants to cut down
a tree. Assume that the agent has a primitive action chop to
chop at the tree, and also assume that she can always find
out whether the tree is down by doing the (binary) sensing action look. If the sensing result is 1, then the tree is
down; otherwise the tree remains up. There is also a fluent RemainingChops(s), which we assume ranges over
the natural numbers N and whose value is unknown to the
agent, and which is meant to represent how many chop actions are still required in s to bring the tree down. The
agent's goal is to bring the tree down, i.e., to bring about a situation s such that Down(s) holds, where

Down(s) =def RemainingChops(s) = 0.
The action theory Dtc is the union of:
1. The foundational axioms for situations Σ.
2. Duna = {chop ≠ look}.
3. Dss contains the following successor state axiom:
   RemainingChops(do(a, s)) = n ≡
     (a = chop ∧ RemainingChops(s) = n + 1) ∨
     (a ≠ chop ∧ RemainingChops(s) = n).
4. Dap contains the following two precondition axioms:
   Poss(chop, s) ≡ RemainingChops(s) > 0,
   Poss(look, s) ≡ True.
5. DS0 = {RemainingChops(S0) ≠ 0}.
6. Dsf contains the following two sensing axioms:
   SF(chop, s) ≡ True,
   SF(look, s) ≡ RemainingChops(s) = 0.
Notice that the sentence ∃n.RemainingChops(S0) = n (where the variable n ranges over N) is entailed by this theory, so "infinitely" hard tree trunks are ruled out. Nonetheless, the theory does not entail the sentence RemainingChops(S0) < k for any constant k ∈ N. Hence, in every model there is some n ∈ N, unknown to the agent and not bounded in advance, such that the tree will fall after n chops. Because of this, intuitively, we should have that the agent can bring the tree down, since if the agent keeps chopping, the tree will eventually come down, and the agent can
find out whether it has come down by looking. Thus, for
the program
δtc = while ¬Down do chop; look endWhile
we should have that KHowEC(δtc, ε) holds (note that δtc is deterministic). However, this is not the case:

Theorem 3.1 Let δtc be the above program to bring the tree down. Then, for all k ∈ N, KHowEC(δtc, [(chop, 1) · (look, 0)]^k) does not hold. In particular, when k = 0, KHowEC(δtc, ε) does not hold.
Thus, the simple EC-based formalization of knowing how gives the wrong result for this example. Why
is this so? Intuitively, it is easy to check that if the
agent knows how (to execute) the initial configuration, i.e., KHowEC(δtc, ε) holds, then she knows how (to execute) every possible finite evolution of it, i.e., for all j ∈ N, KHowEC(δtc, [(chop, 1) · (look, 0)]^j) and KHowEC((look; δtc), [(chop, 1) · (look, 0)]^j · (chop, 1)).
Now consider the hypothetical scenario in which an
agent keeps chopping and looking forever, always seeing that the tree is not down. There is no model of D tc
where δtc yields this scenario, as the tree is guaranteed to come down after a finite number of chops. However,
by the above, we see that KHowEC is, in some way, taking this case into account in determining whether the agent
knows how to execute δtc . This happens because every finite prefix of this never-ending execution is indeed consistent with D tc . The problem is that the set of all of them
together is not. This is why KHowEC fails. In the next section, we show that KHowEC ’s failure on the tree chopping example is due to a general limitation of the KHowEC
formalization. Note that Moore’s original account of ability [14] is closely related to KHowEC and also fails on the
tree chopping example [9].
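To see concretely why fixing the environment matters, the following illustrative sketch simulates the online execution of δtc against a single model of Dtc, i.e., a single hidden value of RemainingChops(S0); in every such model the loop terminates after finitely many iterations, even though no bound on that value is entailed by the theory.

```python
# A concrete illustration of the tree-chopping domain: each environment model
# of D_tc fixes one hidden value n = RemainingChops(S0) >= 1, and the online
# run of delta_tc against that model always terminates after n iterations.
# Illustrative code only.

def run_tree_chopping(n):
    """Online run of 'while not Down do chop; look endWhile' in one model."""
    assert n >= 1                     # D_S0 rules out RemainingChops(S0) = 0
    remaining, history = n, []
    down_sensed = False
    while not down_sensed:            # the loop test is settled by the last look
        remaining -= 1                # chop (Poss holds: remaining was > 0)
        history.append(("chop", 1))   # SF(chop, s) ≡ True: result is always 1
        result = 1 if remaining == 0 else 0
        history.append(("look", result))   # SF(look, s) ≡ RemainingChops(s) = 0
        down_sensed = (result == 1)
    return history

print(run_tree_chopping(3))
# [('chop', 1), ('look', 0), ('chop', 1), ('look', 0), ('chop', 1), ('look', 1)]
```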
4. KHowEC Only Handles Bounded Problems
In this section, we show that whenever KHowEC(δ, σ) holds for some program δ and history σ, there is a simple kind of conditional plan, which we call a TREE program, that can be followed to execute δ in σ. Since for TREE
that can be followed to execute δ in σ. Since for TREE
programs (and conditional plans), the number of steps they
perform can be bounded in advance (there are no loops), it
follows that KHowEC will never be satisfied for programs
whose execution cannot be bounded in advance. Since there
are many such programs (for instance, the one for the tree
chopping example), it follows that KHowEC is fundamentally limited as a formalization of knowing how and can
only be used in contexts where attention can be restricted to
bounded strategies. As in [6], we define the class of (sense-branch) tree programs TREE with the following BNF rule:

dpt ::= nil | a; dpt1 | senseφ; if φ then dpt1 else dpt2

where a is any non-sensing action, and dpt1 and dpt2 are tree programs.
This class includes conditional programs where one can
only test a condition that has just been sensed. Thus as
shown in [6], whenever a TREE program is executable,
it is also epistemically feasible, i.e., the agent can execute
it without ever getting stuck not knowing what transition to
perform next. TREE programs are clearly deterministic.
Let us define a relation KHowByEC : Program × History × TREE. The relation is intended to associate a program δ and history σ for which KHowEC holds with some TREE program(s) that can be used as a strategy for successfully executing δ in σ.
We define KHowByEC(δ, σ, δ^tp) to be the least relation R(δ, σ, δ^tp) such that:
(A) if (δ, σ) is final, then R(δ, σ, nil);
(B) if (δ, σ) may evolve to configurations (δ′, σ · (a, µi)) w.r.t. theory D, 1 ≤ i ≤ 2, and there exist δ_i^tp′ such that R(δ′, σ · (a, µi), δ_i^tp′), then R(δ, σ, (a; if φ then δ_1^tp else δ_2^tp endIf)), where φ is the condition on the right-hand side of the sensed fluent axiom for a, and δ_i^tp is nil if D ∪ C ∪ {Sensed[σ · (a, µi)]} is inconsistent, and δ_i^tp is δ_i^tp′ otherwise.
It is possible to show that whenever KHowByEC(δ, σ, δ^tp) holds, then KHowEC(δ, σ) and KHowEC(δ^tp, σ) hold, and the TREE program δ^tp is guaranteed to terminate in a Final situation of the given program δ (in all models).
Theorem 4.1 For all programs δ, histories σ, and programs δ^tp, if KHowByEC(δ, σ, δ^tp) then we have that
• KHowEC(δ, σ) and KHowEC(δ^tp, σ) hold; and
• D ∪ C ∪ {Sensed[σ]} |= ∃s.Do(δ^tp, end[σ], s) ∧ Do(δ, end[σ], s).
In addition, every configuration captured in KHowEC
can be executed using a TREE program.
Theorem 4.2 For all programs δ and histories σ, if
KHowEC (δ, σ), then there exists a program δ tp such that
KHowByEC (δ, σ, δ tp ).
Since the number of steps a TREE program performs can be bounded in advance, it follows that KHowEC will never hold for programs/problems that are solvable, but whose executions require a number of steps that cannot be bounded in advance, as is the case with the program in the tree chopping example. Thus KHowEC is severely restricted as an account of knowing how; it can only be complete when all possible strategies are bounded.

5. Deliberation: ET-based Account

We saw in Section 3 that the reason KHowEC failed on the tree chopping example was that it required the agent to have a choice of action that guaranteed reaching a final configuration even for histories that were inconsistent with the domain specification, such as the infinite history corresponding to the hypothetical scenario described at the end of Section 3. There was a branch in the configuration tree that corresponded to that history. This occurred because “local consistency” was used to construct the configuration tree. The consistency check kept switching which model of D ∪ C (which may be thought of as representing the environment) was used to generate the next sensing result, forever postponing the observation that the tree has come down. But in the real world, sensing results come from a fixed environment (even if we don't know which environment this is). It seems reasonable that we could correct the problem by fixing the model of D ∪ C used in generating possible configurations in our formalization of knowing how. This is what we will now do.

We define when an agent knows how to execute a program δ in a history σ and a model M (which represents the environment), KHowInM(δ, σ, M), as the smallest relation R(δ, σ) such that:

(T1) if (δ, σ) is final, then R(δ, σ);
(T2) if (δ, σ) may evolve to (δ′, σ · (a, µ)) w.r.t. M and R(δ′, σ · (a, µ)), then R(δ, σ).

The only difference between this and KHowEC is that the sensing results come from the fixed model M. Given this, we obtain the following formalization of when an agent knows how to execute a program δ in a history σ:

KHowET(δ, σ) iff for every model M such that M |= D ∪ C ∪ {Sensed[σ]}, KHowInM(δ, σ, M).

We call this type of formalization entailment and truth-based, since it uses entailment to ensure that the agent knows what transitions she can do, and truth in a model to obtain possible sensing results.

We claim that KHowET is actually correct for programs δ that are deterministic. For instance, it handles the tree chopping example correctly:

Proposition 5.1 KHowET(δtc, ε) holds w.r.t. theory Dtc.

Furthermore, KHowET is strictly more general than KHowEC. Formally,

Theorem 5.2 For any background theory D and any configuration (δ, σ), if KHowEC(δ, σ) holds, then KHowET(δ, σ). Moreover, there is a background theory D* and a configuration (δ*, σ*) such that KHowET(δ*, σ*) holds, but KHowEC(δ*, σ*) does not.

6. Ability and Planning

Using our two notions of knowing how, we can define related notions of ability to achieve a goal [9, 14, 3]. For both KHowEC and KHowET, we formally specify when an agent can achieve a goal φ in a history σ: CanX(φ, σ) (where X = EC or X = ET) iff there exists a deterministic program δ^d such that KHowX(δ^d, σ) and

D ∪ C ∪ {Sensed[σ]} |= ∃s.Do(δ^d, end[σ], s) ∧ φ(s),
i.e., the agent knows how to execute the program and the
program (always) terminates in a situation where the goal
has been achieved. For the tree chopping problem we have
that CanET (Down, S0 ) holds, but CanEC (Down, S0 )
does not. Based on Theorem 5.2, we can also show
that CanET is a strictly more general account of ability than CanEC .
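To make the contrast with the EC-based account concrete, here is a rough operational sketch of the notions above: KHowInM follows the unique known transition using sensing results from one fixed model, KHowET requires success in every model of the theory, and CanET looks for one deterministic strategy that works. The `models` collection and the `is_final`, `known_transition`, `goal_holds_after`, and `candidate_strategies` oracles are assumptions of this illustration, not constructs from the paper; a run that never terminates in some model simply means the check does not succeed.

```python
# A rough operational sketch of the ET-based notions KHowInM, KHowET, Can_ET.

def khow_in_m(prog, history, model, is_final, known_transition):
    """(T1)/(T2): follow the known transitions, with sensing taken from `model`."""
    while not is_final(prog, history):
        step = known_transition(prog, history)     # entailed by D ∪ C ∪ {Sensed}
        if step is None:
            return False                           # the agent is stuck
        prog, action = step
        if action is not None:                     # an action step: ask the model
            history = history + [(action, model.sensing_result(action, history))]
    return True

def khow_et(prog, history, models, is_final, known_transition):
    """KHowET: the per-model check must succeed for every model of the theory."""
    return all(khow_in_m(prog, history, m, is_final, known_transition)
               for m in models)

def can_et(goal_holds_after, candidate_strategies, history, models,
           is_final, known_transition):
    """Can_ET(phi, sigma): some deterministic strategy is known-how executable
    and is entailed to end in a situation where the goal holds."""
    return any(khow_et(d, history, models, is_final, known_transition)
               and goal_holds_after(d, history)
               for d in candidate_strategies)
```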
One can easily use this to define a new construct
achieve(φ) that does planning to achieve a goal and
add it to an agent programming language. But generally, one also wants to be able to specify constraints on the
search for a plan to achieve the goal, constraints on what
sort of plan should be considered. One way to do this is to
specify the task as that of executing a nondeterministic program, which we now turn to.
7. Nondeterministic Programs
When it comes to nondeterministic programs, the notion
of knowing how needs to be extended. In particular there
are two main notions of nondeterministic program execution of interest for agents. The first notion states that choices
at program choice points are under the control of the agent,
and hence can involve reasoning on the part of the agent. In
this case, an agent knows how to execute a nondeterministic program if she is able to make choices along the execution of the program so that, at each step, the action chosen is
known to be executable and such that no matter what sensing output is obtained as a result of doing the action, she can
continue this process and eventually terminate successfully.
This assumes that the agent does some planning/lookahead
to find a strategy for executing the nondeterministic program such that this strategy is guaranteed to succeed.
The second notion of nondeterministic program execution states that choices at its choice points are not under
the control of the agent. Hence, no reasoning, planning or
lookahead, is involved in choosing the next step, and all
choices must be accepted and dealt with by the agent. In
this case, the agent knows how to execute a nondeterministic program if every possible way she may execute it would
eventually terminate successfully.
Suppose that we enlarge our programming language with
the following nondeterministic constructs:
φ?,                 wait for a condition
δ1 | δ2,            nondeterministic branch
π x. δ(x),          nondeterministic choice of argument
δ*,                 nondeterministic iteration
δ1 ‖ δ2,            (interleaved) concurrency
The first (wait/test) construct blocks until its condition becomes true; it is still a deterministic construct and produces transitions involving no action. Such a construct is useful with nondeterministic programs and can be easily accommodated into the already given definitions of KHowEC and KHowET by just adding one extra condition to their corresponding definitions:
(E3) if (δ, σ) may evolve to (δ ′ , σ) w.r.t. theory D and
R(δ ′ , σ), then R(δ, σ) (for KHowEC )
(T3) if (δ, σ) may evolve to (δ ′ , σ) w.r.t. M and R(δ ′ , σ),
then R(δ, σ) (for KHowET )
Let us first focus on knowing how for nondeterministic programs where choices are under the control of the agent, i.e., angelic knowing how. We can define EC/ET versions of this notion. We can take KHow^Ang_ECnd to be just KHowEC with (E3) included. This works just fine for nondeterministic programs, though it still fails to capture (good) programs with an unbounded number of steps. On the other hand, KHowET, as defined and with (T3) included, is too weak. Consider the following example. There is a treasure behind one of two doors but the agent does not know which. We want to know if she knows how to execute the program δtreas:

[(open1; look) | (open2; look)]; AtTreasure?
Intuitively, the agent does not know how to execute δtreas
because she does not know which door to open to get
to the treasure. However, KHowET(δtreas, ε) holds. Indeed, in a model M1 where the treasure is behind door 1, the agent can pick the open1 action, and then we have KHowInM((look; AtTreasure?), [(open1, 1)], M1), and thus KHowInM(δtreas, ε, M1). Similarly, in a model M2 where the treasure is behind door 2, she can pick open2, and thus KHowInM(δtreas, ε, M2).
The problem with KHowET for nondeterministic programs is that the action chosen need not be the same in different models even if they have generated the same sensing results up to that point and are indistinguishable for the
agent. We can solve this problem by requiring that the agent
have a common strategy for all models/environments, i.e.,
that she has a deterministic program δ d that she knows how
to execute (in all models of the theory) and knows δ d will
terminate in a final situation of the given program δ:
KHow^Ang_ETnd(δ, σ) iff there is a deterministic program δ^d such that KHowET(δ^d, σ) and

D ∪ C ∪ {Sensed[σ]} |= ∃s.Do(δ^d, end[σ], s) ∧ Do(δ, end[σ], s).
We do not think it is possible to obtain a much simpler general formalization of knowing how that avoids the existential quantification over deterministic programs/strategies.
Proposition 7.1 For any background theory D and any configuration (δ, σ), if KHow^Ang_ECnd(δ, σ) holds, then KHow^Ang_ETnd(δ, σ). Also, there is a background theory D* and a configuration (δ*, σ*) such that KHow^Ang_ETnd(δ*, σ*) holds, but KHow^Ang_ECnd(δ*, σ*) does not.
An interesting point is that, now, ability to achieve a goal,
as defined in the previous section, can be seen as a special
case of knowing how to execute a (nondeterministic) program. Indeed, we (re)define CanX(φ, σ) as follows:

CanX(φ, σ) iff KHow^Ang_Xnd(while ¬φ do (πa.a) endWhile, σ),
i.e., the agent knows how to execute the program that involves repeatedly choosing and executing some action until
the goal has been achieved.
Next, we turn our attention to knowing how for nondeterministic programs where choices are not under the control of the agent, i.e., demonic knowing how. We start by
defining a relation between a program δa and another program δb that simulates it:
Simulates(δa, δb, s) =def ∃R.(∀δ1, δ2, s. R(δ1, δ2, s) ⊃ ...) ∧ R(δa, δb, s),

where the ellipsis stands for the conjunction of:

Final(δ1, s) ⊃ Final(δ2, s),
¬Final(δ1, s) ∧ ∀δ1′, s′. Trans(δ1, s, δ1′, s′) ⊃ ∃δ2′. Trans(δ2, s, δ2′, s′) ∧ R(δ1′, δ2′, s′).
For instance, if δ = ((a; b) | (a; c)) and δ ′ = (a; (b |
c)), then Simulates(δ, δ ′, s) holds, but Simulates(δ ′, δ, s)
does not (provided all actions are possible from situation s).
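For finite transition systems, the largest simulation relation can be computed by starting from all pairs and repeatedly discarding pairs that violate either condition. The following sketch does this, with step labels standing in for the situation reached; the encoding of configurations is an illustrative assumption, not the paper's second-order definition. It is followed by the example above.

```python
# Computing the largest simulation between two finite transition systems by
# iterated pruning; configurations and step labels (standing in for the
# situation reached) are an illustrative encoding.

def simulates(states_a, states_b, final_a, final_b, trans_a, trans_b):
    """Return the largest simulation R: every pair (a, b) in R satisfies
    (1) if a is final then b is final, and (2) every labelled step of a
    can be matched by an equally labelled step of b into a pair still in R."""
    R = {(a, b) for a in states_a for b in states_b}
    changed = True
    while changed:
        changed = False
        for (a, b) in sorted(R):
            ok_final = (a not in final_a) or (b in final_b)
            ok_step = all(any(lab2 == lab and (a2, b2) in R
                              for (lab2, b2) in trans_b.get(b, set()))
                          for (lab, a2) in trans_a.get(a, set()))
            if not (ok_final and ok_step):
                R.discard((a, b))
                changed = True
    return R

# The example from the text: (a;b)|(a;c) is simulated by a;(b|c), not vice versa.
A, B = {"(a;b)|(a;c)", "b", "c", "nil"}, {"a;(b|c)", "b|c", "nil"}
tA = {"(a;b)|(a;c)": {("a", "b"), ("a", "c")},
      "b": {("b", "nil")}, "c": {("c", "nil")}}
tB = {"a;(b|c)": {("a", "b|c")}, "b|c": {("b", "nil"), ("c", "nil")}}
print(("(a;b)|(a;c)", "a;(b|c)") in simulates(A, B, {"nil"}, {"nil"}, tA, tB))  # True
print(("a;(b|c)", "(a;b)|(a;c)") in simulates(B, A, {"nil"}, {"nil"}, tB, tA))  # False
```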
We say that an agent knows how to execute a nondeterministic program in the demonic sense whenever she knows how to execute every possible deterministic simulation of it:

KHow^Dem_ETnd(δ, σ) iff KHowET(δ^d, σ) holds for every deterministic program δ^d such that
D ∪ C ∪ {Sensed[σ]} |= Simulates(δ^d, δ, end[σ]).

This account of knowing how can be seen as a specification of nondeterministic programs when choices at choice points are not under the control of the agent. This is relevant for agent programming languages in which nondeterministic programs are used to represent the agent behavior but an online/reactive account of execution of these programs is used (3APL [8], AgentSpeak(L) [17], etc.). In those frameworks, the nondeterministic program must work no matter how choice points are resolved.
8. Deliberation in IndiGolog
We can use our formalization of knowing how to provide a better semantics for search/deliberation in agent programming languages. Let’s show how this is done for IndiGolog [7]. In IndiGolog, the programmer controls when
deliberation occurs. By default, there is no deliberation or
lookahead; the interpreter arbitrarily selects a transition and
performs it on-line, until it gets to a final configuration. To
perform deliberation/lookahead, the programmer uses the
search operator Σ(δ), where δ is the part of the program that
needs to be deliberated over. The idea is that this Σ only allows a transition for δ if there exists a sequence of further
transitions that would allow δ to terminate successfully. The
original definition for the search operator in [7] failed to ensure that the plan was epistemically feasible, i.e., allowed
cases where a sequence of transitions must exist, but where
the agent cannot determine what the sequence is; our proposal here corrects this.
We can specify the semantics of this operator by extending our earlier notions of configuration evolution and finalness, used in defining online executions. A configuration (Σ(δ), σ) is final if and only if the configuration (δ, σ) is final. A configuration (Σ(δ), σ) may evolve to a configuration (δ′, σ′) w.r.t. a model M if and only if there exists a deterministic program δ^d such that:

• KHow^Ang_ET(δ^d, σ) holds;
• D ∪ C ∪ {Sensed[σ]} |= ∃s.Do(δ^d, end[σ], s) ∧ Do(δ, end[σ], s) (i.e., δ^d must terminate in a Final situation of the given program δ);
• (δ^d, σ) may evolve to (δ^d′, σ′) w.r.t. M;
• the program that remains afterwards is δ′ = Σ(δ^d′).

This semantics is metatheoretic and provides a rather simpler alternative to that proposed in [6], which is based on an epistemic version of the situation calculus.
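Operationally, a transition of Σ(δ) can be pictured as first finding a deterministic strategy that the agent knows how to execute and that realizes δ, and then performing that strategy's next step, keeping the rest of the strategy under the search operator. The sketch below assumes oracles `find_strategy` and `step_in_model` for these two tasks; it is an illustration of the definition above, not an implementation of IndiGolog.

```python
# A sketch of the refined search operator: a transition of Sigma(delta) is
# allowed only if some deterministic strategy delta_d is known-how executable
# (the ET-based check) and realizes delta; the step actually performed is the
# strategy's step, and the strategy is kept inside the remaining program.

def search_step(delta, history, model, find_strategy, step_in_model):
    """
    find_strategy(delta, history) -> a deterministic program delta_d that the
        agent knows how to execute and that terminates in a Final situation of
        delta (or None if no such strategy exists).
    step_in_model(delta_d, history, model) -> (delta_d', history') as in the
        online semantics of Section 2.
    Returns the next configuration of Sigma(delta), or None if deliberation fails.
    """
    delta_d = find_strategy(delta, history)
    if delta_d is None:
        return None
    step = step_in_model(delta_d, history, model)
    if step is None:
        return None
    delta_d_next, history_next = step
    return ("Sigma", delta_d_next), history_next
```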
9. Discussion and Conclusion

In this paper, we have looked at how to formalize when
an agent knows how to execute a program, which in the
general case, when the program is nondeterministic and the
agent does lookahead and reasons about possible execution strategies, subsumes ability to achieve a goal. First, we
have shown that an intuitively reasonable entailment and
consistency-based approach to formalizing knowing how,
KHowEC , fails on examples like our tree chopping case and
that, in fact, KHowEC can only handle problems that can be
solved in a bounded number of steps, i.e. without indefinite
iteration. Then, we developed an alternative entailment and
truth-based formalization, KHowET , that handles indefinite
iteration examples correctly. Finally, we proposed accounts
of ability and knowing how for nondeterministic programs.
The problems of accounts like KHowEC when they are formalized in epistemic logic, such as Moore’s [14], had been
pointed out before, for instance in [9]. However, the reasons for the problems were not well understood. The results we have presented clarify the source of the problems
and show what is needed for their solution. A simple metatheoretic approach to knowing how fails; one needs to take
entailment and truth into account together. (Even if we use
a more powerful logical language with a knowledge operator, knowledge and truth must be considered together.)
Our non-epistemic accounts of knowing how are easily related to models of agent programming language semantics and our results have important implications for this
area. While most work on agent programming languages
(e.g. 3APL [8], AgentSpeak(L) [17], etc.) has focused on
reactive execution, sensing is acknowledged to be important and there has been interest in providing mechanisms
for run-time planning/deliberation. The semantics of such
languages are usually specified as a transition system. For
instance in 3APL, configurations are pairs involving a program and a belief base, and a transition relation over such
pairs is defined by a set of rules. Evaluating program tests
is done by checking whether they are entailed by the belief base. Checking action preconditions is done by querying the agent’s belief base update relation, which would typically involve determining entailments over the belief base
— the 3APL semantics abstracts over the details of this.
Sensing is not dealt with explicitly, although one can suppose that it could be handled by simply updating the belief
base (AgentSpeak(L) has events for this kind of thing).
As mentioned, most work in the area only deals with
on-line reactive execution, where no deliberation/lookahead
is performed; this type of execution just involves repeatedly selecting some transition allowed in the current configuration and performing it. However, one natural view is
that deliberation can simply be taken as a different control
regime involving search over the agent program’s transition tree. In this view, a deliberating interpreter could first
lookahead and search the program’s transition tree to find a
sequence of transitions that leads to successful termination
and later execute this sequence. This assumes that the agent
can choose among all alternative transitions. Clearly, in the
presence of sensing, this idea needs to be refined. One must
find more than just a path to a final configuration in the transition tree; one needs to find some sort of conditional plan or
subtree where the agent has chosen some transition among
those allowed, but must have branches for all possible sensing results. The natural way of determining which sensing
results are possible is checking their consistency with the
current belief base. Thus, what is considered here is essentially an EC-based approach.
Also in work on planning under incomplete information,
e.g. [2, 15, 4], a similar sort of setting is typically used, and
finding a plan involves searching a (finite) space of knowledge states that are compatible with the planner’s knowledge. The underlying models of all these planners are meant
to represent only the current possible states of the environment, which, in turn, are updated upon the hypothetical execution of an action at planning time. We use models that are
dynamic in the sense that they represent the potential responses of the environment for any future state. In that way,
then, what the above planners are doing is deliberation in
the style of KHowEC .
Our results show that this view of deliberation is fundamentally flawed when sensing is present. It produces an account that only handles problems that can be solved in a
bounded number of actions. As an approach to implementing deliberation, this may be perfectly fine. But as a semantics or specification, it is wrong. What is required is a much
different kind of account, like our ET-based one.
One might argue that results concerning the indistinguishability of unbounded nondeterminism [13, 1] (e.g., a*b being observationally indistinguishable from a*b + a^ω) are
a problem for our approach, but this is not the case because
we are assuming that agents can reason about all possible
program executions/futures.
Finally, we believe that there is a close relationship between KHowETnd and some of the earlier epistemic accounts of knowing how and ability [14, 3, 9, 11, 6]. We hope to get
some correspondence results on this soon.
References
[1] K. Apt and E. Olderog. Verifi cation of Sequential and Concurrent Programs. Springer-Verlag, 1997.
[2] P. Bertoli, A. Cimatti, M. Roveri, and P. Traverso. Planning
in nondeterministic domains under partial observability via
symbolic model checking. In Proc. of IJCAI-01, pages 473–
478, 2001.
[3] E. Davis. Knowledge preconditions for plans. Journal of
Logic and Computation, 4(5):721–766, 1994.
[4] G. De Giaccomo, L. Iocchi, D. Nardi, and R. Rosati. Planning with sensing for a mobile robot. In Proc, of ECP-97,
pages 156–168, 1997.
[5] G. De Giacomo, Y. Lespérance, and H. J. Levesque. ConGolog, a concurrent programming language based on the situation calculus. Artifi cial Intelligence, 121:109–169, 2000.
[6] G. De Giacomo, Y. Lespérance, H. J. Levesque, and
S. Sardiña. On the semantics of deliberation in IndiGolog:
From theory to implementation. In Proc. of KR-02, pages
603–614, 2002.
[7] G. De Giacomo and H. J. Levesque. An incremental interpreter for high-level programs with sensing. In H. J.
Levesque and F. Pirri, editors, Logical Foundations for Cognitive Agents, pages 86–102. Springer-Verlag, 1999.
[8] K. V. Hindriks, F. S. de Boer, W. van der Hoek, and J. J. C.
Meyer. A formal semantics for an abstract agent programming language. In Proc. of ATAL-97, pages 215–229, 1998.
[9] Y. Lespérance, H. J. Levesque, F. Lin, and R. B. Scherl. Ability and knowing how in the situation calculus. Studia Logica, 66(1):165–186, 2000.
[10] H. Levesque, R. Reiter, Y. Lesperance, F. Lin, and R. Scherl.
GOLOG: A logic programming language for dynamic domains. Journal of Logic Programming, 31:59–84, 1997.
[11] F. Lin and H. J. Levesque. What robots can do: Robot
programs and effective achievability. Artifi cial Intelligence,
101:201–226, 1998.
[12] J. McCarthy and P. Hayes. Some philosophical problems
from the standpoint of artifi cial intellig ence. In B. Meltzer
and D. Michie, editors, Machine Intelligence, volume 4,
pages 463–502. Edinburgh University Press, 1979.
[13] R. Milner. Communication and Concurrency. Prentice Hall,
1989.
[14] R. C. Moore. A formal theory of knowledge and action. In
J. R. Hobbs and R. C. Moore, editors, Formal Theories of the
Common Sense World, pages 319–358. 1985.
[15] R. Petrick and F. Bacchus. A knowledge-based approach to
planning with incomplete information and sensing. In Proc.
of AIPS-02, pages 212–221, 2002.
[16] G. Plotkin. A structural approach to operational semantics.
Technical Report DAIMI-FN-19, Computer Science Dept.,
Aarhus University, Denmark, 1981.
[17] A. S. Rao. AgentSpeak(L): BDI agents speak out in a logica computable language. In W. V. Velde and J. W. Perram,
editors, Agents Breaking Away (LNAI), volume 1038, pages
42–55. Springer-Verlag, 1996.
[18] R. Reiter. Knowledge in Action: Logical Foundations for
Specifying and Implementing Dynamical Systems. MIT
Press, 2001.
A. PROOFS
PROOF OF THEOREM 3.1:
Let R be the smallest relation defining KHowEC. Then, R is the smallest relation satisfying conditions (E1)-(E3). Assume, to the contrary, that R(δtc, [chop, (look, 0)]^m) holds for some m ≥ 0. By Lemma A.1, there exists a binary relation R′ ⊆ R such that R′(δtc, [chop, (look, 0)]^k) does not hold for any k ≥ 0 and such that R′ satisfies (E1)-(E3). This means that R′(δtc, [chop, (look, 0)]^m) does not hold and, therefore, R′ ⊂ R (i.e., R′ is a proper subset of R). Therefore, R is not the smallest relation satisfying (E1)-(E3), which contradicts the initial assumption.
Then, ¬R(δtc, [chop, (look, 0)]^k) for all k ≥ 0, and the relation KHowEC does not contain the pair (δtc, [chop, (look, 0)]^k). In particular, (δtc, ε) is not in the KHowEC relation when we take k = 0.

Lemma A.1
Let δtc be the program to bring the tree down. This lemma will prove a more general version of the result by showing that KHow^Ang_ECnd(δtc, σ) does not hold for the relevant histories; it is easy to see that KHowEC is always a subset of KHow^Ang_ECnd.
Let R be a set of program-history pairs (δ, σ) satisfying conditions (E1)-(E3) of the definition of KHow^Ang_ECnd. Then, there exists a set R′ ⊆ R such that R′ satisfies conditions (E1), (E2), and (E3), and such that R′(δtc, [chop, (look, 0)]^k) does not hold for any k ≥ 0.
PROOF:
Intuitively, we will prove that we can "safely" remove all program-history pairs of the form (δtc, [chop, (look, 0)]^k) from the set R. To do that, let us define R′ = R − β, where:

β = {(δtc, σ) : σ = [chop, (look, 0)]^k ∧ k ≥ 0} ∪ {((look; δtc), σ) : σ = [chop, (look, 0)]^k · chop ∧ k ≥ 0}.

Clearly, R′ ⊆ R and ¬R′(δtc, ε). It remains to show that relation R′ does indeed satisfy conditions (E1), (E2), and (E3).

The set R′ satisfies (E1)
This is trivial since all configurations for which β holds are not final. Concretely, if (δ, σ) is final, then R(δ, σ) and ¬β(δ, σ). Therefore, R′(δ, σ).

The set R′ satisfies (E2)
Let δ be any program and σ be any history. Let us consider the following three exhaustive and exclusive cases depending on the form of the history σ:
1. σ ≠ [chop, (look, 0)]^k and σ ≠ [chop, (look, 0)]^k · chop.
   First, note that, because of the form of σ, ¬β(δ, σ) is true. Assume then that there exist configurations (δi, σ · (a, µi)), i ≥ 1, such that (δ, σ) may evolve to them w.r.t. theory Dtc and such that R′(δi, σ · (a, µi)) holds for all i. Since R′ ⊆ R, R(δi, σ · (a, µi)) holds for all i, and, given that R satisfies condition (E2), R(δ, σ) is true. Due to the fact that ¬β(δ, σ), R′(δ, σ) holds as well.
2. σ = [chop, (look, 0)]^k, for some k ≥ 0.
   Assume that there exist configurations (δi, σ · (a, µi)), i ≥ 1, such that (δ, σ) may evolve to them w.r.t. theory Dtc. Let us now consider the following two exhaustive and exclusive cases:
   (a) δ ≠ δtc, i.e., δ ≠ while ¬Down do chop; look endWhile.
       Suppose that R′(δi, σ · (a, µi)) holds for all i. Hence, R(δi, σ · (a, µi)) holds because R′ ⊆ R, and R(δ, σ) holds since R satisfies (E2). Moreover, because of the form of both δ and σ, it is easy to check that ¬β(δ, σ). Thus, R′(δ, σ) holds.
   (b) δ = δtc, i.e., δ = while ¬Down do chop; look endWhile.
       In this case, there can only be one possible configuration (δ′, σ′) to which (δ, σ) may evolve w.r.t. theory D, namely, δ′ = (look; δtc) and σ′ = σ · chop. Since σ′ = [chop, (look, 0)]^k · chop, β(δ′, σ′) holds, and, therefore, ¬R′(δ′, σ′) applies. Thus, condition (E2) is trivially satisfied by R′.
3. σ = [chop, (look, 0)]^k · chop, for some k ≥ 0.
   Here, let us consider the following two exhaustive and exclusive cases depending on the form of the program δ:
   (a) δ ≠ (look; δtc), i.e., δ ≠ (look; while ¬Down do chop; look endWhile).
       Suppose that R′(δi, σ · (a, µi)) holds for all i. Hence, R(δi, σ · (a, µi)) holds because R′ ⊆ R, and R(δ, σ) holds since R satisfies (E2). Moreover, because of the form of both δ and σ, it is easy to check that ¬β(δ, σ). Thus, R′(δ, σ) holds.
   (b) δ = (look; δtc), i.e., δ = (look; while ¬Down do chop; look endWhile).
       Finally, the most interesting case. We can easily verify that (δ, σ) may evolve to (δtc, σ′) w.r.t. theory D, where σ′ = σ · (look, 0). Informally, there is always a possible evolution of the configuration in which the tree was just sensed to be still up. Technically, there is always a model M of D ∪ C ∪ {Sensed[σ · (look, 0)]}.
       Next, we observe that σ′ = [chop, (look, 0)]^{k+1}, and, therefore, β(δtc, σ′) holds. Thus, by the way we defined R′ in terms of R and β, ¬R′(δtc, σ · (look, 0)) applies and condition (E2) is trivially satisfied in this case.
       In words, there is always a possible and consistent sensing outcome for the only legal action-transition (i.e., a look action) such that the resulting configuration is not in the set. Recall that, in order to be forced to include the original configuration in the set, we would require that, no matter how the sensing turns out, the resulting configuration be in the set.
The set R′ satisfies (E3)
Let δ be any program and σ be any history. Let us consider the following two exhaustive and exclusive cases depending on the form of the program δ:
(a) δ ≠ δtc and δ ≠ look; δtc.
    First, note that, due to the form of δ, ¬β(δ, σ) is true. Assume then that there exists a program δ′ such that (δ, σ) may evolve to (δ′, σ) w.r.t. Dtc and such that R′(δ′, σ) holds. Since R′ ⊆ R, then R(δ′, σ) holds as well. Given that R satisfies (E3), R(δ, σ). Finally, because ¬β(δ, σ), it follows that R′(δ, σ) is true.
(b) δ = δtc or δ = look; δtc.
    In this case, there is no program δ′ such that (δ, σ) may evolve to (δ′, σ), as there can be no transition without an action, either a look action or a chop one. Therefore, condition (E3) is trivially satisfied.
PROOF OF PROPOSITION 5.1:
Let M be a model of Dtc ∪ C. We are to prove that KHowInM(δtc, ε, M). Let k be the number of chops required initially to get the tree down in model M, i.e., M |= RemainingChops(S0) = k. It is not difficult to see that the history σ = [(chop, 1) · (look, 0)]^{k−1} · (chop, 1) · (look, 1) completely executes program δtc in model M. Moreover, we can check the following points:
• (δtc, σ) is final and hence, by condition (T1), KHowInM(δtc, σ, M) holds;
• ((look; δtc), [(chop, 1) · (look, 0)]^{k−1} · (chop, 1)) may evolve to (δtc, σ). By condition (T2), KHowInM((look; δtc), [(chop, 1) · (look, 0)]^{k−1} · (chop, 1), M) holds;
• (δtc, [(chop, 1) · (look, 0)]^{k−1}) may evolve to ((look; δtc), [(chop, 1) · (look, 0)]^{k−1} · (chop, 1)). By condition (T2), KHowInM(δtc, [(chop, 1) · (look, 0)]^{k−1}, M) holds;
• ...
• ((look; δtc), (chop, 1)) may evolve to (δtc, [(chop, 1) · (look, 0)]^1). By condition (T2), KHowInM((look; δtc), (chop, 1), M) holds;
• (δtc, ε) may evolve to ((look; δtc), (chop, 1)). By condition (T2), KHowInM(δtc, ε, M) holds.
PROOF OF THEOREM 5.2:
Part (ii) follows directly from Proposition 5.1 and Theorem 3.1, taking Dtc as the background theory.
Let us prove part (i). Suppose that (δ*, σ*) is a configuration such that KHowET(δ*, σ*) does not hold. We shall prove that KHowEC(δ*, σ*) is not true either.
By hypothesis, there is a model M of D ∪ C ∪ {Sensed[σ*]} such that KHowInM(δ*, σ*, M) is false. We define the binary relation R as follows: for all programs δ and all histories σ, R(δ, σ) iff KHowInM(δ, σ, M). We shall prove now that R satisfies conditions (E1) and (E2) in the definition of KHowEC:
(E1) Suppose that (δ, σ) is final. Then, by condition (T1) in the definition of KHowInM, KHowInM(δ, σ, M) is true and R(δ, σ) holds.
(E2) Suppose that (δ, σ) may evolve to configurations (δ′, σ · (a, µi)) w.r.t. theory D with 1 ≤ i ≤ k, and that R(δ′, σ · (a, µi)) for all 1 ≤ i ≤ k.
If σ is inconsistent (i.e., D ∪ C ∪ {Sensed[σ]} is unsatisfiable), then configuration (δ, σ) is final (in the general case), KHowInM(δ, σ, M) holds, and, therefore, R(δ, σ) holds as well.
Assume then that σ is consistent. Then, there should be some µj, 1 ≤ j ≤ k, such that (δ, σ) may evolve to (δ′, σ · (a, µj)) w.r.t. M. Notice that M |= Trans(δ, end[σ], δ′, do(a, end[σ])) and that µj = 1 iff M |= SF(a, end[σ]). Since R(δ′, σ · (a, µj)) holds, so does KHowInM(δ′, σ · (a, µj), M). Then, by condition (T2) in the definition of KHowInM, KHowInM(δ, σ, M) holds and, by the way we have defined R, it follows that R(δ, σ) holds.
Since KHowEC is the smallest relation closed under (E1) and (E2), KHowEC is contained in R; and since R(δ*, σ*) does not hold, KHowEC(δ*, σ*) does not hold either.
To generalize the result to KHow^Ang_ECnd and KHow^Ang_ETnd, we just need to add the following case to the above two:
(E3) Suppose that (δ, σ) may evolve to (δ′, σ) w.r.t. theory D and that R(δ′, σ) holds. Then, since the history remains the same, it is also true that (δ, σ) may evolve to (δ′, σ) w.r.t. model M (the configuration evolution does not depend on the environment). Moreover, by the way R was defined, KHowInM(δ′, σ, M) holds. By condition (T3) in the definition of KHowInM, it follows that KHowInM(δ, σ, M) and, therefore, R(δ, σ) holds as well.
PROOF OF PROPOSITION 7.1:
Assume KHow^Ang_ECnd(δ, σ) holds. By Theorem 4.2, there is a program δ^tp such that KHowByEC(δ, σ, δ^tp). By Theorem 4.1, δ^tp is a tree program and, thus, a deterministic one, and D ∪ C ∪ {Sensed[σ]} |= ∃s.Do(δ^tp, end[σ], s) ∧ Do(δ, end[σ], s). By Theorem 5.2, KHowET(δ^tp, σ) holds and, putting it all together, KHow^Ang_ETnd(δ, σ) applies.
Lemma A.2 (Induction Principle for KHow^Ang_ECnd)
For all relations T(δ, σ) over programs and histories, if T(δ, σ) is closed under assertions (E1), (E2), and (E3) of the definition of KHow^Ang_ECnd, then for all programs δ and histories σ, if KHow^Ang_ECnd(δ, σ), then T(δ, σ).
PROOF:
As for Theorem 3 of [5].
Lemma A.3 (Induction Principle for KHowByEC)
For all relations T(δ, σ, δ′) over programs, histories, and programs, if T(δ, σ, δ′) is closed under assertions (A), (B), and (C) of the definition of KHowByEC, then for all programs δ, δ′, and histories σ, if KHowByEC(δ, σ, δ′), then T(δ, σ, δ′).
PROOF:
As for Theorem 3 of [5].
PROOF OF THEOREM 4.1:
We shall prove a more general result w.r.t. KHow^Ang_ECnd. To that end, we add the following case to the definition of KHowByEC:

(C) if (δ, σ) may evolve to (δ′, σ) w.r.t. theory D and R(δ′, σ, δ^tp′), then R(δ, σ, (True?; δ^tp′)).

In addition, we enrich a bit the notion of TREE programs to include two special test steps:

dpt ::= nil | False? | a; dpt1 | True?; dpt1 | senseφ; if φ then dpt1 else dpt2

where a is any non-sensing action, and dpt1 and dpt2 are tree programs.
The proof goes by induction on KHowByEC. The property that we have to show closed under the assertions in the definition of KHowByEC (so as to be able to apply Lemma A.3) is the following one:

T(δ, σ, δ^tp) = KHow^Ang_ECnd(δ, σ) and δ^tp is a TREE program and D ∪ C ∪ {Sensed[σ]} |= ∃s.Do(δ^tp, end[σ], s) ∧ Do(δ, end[σ], s).
Checking that T(δ, σ, δ^tp) is closed under assertion (A) of the definition of KHowByEC:
Assume that (δ, σ) is a final configuration. Let us show that T(δ, σ, nil). By definition of KHow^Ang_ECnd, we have KHow^Ang_ECnd(δ, σ). As well, nil is a TREE program, and we have D ∪ C ∪ {Sensed[σ]} |= Do(nil, end[σ], end[σ]) ∧ Do(δ, end[σ], end[σ]). Thus, T(δ, σ, nil).
Checking that T(δ, σ, δ^tp) is closed under assertion (B) of the definition of KHowByEC:
Assume that (δ, σ) may evolve to one or more configurations (δ′, σ · (a, µi)) (for some action a) w.r.t. theory D, where 1 ≤ i ≤ 2, and that there exist δ_i^tp′ such that T(δ′, σ · (a, µi), δ_i^tp′). This means that:
(a) if D ∪ C ∪ {Sensed[σ · (a, 1)]} is consistent, then T(δ′, σ · (a, 1), δ_1^tp′), i.e., KHow^Ang_ECnd(δ′, σ · (a, 1)) and δ_1^tp′ is a TREE program and D ∪ C ∪ {Sensed[σ · (a, 1)]} |= ∃s.Do(δ_1^tp′, end[σ · (a, 1)], s) ∧ Do(δ′, end[σ · (a, 1)], s).
(b) if D ∪ C ∪ {Sensed[σ · (a, 0)]} is consistent, then T(δ′, σ · (a, 0), δ_2^tp′), i.e., KHow^Ang_ECnd(δ′, σ · (a, 0)) and δ_2^tp′ is a TREE program and D ∪ C ∪ {Sensed[σ · (a, 0)]} |= ∃s.Do(δ_2^tp′, end[σ · (a, 0)], s) ∧ Do(δ′, end[σ · (a, 0)], s).
Observe that since (δ, σ) may evolve to at least one (δ′, σ · (a, µ)) for some sensing outcome µ, either D ∪ C ∪ {Sensed[σ · (a, 0)]} or D ∪ C ∪ {Sensed[σ · (a, 1)]} is consistent.
Let us show that T(δ, σ, (a; if φ then δ_1^tp else δ_2^tp endIf)), where φ is the condition on the right-hand side of the sensed fluent axiom for a, and where δ_1^tp = nil if D ∪ C ∪ {Sensed[σ · (a, 1)]} is inconsistent and δ_1^tp = δ_1^tp′ otherwise, and δ_2^tp = nil if D ∪ C ∪ {Sensed[σ · (a, 0)]} is inconsistent and δ_2^tp = δ_2^tp′ otherwise.
By the definition of KHow^Ang_ECnd, we have that KHow^Ang_ECnd(δ, σ) holds (again, notice that (δ, σ) may evolve to at least one configuration of the above form).
Given the assumptions and the way δ_1^tp and δ_2^tp are defined, clearly a; if φ then δ_1^tp else δ_2^tp endIf is a TREE program.
It remains to show that

D ∪ C ∪ {Sensed[σ]} |= ∃s.Do((a; if φ then δ_1^tp else δ_2^tp endIf), end[σ], s) ∧ Do(δ, end[σ], s).
Pick a model M of D ∪ C ∪ {Sensed[σ]}. Suppose M satisfies SF(a, end[σ]), and thus D ∪ C ∪ {Sensed[σ · (a, 1)]} is true in M. Then, due to (a) above, Do(δ′, end[σ · (a, 1)], s) must hold for some s in M. In addition, Trans(δ, end[σ], δ′, end[σ · (a, 1)]) is true in M, and we thus have that Do(δ, end[σ], s) is true in M. Also, since

Trans((a; if φ then δ_1^tp else δ_2^tp endIf), end[σ], (nil; if φ then δ_1^tp else δ_2^tp endIf), end[σ · (a, 1)]),
φ(end[σ · (a, 1)]), and
Do(δ_1^tp, end[σ · (a, 1)], s)

are all true in M, we must, therefore, have that Do((a; if φ then δ_1^tp else δ_2^tp endIf), end[σ], s) holds in M.
Suppose, on the other hand, that M does not satisfy SF(a, end[σ]), and thus D ∪ C ∪ {Sensed[σ · (a, 0)]} is true in M. Then, due to (b) above, Do(δ′, end[σ · (a, 0)], s) must hold for some s in M. In addition, Trans(δ, end[σ], δ′, end[σ · (a, 0)]) is true in M, and we thus have that Do(δ, end[σ], s) is true in M. Also, since

Trans((a; if φ then δ_1^tp else δ_2^tp endIf), end[σ], (nil; if φ then δ_1^tp else δ_2^tp endIf), end[σ · (a, 0)]),
¬φ(end[σ · (a, 0)]), and
Do(δ_2^tp, end[σ · (a, 0)], s)

are all true in M, we must, therefore, have that Do((a; if φ then δ_1^tp else δ_2^tp endIf), end[σ], s) holds in M.
Therefore, T(δ, σ, (a; if φ then δ_1^tp else δ_2^tp endIf)) applies.
Checking that T(δ, σ, δ^tp) is closed under assertion (C) of the definition of KHowByEC:
Assume that (δ, σ) may evolve to (δ′, σ) w.r.t. the theory and that T(δ′, σ, δ^tp′), i.e., KHow^Ang_ECnd(δ′, σ) and δ^tp′ is a TREE program and D ∪ C ∪ {Sensed[σ]} |= ∃s.Do(δ^tp′, end[σ], s) ∧ Do(δ′, end[σ], s).
Let us show that T(δ, σ, (True?; δ^tp′)). By the definition of KHow^Ang_ECnd, we have KHow^Ang_ECnd(δ, σ). Since δ^tp′ is a TREE program, so must be True?; δ^tp′.
It remains to show that D ∪ C ∪ {Sensed[σ]} |= ∃s.Do((True?; δ^tp′), end[σ], s) ∧ Do(δ, end[σ], s). Pick a model of D ∪ C ∪ {Sensed[σ]}. Since in this model Do(δ′, end[σ], s) holds for some s and Trans(δ, end[σ], δ′, end[σ]) holds too, Do(δ, end[σ], s) also holds in the model in question. As well, since we have Do(δ^tp′, end[σ], s) and Trans((True?; δ^tp′), end[σ], (nil; δ^tp′), end[σ]), we must also have Do((True?; δ^tp′), end[σ], s). Thus D ∪ C ∪ {Sensed[σ]} |= ∃s.Do((True?; δ^tp′), end[σ], s) ∧ Do(δ, end[σ], s). Therefore, T(δ, σ, (True?; δ^tp′)) applies.
PROOF OF THEOREM 4.2:
We shall prove a more general result w.r.t. KHow^Ang_ECnd. To that end, we add the following case to the definition of KHowByEC:

(C) if (δ, σ) may evolve to (δ′, σ) w.r.t. theory D and R(δ′, σ, δ^tp′), then R(δ, σ, (True?; δ^tp′)).

In addition, we enrich a bit the notion of TREE programs to include two special test steps:

dpt ::= nil | False? | a; dpt1 | True?; dpt1 | senseφ; if φ then dpt1 else dpt2

where a is any non-sensing action, and dpt1 and dpt2 are tree programs.
The proof goes by induction on KHow^Ang_ECnd configurations. The property that we have to show closed under the assertions in the definition of KHow^Ang_ECnd (so as to be able to apply Lemma A.2) is the following one:

T(δ, σ) = there exists δ^tp such that KHowByEC(δ, σ, δ^tp) holds.

So we shall prove that T(δ, σ) is closed under (E1)-(E3) (that is, take T to be one of the possible R in the definition of KHow^Ang_ECnd).

Checking that T(δ, σ) is closed under assertion (E1) of the definition of KHow^Ang_ECnd:
Assume (δ, σ) is final. By definition of KHowByEC, case (A), we have KHowByEC(δ, σ, nil). Thus, T(δ, σ) applies.
Checking that T(δ, σ) is closed under assertion (E2) of the definition of KHow^Ang_ECnd:
Assume that (δ, σ) may evolve to configurations (δ′, σ · (a, µi)) (for some action a) w.r.t. theory D, where 1 ≤ i ≤ 2, and such that T(δ′, σ · (a, µi)) holds. Suppose that φ is the condition on the right-hand side of the sensed fluent axiom for a, that is, SF(a, s) ≡ φ(s) ∈ D. Since (δ, σ) must evolve to at least one configuration (possibly two), D ∪ C ∪ {Sensed[σ]} is consistent and, therefore, either D ∪ C ∪ {Sensed[σ · (a, 0)]} or D ∪ C ∪ {Sensed[σ · (a, 1)]} is consistent.
If (δ, σ) may evolve to (δ′, σ · (a, 1)), then D ∪ C ∪ {Sensed[σ · (a, 1)]} ought to be consistent, and, hence, T(δ′, σ · (a, 1)), i.e., there exists δ_1^tp′ such that KHowByEC(δ′, σ · (a, 1), δ_1^tp′).
Similarly, if (δ, σ) may evolve to (δ′, σ · (a, 0)), then D ∪ C ∪ {Sensed[σ · (a, 0)]} ought to be consistent, and, hence, T(δ′, σ · (a, 0)), i.e., there exists δ_2^tp′ such that KHowByEC(δ′, σ · (a, 0), δ_2^tp′).
Let δ^tp = (a; if φ then δ_1^tp else δ_2^tp endIf), where δ_1^tp = nil if D ∪ C ∪ {Sensed[σ · (a, 1)]} is inconsistent and δ_1^tp = δ_1^tp′ otherwise, and δ_2^tp = nil if D ∪ C ∪ {Sensed[σ · (a, 0)]} is inconsistent and δ_2^tp = δ_2^tp′ otherwise.
By the definition of KHowByEC, case (B), we have that KHowByEC(δ, σ, δ^tp). Thus, T(δ, σ) applies.
Checking that T(δ, σ) is closed under assertion (E3) of the definition of KHow^Ang_ECnd:
Assume that (δ, σ) may evolve to (δ′, σ) w.r.t. D and that T(δ′, σ), i.e., there exists δ^tp′ such that KHowByEC(δ′, σ, δ^tp′). By the definition of KHowByEC, case (C), we have KHowByEC(δ, σ, (True?; δ^tp′)). Thus, T(δ, σ) applies, taking δ^tp = (True?; δ^tp′).