
Extended Spectrum Based Plan Diagnosis for Plan-Repair

2012


Shekhar Gupta, Bob Price and Johan de Kleer
Palo Alto Research Center, 3333 Coyote Hill Road, Palo Alto CA 94304 USA
Email: {shekhar.gupta,bprice,dekleer}@parc.com

Nico Roos
Maastricht University, The Netherlands
Email: [email protected]

Cees Witteveen
Department of Software Technology, Delft University of Technology, The Netherlands
Email: [email protected]

Abstract

In dynamic environments unexpected malfunctions or conditions can cause plan failure. Research has shown that plan-repair on failure can be more efficient than building complete conditional plans from scratch to handle all contingencies. The effectiveness of replanning depends on knowledge of exactly which plan actions failed and why. Conventional Model Based Diagnosis (MBD) can be used to detect such faulty components but the modeling cost (to generate the fault model) outweighs the benefits of MBD. In this paper, we propose an Extended Spectrum Based Diagnosis approach that efficiently pinpoints failed actions and does not require fault models. Our approach first computes the likelihood of an action being faulty and subsequently proposes optimal probe locations to refine the diagnosis. We also exploit knowledge of plan steps that are instances of the same plan operator to optimize the selection of the most informative diagnostic probes. This reduces costs and improves the accuracy of diagnoses.

Introduction

Classical planning assumes that the world is deterministic, so that every action produces its intended effects. This assumption does not hold in real world planning domains, where actions can fail because of unexpected events. Execution of a plan will lead to an unexpected goal state if one or more actions behave abnormally. When this happens, one way to achieve the desired goal state is to repair the original plan by adding or removing actions (van der Krogt and de Weerdt, 2005).
For example, consider a multimodal freight logistics system, where a planner such as TIMIPLAN generates optimal plans to deliver goods from one location to another (Flórez et al., 2011). TIMIPLAN has a plan monitoring component that checks whether the execution of plans deviates from the expected outcome and triggers a replanning module if needed. To avoid replanning from scratch, the planner uses plan repair to change the initial plan as little as possible. For example, a damaged truck is replaced by a new truck, and TIMIPLAN greedily selects the truck with the least estimated total cost. This, however, assumes full observability of the health state of the trucks, which in general might be too costly or in some cases infeasible.

Model based diagnosis is used to infer the set of faulty components in a system from observations and background knowledge (Reiter, 1987). It exploits a descriptive behavioral model of components together with a structural model of how the components are connected to compute the implications of observations. The idea of MBD can be extended to diagnose faulty actions in a plan, where the plan is seen as a system and each action is understood as a component (Roos and Witteveen, 2009). This view enables the application of well known diagnosis techniques to plan descriptions. For instance, knowing that a road must be clear for a truck to pass, and observing that a truck has arrived from a distant city, allows the system to infer that the road from that distant city is clear even though this cannot be directly observed.

A methodology such as the pervasive diagnosis framework (Kuhn et al., 2010) has demonstrated how diagnosis can be performed on systems controlled by plans, but makes the simplifying assumption that the planning goal is a single output for the system and that any failed action has a direct observable effect on the output.
It is therefore unsuitable for domains such as the logistics domain, where many goals must be achieved simultaneously and action failures have local effects that are only indirectly related to the goals.

While powerful, model-based techniques require accurate fault models, which are expensive to develop, and in some cases the required data cannot be obtained at all. For example, it may be difficult to model all the ways in which a truck can fail to deliver a package to a destination. The proposed Spectrum Based Diagnosis (SBD) approach uses abstract frequency statistics to reveal possible causes of a problem without a fault model of the system. SBD has been successfully applied to software fault localization (Abreu et al., 2009) and hardware diagnosis (Arjan Van Gemund and Abreu, 2011). In our approach we use SBD to determine the health state of a plan step, from which the health state of the corresponding action is inferred.

In the planning domain, it is common for a single plan operator to be instantiated many times for different plan steps. For instance, a transport operation might be instantiated with the same truck to carry packages on several different routes in a plan. All plan steps that are instantiated from the same operator will fail if there is something wrong with the plan operator. For instance, every attempt to schedule a shipment on a blocked road will fail. In the online replanning context, we are given the plan ahead of time, so we can exploit knowledge about the operator dependencies of actions within a plan. We propose Extended Spectrum Based Diagnosis, which exploits available information about such dependencies in the plan by extending the spectrum matrix.

Finally, in domains such as the logistics domain, we often have the ability to take information gathering actions. For example, we could get the dispatcher to call drivers and ask for a report on road conditions along a particular segment.
However, each of these actions has a cost. We address this by combining our extended spectrum based diagnosis with an optimal probing strategy that uses a mutual information criterion. Given the resulting information, standard replanning techniques are used to repair the plan. The result is a practical approach to planning for online systems with dynamic failures that works with incompletely described systems but exploits the known information to efficiently repair plans with the lowest cost. In the following sections we develop the mathematical framework for extended spectrum based diagnosis and demonstrate it on a notional multimodal transportation problem.

Preliminaries

Our planning formalism is modeled after the STRIPS planner (Fikes and Nilsson, 1971). Our specific notation is covered in the following subsections.

State

The world can be described by a finite set Var = {v1, v2, ..., vn} of variables and their respective value domains Di. A particular state is denoted by an n-tuple $\sigma = (\sigma(v_1), \ldots, \sigma(v_n)) \in D_1 \times D_2 \times \cdots \times D_n$. In a multimodal transportation system, the variables would represent the locations of individual items such as trucks and goods to be shipped.

Actions, plan operators and plan steps

An action refers to an activity that results in some change of the (current) state of the world. A plan operator refers to a description of such an action in a plan. More exactly, a plan operator o is a function mapping a state (σ0) to another state (σ1). An instantiation of an operator o with specific arguments is called a plan step. It maps a specific state into another specific state. Therefore, given a set O of plan operators, we consider a set S = inst(O) of instances of plan operators in O, called the set of plan steps. A plan step is denoted by a small roman letter si.

For example, a plan operator can be understood as a shipping action by a specific mode of transportation, i.e., a truck, a train or a ship. Such a shipping action can be used at several places in the plan using the same truck. Each specific occurrence of such a truck transportation is a plan step.

If plan step s is an instantiation of operator o, we say that o(s) = o. If for two plan steps s and s′ it holds that o(s) = o(s′), they are said to be related to each other. In other words, s and s′ share the same resource, so there is a resource dependency between these two plan steps. For example, if the same truck (plan operator) is used to execute two different transportations (plan steps), these plan steps are related.

Note that here plans differ from systems, where normally components operate quite independently from each other. In plans, it seems rational to assume that a structural fault in the truck might affect at least a subset of its instantiations (plan steps). Unlike typical physical systems, in which the components that make up the system fail independently, plans that contain related plan steps need to model the dependence in the failures between the related plan steps.

Plan and plan execution

We represent our plans as a partially ordered set of steps. Formally, a plan is a tuple $P = \langle O, S, < \rangle$ where $S \subseteq inst(O)$ is a set of plan steps instantiated from operators in O and (S, <) is a partial order (Cox, Durfee, and Bartold, 2005). If step s′ < s then s′ must be executed before s. The same relation < is also used to denote the relative order between states. Figure 1 gives an illustration of a partially ordered plan.

Figure 1: A partially ordered plan graph in which an initial state σ0 is transformed by plan steps (si for i = 1, 2, ..., 8) into a goal state σgoal. Each state characterizes the values of five variables v1, v2, v3, v4 and v5. Plan steps having the same color (e.g. s1 and s7, and s2 and s5) are instantiations of the same plan operator.

Observations

Our framework enables us to observe a set of values of the variables making up a state of the world. We denote an observation of variable v in state σ by σv. We assume that a cost is associated with any observation, except for observing the initial state σ0 and the goal state σgoal. For ease of exposition, we assume all probes have equal cost.

Plan Diagnosis

We use conventional MBD notation to represent the plan we are diagnosing.

Definition 1 A system is a pair (P, OBS) where P is a plan tuple $\langle O, S, < \rangle$ and OBS is the set of values of the observed variables at the initial state σ0 and the goal state σgoal.

Plan execution is validated by continuously monitoring the goal state. A difference between the observed value σ′goal(v) of any variable v in the goal state and its expected value σgoal(v) implies plan execution failure, i.e., some plan steps were not executed correctly. For example, consider the plan shown in Figure 1. Suppose this plan represents a multimodal transportation plan where five goods (v1 ... v5) need to be delivered from an initial location to a goal location using different transportation modes (s1 ... s8). At the final destination, it is observed that two goods (v2 and v3) have not arrived, which implies that one or more plan steps are faulty.

Let $h_j \in \{ok, ab\}$ be the health state of plan step sj, where ok represents normal behavior and ab abnormal behavior. In establishing which part of the plan fails, we are only interested in those plan steps qualified as abnormal.
Therefore, a plan diagnosis can be defined as follows:

Definition 2 (Diagnosis) A diagnosis PD of a plan $P = \langle O, S, < \rangle$ is a tuple $PD = \langle O, S, <, D \rangle$, where $D \subseteq S$ is the subset of plan steps qualified as abnormal (and therefore $S \setminus D$ is the subset of plan steps qualified as ok).

Spectrum Based Diagnosis

In the absence of a detailed fault model of plan operators and plan steps, SBD is a suitable diagnosis methodology for the problem at hand. The basic principle of SBD can be described as follows: if the value of a variable in the goal state is incorrect, then one or more plan steps involved in the generation of that variable are abnormal.

Obtaining the Spectrum Matrix

The spectrum matrix shows, for every variable in σgoal, which plan steps are involved on the way from the state σ0 to σgoal. It also records whether a particular variable vi has the expected value in the goal state or not. Together with the information about the involvement of plan steps, the resulting spectrum gives hints about which plan steps are more likely related to the failure and hence more likely to be faulty.

The spectrum matrix (A, e), where A = [aij] is the plan spectrum and e is the error vector, is constructed as follows. The plan spectrum A has N rows (one for each variable) and M columns (one for each plan step). We have aij = 1 if plan step sj is involved in the generation of variable vi in σgoal, else aij = 0. The vector e stores whether the outcome for variable vi has the expected value (ei = +) or not (ei = −).

For example, suppose that in the plan presented in Figure 1, the values of variables v2 and v3 are not what we would expect in the goal state.
Therefore ei = − for i = 2 and i = 3, and the following spectrum matrix is obtained:

      s1  s2  s3  s4  s5  s6  s7  s8   e
v1     1   0   1   0   0   1   0   0   +
v2     1   0   1   0   0   1   0   0   −
v3     1   1   1   1   0   0   1   0   −
v4     1   1   0   1   1   0   0   1   +
v5     1   1   0   1   1   0   0   1   +

In any row with an unexpected outcome, at least one of the components used must be faulty. A minimal hitting set algorithm, STACCATO (Abreu et al., 2009), can be applied to the set of rows with unexpected outcomes to generate the set of diagnosis candidates ck: {c1 = ⟨s1⟩, c2 = ⟨s3⟩, c3 = ⟨s2, s6⟩, c4 = ⟨s4, s6⟩, c5 = ⟨s7, s6⟩}.

The Spectrum Matrix for Plan Steps with Shared Resources

The candidate ⟨s2, s6⟩ implies that the operator associated with s2 may be faulty, but it could be expensive or difficult to probe the output of s2. From our knowledge of the plan, we know that s2 is instantiated from the same operator as s5. Therefore s5 is also likely to fail if s2 fails. In this case, the failure of s5 may have been intermittent, or the failure may not have been relevant to the preconditions of the subsequent step s8, so it did not have an effect on the final goal state σgoal. This is called a masked fault, and it is not picked up by standard SBD methods.

This insight is important, because probing at s5 may be easier and cheaper than probing at s2. Imagine a scenario in which steps s2 and s5 use the same truck. Suppose that in s2 the truck is used at a distant location where it is difficult to inspect. If it is later used in plan step s5 at a location with inspection facilities, it will be much easier to measure the health of this resource.

There is one small complication. If an operator is used more than once in a plan, it could be healthy earlier in the plan and then fail at some later point. To take these related plan steps into account, we modify the spectrum matrix in such a way that these relations are encoded in the matrix A itself. Suppose that the plan steps s and s′ are related.
If s is detected as faulty and s < s′, it seems reasonable to consider s′ as faulty as well. Formally, we calculate the extended spectrum matrix A′ = [a′ij] from A as follows:

$$a'_{ij} = a_{ij} \lor \bigvee_{j' < j,\ o(j') = o(j)} a'_{ij'} \qquad (1)$$

In the plan depicted in Figure 1, plan steps with the same background color are related. So s1 and s7 are related, and s2 and s5 are related. The extended spectrum matrix is shown below. The new entries relative to A are the s7 entries in rows v1, v2, v4 and v5 and the s5 entry in row v3.

      s1  s2  s3  s4  s5  s6  s7  s8   e
v1     1   0   1   0   0   1   1   0   +
v2     1   0   1   0   0   1   1   0   −
v3     1   1   1   1   1   0   1   0   −
v4     1   1   0   1   1   0   1   1   +
v5     1   1   0   1   1   0   1   1   +

Like other MBD engines, our diagnosis engine assumes that plan steps fail independently when computing the posterior probability of every diagnosis. Therefore, if a diagnosis contains plan steps that are related to each other, our engine will compute incorrect posteriors. Hence a diagnosis must not contain related plan steps. The extended matrix ensures that applying the minimal hitting set (MHS) algorithm to it produces diagnoses composed of independent plan steps. Applying the minimal hitting set algorithm to the extended matrix A′ generates the diagnosis candidates ck: {c1 = ⟨s1⟩, c2 = ⟨s3⟩, c3 = ⟨s7⟩, c4 = ⟨s2, s6⟩, c5 = ⟨s4, s6⟩, c6 = ⟨s5, s6⟩}.

Theorem 1 Introducing related plan steps into the extended matrix ensures that the MHS algorithm will never return a diagnosis that includes two related plan steps.

Proof Two plan steps will only appear together in a diagnosis if they individually explain distinct error observations. When we insert a pseudo observation for one of the steps into the matrix, the second step becomes an explanation for both error outputs and becomes a singleton diagnosis, breaking up the joint diagnosis. Schematically,

$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \Rightarrow \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$$

Keeping related actions from appearing in the same diagnosis prevents us from multiplying these correlated failures together as if they were independent failures. This preserves the accuracy of the diagnosis.
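As a minimal sketch of the construction above, the following Python code builds the plan spectrum of the Figure 1 example, applies Equation 1, and enumerates minimal hitting sets by brute force. The brute-force enumeration stands in for STACCATO and is not the authors' implementation. The operator labels (`truckA`, `truckB`, `o3`, ...) and the function names are hypothetical, and the code assumes that the step indices respect the plan order <.

```python
from itertools import combinations

# Plan spectrum A from the Figure 1 example: rows v1..v5, columns s1..s8.
A = [
    [1, 0, 1, 0, 0, 1, 0, 0],  # v1
    [1, 0, 1, 0, 0, 1, 0, 0],  # v2
    [1, 1, 1, 1, 0, 0, 1, 0],  # v3
    [1, 1, 0, 1, 1, 0, 0, 1],  # v4
    [1, 1, 0, 1, 1, 0, 0, 1],  # v5
]
# Error vector e: True marks a goal variable with an unexpected value.
FAILED = [False, True, True, False, False]
# Hypothetical operator labels: s1/s7 share an operator, as do s2/s5.
OPERATOR = ["truckA", "truckB", "o3", "o4", "truckB", "o6", "truckA", "o8"]


def extend_spectrum(a, operator):
    """Equation 1: OR into column j every earlier column j' with the same operator.

    Assumes column order is consistent with the plan order <.
    """
    ext = [row[:] for row in a]
    for row in ext:
        for j in range(len(row)):
            for jp in range(j):
                if operator[jp] == operator[j] and row[jp]:
                    row[j] = 1
    return ext


def minimal_hitting_sets(a, failed):
    """Brute-force minimal hitting sets of the failing rows (not STACCATO)."""
    conflicts = [{j for j, used in enumerate(row) if used}
                 for row, bad in zip(a, failed) if bad]
    steps = sorted(set().union(*conflicts)) if conflicts else []
    found = []
    # Growing the candidate size guarantees smaller diagnoses are found first.
    for size in range(1, len(steps) + 1):
        for cand in combinations(steps, size):
            c = set(cand)
            if any(prev <= c for prev in found):
                continue  # a smaller diagnosis is already contained in cand
            if all(c & conflict for conflict in conflicts):
                found.append(c)
    return [sorted(f"s{j + 1}" for j in c) for c in found]


print(minimal_hitting_sets(A, FAILED))
print(minimal_hitting_sets(extend_spectrum(A, OPERATOR), FAILED))
```

Run on the example, the sketch returns ⟨s1⟩, ⟨s3⟩, ⟨s2, s6⟩, ⟨s4, s6⟩, ⟨s6, s7⟩ for A and ⟨s1⟩, ⟨s3⟩, ⟨s7⟩, ⟨s2, s6⟩, ⟨s4, s6⟩, ⟨s5, s6⟩ for A′. Once ⟨s7⟩ is a diagnosis on its own, no minimal candidate pairs s7 with another step.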
Probability Calculation

Having corrected the spectrum matrix, we can use the BARINEL (Abreu et al., 2009) diagnostic engine to compute a fault probability for every diagnosis candidate using Bayes rule. For each variable observation σgoal(vi), the posteriors are updated according to the following rule for every candidate ck:

$$\Pr(c_k \mid \sigma_{goal}(v_i)) = \frac{\Pr(\sigma_{goal}(v_i) \mid c_k) \cdot \Pr(c_k \mid \sigma_{goal}(v_{i-1}))}{\Pr(\sigma_{goal}(v_i))} \qquad (2)$$

The recursion bottoms out with the prior for the candidate, Pr(ck), which is computed from the individual step priors assuming independent failures. Note that the candidate ⟨si⟩ implies health variable hi = ab. Generally:

$$\Pr(c_k) = \prod_i \begin{cases} p_i & \text{if } h_i = ab \\ 1 - p_i & \text{otherwise} \end{cases} \qquad (3)$$

where pi is the prior probability that plan step si is faulty. In our case, the prior probability of every plan step is assumed to be 0.1. The BARINEL engine propagates failure probabilities along the plan step dependencies to calculate the probability Pr(σgoal(vi) | ck) for each output variable vi using maximum likelihood estimation (Abreu, 2009). The final posterior probability is computed by combining Equations 2 and 3 with Pr(σgoal(vi) | ck), and fault probabilities are assigned to plan steps as shown in Table 1.

Probing Strategy

A major challenge for a diagnostician is to identify a suitable location for a new probe. In conventional MBD, a mutual information criterion can be used to evaluate and compare measurement choices based on their information contribution (de Kleer and Williams, 1987). We have adapted this criterion to probing plan based systems with related steps. To illustrate the formulation, assume X is the diagnostic state of a plan and Y is the measured value of a variable at a probing location, where X and Y are both random variables. The mutual information between X and Y is defined as:

$$I(X; Y) = \sum_{x,y} p(x, y) \cdot \log \frac{p(x, y)}{p(x)\,p(y)} \qquad (4)$$

For example, suppose we derive the mutual information about the values at locations l1 and l2 as I(X; Yl1) and I(X; Yl2), respectively. In choosing between l1 and l2, we choose to probe l1 if I(X; Yl1) > I(X; Yl2). As described in (Juan Liu and Zhou, 2008), the above expression can be estimated using an entropy calculation, which is given as $I(X; Y) = H(Y) - H(Y \mid X)$, where $H(Y) = \sum_{y} p(y) \cdot \log \frac{1}{p(y)}$ is the entropy of Y and $H(Y \mid X) = \sum_{x,y} p(x, y) \cdot \log \frac{1}{p(y \mid x)}$ is the conditional entropy.

For the plan example shown in Figure 1, observations are already given and the fault probabilities have been computed from SBD, as shown in Table 1. Estimated fault probabilities and observations in the goal state are used to compute H(Y) and H(Y|X) as described in (Juan Liu and Zhou, 2008). The mutual information for the different probing locations in our example (Figure 1) is summarized in Table 1.

si    Pr(si)   Pr′(si)   I(X; Y)    I′(X; Y)
s1    0.200    0.160     0.512884   0.512884
s2    0.002    0.002     0.008762   0.008762
s3    0.800    0.762     0.016707   0.016707
s4    0.002    0.002     0.264348   0.264348
s5    0.000    0.002     0.004198   0.011041
s6    0.007    0.008     0.000000   0.000000
s7    0.003    0.160     0.000000   0.000000
s8    0.000    0.000     0.074128   0.074128

Table 1: Pr(si) and I(X; Y) are derived for the original matrix A. Pr′(si) and I′(X; Y) are derived for the extended matrix A′.

Exploiting Related Plan Steps in Diagnosis

In the plan described in Figure 1, s3 has the strongest participation in the unexpected goal state outcomes for variables v2 and v3. In the first column of Table 1, Pr(si), we see that the diagnoser assigns s3 the highest probability of failure. The standard spectrum A assigns different probabilities to plan steps s1 and s7. The extended spectrum, which recognizes that s1 and s7 are related, increases the fault probability of s7, so that s7 and s1 now have equal probability.
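Equations 3 and 4 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the BARINEL engine or the authors' probe selection code: the function names, the base-2 logarithm, and the toy joint distributions are all hypothetical choices made for the example.

```python
from math import log2, prod


def candidate_prior(candidate, priors):
    """Equation 3: multiply p_i for steps in the candidate and 1 - p_i for the rest."""
    return prod(p if step in candidate else 1.0 - p
                for step, p in priors.items())


def marginals(joint):
    """Marginal distributions p(x) and p(y) of a joint table {(x, y): p(x, y)}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return px, py


def mutual_information(joint):
    """Equation 4, in bits (the log base is an assumption)."""
    px, py = marginals(joint)
    return sum(p * log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)


def mutual_information_entropy_form(joint):
    """The same quantity computed as H(Y) - H(Y|X)."""
    px, py = marginals(joint)
    h_y = sum(p * log2(1.0 / p) for p in py.values() if p > 0)
    # log(1 / p(y|x)) with p(y|x) = p(x, y) / p(x)
    h_y_given_x = sum(p * log2(px[x] / p)
                      for (x, y), p in joint.items() if p > 0)
    return h_y - h_y_given_x


# Uniform step prior of 0.1, as assumed in the text.
PRIORS = {f"s{i}": 0.1 for i in range(1, 9)}

# Hypothetical joint tables over (diagnostic state, probe reading).
UNINFORMATIVE = {("c1", "ok"): 0.25, ("c1", "bad"): 0.25,
                 ("c2", "ok"): 0.25, ("c2", "bad"): 0.25}
DECISIVE = {("c1", "ok"): 0.5, ("c2", "bad"): 0.5}

print(candidate_prior({"s1"}, PRIORS))
print(mutual_information(UNINFORMATIVE), mutual_information(DECISIVE))
```

In the toy tables, a probe whose reading is independent of the diagnostic state scores 0 bits, while a probe that separates two equally likely candidates scores 1 bit. This is the sense in which the location with the largest I(X; Y) is the preferred probe.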
Similar conclusions can be drawn for the other pair of related plan steps, s2 and s5.

The mutual information results shown in Table 1 lead to some interesting conclusions. Both spectrum matrices unambiguously suggest that s1 is the most informative location to probe and that s7 is the least. Therefore, probing at the output of s1 will improve the diagnosis by the maximum amount. Since the output of s7 is part of the goal state, which is observed at no cost, probing it yields no extra information, which matches our mutual information computation. At the same time, extending the matrix reveals the information content at the output of plan step s5 to the diagnoser. In this case, s5 is closer to the middle of the plan than s2, which means that it better splits the hypothesis space about possible causes of failure and therefore is more informative. In some cases, s5 may not be more informative, but may be cheaper or easier to measure. In any case, the extended spectrum matrix opens up new options to increase the accuracy and decrease the cost of diagnosis in plans with related plan steps.

Conclusion

Continuous planning in online dynamic real world environments requires accurate diagnosis to pinpoint which plan steps need to be repaired. Spectrum based diagnosis approaches are a natural fit, as they do not require explicit fault models to provide useful diagnostic information. We have seen that extended spectrum based diagnosis extends the advantage of traditional spectrum based diagnosis to systems controlled by a plan that can have related plan steps. The extended spectrum matrix also increases the options for probing, potentially leading to more accurate and cheaper diagnosis. The technique can be extended in many ways, such as computing explicit expected probe costs and considering other ways in which operators can be related.
Extended spectrum based diagnosis therefore represents an important technology option for robust, practical and efficient plan based control of real world systems.

References

Abreu, R.; Zoeteweij, P.; Golsteijn, R.; and van Gemund, A. 2009. A practical evaluation of spectrum-based fault localization. Journal of Systems and Software.

Arjan Van Gemund, S. G., and Abreu, R. 2011. The antares approach to automatic system diagnosis. In Proceedings of the 22nd International Workshop on Principles of Diagnosis (DX-2011), 5–12.

Cox, J. S.; Durfee, E. H.; and Bartold, T. 2005. A distributed framework for solving the multiagent plan coordination problem. In AAMAS, 821–827. ACM Press.

de Kleer, J., and Williams, B. C. 1987. Diagnosing multiple faults. Artif. Intell. 32(1):97–130.

Fikes, R., and Nilsson, N. J. 1971. STRIPS: A new approach to the application of theorem proving to problem solving. In IJCAI, 608–620.

Flórez, J. E.; de Reyna, Á. T. A.; García, J.; López, C. L.; Olaya, A. G.; and Borrajo, D. 2011. Planning multi-modal transportation problems. In ICAPS.

Juan Liu, Johan de Kleer, L. K., and Zhou, R. 2008. A unified information criterion for evaluating probe and test selection. In PHM-2008.

Kuhn, L.; Price, B.; Do, M. B.; Liu, J.; Zhou, R.; Schmidt, T.; and de Kleer, J. 2010. Pervasive diagnosis. IEEE Transactions on Systems, Man, and Cybernetics, Part A 40(5):932–944.

Reiter, R. 1987. A theory of diagnosis from first principles. Artif. Intell. 32(1):57–95.

Roos, N., and Witteveen, C. 2009. Models and methods for plan diagnosis. Journal of Autonomous Agents and Multi-Agent Systems 19(1):30–52.

van der Krogt, R., and de Weerdt, M. 2005. Plan repair as an extension of planning. In ICAPS, 161–170.