2013 ICRA Mcko

Download as pdf or txt
Download as pdf or txt
You are on page 1of 8

Towards a Unified Behavior Trees Framework for Robot Control

Alejandro Marzinotto, Michele Colledanchise, Christian Smith, and Petter Ögren

Abstract— This paper presents a unified framework for


Input Hidden Output
Behavior Trees (BTs), a plan representation and execution tool. Layer Layer Layer Linear Temporal Logic (LTL)
The available literature lacks the consistency and mathematical Negation ¬ Defines ≡
Input 1 Always  Eventually ♦
rigor required for robotic and control applications. Therefore, Or ∨ And ∧
we approach this problem in two steps: first, reviewing the Input 2 Output
Derives ` Satisfies |=
most popular BT literature exposing the aforementioned issues; Input 3

second, describing our unified BT framework along with equiv-


Top Layer
alence notions between BTs and Controlled Hybrid Dynamical ?

Systems (CHDSs). This paper improves on the existing state of


the art as it describes BTs in a more accurate and compact → Action 2 q1 ··· qn ··· qN
way, while providing insight about their actual representation
Condition 1 Action 1
capabilities. Lastly, we demonstrate the applicability of our Middle Layer
framework to real systems scheduling open-loop actions in a Uncertainties e(t)
u(t)
grasping mission that involves a NAO robot and our BT library. r(t) e(t)
Controller
u(t)
System
y(t)
u(t)

ym (t)

I. I NTRODUCTION Measurements
Bottom Layer e(t)
0 dt 2dt 3dt 4dt 5dt t

Behavior Trees (BTs) are plan representation tools com-


monly used in scenarios where there is no need to have Fig. 1. The Top, Middle, and Bottom Layers of robot architectures. The
a mathematical foundation encompassing continuous-time division is arbitrary: there is no strict boundary between layers.
dynamics. However, such requirement arises in order to use
BTs on more complex applications, e.g. real robots, control
systems. This indicates the need to formalize BTs in a As represented in Fig. 1, the BTs and CHDSs belong to
mathematical framework that is both accurate and compact. the Middle Layer which provides a mechanism for switching
Intuitively, we measure accuracy to be inversely propor- between low-level controllers. The Top Layer automatically
tional to the degree of misinterpretation that a certain state- generates and updates plans, whereas the Bottom Layer
ment can be subjected to. Likewise, we measure compactness handles low-level controllers that interact with the hardware.
to be inversely proportional to the amount of definitions To demonstrate the usability of our work, we implemented
required to fully specify an idea. Using these two indicators an open source BT library for the Robot Operating System
we created a framework that surpasses the state of the art. (ROS) [18], which allowed us to test the concepts described
In pursuit of the accuracy, we formulated the Action and in this paper on real robotic platforms. We briefly analyze
Condition subsets which remove the ambiguities that could the implementation limitations regarding the different ways
arise when referring to the node Success, Running, or Failure. of resolving subset intersection (prevalence and hysteresis).
It is relying on these subsets that we could motivate the node The main contributions of this paper are listed below:
extensions, and compare BTs with other plan representations. 1) A more accurate and compact BT framework.
There exist ad-hoc engineering solutions to circumvent 2) Introduction of the Action and Condition subsets.
the intrinsic limitations of BTs regarding two aspects: nodes 3) Formalization and motivation of two node extensions.
are memory-less [5] (do not store the last running node), and 4) Equivalence notions between BTs and CHDSs.
BTs execute independently from one another [2] (cooperative The execution of the task described in Section IX is
tasks are not possible). This paper formalizes and motivates presented in a video1 , and the source code of the library
solutions to both problems in an accurate and compact way. is publicly available on www.github.com/almc.
After [13], there is still uncertainty regarding the potential The paper is structured as follows: in Section II we review
of BTs as a suitable representation to replace Controlled related work, in Section III we formalize BTs and introduce
Hybrid Dynamical Systems (CHDSs) [15]–[17]. We address the subsets, in Sections IV and V we present the Node∗
the problem in two complementary ways: from CHDSs to and the Decorator∼ extensions, in Section VI we present
BTs, and vice-versa. This provides important insight about a formal definition of CHDSs, in Section VII we study
which tasks are representable, under what constraints, and the equivalence between BTs and CHDSs, in Section VIII
what are the advantages / disadvantages of each paradigm. we present the software structure of our implementation, in
The authors are with the Computer Vision and Active Perception Lab.,
Section IX we describe the experimental framework, and in
Centre for Autonomous Systems, School of Computer Science and Com- Section X we present the conclusions and future work.
munication, Royal Institute of Technology (KTH), SE-100 44 Stockholm,
Sweden. e-mail: {almc|miccol|ccs|petter}@kth.se 1 YouTube video name: Behavior Trees - NAO Grasping [ROS / C++].
II. R ELATED W ORK III. B EHAVIOR T REES
Most of the current BT research efforts are focused This section gives a formal description of BTs following
towards finding new efficient ways to implement Artificial the guidelines of [3], [8], [13]. A BT is defined as a directed
Intelligence (AI) for entertainment systems, e.g. specifying acyclic graph G(V, E) with |V| nodes and |E| edges. We call
Non-Player Characters (NPCs) in video-games. This situ- the outgoing node of a connected pair the parent, and the
ation often yields BT frameworks that are game-oriented; incoming node the child. We call the child-less nodes leaves,
even though they contain interesting features, they are not and the unique parent-less node Root. Each node in a BT,
generalizable to be used in other research fields, e.g. robotics. with the exception of the Root, is one of six possible types:
Fortunately, not all papers suffer from these problems but four non-leaf (control-flow) node types (Selector, Sequence,
they do differ in several aspects (§1 – §11 ). The criteria, by Parallel and Decorator), and two leaf (execution) node types
which these papers were evaluated, is specified in Table I, (Action and Condition). These are summarized in Table III.
whereas the comparison itself is presented in Table II. We Unlike traditional graph theory trees [14], any node in the
refrain from expanding on the contributions of each paper, BT (except the Root and its only child) can have multiple
even though they do contain noteworthy contributions, since parents [3]. This allows sub-trees to be reused without having
our primary goal is to point out the differences between them to copy them but decreases readability; for this reason we
as a mean to justify the need for an unified BT framework. explicitly advocate for the following workaround: nodes
TABLE I. Criteria used to compare BT publications (§1 – §11 ). having multiple parents are prohibited, the re-usability of
sub-trees is not to be done at the level of control-flow nodes,
§1
planning integration: whether there exists a mechanism for automated BT preferably, it is to be done at the level of execution nodes.
generation and maintenance. This corresponds to the Top Layer.
non-blocking actions: whether actions stop the flow of the BT until they
The Root periodically, with frequency ftick , generates an
§2 enabling signal called tick, which is propagated through the
finish executing. This affects the ability of the BT to react to changes.
modularity: whether BTs are built in such a way that they can be chained branches according to the algorithm defined for each node.
§3
together (embedded one inside the other) and still function properly.
dynamic tree: whether BTs can be modified during run-time. This is When the tick reaches a leaf node, it executes one cycle
§4
required for the Top Layer implementation, but does not imply it exists. of the Action or Condition. Actions can alter the system
virtual world: if it was implemented in a simulated environment (game).
§5 configuration, returning one of three possible state values:
This considers cases where it has access to the game-state or to an interface.
§6
real world: if it was implemented in the real world (robot). This considers Success, Failure, or Running. Conditions cannot alter the sys-
cases where the outcome of the BT manifests outside a computer simulation.
multi-agent: whether the proposed BT structure can handle multiple agents tem configuration, returning one of two possible state values:
§7
simultaneously. This considers cases involving at least two agents. Success, or Failure. This returned state is then propagated
global variables: whether the implementation requires a common set of
§8
variables (blackboard) shared between actions or uses parameters instead. back and forth through the tree, possibly triggering other
§9
BT library: whether the implementation supports a database of behaviors leaf nodes with their own return states, until finally one of
that can be polled to build BTs. This requires modularity enforced.
infinite execution: whether the BTs are meant to finish executing at some
these states reaches the Root. The nodes which are not ticked
§10
point, or they are supposed to run forever. This affects the modularity. are set to a special node state: NotTicked. The BT then waits
multiple parents to node: whether the strict definition of trees is enforced, or
§11
a relaxed version (where a node can belong to several parents) is preferred.
before sending the new tick to maintain ftick constant. We
remark that in the implementation the tick frequency ftick is
TABLE II. Comparison of BT publications. [ ]: reference to this paper. completely unrelated to the controller’s frequency fcontrol ;
they work asynchronously as explained in Section VIII-B.
Ref §1 §2 §3 §4 §5 §6 §7 §8 §9 §10 §11
[1] 3 7 3 7 3 7 3 3 3 7 7 A. Node Types
[2] 7 7 7 3 3 7 7 3 7 7 7
[3] 7 3 3 7 3 7 7 3 7 3 3 The node types behave according to Algorithms 1–11,
[4] 7 7 3 7 3 7 7 3 7 7 7 where the statement Tick(child(i)): triggers the algorithm
[5] 7 7 7 7 3 7 3 7 7 3 7
[6] 3 7 3 3 3 7 3 3 3 3 7 that corresponds to its child node type. The execution begins
[7] 7 3 3 7 3 3 3 3 7 3 7 and ends on Algorithm 4, the symbols S, F, R ⊆ X , X(t) ∈
[8] 7 3 7 7 3 7 3 7 7 7 7
[9] 7 3 7 7 3 7 3 3 7 7 7 X , U (t) ∈ U are the Success/Failure/Running subsets, state
[10] 7 7 3 7 3 7 3 3 7 3 7 space, and control signals respectively. For a detailed real-
[11] 3 3 7 7 3 7 7 3 7 3 7
[12] 7 3 7 3 3 7 3 3 7 3 7
life example using these variables refer to Section III-B.
[13] 7 3 3 7 7 7 7 3 7 7 7
[ ] 7 3 3 3 3 3 3 3 3 3 3 TABLE III. The seven node types of a BT. Ch ≡ children, S ≡ succeeded,
F ≡ failed, R ≡ running. N ≡ # children, S, F ∈ N are node parameters.
Table II demonstrates that many papers disagree in crucial
aspects such as: §2 , §3 , §10 , and §11 . Our framework, which Node Type Symb. Succeeds if Fails if Runs if
builds upon [13], [19] and [20], manages to combine every Root ∅ tree S tree F tree R
aspect considered in Table I except planning integration. Selector ? 1 Ch S N Ch F 1 Ch R
To the best of our knowledge, this is the first paper to Sequence → N Ch S 1 Ch F 1 Ch R
Parallel ⇒ ≥ S Ch S ≥ F Ch F otherwise
achieve such integration. The reader should be aware that
Decorator δ varies varies varies


BTs are, however, not the only alternative for Middle Layer Action n Xn (t) ∈ Sn Xn (t) ∈ Fn Xn (t) ∈ Rn

representation, see [21]–[25]. Lastly, we point out that BTs Condition n Xn (t) ∈ Sn Xn (t) ∈ Fn never
have also been formalized through CSP semantics in [26].
Selector. When a Selector node is enabled, it ticks its B. Action Subsets
children sequentially as long as they continue to return Action nodes rely on three subsets: {Sn , Fn , Rn }, used in
Failure, and until one of them returns Running or Success. If Algorithm 5. Condition nodes rely on two subsets: {Sn , Fn },
the Selector node does not find a running or succeeding child, used in Algorithm 6. We focus on the Action since the
it returns Failure, otherwise it returns Running or Success Condition is simpler. These subsets partition the Action’s
depending on the state of its first non-failing child. state space Xn , where Xn (t) take its values, such that the
Sequence. When a Sequence node is enabled, it ticks following properties hold for Action n and Condition n
its children sequentially as long as they continue to return respectively: Sn ∪Fn ∪Rn ⊇ Xn , and Sn ∪Fn ⊇ Xn . We could
Success, and until one of them returns Running or Failure. If be more restrictive, i.e. Sn ∩ Fn = Sn ∩ Rn = Fn ∩ Rn = ∅.
the Sequence node does not find a running or failing child, However, intersecting subsets allow the use of hysteresis, e.g.
it returns Success, otherwise it returns Running or Failure the threshold for switching from Rn to Sn is not necessarily
depending on the state of its first non-succeeding child. the same as the threshold for switching from Sn to Rn .
Parallel. When a Parallel node is enabled, it ticks all its As an example consider the modular driving BT shown
children sequentially. If the number of succeeding children in Fig. 2, with the corresponding Action subsets portrayed in
is ≥ S, it returns Success. If the number of failing children Fig. 3. A BT being modular means that it can be treated as
is ≥ F , it returns Failure. Otherwise, it returns Running. a stand-alone Action by BTs of higher hierarchy seamlessly.
Decorator. When a Decorator node is enabled, it checks a Modularity is enforced through the logic of the BT (control-
condition on its internal variables, based on which it could flow node placement and subsets definitions). The Root,
tick or not its only child. It applies functions φ1 or φ2 to purposefully omitted in Fig. 2, is always removed before
determine the return state, see Algorithm 11 for the template. composing two BTs to avoid having multiple tick sources.
Action. When an Action node, indexed n, is enabled, it These three Actions, scheduled by the Sequence node,
determines the state value to be returned by checking if try to maintain a proper distance from the next vehicle
its current state space configuration Xn (t) belongs to the using control algorithms γn (Xn (t)) = Un (t). Defining the
Success Sn , Failure Fn or Running Rn subsets. On the third notation return statusaction label
time step , we illustrate the use of
case, it also performs a discrete control step γn : Xn → Un . subsets showing two possible event sequences from the
Condition. When a Condition node is enabled, it behaves perspective of each driver Action (e|n|c): these correspond
like the Action, without the Running subset and control step. to the upper/lower branches, i.e. (se1 →se5 |r1n →sn5 |f1c →sc5 ),
and (se1 → r4e |r1n → f4n |f1c → f4c ) respectively. Qualitatively,
Algorithm 1: Selector Algorithm 2: Sequence both scenarios start with two control steps (r1n , r2n ) executed
1 for i ← 1 to N do 1 for i ← 1 to N do by Normal Driver. On the upper branch, the execution is
2 state ← Tick(child(i)) 2 state ← Tick(child(i)) handed over to Cruise Driver (r3c , r4c ) reaching Success on
3 if state = Running then 3 if state = Running then
4 return Running 4 return Running the fifth control step (sc5 ). On the lower branch, the execution
5 if state = Success then 5 if state = Failure then is handed over to Emergency Driver to take care of an
6 return Success 6 return Failure unexpected situation during two control steps (r3e , r4e ).
7 end 7 end
8 end 8 end
9 return Failure 9 return Success →

Algorithm 3: Parallel Algorithm 4: Main Loop Emergency Normal Cruise


1 for i ← 1 to N do 1 initialize(agent) Driver Driver Driver
2 statei ← Tick(child(i)) 2 BT.parse(agent)
3 end 3 while (active = true) do Fig. 2. A generic autonomous driving behavior, represented using a BT.
4 if nSucc(state)≥ S then 4 state ← Tick(Root)
5 return Success 5 sleep(1/ftick )
6 if nFail(state)≥ F then 6 end Emergency Driver Normal Driver Cruise Driver
7 return Failure 7 BT.delete(agent) Distance [m] Distance [m] Distance [m]
8 else 8 return 0 4 4 4
9 return Running sn4 sn r4c sc
se3 se5 sn3 5 r3c 5
10 end se4
2 se 2 rn 2 fc
2 2 2
Algorithm 5: Action Algorithm 6: Condition r3e f3n f3c
se1 r4e r1n f4n f1c f4c
1 if Xn (t) ∈ Sn then 1 if Xn (t) ∈ Sn then
2 return Success 2 return Success 0 40 80 0 40 80 0 40 80
3 if Xn (t) ∈ Fn then 3 if Xn (t) ∈ Fn then Speed [km/h] Speed [km/h] Speed [km/h]
4 return Failure 4 return Failure
5 if Xn (t) ∈ Rn then 5 end Fig. 3. Succeeding (S, green), Failing (F, red), and Running (R, yellow)
6 Un (t) ← γn (Xn (t)) subsets. The dots (blue) represent X(t) ∈ X for t = 1/ftick , . . . , 5/ftick .
7 return Running Algorithm 7: Root The solid arrows / dashed arrows represent transitions caused by the
8 end Action’s controller / system dynamics or other controllers respectively.
1 return Tick(child(0))
IV. T HE N ODE∗ E XTENSION V. T HE D ECORATOR∼ E XTENSION
The BT node algorithms presented so far are insufficient The BT node algorithms presented so far are insufficient
to represent plans where it is necessary to “remember” if an to represent plans where two or more agents must undertake
Action / Condition / sub-tree has already succeeded or failed. a common task jointly by synchronizing parts of their BTs.
A. Node∗ Motivation A. Decorator∼ Motivation
Let us consider a Sequence node with two fully-actuated2 To control multiple agents using BTs, we have two
Actions whose subsets, X1 and X2 , are represented in Fig. 4. choices: to have one big BT containing the Actions of all the
Under these assumptions, if X1 ∩X2 6= ∅ there is at least one agents, or to have separated BTs running the Actions of each
variable controlled by both Actions. Depending on the subset agent independently from one another. The first solution has
definitions, this could yield unsatisfiable BTs, e.g. Action 2 the advantage that all the agents can be synchronized inside
executes r3A2 ; decreasing X[1] so that X(t) ∈ / S1 anymore, the same structure, however it is clear that for big groups of
Action 1 then executes r4A1 , r2A1 ; causing an endless cycle. agents it turns unmanageable. The second solution has the
advantage that BTs are much smaller, easier to understand
X1 [2] = x21 Action 1 X2 [2] = x22 Action 2
and expand, however it is not obvious how these independent
r1A1 r2A1 A r1A2 r2A2 A trees can be synchronized to achieve cooperative behaviors.
A1
r4 s3 1 A2
r4 r3 2 We discard the first solution as it is unfeasible for our
purposes, focusing on the second scenario which can be dealt
with using two approaches. First, making new control-flow
nodes with the synchronization capability. Second, making a
X1 [1] = x11 X2 [1] = x12 special Decorator node with the sole purpose of providing
its sub-tree with the synchronization capability. We favor the
Fig. 4. Subsets of two Actions demonstrating how cycles appear. For second because it allows the multi-agent features to be kept
simplicity we let X1 = X2 , but normally it is enough that X1 ∩ X2 6= ∅.
aside from the execution logic, this permits non-cooperative
In general, the traditional BT algorithms have the fol- behaviors to become cooperative by merely placing them
lowing limitations: in a Sequence node, for an arbitrary j- under the synchronization Decorator defined below.
indexed child Action Aj to be ticked at time tk it needs to
B. Decorator∼ Extended Algorithm
happen that {X1 (tk ) ∈ S1 ∧ . . . ∧ Xj−1 (tk ) ∈ Sj−1 }tk ∈ R+ .
Similarly, in a Selector node, for an arbitrary j-indexed child When the Decorator∼ is enabled, it broadcasts the agent’s
Action Aj to be ticked at time tk it needs to happen that name (determined contextually) to the other Decorator∼
{X1 (tk ) ∈ F1 ∧ . . . ∧ Xj−1 (tk ) ∈ Fj−1 }tk ∈ R+ . nodes of the same cooperative task, indicating them that it
This could be desirable in some cases where we need to is ready to engage as soon as there are enough agents nreq
guarantee that a certain property holds over a set of Actions, to trigger the sub-tree. In most cases, the tick is received
but in other situations it is necessary to “remember” which by this node when the other agents are busy performing
nodes have already returned Success or Failure, in order to higher priority tasks of their BTs (nnow < nreq ), in these
not tick (check) them again on the next iteration. The Parallel cases the Decorator∼ will return without ticking its sub-tree.
node does not have this problem, hence no Parallel∗ exists. Eventually, enough agents will be available to engage in the
cooperation (nnow ≥ nreq ), at this point the barrier imposed
B. Node∗ Extended Algorithms
to the ticks by the synchronization Decorator∼ will be
Rather than tracking and storing which children have temporarily3 removed allowing the sub-tree to be executed.
returned Success or Failure, we use a variable that points Naturally, this requires a software infrastructure, like ROS,
to the child that has most recently returned Running. This capable of handling message passing, and a mechanism that
variable is reset every time the Selector or Sequence returns allows each Decorator∼ to keep track of which and how
a terminal state (Success or Failure). See Algorithms 8, 9. many agents have broadcasted their messages. Time stamps
Algorithm 8: Selector∗ Algorithm 9: Sequence∗ are used to ensure that such messages were broadcasted
1 for i ← run-index to N do 1 for i ← run-index to N do recently enough to be valid. See Algorithms 10, 11.
2 state ← Tick(child(i)) 2 state ← Tick(child(i))
3 if state = Running then 3 if state = Running then Algorithm 10: Decorator∼ Algorithm 11: Decorator
4 run-index ← i 4 run-index ← i 1 Broadcast(agent, ID) 1 PreFunc(vars)
5 return Running 5 return Running 2 if nnow ≥ nreq then 2 if Condition(vars) then
6 if state = Success then 6 if state = Failure then 3 statet ← Tick(child) 3 statet ← Tick(child)
7 run-index ← 1 7 run-index ← 1 state ← statet state ← φ1 (vars, statet )
8 return Success 8 return Failure 4 else 4 else
9 end 9 end 5 state ← NotTicked 5 state ← φ2 (vars)
10 end 10 end 6 end 6 end
11 run-index ← 1 11 run-index ← 1 7 Update(agents, nnow ) 7 PostFunc(vars, state)
12 return Failure 12 return Success 8 return state 8 return state

2 The function µ : U → X is bijective (one to one correspondence ∀u, x). 3 The barrier will block again if nnow < nreq at any point in time.
VI. C ONTROLLED H YBRID DYNAMICAL S YSTEMS VII. E QUIVALENCE
Following the definitions of [15]–[17]: a CHDS, shown in We show that every CHDS using a specific jump policy
Fig. 5, is an indexed collection of Controlled Dynamical Sys- can be represented with a BT, and every BT composed only
tems (CDS) and a mechanism for switching between them of certain node types can be represented with a CHDS.
whenever the hybrid state satisfies certain conditions and the A. From CHDSs To BTs
control dictates so. More formally, a CHDS H is defined4 as
follows H = (Q, Xq , Uq , Aq , Eq , Iq , Cq , Dq , S0 )q∈Q , with: To prove that any CHDS has an equivalent BT it suffices
to show that the trajectories both systems produce, when
Q discrete state space Q = {qi | i ∈ {1, . . . , |Q|}}
confronted with the same environment (for all possible
Xq continuous state space Xq = {xjq | j ∈ {1, . . . , |Xq |}} environments and initial conditions), are identical. Naturally,
Uq control signal space Uq = {ukq | k ∈ {1, . . . , |Uq |}} a CHDS has continuous dynamics which are impossible to
Aq edge label set Aq = {aqq̄ | q̄ ∈ Nq } mimic using a discrete-time structure like a BT. For this
Nq directed neighborhood of q (q̄ ∈ Nq =⇒ 6 q ∈ Nq̄ ) reason, we define a continuous-time BT as a regular BT
Eq edge set, each edge is Eq,q̄ = (q, q̄, aqq̄ , Gqq̄ , Jqq̄ )q̄∈Nq which has: infinite tick frequency, zero tick propagation time,
q, q̄ initial and final discrete states (q, q̄) ∈ (Q, Nq ) and zero execution time for Actions and Conditions.
aqq̄ edge label connecting (q, q̄) with aqq̄ ∈ Aq Additionally, assuming that the BT initializes Q(t) to the
Gqq̄ edge guard enabled if Xq (t) ∈ Gqq̄ ⊆ Xq initial state of the CHDS, and the CHDS uses a sequential
Jqq̄ state jump sets Xq (t) → Xq̄ (t) ∈ Jqq̄ ⊆ Xq ×Xq̄ prioritized jump policy aj . It is straightforward to check
that for an infinitesimal time window dt, the BT shown
Iq location invariant Xq (t) ∈ Iq ∀t ∈ R, ∀q ∈ Q
in Fig. 6, in continuous-time, produces the same control as
Cq control algorithms Uq (t) = Γq (Xq (t)) the CHDS shown in Fig. 5. Since this holds true for any
Dq system dynamics Ẋq (t) = ∆q (Xq (t), Uq (t)) infinitesimal time window, it follows that the trajectories are
St hybrid state {Q(t), Xq (t), Aq (t), Uq (t)} also identical. More complex jump policies are not covered
S0 initial state {Q(0), Xq (0), Aq (0), Uq (0)} but could possibly be reproduced depending on the case.
Lastly, consider a CHDS where the continuous dynamics
Xq (t) ∈ Iq Xq̄ (t) ∈ Iq̄ are discretized using a finite sampling frequency fCHDS , and
Uq = Γq (Xq ) Uq̄ = Γq̄ (Xq̄ ) a BT which satisfies all the properties of continuous-time BTs
Ẋq = ∆q (Xq , Uq ) Ẋq̄ = ∆q̄ (Xq̄ , Uq̄ ) except for the tick frequency ftick which in this case is finite.
Gqq aqq̄ Jq̄q̄
Gqq̄ Jqq̄ Under the assumption that fCHDS = ftick , and following a
aqq q q̄ aq̄q̄ similar reasoning, it follows that the discretized trajectories
Jq̄q Gq̄q
Jqq aq̄q Gq̄q̄ are equal because they are sampled / executed synchronously.


··· ··· ··· ···
{q ∈ Nq \q̄}{q 0 |q ∈ Nq0 \q̄} {q 0 ∈ Nq̄ \q}{q 0 |q̄ ∈ Nq0 \q}
0

?
Fig. 5. Generic CHDS H, showing two connected states q and q̄.
→ · · · ∀q ∈ Q
Consider a continuous trajectory (q, δq , Xq (t), Uq (t)) as-
δ Forcesuccess
sociated to the discrete state q with a non-negative time
St .Q(t) = q U (t) ← Γq (X(t)) ?
δq (duration of the continuous trajectory), a piecewise con- Ẋ(t) ← ∆q (X(t), U (t))
St .X(t) ∈ Iq
tinuous function Uq (t) : [0, δq ] → Uq , and a continuous · · · ∀q̄ ∈ Nq

piecewise differentiable function Xq (t) : [0, δq ] → Xq , such
δ Forcesuccess
that Xq (t) ∈ Iq ∀t ∈ (0, δq ) and ∆(Xq (t), Uq (t)) =
Ẋq (t) ∀t ∈ (0, δq ) except for the points of discontinuity. St .X(t) ∈ Gqq̄ St .Q(t) ← q̄
St .X(t) ← Jqq̄ (X(t))
The trajectory (solution / run) of a CHDS is a (possibly in-
finite) sequence of continuous trajectories chained together: Fig. 6. Equivalent BT to the generic CHDS presented in Fig. 5.
a a
(q 0 , δq0 , Xq0 (t), Uq0 (t)) →0 (q 1 , δq1 , Xq1 (t), Uq1 (t)) →1 · · · ,
such that at the event times where transitions occur t0 , t1 , . . ., This BT mimics the corresponding CHDS as follows: there
defined as: t0 = δq0 , t1 = δq0 + δq1 , t2 = δq0 + δq1 + δq2 , . . ., are |Q| branches departing from the first Selector which
aj
the following inclusions hold for the discrete transitions →: account for the checks that need to be performed in order to
Xqj (tj ) ∈ Gqj qj+1 and (Xqj (tj ), Xqj+1 (tj )) ∈ Jqj qj+1 for determine which discrete state is currently being executed.
all j = 0, 1, . . . , ∞. Where q j is the j-th state q taking The transitions to other discrete states are covered by the
place, to which one associates the symbol aj ≡ aqj qj+1 , that |Nq | branches departing from the second Selector, these
represents the jump policy signal at the j-th state transition. include both the CHDS guards G and the jumps J . The
4I : Q → I ⊆ X , G CHDS-BT equivalence is not unique, to prove it we notice
q q q q q̄ : Q → Gq q̄ ⊆ Xq , Jq q̄ : Q×Xq → Q×Xq̄ ,
∆q : Q × Xq × Uq → Ẋq , Γq : Q × Xq → Uq , Q(t) ∈ Q, Xq (t) ∈ Xq , that any re-ordering of the BT nodes in Fig. 6, that preserves
Aq (t) ∈ Aq , Uq (t) ∈ Uq , St ∈ Q × Xq × Aq × Uq , S0 = S(t0 ). the underlying logic and jump policy, is still a valid BT.
B. From BTs To CHDSs Inside the CHDS discrete states, which correspond to BT
The inclusion of Decorator, Selector∗ , Sequence∗ , and Actions / sub-trees5 (A1 : AN ), we place the corresponding
Parallel nodes precludes the translation because CHDSs do controllers γn presented in Algorithm 5. From this point it is
not support certain features that those nodes bring about. The straightforward to see that under the input / output mindset,
functionality missing from CHDSs to mimic the Selector∗ the CHDS of a BT node primitive can be embedded as an
and Sequence∗ nodes is being able to rewire (on run-time) the Action An in a CHDS of higher hierarchy. Performing this
edges between discrete states; this involves dynamic edges. procedure recursively turns out to be equivalent to translating
The functionality missing from CHDSs to mimic the Parallel the BT to a CHDS starting from the leaves, and following this
node is being able to execute multiple discrete states simulta- embedding procedure recursively until the Root is reached.
neously; this involves multi-valued initial states. Decorators VIII. ROS I MPLEMENTATION
are to be dealt with on a case-by-case basis.
We propose a ROS implementation of the BT framework,
Any BT consisting only of a Root, Selectors, Sequences,
the behavior-trees library, designed abiding by the Google
Actions, and Conditions has an equivalent CHDS representa-
C++ Style Guide with the following compromises:
tion. To show the equivalence we write the CHDS version of
Compatibility with existing ROS libraries, making it fast
a generic Selector / Sequence (with N children), and a Root.
and easy to incorporate into new robotic platforms.
Then, we use the fact that BTs are constructed using these
Simplicity of the code, allowing other programmers to
basic building blocks (node primitives), embedding them one
understand, expand, maintain, and reuse the code.
inside the other, to justify making the following assertion: Efficiency of processing power, providing the complete
finding an equivalent CHDS representation for a set of BT functionality of BTs, using a low amount of resources.
node primitives is a sufficient condition to guarantee that any The implementation consists of three main parts: the
BT built using only this set can be represented with a CHDS. Client, where the BT is held, the Server, where the Action
X1 (t) ∈ F1 X2 (t) ∈ F2 XN−1 (t)∈ FN−1 and Condition algorithms are held, and the Communication,
start A1 A2 ··· AN where the actionlib ROS library provides the connection be-
X1 (t) ∈ S1 X2 (t) ∈ S2
XN (t) ∈ SN tween the first two. In the following paragraphs we describe
S
X1 (t) ∈ R1
X2 (t) ∈ R2
how these parts, represented in Fig. 10, work together.
XN (t) ∈ RN
R
XN (t) ∈ FN Client Communication Server
F

Fig. 7. Selector node with N Actions / sub-trees represented as a CHDS.


Execution Main Main Execution

X1 (t) ∈ S1 X2 (t) ∈ S2 XN−1 (t)∈ SN−1 Parser Interface Publisher Goal


start A1 A2 ··· AN
X1 (t) ∈ F1 X2 (t) ∈ F2
XN (t) ∈ FN
F Fig. 10. Diagram showing the client-server communication.
X1 (t) ∈ R1
X2 (t) ∈ R2
XN (t) ∈ RN
R
A. Client
XN (t) ∈ SN
S This part is a ROS module that contains the BT control-
flow nodes, and the clients of the Action and Condition nodes,
Fig. 8. Sequence node with N Actions / sub-trees represented as a CHDS.
i.e. the BT logic that does not change between applications. It
is composed of three parts: Parser, Interface, and Execution.
t≥ 1 a) Parser: Reads the BT specification from a file and
ftick
start A1 ∅ generates the node objects dynamically. The nodes are linked
X1 (t) ∈ F1 t=0
with each other according to the tree structure. The parent
X1 (t) ∈ R1
F t=0 only stores a pointer to his first child and refers to the other
X1 (t) ∈ S1 R t=0 children using pointers between adjacent brothers.
b) Interface: Displays the BT with the node states in
S real-time using a lightweight OpenGL based interface. It
allows the user to navigate through the tree, and override
Fig. 9. Root node with 1 Action / sub-tree represented as a CHDS.
the node states for simulation or debugging purposes.
c) Execution: Generates a new tick at the Root of the
In these three CHDSs: Fig. 7, Fig. 8, and Fig. 9, the start is
BT with a fixed frequency. It updates the state of each
to be thought of as an input (where the tick comes from), and
node by propagating the tick through the branches using the
the three small colored circles labeled ‘S’, ‘R’, and ‘F’ are
algorithms explained in Section III-A. When the tick reaches
to be thought of as outputs (where the tick is returned), none
a leaf node, an actionlib message is sent to signal the server.
of which are discrete states, they are symbolic and merely
used to wire arcs when composing the CHDS of the BT. 5 We ignored Conditions because they are simpler versions of Actions.
B. Server IX. S YSTEM D EMONSTRATION
Every leaf node of the BT is linked to a corresponding To show the potential of BTs and the usability of our
ROS server module. These contain the functionality of the library, we implemented in ROS a grasping task using a NAO
application’s actuation and sensing algorithms running at the humanoid robot from Aldebaran Robotics, see Fig. 12.
control frequency fcontrol referenced earlier. Each server is A. The Mission
composed of the three parts: Goal, Publisher, and Execution.
a) Goal: Receives the actionlib messages, that are sent We define a scenario where the robot stands up, walks
from the client side, each time a tick reaches a leaf node. towards a table, and attempts three different grasps on an
It updates internal variables, such as the “elapsed time since object until one of them succeeds or all three fail. If a grasp
last message was received”, to determine if the control succeeds: the robot informs the user, releases the object,
algorithms should be: started, resumed, or stopped. turns 180 degrees, and returns to the starting position. If all
b) Publisher: Contains functions to send state updates grasps fail, the robot informs the user, but does not return to
to the client side, so that the BT node states are kept the starting position. In both cases, whether the grasp was
synchronized with the actual state of the controllers. The successful or not, the robot sits down and disables its motors.
user must define the succeeding Sn , failing Fn , and running To improve the safety of the robot behavior, we include
Rn subsets for each control algorithm to be implemented. fallback handling for motor temperature and falls. This
c) Execution: Runs the main loop of the controller on means that at any point in the execution of the program, if
a separate thread to allow asynchronous actionlib message the robot detects either of these conditions, it automatically
reception. It periodically checks the “elapsed time since last disables the current node and enables the proper one to
message was received” in order to destroy the thread if no handle the situation. For instance, if the robot detects a
tick has reached the client node for a certain amount of time. high motor temperature, it sits down and disables the motors
in order to prevent overheating. Additionally, if the robot
C. Communication detects a fall, it attempts to stand up before continuing.
The client and server parts are connected using actionlib; B. Behavior Tree Representation
the most widely used ROS library. It functions under a non-
The BT of this task is shown in Fig. 11, and features two
supervised goal achieving scheme, where the client sends a
characteristics that, in general, make BTs powerful tools:
goal to the server and waits for it to either succeed or fail.
a) Flexibility: It is easy to extend the behavior by
Clearly, actionlib does not work with the same paradigm
adding or removing nodes without modifying the structure
as the BTs, but it provides a valuable framework of client-
of the tree. For instance, to include detection and proper
server communication in ROS. We took advantage of this to
handling of low battery levels, it suffices to add the dashed
provide our behavior-trees library with the set of callbacks
Condition in Fig. 11. In contrast, adding or removing a node
that allows it to schedule in time the different nodes of a BT.
from a CHDS, could potentially involve wiring 2N +1 arcs
Among these functions6 we highlight the following:
(N/N arcs going to / departing from the node + 1 self-loop).
DoneCB called upon goal completion. b) Modularity: It is possible to encapsulate behaviors
ActiveCB called upon goal acceptance. as sub-trees, in order to append them to a tree with a higher
FeedbackCB called upon feedback publishing. hierarchy. To do this, the sub-tree to be appended needs to
GoalCB called upon new goal reception. be modular, which informally means it never returns Success
PreemptCB called upon goal preemption. or Failure until it has actually finished executing its goal.
ExecuteCB called periodically if the node is active.

D. Implementation Limitations
?
There are two main limitations: the first cannot be avoided
due to the nature of BTs, the second has a small workaround: ?

1) BTs operate by calling a function from inside an-


→ →
other function in a recursive manner following the
Algorithms 1–11. Computationally, this could produce ? Sit Disable
Fall
Stand
Down Motors Up
stack overflow for huge trees, even for implementations Detected →

like ours that separate the control algorithms from the Hot Low

Motors Battery ∗
→ →

?
execution logic using clients and servers.
2) For each tick that is sent to traverse the BT, a large Stand Walk

∗ Grasp Speak Sit Disable
Up Forward Failed Status Down Motors
number of checks has to be performed over the state
∗ ∗
spaces of the Actions in the tree. Our implementation ? →

overcomes this problem by performing both calcu- Right Hand Left Hand Both Hands Grasp Release Turn Walk
Grasp Grasp Grasp Succeed Ball Around Forward
lations asynchronously, thereby preferring to get a
delayed state update than blocking the tick flow.
Fig. 11. BT representation of the grasping task implemented on the NAO.
6 The callbacks DoneCB, ActiveCB, and FeedbackCB belong to the client. We entirely disregard the automatic plan generation, i.e. Top Layer of Fig. 1.
The callbacks GoalCB, PreemptCB, and ExecuteCB belong to the server.
R EFERENCES
[1] R. Palma, P. A. González-Calero, M. A. Gómez-Martı́n, and
P. P. Gómez-Martı́n, “Extending Case-Based Planning with Behavior
Trees,” in FLAIRS Conference. AAAI Press, 2011.
[2] C.-U. Lim, R. Baumgarten, and S. Colton, “Evolving Behaviour Trees
for the Commercial Game DEFCON,” in Applications of Evolutionary
Computation. Springer, 2010, vol. 6024, pp. 100–110.
(a) Stand (b) Walk (c) Right (d) Left (e) Both (f) Release [3] A. Johansson and P. Dell’Acqua, “Emotional Behavior Trees,” in
Computational Intelligence and Games, 2012 IEEE Conference on.
IEEE, 2012, pp. 355–362.
[4] J. Tremblay, “Understanding and Evaluating Behaviour Trees,” McGill
University, Modelling, Simulation and Design Lab, Tech. Rep., 2012.
[5] S. Delmer, “Behavior Trees for Hierarchical RTS AI,” Diploma Thesis,
Southern Methodist University (SMU), 2012.
[6] G. Flórez-Puga, M. A. Gómez-Martı́n, P. P. Gómez-Martı́n, B. Dı́az-
(g) Stand (h) Walk (i) Right (j) Left (k) Both (l) Failed Agudo, and P. A. González-Calero, “Query-Enabled Behavior Trees,”
Computational Intelligence and AI in Games, IEEE Transactions on,
Fig. 12. Demonstration 1: successful ball grasp depicted in 12(a)–12(f). vol. 1, no. 4, pp. 298–308, 2009.
Demonstration 2: failed bottle grasp depicted in 12(g)–12(l). [7] N. Yatapanage, K. Winter, and S. Zafar, “Slicing Behavior Tree
Models for Verification,” in Theoretical Computer Science, ser. IFIP
Advances in Information and Communication Technology. Springer,
2010, vol. 323, pp. 125–139.
X. C ONCLUSIONS [8] A. Shoulson, F. M. Garcia, M. Jones, R. Mead, and N. I. Badler,
“Parameterizing Behavior Trees,” in Motion in Games, ser. Lecture
We presented an algorithmic BT framework which is Notes in Computer Science. Springer, 2011, vol. 7060, pp. 144–155.
substantially more accurate and compact than previous de- [9] R. Colvin, L. Grunske, and K. Winter, “Probabilistic Timed Behavior
scriptions. We provided equivalence notions between BTs Trees,” in Integrated Formal Methods, ser. Lecture Notes in Computer
Science. Springer, 2007, vol. 4591, pp. 156–175.
and CHDSs which gave us insight about the advantages of [10] M. Cutumisu and D. Szafron, “An Architecture for Game Behavior
each framework: using BTs we lose the ability to be in a AI: Behavior Multi-Queues,” in AIIDE. The AAAI Press, 2009.
certain state, but we achieve modularity. A BT is modular if [11] D. Perez, M. Nicolau, M. O’Neill, and A. Brabazon, “Evolving
Behaviour Trees for the Mario AI Competition Using Grammatical
its control-flow nodes and the subsets are laid out in such a Evolution,” in Applications of Evolutionary Computation, ser. Lecture
way that the goal represented succeeds or fails in finite time, Notes in Computer Science. Springer, 2011, vol. 6624, pp. 123–132.
and its Root receives Running for all intermediate states. [12] D. Becroft, J. Bassett, A. Mejı́a, C. Rich, and C. L. Sidner, “AIPaint:
A Sketch-Based Behavior Tree Authoring Tool,” in AIIDE, 2011.
We introduced the Action and Condition subsets, which [13] P. Ögren, “Increasing Modularity of UAV Control Systems using
allowed us to formalize and motivate two extensions to Computer Game Behavior Trees,” in AIAA Guidance, Navigation, and
the basic BT functionality, thereby avoiding the use of ad- Control Conference. AIAA, 2012.
[14] J. A. Bondy and U. S. R. Murty, Graph Theory with Applications.
hoc solutions. The robotic platform tests and theoretical Macmillan London, 1976, vol. 290.
analysis allowed us to conclude that BTs can replace certain [15] A. J. van der Schaft and J. M. Schumacher, An Introduction to Hybrid
CHDSs without losing accuracy, descriptive power, or human Dynamical Systems. Springer, 2000, vol. 251.
[16] M. S. Branicky, “Introduction to Hybrid Systems,” in Handbook of
readability. Moreover, BTs have a larger set of representable Networked and Embedded Control Systems, ser. Control Engineering.
plans because nodes such as the Parallel, Selector∗ , or Se- Springer, 2005, pp. 91–116.
quence∗ do not have a corresponding CHDS representation. [17] M. Branicky, “Studies in Hybrid Systems: Modeling, Analysis, and
Control,” Ph.D. dissertation, Cambridge, MA, USA, 1995.
While not explicitly demonstrated here, we believe that the [18] Willow Garage. (2012) Robot Operating System (ROS). Groovy.
flexibility and modularity of BTs, make them a good candi- [Online]. Available: http://www.ros.org/wiki/
date for representing Middle Layer plans that are created and [19] M. Colledanchise, A. Marzinotto, and P. Ögren, “Performance Analy-
sis of Stochastic Behavior Trees,” in Robotics and Automation (ICRA),
maintained automatically by high-level AI algorithms. We 2014 IEEE International Conference on, June 2014.
acknowledge that the current demonstration merely shows [20] J. A. D. Bagnell, F. Cavalcanti, L. Cui, T. Galluzzo, M. Hebert,
open-loop action scheduling through BTs, and we plan to M. Kazemi, M. Klingensmith, J. Libby, T. Y. Liu, N. Pollard,
M. Pivtoraiko, J.-S. Valois, and R. Zhu, “An Integrated System
address this by upgrading the system to encompass closed- for Autonomous Robotics Manipulation,” in IEEE/RSJ International
loop controllers. Lastly, we plan to analyze the relation Conference on Intelligent Robots and Systems, 2012, pp. 2955–2962.
between our framework parameters and its functionality. [21] R. A. Brooks, “A Robust Layered Control System for a Mobile Robot,”
IEEE Journal of Robotics and Automation, vol. 2, 1986.
[22] R. C. Arkin, An Behavior-based Robotics. Cambridge, MA, USA:
ACKNOWLEDGMENT MIT Press, 1998.
This work has been supported by the Swedish Research [23] J. Rosenblatt, “DAMN: A Distributed Architecture for Mobile Navi-
gation,” Ph.D. dissertation, Robotics Institute, Carnegie Mellon Uni-
Council (VR) and the European Union Project RECONFIG, versity, Pittsburgh, PA, 1997.
www.reconfig.eu (FP7-ICT-2011-9, Project Number: [24] J. Bohren, R. B. Rusu, E. G. Jones, E. Marder-Eppstein, C. Pantofaru,
600825), the authors gratefully acknowledge the support. M. Wise, L. Mosenlechner, W. Meeussen, and S. Holzer, “Towards
Autonomous Robotic Butlers: Lessons Learned with the PR2,” in
We also thank: Isak Tjernberg for sharing his experience Robotics and Automation (ICRA), 2011 IEEE International Conference
with the NAO platform, Xavi Gratal for providing his pro- on. IEEE, 2011, pp. 5568–5575.
gramming expertise, and Francisco Viña for sharing his ROS [25] H. Nguyen, M. Ciocarlie, K. Hsiao, and C. Kemp, “ROS Commander:
Flexible Behavior Creation for Home Robots,” in ICRA, 2013.
knowledge. Lastly we thank: Danica Kragic, Alessandro [26] R. J. Colvin and I. J. Hayes, “A semantics for Behavior Trees using
Pieropan, Johannes Stork, Jana Tumova, and Yuquan Wang CSP with specification commands,” Science of Computer Program-
for the useful critical reviews provided for this paper. ming, vol. 76, no. 10, pp. 891–914, 2011.

You might also like