Towards Informed Swarm Verification
Anton Wijs⋆
Eindhoven University of Technology, 5612 AZ Eindhoven, The Netherlands
[email protected]
Abstract. In this paper, we propose a new method to perform large
scale grid model checking. A manager distributes the workload over many
embarrassingly parallel jobs. Little communication is needed between a worker and the manager, and only when the worker is ready for
more work. The novelty here is that the individual jobs together form
a so-called cumulatively exhaustive set, meaning that even though each
job explores only a part of the state space, together, the tasks explore
all states reachable from the initial state.
Keywords: parallel model checking, state space exploration.
1 Introduction
In (explicit-state) model checking (MC), the truth-value of a logical statement
about a system specification, i.e. design (or directly software code), is checked by
exploring all its potential behaviour, implicitly described by that specification,
as a directed graph, or state space. A flawed specification includes undesired
behaviour, which is represented by a trace through the corresponding state space.
With MC, we can find such a trace, and report it to the developers. To show
flaw (bug) absence, full exploration of the state space is crucial. However, in
order to explore a state space at once, it needs to be stored in the computer’s
main memory, and often, state spaces are too large, possibly including billions
of states. A secondary point of concern was raised in [22,23]: as the amount of
available main memory gets bigger, it becomes technically possible to explore
large state spaces using existing sequential, i.e. single-processor, techniques, but
the time needed to do so is practically too long. Therefore new techniques are
needed, which can exploit multi-core processors and grid architectures.
We envision an ’MC@Home’, similar to SETI@Home [34], where machines in a
network or grid can contribute to solving a computationally demanding problem.
In many application areas, this is very effective. BOINC [8] has about 585,000
computers processing around 2.7 petaFLOPS, topping the current fastest supercomputer (IBM Roadrunner with 1.026 PFLOPS). However, flexible grid MC
does not exist yet; current distributed MC methods, in which multiple machines
are employed for a single MC task, need lots of synchronisation between the
⋆ Supported by the Netherlands Organisation for Scientific Research (NWO) project 612.063.816 Efficient Multi-Core Model Checking.
computers (‘workers’), which is a serious bottleneck. MC is computationally expensive, and cannot obviously be distributed over so-called embarrassingly
parallel [14] processes, i.e. processes which do not synchronise with each other.
In this paper, we propose a method to divide a state space reachability task
into multiple smaller, embarrassingly parallel, subtasks. The penalty for doing
so is that some parts of the state space may be explored multiple times, but, as
noted by [22], this is probably unavoidable, and not that important, if enough
processing power is available. What sets the method which we present in this
paper apart from previous ones is that we distribute the work over a so-called
cumulatively exhaustive set (Ces) of search instructions: each individual
instruction yields a strictly non-exhaustive search, in which a strict subset of the
set of reachable states is explored, hence less time and memory is needed, while
it is also guaranteed that the searches yielded by all instructions together cover
the whole state space. This is novel, since partial (or non-exhaustive) searches,
such as random walk [40] and beam search [28,38,41] are typically very useful to
detect bugs quickly, but cannot provide a guarantee of bug-absence. In our case,
if all searches instructed by the (finitely-sized) Ces cannot uncover a bug, we
can conclude that the state space is bug-free. We believe that a suitable method
for large scale grid MC must be efficient both memory-wise and time-wise; distributed MC techniques tend to scale very well memory-wise, but disappointingly
time-wise, while papers on multi-core MC, in which multiple cores on a single
processor are employed for a single MC task, tend to focus entirely on speedup,
while assuming that the state space fits entirely in the main memory of a single machine. Therefore, we wish to focus more on the combination of time and
memory improvements. Our approach is built on the observation that state space
explosion is often due to the fact that a system specification is defined as a set of
processes in parallel composition, while those processes in isolation do not yield
large state spaces. The only serious requirement which systems must meet for
our method to be applicable at the moment is that at least one process in the
specification yields finite, i.e. cycle-free, behaviour; we can enforce this by performing a bounded analysis on a process yielding infinite behaviour, but then,
we do not know a priori whether the swarm will be cumulatively exhaustive.
The structure of the paper is as follows: in the next Section, related work
is discussed. In Section 3, preliminary notions are explained. Then, Section 4
contains a discussion on directed search techniques. After that, in Section 5, we
explain the basics of our method for systems with independent parallel processes.
A more challenging setup with synchronising parallel processes, together with
our algorithms, is presented in Section 6. Then, experimental results are given
in Section 7, and finally, conclusions and future work appear in Section 8.
2 Related Work
Concerning the state space explosion problem, over the years, many techniques
have been developed to make explicit-state MC tasks less demanding. Prominent
examples are reduction techniques like partial order reduction [31], and directed
MC [12], which covers the whole range of state space exploration algorithms.
Some of these use heuristics to find bugs quickly, but if these are inaccurate, or
bugs are absent, they have no effect on the time and memory requirements.
In distributed algorithms such as e.g. in [3,4,5,6,10,15,27,32], multiple workers
in a cluster or grid work together to perform an MC task. This has the advantage
that more memory is available; in practice, though, the techniques do not scale
as well as desired. Since the workers need to synchronise data quite frequently,
for very large state spaces, the time spent on synchronisation tends to be longer
than the time spent on the actual task. Furthermore, if one of the workers is
considerably slower than the others or fails entirely, this has a direct effect on
the whole process. Another development is multi-core MC. For some years now,
Moore's Law no longer holds, meaning that the speed of new processors does
not double every two years anymore. Instead, new computers are equipped with
a growing number of processor cores. For MC, this means that in order to speed up the computations, the available algorithms must be adapted. In multi-core MC, we can exploit that the workers share memory. Major achievements are
reported in e.g. [1,20,21,26]. [26] demonstrates a significant speedup in a multi-core breadth-first search (Bfs) using a lock-free hash table. However, papers
on multi-core MC tend to focus on reducing the time requirements, and it is
assumed that the entire state space fits in the main memory of a single machine.
A major step towards efficient grid MC was made with Swarm Verification
(SV) [22,23,24] and Parallel Randomized State Space Search [11,35], which involve embarrassingly parallel explorations. They require little synchronisation,
and have been very successful in finding bugs in large state spaces quickly. Bug
absence, though, still takes as much time and memory to detect as a traditional, sequential search, since the individual workers are unaware of each other’s
work, and each worker is not restricted to a specific part of the state space. The
method we propose is based on SV, and since each worker uses particular information about the specification to guide the search, we call it informed SV
(ISV), relating it to informed search techniques in directed MC. Similar ideas
appear in related work: in [27], it is proposed to distribute work based on the behaviour of a single process. The workers are not embarrassingly parallel, though.
A technique to restrict analysis of a program based on a given trace of events is
presented in [16]. It has similarities with ISV, but also many differences; their
technique performs slicing on a deterministic C program, and is not designed
for parallel MC, whereas ISV distributes work to analyse concurrent behaviour
of multiple processes based on the behaviour of a subsystem. Finally, a similar
approach appears in [36], but there, it is applied on symbolic execution trees
to generate test cases for software testing. Unlike ISV, they distribute the work
based on a (shallow) bounded analysis of the whole system behaviour.
3 Preliminaries
Labelled Transition Systems. Labelled transition systems (Ltss) capture the operational behaviour of concurrent systems. An Lts consists of transitions s −ℓ→ s′, meaning that being in a state s, an action ℓ can be executed, after which a state s′ is reached. In model checking, a system specification, written in a modelling language, has a corresponding Lts, defined by the structural operational semantics of that language.
Definition 1. A labelled transition system (Lts) is a tuple M = (S, A, T, sin), where S is a set of states, A a set of actions or transition labels, T a transition relation, and sin the initial state. A transition (s, ℓ, s′) ∈ T is denoted by s −ℓ→ s′.
A sequence of labels σ = ℓ1, ℓ2, . . . , ℓn, with n > 0, describes a sequence of events relating to a trace in an Lts, starting at sin, with matching labels, i.e. it maps to traces in the Lts with s0, . . . , sn ∈ S, ℓ1, . . . , ℓn ∈ A, with s0 = sin, such that s0 −ℓ1→ s1 −ℓ2→ · · · −ℓn→ sn. Note that σ maps to a single trace iff the Lts is label-deterministic, i.e. for all s ∈ S, if there exist s −ℓ′→ s′ and s −ℓ′′→ s′′ with s′ ≠ s′′, then also ℓ′ ≠ ℓ′′. If the Lts is not label-deterministic, then σ may describe a set of traces. In this paper, we assume that Ltss are label-deterministic, but this is not strictly required. The set of enabled transitions restricted to a set of labels A ⊆ A in state s of Lts M is defined as en_M(s, A) = {t ∈ T | ∃s′ ∈ S, ℓ ∈ A. t = s −ℓ→ s′}. Whenever en_M(s, A) = ∅, we call s a deadlock state. For T ⊆ T, we define nxt(T) = {s ∈ S | ∃ (s′ −ℓ→ s) ∈ T}. This means that nxt(en_M(s, A)) is the set of immediate successors of s.
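The notions en and nxt can be made concrete with a small Python sketch (illustrative only, not part of the original toolset), representing an Lts as a set of (source, label, target) triples:

```python
from typing import Hashable, Set, Tuple

State = Hashable
Label = str
Transition = Tuple[State, Label, State]  # (s, l, s') for a transition s -l-> s'

def en(trans: Set[Transition], s: State, labels: Set[Label]) -> Set[Transition]:
    """en_M(s, A): the transitions enabled in s, restricted to the label set A."""
    return {(src, lab, tgt) for (src, lab, tgt) in trans if src == s and lab in labels}

def nxt(enabled: Set[Transition]) -> Set[State]:
    """nxt(T): the target states of the transitions in T."""
    return {tgt for (_, _, tgt) in enabled}

# Tiny example: s2 is a deadlock state, since it has no enabled transitions.
A = {"a", "b"}
trans = {("s0", "a", "s1"), ("s1", "b", "s2")}
assert nxt(en(trans, "s0", A)) == {"s1"}
assert en(trans, "s2", A) == set()
```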
System specifications often consist of a finite number of process specifications
in parallel composition. Then, the process specifications describe the potential
behaviour of individual system components. The potential behaviour of all these
processes concurrently then constitutes the Lts of the system as a whole. What
modelling language is being used to specify these systems is unimportant here;
we only assume that the process specifications can be mapped to process Ltss,
and that the processes can interact using synchronisation actions.
Next, we will highlight how a system Lts can be derived from a given set
of process Ltss and a so-called synchronisation function. System behaviour can
be described by a finite set Π of n > 0 process Ltss Mi = (Si, Ai, Ti, sin,i), for 1 ≤ i ≤ n, together with a partial function C : A^s × A^s → A^f, with A^s = ⋃_{1≤i≤n} Ai and A^f a set of actions representing successful synchronisation, describing the potential synchronisation behaviour of the system, i.e. it defines which actions ℓ, ℓ′ ∈ ⋃_{1≤i≤n} Ai can synchronise with each other, resulting in an action ℓ′′ ∈ A^f. We write C({ℓ, ℓ′}) = ℓ′′ to indicate that the order of ℓ and ℓ′ does not matter.¹ Furthermore, we assume that each action ℓ is involved in at most one synchronisation rule, i.e. for each ℓ, there are no two distinct ℓ′, ℓ′′ such that both C({ℓ, ℓ′}) and C({ℓ, ℓ′′}) are defined. Definition 2
describes how to construct a system Lts from a finite set Π of process Ltss.
Definition 2. Given a set Π of n > 0 process Ltss Mi = (Si, Ai, Ti, sin,i), for 1 ≤ i ≤ n, and a synchronisation function C : A^s × A^s → A^f, with A^s = ⋃_{1≤i≤n} Ai and A^f a set of actions representing successful synchronisation, we construct a system Lts M = (S, A, T, sin) as follows:

– sin = (sin,1, . . . , sin,n);
– Let z1 = (s1, . . . , si, . . . , sj, . . . , sn) ∈ S with i ≠ j.
  • If for some Mi ∈ Π, si −ℓ→ s′i with ℓ ∈ Ai, and there does not exist ℓ′ ∈ A^s such that C({ℓ, ℓ′}) = ℓ′′, for some ℓ′′ ∈ A^f, then z2 = (s1, . . . , s′i, . . . , sj, . . . , sn) ∈ S. In this case, ℓ ∈ A and z1 −ℓ→ z2 ∈ T;
  • If for some Mi ∈ Π, si −ℓ→ s′i with ℓ ∈ Ai, and for some Mj ∈ Π (i ≠ j), sj −ℓ′→ s′j with ℓ′ ∈ Aj, and C({ℓ, ℓ′}) = ℓ′′, for some ℓ′′ ∈ A^f, then z2 = (s1, . . . , s′i, . . . , s′j, . . . , sn) ∈ S. In this case, ℓ′′ ∈ A and z1 −ℓ′′→ z2 ∈ T.

¹ In practice, synchronisation rules can also be defined for more than two parties, resulting in broadcasting rules. In this paper, we restrict synchronisation to two parties. Note, however, that the definitions can be extended to support broadcasting.
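The construction of Definition 2 can be sketched as a reachability computation over the product of the process Ltss. The following Python fragment is illustrative rather than the implementation used for ISV: it represents each process Lts as an (initial state, transition set) pair, encodes C as a dictionary from unordered label pairs to result actions, and only builds the part of the system Lts reachable from sin.

```python
from collections import deque

def compose(procs, sync):
    """Build the system Lts of Definition 2, restricted to the states reachable from sin.

    procs: list of (initial_state, transitions) pairs, one per process Lts, where
           transitions is a set of (source, label, target) triples.
    sync:  dict mapping frozenset({l, l2}) to the action representing the successful
           synchronisation of l and l2 (the function C of the text).
    """
    def must_sync(lab):
        # True iff some rule C({lab, .}) is defined for lab
        return any(lab in pair for pair in sync)

    sin = tuple(init for (init, _) in procs)
    trans, seen, todo = set(), {sin}, deque([sin])
    while todo:
        z1 = todo.popleft()
        for i, (_, t_i) in enumerate(procs):
            for (si, lab, si2) in t_i:
                if si != z1[i]:
                    continue
                if not must_sync(lab):
                    # first case of Definition 2: an independent step of process i
                    z2 = z1[:i] + (si2,) + z1[i + 1:]
                    trans.add((z1, lab, z2))
                    if z2 not in seen:
                        seen.add(z2)
                        todo.append(z2)
                    continue
                # second case: look for a synchronisation partner in another process j
                for j, (_, t_j) in enumerate(procs):
                    if j == i:
                        continue
                    for (sj, lab2, sj2) in t_j:
                        if sj != z1[j] or frozenset({lab, lab2}) not in sync:
                            continue
                        z2 = list(z1)
                        z2[i], z2[j] = si2, sj2
                        z2 = tuple(z2)
                        trans.add((z1, sync[frozenset({lab, lab2})], z2))
                        if z2 not in seen:
                            seen.add(z2)
                            todo.append(z2)
    return sin, trans
```

With an empty sync dictionary, the construction degenerates to pure interleaving, as in the independent-process example of Section 5.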
4 Directed Lts Search Techniques
The two most basic Lts exploration algorithms available in model checkers are
Bfs and depth-first search (Dfs). They differ in the order in which they consider
states for exploration. In Bfs, states are explored in order of their distance from
sin . Dfs gives priority to searching at increasing depth instead of exploring all
states at a certain depth before continuing. If at any point in the search, the selected state has no successors, or they have all been visited before, then Dfs will
backtrack to the parent of this state, and explore the next state from the parent’s
set of successors, according to the ordering function. Bfs and Dfs are typical
blind searches, since they do not take additional information about the system
under verification into account. In contrast to this are the informed searches
which do use such information. Examples of informed searches are Uniform-cost
search [33], also known as Dijkstra’s search [9], and A∗ [19].
All searches, both blind and informed, mentioned so far are examples of exhaustive searches, i.e. in the absence of deadlocks, they will explore all states
reachable from sin . Another class of searches is formed by the non-exhaustive
searches. These searches prune the Lts on-the-fly, completely ignoring those
parts which are deemed uninteresting according to some heuristics. At the cost
of losing completeness, these searches can find deadlocks in very large Ltss fast,
since they can have drastically lower memory and time requirements, compared
to exhaustive searches. A blind search in this category is random walk, or simulation, in which successor states are chosen randomly. An informed example
is beam search, which is basically a Bfs-like search, where in each iteration,
i.e. depth, only up to β ∈ ℕ states, with β given a priori, are selected for
exploration. For the selection procedure, various functions can be used: in classic beam search, a (state-based) function as in A∗ is used; in priority beam
search [38,39,41], a selection procedure based on transition labels is employed,
while highway search [13] uses random selection, and is therefore a blind variant
of beam search.
For a search L, let us define its scope ReachM (L) in a given Lts M as the set
of states in M that it will explore. For all exhaustive L, we have ReachM (L) =
S, while for all non-exhaustive L, ReachM (L) ⊂ S. Let us consider two searches
L1 and L2 , such that ReachM (L1 ) ∪ ReachM (L2 ) = S. Then, we propose to
call {L1 , L2 } cumulatively exhaustive on M. Such cumulatively exhaustive sets
(Cess) are very interesting for SV; the elements can be run independently in
parallel, and once they are all finished, the full Lts will have been explored.
Some existing searches lead to Cess. Iterative deepening [25] uses depth-bounded Dfs in several iterations, each time relaxing the bound. Each iteration can be seen as an individual search, subsequent searches having increasing
scopes. Iterative searches form a class, which includes e.g. Ida∗ [25]. Another
class leading to Cess consists of random searches like random walk and highway
search. However, none of these is suitable for grid computing. Iterative searches
form Cess containing an exhaustive search. If M is bug-free, then eventually
this search is performed, which is particularly inefficient. With random searches,
there is no guarantee that after n searches, all reachable states are visited. If
n → ∞, the set will eventually be cumulatively exhaustive, but performing the
searches may take forever. Moreover, the probabilities to visit states in a random
walk are not uniformly distributed, but depend on the graph structure [30].
We want to derive Cess with a bounded number of non-exhaustive elements
from a system under verification. Preferably, all scopes have equal size, to ensure load-balancing, but this is not necessary (in SV, workers do not synchronise,
hence load-balancing is less important [22,23]). To achieve this, we have developed a search called informed swarm search (Iss) which accepts a guiding function, and we have a method to compute a set of guiding functions f0 , f1 , . . . , fn
given a system specification, such that {Iss(f0 ), Iss(f1 ), . . .} is a Ces. The guiding functions actually relate to traces through the Lts of a subsystem π of the
system under verification, which are derived from an Lts exploration of π. Such
an Lts can in practice be much smaller than the system Lts. For now, we require
that π yields finite behaviour, i.e. that its Lts is cycle-free.
Iss only selects those transitions for exploration which either do not stem from
π, or which correspond with the current position in the given trace through
the Lts of π. This is computationally inexpensive, since it only entails label
comparison. The underlying assumption is that labels can be uniquely mapped
to process Ltss; given a label, we know from which process it stems. If a given
set of Ltss does not meet this requirement, some label renaming can fix this.
5 Systems with Independent Processes
In this section, we will explain our method for constructing Cess for very basic specifications which consist of completely independent processes in parallel
composition. We are aware of the fact that such specifications may not be very interesting in practice. However, they are very useful for our explanation.
Figure 1 presents two Ltss of a beverage machine, and a snack machine,
respectively. Both are able to dispense goods when one of their buttons is pressed.
There is no interaction between the machines. We have Π = {Mb , Ms }, with Mb
and Ms the Ltss of the beverage machine and the snack machine, respectively.
If we perform a Dfs through Mb, and we record the encountered traces whenever backtracking is required, we get the following set: {⟨push button(1), get coffee⟩, ⟨push button(2), get tea⟩}. Note that all reachable states of Mb
have been visited. We use these two traces as guiding principles for two different
searches through Mbs , which is the Lts obtained by placing the two Ltss in
parallel composition. Algorithm 1 presents the pseudo-code of our Iss, which
accepts a trace σ and a set of transition labels Aex to guide the search. In our
example, each Iss, with one of the two traces and Aex = Ab as input, focusses
on specific behaviour of the beverage machine within the bigger system context.
Fig. 2 shows which states will be visited in Mbs if we perform an Iss based on ⟨push button(1), get coffee⟩. Alg. 1 explains how this is done. Initially, we put sin in Open. Then, we add all successors reached via a transition with a label not in Aex to Next, and add all successors reached via a transition labelled σ(i) to Step. Here, σ(i) returns the (i + 1)th element in the trace σ; if i is greater than or equal to the trace length, we say that σ(i) = ⊥, where ⊥ represents ‘undefined’, and {⊥} is equivalent to ∅. For now, please ignore the next step concerning Fi; it has to do with feedback to the manager, and will be explained later. Finally, sin is added to Closed, i.e. the set of explored states, and the states in Next which have not been explored constitute the new Open. Then, the whole process is repeated. This continues until Open = ∅. Then, the contents of Step are moved to Open, and the Iss moves on to the next step in σ. In this way, the Iss explores all traces γ = α0, σ(0), α1, . . . , αn, σ(n), αn+1, with n the length of σ and the αi traces containing only labels from A \ Ab.

Fig. 1. Two Ltss of a beverage and a snack machine

Fig. 2. A search through Mbs with Mb restricted to ⟨push button(1), get coffee⟩
Algorithm 1. Bfs-based Informed Swarm Search
Require: Implicit description of M, exclusion action set Aex, swarm trace σ
Ensure: M restricted to σ is explored
  i ← 0
  Open ← {sin}; Closed, Next, Step, Fi ← ∅
  while Open ≠ ∅ ∨ Step ≠ ∅ do
    if Open = ∅ then
      i ← i + 1
      Open ← Step \ Closed; Step, Fi ← ∅
    end if
    for all s ∈ Open do
      Next ← Next ∪ nxt(en_M(s, A \ Aex))
      Step ← Step ∪ nxt(en_M(s, {σ(i)}))
      Fi ← Fi ∪ {ℓ | ∃s′ ∈ S. (s −ℓ→ s′) ∈ en_M(s, Aex)}
    end for
    Closed ← Closed ∪ Open
    Open ← Next \ Closed; Next ← ∅
  end while
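For concreteness, the following Python transcription of Alg. 1 operates on the same triple-based Lts representation as the earlier sketches; it is a simplification in that it assumes an explicitly given system Lts, whereas the tool works on an implicit description of M.

```python
def informed_swarm_search(sin, trans, labels, a_ex, sigma):
    """Bfs-based informed swarm search (a sketch of Alg. 1).

    sin:    initial state; trans: set of (source, label, target) triples;
    labels: the full action set A; a_ex: the exclusion action set Aex
            (subsystem actions after relabelling); sigma: the guiding trace.
    Returns (closed, feedback): the explored states, and per position i the
    set F_i of Aex-labels encountered while searching for sigma[i].
    """
    def successors(s, allowed):
        return {tgt for (src, lab, tgt) in trans if src == s and lab in allowed}

    i = 0
    open_, closed = {sin}, set()
    nxt_, step = set(), set()
    feedback = [set()]
    while open_ or step:
        if not open_:
            i += 1
            open_, step = step - closed, set()
            feedback.append(set())
        for s in open_:
            # successors not guided by the trace (labels outside Aex)
            nxt_ |= successors(s, labels - a_ex)
            # successors matching the current trace position sigma(i), if defined
            if i < len(sigma):
                step |= successors(s, {sigma[i]})
            # feedback: which subsystem actions were enabled at this position
            feedback[i] |= {lab for (src, lab, tgt) in trans if src == s and lab in a_ex}
        closed |= open_
        open_, nxt_ = nxt_ - closed, set()
    return closed, feedback
```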
If we perform an Iss for every trace through Mb, we will visit all reachable states in Mbs. Figure 2 shows what the Iss with ⟨push button(1), get coffee⟩ explores; out of the 9 states, 6 are explored, meaning that 33% of Mbs could be ignored. The Iss using ⟨push button(2), get tea⟩ also explores 6 states, namely 0, 3 and 4 (the states reachable via behaviour from Ms), and 1, 5, and 6. In this way, some states are explored multiple times, but we have a Ces of non-exhaustive searches through Mbs.
6 Systems with Synchronising Processes
Next, we consider parallel processes which synchronise. In such a setting, things
get slightly more complicated. Before we continue with an example, let us first
formally define a subsystem and the Lts it yields.
Definition 3. Given a set Π of n > 0 process Ltss Mi = (Si, Ai, Ti, sin,i), for 1 ≤ i ≤ n, and a synchronisation function C : A^s × A^s → A^f, we call a subset π ⊆ Π of Ltss a subsystem of Π. We can derive a synchronisation function Cπ : A^s_π × A^s_π → A^f_π from C as follows: A^s_π = ⋃_{Mi ∈ π} Ai, and for all ℓ, ℓ′ ∈ A^s_π, if C({ℓ, ℓ′}) = ℓ′′, for some ℓ′′ ∈ A^f, we define Cπ({ℓ, ℓ′}) = ℓ′′ and ℓ′′ ∈ A^f_π.
Note that the Lts of a subsystem π, which can be obtained with Definition 2, describes an over-approximation of the potential behaviour of π within the bigger
context of Π. This is because in π, it is implicitly assumed that all synchronisation with external processes (which are in Π, but not in π) can happen
whenever a process in π can take part in it. For the Iss, we have to choose Aex
more carefully, and we need to post-process traces σ yielded from π; we must
take synchronisation between π and the larger context into account. For this,
we define a relabelling function R : Aπ → A^f as follows: R(ℓ) = ℓ if there exists no ℓ′ such that C({ℓ, ℓ′}) is defined, and R(ℓ) = ℓ′′ if there exists an ℓ′ such that C({ℓ, ℓ′}) = ℓ′′. Then, we say that Aex = {R(ℓ) | ℓ ∈ Aπ} and σ′(i) = R(σ(i)) for all defined σ(i), such that Aex and σ′ are applicable in the system Lts, because
actions from π which are forced to synchronise are relabelled to the results of
those synchronisations. In this case, σ ′ relates to a single trace iff C is injective.
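As a small illustration (again not taken from the paper's tools), R and the inputs of an Iss can be derived from the synchronisation dictionary used in the composition sketch of Section 3; the function names are illustrative.

```python
def relabel(lab, sync):
    """R(l): replace l by the result of its (unique) synchronisation rule, if any."""
    for pair, result in sync.items():
        if lab in pair:
            return result
    return lab

def iss_inputs(a_pi, sigma, sync):
    """Derive the exclusion set Aex and the relabelled guiding trace sigma'."""
    a_ex = {relabel(lab, sync) for lab in a_pi}
    sigma2 = [relabel(lab, sync) for lab in sigma]
    return a_ex, sigma2
```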
Fig. 3 shows two Ltss: one of a modified beverage machine Mb, and one of a user Mu (for now, please ignore the numbers between curly brackets). For the parallel composition Mub, we define: C({push button, push button}) = button pushed², C({get coffee, get coffee}) = take coffee, and C({get tea, get tea}) = take tea.
First, if π = {Mu}, observe that a Dfs through Mu does not give us the full set of traces through Mu; if we consider the transition ordering from left to right, a Dfs will first provide trace ⟨push button(2), get tea, push button(2), get tea, walk away⟩. Then, it will backtrack to state 3, and continue via 5 to 6. Since 6 has already been explored, the Dfs produces trace ⟨push button(2), get tea, push button(1), get coffee⟩ and backtracks to 0. Note that this new trace does not finish with walk away. Continuing from 0, the search will finally produce the trace ⟨push button(1), get coffee⟩.

Fig. 3. Ltss of a user of a beverage machine, and a beverage machine, respectively

Algorithm 2. Trace-counting Dfs
Require: Implicit description of cycle-free M
Ensure: M and tc : S → ℕ are constructed
  Closed ← ∅
  tc(sin) ← dfs(sin)
  dfs(s) =
    if s ∉ Closed then
      tc(s) ← 0
      for all s′ ∈ nxt(en_M(s, A)) do
        tc(s) ← tc(s) + dfs(s′)
      end for
      if nxt(en_M(s, A)) = ∅ then
        tc(s) ← 1
      end if
      Closed ← Closed ∪ {s}
    end if
    return tc(s)

Figure 4 presents Mub. If we use these three traces (after relabelling with R) as guiding functions as in Alg. 1, and define Aex as mentioned earlier, none of the searches will visit (the marked) state 5! The reason for this is that although in Mu, multiple traces may lead to the same state, in Mub, the corresponding traces may not.
This is due to synchronisation. Since the different traces in Mu synchronise with
different traces in Mb which do not lead to the same state, also in Mub , the
resulting traces will lead to different states. One solution is to fully explore Mu
with a Dfs without a Closed set. However, this is very inefficient, since the complete reachable graph from a state s needs to be explored n times, if n different
traces reach s. Instead, we opt for constructing a weighted Lts, where each state
is assigned a value indicating the number of traces that can be explored from that state. In Figure 3, these numbers are displayed between curly brackets. The advantage of this is that we can avoid re-exploration of states, and it is in fact possible to uniquely identify traces through Mu by a trace ID ∈ ℕ.

² When transition labels have parameters, they can synchronise iff they have the same parameter values. Then, the resulting transition also has these parameter values.
Fig. 4. The Lts of a beverage machine and a user in concurrency

Alg. 2 presents our trace-counting search (in which Closed is a global variable), which not only explores a full Lts, but also constructs a function tc : S → ℕ indicating the number of traces one can follow from a state. Deadlock states get weight 1, and other states get a weight equal to the sum of the weights of their immediate successors. In Dfs, a state s is placed in Closed once all states reachable from s have been explored, hence at that moment, we know the final weight of s. This allows us to reuse weight information whenever we visit an explored state. Note that tc(sin) equals the number of possible traces through
the Lts. Alg. 3 shows how to reconstruct a trace, given its ID between 0 and
tc(sin ). It is important here that for each state, its successor states are always
ordered in the same way. In the weighted Lts, each trace from sin to a state
s represents a range of trace IDs from lower, the maintained lower-bound, to
lower + tc(s). Starting at sin , the algorithm narrows the matching range down
to the exact ID. At each state, it explores the transition with a matching ID interval to the next state, and continues like this until a deadlock state is reached.
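Both steps can be sketched in Python as follows. This is illustrative only: it assumes an explicit, cycle-free subsystem Lts given as (source, label, target) triples, and a fixed successor ordering (here by label and target), which the reconstruction, and the pruning discussed later, must share with the ID assignment.

```python
def count_traces(trans, sin):
    """tc: number of maximal traces starting in each state (a sketch of Alg. 2).
    Assumes the Lts described by trans is cycle-free (recursion stays bounded)."""
    tc = {}
    def dfs(s):
        if s not in tc:
            succ = [tgt for (src, _, tgt) in trans if src == s]
            # deadlock states get weight 1, other states the sum over their successors
            tc[s] = sum(dfs(t) for t in succ) if succ else 1
        return tc[s]
    dfs(sin)
    return tc

def reconstruct_trace(trans, sin, tc, trace_id):
    """Rebuild the trace with the given ID (a sketch of Alg. 3)."""
    assert 0 <= trace_id < tc[sin], "trace ID out of range"
    sigma, lower, crt = [], 0, sin
    while True:
        out = sorted((lab, tgt) for (src, lab, tgt) in trans if src == crt)
        if not out:                      # deadlock state reached: the trace is complete
            return sigma
        for (lab, tgt) in out:
            if lower + tc[tgt] > trace_id:
                sigma.append(lab)
                crt = tgt
                break
            lower += tc[tgt]
```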
Algorithm 3. Trace Reconstruction
Require: Cycle-free M, tc : S → ℕ, ID ∈ ℕ
Ensure: Trace with given ID is constructed in σ
  i, lower ← 0
  crt ← sin
  for all (crt −ℓ→ s) ∈ en_M(crt, A) do
    if lower + tc(s) > ID then
      crt ← s
      σ(i) ← ℓ; i ← i + 1
    else
      lower ← lower + tc(s)
    end if
  end for

The method works as follows: first, an explicit weighted Lts of a subsystem π is constructed. Whenever a worker is ready for work, he contacts a manager, who then selects a trace ID from the given set of IDs, and constructs the associated trace σ using the weighted Lts. This trace, after relabelling with R, is used by the worker to guide his Iss. Next, we discuss the Fi in Alg. 1. For each σ(i), set Fi is constructed to hold all labels from Aπ (after relabelling) which are encountered in M while searching for σ(i). Since Mπ is an over-approximation of the potential behaviour of π in M,³ the set of trace IDs is likely to contain many false positives, i.e. behaviour of π which cannot be fully followed in M.

³ Note that only 2 of the 4 traces through Mu can be followed completely in Mub by a swarm search.
These Fi provide invaluable feedback, allowing the manager to prune non-executable traces from the work-list. This is essential for drastically reducing the number of Isss on the fly. The manager performs this pruning by traversing
the weighted Lts, similar to Algorithm 3, and removing all ranges of trace IDs
corresponding to traces which are known to be false positives from a maintained
trace set. E.g. if a worker discovers that after an action a of π, the only action of π
which can be performed is b, then the manager will first follow a in the weighted
Lts of π from sin to a state s, and remove all ID ranges corresponding to the
immediate successors of s which are not reached via a transition labelled b from
the trace set. The manager will also remove the ID of the trace followed by that
worker, if the worker was able to process it entirely. From that moment on, the
manager will select trace IDs from the pruned set. This allows for embarrassingly
parallel workers, and the manager can dynamically process feedback and provide
new work in the form of traces. Furthermore, little communication is needed,
and if a worker fails due to a technical issue, the work can easily be redone.
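A sketch of the manager's pruning step is given below (illustrative, not the tool's implementation). It uses the weighted Lts built by count_traces above and the same successor ordering as the reconstruction sketch; prefix is a sequence of subsystem actions already matched by a worker, and allowed_next is the set of subsystem actions that the corresponding Fi reports as executable afterwards. The parameter names are illustrative.

```python
def prune_ids(trans, sin, tc, remaining, prefix, allowed_next):
    """Remove from 'remaining' the trace-ID ranges of branches ruled out by feedback.
    Assumes 'prefix' is a valid trace prefix in the weighted Lts of the subsystem."""
    lower, crt = 0, sin
    # follow the reported prefix, tracking the lower bound of its ID range
    for action in prefix:
        for (lab, tgt) in sorted((lab, tgt) for (src, lab, tgt) in trans if src == crt):
            if lab == action:
                crt = tgt
                break
            lower += tc[tgt]
    # drop the ID ranges of all continuations that were not reported as executable
    for (lab, tgt) in sorted((lab, tgt) for (src, lab, tgt) in trans if src == crt):
        if lab not in allowed_next:
            remaining -= set(range(lower, lower + tc[tgt]))
        lower += tc[tgt]
    return remaining
```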
Note that trace-counting Dfs, like Dfs, runs in O(|S| + |T |), and the trace
reconstruction search and the ID pruning algorithm run in O(n + (n ∗ b)), with
n the length of the largest trace through the weighted Lts, and b the maximum
number of successors of a state in the Lts. Finally, the complexity of Iss depends
on the Lts structure; it is less than O(|S| + |T |), as it is non-exhaustive, but
also not linear in the length of the longest trace through the Lts, like e.g. beam
search, as it prunes less aggressively (not every Bfs-level has the same size).
7 Experiments
All proposed algorithms are implemented as an extension of LTSmin [7]. The
advantage of using this toolset as a starting point is that it has interfaces with
multiple popular model checkers, such as DiVinE [2] and mCRL2 [17]. However,
DiVinE is based on Kripke structures, where the states instead of the transitions
are labelled. Future work, therefore, is to develop a state-based ISV.
We have two bash scripts for performing real multi-core ISVs and simulating
ISVs, in case not enough processors are available. We do not yet support communication between workers and the manager over a network. The functionality of
the manager is implemented in new tools to perform the pruning on the current
trace ID set, and select new trace IDs from the remaining set. All intermediate
information is maintained on disk. Initially, the user has to create a subsystem
specification based on the system specification, which is used for trace-counting,
leading to a weighted Lts on disk. The selection tool can select available trace
IDs from that Lts, and write explicit traces into individual files. Relabelling
can also be applied, given the synchronisation rules. The IDs are currently selected such that they are evenly distributed over the ID range. The trace files
are accepted by the Iss in LTSmin, applied on the system specification. Finally, the written feedback can be applied on a file containing the current trace
ID set.
Table 1. Results for two protocol specifications. ISV n indicates an ISV with n workers. # π-traces: estimated # Isss needed (1 for a single Bfs). # Isss: actual # Isss needed. max. states: largest # states explored by an Iss, or # states explored by Bfs. max. time: longest running time of an Iss, or running time of Bfs.

case                 search    # π-traces     # Isss   max. states   max. time
DRM (1nnc, 3ttp)     Bfs       1              1        13,246,976    19,477 s
                     ISV 10    1.31 · 10^13   7,070    70,211        177 s
                     ISV 100   1.31 · 10^13   9,900    70,211        175 s
1394 (3 link ent.)   Bfs       1              1        137,935,402   105,020 s
                     ISV 10    3.01 · 10^9    1,160    236,823       524 s
                     ISV 100   3.01 · 10^9    1,400    236,823       521 s
We performed a number of experiments using µCRL [18] specifications of a
DRM protocol [37] and the Link Layer Protocol of the IEEE-1394 Serial Bus
(Firewire) [29] with three parallel link protocol entities. In the first case, we
performed trace-counting on the two iPod devices in parallel composition. In the
second case, we isolated one of the link protocol entities for the trace-counting,
and bounded its infinite behaviour, only allowing the traversal through cyclic
behaviour up to 2 times. The experiments were performed on a machine with two
dual-core amd opteron (tm) processors 885 2.6 GHz, 126 GB RAM, running
Red Hat 4.3.2-7. Creating the weighted Ltss took no more than a few minutes;
the DRM weighted Lts contained 962 states, the 1394 weighted Lts 73 states.
We simulated the ISVs, executing Isss in sequence. This influences the outcome slightly, since we process feedback each time n new Isss have been performed; we do this to approach a real ISV, where, on the one hand, feedback can be processed as it becomes available, but, on the other hand, many Isss are launched in parallel at the same time, hence new Isss are started when only a certain amount of feedback has been processed. Therefore, updating the remaining set of traces after each individual Iss would not really be fair. The need to simulate was the main reason that we have not
looked at larger Ltss yet. Table 1 presents the results. For smaller instances, we
validated that the swarm was cumulatively exhaustive, by writing the full state
vectors of states to disk. Observe that initially, the first analyses produced a large
over-approximation of the number of Isss needed (see “#π-traces”). The quality
of the estimation has an effect on the ISV efficiency, even with feedback. This
also has an effect on the difference in efficiency when changing n (the number
of parallel workers); when the over-approximation is large, many Isss may be
launched which cannot process the given trace entirely, since it is a false positive.
As n is increased, the probability of launching such Isss gets higher, as more Isss
are launched before any new feedback is given. On the other hand, increasing n
still reduces the overall execution time, because there is more parallelism.
The ISVs take more time⁴ compared to a Bfs. This seems to indicate that the method is not interesting. However, keep in mind that already the first few Isss reach great depths, and they go into several directions. Therefore, even though we did no bug-hunting, in many cases it is to be expected that, like SV, our ISV will find bugs much quicker than a single search. This could perhaps even be improved by using a Dfs-based version of Alg. 1. We plan to do an empirical comparison between such a Dfs-based ISV and SV. For full exploration, ISV does not provide a speedup, but this was not intended; instead, observe that the maximum number of states explored in an Iss is much smaller than the Lts size (in the DRM case about 1/2 % of the overall size, in the 1394 case about 1/6 %). A better trade-off could be realised by guiding each Iss with a set of subsystem traces; for the DRM case, an Iss following 10 traces would still explore no more than 5% of the Lts (probably less, depending on the amount of redundant work which could now be avoided), while the number of Isss could be decreased by an order of magnitude. This makes ISV applicable in clusters and grids with large numbers of processors, each having access to e.g. 2 GB memory, even if exploring the whole Lts at once would require much more memory.

⁴ Note that the overall execution time is at most max. time · (# Isss / n) seconds.
8 Conclusions
In this paper, we have proposed a new approach to parallel MC, aimed at large
scale grid computing. Part of the system behaviour is analysed in isolation,
yielding a set of possible traces, each representing a search through the full Lts.
These searches are embarrassingly parallel, since only a trace, the set of actions
of the subsystem, and the specification are needed for input. Once a search
is completed, feedback is sent to the manager, giving him information on the
validity of the remaining traces, which is invaluable, since the set of traces is an
over-approximation of the possible traces of the subsystem in the bigger context
of the full system. We believe that our method is fully compatible with existing
techniques. E.g. one can imagine having multiple multi-core machines available;
the ISV method can then be used to distribute the work over these machines,
but each machine individually can perform the work using a multi-core search.
Also reduction techniques like partial order reduction should be compatible with
ISV. For good results, we expect it to be important that both during the analysis
of the subsystem and the full system, the same reduction techniques are applied.
For future work, we plan to test ISV more thoroughly to see if it scales to
real-life problems, and make the tools more mature. As the size of the subsystem
has an effect on the work distribution, it is interesting to investigate what an
ideal subsystem relative to a system would be. We also wish to generalise ISV,
such that subsystems yielding infinite behaviour can be analysed, and to improve
the trace set approximation with e.g. static analysis. One could also construct
the trace set according to the MC task, e.g. taking the property to check into
account. We plan to investigate different strategies for trace selection. A good
strategy sends the workers into very different directions. Finally, [16,36] provide
good pointers to develop a state-based ISV.
References
1. Barnat, J., Brim, L., Ročkai, P.: Scalable multi-core LTL model-checking. In:
Bošnački, D., Edelkamp, S. (eds.) SPIN 2007. LNCS, vol. 4595, pp. 187–203.
Springer, Heidelberg (2007)
2. Barnat, J., Brim, L., Ročkai, P.: DiVinE multi-core – A parallel LTL model-checker.
In: Cha, S(S.), Choi, J.-Y., Kim, M., Lee, I., Viswanathan, M. (eds.) ATVA 2008.
LNCS, vol. 5311, pp. 234–239. Springer, Heidelberg (2008)
3. Barnat, J., Brim, L., Střı́brná, J.: Distributed LTL model-checking in SPIN. In:
Dwyer, M.B. (ed.) SPIN 2001. LNCS, vol. 2057, pp. 200–216. Springer, Heidelberg
(2001)
4. Behrmann, G., Hune, T., Vaandrager, F.: Distributing Timed Model Checking How the Search Order Matters. In: Emerson, E.A., Sistla, A.P. (eds.) CAV 2000.
LNCS, vol. 1855, pp. 216–231. Springer, Heidelberg (2000)
5. Blom, S.C.C., Calamé, J.R., Lisser, B., Orzan, S., Pang, J., van de Pol, J.C., Torabi
Dashti, M., Wijs, A.J.: Distributed Analysis with µCRL: A Compendium of Case
Studies. In: Grumberg, O., Huth, M. (eds.) TACAS 2007. LNCS, vol. 4424, pp.
683–689. Springer, Heidelberg (2007)
6. Blom, S.C.C., Lisser, B., van de Pol, J.C., Weber, M.: A Database Approach to
Distributed State Space Generation. In: Haverkort, B., Černá, I. (eds.) PDMC
2007. ENTCS, vol. 198, pp. 17–32. Elsevier, Amsterdam (2007)
7. Blom, S.C.C., van de Pol, J., Weber, M.: LTSmin: distributed and symbolic reachability. In: Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174, pp.
354–359. Springer, Heidelberg (2010)
8. BOINC, http://boinc.berkeley.edu (Visited on 18 February 2011)
9. Dijkstra, E.W.: A note on two problems in connection with graphs. Numerische
Mathematik 1, 269–271 (1959)
10. Dill, D.: The Murphi Verification System. In: Alur, R., Henzinger, T.A. (eds.) CAV
1996. LNCS, vol. 1102, pp. 390–393. Springer, Heidelberg (1996)
11. Dwyer, M.B., Elbaum, S.G., Person, S., Purandare, R.: Parallel Randomized State-space Search. In: 29th Int. Conference on Software Engineering, pp. 3–12. IEEE
Press, New York (2007)
12. Edelkamp, S., Leue, S., Lluch-Lafuente, A.: Directed explicit-state model checking
in the validation of communication protocols. STTT 5(2), 247–267 (2004)
13. Engels, T.A.N., Groote, J.F., van Weerdenburg, M.J., Willemse, T.A.C.: Search
Algorithms for Automated Validation. JLAP 78(4), 274–287 (2009)
14. Foster, I.: Designing and Building Parallel Programs. Addison-Wesley, Reading
(1995)
15. Garavel, H., Mateescu, R., Bergamini, D., Curic, A., Descoubes, N., Joubert, C.,
Smarandache-Sturm, I., Stragier, G.: DISTRIBUTOR and BCG MERGE: Tools
for distributed explicit state space generation. In: Hermanns, H., Palsberg, J. (eds.)
TACAS 2006. LNCS, vol. 3920, pp. 445–449. Springer, Heidelberg (2006)
16. Groce, A., Joshi, R.: Exploiting traces in static program analysis: better model
checking through printfs. STTT 10(2), 131–144 (2008)
17. Groote, J.F., Keiren, J., Mathijssen, A., Ploeger, B., Stappers, F., Tankink, C.,
Usenko, Y.S., Weerdenburg, M.J.: The mCRL2 Toolset. In: 1st Int. Workshop on
Academic Software Development Tools and Techniques 2008, pp. 5-1/10 (2008)
18. Groote, J.F., Ponse, A.: The Syntax and Semantics of µCRL. In: Algebra of Communicating Processes 1994. Workshops in Computing, pp. 26–62. Springer, Heidelberg (1995)
19. Hart, P.E., Nilsson, N.J., Raphael, B.: A Formal Basis for the Heuristic Determination of Minimum Cost Paths. IEEE Trans. on Systems, Science and Cybernetics 2,
100–107 (1968)
20. Holzmann, G.J.: A Stack-Slicing Algorithm for Multi-Core Model Checking. In:
Haverkort, B., Černá, I. (eds.) PDMC 2007. ENTCS, vol. 198, pp. 3–16. Elsevier,
Amsterdam (2007)
21. Holzmann, G.J., Bošnački, D.: The Design of a Multicore Extension of the SPIN
Model Checker. IEEE Trans. On Software Engineering 33(10), 659–674 (2007)
22. Holzmann, G.J., Joshi, R., Groce, A.: Swarm Verification. In: 23rd IEEE/ACM
Int. Conference on Automated Software Engineering, pp. 1–6. IEEE Press, New
York (2008)
23. Holzmann, G.J., Joshi, R., Groce, A.: Tackling large verification problems with the
swarm tool. In: Havelund, K., Majumdar, R. (eds.) SPIN 2008. LNCS, vol. 5156,
pp. 134–143. Springer, Heidelberg (2008)
24. Holzmann, G.J., Joshi, R., Groce, A.: Swarm Verification Techniques. IEEE Trans.
On Software Engineering (2010) (to appear)
25. Korf, R.E.: Depth-First Iterative-Deepening: An Optimal Admissible Tree Search.
Artificial Intelligence 27(1), 97–109 (1985)
26. Laarman, A., van de Pol, J.C., Weber, M.: Boosting Multi-Core Reachability Performance with Shared Hash Tables. In: Int. Conference on Formal Methods in
Computer-Aided Design (2010)
27. Lerda, F., Sisto, R.: Distributed-Memory Model Checking with SPIN. In: Dams,
D.R., Gerth, R., Leue, S., Massink, M. (eds.) SPIN 1999. LNCS, vol. 1680, pp.
22–39. Springer, Heidelberg (1999)
28. Lowerre, B.T.: The HARPY speech recognition system. PhD thesis, Carnegie Mellon University (1976)
29. Luttik, S.P.: Description and Formal Specification of the Link Layer of P1394.
Technical Report SEN-R 9706, CWI (1997)
30. Pelánek, R., Hanžl, T., Černá, I., Brim, L.: Enhancing Random Walk State Space
Exploration. In: 10th Int. Workshop on Formal Methods for Industrial Critical
Systems. ACM SIGSOFT, pp. 98–105 (2005)
31. Peled, D., Pratt, V., Holzmann, G.J. (eds.): Partial Order Methods in Verification.
Series in Discrete Mathematics and Theoretical Computer Science 29 (1996)
32. Romein, J.W., Plaat, A., Bal, H.E., Schaeffer, J.: Transposition Table Driven Work
Scheduling in Distributed Search. In: 16th National Conference on Artificial Intelligence, pp. 725–731. AAAI Press, Menlo Park (1999)
33. Russell, S., Norvig, P.: Artificial intelligence: A modern approach. Prentice-Hall,
New Jersey (1995)
34. SETI@home, http://setiathome.berkeley.edu (Visited on 18 February 2011)
35. Sivaraj, H., Gopalakrishnan, G.: Random Walk Based Heuristic Algorithms for
Distributed Memory Model Checking. ENTCS, vol. 89, pp. 51–67 (2003)
36. Staats, M., Păsăreanu, C.: Parallel Symbolic Execution for Structural Test Generation. In: 19th Int. Conference on Software Testing and Analysis, pp. 183–194.
ACM, New York (2010)
37. Torabi Dashti, M., Krishnan Nair, S., Jonker, H.L.: Nuovo DRM Paradiso: Towards
a Verified Fair DRM Scheme. Fundamenta Informaticae 89(4), 393–417 (2008)
38. Torabi Dashti, M., Wijs, A.J.: Pruning state spaces with extended beam search.
In: Namjoshi, K.S., Yoneda, T., Higashino, T., Okamura, Y. (eds.) ATVA 2007.
LNCS, vol. 4762, pp. 543–552. Springer, Heidelberg (2007)
39. Valente, J.M.S., Alves, R.A.F.S.: Filtered and recovering beam search algorithms
for the early/tardy scheduling problem with no idle time. Computers & Industrial
Engineering 48(2), 363–375 (2005)
40. West, C.H.: Protocol Validation by Random State Exploration. In: 8th Int. Conference on Protocol Specification, Testing and Verification, pp. 233–242. North-Holland, Amsterdam (1986)
41. Wijs, A.J.: What to Do Next?: Analysing and Optimising System Behaviour in
Time. PhD thesis, Vrije Universiteit Amsterdam (2007)