The HIVE Tool for Informed Swarm State Space Exploration
Anton Wijs∗
Eindhoven University of Technology, 5612 AZ Eindhoven, The Netherlands
[email protected]
Swarm verification and parallel randomised depth-first search are very effective parallel techniques
to hunt bugs in large state spaces. In case bugs are absent, however, scalability of the parallelisation
is completely lost. In recent work, we proposed a mechanism to inform the workers which parts of
the state space to explore. This mechanism is compatible with any action-based formalism, where a
state space can be represented by a labelled transition system. With this extension, each worker can
be strictly bounded to explore only a small fraction of the state space at a time. In this paper, we
present the HIVE tool together with two search algorithms which were added to the LTSmin tool
suite to both perform a preprocessing step, and execute a bounded worker search. The new tool is
used to coordinate informed swarm explorations, and the two new LTSmin algorithms are employed
for preprocessing a model and performing the individual searches.
1 Introduction
In explicit-state model checking (MC), it is checked whether a given system specification satisfies a given
temporal property. This is done by exploring the so-called state space of the specification, which is a
directed graph describing explicitly all potential behaviour of the system. Since state space exploration
algorithms often need to keep track of all explored states in order to efficiently perform the MC task,¹
and since state spaces can be very large, for many years, the amount of available memory in a computer
has been the most important bottleneck for MC.
In recent years, however, the increase of available memory in state-of-the-art computers has continued to follow Moore’s Law [12], while the increase of their processors’ speed no longer has. For
MC, this means that large state spaces can be stored in memory, but the time needed to explore them is
impractically long, hence a time explosion problem has emerged. This can be mitigated by developing
distributed exploration algorithms, in which a number of computers in a cluster or grid are used to perform an exploration. Many of those algorithms use a partitioning function to assign states to workers,
and require frequent synchronisation between these workers, see e.g. [1, 2, 4, 7, 10].
Swarm verification [9] (SV) and parallel randomised Depth-First Search [5] are recent techniques
to perform state space exploration in a so-called embarrassingly parallel [6] way, where the individual
workers never need to synchronise with each other. In SV, each worker starts at the initial state and
performs a search based on Depth-First Search (DFS). The direction of a worker is determined by a
given successor ordering strategy. As the direction of a DFS depends on the fact that a stack is used to
order successor states (i.e. a Last-In-First-Out strategy), changing this ordering directly influences the
direction of the search. By providing each worker with a unique strategy, they will explore different parts of
the state space first. With this method, some states may be explored multiple times by different workers,
but if the property does not hold, any bug states present are likely to be detected very quickly, due to the
diversity of the searches, which often means that the whole state space does not have to be explored.
However, if a property holds, each worker will exhaustively explore the whole reachable state space,
which means that the benefits of parallelisation are completely lost. Recently, we proposed a mechanism
to bound each worker to a particular reachable strict subset of the set of reachable states, in such a way
that together, the workers explore the whole state space [14]. This mechanism is compatible with any
action-based formalism such as µCRL [8], where each transition in a state space is labelled with some
action name corresponding with system behaviour. In this paper, we explain how the Heuristics Instructor
for parallel VErification (HIVE) tool, which resulted from [14], works in practice. Section 2 presents
the functionality of the HIVE tool together with some new algorithms implemented in the LTSmin tool
suite [4]. How all these have been implemented and how the resulting tools can be used is explained in
Section 3. In Section 4, experimental results are discussed. Finally, conclusions and pointers for possible
future work are given in Section 5.
Table 1: The four major functionalities of ISV
LTSmin:
P1. Trace-counting DFS: Constructs $\mathcal{P}'$ with $tc(s) = \max(1, \sum_{s' \in N'(s)} tc(s'))$.
F2. Informed Swarm Search (ISS): Search of $\mathcal{P}$ restricted to $\sigma$.
HIVE:
F1. Trace selection: select a swarm trace $\sigma$ for worker.
F3. Update swarm set: remove inspected traces.
∗ Supported by the Netherlands Organisation for Scientific Research (NWO) project 612.063.816 Efficient Multi-Core Model
Checking.
¹ A Depth-First Search can in principle be performed by just using a stack, but this means that the MC task can often not be
performed in linear time (depending on the structure of the state space).
Jiri Barnat and Keijo Heljanko (Eds.)
10th International Workshop on
Parallel and Distributed Methods in verifiCation (PDMC 2011)
EPTCS 72, 2011, pp. 91–98, doi:10.4204/EPTCS.72.10
2 The Informed Swarm Exploration Technique
The Setting The so-called Informed SV technique (ISV) implemented in HIVE and LTSmin is applicable
if three conditions are met: (1) A system specification $P$ should be an implicit description of a
Labelled Transition System (LTS) $\mathcal{P}$. An LTS $\mathcal{P}$ is a quadruple $(S, \mathcal{A}, T, s_{in})$, where $s_{in}$ is the initial state,
$S$ is the set of states reachable from $s_{in}$, $\mathcal{A}$ is a set of transition labels (actions), and $T \subseteq S \times \mathcal{A} \times S$ is the set
of transitions between states. With $s \xrightarrow{A} t$, $A \subseteq \mathcal{A}$, we say that there exists an $\ell \in A$ such that $(s, \ell, t) \in T$.
The reflexive transitive closure of $\xrightarrow{A}$ is denoted as $\xrightarrow{A}{}^{*}$. In on-the-fly state space exploration, $s_{in}$ and $\mathcal{A}$
are known a priori, but $S$ and $T$ are not, and a next-state function $N : S \to 2^S$ provides the set of successors
of a given state. A state $t$ is the successor of a state $s$ iff $s \xrightarrow{\mathcal{A}} t$. $N$ is used to construct $S$ and $T$, starting at
$s_{in}$. In the following, we use the notation $N|A$, with $A \subseteq \mathcal{A}$, to denote $N$ restricted to a set of transition
labels $A$, i.e. $N|A(s) = \{s' \in S \mid \exists \ell \in A. (s, \ell, s') \in T\}$. Clearly, $N|\mathcal{A} = N$. Finally, a sequence of actions
$\langle \ell_0, \ell_1, \ldots \rangle$ describes all transition sequences (traces) through an LTS $\mathcal{P}$ with
$s_{in} \xrightarrow{\{\ell_0\}} s_0 \xrightarrow{\{\ell_1\}} s_1 \cdots$ for some $s_0, s_1, \ldots$. If $\mathcal{P}$ is label-deterministic, i.e. for all $s, t \in S$, $\ell \in \mathcal{A}$
with $s \xrightarrow{\ell} t$, there does not exist a state $t' \neq t$ with $s \xrightarrow{\ell} t'$, such an action sequence corresponds to a single trace.
Here, we assume that all LTSs are label-deterministic. If this is not the case, relabelling of some transitions can resolve this.
(2) P should consist of a finite number n > 1 of process descriptions (e.g. process algebraic terms)
in parallel composition. This is the case for any concurrent system. (3) At least some of these processes
in parallel composition, i.e. a subsystem, should yield finite behaviour, hence only finite traces. This is
not a strict requirement, but if it is not met, then the method relies on bounded analysis of the subsystem,
and it does not automatically guarantee anymore that all reachable states are visited.
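The definitions of condition (1) can be made concrete with a small sketch. It assumes an explicitly stored transition set, whereas the setting above is on-the-fly; the names `next_restricted` and `is_label_deterministic` and the toy LTS are hypothetical and do not come from LTSmin.

```python
from typing import AbstractSet, Dict, Set, Tuple

State = int
Action = str
Transition = Tuple[State, Action, State]


def next_restricted(T: AbstractSet[Transition], A: AbstractSet[Action], s: State) -> Set[State]:
    """N|A(s): the successors of s reachable via a transition labelled in A."""
    return {t for (src, label, t) in T if src == s and label in A}


def is_label_deterministic(T: AbstractSet[Transition]) -> bool:
    """True iff no state has two distinct successors under the same label."""
    target: Dict[Tuple[State, Action], State] = {}
    for (src, label, t) in T:
        if target.setdefault((src, label), t) != t:
            return False
    return True


# Hypothetical toy LTS with initial state 0 (not taken from the paper).
ALL_ACTIONS = {"a", "b", "c"}
T = {(0, "a", 1), (0, "b", 2), (1, "c", 3), (2, "c", 3)}

print(next_restricted(T, {"a"}, 0))        # successors of 0 via "a" only
print(next_restricted(T, ALL_ACTIONS, 0))  # unrestricted next-state function
print(is_label_deterministic(T))
```

Restricting to the full action set gives back the unrestricted next-state function, which is the identity noted in the text.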
ISV Say that a specification describes a system of concurrent processes $P = \{P_0, \ldots, P_n\}$, with $n \in \mathbb{N}$.
ISV exploits the fact that parallel composition is a major cause for state space explosion, and that LTS
$\mathcal{P}$ of $P$ is the synchronous product of LTSs $\mathcal{P}_i = (S^i, \mathcal{A}^i, T^i, s^i_{in})$ of the $P_i$ ($0 \le i \le n$), restricted by some
synchronisation rules between processes, given by a symmetric function $C$. E.g. $C(\ell, \ell') = \ell''$ states that
if actions $\ell$ and $\ell'$ can be performed by different processes, then the result is action $\ell''$ in the system.
For the formal details, see [14]. We assume that the $\mathcal{A}^i$ are disjoint (if this is not the case, then some
rewriting can resolve this)² and that no action is involved in more than one rule defined by $C$ (either
as an input, or as a result). All this implies that for any $\ell \in \mathcal{A}$, it can be determined whether or not it
stems from some behaviour of a particular process $P_i$. Say that $\mathcal{A}_c \subseteq \mathcal{A}$ is the set of actions stemming
from synchronisations, and that $\mathcal{A}^i_c \subseteq \mathcal{A}^i$ is the set of actions of $P_i$ which are forced to synchronise with
other actions, then $\mathcal{A} = (\bigcup_{i \le n} \mathcal{A}^i \setminus \mathcal{A}^i_c) \cup \mathcal{A}_c$. Now, for any $A \subseteq \mathcal{A}$, we can define $M(A)$ as
$\{\ell'' \in \mathcal{A}_c \mid \exists \ell \in A, \ell' \notin A. C(\ell, \ell') = \ell''\}$, which is the set of actions resulting from synchronisation involving one
action in $A$. Finally, the assumptions about $C$ allow us to define a relabelling function $R$ as follows:
$R(\{\ell\}) = \{\ell''\}$ iff there exists an $\ell'$ such that $C(\ell, \ell') = \ell''$, and $R(\{\ell\}) = \{\ell\}$, otherwise.
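As an illustration of M and R, the following sketch stores the symmetric function C as a dictionary keyed by unordered pairs of actions. This encoding, the example rule and all action names are assumptions made for the sketch. For brevity, `R` here maps single actions rather than singleton sets.

```python
from typing import AbstractSet, Dict, FrozenSet, List, Set

Action = str
# C as a dictionary {l, l'} -> l''; using unordered pairs makes it symmetric.
SyncRules = Dict[FrozenSet[Action], Action]


def M(C: SyncRules, A: AbstractSet[Action]) -> Set[Action]:
    """Results of synchronisations in which exactly one participant is in A."""
    return {result for pair, result in C.items() if len(pair & A) == 1}


def R(C: SyncRules, label: Action) -> Action:
    """Map an action to its synchronisation result, or to itself if it never synchronises."""
    for pair, result in C.items():
        if label in pair:
            return result  # unique: no action occurs in more than one rule
    return label


def swarm_trace(C: SyncRules, trace: List[Action]) -> List[Action]:
    """Relabel a trace through the subsystem into a swarm trace."""
    return [R(C, label) for label in trace]


# Hypothetical rule: "send" (subsystem) and "recv" (rest of the system) yield "comm".
C: SyncRules = {frozenset({"send", "recv"}): "comm"}
print(M(C, {"send", "tick"}))
print(swarm_trace(C, ["tick", "send"]))
```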
Four basic functionalities are required to perform ISV in practice. These are listed in Table 1:
a preprocessing step (P1) involving the analysis of a defined subsystem yielding finite behaviour,
and three techniques for the three major phases of ISV (F1-3). P1 and F2 require two new search
algorithms, which have been implemented in LTSmin. F1 and F3 have been implemented in
the new stand-alone HIVE tool. The general procedure to perform an ISV is as follows: First,
the user selects a strict subsystem $P' \subset P$ which is guaranteed to yield finite behaviour (again,
alternatively, this behaviour is bounded in the ISV, but then, the overall search could be
non-exhaustive). In P1, the LTS $\mathcal{P}' = (S', \mathcal{A}', T', s'_{in})$, described by $P'$, is constructed and saved
to disk, together with a weight function $tc : S' \to \mathbb{N}$. For this, we have extended the DFS
implementation in LTSmin, as described in Alg. 1. The $tc$ function (see also Table 1) assigns 1 to
deadlock states, i.e. if $N'(s) = \emptyset$, and the sum of all the successor weights to any other state
(note that $N'$ is the next-state function of $\mathcal{P}'$).
Figure 1: The LTS of an iPod process, with weights, and trace nr. 10
[Graph not reproduced. Recoverable side tables of the figure:]
Trace counts (state: tc): 0: 14, 1: 3, 2: 3, 3: 7, 4: 1, 5: 7, 6: 6, 7: 6, 8: 5, 9: 1, 10: 1, 11: 3, 12: 3, 13: 2, ..., 24: 2
Trace #10 (state → trace range): 0 → [0, 14⟩, 3 → [6, 13⟩, 5 → [6, 13⟩, 6 → [7, 13⟩, 7 → [7, 13⟩, 8 → [7, 12⟩,
11 → [9, 12⟩, 12 → [9, 12⟩, 13 → [10, 12⟩, 14 → [10, 12⟩, 15 → [10, 12⟩, 4 → [10, 11⟩
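The weight function can be illustrated with a minimal recursive sketch. It assumes an acyclic LTS stored as an adjacency dictionary; `trace_counts` and the toy LTS are hypothetical, and this is not the LTSmin implementation, which is written in C.

```python
from typing import Dict, List, Tuple

# succ[s] lists the outgoing transitions of s as (action, target) pairs.
Succ = Dict[int, List[Tuple[str, int]]]


def trace_counts(succ: Succ, s_in: int) -> Dict[int, int]:
    """Return tc for every state reachable from s_in (the LTS must be acyclic)."""
    tc: Dict[int, int] = {}

    def dfs(s: int) -> int:
        if s not in tc:  # tc doubles as the Closed set
            out = succ.get(s, [])
            # Deadlock states represent one trace; other states sum their successors.
            tc[s] = 1 if not out else sum(dfs(t) for _, t in out)
        return tc[s]

    dfs(s_in)
    return tc


# Hypothetical toy subsystem LTS (not the iPod process of Fig. 1).
succ: Succ = {0: [("a", 1), ("b", 2)], 1: [("c", 3), ("d", 4)], 2: [("c", 3)]}
tc = trace_counts(succ, 0)
print(tc[0])  # number of traces from the initial state to a deadlock state
```

A state reached along several paths is counted once per path but visited only once, which keeps the preprocessing linear in the size of the subsystem LTS. For deep LTSs an explicit stack would be needed instead of Python recursion.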
This allows efficient reasoning about the traces through $\mathcal{P}'$; the number of traces represented by a
trace prefix from $s'_{in}$ to some $s \in S'$ equals $tc(s)$. E.g., Fig. 1 shows a simplified acyclic LTS³ of an
iPod process as part of the DRM protocol specification from [13], with part of the definition of $tc$. With
this, if we sort the states based on their numbering (which was assigned by the DFS), then each trace
can be uniquely referred to with a natural number: note that the number of possible traces is 14, which
corresponds with $tc(0)$. Say that we want to identify trace 10 (shown in Fig. 1). State 0 has successors
$\{1, \ldots, 4\}$. Sorted by increasing state number, we first consider state 1; since $tc(1) = 3$, we conclude that
trace ⟨recv_C1_3⟩ represents traces 0 to 2, i.e. 3 traces, starting at trace 0.
We also have $tc(2) = 3$, therefore ⟨recv_C1_1⟩ represents traces 3 to 5. Similarly, ⟨recv_C1_2⟩ represents
traces 6 to 12, and ⟨off(C1)⟩ represents trace 13. This means that ⟨recv_C1_2⟩ is a prefix of trace 10.
Since state 3 only has state 5 as a successor, clearly ⟨recv_C1_2, send⟩ is also a prefix (this agrees with
$tc(5) = 7$: all 7 traces represented by ⟨recv_C1_2⟩ are also represented by ⟨recv_C1_2, send⟩). In this
fashion, the complete trace can be constructed following the states listed on the right of
Fig. 1. This principle is used for trace selection in HIVE (F1). In ISV, each worker is bounded by a trace
through $\mathcal{P}'$, given by HIVE (this will be explained next). Therefore, each trace represents a
worker job to be performed, and $\mathcal{P}'$ represents the set of jobs.
Algorithm 1 Trace-counting DFS
Require: $P' \subset P$, $s'_{in}$, $\mathcal{A}'$
Ensure: $\mathcal{P}'$ and $tc : S' \to \mathbb{N}$ are constructed
  $Closed \leftarrow \emptyset$
  $tc(s'_{in}) \leftarrow$ dfs($s'_{in}$)
  dfs($s$) =
    if $s \notin Closed$ then
      $tc(s) \leftarrow 0$
      for all $s' \in N'(s)$ do
        $tc(s) \leftarrow tc(s) +$ dfs($s'$)
      if $N'(s) = \emptyset$ then
        $tc(s) \leftarrow 1$
      $Closed \leftarrow Closed \cup \{s\}$
    return $tc(s)$
From a trace $\langle \ell_0, \ldots, \ell_n \rangle$ through $\mathcal{P}'$ ($n \in \mathbb{N}$), a so-called swarm
trace $\sigma = \langle R(\ell_0), \ldots, R(\ell_n) \rangle$ can be constructed, taking into account synchronisations with $P \setminus P'$.
Whenever a worker thread can be launched, HIVE selects a swarm trace. When the HIVE tool is launched
to start an ISV, this is done first.
² Strictly speaking, a weaker requirement suffices [14].
³ The actual LTS of this example, in which the actions are extended with some data parameters, consists of 547 states.
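The selection principle described above can be sketched as follows, assuming the weights tc have already been computed. The function `select_trace` and the toy LTS are hypothetical. The second example reuses only the first step of the iPod example from the text, treating states 1 to 4 as end points, which is a simplification.

```python
from typing import Dict, List, Tuple

Succ = Dict[int, List[Tuple[str, int]]]  # state -> [(action, target), ...]


def select_trace(succ: Succ, tc: Dict[int, int], s_in: int, k: int) -> List[str]:
    """Return the action sequence of trace number k, with 0 <= k < tc[s_in]."""
    if not 0 <= k < tc[s_in]:
        raise ValueError("trace number out of range")
    trace: List[str] = []
    s = s_in
    while succ.get(s):
        # Consider successors by increasing state number; each one covers tc[t] traces.
        for label, t in sorted(succ[s], key=lambda e: e[1]):
            if k < tc[t]:
                trace.append(label)
                s = t
                break
            k -= tc[t]  # skip the traces represented by this successor
        else:
            raise ValueError("tc is inconsistent with succ")
    return trace


# Hypothetical toy LTS with its weights.
succ: Succ = {0: [("a", 1), ("b", 2)], 1: [("c", 3), ("d", 4)], 2: [("c", 3)]}
tc = {0: 3, 1: 2, 2: 1, 3: 1, 4: 1}
print([select_trace(succ, tc, 0, k) for k in range(tc[0])])

# First step of the iPod example, truncated after the initial state.
ipod: Succ = {0: [("recv_C1_3", 1), ("recv_C1_1", 2), ("recv_C1_2", 3), ("off(C1)", 4)]}
ipod_tc = {0: 14, 1: 3, 2: 3, 3: 7, 4: 1}
print(select_trace(ipod, ipod_tc, 0, 10))
```

Because only the weights of the successors along one path are inspected, extracting a trace does not require enumerating the other traces.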
A launched worker thread performs an informed swarm search (ISS), implemented in LTSmin
(Alg. 2 and F2). In Alg. 2, $\sigma$ is the swarm trace assigned by the HIVE tool, and $\sigma(i)$ is the singleton
set containing the $(i+1)$th element of $\sigma$ (if $\sigma$ contains fewer than $i+1$ elements, we say that $\sigma(i) = \emptyset$).
In the ISS, $\mathcal{P}$ is explored, but not exhaustively: the potential behaviour of the subsystem $P'$ is restricted
to $\sigma$, which restricts exploration of $\mathcal{P}$. For each visited state $s$, Next is extended with $N|(\mathcal{A} \setminus A)(s)$,
i.e. all successor states reachable via behaviour of $P \setminus P'$, and Step is extended with $N|\sigma(i)(s)$, i.e. all
successor states reachable via the current behaviour in $\sigma$.
Algorithm 2 BFS-based ISS
Require: $P$, $s_{in}$, $\mathcal{A}$, $A = \mathcal{A}' \cup M(\mathcal{A}')$, $\sigma$
Ensure: $\mathcal{P}$ restricted to $\sigma$ is explored
  $i \leftarrow 0$
  $Open \leftarrow \{s_{in}\}$; $Closed, Next, Step, F_i \leftarrow \emptyset$
  while $Open \neq \emptyset \lor Step \neq \emptyset$ do
    if $Open = \emptyset$ then
      $i \leftarrow i + 1$
      $Open \leftarrow Step \setminus Closed$; $Step, F_i \leftarrow \emptyset$
    for all $s \in Open$ do
      $Next \leftarrow Next \cup N|(\mathcal{A} \setminus A)(s)$
      $Step \leftarrow Step \cup N|\sigma(i)(s)$
      $F_i \leftarrow F_i \cup \{\ell \in A \mid N|\{\ell\}(s) \neq \emptyset\}$
    $Closed \leftarrow Closed \cup Open$
    $Open \leftarrow Next \setminus Closed$; $Next \leftarrow \emptyset$
When all states in Open are explored, the contents of Next is moved to Open, after duplicate detection
(for which the search history Closed is used). Note that when all reachable states have been
explored in this manner, $i$ is increased, by which the ISS moves to the next step in $\sigma$, and new states
become available. The main idea of ISV is to construct the set of all possible traces through $\mathcal{P}'$, and to
perform an ISS through $\mathcal{P}$ for each of those traces. This means that eventually $\mathcal{P}$ is completely
explored. A proof of correctness can be sketched as follows: say that all traces through $\mathcal{P}'$ have been used by
workers to explore $\mathcal{P}$, and that after this, some reachable state $s \in S$ has never been visited. We will
show that this leads to a contradiction. It follows from Alg. 2 that for each state $t$ to be explored, all
new states $t' \in N|(\mathcal{A} \setminus A)(t)$ are going to be explored as well, and for
some $i$, $t'' \in N|\sigma(i)(t)$ is going to be added to Step. This implies that all states $\hat{t} \in N|(A \setminus \sigma(i))(t)$ are
going to be ignored. From this and the fact that $s_{in}$ is explored, it follows that a state $s$ is ignored iff for all
traces through $\mathcal{P}$ from $s_{in}$ to $s$, there exist $t, \hat{t} \in S$ such that
$s_{in} \xrightarrow{\mathcal{A}}{}^{*}\, t \xrightarrow{\ell} \hat{t} \xrightarrow{\mathcal{A}}{}^{*}\, s$, with $\ell \in A \setminus \sigma(i)$, $i$ being
the current position in $\sigma$ when exploring $t$. Let us consider one of those traces. We call $\sigma'$ the swarm
trace followed to reach $t$ from $s_{in}$ over that trace. Note that this is a prefix of $\sigma$. Let us assume that by
following $\sigma'$ extended with $\ell$, $s$ can be reached from $s_{in}$.⁴ Since $\sigma'$ has been derived from a trace through
$\mathcal{P}'$ and $\ell \in A$, the extended trace must also be derivable from a trace through $\mathcal{P}'$. But then, since all traces
through $\mathcal{P}'$ have been used in the ISV, $s$ must have been visited by some other worker that followed $\sigma'$,
and we have a contradiction.
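The search can be sketched along the lines of Alg. 2, under the assumption that the transition relation is available as an explicit set. In LTSmin it is generated on the fly. The function `iss`, its return value and the toy system are hypothetical.

```python
from typing import AbstractSet, Dict, List, Set, Tuple

Transition = Tuple[int, str, int]


def iss(T: AbstractSet[Transition], s_in: int, all_actions: Set[str],
        A: Set[str], sigma: List[str]) -> Tuple[Set[int], Dict[int, Set[str]]]:
    """BFS-based ISS: explore T, allowing the actions in A only as dictated by sigma."""

    def succ(labels: AbstractSet[str], s: int) -> Set[int]:
        return {t for (src, l, t) in T if src == s and l in labels}

    def sigma_at(i: int) -> Set[str]:
        return {sigma[i]} if i < len(sigma) else set()

    i = 0
    open_, closed, nxt, step = {s_in}, set(), set(), set()
    feedback: Dict[int, Set[str]] = {0: set()}
    while open_ or step:
        if not open_:  # current level exhausted: move one position along sigma
            i += 1
            open_, step = step - closed, set()
            feedback[i] = set()
        for s in open_:
            nxt |= succ(all_actions - A, s)   # behaviour outside the subsystem
            step |= succ(sigma_at(i), s)      # the single allowed subsystem action
            feedback[i] |= {l for l in A if succ({l}, s)}  # observed subsystem actions
        closed |= open_
        open_, nxt = nxt - closed, set()
    return closed, feedback


# Hypothetical system: the subsystem contributes "x" and "y", the rest "a" and "b".
T = {(0, "a", 1), (0, "x", 2), (0, "y", 3), (2, "b", 4), (3, "b", 5), (1, "x", 6)}
ALL, A = {"a", "b", "x", "y"}, {"x", "y"}
visited_x, fb_x = iss(T, 0, ALL, A, ["x"])
visited_y, fb_y = iss(T, 0, ALL, A, ["y"])
print(sorted(visited_x), sorted(visited_y))
```

In this toy system the subsystem has the two traces ⟨x⟩ and ⟨y⟩. Neither search visits every state, but together they cover all seven, which mirrors the argument of the proof sketch.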
In case P and P′ sometimes synchronise, the trace counting will produce an over-approximation of
the possible set of traces of P′ in the context of P. This is because in the trace counting, it is always
assumed that whenever P′ needs to synchronise, this can happen in P. The result is that some swarm
traces may not correspond with actual potential behaviour in P. To deal with this, ISS includes a feedback
procedure: For every position i in σ, it is recorded in F_i which potential behaviour of P′ has actually
been observed. When finished, ISS returns the F_i, and using these, HIVE can prune away both σ and
other, invalid, traces (F3). Since each trace prefix represents a set of traces with consecutive numbers
(see e.g. the ranges for states in trace 10 in Fig. 1), the set of explored and pruned swarm traces can
be represented in a relatively small list of ranges. Initially, the set of explored and pruned swarm traces is empty. Say we
explore the LTS P of the DRM specification, and P′ is as displayed in Fig. 1, and say it is detected
that synchronisation with recv_C1_2 at state 0 (to state 3) can actually not happen in P. As already
mentioned, ⟨recv_C1_2⟩ represents [6, 13⟩. So after pruning, [6, 13⟩ represents the new set of explored
traces. Furthermore, elements in the list can often be merged. E.g., if ranges [0, 5⟩ and [8, 14⟩ have been
explored earlier, and range [5, 8⟩ is to be added, the result can again be described using a single range
[0, 14⟩. At all times, HIVE is ready to launch another worker (F1) and to prune more traces (F3). When
there are no more swarm traces left to explore, the ISV is finished.
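The range bookkeeping can be illustrated with a short sketch. It assumes half-open integer ranges kept in a sorted Python list, whereas HIVE itself uses a linked list in C. The function names `add_range` and `all_explored` are hypothetical.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open range [lo, hi) of trace numbers


def add_range(ranges: List[Range], new: Range) -> List[Range]:
    """Add `new` to a list of disjoint ranges, merging overlapping or adjacent ones."""
    lo, hi = new
    merged: List[Range] = []
    for a, b in ranges:
        if b < lo or hi < a:   # strictly apart from the new range: keep unchanged
            merged.append((a, b))
        else:                  # overlapping or touching: absorb into the new range
            lo, hi = min(lo, a), max(hi, b)
    merged.append((lo, hi))
    return sorted(merged)


def all_explored(ranges: List[Range], total: int) -> bool:
    """True iff the ranges have collapsed into the single range [0, total)."""
    return ranges == [(0, total)]


explored = add_range([], (6, 13))          # pruning everything below <recv_C1_2>
print(explored)
explored = add_range([(0, 5), (8, 14)], (5, 8))  # the merge example from the text
print(explored, all_explored(explored, 14))
```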
3 Implementation and Using the Tools
Implementation The trace-counting DFS and ISS have both been implemented in an unofficial extension
of the LTSmin toolset version 1.6-19, which has been written in C. Since LTSmin already contains
a whole range of exploration algorithms (both explicit-state and symbolic), there was no need to implement
new data structures. ISV is very light-weight in terms of communication between HIVE and the
workers, the only information sent to launch an ISS being a swarm trace in the form of a list of actions.
In LTSmin, this list is stored as a linked list, and a pointer traverses this list when exploring, to
keep track of the current swarm trace position. In addition to this, a bit set is used to keep track of
the encountered actions stemming from P′ since the last move along the swarm trace. This is done to
construct the F_i. A bit set implementation using a tree data structure is available in the LTSmin toolset.
Unfortunately, it is currently not possible to automatically extract P′ of a given subsystem from P,
meaning that P′ must manually be derived from P. At times, this requires quite some inside knowledge
of the description, therefore it is at the moment the main reason that we have not yet performed more
experiments. Automatic construction of the P′ description is listed as future work (see Section 5).
The HIVE tool consists of about 1,200 lines of C code. Because the communication is lightweight,
and because interactions between the workers and HIVE either involve asking for a new trace
and receiving it, or sending the results of an ISS, we decided to implement all communication in the
request-response method using TCP/IP sockets. During an ISV, HIVE frequently needs to extract traces
from P′, which is kept in memory together with the tc-function. Besides this, a linked list L of nodes
containing trace ranges (the [i, j⟩ mentioned in Section 2) is maintained, representing the set of explored
traces. Currently, when a trace is selected (F1), an ID is chosen randomly from L, but one can imagine
other selection strategies (see Section 5). Then, the corresponding trace is extracted from P′.
⁴ This is not true if there are multiple transitions stemming from P′ on the trace to s not agreeing with σ, but then, we can
repeat the reasoning in the proof sketch until there is only one left.
When launching many workers, frequent requests to HIVE are to be expected. Therefore, HIVE has
been implemented with pthreads; whenever a worker sends a request, a new thread is launched in HIVE
to handle the request. If a new swarm trace is required, F1 is performed, and if feedback is given, F3 is
performed. The LTS P′ is never changed, hence no race conditions can occur when multiple threads read
it, but L is frequently accessed and updated: both selecting a trace and processing feedback access L, and the latter
involves writing. For this reason, we introduced a data lock on L. We plan to use more fine-grained
locking in the future, but we have not yet experienced a real slowdown when using one lock.
During an ISV, HIVE keeps accepting new requests until L has one node containing the range
[0, tc(s′in)⟩. From that moment on, any requests are answered with the command to terminate, effectively
ending all worker executions. The same is done if a worker reports in its feedback that a counter-example
to a property to check has been found, because the ISV can stop immediately in that case.
Finally, all has been tested on Linux (Red Hat 4.3.2-7 and Debian 6.0.1) and Mac OS X 10.6.8.
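The single-lock design can be mimicked in a few lines. This is only a sketch: HIVE is written in C with pthreads and TCP/IP sockets, whereas here the workers are plain Python threads calling a shared object directly, and the explored trace IDs are kept in a set rather than in the range list L. The class `SwarmSet` and its methods are hypothetical.

```python
import random
import threading
from typing import Iterable, Optional, Set


class SwarmSet:
    """Shared bookkeeping of explored trace IDs, guarded by one lock."""

    def __init__(self, total: int) -> None:
        self.total = total
        self.explored: Set[int] = set()
        self.lock = threading.Lock()

    def select(self) -> Optional[int]:
        """F1: pick a random unexplored trace ID, or None to signal termination."""
        with self.lock:
            remaining = [k for k in range(self.total) if k not in self.explored]
            return random.choice(remaining) if remaining else None

    def update(self, ids: Iterable[int]) -> None:
        """F3: record explored (or pruned) trace IDs."""
        with self.lock:
            self.explored.update(ids)

    def finished(self) -> bool:
        with self.lock:
            return len(self.explored) == self.total


def worker(swarm: SwarmSet) -> None:
    while True:
        k = swarm.select()
        if k is None:          # all traces done: terminate
            return
        swarm.update([k])      # stand-in for running an ISS and reporting feedback


swarm = SwarmSet(total=50)
threads = [threading.Thread(target=worker, args=(swarm,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(swarm.finished())
```

Two workers may occasionally receive the same trace ID between a selection and the corresponding update. This costs redundant work but does not affect completeness.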
Setting up and launching an ISV In the following, we assume that we have a µCRL specification
named spec.mcrl describing P, and a specification named specsub.mcrl describing P′. Actually, any
action-based modelling language compatible with LTSmin is suitable for ISV as well. A µCRL specification
is usually first linearised to a tbf file, using the µCRL toolset [3], which is subsequently used
as the actual input of LTSmin. Having spec.tbf and specsub.tbf, the weighted P′ is saved to disk as
follows, with ⟨sub⟩ being the chosen base name for the files storing the weighted P′:⁵
lpo2lts-grey --getswarm=⟨sub⟩ specsub.tbf
This produces the files sub.swh, sub.swc, and sub.sww, containing the actions in A′, the transitions in T′
(with actions and states represented by numbers), and the weights of the states, respectively. Actually,
if specsub.tbf yields infinite behaviour, this can be bounded by a depth n using the option --swbound=⟨n⟩.
HIVE can now be launched on the same machine by invoking the following, with ⟨portnr⟩ being
the port number it is supposed to listen at for incoming requests:
hive ⟨portnr⟩ ⟨sub⟩
An ISS can be started as follows, ⟨server⟩ being the IP address of the machine running HIVE:
lpo2lts-grey --swarm=⟨sub⟩ --hiveserver=⟨server⟩ --hiveport=⟨port⟩ spec.tbf
Note that each ISS also needs information on P′. Actually, only sub.swh is read from disk, to learn
A′. Therefore, this file should be available on all machines where ISSs are started. Finally, in practice,
one often wants to start many ISSs simultaneously, and start a new ISS every time one terminates. This
whole procedure can be launched using the shell script hive_launch.sh.
4 Experimental Results
Table 2 shows experimental results using the µCRL [8] specifications of a DRM protocol [13] and the
Link Layer Protocol of the IEEE-1394 Serial Bus (Firewire) [11]. We were not yet able to perform
more experiments using other specifications, mainly because subsystem specifications still need to be
constructed manually, which requires a deep understanding of the system specifications. The experiments
were performed on a machine with two dual-core AMD Opteron (tm) processors 885 2.6 GHz, 126 GB
RAM, running Red Hat 4.3.2-7. We simulated the presence of 10 and 100 workers for each experiment
(the fully independent worker threads can also be run in sequence). This has some effect on the results:
in order to simulate 10 and 100 parallel ISSs, HIVE postponed the processing of ISS feedback until 10
and 100 of them had been accumulated, respectively. When the ISSs truly run in parallel, this feedback
processing is done continuously, and redundant work can therefore be avoided at an earlier stage. In the
DRM case, we selected both one and two iPod processes for P′, and in the Firewire case, a bounded
analysis of one of the link protocol entities resulted in P′. The SV runs have been performed with the
DFS of LTSmin. Since the specifications are correct, there is no early termination for the explorations,
meaning that in SV, all reachable states are explored 10 times. In the DRM case, ISV based on one iPod
process leads to an initial swarm set with 5,124 traces, 45 of which were actually needed for different
runs. Each run needed to explore at most 18% of P, and in total, the number of states explored was
smaller than in the SV. ISV based on two iPod processes leads to a much larger swarm set, and clearly,
feedback information is essential. ISV with 10 parallel workers explored in total 2.5 times more states
than SV, but each ISS covered at most about 0.5% of P (70,211 of 13,246,976 states), meaning that they
needed a small amount of memory.
This demonstrates that ISV is useful in a network where the machines do not have large amounts of
RAM. In the Firewire case with 10 parallel workers, each ISS explored at most about 0.17% of P
(236,823 of 137,935,402 states), and in total, the ISV explored 83% fewer states than the SV. In terms of
scalability related to the number of
parallel workers, the results with 100 workers show that the overall execution times can be drastically
reduced when increasing the number of workers: compared to having 10 workers, 100 workers reduce
the time by 86% in the DRM case, and 88% in the Firewire case. The number of ISSs has actually
increased, but we expect this to be an effect of the simulations of parallel workers, as explained before.
Table 2: Results for bug-free cases with SV and ISV, 10 and 100 workers.
case                # workers  search        # est. runs   # runs  max. # states  max. time  total # states  total time
DRM (1nnc, 3ttp)    10         SV            10            10      13,246,976     19,477 s   132,469,760     19,477 s
DRM (1nnc, 3ttp)    10         ISV, 1 iPod   5,124         45      2,352,315      2,832 s    85,966,540      14,157 s
DRM (1nnc, 3ttp)    10         ISV, 2 iPods  1.31 ∗ 10^13  7,070   70,211         177 s      353,591,910     125,139 s
DRM (1nnc, 3ttp)    100        ISV, 2 iPods  1.31 ∗ 10^13  9,900   70,211         175 s      361,050,900     17,325 s
1394 (3 link ent.)  10         SV            10            10      137,935,402    105,020 s  1,379,354,020   105,020 s
1394 (3 link ent.)  10         ISV           3.01 ∗ 10^9   1,160   236,823        524 s      235,114,520     60,784 s
1394 (3 link ent.)  100        ISV           3.01 ∗ 10^9   1,400   236,823        521 s      252,206,430     7,294 s
# est. runs: estimated # runs needed (# of swarm traces for ISV). # runs: actual # runs needed. total (max.) # states: total
(largest) # states explored (in a single search). total (max.) time: total (longest) running time (of a single search).
⁵ For µCRL specifications, lpo2lts-grey is the explicit-state space generator of LTSmin. For other modelling languages,
another appropriate LTSmin tool should be used.
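Several of the percentages quoted above can be re-derived from the figures in Table 2. The short calculation below does so; the variable names are illustrative, and it assumes that the number of states explored by a single SV worker equals the size of the state space, which holds because the specifications are bug-free.

```python
# Figures copied from Table 2.
drm_state_space = 13_246_976       # states explored by one SV worker (DRM)
drm_isv_1ipod_max = 2_352_315      # largest single ISS, ISV with 1 iPod
fw_sv_total = 1_379_354_020        # total states, SV, Firewire
fw_isv_total_10 = 235_114_520      # total states, ISV with 10 workers, Firewire
drm_isv_time = {10: 125_139, 100: 17_325}   # total time in s, ISV with 2 iPods
fw_isv_time = {10: 60_784, 100: 7_294}      # total time in s, ISV, Firewire


def pct(x: float) -> int:
    """Round a fraction to a whole percentage."""
    return round(100 * x)


largest_share = pct(drm_isv_1ipod_max / drm_state_space)          # share of P per run
fewer_states = pct(1 - fw_isv_total_10 / fw_sv_total)             # ISV vs. SV, Firewire
drm_time_cut = pct(1 - drm_isv_time[100] / drm_isv_time[10])      # 100 vs. 10 workers
fw_time_cut = pct(1 - fw_isv_time[100] / fw_isv_time[10])         # 100 vs. 10 workers
print(largest_share, fewer_states, drm_time_cut, fw_time_cut)
```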
A full experimental analysis of the algorithms would also incorporate cases with bugs, to test the
speed of detection. This is future work, but since ISV has practically no overhead compared to SV,
and the ISSs are embarrassingly parallel and explore very different parts of a state space, we expect
ISV and SV to be comparable in their bug-hunting capabilities. Finally, we chose not to compare ISV
experimentally with other distributed techniques (e.g. those using frequent synchronisations), because
there are too many undesired factors playing a role when doing that (e.g. implementation language,
modelling language, level of expertise of the user with the model checker).
5 Conclusions and Future Work
We presented the functionality of the HIVE tool and two new LTSmin algorithms for ISVs. ISV is an
SV method for action-based formalisms to bound the embarrassingly parallel workers to different LTS
parts. Even in the worst case, i.e. if the system under verification is correct, no worker needs to perform an exhaustive
exploration, and memory and time requirements for a single worker can remain low.
Tool availability Both the ISV extended version of LTSmin and HIVE are available at
http://www.win.tue.nl/~awijs/suppls/hive_ltsmin.html.
Future work We plan to further develop the HIVE tool such that a description P′ of a given subsystem
can automatically be derived from a given description P. We also wish to investigate which
kinds of subsystems are particularly effective for the work distribution in ISV, and which are not, so that
an automatic subsystem selection method can be derived. If P′ yields infinite behaviour, we want to
support its full behaviour automatically in the future. As long as P is finite-state, this should be possible.
Furthermore, we want to investigate different strategies to select swarm traces and to guide individual
ISSs. Finally, we will perform more experiments with much larger state spaces, using a computer cluster.
References
[1] J. Barnat, L. Brim, M. Ceška & P. Ročkai (2010): DiVinE: Parallel Distributed Model Checker. In:
HiBi/PDMC’10, pp. 4–7, doi:10.1109/PDMC-HiBi.2010.9.
[2] B. Bingham, J. Bingham, F.M. de Paula, J. Erickson, G. Singh & M. Reitblatt (2010): Industrial Strength
Distributed Explicit State Model Checking. In: HiBi/PDMC’10, IEEE, pp. 28–36, doi:10.1109/PDMC-HiBi.2010.13.
[3] S.C.C. Blom, W.J. Fokkink, J.F. Groote, I. van Langevelde, B. Lisser & J.C. van de Pol (2001): µCRL:
A Toolset for Analysing Algebraic Specifications. In: CAV’01, LNCS 2102, Springer, pp. 250–254,
doi:10.1007/3-540-44585-4_23.
[4] S.C.C. Blom, J.C. van de Pol & M. Weber (2010): LTSmin: Distributed and Symbolic Reachability. In:
CAV’10, LNCS 6174, pp. 354–359, doi:10.1007/978-3-642-14295-6_31.
[5] M.B. Dwyer, S.G. Elbaum, S. Person & R. Purandare (2007): Parallel Randomized State-space Search. In:
ICSE’07, IEEE, pp. 3–12, doi:10.1109/ICSE.2007.62.
[6] I. Foster (1995): Designing and Building Parallel Programs. Addison-Wesley.
[7] H. Garavel, R. Mateescu, D. Bergamini, A. Curic, N. Descoubes, C. Joubert, I. Smarandache-Sturm &
G. Stragier (2006): DISTRIBUTOR and BCG_MERGE: Tools for Distributed Explicit State Space Generation. In: TACAS’06, LNCS 3920, Springer, pp. 445–449, doi:10.1007/11691372_30.
[8] J.F. Groote & A. Ponse (1995): The Syntax and Semantics of µCRL. In: ACP’94, Springer, pp. 26–62.
[9] G.J. Holzmann, R. Joshi & A. Groce (2008): Swarm Verification. In: ASE’08, IEEE, pp. 1–6,
doi:10.1109/ASE.2008.9.
[10] F. Lerda & R. Sista (1999): Distributed-Memory Model Checking with SPIN. In: SPIN’99, LNCS 1680,
Springer, pp. 22–39, doi:10.1007/3-540-48234-2_3.
[11] S.P. Luttik (1997): Description and Formal Specification of the Link Layer of P1394. SEN-R 9706, CWI.
[12] G.E. Moore (1998): Cramming more Components onto Integrated Circuits. Proc. of the IEEE 86(1), pp.
82–85, doi:10.1109/JPROC.1998.658762.
[13] M. Torabi Dashti, S. Krishnan Nair & H.L. Jonker (2008): Nuovo DRM Paradiso: Towards a Verified Fair
DRM Scheme. Fundamenta Informaticae 89(4), pp. 393–417.
[14] A.J. Wijs (2011): Towards Informed Swarm Verification. In: NFM’11, LNCS 6617, Springer, pp. 422–437,
doi:10.1007/978-3-642-20398-5_30.