HUBAUX, A. - Collaborative, Dependable and Controlled
University of Namur
Faculty of Computer Science
Belgium
FUNDP
Feature-based Configuration:
Collaborative, Dependable, and Controlled
Arnaud Hubaux
© Arnaud Hubaux, 2012
© Presses universitaires de Namur, 2012
Rempart de la Vierge, 13
B - 5000 Namur (Belgique)
Any reproduction of any extract of this book, beyond the restrictive limits
provided by law, by any process whatsoever, notably by photocopying or
scanning, is strictly forbidden for all countries.
Printed in Belgium
Jury
Résumé
Acknowledgements
This whole adventure started on the impulse of two men: Prof. Pierre-Yves
Schobbens and Prof. Patrick Heymans. Pierre-Yves supervised my Master’s
thesis and enrolled me in the MoVES project. His vast knowledge and razor-sharp
rigour repeatedly helped me improve and simplify formal definitions and
algorithms.
Over four years, Patrick put on so many caps that I barely remember
them all: supervisor, mentor, overly picky reviewer, friend, roommate, host,
mad neighbour... His sniper’s eye spots the weaknesses of any piece of text
and his magic hand rephrases it frustratingly plainly. He markedly improved
all my papers and taught me a lot about paper writing.
The MoVES project has been a cradle for the new-born researcher I was.
The three work packages that took me on board (WP2, 3, and 4) taught me
to think, act, and communicate as a researcher. The collaborations built
within the network resulted in several joint publications integrated into this
manuscript.
As a student and researcher, I had the chance to work with Andreas Classen
for over nine years. His scientific skills, discipline, dedication, and outspokenness
make working with him both a pleasure and a challenge. Outside the
office, he is a hell of a party-goer, always up for a beer, and eager to debate.
Several chapters of this manuscript bear his indelible mark.
All my co-authors own a significant piece of this manuscript, too. I also
thank the industrial partners who helped me come back to reality and fed
me with valuable insight at conferences, meetings, or boarding areas. I am
particularly thankful to Jean-Marc Astesana, Danilo Beuche, Gaëtan Delannay,
and Herman Hartmann.
ix
x
For more than three years, I shared an office with a bunch of fantastic
guys who all contributed to the ideas presented here: Raimundas, Germain,
Quentin, Jean-Christophe, Raphaël, and Ebrahim. Special thanks to Ebrahim
who spared no effort working on the toolset despite the difference in timezone,
and Anthony Cleve whose acute understanding of the research game kept us
debating for hours.
The stay at the GSD lab at the University of Waterloo (ON, Canada) was
a milestone in my short research career. I thank Prof. Krzysztof Czarnecki
for welcoming me into his lab for one year. Despite his busy schedule, he always
managed to find some time to discuss and put drifting ideas back on track. His
passion and vision were genuinely inspiring. I also thank the GSD lab for the
lively discussions, and Yingfei and Steven for the nights they spent wrapping
up our work on fix generation. I also want to express my gratitude to Marcilio
for his help with SPLOT.
This stay abroad also put a heavy load of red tape on my parents during my
absence. I thank them for their patience. I also thank my brother who kindly
accepted to proofread several parts of the manuscript and give his outsider’s
feedback.
Finally, and foremost, I am immensely grateful to my fiancée, Marie-Charlotte
Druet, who kept me going through the dire straits of this thesis. Without the
slightest hesitation, she followed me to Toronto for one year, leaving her family
and friends behind. I apologize for all the rough patches she went through and
dedicate this manuscript entirely to her.
This work is sponsored by the Interuniversity Attraction Poles Programme of the
Belgian State, Belgian Science Policy, under the MoVES project, and by Wallonia-Brussels
International under the IN.WBI Excellence grants.
Contents

Contents
Glossary
1 Introduction
  1.1 “To Foresee Is To Rule”
  1.2 When the Rubber Hits the Road
  1.3 Problem Statement
  1.4 Claimed Contribution
  1.5 Reader’s Guide
  1.6 Bibliographical Notes
10 Conclusion
  10.1 Our Vision
  10.2 Summary of the Contribution
  10.3 Perspective
Index
Bibliography
Glossary
Chapter 1
Introduction
selectoptions.aspx captured on July 4, 2011. For concision, only a subset of the options is
shown here.
1.2 When the Rubber Hits the Road
The policy of the shuttle company imposes dark colours for its vehicles, hence
the selection of the metallic Jet Black colour. To prevent passengers from
tampering with windows during a ride, he selects the Electric ’one touch’ windows
(driver’s side) feature. That choice results in the automatic selection of
Electric Door Mirrors, which is a pleasant surprise for the manager: that
latter option was not available before and is now automatically included. Then, he
selects the Right side loading door, which triggers the selection of the Left side
loading door. The purchasing manager thus makes decisions about options one
at a time, rather than considering all their combinations.
This configuration panel is only the first one in a series of four. Once he is
done with the selection of equipment and options, the purchasing manager can
proceed to the second stage (02 Preferences) and so on until the final stage
is reached. Besides constraints between features, constraints are also enforced
between stages. For instance, to move from 03 Version to 04 Summary, the
manager must choose one model version. At the end of the fourth stage, the
configuration is sent to a car dealer who will process the request and return a
final quote.
Functionalities like cost minimisation, quote generation, order, and stock
management are beyond the scope of feature-based configuration. These are
domain-specific extensions wrapped around the FM that glue it together with
the assets of a global system. The boundary of feature-based configuration
stops at the creation of complete and correct product specifications.
This scenario is an everyday example of feature-based configuration. Yet, it
already highlights how decisions progressively narrow down the decision space
and can reveal hidden options, how dependencies between features restrict
choices, how options are clustered into coherent subsets (subgroups of options
and panels), and how configuration stages enforce the order in which decisions
are made.
For the configuration frontend, Renault has chosen a web interface. Others
rely for instance on tree-like representations or contextual menus. This work
looks beyond the graphical layer proposed to users. The graphical syntax
presented in the following examples is used for illustration purposes only and is not
meant to be prescriptive. Our challenge is to define and build the underlying
mechanisms that enable reasoning and assist users during configuration.
RQ3 How can the satisfiability and completion of the configuration process be
ensured?
[Figure 1.3 — organisation of the thesis: the chapters (including Chapter 6 — Basic Configuration Scheduling (RQ2), Chapter 7 — Advanced Configuration Scheduling (RQ3), and Chapter 9 — A Toolset for Advanced Feature-based Configuration (RQ2)) rest on layers for process, user, and FM management, a process reasoning engine, and solvers]
Chapters are organised in layers, as shown in Figure 1.3. The research question
addressed in a chapter is indicated on the left-hand side of the box.
The darker layers at the bottom are the foundations on which our
contributions rely. Chapter 2 and Chapter 3 both revisit work related to
feature modelling. Chapter 2 recalls the essence of software product line en-
gineering, briefly reviews existing variability modelling techniques, recalls the
formal semantics of FMs, and elaborates more on feature-based configuration.
Chapter 3 is presented as an essay that discusses feature modelling from the
perspective of the two leading fields in configuration: artificial intelligence in
manufacturing and software configuration management.
In Chapter 4, we present the results of a systematic literature survey on
separation of concerns in feature modelling (C1). We have identified seven
areas of research in which concerns are discussed. Our findings show that
the inherent vagueness in the feature abstraction and in the purpose of FMs
makes realistic FMs hard to comprehend and analyse. Also, separation and
composition techniques for concerns in FMs have been found to be rudimentary.
Based on those conclusions, Chapter 5 reexamines the generic concept
of view, provides a formal definition, investigates its properties, and proposes
three alternative visualisations (C2). These techniques are motivated and il-
lustrated through excerpts from a web-based meeting management application.
To organise these views in a coherent process, Chapter 6 looks into multi-
level staged configuration, extends the formal semantics of Chapter 2 to account
for the dynamics of the configuration process, and studies its properties (C3).
Multi-level staged configuration has several theoretical and practical
limitations. Chapter 7 overcomes most of these limitations by coupling the FM
and its views with a workflow that makes it possible to model non-linear processes and
pilot the configuration process (C3). That work is motivated and illustrated
1.6 Bibliographical Notes
The research presented in this thesis reuses and extends publications of the
author. We list below the most relevant peer-reviewed papers:
Journal
Workshop
Doctoral Symposium
Under review
Chapter 2
Feature Modelling in Software Product Line Engineering
Managing a set of programs has been a known problem since the early ages of software
engineering. Operating system developers rapidly learned that, to manage
different versions of the same operating system, it pays to study
the properties shared by several versions before considering their individual details [Par76].
The A-7E Corsair II attack aircraft used by the U.S. Navy from the 60s until
the late 80s was one of the first aircraft equipped with an onboard computer
system. The onboard software was expected to satisfy real-time performance
and modifiability constraints related to weaponry, platform, and symbology
on the display. At that time, a highly dependable and modular software ar-
chitecture proved to be the only way to go [BCK03]. In the financial realm,
software is extensively used to track and analyse the stock market. When
Market Maker committed in the late 90s to launching a web-based version of its
product, it was confronted with a maze of heterogeneous databases, computing
platforms, content-providing applications, and requirements. Besides these
technical challenges, a very short time-to-market was mandatory to keep ahead
of the competition. Their solution, called MERGER, was a global system satisfying
all their customer requirements, from which individually tailored products
could be derived [CN01].
More generally, the time when the industry could rely on a limited
set of products with few options has passed. The diverse requirements of
customers and the necessity to differentiate themselves drove software companies
to change their stance and develop versatile product families.
A first pragmatic intuition of product family was given by Parnas back in
[Figure 2.1 — the SPLE framework: domain engineering (domain requirements engineering, domain design, domain implementation, domain testing) produces a variability model; application engineering (application requirements engineering, design, implementation, testing) derives configurations 1 to n]
One of the main ideas behind SPLE is to dedicate a specific process, named
domain engineering, to the development of reusable artefacts, a.k.a. core assets.
These core assets are defined at every stage of the life-cycle, as illustrated in the
upper part of Figure 2.1. The set of products that can be derived from these
artefacts defines the portfolio of the SPL. Planning, organizing and building
2.2 Variability Modelling
[Table — concrete syntax of FM decomposition operators and their cardinality equivalents: and (⋀) = ⟨n..n⟩; or (⋁) = ⟨1..n⟩; xor (⨁) = ⟨1..1⟩; generalized cardinality = ⟨i..j⟩; optional = ⟨0..1⟩]
Textual representation
Various textual FM languages have been proposed as an alternative to graphical
ones for a number of reasons. Their main advantage over graphical notations
is that they do not require dedicated modelling tools; well-established tools
are already available for text-based editing, transformation, versioning, etc.
Furthermore, textual information and textual models can be easily exchanged,
for instance by email.
To the best of our knowledge, FDL [vDK02] was the first textual FM lan-
guage. It supports basic requires and excludes constraints and is arguably user
friendly, but it does not include attributes, cardinality-based decompositions
and other advanced constructs. It is also the first textual language with a
formal semantics.
The AHEAD [Bat05] and FeatureIDE [KTS+ 09] tools use the GUIDSL
syntax [Bat05], where FMs are represented through grammars. The syntax is
aimed at the engineer and is thus easy to write, read, and understand. However,
it does not support decomposition cardinalities, attributes, or hierarchical
decomposition of FMs, and it has no formal semantics.
The SXFM file format is used by SPLOT [MBC09]. While XML is used
for metadata, FMs are entirely text-based. Its advantage over GUIDSL is that
it makes the tree structure of the FM explicit through indentation. However,
except for the hierarchy, it has the same deficiencies as GUIDSL.
The VSL file format of the CVM framework [Rei09, AJL+ 10] supports
many constructs. Attributes, however, cannot be used in constraints. The
Feature Modelling Plugin [AC04] as well as the FAMA framework [BSTRC07]
use XML-based file formats to encode FMs. Tags make them hard to read and
write by engineers. Furthermore, none of them proposes a formal semantics.
Clafer (class feature reference) is a meta-modelling language for meta-models,
feature models, and combinations thereof [BCW10]. Clafer aims at
Furthermore, each d ∈ LFM must satisfy the following well-formedness
rules:
• r is the root: ∀n ∈ N • (∄n′ ∈ N • (n′, n) ∈ DE) ⇔ n = r;
• DE is acyclic: ∄n1, .., nk ∈ N • (n1, n2), .., (nk−1, nk), (nk, n1) ∈ DE;
• Terminal nodes are ⟨0..0⟩-decomposed.
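As an illustration (our own sketch, with made-up node names), the first two well-formedness rules can be checked mechanically on a small set of nodes N and decomposition edges DE:

```python
# Tiny illustrative FM skeleton: node names are ours, DE is the set of
# decomposition edges (parent, child).
N = {"r", "f", "g"}
DE = {("r", "f"), ("r", "g")}

def roots(nodes, edges):
    """Nodes with no incoming decomposition edge."""
    return {n for n in nodes if not any(e[1] == n for e in edges)}

def is_acyclic(nodes, edges):
    """No node can reach itself by following decomposition edges."""
    def reachable(start):
        seen, todo = set(), [start]
        while todo:
            n = todo.pop()
            for (a, b) in edges:
                if a == n and b not in seen:
                    seen.add(b)
                    todo.append(b)
        return seen
    return all(n not in reachable(n) for n in nodes)

# First rule: exactly the root r has no incoming edge; second rule: no cycle.
well_formed = roots(N, DE) == {"r"} and is_acyclic(N, DE)
```

The third rule (terminal nodes are ⟨0..0⟩-decomposed) would additionally require the decomposition-cardinality labelling, which is omitted here.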
Definition 2.1 is actually a formal definition of the graphical syntax of an
FM such as the one shown in Figure 2.2; for convenience, each feature is given
a name and a one-letter acronym. In abstract form, the FM of Figure 2.2
translates to:
Considering the previous example, the semantic function maps the diagram
of Figure 2.2 to all its valid feature combinations, three of which are listed
below:
{V, E, M, A, Y, N}, {V, E, M, A, Y, N, D, DY}, {V, E, M, A, Y, N, D, DN}, ...
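To make the semantic function concrete, the following sketch enumerates the valid configurations of a small hypothetical FM chosen to reproduce the three products listed above; it is our own simplification, not the actual FM of Figure 2.2:

```python
from itertools import chain, combinations

# Hypothetical FM (NOT Figure 2.2): root V with mandatory children
# E, M, A, Y, N and an optional child D, xor-decomposed into DY and DN.
MANDATORY = {"E", "M", "A", "Y", "N"}

def is_valid(config):
    if "V" not in config or not MANDATORY <= config:
        return False                      # root and mandatory features
    group = {"DY", "DN"} & config
    if "D" in config:
        return len(group) == 1            # xor: exactly one alternative
    return not group                      # no D implies no alternatives

features = ["V", "E", "M", "A", "Y", "N", "D", "DY", "DN"]
subsets = chain.from_iterable(combinations(features, k)
                              for k in range(len(features) + 1))
# The semantic function: the set of all valid feature combinations.
products = [set(s) for s in subsets if is_valid(set(s))]
```

Real tools do not enumerate the (exponentially many) subsets; they hand the same constraints to a solver, as discussed in Section 2.4.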
A formal semantics is the primary measure for precision and unambiguity, and
an important prerequisite for reliable tool-support. This section revisits some
of the most popular analyses on FMs and reasoning techniques.
2.4.1 Analyses
The benefit of defining a semantics before building a tool is the ability to
reason about the analyses the tool should conduct on a pure mathematical
level, without having to worry about their implementation. These analyses are
mathematical properties defined on the semantics that can serve as indicators,
validity or satisfiability checks. Over twenty years of existence, a great many
analyses of FMs have been proposed, the most frequent of them being:
2.4 Reasoning about Feature Models
MiniSAT [ES11]) take as input a boolean formula, and try to find an
assignment of the variables such that the formula is satisfiable. SAT solvers
usually require the boolean formula to be converted into conjunctive normal
form (CNF). A CNF is a conjunction of clauses where each clause is
a disjunction of literals such that a variable and its complement cannot
appear in the same clause. Standard conversion algorithms can be
used to transform any arbitrary boolean formula into CNF. Although
satisfiability is an NP-complete problem [Coo71], SAT solvers prove to
be very efficient at reasoning about FMs containing up to several thousand
features [MWC09].
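As an illustration (our own sketch; the feature names and clause encoding are hypothetical), a tiny FM can be encoded as CNF clauses and checked by brute-force enumeration; a real tool would hand the same clauses to a solver such as MiniSAT:

```python
from itertools import product

# Hypothetical FM: root Car with mandatory Engine, optional Sunroof,
# and an xor group {Diesel, Gasoline} under Engine.
# Variables: 1=Car, 2=Engine, 3=Sunroof, 4=Diesel, 5=Gasoline.
# CNF clauses; a negative literal -v means "not v".
cnf = [
    [1],               # the root is always selected
    [-1, 2], [-2, 1],  # Engine is mandatory: Car <-> Engine
    [-3, 1],           # Sunroof (optional) implies its parent Car
    [-4, 2], [-5, 2],  # group members imply Engine
    [-2, 4, 5],        # Engine implies at least one of Diesel/Gasoline
    [-4, -5],          # ...and at most one of them (xor)
]

def models(clauses, nvars):
    """Enumerate all satisfying assignments by brute force."""
    for bits in product([False, True], repeat=nvars):
        assign = {i + 1: b for i, b in enumerate(bits)}
        if all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses):
            yield {v for v, b in assign.items() if b}

valid = list(models(cnf, 5))  # each model is a valid product
```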
Binary decision diagram (BDD) solvers (e.g. JavaBDD [Wha11] or
CrocoPat [Bey11]) encode a propositional formula as a directed acyclic graph
(DAG) with two terminal nodes respectively representing true and false.
Each decision node is labelled with a boolean variable, and has exactly
two output edges respectively capturing the assignment of true or false
to the variable.
SAT and BDD solvers differ in that SAT solvers suffer from a high
complexity in time whereas BDD solvers suffer from a high complexity in
space. In practice, the two types of solvers complement each other and should be
picked carefully based on the type of analysis to conduct [Men09].
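As a toy illustration of the idea (our own sketch, not code from the thesis), a reduced BDD for a small formula can be built by Shannon expansion over a fixed variable order, and the resulting DAG used to count satisfying assignments:

```python
# Hypothetical formula over ordered variables x1..x3: x1 AND (x2 OR x3).
ORDER = [1, 2, 3]

def formula(assign):
    return assign[1] and (assign[2] or assign[3])

# A BDD node is either a terminal (True/False) or a tuple (var, low, high).
def build(level=0, assign=()):
    """Build a reduced BDD by Shannon expansion along ORDER."""
    if level == len(ORDER):
        return formula(dict(zip(ORDER, assign)))
    lo = build(level + 1, assign + (False,))
    hi = build(level + 1, assign + (True,))
    if lo == hi:          # reduction rule: drop a redundant test
        return lo
    return (ORDER[level], lo, hi)

def count_models(node, level=0):
    """Count satisfying assignments represented by the DAG."""
    if node is True:
        return 2 ** (len(ORDER) - level)   # remaining variables are free
    if node is False:
        return 0
    var, lo, hi = node
    skipped = ORDER.index(var) - level     # variables skipped by reduction
    nxt = ORDER.index(var) + 1
    return 2 ** skipped * (count_models(lo, nxt) + count_models(hi, nxt))
```

This ignores node sharing (hash-consing), which production BDD packages use to keep the graph small.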
Constraint programming-based. A constraint satisfaction problem (CSP)
is defined as “a finite set of variables, each of which is associated with
a finite domain, and a set of constraints that restricts the values the
variables can simultaneously take” [Tsa93]. The role of a CSP solver
(e.g. JaCoP [Jac11] or Choco [Cho11]) is to find an assignment for each
variable such that all the constraints are satisfied.
The mapping from an FM to a particular CSP solver is less straightforward
than with propositional logic because each solver has its own
encoding scheme. Details about alternative encodings can be found
in [BSRC10]. Overall, CSP solvers are less efficient at processing FMs
exclusively defined over Boolean domains. However, their ability to reason
about arbitrary domains makes them very good candidates for attributes
of numeric or textual types. They can also maximise objective functions,
which is sometimes needed to optimise the cost of a set of features (e.g.
total CPU consumption or memory footprint [TBC+09]).
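To make the contrast concrete, the following sketch (our own; the feature names, attribute values, and objective are invented) treats a tiny FM with a numeric attribute as a constraint problem and minimises an objective by exhaustive search, which a CSP solver would do far more cleverly with propagation and branching:

```python
from itertools import product

# Hypothetical CSP view of a tiny FM: two optional features with a
# numeric "cpu" attribute each, and a constraint that at least one
# of them is selected.
features = ["logging", "encryption"]        # boolean decision variables
cpu = {"logging": 5, "encryption": 20}      # attribute values per feature

def constraints(sel):
    return sel["logging"] or sel["encryption"]   # at least one feature

best = None
for values in product([False, True], repeat=len(features)):
    sel = dict(zip(features, values))
    if not constraints(sel):
        continue                             # discard invalid products
    cost = sum(cpu[f] for f in features if sel[f])
    if best is None or cost < best[1]:       # minimise total CPU cost
        best = (sel, cost)
```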
Other. Other approaches rely on original algorithms tailored for FMs. These
algorithms are usually bound to a particular dialect of FMs and focus
on very specific analyses [BSRC10]. Satisfiability modulo theory (SMT)
[NOT06] solvers are currently being evaluated [PNX+ 11] as an alternative
to propositional logic and CSP solvers. Solving an SMT problem means
deciding whether a first-order logic formula containing predicates (such
as linear equalities or inequalities) defined over non-binary variables is
satisfiable.
languages and several dialects of FMs. It then introduced the concrete syntax
and recalled the semantics of FMs of Schobbens et al. [SHTB07] used through-
out this thesis. Finally, common analyses of FMs and reasoning techniques
used in FBC were described and exemplified.
Chapter 3
A Glimpse at Feature Modelling in the Configuration Realm
3.1 Overview
Since the early 90s, FBC has adopted and revisited several notions from two
landmarks of computer science: artificial intelligence (AI) and software
configuration management (SCM). The tight bond between configuration in AI
and FBC has already been established. Günter et al. [GK99] recognise concept
hierarchies (similar to FMs) as a fundamental concept in their survey
of configuration methods used in knowledge-based configuration. According
to Junker’s classification of known configuration problems [Jun06], FBC falls
into the class of option selection or shopping list problems. Additionally, several
authors have already reviewed how AI contributes to the automation of FBC
(e.g. [Men09, Jan10, BSRC10]).
Rather than repeating existing work, we take several steps back from the
technical contribution of AI to outline its support for configuration in the
manufacturing industry and compare it to FBC. The manufacturing domain
caught our attention mainly because it has led research on configurators—
more generally called expert systems—in AI for years. Other researchers also
stressed the need to understand the relationship between hardware and software
product lines (e.g. [HK07]). In fact, many reports on the applications of SPLE
are very close to the manufacturing industry. A quick look at the SPL hall
of fame [Con11] shows that out of the 18 success stories, only 5 are geared
towards end-users. Most of them are dedicated to embedded devices like cell
phones, engines or military equipment.
While configurators help users create superior products, SCM manages
change along the engineering lifecycle. Since the 50s, SCM has grown into a
activity.
3.2 On the use of FBC in Manufacturing
[Figure — customer order decoupling point (CODP) positions for the make-to-stock, configure-to-order, make-to-order, and engineer-to-order strategies]
dependently, which results in markedly higher costs and longer delivery time.
Similarly, in SPLE, O’Leary et al. [ORRT09] report that customers are usually
ready to have only 80% of their requirements fulfilled if the cost of deviating
from standard products is prohibitive.
The nature and complexity of the products determine the type of strategy
(or strategies) that best fits the needs of the company. It is only once this choice
is agreed upon that the development of the configurator can actually start.
[Figure — the specification process (sales, product design, manufacturing engineering, calculation) and the configuration system supporting the customer, purchasing, planning, production, and delivery/assembly activities]
Assembly
The second phase is the analysis of the product range, which aims at
providing a global view of the product range by detailing the product knowledge
to be included in the configuration system. The model used to capture this
knowledge is called the product variant master (PVM). Basically, the PVM
consists of a set of hierarchically organised components and their parts. The
PVM is usually drawn on a large sheet of paper or with tools like Microsoft
Excel or Visio. The specifications of individual components and parts are usually
documented via class, responsibility and collaboration (CRC) cards [BC89].
We stop the description of the PVM here, as Section 3.2.4 is dedicated to it.
In the third phase, a first object-oriented model, in this case a class
diagram, is directly derived from the PVM and CRC cards. Once completed, the
object-oriented model contains both the product and system knowledge. This
model is then adapted in the fourth phase, called object-oriented design, based
on the selected configuration system. The configuration system contains the
configuration knowledge and performs the reasoning about module combination.
It can either be developed in-house or, more frequently, be supplied by a
third party. Since many of the existing systems are not fully object-oriented,
the model must be adapted to work properly with the selected software. Note
that integration with more than one third-party tool might prove mandatory.
For instance, the configurator might have to interact with an ERP system to
compute delivery times and prices, CAD tools supporting the modelling of
complex physical parts like an engine, or even 3D visualisation tools used by
architects to present explorable representations of the future building.
The role played by FBC in SPLE reflects that of the configuration system
in manufacturing. However, FBC is restricted to option selection and attribute
assignment, while the configuration system covers everything from the simplest
binary decisions to the assembly of complex mechanical parts. In other words, FBC
can only cover a limited span of the range of problems tackled in manufacturing.
The fifth phase deals with the actual programming of the configurator,
which is then implemented in the production environment during the sixth
phase. The final phase, the seventh, is focused on the maintenance and further
development of the configurator.
[Figure content — a PVM for a Car family: a kind-of structure (Station wagon, Van, Cabriolet) and a part-of structure (Subassembly [1] with Windshield, Gearbox, Sunroof, Engine; Wheels [4-5]; Air conditioning [0,1]); the Engine class card carries a description (“This class is used for defining all engines that this family of car uses”), attributes (Volume: [1.6, 1.8, 2.0] Litres; Fuel: (Diesel, Gasoline)), and a constraint (“Cabriolet is only offered with a 2.0 litres engine”)]
Figure 3.2 – Example of PVM (copied from [Har06]).
The concepts of the PVM are deeply inspired by the object-oriented
paradigm, which greatly facilitates their conversion into class diagrams. In
the PVM, every component preceded by a circle with a short horizontal line
on the left is a class name, which must be unique. Classes can also have
cardinalities, which specify the number of sub-parts a class has. In our
case, a car can have between 4 and 5 wheels if it has a spare wheel. Classes can also have
a short description and attributes, like the Volume or Fuel type of the Engine
in Figure 3.2.
Four types of constraints can be attached to a PVM class. A constraint
denotes how classes and attributes can be legally combined. The first type of
constraint is the verbal constraint, which allows free-form natural language to
be used. The second, the logical constraint, is typically expressed in a semi-formal
or formal logic like propositional logic. The third type of constraint
is expressed as a calculation over attributes of classes. The fourth type is
the combination table, which describes constraints through relationships. For
instance, it can represent the relationships between the parts available and
the assemblies of which they are constituents. In that case, a cell of the table
would, for example, represent the number of pistons of a given diameter needed
The product variant master has created the basis for a far more
professional dialogue between sales, development and production.
One designer expressed it in this way: “This is the first time we have
had the possibility of a meaningful dialogue with sales”. [HMR08]
[Figure content — the FM derived from the PVM: Subassembly with Windshield, Gearbox, Sunroof, and Engine; Engine with xor-decomposed Volume (1.6 Litres, 1.8 Litres, 2.0 Litres) and Fuel (Diesel, Gasoline) subtrees; Wheels with cardinality [4-5]; optional Air conditioning]
Figure 3.3 – Example of translation of the PVM in Figure 3.2 into an FM.
translation of the PVM in Figure 3.2. The textual constraint has been
expressed with a boolean implication, the multiplicity is modelled with a feature
cardinality, and the kind-of structure is represented by an extra Kind of feature.
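As a small illustration (our own encoding), the verbal constraint of Figure 3.2, “Cabriolet is only offered with a 2.0 litres engine”, becomes a boolean implication that can be checked against candidate configurations:

```python
# Hypothetical encoding of the PVM constraint as an implication:
# Cabriolet -> (Volume == 2.0)
def satisfies_constraint(config):
    """config maps feature/attribute names to their values."""
    return (not config.get("Cabriolet", False)) or config["Volume"] == 2.0

ok = satisfies_constraint({"Cabriolet": True, "Volume": 2.0})   # respects it
bad = satisfies_constraint({"Cabriolet": True, "Volume": 1.6})  # violates it
```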
Yet, although the PVM was introduced almost ten years after FODA, we
could not find a reference to FODA in the PVM literature, and vice versa.
Although they share the same motivation and issues, these two engineering fields
have evolved independently. In order to identify possible bridges, we first look
at FMs as a means to model the product range, and then as a basis for product
configuration.
FMs as means to represent the product range. The first major dif-
ference is the semantic distinction between the part-of and kind-of structures,
which are not standard FM constructs. However, comparable structures have
already been proposed (e.g., [LKL02]) for FMs and could be reused here. The
second major difference is the specification of constraints. Verbal constraints
and combination tables have not been implemented in FMs. The addition of
the former would be trivial but hardly amenable to automated reasoning. The
latter is a mere tabular representation of a set of constraints. We therefore
postulate that FMs can be used as a replacement for PVMs and therefore be
[Figure — the functionality areas of SCM: construction, structure, team, workspaces, conflict resolution, product families, auditing, history, artefacts, traceability, versions, logging, configurations, versions of configurations, baselines, process, project contexts, repository, lifecycle support, kinds of artefacts, task management, communication, documentation, controlling, accounting, access control, statistics, change requests, status, bug tracking, reports, change propagation, partitioning]
The last two functionalities thread them all together. The process
functionality integrates the tasks to be performed in the global engineering
lifecycle, assists users in the selection of these tasks, and documents the knowledge
of the process. The team functionality supports collaborative work through
workspace definition, conflict detection and resolution, and the maintenance of
product families.
For concision, most of these functionalities will not be explained further. The
interested reader is referred to [Fei91, Dar91, Pre04, ELvdH+05] for an in-depth
introduction to SCM, and to [Dar99] for an additional list of challenges posed
by web systems. In the remainder of this section, we concentrate on the
relationship between FBC and product configuration, collaborative development,
and process management.
necessary configuration items are either downloaded directly in the workspace or linked before
compilation.
3.3 On the use of FBC in Software Configuration Management
[Figure — a typical change control process: Request Change → Evaluate Change → (if accepted) Create Planning → Assign Change → Implement Change → Verify Change → Close Change; rejected requests leave the process after evaluation]
of these solutions, and would rather buy and reuse existing processes than
develop their own. To reconcile the need for process support and their inherent
complexity, tool vendors now hide them behind “‘best practice’ standardized
processes” [ELvdH+05].
Process control has now reached its maturity stage in SCM. The challenge
ahead is to provide a better integration with third-party tools like call tracking
and customer relationship management systems.
FBC barely supports process control. Existing work on that topic will be
reviewed in Chapters 6 and 7. We will also propose a combined formalism
that maps a multi-view FM to a generic process. We complete the support by
providing analyses that ensure that only satisfiable executions of the model
are allowed.
[...] the field as a whole is now sorting out how to better fit in
the overall picture of software development, rather than always as-
suming being able to provide a standalone application that somehow
seamlessly fits with existing tools, approaches, and practices. [ELvdH+ 05]
The journey through SCM also shed light on the limitations of textual con-
figuration languages, the necessity for process control, the importance of con-
flict detection and resolution, and the compelling need for workspaces. The
latter three observations are supplementary motivations for the work presented
in the coming chapters.
Finally, genericity and enhanced expressiveness are not always what end
users prefer. Too much flexibility is often perceived as a source of confusion,
inefficiency, and maintenance overhead.
We are well aware of the limitations of these conclusions. To provide defini-
tive statements, a more systematic and thorough treatment of both AI and
SCM is needed but is beyond the scope of this thesis. Yet, the lessons we could
learn from our investigation have clearly inspired and influenced our contribu-
tion.
Chapter 4
Separation of Concerns in Feature Models
The research method we have followed to collect and review papers is inspired
by the guidelines of Kitchenham et al. [Kit04]. Our method deviates from
Kitchenham’s in that we intentionally leave out a detailed quantitative meta-
analysis to favour an in-depth qualitative analysis.
This section begins with the presentation of our research questions followed
by the description of the survey protocol. It then details the survey material
and the data collection forms we used to harvest data systematically.
4.2 Survey Method
RQ4.1 What are the main concerns of FMs? We expect there are several
FM languages with different notions of “concerns”. We will define what
is meant by concern in the context of this survey, and list those concerns.
RQ4.2 How are concerns separated and composed? Not only are there different
ways of separating concerns; there may also be different ways of composing
them. The survey will cover those as well. In addition, it examines how
different techniques may have varying degrees of effectiveness.
[Figure: survey protocol — (1) survey material elicitation from the already collected literature (12 papers) and DBLP (139 papers), yielding a paper base of 151 papers; (2) paper filtering, with two decision points; (4) content analysis. Legend: activity, artefact, decision, final node.]
subjectivity and human effort under control. Reliance on only one of the two
techniques can have major weaknesses. For instance, the fully automated
search by Chen et al. [CAB09] initially missed important publications such
as FORM [KKL+ 98] because the terms did not match.
2 These criteria come from [SSS07]. We added the illustration criteria to account for cases
Research question 1

Field               Description                                   Possible values
Concern             Concerns addressed in the paper               Free form text
Stakeholders        Stakeholders targeted by the concerns         Free form text
Artefacts           Artefacts targeted by the concerns            Free form text
Development         Stages in the software development cycle      Free form text
stages/tasks        targeted by the concerns
Scope               The scope of the concern determines to        some of {internal,
                    what extent the concern is related to         external}
                    the software itself or its environment
Target domain       Type of domain to which the concern           Free form text
                    applies
Rationale           Motivation for the need for the concerns      Free form text
Evaluation          Type of evaluation conducted to prove         some of {illustration,
                    the relevance of the concerns                 controlled experiment,
                                                                  case studies, survey
                                                                  research, ethnographies,
                                                                  action research, mixed
                                                                  approach}2

Research question 2

Field               Description                                   Possible values
Separation          Techniques used to identify and separate      Free form text
techniques          concerns in a FM
Composition         Techniques used to compose the concerns       Free form text
techniques          of a FM

Research question 3

Field               Description                                   Possible values
Formalised          Artefacts discussed and formalised in         Free form text
artefacts           the paper
Level of formality  Level of formality of the notation used       one of {++, +, ◦, -, --}3
Formalisation       Type of formalisation technique used          Free form text

Research question 4

Field               Description                                   Possible values
Supported           Activities supported by the tool              Free form text
activities
Implementation      Techniques used to implement the              Free form text
techniques          activities
Degree of           To what extent the activity is/can be         one of {+, ◦, -}4
automation          automated
Implementation      Level of maturity of the implementation       one of {mature, prototype,
status                                                            proof of concept}
4.3 Execution
We now report on the execution of the paper elicitation process. First we focus
on the results returned by the queries and then discuss the total number of
papers sorted by venues.
The five queries were executed on the DBLP database on 13 December 2010,
using CompleteSearch,5 developed by Hannah Bast. Shell scripts were used
to automatically extract papers and merge the results of the various queries.
The five queries returned a total of 144 papers, which are broken
down as follows.
Query   # papers
Q1      9
Q2      6
Q3      14
Q4      70
Q5      69
Total   144⁶

5 http://dblp.mpi-inf.mpg.de/dblp-mirror/index.php
6 There are 24 duplicate papers in the results of the five queries.
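The merging step can be illustrated as follows. This is a hypothetical sketch, not the actual scripts: the per-query sizes match the table above, but the paper identifiers are synthetic, and placing all 24 duplicates between Q4 and Q5 is an assumption made purely to reproduce the arithmetic (9 + 6 + 14 + 70 + 69 = 168 raw results, minus 24 duplicates = 144 papers).

```python
# Hypothetical sketch of the merge-and-deduplicate step; not the thesis scripts.

def merge_queries(results):
    """Union several query result lists, dropping duplicates while
    preserving first-seen order (as a script over DBLP exports might)."""
    seen, merged = set(), []
    for papers in results:
        for paper in papers:
            if paper not in seen:
                seen.add(paper)
                merged.append(paper)
    return merged
```

With synthetic identifiers sized like the table (and 24 overlapping entries), the merged list contains exactly 144 papers.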
4.4 Findings
When the data was collected, seven main threads of research began to emerge.
They are:
1. FODA and FODA Extensions. As the pioneering FM language, FODA
and many of its extensions provide a list of concerns for FMs and techniques
for their separation. Much of the work in this thread is related to the
concepts to be supported by the feature modelling languages.
2. Aspect-based Concerns. Although aspect-oriented programming was initially
proposed as a mechanism for separating concerns in program code,
many authors have extended the idea to analytical models of software
systems, including FMs.
3. Requirements Concerns. Requirements engineering approaches have their
own notions of concerns and how to separate them. Many authors have
argued that there are requirements-level concerns in FMs, and they can
be analysed using techniques developed for requirements engineering. By
extension, papers related to problem space, ontologies or natural language
also fall in this thread.
4. Viewpoint-based Concerns. Originally a requirements engineering ap-
proach, viewpoints organise software artefacts according to the stake-
holders and their heterogeneous perceptions of the system to build. Sep-
aration of concerns is a fundamental issue in viewpoint-based techniques.
5. Configuration Concerns. Besides documenting variability, FMs can serve
as a backbone for product configuration as they capture all the valid
[RW06, RW07, MSA09] consider the issue of managing large FMs and
changes made to them over time, in the context of automotive software de-
velopment. They point out that FMs need to reflect the structure of several
organisations involved in the software development. They argue that dividing
the FMs along organisational boundaries will make it difficult to propagate
changes made to a local diagram. Managing a large global FM is also unsat-
isfactory because it will make FMs unmanageable. They propose multi-level
feature trees in which FMs are refined in a hierarchical fashion. Elements of
a child FM can selectively reuse elements in the parent FM, allowing local
changes to be made without affecting the global structure of the FMs.
Similarly, Thompson et al. [TH03] argue, echoing [Lut08], that a single
hierarchical decomposition of features is not sufficient to capture the structure
of the domain. They propose n-dimensional and hierarchical FMs. They list
some of the dimensions as hardware platform, required behaviour, and fault
tolerance capabilities. Hartmann et al. discuss the issue of FM complexity due
to the need to support multiple product lines [HT08]. They propose using a
context variability model which captures “drivers for variation”.
Fey et al. suggest that usability and usefulness are important qualities for
FMs, and in order to improve those qualities of FODA, they propose a meta-
model extending FODA [FFB02]. They argue that feature attributes and com-
plex logical rules are necessary but they are not expressible in FODA. There-
fore, they propose to add feature attributes, pseudo-features (non-primitive
features) and provided-by relations to the FM.
Deelstra et al. propose an assessment process for evaluating the evolution of
variability in software product families [DSB09]. In this approach, a distinction
is made between provided variability model—the variability in the product fam-
ily artefacts—and required variability model—the variability that is demanded
as necessary by product scenarios.
Cho et al. identify several feature relations and dependencies [CLK08].
They are: aggregation relationship, generalisation relationship, required con-
figuration dependency, excluded configuration dependency, usage dependency,
modification dependency, exclusive activation dependency, subordinate activa-
tion dependency, concurrent activation dependency, and sequential activation
dependency.
Kim et al. differentiate between conventional variability and component
variability, and present five types of variability and three kinds of variability
scopes [KHC05]. Assuming that classes and workflows among the classes are
“building blocks of software components”, they identify attribute variability,
logic (algorithmic) variability, workflow variability, persistent variability, and
interface variability. In terms of scope, they propose that a variability may be
either binary, selection, or open.
Czarnecki et al. propose using probabilistic FMs in the context of mining
FMs from existing systems [CSW08]. In this approach, both hard constraints
an operational approach (VML*) [HSS+ 10] . They report that the two ap-
proaches are equally powerful on important criteria.
Chen et al. divide requirements into candidate features for crosscutting
requirements and identify two types of feature dependencies, called interaction
and weaving [CZZM06]. In [CZZM05], they propose a way of constructing FMs
based on requirements relationship graphs and a clustering technique.
Summary. Many requirements engineering approaches tend to focus on
separating the user’s perspective on a FM from the perspectives of other stake-
holders. There are also approaches that differentiate between features relating
to functional and non-functional requirements, to different personas, and to
phases of development.
4.5 Discussion
When FMs were first introduced, the main concern of the models seemed to be
to distinguish features that are particular to certain members of the product
family from those that are common across all members. FMs address something
that few other modelling languages do: treating groups of programs as having
common functionality with only minor varying features. This can encourage
reuse across programs and lead to more stable architectures. The concepts
used in the language were also simple: mostly with AND (aggregation/consist-
of) and OR (alternatives/optional/mutually exclusive OR) relationships, to-
gether with some notion of feature attributes and “requires” (implies) relation-
ship [KCH+ 90].
Having said that, even the seminal FODA report [KCH+ 90] is not clear
about the level of abstraction of features, nor the purpose of FMs. At times,
[KCH+ 90] treat FMs as a domain analysis tool, suggesting that the feature
analysis should focus on application domain; at other times, they indicate that
FMs can be used to describe relationships between features of different binding
times: compile-time, load-time and run-time. Therefore, an FM is a “com-
munication medium between users and developers”, a description of the “user’s
understanding of the general capabilities of applications in a domain”, and a
tool for communication between developers and designers. This vagueness in
the purpose of FM language, or the attempt to cater for all purposes, unfortu-
nately contributes to FMs with poor separation of concerns.
This section revisits the findings of Section 4.4. Firstly, it synthesizes the
different types of concerns we identified and tries to clarify their meaning and
context of use. Then, it elaborates on the different separation and composition
techniques. Finally, the degree of formality of these techniques and the tools
supporting them are discussed.
4.5.1 Concerns
Two types of concerns are proposed in the literature: (1) concerns that group
features together according to some given criteria; (2) concerns that differenti-
ate the kinds of relationships between features.
Feature groups
Concerns grouping features can be further separated into three categories, as
reported in Table 4.2. First, concerns that focus on functional and non-functional
properties of the SPL. Then, concerns that put together features that
address a specific facet of the SPL variability. Finally, concerns that group
features according to the steps or status of the configuration process.
It is interesting to note that the earlier FM languages (such as FODA
and FORM [LKL02, KKL+ 98]) seem to be more concerned with design and
implementation issues, while later FM languages (such as staged configura-
tion [CHE05] and feature tree [RW06]) are more concerned with stakeholders
and organisational structures. There is a tendency to expand the scope of
FMs: in addition to describing variability in the design, a need for describing
the variability in the wider system context has been recognised by SPL ap-
proaches. This perhaps explains, in part, the phenomenon of increasing size
and complexity of FMs.
A modelling language such as the one described in FODA cannot be ex-
pected to adequately capture relationships between so many (kinds of) fea-
tures in a realistic system at so many levels of abstractions, involving so many
stakeholders at the same time. For example, there is no notion of timing (to
talk about temporal dependencies between features) in FMs, although FODA
authors suggested that they can be used to describe run-time and load-time
variabilities.
In that sense, it is not a real surprise that many FODA-like FM languages
are found not to be scalable. This has led some authors to question the actual
expressiveness of those languages. As shown in [SHTB07], from the seman-
tic perspective, a relatively simple FM language can have a high degree of
expressiveness and succinctness. Therefore, we have to conclude that mak-
ing FM languages richer does not necessarily make them more scalable. Poor
separation of concerns makes FMs difficult to understand and analyse using
automated tools.
The list of concerns recognised by the surveyed approaches, in particular by
[LKL02] and [KKL+ 98], is very comprehensive. They range from cost of fea-
tures, CPU platform to organisational structure. This indicates that variability
has to be addressed at different times in the development and in different parts
of the system structure. Therefore, we do not have a shortage of concerns to
be addressed by FMs: there are actually too many concerns to be addressed.
Table 4.2: Concerns grouping features

References                                              Concern

Functional and non-functional property
[HSSF06]                                                Adaptivity
[CRB04]                                                 Configurability
[SK01]                                                  Declarative
[SK01]                                                  Deductive
[TH03]                                                  Fault tolerance
[CRB04]                                                 Flexibility
[EM08, HSSF06]                                          Functionality
[EM08]                                                  Quality

Facet
[SG05, ELSP08]                                          Artefacts
[CLLK09]                                                Binding
[KCH+ 90, KKL+ 98, LKL02, KDK+ 02]                      Capability
[CLLK09]                                                Configuration
[KCH+ 90, KKL+ 98, LKL02, KDK+ 02, CHS08,               Context
  HT08, TBC+ 09, MRA05]
[KCH+ 90, KKL+ 98, LKL02, KDK+ 02]                      Domain technology
[PBvdL05]                                               External variability
[CZZM06]                                                Feature interaction
[TH03]                                                  Hardware platform
[KCH+ 90, KKL+ 98, LKL02, KDK+ 02]                      Implementation techniques
[PBvdL05]                                               Internal variability
[GRDL09]                                                Market needs
[HT08, GRDL09]                                          Multiple product lines
[FFB02, SHTB06]                                         Non-primitive feature
[CLLK09]                                                Operational dependency
[RW06, RW07, MSA09, GRDL09]                             Organisational structure
[DSB09, MHP+ 07, MRA05, KSG06, CHS08,                   Provided variability (solution space)
  TBC+ 09, GRDL09]
[DSB09, MHP+ 07, KSG06, TBC+ 09, TH03,                  Required variability (problem space)
  CHS08]
[CLLK09]                                                Structure
[CLLK09]                                                Traceability
[PBvdL05]                                               Variability in time
[PBvdL05]                                               Variability in space

Configuration process
[KCH+ 90, LKL02, KDK+ 02, LKK04, FFB02,                 Binding time
  HHB08]
[LKK04]                                                 Binding state
[CHE05, CHH09b]                                         Configuration levels
[CHE04]                                                 Configuration stages
[HCH09]                                                 Configuration tasks
Feature relationships
The second type of concern determines the nature of the relationship between
two or more features. Table 4.3 collects the different kinds of relationships we
came across. Relationships defined as propositional formulas [Bat05] that are
not explicitly qualified are left out of this table.
Here again, the concept of relationship is rather imprecise. It is alternatively
used to specify the parent/child relationship (e.g., aggregation or composition),
to constrain feature selection (e.g., requires or excludes), and to provide
dependency information (e.g., implemented by or usage). Many of them defeat
the purpose of FMs. By forcing implementation-specific details into the model,
one loses independence from the solution domain. Furthermore, the semantics
of these relationships is for the most part completely overlooked.
However, implementation-specific relationships differ from implementation
specific feature groupings. Relationships restrict the selection by enforcing
constraints imposed by technological decisions. Groups present domain op-
tions that are related to each other. Even though concerns can focus on
implementation-specific features, they only expose choices and do not import
extraneous constraints from the implementation layer.
Table 4.3: Feature relationships

References            Concern
[CLK08]               Aggregation relationship
[LKL02]               Composed-of
[CLK08]               Concurrent activation dependency
[CLK08]               Excluded configuration dependency
[CLK08]               Exclusive activation dependency
[CHS08, TBC+ 09]      Entailment
[LKL02, CLK08]        Generalisation/specialisation
[CSW08]               Hard constraint
[LKL02]               Implemented by
[CZZM06]              Interaction
[CLK08]               Modification dependency
[WBDS09]              Point configuration constraint
[FFB02]               Provided by
[TBC+ 09]             Quantitative constraints
[CLK08]               Required configuration dependency
[CLK08]               Sequential activation dependency
[CSW08]               Soft constraint
[CLK08]               Subordinate activation dependency
[CLK08]               Usage dependency
[CZZM06]              Weaving
[MHP+ 07]             X-links
purposes and concerns of FMs are not clear, the concerns to which these tech-
niques are to be applied remain a question. Many of the proposed approaches
are well-grounded and probably constructive when applied to real problems.
More evidence of how they have been applied will strengthen confidence in
them.
Although there are several separation techniques, as shown in Table 4.4,
what is required, in our view, is a clarification of the purposes of FMs and a
systematic way to separate and compose FMs according to well-defined criteria.
The Jackson-Zave framework (and related work) for requirements engineering gives a
systematic way to separate requirements, specifications and domain properties.
We believe that FM languages could benefit from such a framework, as explained
in [CHS08, TBC+ 09].
Separation of concerns is only one side of a two-sided story: having sep-
arated concerns, there is a need to reason about how concerns might be re-
lated and negotiated. Many of the techniques are summarised in Table 4.5.
Regarding this issue, we see a deep synergy between SPL and requirements
engineering research. For instance, techniques on viewpoints [EN96], model
synthesis [UC04], model merging [BCE+ 06, ACLF09], and inconsistency manage-
ment [SZ01] may provide insight into how concerns on FMs could be managed.
4.5.3 Formalisation
In terms of level of formality, most of the FM languages we surveyed are largely
informal. We believe that the question of having an expressive and succinct
formal FM language is no longer a major problem. The major problem is
that most of the approaches to separation and composition of concerns are
weakly formalised. Only 19 out of the 127 papers obtained a score of ++ or
+, which means that a formal semantics is provided. 10 other papers obtained ◦
because they provide an abstract syntax or metamodel of the concepts they
use. Some of the papers we reviewed formalise concepts that are not directly
related to FMs (e.g. [AC06]). In general, what we observed is an informal and
loose definition of the concept of concern.
On top of that, the very nature of a concern varies. Besides the distinction
between group concerns and relationships, more fundamental differences can
be noticed. For group concerns, some seem to be “attributes” characterizing
features like the binding time. Others make a clear separation between FMs
achieving different objectives like provided vs. required variability. Finally, a
concern can be captured by a link to an external model. The same observa-
tion holds for relationship concerns that can either denote a specific type of
• What are the types and subtypes of concerns? Different types of concern
exist. Tables 4.2 and 4.3 lay the foundation stone by collecting two of
them, i.e. feature groups and feature decomposition relationships. Future
work should tell which more fine-grained types are needed, and categorize
them more precisely. However, a more pressing matter might be to
produce solid empirical evidence that these concerns are actually rele-
vant in practice. Therefore, further investigation is needed to populate a
repertoire of relevant concerns.
The last step of our survey is a critical look at the research method and the
collected results. For each point, the threats to validity are discussed with the
presumed consequences on the global validity of the results.
Chapter 5
Multi-view Feature Models
Two challenges that FBC techniques fail to address in a satisfactory way are (1)
tailoring the configuration environment according to the stakeholder’s profile
(knowledge, role, preferences. . . ), and (2) managing the complexity resulting
from the size of the FM. Two concurrent approaches have been proposed to
tackle that problem: view integration and view projection. View integration
techniques start from small and roughly independent FMs that are configured
and then integrated to form a complete configuration (e.g. [CKK06, RW06,
MSA09]). In practice, the gain of designing and working with smaller models
is often offset by the costly and complex integration of heterogeneous
models [GRDL09]. Conversely, view projection techniques assume the existence of
a global FM that is divided into smaller views, which are then configured. The
high upfront investment necessary to build the initial FM is counterbalanced
by the automatic integration of user decisions. The wide application of that
latter approach in various domains (see e.g. Chapter 4, database engineer-
ing [EN06], and product line implementation [K1̈0]) and our own experience
showing that a single feature model is usually favoured over a heterogeneous
collection thereof lead us to focus on view projection.
In FBC, a view is a simplified representation of an FM that has been tailored
for a specific stakeholder, role, task, or, to generalize, a particular combination
of these elements, which we call a concern. Views facilitate configuration in
that they only focus on those parts of the FM that are relevant for a given
concern. Using multiple views is thus a way to achieve SoC in FMs. SoC helps
make FM-related tasks less complex by letting stakeholders concentrate on
the parts of the FM that are relevant to them and hiding the others.
In Figure 5.1, we represent the key steps and artefacts of FBC that use
view projection. Typically, the process starts with variability modelling, which
produces the FM. The second step focuses on the identification of stakeholders,
which determines the profiles of the users of the configurator. Then, a view on
the FM is defined for every profile. The final step is the actual configuration
of the FM, which results in the specification of the product.
In this chapter, we focus on the creation of consistent views and the genera-
tion of alternative visualisations for configuration, as highlighted in Figure 5.1.
Specifically, we address three fundamental issues of multi-perspective feature
modelling:
RQ5.1 How are views actually specified? Related work usually identifies views
by surrounding groups of features with a line or by showing subsets of
features from the original FM. This gives very little insight as to how to
actually build these views in practice and certainly does not tell how to
implement a tool that renders them.
RQ5.2 How is the complete configuration of the FM enforced? Views delimit
portions on the set of features. To be meaningful, views should cover the
complete decision space defined by features, i.e., one should ensure that
no feature of the FM can be left undecided.
RQ5.3 How are features outside a view filtered out? Some stakeholders need
to see features outside a view to comprehend it. However, for large or
technical FMs, the complexity can become disorienting and features out-
side the view have to be hidden to simplify the configuration task. Fi-
nally, security policies can restrict the set of stakeholders who can access
(read/write) particular features. These different scenarios put different
constraints on view rendering.
Our contribution is a set of techniques to specify, automatically generate
and check multiple views (Section 5.3). Views are generated through transformations [...]
5.2 Working Example: PloneMeeting

1 http://www.plonegov.org/
2 Reverse-engineered from PloneMeeting version 1.7 build 564.
[Figure 5.2: excerpt of the PloneMeeting FM. Recoverable features include:
  Meeting states: Created, Published, Decided, Closed, Archived
  Default view (alternative): My items, All items, Available meetings, Decided meetings
  Item duplication
  Open annexes in separate window
  Email notification (E)
  Tasks (T): Display macro (DM); Task creator (TC): Manager, Meeting manager, Owner
  Votes (V): Meeting manager, Voter; Available vote values (A): Yes (AY), No (AN),
    Abstention (AA); Default vote value (D, alternative): Yes (DY), No (DN), Abstention (DA)]
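To make the configuration semantics of such a fragment concrete, the vote-related groups above can be encoded as a small validity check. This is our own sketch, not PloneMeeting code: the or-group reading of Available vote values, the xor-group reading of Default vote value, and the constraint that the default must be among the available values are assumptions based on the feature names.

```python
# Our own sketch of two groups of the FM above; not PloneMeeting code.
# Assumptions: A is an or-group, D is an xor-group, and the chosen default
# must correspond to an available vote value.

AVAILABLE = {"AY", "AN", "AA"}                      # Yes, No, Abstention
DEFAULT_FOR = {"DY": "AY", "DN": "AN", "DA": "AA"}  # default -> available value

def valid_vote_config(selected):
    """Check a selection of vote-related features."""
    avail = selected & AVAILABLE
    default = selected & DEFAULT_FOR.keys()
    if not avail:                       # or-group: at least one available value
        return False
    if len(default) != 1:               # xor-group: exactly one default value
        return False
    (d,) = default
    return DEFAULT_FOR[d] in avail      # the default must be available
```

For instance, selecting Yes and No as available values with Yes as default is valid, whereas a default of No without No being available is not.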
A major problem is that existing FBC tools do not provide means to control
who is responsible for a given configuration option. At best, informal comments
or notes can be added to tag features, which severely limits the automated tai-
loring of configuration interfaces and the control over access rights. However,
these two functionalities are required by PloneMeeting. In the absence of clear
access specifications, a coarse policy has been implemented: the web administrator
and the PloneMeeting manager both have access to all configuration
options, while the users get access to none. A reported consequence is that
sometimes the PloneMeeting manager does not have sufficient knowledge to
fully understand the options and make decisions. The results can be incor-
rect settings of interfaces to external macros and runtime changes of meeting
workflows that lead to inconsistent meeting states. Additionally, users are de-
nied any tailoring of their working environment, e.g., default GUI layouts or
choosing how to display states of meetings or other items.
Furthermore, responsibilities and profiles can vary from one PloneMeeting
instance to the other. The variability in the use context might imply variations
in the access rights (e.g. the PloneMeeting manager cannot control workflows).
It might also require other stakeholder profiles (e.g. a Task Manager is needed
to configure the task portlet). Consequently, responsibilities and profiles should
not be hardcoded in the FM but defined on a case-by-case basis, typically before
or during the instantiation of the website.
This situation provided the initial motivation for the use of views as a flex-
ible means to build and reason about configuration spaces. However, existing
solutions to multi-view feature modelling fail to provide complete support to
model this case.
• ∀vi ∈ V • vi ⊆ N ∧ r ∈ vi .
A view can be defined for any profile or, more generally, for any concern
that requires only partial knowledge of the FM. As a general policy, we consider
that the root is part of each view. V is a multiset to account for duplicated
sets of features.
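The well-formedness condition above (each view is a subset of the feature set N and contains the root r) can be sketched as a simple check; function and parameter names are ours.

```python
# Sketch of the well-formedness condition: every view is a subset of the
# feature set and contains the root. Names are ours, not the thesis's.

def well_formed(views, features, root):
    """Each view must be a set of known features that includes the root."""
    return all(v <= features and root in v for v in views)
```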
designed view definition GUI rather than written by hand. Conversely, XPath
expressions could be used to generate feature tags and link them to the features
in the XPath expression. These links can then be used to trace changes to the
FM back in the XPath expression. This way we can avoid the drawbacks of
both extensional and intensional definitions.
For the formal developments that follow, we are only interested in the fea-
tures contained in each view. Therefore, we will abstract from the approach
chosen to specify views.
Intuitively, this means that all the features appear in at least one view, hence
no feature can be left undecided.4 In our motivating example (Figure 5.2),
each feature is part of a view, hence this condition holds: the collaborative
configuration of the FM through the views will always lead to complete and
valid products. This is indeed a sufficient condition, but not a necessary one, since
some decisions can usually be deduced from others. For instance, in the web
administrator’s view, if the feature Display macro is selected, its ancestor Tasks
will be too, although the latter does not belong to the view.
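The sufficient coverage condition can be checked mechanically: collect the union of all views and compare it to the feature set. A minimal sketch, with names of our own choosing:

```python
# Sketch of the sufficient coverage condition: every feature of the FM
# belongs to at least one view, hence no feature can be left undecided.

def sufficient_coverage(views, features):
    covered = set().union(*views) if views else set()
    return covered == set(features)
```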
A necessary condition can be defined using the notion of propositional
definability [LM08].
Definition 5.3 pdefines(M, f) (adapted from [LM08])
For a given set N of features of u and ΓN the Boolean encoding of u on N, a
feature f ∈ N is propositionally defined by the features in M ⊆ N iff:

    ΓN ∧ Γ′N [M′ ← M] ⊨ f ≡ f′

where Γ′N is ΓN with every feature g ∈ N renamed into a fresh copy g′, and
[M′ ← M] substitutes g back for g′ for each g ∈ M, so that the two copies of
the encoding agree exactly on the features in M.

4 Note that complete view coverage is usually assumed by multi-view approaches
(e.g. [MCMdO08]).
We need to ensure that the decisions on the features that do not appear in
any view can be inferred from (i.e. are propositionally defined by) the decisions
made on the features that are part of the view.
Definition 5.4 Necessary coverage condition
For a multi-view FM u, the necessary coverage condition is:

    ∀f ∉ ⋃v∈V v • pdefines(⋃v∈V v, f)
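For small FMs, the definability test underlying Definitions 5.3 and 5.4 can be carried out by brute force, using the standard reading of propositional definability: f is defined by M iff no two valid configurations agree on all features of M yet disagree on f. The encoding below is our own sketch, with the FM encoding ΓN given as a Python predicate over configuration dictionaries; all names are ours.

```python
from itertools import product

# Brute-force sketch of propositional definability for small FMs (our own
# encoding): f is defined by M iff no two valid configurations agree on
# every feature of M yet disagree on f.

def pdefines(gamma, features, M, f):
    """gamma: predicate over {feature: bool} dicts (the FM encoding)."""
    configs = [c for c in
               (dict(zip(features, bits))
                for bits in product([False, True], repeat=len(features)))
               if gamma(c)]
    for c1 in configs:
        for c2 in configs:
            if all(c1[m] == c2[m] for m in M) and c1[f] != c2[f]:
                return False
    return True
```

For example, with ΓN = r ∧ (a xor b), the decision on b is defined by {a} but not by {r} alone.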
5.3.4 Visualisation
Views are abstract entities. To be effectively used during FBC, they need to
be made concrete, i.e. visual. Henceforth, a visual representation of a view
will be called a visualisation. The goal of a visualisation is to strike a balance
between (1) showing only features that belong to a view, and (2) including
features that are not in the view but that provide context and thereby allow
the user to make informed decisions. For instance, the Display macro feature
is in the view of the web administrator, but its parent feature Tasks is not.
To tackle this problem formulated in RQ5.1-3, we observed the practice of
the PloneMeeting developers and discussed alternative visualisations. We also
inspected the approaches suggested in [ZZM08, MCMdO08]. We finally looked
into filtering mechanisms provided by tools. Tools like pure::variants [psG06] or
kernel configurators for operating systems (e.g. xconfig for Linux or configtool
for eCos [eCo11]) provide simple filtering or search mechanisms that are
similar to views on an FM. A filter is defined as a regular expression on the
FM. Any feature matching the regular expression is displayed—without any
control on the location of the feature in the hierarchy. Interestingly, all these
approaches produce purely graphical modifications (e.g. by greying out irrele-
vant features) whereas cardinalities are not recomputed.
The outcome of our investigation is a set of three complementary visualisations
offering different levels of detail. They were built to present information
on a need-to-know basis. They make it possible to regulate the amount of
information displayed, and provide enhanced control over access rights. For
instance, a standardised configuration menu will always display the position of
a feature in the hierarchy and hide unavailable options, while critical systems
will conceal all the features outside a view to protect fabrication secrets.
Thereby, visualisations not only propose convenient representations of a view,
but also dictate what information is accessible to the stakeholder. These visualisations
are depicted in Figure 5.3. The FM on the left is the same as in Figure 5.2.
The darker area defines a specific view of it, called v.
[Figure 5.3 – The three visualisations (greyed, pruned, and collapsed) of a
view v on the FM of Figure 5.2, with their recomputed cardinalities.]
• The greyed visualisation is a mere copy of the whole FM except that the
features that do not belong to the view are greyed out (e.g., A, B, DO
and DB). Greyed features are only displayed but cannot be manually
selected/deselected.
82 5 Multi-view Feature Models
• In the pruned visualisation, features that are not in the view are pruned
(e.g., B, DO and DB) unless they appear on a path between a feature
in the view and the root, in which case they are greyed out (e.g., A).
• In the collapsed visualisation, all the features that do not belong to the
view are pruned. A feature in the view whose parent or ancestors are
pruned is connected to the closest ancestor that is still in the view. If no
ancestor is in the view, the feature is directly connected to the root (e.g.,
Y and O).
The simplest case is that of the greyed visualisation, since there is no
transformation beyond the greying of each feature f ∉ v (i.e. d_v^g = d). The
transformations for the pruned and collapsed visualisations are respectively
specified in Transformations 5.1 and 5.2. Basically, they filter nodes, remove
dangling decomposition edges and adapt the cardinalities accordingly. We leave
crosscutting constraints untouched in the following definitions because they are
usually not displayed in FBC systems. They are reintroduced in Section 9.4
when we discuss how our toolset handles both transformations and crosscutting
constraints to maintain decision consistency.
Pruned visualisation
N_v^p, the set of features in this visualisation, is the subset of N limited to
features that are in v or have a descendant in v. The definition uses DE⁺, the
transitive closure of DE. Based on N_v^p, we remove all dangling edges, i.e.
those not in N_v^p × N_v^p, to create DE_v^p.
Transformation 5.1 Pruned visualisation
The transformations applied to the FM to generate the pruned visualisation
are:
N_v^p = {n ∈ N | n ∈ v ∨ ∃f ∈ v • (n, f) ∈ DE⁺}
DE_v^p = DE ∩ (N_v^p × N_v^p)
λ_v^p(f) = (mincard_v^p(f), maxcard_v^p(f))
with
mincard_v^p(f) = max(0, λ(f).min − |orphans_v^p(f)|)
maxcard_v^p(f) = min(λ(f).max, |children(f)| − |orphans_v^p(f)|)
where orphans_v^p(f) = children(f) \ N_v^p, i.e., the set of children of f that
are not in N_v^p. λ(f).min and λ(f).max represent the minimum and maximum
values of the original cardinality, respectively. For the minimum, the difference
between the cardinality and the number of orphans can be negative in some
cases, hence the necessity to take the maximum between this value and 0.
The maximum value is the maximum cardinality of f in d if the number of
children in v is greater. If not, the maximum cardinality is set to the number
of children that are in v.
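The transformation can be sketched directly on a parent-to-children representation of the hierarchy. In the following Python sketch, the dictionary encoding, the recursive keep test, and the abbreviated feature names (MC for Meeting_Config, etc.) are ours for illustration; cardinalities are stored only for decomposed features:

```python
def pruned_view(children, card, view, root):
    """Transformation 5.1, sketched: keep features that are in the view
    or have a descendant in the view, then recompute each cardinality
    by discounting orphaned children."""
    def keep(n):
        return n in view or any(keep(c) for c in children.get(n, []))
    def descendants(n):
        for c in children.get(n, []):
            yield c
            yield from descendants(c)
    nodes = {root} | {n for n in descendants(root) if keep(n)}
    new_card = {}
    for f, (lo, hi) in card.items():
        if f in nodes:
            orphans = [c for c in children.get(f, []) if c not in nodes]
            avail = len(children.get(f, [])) - len(orphans)
            # mincard = max(0, lo - |orphans|); maxcard = min(hi, avail)
            new_card[f] = (max(0, lo - len(orphans)), min(hi, avail))
    return nodes, new_card

# Abbreviated PloneMeeting fragment: MC has seven children, T has two.
children = {"MC": ["G", "D", "U", "W", "E", "T", "V"], "T": ["DM", "TC"]}
card = {"MC": (7, 7), "T": (2, 2)}
nodes, new_card = pruned_view(children, card, {"W", "E", "DM"}, "MC")
print(new_card)  # {'MC': (3, 3), 'T': (1, 1)}
```

The recomputed values match the worked example of Section 5.4: MC loses its four orphans and T loses one, yielding ⟨3..3⟩ and ⟨1..1⟩.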
Collapsed visualisation
The set of features Nvc of this visualisation is simply the set of features in v.
The consequence on DEvc is that some features have to be connected to their
closest ancestor if their parent is not part of the view.
Transformation 5.2 Collapsed visualisation
The transformations applied to the FM to generate the collapsed visualisation
are:
N_v^c = v
DE_v^c = {(f, g) | f, g ∈ v ∧ (f, g) ∈ DE⁺ ∧
∄f′ ∈ v • ((f, f′) ∈ DE⁺ ∧ (f′, g) ∈ DE⁺)}
λ_v^c(f) = (mincard_v^c(f), maxcard_v^c(f))
The computation of the cardinalities λ_v^c(f) is slightly more complicated than
in the pruned case. Formally, mincard_v^c(f) and maxcard_v^c(f) are defined as
follows:
mincard_v^c(f) = Σ min_{λ(f).min}(ms_min_v^c(f))
maxcard_v^c(f) = Σ max_{λ(f).max}(ms_max_v^c(f))
where
ms_min_v^c(f) =
{mincard_v^c(g) | g ∈ orphans_v^c(f)} ⊎ {1 | g ∈ children(f) \ orphans_v^c(f)}
ms_max_v^c(f) =
{maxcard_v^c(g) | g ∈ orphans_v^c(f)} ⊎ {1 | g ∈ children(f) \ orphans_v^c(f)}
and min_k (resp. max_k) selects the k smallest (resp. largest) elements of a
multiset. The multiset ms_min_v^c(f) thus collects the recursively computed
minimum cardinality of each orphan of f, together with the value 1 for each
child that is in the view. The λ(f).min minimum values of the multiset are
then summed to obtain the minimum cardinality of f. The maximum value is
computed similarly.
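The recursion can be sketched as follows (same illustrative parent-to-children encoding and abbreviated feature names as for the pruned case; leaves contribute ⟨0..0⟩, children in the view contribute 1, and orphans contribute their recursively computed value, assuming every decomposed feature carries a cardinality):

```python
def collapsed_card(children, card, view, f):
    """Transformation 5.2 cardinalities, sketched: sum the λ(f).min
    smallest (resp. λ(f).max largest) values of the multiset built
    from f's children."""
    kids = children.get(f, [])
    if not kids:
        return (0, 0)               # leaves carry no variability
    ms_min, ms_max = [], []
    for g in kids:
        if g in view:               # child in the view contributes 1
            ms_min.append(1); ms_max.append(1)
        else:                       # orphan: recurse into its subtree
            lo, hi = collapsed_card(children, card, view, g)
            ms_min.append(lo); ms_max.append(hi)
    lo, hi = card[f]
    return (sum(sorted(ms_min)[:lo]), sum(sorted(ms_max, reverse=True)[:hi]))

children = {"MC": ["G", "D", "U", "W", "E", "T", "V"], "T": ["DM", "TC"]}
card = {"MC": (7, 7), "T": (2, 2)}
view = {"W", "E", "DM"}
print(collapsed_card(children, card, view, "MC"))  # (3, 3)
```

On the worked example of Section 5.4, the orphans G, D, U and V contribute 0, the orphan T contributes 1 (for its collapsed child Display macro), and W and E contribute 1 each, yielding ⟨3..3⟩ as computed in the text.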
The next section illustrates how the basic steps of the transformations are
applied to the motivating example presented in Section 5.2. A detailed and
step-by-step explanation of the transformations performed in Figure 5.3 is re-
ported in Section A.1.
1 Meeting_Config/Workflow_and_security
2 | //Workflow_and_security//*
3 | Meeting_Config/Email_notification
4 | Meeting_Config/Tasks/Display_macro
1 | //Data/Item_insertion_algorithm
2 | //Item_insertion_algorithm//*
3 | Meeting_Config/User_interface
4 | //User_interface//*
1 Meeting_Config/General
2 | //General//*
3 | Meeting_Config/Data
4 | //Data/Meeting_attributes
5 | //Meeting_attributes//*
6 | //Data/Use_groups_as_categories
7 | Meeting_Config/Workflow_and_security
8 | //Workflow_and_security/Meeting_workflow
9 | //Meeting_workflow//*
10 | Meeting_Config/Email_notification
11 | Meeting_Config/Tasks
12 | //Tasks/Task_creator
13 | //Task_creator//*
14 | Meeting_Config/Votes
15 | //Votes//*
web administrator. The new set of features Nvp thus only contains the features
in Figure 5.5(a). The same holds for the decomposition edges of DEvp . The
new cardinalities of the pruned version of the FM are calculated as follows:
λ_WA^p(MC) = ⟨max(0, 7 − 4) .. min(7, 7 − 4)⟩ = ⟨3..3⟩
λ_WA^p(T) = ⟨max(0, 2 − 1) .. min(2, 2 − 1)⟩ = ⟨1..1⟩
where MC has four orphans (G, D, U and V) and T only one (Task creator).
Obtaining the collapsed visualisation (Figure 5.5(b)) is more complex be-
cause collapsed features entail the recursive computation of cardinalities. In
this particular example, the Display macro feature (boldfaced in Figure 5.5(b))
is the only example of a collapsed feature. Unlike in the pruned case, its parent
feature T is removed from the visualisation. This means that Display macro
is disconnected from the FM. It thus has to be linked to its closest ancestor,
here MC, to keep the view consistent.
New cardinalities must be calculated to match the transformed structure
of the FM. Starting from the root feature, we separate the orphans of MC (G,
D, U, T and V), which requires recursive calls, from its children that are in
the view (W and E), which gives:
Only the left-hand side of the union implies a recursive call. The value for
G, D, U and V is trivially 0 since neither they nor their children are in the
view. T , however, has one child in the view. The cardinality of T is simple to
compute since its children have no descendants, which yields:
mincard_WA^c(MC) = Σ min_7({0, 0, 0, 1, 0} ⊎ {1, 1})
= 0 + 0 + 0 + 0 + 1 + 1 + 1 = 3
maxcard_WA^c(MC) = Σ max_7({0, 0, 0, 1, 0} ⊎ {1, 1})
= 0 + 0 + 0 + 0 + 1 + 1 + 1 = 3
λ_WA^c(MC) = ⟨3..3⟩
As shown in Table 5.2, the pruned and collapsed visualisations of the
sample FM of Figure 5.2 (counting 57 features), respectively the complete FM
Table 5.2 – Number of features for the three views and the corresponding number
of XPath lines for the sample and complete FMs (S=Sample; C=Complete).

5.5 Correctness of Transformations
Also, we know from [SHTB06] that d ∈ L_FM is satisfiable if and only if
[[d]]_FM ≠ ∅. From this result, we derive a corollary:
Corollary 5.1 Local consistency satisfiability
If d ∈ L_FM is not locally consistent, then it is not satisfiable: [[d]]_FM = ∅.
If d is not locally consistent, then it has no valid configuration and the
semantic equivalence is trivially satisfied. We demonstrate the correctness of
d_v^p under the assumption that it is locally consistent.
Theorem 5.1 Correctness of d_v^p
If d is locally consistent, the pruned visualisation d_v^p preserves the semantic
equivalence with the original FM d:
Let us consider c ∈ [[(N_v^p, r, λ_v^p, DE_v^p, {})]]_FM. We claim that there exists
c′ ∈ [[(N, r, λ, DE, {})]]_FM such that c ⊆ c′|_{N_v^p} ⊆ c′ by local consistency.
Indeed, for each m ∉ N_v^p and m ∈ children(n) with n ∈ N_v^p
To prove that such a configuration c does not exist, we test the different
conditions that could lead to incompatible configurations.
1. Different root features. Since both d_v^p and d have the same root
feature by definition of v, we know that all the configurations will
have the same root feature.
2. Every product satisfies the extra constraints. In this case, the set of
constraints is empty, hence does not influence the equality.
3. Different decomposition edges. We know from Lemma 5.1 that DE_v^p
is a prefix of DE. Thereby, all features in N_v^p are subject to the
same constraints in d_v^p and d.
4. Different decomposition types. We have to prove that d_v^p does not
exclude configurations of d that only contain features in N_v^p. Valid
configurations can be excluded if there is a feature f ∈ N_v^p for
which the interval between the minimum and maximum cardinality
is reduced too much or relaxed too much.
Let us first prove that fewer features than expected cannot be selected
for any feature f. We know that mincard_v^p(f) = λ(f).min −
|orphans_v^p(f)| if the result is positive, which means that the recomputed
value only depends on the features in N_v^p. If the result is
negative, then mincard_v^p(f) = 0, which means that no feature in
N_v^p needs to be selected. The cardinality is thus only reduced by the
number of features outside N_v^p. Thereby, fewer features than required
in N_v^p cannot be selected.
More features than necessary cannot be selected either. We know
that if |children(f)| − |orphans_v^p(f)| < λ(f).max then maxcard_v^p(f) =
|children(f)| − |orphans_v^p(f)|, which means that we can select as
many features as available in N_v^p because the original cardinality is
greater than the number of available children of f in N_v^p. If it is
[Figure – Counter-example: an FM d over features a–g (with cardinalities
⟨1..2⟩ and ⟨2..2⟩) and the collapsed visualisation of a view v on it, whose
recomputed cardinality ⟨2..4⟩ admits configurations that d forbids.]
This shows that the transformation that produces the collapsed visualisa-
tion does not preserve the semantics of the FM. Yet, we can still prove that it
does not restrict the original semantics, and provides the most precise semantics
expressible on v.
Definition 5.8 Most precise semantics of a collapsed visualisation
The most precise semantics of a collapsed visualisation of a view v on an FM
d, [[(N_v^c, r, λ_v^c, DE_v^c, {})]]_FM, defines the greatest possible lower bound and the
smallest possible upper bound. This means that there does not exist a cardinality
transformation λ^t such that:
(1) d_v^c does not restrict the original semantics of d. By reductio ad
absurdum we can prove that:
1. Every product contains the root feature. Since both d_v^c and d have
the same root feature by definition of v, we know both products
have the same root feature.
2. Every product satisfies the extra constraints. In this case, the set of
constraints is empty, hence does not influence the equality.
3. Every feature is justified. By definition of DE, every feature must
be justified. This means that all the ancestors of a selected feature
have to be selected (and nothing can be inferred about the descen-
dants). Likewise, all the descendants of a deselected feature have
to be deselected (and nothing can be inferred about the ancestors).
The selection of ancestors and deselection of descendants is thus
preserved in the transitive closure. Therefore, any configuration
respecting DE is also valid in DE_v^c, modulo the projection on N_v^c.
4. Every feature satisfies the decomposition type. If c is not satisfiable
in [[(N_v^c, r, λ_v^c, DE_v^c, {})]]_FM, there exists n ∈ N_v^c such that the
interval of λ_v^c(n) is too narrow. If orphans_v^c(n) = ∅ then we know by
definition of Transformation 5.2 that λ_v^c(n) = λ(n). The intervals
are thus equivalent if n has no orphans in N_v^c.
If orphans_v^c(n) ≠ ∅, then let us consider m ∈ orphans_v^c(n). The
absence of m from N_v^c means that its children are collapsed in DE_v^c,
thereby implying the recalculation of the cardinality of n. To be
valid, cardinalities have to preserve the constraints imposed by the
Two interesting conclusions can be drawn from this latter theorem: (1) any
valid configuration of the FM is a valid configuration of the collapsed view; (2)
the cardinalities of the collapsed visualisation produce an under-constrained
FM. This is an inevitable consequence of collapsing several descendants under
the same feature. In fact, the first conclusion comes at the price of the second.
All these formal definitions and checks are implemented in our toolset.
Chapter 9 provides details on view specification with XPath expressions, the
implementation of the necessary and sufficient checks, and the maintenance of
the consistency in user decisions with the three visualisations.
Chapter 6
Basic Configuration Scheduling
[Figure 6.1 – The syntactic specialisation transformations on a small FM
(root r, children a, b, c): selecting a feature, removing a feature, and refining
a cardinality (e.g. ⟨2..3⟩ to ⟨2..2⟩).]
of the next level's FM. The links between models are defined explicitly through
specialisation annotations. A specialisation annotation of a feature f ∈ d_i (f ∈
N_i) consists of a Boolean formula φ over the features of d_{i−1} (φ ∈ B(N_{i−1})).
Once level i − 1 is configured, φ can be evaluated on the obtained configuration
c ∈ [[d_{i−1}]]_FM, using the Boolean encoding of Section 2.4.2, i.e. a feature
variable n in φ is true if and only if n ∈ c. Depending on its value and the
specialisation type, the feature f will either be removed or selected through
one of the first two syntactic transformations of Figure 6.1. An overview of
this is shown in Table 6.1.
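The evaluation step amounts to interpreting φ as an ordinary Boolean function over the previous level's configuration. A minimal Python sketch (the feature names are taken from the tax gateway example; φ is given as a predicate rather than a parsed formula, and the mapping from φ's value and the specialisation type to select/remove is only a schematic rendering of Table 6.1):

```python
def apply_annotation(phi, config, spec_type="select"):
    """Evaluate φ on the configuration c of level i-1: a feature
    variable n is true iff n ∈ c.  Depending on the result and the
    specialisation type, select or remove the annotated feature."""
    holds = phi(lambda n: n in config)
    if spec_type == "select":
        return "select" if holds else "remove"
    return "remove" if holds else "select"

# Hypothetical annotation on certiTax2: keep it iff certiTax was
# chosen at level 1, otherwise remove it.
phi = lambda v: v("certiTax")
print(apply_annotation(phi, {"taxGateway", "certiTax"}))   # select
print(apply_annotation(phi, {"taxGateway", "taxExpress"}))  # remove
```

Applying the resulting action corresponds to the first two syntactic transformations of Figure 6.1.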
[Figure – The tax gateway FM: taxGateway (G) with alternatives certiTax (T)
(with calcMethod (M): totalOrder (O) or lineItems (I)), taxExpress (E) (with
serviceID (D)), and cyberSource (S) (with repBrkdwn (B)).]
to decide about the gateways. The first time (Level 1) is when she purchases
the system. All she decides at this point is which gateways will be available
for use; the model that needs to be configured is the one shown on the top-left
corner of Figure 6.3. Then, when the system is being deployed (Level 2), she
will have to settle for one of the gateways and provide additional configuration
parameters, captured by the first model in the lower-left corner of Figure 6.3.
Given the inter-level links, the model in level two is automatically specialised
based on the choices made in level one.
Note that even though both models in the example are very similar, they
need not be so. Also note that the original paper mentions the possibility
that several configuration levels might run in parallel. This applies, for instance,
if levels represent independent decisions that need to be taken by different
people. As we show later on, such situations give rise to interesting decision
problems.
The MLSC approach, as it appears in [CHE05], is entirely based on syntactic
transformations. This makes it difficult to decide properties such as whether
two levels A and B are commutative (executing A before B leaves the same
variability as executing B before A). This is the main motivation for defining a
formal semantics, which we do in the next section. To respect the initial definition
of MLSC, this chapter uses the term level. In the next chapter, levels are
replaced by the more general concept of view. For now, it suffices to see a level
as a special kind of view on the FM.
We introduce the dynamic FM semantics in two steps. The first, Section 6.2.1,
defines the basic staged configuration semantics; the second, Section 6.2.2, adds
the multi-level aspect.
• last(σ₁ ... σ_k) = σ_k
(6.3.1) σ₁ = [[d]]_FM
(6.3.3) |σ_n| = 1
Contrary to what intuition might suggest, (6.1.2) and (6.1.3) do not imply
that |[[d]]_FM| = |[[d]]_CP|; they merely say that every configuration allowed by the
FM can be reached as part of a configuration path, and that each configuration
path ends with a configuration allowed by the FM.
For the FM d with root r and xor-decomposed children a, b and c, the
semantics is [[d]]_FD = {c₁, c₂, c₃}, with c₁ = {r, a}, c₂ = {r, b} and c₃ = {r, c}.
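For such a small example, the configuration paths over these three configurations can be enumerated by brute force. The sketch below assumes, following rules (6.3.1)–(6.3.3), that a path starts with the full set of configurations, strictly shrinks at every stage, and ends with a singleton:

```python
from itertools import combinations

def config_paths(configs):
    """Enumerate configuration paths: start with the full configuration
    set, strictly shrink at each stage, end with a single configuration
    (a brute-force sketch of rules 6.3.1-6.3.3)."""
    def extend(stage):
        if len(stage) == 1:
            yield [stage]
            return
        # every nonempty proper subset is a possible next stage
        for r in range(1, len(stage)):
            for nxt in combinations(stage, r):
                for tail in extend(frozenset(nxt)):
                    yield [stage] + tail
    return list(extend(frozenset(configs)))

c1, c2, c3 = frozenset({"r", "a"}), frozenset({"r", "b"}), frozenset({"r", "c"})
paths = config_paths({c1, c2, c3})
print(len(paths))  # 9 configuration paths for three configurations
```

The nine paths are the three one-step paths that jump directly to a single configuration, plus the six two-step paths that first narrow the set down to a pair.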
Note that this is not necessarily a bad thing, since Czarnecki et al. probably
chose to only include transformation steps that implement the most frequent
usages. However, the practical consequences of this limitation need to be as-
sessed empirically.
6.2 Dynamic FM Semantics ([[.]]_MLSC)
• is connected through exactly one node to the global root: ∃!n ∈ L_i •
(r, n) ∈ DE, noted hereafter root(L_i);
• does not share decomposition edges with other levels (except for the root):
∀(n, n′) ∈ DE • (n ∈ L_i ⇔ n′ ∈ L_i) ∨ (n = r ∧ n′ = root(L_i));
Given the new syntactic domain, we need to revise the semantic function.
As for the semantic domain, it can remain the same, since we still want to reason
about the possible configuration paths of an FM. The addition of multiple
levels, however, requires us to reconsider what a legal configuration path is.
Indeed, we want to restrict the configuration paths to those that obey the
levels specified in the FM. Formally, this is defined as follows.
1 The set of constraints here is empty because it is not needed for validity verification.
[Figure 6.5 – Two-level version of the tax gateway FM: level L1 contains
taxGateway1 (G1) with certiTax1 (T1), taxExpress1 (E1) and cyberSource1 (S1);
level L2 contains taxGateway2 (G2) with certiTax2 (T2) (calcMethod2 (M2):
totalOrder2 (O2), lineItems2 (I2)), taxExpress2 (E2) (serviceID2 (D2)) and
cyberSource2 (S2) (repBrkdwn2 (B2)). Crosscutting constraints of the form
¬X1 ⇒ ¬X2 link the deselection of each level-1 gateway to its level-2
counterpart.]
The rule (6.5.2) expresses the fact that each configuration deleted from
σ_j (i.e. c ∈ σ_j \ σ_{j+1}) during level L_i must be necessary to delete one of
the configurations of L_i that are deleted during this stage. In other words,
the set of deleted configurations needs to be included in the set of deletable
configurations.
6.3 Illustration of [[.]]_MLSC
Let us illustrate this with the FM of Figure 6.5, which we will call d, itself
based on the example of Figure 6.3 in Section 6.1. The semantic domain
of [[d]]_MLSC still consists of configuration paths, i.e. it did not change from that
of [[d]]_CP shown in Figure 6.4. Yet, given that [[d]]_MLSC takes into account the
levels defined for d, not all configuration paths given by [[d]]_CP are
legal. Namely, those that do not conform to rules (6.5.1) and (6.5.2) need to
be discarded. This is depicted in Figure 6.6, where the upper box denotes the
staged configuration semantics of d ([[d]]_CP), and the lower box denotes [[d]]_MLSC,
i.e. the subset of [[d]]_CP that conforms to Definition 6.5.
We now zoom in on two configuration paths π_i, π_j ∈ [[d]]_CP, shown with
the help of intermediate FMs in the lower part of Figure 6.6. As noted in
Figure 6.6, π_j is not part of [[d]]_MLSC since it violates Definition 6.5, whereas π_i
satisfies it and is kept. The rationale for this is provided in Table 6.2. Indeed,
for π_j, there exists no level arrangement that would satisfy both rules (6.5.1)
and (6.5.2). This is because in σ₂^j, it is not allowed to remove the feature B2,
since it belongs to L2, and L1 is not yet completed. Therefore, either there is
still some variability left in the FM at the end of the level, which is thus not
fully configured (the first possible arrangement of π_j in Table 6.2 violates rule
(6.5.1)), or the set of deleted configurations is greater than the set of deletable
configurations (the other two arrangements of π_j in Table 6.2, which violate
rule (6.5.2)). For π_i, on the other hand, a valid level arrangement exists and is
indicated by the highlighted line in Table 6.2. All details for this illustration
are provided in Appendix B.1.
[Figure 6.6 – Two configuration paths of d, π_i = σ₁σ₂^iσ₃^i and
π_j = σ₁σ₂^jσ₃^j, shown as sequences of intermediate FMs over levels L1
and L2.]
[Table 6.2 – Possible level arrangements (Σ₁, Σ₂) for π_i = σ₁σ₂^iσ₃^i and
π_j = σ₁σ₂^jσ₃^j, and whether each satisfies rules (6.5.1) and (6.5.2).]
Proof.
deleted pertain to the level including the stage: ∀k < j • |σ_k|_{L_i}| >
|σ_j|_{L_i}|.
(6.3.d) |σ_j|_{L_i}| = 1 means that all configurations c ∈ σ_j contain σ_j|_{L_i}. Given
rule (6.3.2), only full configurations can be removed, hence the property.
Proof. Let us suppose that it is possible to find a diagram d that has a valid
configuration path π ∈ [[d]]_MLSC with more than one level arrangement. Note
that in that case, we need more than one level to begin with: |L| > 1. Without
loss of generality, each π ∈ [[d]]_MLSC with multiple arrangements falls into one
of the following three categories.
π = σ₁σ₂ ...
Level arrangement 1: Σ_i ...
Level arrangement 2: Σ_{i+k} ...
with k ≥ 1 and Σ_{i+1} .. Σ_{i+k−1} empty. This case is similar to (b), i.e. if both
arrangements are legal, they also exclude each other:
Since these are all the situations that might occur, it is impossible for a π ∈
[[d]]MLSC to have multiple level arrangements.
Continuing with Definition 6.5, remember that rule (6.5.2) requires that
every deleted configuration be deletable in the stage of the associated level.
An immediate consequence of this is that, unless we have reached the end of
the configuration path, the set of deletable configurations must not be empty,
as established in Theorem 6.5. A second theorem, Theorem 6.6, shows that
configurations that are deletable in a stage are necessarily deleted in this stage.
Theorem 6.5 Sufficient replacement of rule (6.5.2)
A necessary, but not sufficient replacement for rule (6.5.2) is that (σ_j|_{L_i} \
σ_{j+1}|_{L_i}) ≠ ∅.
which means that if in addition (σ_j \ σ_{j+1})|_{L_i} ⊆ (σ_j|_{L_i} \ σ_{j+1}|_{L_i}) holds, both
sets are equal.
Proof. We can easily construct an example for L_MLSC; it suffices to take the
FM used to prove Theorem 6.2 and to consider it as the sole level of a diagram.
From there on, the proof is the same.
Proof. Basically, the proposed inter-level link types always have a sole feature
on their right-hand side. It is thus impossible, for example, to express the fact
that if some condition φ is satisfied for level L_i, all configurations of level L_{i+1}
that have f will be excluded if they also have f′ (i.e. φ ⇒ (f′ ⇒ ¬f)).
During the elaboration of an FM, the analyst needs to decide the order of
the levels. Since this order determines which configuration paths are considered
legal, an important piece of information at this point is whether two levels can
be interchanged, i.e. whether they are commutative. Two levels are called
commutative if the tails of the configuration paths, starting from the later
level, are the same for either ordering of the two.
Definition 6.8 Commutative levels
Given an FM d such that L = [.. L_a .. L_b ..], L_a and L_b are said to be
commutative iff the following holds:
This chapter introduced a dynamic formal semantics for FMs that allows
reasoning about their configuration paths, i.e. the configuration process, rather
than only about their allowed configurations. Extending the basic dynamic
semantics with levels yielded a semantics for MLSC. The contribution is
therefore a precise and formal account of MLSC that makes the original
definition [CHE05] more explicit and reveals some of its subtleties and gaps.
Based on this semantics, we showed some interesting properties of configuration
paths. However, the sequential process enforced by MLSC has its limitations. In
practice, the strictly sequential ordering of configuration activities rarely holds.
In the next chapter, we relax that assumption and describe a combined
formalism to control and reason about non-linear configuration processes.
Chapter 7
Advanced Configuration Scheduling
The semantics defined in Section 6.2 inherits from the original definition of
MLSC [CHE05] the assumption that levels are configured one after the other
in a strict order until the final configuration is obtained. We will gradually lift
these hypotheses, and discuss their implications on the semantics.
[Figure 7.1 – Configuration process models over levels: strict order,
interleaving, parallel, and (d) asynchronous levels.]
while still being able to specify which interleavings are allowed (e.g. only L2/L3
and L4/L5), and what the minimal order is (e.g. a strict order from L6 on).
One way to do that is with a regular expression over the L_i's. Process
diagrams could also be used to specify the allowed sequences. Namely,
a statechart, where a state corresponds to the configuration of a level and
transitions denote configuration sequences, might allow stakeholders and
managers with a less formal background to specify the configuration process
intuitively.
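For illustration, such an ordering policy can be prototyped with an ordinary regular expression engine. The level names and the concrete policy below are made up to match the constraints evoked above (L2/L3 and L4/L5 interleavable, strict order from L6 on):

```python
import re

# Hypothetical policy: L1 first, then L2/L3 in any order, then L4/L5
# in any order, then the strict sequence L6 L7.
ALLOWED = re.compile(r"L1 (L2 L3|L3 L2) (L4 L5|L5 L4) L6 L7")

def is_legal(sequence):
    """Check a completed level sequence against the policy."""
    return ALLOWED.fullmatch(" ".join(sequence)) is not None

print(is_legal(["L1", "L3", "L2", "L4", "L5", "L6", "L7"]))  # True
print(is_legal(["L2", "L1", "L3", "L4", "L5", "L6", "L7"]))  # False
```

A regular expression only captures completed sequences; checking prefixes of an ongoing configuration process would require the automaton view of the same language.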
Interleaving also gives rise to a number of interesting analysis properties.
During the elaboration of the FM, mainly when defining the configuration
process, it is important to know which levels are safe to be interleaved. To
this end, it is interesting to know whether two levels are independent, i.e.
whether there are direct/indirect inter-level links between them, or whether
they influence the same features in a subsequent level.
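In a brute-force reading, two levels are independent when the valid combinations of their projections form a full Cartesian product, i.e. no constraint couples them. A sketch over an enumerated configuration set (the tiny two-feature example is hypothetical):

```python
def independent(configs, la, lb):
    """Two feature sets are independent iff every combination of their
    valid projections occurs in some valid configuration (brute-force
    sketch over an enumerated configuration set)."""
    pa = {frozenset(c & la) for c in configs}
    pb = {frozenset(c & lb) for c in configs}
    pairs = {(frozenset(c & la), frozenset(c & lb)) for c in configs}
    return pairs == {(x, y) for x in pa for y in pb}

# A hypothetical constraint a => b couples the two one-feature levels:
print(independent([set(), {"b"}, {"a", "b"}], {"a"}, {"b"}))         # False
print(independent([set(), {"a"}, {"b"}, {"a", "b"}], {"a"}, {"b"}))  # True
```

When the check succeeds, decisions in one level can never remove configurations of the other, so the two levels are safe to interleave.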
Parallel levels. Actually, interleaving, with appropriate tool support,
already allows for pseudo-parallelism, similar to how a single-core processor
allows multiple programs to run in parallel by interleaving their instructions.
It requires, however, the model to be accessible for configuration simultane-
ously from different places. Google Docs, and the recent software-as-a-service
trend, show how this is possible even within a browser, yet these approaches
generally require a live Internet connection. Lifting this barrier to distributed
offline collaboration would make the semantics even better suited to real
scenarios, since parallel configuration does happen in the real world, for
instance in automotive software engineering [CHE05]. Figure 7.1 illustrates the
differences between strict order, interleaving and parallelism graphically.
Asynchronous levels. The advantage of parallel levels is that distributed
groups can work independently on their local copies. The model of parallelism
introduced previously, however, still assumes a certain amount of co-ordination,
namely at the fork and merge points. This can lead to problems; imagine, for
instance, that L2 and L3 run in parallel. If the configuration of L3 takes longer
than expected, the subsequent levels will have to wait for L3 to finish, even
though L2 is already configured.
This problem could be solved by considering a completely asynchronous
approach, as shown in Figure 7.1(d). There is a central base model, but instead
of executing configurations on the base model, each level is locally configured
and merges back its decision into the base model on the fly. This way, L2 can
be merged back to the central model even before L3 is finished. If merges are
assumed to work in the other direction as well, then this can also reduce the
potential for conflict, since each level can merge its changes back to the central
model as it progresses with the configuration (note that, if a merge is done for
every stage, this is roughly equal to the interleaving mode). This asynchronous
level model actually corresponds to how popular SCM systems work.
Crosscutting levels. In Definition 6.5, we require each level to be an FM
in itself, with no sharing of features or decomposition edges between levels.
This hypothesis, too, could be lifted, thereby allowing levels to be crosscutting.
Indeed, Lee et al. suggested to group features into binding units, denoting
major functionalities of the system, so that nodes
shared between binding units allow for additional constraints [LK06]. Given
that binding units can be represented by levels in our semantics, it seems that
such an extension would make sense. Implications on the semantics, however,
have to be considered carefully.
To support these extensions, our semantics needs to be adapted in sev-
eral ways. To give a concrete sense to these extensions, we first illustrate the
limitations of MLSC and levels on a real-world SPL: the CFDP developed by
Spacebel (Section 7.2). We then introduce the workflow language (Section 7.3)
that, together with generic views on the FM, defines a new practical formalism
called feature-based configuration workflow (Section 7.4). The application of
the formalism to the CFDP case is described in Section 7.5. Next, we study
the desired properties of feature-based configuration workflows (Sections 7.6
and 7.7). We conclude with an experiment (Section 7.8) and threats to valid-
ity (Section 7.9).
[Figure 7.2 – FM of the CFDP library, with features such as PUS (PUS Copy,
PUS Rename), Send (Send Filestore Operations, Send Acknowledged mode),
Receive (Receive Filestore Operations, Receive Acknowledged mode), Extended,
and Reboot (Reboot Entity, Reboot PUS), plus crosscutting constraints such as
PUS Copy ⇒ Send ∧ Receive, PUS Rename ⇒ Send Filestore Operations ∧
Receive Filestore Operations, Extended ⇒ Send ∧ Receive, and
Reboot PUS ⇒ PUS.]
As the limitations we just identified indicate, we need support for modelling and
enforcing configuration processes that are more complex than mere sequences.
Workflow modelling languages and tools serve this purpose.
Among the possible options, we picked YAWL as it is formal [vdAtH05],
has extensive tool support [vdAADTH04], is known to cover a large variety
of workflow modelling patterns [vdAtHKB03], can be mapped from other lan-
guages (e.g. BPMN, BPEL, and activity diagrams [WVvdA+ 09]), and has been
successfully applied to a wide variety of industrial settings. In [vdAtHKB03],
van der Aalst et al. conduct a study of 20 workflow patterns, and compare
Having introduced FMs and YAWL, we now introduce the new combined for-
malism of feature-based configuration workflows (FCWs) and illustrate it with
the Spacebel case.
In a nutshell, an FCW is a workflow, such as the one shown in Figure 7.3,
where a task is associated with a set of features. In MLSC, each set would be an
independent level, or FM, which is connected to the other levels through inter-
level links. However, that representation is not always the most suitable. In
the CFDP case, for instance, every set of features is a projection on the FM in
Figure 7.2. Building separate FMs and the corresponding extra constraints
would put a heavy burden on the designers. Even with proper tool support,
such a representation would clutter the definition of the FCW. Therefore,
instead of levels, we use the more generic concept of view (see Chapter 5),
which can accommodate both independent FMs (as in MLSC) and projections
on a single model without introducing unnecessary duplication.
Using YAWL to model the CFDP configuration process allows us to over-
come the restrictions of MLSC outlined in Sections 7.1 and 7.2. From a purely
structural viewpoint, it provides an immediate solution to the representation of
parallel views through and-split, optional views through xor-split and iterative
configurations through backward transitions.
Relaxing the limitation that FM views be completely configured before
passing on to the next view is of a more fundamental nature. Ideally, the for-
malism should be flexible enough to overcome this limitation but rigid enough
to enforce the time when views have to be configured. This is achieved by spec-
ifying, separately for each view, the task in which it can be configured, and the
point at which configuration has to be finished. This point is represented by a
condition in the workflow.
We now provide a formal syntax and semantics for FCWs that follows that
intuitive description.
Definition 7.2 Abstract syntax L_FCW
An FCW m ∈ L_FCW is a tuple (w, d, start, stop) such that:
• w is a YAWL workflow with tasks T and conditions C;
• d is an FM with a set of views V (see Chapter 5);
• start : V → T maps each view to the task during which it can be configured;
• stop : V → C maps each view to the condition by which its configuration must be complete.
Intuitively, the start of a view is the only task of the workflow during which
the associated view can be configured, while the stop of a view denotes the
point at which the configuration of a view needs to be done. The start and
stop of a view are dissociated to capture cases where
the partial configuration of a view is completed by subsequent views, or where
the configuration iterates, such as between the SE, the NWI, and the TTI in
the Spacebel case.
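To fix intuitions, the tuple of Definition 7.2 can be mirrored in a small data structure. The sketch below is illustrative only: the class and field names are ours, not part of the formalism, and the workflow is reduced to its task and condition identifiers.

```python
from dataclasses import dataclass

@dataclass
class FCW:
    """A feature-based configuration workflow (w, d, start, stop), sketched.

    Tasks and conditions are identified by strings; each view is the set of
    feature names of d that it exposes.
    """
    tasks: set[str]             # tasks of the workflow w
    conditions: set[str]        # conditions of the workflow w
    views: dict[str, set[str]]  # view name -> features of d
    start: dict[str, str]       # view -> task where it can be configured
    stop: dict[str, str]        # view -> condition by which it must be done

    def well_formed(self) -> bool:
        """Every view has a start task and a stop condition in w."""
        return all(self.start.get(v) in self.tasks and
                   self.stop.get(v) in self.conditions
                   for v in self.views)

# A miniature version of the running example: one view per role.
fcw = FCW(
    tasks={"Spacebel", "SystemEngineer"},
    conditions={"c_spacebel", "c_se"},
    views={"SE": {"Send", "Receive"}, "SB": {"PUS"}},
    start={"SE": "SystemEngineer", "SB": "Spacebel"},
    stop={"SE": "c_se", "SB": "c_spacebel"},
)
```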
As for MLSC, the semantic domain of the FCW language is also based on
the notion of configuration path (see Definitions 6.1 and 6.3).
The semantics of an FCW is the set of legal configuration paths (see 7.3.A
in Definition 7.3) which follow a valid sequence of workflow states (7.3.B).
Intuitively, this means those configuration paths where the products eliminated
in a step pertain to the view whose task is being executed (7.3.B.2), and where
the stops encountered during the workflow execution are respected (7.3.B.3).
This intuition is formalised by saying that each stage σ of the configuration
path can be associated to a state s in the workflow, i.e. a sequence ϕ of
pairs (σ, s), that verifies the two above conditions and a minor well-formedness
condition (7.3.B.1).
Definition 7.3 FCW Semantics [[m]]_FCW
For m ∈ L_FCW, [[m]]_FCW returns the set of paths π ∈ SCP such that π =
σ1 . . . σn, for which there is a valid sequence of YAWL states ρ ∈ [[w]]_YAWL with
ρ = s1 → . . . → sk such that:
(7.3.B.1) either the configuration path or the workflow sequence evolves stepwise:
∀ . . . (σi, si)(σi+1, si+1) . . . ∈ ϕ •
(σi = σi+1 ∧ si → si+1 ∈ ρ) ∨ (si = si+1)
(7.3.B.2) the products eliminated in each step pertain to the views active in the associated state:
∀ . . . (σi, si)(σi+1, si) . . . ∈ ϕ •
(σi \ σi+1)|active(si) ≠ ∅
∧ (σi \ σi+1)|active(si) ⊆ (σi|active(si) \ σi+1|active(si))
where:
• c(s) is the set of conditions active in state s.
• t(s) is the set of tasks active in state s.
• active(s) returns the union of the views active in a given state s:
  active(s) ≜ ⋃{ vi | start(vi) ∈ t(s) }
• Vskip = V \ (⋃_{s∈ρ} active(s)) is the set of views that do not appear in ρ.
• Conversely, Vdo = V \ Vskip is the set of views that do appear in ρ.
• stops returns the set of views that should be fully configured in a given state s:
  stops(s) ≜ { vi ∈ Vdo | stop(vi) ∈ c(s) }
• starts returns the set of views that should be configured in a given state s:
  starts(s) ≜ { vi ∈ Vdo | start(vi) ∈ t(s) }
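These helper sets can be read directly off a workflow state. Below is a small sketch with illustrative names, where a state is given by its set of active tasks t(s) or conditions c(s):

```python
def active(state_tasks, start, views):
    """active(s): union of the features of the views whose start task is in t(s)."""
    return set().union(*(feats for v, feats in views.items()
                         if start[v] in state_tasks))

def starts(state_tasks, start, views_do):
    """starts(s): views of Vdo whose start task is active in s."""
    return {v for v in views_do if start[v] in state_tasks}

def stops(state_conditions, stop, views_do):
    """stops(s): views of Vdo that must be fully configured in s."""
    return {v for v in views_do if stop[v] in state_conditions}

# Two toy views mapped onto tasks and conditions of the running example.
views = {"SE": {"Send", "Receive"}, "SB": {"PUS"}}
start = {"SE": "SystemEngineer", "SB": "Spacebel"}
stop = {"SE": "c_se", "SB": "c_sb"}
```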
7.5 Working Example Revisited

Section 7.2 introduced the configuration scenario of the CFDP. This scenario
is now re-used to illustrate our definition of FCW. Figure 7.4 depicts three
types of artefacts: (1) the views, (2) the configuration workflow, and (3) the
mappings between both. Note that the concrete syntax presented here is used
for illustrative purposes only and is not meant to be prescriptive.
The view decomposition is based on the sample diagram of Figure 7.2. The
decomposition of the original FM produces five views, each accounting for the
roles and responsibilities defined in Section 7.2. All the views are rendered
with the pruned visualisation (see Section 5.3.4).
Figure 7.4 – The three types of artefacts of the CFDP example: the five views (such as the Reseller, Network Integrator, and TMTC Integrator views over the CFDP features), the configuration workflow, and the mappings between views and tasks.
Figure 7.5 illustrates how this FCW can be executed, which eventually
results in a fully configured product. The workflow starts with the Spacebel
task (indicated by the arrow in Figure 7.5(a)), where the responsible person
decides that the PUS Copy feature shall not be included. As a consequence,
the feature becomes unavailable in the TMTC Integrator view. The next task
is System Engineer, where the engineer decides to include all the features that
are available to him, as shown in Figure 7.5(b). This decision again causes other
features to be selected automatically, including those of the previous module.
Finally, Network and TMTC Integrator finalise the configuration process
in parallel. Their choices eliminate all remaining variability, meaning that both
stops are satisfied, and that the workflow has reached the end.
7.6 Desired Properties

7.6.1 Satisfiability
To define the FCW satisfiability, we first build upon the definition of workflow
(weak) soundness [vdAtH05]. The (weak) soundness property ensures that (1)
every execution of the workflow completes, (2) no task is still running when
the execution stops, and (3) every task is executable.
In the following definitions, si −X→ sj extends the → used in Definition 7.3.
X can be ∗ (a possibly empty sequence of states leading from si to sj), +
(a non-empty sequence of states leading from si to sj), or x ∈ sj ∧ x ∉ si,
where x is either a task or a condition. We also denote sb the first state of
the execution and se the final state of the execution, such that {b} = sb and
{e} = se, respectively. We finally write p• = {q | (p, q) ∈ F}, the set of output
places of p, and •q = {p | (p, q) ∈ F}, the set of input places of q.

Figure 7.5 – Three snapshots of the FCW execution: (a) the Spacebel task excludes PUS Copy; (b) the System Engineer selects all the features available to him; (c) the Network and TMTC Integrator views are completed in parallel, eliminating the remaining variability.
Definition 7.4 Workflow (Weak) Soundness (adapted from [vdAtH05])
A workflow w is (weakly) sound if and only if:
(7.4.A) w has the option to complete:
∀s • (sb −∗→ s) ⇒ (s −∗→ se)
(7.4.B) w completes properly, i.e., no task is still running when the end condition is reached:
∀s • (sb −∗→ s ∧ e ∈ s) ⇒ (s = se)
(7.4.C) w has no dead task:
∀t ∈ T • ∃si, sj • sb −∗→ si −t→ sj
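Condition (7.4.A) is a plain reachability property: on an explicit, finite state graph it can be checked with breadth-first searches. The sketch below is a toy illustration over abstract state names, not the YAWL engine's marking analysis:

```python
from collections import deque

def reachable(graph, source):
    """All states reachable from source, by breadth-first search."""
    seen, queue = {source}, deque([source])
    while queue:
        s = queue.popleft()
        for nxt in graph.get(s, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def option_to_complete(graph, sb, se):
    """(7.4.A): every state reachable from sb can still reach se."""
    return all(se in reachable(graph, s) for s in reachable(graph, sb))

# Toy state graphs: in `bad`, s2 is a deadlock, so the property fails.
ok = {"sb": ["s1"], "s1": ["se"]}
bad = {"sb": ["s1", "s2"], "s1": ["se"], "s2": []}
```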
The soundness property determines whether the configuration process can
complete properly and every configuration task can be executed, i.e., is not
dead. These are essential requirements for an FCW. However, neither soundness
nor L_FCW restricts the placement of a stop wrt. its task. As explained in
the previous section, this was done intentionally to account for cases in which
consecutive FMs are refinements of an initial FM. The additional flexibility
comes at the cost of having to detect unsafe models.
The position of the stop is critical as it defines when the associated views
must be completely configured. By construction, a stop can come before, after,
or during the execution of the tasks linked to those views, or simply never be
active. Furthermore, the position of a stop can vary from one run to another.
In Figure 7.6 for instance, all the runs of the workflow reported in Figure 7.7
are perfectly valid wrt. YAWL’s semantics. The position of the stop varies
from one case to the other, which yields completely different conditions on the
views. If the stop is anterior or concurrent to its tasks, then all the features
in the associated views have to be propositionally defined by decisions made
earlier in the run, e.g., through crosscutting constraints. This is very unlikely
in practice.
Figure 7.6 – Example workflow with a configuration task t′ (the start of a view v) and its stop condition o, preceded by a task t and followed by a task t″.

Definition 7.5 Safety
An FCW m ∈ L_FCW is safe iff:
(7.5.A) the mandatory views completely cover the FM: the union of the features of the views in Vmandatory covers all the features of d;
(7.5.B) the stop of every view follows its task: ∀v ∈ Vdo • follows(start(v), stop(v)) (see Definition 7.6);
where Vmandatory = { v | ∀ρ ∈ [[w]]_YAWL • ∃s ∈ ρ • start(v) ∈ t(s) }.
A stop correctly follows its task when two conditions are satisfied. First, if
a task is in a sequence, then its stop must be too (7.6.A). Secondly, the stop
can never precede the first occurrence of the task (7.6.B). These two conditions
are captured in the following definition.
Definition 7.6 follows(t, c)
For all ρ ∈ [[w]]_YAWL that contain t, the condition c follows the task t iff:
(7.6.A) c occurs after t: ∀si • t ∈ t(si) ⇒ ∃sj • (c ∈ c(sj) ∧ j ≥ i).
(7.6.B) c does not occur before the first occurrence of t:
∄si, sj, sk, sl • sb −∗\t→ si −c→ sj −∗→ sk −t→ sl
Workflow soundness and safety, together with view coverage (see Defini-
tions 5.2 and 5.4), define the satisfiability of an FCW.
Definition 7.7 FCW satisfiability
An FCW m ∈ L_FCW is satisfiable iff:
• w is sound;
• the view coverage of d is complete;
• m is safe.
To illustrate Definition 7.7, let us come back to the FCW in Figure 7.6.
Figure 7.7(a) shows some examples of valid runs. One can observe that for
every occurrence of t′ there is a posterior occurrence of o. ρ4 might seem
peculiar as t′ both precedes and follows o. In fact, the second occurrence of t′
has no impact on the configuration of v since the view is already fully configured,
as enforced by the stop condition o. It also conforms to Definition 7.5 since there
is at least one occurrence of t′ before o.
Figure 7.7 – Example runs of the workflow of Figure 7.6.
(a) Valid runs:
ρ1 = sb t t′ o t″ se
ρ2 = sb t t′ t″ t t′ o t″ se
ρ3 = sb t t′ o t″ t o t″ se
ρ4 = sb t t′ o t″ t t′ o t″ se
(b) Illegal runs:
ρ5 = sb t o t″ se
ρ6 = sb t o t″ t t′ t″ se
ρ7 = sb t t′ t″ se
In contrast, the three runs in Figure 7.7(b) are illegal. ρ5 only contains o.
Definition 7.5.B is satisfied but not Definition 7.5.A; the view is mandatory
but not configured. ρ6 is obviously illegal because t′ does not appear before
o (Definition 7.6.B). To be legal, it should have appeared at least once before
o, as in ρ1 or ρ2 for instance. The problem with ρ7 is the absence of o, which
makes it impossible to check the proper completion of t′ (Definition 7.6.A).
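The verdicts on ρ1 to ρ7 can be reproduced mechanically. The following sketch (with illustrative names) flattens each run into the sequence of executed tasks and marked conditions, and checks that t′ occurs, that every occurrence of t′ is eventually followed by o, and that o never precedes the first occurrence of t′:

```python
def legal_run(run, task="t'", stop="o"):
    """Check one run of the workflow of Figure 7.6 against the stop rules."""
    t_idx = [i for i, x in enumerate(run) if x == task]
    o_idx = [i for i, x in enumerate(run) if x == stop]
    if not t_idx:                 # view never configured (cf. rho5)
        return False
    if not o_idx:                 # completion never checked (cf. rho7)
        return False
    if min(o_idx) < min(t_idx):   # stop precedes the first t' (cf. rho6)
        return False
    # every occurrence of t' is followed by an occurrence of o
    return all(any(j >= i for j in o_idx) for i in t_idx)

runs = {
    "rho1": ["t", "t'", "o", "t''"],
    "rho2": ["t", "t'", "t''", "t", "t'", "o", "t''"],
    "rho3": ["t", "t'", "o", "t''", "t", "o", "t''"],
    "rho4": ["t", "t'", "o", "t''", "t", "t'", "o", "t''"],
    "rho5": ["t", "o", "t''"],
    "rho6": ["t", "o", "t''", "t", "t'", "t''"],
    "rho7": ["t", "t'", "t''"],
}
```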
Both WDp1 ,p2 and SDp1 ,p2 filter out the features that do not belong to paths
between the starting place p1 and the end place p2 . It is obvious that these sets
make sense only if there is a path between p1 and p2 . This is not a problem
here because we focus only on paths between a task t and the associated stop
o, which we know exist if the FCW is satisfiable.
Based on that definition, we can define the concepts of weakly and strongly
open features. Intuitively, a strongly open feature will always be decided upon
before the stop condition is reached, whereas a weakly open feature might not.
7.7 Analysis of Feature-based Configuration Workflows

Equipped with these formal specifications, we are set to build the algorithms
that automate the analysis of FCWs. As for the properties, we first explain
how the satisfiability of an FCW can be verified, and then how open features
are checked at execution time.
7.7.1 Satisfiability
In Definition 7.7, we have seen that an FCW is satisfiable when its workflow
is sound, the view coverage is complete, and it is safe. Soundness analysis has
already been fully implemented in the YAWL tool suite [tHvdAAR09], and is
not further explored here. Similarly, both the necessary and sufficient view
coverage conditions have already been implemented (see Chapters 5 and 9).
We simply recall here that pdefines(M, f ) can be obtained with a single SAT
check:
¬sat(Γ_N ∧ Γ′_N[M′ ← M] ∧ ((f ∧ ¬f′) ∨ (¬f ∧ f′)))
If the formula is satisfiable, it is possible to find two different configurations of
d (Γ_N and Γ′_N) such that the assignments of the features in M are the same but
f and f′ differ. In other words, for the same assignment of M, the values of f
and f′ can differ, hence M does not propositionally define f. Since SAT returns
true when M does not propositionally define f, the returned value is negated
to match the definition of pdefines(M, f).
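For intuition, pdefines(M, f) can also be checked by brute-force enumeration on small models: M propositionally defines f iff no two valid configurations agree on M but disagree on f. The sketch below enumerates the configurations of a toy constraint instead of calling a SAT solver; feature names are illustrative.

```python
from itertools import product

def pdefines(configs, M, f):
    """True iff the features in M propositionally define f.

    `configs` is the set of valid configurations, each a dict
    feature -> bool; M is a collection of feature names; f a feature name.
    """
    seen = {}
    for c in configs:
        key = tuple(c[m] for m in M)
        if key in seen and seen[key] != c[f]:
            return False      # same M-assignment, different value of f
        seen[key] = c[f]
    return True

# Toy model over {a, b, f} with the constraint f <-> (a and b).
features = ["a", "b", "f"]
configs = [dict(zip(features, bits))
           for bits in product([False, True], repeat=3)
           if bits[2] == (bits[0] and bits[1])]
```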
To verify the safety property we have to check that (1) the mandatory tasks
can completely configure the FM, (2) a stop does not precede its tasks, and
(3) the stop of a task always appears in the run. Interestingly, the verification
of the safety property can be reduced to soundness verification. The idea is to
transform the original workflow such that the soundness analysis of the trans-
formed workflow reveals whether some designated tasks prevent the option to
complete (they cause a deadlock) or are dead. The type of error determines
which property is violated. Besides capitalizing on optimisations of the reasoning
engine, that approach requires no adaptation of the tool, and can be
applied to any workflow language supporting xor-split/joins, and-split/joins,
and cancellation regions.
A transformed workflow is generated for every view in the FCW. First, the
addition of elements in the workflow is linear in the number of views. However,
the complexity of the soundness analysis increases exponentially because of the
extra loops and cancellation regions added by the transformation. Generating
separate workflows limits the growth of the state space. Secondly, we want
to reduce the sensitivity of the soundness analysis to workflow constructs. As
we will see in Section 7.8, loops and or-joins have a detrimental effect on
performance and correctness. The individual analysis guarantees that views
not affected by these constructs are correctly analysed. The division of the
analysis thus increases scalability, precision, and reliability.
The transformation of a workflow is a seven-step process. Each transforma-
tion is informally discussed here and illustrated on the example in Figure 7.6,
transformed in Figure 7.8. The transformation is formally defined in Algo-
rithm 2 in Appendix C.
Figure 7.8 – The transformed workflow for the example of Figure 7.6. The coloured areas correspond to the transformation steps (1. updateBeginCondition, 2. buildEndCheck, 4. buildConditionCheck, 5. buildTaskCheck, 6. buildPresenceCheck, 7. buildClean), together with the new places they introduce (e.g., enabled_t′, disabled_t′, active_o, active_task_o, end_o).
1. Update begin condition. The first step (blue area in Figure 7.8) adds
an extra task (init) between the begin condition and the rest of the
workflow. The goal of this task is to initialise the new places holding
tokens necessary for the condition, task, and presence checks.
2. Build end check. The second step (green area in Figure 7.8) builds the
necessary hooks for the task and condition checks. It is placed right
before the end condition to make sure that the stop of the task has been
activated before terminating the workflow.
3. Update end check. A new condition (end) is added, to which all the
other places connected to the original end condition are reconnected.
4. Build condition check. The fourth step (red area in Figure 7.8) builds
the hook for the task check. Its purpose is to make sure that the task
check is performed before the stop is activated.
5. Build task check. The task check (pink area in Figure 7.8) is structured
so that if the task is activated once, then it must always be followed by
the stop. In the example, t′ activates activate_t′, which enables
validate_t′, and deactivates disabled_t′. If validate_t′ has not
been activated before the stop condition is reached, then the task is dis-
abled (disable_t′ removes the token from enabled_t′), and the task
(t′) cannot be executed anymore. If the sequence tries to execute the
task, it is deadlocked. This reveals a sequence that executes the task
after the stop (violation of Definition 7.5.B). If disable_t′ is dead, it
means that t′ is mandatory.
6. Build presence check. The presence check (yellow area in Figure 7.8)
verifies that if the task has been executed (active_task_o contains
tokens), then its stop has also been activated (active_o contains to-
kens). If the task has been executed but not its stop, then end_o is
deadlocked.
7. Build clean. Cancellation regions are added to remove the extra tokens
created during the execution that could violate the proper completion
condition. They are executed at the end of the workflow, before the end
condition is reached.
The transformation thus makes it possible to verify Definition 7.5.B, and to
collect the mandatory configuration tasks. To check Definition 7.5.A, we verify
that the union of the features in the mandatory views (Vmandatory) completely
covers the views.
The FCW in Figure 7.8 does not satisfy Definition 7.5.B. Indeed, t′ can
deadlock if o is executed before it, as in ρ6. The deadlock results from the
absence of a token in enabled_t′ due to the execution of disable_t′. The
execution of disable_t′ removes all the tokens from the places in its cancellation
region, including enabled_t′. Another deadlock could occur in end_o
if there is no token in active_o. It means that t′ has been executed but
o has never been active before the end condition. This case is similar to ρ7.
Definition 7.5.A is not satisfied either. Since disable_t′ is not dead, t′ is
optional. Consequently, Vmandatory is empty, and none of the features in v are
propositionally defined by other views, as in ρ5.
To sum up, the verification of the safety property is performed in four
phases: (1) verification of the workflow soundness; (2) verification of the com-
plete view coverage; (3) generation of a transformed workflow for each view
and verification of the soundness property; (4) verification of the complete cov-
erage of the mandatory views. This guarantees that at least one valid product
can be derived from the FCW. The next stage is to ensure that the successive
decisions of the users actually lead to a valid product.
(loop) → TMTC Int. → Network Int. → System Engineer → (loop) → TMTC Int. → . . .

The wbfs algorithm computes a fixed point over m′. m′ is defined as the tuple
(m, wds, sds, fixed) where: m ∈ L_FCW; wds : P ∪ C → P(N) returns the weak
dependency set; sds : P ∪ C → P(N) returns the strong dependency set; and
fixed : T ∪ C → B returns true if the dependency set of a place can no longer
evolve, and false otherwise. For all the places p in the workflow, sds(p) and
wds(p) are initialised to the empty set and fixed(p) to false.
wbfs starts from condition o and initialises the queue Q of places to visit
with the predecessors of o to bootstrap the backward search (Line 2). Line 3
sets the starting condition of the search to fixed. By setting it to fixed, we
discard all the paths going out of the condition that could affect the places
placed before the stop. Finally, lines 4 to 9 iterate until the dependency sets
of each place stop varying, i.e., the fixed point is reached.
The updateDS function first initialises the temporary weak (tmpw) and
strong (tmps) dependency sets (lines 1 to 7). tmpw is set to ∅ to compute the
union. Similarly, tmps is set to ∅ if p is a task with an and-split. Otherwise,
it is set to N (the global feature set) to compute the intersection. The values of
tmps and tmpw are then computed (lines 8 to 20) as specified in Definition 7.8
and 7.9. Once done, the predecessors of p are enqueued to proceed with the
backward search (lines 21 to 23). Finally, wds and sds are checked. If the
sets are unchanged, then p is set to fixed, otherwise the dependency sets are
updated (lines 24 to 29).
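Since Definitions 7.8 and 7.9 are not repeated here, the sketch below only illustrates the worklist mechanics of wbfs for weak dependency sets: sets are propagated backwards from the stop condition o, whose own set is kept fixed, until a fixed point is reached. The place names and the graph encoding are ours, and split types (needed for strong dependency sets) are omitted.

```python
from collections import deque

def weak_dependency_sets(preds, succs, view_of, o):
    """Backward worklist sketch of wbfs, weak dependency sets only.

    preds/succs map each place to its neighbours, view_of maps a place
    to the features configurable there (empty for conditions), and o is
    the stop condition, whose dependency set is never updated.
    """
    wds = {p: set() for p in preds}   # wds(o) stays empty: o is "fixed"
    queue = deque(preds[o])           # bootstrap the backward search
    while queue:
        p = queue.popleft()
        if p == o:
            continue
        new = set(view_of.get(p, set()))
        for q in succs.get(p, ()):    # union over the successors of p
            new |= wds[q]
        if new != wds[p]:             # not yet stable: propagate further
            wds[p] = new
            queue.extend(preds.get(p, ()))
    return wds

# Toy chain b -> t1 -> c1 -> t2 -> o, with features x at t1 and y at t2.
preds = {"b": [], "t1": ["b"], "c1": ["t1"], "t2": ["c1"], "o": ["t2"]}
succs = {"b": ["t1"], "t1": ["c1"], "c1": ["t2"], "t2": ["o"], "o": []}
wds = weak_dependency_sets(preds, succs, {"t1": {"x"}, "t2": {"y"}}, "o")
```

Termination follows the same argument as Theorem 7.2: the sets grow monotonically and are bounded by the global feature set.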
We prove below that the wbfs is complete, always terminates, and is cor-
rect.
Theorem 7.1 (Complete exploration of wbfs)
Given a sound m, wbfs visits all the places that precede o.
Proof. wbfs is a relaxed version of BFS, as places can be explored more than
once. In wbfs, a place can be enqueued as long as the size of its dependency
set varies (grows or shrinks). When this is no longer the case, the place is marked
as fixed and can no longer be searched. This amounts to marking the place as
visited in the BFS algorithm.
Theorem 7.2 (Termination of wbfs)
Given a sound m, wbfs always terminates.
Proof. We know from Theorem 7.1 that the exploration is complete. We know
from line 24 in updateDS that each place is visited while its dependency sets
vary. The size of the dependency sets being bounded by ∅ and N and the stop
being set to fixed before the exploration, they converge toward a stable value.
When a place is fixed, it can no longer be enqueued and any occurrence still in
the queue will no longer be visited. When all the places are fixed, the queue
shrinks until it is empty. This guarantees that the loop (line 4) eventually
terminates, and so does the algorithm.
Theorem 7.3 (Correctness of wbfs)
Given a sound m, wbfs computes the dependency sets correctly.
Proof. We know from Theorem 7.1 that all the places are visited, and that they
can only be marked as fixed when their dependency sets are stable. Also, the
BFS guarantees that all the places between p and o are explored before p.
Any change to one of them is thus automatically propagated to p. We have to
demonstrate that:
2. sds are correct. We have to prove that (1) there is no f ∉ sds(p) that should
have been propagated to p, and (2) every f ∈ sds(p) has been correctly
propagated to p. Let us take pj • p −∗→ pj −∗→ o. If p → pj and split(p) =
AND, then f ∈ sds(p) by definition of the union. If split(p) = OR ∨ XOR,
then f ∈ sds(p) iff f ∈ ⋂_{ps ∈ p•} sds(ps).
If p −+→ pj, then for all pp ∈ •pj such that split(pj) = AND we have
f ∈ sds(pp) by definition of the union. If split(pj) = OR ∨ XOR, then
f ∈ sds(pp) iff f ∈ ⋂_{ps ∈ pp•} sds(ps). Inductively, sds(p) will contain f iff
it is propagated by the places in Fp,pj.
7.8 Experiments
In Section 7.7, we have seen how our algorithms verify the properties defined in
Section 7.6. These algorithms have been implemented in our FCW toolset. Details
on the implementation are available in Chapter 9. The research questions
addressed by our experiments are:
RQ7.1 How efficient are our algorithms? To verify satisfiability, our algo-
rithm relies on the verification of the soundness of the transformed work-
flow. We measure the performance of the verification. To compute de-
pendency sets, we propose a fixed point algorithm. We evaluate how fast
it converges to a solution.
4 http://code.google.com/p/linux-variability-analysis-tools/source/browse/?repo=formulas
5 YAWL’s engine is configured to stop the analysis when the number of reachable markings
grows beyond 5000. Since the number of markings is potentially infinite, that bound aims
“to balance responsiveness and precision” [WVvdA+ 09].
Figure 7.9 – Results of the soundness analysis for the transformed workflows.
or-joins expose a similar pattern. Tasks placed before a split whose outgoing
paths are later joined with an or have a computation time below one second.
The same holds for the task with the or-join and those placed after it. That
explains why the processing time of W9 is reasonable despite the presence of
an or-join. The computation for tasks between the split and the or-join is
well beyond 150 seconds, and the results are most of the time approximate,
irrespective of the complexity of the workflow. For instance, W3 and W7 have
different levels of complexity but fail in the same way.
These results clearly show that or-joins degrade the accuracy of the results
the most. In fact, or-joins require partial synchronisation of all the active paths
in the workflow. This synchronisation is the bottleneck in the analysis because
the reasoning engine waits “until it is not possible to add any relevant tokens to
the set of input condition” [vdAtH05], causing an explosion of the exploration
space. The accuracy of our algorithm can be improved at the expense of
one syntactical restriction: configuration tasks should not be placed on paths
synchronised at an or-join.
The evaluation of Algorithm 1 was conducted separately for the Linux Kernel
and eCos. Figure 7.10 shows the processing time in seconds of the algorithm for
these models and the different workflows, with and without loops. Each value
reported in the graph is an average of three successive runs of the algorithm.
Starting with RQ7.1, the first striking element is the difference between the
Linux Kernel and eCos. The maximum computation time for eCos is 0.9
seconds, whereas for the Linux Kernel it is 17 seconds. Since the algorithm
only considers the number of features irrespective of the complexity of the FM,
the size of the FM is clearly the major performance predictor. To a lesser
extent, the number of elements in the workflow also influences performance.
The graph also shows that these variations are exacerbated by the number of
features to process. Both eCos and the Linux Kernel exhibit the same variations
but differences are amplified in the latter case.
Unlike for safety, workflow constructs have a marginal impact on perfor-
mance. Loops induce slightly higher processing times on average: 4.59 vs 5.68
seconds for Linux; 0.23 vs 0.3 seconds for eCos. or-joins do not affect performance
since Definition 7.8 is independent of the type of join, and Definition 7.9 treats
xor- and or-joins similarly. The peaks of W6 and W9 do not seem to be due to a
particular pattern. They are peculiarities of the design of the workflow and
view-to-task mapping. The same holds for the higher processing time of the
non-loop version of W11.
7.9 Threats to Validity
Figure 7.10 – Results of the dependency set analysis for the Linux Kernel and
eCos (vmWare) FMs.
6 Gilbert [Gil10] reports that these workflows are “so simple that they are usually ex-
138 7 Advanced Configuration Scheduling
Finally, as the experiments show, the workflow size is not the leading factor of
complexity. Yet, additional case studies are needed to ensure that the decisions
we made to build these FCWs unveiled all the strengths and weaknesses of our
algorithms.
Propositional definability has not been evaluated in our experiments. Yet,
it is a crucial part of most of our algorithms. We recalled in Section 7.7.1 that
propositional definability can be obtained with a single SAT check. To date, the
largest FMs count up to 6000 features with thousands of constraints [BSL+ 10].
We know from [MWC09] that a SAT solver can verify the satisfiability of such
models in a few hundred milliseconds, even for the larger ones. That fact has also
been confirmed by our experience in FM analysis. Therefore, for any given
dependency set, the verification at execution time of open features is unlikely
to degrade the response time of an interactive configurator.
Unsupported constructs. In Section 7.3, some YAWL constructs were
excluded from our definition: cancellation regions, composite tasks, and mul-
tiple instance tasks. Cancellation regions were discarded because they could
cancel an ongoing configuration task. Whether the configuration should be
part of a transaction and rolled back to its previous state, or the decisions
made should be preserved is still unclear. Similarly, how composite configura-
tion tasks and configuration tasks with multiple occurrences should be handled
requires further investigation. Our experience with FCWs never required such
constructs but additional empirical evaluation is necessary.
Complete coverage of mandatory tasks. Definition 7.5.A assumes that
mandatory configuration tasks must completely cover the set of features. The
benefit is that whatever path is followed at execution time, the final config-
uration will always be complete. The flip side of the coin is that optional
configuration tasks can only provide alternative ways of reaching a solution
without bringing in new features.
Alternatively, the set of optional tasks that must be executed given the current
decisions could be computed at execution time. Such an analysis relies on dependency
sets and open features in the complete model. Our algorithms can already pro-
vide that information. The challenge is the analysis of possible paths in the
workflow. Such an approach puts a heavier burden on execution time analyses,
which could harm the responsiveness of the configurator.
Finally, default values, and semi or fully automated completion mecha-
nisms [JBGMS10] could be used at the end of the process to complete the
configuration. These are less resource-demanding since no analysis of the con-
figuration process is required. The downside of default values, though, is that
their assignment at design time can become very cumbersome for large models.
7.10 Related Work

This section revisits some selected work on workflow configuration and the
integration of processes with variability models.
Gottschalk et al. [GvdAJVLR07] developed C-YAWL, a configurable ver-
sion of YAWL, and a tool that allows the configuration and generation of YAWL
workflows. La Rosa et al. [LRvdADTH08] also focus on workflow configura-
tion. They propose a questionnaire-based approach to resolve variability in the
process. They also define partial and full dependencies to control the order in
which questions are asked. Their approach is complementary to ours and can
be used upstream of FBC to tailor the configuration process for a particular
application.
Rabiser et al. [RGD07] propose an approach supporting product configu-
ration based on decision models. Essentially, decision models represent assets
(e.g., features) tied to decisions, bound together through logic dependencies.
Decisions stand for the intervention of a role selecting assets during product
configuration. Decisions, roles and assets are thus all part of a single decision
model. They also discuss how models need to be prepared to meet the require-
ments of a specific project before allowing product derivation. Decision models
differ from FCWs in that the configuration process is entangled within the
decision model/questionnaire. By separating the workflow from the options,
we argue that FCW achieves better separation of concerns between process
and decision making. However, FCWs and decision models/questionnaires are
complementary. The control provided by dependencies could offer fine-grained
scheduling within each view.
White et al. [WBDS09] reason about contextual constraints (e.g., yearly
budget) to schedule the configuration of an FM into multiple steps. They also
show how the mapping of the FM and constraints in CSP allows deriving valid
configurations. That latter approach is complementary to ours as constraints
on steps could be used during the validation of the workflow at design time.
Mendonça et al. [MCMdO08] partition an FM into configuration spaces
whose configuration order is captured in a configuration plan. The order is de-
termined by a dependency analysis between the different configuration spaces.
While their approach is fully automated, ours enables a more flexible definition
of concerns and precedence among views.
Acher et al. [ACLF10] use FMs to capture the variability of web services
that are threaded together in a workflow. However, contrary to FCWs, they do
not assume the pre-existence of a global FM. They define composition operators
that merge FMs of “connected” services. The two main differences with FCWs
are that they do not support crosscutting constraints, and the workflow does
not contain configuration tasks but provided services.
8 Towards Conflict Management
Until now, we have assumed that user decisions are invariable during the con-
figuration process. Yet, change is inherent to configuration. Most configurators
implement some form of assistance to deal with change and help users reach
a correct and complete configuration. To do so, they detect possible config-
uration errors, report, and avoid them. A configuration error is a decision
that conflicts with some constraints. Satisfying these constraints is often
non-trivial: variability languages often carry advanced constructs that
introduce hidden constraints, and constraints declared in different places of
the variability model may interact. The interplay of these factors often leads
to very complex situations.
Some configuration tools, like those based on Kconfig (the Linux kernel
Activating inactive features. 20% of the Linux users report that, when
they need to change an inactive option, they need at least a “few dozen
minutes” on average to figure out how to activate it. 56% of the eCos
users also consider the activation of an inactive option to be a problem.
Fix incompleteness. Existing configurators generate only one fix for an er-
ror. However, there are often multiple ways to resolve an error, and the
user may prefer another solution. 7 out of 9 eCos users have encountered
situations where the generated fix is not useful. That claim is corrobo-
rated by Berger et al. [BSL+10], who report that eCos users complain
on the mailing list about the incompleteness of fixes.
Since activating a feature also requires satisfying the corresponding con-
straint, activation is essentially the same problem as resolving a configuration
error, and the idea of fixes also applies to it. A possible solution to the above
two problems is therefore to generate fixes both for resolving errors and for
activating features, and the list of fixes should be complete so that users can
choose the one they want.
To achieve this goal, two main challenges need to be addressed. First, a
previous study of eCos models [PNX+11] shows that non-Boolean operators,
such as arithmetic, inequality, and string operators, are quite common in their
constraints. In fact, the models contain four to six times more non-Boolean
constraints than Boolean ones. Non-Boolean constraints are challenging since
there is often an infinite number of ways to satisfy them. Computing such
an infinite list of fixes is pointless; a compact, intensional representation
of fixes is needed. Second, many existing approaches rely on constraint solvers
Range fixes. We propose a novel concept, the range fix (Section 8.3), to
address the first challenge. Instead of telling users what concrete changes
should be made, a range fix tells them which options should be changed
and in what range the value of each option can be chosen. A range fix can
represent an infinite number of concrete fixes and still serves the goal of
helping the user satisfy constraints. In particular, we discuss the desired
properties of range fixes, which formalize the requirements of the fix
generation problem. In addition, we discuss how constraint interactions
should be handled in our framework (Section 8.5).
Evaluation with eCos. Our algorithm is (1) applied to eCos CDL (Sec-
tion 8.6) and (2) evaluated on data from five open source projects using
eCos (Section 8.7). The evaluation compares three different fix genera-
tion strategies and concludes that the propagation strategy is the most
effective one on our dataset. Specifically, for a total of 117 constraint
violations, the evaluation of the propagation strategy shows that our no-
tion of range fix leads to mostly simple yet complete sets of fixes (83%
of the fix lists have a size smaller than 10, where the size is the total
number of variables in all the fixes of the list). It also demonstrates that
our algorithm can generate fixes for models containing hundreds of options
and constraints in 50 ms on average and 245 ms at most.
We motivate our work with a concrete example based on the eCos configurator
[VD01]. Figure 8.1 shows a small model for configuring an object pool. The
left panel shows a set of options that can be changed by the user, organized
into a tree. The lower-right panel shows the properties of the currently
selected option, as defined in the eCos model. In particular, the flavor
property indicates whether the option is a Boolean option or a data option. A Boolean
option can be either selected or unselected; a data option can be assigned an
integer or a string value. In Figure 8.1, “Pre-Allocation Size” is a data option;
“Use Pre-Allocation” is a Boolean option.
Besides the flavor, each option may also declare constraints using the requires
or active-if properties. When a requires constraint is violated, an error
is reported in the upper-right panel. In Figure 8.1, the option “Pre-Allocation Size”
declares a requires constraint demanding that its value be smaller than or equal
to “Object Pool Size”; an error is reported because this constraint is violated.
An active-if constraint implements the error avoidance mechanism. When
it is violated, the option is disabled in the GUI and its value is considered as
zero. Figure 8.2 shows the properties of the “Startup” option. This option
declares that at most half of the object pool can be pre-allocated. Since this
constraint is violated, the “Startup” option is disabled, and the user cannot
change its value.
• [Use_Pre_Allocation := false]
Each range fix consists of two parts: the option to be changed and a con-
straint over the options specifying the range of values. The first range fix is
also a concrete assignment, and is automatically applied when selected.
The other fixes are ranges. If the user selects, for example, the second fix,
the configurator will highlight the option “Pre-Allocation Size”, prompt the
range “<= 8”, and ask the user to select a value in that range.
8.3.2 Definitions
Although feature modelling languages have different constructs and semantics,
existing work [SHTB07, CW07, BS10a, BSL+ 10] shows that they basically boil
down to a set of variables (options) and a set of constraints. Our approach also
builds upon this principle.
In essence, a feature modelling language provides a universe of typed vari-
ables V, and a constraint language Φ(V) for writing quantifier-free predicate
logic constraints over V. Consequently, a constraint violation can be defined
as follows.
Definition 8.1 Constraint violation
A constraint violation consists of a tuple (V, e, c), where V ⊆ V is a set of typed
variables; the current configuration e is a function assigning a type-correct value
to each variable; and c ∈ Φ(V ) is a constraint over V violated by e.
A fix generation problem for a violation (V, e, c) is to find a set of range fixes
to help users produce a new configuration e′ such that c is satisfied, denoted
as e′ |= c.
Consider the following example of a constraint violation:
All range fixes we have seen so far change only one variable, but more
complex fixes are sometimes inevitable. For example, we cannot solve viola-
tion (8.1) by changing only one variable. Several alternative fixes are possible:
The first fix contains two parts separated by “,”, each changing a variable. We
call each part a fix unit. The second fix is more complex. This fix contains only
one fix unit, but the range of this fix unit is defined on two variables. When
the fix is executed, the user has to choose a value for each variable within the
range.
8.3 Range Fixes 147
Taking the above forms into consideration, we can define a range fix. A
range fix r for a violation (V, e, c) is a set of fix units. A fix unit can be
either an assignment unit or a range unit. An assignment unit has the form
of “var := val” where var ∈ V is a variable and val is a value conforming to
the type of var. A range unit has the form of “U : cstrt”, where U ⊆ V is a
set of variables and cstrt ∈ Φ(U ) is a satisfiable constraint over U specifying
the new ranges of the variables. A technical requirement is that the variables
in fix units should be disjoint, otherwise two different values may be assigned
to one variable.
We use r.V to denote the set of the variables to be changed in all units.
We use r.c to denote the conjunction of the constraints from all units. The
constraint from an assignment unit “var := val” is var = val, and the con-
straint from a range unit “U : cstrt” is cstrt. For example, let r be the range
fix [m := false, b : b > 10]; then r.V = {m, b} and r.c is m = false ∧ b > 10.
Applying a range fix r of violation (V, e, c) to e produces a new configura-
tion interactively. We denote the set of all configurations that can be produced
by applying r to e as r ▷ e, where r ▷ e = {e′ | e′ |= r.c ∧ ∀v ∈ V (e′(v) ≠ e(v) →
v ∈ r.V)}.
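These definitions can be made concrete in a short sketch. The following Python code is our own illustration, not the thesis tooling (all names are ours): it models assignment and range units, computes r.V, and checks membership in r ▷ e for the example fix [m := false, b : b > 10].

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Set

@dataclass
class AssignmentUnit:          # "var := val"
    var: str
    val: Any

@dataclass
class RangeUnit:               # "U : cstrt"
    vars: Set[str]
    holds: Callable[[Dict[str, Any]], bool]   # the constraint cstrt over U

class RangeFix:
    def __init__(self, units: List[Any]):
        self.units = units

    @property
    def changed_vars(self) -> Set[str]:       # r.V: variables changed by any unit
        vs: Set[str] = set()
        for u in self.units:
            vs |= {u.var} if isinstance(u, AssignmentUnit) else u.vars
        return vs

    def satisfies(self, cfg: Dict[str, Any]) -> bool:   # cfg |= r.c
        return all(cfg[u.var] == u.val if isinstance(u, AssignmentUnit)
                   else u.holds(cfg) for u in self.units)

    def admits(self, e, e2) -> bool:          # e2 ∈ r ▷ e
        # e2 must satisfy r.c and differ from e only on variables in r.V
        return self.satisfies(e2) and all(
            v in self.changed_vars for v in e if e2[v] != e[v])

# r = [m := false, b : b > 10]
r = RangeFix([AssignmentUnit("m", False),
              RangeUnit({"b"}, lambda c: c["b"] > 10)])
e = {"m": True, "a": 6, "b": 5}
print(sorted(r.changed_vars))                       # ['b', 'm']
print(r.admits(e, {"m": False, "a": 6, "b": 12}))   # True
print(r.admits(e, {"m": False, "a": 7, "b": 12}))   # False: a changed but a ∉ r.V
```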
Minimal fixes, however, might not cover all possible changes that resolve a
violation, and these uncovered cases might be preferred by some users. Still,
as our evaluation will show, minimality is a good heuristic in practice.
Thirdly, after determining a set of variables, we would like to present
the maximal range of the variables. The reason is simple: extending the
range over the same set of variables gives more choices, and usually neither
decreases readability nor affects more existing user decisions. For example,
[m := true, b : b > 10] is better than [m := true, b : b > 11] because it covers
a wider range on b.
Property 8.3 Maximality of ranges
There exists no fix r′ for (V, e, c) such that r′ is correct, r′.V = r.V, and
(r ▷ e) ⊂ (r′ ▷ e).
Fourthly, after deciding the range over the variables, we would like to rep-
resent the range in the simplest way possible. Thus, another desired property
is that a fix unit should change as few variables as possible. In other words, no
fix unit can be divided into smaller equivalent fix units. We call this property
minimality of units.
However, as we treat Φ as a general notion, our generation algorithm cannot
ensure that all fix units are minimal. Therefore we do not treat this property
as part of the formal requirement. This does not seem to be a limitation in
practice: all fix units generated in our evaluation contain only one variable
and are thus minimal by construction.
Armed with these properties of range fixes, we can define the completeness
of a list of fixes. Since the same constraint can be represented in different ways,
we need to consider the semantic equivalence of fixes. Two fixes r and r′ are
semantically equivalent if (r ▷ e) = (r′ ▷ e); otherwise they are semantically
different.
Property 8.4 Completeness of fix lists
Given a constraint violation (V, e, c), a list of fixes L is complete iff
• any two fixes in L are semantically different,
• each fix in L satisfies Properties 8.1, 8.2, and 8.3,
• and any fix that satisfies Properties 8.1, 8.2, and 8.3 is semantically
equivalent to a fix in L.
Thus, a fix generation problem is to find a complete list of fixes for a given
constraint violation (V, e, c).
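When variables range over small finite domains, semantic equivalence can be checked by brute-force enumeration of r ▷ e. The sketch below is our own illustration (Boolean variables only, names ours); it shows two syntactically different fixes that denote the same range.

```python
from itertools import product

def admitted(e, changed, pred, variables):
    """Enumerate r ▷ e over Boolean variables: configurations satisfying the
    fix's constraint (pred) that differ from e only on the changed variables."""
    result = set()
    for values in product([False, True], repeat=len(variables)):
        cfg = dict(zip(variables, values))
        if pred(cfg) and all(cfg[v] == e[v] for v in variables if v not in changed):
            result.add(tuple(sorted(cfg.items())))
    return result

variables = ["m", "b"]
e = {"m": True, "b": True}
# r = [m := false] and r' = [{m} : ¬m] denote the same range, written differently
r1 = admitted(e, {"m"}, lambda c: c["m"] is False, variables)
r2 = admitted(e, {"m"}, lambda c: not c["m"], variables)
print(r1 == r2)   # True: the two fixes are semantically equivalent
```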
8.4 Fix Generation Algorithm
In Section 8.3.3 we claimed that a fix should change a minimal set of variables
and have a maximal range. Accordingly, our generation algorithm consists of
three steps. (i) We find all minimal sets of variables that need to be changed.
For example, in violation (8.1), a minimal set of variables to change is D =
{m, b}. (ii) For each such set of variables, we replace every unchanged variable
in c by its current value, obtaining a maximal range over the variables. In the
example, we replace a by 6 and get (m → 6 > 10) ∧ (¬m → b > 10) ∧ (6 < b).
(iii) We simplify the range to get a set of minimal, or close to minimal, fix
units. In the example, we get [m := true, b : b > 10]. Step (ii) is trivial
and does not require further development; we now concentrate on steps (i)
and (iii).
To make the whole set satisfiable, we need to remove at least constraints {1, 3}
or constraints {2, 3}, which correspond to two variable sets {m, b} and {a, b}.
To find all diagnoses, Reiter’s theory uses an ability of most SAT/SMT
solvers: finding an unsatisfiable core. An unsatisfiable core is a subset of the
soft constraints that is still unsatisfiable. For example, the above constraint
set has two unsatisfiable cores {1, 2} and {3}.
150 8 Towards Conflict Management
Figure 8.4 shows an HS-DAG for the above example. Suppose the constraint
solver initially returns the unsatisfiable core {1, 2}, and a root node is created
for this core. Then we build an arc for each constraint in the core. In this case,
we build two arcs 1 and 2. The left arc is 1, so we remove constraint [1] from
the set, and invoke the constraint solver again. This time the constraint solver
returns {3}. We remove constraint [3] and now the constraint set is satisfiable.
We create a node SAT for the edge. Similarly, we repeat the same steps for all
other edges until all paths reach SAT. Finally, each path from the root to the
leaf is a diagnosis. In this case, we have {1, 3} and {2, 3}.
This process alone cannot ensure that the generated diagnoses are minimal.
To ensure minimality, three additional rules are applied to the algorithm;
their details can be found in [GSW89]. Greiner et al. [GSW89] prove that
the HS-DAG builds a complete set of minimal diagnoses after applying the
three rules.
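The walkthrough above can be condensed into a small sketch. This is our own illustration: the SAT/SMT solver is replaced by a hard-coded oracle returning the two unsatisfiable cores of the example, and Reiter's three pruning rules are replaced by a simple minimality filter.

```python
def unsat_core(constraints):
    """Stand-in for the solver: return an unsatisfiable core of the given
    set, or None if it is satisfiable. The example's cores are {1,2} and {3}."""
    for core in ({1, 2}, {3}):
        if core <= constraints:
            return core
    return None

def diagnoses(constraints):
    """Enumerate minimal removal sets (diagnoses) HS-DAG style: each node is
    a core, each arc removes one constraint of the core, and SAT leaves yield
    diagnoses. Without Reiter's pruning rules, non-minimal paths are filtered."""
    found = []
    def explore(remaining, removed):
        core = unsat_core(remaining)
        if core is None:                                  # SAT leaf: a diagnosis
            if not any(d <= removed for d in found):      # keep minimal ones only
                found.append(removed)
            return
        for c in sorted(core):                            # one arc per element
            explore(remaining - {c}, removed | {c})
    explore(set(constraints), frozenset())
    return found

print(sorted(tuple(sorted(d)) for d in diagnoses({1, 2, 3})))  # [(1, 3), (2, 3)]
```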
To simplify a range into fix units, our idea is to convert it into conjunctive
normal form (CNF) and convert each clause into a fix unit. Yet, we still need
to carefully make sure the fix units are disjoint and as simple as possible.
First, if the constraints contain any operators convertible to propositional
operators, we convert them into propositional operators. For example, eCos
constraints contain the conditional operator “?:”, as in (m ? a : b) > 10. We
convert it into propositional operators: (¬m ∨ a > 10) ∧ (m ∨ b > 10).
Secondly, we convert the constraint into CNF. In our example, with {m, b},
we have (m → 6 > 10) ∧ (¬m → b > 10) ∧ (6 < b), which gives three clauses in
CNF: {¬m ∨ 6 > 10, m ∨ b > 10, 6 < b}.
Thirdly, we apply the following rules repetitively until we reach a fixed
point.
Rule 1 Apply constant folding to all clauses.
Rule 2 If a clause contains only one literal, delete the negation of this literal
from all other clauses.
Rule 3 If a clause C1 contains all literals of C2, delete C1.
Rule 4 If a clause takes the form v = c, where v is a variable and c is a
constant, replace all occurrences of v with c.
In our example, applying Rule 1 to the above CNF, we get {¬m, m ∨ b >
10, 6 < b}. Applying Rule 2, we get {¬m, b > 10, 6 < b}. No further rule can
be applied to this CNF.
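Rules 2 and 3 operate purely at the clause level and can be sketched over sets of opaque literals. This is our own illustration, not the thesis implementation: literals are strings with “!” marking negation, and Rules 1 and 4, which need an expression evaluator, are omitted.

```python
def neg(lit):
    return lit[1:] if lit.startswith("!") else "!" + lit

def simplify(clauses):
    """Apply Rule 2 (a unit clause deletes its negated literal elsewhere) and
    Rule 3 (subsumption) until a fixed point is reached. Clauses are
    frozensets of literal strings."""
    clauses = set(clauses)
    changed = True
    while changed:
        changed = False
        for unit in [c for c in clauses if len(c) == 1]:   # Rule 2
            if unit not in clauses:
                continue
            lit = next(iter(unit))
            for c in list(clauses):
                if c != unit and neg(lit) in c:
                    clauses.remove(c)
                    clauses.add(c - {neg(lit)})
                    changed = True
        for c1 in list(clauses):                           # Rule 3
            if any(c2 < c1 for c2 in clauses):
                clauses.discard(c1)
                changed = True
    return clauses

# {¬m, m ∨ b>10, 6<b}: Rule 2 deletes m from the second clause
cnf = {frozenset({"!m"}), frozenset({"m", "b>10"}), frozenset({"6<b"})}
print(sorted(sorted(c) for c in simplify(cnf)))  # [['!m'], ['6<b'], ['b>10']]
```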
Fourthly, two clauses are merged into one if they share variables. In the
example, we have {¬m, b > 10 ∧ 6 < b}.
Fifthly, we apply any domain specific rules to simplify the constraints in
each clause, or divide the clause into smaller, disjoint ones. These rules are
designed according to the types of operators used in the constraint language.
In our current implementations of CDL expressions, we use two types of rules.
First, for clauses containing only linear equations or inequalities with one vari-
able, we solve them and merge the result. Secondly, we eliminate some ob-
viously eliminable operators, such as replacing a + 0 with a. We also apply
Rule 1 and Rule 4 shown above during the process. In the example, the second
clause consists of two linear inequalities; we solve them and merge the ranges
on b, obtaining {¬m, b > 10}.
Finally, we convert each clause into a fix unit. If the clause has the form of
v, ¬v, or v = c, we convert it into an assignment unit, otherwise we convert it
into a range unit. In the example, we convert ¬m into an assignment unit and
b > 10 into a range unit and get [m := false, b : b > 10].
As mentioned before, the above algorithm does not guarantee the fix units
are minimal. The reason is that we cannot ensure that the domain-specific rules
in the fifth step are complete, since some common operators such as those on
strings are undecidable in general [BTV09].
So far we have only considered range fixes for one constraint. However, the
constraints in variability models are often interrelated; satisfying one constraint
might violate another. As a result, we have to consider multi-constraint viola-
tions rather than single-constraint violations.
Definition 8.2 Multi-constraint violation
A multi-constraint violation is a tuple (V, e, c, C), where V and e are un-
changed, c is the currently violated constraint, and C is the set of constraints
defined in the model and satisfied by e.
The following example shows how a fix satisfying c can conflict with other
constraints in C that were previously satisfied.
Ignorance
All constraints in C are simply ignored, and only fixes for (V, e, c) are gener-
ated. This strategy is used in fix generation approaches considering only one
constraint [NEF03]. This strategy does not solve the constraint interaction
problem at all. However, it has its merits: first, the fixes are only related to
the violated constraint, which makes it easier for the user to comprehend the
relation between the fixes and the constraints; secondly, this strategy does not
suffer from the problems of incomplete and large fix lists, unlike the two
others; thirdly, it requires the least computation effort and is the easiest to
implement.
Elimination
When a fix violates other satisfied constraints, it is excluded from the list
of fixes, i.e., the fix is “eliminated” by other constraints. In the example in
violation (8.2), fix r will violate c2 and thus is excluded from the generated fix
set. This strategy is proposed by Egyed et al. [ELF08] and used in their UML
fix generation tool.
To apply this strategy to range fix generation, we first find the subset of
C that shares variables with c, then replace the variables not in c with their
current values in e, and connect all constraints by conjunction. For example,
to apply the elimination strategy to violation (8.2), we first find the constraints
sharing variables with c, which include only c2, and then replace x and y in c2
with their current values, getting c′2 = n → false ∨ false. Then we generate
fixes for (V, e, c ∧ c′2).
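The preprocessing step can be sketched as follows. Since violation (8.2) itself is not shown in this excerpt, the constraints below are our reconstruction: we assume c1 simply requires n, and c2 is n → x ∨ y with x and y currently false, matching the c′2 = n → false ∨ false derivation above.

```python
def eliminate(c, C, e):
    """Elimination preprocessing: keep the constraints of C sharing variables
    with c, freeze their other variables to current values, and conjoin.
    Constraints are pairs (vars, pred) where pred maps a configuration to bool."""
    c_vars, c_pred = c
    preds = [c_pred]
    for vars_, pred in C:
        if not (vars_ & c_vars):
            continue                                  # unrelated: ignored
        frozen = {v: e[v] for v in vars_ - c_vars}    # substitute current values
        preds.append(lambda cfg, p=pred, f=frozen: p({**cfg, **f}))
    return c_vars, lambda cfg: all(p(cfg) for p in preds)

c1 = ({"n"}, lambda cfg: cfg["n"])                    # violated: n must hold
c2 = ({"n", "x", "y"}, lambda cfg: not cfg["n"] or cfg["x"] or cfg["y"])
e = {"n": False, "x": False, "y": False}
_, conj = eliminate(c1, [c2], e)
# Neither value of n satisfies c1 ∧ c′2: the fix set is empty, as in the text
print(any(conj({"n": v}) for v in (True, False)))     # False
```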
Although the elimination strategy prevents the violation of new constraints,
it has two noticeable drawbacks. First, it excludes many potentially useful
fixes: introducing new errors during error resolution is sometimes inevitable,
and simply excluding such fixes provides less help to the user. In our example,
we get an empty fix set, which does not help the user resolve the error.
Secondly, since we need to deal with the conjunction of several constraints,
the resulting constraint is much more complex than the original one. Our
evaluation showed that some conjunctions can involve more than ten constraints.
Nevertheless, compared to the propagation strategy, this increase in complexity
is still small.
Propagation
When a fix violates other constraints, we further modify variables in the vi-
olated constraints to keep these constraints satisfied. In this case, the fix is
“propagated” through other constraints. For example, fix r will violate c2 ,
so we further modify variables x or y to satisfy c2 . Then the modification
of x will violate c3 , and we further modify z. In the end, we get two fixes
[n := true, x := true, z := true] and [n := true, y := true]. This approach is
used in the eCos configuration tool [VD01], and the feature model diagnosis
approach proposed by White et al. [WSB+ 08].
To apply this strategy, we first perform a static slicing on C to get the set
of constraints directly or indirectly related to c. More concretely, we start
from a set D containing only c. If a constraint c′ shares any variable with any
constraint in D, we add c′ to D. We keep adding constraints until we reach a
fixed point. Then we take the conjunction of all constraints in D, and generate
fixes for the conjunction. For example, to apply the propagation strategy to
violation (8.2), we start with D = {c1}, then add c2 because it shares variables
with c1.
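The slicing step can be sketched directly from this description. The variable sets below follow the chapter's running example (c1 over n; c2 over n, x, y; c3 over x, z); the unrelated c4 is our own addition to show what the slice excludes.

```python
def slice_constraints(c, C):
    """Static slicing: collect the constraints directly or transitively sharing
    a variable with c, until a fixed point. Constraints are (name, vars) pairs."""
    D = [c]
    seen = set(c[1])
    changed = True
    while changed:
        changed = False
        for cons in C:
            if cons not in D and cons[1] & seen:
                D.append(cons)
                seen |= cons[1]
                changed = True
    return D

c1, c2 = ("c1", {"n"}), ("c2", {"n", "x", "y"})
c3, c4 = ("c3", {"x", "z"}), ("c4", {"w"})
print([name for name, _ in slice_constraints(c1, [c2, c3, c4])])  # ['c1', 'c2', 'c3']
```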
8.6 Implementation
We have implemented a command-line tool generating fixes for eCos CDL using
the Microsoft Z3 SMT solver [DMB08]. Our tool takes a CDL configuration
as input, and automatically generates fixes for each configuration error found.
Alternatively, the user can enter an option to activate via the command-line
interface, and our tool generates fixes to activate this option.
To implement our algorithm, one important step is to convert the constraints
in the CDL model into the standard input format of the SMT solver, SMT-LIB
[BST10]. To perform this task, we carefully studied the formal semantics of
CDL [Xio11, BS10a], reverse engineered from the configurators and the
documentation. We faced two problems during the conversion. First, CDL is
an untyped language, while SMT-LIB is a typed language. To convert CDL,
we implemented a type inference algorithm that infers the types of the options
from their uses. When a unique type cannot be inferred or type conflicts occur,
we decide the option types manually.
The second problem is dealing with string constraints. The satisfiability
problem of string constraints is undecidable in general [BTV09], and general
SMT solvers do not support string constraints [DMB08]. Yet, string constraints
are heavily used in CDL models. Nevertheless, our previous study on CDL con-
straints [PNX+11] shows that the string constraints used in real-world models
employ a set semantics: a string is considered as a set of substrings separated
by spaces, and string functions are actually set operations. For example,
is_substr is actually a set membership test. Based on this observation, we
encode each string as a bit vector, where each bit indicates whether a particular
substring is present or not. Since fix generation never needs to introduce new
substrings, the size of the bit vector is always finite and can be determined
by collecting all substrings in the model and the current configuration.
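The encoding can be sketched as follows (the option strings are invented for illustration; the thesis tool performs this inside the SMT translation):

```python
def substring_universe(strings):
    """Collect every space-separated substring occurring in the model and the
    current configuration, and assign each one a bit position."""
    tokens = sorted({tok for s in strings for tok in s.split()})
    return {tok: i for i, tok in enumerate(tokens)}

def encode(s, index):
    """Encode a string as a bit vector over the substring universe."""
    bits = 0
    for tok in s.split():
        bits |= 1 << index[tok]
    return bits

def is_substr(haystack_bits, needle, index):
    """is_substr under the set semantics: a set-membership (bit) test."""
    return bool(haystack_bits & (1 << index[needle]))

index = substring_universe(["net ipv6", "net", "usb serial"])
flags = encode("net ipv6", index)
print(is_substr(flags, "ipv6", index))    # True
print(is_substr(flags, "serial", index))  # False
```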
8.7 Evaluation
8.7.1 Methodology
Our algorithm ensures Properties 8.1–8.4 for the generated range fixes. However, to
really know whether the approach works in practice, several research questions
need to be answered by empirical evaluation:
• RQ8.2 How often are the final user changes covered by our fixes?
• RQ8.4 Does our approach cover more user changes than existing ap-
proaches?
the suggestions from the eCos configurator. We record this corrected default
configuration as the first version. Then we diff each pair of consecutive revisions
to find changes to options. Next we replay these changes to simulate the real
configuration process. Since we do not know the order of changes within a
revision, we use three orders: a top-down exploration of the configuration file,
a bottom-up one, and a random one. The rationale for the first two orders is
that expert users usually edit the textual configuration file directly rather than
using the graphical configurator. In this case, they will read the options in the
order that they appear in the file, or the inverse if they scroll from bottom to
top.
We replay the changes as just explained and collect (i) errors—violating
requires constraints—and (ii) activation violations. An activation violation
occurs when an option value should be changed, but is currently inactive. The
last two columns in Table 8.2 show the number of violations resulting from
these changes. After duplicate removal, 27 errors and 22 activation violations
remain; together with the first dataset, we have a total of 117 multi-constraint
violations.
Finally, we invoke our tool to generate fixes for the 117 violations. For
RQ8.4, we also invoke the built-in fix generator of the eCos configurator on the
27 errors from the user changes. The activation violations are not compared
because they are not supported by the eCos configurator. The experiments
were executed on a computer with an Intel Core i5 2.4 GHz CPU and 4 GB
memory.
8.7.2 Results
We first give the results for RQ8.1-RQ8.4 using the propagation strategy. We
answer RQ8.5 by presenting the comparison of the three strategies last.
[Figure 8.5: distribution of the number of fixes per violation (x-axis: number of fixes; y-axis: number of violations).]
[Figure 8.6: distribution of the number of variables changed per fix (x-axis: number of variables per fix; y-axis: number of fixes).]
RQ8.1
To answer RQ8.1, we first consider two basic measures over the 117 violations:
the distribution of the number of fixes per violation (see Figure 8.5), and the
distribution of the number of variables changed by each fix (see Figure 8.6).
From these figures we see that most fix lists are short and most fixes change a
small number of variables. More concretely, 95% of the fix lists contain at most
five fixes, and 75% of the fixes change fewer than five variables. There is also
an activation violation that did not produce any fix. A deeper investigation of
this violation revealed that the option is not supported by the current hardware
architecture, and cannot be activated without introducing new configuration
errors. The extracted changes actually lead to an unsolved configuration error
in the subsequent version.
It is still unclear how the combination of fix number and fix size affects the
size of a fix list, and how the large fixes and long lists are distributed over the
violations. To understand this, we measure the size of a fix list, defined as
the total number of variables in its fixes. The result is shown in Figure 8.7.
From the figure we can see that the propagation strategy does lead to some
large fix lists. The largest involves 58 variables, which is
[Figure 8.7: distribution of fix-list sizes, i.e., the total number of variables per violation.]
not easily readable. However, long lists and large fixes appear only on
relatively few violations, and the majority of the fix lists are still small: 83%
of the violations contain fewer than 10 variables.
We also measure the number of variables in each fix unit to understand
how large the fix units are. It turns out that every fix unit contains only one
variable. This shows that (1) “minimality of fix units” effectively holds for all
the violations, and (2) ranges declared over more than one variable (such as
the second fix for violation (8.1)) never appeared in the evaluation.
RQ8.2
RQ8.3
For each of the 117 violations, we invoked the fix generator 100 times, and
calculated the average time. The result is presented as a density graph in
Figure 8.8. It shows that most fixes are generated within 100 ms. Some fixes
require about 200 ms, which is still acceptable for interactive tools.
[Figure 8.8: density of fix generation times (x-axis: generation time in ms; y-axis: density of violations).]
RQ8.4
We measure whether the fixes proposed by the eCos configurator cover the user
changes, in the same way as in RQ8.2. Since the eCos configurator cannot
handle activation violations, we measure only error resolutions. 26 of the
27 errors have subsequent corrections. The eCos configurator was able to
handle 19 of the 26 errors, giving a coverage of 73%. Comparatively, our
tool covered all 26 errors.
RQ8.5
As discussed in Section 8.5, the propagation strategy potentially produces large
fix lists. At this stage, we would like to know if the other two strategies actually
produce simpler fixes. We compare the size of fix lists generated by the three
strategies in Figure 8.9. The elimination and ignorance strategies completely
avoid large fix lists, with the largest fix list containing four variables in total.
The elimination strategy changes even fewer variables because some of the
larger fixes are eliminated.
We also compare the generation times of the three strategies. Over all viola-
tions, the average generation time is 50 ms for the propagation strategy, 20 ms
for the elimination strategy, and 17 ms for the ignorance strategy. Since the
overall generation time is small, this makes little difference in tooling.
Next, we want to understand to what extent the other two strategies affect
completeness or introduce new errors. First, the elimination strategy does
not generate fixes for 17 violations. This is significantly more than the
ignorance and propagation strategies, which fail on zero and one violation, re-
spectively. We also measure the coverage of user changes using the elimination
strategy: of the 47 violations, only 27 are covered, giving a coverage of 57%.
This is even lower than the eCos configurator, which generates only one fix,
showing that many useful fixes were eliminated by this strategy.
[Figure 8.9: distribution of fix-list sizes (number of variables) for the elimination, ignorance, and propagation strategies.]
The problem of the ignorance strategy is that it may bring new errors.
To see how frequently a fix brings new errors, we compare the fix list of the
ignorance strategy with the fix list of the elimination strategy. If a fix does
not appear in the list of the elimination strategy, it may potentially bring new
errors. By this measure, 32% of the fixes generated by the ignorance strategy
bring new errors, affecting 44% of the constraint violations. This shows that
constraints in practice are usually interrelated and that the ignorance strategy
potentially causes new errors in many cases.
We see two main threats to external validity. First, we have evaluated our ap-
proach on one variability language. However, Berger et al. [BSL+ 10] study and
compare three variability languages—CDL, Kconfig and feature modeling—and
find that CDL has the most complex constructs for declaring constraints, and
constraints in CDL models are significantly more complex than those in Kcon-
fig models. Thus, our result is probably generalizable to the two other
languages.
The second threat is that our evaluation is a simulation rather than an
actual configuration process. We address this threat by using the models of
six architectures and configurations gathered from five projects. The configu-
rations and changes have a wide range of characteristics as shown in Tables 8.1
and 8.2. However, it may still be that these changes are not representative of
the problems that real users encounter. We hope to address this threat by
running a user study in industry settings in the future.
A threat to internal validity concerns our translation of CDL into logic constraints.
8.9 Related Work
The idea of automatic fix generation is not new. Nentwich et al. [NEF03] pro-
pose an approach that generates abstract fixes from first-order logic rules. Their
fixes are abstract because they only specify the variables to change and trust
the user to choose a correct value. In contrast, our approach also gives the range
of values for a variable. Furthermore, their approach only supports “=” and
“≠” as predicates and, thereby, cannot handle models like eCos. Scheffczyk et
al. [SRBS04] enhance Nentwich et al.'s approach by generating concrete fixes.
However, this approach requires manually writing fix generation procedures
for each predicate used in each constraint, which is not suitable for variability
models, which often contain hundreds of constraints. Egyed et al. [ELF08]
propose writing such procedures for each type of variable rather than for each
constraint, to reduce the amount of code written, and apply this idea to UML
fix generation. Yet, in variability models, the number of variables is often
larger than the number of constraints, so the actual reduction in code is
unclear. Jose et
al. [JM11] generate fixes for programming bugs. They first identify the poten-
tially flawed statements using MAXSAT analysis, and then propose fixes based
on heuristic rules. However, their heuristic rules are specific to programming
languages and are not easily applicable to software configuration. Also, they
propose at most one fix each time rather than a complete list.
Fix generation approaches for variability models also exist. The eCos
configurator [VD01] has an internal fix generator, producing fixes for a
selected error or on the fly when the user changes the configuration. White et
al. [WSB+08] propose an approach that resolves all errors in one step. However,
both approaches can only produce one fix rather than a complete list.
Furthermore, they have very limited support for non-Boolean constraints. White
et al.'s approach does not handle non-Boolean constraints at all, while the
eCos configurator supports them only in the simple form v ⊕ c,
where v is a variable, c is a constant, and ⊕ is an equality or inequality
operator.
Another set of approaches maintains the consistency of a configuration. Valid
domains computation [HSJ+04, Men09] propagates decisions automatically.
Initially, all options are set to an unknown state. When the user assigns a
value to an option, it is recorded as a decision, and all other options whose
values are determined by this decision are set automatically.
The fix generation algorithm lays the foundations for both individual and
collaborative conflict resolution. Until now, we implicitly assumed a single-user
mode in which conflicts are fixed as soon as they appear. This is in fact the most
frequent scenario among Linux and eCos users [HXC12]. However, in
collaborative environments, such as multi-view FBC, conflict resolution involves
several users. With several users come different viewpoints to reconcile, priorities
to respect, security policies to enforce (e.g., visibility and access rights), and
workflows to follow.
The collaborative context does not influence the execution of the fix
generation algorithm, assuming the FM is shared by all users. To reason
about the decisions of n users, the current configuration is determined by
the union of their decisions, i.e., e = e1 ∪ . . . ∪ en. Unlike in the single-user
case, e can contain multiple assignments for the same variable. For instance,
the union of the two configurations e1: {m = true, a = 12, b = 12} and
e2: {m = true, a = 5, b = 6} contains different assignments for a and b, which
produce different fixes.
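The union of decisions and the detection of conflicting assignments can be sketched as follows (a hypothetical helper, not part of the toolset), replaying the e1/e2 example above:

```python
def merge_decisions(*configs):
    """Union of per-user decision sets; a variable assigned different
    values by different users becomes a conflict."""
    merged, conflicts = {}, {}
    for config in configs:
        for var, value in config.items():
            if var in merged and merged[var] != value:
                # record every value proposed for the contested variable
                conflicts.setdefault(var, {merged[var]}).add(value)
            else:
                merged.setdefault(var, value)
    return merged, conflicts

e1 = {"m": True, "a": 12, "b": 12}
e2 = {"m": True, "a": 5, "b": 6}
merged, conflicts = merge_decisions(e1, e2)
print(sorted(conflicts))  # ['a', 'b'] -- m is agreed on, a and b are not
```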
To tackle this problem, the inconsistent assignments can be corrected either
prior to generating fixes or during fix generation. Solving inconsistencies prior
to fix generation basically means presenting the users with a list of conflicting
assignments and asking them to agree on a common value. This approach
does not require any reasoning from the tool. The flip side is that changing
the values can introduce several new conflicts. Conversely, when conflicting
assignments are solved during fix generation, the generated fixes are
guaranteed not to introduce new conflicts. This, however, comes at the
cost of potentially large and numerous fixes, because all the combinations of
conflicting assignments have to be explored. The first challenge is to determine
which approach is most appropriate.
How and when a conflict is fixed is also an open question. Different merge
operations (e.g., three-way merging and octopus merging) have been
successfully implemented in SCM tools (e.g., Subversion [Apa11] and
Git [Git11]) and models (e.g., checkout/checkin, composition, long transaction,
and change set [Fei91]). Collaborative software development platforms like
IBM Jazz [IBM11] target distributed teams, and aim at better productivity
and transparent result integration. Such tools and models establish
communication channels between users and regulate their interactions. They
are the breeding ground for collaborative fix generation. The second challenge
is to reconcile the organisational constraints they impose with the input
and output of the fix generation algorithm.
Chapter 9
A Toolset for Advanced Feature-based Configuration
9.1 Overview
Chapters 5 to 8 laid the foundations for multi-view FMs, FCWs, and conflict
detection and resolution. We now present a toolset for FCW1 that has been
implemented by extending and integrating two third-party tools: SPLOT [Men10]2
and YAWL3 .
FM management with SPLOT. SPLOT is an open-source web-based
system for editing, sharing, and configuring FMs. The public version of SPLOT
available online now gathers 100+ FMs that are all freely accessible. SPLOT
is developed in Java, and uses Freemarker4 and Dojo5 to handle the web
front-end. To provide efficient interactive configuration, SPLOT relies on a SAT
solver (SAT4J6 ) and a BDD solver (JavaBDD7 ). Their reasoning abilities
enable decision propagation and critical debugging operations [Men09].
SPLOT was chosen because it provides a simple, yet robust, web-based
visual editor to create, edit, and configure FMs. Its servlet-based architecture
makes it easy to extend. Third-party software can interact with SPLOT
through web services. The existing repository of FMs is also an excellent
testbed for our extensions. We extended it to support view creation,
configuration, and view-to-workflow mapping.
1 http://www.splot-research.org/extensions/fundp/fundp.html
2 http://www.splot-research.org/
3 http://www.yawlfoundation.org/
4 http://freemarker.sourceforge.net/
5 http://www.dojotoolkit.org/
6 http://www.sat4j.org/
7 http://javabdd.sourceforge.net/
Figure 9.1 shows the essential components of our integrated toolset as well as
a typical usage scenario. All the elements under YAWL and SPLOT are web
service extensions. The newly developed FCW Engine minimises the coupling
between YAWL and SPLOT. For clarity, we discuss the design-time and runtime
phases of the scenario separately. Then, we elaborate on user management
and explain how concurrent configuration is achieved. Since design-time
activities are independent of the configuration process, they can be performed
during either domain or application engineering [PBvdL05]. In other words, SPL
engineers are free to define a generic FCW during domain engineering, or they
can tailor an FCW for every application. For more flexibility, configurable
workflows [GvdAJVLR07] could also be used during domain engineering, and
then configured during application engineering.
[Figure 9.1 diagram: at design time, (1) the workflow is designed, (2) the
workflow is registered with the FCW Engine and the views are designed in
SPLOT, and (3) the views are mapped to the workflow. At runtime, (4) the
workflow is started, (5) task execution is managed in YAWL, and, in a loop
while the workflow is active, views are selected, configured, and their status
retrieved, with the FCW Engine coordinating configuration between YAWL and
SPLOT via the workflow, view, session, and process repositories.]

Figure 9.1 – Overview of the essential components and typical use case scenario.
The mapping is complete when (1) all the views are mapped to exactly one task
and one condition, and (2) the coverage of the mapped views is complete.
The mapping of a view to a task and a condition is only a technical step. The
actual assignment of a user to the configuration of a view is done in YAWL. The
toolset uses YAWL's user management functionality [Ada10]. YAWL supports
the creation of users and the customisation of several parameters such as roles,
capabilities, positions, groups, and privileges. In the Spacebel case, for example,
five users are created: Spacebel, the reseller, the system engineer, the TMTC
integrator, and the network integrator.
At runtime, when a task is ready to be executed, the administrator of the
running FCW instance assigns the task either directly to a user or to a
role, meaning that all users with that role can execute it. When users
log in, they can only access and start the active tasks that have been assigned to
them. The user ID is part of the data that is communicated to the FCW Engine
and then relayed to SPLOT. This way, both the FCW Engine and SPLOT can
keep track of who made what decision and when.
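As an illustration, the user-or-role distribution rule can be approximated by a toy work-distribution model (the class and names are hypothetical; YAWL's actual resource service is far richer):

```python
class TaskBoard:
    """Toy model of YAWL-style work distribution: a task is offered either
    to a single user or to a role; users only see active tasks they may start."""

    def __init__(self, roles):
        self.roles = roles       # user -> set of role names
        self.assignments = {}    # task -> ("user", name) or ("role", name)

    def assign(self, task, *, user=None, role=None):
        self.assignments[task] = ("user", user) if user else ("role", role)

    def worklist(self, user):
        """Tasks assigned directly to the user, or to one of the user's roles."""
        return sorted(t for t, (kind, name) in self.assignments.items()
                      if (kind == "user" and name == user)
                      or (kind == "role" and name in self.roles.get(user, set())))

board = TaskBoard({"alice": {"system engineer"}, "bob": {"TMTC integrator"}})
board.assign("Configure TMTC view", role="TMTC integrator")
board.assign("Review architecture", user="alice")
print(board.worklist("bob"))    # ['Configure TMTC view']
print(board.worklist("alice"))  # ['Review architecture']
```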
9.4 Multi-view Feature Modelling

This section details the three core extensions for multi-view FBC provided
by the toolset.
The first extension enables view creation with XPath expressions. Figure 9.2
shows the view creation menu of SPLOT. The upper part shows the
FM of PloneMeeting. In the middle part, views can be created or edited. Here,
the User view is selected and the XPath expression that defines it is displayed.
The bottom part contains additional information identifying the creator of the
view. The Evaluate XPath Expression button checks that the XPath
expression is well-formed and shows the results of its evaluation. The Evaluate
Views Coverage button checks the completeness of the views and returns the
features that are not covered, if any. The last two buttons save, respectively
delete, the current view in the shared repository.
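The idea of XPath-defined views and the coverage check can be sketched as follows. The XML encoding of the FM, the expressions, and the helper names below are hypothetical; SPLOT's actual serialisation and API differ:

```python
import xml.etree.ElementTree as ET

# Hypothetical XML serialisation of a small excerpt of the PloneMeeting FM.
FM = ET.fromstring("""
<feature name="PloneMeeting">
  <feature name="Data">
    <feature name="Title"/><feature name="Attendees"/>
  </feature>
  <feature name="Votes">
    <feature name="EnableVoting"/>
  </feature>
</feature>""")

def view(xpath):
    """A view is the set of feature names selected by an XPath expression."""
    return {f.get("name") for f in FM.findall(xpath)}

def uncovered(views):
    """Coverage check: features of the FM that belong to no view."""
    all_features = {f.get("name") for f in FM.iter("feature")}
    return all_features - set().union(*views)

user_view = view(".//feature[@name='Data']/feature")   # children of Data
votes_view = view(".//feature[@name='Votes']/feature") # children of Votes
print(sorted(uncovered([user_view, votes_view])))
# ['Data', 'PloneMeeting', 'Votes'] -- the intermediate features are not covered
```

Note that `xml.etree.ElementTree` only supports a limited XPath subset; SPLOT would rely on a full XPath engine.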
The actual configuration of a view is provided by the second extension. The
extension lets the user select (1) the view to configure, and (2) the visualisation.
In Figure 9.3, the view of the User is selected and the pruned visualisation
is activated. Note the greyed Data feature: it can neither be selected nor
deselected. The stakeholder can switch freely from one visualisation to another
as she configures her view without losing the decisions that were already made.
This way, we dynamically combine the advantages of the three visualisations
and leave the stakeholder complete freedom to choose the one(s) that
best fit(s) her preferences.
Figure 9.2 – SPLOT view creation menu illustrated on the User view.

The table on the right monitors the status of the current configuration.
Basically, it tells what features have been selected or deselected, and which
decisions were propagated. It also provides general information about the
operations performed by the SAT solver and the status of the configuration. The
latter is a good indicator of the work that remains after the configuration of
the view. As we have seen, the solver reasons about the full FM and not only
about individual views. This is important in practice. Recall that for the
collapsed visualisation, Section 5.5.2 concludes that the cardinalities produce
an under-constrained FM. Cardinalities are part of the constraints taken into
account by the solver. Thereby, the decision to select or deselect a feature in
the view is propagated in the complete model—keeping the global configura-
tion consistent. In the counter-example in Figure 5.6 for instance, the selection
of d in the view will automatically entail the selection of c.

Figure 9.3 – Configuration view of the User with the pruned visualisation in
SPLOT.
The third extension provides basic support for multi-user concurrent
configuration. For the time being, it only enables synchronous configuration. To
prevent conflicting decisions, a configuration session manager is used (see
Section 9.6). Its role is (1) to maintain mutual exclusion on the configuration
engine so that only one user can commit a decision at a time, and (2) to notify
all users of a decision and of the results of its propagation.
Design time verifications (Section 7.7) are performed after the mapping of views
onto the workflows (step 3 in Figure 9.1). The mapping is then used to generate
the transformed workflow. Each individual workflow then has to be loaded
manually in YAWL for analysis. If the workflow requires corrections, it has to be
loaded in SPLOT again and the mappings redefined.
Strong and weak dependency sets (Section 7.6) are also computed after
the mapping. They are stored in SPLOT with the view descriptions and the
mapping. The dependency sets are loaded by SPLOT when the configuration
menu of a given view is activated. The status of open features (Section 7.7) can
be obtained at any time during the configuration with the Check Open Feature
button. Open features are also checked automatically when the user exits the
configuration menu. Those that are not propositionally defined by the weak
and strong dependency sets are reported to the user.
Originally, a final check on open features was supposed to be performed by
YAWL's workflow management environment when a configuration task is
marked as completed. However, restrictions in YAWL's source code did not
allow us to override or wrap the task completion method.
Two levels of concurrency can be achieved with the toolset. An FCW instance
defines the first level of concurrency and is basically a complete configuration
project. Each FCW instance is given an ID by the Registration service of the
FCW Engine, which allows it to retrieve and load the proper configuration from
the repository. Within an FCW instance, several configuration sessions can run
in parallel. That is the second level of concurrency. A configuration session is
the period during which a user executes a configuration task. Each part of the
toolset contributes to handling concurrent configuration.
YAWL has a built-in mechanism to run multiple instances of the same
workflow, where a particular instance is called a case. Users can switch between
running cases and proceed with their tasks in each individual case. YAWL thus
natively provides support for both levels of concurrency.
All the FCW instances and configuration sessions are monitored by the
FCW Engine. To manage concurrent sessions within the same FCW instance,
the FCW Engine coordinates users with active configuration tasks and forwards
the list of active users to SPLOT. SPLOT allocates distinct configuration spaces
for each FCW instance. To keep the configuration space consistent when mul-
tiple configuration sessions are simultaneously active, we have implemented a
basic conflict avoidance mechanism. In essence, an FCW Instance Manager is
created for every FCW instance. It plays two major roles. First, it controls
the access to the reasoning engine that preserves the consistency of the config-
uration space. Only the manager can submit decisions and receive propagation
results from the reasoning engine. Active users submit their decisions to the
manager. The manager maintains a lock on the reasoning engine so that only
one decision is processed at a time. This prevents conflicting decisions within
the same configuration space. Second, when a decision has been processed by
the reasoning engine, the manager notifies all the active users of the decision
and the results of the propagation. Only then does the manager release the
lock. That simple mechanism enables conflict-free synchronous configuration
of a multi-view FM.
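The lock-and-notify protocol of the FCW Instance Manager can be sketched as follows (the class name, engine stub, and notification log are hypothetical simplifications, not the actual implementation):

```python
import threading

class FCWInstanceManager:
    """Toy version of the conflict-avoidance scheme: one lock per FCW
    instance serialises decisions; every active user is then notified of
    the decision and of the propagation results before the lock is released."""

    def __init__(self, reasoning_engine):
        self.engine = reasoning_engine  # e.g. a SAT-based propagator
        self.lock = threading.Lock()
        self.active_users = set()
        self.log = []                   # (user notified, decision, propagation)

    def submit(self, user, decision):
        with self.lock:                 # only one decision processed at a time
            propagated = self.engine(decision)
            for u in self.active_users: # notify before releasing the lock
                self.log.append((u, decision, propagated))
        return propagated

# Stub engine: any decision also forces feature a to true.
manager = FCWInstanceManager(lambda d: {**d, "a": True})
manager.active_users = {"alice", "bob"}
result = manager.submit("alice", {"b": True})
print(result)  # {'b': True, 'a': True}
```

Holding the lock across both propagation and notification is what guarantees that every user observes decisions in the same order.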
The range fix algorithm still requires an extra layer on top of it to manage
user interactions before it can be integrated into the toolset. At the technical
level, the SMT solver used in our prototype (Z3) is a Microsoft product that
does not blend easily into our open-source environment. We are currently
investigating the replacement of Z3 by STP8 . Also, the expressiveness of SPLOT
should be extended with non-Boolean variables to benefit from the full
reasoning power of our algorithm.
9.7 Chapter Summary

This chapter brought together the contributions of Chapters 5–7 into a
process-aware and multi-user FBC toolset. Our ambition was to demonstrate the
applicability and scalability of FBC tools with a prototype based on SPLOT that
supports: (1) view editing, rendering, and configuration; (2) view-to-workflow
mapping; and (3) configuration persistence and recovery. Workflow execution is
supported through YAWL, which was extended to monitor task and condition
execution and completion. An FCW Engine was created to control the
interactions between SPLOT and YAWL, and to manage the configuration
sessions. We also implemented the design time and runtime analyses that
ensure the proper completion of the configuration process.
Inconsistencies during the concurrent configuration of the FM are prevented
by a lock on the configuration space. The underlying assumption here is that
views are configured synchronously. We have seen in Chapter 8 that this as-
sumption rarely holds in practice. We are now studying the application of
range fixes to concurrent configuration.
8 See http://sites.google.com/site/stpfastprover/
Chapter 10
Conclusion

If you have built castles in the air, your work need not be lost; that
is where they should be. Now put the foundations under them.
RQ2 How to model and schedule the configuration process? Starting from the
original work on multi-level staged configuration, we extended the
semantics of feature models with configuration paths. Configuration paths
account for the dynamic nature of the configuration process. Multi-level
staged configuration, however, has theoretical and practical limitations.
RQ3 How to ensure the satisfiability, safety and completion of the
configuration process? A formal semantics provides the compulsory basis for tool
development and automation. However, careless design choices can yield
unsatisfiable models, and inappropriate configuration decisions can prevent
the correct completion of the configuration process. Safeguards are
thus necessary on top of the semantics. To that end, we defined a
satisfiability property that guarantees that at least one valid product can be
built from the FCW, and verification mechanisms that ensure that any
decision left open during the configuration can be made later. We then
capitalised on the YAWL workflow engine and advances in SAT solvers
to design efficient verification algorithms. The performance of our
algorithms and the impact of workflow constructs were finally evaluated on 52
different feature-based configuration workflows.
RQ4 How to handle conflicting user decisions? Conflicts can occur at two
levels: when a user alters previous decisions, and when multiple users
concurrently configure the same feature model. We have detailed an
algorithm that detects conflicts and generates fixes for user decisions.
This algorithm relies on the well-known theory of diagnosis used in AI. We
have discussed how this algorithm can be extended to support concurrent
user configuration and reported preliminary results.
10.3 Perspective
10.3.1 Expressiveness
Feature attributes are partially supported. In Section 2.3, we pointed
out that feature attributes are not included in the semantics of feature
models used in FCWs. They were re-introduced in Chapter 8 to provide
comprehensive support for fix generation.
The addition of attributes to multi-view FMs raises the question of whether
attributes should be considered independently of the feature to which they
belong. If an attribute and its feature are an atomic unit, visualisation
would not be affected. If they are distinct entities, then both the view
specification and the visualisations would have to be adapted. Empirical
evaluations are needed to determine which alternative is best in practice
and, if the latter is chosen, how an attribute should be presented when its
feature is not in a view. In either case, Definition 5.3 will have to be adapted
to take constraints over attributes into consideration. The addition of
attributes would not alter the definition of feature-based configuration
workflows, as they consider views as the base unit. Verification would not
be affected either, modulo the adaptation of Definition 5.3.
Feature cloning is not supported. Feature cloning (through feature references
and ⟨i..j⟩ feature cardinalities with j > 1) was also excluded from
the syntax and semantics of feature models. Feature cloning would
require more drastic changes to the semantics, since the same feature might
appear several times in a configuration. Besides a complete adaptation
of the semantic domain, it would also require a significant revision of the
constraint system. As discussed in a preliminary extension of the
semantics [MCHB11], future work includes the elaboration of an extended
constraint language as well as more advanced reasoning algorithms, due
to the possibly infinite size of the domains to explore when performing
formal analyses.
The integration with CVL is pending. Ultimately, we will also have to
compose with CVL (Common Variability Language1 ), the OMG attempt
to standardise a variability modelling language. At the time of writing,
the specification is still a draft and no official release has been issued.
Yet, it is obvious that the involvement of the leading tool vendors and
1 http://www.omgwiki.org/variability/
Appendix A
Greyed and Pruned Visualisations: Examples of Transformations
Greyed visualisation for view v1:
N = { V, Ė, E, Ȧ, A, Y, O, B, Ḋ, D, DY, DO, DB }
DE = { (V, Ė), (Ė, E), (V, Ȧ), (Ȧ, A), (A, Y), (A, O), (A, B), (V, Ḋ), (Ḋ, D), (D, DY), (D, DO), (D, DB) }
λg_v1(V) = ⟨3..3⟩, λg_v1(Ė) = ⟨0..1⟩, λg_v1(E) = ⟨0..0⟩, λg_v1(Ȧ) = ⟨0..1⟩,
λg_v1(A) = ⟨1..3⟩, λg_v1(Y) = ⟨0..0⟩, λg_v1(O) = ⟨0..0⟩, λg_v1(B) = ⟨0..0⟩,
λg_v1(Ḋ) = ⟨0..1⟩, λg_v1(D) = ⟨1..1⟩, λg_v1(DY) = ⟨0..0⟩, λg_v1(DO) = ⟨0..0⟩,
λg_v1(DB) = ⟨0..0⟩

Pruned visualisation for view v1:
N = { V, Ė, E, Ȧ, A, Y, O, Ḋ, D, DY }
DE = { (V, Ė), (Ė, E), (V, Ȧ), (Ȧ, A), (A, Y), (A, O), (V, Ḋ), (Ḋ, D), (D, DY) }
λp_v1(V) = ⟨3..3⟩, λp_v1(Ė) = ⟨0..1⟩, λp_v1(E) = ⟨0..0⟩, λp_v1(Ȧ) = ⟨0..1⟩,
λp_v1(A) = ⟨0..2⟩, λp_v1(Y) = ⟨0..0⟩, λp_v1(O) = ⟨0..0⟩, λp_v1(Ḋ) = ⟨0..1⟩,
λp_v1(D) = ⟨0..1⟩, λp_v1(DY) = ⟨0..0⟩
Collapsed visualisation for view v1 (N^c_v1, DE^c_v1, λ^c_v1):
ms_min^p_v1(B) = {} ⊎ {}    mincard^c_v1(B) = min0 Σ{} = 0
ms_max^p_v1(B) = {} ⊎ {}    maxcard^c_v1(B) = max0 Σ{} = 0
ms_min^p_v1(DO) = {} ⊎ {}   mincard^c_v1(DO) = min0 Σ{} = 0
ms_max^p_v1(DO) = {} ⊎ {}   maxcard^c_v1(DO) = max0 Σ{} = 0
ms_min^p_v1(DB) = {} ⊎ {}   mincard^c_v1(DB) = min0 Σ{} = 0
ms_max^p_v1(DB) = {} ⊎ {}   maxcard^c_v1(DB) = max0 Σ{} = 0
[Figure: excerpt of the PloneMeeting FM showing, among others, the Data (D)
and Votes (V) features: Title, Assembly members, Institution ID, Meeting
attributes (Start time, End time, Attendees, Place), Use groups as categories,
Meeting workflow (Standard workflow, Collège workflow, Archive workflow),
Manager (Meeting manager, Owner), Enable voting, Vote encoder (Meeting
manager, Voter), Available vote values (Yes, No, Abstention), and Default vote
value (Yes, No, Abstention).]
λp_User(MC) = ⟨max(0, 7 − 5)..min(7, 7 − 5)⟩ = ⟨2..2⟩
λp_User(D) = ⟨max(0, 3 − 2)..min(3, 3 − 2)⟩ = ⟨1..1⟩
Appendix B
Detailed Example of [[·]]MLSC and Proof Helper
This section presents the calculation details of the two configuration paths, πi
and πj , used in the example of Section 6.3. In the remainder of this section,
we write d for the FM of Figure 6.5 and we note σ →Lj σ′ if
π = ..σσ′.. and σ′ ∈ Σj .
This configuration path, shown in Figure 6.3 and in the middle part of
Figure 6.6, corresponds to the one used in the original illustration by
Czarnecki et al. [CHE05], which was summarised in Section 6.1. One can see
in Figure 6.3 that it consists of two manual configuration stages and one
automatic specialisation stage. Since the inter-level links already implement the
latter stage, there is no need to represent it as a dedicated transition between
two stages. The two former stages correspond to manual configurations
that cannot be derived automatically from the constraint set and thus
have to be represented by two separate configuration sets, viz. σ2i and σ3i .
σ1 = { {G1 , T1 , E1 , S1 , G2 , T2 , M2 , O2 },
{G1 , T1 , E1 , S1 , G2 , T2 , M2 , I2 },
{G1 , T1 , E1 , S1 , G2 , E2 , D2 },
{G1 , T1 , E1 , S1 , G2 , S2 },
{G1 , T1 , E1 , S1 , G2 , S2 , B2 },
{G1 , T1 , E1 , G2 , T2 , M2 , O2 },
{G1 , T1 , E1 , G2 , T2 , M2 , I2 },
{G1 , T1 , E1 , G2 , E2 , D2 },
{G1 , T1 , S1 , G2 , T2 , M2 , O2 },
{G1 , T1 , S1 , G2 , T2 , M2 , I2 },
{G1 , T1 , S1 , G2 , S2 },
{G1 , T1 , S1 , G2 , S2 , B2 },
{G1 , E1 , S1 , G2 , E2 , D2 },
{G1 , E1 , S1 , G2 , S2 },
{G1 , E1 , S1 , G2 , S2 , B2 },
{G1 , T1 , G2 , T2 , M2 , O2 },
{G1 , T1 , G2 , T2 , M2 , I2 },
{G1 , E1 , G2 , E2 , D2 },
{G1 , S1 , G2 , S2 },
{G1 , S1 , G2 , S2 , B2 } }
σ2i = { {G1 , E1 , S1 , G2 , E2 , D2 },
{G1 , E1 , S1 , G2 , S2 },
{G1 , E1 , S1 , G2 , S2 , B2 } }
σ3i = { {G1 , E1 , S1 , G2 , E2 , D2 } }
Note here that |σ3i | = 1, which denotes the end of the configuration path
(Definition 6.3), resulting in the single product {G1 , E1 , S1 , G2 , E2 , D2 }.
In order to show that πi ∈ [[d]]MLSC , we have to find a level arrangement
that satisfies rules (6.5.1) and (6.5.2). Let us examine each possible level ar-
rangement in turn.
and (6.5.1) evaluates to true. We have for σ1 →L1 σ2i :
1. σ1 \ σ2i = { {G1 , T1 , E1 , S1 , G2 , T2 , M2 , O2 },
{G1 , T1 , E1 , S1 , G2 , T2 , M2 , I2 },
{G1 , T1 , E1 , S1 , G2 , E2 , D2 },
{G1 , T1 , E1 , S1 , G2 , S2 },
{G1 , T1 , E1 , S1 , G2 , S2 , B2 },
{G1 , T1 , E1 , G2 , T2 , M2 , O2 },
{G1 , T1 , E1 , G2 , T2 , M2 , I2 },
{G1 , T1 , E1 , G2 , E2 , D2 },
{G1 , T1 , S1 , G2 , T2 , M2 , O2 },
{G1 , T1 , S1 , G2 , T2 , M2 , I2 },
{G1 , T1 , S1 , G2 , S2 },
{G1 , T1 , S1 , G2 , S2 , B2 },
{G1 , T1 , G2 , T2 , M2 , O2 },
{G1 , T1 , G2 , T2 , M2 , I2 },
{G1 , E1 , G2 , E2 , D2 },
{G1 , S1 , G2 , S2 },
{G1 , S1 , G2 , S2 , B2 } }
2. (σ1 \ σ2i ) |L1 = { {G1 , T1 , E1 , S1 },
{G1 , T1 , E1 },
{G1 , T1 , S1 },
{G1 , T1 },
{G1 , E1 },
{G1 , S1 } }
and
1. σ1 |L1 = { {G1 , T1 , E1 , S1 },
{G1 , T1 , E1 },
{G1 , T1 , S1 },
{G1 , E1 , S1 },
{G1 , T1 },
{G1 , E1 },
{G1 , S1 } }
2. σ2i |L1 = { {G1 , E1 , S1 } }
3. σ1 |L1 \ σ2i |L1 = { {G1 , T1 , E1 , S1 },
{G1 , T1 , E1 },
{G1 , T1 , S1 },
{G1 , T1 },
{G1 , E1 },
{G1 , S1 } }
such that (σ1 \ σ2i ) |L1 ⊆ (σ1 |L1 \ σ2i |L1 ) is true, and we have for σ2i →L1 σ3i :
(σ2i \ σ3i ) |L1 = { {G1 , E1 , S1 } }
σ2i |L1 \ σ3i |L1 = ∅
The arrangement is thus rejected because (6.5.2) evaluates to false since
(σ2i \ σ3i ) |L1 ⊆ (σ2i |L1 \ σ3i |L1 ) does not hold. This rejection is obvious
since level L1 was already fully configured at σ2i , hence the absence of
deletable configurations from σ2i in L1 .
(c) For Σ1i = σ2i and Σ2i = σ3i , we have
|final(Σ1i ) |L1 | = |σ2i |L1 | = 1
190 B Detailed Example of [[.]]MLSC and Proof Helper
and we have for σ2i →L2 σ3i :
(σ2i \ σ3i ) |L2 = { {G2 , S2 },
{G2 , S2 , B2 } }
σ2i |L2 \ σ3i |L2 = { {G2 , S2 },
{G2 , S2 , B2 } }
i.e. that (σ2i \ σ3i ) |L2 ⊆ (σ2i |L2 \ σ3i |L2 ). Since both (6.5.1) and (6.5.2)
evaluate to true, the arrangement is accepted.
One level arrangement, viz. Σ1i = σ2i and Σ2i = σ3i , satisfies rules (6.5.1)
and (6.5.2) and thus πi ∈ [[d]]MLSC .
Again, in order for πj ∈ [[d]]MLSC to hold, there must be at least one level
arrangement that satisfies rules (6.5.1) and (6.5.2). Let us examine each
possible arrangement in turn.
and (6.5.1) evaluates to true. We have for σ1 →L1 σ2j :
(σ1 \ σ2j ) |L1 = { {G1 , T1 , E1 , S1 },
{G1 , T1 , E1 },
{G1 , T1 , S1 },
{G1 , E1 , S1 },
{G1 , T1 },
{G1 , E1 },
{G1 , S1 } }
σ1 |L1 \ σ2j |L1 = { {G1 , T1 , E1 , S1 },
{G1 , T1 , E1 },
{G1 , T1 , S1 },
{G1 , T1 },
{G1 , E1 },
{G1 , S1 } }
and (6.5.1) evaluates to true. Like for σ2j , σ3j ∈ Σ1j and Σ2j = ∅, we
have for σ1 →L1 σ2j that (σ1 \ σ2j ) |L1 ⊃ (σ1 |L1 \ σ2j |L1 ). The arrangement
is thus rejected because (6.5.2) evaluates to false.
Since none of the possible arrangements satisfies both (6.5.1) and (6.5.2),
the configuration path πj does not belong to [[d]]MLSC .
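These projection checks can be replayed mechanically. A small sketch using the sets of the πi example, where restriction σ|L is the element-wise intersection with the features of level L:

```python
def restrict(sigma, level):
    """σ|L : restrict every configuration in σ to the features of level L."""
    return {frozenset(c & level) for c in sigma}

L1 = frozenset({"G1", "T1", "E1", "S1"})
L2 = frozenset({"G2", "T2", "M2", "O2", "I2", "E2", "D2", "S2", "B2"})

s2i = {frozenset({"G1", "E1", "S1", "G2", "E2", "D2"}),
       frozenset({"G1", "E1", "S1", "G2", "S2"}),
       frozenset({"G1", "E1", "S1", "G2", "S2", "B2"})}
s3i = {frozenset({"G1", "E1", "S1", "G2", "E2", "D2"})}

# Rule (6.5.2) fails on level L1: σ2i was already fully configured there ...
lhs1 = restrict(s2i - s3i, L1)
rhs1 = restrict(s2i, L1) - restrict(s3i, L1)
print(lhs1 <= rhs1)  # False

# ... but holds on level L2, so the arrangement Σ1i = σ2i, Σ2i = σ3i is accepted.
lhs2 = restrict(s2i - s3i, L2)
rhs2 = restrict(s2i, L2) - restrict(s3i, L2)
print(lhs2 <= rhs2)  # True
```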
The following theorem is used for the proof of Theorem B.1, Section 6.4. Ba-
sically, it says that when two sets of sets are reduced to include only sets
containing certain elements and then subtracted, the result is included in (i.e.
smaller than) the set obtained by subtracting first and reducing afterwards.
Intuitively, the result of a subtraction operation is smaller the more elements
in both sets “match up”, and if a reduction is applied prior to subtracting, it
becomes “more likely” for elements to match up, meaning that the result of the
subtraction can be smaller than if the reduction was applied afterwards.
Proof.
{c ∩ L | c ∈ σ} \ {c ∩ L | c ∈ σ′}
= { a | (a ∈ {c ∩ L | c ∈ σ}) ∧ (a ∉ {c ∩ L | c ∈ σ′}) }
= { a | a ∈ {p | p = {x | ∃c ∈ PN • x ∈ c ∧ x ∈ L ∧ c ∈ σ}}
        ∧ a ∉ {p | p = {x | ∃c ∈ PN • x ∈ c ∧ x ∈ L ∧ c ∈ σ′}} }
= { p | p = {x | (∃c ∈ PN • x ∈ c ∧ x ∈ L ∧ c ∈ σ)
        ∧ ¬(∃c ∈ PN • x ∈ c ∧ x ∈ L ∧ c ∈ σ′)} }
= { p | p = {x | (∃c ∈ PN • x ∈ c ∧ x ∈ L ∧ c ∈ σ)
        ∧ (∀c ∈ PN • ¬(x ∈ c ∧ x ∈ L ∧ c ∈ σ′))} }
⊆ { p | p = {x | ∃c ∈ PN • (x ∈ c ∧ x ∈ L ∧ c ∈ σ)
        ∧ ¬(x ∈ c ∧ x ∈ L ∧ c ∈ σ′)} }
= { p | p = {x | ∃c ∈ PN • x ∈ c ∧ x ∈ L ∧ c ∈ σ ∧ ¬(c ∈ σ′)} }
= { p | p = {x | x ∈ c ∧ x ∈ L ∧ c ∈ σ ∧ c ∉ σ′} }
= { c ∩ L | c ∈ σ ∧ c ∉ σ′ }
= { c ∩ L | c ∈ (σ \ σ′) }
Appendix C
Safety Analysis: Workflow Transformation Algorithms
Index

M
MLSC, 95
  Example, 97
  Multi-level staged configuration, 95
  Semantic function ([[d]]MLSC), 101
Multi-constraint violation, 152
Multi-level staged configuration, see MLSC

O
Open feature, 124
Order penetration point, see CODP
Orthogonal variability model, see OVM
OVM, 16
  Orthogonal variability model, 16

P
PloneGov, 75
PloneMeeting, 75
Process, 43
  Multistage configuration, 43
Product family, 13
Product portfolio, 14
Product variant master, see PVM
Propositional definability (pdefines(M, f)), 79
pure::variants, 27
PVM, 32
  Example, 34
  Product variant master, 32

R
Range fix, 143, 146
  Completeness of fix lists, 148
  Correctness, 147
  Maximality of ranges, 148
  Minimality of variables, 147
Range unit, 146
Reuse, 14

S
SAT, 23
SCM, 29, 38
  Software Configuration Management, 29
SD, 124
  Strong dependency set, 124
Separation of concerns, see SoC
SMT, 25
SoC, 47
  Separation of concerns, 47
Software Configuration Management, see SCM
Software Product Line, see SPL
Software Product Line Engineering, see SPLE
Spacebel, 112
SPL, 14
  Software Product Line, 14
SPLE, 14
  Software Product Line Engineering, 14
SPLOT, 165
Staged configuration, 95
Strong dependency set, see SD
Subversion, 38

T
Toolset, 165
Tree prefix, 87
TVL, 20

U
Unsatisfiable core, 149

V
Variability in space, 15
Variability in time, 15
Variability model, 14
Variation point, 14
View, 73
  Abstract syntax (LMVFM), 77
  Integration, 73
  Necessary coverage condition, 80
  Projection, 73
  Sufficient coverage condition, 79
Visualisation, 80, 82
  Collapsed, 82, 83
  Greyed, 81
  Pruned, 81, 82

W
WD, 124
  Weak dependency set, 124
Weak dependency set, see WD
Workflow, 114
  (weak) soundness, 121
  No dead transition, 121
  Option to complete, 121
  Proper completion, 121
Workspace, 40
  Virtual, 42

Y
YAWL, 113, 165
  Abstract syntax (LFCW), 115
  Example, 114
  Semantic function ([[m]]FCW), 116
Bibliography
[Ada10] Michael Adams. YAWL – User Manual, Version 2.1. The YAWL
Foundation, 2010.

[PCCW93] M.C. Paulk, B. Curtis, M.B. Chrissis, and C.V. Weber. Capability
maturity model, version 1.1. IEEE Software, 10(4):18–27, July 1993.

[vdAtH05] W. van der Aalst and A. ter Hofstede. YAWL: Yet another
workflow language. Information Systems, 30(4):245–275, 2005.