Using Program Slicing
in Software Maintenance
K. B. Gallagher
Computer Science Department
Loyola College in Maryland
4501 N. Charles St.
Baltimore, Maryland 21210
J. R. Lyle
Computer Science Department
University of Maryland, Baltimore Campus
5401 Wilkens Avenue
Baltimore, Maryland 21228
August, 1991
Abstract
Program slicing, introduced by Weiser, is known to help programmers in understanding foreign code
and in debugging. We apply program slicing to the maintenance problem by extending the notion of a
program slice (that originally required both a variable and a line number) to a decomposition slice, one that
captures all computation on a given variable; i.e., is independent of line numbers. Using the lattice of
single-variable decomposition slices, ordered by set inclusion, we demonstrate how to form a slice-based
decomposition for programs. We are then able to delineate the effects of a proposed change by isolating
those effects in a single component of the decomposition. This gives maintainers a straightforward
technique for determining those statements and variables that may be modified in a component and
those that may not. Using the decomposition, we provide a set of principles to prohibit changes that
will interfere with unmodified components. These semantically consistent changes can then be merged
back into the original program in linear time. Moreover, the maintainer can test the changes in the
component with the assurance that there are no linkages into other components. Thus, decomposition
slicing induces a new software maintenance process model that eliminates the need for regression testing.
Index terms: Software Maintenance, Program Slicing, Decomposition Slicing, Software Process Models, Software Testing, Software Tools, Impact Analysis
1 Introduction
In "Kill that Code!" [32], Gerald Weinberg alludes to his private list of the world's most expensive program
errors. The top three disasters were caused by a change to exactly one line of code: "each one involved
the change of a single digit in a previously correct program." The argument goes that since the change
was to only one line, the usual mechanisms for change control could be circumvented. And, of course, the
results were catastrophic. Weinberg offers a partial explanation: "unexpected linkages," i.e., the value of the
modified variable was used in some other place in the program. The top three of this list of ignominy are
attributed to linkage. More recently, in a special section of the March, 1987 issue of IEEE Transactions on
Software Engineering, Schneidewind [30] notes that one of the reasons that maintenance is difficult is that it
is hard to determine when a code change will affect some other piece of code. We present herein a method
for maintainers to use that addresses this issue.
While some may view software maintenance as a less intellectually demanding activity than development,
the central premise of this work is that software maintenance is more demanding. The added difficulty is due
in large part to the semantic constraints that are placed on the maintainer. These constraints can be loosely
characterized as the attempt to avoid unexpected linkages. Some [4, 14] have addressed this problem by
attempting to eliminate these semantic constraints and then providing the maintainer with a tool that will
pinpoint potential inconsistencies after changes have been implemented. This makes maintenance appear to
be more like development, since the programmer does not need to worry about linkages: once the change is
made, the tool is invoked and the inconsistencies (if any) are located. One would expect that the tool would
proceed to resolve these inconsistencies, but it has been shown that this problem is NP-hard [14]. Thus, the
maintainer can be presented with a problem that is more difficult to resolve than the original change.
We take the opposite view: present the maintainer with a semantically constrained problem and let him
construct the solution that implements the change within these constraints. The semantic context with
which we propose to constrain the maintainer is one that will prohibit linkages into the portions of the code
that the maintainer does not want to change. This approach uncovers potential problems earlier than the
aforementioned methods, and, we believe, is worth any inconvenience that may be encountered due to the
imposition of the constraints.
Our program-slicing-based techniques give an assessment of the impact of proposed modifications, ease the
problems associated with revalidation, and reduce the resources required for maintenance activities. They
work on unstructured programs, so they are usable on older systems. They may be used for white-box,
spare-parts and backbone maintenance without regard to whether the maintenance is corrective, adaptive,
perfective or preventive.
2 Background
Program slicing, introduced by Weiser [33, 36], is a technique for restricting the behavior of a program
to some specified subset of interest. A slice S(v, n) (of program P) on variable v, or set of variables, at
statement n yields the portions of the program that contributed to the value of v just before statement n
is executed. S(v, n) is called a slicing criterion. Slices can be computed automatically on source programs
by analyzing data flow and control flow. A program slice has the added advantage of being an executable
program. Slicing is done implicitly by programmers while debugging [33, 35]; slices can be combined to
isolate sections of code likely to contain program faults and significantly reduce debugging times [23, 24, 25].
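As a concrete (if much simplified) illustration, a backward slice over straight-line code can be computed by chasing def-use chains from the criterion. The following Python sketch uses our own statement encoding, handles no control flow, and follows the convention of displaying the criterion statement itself in the slice; a real slicer must also follow control dependences.

```python
# Minimal backward static slice for straight-line code.  Each statement
# is (defined_variable, used_variables); statements are numbered from 1.
# backward_slice(stmts, v, n) collects the statements contributing to the
# value of v just before statement n executes (plus statement n itself,
# when it mentions v).

def backward_slice(stmts, v, n):
    in_slice, wanted = set(), {v}
    d, u = stmts[n - 1]
    if d == v or v in u:
        in_slice.add(n)                  # display the criterion statement
    for i in range(n - 1, 0, -1):        # walk backward from the criterion
        d, u = stmts[i - 1]
        if d in wanted:                  # this definition reaches the slice
            in_slice.add(i)
            wanted.discard(d)
            wanted |= set(u)
    return in_slice

#  1: input a    2: input b    3: t = a + b
#  4: print t    5: t = a - b  6: print t
program = [("a", []), ("b", []), ("t", ["a", "b"]),
           (None, ["t"]), ("t", ["a", "b"]), (None, ["t"])]

assert backward_slice(program, "t", 4) == {1, 2, 3, 4}
assert backward_slice(program, "t", 6) == {1, 2, 5, 6}
```

Note how the two slices on the same variable differ with the statement number; this is the observation that motivates decomposition slices below.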
There has been a flurry of recent activity where slicing plays a significant role. Horwitz, Reps, et al.
[15, 16, 28] use slices in integrating programs. Their results are built on the seminal work of Ottenstein and
Ottenstein [7, 27] combining slicing with the robust representation afforded by program dependence graphs.
Korel and Laski [20, 21, 22] use slices combined with execution traces for program debugging and testing.
Choi et al. [6] use slices and traces in debugging parallel programs. Reps and Yang [29] have investigated
termination conditions for program slices. Hausler [13] has developed a denotational approach to program
slicing. Gallagher [8] has improved Lyle's [23] algorithm for slicing in the presence of goto's and developed
techniques for capturing arbitrarily placed output statements. We will not discuss slicing techniques in this
paper and instead refer the interested reader to these works.
Since we want to avoid getting bogged down in the details of a particular language, we will identify a
program with its flowgraph. Each node in the graph will correspond to a single source-language statement.
 1   #define YES 1
 2   #define NO 0
 3   main()
 4   {
 5   int c, nl, nw, nc, inword ;
 6   inword = NO ;
 7   nl = 0;
 8   nw = 0;
 9   nc = 0;
10   c = getchar();
11   while ( c != EOF ) {
12       nc = nc + 1;
13       if ( c == '\n')
14           nl = nl + 1;
15       if ( c == ' ' || c == '\n' || c == '\t')
16           inword = NO;
17       else if (inword == NO) {
18           inword = YES ;
19           nw = nw + 1;
20       }
21       c = getchar();
22   }
23   printf("%d \n",nl);
24   printf("%d \n",nw);
25   printf("%d \n",nc);
26   }

Figure 1: Program to be Sliced
Henceforth, the term statement will mean a node in the flowgraph. Using a common representation scheme
makes the presentation clear, although any tool based on these techniques will need to account for the
nuances of the particular language. In this paper, we also ignore problems introduced by having dead
code in the source program, and declare that the programs under consideration will not have any dead code.
See [8] for slicing-based techniques to eliminate dead code.
Figures 2-6 illustrate slicing on the program of figure 1, a bare-bones version of the Unix utility wc, word
count, taken from [19]. The program counts the number of characters, words, and lines in a text file. It has
been slightly modified to illustrate more clearly the slicing principles. The slices of figures 2-4 are complete
programs that compute a restriction of the specification. The slice on nw (fig. 2) will output the number of
words in a file; the slice on nc (fig. 3) will count the number of characters in the input text file; the slice on
nl (fig. 4) will count the number of lines in the file.
3 Using Slices for Decomposition
This section presents a method for using slices to obtain a decomposition of the program. Our objective
is to use slicing to decompose a program "directly" into two (or more) components. A program slice
will be one of the components. The construction is a two-step process. The first step is to build, for one
variable, a decomposition slice, which is the union of certain slices taken at certain line numbers on the given
variable. Then the other component of the decomposition, called the "complement," will also be obtained
from the original program. The complement is constructed in such a way that when certain statements of
the decomposition slice are removed from the original program, the program that remains is the slice that
 1   #define YES 1
 2   #define NO 0
 3   main()
 4   {
 5   int c, nw, inword ;
 6   inword = NO ;
 8   nw = 0;
10   c = getchar();
11   while ( c != EOF ) {
15       if ( c == ' ' || c == '\n' || c == '\t')
16           inword = NO;
17       else if (inword == NO) {
18           inword = YES ;
19           nw = nw + 1;
20       }
21       c = getchar();
22   }
24   printf("%d \n",nw);
26   }

Figure 2: Slice on (nw,26): Word Counter
 3   main()
 4   {
 5   int c, nc ;
 9   nc = 0;
10   c = getchar();
11   while ( c != EOF ) {
12       nc = nc + 1;
21       c = getchar();
22   }
25   printf("%d \n",nc);
26   }

Figure 3: Slice on (nc,26): Character Counter
 3   main()
 4   {
 5   int c, nl ;
 7   nl = 0;
10   c = getchar();
11   while ( c != EOF ) {
13       if ( c == '\n')
14           nl = nl + 1;
21       c = getchar();
22   }
23   printf("%d \n",nl);
26   }

Figure 4: Slice on (nl,26): Line Counter
 1   #define YES 1
 2   #define NO 0
 3   main()
 4   {
 5   int c, inword ;
 6   inword = NO ;
10   c = getchar();
11   while ( c != EOF ) {
15       if ( c == ' ' || c == '\n' || c == '\t')
16           inword = NO;
17       else if (inword == NO) {
18           inword = YES ;
20       }
21       c = getchar();
22   }
26   }

Figure 5: Slice on (inword,26)
 3   main()
 4   {
 5   int c ;
10   c = getchar();
11   while ( c != EOF ) {
21       c = getchar();
22   }
26   }

Figure 6: Slice on (c,26)
1   input a
2   input b
3   t = a + b
4   print t
5   t = a - b
6   print t

Figure 7: Requires a Decomposition Slice
corresponds to the complement (in a sense to be defined) of the given criteria, with respect to the variables
defined in the program. Thus the complement is also a program slice.
The decomposition slice is used to guide the removal of statements in a systematic fashion to construct
the complement. It is insufficient to merely remove the slice statements from the original program. Since
we require that a slice be executable, there will be certain crucial statements that are necessary in both the
slice and its complement. For example, if we start with the slice of figure 2 and remove all its statements
from the original program, the resulting object will not even compile!
We use this decomposition to break the program into manageable pieces and automatically assist the
maintainer in guaranteeing that there are no ripple effects induced by modifications in a component. We use
the complement to provide a semantic context for modifications in the decomposition slice; the complement
must remain fixed after any change.
The decomposition ideas presented in this section are independent of a particular slicing method. Once a
slice is obtained, by any slicing algorithm, a program decomposition may be computed. Clearly, the quality
of the decomposition will be affected by the quality of the slice, in the sense that more refined slices give a
finer granularity and also deliver more semantic information to the maintainer.
A program slice is dependent on a variable and a statement number. A decomposition slice does not
depend on statement numbers. The motivation for this concept is easily explained using the example of
figure 7. The slice S(t, 4) is statements 1, 2, 3, 4, while the slice S(t, 6) is statements 1, 2, 5, 6. Slicing at
the last statement (in this case 6) of a program is insufficient to get all computations involving the slice
variable, t. A decomposition slice captures all relevant computations involving a given variable.
To construct a decomposition slice, we borrow the concept of critical instructions from an algorithm for
dead code elimination as presented in Kennedy [18]. A brief reprise follows. The usual method for dead
code elimination is to first locate all instructions that are useful in some sense. These are declared to be the
critical instructions. Typically, dead code elimination algorithms start by marking output instructions as
critical. Then the use-definition [18] chains are traced to mark the instructions that impact the output
statements. Any code that is left unmarked is useless to the given computation.
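The critical-instruction marking can be sketched as a fixpoint computation. The Python below uses our own statement encoding and is deliberately coarse: it keeps every definition of a needed variable, ignoring kill information and control dependence, which a real algorithm based on reaching definitions would track.

```python
# Mark-and-sweep detection of useless code: output statements are the
# initial critical instructions; any statement defining a variable that a
# marked statement uses is marked in turn, by tracing use-definition
# chains.  Statements left unmarked are useless to the computation.

def critical_statements(stmts, outputs):
    # stmts: {number: (defined_variable, [used_variables])}
    marked = set(outputs)
    while True:
        needed = set()
        for n in marked:
            needed |= set(stmts[n][1])          # variables the marked code uses
        newly = {n for n, (d, _) in stmts.items()
                 if d in needed and n not in marked}
        if not newly:
            return marked
        marked |= newly

#  1: x = input()   2: y = x + 1   3: z = x * 2   4: print(y)
program = {1: ("x", []), 2: ("y", ["x"]), 3: ("z", ["x"]), 4: (None, ["y"])}

assert critical_statements(program, {4}) == {1, 2, 4}   # statement 3 is dead
```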
Definition 1 Let Output(P, v) be the set of statements in program P that output variable v, let last be the
last statement of P, and let N = Output(P, v) ∪ {last}. The statements in ∪_{n∈N} S(v, n) form the decomposition
slice on v, denoted S(v).
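Definition 1 can be exercised on the same kind of straight-line encoding used earlier. This Python sketch (our own toy program and names) unions the slices taken at each output statement and at the last statement:

```python
# Definition 1, sketched for straight-line code.  The decomposition slice
# S(v) is the union of the slices S(v, n) over every statement n that
# outputs v, plus the last statement of the program.

def backward_slice(stmts, v, n):
    in_slice, wanted = set(), {v}
    d, u = stmts[n - 1]
    if d == v or v in u:
        in_slice.add(n)              # convention: show the criterion statement
    for i in range(n - 1, 0, -1):
        d, u = stmts[i - 1]
        if d in wanted:
            in_slice.add(i)
            wanted.discard(d)
            wanted |= set(u)
    return in_slice

def decomposition_slice(stmts, v, output_stmts):
    last = len(stmts)
    result = set()
    for n in set(output_stmts) | {last}:
        result |= backward_slice(stmts, v, n)
    return result

#  1: input a   2: input b    3: v = a
#  4: print v   5: v = v + b  6: print v
program = [("a", []), ("b", []), ("v", ["a"]),
           (None, ["v"]), ("v", ["v", "b"]), (None, ["v"])]

# Slicing only at (v, 4) misses statements 2 and 5; the decomposition
# slice captures every computation on v.
assert backward_slice(program, "v", 4) == {1, 3, 4}
assert decomposition_slice(program, "v", [4, 6]) == {1, 2, 3, 4, 5, 6}
```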
The decomposition slice is the union of a collection of slices, which is still a program slice [36]. We include
statement last so that a variable that is not output may still be used as a decomposition criterion; this will
also capture any defining computation on the decomposition variable after the last statement that displays
its value. To successfully take a slice at statement last, we invoke one of the crucial differences between the
slicing definitions of Reps and those of Weiser, Lyle and this work. A Reps slice must be taken at a point
p, with respect to a variable that is defined or referenced at p. Weiser's slices can be taken on an arbitrary
variable at an arbitrary line number. This difference prohibits Reps' slicing techniques from being applicable
in the current context, since we want to slice on every variable in the program at the last statement.
We now begin to examine the relationship between decomposition slices. Once we have this in place, we
can use the decomposition slices to perform the actual decompositions. To determine the relationships, we
take the decomposition slice for each variable in the program and form a lattice of these decomposition slices,
ordered by set inclusion. It is easier to gain a clear understanding of the relationship between decomposition
slices if we regard them without output statements. This may seem unusual in light of the above definition,
since we used output statements in obtaining relevant computations. We view output statements as windows
into the current state of computation, which do not contribute to the realization of the state. This coincides
with the informal definition of a slice: the statements that yield the portions of the program that contributed
to the value of v just before statement n is executed. Assuming that output statements do not contribute to
the value of a variable precludes from our discussion output statements (and therefore programs) in which
the output values are reused, as is the case with random access files or output to files that are later reopened
for input. Moreover, we are describing a decomposition technique that is not dependent on any particular
slicing technique; we have no way of knowing whether or not the slicing technique includes output statements.
We say a slice is output-restricted if all its output statements are removed.
Definition 2 Output-restricted decomposition slices S(v) and S(w) are independent if S(v) ∩ S(w) = ∅.
It would be a peculiar program that had independent decomposition slices; they would share neither
control flow nor data flow. In effect, there would be two programs with non-intersecting computations on
disjoint domains that were merged together. The lattice would have two components. In Ott's slice metric
terminology [26], independence corresponds to low (coincidental or temporal) cohesion.
Output-restricted decomposition slices that are not independent are said to be (weakly) dependent.
Subsequently, when we speak of independence and dependence of slices it will always be in the context of
output-restricted decomposition slices.
Definition 3 Let S(v) and S(w) be output-restricted decomposition slices, w ≠ v, and let S(v) ⊆ S(w). S(v)
is said to be strongly dependent on S(w).
Thus output-restricted decomposition slices strongly dependent on independent slices are independent.
The definitions of independence and dependence presented herein are themselves dependent on the notion
of a slice. Analogous definitions are used by Bergeretti and Carré [3] to define slices. In Ott's metric
terminology [26], strong dependence corresponds to high (sequential or functional) cohesion.
Strong dependence of decomposition slices is a binary relation; in most cases, however, we will not
need an explicit reference to the containing slice. Henceforth, we will write "S(v) is strongly dependent" as
a shorthand for "S(v) is strongly dependent on some other slice S(w)" when the context permits it.
Definition 4 An output-restricted slice S(v) that is not strongly dependent on any other slice is said to be
maximal.
Maximal decomposition slices are at the "ends" of the lattice. This definition gives the motivation for
output restriction; we do not want to be concerned with the possible effects of output statements on the
maximality of slices or decomposition slices. This can be observed by considering the decomposition slices
on nw and inword, of figures 2 and 5. If we regarded output statements in defining maximal, we could force
the slice on inword to be maximal by adding a print statement referencing inword along with the
others at the end of the program. Such a statement would not be collected into the slice on nw. Since this
added statement is not in any other slice, the slice on inword would be maximal, and it should not be.
Figure 8 gives the lattice we desire. S(nc), S(nl) and S(nw) are the maximal decomposition slices.
S(inword) is strongly dependent on S(nw); S(c) is strongly dependent on all the other decomposition
slices. The decomposition slices S(nw), S(nc), and S(nl) of figures 2-4 are weakly dependent and
maximal, when the output statements are removed. There are no independent decomposition slices in the
example. Recall that independent decomposition slices cannot share any control flow: the surrounding
control statements would make them dependent.
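The definitions so far can be checked mechanically against the example. Below, the slice contents are transcribed from figures 2-6 as statement-number sets (Python used purely for illustration; the function names are ours):

```python
# Decomposition slices of the word-count program, as statement-number
# sets taken from figures 2-6, with the output statements (23, 24, 25)
# removed as the output-restriction discussion requires.

OUTPUTS = {23, 24, 25}
S = {
    "nw":     {1, 2, 3, 4, 5, 6, 8, 10, 11, 15, 16, 17, 18, 19, 20, 21, 22, 24, 26},
    "nc":     {3, 4, 5, 9, 10, 11, 12, 21, 22, 25, 26},
    "nl":     {3, 4, 5, 7, 10, 11, 13, 14, 21, 22, 23, 26},
    "inword": {1, 2, 3, 4, 5, 6, 10, 11, 15, 16, 17, 18, 20, 21, 22, 26},
    "c":      {3, 4, 5, 10, 11, 21, 22, 26},
}
R = {v: s - OUTPUTS for v, s in S.items()}   # output-restricted slices

def strongly_dependent(v, w):                # Definition 3: S(v) ⊆ S(w), w ≠ v
    return v != w and R[v] <= R[w]

def maximal(v):                              # Definition 4
    return not any(strongly_dependent(v, w) for w in R)

# The lattice of figure 8:
assert {v for v in R if maximal(v)} == {"nc", "nl", "nw"}
assert strongly_dependent("inword", "nw")
assert all(strongly_dependent("c", w) for w in R if w != "c")
# No pair of slices is independent (Definition 2): all share control flow.
assert all(R[v] & R[w] for v in R for w in R)
```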
We now begin to classify the individual statements in decomposition slices.
Definition 5 Let S(v) and S(w) be output-restricted decomposition slices of program P. Statements in
S(v) ∩ S(w) are called slice dependent statements.
S(nc)   S(nl)   S(nw)
   \      |       |
    \     |   S(inword)
     \    |      /
      \   |     /
        S(c)

Figure 8: Lattice of Decomposition Slices.
Slice independent statements are statements that are not slice dependent. We will refer to slice dependent
statements and slice independent statements simply as dependent statements and independent statements.
Dependent statements are those contained in decomposition slices that are interior points of the lattice;
independent statements are those in a maximal decomposition slice that are not in the union of the
decomposition slices that are properly contained in the maximal slice. The terms arise from the fact that
two or more slices depend on the computation performed by dependent statements. Independent statements
do not contribute to the computation of any other slice. When modifying a program, dependent statements
cannot be changed, or the effect will ripple out of the focus of interest.
For example, statement 12 of the slice on nc (fig. 3) is a slice independent statement with respect to
any other decomposition slice. Statements 13 and 14 of the slice on nl (fig. 4) are also slice independent
statements with respect to any other decomposition slice. The decomposition slice on c (fig. 6) is strongly
dependent on all the other slices; thus all its statements are slice dependent statements with respect to any
other decomposition slice. Statements 6 and 15-20 of the slice on nw (fig. 2) are slice independent statements
with respect to decomposition slices S(nc), S(nl), and S(c); only statement 19 is slice independent when
compared with S(inword). Statements 6, 15-18 and 20 of the decomposition slice on inword (fig. 5) are
slice independent statements with respect to decomposition slices S(nc), S(nl), and S(c); no statements are
slice independent when compared with S(nw).
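These classifications can be verified from the statement sets of the figures. In this Python sketch (our own transcription), the independent statements of a maximal slice are those appearing in no other maximal slice:

```python
# Slice-independent statements (Definition 5) for the word-count example.
# Statement sets are those of figures 2-4, with output statements 23-25
# removed; only the maximal slices are needed here.

R = {
    "nw": {1, 2, 3, 4, 5, 6, 8, 10, 11, 15, 16, 17, 18, 19, 20, 21, 22, 26},
    "nc": {3, 4, 5, 9, 10, 11, 12, 21, 22, 26},
    "nl": {3, 4, 5, 7, 10, 11, 13, 14, 21, 22, 26},
}

def independent(v):
    others = set().union(*(R[u] for u in R if u != v))
    return R[v] - others

assert 12 in independent("nc")               # nc = nc + 1
assert {13, 14} <= independent("nl")         # the line-count branch
assert {6, 15, 16, 17, 18, 19, 20} <= independent("nw")
# The full sets (they are exactly the statements deleted, together with
# each slice's output statement, when the complements are formed later):
assert independent("nc") == {9, 12}
assert independent("nl") == {7, 13, 14}
assert independent("nw") == {1, 2, 6, 8, 15, 16, 17, 18, 19, 20}
```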
We have a relationship between maximal slices and independent statements. This proposition permits
us to apply the terms "(slice) independent statement" and "(slice) dependent statement" in a sensible way
to a particular statement in a given maximal decomposition slice without reference to the binary relation
between decomposition slices that is required in definition 5.
Proposition 1 Let
1. Varset(P) be the set of variables in program P.
2. S(v) be an output-restricted decomposition slice of P.
3. M = { m ∈ Varset(P) | S(m) is maximal }.
4. U = M − { v }.
The statements in S(v) − ∪_{u∈U} S(u) are independent.
Proof Sketch:
Let U = {u1, . . ., um}. Then S(v) − ∪_{u∈U} S(u) = S(v) − S(u1) − . . . − S(um), which removes from
S(v) exactly those statements that appear in some other maximal slice.
End.
There is a relationship between the maximal slices and the program. (Recall that dead code has been
excluded from our discussions.)
Proposition 2 Let M = { m ∈ Varset(P) | S(m) is maximal }. Then ∪_{m∈M} S(m) = P.
Proof Sketch:
Since each S(m) ⊆ P, ∪_{m∈M} S(m) ⊆ P. If P ⊄ ∪_{m∈M} S(m), then the statements in P that are not in
∪_{m∈M} S(m) are dead code.
End.
Maximal slices capture the computation performed by the program. Maximal slices and their respective
independent statements are also related:
Proposition 3 An output-restricted decomposition slice is maximal iff it has at least one independent
statement.
Proof Sketch:
Suppose S(v) is maximal. By definition, S(v) has at least one statement that no other slice has. This
statement is an independent statement.
Now suppose that S(v) has an independent statement, s. Then s is not in any other slice, and the slice
that contains s is maximal.
End.
Conversely, a slice with no independent statements is strongly dependent.
We also have another characterization of strongly dependent slices.
Proposition 4 Let
1. Varset(P) be the set of variables in program P.
2. S(v) be an output-restricted decomposition slice of P.
3. D = { w ∈ Varset(P) | S(v) is strongly dependent on S(w) }.
4. M = { m ∈ Varset(P) | S(m) is maximal }.
5. U = M − { v }.
An output-restricted decomposition slice S(v) is strongly dependent (on some S(d)) iff ∪_{u∈U} S(u) = P.
Proof Sketch:
Suppose S(v) is strongly dependent. We need to show that D contains a variable whose slice is maximal.
Partially order the slices on the variables of D by set inclusion, and let d be a variable whose slice is one
of the maximal elements of this ordering. S(d) is a maximal slice: if it is not, then it is properly contained
in another slice S(d1), which is in D and contains S(v), contradicting the choice of d. Then d ∈ M, d ≠ v,
and S(v) makes no contribution to the union, so by proposition 2 the union without S(v) is still P.
Now suppose ∪_{u∈U} S(u) = P. Since U ⊆ M and the union already covers P without S(v), S(v) makes
no contribution to the union. By proposition 3, S(v) is strongly dependent.
End.
We are now in a position to state the decomposition principles. Given a maximal output-restricted
decomposition slice S(v) of program P, delete the independent statements and the output statements of S(v)
from P. We will denote this program P(v) and call it the complement of decomposition slice S(v) (with
respect to P). Henceforth, when we speak of complements, it will always be in the context of decomposition
slices. The decomposition slice is the subset of the program that computes a subset of the specification; the
complement computes the rest of the specification.
Figures 9-11 give the complements of the slices on nw, nc and nl of figures 2-4. Using proposition 4
we obtain that the complement of both the slice on inword and the slice on c is the entire program.
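The complement construction can likewise be checked at the set level (Python; statement sets transcribed from the figures, independent statements as computed under proposition 1):

```python
# The complement P(v) of a maximal decomposition slice: delete the
# independent statements of S(v), and its output statement, from the
# program.  P is statements 1-26 of figure 1.

P = set(range(1, 27))
OUTPUT = {"nw": {24}, "nc": {25}, "nl": {23}}
INDEP = {                      # independent statements of each maximal slice
    "nw": {1, 2, 6, 8, 15, 16, 17, 18, 19, 20},
    "nc": {9, 12},
    "nl": {7, 13, 14},
}

def complement(v):
    return P - INDEP[v] - OUTPUT[v]

# Figure 9: the complement of the slice on nw still computes nl and nc.
assert complement("nw") == {3, 4, 5, 7, 9, 10, 11, 12, 13, 14,
                            21, 22, 23, 25, 26}
# Figure 11: the complement of the slice on nl keeps everything except
# statements 7, 13, 14 and the printf of nl.
assert complement("nl") == P - {7, 13, 14, 23}
```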
 3   main()
 4   {
 5   int c, nl, nw, nc, inword ;
 7   nl = 0;
 9   nc = 0;
10   c = getchar();
11   while ( c != EOF ) {
12       nc = nc + 1;
13       if ( c == '\n')
14           nl = nl + 1;
21       c = getchar();
22   }
23   printf("%d \n",nl);
25   printf("%d \n",nc);
26   }

Figure 9: P(nw). Complement of slice on nw: computes line count and character count
This yields an approximation of a direct sum decomposition of a program that preserves the computational
integrity of the constituent parts. This also indicates that the only useful decompositions are done
with maximal decomposition slices. A complement, P(v), of a maximal slice can be further decomposed, so
the decomposition may be continued until all slices with independent statements (i.e., the maximal ones) are
obtained.
In practice, a maintainer may find a strongly dependent slice as a starting point for a proposed change.
Our method will permit such changes. Such a change may be viewed as properly extending the domain of
the partial function that the program computes, while preserving the partial function on its original domain.
4 Application to Modification and Testing
Statement independence can be used to build a set of guidelines for software modification. To do this, we
need to make one more set of definitions regarding variables that appear in independent and dependent
statements. With these definitions we give a set of rules that maintainers must obey in order to make
modifications without ripple effects and unexpected linkages. When these rules are obeyed, we have an
algorithm to merge the modified slice back into the complement and effect a change. The driving motivation
for the following development is: "What restrictions must be placed on modifications in a decomposition
slice so that the complement remains intact?"
Definition 6 A variable that is the target of a dependent assignment statement is called a dependent
variable. Alternatively, and equivalently, if all assignments to a variable are in independent statements,
then the variable is called an independent variable.
An assignment statement can be an independent statement while its target is not an independent variable.
In the program of figure 12, the two maximal decomposition slices are S(a) and S(e) (figures 13 and 14). Slice
S(b) (figure 15) is strongly dependent on S(a), and S(f) (figure 16) is strongly dependent on S(b) and S(a).
S(d) and S(c) (not shown) are strongly dependent on both maximal slices. In S(a), statements 8, 10, and 11
are independent, by the proposition. But variables a and b are targets of assignment statements 6 and 5,
respectively. So, in the decomposition slice S(a), only variable f is an independent variable.
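Definition 6 can be checked mechanically for this example (Python; assignment targets and slice sets transcribed from figures 12-14):

```python
# Dependent versus independent variables (Definition 6) for the program
# of figure 12.  ASSIGNS maps each assignment statement to its target;
# the slice sets are those of figures 13 and 14 (the maximal slices).

ASSIGNS = {4: "c", 5: "b", 6: "a", 7: "d", 8: "f", 9: "e", 10: "b", 11: "a"}
S = {
    "a": {1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12},
    "e": {1, 2, 3, 4, 5, 6, 7, 9, 12},
}
MAXIMAL = {"a", "e"}

def independent_stmts(v):
    others = set().union(*(S[u] for u in MAXIMAL if u != v))
    return S[v] - others

def independent_var(v, w):
    # v is an independent variable in S(w) iff every assignment to v lies
    # in an independent statement of S(w)
    assigns_to_v = {n for n, t in ASSIGNS.items() if t == v}
    return assigns_to_v <= independent_stmts(w)

assert independent_stmts("a") == {8, 10, 11}
assert independent_var("f", "a")        # f is assigned only at statement 8
assert not independent_var("a", "a")    # a is also assigned at statement 6
assert not independent_var("b", "a")    # b is also assigned at statement 5
```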
 1   #define YES 1
 2   #define NO 0
 3   main()
 4   {
 5   int c, nl, nw, nc, inword ;
 6   inword = NO ;
 7   nl = 0;
 8   nw = 0;
10   c = getchar();
11   while ( c != EOF ) {
13       if ( c == '\n')
14           nl = nl + 1;
15       if ( c == ' ' || c == '\n' || c == '\t')
16           inword = NO;
17       else if (inword == NO) {
18           inword = YES ;
19           nw = nw + 1;
20       }
21       c = getchar();
22   }
23   printf("%d \n",nl);
24   printf("%d \n",nw);
26   }

Figure 10: P(nc). Complement of slice on nc: computes word count and line count
 1   #define YES 1
 2   #define NO 0
 3   main()
 4   {
 5   int c, nl, nw, nc, inword ;
 6   inword = NO ;
 8   nw = 0;
 9   nc = 0;
10   c = getchar();
11   while ( c != EOF ) {
12       nc = nc + 1;
15       if ( c == ' ' || c == '\n' || c == '\t')
16           inword = NO;
17       else if (inword == NO) {
18           inword = YES ;
19           nw = nw + 1;
20       }
21       c = getchar();
22   }
24   printf("%d \n",nw);
25   printf("%d \n",nc);
26   }

Figure 11: P(nl). Complement of slice on nl: computes character count and word count
 1   main()
 2   {
 3       int a, b, c, d, e, f;
 4       c = 4;
 5       b = c;
 6       a = b + c;
 7       d = a + c;
 8       f = d + b;
 9       e = d + 8;
10       b = 30 + f;
11       a = b + c;
12   }

Figure 12: Dependent Variable Sample Program
 1   main()
 2   {
 3       int a, b, c, d, e, f;
 4       c = 4;
 5       b = c;
 6       a = b + c;
 7       d = a + c;
 8       f = d + b;
10       b = 30 + f;
11       a = b + c;
12   }

Figure 13: Slice on a
 1   main()
 2   {
 3       int a, b, c, d, e, f;
 4       c = 4;
 5       b = c;
 6       a = b + c;
 7       d = a + c;
 9       e = d + 8;
12   }

Figure 14: Slice on e
 1   main()
 2   {
 3       int a, b, c, d, e, f;
 4       c = 4;
 5       b = c;
 6       a = b + c;
 7       d = a + c;
 8       f = d + b;
10       b = 30 + f;
12   }

Figure 15: Slice on b
 1   main()
 2   {
 3       int a, b, c, d, e, f;
 4       c = 4;
 5       b = c;
 6       a = b + c;
 7       d = a + c;
 8       f = d + b;
12   }

Figure 16: Slice on f
A similar argument applies for independent control flow statements that reference dependent variables. A
dependent variable in an independent statement corresponds to the situation where the variable in question
is required for the compilation of the complement, but the statement in question does not contribute to the
complement. If a variable is referenced in a dependent statement, it is necessary to the complement and
cannot be independent.
If decomposing on a single variable yields a strongly dependent slice, we are able to construct a slice
where the original slice variable is an independent variable.
Proposition 5 Let
1. Varset(P) be the set of variables in program P.
2. S(v) be a strongly dependent output-restricted decomposition slice of P.
3. D = { w ∈ Varset(P) | S(v) is strongly dependent on S(w) }.
4. M = { m ∈ Varset(P) | S(m) is maximal }.
5. U = D ∩ M.
6. T = ∪_{u∈U} S(u).
The variable v is an independent variable in T.
In other words, when S(v) is a strongly dependent slice and T is the union of all the maximal slices upon
which S(v) is strongly dependent, then v is an independent variable in T.
Proof Sketch:
We show that the complement of T, P − T, has no references to v: if variable v is in the complement of T,
then there is a maximal slice in the complement upon which S(v) is strongly dependent. This contradicts
the hypotheses, so the complement of T has no references to v and the variable v is independent in T.
End.
This can be interpreted as the variable version of proposition 1, which refers to statements.
This has not addressed the problem that is presented when the decomposition slice on a variable is maximal,
but the variable itself remains dependent. This is the situation that occurred in the example at the beginning
of this section; the slice on variable a (figure 13) is maximal but the variable is dependent. The solution is
straightforward: we construct the slice that is the union of all slices in which the variable is dependent.
Proposition 6 Let
1. Varset(P) be the set of variables in program P.
2. S(v) be an output-restricted decomposition slice of P.
3. E = { w ∈ Varset(P) | v is a dependent variable in S(w) }.
4. T = ∪_{e∈E} S(e).
We have two cases:
1. E = ∅ (and thus T is empty also), in which case v is an independent variable.
2. E ≠ ∅, so T is not empty and the variable v is an independent variable in T.
Proof Sketch:
Case 1: E = ∅. S(v) contains all references to v. In particular, S(v) contains all assignments to v. So v is
an independent variable in S(v). End Case 1.
Case 2: E ≠ ∅. T contains all references to v. In particular, T contains all assignments to v. So v is an
independent variable in T. End Case 2.
End.
This proposition is about variables.
4.1 Modifying Decomposition Slices
We are now in a position to answer the question posed at the beginning of this section. We present the
restrictions as a collection of rules with justifications.
Modifications take three forms: additions, deletions and changes. A change may be viewed as a deletion
followed by an addition. We will use this second approach, and determine only those statements in a
decomposition slice that can be deleted and the forms of statements that can be added. Again, we must rely
on the fact that the union of decomposition slices is a slice, since the complementary criteria will usually
involve more than one maximal variable. We also assume that the maintainer has kept the modified program
compilable and has obtained the decomposition slice of the portion of the software that needs to be changed.
(Locating the code may be a highly nontrivial activity; for the sake of the current discussion, we assume its
completion.)
Since independent statements do not affect data flow or control flow in the complement, we have:
Rule 1 Independent statements may be deleted from a decomposition slice.
Reason:
Independent statements do not affect the computations of the complement. Deleting an independent
statement from a slice will have no impact on the complement.
End.
This result applies to control flow statements and assignment statements. The statement may be deleted
even if it is an assignment statement that targets a dependent variable, or a control statement that references
a dependent variable. The point to keep in mind is that if the statement is independent it does not affect
the complement. If an independent statement is deleted, there will certainly be an effect in the slice. But
the purpose of this methodology is to keep the complement intact.
There are a number of situations to consider when statements are to be added. We progress from simple
to complex. Also note that for additions, new variables may be introduced, as long as the variable name
does not clash with any name in the complement. In this instance the new variable is independent in the
decomposition slice. In the following, independent variable means an independent variable or a new variable.
Rule 2 Assignment statements that target independent variables may be added anywhere in a decomposition
slice.
Reason:
Independent variables are unknown to the complement. Thus changes to them cannot affect the computations of the complement.
End.
This type of change is permissible even if the changed value flows into a dependent variable. In figure 13,
changes are permitted to the assignment statement at line 8, which targets f. A change here would propagate
into the values of dependent variables a and b at lines 10 and 11. The maintainer would then be responsible
for the changes that would occur to these variables. If lines 10 and 11 were dependent (i.e., contained in
another decomposition slice), line 8 would also be contained in this slice, and variable f would be dependent.
Adding control flow statements requires a little more care. This is because control statements
have two parts: the logical expression, which determines the flow of control, and the actions taken for each
value of the expression. (We assume no side effects in the evaluation of logical expressions.) We discuss only
the addition of if-then-else and while statements, since all other language constructs can be realized by
them [5].
Rule 3 Logical expressions (and output statements) may be added anywhere in a decomposition slice.
Reason:
We can inspect the state of the computation anywhere. Evaluation of logical expressions (or the inclusion
of an output statement) will not even affect the computation of the slice. Thus the complement remains
intact.
End.
We must guarantee that the statements that are controlled by newly added control flow do not interfere
with the complement.
Rule 4 New control statements that surround (i.e., control) any dependent statement will cause the complement to change.
Reason:
Suppose newly added code controls a dependent statement.
Let C be the criteria that yield the complement. When using these criteria on the modified program,
the newly added control code will be included in the complementary slice. This is due to the fact that the
dependent statements are in both the slice and the complement. Thus any control statements that control
dependent statements will also be in both the slice and the complement.
End.
By making such a change, we have violated our principle that the complement remain fixed. Thus new
control statements may not surround any dependent statement.
This short list of rules is necessary and sufficient to keep the slice complement intact. It also has an impact
on testing the change, which will be discussed later.
Changes may be required to computations involving a dependent variable, v, in the extracted slice. The
maintainer can choose one of the following two approaches:
1. Use the techniques of the previous section to extend the slice so that v is independent in the slice.
2. Add a new local variable (local to the slice), copy the value of v into it, and manipulate only the new
name. Of course, the new name must not clash with any name in the complement. This technique
may also be used if the slice has no independent statements, i.e., it is strongly dependent.
4.2 Merging the Modifications into the Complement
Merging the modified slice back into the complement is straightforward. A key to understanding the merge
operation comes from the observation that, throughout the technique, the maintainer is editing the entire program.
The method gives a view of the program with the unneeded statements deleted and with the dependent
statements restricted from modification. The slice gives the maintainer a smaller piece of code to focus
on, while the rules of the previous subsection ensure that the deleted and restricted parts
cannot be changed accidentally.
We now present the merge algorithm.
1. Order the statements in the original program. (In the following examples, we have one statement per
line, so the ordering is merely the line numbering.) A program slice and its complement can now
be identified with subsequences of statement numbers from the original program. We call the sequence
numbering from the slice the slice sequence, and the numbering of the complement the complement
sequence. We now view the editing process as the addition and deletion of the associated sequence
numbers.
2. For deleted statements, delete the sequence number from the slice sequence. Observe that since only
independent statements are deleted, this number is not in the complement sequence.
3. For statements inserted into the slice, a new sequence number needs to be generated. Let P be the
sequence number of the statement preceding the statement to be inserted. Let M be the least value in
the slice sequence greater than P. Let F = min(int(P + 1), M). Insert the new statement at sequence
number (F + P)/2. (Although this works in principle, in practice more care needs to be taken in the
generation of the insertion sequence numbers to avoid floating point errors after 10 inserts.)
4. The merged program is obtained by merging the modified slice sequence values (i.e., statements) into
the complement sequence.
Thus, the unchanged dependent statements are used to guide the reconstruction of the modified program.
The placement of the changed statements within a given control flow is arbitrary. Again, this becomes clearer
when the editing process is viewed as modification of the entire program. The following example will help
clarify this.
4.3 Testing the Change
Since the maintainer must restrict all changes to independent or newly created variables, testing is reduced
to testing the modified slice. Thus the need for regression testing in the complement is eliminated. There
are two alternative approaches to verifying that only the change needs testing. The first is to slice on the
original criteria, plus any new variables and minus any eliminated variables, and compare its complement with
the complement of the original: they should match exactly. The second approach is to preserve the criteria
that produced the original complement. Slicing out on these criteria must produce the modified slice exactly.
An axiomatic consideration illuminates this idea. The slice and its complement each perform a subset of the
computation; where the computations meet are the dependencies. Modifying code in the independent part
of the slice leaves the independent part of the complement as an invariant of the slice (and vice versa).
If the required change is "merely" a module replacement, the preceding techniques are still applicable.
The slice will provide a harness for the replaced module: a complete, independent program supporting the
module is obtained. One of the principal benefits of slicing is highlighted in this context: any side effects of
the module to be replaced will also be in the slice. Thus the full impact of the change is brought to the attention
of the modifier.
As an example, we make some changes to S(nw), the slice on nw, the word counter of figure 2. The
changed slice is shown in figure 17. The original program determined a word to be any string of "nonwhite" symbols terminated by a "white" symbol (space, tab, or newline). The modification changes this
 3    main()
 4    {
 *        int ch;
 5        int c, nw ;
 *        ch = 0;
 8        nw = 0;
10        c = getchar();
11        while ( c != EOF ) {
 *            if (isspace(c) && isalpha(ch))
 *                nw = nw + 1;
 *            ch = c ;
21            c = getchar();
22        }
24        printf("%d \n",nw);
26    }

Figure 17: Modified slice on nw, the word counter
requirement to be alphabetic characters terminated by white space. (The example is illustrating a
change, not advocating it.) Note the changes: we have deleted the independent "variables" YES and NO,
added a new, totally independent variable ch, and revamped the independent statements. The addition
of the C macros isspace and isalpha is safe, since their results are only referenced. We test this program
independently of the complement.
Figure 18 shows the reconstructed, modified program. Taking the decomposition slice on nw generates
the program of figure 17. Its complement is already given in figure 9. The starred (*) statements indicate
where the new statements would be placed using the line number generation technique above.
5 A New Software Maintenance Process Model
The usual software maintenance process model is depicted in figure 19. A request for change arrives. It
may be adaptive, perfective, corrective, or preventive. In making the change, we wish to minimize defects,
effort, and cost, while maximizing customer satisfaction [12]. The software is changed, subject to pending
priorities. The change is composed of two parts. First, the maintainer must understand the code, which may require documentation,
code reading, and execution. Then the program is modified. The maintainer must first design the change
(which may be subject to peer review), then alter the code itself, while trying to minimize side effects. The
change is then validated. The altered code itself is verified to assure conformance with the specification. Then
the new code is integrated with the existing system to ensure conformance with the system specifications.
This task involves regression testing.
The new model is depicted in figure 20. The software is changed, subject to pending priorities. The
change is composed of two parts. Understanding the code will now require documentation, code reading,
execution, and the use of decomposition slices. The decomposition slices may be read and executed (a
decided advantage of having executable program slices). The code is then modified, subject to the strictures
outlined above. Using those guidelines, no side effects or unintended linkages can be induced in the code, even by
accident. This lifts a substantial burden from the maintainer.
The change is tested in the decomposition slice. Since the change cannot ripple out into other modules,
regression testing is unnecessary. The maintainer need only verify that the change is correct. After applying the
merge algorithm, the change (of the code) is complete.
 3    main()
 4    {
 *        int ch;
 5        int c, nl, nw, nc ;
 *        ch = 0;
 7        nl = 0;
 8        nw = 0;
 9        nc = 0;
10        c = getchar();
11        while ( c != EOF ) {
 *            if (isspace(c) && isalpha(ch))
 *                nw = nw + 1;
 *            ch = c ;
12            nc = nc + 1;
13            if ( c == '\n')
14                nl = nl + 1;
21            c = getchar();
22        }
23        printf("%d \n",nl);
24        printf("%d \n",nw);
25        printf("%d \n",nc);
26    }

Figure 18: Modified Program
6 Future Directions
The underlying method and the tool based on it [9] need to be empirically evaluated. This evaluation is underway
using the Goal-Question-Metric paradigm of Basili, et al. [2]. Naturally, we are also addressing questions of
scale, to determine whether existing software systems decompose sufficiently via these techniques to effect
a technology transfer. We are also evaluating decomposition slices as candidates for components in a reuse
library.
Although they seem to do well in practice, the slicing algorithms have relatively bad worst case running
times, O(n·e·log(e)), where n is the number of variables and e is the number of edges in the flowgraph. To
obtain all the slices, this running time becomes O(n²·e·log(e)). These worst case times would seem to make
an interactive slicer for large (i.e., real) programs impractical. This difficulty can be assuaged by making the
data flow analysis one component of the deliverable products that are handed off from the development team
to the maintenance team. An interactive tool could then be built using these products. Then, as changes are
made by the maintainers, the data flow information can be updated using the incremental techniques of Keables
[17].
Interprocedural slices can be attacked using the techniques of Weiser [36] and Barth [1]. The interprocedural slicing algorithms of Horwitz, et al. [16] cannot be used, since they require that the slice be taken at a
point where the slice variable is defined or referenced; we require that all slices be taken at the last statement of
the program. For separate compilation, worst case assumptions must be made about the external variables
if the source is not available. If the source is available, one proceeds as with procedures.
Berzins [4] has attacked the problem of software merges for extensions of programs. To quote him:
An extension extends the domain of the partial function without altering any of the initially
defined values, while a modification redefines values that were defined initially.
We have addressed the modification problem by first restricting the domain of the partial function to
the slice complement, modifying the function on the values defined by the independent variables in the slice,
[Figure 19 is a flow diagram: Request for Change → Change Software (pending priorities) → Design Change (documentation, code reading, test runs) → Alter Code (minimize side effects) → Test Change → Revalidate (regression testing) → Integrate; annotated with change types (corrective, preventive) and goals (minimize cost, maximize satisfaction).]

Figure 19: A Software Maintenance Process Model
[Figure 20 is a flow diagram: Request for Change → Change Software (pending priorities) → Design Change (test runs, decomposition slicing) → Alter Component (no side effects) → Test Change (no regression testing) → Merge; annotated with change types (adaptive, perfective, corrective, preventive) and goals (minimize defects, effort, and cost; maximize satisfaction).]

Figure 20: A New Software Maintenance Process Model
then merging these two disjoint domains.
Horwitz, et al. [15] have also addressed the modification problem. They start with a base program and two
modifications of it, A and B:
Whenever the changes made to base to create A and B do not "interfere" (in a sense defined
in the paper), the algorithm produces a program M that integrates A and B. The algorithm is
predicated on the assumption that differences in the behavior of the variant programs from that
of base, rather than the differences in text, are significant and must be preserved in M.
Horwitz, et al. do not restrict the changes that can be made to base; thus their algorithm produces an
approximation to the undecidable problem of determining whether or not the behaviors interfere. We have
side-stepped this unsolvable problem by constraining the modifications that are made. Our technique is
more akin to the limits placed on software maintainers. Changes must be done in a context: independence
and dependence provide the context. It is interesting to note, however, that their work uses program slicing
to determine potential interferences in the merge.
They do note that program variants, as they name them, are easily embedded in a change control system
such as RCS [31]. Moreover, the direct sum nature of the components can be exploited to build related
families of software. That is, components can be "summed" as long as their dependent code sections match
exactly and there is no intersection of the independent domains. We also follow this approach for component
construction.
Weiser [34] discusses some slice-based metrics. Overlap is a measure of how many statements in a slice
are found only in that slice, measured as a mean ratio of non-unique to unique statements in each slice.
Parallelism is the number of slices that have few statements in common, computed as the number of slices
that have pairwise overlap below a certain threshold. Tightness is the number of statements in every slice,
expressed as a ratio over program length. Programs with high overlap and parallelism, but with low tightness,
would decompose nicely: the lattice would not get too deep or too tangled.
We have shown how a data flow technique, program slicing, can be used to form a decomposition for
software systems. The decomposition yields a method for maintainers to use. The maintainer is able to
modify existing code cleanly, in the sense that the changes can be assured to be completely contained in the
modules under consideration and that no unseen linkage with the modified code infects other modules.
References
[1] J. M. Barth. A practical interprocedural data flow analysis algorithm. Communications of the Association for Computing Machinery, 21(9):724–726, September 1978.
[2] V. Basili, R. Selby, and D. Hutchens. Experimentation in software engineering. IEEE Transactions on Software Engineering, 12(7):352–357, July 1984.
[3] J-F. Bergeretti and B. Carre. Information-flow and data-flow analysis of while-programs. ACM Transactions on Programming Languages and Systems, 7(1):37–61, January 1985.
[4] V. Berzins. On merging software extensions. Acta Informatica, 23:607–619, 1985.
[5] C. Bohm and G. Jacopini. Flow diagrams and languages with only two formation rules. Communications of the Association for Computing Machinery, 9(5):366–371, May 1966.
[6] J-D. Choi, B. Miller, and P. Netzer. Techniques for debugging parallel programs with flowback analysis. Technical Report 786, University of Wisconsin - Madison, August 1988.
[7] J. Ferrante, K. Ottenstein, and J. Warren. The program dependence graph and its use in optimization. ACM Transactions on Programming Languages and Systems, 9(3):319–349, July 1987.
[8] K. B. Gallagher. Using Program Slicing in Software Maintenance. PhD thesis, University of Maryland, Baltimore, Maryland, December 1989.
[9] K. B. Gallagher. Surgeon's assistant limits side effects. IEEE Software, May 1990.
[10] K. B. Gallagher and J. R. Lyle. Using program decomposition to guide modifications. In Conference on Software Maintenance – 1988, pages 265–268, October 1988.
[11] K. B. Gallagher and J. R. Lyle. A program decomposition scheme with applications to software modification and testing. In Proceedings of the 22nd Hawaii International Conference on System Sciences, pages 479–485, January 1989. Volume II, Software Track.
[12] R. Grady. Measuring and managing software maintenance. IEEE Software, 4(9), September 1987.
[13] P. Hausler. Denotational program slicing. In Proceedings of the 22nd Hawaii International Conference on System Sciences, pages 486–494, January 1989. Volume II, Software Track.
[14] S. Horwitz, J. Prins, and T. Reps. Integrating non-interfering versions of programs. In Proceedings of the SIGPLAN 88 Symposium on the Principles of Programming Languages, January 1988.
[15] S. Horwitz, J. Prins, and T. Reps. Integrating non-interfering versions of programs. ACM Transactions on Programming Languages and Systems, 11(3):345–387, July 1989.
[16] S. Horwitz, T. Reps, and D. Binkley. Interprocedural slicing using dependence graphs. ACM Transactions on Programming Languages and Systems, 12(1):35–46, January 1990.
[17] J. Keables, K. Robertson, and A. von Mayrhauser. Data flow analysis and its application to software maintenance. In Conference on Software Maintenance – 1988, pages 335–347, October 1988.
[18] K. Kennedy. A survey of data flow analysis techniques. In Steven S. Muchnick and Neil D. Jones, editors, Program Flow Analysis: Theory and Applications. Prentice-Hall, Englewood Cliffs, New Jersey, 1981.
[19] B. Kernighan and D. Ritchie. The C Programming Language. Prentice-Hall, Englewood Cliffs, New Jersey, 1978.
[20] B. Korel and J. Laski. Dynamic program slicing. Information Processing Letters, 29(3):155–163, October 1988.
[21] B. Korel and J. Laski. STAD - A system for testing and debugging: User perspective. In Proceedings of the Second Workshop on Software Testing, Verification and Analysis, pages 13–20, Banff, Alberta, Canada, July 1988.
[22] J. Laski. Data flow testing in STAD. The Journal of Systems and Software, 1989.
[23] J. R. Lyle. Evaluating Variations of Program Slicing for Debugging. PhD thesis, University of Maryland, College Park, Maryland, December 1984.
[24] J. R. Lyle and M. D. Weiser. Experiments on slicing-based debugging aids. In Elliot Soloway and Sitharama Iyengar, editors, Empirical Studies of Programmers. Ablex Publishing Corporation, Norwood, New Jersey, 1986.
[25] J. R. Lyle and M. D. Weiser. Automatic program bug location by program slicing. In Proceedings of the Second International Conference on Computers and Applications, pages 877–882, Peking, China, June 1987.
[26] L. Ott and J. Thuss. The relationship between slices and module cohesion. In International Conference on Software Engineering, May 1989.
[27] K. Ottenstein and L. Ottenstein. The program dependence graph in software development environments. In Proceedings of the ACM SIGSOFT/SIGPLAN Software Engineering Symposium on Practical Software Development Environments, pages 177–184, May 1984. L. Ottenstein is now known as L. Ott.
[28] T. Reps and S. Horwitz. Semantics-based program integration. In Proceedings of the Second European Symposium on Programming (ESOP '88), pages 133–145, Nancy, France, March 1988.
[29] T. Reps and W. Yang. The semantics of program slicing. Technical Report 777, University of Wisconsin - Madison, June 1988.
[30] N. Schneidewind. The state of software maintenance. IEEE Transactions on Software Engineering, 13(3):303–310, March 1987.
[31] W. Tichy. RCS: A system for version control. Software - Practice & Experience, 15(7):637–654, July 1985.
[32] G. Weinberg. Kill that code! Infosystems, pages 48–49, August 1983.
[33] M. Weiser. Program Slicing: Formal, Psychological and Practical Investigations of an Automatic Program Abstraction Method. PhD thesis, The University of Michigan, Ann Arbor, Michigan, 1979.
[34] M. Weiser. Program slicing. In Proceedings of the Fifth International Conference on Software Engineering, pages 439–449, May 1981.
[35] M. Weiser. Programmers use slices when debugging. Communications of the ACM, 25(7):446–452, July 1982.
[36] M. Weiser. Program slicing. IEEE Transactions on Software Engineering, 10:352–357, July 1984.