Design Pattern Detection by Using Meta Patterns

Download as pdf or txt
Download as pdf or txt
You are on page 1of 12

IEICE TRANS. INF. & SYST., VOL.E91-D, NO.

4 APRIL 2008
933

PAPER Special Section on Knowledge-Based Software Engineering


Design Pattern Detection by Using Meta Patterns

Shinpei HAYASHI•õa), Junya KATADA•õ*, Ryota SAKAMOTO•õ, Nonmembers, Takashi KOBAYASHI•õ•õ,


and Motoshi SAEKI•õ, Members

SUMMARY Oneof the approaches to improveprogramunderstand- detect the occurrences of design patterns in source codes, in
ingis to extractwhatkindsof designpatternareusedin existingobject- order to support various activities of software process under
orientedsoftware.Thispaperproposesa technique for efficiently
andac- the situation where the used patterns are not explicitly spec-
curatelydetectingoccurrencesofdesignpatternsincluded in sourcecodes.
Weusebothstaticanddynamicanalysestoachievethedetection withhigh ified in the document of the source code [2]-[19]. Detecting
accuracy. Moreover,to reducecomputation andmaintenance costs,detec- occurrences of design patterns is a prerequire for many ap-
tion conditionsarehierarchicallyspecified
basedon Pree'smetapatterns plications:
ascommonstructures of designpatterns.Theusageof Prologtorepresent
thedetectionconditions enablesusto easilyaddandmodifythem.Finally, Program understanding-Developers can guess a pro-
we haveimplemented an automated toolas an Eclipseplug-inandcon- gram's behavior by obtaining what classes and meth-
ductedexperiments withJavaprograms. Theexperimental resultsshowthe ods in the program are corresponding with pattern
effectiveness
of ourapproach.
keywords: designpatterns,programunderstanding, metapatterns,dy- roles.
namicanalysis,Prolog Design recovery-Developers can confirm whether pat-
terns in a program are correctly used by continuously
1. Introduction recovering design information of the program.
Pattern mining-An automated pattern detection mecha-
nism leads to exploring unrecognized program struc-
Design patterns [1] are increasingly being applied in object-
tures that are frequently used.
oriented software design processes. Design patterns are ab-
stract structures of artifacts that recur as successful practice The usage of design recovery involves strict time efficiency
in software design so that they can be reused. The usage of because the detection is incrementally performed at the soft-
design patterns allows us to improve the quality of software ware development process. The pattern mining approach
designs, e.g. reusability, extensibility, and maintainability. also involves fast detection in order to apply the detection to
In addition, since design patterns reflect the designers' in- many projects and larger-scale software. Additionally, ac-
tents on the reasons why the designers adopt a design, the curate results are important in high-frequency and/or larger-
identification of design patterns in an existing source code scale pattern detection. We then have to consider how to
helps us to understand it by grasping the designers' intents. improve the accuracy and time efficiency of pattern detec-
Investigating the documentation of source codes is one tion, and of course to reduce the cost of human effort.
of the approaches to identify the occurrences of design pat- This paper proposes a technique to accurately and ef-
terns. However, it frequently happens that the document has ficiently detect occurrences of design patterns in existing
not been completely developed or has not been updated to source codes. Although there are varieties of software pat-
the newest version. Therefore, the documents cannot always terns, we focus only on the design patterns. For example,
include the accurate information on which patterns were Gang-of-Four (GoF) patterns [1] or J2EE patterns [20] are
used in which parts of the code. Additionally, and devel- well-known and widely used design patterns in industry. In
opers spend large amounts of effort on conjecturing pattern our approach, we adopt both static and dynamic analyses so
occurrences by investigating the incomplete document and that we can obtain more accurate results by capturing the dy-
the source code. namic properties of a program as well as the static structure.
There are several researches to (semi-)automatically To improve time efficiency of our automated detection, and
to maintain detection conditions more efficiently, we hierar-
Manuscript received July 2, 2007.
chically compose the detection conditions to decide whether
Manuscript revised October 17, 2007.
The authors are with the Department of Computer Science, a part of a program follows a design pattern.
Graduate School of Information Science and Engineering, Tokyo Since design patterns have common properties, we then
Institute of Technology, Tokyo, 152-8552 Japan. use Pree's meta patterns [21] to represent these common
The author is with the Center for Embedded Computing Sys- properties as a part of the detection conditions. More con-
tems, Graduate School of Information Science, Nagoya University, cretely, if a design pattern includes meta patterns, it can be
Nagoya-shi, 464-8601 Japan. specified with the combination of meta patterns. For this
*Presently
, with NTT COMWARE CORPORATION.
a)E-mail:[email protected] benefit, we take a two-stage pattern detection. First, we de-
DOI:10.1093/ietisy/e91-d.4.933 tect the occurrences of meta patterns in a source code. Next,

Copyright (c) 2008 The Institute of Electronics, Information and Communication Engineers
IEICE TRANS. INF. & SYST., VOL.E91-D, NO.4 APRIL 2008
934

we check if the pre-defined conditions of the design pattern,


including the combination conditions of the meta patterns,
hold or not. If it fails in this stage, we move to other pat-
terns that have the same meta patterns. By using this tech-
nique, we can reduce any redundant repetition of checking
the common properties of design patterns, i.e. the detection
of meta patterns. Furthermore, we use Prolog to define de-
tection conditions so that we can easily add and/or modify
the conditions. It is not a new idea to use Prolog or the first
order predicate logic (FOPL) for reasoning about structural
properties of an object-oriented program, including static
method invocation property [7], [8], [13]-[15], [17]. Addi-
tionally, for design pattern detection, reclassification of GoF
patterns is also proposed [16]. However, none of them used
any hierarchical technique to reduce computation and main-
tenance costs.
In summary, the main contributions of this paper are
considered as follows:
Improved accuracy-By using both static and dynamic
Fig. 1 An overview of our approach.
analyses, we can extract the candidates of design pat-
tern instances more accurately than only using static
analysis. to which class the created object actually belongs because of
Reduced computation and maintenance costs-We use the dynamic binding mechanism. To detect the occurrences
a hierarchy of conditions by using Pree's meta pat- of design patterns, the information on interactions among
terns [21]. Dynamic analysis causes slow detection be- classes is necessary. In addition, only with static analysis,
cause it produces large amounts of data to be analyzed. we cannot identify which method was actually executed, in
Furthermore, comprehensibility of pattern descriptions the case where the method is overridden by a method hav-
may decrease by introducing dynamic analysis. Hier- ing the same name defined in the subclasses. The dynamic
archical categorization of common conditions for each analysis, where we actually execute the code, complements
of the design patterns leads to their improvements in the static one to extract the information such as the methods
time efficiency and maintainability. that were actually executed or the classes of the objects that

The rest of the paper is organized as follows. In the were created during the execution. This information is also

next section, we outline our approach. Section 3 clarifies represented with facts in Prolog.

what information is extracted from a source code. Sections 4 Using extracted information as facts in Prolog, the de-

and 5 present the hierarchical structure of detection condi- tection conditions, described as Prolog programs, are exe-

tions and their representations with Prolog, respectively. In cuted to infer the existence of the class structures satisfying

Sect. 6, we discuss our supporting tool and experiments to the conditions of design patterns. For example, the predi-

show the benefits of our approach. Section 7 discusses the cate observer (C1assRoleList, MethodRoleList)for
detecting Observer pattern is defined as a Prolog program
most relevant work, and Sect. 8 concludes with our contri-
bution and discusses future work. and executed as a query on the facts denoting the extracted

information of the source code.•õ After finishing the execu-


tion, the classes and the methods recognized as constituents
2. Overview of Our Approach
of Observer are bound to the variables ClassRoleList and

MethodRoleList, respectively, as a solution, i.e. an occur-


In our approach, we extract the candidates of design patterns
rence of Observer. One of the benefits to use Prolog is that
from a source code. Figure 1 shows the overview of it.
it can perform exhaustive search for solutions with back-
First of all, we perform both static and dynamic analy-
tracking mechanism during its execution. The Prolog pro-
ses to a source code to be analyzed. The static analysis ex-
tracts the information on the inheritance relations of classes, gram can detect all occurrences of Observer by backtrack-
ing execution if existed in the source code.
the methods defined in the classes, and so on. The extracted
Pattern description should have functionality to repre-
information is translated into a set of facts in Prolog. Since
sent fuzzy conditions. It is difficult to completely decide
object-oriented programs have the mechanism of dynamic
whether a set of classes is actually an instance of a design
binding, we cannot obtain all of the necessary information
without executing them. For example, consider that an ob- pattern because of the variations of design pattern instances.

ject receives a method invocation and calls its corresponding •õ We use sans-serifs for pattern names such as Observer, italics

method. However, by applying static analysis, which only for roles in some patterns such as Observer, and teletypes for parts

investigates the text of the source code, we cannot identify of Prolog/Java code such as observer for avoiding ambiguity.
HAYASHI et al.: DESIGN PATTERN DETECTION BY USING META PATTERNS

935

Table 1 Static informatione xtracted from a source code .

(a)

(b)

3.2 Dynamic Information

Fig. 2 Variations in Observer. For dynamic analysis, we execute the source code and ex-
tract, from the execution records, the occurrences of method
invocations, including constructors' invocations to create in-
stances. The information that we extracted is listed up in
For example, Fig. 2 illustrates two variations in Observer. Table 2.
Based on the description in GoF's book [1], notify() in Sub- The right side of Fig. 3 illustrates an example of ex-
ject is not overridden in its subclasses, i.e. ConcreteSub- tracting the dynamic execution record as a fact in Prolog. A
ject (Fig. 2-a). However, sometimes we modify how to no- part of the facts is shown in the bottom line of the figure,
tify to observers. Then notify() may be overridden by Con- and it begins with the name of the predicate, methEntry.
creteSubjects (Fig. 2-b). To deal with these variations in a This fact denotes an invocation of the method intValue()
unified way, our detection conditions do not simply return in IntNumber class during the execution. We model the be-
true or false value, but the degree of certainty that a code havioral aspect of method invocation with a pair of events;
includes the design pattern. That is to say, we embed the the event of the method being called and the event of the
calculation of fuzziness to our facts in Prolog. The descrip- method being returned. Each event is assigned a number
tion about our fuzzy conditions is included in Sect. 5. corresponding the order in which it was invoked. We will
refer to it as the ordinal execution number of the event. The
3. Extracted Information first parameter 14 in the predicate methEntry stands for the
ordinal execution number of the event of the method just
3.1 Static Information being called in the analyzed execution record. After finish-
ing the execution of the method and returning to the caller
Our approach first extracts static information by source code object, the 15th (second parameter of the predicate) event
analysis. Table 1 lists up the types of the extracted informa- occurred as an event of the method to be returned. It means
tion. For example, for a class, we extract the information that the method invocation intValue () is modeled with a
on inheritance relationship (its superclass), visibility (pub- pair of events, 14 and 15, as shown in Fig. 3. The identifier
lic or private), whether it is an abstract or concrete class, 13 as the third parameter shows the parent method invoca-
the list of methods that it defines, and fields that it has tion event that caused this invocation. The caller (sender of
the method invocation) and the callee (receiver) object are
(instance variable declarations). The left side of Fig. 3 il-
lustrates an example of extracting the static structure of a referenced by 62 and 63, respectively. The class Number is
source code as a fact in Prolog. In the figure, the line start- a superclass of IntNumber, and this method invocation was
ing with the predicate class represents the static informa- done on the object whose class is Number from syntactic
tion of class IntNumber as a fact. The six parameters of the view. For example, the code of this method invocation can
predicate denote its class name ('IntNumber'), superclass be written as follows:
('Number'), methods (['IntNumber. intValue()']), in- Number obj=new IntNumber();
stance variables ([]: a null list because it has no fields), the obj. intValue (obj 1);
information on whether it is an abstract or concrete class
('false': abstract class), and its visibility ('public'), re- Note that the receiver object is bound to the type of Number,
spectively. not IntNumber, and therefore the sixth parameter of the
IEICE TRANS. INF. & SYST., VOL .E91-D, NO.4 APRIL 2008
936

Fig. 3 Representing static and dynamic information with Prolog.

Table 2 Dynamic information extracted from execution records .

above predicate is Number. The seventh parameter [58]


shows that the method invocation had only one input pa- Fig. 4 Observer pattern and 1:N Connection meta pattern.

rameter and that it is the object with the ID number 58.

4. Detection Condition by Meta Patterns of ConcreteObserver, and the invocation of notify() leads
to sending update messages to all of the ConcreteObserver
4.1 Meta Patterns instances. By using dynamic binding and overriding mecha-
nisms, the execution of update varies on its implementation
In many cases, customizable parts of design patterns, so- in ConcreteObserver subclass. Note that the ID numbers
called hot spots, are implemented with employing sub- such as ID: 157 attached to the syntactic elements in Fig. 4
classes of an abstract class and overriding its abstract meth- will be used to explain a detection condition in Sect. 5.2.
ods by concrete ones in the subclasses. The designers can Many of design patterns include common structures.
fill the hot spots with the concrete methods that override the Pree classified hot spots of design patterns into seven cate-
abstract one. The method overridden by a concrete method gories from the viewpoint of associations between the class
in a subclass is called hook method, and a template method having hook methods (called a hook class) and a class hav-
includes one or more hook methods to implement its func- ing template methods (template class). He defined these cat-
tions. egories as meta patterns [21]. Meta patterns can be consid-
For example, consider Observer of GoF patterns ered as a basis of design patterns and in fact. For example,
shown in the bottom of Fig. 4. The method update() of the most of GoF patterns include meta patterns. Figure 5 depicts
abstract class Observer has no implementation but defines the class diagrams of the Pree's meta patterns. For example,
its interface, i.e. its name and parameters. When a designer as shown in Fig. 4, Observer includes 1:N Connection meta
uses Observer, he/she composes concrete subclasses of Ob- pattern. In 1: N Connection, there is no inheritance relation-
server class and customizes the pattern by overriding the ab- ship between a template class and a hook class, but they are
stract method update() with a concrete method, i.e. a method connected through an aggregation relationship whose cardi-
having an implementation, in each subclass in order to im- nality is one-to-many. In Fig. 4, Subject class and Observer
plement the functions that he/she wants. The method up- class correspond to a template class and a hook class, re-
date() is a hook method, and the method notify() in Subject spectively.
class which calls update() is a template method. The in- We analyzed 23 GoF design patterns and investigated
stance variable observers in Subject class holds all instances which meta patterns are included in them. Table 3 shows the
HAYASHI et al.: DESIGN PATTERN DETECTION BY USING META PATTERNS

937

ditions for Bridge are the combination of 1:1 Connection


conditions and Bridge-specific ones. If we try to detect each
pattern individually, we have to check 1:1 Connection con-
ditions twice on the whole source code to be analyzed which
brings inefficiency. In our approach, as shown in Fig. 6,
we organize the hierarchical structure where 1:1Connection
conditions are in the upper layer and where Builder-specific
conditions and Bridge-specific ones are in the lower layer.
To detect the occurrences of GoF design patterns, we visit
nodes in this hierarchical structure from the root node and
check, in descending order, the conditions which the vis-
ited nodes have. In the example, we first find out the parts
of the source code that hold the 1:1 Connection conditions
and then apply the Builder-and Bridge-specific conditions
to them. Since we check the 1:1Connection conditions only
once, we can attain more efficient detection. The hierarchi-
cal structure is organized based on logical implication rela-
tionship among the detection conditions. For example, 1:1
Fig. 5 Meta patterns. T, H, TH, t(), h(), th() denote a template class, a Connection can be considered as a logically stronger ver-
hook class, a class having both roles of template and hook class, a template sion of 1:N Connection, because the constraint that the car-
method, a hook method, and a method having both roles of template and
dinality should be one-to-one is additionally imposed to 1:N
hook method, respectively.
Connection condition. Therefore we check the 1:N Con-
nection condition, and then check 1:1 Connection-specific
Table 3 Classification of GoF design patterns by meta patterns. condition, i.e. one-to-one cardinality if 1:N Connection is
satisfied. In this sense, the hierarchical structure represents
the order of checking the detection conditions. We start
with checking the Common Conditions for Meta Patterns,
which all meta patterns have. This common condition spec-
ifies the existence of a method that is overridden as a hook
method and is called from somewhere. Common Conditions
for Connection, the definition of an aggregation between the
corresponding classes, is common to all of connection types
of meta patterns. Proposed hierarchical structure is used
for both static and dynamic analyses. As shown in Fig. 6,
*: patterns including more than one meta pattern we can define hierarchical relationships not only between a
meta pattern and a design pattern but also between two meta
patterns and between two design patterns. Factory Method
classification of GoF patterns by the included meta patterns. is a special case of Template Method, we then check the
20 of the 23 patterns include a meta pattern, except for Iter- condition of Factory Method after that of Template Method
ator that includes two occurrences of 1: N Connection meta is satisfied.
pattern and Visitor that includes both 1: N Connection and By constructing the hierarchy of conditions, we can
1:1 Connection meta patterns. also reduce the size of detection conditions. In our tech-
nique, we develop detection conditions efficiently by using
4.2 Hierarchy of Detection Conditions a reusing mechanism as well as the inheritance mechanism
of object-oriented programming. If a target pattern (e.g. P)
We define the hierarchical structure of GoF patterns based includes other pattern or meta patterns (e.g. Pi), we can con-
on the inclusion of meta patterns. In the case where several sider P is inherited from Pi, and Pi is regarded as the super-
design patterns have the common properties for detecting classes of P. After declaring the relationship between P and
them, we can attain efficient detection by avoiding the re- Pi, we can just focus on writing the condition specific to P.
dundant repetition of checking these common properties. In For example, to describe the detection condition of Factory
our approach, these common properties are specified with Method, all we have to do is declare parent pattern Template
meta patterns. Method and describe Factory Method-specific conditions,
As an example, consider the detection of Builder and e.g. a constraint of result type. Furthermore, if a modifica-
Bridge. As shown in Table 3, both of them include 1:1 tion of detection conditions is required, e.g. adding debug-
Connection meta pattern. Therefore, the Builder detection ging information or introducing stricter conditions, we can
conditions can consist of the conditions for 1:1 Connection easily find where to modify because the conditions in com-
and the conditions specific to Builder. Similarly, the con- mon and those specific to the target pattern are separated.
IEICE TRANS. INF. & SYST., VOL.E91-D, NO.4 APRIL 2008
938

Fig. 6 Hierarchical structure of detection conditions.

observer(Class Role List, Method Role List):-


Compared with detection conditions that do not use any
hierarchical structure, we can acquire>30% reduction by
meta patterns, furthermore>60% reduction by meta patterns
and common conditions for meta patterns. A well-balanced
hierarchical structure of GoF patterns and Pree's meta pat-
terns contributes an effective reduction of whole description
of detection conditions.

5. Specifying Detection Conditions with Prolog

5.1 Detecting Patterns with Prolog

The overview of a detection process by using Prolog is as


follows. We first represent the extracted information from
the source code as facts in Prolog and assert (registration of
facts) them into a Prolog database, which is a collection of
facts. We then check the detection conditions in the order Fig. 7 A part of the detection condition for Observer.
following the hierarchical structure from its root shown in
Fig. 6. The candidates of design pattern instances are ob-
tained as solutions satisfying the detection conditions, and the list of the hook methods h(), Subject as the template
they are asserted into the database in order to utilize them class T, and Observer as the hook class H. The results are
for checking the conditions of the lower level. asserted as facts in Prolog, e.g. metal_Nconnection(157,
[158], 2, 3), into the database so that we can use them
5.2 Detection Condition to check further the Observer specific conditions.
The predicates following methEntry predicate are for
The detection conditions are defined as rules in Prolog. the Observer specific conditions:
The rules specify the properties that the design patterns
should have with a series of predicates so as to evaluate 1. The method notify() should be called by the instance
the properties to the facts of the source code in Prolog. of ConcreteSubject class. The candidates of notify()
Figure 7 shows a part of the detection conditions for Ob- method, which is calculated in metal_Nconnection,
server which has the 1:N Connection condition in the upper is passed through the variable NotifySN, and its actual
layer. Observer and its correspondence to 1:N Connection invocation from an instance of the ConcreteSubject in
are shown in Fig. 4. According to Fig. 6, we have to start the execution record is bound to NotifyMethID.
with the check of 1:N Connection conditions, and the pred- 2. The ConcreteSubject class should be a subclass of Sub-
icate metal_Nconnection appears in the right side of the ject class.
rule. The results from evaluating this predicate, i.e. the parts 3. The ConcreteObserver classes should have the method
of the source code satisfying it, are bound to the variables update(). The candidates of update() methods, con-
NotifySN as the template method t(), UpdateSNList as crete classes, and their instances are bound to the van-
HAYASHI et al.: DESIGN PATTERN DETECTION BY USING META PATTERNS

939

ables UpdateSNList, ConcreteObserverList, and


ConcreteObserverObjList, respectively. 6. Experimentation
4. NotifyDeciClass, the classes having method no-

tify(), should have the method attach(). 6.1 Automated Tool

The following Prolog goal is for detecting the occur- 6.1.1 Design Pattern Detector
rences of Observer in a source code.

We implemented a tool to automate the detection of de-


observer (ClassRoleList, MethodRoleList). sign patterns based on our approach as a plug-in of Eclipse
IDE [22]. As suggested in Fig, 1, our automated tool con-
sists of two parts: 1) an analysis module for extracting static
The first parameter ClassRoleList in the goal has
and dynamic information from a Java source code (called
the triples consisting of role names in Observer, their cor-
static analyzer and dynamic analyzer, respectively), and
responding detected class names, and the instances of the
2) a detection module for generating facts in Prolog from
classes. On the other hand, a list of triples regarding the
the extracted information and checking detection conditions
detected methods is bound to MethodRoleList, and each
against them. The static analyzer has been implemented
triple contains a role name in Observer, its corresponding
by using Abstract Syntax Tree (AST) parser of the Eclipse
method signature, and an identifier of its method invocation
JDT plug-in [23], and the dynamic analyzer uses Java De-
included in the execution record. A concrete example of
bug Interface (JDI) to obtain, from Java Virtual Machine, the
the detection result is shown in Fig. 8. It is the result from
events of method calls and method returns that occur during
applying the condition in Fig. 7 to a source code which in-
the execution. We have used tuProlog [24], [25], that can be
cludes the Observer shown in Fig. 4. The first four lines in
operated with Java platform, to check detection conditions
Fig. 8 show the value of the first parameter ClassRoleList
written in Prolog.
as the result regarding the detected classes. More concretely,
Figure 9 is the screenshot of the automated tool on
the class Node corresponds to Subject class in Observer, but
Eclipse IDE. The console view part (the bottom part of the
there are no instances of Node because it is an abstract class,
figure) displays the result from detecting Observer in the
hence the null. The class File plays a role of Concrete-
form of Prolog predicates. After recognizing the several oc-
Subject in Observer and an instance of the identifier 57 was
currences of 1:1 Connection, the tool detects an occurrence
created during the execution. In the fifth line of the Fig. 8,
of Observer. The user identifies what parts of the program
the results from detecting methods are shown. For exam-
are organized as design patterns from detection results. Our
ple, notifyObservers() method, defined in Node, cone- tool is under the prototyping, so only preliminary function-
sponds to the role of notify() in Observer, and its method
alities are available. As an empirical viewpoint, detection
invocation has the identifier 15 7.
results should be provided not only by textual but also more
Pattern candidates are associated with its fuzziness
graphical information. To improve the legibility of detection
score, which is represented using the predicate score
results, a tool to visualize pattern candidates and a combina-
shown in Fig. 7-5. The rule represents, •gfor Observer, if
tion of multiple patterns is desired.
Subject is not an abstract class, then the score should have
40 subtracted from it.•h The second and third parameters are
6.1.2 Test Case Generator
directly passed from predicate observer. The fourth pa-

rameter is the score that should be added to the default score


Dynamic analysis requires test cases to execute the target
100 if the following conditions are satisfied. In this case, a
program with the regular calling order in an effort to extract
negative score 40 is used because the following conditions
amounts of dynamic information enough to detect design
may not be suitable for Observer. As explained above, since
patterns. Our detection conditions represent what kind of
the definitions of fuzziness can be described declaratively,
methods are invoked in what order, so execution sequences
we then can easily add, remove, and modify the estimation
against the order do not satisfy the conditions. To reduce
rules without changing the detection rules.
computation time and human cost, avoiding unnecessary ex-
ecutions is useful. However, some techniques for improving
coverage requirements often generate many ineffective test
cases.
We also implemented a tool for supporting test case
construction by using test case templates and pattern candi-
dates detected by static analysis. Figure 10 is the screenshot
of the supporting tool. The tool lists the candidates of de-
sign patterns detected by static analysis (left side of tabs).
When the user selects a candidate, the tool shows a template
which is suitable for design pattern detection. For example,
Fig. 8 An example of detection results. the template for Observer is shown in Fig. 11. For each part
IEICE TRANS. INF. & SYST., VOL.E91-D , NO.4 APRIL 2008
940

Fig. 9 A screenshot of the tool for design pattern detection.

Table 4 Sizes of the programs.

proach. We picked up two Java programs to detect design


patterns from: 1) the modified version of a sample program
which appears in the textbook of design pattern written by
Vlissides [27] (we call it program #1) and 2) the dynamic
Fig. 10 A screenshot of the tool for test case generation. analyzer of our supporting tool itself (called program #2).
The reason why we selected them was that we had known
which design patterns were actually used in which parts of
the programs in advance. We then could assess the accuracy
of our detection by comparing the results produced by the
tool with the real usages of design patterns in the programs.
Fig, 11 A template of test case generation for Observer. Table 4 shows the size of the programs #1 and #2. For
example, the program #2 includes 56 classes, and 1,439
method invocations (excluding the invocations within Java
of the pattern candidate, the tool also shows information on libraries such as Swing) were done during the execution.
its names, types, and pre-conditions described in JML [26]. We have designed the detection conditions for 12
A user implements test cases by using this supporting infor- GoF design patterns; Abstract Factory, Builder, Chain of
mation. Responsibility, Composite, Factory Method, Observer,
Proxy, State, Strategy, Template Method, Visitor, and Sin-
6.2 Experimental Method gleton. Note that Singleton is not included in the hierarchi-
cal structure of Fig. 6 because it is too simple and has no
In this subsection, we introduce the experiments that we per- meta patterns.
formed to assess our approach and tool. In particular, the When performing dynamic analysis, what methods are
aim is to assess the accuracy and time efficiency of our ap- actually executed depends on the input data that will be sup-
HAYASHI et al.: DESIGN PATTERN DETECTION BY USING META PATTERNS
941

Table 5 Results of test case generation. Table 6 Detection results.

plied to the program during the execution, so we cannot get


the information on the methods that were not executed. We

also added test cases to extract additional information by


using our test case generator for program #1. The results

are shown in Table 5. In the table, the columns •gThr.•h, •g#

Cand.•h, •g# Tests•h, •gTime•h, and •gFound patterns•h represent

the threshold of the detection conditions, the number of au-

tomatically generated test cases, the number of manually


constructed test cases from candidates with a tool support, columns •gC•h, •gS•h, and •gSD•h represent Correct, the num-

time spent on construction in minutes, and pattern names ber of the patterns that were actually used, Static, the num-

that the instances of design patterns in constructed test cases ber of the patterns from static analysis only, and Static-and-

belong to, respectively. At the end we additionally obtained Dynamic, the number of the patterns from both static and

11 test cases, which are enough to find seven types of pat- dynamic analyses, respectively. The rows •gPrecision•h and

terns. Three test cases were obtained by the candidates by •g Recall•h indicate one of the accuracy criteria and are calcu-

using lower threshold of 80, which in turn rescuing informal lated by using a set of detected results M and correct patterns

candidates lead to lower false negatives. C:

In order to assess the improvements of the accuracy of

our detection technique, we provided two types of the de- Precision=|M•¿C|/|C|, Recall=|M•¿C|/|M|.

tection way; performing 1) static analysis only, and 2) both


static and dynamic analyses. We measured the number of The tool could detect all of the patterns that were actu-
detected pattern candidates and compared the precisions of ally used in both of the programs-maximum recalls. Fur-
both types. thermore, precisions are drastically improved by introduc-
Furthermore, in order to assess the time efficiency of ing dynamic analysis. As a whole program #1 and #2, we
our detection technique, we provided two types of detection can observe the improvement of the precision from 0.19 to
conditions; 1) following the hierarchical structure shown in 0.67.
Fig. 6 and 2) not following the hierarchical structure. The The dynamic analysis that we adopted was quite useful
second type of the conditions is not asserted into the Prolog and effective, and it provided the information on the execu-
database. Note that the first type for Singleton is the same as tion order of the method invocations and the identification
the second one because it is not included in the hierarchical of overriding methods. This information allowed the tool to
structure. We have detected the patterns by using both types filter out irrelevant candidates of patterns in early stage. In
of the conditions and compared the detection times. The fact, more predicates of Prolog specifying the conditions on

differences between the detection times allow us to evaluate the order of method invocations and on overriding methods
how much hierarchy of the detection conditions can con- were used in the detection conditions compared to the pred-
tribute to the improvement of time efficiency. Additionally, icates for static structures. They were frequently evaluated

we do not consider quantitative criteria of our experiments during the detection processes.

are especially important because assuming criteria depends Table 7 shows the time taken to detect the, patterns.
on the target program. For example, structurally and be- The results in the case of using the hierarchical structure

haviorally equal patterns, e.g. State and Strategy, are not (our proposed approach) are included in the cells under
correctly detected by our technique, and then quantitative the column of •gHierarchical conditions•h, while the column
results depend on the kinds of patterns included in the tar- •g Non-hierarchical conditions is for the case of the non-

hierarchical detection conditions. As for the total time for


get program. We qualitatively discuss what points of our
technique contribute to the experimental results. the detection, our proposed approach was 3.5 times as fast

Our experiments were done on a personal computer as non-hierarchical conditions in the case of the program #1,

with Intel Pentium III 1 GHzx2, 512 MB RAM, and Mi- and 6.5 times faster in the program #2. This result shows

crosoft Windows 2000. that our approach of using hierarchical detection conditions

is more effective.

6.3 Experimental Results As for the case shown in Table 6 that the value of •gS•h or

•g SD•h is greater than that of •gC•h, the tool correctly detected

the parts of thee programs where a pattern was really used,


The detection results are shown in Table 6. In the table, the
IEICE TRANS. INF. & SYST., VOL.E91-D, NO .4 APRIL 2008
942

Table 7 Detection time (unit: millisecond).

but it took the wrong patterns. In the example of the pro- In the works by Correa et al. and Kramer et al., the
gram #1, the tool detected Chain of Responsibility, but its conditions that should be satisfied on the static structure of a
part was structured with Composite. The tool detected the source code are defined with FOPL. The pattern occurrences
occurrences of State and Strategy at the same part of the are detected by checking the true/false value of the condi-
program #1, but Observer was actually used there. These tions [7], [8]. To provide an easy way to describe the condi-
errors resulted from the incorrect decision of the role of the tions on so-called pattern specifications, Balanyi et al. pro-
methods as constituents of the patterns. The reason why posed an XML based language named DPML [9]. However,
State and Strategy patterns were detected at the same part these approaches only use the static structure of a code such
is their similarity in static structure. In the case of the pro- as inheritance, aggregation, association and method invoca-
gram #2, an occurrence of Template Method was detected tion relationships. Therefore it leads to inaccuracy when de-
but the pattern was not actually used. The detected part was tecting patterns. In particular, in DPML, although Factory
included in the occurrence of Factory Method, which was Method could be completely detected, the other patterns,
correctly detected. This error arises from the fact that Fac- Builder, Proxy, and Visitor, could be detected at correctness
tory Method can be a special form of Template Method. ratios of 14%, 50%, and 0%, respectively. The types of pat-
These errors are not related to the structural and behav- terns that could be detected were limited.
ioral but the semantic differences. To detect these patterns Some approaches use Prolog or FOPL for reasoning
precisely, we are considering to introduce some semantic about structural properties of object-oriented programs, in-
analyses, such as program identification analysis [18], or cluding static method invocation properties [7], [8], [13]-
heuristic-based approach, such as the technique using a ma- [15]. In particular, the tool of Kniesel et al, is based on the
chine learning [6]. Some of the failing reasons of our de- static source code analysis and detects patterns rapidly [17].
tection technique are considered as follows: 1) not followed However, most of them do not consider maintainability of
variations of patterns, 2) irrelevant execution sequences, and pattern descriptions, i.e. programs in Prolog or FOPL ex-
3) pattern that has no structural characteristics. To detect pressions themselves.
non-structural patterns, Prolog representation of a program Some researches use the dynamic data such as exe-
behavior other than executions is required. cution traces [10]. The combination of static and dynamic
analyses seems to be one of the ways to achieve more accu-
7. Related Work rate pattern detection [28]. However, the computation cost
may be a problem of automated detection.
Since formal concept analysis helps us to extract the
There are several researches to automatically detect occur-
abstract static structures that commonly appear, it is applied
rences of design patterns, and some of them use metrics that
to pattern mining as well as pattern detection [11], [12]. Pat-
measure the possibility of the existence of patterns [2]-[5].
tern mining approach strongly requires time efficiency be-
These metrics techniques can be combined with a statistical
cause they are applied to many projects and larger-scale soft-
approach and learning algorithms to improve the accuracy
ware.
of detection [6].
HAYASHI et al.: DESIGN PATTERN DETECTION BY USING META PATTERNS

943

Semantic analysis using natural language processing, cale one.


e.g. an analysis of identifications or class/method names on
a source code, is also effective to detect design patterns. The
Acknowledgments
work by Dong et al. indicates that the collaboration of the
approaches brings good results [18], [19]. However,,consid-
The authors would like to thank Mr. Rodion Moiseev,
erations of extensibility and maintainability of pattern de-
Mr. Hironobu Kanai, and anonymous reviewers of IEICE
scriptions are not mentioned.
for their valuable comments to the earlier version of this pa-

per.
8. Conclusion
References

In this paper, we proposed the approach to detect design pat-


terns from a source code more accurately and efficiently. [1] E. Gamma, R. Helm, R. Johnson, and J. Vlissides, Design Patterns:
Elements of Reusable Object-Oriented Software, Addison-Wesley,
To improve accuracy, we adopted an integrated method of
1995.
static and dynamic analyses. Furthermore, we focused on [2] H. Kim and C. Boldyreff, •gA method to recover design patterns us-
the common properties that design patterns have and rep- ing software product metrics,•h Proc. 6th International Conference on
resented them as Pree's meta patterns. The meta patterns Software Reuse, pp. 318-335, 2000.

helped us to hierarchically structure the properties of design [3] T. Muraki and M. Saeki, •gMetrics for applying GoF design patterns
in refactoring processes,•h Proc. 4th International Workshop on Prin-
patterns, and we could reduce the time costs of detection ciples of Software Evolution, pp. 27-36, 2001.
and the size of whole detection conditions. We adopted Pro-
[4] S. Demeyer, S. Ducasse, and O. Nierstrasz, •gFinding refactoring
log to represent the detection condition of design patterns, via change metrics,•h Proc. 15th Conference on Object-Oriented
so that we could easily add and modify the detection condi- Programming, Systems, Languages and Applications, pp. 166-177,

tions and implement the efficient detection of patterns by us- 2000.

ing the backtracking mechanism and the database function, [5] Y.G. Gueheneuc, H. Sahraoui, and F. Zaidi, •gFingerprinting design

i.e. assert and retract predicates. The database function patterns,•h Proc. 11th Working Conference on Reverse Engineering,

pp. 172-181, 2004.


was useful to avoid re-evaluation of the same detection con- [6] R. Ferenc, A. Beszedes, L. Fulop, and J. Lele, •gDesign pattern min-
ditions. Our supporting tool and the experiments showed ing enhanced by machine learning,•h Proc. 21st International Confer-

the benefits of our approach. In particular, the result of our ence on Software Maintenance, pp. 295-304, 2005.

experiments shows that all of the pattern occurrences were [7] A.L. Correa, C.M.L. Werner, and G. Zaverucha, •gObject oriented

detected. It means that our hierarchical detection conditions design expertise reuse: An approach based on heuristics, design pat-
terns and anti-patterns,•h Proc. 6th International Conference on Soft-
are accurate enough to detect design patterns. However, it ware Reuse, pp. 336-352, 2000.
was difficult to completely recognize correct patterns, we [8] C. Kramer and L. Prechelt, •gDesign recovery by automated search
consider that some of these types of errors can be reduced for structural design patterns in object-oriented software,•h Proc. 3rd

by adopting a stronger dynamic analysis. To solve the oth- Working Conference on Reverse Engineering, pp. 208-215, 1996.

ers, we should consider the semantic aspect of patterns, not [9] Z. Balanyi and R. Ferenc, •gMining design patterns from C++ source
code,•h Proc. 19th International Conference on Software Mainte-
static structure or the usage of methods and classes during nance, pp. 305-314, 2003.
program execution. One of the differences between State [10] D. Heuzeroth, T. Holl, G. Hogstrom, and W. Lowe, •gAutomatic de-
and Strategy, which have the same structure, is that objects sign pattern detection,•h Proc. 11th International Workshop on Pro-

in State are created as states while objects in Strategy are gram Comprehension, pp. 94-103, 2003.

created as strategies. It means that we should clarify the [11] P. Tonella and G. Antoniol, •gObject oriented design pattern infer-
ence,•h Proc. 15th International Conference on Software Mainte-
semantic differences between State and Strategy.
nance, pp. 230-239, 1999.
We can consider the following issues as future work: [12] G. Arevalo, F. Buchli, and O. Nierstrasz, •gDetecting implicit collab-
oration patterns,•h Proc. 11th Working Conference on Reverse Engi-
Describing detection conditions for more patterns-We neering, pp. 122-131, 2004.
have shown that our hierarchical structure by using [13] R. Wuyts, •gDeclarative reasoning about the structure object-oriented
Pree's meta patterns is effective to describe conditions systems,•h Proc. TOOLS USA '98 Conference, pp. 112-124, 1998.

of design pattern detection. Confirmations to describe [14] J.M. Smith and D. Stotts, •gSPQR: Flexible automated design pattern

conditions for other design patterns, e.g. patterns in extraction from source code,•h Proc. 18th International Conference

on Automated Software Engineering, pp. 215-224, 2003.


Grand's catalog [29] or J2EE patterns [20], are effec-
[15] D. Heuzeroth, S. Mandel, and W. Lowe, •gGenerating, design pattern
tive. detectors from pattern specifications,•h Proc. 18th International Con-
Visualizing patterns-It is hard to understand constitu- ference on Automated Software Engineering, pp. 245-248, 2003.

tions and communications of patterns because our [16] N. Shi and R.A. Olsson, •gReverse engineering of design patterns

tool's outputs are just facts in Prolog. We consider a from Java source code,•h Proc. 21st International Conference on Au-

tomated Software Engineering, pp. 123-134, 2006.


visualizing technique to enhance the understandings of
[17] G. Kniesel, J. Hannemann, and T. Rho, •gA comparison of logic-
the patterns. In particular, the tool to visualize classes based infrastructures for concern detection and extraction,•h Proc. 3rd
corresponding to multiple patterns is desired. Workshop on Linking Aspect Technology and Evolution, 2007.

Case study of larger scale-Our case study in this paper is [18] J. Dong, D.S. Lad, and Y. Zhao, •gDP-Miner: Design pattern discov-

pre-limited, so we should apply our technique to larger ery using matrix,•h Proc. 14th International Conference and Work-
MICE TRANS. INF. & SYST., VOL.E91-D, NO.4 APRIL 2008
944

shops on the Engineering of Computer-Based Systems, pp. 371-380, Takashi Kobayashi received B. Eng., M.
2007. Eng., and Ph.D. degrees in computer science
[19] J. Dong and Y. Zhao, •gExperiments on design pattern discovery,•h from Tokyo Institute of Technology in 1997,
Proc. 3rd International Workshop on Predictor Models in Software 1999, and 2004, respectively. He is currently a
Engineering, 2007. designated associate professor of Center for Em-
[20] J. Crupi, D. Malks, and D. Alur, Core J2Ee Patterns: Best Practices bedded Computing Systems (NCES), Graduate
and Design Strategies, Prentice Hall PTR, 2001. School of Information Science, Nagoya Uni-
[21] W. Pree, Design Pattern for Object-Oriented Software Development, versity. His research interests include software
Addison-Wesley, 1996. patterns and architecture, software development
[22] Eclipse project, available at http://www.eclipse.org/ method, software configuration management,
[23] Eclipse Java development tools (JDT) subproject, available at Web-services compositions, workflow, multi-
http://www.eclipse.org/jdt/ media information retrieval, and data mining. He is a member of IPSJ,
[24] E. Denti, A. Omicini, and A. Ricci, •gMulti-paradigm Java-Prolog JSSST, DBSJ, and ACM.
integration in tuProlog,•h Science of Computer Programming, vol.57,

no.2, pp. 217-250, 2005.

[25] tuProlog, available at http://tuprolog.alice.unibo.it/


Motoshi Saeki received a B. Eng. de-
[26] G.T. Leavens, A.L. Baker, and C. Ruby, •gJML: Java modeling lan-
gree in electrical and electronic engineering, and
guage,•h available at http://www.eecs.ucf.eduneavens/JML/
M. Eng. and Ph.D. degrees in computer science
[27] J. Vlissides, Pattern Hatching: Design Patterns Applied, Addison-
from Tokyo Institute of Technology, in 1978,
Wesley, 1998.
1980, and 1983, respectively. He is currently a
[28] L. Wendehals and A. Orso, •gRecognizing behavioral patterns at run-
time using finite automata,•h Proc. 2006 International Workshop on
professor of computer science at Tokyo Institute
of Technology. His research interests include re-
Dynamic Systems Analysis, pp. 33-40, 2006.
quirements engineering, software design meth-
[29] M. Grand, Patterns in Java, John Wiley & Sons, 1998.
ods, software process modeling, and computer
supported cooperative work (CSCW).

Shinpei Hayashi received a B. Eng. de-


gree in information engineering from Hokkaido
University in 2004, and a M. Eng. degree in
computer science from Tokyo Institute of Tech-
nology in 2006. He is currently a Ph.D. stu-
dent in computer science at Tokyo Institute of
Technology. His research interests include soft-
ware evolution and refactoring, software pat-
terns, software development environment, and
mining software repositories.

Junya Katada received B. Eng. and M.


Eng. degrees in computer science from Tokyo
Institute of Technology in 2003 and 2005, re-
spectively. He is currently working at NTT
COMWARE CORPORATION. His research in-
terests include software patterns, software archi-
tecture, software design methods, and software
development environment.

Ryota Sakamoto received a B. Eng. de-


gree in computer science from Tokyo Institute
of Technology in 2006. He is currently a master
course student in computer science at Tokyo In-
stitute of Technology. His research interests in-
clude software testing, software patterns, human
eye modeling, and human visual processing.

You might also like