Design Pattern Detection by Using Meta Patterns
Design Pattern Detection by Using Meta Patterns
Design Pattern Detection by Using Meta Patterns
4 APRIL 2008
933
SUMMARY Oneof the approaches to improveprogramunderstand- detect the occurrences of design patterns in source codes, in
ingis to extractwhatkindsof designpatternareusedin existingobject- order to support various activities of software process under
orientedsoftware.Thispaperproposesa technique for efficiently
andac- the situation where the used patterns are not explicitly spec-
curatelydetectingoccurrencesofdesignpatternsincluded in sourcecodes.
Weusebothstaticanddynamicanalysestoachievethedetection withhigh ified in the document of the source code [2]-[19]. Detecting
accuracy. Moreover,to reducecomputation andmaintenance costs,detec- occurrences of design patterns is a prerequire for many ap-
tion conditionsarehierarchicallyspecified
basedon Pree'smetapatterns plications:
ascommonstructures of designpatterns.Theusageof Prologtorepresent
thedetectionconditions enablesusto easilyaddandmodifythem.Finally, Program understanding-Developers can guess a pro-
we haveimplemented an automated toolas an Eclipseplug-inandcon- gram's behavior by obtaining what classes and meth-
ductedexperiments withJavaprograms. Theexperimental resultsshowthe ods in the program are corresponding with pattern
effectiveness
of ourapproach.
keywords: designpatterns,programunderstanding, metapatterns,dy- roles.
namicanalysis,Prolog Design recovery-Developers can confirm whether pat-
terns in a program are correctly used by continuously
1. Introduction recovering design information of the program.
Pattern mining-An automated pattern detection mecha-
nism leads to exploring unrecognized program struc-
Design patterns [1] are increasingly being applied in object-
tures that are frequently used.
oriented software design processes. Design patterns are ab-
stract structures of artifacts that recur as successful practice The usage of design recovery involves strict time efficiency
in software design so that they can be reused. The usage of because the detection is incrementally performed at the soft-
design patterns allows us to improve the quality of software ware development process. The pattern mining approach
designs, e.g. reusability, extensibility, and maintainability. also involves fast detection in order to apply the detection to
In addition, since design patterns reflect the designers' in- many projects and larger-scale software. Additionally, ac-
tents on the reasons why the designers adopt a design, the curate results are important in high-frequency and/or larger-
identification of design patterns in an existing source code scale pattern detection. We then have to consider how to
helps us to understand it by grasping the designers' intents. improve the accuracy and time efficiency of pattern detec-
Investigating the documentation of source codes is one tion, and of course to reduce the cost of human effort.
of the approaches to identify the occurrences of design pat- This paper proposes a technique to accurately and ef-
terns. However, it frequently happens that the document has ficiently detect occurrences of design patterns in existing
not been completely developed or has not been updated to source codes. Although there are varieties of software pat-
the newest version. Therefore, the documents cannot always terns, we focus only on the design patterns. For example,
include the accurate information on which patterns were Gang-of-Four (GoF) patterns [1] or J2EE patterns [20] are
used in which parts of the code. Additionally, and devel- well-known and widely used design patterns in industry. In
opers spend large amounts of effort on conjecturing pattern our approach, we adopt both static and dynamic analyses so
occurrences by investigating the incomplete document and that we can obtain more accurate results by capturing the dy-
the source code. namic properties of a program as well as the static structure.
There are several researches to (semi-)automatically To improve time efficiency of our automated detection, and
to maintain detection conditions more efficiently, we hierar-
Manuscript received July 2, 2007.
chically compose the detection conditions to decide whether
Manuscript revised October 17, 2007.
The authors are with the Department of Computer Science, a part of a program follows a design pattern.
Graduate School of Information Science and Engineering, Tokyo Since design patterns have common properties, we then
Institute of Technology, Tokyo, 152-8552 Japan. use Pree's meta patterns [21] to represent these common
The author is with the Center for Embedded Computing Sys- properties as a part of the detection conditions. More con-
tems, Graduate School of Information Science, Nagoya University, cretely, if a design pattern includes meta patterns, it can be
Nagoya-shi, 464-8601 Japan. specified with the combination of meta patterns. For this
*Presently
, with NTT COMWARE CORPORATION.
a)E-mail:[email protected] benefit, we take a two-stage pattern detection. First, we de-
DOI:10.1093/ietisy/e91-d.4.933 tect the occurrences of meta patterns in a source code. Next,
Copyright (c) 2008 The Institute of Electronics, Information and Communication Engineers
IEICE TRANS. INF. & SYST., VOL.E91-D, NO.4 APRIL 2008
934
The rest of the paper is organized as follows. In the were created during the execution. This information is also
next section, we outline our approach. Section 3 clarifies represented with facts in Prolog.
what information is extracted from a source code. Sections 4 Using extracted information as facts in Prolog, the de-
and 5 present the hierarchical structure of detection condi- tection conditions, described as Prolog programs, are exe-
tions and their representations with Prolog, respectively. In cuted to infer the existence of the class structures satisfying
Sect. 6, we discuss our supporting tool and experiments to the conditions of design patterns. For example, the predi-
show the benefits of our approach. Section 7 discusses the cate observer (C1assRoleList, MethodRoleList)for
detecting Observer pattern is defined as a Prolog program
most relevant work, and Sect. 8 concludes with our contri-
bution and discusses future work. and executed as a query on the facts denoting the extracted
ject receives a method invocation and calls its corresponding •õ We use sans-serifs for pattern names such as Observer, italics
method. However, by applying static analysis, which only for roles in some patterns such as Observer, and teletypes for parts
investigates the text of the source code, we cannot identify of Prolog/Java code such as observer for avoiding ambiguity.
HAYASHI et al.: DESIGN PATTERN DETECTION BY USING META PATTERNS
935
(a)
(b)
Fig. 2 Variations in Observer. For dynamic analysis, we execute the source code and ex-
tract, from the execution records, the occurrences of method
invocations, including constructors' invocations to create in-
stances. The information that we extracted is listed up in
For example, Fig. 2 illustrates two variations in Observer. Table 2.
Based on the description in GoF's book [1], notify() in Sub- The right side of Fig. 3 illustrates an example of ex-
ject is not overridden in its subclasses, i.e. ConcreteSub- tracting the dynamic execution record as a fact in Prolog. A
ject (Fig. 2-a). However, sometimes we modify how to no- part of the facts is shown in the bottom line of the figure,
tify to observers. Then notify() may be overridden by Con- and it begins with the name of the predicate, methEntry.
creteSubjects (Fig. 2-b). To deal with these variations in a This fact denotes an invocation of the method intValue()
unified way, our detection conditions do not simply return in IntNumber class during the execution. We model the be-
true or false value, but the degree of certainty that a code havioral aspect of method invocation with a pair of events;
includes the design pattern. That is to say, we embed the the event of the method being called and the event of the
calculation of fuzziness to our facts in Prolog. The descrip- method being returned. Each event is assigned a number
tion about our fuzzy conditions is included in Sect. 5. corresponding the order in which it was invoked. We will
refer to it as the ordinal execution number of the event. The
3. Extracted Information first parameter 14 in the predicate methEntry stands for the
ordinal execution number of the event of the method just
3.1 Static Information being called in the analyzed execution record. After finish-
ing the execution of the method and returning to the caller
Our approach first extracts static information by source code object, the 15th (second parameter of the predicate) event
analysis. Table 1 lists up the types of the extracted informa- occurred as an event of the method to be returned. It means
tion. For example, for a class, we extract the information that the method invocation intValue () is modeled with a
on inheritance relationship (its superclass), visibility (pub- pair of events, 14 and 15, as shown in Fig. 3. The identifier
lic or private), whether it is an abstract or concrete class, 13 as the third parameter shows the parent method invoca-
the list of methods that it defines, and fields that it has tion event that caused this invocation. The caller (sender of
the method invocation) and the callee (receiver) object are
(instance variable declarations). The left side of Fig. 3 il-
lustrates an example of extracting the static structure of a referenced by 62 and 63, respectively. The class Number is
source code as a fact in Prolog. In the figure, the line start- a superclass of IntNumber, and this method invocation was
ing with the predicate class represents the static informa- done on the object whose class is Number from syntactic
tion of class IntNumber as a fact. The six parameters of the view. For example, the code of this method invocation can
predicate denote its class name ('IntNumber'), superclass be written as follows:
('Number'), methods (['IntNumber. intValue()']), in- Number obj=new IntNumber();
stance variables ([]: a null list because it has no fields), the obj. intValue (obj 1);
information on whether it is an abstract or concrete class
('false': abstract class), and its visibility ('public'), re- Note that the receiver object is bound to the type of Number,
spectively. not IntNumber, and therefore the sixth parameter of the
IEICE TRANS. INF. & SYST., VOL .E91-D, NO.4 APRIL 2008
936
4. Detection Condition by Meta Patterns of ConcreteObserver, and the invocation of notify() leads
to sending update messages to all of the ConcreteObserver
4.1 Meta Patterns instances. By using dynamic binding and overriding mecha-
nisms, the execution of update varies on its implementation
In many cases, customizable parts of design patterns, so- in ConcreteObserver subclass. Note that the ID numbers
called hot spots, are implemented with employing sub- such as ID: 157 attached to the syntactic elements in Fig. 4
classes of an abstract class and overriding its abstract meth- will be used to explain a detection condition in Sect. 5.2.
ods by concrete ones in the subclasses. The designers can Many of design patterns include common structures.
fill the hot spots with the concrete methods that override the Pree classified hot spots of design patterns into seven cate-
abstract one. The method overridden by a concrete method gories from the viewpoint of associations between the class
in a subclass is called hook method, and a template method having hook methods (called a hook class) and a class hav-
includes one or more hook methods to implement its func- ing template methods (template class). He defined these cat-
tions. egories as meta patterns [21]. Meta patterns can be consid-
For example, consider Observer of GoF patterns ered as a basis of design patterns and in fact. For example,
shown in the bottom of Fig. 4. The method update() of the most of GoF patterns include meta patterns. Figure 5 depicts
abstract class Observer has no implementation but defines the class diagrams of the Pree's meta patterns. For example,
its interface, i.e. its name and parameters. When a designer as shown in Fig. 4, Observer includes 1:N Connection meta
uses Observer, he/she composes concrete subclasses of Ob- pattern. In 1: N Connection, there is no inheritance relation-
server class and customizes the pattern by overriding the ab- ship between a template class and a hook class, but they are
stract method update() with a concrete method, i.e. a method connected through an aggregation relationship whose cardi-
having an implementation, in each subclass in order to im- nality is one-to-many. In Fig. 4, Subject class and Observer
plement the functions that he/she wants. The method up- class correspond to a template class and a hook class, re-
date() is a hook method, and the method notify() in Subject spectively.
class which calls update() is a template method. The in- We analyzed 23 GoF design patterns and investigated
stance variable observers in Subject class holds all instances which meta patterns are included in them. Table 3 shows the
HAYASHI et al.: DESIGN PATTERN DETECTION BY USING META PATTERNS
937
939
The following Prolog goal is for detecting the occur- 6.1.1 Design Pattern Detector
rences of Observer in a source code.
time spent on construction in minutes, and pattern names ber of the patterns that were actually used, Static, the num-
that the instances of design patterns in constructed test cases ber of the patterns from static analysis only, and Static-and-
belong to, respectively. At the end we additionally obtained Dynamic, the number of the patterns from both static and
11 test cases, which are enough to find seven types of pat- dynamic analyses, respectively. The rows •gPrecision•h and
terns. Three test cases were obtained by the candidates by •g Recall•h indicate one of the accuracy criteria and are calcu-
using lower threshold of 80, which in turn rescuing informal lated by using a set of detected results M and correct patterns
our detection technique, we provided two types of the de- Precision=|M•¿C|/|C|, Recall=|M•¿C|/|M|.
differences between the detection times allow us to evaluate the order of method invocations and on overriding methods
how much hierarchy of the detection conditions can con- were used in the detection conditions compared to the pred-
tribute to the improvement of time efficiency. Additionally, icates for static structures. They were frequently evaluated
we do not consider quantitative criteria of our experiments during the detection processes.
are especially important because assuming criteria depends Table 7 shows the time taken to detect the, patterns.
on the target program. For example, structurally and be- The results in the case of using the hierarchical structure
haviorally equal patterns, e.g. State and Strategy, are not (our proposed approach) are included in the cells under
correctly detected by our technique, and then quantitative the column of •gHierarchical conditions•h, while the column
results depend on the kinds of patterns included in the tar- •g Non-hierarchical conditions is for the case of the non-
Our experiments were done on a personal computer as non-hierarchical conditions in the case of the program #1,
with Intel Pentium III 1 GHzx2, 512 MB RAM, and Mi- and 6.5 times faster in the program #2. This result shows
crosoft Windows 2000. that our approach of using hierarchical detection conditions
is more effective.
6.3 Experimental Results As for the case shown in Table 6 that the value of •gS•h or
but it took the wrong patterns. In the example of the pro- In the works by Correa et al. and Kramer et al., the
gram #1, the tool detected Chain of Responsibility, but its conditions that should be satisfied on the static structure of a
part was structured with Composite. The tool detected the source code are defined with FOPL. The pattern occurrences
occurrences of State and Strategy at the same part of the are detected by checking the true/false value of the condi-
program #1, but Observer was actually used there. These tions [7], [8]. To provide an easy way to describe the condi-
errors resulted from the incorrect decision of the role of the tions on so-called pattern specifications, Balanyi et al. pro-
methods as constituents of the patterns. The reason why posed an XML based language named DPML [9]. However,
State and Strategy patterns were detected at the same part these approaches only use the static structure of a code such
is their similarity in static structure. In the case of the pro- as inheritance, aggregation, association and method invoca-
gram #2, an occurrence of Template Method was detected tion relationships. Therefore it leads to inaccuracy when de-
but the pattern was not actually used. The detected part was tecting patterns. In particular, in DPML, although Factory
included in the occurrence of Factory Method, which was Method could be completely detected, the other patterns,
correctly detected. This error arises from the fact that Fac- Builder, Proxy, and Visitor, could be detected at correctness
tory Method can be a special form of Template Method. ratios of 14%, 50%, and 0%, respectively. The types of pat-
These errors are not related to the structural and behav- terns that could be detected were limited.
ioral but the semantic differences. To detect these patterns Some approaches use Prolog or FOPL for reasoning
precisely, we are considering to introduce some semantic about structural properties of object-oriented programs, in-
analyses, such as program identification analysis [18], or cluding static method invocation properties [7], [8], [13]-
heuristic-based approach, such as the technique using a ma- [15]. In particular, the tool of Kniesel et al, is based on the
chine learning [6]. Some of the failing reasons of our de- static source code analysis and detects patterns rapidly [17].
tection technique are considered as follows: 1) not followed However, most of them do not consider maintainability of
variations of patterns, 2) irrelevant execution sequences, and pattern descriptions, i.e. programs in Prolog or FOPL ex-
3) pattern that has no structural characteristics. To detect pressions themselves.
non-structural patterns, Prolog representation of a program Some researches use the dynamic data such as exe-
behavior other than executions is required. cution traces [10]. The combination of static and dynamic
analyses seems to be one of the ways to achieve more accu-
7. Related Work rate pattern detection [28]. However, the computation cost
may be a problem of automated detection.
Since formal concept analysis helps us to extract the
There are several researches to automatically detect occur-
abstract static structures that commonly appear, it is applied
rences of design patterns, and some of them use metrics that
to pattern mining as well as pattern detection [11], [12]. Pat-
measure the possibility of the existence of patterns [2]-[5].
tern mining approach strongly requires time efficiency be-
These metrics techniques can be combined with a statistical
cause they are applied to many projects and larger-scale soft-
approach and learning algorithms to improve the accuracy
ware.
of detection [6].
HAYASHI et al.: DESIGN PATTERN DETECTION BY USING META PATTERNS
943
per.
8. Conclusion
References
helped us to hierarchically structure the properties of design [3] T. Muraki and M. Saeki, •gMetrics for applying GoF design patterns
in refactoring processes,•h Proc. 4th International Workshop on Prin-
patterns, and we could reduce the time costs of detection ciples of Software Evolution, pp. 27-36, 2001.
and the size of whole detection conditions. We adopted Pro-
[4] S. Demeyer, S. Ducasse, and O. Nierstrasz, •gFinding refactoring
log to represent the detection condition of design patterns, via change metrics,•h Proc. 15th Conference on Object-Oriented
so that we could easily add and modify the detection condi- Programming, Systems, Languages and Applications, pp. 166-177,
ing the backtracking mechanism and the database function, [5] Y.G. Gueheneuc, H. Sahraoui, and F. Zaidi, •gFingerprinting design
i.e. assert and retract predicates. The database function patterns,•h Proc. 11th Working Conference on Reverse Engineering,
the benefits of our approach. In particular, the result of our ence on Software Maintenance, pp. 295-304, 2005.
experiments shows that all of the pattern occurrences were [7] A.L. Correa, C.M.L. Werner, and G. Zaverucha, •gObject oriented
detected. It means that our hierarchical detection conditions design expertise reuse: An approach based on heuristics, design pat-
terns and anti-patterns,•h Proc. 6th International Conference on Soft-
are accurate enough to detect design patterns. However, it ware Reuse, pp. 336-352, 2000.
was difficult to completely recognize correct patterns, we [8] C. Kramer and L. Prechelt, •gDesign recovery by automated search
consider that some of these types of errors can be reduced for structural design patterns in object-oriented software,•h Proc. 3rd
by adopting a stronger dynamic analysis. To solve the oth- Working Conference on Reverse Engineering, pp. 208-215, 1996.
ers, we should consider the semantic aspect of patterns, not [9] Z. Balanyi and R. Ferenc, •gMining design patterns from C++ source
code,•h Proc. 19th International Conference on Software Mainte-
static structure or the usage of methods and classes during nance, pp. 305-314, 2003.
program execution. One of the differences between State [10] D. Heuzeroth, T. Holl, G. Hogstrom, and W. Lowe, •gAutomatic de-
and Strategy, which have the same structure, is that objects sign pattern detection,•h Proc. 11th International Workshop on Pro-
in State are created as states while objects in Strategy are gram Comprehension, pp. 94-103, 2003.
created as strategies. It means that we should clarify the [11] P. Tonella and G. Antoniol, •gObject oriented design pattern infer-
ence,•h Proc. 15th International Conference on Software Mainte-
semantic differences between State and Strategy.
nance, pp. 230-239, 1999.
We can consider the following issues as future work: [12] G. Arevalo, F. Buchli, and O. Nierstrasz, •gDetecting implicit collab-
oration patterns,•h Proc. 11th Working Conference on Reverse Engi-
Describing detection conditions for more patterns-We neering, pp. 122-131, 2004.
have shown that our hierarchical structure by using [13] R. Wuyts, •gDeclarative reasoning about the structure object-oriented
Pree's meta patterns is effective to describe conditions systems,•h Proc. TOOLS USA '98 Conference, pp. 112-124, 1998.
of design pattern detection. Confirmations to describe [14] J.M. Smith and D. Stotts, •gSPQR: Flexible automated design pattern
conditions for other design patterns, e.g. patterns in extraction from source code,•h Proc. 18th International Conference
tions and communications of patterns because our [16] N. Shi and R.A. Olsson, •gReverse engineering of design patterns
tool's outputs are just facts in Prolog. We consider a from Java source code,•h Proc. 21st International Conference on Au-
Case study of larger scale-Our case study in this paper is [18] J. Dong, D.S. Lad, and Y. Zhao, •gDP-Miner: Design pattern discov-
pre-limited, so we should apply our technique to larger ery using matrix,•h Proc. 14th International Conference and Work-
MICE TRANS. INF. & SYST., VOL.E91-D, NO.4 APRIL 2008
944
shops on the Engineering of Computer-Based Systems, pp. 371-380, Takashi Kobayashi received B. Eng., M.
2007. Eng., and Ph.D. degrees in computer science
[19] J. Dong and Y. Zhao, •gExperiments on design pattern discovery,•h from Tokyo Institute of Technology in 1997,
Proc. 3rd International Workshop on Predictor Models in Software 1999, and 2004, respectively. He is currently a
Engineering, 2007. designated associate professor of Center for Em-
[20] J. Crupi, D. Malks, and D. Alur, Core J2Ee Patterns: Best Practices bedded Computing Systems (NCES), Graduate
and Design Strategies, Prentice Hall PTR, 2001. School of Information Science, Nagoya Uni-
[21] W. Pree, Design Pattern for Object-Oriented Software Development, versity. His research interests include software
Addison-Wesley, 1996. patterns and architecture, software development
[22] Eclipse project, available at http://www.eclipse.org/ method, software configuration management,
[23] Eclipse Java development tools (JDT) subproject, available at Web-services compositions, workflow, multi-
http://www.eclipse.org/jdt/ media information retrieval, and data mining. He is a member of IPSJ,
[24] E. Denti, A. Omicini, and A. Ricci, •gMulti-paradigm Java-Prolog JSSST, DBSJ, and ACM.
integration in tuProlog,•h Science of Computer Programming, vol.57,