

A MODEL TRANSFORMATION APPROACH TO AUTOMATED MODEL EVOLUTION

by

YUEHUA LIN

JEFFREY G. GRAY, COMMITTEE CHAIR
BARRETT BRYANT
ANIRUDDHA GOKHALE
MARJAN MERNIK
CHENGCUI ZHANG

A DISSERTATION

Submitted to the graduate faculty of The University of Alabama at Birmingham, in partial fulfillment of the requirements for the degree of Doctor of Philosophy

BIRMINGHAM, ALABAMA
2007

Copyright by Yuehua Lin 2007

A MODEL TRANSFORMATION APPROACH TO AUTOMATED MODEL EVOLUTION

YUEHUA LIN

COMPUTER AND INFORMATION SCIENCES

ABSTRACT

It is well-known that the inherent complex nature of software systems adds to the challenges of software development. The most notable techniques for addressing the complexity of software development are based on the principles of abstraction, problem decomposition, separation of concerns, and automation. As an emerging paradigm for developing complex software, Model-Driven Engineering (MDE) realizes these principles by raising the specification of software to models, which are at a higher level of abstraction than source code. As models are elevated to first-class artifacts within the software development lifecycle, there is an increasing need for frequent model evolution to explore design alternatives and to address system adaptation issues. However, a system model often grows in size when representing a large-scale, real-world system, which makes the task of evolving system models a manually intensive effort that can be very time consuming and error prone.

Model transformation is a core activity of MDE that converts one or more source models to one or more target models in order to change model structures or to translate models to other software artifacts. The main goal of model transformation is to provide automation in MDE. To reduce the human effort associated with model evolution while minimizing potential errors, the research described in this dissertation has contributed toward a model transformation approach to automated model evolution. A pre-existing model transformation language, called the Embedded Constraint Language (ECL), has been evolved to specify tasks of model evolution, and a model transformation engine, called the Constraint-Specification Aspect Weaver (C-SAW), has been developed to perform model evolution tasks in an automated manner. In particular, the model transformation approach described in this dissertation has been applied to the important issues of model scalability for exploring design alternatives and crosscutting modeling concerns for system adaptation.

Another important issue of model evolution is improving the correctness of model transformations. However, execution-based testing has not been considered for model transformations in current modeling practice. As another contribution of this research, a model transformation testing approach has been investigated to assist in determining the correctness of model transformations by providing a testing engine called M2MUnit to facilitate the execution of model transformation tests. The model transformation testing approach requires a new type of test oracle to compare the actual and expected transformed models. To address the model comparison problem, model differentiation algorithms have been designed and implemented in a tool called DSMDiff to compute the differences between models and visualize the detected model differences.
The C-SAW transformation engine has been applied to support automated evolution of models on several different experimental platforms that represent various domains, such as computational physics, middleware, and mission computing avionics. The research described in this dissertation contributes to the long-term goal of alleviating the increasing complexity of modeling large-scale, complex applications.

DEDICATION

To my husband Jun, my parents, Jiafu and Jinying, and my sisters, Yuerong and Yueqin, for their love, support and sacrifice. To Wendy and Cindy, my connection to the future.

ACKNOWLEDGEMENTS

I am deeply grateful to all the people who helped me to complete this work. First and foremost, I wish to thank my advisor, Dr. Jeff Gray, who has offered me much valuable advice during my Ph.D. study and has inspired me, through an exemplary work ethic, to pursue high-quality research. Through constant support from his DARPA and NSF research grants, I was able to start my thesis research at a very early stage, during the second semester of my first year of doctoral study, which allowed me to focus on this research topic without taking on other non-research duties. With his expertise and research experience in the modeling area, he led me into the research area of model transformation and helped me make steady and significant research progress. Moreover, Dr. Gray provided unbounded opportunities and resources that enabled me to conduct collaborative research with several new colleagues. In addition, he encouraged me to participate in numerous professional activities (e.g., conference and journal reviews), and generously shared with me his experiences in proposal writing. Without his tireless advising efforts and constant support, I could not have matured into an independent researcher.

I also want to show my gratitude to Dr. Barrett Bryant. I still remember how impressed I was by his prompt replies to my questions during my application for graduate study in the CIS department. Since that time, he has offered many intelligent and insightful suggestions to help me adapt to the department's policies, procedures, strategies, and culture.

I would like to thank Dr. Chengcui Zhang for her continuous encouragement. As the only female faculty member in the department, she has been my role model as a successful female researcher. Thank you, Dr. Zhang, for sharing with me your experiences in research and strategies for job searching.

To Dr. Aniruddha Gokhale and Dr. Marjan Mernik, I greatly appreciate your precious time and effort in serving as my committee members. I am grateful for your willingness to assist me in improving this work.

To Janet Sims, Kathy Baier, and John Faulkner, who have been so friendly and helpful during my Ph.D. studies: you have helped to make the department a very pleasant place to work by making me feel at home and at ease with your kind spirit.

I am indebted to my collaborators at Vanderbilt University. Special thanks are due to Dr. Sandeep Neema, Dr. Ted Bapty, and Zoltan Molnar, who helped me overcome technical difficulties during my tool implementation. I would like to thank Dr. Swapna Gokhale at the University of Connecticut for sharing with me her expertise and knowledge in performance analysis and helping me to better understand Stochastic Reward Nets. I also thank Dario Correal at the University of Los Andes, Colombia, for applying one of my research results (the C-SAW model transformation engine) to his thesis research.
Moreover, thanks to Dr. Frédéric Jouault for offering a great course, which introduced me to many new topics and ideas in Model Engineering. My work on model differentiation was improved greatly by addressing his constructive comments.

My student colleagues in the SoftCom lab and the department created a friendly and cheerful working atmosphere that I enjoyed and will surely miss. To Jing Zhang, I cherish the time we spent working together on Dr. Gray's research grants. To Hui Wu, Alex Liu, Faizan Javed and Robert Tairas, I appreciate your time and patience in listening to me and discussing my work, which helped me to overcome difficult moments and made my time here at UAB more fun. To my best friends, Wenyan Gan and Shengjun Zheng, whom I was lucky to meet in middle school, I appreciate the long-term help you have given my parents and younger sisters while I have been away from my hometown.

My strength to complete this work comes from my family. To my Mom and Dad, thank you for giving me the freedom to pursue my life in another country. To my sisters, thank you for taking care of our parents when I was far away from them. To my husband, Jun, thank you for making such a wonderful and sweet home for me and being such a great father to our two lovely girls. Without your unwavering love and support, I cannot imagine how I would have completed this task. The best way I know to show my gratitude is to give my love to you from the bottom of my heart, as you have given yours to me.

Last, I am grateful to the National Science Foundation (NSF), under grant CSR0509342, and the DARPA Program Composition for Embedded Systems (PCES) program, for providing funds to support my research assistantship while working on this dissertation.

TABLE OF CONTENTS

ABSTRACT
DEDICATION
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF LISTINGS
LIST OF ABBREVIATIONS

CHAPTER

1. INTRODUCTION
  1.1. Domain-Specific Modeling (DSM)
  1.2. The Need for Frequent Model Evolution
    1.2.1. System Adaptability through Modeling
    1.2.2. System Scalability through Modeling
  1.3. Key Challenges in Model Evolution
    1.3.1. The Increasing Complexity of Evolving Large-scale System Models
    1.3.2. The Limited Use of Model Transformations
    1.3.3. The Lack of Model Transformation Testing for Improving the Correctness
    1.3.4. Inadequate Support for Model Differentiation
  1.4. Research Goals and Overview
    1.4.1. Model Transformation to Automate Model Evolution
    1.4.2. Model Transformation Testing to Ensure the Correctness
    1.4.3. Model Differentiation Algorithms and Visualization Techniques
    1.4.4. Experimental Evaluation
  1.5. The Structure of the Dissertation

2. BACKGROUND
  2.1. Model-Driven Architecture (MDA)
    2.1.1. Objectives of MDA
    2.1.2. The MDA Vision
  2.2. Basic Concepts of Metamodeling and Model Transformation
    2.2.1. Metamodel, Model and System
    2.2.2. The Four-Layer MOF Metamodeling Architecture
    2.2.3. Model Transformation
  2.3. Supporting Technology and Tools
    2.3.1. Model-Integrated Computing (MIC)
    2.3.2. The Generic Modeling Environment (GME)

3. AUTOMATED MODEL EVOLUTION
  3.1. Challenges and Current Limitations
    3.1.1. Navigation, Selection and Transformation of Models
    3.1.2. Modularization of Crosscutting Modeling Concerns
    3.1.3. The Limitations of Current Techniques
  3.2. The Embedded Constraint Language (ECL)
    3.2.1. ECL Type System
    3.2.2. ECL Operations
    3.2.3. The Strategy and Aspect Constructs
    3.2.4. The Constraint-Specification Aspect Weaver (C-SAW)
    3.2.5. Reducing the Complexities of Transforming GME Models
  3.3. Model Scaling with C-SAW
    3.3.1. Model Scalability
    3.3.2. Desired Characteristics of a Replication Approach
    3.3.3. Existing Approaches to Support Model Replication
    3.3.4. Replication with C-SAW
    3.3.5. Scaling System Integration Modeling Languages (SIML)
  3.4. Aspect Weaving with C-SAW
    3.4.1. The Embedded System Modeling Language (ESML)
    3.4.2. Weaving Concurrency Properties into ESML Models
  3.5. Experimental Validation
    3.5.1. Modeling Artifacts Available for Experimental Validation
    3.5.2. Evaluation Metrics for Project Assessment
    3.5.3. Experimental Results
  3.6. Related Work
    3.6.1. Current Model Transformation Techniques and Languages
    3.6.2. Related Work on Model Scalability
  3.7. Conclusion

4. DSMDIFF: ALGORITHMS AND TOOL SUPPORT FOR MODEL DIFFERENTIATION
  4.1. Motivation and Introduction
  4.2. Problem Definition and Challenges
    4.2.1. Information Analysis of Domain-Specific Models
    4.2.2. Formalizing a Model Representation as a Graph
    4.2.3. Model Differences and Mappings
  4.3. Model Differentiation Algorithms
    4.3.1. Detection of Model Mappings
    4.3.2. Detection of Model Differences
    4.3.3. Depth-First Detection
  4.4. Visualization of Model Differences
  4.5. Evaluation and Discussions
    4.5.1. Algorithm Analysis
    4.5.2. Limitations and Improvement
  4.6. Related Work
    4.6.1. Model Differentiation Algorithms
    4.6.2. Visualization Techniques for Model Differences
  4.7. Conclusion

5. MODEL TRANSFORMATION TESTING
  5.1. Motivation
    5.1.1. The Need to Ensure the Correctness of Model Transformation
    5.1.2. The Need for Model Transformation Testing
  5.2. A Framework of Model Transformation Testing
    5.2.1. An Overview
    5.2.2. Model Transformation Testing Engine: M2MUnit
  5.3. Case Study
    5.3.1. Overview of the Test Case
    5.3.2. Execution of the Test Case
    5.3.3. Correction of the Model Transformation Specification
  5.4. Related Work
  5.5. Conclusion

6. FUTURE WORK
  6.1. Model Transformation by Example (MTBE)
  6.2. Toward a Complete Model Transformation Testing Framework
  6.3. Model Transformation Debugging

7. CONCLUSIONS
  7.1. The C-SAW Model Transformation Approach
  7.2. Model Transformation Testing
  7.3. Differencing Algorithms and Tools for Domain-Specific Models
  7.4. Validation of Research Results

LIST OF REFERENCES

APPENDIX
  A. EMBEDDED CONSTRAINT LANGUAGE GRAMMAR
  B. OPERATIONS OF THE EMBEDDED CONSTRAINT LANGUAGE
  C. ADDITIONAL CASE STUDIES ON MODEL SCALABILITY
    C.1. Scaling Stochastic Reward Net Modeling Language (SRNML)
      C.1.1. Scalability Issues in SRNML
      C.1.2. ECL Transformation to Scale SRNML
    C.2. Scaling Event QoS Aspect Language (EQAL)
      C.2.1. Scalability Issues in EQAL
      C.2.2. ECL Transformation to Scale EQAL

LIST OF TABLES

C-1 Enabling guard equations for Figure C-1

LIST OF FIGURES

1-1 Metamodel, models and model transformation
1-2 An overview of the topics discussed in this dissertation
2-1 The key concepts of the MDA
2-2 The relation between metamodel, model and system
2-3 The MOF four-tier metamodeling architecture
2-4 Generalized transformation pattern
2-5 Metamodels, models and model interpreters (compilers) in GME
2-6 The state machine metamodel
2-7 The ATM instance model
3-1 Modularization of crosscutting model evolution concerns
3-2 Overview of C-SAW
3-3 Replication as an intermediate stage of model compilation (A1)
3-4 Replication as a domain-specific model compiler (A2)
3-5 Replication using the model transformation engine C-SAW (A3)
3-6 Visual example of SIML scalability
3-7 A subset of a model hierarchy with crosscutting model properties
3-8 Internal representation of a Bold Stroke component
3-9 The transformed Bold Stroke component model
4-1 A GME model and its hierarchical structure
4-2 Visualization of model differences
4-3 A nondeterministic case in which DSMDiff may produce an incorrect result
5-1 The model transformation testing framework
5-2 The model transformation testing engine M2MUnit
5-3 The input model prior to model transformation
5-4 The expected model for model transformation testing
5-5 The output model after model transformation
5-6 A summary of the detected differences
5-7 Visualization of the detected differences
C-1 Replication of Reactor event types (from 2 to 4 event types)
C-2 Illustration of replication in EQAL

LIST OF LISTINGS

3-1 Examples of ECL aspect and strategy
3-2 Example C++ code to find a model from the root folder
3-3 ECL specification for SIML scalability
3-4 ECL specification to add concurrency atoms to ESML models
4-1 Finding the candidate of maximal edge similarity
4-2 Computing edge similarity of a candidate
4-3 Finding signature mappings and the Delete differences
4-4 DSMDiff algorithm
5-1 The to-be-tested ECL specification
5-2 The corrected ECL specification
C-1 ECL transformation to perform the first subtask of scaling a snapshot
C-2 ECL transformation to perform the second subtask of scaling a snapshot
C-3 ECL fragment to perform the first step of replication in EQAL

LIST OF ABBREVIATIONS

AMMA    Atlas Model Management Architecture
ANTLR   Another Tool for Language Recognition
AOM     Aspect-Oriented Modeling
AOP     Aspect-Oriented Programming
AOSD    Aspect-Oriented Software Development
API     Application Program Interface
AST     Abstract Syntax Tree
ATL     Atlas Transformation Language
CASE    Computer-Aided Software Engineering
CIAO    Component-Integrated ACE ORB
CORBA   Common Object Request Broker Architecture
C-SAW   Constraint-Specification Aspect Weaver
CWM     Common Warehouse Metamodel
DSL     Domain-Specific Language
DSM     Domain-Specific Modeling
DSML    Domain-Specific Modeling Language
DRE     Distributed Real-Time and Embedded
EBNF    Extended Backus-Naur Form
ECL     Embedded Constraint Language
EMF     Eclipse Modeling Framework
EQAL    Event Quality Aspect Language
ESML    Embedded Systems Modeling Language
GME     Generic Modeling Environment
GPL     General Programming Language
GReAT   Graph Rewriting and Transformation
IP      Internet Protocol
LHS     Left-Hand Side
MDA     Model-Driven Architecture
MDE     Model-Driven Engineering
MDPT    Model-Driven Program Transformation
MCL     Multigraph Constraint Language
MIC     Model-Integrated Computing
MOF     Meta Object Facility
MTBE    Model Transformation by Example
NP      Non-deterministic Polynomial time
OCL     Object Constraint Language
OMG     Object Management Group
PBE     Programming by Example
PIM     Platform-Independent Model
PICML   Platform-Independent Component Modeling Language
PLA     Production-Line Architecture
PSM     Platform-Specific Model
QBE     Query by Example
QoS     Quality of Service
RHS     Right-Hand Side
RTES    Real-Time and Embedded Systems
SRN     Stochastic Reward Net
SRNML   Stochastic Reward Net Modeling Language
QVT     Query/View/Transformations
SIML    System Integrated Modeling Language
XMI     XML Metadata Interchange
XML     Extensible Markup Language
XSLT    Extensible Stylesheet Transformations
YATL    Yet Another Transformation Language
UAV     Unmanned Aerial Vehicle
UDP     User Datagram Protocol
UML     Unified Modeling Language
UUID    Universally Unique Identifier
VB      Visual Basic

CHAPTER 1

INTRODUCTION

It is well-known that the inherent complex nature of software systems increases the challenges of software development [Brooks, 95].
The most notable techniques for addressing the complexity of software development are based on the principles of abstraction, problem decomposition, information hiding, separation of concerns and automation [Dijkstra, 76], [Parnas, 72]. Since the inception of the software industry, various efforts in software research and practice have been made to provide abstractions that shield software developers from the complexity of software development. Computer-Aided Software Engineering (CASE) was a prominent effort that focused on developing software methods and modeling tools that enabled developers to express their designs in terms of general-purpose graphical programming representations, such as state machines, structure diagrams, and dataflow diagrams [Schmidt, 06]. Autoflow, a flowchart modeling tool developed in 1964 by Martin Goetz of Applied Data Research, was the first software product sold independently of a hardware package [Johnson, 98]. Although CASE tools have historical relevance in terms of offering some productivity benefits, several limitations have narrowed their potential [Gray et al., 07].

The primary drawback of most CASE tools was that they were constrained to work with a fixed notation, which forced end-users to adopt a language prescribed by the tool vendors. Such a universal language may not be suitable in all cases for an end-user's distinct needs for solving problems in their domain. As observed by Schmidt, "As a result, CASE had relatively little impact on commercial software development during the 1980s and 1990s, focusing primarily on a few domains, such as telecom call processing, that mapped nicely onto state machine representations" [Schmidt, 06]. Another goal of CASE was to automate software development by synthesizing implementations from the graphical design representations. However, a major hindrance to achieving such automation was the lack of integrated transformation technologies for transforming graphical representations at a high level of abstraction (e.g., design models) into low-level representations (e.g., implementation code). Consequently, many CASE systems were restricted to a few specific application domains and were unable to satisfy the needs of developing production-scale systems across various application domains [Schmidt, 06].

There also has been significant effort toward raising the abstraction of programming languages to shield developers from the complexity of both language and platform technologies. For example, early programming languages such as assembly language provide an abstraction over machine code, and today object-oriented languages such as C++ and Java introduce additional abstractions (e.g., abstract data types and objects) [Hailpern and Tarr, 06]. However, advances in programming languages still cannot keep pace with the fast-growing complexity of platforms. For example, popular middleware platforms, such as J2EE, .NET and CORBA, contain thousands of classes and methods with many intricate dependencies. Such middleware evolves rapidly, which requires considerable manual effort to program and port application code to newer platforms when using programming languages [Schmidt, 06]. Also, programming languages are ill-suited to describing system-wide, non-functional concerns such as system deployment, configuration and quality assurance, because they primarily aim to specify the functional aspects of a system.
To address the challenges in current software development, such as the increased complexity of products, shortened development cycles and heightened expectations of quality [Hailpern and Tarr, 06], there is an increasing need for new languages and technologies that can express concepts effectively for a specific domain. New methodologies are also needed for decomposing a system into various but consistent aspects, and for enabling transformation and composition between the various artifacts of the software development lifecycle within a unified infrastructure. To meet these challenges, Model-Driven Engineering (MDE) [Kent, 02] is an emerging approach to software development that centers on higher-level specifications of programs in Domain-Specific Modeling Languages (DSMLs), greater degrees of automation in software development, and the increased use of standards [Schmidt, 06]. In practice, Domain-Specific Modeling (DSM) is a methodology to realize the vision of MDE [Gray et al., 07].

1.1 Domain-Specific Modeling (DSM)

MDE represents a design approach that enables description of the essential characteristics of a problem in a manner that is decoupled from the details of a specific solution space (e.g., dependence on specific middleware or programming languages). Applying lessons learned from earlier efforts at developing higher-level platform and language abstractions, a movement within the current MDE community is advancing the concept of customizable modeling languages, in opposition to a universal, general-purpose language, such as the Unified Modeling Language (UML) [Booch et al., 99], that attempts to offer solutions for a broad category of users. This newer breed of tools enables DSM, an MDE methodology that generates customized modeling languages and environments for a narrow domain of interest. In the past, abstraction was improved as programming languages evolved toward higher levels of specification. DSM takes a different approach: it raises the level of abstraction while at the same time narrowing down the design space, often to a single range of products for a single domain [Gray et al., 07]. When applying DSM, the language follows the domain abstractions and semantics, allowing developers to perceive themselves as working directly with domain concepts of the problem space instead of code concepts of the solution space. Domain-specific models are subsequently transformed into executable code by a sequence of model transformations to provide automation support for software development. As shown in Figure 1-1, DSM technologies combine the following:

• Domain-specific modeling languages, "whose type systems formalize the application structure, behavior, and requirements within particular domains" [Schmidt, 06]. A metamodel formally defines the abstract syntax and static semantics of a DSML by specifying a set of modeling elements and their valid relationships for that specific domain. A model is an instance of the metamodel that represents a particular part of a real system. Developers use DSMLs to build domain-specific models that specify applications and their design intents [Gray et al., 07].

• Model transformations, which play a key role in MDE by converting models to other software artifacts. They are used for refining models to capture more system details or for synthesizing various types of artifacts from models. For example, models can be synthesized to source code, simulation input and XML deployment descriptions. Model transformation can be automated to reduce human effort and potential errors during software development.
Figure 1-1 - Metamodel, models and model transformation

The DSM philosophy of narrowly defined modeling languages can be contrasted with large standardized modeling languages, such as the UML, which are fixed and whose size and complexity [Gîrba and Ducasse, 06] provide abstractions that may not be needed in every domain, adding to the confusion of domain experts. Moreover, using notations that relate directly to a familiar domain not only helps flatten learning curves but also facilitates communication among a broader range of experts, such as domain experts, system engineers and software architects. In addition, the ability of DSM to synthesize high-level models into low-level implementation artifacts simplifies activities in software development such as developing, testing and debugging. Most recently, the "ModelWare" principle (i.e., everything is a model) [Kurtev et al., 06] has been adopted in the MDE community to provide a unified infrastructure that integrates various artifacts and enables transformations between them during the software development lifecycle.

The key challenge in applying DSM is to define useful standards that enable tools and models to work together portably and effectively [Schmidt, 06]. Existing de facto standards include the Object Management Group's Model-Driven Architecture (MDA) [MDA, 07], Query/View/Transformations (QVT) [QVT, 07] and the Meta Object Facility (MOF) [MOF, 07]. These standards can also form the basis for domain-specific modeling tools. Existing metamodeling infrastructures and tools include the Generic Modeling Environment (GME) [Lédeczi et al., 01], the ATLAS Model Management Architecture (AMMA) [Kurtev et al., 06], Microsoft's DSL tools [Microsoft, 05], [Cook et al., 07], MetaEdit+ [MetaCase, 07], and the Eclipse Modeling Framework (EMF) [Budinsky et al., 04]. Initial success stories from industry adoption of DSM have been reported, with perhaps the most noted being Saturn's multi-million dollar cost savings associated with timelier reconfiguration of an automotive assembly line driven by domain-specific models [Long et al., 98]. The newly created DSM Forum [DSM Forum, 07] serves as a repository of several dozen successful projects (mostly from industry, such as Nokia, Dupont, Honeywell, and NASA) that have adopted DSM.
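To make the metamodel/model relationship sketched in Figure 1-1 concrete, the following is a minimal illustrative sketch of a metamodel viewed as a set of valid element types plus connection rules, against which a model instance can be checked. It is hypothetical Python pseudocode of the concept only; the names do not reflect the APIs of GME, MOF, or any other tool named in this dissertation.

    # Illustrative sketch only: a metamodel as allowed element types plus
    # valid connection rules, and a conformance check for a model instance.
    # All names here are hypothetical, not taken from GME, MOF, or C-SAW.

    metamodel = {
        "types": {"Router", "Host"},
        # valid relationships as (source type, target type) pairs
        "connections": {("Router", "Router"), ("Router", "Host")},
    }

    model = {
        "elements": {"r1": "Router", "r2": "Router", "h1": "Host"},
        "connections": [("r1", "r2"), ("r2", "h1")],
    }

    def conforms(model, metamodel):
        """Check that every element and connection is licensed by the metamodel."""
        for name, mtype in model["elements"].items():
            if mtype not in metamodel["types"]:
                return False, f"{name} has unknown type {mtype}"
        for src, dst in model["connections"]:
            pair = (model["elements"][src], model["elements"][dst])
            if pair not in metamodel["connections"]:
                return False, f"connection {src}->{dst} violates the metamodel"
        return True, "model conforms to its metamodel"

    print(conforms(model, metamodel))  # (True, 'model conforms to its metamodel')

A DSML's static semantics can include far richer well-formedness rules than this, but the essential conformance relation between a model and its metamodel is the one sketched here.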
1.2 The Need for Frequent Model Evolution

The goal of MDE is to raise the level of abstraction in program specification and to increase automation in software development, in order to simplify and integrate the various activities and tasks that comprise the software development lifecycle. In MDE, models are elevated to first-class artifacts of software development and are used in various activities such as software design, implementation, testing and evolution. A powerful justification for the use of models concerns the flexibility of system analysis, i.e., system analysis can be performed while exploring various design alternatives. This is particularly true for distributed real-time and embedded (DRE) systems, which have many properties that are often in conflict (e.g., battery consumption versus memory size) and for which the analysis of system properties is often best provided at higher levels of abstraction [Hatcliff et al., 03].

Also, when developers apply MDE tools to model large-scale systems containing thousands of elements, designers must be able to examine various design alternatives quickly and easily among myriad and diverse configuration possibilities. Ideally, a tool should simulate each new design configuration so that designers can rapidly determine how some configuration aspect, such as a communication protocol, affects an observed property, such as throughput. To provide support for that degree of design exploration, frequent change evolution is required within system models [Gray et al., 06].

Although various types of changes can be made to models, there are two categories of changes that designers often make manually, typically with poor results. The first category comprises changes that crosscut the model representation's hierarchy in order to adapt the modeled system to new requirements or environments. The second category of change evolution involves scaling up parts of the model, a particular concern in the design of large-scale distributed, real-time and embedded systems, which can have thousands of coarse-grained components.

Model transformation provides automation support in MDE, not only for translating models into other artifacts (i.e., exogenous transformation) but also for changing model structures (i.e., endogenous transformation). Applying model transformation to automate model evolution can reduce human effort and potential errors. The research described in this dissertation concentrates on developing an automated model transformation approach to address two important system properties at the modeling level, system adaptability and scalability, each corresponding to one category of model evolution, as discussed in the following sections.

1.2.1 System Adaptability through Modeling

Adaptability is emerging as a critical enabling capability for many applications, particularly for environment monitoring, disaster management and other applications deployed in dynamically changing environments. Such applications have to reconfigure themselves according to fluctuations in their environment. A longstanding challenge of software development is to construct software that is easily adapted to changing requirements and new environments. Software production-line architectures (PLAs) are a promising technology for the industrialization of software development that focuses on the automated assembly and customization of domain-specific components [Clements and Northrop, 01]; this requires the ability to rapidly configure, adapt and assemble independent components to produce families of similar but distinct systems [Deng et al., 08]. As demand for software adaptability increases, novel strategies and methodologies are needed to support the requisite adaptations across different software artifacts (e.g., models, source code, test cases, documentation) [Batory et al., 04].

In modeling, many requirements changes must be made across a model hierarchy; these are called crosscutting modeling concerns [Gray et al., 01]. An example is the effect of fluctuating bandwidth on the quality of service across avionics components that must display a real-time video stream. To evaluate such a change, the designer must manually traverse the model hierarchy by recursively clicking on each submodel. Another example is the Quality of Service (QoS) constraints of Distributed Real-Time and Embedded (DRE) systems.
The development of DRE systems is often a challenging task due to conflicting QoS constraints that must be explored as trade-offs among a series of alternative design decisions. The ability to model a set of possible design alternatives, and to analyze and simulate the execution of the representative model, offers great assistance toward arriving at the correct set of QoS parameters needed to satisfy the requirements of a specific DRE system. Typically, the QoS specifications are also distributed throughout DRE system models, which necessitates intensive effort to make changes manually.

1.2.2 System Scalability through Modeling

Scalability is a desirable property of a system, a network, or a process, indicating its ability either to handle growing amounts of work in a graceful manner or to be readily enlarged [Bondi, 00] (e.g., new system resources may be added, or new types of objects may need to be handled). The corresponding form of design exploration for system scalability involves experimenting with model structures by expanding different portions of models and analyzing the effect on scalability. For example, a network engineer may create various models to study the effect on network performance when moving from two routers to eight routers, and then to several dozen routers. This requires the ability to build a complex model from a base model by replicating its elements or substructures and adding the necessary connections [Lin et al., 07-a]; that is, this type of change requires creating model elements and connections. Obviously, scaling a base model of a few elements to thousands of new elements requires a staggering amount of clicking and typing within the modeling tool. The ad hoc nature of this process causes errors, such as forgetting to make a connection between two replicated elements. Thus, manual scaling affects not only modeling performance, but also the correctness of the model representation.

As discussed above, supporting system adaptability and scalability requires extensive support from the host modeling tool to enable rapid change evolution within the model representation. Several challenges must be addressed in order to improve the productivity and quality of model evolution; these are discussed in the next section.

1.3 Key Challenges in Model Evolution

As discussed in Section 1.2, with the expanded focus of software and system models has come the urgent need to manage complex change evolution within the model representation [Sendall and Kozaczynski, 03]. In current MDE practice, as the size of system models expands, the limits of that practice are being pushed to address increasingly complex model management issues pertaining to change evolution within the model representation, requiring new methodologies and best practices. There is also an increasing need to apply software engineering principles and processes to general modeling practice to assist in the systematic development of models and model transformations. In summary, the research described in this dissertation focuses on the challenges in current modeling practice outlined in the following subsections.

1.3.1 The Increasing Complexity of Evolving Large-scale System Models

To support frequent model evolution, changes to models need to be made quickly and correctly. Model evolution tasks have become human-intensive because of the growing size of system models and their deeply nested structures, which stem from the inherent complexity of large-scale software systems.
From our personal experience, models can have multiple thousands of coarse-grained components (others have reported similar experiences; see [Johann and Egyed, 04]). Modeling these components using traditional manual model creation techniques and tools can approach the limits of the effective capability of humans. In particular, the process of modeling a large DRE system with a DSML, or with a tool like MatLab [Matlab, 07], is different from traditional class-based UML modeling. In DRE systems modeling, the models consist of instances of all entities in the system, which can number into several thousand instances from a set of types defined in a metamodel (e.g., thousands of individual instantiations of a sensor type in a large sensor network model). Traditional UML models (e.g., UML class diagrams) are typically not concerned with the same type of instance-level focus, but instead specify the entities of a system and their relationships at design time (such as classes). This is not to imply that UML-based models do not have change evolution issues such as scalability (in fact, the UML community has recognized the importance of specifying instance models at a large scale [Cuccuru et al., 05]), but the problem is more acute with system models built with DSMLs. The main reason is that system models are usually sent to an analysis tool (e.g., a simulation tool) to explore system properties such as performance and security. Such models need to capture a system by including the instances of all the entities (such as objects) that occur at runtime, which leads to their larger size and nested hierarchy [Lin et al., 07-a].

Due to the growing size and complicated structure of a large-scale system model, a manual process for making correct changes can be laborious, error-prone and time consuming. For example, to examine the effect of scalability on a system, the size of a system model (e.g., the number of participant model elements and connections) needs to be increased or decreased frequently. The challenges of scalability affect the productivity of the modeling process, as well as the correctness of the model representation. As an example, consider a base model consisting of a few modeling elements and their corresponding connections. To scale this base model to hundreds, or even thousands, of duplicated elements would require a considerable amount of mouse clicking and typing within the associated modeling tool [Gray et al., 06]. Furthermore, the tedious nature of manually replicating a base model may also be the source of many errors (e.g., forgetting to make a connection between two replicated modeling elements). Therefore, a manual approach to model evolution significantly hampers the ability to explore design alternatives within a model (e.g., after scaling a model to 800 modeling elements, it may be desired to scale back to only 500 elements, and then back up to 700 elements, in order to understand the impact of system size). An observation from the research described in this dissertation is that the complexities of model evolution must be tackled at a higher level of abstraction through automation, with a language tailored to the task of model transformation.

1.3.2 The Limited Use of Model Transformations

Model transformation has the potential to provide intuitive notations at a high level of abstraction for defining tasks of model evolution. However, this new role of model transformation has not been addressed fully by current modeling research and practice.
For example, transformations in software modeling and design are mostly performed between modeling languages representing different domains. The role of transformation within the same language has not been fully explored as a new application of stepwise refinement. Such potential roles for transformations include the following:

1. model optimizations: transforming a given model to an equivalent one that is optimized, in the sense that a given metric or design rule is respected;
2. consistency checks: transforming different viewpoints of the same model into a common notation for purposes of comparison;
3. automation of parts of the design process: transformations are used in developing and managing design artifacts such as models;
4. model scalability: automation of model changes that scale a base model to a larger configuration [Lin et al., 07-a];
5. modularization of crosscutting modeling concerns: properties of a model may be scattered across the modeling hierarchy, and model transformations may assist in modularizing the specification of such properties in a manner that supports rapid exploration of design alternatives [Gray et al., 06].

Within a complex model evolution process, there are many issues that can be addressed by automated model transformation. The research described in this dissertation presents the benefits that model transformation offers in capturing crosscutting model properties and in dealing with the difficulties of model scalability. Besides model scalability and the modularization of crosscutting modeling concerns, another scenario is building implementation models (e.g., deployment models) from design models (e.g., component models) [Balasubramanian et al., 06]. Such tasks can be performed rapidly and correctly in an automated fashion using the approach presented in Chapter 3.

1.3.3 The Lack of Model Transformation Testing for Improving Correctness

One of the key issues in software engineering is to ensure that the delivered product meets its specification. In traditional software development, testing [Gelperin and Hetzel, 88], [Zhu et al., 97] and debugging [Rosenberg, 96], [Zellweger, 84] have proven to be vital techniques for improving the quality and maintainability of software systems. However, such processes are applied heavily at the source code level and are less integrated into modeling. In addition to formal methods (e.g., model checking and theorem proving) for verifying models and transformations, testing, a widely used technique and best practice in software engineering, can serve as an engineering solution to validate model transformations.

Model transformation specifications are used to define tasks of model evolution. A transformation specification, like the source code of an implementation, is written by humans and is susceptible to errors. Additionally, a transformation specification may be reusable across similar domains. Therefore, it is essential to test the correctness of a transformation specification (i.e., its consistency and completeness, as validated against the model transformation requirements) before it is applied to a collection of source models. Consequently, within a model transformation infrastructure, it is vital to provide well-established software engineering techniques such as testing for validating model transformations [Küster, 06]. Otherwise, the correctness of the transformation may always be suspect, which hampers confidence in reusing the transformation. A contribution to model transformation testing is introduced in Chapter 5.
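To preview what such an execution-based test looks like, the following is a small illustrative sketch in the style of a unit test: a toy endogenous transformation is executed on an input model, and a toy differencing oracle compares the actual output against an expected model. The models, the transformation, and the oracle here are hypothetical Python stand-ins, not the actual interfaces of C-SAW or M2MUnit described in Chapters 3 and 5.

    # Illustrative sketch of execution-based model transformation testing.
    # Models are plain dicts; the transformation and the differencing
    # oracle are toy stand-ins, not C-SAW or M2MUnit interfaces.

    def add_logging(model):
        """A toy endogenous transformation: add a 'logging' attribute
        to every component in the model."""
        return {name: dict(attrs, logging="enabled")
                for name, attrs in model.items()}

    def diff_models(actual, expected):
        """A toy oracle: report components whose attributes differ."""
        keys = set(actual) | set(expected)
        return [k for k in keys if actual.get(k) != expected.get(k)]

    def test_add_logging():
        input_model = {"Sensor1": {"rate": "20Hz"}, "GPS": {"rate": "1Hz"}}
        expected_model = {"Sensor1": {"rate": "20Hz", "logging": "enabled"},
                          "GPS": {"rate": "1Hz", "logging": "enabled"}}
        actual_model = add_logging(input_model)
        differences = diff_models(actual_model, expected_model)
        # The test passes only if the output matches the expected model;
        # any reported difference indicates an error in the transformation.
        assert not differences, f"differences detected: {differences}"

    test_add_logging()  # passes silently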
1.3.4 Inadequate Support for Model Differentiation

Algorithms and supporting tools for model differentiation (i.e., finding the mappings and differences between two models, also called model differencing or model comparison) can benefit various modeling practices, such as model consistency checking, model versioning and model refactoring. In the transformation testing framework, model differencing techniques are crucial to realizing the vision of execution-based testing of model transformations by assisting in comparing the expected result (i.e., the expected model) and the actual output (i.e., the output model).

Currently, many tools are available for differentiating text files (e.g., code and documentation). However, these tools operate under a linear, file-based paradigm that is purely textual, whereas models are often rendered structurally in a tree or graphical notation. Thus, there is an abstraction mismatch between currently available version control tools and the hierarchical nature of models. To address this problem, there have been only a few research efforts on UML model comparison [Ohst et al., 03], [Xing and Stroulia, 05] and metamodel-independent comparison [Cicchetti et al., 07]. However, no work has been reported on the comparison of domain-specific models, aside from [Lin et al., 07-a].

Visualization of the result of model comparison (i.e., structural model differences) is also critical to assist in comprehending the mappings and differences between two models. To help communicate the comparison results, visualization techniques are needed to highlight model differences intuitively within a host modeling environment. For example, graphical symbols and colors can be used to indicate whether a model element is missing or redundant; additionally, these symbols and colors are needed to decorate properties inside models. Finally, a navigation facility is needed to support browsing model differences efficiently. Such techniques are essential to understanding the results of model comparison. The details of a novel model differencing algorithm with visualization tool support are presented in Chapter 4.

1.4 Research Goals and Overview

To address the increasing complexity of modeling large-scale software systems and to improve the productivity and quality of model evolution, the main goal of the research described in this dissertation is to provide a high-level model transformation approach and associated tools for the rapid, automated evolution of large-scale system models. To assist in determining the correctness of model transformations, this research also investigates the testing of model transformations. The model transformation testing project has led to an exploration of model comparison, which is needed to determine the differences between an expected model and the actual result. Figure 1-2 shows an integrated view of this research; an overview is given in the following sections.

1.4.1 Model Transformation to Automate Model Evolution

To address the complexity of frequent model evolution, a contribution of this research is a model transformation approach to automated model evolution. A pre-existing model transformation language, called the Embedded Constraint Language (ECL), has been evolved to specify tasks of model evolution.
The ECL has been implemented in a model transformation engine, called the Constraint-Specification Aspect Weaver (C-SAW), which performs model evolution tasks in an automated manner. In particular, the model transformation approach described in this dissertation has been applied to the important issues of model scalability for exploring design alternatives and crosscutting modeling concerns for system adaptation.

Figure 1-2 - An overview of the topics discussed in this dissertation

By enabling model developers to work at a higher level of abstraction, ECL serves as a small but powerful language for defining tasks of model evolution. By providing automation to execute ECL specifications, C-SAW aims to reduce the complexity that is inherent in the challenge problems of model evolution.

1.4.2 Model Transformation Testing to Ensure the Correctness

Another important issue of model transformation is ensuring its correctness. To improve the quality of C-SAW transformations, a model transformation testing approach has been investigated to improve the accuracy of transformation results, in which a model transformation testing engine called M2MUnit provides support to execute test cases with the intent of revealing errors in the transformation specification. The basic functionality includes execution of the transformations, comparison of the actual output model and the expected model, and visualization of the test results. In contrast to classical software testing tools, determining whether a model transformation test passes or fails requires comparing the actual output model with the expected model, which necessitates model differencing algorithms and visualization. If there are no differences between the actual output and expected models, it can be inferred that the model transformation is correct with respect to the given test specification. If there are differences between the output and expected models, the errors in the transformation specification need to be isolated and removed. By providing a unit testing approach to test ECL transformations, M2MUnit aims to reduce the human effort in verifying the correctness of model evolution.

1.4.3 Model Differentiation Algorithms and Visualization Techniques

Driven by the need for model comparison in model transformation testing, model differencing algorithms and an associated tool called DSMDiff have been developed to compute the differences between models. In addition to model transformation testing, model differencing techniques are essential to many model development and management practices, such as model versioning.

Theoretically, the generic model comparison problem is similar to the graph isomorphism problem, which can be defined as finding the correspondence between two given graphs and is known to be in NP [Garey and Johnson, 79]. The computational complexity of graph matching algorithms is the major hindrance to applying them to practical applications in modeling. To provide efficient and reliable model differencing algorithms, the research described in this dissertation provides a solution that uses the syntax of modeling languages to help resolve conflicts during model matching, combined with structural comparison to determine whether two models are equivalent. In general, DSMDiff treats two models as hierarchical graphs, starting the comparison at the top level of the two containment hierarchies and then continuing into the child submodels, as sketched below.
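The following is a simplified sketch of that top-down traversal, assuming model elements carry a signature (name and type) and optional child submodels; the difference kinds are labeled Delete and New here only for illustration. It conveys the shape of the algorithm only; the actual DSMDiff algorithms, including edge-similarity computation and conflict handling, are presented in Chapter 4.

    # Simplified sketch of a hierarchical, top-down model differencing pass.
    # Elements are matched by signature (name, type); matched submodels are
    # then compared recursively. This omits DSMDiff's edge-similarity and
    # conflict-resolution steps described in Chapter 4.

    class Element:
        def __init__(self, name, etype, children=None):
            self.name, self.etype = name, etype
            self.children = children or []   # child submodel elements

        def signature(self):
            return (self.name, self.etype)

    def diff(left, right, path=""):
        """Return difference records between two lists of elements."""
        differences = []
        left_map = {e.signature(): e for e in left}
        right_map = {e.signature(): e for e in right}
        for sig, e in left_map.items():
            if sig not in right_map:
                differences.append(("Delete", path + "/" + e.name))
            else:
                # Matched at this level: recurse into the child submodels.
                differences += diff(e.children, right_map[sig].children,
                                    path + "/" + e.name)
        for sig, e in right_map.items():
            if sig not in left_map:
                differences.append(("New", path + "/" + e.name))
        return differences

    base = [Element("Net", "Model", [Element("r1", "Router")])]
    scaled = [Element("Net", "Model", [Element("r1", "Router"),
                                       Element("r2", "Router")])]
    print(diff(base, scaled))  # [('New', '/Net/r2')]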
Visualization of the result of model differentiation (i.e., structural model differences) is critical to assist in comprehending the mappings and differences between two models. To help communicate the discovered model differences, a tree browser has been constructed to indicate the possible kinds of model differences (e.g., a missing element, an extra element, or an element that has different values for some properties) with graphical symbols and colors.

1.4.4 Experimental Validation

This research provides a model transformation approach to automated model evolution that considers the additional issue of testing to assist in determining the correctness of model transformations. The contribution has been evaluated to determine the degree to which the developed approach achieves a significant increase in productivity and accuracy in model evolution. The modeling artifacts available for experimental validation are primarily from two sources. One source is Vanderbilt University, a collaborator on much of the C-SAW research, which has provided multiple modeling artifacts for experimentation. The other source is the Escher repository [Escher, 07], which makes modeling artifacts developed from DARPA and NSF projects available for experimentation. C-SAW has been applied to several model evolution projects for experimental evaluation. The feedback from these case studies has been used to evaluate the modeling effectiveness of C-SAW (e.g., reduced time, increased accuracy and usability) and has demonstrated C-SAW as an effective tool to automate model evolution in various domains for specific types of transformations. Moreover, a case study is provided to highlight the benefit of the M2MUnit testing engine in detecting errors in model transformation. Analytical evaluation has been conducted to assess the performance and relative merit of the DSMDiff algorithms and tools.

1.5 The Structure of the Dissertation

To conclude, the major contributions of the thesis work include: 1) automating model evolution by offering ECL as a high-level transformation language to specify model evolution and providing the C-SAW model transformation engine to execute ECL specifications; 2) applying software engineering practices such as testing to model transformations in order to ensure the correctness of model evolution; and 3) developing model differentiation algorithms and an associated tool (DSMDiff) for computing the mappings and differences between domain-specific models. The thesis research aims to address the difficult problems in modeling complex, large-scale software systems by providing support for evolving models rapidly and correctly.

The remainder of this dissertation is structured as follows: Chapter 2 provides further background information on MDE and DSM. Several modeling standards are introduced in Chapter 2, including MDA and the MOF metamodeling architecture. Furthermore, the concepts of metamodels and models are discussed in this background chapter, and the definitions and categories of model transformation are presented. DSM is further discussed in Chapter 2 by describing one of its paradigms - Model Integrated Computing (MIC) - and its metamodeling tool GME, which is also the modeling environment used in the research described in this dissertation. Chapter 3 details the model transformation approach to automate model evolution. The model transformation language ECL and the model transformation engine C-SAW are described as the initial work.
Emphasis is given to describing how C-SAW has been used to address the important issues of model scalability for exploring alternative designs and model adaptability for adapting systems to new requirements and environments. Two case studies are presented to illustrate how C-SAW addresses these challenges. In addition, to demonstrate the benefits of this approach, experimental evaluation is discussed, including modeling artifacts, evaluation metrics and experimental results. Chapter 4 describes the research contributions on model differentiation. This chapter begins with a brief discussion of the need for model differentiation, followed by detailed discussions of the limitations of current techniques. The problem of model differentiation is formally defined and the challenges for this problem are identified. The emphasis is placed on the model differentiation algorithms, including an analysis of non-structural and structural information of model elements, a formal representation of models and details of the algorithms. The work on visualization of model differences is also presented as necessary support to assist in comprehending the results of model differentiation. In addition, a complexity analysis is given to evaluate the performance of the algorithms, followed by discussions of current limitations and future improvements. Chapter 5 presents a model transformation testing approach. This chapter begins with a motivation of the specific need to ensure the correctness of model transformations, followed by a discussion of the limitations of current techniques. An overview of the model transformation testing approach is provided and emphasis is given to the principles and the implementation of the model transformation testing engine M2MUnit. In addition, a case study is offered to illustrate how this approach assists in detecting errors in ECL specifications. Chapter 6 explores future extensions of this work and Chapter 7 presents concluding comments. Appendix A provides the grammar of ECL, and Appendix B lists the operations in ECL. Two additional case studies are provided in Appendix C to demonstrate the benefit of using C-SAW to scale domain-specific models.

CHAPTER 2

BACKGROUND

This chapter provides further background information on Model-Driven Engineering (MDE) and Domain-Specific Modeling (DSM). Several modeling standards are introduced, including Model-Driven Architecture (MDA) and the Meta Object Facility (MOF) metamodeling architecture. The concepts of metamodels and models are discussed, as well as the definitions and categories of model transformation. Domain-Specific Modeling is further discussed by describing one of its paradigms - Model Integrated Computing (MIC) - and its metamodeling tool - the Generic Modeling Environment (GME), which is also the modeling environment used to conduct the research described in this dissertation.

2.1 Model-Driven Architecture (MDA)

MDA is a standard for model-driven software development promoted by the Object Management Group (OMG) in 2001, which aims to provide open standards for interoperating new platforms and applications with legacy systems [MDA, 07], [Frankel, 03], [Kleppe et al., 03]. MDA introduces a set of basic concepts, such as model, metamodel, modeling language and model transformation, and lays the foundation for MDE.
2.1.1 Objectives of MDA

To address the problem of the continual emergence of new technologies that forces organizations to frequently port their applications to new platforms, the primary goal of MDA is to provide cross-platform compatibility of application software despite any implementation or platform-specific changes (to the hardware platform, the software execution platform, or the application software interface). In particular, MDA provides an architecture that assures portability, interoperability, reusability and productivity through architectural separation of concerns [Miller and Mukerji, 01]:

• Portability: "reducing the time, cost and complexity associated with retargeting applications to different platforms and systems that are built with new technologies;"

• Reusability: "enabling application and domain model reuse and reducing the cost and complexity of software development;"

• Interoperability: "using rigorous methods to guarantee that standards based on multiple implementation technologies all implement identical business functions;"

• Productivity: "allowing system designers and developers to use languages and concepts that are familiar to them, while allowing seamless communication and integration across the teams."

To meet the above objectives, the OMG has established a number of modeling standards as the core infrastructure of the MDA:

• The Unified Modeling Language (UML) [UML, 07] is "a standard object-oriented modeling language and framework for specifying, visualizing, constructing, and documenting software systems;"

• The MetaObject Facility (MOF) [MOF, 07] is "an extensible model driven integration framework for defining, manipulating and integrating metadata and data in a platform-independent manner;"

• XML Metadata Interchange (XMI) [XMI, 07] is "a model driven XML Integration framework for defining, interchanging, manipulating and integrating XML data and objects;"

• The Common Warehouse Metamodel (CWM) [CWM, 07] is "standard interfaces that can be used to enable easy interchange of warehouse and business intelligence metadata between warehouse tools, warehouse platforms and warehouse metadata repositories in distributed heterogeneous environments."

These standards form the basis for building platform-independent applications using any major open or proprietary platform, including CORBA, Java, .NET and Web-based platforms, and even future technologies.

2.1.2 The MDA Vision

The OMG defines MDA as an approach to system development based on models. MDA classifies models into two categories: Platform-Independent Models (PIMs) and Platform-Specific Models (PSMs). These categories contain models at different levels of abstraction. A PIM represents a view of a system without involving platform and technology details. A PSM specifies a view of a system from a platform-specific viewpoint by containing platform- and technology-dependent information. The development of a system according to the MDA approach starts by building a PIM at a high level of abstraction that is independent of any implementation technology. The PIM describes the business functionality and behavior using UML, including constraints on services specified in the Object Constraint Language (OCL) [OCL, 07] and behavioral specification (dynamic semantics) specified in the Action Semantics (AS) language [AS, 01]. In the next phase the PIM is transformed to one or more PSMs. A PSM is tailored to specify a system in terms of the implementation constructs provided by the chosen platforms (e.g., CORBA, J2EE, and .NET).
Finally, implementation code is generated from the PSMs in the code generation phase using model-to-code transformation tools. The MDA architecture is summarized in Figure 2-1. The claimed advantages of MDA include increased quality and productivity of software development by isolating software developers from implementation details and allowing them to focus on a thorough analysis of the problem space. MDA, however, lacks the notion of a software development process. MDE is an enhancement of MDA that adds the notion of software development processes to operate and manage models by utilizing domain-specific technologies.

2.2 Basic Concepts of Metamodeling and Model Transformation

Metamodeling and model transformation are two important techniques used by model-driven approaches. However, there are no commonly agreed-upon definitions of these concepts in the literature, and they may be analyzed from various perspectives. The remainder of this chapter discusses some of these concepts to help the reader further understand the relationships among metamodels, models and model transformations.

Figure 2-1 - The key concepts of the MDA

2.2.1 Metamodel, Model and System

MDE promotes models as primary artifacts in the software development lifecycle. There are various definitions that help to understand the relationships among modeling terms. For example, the MDA guide [MDA, 07] states, "a model of a system is a description or specification of that system and its environment for some certain purpose. A model is often presented as a combination of drawings and text. The text may be in a modeling language or in a natural language." Another model definition can be found in [Kleppe et al., 03]: "A model is a description of a system written in a well-defined language." In [Bézivin and Gerbé, 01] a model is defined as follows: "A model is a simplification of a system built with an intended goal in mind. The model should be able to answer questions in place of the actual system." In the context of this research, a model represents an abstraction of some real system, whose concrete syntax is rendered in a graphical iconic notation that assists domain experts in constructing a problem description using concepts familiar to them.

Metamodeling is a process for defining domain-specific modeling languages (DSMLs). A metamodel formally defines the abstract syntax and static semantics of a DSML by specifying a set of modeling elements and their valid relationships for that specific domain. A model is an instance of the metamodel that represents a particular part of a real system. Conformance is a correspondence relationship between a metamodel and a model, and substitutability defines a causal connection between a model and a system [Kurtev et al., 06]. As defined in [Kurtev et al., 06], a model is a directed multigraph that consists of a set of nodes, a set of edges and a mapping function between the nodes and the edges; a metamodel is a reference model of a model, which implies that there is a function associating the elements (nodes and edges) of the model to the nodes of the metamodel. Based on the above definitions, the relation between a model and its metamodel is called conformance, which is denoted as conformTo or c2.
In particular, a metamodel is a model whose reference model is a meta-metamodel, and a meta-metamodel is a model whose reference model is itself [Kurtev et al., 06]. The substitutability principle is defined as follows: "a model M is said to be a representation of a system S for a given set of questions Q if, for each question of this set Q, the model M will provide exactly the same answer that the system S would have provided in answering the same question" [Kurtev et al., 06]. Using this terminology, a model M is a representation of a given system S, satisfying the substitutability principle. The relation between a model and a system is called representationOf, which is also denoted as repOf [Kurtev et al., 06]. Figure 2-2 illustrates the relationships between a metamodel, a model and a system.

Figure 2-2 - The relation between metamodel, model and system (adapted from [Kurtev et al., 06])

2.2.2 The Four-Layer MOF Metamodeling Architecture

The MOF is an OMG standard for metamodeling. It defines a four-layer metamodeling architecture that a model engineer can use to define and manipulate a set of interoperable metamodels. As shown in Figure 2-3, every model element on every layer strictly conforms to a model element on the layer above. For example, the MOF resides at the top (M3) level of the four-layer metamodel architecture; it is the meta-metamodel that conforms to itself. The MOF captures the structure or abstract syntax of the UML metamodel. The UML metamodel at the M2 level describes the major concepts and structures of UML models. A UML model, at the M1 level, represents the properties of a real system (at the M0 level). The MOF only provides a means to define the structure or abstract syntax of a language. For defining metamodels, the MOF serves the same role that the extended Backus-Naur form (EBNF) [Aho et al., 07] plays for defining programming language grammars. For defining a DSML, a metamodel for that specific domain plays the role that a grammar plays for defining a specific language (e.g., Java). There are other metamodeling techniques available for defining domain-specific modeling languages, such as the Generic Modeling Environment (GME) [Lédeczi et al., 01], the ATLAS Model Management Architecture (AMMA) [Kurtev et al., 06], Microsoft's DSL tools [Microsoft, 05], [Cook et al., 07], MetaEdit+ [MetaCase, 07], and the Eclipse Modeling Framework (EMF) [Budinsky et al., 04], which also follow the four-layer metamodeling architecture.

Figure 2-3 - The MOF four-tier metamodeling architecture

2.2.3 Model Transformation

Model transformation, a key component of model-driven approaches, is the process of applying a set of transformation rules that take one or more source models as input to produce one or more target models as output [Sendall and Kozaczynski, 03], [Czarnecki and Helsen, 06], [Mens and Van Gorp, 05]. The source and target models may be defined either in the same modeling language or in different modeling languages. Based on whether the source and target models conform to the same modeling language, model transformations can be categorized as endogenous or exogenous transformations. Endogenous transformations are transformations between models expressed in the same language.
Exogenous transformations are transformations between models expressed in different languages [Mens and Van Gorp, 05]. A typical example of an endogenous transformation is model refactoring, where a change is made to the internal structure of a model to improve certain qualities (e.g., understandability and modularity) without changing its observable behavior [Zhang et al., 05-a]. An example of an exogenous transformation is model-to-code transformation, which typically generates source code (e.g., Java or C++) from models. Another criterion for categorizing model transformations is whether the source and target models reside at the same abstraction level. Based on this criterion, model transformations can also be categorized as horizontal transformations and vertical transformations [Mens and Van Gorp, 05]. If the source and target models reside at the same abstraction level, the transformation is a horizontal transformation; model refactoring is also a horizontal transformation. A vertical transformation is a transformation where the source and target models reside at different abstraction levels. A typical example is model refinement, where a design model is gradually refined into a full-fledged implementation model by means of successive refinement steps that add more concrete details [Batory et al., 04], [Greenfield et al., 04]. The dimensions of horizontal versus vertical and endogenous versus exogenous transformations are orthogonal. For example, model migration translates models written in one language into another, but the two languages are at the same level of abstraction; a model migration is thus not only an exogenous transformation but also a horizontal transformation. Although most existing MDE tools provide support for exogenous transformations to stepwise produce implementation code from designs, many modeling activities can be automated by endogenous transformations to increase the productivity of modeling and improve the quality of models. Examples of such modeling activities include model refactoring [Zhang et al., 05-a] and model optimization [Mens and Van Gorp, 05]. Moreover, exogenous transformations are also useful for computing different views of a system model and synchronizing between them [Czarnecki and Helsen, 06]. This dissertation concentrates on applying endogenous transformations to automate model change evolution with an emphasis on addressing system adaptability and scalability, which is further discussed in Chapter 3. In the rest of the dissertation, the general term "model transformation" will refer to endogenous transformations (i.e., this research offers no contribution to exogenous transformations such as model-to-code transformation).

A model transformation is defined in a model transformation specification, which consists of a set of transformation rules. A transformation rule usually includes two parts: a Left-Hand Side (LHS) and a Right-Hand Side (RHS). The LHS defines the configuration of objects in the source models to which the rule applies (i.e., filtering, which produces a subset of elements from the source model). The RHS defines the configuration of objects in the target models that will be created, updated or deleted by the rule. Both the LHS and RHS can be represented using any mixture of variables, patterns and logic. A model transformation specification not only needs to define mapping rules, but also the scope of rule application.
Additional parts of a transformation rule include the rule application strategy, and the rule application scheduling and organization [Czarnecki and Helsen, 06]. Rule application scoping covers the scope of the source and target models for rule application (in this case, scope refers to the portion or subset of a model to which the transformation is to be applied). The rule application strategy refers to how the model structure is traversed in terms of how selection matches are made with modeling elements when applying a transformation. A model transformation can also be applied as an in-place update, where the source location becomes the target location. Rule application scheduling determines the order in which the rules are applied, and rule organization considers modularity mechanisms and the organizational structure of the transformation specification [Czarnecki and Helsen, 06].

There exist various techniques to define and perform model transformations. Some of these techniques provide transformation languages to define transformation rules and their application, which can be either graphical or textual, either imperative or declarative. Although there exist different approaches to model transformation, the OMG has initiated a standardization process by adopting a specification on Query/View/Transformation (QVT) [QVT, 07]. This process led to an OMG standard not only for defining model transformations, but also for defining views on models and synchronization between models. Typically, a QVT transformation definition describes the relationship between a source metamodel and a target metamodel, both defined in the MOF. It uses source patterns (e.g., the LHS part of a transformation rule) and target patterns (e.g., the RHS part of a transformation rule). In QVT, transformation languages are defined as MOF metamodels. A transformation is an instance of a transformation definition, and its source models and target models are instances of the source patterns and target patterns, respectively. Such a generalized transformation pattern is shown partially in Figure 2-4, without indicating the transformation language level.

Figure 2-4 - Generalized transformation pattern

In MDE, model transformation is the core process used to automate various activities in the software development life cycle. Exogenous transformation can be used to synthesize low-level software artifacts (e.g., source code) from high-level models, or to extract high-level models from lower-level software artifacts, as in reverse engineering. Endogenous transformation can be used to optimize and refactor models in order to improve modeling productivity and the quality of models.

2.3 Supporting Technology and Tools

This research is tied to a specific form of MDE called Model-Integrated Computing (MIC) [Sztipanovits and Karsai, 97], which has been refined at Vanderbilt University over the past decade to assist in the creation and synthesis of computer-based systems. The Generic Modeling Environment (GME) [GME, 07] is a metamodeling tool based on MIC principles, with which the dissertation research is conducted. The following sections provide further descriptions of MIC and GME.

2.3.1 Model-Integrated Computing (MIC)

MIC realized the vision of MDE a decade before the general concepts of MDE were enumerated in the modeling community.
Similar to the MOF, MIC defines a four-layer metamodeling architecture for defining DSMLs that model real-world systems. Different from the MOF, MIC provides its own meta-metamodel, called GMEMeta [Balasubramanian et al., 06-b], to define metamodels with a notation similar to UML class diagrams and the OCL. In terms of MIC, the main concepts of model, metamodel, and related topics are defined as follows [Nordstrom, 99]:

• Metamodeling Environment: "a tool-based framework for creating, validating, and translating metamodels;"

• Metamodel: also called the modeling paradigm, "formally defines a DSML for a particular domain, which captures the syntax, static semantics and visualization rules of the target domain;"

• Modeling Environment: "a system, based on a specific metamodel, for creating, analyzing, and translating domain-specific models;"

• Model: "an abstract representation of a computer-based system that is an instance of a specific metamodel."

DSMLs are the backbone of MIC to capture the domain elements of various application areas. A DSML can be viewed as a five-tuple [Karsai et al., 03]:

• a concrete syntax defines "the specific notation (textual or graphical) used to express domain elements;"

• an abstract syntax defines "the concepts, relationships, and integrity constraints available in the language;"

• a semantic domain defines "the formalism used to map the semantics of the models to a particular domain;"

• a syntactic mapping assigns "syntactic constructs (graphical or textual) to elements of the abstract syntax;"

• a semantic mapping relates "the syntactic concepts to the semantic domain."

The key application domains of MIC range from embedded systems areas typified by automotive factories [Long et al., 98] to avionics systems [Gray et al., 04-b] that tightly integrate the computational structure of a system and its physical configuration. In such systems, MIC has been shown to be a powerful tool for providing adaptability in frequently changing environments [Sztipanovits and Karsai, 97].

2.3.2 Generic Modeling Environment (GME)

GME is a metamodeling tool that realizes the principles of MIC [Lédeczi et al., 01]. As shown in Figure 2-5, GME provides a metamodeling interface to define metamodels, a modeling environment to create and manipulate models, and model interpretation to synthesize applications from models.

Figure 2-5 - Metamodels, models and model interpreters (compilers) in GME (adapted from [Nordstrom et al., 99])

When using the GME, a modeling paradigm is loaded into the tool by meta-level translation to define a modeling environment containing all the modeling elements and valid relationships that can be constructed in the target domain. Such an environment allows users to specify and edit visual models using notations common to their domain of expertise. GME also provides a mechanism for writing model compilers that translate models to different applications according to a user's various intentions. For example, such compilers can generate simulation applications or synthesize computer-based systems.
GME provides the following elements to define a DSML [Balasubramanian et al., 06-b]:

• project: "the top-level container of the elements of a DSML;"

• folders: "used to group similar elements;"

• atoms: "the atomic elements of a DSML, used to represent the leaf-level elements in a DSML;"

• models: "the compound objects in a DSML, used to contain different types of elements (e.g., references, sets, atoms, and connections);"

• aspects: "used to define different viewpoints of the same model;"

• connections: "used to represent relationships between elements of a DSML;"

• references: "used to refer to other elements in different portions of a DSML hierarchy;"

• sets: "containers whose elements are defined within the same aspect and have the same container as the owner."

The concepts of metamodel and model in GME are further illustrated by a state machine example, as shown in Figure 2-6 and Figure 2-7. Figure 2-6 shows a metamodel defined with the GME meta-metamodel, which is similar to a UML class diagram. It specifies the entities and their relationships needed for expressing a state machine, whose instance models may be various state diagrams. As specified in the metamodel, a StateDiagram contains zero to multiple StartState, State or EndState elements, which all inherit from StateInheritance, a first-class object (in GME, Atom, Model, Reference, Set and Connection are the basic modeling elements, called first-class objects, or FCOs). Thus, StateInheritance may refer to StartState, State or EndState elements. Also, a StateDiagram contains zero to multiple Transitions between pairs of StateInheritance objects. Both StateInheritance and Transition can be associated with an attribute definition such as a field. All of these StateInheritance and Transition elements are meta-elements of the elements in any of the instance models. Moreover, some rules can be defined as constraints within the metamodel using a language such as OCL. For example, a rule for the state machine may be "a state diagram contains only one StartState," which is specified in OCL as "parts("StartState")->size() = 1".

Figure 2-6 - The state machine metamodel

Figure 2-7 shows an Automated Teller Machine (ATM) model, which is an instance model of the state machine. Besides a StartState element and an EndState element, the ATM contains seven State elements and the necessary Transition elements between the StartState, EndState and State elements. The specification of this ATM model conforms to the state machine metamodel in several ways. For example, for any State element in the ATM (e.g., CardInserted and TakeReceipt), its meta-element exists in the state machine metamodel (i.e., State). Similarly, for all the links between State elements in the ATM, there exists an association in the state machine metamodel that leads from a StateInheritance to another StateInheritance or from a StateInheritance to itself. This represents type conformance with the metamodel. In other words, there exists a meta-element (i.e., a type) in the metamodel for any element in the instance model. In addition to type conformance, the ATM needs to conform to the attribute and constraint definitions in the state machine metamodel.

Figure 2-7 - The ATM instance model

To conclude, GME provides a framework for creating domain-specific modeling environments (DSMEs), which allow one to define DSMLs.
The GME also provides a plug-in extension mechanism for writing model compilers that can be invoked from within the GME to synthesize a model into some other form (e.g., translation to code, refinement to a different model, or simulation scripts). The tools developed to support this research have been implemented as GME plug-ins (i.e., the transformation engine C-SAW discussed in Chapter 3, the model comparison tool DSMDiff discussed in Chapter 4, and the transformation testing engine M2MUnit discussed in Chapter 5). All of the DSMLs presented in this thesis are also defined and developed within the GME.

CHAPTER 3

AUTOMATED MODEL EVOLUTION

This chapter presents a transformation approach to automate model change evolution. The specific challenges and the limitations of currently available techniques are discussed before a detailed introduction to the model transformation language ECL and the associated model transformation engine C-SAW. In particular, this chapter concentrates on the role of C-SAW in addressing model evolution concerns related to system scalability and adaptability. Two case studies are offered to illustrate how these concerns are addressed by C-SAW. In addition, to demonstrate the benefits of this approach, experimental evaluation is discussed, including modeling artifacts, evaluation metrics and experimental results. Related work and a concluding discussion are presented at the end of this chapter.

3.1 Challenges and Current Limitations

One of the benefits of modeling is the ability to explore design alternatives. To provide support for efficient design exploration, frequent change evolution is required within system models. However, the escalating complexity of software and system models makes it difficult to rapidly explore the effects of a design decision and to make changes to models correctly. Automating such exploration with model transformation can improve both the productivity and the quality of model evolution [Gray et al., 06]. To support automated model evolution, a model transformation language should be able to specify the various types of changes that are needed for common model evolution tasks. As discussed in Chapter 1, this research addresses two categories of changes embodied in model evolution: changes for system adaptability that crosscut the model representation's hierarchy, and changes for system scalability that require replication of model elements and connections. To express these changes, a model transformation language should address the challenges discussed in the following subsections.

3.1.1 Navigation, Selection and Transformation of Models

Many model evolution tasks require traversing the model hierarchy, selecting model elements and changing the model structure and properties. For example, model scalability is a process that refines a simple base model into a more complex model by replicating model elements or substructures and adding the necessary connections. Such replication usually emerges in multiple locations within a model hierarchy and requires that a model transformation language provide support for model navigation and selection. In particular, model evolution is essentially a process of manipulating (e.g., creating, deleting, or changing) model elements and connections dynamically. A model transformation language therefore also needs to support basic transformation operations for creating and deleting model elements or changing their properties, as previewed in the sketch below.
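As a concrete preview of these three capabilities, the following hypothetical fragment sketches how such a task might be phrased in ECL, the transformation language introduced in Section 3.2. It uses only operators described in that section; the folder, kind and attribute names (Components, Data, Status) are invented for illustration and do not come from an actual case study.

  aspect UpdateDataElements()
  {
    // navigation: descend from the root folder into the models
    // of the "Components" folder and apply a strategy to each
    rootFolder().findFolder("Components").models()->UpdateModel();
  }

  strategy UpdateModel()
  {
    // selection: gather the atoms of kind "Data" in this model
    // and hand them to the transformation strategy
    atoms()->select(a | a.kindOf() == "Data")->SetStatus("verified");
  }

  strategy SetStatus(statusValue : string)
  {
    // transformation: change a property of the matched element
    setAttribute("Status", statusValue);
  }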
Model transformation is also a specific type of computation and may be performed in a procedural style. A model transformation may contain multiple individual and reusable procedures. In addition to the model-oriented features described above, a model transformation language often needs to support sequential, conditional, repetitive and parameterized model manipulation for defining control flow and enabling data communication between transformation rules. Another challenge of model evolution is that many modeling concerns are spread across a model hierarchy. New language constructs are needed to improve the modularization of such concerns, as discussed in the following section.

3.1.2 Modularization of Crosscutting Modeling Concerns

Many model evolution concerns are crosscutting within the model hierarchy [Zhang et al., 07], [Gray et al., 03]. An example is the widespread data logging mechanism embodied in a data communication system [Gray et al., 04-b]. Another example is the effect of fluctuating bandwidth on the quality of service across avionics components that must display a real-time video stream [Neema et al., 02]. To evaluate such system-wide changes inside a system model, the designer must manually traverse the model hierarchy by recursively clicking on each element and then make changes to the right elements. This process is tedious and error-prone, because system models often contain hierarchies several levels deep.

Figure 3-1 - Modularization of crosscutting model evolution concerns

Traditional programming languages, such as object-oriented languages, are not suitable for modularizing concerns that crosscut modules such as objects. Currently, Aspect-Oriented Software Development (AOSD) [Filman et al., 04] offers techniques to modularize concerns that crosscut system components. For example, Aspect-Oriented Programming (AOP) [Kiczales et al., 01] provides two new types of language constructs: advice, which is used to represent the crosscutting behavior, and pointcut expressions, which are used to specify the locations in the base program where the advice should be applied. An aspect is a modularization of a specific crosscutting concern with pointcut and advice definitions. Although the application of AOSD originally focused on programming languages, the community investigating aspect-oriented modeling is growing [AOM, 07]. To support modularity in specifying crosscutting modeling concerns, as shown in Figure 3-1, aspect-oriented constructs are needed in a model transformation language (e.g., to specify a collection of model elements crosscutting the model hierarchy). A desired result of such a model transformation language is to achieve modularization such that a change in a design decision is isolated to one location, as the sketch below suggests.
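To suggest how the AOP analogy carries over to models, the following hypothetical sketch (again in the ECL notation of Section 3.2) uses an aspect as the pointcut-like quantification over model elements and a strategy as the advice-like change applied at each selected element. The System, Logger, LogLink and Sender names are invented for illustration.

  aspect LogAllSubsystems()
  {
    // pointcut-like part: quantify over the models in the
    // "System" folder and weave the change into each of them
    rootFolder().findFolder("System").models()->AddLogging();
  }

  strategy AddLogging()
  {
    // advice-like part: attach a Logger atom to each selected
    // model and connect it to that model's Sender element, so
    // the logging decision is expressed in exactly one place
    declare p : model;
    declare log : atom;

    p := self;
    log := p.addAtom("Logger", "DataLogger");
    p.addConnection("LogLink", log, p.findAtom("Sender"));
  }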
3.1.3 The Limitations of Current Techniques

For the purpose of automation, model evolution tasks can be programmed in either traditional programming languages or currently available model transformation languages. For example, many commercial and research tool suites provide APIs to manipulate models directly. However, an API approach requires model developers to learn and use low-level tools (e.g., object-oriented languages and their frameworks) to program high-level model transformations. This emerges as a major hurdle when applying model-driven approaches in software development by end-users who are not familiar with general-purpose programming languages (GPLs). There are a number of approaches to model transformation, such as graphical languages typified by graph grammars (e.g., GReAT [Agrawal, 03] and Fujaba [Fujaba, 07]) and hybrid languages (e.g., the ATLAS Transformation Language [Bézivin et al., 04], [Kurtev et al., 06] and Yet Another Transformation Language [Patrascoiu, 04]). However, these approaches aim to solve model transformation problems where the source model and target model belong to different metamodels, so their languages have complicated syntax and semantics, and additional mapping techniques between different metamodels are embodied in these approaches. Thus, a more lightweight language that targets endogenous transformation, where the source model and target model belong to the same metamodel, is better suited to the model evolution problems identified in this research.

3.2 The Embedded Constraint Language (ECL)

The model transformation approach developed in this research offers a textual transformation language called the Embedded Constraint Language (ECL). The earlier version of ECL was derived from the Multigraph Constraint Language (MCL), which is supported in the GME to stipulate specific semantics within the domain during the creation of a domain's metamodeling paradigm [Gray, 02]. Similar to MCL, ECL is an extension of the OCL [Warmer and Kleppe, 99], which complements the industry-standard UML by providing a language to write constraints and queries over object models. Thus, ECL has concepts and notations that are familiar to model engineers. The ECL also takes advantage of the model navigation features of OCL to provide declarative constructs for automatic selection of model elements. Originally, ECL was designed to address crosscutting modeling concerns where transformations were executed outside the GME by modifying the XML representation of models [Gray et al., 01], [Gray, 02]. A preliminary contribution of this dissertation was to enrich the language features of ECL and to adapt its interface to work as a plug-in within GME (i.e., the transformations are now performed within GME, not outside of GME through XML). Several criteria influenced the design of ECL:

• The language should be small but powerful. The primary goal for the design of a transformation specification language is to allow users to describe a transformation using concepts from their own domain and modeling environment. Although there is a tradeoff between conciseness and comprehension, the key to the design of a transformation language is a set of core abstractions that are intuitive and cover the largest possible range of situations implied by current modeling practice. To achieve this goal, a transformation language should consist of a small (ideally minimal) set of concepts and constructs, but be powerful enough to express a complete set of desired transformation activities such that the expressiveness of the language is maximized.

• The language should be specific to model transformation. A transformation-specific language needs to have full power to specify all types of modeling objects and transformation behaviors, including model navigation, model selection and model modification. Such a language should provide specific constructs and mechanisms for users to describe model transformations. This requires both a robust type system and a set of functionally rich operators. In other words, a transformation language needs to capture all the features of the model transformation domain.
As a result of these desiderata, ECL is implemented as a simple but highly expressive language for model engineers to write transformation specifications. The ECL provides a small but robust type system, and also provides features such as collections and model navigation. A set of operators is also available to support model aggregation and connections. In general, the ECL supports an imperative transformation style with numerous operations that can alter the state of a model.

3.2.1 ECL Type System

ECL currently provides a basic type system to describe values and model objects that appear in a transformation. The data types in ECL include the primitive data types (e.g., boolean, integer, real and string), the model object types (e.g., atom, model, object and connection) and the collection types (e.g., atomList, modelList, objectList and connectionList). The data types are used explicitly in parameter definitions and variable declarations. These types are new additions to the earlier version of ECL described in [Gray, 02].

3.2.2 ECL Operations

ECL provides various operations to support model navigation, selection and transformation. These operations are described throughout this subsection.

Model collection

The manipulation of a collection of modeling elements is a common task in model transformation. In DSM, there exist modeling elements that have common features and can be grouped together. Model manipulations (e.g., navigations and evaluations) often need to be performed on such a collection of models. The concrete type of a collection in ECL is a bag, which can contain duplicate elements. An example operator for a collection is size(), which is similar to the OCL operator that returns the number of elements in a collection. All operations on collections are denoted in an ECL expression by an arrow (->). This makes it easy to distinguish an operation on a model object type (denoted by a period) from an operation on a collection. In the following statement, the select operation following the arrow is applied to the collection (i.e., the result of the atoms operation) before the arrow, and the size operator appears in the comparison expression:

  atoms()->select(a | a.kindOf() == "Data")->size() >= 1

The above expression selects all the atoms whose kind is "Data" from the current model and determines whether the number of such atoms is equal to or greater than 1.

Model selection and aggregation

One common activity during model transformation is to find elements in a model. There are two different approaches to locating model elements. The first approach, querying, evaluates an expression over a model, returning those elements of the model for which the expression holds. The other common approach uses pattern matching, where a term or a graph pattern containing free variables is matched against the model. Currently, ECL supports model queries primarily by providing the select() operator. Other operators include model aggregation operators to select a collection of objects (e.g., atoms()) and a set of operators to find a single object (e.g., findModel("aModelName")). The select(expression) operator is frequently used in ECL to specify a selection from a source collection, which can be the result of previous operations and navigations. The result of the select operation is always a subset of the original collection. In addition, model aggregation operators can also be used to perform model querying.
For example, models(expression) is used to select all the submodels that satisfy the constraint specified by the expression. Other aggregation operators include atoms(expression), connections(expression), source() and destination(). Specifically, source() and destination() are used to return the source object and the destination object of a connection. Another set of operators is used to obtain a single object (e.g., findAtom() and findModel()). The following ECL uses a number of the operators just mentioned:

  rootFolder().findFolder("ComponentTypes").models()->
    select(m | m.name().endWith("Impl"))->AddConcurrency();

First, rootFolder() returns the root folder of a modeling project. Next, findFolder() is used to return a folder named "ComponentTypes" under the root folder. Then, models() is used to find all the models in the "ComponentTypes" folder. Finally, the select() operator is used to select all the models that match the predicate expression (i.e., those models whose names end with "Impl"). The AddConcurrency strategy is then applied to the resulting collection. The concept of a strategy is explained in Section 3.2.3.

Transformation operations

ECL provides basic transformation operations to add model elements, remove model elements and change the properties of model elements. Standard OCL does not provide such capabilities because it does not allow side effects on a model. However, a transformation language should be able to alter the state of a model. ECL extends the standard OCL by providing a series of operators for changing the structure or constraints of a model. To add new elements (e.g., a model, atom or connection) to a model, ECL provides operators such as addModel(), addAtom() and addConnection(). Similarly, to remove a model, atom or connection, there are operators such as removeModel(), removeAtom() and removeConnection(). To change the value of any attribute of a model element, setAttribute() can be used.

3.2.3 The Strategy and Aspect Constructs

There are two kinds of modular constructs in ECL: the strategy and the aspect, which are designed to provide aspect-oriented capabilities for specifying crosscutting modeling concerns. A strategy is used to specify elements of computation and the application of specific properties to model entities (e.g., adding model elements). A modeling aspect is used to specify a crosscutting concern across a model hierarchy (e.g., a collection of model elements that cut across the hierarchy). In general, an ECL specification may consist of one or more strategies, and a strategy can be called by other strategies. A strategy call implements the binding and parameterization of the strategy to specific model entities. The context of a strategy call can be an entire project; a specific model, atom, or connection; or a collection of assembled modeling elements that satisfy a predicate. The aspect construct in ECL is used to specify such a context. Examples of an ECL aspect and strategy are shown in Listing 3-1.
 1   aspect FindData1(atomName, condName, condExpr : string)
 2   {
 3     atoms()->select(a | a.kind() == "Data" and
 4                         a.name() == "data1")->
 5       AddCond("Data1Cond", "value<200");
 6   }
 7
 8   strategy AddCond(condName, condExpr : string)
 9   {
10     declare p : model;
11     declare data, pre : atom;
12
13     data := self;
14     p := parent();
15     pre := p.addAtom("Condition", condName);
16     pre.setAttribute("Kind", "PreCondition");
17     pre.setAttribute("Expression", condExpr);
18     p.addConnection("AddCondition", pre, data);
19   }

Listing 3-1 - Examples of an ECL aspect and strategy

The FindData1 aspect selects the set of Data atoms named "data1" from all the atoms, and the AddCond strategy is then applied to the selected atoms; the strategy adds a Condition atom for each of the selected atoms (Line 15) and creates a connection between them (Line 18). The AddCond strategy also sets the values of the attributes of each created Condition atom (Lines 16 to 17). With the strategy and aspect constructs, ECL offers the ability to explore numerous modeling scenarios by considering crosscutting modeling concerns as aspects that can be rapidly inserted into and removed from a model. This permits a model engineer to make changes more easily to the base model without manually visiting multiple locations.

3.2.4 The Constraint-Specification Aspect Weaver (C-SAW)

The ECL is fully implemented within a model transformation engine called the Constraint-Specification Aspect Weaver (C-SAW). (The name C-SAW was selected due to its affinity to an aspect-oriented concept: a crosscutting saw, or c-saw, is a carpenter's tool that cuts across the grain of wood.) Originally, C-SAW was designed to address crosscutting modeling concerns [Gray, 02], but it has evolved into a general model transformation engine that performs different modeling tasks. As shown in Figure 3-2, C-SAW executes an ECL transformation specification on a set of source models. One or more source models, together with the transformation specification, are taken as input to the underlying transformation engine. By executing the transformation specification, C-SAW weaves additive changes into the source models to generate the target model as output. Inside C-SAW there are two core components that relate to the ECL: the parser and the interpreter. The parser is responsible for generating an abstract syntax tree (AST) of the ECL specification. The interpreter then traverses the generated AST from top to bottom and performs the transformation using the modeling APIs provided by GME. Thus, the accidental complexities of the low-level API details are abstracted away by the ECL to provide a more intuitive representation for specifying model transformations. When a model transformation is performed, additive changes can be applied to base models, leading to structural or behavioral changes that may crosscut multiple boundaries of the model. Section 3.3 provides an example where C-SAW serves as a model replicator to scale up models. Another example is presented in Section 3.4 to illustrate the capability of C-SAW to address crosscutting modeling concerns.

Figure 3-2 - Overview of C-SAW

3.2.5 Reducing the Complexities of Transforming GME Models

GME provides APIs in C++ and Java, called the Builder Object Network (BON), to manipulate models; this serves as an extension mechanism for building model compilers (e.g., plug-in and add-on components) that supply the ability to alter the state of models or generate other software artifacts from them.
However, using BON to specify modeling concerns such as model navigation and querying introduces accidental complexities, because users have to deal with the low-level details of the modeling APIs in C++ or Java. Listing 3-2 shows a code fragment for finding a model under the root folder using the BON APIs in C++:

  CBuilderModel* FindModel(CBuilderFolder *rootFolder, CString modelName)
  {
    CBuilderModel *result = NULL;
    const CBuilderModelList *subModels = rootFolder->GetRootModels();
    POSITION pos = subModels->GetHeadPosition();
    while (pos) {
      CBuilderModel *subModel = subModels->GetNext(pos);
      if (subModel->GetName() == modelName) {
        result = subModel;
        return result;
      }
    }
    return result;
  }

Listing 3-2 - Example C++ code to find a model from the root folder

The ECL provides a more intuitive and high-level representation for specifying modeling concerns (e.g., model navigation, querying and transformation) such that the low-level details are made abstract. For example, to find a model from the root folder, the following ECL can be compared to the C++ code above:

  rootFolder().findModel("aModelName");

3.3 Model Scaling with C-SAW

In MDE, it is often desirable to evaluate different design alternatives as they relate to scalability issues of the modeled system. A typical approach to addressing scalability is model replication, which starts by creating base models that capture the key entities as model elements and their relationships as model connections. A collection of base models can be adorned with the information needed to characterize a specific scalability concern as it relates to how the base modeling elements are replicated and connected together. In current modeling practice, such model replication is usually accomplished by scaling the base model manually. This is a time-consuming process that represents a source of error, especially when there are deep interactions between model components. As an alternative to the manual process, this section presents the idea of automated model replication through a C-SAW model transformation process that expands the number of elements in the base model and makes the correct connections among the generated modeling elements. The section motivates the need for model replication through a case study.

3.3.1 Model Scalability

One of the benefits of modeling is the ability to explore design alternatives. A typical form of design exploration involves experimenting with model structures by expanding different portions of models and analyzing the resulting scalability [Lin et al., 07-a], [Gray et al., 05]. For example, a high-performance computing system may be evaluated when moving from a few computing nodes to hundreds of computing nodes. Model scalability is defined as the ability to build a complex model from a base model by replicating its elements or substructures and adding the necessary connections. Supporting model scalability requires extensive support from the host modeling tool to enable rapid change evolution within the model representation [Gray et al., 06]. However, it is difficult to achieve model scalability in current modeling practice due to the following challenges:

(1) Large-scale system models often contain many modeling elements. In practice, models can have multiple thousands of coarse-grained components. As discussed in Section 1.3.1, modeling these components using traditional manual model creation techniques and tools can approach the limits of the effective capability of an engineer.
For example, the models of a DRE system may consist of several thousand instances of a set of types defined in a metamodel, which leads to their large size and nested hierarchy.

(2) Manually scaling up models is laborious, time-consuming and prone to errors. To examine the effect of scalability on a system, the size of a system model (e.g., the number of participant model elements and connections) needs to be increased or decreased frequently. The challenges of scalability affect the productivity of the modeling process, as well as the correctness of the model representation. As an example, consider a base model consisting of a few modeling elements and their corresponding connections. Scaling such a base model to hundreds, or even thousands, of duplicated elements would require a large amount of mouse clicking and typing within the associated modeling tool [Gray et al., 06]. Furthermore, the tedious nature of manually replicating a base model may also be the source of many errors (e.g., forgetting to make a connection between two replicated modeling elements). A manual approach to replication significantly hampers the ability to explore design alternatives within a model (e.g., after scaling a model to 800 modeling elements, it may be desired to scale back to only 500 elements, and then back up to 700 elements, in order to understand the impact of system size).

To address these challenges, the research described in this dissertation makes a contribution to model scalability by using a model transformation approach to automate the replication of base models. (The term "replication" has a specific meaning in object replication of distributed systems and in database replication; in the context of this thesis, the term refers to the repetition of modeling elements or structures among models to address scalability concerns.) A transformation for model replication is called a replicator, which changes a model to address scalability concerns. In this approach, large-scale system models are automatically created from smaller, baseline specification models by applying model transformation rules that govern the scaling and replication behavior associated with stepwise refinement of models [Batory, 06].

3.3.2 Desired Characteristics of a Replication Approach

An approach that supports model scalability through replication should have the following desirable characteristics: 1) it retains the benefits of modeling, 2) it is general across multiple modeling languages, and 3) it is flexible enough to support user extensions. Each of these characteristics (C1 through C3) is discussed further in this subsection.

C1. Retains the benefits of modeling: The power of modeling comes from the opportunity to explore various design alternatives and the ability to perform analysis (e.g., model checking and verification of system properties [Hatcliff et al., 03]) that would be difficult to achieve at the implementation level, but easier at the modeling level. Thus, a model replication technique should not perform scalability in such a way that analysis and design exploration are inhibited. This seems to be an obvious characteristic to desire, but we have observed replication approaches that remove these fundamental benefits of modeling.

C2. General across multiple modeling languages: A replication technique that is generally applicable across multiple modeling languages can leverage the effort expended in creating the underlying transformation mechanism.
A side benefit of such generality is that a class of users can become familiar with a common replication technique that can then be applied to many modeling languages.

C3. Flexible to support user extensions: Often, large-scale system models leverage architectures that are already well-suited toward scalability. Likewise, the modeling languages that specify such systems may embody similar patterns of scalability, and may lend themselves favorably toward a generative and reusable replication process. Further reuse can be realized if the replicator supports multiple types of scalability concerns in a templatized fashion (e.g., the name, type, and size of the elements to be scaled are parameters to the replicator). The most flexible type of replication would allow the semantics of the replication to be altered directly using a language that can be manipulated easily by an end-user. In contrast, replication techniques that are hard-coded restrict the opportunity for reuse.

3.3.3 Existing Approaches to Support Model Replication

Two techniques represent the approaches to model replication observed in common practice: 1) an intermediate phase of replication within a model compiler, and 2) a domain-specific model compiler that performs replication for a particular modeling language.

A1. Intermediate stage of model compilation: A model compiler translates the representation of a model into some other artifact (e.g., source code, configuration files, or simulation scripts). As a model compiler performs its translation, it typically traverses an internal representation of the model through data structures and APIs provided by the host modeling tool (e.g., the BON offered by the GME). One of our earlier ideas for scaling large models considered performing the replication as an intermediate stage of the model compiler. Prior to the generation phase of the compilation, the intermediate representation can be expanded to address the desired scalability. This idea is represented in Figure 3-3, which shows the model scaling as an internal task within the model compiler that directly precedes the artifact generation.

Figure 3-3 - Replication as an intermediate stage of model compilation (A1)

This approach is an inadequate solution to replication because it violates all three of the desired characteristics enumerated in Section 3.3.2. The most egregious violation is that the approach destroys the benefits of modeling. Because the replication is performed as a pre-processing phase in the model compiler, the replicated structures are never rendered back into the modeling tool itself to produce scaled models that model engineers can further analyze. Thus, analysis and design alternatives are not made available to a model engineer who wants to further evaluate the scaled models. Additionally, the pre-processing rules are hard-coded into the model compiler and intermixed with other concerns related to artifact generation. This coupling offers little opportunity for reuse in other modeling languages. In general, this is the least flexible of all the approaches that we considered.

A2. Domain-specific model compiler to support replication: This approach to model scalability constructs a model compiler that is capable of replicating the models as they appear in the tool, such that the result of model scaling is available to the end-user for further consideration and analysis.
Such a model compiler has detailed knowledge of the specific modeling language, as well as the particular scalability concern. Unlike approach A1, this technique preserves the benefits of modeling because the end result of the replication provides visualization of the scaling, and the replicated models can be further analyzed and refined. Figure 3-4 illustrates the domain-specific model replicator approach, which separates the model scaling task from the artifact generator in order to provide end-users an opportunity to analyze the scaled models. However, this approach also has a few drawbacks. Because the replication rules are hard-coded into the domain-specific model replicator, the developed replicator has limited use outside of the intended modeling language. Thus, generality across modeling languages is lost. These first two approaches have drawbacks when compared against the desired characteristics of Section 3.3.2. The next section presents a more generalized solution based on C-SAW and the ECL.

Figure 3-4 - Replication as a domain-specific model compiler (A2)

3.3.4 Replication with C-SAW

A special type of model compiler within the GME is a plug-in that can be applied to any metamodel (i.e., it is domain-independent). The C-SAW model transformation engine is an example of a plug-in that can be applied to any modeling language. The transformations that can be performed by C-SAW are endogenous transformations, where the source and target models are defined by the same metamodel. C-SAW executes as a model compiler and renders all transformations (as specified in the ECL) back into the host modeling tool. A model transformation written in ECL can be altered very rapidly to analyze the effect of different degrees of scalability (e.g., the effect on performance when the model is scaled from 256 to 512 nodes). This third approach to replication (designated as A3) advocates the use of a model transformation engine like C-SAW to perform the replication (see Figure 3-5 for an overview of the technique). This technique satisfies all of the desirable characteristics of a replicator: by definition, the C-SAW tool is applicable across many different modeling languages, and the replication strategy is decoupled from other concerns (e.g., artifact generation) and specified in a way that can be easily modified through a higher-level transformation language. These benefits improve upon the hard-coded rules observed in approaches A1 and A2. With a model transformation engine, a second model compiler is still required for each domain, as in A2 (see "Model Compilers" in Figure 3-4), but the scalability issue is addressed independently of the modeling language. The key benefits of approach A3 can be seen by comparing it to A2. Figures 3-4 and 3-5 share a common two-stage process of model replication followed by artifact generation with a model compiler. The difference between A2 and A3 lies in the replication approach. In Figure 3-4, the replication is performed by three separate model compilers that are hard-coded to a specific domain (Translator 1a, Translator 2a, and Translator 3a), whereas the replication in Figure 3-5 is carried out by a single model transformation engine that is capable of performing replication on any modeling language. Approach A3 provides the benefit of a higher-level scripting language that can be generalized through parameterization to capture the intent of the replication process.
Our most recent efforts have explored this third technique for model replication on several existing modeling languages.

Figure 3-5 - Replication using the model transformation engine C-SAW (A3)

3.3.5 Scaling the System Integration Modeling Language (SIML)

In this section, the concept of model replication is demonstrated on an example modeling language that was created in the GME for the computational physics domain. The purpose of introducing this case study is to illustrate how our model transformation approach supports scalability among SIML models that contain multiple hierarchies. The System Integration Modeling Language (SIML) is a modeling language developed by Shetty et al. at Vanderbilt University to specify configurations of large-scale fault-tolerant data processing systems used to conduct high-energy physics experiments [Shetty et al., 05]. SIML was developed to model a large-scale real-time physics system at Fermi National Accelerator Laboratory (FermiLab) [Fermi, 07] for characterizing the subatomic particle interactions that take place in a high-energy physics experiment. SIML models capture system structure, target system resources, and autonomic behavior. System generation technology is used to create from these models the software that implements communication between components with custom data type marshalling and demarshalling, system startup and configuration, fault-tolerant behavior, and autonomic procedures for self-correction [Shetty et al., 05]. A system model expressed in SIML captures components and relationships at the systems engineering level. The features of SIML are hierarchical component decomposition and dataflow modeling with point-to-point and publish-subscribe communication between components. There are several rules defined by the SIML metamodel:

• A system model may be composed of several independent regions
• Each region model may be composed of several independent local process groups
• Each local process group model may include several primitive application models
• Each system, region, and local process group must have a representative manager that is responsible for mitigating failures in its area

A local process group is a set of processes that run the critical tasks needed to perform the system's overall function. In a data processing network, a local process group would include the set of processes that execute the algorithmic and signal processing tasks, as well as the data processing and transport tasks. A region is simply a collection of local process groups, and a system is defined as a collection of regions and possibly other supporting processes. These containment relationships lead to the multiple hierarchical structures of SIML models. A simple SIML base model is shown on the left side of Figure 3-6, which captures a system composed of one region and one local process group in that region (shown as an expansion of the parent region), utilizing a total of 15 physical modeling elements (several elements are dedicated to supporting applications not included in any region).

Scalability Issues in SIML: In order to plan, deploy, and refine a high-energy physics data processing system, designers manually build a multitude of different SIML models that are subject to a variety of outside and changing constraints (e.g., the current availability of hardware, software, or human resources).
An example of this process would be the manual creation of separate 16-, 32-, 64-, and 128-node versions of a baseline system used for bandwidth and latency testing purposes. This would later be followed by the creation of a set of significantly larger SIML models, where the final system model could incorporate as many as 2,500 local processing groups. Each of these models would undergo a variety of analysis routines to determine several key properties of the system. These analyses include system throughput, network/resource utilization, and worst-case managerial latency (the latency between managers and subordinates is crucial in evaluating the fault tolerance of the system). The results of these analyses may vary greatly as the structure of the system model is scaled in different ways.

Figure 3-6 - Visual Example of SIML Scalability

The number of system configurations that can be created using this process is directly proportional to the time and effort available to system designers to create valid system models using a manual approach. In practice, SIML models have been scaled to 32- and 64-node models. However, the initial scaling in these cases was performed manually. The ultimate goal of the manual process was to scale to 2,500 nodes. After 64 nodes, it was determined that scaling to further nodes would be too tedious to perform without proper automation through improved tool support. Even with just a small expansion, the manual application of the same process would require an extraordinary amount of manual effort (e.g., much mouse-clicking and typing) to bring about the requisite changes, and would increase the potential for introducing errors into the model (e.g., forgetting to add a required connection). If the design needs to be scaled forward or backward, a manual approach would require additional effort that would make the exploration and analysis of design alternatives impractical. Therefore, a significant victory for design agility can be claimed if replicators can be shown to scale a base SIML model quickly and correctly into a variety of larger and more elaborate SIML models. This case study shows the benefits that result from scaling SIML models by applying an automated approach that exhibits the desired characteristics for replicators. To decide what scaling behaviors such a replicator needs to perform, the domain-specific knowledge and rules for creating a SIML model, as embodied in its metamodel, need to be captured. For example, there are one-to-many relationships between system and regional managers, and also one-to-many relationships between regional and local process group managers. These relationships are well-defined. Because the pattern of these relationships is known, it is feasible to write a replicator to perform automatic generation of additional local process groups and/or regions to create larger and more elaborate system models. In general, scaling up a system configuration using SIML can involve: 1) an increase in the number of regions, 2) an increase in the number of local process groups per region, or 3) both 1 and 2. Considering the SIML model in Figure 3-6, the system (which originally has one region with one local process group) is increased to nine regions with six local process groups per region.
Such replication involves the following tasks:

• Replication of the local process group models
• Replication of the entire region models and their contents
• Generation of communication connections between the regional managers and the newly created local managers
• Generation of additional communication connections between the system manager and the new regional manager processes

The scaled model is shown on the right side of Figure 3-6. This example scales to just 9 regions and 6 nodes per region simply because of the printed space available to visualize the figure.

ECL Transformation to Scale SIML: The scalability shown in Figure 3-6 can be performed by a replicator, which is a model transformation specification written in ECL, as shown in Listing 3-3. As a point of support for the effectiveness of replicators as transformations, this ECL specification was written in less than an hour by a user who was very familiar with ECL, but had studied the SIML metamodel for only a few hours. The ECL transformation specification is composed of an aspect and several strategies. In Listing 3-3, the aspect Start (Line 1) invokes two strategies, scaleUpNode and scaleUpRegion, in order to replicate the local process group node (i.e., L2L3Node) within the region model, and the region itself. The strategy scaleUpNode (Line 7) discovers the "Region" model, sets up the context for the transformation, and calls the strategy addNode (Line 12) that recursively increases the number of local process group nodes. The new node instance is created on Line 18, which is followed by the construction of the communication connections between ports, regional managers and the newly created nodes (Lines 21 to 23). Some other connections are omitted here for the sake of keeping the listing concise. Two other strategies, scaleUpRegion (Line 29) and addRegion (Line 34), follow a similar mechanism. The process of modeling systems using SIML illustrates the benefits of replicators by providing an automated technique that uses transformations to scale models in a concise and flexible manner. Because of the multiple hierarchies of SIML models, replication usually needs to be performed on all the elements associated with containment relationships within a model. To perform a scaling task across multiple model hierarchies, ECL supports model navigation through its model querying and selection operations. A model navigation concern can be specified concisely in the ECL. For example, Line 9 of Listing 3-3 is a declarative statement for finding all the region models by navigating from the root folder to the system model, which calls three querying operations: rootFolder(), findFolder() and findModel(). Also, flexibility of the replicator can be achieved in several ways. Lines 3 and 4 of Listing 3-3 specify the magnitude of the scaling operation, as well as the names of the specific nodes and regions that are to be replicated. In addition to these parametric changes that can be made easily, the semantics of the replication can be changed because the transformation specification can be modified directly by an end-user. This is not the case in approaches A1 and A2 from Section 3.3.3, because there the replication semantics are hard-coded into the model compiler.
1  aspect Start()
2  {
3    scaleUpNode("L2L3Node", 5);   //add 5 L2L3Nodes in the Region
4    scaleUpRegion("Region", 8);   //add 8 Regions in the System
5  }
6
7  strategy scaleUpNode(node_name : string; max : integer)
8  {
9    rootFolder().findFolder("System").findModel("Region").addNode(node_name, max, 1);
10 }
11
12 strategy addNode(node_name, max, idx : integer)   //recursively add nodes
13 {
14   declare node, new_node, input_port, node_input_port : object;
15   if (idx <= max) then
16     node := rootFolder().findFolder("System").findModel(node_name);
17
18     new_node := addInstance("Component", node_name, node);
19
20     //add connections to the new node; three similar connections are omitted here
21     input_port := findAtom("fromITCH");
22     node_input_port := new_node.findAtom("fromITCH");
23     addConnection("Interaction", input_port, node_input_port);
24
25     addNode(node_name, max, idx+1);
26   endif;
27 }
28
29 strategy scaleUpRegion(reg_name : string; max : integer)
30 {
31   rootFolder().findFolder("System").findModel("System").addRegion(reg_name, max, 1);
32 }
33
34 strategy addRegion(region_name, max, idx : integer)   //recursively add regions
35 {
36   declare region, new_region, out_port, region_in_port, router, new_router : object;
37   if (idx <= max) then
38     region := rootFolder().findFolder("System").findModel(region_name);
39     new_region := addInstance("Component", region_name, region);
40
41     //add connections to the new region; four similar connections are omitted here
42     out_port := findModel("TheSource").findAtom("eventData");
43     region_in_port := new_region.findAtom("fromITCH");
44     addConnection("Interaction", out_port, region_in_port);
45
46     //add a new router and connect it to the new region
47     router := findAtom("Router");
48     new_router := copyAtom(router, "Router");
49     addConnection("Router2Component", new_router, new_region);
50
51     addRegion(region_name, max, idx+1);
52   endif;
53 }

Listing 3-3 - ECL specification for SIML scalability

In addition to the examples discussed in this section, replication strategies have also been developed for the Event QoS Aspect Language (EQAL) and the Stochastic Reward Net Modeling Language (SRNML). EQAL has been used to configure a large collection of federated event channels for mission computing avionics applications. Replication within EQAL was reported in [Gray et al., 06]. SRNML has been used to describe performability concerns of distributed systems built from middleware patterns-based building blocks. Replication within SRNML was reported in [Lin et al., 07-a]. These two case studies are presented in Appendix C. To conclude, replicating a hierarchical model requires that a model transformation language like ECL provide the capability to traverse models, the flexibility to change the scale of replication, and the computational power to change the data attributes within a replicated structure.

3.4 Aspect Weaving with C-SAW

When a concern spreads across a model hierarchy, a model becomes difficult to comprehend and change. Currently, the most prominent work in aspect modeling concentrates on notational aspects for UML [France et al., 04], but tools could also provide automation using AOSD principles. Originally, one motivation for developing C-SAW was the need to specify constraints that crosscut the model of a distributed real-time embedded system [Gray et al., 01]. In the initial stage of this research, C-SAW was used in weaving crosscutting changes into the Embedded Systems Modeling Language (ESML), which is introduced in the next section.
3.4.1 The Embedded Systems Modeling Language

The Embedded Systems Modeling Language (ESML), designed by the Vanderbilt DARPA MoBIES team, is a domain-specific graphical modeling language for modeling real-time mission computing embedded avionics applications [Shetty et al., 05]. ESML has been defined within the GME and provides the following modeling categories to allow representation of an embedded system: a) Components, b) Component Interactions, and c) Component Configurations. Bold Stroke is a product-line architecture developed by Boeing to support families of mission computing avionics applications for a variety of military aircraft [Sharp, 00]. It is a very complex framework, with several thousand components implemented in over three million lines of C++ source code. There are over 20 representative ESML projects for all of the Bold Stroke usage scenarios that have been defined by Boeing. For each specific scenario within Bold Stroke, the components and their interactions are captured by an event channel that is specified by an ESML model. There are a number of crosscutting model properties in ESML models, as shown in Figure 3-7. The top of Figure 3-7 shows the interaction among components in a mission-computing avionics application modeled in the ESML. The model illustrates a set of avionics components (Global Positioning Satellite and navigational display components, for example) that collaborate to process a video stream that provides a pilot with real-time navigational data. The middle of the figure shows the internal representation of two components, which reveals the data elements and other constituents intended to describe the infrastructure of component deployment and the distribution middleware. The infrastructure implements an event-driven model, in which components update and transfer data to each other through event notification and callback methods. Among the components in Figure 3-7 are a concurrency atom and two data atoms (circled). Each of these atoms represents a system concern that spreads across the model hierarchy. The concurrency atom (in the red circle with *) identifies a system property that corresponds to the synchronization strategy distributed across the components. The collection of atoms (in the blue circle with #) defines the recording policy of a black-box flight data recorder. Some data elements also have an attached precondition (in the green circle with @) to assert a set of valid values when a client invokes the component at runtime.

Figure 3-7 - A subset of a model hierarchy with crosscutting model properties. Concerns related to synchronization (in red circle with *), black-box data recording (in blue circle with #), and preconditions (in green circle with @) are scattered across many submodels.

To analyze the effect of an alternative design decision manually, model engineers must change the synchronization or flight data recorder policies, which requires making the change manually at each component's location. The partial system model in Figure 3-7 is a subset of an application with more than 6,000 components. Manually changing a policy will strain the limits of human ability in a system that large. With ECL, model engineers simply define a modeling aspect to specify the intention of a crosscutting concern. An example is given in the following subsection.
3.4.2 Weaving Concurrency Properties into ESML Models

There are several locking strategies available in Bold Stroke to address concurrency control (e.g., Internal Locking and External Locking). Internal Locking requires the component to lock itself when its data are modified, and External Locking requires the user to acquire the component's lock prior to any access of the component. However, existing Bold Stroke component models lack the ability to modularize and specify such concurrency strategies. Figure 3-8 illustrates the ESML modeling capabilities for specifying the internal configuration of a component. The "BM_ClosedEDComponentImpl" is shown in this figure. For this component, the facet/receptacle descriptors and event types are specified, as well as internal data elements, but it does not have any elements that represent the concurrency mechanisms. A model transformation may be used to weave concurrency representations into these ESML component models. This transformation task can be described as follows: Insert two concurrency atoms (one internal and one external lock) into each model that has at least one data atom. To perform this transformation manually on over 20 existing component models would be time consuming and susceptible to errors. However, C-SAW can automate such a model evolution task based on an ECL transformation specification.

Figure 3-8 - Internal representation of a Bold Stroke component

To weave concurrency atoms into existing component models, an ECL transformation specification is defined as shown in Listing 3-4. The Start aspect defines a collection of component models whose names end with "Impl." The AddConcurrency strategy is then performed on the collection of models meeting the Start selection criteria. AddConcurrency is represented by the following tasks: for any component model that has at least one Data atom, create two Concurrency atoms within the model, representing Internal Locking and External Locking.

1  strategy AddConcurrency()
2  {
3    declare concurrencyAtom1, concurrencyAtom2 : atom;
4
5    if (atoms()->select(a | a.kindOf() == "Data")->size() >= 1) then
6      //add the first concurrency atom
7      concurrencyAtom1 := addAtom("Concurrency", "InternalConcurrency");
8      concurrencyAtom1.setAttribute("Enable", "1");
9      concurrencyAtom1.setAttribute("LockType", "Thread Mutex");
10     concurrencyAtom1.setAttribute("LockStrategy", "Internal Locking");
11
12     //add the second concurrency atom
13     concurrencyAtom2 := addAtom("Concurrency", "ExternalConcurrency");
14     concurrencyAtom2.setAttribute("Enable", "1");
15     concurrencyAtom2.setAttribute("LockType", "Thread Mutex");
16     concurrencyAtom2.setAttribute("LockStrategy", "External Locking");
17   endif;
18 }
19
20 aspect Start()
21 {
22   rootFolder().findFolder("ComponentTypes").models()
23     ->select(m | m.name().endWith("Impl"))->AddConcurrency();
24 }

Listing 3-4 - ECL specification to add concurrency atoms to ESML models

In the Start aspect, the rootFolder() function first returns the root folder of the modeling project. Next, findFolder() is used to return a folder named "ComponentTypes" under the root folder. Then, models() is used to find all the models in the "ComponentTypes" folder. Finally, the select() operator is used to select all the models whose names end with "Impl." The AddConcurrency strategy is then applied to the resulting collection. In the AddConcurrency strategy, line 5 of Listing 3-4 determines whether there exist any Data atoms in the current context model.
If such atoms exist, the first Concurrency atom is created and its attributes are assigned appropriate values in lines 7 through 10, and the second Concurrency atom is created and defined in lines 13 through 16.

Figure 3-9 - The transformed Bold Stroke component model

Considering the component model shown in Figure 3-8 as one of the source models for this transformation, the resulting target model is shown in Figure 3-9. After this transformation is accomplished, two concurrency atoms, representing internal locking and external locking, are inserted into the source model. As a result, all the existing models can be adapted to capture concurrency mechanisms rapidly without extensive manual operations. The ECL is essential to automating such a transformation process.

3.5 Experimental Validation

Experimental validation of the contributions described in this dissertation, in terms of the ability to enable model evolution, has been performed by evaluating the research on large-scale models in various domains such as computational physics, middleware, and mission computing avionics. This section outlines the artifacts available for experimentation, as well as the assessment questions and measurement metrics that were used to evaluate the research results.

3.5.1 Modeling Artifacts Available for Experimental Validation

The modeling artifacts available for experimental validation are primarily from two sources. One source is Vanderbilt University, a collaborator on much of the C-SAW research, which has provided multiple modeling artifacts as experimental platforms. The other source is Escher [Escher, 07], an NSF-sponsored repository of modeling artifacts developed from DARPA projects that are available for experimentation. As discussed in Section 3.4, C-SAW was used to support change evolution of Bold Stroke component models, which were defined in the Embedded Systems Modeling Language (ESML). This experiment assisted in transforming legacy code [Gray et al., 04-b]. Another modeling artifact is the System Integration Modeling Language (SIML), as discussed in Section 3.3; C-SAW was used to support system scalability issues with SIML. The Event QoS Aspect Language (EQAL) [Edwards, 04] is a modeling language from Vanderbilt that is used to graphically specify publisher-subscriber service configurations for large-scale DRE systems. The EQAL modeling environment consists of a GME metamodel that defines the concepts of publisher-subscriber systems, in addition to several model compilers that synthesize middleware configuration files from models. The EQAL model compilers automatically generate the publisher-subscriber service configuration files and component property description files needed by the underlying middleware. EQAL was also used for experimentation on model scalability with C-SAW [Gray et al., 05]. Other modeling artifacts include the Stochastic Reward Net Modeling Language (SRNML) [Lin et al., 07-a] and the Platform-Independent Component Modeling Language (PICML) [Balasubramanian et al., 06-a]. SRNML has been used to describe performability concerns of distributed systems built from middleware patterns-based building blocks. PICML is a domain-specific modeling language for developing component-based systems. Case studies on using C-SAW to support model evolution of EQAL and SRNML are given in Appendix C.

3.5.2 Evaluation Metrics for Project Assessment

Experimental validation of this research has been based on various experimental evaluations.
A set of metrics was used in the research validation. Domain generality is used to demonstrate that the C-SAW transformation engine is DSML-independent and able to perform a variety of modeling tasks. This can be assessed by determining how much customization effort is needed to adapt the C-SAW model transformation approach for each new modeling language. Other metrics were assessed to determine the amount of effort and cost required to apply C-SAW to evolve models, and the effect of using C-SAW. This set of metrics includes productivity and accuracy. Productivity assessment is used to determine the ability of C-SAW to reduce the effort (measured as the amount of time) in developing model transformations to perform model evolution tasks, compared to a manual model evolution process (i.e., using the editing operations in a modeling environment such as the GME). Accuracy is an assessment of C-SAW's ability to reduce errors in performing model evolution tasks compared to a manual process. The expected benefits of the model transformation approach and its supporting tools are improved productivity and increased accuracy. Experimental validation was conducted by observing the level of effort expended in applying the results of the research to evolve the model artifacts identified in Section 3.5.1, and the correctness of the result. These metrics provide an indication of the success of the research as it relates to the ability to evolve domain models.

3.5.3 Experimental Result

Experimental results for validating the research come from the feedback and observations gathered during the application of C-SAW to support automated evolution of models on several different experimental platforms. As an initial result, this work was experimentally applied to a mission computing avionics application provided by Boeing, where C-SAW was used to evolve component models to transform the code base through an approach called Model-Driven Program Transformation (MDPT) [Gray et al., 04-b], [Zhang et al., 04] (Note: MDPT is not a contribution of this research, but illustrates an application of C-SAW). On this experimental platform, C-SAW was used to weave the concurrency mechanisms, synchronization and flight data recorder policies into component models specified in ESML in order to adapt the modeled systems to new requirements. These concerns are usually spread across the model hierarchy and are hard to address using a manual approach. In addition, C-SAW was applied to component-based distributed system development by weaving deployment aspects into component models specified in PICML. Using C-SAW to compose deployment specifications from component models not only facilitates modifications to the model in the presence of a large number of components, but also gives assurance that changes to model elements keep the model in a consistent state [Balasubramanian et al., 06]. More recently, C-SAW has been used to address the important issue of model scalability [Gray et al., 05], [Lin et al., 07-a] in SIML (as mentioned in Section 3.3), EQAL and SRNML (as discussed in Appendix C). The feedback from these experiments provides two categories of results that demonstrate the benefits of using C-SAW:

• The first category of results relates to domain generality. C-SAW has been designed and implemented as a modeling-language independent model transformation engine.
Currently, C-SAW can be used with any modeling language that conforms to a GME metamodel, and it is able to support any kind of model evolution in which the source model and the target model belong to the same metamodel. Thus, C-SAW can be applied to various domains, without any customization effort, to perform a variety of modeling tasks such as aspect weaving [Gray et al., 06], model scalability [Lin et al., 07-a] and model refactoring [Zhang et al., 05-a]. This demonstrates that the C-SAW approach meets the quality measurement of domain generality. Also, C-SAW is implemented as a GME plug-in. It is easy to install by registering the freely-available component (please see the conclusion of this chapter for the URL), without the need to install any other libraries or software. This installation process is the same for all modeling languages.

• The second category of results was evaluated to determine the degree to which the C-SAW approach improves productivity and increases accuracy for model evolution. As observed, compared to a manual approach of making changes using the editing operations of the GME, using C-SAW not only reduces the time significantly but also decreases the potential for errors. For example, SIML models have been scaled by hand to 32 and 64 nodes. After 64 nodes, the manual process deteriorated, taking several days and introducing multiple errors. Using C-SAW, SIML models have been scaled up to 2,500 nodes within a few minutes; flexibility for scaling up or down can also be achieved through parameterization. For a user familiar with ECL but unfamiliar with the domain, the time to create a model transformation is often less than 1.5 hours. Moreover, an ECL specification may be tested for correctness (e.g., using the testing engine described in Chapter 5) on a single model before it is applied to a large collection of models, which helps to reduce the potential errors of C-SAW transformations. In contrast, a manual process requires an overwhelming amount of ad hoc mouse clicking and typing, which makes errors easy to introduce.

To conclude, these results have preliminarily demonstrated that C-SAW is an effective tool to assist in model evolution in various domains for specific types of transformations.

3.6 Related Work

The general area of related work concerns model transformation approaches and applications, especially modeling research and practice that provides the ability to specify model evolution concerns addressing issues such as model scalability and adaptation. This section provides an overview of work related to these issues.

3.6.1 Current Model Transformation Techniques and Languages

A large number of approaches to model transformation have been proposed by both academic and industrial researchers, and there are many model transformation tools available (example surveys can be found in [Czarnecki and Helsen, 06], [Mens and Van Gorp, 05], [Sendall and Kozaczynski, 03]). In general, there are three different approaches for defining transformations, as summarized from [Sendall and Kozaczynski, 03]:

• Direct Model Manipulation – developers access an internal model representation and use a general-purpose programming language (GPL) to manipulate the representation through a set of procedural APIs provided by the host modeling tool. An example is Rational XDE, which exposes an extensive set of APIs to its model server that can be used from Java, VB or C++ [Rose, 07].
Another example is the GME, which offers the BON (Section 3.2.5) as a set of APIs in Java and C++ to manipulate models [GME, 07].

• Intermediate Representation – a modeling tool exports the model into an intermediate representation format (e.g., XML). Transformations can then be performed on the exported model by an external transformation tool (e.g., XSLT [XSLT, 99]), and the output models can be imported back into the host modeling tool. An example is OMG's Common Warehouse Metamodel (CWM) Specification [CWM, 07] with transformations implemented using XSLT. In fact, the original implementation of the ECL performed transformations using XSLT on GME models exported as XML [Gray et al., 01], [Gray, 02].

• Specialized Transformation Language – a specialized transformation language provides a set of constructs for explicitly specifying the behavior of a transformation. A transformation specification can typically be written more concisely than direct manipulation with a GPL.

The direct manipulation approach provides APIs that may be familiar to programmers using Java or C++ frameworks, but may not be familiar to end-users handling high-level modeling notations in specific domains. The disadvantage of APIs used from a GPL is that they lack high-level abstraction constructs to specify model transformations, so end-users have to deal with the accidental complexity of the low-level GPL. The advantage of the XSLT-based approach is that XSLT is an industry standard for transforming XML, where the XML Metadata Interchange (XMI) [XMI, 07] is used to represent models. However, XSLT requires experience and considerable effort to define even simple model transformations [Sendall and Kozaczynski, 03]. Manual implementation of model transformations in XSLT quickly leads to non-maintainable implementations because of the verbosity and poor readability of XMI and XSLT. The specialized transformation language approach provides a domain-specific language (DSL) [Mernik et al., 05] for describing transformations, which offers the most potential expressive power for transformations. Currently, numerous model transformation languages have been proposed by both academic and industrial researchers. These languages are used to define transformation rules and rule application strategies, and they can be either graphical or textual. Additionally, model transformation languages may be either imperative or declarative [Czarnecki and Helsen, 06]. There are two major kinds of model transformation specification languages: one is a graphical language typified by graph grammars (e.g., the Graph Rewriting and Transformation Language (GReAT) [Agrawal, 03], AToM3 [Vangheluwe and De Lara, 04] and Fujaba [Fujaba, 07]); the other is a hybrid language (e.g., the Atlas Transformation Language (ATL) [Bézivin et al., 04], [Kurtev et al., 06] and Yet Another Transformation Language (YATL) [Patrascoiu, 04]). The distinguishing features of these two language categories are summarized as follows:

• Graphical transformation language: In this approach, models are treated as graphs and model transformations are specified as graph transformations. Graph transformations are realized by the application of transformation rules, which are rewriting rules for graphs specified as graph grammars. The left-hand side (LHS) of a transformation rule is a graph to match, and the right-hand side (RHS) is a replacement graph.
If a match is found for the LHS graph, then the rule is fired, which results in the matched sub-graph of the graph under transformation being replaced by the RHS graph. In such a language, graphical notations are provided to specify graph patterns, model transformation rules and the control flow of transformation execution. Compared to a textual language, a graphical language is efficient in communicating graph patterns. However, it can be tedious to use purely graphical notations to describe complicated computation algorithms. As a result, generation to a separate language may be required to apply and execute the transformations.

• Hybrid transformation language: This approach combines declarative and imperative constructs. Declarative constructs are used to specify source and target patterns as transformation rules (e.g., filtering model elements), and imperative constructs are used to implement sequences of instructions (e.g., assignment, looping and conditional constructs). However, embedding predefined patterns renders complicated syntax and semantics for a hybrid language [Kurtev et al., 06].

Many existing model transformation languages (including those discussed above) allow transformations to be specified between two different domains (e.g., a transformation that converts a UML model into an entity-relationship model). The ECL can be distinguished from these approaches as a relatively simple and easy-to-learn language that focuses on specifying and executing endogenous transformations, where the source and target models belong to the same domain. However, it has full expressive power for model replication and aspect modeling because these tasks can be specified as endogenous transformations.

3.6.2 Related Work on Model Scalability

Related work on model scalability contributes to the ability to specify and generate instance-based models with repetitive structures. The approach proposed by Milicev [Milicev, 02] uses extended UML Object Diagrams to specify the instances and links of a target model that is created during automatic translation; this target model is called the domain mapping specification. An automatically generated model transformer is used to produce intermediate models, which are refined into final output artifacts (e.g., C++ code). Similar to the ideas presented in this dissertation, Milicev adopts a model transformation approach whereby users write and execute transformation specifications to produce instance-based models. However, different from our approach, Milicev's work is domain-dependent because the domain mapping specifications and model transformers are domain-specific. Moreover, the target of his work is the reusability of code generators built into existing modeling tools, which introduces an intermediate model representation to bridge multiple abstraction levels (e.g., from model to code). The target of our approach is to use existing model transformation tools, which support model-to-model transformations at the same abstraction level. Model transformations can be written that perform model replication in a domain-independent manner, without any effort toward extending existing model representations across different abstraction levels. Several researchers have proposed standard notations to represent the repetitive structures found in modeling real-time and embedded systems, which is helpful in discovering possible model replication patterns.
The MARTE (Modeling and Analysis of Real-Time and Embedded systems) request for proposals, issued by the OMG in February 2005, solicited submissions for a UML profile that adds capabilities for modeling Real-Time and Embedded Systems (RTES), and for analyzing schedulability and performance properties of UML specifications. One of the particular requests of MARTE concerns the definition of common high-level modeling constructs for factoring the repetitive structures of software and hardware. Cuccuru et al. [Cuccuru et al., 05] proposed multidimensional multiplicities and mechanisms for the description of regular connection patterns between model elements. However, these proposed patterns are at an initial stage and have not been adopted by any existing modeling tools. Their proposal mainly works with models containing repetitive elements that are identical, but may not cover all of the model replication situations identified in this dissertation (e.g., representing models with a collection of model elements of the same type that are slightly different in some properties, or that have similar but not identical relationships with their neighbors). As such research matures, we believe significant results will be achieved toward the representation of repetitive, or even merely similar, model structures. Such maturity will contribute to standardizing and advancing model replication capabilities. The C-SAW approach advocates using existing model transformation techniques and tools to address model scalability, especially where modeling languages lack support for dynamic creation of model instances and links. This investigation into the application of model transformations to address scalability concerns extends the application area of model transformations. The practice and experiences illustrated in this chapter help to motivate the need for model scalability.

3.7 Conclusion

The goal of the research described in this chapter is to provide a model transformation approach to automate model evolution. This chapter presents the major extensions to the ECL model transformation language and its associated engine C-SAW to address model evolution concerns that relate to important system-wide issues such as scalability and adaptability. ECL is a transformation language for specifying various types of evolution tasks in modeling, such as scalability concerns that allow a model engineer to explore design alternatives. C-SAW has been implemented as a GME plug-in that executes ECL specifications within the GME, which enables weaving changes into GME domain models automatically. Experimental validation is also discussed in this chapter to assess the benefits and effectiveness of the C-SAW approach in automating model evolution. Several large-scale models available from the Escher Institute [Escher, 07] were used as experimental platforms. These models have several thousand modeling elements in various domains such as computational physics, middleware, and mission computing avionics. As an experimental result, the C-SAW transformation engine has been applied to support automated evolution of models on several of these different experimental platforms. In particular, C-SAW has been used to address the important issue of model scalability for exploring design alternatives, as well as crosscutting concerns for model adaptation and evolution.
The observation and feedback from the usage of C-SAW have demonstrated that C-SAW not only helps to reduce the human effort in model evolution, but also helps to improve its correctness. Other benefits provided by C-SAW include modeling language independence and the capability to perform various types of model evolution tasks. The C-SAW plug-in downloads, publications, and video demonstrations are available at the project website: http://www.cis.uab.edu/gray/Research/C-SAW/

To improve the correctness of a model transformation specification, a model transformation testing approach, as discussed in Chapter 5, provides support for testing model transformation specifications, which requires model comparison techniques. As another contribution of this research, the next chapter presents the algorithms and tool support, called DSMDiff, for model comparison.

CHAPTER 4

DSMDIFF: ALGORITHMS AND TOOL SUPPORT FOR MODEL DIFFERENTIATION

This chapter describes the contribution of this dissertation on model differentiation. It begins with a brief discussion of the need for model differentiation, followed by detailed discussions of the limitations of current techniques. The problem of model differentiation is formally defined and the key issues are identified. The core of this chapter presents the developed model differentiation algorithms and the associated tool, called DSMDiff, including an analysis of the non-structural and structural information of model elements, a formal representation of models, and the details of the algorithms. This chapter also motivates the importance of visualizing model differences in a manner that can be comprehended by a model engineer. The chapter provides an evaluation of the algorithms and concludes with an overview of related work and a summary.

4.1 Motivation and Introduction

As MDE is emerging as a software development paradigm that promotes models as first-class artifacts to specify properties of software systems at a higher level of abstraction, the capability to identify mappings and differences between models, which is called model differentiation, model differencing or model comparison, is essential to many model development and management practices [Cicchetti et al., 07]. For example, model differentiation is needed in a model-aware version control system to trace the changes between different model versions and understand the evolution history of the models. Model comparison techniques and tools may help maintain consistency between different views of a modeled system. In particular, model differentiation is needed in the model transformation testing research discussed in Chapter 5 to assist in testing the correctness of model transformations by comparing the expected model and the resulting model after applying a transformation ruleset. Although there exist many techniques for differentiating text files (e.g., source code and documentation) and structured data (e.g., XML documents), such tools either operate under a linear file-based paradigm that is purely textual (e.g., the Unix diff tool [Hunt and McIlroy, 75]) or perform comparison on a tree structure (e.g., the XMLDiff tool [Wang et al., 03]). However, models are structurally represented as graphs and are often rendered in a graphical notation. Thus, there is a structural mismatch between currently available text-based differentiation tools and the graphical nature of models.
Furthermore, from our experience, large models can contain several thousand modeling elements, which makes a manual approach to model differentiation infeasible. To address these problems, more research is needed to explore automated differentiation algorithms and supporting tools that may be applied to models with graphical structures.

4.2 Problem Definition and Challenges

Theoretically, generic model comparison is similar to the graph isomorphism problem, which is known to be in NP [Garey and Johnson, 79]. Some research efforts aim to provide generic model comparison algorithms, such as the Bayesian approach, which initially provided diagram matching solutions for architectural models and data models [Mandelin et al., 06]. However, the computational complexity of general graph matching algorithms is the major hindrance to applying such algorithms to practical applications in modeling. Thus, it is necessary to loosen the constraints on graph matching to find solutions for model comparison. A typical solution is to provide differentiation techniques that are specific to a particular modeling language, where the syntax and semantics of that language help to handle conflicts during model matching. Currently, many types of modeling languages exist. In particular, the UML is a popular object-oriented modeling language, and the majority of investigations into model differentiation focus on UML diagrams [Ohst et al., 03], [Xing and Stroulia, 05]. Alternatively, DSM [Gray et al., 07] is an emerging MDE methodology that generates customized modeling languages and environments from metamodels that define a narrow domain of interest. Distinguished from UML, which is a general-purpose modeling language, DSMLs aim to specify the solution directly using rules and concepts familiar to end-users of a particular application domain. There are two main differences between domain-specific models and UML diagrams: 1) UML diagrams have a single definition for syntax and static semantics (i.e., a single metamodel), whereas domain-specific models vary significantly in their structures and properties because their syntax and static semantics are defined in different metamodels, which correspond to different DSMLs customized for specific end-users; 2) domain-specific models are usually instance-based models (e.g., large domain-specific system models often have repetitive and nested hierarchical structures and may contain large quantities of objects of the same type), whereas traditional UML diagrams are primarily class-based models. Thus, domain-specific models and UML diagrams differ in structure, syntax and semantics. New approaches are therefore required to analyze differences among domain-specific models. However, there has been little work reported in the literature on computing differences between domain-specific models that are visualized in a graphical concrete syntax. To address the problem of computing the differences between domain-specific models, the following issues need to be explored:

• What are the essential characteristics of domain-specific models and how are they defined?
• What information within domain-specific models needs to be compared, and what information is needed to support metamodel-independent model comparison?
• How is this information formalized within the model representation of a particular DSML?
• How are model mappings and differences defined to enable model comparison?
• What algorithms can be used to discover the mappings and differences between models?
• How can the result of model comparison be visualized to assist in comprehending the mappings and differences between two models?

4.2.1 Information Analysis of Domain-Specific Models

To develop algorithms for model differentiation, one critical question is whether to determine that two models are syntactically equivalent or that they are semantically equivalent. Because the semantics of most modeling languages are not formally defined, the developed algorithms only determine whether two models are syntactically equivalent (note that this is not a serious limitation when compared to other differentiation methods: the large majority of differentiation techniques offer syntactic comparison only, especially those focused on detecting textual differences). To achieve this, a model comparison algorithm must be informed by the syntax of a specific DSML. Thus, this section discusses how the syntax of a DSML is defined and what essential information is embodied in that syntax. As discussed in Chapter 2, metamodeling is a common technique for conceptualizing a domain by defining the abstract syntax and static semantics of a DSML. A metamodel defines a set of modeling elements and their valid relationships that represent certain properties of a specific domain. The GME [Lédeczi et al., 01] is a meta-configurable tool that allows a DSML to be defined from a metamodel. Domain-specific models can be created using a DSML and may be translated into source code, or synthesized into data to be sent to a simulation tool. The algorithms presented in this chapter have been developed within the context of the GME, but we believe these algorithms can solve broader model comparison problems in other metamodeling tools, such as the ATLAS Model Management Architecture (AMMA) [Kurtev et al., 06], Microsoft's DSL tools [Microsoft, 05], MetaEdit+ [MetaCase, 07], and the Eclipse Modeling Framework (EMF) [Budinsky et al., 04]. There are three basic types of entities used to define a DSML in GME: atom, model and connection. An atom is the most basic type of entity, which cannot have any internal structure. A model is another type of entity, which can contain other modeling entities such as child models and atoms. A connection represents the relationship between two entities. Generally, the constructs of a DSML defined in a metamodel consist of a set of model entities, a set of atom entities and a set of connections. However, these three types of entities are generic to any DSML and provide domain-independent type information (called the type in GME terminology). Each entity (e.g., model, atom or connection) in a metamodel is given a name to specify the role that it plays in the domain. Correspondingly, the name that is defined for each entity in a metamodel represents the domain-specific type (called the kind in GME terminology), which end-users see when creating an instance model. Moreover, attributes are used to record state information and are bound to atoms, models, and connections. Thus, without considering its relationships to other elements, a model element is defined syntactically by its type, kind, name and a set of attributes. Specifically, type provides the meta information that determines the essential structure of a model element for any DSML (e.g., model, atom or connection) and is needed for metamodel-independent model differentiation.
Meanwhile, kind and name are specific to a given DSML and provide non-structural syntactical information to further assist in model comparison. Other syntactical information of a model element includes its relationships to other elements (i.e., connections to its neighbors), which may also distinguish the identity of modeling elements.

In summary, to determine whether two models are syntactically equivalent, model differentiation algorithms need to compare all the syntactical information between them. Such a set of syntactical information of a model element includes: 1) its type, kind, name and attribute information; and 2) its connections to other model elements. In addition, if the two models are containment models, the algorithms need to compare all the elements at all the levels.

There is other information associated with a model that either relates to the concrete syntax of a DSML (e.g., visualization specifications such as associated icon objects and their default layouts and positions) or to the static semantics of a DSML (e.g., constraints to define domain rules). The concrete syntax is not generally involved in model differentiation for the purpose of determining whether two models from the same DSML are syntactically equivalent (e.g., the associated icon of a model element is always determined by its kind information from the metamodel definition). Similarly, because the constraints are defined at the metamodel level in our case (i.e., models with the same kind hold the same constraints), they are not explicitly compared in model differentiation; instead, kind equivalence implies the equivalence of constraints.

4.2.2 Formalizing a Model Representation as a Graph

In order to design efficient algorithms to detect differences between two models, it is necessary to understand the structure of a model. Figure 4-1 shows a GME model and its hierarchical structure. According to its hierarchical containment structure, a model can be represented formally as a hierarchical graph that consists of a set of nodes and edges, which are typed, named and attributed. There are four kinds of elements in such a graph:

• Node. A node is an element of a model, represented as a 4-tuple (name, type, kind, attributes), where name is the identifier of the node, type is the corresponding metamodeling element for the node, kind is the domain-specific type, and attributes is a set of attributes that are predefined by the metamodel. There are two kinds of nodes:
  - Model node: a containment node that can be expanded at a lower level as a graph that consists of a set of nodes and a set of edges (i.e., a container). This kind of node is used to represent submodels within a model, which leads to multiple-level hierarchies of a containment model.
  - Atom node: an atomic node that cannot contain any other nodes (i.e., a leaf). This kind of node is used to represent atomic elements of a model.
• Edge. An edge is a 5-tuple (name, type, kind, src, dst), where name is the identifier of the edge, type is the corresponding metamodeling element for the edge, kind is the domain-specific type, src is the source node, and dst is the destination node. A connection can be represented as an edge.
• Graph. A directed graph consists of a set of nodes and a set of edges where the source node and the destination node of each edge belong to the set of nodes. A graph is used to represent an expanded model node.
• Root.
A root is the graph at the top level of a multiple-level hierarchy that represents the top of a hierarchical model.

Figure 4-1 - A GME model and its hierarchical structure

4.2.3 Model Differences and Mappings

The task of model differentiation is to identify the mappings and differences between two containment models at all hierarchical levels. In general, the comparison starts from the top level of the two containment models and then continues to the child submodels. At each level, the comparison between two corresponding models (i.e., one is defined as the host model, denoted as M1, and the other is defined as the candidate model, denoted as M2) always produces two sets: the mapping set (denoted as MS) and the difference set (denoted as DS). The mapping set contains all pairs of model elements that are mapped to each other between two models. The difference set contains all detected discrepancies between the two models. Before the details of the algorithms are presented, the definition of model mappings and differences is discussed.

A pair of mappings is denoted as Map (elem1, elem2), where elem1 is in M1 and elem2 is in M2, and may be a pair of nodes or a pair of edges. Map (elem1, elem2) is a bidirectional relationship that implies elem2 is the only mapped correspondence in M2 for elem1 in M1 based on certain matching metrics, and vice versa.

The difference relationship between two models is more complicated than the mapping relationship. The notations used to represent the differences between two models are editing operational terms that are considered more intuitive [Alanen and Porres, 03]. For example, a New operation implies creating a model element, a Delete operation implies removing a model element and a Change operation implies changing the value of an attribute. We define DS = M2 - M1, where M2 is compared to M1. DS consists of a set of operations that yields M2 when applied to M1. The "-" operator is not commutative.

There are several situations that could cause two models to differ. The first situation of model difference occurs when some modeling elements (e.g., nodes or edges in the graph representation) are in M2, but not in M1. We denote this kind of difference as New (e2), where e2 is in M2 but not in M1. The converse is another situation that could cause a difference (i.e., elements in M1 are missing in M2). We denote this kind of difference as Delete (e1), where e1 is in M1 but not in M2. These two situations arise from structural differences between the two models. A third difference can occur when all of the structural elements are the same, but a particular value of an attribute is different. We denote this difference as Change (e1, e2, f, v1, v2), where e1 in M1 and e2 in M2 are a pair of mapping elements, f is the feature name (e.g., the name of an attribute), v1 is the value of e1.f, and v2 is the value of e2.f. Thus, the difference set actually includes three sets: DS = {N, D, C}, where N is a set that contains all the New differences, D is a set that contains all the Delete differences, and C is a set that contains all the Change differences. This approach was initially defined in [Lin et al., 05] and extended in [Lin et al., 07-b].
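To make the preceding definitions concrete, the following minimal Python sketch shows one way the 4-tuple nodes and 5-tuple edges of Section 4.2.2 and the mapping and difference sets of Section 4.2.3 could be represented. DSMDiff itself is a GME component, not a Python program, so all class and field names here are illustrative assumptions rather than the actual implementation:

from dataclasses import dataclass, field

@dataclass
class Node:
    """A model element: the 4-tuple (name, type, kind, attributes)."""
    name: str
    type: str                                      # generic GME type: "model" or "atom"
    kind: str                                      # domain-specific type from the metamodel
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)   # non-empty only for Model nodes
    edges: list = field(default_factory=list)      # edges among this node's children

@dataclass
class Edge:
    """A connection: the 5-tuple (name, type, kind, src, dst)."""
    name: str
    type: str
    kind: str
    src: Node
    dst: Node

@dataclass
class DiffModel:
    """Pairs a host model M1 with a candidate model M2 and accumulates MS and DS."""
    m1: Node
    m2: Node
    mappings: list = field(default_factory=list)   # MS: Map(elem1, elem2) pairs
    new: list = field(default_factory=list)        # N: elements found only in M2
    delete: list = field(default_factory=list)     # D: elements found only in M1
    change: list = field(default_factory=list)     # C: (e1, e2, f, v1, v2) tuples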
4.3 Model Differentiation Algorithms

The model comparison algorithms developed as a part of the research described in this dissertation identify the mappings and differences between two containment models by comparing all the elements and their abstract syntactical information within these models. In general, the comparison starts from the two root models and then continues to the child submodels. At each level, two metrics (i.e., signature matching and structural similarity) are combined to detect the mapped nodes between a pair of models, and the remaining nodes are examined to determine all the node differences. Based on the results of node comparison, all the edges are computed to discover all the edge mappings and differences. To store the two models that need to be compared and the results of model comparison, a data structure called DiffModel is used. The structure of DiffModel contains a pair of models to be compared, a mapping set to store all the matched child pairs, and three difference sets to record the New, Delete, and Change differences.

4.3.1 Detection of Model Mappings

It is well-known that some model comparison algorithms are greatly simplified by requiring that each element have a persistent identifier, such as a universally unique identifier (UUID), which is assigned to a newly created element and will not be changed unless the element is removed [Ohst et al., 03]. However, such traceable links only apply to two models that are subsequent versions. In many modeling activities, model comparison is needed between two models that are not subsequent versions. A pair of corresponding model elements needs to share a set of properties, which can be a subset of their syntactical information. Such properties may include type information, which can be used to select the model elements of the same type from the candidates to be matched, because only model elements with the same type need to be compared. For example, in a Petri net model, a "place" node will not match a "transition" node. In addition to type information, identification information such as name is also important to determine mappings for domain-specific models. Therefore, a combination of syntactical properties for a node or an edge can be used to identify different model elements. Such properties are called the signature in DSMDiff, and are used as the first criterion to match model elements. Signature is a widely used term in much of the literature on structural data matching and may have different definitions [Wang et al., 03]. In our context, the signature of a node or an edge is a subset of its syntactical information, which is defined as follows:

• Node Signature is the concatenation of the type, kind and name of a node. Suppose v is a node in a graph. Signature (v) = /Type (v)/Kind (v)/Name (v). If a node is nameless, its name is set as an empty string.
• Edge Signature is the concatenation of the type, kind and name of the edge as well as of the signatures of its source node and destination node. Suppose e is an edge in a graph, src is its source node and dst is its destination node. Signature (e) = Signature (src)/Type (e)/Kind (e)/Name (e)/Signature (dst). If an edge is nameless, its name is set as an empty string.

Signature Matching

Signature matching can be defined as:

• Node Signature Matching: Given two models, M1 and M2, suppose v1 is a node in M1 and v2 is a node in M2. There is a node signature matching between v1 and v2 if Signature (v1) = Signature (v2), which means the two strings (i.e., the signature of v1 and the signature of v2) are textually equivalent.
• Edge Signature Matching: Given two models, M1 and M2, suppose e1 is an edge in M1 and e2 is an edge in M2.
There is an edge signature matching between e1 and e2 if Signature (e1) = Signature (e2), which means the two strings (i.e., the signature of e1 and the signature of e2) are textually equivalent.

A node v1 in M1 mapping to a node v2 in M2 implies that their name, type and kind are matched. An edge e1 in M1 mapping to an edge e2 in M2 implies that their name, type, kind, source node and destination node are all signature matched. Usually, nodes are the most significant elements in a model, and edge mappings also depend on whether their source and destination nodes match. Thus, DSMDiff first tries to match nodes that have the same signature. For example, to decide whether there is a node in M2 mapped to a node in M1 (denoted as v1), the algorithm first needs to find all the candidate nodes in M2 that have the same signature as v1 in M1. If only one candidate is found in M2, the identified candidate is considered a unique mapping for v1, and the two nodes are considered syntactically equivalent. If more than one candidate is found, the signature cannot identify a node uniquely. Therefore, v1 and its candidates in M2 will be sent for further analysis, where structural matching is performed.

Structural Matching

In some cases, signature matching alone cannot find the exact mapping for a given model element. During signature matching, one node in M1 may have multiple candidates in M2. To find a unique mapping from these candidates, DSMDiff uses structural similarity as another criterion. The metric used for determining structural similarity between a node and its candidates is called edge similarity, which is defined as follows:

Edge Similarity: Given two models, M1 and M2, suppose v1 is a node in M1 and v2 is one of its candidate nodes in M2. The edge similarity of v2 to v1 is the number of edges connecting to v2 that are each signature matched to a distinct edge connecting to v1.

During structural matching, if DSMDiff can find a candidate that has the maximal edge similarity, this candidate becomes the unique mapping for the given node. If it cannot find this unique mapping using edge similarity, one of the candidates will be selected as the host node's mapping, following the assumption that there may exist a set of identical model elements.

Listing 4-1 presents the algorithm to find the candidate node with maximal edge similarity for a given host node from a set of candidate nodes. It takes the host node (i.e., hostNode) and a set of candidate nodes of M2 (i.e., candidateNodes) as input, computes the edge similarity of every candidate node and returns a candidate with maximal edge similarity. Listing 4-2 gives the algorithm for computing edge similarity between a candidate node and a host node. It takes two maps as input: hostConns stores all the incoming and outgoing edges of the host node indexed by their edge signature, and candConns stores all the incoming and outgoing edges of the candidate node indexed by their edge signature. By examining the mapped edge pairs between these two maps, the algorithm computes the edge similarity as output.

Name: findMaximalEdgeSimilarity
Input: hostNode, candidateNodes
Output: maximalCandidate
1. Initialize two maps, hostConns and candConns, and set maxSimilarity = 0 and maximalCandidate = null;
2. Store each edge signature and the number of associated edges of the hostNode in the map hostConns;
3. For each candidate c in candidateNodes
   1) Store each of its edge signatures and the number of associated edges in the map candConns;
   2) Call computeEdgeSimilarity(hostConns, candConns) to compute the edge similarity of c to hostNode;
   3) If (the computed similarity > maxSimilarity)
        maxSimilarity = the computed similarity;
        maximalCandidate = c;
4. Return maximalCandidate;

Listing 4-1 - Finding the candidate of maximal edge similarity

Name: computeEdgeSimilarity
Input: hostConns, candConns
Output: similarity
1. Initialize similarity as zero;
2. For each edge signature in the map hostConns
   1) Get the number of the edges associated with the edge signature as hostCount;
   2) Get the number of the edges from the map candConns associated with the edge signature as candCount;
   3) If candCount <= hostCount
        similarity = similarity + candCount;
   4) Else
        similarity = similarity + hostCount;
3. Return similarity;

Listing 4-2 - Computing edge similarity of a candidate

The algorithm in Listing 4-1 determines that the unique correspondence found using edge similarity has the most identical connections and neighbors to the host node when only one candidate has the maximal edge similarity. The algorithm also implies that one candidate with the maximal edge similarity is selected as the unique correspondence when more than one candidate has the same maximal edge similarity; however, this selection may be incorrect in some cases and needs to be improved, as discussed later in Limitations and Improvement (Section 4.5.2). DSMDiff only examines structural similarity within a specific local region where the host node is the center and its neighbor nodes form the border. In our experience, using signature matching and edge similarity to find model mappings not only speeds up the model differentiation process, but also generates accurate results in the experiments that have been conducted (one example is demonstrated in Chapter 5). After all the nodes in M1 have been examined by signature and structural matching, all the possible node mappings between M1 and M2 are found in general practice, except for the cases discussed in Section 4.5.2.
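As a companion to Listings 4-1 and 4-2, the following Python sketch shows the same computation in executable form. It reuses the illustrative Node and Edge classes sketched earlier and assumes a helper edges_of(node) that returns all incoming and outgoing edges of a node; these names are assumptions for illustration, not DSMDiff's actual interfaces:

from collections import Counter

def edge_signature(e):
    """Edge Signature = Signature(src)/Type/Kind/Name/Signature(dst)."""
    sig = lambda v: f"/{v.type}/{v.kind}/{v.name}"
    return f"{sig(e.src)}/{e.type}/{e.kind}/{e.name}/{sig(e.dst)}"

def compute_edge_similarity(host_conns, cand_conns):
    """Listing 4-2: for each host edge signature, credit the candidate with the
    number of its matching edges, capped at the host's count for that signature."""
    return sum(min(host_count, cand_conns.get(s, 0))
               for s, host_count in host_conns.items())

def find_maximal_edge_similarity(host_node, candidate_nodes, edges_of):
    """Listing 4-1: return the candidate whose connections best mirror the host's.
    Ties are broken arbitrarily (the nondeterminism discussed in Section 4.5.2)."""
    host_conns = Counter(edge_signature(e) for e in edges_of(host_node))
    best, best_similarity = None, -1
    for c in candidate_nodes:
        cand_conns = Counter(edge_signature(e) for e in edges_of(c))
        similarity = compute_edge_similarity(host_conns, cand_conns)
        if similarity > best_similarity:
            best, best_similarity = c, similarity
    return best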
4.3.2 Detection of Model Differences

As mentioned previously, there are three basic types of model differences: New, Delete and Change. Identifying these various types of differences is another major task of DSMDiff. In order to increase the performance of DSMDiff, some of the procedures to detect model differences may be integrated into the previously discussed procedures for finding mappings.

To discover all the Delete differences, DSMDiff must find all the model elements in M1 that do not have any signature matched candidates in M2. In signature matching, DSMDiff examines how many candidates can be found in M2 that have the same signature as each element in M1. If only one is found, a pair of mappings is constructed and added to the mapping set. If more than one is found, the host element and the found candidates are sent to structural matching. If no candidate can be identified, the host element is considered a Delete difference, which means it exists in M1 but does not exist in M2. Listing 4-3 summarizes the algorithm.

Name: findSignatureMappingsAndDeleteDiffs
Input: diffModel
Output: hostSet, candMap, diffModel
1. Initialize a set hostSet and a map candMap;
2. Get M1 from diffModel and store all nodes of M1 in hostSet;
3. Get M2 from diffModel and store all nodes of M2 in candMap associated with their signature;
4. For each node e1 in hostSet
   1) Get the count of the nodes from candMap that are signature matched to e1;
   2) If count == 1
        Get the candidate from candMap as e2;
        Add Map(e1, e2) to the mapping set of diffModel;
        Erase e1 from hostSet;
        Erase e2 from candMap;
   3) If count == 0
        Add e1 to the Delete set of diffModel;
        Erase e1 from hostSet;
   4) If count > 1
        Do nothing;

Listing 4-3 - Finding signature mappings and the Delete differences

After all the mappings are discovered between M1 and M2, the mapped elements are filtered out. The remaining elements in M2 are then taken as the New differences (i.e., a New difference indicates that there is an element in M2 that is missing in M1).

The Change differences are used to indicate varying attributes between any pair of mappings. Both model nodes and atom nodes may have a set of attributes; thus, a pair of matched model nodes or atom nodes may have Change differences. DSMDiff compares the values of each attribute of each pair of model or atom mappings. If the values are different, the attribute name is added to the Change difference set.

After all the node mappings and differences are determined, DSMDiff then tries to find the edge mappings and differences between M1 and M2 using these strategies: 1) all the edges connecting to a Delete node are Delete edges; 2) all the edges connecting to a New node are New edges; 3) edge signature matching is applied to find the edge mappings; and 4) the remaining edges in M1 are taken as additional Delete edges and those in M2 are taken as additional New edges.

4.3.3 Depth-First Detection

The traversal strategy of DSMDiff is depth-first: it starts from the root level of a model hierarchy and then walks down to the lower levels to compare all the child submodels until it reaches the bottom level, where there are no submodels that can be expanded. Supporting such depth-first detection requires that all the node mappings found at a current level be categorized into two groups: model node mappings and atom node mappings. DSMDiff then performs model comparison on each pair of model node mappings. Each atom node mapping is examined for attribute equivalence. If there are some attributes with different values, these represent Change differences between the models. If all the attributes are matched, it is inferred that the two nodes are equivalent because there is no Change, Delete or New difference.

To summarize, Listing 4-4 presents the overall algorithm of DSMDiff to calculate the mappings and the differences between two models. It takes diffModel as input, which is a typed DiffModel that initially stores the two models (M1 and M2). DSMDiff produces two sets: the mapping set (MS) and the difference set (DS), which consists of three types of differences (N: the set of New differences; D: the set of Delete differences; and C: the set of Change differences). All of these mapping and difference sets are stored in the diffModel during execution of DSMDiff.

Name: DSMDiff
Input: diffModel
Output: diffModel
1. Initialize a set hostSet and a map candMap;
2. Get the host model from diffModel as M1 and the candidate model as M2;
3. Detect attribute differences between M1 and M2 and add them to the Change set of diffModel;
4. // Find node mappings by signature matching
   findSignatureMappingsAndDeleteDiffs(diffModel, hostSet, candMap);
5. If (hostSet is not empty && candMap is not empty)
   // Find node mappings by structural matching
   For each element e1 in hostSet
   1) Get its candidates from candMap into a set called candSet;
   2) e2 = findMaximalEdgeSimilarity(e1, candSet);
   3) Add Pair(e1, e2) to the Mapping set of diffModel;
   4) Erase e1 from hostSet;
   5) Erase e2 from candMap;
6. If (candMap is not empty)
   Add all the remaining members of candMap to the New set of diffModel;
7. For each pair of mapped elements that are not submodels
   Detect attribute differences and add them to the Change set of diffModel;
8. Compute edge mappings and differences;
9. // Walk into child submodels
   For each childDiffModel that stores a pair of mapped submodels
   DSMDiff(childDiffModel);

Listing 4-4 - DSMDiff Algorithm

4.4 Visualization of Model Differences

Visualization of the result of model differentiation (i.e., structural model differences) is critical to assist in comprehending the mappings and differences between two models. To help communicate the comparison results intuitively within a host modeling environment, a tree browser has been developed to visualize the structural differences and to support navigation among the analyzed model differences. This browser looks similar to the model browser of GME, using the same graphical icons to represent items with the types model, atom and connection. To indicate the various types of differences, the browser uses three colors: red for a Delete difference, gray for a New difference, and green for a Change difference.

The model difference browser displays two symmetric difference sets in two containment change trees: one indicates the difference set DS = M2 - M1 by annotating M1 with colors; the other indicates the difference set DS' = M1 - M2 = -DS by annotating M2 with colors. If DS = {New = N, Delete = D, Change = C}, then DS' = {New = D, Delete = N, Change = C}. For example, if there is a Delete difference in M1, correspondingly there is a New difference in M2. Such a symmetric visualization helps comprehend the corresponding relationships between two hierarchical models.
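Since DS' is fully determined by DS, the symmetric view shown in the second tree can be derived rather than recomputed. A minimal sketch, using the illustrative difference-set representation from earlier (the dictionary keys are assumptions):

def invert_difference_set(ds):
    """Derive DS' = M1 - M2 from DS = M2 - M1: the New and Delete sets swap
    roles, and each Change difference keeps its pair with the elements and
    attribute values exchanged."""
    return {
        "new": ds["delete"],
        "delete": ds["new"],
        "change": [(e2, e1, f, v2, v1) for (e1, e2, f, v1, v2) in ds["change"]],
    }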
Figure 4-2 shows screenshots of two models and the detected differences in the model difference browser. The host model M1 is shown in Figure 4-2a, and the candidate model M2 is shown in Figure 4-2b. The corresponding model elements within the encircled regions in these two models are the mappings, which are filtered and not displayed in the browser. The browser only visualizes the detected differences, as shown in Figure 4-2c. (Because the actual colors shown in the browser cannot be rendered in print, the figure has annotations that indicate the appropriate color.)

Figure 4-2 - Visualization of model differences: (a) the host model M1, (b) the candidate model M2, and (c) the model differences, with red = Delete, gray = New, green = Change

The root of the upper tree is M1; its subtrees and leaf nodes are all the differences compared to M2, which is represented by the bottom tree. For example, the first child of the upper tree is a Delete difference, which is in red. This difference means the LogOnRead element is in M1, but is missing in M2. Correspondingly, there is a New difference in the bottom tree, which is in gray. It indicates that M2 misses the LogOnRead element when it is compared to M1. A Change difference is detected for the LogOnMethodEntry element; although this element exists in both models, one of its attributes, called Kind, has different values: "On Write" in M1 but "On Method Entry" in M2. Such a Change difference item is highlighted in green. When the two trees do not have any subtree or leaf node, we can infer there is no difference between these two models. To focus on any model element, a user can navigate across the tree and double-click an item of interest, and the corresponding model element is brought into focus within the editing window.

4.5 Evaluation and Discussion

This section first briefly analyzes the complexity of the algorithm and illustrates an example application. The current limitations and proposed improvements for DSMDiff are also discussed.

4.5.1 Algorithm Analysis

Generally, DSMDiff is a level-wise model differentiation approach. It begins with the two root models at the top levels and then continues to their child models at the lower levels. At each level, node comparison is performed to detect the node mappings by using signature matching and edge similarity, followed by edge comparison to detect the edge mappings and differences. These steps are repeated on the mapped child models until the bottom level is reached.

The core of the DSMDiff algorithms includes signature matching (Step 4 in Listing 4-4) and edge similarity matching (Step 5 in Listing 4-4), which significantly influence the execution time. To estimate the complexity of signature matching and edge similarity matching, we assume the two models have similar structures and sizes. Given a model, L denotes the depth of the model hierarchy; N denotes the average number of nodes; and M denotes the average number of edges of a model node. The size of a model node is denoted as S, where S = N + M. Considering the case in which every node at all levels except the lowest level is a model node, the total number of model nodes is denoted as T, where T = N^0 + N^1 + ... + N^(L-2) ≈ N^(L-1).

In the best case, all the mappings and differences between two model nodes can be found by signature matching, in which the complexity depends on the size of the model nodes. In findSignatureMappingsAndDeleteDiffs (Listing 4-3), where signature matching is performed to detect node mappings and differences, all the candidate nodes and their signatures are stored in a sorted map; the upper bound for the complexity of this step is O(N x logN). To find correspondences from this map for all the node elements of M1, the complexity is also O(N x logN). Later, a similar computation is performed to compute the edge mappings and differences (i.e., Step 8 of Listing 4-4); this complexity is neglected here because the number of edges is less than the number of nodes. Overall, because all the model nodes within the model hierarchy need to be compared, the complexity for this best case is O(N x logN x T).

In the worst case, no exact mapping is found for a pair of model nodes during signature matching. Thus, all the nodes need to be examined by edge similarity matching (i.e., Step 5 in Listing 4-4), which is the most complicated step in Listing 4-4. Assume that there is an edge between every pair of nodes; then a node has N-1 edges, which is the worst case regarding the complexity.
In edge similarity matching (i.e., Step 5 in Listing 4-4), the most complicated step is findMaximalEdgeSimilarity (Listing 4-1), which computes the edge similarity of all the candidate nodes for a host node, where all edge signatures of a candidate node and the number of the associated edges are stored in a map (i.e., Substep 3.1 in Listing 4-1). The complexity of building this map is O((N-1) x log(N-1)). To compute the edge similarity of every candidate node (i.e., Step 3 of Listing 4-1), the computation cost is bounded by O(R x (N-1) x log(N-1)), where R is the number of candidate nodes, with R ≤ N. Because Step 3 is the most complicated step in Listing 4-1, the upper bound of findMaximalEdgeSimilarity is also O(R x (N-1) x log(N-1)). To find the candidate with maximal edge similarity for each host node (i.e., Step 5 in Listing 4-4), the cost is bounded by O(N x R x (N-1) x log(N-1)). To compute all the node mappings at all the levels in a model hierarchy using edge similarity matching, the upper bound of the complexity for this worst case is O(T x N x R x (N-1) x log(N-1)), which is in the polynomial class. For the same reason (i.e., the number of edges is less than the number of nodes), the complexity of detecting edge mappings and differences is neglected. Although the cost of the constant-time signature comparisons and associated string comparisons is not counted here, the algorithm achieves polynomial time complexity according to the above analysis.

4.5.2 Limitations and Improvement

DSMDiff is based on the assumption that domain-specific models are defined precisely and unambiguously. That is, domain-specific models are instances of a metamodel that can be distinguished from each other by comparing a set of properties of the elements and the connections to their neighbors. However, when there are several candidates with the same maximal edge similarity, DSMDiff may produce inaccurate results. A typical case occurs when there are nodes connected to each other whose mappings have not been determined yet. As shown in Figure 4-3, in M1 there is an A node connected to three nodes: B, C and D. In M2, the A' node connects to three other nodes: B', C' and D', and A'' is connected to B''. Given that nodes with the same letter label have the same signatures (e.g., all the A nodes have the same signature and all the B nodes have the same signature), the connections between an A node and a B node have the same edge signature. According to the algorithm in Listing 4-1, suppose the A node is examined first for structural matching and the A' node in M2 is selected as the mapping of the A node in M1. When the B node is examined, the algorithm may select either B' or B'' in M2 as the mapping of the B node in M1 because both B nodes in M2 have the same edge similarity as the B node in M1. If the B'' node in M2 is selected as the mapping to the B node in M1, the result is incorrect because B' is the correct mapping. In such cases, DSMDiff needs new rules or criteria to help find the correct mapping. For example, a new rule could be added to the algorithm in Listing 4-1 to require selecting first the unmapped node in M1 that has the most already-mapped neighbors (see the sketch following the figure). Another improvement would allow interaction between DSMDiff and users, who could select the mappings from multiple candidates manually.

Figure 4-3 - A nondeterministic case in which DSMDiff may produce an incorrect result
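One way the proposed ordering rule could be realized is sketched below. This is speculative and not part of the current DSMDiff implementation; the mapped set and the neighbors_of helper are assumptions for illustration:

def next_host_node(unmapped_hosts, mapped, neighbors_of):
    """Pick the unmapped M1 node with the most already-mapped neighbors, so that
    its edge signatures are as discriminating as possible when it is matched."""
    return max(unmapped_hosts,
               key=lambda v: sum(1 for n in neighbors_of(v) if n in mapped))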
Besides the performance and the correctness of the results, it is also important for model differentiation algorithms to produce a small set of model differences (ideally a minimal set) rather than a large set of model differences. In other words, the conciseness of the produced result is another metric contributing to the overall quality of model differentiation algorithms. Currently, DSMDiff compares two models M1 and M2 by traversing their composition trees in parallel. When an element from one model cannot be matched to an element of the other model at some level, the algorithm does not traverse the children of this element. One issue with this scheme is that DSMDiff is not able to detect when a subtree has been moved from one container to another between M1 and M2. The algorithm will only report that a whole subtree has been deleted from M1, and that a whole subtree has been added to M2, without noting that these are identical subtrees. This implies that the reported difference set is less concise than it could be. To solve this problem, a new type of model difference needs to be introduced: Move, which may reference the subtree in M1 and its new container in M2. An additional step is also required in the algorithms to compare all the elements of M1 and M2 that have not been matched when first traversing them. However, this step is expensive in the general case because many elements may need to be compared. This cost is actually avoided in the current version of the algorithm by assuming a similar composition structure in M1 and M2.

DSMDiff visualizes all the possible differences as a containment tree in a browser, but does not directly highlight the differences on the associated model elements within the editing window. To indicate the differences directly on the model diagrams and attribute panels within the modeling environment, a set of graphical decorators, which may be shapes or icons, could be attached to the corresponding model elements or attributes in order to change their look according to the type of model difference. In addition, our solution of using coloring to highlight all possible types of model differences may fail to work when users are color-blind, or when a screenshot of the model difference tree view is printed in black-and-white (e.g., the need to add annotations to Figure 4-2c). A visualization mechanism to complement the coloring would indicate the Delete differences by striking through them, the Change differences by underlining them, and the New differences by marking them in bold. This is a complementary solution that needs to be investigated in the future.

4.6 Related Work

This work is related to differentiation techniques for various software artifacts such as source code, documents, diagrams and models. There are two important categories of related work: 1) the algorithms to compute model differences, and 2) the visualization techniques to highlight those differences.

4.6.1 Model Differentiation Algorithms

There exist a number of general-purpose differentiation tools for comparing two or more text files (e.g., code or documentation). As an example, Unix diff [Hunt and McIlroy, 75] is a popular tool for comparing two text files. Diff compares files and indicates a set of additions and deletions.
Many version control tools also provide functionality similar to diff to identify changes between versions of text documents [Eick et al., 01]. Although many tools are available for differentiating text documents, limited support is currently available for differentiating graphical objects such as UML diagrams and domain-specific models.

As the importance of model differentiation techniques to system design and its evolution is well-recognized, there have been some research efforts focused on model difference calculation. Several metamodel-independent algorithms for difference calculation between models are presented in [Alanen and Porres, 03] and [Ohst et al., 03], which are developed primarily based on existing algorithms for detecting changes in structured data [Chawathe et al., 96] or XML documents [Wang et al., 03]. In these approaches, a set of change operations such as "create" and "delete" is used to represent and calculate model differences, which is similar to our approach. However, they are based on the assumption that the model versions are manipulated through an editing tool that assigns persistent identifiers to all model elements. Such a capability is not available when two models are developed separately (e.g., by different developers in a non-collaborative context, or by different editing tools) or generated by execution of a model transformation.

To provide algorithms independent of such identifiers, UMLDiff uses name similarity and structure similarity for detecting structural changes between the designs of subsequent versions of UML models [Xing and Stroulia, 05]. However, differentiation applied to domain-specific modeling is more challenging than difference analysis on UML diagrams. The main reason is that UML diagrams usually belong to a single common metamodel that can be represented formally as a containment-spanning tree starting at a virtual root and progressing down to packages, classes and interfaces. Domain-specific models, however, may belong to different metamodels according to their domains and are considered hierarchical graphs. Also, a differentiation algorithm for domain-specific models needs to be metamodel-independent in order to work with multiple DSMLs. This required DSMDiff to consider the type information of instance models, as well as the type information of the corresponding metamodel.

A promising approach is to represent the result of model differencing as a model itself. A recent work presented in [Cicchetti et al., 07] proposes a metamodel-independent approach to model difference representation. Within this approach, the detected model differences are represented as a difference model, which conforms to a metamodel that is automatically derived from the metamodel of the to-be-compared base models. Such a derivation process is itself a model transformation. Also, because the base models and the difference models are all model artifacts, other model-to-model transformations are induced to compose models (e.g., applying a difference model to a base model to produce the other base model). Thus, such an approach can be supported in a modeling platform and does not require other ad hoc tool support. A possible future improvement to DSMDiff would be to integrate this approach to assist in the representation of model differences.

4.6.2 Visualization Techniques for Model Differences

There has been some work toward visualizing model differences textually.
IBM Rational Rose [Rose, 07] and MagicDraw UML [MagicDraw, 07] display model differences in a textual way. These tools convert the diagrams into hierarchical text and then perform differentiation on this hierarchy. Changes are shown using highlighting schemes on the text. Although this approach is relatively easy to implement, its main drawback is that changes are no longer visible in a graphical form within the actual modeling tool, which makes the difference results more difficult to comprehend.

Other researchers have shown that the use of color and graphical symbols (e.g., icons) is more effective in highlighting model differences. An approach is proposed in [Ohst et al., 03] where coloring is used to highlight the model differences in two overlapping diagrams. A differentiation tool described in [Mehra et al., 05] presents graphical changes by developing a core set of highlighting schemes and an API for depicting changes in a visual diagram. UMLDiff presents a change-tree visualization technique. It reuses the visualization of Eclipse's Java DOM model for displaying different entities with diverse icons and separate visibility with various colors. Additionally, UMLDiff extends the visualization to use different icons to represent the differentiation results (e.g., "+" for add, "-" for remove).

Although it is intuitive to visualize model differences by coloring and iconic notations, these techniques are not specifically tied to modeling concepts and lack the ability to be integrated into MDE processes. DSMDiff provides a model difference browser that displays the structural differences in a tree view, which is similar to the change-tree visualization technique of UMLDiff. To preserve the conventions of the host modeling environment, many GME icons are used to represent the corresponding modeling types of the model difference items in the tree view. For example, a Delete atom or a New atom corresponds to an atom type. To avoid overuse of icons (e.g., "+" and "-" are commonly used for a collapsed folder and an expanded folder, respectively), DSMDiff uses colors to represent the various types of model differences.

4.7 Conclusion

In this chapter, the model differentiation problem is defined in the context of Domain-Specific Modeling. The main points include: 1) domain-specific modeling is distinguished from traditional UML modeling because it is a variable-metamodel approach whereas UML is a fixed-metamodel approach; 2) the underlying metamodeling mechanism used to define a DSML determines the properties and structures of domain-specific models; 3) domain-specific models may be formalized as hierarchical graphs annotated with a set of syntactical information. Based on these characteristics, model differentiation algorithms and an associated tool called DSMDiff were developed to discover the mappings and differences between any two domain-specific models. The chapter also describes a visualization technique to display model differences structurally and highlight them using color and icons.

The applicability of DSMDiff has been demonstrated within the context of model transformation testing, as discussed in Chapter 5. To ensure the correctness of model transformation, execution-based testing can help detect errors in a model transformation specification. To realize the vision of model transformation testing, a model differentiation technique is needed for comparison of the actual output model and the expected model, and for visualization of the detected differences.
If there is no difference between the actual output and expected models, it can be inferred that the model transformation is correct with respect to the given test specification. If there are differences between the output and expected models, the errors in the transformation specification need to be isolated and removed. In this application, DSMDiff serves as a model comparator to perform the model comparison and visualize the produced differences.

CHAPTER 5

MODEL TRANSFORMATION TESTING

To ensure the correctness of model transformation, testing techniques can help detect errors in a model transformation specification. This chapter presents a model transformation testing approach. It begins with a discussion of the specific need to ensure the correctness of model transformation, followed by a discussion of the limitations of current techniques. An overview of the model transformation testing approach is provided, with an emphasis on the principles and the implementation of the model transformation testing engine M2MUnit. In addition, a case study is offered to illustrate using this approach to assist in detecting errors in ECL specifications. Related work and concluding remarks are presented in the rest of this chapter.

5.1 Motivation

Model transformation is the core process in MDE for providing automation in software development [Sendall and Kozaczynski, 03]. Particularly, model-to-model transformation is investigated in the dissertation research to facilitate change evolution within MDE. To improve the reliability of such automation, validation and verification techniques and tools are needed to ensure the correctness of model transformation, as discussed in the following section. Although there are various techniques that facilitate quality assurance of model transformation, a testing approach is investigated in this dissertation to improve the correctness of model transformation.

5.1.1 The Need to Ensure the Correctness of Model Transformation

As discussed in Chapter 2, there are different types of model transformation. Examples of such transformations are exogenous transformation (e.g., model-to-code transformation for generating code from models) and endogenous transformation (e.g., model-to-model transformation for altering the internal structure of the model representation itself). The model transformation approach discussed in Chapter 3 supports endogenous transformation. To perform a model transformation, the source models and the ECL transformation specification are taken by the transformation engine C-SAW as input to generate the target model as output. In such a model transformation environment, assuming the model transformation engine works correctly and the source models are properly specified, the quality of the transformed results depends on the correctness of the model transformation specifications.

As defined in [Mens and Van Gorp, 05], there are two types of correctness. One is syntactic correctness: "Given a well-formed source model, can we guarantee that the target model produced by the transformation is well-formed?" The other is semantic correctness, which is a more significant and complex issue: "Does the produced target model have the expected semantic properties?" In this research, a model transformation specification is correct if the produced model meets its specified requirements with both the expected syntactic and semantic correctness.
Specifically, the reasons for validating the correctness of a model transformation specification include:

• Transformation specifications are error-prone: like the code in an implementation, transformation specifications are written by humans and are susceptible to errors. Also, transformation specifications need to define complicated model computation logic such as model navigation, selection and manipulation, which makes them hard to specify correctly.
• Transformation specifications are usually applied to a collection of models: the input of a model transformation is a single model or a collection of models. When MDE is applied to develop a large-scale and complex system, it is common to apply transformation specifications to a large set of models. Before a transformation specification is performed on a large quantity of models, it is prudent to first test its correctness on a small set of models.
• Transformation specifications are reusable: because it takes intensive effort to define model transformations, the reusability of a transformation specification is critical to reducing human effort. Before a model transformation is reused in the same domain or across similar domains, it is also necessary to ensure its correctness.

Thus, there is a need for verification and validation techniques and tools to assist in finding and correcting the errors in model transformation specifications. At the implementation level, traditional software engineering methods and practices such as testing have been widely used in ensuring the quality of software development. However, at the modeling level, research efforts and best practices are still needed to improve the quality of models and model transformations. The need for model transformation testing is discussed in the following section.

5.1.2 The Need for Model Transformation Testing

Verification and validation are well-established techniques for improving the quality of a software artifact within the overall software development lifecycle [Harrold, 00], [Adrion et al., 82]. These techniques can be divided into two forms: static analysis and dynamic analysis. Static analysis does not require execution of software artifacts; this form of verification includes model checking and proof of correctness. Execution-based testing is an important form of dynamic analysis that is performed to support quality assurance in traditional software development [Harrold, 00].

As an emerging software development paradigm, MDE highlights the need for verification and validation techniques that are specific to model and model transformation artifacts. Currently, there is a variety of verification techniques proposed for model transformation (e.g., model checking [Hatcliff et al., 03], [Holzmann, 97], [Schmidt and Varró, 03], simulation [Yilmaz, 01] and theorem proving [Varró et al., 02]). Common to all of these verification techniques is that they rely on a formal semantics of the specification or programming language concerned. Despite the relative maturity of formal verification within software engineering research, practical applications are limited to safety-critical and embedded systems [Clarke and Wing, 96]. Reasons for this include the complexity of formal specification techniques [Adrion et al., 82] and the lack of training of software engineers in applying them [Hinchey et al., 96]. Furthermore, there are also well-known limitations of formal verification, such as the state-explosion problem within model checking [Hinchey et al., 96].
Execution-based testing is widely used in practice to provide confidence in the quality of software [Harrold, 00]. Compared to formal verification, testing has several advantages that make it a practical method to improve the quality of software: 1) the relative ease with which many of the testing activities can be performed; 2) the software artifacts being developed (e.g., model transformation specifications) can be executed in their expected environment; and 3) much of the testing process can be automated [Harrold, 00]. Model transformation specifications are executable, which makes execution-based testing a feasible approach to finding transformation faults by executing specifications within the model transformation environment, without the need to translate models and transformations to formal specifications or to develop analytic models for formal verification.

In contrast to formal verification, model transformation testing has been developed as a contribution of the dissertation to validate model transformation. It aims at improving the confidence that a model transformation specification meets its requirements, but cannot prove any property as a guarantee (i.e., model transformation testing cannot assert the absence of errors, but is useful in revealing their presence, as noted by Dijkstra in relation to general testing [Dijkstra, 72]). The following subsections discuss the investigated testing approach for assisting in improving the quality of model transformations.

5.2 A Framework for Model Transformation Testing

There are various levels of software testing, such as unit testing [Zhu et al., 97] and system testing [Al Dallal and Sorenson, 02]. Unit testing is a procedure that aims at validating individual software units or components. System testing is conducted on a complete, integrated system to evaluate the system's compliance with its specified requirements. In the research described in this dissertation, model transformation testing is developed to support unit testing of a model transformation as a modular unit (e.g., the ECL strategy).

Theoretically, a complete verification of a program or a model transformation specification can only be obtained by performing exhaustive testing for every element of the domain. However, this technique is not practical because functional domains are sufficiently large to make the number of required test cases infeasible [Adrion et al., 82]. In practice, testing relies on the construction of a finite number of test cases and execution of parts or all of the system to check the correctness of the test cases [Harrold, 00], [Zhu et al., 97]. A model transformation testing framework should facilitate the construction and execution of test cases.

5.2.1 Overview

Model transformation testing involves executing a specification with the intent of finding errors [Lin et al., 05]. A testing framework should assist in generating tests, running tests, and analyzing tests. Figure 5-1 shows the framework for model transformation testing. There are three primary components in the testing framework: the test case constructor, the testing engine, and the test analyzer. The test case constructor consumes the test specification and produces a suite of test cases that are necessary for testing a transformation specification. The generated test cases are passed to the testing engine to be executed. The test analyzer visualizes the results and provides a capability to navigate among any differences.
The research provides tool support for executing and analyzing tests, which is realized by a testing engine. An assumption is that test suites will be constructed manually by transformation testers.

Figure 5-1 - The model transformation testing framework

5.2.2 Model Transformation Testing Engine: M2MUnit

To provide tool support for executing the test cases that are needed for testing a model transformation specification, a model transformation testing engine called M2MUnit has been developed as a GME plug-in to run test cases and visualize the test results. A test case contains an input model, the to-be-tested model transformation specification and an expected model. Figure 5-2 shows an overview of the model transformation testing engine M2MUnit.

As shown in Figure 5-2, there are three major components within the testing engine: an executor, a comparator and a test analyzer. The executor is responsible for executing the transformation specification on the input model to generate the output model. The comparator compares the output model to the expected model and collects the results of the comparison. To assist in comprehending test results, basic visualization functionality of the test analyzer is also implemented within M2MUnit to structurally highlight the detected model differences. During the executor and comparator steps, the metamodel provides required information on types and constraints that is needed to assist in the comparison of the expected and output models. Moreover, the critical data included in a test case (the input model, the expected model and the to-be-tested specification) are indicated in Figure 5-2.

Figure 5-2 - The model transformation testing engine M2MUnit: the executor, comparator and test analyzer, operating on the input model, transformation specification, output model, expected model and test report

The correctness of a model transformation specification can be determined by checking whether the output of the model transformation satisfies its intent (i.e., when there are no differences between the output model and the expected model). If there are no differences between the actual output and expected models, it can be inferred that the model transformation is correct with respect to the given test specification. If there are differences between the output and expected models, the errors in the transformation specification need to be isolated and removed.

The role of the executor is essentially that of a model transformation engine, with its functionality performed by C-SAW. Also, model comparison is performed between an expected model and an output model that are not subsequent versions. The output model is produced by the executor and the expected model is constructed by a tester. As discussed in Chapter 4, the DSMDiff algorithms do not require two models to be subsequent versions. Thus, DSMDiff serves as the model comparator of M2MUnit to perform the model comparison and, as the test analyzer, is also responsible for visualizing the produced differences. In fact, the development of DSMDiff was originally motivated by research on model transformation testing [Lin et al., 05]. To illustrate the feasibility and utility of the transformation testing framework, the next section describes a case study of testing a model transformation.
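Before turning to the case study, the executor/comparator/analyzer pipeline of Figure 5-2 can be outlined in a short Python sketch. Here run_csaw and dsmdiff are stand-ins for the actual C-SAW and DSMDiff components, whose real interfaces are internal to GME, so the names and signatures are illustrative assumptions:

def run_test_case(input_model, specification, expected_model, run_csaw, dsmdiff):
    """Evaluate one model transformation test case.

    run_csaw(model, spec) -> output model          (the executor)
    dsmdiff(output, expected) -> difference sets   (the comparator / test oracle)
    """
    output_model = run_csaw(input_model, specification)
    diffs = dsmdiff(output_model, expected_model)
    # The transformation passes the test only if no New, Delete or Change
    # differences are detected between the output and expected models.
    passed = not (diffs["new"] or diffs["delete"] or diffs["change"])
    return passed, diffs  # the test analyzer visualizes diffs on failure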
5.3 Case Study

This case study is performed on an experimental platform, the Embedded Systems Modeling Language (ESML) introduced in Chapter 3, which is a freely available domain-specific graphical modeling language developed for modeling real-time mission computing embedded avionics applications [Sharp, 00]. There are over 50 ESML component models used for this case study, which communicate with each other via a real-time event-channel mechanism. An ESML component model may contain several data elements. This case study shows how M2MUnit can assist in finding errors in a transformation specification.

5.3.1 Overview of the Test Case

The test case is designed to validate an ECL specification developed for the following model transformation task: 1) find all the Data atoms in a component model, 2) create a Log atom for each Data atom, and then set its Kind attribute to "On Method Entry" and its MethodList attribute to "update," and 3) create a connection from the Log atom to its corresponding Data atom. Figure 5-3 and Figure 5-4 represent the input model and the expected model of this transformation task, respectively. The input model contains a Data atom called numberOfUsers. The expected model contains a Log atom called LogOnMethodEntry, which connects to the Data atom numberOfUsers. The Kind attribute of LogOnMethodEntry is set to "On Method Entry" and the MethodList attribute is set to "update." Such an expected model represents a correct transformation output.

Figure 5-3 - The input model prior to model transformation

Figure 5-4 - The expected model for model transformation testing

To perform such a task with C-SAW, an ECL specification can be defined to transform the input model to the expected model. Listing 5-1 represents the initial ECL model transformation specification developed to accomplish the prescribed transformation of the case study. This specification defines one aspect and two strategies. The Start aspect finds the input model and applies the FindData strategy. The FindData strategy specifies the search criteria to find all the Data atoms. The AddLog strategy is executed on those Data atoms identified by FindData and specifies the behavior to create the Log atom for each Data atom. Before this specification is applied to all component models and reused later, it is necessary to test its correctness.

strategy FindData() {
  atoms()->select(a | a.kindOf() == "Data")->AddLog();
}

strategy AddLog() {
  declare parentModel : model;
  declare dataAtom, logAtom : atom;

  dataAtom := self;
  parentModel := parent();
  logAtom := parentModel.addAtom("Log", "LogOnMethodEntry");
  parentModel.addAtom("Log", "LogOnRead");
  logAtom.setAttribute("Kind", "On Write");
  logAtom.setAttribute("MethodList", "update");
}

aspect Start( ) {
  rootFolder().findFolder("ComponentTypes").models()->
    select(m | m.name().endWith("DataGatheringComponentImpl_target"))->FindData();
}

Listing 5-1 - The to-be-tested ECL specification

5.3.2 Execution of the Test Case

The test case is constructed as a GME project, from which M2MUnit is invoked as a GME plug-in.
The execution of the test case includes two steps: first, the to-be-tested ECL specification is executed by the executor to produce an output target model; second, the output target model and the expected model are sent to the comparator, which compares the models and passes the result to the test analyzer to be visualized.

Figure 5-5 shows the output model. When it is compared to the expected model, there are three differences, as shown in Figure 5-6. Figure 5-7 shows the visualization of the detected differences in a tree view.

Figure 5-5 - The output model after model transformation

Because of these detected differences between the output model and the expected model, the ECL specification is suspected to have errors. To discover these errors, the following differences need to be examined:

• Difference 1: an extra atom LogOnRead is inserted in the output model, which needs to be deleted and is highlighted in red.
• Difference 2: there is a missing connection from LogOnMethodEntry to numberOfUsers, which needs to be created in the output model and is highlighted in gray.
• Difference 3: the Kind attribute of LogOnMethodEntry has a different value, "On Write," from the expected value "On Method Entry," which needs to be changed and is highlighted in green.

Figure 5-6 - A summary of the detected differences (in red: Delete; in gray: New; in green: Change)

Figure 5-7 - Visualization of the detected differences

5.3.3 Correction of the Model Transformation Specification

According to the test results, it is obvious that three corrections need to be made to the initial transformation specification. One correction is to add a statement that will create the connection between LogOnMethodEntry and numberOfUsers. The second correction is to delete the line that adds LogOnRead: parentModel.addAtom("Log", "LogOnRead"). The third correction is to change the value of the Kind attribute from "On Write" to "On Method Entry." The modified transformation specification is shown in Listing 5-2 with the three corrections applied (the deleted statement is omitted). However, this does not imply that the correction is automated; the corrections need to be made manually after observing the test results.

strategy FindData()
{
  atoms()->select(a | a.kindOf() == "Data")->AddLog();
}

strategy AddLog()
{
  declare parentModel : model;
  declare dataAtom, logAtom : atom;

  dataAtom := self;
  parentModel := parent();
  logAtom := parentModel.addAtom("Log", "LogOnMethodEntry");
  logAtom.setAttribute("Kind", "On Method Entry");
  logAtom.setAttribute("MethodList", "update");
  parentModel.addConnection("AddLog", logAtom, dataAtom);
}

aspect Start()
{
  rootFolder().findFolder("ComponentTypes").models()->
    select(m | m.name().endWith("DataGatheringComponentImpl_target"))->FindData();
}

Listing 5-2 - The corrected ECL specification
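To summarize the case study, the three kinds of differences that M2MUnit reports (New, Delete, Change) can be pictured with a small illustrative record type in Python; this encoding is hypothetical and does not represent DSMDiff's actual data structures.

from dataclasses import dataclass
from enum import Enum

class DiffKind(Enum):
    NEW = "New"        # missing from the output model (shown in gray)
    DELETE = "Delete"  # extra in the output model (shown in red)
    CHANGE = "Change"  # attribute value differs (shown in green)

@dataclass
class ModelDifference:
    kind: DiffKind
    element: str
    detail: str

# The three differences detected in the case study, as such records:
report = [
    ModelDifference(DiffKind.DELETE, "LogOnRead", "extra atom to delete"),
    ModelDifference(DiffKind.NEW, "LogOnMethodEntry -> numberOfUsers",
                    "missing connection to create"),
    ModelDifference(DiffKind.CHANGE, "LogOnMethodEntry.Kind",
                    "'On Write' should be 'On Method Entry'"),
]
for d in report:
    print(d.kind.value, d.element, "-", d.detail)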
5.4 Related Work

Regarding formal verification, model checking is a widely used technique for the verification of model properties (e.g., the SPIN model checker [Holzmann, 97], the Cadena model checking toolsuite [Hatcliff et al., 03], and the CheckVML tool [Schmidt and Varró, 03]). SPIN is a verification system for models of distributed software, which has been used to detect design errors in a broad range of applications, from high-level descriptions of distributed algorithms to detailed code for controlling telephone exchanges. The main idea of SPIN is that system behaviors and requirements are specified as two aspects of the design by defining a verification model (prototype) in a specification language. The prototype is verified by checking the internal and mutual consistency of the requirements and behaviors. The Cadena model checking toolsuite extends SPIN to add support for objects, functions, and references. CheckVML is a tool for model checking dynamic consistency properties in arbitrary well-formed instance models. It first translates models into a tool-independent intermediate representation and then automatically generates the input language of the back-end model checker tool (e.g., SPIN). Generally, model checking is based on formal specification languages and automata theory.

A mathematically proven technique for validating model transformations is proposed in [Varró et al., 02], where the main idea is to perform mathematical model transformations in order to integrate UML-based system models and mathematical models of formal verification tools. However, such an approach requires a detailed mathematical description and analysis of models and transformations, which may limit its applicability for general use. Other formal methods, including temporal logics [Manna and Pnueli, 92] and assertions [Hoare, 69], have been proposed to verify the conformance of a program with its specification at the implementation level. These techniques are based on formal specification and may be used for formal proofs of correctness. However, formally proving the correctness of model transformations is difficult and requires formal verification techniques. Execution-based testing is a feasible alternative for finding transformation faults without the need to translate models and transformations to formal specifications. Using testing to determine the correctness of models and model transformations also provides opportunities to bring mature software engineering techniques to modeling practice.

There has been work on applying testing to validate design models. For example, [Pilskalns et al., 07] presents an approach for testing UML design models to uncover inconsistencies. This approach uses behavioral views such as sequence diagrams to simulate state change in an aggregate model, which is the artifact of merging information from behavioral and structural UML views. OCL pre-conditions, post-conditions, and invariants are used as a test oracle. However, there are only a few reports in the literature regarding efforts that provide facilities for model transformation testing. Our own work on model transformation testing, published as [Lin et al., 04] and [Lin et al., 05], is one of the earliest reports addressing this issue. Another initial work on model transformation testing is [Fleurey et al., 04], which presents a general view of the roles of testing in the different stages of model-driven engineering, and a more detailed exploration of approaches to testing model transformations. Based on this, Fleurey et al. highlight the particular issues for the different testing tasks, including adequacy criteria, test oracles, and automatic test data generation. More recently, there have been a few works that expand research on model transformation testing by exploring additional testing issues. For example, a metamodel-based approach for test generation is proposed in [Brottier et al., 06] for model transformation testing. In [Mottu et al., 06], mutation analysis is investigated for model transformation to evaluate the quality of test data.
Experiences in validating model transformations using a white-box approach are reported in [Kuster and Abd-El-Razik, 06]. Such research has focused on test coverage, test data analysis, and the generation of tests from models, but this dissertation effort primarily aims at providing a testing engine to run tests and analyze the results.

5.5 Conclusion

This chapter presents another contribution of the dissertation on model transformation testing to improve the accuracy of transformation results. In addition to the developed model transformation testing framework and the unit testing approach for model transformation, the model transformation testing engine M2MUnit has been implemented to provide support to execute test cases with the intent of revealing errors in the model transformation specification. Distinguished from classical software testing tools, determining whether a model transformation test passes or fails requires comparison of the actual output model with the expected model, which requires model differencing algorithms and visualization. The DSMDiff approach presented in Chapter 4 provides solutions for model differentiation and visualization.

The result presented in this chapter provides an initial solution to applying testing techniques at the modeling level. There are other fundamental issues that need to be explored more deeply in order to provide mature testing solutions. In the next chapter, several critical issues are proposed for advancing the research on model transformation testing.

CHAPTER 6

FUTURE WORK

This chapter outlines research directions that will be investigated as future work. To alleviate the complexity of developing model transformations, research into the idea of Model Transformation by Example (MTBE) is proposed to assist users in constructing model transformation rules through interaction with modeling tools. Test generation from test specifications and metamodel-based coverage criteria to evaluate test adequacy are also discussed as future work for model transformation testing. To provide support to isolate the errors in model transformation specifications, model transformation debugging is another software engineering practice that needs to be investigated as an extension of the research described in this dissertation.

6.1 Model Transformation by Example (MTBE)

Several dozen model transformation languages have been proposed over the last five years, each with a unique syntax and style [Sendall and Kozaczynski, 03], [Mens and Van Gorp, 05], [Czarnecki and Helsen, 06]. Because these model transformation languages are based on various techniques (e.g., a relational approach or a graph rewriting approach), and because not all the language concepts are explicitly specified (e.g., transformation rule scheduling and the model matching mechanism), it is difficult for certain classes of users of these languages to write model transformation rules. To simplify the task of developing a model transformation specification, Model Transformation by Example (MTBE) is an approach that enables an end-user to record the type of transformation that they desire and then have the modeling tool infer the transformation rule corresponding to that example [Varró, 06], [Wimmer et al., 07].
The idea of MTBE has similar goals to Programming by Example (PBE) [Lieberman, 00] and Query by Example (QBE) [Zloof, 77] in that a user interacts with a modeling tool that records a set of actions and generates some representative script that can replay the recorded actions. The inferred script could represent a fragment of code, a database query, or, in the case of MTBE, a model transformation rule. To realize MTBE within a modeling tool, an event tracing mechanism needs to be developed, and algorithms are needed for inferring a transformation rule from the set of event traces. Typically, there is a specific series of events that occur during a user modeling session: lower-level events are user interactions with the windowing system (e.g., "mouse click"), and higher-level events (composed from a sequence of lower-level events in a certain context) correspond to the core meaning of the user actions (e.g., "add attribute" or "delete connection"). Supporting event tracing of user interaction within most modeling tools requires a fair amount of manual customization, by either modifying the source code of the modeling tool or hooking into the tool's published event channel (if it exists).

With respect to programming languages, an event is a detectable action occurring during program execution [Auguston, 98] (e.g., expression evaluation, subroutine call, statement execution, message sending/receiving). Within the context of MTBE, an event corresponds to some action made by a user during interaction with a modeling tool. An event is delimited by a time interval with a beginning and an end. Such a record of the history of change events is defined as an event trace [Auguston et al., 03]. To support general model evolution analysis, a new language will be designed for expressing computations over event traces as a basis for inferring model transformation rules. Such a language will allow analysis of model changes to be generalized, rather than fixed within the modeling tool. This language will be defined by an event grammar, which represents all of the possible events of interest within a given modeling context. For example, an "add connection" event is represented as the following:

FOREACH C: atom-atom-connection FROM A1: atom A2: atom
  C.log( SAY( "added connection " C.conn_name "[" A1._name "-" A2._name "]" ) )
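To ground the notion of an event trace, the following Python sketch shows how recorded events might be represented and queried in the spirit of the FOREACH query above; the Event structure, field names, and sample trace are illustrative assumptions, not the proposed event grammar itself.

from dataclasses import dataclass

@dataclass
class Event:
    kind: str     # e.g., "add_atom", "add_connection", "set_attribute"
    begin: float  # an event is delimited by a time interval
    end: float
    args: dict

# A hypothetical trace recorded during a modeling session.
trace = [
    Event("add_atom", 0.0, 0.1, {"name": "LogOnMethodEntry", "type": "Log"}),
    Event("add_connection", 0.2, 0.3,
          {"name": "AddLog", "src": "LogOnMethodEntry", "dst": "numberOfUsers"}),
]

# Analogue of the FOREACH query: log every add-connection event.
for e in trace:
    if e.kind == "add_connection":
        a = e.args
        print(f'added connection {a["name"]} [{a["src"]}-{a["dst"]}]')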
The tangible asset of this proposed work will be a language processor for event grammars, which will generate the requisite instrumentation of the modeling tool to perform the analysis specified in a query based on the event grammar. Integrating the language processor into a collection of modeling tools allows the specification of model evolution analysis in a tool-independent manner. This approach is distinguished from another MTBE approach proposed by Dániel Varró [Varró, 06], where transformation rules are derived semi-automatically from an initial prototypical set of interrelated source and target models. These initial model pairs describe critical cases of the model transformation problem in a purely declarative way. The derived transformation rules can be refined later by adding further source-target model pairs. A main advantage of that approach is that transformation designers do not need to learn a new model transformation language.

Compared to Varró's approach and manual adaptation of a modeling tool for a desired model evolution analysis task, the proposed investigation of MTBE using event grammars has the following apparent advantages:

• The notion of an event grammar provides a general basis for model evolution tasks. This makes it possible to reason about the meaning of model evolution at an appropriate level of granularity.
• An event grammar provides a coordinated system to refer to any interesting event in the evolution history of the user modeling session. Assertions about different modeling events may be specified and checked in an event query to determine whether any sequence of undesirable changes was made.
• Trace collection is implemented by instrumentation of a modeling tool. Because only a small projection of the entire event trace is used by any set of rules, it is possible to implement selective instrumentation, powerful event filtering, and other optimizations to dramatically reduce the size of the collected trace. More importantly, special-purpose instrumentation will enable most analyses to be moved from post-mortem analysis to live response during the actual modeling session.

Furthermore, it may be possible for the algorithms and techniques of PBE and QBE to be adapted to the context of MTBE. An extensive set of event queries will be created to correspond with the event grammar for model evolution analysis. That is, during the record phase of MTBE, all of the relevant modeling events will be logged as an event trace. The event trace will then serve as input to the MTBE algorithms that generate the corresponding transformation rule. By plugging different back-ends into the MTBE algorithms, it is anticipated that rules in various model transformation languages can be inferred from a common example. Without such an approach, each new evolution analysis task (e.g., version control) would require similar manual adaptation with slight variation. The proposed work will produce an approach that generalizes the evolution analysis task and the underlying event channel of the modeling tool to allow the rapid addition of new analysis capabilities across a range of DSM tools.

6.2 Toward a Complete Model Transformation Testing Framework

The dissertation work on model transformation testing, as discussed in Chapter 5, is an initial step toward partially automating test execution and assisting in test result analysis. To develop a more mature approach to model transformation testing, additional research efforts are needed to provide support for test generation and test adequacy analysis.

Test generation creates a set of test cases for testing a model transformation specification. A test case usually needs to contain general information, input data, the test condition and action, and the expected result. Manually creating test cases is a human-intensive task. To reduce the human effort in generating tests by improving the degree of automation, a test specification language is required to define test cases, as well as test suites and a test execution sequence. Such a language may be an extension to the model transformation language that provides language constructs for defining tests. An envisioned test specification for testing an ECL specification is shown below.
Test test1 {
  Specification file: "C:\ESML\ModelComparison1\Strategies\addLog.spc"
  Start Strategy: FindData
  GME Project: "C:\ESML\ModelComparison1\modelComparison1.mga"
  Input model: "ComponentTypes\DataGatheringComponentImpl"
  Output model: "ComponentTypes\Output1"
  Expected model: "ComponentTypes\Expected1"
  Pass: Output1 = Expected1
}

Such a test is composed of a name (e.g., "test1") and a body. The test body defines the locations and identifiers of the model transformation specification, the start procedure to execute, a test project built for the testing purpose, the input source and output target models, and the expected model, as well as the criteria for asserting a successful pass (i.e., the test oracle is a comparison between two models). Such a test specification can be written manually by a test developer or generated by the modeling environment with specific support to directly select the involved ECL specification, the input model, and the expected model. Thus, test developers can build and edit tests from within the modeling environment. An effective test specification language also needs to support the definition of test suites and a test execution sequence. In addition, it may also need to support various types of test oracles, which provide mechanisms for specifying expected behaviors and verifying that test executions meet the specification. Currently, the M2MUnit testing engine only supports one type of oracle (i.e., comparing an actual output and an expected output that are models). However, other types of test oracles are possible that compare an actual output and an expected output of primitive types, such as integer, double, or string, or that cover only a fragment of a model.
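As a toy illustration only (not part of M2MUnit), a "key: value" test description of the envisioned form above could be read into a simple structure; the one-field-per-line format and the helper name below are assumptions based solely on the example.

def parse_test_spec(text):
    """Read a 'Test name { key: value ... }' description into a dict;
    assumes one field per line, as in the envisioned example above."""
    body = text.split("{", 1)[1].rsplit("}", 1)[0]
    fields = {}
    for line in body.strip().splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip().strip('"')
    return fields

spec = parse_test_spec('''
Test test1 {
  Start Strategy: FindData
  Input model: "ComponentTypes/DataGatheringComponentImpl"
  Expected model: "ComponentTypes/Expected1"
  Pass: Output1 = Expected1
}''')
print(spec["Start Strategy"])  # -> FindData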
Currently, the M2MUnit testing approach has not investigated the concept of test coverage adequacy to formally ensure that the transformation specification has been fully tested. Thus, more research effort is needed to provide test criteria in the context of model transformation testing to ensure test adequacy. Traditional software test coverage criteria, such as statement coverage, branch coverage, and path coverage [Schach, 07], [Adrion et al., 82], may be applied or adapted to a procedural style of model transformation such as that used in ECL. In addition, other criteria specific to a particular modeling notation may be developed to help evaluate test adequacy. Metamodel coverage is such a criterion for evaluating the adequacy of model transformation testing [Fleurey et al., 04]. Metamodeling provides a way to precisely define a domain. Test adequacy can be achieved by generating test cases that cover the domain entities and their relationships defined in a metamodel. For example, a MOF-based metamodel can reuse existing criteria defined for UML class diagrams [Fleurey et al., 04], [Andrews et al., 03]. It has been recognized that the input metamodel for a transformation is usually larger than the actual metamodel used by the transformation. Such an actual metamodel is a subset of the input metamodel and is called the effective metamodel [Brottier et al., 06]. An effective metamodel can be derived from the to-be-tested transformation and used for generating valid models for tests [Baudry et al., 06]. However, it is tedious to generate models manually. An automation technique can be applied to generate sufficient instance models from a metamodel for large-scale testing [Ehrig et al., 06].

In summary, the future work for the M2MUnit testing framework includes: 1) a test specification language and tool support for the generation and execution of tests or test suites, and 2) coverage criteria for evaluating test adequacy, especially through metamodel-based analysis.

6.3 Model Transformation Debugging

Model transformation testing assists in determining the presence of errors in model transformation specifications. After determining that errors exist in a model transformation, the transformation specification must be investigated in order to ascertain the cause of the error. Model transformation debugging is a process to identify the specific location of the error in a model transformation specification. Currently, C-SAW only allows transformation developers to insert "print" statements for the purpose of debugging. A debugger is needed to offer support for tracking down why transformation specifications do not work as expected. A model transformation debugger has many of the same capabilities as most debugging tools: setting breakpoints, stepping through one statement at a time, and reviewing the values of local variables and the status of affected models [Rosenberg, 96]. A model transformation debugger would allow the step-wise execution of a transformation to enable the viewing of properties of the transformed model as it is being changed in the modeling tool. A major technical problem for a model transformation debugger is visualizing the status of affected models during execution. For example, a large set of models may have changed status after executing a set of statements; a challenge of this future work is to represent the change status efficiently. The testing toolsuite and the debugging facility together will offer a synergistic benefit for detecting errors in a transformation specification and isolating the specific cause of the error.

CHAPTER 7

CONCLUSIONS

With the expanded focus on software and system models has come the urgent need to manage complex change evolution within the model representation. Designers must be able to examine various design alternatives quickly and easily among myriad and diverse configuration possibilities. Existing approaches to exploring model change evolution include: 1) modifying the model by hand within the model editor, or 2) writing programs in C++ or Java to perform the change. Both of these approaches have limitations, which degrade the capability of modeling to explore system issues such as adaptability and scalability. A manual approach to evolving a large-scale model is often time consuming and error prone, especially as the size of a system model continues to grow. Meanwhile, many model change concerns crosscut the model hierarchy, which usually requires a considerable amount of typing and mouse clicking to navigate and manipulate a model in order to make a change. There is increasing accidental complexity when using low-level languages such as C++ or Java to define high-level model change evolution concerns such as model querying, navigation, and transformation. Despite recent advances in modeling tools, many modeling tasks can still benefit from increased automation. The overall goal of the research described in this dissertation is to provide an automated model transformation approach to model evolution.
The key contributions include: 1) investigating the new application of model transformation to address model evolution concerns, especially system-wide adaptability and scalability issues; 2) applying a testing process to model transformations, which assists in improving the quality of a transformation; and 3) developing algorithms to compute and visualize differences between models. The main benefit of the research is reduced human effort and fewer potential errors in model evolution. The following sections summarize the research contributions in each of these areas.

7.1 The C-SAW Model Transformation Approach

To assist in evolving models rapidly and correctly, the research described in this dissertation has developed a domain-independent model transformation approach. Evolved from an earlier aspect modeling language originally designed to address crosscutting modeling concerns [Gray et al., 01], the Embedded Constraint Language (ECL) has been developed to support additional modeling types and provide new operations for model transformation. ECL is a high-level textual language that supports an imperative model transformation style. Compared to other model transformation languages, ECL is a small but expressive language that aims at defining model transformations where the source and target models belong to the same metamodel (i.e., an endogenous transformation language). C-SAW serves as the model transformation engine associated with the new ECL extensions described in Chapter 3. Various types of model evolution tasks can be defined concisely in ECL and executed automatically by C-SAW.

The dissertation describes the use of C-SAW to address the accidental complexities associated with current modeling practice (e.g., manually evolving the deep hierarchical structures of large system models can be error prone and labor intensive). In particular, this dissertation focuses on addressing two system development issues through modeling: scalability and adaptability. At the modeling level, system scalability is correspondingly formulated as a model scalability problem. A transformation specified in ECL can serve as a model replicator that scales models up or down with flexibility in order to explore system-wide properties such as performance and resource allocation.
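As a toy illustration of the replication idea only (the dictionary model format and all names below are assumptions; the actual replicator is an ECL transformation over GME models), the following Python sketch clones a component and re-creates its connections to scale a base model up:

def replicate(model, component, n):
    """Clone 'component' n times, re-creating each of its connections,
    to scale a toy model up."""
    scaled = {"components": dict(model["components"]),
              "connections": list(model["connections"])}
    for i in range(1, n + 1):
        clone = f"{component}{i}"
        scaled["components"][clone] = model["components"][component]
        # Give each clone the same connectivity as the original component.
        for src, dst in model["connections"]:
            if src == component:
                scaled["connections"].append((clone, dst))
            elif dst == component:
                scaled["connections"].append((src, clone))
    return scaled

base = {"components": {"Sensor": "DataSource", "Hub": "EventChannel"},
        "connections": [("Sensor", "Hub")]}
print(len(replicate(base, "Sensor", 50)["components"]))  # -> 52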
Also, the research described in this dissertation has investigated using C-SAW to address system adaptability issues through modeling. For example, systems have to reconfigure themselves according to fluctuations in their environment. In most cases, such adaptation concerns crosscut the system model and are hard to specify. C-SAW permits the separation of crosscutting concerns at the modeling level, which assists end-users in rapidly exploring design alternatives that would be infeasible to explore manually. As an extension to the earlier investigation in [Gray et al., 01] into modularizing crosscutting modeling concerns, the research described in this dissertation applied C-SAW to address new concerns, such as component deployment and synchronization, that often spread across system components.

To simplify the development of model transformations, as a future extension to the C-SAW work, the dissertation proposes an approach called Model Transformation by Example (MTBE) to generate model transformation rules through a user's interaction with the modeling tool. An event trace mechanism and algorithms that infer model transformation rules from recorded events need to be developed to realize the vision of MTBE.

7.2 Model Transformation Testing

Another important issue of model transformation is ensuring its correctness. There are a variety of formal methods proposed for the validation and verification of models and associated transformations (e.g., model checking). However, the applicability of formal methods is limited due to the complexity of formal techniques and the lack of training of many software engineers in applying them [Hinchey et al., 96], [Gogolla, 04]. Software engineering practices such as execution-based testing represent a feasible approach for finding transformation faults without the need to translate models and transformations to formal specifications. As one of the earliest research efforts to investigate model transformation testing, the dissertation has described a unit testing approach (M2MUnit) to help detect errors in model transformation specifications, where a model transformation testing engine provides support to execute test cases with the intent of revealing errors in the transformation specification. The basic functionality includes execution of the transformations, comparison of the actual output model and the expected model, and visualization of the test results. Distinguished from classical software testing tools, determining whether a model transformation test passes or fails requires comparison of the actual output model with the expected model, which necessitates model differencing algorithms and visualization. If there are no differences between the actual output and expected models, it can be inferred that the model transformation is correct with respect to the given test specification. If there are differences between the output and expected models, the errors in the transformation specification need to be isolated and removed.

To further advance model transformation testing, the dissertation proposes several important issues for future investigation. These issues include a test specification language to support test generation and metamodel-based coverage criteria to evaluate test adequacy. Also, to provide a capability to locate errors in model transformation specifications, model transformation debugging is proposed as another software engineering practice to improve the quality of model transformation.

7.3 Differencing Algorithms and Tools for Domain-Specific Models

Driven by the need for model comparison required by model transformation testing, model differencing algorithms and an associated tool called DSMDiff have been developed to compute differences between models. Theoretically, the generic model comparison problem is similar to the graph isomorphism problem, which is known to belong to NP [Garey and Johnson, 79]. The computational complexity of graph matching algorithms is the major hindrance to applying them to practical applications in modeling. To provide efficient and reliable model differencing algorithms, the dissertation has developed a solution that uses the syntax of modeling languages to help handle conflicts during model matching and combines structural comparison to determine whether two models are equivalent. In general, DSMDiff treats two models as hierarchical graphs, starts from the top level of the two containment models, and then continues the comparison into the child submodels.
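The top-down traversal just described can be sketched in Python as follows; this is an illustrative simplification (nested dictionaries as models, name-based matching), whereas DSMDiff also exploits type signatures and structural edge information when matching elements.

def diff_models(left, right, path=""):
    """Compare two nested-dictionary models top-down, recursing into
    matched submodels; returns (kind, path) pairs for each difference."""
    diffs = []
    for name in sorted(set(left) | set(right)):
        here = f"{path}/{name}"
        if name not in right:
            diffs.append(("Delete", here))   # extra element in 'left'
        elif name not in left:
            diffs.append(("New", here))      # element missing from 'left'
        elif isinstance(left[name], dict) and isinstance(right[name], dict):
            diffs.extend(diff_models(left[name], right[name], here))
        elif left[name] != right[name]:
            diffs.append(("Change", here))   # same element, different value
    return diffs

output   = {"LogOnMethodEntry": {"Kind": "On Write"}, "LogOnRead": {}}
expected = {"LogOnMethodEntry": {"Kind": "On Method Entry"}}
print(diff_models(output, expected))
# [('Change', '/LogOnMethodEntry/Kind'), ('Delete', '/LogOnRead')]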
Compared to traditional UML model differentiation algorithms, the comparison of domain-specific models is more challenging and is characterized as follows: 1) domain-specific modeling is distinguished from traditional UML modeling because it is a variable-metamodel approach, whereas UML is a fixed-metamodel approach; 2) the underlying metamodeling mechanism used to define a DSML determines the properties and structures of domain-specific models; and 3) domain-specific models may be formalized as hierarchical graphs annotated with a set of syntactical information. Based on these characteristics, DSMDiff has been developed as a metamodel-independent solution to discover the mappings and differences between any two domain-specific models.

Visualization of the result of model differentiation (i.e., structural model differences) is critical to assist in comprehending the mappings and differences between two models. To help communicate the discovered model differences, this research has also investigated a visualization technique to display model differences structurally and highlight them using color and icons. For example, a tree browser has been developed to indicate the possible kinds of model differences (e.g., a missing element, an extra element, or an element that has different values for some properties). Based on complexity analysis, DSMDiff achieves polynomial time complexity. The applicability of DSMDiff has been discussed within the context of model transformation testing. In addition to model transformation testing, model differencing techniques are essential to many model development and management practices, such as model versioning.

7.4 Validation of Research Results

The C-SAW transformation engine has been applied to support automated evolution of models expressed in several different modeling languages over multiple domains. On different experimentation platforms, C-SAW was applied successfully to integrate crosscutting concerns into system models automatically. For example, C-SAW was used to weave concurrency mechanisms, synchronization, and flight data recorder policies into component models of real-time control systems provided by Boeing [Gray et al., 04-b], [Gray et al., 06], [Zhang et al., 05-b]. More recently, C-SAW has been applied to improve the adaptability of component-based applications. For example, C-SAW was used to weave deployment concerns into PICML models that define component interfaces, along with their properties and system software building rules, for component-based distributed systems [Balasubramanian et al., 06-a]. Using model replicators, four case studies [Gray et al., 05], [Lin et al., 07-a] were used to demonstrate C-SAW's ability to scale base models to large product-line instances. C-SAW was also used in Model-Driven Program Transformation (MDPT) and model refactoring [Zhang et al., 05-a]. In MDPT, the contribution was specific to a set of models and concerns (e.g., logging and concurrency concerns). Moreover, C-SAW has been used by several external researchers in their own research. For example, Darío Correal from Colombia has applied C-SAW to address crosscutting concerns in workflow processes [Correal, 06]. The C-SAW web site contains software downloads, related papers, and several video demonstrations [C-SAW, 07]. The case studies have indicated the general applicability and flexibility of C-SAW in helping to evolve domain-specific models across various domains represented by different modeling languages (e.g., SIML and SRNML).
These experimental results have also demonstrated that using C-SAW to automate model change evolution reduces the human effort and potential errors when compared to a corresponding manual technique.

To conclude, the escalating complexity of software and system models is making it difficult to rapidly explore the effects of a design decision. Automating such exploration with model transformation can improve both productivity and model quality.

LIST OF REFERENCES

[Adrion et al., 82] W. Richards Adrion, Martha A. Branstad, and John C. Cherniavsky, "Validation, Verification, and Testing of Computer Software," ACM Computing Surveys, vol. 14 no. 2, June 1982, pp. 159-192.

[Agrawal, 03] Aditya Agrawal, Gábor Karsai, and Ákos Lédeczi, "An End-to-End Domain-Driven Software Development Framework," 18th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA) - Domain-Driven Track, Anaheim, California, October 2003, pp. 8-15.

[Aho et al., 07] Alfred V. Aho, Monica S. Lam, Ravi Sethi, and Jeffrey D. Ullman, Compilers: Principles, Techniques, and Tools, 2nd edition, Addison-Wesley, 2007.

[Al Dallal and Sorenson, 02] Jehad Al Dallal and Paul Sorenson, "System Testing for Object-Oriented Frameworks Using Hook Technology," 17th IEEE International Conference on Automated Software Engineering (ASE), Edinburgh, Scotland, September 2002, p. 231.

[Alanen and Porres, 02] Marcus Alanen and Ivan Porres, "Difference and Union of Models," 6th International Conference on the Unified Modeling Language (UML), Springer-Verlag LNCS 2863, San Francisco, California, October 2003, pp. 2-17.

[Andrews et al., 03] Anneliese Andrews, Robert France, Sudipto Ghosh, and Gerald Craig, "Test Adequacy Criteria for UML Design Models," Software Testing, Verification and Reliability, vol. 13 no. 2, April-June 2003, pp. 95-127.

[ANTLR, 07] ANTLR website, 2007, http://www.antlr.org/

[AOM, 07] Aspect-Oriented Modeling, http://www.aspect-modeling.org/aosd07/

[AS, 01] Object Management Group, Action Semantics for the UML, 2001, http://www.omg.org

[Auguston, 98] Mikhail Auguston, "Building Program Behavior Models," ECAI Workshop on Spatial and Temporal Reasoning, Brighton, England, August 1998, pp. 19-26.

[Auguston et al., 03] Mikhail Auguston, Clinton Jeffery, and Scott Underwood, "A Monitoring Language for Run Time and Post-Mortem Behavior Analysis and Visualization," AADEBUG Workshop on Automated and Algorithmic Debugging, Ghent, Belgium, September 2003.

[Balasubramanian et al., 06-a] Krishnakumar Balasubramanian, Aniruddha Gokhale, Yuehua Lin, Jing Zhang, and Jeff Gray, "Weaving Deployment Aspects into Domain-Specific Models," International Journal on Software Engineering and Knowledge Engineering, June 2006, vol. 16 no. 3, pp. 403-424.

[Balasubramanian et al., 06-b] Krishnakumar Balasubramanian, Aniruddha Gokhale, Gabor Karsai, Janos Sztipanovits, and Sandeep Neema, "Developing Applications Using Model-Driven Design Environments," IEEE Computer (Special Issue on Model-Driven Engineering), February 2006, vol. 39 no. 2, pp. 33-40.

[Batory, 06] Don Batory, "Multiple Models in Model-Driven Engineering, Product Lines, and Metaprogramming," IBM Systems Journal, vol. 45 no. 3, July 2006, pp. 451-461.

[Batory et al., 04] Don Batory, Jacob Neal Sarvela, and Axel Rauschmeyer, "Scaling Step-Wise Refinement," IEEE Transactions on Software Engineering, vol. 30 no. 6, June 2004, pp. 355-371.
[Baudry et al., 06] Benoit Baudry, Trung Dinh-Trong, Jean-Marie Mottu, Devon Simmonds, Robert France, Sudipto Ghosh, Franck Fleurey, and Yves Le Traon, "Challenges for Model Transformation Testing," Workshop on Integration of Model Driven Development and Model Driven Testing (IMDT), Bilbao, Spain, July 2006.

[Baxter et al., 04] Ira Baxter, Christopher Pidgeon, and Michael Mehlich, "DMS: Program Transformation for Practical Scalable Software Evolution," International Conference on Software Engineering (ICSE), Edinburgh, Scotland, May 2004, pp. 625-634.

[Bernstein, 03] Philip A. Bernstein, "Applying Model Management to Classical Meta Data Problems," Conference on Innovative Database Research (CIDR), Asilomar, California, January 2003, pp. 209-220.

[Bézivin, 03] Jean Bézivin, "On the Unification Power of Models," Journal of Software and System Modeling, vol. 4 no. 2, May 2005, pp. 171-188.

[Bézivin and Gerbé, 01] Jean Bézivin and Olivier Gerbé, "Towards a Precise Definition of the OMG/MDA Framework," Automated Software Engineering (ASE), San Diego, California, November 2001, pp. 273-280.

[Bézivin et al., 04] Jean Bézivin, Frédéric Jouault, and Patrick Valduriez, "On the Need for MegaModels," OOPSLA Workshop on Best Practices for Model-Driven Software Development, Vancouver, Canada, October 2004.

[Bondi, 00] André B. Bondi, "Characteristics of Scalability and Their Impact on Performance," 2nd International Workshop on Software and Performance, Ottawa, Ontario, Canada, 2000, pp. 195-203.

[Booch et al., 99] Grady Booch, James Rumbaugh, and Ivar Jacobson, The Unified Modeling Language User Guide, Addison-Wesley, 1999.

[Brooks, 95] Frederic P. Brooks, The Mythical Man-Month, Addison-Wesley, 1995.

[Brottier et al., 06] Erwan Brottier, Franck Fleurey, Jim Steel, Benoit Baudry, and Yves Le Traon, "Metamodel-based Test Generation for Model Transformations: An Algorithm and a Tool," 17th International Symposium on Software Reliability Engineering (ISSRE), Raleigh, North Carolina, November 2006, pp. 85-94.

[Budinsky et al., 04] Frank Budinsky, David Steinberg, Ed Merks, Raymond Ellersick, and Timothy J. Grose, Eclipse Modeling Framework, Addison-Wesley, 2004.

[Chawathe, 96] Sudarshan S. Chawathe, Anand Rajaraman, Hector Garcia-Molina, and Jennifer Widom, "Change Detection in Hierarchically Structured Information," ACM SIGMOD International Conference on Management of Data, Montreal, Canada, June 1996, pp. 493-504.

[Cicchetti, 07] Antonio Cicchetti, Davide Di Ruscio, and Alfonso Pierantonio, "Metamodel Independent Approach to Difference Representation," Journal of Object Technology (Special Issue from TOOLS Europe 2007), June 2007, 20 pages.

[Clarke and Wing, 96] E. M. Clarke and J. M. Wing, "Formal Methods: State of the Art and Future Directions," ACM Computing Surveys, vol. 28, 1996, pp. 626-643.

[Clements and Northrop, 01] Paul Clements and Linda Northrop, Software Product Lines: Practices and Patterns, Addison-Wesley, 2001.

[CMW, 07] Object Management Group, Common Warehouse Metamodel Specification, http://www.omg.org/technology/documents/modeling_spec_catalog.htm#CWM

[Correal, 06] Darío Correal, "Definition and Execution of Multiple Viewpoints in Workflow Processes," Companion to the 21st ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA), October 2006, Portland, Oregon, pp. 760-761.
[C-SAW, 07] C-SAW: The Constraint Specification Weaver Website, 2007, http://www.cis.uab.edu/gray/Research/C-SAW

[Czarnecki and Helsen, 06] Krzysztof Czarnecki and Simon Helsen, "Feature-based Survey of Model Transformation Approaches," IBM Systems Journal, 2006, vol. 45 no. 3, pp. 621-646.

[Cuccuru et al., 05] Arnaud Cuccuru, Jean-Luc Dekeyser, Philippe Marquet, and Pierre Boulet, "Towards UML2 Extensions for Compact Modeling of Regular Complex Topologies," Model-Driven Engineering Languages and Systems (MoDELS), Springer-Verlag LNCS 3713, Montego Bay, Jamaica, October 2005, pp. 445-459.

[Deng et al., 08] Gan Deng, Douglas C. Schmidt, Aniruddha Gokhale, Jeff Gray, Yuehua Lin, and Gunther Lenz, "Evolution in Model-Driven Software Product-line Architectures," Designing Software-Intensive Systems: Methods and Principles, (Pierre Tiako, ed.), Idea Group, 2008.

[Dijkstra, 76] Edsger Wybe Dijkstra, A Discipline of Programming, Prentice Hall, 1976.

[Dijkstra, 72] Edsger Dijkstra, "The Humble Programmer," Communications of the ACM, October 1972, pp. 859-866.

[DSM Forum, 07] Domain-Specific Modeling Forum, 2007, http://www.dsmforum.org/tools.html

[Edwards, 04] George Edwards, Gan Deng, Douglas Schmidt, Aniruddha S. Gokhale, and Bala Natarajan, "Model-Driven Configuration and Deployment of Component Middleware Publish/Subscribe Services," Generative Programming and Component Engineering (GPCE), Springer-Verlag LNCS 3286, Vancouver, Canada, October 2004, pp. 337-360.

[Ehrig et al., 06] Karsten Ehrig, Jochen M. Kuster, Gabriele Taentzer, and Jessica Winkelmann, "Generating Instance Models from Meta Models," 8th IFIP International Conference on Formal Methods for Open Object-Based Distributed Systems (FMOODS), Springer-Verlag LNCS 4037, Bologna, Italy, June 2006, pp. 156-170.

[Eick et al., 01] Stephen G. Eick, Joseph L. Steffen, and Eric E. Sumner, "SeeSoft--A Tool for Visualizing Line-Oriented Software Statistics," IEEE Transactions on Software Engineering, vol. 18 no. 11, 2001, pp. 957-968.

[Engels and Groenewegen, 00] Gregor Engels and Luuk Groenewegen, "Object-Oriented Modeling: A Roadmap," Future of Software Engineering, Special Volume Published in Conjunction with ICSE 2000, (A. Finkelstein, ed.), May 2000, pp. 103-116.

[Escher, 07] The Escher Repository, 2007, http://escher.isis.vanderbilt.edu

[Evans, 03] Eric Evans, Domain-Driven Design: Tackling Complexity at the Heart of Software, Addison-Wesley, 2003.

[Fermi, 07] Fermilab, 2007, http://www.fnal.gov/

[Filman et al., 04] Robert Filman, Tzilla Elrad, Siobhan Clarke, and Mehmet Aksit, eds., Aspect-Oriented Software Development, Addison-Wesley, 2004.

[Fleurey et al., 04] Franck Fleurey, Jim Steel, and Benoit Baudry, "Validation in Model-Driven Engineering: Testing Model Transformations," 1st International Workshop on Model, Design and Validation, Rennes, Bretagne, France, November 2004, pp. 29-40.

[France et al., 04] Robert France, Indrakshi Ray, Geri Georg, and Sudipto Ghosh, "An Aspect-Oriented Approach to Design Modeling," IEE Proceedings on Software, vol. 4 no. 151, August 2004, pp. 173-185.

[Frankel, 03] David S. Frankel, Model Driven Architecture: Applying MDA to Enterprise Computing, John Wiley and Sons, 2003.

[Fujaba, 07] The FUJABA Toolsuite, http://wwwcs.uni-paderborn.de/cs/fujaba/

[Garey and Johnson, 79] Michael R. Garey and David S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman and Co., 1979.
[Gîrba and Ducasse, 06] Tudor Gîrba and Stéphane Ducasse, "Modeling History to Analyze Software Evolution," Journal of Software Maintenance and Evolution, vol. 18 no. 3, May-June 2006, pp. 207-236.

[Gelperin and Hetzel, 88] David Gelperin and Bill Hetzel, "The Growth of Software Testing," Communications of the ACM, vol. 31 no. 6, June 1988, pp. 687-695.

[GME, 07] Generic Modeling Environment, 2007, http://escher.isis.vanderbilt.edu/tools/get_tool?GME

[Gogolla, 04] Martin Gogolla, "Benefits and Problems of Formal Methods," Ada Europe, Springer-Verlag LNCS 3063, Palma de Mallorca, Spain, June 2004, pp. 1-15.

[Gokhale et al., 04] Aniruddha Gokhale, Douglas Schmidt, Balachandran Natarajan, Jeff Gray, and Nanbor Wang, "Model-Driven Middleware," Middleware for Communications, (Qusay Mahmoud, ed.), John Wiley and Sons, 2004.

[Gray et al., 01] Jeff Gray, Ted Bapty, Sandeep Neema, and James Tuck, "Handling Crosscutting Constraints in Domain-Specific Modeling," Communications of the ACM, vol. 44 no. 10, October 2001, pp. 87-93.

[Gray, 02] Jeff Gray, "Aspect-Oriented Domain-Specific Modeling: A Generative Approach Using a Metaweaver Framework," Ph.D. Thesis, Dept. of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, Tennessee, 2002.

[Gray et al., 03] Jeff Gray, Yuehua Lin, and Jing Zhang, "Aspect Model Weavers: Levels of Supported Independence," Middleware 2003: Workshop on Model-Driven Approaches to Middleware Applications Development, Rio de Janeiro, Brazil, June 2003.

[Gray et al., 04-a] Jeff Gray, Matti Rossi, and Juha-Pekka Tolvanen, "Preface: Special Issue on Domain-Specific Modeling," Journal of Visual Languages and Computing, vol. 15 nos. 3-4, June/August 2004, pp. 207-209.

[Gray et al., 04-b] Jeff Gray, Jing Zhang, Yuehua Lin, Hui Wu, Suman Roychoudhury, Rajesh Sudarsan, Aniruddha Gokhale, Sandeep Neema, Feng Shi, and Ted Bapty, "Model-Driven Program Transformation of a Large Avionics Framework," Generative Programming and Component Engineering (GPCE), Springer-Verlag LNCS 3286, Vancouver, Canada, October 2004, pp. 361-378.

[Gray et al., 05] Jeff Gray, Yuehua Lin, Jing Zhang, Steve Nordstrom, Aniruddha Gokhale, Sandeep Neema, and Swapna Gokhale, "Replicators: Transformations to Address Model Scalability," 8th ACM/IEEE International Conference on Model Driven Engineering Languages and Systems (MoDELS), Springer-Verlag LNCS 3713, Montego Bay, Jamaica, October 2005, pp. 295-308.

[Gray et al., 06] Jeff Gray, Yuehua Lin, and Jing Zhang, "Automating Change Evolution in Model-Driven Engineering," IEEE Computer (Special Issue on Model-Driven Engineering), February 2006, vol. 39 no. 2, pp. 41-48.

[Gray et al., 07] Jeff Gray, Juha-Pekka Tolvanen, Steven Kelly, Aniruddha Gokhale, Sandeep Neema, and Jonathan Sprinkle, Handbook of Dynamic System Modeling, (Paul Fishwick, ed.), CRC Press, 2007.

[Greenfield et al., 04] Jack Greenfield, Keith Short, Steve Cook, and Stuart Kent, Software Factories: Assembling Applications with Patterns, Models, Frameworks, and Tools, Wiley Publishing, Inc., 2004.

[Hailpern and Tarr, 06] Brent Hailpern and Peri Tarr, "Model-Driven Development: The Good, the Bad, and the Ugly," IBM Systems Journal, vol. 45 no. 3, July 2006, pp. 451-461.

[Harrold, 00] Mary J. Harrold, "Testing: A Road Map," Future of Software Engineering, Special Volume Published in Conjunction with ICSE 2000, (A. Finkelstein, ed.), May 2000, Limerick, Ireland, pp. 61-72.

[Hatcliff et al., 03] John Hatcliff, Xinghua Deng, Matthew B. Dwyer, Georg Jung, and Venkatesh P. Ranganath, "Cadena: An Integrated Development, Analysis, and Verification Environment for Component-based Systems," International Conference on Software Engineering (ICSE), Portland, Oregon, May 2003, pp. 160-173.
[Hayashi et al., 04] Susumu Hayashi, Yibing Pan, Masami Sato, Kenji Mori, Sul Sejeon, and Shusuke Haruna, "Test Driven Development of UML Models with SMART Modeling System," 7th International Conference on the Unified Modeling Language (UML), Springer-Verlag LNCS 3237, Lisbon, Portugal, October 2004, pp. 295-409.

[Hinchey et al., 96] Michael Hinchey, Jonathan Bowen, and Robert Glass, "Formal Methods: Point-Counterpoint," IEEE Computer, vol. 13 no. 2, April 1996, pp. 18-19.

[Hirel et al., 00] Christophe Hirel, Bruno Tuffin, and Kishor Trivedi, "SPNP: Stochastic Petri Nets. Version 6.0," Computer Performance Evaluation: Modeling Tools and Techniques, Springer-Verlag LNCS 1786, Schaumburg, Illinois, March 2000, pp. 354-357.

[Hoare, 69] Charles A. R. Hoare, "An Axiomatic Basis for Computer Programming," Communications of the ACM, vol. 12 no. 10, October 1969, pp. 576-580.

[Holzmann, 97] Gerard J. Holzmann, "The Model Checker SPIN," IEEE Transactions on Software Engineering, vol. 23 no. 5, May 1997, pp. 279-295.

[Hunt and McIlroy, 75] J. W. Hunt and M. D. McIlroy, "An Algorithm for Differential File Comparison," Computing Science Technical Report No. 41, Bell Laboratories, 1975.

[Johann and Egyed, 04] Sven Johann and Alexander Egyed, "Instant and Incremental Transformation of Models," 19th IEEE/ACM International Conference on Automated Software Engineering (ASE), Linz, Austria, September 2004, pp. 362-365.

[Johnson, 98] Luanne J. Johnson, "A View from the 1960s: How the Software Industry Began," IEEE Annals of the History of Computing, vol. 20 no. 1, January-March 1998, pp. 36-42.

[Karsai et al., 03] Gábor Karsai, Janos Sztipanovits, Ákos Lédeczi, and Ted Bapty, "Model-Integrated Development of Embedded Software," Proceedings of the IEEE, vol. 91 no. 1, January 2003, pp. 145-164.

[Karsai et al., 04] Gábor Karsai, Miklos Maroti, Ákos Lédeczi, Jeff Gray, and Janos Sztipanovits, "Composition and Cloning in Modeling and Meta-Modeling," IEEE Transactions on Control Systems Technology, vol. 12 no. 2, March 2004, pp. 263-278.

[Kent, 02] Stuart Kent, "Model Driven Engineering," 3rd International Conference on Integrated Formal Methods (IFM'02), Springer-Verlag LNCS 2335, Turku, Finland, May 2002, pp. 286-298.

[Kleppe et al., 03] Anneke Kleppe, Jos Warmer, and Wim Bast, MDA Explained. The Model Driven Architecture: Practice and Promise, Addison-Wesley, 2003.

[Kiczales et al., 01] Gregor Kiczales, Erik Hilsdale, Jim Hugunin, Mik Kersten, Jeffrey Palm, and William Griswold, "Getting Started with AspectJ," Communications of the ACM, vol. 44 no. 10, October 2001, pp. 59-65.

[Kogekar et al., 06] Arundhati Kogekar, Dimple Kaul, Aniruddha Gokhale, Paul Vandal, Upsorn Praphamontripong, Swapna Gokhale, Jing Zhang, Yuehua Lin, and Jeff Gray, "Model-driven Generative Techniques for Scalable Performability Analysis of Distributed Systems," IPDPS Workshop on Next Generation Systems, Rhodes, Greece, April 2006.

[Kuster and Abd-El-Razik, 06] Jochen M. Kuster and Mohamed Abd-El-Razik, "Validation of Model Transformations - First Experiences Using a White Box Approach," 3rd International Workshop on Model Development, Validation and Verification (MoDeV2a), Genova, Italy, October 2006.
[Kurtev et al., 06] Ivan Kurtev, Jean Bézivin, Frédéric Jouault, and Patrick Valduriez, "Model-based DSL Frameworks," Companion of the 21st Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), Portland, Oregon, October 2006, pp. 602-616.

[Küster, 06] Jochen M. Küster, "Definition and Validation of Model Transformations," Software and Systems Modeling, vol. 5 no. 3, 2006, pp. 233-259.

[Lédeczi et al., 01] Ákos Lédeczi, Arpad Bakay, Miklos Maroti, Peter Volgyesi, Greg Nordstrom, Jonathan Sprinkle, and Gábor Karsai, "Composing Domain-Specific Design Environments," IEEE Computer, vol. 34 no. 11, November 2001, pp. 44-51.

[Lieberman, 00] Henry Lieberman, "Programming by Example," Communications of the ACM, vol. 43 no. 3, March 2000, pp. 72-74.

[Lin et al., 04] Yuehua Lin, Jing Zhang, and Jeff Gray, "Model Comparison: A Key Challenge for Transformation Testing and Version Control in Model-Driven Software Development," OOPSLA Workshop on Best Practices for Model-Driven Software Development, Vancouver, Canada, October 2004.

[Lin et al., 05] Yuehua Lin, Jing Zhang, and Jeff Gray, "A Framework for Testing Model Transformations," Model-Driven Software Development, (Beydeda, S., Book, M., and Gruhn, V., eds.), Springer, 2005, Chapter 10, pp. 219-236.

[Lin et al., 07-a] Yuehua Lin, Jeff Gray, Jing Zhang, Steve Nordstrom, Aniruddha Gokhale, Sandeep Neema, and Swapna Gokhale, "Model Replication: Transformations to Address Model Scalability," conditionally accepted, Software: Practice and Experience.

[Lin et al., 07-b] Yuehua Lin, Jeff Gray, and Frédéric Jouault, "DSMDiff: A Differencing Tool for Domain-Specific Models," European Journal of Information Systems (Special Issue on Model-Driven Systems Development), Fall 2007.

[Long et al., 98] Earl Long, Amit Misra, and Janos Sztipanovits, "Increasing Productivity at Saturn," IEEE Computer, vol. 31 no. 8, August 1998, pp. 35-43.

[Manna and Pnueli, 92] Zohar Manna and Amir Pnueli, The Temporal Logic of Reactive and Concurrent Systems: Specification, Springer-Verlag, 1992.

[Mandelin et al., 06] David Mandelin, Doug Kimelman, and Daniel Yellin, "A Bayesian Approach to Diagram Matching with Application to Architectural Models," 28th International Conference on Software Engineering (ICSE), Shanghai, China, May 2006, pp. 222-231.

[Marsan et al., 95] M. A. Marsan, G. Balbo, G. Conte, S. Donatelli, and G. Franceschinis, Modelling with Generalized Stochastic Petri Nets, Wiley Series in Parallel Computing, John Wiley and Sons, 1995.

[MDA, 07] Object Management Group, Model Driven Architecture, http://www.omg.org/mda/

[Mehra et al., 05] Akhil Mehra, John Grundy, and John Hosking, "A Generic Approach to Supporting Diagram Differencing and Merging for Collaborative Design," 20th IEEE/ACM International Conference on Automated Software Engineering (ASE), Long Beach, California, November 2005, pp. 204-213.

[Mens and Van Gorp, 05] Tom Mens and Pieter Van Gorp, "A Taxonomy of Model Transformation," International Workshop on Graph and Model Transformation (GraMoT), Tallinn, Estonia, September 2005.

[Mernik et al., 05] Marjan Mernik, Jan Heering, and Anthony M. Sloane, "When and How to Develop Domain-Specific Languages," ACM Computing Surveys, December 2005, vol. 37 no. 4, pp. 316-344.

[MetaCase, 07] MetaEdit+ 4.5 User's Guide, http://www.metacase.com

[Microsoft, 05] Visual Studio Launch: Domain-Specific Language (DSL) Tools, Visual Studio 2005 Team System, http://msdn.microsoft.com/vstudio/teamsystem/workshop/DSLTools
[Milicev, 02] Dragan Milicev, "Automatic Model Transformations Using Extended UML Object Diagrams in Modeling Environments," IEEE Transactions on Software Engineering, April 2002, vol. 28 no. 4, pp. 413-431.

[Miller and Mukerji, 01] Joaquin Miller and Jishnu Mukerji, MDA Guide Version 1.0.1, http://www.omg.org/docs/omg/03-06-01.pdf

[MOF, 07] Object Management Group, Meta Object Facility Specification, http://www.omg.org/technology/documents/modeling_spec_catalog.htm#MOF

[Mottu et al., 06] Jean-Marie Mottu, Benoit Baudry, and Yves Le Traon, "Mutation Analysis Testing for Model Transformations," 2nd European Conference on Model Driven Architecture - Foundations and Applications (ECMDA-FA), Bilbao, Spain, July 2006, pp. 376-390.

[Muppala et al., 94] Jogesh K. Muppala, Gianfranco Ciardo, and Kishor S. Trivedi, "Stochastic Reward Nets for Reliability Prediction," Communications in Reliability, Maintainability and Serviceability, July 1994, pp. 9-20.

[Neema et al., 02] Sandeep Neema, Ted Bapty, Jeff Gray, and Aniruddha Gokhale, "Generators for Synthesis of QoS Adaptation in Distributed Real-Time Embedded Systems," Generative Programming and Component Engineering (GPCE), Springer-Verlag LNCS 2487, Pittsburgh, Pennsylvania, October 2002, pp. 236-251.

[Nordstrom et al., 99] Gregory G. Nordstrom, Janos Sztipanovits, Gábor Karsai, and Ákos Lédeczi, "Metamodeling - Rapid Design and Evolution of Domain-Specific Modeling Environments," International Conference on Engineering of Computer-Based Systems (ECBS), Nashville, Tennessee, April 1999, pp. 68-74.

[Nordstrom, 99] Gregory G. Nordstrom, "MetaModeling - Rapid Design and Evolution of Domain-Specific Modeling Environments," Ph.D. Thesis, Dept. of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, Tennessee, 1999.

[OCL, 07] Object Management Group, Object Constraint Language Specification, http://www.omg.org/technology/documents/modeling_spec_catalog.htm#OCL

[Ohst et al., 03] Dirk Ohst, Michael Welle, and Udo Kelter, "Differences Between Versions of UML Diagrams," European Software Engineering Conference/Foundations of Software Engineering, Helsinki, Finland, September 2003, pp. 227-236.

[Parnas, 72] David Parnas, "On the Criteria To Be Used in Decomposing Systems into Modules," Communications of the ACM, December 1972, vol. 15 no. 12, pp. 1053-1058.

[Patrascoiu, 04] Octavian Patrascoiu, "Mapping EDOC to Web Services Using YATL," 8th International IEEE Enterprise Distributed Object Computing Conference (EDOC), Monterey, California, September 2004, pp. 286-297.

[Peterson, 77] James L. Peterson, "Petri Nets," ACM Computing Surveys, vol. 9 no. 3, September 1977, pp. 223-252.

[Petriu et al., 05] Dorina C. Petriu, Jinhua Zhang, Gordon Gu, and Hui Shen, "Performance Analysis with the SPT Profile," Model-Driven Engineering for Distributed and Embedded Systems, (S. Gerard, J. P. Babeau, and J. Champeau, eds.), Hermes Science Publishing Ltd., London, England, 2005, pp. 205-224.

[Pilskalns et al., 07] Orest Pilskalns, Anneliese Andrews, Andrew Knight, Sudipto Ghosh, and Robert France, "Testing UML Designs," Information and Software Technology, vol. 49 no. 8, August 2007, pp. 892-912.

[Pohjonen and Kelly, 02] Risto Pohjonen and Steven Kelly, "Domain-Specific Modeling," Dr. Dobb's Journal, August 2002, pp. 26-35.

[QVT, 07] MOF Query/Views/Transformations Specification, 2007, http://www.omg.org/cgi-bin/doc?ptc/2005-11-01
[Rácz et al., 99] Sándor Rácz and Miklós Telek, “Performability Analysis of Markov Reward Models with Rate and Impulse Reward,” International Conference on Numerical Solution of Markov Chains, Zaragoza, Spain, September 1999, pp. 169-180.

[Rose, 07] IBM Rational Rose, http://www-306.ibm.com/software/awdtools/developer/rose/

[Rosenberg, 96] Jonathan B. Rosenberg, How Debuggers Work - Algorithms, Data Structures, and Architecture, John Wiley and Sons, Inc., 1996.

[Schach, 07] Stephen R. Schach, Object-Oriented and Classical Software Engineering, 7th Edition, McGraw-Hill, 2007.

[Schmidt, 06] Douglas C. Schmidt, “Model-Driven Engineering,” IEEE Computer, vol. 39 no. 2, February 2006, pp. 25-32.

[Schmidt et al., 00] Douglas C. Schmidt, Michael Stal, Hans Rohnert, and Frank Buschmann, Pattern-Oriented Software Architecture - Volume 2: Patterns for Concurrent and Networked Objects, John Wiley and Sons, 2000.

[Schmidt and Varró, 03] Ákos Schmidt and Dániel Varró, “CheckVML: A Tool for Model Checking Visual Modeling Languages,” 6th International Conference on the Unified Modeling Language (UML), Springer-Verlag LNCS 2863, San Francisco, California, October 2003, pp. 92-95.

[Sendall and Kozaczynski, 03] Shane Sendall and Wojtek Kozaczynski, “Model Transformation - the Heart and Soul of Model-Driven Software Development,” IEEE Software, vol. 20 no. 5, September/October 2003, pp. 42-45.

[Sharp, 00] David C. Sharp, “Component-Based Product Line Development of Avionics Software,” First Software Product Lines Conference (SPLC-1), Denver, Colorado, August 2000, pp. 353-369.

[Shetty et al., 05] Shweta Shetty, Steven Nordstrom, Shikha Ahuja, Di Yao, Ted Bapty, and Sandeep Neema, “Integration of Large-Scale Autonomic Systems using Multiple Domain Specific Modeling Languages,” 12th IEEE International Conference and Workshops on the Engineering of Computer-Based Systems (ECBS), Greenbelt, Maryland, April 2005, pp. 481-489.

[Sztipanovits, 02] Janos Sztipanovits, “Generative Programming for Embedded Systems,” Keynote Address: Generative Programming and Component Engineering (GPCE), Springer-Verlag LNCS 2487, Pittsburgh, Pennsylvania, October 2002, pp. 32-49.

[Sztipanovits and Karsai, 97] Janos Sztipanovits and Gábor Karsai, “Model-Integrated Computing,” IEEE Computer, April 1997, pp. 10-12.

[UML, 07] Object Management Group, Unified Modeling Language Specification, http://www.omg.org/technology/documents/modeling_spec_catalog.htm#UML

[Vangheluwe and De Lara, 04] Hans Vangheluwe and Juan de Lara, “Domain-Specific Modelling with AToM3,” 4th OOPSLA Workshop on Domain-Specific Modeling, Vancouver, Canada, October 2004.

[Varró, 06] Dániel Varró, “Model Transformation by Example,” 9th International Conference on Model Driven Engineering Languages and Systems (MoDELS), Genova, Italy, October 2006, pp. 410-424.

[Varró et al., 02] Dániel Varró, Gergely Varró, and András Pataricza, “Designing the Automatic Transformation of Visual Languages,” Science of Computer Programming, vol. 44 no. 2, 2002, pp. 205-227.

[Wang et al., 03] Yuan Wang, David J. DeWitt, and Jin-Yi Cai, “X-Diff: An Effective Change Detection Algorithm for XML Documents,” 19th International Conference on Data Engineering, Bangalore, India, March 2003, pp. 519-530.

[Warmer and Kleppe, 99] Jos Warmer and Anneke Kleppe, The Object Constraint Language: Precise Modeling with UML, Addison-Wesley, 1999.
[Whittle, 02] Jon Whittle, “Transformations and Software Modeling Languages: Automating Transformations in UML,” 5th International Conference on the Unified Modeling Language (UML), Springer-Verlag LNCS 2460, Dresden, Germany, September-October 2002, pp. 227-242.

[Wimmer et al., 07] Manuel Wimmer, Michael Strommer, Horst Kargl, and Gerhard Kramler, “Towards Model Transformation Generation By-Example,” 40th Hawaii International Conference on System Sciences (HICSS), Big Island, Hawaii, January 2007.

[Xing and Stroulia, 05] Zhenchang Xing and Eleni Stroulia, “UMLDiff: An Algorithm for Object-Oriented Design Differencing,” 20th IEEE/ACM International Conference on Automated Software Engineering (ASE), Long Beach, California, November 2005, pp. 54-65.

[XMI, 07] Object Management Group, XMI Specification, http://www.omg.org/technology/documents/modeling_spec_catalog.htm#XMI

[XSLT, 99] W3C, XSL Transformations (XSLT) Version 1.0, 1999, http://www.w3.org/TR/xslt

[Yilmaz, 01] Levent Yilmaz, “Verification and Validation: Automated Object-Flow Testing of Dynamic Process Interaction Models,” 33rd Winter Conference on Simulation, Arlington, Virginia, December 2001, pp. 586-594.

[Zellweger, 84] Polle T. Zellweger, “Interactive Source-Level Debugging of Optimized Programs,” Ph.D. Thesis, Department of Computer Science, University of California, Berkeley, California, May 1984.

[Zhang et al., 04] Jing Zhang, Jeff Gray, and Yuehua Lin, “A Generative Approach to Model Interpreter Evolution,” 4th OOPSLA Workshop on Domain-Specific Modeling, Vancouver, Canada, October 2004, pp. 121-129.

[Zhang et al., 05-a] Jing Zhang, Yuehua Lin, and Jeff Gray, “Generic and Domain-Specific Model Refactoring using a Model Transformation Engine,” in Model-Driven Software Development, (Beydeda, S., Book, M. and Gruhn, V., eds.), Springer, 2005, Chapter 9, pp. 199-218.

[Zhang et al., 05-b] Jing Zhang, Jeff Gray, and Yuehua Lin, “A Model-Driven Approach to Enforce Crosscutting Assertion Checking,” 27th ICSE Workshop on the Modeling and Analysis of Concerns in Software (MACS), St. Louis, Missouri, May 2005.

[Zhang et al., 07] Jing Zhang, Jeff Gray, Yuehua Lin, and Robert Tairas, “Aspect Mining from a Modeling Perspective,” International Journal of Computer Applications in Technology (Special Issue on Concern-Oriented Software), Fall 2007.

[Zhu et al., 97] Hong Zhu, Patrick Hall, and John May, “Software Unit Test Coverage and Adequacy,” ACM Computing Surveys, vol. 29 no. 4, December 1997, pp. 367-427.

[Zloof, 77] Moshé Zloof, “Query By Example,” IBM Systems Journal, vol. 16 no. 4, 1977, pp. 324-343.

APPENDIX A

EMBEDDED CONSTRAINT LANGUAGE GRAMMAR

The Embedded Constraint Language (ECL) extensions described in Chapter 3 are based on an earlier ECL description presented in [Gray, 02]. That earlier ECL definition was itself an extension of the Multigraph Constraint Language (MCL), an early OCL-like constraint language for the first public release of the GME [GME, 07]. The ECL grammar is defined in an ANTLR [ANTLR, 07] grammar (ecl.g), which is presented in the remainder of this appendix. Much of this grammar retains legacy productions from the original MCL. This appendix is not claimed as a major contribution of this dissertation; it is provided for completeness for those desiring a more formal description of the ECL syntax.
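For orientation before reading the grammar, the following is a minimal strategy of the kind derived by the cdef production below (a sketch only; the strategy name and the attribute name are illustrative and are not drawn from the dissertation's case studies):

strategy markProcessed()
{
  //set an attribute on the element from which the strategy is invoked;
  //"IsProcessed" is a hypothetical attribute name
  self.setAttribute("IsProcessed", "true");
}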
//begin ecl.g
class ECLParser {

exception default: //Print error messages

name        : id:IDENT ;
defines     : { DEFINES defs SEMI } ;
defs        : def ( COMMA def )* ;
def         : name ;
cpars       : cpar ( SEMI cpar )* ;
cpar        : name ( COMMA name )* ":" name ;

cdef        : { INN foldername { DOT modelname { DOT aspectname }}}
              ( STRATEGY name | ASPECT name ) LPAR { cpars } RPAR
              priority description LBRACE cexprs { action } RBRACE
            | FUNCTION name LPAR { cpars } RPAR cexprs { action } ;

priority    : PRIORITY "=" id:INTEGER | ;
description : id:STR | ;
foldername  : name ;
modelname   : name ;
aspectname  : name ;
lval        : name ;
astring     : str:ACTION ;
action      : astring ;
cexprs      : cexpr SEMI ( cexpr SEMI )* ;
cexpr       : { action } ( assign | DECLARE STATIC cpar | DECLARE cpar | lexpr ) ;
assign      : lval ASSIGN lexpr ;
ifexpr      : IF cexpr THEN cexprs { action } { ELSE cexprs { action } } ENDIF ;
lexpr       : relexpr lexpr_r ;
lexpr_r     : { lop relexpr lexpr_r } ;
relexpr     : addexpr { rop addexpr } ;
addexpr     : mulexpr addexpr_r ;
addexpr_r   : { addop mulexpr addexpr_r } ;
mulexpr     : unaexpr mulexpr_r ;
mulexpr_r   : { mulop unaexpr mulexpr_r } ;
unaexpr     : ( unaryop postfixexpr ) | postfixexpr ;
postfixexpr : primexpr postfixcall ;
postfixcall : { ( ( DOT | ARROW | CARET ) call ) postfixcall } ;
primexpr    : litcoll | lit | objname callpars | LPAR cexpr RPAR | ifexpr ;
callpars    : { LPAR ( ( decl ) ? decl { actparlist } | { actparlist } ) RPAR } ;
lit         : string | number ;
tname       : name ;
litcoll     : coll LBRACE { cexprlrange } RBRACE ;
cexprlrange : cexpr { ( ( COMMA cexpr )+ ) | ( ".." cexpr ) } ;
call        : name callpars ;
decl        : name ( COMMA name )* { ":" tname } "\|" ;
objname     : name | SELF ;
actparlist  : cexpr ( COMMA cexpr )* ;
lop         : AND | OR | XOR | IMPLIES ;
coll        : SET | BAG | SEQUENCE | COLLECTION ;
rop         : "==" | "<" | ">" | ">=" | "<=" | "<>" ;
addop       : PLUS | MINUS ;
mulop       : MUL | DIV ;
unaryop     : MINUS | NOT ;
string      : str:STR ;
number      : r:REAL | n:INTEGER ;

}

// Token definitions
#token STR     "\"~[\"]*\""
#token INTEGER "[0-9]+"
#token REAL    "([0-9]+.[0-9]* | [0-9]*.[0-9]+) {[eE]{[\-\+]}[0-9]+}"
#token IDENT   "[a-zA-Z][a-zA-Z0-9_]*"

// end of ecl.g

APPENDIX B

OPERATIONS OF THE EMBEDDED CONSTRAINT LANGUAGE

addAtom
  purpose: add an atom, based on its partName (i.e., kindName), to a model and assign a new name to it
  caller: a model or an object that represents a model
  usage: caller.addAtom(string partName, string newName)
  result type: atom

addConnection
  purpose: add a connection with a specific kindName from a source object to a destination object within a caller
  caller: a model or an object that represents a model
  usage: caller.addConnection(string kindName, object source, object destination)
  result type: connection

addFolder
  purpose: add a folder based on its kindName and assign a new name to it
  caller: a folder
  usage: caller.addFolder(string kindName, string newName)
  result type: folder

addMember
  purpose: add an object as a member of a set
  caller: an object that represents a set
  usage: caller.addMember(object anObj)
  result type: void

addModel
  purpose: add a model, based on its partName (i.e., kindName), to a model and assign a new name to it
  caller: a model, a folder, an object that represents a model/folder, or a list of models
  usage: 1) if the caller is a single object, caller.addModel(string partName, string newName); 2) if the caller is a list, caller->addModel(string partName, string newName)
  result type: model

addReference
  purpose: add a reference with a specific kindName that refers to an object, and assign a new name to it within a caller
  caller: a model
  usage: caller.addReference(string kindName, object refTo)
  result type: reference

addSet
  purpose: add a set based on its kindName and assign a new name to it within a caller or a list of callers
  caller: a model or a list of models
  usage: caller.addSet(string kindName, string newName) or caller->addSet(string kindName, string newName)
  result type: set

atoms
  purpose: return all the atoms, or the atoms with a specific kindName, within a caller
  caller: a model or an object that represents a model
  usage: caller.atoms() or caller.atoms(string kindName)
  result type: atomList

connections
  purpose: return all the connections with a specific connection kindName within a model
  caller: a model or an object that represents a model
  usage: caller.connections(string connName)
  result type: objectList that represents a list of connections

destination
  purpose: return the destination point of a connection
  caller: a connection
  usage: caller.destination()
  result type: model/atom/reference/object

endWith
  purpose: check whether a string ends with a substring
  caller: a string
  usage: caller.endWith(string aSubString)
  result type: boolean

findAtom
  purpose: return an atom based on its name within a caller
  caller: a model or an object that represents a model
  usage: caller.findAtom(string atomName)
  result type: atom

findConnection
  purpose: find a connection with a specific kindName from a source object to a destination object within a caller
  caller: a model or an object that represents a model
  usage: caller.findConnection(string kindName, object source, object destination)
  result type: connection

findFolder
  purpose: return a folder with a specific name
  caller: a folder
  usage: caller.findFolder(string folderName)
  result type: folder

findModel
  purpose: return a model based on its name
  caller: a folder, a model, or an object that represents a model
  usage: caller.findModel(string modelName)
  result type: model

findObject
  purpose: return an object based on its name within a caller
  caller: a model or a folder, or an object that represents a model/folder
  usage: caller.findObject(string objName)
  result type: object

getAttribute
  purpose: return the value of an attribute of a caller whose type is int, bool, double or string
  caller: an atom, a model or an object
  usage: caller.getAttribute(string attrName)
  result type: int, bool, double or string

intToString
  purpose: convert an integer value to a string
  caller: none
  usage: intToString(int val)
  result type: string

isNull
  purpose: determine whether the caller is null
  caller: an atom, a model or an object
  usage: caller.isNull()
  result type: boolean

kindOf
  purpose: return a caller's kindName
  caller: an atom, a model or an object
  usage: caller.kindOf()
  result type: string

models
  purpose: return all the models, or the models with a specific kindName, within a caller
  caller: a model or a folder
  usage: caller.models() or caller.models(string kindName)
  result type: modelList

modelRefs
  purpose: return all the models within a caller that are referred to by the model references with the specific kindName
  caller: a model or an object that represents a model
  usage: caller.modelRefs(string kindName)
  result type: modelList

name
  purpose: return a caller's name
  caller: an atom, a model or an object
  usage: caller.name()
  result type: string

parent
  purpose: return the parent model of a caller
  caller: a model, an atom, or an object that represents a model/atom
  usage: caller.parent()
  result type: model

refersTo
  purpose: return the model/atom/object that the caller refers to
  caller: a modelRef or an object that represents a model reference
  usage: caller.refersTo()
  result type: model, atom or object

removeAtom
  purpose: remove an atom based on its name
  caller: a model or an object that represents a model, which is the parent model of the to-be-removed atom
  usage: caller.removeAtom(string atomName)
  result type: void

removeConnection
  purpose: remove a connection with a specific kindName from a source object to a destination object within a caller
  caller: a model or an object that represents a model
  usage: caller.removeConnection(string kindName, object source, object destination)
  result type: void

removeModel
  purpose: remove a model based on its name
  caller: a model or an object that represents a model, which is the parent model of the to-be-removed model
  usage: caller.removeModel(string modelName)
  result type: void

rootFolder
  purpose: return the root folder of an open GME project
  caller: none
  usage: rootFolder()
  result type: folder

select
  purpose: select the atoms or the models within a caller according to the specified condition
  caller: an atomList, a modelList or an objectList
  usage: caller.select(logic expression)
  result type: atomList or modelList

show
  purpose: display a string message
  caller: none
  usage: show(any expression that returns a string)
  result type: void

size
  purpose: return the size of the caller list
  caller: an atomList, a modelList or an objectList
  usage: caller.size()
  result type: int

source
  purpose: return the source point of a connection
  caller: a connection
  usage: caller.source()
  result type: model/atom/reference/object

setAttribute
  purpose: set a new value for an attribute of a caller
  caller: an atom, a model or an object
  usage: caller.setAttribute(string attrName, anyType value)
  result type: void
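To illustrate how these operations compose, the following short strategy chains several of them (a sketch only: the folder and model names reuse those of Listing C-1 in Appendix C, the strategy name is hypothetical, and it is assumed that atomList may appear as a declared type, as the result types above suggest):

strategy countPlaces()
{
  declare places : atomList;
  //navigate from the project root to a model, then query its atoms by kind
  places := rootFolder().findFolder("SRNFolder").findModel("SRNModel").atoms("Place");
  show("number of Place atoms: " + intToString(places.size()));
}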
APPENDIX C

ADDITIONAL CASE STUDIES ON MODEL SCALABILITY

In this appendix, the concept of model replicators is further demonstrated on two separate example modeling languages that were created in GME for different domains. The two DSMLs are:

• Stochastic Reward Net Modeling Language (SRNML), which has been used to describe performability concerns of distributed systems built from middleware pattern-based building blocks.

• Event QoS Aspect Language (EQAL), which has been used to configure a large collection of federated event channels for mission computing avionics applications.

C.1 Scaling Stochastic Reward Net Modeling Language (SRNML)

Stochastic Reward Nets (SRNs) [Muppala et al., 94] represent a powerful modeling technique that is concise in its specification and whose form is close to a designer's intuition about what a performance model should look like. Because an SRN specification is closer to a designer's intuition of system behavior, it is also easier to transfer the results obtained from solving the models and to interpret them in terms of the entities that exist in the system being modeled.
SRNs have been used extensively for performance, reliability and performability modeling of different types of systems. SRNs are the result of a chain of evolution starting with Petri nets [Peterson, 77]. More discussion of SRNs can be found in [Marsan et al., 95] and [Rácz et al., 99].

The Stochastic Reward Net Modeling Language (SRNML) is a DSML developed in GME to describe SRN models of large distributed systems [Kogekar et al., 06]. The goals of SRNML are similar to those of performance-based modeling extensions for the UML, such as the Schedulability, Performance, and Time profile [Petriu et al., 05]. The model compilers developed for SRNML can synthesize the artifacts required by the SPNP tool [Hirel et al., 00], which is a model solver based on SRN semantics.

The SRN models, which are specified in SRNML, depict the Reactor pattern [Schmidt et al., 00] in middleware for network services, which provides synchronous event demultiplexing and dispatching mechanisms. In the Reactor pattern, an application registers an event handler with the event demultiplexer and delegates to it the responsibility of listening for incoming events. On the occurrence of an event, the demultiplexer dispatches the event by making a callback to its associated application-supplied event handler.

As shown in Figure C-1a, an SRN model usually consists of two parts: the top half represents the event types handled by a reactor, and the bottom half defines the associated execution snapshot. The execution snapshot needs to represent the underlying mechanism for handling the event types included in the top part (e.g., non-deterministic handling of events). Thus, there are implied dependency relations between the top and bottom parts: any change made to the top will require corresponding changes to the bottom.

Figure C-1a shows the SRN model of the Reactor pattern for two event handlers. The top of Figure C-1a models the arrival, queuing and service of the two event types. Transitions A1 and A2 represent the arrivals of events of types one and two, respectively. Places B1 and B2 represent the queues for the two types of events. Transitions Sn1 and Sn2 are immediate transitions that are enabled when a snapshot is taken. Places S1 and S2 represent the enabled handles of the two types of events, whereas transitions Sr1 and Sr2 represent the execution of the enabled event handlers of the two types of events. An inhibitor arc from place B1 to transition A1 with multiplicity N1 prevents the firing of transition A1 when there are N1 tokens in place B1. The presence of N1 tokens in place B1 indicates that the buffer space to hold incoming events of the first type is full, and no additional incoming events can be accepted. The inhibitor arc from place B2 to transition A2 achieves the same purpose for type-two events.

Figure C-1 - Replication of Reactor Event Types (from 2 to 4 event types): a) base model with 2 event handlers; b) scaled model with 4 event handlers

The bottom of Figure C-1a models the process of taking successive snapshots and the non-deterministic service of event handles in each snapshot. Transition Sn1 is enabled when there are one or more tokens in place B1, a token in place StSnpSht, and no token in place S1. Similarly, transition Sn2 is enabled when there are one or more tokens in place B2, a token in place StSnpSht, and no token in place S2. Transitions TStSnp1 and TStSnp2 are enabled when there is a token in place S1, place S2, or both.
Transitions TEnSnp1 and TEnSnp2 are enabled when neither place S1 nor place S2 contains a token. Transition TProcSnp1,2 is enabled when there is no token in place S1 and a token in place S2. Similarly, transition TProcSnp2,1 is enabled when there is no token in place S2 and a token in place S1. Transition Sr1 is enabled when there is a token in place SnpInProg1, and transition Sr2 is enabled when there is a token in place SnpInProg2. All of the transitions have their own guard functions, as shown in Table C-1.

Table C-1 - Enabling guard equations for Figure C-1

Transition       Guard Function
Sn1              ((#StSnpSht == 1) && (#B1 >= 1) && (#S1 == 0)) ? 1 : 0
...              ...
Snm              ((#StSnpSht == 1) && (#Bm >= 1) && (#Sm == 0)) ? 1 : 0
TStSnp1          (#S1 == 1) ? 1 : 0
...              ...
TStSnpm          (#Sm == 1) ? 1 : 0
TEnSnp1          ((#S1 == 0) && (#S2 == 0) && ... && (#Sm == 0)) ? 1 : 0
...              ...
TEnSnpm          ((#S1 == 0) && (#S2 == 0) && ... && (#Sm == 0)) ? 1 : 0
TProcSnp1,2      ((#S1 == 0) && (#S2 == 1)) ? 1 : 0
...              ...
TProcSnp1,m      ((#S1 == 0) && (#Sm == 1)) ? 1 : 0
...              ...
TProcSnpm,m-1    ((#Sm == 0) && (#Sm-1 == 1)) ? 1 : 0
...              ...
TProcSnpm,1      ((#Sm == 0) && (#S1 == 1)) ? 1 : 0
Sr1              (#SnpInProg1 == 1) ? 1 : 0
...              ...
Srm              (#SnpInProgm == 1) ? 1 : 0

For example, with m = 4, the guard of every TEnSnp transition is ((#S1 == 0) && (#S2 == 0) && (#S3 == 0) && (#S4 == 0)) ? 1 : 0.

C.1.1 Scalability Issues in SRNML

The scalability challenges of SRN models arise from the addition of new event types and the connections between their corresponding event handlers. For example, the top of the SRN model must scale to represent the event handling for every available event type. A problem emerges when there can be non-deterministic handling of events, which leads to complicated connections between the elements within the execution snapshot of an SRN model. Due to the implied dependencies between the top and bottom parts, the bottom part of the model (i.e., the snapshot) must incorporate the appropriate non-deterministic handling of the scaled number of event types. The inherent structural complexity and the complicated dependency relations within an SRN model make it difficult and impractical to scale up SRN models manually, which calls for a computer-aided method such as a replicator to perform the replication automatically. The replication behaviors for scaling up an SRN model can be formalized as computation logic and specified in a model transformation language such as ECL [Lin et al., 07-a].

Figure C-1a describes a base SRN model for two event types, and Figure C-1b represents the result of scaling this base model from two event types to four. Such scalability in SRN models can be performed in two model transformation steps. The first step scales the reactor event types (i.e., the upper part of the SRN model) from two to four, which involves creating the B and S places, the A, Sn and Sr transitions, and the associated connection arcs; renaming them; and setting appropriate guard functions for each new event type. The second step scales the snapshot (i.e., the bottom part of the SRN model) according to the newly added event types. Inside a snapshot, the model elements can be divided into three categories: the first category is a group of elements that are independent of each event type; the second category is a group of model elements that are associated with every two new event types; and the third category is a group of elements that are associated with one old event type and one new event type.
Briefly, these three groups of elements can be built by three subtasks:

• Create the TStSnp and TEnSnp transitions and the SnpInProg place, as well as the required connection arcs among them, for each newly added event type; assign the correct guard function to each created transition. This subtask builds the first group.

• For each pair of new event types, create two TProcSnp transitions and connect their SnpInProg places to these TProcSnp transitions; assign the correct guard function to each created transition. This subtask builds the second group.

• For each pair of <old event type, new event type>, create two TProcSnp transitions and connect their SnpInProg places to these TProcSnp transitions; assign the correct guard function to each created transition. This subtask builds the third group.

C.1.2 ECL Transformation to Scale SRNML

In this example, only the model transformation for scaling the snapshot is illustrated. The ECL specification shown in Listing C-1 performs subtask one; it is composed of several strategies. The computeTEnSnpGuard strategy (Line 1) re-computes the guard functions of the TEnSnp transitions when new event types are added. The ECL code on Lines 3 and 4 recursively concatenates the string that represents the guard function. After this string is created, it is passed to the addEventswithGuard strategy (Lines 41 to 47), which adds the new guard function and event to the snapshot. The addEvents strategy (Line 12) recursively calls the addNewEvent strategy to create the necessary transitions, places and connections in the snapshot for the new event types with identity numbers from min_new to max_new.

The addNewEvent strategy (Line 20) creates the snapshot elements for a single new event type with identity number event_num. The findAtom operation on Line 25 is used to discover the StSnpSht place in the snapshot. The TStSnp transition is created on Line 26, and its guard function is created on Lines 27 and 28. Next, the SnpInProg place and the TEnSnp transition are created on Lines 30 and 31, respectively. The guard function of the TEnSnp transition is set on Line 32. Finally, four connection arcs are created among the StSnpSht place, the TStSnp transition, the SnpInProg place and the TEnSnp transition (Lines 34 to 37).

Subtask one creates the independent snapshot elements for each new event type. In collaboration, subtasks two and three build the necessary relationships between each pair of new event types, and between each pair consisting of a new event type and an old event type. Listing C-2 shows the ECL specification that performs subtask two (i.e., building the relationship between every two new event types). The connectTwoEvents strategy (Line 17) creates the TProcSnp transitions and their associated connections between two events. The connectOneNewEventToOtherNewEvents strategy (Line 9) recursively calls the connectTwoEvents strategy to build the relationships between two new events. Finally, the connectNewEvents strategy (Line 1) builds the relationships between each pair of new event types by recursively calling the connectOneNewEventToOtherNewEvents strategy. Inside the connectTwoEvents strategy, the SnpInProg places of the two event types are discovered on Lines 28 and 29, respectively. Then, two new TProcSnp transitions are created and their guard functions are set (Lines 30 through 33), followed by the construction of the connections between the SnpInProg places and the TProcSnp transitions (Lines 35 through 38).
192 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 strategy computeTEnSnpGuard(min_old, min_new, max_new : integer; TEnSnpGuardStr : string) { if (min_old < max_new) then computeTEnSnpGuard(min_old + 1, min_new, max_new, TEnSnpGuardStr + "(#S" + intToString(min_old) + " == 0)&&"); else addEventswithGuard(min_new, max_new, TEnSnpGuardStr + "(#S" + intToString(min_old) + "== 0))?1:0"); endif; } ... // several strategies not show here strategy addEvents(min_new, max_new : integer; TEnSnpGuardStr : string) { if (min_new <= max_new) then addNewEvent(min_new, TEnSnpGuardStr); addEvents(min_new+1, max_new, TEnSnpGuardStr); endif; } strategy addNewEvent(event_num : integer; TEnSnpGuardStr : string) { declare start, stTran, inProg, endTran : atom; declare TStSnp_guard : string; start := findAtom("StSnpSht"); stTran := addAtom("ImmTransition", "TStSnp" + intToString(event_num)); TStSnp_guard := "(#S" + intToString(event_num) + " == 1)?1 : 0"; stTran.setAttribute("Guard", TStSnp_guard); inProg := addAtom("Place", "SnpInProg" + intToString(event_num)); endTran := addAtom("ImmTransition", "TEnSnp" + intToString(event_num)); endTran.setAttribute("Guard", TEnSnpGuardStr); addConnection("InpImmedArc", addConnection("OutImmedArc", addConnection("InpImmedArc", addConnection("OutImmedArc", start, stTran); stTran, inProg); inProg, endTran); endTran, start); } //recursively calls "addEvents" and "modifyOldGuards" strategy addEventswithGuard(min_new, max_new : integer; TEnSnpGuardStr : string) { rootFolder().findFolder("SRNFolder").findModel("SRNModel"). addEvents(min_new, max_new, TEnSnpGuardStr); rootFolder().findFolder("SRNFolder").findModel("SRNModel"). modifyOldGuards(1, min_new-1, TEnSnpGuardStr); } ... Listing C-1 - ECL transformation to perform first subtask of scaling snapshot 193 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 strategy connectNewEvents(min_new, max_new: interger) { if(min_new < max_new) then connectOneNewEventToOtherNewEvents(min_new, max_new); connectNewEvents(min_new+1, max_new); endif; } strategy connectOneNewEventToOtherNewEvents(event_num, max_new: integer) { if(event_num < max_new) then connectTwoEvents(event_num, max_new); connectNewEvents(event_num, max_new-1); endif; } strategy connectTwoEvents(first_num, second_num : integer) { declare firstinProg, secondinProg : atom; declare secondTProc1, secondTProc2 : atom; declare first_numStr, second_numStr, TProcSnp_guard1, TProcSnp_guard2 : string; 26 27 28 29 30 31 32 33 34 35 36 37 38 39 } 40 ... 
 1  strategy connectNewEvents(min_new, max_new: integer)
 2  {
 3    if(min_new < max_new) then
 4      connectOneNewEventToOtherNewEvents(min_new, max_new);
 5      connectNewEvents(min_new+1, max_new);
 6    endif;
 7  }
 8
 9  strategy connectOneNewEventToOtherNewEvents(event_num, max_new: integer)
10  {
11    if(event_num < max_new) then
12      connectTwoEvents(event_num, max_new);
13      connectNewEvents(event_num, max_new-1);
14    endif;
15  }
16
17  strategy connectTwoEvents(first_num, second_num : integer)
18  {
19    declare firstinProg, secondinProg : atom;
20    declare secondTProc1, secondTProc2 : atom;
21    declare first_numStr, second_numStr,
22            TProcSnp_guard1, TProcSnp_guard2 : string;
23
24    first_numStr := intToString(first_num);
25    second_numStr := intToString(second_num);
26    TProcSnp_guard1 := "((#S" + first_numStr + " == 0) && (#S" + second_numStr + " == 1))?1 : 0";
27    TProcSnp_guard2 := "((#S" + second_numStr + " == 0) && (#S" + first_numStr + " == 1))?1 : 0";
28    firstinProg := findAtom("SnpInProg" + first_numStr);
29    secondinProg := findAtom("SnpInProg" + second_numStr);
30    secondTProc1 := addAtom("ImmTransition", "TProcSnp" + first_numStr + "," + second_numStr);
31    secondTProc1.setAttribute("Guard", TProcSnp_guard1);
32    secondTProc2 := addAtom("ImmTransition", "TProcSnp" + second_numStr + "," + first_numStr);
33    secondTProc2.setAttribute("Guard", TProcSnp_guard2);
34
35    addConnection("InpImmedArc", firstinProg, secondTProc1);
36    addConnection("OutImmedArc", secondTProc1, secondinProg);
37    addConnection("InpImmedArc", secondinProg, secondTProc2);
38    addConnection("OutImmedArc", secondTProc2, firstinProg);
39  }
40  ...

Listing C-2 - ECL transformation to perform the second subtask of scaling the snapshot
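Subtask two could then be initiated with a call such as the following (a sketch; the aspect name is hypothetical, and the arguments denote the identity numbers of the two newly added event types):

//connect every pair of new event types (here, events 3 and 4)
aspect ConnectNewEventPairs()
{
  connectNewEvents(3, 4);
}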
To conclude, the introduction of new event types into an SRN model requires changes in several locations of the model. For example, new event types need to be inserted; some properties of model elements, such as the guard functions, need to be re-computed; and the execution snapshot needs to be expanded accordingly. The difficulty of scaling up an SRN model manually is due to the complicated dependencies among its model elements and parts, which can be addressed by a replicator using C-SAW and its model transformation language, ECL. With the expressive power of ECL, it is possible to specify reusable, complicated transformation logic in a templatized fashion. The replication task is also simplified by distinguishing independent and dependent elements and building up a larger SRN model in a stepwise manner. The replication preserves the benefits of modeling because the replicated model can be persistently exported to XML and sent to a Petri net analysis tool.

C.2 Scaling Event QoS Aspect Language (EQAL)

The Event QoS Aspect Language (EQAL) [Edwards, 04] is a DSML for graphically specifying publisher-subscriber service configurations for large-scale DRE systems. Publisher-subscriber mechanisms, such as event-based communication models, are particularly relevant for large-scale DRE systems (e.g., avionics mission computing, distributed audio/video processing, and distributed interactive simulations) because they help reduce software dependencies and enhance system composability and evolution. In particular, the publisher-subscriber architecture of event-based communication allows application components to communicate anonymously and asynchronously. The publisher-subscriber communication model defines three software roles:

• Publishers generate events to be transmitted.
• Subscribers receive events via hook operations.
• Event channels accept events from publishers and deliver events to subscribers.

The EQAL modeling environment consists of a GME metamodel that defines the concepts of publisher-subscriber systems, in addition to several model compilers that synthesize middleware configuration files from models. The EQAL model compilers automatically generate the publisher-subscriber service configuration files and component property description files needed by the underlying middleware. The EQAL metamodel defines a modeling paradigm for publisher-subscriber service configuration models, which specify QoS configurations, parameters, and constraints. For example, the EQAL metamodel contains a distinct set of modeling constructs for building a federation of real-time event services supported by the Component-Integrated ACE ORB (CIAO) [Gokhale et al., 04], which is a component middleware platform targeted by EQAL.

A federated event service allows the sharing of filtering information to minimize or eliminate the transmission of unwanted events to a remote entity. Moreover, a federated event service allows events that are being communicated in one channel to be made available on another channel. The channels typically communicate through CORBA Gateways, the User Datagram Protocol (UDP), or Internet Protocol (IP) Multicast. As shown in Figure C-2, to model a federation of event channels across different sites, EQAL provides modeling concepts that include CORBA Gateways and other entities of the publish-subscribe paradigm (e.g., event consumers, event suppliers, and event channels).

C.2.1 Scalability Issues in EQAL

The scalability issues in EQAL arise when a small federation of event services must be scaled to a very large system, which usually accommodates a large number of publishers and subscribers [Gray et al., 06]. It is conceivable that EQAL modeling features, such as the event channel, the associated QoS attributes, connections, and event correlations, must be applied repeatedly to build a large-scale federation of event services. Figure C-2 shows a federated event service with three sites, which is then scaled up to a federated event service with eight sites. This scaling process includes three steps:

• Add five CORBA_Gateways to each original site.
• Repeatedly replicate one site instance to add five more sites, each with eight CORBA_Gateways.
• Create the connections between all eight sites.

Figure C-2 - Illustration of replication in EQAL

C.2.2 ECL Transformation to Scale EQAL

The process discussed above can be automated with an ECL transformation that is applied to a base model with C-SAW. Listing C-3 shows a fragment of the ECL specification for the first step, which adds more Gateways to the original sites; the other steps would follow similarly using ECL. The size of the replication in this example was kept to five additional sites so that the visualization could be rendered appropriately in Figure C-2. The approach could be extended to scale to hundreds or thousands of sites and gateways.

//traverse the original sites to add CORBA_Gateways
//n is the number of the original sites
//m is the total number of sites after scaling
strategy traverseSites(n, i, m, j : integer)
{
  declare id_str : string;
  if (i <= n) then
    id_str := intToString(i);
    rootFolder().findModel("NewGateway_Federation").
      findModel("Site " + id_str).addGateWay_r(m, j);
    traverseSites(n, i+1, m, j);
  endif;
}

//recursively add CORBA_Gateways to each existing site
strategy addGateWay_r(m, j: integer)
{
  if (j <= m) then
    addGateWay(j);
    addGateWay_r(m, j+1);
  endif;
}

//add one CORBA_Gateway and connect it to Event_Channel
strategy addGateWay(j: integer)
{
  declare id_str : string;
  declare ec, site_gw : object;
  id_str := intToString(j);

  addAtom("CORBA_Gateway", "CORBA_Gateway" + id_str);
  ec := findModel("Event_Channel");
  site_gw := findAtom("CORBA_Gateway" + id_str);
  addConnection("LocalGateway_EC", site_gw, ec);
}

Listing C-3 - ECL fragment to perform the first step of replication in EQAL
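Given the parameter comments at the top of Listing C-3, the first replication step could plausibly be launched as follows (a sketch only: the aspect name is hypothetical, and the argument values assume three original sites whose gateway count grows from three to eight, matching the scaling scenario described above):

//add CORBA_Gateways 4 through 8 to each of the 3 original sites
aspect AddGatewaysToOriginalSites()
{
  traverseSites(3, 1, 8, 4);
}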
findModel("Site " + id_str).addGateWay_r(m, j); 10 traverseSites(n, i+1, m, j); 11 endif; 12 } 13 14 //recursively add CORBA_Gateways to each existing site 15 strategy addGateWay_r(m, j: integer) 16 { 17 if (j<=m) then 18 addGateWay(j); 19 addGateWay_r(m, j+1); 20 endif; 21 } 22 23 //add one CORBA_Gateway and connect it to Event_Channel 24 strategy addGateWay(j: integer) 25 { 26 declare id_str : string; 27 declare ec, site_gw : object; 28 id_str := intToString(j); 29 30 addAtom("CORBA_Gateway", "CORBA_Gateway" + id_str); 31 ec := findModel("Event_Channel"); 32 site_gw := findAtom("CORBA_Gateway" + id_str); 33 addConnection("LocalGateway_EC", site_gw, ec); 34 } Listing C-3 - ECL fragment to perform the first step of replication in EQAL To conclude, scaling EQAL from a small federation of events to a very large system requires creation of a large number of new publishers and subscribers. Also, the associated EQAL modeling features, such as the event channel, the associated QoS attributes, connections and event correlations must be built accordingly. Such model scalability can be achieved through ECL model transformation with the flexibility to control the scaling size and the reusability to repeat any desired scaling task.