
MIMOS, system model-driven migration project

2013

The volatile IT industry often tempts companies to replace legacy information systems with new ones. However, legacy systems cannot always be completely discarded, because they gradually accumulate a significant amount of valuable business knowledge as a result of progressive maintenance over time. Most migration techniques are proposed and applied in an ad hoc way. As a result, most migration techniques lack automation and formalization, which makes it difficult to apply them to large, complex legacy information systems. This paper introduces MIMOS, a three-year project aimed at developing a methodological and technological modernization framework to facilitate the migration of legacy systems based on high-level design models. The work in progress during the first year has mainly focused on the definition of a business process mining technique to retrieve the business knowledge embedded in source code so that it can be reused in the target system.

2013 17th European Conference on Software Maintenance and Reengineering
DOI 10.1109/CSMR.2013.68

MIMOS. System Model-Driven Migration Project

Ricardo Pérez-Castillo, Ignacio García-Rodríguez de Guzmán and Mario Piattini
Instituto de Tecnologías y Sistemas de Información (ITSI), University of Castilla-La Mancha
Paseo de la Universidad 4, 13071, Ciudad Real, Spain
[ricardo.pdelcastillo | ignacio.grodriguez | mario.piattini]@uclm.es

Keywords: Migration, architecture-driven modernization, business process, legacy systems.

I. MOTIVATION

Although software is an intangible object, its quality diminishes over time in a similar way to that of material objects. Lehman’s first law states that an information system must continually evolve or it will become progressively less suitable in a real-world environment [7]. Companies currently have a large number of legacy systems that undergo software erosion and software ageing, which means that existing information systems become progressively less maintainable [12]. The negative effects of software erosion include dead code, clone programs, missing capacities, inconsistent data and control data (coupling), among others [15].

On the one hand, software maintenance is part of the software erosion problem, since software erosion is due to maintenance itself and to the uncontrolled evolution of the system over time. On the other hand, software maintenance is also part of the solution to software erosion. The successive changes in information systems transform them into Legacy Information Systems (LIS), and a new and improved system must therefore replace the previous one when the maintainability levels fall below acceptable limits [9].

Nevertheless, the complete replacement of these systems from scratch is a key challenge, since it has a great impact on the technological, human and economic aspects of companies [14]. Firstly, the entire replacement of LISs affects technological and human aspects, since it usually involves retraining all the users in order for them to understand the new system and/or the new technology. Secondly, the new system may lack specific functionalities as a result of technological changes. Thirdly, the economic aspect of companies is also affected, since the replacement of an entire LIS by implementing a new system from scratch implies a low Return on Investment (ROI) with regard to the old system. In addition, the development or purchase of the new system might exceed a company’s budget.

Software migration is a particular type of maintenance that focuses on adaptive and perfective modifications. Indeed, according to [5], 78% of maintenance changes are corrective or behavior-preserving. Over the last two decades, reengineering has been the principal technique used to address the migration of legacy systems [2]. Reengineering is the examination and alteration of a subject system to reconstitute it in a new form and the subsequent implementation of the new form. This form may include modifications with respect to new requirements not met by the original system [4]. The main advantage is that reengineering preserves the system’s legacy knowledge and makes it possible to change software easily, reliably and quickly, resulting in a maintenance cost that is also tolerable [1].

Several reengineering proposals consist of the analysis and inspection of different software artifacts. For example, Zou et al. [16] statically analyze the source code and apply a set of heuristic rules to discover embedded business processes. Other methods using dynamic analysis have also been proposed to preserve the embedded business knowledge; e.g., Marchetto et al. [8] discover business processes through the execution of graphical user interfaces in Web applications.

The main challenge of migration, and of reengineering in general, is that most efforts are ad hoc proposals developed for particular platforms, technologies and specific contexts. This lack of formalization and standardization leads to another challenge related to the automation of such techniques, and the repeatability of migration techniques in large-scale projects is therefore in doubt [3]. In fact, a 2005 study [14] states that 50% of reengineering projects fail owing to the lack of standardization and automation, which often leads to cost overruns. Standardization and automation challenges limit the applicability of migration techniques to large and complex legacy information systems. These challenges can be addressed by Model-Driven Development (MDD) principles, i.e., (i) considering and treating all software artifacts as models which conform to specific metamodels, and (ii) establishing automatic transformations between models at different abstraction levels.
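The two MDD principles can be made concrete with a small Python sketch. It is an illustration only and not part of MIMOS: models are plain lists of typed elements, conformance is checked against a dictionary-based metamodel, and one automatic transformation derives a more abstract model. The metamodels, the element types Method, Activity and Flow, and the function names conforms and to_abstract are all hypothetical assumptions made for this example.

```python
# Illustrative sketch only: all names below are hypothetical, not MIMOS APIs.

# A "metamodel" maps each allowed element type to its required attributes.
CODE_METAMODEL = {"Method": {"name", "calls"}}
ABSTRACT_METAMODEL = {"Activity": {"name"}, "Flow": {"source", "target"}}


def conforms(model, metamodel):
    """Principle (i): every element has a known type and its required attributes."""
    return all(
        element.get("type") in metamodel
        and metamodel[element["type"]].issubset(element.keys())
        for element in model
    )


def to_abstract(code_model):
    """Principle (ii): automatically derive a more abstract model."""
    if not conforms(code_model, CODE_METAMODEL):
        raise ValueError("input model does not conform to the code metamodel")
    # Each method becomes an activity; each call becomes a flow between activities.
    activities = [{"type": "Activity", "name": m["name"]} for m in code_model]
    flows = [
        {"type": "Flow", "source": m["name"], "target": callee}
        for m in code_model
        for callee in m["calls"]
    ]
    return activities + flows


code_model = [
    {"type": "Method", "name": "receiveOrder", "calls": ["checkStock"]},
    {"type": "Method", "name": "checkStock", "calls": []},
]
abstract_model = to_abstract(code_model)
print(conforms(abstract_model, ABSTRACT_METAMODEL))  # True
print(len(abstract_model))  # 3 (two activities and one flow)
```

Because the transformation is a function defined over any model that conforms to the input metamodel, it can be rerun unchanged on other conforming models, which is the property that makes automation and reuse feasible.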
The Architecture-Driven Modernization (ADM) initiative (also known as software modernization), launched by the OMG, particularly advocates carrying out a reengineering process by following model-driven development principles. ADM addresses the formalization problem since it represents all the artifacts involved in the reengineering process as models, which are represented in accordance with specific metamodels. ADM therefore treats all software artifacts homogeneously, i.e., as models that can be transformed into other models by using deterministic transformations. The model transformations can consequently be automated through their formalization. Furthermore, the model-driven development principles make it possible to reuse models in different modernization projects, since a computation independent model (CIM) can be transformed into several platform independent models (PIM), and each PIM can in turn be transformed into several platform specific models (PSM).

The remainder of this paper is structured as follows: Section II introduces the goals of the project. Section III explains in detail the work packages and the schedule. Section IV describes the work in progress. Finally, Section V presents preliminary results and open issues.

II. PROJECT GOALS

The main research goal of MIMOS is to develop a methodological and technological ADM-based framework to facilitate the migration of legacy systems based on high-level design models. MIMOS carries out applied research, and divides the main goal into nine sub-goals:

O1. To develop a platform-independent process supporting the reverse engineering of legacy information systems.
O2. To define a strategy for representing the retrieved knowledge in a platform-independent way.
O3. To build a mechanism for obtaining business rule / process views from legacy systems.
O4. To develop a set of refactoring techniques to be applied to high-level design models.
O5. To obtain a measuring system for assessing the gain of the migration process.
O6. To propose a set of techniques to automatically generate analysis and design models for the modernized system.
O7. To develop an incremental and iterative methodology for model-driven migration.
O8. To implement a technological framework by automating all the proposed methods and techniques.
O9. To conduct a real-life case study with a large, industrial legacy information system.

III. TECHNICAL DESCRIPTION

Table I provides the original name and code, duration, number of participants as well as the funding source and amount of the MIMOS project. Table II presents the six work packages of the project, which are divided into three milestones (one per year). Table III shows the description of the expected achievements and deliverables of the project. Figure 1 shows the temporal schedule of the MIMOS project according to the work packages.

TABLE I. PROJECT DATASHEET

Full name            Proyecto MIMOS. MIgración dirigida por MOdelos de Sistemas de información
Project code         IDI-20120260
Duration             3 years
Date                 From 01/01/2012 to 31/12/2014
Status               Milestone 1
Participants         12 full-time researchers
Source of funding    CDTI (Spanish Industrial Technology Development Center) with FEDER funds
Amount of funding    850,000 €

TABLE II. PROJECT MILESTONES AND WORK PACKAGES

Milestones: M1, M2, M3

Work packages:
WP0. Project coordination, technical and financial management
WP1. Development of a reverse engineering technique for retrieving and representing embedded knowledge
WP2. Obtaining business process views from legacy source code
WP3. Definition of model-driven refactoring
WP4. Development of a technological framework and tools
WP5. Industrial validation

TABLE III. PROJECT OUTCOMES AND DELIVERABLES

WP     Outcome                                                    Month
WP0    O.0.1. Project website development                         3
WP0    O.0.2. Monitoring reports                                  6, 12, 18, 24, 30
WP0    O.0.3. Project dissemination                               Every time
WP0    O.0.4. Final report                                        36
WP1    O.1.1. Generic reverse engineering process                 6
WP1    O.1.2. Generic migration process                           12
WP1    O.1.3. Catalog of legacy software assets                   6
WP1    O.1.4. KDM extension (ISO 19506)                           10
WP2    O.2.1. Model transformations                               24
WP3    O.3.1. Set of refactoring rules                            14
WP3    O.3.2. Metamodels to represent refactoring rules           18
WP3    O.3.3. Implementation of refactoring rules                 24
WP3    O.3.4. Measurement mechanism for the refactoring rules     24
WP4    O.4.1. Model repositories for migration models             24
WP4    O.4.2. QVT/ATL model transformations                       30
WP4    O.4.3. Supporting tools                                    34
WP4    O.4.4. Integrated migration environment                    36
WP5    O.5.1. Pilot, small-scope case studies                     12, 24
WP5    O.5.2. Industrial case studies with real-life systems      36
WP5    O.5.3. Proposal refinement                                 36

Figure 1. MIMOS project schedule (work packages WP0 to WP5 plotted against quarters I to IV of Years 1 to 3)

IV. WORK IN PROGRESS

The main effort during the first year of the MIMOS project has focused on work packages WP0, WP1 and WP2. In line with WP2 (see Table II), the work in progress particularly addresses the discovery of business process views from source code by retrieving the embedded knowledge that has to be migrated. For this purpose, a framework for recovering business processes from legacy source code has been developed. This framework is extensible to different programming languages since it is based on ADM. In addition, this framework supports the KDM (Knowledge Discovery Metamodel) [6] standard proposed by the ADM initiative. KDM enables the representation and management of knowledge extracted by means of reverse engineering from all the different software artifacts of legacy systems in an integrated and platform-independent way. That legacy knowledge is thus gradually transformed into business processes. Hence, the framework is divided into four abstraction levels with three model transformations among them (see Figure 2):

Level L0. This level represents the legacy information system in the real world and its source code, from which the underlying business processes are to be recovered.

Level L1. This level represents several specific models, i.e., one model for each different software artifact involved in the archeology process, such as source code, database, user interface, and so on. Traditional reverse engineering techniques [3] such as static analysis, dynamic analysis, program slicing, formal concept analysis, and so on, could be used to extract the knowledge from any software artifact and build PSM models related to it. These PSM models are represented according to specific metamodels. For example, a Java metamodel may be used to model the legacy source code, or an SQL metamodel to represent the database schema.

Level L2. This level consists of a single PIM that represents the integrated view of the set of PSM models at L1. The KDM metamodel is used so that L2 works as a KDM repository that can be progressively populated with knowledge extracted from the different legacy artifacts and information systems of an organization. In addition, L2 is represented in a technology-independent way because the KDM standard abstracts all details concerning the technological viewpoint (e.g., the programming language). The transformation between levels L1 and L2 consists of a set of model transformations implemented using QVT (Query/View/Transformation).

Level L3. Finally, this level depicts, at the end of the archeology process, the business process models retrieved from a legacy system. Business process models at L3 represent a CIM and are represented according to BPMN (Business Process Model and Notation). This level closes the conceptual gap between the software architecture views and the underlying business rules. The last transformation is based on a set of patterns. When a specific structure is detected in the KDM model at L2, each pattern indicates what elements should be built and how they are interrelated in the business process model at L3 [10]. This pattern matching is implemented through a QVT transformation [11].

Figure 2. Extraction of business process views

The obtained models are a first sketch of the business process, which can be refined by business experts. This is due, for instance, to the fact that not all parts of current business processes are executed by legacy information systems, i.e., there are some manual business activities. Although post-intervention by experts may be necessary, this first version of the business processes is a more efficient and less error-prone way of obtaining business process models than redesign from scratch by business experts. In addition, business process redesign by experts from scratch might discard meaningful business knowledge that is only embedded in legacy information systems. This knowledge is then used to migrate the main business functionalities to the new system.

V. PRELIMINARY RESULTS & OPEN ISSUES

The framework implemented in WP2 has been applied to several pilot industrial case studies to retrieve business processes from a wide variety of legacy information systems. Conducting these industrial case studies has made it possible to improve a preliminary tool and to refine the migration framework. So far, the framework has been applied to six legacy information systems: (i) a system managing a Spanish author organization; (ii) an open source CRM (Customer Relationship Management) system; (iii) an enterprise information system of the water and waste industry; (iv) an e-government system used in a Spanish local e-administration; (v) a high school LMS (Learning Management System); and finally (vi) an oncological evaluation system used in Austrian hospitals.

These studies evaluated the effectiveness and efficiency of the technique applied through the tool. On the one hand, effectiveness is measured through precision and recall. Precision measures the exactness or fidelity of the business processes recovered, whereas recall measures their completeness. These measures are computed with regard to the retrieved functionalities. On the other hand, the efficiency of the technique developed in WP2 is evaluated through the time spent on the recovery as well as the scalability to larger legacy information systems.

Figure 3. Effectiveness summary of case studies

Figure 3 summarizes the results obtained from the case studies regarding effectiveness. Precision and recall values vary from one system to another, although the general trend is that recall is higher than precision. This means that the technique retrieves a great number of business activities, although a few of them could be erroneously recovered.

Further empirical validation will be carried out in work package WP5 with Universitas XXI, a large industrial system in charge of the electronic administration of several Spanish universities. Universitas XXI is a client-server application written in Visual Basic and a set of Oracle forms with a total size of 2241 KLOC.

A. Open issues

One of the most frequent clarification questions concerns whether it is possible to migrate cross-cutting business functionalities from the various heterogeneous applications or subsystems that make up a whole enterprise information system, a situation present in several companies. It is an important open issue to be addressed during the project, in connection with the problems of delocalization and interleaving of the embedded business knowledge [13]. These problems lie in the fact that pieces of knowledge are usually scattered across many applications and, in turn, a single application contains several pieces of business knowledge.

The second major limitation revealed is related to the time spent on manual post-intervention to refine the first sketch of the migrated system that was automatically obtained. This time could be the bottleneck in some real migration projects. Although the pilot case studies show that our proposal is less error-prone and time-consuming than manual modeling from scratch, the manual time should be reduced during the MIMOS project.

REFERENCES

[1] Bennett, K.H. and V.T. Rajlich, Software maintenance and evolution: a roadmap, in Proceedings of the Conference on The Future of Software Engineering. 2000, ACM: Limerick, Ireland.
[2] Bianchi, A., D. Caivano, V. Marengo, and G. Visaggio, Iterative Reengineering of Legacy Systems. IEEE Trans. Softw. Eng., 2003. 29(3): p. 225-241.
[3] Canfora, G., M. Di Penta, and L. Cerulo, Achievements and challenges in software reverse engineering. Commun. ACM, 2011. 54(4): p. 142-151.
[4] Chikofsky, E.J. and J.H. Cross, Reverse Engineering and Design Recovery: A Taxonomy. IEEE Softw., 1990. 7(1): p. 13-17.
[5] Ghazarian, A., A Case Study of Source Code Evolution, in 13th European Conference on Software Maintenance and Reengineering (CSMR'09), R. Ferenc, J. Knodel, and A. Winter, Editors. 2009, IEEE Computer Society: Kaiserslautern, Germany. p. 159-168.
[6] ISO/IEC, ISO/IEC 19506. Knowledge Discovery Meta-model (KDM), v1.1 (Architecture-Driven Modernization). http://www.iso.org/iso/iso_catalogue/catalogue_ics/catalogue_detail_ics.htm?ics1=35&ics2=080&ics3=&csnumber=32625. 2012, ISO/IEC. p. 302.
[7] Lehman, M.M., D.E. Perry, and J.F. Ramil, Implications of Evolution Metrics on Software Maintenance, in Proceedings of the International Conference on Software Maintenance. 1998, IEEE Computer Society. p. 208-217.
[8] Marchetto, A. and C. Di Francescomarino, Parameterised trace selection technique for process model recovering. Software, IET, 2011. 5(6): p. 563-575.
[9] Mens, T., Introduction and Roadmap: History and Challenges of Software Evolution. Software Evolution (Springer Berlin Heidelberg), 2008. 1: p. 1-11.
[10] Pérez-Castillo, R., I. García-Rodríguez de Guzmán, O. Ávila-García, and M. Piattini, On the Use of Patterns to Recover Business Processes, in 25th Annual ACM Symposium on Applied Computing (SAC'10). 2010, ACM: Sierre, Switzerland. p. 165-166.
[11] Pérez-Castillo, R., I. García-Rodríguez de Guzmán, and M. Piattini, Implementing Business Process Recovery Patterns through QVT Transformations, in International Conference on Model Transformation (ICMT'10). 2010, Springer-Verlag. p. 168-183.
[12] Polo, M., M. Piattini, and F. Ruiz, Advances in software maintenance management: technologies and solutions. 2003: Idea Group Publishing. 286.
[13] Ratiu, D., R. Marinescu, and J. Jurjens, The Logical Modularity of Programs, in Working Conference on Reverse Engineering (WCRE'09). 2009, IEEE C. S.: Lille, France. p. 123-127.
[14] Sneed, H.M., Estimating the Costs of a Reengineering Project. Proceedings of the 12th Working Conference on Reverse Engineering. 2005: IEEE Computer Society. p. 111-119.
[15] Visaggio, G., Ageing of a data-intensive legacy system: symptoms and remedies. Journal of Software Maintenance, 2001. 13(5): p. 281-308.
[16] Zou, Y. and M. Hung, An Approach for Extracting Workflows from E-Commerce Applications, in Proceedings of the Fourteenth International Conference on Program Comprehension. 2006, IEEE Computer Society. p. 127-136.