The support for complex service delivery is becoming a key point in current internet technology. Current trends in internet applications are characterized by on-demand delivery of ever-growing amounts of content. The future internet of services will have to deliver content-intensive applications to users with quality-of-service and security guarantees. This paper describes the RESERVOIR project and the challenge of reliable and effective delivery of services as utilities in a commercial scenario. It starts by analyzing the needs of a future infrastructure provider and introducing the key concept of a service-oriented architecture that combines virtualisation-aware grid with grid-aware virtualisation, driven by business service management. The article then focuses on the benefits and innovations derived from the RESERVOIR approach. Finally, a high-level view of the general RESERVOIR architecture is illustrated.
Cloud computing leverages the use of abstracted resources. However, migrating an industrial application to well-known cloud solutions such as EC2 can be complex and demands low-level expertise. In this use case we present a methodology, based on practical experience gained in the field, that allows service providers to enable complex applications on the RESERVOIR cloud infrastructure. We also show how a complex business application, such as SAP ERP 6.0, can be automatically deployed in full and scaled up and down as resource needs change, easing the use of a cloud system for service providers that might experience difficulties, or have mental barriers, in carrying out such tasks.
Software Architecture for Big Data and the Cloud, 2017
HARNESS is a next-generation cloud-computing platform that offers commodity and specialized resources in support of large-scale data-processing applications. We focus primarily on application domains that are currently not well supported by today's cloud providers, including scientific computing, business analytics, and online machine learning. These applications often require acceleration of critical operations using devices such as FPGAs, GPGPUs, network middleboxes, and SSDs. We explain the architectural principles that underlie the HARNESS platform, including the separation of agnostic and cognizant resource management, which allows the platform to be resilient to heterogeneity while leveraging its use. We describe a prototype implementation of the platform, evaluated on two testbeds: (1) a heterogeneous compute and storage cluster that includes FPGAs and SSDs, and (2) Grid'5000, a large-scale distributed testbed that spans France. We evaluate the HARNESS cloud-computing platform with two applications: Reverse-Time Migration, a scientific computing application from the geosciences domain, and AdPredictor, a machine-learning algorithm used in the Bing search engine.
A recurrent problem encountered by distributed-system designers is that of verifying, validating and evaluating the performance of complex software systems. The behavior of these systems generally depends on how the various software entities inter-relate and on the status of the underlying inter-network. Although many aspects of the distributed software engineering process have been addressed, there is still a need to investigate suitable simulation tools and methodologies that provide support for all seven OSI layers and are geared towards modeling distributed applications over simulated inter-networks. In this paper we present such a simulation framework, illustrating its use and applicability through a number of example applications including application-level networking, mobile agent systems, GRID computing, and mobile services for 3G. Starting from the lessons learned from implementing those applications, we propose a general methodology for the assessment of distributed systems and applications.
Systems relying on increasingly large and dynamic communication networks must find effective ways to optimally localize service facilities. This can be achieved by efficiently partitioning the system and computing the partitions' centers, solving the classic p-median and p-center problems. These are NP-hard when striving for optimality, and although numerous approximate solutions have been proposed during the past 30 years, they all fail to address the combined requirements of scalability, optimality and flexibility. This thesis presents a novel location algorithm that is distributed, does not require any direct knowledge of the network topology, runs in linear time, and leads to provably near-optimal p-medians. The algorithm exploits the key properties of Mobile Agents (MAs), autonomous software entities capable of roaming the network and cloning or spawning other Mobile Agents. Mobile Agents play the role of service facilities that are capable of iterativel...
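To make the p-median problem the abstract refers to concrete, here is a minimal illustrative sketch (not the thesis's algorithm): choose p facility nodes so that the total distance from every node to its nearest facility is minimized. The toy distance matrix is an assumption; the brute-force search is exponential in p, which is exactly why distributed approximations like the MA-based one are needed.

```python
from itertools import combinations

# Symmetric shortest-path distances among 4 nodes (assumed toy data).
dist = [
    [0, 2, 5, 7],
    [2, 0, 3, 5],
    [5, 3, 0, 2],
    [7, 5, 2, 0],
]

def p_median(dist, p):
    """Brute-force exact p-median: try every size-p facility set."""
    n = len(dist)
    best_cost, best_set = float("inf"), None
    for facilities in combinations(range(n), p):
        # Each node is served by its nearest chosen facility.
        cost = sum(min(dist[v][f] for f in facilities) for v in range(n))
        if cost < best_cost:
            best_cost, best_set = cost, facilities
    return best_set, best_cost

print(p_median(dist, 2))  # optimal 2-median set and its total cost
```

With n nodes there are C(n, p) candidate sets, so this only works at toy scale; the thesis's contribution is avoiding this enumeration (and any global topology knowledge) entirely.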
Most cloud service offerings are based on homogeneous commodity resources, such as large numbers of inexpensive machines interconnected by off-the-shelf networking equipment and disk drives, to provide low-cost application hosting. However, cloud service providers have reached a limit in satisfying the performance and cost requirements of important classes of applications, such as geo-exploration and real-time business analytics. The HARNESS project aims to fill this gap by developing architectural principles that enable next-generation cloud platforms to incorporate heterogeneous technologies such as reconfigurable Dataflow Engines (DFEs), programmable routers, and SSDs, and as a result provide vastly increased performance, reduced energy consumption, and lower cost profiles. In this paper we focus on three challenges for supporting heterogeneous computing resources in the context of a cloud platform: (1) cross-optimisation of heterogeneous computing resources, (2) resource virtualisation, and (3) programming heterogeneous platforms.
Handbook of research on P2P and grid systems for …, 2010
Ragusa, C. (2010). Business Grids, Infrastructuring the Future of ICT. In Antonopoulos, N., Exarchakos, G., Li, M., & Liotta, A. (Eds.), Handbook of Research on P2P and Grid Systems for Service-Oriented Computing: Models, Methodologies and Applications (pp. 245-284).
As cloud computing becomes more predominant, the problem of scalability has become critical for cloud computing providers. The cloud paradigm is attractive because it offers a dramatic reduction in capital and operating expenses for consumers. But as the demand for cloud services increases, the ensuing increases in cost and complexity for the cloud provider may become unbearable. We briefly discuss the technologies we developed under the RESERVOIR European research project to help cloud providers deal with complexity and scalability issues. We also introduce the notion of a federated cloud, consisting of several cloud providers joined by mutual collaboration agreements. A federated cloud can deal with scalability problems in a cost-effective manner: providers in the federation who have excess capacity can share their infrastructure with members in need of additional resources.
2013 IEEE 5th International Conference on Cloud Computing Technology and Science, 2013
We investigate the feasibility of detecting host-level CPU contention from inside a guest virtual machine (VM). Our methodology involves running benchmarks with deterministic and randomized execution times inside a guest VM in a private cloud testbed. Simultaneously, using the recently proposed COCOMA tool, we expose the guest VM to host-level CPU-stealing events of increasing intensity. This leads us to observe that the use of hyper-threading in the host can hinder detection of CPU contention, which otherwise can be done accurately using the CPU steal metric. For systems where hyper-threading is enabled, we investigate the performance of some basic detection algorithms. We find that thresholding often outperforms more sophisticated statistical tests.
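The thresholding approach the abstract finds effective can be sketched as follows. This is an illustrative sketch, not the paper's code: it flags contention when the guest-visible CPU-steal metric stays above a threshold for several consecutive samples; the 5% threshold and 3-sample window are assumptions chosen for the example.

```python
def detect_contention(steal_samples, threshold=5.0, window=3):
    """Return indices where `window` consecutive CPU-steal readings (%)
    have all exceeded `threshold`, i.e. sustained host-level contention."""
    alerts, run = [], 0
    for i, s in enumerate(steal_samples):
        # Count the current run of over-threshold samples; reset on a quiet one.
        run = run + 1 if s > threshold else 0
        if run >= window:
            alerts.append(i)
    return alerts

# A sustained stealing burst in the middle of an otherwise quiet trace.
print(detect_contention([1.0, 2.0, 6.5, 8.0, 9.2, 7.1, 1.5]))
```

Requiring a run of samples rather than a single spike is what makes plain thresholding robust to transient noise, which is one plausible reason it can outperform heavier statistical tests in this setting.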
A recurrent problem encountered by distributed-system designers is that of verifying, validating and evaluating the performance of complex software systems. The behavior of these systems generally depends on how the various software entities inter-relate and on the status of the underlying inter-network. Although many aspects of the distributed software engineering process have been addressed, there is still a need to investigate suitable simulation tools and methodologies that provide support for all seven OSI layers and are geared towards modeling distributed applications over simulated inter-networks. In this paper we present such a simulation framework, illustrating its use and applicability through a number of example applications including application-level networking, mobile agent systems, GRID computing, and mobile services for 3G. Starting from the lessons learned from implementing those applications, we propose a general methodology for the assessment of distributed systems and applications.
Mobile Agent (MA) systems are complex software entities whose behavior, performance and effectiveness cannot always be anticipated by the designer. Their evaluation often presents aspects that require a careful, methodological approach, as well as the adoption of suitable tools to identify critical overheads that may impact overall system performance, stability, validity and scalability. In this paper, we propose a novel approach to evaluating complex mobile agent systems based on a hybrid framework that allows the execution of prototype agent code over simulated inter-networks. In this way it is possible to realize arbitrarily complex MA systems and evaluate them over arbitrarily complex inter-networks, relying on full support for the physical, link, network and transport layers for fixed and mobile networks. We illustrate the potential of our approach through an example agent system which we have prototyped and assessed over large-scale IP networks.
Comprehensive testing of multi-tenant cloud-based applications has to consider the effects of co-location with workloads of other tenants, which may be characteristically, accidentally or maliciously contentious. Otherwise the execution and scaling of the application can exhibit unpredictable behaviours that make it difficult for users to guarantee behaviour and for providers to safely and efficiently optimise their physical infrastructure. We present motivations, principles and work in progress on the COntrolled COntentious and MAlicious (COCOMA) framework, which supports the design and execution of these tests in a coherent and reproducible manner.
In this paper we propose a dynamic resource management system addressing the key requirements of mobile services, i.e. services realized as Mobile Agents (MAs). MAs are exploited here in different ways: first, to realize adaptable, configurable, context-aware services for 3G and beyond; second, to develop a distributed monitoring system that suits the hurdles posed by service and network mobility; and finally, to construct a management system that can dynamically re-configure MA-based services for load-balancing and adaptation purposes. We present the resource management system architecture, a scheme providing run-time adaptation through agent mobility, and a prototype implementation along with some important simulation results.
2011 IEEE Third International Conference on Cloud Computing Technology and Science, 2011
Cloud-based software testing today is predominantly focused on testing services provided in the cloud. Moreover, the properties of the testing process are often highlighted as opposed to the infrastructure. We present a taxonomy of five patterns for testing in the cloud and seven criteria for effective infrastructure. The practicality and relevance of the taxonomy are demonstrated with an application study in the Platform as a Service (PaaS) domain. This domain has been selected because there are no extensive studies on testing PaaS applications or on the infrastructure requirements for supporting such tests.
Papers by Carmelo Ragusa