Automated Workflow Scheduling in Self-Adaptive Clouds
G. Kousalya
P. Balakrishnan
C. Pethuru Raj
Concepts, Algorithms and Methods
Computer Communications and Networks
Series editor
A.J. Sammes
Centre for Forensic Computing
Cranfield University, Shrivenham Campus
Swindon, UK
The Computer Communications and Networks series is a range of textbooks,
monographs and handbooks. It sets out to provide students, researchers, and non-
specialists alike with a sure grounding in current knowledge, together with
comprehensible access to the latest developments in computer communications and
networking.
Emphasis is placed on clear and explanatory styles that support a tutorial
approach, so that even the most complex of topics is presented in a lucid and
intelligible manner.
G. Kousalya
Coimbatore Institute of Technology
Coimbatore, India
P. Balakrishnan
SCOPE, VIT University
Vellore, India
C. Pethuru Raj
Reliance Jio Cloud Services (JCS)
Bangalore, India
G. Kousalya
I express my sincere and heartfelt gratitude to my beloved father Thiru
R. Govardhanan for his selfless support and motivation in every walk of my life. In
remembrance of his enthusiasm and determination, I wholeheartedly dedicate this
book to my father.
P. Balakrishnan
First of all, I convey my heartfelt thanks to Wayne Wheeler and Simon Rees of
Springer for providing us this opportunity. Besides, I thank my co-authors for their
helping hand and valuable suggestions in shaping up the book. Above all, I must
thank my beloved parents (Mr. M. Ponnuraman and Ms. P. Parvathy) who taught me
the power of hard work. Finally, I thank my brother (Dr. P. Selvan) and sister (Dr.
P. Selvi) for their continuous and consistent moral support.
C. Pethuru Raj
I express my sincere gratitude to Wayne Wheeler and Simon Rees of Springer for
immensely helping us from the conceptualization to the completion of this book. I
need to remember my supervisors Prof. Ponnammal Natarajan, Anna University,
Chennai; Prof. Priti Shankar (late), Computer Science and Automation (CSA)
Department, Indian Institute of Science (IISc), Bangalore; Prof. Naohiro Ishii,
Department of Intelligence and Computer Science, Nagoya Institute of Technology;
and Prof. Kazuo Iwama, School of Informatics, Kyoto University, Japan, for shap-
ing my research life. I express my heartfelt gratitude to Mr. Thomas Erl, the world’s
top-selling SOA author, for giving me a number of memorable opportunities to
write book chapters for his exemplary books. I thank Sreekrishnan, a distinguished
engineer in IBM Global Cloud Center of Excellence (CoE), for extending his moral
support in completing this book.
I, at this point in time, recollect and reflect on the selfless sacrifices made by my
parents in shaping me up to this level. I would expressly like to thank my wife
(Sweetlin Reena) and sons (Darren Samuel and Darresh Bernie) for their persever-
ance as I have taken the tremendous and tedious challenge of putting the book
together. I thank all the readers for their overwhelming support for my previous
books. Above all, I give all the glory and honor to my Lord and Savior Jesus Christ
for His abundant grace and guidance.
Chapter 1
Stepping into the Digital Intelligence Era
1.1 Introduction
One of the most visible and value-adding trends in IT is digitization. All kinds of
everyday items in our personal, professional, and social environments are being
digitized systematically to be computational, communicative,
sensitive, perceptive, and responsive. That is, all kinds of ordinary entities in our
midst are instrumented differently to be extraordinary in their operations, outputs,
and offerings. These days, due to the unprecedented maturity and stability of a host
of path-breaking technologies such as miniaturization, integration, communication,
computing, sensing, perception, middleware, analysis, actuation, and orchestration,
everything has grasped the inherent power of interconnecting with one another in its
vicinity as well as with remote objects via networks purposefully and on need basis
to uninhibitedly share their distinct capabilities toward the goal of business automa-
tion, acceleration, and augmentation. Ultimately, everything will become smart,
electronics goods will become smarter, and human beings will become the smartest
in their deals, decisions, and deeds.
In this section, the most prevalent and pioneering trends and transitions in the IT
landscape are discussed, with particular emphasis on digitization technologies and
techniques.
The Trend-Setting Technologies in the IT Space As widely reported, there are
several delectable transitions in the IT landscape. The consequences are vast and
varied: the incorporation of nimbler and next-generation features and functionalities
into existing IT solutions, the grand opening of fresh possibilities and opportunities,
and the eruption of altogether new IT products and solutions for humanity.
These have the intrinsic capabilities to bring forth numerous subtle and succinct
transformations in business as well as people.
IT Consumerization The much-discussed Gartner report details the diversity of
mobile devices (smartphones, tablets, wearables, drones, etc.) and their
management. This trend is ultimately empowering people by making ubiquitous
information access possible. IT infrastructures are being tweaked accordingly in
order to support this movement, and the device explosion poses some challenges
for IT administrators. That is, IT is steadily becoming an inescapable part of
consumers’ lives, directly and indirectly. With the powerful emergence of “bring
your own device (BYOD),” the need for robust and resilient mobile device
management software solutions is being felt across the board.
Another aspect is the emergence of next-generation mobile applications and ser-
vices across a variety of business verticals. There is a myriad of mobile applications,
maps and services development platforms, programming and markup languages,
architectures and frameworks, tools, containers, and operating systems in the fast-
moving mobile space.
IT Commoditization This is another notable trend permeating the IT industry.
With the huge acceptance and adoption of cloud computing and big data analytics,
the value of commodity IT is decidedly on the rise. The embedded intelligence
inside IT hardware elements is being abstracted and centralized in hypervisor
software solutions. Hardware systems are thus software enabled to be flexibly
manipulated and maneuvered. With this transition, the hardware resources in a data
center become dumb and can be easily replaced, substituted, and composed to
quickly fulfill different requirements and use cases. IT affordability is thus
realized along with a number of other advantages. That is, future IT data centers
and server farms are going to be stuffed with commodity servers, storage, and
network solutions. The utilization level is bound to go up substantially when IT
resources are commoditized.
IT Compartmentalization (Virtualization and Containerization) “Divide and
conquer” has been the most versatile and rewarding mantra in the IT field.
Abstraction is another powerful technique gaining a lot of ground. The widely used
virtualization, which laid a stimulating and sustainable foundation for the raging
cloud idea, is actually hardware virtualization. Containerization, represented by
the popular Docker containers, sits one step above hardware virtualization. That
is, containerization is operating system (OS)-level virtualization, which is
lightweight. These two are clearly and cleverly leading the cloud journey now.
IT Digitization and Distribution As explained in the beginning, digitization has
been an ongoing and overwhelming process, and it has quickly generated and gar-
nered a lot of market and mind shares. Digitally enabling everything around us
induces a dazzling array of cascading and captivating effects in the form of cogni-
tive and comprehensive transformations for businesses as well as people. With the
growing maturity and affordability of edge technologies, every common thing in our
personal, social, and professional environment is becoming digitized. Devices are
being tactically empowered to be intelligent. Ordinary articles are becoming smart
artifacts in order to significantly enhance the convenience, choice, and comfort
levels of humans in their everyday lives and work. It is therefore no exaggeration
to state that lately there have been a number of tactical as well as strategic
advancements in the edge-technologies space. Infinitesimal and invisible tags,
sensors, actuators, controllers, stickers, chips, codes, motes, specks, smart dust,
and the like are being produced in plenty. Every single tangible item in our midst
is being systematically digitized by internally as well as externally attaching
these minuscule products onto it. This empowers such items to be smart in their
actions and reactions.
Similarly, the distribution aspect too gains more ground. Due to its significant
advantages in crafting and sustaining a variety of business applications ensuring the
hard-to-realize quality of service (QoS) attributes, there is a bevy of distribution-
centric software architectures, frameworks, patterns, practices, and platforms for
the Web, enterprise, embedded, analytical, and cloud applications and services.
one being energy efficient. Green solutions and practices are being insisted upon
everywhere these days, and IT is one of the principal culprits in wasting a lot of
energy due to the pervasiveness of IT servers and connected devices. Data centers
consume a lot of electricity, so green IT is a hot subject for study and research across
the globe. Another area of interest is remote monitoring, management, and enhance-
ment of the empowered devices. With the number of devices in our everyday envi-
ronments growing at an unprecedented scale, their real-time administration,
configuration, activation, monitoring, management, and repair (if any problem
arises) can be eased considerably with effective remote connection and correction
competency.
Extreme Connectivity The connectivity capability has risen dramatically and
become deeper and extreme. The kinds of network topologies are consistently
expanding and empowering their participants and constituents to be highly produc-
tive. There are unified, ambient, and autonomic communication technologies from
research organizations and labs drawing the attention of executives and decision-
makers. All kinds of systems, sensors, actuators, and other devices are empowered
to form ad hoc networks for accomplishing specialized tasks in a simpler manner.
There is a variety of network and connectivity solutions in the form of load
balancers, switches, routers, gateways, proxies, firewalls, etc. To provide higher
performance, network solutions are being embedded in appliance (software as well
as hardware) mode.
Device middleware or Device Service Bus (DSB) is the latest buzzword enabling
a seamless and spontaneous connectivity and integration between disparate and dis-
tributed devices. That is, device-to-device (in other words, machine-to-machine
(M2M)) communication is the talk of the town. The interconnectivity-facilitated
interactions among diverse categories of devices portend a litany of supple, smart,
and sophisticated applications for people. Software-defined networking (SDN) is
the latest technological trend, and professionals are giving this emerging yet
compelling concept renewed focus. With clouds being strengthened as the core,
converged, and central IT infrastructure, device-to-cloud connections are fast
materializing. This local as well as remote connectivity empowers ordinary
articles to become extraordinary objects by being distinctively communicative, col-
laborative, and cognitive.
Service Enablement Every technology pushes for its own adoption. Internet
computing drove Web enablement, which is the essence behind the proliferation of
Web-based applications. Now, with the pervasiveness of sleek, handy, and
multifaceted mobiles, every enterprise and Web application is being mobile-enabled.
That is, all kinds of local and remote applications are being accessed through
mobiles on the move, thus fulfilling real-time interactions and decision-making
economically. With the overwhelming approval of the service idea, every application
is being service-enabled. That is, we often read about, hear about, and experience
service-oriented systems. The majority of next-generation enterprise-scale,
mission-critical, process-centric, and multi-purpose applications are being
assembled out of multiple discrete and complex services.
Not only applications but physical devices at the ground level are being seriously
service-enabled in order to uninhibitedly join in the mainstream computing tasks
and contribute to the intended success. That is, devices, individually and collec-
tively, could become service providers or publishers, brokers and boosters, and con-
sumers. The prevailing and pulsating idea is that any service-enabled device in a
physical environment could interoperate with others in the vicinity as well as with
remote devices and applications. Services could abstract and expose only specific
capabilities of devices through service interfaces while service implementations are
hidden from user agents. Such kinds of smart separations enable any requesting
device to see only the capabilities of target devices and then connect, access, and
leverage those capabilities to achieve business or people services. Service
enablement thus minimizes dependencies and deficiencies so that devices can
interact with one another flawlessly and flexibly.
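The separation of service interface from service implementation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the names TemperatureService, ThermistorDevice, and needs_cooling, as well as the linear calibration formula, are hypothetical and do not come from any particular device platform.

```python
from abc import ABC, abstractmethod


class TemperatureService(ABC):
    """Hypothetical service interface: the only thing a requesting device sees."""

    @abstractmethod
    def read_celsius(self) -> float:
        """Return the current temperature in degrees Celsius."""


class ThermistorDevice(TemperatureService):
    """Hypothetical implementation; its internals stay hidden from callers."""

    def __init__(self, raw_millivolts: float) -> None:
        self._raw_millivolts = raw_millivolts  # private hardware detail

    def read_celsius(self) -> float:
        # Assumed linear calibration, chosen purely for illustration.
        return (self._raw_millivolts - 500.0) / 10.0


def needs_cooling(sensor: TemperatureService, limit_celsius: float) -> bool:
    """A consumer that depends only on the interface, not on a device type."""
    return sensor.read_celsius() > limit_celsius
```

Because needs_cooling is written against the interface alone, any other device that implements read_celsius could be substituted without changing the consumer.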
1.3 The Internet of Things (IoT)/Internet of Everything (IoE)
Originally, the Internet was the network of networked computers. Then, with the
heightened ubiquity and utility of wireless and wired devices, the scope, size, and
structure of the Internet have changed to what they are now, making the Internet of
Devices (IoD) concept a mainstream reality. With the service paradigm being
positioned as the optimal, rational, and practical way of building enterprise-class
applications, a gamut of services (business and IT) is being built by many,
deployed on Web and application servers worldwide, and delivered to everyone via an
increasing array of input/output devices over networks. The increased accessibility
and audibility of services have propelled interested software architects,
engineers, and application developers to realize modular, scalable, and secured
software applications by quickly choosing and composing appropriate services from
those service repositories. Thus, the Internet of Services (IoS) idea is fast
growing. Another interesting phenomenon getting the attention of the press these
days is the Internet of Energy. That is, our personal, as well as professional,
devices get their energy through their interconnectivity. Figure 1.1 illustrates
how different things are linked with one another in order to conceive, concretize,
and deliver futuristic services for mankind (Distributed Data Mining and Big Data,
a Vision paper by Intel, 2012).
As digitization gains more accolades and success, all sorts of everyday objects
are being connected with one another as well as with scores of remote applications
in cloud environments. That is, everything is becoming a data supplier for the next-
generation applications, thereby becoming an indispensable ingredient individually
as well as collectively in consciously conceptualizing and concretizing smarter
applications. There are several promising implementation technologies, standards,
platforms, and tools enabling the realization of the IoT vision. The probable output
of the IoT field is a cornucopia of smarter environments such as smarter offices,
Fig. 1.1 The extreme connectivity among physical devices with virtual applications
These can be applied to people, things, information, and places. That is, the
Internet of Everything is all set to flourish. Let me summarize and categorize the
delectable advancements in the ICT discipline into three groups.
Tending Toward the Trillions of Digitized Elements/Smart Objects/Sentient
Materials The surging popularity and pervasiveness of a litany of digitization and
edge technologies such as sensors, actuators, and other implantables;
application-specific system-on-a-chip (SoC); microcontrollers; miniaturized RFID
tags; easy-to-handle barcodes, stickers, and labels; nanoscale specks and
particles; illuminating LED lights; etc. come in handy in readily enabling every
kind of common, casual, and cheap thing in our everyday environments to join
mainstream computing. All kinds of ordinary and tangible things in our midst are
transitioned into purposefully contributing and participating articles to
accomplish extraordinary tasks. This tactically as well as strategically beneficial
transformation is performed through the well-intended and well-designed application
of disposable and disappearing yet indispensable digitization technologies. In
short, every concrete thing gets systematically connected with other things as well
as with the Web. This phenomenon is aptly termed the Internet of Things (IoT).
Further on, every kind of physical, mechanical, and electrical system at the
ground level gets hooked to various software applications and data sources in
faraway as well as nearby cyber/virtual environments. As a result, the emerging
domain of cyber-physical systems (CPS) is gaining immense attention lately:
1. Ticking toward the billions of connected devices – A myriad of electronics,
machines, instruments, wares, equipment, drones, pumps, wearables, hearables,
robots, smartphones, and other devices across industry verticals are intrinsically
instrumented at the design and manufacturing stages in order to embed the con-
nectivity capability. These devices are also being integrated with software ser-
vices and data sources at cloud environments to be enabled accordingly.
2. Envisioning millions of software services – With the accelerated adoption of
microservices architecture (MSA), enterprise-scale applications are being
expressed and exposed as a dynamic collection of fine-grained, loosely-coupled,
network-accessible, publicly discoverable, API-enabled, composable, and light-
weight services. This arrangement is all set to lay down a stimulating and sus-
tainable foundation for producing next-generation software applications and
services. The emergence of scintillating concepts such as Docker containers,
IT agility, and DevOps in conjunction with MSA clearly foretells that software
engineering is bound to flourish bigger and better. Even hardware resources and
assets are being software-defined in order to incorporate the much-needed
flexibility, maneuverability, and extensibility.
In short, every tangible thing becomes smart, every device becomes smarter, and
every human being tends to become the smartest.
The disruptions and transformations brought in by the above-articulated
advancements are really mesmerizing. IT has touched every small or big entity
decisively in order to produce context-aware, service-oriented, event-driven,
As we all know, the big data paradigm is opening up a fresh set of opportunities for
businesses. With leading market research and analyst reports forecasting a data
explosion, the key challenge in front of businesses is how to efficiently and
rapidly capture, process, and analyze data and extract tactical, operational, and
strategic insights in time to act upon them swiftly with confidence and clarity.
In the recent past, there were experiments on in-memory computing. For faster
generation of insights out of a large amount of multi-structured data, new entrants
such as in-memory and in-database analytics are highly rated and recommended. The
new mechanism puts all incoming data in memory instead of storing it in local or
remote databases so that the major barrier of data latency is removed. There is a
variety of big data analytics applications in the market that implement this new
technique in order to facilitate real-time data analytics. Timeliness is an
important factor for information to be beneficially leveraged. Appliances are in
general high-performing, thus guaranteeing higher throughput in all they do. Here
too, considering the need for real-time emission of insights, several product
vendors have taken the route of software as well as hardware appliances for
substantially accelerating the speed with which next-generation big data analytics
gets accomplished.
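The in-memory idea can be illustrated with a minimal Python sketch. It assumes that incoming records are simple (key, value) pairs and that only a running mean per key is needed. The class name InMemoryAggregator and its methods are hypothetical, and durability, distribution, and memory limits are deliberately left out.

```python
from collections import defaultdict


class InMemoryAggregator:
    """Keeps per-key running totals in memory so a query needs no database read."""

    def __init__(self) -> None:
        self._count = defaultdict(int)
        self._total = defaultdict(float)

    def ingest(self, key: str, value: float) -> None:
        # Update the aggregates as each record arrives; nothing is written to disk.
        self._count[key] += 1
        self._total[key] += value

    def mean(self, key: str) -> float:
        # Answer from memory in constant time.
        if key not in self._count:
            raise KeyError(key)
        return self._total[key] / self._count[key]


aggregator = InMemoryAggregator()
for key, value in [("pump-1", 10.0), ("pump-1", 20.0), ("pump-2", 5.0)]:
    aggregator.ingest(key, value)
```

Here aggregator.mean("pump-1") is answered from the in-memory totals without touching any store, which is the latency saving the passage refers to.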
In the business intelligence (BI) industry, apart from realizing real-time insights,
analytical processes and platforms are being tuned to bring forth insights that
predict what is likely to happen for businesses in the near future. Executives and
other serious stakeholders can therefore proactively and preemptively formulate
well-defined schemes and action plans, fresh policies, new product offerings,
premium services, and viable and value-added solutions based on the inputs.
Prescriptive analytics, on the other hand, assists business executives in
prescribing and formulating competent and comprehensive schemes and solutions based
on the predicted trends and transitions.
IBM has introduced a new computing paradigm “stream computing” in order to
capture streaming and event data on the fly and to come out with usable and reusable
patterns, hidden associations, tips, alerts and notifications, impending opportunities,
threats, etc. in time for executives and decision-makers to contemplate appropriate
countermeasures (James Kobielus [2]).
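As a rough illustration of processing event data on the fly, the following Python sketch raises an alert whenever a reading exceeds a multiple of the mean of the preceding readings. It is a generic sliding-window example, not IBM's stream computing implementation. The function name spike_alerts and the parameters window and factor are hypothetical.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple


def spike_alerts(
    readings: Iterable[float], window: int = 3, factor: float = 2.0
) -> Iterator[Tuple[int, float]]:
    """Yield (index, value) when a reading exceeds `factor` times the mean
    of the previous `window` readings."""
    recent = deque(maxlen=window)  # holds only the last `window` readings
    for index, value in enumerate(readings):
        if len(recent) == window:
            baseline = sum(recent) / window
            if value > factor * baseline:
                yield index, value
        recent.append(value)


alerts = list(spike_alerts([10, 11, 9, 30, 10, 10]))
```

Each reading is examined once as it arrives and only the last few readings are retained, so the alert is produced without first storing the whole stream.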
The Big Picture With the cloud space growing fast as the next-generation environ-
ment for application design, development, deployment, integration, management,
and delivery as a service, the integration requirement too has grown deeper and
broader as pictorially illustrated in Fig. 1.2.
[Fig. 1.2: figure not reproduced; surviving labels: Integration Bus, Embedded Space]
All kinds of physical entities at the ground level will have purpose-specific inter-
actions with services and data hosted on the enterprise as well as cloud servers and
storages to enable scores of real-time and real-world applications for the society.
This extended and enhanced integration would lead to data deluges that have to be
accurately and appropriately subjected to a variety of checks to promptly derive
actionable insights that in turn enable institutions, innovators, and individuals to be
smarter and speedier in their obligations and offerings. Newer environments such as
smarter cities, governments, retail, healthcare, energy, manufacturing, supply chain,
offices, and homes will flourish. Cloud, being the smartest IT technology, is
inherently capable of meeting all kinds of infrastructural requirements fully and
firmly.
The digitization process has gripped the whole world today as never before, and its
impacts and initiatives are being widely talked about. With an increasing variety of
input and output devices and newer data sources, the volume of data generated has
gone up remarkably. It is forecast that there will be billions of everyday devices
getting connected, capable of generating an enormous amount of data assets which
need to be processed. It is clear that the envisaged digital world is to result in a huge
amount of bankable data. This growing data richness, diversity, value, and reach
decisively gripped the business organizations and governments first. Thus, there is a
fast spreading of newer terminologies such as digital enterprises and economies.
Now it is fascinating the whole world, and this new world order has tellingly woken
up professionals and professors worldwide to formulate and firm up flexible,
futuristic strategies, policies, practices, technologies, tools, platforms, and
infrastructures to tackle this colossal yet cognitive challenge head on. Also, IT product
vendors are releasing refined and resilient storage appliances, newer types of
databases, distributed file systems, data warehouses, etc. to stock the growing
volume of business, personal, machine, people, and online data and to enable
specific types of processing, mining, slicing, and analysis of the data being
collected. This pivotal phenomenon has become a clear reason for envisioning the
digital universe.
There will be hitherto unforeseen applications in the digital universe in which all
kinds of data producers, middleware, consumers, storages, analytical systems, vir-
tualization, and visualization tools and software applications will be seamlessly and
spontaneously connected with one another. In particular, there is a series of
radical transformations in the sensor space. Nanotechnology and other
miniaturization technologies have brought in far-reaching changes in sensor design. The
nanosensors can be used to detect vibrations, motion, sound, color, light, humidity,
chemical composition, and many other characteristics of their deployed environ-
ments. These sensors can revolutionize the search for new oil reservoirs, structural
integrity for buildings and bridges, merchandise tracking and authentication, food
and water safety, energy use and optimization, healthcare monitoring and cost sav-
ings, and climate and environmental monitoring. The point to be noted here is the
volume of real-time data being emitted by the army of sensors and actuators.
The steady growth of sensor networks is expected to increase the need for storage
and processing power 1 million times by 2020. It is projected that there will be one
trillion sensors by 2030 and that every single person will be assisted by
approximately 150 sensors on this planet. Cisco has predicted that there will be 50
billion connected devices in 2020, and hence the days of the Internet of Everything
(IoE) are not too far off. All these scary statistics convey one thing: IT
applications, services, platforms, and infrastructures need to be substantially and
smartly invigorated to meet all sorts of business and people’s needs in the ensuing
era of deepened digitization.
Put simply, the data volume will be humongous as digitization grows deep and wide.
The resulting digitization-induced digital universe will, therefore, be at war with
the amount of data being collected and analyzed. Data complexity, arising from data
heterogeneity and multiplicity, will be a real challenge for enterprise IT teams.
Big data is therefore being positioned and projected as the right computing model
to effectively tackle the data revolution challenges of the ensuing digital
universe.
Big Data Analytics (BDA) The big data paradigm has become a big topic across
nearly every business domain. IDC defines big data computing as a set of new-
generation technologies and architectures, designed to economically extract value
from very large volumes of a wide variety of data by enabling high-velocity capture,
discovery, and/or analysis. There are three core components in big data: the data
itself, the analytics of the data captured and consolidated, and the articulation of
insights oozing out of data analytics. There are robust products and services that can
be wrapped around one or all of these big data elements. Thus there is a direct
connectivity and correlation between the digital universe and the big data idea
sweeping the entire business scene. The vast majority of new data being generated
as a result of digitization is unstructured or semi-structured. This means such
multi-structured big data needs to be characterized or tagged in some way to be
useful and usable. This additional characterization or tagging results in metadata,
which is one of the fastest-growing subsegments of the digital universe, though
metadata itself is a minuscule part of the digital universe. IDC believes that by
2020, a third of the data in the digital universe (more than 13,000 exabytes) will
have big data value, but only if it is tagged and analyzed. There will be routine,
repetitive, and redundant data, and hence not all data is necessarily useful for
big data analytics. However, there are some specific data types that are
particularly ripe for big data analysis such as:
cifics on business behavior and performance. With the new-generation data analyt-
ics being performed easily and economically in cloud platforms and transmitted to
smartphones, the success of any enterprise or endeavor solely rests with knowledge-
empowered consumers.
Consumers Delegate Tasks to Digital Concierges We have been using a myriad
of digital assistants (tablets, smartphones, wearables, etc.) for a variety of purposes
in our daily life. These electronics are of great help, and crafting applications and
services for these specific as well as generic devices empowers them to be more
relevant for us. Data-driven smart applications will enable these new-generation
digital concierges to be expertly tuned to help us in many things in our daily life.
Big data is driving a revolution in machine learning and automation. This will
create a wealth of new smart applications and devices that can anticipate our needs
precisely and perfectly. In addition to responding to requests, these smart applica-
tions will proactively offer information and advice based on detailed knowledge of
our situation, interests, and opinions. This convergence of data and automation will
simultaneously drive a rise of user-friendly analytic tools that help to make sense of
the information and create new levels of ease and empowerment for everything from
data entry to decision-making. Our tools will become our data interpreters, business
advisors, and life coaches, making us smarter and more fluent in all subjects of life.
Data Fosters Community Due to the growing array of extra facilities, opportuni-
ties, and luxuries being made available and accessible in modernized cities, there is
a consistent migration to urban areas and metros from villages. This trend has dis-
placed people from their roots and there is a huge disconnect between people in new
locations also. Now with the development and deployment of services (online
location-based services, local search, community-specific services, and new data-
driven discovery applications) based on the growing size of social, professional, and
people data, people can quickly form digital communities virtually in order to
explore, find, share, link, and collaborate with others. The popular social network-
ing sites enable people to meet and interact with one another purposefully. The
government uses data and analytics to establish citizen-centric services, improve
public safety, and reduce crime. Medical practitioners use it to diagnose better and
treat diseases effectively. Individuals are tapping into online data and tools for help
with everything from planning their careers and retirement to choosing everyday
service providers, picking places to live, and finding the quickest way to get to work.
Data, services, and connectivity are the three prime ingredients in establishing
and sustaining rewarding relationships among diverse and distributed people groups.
Data Empowers Businesses to Be Smart Big data is changing the way companies
conduct business. Streamlining operations, increasing efficiency to boost
productivity, improving decision-making, and bringing premium services to market
are some of the significant changes that big data makes possible. It is all about
“more with less.” Considerable cost savings are being achieved by leveraging big
data technologies smartly, and this, in turn, enables businesses to build more
competencies and capabilities.
16 1 Stepping into the Digital Intelligence Era
Big data is also being used to better target customers, personalize goods and
services, and build stronger relationships with customers, suppliers, and employees.
Business will see intelligent devices, machines, and robots taking over many repeti-
tive, mundane, difficult, and dangerous activities. Monitoring and providing real-
time information about assets, operations, and employees and customers, these
smart machines will extend and augment human capabilities. Computing power will
increase as costs decrease. Sensors will monitor, forecast, and report on
environments; smart machines will develop, share, and refine new data into knowledge
based on their repetitive tasks. Real-time, dynamic, analytics-based insights will
help businesses provide unique services to their customers on the fly. Both sensors
and smart machines will transmit these rich streams of data to cloud environments so
that all kinds of implantable, wearable, portable, fixed, and nomadic input/output
devices can provide timely information and insights to their users unobtrusively.
Advances in disciplines such as machine learning are laying a solid foundation for
smart devices. Scores of data interpretation engines, expert systems, and analytical
applications go a long way in substantially augmenting and assisting humans in their
decision-making tasks.
Big Data Brings in Big Opportunities The big data and cloud paradigms have
collectively sparked a stream of opportunities for both start-ups and existing small
businesses to find innovative ways to harness the power of the growing streams of
digital data. As the digital economy and enterprises mature, more powerful
and pioneering products, solutions, and services can be expected.
Earlier, we described three kinds of data being produced. Processing is mainly
of two types: batch and online (real-time) processing. As far as the speed with
which data needs to be captured and processed is concerned, there are both
low-latency and high-latency data. The core role of stream computing (introduced
by IBM) is to process extremely low-latency data, and it should not rely on
high-volume storage to do its job. By contrast, the conventional big data platforms
involve a massively parallel processing architecture comprising enterprise data
warehouses (EDW), the Hadoop framework, and other analytics databases. This setup
usually requires high-volume storage that can have a considerable physical footprint
within the data center. A stream computing architecture, on the other hand, uses
smaller servers distributed across many data centers. There is therefore a need to
blend and balance stream computing with the traditional approach. It is all about
choosing a big data fabric that fits the purpose at hand. The big data analytics
platform has to have specialized “data persistence” architectures for both
short-latency persistence (caching) of in-motion data (stream computing) and
long-latency persistence (storage) of at-rest data.
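The two persistence styles named above can be contrasted in a minimal Python sketch. It is only an illustration under stated assumptions: the class name `StreamWindow`, the function `batch_average`, the window size, and the sample readings are all hypothetical and do not come from any particular stream computing product. The in-memory window stands in for short-latency caching of in-motion data, and the plain list stands in for high-volume storage of at-rest data.

```python
from collections import deque


class StreamWindow:
    """Short-latency persistence: only the most recent readings stay in memory."""

    def __init__(self, size):
        self.window = deque(maxlen=size)  # the oldest reading is evicted automatically
        self.total = 0.0

    def push(self, value):
        # Subtract the reading that is about to be evicted, then add the new one.
        if len(self.window) == self.window.maxlen:
            self.total -= self.window[0]
        self.window.append(value)
        self.total += value
        return self.total / len(self.window)  # rolling average, available at once


def batch_average(stored_values):
    """Long-latency persistence: scan the whole at-rest dataset in one pass."""
    return sum(stored_values) / len(stored_values)


at_rest = []                      # hypothetical stand-in for high-volume storage
stream = StreamWindow(size=3)     # hypothetical window of three readings
rolling = []
for reading in [10, 20, 30, 40, 50]:
    rolling.append(stream.push(reading))  # in-motion insight per reading
    at_rest.append(reading)               # archived for later batch analysis

overall = batch_average(at_rest)
```

The rolling average needs memory proportional to the window only, whereas the batch average needs the full history, which is the trade-off between the two architectures described in the text.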
Stream computing is for extracting actionable insights in time out of streaming
data. This computing model prescribes an optimal architecture for real-time
analysis of data in flight.
1.7 The Cloud Infrastructures for the Digitization Era
Big Data Analytics Infrastructures As IT moves to the strategic center of
business, CXOs at organizations of all sizes turn to product vendors and service
providers to help them extract more real value from their data assets, business
processes, and other key investments. IT is being primed for eliminating all kinds of
existing business and IT inefficiencies, slippages, and wastage. Nearly 70% of the
total IT budget is being spent on IT operations and maintenance alone. Two-thirds
of companies go over schedule on their project deployments. Hence, this is the
prime time to move into smarter computing through the systematic elimination of
IT complexities and the barriers they place on innovation. Thus there is a business
need for a new category of systems. Many prescribe different characteristics for
next-generation IT infrastructures. The future IT infrastructures need to be open,
modular, dynamic, converged, instant-on, expertly integrated, shared,
software-defined, virtualized, etc.
Infrastructure Clouds: The Foundation for Big Data Analytics Businesses
currently face a deluge of data, and business leaders need a smart way of capturing
and understanding information rapidly. Infrastructure clouds enable big data
analytics in two ways: storage and analysis. With data flowing in from a wide range
of sources over a variety of networks, it is imperative that IT can store the data
and make it accessible to the business. Infrastructure clouds also enable enterprises
to take full advantage of big data by providing high-performing, scalable, and agile
storage. But the real value comes from analyzing all of the data made available. The
lure of breakthrough insights has led many lines of business to set up their own
server and storage infrastructure, networking solutions, analytics platforms,
databases, and applications. Big data appliances are also gaining market and mind
share because they offer a single, simplified setup for big data analytics.
However, these lines of business are often analyzing only narrow slivers of the full
datasets available. Without a centralized point of aggregation and integration, data
is collected in a fragmented way, resulting in limited or partial insights.
Considering the data- and process-intensive nature of big data storage and analysis,
cloud computing, storage, and network infrastructures are the best course of action.
Private, public, and hybrid clouds are the smartest way of proceeding with big data
analytics. Also, because social data are transmitted over the public and open
Internet, public clouds seem to be a good bet for some specific big data analytical
workloads. WAN optimization technologies strengthen the case for public clouds for
effective and efficient big data analysis. Succinctly speaking, cloud environments
with all the latest advancements in the form of software-defined networking, storage,
and compute infrastructures, cloud federation, etc. are the future of fit-for-purpose
big data analysis. State-of-the-art cloud centers are right for a cornucopia of
next-generation big data applications and services.
IBM has come out with expert integrated systems that ingeniously eliminate all
kinds of inflexibilities and inefficiencies. Cloud services and applications need to
be scalable and the underlying IT infrastructures need to be elastic. The business
Previously we talked about versatile infrastructures for big data analytics. In
this section, we stress the need for integrated platforms for compact big
data analysis. An integrated platform (Fig. 1.4) has to have all kinds of compatible
and optimized technologies, platforms, and other ingredients to adaptively support
varying business requirements.
The first is central core business data, the consistent, quality-assured data found
in EDW and MDM systems. Traditional relational databases, such as IBM DB2, are
the base technology. Application-specific reporting and decision support data often
stored in EDWs today are excluded. Core reporting and analytic data cover the latter
data types. In terms of technology, this type is ideally a relational database. Data
warehouse platforms, such as IBM InfoSphere Warehouse, IBM Smart Analytics
System, and the new IBM PureData System for Operational Analytics, play a strong
role here. Business needs requiring higher query performance may demand an ana-
lytical database system built on massively parallel processing (MPP) columnar
databases or other specialized technologies, such as the new IBM PureData System
for Analytics (powered by Netezza technology).
Fig. 1.4 The reference architecture for integrated data analytics platform
transformations. The BI horizon has now increased sharply with distributed and
diverse data sources. The changes have propelled industry professionals, research
labs, and academicians to bring in required technologies, tools, techniques, and
tips. The traditional BI architecture (Fig. 1.5) is as follows:
The big data-inspired BI architecture is given below. There are additional mod-
ules in this architecture as big data analytics typically involves data collection, vir-
tualization, pre-processing, information storage, and knowledge discovery and
articulation activities (Fig. 1.6).
1.9 Conclusion
There have been a number of disruptive and decisive transformations in the
information technology (IT) discipline in the last five decades. IT has quietly
moved from being a specialized tool to one that is leveraged in nearly every aspect
of our lives today. For a long time, IT has been a business enabler, and it is
steadily on the way to becoming a people enabler in the years to unfold. Once upon
a time, IT was viewed as a cost center; now it has become a profit center for all
kinds of organizations across the globe. The role and relevance of IT in this deeply
connected and shrunken world are phenomenal: through its strength in seamlessly
capturing and embodying all kinds of proven and potential innovations and
improvisations, IT is able to sustain its journey for the betterment of human
society.
The brewing trends clearly vouch for a digital universe in 2020. The distinct
characteristic of the digitized universe is the huge data collection (big
data) from a variety of sources. This voluminous data production, and the clarion
call for squeezing workable knowledge out of the data to adequately empower
human society, is prompting IT experts, engineers, evangelists, and
exponents to incorporate more subtle and successful innovations in the IT field. The
slogan “more with less” is becoming louder. The expectations from IT for
resolving various social, business, and personal problems are on the climb. In this
chapter, we have discussed the digitization paradigm and how next-generation
cloud environments are to support and sustain the digital universe.
References
1. Devlin, B.(2012), The Big Data Zoo—Taming the Beasts, a white paper by 9sight Consulting,
Retrieved from http://ibmdatamag.com/2012/10/the-big-data-zoo-taming-the-beasts/
2. Kobielus J (2013) The role of stream computing in big data architectures. Retrieved from http://
ibmdatamag.com/2013/01/the-role-of-stream-computing-in-big-data-architectures/
3. Persistent (2013) How to enhance traditional bi architecture to leverage big data, a white paper.
Retrieved from http://sandhill.com/wp-content/files_mf/persistentsystemswhitepaperenhance_
traditionalbi_architecture.pdf
Chapter 2
Demystifying the Traits of Software-Defined
Cloud Environments (SDCEs)
Abstract The cloud journey is on the fast track. The cloud idea originated and
started to thrive with server virtualization. Server machines are virtualized in
order to have multiple virtual machines, which are provisioned dynamically and kept
in a ready and steady state to deliver sufficient resources (compute, storage, and
network) for optimally running any software application. That is, a physical machine
can be empowered to run multiple and different applications through virtualization.
As a result, the utilization of expensive compute machines is steadily going up.
This chapter details and describes the nitty-gritty of next-generation cloud centers.
The motivations, the key advantages, and the enabling tools and engines along with
other relevant details are illustrated here. An SDCE is an additional
abstraction layer that ultimately defines a complete data center. This software layer
presents the resources of the data center as pools of virtual and physical resources to
host and deliver software applications. A modern SDCE is nimble and supple enough to
follow the vagaries of business movements. An SDCE is, therefore, a collection of
virtualized IT resources that can be scaled up or down as required and can be deployed
as needed in a number of distinct ways. There are three key components making up SDCEs:
1. Software-defined computing
2. Software-defined networking
3. Software-defined storage
The trait of software enablement of different hardware systems has pervaded into
other domains so that we hear and read about software-defined protection, security,
etc. There are several useful links in the portal (Sang-Woo et al. “Scalable multi-
access flash store for big data analytics” FPGA’14, Monterey, CA, USA, February
26–28, 2014) pointing to a number of resources on the software-defined cloud
environments.
2.1 Introduction
• Energy
• Resource
• Fault tolerance
• Load balancing
• Security
Thus, virtualization has brought in a number of welcome advancements. Further
on, the much-anticipated IT agility and affordability through such virtualization
mechanisms are also being realized. Virtualization is not only about partitioning
physical machines in a data center into hundreds of virtual machines in order to
fulfil the IT requirements of business activities; clubbing hundreds of virtual
machines together programmatically also brings forth a large-scale virtual
environment for running high-performance computing (HPC) applications. Precisely
speaking, the virtualization tenet is leading to the realization of cheaper
supercomputing capability. There are other crucial upgrades being brought in by
virtualization, and we describe them later in this book.
Not only servers but also networking components such as firewalls, load
balancers, application delivery controllers (ADCs), intrusion detection and
prevention systems, routers, switches, gateways, etc. are getting virtualized in
order to deliver these capabilities as network services. Other noteworthy
forward-looking movements include the realization of software-defined storage
through storage virtualization.
The prime objective of the hugely popular cloud paradigm is to realize highly
organized and optimized IT environments for enabling business automation,
acceleration, and augmentation. Most of the enterprise IT environments across the
globe are bloated, closed, inflexible, static, complex, and expensive. The brewing
business and IT challenge is therefore how to make IT elastic, extensible,
programmable, dynamic, modular, and cost-effective. Especially with businesses
worldwide cutting down their IT budgets year after year, the enterprise IT team is
left with no option other than to embark on a meticulous and measured journey
to accomplish more with less through a host of pioneering and promising
technological solutions. Organizations are coming to the conclusion that business
operations can run without a hitch with fewer IT resources through
effective commoditization, consolidation, centralization, compartmentalization
(virtualization and containerization), federation, and rationalization of various IT
solutions (server machines, storage appliances, and networking components).
IT operations also go through a variety of technology-induced innovations and
disruptions to bring in the desired rationalization and optimization. The IT infra-
structure management is also being performed remotely and in an automated man-
ner through the smart leverage of a host of automated IT operation, administration,
The mesmerizing cloud paradigm has become the mainstream concept in IT today
and its primary and ancillary technologies are simply flourishing due to the over-
whelming acceptance and adoption of the cloud theory. The cloudification move-
ment has blossomed these days, and most of the IT infrastructures and platforms
2.2 Reflecting the Cloud Journey
Very few technologies survive and contribute copiously for a long time.
Primarily, the intrinsic complexity that hinders technologies’ all-around
utilization and the lack of sustained innovation are touted as the chief reasons
for their failure and subsequent disappearance.
Thus, factors such as the fitment/suitability, adaptability, sustainability,
simplicity, and extensibility of technologies ought to be taken into serious
consideration while deciding on technologies and tools for enterprise-scale,
transformational, and mission-critical projects. The cloud technology is being
positioned as the best-in-class technology in the IT domain with
all the necessary wherewithal, power, and potential for contributing handsomely
and quickly to business disruption, innovation, and transformation
needs. Precisely speaking, the cloud idea is the aggregation of several proven
techniques and tools for realizing the most efficient, elegant, and elastic IT
infrastructure for the ensuing knowledge era.
The arrival of cloud concepts has brought in remarkable changes in the IT landscape,
which in turn lead to big transitions in the delivery of business applications
and services and to solid enhancement of business flexibility, productivity, and
sustainability. Formally, cloud infrastructures are centralized, virtualized,
automated, and shared IT infrastructures. The utilization rate of cloud
infrastructures has gone up significantly. Still, there are dependencies curtailing
the full usage of expensive IT resources. Several measures are widely prescribed to
raise IT utilization further and cut costs remarkably: decoupling modules to
eliminate constricting dependencies; more intensive and insightful process
automation through orchestration and policy-based configuration, operation,
management, delivery, and maintenance; and attaching external knowledge bases.
Lately, the aroma of commoditization and compartmentalization is picking up.
These two are the most important ingredients of cloudification. Let us begin with
the commoditization technique.
• The Commoditization of Compute Machines – The tried and time-tested abstraction
aspect is being recommended for fulfilling the commoditization need. There
is technological maturity as far as commoditizing physical/bare metal machines
through partitioning is concerned. Server commoditization has reached a state
of stability. Servers are virtualized, containerized, shared across
many clients, publicly discovered, and leveraged over any network, delivered as
a service, billed for the appropriate usage, automatically provisioned, composed
toward large-scale clusters, monitored, measured, and managed through tools,
performance tuned, made policy-aware, automatically scaled up and out based
on brewing user, data, and processing needs, etc. In short, cloud servers are being
made workload-aware. However, that is not the case with the networking and
storage portions.
• The Commoditization of Networking Solutions – On the networking front, the
proprietary and expensive network switches, routers, and other networking
solutions in IT data centers and server farms are consciously commoditized
through a kind of separation. That is, the control plane gets abstracted out, and
hence, the routers and switches retain only the data forwarding plane. That means
there is less intelligence in these systems; thereby, the goal of commoditization
of network elements is technologically enabled. The controlling intelligence
embedded inside various networking solutions is segregated and is being
separately developed and presented as a software controller. This transition
makes routers and switches dumb as they lose their costly intelligence.
Also, this strategically sound segregation comes in handy for interchanging one
device with another from a different manufacturer. The vendor lock-in problem
largely disappears with the application of the widely discussed
abstraction concept. Now, with the controlling logic in pure software form,
patches as well as configuration and policy changes in the controlling module can be
applied quickly and at low risk. With such a neat abstraction procedure, routers
and switches are becoming commoditized entities. There are fresh business and
technical advantages as the inflexible networking in present-day IT environments
steadily inches toward the benefits of commoditized networking.
• The Commoditization of Storage Appliances – Similar to the commoditization of
networking components, all kinds of storage solutions are being commoditized.
There are a number of important advantages with such transitions. In the subse-
quent sections, readers can find more intuitive and informative details on this
crucial trait. Currently, commoditization is being realized through the proven
abstraction technique.
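The control-plane and data-plane separation described in the networking bullet above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the API of any real SDN controller or protocol: the class names `Switch` and `Controller`, the method names, the switch names, and the address are all hypothetical. The switch keeps only a forwarding table, while the routing decisions live in a separate software controller.

```python
class Switch:
    """A commodity switch: a forwarding table and nothing else (data plane)."""

    def __init__(self, name):
        self.name = name
        self.flow_table = {}  # destination address -> output port

    def install_rule(self, destination, port):
        self.flow_table[destination] = port

    def forward(self, destination):
        # A plain table lookup; traffic with no matching rule is dropped.
        return self.flow_table.get(destination, "drop")


class Controller:
    """The detached control plane: decides routes and pushes them to switches."""

    def __init__(self):
        self.switches = {}

    def register(self, switch):
        self.switches[switch.name] = switch

    def set_route(self, switch_name, destination, port):
        self.switches[switch_name].install_rule(destination, port)


controller = Controller()
s1 = Switch("vendor-a-switch")   # hypothetical devices from two manufacturers
s2 = Switch("vendor-b-switch")
controller.register(s1)
controller.register(s2)

controller.set_route("vendor-a-switch", "10.0.0.2", 1)
controller.set_route("vendor-b-switch", "10.0.0.2", 3)
# A policy change is a software call on the controller, not a firmware update.
controller.set_route("vendor-a-switch", "10.0.0.2", 2)
```

Because both switches expose the same small rule-installation interface, either could be swapped for the other without touching the controller logic, which is the interchangeability argument made in the text.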
Thus commoditization plays a vital role in shaping the cloud idea. For
enhanced utilization of IT resources in an affordable fashion and for realizing
software-defined cloud environments, the commoditization techniques are being
given more emphasis these days.
The compartmentalization is being realized through the virtualization and con-
tainerization technologies. There are several comprehensive books on Docker-
enabled containerization in the market, and hence we skip the details of
containerization, which is incidentally being touted as the next best thing in the
cloud era.
As indicated above, virtualization is one of the prime compartmentalization tech-
niques. As widely accepted and articulated, virtualization has been in the forefront
in realizing highly optimized, programmable, managed, and autonomic cloud envi-
ronments. Virtualization leads to the accumulation of virtualized and software-
defined IT resources, which are remotely discoverable, network-accessible,
extraction of actionable insights out of data heaps. There are big, fast, streaming,
and IoT data, and there are batch, real-time, and interactive processing methods.
Herein, we need to feed the system with the right and relevant data along with
the programming logic on how to process the data and present the insights
squeezed out.
However, the future beckons for automated analytics in the sense that we just
feed the data and the platform creates viable and venerable patterns, models, and
hypotheses in order to discover and disseminate knowledge. The unprecedented
growth of cognitive computing is to bring the desired automation so that data
gets converted into information and insights casually and cognitively.
2. Cognitive Clouds – The cloud journey is phenomenal. It started with
server virtualization, and we are now moving toward software-defined cloud
centers. The originally envisioned goal is to have highly optimized and
organized ICT infrastructures and resources for hosting and delivering any kind of
software application. ICT resource utilization is picking up fast with all the
innovations and improvisations in the cloud field. The future belongs to real-time
and cognitive analytics of all data emanating from cloud environments. The
log, security, performance, and other operational, transactional, and analytical
data can be captured and subjected to a variety of investigations in order to
establish dynamic capacity planning, adaptive task/workflow scheduling and load
balancing, workload consolidation and optimization, resource placement, etc.
Machine learning (ML) algorithms are bound to come in handy in predicting and
prescribing valid steps toward enhanced ICT utilization while fulfilling the
functional as well as nonfunctional needs of business applications and IT
services.
3. No Servers, but Services – The ICT infrastructure operations and management
activities such as resource provisioning, load balancing, firewalling, software
deployment, etc. are being activated and accelerated through a bevy of automation
tools. The vision of NoOps is steadily becoming a reality. There is very little
intervention, instruction, interpretation, and involvement of humans in operating
and managing IT. People can focus on their core activities. On the
other hand, we are heading toward the serverless architecture. The leading cloud
service providers (CSPs) provide serverless computing capabilities through
various offerings (IBM OpenWhisk, AWS Lambda, Microsoft Azure
Functions, and Google Cloud Functions). We have been working with bare metal (BM)
servers, virtual machines (VMs) through hardware virtualization, and now
containers through OS virtualization. The future is for virtual servers. That is,
appropriate compute, storage, and network resources get assigned automatically for
event-driven applications in order to make way for serverless computing.
4. Analytics and Applications at the InterCloud of Public, Private, and Edge/Fog
Clouds – Clouds present an illusion of infinite resources, and hence big, fast,
streaming, and IoT data analytics are being accomplished in on-premise as well
as off-premise, online, and on-demand cloud centers. However, with the faster
maturity and stability of device cluster/cloud formation mechanisms due to the
rapid spread of fog/edge devices in our personal and professional environments,
real-time capture, processing, and action are being achieved these days. That is,
the newer capability of edge analytics through edge device clouds for producing
real-time insights and services is being widely accepted and adopted.
5. Secure Distributed Computing Through Blockchain – It is a widely expressed
concern that security is the main drawback of distributed computing. Similarly,
the topmost concern of cloud computing is again security. The
blockchain technology, which is very popular in financial industries, is now
being tweaked and leveraged for other industry segments. The blockchain is a
sort of public “ledger” of every transaction that has ever taken place. There is no
centralized authority; instead, a peer-to-peer (P2P) network of distributed
parties arrives at a consensus, and this consensus is entered into the ledger to be
accessed by anyone at a later point in time. It is computationally infeasible for a
single actor (or anything less than the majority consensus) to go back and modify
history. Moving away from a single decision-maker to multiple decision enablers
to strengthen the security of any kind of transaction
across a myriad of industry verticals is the game-changing breakthrough of
blockchain technology. Different viable use cases across industry segments are being
considered. There may be a compelling convergence among multiple technology domains
such as cloud environments, blockchains, artificial intelligence (AI), robotics,
self-driving vehicles, cognitive analytics, and insurance in the days ahead.
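The claim in item 5 that history cannot be quietly modified rests on each ledger entry carrying a hash of its predecessor. The following Python sketch illustrates only that hash-linking idea, as an assumption-laden toy: it deliberately omits the peer-to-peer network and the consensus step, and the function names (`block_hash`, `append_block`, `is_valid`) and sample transactions are hypothetical.

```python
import hashlib
import json


def block_hash(index, transaction, previous_hash):
    """Hash the block contents together with the hash of the previous block."""
    payload = json.dumps(
        {"index": index, "transaction": transaction, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(ledger, transaction):
    """Append a new entry that is chained to the current tail of the ledger."""
    previous_hash = ledger[-1]["hash"] if ledger else "0" * 64
    index = len(ledger)
    ledger.append({
        "index": index,
        "transaction": transaction,
        "previous_hash": previous_hash,
        "hash": block_hash(index, transaction, previous_hash),
    })


def is_valid(ledger):
    """Recompute every hash and check that each block points at its predecessor."""
    expected_previous = "0" * 64
    for block in ledger:
        if block["previous_hash"] != expected_previous:
            return False
        recomputed = block_hash(block["index"], block["transaction"], block["previous_hash"])
        if block["hash"] != recomputed:
            return False
        expected_previous = block["hash"]
    return True


ledger = []
append_block(ledger, "alice pays bob 5")    # hypothetical transactions
append_block(ledger, "bob pays carol 2")
valid_before = is_valid(ledger)

ledger[0]["transaction"] = "alice pays bob 500"  # an attempt to rewrite history
valid_after = is_valid(ledger)
```

In this sketch a single tampered entry is detected by anyone who rechecks the chain. In a real blockchain the attacker would additionally have to redo the work for every later block and out-vote the honest majority, which is what makes the rewrite computationally infeasible.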
In conclusion, the various technological evolutions and revolutions are remarkably
enhancing the quality of human lives across the world. Carefully choosing and
smartly leveraging fully matured and stabilized technological solutions and
services toward the much-anticipated digital transformation is the
need of the hour for a safe, smart, and sustainable planet.
2.4 The Emergence of Software-Defined Cloud
Environments (SDCEs)
We have discussed the commoditization tenet above. Now the buzzword of
software-defined everything is all over the place as a fulfilling mechanism for
next-generation cloud environments. Software is penetrating
into every tangible thing in order to bring in decisive and deterministic automation.
Decision-enabling, activating, controlling, routing, switching, management,
governance, and other associated policies and rules are being coded in software form
in order to bring in the desired flexibility in product installation, administration,
configuration, customization, etc. In short, the behavior of any IT product (compute,
storage, and networking) is being defined through software. Traditionally all the
right and relevant intelligence was embedded into IT systems. Now that intelligence
is being detached from those systems and run in a separate appliance, in virtual
machines, or on bare metal servers. This detached controlling machine can work
with multiple IT systems. It is easier and quicker to modify the policies
in a software controller than in the firmware, which is embedded inside IT
systems. Precisely speaking, deeper automation and software-based configuration,
controlling, and operation of hardware resources are the principal enablers behind
the longstanding vision of software-defined infrastructure (SDI).
A software-defined infrastructure is supposed to be aware of and adaptive to
business needs and sentiments. Such infrastructures are automatically governed and
managed according to business changes. That is, the complex IT infrastructure
management is automatically accomplished in consonance with the business direction
and destination. Business goals are literally programmed in and spelled out
in a software definition. The business policies, compliance and configuration
requirements, and other critical requirements are etched in software form. It is a
combination of reusable and rapidly deployable patterns of expertise, recommended
configurations, etc. in order to run businesses on the right path. There are
orchestration templates and tools, cloud management platforms such as OpenStack,
automated software deployment solutions, configuration management and workflow
scheduling solutions, etc. in order to accelerate and automate resource provisioning,
monitoring, management, and delivery needs. These solutions are able to absorb the
abovementioned software definitions and deliver on them precisely.
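One way to picture a "software definition" being absorbed by an automation tool is a declarative desired state that a reconciliation routine compares with what is actually running. The Python sketch below is a hypothetical illustration of that pattern only; it does not reproduce the template format or API of OpenStack or any other platform, and the resource names, counts, and function names (`reconcile`, `apply_actions`) are assumptions.

```python
def reconcile(desired, actual):
    """Return the provisioning actions that move `actual` toward `desired`."""
    actions = []
    for resource, wanted in sorted(desired.items()):
        have = actual.get(resource, 0)
        if wanted > have:
            actions.append(("provision", resource, wanted - have))
        elif wanted < have:
            actions.append(("release", resource, have - wanted))
    # Anything running that the definition does not mention is released.
    for resource in sorted(set(actual) - set(desired)):
        if actual[resource] > 0:
            actions.append(("release", resource, actual[resource]))
    return actions


def apply_actions(actions, actual):
    """Simulate carrying out the actions and return the new resource counts."""
    updated = dict(actual)
    for verb, resource, count in actions:
        delta = count if verb == "provision" else -count
        updated[resource] = updated.get(resource, 0) + delta
    return updated


# Hypothetical software definition derived from a business policy.
desired = {"web_vm": 4, "storage_volume": 2}
# Hypothetical current state reported by monitoring.
actual = {"web_vm": 2, "storage_volume": 3, "test_vm": 1}

actions = reconcile(desired, actual)
after = apply_actions(actions, actual)
remaining = reconcile(desired, after)
```

Changing the business requirement then means editing the `desired` mapping and rerunning the loop, rather than reconfiguring individual machines by hand.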
The SDI automatically orchestrates all its resources to meet the varying workload
requirements in near real time. Infrastructures are being equipped with real-time
analytics through additional platforms such as operational, log, performance, and
security analytics. As enunciated above, the SDI is agile, highly optimized and
organized, and workload-aware. The agility gained out of SDI is bound to propagate
and penetrate further to bring the much-needed business agility. The gap between
business expectations and IT supplies closes with the arrival of
software-defined infrastructures. SDI comprises not only virtualized servers but
also virtualized storage and networks.
Software-Defined Cloud Environments vs. Converged Infrastructure (CI) A
converged infrastructure is typically a single box internally comprising all the right
and relevant hardware (server machines, storage appliance, and network components)
and software solutions. This is being touted as a highly synchronized and
optimized IT solution for faster application hosting and delivery. CI is an integrated
approach toward data center optimization to substantially minimize the lingering
compatibility issues between server, storage, and network components. This gains
prominence because it is able to ultimately reduce the costs for cabling, cooling,
power, and floor space. CI, a renowned turnkey IT solution, also embeds the soft-
ware modules for simplifying and streamlining all the management, automation,
and orchestration needs. In other words, CI is a kind of appliance specially crafted
by a single vendor or by a combination of IT infrastructure vendors. For example, a
server vendor establishes a kind of seamless linkage with storage and network prod-
uct vendors to come out with new-generation CI solutions to speed up the process
of IT infrastructure setup, activation, usage, and management.
34 2 Demystifying the Traits of Software-Defined Cloud Environments (SDCEs)
2.5 The Major Building Blocks of Software-Defined Cloud Environments (SDCEs)
minutes; thereby, the time to value has come down sharply. The IT cost gets reduced
significantly. There are a number of noteworthy advancements in the field of server
virtualization in the form of a host of automated tools, design and deployment pat-
terns, easy-to-use templates, etc. The cloud paradigm became a famous and fantas-
tic approach for data center transformation and optimization because of the
unprecedented success of server virtualization. This riveting success has since then
penetrated into other important ingredients of data centers.
IT resources that are virtualized thereby are extremely elastic, remotely pro-
grammable, easily consumable, predictable, measurable, and manageable. With the
comprehensive yet compact virtualization sweeping each and every component of
data centers, the goal of deploying various resources in a distributed manner while
monitoring, measuring, and managing them centrally is nearing reality. Server virtualization has
greatly improved data center operations, providing significant gains in performance,
efficiency, and cost-effectiveness by enabling IT departments to consolidate and
pool computing resources. Considering the strategic impacts of 100% virtualiza-
tion, we would like to focus on network and storage virtualization methods in the
sections to follow.
Server virtualization has played a pivotal and paramount role in cloud computing.
Through server virtualization, the goals of on-demand and faster provisioning
besides the flexible management of computing resources are readily and reward-
ingly fulfilled. Strictly speaking, server virtualization also includes the virtualiza-
tion of network interfaces from the operating system (OS) point of view. However,
it does not involve any virtualization of the networking solutions such as switches
and routers. The crux of network virtualization is to derive multiple isolated
virtual networks that share the same physical network. This paradigm shift
blesses virtual networks with truly differentiated capabilities to coexist on the same
infrastructure and to bring forth several benefits toward data center automation and
transformation.
Further on, VMs across geographically distributed cloud centers can be con-
nected to work together to achieve bigger and better things for businesses. These
virtual networks can be crafted and deployed on demand and dynamically allocated
for meeting differently expressed networking demands of different business appli-
cations. The functionalities of virtual networks are decisively varying. That is, vir-
tual networks not only come handy in fulfilling the basic connectivity requirement
but also are capable of getting tweaked to get a heightened performance for specific
workloads. Figure 2.1 vividly illustrates the difference between server and network
virtualization.
Fig. 2.1 Depicting the differences between server and network virtualization (recoverable labels: virtual networks decoupled, through network virtualization, from physical networks of routers, switches, etc.)
There are several network functions such as load balancing, firewalling, routing,
switching, etc. in any IT environment. The idea is to bring forth the established
virtualization capabilities into the networking arena so that we can have virtualized
load balancing, firewalling, etc. The fast-emerging domain of network functions
virtualization aims to transform the way that network operators and communication
service providers architect and operate communication networks and their network
services.
Today’s IT environment is exceedingly dynamic with the steady incorporation
of cloud technologies. New virtual machines can be spun up in minutes and can
migrate between physical hosts. Application containers are also emerging fast in
order to speed up application composition, packaging, and shipping across data
centers. The network, by contrast, remains relatively static and painstakingly slow, in the sense
that provisioning network connectivity for applications is an error-prone
process. Data center networks have to facilitate the movement of applications
between computing servers within data centers as well as across data centers. There
is also a need for layer 2 VLAN extensions. In today’s traditional LAN/WAN
design, the extension of VLANs and their propagation within data centers is not an
easy affair. Ensuring all redundant links, in addition to switches, are properly con-
figured can be a time-consuming operation. This can introduce errors and risks.
With trends such as big data, bring your own device (BYOD), and data- and
process-intensive videos, IT infrastructures, especially networks, are under
immense pressure.
Figure (caption not recovered): applications on top, northbound APIs, a control plane of controllers 1 to N linked by east-west APIs, southbound APIs, and routers at the bottom
In the IT world, there are several trends mandating the immediate recognition and
sagacious adoption of SDN. Software-defined cloud environments (SDCEs) are
being established in different cool locations across the globe to provide scores of
orchestrated cloud services to worldwide businesses and individuals over the
Internet on a subscription basis. Application and database servers besides integra-
tion middleware solutions are increasingly distributed, whereas the governance and
the management of distributed resources are being accomplished in a centralized
manner to avail the much-needed single point of view (SPoV). Owing to the sheer size
of data centers, data traffic, both internal and external, is exploding
these days. Flexible traffic management and ensuring “bandwidth on demand” are
the principal requirements.
2.6 The Distinct Benefits of Software-Defined Networking
The benefits of SDN are definitely diversified and gradually enlarging. SDN is
being seen as a blessing for cloud service providers (CSPs), enterprise data centers,
telecommunication service providers, etc. The primary SDN benefits are the
following:
The Centralized Network Control and Programmability As we discussed
above, the gist of the SDN paradigm is to separate the control function from the
forwarding function. This separation facilitates the centralized management
of the network switches without requiring physical access to the switches. IT
infrastructure deployment and management therefore gain the most benefits,
as the SDN controller is capable of creating, migrating, and tearing down VMs
without requiring manual network configurations. This feature maximizes the value
of large-scale server virtualization.
Dynamic Network Segmentation VLANs provide an effective solution to logi-
cally group servers and virtual machines at the enterprise or branch network level.
However, the 12-bit VLAN ID cannot accommodate more than 4096 virtual net-
works, and this presents a problem for mega data centers such as public and private
clouds. Reconfiguring VLANs is also a daunting task as multiple switches and rout-
ers have to be reconfigured whenever VMs are relocated. The SDN’s support for
centralized network management and network element programmability allows
highly flexible VM grouping and VM migration.
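The 12-bit limit mentioned above is simple arithmetic, and a short sketch makes it concrete. The helper name `id_space` is hypothetical; the remark that IEEE 802.1Q reserves VLAN IDs 0 and 4095 comes from the standard rather than from this chapter.

```python
def id_space(bits: int) -> int:
    """Number of distinct identifiers that fit in a field of `bits` bits."""
    return 2 ** bits


VLAN_ID_BITS = 12
total_ids = id_space(VLAN_ID_BITS)

# IEEE 802.1Q reserves VLAN IDs 0 and 4095, so slightly fewer are usable.
usable_ids = total_ids - 2

print(total_ids, usable_ids)
```

A data center that needs one isolated segment per tenant therefore runs out of VLAN IDs after a few thousand tenants, which is the scaling problem the paragraph describes.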
High Visibility of VMs The virtual hypervisor switch and all the VMs running in
a physical server use only one or two NICs to communicate with the physical net-
work. These VMs are managed by server management tools and hence not visible to
network management tools. This lacuna makes it difficult for network designers and
administrators to understand the VM movement. However, SDN-enabled hypervisor
switches and VMs alleviate this visibility problem.
Capacity Utilization With centralized control and programmability, SDN easily
facilitates VM migration across servers in the same rack or across clusters of servers
in the same data center or even with servers in geographically distributed data centers.
This ultimately leads to automated and dynamic capacity planning that in turn
significantly increases physical server utilization.
Network Capacity Optimization The classic tri-level design of data center net-
works consisting of core, aggregation, and access layer switches (North-South
design) is facing scalability limits and poses inefficiencies for server-to-server
(East-West) traffic. There are innovative solutions such as link aggregation,
multi-chassis link aggregation, top-of-rack (ToR) switches, and layer 2 multipath
protocols. These are able to fulfill load-balancing, resiliency, and performance
requirements of dense data centers. However, these are found to be complex and
difficult to manage and maintain. The SDN paradigm enables the design and main-
tenance of network fabrics that span across multiple data centers.
Distributed Application Load Balancing With SDN, it is possible to have the
load-balancing feature that chooses not only compute machines but also the net-
work path. It is possible to have geographically distributed load-balancing
capability.
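As a rough illustration of load balancing that considers both the compute machine and the network path, the sketch below scores every (server, path) pair and picks the cheapest one. The function name, the cost formula, the weight, and all server and path names are assumptions made for this example; a real SDN controller would obtain load and latency figures from its own monitoring.

```python
def pick_server_and_path(server_load, path_latency_ms, weight=0.5):
    """Return the (server, path) pair with the lowest combined cost.

    server_load:     server name -> utilization between 0.0 and 1.0
    path_latency_ms: (server, path) -> measured latency in milliseconds
    weight:          share of the cost attributed to server load
    """
    worst_latency = max(path_latency_ms.values())

    def cost(item):
        (server, _path), latency = item
        # Normalize latency so both terms lie between 0 and 1.
        return weight * server_load[server] + (1 - weight) * latency / worst_latency

    (server, path), _latency = min(path_latency_ms.items(), key=cost)
    return server, path


# Hypothetical measurements for two servers and three candidate paths.
server_load = {"s1": 0.9, "s2": 0.3}
path_latency_ms = {("s1", "p1"): 5.0, ("s2", "p2"): 20.0, ("s2", "p3"): 8.0}

choice = pick_server_and_path(server_load, path_latency_ms)
print(choice)
```

Here the lightly loaded server wins even though the busy server has the fastest path, because the decision weighs both factors together.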
This is the big data era. Huge volumes of data are being generated, captured,
cleansed, and crunched in order to extract actionable insights due to the realization
that big data leads to big insights. Data is the new fuel for the world economy. IT
has to be prepared accordingly and pressed into service in order to garner, transmit,
and stock multi-structured and massive data. Software-defined storage gives enter-
prises, organizations, and governments a viable and venerable mechanism to effec-
tively address this explosive data growth.
We are slowly yet steadily getting into the virtual world with the faster realiza-
tion of the goals allied with the concept of virtual IT. The ensuing world is leaning
toward the vision of anytime-anywhere access to information and services. This
projected transformation needs a lot of perceivable and paradigm shifts. Traditional
data centers were designed to support specific workloads and users. This has resulted
in siloed and heterogeneous storage solutions that are difficult to manage, to extend
with newer resources for serving dynamic needs, and finally to scale out. The existing setup
acts as a barrier for business innovations and value. Untangling this goes a long way
in facilitating instant access to information and services.
Undoubtedly storage has been a prominent infrastructural module in data cen-
ters. There are different storage types and solutions in the market. In the recent past,
the unprecedented growth of data generation, collection, processing, and storage
clearly indicates the importance of producing and provisioning of better and bigger
storage systems and services. Storage management is another important topic not to
be sidestepped. We often read about big, fast, and even extreme data. Due to an
array of technology-inspired processes and systems, the data size, scope, structure,
and speed are on the climb. For example, digitization is an overwhelming world-
wide trend and trick gripping every facet of human life; thereby, the digital data is
everywhere and continues to grow at a stunning pace. Statisticians say that every
day, approximately 15 petabytes of new data is being generated worldwide, and the
total amount of digital data doubles approximately every 2 years. The indisputable
fact is that machine-generated data is larger in volume than human-generated data. The
expectation is that correspondingly there have to be copious innovations in order to
cost-effectively accommodate and manage big data.
2.7 Accentuating Software-Defined Storage (SDS) 45
HTTP using a web browser or directly through an API like REST (representational
state transfer). The flat address space in an object-based storage system enables
simplicity and massive scalability. However, the data in these systems cannot be modified in place,
and every update gets stored as a new object. Object-based storage is predominantly
used by cloud service providers (CSPs) to archive and back up their customers’
data.
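The two properties described here, a flat address space and objects that are never modified in place, can be sketched with a small in-memory store. The class `ObjectStore` and its methods are hypothetical and do not correspond to any particular product's API.

```python
import hashlib


class ObjectStore:
    """Toy object store: flat ID space, immutable objects, versions per name."""

    def __init__(self):
        self._objects = {}   # flat address space: object ID -> bytes
        self._versions = {}  # logical name -> list of object IDs, oldest first

    def put(self, name: str, data: bytes) -> str:
        # Every write creates a new object; existing objects are never changed.
        version = len(self._versions.get(name, [])) + 1
        object_id = hashlib.sha256(f"{name}:{version}".encode() + data).hexdigest()
        self._objects[object_id] = data
        self._versions.setdefault(name, []).append(object_id)
        return object_id

    def get(self, object_id: str) -> bytes:
        return self._objects[object_id]

    def latest(self, name: str) -> str:
        return self._versions[name][-1]


store = ObjectStore()
first_id = store.put("report", b"draft")
second_id = store.put("report", b"final")

print(store.get(first_id), store.get(store.latest("report")))
```

Because an update never overwrites earlier data, the older object stays retrievable by its ID, which suits the archive and backup use case mentioned above.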
Analysts estimate that more than 2 million terabytes (or 2 exabytes) of data are
created every day. The range of applications that IT has to support today spans
everything from social computing and big data analytics to mobile, enterprise, and
embedded applications. All the data for all those applications has to be made available
to mobile and wearable devices, and hence data storage acquires an indispensable
status. According to Cisco’s global IP traffic forecast, global IP traffic was expected
to reach 1.1 zettabytes per year, or 91.3 exabytes per month, in 2016 (1 exabyte is
1 billion gigabytes), and 1.6 zettabytes per year, or 131.9 exabytes per month, by 2018.
IDC predicted that cloud storage capacity would exceed 7 exabytes in 2014, driven by
a strong demand for agile and capex-friendly deployment models. Furthermore, IDC
estimated that by 2015, big data workloads would be one of the fastest-growing
contributors to storage in the cloud. In conjunction with these trends, meeting
service-level agreements (SLAs) for the agreed performance is a top IT concern. As a
result, enterprises will increasingly turn to flash-based SDS solutions to accelerate
performance significantly and meet emerging storage needs.
2.8 The Key Characteristics of Software-Defined Storage (SDS)
SDS is characterized by several key architectural elements and capabilities that dif-
ferentiate it from the traditional infrastructure.
Commodity Hardware With the extraction and centralization of all the intelli-
gence embedded in storage and its associated systems in a specially crafted software
layer, all kinds of storage solutions are bound to become cheap, dumb, off-the-shelf,
and hence commoditized hardware elements. Not only the physical storage appli-
ances but also all the interconnecting and intermediate fabric is to become commod-
itized. Such segregation goes a long way in centrally automating, activating, and
adapting the full storage landscape.
Scale-Out Architecture Any SDS setup ought to have the capability of ensuring a
fluid, flexible, and elastic configuration of storage resources through software. SDS
facilitates the realization of storage as a dynamic pool of heterogeneous resources;
thereby, the much-needed scale-out requirement can be easily met. The traditional
architecture hinders the dynamic addition and release of storage resources due to
the extreme dependency. For the software-defined cloud environments, storage scal-
ability is essential to have a dynamic, highly optimized, and virtual environment.
Resource Pooling The available storage resources are pooled into a unified logical
entity that can be managed centrally. The control plane provides the fine-grained
visibility and the control to all available resources in the system.
Abstraction Physical storage resources are increasingly virtualized and presented
to the control plane, which can then configure and deliver them as tiered storage
services.
Automation The storage layer brings in extensive automation that enables it to
deliver one-click and policy-based provisioning of storage resources. Administrators
and users request storage resources in terms of application need (capacity, perfor-
mance, and reliability) rather than storage configurations such as RAID levels or
physical location of drives. The system automatically configures and delivers stor-
age as needed on the fly. It also monitors and reconfigures storage as required to
continue to meet SLAs.
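Policy-based provisioning of this kind can be sketched as a lookup that maps an application's stated need onto the cheapest storage tier that satisfies it. The tier names, performance figures, and prices below are invented for illustration, as are the `Tier`, `Request`, and `provision` names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    iops: int               # performance the tier can sustain
    replicas: int           # copies kept, a stand-in for reliability
    cents_per_gb: int       # hypothetical monthly price


@dataclass(frozen=True)
class Request:
    capacity_gb: int
    min_iops: int
    min_replicas: int


# Hypothetical catalogue; the requester never sees RAID levels or drive locations.
TIERS = [
    Tier("archive", 500, 2, 1),
    Tier("standard", 5000, 2, 5),
    Tier("flash", 50000, 3, 20),
]


def provision(request, tiers=TIERS):
    """Return (tier name, monthly cost in cents) for the cheapest eligible tier."""
    eligible = [
        t for t in tiers
        if t.iops >= request.min_iops and t.replicas >= request.min_replicas
    ]
    if not eligible:
        raise ValueError("no tier satisfies the request")
    best = min(eligible, key=lambda t: t.cents_per_gb)
    return best.name, best.cents_per_gb * request.capacity_gb


allocation = provision(Request(capacity_gb=100, min_iops=1000, min_replicas=2))
print(allocation)
```

Re-running the same function when measured performance drops below the request is one simple way to picture the monitoring and reconfiguration step that keeps SLAs met.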
Programmability In addition to the inbuilt automation, the storage system offers
fine-grained visibility and control of underlying resources via rich APIs that allow
administrators and third-party applications to integrate the control plane across stor-
age, network, and compute layers to deliver workflow automation. The real power
of SDS lies in the ability to integrate it with other layers of the infrastructure to build
end-to-end application-focused automation.
The maturing of SDS is set to quicken the process of setting up and sustaining
software-defined environments for the tactical as well as the strategic benefit of
cloud service providers as well as consumers at large.
2.9 The Key Benefits of Software-Defined Cloud Environments (SDCEs)
The new technologies have brought in highly discernible changes in how data cen-
ters are being operated to deliver both cloud-enabled and cloud-native applications
as network services to worldwide subscribers. Here are a few important implica-
tions (business and technical) of SDCEs.
The consolidation and centralization of commoditized, easy-to-use and easy-to-
maintain, and off-the-shelf server, storage, and network hardware solutions obviate
the need for having highly specialized and expensive server, storage, and network-
ing components in IT environments. This cloud-inspired transition brings down the
capital as well as operational costs sharply. The most important aspect is the intro-
duction and incorporation of a variety of policy-aware automated tools in order to
quickly provision, deploy, deliver, and manage IT systems. There are other mecha-
nisms such as templates, patterns, and domain-specific languages for automated IT
setup and sustenance. Hardware components and application workloads are being
provided with well-intended APIs in order to enable remote monitoring, measure-
ment, and management of each of them. The APIs facilitate the system interopera-
bility. The direct fallout here is that we can arrive at highly agile, adaptive, and
affordable IT environments. The utilization of hardware resources and applications
goes up significantly through sharing and automation. Multiple tenants and users
can avail the IT facility comfortably for a cheaper price. The cloud technologies and
their smart leverage ultimately ensure the system elasticity, availability, and security
along with application scalability.
Faster Time to Value The notion of IT as a cost center is slowly disappearing, and
businesses across the globe have understood the strategic contributions of IT in
ensuring the mandated business transformation. IT is being positioned as the most
competitive differentiator for worldwide enterprises to be smartly steered in the
right direction. However, there is an insistence for more with less as the IT budget
is being consistently pruned every year. Thus enterprises started to embrace all
kinds of proven and potential innovations and inventions in the IT space. That is,
establishing data centers locally or acquiring the right and relevant IT capabilities
from multiple cloud service providers (CSPs) is heavily simplified and accelerated.
Further on, resource provisioning, application deployment, and service delivery are
automated to a greater extent, and hence it is easier and faster to realize the business
value. In short, the IT agility being accrued through the cloud idea translates into
business agility.
Affordable IT By expertly pooling and assigning resources, the SDCEs greatly
maximize the utilization of the physical infrastructures. With enhanced utilization
through automation and sharing, the cloud center brings down the IT costs remark-
ably while enhancing the business productivity. The operational costs come down
due to tools-supported IT automation, augmentation, and acceleration.
Eliminating Vendor Lock-In Today’s data center features an amazing array of
custom hardware for storage and networking requirements such as routers, switches,
firewall appliances, VPN concentrators, application delivery controllers (ADCs),
storage controllers, and intrusion detection and prevention components. With storage
and network virtualization, the above functions are performed by software running
on commodity x86 servers. Instead of being locked into the vendor’s hardware,
IT managers can buy commodity servers in quantity and use them for running the
network and storage controlling software. With this transition, the perpetual vendor
lock-in issue gets simply solved and surmounted. Modifying the source code is
quite easy and fast, policies can be established and enforced, software-based
2.10 Conclusion
The aspect of IT optimization is continuously getting rapt and apt attention from
technology leaders and luminaries across the globe. A number of generic, as well as
specific, improvisations are being brought in to make IT aware and adaptive. The
cloud paradigm is being touted as the game changer in empowering and elevating
IT to the desired heights. There have been notable achievements in making IT
the core and cost-effective enabler of both personal and professional activities.
There are definite improvements in business automation, acceleration, and augmentation.
Still, there are opportunities and possibilities waiting for IT to move up
further.
The pioneering virtualization technology is being taken to every kind of infrastructure,
such as networking and storage, to complete the IT ecosystem. The
abstraction and decoupling techniques are lavishly utilized here in order to bring in
the necessary malleability, extensibility, and serviceability. That is, all the configu-
ration and operational functionalities hitherto embedded inside hardware compo-
nents are now neatly identified, extracted and centralized, and implemented as a
separate software controller. That is, the embedded intelligence is being developed
now as a self-contained entity so that hardware components could be commod-
itized. Thus, the software-defined computing, networking, and storage disciplines
have become the hot topic for discussion and dissertation. The journey of data cen-
ters (DCs) to software-defined cloud environments (SDCEs) is being pursued with
vigor and rigor. In this chapter, we have primarily focused on the industry mecha-
nism for capturing and collecting requirement details from clients.
3.1 Introduction
Scientific workflows model complex applications as a DAG (Directed Acyclic Graph),
whose nodes and edges make it easy to express the entire data process with its
dependencies. Huge amounts of data are consumed and produced during scientific
experiments, which makes the workflows data intensive. As the complexity of scientific
applications increases, the need for a Scientific Workflow Management System to
automate and orchestrate the end-to-end processing also increases. In order to process
large-scale data, workflows need to be executed in a distributed environment such as
grid or cloud.
A Scientific Workflow Management System is an effective framework to execute
workflows and manage massive datasets in a computing environment. Several workflow
management systems, such as Kepler [1], Taverna [2], Triana [3], Pegasus [4], and
Askalon [5], are available, and they are widely used by researchers in various
fields such as astronomy, biology, limnology, geography, computational engineering,
and others. A suitable environment for workflow computation, storage provisioning,
and execution is provided by grid, cluster, and cloud.
Cloud computing is a widely used computing environment comprising several
data centers, each with its own resources and data, which provides scalable and
on-demand services over the Internet under a pay-as-you-go pricing model. This chapter
provides a detailed description of widely used scientific workflow management
systems.
Workflow management systems (WMSs) provide tools to define, map, and execute workflow
applications. There are numerous WMSs available to support workflow execution in
different domains. This section describes some of the significant WMSs developed
by the research community. Workflow systems develop their own workflow
representations to describe workflows. Generally, workflow models are roughly
categorized into two types: script-like systems and graphical-based systems. Script-like
systems describe workflows using textual programming languages such as Java,
Python, Perl, or Ruby. They declare tasks and their dependencies with a textual
specification. Examples of the script-based model are GridAnt [6] and Karajan [7].
Graphical-based systems describe workflows with basic graphical elements, namely
nodes and edges, which is easier than the script-based model. The nodes
represent the actual computational tasks, and the communication between the tasks
is represented with the help of links. Workflows are often created by
dragging and dropping graph elements.
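To give a feel for the script-like style, the following sketch declares tasks and their dependencies as plain Python data and derives a valid execution order. It does not reproduce the syntax of GridAnt or Karajan; the task names are invented, and the ordering relies on `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Textual specification: each task maps to the set of tasks it depends on.
workflow = {
    "fetch": set(),
    "clean": {"fetch"},
    "analyze": {"clean"},
    "plot": {"analyze"},
    "report": {"analyze", "plot"},
}

# A topological order respects every dependency edge of the DAG.
order = list(TopologicalSorter(workflow).static_order())
print(order)
```

A graphical system stores essentially the same nodes and edges; the difference lies in how the user enters them.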
3.3 Kepler
Kepler is based on directors, which dictate the flow of execution within a work-
flow. The basic components of Kepler are director, actor, parameter, ports, and rela-
tions which are shown in Fig. 3.2.
Director – Represents the model of computation used in a workflow and controls
the overall execution flow.
Actor – Executes the director's instructions; a composite actor performs
complex operations.
Ports – Actors are connected to each other with the help of ports. An actor may
contain one or more ports to produce or consume data and communicate with other
actors. A link is the dataflow from one actor port to another.
Relations – Allow the user to branch a dataflow.
Parameter – Configurable values attached to a workflow, director, or actors.
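The roles listed above can be pictured with a toy dataflow model. This is only a sketch in Python and not Kepler's actual API; the classes `Actor` and `SequentialDirector`, the firing rule, and the sample values are all assumptions made for illustration.

```python
class Actor:
    """A processing step with one input port (inbox) and outgoing links."""

    def __init__(self, name, func):
        self.name = name
        self.func = func
        self.inbox = []   # tokens that arrived on the input port
        self.links = []   # downstream actors fed by the output port

    def connect(self, downstream):
        # A link carries data from this actor's output port to another actor.
        self.links.append(downstream)

    def fire(self):
        result = self.func(*self.inbox)
        self.inbox.clear()
        for downstream in self.links:
            downstream.inbox.append(result)
        return result


class SequentialDirector:
    """Controls execution by firing each actor once, in the given order."""

    def __init__(self, actors):
        self.actors = actors

    def run(self):
        return {actor.name: actor.fire() for actor in self.actors}


factor = 2  # a parameter: a configurable value used by one actor
source = Actor("source", lambda: 3)
scale = Actor("scale", lambda x: x * factor)
negate = Actor("negate", lambda x: -x)
source.connect(scale)    # two links from the same output port
source.connect(negate)   # branch the dataflow, as a relation would
results = SequentialDirector([source, scale, negate]).run()
print(results)
```

Swapping in a different director class, for example one that fires actors concurrently, would change how the workflow executes without touching the actors, which is the separation the component list describes.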
Features of Kepler
• A run-time engine and a graphical user interface are supported, so workflows
can be executed through either the GUI or the CLI.
• Reusability is supported in Kepler; modules and components can be created,
saved, and reused for other workflow applications.
• Native support for parallel processing applications.
• The reusable library contains 350 ready-to-use process components. These
components can be easily customized and used for other workflow applications. Kepler
supports integration with other applications such as:
Statistical analysis in Kepler workflows, by integrating them with R
and MATLAB.
WSDL-defined Web services, for workflow access and execution.
• Workflows can be uploaded, downloaded, and searched on Kepler’s
Component Repository, which provides a centralized server.
• bioKepler supports the rapid development and scalable distributed execution of
bioinformatics workflows in Kepler. Its latest version, bioKepler 1.2, includes new
features for bioinformatics applications such as:
Workflow for BLAST+
Machine learning
Update to Spark 1.5.0
3.4 Taverna
Fig. 3.3 Scientific workflow representation in Taverna (example using the Spreadsheet Import service to import data from an Excel spreadsheet)
biology, and medicine to execute scientific workflows and support in silico experimentation,
where the experiments are carried out through computer simulations with
models that closely reflect the real world. It supports Web services, local
Java services and APIs, R scripts, and CSV data files.
Taverna Workbench is a desktop client application with a GUI for creating, editing,
and running workflows. Taverna Server is used for remote execution.
Taverna Player is a web interface for executing workflows through Taverna
Server. Taverna also provides a Command Line Tool for executing workflows from
the command line. Taverna offers a service discovery facility and supports
browsing selected service catalogues. Figure 3.3 represents a workflow designed
in Taverna. Taverna can access local processes, Web services, and also high-performance
computing infrastructures through the Taverna PBS plugin.
Taverna utilizes a simple conceptual unified flow language (Scufl) to represent
the workflow [11]. Scufl is an XML-based language, which consists of three main
entities: processors, data links, and coordination constraints. Processors represent a
computational activity in a scientific workflow. Data links and coordination constraints
represent, respectively, the data dependencies and control dependencies
between two activities.
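The three Scufl entities can be mirrored in a small data model to show how data links and coordination constraints both restrict the execution order. The sketch below is written in Python rather than in Scufl's XML syntax, and the processor names and the helper `execution_order` are hypothetical.

```python
from graphlib import TopologicalSorter

# Processors: the computational activities of the workflow.
processors = ["fetch_sequence", "align", "render", "notify"]

# Data links: output of the first processor feeds the second.
data_links = [("fetch_sequence", "align"), ("align", "render")]

# Coordination constraints: pure ordering, no data is passed.
coordination_constraints = [("render", "notify")]


def execution_order(processors, data_links, coordination_constraints):
    """Combine both kinds of dependency and return one valid run order."""
    depends_on = {p: set() for p in processors}
    for upstream, downstream in data_links + coordination_constraints:
        depends_on[downstream].add(upstream)
    return list(TopologicalSorter(depends_on).static_order())


run_order = execution_order(processors, data_links, coordination_constraints)
print(run_order)
```

Keeping the two lists separate preserves the distinction the language makes, even though a scheduler can treat both as ordering edges.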
Graphical SWfMSs combine the efficiency of scientific workflow design and the
ease of scientific workflow representation. Desktop-based graphical SWfMSs are
typically installed either in a local computer or in a remote server that is accessible
through network connection. The local computer or remote server can be connected
to large computing and storage resources for large-scale scientific workflow
execution.
The data storage module generally exploits database management systems and
file systems to manage all the data during workflow execution. Some SWfMSs such
as Taverna put intermediate data and output data in a database.
3.5 Triana
3.6 Pegasus
related components which are required for the workflow execution. It also
restructures workflows for optimized performance.
• Local execution engine – Submits the jobs and manages them by tracking the job
state and determining when to run a job. It then submits the jobs to the local
scheduling queue.
• Job scheduler – Manages the jobs on both local and remote resources.
• Remote execution engine – Manages the execution of jobs on the remote
computing nodes.
• Monitoring component – Monitors the workflow and tracks all the information
and logs. The workflow database collects this information and provides the
provenance and performance information. Additionally, it notifies the user about the
workflow status.
3.7 ASKALON
• Resource broker: Reserves the resources required for the execution of workflow
applications.
• Resource monitoring: Support for monitoring grid and cloud resources by inte-
gration and scaling of resources using new techniques.
• Information service: Service for discovering and organizing resources and data.
• Workflow executer: Executing workflow applications in remote cloud and grid
sites.
64 3 Workflow Management Systems
3.8 Conclusion
Scientific workflows represent complex applications as DAGs, wherein the nodes
and edges make it easy to express the entire data process with its dependencies.
Workflow management systems provide tools to define, map, and execute workflow
applications. This chapter highlights the salient features of the most popular workflow
management systems such as Kepler, Pegasus, Triana, Taverna, and ASKALON.
However, the workflows that are created using a WMS require the following important
ingredients for their efficient execution: a scheduling algorithm and a heterogeneous
and scalable computing environment. Chapter 4 explains these perspectives
by reviewing several notable research works that are available in the literature.
References
4.1 Introduction
• Load balancing
• Security
The workflow model and its structure are dealt with in Section 4.2 along with the
taxonomy of workflow scheduling.
Fig. 4.1 Workflow (task graph with T1 at the top, T2–T9 on the second level, T10 and
T11 on the third level, and T12 at the bottom)
68 4 Workflow Scheduling Algorithms and Approaches
Static scheduling assumes that the task timing information is known precisely in
advance, but it incurs less runtime overhead. An example of a static scheduling
algorithm is Opportunistic Load Balancing (OLB).
List Scheduling Heuristic
A list scheduling heuristic creates a scheduling list by assigning a priority to each
task and sorting the tasks by priority; it then repeatedly selects a task and a resource
until all tasks in the DAG are scheduled. A prioritizing attribute and a resource
selection strategy are required to decide the task priorities and the best resource for
each task.
Some list scheduling heuristics are modified critical path (MCP), mapping heuristic
(MH), insertion scheduling heuristic, earliest time first (ETF), heterogeneous
earliest finish time (HEFT), critical path on a processor (CPOP), dynamic
level scheduling (DLS), dynamic critical path (DCP), and predict earliest finish time
(PEFT).
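The two phases described above (prioritize, then select a resource) can be sketched as follows. This is a minimal illustration and not an implementation of any of the named heuristics: it assumes an upward-rank priority (mean cost plus the longest path to an exit task) and an earliest-finish-time resource choice, and it ignores communication costs. The function name and the input dictionaries are hypothetical.

```python
from collections import defaultdict

def list_schedule(tasks, deps, cost, resources):
    """Simplified list scheduling.

    tasks: list of task names
    deps: {task: [parent tasks]}
    cost: {task: {resource: execution time}} (all times assumed positive)
    resources: list of resource names
    Returns {task: (resource, start, finish)}.
    """
    children = defaultdict(list)
    for task, parents in deps.items():
        for parent in parents:
            children[parent].append(task)

    # Phase 1: priority = mean execution time + longest downstream path.
    rank = {}
    def upward_rank(task):
        if task not in rank:
            mean = sum(cost[task].values()) / len(cost[task])
            rank[task] = mean + max(
                (upward_rank(c) for c in children[task]), default=0.0)
        return rank[task]

    # With positive costs, a parent always ranks above its children,
    # so this order respects the precedence constraints.
    order = sorted(tasks, key=upward_rank, reverse=True)

    # Phase 2: place each task on the resource that finishes it earliest.
    ready_at = {r: 0.0 for r in resources}
    finish = {}
    placement = {}
    for task in order:
        earliest = max((finish[p] for p in deps.get(task, [])), default=0.0)
        best = min(resources,
                   key=lambda r: max(ready_at[r], earliest) + cost[task][r])
        start = max(ready_at[best], earliest)
        finish[task] = start + cost[task][best]
        ready_at[best] = finish[task]
        placement[task] = (best, start, finish[task])
    return placement
```

Swapping the priority function or the resource selection rule is what distinguishes one list scheduling heuristic from another; the surrounding loop stays the same.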
Clustering Heuristic
A clustering heuristic is designed to optimize the transmission time between
data-dependent tasks. The two main parts of clustering heuristics are clustering and
ordering. Clustering maps tasks to clusters, whereas ordering orders the tasks within
the same cluster.
Duplication Heuristic
Duplication is usually used along with list scheduling or clustering scheduling,
either as an optimization procedure or as a new algorithm. There are two issues to be
addressed when designing an effective task duplication algorithm:
1. Which task(s) to duplicate? The start time of a child task can be minimized by
selecting which parent tasks to duplicate.
2. Where to duplicate the task(s)? A proper time slot must be allocated on the
resources to duplicate the parent task(s).
According to the selection of duplicate tasks, duplication algorithms are classified as
scheduling with full duplication (SFD) and scheduling with partial duplication
(SPD). In SFD, tasks with high priority or at higher levels are considered for
duplication.
Meta-heuristic
Meta-heuristic approaches are used to obtain better solutions to large,
complicated optimization problems.
DAG scheduling is an NP-complete problem. Therefore, developing an approximate
algorithm is a good alternative to exact methods. Several meta-heuristic
solutions have been proposed, as they provide an efficient way toward a good
solution. The genetic algorithm is one of the best solutions for the task scheduling
problem. Examples of genetic algorithms for DAG scheduling include [8, 11, 13, 14].
These genetic algorithms differ in the string representation of schedules in the search
space, the fitness function used to evaluate schedules, the genetic operators for
generating new schedules, and the stochastic assignment used to control the genetic
operators.
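The four ingredients listed above can be made concrete with a small sketch. It is an illustration only and does not reproduce any of the cited algorithms. For brevity it assumes independent tasks (precedence constraints are ignored), a string representation in which position i holds the resource index of task i, makespan as the fitness function, one-point crossover and per-gene mutation as the genetic operators, and a seeded random generator as the stochastic control. All names and default parameters are hypothetical.

```python
import random

def makespan(assignment, lengths, speeds):
    """Fitness: finishing time of the most loaded resource."""
    load = [0.0] * len(speeds)
    for task, res in enumerate(assignment):
        load[res] += lengths[task] / speeds[res]
    return max(load)

def genetic_schedule(lengths, speeds, pop_size=20, generations=50,
                     mutation_rate=0.1, seed=0):
    """Return a task-to-resource assignment (requires at least 2 tasks
    and pop_size >= 4)."""
    rng = random.Random(seed)
    n, m = len(lengths), len(speeds)

    def fitness(assignment):
        return makespan(assignment, lengths, speeds)

    # String representation: one resource index per task.
    population = [[rng.randrange(m) for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        survivors = population[: pop_size // 2]  # elitist selection
        offspring = []
        while len(survivors) + len(offspring) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)            # one-point crossover
            child = a[:cut] + b[cut:]
            child = [rng.randrange(m) if rng.random() < mutation_rate else g
                     for g in child]             # per-gene mutation
            offspring.append(child)
        population = survivors + offspring
    return min(population, key=fitness)
```

Because the fitter half of each generation is carried over unchanged, the best makespan in the population never gets worse from one generation to the next.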
Blythe et al. [15] investigated the GRASP (Greedy Randomized Adaptive
Search Procedure) algorithm for workflow scheduling on grids, which performs better
than the Min-Min heuristic for data-intensive applications. Young et al. [16]
investigated the performance of simulated annealing (SA) algorithms for scheduling
workflow applications in a grid environment.
Dynamic scheduling has no advance knowledge of task arrival information; decisions
are made at runtime, so it adapts better to timing changes during execution.
Examples of dynamic scheduling are Earliest Deadline First (EDF) and Least Laxity
First (LLF). Dynamic scheduling is developed for handling the unavailability of
scheduling information and resource contention with other workflow or
non-workflow system load. Sonmez et al. [52] presented a dynamic scheduling
taxonomy based on three kinds of resource information (status, processing speed, and
link speed) and two kinds of task information (task length and communication data
size). A dynamic scheduling algorithm balances the load among the available resource
queues. Xie and Qin [17, 18] addressed a family of dynamic security-aware
scheduling algorithms for homogeneous clusters and heterogeneous distributed systems.
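The difference between the two dynamic policies named above can be shown in a few lines. This sketch assumes each ready task is a dictionary with hypothetical keys `deadline` and `remaining` (remaining execution time), and it uses the usual definition of laxity as deadline minus current time minus remaining execution time.

```python
def edf_pick(ready, now):
    """Earliest Deadline First: pick the task whose deadline is closest.
    The current time is accepted only to keep both pickers interchangeable."""
    return min(ready, key=lambda t: t["deadline"])

def llf_pick(ready, now):
    """Least Laxity First: pick the task with the least spare time left."""
    return min(ready, key=lambda t: t["deadline"] - now - t["remaining"])
```

For example, with task A (deadline 10, remaining 2) and task B (deadline 12, remaining 9) ready at time 0, EDF selects A because its deadline is earlier, whereas LLF selects B because its laxity (3) is smaller than that of A (8).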
4.6 Taxonomy of Cloud Resource Scheduling 71
Workflow scheduling is the problem of mapping workflow tasks onto suitable
resources while satisfying the constraints imposed by the users [19]. In other
words, it is the process of automating workflows with the help of algorithms.
A workflow consists of a sequence of connected instructions. The motive of workflow
scheduling is to automate the procedures, especially those involved in
passing the data and fields between the participants of the cloud, while maintaining
the constraints [20]. The performance of the entire system can be improved
by properly scheduling the workflows with the help of scheduling algorithms.
The WfMC (Workflow Management Coalition) defined workflow as “The automation
of a business process, in whole or part, during which documents, information
or tasks are passed from one participant to another for action, according to a set of
procedural rules” [21]. The components of the workflow reference model are
represented in Fig. 4.3.
Workflow Engine
The workflow engine will provide a runtime environment to create, manage, and
execute the workflow instances.
Process Definition
The processes are defined in such a way that it facilitates the automated
manipulation.
Workflow Interoperability
Interoperability is provided between the different kinds of workflow systems.
Invoked Application
It helps in the communication between the different kinds of IT applications.
Workflow Client Application
It supports interaction with the users through a user interface.
Administration and Monitoring
It helps in coordinating the composite workflow application environment.
Scheduling can be done in different layers of the service stack. Because the cloud
computing architecture consists of IaaS, PaaS, and SaaS stacks, the scheduling problem
is classified according to these stacks as scheduling in the application (software) layer,
scheduling in the virtualization (platform) layer, and scheduling in the deployment
(infrastructure) layer [23]. Figure 4.4 represents the taxonomy of cloud resource
scheduling.
• Scheduling in the application layer schedules the virtual or physical resources to
support software, user applications, tasks, and workflows with optimal QoS and
efficiency.
• Scheduling in the virtualization layer maps virtual resources to physical resources
with load balance, energy efficiency, budget, and deadline constraints.
• Scheduling in the deployment layer is concerned with the infrastructure, service
placement, multi-cloud centers, outsourcing, partnering, data routing, and appli-
cation migration.
Best effort scheduling tries to optimize one objective while ignoring other factors such
as various QoS requirements. Pandey et al. [11, 24] presented a particle swarm
optimization (PSO)-based heuristic to schedule workflows to cloud resources that aims
to minimize the cost. Both the execution cost and the data transmission cost are taken
into account. Simulation is performed in a cloud environment with the help of Amazon
EC2. To assess algorithm performance, PSO is compared with best resource selection
(BRS), and PSO achieves three times the cost savings of BRS.
Bittencourt et al. [12, 25] presented the Hybrid Cloud Optimized Cost (HCOC)
scheduling algorithm for scheduling workflows in a hybrid environment with an aim to
optimize cost. It decides which resources should be leased from the public cloud and
aggregated to the private cloud to provide sufficient processing power to execute a
workflow within a given execution time.
Garg et al. [8, 26] proposed adaptive workflow scheduling (AWS) with an aim
to minimize the makespan. It also considers the resource availability changes and
the impact of existing loads over grid resources. The scheduling policy changes
dynamically as per the previous and current behavior of the system. Simulation is
performed in a grid environment with the help of the GridSim toolkit. To verify the
correctness of the algorithm, it is compared with popular heuristics such as HEFT,
Min-Min, Max-Min, AHEFT, Re-DCP-G, and Re-LSDS, and the result shows
that AWS performed 10–40% better than the other scheduling algorithms
considered.
Luo et al. [11, 27] presented a deadline guarantee enhanced scheduling algorithm
(DGESA) for deadline-constrained scientific workflows in a hybrid grid and cloud
environment that targets the deadline guarantee with the objective to minimize the
makespan. The simulation environment is hybrid in nature with three grid sites and
four cloud services. For the result analysis, DGESA is compared with HCOC and
the Aneka algorithm, and the result shows that DGESA is very efficient in deadline
guarantee and that its makespan is lower than that of the other two algorithms.
Verma et al. [13, 28] proposed BDHEFT, which considers budget and deadline
constraints while scheduling workflows with an aim to optimize the makespan and cost.
Simulation is performed in a cloud environment with the help of CloudSim. To evaluate
the algorithm, the authors compared BDHEFT with HEFT, and the result shows that
BDHEFT outperforms HEFT in terms of monetary cost and produces a better
makespan.
Udomkasemsub et al. [14, 29] proposed a workflow scheduling framework using
Artificial Bee Colony (ABC) by considering multiple objectives such as makespan
and cost. A Pareto analysis concept is applied to balance the solution quality according
to both objectives. The experiment is performed in a cloud environment with a Java
implementation. ABC outperforms HEFT/LOSS in both single-objective optimization
with constraints and multiple-objective optimization for structured workflows.
Wu et al. [15, 30] proposed Revised Discrete Particle Swarm Optimization
(RDPSO) to schedule workflow applications in the cloud. It takes both the data
transmission cost and the computation cost into account and aims to minimize the
makespan and cost of the workflow application. Experiments are performed in Amazon
Elastic Compute Cloud. Experimental results show that the proposed RDPSO algorithm
can achieve much more cost savings and better performance on makespan and
cost optimization.
Ke et al. [16, 31] presented the compromised-time-cost (CTC) scheduling algorithm
with an aim to minimize the makespan and cost of the workflows. Simulation is
performed on the SwinDeW-C platform. For the performance evaluation, CTC is
compared with the Deadline-MDP algorithm, which shows that CTC performs better,
with improvements of 15% in cost and 20% in execution cost.
Arabnejad et al. [17, 32] proposed a Heterogeneous Budget-Constrained
Scheduling (HBCS) algorithm that guarantees an execution cost within the user’s
specified budget and minimizes the makespan. The experiment is performed in
GridSim toolkit, and for the evaluation of algorithm performance, it is compared
with LOSS, GreedyTime-CD, and BHEFT. The result shows that HBCS algorithm
achieves lower makespans, with a guaranteed cost per application and with a lower
time complexity than other budget-constrained state-of-the-art algorithms.
Singh et al. [18, 33] proposed a score-based deadline constrained workflow
scheduling algorithm, which considers the deadline as the main constraint and
minimizes the cost while meeting user-defined deadline constraints. Simulation is
performed in a cloud environment with the help of the CloudSim toolkit. For
performance analysis, the algorithm is compared with the same algorithm without the
score, and the result shows that the score-based algorithm performs better.
Verma et al. [19, 34] proposed Deadline and Budget Distribution-based Cost-
Time Optimization (DBD-CTO) with an aim to minimize cost and time by consid-
ering the budget.
Malawski et al. [20, 35] addressed the problem of managing workflow ensem-
bles with a goal to maximize the completion of user-prioritized workflows in a
4.7 Existing Workflow Scheduling Algorithms 75
Main objectives considered are cost, makespan, and CPU idle time. Experiments
are performed in a MATLAB tool, and the comparison is made with the existing
multi-objective particle swarm optimization (MOPSO); the result shows that multi-
objective cat swarm optimization performance is better than MOPSO.
Kumar et al. [28, 43] proposed a priority-based decisive (PBD) algorithm together
with the De-De dodging algorithm to schedule multiple workflows with an aim to
maximize resource utilization and reduce cost and makespan. CloudSim was
used for the simulation, and the proposed algorithms are compared with the Time and
Cost Optimization for Hybrid Clouds (TCHC) algorithm. Both De-De and PBD perform
better than TCHC, and the De-De algorithm performs better in terms of CPU time,
makespan, and cost.
Yassa et al. [29, 44] proposed Dynamic Voltage and Frequency Scaling (DVFS)
Multi-Objective Discrete Particle Swarm Optimization (DVFS-MODPSO) to minimize
makespan, cost, and energy. The Dynamic Voltage and Frequency Scaling technique
is used to reduce energy consumption. The proposed algorithm is compared
with HEFT, and the result shows that DVFS-MODPSO performs better and produces
an optimal solution.
Li et al. [30, 45] presented security and cost-aware scheduling (SCAS) with an
aim to minimize the cost by considering deadline and risk rate constraints. They
deployed security services such as an authentication service, an integrity service, and
a confidentiality service to protect the workflow applications from common security
attacks. CloudSim is used for the simulation, and the experiment is conducted with
scientific workflows.
Jayadivya et al. [31, 46] proposed a QWS algorithm with an aim to achieve mini-
mum makespan and cost and maximum reliability. For the performance evaluation,
proposed QWS algorithm is compared with MQMW algorithm, and the result
shows that the success rate of the QWS algorithm is better than MQMW.
The detailed comparison of cloud scheduling algorithms is depicted in Tables 4.1
and 4.2. The algorithms are classified based on best effort scheduling and bi-
objective and multi-objective scheduling. The type of workflow and environment in
which these algorithms are experimented or simulated is compared.
From the comparison it is clear that most scheduling algorithms consider the
objectives of makespan, deadline, and budget, and very few works address the other
objectives. These other objectives also need attention to make workflow scheduling
efficient and effective. Rahman et al. [32, 47] presented a dynamic critical
path-based workflow scheduling algorithm for the grid (DCP-G) that provides an
efficient schedule in a static environment. It adapts to a dynamic grid environment
where resource information is updated after a fixed interval, and rescheduling
(Re-DCP-G) is done if necessary. Olteanu et al. [33, 48] proposed a generic
rescheduling algorithm that is useful for large-scale distributed systems to support
fault tolerance and resilience. As energy consumption is becoming an important
factor, it also becomes an issue for scheduling workflows. Two types of solution are
available to reduce energy consumption [1, 34]: improving resource utilization and
dynamic voltage scaling (DVS).
Reduction of energy consumption is done with the help of improving the resource
Algorithm                  Makespan  Deadline  Budget  Security  Energy  Elasticity  Reliability  Resource utilization
Best effort workflow scheduling
Pandey et al. [24]         –         –         ✓       –         –       –           –            –
Bittencourt et al. [25]    –         –         ✓       –         –       –           –            –
Garg et al. [26]           ✓         –         –       –         –       –           –            –
Luo et al. [27]            ✓         –         –       –         –       –           –            –
Bi-objective workflow scheduling
Verma et al. [28]          ✓         –         ✓       –         –       –           –            –
Udomkasemsub et al. [29]   ✓         –         ✓       –         –       –           –            –
Wu et al. [30]             ✓         –         ✓       –         –       –           –            –
Ke et al. [31]             ✓         –         ✓       –         –       –           –            –
Arabnejad et al. [32]      ✓         –         ✓       –         –       –           –            –
Singh et al. [33]          –         ✓         ✓       –         –       –           –            –
Verma et al. [34]          –         ✓         ✓       –         –       –           –            –
Malawski et al. [35]       –         ✓         ✓       –         –       –           –            –
Xu et al. [36]             ✓         –         ✓       –         –       –           –            –
Bessai et al. [37]         –         ✓         ✓       –         –       –           –            –
Chopra et al. [38]         –         ✓         ✓       –         –       –           –            –
Verma et al. [39]          –         ✓         ✓       –         –       –           –            –
Lin et al. [40]            ✓         –         –       –         –       ✓           –            –
Poola et al. [41]          –         ✓         ✓       –         –       –           –            –
Multi-objective workflow scheduling
Bilgaiyan et al. [42]      ✓         –         ✓       –         –       –           –            ✓
Kumar et al. [43]          ✓         –         ✓       –         –       –           –            ✓
Yassa et al. [44]          ✓         –         ✓       –         ✓       –           –            –
Li et al. [45]             –         ✓         ✓       ✓         –       –           –            –
Jayadivya et al. [46]      ✓         –         ✓       –         –       –           ✓            –
4.8 Issues of Scheduling Workflow in Cloud 79
utilization [35–37, 49–51]. Venkatachalam et al. [38, 52] investigated and developed
various techniques such as DVS, memory optimizations, and resource
hibernation. DVS has proven to be an efficient technique for energy savings [39, 40,
53, 54].
For the workflow scheduling problem in the cloud, the best method is to generate a
schedule using a makespan best effort algorithm and then adopt the DVS technique to
tune the slack time of the generated schedule [41, 42, 55, 56]. The removal of data
files that are no longer needed is a best practice toward an efficient mapping and
execution of workflows, since it minimizes their overall storage requirement. Bryk
et al. [43] proposed and implemented a simulation model for handling file
transfers dynamically between the tasks with configurable replications. They also
proposed the Dynamic Provisioning Locality-Aware Scheduling (DPLS) and
Storage- and Workflow-Aware DPLS (SWA-DPLS) algorithms. Wang et al. [44, 57]
presented WaFS, a user-level workflow-aware file system with a proposed hybrid
scheduling (HS) algorithm for scientific workflow computation in the cloud.
The workflow scheduler uses WaFS data to make effective cost-performance trade-offs
or improve storage utilization.
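The DVS step mentioned at the start of this paragraph, tuning the slack time of an already generated schedule, can be sketched as follows. The sketch rests on a common simplifying assumption that is not taken from the cited works: supply voltage scales linearly with frequency, so dynamic power grows with the cube of the frequency and the energy of a fixed amount of work grows with its square. The function name and parameters are hypothetical.

```python
def scale_for_slack(exec_time, slack, f_max, f_min):
    """Stretch a task into its slack by lowering the CPU frequency.

    exec_time: execution time at f_max
    slack: idle time after the task in the original schedule
    Returns (frequency, new execution time, energy relative to f_max).
    """
    # Lowest frequency that still finishes within exec_time + slack,
    # bounded below by the minimum frequency the processor supports.
    f = max(f_min, f_max * exec_time / (exec_time + slack))
    new_time = exec_time * f_max / f
    # Assumed model: energy for the same cycle count is proportional to f^2.
    energy_ratio = (f / f_max) ** 2
    return f, new_time, energy_ratio
```

Under this model, a task that runs for 4 time units at 2.0 GHz and is followed by 4 units of slack can run at 1.0 GHz for 8 units, using one quarter of the dynamic energy without delaying its successors.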
The combination of IaaS and PaaS clouds forms a complete workflow scheduling
architecture, and it introduces new challenges [34]:
• Scheduling work for grids and clusters has focused on meeting deadlines or
minimizing makespan without considering the cost. New scheduling algorithms need
to be developed for the cloud by considering the pay-as-you-go model in order
to avoid unnecessary cost.
• Adaptive nature: Cloud is a dynamic environment with a variation of perfor-
mance during the workflow execution; therefore an adaptive scheduling solution
is required.
• Migration: Migration of jobs for load balancing also affects the performance of
workflow scheduling.
• Virtual machine replacement: Workflow scheduling algorithm should efficiently
replace the VM in the case of failure.
• There are two main stages in a cloud environment prior to the execution of a
workflow: resource provisioning and task-resource mapping. Grid and cluster
environments focus only on the task-resource mapping stage since they are static
environments whose configurations are known in advance. A major issue of
resource provisioning is to determine the amount and type of resources that a
workflow application would request, which affects the cost and makespan of a
workflow.
• Virtual machines offered by current cloud infrastructure do not exhibit stable
performance [45]. This has a significant impact on a scheduling policy. VM boot
time and stop time are other important factors that should be considered for
cloud scheduling [46].
• Integrated architecture: The first challenge is to integrate the workflow manage-
ment system with the cloud. A workflow engine should be designed with strong
functionality to deal with large-scale tasks.
• Large-scale data management and workflow scheduling: Scientific workflow
applications are more data intensive, and data resource management and data
transfer between the storage and computing resources are the main bottleneck. It
is very important to find an efficient way to manage data needed by the
workflows.
• Service composition and orchestration: To accommodate the further growing and
complex workflow application needs, services of several cloud providers should
be composed to deliver uniform QoS as a single request. To achieve the con-
sumer requirements, this composition and orchestration of services should be
carried out in an automated and dynamic manner. To find an efficient composi-
tion solution in Intercloud is a major challenge as it involves service composition
and orchestration optimization under the constraint of cost and deadline.
4.9 Conclusion
References
25. Bittencourt LF, Madeira ERM HCOC: a cost optimization algorithm for workflow scheduling
in hybrid clouds. Int J Internet Serv Appl 2:207–227. doi:10.1007/s13174-011-0032-0
26. Garg R, Singh AK (2015) Adaptive workflow scheduling in grid computing based on dynamic
resource availability. Eng Sci Technol An Int J 18(2):256–269. http://dx.doi.org/10.1016/j.
jestch.2015.01.001
27. Luo H, Yan C, Hu Z (2015) An enhanced workflow scheduling strategy for deadline
guarantee on hybrid grid/cloud infrastructure. J Appl Sci Eng 18(1):67–78.
doi:10.6180/jase.2015.18.1.09
28. Verma A, Kaushal S (2015) Cost-time efficient scheduling plan for executing workflows in the
cloud. J Grid Comput 13(4):495–506. doi:10.1007/s10723-015-9344-9
29. Udomkasemsub O, Xiaorong L, Achalakul T (2012) A multiple-objective workflow schedul-
ing framework for cloud data analytics. In: Computer Science and Software Engineering
(JCSSE), International Joint Conference, pp 391–398. doi:10.1109/JCSSE.2012.6261985
30. Wu Z, Ni Z, Gu L, Liu X A revised discrete particle swarm optimization for cloud workflow
scheduling. In: Computational Intelligence and Security (CIS), 2010 International conference
on 11–14 Dec 2010, pp 184–188. doi:10.1109/CIS.2010.46
31. Ke L, Hai J, Jinjun CXL, Dong Y (2010) A compromised time-cost scheduling algorithm in
SwinDeW-C for instance-intensive cost-constrained workflows on cloud computing platform.
J High-Perform Comput Appl 24(4):445–456
32. Arabnejad H, Barbosa JG (2014) A budget constrained scheduling algorithm for workflow
applications. J Grid Comput 12(4):665–679. http://dx.doi.org/10.1007/s10723-014-9294-7
33. Singh R, Singh S (2013) Score based deadline constrained workflow scheduling algorithm for
cloud systems. Int J Cloud Comput Serv Archit 3(6). doi:10.5121/IJCCSA.2013.3603
34. Verma A, Kaushal S, Deadline and budget distribution based cost- time optimization workflow
scheduling algorithm for cloud. In: IJCA Proceedings on International Conference on Recent
Advances and Future Trends in Information Technology (iRAFIT 2012) iRAFIT(7):1–4
35. Malawski M, Juve G, Deelman E, Nabrzyski J, Cost- and deadline-constrained provisioning
for scientific workflow ensembles in IaaS clouds. In: Proceedings of the international confer-
ence on high-performance computing, networking, storage and analysis (SC ’12). IEEE
Computer Society Press, Los Alamitos, CA, USA, Article 22
36. Xu M, Cui L, Wang H, Bi Y (2009) A multiple QoS constrained scheduling strategy of multi-
ple workflows for cloud computing. In: Parallel and distributed processing with applications,
2009 IEEE International Symposium on, Chengdu, pp 629–634. doi: 10.1109/ISPA.2009.95
37. Bessai K, Youcef S, Oulamara A, Godart C, Nurcan S (2012) Bi-criteria workflow tasks alloca-
tion and scheduling in cloud computing environments. In: Cloud Computing (CLOUD), IEEE
5th International Conference, pp 638–645. doi: 10.1109/CLOUD.2012.83
38. Chopra N, Singh S (2013) HEFT based workflow scheduling algorithm for cost optimization
within deadline in hybrid clouds. In: Computing, Communications and Networking
Technologies (ICCCNT), 2013 Fourth International Conference, pp 1–6. doi: 10.1109/
ICCCNT.2013.6726627
39. Verma A, Kaushal S (2015) Cost minimized PSO based workflow scheduling plan for cloud
computing. IJITCS 7(8):37–43. doi:10.5815/ijitcs.2015.08.06
40. Lin C, Lu S (2011) Scheduling scientific workflows elastically for cloud computing. In: Cloud
computing 2011 IEEE, international conference, Washington, DC, pp 746–747. doi: 10.1109/
CLOUD.2011.110
41. Poola D, Ramamohanarao K, Buyya R (2014) Fault-tolerant workflow scheduling using spot
instances on clouds. Proc Comput Sci 29:523–533
42. Bilgaiyan S, Sagnika S, Das M (2014) A multi-objective cat swarm optimization algorithm for
workflow scheduling in cloud computing environment. Intell Comput Commun Devices
308:73–84. http://dx.doi.org/10.1007/978-81-322-2012-1_9
43. Kumar B, Ravichandran T Scheduling multiple workflow using De-De Dodging Algorithm
and PBD Algorithm in cloud: detailed study. Int J Comput Electr Autom Control and Inf Eng
9(4):917–922
44. Yassa S, Chelouah R, Kadima H, Granado B (2013) Multi-objective approach for energy-
aware workflow scheduling in cloud computing environments. Sci World J:Article ID 350934.
doi:10.1155/2013/350934
45. Li Z, Ge J, Yang H, Huang L, Hu H, Hu H, Luo B (2016) A security and cost aware scheduling
algorithm for heterogeneous tasks of scientific workflow in clouds. J Futur Gener Comput
Syst. http://dx.doi.org/10.1016/j.future.2015.12.014
46. Jayadivya S, Bhanu SMS Qos based scheduling of workflows in cloud computing. Int J Comput
Sci Electr Eng 2012:2315–4209
47. Rahman M, Hassan R, Ranjan R, Buyya R (2013) Adaptive workflow scheduling for dynamic
grid and cloud computing environment. Concurr Comput Pract Exp 25(13):1816–1842.
doi:10.1002/cpe.3003
48. Olteanu A, Pop F, Dobre C, Cristea V (2012) A dynamic rescheduling algorithm for resource
management in large-scale dependable distributed systems. Comput Math Appl 63(9):1409–
1423. http://dx.doi.org/10.1016/j.camwa.2012.02.066
49. Beloglazov A, Abawajy J, Buyya R Energy-aware resource allocation heuristics for efficient
management of data centers for cloud computing. Futur Gener Comput Syst 28(5):755–768.
http://dx.doi.org/10.1016/j.future.2011.04.017
50. Beloglazov A, Buyya R (2012) Optimal online deterministic algorithms and adaptive heuris-
tics for energy and performance efficient dynamic consolidation of virtual machines in Cloud
data centers. Concurr Comput Pract Exp 24(13):1397–1420. http://dx.doi.org/10.1002/
cpe.1867
51. Meng X, Pappas V, Zhang L (2010) Improving the scalability of data center networks with
traffic-aware virtual machine placement. In: Proceedings of the 29th conference on informa-
tion communications. IEEE Press, Piscataway, NJ, USA, pp 1154–1162
52. Sonmez O, Yigitbasi N, Abrishami S, Iosup A, Epema D (2010) Performance analysis of
dynamic workflow scheduling in multicluster grids. In: Proceedings of the 19th ACM interna-
tional symposium on high performance distributed computing, ACM, pp 49–60
53. Ge R, Feng X, Cameron KW, Performance-constrained distributed DVS scheduling for scien-
tific applications on power-aware clusters. In: Supercomputing, 2005. Proceedings of the
ACM/IEEE SC 2005 Conference, pp 34–34, 12–18 Nov 2005. doi: 10.1109/SC.2005.57
54. Rountree B, Lowenthal DK, Funk S, Freeh VW, de Supinski BR, Schulz M Bounding energy
consumption in large-scale MPI programs. In: Supercomputing, 2007. SC ‘07. Proceedings of
the 2007 ACM/IEEE Conference, pp 1–9, 10–16, Nov 2007. doi: 10.1145/1362622.1362688
55. Baskiyar S, Abdel-Kader R (2010) Energy-aware DAG scheduling on heterogeneous systems.
Clust Comput 13(4):373–383. http://dx.doi.org/10.1007/s10586-009-0119-6
56. Cao F, Zhu MM, Wu CQ, Energy-efficient resource management for scientific workflows in
clouds. In: Services, 2014 IEEE world congress, pp 402–409, July 2014 doi: 10.1109/
SERVICES.2014.76
57. Wang Y, Lu P, Kent KB (2015) WaFS: a workflow-aware file system for effective storage utili-
zation in the cloud. Comput IEEE Trans 64(9):2716–2729. doi:10.1109/TC.2014.2375195
Chapter 5
Workflow Modeling and Simulation
Techniques
5.1 Introduction
System Layer
The massive resources (RAM, storage, CPU, network) power the IaaS in the cloud
computing environment. The resources are dynamically provisioned in the data cen-
ters with the help of virtualization techniques. The virtualized servers play a major
role in this layer with necessary security and fault tolerance.
Core Middleware
The virtualized resource deployment and management services are the important
services in this layer. The Platform as a Service (PaaS) is supported by the core
middleware. It provides the following functionalities:
• Dynamic service-level agreement (SLA) management
• Accounting and metering services
• Execution monitoring and management
• Service discovery
• Load balancing
User-Level Middleware
This layer supports a cost-effective user interface framework, a programming
environment, and composition tools for the creation, deployment, and execution of
cloud applications.
Cloud Application
The various applications provided by the cloud service providers are supported in
this cloud application layer as Software as a Service (SaaS) for the end user.
5.3 Layered Design and Implementation of CloudSim Framework
Figure 5.2 demonstrates the layered architecture of the CloudSim framework. The
first layer in CloudSim is the “user code”, which creates multiple users along with
their applications, VMs, and other entities available to the hosts. The next layer is
CloudSim, which is made up of many fundamental classes written in Java.
User Code
The user code is the first layer of simulation stack which expresses the configuration-
related operations for hosts, applications, VMs, number of users, and their applica-
tion types and broker scheduling policies.
CloudSim
The second layer, CloudSim, consists of the following classes, which are the
building blocks of the simulator.
The Data Center
• The data center class models the core infrastructure-level services (hardware and
software) offered by resource providers in a cloud computing environment [5].
• It encapsulates a group of compute hosts, which may be either homogeneous or
heterogeneous with regard to their resource properties (memory, cores, capacity,
and storage).
88 5 Workflow Modeling and Simulation Techniques
[Figure: layered CloudSim architecture, with the labels User Code; Simulation
Requirements and Specifications; VM Services; Cloudlet Execution and VM Management;
Cloud Services and Cloud Resource Provisioning; CPU, Network, Memory, Storage, and
Bandwidth allocation and monitoring]
Cloud Coordinator
This abstract class enables the association capability to a data center. Further, it is
responsible for communication among intercloud coordinators and between the cloud
coordinator and the broker. Besides, it captures the inner state of a data center, which
plays a vital role in load balancing and application scaling.
BW Provisioner
BW provisioner is an abstract class for representing the bandwidth provisioning
policy to VMs which are installed on a host entity.
Memory Provisioner
Memory provisioner is an abstract class for representing the memory provisioning
policy to VMs. The VM is allowed to deploy on a host if and only if the memory
provisioner identifies that the host has sufficient quantum of free memory as
demanded by the new VM.
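The admission rule described above (a VM is deployed only if the host has enough free memory) can be illustrated with a small sketch. It is written in Python for brevity and does not reproduce CloudSim's Java classes; the class and method names below are hypothetical.

```python
class SimpleMemoryProvisioner:
    """Tracks the RAM of one host and admits VMs only when enough is free."""

    def __init__(self, total_ram_mb):
        self.total = total_ram_mb
        self.allocated = {}  # vm_id -> RAM in MB

    @property
    def available(self):
        return self.total - sum(self.allocated.values())

    def is_suitable(self, requested_mb):
        # The host is suitable if and only if the request fits in free memory.
        return requested_mb <= self.available

    def allocate(self, vm_id, requested_mb):
        # Reject duplicate VM ids and requests that do not fit.
        if vm_id in self.allocated or not self.is_suitable(requested_mb):
            return False
        self.allocated[vm_id] = requested_mb
        return True

    def deallocate(self, vm_id):
        # Releasing an unknown VM is a no-op.
        self.allocated.pop(vm_id, None)
```

A bandwidth provisioner could follow the same pattern with bandwidth in place of memory.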
VM Provisioner
VM provisioner is an abstract class to define an allocation policy which is used by
VM monitor to assign VMs to hosts.
VM Allocation Policy
VMM allocation policy is an abstract class which is realized by a host component
that defines the policies required for allocating processing power to VMs [9].
Workflow Management System (WMS)
WMS is used for the management of the workflow tasks on the computing resources.
The major components of the WMS are shown in Fig. 5.3.
Fig. 5.3 Workflow
structure
The workflow structure describes the relationship between the tasks of the workflow.
It is mainly of two types: DAG (Directed Acyclic Graph) and non-DAG. In the DAG
scheme, it is further categorized into sequence, parallelism, and choice.
Workflow Scheduling
Workflow scheduling maps workflow tasks onto shared resources and manages their
execution. The elements of workflow scheduling are shown in Fig. 5.4.
Further subclassifications:
• Architecture:
–– Centralized
–– Hierarchical
–– Decentralized
• Decision-making:
–– Local
–– Global
• Planning scheme:
–– Static: user-directed, simulation-based
–– Dynamic: prediction-based, just in time
• Scheduling strategies:
–– Performance-driven
–– Market-driven
–– Trust-driven
5.4 Experimental Results Using CloudSim 91
Example 1: This example demonstrates the creation of two data centers with
one host each and the execution of two cloudlets on them.
1. Initialize the CloudSim package.
2. Initialize the GridSim library.
3. Create the data centers. In CloudSim, resources are provisioned from data
centers, so at least one data center is required to run a simulation.
4. Create the broker.
5. Create the VMs by specifying their characteristics, such as MIPS, size, RAM,
and bandwidth.
6. Add the VMs to the VM list.
7. Submit the VM list to the broker.
8. Create the cloudlets by specifying their length in million instructions and their
input and output file sizes.
9. Add the cloudlets to the cloudlet list.
10. Start the simulation.
11. Print the simulation results.
In this example, the VM allocation policy in use is SpaceShared. It means that
only one VM is allowed to run on each Pe (processing element). As each host
has only one Pe, only one VM can run on each host.
Output Results
Starting CloudSimExample4...
Initialising...
Starting CloudSim version 3.0
Datacenter_0 is starting...
Datacenter_1 is starting...
Broker is starting...
Entities started.
0.0: Broker: Cloud Resource List received with 2 resource(s)
0.0: Broker: Trying to Create VM #0 in Datacenter_0
0.0: Broker: Trying to Create VM #1 in Datacenter_0
[VmScheduler.vmCreate] Allocation of VM #1 to Host #0 failed by MIPS
0.1: Broker: VM #0 has been created in Datacenter #2, Host #0
0.1: Broker: Creation of VM #1 failed in Datacenter #2
0.1: Broker: Trying to Create VM #1 in Datacenter_1
0.2: Broker: VM #1 has been created in Datacenter #3, Host #0
0.2: Broker: Sending cloudlet 0 to VM #0
0.2: Broker: Sending cloudlet 1 to VM #1
160.2: Broker: Cloudlet 0 received
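The placement behavior visible in the log above (VM #1 is rejected by the first data center and then created in the second) can be mimicked with a small sketch. It assumes a simplified space-shared rule in which a processing element is never shared between VMs; the function `place_vms` and the dictionary layout are hypothetical and do not reproduce CloudSim's classes or its MIPS accounting.

```python
def place_vms(datacenters, vms):
    """Try each data center in order; a host accepts a VM only if it still has
    enough free processing elements (space-shared: a PE is never shared)."""
    placement = {}
    for vm_id, pes_needed in vms:
        for dc_name, hosts in datacenters.items():
            host = next((h for h in hosts if h["free_pes"] >= pes_needed), None)
            if host is not None:
                host["free_pes"] -= pes_needed
                placement[vm_id] = (dc_name, host["id"])
                break
        else:
            # No data center could host the VM.
            placement[vm_id] = None
    return placement


# Two data centers with one single-PE host each, as in Example 1.
datacenters = {
    "Datacenter_0": [{"id": 0, "free_pes": 1}],
    "Datacenter_1": [{"id": 0, "free_pes": 1}],
}
placement = place_vms(datacenters, [(0, 1), (1, 1)])
```

With these inputs, VM 0 takes the only PE in the first data center, so VM 1 falls through to the second one.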
Example 3: Here 20 VMs are requested and 11 VMs are created successfully.
Forty cloudlets are sent for execution. This output is for SpaceShared Policy.
The total execution time of all cloudlets is 4.2.
Cloudlet Parameters Assigned
VM Parameters Assigned
The execution continues for all the cloudlets with the necessary VM allocation,
and the simulation completes successfully.
5.6 Architecture of WorkflowSim 95
5.5 WorkflowSim
Scientific workflows [10] are generally used to represent complex distributed
scientific computations. They comprise a huge number of tasks, and executing
those tasks demands several complex modules and software packages. Validating
the performance of workflow optimization techniques on real infrastructures is
difficult and time-consuming. Consequently, simulation has evolved as one of the
most popular approaches for validating scientific workflows. It reduces the
complexity of the experimental setup and saves much effort in workflow execution
by enabling applications to be examined in a repeatable and controlled
environment [10].
Existing workflow simulators do not provide a framework that considers system
overheads and failures. Further, they lack support for widely used workflow
optimization approaches such as task clustering [10]. WorkflowSim is an open-source
workflow simulator that extends CloudSim by providing workflow-level support for
simulation [11]. CloudSim supports only the execution of a single workload,
whereas WorkflowSim focuses on workflow scheduling as well as execution.
CloudSim has a basic model of task execution that considers neither the
dependencies among tasks nor clustering. It also ignores the occurrence of
failures and overheads [10].
WorkflowSim models workflows as DAGs and provides an elaborate model of node
failures as well as a model of the delays occurring at the various levels of the
workflow management system stack. It also includes implementations of several
popular dynamic and static workflow schedulers, such as HEFT and Min-Min, and
of task clustering algorithms, such as runtime-based, data-oriented, and
fault-tolerant clustering algorithms [11].
Figure 5.5 [10] depicts the architecture of the WorkflowSim. It clearly shows the
various components contained in WorkflowSim, and the area surrounded by red
lines is supported by CloudSim.
Workflow Mapper
Workflows are modeled as Directed Acyclic Graphs (DAGs), where nodes represent
the tasks to be executed on the computing resources and directed edges represent
communication or control-flow dependencies among the tasks [10]. The workflow
mapper imports XML-formatted DAG files and other metadata. Once the mapping is
completed, the workflow mapper produces a group of tasks and assigns them to an
execution site. A task is a program or an action that a user would like to
execute [11].
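The DAG model described above can be sketched as a mapping from each task to the set of its parent tasks, with a topological sort giving one valid execution order. The function name and the four-task diamond example are illustrative assumptions; this is not how WorkflowSim parses its XML DAG files.

```python
from collections import deque


def topological_order(parents):
    """Kahn's algorithm over a mapping task -> set of parent tasks.
    Every task, including entry tasks, must appear as a key."""
    remaining = {task: set(ps) for task, ps in parents.items()}
    children = {task: [] for task in parents}
    for task, ps in parents.items():
        for p in ps:
            children[p].append(task)

    ready = deque(sorted(t for t, ps in remaining.items() if not ps))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for child in sorted(children[task]):
            remaining[child].discard(task)
            if not remaining[child]:
                ready.append(child)

    if len(order) != len(parents):
        raise ValueError("graph contains a cycle, so it is not a DAG")
    return order


# Diamond-shaped workflow: A fans out to B and C, which join in D.
diamond = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}
order = topological_order(diamond)
```

The cycle check matters because a workflow with a circular dependency could never release all of its tasks.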
[Fig. 5.5 WorkflowSim architecture. Components shown: Workflow Planner, Clustering Engine, Workflow Engine, Workflow Scheduler, Workflow Datacenter, Cloud Information Service, Execution Site, Failure Generator, Failure Monitor]
Clustering Engine
A job is an atomic unit that groups multiple tasks to be performed in sequence or
in parallel. The clustering engine merges tasks into jobs so as to minimize the
scheduling overheads [11]. Two clustering techniques are available: horizontal
clustering and vertical clustering. Horizontal clustering merges tasks at the same
horizontal level of the source workflow graph, and vertical clustering merges
tasks in the same vertical pipeline [10]. The clustering engine [11] in
WorkflowSim additionally performs task reclustering in a faulty environment with
transient failures: the failed jobs returned from the workflow scheduler are
merged to create a new job.
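Horizontal clustering as described above can be sketched as follows. The sketch assumes that the level of a task is one more than the deepest level among its parents and that a user-chosen `tasks_per_job` caps the size of each job; both the function names and this parameter are hypothetical and not taken from WorkflowSim.

```python
def task_levels(parents):
    """Level is 1 for entry tasks, otherwise 1 + the deepest parent level."""
    levels = {}

    def level(task):
        if task not in levels:
            levels[task] = 1 + max((level(p) for p in parents[task]), default=0)
        return levels[task]

    for task in parents:
        level(task)
    return levels


def horizontal_clustering(parents, tasks_per_job):
    """Merge tasks of the same level into jobs of at most tasks_per_job tasks."""
    by_level = {}
    for task, lvl in sorted(task_levels(parents).items()):
        by_level.setdefault(lvl, []).append(task)

    jobs = []
    for lvl in sorted(by_level):
        group = by_level[lvl]
        for i in range(0, len(group), tasks_per_job):
            jobs.append(group[i:i + tasks_per_job])
    return jobs


workflow = {
    "split": set(),
    "a1": {"split"}, "a2": {"split"}, "a3": {"split"}, "a4": {"split"},
    "merge": {"a1", "a2", "a3", "a4"},
}
jobs = horizontal_clustering(workflow, tasks_per_job=2)
```

Here the four parallel tasks on level 2 collapse into two jobs, which halves the number of scheduling decisions for that level.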
Workflow Engine
The workflow engine [11] selects the jobs to be executed based on their parent-child
relationships. It controls job execution by considering these dependencies, ensuring
that a job is allowed to execute only when all of its parent jobs have completed
successfully. The workflow engine releases free jobs to the workflow scheduler.
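The release rule above (a job becomes free only when all of its parents have completed) fits in a few lines. The function `free_jobs` and its arguments are hypothetical names used for illustration only.

```python
def free_jobs(parents, completed, submitted):
    """Return the jobs whose parents have all completed and that have not
    been completed or released to the scheduler yet."""
    return sorted(
        job for job, ps in parents.items()
        if job not in completed and job not in submitted and ps <= completed
    )


parents = {"j1": set(), "j2": {"j1"}, "j3": {"j1"}, "j4": {"j2", "j3"}}
first_wave = free_jobs(parents, completed=set(), submitted=set())
second_wave = free_jobs(parents, completed={"j1"}, submitted=set())
blocked = free_jobs(parents, completed={"j1", "j2"}, submitted={"j3"})
```

In the last call `j4` stays blocked because `j3` has been released but has not finished yet.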
Workflow Scheduler
The workflow scheduler [11] defines the characteristics of the data center and
creates resources, such as virtual machines, in it. It matches jobs to appropriate
resources based on the criteria chosen by users.
Interaction Between the Components
To combine and coordinate these components, an event-based approach [10] is
adopted in which each component has its own message queue. Figure 5.6 [10]
showcases a minimal configuration that contains two data centers and two nodes in
each site. Here, each component manages its own message queue and periodically
checks whether it needs to process any message.
For instance, the clustering engine repeatedly checks whether it has received any
new tasks from the workflow engine and whether any new jobs are to be released to
the scheduler. When the message queues of all the components are empty, the
simulation is completed.
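The event-based interaction described above can be sketched with one message queue per component and a loop that ends when every queue is empty. The classes below are a simplified, hypothetical model with only two components; they are not WorkflowSim's entities or its event API.

```python
from collections import deque


class Component:
    """A simulation component that owns a message queue."""

    def __init__(self, name):
        self.name = name
        self.inbox = deque()
        self.log = []

    def send(self, target, message):
        target.inbox.append(message)

    def handle(self, message):
        raise NotImplementedError

    def poll(self):
        """Process at most one pending message; report whether work was done."""
        if not self.inbox:
            return False
        self.handle(self.inbox.popleft())
        return True


class ClusteringEngine(Component):
    def __init__(self, scheduler):
        super().__init__("clustering engine")
        self.scheduler = scheduler

    def handle(self, tasks):
        # Wrap the received tasks into one job and release it to the scheduler.
        self.log.append(f"clustered {tasks}")
        self.send(self.scheduler, {"job": tasks})


class Scheduler(Component):
    def __init__(self):
        super().__init__("scheduler")

    def handle(self, message):
        self.log.append(f"scheduled {message['job']}")


def run(components):
    """Poll every component in turn until all message queues are empty."""
    rounds = 0
    while any(c.inbox for c in components):
        for c in components:
            c.poll()
        rounds += 1
    return rounds


scheduler = Scheduler()
engine = ClusteringEngine(scheduler)
engine.inbox.append(["t1", "t2"])
engine.inbox.append(["t3"])
rounds = run([engine, scheduler])
```

The termination condition mirrors the text: the loop stops only once no component has a pending message.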
Layered Failures and Job Retry
Failures may occur at many points during the execution of workflows. They are
broadly classified into two types: task failure and job failure. A task failure [10]
occurs when a transient failure affects the computation of a single task; the other
tasks in the same job do not necessarily fail. A job failure [10] occurs when a
transient failure affects a clustered job, so that all the tasks in the job
eventually fail. Two components are added to simulate failures:
The Failure Generator [10] injects task or job failures at each execution site.
After each job execution, it randomly generates task/job failures based on the
distribution and average failure rate specified by the user.
The Failure Monitor [10] collects failure records such as resource id, job id, and
task id. These records are then returned to the workflow management system so
that it can adjust the scheduling strategies dynamically.
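The two failure components can be sketched together. The sketch assumes the simplest possible distribution, in which every task fails independently with a fixed probability; the names `generate_failures` and `FailureMonitor` are hypothetical, and WorkflowSim's own distributions and record format may differ.

```python
import random


class FailureMonitor:
    """Collects failure records (resource id, job id, task id)."""

    def __init__(self):
        self.records = []

    def record(self, resource_id, job_id, task_id):
        self.records.append({"resource": resource_id, "job": job_id, "task": task_id})

    def failure_rate(self, total_tasks):
        # Observed fraction of failed tasks, usable for adjusting strategies.
        return len(self.records) / total_tasks if total_tasks else 0.0


def generate_failures(job_id, task_ids, resource_id, failure_rate, rng, monitor):
    """Mark each task as failed with probability failure_rate and log it."""
    failed = [t for t in task_ids if rng.random() < failure_rate]
    for t in failed:
        monitor.record(resource_id, job_id, t)
    return failed


rng = random.Random(42)  # fixed seed keeps the simulation repeatable
monitor = FailureMonitor()
tasks = [f"t{i}" for i in range(100)]
failed = generate_failures("job0", tasks, "vm0", 0.2, rng, monitor)
```

Seeding the random generator is a deliberate choice here, since repeatability is one of the stated reasons for simulating rather than running on real infrastructure.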
Fault-Tolerant Optimization
There are several ways to improve performance in a failure-prone environment, and
two prominent ones are included in WorkflowSim. The first method [10] retries
either the failed part of a job or the entire job. The workflow scheduler takes
care of this functionality: it checks the status of a job and takes action based on
the user-selected strategy. The second method [4] is reclustering, a technique in
which the task clustering strategy is adjusted based on the detected failure rate.
The workflow engine takes care of this functionality.
Results
Example 3: Min-Min
The Min-Min [10] heuristic begins by sorting the free jobs in order of completion
time. The job with the minimum completion time is selected from the sorted list
and allocated to the corresponding resource. The next job is then submitted to
the queue, and the process repeats until all the jobs in the list are scheduled.
The main aim of the Min-Min algorithm is to make locally optimal choices such that
the overall runtime is reduced.
Pseudocode
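A minimal sketch of the Min-Min heuristic follows, written in Python rather than as the book's own pseudocode. It assumes that an estimated runtime is known for every job on every resource and that all jobs are free (no dependencies); the input layout and the name `min_min` are illustrative assumptions.

```python
def min_min(runtimes):
    """runtimes[job][resource] = estimated execution time of job on resource.
    Returns the schedule as (job, resource, finish_time) and the makespan."""
    resources = sorted({r for times in runtimes.values() for r in times})
    ready_time = {r: 0.0 for r in resources}
    unscheduled = set(runtimes)
    schedule = []

    while unscheduled:
        # For every unscheduled job, find the resource with the earliest completion.
        best = {}
        for job in sorted(unscheduled):
            resource = min(
                runtimes[job],
                key=lambda r: (ready_time[r] + runtimes[job][r], r),
            )
            best[job] = (ready_time[resource] + runtimes[job][resource], resource)

        # Among those, pick the job whose best completion time is the smallest.
        job = min(best, key=lambda j: (best[j][0], j))
        finish, resource = best[job]
        schedule.append((job, resource, finish))
        ready_time[resource] = finish
        unscheduled.remove(job)

    return schedule, max(ready_time.values())


runtimes = {
    "j1": {"vm0": 4, "vm1": 7},
    "j2": {"vm0": 2, "vm1": 3},
    "j3": {"vm0": 5, "vm1": 4},
}
schedule, makespan = min_min(runtimes)
```

Completion times are recomputed in every round because assigning a job delays the resource it was placed on, which can change the best choice for the remaining jobs.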
Results
5.7 Conclusion
CloudSim and WorkflowSim are increasingly popular for simulating cloud computing
environments because they enable the repeatable evaluation of provisioning
policies for several applications. Simulation frameworks allow faster validation
of scheduling and resource allocation mechanisms in cloud data centers. Further,
diverse network topologies with several parallel applications can be modeled and
simulated in NetworkCloudSim, which was recently developed [2, 3]. CloudSim is a
very useful tool for modeling and simulating data centers, with the necessary
resource allocation based on various scheduling algorithms. Furthermore,
WorkflowSim is used to execute DAG-like workflows whose jobs are interdependent.
Additionally, WorkflowSim can be used to validate several scheduling and planning
algorithms.
References
1. http://www.cloudbus.org/cloudsim/
2. Piraghaj SF, Dastjerdi AV, Calheiros RN, Buyya R (2016) Container CloudSim: a framework
and algorithm for energy efficient container consolidation in cloud data centers, an environ-
ment for modeling and simulation of containers in cloud data centers, Software Pract Exp,
John Wiley & Sons, Ltd, USA, 2016. (Accepted for publication)
3. Garg SK, Buyya R (2011) NetworkCloudSim: modelling parallel applications in cloud simula-
tions. In: Proceedings of the 4th IEEE/ACM International Conference on Utility and Cloud
Computing (UCC 2011, IEEE CS Press, USA), Melbourne, Australia, December 5–7, 2011
4. Calheiros RN, Ranjan R, Beloglazov A, De Rose CAF, Buyya R (2011) CloudSim: a toolkit
for modeling and simulation of cloud computing environments and evaluation of resource
provisioning algorithms. Software Pract Exp 41(1):23–50. ISSN: 0038-0644, Wiley Press,
New York, USA, January, 2011
5. Zheng J, Sun Y, Zohu W (2009) Cloud computing based internet data center. Springer Cloud
Com pp 700–704
6. Sarishma KM (2015) Cloud storage: focusing on back end storage architecture. IOSR
J Comput Eng 17(Jan–Feb):2278–8727
7. Sabahi F (2012) Member, IEEE – secure virtualization for cloud environment using hypervisor-
based technology. Int J Mach Learn Comput 2(1)
8. Ferreira L, Putnik G, Cunha M (2013) Science direct – cloudlet architecture for dashboard in
cloud and ubiquitous manufacturing vol 12
9. Raj G, Nischal A (2012) Efficient resource allocation in resource provisioning policies over
resource cloud communication paradigm. Int J Cloud Comput Serv Architect 2(3):79–83
10. Chen W, Deelman E (2012) WorkflowSim: a toolkit for simulating scientific workflows in
distributed environments. In: Proceedings of the 8th International Conference on E-Science
(E-Science)
11. http://www.workflowsim.org/index.html
Chapter 6
Execution of Workflow Scheduling in Cloud
Middleware
Abstract Many scientific applications are often modeled as workflows. The data
and computational resource requirements of such workflow applications are high.
Cloud provides a better solution to this problem by offering a promising
environment for the execution of these workflows. As this involves tremendous data
computations and resources, there is a need to automate the entire process. A
workflow management system serves this purpose by orchestrating the workflow tasks
and executing them on distributed resources. Pegasus is a well-known workflow
management system that has been widely used in large-scale e-applications. This
chapter provides an overview of the Pegasus Workflow Management System, describes
the environmental setup with OpenStack and the creation and execution of workflows
in Pegasus, and discusses workflow scheduling in cloud along with its issues.
6.1 Introduction
Applications such as multi-tier web service workflows [1], scientific workflows [2],
and big data processing like MapReduce consist of numerous dependent tasks that
require a lot of computational power. This massive computational power requirement
makes these applications difficult to process. Workflow scheduling algorithms and
frameworks provide an effective means of handling such multifarious applications.
The problem of workflow scheduling in distributed environments has been addressed
widely in the literature. Prajapati and Shah [3] discussed scheduling in the grid
computing environment and presented a concise perspective on the grid computing
system. Fuhui Wu et al. presented a workflow scheduling algorithm for the cloud
environment along with a comparative review of workflow scheduling algorithms [4].
A taxonomy of workflow scheduling for grid computing was proposed by Jia Yu et
al. [5]. Masdari et al. [6] presented a comprehensive survey and analysis of
workflow scheduling in cloud computing.
Workflow applications are represented as Directed Acyclic Graphs (DAGs), and
the scheduling of workflows is an NP-complete problem [7]. These applications
require a high-performance computing environment for the execution of their
workflows. Traditional high-performance computing (HPC) support combined with
cloud computing is an effective solution for executing these computationally
demanding workflows. Many definitions have been proposed for cloud computing
[8, 9]. According to the NIST definition by Peter M. Mell and Timothy Grance [10],
cloud computing is a model for enabling ubiquitous, convenient, on-demand network
access to a shared pool of configurable computing resources that can be rapidly
provisioned and released with minimal management effort or service provider
interaction.
Cloud computing is a popular concept that has emerged from various heterogeneous
distributed computing paradigms such as utility computing, grid computing, and
autonomic computing. Cloud computing is viewed as a solution for running workflows
effectively and has rapidly gained the interest of researchers in the scientific
community. Cloud provides resources as services over a network in an on-demand
fashion, and the user needs to identify the resource type to lease, the lease time,
and the cost of running the application. Several open-source cloud platforms are
available for creating private and public clouds. In this chapter, an experimental
analysis based on the OpenStack cloud environment is carried out for various
scenarios.
The prospect of running workflow applications in the cloud is made attractive
by its many benefits. The essential benefits include:
• Virtualization
Cloud gives the illusion of unlimited resources, and this allows the user to acquire
sufficient resources at any time.
• Elasticity
Cloud allows its user to scale up and scale down the resources by acquiring and
releasing the resources as and when required.
The major contributions include:
• A description of an approach that sets up a computational environment in
OpenStack cloud to support the execution of scientific workflow applications
• Creation of workflows with Pegasus-keg tool and their execution in OpenStack
cloud
• A comparison of the performance of workflow applications using different
instances of OpenStack
• Addressing of some of the issues in scheduling workflows in cloud
Many works have been carried out on workflows in cloud: Google Scholar
reports 17,700 entries for the keywords workflows and cloud from 2010 to 2016.
This shows that workflow in cloud is a significant area of research.
Many scientific applications are often modeled as workflows. The data and
computational resource requirements of such workflow applications are high. Cloud
provides a better solution to this problem by offering a promising environment for
the execution of these workflows. Modeling applications as workflows involves
tremendous data computations and resources, thereby creating the need to automate
the entire process. A workflow management system serves this purpose by
orchestrating the workflow tasks and executing them on distributed resources.
Pegasus is a well-known workflow management system that has been widely used in
large-scale e-applications. This chapter provides an overview of the Pegasus
Workflow Management System and describes the environmental setup with OpenStack.
This chapter is organized as follows: Section 6.2 describes the workflow management
system with an overview of Pegasus and its subsystems. Section 6.3 discusses the
setup of the execution environment for the experiment, along with the creation and
execution of workflows in Pegasus. Section 6.4 describes the scheduling algorithm
and addresses the issues of scheduling workflows in cloud. Finally, Sect. 6.5
concludes the chapter.
6.2 Workflow Management System
A scientific workflow describes the necessary computational steps and their data
dependencies for accomplishing a scientific objective. The tasks in a workflow are
organized and orchestrated according to dataflow and dependencies. Appropriate
administration is needed for the efficient organization of workflow applications.
A workflow management system (WMS) is used for the proper management of workflow
execution on computing resources. Pegasus is one such open-source workflow
management system. It consists of different integrated technologies that support
the execution of workflow-based applications in heterogeneous environments, and it
manages workflows running on potentially distributed data and compute resources.
The architecture of the Pegasus WMS is shown in Fig. 6.1.
Table 6.1 Major subsystems of the Pegasus workflow management system [12]
Subsystem | Description
Mapper | Generates an executable workflow from the abstract workflow provided by the user. The mapper finds the appropriate resources and other related components required for the workflow execution. It also restructures workflows for optimized performance
Local execution engine | Submits the jobs and manages them by tracking the job state, and determines when to run a job. It then submits the jobs to the local scheduling queue
Job scheduler | Manages the jobs on both local and remote resources
Remote execution engine | Manages the execution of jobs on the remote computing nodes
OpenStack cloud is used for the experiment. At present, no other work has been
carried out on OpenStack with Pegasus WMS; to date, all experiments have been done
with Amazon EC2 instances. There are numerous ways to configure an execution
environment for workflow applications in cloud. The environment can be deployed
entirely in the cloud, or parts of it can reside outside the cloud. The former
approach has been chosen for this experiment, and the private cloud setup has been
accomplished with OpenStack. Instances are created with different flavors, such as
medium, large, xlarge, and dxlarge, with CentOS 6.5 as the base operating system,
together with Pegasus and HTCondor. The resource usage of this experiment is
detailed in Table 6.2. To reduce the setup time, preconfigured images are used to
create instances in the cloud. The Pegasus Workflow Management System environmental
setup in the OpenStack cloud is shown in Fig. 6.2.
Workflow applications are loosely coupled parallel applications that comprise
a set of tasks linked by dataflow and control-flow dependencies.
6.4 General Steps for Submitting Workflow in Pegasus 107
Workflow tasks use the file system to communicate with each other. Each task
produces one or more outputs, which become the inputs of other tasks. Workflows
are executed by the Pegasus Workflow Management System with DAGMan and Condor.
Pegasus transforms the abstract workflow into concrete plans, which are executed
using DAGMan to manage task dependencies and Condor to manage task execution.
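Because tasks communicate through files, the dependencies of a workflow can be derived from the files each task reads and writes. The sketch below illustrates this idea only; the task and file names are made up for the example, and this is not how Pegasus itself builds its plans.

```python
def dependencies_from_files(tasks):
    """tasks: name -> {"inputs": [...], "outputs": [...]}.
    A task depends on whichever task produces one of its input files."""
    producer = {}
    for name, spec in tasks.items():
        for f in spec["outputs"]:
            producer[f] = name
    return {
        name: {producer[f] for f in spec["inputs"] if f in producer}
        for name, spec in tasks.items()
    }


# Hypothetical diamond-shaped workflow with illustrative file names.
tasks = {
    "preprocess": {"inputs": ["f.a"], "outputs": ["f.b1", "f.b2"]},
    "findrange1": {"inputs": ["f.b1"], "outputs": ["f.c1"]},
    "findrange2": {"inputs": ["f.b2"], "outputs": ["f.c2"]},
    "analyze": {"inputs": ["f.c1", "f.c2"], "outputs": ["f.d"]},
}
parents = dependencies_from_files(tasks)
```

Input files that no task produces, such as `f.a` here, are treated as external inputs and create no dependency.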
The Pegasus environment and its control flow have been studied by executing the
example workflows that are shipped with Pegasus on different instances. The results
of executing the examples on different OpenStack instances are given in Fig. 6.3.
Desktop machine refers to the standalone machine, and the others are instances
in the OpenStack cloud. From the results, we infer that the OpenStack instances
perform better than the standalone machine. Hence, cloud instances are preferred
over the standalone machine for executing the workflows.
Fig. 6.3 Workflow execution in different OpenStack instances. (a) Process workflow execution. (b) Pipeline workflow execution. (c) Split workflow execution. (d) Merge workflow execution. (e) Diamond workflow execution. (f) Date workflow execution
[Bar charts; y-axis: execution in seconds; x-axis: workflow jobs (for example stage worker, create dir, stage out, register, cleanup); legend: DX – extra large, L – large, MED – medium, X – small, Desktop machine]
[Figure: abstract and executable workflows for the date example. Abstract workflow nodes: date, date.txt. Executable workflow jobs: create_dir_date_0_local, stage_worker_local_date_0_local, date_ID0000001, stage_out_local_local_0_0, cleanup_date_0_local]
Fig. 6.7 Sites.xml
Computational resources are identified as sites in the site catalog. Here, the site
is the local Condor [13], where the submitted workflow will run. Figure 6.7 shows
the site catalog for the date workflow. Executables are identified from the
transformation catalog. After the mapping process is completed, the workflow is
generated for a specified target workflow engine. It is then submitted to that
engine and its scheduler on the submit host. HTCondor DAGMan (Thain et al. [13])
is the default workflow engine, and HTCondor schedd is the default scheduler. The
workflow is planned with Pegasus-plan and then submitted to the execution host.
Fig. 6.8 Abstract workflow for the test generated by Pegasus using Pegasus-keg
The test workflows for the execution can be generated with the help of Pegasus-keg
(canonical executable), which is installed along with Pegasus WMS. It is used to
create a workflow task that runs for a specified time and generates output of a
specified size. Pegasus-keg has several options to generate workflows with the
desired features, such as:
–– -i: To read all input files into a memory buffer
–– -o: To write the content to output files
–– -T: To generate CPU load for a specified time
–– -a: To specify the name of the application
A test workflow with 12 nodes is created with the help of Pegasus-keg for execution
in Pegasus WMS. Abstract workflows, which are the primary input to Pegasus, are
prepared in DAX format using the Java API generator. The structure of the created
test workflow is shown in Fig. 6.8; this is an abstract workflow. Figure 6.9
shows the concrete workflow, which can be executed over a number of resources.
Here, the workflow is planned to be executed in a cloud environment, and the site
handle is condorpool. The output schedule for the test workflow is shown in
Fig. 6.10.
Fig. 6.10 The output schedule for the test workflow of 12 tasks provided by Pegasus WMS
[Gantt chart; x-axis: 0 to 500; host: Pegasus-dx.novalocal; legend: condor job resource delay, job run time as seen by dagman, pegasus::dirmanager, pegasus::Y:4.0, pegasus::transfer, pegasus::X:4.0, pegasus::Z:4.0, pegasus::cleanup]
Fig. 6.11 Output schedule for date workflow scheduled using random site selector
to generate graphs and charts to visualize the workflow run. Pegasus-plots is a
command that generates graphs and charts; its output is placed in the plots folder
of the workflow directory. Figure 6.15 shows the Pegasus-plots output generated for
the split workflow in Pegasus WMS.
The combination of infrastructure as a service and platform as a service over the
cloud forms a complete workflow scheduling architecture, and it introduces various
scheduling challenges. This section addresses such issues of scheduling workflows
in cloud. While proposing a new scheduling algorithm, certain issues are to be
6.5 Conclusion
References
1. Mao M, Humphrey M (2011) Auto-scaling to minimize cost and meet application deadlines in
cloud workflows. In: High-performance computing, networking, storage and analysis (SC),
International conference, pp 1–12, 12–18 Nov, 2011
2. Juve G, Chervenak A, Deelman E, Bharathi S, Mehta G, Vahi K (2013) Characterizing and
profiling scientific workflows. Futur Gener Comput Syst 29(3):682–692
3. Prajapati HB, Shah VA (2014) Scheduling in grid computing environment. In: 2014 fourth
international conference on advanced computing & communication technologies, Rohtak,
pp 315–324. doi:10.1109/ACCT.2014.32
4. Wu F, Wu Q, Tan Y (2015) Workflow scheduling in cloud: a survey. J Supercomput
71(9):3373–3418
5. Yu J, Buyya R (2005) A taxonomy of scientific workflow systems for grid computing. ACM
SIGMOD 34(3):44–49
6. Masdari M, ValiKardan S, Shahi Z, Azar SI (2016) Towards workflow scheduling in cloud
computing: a comprehensive analysis. J Netw Comput Appl 66:64–82
7. Garey MR, Johnson DS (1979) Computers and intractability: a guide to the theory of
NP-completeness. WH Freeman Co., San Francisco
8. Armbrust M et al (2009) Above the clouds: a Berkeley view of cloud computing, white paper,
UC Berkeley
9. Foster I et al (2008) Cloud computing and grid computing 360-degree compared. Grid com-
puting environments workshop (GCE ‘08)
10. Mell P, Grance T. The NIST definition of cloud computing. NIST Special Publication.
http://dx.doi.org/10.6028/NIST.SP.800-145
11. Rodriguez MA, Buyya R (2014) Deadline based resource provisioning and scheduling algo-
rithm for scientific workflows on clouds. IEEE Trans Cloud Comput 2:222–235
12. https://pegasus.isi.edu/overview/
13. Thain D, Tannenbaum T, Livny M (2005) Distributed computing in practice: the condor expe-
rience. Concurr Comput Exper Pract 17(2–4):323–356. http://dx.doi.org/10.1002/cpe.v17:2/4
14. Mohanapriya N, Kousalya G, Balakrishnan P (2016) Cloud workflow scheduling algorithms:
a survey. Int J Adv Eng Technol VII(III):88–195
15. Iosup A, Simon O, Nezih Yigitbasi M, Prodan R, Fahringer T, Epema DHJ (2011) Performance
analysis of cloud computing services for many-tasks scientific computing. IEEE Trans Parallel
Distrib Syst 22(6):931–945
Chapter 7
Workflow Predictions Through Operational
Analytics and Machine Learning
7.1 Introduction
Workflow prediction is the act of forecasting the execution time and cost profile
of each task as well as of the workflow as a whole. However, the execution time and
cost depend primarily on several factors, such as the workflow structure, the nature
of the application, and its execution environment, including compute and storage
resources and their network connectivity. These factors make it complex to decide
from which cloud provider, with what type of cloud instance profile, and how many
instances need to be leased for executing a given workflow. Two approaches have been
proposed in the literature to handle this scenario: analytical models (AMs) and
machine learning models (MLMs). This chapter presents the fundamentals of AMs and
MLMs along with their limitations. Following that, it explains a hybrid model that
reaps the benefits of both the AM and MLM approaches.
Section 7.2 discusses different workflow prediction approaches. Important
concepts, such as the need for workflow prediction models along with the associated
challenges and the merits and demerits of AM, MLM, and hybrid models, are
explained in this section.
Section 7.3 highlights various AM-based prediction approaches available in the
literature. Section 7.4 elucidates different MLM-based prediction approaches along
with their pros and cons. Section 7.5 points out the hybrid approach for performance
prediction.
Apart from this, the chapter explicates the hybrid approach using PANORAMA as a
case study in Sect. 7.6. Additionally, the architecture and operation of PANORAMA
are explained in detail using two workflow applications.
Finally, the chapter closes with several concluding remarks.
the information from these layers and simulates the applications at a faster time
scale to generate the estimation of elapsed time, scalability, and resource usage.
Here, the elapsed time represents the predicted application runtime under given
application and system parameters. The scalability parameter expresses the change
in application performance when the application parameters (like input data size)
and system parameters (like number of processors) are varied.
Task profiling model (TPM) [3] is an APPS that predicts the runtime and load
profile of job tasks in a standard time-sharing system. In such systems, variations
in CPU load seriously affect the execution time of CPU-intensive applications.
Hence, the TPM initially gathers the load of each user in the system, and then it
collects the free load profile (FLP), which is the load profile of each task with no
background load. Further, it utilizes this information to predict the load profile
of future jobs, thereby scheduling them on suitable computing resources.
DIMEMAS [4] is a simulator that predicts the performance of an MPI applica-
tion using its execution trace file that contains the computation/communication pat-
tern along with the configuration file which models the target architecture.
LaPIe [5] is a generic framework that identifies an efficient execution plan by
associating the best communication strategy-schedule scheme which minimizes the
completion time of a parallel application by considering the overall collective com-
munication time. For that, it divides the entire network into logical clusters, gener-
ates performance models for several communication schemes, and chooses the best
scheme for each logical cluster. The entire framework is modeled using the pLogP
model, whose parameters include the communication latency (L), the gap between
messages of size m, written g(m), and the number of processes (P).
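The per-cluster choice of a communication scheme can be illustrated with a minimal sketch. It assumes two simplified broadcast cost formulas expressed in pLogP terms (a flat tree and a binomial tree); the formulas, the function names, and the cluster parameters are illustrative assumptions and are not taken from LaPIe itself.

```python
import math


def flat_tree(P, L, g):
    """Assumed cost: the root sends P-1 messages back to back, then one latency."""
    return (P - 1) * g + L


def binomial_tree(P, L, g):
    """Assumed cost: ceil(log2 P) rounds, each paying one gap and one latency."""
    return math.ceil(math.log2(P)) * (g + L)


# Hypothetical catalogue of candidate schemes.
SCHEMES = {"flat": flat_tree, "binomial": binomial_tree}


def best_scheme(P, L, g):
    """Pick the scheme with the lowest predicted completion time."""
    return min(SCHEMES, key=lambda name: SCHEMES[name](P, L, g))


def plan(clusters):
    """Choose a scheme independently for each logical cluster."""
    return {name: best_scheme(**params) for name, params in clusters.items()}


# Hypothetical logical clusters: P processes, latency L, gap g for one message size.
clusters = {
    "wan_sites": {"P": 3, "L": 10, "g": 1},
    "lan_nodes": {"P": 64, "L": 1, "g": 1},
}
print(plan(clusters))
```

With these made-up parameters, the small high-latency cluster favors the flat tree and the large low-latency cluster favors the binomial tree, which is the kind of per-cluster decision the paragraph describes.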
The performance prophet of Askalon [6] predicts the performance of distributed/
parallel applications using Teuta and Performance Estimator components. Using
Teuta, the user represents the application as UML models and embeds the perfor-
mance and control flow information. These models are converted into an intermedi-
ate form and are stored in the data repository. Later, performance predictor reads
these models and estimates the application performance on selected platforms. If
the predicted performance is not satisfactory, the user may alter the application
model or opt for an alternate target platform until satisfied. In contrast, G-Prophet
uses the historic runtimes and input size of each task to predict the runtime of future
jobs. If a future job with the same input size is executed on a similar machine, then
the runtime is predicted using the load and memory. If not, the runtime is first
recalculated for the new machine, and then the future runtime is predicted using the
load and memory.
The GAMMA [7] model predicts the relationship of a parallel application to the
cluster architecture, thereby mapping a suitable cluster to that application. For that,
it models the parallel application as a ratio of the number of operations (O in
Mflops) to be executed on a single processor to the amount of data to be transferred
(S in Mwords) from one processor to another. Similarly, it models the cluster
architecture as the number of operations (Mflops/s) that a processor can perform
while sending a single word from one processor to another (Mwords/s). A parallel
124 7 Workflow Predictions Through Operational Analytics and Machine Learning
from potential donor peers and predicts the quantity of resources available for con-
sumer peer in a future time. This prediction phase should be initialized just before
submitting the tasks. The proposed model is validated by a trace-driven simulation,
and prediction error is calculated for each donor peer by subtracting the ratio of
estimated requested (ER) resources from obtained requested (OR) resources.
predict the runtime and finally applies the genetic algorithm to estimate these
parameters. A generic Bayesian-neural network approach to accurately predict the
runtime of any scientific application is explained in [25]. The Bayesian network
dynamically identifies the critical factors that affect the performance and feeds them
to a radial basis function neural network (RBF-NN) for more accurate runtime
predictions.
In short, neither a pure AM nor a pure MLM approach is adequate for designing a
robust APPS that predicts the optimal resource configuration, that is, the one that
gives an application its assured performance. To get the best of these two
approaches, a new hybrid APPS with an ensemble of both AM and MLM approaches
is explained in this section. This hybrid APPS starts with a pre-built AM, which is
instantiated with less training time than MLM-based predictors. The MLM
component then improves the accuracy of the hybrid APPS as new data arrive from
operational systems.
During its training phase, an ML algorithm builds a model from the feature space
of input values and the corresponding output values in its training data. The model
can then take an input that is not present in the training set and estimate its
corresponding output value. The difference between the actual and the estimated
value is known as the error. These errors can be minimized by tuning the model with
statistical approaches that reduce the errors during training itself. Further, the model
can be updated every time new observations are added to the training data. In
contrast, an AM is a static, immutable model built using a priori knowledge about
the system; it involves no training and cannot be updated. However, the
prediction accuracy of an AM depends on several internal parameters, which can be
tuned to improve the efficacy of the model. This dependency provides an excellent
opportunity to integrate an MLM into an AM. Firstly, an MLM is developed from
samples of the target system, collected over a period of time, to estimate these
internal parameters. Secondly, these estimated values are fed into the AM to adjust its
predictions. In short, the internal parameters of the AM are modified dynamically
whenever the MLM is updated as new data arrive. The primary benefit of exploiting
the synergies between AM and MLM is a more accurate HPPS than either of
their individual implementations.
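The integration described above, in which a learned component estimates the internal parameters of an analytical model, can be sketched as follows. This is only an illustration under stated assumptions: the AM is taken to be an Amdahl-style formula, runtime(n) = serial + parallel / n, and the MLM is reduced to an ordinary least-squares fit. The class names and the default parameter values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AnalyticalModel:
    """Toy AM (assumed form): runtime(n) = serial + parallel / n."""
    serial: float = 1.0      # internal parameter, pre-built default
    parallel: float = 1.0    # internal parameter, pre-built default

    def predict(self, n_procs):
        return self.serial + self.parallel / n_procs


@dataclass
class ParameterLearner:
    """Toy MLM: least-squares fit of observed runtime against 1/n."""
    samples: list = field(default_factory=list)

    def observe(self, n_procs, runtime):
        self.samples.append((1.0 / n_procs, runtime))

    def estimate(self):
        """Return (serial, parallel), or None if the data cannot identify them."""
        n = len(self.samples)
        if n < 2:
            return None
        mean_x = sum(x for x, _ in self.samples) / n
        mean_y = sum(y for _, y in self.samples) / n
        sxx = sum((x - mean_x) ** 2 for x, _ in self.samples)
        if sxx == 0:
            return None
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in self.samples)
        parallel = sxy / sxx
        return mean_y - parallel * mean_x, parallel


class HybridPredictor:
    """Usable immediately via the AM; refined as operational samples arrive."""

    def __init__(self):
        self.am = AnalyticalModel()
        self.mlm = ParameterLearner()

    def update(self, n_procs, runtime):
        self.mlm.observe(n_procs, runtime)
        estimate = self.mlm.estimate()
        if estimate is not None:
            # Feed the learned values back into the AM's internal parameters.
            self.am.serial, self.am.parallel = estimate

    def predict(self, n_procs):
        return self.am.predict(n_procs)
```

The predictor answers queries before any sample has been seen, using the pre-built defaults, and its parameters are re-estimated on every `update`, which mirrors the dynamic adjustment described in the paragraph.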
However, the major challenge is as follows: How to combine AM and MLM?
Here, three different approaches are proposed to combine AM and MLM.
7.6.1 PANORAMA: An Approach to Performance Modeling
and Diagnosis of Extreme-Scale Workflows
Motivating Use Cases The PANORAMA framework uses the following applications
for its modeling experiments: Spallation Neutron Source (SNS) (refer to
Fig. 7.2) and Accelerated Climate Modeling for Energy (ACME) (refer to Fig. 7.3).
The SNS application contains two different workflows: post-processing workflows
and pre-processing workflows. The post-processing workflows first conduct the SNS
experiments and then process the resultant data, whereas the pre-processing
workflows first conduct data analysis and simulation to guide the SNS experiments.
The primary objective of the post-processing workflow is to produce a short
proton sequence that hits a mercury target, which generates neutrons by spallation.
These scattered neutron events are captured by an array of detectors in NOMAD. After
a series of operations, these events are reduced to a powder diffraction pattern,
over which data analysis is conducted to extract information. In contrast, the
pre-processing workflow tries to refine a target parameter to fit the experimental
data. In this SNS refinement workflow (refer to Fig. 7.2), each set of parameters
is given as input to a series of parallel NAMD runs. The initial simulation estimates
the equilibrium and is followed by production dynamics. The output is then sent to
AMBER and Sassena, from where the final output is transferred to the client desktop.
For more information on the SNS workflow, refer to [26]. The ACME application (refer
to Fig. 7.3) studies climate change by integrating models of the ocean, land,
atmosphere, and ice. However, each part of the workflow demands diverse software and
hardware spread across several DoE labs. The primary objective of
PANORAMA is to automate the monitoring, resubmission, and reporting in several
stages of the ACME application. Here, a huge simulation is sliced into different stages.
Fig. 7.3 The complete Accelerated Climate Modeling for Energy (ACME) application
Each stage has to be completed within a stipulated deadline. At the end of each stage,
two kinds of files are created: restart files and history files. The restart files are
given as input to resume the simulations. Additionally, a summary of the simulation
at each stage is extracted from the history files and stored in climatologies, which are
used to verify the correctness of the simulation. Finally, the history files as well as
the climatologies are stored in the HPSS and CADES infrastructure for future analysis.
7.7 Conclusion
This chapter highlights the significance of workflow predictions together with the
influencing parameters. Further, it explains the two prominent approaches in
workflow prediction: AM and MLM. Subsequently, several notable approaches that
utilize either AM or MLM to predict one or more resource parameters during
workflow execution are described in detail. Following that, a hybrid approach that
integrates both AM and MLM is presented along with its merits. Finally, the
PANORAMA architecture is explained in detail, with attention to its use of the
hybrid approach for workflow execution prediction. Next, Chap. 8 outlines
the opportunities and challenges in the orchestration and integration of
workflows.
References
21. Rodero I, Guim F, Corbalán J, Labarta J (2005) eNANOS: coordinated scheduling in grid
environments. In: Presented at the parallel computing: current & future issues of high-end
computing, Parco 2005
22. Andrzejak A, Domingues P, Silva L (2006) Predicting machine availabilities in desktop pools.
In: IEEE/IFIP Network Operations & Management Symposium (NOMS 2006), Vancouver,
Canada
23. Matsunaga A, Fortes JAB (2010) On the use of machine learning to predict the time and
resources consumed by applications. In: Presented at the 10th IEEE/ACM international con-
ference on Cluster, Cloud and Grid Computing (CCGrid), Melbourne, VIC, Australia
24. Minh TN, Wolters L (2010) Using historical data to predict application runtimes on backfilling
parallel systems. In: Presented at the 18th Euromicro conference on Parallel, Distributed and
Network-based Processing (PDP ’10), Pisa, Italy
25. Duan R, Nadeem F, Wang J, Zhang Y, Prodan R, Fahringer T (2009) A hybrid intelligent
method for performance modeling and prediction of workflow activities in grids. In: Presented
at the 9th IEEE/ACM international symposium on Cluster Computing and the Grid (CCGRID
’09), Shanghai, China
26. Deelman E, Carothers C, Mandal A, Tierney B, Vetter JS, Baldin I, Castillo C, Juve G,
Król D, Lynch V, Mayer B, Meredith J, Proffen T, Ruth P, Ferreira da Silva R (2015)
PANORAMA: an approach to performance modeling and diagnosis of extreme scale
workflows. Int J High Perform Comput Appl (in press)
Chapter 8
Workflow Integration and Orchestration,
Opportunities and the Challenges
8.1 Introduction
populate with their data at this stage. Next, the workflow instance is created by
populating the workflow templates with data using the data and metadata catalogs.
At this stage, the workflow instances do not know the operational details of the
resources that will carry out the analysis. Following that, the executable workflow is
created by associating the workflow instance with the available distributed execution
resources. The association step has to find the appropriate software, hardware, and
replicas of the data mentioned in the workflow instance. Apart from this, the
association phase also restructures the workflow to improve the overall performance
and applies workflow transformations to manage the data as well as provenance
information. In short, executable workflow creation requires information about the
available resources and data replicas on one side and about the requirements of the
application components on the other. After the executable workflow has been created,
the workflow engine starts executing the activities on the resources associated
in the workflow. During execution, the data, metadata about the data, and provenance
information are collected and stored in a repository. This information enables
workflows to be reused as is or adapted and modified based on the future needs of users.
Besides, the workflow cycle allows users to start from any stage. The following
subsections explain each stage of the workflow life cycle in detail.
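The progression from template to instance to executable workflow can be sketched in a few lines. This is a minimal illustration, not the data model of any particular workflow system; the class names, catalog layouts, and the first-match site selection are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemplateStep:
    name: str
    tool: str              # abstract analysis component, no data yet


@dataclass(frozen=True)
class InstanceStep:
    name: str
    tool: str
    dataset: str           # logical data name, still resource-independent


@dataclass(frozen=True)
class ExecutableStep:
    name: str
    tool: str
    site: str              # execution resource chosen during association
    replica: str           # physical location of the data at that site


def instantiate(template, data_catalog):
    """Template -> instance: bind each step to a logical dataset."""
    return [InstanceStep(s.name, s.tool, data_catalog[s.name]) for s in template]


def associate(instance, tool_sites, replica_catalog):
    """Instance -> executable: find a site offering the tool and holding a replica."""
    executable = []
    for step in instance:
        for site in tool_sites[step.tool]:
            if site in replica_catalog[step.dataset]:
                replica = replica_catalog[step.dataset][site]
                executable.append(ExecutableStep(step.name, step.tool, site, replica))
                break
        else:
            raise LookupError(f"no site offers {step.tool} with {step.dataset}")
    return executable


# Hypothetical catalogs for a one-step workflow.
template = [TemplateStep("align", "aligner")]
instance = instantiate(template, {"align": "reads.fq"})
executable = associate(
    instance,
    tool_sites={"aligner": ["local", "cloud"]},
    replica_catalog={"reads.fq": {"cloud": "cloud:/data/reads.fq"}},
)
print(executable)
```

Only the last stage needs the resource and replica catalogs, which reflects the point that a workflow instance carries no operational details of the resources.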
140 8 Workflow Integration and Orchestration, Opportunities and the Challenges
Workflow mapping is a procedure that converts the abstract workflow into an
executable workflow. In other words, it turns the resource-independent workflow
instance into a resource-dependent workflow. The appropriate services or resources
for each functional unit can be chosen either directly by the user or by a scheduler.
For service-based workflows, the scheduler finds and binds suitable services
using their metadata and their functional and nonfunctional properties, considering the
Quality of Service (QoS) requirements of each functional unit. In the case of
task-based workflows, the scheduler chooses suitable resources that satisfy
the requirements of the functional units. Here, the resources may be associated with
the functional units either statically before starting the execution or dynamically at
runtime. Further, the scheduler may use any of the following strategies to map the
resources: just-in-time planning, full-ahead planning, or partial planning [20] to
enable data reuse, reduced cost, and data usage. Just-in-time planning maps
resources to each functional unit individually, thereby enhancing resource utilization
and reducing cost. In full-ahead planning, the entire workflow is scheduled as a whole,
which enables data reuse and thereby reduces the computation time. In partial
planning, the entire workflow is sliced into multiple sub-workflows, and each
sub-workflow is mapped to resources. The optimization objective of the scheduler
can be any of those specified in Chap. 3.
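The difference between the three planning strategies lies in how much of the workflow is handed to the scheduler at once. The sketch below groups the tasks of a small DAG into planning batches for each strategy. Slicing sub-workflows by dependency level is an assumption chosen for simplicity (the text does not say how the slices are formed), and the function names are hypothetical. It relies on `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def levels(deps):
    """Group tasks into levels; each task's predecessors lie in earlier levels."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result


def planning_batches(deps, strategy):
    """Return the groups of tasks that are mapped to resources together."""
    by_level = levels(deps)
    if strategy == "full-ahead":
        # The whole workflow is planned in one pass.
        return [[task for level in by_level for task in level]]
    if strategy == "partial":
        # Assumed slicing: one sub-workflow per dependency level.
        return by_level
    if strategy == "just-in-time":
        # Each functional unit is mapped on its own.
        return [[task] for level in by_level for task in level]
    raise ValueError(f"unknown strategy: {strategy}")


# Each key maps a task to the set of tasks it depends on.
deps = {"b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
for strategy in ("full-ahead", "partial", "just-in-time"):
    print(strategy, planning_batches(deps, strategy))
```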
steps may fail in a task-based workflow, or a service may not complete its execution
even after a timeout in a service-based workflow. Hence, the execution
model should be equipped with appropriate fault tolerance mechanisms. In
task-based workflows, fault tolerance is realized by saving the execution state of
tasks (i.e., checkpointing) and migrating it to another resource (if needed),
which is identified by rescheduling. Here, checkpointing can be done at three
different levels: task level, application level, or operating system level (i.e., VM
level). Service-based workflows, in contrast, reschedule the service after a timeout.
Besides, the workflow engine can analyze the trace information or
log files to identify the root cause of a failure if the task completion status is “failed.”
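Task-level checkpointing with migration can be illustrated by a small simulation. Everything here is hypothetical: the task is a loop of numbered steps, a resource failure is simulated by raising an exception at a chosen step, and rescheduling is reduced to trying the next resource in a list.

```python
def run_task(total_steps, checkpoint, fail_at=None):
    """Run from the last checkpoint; raise to simulate a resource failure."""
    for step in range(checkpoint["done"], total_steps):
        if step == fail_at:
            raise RuntimeError(f"resource failed at step {step}")
        checkpoint["done"] = step + 1   # task-level checkpoint after each step
    return checkpoint["done"]


def execute_with_migration(total_steps, resources, failures):
    """Try resources in order, resuming from the saved state after each failure."""
    checkpoint = {"done": 0}
    log = []
    for resource in resources:
        try:
            run_task(total_steps, checkpoint, failures.get(resource))
            log.append((resource, "completed"))
            break
        except RuntimeError:
            log.append((resource, "failed"))   # state is kept, task migrates
    return checkpoint["done"], log


# Resource r1 fails at step 3; r2 resumes from the checkpoint instead of step 0.
done, log = execute_with_migration(5, ["r1", "r2"], {"r1": 3})
print(done, log)
```

The log also records which resource failed, which is the kind of trace information an engine could inspect when a task's completion status is "failed".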
Data provenance maintains the historic details about a newly created data
object, which provides the opportunity to reproduce the results. The provenance data
contain information about the nodes that are clustered, the input data chosen for
execution, the intermediate data chosen for reuse, the scheduled execution sites,
timestamps, software and library versions, etc. The Karma [28] provenance system
offers the capability to search over a provenance database that is collected using
event-driven streams. However, most workflow systems optimize user workflows
by restructuring them. Hence, it is tedious to directly use the provenance data of
restructured workflows for reproducibility. To address this, Pegasus is combined
with the PASOA provenance system [29].
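A provenance record holding the items listed above, together with a simple lineage query, might look like the following sketch. The record layout and the `lineage` helper are illustrative assumptions; they do not reproduce the schema or API of Karma, PASOA, or Pegasus.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ProvenanceRecord:
    data_object: str                 # the newly created data object
    inputs: list                     # input data chosen for execution
    reused_intermediates: list       # intermediate data chosen for reuse
    execution_site: str              # scheduled execution site
    software_versions: dict          # tool or library name -> version
    clustered_nodes: list = field(default_factory=list)
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def lineage(records, data_object):
    """List every data item that contributed, directly or not, to data_object."""
    by_output = {record.data_object: record for record in records}
    seen, stack = [], [data_object]
    while stack:
        record = by_output.get(stack.pop())
        if record is None:
            continue                 # a primary input with no recorded producer
        for parent in record.inputs + record.reused_intermediates:
            if parent not in seen:
                seen.append(parent)
                stack.append(parent)
    return seen


# Hypothetical two-step history.
records = [
    ProvenanceRecord("aligned.bam", ["reads.fq"], [], "cloud", {"aligner": "1.0"}),
    ProvenanceRecord("variants.vcf", ["aligned.bam"], ["ref.idx"], "local", {"caller": "2.1"}),
]
print(lineage(records, "variants.vcf"))
```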
With so many workflow creation, mapping, execution, metadata, and provenance
methodologies, it is hard to integrate and orchestrate scientific workflows.
Typically, the different parts of a workflow application require different resources:
one step may need user intervention, another step needs a data stream as input for its
computation, and some steps need high-performance resources. None of the existing
workflow systems can handle all of these demands. Therefore,
developers rely on multiple workflow engines to design and develop their
workflows. Unfortunately, a workflow represented for one workflow system cannot be
directly executed on another because of interoperability issues. At the very
least, interoperability must be provided at the workflow description
level, so that a workflow described for one workflow system can be reused
on another without any modification. Section 8.3 discusses the opportunities and
challenges in workflow integration, orchestration, and execution in the cloud
environment.
8.3 Challenges and Opportunities 143
Currently, most scientific research projects need collaboration among several
domain experts who are usually at different geographical locations.
In other words, part or all of a scientific process needs collaboration among
several research groups, datasets, and computational tasks. Hence, to execute a
collaborative scientific workflow, the collaborators need to be interconnected through
the Internet. However, storing, transferring, and reusing the intermediate data
among the collaborators is a critical issue. Since cloud computing offers scalable
data and computing resources that can also be shared among group members,
these collaborative scientific workflows can be defined, developed, and executed
using cloud resources. Nevertheless, the integration and orchestration of workflows
pose the following challenges [30]: architectural, integration, computing, data
management, and languages.
Architectural Challenges Scientific workflows are distributed in nature which
opens new avenues for scalability in several dimensions like the number of users,
use cases, and resources. Besides, each distributed part of the workflow may be
described and executed in heterogeneous environments. In addition, these distrib-
uted parts need to be seamlessly integrated to complete the scientific process. Hence,
the workflow management system (WfMS) which manages the workflow execution
should architecturally support flexibility, extensibility, and portability. At the fabric
level, it has to support heterogeneous services, software tools, data sources, and
computing environments, which can be accessed in a distributed manner.
Consequently, it has to manage task execution and the associated datasets along with
their provenance data. Further, at the macro-level, it has to monitor the workflow
execution to gather resource consumption patterns, task execution status, and fault
tolerance information. Likewise, it has to provide interoperability with other WfMSs
at the workflow description level, thereby allowing sub-workflows to be offloaded for
execution over the fabric layer of another WfMS. Finally, it has to furnish
customization and interaction support at the user interface.
Integration Challenges The integration challenges concern the issues that
arise while executing the workflow over cloud computing resources. Here, the
jobs in a workflow may be either service-based or task-based. On the other hand, what
the cloud environment offers to the task units in a workflow can be applications,
services, software tools, or compute and storage resources. Therefore, it is
mandatory to identify the target job type (i.e., service or task) so that the job can be
scheduled and dispatched to the appropriate cloud service for execution. Nonetheless,
this step demands tweaking the WfMS architecture in several aspects: interfacing with
several cloud providers for resource provisioning, workflow monitoring, seamless
integration of sub-workflows to transfer or interconnect intermediate data, and job
migration. Firstly, the jobs need computing resources or services that are available
in the cloud environment for their execution. But most current WfMSs cannot
directly interface with several cloud vendors to create the computing resources.
Likewise, choosing an appropriate cloud instance type and the required number of such
instances is also critical since it determines the cost of execution. Apart from this, the
WfMS needs to interact with cloud instances to monitor, debug, and collect the
provenance data. For service-based jobs, the WfMS needs to identify suitable cloud
services based on their metadata and Quality of Service (QoS). Briefly, the WfMS
needs to be re-architected so that it can interact with several cloud vendors to
create resources at the first level and interface with individual
cloud computing instances or services at the next level to monitor them and gather
resource information. BioCloud [31] is one such broker, which reengineered the
Galaxy WfMS [32] to integrate it directly with cloud vendors as well as cloud
instances; it is explained in detail later in this chapter.
Language Challenges Several parts of the workflow that are executed in a
distributed manner can be integrated together using ad hoc scripts. These scripts can
be based on MapReduce, SwiftScript, or BPEL. However, the language has to be able
to bind the input data to a service or task. Further, it has to support scalability in
terms of compute or storage resources to do the computations in parallel.
Computing Challenges As already mentioned, choosing an appropriate cloud
instance type and the number of such instances is critical in workflow scheduling.
After choosing the hardware, the next challenge is to select suitable machine images
for executing the workflow tasks. Then, the instances have to be configured
dynamically with several parameters such as IP address, machine name, and
cluster creation. Apart from this, the input data need to be staged into the newly
created instances before initiating the task execution. Further, different portions of
the workflow demand different instance types and the transfer of intermediate data
from one stage to another.
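The trade-off between instance type and instance count can be made concrete with a toy sizing calculation. The catalogue below is entirely hypothetical (throughput in tasks per hour and price in cents per instance-hour are made-up numbers), and the whole-hour billing rule is an assumption; the sketch only shows the shape of the decision.

```python
import math

# Hypothetical catalogue: type -> (tasks per hour per instance, cents per instance-hour)
CATALOGUE = {"small": (10, 10), "large": (45, 40)}


def cheapest_plan(n_tasks, deadline_hours, catalogue=CATALOGUE):
    """Return (type, count, cost_in_cents) meeting the deadline at the lowest cost."""
    best = None
    for name, (rate, price) in catalogue.items():
        # Fewest instances of this type that can finish before the deadline.
        count = math.ceil(n_tasks / (rate * deadline_hours))
        # Assumed billing rule: each instance is charged in whole hours.
        hours = math.ceil(n_tasks / (rate * count))
        cost = count * hours * price
        if best is None or cost < best[2]:
            best = (name, count, cost)
    return best


print(cheapest_plan(900, 2))
```

For 900 tasks and a two-hour deadline, these made-up numbers give 45 small instances for 900 cents versus 10 large instances for 800 cents, so the larger type is cheaper even though its hourly price is higher.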
Data Management Challenges The primary bottleneck in data-intensive workflow
execution is the mechanism that handles data staging to and from the computing
resources. Also, the relative location of data and computational resources as well as
their I/O speeds seriously affect the scalability of applications. Additionally,
selecting an appropriate data source for the computation is challenging. Mostly,
handling data-related and computation-related issues separately leads to a huge
amount of data staging between them. Managing data and computational resources
collectively, along with the provenance data and access patterns (which consist of
data locations, intermediate data, and the methodology used to generate the data
product), can improve scalability and performance.
Service Management Challenges Since service-oriented architecture (SOA)
offers abstraction, decoupling, and interoperability among services, it can be easily
leveraged in distributed scientific and engineering applications. However, it is hard
to manage a huge number of services in terms of service invocation, state
management, and service destruction. Similarly, data staging from one service to
another is critical for performance and throughput.
8.4 BioCloud: A Resource Provisioning Framework
for Bioinformatics Applications in Multi-cloud
Environments
step and ensuring availability of the resources through resource management and
provisioning, BCWM is responsible for dispatching workflow steps for execution
on the resources predetermined by BioCloud Portal and for monitoring them.
Initially, BCWM informs BioCloud Portal about the workflow to be run by sending
the related information through Web services. BioCloud Portal employs the
BioCloud scheduler to designate the workflow execution schedule and determine the
resources to be used at each step. The scheduler receives the submitted
workflow as a DAG (Directed Acyclic Graph) along with the associated tool
names at each step and the input data size. One of the key features of the scheduler is
inherent workflow improvement through data partitioning and parallelism. The
scheduler automatically manipulates the DAG to enable parallelism. It employs a
profiler to estimate the expected running times of the tools, the amount of data to be
produced, and the cost of execution to be incurred considering the available resources.
Based on this information, the scheduler identifies the resources to be used at each
step considering cost and time requirements. It employs a resource manager to
ensure the availability of the resources before executing a workflow step. When the
resources are provisioned, BCWM is allowed to dispatch the next available
workflow step for execution. This enables dynamic scaling up of the resources right
before the execution of a particular workflow step to meet the resource demand of
the corresponding VO. The resource management module also tracks the
provisioned resources so that it can scale down based on the supply-demand balance
in the next billing cycle of the provisioned resources.
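How profiler estimates might drive the choice of a resource for a single workflow step can be sketched as follows. This is not BioCloud's scheduler: the profile table, its units (seconds per GB of input and cents per hour), and the selection rule (lowest cost among resources that meet a time limit) are assumptions made for illustration.

```python
# Hypothetical profile: tool -> resource -> (seconds per GB of input, cents per hour)
PROFILES = {
    "aligner": {"workstation": (600, 0), "cloud": (150, 80)},
}


def pick_resource(tool, input_gb, max_seconds, profiles=PROFILES):
    """Return (resource, seconds, cost) with the lowest cost within the time limit."""
    candidates = []
    for resource, (secs_per_gb, cents_per_hour) in profiles[tool].items():
        seconds = secs_per_gb * input_gb            # estimated running time
        cost = seconds / 3600 * cents_per_hour      # estimated execution cost
        if seconds <= max_seconds:
            candidates.append((cost, seconds, resource))
    if not candidates:
        return None                                 # no resource meets the limit
    cost, seconds, resource = min(candidates)
    return resource, seconds, cost


print(pick_resource("aligner", 10, max_seconds=7200))
print(pick_resource("aligner", 10, max_seconds=3600))
```

With a relaxed time limit the free workstation is chosen; tightening the limit forces the step onto the paid cloud resource, which is when scaling up just before that step would be needed.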
BioCloud not only provides an efficient scheduler that minimizes the execution cost
while meeting cost and time requirements, which is the key contribution of [31],
but also offers a user-friendly platform that encapsulates the complexity of
identifying the resources to be used among several options, using resources
simultaneously on multiple cloud providers to execute workflows while handling data
partitioning and parallelism, dynamic resource scaling, and cluster configuration in
the cloud. Hiding this complexity lets the user focus on the workflow design.
The user simply clicks the run button to execute the workflow. Figure 8.2 depicts the
steps involved in the execution of a workflow where the user has Amazon EC2 and
Rackspace cloud accounts as well as local clusters.
Authentication BioCloud leverages the Alberich policy engine [34] to authenticate
users based on the defined roles and permissions. BioCloud Portal uses the
permissions of the account provided by the user for the initial deployment of
BCWM and to run the workflows. Since various steps of the
workflow can be executed using different computational resources, the Alberich
policy engine authenticates a user for each particular resource. The users provide
cloud service credentials so that the Alberich policy engine can retrieve roles,
permissions, and privileges to authenticate and authorize users for the resource pools.
Access to the image details, profiler information, and resource information, as well as
the allowed actions, is determined by the access rights. The policy engine is
extended so that resources from different cloud providers can be used.
Here, two different scenarios are presented to demonstrate the smooth transition
from a single workstation, the only resource available for the majority of biological
scientists, to a multi-cloud environment. In the first scenario, only a single worksta-
tion is assumed to be available to the user. On the other hand, besides the worksta-
tion, multiple cloud resources are also available in the second scenario. Two different
cloud vendors are selected to demonstrate the flexibility and interoperability of
BioCloud. These scenarios are tested with two different use cases (bioinformatics
workflows) as explained below.
The testbed consists of a local workstation and multiple instances from two
cloud vendors: Amazon and Rackspace. The workstation is equipped with two Intel
Xeon E5520 CPUs clocked at 2.27 GHz and 48 GB of memory. Each CPU is a quad-core
with Hyper-Threading enabled, and all cores share an 8 MB L3 cache. The “instance
type” BioCloud used on Amazon EC2 is M3 General Purpose Double Extra Large
(m3.2xlarge). This configuration has 8 cores with a memory of 30 GB. The “flavor”
selected on Rackspace is Performance 2 with a memory of 30 GB and 8 cores. More
information regarding the configuration of these particular instance types can be
found on the respective cloud vendors’ websites. BioCloud forms a dynamic cluster
on the fly, when needed, in the corresponding cloud service using instances of these
types.
ExomeSeq Workflow The first use case of BioCloud is an exome sequence analysis
pipeline obtained from the BioCloud authors' collaborators [35], which is known as
ExomeSeq. Test and control data with paired-end reads are used as input. In the first
two steps, sequencing reads are aligned to the human reference genome using the
BWA [36] alignment and BWA sampe steps. The third step sorts the alignments by
leftmost coordinates. Duplicates are marked in the following two steps using two
different tools. In step six, local realignments around indels are performed, and the
last step detects indels. The abstract ExomeSeq workflow is depicted in Fig. 8.3.
Readers can refer to [35] for more information regarding the workflow steps. As
mentioned earlier, one of the key features of BioCloud is its inherent workflow
improvement facility through data partitioning and parallelism. The BioCloud
scheduler evaluates the submitted abstract workflow and generates an optimized
workflow that utilizes the available resources and hence enables parallelism. For
example, for the abstract ExomeSeq workflow designed by the user (Fig. 8.3),
BioCloud generates an improved version of this workflow as given in Fig. 8.4. Here,
the BioCloud scheduler checks whether data partitioning can be enabled for the
workflow steps and, considering the available resources and profiling data of earlier
executions, decides to dispatch the test data and the control data to separate cloud
resources for execution. In this scenario, the workflow steps for the test data are run
on Amazon EC2, and the steps for the control data are run on Rackspace. The output
data of step six are transferred back to the workstation, and the last step is executed
locally, where the final result is also stored.
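The transformation from the abstract workflow to the partitioned one can be sketched as a small DAG rewrite: the partitionable steps are replicated once per data partition, each branch is pinned to a site, and the branches join at a final local step. The step names are shortened placeholders and the helper is hypothetical; only the placement of test data on Amazon EC2, control data on Rackspace, and the last step on the workstation follows the scenario described above.

```python
def partition_workflow(steps, placement, merge_step, merge_site):
    """Replicate `steps` per partition and join the branches at `merge_step`.

    Returns (edges, sites): DAG edges between step instances, and the site
    assigned to each step instance.
    """
    edges, sites = [], {merge_step: merge_site}
    for partition, site in placement.items():
        branch = [f"{step}[{partition}]" for step in steps]
        edges += list(zip(branch, branch[1:]))      # chain within the branch
        edges.append((branch[-1], merge_step))      # branch output feeds the merge
        for node in branch:
            sites[node] = site
    return edges, sites


# Placeholder names for the six partitionable steps and the final local step.
steps = ["align", "sampe", "sort", "markdup1", "markdup2", "realign"]
edges, sites = partition_workflow(
    steps,
    placement={"test": "Amazon EC2", "control": "Rackspace"},
    merge_step="detect_indels",
    merge_site="workstation",
)
print(len(edges), sites["realign[control]"], sites["detect_indels"])
```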
Transcriptome Assembly and Functional Annotation De novo transcriptome
assembly and functional annotation are considered an essential but computationally
intensive step in bioinformatics. The objective of the assembly and annotation
workflow is to assemble a big dataset of short reads and extract the functional
annotation for the assembled contigs. As can be seen in Fig. 8.5, the workflow
consists of four stages. The first stage is data cleaning, in which the Trimmomatic [37]
tool is applied to the paired-end reads dataset. After that, the output is converted to
FASTA format. In stage two, the assembly, the clean dataset is used as input to five
different de novo transcriptome assemblers: Trans-ABySS [38], SOAPdenovo-Trans
[39], IDBA-Tran [40], Trinity [41], and Velvet-Oases [42]. The assembled contigs
from each assembler are merged and used as input to stage three, which includes
clustering and removing redundant contigs as well as applying reassembly to the
unique contigs. In stage three, the TGICL tool [43], which utilizes MEGABLAST
[44], is used for contig clustering and CAP3 [45] for assembly. Functional annotation
is done in the last stage, which is the most computationally intensive part. The BLAST
comparison and functional annotation used in this workflow follow the pipeline
detailed in [46]. Three major sequence databases, NCBI nonredundant protein
sequences (nr), UniProt/Swiss-Prot, and UniProt/TrEMBL, are used in the sequence
comparison steps. The Blastx results are parsed in the last step, and their associated
GO categories are generated. The dataset used is rice transcriptome data from Oryza
sativa 9311 (http://www.ncbi.nlm.nih.gov/sra/SRX017631). 9.8M paired-end reads
of 75 bp length, totaling 1.47 Gbp, were generated using the Illumina GA
platform [47]. The output contigs of the TGICL step were filtered by removing
contigs shorter than 400 base pairs. For practical reasons, the number of
sequences in each of the three protein databases was reduced to 1% of its original
count, and the databases were installed on the single workstation and the remote
clusters. Similar to ExomeSeq, the transcriptome assembly and annotation use case is
tested in two different scenarios. In the first one, the assumption is that the user has
access to a single commodity workstation to run the workflow. The second scenario
assumes the availability of multiple cloud resources besides the workstation.
8.5 Conclusion
This chapter first highlights two classes of workflows, task-based and
service-based, together with their characteristics. It then explains the phases of
the workflow life cycle. After that, several architectural, integration, computing,
data management, and language-related challenges are elucidated in detail. Finally,
BioCloud, which is built on top of the Galaxy workflow system and executes
task-based workflows by leasing cloud instances from multiple cloud vendors, is
described in detail. Chapter 9 expounds the automated scheduling of workflows
through workload consolidation.
References
28. Simmhan YL, Plale B, Gannon D (2006) Performance evaluation of the karma provenance
framework for scientific workflows, in: International Provenance and Annotation Workshop,
IPAW. Springer, Berlin
29. Miles S, Groth P, Deelman E, Vahi K, Mehta G, Moreau L (2008) Provenance: the bridge
between experiments and data. Comput Sci Eng 10(3):38–46
30. Zhao Y, Fei X, Raicu I, Lu S (2011) Opportunities and challenges in running scientific work-
flows on the cloud. International conference on cyber-enabled distributed computing and
knowledge discovery, Beijing, pp 455–462
31. Senturk IF, Balakrishnan P, Abu-Doleh A, Kaya K, Qutaibah M, Ümit V (2016) A resource
provisioning framework for bioinformatics applications in multi-cloud environments. Future
Generation Computer Systems, Elsevier. doi:10.1016/j.future.2016.06.008
32. Goecks J, Nekrutenko A, Taylor J, Team TG (2010) Galaxy: a comprehensive approach for
supporting accessible, reproducible, and transparent computational research in the life sci-
ences. Genome Biol 11(8) http://dx.doi.org/10.1186/gb-2010-11-8-r86, R86+
33. BioCloud. URL http://confluence.qu.edu.qa/display/KINDI/BioCloud
34. Alberich. URL https://github.com/aeolus-incubator/alberich
35. Woyach JA, Furman RR, Liu T-M, Ozer HG, Zapatka M, Ruppert AS, Xue L, Li DH-H,
Steggerda SM, Versele M, Dave SS, Zhang J, Yilmaz AS, Jaglowski SM, Blum KA, Lozanski
A, Lozanski G, James DF, Barrientos JC, Lichter P, Stilgenbauer S, Buggy JJ, Chang BY,
Johnson AJ, Byrd JC (2014) Resistance mechanisms for the bruton’s tyrosine kinase inhibitor
ibrutinib. New Engl J Med 370(24):2286–2294. http://dx.doi.org/10.1056/NEJMoa1400029,
pMID: 24869598
36. Li H, Durbin R (2009) Fast and accurate short read alignment with burrows–wheeler trans-
form. Bioinformatics 25(14):1754–1760
37. Bolger AM, Lohse M, Usadel B, Trimmomatic: a flexible trimmer for illumina sequence data,
Bioinformatics. http://dx.doi.org/10.1093/bioinformatics/btu170
38. Robertson G, Schein J, Chiu R, Corbett R, Field M, Jackman SD, Mungall K, Lee S, Okada
HM, Qian JQ, Griffith M, Raymond A, Thiessen N, Cezard T, Butterfield YS, Newsome R,
Chan SK, She R, Varhol R, Kamoh B, Prabhu A-L, Tam A, Zhao Y, Moore RA, Hirst M, Marra
MA, Jones SJM, Hoodless PA, Birol I (2010) De-novo assembly and analysis of RNA-seq
data. Nat Methods 7(11):912
39. Xie Y, Wu G, Tang J, Luo R, Patterson J, Liu S, Huang W, He G, Gu S, Li S, Zhou X, Lam T-W,
Li Y, Xu X, Wong GK-S, Wang J, SOAPdenovo-Trans: De novo transcriptome assembly with
short RNA-Seq reads. Bioinformatics. http://dx.doi.org/10.1093/bioinformatics/btu077
40. Peng Y, Leung HCM, Yiu S-M, Lv M-J, Zhu X-G, Chin FYL (2013) IDBAtran: a more robust
de novo de bruijn graph assembler for transcriptomes with uneven expression levels.
Bioinformatics 29(13):i326–i334. http://dx.doi.org/10.1093/bioinformatics/btt219
41. Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L,
Raychowdhury R, Zeng Q, Chen Z, Mauceli E, Hacohen N, Gnirke A, Rhind N, di Palma F,
Birren BW, Nusbaum C, Lindblad-Toh K, Friedman N, Regev A (2011) Full-length transcrip-
tome assembly from RNA-Seq data without a reference genome. Nat Biotechnol 29(7):652
42. Schulz MH, Zerbino DR, Vingron M, Birney E, Oases: Robust de novo RNA-seq assembly across
the dynamic range of expression levels, Bioinformatics.
http://dx.doi.org/10.1093/bioinformatics/bts094
43. Pertea G, Huang X, Liang F, Antonescu V, Sultana R, Karamycheva S, Lee Y, White J, Cheung
F, Parvizi B, Tsai J, Quackenbush J (2003) TIGR gene indices clustering tools (TGICL): a
software system for fast clustering of large EST datasets. Bioinformatics 19(5):651–652.
http://dx.doi.org/10.1093/bioinformatics/btg034
44. Zheng Zhang LW, Scott S, Miller W (2000) A greedy algorithm for aligning DNA sequences.
Comput Biol 7:203–214. http://dx.doi.org/10.1089/10665270050081478
45. Huang X, Madan A (1999) CAP3: a DNA sequence assembly program. Genome Res 9:868–
877. http://dx.doi.org/10.1101/gr.9.9.868
46. De Wit P, Pespeni MH, Ladner JT, Barshis DJ, Seneca F, Jaris H, Therkildsen NO, Morikawa
M, Palumbi SR (2012) The simple fool’s guide to population genomics via RNA-Seq: An
introduction to high-throughput sequencing data analysis. Mol Ecol Res 12(6):1058–1067.
http://dx.doi.org/10.1111/1755-0998.12003
47. Deelman E, Gannon D, Shields M, Taylor I (2009) Workflows and e-science: An overview of
workflow system features and capabilities. Futur Gener Comput Syst 25(5):528–540
Chapter 9
Workload Consolidation Through Automated
Workload Scheduling
9.1 Introduction
The two important workflow scheduling objectives considered in this section are
cost and energy. The algorithm design and its comparison with other existing algo-
rithms are also highlighted in the following sections.
The Proportional Deadline Constrained (PDC) algorithm helps to meet the deadline
constraints at minimum cost. PDC consists of the following four steps:
Workflow leveling: The tasks in the workflow are grouped into levels.
Deadline distribution: The user-defined deadline is divided among the levels so
that each level gets its own sub-deadline.
Task selection: A task is selected for execution based on its priority in the ready
queue.
Instance selection: The instances are selected in such a way that the deadlines are
met with the minimum cost.
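The first two steps can be sketched in Python. This is a minimal sketch under
stated assumptions, not the published PDC formulas: the workflow is given as a
dictionary that maps each task to its list of predecessors, each task has an
estimated runtime, and each level receives a share of the deadline proportional to
its longest task. The names level_tasks and distribute_deadline are hypothetical.

```python
from collections import defaultdict


def level_tasks(parents):
    """Group tasks into levels: a task's level is 1 + the deepest parent level."""
    levels = {}

    def level_of(task):
        if task not in levels:
            preds = parents.get(task, [])
            levels[task] = 0 if not preds else 1 + max(level_of(p) for p in preds)
        return levels[task]

    grouped = defaultdict(list)
    for task in parents:
        grouped[level_of(task)].append(task)
    return [sorted(grouped[i]) for i in sorted(grouped)]


def distribute_deadline(levels, runtime, deadline):
    """Return the cumulative sub-deadline of every level.

    Assumption: a level's share is proportional to its longest task.
    """
    weights = [max(runtime[t] for t in level) for level in levels]
    total = sum(weights)
    sub_deadlines, elapsed = [], 0.0
    for w in weights:
        elapsed += deadline * w / total
        sub_deadlines.append(elapsed)
    return sub_deadlines


# Hypothetical diamond-shaped workflow: A -> (B, C) -> D
parents = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
runtime = {"A": 10, "B": 20, "C": 30, "D": 10}
levels = level_tasks(parents)
sub_deadlines = distribute_deadline(levels, runtime, deadline=100)
```

With these sample values the three levels receive 20%, 60%, and 20% of the
deadline, so a task in the second level must finish by time 80.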
In order to evaluate the performance of the PDC algorithm, the following scien-
tific workflow structures are used:
1. CyberShake
2. Montage
3. LIGO
CyberShake: This seismological workflow was used by the Southern California
Earthquake Center (SCEC) to characterize earthquake hazards [4, 5]. The
structure is shown in Fig. 9.2.
Montage: Montage is an astronomy application in which mosaic images of the sky
are generated, thereby providing a detailed understanding of a portion of the
sky. The structure is shown in Fig. 9.4.
LIGO: Laser Interferometer Gravitational-Wave Observatory (LIGO) (refer to
Fig. 9.3) is a data-intensive workflow which is used to find gravitational wave
signatures in the data.
Thus, the four stages of the PDC algorithm focus on meeting the deadline
constraints at minimum cost. With PDC, a viable schedule can be constructed even
under tight deadlines while keeping the cost low. With the help of the PDC
algorithm, e-Science workflows on clouds such as Cumulus [6, 7] and Nimbus have met
their deadlines at minimum cost.
Fig. 9.2 CyberShake
Fig. 9.3 LIGO
Fig. 9.4 Montage
Particle swarm optimization (PSO) is a heuristic algorithm that schedules cloud
workflow applications with minimum computation cost and data transmission cost. In
PSO, each particle has its own fitness value, which is evaluated by the fitness
function to get an optimized result.
PSO is a self-adaptive, population-based global search algorithm without any
direct recombination [8]; instead, it relies on the social behavior of the
particles. PSO generates a number of candidate solutions, known as particles, and
moves them around in a finite search space. Each particle is drawn toward its own
best-known position and toward the best-known position of the entire population.
As particles discover better positions, these guide the movement of the other
particles, so that the swarm as a whole moves toward the best solutions.
PSO starts with a set of random particles and then searches for the optimum by
updating generations. In every iteration, each particle is updated using two best
values. The first is the best fitness value that the particle itself has achieved
so far, termed pbest. The second is the best value achieved by any particle within
the flock, termed gbest. After finding pbest and gbest, the particle velocity
(v) is adjusted with the help of the following equation:
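In the widely used inertia-weight form of PSO (assumed here; the chapter may use
a variant), the update is v ← w·v + c1·r1·(pbest − x) + c2·r2·(gbest − x),
followed by x ← x + v, where x is the particle position, w is the inertia weight,
c1 and c2 are learning factors, and r1 and r2 are random numbers in [0, 1). A
minimal Python sketch of one update step follows; the function name and the
default values of w, c1, and c2 are illustrative assumptions.

```python
import random


def update_particle(x, v, pbest, gbest, w=0.7, c1=2.0, c2=2.0, rng=random):
    """Apply one inertia-weight PSO velocity and position update to a particle.

    x, v, pbest, and gbest are equal-length lists of floats. rng is any object
    with a random() method that returns a float in [0, 1).
    """
    new_v = []
    for xi, vi, pi, gi in zip(x, v, pbest, gbest):
        r1, r2 = rng.random(), rng.random()
        # Inertia term + pull toward own best + pull toward the swarm's best
        new_v.append(w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi))
    new_x = [xi + vi for xi, vi in zip(x, new_v)]
    return new_x, new_v


class FixedRng:
    """Deterministic stand-in for the random generator, used for illustration."""

    def random(self):
        return 0.5


demo_x, demo_v = update_particle([0.0], [1.0], [2.0], [4.0],
                                 w=0.5, c1=2.0, c2=2.0, rng=FixedRng())
```

A particle that already sits on both pbest and gbest with zero velocity does not
move, which is the fixed point of the update.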
9.3.3 Hybrid Cloud Optimized Cost (HCOC) Schedule Algorithm
The HCOC algorithm reduces the makespan to fit the desired execution time or
deadline within a reasonable cost. It decides which resources have to be leased
from the public cloud so that the cost of the allocated tasks is minimized. A
hybrid cloud arises when public and private clouds are combined. In such hybrid
clouds, the HCOC algorithm is used in such a way that both the cost and the
execution time are minimized.
Importance of Scheduling Inside Hybrid Clouds
Scheduling plays an important role in managing the jobs and workflow tasks inside
hybrid clouds. In a hybrid cloud, some of the tasks are scheduled on resources in
the private cloud, whereas the remaining tasks, for which suitable resources are
not available in the private cloud, are allocated to resources in the public cloud.
Workflow Scheduling Using HCOC Algorithm
The HCOC algorithm is implemented using the following steps:
Initial schedule:
  Schedule the workflow within the private cloud.
While the makespan is greater than the deadline:
  Select the tasks to reschedule.
  Select resources from the public cloud and add them to the private resources to
  compose the hybrid cloud H.
  Reschedule the selected tasks in H.
The two main steps in this algorithm are the selection of the tasks for
rescheduling and the selection of the resources from the public cloud that compose
the hybrid cloud. HCOC is a multi-core-aware algorithm which is used for
cost-efficient scheduling of multiple workflows [9]. This algorithm is more
efficient when compared to the others because it supports multiple workflows. It
targets the users' QoS parameters, namely cost optimization within the
user-specified deadline and the user-specified budget [9].
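The HCOC loop described in the steps above can be sketched in Python under strong
simplifying assumptions: tasks are independent (the actual algorithm schedules a
DAG), every task is rescheduled in each round instead of a selected subset, and
public resources are leased cheapest first. The helper names, speeds, and prices
are hypothetical and serve only as an illustration.

```python
def schedule(tasks, resources):
    """Greedy list scheduling: longest task first, onto the resource that finishes it earliest.

    tasks maps a task name to its amount of work; resources maps a resource name
    to its speed. Returns the placement and the resulting makespan.
    """
    ready = {name: 0.0 for name in resources}
    placement = {}
    for task, work in sorted(tasks.items(), key=lambda kv: -kv[1]):
        best = min(resources, key=lambda r: ready[r] + work / resources[r])
        ready[best] += work / resources[best]
        placement[task] = best
    return placement, max(ready.values())


def hcoc_sketch(tasks, private, public_offers, deadline):
    """Lease public resources, cheapest first, until the makespan fits the deadline."""
    hybrid = dict(private)  # H starts as the private cloud only
    offers = sorted(public_offers, key=lambda o: o["price"])
    placement, makespan = schedule(tasks, hybrid)  # initial schedule
    leased = []
    while makespan > deadline and offers:
        offer = offers.pop(0)
        hybrid[offer["name"]] = offer["speed"]  # compose the hybrid cloud H
        leased.append(offer["name"])
        placement, makespan = schedule(tasks, hybrid)  # reschedule in H
    return placement, makespan, leased


tasks = {"t1": 40, "t2": 30, "t3": 20, "t4": 10}
private = {"p1": 1.0}
public_offers = [
    {"name": "large", "speed": 2.0, "price": 0.4},
    {"name": "small", "speed": 1.0, "price": 0.1},
]
placement, makespan, leased = hcoc_sketch(tasks, private, public_offers, deadline=50)
```

In this example the private cloud alone needs 100 time units, and leasing the
cheaper public instance is enough to reach the deadline of 50, so the more
expensive instance is never leased.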
Ant colony optimization (ACO) is a heuristic algorithm based on the foraging
behavior of ant colonies. Initially, the ants search for food in random directions.
When an ant finds a path to a food source, it leaves a chemical substance called
pheromone on the path while returning to its nest. Pheromone evaporates over time,
so its density is lower on longer paths. A high pheromone density indicates that
many ants have used the path, and hence the following ants will also use it. A low
pheromone density means that the path to the food source is longer and that few
ants have used it, so the path is rejected. The path with the highest density of
pheromone is selected as the optimal solution.
Ant Colony Optimization (ACO) in Workflow Scheduling
In cloud scheduling, each path represents a cloud schedule. The main steps to
focus on here are the local pheromone update and the global pheromone update.
ACO Algorithm
Step 1: Collect information about the tasks (n) and virtual machines (m).
Step 2: Initialize the expected time to compute (ETC) values.
Step 3: Initialize the following parameters:
  Step 3.1: Set α = 1, β = 1, Q = 100, pheromone evaporation rate (ρ) = 0.5,
  and number of ants = 100.
  Step 3.2: Set optimal solution = null and epoch = 0.
  Step 3.3: Initialize the pheromone trail value τ_ij = c.
Step 4: Repeat until each ant k in the colony finds a VM for running all tasks:
  Step 4.1: Put the starting VM in tabu_k.
  Step 4.2: Calculate the probability for selecting a VM.
Step 5: Calculate the makespan for the schedule and select the best schedule
based on makespan.
Step 6: Update the pheromone trail value:
  Step 6.1: Compute the quantity of pheromone deposited.
Step 7: If the maximum epoch is reached, stop; the optimal schedule is the
schedule with the best makespan.
Step 8: Otherwise, increment the epoch and go to Step 4.
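The steps above can be sketched in Python. This is a simplified illustration, not
the exact algorithm: the tabu list of Step 4.1 is omitted, each ant simply picks
one VM per task with probability proportional to τ^α · (1/ETC)^β, and every ant
deposits Q divided by its makespan. The function name, the initial trail value
c = 1, the number of epochs, and the random seed are assumptions.

```python
import random


def aco_schedule(etc, ants=100, epochs=20, alpha=1.0, beta=1.0, rho=0.5, q=100.0, seed=0):
    """Assign each task to a VM; etc[i][j] is the expected time of task i on VM j."""
    rng = random.Random(seed)
    n, m = len(etc), len(etc[0])
    tau = [[1.0] * m for _ in range(n)]  # pheromone trail, initial value c = 1
    best, best_span = None, float("inf")
    for _ in range(epochs):
        tours = []
        for _ in range(ants):
            load, assign = [0.0] * m, []
            for i in range(n):
                # Selection weight: pheromone^alpha * (1 / ETC)^beta
                weights = [tau[i][j] ** alpha * (1.0 / etc[i][j]) ** beta
                           for j in range(m)]
                j = rng.choices(range(m), weights=weights)[0]
                assign.append(j)
                load[j] += etc[i][j]
            tours.append((max(load), assign))  # makespan of this ant's schedule
        span, assign = min(tours)
        if span < best_span:
            best, best_span = assign, span
        # Evaporate, then let every ant deposit Q / makespan on the pairs it used
        tau = [[(1.0 - rho) * t for t in row] for row in tau]
        for span, assign in tours:
            for i, j in enumerate(assign):
                tau[i][j] += q / span
    return best, best_span


# Two tasks, two VMs: each task is fast on exactly one VM
best, best_span = aco_schedule([[1.0, 100.0], [100.0, 1.0]])
```

Shorter schedules deposit more pheromone, so task-to-VM pairs that appear in good
schedules become more likely to be chosen in later epochs.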
9.3.5 Customer-Facilitated Cost-Based Scheduling in Cloud (CFSC)
The main objective is to minimize the total monetary cost and to balance the load.
Both the service providers and the customers receive economic benefits only if the
resources are properly scheduled. This is mainly done by the cost and the data
scheduler in the workflow systems.
The following assumptions are made in the CFSC algorithm:
• A set of heterogeneous virtual machines (VMs) denoted by M are considered for
creating cloud environment.
• The communication network is always connected.
• Tasks are executed normally and there are no failures.
• Tasks are non-preemptive.
CFSC Algorithm
Step 1: Compute the Rank_u (b-level, i.e., the length of the longest path from a
node to the exit node) of all the nodes.
Step 2: Arrange the nodes in a list by decreasing order of Rank_u values.
Step 3: Arrange the virtual machine list by pricing, in ascending order.
Step 4: Calculate the MEFT value for all the nodes.
Step 5: Repeat Steps 6.1, 6.2, and 6.3 until all the nodes in the list are
scheduled.
Step 6: Begin:
  Step 6.1: Remove the first task v_i from the list.
  Step 6.2: Find the MEFT value of task v_i for all virtual machines.
  Step 6.3: Find the vm_j which has the minimum MEFT value for task v_i, and
  assign v_i to vm_j.
CFSC minimizes the cost by using a simpler cost function. The CFSC algorithm can
be evaluated in simulation on workflows such as Montage, LIGO 1, and LIGO 2. Using
this kind of heuristic algorithm, the makespan can also be minimized.
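The CFSC steps can be sketched in Python as follows. MEFT is not defined in this
section, so the sketch assumes that it is the earliest finish time of a task on a
VM and that ties are broken in favor of the cheaper VM; communication costs are
ignored. The function names and the sample workflow, speeds, and prices are
hypothetical.

```python
def upward_rank(succ, work):
    """Rank_u (b-level): length of the longest path from each node to the exit node."""
    rank = {}

    def visit(v):
        if v not in rank:
            rank[v] = work[v] + max((visit(s) for s in succ[v]), default=0.0)
        return rank[v]

    for v in succ:
        visit(v)
    return rank


def cfsc_sketch(succ, work, vms):
    """vms is a list of (name, speed, price). Returns {task: (vm_name, finish_time)}."""
    rank = upward_rank(succ, work)                      # Step 1
    order = sorted(succ, key=lambda v: -rank[v])        # Step 2
    vms = sorted(vms, key=lambda vm: vm[2])             # Step 3: cheapest first
    pred = {v: [u for u in succ if v in succ[u]] for v in succ}
    ready = {name: 0.0 for name, _, _ in vms}
    result = {}
    for v in order:                                     # Steps 5 and 6
        data_ready = max((result[u][1] for u in pred[v]), default=0.0)
        finish = {name: max(ready[name], data_ready) + work[v] / speed
                  for name, speed, _ in vms}
        best = min(finish, key=finish.get)              # ties go to the cheaper VM
        ready[best] = finish[best]
        result[v] = (best, finish[best])
    return result


succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
work = {"A": 10, "B": 20, "C": 30, "D": 10}
vms = [("fast", 2.0, 0.4), ("slow", 1.0, 0.1)]
ranks = upward_rank(succ, work)
result = cfsc_sketch(succ, work, vms)
```

Because a task always has a larger Rank_u than its successors when all work
values are positive, processing the list in decreasing rank order guarantees that
every predecessor is scheduled before the tasks that depend on it.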
User Selection
This is the component selection process based on some behavior or attribute. Here,
tasks of a similar type are selected and scheduled collectively. For cost-based
scheduling, the following steps are applied:
For each user request:
Step 1: Compute the start time of each user.
Step 2: Select and schedule the user with the minimum cost time.
Step 3: If several users have the same task time, check the count.
Step 4: For users with the same count, use the cost-based count.
Step 5: Select and schedule the user with the maximum cost-based time.
Step 6: If the cost-based count is also the same, use the id and update the task.
Step 7: The time limit is not needed since it is a time-based cost.
It is observed that this algorithm is efficient for both the users and the service
providers.
Workflow scheduling is one of the major tasks in cloud computing. This section
has surveyed various cost-based workflow scheduling algorithms, which are
consolidated in Table 9.1. In order to achieve better results, the algorithms should
be designed in such a way that they focus on both the execution time and the cost.
Moving workflows to the cloud makes it possible to use various cloud services to
facilitate workflow execution [10].
In the cloud system, the computing resources are provisioned as virtual machines.
The VMs are generally offered with various specifications and configuration
parameters, including the number of CPU cores, the amount of memory, and the disk
capacity. The tasks in a workflow are interdependent and must wait for data from
their predecessor tasks before they can execute. Therefore, some computing
resources are inevitably idle during some time slots, which decreases the resource
utilization of cloud data centers. This low resource utilization leads to energy
waste.
The energy utilization of cloud platforms has received much attention in recent
years, because scientific workflow executions in the cloud consume a large amount
of energy. The energy utilized by the cloud resources assigned to the execution of
the workflow (i.e., the virtual machines) can be classified into three components,
namely, the processing energy, the storage energy, and the data communication
energy. The processing energy is the energy consumed by the virtual machine to
execute a specific task. The storage energy is the energy needed to store data on
a permanent storage device (disk). The communication energy is the energy needed
to transfer data from one virtual machine to another over the network.
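The three components can be combined into a simple additive model. The sketch
below is only an illustration: the linear form of each term, the function name,
and all coefficient values are assumptions, not figures from the literature
surveyed here.

```python
def workflow_energy(exec_seconds, stored_gb, storage_seconds, transferred_gb,
                    p_cpu_watts=100.0, p_disk_watts_per_gb=0.01, joules_per_gb=50.0):
    """Estimate the energy (in joules) that one VM spends on a workflow task.

    processing    = CPU power * execution time
    storage       = disk power per GB * stored volume * storage time
    communication = energy per GB * transferred volume
    All three coefficients are hypothetical placeholders.
    """
    processing = p_cpu_watts * exec_seconds
    storage = p_disk_watts_per_gb * stored_gb * storage_seconds
    communication = joules_per_gb * transferred_gb
    return {
        "processing": processing,
        "storage": storage,
        "communication": communication,
        "total": processing + storage + communication,
    }


energy = workflow_energy(exec_seconds=10, stored_gb=2, storage_seconds=100,
                         transferred_gb=4)
```

A scheduler that compares candidate placements can evaluate such a function for
each placement and prefer the one with the lowest total.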
Moreover, a tremendous amount of energy is expended in a cloud data center to run
servers, processor cooling fans, consoles, monitors, network peripherals, lighting
systems, and cooling systems. The energy consumption of data centers also causes a
large amount of CO2 emission. In addition to negative environmental implications,
increased power consumption may lead to system failures caused by power capacity
overload or system overheating. The workflow scheduling algorithm plays a major
role in the successful execution of tasks in cloud environments. It assigns
interdependent tasks to distributed resources and controls their execution. It
maps appropriate resources to workflow tasks so that their execution satisfies the
objective functions imposed by users. Scheduling is important for maximizing the
utilization of resources through an appropriate assignment of tasks to the
available resources such as CPU, memory, and storage. Efficient scheduling
algorithms are therefore necessary to overcome the energy issues in clouds. This
section mainly focuses on the different energy-efficient workflow scheduling
algorithms that have been developed to minimize the energy consumption.
Energy management in clouds has been one of the important research interests in
recent times. A large number of energy-aware workflow scheduling algorithms are
in existence for executing workflow applications in clouds. Generally, these sched-
uling algorithms can be divided into two categories: meta-heuristic approaches and
heuristic approaches.
Xiaolong Xu et al. [11] proposed an energy-aware resource allocation method,
named EnReal, to address the energy consumption problem while deploying scien-
tific workflows across multiple cloud computing platforms. This method focuses on
the optimal dynamic deployment of virtual machines for executing the scientific
workflows in an energy-aware fashion. The energy problems are brought to light as
it is estimated that cloud data centers cause about 2% of global CO2 emission.
The energy consumption in the cloud environment is divided into two major
categories, i.e., energy consumption for application execution and energy consump-
tion for dynamic operations. The energy consumption for application execution
includes physical machine (PM) baseline energy consumption, the energy con-
sumed by active virtual machines (VMs), idle VMs, and internal communications
and external communications between the VMs. The energy consumption for
dynamic operations refers to the PM mode switch operations. If all VMs on a single
PM are unused, the PM may be put into one of two modes, the low-power mode or
the sleep mode, based on the service time for hosting the allocated applications.
The EnReal method takes both types of energy consumption into consideration and
aims at reducing both. The CloudSim framework is used to evaluate the performance
of the energy-aware resource allocation method.
To demonstrate the performance of the proposed EnReal method for scientific
workflow executions, it is compared with a baseline method named BFD-M and two
existing state-of-the-art methods, i.e., the greedy task scheduling algorithm and the
energy-aware greedy algorithm. There are two main focuses in the experimental
comparison, i.e., resource utilization and energy consumption. The EnReal method
achieves more energy savings by employing both migration-based resource alloca-
tion and physical machine mode switch operations.
Zhongjin Li et al. [12] designed a cost- and energy-aware scheduling (CEAS)
algorithm intended to minimize the energy and cost of deadline-constrained,
data-intensive, and computation-intensive big data applications that require many
hours to execute. Energy-related research has gained importance because cloud
providers spend 50% of the management budget on powering and cooling the physical
servers and because of growing environmental issues such as global warming due to
CO2 emissions.
The CEAS algorithm consists of five sub-algorithms. First, the VM selection
algorithm maps each task to the optimal VM type to reduce the monetary cost. Then,
two task merging methods, sequence task merging and parallel task merging, are
employed to reduce the execution cost and energy consumption of the workflow.
Further, the VM reuse policy is used to reuse the idle time of leased VMs to
execute the current executable tasks. Finally, the task slacking algorithm
reclaims slack time by lowering the voltage and frequency of leased VM instances
to reduce energy consumption.
The performance of the CEAS algorithm is evaluated using CloudSim. Four scientific
workflows, Montage, LIGO, SIPHT, and CyberShake, are taken for experimental
analysis. The CEAS algorithm is compared with three algorithms, the Heterogeneous
Earliest Finish Time (HEFT) algorithm, the enhanced energy-efficient scheduling
(EES) algorithm, and the enhancing HEFT (EHEFT) algorithm, to demonstrate the
energy-saving performance of the proposed algorithm.
The comparison results (refer to Table 9.2) show that the CEAS algorithm reduces
the monetary cost by finding the proper VM using the task merging algorithm. The
VM reuse algorithm lets many tasks reuse idle VM instances, which helps in
reducing the energy consumption.
Sonia Yassa et al. [13] suggest a new method to minimize the energy consumption
while scheduling workflow applications on heterogeneous computing systems such as
cloud computing infrastructures. It also considers two other Quality of Service
(QoS) parameters, deadline and budget. The algorithm devised in this work is a
discrete version of multi-objective particle swarm optimization (MOPSO) combined
with the Dynamic Voltage and Frequency Scaling (DVFS) technique. DVFS minimizes
energy consumption by allowing the processors to operate at different voltage
supply levels at the expense of clock frequency.
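The trade-off behind DVFS can be made concrete with the standard CMOS dynamic
power model, P = C·V²·f, in which running at a lower voltage and frequency takes
longer but consumes less energy per cycle. The sketch below is not taken from
[13]; it simply picks the operating level with the lowest dynamic energy that
still meets a deadline. The function name, the capacitance value, and the
voltage/frequency pairs are hypothetical.

```python
def pick_dvfs_level(cycles, deadline, levels, capacitance=1e-9):
    """Pick the lowest-energy (voltage, frequency) level that meets the deadline.

    levels is a list of (voltage_volts, frequency_hz) pairs. Returns a tuple
    (level, energy_joules, runtime_seconds), or None if no level is fast enough.
    Only dynamic power is modeled; static (leakage) power is ignored.
    """
    feasible = []
    for volts, freq in levels:
        runtime = cycles / freq
        if runtime <= deadline:
            energy = capacitance * volts ** 2 * freq * runtime  # P_dynamic * t
            feasible.append((energy, (volts, freq), runtime))
    if not feasible:
        return None
    energy, level, runtime = min(feasible)
    return level, energy, runtime


levels = [(1.2, 2e9), (1.0, 1e9), (0.8, 0.5e9)]
choice = pick_dvfs_level(cycles=2e9, deadline=2.5, levels=levels)
```

In this example the slowest level would miss the deadline, so the middle level is
chosen: it doubles the runtime of the task compared with the fastest level but
uses less energy.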
The experimental evaluation is done using the pricing models of Amazon EC2
and Amazon CloudFront. The algorithm is compared against HEFT heuristic. The
results show that the proposed DVFS-MODPSO is able to produce a set of solutions
named Pareto solutions (i.e., nondominated solutions) enabling the user to select the
desired trade-off.
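A nondominated (Pareto) set can be extracted from candidate schedules with a few
lines of Python. In this sketch each candidate is a tuple of objective values that
are all to be minimized, for example (makespan, cost, energy); the function name
and the sample values are illustrative assumptions.

```python
def pareto_front(solutions):
    """Keep the solutions that no other solution dominates (all objectives minimized)."""

    def dominates(a, b):
        # a dominates b if it is no worse in every objective and better in at least one
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))

    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


# Hypothetical (makespan, cost, energy) values of three candidate schedules
candidates = [(10, 5, 8), (12, 4, 7), (11, 6, 9)]
front = pareto_front(candidates)
```

The third candidate is worse than the first in every objective and is discarded,
while the first two remain because each is better than the other in at least one
objective.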
Khadija Bouselmi et al. [14] proposed an approach that focuses on the minimiza-
tion of network energy consumption, as the network devices consume up to one-third
of the total energy consumption of cloud data centers. There are two major steps
involved in this process.
In the first step, a Workflow Partitioning for Energy Minimization (WPEM) algo-
rithm is utilized for reducing the network energy consumption of the workflow and
the total amount of data communication while achieving a high degree of parallel-
ism. In the second step, the heuristic of Cat Swarm Optimization is used to schedule
the generated partitions in order to minimize the workflow’s overall energy con-
sumption and execution time.
The simulation environment used for evaluation is the CloudSim toolkit. Two
other workflow partitioning algorithms are implemented to evaluate and compare
the test results of WPEM approach. The first algorithm denoted as Heuristic 1 is the
multi-constraint graph partitioning algorithm. To limit the maximum number of par-
titions generated by Heuristic 1, another version of it, denoted as Heuristic 2, which
sets the maximum number of partitions, is implemented.
The performance evaluation results suggest that the proposed WPEM algorithm
markedly reduces the total energy consumption and the additional energy
consumption incurred by workflow partitioning, by reducing the amount of data
communication between the partitions and the number of VMs used.
The authors, Peng Xiao et al. [15], present a novel heuristic called Minimized
Energy Consumption in Deployment and Scheduling (MECDS) for scheduling
data-intensive workflows in virtualized cloud platforms. This algorithm aims at
reducing the energy consumption of intensive data accessing. The data-intensive
workflow will generate a large volume of intermediate data, which requires being
stored in independent storage nodes. When running a data-intensive workflow, input
data of an activity node is transferred from an independent storage node to the exe-
cution node, and output data is transferred back to the original storage node or oth-
ers. Such an interweaving makes it more complex for scheduling data-intensive
workflows.
The algorithm consists of two phases: VM deployment and DAG scheduling. In the VM
deployment phase, physical resources are mapped into a number of VM instances, so
the original power model of a physical machine has to be translated into a
VM-oriented power model. In the DAG scheduling phase, activities in the workflow
are assigned to a set of VM instances; therefore, the total energy consumption
depends on the VM power models and the execution time of each VM.
The experimental comparison with MMF-DVFS, HEFT, ECS+idle, and EADAGS shows that
the proposed algorithm is more robust than the other algorithms when the system
faces intensive data-accessing requests.
Huangke Chen et al. [16] propose an energy-efficient online scheduling algorithm
(EONS) for real-time workflows. This algorithm is used to improve the energy
efficiency and provide high resource utilization. The major contribution of this
work is to schedule the workflows to VMs while improving the VMs' resource
utilization and ensuring the timeliness of workflows. It also emphasizes scaling
the computing resources up and down dynamically with the variation of the system
workload to enhance the energy efficiency of cloud data centers.
The CloudSim toolkit is used for the experimentation. The performance of this
algorithm is compared with three other algorithms: EASA, HEFT, and ESFS. The
experimental results show that EONS achieves better performance in terms of energy
saving and resource utilization while guaranteeing the timing requirements of
workflows.
Guangyu Du et al. [17] aim at implementing an energy-efficient task scheduling
algorithm, as scheduling plays a very important role in successful task execution
and energy consumption in virtualized environments. The energy issues are taken
into consideration in this work as large numbers of computing servers containing
virtual machines of data centers consume a tremendous amount of energy.
The proposed algorithm has two main objectives. The first objective is to assign
as many tasks as possible to virtual machines with lower energy consumption, and
the second objective is to keep the makespan of each virtual machine within a
deadline. The experimental evaluation is done using the CloudSim toolkit, and the
results compared against HEFT, GA, and HPSO show that there is an effective
reduction in energy consumption and that the tasks are completed within the
deadline.
Zhuo Tang et al. [18] formulated a DVFS-enabled energy-efficient workflow task
scheduling (DEWTS) algorithm. This algorithm is mainly used to reduce energy
consumption while maintaining the quality of service by meeting the deadlines. It
initially calculates the scheduling order of all tasks and obtains the overall
makespan and deadline based on the Heterogeneous Earliest Finish Time (HEFT)
algorithm. Then, by re-sorting the processors according to their number of running
tasks and their energy utilization, the underutilized processors can be merged by
closing the last node and redistributing the tasks assigned to it. Later, in the
task slacking phase, the tasks can be distributed in the idle slots by leveraging
the DVFS technique.
The experimental evaluation is done using CloudSim framework, and the pro-
posed algorithm is compared against EES and HEFT algorithm. Compared to the
other two algorithms, DEWTS provides better results as the tasks can be distributed
in the idle slots under a lower voltage and frequency, without violating the depen-
dency constraints and increasing the slacked makespan.
Yonghong Luo and Shuren Zhou [19] describe a power consumption optimiza-
tion algorithm for cloud workflow scheduling based on service-level agreement
(SLA). The main objective of this work is to reduce power consumption while meet-
ing the performance-based constraints of time and cost. This algorithm works by
searching for all feasible scheduling solutions of cloud workflow application with
critical path, and then the optimal scheduling solution can be found out by calculat-
ing total power consumption for each feasible scheduling solution.
The authors, Hong He and Dongbo Liu [20], presented a new approach to deal with
the data-accessing-related energy of data-intensive workflow applications that are
executed in cloud environments, since existing studies mostly focus on CPU-related
energy consumption. A novel heuristic named Minimal Data-Accessing Energy Path
(MDEP) is designed for scheduling data-intensive workflows in virtualized cloud
platforms. The proposed scheduling algorithm consists of two distinct phases:
first, it uses the MDEP heuristic for deploying and configuring VM instances with
the aim of reducing the energy spent on accessing intermediate data; second, it
schedules workflow activities to the VM instances according to the VM power model.
This method uses a concept called Minimized Energy Consumption in Deployment and
Scheduling (MECDS), which selects a storage node with the aim of obtaining minimal
data-accessing energy consumption for the current activities.
The experiments performed mainly focus on the characteristic of workflow
energy consumption and execution performance. The experimentation is done in the
CloudSim environment. The algorithm is compared against HEFT, MMF-DVFS,
ECS+idle, and EADAGS. To further investigate the energy-efficiency, three mea-
surements are introduced: Effective Computing Energy Consumption (ECEC),
Effective Data Accessing Energy Consumption (EDAEC), and Ineffective Energy
Consumption (IEEC).
The results show that the proposed approach can significantly reduce the energy
consumed in storing and retrieving the intermediate data generated during the
execution of data-intensive workflows and that it offers better robustness.
Table 9.3 summarizes the energy-efficient algorithms discussed above.
9.6 Conclusion
References
1. Kataria D, Kumar S (2015) A study on workflow scheduling algorithms in cloud. Int J Res
Appl Sci Technol 3(8):268–273
2. Tao J, Kunze M, Rattu D, Castellanos AC (2008) The Cumulus project: build a scientific cloud
for a data center. In: Cloud computing and its applications, Chicago
3. Chopra N, Singh S (2014) Survey on scheduling in hybrid clouds. In: 5th ICCCNT 2014, 11–13 July,
Hefei, China
4. Abawajy JH (2004) Fault-tolerant scheduling policy for grid computing systems. In:
Proceedings of parallel and distributed processing symposium, 2004, 18th international, IEEE,
p 238
5. Selvarani S, Sudha Sadhasivam G (2010) Improved cost based algorithm for task scheduling
in cloud computing. In: IEEE
6. Abrishami S, Naghibzadeh M, Epema DH (2012) Cost-driven scheduling of grid workflows
using partial critical paths. IEEE Trans Parallel Distrib Syst 23(8):1400–1414
7. Choudhary M, Sateesh Kumar P (2012) A dynamic optimization algorithm for task scheduling
in cloud environment. 2, 3, pp.2564-2568
8. Calheiros RN, Ranjan R, De Rose CAF, Buyya R (2009) CloudSim: a novel framework for
modeling and simulation of cloud computing infrastructures and services. arXiv preprint
arXiv:0903.2525
9. George Amalarethinam DI, Joyce Mary GJ (2011) DAGEN–A tool to generate arbitrary
directed acyclic graphs used for multiprocessor scheduling. Int J Res Rev Comput Sci
(IJRRCS) 2(3):782
10. Jangra A, Saini T (2013) Scheduling optimization in cloud computing. Int J Adv Res Comput
Sci Softw Eng 3(4)
11. Xiaolong Xu, Wanchun Dou, Xuyun Zhang, Jinjun Chen (2016) EnReal: an energy-aware
resource allocation method for scientific workflow executions in cloud environment. IEEE
Trans Cloud Comput 4(2):166–179
12. Zhongjin Li, Jidong Ge, Haiyang Hu, Wei Song, Hao Hu, Bin Luo Cost and energy aware
scheduling algorithm for scientific workflows with deadline constraint in clouds. In: IEEE
Transactions on Services Computing. doi:10.1109/TSC.2015.2466545
13. Yassa S, Chelouah R, Kadima H, Granado B (2013) Multi-objective approach for energy-
aware workflow scheduling in cloud computing environments. Hindawi Publishing
Corporation. ScientificWorld J 2013, Article ID 350934:13. http://dx.doi.org/10.1155/
2013/350934
14. Bouselmi K, Brahmi Z, Gammoudi MM (2016) Energy efficient partitioning and scheduling
approach for scientific workflows in the cloud. In: IEEE international conference on services
computing, doi:10.1109/SCC.2016.26
15. Peng Xiao, Zhi-Gang Hu, Yan-Ping Zhang (2013) An energy-aware heuristic scheduling for
data-intensive workflows in virtualized datacenters. J Comput Sci Technol 28(6):948–961.
doi:10.1007/s11390-013-1390-9
16. Huangke Chen, Xiaomin Zhu, Dishan Qiu, Hui Guo, Laurence T. Yang, Peizhong Lu (2016)
EONS: minimizing energy consumption for executing real-time workflows in virtualized
cloud data centers, 2332-5690/16 $31.00 © 2016 IEEE DOI 10.1109/ICPPW.2016.60
17. Guangyu Du, Hong He, Qinggang Meng (2014) Energy-efficient scheduling for tasks with
Deadline in virtualized environments. Math Probl Eng 2014, Article ID 496843: 7. http://dx.
doi.org/10.1155/2014/496843
18. Zhuo Tang, Ling Qi, Zhenzhen Cheng, Kenli Li, Samee U. Khan, Keqin Li (2015) An energy-
efficient task scheduling algorithm in DVFS-enabled cloud environment, doi 10.1007/s10723-
015-9334-y, © Springer Science+Business Media Dordrecht.
19. Yonghong Luo, Shuren Zhou (2014) Power consumption optimization strategy of cloud work-
flow scheduling based on SLA. WSEAS Trans Syst 13: 368–377, E-ISSN: 2224-2678
20. He H, Liu D (2014) Optimizing data-accessing energy consumption for workflow applications
in clouds. Int J Future Gener Commun Netw 7(3):37–48
Chapter 10
Automated Optimization Methods
for Workflow Execution
10.1 Introduction
Scientific workflows have emerged as a key technology that assists scientists in
performing their experiments. They offer a higher level of abstraction,
automation, comprehensiveness, and a graphical alternative to existing
script-based programming. A workflow can also be used to reproduce an entire piece
of work without recreating the experiment. However, workflows rely on several
technical choices that lay the foundation for optimization toward speedup,
robustness, and compactness for the scientists. Optimization can have different
focuses, such as topology, runtime, and output optimization. Engineering
frameworks offer a large variety of optimization algorithms, so users must test
different search methods and apply improved and novel techniques. Developers
therefore need a platform that offers abstraction and general mechanisms to
provide workflow optimization for different algorithms and levels.
Section 10.2 illustrates the significance of workflow optimization during workflow
execution. It also highlights some of the state-of-the-art optimization approaches
practiced in the literature.
Section 10.3 describes the inclusion of an optimization phase in the traditional
workflow life cycle and outlines the possible optimization levels, such as
parameter, component, topology, runtime, and output optimization.
Sections 10.4 and 10.5 explain the execution of the Taverna optimization framework
on single and distributed infrastructure. Workflow parameter optimization
techniques such as genetic algorithms, ant colony optimization, and Particle Swarm
Optimization are explained in Sect. 10.6. The detailed control flow of the
optimization plug-in is described in Sect. 10.7. Following that, Sect. 10.8 validates the
Fig. 10.1 Modified workflow life cycle
10.3 Modified Workflow Life Cycle and Optimization Levels
The newly introduced optimization phase is not limited to a particular type of opti-
mization. Optimizations can target different levels, namely, parameter optimization,
component optimization, and topological optimization. Each optimization level can
be achieved by different optimization algorithms.
Sometimes two or more algorithms implement an equivalent function but with
different characteristics, so that each one is optimal for a particular type of
input data. Component optimization attempts to find the algorithm that best
fulfills the requirements. For example, local sequence alignment can be
implemented either by BLAST [26] or by the profile alignment method PSI-BLAST
[27], and the user has to select whichever is best. Component optimization also
supports the null component: components with the same input and output can be
switched on and off so that their efficiency and utility can be tested. Parameter
optimization should be applied in combination with component optimization, that
is, the two can be jointly encoded. Apart from the information obtained from the
users, such as constraints and applied methods, any information obtained from the
optimization of similar workflows also helps.
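As a rough illustration of joint encoding, the sketch below treats the choice of component (including a null component) and its parameters as one candidate configuration. The component names reuse the BLAST/PSI-BLAST example from the text; the parameter names, the value grids, and the `score` function are hypothetical placeholders and do not come from Taverna or from either tool.

```python
from itertools import product

# Hypothetical search space: each alternative component has its own parameter grid.
# The key None stands for the null component (the step is switched off).
SEARCH_SPACE = {
    "BLAST": {"e_value": [1e-3, 1e-5]},
    "PSI-BLAST": {"e_value": [1e-3, 1e-5], "iterations": [2, 3]},
    None: {},
}


def candidates(space):
    """Yield (component, parameters) pairs, so both are encoded jointly."""
    for component, grid in space.items():
        names = sorted(grid)
        for values in product(*(grid[name] for name in names)):
            yield component, dict(zip(names, values))


def score(component, params):
    """Placeholder fitness; a real run would execute the workflow instead."""
    if component is None:
        return 0.0
    base = 1.0 if component == "BLAST" else 1.5
    return base + 0.1 * params.get("iterations", 0) - params["e_value"]


all_candidates = list(candidates(SEARCH_SPACE))
best = max(all_candidates, key=lambda c: score(*c))
```

Enumerating every combination is only feasible for tiny grids like this one; with realistic parameter ranges the same encoding would be handed to a search heuristic instead.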
In some cases, reorganizing the data flow and the enactment flow can be
beneficial. For this to work, the data formats of the affected components must be
identical. If the data formats are not compatible with each other, a new
individual component is introduced that transforms the data into the appropriate
format. Such components are known as shims.
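A shim can be pictured as a small adapter placed between two components. The following sketch is purely illustrative: the component functions, the identifiers they exchange, and the `run_pipeline` helper are invented for this example.

```python
def produce_hits():
    """Upstream component: emits hit identifiers as one comma-separated string."""
    return "P12345,Q67890,O11111"


def count_hits(hit_ids):
    """Downstream component: expects a list of identifiers."""
    return len(hit_ids)


def csv_to_list_shim(text):
    """Shim: transforms the upstream format into the one the consumer expects."""
    return [item.strip() for item in text.split(",") if item.strip()]


def run_pipeline(steps):
    """Run components in order, feeding each output into the next step."""
    data = None
    for index, step in enumerate(steps):
        data = step() if index == 0 else step(data)
    return data


# Without the shim, count_hits would count characters instead of identifiers.
result = run_pipeline([produce_hits, csv_to_list_shim, count_hits])
```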
tuning test that can be used or modified without in-depth knowledge of the domain.
Also, this allows users to change the parameters at runtime.
Fig. 10.2 Taverna framework with optimization plug-ins
The framework must offer an API for accessing its specific functions and a GUI
for user-friendliness. Taverna was not originally developed to support scientific
workflow optimization methods, and hence it has to be extended for that purpose.
To make Taverna support workflow optimization, the major requirement is
accessibility of all input and output parameters and sub-workflows. By
accessibility, we mean the right to modify workflow parameters, execute the
modified workflow, and evaluate the result. In addition, mechanisms are required
for interrupting the execution of the entire workflow at a specific point and for
executing a sub-workflow several times. To implement these requirements in
Taverna, the first approach aims at extending and reusing the service provider
interfaces (SPI) that Taverna provides.
One such SPI gives access to Taverna's processor dispatch stack. The processor
dispatch stack is a mechanism for managing the execution of a specific activity.
Before the activity is invoked, the predefined stack of layers is traversed from
top to bottom, and after the invocation it is traversed from bottom to top. To
integrate optimized execution into Taverna, the dispatch stack was extended with a
new layer, named the optimize layer, on top of the stack. This layer interrupts
workflow execution before a specific component is executed.
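The traversal order of the dispatch stack can be sketched as follows. This is a simplified model written in Python for brevity, not Taverna's Java SPI: the class names, the `dispatch` function, and the layer named retry are assumptions made for illustration only.

```python
class Layer:
    """One layer of a simplified dispatch stack (hypothetical model)."""

    def __init__(self, name):
        self.name = name

    def down(self, job, log):
        """Called on the way down, before the activity is invoked."""
        log.append(f"{self.name}:down")
        return job

    def up(self, result, log):
        """Called on the way back up, after the activity has run."""
        log.append(f"{self.name}:up")
        return result


class OptimizeLayer(Layer):
    """Top layer that intercepts execution of selected activities."""

    def __init__(self, targets):
        super().__init__("optimize")
        self.targets = set(targets)

    def down(self, job, log):
        if job["activity"] in self.targets:
            log.append("optimize:intercept")  # a plug-in would be consulted here
        return super().down(job, log)


def dispatch(stack, job, invoke):
    """Traverse the stack top to bottom, invoke the activity, then go bottom to top."""
    log = []
    for layer in stack:
        job = layer.down(job, log)
    result = invoke(job)
    for layer in reversed(stack):
        result = layer.up(result, log)
    return result, log


stack = [OptimizeLayer(["align"]), Layer("parallelize"), Layer("retry")]
result, log = dispatch(
    stack,
    {"activity": "align", "inputs": {"x": 2}},
    lambda job: job["inputs"]["x"] * 2,
)
```

Placing the optimize layer first in the list mirrors its position on top of the stack, so it sees the job before any other layer does.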
Additionally, advantage was taken of the sub-workflow concept provided by
Taverna, which allows the definition and execution of sub-workflows. This implies
that one can manually select a sub-workflow and have it processed by the dispatch
stack. Only at the lowest level is the workflow decomposed into individual
activities, which again traverse the stack themselves. In short, the processor
dispatch stack provides access to sub-workflows, which in turn provides access to
all parameters, data structures, dataflows, etc., during the invocation of the
optimize layer.
Fig. 10.3 Processor dispatch stack of Taverna
The optimization framework also utilizes a GUI-SPI to add a new uniform
perspective to the Taverna Workbench. This perspective implements the general
setup of an optimization run. It contains the common workflow diagram and a
selection pane, with which users can define the sub-workflow that is to be the
subject of the optimization. After the selection, the components that are not
subject to optimization appear in gray in the workflow diagram.
The lower left pane shows the specific interface implemented by the respective
optimization plug-in. To provide access to the execution model and the user
interface mechanisms, an API was developed for the framework. This API allows
users to extend the optimization methods by using Taverna's specific execution and
GUI functionality without reinventing or re-implementing it.
The API provides methods for requesting a GUI pane that supplies the text fields
and start options for a specific optimization plug-in; receiving the modified
parameters or sub-workflows in order to start the execution; receiving the fitness
values from the workflow and forwarding them to the plug-in; requesting the
termination criteria and deciding when the optimization is finished; and
requesting the best result to return to the users.
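To make the division of labour between framework and plug-in more concrete, the sketch below models the listed API functions as an abstract interface with a trivial random-search plug-in behind it. Everything here is hypothetical: the method names, the `RandomSearchPlugin` class, and the driver loop are illustrative stand-ins written in Python, whereas the actual framework API is Java-based and is not reproduced in the text.

```python
import random
from abc import ABC, abstractmethod


class OptimizationPlugin(ABC):
    """Hypothetical plug-in contract mirroring the API functions listed above."""

    @abstractmethod
    def options(self):
        """Start options that a GUI pane would expose as text fields."""

    @abstractmethod
    def propose(self):
        """Return modified parameter sets to execute next."""

    @abstractmethod
    def receive_fitness(self, candidates, fitness_values):
        """Accept the fitness values forwarded by the framework."""

    @abstractmethod
    def finished(self):
        """Evaluate the termination criterion."""

    @abstractmethod
    def best(self):
        """Return the best result found so far."""


class RandomSearchPlugin(OptimizationPlugin):
    """Deliberately simple plug-in: samples parameters uniformly within bounds."""

    def __init__(self, bounds, batch=4, max_rounds=5, seed=0):
        self.bounds, self.batch, self.max_rounds = bounds, batch, max_rounds
        self.rng = random.Random(seed)
        self.rounds = 0
        self.best_candidate, self.best_fitness = None, float("-inf")

    def options(self):
        return {"batch": self.batch, "max_rounds": self.max_rounds}

    def propose(self):
        return [
            {name: self.rng.uniform(lo, hi) for name, (lo, hi) in self.bounds.items()}
            for _ in range(self.batch)
        ]

    def receive_fitness(self, candidates, fitness_values):
        self.rounds += 1
        for candidate, fitness in zip(candidates, fitness_values):
            if fitness > self.best_fitness:
                self.best_candidate, self.best_fitness = candidate, fitness

    def finished(self):
        return self.rounds >= self.max_rounds

    def best(self):
        return self.best_candidate, self.best_fitness


def run_optimization(plugin, execute):
    """Framework side: execute proposals and forward fitness until finished."""
    while not plugin.finished():
        proposals = plugin.propose()
        plugin.receive_fitness(proposals, [execute(p) for p in proposals])
    return plugin.best()


plugin = RandomSearchPlugin({"threshold": (0.0, 1.0)})
best_params, best_fitness = run_optimization(
    plugin, lambda p: -(p["threshold"] - 0.3) ** 2
)
```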
In the new optimization perspective, users can select the sub-workflow and
provide optimization-specific information. After that, the newly integrated run
button starts the fully automated optimization process. The workflow is executed
until the respective sub-workflow is reached; the newly embedded optimize layer
then extracts the information and forwards it to the particular optimization
plug-in for modification. The plug-in applies its optimization method and refines
the sub-workflow, while the framework remains idle.
The plug-in returns a set of new sub-workflow entities, each with a different set
of parameters. The optimization framework then executes them by utilizing the
topmost Taverna layer, parallelize (refer to Fig. 10.3). Taverna provides a queue
that is filled with the sub-workflows for the parallelize layer, which pushes each
sub-workflow down the stack in a separate thread. After all sub-workflows have
been executed, the optimize layer receives the set of results from the
parallelize layer. These results represent the fitness values and are passed to
the specific optimization plug-in for analysis and evaluation.
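The one-thread-per-sub-workflow pattern can be approximated with a thread pool. The sketch below is a Python analogy rather than Taverna code; `run_sub_workflow` is a hypothetical stand-in for executing one sub-workflow instance and returning its fitness value.

```python
from concurrent.futures import ThreadPoolExecutor


def run_sub_workflow(params):
    """Stand-in for executing one sub-workflow instance; returns its fitness."""
    return -(params["x"] - 2.0) ** 2


def evaluate_in_parallel(parameter_sets, max_workers=4):
    """Execute every parameter set in its own worker thread.

    pool.map returns results in input order, so each fitness value can be
    matched back to the parameter set that produced it.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_sub_workflow, parameter_sets))


parameter_sets = [{"x": 0.0}, {"x": 1.0}, {"x": 2.0}, {"x": 3.0}]
fitness_values = evaluate_in_parallel(parameter_sets)
best_index = max(range(len(fitness_values)), key=fitness_values.__getitem__)
```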
10.5 Optimization Using Distributed Computing Infrastructure
In the second tier, the workflow has several independent steps that can be
represented as individual UNICORE jobs. Owing to the parallel execution model of
Taverna and the job-distributing nature of the SO, these independent steps are
submitted for parallel execution on high-performance computing (HPC) resources. In
the third tier, the workflow steps are implemented as MPI/thread-parallel
applications [40].
The desired three-tier architecture can be set up with available software
components, namely, Taverna and UNICORE. However, the highly recommended security
standards cannot be met with this standard setup. The security issue arises at the
Taverna Server in the second tier, where the workflow execution takes place (refer
to Fig. 10.5). In some cases the workflow contains a grid activity that has to be
submitted to the infrastructure. This submission requires the user certificate,
which is not available on the Taverna Server to which the workflow is distributed.
To address this situation, a new mechanism based on SAML assertions was
investigated in the Taverna Workbench and Server. It is based on trust delegation
that utilizes the user certificate and the Taverna Server certificate on the
client.
During a request to the SO, the SO fetches the distinguished name (DN) of the
server and provides it to the Taverna Workbench, which is the client side. The
Workbench extracts the DN and stores it along with the user’s X.509 certificate,
which results in a trust delegation. The server uses this delegation to create a
SAML assertion, with which it can submit individual applications on behalf of the
HPC users.
In order to establish the architecture and the novel security mechanism, two out
of three activities should be optimized. Sub-workflow instances were executed in
parallel on the client and on the computing resources. For a more detailed
analysis of execution scenarios and use cases, refer to [40].
A generic and automated optimization framework was introduced to fulfill the
second requirement, namely, access to scientific workflow optimization. This
framework offers an API for extension to different optimization levels. The
nonlinear problem is approached by proposing a plug-in that applies a
meta-heuristic optimization algorithm and implements parameter optimization. The
developed framework is thus extendable by any combination of optimization
algorithm and level. Life science workflows combine various data and
configuration inputs and contain several different applications. Each component
has a number of parameters on which the final result critically depends. In
addition, the data and configuration inputs have different influences on the
final result. Currently, only single applications within their specific scope
have been optimized, and configuration parameters are calibrated using default
values, trial and error, or parameter sweeps. In order to optimize, a measure of
the workflow result is required. This measure is expressed as a numerical value,
which is called the objective function. The objective function is calculated by a
standard performance measure or by simulation software.
Deterministic, stochastic, and heuristic algorithms can all be used to solve
nonlinear optimization problems. Genetic algorithms and other meta-heuristic
algorithms need to explore a significantly lower number of solutions in order to
find good (near-optimal) solutions. Since workflow execution takes a long time,
algorithms that suit distributed and parallel execution environments, such as
evolutionary algorithms, are used.
Genetic algorithm: Each parameter is represented by one gene, and hence one
chromosome holds one parameter set. The constraints restrict the values of the
genes, and the population represents one generation of several different
workflow setups.
Particle Swarm Optimization: Each parameter spans one dimension of the search
space, and one particle represents one parameter set. The constraints define the
range of the respective dimensions.
Ant colony optimization: Each parameter set is defined by an ant, and each visited
node represents one parameter. The constraints define the range of the pheromone
intensities.
To obtain the best decision in reasonable time, parallel execution must be
practiced, and hence evolutionary algorithms have emerged as the best candidates.
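The Particle Swarm Optimization mapping can be illustrated with a compact sketch in which each parameter is one dimension, each particle is one parameter set, and the constraints become per-dimension bounds. This is a textbook-style minimizing PSO written for illustration; the coefficient values, the function name `pso`, and the example objective are assumptions, not settings taken from the text.

```python
import random


def pso(fitness, bounds, particles=10, iterations=30, seed=0,
        inertia=0.7, c1=1.5, c2=1.5):
    """Minimize `fitness` over the box `bounds` with a basic particle swarm."""
    rng = random.Random(seed)
    dims = len(bounds)
    # One particle = one parameter set; one dimension = one parameter.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(particles)]
    vel = [[0.0] * dims for _ in range(particles)]
    best_pos = [p[:] for p in pos]
    best_fit = [fitness(p) for p in pos]
    g = min(range(particles), key=best_fit.__getitem__)
    g_pos, g_fit = best_pos[g][:], best_fit[g]
    initial_best = g_fit
    for _ in range(iterations):
        for i in range(particles):
            for d, (lo, hi) in enumerate(bounds):
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c2 * rng.random() * (g_pos[d] - pos[i][d]))
                # Constraints define the range of each dimension: clamp to bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            value = fitness(pos[i])
            if value < best_fit[i]:
                best_pos[i], best_fit[i] = pos[i][:], value
                if value < g_fit:
                    g_pos, g_fit = pos[i][:], value
    return g_pos, g_fit, initial_best


bounds = [(-5.0, 5.0), (0.0, 10.0)]
solution, solution_fit, initial_best = pso(
    lambda p: (p[0] - 1.0) ** 2 + (p[1] - 4.0) ** 2, bounds
)
```

In a workflow setting, the call to `fitness` would correspond to one sub-workflow execution, which is why the evaluations of a swarm are natural candidates for parallel execution.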
Genetic algorithms (GA) are among the most popular heuristic optimization methods
and have been used for more than 36 years to solve complex scientific
optimization and search problems. GA has also been used to provide protein
structures in real time, to calculate multiple sequence alignments, and for
parameter estimation in kinetic models. GA does not require any derivatives and
can therefore be applied easily to both numerical and nonnumerical problems. GA
mimics the mechanism of natural evolution by applying recombination, mutation, and
selection to a population of parameter vectors. Each generation consists of a
population of individuals, each of which represents a point in the search space,
that is, a possible solution. The evolution of the population of potential
solutions is guided by a problem-specific objective function, called the fitness
function, by which each individual is evaluated. The fitness function determines
the probability with which a solution is inherited by the next generation. The
new population is created by selecting the fittest parents of the prior
population and applying crossover or mutation to them. This search process
combines exploration and hill climbing and allows the algorithm to sample many
local minima. GA can terminate after a fixed number of iterations or when a
fitness-based threshold is reached.
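The loop described in this paragraph can be sketched as a small real-coded GA. The operators chosen here (binary tournament selection, uniform crossover, Gaussian mutation, and one elite individual) are common textbook choices assumed for illustration; the text does not state which operators were actually used. The default rates of 0.8 and 0.1 echo the values reported later in the chapter.

```python
import random


def genetic_algorithm(fitness, bounds, pop_size=20, generations=40,
                      crossover_rate=0.8, mutation_rate=0.1,
                      target=None, seed=0):
    """Maximize `fitness` over real-valued parameter vectors within `bounds`."""
    rng = random.Random(seed)

    def tournament(scored):
        # Binary tournament: the fitter of two random individuals becomes a parent.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scored = sorted(((fitness(ind), ind) for ind in population),
                        key=lambda pair: pair[0], reverse=True)
        history.append(scored[0][0])
        # Terminate early once a fitness-based threshold is reached.
        if target is not None and scored[0][0] >= target:
            break
        next_population = [scored[0][1][:]]  # elitism: keep the fittest unchanged
        while len(next_population) < pop_size:
            p1, p2 = tournament(scored), tournament(scored)
            if rng.random() < crossover_rate:
                child = [x if rng.random() < 0.5 else y for x, y in zip(p1, p2)]
            else:
                child = p1[:]
            for d, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    step = rng.gauss(0.0, 0.1 * (hi - lo))
                    child[d] = min(hi, max(lo, child[d] + step))
            next_population.append(child)
        population = next_population
    best = max(population, key=fitness)
    return best, fitness(best), history


ga_bounds = [(-5.0, 5.0), (-5.0, 5.0)]
ga_best, ga_best_fit, ga_history = genetic_algorithm(
    lambda p: -((p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2), ga_bounds
)
```

Because the fittest individual is copied unchanged into each new generation, the best fitness recorded per generation never decreases in this sketch.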
To optimize the parameters of scientific workflows with GA, a mapping between
the workflow entities and the genetic entities is required. A real-coded GA was
selected because of its lower computational complexity compared with a
binary-coded GA. Workflow parameters are encoded directly as genes on the
chromosome. Thus, each chromosome encodes a set of input parameters and represents
a particular combination of input values. GA libraries are available in different
languages. For this work, the Java-based library JGAP [41] was selected; the ECJ
library [42] and the Watchmaker framework [43] were also taken into account.
JGAP is an open-source GA library developed by various freelance developers. It
offers a general mechanism for genetic evolution that can be adapted to the
particular optimization problem. Its parallel execution mechanism and breeder
mechanism are well suited to workflow optimization. To support constrained
parameter optimization, a special gene type called MapGene is used. It defines a
fixed set from which the breeder can select only one of the parameter values. A
novel mechanism was invented to add functions to numerical genes that map gene
values to parameter values, as given below, where y represents the parameter
value and x represents the gene value:
y = f(x)
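The two constraint mechanisms, a gene restricted to a fixed set of values and a numerical gene whose value x is mapped to the parameter y by a function f, can be pictured with the sketch below. The classes are hypothetical Python stand-ins and do not reproduce JGAP's Java classes; the parameter names and the mapping f(x) = 10^(-x) are invented for the example.

```python
import random


class FixedSetGene:
    """Gene whose value is always one element of a fixed set of allowed values."""

    def __init__(self, allowed, rng):
        self.allowed = list(allowed)
        self.rng = rng
        self.value = rng.choice(self.allowed)

    def mutate(self):
        self.value = self.rng.choice(self.allowed)

    def parameter(self):
        return self.value


class MappedNumericGene:
    """Numeric gene x in [lo, hi]; the workflow parameter is y = f(x)."""

    def __init__(self, lo, hi, f, rng):
        self.lo, self.hi, self.f, self.rng = lo, hi, f, rng
        self.x = rng.uniform(lo, hi)

    def mutate(self):
        self.x = self.rng.uniform(self.lo, self.hi)

    def parameter(self):
        return self.f(self.x)


rng = random.Random(0)
matrix_gene = FixedSetGene(["BLOSUM62", "BLOSUM80", "PAM250"], rng)
# The gene evolves on a linear scale while the parameter varies logarithmically.
evalue_gene = MappedNumericGene(1.0, 10.0, lambda x: 10.0 ** (-x), rng)
parameters = {"matrix": matrix_gene.parameter(), "e_value": evalue_gene.parameter()}
```

A mapping of this kind lets the genetic operators work on a well-scaled gene value even when the underlying parameter spans several orders of magnitude.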
The main change made to JGAP concerns the central breeder method. The refined
breeder acts as follows:
• The generated population of sub-workflows is first executed, and the calculated
results are translated into fitness values in order to determine the fittest
chromosomes.
To validate the optimization plug-in, several real-world use cases from the life
sciences must be tested experimentally in a reasonable execution environment, and
good values for the crossover and mutation rates must be found. The test runs are
specific to scientific workflow optimization. The plug-in was benchmarked with the
Rosenbrock function and the Goldstein–Price function. These experiments showed
good results when the mutation rate and the crossover rate were set to 0.1 and
0.8, respectively.
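For reference, the two benchmark functions have standard closed forms, shown here in their usual two-dimensional versions. Which variants and search ranges were used in the reported experiments is not stated, so the definitions below are the common textbook ones: Rosenbrock has its global minimum 0 at (1, 1), and Goldstein–Price has its global minimum 3 at (0, -1).

```python
def rosenbrock(x, y, a=1.0, b=100.0):
    """Two-dimensional Rosenbrock function; global minimum 0 at (a, a**2)."""
    return (a - x) ** 2 + b * (y - x ** 2) ** 2


def goldstein_price(x, y):
    """Goldstein-Price function; global minimum 3 at (0, -1)."""
    first = 1 + (x + y + 1) ** 2 * (
        19 - 14 * x + 3 * x ** 2 - 14 * y + 6 * x * y + 3 * y ** 2
    )
    second = 30 + (2 * x - 3 * y) ** 2 * (
        18 - 32 * x + 12 * x ** 2 + 48 * y - 36 * x * y + 27 * y ** 2
    )
    return first * second
```

Both functions are to be minimized, so an optimizer that maximizes fitness would use their negated values.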
10.9 Conclusion
Since trial and error and parameter sweeps are impractical and inefficient for
complex life science workflows, the optimization of complex scientific workflows
is crucial in novel research areas and in silico experiments. This approach
introduces an optimization phase, plug-ins for achieving the different levels of
optimization, and an automated and generic optimization framework, which was
integrated into Taverna. The framework was developed to offer scientists a tool
for achieving optimized scientific results. The framework and its Taverna
integration help developers, who no longer have to deal with Taverna’s GUI
implementation, specifications, and execution model. The security issues are
solved via trust delegation on the client and SAML assertions on the Taverna
Server.
References
18. Zhou A, Qu B-Y, Li H, Zhao S-Z, Suganthan PN, Zhang Q (2011) Multiobjective evolutionary
algorithms: a survey of the state of the art. Swarm Evol Comput 1(1):32–49
19. Deb K (2001) Multi-objective optimization using evolutionary algorithms. Wiley, Chichester
20. Jean-Charles Boisson, Laetitia Jourdan, El-Ghazali Talbi, Dragos Horvath (2008) Parallel
multi-objective algorithms for the molecular docking problem. In: IEEE symposium on com-
putational intelligence in bioinformatics and computational biology, IEEE, pp 187–194
21. Namasivayam V, Bajorath J (2012) Multiobjective particle Swarm optimization: automated
identification of structure–activity relationship-informative compounds with favorable physi-
cochemical property distributions. J Chem Inf Model 52(11):2848–2855
22. Poladian L, Jermiin LS (2006) Multi-objective evolutionary algorithms and phylogenetic
inference with multiple data sets. Soft Comput 10(4):359–368
23. Hisao Ishibuchi, Tadahiko Murata (1996) Multi-objective genetic local search algorithm In:
Proceedings of IEEE international conference on evolutionary computation, IEEE,
pp 119–124
24. Moscato P (1999) Memetic algorithms: a short introduction. In: Corne D, Dorigo M, Glover F,
Dasgupta D, Moscato P, Poli R, Price KV (eds) New ideas in optimization. McGraw-Hill Ltd,
Maidenhead, pp 219–234
25. Blum C, Puchinger J, Raidl GR, Roli A (2011) Hybrid metaheuristics in combinatorial optimi-
zation: a survey. Appl Soft Comput 11(6):4135–4151
26. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search
tool. J Mol Biol 215(3):403–410
27. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997)
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.
Nucleic Acids Res 25(17):3389–3402
28. Wieczorek M, Hoheisel A, Prodan R (2009) Towards a general model of the multi-criteria
workflow scheduling on the grid. Futur Gener Comput Syst 25(3):237–256
29. Yu J, Kirley M, Buyya R (2007) Multi-objective planning for workflow execution on grids. In:
Proceedings of the 8th IEEE/ACM international conference on grid computing, IEEE,
pp 10–17
30. Kumar VS, Kurc T, Ratnakar V, Kim J, Mehta G, Vahi K, Nelson YL, Sadayappan P, Deelman
E, Gil Y, Hall M, Saltz J (2010) Parameterized specification, configuration and execution of
data-intensive scientific workflows. Clust Comput 13(3):315–333
31. Chen W-N, Zhang J (2009) An Ant colony optimization approach to a grid workflow schedul-
ing problem with various QoS requirements. IEEE Trans Syst Man Cybernetics C Appl Rev
39(1):29–43
32. Yu J, Buyya R, Ramamohanarao K (2008) Workflow scheduling algorithms for grid comput-
ing. In: Metaheuristics for scheduling in distributed computing environments, vol 146, studies
in computational intelligence. Springer, Berlin, pp 173–214
33. Prodan R, Fahringer T (2005) Dynamic scheduling of scientific workflow applications on the
grid: a case study. In: Proceedings of the 2005 ACM symposium on Applied computing, ACM,
pp 687–694
34. Abramson D, Bethwaite B, Enticott C, Garic S, Peachey T, Michailova A, Amirriazi S (2010)
Embedding optimization in computational science workflows. J Comput Sci 1(1):41–47
35. Ludäscher B, Altintas I, Berkley C, Higgins D, Jaeger E, Jones M, Lee EA, Tao J, Zhao Y
(2006) Scientific workflow management and the Kepler system. Concurr Comput Pract Exp
18(10):1039–1065
36. Crick T, Dunning P, Kim H, Padget J (2009) Engineering design optimization using services
and workflows. Philos Trans R Soc A Math Phys Eng Sci. 367(1898):2741–2751
37. Missier P, Soiland-Reyes S, Owen S, Tan W, Nenadic A, Dunlop I, Williams A, Oinn T, Goble
C (2010) Taverna, reloaded. In: Proceedings of the 22nd international conference on Scientific
and statistical database management, Springer, pp 471–481
38. Barga R, Gannon D (2007) Business versus scientific workflow, Workflows for e-Science.
Springer, London, pp 9–16
Abstract With the faster adoption of the cloud idea across industry verticals,
with all its elegance and enthusiasm, traditional IT is bound to enlarge its
prospects and potential. That is, IT capabilities and capacities are being
enhanced through a seamless and spontaneous association with the cloud paradigm in
order to meet fast-emerging and evolving business requirements. This new kind of
IT has lately been getting enormous attention and garnering a lot of interest
among business executives and IT professionals. The systematic amalgamation of
cloud concepts with the time-tested and trusted enterprise IT environment is set
to deliver a bevy of significant advantages for business houses in the days ahead.
This model of next-generation computing, through the cognitive and collective
leverage of enterprise and cloud IT environments, is being touted as hybrid IT. A
variety of technologies and tools expressly enable the faster realization of the
hybrid era. This chapter digs deep into the various implications of hybrid IT and
describes them.
11.1 Introduction
The cloud paradigm is definitely journeying in the right direction toward its ordained
destination (the one-stop IT solution for all kinds of institutions, innovators, and
individuals). The various stakeholders are playing their roles and responsibilities
with all the alacrity and astuteness to smoothen the cloud route. Resultantly there
are a number of noteworthy innovations and transformations in the cloud space, and
they are being consciously verified and validated by corporates in order to avail
them with confidence. There are IT product vendors, service organizations, inde-
pendent software vendors, research labs, and academic institutions closely and col-
laboratively working to make the cloud idea decisively penetrative and deftly
pervasive.
In this chapter, we describe the various unique capabilities of hybrid clouds
and how the feature-rich and state-of-the-art IBM hybrid cloud offering
differentiates itself from the various hybrid cloud service providers in the
competition-filled market. Hybrid cloud ensures fast and frictionless access to
cloud infrastructures, platforms, software, and data along with the much-touted
bulletproof governance
Let us start with a brief overview of the raging hybrid cloud concept. The cloud paradigm is
definitely recognized as the most disruptive one in the IT space in the recent past.
The cloud idea has brought in a series of well-intended and widely noticed business
transformations. It all started with the widespread establishment and sustenance of
public clouds that are typically centralized, consolidated, virtualized, shared, auto-
mated, managed, and secured data centers and server farms. This trend clearly indi-
cated and illustrated the overwhelming idea of IT industrialization. The era of
commoditization to make IT the fifth social utility has also flourished with the faster
maturity and stability of cloud concepts. That is, public clouds are made out of a
Hybrid cloud is a kind of converged cloud computing environment that uses a mix
of on-premise private cloud and public cloud services with seamless interaction
and orchestration between the participating platforms. While some organizations
are looking to put selected IT functions onto a public cloud, they still prefer
keeping the higher-risk or more bespoke functions in a private/on-premise
environment. Sometimes the best infrastructure for an application requires both
cloud and dedicated environments, and this is touted as the prime reason for
having hybrid clouds.
• Public cloud for cost-effective scalability and ideal for heavy or unpredictable
traffic
• Private cloud for complete control and security aspects
• Dedicated servers for super-fast performance and reliability
Development and Testing. Hybrid cloud provides businesses with the required
flexibility to gain the needed capacity for limited time periods without making
capital investments for additional IT infrastructures.
Extending Existing Applications. With a hybrid cloud, businesses can extend
current standard applications to the cloud to meet the needs of rapid growth or
free up on-site resources for more business-critical projects.
Disaster Recovery. Every organization fears an outage, or outright loss, of
business-critical information. While on-site disaster recovery solutions can be
expensive, preventing businesses from adopting the protection plans they need, a
hybrid cloud can offer an affordable disaster recovery solution with flexible
commitments, capacity, and cost.
Web and Mobile Applications. Hybrid cloud is ideal for cloud-native and mobile
applications that are data-intensive and tend to need the elasticity to scale with
sudden or unpredictable traffic spikes. With a hybrid cloud, organizations can
keep sensitive data on-site and maintain existing IT policies to meet the
application’s security and compliance requirements.
Development Operations. As developers and operations teams work closer together
to increase the rate of delivery and quality of deployed software, a hybrid cloud
allows them to not only blur the lines between the roles but between Dev/Test and
production and between on-site and off-site placement of workloads.
Capacity Expansion. Quickly addresses resource constraints by bursting workloads
into VMware on IBM cloud.
Data Center Consolidation. Consolidate legacy infrastructures onto an automated
and centrally managed software-defined data center.
The adoption of hybrid clouds is driven by several factors, as articulated above.
The mandate for highly optimized and organized cloud environments that enable
smarter organizations is the key force behind hybrid clouds. The heightened
stability and surging popularity of hybrid clouds ultimately lead to multi-cloud
environments.
Hybrid cloud makes it possible to run different applications in the best-suited
environments to reap the required advantages such as speed, scale, throughput,
visibility, and control. Different providers in the cloud space offer competent
solutions and services to accelerate hybrid cloud setup and sustenance. Cloud
infrastructure service providers and cloud-managed service providers are
plentiful, and there is a plethora of open-source as well as commercial-grade
solutions and toolsets. Service providers have formulated and enabled frameworks
for risk-free and sustainable hybrid clouds.
However, there are challenges too. A particularly prickly challenge lies in
establishing a high-performing hybrid environment whose different components can
all be managed and monitored appropriately and accurately. Most current
infrastructure management and monitoring tools were initially built to manage a
single environment only. These point tools are incapable of managing distributed
yet connected environments together. They lack the much-needed visibility into and
control over different environments, so activities such as workload migration
among the integrated clouds do not happen smoothly. Furthermore, application
performance management (APM) across the participating clouds is not an easy affair
either.
Thus, there is a need for an integrated cloud management platform (CMP) in order
to leverage the fast-evolving hybrid concept to the fullest extent and give
businesses the utmost flexibility.
The Distinct Capabilities of Hybrid Clouds
We are slowly yet steadily heading toward the digital era. Digitization-enablement
technologies and tools are bringing a number of rapid and radical changes in how we
collaborate, correlate, corroborate, and work. IT professionals are under immense
pressure to capitalize on these innovations meticulously and to help their
organizations move with more agility and alacrity than today. The mantra of “more
with less” has induced a large number of organizations to strategize and implement
a private cloud, prompted by the huge advantages offered by public clouds. But with
the broad range of cloud platforms, along with the explosion of new infrastructure
and application services, having a private cloud environment alone is no longer
sufficient or efficient. The clear-cut answer is the hybrid cloud. The following
questions help in understanding how a hybrid cloud comes in handy in steering
businesses in the right direction:
• Are workloads moved from private to public environments?
• Do you develop the Web or mobile application in one cloud platform but run it in
a different cloud?
• Do your developers want to use multiple public platforms for their projects?
The following are the widely articulated and accepted features of hybrid cloud
service providers:
1. Cloud bursting – Running a performant and efficient private data center and
leveraging public cloud providers for occasional bursting is the way forward.
On the private side, you maintain complete control over privacy, security, and
performance for your workloads, and on the public side, you have “infinite”
capacity for those occasional workloads. Imagine a Ruby application that runs
an online store whose transaction volume is rising because of a sale, a new
promotion, or Cyber Monday. The cloud bursting module recognizes the increased
load and recommends the right action to address it, effectively answering the
question of when it is time to burst.
9. Data backup, archival, and protection – Cloud environments are turning out to
be an excellent mechanism for business continuity and resiliency.
10. Identity and access management (IAM) – Authentication, authorization, and
auditability are the key requirements for ensuring cloud security and privacy.
Several additional mechanisms are available to harden security further.
11. Disaster and data recovery – Whether their applications are running in on-
premise private clouds, hosted private clouds, or public clouds, enterprises are
increasingly seeing the value of using the public cloud for disaster recovery.
One of the major challenges of traditional disaster recovery architectures that
span multiple data centers has been the cost of provisioning duplicate infra-
structure that is rarely used. By using pay-as-you-go public cloud as the disas-
ter recovery environment, IT teams can now deliver DR solutions at a much
lower cost.
12. Service integration and management – IT service management (ITSM) has
been an important requirement for different IT service providers, data center
operators, and cloud centers for exemplary service fulfillment. As clouds
emerge as the one-stop IT solution, the aspect of service management and inte-
gration garners extreme importance in order to fulfill the agreed SLAs between
cloud service consumers and providers. The various nonfunctional attributes
need to be guaranteed by CSPs to their subscribers through highly competitive
ITSM solutions.
13. Monitoring, measurement, and management – Hybrid cloud management plat-
form is a key ingredient for running hybrid cloud environments successfully,
and we have extensively written about the role and responsibility of a viable
management platform in a hybrid scenario.
14. Metering and charge-back – A charge-back model in the cloud delivers many
benefits, including the most obvious:
• Correlating utilization back to cloud consumers or corporate departments, so
that usage can be charged if desired
• Providing visibility into resource utilization and facilitating capacity plan-
ning, forecasting, and budgeting
• Providing a mechanism for the enterprise IT function to justify and allocate
their costs to their stakeholder business units
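The first benefit in the list, correlating utilization back to cloud consumers, can be illustrated with a small sketch. The Python example below is only an illustration under stated assumptions, not any vendor’s metering API: the record fields, the department names, and the hourly rates are hypothetical. It charges each department for what it actually used instead of a fixed share of IT costs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    """One metered usage sample (all fields are illustrative assumptions)."""
    department: str
    resource: str   # hypothetical resource type, e.g. "vm.small"
    hours: float


# Hypothetical hourly rates per resource type, in arbitrary currency units.
RATES = {"vm.small": 0.05, "vm.large": 0.20}


def charge_back(records, rates):
    """Return the usage-based cost per department (hours * hourly rate)."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec.department] += rec.hours * rates[rec.resource]
    return dict(totals)


records = [
    UsageRecord("sales", "vm.small", 100.0),
    UsageRecord("sales", "vm.large", 10.0),
    UsageRecord("research", "vm.large", 50.0),
]
bill = charge_back(records, RATES)
for department, cost in sorted(bill.items()):
    print(f"{department}: {cost:.2f}")
```

The same per-department totals can feed the capacity planning and budgeting activities mentioned in the second benefit, since they expose who consumes which resources.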
With the continued evolution of business sentiments and expectations, the capa-
bilities of hybrid clouds are bound to enlarge consistently through the careful addi-
tion of powerful tools, processes, and practices.
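To make the cloud bursting capability described in item 1 more concrete, here is a minimal sketch of how a bursting module might decide when to recommend a burst. The decision rule, the thresholds, and the function name are assumptions made for illustration; the text does not specify how a real bursting module reaches its recommendation.

```python
from statistics import mean


def recommend_action(utilization_samples, burst_threshold=0.85, calm_threshold=0.50):
    """Recommend an action from recent private-cloud utilization samples.

    utilization_samples: fractions in [0, 1], most recent last.
    Both thresholds are hypothetical defaults, not values from any product.
    """
    if not utilization_samples:
        return "hold"  # no data, so make no change
    average = mean(utilization_samples)
    if average >= burst_threshold:
        return "burst"    # private capacity is nearly exhausted
    if average <= calm_threshold:
        return "retract"  # load has subsided; bring workloads back
    return "hold"


# Example: utilization climbing during a sale.
print(recommend_action([0.90, 0.95, 0.88]))
```

Averaging over a window rather than reacting to a single sample is one simple way to avoid bursting on a momentary spike; a production module would likely use richer signals.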
Hybridization is a unique and useful approach for different scenarios. In the
cloud space too, the hybrid capability is set to bring a number of exceptional
benefits to enterprises worldwide: accelerating innovation, sharply reducing the
time to take fresh products and services to the knowledge-filled market, enabling
agile development through continuous integration, deployment, and delivery, and
significantly enhancing resource utilization.
11.5 Hybrid Cloud Management: The Use Cases and Requirements
11.6 The Growing Ecosystem of Hybrid Cloud Management Solutions
Other vendors that built solutions for cloud migration or brokerage are now
extending them with life-cycle and governance features so that they can serve as
hybrid cloud management solutions:
1. Established management solution providers extend existing management suites.
Three of the traditional big four enterprise management vendors (BMC Software,
HP, and IBM) offer stand-alone hybrid cloud management solutions, often inte-
grating with and leveraging other tools from the vendor’s existing catalog.
Microsoft and Oracle have added hybrid cloud management features to System
Center and Enterprise Manager, respectively.
2. Enterprise system vendors add hybrid management to private cloud platforms.
Cisco Systems, Citrix, Computer Sciences Corp (CSC), Dell, HP, IBM,
Microsoft, Oracle, and VMware all offer private cloud suites that may also
include hybrid cloud management capabilities. If private cloud is critical to your
hybrid cloud strategy, consider whether such an extension of private cloud meets
your needs.
3. Hypervisor and OS vendors target technology managers who own infrastructure.
Citrix, Microsoft, Red Hat, and VMware offer virtualization platforms and vir-
tual machine (VM)-focused management tools and focus hybrid cloud manage-
ment solutions on the cloud infrastructure life-cycle use case. CloudBolt and
Embotics also feature VM-focused management capabilities.
4. Independent software vendors target cloud-focused developers and DevOps.
CliQr, CSC, Dell, DivvyCloud, GigaSpaces, RightScale, and Scalr market their
solutions primarily at cloud developers and the DevOps pros who support them.
Their solutions are well suited for the cloud application life-cycle use case.
5. Public cloud platform vendors focus on their own clouds in hybrid scenarios.
Cisco, IBM, Microsoft, Oracle, Red Hat, and VMware offer public cloud plat-
forms in addition to hybrid cloud management software. Naturally, these ven-
dors encourage the use of their own platforms for hybrid deployments. Pay
attention to how strongly the vendor’s own platform is favored when evaluating
its hybrid cloud management capabilities.
6. Cloud migration vendors add more life-cycle management features. HotLink and
RackWare are cloud migration tools with added VM management features that
extend to public cloud platforms. They stress the onboarding and disaster recov-
ery use cases for cloud migration. RISC Networks is a cloud migration analysis
tool. In addition to these vendors, many other hybrid cloud management vendors
in this landscape have migration capabilities.
7. Cloud brokers and brokerage enablers extend beyond cost analytics. AppDirect,
Gravitant, Jamcracker, and Ostrato primarily focus on the enterprise cloud bro-
kerage use case. Each of these vendors, however, offers additional capabilities
beyond cost brokering and analytics.
Due to the large number of connected, clustered, and centralized IT systems, the
complexity of cloud centers induced by heterogeneity and multiplicity has risen
sharply. However, a bevy of tool-assisted, standard-compliant, policy- and
pattern-centric, and template-driven methods have come in handy in moderating the
development, management, delivery, and operational complexities of clouds.
Precisely speaking, converged, virtualized, automated, shared, and managed cloud
environments are the result of a stream of pioneering technologies, techniques, and
tools working in concert toward the strategically sound goal of the Intercloud. The
results are there for everyone to see. IT industrialization is taking shape, IT is
emerging as the fifth social utility, and the digital, insightful, idea, and API
era is setting in. Brokerage solutions are being presented and prescribed as the
most elementary as well as essential instrument for attaining the intended
success. In this document, we would like to describe how IBM cloudMatrix, the
enterprise-grade cloud brokerage solution, is going to be the real game-changer for
the ensuing cloud era.
Undeniably, the cloud journey is still proceeding at a frenetic pace. The
game-changing journey started with server virtualization, helped by the easy
availability, fast maturity, and stability of hypervisors (virtual machine
monitors (VMMs)). That phase was followed by the arrival of powerful tools and
engines to automate and accelerate several manual tasks such as virtual machine
(VM) monitoring, measurement, management, load balancing, capacity planning,
security, and job scheduling. In addition, the unprecedented acceptance and
adoption of cloud management platforms such as OpenStack and CloudStack have made
it easy to manage, decisively and declaratively, various IT infrastructures such
as compute
11.8 The Key Drivers for Cloud Brokerage Solutions and Services
The following are the prominent and dominant drivers for the huge success of the
brokerage concept:
1. Transforming to hybrid IT
2. Delivering the ideals of “IT as a service”
3. Planning smooth transition to cloud
4. Empowering self-service IT
5. Incorporating shadow IT
6. Setting and sustaining multi-cloud environments
7. Streamlining multi-cloud governance and control
IBM cloudMatrix is the prime ingredient for enabling hybrid IT – When a data
center nears the end of its life, an important decision has to be made on how to
replace it, and increasingly enterprises are opting to replace their inflexible
and complicated data centers with a mix of cloud and physical assets, called
hybrid IT. Enterprises are recognizing the need to be more competitive in their
dealings, decisions, and deeds. Some of the basic problems they need to solve
are capital and operational costs, time to value, lack of automation, and
charge-back accuracy. Hybrid IT helps solve these perpetual problems and
increases competitiveness, as long as the right expertise and tools are
leveraged.
1. Ongoing cost – The cost of operating, maintaining, and extending applica-
tion services within the physical data center environments, especially
across political and geographic boundaries, would continue to increase.
2. Speed – Internal and technology requests for services, on average, took
four to six weeks for review and approval, often leading to frustration and
a lack of agility for business units.
3. Lack of automation – Fulfilling application service requests took too many
manual steps, exacerbated by required technology skillsets.
4. Charge-back accuracy – Business units were being charged a percentage
of IT costs without consideration of usage.
5. Capital expenditure – There is a large upfront cost associated with building
and deploying new data centers.
Hybrid IT is definitely a long-term, strategic move for any enterprise IT. It
typically comprises private cloud, public cloud, and traditional IT. There are
some game-changing advantages of hybrid IT. The first and foremost is that it
never asks you to rip and replace the current system. Any hybrid IT solution
would need to continue to interoperate with the existing service management
system and work with ticket management where appropriate. The most crucial tool
for realizing painless and risk-free hybrid IT is a highly competitive and
comprehensive cloud brokerage solution. A complete cloud brokerage solution
would tie planning, consumption, delivery, and management together seamlessly
across public, private, virtual, hosted, and on- and off-premise solutions. IBM
cloudMatrix is widely recognized as the best-in-class cloud brokerage solution.
I have given its unique capabilities in comfortably fulfilling the various
IBM cloudMatrix provides the following features:
• A seeded catalog of the industry’s leading cloud infrastructure providers,
out of the box without the overhead of custom integration.
• A marketplace where consumers can select and compare provider services
or add their own IT-approved services for purchasing and provisioning.
Consumers can use a common workflow with approval processes that are
executed in minutes, not weeks.
Journeying Toward the “IT as a Service (ITaaS)” Days This is definitely the
service era. The excitement and elegance associated with service-oriented
architecture (SOA) have paid off in formulating and firming up the journey toward
the days of “everything as a service (XaaS).” The service paradigm is growing
rapidly. Varied tasks such as service conceptualization, concretization,
composition, deployment, delivery, management, and enhancement are being
greatly simplified and accelerated through a variety of automated tools. All kinds
of IT capabilities are being expressed and exposed as easily identifiable,
network-accessible, distinctively interoperable, smartly composable, quickly
recoverable, and replaceable services. With such service enablement, closed,
monolithic, and inflexible IT infrastructures are being turned into open, remotely
consumable, easily maneuverable, and managed components. With the arrival and
acceptance of IT service monitoring, measurement, and management tools, engines,
and platforms, IT assets and applications are being readied for the era of ITaaS.
Further, microservice architecture (MSA), which is an offshoot of SOA, is
gaining a lot of ground these days, and hence the days of “as a service” are bound
to see a litany of powerful innovations, transformations, and even a few
disruptions. Cloud broker solutions are being recognized as the best fit for
meeting this widely expressed need for ITaaS.
Embracing the Cloud Idea The cloud paradigm is attracting a lot of attention
because of its direct and decisive contribution toward highly optimized and
organized IT environments. However, the journey to the cloud is beset with many
barriers. In particular, identifying which application workloads give better
results in which cloud environments is a tedious and tough job. Here, a
full-fledged cloud broker plays a vital role in shaping the cloud strategy and
its implementation.
Ticking Toward Self-Service IT IT is increasingly expected to be business and
people friendly. For working with IT solutions and availing IT-enabled services,
the interfaces have to be informative, intuitive, and intelligent so as to give a
simplified and streamlined experience to various users. Automation has to be an
inherent and important trait of cloud offerings. Cloud brokers are being
positioned as the principal instrument for quick and easy servicing of cloud
infrastructures, platforms, and applications.
Enabling the Shadow IT Requirements The IT organizations of worldwide
enterprises struggle to provide the required capabilities with the same level
of agility and flexibility as public clouds provide. Even their own on-premise
private clouds, with all the cloud-enabled IT infrastructures, do not provide the
variety, simplicity, and consumability of public clouds because legacy
workflows, manual interpretation and intervention, and business procurement
requirements often slow down delivery. These challenges increasingly drive
business users to search for and procure various IT capabilities without
involving the core IT team of their organizations. That is, different departments
and users within a corporation fulfill their IT requirements on their own from
various public clouds. They circumvent the core IT team, and this industry trend
is called shadow IT. Users adopt a shadow IT model because public clouds
guarantee on-demand resources, and this, in turn, lays a stimulating foundation
for accelerating innovation and improving time to market for newer and premium
offerings.
However, shadow IT is beset with risks and challenges, and there is intense
pressure on the IT divisions of business houses to address this in a structured
and smart manner. Many IT organizations don’t know what cloud services their
employees are using. The IT team doesn’t know where data resides, whether datasets
are safeguarded accordingly, whether data and applications are backed up to
support data and disaster recovery, whether the capabilities will scale in line
with fast-evolving business sentiments, and what the costs are. Thus, it is
becoming mandatory for business behemoths to address the issue of shadow IT by
offering a compelling alternative. Traditional IT centers and even private clouds
need to be empowered to deliver everything that public clouds deliver, namely
on-demand, online, off-premise, consolidated, shared, automated, virtualized, and
containerized IT services. In effect, IT organizations have to offer shadow IT
capabilities and benefits without the identified risks, and cloud brokerage
solutions are widely prescribed for exactly this purpose. With IBM cloudMatrix,
IT organizations can devise a pragmatic approach to discover existing resources,
provide visibility to new resources, and offer an equivalent alternative to
shadow IT. Organizations can start small and then extend capabilities and
functionality as desired.
Establishing and Managing Multi-cloud Environments There are integration
engines enabling distributed and different clouds to find, bind, and leverage one
another’s unique features. Clouds are increasingly federated to accomplish
special business needs. Clouds are being made interoperable through
technology-centric standardization so that the vision of the Intercloud becomes a
reality sooner rather than later. There are a few interesting new nomenclatures
such as open, delta, and interoperable clouds. Beyond virtualization, the
containerization paradigm is set to flourish as industry-strength standards for
containerization are being worked out. Docker-enabled containerization yields
containerized applications that are known for their portability, fulfilling the
mantra of “make once and run everywhere.” Developing, shipping, deploying,
managing, and enhancing containerized workloads are made simpler and faster with
the open-source Docker platform. All these clearly indicate that disparate and
distributed cloud environments are being integrated at different levels in order
to set everything right for the ensuing days of people-centric and
knowledge-filled services that meet the varying personal, social, and
professional needs of society. Several business imperatives call for hybrid and
multi-cloud environments. It is envisaged that geographically distributed and
different cloud environments (on-premise clouds, traditional IT environments, and
online, on-demand, and off-premise clouds) need to be integrated with one another
in order to fulfill varying business requirements.
Having watched and analyzed the market sentiments and business environments,
we can safely predict that multi-cloud environments will become the new normal in
the days ahead. We need industry-strength and standardized integration and
orchestration engines, multi-cloud management platforms, and a host of other
associated empowerments in order to make multi-cloud environments a worthy
addition for next-generation business behemoths. Clouds bring in IT agility,
adaptivity, and affordability that in turn help businesses stay right and
relevant for their constituents, customers, and consumers. The cloud brokerage
solution is being touted as the most significant entity for presenting a
synchronized, simplified, and smart front end for a growing array of
heterogeneous generic as well as specific clouds. That is, cloud consumers need
not interact with multiple and differently enabled clouds. Instead, users
interact, at any time and from anywhere, with the cloud broker portal to get
things done smoothly and in time with just a few clicks.
In conclusion, the role of a cloud broker is to significantly transform IT service
delivery while ensuring the much-demanded IT agility and control. A cloud broker
enables cloud consumers to access and use multiple cloud service providers (CSPs)
and their distinct services. Further, a cloud broker can also take care of
service delivery, fulfillment, API handling, configuration management, resource
behavior differences, and other complex tasks. The broker helps users make
informed decisions when selecting cloud infrastructures, platforms, processes,
and applications. These decisions typically take into account cost, compliance,
utility, governance, auditability, etc. Cloud brokers simplify the procedures and
accelerate cloud adoption and adaptation. The IT operating model is bound to go
through a number of transformations and disruptions through the smart leverage of
cloud brokerage solutions, which tame the cloud complexity. In short, the
digital, API, idea, and insightful economy and era are bound to go through a
radical transformation through cloud services and brokerage solutions.
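The idea of a broker acting as a single front end that hides provider-specific API handling can be sketched with a simple adapter arrangement. The Python example below is a hedged illustration only: the class names, the `provision` method, and the two fake providers are hypothetical and do not correspond to IBM cloudMatrix or to any real provider SDK.

```python
from abc import ABC, abstractmethod


class CloudAdapter(ABC):
    """Hypothetical provider-specific adapter hidden behind the broker."""

    @abstractmethod
    def provision(self, service: str) -> str:
        """Provision a service and return an opaque identifier."""


class FakePublicCloud(CloudAdapter):
    # Stand-in for a public provider; a real adapter would call its API.
    def provision(self, service: str) -> str:
        return f"public:{service}"


class FakePrivateCloud(CloudAdapter):
    # Stand-in for an on-premise private cloud.
    def provision(self, service: str) -> str:
        return f"private:{service}"


class CloudBroker:
    """Single entry point that routes requests to registered adapters."""

    def __init__(self):
        self._adapters = {}

    def register(self, name: str, adapter: CloudAdapter) -> None:
        self._adapters[name] = adapter

    def provision(self, provider: str, service: str) -> str:
        if provider not in self._adapters:
            raise KeyError(f"unknown provider: {provider}")
        return self._adapters[provider].provision(service)


broker = CloudBroker()
broker.register("public", FakePublicCloud())
broker.register("private", FakePrivateCloud())
print(broker.provision("public", "vm"))
```

Consumers call only `CloudBroker.provision`, so adding another provider means registering one more adapter rather than changing consumer code.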
A cloud service broker operationalizes best execution venue (BEV) strategies,
which are based on the notion that every class of IT-related business need has an
environment where it will best balance performance and cost and that the IT
organization should be able to select that environment (or even have the
application select it automatically). Brokers thus enable any organization to
create the “right mix” of resources for its hybrid IT environment. The strategic
goal of doing more with less is reinforced by all the accomplishments in the
cloud space. Cloud users expect to be able to make decisions about how and where
to run applications and from where to source services based upon workload
profile, policies, and SLA requirements. As the worlds of outsourcing, hosting,
managed services, and cloud converge, the options are growing exponentially. BEV
strategies enable users to find the most suitable services to meet their needs.
The cloud broker is the key element in operationalizing this approach.
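A best execution venue decision can be pictured as a constrained selection: filter the candidate environments by policy and SLA, then pick the one with the best cost. The following Python sketch works under that simplifying assumption. The `Venue` fields, the latency bound standing in for an SLA, and the sample venues are all hypothetical; real brokers weigh many more attributes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Venue:
    """A candidate execution environment (fields are illustrative)."""
    name: str
    cost_per_hour: float
    latency_ms: float
    compliant: bool  # whether the venue satisfies enterprise policy


def best_execution_venue(venues, max_latency_ms, require_compliance=True):
    """Return the cheapest venue meeting the SLA and policy, or None."""
    eligible = [
        v for v in venues
        if v.latency_ms <= max_latency_ms
        and (v.compliant or not require_compliance)
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda v: v.cost_per_hour)


venues = [
    Venue("private-dc", cost_per_hour=0.30, latency_ms=5.0, compliant=True),
    Venue("public-a", cost_per_hour=0.10, latency_ms=40.0, compliant=True),
    Venue("public-b", cost_per_hour=0.05, latency_ms=35.0, compliant=False),
]
choice = best_execution_venue(venues, max_latency_ms=50.0)
print(choice.name if choice else "no venue qualifies")
```

Tightening the latency bound or relaxing the compliance requirement changes the outcome, which mirrors how workload profile and policy drive the venue choice described above.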
We have detailed how next-generation cloud broker solutions can do justice to
the abovementioned hybrid IT requirements. In the following sections, we detail
how IBM cloudMatrix is emerging as the strategic software suite for cloud
brokerage needs.
• Reduce the costs of cloud services (30–40% estimated savings by using cloud
brokers)
• Integrate multiple IT environments – existing and cloud environments, e.g.,
establish hybridity – as well as integrate services from multiple cloud providers
• Understand what public cloud services are available via a catalog
• Provide a policy-based service catalog populated with only the cloud services
that an enterprise wants its employees to purchase
• Unify the purchase of cloud services (broker) and help the services selected
(by clients) work better together
• Assess current applications for cloud readiness
• Ensure cloud services meet enterprise policies
• Ensure data sovereignty laws are followed
• Cloud brokers
– Cover all layers of the cloud stack (IaaS, PaaS, and SaaS)
– Offer multiple deployment models: on-premises (local) and off-premises
(dedicated or shared). IBM supports all of these “as a service” deployment
models but does not currently offer a traditionally licensed software product.
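Two of the items above, the policy-based service catalog and the data sovereignty check, amount to filtering provider offerings against enterprise policy. The Python sketch below shows one minimal way to express that. The offering fields (`name`, `layer`, `region`), the sample entries, and the policy sets are hypothetical and are not drawn from any real catalog.

```python
# Hypothetical offerings as a broker might list them before policy filtering.
OFFERINGS = [
    {"name": "compute-eu", "layer": "IaaS", "region": "eu"},
    {"name": "compute-us", "layer": "IaaS", "region": "us"},
    {"name": "app-platform-eu", "layer": "PaaS", "region": "eu"},
    {"name": "crm-us", "layer": "SaaS", "region": "us"},
]


def approved_catalog(offerings, allowed_regions, allowed_layers):
    """Keep only offerings whose region and stack layer satisfy the policy."""
    return [
        offering for offering in offerings
        if offering["region"] in allowed_regions
        and offering["layer"] in allowed_layers
    ]


# Example policy: data must stay in the EU, and only IaaS and PaaS are approved.
catalog = approved_catalog(OFFERINGS, {"eu"}, {"IaaS", "PaaS"})
print([offering["name"] for offering in catalog])
```

Employees browsing the filtered catalog see only services the enterprise has approved, which is the effect the policy-based catalog item describes.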
11.10 Conclusion
Cloud technology has matured and opened up newer possibilities and opportunities
in the form of pragmatic hybrid clouds, which in turn comprise elegant private
clouds and elastic public clouds. Organizations are increasingly leaning toward
hybrid clouds in order to reap the combined benefits of both public and private
clouds. This document has explained the unique advantages to be accrued from
hybrid clouds. Enterprises, considering the distinct requirements of their
workloads, are consciously embracing hybrid clouds. There are several hybrid
cloud service providers in the market with different capabilities, and this
document has articulated the various competencies of those providers in order to
help corporates worldwide make an informed decision.
Appendix
vCloud Air ecosystem of several thousand partners, with companies like CenturyLink
and Claranet leading the charge.
Google
Google competes primarily with AWS and Microsoft Azure in the public cloud
space, with its Google Cloud Platform. Like AWS, Google relies on a deep partner
network to help fill out its hybrid cloud solution, but the size and customer base of
Google Cloud Platform earned it a top spot on this list. With its background in data,
Google tools like BigQuery are useful additions for the data-savvy ops team. And,
given that Google shares many of the same partners that AWS utilizes in its hybrid
cloud, users can expect similar types of integrations to be available.
Rackspace
Rackspace is another hybrid cloud vendor that works with a host of other vendors
and products. Known for its focus on infrastructure, Rackspace offers dedicated
database and application servers and dedicated firewalls for added security.
Rackspace’s hybrid cloud solution is held together by RackConnect, which essen-
tially links an organization’s public and private clouds. While it does offer VPN
bursting and dedicated load balancing, Rackspace’s catalog of additional tools and
applications isn’t as comprehensive as some of the competition.
Hewlett Packard Enterprise
HPE’s Helion offering is focused on what it calls the Right Mix, where businesses
can choose how much of their hybrid strategy will be public and how much will be
private. HPE’s private cloud solutions have a strong basis in open technologies,
including major support for OpenStack. However, the company also leverages its
partnerships with AWS and Microsoft Azure, among others, to provide some of the
public cloud aspects of its hybrid cloud offering.
EMC
EMC’s strength in hyper-convergence and its plethora of storage options make it a
good vendor for operation-heavy organizations that like to play a part in building
out their own solutions. In terms of hyper-convergence, EMC has made many strides
in the hardware space with solutions such as the VCE VxRack, VxBlock, and Vblock.
The company also offers a wide range of security options but still relies on
partners to provide the public cloud end of the deal.
IBM
IBM’s Bluemix hybrid cloud is a valuable option, thanks to its open architecture,
focus on developer and operations access, and catalog of tools available through the
public cloud. Organizations looking to more effectively leverage data will find
Watson and the IoT tools especially helpful. Using a product called Relay, IBM is
able to make your private cloud and public cloud look similar, increasing transpar-
ency and helping with DevOps efforts. The company’s admin console and syndi-
cated catalog are also helpful in working between public and private clouds.
Verizon Enterprise
What many in IT don’t realize is that most of the major telecom providers have
cloud offerings of their own. Verizon Enterprise, the business division of Verizon,
offers three customizable cloud models including a hybrid solution. Verizon
Enterprise has a strong product in terms of disaster recovery and cloud backup. It
also has a cloud marketplace and offers authorized Oracle integrations on Verizon
cloud deployments.
Fujitsu
Fujitsu is another hybrid cloud provider built on another vendor’s offering – in this
case Microsoft Azure. Fujitsu Hybrid Cloud Services (FHCS) are a combination of
Fujitsu’s Public S5 cloud, running on Azure, and a private cloud, which is powered
by Microsoft Hyper-V, and can be deployed on client side or in a Fujitsu data center.
The offering provides standard tools like workload bursting, as well as the ability to
split a workload by geography.
CenturyLink
CenturyLink is another telecom company that provides cloud services. The com-
pany advertises its service as a public cloud that is “hybrid-ready.” Since it basically
only provides the public cloud portion of a hybrid cloud deployment, CenturyLink
is focused heavily on integrating with existing systems. Automation and container-
ization tools make it a good fit for shops that are exploring DevOps.
NTT
Japanese telecom giant NTT (Nippon Telegraph and Telephone) might fly under the
radar by most IT leaders’ standards, but it shouldn’t be overlooked. The company’s
hybrid cloud solution is focused on security and privacy, with HIPAA, FISMA, and
PCI compliance at the forefront. NTT’s hybrid cloud has enhanced monitoring and
additional security via Trend Micro. A plethora of optional features is available to
further customize the deployment.
Cisco
Much like VMware, Cisco is known for its private cloud products and offers hybrid
solutions through a partner network. Customers stitch their clouds together with the
Cisco Intercloud Fabric, which allows users to manage workloads across their
clouds. Cisco’s partner network includes companies like Accenture, AT&T, and
CDW, among many others.
CSC
Another up and comer in the hybrid cloud space is technology and professional
services provider CSC. CSC’s BizCloud is its private cloud component, and it part-
ners with companies like AWS to provide the public cloud layer. CSC’s big focus is
on its Agility Platform, which connects different clouds together. The company uses
adapters to make it easy to work with multiple providers.
Hitachi
Hitachi offers cloud storage on demand and compute as a service via its Hitachi
Data Systems (HDS) division. Solutions are offered in outcome-based service-level
agreements with a focus on customer choice. Hitachi also offers convergence tools
and is a gold member of OpenStack, which signifies its commitment to open
technologies.