Microsoft Cloud Adoption Framework for Azure
The Microsoft Cloud Adoption Framework for Azure is proven guidance that's designed to help you create and
implement the business and technology strategies necessary for your organization to succeed in the cloud. It
provides best practices, documentation, and tools that cloud architects, IT professionals, and business decision
makers need to successfully achieve short-term and long-term objectives.
By using the Microsoft Cloud Adoption Framework for Azure best practices, organizations can better align their
business and technical strategies to ensure success. Watch the following video to learn more.
The Cloud Adoption Framework brings together cloud adoption best practices from Microsoft employees, partners,
and customers. It provides a set of tools, guidance, and narratives that help shape technology, business, and people
strategies for driving desired business outcomes during your cloud adoption effort. Review the guidance for each
methodology below, providing you with easy access to the right guidance at the right time.
Intended audience
This guidance affects the business, technology, and culture of enterprises. The affected roles include line-of-
business leaders, business decision makers, IT decision makers, finance, enterprise administrators, IT operations, IT
security and compliance, IT governance, workload development owners, and workload operations owners. Each
role uses its own vocabulary, and each has different objectives and key performance indicators. A single set of
content can't address all audiences effectively.
Enter the cloud architect. The cloud architect serves as the thought leader and facilitator to bring these audiences
together. We've designed this collection of guides to help cloud architects facilitate the right conversations with the
right audiences and drive decision-making. Business transformation that's empowered by the cloud depends on the
cloud architect role to help guide decisions throughout the business and IT.
Each section of the Cloud Adoption Framework represents a different specialization or variant of the cloud architect
role. These sections also create opportunities to share cloud architecture responsibilities across a team of cloud
architects. For example, the governance section is designed for cloud architects who have a passion for mitigating
technical risks. Some cloud providers refer to these specialists as cloud custodians; we prefer the term cloud
guardian or, collectively, the cloud governance team.
ARTICLE | DESCRIPTION
Compare common operating models | This article is the primary guide for comparing operating models and choosing a course of action.
Understand cloud operating models | A primer for making important decisions regarding your operating model.
Define your operating model with CAF | The Cloud Adoption Framework is an incremental guide to building out your environment and adopting the cloud within your chosen operating model. This article creates a frame of reference to understand how the various methodologies support the development of your operating model.
ARTICLE | DESCRIPTION
Partner landing zones | Review and compare Azure landing zone offers from your partner.
ARTICLE | DESCRIPTION
Implementation options | Updated to add partner landing zone options to the existing Azure landing zone implementation options.
NOTE
The new partner landing zone articles don't specify how a partner should define or implement a landing zone. Instead, they're designed to add structure to a complex conversation so you can better understand the partner offer. This list of questions and minimum evaluation criteria can also be used to compare offers from potential partners. Some partners are also using it to communicate the value of their Azure landing zone implementation options more clearly.
ARTICLE | DESCRIPTION
Windows Virtual Desktop | This scenario enables productivity boosts and accelerates the migration of various workloads to support the end-user experience.
Azure Stack | Learn about deploying Azure in your datacenter by using Azure Stack Hub.
ARTICLE | DESCRIPTION
Analytics solution for Teradata, Netezza, Exadata | Learn about migrating legacy on-premises environments, including Teradata, Netezza, and Exadata, to modern analytics solutions.
High availability for Azure Synapse | Learn about one of the key benefits of a modern cloud-based infrastructure: built-in high availability and disaster recovery.
Schema migration data definition languages (DDL) | Learn about the database objects and associated processes when preparing to migrate existing data.
ARTICLE | DESCRIPTION
Azure innovation guide: Innovate with AI | Learn how you can innovate with AI and find the best solution based on your implementation needs.
AI in the Cloud Adoption Framework | Review a prescriptive framework that includes the tools, programs, and content (best practices, configuration templates, and architecture guidance) to simplify adoption of AI and cloud-native practices at scale.
MLOps with Azure Machine Learning | Learn about machine learning operations (MLOps) best practices.
ARTICLE | DESCRIPTION
Azure landing zones | Azure landing zones create a common set of design areas and implementation options to accelerate environment creation aligned to the cloud adoption plan and cloud operating model. This new article defines Azure landing zones more clearly.
Azure landing zones: Design areas | All Azure landing zones share a common set of eight design areas. Before deploying any Azure landing zone, customers should consider each of these design areas to make critical decisions.
Azure landing zones: Implementation options | Choose the best Azure landing zone implementation option, depending on your cloud adoption plan and cloud operating model.
The existing CAF blueprint definitions and CAF Terraform modules provide a starting point for Azure landing zone
implementation. However, some customers need a richer implementation option that can meet the demands of
enterprise-scale cloud adoption plans. This release adds CAF enterprise-scale to the Azure landing zone
implementation options to fill that need. The following articles can help you get started with the CAF enterprise-scale architecture and reference implementations.
ARTICLE | DESCRIPTION
Implement CAF enterprise-scale landing zones | Rapid implementation options and GitHub examples.
Enterprise-scale design principles | Understand the architectural design principles that guide decisions during implementation, to evaluate whether this approach fits your cloud operating model.
Enterprise-scale design guideline | Evaluate the enterprise-scale guidelines for fulfilling the common design areas of Azure landing zones.
Partners are an important aspect of successful cloud adoption. Throughout the Cloud Adoption Framework
guidance, we have added references to show the important role that partners play and how customers can better
engage partners. For a list of validated CAF partners, see the CAF-aligned partner offers, Azure expert managed
service providers (MSPs), or advanced specialist partners.
ARTICLE | DESCRIPTION
Cloud Adoption Framework for Azure | The Cloud Adoption Framework landing page has been redesigned to make it easier to find the guidance, tools, learn modules, and programs that support a successful cloud adoption journey.
Get started with the Cloud Adoption Framework | Choose a getting started guide that's aligned with your cloud adoption goals. These common scenarios provide a roadmap through the Microsoft Cloud Adoption Framework for Azure.
Understand and document foundational alignment decisions | Learn about the initial decisions that every team involved in cloud adoption should understand.
Understand and align the portfolio hierarchy | Learn how a portfolio hierarchy shows how your workloads and supporting services all fit together.
How do Azure products support the portfolio hierarchy? | Learn about the Azure tools and solutions that support your portfolio hierarchy.
Tools and templates | Find the tools, templates, and assessments that can help you accelerate your cloud adoption journey.
April 4, 2020
Continued refinement of the Migrate methodology and the Ready methodology, to align them more tightly with feedback from Microsoft customers, partners, and internal programs.
Migrate methodology updates:
ARTICLE | DESCRIPTION
Migrate methodology | These changes streamline the phases of the migration effort (assess workloads, deploy workloads, and release workloads). The changes also remove the details regarding the migration backlog. Removing those details and referencing the Plan, Ready, and Adopt methodologies instead creates flexibility for various cloud adoption programs to better align with the methodology.
ARTICLE | DESCRIPTION
Refactor landing zones | New article: Drawing from Ready methodology workshops, this article demonstrates the theory of starting with an initial template, using decision trees and refactoring to expand the landing zone, and moving toward a future state of enterprise readiness.
Expand your landing zone | New article: Builds on the parallel iterations section of the refactoring article to show how various types of landing zone expansions would embed shared principles into the supporting platform. The original content for this overview has been moved to the basic landing zone considerations node in the table of contents.
Test-driven development (TDD) for landing zones | New article: The refactoring approach is much improved through the adoption of a test-driven development cycle to guide landing zone development and refactoring.
Landing zone TDD in Azure | New article: Azure governance tools provide a rich platform for TDD cycles or red/green tests.
Improve landing zone security | New article: Overview of the best practices in this section, related back to the TDD cycle.
Improve landing zone operations | New article: List of best practices in the Manage methodology, with a transition to that modular approach to improving operations, reliability, and performance.
Improve landing zone governance | New article: List of best practices related to the Govern methodology, with a transition to that modular approach to improving governance, cost management, and scale.
ARTICLE | DESCRIPTION
Start with enterprise scale | New article: Demonstrates an approach that shows the differences in the process when a customer starts with CAF enterprise-scale landing zone templates. This article helps customers understand the qualifiers that would support this decision.
ARTICLE | DESCRIPTION
Create your initial Azure subscriptions | New article: Create your initial production and nonproduction subscriptions, and decide whether to create sandbox subscriptions, as well as a subscription to contain shared services.
Create additional subscriptions to scale your Azure environment | Learn about reasons to create additional subscriptions, moving resources between subscriptions, and tips for creating new subscriptions.
Organize and manage multiple Azure subscriptions | Create a management group hierarchy to help organize, manage, and govern your Azure subscriptions.
ARTICLE | DESCRIPTION
Application development and deployment | New article: Provides checklists, resources, and best practices for planning application development, configuring CI/CD pipelines, and implementing site reliability engineering for Kubernetes.
Cluster design and operations | New article: Provides checklists, resources, and best practices for cluster configuration, network design, future-proof scalability, business continuity, and disaster recovery for Kubernetes.
Cluster and application security | New article: Provides checklists, resources, and best practices for Kubernetes security planning, production, and scaling.
March 2, 2020
In response to feedback about continuity in the migration approach through multiple sections of the Cloud
Adoption Framework, including Strategy, Plan, Ready, and Migrate, we've made the following updates. These
updates are designed to make it easier for you to understand planning and adoption refinements as you continue a
migration journey.
Strategy methodology updates:
ARTICLE | DESCRIPTION
Balance the portfolio | Moved this article to appear earlier in the Strategy methodology. This gives you visibility into the thought process earlier in the lifecycle.
Balancing competing priorities | New article: Outlines the balance of priorities across methodologies to help inform your strategy.
ARTICLE | DESCRIPTION
Assessment best practice | Moved this article to the new "best practices" section of the Plan methodology. This gives you visibility into the practice of assessing local environments earlier in the lifecycle.
ARTICLE | DESCRIPTION
What is a landing zone? | New article: Defines the term landing zone.
First landing zone | New article: Expands on the comparison of various landing zones.
CAF Migration landing zone | Separated the blueprint definition from the selection of the first landing zone.
CAF Terraform modules | Moved to the new "landing zone" section of the Ready methodology, to elevate Terraform in the landing zone conversation.
ARTICLE | DESCRIPTION
Classification during assess processes | New article: Outlines the importance of classifying every asset and workload prior to migration.
Test, optimize, and promote | Aligned the title of this article with other process improvement suggestions.
ARTICLE | DESCRIPTION
Assess overview | Updated to illustrate that the assessment in this phase focuses on assessing the technical fit of a specific workload and related assets.
The Cloud Adoption Framework can help you get started in several ways, so there are several different getting
started guides. This article groups the guides to help you find the one that best aligns with your current challenges.
Each of the following links takes you to questions that are typically asked when an organization is trying to
accomplish a certain goal during their cloud adoption journey.
Align foundational concepts to onboard a person, project, or team
Adopt the cloud to deliver business and technical outcomes sooner
Improve controls to ensure proper operations of the cloud
Establish teams to support adoption and operations
Align foundation
A company's cloud adoption journey is typically built on a set of foundational decisions that affect the outcomes of that journey. The following information can help you make core decisions and record them as a reference to be used during the cloud adoption lifecycle.
Get started aligning foundation decisions
How does Azure work
Fundamental concepts
Portfolio hierarchy
Azure hierarchy support
Accelerate adoption
Cloud adoption requires technical change, but digital transformation with the cloud requires more than just IT. Use these guides to start aligning various teams to accelerate migration and innovation efforts.
We want to migrate existing workloads to the cloud. This guide is a great starting point if your primary focus is
migrating on-premises workloads to the cloud.
We want to build new products and services in the cloud. This guide can help you prepare to deploy innovative
solutions to the cloud.
We're blocked by environment design and configuration. This guide provides a quick approach to designing and
configuring your environment.
Improve controls
As your cloud adoption journey progresses, a solid operating model can help ensure that wise decisions are made.
You'll also want to consider organizational change. These guides can help you align people and improve operations
to develop your cloud operating model.
GUIDE | DESCRIPTION
How do we deliver operational excellence during cloud transformation? | The steps in this guide can help the strategy team lead the organizational change management required to consistently ensure operational excellence.
How do we manage enterprise costs? | This guide can help you start optimizing enterprise costs and manage cost across the environment.
How do we consistently secure the enterprise cloud environment? | This guide can help ensure that the security requirements are applied across the enterprise to minimize risk of breach, and to accelerate recovery when a breach occurs.
How do we apply the right controls to improve reliability? | This guide helps minimize disruptions related to inconsistencies in configuration, resource organization, security baselines, or resource protection policies.
How do we ensure performance across the enterprise? | This guide can help you establish processes for maintaining performance across the enterprise.
Establish teams
Depending on your adoption strategy and operating model, you might need to establish a few teams. This section
helps you get those new teams started.
How do we align our organization? | This guide can help you establish an appropriately staffed organizational structure.
Do I need a cloud strategy team? | This team ensures that cloud adoption efforts progress in alignment with business outcomes.
What does a cloud adoption team do? | This team implements technical solutions outlined in the plan, in accordance with governance requirements.
How do I build a cloud governance team? | This team ensures that risks and risk tolerance are properly evaluated and managed.
How does a cloud operations team work? | This team focuses on monitoring, repairing, and remediating issues related to traditional IT operations and assets.
Get started: Understand and document foundational
alignment decisions
The cloud adoption journey can unlock many business, technical, and organizational benefits. Whatever you want
to accomplish, if your journey involves the cloud, there are a few initial decisions that every team involved should
understand.
NOTE
Selecting any of the following links might lead you to bounce around the table of contents for the Microsoft Cloud Adoption
Framework for Azure, looking for fundamental concepts that you'll use later to help the team implement the associated
guidance. Bookmark this page to come back to this checklist often.
The cloud strategy team is accountable for defining a way to view the portfolio. | Multiple teams will use the following guidance to create those views. Everyone involved in cloud adoption should know where to find the portfolio view to support decisions later in the process.
The cloud governance team is accountable for defining, enforcing, and automating the portfolio hierarchy to shape corporate policy in the cloud. | Everyone involved in the technical strategy for cloud adoption should be familiar with the portfolio hierarchy and the levels of the hierarchy in use today.
The cloud governance team is accountable for defining, enforcing, and automating the naming and tagging standards to ensure consistency across the portfolio. | Everyone involved in the technical strategy for cloud adoption should be familiar with the naming and tagging standards before deployment to the cloud.
The cloud governance team is accountable for defining, implementing, and automating the resource organization design across the portfolio. | Everyone involved in the technical strategy for cloud adoption should be familiar with the resource organization design before deployment to the cloud.
The cloud strategy team is accountable for aligning virtual or dedicated organizational structures to ensure success of the cloud adoption lifecycle. | Everyone involved in the cloud adoption lifecycle should be familiar with the alignment of people and levels of accountability.
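To make standards like these actionable, some teams script simple checks against them. The following Python sketch is purely illustrative: the naming prefixes and required tags shown are assumed example conventions, not Cloud Adoption Framework requirements; your cloud governance team defines the actual standards.

```python
# Illustrative sketch of checking resources against a simple naming and
# tagging standard. The prefix convention and required tags are examples only.
import re

REQUIRED_TAGS = {"workload", "env", "owner"}
NAME_PATTERN = re.compile(r"^(rg|vm|st|vnet)-[a-z0-9]+(-[a-z0-9]+)*$")

def check_resource(name: str, tags: dict[str, str]) -> list[str]:
    """Return a list of standard violations for one resource."""
    problems = []
    if not NAME_PATTERN.match(name):
        problems.append(f"name '{name}' does not match the naming convention")
    missing = REQUIRED_TAGS - tags.keys()
    if missing:
        problems.append(f"missing required tags: {sorted(missing)}")
    return problems

print(check_resource("rg-payroll-prod", {"workload": "payroll", "env": "prod", "owner": "finance"}))
print(check_resource("PayrollRG", {"env": "prod"}))
```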
What's next
Build on this set of fundamental concepts through the series of getting-started guides in this section of the Cloud
Adoption Framework.
Apply fundamental concepts to other getting-started guides
How does Azure work?
Azure is Microsoft's public cloud platform. Azure offers a large collection of services including platform as a
service (PaaS), infrastructure as a service (IaaS), and managed database service capabilities. But what exactly is
Azure, and how does it work?
Azure, like other cloud platforms, relies on a technology known as virtualization. Most computer hardware can be
emulated in software, because most computer hardware is simply a set of instructions permanently or semi-
permanently encoded in silicon. Using an emulation layer that maps software instructions to hardware
instructions, virtualized hardware can execute in software as if it were the actual hardware itself.
Essentially, the cloud is a set of physical servers in one or more datacenters that execute virtualized hardware on
behalf of customers. So how does the cloud create, start, stop, and delete millions of instances of virtualized
hardware for millions of customers simultaneously?
To understand this, let's look at the architecture of the hardware in the datacenter. Inside each datacenter is a
collection of servers sitting in server racks. Each server rack contains many server blades as well as a network
switch providing network connectivity and a power distribution unit (PDU) providing power. Racks are sometimes
grouped together in larger units known as clusters.
Within each rack or cluster, most of the servers are designated to run these virtualized hardware instances on
behalf of the user. But some of the servers run cloud management software known as a fabric controller. The fabric
controller is a distributed application with many responsibilities. It allocates services, monitors the health of the
server and the services running on it, and heals servers when they fail.
Each instance of the fabric controller is connected to another set of servers running cloud orchestration software,
typically known as a front end. The front end hosts the web services, RESTful APIs, and internal Azure databases
used for all functions the cloud performs.
For example, the front end hosts the services that handle customer requests to allocate Azure resources such as
virtual machines, and services like Azure Cosmos DB. First, the front end validates the user and verifies the user is
authorized to allocate the requested resources. If so, the front end checks a database to locate a server rack with
sufficient capacity and then instructs the fabric controller on that rack to allocate the resource.
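The following Python sketch is a deliberately simplified, conceptual model of that request flow: validate the caller, check authorization, find a rack with capacity, and delegate to that rack's fabric controller. The class names and data structures are illustrative assumptions and don't reflect Azure's actual internal implementation.

```python
# Conceptual sketch only: a highly simplified model of the allocation flow
# described above. Names and structures are illustrative, not Azure internals.
from dataclasses import dataclass

@dataclass
class Rack:
    name: str
    free_cores: int

    def allocate(self, cores: int) -> None:
        # In the description above, the rack's fabric controller would place
        # and start the virtual hardware here.
        self.free_cores -= cores
        print(f"Fabric controller on {self.name} allocating {cores} cores")

def handle_request(user: str, cores: int, racks: list[Rack], authorized: set[str]) -> bool:
    """Front-end flow: validate the user, locate capacity, delegate to a rack."""
    if user not in authorized:            # 1. validate and authorize the caller
        return False
    for rack in racks:                    # 2. find a rack with sufficient capacity
        if rack.free_cores >= cores:
            rack.allocate(cores)          # 3. instruct that rack's fabric controller
            return True
    return False                          # no capacity available

racks = [Rack("rack-01", free_cores=8), Rack("rack-02", free_cores=64)]
print(handle_request("alice@contoso.com", 16, racks, {"alice@contoso.com"}))
```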
So, fundamentally, Azure is a huge collection of servers and networking hardware running a complex set of distributed applications that orchestrate the configuration and operation of the virtualized hardware and software on those servers. This orchestration is what makes Azure so powerful: users are no longer responsible for maintaining and upgrading hardware, because Azure does all of it behind the scenes.
Next steps
Learn more about cloud adoption with the Microsoft Cloud Adoption Framework for Azure.
Learn about the Microsoft Cloud Adoption Framework for Azure
Azure fundamental concepts
Learn fundamental concepts and terms that are used in Azure, and how the concepts relate to one another.
Azure terminology
It's helpful to know the following definitions as you begin your Azure cloud adoption efforts:
Resource: An entity that's managed by Azure. Examples include Azure Virtual Machines, virtual networks, and
storage accounts.
Subscription: A logical container for your resources. Each Azure resource is associated with only one
subscription. Creating a subscription is the first step in adopting Azure.
Azure account: The email address that you provide when you create an Azure subscription is the Azure
account for the subscription. The party that's associated with the email account is responsible for the monthly
costs that are incurred by the resources in the subscription. When you create an Azure account, you provide
contact information and billing details, like a credit card. You can use the same Azure account (email address)
for multiple subscriptions. Each subscription is associated with only one Azure account.
Account administrator: The party associated with the email address that's used to create an Azure
subscription. The account administrator is responsible for paying for all costs that are incurred by the
subscription's resources.
Azure Active Directory (Azure AD): The Microsoft cloud-based identity and access management service.
Azure AD allows your employees to sign in and access resources.
Azure AD tenant: A dedicated and trusted instance of Azure AD. An Azure AD tenant is automatically created
when your organization first signs up for a Microsoft cloud service subscription like Microsoft Azure, Intune, or
Microsoft 365. An Azure tenant represents a single organization.
Azure AD directory: Each Azure AD tenant has a single, dedicated, and trusted directory. The directory
includes the tenant's users, groups, and apps. The directory is used to perform identity and access
management functions for tenant resources. A directory can be associated with multiple subscriptions, but
each subscription is associated with only one directory.
Resource groups: Logical containers that you use to group related resources in a subscription. Each resource
can exist in only one resource group. Resource groups allow for more granular grouping within a subscription,
and are commonly used to represent a collection of assets required to support a workload, application, or
specific function within a subscription.
Management groups: Logical containers that you use for one or more subscriptions. You can define a
hierarchy of management groups, subscriptions, resource groups, and resources to efficiently manage access,
policies, and compliance through inheritance.
Region: A set of Azure datacenters that are deployed inside a latency-defined perimeter. The datacenters are
connected through a dedicated, regional, low-latency network. Most Azure resources run in a specific Azure
region.
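As a concrete illustration of several of these terms (subscription, resource group, tags, region, and resources), the following minimal sketch uses the Azure SDK for Python to create a resource group and list its resources. It assumes the azure-identity and azure-mgmt-resource packages are installed and that you're already signed in (for example, via the Azure CLI); the subscription ID, names, region, and tags are placeholders.

```python
# Minimal sketch, assuming azure-identity and azure-mgmt-resource are installed
# and an authenticated session is available. All identifiers are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

subscription_id = "00000000-0000-0000-0000-000000000000"  # placeholder
credential = DefaultAzureCredential()
client = ResourceManagementClient(credential, subscription_id)

# A resource group is a logical container for related resources in one subscription.
rg = client.resource_groups.create_or_update(
    "rg-workload-prod",
    {"location": "eastus2", "tags": {"workload": "payroll", "env": "prod"}},
)
print(rg.name, rg.location)

# List the resources (assets) currently in the group.
for resource in client.resources.list_by_resource_group("rg-workload-prod"):
    print(resource.name, resource.type)
```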
NOTE
When you sign up for Azure, you might see the phrase create an Azure account. You create an Azure account when you
create an Azure subscription and associate the subscription with an email account.
NOTE
Most Azure resources are deployed to a specific region. Certain resource types are considered global resources, such as
policies that you set by using the Azure Policy services.
Related resources
The following resources provide detailed information about the concepts discussed in this article:
How does Azure work?
Resource access management in Azure
Azure Resource Manager overview
Role-based access control (RBAC) for Azure resources
What is Azure Active Directory?
Associate or add an Azure subscription to your Azure Active Directory tenant
Topologies for Azure AD Connect
Subscriptions, licenses, accounts, and tenants for Microsoft's cloud offerings
Next steps
Now that you understand fundamental Azure concepts, learn how to scale with multiple Azure subscriptions.
Scale with multiple Azure subscriptions
Understand and align the portfolio hierarchy
Business needs are often supported, improved, or accelerated through information technology. A collection of
technologies that delivers defined business value is called a workload. That collection might include applications,
servers or virtual machines, data, devices, and other similarly grouped assets.
Typically, a business stakeholder and technical leader share accountability for the ongoing support of each
workload. In some phases of the workload lifecycle, those roles are clearly stated. In more operational phases of a
workload's lifecycle, those roles might be transitioned to a shared operations management team or cloud
operations team. As the number of workloads increases, the roles (stated or implied) become more complex and
more matrixed.
Most businesses rely on multiple workloads to deliver vital business functions. The collection of workloads, assets,
and supporting factors (projects, people, processes, and investments) is called a portfolio. The matrix of business,
development, and operations staff requires a portfolio hierarchy to show how the workloads and supporting
services all fit together.
This article provides clear definitions for the levels of the portfolio hierarchy. The article aligns various teams with
the appropriate accountability in each layer, along with the source of the best guidance for that team to deliver on
the expectations for that level. Throughout this article, each level of the hierarchy is also called a scope.
Portfolio hierarchy
Workloads
Workloads and their supporting assets are at the core of any portfolio. The additional scopes or layers below
define how those workloads are viewed and to what extent they're affected by the matrix of potential supporting
teams.
Although the terms can vary, all IT solutions include assets and workloads:
Asset: The smallest unit of technical function that supports a workload or solution.
Workload: The smallest unit of IT support for the business. A workload is a collection of assets (infrastructure,
applications, and data) that supports a common business goal or the execution of a common business process.
When you're deploying your first workload, the workload and its assets might be the only defined scope. The other
layers might be explicitly defined as more workloads are deployed.
IT portfolio
When companies support workloads through matrixed approaches or centralized approaches, a broader hierarchy
likely exists to support those workloads:
Landing zones: Landing zones provide workloads with the necessary foundational utilities (or shared plumbing), delivered from a platform foundation, that are required to support one or more workloads. Landing zones are so critical in the cloud that the entire Ready methodology of the Cloud Adoption Framework focuses on landing zones. For a more detailed definition, see What is a landing zone?
Foundational utilities: These shared IT services are required for workloads to operate within the technology
and business portfolio.
Platform foundation: This organizational construct centralizes foundational solutions and helps ensure that
those controls are enforced for all landing zones.
Cloud platforms: Depending on the overall strategy for supporting the full portfolio, customers might need
multiple cloud platforms with distinct deployments of the platform foundation to govern multiple regions,
hybrid solutions, or even multicloud solutions.
Portfolio: Through a technology lens, the portfolio is a collection of workloads, assets, and supporting
resources that span all cloud platforms. Through a business lens, the portfolio is the collection of projects,
people, processes, and investments that support and manage the technology portfolio to drive business
outcomes. Together, these two lenses capture the portfolio.
Portfolio: The cloud strategy team and the cloud center of excellence (CCoE) use the Strategy and Plan
methodologies to guide decisions that affect the overall portfolio. The cloud strategy team is accountable for
the enterprise level of the cloud portfolio hierarchy. The cloud strategy team should also be informed of
decisions about the environment, landing zones, and high-priority workloads.
Cloud platforms: The cloud governance team is accountable for the disciplines that ensure consistency across
each environment in alignment with the Govern methodology. The cloud governance team is accountable for
governance of all resources in all environments. The cloud governance team should be consulted on changes
that might require an exception or change to governing policies. The cloud governance team should also be
informed of progress with workload and asset adoption.
Landing zones and cloud foundation: The cloud platform team is accountable for developing the landing
zones and platform utilities that support adoption. The cloud automation team is accountable for automating
the development of, and ongoing support for, those landing zones and platform utilities. Both teams use the
Ready methodology to guide implementation. Both teams should be informed of progress with workload
adoption and any changes to the enterprise or environment.
Workloads: Adoption happens at the workload level. Cloud adoption teams use the Migrate and Innovate
methodologies to establish scalable processes to accelerate adoption. After adoption is complete, the
ownership of workloads is likely transferred to a cloud operations team that uses the Manage methodology to
guide operations management. Both teams should be comfortable using the Microsoft Azure Well-Architected
Framework to make detailed architectural decisions that affect the workloads they support. Both teams should
be informed of changes to landing zones and environments. Both teams might occasionally contribute to
landing zone features.
Assets: Assets are typically the responsibility of the cloud operations team. That team uses the management
baseline in the Manage methodology to guide operations management decisions. It should also use Azure
Advisor and the Azure Well-Architected Framework to make detailed resource and architectural changes that
are required to deliver on operations requirements.
Accountability variants
Single environment: When an enterprise needs only one environment, a CCoE is typically not required.
Single landing zone: If an environment has only a single landing zone, the governance and platform
capabilities can likely be combined into one team.
Single workload: Some businesses need only one workload, or few workloads, in a single landing zone and a
single environment. In those cases, there's little need for a separation of duties between governance, platform,
and operations teams.
In Understanding and aligning the portfolio hierarchy, a set of definitions for the portfolio hierarchy and role
mapping established a hierarchy of scope for most portfolio approaches. As described in that article, you might
not need each of the outlined levels or scopes. Minimizing the number of layers reduces complexity, so these
layers shouldn't all be viewed as a requirement.
This article shows how each level or scope of the hierarchy is supported in Azure through organizational tools,
deployment and governance tools, and some solutions in the Microsoft Cloud Adoption Framework for Azure.
Portfolio: The enterprise or business unit probably won't contain any technical assets but might affect cost
decisions. The enterprise and business units are represented in the root nodes of the management group
hierarchy.
Cloud platforms: Each environment has its own node in the management group hierarchy.
Landing zones and cloud foundation: Each landing zone is represented as a subscription. Likewise,
platform foundations are contained in their own subscriptions. Some subscription designs might call for a
subscription per cloud or per workload, which would change the organizing tool for each.
Workloads: Each workload is represented as a resource group. Resource groups are often used to represent
solutions, deployments, or other technical groupings of assets.
Assets: Each asset is inherently represented as a resource in Azure.
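The following conceptual Python sketch expresses that mapping as a nested structure: management groups for the portfolio and environments, subscriptions for landing zones and the platform foundation, resource groups for workloads, and resources for assets. All names are illustrative placeholders; this is a model for discussion, not a deployment artifact.

```python
# Conceptual sketch: the portfolio hierarchy expressed as nested Azure
# organizing constructs. All names are illustrative placeholders.
portfolio = {
    "management_groups": {                        # portfolio / business units
        "contoso-root": {
            "landing-zones": {                    # environment node
                "subscriptions": {                # landing zones
                    "sub-corp-prod": {
                        "resource_groups": {      # workloads
                            "rg-payroll": ["vm-payroll-01", "sql-payroll"]  # assets
                        }
                    }
                }
            },
            "platform": {
                "subscriptions": {                # platform foundation
                    "sub-connectivity": {
                        "resource_groups": {"rg-hub-network": ["vnet-hub", "fw-hub"]}
                    }
                }
            },
        }
    }
}

def count_assets(node) -> int:
    """Walk the hierarchy and count leaf assets (resources)."""
    if isinstance(node, list):
        return len(node)
    if isinstance(node, dict):
        return sum(count_assets(child) for child in node.values())
    return 0

print(count_assets(portfolio))  # 4 assets across the portfolio
```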
Proper alignment of business and IT stakeholders helps to overcome migration roadblocks and accelerate
migration efforts. This article provides recommended steps for:
Stakeholder alignment
Migration planning
Deploying a landing zone
Migrating your first 10 workloads
It also helps you implement proper governance and management processes.
Use this guide to streamline the processes and materials required for aligning an overall migration effort. The
guide uses the methodologies of the Cloud Adoption Framework that are highlighted in this illustration.
If your migration scenario is atypical, you can get a personalized assessment of your organization's migration
readiness by using the strategic migration and readiness tool (SMART) assessment. Use it to identify the guidance
that best aligns to your current needs.
Get started
The technical effort and process required to migrate workloads is relatively straightforward. It's important to
complete the migration process efficiently. Strategic migration readiness has an even bigger impact on the
timelines and successful completion of the overall migration.
To accelerate adoption, you must take steps to support the cloud adoption team during migration. This guide
outlines these iterative tasks to help customers start on the right path toward any cloud migration. To show the
importance of the supporting steps, migration is listed as step 10 in this article. In practice, the cloud adoption
team is likely to begin their first pilot migration in parallel with steps 4 or 5.
Cloud migration tools can migrate all the virtual machines in a datacenter in one pass or iteration, but it's more common to migrate a smaller number of workloads during each iteration. Breaking up the migration into smaller
increments requires more planning, but it reduces technical risks and the impact of organizational change
management.
With each iteration, the cloud adoption team gets better at migrating workloads. These steps help the technical
team mature their capabilities:
1. Migrate your first workload in a pure infrastructure as a service (IaaS) approach by using the tools outlined in
the Azure migration guide.
2. Expand tooling options to use migration and modernization by using the migration examples.
3. Develop your technical strategy by using broader approaches outlined in Azure cloud migration best practices.
4. Improve consistency, reliability, and performance through an efficient migration-factory approach as outlined
in Migration process improvements.
Deliverables:
Continuous improvement of the adoption team's ability to migrate workloads.
Value statement
These steps help teams accelerate their migration efforts through better change management and stakeholder
alignment. These steps also remove common blockers and realize business value more quickly.
Next steps
The Cloud Adoption Framework is a lifecycle solution that helps you begin a migration journey. It also helps
mature the teams that support migration efforts. The following teams can use these next steps to continue to
mature their capabilities. These parallel processes aren't linear and shouldn't be considered blockers. Instead, each is a parallel value stream to help improve your organization's overall cloud readiness.
TEAM | NEXT ITERATION
Cloud adoption team | Use the migration model to learn about moving toward a migration factory that provides efficient ongoing migration capabilities.
Cloud strategy team | Iteratively improve the Strategy methodology and the Plan methodology along with the adoption plan. Review these overviews and continue iterating on your business and technical strategies.
Cloud platform team | Revisit the Ready methodology to continue to advance the overall cloud platform that supports migration or other adoption efforts.
If your migration scenario is atypical, you can get a personalized assessment of your organization's migration
readiness by using the strategic migration and readiness tool (SMART) assessment. The answers you provide help
identify which guidance aligns best with your current needs.
Get started: Accelerate new product and service
innovation in the cloud
Creating new products and services in the cloud requires a different approach than migration requires. The
Innovate methodology of the Cloud Adoption Framework establishes an approach that guides the development of
new products and services.
This guide uses the sections of the Cloud Adoption Framework that are highlighted in the following illustration.
Innovation is less predictable than a standard migration, but it still fits within the context of the broader cloud
adoption plan. This guide can help your enterprise provide the support needed to innovate and provide a structure
for creating a balanced portfolio throughout cloud adoption.
Value statement
The steps outlined in this guide can help you and your teams create innovative solutions in the cloud that create
business value, are governed appropriately, and are well architected.
Next steps
The Cloud Adoption Framework is a lifecycle solution. It can help your organization start an innovation journey and advance the maturity of the teams that support innovation efforts.
The following teams can use these next steps to continue to advance the maturity of their efforts. These parallel
processes aren't linear and shouldn't be viewed as blockers. Instead, each is a parallel value stream to help mature
your company's overall cloud readiness.
TEAM | NEXT ITERATION
Cloud strategy team | The Strategy methodology and the Plan methodology are iterative processes that evolve with the adoption plan. Return to these overview pages and continue to iterate on your business and technical strategies.
Cloud platform team | Revisit the Ready methodology to continue to advance the overall cloud platform that supports migration or other adoption efforts.
Environment design and configuration are the most common blockers to adoption efforts that are focused on
migration or innovation. Quickly implementing a design that supports your long-term adoption plan can be
difficult. This article establishes an approach and series of steps that help to overcome common blockers and
accelerate your adoption efforts.
The technical effort required to create an effective environmental design and configuration can be complex. You
can manage the scope to improve the odds of success for the cloud platform team. The greatest challenge is
alignment among multiple stakeholders. Some of these stakeholders have the authority to stop or slow the
adoption efforts. These steps outline ways to quickly meet short-term objectives and establish long-term success.
Value statement
The steps outlined in this guide can help you and your teams accelerate their path to an enterprise-ready cloud
environment that's properly configured.
Next steps
Consider these next steps in a future iteration to build on your initial efforts:
Environmental technical readiness learning paths
Migration environment planning checklist
Enable customer success with a sound operating
model and organizational alignment
Customer success in cloud adoption efforts often has little to do with technical skills or adoption-related projects.
Your operating model creates opportunities to enable adoption or roadblocks that might slow down cloud
adoption.
Alignment
As you drive innovation, alignment between business and technical teams is paramount to the success of your
solution.
For business stakeholders, we've created the Microsoft AI Business School to support business strategy
development and provide example best practices.
For technical stakeholders, the Microsoft AI learning paths are available to help you build new AI skills.
Blockers
When adoption of the cloud is slowed or stalled, it might be wise to evaluate your operating model to enable
continued success. When success is inconsistent from workload to workload or project to project, the operating
model might be misaligned. If more than one project is stalled by blocking policies, outdated processes, or
misalignment of people, the operating model is likely blocking success.
Opportunities
Beyond the common blockers, a few key opportunities can be scaled across the portfolio through incremental
improvements to your operating model. In particular, customers commonly want to scale operational excellence,
cost optimization, security, reliability, performance, or people management. Scaling these conversations at the
portfolio level can help bring best practices for specific workload-focused teams to all other projects and
workloads.
GUIDE | DESCRIPTION
How do we deliver operational excellence during cloud transformation? | The steps in this guide will help the strategy team lead organizational change management to consistently ensure operational excellence.
How do we manage enterprise costs? | Start optimizing enterprise costs and manage cost across the environment.
How do we consistently secure the enterprise cloud environment? | This getting started guide can help ensure that the proper security requirements have been applied across the enterprise to minimize risk of breach and accelerate recovery when a breach occurs.
How do we apply the right controls to improve reliability? | This getting started guide helps minimize disruptions related to inconsistencies in configuration, resource organization, security baselines, or resource protection policies.
How do we ensure performance across the enterprise? | This getting started guide can help you establish processes for maintaining performance across the enterprise.
How do we align our organization? | This getting started guide can help you establish an appropriately staffed organizational structure.
These principles are shared across Azure Advisor, the Microsoft Azure Well-Architected Framework, and solutions
in the Azure Architecture Center:
Azure Advisor evaluates the principles for individual assets across solutions, workloads, and the full portfolio.
Azure Architecture Center applies these principles to develop and manage specific technical solutions.
Microsoft Azure Well-Architected Framework helps balance these principles across a workload, to guide
architecture decisions.
Cloud Adoption Framework ensures that the principles scale across the portfolio to enable adoption teams
through a well-managed environment.
Get started: Deliver operational excellence during
digital transformation
How do you ensure operational excellence during digital transformation? Operational excellence is a business
function that directly affects IT decisions. To achieve operational excellence, you must focus on customer and
stakeholder value by keeping an eye on revenue, risk, and cost impacts.
This organizational change management approach requires:
A defined strategy.
Clear business outcomes.
Change management planning.
From a cloud perspective, you can manage the impact of risk and cost by making post-adoption changes and
continuously refining operational processes. Areas to monitor include systems automation, IT operations
management practices, and Resource Consistency discipline throughout the cloud adoption lifecycle.
The steps in this article can help the strategy team lead the organizational change management that's required to
consistently ensure operational excellence.
Operational excellence across the enterprise and portfolio starts with peer processes of strategy and planning to
align and report on organizational change management expectations. The following steps help technical teams
deliver the disciplines required to achieve operational excellence.
Value statement
The previous steps outline a business-led approach to establish operational excellence requirements throughout
digital transformation. This approach provides a consistent foundation that carries through other operating model
functions.
Next steps to delivering operational excellence across the portfolio
Operational excellence requires a disciplined approach to reliability, performance, security, and cost optimization.
Use the remaining guidance in this series to implement these principles through consistent approaches to
automation.
Cost optimization: Continuously optimize operating costs by using the getting started guide on managing
enterprise costs
Security: Reduce risk by integrating enterprise security across the portfolio by using the getting started guide
on implementing security across the portfolio.
Performance management: Ensure IT asset performance supports business processes by using the getting
started guide on performance management across the enterprise.
Reliability: Improve reliability and reduce business disruptions by using the getting started guide on
implementing controls to create reliability.
Get started: Manage cloud costs
The Cost Management discipline of cloud governance focuses on establishing budgets, monitoring cost allocation
patterns, and implementing controls to improve cloud spending behaviors across the IT portfolio. Enterprise cost
optimization involves many other roles and functions to minimize cost and balance the demands of scale,
performance, security, and reliability. This article maps those various supporting functions into a getting started
guide that helps create alignment among the involved teams.
Governance is the cornerstone of cost optimization within any large enterprise. The following section outlines cost
optimization guidance within the context of governance. The subsequent steps help each team take actions that
target its role in cost optimization. Together, these steps will help your organization get started on a journey
toward cost optimization.
The governance team can detect and drive significant cost optimization across most enterprises. Basic, data-driven
resource sizing can have an immediate and measurable impact on costs.
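As a simple illustration of data-driven sizing, the following Python sketch flags assets whose observed utilization falls below a review threshold and estimates the potential savings. The utilization figures, threshold, and savings assumption are illustrative placeholders only; in practice, Azure Advisor and Azure Cost Management surface right-sizing recommendations based on actual usage data.

```python
# Illustrative sketch of basic, data-driven resource sizing: flag assets whose
# observed utilization suggests a smaller (cheaper) size. The utilization
# figures and the 30% threshold are sample values, not recommendations.
vm_utilization = {
    "vm-web-01": {"avg_cpu_percent": 12, "monthly_cost_usd": 280.0},
    "vm-web-02": {"avg_cpu_percent": 64, "monthly_cost_usd": 280.0},
    "vm-batch-01": {"avg_cpu_percent": 8, "monthly_cost_usd": 520.0},
}

RIGHT_SIZE_THRESHOLD = 30  # percent average CPU below which the size is reviewed

candidates = {
    name: data
    for name, data in vm_utilization.items()
    if data["avg_cpu_percent"] < RIGHT_SIZE_THRESHOLD
}

# Assume, purely for illustration, that dropping one size roughly halves the cost.
estimated_monthly_savings = sum(d["monthly_cost_usd"] * 0.5 for d in candidates.values())

for name in candidates:
    print(f"Review {name} for right-sizing")
print(f"Estimated monthly savings: ${estimated_monthly_savings:,.2f}")
```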
As discussed in Build a cost-conscious organization, an enterprise-wide focus on cost management and cost
optimization can deliver much more value. The following steps demonstrate ways the various teams can help build
a cost-conscious organization.
Value statement
Following these steps helps you build a cost-conscious organization. Simplify cost optimization by using shared
ownership and driving collaboration with the right teams at the right times.
Get started: Implement security across the enterprise
environment
Security helps create assurances of confidentiality, integrity, and availability for a business. Security efforts have a
critical focus on protecting against the potential impact to operations caused by both internal and external
malicious and unintentional acts.
This getting started guide outlines the key steps that will mitigate or avoid the business risk from cybersecurity
attacks. It can help you rapidly establish essential security practices in the cloud and integrate security into your
cloud adoption process.
The steps in this guide are intended for all roles that support security assurances for cloud environments and
landing zones. Tasks include immediate risk mitigation priorities, guidance on building a modern security strategy,
operationalizing the approach, and executing on that strategy.
This guide includes elements from across the Microsoft Cloud Adoption Framework for Azure:
Adhering to the steps in this guide will help you integrate security at critical points in the process. The goal is to
avoid obstacles in cloud adoption and reduce unnecessary business or operational disruption.
Microsoft has built capabilities and resources to help accelerate your implementation of this security guidance on
Microsoft Azure. You'll see these resources referenced throughout this guide. They're designed to help you
establish, monitor, and enforce security, and they're frequently updated and reviewed.
The following diagram shows a holistic approach for using security guidance and platform tooling to establish
security visibility and control over your cloud assets in Azure. We recommend this approach.
Use these steps to plan and execute your strategy for securing your cloud assets and using the cloud to modernize
security operations.
NOTE
Each organization should define its own minimum standards. Risk posture and subsequent tolerance to that risk can vary
widely based on industry, culture, and other factors. For example, a bank might not tolerate any potential damage to its
reputation from even a minor attack on a test system. Some organizations would gladly accept that same risk if it
accelerated their digital transformation by three to six months.
Security leadership team (chief information security officer (CISO) or equivalent) | Cloud strategy team, cloud security team, cloud adoption team, and cloud center of excellence or central IT team
Strategy approval:
Executives and business leaders with accountability for outcomes or risks of business lines within the organization
should approve this strategy. This group might include the board of directors, depending on the organization.
Next steps
The steps in this guide have helped you implement the strategy, controls, processes, skills, and culture needed to
consistently manage security risks across the enterprise.
As you continue into the operations mode of cloud security, consider these next steps:
Review Microsoft security documentation. It provides technical guidance to help security professionals build
and improve cybersecurity strategy, architecture, and prioritized roadmaps.
Review security information in Built-in security controls for Azure services.
Review Azure security tools and services in Security services and technologies available on Azure.
Review the Microsoft Trust Center. It contains extensive guidance, reports, and related documentation that can
help you perform risk assessments as part of your regulatory compliance processes.
Review third-party tools available to facilitate meeting your security requirements. For more information, see
Integrate security solutions in Azure Security Center.
Get started: Improve reliability with the right controls
How do you apply the right controls to improve reliability? This article helps you minimize disruptions related to:
Inconsistencies in configuration.
Resource organization.
Security baselines.
Resource protection.
The steps in this article help the operations team balance reliability and cost across the IT portfolio. This article also
helps the governance team to ensure that balance is applied consistently. Reliability also depends on other roles
and functions. This article maps supporting functions to help you create alignment among the involved teams.
Operations management and governance are equal partners in enterprise reliability. The decisions you make
about operational practices set the baseline for reliability. The approaches used to govern the overall environment
ensure consistency across all resources.
The first two steps in this article help both teams get started. They're listed sequentially, but you can perform them
in parallel. The subsequent steps help you get the entire enterprise started on a shared journey toward more
reliable solutions throughout the enterprise.
NOTE
Steps to start reliability partnerships with other teams: Various decisions throughout the cloud adoption lifecycle can have a direct impact on reliability. The following steps outline the partnerships and supporting efforts required to deliver consistent reliability across the IT portfolio.
ACCOUNTABLE TEAM | RESPONSIBLE AND SUPPORTING TEAMS
Value statement
These steps help you to implement the controls and processes that are needed to ensure reliability across the
enterprise and all hosted resources.
Get started: Ensure consistent performance across a
portfolio
How do you ensure adequate performance across a portfolio of workloads? The steps in this guide can help you
establish processes for maintaining that level of performance.
Performance also depends on other roles and functions. This article maps those supporting functions to help you
create alignment among the involved teams.
Centralized operations management is the most common approach to consistent performance across the
portfolio. Decisions about operational practices define the operations baseline and any holistic enhancements.
The first step in this guide helps the operations team get started. The subsequent steps help the entire enterprise
get started on a shared journey toward enterprise performance across the portfolio of workloads.
NOTE
Various decisions throughout the cloud adoption lifecycle can have a direct impact on performance. The following steps help
outline the partnerships and supporting efforts required to deliver performance across the IT portfolio.
Step 6: Adoption
Long-term operations might be affected by the decisions that you make during migration and innovation efforts.
Maintaining consistent alignment early in adoption processes helps remove barriers to production release. It also
reduces the effort required to onboard new solutions into operations management practices.
Deliverables:
Test operational readiness of production deployments by using Resource Consistency policies.
Validate adherence to design guidance for resource consistency and to operations requirements.
Document any advanced operations requirements in the operations management workbook.
Guidance to support deliverable completion:
Environmental readiness checklist
Pre-promotion checklist
Production release checklist
Value statement
The preceding steps will help you implement controls and processes to ensure performance across the enterprise
and all hosted resources.
Get started: Align your organization
Successful cloud adoption is the result of properly skilled people doing the appropriate types of work, in
alignment with clearly defined business goals, and in a well-managed environment. To deliver an effective cloud
operating model, it's important to establish appropriately staffed organizational structures. This article outlines
such an approach.
This proven approach is considered a minimum viable product (MVP), because it might not be sustainable. Each
team wears many hats, as outlined in the RACI (responsible, accountable, consulted, and informed) charts.
As adoption needs grow, so does the need to create balance and structure. To meet those needs, companies often
follow a process of maturing their organizational structures.
Watch this video to get an overview of common team structures at various stages of organizational maturity.
Additional information
Adapt existing roles, skills, and processes for the cloud
Organizational antipatterns: Silos and fiefdoms
Download the RACI template
Get started: Build a cloud strategy team
To be successful, every cloud adoption journey needs to involve some level of strategic planning. This getting
started guide is designed to help you establish a dedicated team or virtual team that can build and deliver on a
solid cloud strategy.
The first step in the journey is to decide whether you need a strategy team, or whether your existing team
members can deliver on cloud strategy as a distributed responsibility.
Whichever approach you choose, you'll want to create a cloud strategy team that defines motivations and business
outcomes, and that validates and maintains alignment between business priorities and cloud adoption efforts.
When the business outcomes affect business functions, the strategy team should include business leaders from
across the organization. The goal of the cloud strategy team is to produce tangible business results that are
enabled by cloud technologies. Overall, this team ensures that cloud adoption efforts progress in alignment with
business outcomes. Whenever possible, business outcomes and the cloud strategy team's efforts should both be
defined early in the process.
NOTE
This article discusses a strategy facilitator, a key player in the cloud-adoption process. The role is commonly held by a
program manager, architect, or consultant. As the cloud strategy team forms and gets started, the strategy facilitator is
temporarily accountable for creating alignment and keeping the team aligned with business goals. The strategy facilitator is
often the person most accountable for the success of the cloud adoption journey.
Cloud adoption is important to the business when:
The cloud adoption effort has board-level visibility.
Success of the cloud adoption effort will improve market positioning, customer retention, or revenue.
The programs in the adoption portfolio map directly to strategic business outcomes.
The portfolio of workloads in this adoption effort is strategic and mission-critical and could affect multiple
business units.
Cloud adoption requires ongoing executive support when:
The cloud adoption effort will affect how you manage organizational change.
The effort will require additional training for multiple business users and could interrupt certain business
functions.
The existing IT operations team or vendor is motivated to remain in an existing datacenter.
The existing IT team hasn't fully bought into the effort.
Cloud adoption presents risk to the business when:
Failure to complete the migration within the specified time window will result in negative market impact or
increased hosting costs.
Workloads slated for adoption need to be protected from data leakage that could affect business success or
customer security.
Metrics that are being used to measure the cloud effort are business aligned, creating a dependency and risk on
the technical success.
If any or all of the preceding reasons represent your existing business considerations, the information in the rest of
this article will help you establish your cloud strategy team.
Accountable person or team:
The strategy facilitator is accountable for determining whether a cloud strategy team is needed.
What's next
Strategy and planning are important. Nothing is actionable until you identify the cloud adoption functions that are
needed on your team. It's important to understand these key capabilities before you begin your adoption efforts.
Align your strategy with the cloud adoption functions by working with the adoption team or individuals who are
responsible for these functions.
Learn to align responsibilities across teams by developing a cross-team matrix that identifies RACI parties.
Download and modify the RACI template.
Get started: Build a cloud adoption team
Cloud adoption teams are the modern-day equivalent of technical implementation teams or project teams. The
nature of the cloud might require more fluid team structures.
Some cloud adoption teams focus exclusively on cloud migration, and others focus on innovations that take
advantage of cloud technologies. Some teams include the broad technical expertise that's required to complete
large adoption efforts, such as a full datacenter migration, and others have a tighter technical focus.
A smaller team might move between projects to accomplish specific goals. For example, a team of data platform
specialists might focus on helping convert SQL Server virtual machines (VMs) to SQL PaaS instances.
As cloud adoption expands, customers benefit from a team that's dedicated to the cloud platform function. That
team uses automated deployment and code reuse to accelerate successful adoption. People focused on a cloud
platform function can implement infrastructure, application patterns, governance, and other supporting assets to
drive further efficiencies and consistency, and to instill cloud principles in your organization. Small organizations
and small adoption teams don't have the luxury of a dedicated cloud platform team. We recommend that you
establish an automation capability in your adoption team to begin building this important cloud muscle.
Accountable team: The cloud strategy team is accountable for maintaining a clear RACI structure across the
cloud adoption lifecycle.
Responsible and supporting teams: Review guidance and requirements from:
Cloud governance team
Cloud operations team
Cloud center of excellence or central IT team
Other cloud adoption teams or individuals listed in the RACI
What's next
Cloud adoption is a great goal, but ungoverned adoption can produce unexpected results. To accelerate adoption
and best practices, as you're reducing business and technical risks, align cloud adoption with cloud governance
functions.
Aligning with the cloud governance team creates balance across cloud adoption efforts, but this is considered a
minimum viable product (MVP), because it might not be sustainable. Each team wears many hats, as outlined
in the RACI charts.
Learn more about overcoming organizational antipatterns: silos and fiefdoms.
Get started: Build a cloud governance team
A cloud governance team ensures that cloud-adoption risks and risk tolerance are properly evaluated and
managed. The team identifies risks that can't be tolerated by the business, and it converts risks into governing
corporate policies.
What's next
All companies are unique, and so are their governance needs. Choose the level of maturity that fits your
organization, and use the Cloud Adoption Framework to guide the practices, processes, and tooling that can help
you get there.
As cloud governance matures, teams are empowered to adopt the cloud at a faster pace. Continuous cloud
adoption efforts tend to trigger maturity in IT operations. To ensure that governance is a part of operations
development, either develop a cloud operations team or sync with your existing cloud operations team.
Get started: Build a cloud operations team
An operations team focuses on monitoring, repairing, and remediating issues related to traditional IT operations
and assets. In the cloud, many of the capital costs and operations activities are transferred to the cloud provider,
giving IT operations the opportunity to improve and provide significant additional value.
What's next
As adoption and operations scale, it's important to define and automate governance best practices that extend
existing IT requirements. Forming a cloud center of excellence (CCoE) team is an important step toward scaling
cloud adoption, cloud operations, and cloud governance efforts.
Learn more about:
Cloud center of excellence functions
Organizational antipatterns: Silos and fiefdoms
Align responsibilities across teams by developing a cross-team matrix that identifies RACI parties. Download and
modify the RACI template.
Develop a cloud adoption strategy
The cloud delivers fundamental technology benefits that can help your enterprise execute multiple business
strategies. By using cloud-based approaches, you can improve business agility, reduce costs, accelerate time to
market, and enable expansion into new markets. To take advantage of this great potential, start by documenting
your business strategy in a way that's both understandable to cloud technicians and palatable to your business
stakeholders.
The following steps can help you document your business strategy efficiently. This approach helps you drive
adoption efforts that capture targeted business value in a cross-functional model. Then, you can map your cloud
adoption strategy to specific cloud capabilities and business strategies to reach your desired state of
transformation.
Use the strategy and plan template to build out your cloud adoption strategy, and to track the output of each of
the steps outlined above.
Motivations: Why are we moving to the cloud?
"Why are we moving to the cloud?" It's a common question for business and technical stakeholders alike. If the
answer is, "Our board (or CIO, or C-level executives) told us to move to the cloud," then it's unlikely that the
business will achieve the desired outcomes.
This article discusses a few motivations behind cloud migration that can help produce more successful
business outcomes. These options help facilitate a conversation about motivations and, ultimately, business
outcomes.
Motivations
Business transformations that are supported by cloud adoption can be driven by various motivations. It's likely
that several motivations apply at the same time. The goal of the following lists is to help generate ideas about
which motivations are relevant. From there, you can prioritize and assess the potential impacts of the motivations.
Your cloud adoption team should meet with various executives and business leaders, using these lists, to
understand which of these motivations are affected by the cloud adoption effort.
Critical business events:
Response to regulatory compliance changes
New data sovereignty requirements
Reduction of disruptions and improvement of IT stability
Reduce carbon footprint
Migration motivations:
Preparation for new technical capabilities
Scaling to meet market demands
Scaling to meet geographic demands
Integration of a complex IT portfolio
Innovation motivations:
Improved customer experiences and engagements
Transformation of products or services
Market disruption with new products or services
Democratization and/or self-service environments
Motivation-driven strategies
This section highlights the migration and innovation motivations and their corresponding strategies.
Migration
The migration motivations in the preceding lists are the most common, but not necessarily the most significant,
reasons for adopting the cloud. These outcomes are important to achieve, but
they're most effectively used to transition to other, more useful worldviews. This important first step to cloud
adoption is often called a cloud migration. The framework refers to the strategy for executing a cloud
migration by using the term migrate.
Some motivations align well with a migrate strategy. The motives at the top of this list will likely have
significantly less business impact than those toward the bottom of the list.
Cost savings.
Reduction in vendor or technical complexity.
Optimization of internal operations.
Increasing business agility.
Preparing for new technical capabilities.
Scaling to meet market demands.
Scaling to meet geographic demands.
Innovation
Data is the new commodity. Modern applications are the supply chain that drives that data into various
experiences. In today's business market, it's hard to find a transformative product or service that isn't built on
top of data, insights, and customer experiences. The motivations that appear lower in the innovation list align
to a technology strategy referred to in this framework as the Innovate methodology.
The following list includes motivations that cause an IT organization to focus more on an innovate strategy
than a migrate strategy.
Increasing business agility.
Preparing for new technical capabilities.
Building new technical capabilities.
Scaling to meet market demands.
Scaling to meet geographic demands.
Improving customer experiences and engagements.
Transforming products or services.
Next steps
Understanding projected business outcomes helps facilitate the conversations that you need to have as you
document your motivations and supporting metrics, in alignment with your business strategy. Next, read an
overview of business outcomes that are commonly associated with a move to the cloud.
Overview of business outcomes
What business outcomes are associated with
transformation journeys?
The most successful transformation journeys start with a business outcome in mind. Cloud adoption can be a
costly and time-consuming effort. Fostering the right level of support from IT and other areas of the business
is crucial to success. This article series is designed to help customers identify business outcomes that are
concise, defined, and drive observable results or change in business performance, supported by a specific
measure.
During any cloud transformation, the ability to speak in terms of business outcomes supports transparency
and cross-functional partnerships. The business outcome framework starts with a simple template to help
technically minded individuals document and gain consensus. This template can be used with several business
stakeholders to collect a variety of business outcomes, which could each be influenced by a company's
transformation journey. Feel free to use this template electronically or, better still, draw it on a whiteboard to
engage business leaders and stakeholders in outcome-focused discussions.
To learn more about business outcomes and the business outcome template, see Documenting business
outcomes, or download the business outcome template.
Next steps
Learn more about fiscal outcomes.
Fiscal outcomes
Data innovations
Many companies want to migrate their existing data warehouse to the cloud. They are motivated by a number of
factors, including:
No hardware to buy or maintenance costs.
No infrastructure to manage.
The ability to switch to a secure, scalable, and low-cost cloud solution.
For example, the cloud-native, pay-as-you-go service from Azure called Azure Synapse Analytics provides an
analytical database management system for organizations. Azure technologies help modernize your data
warehouse after it's migrated and extend your analytical capabilities to drive new business value.
A data warehouse migration project involves many components. These include schema, data, extract-transform-
load (ETL) pipelines, authorization privileges, users, BI tool semantic access layers, and analytic applications.
After your data warehouse has been migrated to Azure Synapse Analytics, you can take advantage of other
technologies in the Microsoft analytical ecosystem. Doing so allows you to not only modernize your data
warehouse but also bring together insights produced in other analytical data stores on Azure.
You can broaden ETL processing to ingest data of any type into Azure Data Lake Storage. You can prepare and
integrate it at scale by using Azure Data Factory. This produces trusted, commonly understood data assets that can
be consumed by your data warehouse, and also accessed by data scientists and other applications. You can build
real-time, batch-oriented analytical pipelines. You can also create machine learning models that can deploy to run in
batch, in real time on streaming data, and on demand.
In addition, you can use PolyBase to go beyond your data warehouse. This simplifies access to insights being
produced in multiple underlying analytical platforms on Azure. You create holistic, integrated views in a logical data
warehouse to gain access to streaming, big data, and traditional data warehouse insights from BI tools and
applications.
Many companies have had data warehouses running in their datacenters for years, to enable users to produce
business intelligence. Data warehouses extract data from known transaction systems, stage the data, and then
clean, transform, and integrate it to populate data warehouses.
Use cases, business cases, and technology advances all support how Azure Synapse Analytics can help you with
data warehouse migration. The following sections list many of these examples.
Use cases
Connected product innovation
Factory of the future
Clinical analytics
Compliance analytics
Cost-based analytics
Omni-channel optimization
Personalization
Intelligent supply chain
Dynamic pricing
Procurement analytics
Digital control tower
Risk management
Customer analytics
Fraud detection
Claims analytics
Business cases
Build end-to-end analytics solutions with a single analytics service.
Use the Azure Synapse Analytics studio, which provides a unified workspace for data prep, data management,
data warehousing, big data, and AI tasks.
Build and manage pipelines with a no-code visual environment, automate query optimization, build proofs of
concept, and use Power BI, all from the same analytics service.
Deliver your data insights to data warehouses and big data analytics systems.
For mission-critical workloads, optimize the performance of all queries with intelligent workload management,
workload isolation, and limitless concurrency.
Edit and build Power BI dashboards directly from Azure Synapse Analytics.
Reduce project development time for BI and machine learning projects.
Easily share data with just a few clicks by using Azure Data Share integration within Azure Synapse Analytics.
Implement fine-grained access control with column-level security and native row-level security.
Automatically protect sensitive data in real time with dynamic data masking.
Get industry-leading security with built-in features like automated threat detection and always-on data
encryption.
Technology advances
No hardware to buy and no maintenance costs, so you pay only for what you use.
No infrastructure to manage, so you can focus on competitive insights.
Massively parallel SQL query processing with dynamic scalability when you need it, and the option to shut down
or pause when you don't.
Ability to independently scale storage from compute.
You can avoid unnecessary, expensive upgrades caused by the staging areas on your data warehouse getting
too big, taking up storage capacity, and forcing an upgrade. For example, move the staging area to Azure Data
Lake Storage. Then process it with an ETL tool like Azure Data Factory or your existing ETL tool running on Azure
at lower cost.
Avoid expensive hardware upgrades by processing ETL workloads in Azure, by using Azure Data Lake Storage
and Azure Data Factory. This is often a better solution than running on your existing data warehouse DBMS with
SQL query processing doing the work. As staging data volumes increase, more storage and compute power
underpinning your on-premises data warehouse is consumed by ETL. This in turn affects the performance of
query, reporting, and analysis workloads.
Avoid building expensive data marts that use storage and database software licenses on on-premises
hardware. You can build them in Azure Synapse Analytics instead. This is especially helpful if your data
warehouse is a Data Vault design, which often causes an increased demand for data marts.
Avoid the cost of analyzing and storing high-velocity, high-volume data on on-premises hardware. For example,
if you need to analyze real-time, machine generated data like click-stream and streaming IoT data in your data
warehouse, you can use Azure Synapse Analytics.
You can avoid paying a premium for storing data on expensive warehouse hardware in the datacenter as your
data warehouse grows. Azure Synapse Analytics can store your data in cloud storage at a lower cost.
Next steps
Data democratization
Data democratization
Many companies keep data warehouses in their datacenters to help different parts of their business analyze data
and make decisions. Sales, marketing, and finance departments rely heavily on these systems in order to produce
standard reports and dashboards. Companies also employ business analysts to perform ad hoc querying and
analysis of data in data marts. These data marts use self-service business intelligence tools to perform
multidimensional analysis.
A business that's supported by data innovation and a modern data estate can empower a broad range of
contributors, from an IT stakeholder to a data professional and beyond. They can take action on this repository of
centralized data, which is often referred to as "the single source of truth."
Azure Synapse Analytics is a single service for seamless collaboration and accelerated time-to-insight. To
understand this service in more detail, first consider the various roles and skills involved in a typical data estate:
Data warehousing: Database admins support the management of data lakes and data warehouses, while
intelligently optimizing workloads and automatically securing data.
Data integration: Data engineers use a code-free environment to easily connect multiple sources and types of
data.
Big data and machine learning: Data scientists build proofs of concept rapidly and provision resources as
needed, while working in the language of their choice (for example, T-SQL, Python, Scala, .NET, or Spark SQL).
Management and security: IT pros protect and manage data more efficiently, enforce privacy requirements, and
secure access to cloud and hybrid configurations.
Business intelligence: Business analysts securely access datasets, build dashboards, and share data within and
outside their organization.
The following diagram shows an example of a classic data warehouse architecture. Known structured data is
extracted from core transaction processing systems and copied into a staging area. From there, it's cleaned,
transformed, and integrated into production tables in a data warehouse. It's often the case that several years of
historical transaction data are incrementally built up here. This provides the data needed to understand changes in
sales, customer purchasing behavior, and customer segmentation over time. It also provides yearly financial
reporting and analysis to help with decision making.
From there, subsets of data are extracted into data marts to analyze activity associated with a specific business
process. This supports decision making in a specific part of the business.
For a business to run efficiently, it needs all types of data for the different skills and roles described earlier. You need
raw data that has been cleansed for data scientists to build machine-learning models. You need cleaned and
structured data for a data warehouse to provide reliable performance to business applications and dashboards.
Most importantly, you need to be able to go from raw data to insights in minutes, not days.
Azure Synapse Analytics has native, built-in business intelligence integration with Power BI. This enables you to go
from raw data to a dashboard serving insights in minutes, by using one service within a single interface.
Examples of fiscal outcomes
NOTE
The following examples are hypothetical and should not be considered a guarantee of returns when adopting any cloud
strategy.
Revenue outcomes
New revenue streams
The cloud can help create opportunities to deliver new products to customers or deliver existing products in a new
way. New revenue streams are innovative, entrepreneurial, and exciting for many people in the business world.
New revenue streams are also prone to failure and are considered by many companies to be high risk. When
revenue-related outcomes are proposed by IT, there will likely be resistance. To add credibility to these outcomes,
partner with a business leader who's a proven innovator. Validation of the revenue stream early in the process
helps avoid roadblocks from the business.
Example: A company has been selling books for over a hundred years. An employee of the company realizes
that the content can be delivered electronically. The employee creates a device that can be sold in the
bookstore, which allows the same books to be downloaded directly, driving $x in new book sales.
Revenue increases
With global scale and digital reach, the cloud can help businesses to increase revenues from existing revenue
streams. Often, this type of outcome comes from an alignment with sales or marketing leadership.
Example: A company that sells widgets could sell more widgets, if the salespeople could securely access the
company's digital catalog and stock levels. Unfortunately, that data is only in the company's ERP system, which
can be accessed only via a network-connected device. Creating a service façade to interface with the ERP and
exposing the catalog list and nonsensitive stock levels to an application in the cloud would allow the
salespeople to access the data they need while onsite with a customer. Extending on-premises Active Directory
using Azure Active Directory (Azure AD) and integrating role-based access into the application would allow the
company to help ensure that the data stays safe. This simple project could affect revenue from an existing
product line by x%.
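To make the pattern concrete, the following is a minimal sketch of such a facade, assuming FastAPI for the web layer; the token check and the sample catalog data are hypothetical placeholders rather than a prescribed design.

```python
# A minimal sketch, not the company's actual design: a read-only catalog facade that the
# cloud-hosted app could expose to field sellers instead of opening the ERP system to the
# internet. FastAPI, verify_token, and the sample catalog are illustrative assumptions;
# in practice, token validation would be backed by Azure AD and the data by the real ERP.
from fastapi import Depends, FastAPI, Header, HTTPException

app = FastAPI()

def verify_token(authorization: str) -> dict:
    # Hypothetical stand-in for Azure AD token validation; returns the caller's claims.
    return {"roles": ["Sales"]} if authorization == "Bearer demo-token" else {"roles": []}

def require_sales_role(authorization: str = Header(...)) -> dict:
    # Reject callers whose token doesn't carry the Sales role.
    claims = verify_token(authorization)
    if "Sales" not in claims.get("roles", []):
        raise HTTPException(status_code=403, detail="Sales role required")
    return claims

@app.get("/catalog")
def list_catalog(claims: dict = Depends(require_sales_role)):
    # Only nonsensitive fields are exposed; sensitive ERP data never leaves the facade.
    return [
        {"sku": "W-100", "name": "Widget", "in_stock": True},
        {"sku": "W-200", "name": "Widget Pro", "in_stock": False},
    ]
```

In this sketch, a salesperson's app calls the /catalog endpoint with an access token while the ERP system itself remains reachable only from the facade.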
Profit increases
Seldom does a single effort simultaneously increase revenue and decrease costs. However, when it does, align the
outcome statements from one or more of the revenue outcomes with one or more of the cost outcomes to
communicate the desired outcome.
Cost outcomes
Cost reduction
Cloud computing can reduce capital expenses for hardware and software, setting up datacenters, running on-site
datacenters, and so on. The costs of racks of servers, round-the-clock electricity for power and cooling, and IT
experts for managing the infrastructure add up fast. Shutting down a datacenter can reduce capital expense
commitments. This is commonly referred to as "getting out of the datacenter business." Cost reduction is typically
measured in dollars in the current budget, which could span one to five years depending on how the CFO
manages finances.
Example #1: A company's datacenter consumes a large percentage of the annual IT budget. IT chooses to
conduct a cloud migration and transitions the assets in that datacenter to infrastructure as a service (IaaS)
solutions, creating a three-year cost reduction.
Example #2: A holding company recently acquired a new company. In the acquisition, the terms dictate that
the new entity should be removed from the current datacenters within six months. Failure to do so will result in
a fine of $1 million USD per month to the holding company. Moving the digital assets to the cloud in a cloud
migration could allow for a quick decommission of the old assets.
Example #3: An income tax company that caters to consumers experiences 70 percent of its annual revenue
during the first three months of the year. The remainder of the year, its large IT investment is relatively
dormant. A cloud migration could allow IT to deploy the compute/hosting capacity required for those three
months. During the remaining nine months, the IaaS costs could be significantly reduced by shrinking the
compute footprint.
Example: Coverdell
Coverdell modernizes their infrastructure to drive record cost savings with Azure. Coverdell's decision to invest in
Azure, and to unite their network of websites, applications, data, and infrastructure within this environment, led to
more cost savings than the company could have ever expected. The migration to an Azure-only environment
eliminated $54,000 USD in monthly costs for colocation services. With the company's new united infrastructure
alone, Coverdell expects to save an estimated $1M USD over the next two to three years.
"Having access to the Azure technology stack opens the door for some scalable, easy-to-implement, and
highly available solutions that are cost effective. This allows our architects to be much more creative with the
solutions they provide."
Ryan Sorensen
Director of Application Development and Enterprise Architecture
Coverdell
Cost avoidance
Terminating a datacenter can also provide cost avoidance, by preventing future refresh cycles. A refresh cycle is
the process of buying new hardware and software to replace aging on-premises systems. In Azure, hardware and
OS are routinely maintained, patched, and refreshed at no additional cost to customers. This allows a CFO to
remove planned future spend from long-term financial forecasts. Cost avoidance is measured in dollars. It differs
from cost reduction, generally focusing on a future budget that has not been fully approved yet.
Example: A company's datacenter is up for a lease renewal in six months. The datacenter has been in service
for eight years. Four years ago, all servers were refreshed and virtualized, costing the company millions of
dollars. Next year, the company plans to refresh the hardware and software again. Migrating the assets in that
datacenter as part of a cloud migration would allow cost avoidance by removing the planned refresh from next
year's forecasted budget. It could also produce cost reduction by decreasing or eliminating the real estate lease
costs.
Capital expenses and operating expenses
Before you discuss cost outcomes, it's important to understand the two primary cost options: capital expenses and
operating expenses.
The following terms will help you understand the differences between capital expenses and operating expenses
during business discussions about a transformation journey.
Capital is the money and assets owned by a business to contribute to a particular purpose, such as increasing
server capacity or building an application.
Capital expenditures generate benefits over a long period. These expenditures are generally nonrecurring
and result in the acquisition of permanent assets. Building an application could qualify as a capital expenditure.
Operating expenditures are ongoing costs of doing business. Consuming cloud services in a pay-as-you-go
model could qualify as an operating expenditure.
Assets are economic resources that can be owned or controlled to produce value. Servers, data lakes, and
applications can all be considered assets.
Depreciation is a decrease in the value of an asset over time. More relevant to the capital expense versus
operating expense conversation, depreciation is how the costs of an asset are allocated across the periods in
which they are used. For example, if you build an application this year but it's expected to have an average shelf
life of five years (like most commercial applications), the cost of the development team and the tools required
to create and deploy the code base would be depreciated evenly over five years (see the sketch after this list).
Valuation is the process of estimating how much a company is worth. In most industries, valuation is based
on the company's ability to generate revenue and profit, while respecting the operating costs required to create
the goods that provide that revenue. In some industries, such as retail, or in some transaction types, such as
private equity, assets and depreciation can play a large part in the company's valuation.
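As a concrete illustration of the depreciation definition above, here is a minimal straight-line sketch with hypothetical figures; it only shows how a cost is allocated evenly across an asset's useful life.

```python
# Minimal straight-line depreciation sketch with hypothetical numbers: an asset's cost is
# allocated evenly across the periods in which the asset is used.

def straight_line_depreciation(asset_cost: float, useful_life_years: int) -> list[float]:
    """Return the depreciation expense recognized in each year of the asset's life."""
    annual_expense = asset_cost / useful_life_years
    return [round(annual_expense, 2) for _ in range(useful_life_years)]

# Hypothetical example: a $500,000 application build with a five-year shelf life.
print(straight_line_depreciation(500_000, 5))
# [100000.0, 100000.0, 100000.0, 100000.0, 100000.0]
```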
It's often safe to assume that various executives, including the chief information officer (CIO), debate the best use
of capital to grow the company in the desired direction. Giving the CIO a means of converting contentious capital
expense conversations into clear accountability for operating expenses could be an attractive outcome by itself. In
many industries, chief financial officers (CFOs) are actively seeking ways to better associate fiscal accountability
with the cost of goods being sold.
However, before you associate any transformation journey with this type of capital versus operating expense
conversion, it's wise to meet with members of the CFO or CIO teams to see which cost structure the business
prefers. In some organizations, reducing capital expenses in favor of operating expenses is a highly undesirable
outcome. As previously mentioned, this approach is sometimes seen in retail, holding, and private equity
companies that place higher value on traditional asset accounting models, which place little value on IP. It's also
seen in organizations that had negative experiences when they outsourced IT staff or other functions in the past.
If an operating expense model is desirable, the following example could be a viable business outcome:
Example: The company's datacenter is currently depreciating at $x USD per year for the next three years. It is
expected to require an additional $y USD to refresh the hardware next year. We can convert the capital
expenses to an operating expense model at an even rate of $z USD per month, allowing for better
management of and accountability for the operating costs of technology.
Next steps
Learn more about agility outcomes.
Agility outcomes
Examples of agility outcomes
As discussed in the business outcomes overview, several potential business outcomes can serve as the foundation
for any transformation journey conversation with the business. This article focuses on the timeliest business
measure: business agility. Understanding your company's market position and competitive landscape can help you
articulate the business outcomes that are the target of the business's transformation journey.
Traditionally, chief information officers (CIOs) and IT teams were considered a source of stability in core
mission-critical processes. This is still true. Few businesses can function well when their IT platform is unstable.
However, in today's business world, much more is expected. IT can expand beyond a simple cost center by
partnering with the business to provide market advantages. Many CIOs and executives assume that stability is
simply a baseline for IT. For these leaders, business agility is the measure of IT's contribution to the business.
Time-to-market outcome
During cloud-enabled innovation efforts, time to market is a key measure of IT's ability to address market change.
In many cases, a business leader might have existing budget for the creation of an application or the launch of a
new product. Clearly communicating a time-to-market benefit can motivate that leader to redirect budget to IT's
transformation journey.
Example 1: The European division of a US-based company needs to comply with GDPR regulations by
protecting customer data in a database that supports UK operations. Their existing version of SQL Server
doesn't support the necessary row-level security. An in-place upgrade would be too disruptive. Using Azure
SQL Database to replicate and upgrade the database, the customer adds the necessary compliance measure
in a matter of weeks.
Example 2: A logistics company has discovered an untapped segment of the market, but it needs a new
version of their flagship application to capture this market share. Their larger competitor has made the
same discovery. Through the execution of a cloud-enabled application innovation effort, the company
embraces customer obsession and a DevOps-driven development approach to beat their slower, legacy
competitor by x months. This jump on market entrance secured the customer base.
Aurora Health Care
Healthcare system transforms online services into a friendly digital experience. To transform its digital services,
Aurora Health Care migrated its websites to the Microsoft Azure platform and adopted a strategy of continuous
innovation.
"As a team, we're focused on high-quality solutions and speed. Choosing Azure was a very transformative
decision for us."
Jamey Shiels
Vice President of Digital Experience
Aurora Health Care
Provision time
When business demands new IT services or scale to existing services, acquisition and provision of new hardware
or virtual resources can take weeks. After cloud migration, IT can more easily enable self-service provisioning,
allowing the business to scale in hours.
Example: A consumer packaged goods company requires the creation and tear-down of hundreds of database
clusters per year to fulfill operational demands of the business. The on-premises virtual hosts can provision
quickly, but the process of recovering virtual assets is slow and requires significant time from the team. As
such, the legacy on-premises environment suffers from bloat and can seldom keep up with demand. After
cloud migration, IT can more easily provide scripted self-provisioning of resources, with a chargeback approach
to billing. Together, this allows the business to move as quickly as they need, but still be accountable for the cost
of the resources they demand. In the cloud, the only limit on deployments is the business's budget.
Next steps
Learn more about reach outcomes.
Reach outcomes
Examples of global reach outcomes
As discussed in Business outcomes, several potential business outcomes can serve as the foundation for any
transformation journey conversation with the business. This article focuses on a common business measure:
reach. Reach is a concise term that, in this case, refers to a company's globalization strategy. Understanding the
company's globalization strategy helps you better articulate the business outcomes that are the target of a
business's transformation journey.
Fortune 500 and smaller enterprises have focused on the globalization of services and customers for over three
decades, and most businesses are likely to engage in global commerce as this globalization continues to pull focus.
Hosting datacenters around the world can consume more than 80 percent of an annual IT budget, and wide-area
networks using private lines to connect those datacenters can cost millions of dollars per year. Therefore,
supporting global operations is both challenging and costly.
Cloud solutions move the cost of globalization to the cloud provider. In Azure, customers can quickly deploy
resources in the same region as customers or operations, without buying and provisioning a datacenter. Microsoft
owns one of the largest wide-area networks in the world, connecting datacenters around the globe. Connectivity
and global operating capacity are available to global customers on demand.
Global access
Expanding into a new market can be one of the most valuable business outcomes during a transformation. The
ability to quickly deploy resources in market without a longer-term commitment allows sales and operations
leaders to explore options that wouldn't have been considered in the past.
Manufacturing example
A cosmetics manufacturer has identified a trend. Some products are being shipped to the Asia Pacific region even
though no sales teams are operating in that region. The minimum systems required by a remote sales force are
small, but latency prevents a remote access solution. To capitalize on this trend, the vice president of sales wants to
experiment with sales teams in Japan and South Korea. Because the company has undergone a cloud migration, it
was able to deploy the necessary systems in both Japan and South Korea within days. This allowed the vice
president of sales to grow revenue in the region by x% within three months. Those two markets continue to
outperform other parts of the world, leading to sales operations throughout the region.
Retail example
An online retailer that ships products globally can engage with their customers across time zones and multiple
languages. The retailer uses Azure Bot Service and various features in Azure Cognitive Services, such as Translator,
Language Understanding (LUIS), QnA Maker, and Text Analytics. This ensures their customers are able to get the
information they need when they need it, and that it's provided to them in their language. The retailer uses the
Personalizer service to further customize the experience and catalog offerings for their customers, ensuring
geographical tastes, preferences, and availability are reflected.
Data sovereignty
Operating in new markets introduces additional governance constraints. Azure provides compliance offerings that
help customers meet compliance obligations across regulated industries and global markets. For more
information, see the overview of Microsoft Azure compliance.
Example
A US-based utilities provider was awarded a contract to provide utilities in Canada. Canadian data sovereignty law
requires that Canadian data stays in Canada. This company had been working their way through a cloud-enabled
application innovation effort for years. As a result, their software was deployed through fully scripted DevOps
processes. With a few minor changes to the code base, they were able to deploy a working copy of the code to an
Azure datacenter in Canada, meeting data sovereignty compliance and retaining the customer.
Next steps
Learn more about customer engagement outcomes.
Customer engagement outcomes
Examples of customer engagement outcomes
As discussed in the business outcomes overview, several potential business outcomes can serve as the foundation
for any transformation journey conversation with the business. This article focuses on a common business
measure: customer engagement. Understanding the needs of customers, and the ecosystem around customers,
helps you to articulate the business outcomes that are the target of a business's transformation journey.
During cloud-enabled data innovation efforts, you can assume that customers are engaged. The following
functions are potentially disruptive and require a high degree of customer engagement:
Aggregating data
Testing theories
Advancing insights
Informing cultural change
Customer engagement outcomes are about meeting and exceeding customer expectations. As a baseline for
customer engagements, customers assume that products and services perform and are reliable. When they're not,
it's easy for an executive to understand the business value of performance and reliability outcomes. For more
advanced companies, the speed of integrating learnings and observations from this process is a fundamental
business outcome.
The next sections provide examples and outcomes related to customer engagement.
Cycle time
During customer-obsessed transformations such as a cloud-enabled application innovation effort, customers
respond from direct engagement. They also appreciate seeing their needs met quickly by the development team.
Cycle time is a Six Sigma term that refers to the duration from the start to the finish of a function. For business
leaders who invest heavily in improving customer engagement, cycle time can be a strong business outcome.
Example
A services company that provides business-to-business (B2B) services is trying to retain market share in a
competitive market. Customers who have left for a competing service provider found that their overly complex
technical solution interferes with their business processes, and is the primary reason for leaving. In this case, cycle
time is imperative.
It currently takes 12 months for a feature to progress from request to release. If it's prioritized by the executive
team, this cycle can be shortened to between six and nine months. The team can cut cycle time down to one month through a
cloud-enabled application innovation effort, cloud-native application models, and Azure DevOps integration. This
frees business and application development teams to interact more directly with customers.
Next steps
Learn more about performance outcomes.
Performance outcomes
Examples of performance outcomes
As discussed in Business outcomes, several potential business outcomes can serve as the foundation for any
transformation journey conversation with the business. This article focuses on a common business measure:
performance.
In today's technological society, customers assume that applications will perform well and always be available.
When this expectation isn't met, it causes reputation damage that can be costly and long-lasting.
Performance
The biggest cloud computing services run on a worldwide network of secure datacenters, which are regularly
upgraded to the latest generation of fast and efficient computing hardware. This provides several benefits over a
single corporate datacenter, such as reduced network latency for applications and greater economies of scale.
Transform your business and reduce costs with an energy-efficient infrastructure that spans more than 100 highly
secure facilities worldwide, linked by one of the largest networks on earth. Azure has more global regions than any
other cloud provider. This translates into the scale that's required to bring applications closer to users around the
world, preserve data residency, and provide comprehensive compliance and resiliency options for customers.
Example 1: A services company worked with a hosting provider that hosted multiple operational infrastructure
assets. Those systems suffered from frequent outages and poor performance. The company migrated its
assets to Azure to take advantage of the SLA and performance controls of the cloud. Any downtime would
cost the company approximately $15,000 USD per minute of outage. With between four and eight hours of
outage per month, it was easy to justify this organizational transformation.
Example 2: A consumer investment company was in the early stages of a cloud-enabled application
innovation effort. Agile processes and DevOps were maturing well, but application performance was spiky.
As the transformation matured, the company started a program to monitor and automate sizing based
on usage demands. The company eliminated sizing issues by using Azure performance management tools,
resulting in a surprising five-percent increase in transactions.
Reliability
Cloud computing makes data backup, disaster recovery, and business continuity easier and less expensive, because
data can be mirrored at multiple redundant sites on the cloud provider's network.
One of IT's crucial functions is ensuring that corporate data is never lost and applications stay available despite
server crashes, power outages, or natural disasters. You can keep your data safe and recoverable by backing it up
to Azure.
Azure Backup is a simple solution that decreases your infrastructure costs while providing enhanced security
mechanisms to protect your data against ransomware. With one solution, you can protect workloads that are
running in Azure and on-premises across Linux, Windows, VMware, and Hyper-V. You can ensure business
continuity by keeping your applications running in Azure.
Azure Site Recovery makes it simple to test disaster recovery by replicating applications between Azure regions.
You can also replicate on-premises VMware and Hyper-V virtual machines and physical servers to Azure to stay
available if the primary site goes down. And you can recover workloads to the primary site when it's up and
running again.
Example: An oil and gas company used Azure technologies to implement a full site recovery. The company
chose not to fully embrace the cloud for day-to-day operations, but the cloud's business continuity and disaster
recovery (BCDR) features still protected their datacenter. As a hurricane formed hundreds of miles away, their
implementation partner started recovering the site to Azure. Before the hurricane touched down, all mission-
critical assets were running in Azure, preventing any downtime.
Next steps
Learn how to use the business outcome template.
Use the business outcome template
How to use the business outcome template
As discussed in the business outcomes overview, it can be difficult to bridge the gap between business and
technical conversations. This simple template is designed to help teams uniformly capture business outcomes to
be used later in the development of customer transformation journey strategies.
Download the business outcome template to begin brainstorming and tracking business outcomes. Continue
reading to learn how to use the template. Review the business outcomes section for ideas on potential business
outcomes that could come up in executive conversations.
Figure 1: Business outcomes visualized as a house with stakeholders, over business outcomes, over technical
capabilities.
The business outcome template focuses on simplified conversations that can quickly engage stakeholders without
getting too deep into the technical solution. By rapidly understanding and aligning the key performance indicators
(KPIs) and business drivers that are important to stakeholders, your team can think about high-level approaches
and transformations before diving into the implementation details.
An example can be found on the "example outcome" tab of the spreadsheet, as shown below. To track multiple
outcomes, add them to the "collective outcomes" tab.
Figure 2: Example of a business outcome template.
Figure 3: Five areas of focus in discovery: stakeholders, outcomes, drivers, KPIs, and capabilities.
Stakeholders: Who in the organization is likely to see the greatest value in a specific business outcome? Who is
most likely to support this transformation, especially when things get tough or time consuming? Who has the
greatest stake in the success of this transformation? This person is a potential stakeholder.
Business outcomes: A business outcome is a concise, defined, and observable result or change in business
performance, supported by a specific measure. How does the stakeholder want to change the business? How will
the business be affected? What is the value of this transformation?
Business drivers: Business drivers capture the current challenge that's preventing the company from achieving
desired outcomes. They can also capture new opportunities that the business can capitalize on with the right
solution. How would you describe the current challenges or future state of the business? What business functions
would be changing to meet the desired outcomes?
KPIs: How will this change be measured? How does the business know whether they are successful? How
frequently will this KPI be observed? Understanding each KPI helps enable incremental change and
experimentation.
Capabilities: When you define any transformation journey, how will technical capabilities accelerate realization of
the business outcome? What applications must be included in the transformation to achieve business objectives?
How do various applications or workloads get prioritized to deliver on capabilities? How do parts of the solution
need to be expanded or rearchitected to meet each of the outcomes? Can execution approaches (or timelines) be
rearranged to prioritize high-impact business outcomes?
Next steps
Learn to align your technical efforts to meaningful learning metrics.
Align your technical efforts
How can we align efforts to meaningful learning
metrics?
The business outcomes overview discussed ways to measure and communicate the impact a transformation will
have on the business. Unfortunately, it can take years for some of those outcomes to produce measurable results.
The board and C-suite are unhappy with reports that show a 0% delta for long periods of time.
Learning metrics are interim, shorter-term metrics that can be tied back to longer-term business outcomes. These
metrics align well with a growth mindset and help position the culture to become more resilient. Rather than
highlighting the anticipated lack of progress toward a long-term business goal, learning metrics highlight early
indicators of success. The metrics also highlight early indicators of failure, which are likely to produce the greatest
opportunity for you to learn and adjust the plan.
As with much of the material in this framework, we assume you're familiar with the transformation journey that
best aligns with your desired business outcomes. This article will outline a few learning metrics for each
transformation journey to illustrate the concept.
Cloud migration
This transformation focuses on cost, complexity, and efficiency, with an emphasis on IT operations. The most easily
measured data behind this transformation is the movement of assets to the cloud. In this kind of transformation,
the digital estate is measured by virtual machines (VMs), racks or clusters that host those VMs, datacenter
operational costs, required capital expenses to maintain systems, and depreciation of those assets over time.
As VMs are moved to the cloud, dependence on on-premises legacy assets is reduced. The cost of asset
maintenance is also reduced. Unfortunately, businesses can't realize the cost reduction until clusters are
deprovisioned and datacenter leases expire. In many cases, the full value of the effort isn't realized until the
depreciation cycles are complete.
Always align with the CFO or finance office before making financial statements. However, IT teams can generally
estimate current monetary cost and future monetary cost values for each VM based on CPU, memory, and storage
consumed. You can then apply that value to each migrated VM to estimate the immediate cost savings and future
monetary value of the effort.
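As a rough sketch of that estimate, the snippet below applies hypothetical unit rates (not real Azure or datacenter prices) to a few sample VMs and sums the projected monthly savings; real figures should be agreed with the CFO or finance office.

```python
# A minimal sketch of per-VM cost-savings estimation as a learning metric. All rates and
# VM sizes are hypothetical placeholders, not Azure pricing.
from dataclasses import dataclass

@dataclass
class Vm:
    name: str
    vcpus: int
    memory_gb: int
    storage_gb: int

# Hypothetical monthly unit costs for the current estate and the target cloud environment.
ON_PREM_RATES = {"vcpu": 30.00, "memory_gb": 5.00, "storage_gb": 0.10}
CLOUD_RATES = {"vcpu": 22.00, "memory_gb": 3.50, "storage_gb": 0.05}

def monthly_cost(vm: Vm, rates: dict) -> float:
    # Estimate a VM's monthly cost from the CPU, memory, and storage it consumes.
    return (vm.vcpus * rates["vcpu"]
            + vm.memory_gb * rates["memory_gb"]
            + vm.storage_gb * rates["storage_gb"])

def estimated_monthly_savings(migrated_vms: list) -> float:
    # Sum of (current on-premises estimate - projected cloud estimate) across migrated VMs.
    return sum(monthly_cost(vm, ON_PREM_RATES) - monthly_cost(vm, CLOUD_RATES)
               for vm in migrated_vms)

migrated = [Vm("app01", 4, 16, 256), Vm("db01", 8, 64, 1024)]
print(f"Estimated monthly savings: ${estimated_monthly_savings(migrated):,.2f}")
```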
Application innovation
Cloud-enabled application innovation focuses largely on the customer experience and the customer's willingness
to consume products and services provided by the company. It takes time for increments of change to affect
consumer or customer buying behaviors. But application innovation cycles tend to be much shorter than they are
in the other forms of transformation. The traditional advice is that you should start with an understanding of the
specific behaviors that you want to influence and use those behaviors as the learning metrics. For example, in an
e-commerce application, total purchases or add-on purchases could be the target behavior. For a video company,
time watching video streams could be the target.
Customer behavior metrics can easily be influenced by outside variables, so it's often important to include related
statistics with the learning metrics. These related statistics can include release cadence, bugs resolved per release,
code coverage of unit tests, number of page views, page throughput, page load time, and other application
performance metrics. Each can show different activities and changes to the code base and the customer
experience to correlate with higher-level customer behavior patterns.
Data innovation
Changing an industry, disrupting markets, or transforming products and services can take years. In a cloud-
enabled data innovation effort, experimentation is key to measuring success. Be transparent by sharing prediction
metrics like percent probability, number of failed experiments, and number of models trained. Failures will
accumulate faster than successes. These metrics can be discouraging, and the executive team must understand the
time and investment needed to use these metrics properly.
On the other hand, some positive indicators are often associated with data-driven learning: centralization of
heterogeneous data sets, data ingress, and democratization of data. While the team is learning about the customer
of tomorrow, real results can be produced today. Supporting learning metrics could include:
Number of models available.
Number of partner data sources consumed.
Devices producing ingress data.
Volume of ingress data.
Types of data.
An even more valuable metric is the number of dashboards created from combined data sources. This number
reflects the current-state business processes that are affected by new data sources. By sharing new data sources
openly, your business can take advantage of the data by using reporting tools like Power BI to produce
incremental insights and drive business change.
Next steps
After learning metrics are aligned, you're ready to begin building the business case to deliver against those
metrics.
Build the cloud business case
Build a business justification for cloud migration
10/30/2020 • 8 minutes to read
Cloud migrations can generate early return on investment (ROI) from cloud transformation efforts. But
developing a clear business justification with tangible, relevant costs and returns can be a complex process. This
article will help you think about what data you need to create a financial model that aligns with cloud migration
outcomes. First, let's dispel a few myths about cloud migration, so your organization can avoid some common
mistakes.
We can unpack the standard return on investment (ROI) equation to get a migration-specific view of the formulas
for the input variables on the right side of the equation. The remaining sections of this article offer some
considerations to take into account.
Next steps
Create a financial model for cloud transformation
Create a financial model for cloud transformation
10/30/2020 • 5 minutes to read
Creating a financial model that accurately represents the full business value of any cloud transformation can be
complicated. Financial models and business justifications tend to vary for different organizations. This article
establishes some formulas and points out a few things that are commonly missed when strategists create
financial models.
Return on investment
Return on investment (ROI) is often an important criterion for the C-suite or the board. ROI is used to compare
different ways to invest limited capital resources. The formula for ROI is fairly simple. The details you'll need to
create each input to the formula might not be as simple. Essentially, ROI is the amount of return produced from
an initial investment. It's usually represented as a percentage:
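That description corresponds to the standard formula, which can be written as:

\[
\text{ROI} = \frac{\text{gain from investment} - \text{initial investment}}{\text{initial investment}} \times 100\%
\]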
In the next sections, we'll walk through the data you'll need to calculate the initial investment and the gain from
investment (earnings).
Revenue deltas
Revenue deltas should be forecast in partnership with business stakeholders. After the business stakeholders
agree on a revenue impact, it can be used to improve the earning position.
Cost deltas
Cost deltas are the amount of increase or decrease that will be caused by the transformation. Independent
variables can affect cost deltas. Earnings are largely based on hard costs like capital expense reductions, cost
avoidance, operational cost reductions, and depreciation reductions. The following sections describe some cost
deltas to consider.
Depreciation reduction or acceleration
For guidance on depreciation, speak with the CFO or finance team. The following information is meant to serve as
a general reference on the topic of depreciation.
When capital is invested in the acquisition of an asset, that investment could be used for financial or tax purposes
to produce ongoing benefits over the expected lifespan of the asset. Some companies see depreciation as a
positive tax advantage. Others see it as a committed, ongoing expense similar to other recurring expenses
attributed to the annual IT budget.
Speak with the finance office to find out if elimination of depreciation is possible and if it would make a positive
contribution to cost deltas.
Physical asset recovery
In some cases, retired assets can be sold as a source of revenue. This revenue is often lumped into cost reduction
for simplicity. But it's truly an increase in revenue and can be taxed as such. Speak with the finance office to
understand the viability of this option and how to account for the resulting revenue.
Operational cost reductions
Recurring expenses required to operate a business are often called operating expenses. This is a broad category.
In most accounting models, it includes:
Software licensing.
Hosting expenses.
Electric bills.
Real estate rentals.
Cooling expenses.
Temporary staff required for operations.
Equipment rentals.
Replacement parts.
Maintenance contracts.
Repair services.
Business continuity and disaster recovery (BCDR) services.
Other expenses that don't require capital expense approvals.
This category provides one of the highest earning deltas. When you're considering a cloud migration, time
invested in making this list exhaustive is rarely wasted. Ask the CIO and finance team questions to ensure all
operational costs are accounted for.
Cost avoidance
When an operating expenditure is expected but not yet in an approved budget, it might not fit into a cost
reduction category. For example, if VMware and Microsoft licenses need to be renegotiated and paid next year,
they aren't fully qualified costs yet. Reductions in those expected costs are treated like operational costs for the
sake of cost-delta calculations. Informally, however, they should be referred to as "cost avoidance" until
negotiation and budget approval is complete.
Soft-cost reductions
At some companies, soft costs like reductions in operational complexity or reductions in full-time staff for
operating a datacenter could also be included in cost deltas. But including soft costs might not be a good idea.
When you include soft-cost reductions, you insert an undocumented assumption that the reduction will create
tangible cost savings. Technology projects rarely result in actual soft-cost recovery.
Headcount reductions
Time savings for staff are often included under soft-cost reduction. When those time savings map to actual
reduction of IT salary or staffing, they could be calculated separately as headcount reductions.
That said, the skills needed on-premises generally map to a similar (or higher-level) set of skills needed in the
cloud. So people aren't generally laid off after a cloud migration.
An exception occurs when operational capacity is provided by a third party or a managed services provider
(MSP). If IT systems are managed by a third party, the operating costs could be replaced by a cloud-native
solution or cloud-native MSP. A cloud-native MSP is likely to operate more efficiently and potentially at a lower
cost. If that's the case, operational cost reductions belong in the hard-cost calculations.
Capital expense reductions or avoidance
Capital expenses are slightly different from operating expenses. Generally, this category is driven by refresh cycles
or datacenter expansion. An example of a datacenter expansion would be a new high-performance cluster to host
a big data solution or data warehouse. This expense would generally fit into a capital expense category. More
common are the basic refresh cycles. Some companies have rigid hardware refresh cycles, meaning assets are
retired and replaced on a regular cycle (usually every three, five, or eight years). These cycles often coincide with
asset lease cycles or the forecasted life span of equipment. When a refresh cycle hits, IT draws capital expense to
acquire new equipment.
If a refresh cycle is approved and budgeted, the cloud transformation could help eliminate that cost. If a refresh
cycle is planned but not yet approved, the cloud transformation could avoid a capital expenditure. Both reductions
would be added to the cost delta.
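As a rough illustration of how these categories combine, the sketch below sums hypothetical hard-cost deltas and subtracts a projected cloud operating cost. The category names and amounts are placeholders; confirm with the finance office which categories are valid for your model.

```python
# Illustrative only: combine hard-cost deltas into a single annual figure.
# All category names and amounts are hypothetical annual values.

cost_reductions = {
    "operational_cost_reductions": 250_000,
    "depreciation_reduction":       60_000,
    "capital_expense_avoidance":   180_000,
    "cost_avoidance":               40_000,   # expected but not yet budgeted renewals
}

projected_cloud_operating_cost = 310_000

annual_cost_delta = sum(cost_reductions.values()) - projected_cloud_operating_cost
print(f"Estimated annual cost delta: {annual_cost_delta:,}")
```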
Next steps
Learn more about cloud accounting models.
Cloud accounting
What is cloud accounting?
10/30/2020 • 4 minutes to read
The cloud changes how IT accounts for costs, as is described in Create a financial model for cloud transformation.
Various IT accounting models are much easier to support because of how the cloud allocates costs. So it's
important to understand how to account for cloud costs before you begin a cloud transformation journey. This
article outlines the most common cloud accounting models for IT.
Chargeback
One of the common first steps in changing IT's reputation as a cost center is implementing a chargeback model of
accounting. This model is especially common in smaller enterprises or highly efficient IT organizations. In the
chargeback model, any IT costs that are associated with a specific business unit are treated like an operating
expense in that business unit's budget. This practice reduces the cumulative cost effects on IT, allowing business
values to show more clearly.
In a legacy on-premises model, chargeback is difficult to realize because someone still has to carry the large
capital expenses and depreciation. The ongoing conversion from capital expenditures to operating expenses
associated with usage is a difficult accounting exercise. This difficulty is a major reason for the creation of the
traditional IT accounting model and the central IT accounting model. The operating expenses model of cloud cost
accounting is almost required if you want to efficiently deliver a chargeback model.
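In cloud cost accounting, chargeback usually means attributing consumption to the owning business unit, commonly through resource tags. The sketch below shows the general idea with hypothetical usage records; it isn't tied to any particular billing export format.

```python
# Illustrative only: roll up monthly cloud charges per business unit based on
# a hypothetical "business_unit" tag attached to each resource.
from collections import defaultdict

usage_records = [
    {"resource": "vm-payroll-01", "business_unit": "finance",   "cost": 412.50},
    {"resource": "sql-orders-01", "business_unit": "sales",     "cost": 980.10},
    {"resource": "web-portal-01", "business_unit": "sales",     "cost": 233.75},
    {"resource": "shared-hub-01", "business_unit": "shared-it", "cost": 150.00},
]

chargeback = defaultdict(float)
for record in usage_records:
    chargeback[record["business_unit"]] += record["cost"]

for unit, cost in sorted(chargeback.items()):
    print(f"{unit}: {cost:,.2f}")
```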
But you shouldn't implement this model without considering the implications. Here are a few consequences that
are unique to a chargeback model:
Chargeback results in a massive reduction of the overall IT budget. For IT organizations that are inefficient or
require extensive complex technical skills in operations or maintenance, this model can expose those expenses
in an unhealthy way.
Loss of control is a common consequence. In highly political environments, chargeback can result in loss of
control and staff being reallocated to the business. This could create significant inefficiencies and reduce IT's
ability to consistently meet SLAs or project requirements.
Difficulty accounting for shared services is another common consequence. If the organization has grown
through acquisition and is carrying technical debt as a result, it's likely that a high percentage of shared
services must be maintained to keep all systems working together effectively.
Cloud transformations include solutions to these and other consequences associated with a chargeback model. But
each of those solutions includes implementation and operating expenses. The CIO and CFO should carefully weigh
the pros and cons of a chargeback model before considering one.
Showback or awareness-back
For larger enterprises, a showback or awareness-back model is a safer first step in the transition from cost center
to value center. This model doesn't affect financial accounting. In fact, the P&Ls of each organization don't change.
The biggest shift is in mindset and awareness. In a showback or awareness-back model, IT manages the
centralized, consolidated buying power as an agent for the business. In reports back to the business, IT attributes
any direct costs to the relevant business unit, which reduces the perceived budget directly consumed by IT. IT also
plans budgets based on the needs of the associated business units, which allows IT to more accurately account for
costs associated to purely IT initiatives.
This model provides a balance between a true chargeback model and more traditional models of IT accounting.
The Cloud Adoption Framework approaches cloud adoption as a self-service activity. The objective is to empower
each of the teams supporting adoption through standardized approaches. In practice, you can't assume that a self-
service approach will be sufficient for all adoption activities.
Successful cloud adoption programs typically involve at least one level of support. Some cloud adoption efforts
may require support from multiple partners working together towards a common goal.
Partnership options
You are not alone in your cloud journey. There are several options to support your team throughout your cloud
adoption journey.
Azure solution providers (partners): Get connected with Azure expert managed services providers (MSPs)
and other Microsoft partners who have service offerings aligned to the Cloud Adoption Framework
methodologies.
FastTrack for Azure: Use the Microsoft FastTrack for Azure program to accelerate migration.
Azure Migration Program (AMP): The AMP program aligns a mixture of partners and Microsoft employees
to accelerate and support your migration.
Azure solution providers
Microsoft certified solution providers specialize in providing modern customer solutions based on Microsoft
technologies across the world. Optimize your business in the cloud with help from an experienced partner.
Find a Cloud Solution Provider (CSP). A certified CSP can help you take full advantage of the cloud by assessing
business goals for cloud adoption, identifying the right cloud solution that meets business needs, and helping the
business become more agile and efficient.
Azure expert managed services providers (MSP) have undergone a third-party audit to validate a higher tier of
capability, demonstrated through certified staff headcounts, customer references, annual consumption of Azure at
scale, and other criteria.
Find a managed service partner. An Azure managed service partner (MSP) helps a business transition to Azure
by guiding all aspects of the cloud journey. From consulting to migrations and operations management, cloud
MSPs show customers all the benefits that come with cloud adoption. They also act as a one-stop shop for common
support, provisioning, and the billing experience, all with a flexible pay-as-you-go business model.
In parallel to the development of the cloud adoption strategy, the cloud strategy team should start to identify
solution providers that can partner in the delivery of business objectives.
FastTrack for Azure
FastTrack for Azure provides direct assistance from Azure engineers, working hand in hand with partners, to help
customers build Azure solutions quickly and confidently. FastTrack brings best practices and tools from real
customer experiences to guide customers from setup, configuration, and development to production of Azure
solutions.
During a typical FastTrack for Azure engagement, Microsoft helps to define the business vision to plan and develop
Azure solutions successfully. The team assesses architectural needs and provides guidance, design principles, tools,
and resources to help build, deploy, and manage Azure solutions. The team matches skilled partners for
deployment services on request and periodically checks in to ensure that deployment is on track and to help
remove blockers.
Azure Migration Program (AMP)
The Azure Migration Program (AMP) provides a mixture of technical skill building, step-by-step guidance, free
migration tools, and potential offers to reduce migration costs.
The program uses FastTrack for Azure and Azure solution providers to improve customer success during migration.
Watch this short video to get an overview of how the Azure Migration Program can help you.
Azure support
If you have questions or need help, create a support request. If your support request requires deep technical
guidance, visit Azure support plans to align the best plan for your needs.
Shortlist of partner options
During strategy development, it's hard to define specific partnership needs. During development of the cloud
adoption plan and skilling plan, those needs will come into focus.
But, based on the culture and maturity of your team, it may be possible to decide on a partnership option that is
more aligned with your expected needs.
Choose one or more of the partnership options above to narrow down the options to investigate first.
Next steps
After your partner alignment strategy is kicked off, you may want to consider your security strategy next.
Define your security strategy
First cloud adoption project
10/30/2020 • 3 minutes to read
There's a learning curve and a time commitment associated with cloud adoption planning. Even for experienced
teams, proper planning takes time: time to align stakeholders, time to collect and analyze data, time to validate
long-term decisions, and time to align people, processes, and technology. In the most productive adoption efforts,
planning grows in parallel with adoption, improving with each release and with each workload migration to the
cloud. It's important to understand the difference between a cloud adoption plan and a cloud adoption strategy.
You need a well-defined strategy to facilitate and guide the implementation of a cloud adoption plan.
The Cloud Adoption Framework for Azure outlines the processes for cloud adoption and the operation of
workloads hosted in the cloud. Each of the processes across the Strategy, Plan, Ready, Adopt, and Manage
methodologies requires slight expansions of technical, business, and operational skills. Some of those skills can
come from directed learning. But many of them are most effectively acquired through hands-on experience.
Starting a first adoption process in parallel with the development of the plan provides some benefits:
Establish a growth mindset to encourage learning and exploration.
Provide an opportunity for the team to develop necessary skills.
Create situations that encourage new approaches to collaboration.
Identify skill gaps and potential partnership needs.
Provide tangible inputs to the plan.
Next steps
Learn about strategies for balancing competing priorities.
Balance competing priorities
Balance competing priorities
10/30/2020 • 13 minutes to read
Embarking on any digital transformation journey acts as a forcing function for stakeholders across the business
and technology teams. The path to success is firmly rooted in the organization's ability to balance competing
priorities.
Similar to other digital transformations, cloud adoption will expose competing priorities throughout the adoption
lifecycle. Like other forms of transformation, the ability to find balance in those priorities will have a significant
impact on the realization of business value. Balancing these competing priorities will require open and sometimes
difficult conversations between stakeholders (and sometimes with individual contributors).
This article outlines some of the competing priorities commonly discussed during the execution of each
methodology. We hope this advance awareness will help you better prepare for those discussions when developing
your cloud adoption strategy.
The following sections align to the flow of the cloud adoption lifecycle. However, it's important to recognize that
cloud adoption is an iterative process, not a sequential one, and that these competing priorities will emerge (and
sometimes reemerge) at various points along your cloud adoption journey.
Next steps
Learn to balance migration, innovation, and experimentation to maximize the value your cloud migration efforts.
Balance the portfolio
Balance the portfolio
10/30/2020 • 9 minutes to read
Cloud adoption is a portfolio-management effort, cleverly disguised as technical implementation. Like any
portfolio management exercise, balancing the portfolio is critical. At a strategic level, this means balancing
migration, innovation, and experimentation to get the most out of the cloud. When the cloud adoption effort leans
too far in one direction, complexity finds its way into the adoption efforts. This article will guide the reader
through approaches to achieve balance in the portfolio.
OUTCOME    MEASURED BY    GOAL    TIME FRAME    PRIORITY FOR THIS EFFORT
IMPORTANT
The above table is a fictional example and should not be used to set priorities. In many cases, this table could be
considered an antipattern because it places cost savings above customer experiences.
The above table could accurately represent the priorities of the cloud strategy team and the cloud adoption team.
Due to short-term constraints, these teams are placing a higher emphasis on IT cost reduction and prioritizing a
datacenter exit as a means to achieve the desired IT cost reductions. However, by documenting the competing
priorities in this table, the cloud adoption team is empowered to help the cloud strategy team identify
opportunities to better align implementation of the overarching portfolio strategy.
Move fast while maintaining balance
The guidance regarding incremental rationalization of the digital estate suggests an approach in which the
rationalization starts with an unbalanced position. The cloud strategy team should evaluate every workload for
compatibility with a rehost approach. Such an approach is suggested because it allows for the rapid evaluation of
a complex digital estate based on quantitative data. Making such an initial assumption allows the cloud adoption
team to engage quickly, reducing time to business outcomes. However, as stated in that article, qualitative
questions will provide the necessary balance in the portfolio. This article documents the process for creating the
promised balance.
Importance of sunset and retire decisions
The table in the documenting business outcomes section above misses a key outcome that would support the
number one objective of reducing IT costs. When IT cost reductions rank anywhere in the list of business
outcomes, it is important to consider the potential to sunset or retire workloads. In some scenarios, cost savings
can come from not migrating workloads that don't warrant a short-term investment. Some customers have
reported total cost reductions in excess of 20 percent by retiring underutilized workloads.
To balance the portfolio, better reflecting sunset and retire decisions, the cloud strategy team and the cloud
adoption team are encouraged to ask the following questions of each workload within the assess and migrate phases:
Has the workload been used by end users in the past six months?
Is end-user traffic consistent or growing?
Will this workload be required by the business 12 months from now?
If the answer to any of these questions is "no", then the workload could be a candidate for retirement. If retirement
potential is confirmed with the app owner, then it may not make sense to migrate the workload. This prompts a
few qualification questions:
Can a retirement plan or sunset plan be established for this workload?
Can this workload be retired prior to the datacenter exit?
If the answer to both of these questions is "yes", then it would be wise to consider not migrating the workload.
This approach would help meet the objectives of reducing costs and exiting the datacenter.
If the answer to either question is "no", it may be wise to establish a plan for hosting the workload until it can be
retired. This plan could include moving the assets to a lower-cost datacenter or alternative datacenter, which
would also accomplish the objectives of reducing costs and exiting one datacenter.
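A minimal sketch of this decision flow, using hypothetical workload attributes, might look like the following. It only illustrates how the questions above could be applied during the assess phase; the field names are placeholders.

```python
# Illustrative only: flag workloads as retirement candidates based on the
# questions above. Field names are hypothetical.

def retirement_recommendation(workload):
    in_use = (workload["used_last_6_months"]
              and workload["traffic_trend"] in ("consistent", "growing")
              and workload["needed_in_12_months"])
    if in_use:
        return "migrate"

    # Candidate for retirement; confirm the decision with the app owner.
    if workload["sunset_plan_possible"] and workload["retire_before_dc_exit"]:
        return "retire (do not migrate)"
    return "host temporarily in a lower-cost datacenter until retirement"

workload = {
    "used_last_6_months": False,
    "traffic_trend": "declining",
    "needed_in_12_months": False,
    "sunset_plan_possible": True,
    "retire_before_dc_exit": True,
}
print(retirement_recommendation(workload))
```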
Next steps
Understand how global market decisions can affect your transformation journey.
Understand global markets
How will global market decisions affect the
transformation journey?
10/30/2020 • 2 minutes to read
The cloud opens new opportunities to perform on a global scale. Barriers to global operations are significantly
reduced because companies can deploy assets into a market without needing to invest heavily in new datacenters.
Unfortunately, this flexibility also adds a great deal of complexity from technical and legal perspectives.
Data sovereignty
Many geopolitical regions have established data sovereignty regulations. Those regulations restrict where data can
be stored, what data can leave the country of origin, and what data can be collected about citizens of that region.
Before operating any cloud-based solution in a foreign geography, you should understand how that cloud
provider handles data sovereignty. For more information about Azure's approach for each geography, see Azure
geographies. For more information about compliance in Azure, see Privacy at Microsoft in the Microsoft Trust
Center.
The remainder of this article assumes legal counsel has reviewed and approved operations in a foreign country.
Business units
It is important to understand which business units operate in foreign countries, and which countries are affected.
This information will be used to design solutions for hosting, billing, and deployments to the cloud provider.
Next steps
Learn about the skills needed during the Strategy phase of your cloud adoption journey.
Skills needed during the Strategy phase
Define a security strategy
10/30/2020 • 23 minutes to read
The ultimate objectives for a security organization don't change with adoption of cloud services, but how those
objectives are achieved will change. Security teams must still focus on reducing business risk from attacks and
work to get confidentiality, integrity, and availability assurances built into all information systems and data.
As there are different models of cloud services, the responsibilities for each workload will vary depending on
whether it is hosted on software as a service (SaaS), platform as a service (PaaS), infrastructure as a service (IaaS),
or in an on-premises datacenter.
NOTE
Not all examples are obvious, so it's critical to bring a team together with the right stakeholders from data
science teams, business stakeholders, security teams, privacy teams, and others. These teams should have a
responsibility to meet common goals of innovation and responsibility. They should address common issues
such as how and where to store copies of data in insecure configurations, how to classify algorithms, and any
other concerns of your organization.
Microsoft has published our principles of responsible AI to guide our own teams and our customers.
Data ownership and privacy: Regulations like GDPR have increased the visibility of data issues
and applications. Application teams now have the ability to control, protect, and track sensitive
data at a level comparable to tracking financial data by banks and financial institutions. Data
owners and application teams need to build a rich understanding of what data applications store
and what controls are required.
Between organizations and cloud providers: As organizations host workloads in the cloud, they are
entering into a business relationship with each of those cloud providers. The use of cloud services often
brings business value such as:
Accelerating digital transformation initiatives by reducing time to market for new capabilities.
Increasing value of IT and security activities by freeing teams to focus on higher value
(business-aligned) activities rather than lower-level commodity tasks that are provided more
efficiently by cloud services on their behalf.
Increased reliability and responsiveness: Most modern clouds also have extremely high uptime
compared to traditional on-premises datacenters and have shown they can scale rapidly (such as
during the COVID-19 pandemic) and provide resiliency following natural events like lightning strikes
(which would have kept many on-premises equivalents down for much longer).
While extremely beneficial, this shift to the cloud is not without risk. As organizations adopt cloud
services, they should consider potential risk areas including:
Business continuity and disaster recovery: Is the cloud provider financially healthy with a
business model that's likely to survive and thrive during your organization's use of the service? Has
the cloud provider made provisions to allow customer continuity if the provider experiences financial
or other failure, such as providing their source code to customers or open-sourcing it?
For more information and documents regarding Microsoft's financial health, see Microsoft investor
relations.
Security: Does the cloud provider follow industry best practices for security? Has this been validated
by independent regulatory bodies?
Microsoft Cloud App Security allows you to discover usage of over 16,000 cloud apps, which are
ranked and scored based on more than 70 risk factors to provide you with ongoing visibility into
cloud use, shadow IT, and the risk that shadow IT poses to your organization.
The Microsoft Service Trust Portal makes regulatory compliance certifications, audit reports, pen
tests, and more available to customers. These documents include many details of internal security
practices (notably the SOC 2 type 2 report and FedRAMP Moderate system security plan).
Business competitor: Is the cloud provider a significant business competitor in your industry? Do
you have sufficient protections in the cloud services contract or other means to protect your
business against potentially hostile actions?
Review this article for commentary on how Microsoft avoids competing with cloud customers.
Multicloud: Many organizations have a de facto or intentional multicloud strategy. This could be an
intentional objective to reduce reliance on a single supplier or to access unique best of breed
capabilities, but can also happen because developers chose preferred or familiar cloud services, or
your organization acquired another business. Regardless of the reason, this strategy can introduce
potential risks and costs that have to be managed including:
Downtime from multiple dependencies: Systems architected to rely on multiple clouds are
exposed to more sources of downtime risk as disruptions in the cloud providers (or your team's
use of them) could cause an outage/disruption of your business. This increased system
complexity would also increase the likelihood of disruption events as team members are less
likely to fully understand a more complex system.
Negotiating power: Larger organizations also should consider whether a single-cloud (mutual
commitment/partnership) or multicloud strategy (ability to shift business) will achieve greater
influence over their cloud providers to get their organization's feature requests prioritized.
Increased maintenance overhead: IT and security resources already are overburdened from
their existing workloads and keeping up with the changes of a single cloud platform. Each
additional platform further increases this overhead and takes team members away from higher
value activities like streamlining technical process to speed business innovation, consulting with
business groups on more effective use of technologies, and so on.
Staffing and training: Organizations often do not consider the staffing requirements necessary
to support multiple platforms and the training required to maintain knowledge of, and currency with, new
features that are released at a rapid pace.
Cloud monitoring guide: Formulate a monitoring
strategy
10/30/2020 • 20 minutes to read
As you undergo your digital transformation to the cloud, it's important that you plan and develop an effective cloud
monitoring strategy with the participation of developers, operations staff, and infrastructure engineers. The strategy
should be growth-oriented: defined minimally at first, refined iteratively, and always aligned with business needs. Its
outcome is an agile operations model centered on the organization's ability to proactively monitor the complex
distributed applications the business depends on.
Where to start?
To ease your journey to the cloud, use the Strategy phase and the Plan phase of the Cloud Adoption Framework.
Monitoring influences and justifies the motivations, business outcomes, and initiatives. Include monitoring in the
strategy and plan phases, your initiatives, and projects. For example, examine how the first adoption project
establishes early operations management in Azure. Imagine what the cloud operating model needs to look like,
including the role of monitoring. Monitoring is best served with a service-based approach, as an operations
function, where monitoring is an advisory service and a provider of expertise to business and IT consumers.
The following are important areas that strongly influence a sound monitoring strategy:
Monitor the health of your applications, based on their components and their relationships with other
dependencies. Start with the cloud service platform, resources, the network, and lastly the application by
collecting metrics and logs where applicable. For the hybrid cloud model, include on-premises infrastructure
and other systems the application relies on.
Include measuring the end user's experience in your application performance monitoring plan by
mimicking your customers' typical interactions with the application.
Ensure security requirements correspond with your organization's security compliance policy.
Align alerts with what is considered a relevant and practical incident (such as warnings and exceptions) and
align severity with its significance following your incident priority and urgency escalation matrix.
Collect only the metrics and logs that are useful, measurable, and identifiable to the business and IT
organization.
Define an integration plan with existing ITSM solutions such as Remedy or ServiceNow for incident
generation or upstream monitoring. Determine which alerts should be forwarded, whether alert enrichment
is required to support specific filtering requirements, and how to configure the integration.
Understand who needs visibility, what they need to see, and how it should be visualized based on their roles
and responsibilities.
At the heart of operations management, your IT organization needs to establish centralized governance and strict
delegation over approaches to build, operate, and manage IT services.
Initial strategy goals
As an architect or strategic planner, you may need to formulate an early strategy for operations management, in
which monitoring plays a major role. Consider these four outcomes:
1. Manage cloud production services when they go live into production, such as networking, applications,
security and virtual infrastructure.
2. Apply limited resources to rationalize your existing monitoring tools, skills and expertise, and use cloud
monitoring to reduce complexity.
3. Make your monitoring solution processes more efficient so they work faster and more smoothly, operate at
scale, and can change quickly.
4. Account for how your organization will plan for and host monitoring based on cloud models. Work towards
the goal of reducing your requirements as the organization transitions from IaaS to PaaS, and then to SaaS.
High-level modeling
As the business determines what services to move, you need to invest your resources carefully. On-premises, you
own all responsibilities for monitoring and are heavily invested. The moves made toward SaaS services, for
example, do not eliminate your monitoring responsibility. You'll decide who needs access, who gets alerts, and who
needs access to analytics at a minimum. Azure Monitor and Azure Arc are Azure services with the flexibility of
addressing monitoring scenarios across all four cloud models, not just resources inside Azure. You need to look
beyond the common cloud models. If you're using Microsoft Office applications delivered by
Microsoft 365 services in your organization, you'll need to include security and compliance monitoring with
Microsoft 365 in addition to Azure Security Center. This includes identities, endpoint management, and device
monitoring outside of your corporate network.
Monitoring informs strategy
Consider where early monitoring capability informs strategy. Many decisions depend on early monitoring data in
order to build a capability roadmap that guides limited resources and adds confidence. Strategies also need real-
world input from monitoring of service enablement.
Consider the role monitoring plays in strategies to incrementally protect and secure the digital estate:
Activity logs and security monitoring are needed to measure directory usage and external sharing of
sensitive content, and to inform an incremental approach that layers on protective features and achieves the right
balance with privacy monitoring.
Policies and baselines will inform the rationalization objective (migrate, lift and shift, or rearchitect) and
improve confidence that data and information can be migrated from on-premises to cloud services.
Later in this guide, discover some common monitoring scenarios or use cases that will help accelerate adoption.
Formulate initiatives
As a monitoring expert or systems administrator, you've discovered that cloud monitoring is faster and easier to
establish, leading to inexpensive demos or proofs-of-value. To overcome the tendency to stay in demo mode, you
need to stay in constant touch with strategy and be able to execute on production-focused monitoring plans.
Because strategy has plenty of uncertainty and unknowns, you won't know all of the monitoring requirements in
advance. Therefore, decide on the first set of adoption plans, based on what is minimally viable to the business and
IT management. You may call this a core capability - that which is needed to begin the journey. Here are two
example initiatives that help declare forward motion:
Initiative 1: to reduce the diversity and complexity of our current monitoring investment, we will invest in
establishing a core capability using Azure Monitor first, given that the same skills and readiness apply to other
areas of cloud monitoring.
Initiative 2: to decide on how we use our license plans for identity, access, and overall information protection,
we will help the security and privacy offices establish early activity monitoring of users and content as they
migrate to the cloud, to clarify questions on classification labels, data loss prevention, encryption, and
retention policies.
Consider scale
Consider scale in your strategy and who will be defining and standardizing monitoring as code. Your organization
should plan to build standardized solutions using a combination of tools such as:
Azure Resource Manager templates.
Azure Policy monitoring initiative definitions and policies.
GitHub to establish a source control for the scripts, code, and documentation.
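The sketch below illustrates the monitoring-as-code idea at its simplest: a standardized baseline, kept in source control, that workload-specific monitoring definitions are validated against before deployment. The names and settings are hypothetical; in practice the definitions would typically be Azure Resource Manager templates and Azure Policy assignments rather than Python dictionaries.

```python
# Illustrative only: validate a workload's monitoring definition against a
# standardized baseline kept in source control. All names are hypothetical.

baseline = {
    "diagnostic_settings_enabled": True,
    "required_alert_rules": {"cpu_high", "service_unavailable"},
    "min_log_retention_days": 30,
}

workload_definition = {
    "diagnostic_settings_enabled": True,
    "alert_rules": {"cpu_high"},
    "log_retention_days": 90,
}

issues = []
if baseline["diagnostic_settings_enabled"] and not workload_definition["diagnostic_settings_enabled"]:
    issues.append("diagnostic settings are not enabled")
missing_rules = baseline["required_alert_rules"] - workload_definition["alert_rules"]
if missing_rules:
    issues.append(f"missing alert rules: {sorted(missing_rules)}")
if workload_definition["log_retention_days"] < baseline["min_log_retention_days"]:
    issues.append("log retention is below the baseline minimum")

print(issues if issues else "definition meets the baseline")
```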
Consider privacy and security
In Azure, you'll need to secure certain monitoring data emitted by resources, along with the control plane actions
that are logged in Azure, known as activity logs. Also secure specialized logs that record user activity, such as the
Azure Active Directory sign-in and audit logs and, if integrated, the Microsoft 365 unified audit log, because they
contain sensitive data that may need to be protected under privacy laws.
Your monitoring strategy should include these components:
Separate non-monitoring data from monitoring data
Restrict access to resources
Consider business continuity
Azure Monitor collects, indexes, and analyzes real-time machine and resource-generated data to support your
operations and help drive business decisions. Under rare circumstances, it is possible that facilities in an entire
region can become inaccessible, for example due to network failures. Or facilities can be lost entirely, for example
due to a natural disaster. By relying on these services in the cloud, your planning isn't focused on infrastructure
resiliency and high availability; rather, it's planning for:
Availability for data ingestion from all your dependent services and resources in Azure, resources in other
clouds, and from on-premises.
Data availability for insights, solutions, workbooks and other visualizations, alerting, integration with ITSM, and
other control plane services in Azure supporting your operational requirements.
Create a recovery plan, and make sure that it covers data restoration, network outages, dependent service failures,
and region-wide service disruptions.
Consider maturity
Maturity is an important consideration in your monitoring strategy. We recommend that you start minimally,
gather data, and use that information to determine the strategy. The first monitoring solutions you'll want are
those that ensure observability, including responsive processes such as incident and problem management. Here, you will:
Create one or more Log Analytics workspaces
Enable agents
Enable resource diagnostic settings
Enable initial alert rules
Over time, as you gain confidence in Azure Monitor capabilities and need to measure health indicators, expand the
focus to the collection of logs, the enabling and use of insights and metrics, and the definition of log search queries
that drive the measurement and calculation of what is healthy or unhealthy.
Learning cycles include getting monitoring data and insights into the hands of managers, and ensuring the right
consumers have monitoring data they need. Learning cycles include continual tuning and optimizing of your initial
monitoring plans to adapt, to improve service, and inform adoption plans.
1 The DIKW model is an often-used method, with roots in knowledge management, to explain the ways we move
from data to information, knowledge, and wisdom, with a component of actions and decisions.
Monitoring is foundational for services you build in Azure. Your strategy can address these four disciplines of
modern monitoring, to help you define minimum viable monitoring, and gain confidence in steps. Moving your
capability from reactive to proactive and scaling its reach to end users is but one goal.
Observe: First, you should focus on establishing monitoring to observe the health and status of Azure
services and resources. Configure basic monitoring and then automate with Azure Policy and Azure
Resource Manager templates, to establish initial visibility of services and their warranty: availability,
performance or capacity, security, and configuration compliance. For example, based on minimum viable
setup of Azure Monitor, configure resources for monitoring and diagnostics, and set up alerts and insights.
Include knowledge and readiness of monitoring consumers, defining and triggering from events, for service
work such as incidents and problems. One indicator of maturity is how much can be automated to reduce
unnecessary human costs to manually observe health and status. Knowing which services are healthy is as
important as being alerted on services that are unhealthy.
Measure: Configure collection of metrics and logs from all resources to monitor for symptoms and conditions
that indicate potential or actual impact on the availability of the service, or on the consumers of the
service or application. For example:
When using a feature in the application, does it show response-time latency, return an error when
something is selected, or become unresponsive?
Ensure services are meeting service agreements by measuring the utility of the service or application. (A
simple measurement sketch follows this list.)
Respond: Based on the context of known issues to observe and measure, evaluate what qualifies as a bug,
auto-remediation, or requires manual response based on what is classified as an incident, problem, or
change.
Learn and improve: Providers and consumers participating in learning cycles consume actual
monitoring data through insights, reports, and workbooks to continually improve the target service and to
tune and optimize the monitoring configuration. Change matters too: the monitoring configuration must
change in tandem with changes to the service (such as new, modified, or retired components) and
continue to match the actual service warranty.
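As a simple illustration of the measure discipline, the following sketch computes an availability percentage from hypothetical health-probe results and compares it with a hypothetical service-level target. Real measurements would come from the metrics and logs you collect.

```python
# Illustrative only: measure availability from hypothetical probe results
# and compare it against a hypothetical service-level target.

probe_results = [True] * 1430 + [False] * 10   # True means a healthy response

availability = 100.0 * sum(probe_results) / len(probe_results)
target = 99.5  # hypothetical availability objective, in percent

print(f"Measured availability: {availability:.2f}% (target {target}%)")
if availability < target:
    print("Below target: respond per your incident process.")
```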
To help you align monitoring plans to strategy, use the following table to categorize the different monitoring
scenarios that occur in more detail. This works with the five Rs of rationalization introduced earlier in the Plan
phase. If you're using System Center Operations Manager, you have hybrid and cloud options available to
rationalize your investment.
Health and status monitoring Holistically observe, measure, learn, and improve the long-
term warranty of the service or component, including service
levels, in these aspects taken together: availability, capacity,
performance, security, and compliance. A healthy system,
service or component is online, performing well, secure and
compliant. Health monitoring includes logs and is stateful with
real-time health states and metrics. It also includes trending
reports, insights, and trends focused on service usage.
Cost monitoring Monitor usage and estimate costs using Azure Monitor and
Azure Cost Management + Billing as a new primary objective.
The Azure Cost Management + Billing APIs provide the ability
to explore cost and usage data using multidimensional
analysis.
TERTIARY OBJECTIVES    GOAL AND OUTCOME
Activity monitoring Observe, measure, learn, and improve usage, security, and
compliance from sources such as Azure activity logs, audit
logs, and the Microsoft 365 unified audit log for subscription
level events, actions on resources, user and administrator
activity, content, data, and for your security and compliance
needs in Azure and Microsoft 365.
Service usage Service owners want analytics and insights to measure, learn,
and improve the usage of Azure and Microsoft 365 services
(IaaS, PaaS, SaaS) with service usage reports, analytics, and
insights. Ensure plans include who will need access to the
admin portals, dashboards, insights, and reports.
Service and resource health Observe the health of your cloud resources, as well as service
outages and advisories from Microsoft, to stay informed about
incidents and maintenance. Include resource health in
monitoring of the availability of your resources and alert on
changes in availability.
Capacity and performance monitoring In support of health monitoring, your needs may require more
depth and specialization.
Change and compliance monitoring Observe, measure, learn, and improve configuration
management of resources, which should now include security
in the formulation, influenced by good use of Azure Policy to
standardize monitoring configurations and enforce security
hardening. Log data to filter on key changes being made on
resources.
Identity and access monitoring Observe, measure, learn, and improve both the usage and
security of Active Directory, Azure Active Directory, and
identity management that integrates users, applications,
devices, and other resources no matter where they are.
Information protection Not only Azure Monitor but also Azure Information Protection,
depending on the plan, includes usage analytics that are critical to the
development of a robust information protection strategy
across Azure and Microsoft.
Threat management and integrated threat protection The cloud brings together the separate, traditional roles of
security monitoring with health monitoring. Integrated threat
protection, for example, involves monitoring to accelerate an
optimal state of zero trust. Integrating Azure Advanced Threat
Protection allows a migration from using System Center
Operations Manager to monitor Active Directory, and
integrate your Active Directory security-related signals to
detect advanced attacks in hybrid environments.
User stories and tasks    The end result is a monitoring configuration or solution. Examples: network monitoring
(for example, ExpressRoute); standardized IaaS VM monitoring (for example, Azure Monitor for VMs, Application
Insights, Azure Policy, settings, policies, reports, workspaces).
In summary, your monitoring consumer roles probably need broad access, versus your developers and system
administrators who only need role-based access to certain Azure resources. As an additional restriction, ensure you
exempt readers from access to sensitive monitoring data such as security, sign-in and user activity logs.
Establish readiness
Early on, formulate a readiness plan to help your IT staff adopt new skills, practices, and techniques for cloud
monitoring in Azure. Consider the skills readiness guidance for monitoring that includes foundational needs, as
well as those specific to monitoring.
Responsible AI
10/30/2020 • 2 minutes to read
Driven by ethical principles that put people first, Microsoft is committed to advancing AI. We want to partner with
you to support this endeavor.
Responsible AI principles
As you implement AI solutions, consider the following principles in your solution:
Fairness: AI systems should treat all people fairly.
Reliability and safety: AI systems should perform reliably and safely.
Privacy and security: AI systems should be secure and respect privacy.
Inclusiveness: AI systems should empower everyone and engage people.
Transparency: AI systems should be understandable.
Accountability: People should be accountable for AI systems.
During the Plan phase of a migration journey, the objective is to develop the plans necessary to guide migration
implementation. This phase requires a few critical skills, including:
Establishing the vision.
Building the business justification.
Rationalizing the digital estate.
Creating a migration backlog (technical plan).
The following sections provide learning paths to develop each of these skills.
Organizational skills
Depending on the motivations and desired business outcomes of a cloud adoption effort, leaders might need to
establish new organizational structures or virtual teams to facilitate various functions. These articles will help you
develop the skills necessary to structure those teams to meet desired outcomes:
Initial organizational alignment. Overview of organizational alignment and various team structures to facilitate
specific goals.
Breaking down silos and fiefdoms. Understanding two common organizational antipatterns and ways to guide
a team to productive collaboration.
Microsoft Learn
Microsoft Learn is a new approach to learning. Readiness for the new skills and responsibilities that come with
cloud adoption doesn't come easily. Microsoft Learn provides a more rewarding approach to hands-on learning
that helps you achieve your goals faster. Earn points and levels, and achieve more!
Here is an example of a tailored learning path that aligns to the Strategy methodology of the Cloud Adoption
Framework.
Learn the business value of Microsoft Azure: This learning experience will take you on a journey that will begin by
showing you how digital transformation and the power of the cloud can transform your business. We will cover
how Microsoft Azure cloud services can power your organization on a trusted cloud platform. Finally, we will wrap
up by illustrating how to make this journey real for your organization.
Learn more
To discover additional learning paths, browse the Microsoft Learn catalog. Use the Roles filter to align learning
paths with your role.
Develop a cloud adoption plan
10/30/2020 • 2 minutes to read
Cloud adoption plans convert the aspirational goals of a cloud adoption strategy into an actionable plan. The
collective cloud teams can use the cloud adoption plan to guide their technical efforts and align them with the
business strategy.
The following exercises will help you document your technology strategy. This approach captures prioritized
tasks to drive adoption efforts. The cloud adoption plan then maps to the metrics and motivations defined in the
cloud adoption strategy.
Download the strategy and plan template to track the outputs of each exercise as you build out your cloud
adoption strategy. Also, learn about the five Rs of cloud rationalization to help build your cloud adoption plan.
Cloud rationalization
10/30/2020 • 4 minutes to read
Cloud rationalization is the process of evaluating assets to determine the best way to migrate or modernize each
asset in the cloud. For more information about the process of rationalization, see What is a digital estate?.
Rationalization context
The five Rs of rationalization listed in this article are a great way to label a potential future state for any workload
that's being considered as a cloud candidate. This labeling process should be put into the correct context before
you attempt to rationalize an environment. Review the following myths to provide that context:
Myth: It's easy to make rationalization decisions early in the process
Accurate rationalization requires a deep knowledge of the workload and associated assets (applications,
infrastructure, and data). Most importantly, accurate rationalization decisions take time. We recommend using an
incremental rationalization process.
Myth: Cloud adoption has to wait for all workloads to be rationalized
Rationalizing an entire IT portfolio or even a single datacenter can delay the realization of business value by
months or even years. Full rationalization should be avoided when possible. Instead, use the Power of 10
approach to release planning to make wise decisions about the next 10 workloads that are slated for cloud
adoption.
Myth: Business justification has to wait for all workloads to be rationalized
To develop a business justification for a cloud adoption effort, make a few basic assumptions at the portfolio
level. When motivations are aligned to innovation, assume rearchitecture. When motivations are aligned to
migration, assume rehost. These assumptions can accelerate the business justification process. Assumptions are
then challenged and budgets refined during the Assess phase of each workload's adoption cycles.
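A minimal sketch of those portfolio-level assumptions, using hypothetical motivation labels, might look like this:

```python
# Illustrative only: portfolio-level rationalization assumptions used to
# accelerate the business justification. The labels are hypothetical.

def initial_assumption(primary_motivation):
    if primary_motivation == "innovation":
        return "rearchitect"
    if primary_motivation == "migration":
        return "rehost"
    return "defer: refine during the assess phase of each workload"

print(initial_assumption("migration"))  # prints "rehost"
```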
Now review the following five Rs of rationalization to familiarize yourself with the long-term process. While
developing your cloud adoption plan, choose the option that best aligns with your motivations, business
outcomes, and current state environment. The goal in digital estate rationalization is to set a baseline, not to
rationalize every workload.
Rehost
Also known as a lift and shift migration, a rehost effort moves a current state asset to the chosen cloud provider,
with minimal change to overall architecture.
Common drivers might include:
Reducing capital expense.
Freeing up datacenter space.
Achieving rapid return on investment in the cloud.
Quantitative analysis factors:
VM size (CPU, memory, storage).
Dependencies (network traffic).
Asset compatibility.
Qualitative analysis factors:
Tolerance for change.
Business priorities.
Critical business events.
Process dependencies.
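As a rough illustration of how the quantitative factors above might feed a first-pass rehost assessment, the sketch below flags assets that are compatible and within hypothetical size limits. The thresholds and field names are placeholders, not guidance.

```python
# Illustrative only: first-pass rehost screening based on quantitative factors.
# Thresholds and field names are hypothetical placeholders.

def rehost_screen(asset):
    if not asset["cloud_compatible"]:
        return "flag for rearchitect or rebuild review"
    if asset["vcpus"] > 64 or asset["memory_gb"] > 432:
        return "review sizing before rehost"
    if asset["network_dependencies"] > 20:
        return "map dependencies before rehost"
    return "rehost candidate"

asset = {"cloud_compatible": True, "vcpus": 8, "memory_gb": 32, "network_dependencies": 4}
print(rehost_screen(asset))
```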
Refactor
Platform as a service (PaaS) options can reduce the operational costs that are associated with many applications.
It's a good idea to slightly refactor an application to fit a PaaS-based model.
"Refactor" also refers to the application development process of refactoring code to enable an application to
deliver on new business opportunities.
Common drivers might include:
Faster and shorter updates.
Code portability.
Greater cloud efficiency (resources, speed, cost, managed operations).
Quantitative analysis factors:
Application asset size (CPU, memory, storage).
Dependencies (network traffic).
User traffic (page views, time on page, load time).
Development platform (languages, data platform, middle-tier services).
Database (CPU, memory, storage, version).
Qualitative analysis factors:
Continued business investments.
Bursting options or timelines.
Business process dependencies.
Rearchitect
Some aging applications aren't compatible with cloud providers because of the architectural decisions that were
made when the application was built. In these cases, the application might need to be rearchitected before
transformation.
In other cases, applications that are cloud-compatible, but not cloud-native, might create cost efficiencies and
operational efficiencies by rearchitecting the solution into a cloud-native application.
Common drivers might include:
Application scale and agility.
Easier adoption of new cloud capabilities.
Mix of technology stacks.
Quantitative analysis factors:
Application asset size (CPU, memory, storage).
Dependencies (network traffic).
User traffic (page views, time on page, load time).
Development platform (languages, data platform, middle tier services).
Database (CPU, memory, storage, version).
Qualitative analysis factors:
Growing business investments.
Operational costs.
Potential feedback loops and DevOps investments.
Rebuild
In some scenarios, the delta that must be overcome to carry an application forward can be too large to justify
further investment. This is especially true for applications that previously met the needs of a business but are
now unsupported or misaligned with the current business processes. In this case, a new code base is created to
align with a cloud-native approach.
Common drivers might include:
Accelerating innovation.
Building applications faster.
Reducing operational cost.
Quantitative analysis factors:
Application asset size (CPU, memory, storage).
Dependencies (network traffic).
User traffic (page views, time on page, load time).
Development platform (languages, data platform, middle tier services).
Database (CPU, memory, storage, version).
Qualitative analysis factors:
Declining end-user satisfaction.
Business processes limited by functionality.
Potential cost, experience, or revenue gains.
Replace
Solutions are typically implemented by using the best technology and approach available at the time. Sometimes
software as a service (SaaS) applications can provide all the necessary functionality for the hosted application. In
these scenarios, a workload can be scheduled for future replacement, effectively removing it from the
transformation effort.
Common drivers might include:
Standardizing around industry best practices.
Accelerating adoption of business-process-driven approaches.
Reallocating development investments into applications that create competitive differentiation or advantages.
Quantitative analysis factors:
General operating-cost reductions.
VM size (CPU, memory, storage).
Dependencies (network traffic).
Assets to be retired.
Database (CPU, memory, storage, version).
Qualitative analysis factors:
Cost benefit analysis of the current architecture versus a SaaS solution.
Business process maps.
Data schemas.
Custom or automated processes.
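Taken together, these five options can be captured as simple working assumptions in the adoption plan. The following minimal Python sketch (not part of the framework; the signal fields and workload names are illustrative assumptions) shows one way a team might record a rehost default for every workload and override it only where early signals clearly point to another of the five Rs.

```python
# A minimal sketch, assuming hypothetical signal fields: default every workload
# to "rehost" and override only where early indicators point to another option.
# These are working assumptions to be challenged during the Assess phase.
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    saas_equivalent_available: bool = False   # possible "replace" signal
    unsupported_codebase: bool = False        # possible "rebuild" signal
    rationalization: str = "rehost"           # baseline assumption

def apply_baseline(workloads: list) -> None:
    """First-pass rationalization; every result is revisited per workload later."""
    for w in workloads:
        if w.saas_equivalent_available:
            w.rationalization = "replace"
        elif w.unsupported_codebase:
            w.rationalization = "rebuild"
        # otherwise keep the rehost default to accelerate business justification

estate = [Workload("payroll", saas_equivalent_available=True), Workload("booking-api")]
apply_baseline(estate)
for w in estate:
    print(w.name, "->", w.rationalization)
```

Each resulting assumption is then validated during the Assess phase of that workload's adoption cycle.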
Next steps
Collectively, you can apply these five Rs of rationalization to a digital estate to help you make rationalization
decisions about the future state of each application.
What is a digital estate?
What is a digital estate?
10/30/2020 • 2 minutes to read • Edit Online
Every modern company has some form of digital estate. Much like a physical estate, a digital estate is an
abstract reference to a collection of tangible owned assets. In a digital estate, those assets include virtual
machines (VMs), servers, applications, data, and so on. Essentially, a digital estate is the collection of IT assets
that power business processes and supporting operations.
The importance of a digital estate is most obvious during the planning and execution of digital transformation
efforts. During transformation journeys, the cloud strategy teams use the digital estate to map the business
outcomes to release plans and technical efforts. That all starts with an inventory and measurement of the
digital assets that the organization owns today.
Each type of transformation can be measured with any of the above views. Companies commonly complete
all these transformations in parallel. We strongly recommend that company leadership and the cloud
strategy team agree regarding the transformation that is most important for business success. That
understanding serves as the basis for common language and metrics across multiple initiatives.
Next steps
Before digital estate planning begins, determine which approach to use.
Approaches to digital estate planning
Approaches to digital estate planning
10/30/2020 • 3 minutes to read • Edit Online
Digital estate planning can take several forms depending on the desired outcomes and size of the existing estate.
There are various approaches that you can take. It's important to set expectations regarding the approach early in
planning cycles. Unclear expectations often lead to delays associated with additional inventory-gathering
exercises. This article outlines three approaches to analysis.
Workload-driven approach
The top-down assessment approach evaluates security aspects. Security includes the categorization of data (high,
medium, or low business impact), compliance, sovereignty, and security risk requirements. This approach assesses
high-level architectural complexity. It evaluates aspects such as authentication, data structure, latency
requirements, dependencies, and application life expectancy.
The top-down approach also measures the operational requirements of the application, such as service levels,
integration, maintenance windows, monitoring, and insight. When these aspects have been analyzed and
considered, the resulting score reflects the relative difficulty of migrating this application to each cloud platform:
IaaS, PaaS, and SaaS.
In addition, the top-down assessment evaluates the financial benefits of the application, such as operational
efficiencies, TCO, return on investment, and other appropriate financial metrics. The assessment also examines the
seasonality of the application (such as whether there are certain times of the year when demand spikes) and
overall compute load.
It also looks at the types of users it supports (casual/expert, always/occasionally logged on), and the required
scalability and elasticity. Finally, the assessment concludes by examining business continuity and resiliency
requirements, as well as dependencies for running the application if a disruption of service should occur.
TIP
This approach requires interviews and anecdotal feedback from business and technical stakeholders. Availability of key
individuals is the biggest risk to timing. The anecdotal nature of the data sources makes it more difficult to produce accurate
cost or timing estimates. Plan schedules in advance and validate any data that's collected.
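As a rough illustration only, the following Python sketch shows how interview answers from a top-down review might roll up into a relative difficulty score per workload. The factor names and weights are assumptions made for this example, not values defined by the framework.

```python
# Illustrative scoring sketch: convert categorical interview answers into a
# relative migration-difficulty score. Factor names and weights are assumptions.
DIFFICULTY_WEIGHTS = {
    "data_classification": {"low": 1, "medium": 2, "high": 3},
    "architectural_complexity": {"low": 1, "medium": 3, "high": 5},
    "latency_sensitivity": {"low": 1, "medium": 2, "high": 4},
    "compliance_requirements": {"none": 0, "regional": 2, "strict": 4},
}

def difficulty_score(answers: dict) -> int:
    """Sum the weighted answers; a higher score suggests a harder migration."""
    return sum(DIFFICULTY_WEIGHTS[factor][value] for factor, value in answers.items())

crm_app = {
    "data_classification": "high",
    "architectural_complexity": "medium",
    "latency_sensitivity": "low",
    "compliance_requirements": "regional",
}
print("CRM relative difficulty:", difficulty_score(crm_app))  # 3 + 3 + 1 + 2 = 9
```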
Asset-driven approach
The asset-driven approach provides a plan based on the assets that support an application for migration. In this
approach, you pull statistical usage data from a configuration management database (CMDB) or other
infrastructure assessment tools.
This approach usually assumes an IaaS model of deployment as a baseline. In this process, the analysis evaluates
the attributes of each asset: memory, number of processors (CPU cores), operating system storage space, data
drives, network interface cards (NICs), IPv6, network load balancing, clustering, operating system version,
database version (if necessary), supported domains, and third-party components or software packages, among
others. The assets that you inventory in this approach are then aligned with workloads or applications for
grouping and dependency mapping purposes.
TIP
This approach requires a rich source of statistical usage data. The time that's needed to scan the inventory and collect data
is the biggest risk to timing. The low-level data sources can miss dependencies between assets or applications. Plan for at
least one month to scan the inventory. Validate dependencies before deployment.
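As an illustration of the data shape this approach produces, the sketch below models a single asset record with the attributes listed above. The field names are assumptions for this example; real records would come from your CMDB or assessment tool and be aligned to workloads afterward.

```python
# A minimal, assumed record shape for asset-driven discovery. Fields mirror the
# attributes discussed above; the workload field is filled in later for grouping.
from dataclasses import dataclass
from typing import Optional

@dataclass
class AssetRecord:
    hostname: str
    cpu_cores: int
    memory_gb: float
    os_storage_gb: float
    data_storage_gb: float
    nic_count: int
    os_version: str
    database_version: Optional[str] = None
    workload: Optional[str] = None  # assigned later for grouping and dependency mapping

web01 = AssetRecord("WEBVM", cpu_cores=4, memory_gb=16, os_storage_gb=128,
                    data_storage_gb=0, nic_count=2,
                    os_version="Windows Server 2008 R2", workload="SmartHotel360")
print(web01)
```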
Incremental approach
We strongly suggest an incremental approach, as we do for many processes in the Cloud Adoption Framework. In
the case of digital estate planning, that equates to a multiphase process:
Initial cost analysis: If financial validation is required, start with an asset-driven approach, described
earlier, to get an initial cost calculation for the entire digital estate, with no rationalization. This establishes a
worst-case scenario benchmark.
Migration planning: After you have assembled a cloud strategy team, build an initial migration backlog
using a workload-driven approach that's based on their collective knowledge and limited stakeholder
interviews. This approach quickly builds a lightweight workload assessment to foster collaboration.
Release planning: At each release, the migration backlog is pruned and reprioritized to focus on the most
relevant business impact. During this process, the next five to ten workloads are selected as prioritized
releases. At this point, the cloud strategy team invests the time to complete an exhaustive workload-driven assessment. Delaying that assessment until a workload is aligned to a release respects stakeholders' time. It also defers the investment in full analysis until the business starts to see results from earlier efforts.
Execution analysis: Before migrating, modernizing, or replicating any asset, assess it both individually
and as part of a collective release. At this point, the data from the initial asset-driven approach can be
scrutinized to ensure accurate sizing and operational constraints.
TIP
This incremental approach enables streamlined planning and accelerated results. It's important that all parties involved
understand the approach to delayed decision making. It's equally important that assumptions made at each stage be
documented to avoid loss of details.
Next steps
After an approach is selected, the inventory can be collected.
Gather inventory data
Gather inventory data for a digital estate
10/30/2020 • 2 minutes to read • Edit Online
Developing an inventory is the first step in digital estate planning. In this process, a list of IT assets that support specific business functions is collected for later analysis and rationalization. This article assumes that a bottom-up approach to analysis is most appropriate for planning. For more information, see Approaches to digital estate planning.
Next steps
After an inventory is compiled and validated, it can be rationalized. Inventory rationalization is the next step to
digital estate planning.
Rationalize the digital estate
Rationalize the digital estate
10/30/2020 • 10 minutes to read • Edit Online
Cloud rationalization is the process of evaluating assets to determine the best approach to hosting them in the cloud. After you've determined an approach and aggregated an inventory, cloud rationalization can begin. This article discusses the most common rationalization options.
Incremental rationalization
The complete rationalization of a large digital estate is prone to risk and can suffer delays because of its
complexity. The assumption behind the incremental approach is that delayed decisions stagger the load on the
business to reduce the risk of roadblocks. Over time, this approach creates an organic model for developing the
processes and experience required to make qualified rationalization decisions more efficiently.
Inventory: Reduce discovery data points
Few organizations invest the time, energy, and expense in maintaining an accurate real-time inventory of the full
digital estate. Loss, theft, refresh cycles, and employee onboarding often justify detailed asset tracking of end-
user devices. The ROI of maintaining an accurate server and application inventory in a traditional, on-premises
datacenter is often low. Most IT organizations have more urgent issues to address than tracking the usage of
fixed assets in a datacenter.
In a cloud transformation, inventory directly correlates to operating costs. Accurate inventory data is required
for proper planning. Unfortunately, current environmental scanning options can delay decisions by weeks or
months. Fortunately, a few tricks can accelerate data collection.
Agent-based scanning is the most frequently cited delay. The robust data that's required for a traditional
rationalization can often only be collected with an agent running on each asset. This dependency on agents often
slows progress, because it can require feedback from security, operations, and administration functions.
In an incremental rationalization process, an agentless solution could be used for an initial discovery to
accelerate early decisions. Depending on the level of complexity in the environment, an agent-based solution
might still be required, but it can be removed from the critical path to business change.
Quantitative analysis: Streamline decisions
Regardless of the approach to inventory discovery, quantitative analysis can drive initial decisions and
assumptions. This is especially true when trying to identify the first workload or when the goal of rationalization
is a high-level cost comparison. In an incremental rationalization process, the cloud strategy team and the cloud
adoption teams limit the five Rs of rationalization to two concise decisions and only apply those quantitative
factors. This streamlines the analysis and reduces the amount of initial data that's required to drive change.
For example, if an organization is in the midst of an IaaS migration to the cloud, you can assume that most
workloads will either be retired or rehosted.
Qualitative analysis: Temporary assumptions
By reducing the number of potential outcomes, it's easier to reach an initial decision about the future state of an
asset. When you reduce the options, you also reduce the number of questions asked of the business at this early
stage.
For example, if the options are limited to rehosting or retiring, the business needs to answer only one question
during initial rationalization, which is whether to retire the asset.
"Analysis suggests that no users are actively using this asset. Is that accurate, or have we overlooked
something?" Such a binary question is typically much easier to run through qualitative analysis.
This streamlined approach produces baselines, financial plans, strategy, and direction. In later activities, each
asset goes through further rationalization and qualitative analysis to evaluate other options. All assumptions
that you make in this initial rationalization are tested before migrating individual workloads.
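A minimal sketch of that streamlined first pass, assuming a hypothetical usage metric from the discovery scan, might look like the following. The threshold and asset names are illustrative only.

```python
# Streamlined two-outcome first pass: retire-candidate or rehost.
# The 90-day usage metric and the zero-usage threshold are assumptions.
def initial_decision(active_users_last_90_days: int) -> str:
    """Binary first-pass rationalization; every result is validated with the business."""
    return "retire-candidate" if active_users_last_90_days == 0 else "rehost"

scan_results = {"legacy-report-server": 0, "booking-api": 1450}
for asset, users in scan_results.items():
    print(asset, "->", initial_decision(users))
```

Each retire candidate then goes to the business as the single binary question described above.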
Challenge assumptions
The outcome of the prior section is a rough rationalization that's full of assumptions. Next, it's time to challenge
some of those assumptions.
Retire assets
In a traditional on-premises environment, hosting small, unused assets seldom has a significant impact on annual costs. With a few exceptions, the full-time equivalent (FTE) effort required to analyze and retire an asset outweighs the cost savings from pruning and retiring those assets.
When you move to a cloud accounting model, retiring assets can produce significant savings in annual
operating costs and up-front migration efforts.
It's not uncommon for organizations to retire 20% or more of their digital estate after completing a quantitative
analysis. We recommend conducting further qualitative analysis before taking action. After it's confirmed,
retiring those assets can produce the first ROI victory of the cloud migration. This is often one of the biggest
cost-saving factors. Therefore, the cloud strategy team should oversee the validation and retirement of assets, in
parallel with execution of the Migrate methodology, to achieve an early financial win.
Program adjustments
A company seldom embarks on just one transformation journey. The choice between cost reduction, market
growth, and new revenue streams is rarely a binary decision. As such, we recommend that the cloud strategy
team work with IT to identify assets on parallel transformation efforts that are outside of the scope of the
primary transformation journey.
In the IaaS migration example given in this article:
Ask the DevOps team to identify assets that are already part of a deployment automation and remove
those assets from the core migration plan.
Ask the data and R&D teams to identify assets that are powering new revenue streams and remove them
from the core migration plan.
This program-focused qualitative analysis can be executed quickly and creates alignment across multiple
migration backlogs.
You might still need to consider some assets as rehost assets for a while. You can phase in later rationalization
after the initial migration.
Release planning
While the cloud adoption team is executing the migration or implementation of the first workload, the cloud
strategy team can begin prioritizing the remaining applications and workloads.
Power of 10
The traditional approach to rationalization attempts to meet all foreseeable needs. Fortunately, a plan for every
application is often not required to start a transformation journey. In an incremental model, the Power of 10
approach provides a good starting point. In this model, the cloud strategy team selects the first 10 applications
to be migrated. Those ten workloads should contain a mixture of simple and complex workloads.
Build the first backlogs
The cloud adoption teams and the cloud strategy team can work together on the qualitative analysis for the first
10 workloads. This effort creates the first prioritized migration backlog and the first prioritized release backlog.
This method enables the teams to iterate on the approach and provides sufficient time to create an adequate
process for qualitative analysis.
Mature the process
After the two teams agree on the qualitative analysis criteria, assessment can become a task within each
iteration. Reaching consensus on assessment criteria usually requires two to three releases.
After the assessment has moved into the incremental execution process of migration, the cloud adoption team can iterate faster on assessment and architecture. At this stage, the cloud strategy team is less involved in detailed assessment, which reduces the drain on its time. It can instead focus on prioritizing the applications that aren't yet in a specific release, ensuring tight alignment with changing market conditions.
Not all of the prioritized applications will be ready for migration. Sequencing is likely to change as the team does
deeper qualitative analysis and discovers business events and dependencies that might prompt reprioritization
of the backlog. Some releases might group together a small number of workloads. Others might just contain a
single workload.
The cloud adoption team is likely to run iterations that don't produce a complete workload migration. The
smaller the workload, and the fewer dependencies, the more likely a workload is to fit into a single sprint or
iteration. For this reason, we recommend that the first few applications in the release backlog be small and
contain few external dependencies.
End state
Over time, the cloud adoption team and the cloud strategy team together complete a full rationalization of the
inventory. This incremental approach enables the teams to get continually faster at the rationalization process. It
also helps the transformation journey to yield tangible business results sooner, without as much upfront analysis
effort.
In some cases, the financial model might be too tight to make a decision without additional rationalization. In
such cases, you might need a more traditional approach to rationalization.
Next steps
The output of a rationalization effort is a prioritized backlog of all assets that are affected by the chosen
transformation. This backlog is now ready to serve as the foundation for costing models of cloud services.
Align cost models with the digital estate
Align cost models with the digital estate to forecast
cloud costs
10/30/2020 • 2 minutes to read • Edit Online
After you've rationalized a digital estate, you can align it to equivalent costing models with the chosen cloud
provider. Discussing cost models is difficult without focusing on a specific cloud provider. To provide tangible
examples in this article, Azure is the assumed cloud provider.
Azure pricing tools help you manage cloud spend with transparency and accuracy, so you can make the most of Azure and other clouds. By providing tools to monitor, allocate, and optimize cloud costs, they empower customers to accelerate future cloud investments with confidence.
Azure Migrate: Azure Migrate is perhaps the most cost-effective approach to cost model alignment. This
tool allows for digital estate inventory, limited rationalization, and cost calculations in one tool.
Total cost of ownership (TCO) calculator: Lower the total cost of ownership of your on-premises
infrastructure with the Azure cloud platform. Use the Azure TCO calculator to estimate the cost savings you
can realize by migrating your application workloads to Azure. Provide a brief description of your on-
premises environment to get an instant report.
Azure pricing calculator: Estimate your expected monthly bill by using our pricing calculator. Track your
actual account usage and bill at any time using the billing portal. Set up automatic email billing alerts to
notify you if your spend goes above an amount you configure.
Azure Cost Management + Billing: Azure Cost Management + Billing is a cost management solution that
helps you use and manage Azure and other cloud resources effectively. Collect cloud usage and billing
data through application program interfaces (APIs) from Azure, Amazon Web Services, and Google Cloud
Platform. With that data, gain full visibility into resource consumption and costs across cloud platforms in
a single, unified view. Continuously monitor cloud consumption and cost trends. Track actual cloud
spending against your budget to avoid overspending. Detect spending anomalies and usage inefficiencies.
Use historical data to improve your forecasting accuracy for cloud usage and expenditures.
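Teams that want to pull raw pay-as-you-go rates into their own spreadsheets or cost models can also query the public Azure Retail Prices API. The sketch below is illustrative only; verify the endpoint, filter syntax, and field names against the current API documentation before relying on them.

```python
# Rough sketch: query the public Azure Retail Prices API for pay-as-you-go VM
# rates in a region. Endpoint and field names reflect the public API, but
# confirm them against current documentation; no authentication is required.
import requests

def vm_prices(region: str, sku: str) -> list:
    url = "https://prices.azure.com/api/retail/prices"
    query = (f"serviceName eq 'Virtual Machines' and armRegionName eq '{region}' "
             f"and armSkuName eq '{sku}' and priceType eq 'Consumption'")
    response = requests.get(url, params={"$filter": query}, timeout=30)
    response.raise_for_status()
    return response.json().get("Items", [])

for item in vm_prices("eastus", "Standard_D2s_v3"):
    print(item["meterName"], item["retailPrice"], item["unitOfMeasure"])
```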
Measure business outcomes with AppDynamics
10/30/2020 • 4 minutes to read • Edit Online
Measuring and quantifying successful business outcomes is a crucial part of any cloud adoption strategy.
Understanding an application's performance and user experience is key to measuring those business outcomes.
However, accurately measuring the correlation between application performance, user experience, and business
impact is often difficult, inaccurate, and time consuming.
AppDynamics can provide business insights for most use cases, and many organizations start a comprehensive
cloud adoption strategy with the following use cases:
A pre- and post-migration comparison
Business health
Release validation
Segment health
User journeys
Business journeys
Conversion funnels
This guidance will focus on how to measure business outcomes of a pre- and post-migration comparison and how
to accelerate and reduce risk for a migration during your cloud adoption journey.
Next steps
AppDynamics gives organizations the unique ability to measure the business outcomes during their cloud adoption
strategy. Visit AppDynamics to learn more about AppDynamics with Azure.
Initial organization alignment
10/30/2020 • 2 minutes to read • Edit Online
The most important aspect of any cloud adoption plan is the alignment of people who will make the plan a reality.
No plan is complete until you understand its people-related aspects.
True organizational alignment takes time. It will become important to establish long-term organizational
alignment, especially as cloud adoption scales across the business and IT culture. Alignment is so important that an
entire section has been dedicated to it in the Organize methodology of the Cloud Adoption Framework.
Full organization alignment is not a required component of the cloud adoption plan. However, some initial
organization alignment is needed. This article outlines a best-practice starting point for organizational alignment.
The guidance here can help complete your plan and get your teams ready for cloud adoption. When you're ready,
you can use the organization alignment section to customize this guidance to fit your organization.
It's fairly intuitive that cloud adoption tasks require people to execute those tasks. So, few people are surprised that
a cloud adoption team is a requirement. However, those who are new to the cloud may not fully appreciate the
importance of a cloud governance team. This challenge often occurs early in adoption cycles. The cloud
governance team provides the necessary checks and balances to ensure that cloud adoption doesn't expose the
business to any new risks. When risks must be taken, this team ensures that proper processes and controls are
implemented to mitigate or govern those risks.
For more information about cloud adoption, cloud governance, and other such capabilities, see the brief section on
understanding required cloud capabilities.
Next steps
Learn how to plan for cloud adoption.
Plan for cloud adoption
Adapt existing roles, skills, and processes for the
cloud
10/30/2020 • 3 minutes to read • Edit Online
At each phase of the IT industry's history, the most notable changes have often been marked by changes in staff
roles. One example is the transition from mainframe computing to client/server computing. The role of the
computer operator during this transition has largely disappeared, replaced by the system administrator role.
When virtualization arrived, the requirement for individuals working with physical servers was replaced with a
need for virtualization specialists.
Roles will likely change as institutions similarly shift to cloud computing. For example, datacenter specialists might
be replaced with cloud administrators or cloud architects. In some cases, though IT job titles haven't changed, the
daily work of these roles has changed significantly.
IT staff members might feel anxious about their roles and positions because they realize that they need a different
set of skills to support cloud solutions. But agile employees who explore and learn new cloud technologies
shouldn't fear. They can lead the adoption of cloud services and help the organization learn and embrace the
associated changes.
For guidance on building a new skill set, see the skills readiness path.
Capture concerns
As the organization prepares for a cloud adoption effort, each team should document staff concerns as they arise
by identifying:
The type of concern. For example, workers might be resistant to the changes in job duties that come with the
adoption effort.
The impact if the concern isn't addressed. For example, resistance to adoption might result in workers being
slow to execute the required changes.
The area equipped to address the concern. For example, if workers in the IT department are reluctant to acquire
new skills, the IT stakeholder's area is best equipped to address this concern. Identifying the area might be clear
for some concerns. In these cases, you might need to escalate to executive leadership.
IT staff members commonly have concerns about acquiring the training needed to support expanded functions
and new duties. Learning the training preferences of the team helps you prepare a plan. It also allows you to
address these concerns.
Identify gaps
Identifying gaps is another important aspect of organization readiness. A gap is a role, skill, or process that is
required for your digital transformation but doesn't currently exist in your enterprise.
1. Enumerate the responsibilities that come with the digital transformation. Emphasize new responsibilities and
existing responsibilities to be retired.
2. Identify the area that aligns with each responsibility. For each new responsibility, check how closely it aligns
with the area. Some responsibilities might span several areas. This crossover represents an opportunity for
better alignment that you should document as a concern. In the case where no area is identified as being
responsible, document this gap.
3. Identify the skills necessary to support each responsibility, and check if your enterprise has existing resources
with those skills. Where there are no existing resources, determine the training programs or talent acquisition
necessary to fill the gaps. Also determine the deadline by which you must support each responsibility to keep
your digital transformation on schedule.
4. Identify the roles that will execute these skills. Some of your existing workforce will assume parts of the roles.
In other cases, entirely new roles might be necessary.
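As a simple illustration of these steps, the sketch below maps a few hypothetical responsibilities to owning areas and required skills, then prints the gaps. All names and skills are made up for the example.

```python
# Illustrative gap analysis: for each responsibility, check whether an area owns it
# and whether that area already has the required skills. All data is hypothetical.
responsibilities = {
    "Define cloud governance policies": {"area": None, "skills": ["Azure Policy"]},
    "Operate landing zone networking": {"area": "IT operations", "skills": ["Azure networking"]},
}
existing_skills = {"IT operations": ["Windows Server", "VMware"]}

for responsibility, detail in responsibilities.items():
    if detail["area"] is None:
        print(f"GAP: no area owns '{responsibility}'")
        continue
    missing = [s for s in detail["skills"] if s not in existing_skills.get(detail["area"], [])]
    if missing:
        print(f"GAP: {detail['area']} needs {missing} to support '{responsibility}'")
```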
Next steps
Ensuring proper support for the translated roles is a team effort. To act on this guidance, review the organizational
readiness overview to identify the right team structures and participants.
Identify the right team structures
Get started on a skills readiness path
10/30/2020 • 2 minutes to read • Edit Online
IT staff members might feel anxious about their roles and positions as they realize a different set of skills is needed
to support cloud solutions. Agile employees who explore and learn new cloud technologies don't need to have that
fear. They can lead the adoption of cloud services by helping the organization understand and embrace the
associated changes.
Learn more
To discover additional learning paths, browse the Microsoft Learn catalog. Use the roles filter to align learning
paths with your role.
Plan for cloud adoption
5/12/2020 • 2 minutes to read • Edit Online
A plan is an essential requirement for a successful cloud adoption. A cloud adoption plan is an iterative project plan that helps a company transition from traditional IT approaches to modern, agile approaches. This article series outlines how a cloud adoption plan helps companies balance their IT portfolio and
manage transitions over time. Through this process, business objectives can be clearly translated into tangible
technical efforts. Those efforts can then be managed and communicated in ways that make sense to business
stakeholders. However, adopting such a process may require some changes to traditional project-management
approaches.
Next steps
Before building your cloud adoption plan, ensure that all necessary prerequisites are in place.
Review prerequisites
Prerequisites for an effective cloud adoption plan
10/30/2020 • 2 minutes to read • Edit Online
A plan is only as effective as the data that's put into it. For a cloud adoption plan to be effective, there are two
categories of input: strategic and tactical. The following sections outline the minimum data points required in each
category.
Strategic inputs
Accurate strategic inputs ensure that the work being done contributes to achievement of business outcomes. The
strategy section of the Cloud Adoption Framework provides a series of exercises to develop a clear strategy. The
outputs of those exercises feed the cloud adoption plan. Before developing the plan, ensure that the following
items are well defined as a result of those exercises:
Clear motivations: Why are we adopting the cloud?
Defined business outcomes: What results do we expect to see from adopting the cloud?
Business justification: How will the business measure success?
Every member of the team that implements the cloud adoption plan should be able to answer these three strategic
questions. Managers and leaders who are accountable for implementation of the plan should understand the
metrics behind each question and any progress toward realizing those metrics.
Tactical inputs
Accurate tactical inputs ensure that the work can be planned accurately and managed effectively. The plan section
of the Cloud Adoption Framework provides a series of exercises to develop planning artifacts before you develop
your plan. These artifacts provide answers to the following questions:
Digital estate rationalization: What are the top 10 priority workloads in the adoption plan? How many
additional workloads are likely to be in the plan? How many assets are being considered as candidates for
cloud adoption? Are the initial efforts focused more on migration or innovation activities?
Organization alignment: Who will do the technical work in the adoption plan? Who is accountable for
adherence to governance and compliance requirements?
Skills readiness: How many people are allocated to perform the required tasks? How well are their skills
aligned to cloud adoption efforts? Are partners aligned to support the technical implementation?
These questions are essential to the accuracy of the cloud adoption plan. At a minimum, the questions about digital
estate rationalization must be answered to create a plan. To provide accurate timelines, the questions about
organization and skills are also important.
Next steps
Define your cloud adoption plan by deploying the template to Azure DevOps Services.
Define your cloud adoption plan using the template
Cloud adoption plan and Azure DevOps
10/30/2020 • 3 minutes to read • Edit Online
Azure DevOps is the set of cloud-based tools for Azure customers who manage iterative projects. It also includes
tools for managing deployment pipelines and other important aspects of DevOps.
In this article, you'll learn how to quickly deploy a backlog to Azure DevOps using a template. This template
aligns cloud adoption efforts to a standardized process based on the guidance in the Cloud Adoption Framework.
NOTE
The current state of the cloud adoption plan focuses heavily on migration efforts. Tasks related to governance, innovation,
or operations must be populated manually.
Next steps
Start aligning your plan project by defining and prioritizing workloads.
Define and prioritize workloads
Define and prioritize workloads for a cloud adoption
plan
10/30/2020 • 5 minutes to read • Edit Online
Establishing clear, actionable priorities is one of the secrets to successful cloud adoption. The natural temptation is
to invest time in defining all workloads that could potentially be affected during cloud adoption. But that's
counterproductive, especially early in the adoption process.
Instead, we recommend that your team focus on thoroughly prioritizing and documenting the first 10 workloads.
After implementation of the adoption plan begins, the team can maintain a list of the next 10 highest-priority
workloads. This approach provides enough information to plan for the next few iterations.
Limiting the plan to 10 workloads encourages agility and alignment of priorities as business criteria change. This
approach also makes room for the cloud adoption team to learn and to refine estimates. Most important, it
removes extensive planning as a barrier to effective business change.
What is a workload?
In the context of a cloud adoption, a workload is a collection of IT assets (servers, VMs, applications, data, or
appliances) that collectively support a defined process. Workloads can support more than one process.
Workloads can also depend on other shared assets or larger platforms. However, a workload should have defined
boundaries regarding the dependent assets and the processes that depend upon the workload. Often, workloads
can be visualized by monitoring network traffic among IT assets.
Prerequisites
The strategic inputs from the prerequisites list make the following tasks much easier to accomplish. For help with
gathering the data discussed in this article, review the prerequisites.
NOTE
The Power of 10 approach serves as an initial boundary for planning, to focus the energy and investment in early-stage
analysis. However, the act of analyzing and defining workloads is likely to cause changes in the list of priority workloads.
Define workloads
After initial priorities have been defined and workloads have been added to the plan, each of the workloads can
be defined via deeper qualitative analysis. Before including any workload in the cloud adoption plan, try to
provide the following data points for each workload.
Business inputs
DATA POINT | DESCRIPTION | INPUT
Business freeze periods | Are there any times during which the business will not permit change? |
Technical inputs
DATA POINT | DESCRIPTION | INPUT
Confirm priorities
Based on the assembled data, the cloud strategy team and the cloud adoption team should meet to reevaluate
priorities. Clarification of business data points might prompt changes in priorities. Technical complexity or
dependencies might result in changes related to staffing allocations, timelines, or sequencing of technical efforts.
After a review, both teams should be comfortable with confirming the resulting priorities. This set of documented,
validated, and confirmed priorities is the prioritized cloud adoption backlog.
Next steps
For any workload in the prioritized cloud adoption backlog, the team is now ready to align assets.
Align assets for prioritized workloads
Align assets to prioritized workloads
10/30/2020 • 2 minutes to read • Edit Online
A workload is a conceptual description of a collection of assets: VMs, applications, and data sources. The previous
article, Define and prioritize, provided guidance for collecting the data that will define the workload. Before
migration, a few of the technical inputs in that list require additional validation. This article helps with validation of
the following inputs:
Applications: List any applications included in this workload.
VMs and servers: List any VMs or servers included in the workload.
Data sources: List any data sources included in the workload.
Dependencies: List any asset dependencies not included in the workload.
There are several options for assembling this data. The following are a few of the most common approaches.
Azure Migrate
Azure Migrate provides a set of grouping functions that can speed up the aggregation of applications, VMs, data
sources, and dependencies. After workloads have been defined conceptually, they can be used as the basis for
grouping assets based on dependency mapping.
The Azure Migrate documentation provides guidance on how to group machines based on dependencies.
Configuration-management database
Some organizations have a well-maintained configuration-management database (CMDB) within their existing
operations-management tooling. Alternatively, they could use the CMDB to provide the input data points discussed earlier.
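As an illustration, the sketch below groups hypothetical CMDB rows into the workload inputs described earlier (applications, VMs and servers, data sources, and dependencies). The export format is an assumption; Azure Migrate dependency mapping or your own tooling would supply the real data.

```python
# Group assumed CMDB export rows by workload and asset type to produce the
# validation inputs listed above. Row format and asset names are hypothetical.
from collections import defaultdict

cmdb_rows = [
    {"asset": "WEBVM", "type": "vm", "workload": "SmartHotel360"},
    {"asset": "SQLVM", "type": "data source", "workload": "SmartHotel360"},
    {"asset": "shared-ad-01", "type": "dependency", "workload": "SmartHotel360"},
]

workloads = defaultdict(lambda: defaultdict(list))
for row in cmdb_rows:
    workloads[row["workload"]][row["type"]].append(row["asset"])

for name, groups in workloads.items():
    print(name, dict(groups))
```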
Next steps
Review rationalization decisions based on asset alignment and workload definitions.
Review rationalization decisions
Review rationalization decisions
10/30/2020 • 4 minutes to read • Edit Online
During initial strategy and planning stages, we suggest you apply an incremental rationalization approach to the
digital estate. But this approach embeds some assumptions into the resulting decisions. We advise the cloud
strategy team and the cloud adoption teams to review those decisions in light of expanded-workload
documentation. This review is also a good time to involve business stakeholders and the executive sponsor in
future state decisions.
IMPORTANT
Further validation of the rationalization decisions will occur during the Assess phase of migration. This validation focuses on
business review of the rationalization to align resources appropriately.
To validate rationalization decisions, use the following questions to facilitate a conversation with the business. The
questions are grouped by the likely rationalization alignment.
Innovation indicators
If the joint review of the following questions yields an affirmative answer, a workload might be a better candidate
for innovation. Such a workload wouldn't be migrated via a lift and shift or modernize model. Instead, the business
logic or data structures would be re-created as a new or rearchitected application. This approach can be more
labor-intensive and time-consuming. But for a workload that represents significant business returns, the
investment is justified.
Do the applications in this workload create market differentiation?
Is there a proposed or approved investment aimed at improving the experiences associated with the
applications in this workload?
Does the data in this workload make new product or service offerings available?
Is there a proposed or approved investment aimed at taking advantage of the data associated with this
workload?
Can the effect of the market differentiation or new offerings be quantified? If so, does that return justify the
increased cost of innovation during cloud adoption?
The following two questions can help you include high-level technical scenarios in the rationalization review.
Answering "yes" to either could identify ways of accounting for or reducing the cost associated with innovation.
Will the data structures or business logic change during the course of cloud adoption?
Is an existing deployment pipeline used to deploy this workload to production?
If the answer to either question is "yes," the team should consider including this workload as an innovation
candidate. At a minimum, the team should flag this workload for architecture review to identify modernization
opportunities.
Migration indicators
Migration is a faster and cheaper way of adopting the cloud. But it doesn't take advantage of opportunities to
innovate. Before you invest in innovation, answer the following questions. They can help you determine if a
migration model is more applicable for a workload.
Is the source code supporting this application stable? Do you expect it to remain stable and unchanged during
the time frame of this release cycle?
Does this workload support production business processes today? Will it do so throughout the course of this
release cycle?
Is it a priority that this cloud adoption effort improves the stability and performance of this workload?
Is cost reduction associated with this workload an objective during this effort?
Is reducing operational complexity for this workload a goal during this effort?
Is innovation limited by the current architecture or IT operation processes?
If the answer to any of these questions is "yes," you should consider a migration model for this workload. This
recommendation is true even if the workload is a candidate for innovation.
Challenges in operational complexity, costs, performance, or stability can hinder business returns. You can use the
cloud to quickly produce improvements related to those challenges. Where it's applicable, we suggest you use the
migration approach to first stabilize the workload. Then expand on innovation opportunities in the stable, agile
cloud environment. This approach provides short-term returns and reduces the cost required to drive long-term
change.
IMPORTANT
Migration models include incremental modernization. Using platform as a service (PaaS) architectures is a common aspect of
migration activities. So too are minor configuration changes that use those platform services. The boundary for migration is
defined as a material change to the business logic or supporting business structures. Such change is considered an
innovation effort.
Next steps
Establish iterations and release plans to begin planning work.
Establish iterations and release plans
3/31/2020 • 4 minutes to read • Edit Online
Agile and other iterative methodologies are built on the concepts of iterations and releases. This article outlines
the assignment of iterations and releases during planning. Those assignments drive timeline visibility to make
conversations easier among members of the cloud strategy team. The assignments also align technical tasks in a
way that the cloud adoption team can manage during implementation.
Establish iterations
In an iterative approach to technical implementation, you plan technical efforts around recurring time blocks.
Iterations tend to be one-week to six-week time blocks. Consensus suggests that two weeks is the average
iteration duration for most cloud adoption teams. But the choice of iteration duration depends on the type of
technical effort, the administrative overhead, and the team's preference.
To begin aligning efforts to a timeline, we suggest that you define a set of iterations that last 6 to 12 months.
Understand velocity
Aligning efforts to iterations and releases requires an understanding of velocity. Velocity is the amount of work
that can be completed in any given iteration. During early planning, velocity is an estimate. After several
iterations, velocity becomes a highly valuable indicator of the commitments that the team can make confidently.
You can measure velocity in abstract terms like story points. You can also measure it in more tangible terms like
hours. For most iterative frameworks, we recommend using abstract measurements to avoid challenges in
precision and perception. Examples in this article represent velocity in hours per sprint. This representation
makes the topic more universally understood.
Example: A five-person cloud adoption team has committed to two-week sprints. Given current obligations like meetings and support of other processes, each team member can consistently contribute 10 hours per week to the adoption effort. For this team, the initial velocity estimate is 100 hours per sprint.
Iteration planning
Initially, you plan iterations by evaluating the technical tasks based on the prioritized backlog. Cloud adoption
teams estimate the effort required to complete various tasks. Those tasks are then assigned to the first available
iteration.
During iteration planning, the cloud adoption teams validate and refine estimates. They do so until they have
aligned all available velocity to specific tasks. This process continues for each prioritized workload until all efforts
align to a forecasted iteration.
In this process, the team validates the tasks assigned to the next sprint. The team updates its estimates based on
the team's conversation about each task. The team then adds each estimated task to the next sprint until the
available velocity is met. Finally, the team estimates additional tasks and adds them to the next iteration. The team
performs these steps until the velocity of that iteration is also exhausted.
The preceding process continues until all tasks are assigned to an iteration.
Example: Let's build on the previous example. Assume each workload migration requires 40 tasks. Also assume
you estimate each task to take an average of one hour. The combined estimation is approximately 40 hours per
workload migration. If these estimates remain consistent for all 10 of the prioritized workloads, those workloads
will take 400 hours.
The velocity defined in the previous example suggests that the migration of the first 10 workloads will take four
iterations, which is two months of calendar time. The first iteration will consist of 100 tasks that result in the
migration of two workloads. In the next iteration, a similar collection of 100 tasks will result in the migration of
three workloads.
WARNING
The preceding numbers of tasks and estimates are strictly used as an example. Technical tasks are seldom that consistent.
You shouldn't see this example as a reflection of the amount of time required to migrate a workload.
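With that caveat in mind, the arithmetic from these examples can be restated as a short calculation. The figures below simply reproduce the illustrative numbers above.

```python
# Restating the illustrative example: 5 people x 10 hours/week x 2-week sprints
# = 100 hours of velocity per sprint; 10 workloads x 40 tasks x 1 hour = 400 hours.
import math

team_size, hours_per_week, weeks_per_sprint = 5, 10, 2
velocity = team_size * hours_per_week * weeks_per_sprint          # 100 hours per sprint

workloads, tasks_per_workload, hours_per_task = 10, 40, 1
total_effort = workloads * tasks_per_workload * hours_per_task    # 400 hours

iterations_needed = math.ceil(total_effort / velocity)            # 4 sprints (~2 months)
print(f"Velocity: {velocity} h/sprint, effort: {total_effort} h, "
      f"iterations: {iterations_needed} ({iterations_needed * weeks_per_sprint} weeks)")
```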
Release planning
Within cloud adoption, a release is defined as a collection of deliverables that produce enough business value to
justify the risk of disruption to business processes.
Releasing any workload-related changes into a production environment creates some changes to business
processes. Ideally, these changes are seamless, and the business sees the value of the changes with no significant
disruptions to service. But the risk of business disruption is present with any change and shouldn't be taken
lightly.
To ensure a change is justified by its potential return, the cloud strategy team should participate in release
planning. Once tasks are aligned to sprints, the team can determine a rough timeline of when each workload will
be ready for production release. The cloud strategy team would review the timing of each release. The team
would then identify the inflection point between risk and business value.
Example: Continuing the previous example, the cloud strategy team has reviewed the iteration plan. The review
identified two release points. During the second iteration, a total of five workloads will be ready for migration.
Those five workloads will provide significant business value and will trigger the first release. The next release will
come two iterations later, when the next five workloads are ready for release.
Next steps
Estimate timelines to properly communicate expectations.
Estimate timelines
Timelines in a cloud adoption plan
10/30/2020 • 2 minutes to read • Edit Online
In the previous article in this series, workloads and tasks were assigned to releases and iterations. Those
assignments feed the timeline estimates in this article.
Work breakdown structures are commonly used in sequential project-management tools. They represent how
dependent tasks will be completed over time. Such structures work well when tasks are sequential in nature. The
interdependencies in tasks found in cloud adoption make such structures difficult to manage. To fill this gap, you
can estimate timelines based on iteration-path assignments by hiding complexity.
Estimate timelines
To develop a timeline, start with releases. Those release objectives create a target date for any business impact.
Iterations aid in aligning those releases with specific time durations.
If more granular milestones are required in the timeline, use iteration assignment to indicate milestones. To do this
assignment, assume that the last instance of a workload-related task can serve as the final milestone. Teams also
commonly tag the final task as a milestone.
For any level of granularity, use the last day of the iteration as the date for each milestone. This ties completion of
workload adoption to a specific date. You can track the date in a spreadsheet or a sequential project-management
tool like Microsoft Project.
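As a simple illustration, the sketch below derives milestone dates from iteration assignments, assuming a hypothetical plan start date and two-week iterations; the last day of the iteration containing a workload's final task becomes that workload's milestone.

```python
# Derive milestone dates from iteration assignments. The start date, sprint
# length, and workload-to-iteration mapping are assumptions for illustration.
from datetime import date, timedelta

plan_start = date(2020, 11, 2)            # assumed first day of iteration 1
sprint_length = timedelta(weeks=2)

def milestone_for(iteration_number: int) -> date:
    """Return the last day of the given 1-based iteration."""
    return plan_start + iteration_number * sprint_length - timedelta(days=1)

workload_final_iteration = {"SmartHotel360": 2, "osTicket": 4}
for workload, iteration in workload_final_iteration.items():
    print(workload, "milestone:", milestone_for(iteration))
```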
This article shows how the fictional company Contoso assesses an on-premises app for migration to Azure. In the
example scenario, Contoso's on-premises SmartHotel360 application currently runs on VMware. Contoso
assesses the application VMs using the Azure Migrate service, and the SQL Server application database using
Data Migration Assistant.
Overview
As Contoso considers migrating to Azure, the company needs a technical and financial assessment to determine
whether its on-premises workloads are good candidates for cloud migration. In particular, the Contoso team
wants to assess machine and database compatibility for migration. It wants to estimate capacity and costs for
running Contoso's resources in Azure.
To get started and to better understand the technologies involved, Contoso assesses two of its on-premises apps,
summarized in the following table. The company assesses for migration scenarios that rehost and refactor apps
for migration. Learn more about rehosting and refactoring in the migration examples overview.
SmartHotel360 (manages Contoso travel requirements): Runs on Windows with a SQL Server database. Two-tiered app; the front-end ASP.NET website runs on one VM (WEBVM) and SQL Server runs on another VM (SQLVM). The VMs run on a VMware ESXi host managed by vCenter Server. You can download the sample app from GitHub.
osTicket (Contoso service desk app): Runs on a LAMP stack. Two-tiered app; a front-end PHP website runs on one VM (OSTICKETWEB) and the MySQL database runs on another VM (OSTICKETMYSQL). The app is used by customer service apps to track issues for internal employees and external customers. You can download the sample from GitHub.
Current architecture
This diagram shows the current Contoso on-premises infrastructure:
Contoso has one main datacenter. The datacenter is located in the city of New York in the Eastern United
States.
Contoso has three additional local branches across the United States.
The main datacenter is connected to the internet with a fiber-optic Metro Ethernet connection (500 Mbps).
Each branch is connected locally to the internet by using business-class connections with IPsec VPN tunnels
back to the main datacenter. The setup allows Contoso's entire network to be permanently connected and
optimizes internet connectivity.
The main datacenter is fully virtualized with VMware. Contoso has two ESXi 6.5 virtualization hosts that are
managed by vCenter Server 6.5.
Contoso uses Active Directory for identity management. Contoso uses DNS servers on the internal network.
The domain controllers in the datacenter run on VMware VMs. The domain controllers at local branches run
on physical servers.
Business drivers
Contoso's IT leadership team has worked closely with the company's business partners to understand what the
business wants to achieve with this migration:
Address business growth. Contoso is growing. As a result, pressure has increased on the company's on-
premises systems and infrastructure.
Increase efficiency. Contoso needs to remove unnecessary procedures and streamline processes for its
developers and users. The business needs IT to be fast and to not waste time or money, so the company can
deliver faster on customer requirements.
Increase agility. Contoso IT needs to be more responsive to the needs of the business. It must be able to
react faster than the changes that occur in the marketplace for the company to be successful in a global
economy. IT at Contoso must not get in the way or become a business blocker.
Scale. As the company's business grows successfully, Contoso IT must provide systems that can grow at the
same pace.
Assessment goals
The Contoso cloud team has identified goals for its migration assessments:
After migration, apps in Azure should have the same performance capabilities that apps have today in
Contoso's on-premises VMware environment. Moving to the cloud doesn't mean that app performance is less
critical.
Contoso needs to understand the compatibility of its applications and databases with Azure requirements.
Contoso also needs to understand its hosting options in Azure.
Contoso's database administration should be minimized after apps move to the cloud.
Contoso wants to understand not only its migration options, but also the costs associated with the
infrastructure after it moves to the cloud.
Assessment tools
Contoso uses Microsoft tools for its migration assessment. The tools align with the company's goals and should
provide Contoso with all the information it needs.
Data Migration Assistant
Description: Contoso uses Data Migration Assistant to assess and detect compatibility issues that might affect its database functionality in Azure. Data Migration Assistant assesses feature parity between SQL sources and targets. It recommends performance and reliability improvements.
Cost: Data Migration Assistant is a free downloadable tool.
Azure Migrate
Description: Contoso uses the Azure Migrate service to assess its VMware VMs. Azure Migrate assesses the migration suitability of the machines. It provides sizing and cost estimates for running in Azure.
Cost: Azure Migrate is available at no additional charge. However, you may incur charges depending on the tools (first-party or ISV) you decide to use for assessment and migration. Learn more about Azure Migrate pricing.
Service Map
Description: Azure Migrate uses Service Map to show dependencies between machines that the company wants to migrate.
Cost: Service Map is part of Azure Monitor logs. Currently, Contoso can use Service Map for 180 days without incurring charges.
In this scenario, Contoso downloads and runs Data Migration Assistant to assess the on-premises SQL Server
database for its travel app. Contoso uses Azure Migrate with dependency mapping to assess the app VMs before
migration to Azure.
Assessment architecture
Contoso is a fictional name that represents a typical enterprise organization.
Contoso has an on-premises datacenter ( contoso-datacenter ) and on-premises domain controllers (
CONTOSODC1 , CONTOSODC2 ).
VMware VMs are located on VMware ESXi hosts running version 6.5 ( contosohost1 , contosohost2 ).
The VMware environment is managed by vCenter Server 6.5 ( vcenter.contoso.com , running on a VM).
The SmartHotel360 travel app has these characteristics:
The app is tiered across two VMware VMs ( WEBVM and SQLVM ).
The VMs are located on VMware ESXi host contosohost1.contoso.com .
The VMs are running Windows Server 2008 R2 Datacenter with SP1.
The VMware environment is managed by vCenter Server ( vcenter.contoso.com ) running on a VM.
The osTicket service desk app:
The app is tiered across two VMs ( OSTICKETWEB and OSTICKETMYSQL ).
The VMs are running Ubuntu Linux Server 16.04-LTS.
OSTICKETWEB is running Apache 2 and PHP 7.0.
OSTICKETMYSQL is running MySQL 5.7.22.
Prerequisites
Contoso and other users must meet the following prerequisites for the assessment:
Owner or Contributor permissions for the Azure subscription, or for a resource group in the Azure
subscription.
An on-premises vCenter Server instance running version 6.5, 6.0, or 5.5.
A read-only account in vCenter Server, or permissions to create one.
Permissions to create a VM on the vCenter Server instance by using an .ova template.
At least one ESXi host running version 5.5 or later.
At least two on-premises VMware VMs, one running a SQL Server database.
Permissions to install Azure Migrate agents on each VM.
The VMs should have direct internet connectivity.
You can restrict internet access to the required URLs.
If your VMs don't have internet connectivity, the Azure Log Analytics Gateway must be installed on
them, and agent traffic directed through it.
The fully qualified domain name (FQDN) of the VM running the SQL Server instance, for database
assessment.
Windows Firewall running on the SQL Server VM should allow external connections on TCP port 1433
(default). This setup allows Data Migration Assistant to connect.
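Before running Data Migration Assistant, a quick connectivity check such as the following sketch can confirm that the SQL Server VM accepts connections on TCP port 1433. The hostname shown is hypothetical.

```python
# Simple reachability check for the SQL Server VM on TCP 1433 before running
# Data Migration Assistant. The hostname is a placeholder for your environment.
import socket

def can_reach_sql(host: str, port: int = 1433, timeout: float = 5.0) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print("SQL Server VM reachable on 1433:", can_reach_sql("sqlvm.contoso.com"))
```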
Assessment overview
Here's how Contoso performs its assessment:
Step 1: Download and install Data Migration Assistant. Contoso prepares Data Migration Assistant for
assessment of the on-premises SQL Server database.
Step 2: Assess the database by using Data Migration Assistant. Contoso runs and analyzes the
database assessment.
Step 3: Prepare for VM assessment by using Azure Migrate. Contoso sets up on-premises accounts
and adjusts VMware settings.
Step 4: Discover on-premises VMs by using Azure Migrate. Contoso creates an Azure Migrate
collector VM. Then, Contoso runs the collector to discover VMs for assessment.
Step 5: Prepare for dependency analysis by using Azure Migrate. Contoso installs Azure Migrate
agents on the VMs, so the company can see dependency mapping between VMs.
Step 6: Assess the VMs by using Azure Migrate. Contoso checks dependencies, groups the VMs, and
runs the assessment. When the assessment is ready, Contoso analyzes the assessment in preparation for
migration.
NOTE
Assessments shouldn't be limited to using tooling to discover information about your environment. You should also schedule time to speak to business owners, end users, and other members of the IT department to fully understand what is happening in the environment and to capture factors that tooling can't tell you.
NOTE
Currently, Data Migration Assistant doesn't support assessment for migrating to Azure SQL Managed Instance. As
a workaround, Contoso uses SQL Server on an Azure VM as the supposed target for the assessment.
3. In Select Target Version , Contoso selects SQL Server 2017 as the target version. Contoso needs to select
this version because it's the version that's used by the SQL Managed Instance.
4. Contoso selects reports to help it discover information about compatibility and new features:
Compatibility issues note changes that might break migration or that require a minor adjustment
before migration. This report keeps Contoso informed about any features currently in use that are
deprecated. Issues are organized by compatibility level.
New feature recommendation notes new features in the target SQL Server platform that can be
used for the database after migration. New feature recommendations are organized under the
headings Performance , Security , and Storage.
5. In Connect to a server, Contoso enters the name of the VM that's running the database and credentials to access it. Contoso selects Trust server certificate to make sure the VM can access SQL Server. Then, Contoso selects Connect.
6. In Add source, Contoso adds the database it wants to assess, then selects Next to start the assessment.
7. The assessment is created.
2. In the Feature recommendations report, Contoso views performance, security, and storage features
that the assessment recommends after migration. A variety of features are recommended, including In-
Memory OLTP, columnstore indexes, Stretch Database, Always Encrypted, dynamic data masking, and
transparent data encryption.
NOTE
Contoso should enable transparent data encryption for all SQL Server databases. This is even more critical when a
database is in the cloud than when it's hosted on-premises. Transparent data encryption should be enabled only
after migration. If transparent data encryption is already enabled, Contoso must move the certificate or asymmetric
key to the master database of the target server. Learn how to move a transparent data encryption-protected
database to another SQL Server instance.
3. Contoso can export the assessment in JSON or CSV format.
NOTE
For large-scale assessments:
Run multiple assessments concurrently and view the state of the assessments on the All assessments page.
Consolidate assessments into a SQL Server database.
Consolidate assessments into a Power BI report.
9. In Select migration tool, select Skip adding a migration tool for now > Next.
10. In Review + add tools, review the settings, then select Add tools.
11. Wait a few minutes for the Azure Migrate project to deploy. You'll be taken to the project page. If you don't
see the project, you can access it from Servers in the Azure Migrate dashboard.
Download the collector appliance
1. In Migration Goals > Servers > Azure Migrate: Server Assessment, select Discover.
2. In Discover machines > Are your machines virtualized?, select Yes, with VMware vSphere hypervisor.
3. Select Download to download the .OVA template file.
Example:
C:\> CertUtil -HashFile C:\AzureMigrate\AzureMigrate.ova SHA256
3. The generated hash should match the hash values listed in the Verify security section of the Assess
VMware VMs for migration tutorial.
Create the collector appliance
Now, Contoso can import the downloaded file to the vCenter Server instance and provision the collector
appliance VM:
1. In the vSphere Client console, Contoso selects File > Deploy OVF template .
2. In the Deploy OVF Template Wizard, Contoso selects Source , and then specifies the location of the OVA
file.
3. In Name and Location , Contoso specifies a display name for the collector VM. Then, it selects the
inventory location in which to host the VM. Contoso also specifies the host or cluster on which to run the
collector appliance.
4. In Storage , Contoso specifies the storage location. In Disk Format , Contoso selects how it wants to
provision the storage.
5. In Network Mapping , Contoso specifies the network in which to connect the collector VM. The network
needs internet connectivity to send metadata to Azure.
6. Contoso reviews the settings, then selects Power on after deployment > Finish . A message that
confirms successful completion appears when the appliance is created.
Run the collector to discover VMs
Now, Contoso runs the collector to discover VMs. Currently, the collector supports only English (United States) as the operating system language and collector interface language.
1. In the vSphere Client console, Contoso selects Open Console. Contoso accepts the licensing terms and specifies the password preferences for the collector VM.
2. On the desktop, Contoso selects the Microsoft Azure Appliance Configuration Manager shortcut.
3. In the Azure Migrate Collector, Contoso selects Set up prerequisites . Contoso accepts the license terms
and reads the third-party information.
4. The collector checks that the VM has internet access, that the time is synced, and that the collector service
is running. (The collector service is installed by default on the VM.) Contoso also installs the VMware
vSphere Virtual Disk Development Kit.
NOTE
It's assumed that the VM has direct access to the internet without using a proxy.
5. Contoso signs in to its Azure account and selects the subscription and the Migrate project created earlier. Contoso also enters a name for the appliance so it can be identified in the Azure portal.
6. In Specify vCenter Server details, Contoso enters the name (FQDN) or IP address of the vCenter Server instance and the read-only credentials used for discovery.
7. Contoso selects a scope for VM discovery. The collector can discover only VMs that are within the specified
scope. The scope can be set to a specific folder, datacenter, or cluster.
8. The collector now starts to discover and collect information about the Contoso environment.
2. Contoso must run the command to install the MMA agent as root. To become root, Contoso runs the
following command, and then enters the root password:
sudo -i
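After switching to root, Contoso installs the agent. As a hedged sketch only (the exact download URL and flags can vary by agent version, and the workspace ID and key shown are placeholders), the Log Analytics agent for Linux has typically been installed with a wrapper script similar to the following:
# Download and run the onboarding script, passing the Log Analytics workspace ID (-w) and key (-s)
wget https://raw.githubusercontent.com/Microsoft/OMS-Agent-for-Linux/master/installer/scripts/onboard_agent.sh
sh onboard_agent.sh -w <workspace-id> -s <workspace-key>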
NOTE
To view more granular dependencies, you can expand the time range. You can select a specific duration or select
start and end dates.
Run an assessment
1. In Groups , Contoso opens the group ( smarthotelapp ), then selects Create assessment .
An assessment has a confidence rating from 1 star to 5 stars (1 star is the lowest and 5 stars is the highest).
The confidence rating is assigned to an assessment based on the availability of data points that are needed to compute the assessment.
The rating helps you estimate the reliability of the size recommendations that are provided by Azure Migrate.
The confidence rating is useful when you are doing performance-based sizing, because Azure Migrate might not have enough data points for utilization-based sizing. For as-on-premises sizing, the confidence rating is always 5 stars because Azure Migrate has all the data points it needs to size the VM.
Depending on the percentage of data points available, the confidence rating for the assessment is provided:
0%-20%: 1 star
21%-40%: 2 stars
41%-60%: 3 stars
61%-80%: 4 stars
81%-100%: 5 stars
The assessment report shows the information that's summarized in the table. To show performance-based sizing,
Azure Migrate needs the following information. If the information can't be collected, sizing assessment might not
be accurate.
Utilization data for CPU and memory.
Read/write IOPS and throughput for each disk attached to the VM.
Network in/out information for each network adapter attached to the VM.
SETTING | INDICATION | DETAILS
Azure VM size | For ready VMs, Azure Migrate provides an Azure VM size recommendation. | The sizing recommendation depends on assessment properties: if you used performance-based sizing, sizing considers the performance history of the VMs; if you used as-on-premises sizing, sizing is based on the on-premises VM size and utilization, and performance data isn't used.
Cost estimates are calculated by using the size recommendations for a machine.
Estimated monthly costs for compute and storage are aggregated for all VMs in the group.
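To sanity-check a recommended size against what's actually available in the target region, Contoso could list VM sizes from the command line. This is an optional, illustrative Azure CLI step (the region is a placeholder), not part of the Azure Migrate assessment itself:
# List the VM sizes (cores, memory, max data disks) offered in the target region
az vm list-sizes --location eastus2 --output table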
Conclusion
In this scenario, Contoso assesses its SmartHotel360 app database by using the Data Migration Assistant tool. It
assesses the on-premises VMs by using the Azure Migrate service. Contoso reviews the assessments to make
sure that on-premises resources are ready for migration to Azure.
Next steps
After Contoso assesses this workload as a potential migration candidate, it can begin preparing its on-premises
infrastructure and its Azure infrastructure for migration. See the deploy Azure infrastructure article in the Cloud
Adoption Framework migrate best practices section for an example of how Contoso performs these processes.
Plan a data warehouse migration
A data warehouse migration is a challenge for any company. In order to execute it well and avoid any unwelcome
surprises and unplanned costs, you need to thoroughly research the challenge, mitigate risk, and plan your
migration to ensure that you're as ready as possible. At a high level, your plan should cover the core data
warehouse migration process steps and any tasks within them. The main process steps are:
Pre-migration preparation
Migration strategy and execution
Post-migration
For example, preparation includes things like readying your data warehouse migration team in terms of skills training and technology familiarization. It also includes setting up a proof-of-concept lab, understanding how you will manage test and production environments, gaining appropriate clearance to migrate your data and a production system outside of the corporate firewall, and setting up migration software in your datacenter so that migration can proceed.
For a data warehouse migration to proceed smoothly, your plan should establish a clear understanding of:
Your business case, including its drivers, business benefits, and risks.
Migration team roles and responsibilities.
The skill set and training required to enable successful migration.
Allocated budget for the complete migration.
Your migration strategy.
How you will mitigate risk in the migration project to avoid delays or rework.
Your existing data warehouse system, its architecture, schema, data volumes, data flows, security, and
operational dependencies.
Differences between your existing on-premises data warehouse DBMS and Azure Synapse, like data types, SQL
functions, logic, and other considerations.
What needs to be migrated and priorities.
The migration tasks, approaches, order, and deadlines.
How you will control migration.
How to prevent user disruption while undertaking the migration.
What you need to do on-premises to avoid delays and enable migration.
Tools to enable secure migration of schemas, data, and ETL processing to Azure.
Data model design changes that are required during and after migration.
Any pre-migration or post-migration technology changes and how to minimize rework.
Post-migration technology deprecation.
How you will implement testing and quality assurance to prove success.
Your checkpoints to assess progress and enable decisions to be made.
Your contingency plan and points of rollback in case things go wrong.
To achieve this understanding, you need to prepare and begin specific activities before any migration starts. Let's look at what that entails in more detail.
Pre-migration preparation
There are several things that should be addressed before you even begin a data warehouse migration.
Key roles in a data warehouse migration team
Key roles in a migration project include:
Business owner
Project manager (with agile methodology experience such as Scrum)
Project coordinator
Cloud engineer
Database administrator (existing data warehouse DBMS and Azure Synapse)
Data modelers
ETL developers
Data virtualization specialist (could be a DBA)
Testing engineer
Business analysts (to help test BI tool queries, reports, and analyses)
In addition, the team needs the support of your on-premises infrastructure team.
Skills and training to ready the team for migration
With respect to skills, expertise is important in a data warehouse migration. Therefore, ensure the appropriate
members of your migration team are trained in Azure cloud fundamentals, Azure Blob storage, Azure Data Lake
Storage, Azure Data Box, ExpressRoute, Azure identity management, Azure Data Factory, and Azure Synapse. Your
data modelers will most likely need to fine-tune your Microsoft Azure Synapse data models once migration from
your existing data warehouse has occurred.
Assessing your existing data warehouse
Another part of preparing to migrate is the need for a full assessment of your existing data warehouse to fully
understand the architecture, data stores, schema, business logic, data flows, the DBMS functionality utilized,
warehouse operation, and the dependencies. The more understanding gained here, the better. Detailed knowledge of how the system works helps you communicate and cover all bases.
The purpose of the assessment is not just to ensure detailed understanding of the current setup across the
migration team but also to understand strengths and weaknesses in the current setup. The outcome of an
assessment of your current data warehouse therefore can impact your migration strategy in terms of lift and shift
versus something broader. For example, if the outcome of an assessment is that your data warehouse is at end of life, then clearly the strategy would be more of a data migration to a newly designed data warehouse on Azure Synapse versus a lift-and-shift approach.
On-premises preparation for data migration
In addition to readying your migration team for your target environment and assessing your current setup, it's equally important to set things in motion on-premises, because production data warehouses tend to be heavily controlled by IT procedures and approval processes. To avoid delays, ensure that your datacenter
infrastructure and operations teams are ready for migrating your data, schema, ETL jobs, and so on, to the Azure
cloud. Data migration can occur via:
AzCopy to Azure Blob storage.
Microsoft Azure ExpressRoute to transfer compressed data directly to Azure.
File export to Azure Data Box.
The main factors influencing which of these options is selected are data volume size (in terabytes) and network
speed (in Mbps). A calculation is needed to determine how long it would take to migrate the data via the network,
considering that data might be compressed in your data warehouse and become uncompressed when you export
it. This situation can slow data transfer. Recompress data via Gzip when moving data by any of the above methods.
PolyBase can process gzipped data directly. Large data volumes will likely be migrated via Azure Data Box if it will
take too long to move the data.
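As a hedged illustration of the compress-then-transfer approach described above (the file, storage account, container, and SAS token names are placeholders), an export could be gzipped and pushed to Blob storage with AzCopy:
# Compress the exported file before transfer to reduce network time
gzip sales_fact_export.csv
# Copy the compressed file to Azure Blob storage; PolyBase can later read the gzipped file directly
azcopy copy "sales_fact_export.csv.gz" "https://<storageaccount>.blob.core.windows.net/<container>/sales_fact_export.csv.gz?<sas-token>"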
Additionally, for Azure Data Factory to control the export of data from your existing data warehouse, self-hosted integration runtime software must be installed in your datacenter so that migration can proceed. Given these requirements, if formal approval is needed to make this possible, starting the appropriate approval processes early will help avoid delays later.
Azure preparation for schema and data migration
In terms of preparation on the Azure side, data import will need to be managed either via Microsoft Azure
ExpressRoute or Microsoft Azure Data Box. Azure Data Factory pipelines are an ideal way to load your data into
Azure Blob Storage and then load from there into Azure Synapse using PolyBase. Therefore preparation is needed
on the Azure side to develop such a pipeline.
The alternative is to use your existing ETL tool on Azure if it supports Azure Synapse, which means setting up the
tool on Azure from Azure Marketplace and readying a pipeline to import your data and load it into Azure Blob
storage.
The purpose of introducing data virtualization is to break the dependency between business users utilizing self-service BI tools and the physical schema of the underlying data warehouse and data marts that are being migrated. By introducing data virtualization, any schema alterations made during data warehouse and data mart migration to Azure Synapse (for example, to optimize performance) can be hidden from business users because they only access virtual tables
in the data virtualization layer. If structural change is needed, only the mappings between the data warehouse or
data marts and any virtual tables would need to be changed so that users remain unaware of those changes and
unaware of the migration.
Look to archive any existing tables identified as never used prior to data warehouse migration, because there is little point in migrating tables that are not used. One possible way of doing this is to archive the unused data to Azure
blob storage or to Azure Data Lake and create external tables in Azure Synapse to that data so that it is still
online.
Consider the possibility of introducing a virtual machine (VM) on Azure with a development version (usually
free) of the existing legacy data warehouse DBMS running on this VM. This allows you to quickly move existing
data warehouse schema to the VM and then move it into Azure Synapse while working entirely in the Azure cloud.
Define migration order and dependencies.
Ensure your infrastructure and operations teams are ready for the migration of your data as early as possible
into the migration project.
Identify the differences in DBMS functionality and where proprietary business logic could become a problem.
For example, using stored procedures for ELT processing is unlikely to migrate easily and won't contain any
metadata lineage since the transformations are buried in code.
Consider a strategy of migrating data marts first, followed by the data warehouse that is the source to the data marts. The reason is that this enables incremental migration, makes it more manageable, and makes it possible to prioritize migration based on business needs.
Consider the possibility of using data virtualization to simplify your current data warehouse architecture before you migrate, for example, to replace data marts with virtual data marts so that you can eliminate physical data stores and ETL jobs for data marts without losing any functionality prior to migration. Doing this would reduce the number of data stores to migrate, reduce copies of data, reduce the total cost of ownership, and improve agility. This requires switching from physical to virtual data marts before migrating your data warehouse. In many ways, you could consider this a data warehouse modernization step prior to migration.
Next steps
For more information on data warehouse migrations, attend a virtual Cloud Data Warehouse Modernization
Workshop on Azure from Informatica.
Ensure the environment is prepared for the cloud
adoption plan
Before adoption can begin, you must create a landing zone to host the workloads that you plan to build in or migrate to the cloud. This section of the framework guides you through creating a landing zone.
The following exercises help guide you through the process of creating a landing zone to support cloud
adoption.
At a minimum, to get ready for cloud adoption, review the Azure setup guide.
Azure setup guide overview
NOTE
This guide provides a starting point for readiness guidance in the Cloud Adoption Framework and is also available in the
Azure Quickstart Center. See the tip in the article for a link.
Before you start building and deploying solutions using Azure services, you need to prepare your environment. In
this guide, we introduce features that help you organize resources, control costs, and secure and manage your
organization. For more information, best practices, and considerations related to preparing your cloud
environment, see the Cloud Adoption Framework's readiness section.
You'll learn how to:
Organize resources: Set up a management hierarchy to consistently apply access control, policy, and compliance to groups of resources, and use tagging to track related resources.
Manage access: Use role-based access control to make sure that users have only the permissions they really need.
Manage costs and billing: Identify your subscription type, understand how billing works, and learn how to control costs.
Plan for governance, security, and compliance: Enforce and automate policies and security settings that help you follow applicable legal requirements.
Use monitoring and reporting: Get visibility across resources to find and fix problems, optimize performance, and gain insight into customer behavior.
Stay current with Azure: Track product updates to enable a proactive approach to change management.
TIP
For an interactive experience, view this guide in the Azure portal. Go to the Azure Quickstart Center in the Azure portal,
select Azure Setup Guide , and then follow the step-by-step instructions.
Next steps:
Organize your resources to simplify how you apply settings
This guide provides interactive steps that let you try features as they're introduced. To come back to where you left
off, use the breadcrumb for navigation.
Organize your Azure resources effectively
Organizing your cloud-based resources is critical to securing, managing, and tracking the costs related to your
workloads. To organize your resources, define a management group hierarchy, follow a well-considered naming
convention, and apply resource tagging.
Azure management groups and hierarchy
Naming standards
Resource tags
Azure provides four levels of management scope: management groups, subscriptions, resource groups, and
resources. The following image shows the relationship of these levels.
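For teams that prefer to script this hierarchy, the following hedged Azure CLI sketch creates a management group and a tagged resource group; all names, locations, and tag values shown here are illustrative:
# Create a management group to hold production subscriptions
az account management-group create --name contoso-production --display-name "Contoso Production"
# Create a resource group with tags so related resources can be tracked
az group create --name rg-smarthotel-prod --location eastus --tags CostCenter=Hotel360 Environment=Production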
NOTE
Subscriptions can also be created programmatically. For more information, see Programmatically create Azure subscriptions.
Managing who can access your Azure resources and subscriptions is an important part of your Azure governance
strategy, and assigning group-based access rights and privileges is a good practice. Dealing with groups rather
than individual users simplifies maintenance of access policies, provides consistent access management across
teams, and reduces configuration errors. Azure role-based access control (RBAC) is the primary method of
managing access in Azure.
RBAC provides detailed access management of resources in Azure. It helps you manage who has access to Azure
resources, what they can do with those resources, and what scopes they can access.
When you plan your access control strategy, grant users the least privilege required to get their work done. The
following image shows a suggested pattern for assigning RBAC.
Figure 1: RBAC roles.
When you plan your access control methodology, we recommend that you work with people in your organization who have the following roles: security and compliance, IT administration, and enterprise architect.
The Cloud Adoption Framework offers additional guidance on using role-based access control in your cloud
adoption efforts.
Actions
Grant resource group access:
To grant a user access to a resource group:
1. Go to Resource groups .
2. Select a resource group.
3. Select Access control (IAM) .
4. Select + Add > Add role assignment .
5. Select a role, and then assign access to a user, group, or service principal.
Grant subscription access:
To grant a user access to a subscription:
1. Go to Subscriptions .
2. Select a subscription.
3. Select Access control (IAM) .
4. Select + Add > Add role assignment .
5. Select a role, and then assign access to a user, group, or service principal.
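The same assignments can also be scripted. As a hedged Azure CLI example (the user principal names, role choices, and scopes are placeholders), role assignments can be created at resource group or subscription scope:
# Grant Contributor on a single resource group
az role assignment create --assignee "maria@contoso.com" --role "Contributor" --resource-group rg-smarthotel-prod
# Grant Reader across an entire subscription
az role assignment create --assignee "app-team@contoso.com" --role "Reader" --scope "/subscriptions/<subscription-id>"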
Learn more
To learn more, see:
What is role-based access control (Azure RBAC)?
Cloud Adoption Framework: Use role-based access control
Manage costs and billing for your Azure resources
Cost management is the process of effectively planning and controlling costs involved in your business. Cost
management tasks are typically performed by finance, management, and app teams. Azure Cost Management and
Billing can help you plan with cost in mind. It can also help you to analyze costs effectively and take action to
optimize cloud spending.
For more information about integrating cloud cost management processes throughout your organization, see the
Cloud Adoption Framework article on how to track costs across business units, environments, or projects.
Learn more
To learn more, see:
Azure cost management and billing documentation
Cloud Adoption Framework: Track costs across business units, environments, or projects
Cloud Adoption Framework: Cost Management discipline
Actions
Predict and manage costs:
1. Go to Cost Management + Billing .
2. Select Cost Management .
Manage invoices and payment methods:
1. Go to Cost Management + Billing .
2. Select Invoices or Payment methods from the Billing section in the left pane.
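Costs can also be inspected from the command line. As a hedged example (the dates are placeholders, and the command has been in preview, so its behavior can change), Azure CLI can list usage records for a billing period:
# List usage details for October 2020 in table form
az consumption usage list --start-date 2020-10-01 --end-date 2020-10-31 --output table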
As you establish corporate policy and plan your governance strategies, you can use tools and services like Azure
Policy, Azure Blueprints, and Azure Security Center to enforce and automate your organization's governance
decisions. Before you start your governance planning, use the governance benchmark tool to identify potential
gaps in your organization's cloud governance approach. For more information about developing governance
processes, see the Govern methodology.
Azure Blueprints
Azure Policy
Azure Security Center
Azure Blueprints enables cloud architects and central information technology groups to define a repeatable set of
Azure resources that implements and adheres to an organization's standards, patterns, and requirements. Azure
Blueprints makes it possible for development teams to rapidly build and stand up new environments and trust that
they're building within organizational compliance using a set of built-in components like networking to speed up
development and delivery.
Blueprints are a declarative way to orchestrate the deployment of various resource templates and other artifacts
like:
Role assignments.
Policy assignments.
Azure Resource Manager templates.
Resource groups.
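To show how one of these artifact types is commonly deployed on its own, here's a hedged Azure CLI sketch that deploys an Azure Resource Manager template into a resource group; the resource group and template file names are placeholders:
# Deploy an ARM template into an existing resource group
az deployment group create --resource-group rg-smarthotel-prod --template-file ./network.json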
Create a blueprint
To create a blueprint:
1. Go to Blueprints: Getting started.
2. In the Create a Blueprint section, select Create.
3. Filter the list of blueprints to select the appropriate blueprint.
4. Enter the Blueprint name, then select the appropriate Definition location.
5. Select Next: Artifacts >>, then review the artifacts included in the blueprint.
6. Select Save Draft.
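Policy assignments, one of the artifact types listed earlier, can also be created directly. As a hedged Azure CLI sketch (the assignment name, policy definition reference, and subscription ID are placeholders):
# Assign an existing policy definition at subscription scope
az policy assignment create --name "enforce-corp-policy" --policy "<policy-definition-name-or-id>" --scope "/subscriptions/<subscription-id>"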
Azure offers many services that together provide a comprehensive solution for collecting, analyzing, and acting on
telemetry from your applications and the Azure resources that support them. In addition, these services can extend
to monitoring critical on-premises resources to provide a hybrid monitoring environment.
Azure Monitor
Azure Service Health
Azure Advisor
Azure Security Center
Azure Monitor provides a single unified hub for all monitoring and diagnostics data in Azure. You can use it to get
visibility across your resources. With Azure Monitor, you can find and fix problems and optimize performance. You
also can understand customer behavior.
Monitor and visualize metrics. Metrics are numerical values available from Azure resources that help
you understand the health of your systems. Customize charts for your dashboards, and use workbooks for
reporting.
Query and analyze logs. Logs include activity logs and diagnostic logs from Azure. Collect additional logs from other monitoring and management solutions for your cloud or on-premises resources. Log Analytics provides a central repository to aggregate all this data. From there, you can run queries to help troubleshoot issues or to visualize data.
Set up alerts and actions. Alerts proactively notify you of critical conditions. Corrective actions can be taken based on triggers from metrics, logs, or service health issues. You can set up different notifications and actions and send data to your IT service management tools.
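As a hedged command-line companion to the metric and alert capabilities above (resource IDs, names, and the threshold are placeholders), Azure CLI can query metrics and create a metric alert:
# Show recent CPU metrics for a VM
az monitor metrics list --resource "<vm-resource-id>" --metric "Percentage CPU" --output table
# Alert when average CPU stays above 80%, invoking an existing action group
az monitor metrics alert create --name "cpu-high" --resource-group rg-smarthotel-prod --scopes "<vm-resource-id>" --condition "avg Percentage CPU > 80" --action "<action-group-id>"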
Start monitoring your:
Applications
Containers
Virtual machines
Networks
To monitor other resources, find additional solutions in the Azure Marketplace.
To explore Azure Monitor, go to the Azure portal.
Learn more
To learn more, see Azure Monitor documentation.
Stay current with Azure
Cloud platforms like Azure change faster than many organizations are accustomed to. This pace of change means
that organizations have to adapt people and processes to a new cadence. If you're responsible for helping your
organization keep up with change, you might feel overwhelmed at times. The resources listed in this section can
help you stay up to date.
Top resources
Additional resources
The following resources can help you stay current with Azure:
Azure Service Health: Service Health alerts provide timely notifications about ongoing service issues, planned maintenance, and health advisories. These alerts also include information about Azure features scheduled for retirement.
Azure updates: Review Azure updates for announcements about product updates. Brief summaries link to
additional details, making the updates easy to follow. Subscribe via the Azure updates RSS feed.
Azure blog: The Azure blog communicates the most important announcements for the Azure platform. Follow this blog to stay up to date on critical information. Subscribe via the Azure blog RSS feed.
Ser vice-specific blogs: Many individual Azure services publish blogs that you can follow if you rely on those
services. Find the ones you're interested in via a web search.
Azure Info Hub: The unofficial Azure Info Hub pulls together most of the resources listed here. Follow links to
individual services to get detailed information and find service-specific blogs. Subscribe via the Azure Info Hub
RSS feed.
Understand cloud operating models
Adopting the cloud creates an opportunity to revisit how you operate technology systems. This article series will
clarify cloud operating models and the considerations that impact your cloud adoption strategy. But first, let's
clarify the term cloud operating model.
Next steps
Learn how the Cloud Adoption Framework helps you define your operating model.
Compare common cloud operating models
Define your cloud operating model
Cloud operating models are complex. Countless customers become blocked by minor details while defining their
cloud operating model. It's easy to fall into a series of circular references. To avoid circular references, the Cloud
Adoption Framework provides a series of complementary and incremental methodologies that decompose the
volume of decisions into smaller exercises.
Next steps
Before engaging any of the above methodologies, use the next article to compare common cloud operating
models and find a model that closely matches your requirements. That article will help establish the most
actionable starting point and set of exercises to move you towards the desired operating model across your cloud
platform.
Compare common cloud operating models
Compare common cloud operating models
Operating models are unique and specific to the business they support, based on their current requirements and
constraints. But, this uniqueness shouldn't suggest that operating models are snowflakes. There are a few
common patterns of customer operating models. This article outlines the four most common patterns.
Priorities or scope
A cloud operating model is primarily driven by two factors:
Strategic priorities and motivations.
The scope of the portfolio to be managed.
Portfolio scope: workload (decentralized ops), landing zone (centralized ops), cloud platform (enterprise ops), full portfolio (distributed ops).
Foundational utilities: N/A (decentralized ops), N/A or low support (centralized ops), centralized and more support (enterprise ops), most support (distributed ops).
Strategic priorities or motivations: Each operating model is capable of delivering the typical strategic motivations for cloud adoption. However, some operating models simplify realizing specific motivations.
Portfolio scope: The portfolio scope row above identifies the largest scope that a specific operating model is designed to support. For example, centralized operations is designed for a small number of landing zones. But that operating model decision could inject operational risks for an organization that's trying to manage a large, complex portfolio that might require many landing zones or variable complexity in landing zone design.
IMPORTANT
Adopting the cloud often triggers a reflection on the current operating model and might lead to a shift from one of the
common operating models to another. But cloud adoption isn't the only trigger. As business priorities and the scope of cloud
adoption change how the portfolio needs to be supported, there could be other shifts in the most-appropriately aligned
operating model. When the board or other executive teams develop 5 to 10 year business plans, those plans often include a
requirement (explicit or implied) to adjust the operating model. While these common models are a good reference for
guiding decisions, keep in mind that your operating model might change or you might need to customize one of these
models to meet your requirements and constraints.
Accountability alignment
While many teams and individuals will be responsible for supporting different functions, each of the common
operating models assigns final accountability for decisions and their outcomes to one team or one individual. This
approach affects how the operating model is funded and what level of support is provided for each function.
Cloud security: workload team (decentralized ops); security operations center (SOC) (centralized ops); CCoE + SOC (enterprise ops); mixed, see Define a security strategy (distributed ops).
Starting point: Azure Well-Architected Framework (WAF) (decentralized ops); Azure landing zones, start-small options (centralized ops); Azure landing zones, CAF enterprise-scale (enterprise ops); business alignment (distributed ops).
Iterations: Decentralized ops: a focus on workloads allows the team to iterate within WAF. Centralized ops: the start-small option requires additional iteration on each methodology, but that can be done as cloud adoption efforts mature. Enterprise ops: as illustrated by the reference implementations, future iterations typically focus on minor configuration additions. Distributed ops: review the Azure landing zone implementation options to start with the option that best meets your operations baseline, then follow the iteration path defined in that option's design principles.
Decentralized operations
Operations is always complex. By limiting the scope of operations to one workload or a small collection of
workloads, that complexity can be controlled. As such, decentralized operations is the least complex of the
common operating models. In this form of operations, all workloads are operated independently by dedicated
workload teams.
Priorities: Innovation is prioritized over centralized control or standardization across multiple workloads.
Distinct advantage: Maximizes speed of innovation by placing workload and business teams in full control of
design, build, and operations.
Distinct disadvantage: Reduction in cross-workload standardization, economies of scale through shared services, consistent governance, and centralized compliance efforts.
Risk : This approach introduces risk when managing a portfolio of workloads. Since the workload team is less
likely to have specialized teams dedicated to central IT functions, this operating model is viewed as a high risk
option by some organizations, especially companies that are required to follow third-party compliance
requirements.
Guidance: Decentralized operations are limited to workload level decisions. The Microsoft Azure Well-
Architected Framework is designed to support the decisions made within that scope. The processes and
guidance within the Cloud Adoption Framework are likely to add overhead that isn't required by decentralized operations.
Advantages of decentralized operations
Cost management: Cost of operations can be easily mapped to a single business unit. Workload-specific
operations allow for greater workload optimization.
Responsibilities: Typically, this form of operations is highly dependent on automation to minimize overhead.
Responsibilities tend to focus on DevOps and pipelines for release management. This allows for faster
deployments and shorter feedback cycles during development.
Standardization: Source code and deployment pipelines should be used to standardize the environment from release to release.
Operations support: Decisions that impact operations are only concerned with the needs of that workload, simplifying operations decisions. Many in the DevOps community would argue that this is the purest form of operations because of the tighter operational scope.
Expertise: DevOps and development teams are most empowered by this approach and experience the least resistance to driving market change.
Landing zone design: No specific operational advantage.
Foundational utilities: No specific operational advantage.
Separation of duties: No specific operational advantage.
Disadvantages of decentralized operations
Cost management: Enterprise costs are harder to calculate. The lack of centralized governance teams makes it harder to implement uniform cost controls or optimization. At scale, this model can be costly, since each workload would likely have duplication in deployed assets and staffing assignments.
Responsibilities: Lack of centralized supporting teams means that the workload team is entirely responsible
for governance, security, operations, and change management. This is a detriment when those tasks have not
been automated in code review and release pipelines.
Standardization: Standardization across a portfolio of workloads can become variable and inconsistent.
Operations support: Scale efficiencies are often missed, as are uniform best practices across multiple workloads.
Expertise: Team members have a greater responsibility to make wise and ethical decisions regarding governance, security, operations, and change management within the application design and configuration. The Microsoft Azure Well-Architected Review and Azure Well-Architected Framework should be
consulted frequently to improve the required expertise.
Landing zone design: Landing zones are not workload-specific and are not considered in this approach.
Foundational utilities: Few (if any) foundational services are shared across workloads, reducing scale
efficiencies.
Separation of duties: Higher requirements for DevOps and development teams increase the usage of
elevated privileges from those teams. If separation of duties is required, heavy investment in DevOps maturity
will be needed to operate in this approach.
Centralized operations
Stable state environments might not require as much focus on the architecture or distinct operational
requirements of the individual workloads. Central operations tend to be the norm for technology environments
that consist primarily of stable-state workloads. Examples of stable-state operations include things like commercial-off-the-shelf (COTS) applications or well-established custom applications that have a slow release cadence. When the rate of change is driven by a regular drumbeat of updates and patches (rather than the high rate of change that comes with innovation), centralization of operations is an effective means to manage the portfolio.
Priorities: Prioritizes central control over innovation. Also prioritizes continuation of existing operational
processes over cultural shift to modern cloud operations options.
Distinct advantage: Centralization introduces economies of scale, best-of-breed controls, and standardized
operations. This approach works best when the cloud environment needs specific configurations that integrate cloud operations into existing operations and processes. This approach is most advantageous to centralized teams
with a portfolio of a few hundred workloads with modest architectural complexity and compliance
requirements.
Distinct disadvantage: Scaling to meet the demands of a large portfolio of workloads can place significant
strain on a centralized team making operational decisions for production workloads. If technical assets are
expected to scale beyond 1,000 or so VMs, applications, or data sources in the cloud within the next 18-24
months, an enterprise model should be considered.
Risk : This approach limits centralization to a smaller number of subscriptions (often one production
subscription). There is a risk of significant refactoring later in the cloud journey that could interfere with
adoption plans. Specifically, care should be given to segmentation, environment boundaries, identity tooling,
and other foundational elements to avoid significant rework in the future.
Guidance: Azure landing zone implementation options aligned to the "start small and expand" development
velocity create a sound starting point. Those options can be used to accelerate adoption efforts. To be successful, clear
policies must be established to guide early adoption efforts within acceptable risk tolerances. Govern and
Manage methodologies create processes to mature operations in parallel. Those steps serve as stage gates that
must be completed before allowing for increased risk as operations matures.
Advantages of centralized operations
Cost management: Centralizing shared services across a number of workloads creates economies of scale
and eliminates duplicated tasks. Central teams can more quickly implement cost reductions through
enterprise-wide sizing and scale optimizations.
Responsibilities: Centralized expertise and standardization are likely to lead to higher stability, better operational performance, and lower risk of change-related outages. This reduces broad skilling pressures on the workload-focused teams.
Standardization: In general, standardization and cost of operations is lowest with a centralized model
because there are fewer duplicated systems or tasks.
Operations support: Reducing complexity and centralizing operations makes it easier for smaller IT teams to support operations.
Expertise: Centralization of supporting teams allows experts in the fields of security, risk, governance, and operations to drive business-critical decisions.
Landing zone design: Central IT tends to minimize the number of landing zones and subscriptions to reduce
complexity. Landing zone designs tend to mimic the preceding datacenter designs, which reduces transition
time. As adoption progresses, shared resources can then be moved out into a separate subscription or platform
foundation.
Foundational utilities: Carrying existing datacenter designs into the cloud results in foundational, shared
services that mimic on-premises tools and operations. When on-premises operations are your primary
operating model, this can be an advantage (beware the disadvantages below). Doing so reduces transition
time, capitalizes on economies of scale, and allows for consistent operational processes between on-premises
and cloud hosted workloads. This approach can reduce short-term complexity/effort and allow smaller teams
to support cloud operations with reduced learning curves.
Separation of duties: Separation of duties is clear in central operations. Central IT maintains control of the
production environments reducing the need for any elevated permissions from other teams. This reduces the
surface area of breach by reducing the number of accounts with elevated privileges.
Disadvantages of centralized operations
Cost management: Central teams rarely have enough understanding of the workload architectures to
produce impactful optimizations at the workload level. This limits the amount of cost savings that can come
from well-tuned workload operations. Further, lack of workload architecture understanding can cause
centralized cost optimizations to have a direct impact on performance, scale, or other pillars of a well-
architected workload. Before applying enterprise-wide cost changes to high profile workloads, the Microsoft
Azure Well-Architected Review should be completed and considered by the central IT team.
Responsibilities: Centralizing production support and access places a higher operational burden on a smaller
number of people. It also places greater pressure on those individuals to perform deeper reviews of the
deployed workloads to validate adherence to detailed security, governance, and compliance requirements.
Standardization: Central IT approaches make it difficult to scale standardization without a linear scaling of
central IT staff.
Operations support: Note the disadvantages and risks listed above. The greatest disadvantages of this approach are associated with significant scale and with shifts that prioritize innovation.
Expertise: Developer and DevOps experts are at risk of being undervalued or too constrained in this type of environment.
Landing zone design: Datacenter designs are based on the constraints of preceding approaches, which aren't
always relevant to the cloud. Following this approach reduces the opportunities to rethink environment
segmentation and empower innovation opportunities. Lack of landing zone segmentation also increases the
potential impact of breach, increases complexity of governance/compliance adherence, and could create
blockers to adoption later in the cloud journey. See the risks section above.
Foundational utilities: During digital transformation, cloud might become the primary operating model.
Persisting central tools built for on-premises operations reduces opportunities to modernize operations and
drive increased operational efficiencies. Choosing not to modernize operations early in the adoption process
can be overcome through creation of a platform foundations subscription later in the cloud adoption journey,
but that effort can be complex, costly, and time consuming without advanced planning.
Separation of duties: Central operations generally follow one of two paths, both of which can hinder
innovation.
Option 1: Teams outside of central IT are granted limited access to development environments that
mimic production. This option hinders experimentation.
Option 2: Teams develop and test in non-supported environments. This option hinders deployment
processes and slows post-deployment integration testing.
Enterprise operations
Enterprise operations is the suggested target state for all cloud operations. Enterprise operations balances the
need for control and innovation by democratizing decisions and responsibilities. Central IT is replaced by a more
facilitative cloud center of excellence (CCoE) team, which supports workload teams and holds them accountable
for decisions, as opposed to controlling or limiting their actions. Workload teams are granted more power and
more responsibility to drive innovation, within well-defined guardrails.
Priorities: Prioritizes democratization of technical decisions. Democratization of technical decisions shifts
responsibilities previously held by central IT to workload teams when applicable. To deliver this shift in
priorities, decisions become less dependent on human-run review processes and more dependent on
automated review, governance, and enforcement using cloud-native tools.
Distinct advantage: Segmentation of environments and separation of duties allow for balance between
control and innovation. Central operations can maintain centralized operations for workloads that require
increase compliance, stable state operations, or represent greater security risks. Conversely, this approach
allows for reduction in centralized control of workloads and environments that require greater innovation.
Since larger portfolios are more likely to struggle with the balance between control and innovation, this
flexibility makes it easier to scale to thousands of workloads with reductions in operational pains.
Distinct disadvantage: What worked well on-premises might not work well in enterprise cloud operations.
This approach to operations requires changes on many fronts. Cultural shifts in control and responsibility are
often the biggest challenge. Operational shifts that follow the cultural shift take time and committed effort to
implement, mature, and stabilize. Architectural shifts are sometimes required in otherwise stable workloads.
Tooling shifts are required to empower and support the cultural, operational, and architectural shifts, which
might require commitments to a primary cloud provider. Adoption efforts made prior to these changes might
require significant rework that goes beyond typical refactoring efforts.
Risk : This approach requires executive commitment to the change strategy. It also requires commitment from
the technical teams to overcome learning curves and deliver the required change. Long-term cooperation
between business, CCoE/central IT, and workload teams is required to see long-term benefits.
Guidance: Azure landing zone implementation options defined as "enterprise-scale" provide reference
implementations to demonstrate how the technical changes are delivered using cloud-native tooling in Azure.
The enterprise-scale approach guides teams through the operational and cultural shifts required to take full
advantage of those implementations. That same approach can be used to tailor the reference architecture to
configure the environment to meet your adoption strategy and compliance constraints. Once enterprise-scale
has been implemented, the Govern and Manage methodologies can be used to define processes and expand
your compliance and operations capabilities to meet your specific operational needs.
Advantages of enterprise operations
Cost management: Central teams act on cross-portfolio optimizations and hold individual workload teams
accountable for deeper workload optimization. Workload focused teams are empowered to make decisions
and provided clarity when those decisions have a negative cost impact. Central and workload teams share
accountability for cost decisions at the right level.
Responsibilities: Central teams use cloud-native tools to define, enforce, and automate guardrails. Efforts of
the workload teams are accelerated through CCoE automation and practices. The workload teams are then
empowered to drive innovation and make decisions within those guardrails.
Standardization: Centralized guardrails and foundational services create consistency across all environments,
regardless of scale.
Operations support: Workloads that require centralized operations support are segmented to environments
with stable-state controls. Segmentation and separation of duties empower workload teams to take
accountability for operational support in their own dedicated environments. Automated, cloud native tools
ensure a minimum operations baseline for all environments with centralized operational support for the
baseline offering.
Expertise: Centralization of core services such as security, risk, governance, and operations ensures proper central expertise. Clear processes and guardrails educate and empower all members of the workload teams to make more detailed decisions that expand the impact of the centralized experts, without needing to scale
that staff linearly with technology scale.
Landing zone design: Landing zone design replicates the needs of the portfolio, creating clear security,
governance, and accountability boundaries required to operate workloads in the cloud. Segmentation practices
are unlikely to resemble the constraints created by preceding datacenter designs. In enterprise operations,
landing-zone design is less complex, allowing for faster scale and reduced barriers to self-service demand.
Foundational utilities: Foundational utilities are hosted in separate centrally controlled subscriptions
referred to as the platform foundation. Central tools are then "piped" into each landing zone as utility services. Separating foundational utilities from the landing zones maximizes consistency and economy of scale,
and creates clear distinctions between centrally managed responsibilities and workload level responsibilities.
Separation of duties: Clear separation of duties between foundational utilities and landing zones is one of
the biggest advantages of this approach to operations. Cloud-native tools and sound processes allow for just-
in-time access and proper balance of control between centralized teams and workload teams, based on the
requirements of the individual landing zones and the workloads hosted in those landing zone segments.
Disadvantages of enterprise operations
Cost management: Central teams are more dependent on workload teams to make production changes
within landing zones. This shift does create a risk of potential budget overruns and slower right-sizing of actual
spend. Cost control processes, clear budgets, automated controls, and regular reviews must be in place early to
avoid cost surprises.
Responsibilities: Enterprise operations places greater cultural and operational demands on central and workload teams to ensure clarity in responsibilities and accountability between teams.
Traditional change management processes or change advisory boards (CABs) are less likely to maintain the pace and balance required in this operating model. Those processes should be reflected in the automation of
processes and procedures to safely scale cloud adoption.
Lack of commitment to change will first materialize in negotiation and alignment of responsibilities. Inability to
align on shifts in responsibility is an indication that central IT operating models might be required during
short-term cloud adoption efforts.
Standardization: Lack of investment in centralized guardrails or automation creates risks to standardization that are more difficult to overcome through manual review processes. Additionally, operational dependencies
between workloads in the landing zones and shared services in the platform foundation creates greater risk to
standardization during upgrade cycles or future versions of the foundational utilities. During platform
foundation revisions, improved or even automated testing is required of all supported landing zones and the
workloads they host.
Operations support: The operations baseline provided through automation and centralized operations
might be sufficient for low impact or low criticality workloads. However, workload teams or other forms of
dedicated operations will likely be required for complex or high criticality workloads. This might necessitate a
shift in operations budgets, requiring business units to allocate operating expenses to those forms of advanced
operations. If central IT is required to maintain sole accountability for the cost of operations, then enterprise
operations will be difficult to implement.
Expertise: Central IT team members will be required to develop expertise regarding automation of central
controls previously delivered via manual processes. Those teams might also need to develop a proficiency for
infrastructure-as-code approaches to defining the environment, along with an understanding of branching,
merging, and deployment pipelines. At a minimum, a platform automation team will need these skills to act on
decisions made by the cloud center of excellence or central operations teams. Workload teams will be required
to develop additional knowledge related to the controls and processes that will govern their decisions.
Landing zone design: Landing zone design takes a dependency on the foundational utilities. To avoid
duplication of effort (or errors/conflicts with automated governance), each workload focused team should
understand what is included in the design and what is forbidden. Exception processes should also be factored into landing zone designs to create flexibility.
Foundational utilities: Centralizing the foundational utilities takes time to evaluate options and develop a
solution that will scale to meet various adoption plans. This can delay early adoption efforts, but that delay
should be offset in the long term by accelerations and avoided blockers later in the process.
Separation of duties: Ensuring clear separation of duties requires mature identity management
processes. There might be additional maintenance associated with the proper alignment of users, groups, and
onboarding/off-boarding activities. New processes will likely be needed to accommodate just-in-time access
via elevated privileges.
Distributed operations
For some organizations, the existing operating model might be too ingrained for the entire organization to shift
to a new operating model. For others, global operations and various compliance requirements might prevent
specific business units from making a change. For those companies, a distributed operations approach might be
required. This is by far the most complex approach, as it requires an integration of one or more of the previously
mentioned operating models.
While heavily discouraged, this approach to operations might be required for organizations that consist of a
loose collection of disparate business units, especially when those business units span a diverse base of customer
segments or regional operations.
Priorities: Integration of multiple existing operating models.
Transitional state with a focus on moving the entire organization to one of the previously mentioned operating
models, over time.
Longer term operational approach when the organization is too large or too complex to align to a single
operating model.
Distinct advantage: Integration of common operating model elements from each business unit. Creates a
vehicle to group operating units into a hierarchy and helps them mature operations using consistent, repeatable
processes.
Distinct disadvantage: Consistency and standardization across multiple operating models is difficult to
maintain for extended periods. This operational approach requires deep awareness of the portfolio and how
various segments of the technology portfolio are operated.
Risk: Lack of commitment to a primary operating model could lead to confusion across teams. This operating
model should only be used when there is no way to align to a single operating model.
Guidance: Start with a thorough review of the portfolio using the approach outlined in the business alignment
articles. Take care to group the portfolio by desired state operating model (decentralized, centralized, or
enterprise).
Develop a management group hierarchy that reflects the operating model groupings, followed by other
organizational patterns for region, business unit, or other criteria that map the workload clusters from least
common to most common buckets, as sketched in the example after this list.
Evaluate the alignment of workloads to operating models to find the most relevant operating model cluster to
start with. Follow the guidance that maps to that operating model for all workloads under that node of the
management group hierarchy.
Use the Govern and Manage methodologies to find common corporate policies and required operational
management practices at various points of the hierarchy. Apply common Azure policies to automate the shared
corporate policies.
As those Azure policies are tested with various deployments, attempt to move them higher in the management
group hierarchy, applying those policies to greater numbers of workloads to find commonalities and distinct
operational needs.
Over time, this approach helps define an operating model that scales across your other operating
models and unifies teams through a set of common policies and procedures.
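The following Python sketch illustrates the kind of hierarchy this guidance describes. The group names, the three operating model buckets, and the policy names are hypothetical placeholders; in practice the hierarchy would be created with your preferred deployment tooling (portal, CLI, ARM/Bicep, or Terraform).

```python
# Hypothetical management group hierarchy grouped by desired-state operating
# model first, then by business unit or region. All names are placeholders.
hierarchy = {
    "contoso": {
        "decentralized-ops": {"bu-retail": {}, "bu-startups": {}},
        "centralized-ops": {"bu-finance": {}, "emea": {}},
        "enterprise-ops": {"platform": {}, "landing-zones": {}},
    }
}

# Common corporate policies (placeholder names) assigned at the highest node
# where they have proven out; workload-specific policies stay lower in the tree.
policy_assignments = {
    "contoso": ["allowed-locations", "require-cost-center-tag"],
    "enterprise-ops": ["deploy-diagnostic-settings", "deny-public-ip"],
    "centralized-ops": ["require-change-ticket-tag"],
}

def print_tree(node, depth=0):
    """Print the hierarchy with the policies assigned at each node."""
    for name, children in node.items():
        policies = ", ".join(policy_assignments.get(name, [])) or "inherited only"
        print("  " * depth + f"{name}  [{policies}]")
        print_tree(children, depth + 1)

print_tree(hierarchy)
```

As policies prove out against real deployments, the assignment moves up to a broader node so more workloads inherit it, which is the progression the guidance above describes.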
Advantages and disadvantages of this approach are purposefully blank. After you complete the business
alignment of your portfolio, see the predominant operating model section above for clarity on advantages and
disadvantages.
Next steps
Learn the terminology associated with operating models. The terminology helps you understand how an
operating model fits into the bigger theme of corporate planning.
Operating model terminology
Learn how a landing zone provides the basic building block of any cloud adoption environment.
Compare common cloud operating models
Operating model terminology
10/30/2020 • 2 minutes to read
The term operating model has many definitions. This intro article establishes terminology associated with
operating models. To understand an operating model as it relates to the cloud, we first have to understand how an
operating model fits into the bigger theme of corporate planning.
Terms
Business model: Business models tend to define corporate value (what the business does to provide value) and
mission/vision statements (why the business has chosen to add value in that way). At a minimum, business models
should be able to represent the what and why in the form of financial projections. There are many different schools
of thought regarding how far a business model goes beyond these basic leadership principles. However, to create a
sound operating model, the business models should include high-level statements to establish directional goals. It's
even more effective if those goals can be represented in metrics or KPIs to track progress.
Customer experience: All good business models ground the why side of a business's strategy in the experience
of their customers. This process could involve a customer acquiring a product or service. It could include
interactions between a company and its business customers. Another example could center around the long-term
management of a customer's financial or health needs, as opposed to a single transaction or process. Regardless of
the type of experience, the majority of successful companies realize that they exist to operate and improve the
experiences that drive their why statements.
Digital transformation: Digital transformation has become an industry buzzword. However, it is a vital
component in the fulfillment of modern business models. Since the advent of the smartphone and other portable
computing form factors, customer experiences have become increasingly digital. This shift is painfully obvious in
some industries like DVD rentals, print media, automotive, or retail. In each case, digitized experiences have had a
significant impact on the customer experience. In some cases, physical media have been entirely replaced with
digital media, upsetting the entire industry vertical. In others, digital experiences are seen as a standard
augmentation of the experience. To deliver business value (what statements), the customer experience (why
statements) must factor in the impact of digital experiences on the customers' experiences. This process is digital
transformation. Digital transformation is seldom the entire why statement in a business strategy, but it is an
important aspect.
Operating model: If the business model represents the what and why, then an operating model represents the
how and who for operationalizing the business strategy. The operating model defines the ways in which people
work together to accomplish the large goals outlined in the business strategy. Operating models are often
described as the people, process, and technology behind the business strategy. In the article on the Cloud Adoption
Framework operating model, this concept is explained in detail.
Cloud adoption: As stated above, digital transformation is an important aspect of the customer experience and
the business model. Likewise, cloud adoption is an important aspect of any operating model. Cloud adoption is a
strong enabler to deliver the right technologies and processes required to successfully deliver on the modern
operating model.
Cloud adoption is what we do to realize the business value. The operating model represents who we are and how
we function on a daily basis while cloud adoption is being delivered.
Take action
Use the operating model provided by the Cloud Adoption Framework to develop operational maturity.
Next steps
Continue to the next section of the Cloud Adoption Framework. Learn how a landing zone provides the basic
building block of any cloud adoption environment.
Use the operating model
What is an Azure landing zone?
10/30/2020 • 2 minutes to read
Azure landing zones are the output of a multisubscription Azure environment that accounts for scale, security,
governance, networking, and identity. Azure landing zones enable application migrations and greenfield
development at an enterprise scale in Azure. These zones consider all platform resources that are required to
support the customer's application portfolio and don't differentiate between infrastructure as a service or
platform as a service.
Next steps
When you're choosing the right Azure landing zone implementation option, you should understand the Azure
landing zone design areas.
Review design areas
Design areas of a well-architected landing zone
10/30/2020 • 2 minutes to read
Each Azure landing zone implementation option provides a deployment approach and defined design principles.
Before choosing an implementation option, use this article to gain an understanding of the design areas listed in
the following table.
NOTE
These design areas describe what you should consider prior to deploying a landing zone. Use them as a simple reference. See
the landing zone implementation options for design principles and actionable steps for deployment.
Design areas
Regardless of the deployment option, you should carefully consider each design area. Your decisions affect the
platform foundation on which each landing zone depends.
Business continuity and disaster recovery (BCDR): Resiliency is key for smooth functioning of applications. BCDR is an
important component of resiliency. BCDR involves protection of data via backups and protection of applications from
outages via disaster recovery. (Relevant methodology: Manage.)
Next steps
You can implement these design areas over time so that you can grow into your cloud operating model.
Alternatively, there are rich, opinionated implementation options that start with a defined position on each design
area.
With an understanding of the modular design areas, the next step is to choose the landing zone implementation
option that best aligns with your cloud adoption plan and requirements.
Choose an implementation option
Landing zone implementation options
10/30/2020 • 3 minutes to read
Azure landing zones provide cloud adoption teams with a well-managed environment for their workloads.
Follow the landing zone design areas guidance to take advantage of these capabilities.
Each of the following implementation options is designed for a specific set of operating model dependencies to
support your nonfunctional requirements. Each implementation option includes distinct automation approaches.
When available, reference architectures and reference implementations are included to accelerate your cloud
adoption journey. While each option is mapped to a different operating model, they share the same design areas.
The difference is how you choose to implement them.
Implementation options
The following table describes some of the implementation options for landing zones and the variables that might
drive your decision.
IMPLEMENTATION OPTION | DESCRIPTION | DEPLOYMENT VELOCITY | DEEPER DESIGN PRINCIPLES | DEPLOYMENT INSTRUCTIONS
CAF Migration landing zone blueprint | Deploys the basic foundation for migrating low-risk assets. | Start small | Design principles | Deploy
CAF Foundation blueprint | Adds the minimum tools needed to begin developing a governance strategy. | Start small | Design principles | Deploy
CAF Terraform modules | Third-party path for multicloud operating models. This path can limit Azure-first operating models. | Start small | Design principles | Deploy
Partner landing zones | Partners who provide offerings aligned to the Ready methodology of the Cloud Adoption Framework can provide their own customized implementation option. | Variable | Design principles | Find a partner
The following table looks at some of these implementation options from a slightly different perspective to guide
more technical decision processes.
IMPLEMENTATION OPTION | HUB | SPOKE | DEPLOYMENT TECHNOLOGY | DEPLOYMENT INSTRUCTIONS
Next steps
To proceed, choose one of the implementation options shown in the preceding table. Each option includes a link
to deployment instructions and the specific design principles that guide implementation.
Start with Cloud Adoption Framework enterprise-
scale landing zones
10/30/2020 • 2 minutes to read
The enterprise-scale architecture represents the strategic design path and target technical state for your Azure
environment. It will continue to evolve alongside the Azure platform and is defined by the various design
decisions that your organization must make to map your Azure journey.
Not all enterprises adopt Azure in the same way, so the Cloud Adoption Framework for Azure enterprise-scale
landing zone architecture varies between customers. The technical considerations and design recommendations
of the enterprise-scale architecture might lead to different trade-offs based on your organization's scenario.
Some variation is expected, but if you follow the core recommendations, the resulting target architecture will set
your organization on a path to sustainable scale.
Prescriptive guidance
The enterprise-scale architecture provides prescriptive guidance coupled with Azure best practices. It follows
design principles across the critical design areas for an organization's Azure environment.
Community
This guide is developed largely by Microsoft architects and the broader Cloud Solutions Unit technical
community. This community actively advances this guide to share lessons learned during enterprise-scale
adoption efforts.
This guide shares the same design principles as the standard Ready methodology. It expands on those principles
to integrate subjects such as governance and security earlier in the planning process. Expanding the standard
process is necessary because of a few natural assumptions that can be made when an adoption effort requires
large-scale enterprise change.
Next steps
Implement a Cloud Adoption Framework enterprise-scale landing zone
Implement Cloud Adoption Framework enterprise-
scale landing zones in Azure
10/30/2020 • 2 minutes to read
When business requirements necessitate a rich initial implementation of landing zones with fully integrated
governance, security, and operations from the start, use the enterprise-scale example options listed here. With this
approach, you can use the Azure portal or infrastructure as code to set up and configure your environment. It's
also possible to transition between the portal and infrastructure as code (recommended) when your organization
is ready.
Example implementation
The following table lists example modular implementations.
Enterprise-scale foundation | This is the suggested foundation for enterprise-scale adoption. | Example in GitHub | Deploy example to Azure
Enterprise-scale Virtual WAN | Add a Virtual WAN network module to the enterprise-scale foundation. | Example in GitHub | Deploy example to Azure
Enterprise-scale hub and spoke | Add a hub-and-spoke network module to the enterprise-scale foundation. | Example in GitHub | Deploy example to Azure
Next steps
These examples provide an easy deployment option to support continued learning for the enterprise-scale
approach. Before you use these examples in a production version of enterprise scale, review the enterprise-scale
architecture.
Cloud Adoption Framework enterprise-scale landing
zone architecture
10/30/2020 • 4 minutes to read
Enterprise-scale is an architectural approach and a reference implementation that enables effective construction
and operationalization of landing zones on Azure, at scale. This approach aligns with the Azure roadmap and the
Cloud Adoption Framework for Azure.
Architecture overview
The Cloud Adoption Framework enterprise-scale landing zone architecture represents the strategic design path
and target technical state for an organization's Azure environment. It will continue to evolve alongside the Azure
platform and is defined by the various design decisions that your organization must make to map your Azure
journey.
Not all enterprises adopt Azure the same way, so the Cloud Adoption Framework enterprise-scale landing zone
architecture varies between customers. The technical considerations and design recommendations in this guide
might yield different trade-offs based on your organization's scenario. Some variation is expected, but if you follow
the core recommendations, the resulting target architecture will set your organization on a path to sustainable
scale.
High-level architecture
An Enterprise-Scale architecture is defined by a set of design considerations and recommendations across eight
critical design areas, with two recommended network topologies: an Enterprise-Scale architecture based on an
Azure Virtual WAN network topology (depicted in figure 2), or one based on a traditional Azure hub-and-spoke
network topology (depicted in figure 3).
Figure 2: Cloud Adoption Framework enterprise-scale landing zone architecture based on an Azure Virtual WAN
network topology.
Figure 3: Cloud Adoption Framework enterprise-scale landing zone architecture based on a traditional Azure
networking topology.
Download the PDF files that contain the Enterprise-Scale architecture diagrams based on the Virtual WAN network
topology or a traditional Azure network topology based on the hub and spoke architecture.
In figures 2 and 3, the Enterprise-Scale critical design areas are indicated with the
letters A to I:
Enterprise Agreement (EA) enrollment and Azure Active Directory tenants. An Enterprise Agreement (EA)
enrollment represents the commercial relationship between Microsoft and how your organization uses Azure. It
provides the basis for billing across all your subscriptions and affects administration of your digital estate. Your EA
enrollment is managed via an Azure enterprise portal. An enrollment often represents an organization's hierarchy,
which includes departments, accounts, and subscriptions. An Azure AD tenant provides identity and access
management, which is an important part of your security posture. An Azure AD tenant ensures that authenticated
and authorized users have access to only the resources for which they have access permissions.
Identity and access management. Azure Active Directory design and integration must be built to ensure both
server and user authentication. Role-based access control (RBAC) must be modeled and deployed to enforce
separation of duties and the required entitlements for platform operation and management. Key management
must be designed and deployed to ensure secure access to resources and support operations such as rotation and
recovery. Ultimately, access roles are assigned to application owners at the control and data planes to create and
manage resources autonomously.
Management group and subscription organization. Management group structures within an Azure Active
Directory (Azure AD) tenant support organizational mapping and must be considered thoroughly when an
organization plans Azure adoption at scale. Subscriptions are a unit of management, billing, and scale within
Azure. They play a critical role when you're designing for large-scale Azure adoption. This critical design area helps
you capture subscription requirements and design target subscriptions based on critical factors. These factors are
environment type, ownership and governance model, organizational structure, and application portfolios.
Management and monitoring. Platform-level holistic (horizontal) resource monitoring and alerting must be
designed, deployed, and integrated. Operational tasks such as patching and backup must also be defined and
streamlined. Security operations, monitoring, and logging must be designed and integrated with both resources
on Azure and existing on-premises systems. All subscription activity logs that capture control plane operations
across resources should be streamed into Log Analytics to make them available for query and analysis, subject to
RBAC permissions.
Network topology and connectivity. The end-to-end network topology must be built and deployed across
Azure regions and on-premises environments to ensure north-south and east-west connectivity between platform
deployments. Required services and resources such as firewalls and network virtual appliances must be identified,
deployed, and configured throughout network security design to ensure that security requirements are fully met.
Business continuity and disaster recovery, and Security, governance, and compliance. Holistic and
landing-zone-specific policies must be identified, described, built, and deployed onto the target Azure platform to
ensure corporate, regulatory, and line-of-business controls are in place. Ultimately, policies should be used to
guarantee the compliance of applications and underlying resources without any abstraction provisioning or
administration capability.
Platform automation and DevOps. An end-to-end DevOps experience with robust software development
lifecycle practices must be designed, built, and deployed to ensure a safe, repeatable, and consistent delivery of
infrastructure-as-code artifacts. Such artifacts are to be developed, tested, and deployed by using dedicated
integration, release, and deployment pipelines with strong source control and traceability.
Next steps
Customize implementation of this architecture by using the Cloud Adoption Framework enterprise-scale design
guidelines.
Design guidelines
Cloud Adoption Framework enterprise-scale design
principles
10/30/2020 • 2 minutes to read
The enterprise-scale architecture prescribed in this guidance is based on the design principles described here.
These principles serve as a compass for subsequent design decisions across critical technical domains. Familiarize
yourself with these principles to better understand their impact and the trade-offs associated with nonadherence.
Subscription democratization
Subscriptions should be used as a unit of management and scale aligned with business needs and priorities to
support business areas and portfolio owners to accelerate application migrations and new application
development. Subscriptions should be provided to business units to support the design, development, and testing
of new workloads and migration of workloads.
Policy-driven governance
Azure Policy should be used to provide guardrails and ensure continued compliance with your organization's
platform, along with the applications deployed onto it. Azure Policy also provides application owners with
sufficient freedom and a secure unhindered path to the cloud.
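As a minimal sketch of policy-driven governance, the snippet below assigns a policy definition at subscription scope with the Azure SDK for Python. The subscription ID, assignment name, definition GUID, and parameter values are placeholders, and the exact model shapes can vary slightly between azure-mgmt-resource versions.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import PolicyClient
from azure.mgmt.resource.policy.models import PolicyAssignment

subscription_id = "<subscription-id>"              # placeholder
scope = f"/subscriptions/{subscription_id}"

client = PolicyClient(DefaultAzureCredential(), subscription_id)

# Assign a guardrail (for example, an "allowed locations" style definition)
# so every resource deployed under this scope is evaluated against it.
assignment = client.policy_assignments.create(
    scope,
    "restrict-resource-locations",                 # placeholder assignment name
    PolicyAssignment(
        display_name="Restrict resource locations",
        policy_definition_id=(
            "/providers/Microsoft.Authorization/policyDefinitions/<definition-guid>"
        ),
        parameters={
            "listOfAllowedLocations": {"value": ["westeurope", "northeurope"]}
        },
    ),
)
print(f"Assigned {assignment.name} at {scope}")
```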
Recommendations
Be prepared to trade off functionality because it's unlikely that everything will be required on day one. Use
preview services and take dependencies on service roadmaps to remove technical blockers.
Cloud Adoption Framework enterprise-scale design
guidelines
10/30/2020 • 2 minutes to read
This article and the article series that follows outline how the enterprise-scale architecture provides an
opinionated position on each of the Azure landing zone design areas. This series provides a step-by-step set of
design guidelines that can be followed to implement the design principles embodied in the enterprise-scale
solution.
The core of enterprise-scale architecture contains a critical design path composed of fundamental design topics
with heavily interrelated and dependent design decisions. This series provides design guidance across these
architecturally significant technical domains to support the critical design decisions that must occur to define the
enterprise-scale architecture. For each of the considered domains, review the provided considerations and
recommendations and use them to structure and drive designs within each area.
For example, you might ask how many subscriptions are needed for your estate. You can review subscription
organization and governance and use the outlined recommendations to drive subscription decisions.
Identity provides the basis of a large percentage of security assurance. It enables access based on identity authentication and
authorization controls in cloud services to protect data and resources and to decide which requests should be permitted.
Identity and access management (IAM) is boundary security in the public cloud. It must be treated as the foundation of any secure and
fully compliant public cloud architecture. Azure offers a comprehensive set of services, tools, and reference architectures to enable
organizations to make highly secure, operationally efficient environments as outlined here.
This section examines design considerations and recommendations related to IAM in an enterprise environment.
Use Azure Security Center just-in-time access for all infrastructure as a service (IaaS) resources to enable network-level protection for
ephemeral user access to IaaS virtual machines.
Use Azure AD managed identities for Azure resources to avoid authentication based on user names and passwords. Because many
security breaches of public cloud resources originate with credential theft embedded in code or other text sources, enforcing managed
identities for programmatic access greatly reduces the risk of credential theft, as shown in the sketch after this list.
Use privileged identities for automation runbooks that require elevated access permissions. Automated workflows that violate critical
security boundaries should be governed by the same tools and policies that govern users of equivalent privilege.
Don't add users directly to Azure resource scopes. This lack of centralized management greatly increases the management required to
prevent unauthorized access to restricted data.
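The following sketch shows the managed identity recommendation in practice with the Azure SDK for Python: the code acquires a token through the resource's managed identity rather than through a stored credential. The vault URL and secret name are placeholders, and the code assumes it runs on an Azure resource whose managed identity has been granted access to the vault.

```python
from azure.identity import ManagedIdentityCredential
from azure.keyvault.secrets import SecretClient

# The credential is issued by the Azure platform for the resource's managed
# identity; no user name, password, or key appears in code or configuration.
credential = ManagedIdentityCredential()

# Placeholder vault URL; the managed identity must have been granted
# read access (access policy or RBAC role) on this vault.
client = SecretClient(
    vault_url="https://<your-key-vault>.vault.azure.net",
    credential=credential,
)

secret = client.get_secret("example-connection-string")  # placeholder secret name
print(f"Retrieved secret '{secret.name}' without any embedded credential.")
```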
Plan for authentication inside a landing zone
A critical design decision that an enterprise organization must make when adopting Azure is whether to extend an existing on-premises
identity domain into Azure or to create a brand new one. Requirements for authentication inside the landing zone should be thoroughly
assessed and incorporated into plans to deploy Active Directory Domain Services (AD DS) on Windows Server, Azure AD Domain Services
(Azure AD DS), or both. Most Azure environments will use at least Azure AD for Azure fabric authentication and AD DS for local host
authentication and group policy management.
Design considerations:
Consider centralized and delegated responsibilities to manage resources deployed inside the landing zone.
Applications that rely on domain services and use older protocols can use Azure AD DS.
Design recommendations:
Use centralized and delegated responsibilities to manage resources deployed inside the landing zone based on role and security
requirements.
Privileged operations such as creating service principal objects, registering applications in Azure AD, and procuring and handling
certificates or wildcard certificates require special permissions. Consider which users will be handling such requests and how to
secure and monitor their accounts with the degree of diligence required.
If an organization has a scenario where an application that uses integrated Windows authentication must be accessed remotely
through Azure AD, consider using Azure AD Application Proxy.
There's a difference between Azure AD, Azure AD DS, and AD DS running on Windows Server. Evaluate your application needs, and
understand and document the authentication provider that each one will be using. Plan accordingly for all applications.
Evaluate the compatibility of workloads for AD DS on Windows Server and for Azure AD DS.
Ensure your network design allows resources that require AD DS on Windows Server for local authentication and management to
access the appropriate domain controllers.
For AD DS on Windows Server, consider shared services environments that offer local authentication and host management in
a larger enterprise-wide network context.
Deploy Azure AD DS within the primary region because this service can only be projected into one subscription.
Use managed identities instead of service principals for authentication to Azure services. This approach reduces exposure to credential
theft.
Network topology and connectivity
10/30/2020 • 8 minutes to read
This article examines key design considerations and recommendations surrounding networking and connectivity
to, from, and within Microsoft Azure.
Connectivity to Azure
This section expands on the network topology to consider recommended models for connecting on-premises
locations to Azure.
Design considerations:
Azure ExpressRoute provides dedicated private connectivity to Azure infrastructure as a service (IaaS) and
platform as a service (PaaS) functionality from on-premises locations.
You can use Private Link to establish connectivity to PaaS services over ExpressRoute with private peering.
When multiple virtual networks are connected to the same ExpressRoute circuit, they'll become part of the
same routing domain, and all virtual networks will share the bandwidth.
You can use ExpressRoute Global Reach, where available, to connect on-premises locations together through
ExpressRoute circuits to transit traffic over the Microsoft backbone network.
ExpressRoute Global Reach is available in many ExpressRoute peering locations.
ExpressRoute Direct allows creation of multiple ExpressRoute circuits at no additional cost, up to the
ExpressRoute Direct port capacity (10 Gbps or 100 Gbps). It also allows you to connect directly to
Microsoft's ExpressRoute routers. For the 100-Gbps SKU, the minimum circuit bandwidth is 5 Gbps. For the
10-Gbps SKU, the minimum circuit bandwidth is 1 Gbps.
Design recommendations:
Use ExpressRoute as the primary connectivity channel for connecting an on-premises network to Azure. You
can use VPNs as a source of backup connectivity to enhance connectivity resiliency.
Use dual ExpressRoute circuits from different peering locations when you're connecting an on-premises
location to virtual networks in Azure. This setup will ensure redundant paths to Azure by removing single
points of failure between on-premises and Azure.
When you use multiple ExpressRoute circuits, optimize ExpressRoute routing via BGP local preference and
AS PATH prepending.
Ensure that you're using the right SKU for the ExpressRoute/VPN gateways based on bandwidth and
performance requirements.
Deploy a zone-redundant ExpressRoute gateway in the supported Azure regions.
For scenarios that require bandwidth higher than 10 Gbps or dedicated 10/100-Gbps ports, use
ExpressRoute Direct.
When low latency is required, or throughput from on-premises to Azure must be greater than 10 Gbps,
enable FastPath to bypass the ExpressRoute gateway from the data path.
Use VPN gateways to connect branches or remote locations to Azure. For higher resilience, deploy zone-
redundant gateways (where available).
Use ExpressRoute Global Reach to connect large offices, regional headquarters, or datacenters connected to
Azure via ExpressRoute.
When traffic isolation or dedicated bandwidth is required, such as for separating production and
nonproduction environments, use different ExpressRoute circuits. It will help you ensure isolated routing
domains and alleviate noisy-neighbor risks.
Proactively monitor ExpressRoute circuits by using Network Performance Monitor.
Don't rely on ExpressRoute circuits from a single peering location. This creates a single point of failure
and makes your organization susceptible to peering location outages.
Don't use the same ExpressRoute circuit to connect multiple environments that require isolation or
dedicated bandwidth, to avoid noisy-neighbor risks.
When you're using ExpressRoute Direct, configure MACsec in order to encrypt traffic at layer 2
between your organization's routers and the Microsoft Enterprise Edge (MSEE) routers. The diagram shows this encryption in flow B.
For Virtual WAN scenarios where MACsec isn't an option (for example, not using ExpressRoute Direct), use a
Virtual WAN VPN gateway to establish IPsec tunnels over ExpressRoute private peering. The diagram shows
this encryption in flow C .
For non-Virtual WAN scenarios, and where MACsec isn't an option (for example, not using ExpressRoute
Direct), the only options are:
Use partner NVAs to establish IPsec tunnels over ExpressRoute private peering.
Establish a VPN tunnel over ExpressRoute with Microsoft peering.
Evaluate the capability to configure a Site-to-Site VPN connection over ExpressRoute private peering (in
preview).
If traffic between Azure regions must be encrypted, use Global VNet Peering to connect virtual networks
across regions.
If native Azure solutions (as shown in flows B and C in the diagram) don't meet your requirements, use
partner NVAs in Azure to encrypt traffic over ExpressRoute private peering.
Your organization or enterprise needs to design suitable, platform-level capabilities that application workloads can
consume to meet their specific requirements. Specifically, these application workloads have requirements
pertaining to recovery time objective (RTO) and recovery point objective (RPO). Be sure that you capture disaster
recovery (DR) requirements in order to design capabilities appropriately for these workloads.
Design considerations
Consider the following factors:
Application and data availability requirements, and the use of active-active and active-passive availability
patterns (such as workload RTO and RPO requirements).
Business continuity and DR for platform as a service (PaaS) services, and the availability of native DR and
high-availability features.
Support for multiregion deployments for failover purposes, with component proximity for performance
reasons.
Application operations with reduced functionality or degraded performance in the presence of an outage.
Workload suitability for Availability Zones or availability sets.
Data sharing and dependencies between zones.
The impact of Availability Zones on update domains compared to availability sets and the percentage
of workloads that can be under maintenance simultaneously.
Support for specific virtual machine (VM) stock-keeping units with Availability Zones.
Using Availability Zones is required if Microsoft Azure ultra disk storage is used.
Consistent backups for applications and data.
VM snapshots and using Azure Backup and Recovery Services vaults.
Subscription limits restricting the number of Recovery Services vaults and the size of each vault.
Geo-replication and DR capabilities for PaaS services.
Network connectivity if a failover occurs.
Bandwidth capacity planning for Azure ExpressRoute.
Traffic routing if a regional, zonal, or network outage occurs.
Planned and unplanned failovers.
IP address consistency requirements and the potential need to maintain IP addresses after failover
and failback.
Maintained engineering DevOps capabilities.
Azure Key Vault DR for application keys, certificates, and secrets.
Design recommendations
The following are best practices for your design:
Employ Azure Site Recovery for Azure-to-Azure Virtual Machines disaster recovery scenarios. This enables
you to replicate workloads across regions.
Site Recovery provides built-in platform capabilities for VM workloads to meet low RPO/RTO requirements
through real-time replication and recovery automation. Additionally, the service provides the ability to run
recovery drills without affecting the workloads in production. You can use Azure Policy to enable replication
and also audit the protection of your VMs.
Use native PaaS service disaster recovery capabilities.
The built-in features provide an easy solution to the complex task of building replication and failover into a
workload architecture, simplifying both design and deployment automation. An organization that has
defined a standard for the services they use can also audit and enforce the service configuration through
Azure Policy.
Use Azure-native backup capabilities.
Azure Backup and PaaS-native backup features remove the need for managing third-party backup software
and infrastructure. As with other native features, you can set, audit, and enforce backup configurations with
Azure Policy. This ensures that services remain compliant with the organization's requirements.
Use multiple regions and peering locations for ExpressRoute connectivity.
A redundant hybrid network architecture can help ensure uninterrupted cross-premises connectivity in the
event of an outage affecting an Azure region or peering provider location.
Avoid using overlapping IP address ranges for production and DR sites.
When possible, plan for a business continuity and DR network architecture that provides concurrent
connectivity to all sites. DR networks that use the same classless inter-domain routing (CIDR) blocks as
production networks require a network failover process that can complicate and delay application failover
in the event of an outage.
Enterprise-scale security governance and compliance
10/30/2020 • 10 minutes to read
This article covers defining encryption and key management, planning for governance, defining security
monitoring and an audit policy, and planning for platform security. At the end of the article, you can refer to a table
that describes a framework to assess enterprise security readiness of Azure services.
Identity and access management | Authentication and access control | Are all control plane operations governed by Azure AD? Is there a nested control plane, such as with Azure Kubernetes Service? Is Azure-to-IaaS (service-to-virtual-network) authentication via Azure AD?
Governance | Data export and import | Does the service allow you to import and export data securely and encrypted?
Azure service compliance | Service attestation, certification, and external audits | Is the service PCI/ISO/SOC compliant?
This article covers how to get started with the enterprise-scale, platform-native reference implementation and outlines
design objectives.
In order to implement the enterprise-scale architecture, you must think in terms of the following categories of activities:
1. What must be true for the enterprise-scale architecture: Encompasses activities that must be performed
by the Azure and Azure Active Directory (Azure AD) administrators to establish an initial configuration. These
activities are sequential by nature and primarily one-off activities.
2. Enable a new region (File > New > Region): Encompasses activities that are required whenever there is a
need to expand the enterprise-scale platform into a new Azure region.
3. Deploy a new landing zone (File > New > Landing Zone): These are recurring activities that are required
to instantiate a new landing zone.
To operationalize at scale, these activities must follow infrastructure-as-code (IaC) principles and must be automated by
using deployment pipelines.
NAME | DESCRIPTION
Deploy-VM-Backup | Ensures that backup is configured and deployed to all VMs in the landing zones.
Deploy-VNet | Ensures that all landing zones have a virtual network deployed and that it's peered to the regional virtual hub.
NAME | DESCRIPTION | ASSIGNMENT NOTES
Deny-VNET-Peering-Cross-Subscription | Prevents VNET peering connections from being created to other VNETs outside of the subscription. | Ensure this policy is only assigned at the Sandbox Management Group hierarchy scoping level.
Denied-Resources | Resources that are denied from creation in the sandbox subscriptions. This will prevent any hybrid connectivity resources from being created, for example VPN/ExpressRoute/Virtual WAN resources. | When assigning this policy, select the following resources to deny the creation of: VPN gateways (microsoft.network/vpngateways), P2S gateways (microsoft.network/p2svpngateways), Virtual WANs (microsoft.network/virtualwans), Virtual WAN hubs (microsoft.network/virtualhubs), ExpressRoute circuits (microsoft.network/expressroutecircuits), ExpressRoute gateways (microsoft.network/expressroutegateways), ExpressRoute ports (microsoft.network/expressrouteports), ExpressRoute cross-connections (microsoft.network/expressroutecrossconnections), and local network gateways (microsoft.network/localnetworkgateways).
Deploy-Budget-Sandbox | Ensures a budget exists for each sandbox subscription, with e-mail alerts enabled. The budget will be named default-sandbox-budget in each subscription. | If the parameters are not amended from their defaults during assignment of the policy, the budget (default-sandbox-budget) will be created with a 1000 currency threshold limit and will send an e-mail alert to the subscription's owners and contributors (based on RBAC role assignment) at 90% and 100% of the budget threshold.
Deploy-Diag-LogAnalytics
Platform identity
1. If you create the identity resources via Azure Policy, assign the policies listed in the following table to the identity
subscription. By doing this, Azure Policy ensures that the resources in the following list are created based on the
parameters provided.
2. Deploy the Active Directory domain controllers.
The following list shows policies that you can use when you're implementing identity resources for an enterprise-scale
deployment.
NAME | DESCRIPTION
Deploy-VHub | This policy deploys a virtual hub, Azure Firewall, and gateways (VPN/ExpressRoute). It also configures the default route on connected virtual networks to Azure Firewall.
File > New > Landing Zone for applications and workloads
1. Create a subscription and move it under the Landing Zones management group scope.
2. Create Azure AD groups for the subscription, such as Owner, Reader, and Contributor.
3. Create Azure AD PIM entitlements for established Azure AD groups.
Transitioning existing Azure environments to
Enterprise-Scale
10/30/2020 • 4 minutes to read
We recognize that most organizations may have an existing footprint in Azure, one or more subscriptions, and
potentially an existing structure of their management groups. Depending on their initial business requirements and
scenarios, Azure resources such as hybrid connectivity (for example with Site-to-Site VPN and/or ExpressRoute)
may have been deployed.
This article helps organizations navigate the right path for transitioning an existing Azure environment
into Enterprise-Scale. It also describes considerations for moving resources in Azure (for example, moving
a subscription from one existing management group to another), which will help you evaluate
and plan the transition of your existing Azure environment to Enterprise-Scale landing zones.
SCOPE: Resources in resource groups
DESTINATION: Can be moved to a new resource group in the same or a different subscription
PROS: Allows you to modify resource composition in a resource group after deployment
CONS:
* Not supported by all resource types
* Some resource types have specific limitations or requirements
* Resource IDs are updated, which impacts existing monitoring, alerts, and control plane operations
* Resource groups are locked during the move period
* Requires assessment of policies and RBAC before and after the move operation
To understand which move strategy you should use, we will go through examples of both:
Subscription move
The common use cases for moving subscriptions are primarily to 1) organize subscriptions into management
groups, and 2) transfer subscriptions to a new Azure Active Directory tenant. We will focus on moving
subscriptions to management groups in this section as moving to a new tenant is mainly for transferring billing
ownership.
RBAC requirements
To assess a subscription prior to a move, it is important that the user has the appropriate RBAC, such as being an
Owner on the subscription (a direct role assignment), and has write permission on the target management group
(built-in roles that support this are Owner, Contributor, and Management Group Contributor). If the user has an inherited
Owner permission on the subscription from an existing management group, the subscription can only be moved to
the management group where the user has been assigned the Owner role.
Policy
Existing subscriptions may be subject to Azure policies assigned either directly, or at the management group where
they are currently located. It is important to assess current policies, and the policies that may exist in the new
management group/management group hierarchy. Azure Resource Graph can be used to perform an inventory of
existing resources and compare their configuration with the policies existing at the destination.
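As a minimal sketch of that inventory step, the query below uses the Azure Resource Graph SDK for Python to list the resources in a subscription before the move. The subscription ID and the query are placeholders that you would extend to compare configurations against the policies assigned at the destination scope.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.resourcegraph import ResourceGraphClient
from azure.mgmt.resourcegraph.models import QueryRequest

client = ResourceGraphClient(DefaultAzureCredential())

# Placeholder subscription and query; extend the query to check the settings
# that the destination management group's policies will evaluate.
request = QueryRequest(
    subscriptions=["<subscription-id>"],
    query="Resources | project name, type, location, tags | order by type asc",
)

result = client.resources(request)
print(f"Resources found: {result.total_records}")
for row in result.data:  # each row is a dict in the default result format
    print(row["type"], row["name"], row["location"])
```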
Once subscriptions are moved to a management group with existing RBAC and policies in place, consider the
following:
Any RBAC that is inherited to the moved subscriptions can take up to 30 minutes before the user tokens in the
management group cache are refreshed. To expedite this process, you can refresh the token by signing out and
in or request a new token.
Any policy where the assignment scope includes the moved subscriptions will perform audit operations only on
the existing resources. More specifically:
Any existing resource in the subscription subject to deployIfNotExists policy effect will appear as non-
compliant and will not be remediated automatically but requires user interaction to perform the
remediation manually.
Any existing resource in the subscription subject to deny policy effect will appear as non-compliant and
will not be rejected. User must manually mitigate this result as appropriate.
Any existing resource in the subscription subject to append and modify policy effect will appear as non-
compliant and requires user interaction to mitigate.
Any existing resource in the subscription subject to audit and auditIfNotExist will appear as non-
compliant and requires user interaction to mitigate.
All new writes to resources in the moved subscription will be subject to the assigned policies at real-time as
normal.
Resource move
The primary use cases to perform a resource move is when you want to consolidate resources into the same
resource group if they share the same life-cycle, or move resources to a different subscription due to cost,
ownership, or RBAC requirements.
When performing a resource move, both the source resource group and the target resource group are locked (this
lock will not affect any of the resources in the resource group) during the move operation, meaning you cannot
add, update, or delete resources in the resource groups. A resource move operation will not change the location of
the resources.
Before you move resources
Prior to a move operation, you must verify that the resources in scope are supported as well as assessing their
requirements and dependencies. For instance, moving a peered virtual network requires you to disable virtual
network peering first, and reenable the peering once the move operation has completed. This disable/reenable
dependency requires planning upfront to understand the impact to any existing workload that may be connected to
your virtual networks.
Post move operation
When the resources are moved into a new resource group in the same subscription, any inherited RBAC and
policies from the management group and/or subscription scope will still apply. If you move them to a resource group in a
new subscription, where the subscription may be subject to other RBAC and policy assignments, the same guidance
applies as for the subscription move scenario: validate resource compliance and access controls.
Deploy a migration landing zone in Azure
10/30/2020 • 5 minutes to read
A migration landing zone is an environment that has been provisioned and prepared to host workloads that are
being migrated from an on-premises environment into Azure.
Design principles
This implementation option provides an opinionated approach to the common design areas shared by all Azure
landing zones. See the assumptions and decisions below for additional technical detail.
Deployment options
This implementation option deploys a minimum viable product (MVP) to start a migration. As the migration
progresses, the customer will follow a modular refactoring-based approach to mature the operating model in
parallel, using the Govern methodology and the Manage methodology to address those complex topics
alongside the initial migration effort.
The specific resources deployed by this MVP approach are outlined in the decisions section below.
Enterprise enrollment
This implementation option doesn't take an inherent position on enterprise enrollment. This approach is designed
to be applicable to customers regardless of contractual agreements with Microsoft or Microsoft partners. Prior to
deployment of this implementation option, it is assumed that the customer has created a target subscription.
Identity
This implementation option assumes that the target subscription is already associated with an Azure Active
Directory instance in accordance with identity management best practices.
Network topology and connectivity
This implementation option creates a virtual network with subnets for gateway, firewall, jump box, and landing
zone. As a next step iteration, the team would follow the networking decisions guide to implement the appropriate
form of connectivity between the gateway subnet and other networks in alignment with network security best
practices.
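A minimal sketch of that network layout with the Azure SDK for Python is shown below. The resource group, virtual network name, region, and address ranges are placeholders; GatewaySubnet and AzureFirewallSubnet are the reserved subnet names Azure expects for gateways and Azure Firewall.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

subscription_id = "<subscription-id>"  # placeholder
client = NetworkManagementClient(DefaultAzureCredential(), subscription_id)

# Create the landing zone virtual network with subnets for the gateway,
# firewall, jump box, and landing zone workloads. Address ranges are examples.
poller = client.virtual_networks.begin_create_or_update(
    "rg-migration-landing-zone",           # placeholder resource group
    "vnet-landing-zone",                   # placeholder VNet name
    {
        "location": "westeurope",
        "address_space": {"address_prefixes": ["10.10.0.0/16"]},
        "subnets": [
            {"name": "GatewaySubnet", "address_prefix": "10.10.0.0/24"},
            {"name": "AzureFirewallSubnet", "address_prefix": "10.10.1.0/24"},
            {"name": "snet-jumpbox", "address_prefix": "10.10.2.0/24"},
            {"name": "snet-landing-zone", "address_prefix": "10.10.3.0/24"},
        ],
    },
)
vnet = poller.result()
print(vnet.name, [s.name for s in vnet.subnets])
```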
Resource organization
This implementation option creates a single landing zone, in which resources will be organized into workloads
defined by specific resource groups. Choosing this minimalist approach to resource organization defers the
technical decision of resource organization until the team's cloud operating model is more clearly defined.
This approach is based on an assumption that the cloud adoption effort will not exceed subscription limits. This
option also assumes limited architectural complexity and security requirements within this landing zone.
If this changes through the course of the cloud adoption plan, the resource organization may need to be
refactored using the guidance in the Govern methodology.
Governance disciplines
This implementation option doesn't implement any governance tooling. In the absence of defined policy
automation, this landing zone should not be used for any mission critical workloads or sensitive data. It is
assumed that this landing zone is being used for limited production deployment to initiate learning, iteration, and
development of the overall operating model in parallel to these early stage migration efforts.
To accelerate parallel development of governance disciplines, review the Govern methodology and consider
deploying the CAF Foundation blueprint in addition to the CAF Migration landing zone blueprint.
WARNING
As the governance disciplines mature, refactoring may be required. Specifically, resources may later need to be moved to a
new subscription or resource group.
Operations baseline
This implementation option doesn't implement any operations. In the absence of a defined operations baseline,
this landing zone should not be used for any mission critical workloads or sensitive data. It is assumed that this
landing zone is being used for limited production deployment to initiate learning, iteration, and development of
the overall operating model in parallel to these early stage migration efforts.
To accelerate parallel development of an operations baseline, review the Manage methodology and consider
deploying the Azure server management guide.
WARNING
As the operations baseline is developed, refactoring may be required. Specifically, resources may later need to be moved to
a new subscription or resource group.
Assumptions
This initial landing zone includes the following assumptions or constraints. If these assumptions align with your
constraints, you can use the blueprint to create your first landing zone. The blueprint also can be extended to
create a landing zone blueprint that meets your unique constraints.
Subscription limits: This adoption effort isn't expected to exceed subscription limits.
Compliance: No third-party compliance requirements are needed in this landing zone.
Architectural complexity: Architectural complexity doesn't require additional production subscriptions.
Shared ser vices: No existing shared services in Azure require this subscription to be treated like a spoke in a
hub and spoke architecture.
Limited production scope: This landing zone could potentially host production workloads. It is not a
suitable environment for sensitive data or mission-critical workloads.
If these assumptions align with your current adoption needs, then this blueprint might be a starting point for
building your landing zone.
Decisions
The following decisions are represented in the landing zone blueprint.
COMPONENT | DECISIONS | ALTERNATIVE APPROACHES
Migration tools | Azure Site Recovery will be deployed and an Azure Migrate project will be created. | Migration tools decision guide
Identity | It's assumed that the subscription is already associated with an Azure Active Directory instance. | Identity management best practices
Subscription design | N/A - designed for a single production subscription. | Create initial subscriptions
Management groups | N/A - designed for a single production subscription. | Organize and manage subscriptions
Naming and tagging standards | N/A | Naming and tagging best practices
Next steps
After deploying your first landing zone, you're ready to expand your landing zone.
Expand your landing zone
Deploy a CAF Foundation blueprint in Azure
10/30/2020 • 4 minutes to read
The CAF Foundation blueprint does not deploy a landing zone. Instead, it deploys the tools required to establish a
governance MVP (minimum viable product) to begin developing your governance disciplines. This blueprint is
designed to be additive to an existing landing zone and can be applied to the CAF Migration landing zone
blueprint with a single action.
Design principles
This implementation option provides an opinionated approach to the common design areas shared by all Azure
landing zones. See the assumptions and decisions below for additional technical detail.
Deployment options
This implementation option deploys an MVP to serve as the foundation for your governance disciplines. The team
will follow a modular refactoring-based approach to mature the governance disciplines using the Govern
methodology.
Enterprise enrollment
This implementation option does not take an inherent position on enterprise enrollment. This approach is
designed to be applicable to customers regardless of contractual agreements with Microsoft or Microsoft partners.
Prior to deployment of this implementation option, it's assumed that the customer has already created a target
subscription.
Identity
This implementation option assumes that the target subscription is already associated with an Azure Active
Directory instance in accordance with identity management best practices.
Network topology and connectivity
This implementation option assumes the landing zone already has a defined network topology in accordance with
network security best practices.
Resource organization
This implementation option demonstrates how Azure Policy can add some elements of resource organization
through the application of tags. Specifically, a CostCenter tag will be appended to resources using Azure Policy.
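The policy rule sketched below illustrates the kind of append effect described above. It is not the blueprint's exact definition; the parameter name and default value are placeholders.

```python
# Illustrative Azure Policy rule, expressed as a Python dict, that appends a
# CostCenter tag when a resource is created without one. Parameter names and
# the default value are placeholders, not the blueprint's actual definition.
append_cost_center_rule = {
    "if": {"field": "tags['CostCenter']", "exists": "false"},
    "then": {
        "effect": "append",
        "details": [
            {"field": "tags['CostCenter']", "value": "[parameters('costCenterValue')]"}
        ],
    },
}

policy_parameters = {
    "costCenterValue": {
        "type": "String",
        "metadata": {"displayName": "Cost center tag value"},
        "defaultValue": "unassigned",
    }
}
```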
The governance team should compare and contrast the elements of resource organization to be addressed by
tagging versus those that should be addressed through subscription design. These fundamental decisions will
inform resource organization as your cloud adoption plans progress.
To aid in this comparison early in adoption cycles, the following articles should be considered:
Initial Azure subscriptions: At this stage of adoption scale, does your operating model require two, three, or four
subscriptions?
Scale subscriptions: As adoption scales, what criteria will be used to drive subscription scaling?
Organize subscriptions: How will you organize subscriptions as you scale?
Tagging standards: What other criteria need to be consistently captured in tags to augment your subscription
design?
To aid in this comparison when teams are further along with cloud adoption, see the governance patterns section of
the governance guide - prescriptive guidance article. This section of the prescriptive guidance demonstrates a set
of patterns based on a specific narrative and operating model. That guidance also includes links to other patterns
that should be considered.
Governance disciplines
This implementation demonstrates one approach to maturity in the Cost Management discipline of the Govern
methodology. Specifically, it demonstrates how Azure Policy can be used to create an allow list of specific SKUs.
Limiting the types and sizes of resources that can be deployed into a landing zone reduces the risk of
overspending.
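The following Terraform sketch is a hedged example of this pattern, assigning the built-in allowed-SKU policy at subscription scope. The SKU list, the assignment name, and the built-in definition's display name and parameter name are assumptions to verify against your environment; this is not an artifact of the blueprint itself.

# Hypothetical sketch: restrict deployable VM sizes with the built-in
# "Allowed virtual machine size SKUs" policy. SKU list and scope are examples.
data "azurerm_subscription" "current" {}

data "azurerm_policy_definition" "allowed_vm_skus" {
  display_name = "Allowed virtual machine size SKUs"
}

resource "azurerm_policy_assignment" "allowed_vm_skus" {
  name                 = "allowed-vm-skus"
  scope                = data.azurerm_subscription.current.id
  policy_definition_id = data.azurerm_policy_definition.allowed_vm_skus.id

  # Parameter name as defined by the built-in policy; confirm it in your tenant.
  parameters = jsonencode({
    "listOfAllowedSKUs" = {
      "value" = ["Standard_B2s", "Standard_D2s_v3"]
    }
  })
}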
To accelerate parallel development of the other governance disciplines, review the Govern methodology. To
continue maturing the Cost Management discipline of governance, see the Cost Management discipline guidance.
WARNING
As the governance disciplines mature, refactoring may be required. Specifically, resources may later need to be moved
to a new subscription or resource group.
Operations baseline
This implementation option does not implement any aspects of the operations baseline. In the absence of a
defined operations baseline, this landing zone should not be used for any mission-critical workloads or sensitive
data. It is assumed that this landing zone is being used for limited production deployments to initiate learning,
iteration, and development of the overall operating model in parallel to these early-stage migration efforts.
To accelerate parallel development of an operations baseline, review the Manage methodology and consider
deploying the Azure server management guide.
WARNING
As the operations baseline is developed, refactoring may be required. Specifically, resources may later need to be moved to a
new subscription or resource group.
Assumptions
This initial blueprint assumes that the team is committed to maturing governance capabilities in parallel to the
initial cloud migration efforts. If these assumptions align with your constraints, you can use the blueprint to begin
the process of developing governance maturity.
Compliance: No third-party compliance requirements are needed in this landing zone.
Limited production scope: This landing zone could potentially host production workloads. It is not a suitable
environment for sensitive data or mission-critical workloads.
If these assumptions align with your current adoption needs, then this blueprint might be a starting point for
building your landing zone.
Customize or deploy this blueprint
Learn more and download a reference sample of the CAF Foundation blueprint for deployment or customization
from the Azure blueprint samples.
Deploy the blueprint sample
Next steps
After deploying your first landing zone, you're ready to expand your landing zone.
Expand your landing zone
Use Terraform to build your landing zones
Azure provides native services for deploying your landing zones. Third-party tools can also help with this
effort. One such tool that customers and partners often use to deploy landing zones is Terraform by HashiCorp.
This section shows how to use a sample landing zone to deploy foundational governance, accounting, and security
capabilities for an Azure subscription.
Architecture diagram
The first landing zone deploys the following components in your subscription:
Figure 1: A foundation landing zone using Terraform.
Capabilities
The components deployed and their purpose include the following:
Diagnostics logging: All operation logs are kept for a specific number of days in a storage account and in Event Hubs.
Log Analytics: Stores the operation logs. Common solutions are deployed for a deep review of application best practices: NetworkMonitoring, AdAssessment, AdReplication, AgentHealthAssessment, DnsAnalytics, and KeyVaultAnalytics.
Azure Security Center: Security hygiene metrics and alerts sent to email and phone number.
Assumptions
The following assumptions or constraints were considered when this initial landing zone was defined. If these
assumptions align with your constraints, you can use the blueprint to create your first landing zone. The blueprint
also can be extended to create a landing zone blueprint that meets your unique constraints.
Subscription limits: This adoption effort is unlikely to exceed subscription limits. Two common indicators are
an excess of 25,000 VMs or 10,000 vCPUs.
Compliance: No third-party compliance requirements are needed for this landing zone.
Architectural complexity: Architectural complexity doesn't require additional production subscriptions.
Shared services: No existing shared services in Azure require this subscription to be treated like a spoke in a
hub and spoke architecture.
If these assumptions match your current environment, this blueprint might be a good way to start building your
landing zone.
Design decisions
The following decisions are represented in the CAF Terraform modules:
Identity: It's assumed that the subscription is already associated with an Azure Active Directory instance. (See: Identity management best practices.)
Subscription design: N/A - designed for a single production subscription. (See: Create initial subscriptions.)
Naming standards: When the environment is created, a unique prefix is also created. Resources that require a globally unique name (such as storage accounts) use this prefix, with a random suffix appended to the custom name. Tag usage is mandated as described under Tagging standards below. (See: Naming and tagging best practices.)
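For illustration, the following sketch (not taken from the CAF Terraform modules themselves) shows the prefix-plus-random-suffix naming pattern described above; the prefix value, resource group, and region are assumptions.

# Hypothetical sketch of the prefix-plus-random-suffix naming pattern.
# The prefix value, resource group name, and location are illustrative.
resource "random_string" "suffix" {
  length  = 4
  upper   = false
  special = false
}

locals {
  prefix = "caf" # assumed environment prefix
  name   = "${local.prefix}diag${random_string.suffix.result}"
}

resource "azurerm_resource_group" "ops" {
  name     = "${local.prefix}-hub-operations"
  location = "southeastasia"
}

# Storage accounts need globally unique, lowercase alphanumeric names.
resource "azurerm_storage_account" "diag" {
  name                     = local.name
  resource_group_name      = azurerm_resource_group.ops.name
  location                 = azurerm_resource_group.ops.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
}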
Tagging standards
The minimum set of tags, defined in the tags_hub variable later in this article, must be present on all resources and resource groups. The configuration begins by defining the resource groups for the hub:
resource_groups_hub = {
HUB-CORE-SEC = {
name = "-hub-core-sec"
location = "southeastasia"
}
HUB-OPERATIONS = {
name = "-hub-operations"
location = "southeastasia"
}
}
Next, we specify the regions where the foundation can be deployed. Here, southeastasia is used to deploy all the
resources.
location_map = {
region1 = "southeastasia"
region2 = "eastasia"
}
Then, we specify the retention period for the operations logs and the Azure subscription logs. This data is stored in
separate storage accounts and an event hub, whose names are randomly generated because they must be unique.
azure_activity_logs_retention = 365
azure_diagnostics_logs_retention = 60
In tags_hub, we specify the minimum set of tags that are applied to all resources created.
tags_hub = {
environment = "DEV"
owner = "Arnaud"
deploymentType = "Terraform"
costCenter = "65182"
BusinessUnit = "SHARED"
DR = "NON-DR-ENABLED"
}
Then, we specify the Log Analytics workspace name and a set of solutions that analyze the deployment. Here, we retained
network monitoring, Active Directory assessment and replication, agent health assessment, DNS analytics, and Key Vault analytics.
analytics_workspace_name = "lalogs"
solution_plan_map = {
NetworkMonitoring = {
"publisher" = "Microsoft"
"product" = "OMSGallery/NetworkMonitoring"
},
ADAssessment = {
"publisher" = "Microsoft"
"product" = "OMSGallery/ADAssessment"
},
ADReplication = {
"publisher" = "Microsoft"
"product" = "OMSGallery/ADReplication"
},
AgentHealthAssessment = {
"publisher" = "Microsoft"
"product" = "OMSGallery/AgentHealthAssessment"
},
DnsAnalytics = {
"publisher" = "Microsoft"
"product" = "OMSGallery/DnsAnalytics"
},
KeyVaultAnalytics = {
"publisher" = "Microsoft"
"product" = "OMSGallery/KeyVaultAnalytics"
}
}
Take action
After you've reviewed the configuration, you can deploy it as you would deploy any Terraform
environment. We recommend that you use the rover, a Docker container that allows deployment from
Windows, Linux, or macOS. You can get started with the landing zones.
Next steps
The foundation landing zone lays the groundwork for a complex environment in a decomposed manner. This
edition provides a set of simple capabilities that can be extended by adding other modules to the blueprint or
layering additional landing zones on top of it.
Layering your landing zones is a good practice for decoupling systems, versioning each component that you're
using, and allowing fast innovation and stability for your infrastructure as code deployment.
Future reference architectures will demonstrate this concept for a hub and spoke topology.
Review the sample foundation Terraform landing zone
Expand your landing zone
This section of the Ready methodology builds on the principles of landing zone refactoring. As outlined in that
article, a refactoring approach to infrastructure as code removes blockers to business success while minimizing
risk. This series of articles assumes that you've deployed your first landing zone and would now like to expand
that landing zone to meet enterprise requirements.
WARNING
Adoption teams who have a midterm objective (within 24 months) to host more than 1,000 assets (apps,
infrastructure, or data assets) in the cloud should consider each of these expansions early in their cloud adoption
journey. For all other adoption patterns, landing zone expansions could be a parallel iteration, allowing for early business
success.
Next steps
Before refactoring your first landing zone, it is important to understand test-driven development of landing
zones.
Test-driven development of landing zones
Landing zone considerations
A landing zone is the basic building block of any cloud adoption environment. The term landing zone refers to an
environment that's been provisioned and prepared to host workloads in a cloud environment like Azure. A fully
functioning landing zone is the final deliverable of any iteration of the Cloud Adoption Framework's Ready
methodology.
Hosting considerations
All landing zones provide structure for hosting options. The structure is created explicitly through governance
controls or organically through the adoption of services within the landing zone. The following articles can help
you make decisions that will be reflected in the blueprint or other automation scripts that create your landing zone:
Compute decisions: To minimize operational complexity, align compute options with the purpose of the landing
zone. This decision can be enforced by using automation toolchains like Azure Policy initiatives and landing
zones.
Storage decisions: Choose the right Azure Storage solution to support your workload requirements.
Networking decisions: Choose networking services, tools, and architectures to support your organization's
workload, governance, and connectivity requirements.
Database decisions: Determine which database technology is best suited for your workload requirements.
Azure fundamentals
Each landing zone is part of a broader solution for organizing resources across a cloud environment. Azure
fundamentals are the foundational building blocks for an organization.
Azure fundamental concepts: Learn fundamental concepts and terms that are used to organize resources in
Azure and how the concepts relate to one another.
Resource consistency decision guide: When you understand each of the fundamentals, the resource
organization decision guide can help you make decisions that shape the landing zone.
Governance considerations
The Cloud Adoption Framework's Govern methodology establishes a process for governing the environment as a
whole. Many use cases might require you to make governance decisions on a per-landing-zone basis. In many
scenarios, governance baselines are enforced on a per-landing-zone basis even though the baselines are
established holistically. This is especially true for the first few landing zones that an organization deploys.
The following articles can help you make governance-related decisions about your landing zone. You can factor
each decision into your governance baselines.
Cost requirements. Based on an organization's motivation for cloud adoption and operational commitments
made about its environment, various cost management configurations might need to be changed for the
landing zone.
Monitoring decisions. Depending on the operational requirements for a landing zone, various monitoring
tools can be deployed. The monitoring decisions article can help you determine the most appropriate tools to
deploy.
Role-based access control. Azure role-based access control (RBAC) offers fine-grained, group-based access
management for resources that are organized around user roles.
Policy decisions. Azure Blueprints samples provide premade compliance blueprints, each with predefined
policy initiatives. Policy decisions help inform a selection of the best blueprint or policy initiative based on your
requirements and constraints.
Create hybrid cloud consistency. Create hybrid cloud solutions that give your organization the benefits of
cloud innovation while maintaining many of the conveniences of on-premises management.
Review your compute options
Determining the compute requirements for hosting your workloads is a key consideration as you prepare for your
cloud adoption. Azure compute products and services support a wide variety of workload computing scenarios
and capabilities. How you configure your landing zone environment to support your compute requirements
depends on your workload's governance, technical, and business requirements.
NOTE
Learn more about how to assess compute options for each of your applications or services in the Azure application
architecture guide.
Key questions
Answer the following questions about your workloads to help you make decisions based on the Azure compute
services decision tree:
Are you building net-new applications and services or migrating from existing on-premises
workloads? Developing new applications as part of your cloud adoption efforts allows you to take full
advantage of modern cloud-based hosting technologies from the design phase moving forward.
If you're migrating existing workloads, can they take advantage of modern cloud technologies?
Migrating on-premises workloads requires analysis. Can you easily optimize existing applications and services
to take advantage of modern cloud technologies, or will a lift-and-shift approach work better for your
workloads?
Can your applications or services take advantage of containers? If your applications are good
candidates for containerized hosting, you can take advantage of the resource efficiency, scalability, and
orchestration capabilities provided by container services in Azure. Both Azure managed disks and Azure Files
can be used for persistent storage in containerized applications.
Are your applications web- or API-based, and do they use PHP, ASP.NET, Node.js, or similar
technologies? Web apps can be deployed to managed Azure App Service instances, so you don't have to
maintain virtual machines for hosting purposes.
Will you require full control over the OS and hosting environment of your workload? If you need to
control the hosting environment, including OS, disks, locally running software, and other configurations, you
can use Azure Virtual Machines to host your applications and services. In addition to choosing your virtual
machine sizes and performance tiers, your decisions regarding virtual disk storage will affect performance and
SLAs related to your infrastructure-as-a-service workloads. For more information, see the Azure disk storage
documentation.
Will your workload involve high-performance computing (HPC) capabilities? Azure Batch provides
job scheduling and autoscaling of compute resources as a platform service, so it's easy to run large-scale
parallel and HPC applications in the cloud.
Will your applications use a microservices architecture? Applications that use a microservices-based
architecture can take advantage of several optimized compute technologies. Self-contained, event-driven
workloads can use Azure Functions to build scalable, serverless applications that don't need an infrastructure.
For applications that require more control over the environment where microservices run, you can use
container services like Azure container instances, Azure Kubernetes Service, and Azure Service Fabric.
NOTE
Most Azure compute services are used in combination with Azure Storage. Consult the storage decisions guidance for
related storage decisions.
Scenario: I need to provision Linux and Windows virtual machines in seconds with the configurations of my choice. Compute service: Azure Virtual Machines.
Scenario: I need to achieve high availability by autoscaling to create thousands of VMs in minutes. Compute service: Virtual machine scale sets.
Scenario: I want to simplify the deployment, management, and operations of Kubernetes. Compute service: Azure Kubernetes Service (AKS).
Scenario: I want to quickly create cloud apps for web and mobile by using a fully managed platform. Compute service: Azure App Service.
Scenario: I want to containerize apps and easily run containers by using a single command. Compute service: Azure Container Instances.
Scenario: I need to create highly available, scalable cloud applications and APIs that can help me focus on apps instead of hardware. Compute service: Azure Cloud Services.
Regional availability
Azure lets you deliver services at the scale you need to reach your customers and partners wherever they are. A
key factor in planning your cloud deployment is to determine which Azure region will host your workload
resources.
Some compute options such as Azure App Service are generally available in most Azure regions while other
compute services are supported only in certain regions. Some virtual machine types and their associated storage
types have limited regional availability. Before you decide the regions to which you will deploy your compute
resources, we recommend that you refer to the regions page to check the latest status of regional availability.
To learn more about the Azure global infrastructure, see the Azure regions page. You can also view products
available by region for specific details about the overall services that are available in each Azure region.
Designing and implementing Azure networking capabilities is a critical part of your cloud adoption efforts. You'll
need to make networking design decisions to properly support the workloads and services that will be hosted in
the cloud. Azure networking products and services support a wide variety of networking capabilities. How you
structure these services and the networking architectures you choose depends on your organization's workload,
governance, and connectivity requirements.
Scenario: I need to balance inbound and outbound connections and requests to my applications or services. Networking service: Azure Load Balancer.
Scenario: I want to optimize delivery from application server farms while increasing application security with a web application firewall. Networking service: Azure Application Gateway; Azure Front Door.
Scenario: I need to securely use the internet to access Azure Virtual Network through high-performance VPN gateways. Networking service: Azure VPN gateway.
Scenario: I need to accelerate the delivery of high-bandwidth content to customers worldwide, from applications and stored content to streaming video. Networking service: Azure Content Delivery Network (CDN).
Scenario: I need to protect my Azure applications from DDoS attacks. Networking service: Azure DDoS protection.
Scenario: I need to distribute traffic optimally to services across global Azure regions, while providing high availability and responsiveness. Networking service: Azure Traffic Manager; Azure Front Door.
Scenario: I need native firewall capabilities, with built-in high availability, unrestricted cloud scalability, and zero maintenance. Networking service: Azure Firewall.
Scenario: I need to connect business offices, retail locations, and sites securely. Networking service: Azure Virtual WAN.
Scenario: I need a scalable, security-enhanced delivery point for global microservices-based web applications. Networking service: Azure Front Door.
Scenario: You need to deploy and manage a large number of VMs and workloads, potentially exceeding Azure subscription limits; you need to share services across subscriptions; or you need a more segmented structure for role, application, or permission segregation. Networking service: Hub and spoke architecture.
Scenario: You have many branch offices that need to connect to each other and to Azure. Networking service: Azure Virtual WAN.
Storage capabilities are critical for supporting workloads and services that are hosted in the cloud. As part of your
cloud adoption readiness preparations, review this article to help you plan for and address your storage needs.
Scenario: I have bare-metal servers or VMs (Hyper-V or VMware) with direct-attached storage running LOB applications. Suggested Azure services: Azure disk storage (premium SSD). Considerations: For production services, the premium SSD option provides consistent low latency coupled with high IOPS and throughput.
Scenario: I have servers that will host web and mobile apps. Suggested Azure services: Azure disk storage (standard SSD). Considerations: Standard SSD IOPS and throughput might be sufficient (at a lower cost than premium SSD) for CPU-bound web and app servers in production.
Scenario: I have an enterprise SAN or all-flash array. Suggested Azure services: Azure disk storage (premium or ultra SSD); Azure NetApp Files. Considerations: Ultra SSD is NVMe-based and offers submillisecond latency with high IOPS and bandwidth, and it is scalable up to 64 TB. The choice of premium SSD and ultra SSD depends on peak latency, IOPS, and scalability requirements.
Scenario: I have high-availability (HA) clustered servers (such as SQL Server FCI or Windows Server failover clustering). Suggested Azure services: Azure Files (premium); Azure disk storage (premium or ultra SSD). Considerations: Clustered workloads require multiple nodes to mount the same underlying shared storage for failover or HA. Premium file shares offer shared storage that's mountable via SMB. Shared block storage can also be configured on premium SSD or ultra SSD by using partner solutions.
Scenario: I have a relational database or data warehouse workload (such as SQL Server or Oracle). Suggested Azure services: Azure disk storage (premium or ultra SSD). Considerations: The choice of premium SSD versus ultra SSD depends on peak latency, IOPS, and scalability requirements. Ultra SSD also reduces complexity by removing the need for storage pool configuration for scalability.
Scenario: I have a NoSQL cluster (such as Cassandra or MongoDB). Suggested Azure services: Azure disk storage (premium SSD). Considerations: The Azure disk storage premium SSD offering provides consistent low latency coupled with high IOPS and throughput.
Scenario: I am running containers with persistent volumes. Suggested Azure services: Azure Files (standard or premium); Azure disk storage (standard, premium, or ultra SSD). Considerations: File (RWX) and block (RWO) volume driver options are available for both Azure Kubernetes Service (AKS) and custom Kubernetes deployments. Persistent volumes can map to either an Azure disk storage disk or a managed Azure Files share. Choose premium versus standard options based on workload requirements for persistent volumes.
Scenario: I have a data lake (such as a Hadoop cluster for HDFS data). Suggested Azure services: Azure Data Lake Storage Gen2; Azure disk storage (standard or premium SSD). Considerations: The Data Lake Storage Gen2 feature of Azure Blob storage provides server-side HDFS compatibility and petabyte scale for parallel analytics. It also offers HA and reliability. Software like Cloudera can use premium or standard SSD on master/worker nodes, if needed.
Scenario: I have an SAP or SAP HANA deployment. Suggested Azure services: Azure disk storage (premium or ultra SSD). Considerations: Ultra SSD is optimized to offer submillisecond latency for tier-1 SAP workloads. Ultra SSD is now in preview. Premium SSD coupled with M-series VMs offers a general-availability option.
Scenario: I have a disaster recovery site with strict RPO/RTO that syncs from my primary servers. Suggested Azure services: Azure page blobs. Considerations: Azure page blobs are used by replication software to enable low-cost replication to Azure without the need for compute VMs until failover occurs. For more information, see the Azure disk storage documentation. Note: Page blobs support a maximum of 8 TB.
Scenario: I use Windows file server. Suggested Azure services: Azure Files; Azure File Sync. Considerations: With Azure File Sync, you can store rarely used data on cloud-based Azure file shares while caching your most frequently used files on-premises for fast, local access. You can also use multisite sync to keep files in sync across multiple servers. If you plan to migrate your workloads to a cloud-only deployment, Azure Files might be sufficient.
Scenario: I have an enterprise NAS (such as NetApp Files or Dell-EMC Isilon). Suggested Azure services: Azure NetApp Files; Azure Files (premium). Considerations: If you have an on-premises deployment of NetApp, consider using Azure NetApp Files to migrate your deployment to Azure. If you're using or migrating to a Windows or Linux server, or you have basic file-share needs, consider using Azure Files. For continued on-premises access, use Azure File Sync to sync Azure file shares with on-premises file shares by using a cloud-tiering mechanism.
Scenario: I have a file share (SMB or NFS). Suggested Azure services: Azure Files (standard or premium); Azure NetApp Files. Considerations: The choice of premium versus standard Azure Files tiers depends on IOPS, throughput, and your need for latency consistency. If you have an on-premises deployment of NetApp, consider using Azure NetApp Files. If you need to migrate your access control lists (ACLs) and timestamps to the cloud, Azure File Sync can bring all these settings to your Azure file shares as a convenient migration path.
Scenario: I have an on-premises object storage system for petabytes of data (such as Dell-EMC ECS). Suggested Azure services: Azure Blob storage. Considerations: Azure Blob storage provides premium, hot, cool, and archive tiers to match your workload performance and cost needs.
Scenario: I have a DFSR deployment or another way of handling branch offices. Suggested Azure services: Azure Files; Azure File Sync. Considerations: Azure File Sync offers multisite sync to keep files in sync across multiple servers and native Azure file shares in the cloud. Move to a fixed storage footprint on-premises by using cloud tiering. Cloud tiering transforms your server into a cache for the relevant files while scaling cold data in Azure file shares.
Scenario: I have a tape library (either on-premises or offsite) for backup and disaster recovery or long-term data retention. Suggested Azure services: Azure Blob storage (cool or archive tiers). Considerations: An Azure Blob storage archive tier will have the lowest possible cost, but it might require hours to copy the offline data to a cool, hot, or premium tier of storage to allow access. Cool tiers provide instantaneous access at low cost.
Scenario: I have file or object storage configured to receive my backups. Suggested Azure services: Azure Blob storage (cool or archive tiers); Azure File Sync. Considerations: To back up data for long-term retention with lowest-cost storage, move data to Azure Blob storage and use cool and archive tiers. To enable fast disaster recovery for file data on a server (on-premises or in an Azure VM), sync shares to individual Azure file shares by using Azure File Sync. With Azure file share snapshots, you can restore earlier versions and sync them back to connected servers or access them natively in the Azure file share.
Scenario: I run data replication to a disaster recovery site. Suggested Azure services: Azure Files; Azure File Sync. Considerations: Azure File Sync removes the need for a disaster recovery server and stores files in native Azure SMB shares. Fast disaster recovery rebuilds any data on a failed on-premises server quickly. You can even keep multiple server locations in sync or use cloud tiering to store only relevant data on-premises.
Scenario: I manage data transfer in disconnected scenarios. Suggested Azure services: Azure Data Box Edge or Azure Data Box Gateway. Considerations: Using Data Box Edge or Data Box Gateway, you can copy data in disconnected scenarios. When the gateway is offline, it saves all files you copy in the cache, then uploads them when you're connected.
Scenario: I manage an ongoing data pipeline to the cloud. Suggested Azure services: Azure Data Box Edge or Azure Data Box Gateway. Considerations: Move data to the cloud from systems that are constantly generating data just by having them copy that data straight to the storage gateway. If they need to access that data later, it's right there where they put it.
Scenario: I have bursts of quantities of data that arrive at the same time. Suggested Azure services: Azure Data Box Edge or Azure Data Box Gateway. Considerations: Manage large quantities of data that arrive at the same time, like when an autonomous car pulls back into the garage, or a gene-sequencing machine finishes its analysis. Copy all that data to Data Box Gateway at fast local speeds, and then let the gateway upload it as your network allows.
Scenario: I need to support "burst compute" - NFS/SMB read-heavy, file-based workloads with data assets that reside on-premises while computation runs in the cloud. Suggested Azure services: Avere vFXT for Azure. Considerations: IaaS scale-out NFS/SMB file caching.
Scenario: I need to move file shares that aren't Windows Server or NetApp to the cloud. Suggested Azure services: Azure Files; Azure NetApp Files. Considerations: Protocol support, regional availability, performance requirements, snapshot and clone capabilities, and price sensitivity.
Azure Blob storage Azure Blob storage is Microsoft's object storage solution for
the cloud. Blob storage is optimized for storing massive
amounts of unstructured data. Unstructured data is data that
doesn't adhere to a specific data model or definition, such as
text or binary data.
Azure Data Lake Storage Gen2 Blob storage supports Azure Data Lake Storage Gen2,
Microsoft's enterprise big data analytics solution for the cloud.
Azure Data Lake Storage Gen2 offers a hierarchical file system
as well as the advantages of Blob storage, including low-cost,
tiered storage; high availability; strong consistency; and
disaster recovery capabilities.
Azure disk storage Azure disk storage offers persistent, high-performance block
storage to power Azure Virtual Machines. Azure disks are
highly durable, secure, and offer the industry's only single-
instance SLA for VMs that use premium or ultra SSDs. Azure
disks provide high availability with availability sets and
availability zones that map to your Azure Virtual Machines
fault domains. In addition, Azure disks are managed as a top-
level resource in Azure. Azure Resource Manager capabilities
like role-based access control (RBAC), policy, and tagging are
provided by default.
Azure Files Azure Files provides fully managed, native SMB file shares,
without the need to run a VM. You can mount an Azure Files
share as a network drive to any Azure VM or on-premises
machine.
Azure file sync Azure file sync can be used to centralize your organization's
file shares in Azure Files, while keeping the flexibility,
performance, and compatibility of an on-premises file server.
Azure file sync transforms Windows server into a quick cache
of your Azure file share.
Azure NetApp Files The Azure NetApp Files service is an enterprise-class, high-
performance, metered file storage service. Azure NetApp Files
supports any workload type and is highly available by default.
You can select service and performance levels and set up
snapshots through the service.
Azure Data Box Edge Azure Data Box Edge is an on-premises network device that
moves data into and out of Azure. Data Box Edge has AI-
enabled edge compute to preprocess data during upload.
Data Box Gateway is a virtual version of the device but with
the same data transfer capabilities.
Azure Data Box Gateway Azure Data Box Gateway is a storage solution that enables
you to seamlessly send data to Azure. Data Box Gateway is a
virtual device based on a virtual machine provisioned in your
virtualized environment or hypervisor. The virtual device
resides on-premises and you write data to it by using the NFS
and SMB protocols. The device then transfers your data to
Azure block blobs or Azure page blobs, or to Azure Files.
Avere vFXT for Azure Avere vFXT for Azure is a filesystem caching solution for data-
intensive high-performance computing (HPC) tasks. Take
advantage of cloud computing's scalability to make your data
accessible when and where it's needed, even for data that's
stored in your own on-premises hardware.
Security
To help you protect your data in the cloud, Azure Storage offers several best practices for data security and
encryption for data at rest and in transit. You can:
Secure the storage account by using RBAC and Azure AD.
Secure data in transit between an application and Azure by using client-side encryption, HTTPS, or SMB 3.0.
Set data to be automatically encrypted when it's written to Azure Storage by using storage service encryption.
Grant delegated access to the data objects in Azure Storage by using shared access signatures.
Use analytics to track the authentication method that someone is using when they access storage in Azure.
These security features apply to Azure Blob storage (block and page) and to Azure Files. Get detailed storage
security guidance in the Azure Storage security guide.
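As a minimal sketch of the transport-security practices above, and assuming a preexisting resource group named example-rg, the following Terraform configuration enforces HTTPS-only access and a minimum TLS version on a storage account. It is illustrative rather than a complete security baseline, and argument names can differ across azurerm provider versions.

# Hypothetical sketch: secure-transfer settings on a storage account.
# Resource names and the region are assumptions for illustration.
resource "azurerm_storage_account" "secure" {
  name                     = "examplesecuresa001"
  resource_group_name      = "example-rg"
  location                 = "eastus2"
  account_tier             = "Standard"
  account_replication_type = "GRS"

  # Require encrypted transport between clients and the service.
  enable_https_traffic_only = true
  min_tls_version           = "TLS1_2"
}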
Storage service encryption provides encryption at rest and safeguards your data to meet your organization's
security and compliance commitments. Storage service encryption is enabled by default for all managed disks,
snapshots, and images in all the Azure regions. Starting June 10, 2017, all new managed disks, snapshots, images,
and new data written to existing managed disks are automatically encrypted at rest with keys managed by
Microsoft. For more information, see the FAQ for managed disks.
Azure Disk Encryption allows you to encrypt managed disks that are attached to IaaS VMs as OS and data disks at
rest and in transit by using your keys stored in Azure Key Vault. For Windows, the drives are encrypted by using
industry-standard BitLocker encryption technology. For Linux, the disks are encrypted by using the dm-crypt
subsystem. The encryption process is integrated with Azure Key Vault to allow you to control and manage the disk
encryption keys. For more information, see Azure Disk Encryption for Windows and Linux IaaS VMs.
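The following sketch, with assumed names and a placeholder resource group, shows a Key Vault configured so it can serve Azure Disk Encryption; enabling encryption on the VM itself (for example, through the Azure Disk Encryption VM extension or the Azure CLI) is a separate step that isn't shown.

# Hypothetical sketch: Key Vault prepared for Azure Disk Encryption.
# Names, region, and resource group are illustrative assumptions.
data "azurerm_client_config" "current" {}

resource "azurerm_key_vault" "ade" {
  name                = "example-ade-kv"
  resource_group_name = "example-rg"
  location            = "eastus2"
  tenant_id           = data.azurerm_client_config.current.tenant_id
  sku_name            = "standard"

  # Allow the platform to retrieve secrets and keys for disk encryption.
  enabled_for_disk_encryption = true
}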
Regional availability
You can use Azure to deliver services at the scale that you need to reach your customers and partners wherever
they are. The managed disks and Azure Storage regional availability pages show the regions where these
services are available. Checking the regional availability of a service beforehand can help you make the right
decision for your workload and customer needs.
Managed disks are available in all Azure regions that have premium SSD and standard SSD offerings. Although
ultra SSD currently is in public preview, it's offered in only one availability zone, the East US 2 region. Verify the
regional availability when you plan mission-critical, top-tier workloads that require ultra SSD.
Hot and cool Blob storage, Data Lake Storage Gen2, and Azure Files storage are available in all Azure regions.
Archival Blob storage, premium file shares, and premium block Blob storage are limited to certain regions. We
recommend that you refer to the regions page to check the latest status of regional availability.
To learn more about Azure global infrastructure, see the Azure regions page. You can also consult the products
available by region page for specific details about what's available in each Azure region.
When you prepare your landing zone environment for your cloud adoption, you need to determine the data
requirements for hosting your workloads. Azure database products and services support a wide variety of data
storage scenarios and capabilities. How you configure your landing zone environment to support your data
requirements depends on your workload governance, technical, and business requirements.
NOTE
Learn more about how to assess database options for each of your applications or services in the Azure application
architecture guide.
Scenario: I need a fully managed relational database that provisions quickly, scales on the fly, and includes built-in intelligence and security. Data service: Azure SQL Database.
Scenario: I need a fully managed, scalable MySQL relational database that has high availability and security built in at no extra cost. Data service: Azure Database for MySQL.
Scenario: I need a fully managed, scalable PostgreSQL relational database that has high availability and security built in at no extra cost. Data service: Azure Database for PostgreSQL.
Scenario: I plan to host enterprise SQL Server apps in the cloud and have full control over the server OS. Data service: SQL Server on virtual machines.
Scenario: I need a fully managed elastic data warehouse that has security at every level of scale at no extra cost. Data service: Azure SQL Data Warehouse.
Scenario: I need Data Lake Storage resources that are capable of supporting Hadoop clusters or HDFS data. Data service: Azure Data Lake.
Scenario: I need high throughput and consistent, low-latency access for my data to support fast, scalable applications. Data service: Azure Cache for Redis.
Scenario: I need a fully managed, scalable MariaDB relational database that has high availability and security built in at no extra cost. Data service: Azure Database for MariaDB.
Regional availability
Azure lets you deliver services at the scale you need to reach your customers and partners, wherever they are. A
key factor in planning your cloud deployment is to determine which Azure region will host your workload resources.
Most database services are generally available in most Azure regions. But there are a few regions, mostly targeting
governmental customers, that support only a subset of these products. Before you decide which regions you will
deploy your database resources to, we recommend that you refer to the regions page to check the latest status of
regional availability.
To learn more about Azure global infrastructure, see the Azure regions page. You can also view products available
by region for specific details about the overall services that are available in each Azure region.
Group-based access rights and privileges are a good practice. Dealing with groups rather than individual users
simplifies maintenance of access policies, provides consistent access management across teams, and reduces
configuration errors. Assigning users to and removing users from appropriate groups helps keep a specific user's
privileges current. Azure role-based access control (RBAC) offers fine-grained access management for
resources organized around user roles.
For an overview of recommended RBAC practices as part of an identity and security strategy, see Azure identity
management and access control security best practices.
For detailed instructions for assigning users and groups to specific roles and assigning roles to scopes, see
Manage access to Azure resources using RBAC.
When planning your access control strategy, use a least-privilege access model that grants users only the
permissions required to perform their work. The following diagram shows a suggested pattern for using RBAC
through this approach.
NOTE
The more specific and detailed the permissions you define, the more likely it is that your access controls will become
complex and difficult to manage. This is especially true as your cloud estate grows in size. Avoid resource-specific
permissions. Instead, use management groups for enterprise-wide access control and resource groups for access control
within subscriptions. Also avoid user-specific permissions. Instead, assign access to groups in Azure AD.
A common example of a standard role is development, test, and operations (DevOps), which builds and deploys workload features and applications.
The breakdown of actions and permissions in these standard roles is often the same across your applications,
subscriptions, or entire cloud estate, even if these roles are performed by different people at different levels.
Accordingly, you can create a common set of RBAC role definitions to apply across different scopes within your
environment. Users and groups can then be assigned a common role, but only for the scope of resources, resource
groups, subscriptions, or management groups that they're responsible for managing.
For example, in a hub and spoke network topology with multiple subscriptions, you might have a common set of
role definitions for the hub and all workload spokes. A hub subscription's NetOps role can be assigned to
members of the organization's Central IT team, who are responsible for maintaining networking for shared
services used by all workloads. A workload spoke subscription's NetOps role can then be assigned to members of
that specific workload team, allowing them to configure networking within that subscription to best support their
workload requirements. The same role definition is used for both, but scope-based assignments ensure that users
have only the access that they need to perform their job.
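To make the pattern concrete, here is a hedged Terraform sketch in which a single custom NetOps role definition is assigned at two different subscription scopes. The subscription IDs, Azure AD group object IDs, and the action set are placeholders, not recommendations.

# Hypothetical sketch: one role definition, assigned at two different scopes.
# Subscription IDs and Azure AD group object IDs below are placeholders.
variable "hub_subscription_id" {
  default = "/subscriptions/00000000-0000-0000-0000-000000000001"
}

variable "spoke_subscription_id" {
  default = "/subscriptions/00000000-0000-0000-0000-000000000002"
}

resource "azurerm_role_definition" "netops" {
  name  = "NetOps"
  scope = var.hub_subscription_id

  permissions {
    actions     = ["Microsoft.Network/*"] # example action set for networking staff
    not_actions = []
  }

  # The role can be assigned in either subscription.
  assignable_scopes = [
    var.hub_subscription_id,
    var.spoke_subscription_id,
  ]
}

# Central IT group manages shared networking in the hub subscription.
resource "azurerm_role_assignment" "hub_netops" {
  scope              = var.hub_subscription_id
  role_definition_id = azurerm_role_definition.netops.role_definition_resource_id
  principal_id       = "11111111-1111-1111-1111-111111111111" # central IT Azure AD group
}

# The workload team gets the same role, but only in its spoke subscription.
resource "azurerm_role_assignment" "spoke_netops" {
  scope              = var.spoke_subscription_id
  role_definition_id = azurerm_role_definition.netops.role_definition_resource_id
  principal_id       = "22222222-2222-2222-2222-222222222222" # workload team Azure AD group
}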
Create hybrid cloud consistency
This article guides you through the high-level approaches for creating hybrid cloud consistency.
Hybrid deployment models during migration can reduce risk and contribute to a smooth infrastructure transition.
Cloud platforms offer the greatest level of flexibility when it comes to business processes. Many organizations are
hesitant to make the move to the cloud. Instead, they prefer to keep full control over their most sensitive data.
Unfortunately, on-premises servers don't allow for the same rate of innovation as the cloud. A hybrid cloud
solution offers the speed of cloud innovation and the control of on-premises management.
Figure 1: Creating hybrid cloud consistency across identity, management, security, data, development, and DevOps.
A true hybrid cloud solution must provide four components, each of which brings significant benefits:
Common identity for on-premises and cloud applications: This component improves user productivity
by giving users single sign-on (SSO) to all their applications. It also ensures consistency as applications and
users cross network or cloud boundaries.
Integrated management and security across your hybrid cloud: This component provides you with a
cohesive way to monitor, manage, and secure the environment, which enables increased visibility and control.
A consistent data platform for the datacenter and the cloud: This component creates data portability,
combined with seamless access to on-premises and cloud data services for deep insight into all data sources.
Unified development and DevOps across the cloud and on-premises datacenters: This component
allows you to move applications between the two environments as needed. Developer productivity improves
because both locations now have the same development environment.
Here are some examples of these components from an Azure perspective:
Azure Active Directory (Azure AD) works with on-premises Active Directory to provide common identity for all
users. SSO across on-premises environments and the cloud makes it simple for users to safely access the applications and
assets they need. Admins can manage security and governance controls and also have the flexibility to adjust
permissions without affecting the user experience.
Azure provides integrated management and security services for both cloud and on-premises infrastructure.
These services include an integrated set of tools that are used to monitor, configure, and protect hybrid clouds.
This end-to-end approach to management specifically addresses real-world challenges that face organizations
considering a hybrid cloud solution.
Azure hybrid cloud provides common tools that ensure secure access to all data, seamlessly and efficiently.
Azure data services combine with Microsoft SQL Server to create a consistent data platform. A consistent
hybrid cloud model allows users to work with both operational and analytical data. The same services are
provided on-premises and in the cloud for data warehousing, data analysis, and data visualization.
Azure cloud services, combined with Azure Stack on-premises, provide unified development and DevOps.
Consistency across the cloud and on-premises means that your DevOps team can build applications that run in
either environment and can easily deploy to the right location. You also can reuse templates across the hybrid
solution, which can further simplify DevOps processes.
Landing zone operations provide the initial foundation for operations management. As operations scale, these
improvements will refactor landing zones to meet growing operational excellence, reliability, and performance
requirements.
Next steps
Understand how to improve landing zone governance to support adoption at scale.
Improve landing zone governance
Improve landing zone governance
Landing zone governance is the smallest unit of overall governance. Establishing a sound governance foundation
within your first few landing zones will reduce the amount of refactoring required later in the adoption lifecycle.
Improving landing zone governance integrates cost controls, establishes basic tooling to allow for scale, and makes
it easier for the cloud governance team to deliver on the Five Disciplines of Cloud Governance.
Next steps
Cloud adoption will continue to expand with each wave or release of new workloads. To stay ahead of these
requirements, cloud platform teams should periodically review additional landing zone best practices.
Review additional landing zone best practices
Improve landing zone security
When a workload or the landing zone that hosts it requires access to any sensitive data or critical systems, it's
important to protect the data and assets. Improving landing zone security builds on the test-driven development
approach to landing zones by expanding or refactoring the landing zone to account for heightened security
requirements.
The Cloud Adoption Framework approaches cloud adoption as a self-service activity. The objective is to empower
each team that supports adoption through standardized approaches. In practice, you can't assume that a self-
service approach is sufficient for all adoption activities.
Successful cloud adoption programs typically involve at least one level of third-party support. Many cloud
adoption efforts require support from a systems integrator (SI) or consulting partner who provides services that
accelerate cloud adoption. Managed service providers (MSPs) provide enduring value by supporting landing
zones and cloud adoption, but they also provide post-adoption operations management support. Additionally,
successful cloud adoption efforts tend to engage one or more independent software vendors (ISV) who provide
software-based services that accelerate cloud adoption. The rich partner ecosystems of SIs, ISVs, MSPs, and other
forms of Microsoft partners have aligned their offerings to specific methodologies found in the Cloud Adoption
Framework. When a partner is aligned to the Ready methodology of this framework, they will likely offer their
own Azure landing zone implementation option.
This article provides a set of questions that help create an understanding of the scope of the partner's Azure
landing zone implementation options.
IMPORTANT
Partner offers and Azure landing zone implementation options are defined by the partner, based on their extensive
experience helping customers adopt the cloud.
Partners might choose to omit the implementation of specific design areas in their initial landing zone implementation.
However, they should be able to communicate when and how each design area is implemented, as well as a range of costs
for completing that design area whenever possible.
Other partner solutions might be flexible enough to support multiple options for each of the questions below. Use these
questions to ensure you're comparing partner offers and self-service options equally.
Find a partner
If you need a partner to implement your Azure landing zones, start with the approved list of Cloud Adoption
Framework aligned partners. Specifically, start with partners who have offers aligned to the Ready methodology.
Additionally, all Azure expert managed service providers (MSPs) have been audited to validate their ability to
deliver each methodology of the Cloud Adoption Framework. While a particular partner might not have an
aligned offer, all partners have demonstrated alignment during technical delivery.
Design principles
All Azure landing zones must consider the following set of common design areas. We refer to the way those
design areas are implemented as design principles. The following sections will help validate the partner's design
principles that define the Azure landing zone implementation.
Deployment options
Partners who offer an Azure landing zone solution might support one or more options to deploy the solution to
your Azure tenant, or to modify and expand the landing zone.
Question for the partner: Which of the following does your Azure landing zone solution support?
Configuration automation: Does the solution deploy the landing zone from a deployment pipeline or
deployment tool?
Manual configuration: Does the solution empower the IT team to manually configure the landing zone,
without injecting errors into the landing zone source code?
Question for the partner: Which of the Azure landing zone implementation options are supported by the
partner's solution? See the Azure landing zone implementation options article for a full list of options.
Identity
Identity is perhaps the most important design area to evaluate in the partner solution.
Question for the partner: Which of the following identity management options does the partner solution
support?
Azure AD: The suggested best practice is to use Azure AD and role-based access control to manage identity
and access in Azure.
Active Directory: If required, does the partner solution provide an option to deploy Active Directory as an
infrastructure as a service solution?
Third-party identity provider: If your company uses a third-party identity solution, determine whether and
how the partner's Azure landing zone integrates with the third-party solution.
Network topology and connectivity
Networking is arguably the second most important design area to evaluate. There are several best practice
approaches to network topology and connectivity.
Question for the partner: Which of the following options is included with the partner's Azure landing zone
solution? Are any of the following options incompatible with the partner's solution?
Virtual network: Does the partner solution configure a virtual network? Can its topology be modified to meet
your technical or business constraints?
Virtual private network (VPN): Is VPN configuration included in the partner's landing zone design to
connect the cloud to existing datacenters or offices?
High-speed connectivity: Is a high-speed connection such as Azure ExpressRoute included in the landing
zone design?
Virtual network peering: Does the design include connectivity between different subscriptions or virtual
networks in Azure?
Resource organization
Sound governance and operational management of the cloud starts with best practice resource organization.
Question for the partner: Does the partner's landing zone design include considerations for the following
resource organization practices?
Naming standards: What naming standards will this offering follow and is that standard automatically
enforced through policy?
Tagging standards: Does the landing zone configuration follow and enforce a specific standard for tagging
assets?
Subscription design: What subscription design strategies are supported by the partner offer?
Management group design: Does the partner offer follow a defined pattern for the Azure management
group hierarchy to organize subscriptions?
Resource group alignment: How are resource groups used to group assets deployed to the cloud? In the
partner offer, are resource groups used to group assets into workloads, deployment packages, or other
organization standards?
Question for the partner: Does the partner provide onboarding documentation to track foundational decisions
and educate staff? See the initial decision template for an example of such documentation.
Governance disciplines
Your governance requirements can heavily influence any complex landing zone designs. Many partners provide a
separate offering to fully implement governance disciplines after landing zones are deployed. The following
questions will help create clarity around the aspects of governance that will be built into any landing zones.
Question for the partner: What governance tooling does the partner solution include as part of the landing
zone implementation?
Policy compliance monitoring: Does the partner's landing zone solution include defined governance
policies along with tools and processes to monitor compliance? Does the offer include customization of policies
to fit your governance needs?
Policy enforcement: Does the partner's landing zone solution include automated enforcement tools and
processes?
Cloud platform governance: Does the partner offer include a solution for maintaining compliance to a
common set of policies across all subscriptions? Or is the scope limited to individual subscriptions?
N/A: Start-small approaches intentionally postpone governance decisions until the team has deployed low-risk
workloads to Azure. This can be addressed in a separate offer after the landing zone solution has been
deployed.
Question for the partner: Does the partner offer go beyond governance tooling to also include processes and
practices for delivering any of the following cloud governance disciplines?
Cost management: Does the partner offer prepare the team to evaluate, monitor, and optimize spend while
creating cost accountability with workload teams?
Security baseline: Does the partner offer prepare the team to maintain compliance as security requirements
change and mature?
Resource consistency: Does the partner offer prepare the team to ensure that all assets in the cloud are
onboarded into relevant operations management processes?
Identity baseline: Does the partner offer prepare the team to maintain identity, role definitions, and
assignments after the initial landing zone is deployed?
Operations baseline
Your operations management requirements could influence configuration of specific Azure products during
landing zone implementation. Many partners provide a separate offering to fully implement the operations
baseline and advanced operations later in the cloud adoption journey, but before your first workload is released
for production use. But, the partner's landing zone solution might include configuration for a number of
operations management tools by default.
Question for the partner: Does the partner solution include design options to support any of the cloud
operations disciplines?
Inventory and visibility: Does the landing zone include tooling to ensure that 100% of assets are centrally
monitored?
Operational compliance: Does the architecture include tooling and automated processes to enforce patching
or other operational compliance requirements?
Protect and recover: Does the partner offer include tooling and configuration to ensure a minimal standard
of backup and recovery for 100% of assets deployed?
Platform operations: Does the landing zone offering include tooling or processes to optimize operations
across the portfolio?
Workload operations: Does the landing zone offering include tooling to manage workload-specific
operations requirements and ensure that each workload is well-architected?
Take action
After reviewing the partner's Azure landing zone offer or solution using the questions above, your team will be
better equipped to choose the partner whose Azure landing zone most closely aligns to your cloud operating
model.
If you determine that a self-service approach to Azure landing zone deployment is a better fit, review or revisit the
Azure landing zone implementation options to find the templated landing zone approach that best aligns with
your cloud operating model.
Next steps
Learn about the process for refactoring landing zones.
Refactor landing zones
Refactor landing zones
A landing zone is an environment for hosting your workloads that's preprovisioned through code. Since
landing zone infrastructure is defined in code, it can be refactored like any other codebase. Refactoring is the
process of modifying or restructuring source code to optimize the output of that code without changing its
purpose or core function.
The Ready methodology uses the concept of refactoring to accelerate migration and remove common blockers.
The steps in the Ready overview describe a process that starts with a predefined landing zone template that aligns
best with your hosting function. You then refactor or add to the source code to expand the landing zone's ability to
deliver that function through improved security, operations, or governance. The following image illustrates the
concept of refactoring.
Common blockers
When customers adopt the cloud, landing zone considerations are the single most common blocker to adoption
and cloud-related business results. Various teams often lean towards one of the following two blockers, resulting in
cultural deadlocks that make adoption difficult.
Both of the primary blockers are rooted in one belief: that the cloud environment and the existing datacenters should
be at or near feature parity regarding operations, governance, and security. This is a wise long-term goal. But the
pain comes from the delicate balance between the timing to achieve that goal and the speed required to deliver
business results.
Blocker: Acting too soon
It took years and significant effort to reach the current state of security, governance, and operations in the current
datacenter. It also required observations, learning, and customization to meet the unique constraints of that
environment. Replicating those same procedures and configurations will take time. Reaching complete feature
parity may also result in an environment that underperforms in the cloud. This parity approach also commonly
leads to significant unplanned overspending in the cloud environment. Don't try to apply current-state
requirements to a future-state environment as an early stage gate. Such an approach rarely proves to be
profitable.
Figure 2: Acting too soon is a common blocker.
In the image above, the customer has an objective of 100 workloads running in the cloud. To get there, the
customer will likely deploy their first workload and then their first ten or so workloads before they're ready to
release one of them to production. Eventually, they'll reach the objective of the adoption plan and have a robust
portfolio in the cloud. But the red x in the image shows where customers commonly get stuck. Waiting for total
alignment can delay the first workload by weeks, months, or even years.
Blocker: Acting too late
On the other hand, acting too late can have significant long-term consequences on the success of the cloud
adoption effort. If the team waits to reach feature parity until the adoption efforts are complete, they will
encounter unnecessary roadblocks and require several escalations to keep the efforts on track.
WARNING
Adoption teams who have a mid-term objective (within 24 months) to host more than 1,000 assets (apps,
infrastructure, or data assets) in the cloud are highly unlikely to be successful using a refactoring approach. The
learning curve is too high and the timeline too tight to allow for organic approaches to skills attainment. A more complete
starting point requiring less customization will be a better path to achieve your objectives. Your implementation partners will
likely be able to guide you through a better approach.
The remainder of this article will focus on some key constraints that can empower a refactoring approach, while
minimizing risk.
Theory
The concept of refactoring a landing zone is simple, but execution requires proper guardrails. The concept shown
above outlines the basic flow:
When you're ready to build your first landing zone, start with an initial landing zone defined via a template.
Once that landing zone is deployed, use the decision trees in the subsequent articles under the
Expand your landing zone section of the table of contents to refactor and add to your initial landing zone.
Repeat decision trees and refactoring until you have an enterprise-ready environment that meets the enhanced
requirements of your security, operations, and governance teams.
Development approach
The advantage of a refactoring-based approach is the ability to create parallel iteration paths for development.
The image below provides an example of two parallel iteration paths: cloud adoption and cloud platform. Both
progress at their own pace, with minimal risk of becoming a blocker to either team's daily efforts. Alignment on
the adoption plan and refactoring guardrails can lead to agreement about milestones and clarity about future-
state dependencies.
Next steps
To start the refactoring process, review the Azure landing zones guidance.
Azure landing zones
Test-driven development (TDD) for landing zones
10/30/2020 • 5 minutes to read • Edit Online
Test-driven development is a common software development and DevOps process that improves the quality of
new features and improvements in any code-based solution. Cloud-based infrastructure and the underlying
source code can use this process to ensure landing zones meet core requirements and are of high quality. This
process is especially useful when landing zones are being developed and refactored in a parallel development
effort.
In the cloud, infrastructure is the output of code execution. Well-structured, tested, and verified code produces a
viable landing zone. A landing zone is an environment for hosting your workloads, preprovisioned through code.
It includes foundational capabilities using a defined set of cloud services and best practices that set you up for
success. This guidance describes an approach that uses test-driven development to fulfill the last part of that
definition, while meeting quality, security, operations, and governance requirements.
This approach can be used to meet simple feature requests during early development. Later in the cloud adoption
lifecycle, this process can be used to meet security, operations, governance, or compliance requirements.
Definition of done
"Set up for success" is a subjective statement. This statement provides the cloud platform team with little
actionable information during landing zone development or refactoring efforts. This lack of clarity can lead to
missed expectations and vulnerabilities in a cloud environment. Before refactoring or expanding any landing
zone, the cloud platform team should seek clarity regarding the "definition of done" for each landing zone.
Definition of done is a simple agreement between the cloud platform team and other affected teams. This
agreement outlines the expected value-added features, which should be included in any landing zone
development effort. The definition of done is often a checklist that's aligned with the short-term cloud adoption
plan. In mature processes, those expected features in the checklist will each have their own acceptance criteria to
create even more clarity. When the value-added features each meet the acceptance criteria, the landing zone is
sufficiently configured to enable the success of the current wave or release of adoption effort.
As teams adopt additional workloads and cloud features, the definition of done and acceptance criteria will
become increasingly complex.
Test-driven development cycle
Each test-driven development cycle for a landing zone generally includes the following steps:
Create a test: Define a test to validate that acceptance criteria for a specific value-add feature has been met.
Automate the test whenever possible.
Test the landing zone: Run the new test and any existing tests. If the required feature hasn't already been
met by prior development efforts and isn't included in the cloud provider's offering, the test should fail.
Running existing tests will help validate that your new test doesn't reduce reliability of landing zone features
delivered by existing code.
Expand and refactor the landing zone: Add or modify the source code to fulfill the requested value-add
feature and improve the general quality of the code base. To meet the fullest spirit of test-driven development,
the cloud platform team would only add code to meet the requested feature and nothing more. At the same
time, code quality and maintenance is a shared effort. When fulfilling new feature requests, the cloud platform
team should seek to improve the code by removing duplication and clarifying the code. Running tests
between new code creation and refactoring of source code is highly recommended.
Deploy the landing zone: Once the source code is capable of fulfilling the feature request, deploy the
modified landing zone to the cloud provider in a controlled testing or sandbox environment.
Test the landing zone: Retesting the landing zone should validate that the new code meets the acceptance
criteria for the requested feature. Once all tests pass, the feature is considered complete and the acceptance
criteria are considered to be met.
When all value-added features and acceptance criteria pass their associated tests, the landing zone is ready to
support the next wave of the cloud adoption plan.
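To make the cycle concrete, the following is a minimal sketch of an automated landing zone acceptance test written with pytest and the Azure SDK for Python. The resource group name, required tag keys, and packages shown (azure-identity, azure-mgmt-resource) are illustrative assumptions rather than part of the framework; substitute the acceptance criteria from your own definition of done.

```python
# Minimal sketch of landing zone acceptance tests (pytest).
# Assumes the azure-identity and azure-mgmt-resource packages and a
# hypothetical landing zone resource group named "rg-landingzone-prod".
import os

import pytest
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

SUBSCRIPTION_ID = os.environ["AZURE_SUBSCRIPTION_ID"]
LANDING_ZONE_RG = "rg-landingzone-prod"           # assumed name
REQUIRED_TAGS = {"ServiceClass", "BusinessUnit"}  # assumed acceptance criteria


@pytest.fixture(scope="module")
def resource_client():
    # DefaultAzureCredential picks up environment, managed identity, or CLI login.
    return ResourceManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)


def test_landing_zone_resource_group_exists(resource_client):
    # Acceptance criterion: the landing zone resource group has been deployed.
    assert resource_client.resource_groups.check_existence(LANDING_ZONE_RG)


def test_landing_zone_required_tags_present(resource_client):
    # Acceptance criterion: governance tags are applied to the resource group.
    group = resource_client.resource_groups.get(LANDING_ZONE_RG)
    tags = set((group.tags or {}).keys())
    assert REQUIRED_TAGS.issubset(tags), f"Missing tags: {REQUIRED_TAGS - tags}"
```

In the cycle described above, tests like these are written first and expected to fail. The landing zone source code is then expanded or refactored until the tests pass, at which point the associated acceptance criteria are considered met.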
Next steps
To accelerate test-driven development in Azure, review test-driven development features of Azure.
Test-driven development in Azure
Test-driven development for landing zones in Azure
10/30/2020 • 3 minutes to read • Edit Online
As outlined in the previous article on test-driven development (TDD) for landing zones, TDD cycles begin with a
test that validates the acceptance criteria of a specific feature required to deliver the cloud adoption plan.
Expanding or refactoring the landing zone can then be tested to validate that the acceptance criteria have been
met. This article outlines a cloud-native toolchain in Azure to automate test-driven development cycles.
Next steps
To begin refactoring your first landing zone, evaluate basic landing zone considerations.
Basic landing zone considerations
Best practices for Azure readiness
10/30/2020 • 3 minutes to read • Edit Online
Cloud readiness requires equipping staff with the technical skills needed to start a cloud adoption effort and
prepare your migration target environment for the assets and workloads you'll move to the cloud. Read these best
practices and additional guidance to help your team prepare your Azure environment.
Azure fundamentals
Organize and deploy your assets in the Azure environment.
Azure fundamental concepts. Learn key Azure concepts and terms, and how these concepts relate to one
another.
Create your initial subscriptions. Establish an initial set of Azure subscriptions to begin your cloud adoption.
Scale your Azure environment using multiple subscriptions. Understand reasons and strategies for creating
additional subscriptions to scale your Azure environment.
Organize your resources with Azure management groups. Learn how Azure management groups can manage
resources, roles, policies, and deployment across multiple subscriptions.
Follow recommended naming and tagging conventions. Review detailed recommendations for naming and
tagging your resources. These recommendations support enterprise cloud adoption efforts.
Create hybrid cloud consistency. Create hybrid cloud solutions that provide the benefits of cloud innovation
while maintaining many of the conveniences of on-premises management.
Networking
Prepare your cloud networking infrastructure to support your workloads.
Networking decisions. Choose the networking services, tools, and architectures that will support your
organization's workload, governance, and connectivity requirements.
Virtual network planning. Plan virtual networks based on your isolation, connectivity, and location
requirements.
Best practices for network security. Learn best practices for addressing common network security issues using
built-in Azure capabilities.
Perimeter networks. Enable secure connectivity between your cloud networks and your on-premises or physical
datacenter networks, along with any connectivity to and from the internet.
Hub and spoke network topology. Efficiently manage common communication or security requirements for
complicated workloads and address potential Azure subscription limitations.
Storage
Azure Storage guidance. Select the right Azure Storage solution to support your usage scenarios.
Azure Storage security guide. Learn about security features in Azure Storage.
Databases
Choose the correct SQL Server option in Azure. Choose the PaaS or IaaS solution that best supports your SQL
Server workloads.
Database security best practices. Learn best practices for database security on the Azure platform.
Choose the right data store. Select the right data store to meet your requirements. Hundreds of implementation
choices are available among SQL and NoSQL databases. Data stores are often categorized by how they
structure data and the types of operations they support. This article describes several common storage models.
Cost management
Tracking costs across business units, environments, and projects. Learn best practices for creating proper cost-
tracking mechanisms.
How to optimize your cloud investment with Azure Cost Management and Billing. Implement a strategy for cost
management and learn about the tools available for addressing cost challenges.
Create and manage budgets. Learn to create and manage budgets using Azure Cost Management and Billing.
Export cost data. Learn to export cost data using Azure Cost Management and Billing.
Optimize costs based on recommendations. Learn to identify underutilized resources and reduce costs by using
Azure Cost Management and Billing and Azure Advisor.
Use cost alerts to monitor usage and spending. Learn to use Azure Cost Management and Billing alerts to
monitor your Azure usage and spending.
Create your initial Azure subscriptions
10/30/2020 • 2 minutes to read • Edit Online
Start your Azure adoption by creating an initial set of subscriptions. Learn what subscriptions you should begin
with based on your initial requirements.
Sandbox subscriptions
If innovation goals are part of your cloud adoption strategy, consider creating one or more sandbox
subscriptions. You can apply security policies to keep these test subscriptions isolated from your production and
nonproduction environments. Users can easily experiment with Azure capabilities in these isolated environments.
Use an Azure Dev/Test offer to create these subscriptions.
Figure 2: A subscription model with sandbox subscriptions.
Next steps
Review the reasons why you might want to create additional Azure subscriptions to meet your requirements.
Create additional subscriptions to scale your Azure environment
Create additional subscriptions to scale your Azure
environment
10/30/2020 • 2 minutes to read • Edit Online
Organizations often use multiple Azure subscriptions to avoid per-subscription resource limits and to better
manage and govern their Azure resources. It's important to define a strategy for scaling your subscriptions.
Technical considerations
Subscription limits: Subscriptions have defined limits for some resource types. For example, the number of
virtual networks in a subscription is limited. When a subscription approaches these limits, you'll need to create
another subscription and put additional resources there, as the sketch after this list illustrates. For more
information, see Azure subscription and service limits.
Classic model resources: If you've been using Azure for a long time, you may have resources that were
created using the classic deployment model. Azure policies, role-based access control, resource grouping, and
tags cannot be applied to classic model resources. You should move these resources into subscriptions that
contain only classic model resources.
Costs: There might be some additional costs for data ingress and egress between subscriptions.
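The following sketch illustrates the subscription-limit consideration by counting the virtual networks in a subscription and comparing the count to an assumed ceiling. The azure-identity and azure-mgmt-network packages are assumptions for the example, as is the limit value; always confirm current limits in Azure subscription and service limits.

```python
# Sketch: warn when a subscription approaches an assumed virtual network limit.
# Assumes the azure-identity and azure-mgmt-network packages.
import os

from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

SUBSCRIPTION_ID = os.environ["AZURE_SUBSCRIPTION_ID"]
ASSUMED_VNET_LIMIT = 1000   # illustrative only; check the published limits
WARNING_THRESHOLD = 0.8     # warn at 80% of the assumed limit

client = NetworkManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)
vnet_count = sum(1 for _ in client.virtual_networks.list_all())

if vnet_count >= ASSUMED_VNET_LIMIT * WARNING_THRESHOLD:
    print(f"{vnet_count} virtual networks: consider creating another subscription.")
else:
    print(f"{vnet_count} virtual networks: within the assumed limit.")
```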
Business priorities
Your business priorities might lead you to create additional subscriptions. These priorities include:
Innovation
Migration
Cost
Operations
Security
Governance
For other considerations about scaling your subscriptions, review the subscription decision guide in the Cloud
Adoption Framework.
Next steps
Create a management group hierarchy to help organize and manage your subscriptions and resources.
Organize and manage your subscriptions and resources
Organize and manage multiple Azure subscriptions
10/30/2020 • 2 minutes to read • Edit Online
If you have only a few subscriptions, then managing them independently is relatively simple. However, if you
have many subscriptions, create a management group hierarchy to help manage your subscriptions and
resources.
NOTE
Tag inheritance is not yet supported but will be available soon.
This inheritance model lets you arrange the subscriptions in your hierarchy so that each subscription follows
appropriate policies and security controls.
Related resources
Review the following resources to learn more about organizing and managing your Azure resources.
Organize your resources with Azure management groups
Elevate access to manage all Azure subscriptions and management groups
Move Azure resources to another resource group or subscription
Next steps
Review recommended naming and tagging conventions to follow when deploying your Azure resources.
Recommended naming and tagging conventions
Recommended naming and tagging conventions
10/30/2020 • 11 minutes to read • Edit Online
Organize your cloud assets to support operational management and accounting requirements. Well-defined
naming and metadata tagging conventions help to quickly locate and manage resources. These conventions
also help associate cloud usage costs with business teams via chargeback and showback accounting
mechanisms.
Accurate representation and naming of resources are critical for security purposes. In the event of a security
incident, quickly identifying affected systems, their potential business impact, and what they are being used
for is critical to making good risk decisions. Security services such as Azure Security Center and Azure Sentinel
reference resources and their associated logging/alert information by resource name.
Azure defines naming rules and restrictions for Azure resources. This guidance provides detailed
recommendations to support enterprise cloud adoption efforts.
Changing resource names can be difficult. Establish a comprehensive naming convention before you begin
any large cloud deployment.
NOTE
Every business has different organizational and management requirements. These recommendations provide a starting
point for discussions within your cloud adoption teams.
As these discussions proceed, use the following template to capture the naming and tagging decisions you make when
you align these recommendations to your specific business needs.
Download the naming and tagging conventions tracking template.
Resource naming
An effective naming convention assembles resource names by using important resource information as parts
of a resource's name. For example, using these recommended naming conventions, a public IP resource for a
production SharePoint workload is named like this: pip-sharepoint-prod-westus-001 .
From the name, you can quickly identify the resource's type, its associated workload, its deployment
environment, and the Azure region hosting it.
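As a sketch only, the helper below assembles a name from recommended components in the order used by the example above. The component order, separators, and abbreviations are assumptions; adapt them to the convention your teams agree on.

```python
# Sketch: assemble a resource name from naming components.
# Ordering and separators are assumptions modeled on pip-sharepoint-prod-westus-001.
def build_resource_name(prefix: str, workload: str, environment: str,
                        region: str, instance: int) -> str:
    # Three-digit instance padding; see the padding note later in this article.
    return f"{prefix}-{workload}-{environment}-{region}-{instance:03d}"


# Example: a public IP for a production SharePoint workload in West US.
print(build_resource_name("pip", "sharepoint", "prod", "westus", 1))
# pip-sharepoint-prod-westus-001
```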
Naming scope
All Azure resource types have a scope that defines the level at which resource names must be unique. A resource
must have a unique name within its scope.
For example, a virtual network has a resource group scope, which means that there can be only one network
named vnet-prod-westus-001 in a given resource group. Other resource groups could have their own virtual
network named vnet-prod-westus-001 . Subnets are scoped to virtual networks, so each subnet within a virtual
network must be uniquely named.
Some resource names, such as PaaS services with public endpoints or virtual machine DNS labels, have global
scopes, which means that they must be unique across the entire Azure platform.
Resource names have length limits. Balancing the context embedded in a name with its scope and length is
important when you develop your naming conventions. For more information, see naming rules and
restrictions for Azure resources.
Recommended naming components
When you construct your naming convention, identify the key pieces of information that you want to reflect in
a resource name. Different information is relevant for different resource types. The following list provides
examples of information that is useful when you construct resource names.
Keep the length of naming components short to prevent exceeding resource name length limits.
Business unit: Top-level division of your company that owns the subscription or workload the resource belongs
to. In smaller organizations, this component might represent a single corporate top-level organizational element.
Examples: fin, mktg, product, it, corp.
Deployment environment: The stage of the development lifecycle for the workload that the resource supports.
Examples: prod, dev, qa, stage, test.
Region: The Azure region where the resource is deployed. Examples: westus, eastus2, westeurope, usgovia.
Recommended name prefixes are defined for each asset type, grouped by category: networking, databases,
storage, developer tools, integration, and migration. A few examples:
Subnet: snet-
Virtual machine: vm
Container registry: cr
Storage account: st
Blueprint: bp-
Metadata tags
When you apply metadata tags to your cloud resources, you can include information about those assets that
couldn't be included in the resource name. You can use that information to perform more sophisticated
filtering and reporting on resources. You want these tags to include context about the resource's associated
workload or application, operational requirements, and ownership information. This information can be used
by IT or business teams to find resources or generate reports about resource usage and billing.
What tags you apply to resources and what tags are required or optional differs among organizations. The
following list provides examples of common tags that capture important context and information about a
resource. Use this list as a starting point to establish your own tagging conventions.
End date of the project (tag: EndDate, value: {date}): Date when the application, workload, or service is
scheduled for retirement.
Service class (tag: ServiceClass, values: Dev, Bronze, Silver, Gold): Service-level agreement level of the
application, workload, or service.
Start date of the project (tag: StartDate, value: {date}): Date when the application, workload, or service was
first deployed.
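The following is a hedged sketch of applying tags like these with the Azure SDK for Python. The resource group name and tag values are illustrative assumptions, and updating a resource group through create_or_update is only one of several ways to set tags; tags can also be applied in the portal, through Azure Policy, or with command-line tools.

```python
# Sketch: apply common metadata tags to a resource group.
# Assumes the azure-identity and azure-mgmt-resource packages; the resource
# group name, location, and tag values are illustrative only.
import os

from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

client = ResourceManagementClient(
    DefaultAzureCredential(), os.environ["AZURE_SUBSCRIPTION_ID"]
)

client.resource_groups.create_or_update(
    "rg-sharepoint-prod-001",               # assumed resource group name
    {
        "location": "westus",
        "tags": {
            "ServiceClass": "Gold",
            "StartDate": "2020-10-30",
            "EndDate": "2023-10-30",
        },
    },
)
```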
Example names
The following section provides some example names for common Azure resource types in an enterprise cloud
deployment.
Example names: General
NOTE
The example names above and elsewhere in this document reference three-digit padding (<###>), for example,
mktg-prod-001.
Padding aids in human readability and sorting of assets when those assets are managed in a configuration
management database (CMDB), IT Asset Management tool, or traditional accounting tools. When the deployed asset is
managed centrally as part of a larger inventory or portfolio of IT assets, the padding approach aligns with interfaces
those systems use to manage inventory naming.
Unfortunately, the traditional asset padding approach can prove problematic in infrastructure-as-code
approaches, which may iterate through assets based on a non-padded number. This approach is common during
deployment or automated configuration management tasks. Those scripts would have to routinely strip the
padding and convert the padded number to a real number, which slows script development and run time.
Which approach you choose to implement is a personal decision. The padding in this article is meant to illustrate the
importance of using a consistent approach to inventory numbering, not which approach is superior. Before deciding on
a number schema (with or without padding) evaluate which will have a bigger impact on long term operations:
CMDB/asset management solutions or code-based inventory management. Then consistently follow the padding
option that best fits your operational needs.
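The trade-off described in this note can be shown in a couple of lines of Python: padded values sort cleanly in a CMDB or asset inventory, while code-based iteration works with plain integers and must strip the padding.

```python
# Sketch: padded instance numbers for naming versus plain integers for iteration.
padded = [f"{n:03d}" for n in range(1, 4)]   # ['001', '002', '003'] for asset names
numbers = [int(p) for p in padded]           # [1, 2, 3] for code-based iteration
print(padded, numbers)
```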
Applications migrated from on-premises will benefit from Azure's secure, cost-efficient infrastructure, even with
minimal application changes. Even so, enterprises should adapt their architectures to improve agility and take
advantage of Azure's capabilities.
Microsoft Azure delivers hyperscale services and infrastructure with enterprise-grade capabilities and reliability.
These services and infrastructure offer many choices in hybrid connectivity, so customers can choose to access
them over the internet or a private network connection. Microsoft partners can also provide enhanced capabilities
by offering security services and virtual appliances that are optimized to run in Azure.
Customers can use Azure to seamlessly extend their infrastructure into the cloud and build multitier architectures.
NOTE
A virtual datacenter is not a specific Azure service. Rather, various Azure features and capabilities are combined to meet your
requirements. A virtual datacenter is a way of thinking about your workloads and Azure usage to optimize your resources
and capabilities in the cloud. It provides a modular approach to providing IT services in Azure while respecting the
enterprise's organizational roles and responsibilities.
A virtual datacenter helps enterprises deploy workloads and applications in Azure for the following scenarios:
Host multiple related workloads.
Migrate workloads from an on-premises environment to Azure.
Implement shared or centralized security and access requirements across workloads.
Mix DevOps and centralized IT appropriately for a large enterprise.
A Peering hub and spoke topology is well suited for distributed applications and teams with delegated
responsibilities.
An Azure Virtual WAN topology can support large-scale branch office scenarios and global WAN services.
The peering hub and spoke topology and the Azure Virtual WAN topology both use a hub and spoke design, which
is optimal for communication, shared resources, and centralized security policy. Hubs are built using either a virtual
network peering hub (labeled as Hub Virtual Network in the diagram) or a Virtual WAN hub (labeled as
Azure Virtual WAN in the diagram). Azure Virtual WAN is designed for large-scale branch-to-branch and branch-
to-Azure communications, or for avoiding the complexities of building all the components individually in a virtual
networking peering hub. In some cases, your requirements might mandate a virtual network peering hub design,
such as the need for network virtual appliances in the hub.
In both of the hub and spoke topologies, the hub is the central network zone that controls and inspects all traffic
between different zones: internet, on-premises, and the spokes. The hub and spoke topology helps the IT
department centrally enforce security policies. It also reduces the potential for misconfiguration and exposure.
The hub often contains the common service components consumed by the spokes. The following examples are
common central services:
The Windows Active Directory infrastructure, required for user authentication of third parties that access from
untrusted networks before they get access to the workloads in the spoke. It includes the related Active Directory
Federation Services (AD FS).
A Domain Name System (DNS) service to resolve naming for the workload in the spokes, to access
resources on-premises and on the internet if Azure DNS isn't used.
A public key infrastructure (PKI), to implement single sign-on on workloads.
Flow control of TCP and UDP traffic between the spoke network zones and the internet.
Flow control between the spokes and on-premises.
If needed, flow control between one spoke and another.
A virtual datacenter reduces overall cost by using the shared hub infrastructure between multiple spokes.
The role of each spoke can be to host different types of workloads. The spokes also provide a modular approach for
repeatable deployments of the same workloads. Examples include dev/test, user acceptance testing, preproduction,
and production. The spokes can also segregate and enable different groups within your organization. An example is
DevOps groups. Inside a spoke, it's possible to deploy a basic workload or complex multitier workloads with traffic
control between the tiers.
Subscription limits and multiple hubs
IMPORTANT
Based on the size of your Azure deployments, a multiple-hub strategy may be needed. When designing your hub and spoke
strategy, ask "Can this design scale to use another hub virtual network in this region?" and "Can this design scale to
accommodate multiple regions?" It's far better to plan for a design that scales and not need it, than to fail to plan and need
it.
When to scale to a secondary (or more) hub will depend on myriad factors, usually based on inherent limits on scale. Be sure
to review the subscription, virtual network, and virtual machine limits when designing for scale.
In Azure, every component, whatever the type, is deployed in an Azure subscription. The isolation of Azure
components in different Azure subscriptions can satisfy the requirements of different lines of business, such as
setting up differentiated levels of access and authorization.
A single VDC implementation can scale up to a large number of spokes, although, as with every IT system, there are
platform limits. The hub deployment is bound to a specific Azure subscription, which has restrictions and limits (for
example, a maximum number of virtual network peerings; for details, see Azure subscription and service limits,
quotas, and constraints). In cases where limits may be an issue, the architecture can scale up further by extending
the model from a single hub-spokes to a cluster of hub and spokes. Multiple hubs in one or more Azure regions
can be connected using virtual network peering, ExpressRoute, Virtual WAN, or site-to-site VPN.
The introduction of multiple hubs increases the cost and management effort of the system. It is only justified due to
scalability, system limits, redundancy, regional replication for end-user performance, or disaster recovery. In
scenarios requiring multiple hubs, all the hubs should strive to offer the same set of services for operational ease.
Interconnection between spokes
Inside a single spoke, or a flat network design, it's possible to implement complex multitier workloads. Multitier
configurations can be implemented using subnets, one for every tier or application, in the same virtual network.
Traffic control and filtering are done using network security groups and user-defined routes.
An architect might want to deploy a multitier workload across multiple virtual networks. With virtual network
peering, spokes can connect to other spokes in the same hub or different hubs. A typical example of this scenario is
the case where application processing servers are in one spoke, or virtual network. The database deploys in a
different spoke, or virtual network. In this case, it's easy to interconnect the spokes with virtual network peering
and, by doing that, avoid transiting through the hub. A careful architecture and security review should be done to
ensure that bypassing the hub doesn't bypass important security or auditing points that might exist only in the
hub.
Spokes can also be interconnected to a spoke that acts as a hub. This approach creates a two-level hierarchy: the
spoke in the higher level (level 0) becomes the hub of lower spokes (level 1) of the hierarchy. The spokes of a VDC
implementation are required to forward the traffic to the central hub so that the traffic can transit to its destination
in either the on-premises network or the public internet. An architecture with two levels of hubs introduces
complex routing that removes the benefits of a simple hub-spoke relationship.
Although Azure allows complex topologies, one of the core principles of the VDC concept is repeatability and
simplicity. To minimize management effort, the simple hub-spoke design is the VDC reference architecture that we
recommend.
Components
The virtual datacenter is made up of four basic component types: Infrastructure , Perimeter Networks ,
Workloads , and Monitoring .
Each component type consists of various Azure features and resources. Your VDC implementation is made up of
instances of multiple component types and multiple variations of the same component type. For instance, you
may have many different, logically separated workload instances that represent different applications. You use
these different component types and instances to ultimately build the VDC.
The preceding high-level conceptual architecture of the VDC shows different component types used in different
zones of the hub-spokes topology. The diagram shows infrastructure components in various parts of the
architecture.
As good practice in general, access rights and privileges should be group-based. Dealing with groups rather than
individual users eases maintenance of access policies, provides a consistent way to manage them across teams,
and helps minimize configuration errors. Assigning and removing users to and from appropriate groups helps
keep the privileges of a specific user up to date.
Each role group should have a unique prefix on its name. This prefix makes it easy to identify which group is
associated with which workload. For example, a workload hosting an authentication service might have groups
named AuthServiceNetOps, AuthServiceSecOps, AuthServiceDevOps, and AuthServiceInfraOps.
Centralized roles, or roles not related to a specific service, might be prefaced with Corp. An example is
CorpNetOps.
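A small sketch of generating consistently prefixed group names follows. The role suffixes mirror the examples above; the workload prefixes are assumptions.

```python
# Sketch: generate consistently prefixed role group names per workload.
# Role suffixes mirror the examples above; workload prefixes are assumptions.
ROLE_SUFFIXES = ["NetOps", "SecOps", "DevOps", "InfraOps"]


def role_groups(workload_prefix: str) -> list[str]:
    return [f"{workload_prefix}{suffix}" for suffix in ROLE_SUFFIXES]


print(role_groups("AuthService"))  # ['AuthServiceNetOps', 'AuthServiceSecOps', ...]
print(role_groups("Corp"))         # centralized roles, such as 'CorpNetOps'
```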
Many organizations use a variation of the following groups to provide a major breakdown of roles:
The central IT team, Corp, has the ownership rights to control infrastructure components. Examples are
networking and security. The group needs to have the role of contributor on the subscription, control of the hub,
and network contributor rights in the spokes. Large organizations frequently split up these management
responsibilities between multiple teams. Examples are a network operations CorpNetOps group with exclusive
focus on networking and a security operations CorpSecOps group responsible for the firewall and security
policy. In this specific case, two different groups need to be created for assignment of these custom roles.
The dev/test group, AppDevOps, has the responsibility to deploy app or service workloads. This group takes
the role of virtual machine contributor for IaaS deployments or one or more PaaS contributor's roles. For more
information, see Built-in roles for Azure resources. Optionally, the dev/test team might need visibility on security
policies (network security groups) and routing policies (user-defined routes) inside the hub or a specific spoke.
In addition to the role of contributor for workloads, this group would also need the role of network reader.
The operation and maintenance group, CorpInfraOps or AppInfraOps, has the responsibility of managing
workloads in production. This group needs to be a subscription contributor on workloads in any production
subscriptions. Some organizations might also evaluate if they need an additional escalation support team group
with the role of subscription contributor in production and the central hub subscription. The additional group
fixes potential configuration issues in the production environment.
The VDC is designed so that groups created for the central IT team, managing the hub, have corresponding groups
at the workload level. In addition to managing hub resources only, the central IT team can control external access
and top-level permissions on the subscription. Workload groups can also control resources and permissions of
their virtual network independently from the central IT team.
The virtual datacenter is partitioned to securely host multiple projects across different lines of business. All projects
require different isolated environments (dev, UAT, and production). Separate Azure subscriptions for each of these
environments can provide natural isolation.
The preceding diagram shows the relationship between an organization's projects, users, and groups and the
environments where the Azure components are deployed.
Typically in IT, an environment (or tier) is a system in which multiple applications are deployed and executed. Large
enterprises use a development environment (where changes are made and tested) and a production environment
(what end-users use). Those environments are separated, often with several staging environments in between
them to allow phased deployment (rollout), testing, and rollback if problems arise. Deployment architectures vary
significantly, but usually the basic process of starting at development (DEV) and ending at production (PROD) is
still followed.
A common architecture for these types of multitier environments consists of DevOps for development and testing,
UAT for staging, and production environments. Organizations can use single or multiple Azure AD tenants to define
access and rights to these environments. The previous diagram shows a case where two different Azure AD tenants
are used: one for DevOps and UAT, and the other exclusively for production.
The presence of different Azure AD tenants enforces the separation between environments. The same group of
users, such as the central IT team, needs to authenticate by using a different URI to access a different Azure AD
tenant to modify the roles or permissions of either the DevOps or production environments of a project. The
presence of different user authentications to access different environments reduces possible outages and other
issues caused by human errors.
Component type: Infrastructure
This component type is where most of the supporting infrastructure resides. It's also where your centralized IT,
security, and compliance teams spend most of their time.
Infrastructure components provide an interconnection for the different components of a VDC implementation, and
are present in both the hub and the spokes. The responsibility for managing and maintaining the infrastructure
components is typically assigned to the central IT team or security team.
One of the primary tasks of the IT infrastructure team is to guarantee the consistency of IP address schemas across
the enterprise. The private IP address space assigned to a VDC implementation must be consistent and must not
overlap with private IP addresses assigned on your on-premises networks.
While NAT on the on-premises edge routers or in Azure environments can avoid IP address conflicts, it adds
complications to your infrastructure components. Simplicity of management is one of the key goals of the VDC, so
using NAT to handle IP concerns, while a valid solution, is not a recommended solution.
Infrastructure components have the following functionality:
Identity and directory services. Access to every resource type in Azure is controlled by an identity stored in a
directory service. The directory service stores not only the list of users, but also the access rights to resources in
a specific Azure subscription. These services can exist cloud-only, or they can be synchronized with on-premises
identity stored in Active Directory.
Virtual Network. Virtual networks are one of the main components of the VDC, and enable you to create a traffic
isolation boundary on the Azure platform. A virtual network is composed of a single or multiple virtual network
segments, each with a specific IP network prefix (a subnet, either IPv4 or dual stack IPv4/IPv6). The virtual
network defines an internal perimeter area where IaaS virtual machines and PaaS services can establish private
communications. VMs (and PaaS services) in one virtual network can't communicate directly to VMs (and PaaS
services) in a different virtual network, even if both virtual networks are created by the same customer, under
the same subscription. Isolation is a critical property that ensures customer VMs and communication remains
private within a virtual network. Where cross-network connectivity is desired, the following features describe
how that can be accomplished.
Virtual network peering. The fundamental feature used to create the infrastructure of a VDC is virtual network
peering, which connects two virtual networks in the same region, either through the Azure datacenter network
or using the Azure worldwide backbone across regions.
Virtual Network service endpoints. Service endpoints extend your virtual network private address space to
include your PaaS space. The endpoints also extend the identity of your virtual network to the Azure services
over a direct connection. Endpoints allow you to secure your critical Azure service resources to only your virtual
networks.
Private Link. Azure Private Link enables you to access Azure PaaS Services (for example, Azure Storage, Azure
Cosmos DB, and Azure SQL Database) and Azure hosted customer/partner services over a Private Endpoint in
your virtual network. Traffic between your virtual network and the service traverses over the Microsoft
backbone network, eliminating exposure from the public Internet. You can also create your own Private Link
Service in your virtual network and deliver it privately to your customers. The setup and consumption
experience using Azure Private Link is consistent across Azure PaaS, customer-owned, and shared partner
services.
User-defined routes. Traffic in a virtual network is routed by default based on the system routing table. A user-
defined route is a custom routing table that network administrators can associate to one or more subnets to
override the behavior of the system routing table and define a communication path within a virtual network.
The presence of user-defined routes guarantees that egress traffic from the spoke transits through the specific
custom VMs or network virtual appliances and load balancers present in both the hub and the spokes.
Network security groups. A network security group is a list of security rules that act as traffic filtering on IP
sources, IP destinations, protocols, IP source ports, and IP destination ports (also called a Layer 4 five-tuple). The
network security group can be applied to a subnet, a Virtual NIC associated with an Azure VM, or both. The
network security groups are essential to implement a correct flow control in the hub and in the spokes. The level
of security afforded by the network security group is a function of which ports you open, and for what purpose.
Customers should apply additional per-VM filters with host-based firewalls such as iptables or the Windows
Firewall.
DNS. DNS provides name resolution for resources in a virtual datacenter. Azure provides DNS services for both
public and private name resolution. Private zones provide name resolution both within a virtual network and
across virtual networks. You can have private zones not only span across virtual networks in the same region,
but also across regions and subscriptions. For public resolution, Azure DNS provides a hosting service for DNS
domains, providing name resolution using Microsoft Azure infrastructure. By hosting your domains in Azure,
you can manage your DNS records using the same credentials, APIs, tools, and billing as your other Azure
services.
Management group, subscription, and resource group management. A subscription defines a natural boundary
to create multiple groups of resources in Azure. This separation can be for function, role segregation, or billing.
Resources in a subscription are assembled together in logical containers known as resource groups. The
resource group represents a logical group to organize the resources in a virtual datacenter. If your organization
has many subscriptions, you may need a way to efficiently manage access, policies, and compliance for those
subscriptions. Azure management groups provide a level of scope above subscriptions. You organize
subscriptions into containers known as management groups and apply your governance conditions to the
management groups. All subscriptions within a management group automatically inherit the conditions applied
to the management group. To see these three features in a hierarchy view, see Organizing your resources in the
Cloud Adoption Framework.
Role-based access control (RBAC). RBAC can map organizational roles and rights to access specific Azure
resources, allowing you to restrict users to only a certain subset of actions. If you're synchronizing Azure Active
Directory with an on-premises Active Directory, you can use the same Active Directory groups in Azure that you
use on-premises. With RBAC, you can grant access by assigning the appropriate role to users, groups, and
applications within the relevant scope. The scope of a role assignment can be an Azure subscription, a resource
group, or a single resource. RBAC allows inheritance of permissions. A role assigned at a parent scope also
grants access to the children contained within it. Using RBAC, you can segregate duties and grant only the
amount of access to users that they need to perform their jobs. For example, one employee can manage virtual
machines in a subscription, while another can manage SQL Server databases in the same subscription.
Component Type: Perimeter Networks
Components of a perimeter network (sometimes called a DMZ network) connect your on-premises or physical
datacenter networks, along with any internet connectivity. The perimeter typically requires a significant time
investment from your network and security teams.
Incoming packets should flow through the security appliances in the hub before reaching the back-end servers and
services in the spokes. Examples include the firewall, IDS, and IPS. Before they leave the network, internet-bound
packets from the workloads should also flow through the security appliances in the perimeter network. This flow
enables policy enforcement, inspection, and auditing.
Perimeter network components include:
Virtual networks, user-defined routes, and network security groups
Network virtual appliances
Azure Load Balancer
Azure Application Gateway with web application firewall (WAF)
Public IPs
Azure Front Door with web application firewall (WAF)
Azure Firewall and Azure Firewall Manager
Standard DDoS Protection
Usually, the central IT team and security teams have responsibility for requirement definition and operation of the
perimeter networks.
The preceding diagram shows the enforcement of two perimeters with access to the internet and an on-premises
network, both resident in the DMZ hub. In the DMZ hub, the perimeter network to the internet can scale up to support
many lines of business, using multiple farms of Web Application Firewalls (WAFs) or Azure Firewalls. The hub also
allows for on-premises connectivity via VPN or ExpressRoute as needed.
NOTE
In the preceding diagram, in the "DMZ Hub", many of the following features can be bundled together in an Azure Virtual
WAN hub (such as virtual networks, user-defined routes, network security groups, VPN gateways, ExpressRoute gateways,
Azure load balancers, Azure Firewalls, Firewall Manager, and DDOS). Using Azure Virtual WAN hubs can make the creation of
the hub virtual network, and thus the VDC, much easier, since most of the engineering complexity is handled for you by
Azure when you deploy an Azure Virtual WAN hub.
Virtual networks. The hub is typically built on a virtual network with multiple subnets to host the different types of
services that filter and inspect traffic to or from the internet via Azure Firewall, NVAs, WAF, and Azure Application
Gateway instances.
User-defined routes. Using user-defined routes, customers can deploy firewalls, IDS/IPS, and other virtual
appliances, and route network traffic through these security appliances for security boundary policy enforcement,
auditing, and inspection. User-defined routes can be created in both the hub and the spokes to guarantee that
traffic transits through the specific custom VMs, Network Virtual Appliances, and load balancers used by a VDC
implementation. To guarantee that traffic generated from virtual machines residing in the spoke transits to the
correct virtual appliances, a user-defined route needs to be set in the subnets of the spoke by setting the front-end
IP address of the internal load balancer as the next hop. The internal load balancer distributes the internal traffic to
the virtual appliances (load balancer back-end pool).
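The following is a hedged sketch of creating such a route with the Azure SDK for Python. The resource group, route table name, address prefix, and internal load balancer front-end IP are assumptions, and the dictionary-based parameters follow the pattern accepted by recent azure-mgmt-network versions.

```python
# Sketch: a user-defined route that sends spoke egress traffic to the internal
# load balancer in front of the hub's network virtual appliances.
# Assumes azure-identity and azure-mgmt-network; names and IP addresses are
# illustrative only.
import os

from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

client = NetworkManagementClient(
    DefaultAzureCredential(), os.environ["AZURE_SUBSCRIPTION_ID"]
)

poller = client.route_tables.begin_create_or_update(
    "rg-spoke-prod-001",        # assumed resource group
    "rt-spoke-to-hub",          # assumed route table name
    {
        "location": "westus",
        "routes": [
            {
                "name": "default-via-hub-firewall",
                "address_prefix": "0.0.0.0/0",
                "next_hop_type": "VirtualAppliance",
                "next_hop_ip_address": "10.0.0.4",  # assumed ILB front-end IP
            }
        ],
    },
)
print(poller.result().provisioning_state)
# The route table must then be associated with the spoke subnets.
```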
Azure Firewall is a managed network security service that protects your Azure Virtual Network resources. It's a
stateful managed firewall with high availability and cloud scalability. You can centrally create, enforce, and log
application and network connectivity policies across subscriptions and virtual networks. Azure Firewall uses a static
public IP address for your virtual network resources. It allows outside firewalls to identify traffic that originates
from your virtual network. The service is fully integrated with Azure Monitor for logging and analytics.
If you use the Azure Virtual WAN topology, the Azure Firewall Manager is a security management service that
provides central security policy and route management for cloud-based security perimeters. It works with Azure
Virtual WAN hub, a Microsoft-managed resource that lets you easily create hub and spoke architectures. When
security and routing policies are associated with such a hub, it's referred to as a secured virtual hub.
Network virtual appliances. In the hub, the perimeter network with access to the internet is normally managed
through an Azure Firewall instance or a farm of firewalls or web application firewall (WAF).
Different lines of business commonly use many web applications, which tend to suffer from various vulnerabilities
and potential exploits. Web application firewalls are a special type of product used to detect attacks against web
applications, HTTP/HTTPS, in more depth than a generic firewall. Compared with traditional firewall technology,
WAFs have a set of specific features to protect internal web servers from threats.
Azure Firewall and NVA firewalls both use a common administration plane, with a set of security rules to protect
the workloads hosted in the spokes, and control access to on-premises networks. Azure Firewall has scalability
built in, whereas NVA firewalls can be manually scaled behind a load balancer. Generally, a firewall farm has less
specialized software compared with a WAF, but has a broader application scope to filter and inspect any type of
traffic in egress and ingress. If an NVA approach is used, the appliances can be found and deployed from Azure Marketplace.
We recommend that you use one set of Azure Firewall instances, or NVAs, for traffic originating on the internet.
Use another for traffic originating on-premises. Using only one set of firewalls for both is a security risk as it
provides no security perimeter between the two sets of network traffic. Using separate firewall layers reduces the
complexity of checking security rules and makes it clear which rules correspond to which incoming network
request.
Azure Load Balancer offers a high availability Layer 4 (TCP/UDP) service, which can distribute incoming traffic
among service instances defined in a load-balanced set. Traffic sent to the load balancer from front-end endpoints
(public IP endpoints or private IP endpoints) can be redistributed with or without address translation to a set of
back-end IP address pool (such as network virtual appliances or virtual machines).
Azure Load Balancer can probe the health of the various server instances as well, and when an instance fails to
respond to a probe, the load balancer stops sending traffic to the unhealthy instance. In a virtual datacenter, an
external load balancer is deployed to the hub and the spokes. In the hub, the load balancer is used to efficiently
route traffic across firewall instances, and in the spokes, load balancers are used to manage application traffic.
Azure Front Door (AFD) is Microsoft's highly available and scalable Web Application Acceleration Platform, Global
HTTP Load Balancer, Application Protection, and Content Delivery Network. Running in more than 100 locations at
the edge of Microsoft's Global Network, AFD enables you to build, operate, and scale out your dynamic web
application and static content. AFD provides your application with world-class end-user performance, unified
regional/stamp maintenance automation, BCDR automation, unified client/user information, caching, and service
insights. The platform offers:
Performance, reliability, and support service-level agreements (SLAs).
Compliance certifications.
Auditable security practices that are developed, operated, and natively supported by Azure.
Azure Front Door also provides a web application firewall (WAF), which protects web applications from common
vulnerabilities and exploits.
Azure Application Gateway is a dedicated virtual appliance providing a managed application delivery controller. It
offers various Layer 7 load-balancing capabilities for your application. It allows you to optimize web farm
performance by offloading CPU-intensive SSL termination to the application gateway. It also provides other Layer 7
routing capabilities, such as round-robin distribution of incoming traffic, cookie-based session affinity, URL-path-
based routing, and the ability to host multiple websites behind a single application gateway. A web application
firewall (WAF) is also provided as part of the application gateway WAF SKU. This SKU provides protection to web
applications from common web vulnerabilities and exploits. Application Gateway can be configured as an
internet-facing gateway, an internal-only gateway, or a combination of both.
Public IPs. With some Azure features, you can associate service endpoints to a public IP address so that your
resource is accessible from the internet. This endpoint uses NAT to route traffic to the internal address and port on
the virtual network in Azure. This path is the primary way for external traffic to pass into the virtual network. You
can configure public IP addresses to determine which traffic is passed in and how and where it's translated onto the
virtual network.
Azure DDoS Protection Standard provides additional mitigation capabilities over the Basic service tier that are
tuned specifically to Azure Virtual Network resources. DDoS Protection Standard is simple to enable and requires
no application changes. Protection policies are tuned through dedicated traffic monitoring and machine learning
algorithms. Policies are applied to public IP addresses associated to resources deployed in virtual networks.
Examples include Azure Load Balancer, Azure Application Gateway, and Azure Service Fabric instances. Near real-
time, system-generated logs are available through Azure Monitor views during an attack and for history.
Application layer protection can be added through the Azure Application Gateway web application firewall.
Protection is provided for IPv4 and IPv6 Azure public IP addresses.
The hub and spoke topology uses virtual network peering and user-defined routes to route traffic properly.
In the diagram, the user-defined route ensures that traffic flows from the spoke to the firewall before passing to
on-premises through the ExpressRoute gateway (if the firewall policy allows that flow).
Component type: Monitoring
Monitoring components provide visibility and alerting from all the other component types. Every team should be
able to monitor the components and services it has access to. If you have a centralized help desk or
operations teams, they require integrated access to the data provided by these components.
Azure offers different types of logging and monitoring services to track the behavior of Azure-hosted resources.
Governance and control of workloads in Azure is based not just on collecting log data but also on the ability to
trigger actions based on specific reported events.
Azure Monitor. Azure includes multiple services that individually perform a specific role or task in the monitoring
space. Together, these services deliver a comprehensive solution for collecting, analyzing, and acting on system-
generated logs from your applications and the Azure resources that support them. They can also work to monitor
critical on-premises resources in order to provide a hybrid monitoring environment. Understanding the tools and
data that are available is the first step in developing a complete monitoring strategy for your applications.
There are two fundamental types of logs in Azure Monitor:
Metrics are numerical values that describe some aspect of a system at a particular point in time. They are
lightweight and capable of supporting near real-time scenarios. For many Azure resources, you'll see data
collected by Azure Monitor right in their Overview page in the Azure portal. As an example, look at any
virtual machine and you'll see several charts displaying performance metrics. Select any of the graphs to
open the data in metrics explorer in the Azure portal, which allows you to chart the values of multiple
metrics over time. You can view the charts interactively or pin them to a dashboard to view them with other
visualizations.
Logs contain different kinds of data organized into records with different sets of properties for each type.
Events and traces are stored as logs along with performance data, which can all be combined for analysis.
Log data collected by Azure Monitor can be analyzed with queries to quickly retrieve, consolidate, and
analyze collected data. Logs are stored and queried from Log Analytics. You can create and test queries using
Log Analytics in the Azure portal and then either directly analyze the data using these tools or save queries
for use with visualizations or alert rules.
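As a hedged illustration of the metrics side, the sketch below retrieves a platform metric for a virtual machine with the azure-mgmt-monitor package. The resource ID, metric name, and time grain are assumptions; log data is instead queried from Log Analytics by using Kusto queries.

```python
# Sketch: retrieve the "Percentage CPU" platform metric for a virtual machine.
# Assumes the azure-identity and azure-mgmt-monitor packages; the resource ID
# and metric settings are illustrative only.
import os
from datetime import datetime, timedelta

from azure.identity import DefaultAzureCredential
from azure.mgmt.monitor import MonitorManagementClient

subscription_id = os.environ["AZURE_SUBSCRIPTION_ID"]
vm_resource_id = (
    f"/subscriptions/{subscription_id}/resourceGroups/rg-sharepoint-prod-001"
    "/providers/Microsoft.Compute/virtualMachines/vm-sharepoint-prod-001"
)  # assumed resource ID

client = MonitorManagementClient(DefaultAzureCredential(), subscription_id)
end = datetime.utcnow()
start = end - timedelta(hours=1)

metrics = client.metrics.list(
    vm_resource_id,
    timespan=f"{start.isoformat()}/{end.isoformat()}",
    interval="PT5M",
    metricnames="Percentage CPU",
    aggregation="Average",
)

for metric in metrics.value:
    for series in metric.timeseries:
        for point in series.data:
            print(point.time_stamp, point.average)
```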
Azure Monitor can collect data from a variety of sources. You can think of monitoring data for your applications in
tiers ranging from your application, any operating system, and the services it relies on, down to the Azure platform
itself. Azure Monitor collects data from each of the following tiers:
Application monitoring data: Data about the performance and functionality of the code you have written,
regardless of its platform.
Guest OS monitoring data: Data about the operating system on which your application is running. This OS
could be running in Azure, another cloud, or on-premises.
Azure resource monitoring data: Data about the operation of an Azure resource.
Azure subscription monitoring data: Data about the operation and management of an Azure subscription,
as well as data about the health and operation of Azure itself.
Azure tenant monitoring data: Data about the operation of tenant-level Azure services, such as Azure Active
Directory.
Custom sources: Logs sent from on-premises sources can be included as well. Examples include on-premises
server events or network device syslog output.
Monitoring data is only useful if it can increase your visibility into the operation of your computing environment.
Azure Monitor includes several features and tools that provide valuable insights into your applications and other
resources that they depend on. Monitoring solutions and features such as Application Insights and Azure Monitor
for containers provide deep insights into different aspects of your application and specific Azure services.
Monitoring solutions in Azure Monitor are packaged sets of logic that provide insights for a particular application
or service. They include logic for collecting monitoring data for the application or service, queries to analyze that
data, and views for visualization. Monitoring solutions are available from Microsoft and partners to provide
monitoring for various Azure services and other applications.
With all of this rich data collected, it's important to take proactive action on events happening in your environment
where manual queries alone won't suffice. Alerts in Azure Monitor proactively notify you of critical conditions and
potentially attempt to take corrective action. Alert rules based on metrics provide near real-time alerting based on
numeric values, while rules based on logs allow for complex logic across data from multiple sources. Alert rules in
Azure Monitor use action groups, which contain unique sets of recipients and actions that can be shared across
multiple rules. Based on your requirements, action groups can perform such actions as using webhooks that cause
alerts to start external actions or to integrate with your ITSM tools.
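As a rough illustration of how an alert rule and an action group fit together, here is a minimal sketch using the azure-mgmt-monitor management SDK. The subscription ID, resource group, VM scope, and action group ID are placeholders, and the model class names reflect the SDK as commonly published; treat this as an outline to adapt under those assumptions, not a finished implementation.

from azure.identity import DefaultAzureCredential
from azure.mgmt.monitor import MonitorManagementClient
from azure.mgmt.monitor.models import (
    MetricAlertAction,
    MetricAlertResource,
    MetricAlertSingleResourceMultipleMetricCriteria,
    MetricCriteria,
)

SUBSCRIPTION_ID = "<subscription-id>"                              # placeholder
RESOURCE_GROUP = "<resource-group>"                                # placeholder
VM_ID = "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Compute/virtualMachines/<vm-name>"  # placeholder scope
ACTION_GROUP_ID = "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/microsoft.insights/actionGroups/<name>"  # placeholder

client = MonitorManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Alert when average CPU exceeds 80 percent over a 5-minute window.
criteria = MetricAlertSingleResourceMultipleMetricCriteria(
    all_of=[
        MetricCriteria(
            name="HighCpu",
            metric_name="Percentage CPU",
            operator="GreaterThan",
            threshold=80,
            time_aggregation="Average",
        )
    ]
)

client.metric_alerts.create_or_update(
    RESOURCE_GROUP,
    "cpu-over-80",
    MetricAlertResource(
        location="global",
        severity=2,
        enabled=True,
        scopes=[VM_ID],
        evaluation_frequency="PT1M",
        window_size="PT5M",
        criteria=criteria,
        # The action group holds the shared set of recipients and actions
        # (email, webhook, ITSM connector, and so on) reused across rules.
        actions=[MetricAlertAction(action_group_id=ACTION_GROUP_ID)],
    ),
)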
Azure Monitor also allows the creation of custom dashboards. Azure dashboards allow you to combine different
kinds of data, including both metrics and logs, into a single pane in the Azure portal. You can optionally share the
dashboard with other Azure users. Elements throughout Azure Monitor can be added to an Azure dashboard in
addition to the output of any log query or metrics chart. For example, you could create a dashboard that combines
tiles that show a graph of metrics, a table of activity logs, a usage chart from Application Insights, and the output of
a log query.
Finally, Azure Monitor data is a native source for Power BI. Power BI is a business analytics service that provides
interactive visualizations across a variety of data sources and is an effective means of making data available to
others within and outside your organization. You can configure Power BI to automatically import log data from
Azure Monitor to take advantage of these additional visualizations.
Azure Network Watcher provides tools to monitor, diagnose, and view metrics and enable or disable logs for
resources in a virtual network in Azure. It's a multifaceted service that allows the following functionalities and
more:
Monitor communication between a virtual machine and an endpoint.
View resources in a virtual network and their relationships.
Diagnose network traffic filtering problems to or from a VM.
Diagnose network routing problems from a VM.
Diagnose outbound connections from a VM.
Capture packets to and from a VM.
Diagnose problems with a virtual network gateway and connections.
Determine relative latencies between Azure regions and internet service providers.
View security rules for a network interface.
View network metrics.
Analyze traffic to or from a network security group.
View diagnostic logs for network resources.
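For example, the next hop diagnostic can be driven from the azure-mgmt-network SDK, which is useful when you need to script route troubleshooting across many spokes. The subscription ID, resource group, Network Watcher name, and IP addresses below are illustrative placeholders; the default Network Watcher instance is commonly named NetworkWatcher_<region> in the NetworkWatcherRG resource group, but verify the names in your environment.

from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import NextHopParameters

SUBSCRIPTION_ID = "<subscription-id>"                      # placeholder
client = NetworkManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Ask Network Watcher which next hop a packet from a spoke VM would take.
result = client.network_watchers.begin_get_next_hop(
    "NetworkWatcherRG",                                    # assumed default resource group
    "NetworkWatcher_eastus2",                              # assumed regional instance name
    NextHopParameters(
        target_resource_id=(
            "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/"
            "Microsoft.Compute/virtualMachines/<vm-name>"
        ),
        source_ip_address="10.1.0.4",                      # illustrative spoke VM address
        destination_ip_address="10.2.0.4",                 # illustrative destination
    ),
).result()

print(result.next_hop_type, result.next_hop_ip_address, result.route_table_id)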
Component type: Workloads
Workload components are where your actual applications and services reside. It's where your application
development teams spend most of their time.
The workload possibilities are endless. The following are just a few of the possible workload types:
Internal applications: Line-of-business applications are critical to enterprise operations. These applications have
some common characteristics:
Interactive: Data is entered, and results or reports are returned.
Data-driven: Data intensive with frequent access to databases or other storage.
Integrated: Offer integration with other systems within or outside the organization.
Customer-facing web sites (internet-facing or internally facing): Most internet applications are web sites.
Azure can run a web site via either an IaaS virtual machine or an Azure Web Apps site (PaaS). Azure Web Apps
integrates with virtual networks to deploy web apps in a spoke network zone. Internally facing web sites don't need
to expose a public internet endpoint because the resources are accessible via private non-internet routable
addresses from the private virtual network.
Big data analytics: When data needs to scale up to larger volumes, relational databases may not perform well
under the extreme load or unstructured nature of the data. Azure HDInsight is a managed, full-spectrum, open-
source analytics service in the cloud for enterprises. You can use open-source frameworks such as Hadoop, Apache
Spark, Apache Hive, LLAP, Apache Kafka, Apache Storm, and R. HDInsight supports deploying into a location-based
virtual network and can be deployed to a cluster in a spoke of the virtual datacenter.
Events and Messaging: Azure Event Hubs is a big data streaming platform and event ingestion service. It can
receive and process millions of events per second. It provides low latency and configurable time retention, enabling
you to ingest massive amounts of data into Azure and read it from multiple applications. A single stream can
support both real-time and batch-based pipelines.
You can implement a highly reliable cloud messaging service between applications and services through Azure
Service Bus. It offers asynchronous brokered messaging between client and server, structured first-in-first-out
(FIFO) messaging, and publish/subscribe capabilities.
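As a small illustration of the brokered-messaging pattern, the sketch below sends and receives one message on a Service Bus queue with the azure-servicebus client library. The connection string and queue name are placeholders for values from your own namespace.

from azure.servicebus import ServiceBusClient, ServiceBusMessage

CONNECTION_STR = "<service-bus-connection-string>"   # placeholder
QUEUE_NAME = "orders"                                # illustrative queue name

client = ServiceBusClient.from_connection_string(CONNECTION_STR)

# Producer: enqueue a message for asynchronous, brokered delivery.
with client.get_queue_sender(QUEUE_NAME) as sender:
    sender.send_messages(ServiceBusMessage('{"orderId": 42, "status": "new"}'))

# Consumer: receive and settle messages independently of the producer.
with client.get_queue_receiver(QUEUE_NAME, max_wait_time=5) as receiver:
    for message in receiver:
        print(str(message))
        receiver.complete_message(message)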
These examples barely scratch the surface of the types of workloads you can create in Azure—everything from a
basic Web and SQL app to the latest in IoT, big data, machine learning, AI, and so much more.
High availability: multiple virtual datacenters
So far, this article has focused on the design of a single VDC, describing the basic components and architectures
that contribute to resiliency. Azure features such as Azure Load Balancer, NVAs, availability zones, availability sets,
and scale sets help you build solid SLA levels into your production services.
However, because a virtual datacenter is typically implemented within a single region, it might be vulnerable to
outages that affect the entire region. Customers that require high availability must protect the services through
deployments of the same project in two or more VDC implementations deployed to different regions.
In addition to SLA concerns, several common scenarios benefit from running multiple virtual datacenters:
Regional or global presence of your end users or partners.
Disaster recovery requirements.
A mechanism to divert traffic between datacenters for load or performance.
Regional/global presence
Azure datacenters exist in many regions worldwide. When selecting multiple Azure datacenters, consider two
related factors: geographical distances and latency. To optimize user experience, evaluate the distance between each
virtual datacenter as well as the distance from each virtual datacenter to the end users.
An Azure region that hosts your virtual datacenter must conform with regulatory requirements of any legal
jurisdiction under which your organization operates.
Disaster recovery
The design of a disaster recovery plan depends on the types of workloads and the ability to synchronize state of
those workloads between different VDC implementations. Ideally, most customers desire a fast fail-over
mechanism, and this requirement may need application data synchronization between deployments running in
multiple VDC implementations. However, when designing disaster recovery plans, it's important to consider that
most applications are sensitive to the latency that can be caused by this data synchronization.
Synchronization and heartbeat monitoring of applications in different VDC implementations requires them to
communicate over the network. Multiple VDC implementations in different regions can be connected through:
Hub-to-hub communication built into Azure Virtual WAN hubs across regions in the same Virtual WAN.
Virtual network peering to connect hubs across regions.
ExpressRoute private peering, when the hubs in each VDC implementation are connected to the same
ExpressRoute circuit.
Multiple ExpressRoute circuits connected via your corporate backbone, and your multiple VDC implementations
connected to the ExpressRoute circuits.
Site-to-site VPN connections between the hub zone of your VDC implementations in each Azure region.
Typically, Virtual WAN hubs, virtual network peering, or ExpressRoute connections are preferred for network
connectivity, due to the higher bandwidth and consistent latency levels when passing through the Microsoft
backbone.
Run network qualification tests to verify the latency and bandwidth of these connections, and decide whether
synchronous or asynchronous data replication is appropriate based on the result. It's also important to weigh these
results in view of the optimal recovery time objective (RTO).
Disaster recovery: diverting traffic from one region to another
Both Azure Traffic Manager and Azure Front Door periodically check the service health of listening endpoints in
different VDC implementations and, if those endpoints fail, route automatically to the next closest VDC. Traffic
Manager uses real-time user measurements and DNS to route users to the closest VDC (or to the next closest during a failure).
Azure Front Door is a reverse proxy at over 100 Microsoft backbone edge sites, using anycast to route users to the
closest listening endpoint.
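Both services depend on each regional deployment exposing a health endpoint that the probe can reach. A minimal sketch of such an endpoint, using only the Python standard library, is shown below; the /health path and port are assumed conventions that you would mirror in the Traffic Manager or Front Door probe configuration.

from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Traffic Manager or Front Door probes this path; return 200 only
        # when the regional deployment is actually able to serve traffic.
        if self.path == "/health":
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"OK")
        else:
            self.send_response(404)
            self.end_headers()


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()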
Summary
A virtual datacenter approach to datacenter migration creates a scalable architecture that optimizes Azure resource
use, lowers costs, and simplifies system governance. The virtual datacenter is typically based on hub and spoke
network topologies (using either virtual network peering or Virtual WAN hubs). Common shared services are provided
in the hub, and specific applications and workloads are deployed in the spokes. The virtual datacenter also matches
the structure of company roles, where different departments such as Central IT, DevOps, and Operations and
Maintenance all work together while performing their specific roles. The virtual datacenter supports migrating
existing on-premises workloads to Azure, but also provides many advantages to cloud-native deployments.
References
Learn more about the Azure capabilities discussed in this document.
Network features
Azure Virtual Networks
Network Security Groups
Service Endpoints
Private Link
User-Defined Routes
Network Virtual Appliances
Public IP Addresses
Azure DNS
Load balancing
Azure Front Door
Azure Load Balancer (L4)
Application Gateway (L7)
Azure Traffic Manager
Connectivity
Virtual Network Peering
Virtual Private Network
Virtual WAN
ExpressRoute
ExpressRoute Direct
Identity
Azure Active Directory
Multi-Factor Authentication
Role-Based Access Control
Default Azure AD Roles
Monitoring
Network Watcher
Azure Monitor
Log Analytics
Best practices
Management Group
Subscription Management
Resource Group Management
Azure Subscription Limits
Security
Azure Firewall
Firewall Manager
Application Gateway WAF
Front Door WAF
Azure DDoS
Other Azure services
Azure Storage
Azure SQL
Azure Web Apps
Azure Cosmos DB
HDInsight
Event Hubs
Service Bus
Azure IoT
Azure Machine Learning
Next steps
Learn more about virtual network peering, the core technology of hub and spoke topologies.
Implement Azure Active Directory to use role-based access control.
Develop a subscription and resource management model using role-based access control that fits the structure,
requirements, and policies of your organization. The most important activity is planning. Analyze how
reorganizations, mergers, new product lines, and other considerations will affect your initial models to ensure
you can scale to meet future needs and growth.
Perimeter networks
Perimeter networks enable secure connectivity between your cloud networks and your on-premises or physical
datacenter networks, along with any connectivity to and from the internet. A perimeter network is sometimes
called a demilitarized zone or DMZ.
For perimeter networks to be effective, incoming packets must flow through security appliances hosted in secure
subnets before reaching back-end servers. Examples include the firewall, intrusion detection systems, and intrusion
prevention systems. Before they leave the network, internet-bound packets from workloads should also flow
through the security appliances in the perimeter network. The purposes of this flow are policy enforcement,
inspection, and auditing.
Perimeter networks make use of the following Azure features and services:
Virtual networks, user-defined routes, and network security groups
Network virtual appliances (NVAs)
Azure Load Balancer
Azure Application Gateway and Web Application Firewall (WAF)
Public IPs
Azure Front Door with Web Application Firewall
Azure Firewall
NOTE
Azure reference architectures provide example templates that you can use to implement your own perimeter networks:
Implement a perimeter network between Azure and your on-premises datacenter
Implement a perimeter network between Azure and the internet
Usually, your Central IT team and security teams are responsible for defining requirements for operating your
perimeter networks.
Figure 1: Example of a hub and spoke network topology.
The diagram above shows an example hub and spoke network topology that implements enforcement of two
perimeters with access to the internet and an on-premises network. Both perimeters reside in the DMZ hub. In the
DMZ hub, the perimeter network to the internet can scale up to support many lines of business via multiple farms
of WAFs and Azure Firewall instances that help protect the spoke virtual networks. The hub also allows for
connectivity via VPN or Azure ExpressRoute as needed.
Virtual networks
Perimeter networks are typically built using a virtual network with multiple subnets to host the different types of
services that filter and inspect traffic to or from the internet via NVAs, WAFs, and Azure Application Gateway
instances.
User-defined routes
By using user-defined routes, customers can deploy firewalls, intrusion detection systems, intrusion prevention
systems, and other virtual appliances. Customers can then route network traffic through these security appliances
for security boundary policy enforcement, auditing, and inspection. User-defined routes can be created to
guarantee that traffic passes through the specified custom VMs, NVAs, and load balancers.
In a hub and spoke network example, guaranteeing that traffic generated by virtual machines that reside in the
spoke passes through the correct virtual appliances in the hub requires a user-defined route defined in the subnets
of the spoke. This route sets the front-end IP address of the internal load balancer as the next hop. The internal load
balancer distributes the internal traffic to the virtual appliances (load balancer back-end pool).
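The following sketch shows how that route and its subnet association might be scripted with the azure-mgmt-network SDK. The resource names, region, and the internal load balancer front-end IP address are illustrative placeholders, and the 0.0.0.0/0 prefix is just one common choice; adjust both to your own routing design.

from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import Route, RouteTable

SUBSCRIPTION_ID = "<subscription-id>"   # placeholder
RG = "<spoke-resource-group>"           # placeholder
VNET = "<spoke-vnet>"                   # placeholder
SUBNET = "<spoke-subnet>"               # placeholder
LOCATION = "eastus2"                    # illustrative region
ILB_FRONTEND_IP = "10.0.2.4"            # illustrative internal LB front-end IP

client = NetworkManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Route spoke traffic to the internal load balancer fronting the NVAs in the hub.
route_table = client.route_tables.begin_create_or_update(
    RG,
    "rt-spoke-to-hub",
    RouteTable(
        location=LOCATION,
        routes=[
            Route(
                name="default-via-nva",
                address_prefix="0.0.0.0/0",
                next_hop_type="VirtualAppliance",
                next_hop_ip_address=ILB_FRONTEND_IP,
            )
        ],
    ),
).result()

# Associate the route table with the spoke subnet.
subnet = client.subnets.get(RG, VNET, SUBNET)
subnet.route_table = route_table
client.subnets.begin_create_or_update(RG, VNET, SUBNET, subnet).result()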
Azure Firewall
Azure Firewall is a managed cloud-based service that helps protect your Azure Virtual Network resources. It's a
fully stateful managed firewall with built-in high availability and unrestricted cloud scalability. You can centrally
create, enforce, and log application and network connectivity policies across subscriptions and virtual networks.
Azure Firewall uses a static public IP address for your virtual network resources. It allows outside firewalls to
identify traffic that originates from your virtual network. The service interoperates with Azure Monitor for logging
and analytics.
Network virtual appliances
Perimeter networks with access to the internet are typically managed through an Azure Firewall instance or a farm
of firewalls or web application firewalls.
Different lines of business commonly use many web applications. These applications tend to suffer from various
vulnerabilities and potential exploits. A Web Application Firewall detects attacks against web applications (HTTP/S)
in more depth than a generic firewall. Compared with traditional firewall technology, web application firewalls have a
set of specific features to help protect internal web servers from threats.
An Azure Firewall instance and a network virtual appliance (NVA) firewall use a common administration plane
with a set of security rules to help protect the workloads hosted in the spokes and control access to on-premises
networks. Azure Firewall has built-in scalability, whereas NVA firewalls can be manually scaled behind a load
balancer.
A firewall farm typically has less specialized software compared with a WAF, but it has a broader application scope
to filter and inspect any type of egress and ingress traffic. If you use an NVA approach, you can find and deploy
the software from the Azure Marketplace.
Use one set of Azure Firewall instances (or NVAs) for traffic that originates on the internet and another set for
traffic that originates on-premises. Using only one set of firewalls for both is a security risk because it provides no
security perimeter between the two sets of network traffic. Using separate firewall layers reduces the complexity of
checking security rules and makes clear which rules correspond to which incoming network requests.
Public IPs
With some Azure features, you can associate service endpoints to a public IP address so that your resource can be
accessed from the internet. This endpoint uses network address translation (NAT) to route traffic to the internal
address and port on the Azure Virtual Network. This path is the primary way for external traffic to pass into the
virtual network. You can configure public IP addresses to determine what traffic is passed in, and how and where
it's translated onto the virtual network.
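For reference, a Standard-SKU static public IP of the kind used to front perimeter services can be created with a few lines of the azure-mgmt-network SDK, as sketched below; the subscription, resource group, name, and region are placeholders.

from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import PublicIPAddress, PublicIPAddressSku

SUBSCRIPTION_ID = "<subscription-id>"            # placeholder
RESOURCE_GROUP = "<perimeter-resource-group>"    # placeholder

client = NetworkManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# A Standard-SKU static public IP suitable for fronting a firewall,
# application gateway, or load balancer in the perimeter network.
public_ip = client.public_ip_addresses.begin_create_or_update(
    RESOURCE_GROUP,
    "pip-perimeter-ingress",                     # illustrative name
    PublicIPAddress(
        location="eastus2",                      # illustrative region
        sku=PublicIPAddressSku(name="Standard"),
        public_ip_allocation_method="Static",
        public_ip_address_version="IPv4",
    ),
).result()

print(public_ip.ip_address)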
Hub and spoke is a networking model for efficiently managing common communication or security requirements.
It also helps avoid Azure subscription limitations. This model addresses the following concerns:
Cost savings and management efficiency. Centralizing services that can be shared by multiple workloads,
such as network virtual appliances (NVAs) and DNS servers, in a single location allows IT to minimize
redundant resources and management effort.
Overcoming subscription limits. Large cloud-based workloads might require using more resources than
are allowed in a single Azure subscription. Peering workload virtual networks from different subscriptions to a
central hub can overcome these limits. For more information, see Azure subscription limits.
Separation of concerns. You can deploy individual workloads between Central IT teams and workload teams.
Smaller cloud estates might not benefit from the added structure and capabilities that this model offers. But larger
cloud adoption efforts should consider implementing a hub and spoke networking architecture if they have any of
the concerns listed previously.
NOTE
The Azure reference architectures site contains example templates that you can use as the basis for implementing your own
hub and spoke networks:
Implement a hub and spoke network topology in Azure
Implement a hub and spoke network topology with shared services in Azure
Overview
Figure 1: Example of a hub and spoke network topology.
As shown in the diagram, Azure supports two types of hub and spoke design: a design based on communication, shared
resources, and centralized security policy (labeled as VNet hub in the diagram), or a design based on Azure Virtual
WAN (labeled as Virtual WAN in the diagram) for large-scale branch-to-branch and branch-to-Azure
communications.
A hub is a central network zone that controls and inspects ingress or egress traffic between zones: internet, on-
premises, and spokes. The hub and spoke topology gives your IT department an effective way to enforce security
policies in a central location. It also reduces the potential for misconfiguration and exposure.
The hub often contains the common service components that the spokes consume. The following examples are
common central services:
The Windows Server Active Directory infrastructure, required for user authentication of third parties that gain
access from untrusted networks before they get access to the workloads in the spoke. It includes the related
Active Directory Federation Services (AD FS).
A DNS service to resolve naming for the workload in the spokes, to access resources on-premises and on the
internet if Azure DNS isn't used.
A public key infrastructure (PKI), to implement single sign-on on workloads.
Flow control of TCP and UDP traffic between the spoke network zones and the internet.
Flow control between the spokes and on-premises.
If needed, flow control between one spoke and another.
You can minimize redundancy, simplify management, and reduce overall cost by using the shared hub
infrastructure to support multiple spokes.
The role of each spoke can be to host different types of workloads. The spokes also provide a modular approach
for repeatable deployments of the same workloads. Examples include dev/test, user acceptance testing, staging,
and production.
The spokes can also segregate and enable different groups within your organization. An example is Azure DevOps
groups. Inside a spoke, it's possible to deploy a basic workload or complex multitier workloads with traffic control
between the tiers.
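To show how the hub and its spokes are wired together, the sketch below creates the two directional peerings between a hub and a spoke virtual network with the azure-mgmt-network SDK. The subscription, resource group, and virtual network names are placeholders, and the gateway-transit flags assume a VPN or ExpressRoute gateway already exists in the hub.

from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import SubResource, VirtualNetworkPeering

SUBSCRIPTION_ID = "<subscription-id>"   # placeholder
RG = "<network-resource-group>"         # placeholder
HUB_VNET = "vnet-hub"                   # illustrative names
SPOKE_VNET = "vnet-spoke-workload"

client = NetworkManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

hub = client.virtual_networks.get(RG, HUB_VNET)
spoke = client.virtual_networks.get(RG, SPOKE_VNET)

# Peering is directional, so it is created once from each side.
client.virtual_network_peerings.begin_create_or_update(
    RG, HUB_VNET, "hub-to-spoke",
    VirtualNetworkPeering(
        remote_virtual_network=SubResource(id=spoke.id),
        allow_forwarded_traffic=True,   # let NVA-forwarded traffic through
        allow_gateway_transit=True,     # spokes may use the hub's gateway
    ),
).result()

client.virtual_network_peerings.begin_create_or_update(
    RG, SPOKE_VNET, "spoke-to-hub",
    VirtualNetworkPeering(
        remote_virtual_network=SubResource(id=hub.id),
        allow_forwarded_traffic=True,
        use_remote_gateways=True,       # assumes a gateway exists in the hub
    ),
).result()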
Building a cost-conscious organization requires visibility and properly defined access (or scope) to cost-related
data. This best-practice article outlines decisions and implementation approaches to creating tracking
mechanisms.
During the Ready phase of a migration journey, the objective is to prepare for the journey ahead. This phase is
accomplished in two primary areas: organizational readiness and environmental (technical) readiness. Each area
might require new skills for both technical and nontechnical contributors. The following sections describe a few
options to help build the necessary skills.
Microsoft Learn
Microsoft Learn is a new approach to learning. Readiness for the new skills and responsibilities that come with
cloud adoption doesn't come easily. Microsoft Learn provides a more rewarding approach to hands-on learning
that helps you achieve your goals faster. Earn points and levels and achieve more.
The following examples are a few tailored learning paths on Microsoft Learn, which align to the Ready
methodology of the Cloud Adoption Framework:
Azure fundamentals: Learn cloud concepts such as high availability, scalability, elasticity, agility, fault tolerance, and
disaster recovery. Understand the benefits of cloud computing in Azure and how it can save you time and money.
Compare and contrast basic strategies for transitioning to the Azure cloud. Explore the breadth of services
available in Azure including compute, network, storage, and security.
Manage resources in Azure: Learn how to work with the Azure command line and web portal to create, manage,
and control cloud-based resources.
Administer infrastructure resources in Azure: Learn how to create, manage, secure, and scale virtual machine
resources.
Store data in Azure: Azure provides a variety of ways to store data: unstructured, archival, relational, and more.
Learn the basics of storage management in Azure, how to create a storage account, and how to choose the right
model for the data you want to store in the cloud.
Architect great solutions in Azure: Learn how to design and build secure, scalable, and high-performing solutions
in Azure by examining the core principles found in sound architecture.
Learn more
For additional learning paths, browse the Microsoft Learn catalog. Use the Roles filter to align learning paths with
your role.
Cloud migration in the Cloud Adoption Framework
Any enterprise-scale cloud adoption plan will include workloads that do not warrant significant investments in
the creation of new business logic. Those workloads could be moved to the cloud through any number of
approaches: lift and shift; lift and optimize; or modernize. Each of these approaches is considered a migration. The
following exercises will help establish the iterative processes to assess, migrate, optimize, secure, and manage
those workloads.
To prepare you for this phase of the cloud adoption lifecycle, we recommend the following:
The Migrate methodology and the steps above build on the following assumptions:
The methodology governing migration sprints fits within migration waves or releases, which are defined
using the Plan, Ready, and Adopt methodologies. Within each migration sprint, a batch of workloads is
migrated to the cloud.
Before migrating workloads, at least one landing zone has been identified, configured, and deployed to meet
the needs of the near-term cloud adoption plan.
Migration is commonly associated with the terms lift and shift or rehost. This methodology and the above
steps are built on the belief that no datacenter and few workloads should be migrated using a pure rehost
approach. While many workloads can be rehosted, customers more often choose to modernize specific assets
within each workload. During this iterative process, the balance between speed and modernization is a
common discussion point.
Migration effort
The effort required to migrate workloads generally falls into three types of effort (or phases) for each workload:
assess workloads, deploy workloads, and release workloads. This section of the Cloud Adoption Framework
teaches readers how to maximize the return from each phase required to migrate a workload to production.
In a standard two-week long iteration, an experienced migration team can complete this process for 2-5
workloads of low-medium complexity. More complex workloads, such as SAP, may take several two-week
iterations to complete all three phases of migration effort for a single workload. Experience and complexity both
have a significant impact on timelines and migration velocity.
The following bullets provide an overview of the phases of this process (pictured above):
Assess workloads: Assess workloads to evaluate cost, modernization, and deployment tooling. This process
focuses on validating or challenging the assumptions made during earlier discovery and assessments by
looking more closely at rationalization options. This is also when user patterns and dependencies are studied
more closely to ensure workloads will achieve technical success after migration.
Deploy workloads: After workloads are assessed, the existing functionality of those workloads is replicated
(or improved) in the cloud. This could involve a lift and shift or rehost to the cloud. But more commonly
during this phase, many of the assets supporting these workloads will be modernized to capitalize on the
benefits of the cloud.
Release workloads: Once functionality is replicated to the cloud, workloads can be tested, optimized,
documented, and released for ongoing operations. Critical during this process is the effort to review the
migrated workloads and hand them off to governance, operations management, and security teams for
ongoing support of those workloads.
NOTE
In some early iterations of migration effort, it is common to limit scope to a single workload. This approach maximizes skills
retention and provides the team with more time to experiment and learn.
NOTE
When building a migration factory, some teams may choose to disperse each of the above phases across multiple teams
and multiple sprints. This approach can improve repeatability and accelerate migration efforts.
Next steps
The steps outlined above, and subsequent guidance in the Migrate methodology, can help you develop skills to
improve processes within each migration sprint. The Azure migration guide is a brief series of articles that
outlines the most common tools and approaches needed during your first migration wave.
Azure migration guide
Azure migration guide overview
The Cloud Adoption Framework's Migrate methodology guides readers through an iterative process of migrating
one workload, or a small collection of workloads, per release. In each iteration, the process of assess, migrate,
and optimize and promote is followed to ensure that workloads are ready to meet production demands. That
cloud-agnostic process can guide migration to any cloud provider.
This guide demonstrates a simplified version of that process when migrating from your on-premises
environment to Azure.
TIP
For an interactive experience, view this guide in the Azure portal. Go to the Azure Quickstart Center in the Azure portal,
select Azure migration guide, and then follow the step-by-step instructions.
Migration tools
When to use this guide
This guide is the suggested path for your first migration to Azure, as it will expose you to the methodology and
the cloud-native tools most commonly used during migration to Azure. Those tools are presented across the
following pages:
Assess each workload's technical fit. Validate the technical readiness and suitability for migration.
Migrate your services. Perform the actual migration by replicating on-premises resources to Azure.
Manage costs and billing. Understand the tools required to control costs in Azure.
Optimize and promote. Optimize for cost and performance balance before promoting your workload to
production.
Get assistance. Get help and support during your migration or post-migration activities.
It is assumed that a landing zone has already been deployed, in alignment with the best practices in the Cloud
Adoption Framework's Ready methodology.
Assess workloads and refine plans
The resources in this guide help you assess each workload, challenge assumptions about each workload's
suitability for migration, and finalize architectural decisions about migration options.
Tools
Challenge assumptions
Scenarios and stakeholders
Timelines
Cost management
If you didn't follow the guidance in the links above, you will likely need data and an assessment tool to make
informed migration decisions. Azure Migrate is the native tool for assessing and migrating to Azure. If you haven't
already, use these steps to create a new server migration project and collect the necessary data.
Azure Migrate
Azure Migrate assesses on-premises infrastructure, applications, and data for migration to Azure. This service:
Assesses the migration suitability of on-premises assets.
Performs performance-based sizing.
Provides cost estimates for running on-premises assets in Azure.
If you're considering a lift and shift approach, or are in the early assessment stages of migration, this service is for
you. After completing the assessment, use Azure Migrate to execute the migration.
Learn more
Azure Migrate overview
Migrate physical or virtualized servers to Azure
Azure Migrate in the Azure portal
Service Map
Service Map automatically discovers application components on Windows and Linux systems and maps the
communication between services. With Service Map, you can view your servers in the way that you think of them:
as interconnected systems that deliver critical services. Service Map shows connections between servers,
processes, inbound and outbound connection latency, and ports across any TCP-connected architecture, with no
configuration required other than the installation of an agent.
Azure Migrate uses Service Map to enhance the reporting capabilities and dependencies across the environment.
For full details of this integration, see Dependency visualization. If you use the Azure Migrate service, then no
additional steps are required to configure and obtain the benefits of Service Map. The following instructions are
provided for your reference if you'd like to use Service Map for other purposes or projects.
Enable dependency visualization using Service Map
To use dependency visualization, download and install agents on each on-premises machine that you want to
analyze.
Microsoft Monitoring Agent must be installed on each machine.
The Microsoft Dependency Agent must be installed on each machine.
Also, if you have machines with no internet connectivity, download and install Log Analytics gateway on those
machines.
Learn more
Using Service Map solution in Azure
Azure Migrate and Service Map: Dependency visualization
Deploy workloads and assets (infrastructure, apps,
and data)
In this phase of the journey, you use the output of the Assess phase to initiate the migration of the environment.
This guide helps identify the appropriate tools to reach a completed state. You'll explore native tools, third-party
tools, and project management tools.
Native migration tools
Third-party migration tools
Project management tools
Cost management
The following sections describe the native Azure tools available to perform or assist with migration. For
information on choosing the right tools to support your migration efforts, see the Cloud Adoption Framework's
migration tools decision guide.
Azure Migrate
Azure Migrate delivers a unified and extensible migration experience. Azure Migrate provides a one-stop, dedicated
experience to track your migration journey across the phases of assessment and migration to Azure. It provides
you the option to use the tools of your choice and track the progress of migration across these tools.
Azure Migrate is a centralized hub to assess and migrate on-premises servers, infrastructure, applications, and data
to Azure. It provides the following functionality:
Unified platform with assessment, migration, and progress tracking.
Enhanced assessment and migration capabilities:
On-premises servers including Hyper-V & VMware.
Agentless migration of VMware virtual machines to Azure.
Database migrations to Azure SQL Database or SQL Managed Instance
Web applications
Virtual desktop infrastructure (VDI) to Windows Virtual Desktop in Azure
Large data collections using Azure Data Box products
Extensible approach with ISV integration (such as Cloudamize).
To perform a migration using Azure Migrate, follow these steps:
1. Search for Azure Migrate under All services. Select Azure Migrate to continue.
2. Select Add a tool to start your migration project.
3. Select the subscription, resource group, and geography to host the migration.
4. Select Select assessment tool > Azure Migrate: Server Assessment > Next.
5. Select Review + add tools, and verify the configuration. Select Add tools to initiate the job to create the
migration project and register the selected solutions.
NOTE
For guidance specific to your scenario, refer to the tutorials and Azure Migrate documentation.
Learn more
About Azure Migrate
Azure Migrate tutorial - migrate physical or virtualized servers to Azure
Azure Database Migration Service
Azure Database Migration Service is a fully managed service that enables seamless migrations from multiple
database sources to Azure data platforms, with minimal downtime (online migrations). Database Migration Service
performs all of the required steps. You can initiate your migration projects assured that the process takes
advantage of best practices recommended by Microsoft.
Create an Azure Database Migration Service instance
If this is the first time using Azure Database Migration Service, you need to register the resource provider for your
Azure subscription:
1. Select All services > Subscriptions, and choose the target subscription.
2. Select Resource providers.
3. Search for migration, and then to the right of Microsoft.DataMigration, select Register.
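The same registration can also be scripted. A minimal sketch using the azure-mgmt-resource SDK follows; the subscription ID is a placeholder.

from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

SUBSCRIPTION_ID = "<subscription-id>"   # placeholder

client = ResourceManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Equivalent to the portal steps above: register the provider once per
# subscription, then check until registration completes.
client.providers.register("Microsoft.DataMigration")

provider = client.providers.get("Microsoft.DataMigration")
print(provider.registration_state)      # "Registering", then "Registered"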
After you register the resource provider, you can create an instance of Azure Database Migration Service.
1. Select + Create a resource and search the marketplace for Azure Database Migration Service.
2. Complete the Create Migration Service Wizard, then select Create.
The service is now ready to migrate the supported source databases to target platforms such as SQL Server,
MySQL, PostgreSQL, or MongoDB.
NOTE
For large migrations (in terms of number and size of databases), we recommend that you use Azure Database Migration
Service, which can migrate databases at scale.
Now that you have migrated your services to Azure, the next phase includes reviewing the solution for possible
areas of optimization. This effort could include reviewing the design of the solution, right-sizing the services, and
analyzing costs.
This phase is also an opportunity to optimize your environment and perform possible transformations of the
environment. For example, you may have performed a "rehost" migration, and now that your services are running
on Azure you can revisit the solution's configuration or consumed services, and possibly perform some
"refactoring" to modernize and increase the functionality of your solution.
The remainder of this article focuses on tools for optimizing the migrated workload. When the proper balance
between performance and cost has been reached, a workload is ready to be promoted to production. For guidance
on promotion options, see the process improvement articles on optimize and promote.
Right-size assets
Cost management
All Azure services that provide a consumption-based cost model can be resized through the Azure portal, CLI, or
PowerShell. The first step in correctly sizing a service is to review its usage metrics. The Azure Monitor service
provides access to these metrics. You may need to configure the collection of the metrics for the service you're
analyzing, and allow an appropriate time to collect meaningful data based on your workload patterns.
1. Go to Monitor.
2. Select Metrics and configure the chart to show the metrics for the service to analyze.
The following are some common services that you can resize.
Resize a virtual machine
Azure Migrate performs a right-sizing analysis as part of its pre-migration Assess phase, and virtual machines
migrated using this tool will likely already be sized based on your pre-migration requirements.
However, for virtual machines created or migrated using other methods, or in cases where your post-migration
virtual machine requirements need adjustment, you may want to further refine your virtual machine sizing.
1. Go to Virtual machines.
2. Select the desired virtual machine from the list.
3. Select Size and the desired new size from the list. You may need to adjust the filters to find the size you need.
4. Select Resize.
Resizing production virtual machines can cause service disruptions. Try to apply the correct sizing for your VMs
before you promote them to production.
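If you prefer to script the resize, the sketch below does the equivalent with the azure-mgmt-compute SDK, checking the available sizes first. The subscription, resource group, VM name, and target size are placeholders; as noted above, the operation can briefly disrupt a running VM.

from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

SUBSCRIPTION_ID = "<subscription-id>"   # placeholder
RG = "<resource-group>"                 # placeholder
VM_NAME = "<vm-name>"                   # placeholder
NEW_SIZE = "Standard_D2s_v3"            # illustrative target size

client = ComputeManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Confirm the target size is available for this VM before resizing.
available = [s.name for s in client.virtual_machines.list_available_sizes(RG, VM_NAME)]
if NEW_SIZE not in available:
    raise ValueError(f"{NEW_SIZE} is not available for this VM")

# Apply the new size; the platform may deallocate and restart the VM.
vm = client.virtual_machines.get(RG, VM_NAME)
vm.hardware_profile.vm_size = NEW_SIZE
client.virtual_machines.begin_create_or_update(RG, VM_NAME, vm).result()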
The cloud introduces a few shifts in how we work, regardless of our role on the technology team. Cost is a great
example of this shift. In the past, only finance and IT leadership were concerned with the cost of IT assets
(infrastructure, apps, and data). The cloud empowers every member of IT to make and act on decisions that better
support the end user. However, with that power comes the responsibility to be cost conscious when making those
decisions.
This article introduces the tools that can help make wise cost decisions before, during, and after a migration to
Azure.
The tools in this article include:
Azure Migrate
Azure pricing calculator
Azure TCO calculator
Azure Cost Management and Billing
Azure Advisor
The processes described in this article may also require a partnership with IT managers, finance, or line-of-business
application owners.
Estimate VM costs prior to migration
Estimate and optimize VM costs during and after migration
Tips and tricks to optimize costs
Prior to migration of any asset (infrastructure, app, or data), there is an opportunity to estimate costs and refine
sizing based on observed performance criteria for those assets. Estimating costs serves two purposes: it allows for
cost control, and it provides a checkpoint to ensure that current budgets account for necessary performance
requirements.
Cost calculators
For manual cost calculations, there are two handy calculators that can provide a quick cost estimate based on the
architecture of the workload to be migrated.
The Azure pricing calculator provides cost estimates for the Azure products you select.
Sometimes decisions require a comparison of the future cloud costs and the current on-premises costs. The
total cost of ownership (TCO) calculator can provide such a comparison.
These manual cost calculators can be used on their own to forecast potential spend and savings. They can also be
used in conjunction with the cost forecasting tools of Azure Migrate to adjust the cost expectations to fit alternative
architectures or performance constraints.
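Alongside the calculators, the public Azure Retail Prices API (https://prices.azure.com/api/retail/prices) can be queried from a script for rough list-price lookups. The sketch below is a minimal example; the region and SKU in the filter are illustrative, and the prices returned are pay-as-you-go retail rates, not your negotiated or reserved pricing.

import requests

# Query pay-as-you-go list prices for a candidate VM size in a target region.
url = "https://prices.azure.com/api/retail/prices"
params = {
    "$filter": (
        "serviceName eq 'Virtual Machines' "
        "and armRegionName eq 'eastus2' "
        "and armSkuName eq 'Standard_D2s_v3' "
        "and priceType eq 'Consumption'"
    )
}

items = requests.get(url, params=params, timeout=30).json().get("Items", [])
for item in items:
    print(item["meterName"], item["retailPrice"], item["unitOfMeasure"])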
Azure Migrate calculations
Prerequisites: The remainder of this section assumes the reader has already populated Azure Migrate with a
collection of assets (infrastructure, apps, and data) to be migrated. The prior article on assessments provides
instructions on collecting the initial data. Once the data is populated, follow the next few steps to estimate monthly
costs based on the data collected.
Azure Migrate calculates monthly cost estimates based on data captured by the collector and Service Map. The
following steps will load the cost estimates:
1. Navigate to Azure Migrate assessment in the portal.
2. In the project Overview page, select + Create assessment.
3. Select View all to review the assessment properties.
4. Create the group, and specify a group name.
5. Select the machines that you want to add to the group.
6. Select Create assessment to create the group and the assessment.
7. After the assessment is created, view it in Overview > Dashboard.
8. In the Assessment details section of the portal navigation, select Cost details.
The resulting estimate identifies the monthly costs of compute and storage, which often represent the largest
portion of cloud costs.
We know that getting the right support at the right time will accelerate your migration efforts. Review the
assistance avenues below to meet your needs.
Support plans
Partners
Microsoft Support
Microsoft offers a basic support plan to all Azure customers. You have 24x7 access to billing and subscription
support, online self-help, documentation, whitepapers, and support forums.
If you need help from Microsoft Support while using Azure, follow these steps to create a support request:
1. Select Help + support in the Azure portal.
2. Select New support request to enter details about your issue and contact support.
Online communities
The following online communities provide community-based support:
MSDN forums
Stack Overflow
The One Migration approach to migrating the IT
portfolio
Azure and Azure Migrate are both well known for hosting Microsoft technologies. But you might not be aware of
Azure's ability to support migrations beyond Windows and SQL Server. The One Migration scenarios captured in
the Migrate methodology demonstrate the same set of consistent guidelines and processes for migrating both
Microsoft and third-party technologies.
Migration scenarios
The following diagram and table outline a number of scenarios that follow the same iterative Migrate methodology
for migration and modernization.
Migration methodology
In each of the preceding migration scenarios, the same basic process will guide your efforts as you move your
existing workloads to the cloud, as shown here:
In each scenario, you'll structure migration waves to guide the releases of multiple workloads. Establishing a cloud
adoption plan and Azure landing zones through the Plan and Ready methodologies helps to add structure to your
migration waves.
During each iteration, follow the Migrate methodology to assess, deploy, and release workloads. To modify those
processes to fit your organization's specific scenario, select any of the migration scenarios listed in the table.
Next steps
If you aren't migrating a specific scenario, start by following the four-step Cloud Adoption Framework migration
process.
Overview of application migration examples for
Azure
This section of the Cloud Adoption Framework for Azure provides examples of several common migration
scenarios and demonstrates how you can migrate on-premises infrastructure to Microsoft Azure.
Introduction
Azure provides access to a comprehensive set of cloud services. As developers and IT professionals, you can use
these services to build, deploy, and manage applications on a range of tools and frameworks through a global
network of datacenters. As your business faces challenges associated with the digital shift, the Azure platform
helps you to figure out how to:
Optimize resources and operations.
Engage with your customers and employees.
Transform your products.
The cloud provides advantages for speed and flexibility, minimized costs, performance, and reliability. But many
organizations will need to continue to run on-premises datacenters. In response to cloud adoption barriers, Azure
provides a hybrid cloud strategy that builds bridges between your on-premises datacenters and the Azure public
cloud. An example is using Azure cloud resources like Azure Backup to protect on-premises resources or Azure
analytics to gain insights into on-premises workloads.
As part of the hybrid cloud strategy, Azure provides growing solutions for migrating on-premises applications and
workloads to the cloud. With simple steps, you can comprehensively assess your on-premises resources to figure
out how they'll run in the Azure platform. Then, with a deep assessment in hand, you can confidently migrate
resources to Azure. When resources are up and running in Azure, you can optimize them to retain and improve
access, flexibility, security, and reliability.
Migration patterns
Strategies for migration to the cloud fall into four broad patterns: rehost, refactor, rearchitect, or rebuild. The
strategy you adopt depends on your business drivers and migration goals. You might adopt multiple patterns. For
example, you could choose to rehost noncritical applications while rearchitecting applications that are more
complex and business-critical. Let's look at these patterns.
Rearchitect: Rearchitecting for migration focuses on modifying and extending application functionality and the code base to optimize the application architecture for cloud scalability. For example, you could break down a monolithic application into a group of microservices that work together and scale easily. Use this pattern when your applications need major revisions to incorporate new capabilities or to work effectively on a cloud platform, or when you want to use existing application investments, meet scalability requirements, apply innovative DevOps practices, and minimize use of virtual machines.
Rebuild: Rebuild takes things a step further by rebuilding an application from scratch using Azure cloud technologies. For example, you could build greenfield applications with cloud-native technologies like Azure Functions, AI, SQL Managed Instance, and Azure Cosmos DB. Use this pattern when you want rapid development and existing applications have limited functionality and lifespan, or when you're ready to expedite business innovation (including DevOps practices provided by Azure), build new applications using cloud-native technologies, and take advantage of advancements in AI, blockchain, and IoT.
Assess on-premises resources for migration to Azure: This best practice article in the Plan methodology discusses how to run an assessment of an on-premises application running on VMware. In the article, an example organization assesses application VMs by using Azure Migrate and the application SQL Server database by using Data Migration Assistant.
Infrastructure
Deploy Azure infrastructure: This article shows how an organization can prepare its on-premises infrastructure and its Azure infrastructure for migration. The infrastructure example established in this article is referenced in the other samples provided in this section.
Rehost an application on Azure VMs: This article provides an example of migrating on-premises application VMs to Azure VMs using Azure Migrate.
Migrate SQL Server databases to Azure: This article demonstrates how the fictional company Contoso assessed, planned, and migrated its various on-premises SQL Server databases to Azure.
Rehost an application on an Azure VM and SQL Managed Instance: This article provides an example of a lift-and-shift migration to Azure for an on-premises application. This process involves migrating the application front-end VM by using Azure Migrate and the application database to SQL Managed Instance by using Azure Database Migration Service.
Rehost an application on Azure VMs using SQL Server Always On availability groups: This example shows how to migrate an application and data by using Azure-hosted SQL Server VMs. It uses Azure Migrate to migrate the application VMs and Database Migration Service to migrate the application database to a SQL Server cluster that's protected by an Always On availability group.
Migrate open-source databases to Azure: This article demonstrates how the fictional company Contoso assessed, planned, and migrated its various on-premises open-source databases to Azure.
Migrate MySQL to Azure: This article demonstrates how the fictional company Contoso planned and migrated its on-premises MySQL open-source database platform to Azure.
Migrate PostgreSQL to Azure: This article demonstrates how the fictional company Contoso planned and migrated its on-premises PostgreSQL open-source database platform to Azure.
Migrate MariaDB to Azure: This article demonstrates how the fictional company Contoso planned and migrated its on-premises MariaDB open-source database platform to Azure.
Rehost a Linux application on Azure VMs and Azure Database for MySQL: This article provides an example of migrating a Linux-hosted application to Azure VMs by using Azure Migrate. The application database is migrated to Azure Database for MySQL by using Database Migration Service.
Rehost a Linux application on Azure VMs: This example shows how to complete a lift-and-shift migration of a Linux-based application to Azure VMs by using Azure Migrate.
Dev/test workloads
Migrate dev/test environments to Azure IaaS: This article demonstrates how Contoso rehosts its dev/test environment for two applications running on VMware VMs by migrating to Azure VMs.
Migrate to Azure DevTest Labs: This article discusses how Contoso moves its dev/test workloads to Azure by using DevTest Labs.
Refactor a Windows application using App Service and SQL Database: This example shows how to migrate an on-premises Windows-based application to an Azure web app and migrate the application database to an Azure SQL Database server instance by using Database Migration Service.
Refactor a Windows application using App Service and SQL Managed Instance: This example shows how to migrate an on-premises Windows-based application to an Azure web app and migrate the application database to SQL Managed Instance by using Database Migration Service.
Refactor a Linux application to multiple regions using App Service, Azure Traffic Manager, and Azure Database for MySQL: This example shows how to migrate an on-premises Linux-based application to an Azure web app on multiple Azure regions by using Traffic Manager to integrate with GitHub for continuous delivery. The application database is migrated to an Azure Database for MySQL instance.
Refactor Team Foundation Server to Azure DevOps Services: This article shows an example migration of an on-premises Team Foundation Server deployment to Azure DevOps Services in Azure.
SAP
SAP migration guide: Get practical guidance to move your on-premises SAP workloads to the cloud.
Migrate SAP applications to Azure: White paper and roadmap for your SAP journey to the cloud.
Migration methodologies for SAP on Azure: Overview of various migration options to move SAP applications to Azure.
Specialized workloads
Move on-premises VMware infrastructure to Azure: This article provides an example of moving on-premises VMware VMs to Azure by using Azure VMware Solution.
Azure NetApp Files: Enterprise file storage powered by NetApp. Run Linux and Windows file workloads in Azure.
VDI
Move on-premises Remote Desktop Services to Windows Virtual Desktop in Azure: This article shows how to migrate on-premises Remote Desktop Services to Windows Virtual Desktop in Azure.
Migration scaling
Scale a migration to Azure: This article shows how an example organization prepares to scale to a full migration to Azure.
Demo applications
The example articles provided in this section use two demo applications: SmartHotel360 and osTicket.
SmartHotel360: This test application was developed by Microsoft to use when you work with Azure. It's provided
under an open-source license, and you can download it from GitHub. It's an ASP.NET application connected to a
SQL Server database. In the scenarios discussed in these articles, the current version of this application is deployed
to two VMware VMs running Windows Server 2008 R2 and SQL Server 2008 R2. These application VMs are
hosted on-premises and managed by vCenter Server.
osTicket: This open-source service desk ticketing application runs on Linux. You can download it from GitHub. In
the scenarios discussed in these articles, the current version of this application is deployed on-premises to two
VMware VMs running Ubuntu 16.04 LTS using Apache 2, PHP 7.0, and MySQL 5.7.
Rehost an on-premises application on Azure VMs by
using Azure Migrate
This article demonstrates how the fictional company Contoso rehosts a two-tier Windows .NET front-end
application running on VMware virtual machines (VMs) by migrating application VMs to Azure VMs.
The SmartHotel360 application used in this example is provided as open source. If you want to use it for your own
testing purposes, you can download it from GitHub.
Business drivers
The IT leadership team has worked closely with business partners to understand what they want to achieve with
this migration. They want to:
Address business growth. Contoso is growing, so there's pressure on the company's on-premises systems
and infrastructure.
Limit risk . The SmartHotel360 application is critical for the Contoso business. The company wants to move
the application to Azure with zero risk.
Extend. Contoso doesn't want to modify the application, but it does want to ensure that the application is
stable.
Migration goals
The Contoso cloud team has pinned down goals for this migration. It used these goals to determine the best
migration method:
After migration, the application in Azure should have the same performance capabilities as it does today in
VMware. The application will remain as critical in the cloud as it is on-premises.
Although this application is important to Contoso, the company doesn't want to invest in it at this time. Contoso
wants to move the application safely to the cloud in its current form.
Contoso doesn't want to change the ops model for this application. Contoso does want to interact with it in the
cloud in the same way that it does now.
Contoso doesn't want to change any application functionality. Only the application location will change.
Solution design
After establishing goals and requirements, Contoso designs and reviews a deployment solution. Contoso identifies
the migration process, including the Azure services that it will use for the migration.
Current application
The application is tiered across two VMs (WEBVM and SQLVM).
The VMs are located on VMware ESXi host contosohost1.contoso.com (version 6.5).
The VMware environment is managed by vCenter Server 6.5 (vcenter.contoso.com) running on a VM.
Contoso has an on-premises datacenter (contoso-datacenter) with an on-premises domain controller (contosodc1).
Proposed architecture
Because the application is a production workload, the application VMs in Azure will reside in the production
resource group ContosoRG .
The application VMs will be migrated to the primary Azure region (East US 2) and placed in the production
network ( VNET-PROD-EUS2 ).
The web front-end VM will reside in the front-end subnet ( PROD-FE-EUS2 ) in the production network.
The database VM will reside in the database subnet ( PROD-DB-EUS2 ) in the production network.
The on-premises VMs in the Contoso datacenter will be decommissioned after the migration is done.
Database considerations
As part of the solution design process, Contoso did a feature comparison between Azure SQL Database and SQL
Server. The following considerations helped the company to decide to use SQL Server running on an Azure IaaS
VM:
Using an Azure VM running SQL Server seems to be an optimal solution if Contoso needs to customize the
operating system and the database, or co-locate and run partner applications on the same VM.
With Software Assurance, Contoso can later exchange existing licenses for discounted rates on Azure SQL
Managed Instance by using the Azure Hybrid Benefit for SQL Server. This can save up to 30 percent on SQL
Managed Instance.
Solution review
Contoso evaluates the proposed design by putting together a list of pros and cons.
Cons: WEBVM and SQLVM are running Windows Server 2008 R2. Azure supports the operating system for specific roles. Learn more.
Migration process
Contoso will migrate the application front-end and database VMs to Azure VMs by using the agentless method in
the Azure Migrate: Server Migration tool.
As a first step, Contoso prepares and sets up Azure components for Azure Migrate: Server Migration, and
prepares the on-premises VMware infrastructure.
The Azure infrastructure is in place, so Contoso just needs to configure the replication of the VMs through the
Azure Migrate: Server Migration tool.
With everything prepared, Contoso can start replicating the VMs.
After replication is enabled and working, Contoso will test the migration and then, if the test succeeds, fail the VMs over to Azure.
Azure services
Azure Migrate: Server Migration
  Description: The service orchestrates and manages migration of on-premises applications and workloads and Amazon Web Services (AWS)/Google Cloud Platform (GCP) VM instances.
  Cost: During replication to Azure, Azure Storage charges are incurred. Azure VMs are created, and incur charges, when the migration occurs and the VMs are running in Azure. Learn more about charges and pricing.
Prerequisites
Contoso and other users must meet the following prerequisites for this scenario.
On-premises servers: On-premises vCenter servers should be running version 5.5, 6.0, 6.5, or 6.7.
Scenario steps
Here's how Contoso admins will run the migration:
Step 1: Prepare Azure for Azure Migrate: Server Migration. They add the server migration tool to their
Azure Migrate project.
Step 2: Replicate on-premises VMs. They set up replication and start replicating VMs to Azure Storage.
Step 3: Migrate the VMs with Azure Migrate: Server Migration. They run a test migration to make sure
everything's working, and then run a full migration to move the VMs to Azure.
b. Start the imported image and configure the tool, including the following steps:
Set up the prerequisites.
Point the tool to the Azure subscription.
Set the VMware vCenter credentials.
2. In Replicate > Source settings > Are your machines virtualized? , select Yes, with VMware
vSphere .
3. In On-premises appliance , select the name of the Azure Migrate appliance that you set up, and then
select OK .
4. In Virtual machines , select the machines that you want to replicate.
If you've run an assessment for the VMs, you can apply VM sizing and disk type (premium or standard)
recommendations from the assessment results. To do this, in Import migration settings from an
Azure Migrate assessment? , select the Yes option.
If you didn't run an assessment, or you don't want to use the assessment settings, select the No option.
If you selected to use the assessment, select the VM group and assessment name.
5. In Virtual machines , search for VMs as needed and check each VM that you want to migrate. Then select
Next: Target settings .
6. In Target settings , select the subscription and target region to which you'll migrate. Then specify the
resource group in which the Azure VMs will reside after migration. In Virtual Network , select the Azure
virtual network or subnet to which the Azure VMs will be joined after migration.
7. In Azure Hybrid Benefit :
Select No if you don't want to apply Azure Hybrid Benefit. Then select Next .
Select Yes if you have Windows Server machines that are covered with active Software Assurance or
Windows Server subscriptions and you want to apply the benefit to the machines that you're migrating.
Then select Next .
8. In Compute , review the VM name, size, OS disk type, and availability set. VMs must conform with Azure
requirements.
VM size: If you're using assessment recommendations, the VM size drop-down list will contain the
recommended size. Otherwise, Azure Migrate picks a size based on the closest match in the Azure
subscription. Alternatively, pick a manual size in Azure VM size .
OS disk : Specify the OS (boot) disk for the VM. The OS disk has the operating system bootloader and
installer.
Availability set: If the VM should be in an Azure availability set after migration, specify the set. The set
must be in the target resource group that you specify for the migration.
9. In Disks , specify whether the VM disks should be replicated to Azure, and select the disk type (standard
SSD/HDD or premium-managed disks) in Azure. Then select Next .
You can exclude disks from replication. If you exclude disks, they won't be present on the Azure VM after
migration.
10. In Review and start replication , review the settings, and then select Replicate to start the initial
replication for the servers.
NOTE
You can update replication settings at any time before replication starts, in Manage > Replicating machines . Settings
can't be changed after replication starts.
Step 3: Migrate the VMs with Azure Migrate: Server Migration
The Contoso admins run a quick test migration and then a full migration to migrate the VMs.
Run a test migration
1. In Migration goals > Servers > Azure Migrate: Server Migration , select Test migrated servers .
2. Select and hold (or right-click) the VM to test, and then select Test migrate .
3. In Test Migration , select the Azure virtual network in which the Azure VM will be located after the
migration. We recommend that you use a nonproduction virtual network.
4. The Test migration job starts. Monitor the job in the portal notifications.
5. After the migration finishes, view the migrated Azure VM in Virtual Machines in the Azure portal. The
machine name has a -Test suffix.
6. After the test is done, select and hold (or right-click) the Azure VM in Replicating machines , and then
select Clean up test migration .
Migrate the VMs
Now the Contoso admins run a full migration.
1. In the Azure Migrate project, select Servers > Azure Migrate: Server Migration > Replicating
servers .
2. In Replicating machines , select and hold (or right-click) the VM, and then select Migrate .
3. In Migrate > Shut down virtual machines and perform a planned migration with no data loss ,
select Yes > OK .
By default, Azure Migrate shuts down the on-premises VM and runs an on-demand replication to
synchronize any VM changes that occurred since the last replication. This ensures no data loss. If you don't
want to shut down the VM, select No .
4. A migration job starts for the VM. Track the job in Azure notifications.
5. After the job finishes, you can view and manage the VM from the Virtual Machines page.
Need more help?
Learn about how to run a test migration.
Learn about how to migrate VMs to Azure.
Conclusion
In this article, Contoso rehosted the SmartHotel360 application in Azure. The admins migrated the application VMs
to Azure VMs by using the Azure Migrate: Server Migration tool.
Migrate SQL Server databases to Azure
10/30/2020 • 10 minutes to read
This article demonstrates how the fictional company Contoso assessed, planned, and migrated its various on-
premises SQL Server databases to Azure.
As Contoso considers migrating to Azure, the company needs a technical and financial assessment to determine
whether its on-premises workloads are good candidates for cloud migration. In particular, the Contoso team wants
to assess machine and database compatibility for migration. Additionally, it wants to estimate capacity and costs for
running Contoso's resources in Azure.
Business drivers
Contoso is having trouble maintaining the wide array of SQL Server versions and workloads that exist on its
network. After the latest investor's meeting, the CFO and CTO decided to move all these workloads to Azure.
This will allow the company to shift from a structured capital expense model to a fluid operating expense model.
The IT leadership team has worked closely with business partners to understand the business and technical
requirements:
Increase security: Contoso needs to be able to monitor and protect all data resources in a more timely and
efficient manner. The company would also like to get a more centralized reporting system set up on database
access patterns.
Optimize compute resources: Contoso has deployed a large on-premises server infrastructure. It has
several SQL Server instances that consume CPU, memory, and disk allocations without really using them
efficiently.
Increase efficiency: Contoso needs to remove unnecessary procedures and streamline processes for
developers and users. The business needs IT to be fast and not waste time or money so that it can deliver
faster on customer requirements. Database administration should be reduced or minimized after the
migration.
Increase agility: Contoso IT needs to be more responsive to the needs of the business. It must be able to
react faster than the changes in the marketplace to enable success in a global economy. It mustn't get in the
way or become a business blocker.
Scale: As the business grows successfully, Contoso IT must provide systems that are able to grow at the
same pace. There are several legacy hardware environments that cannot be upgraded any further and are
past or near end of support.
Costs: Business and application owners want to know they won't be stuck with high cloud costs as
compared to running the applications on-premises.
Migration goals
The Contoso cloud team has pinned down goals for the various migrations. These goals were used to determine
the best migration methods.
Limitations: Initially, not all branch offices that run applications will have a direct ExpressRoute link to Azure, so these offices will need to connect through virtual network gateways.
Solution design
Contoso has already performed a migration assessment of their digital estate using Azure Migrate.
The assessment reveals multiple workloads spread across multiple departments. The overall size of the migration
project will require a full project management office (PMO) to manage the specifics of communication, resources,
and schedule planning.
Solution review
Contoso evaluates its proposed design by putting together a pros and cons list.
Pros: Azure will provide a single pane of glass into the database workloads.
With the database information now loaded into Azure Migrate, Contoso has identified over 1,000 database
instances that must be migrated. Of these instances, roughly 40 percent can be moved to Azure SQL Database.
The remaining 60 percent must be moved to either SQL Server running on Azure Virtual Machines or Azure SQL
Managed Instance. Of those, about 10 percent require a virtual machine-based approach; the remaining instances
will be moved to Azure SQL Managed Instance.
When the Data Migration Assistant (DMA) couldn't be run against a data source, the following guidelines were
followed for the database migrations.
NOTE
As part of the Assess phase, Contoso discovered various open source databases. Separately, they followed this guide for their
migration planning.
Azure SQL Database (PaaS)
  Usage: SQL Server (data only)
  Details: These databases simply use basic tables, columns, stored procedures, and functions.
  Online migration: Data Migration Assistant, transactional replication
  Offline migration: BACPAC, bcp
  Migration max size: 1 TiB
  Guide: Link

Azure SQL Managed Instance
  Usage: SQL Server (advanced features)
  Details: These databases use triggers and other advanced concepts such as custom .NET types, service brokers, etc.
  Online migration: Data Migration Assistant, transactional replication
  Offline migration: BACPAC, bcp, native backup/restore
  Migration max size: 2 TiB - 8 TiB
  Guide: Link

SQL Server on Azure Virtual Machines (IaaS)
  Usage: SQL Server (third-party integrations)
  Details: The SQL Server must have non-supported SQL Managed Instance features (cross-instance service brokers, cryptographic providers, buffer pool, compatibility levels below 100, database mirroring, FILESTREAM, PolyBase, anything that requires access to file shares, external scripts, extended stored procedures, and others) or third-party software installed to support the activities of the database.
  Online migration: Transactional replication
  Offline migration: BACPAC, bcp, snapshot replication, native backup/restore, convert physical machine to VM
  Migration max size: 4 GiB - 64 TiB
  Guide: Link
Due to the large number of databases, Contoso created a project management office (PMO) to keep track of every
database migration instance. Accountability and responsibilities were assigned to each business and application
team.
Contoso also performed a workload readiness review. This review examined the infrastructure, database and
network components.
Step 5: Test migrations
The first part of the migration preparation involved a test migration of each of the databases to the pre-set-up
environments. To save time, the team scripted all of the migration operations and recorded the timings for each.
To speed up the migration, they identified which migration operations could be run concurrently; a small
illustrative script follows this paragraph. Rollback procedures were identified for each database workload in case
of unexpected failures. For the IaaS-based workloads, they set up all the required third-party software beforehand.
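The migration scripts themselves are specific to each workload, so none are reproduced here. Purely as an illustration of the concurrency-and-timing approach, the following bash sketch runs two hypothetical, independent migration scripts (migrate-db1.sh and migrate-db2.sh, placeholder names) in parallel and records how long each takes.

```bash
#!/usr/bin/env bash
# Illustrative only: migrate-db1.sh and migrate-db2.sh are hypothetical wrappers
# around the actual migration commands for two independent databases.
set -euo pipefail

start=$(date +%s)

# Run the independent migrations concurrently and capture a timing log for each.
( time ./migrate-db1.sh ) &> timings-db1.log &
( time ./migrate-db2.sh ) &> timings-db2.log &

# Wait for both background jobs to finish before reporting.
wait

end=$(date +%s)
echo "Total elapsed: $((end - start)) seconds"
```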
After the test migration, Contoso was able to use the various Azure cost estimation tools to get a more accurate
picture of the future operational costs of their migration.
Step 6: Migration
For the production migration, Contoso identified the time frames for all database migrations and what could be
sufficiently executed in a weekend window (midnight Friday through midnight Sunday) with minimal downtime to
the business.
Based on their documented test procedures, they execute each migration via scripting as much as possible, limiting
any manual tasks to minimize errors.
If any migrations fail during the window, they're rolled back and re-scheduled in the next migration window.
Clean up after migration
Contoso identified the archival window for all database workloads. As the window expires, the resources will be
retired from the on-premises infrastructure.
This includes:
Removing the production data from on-premises servers.
Retiring the hosting server when the last workload window expires.
Review the deployment
With the migrated resources in Azure, Contoso needs to fully operationalize and secure their new infrastructure.
Security
Contoso needs to ensure that its new Azure database workloads are secure. Learn more.
In particular, Contoso should review the firewall and virtual network configurations.
Set up Private Link so that all database traffic is kept inside Azure and the on-premises network.
Enable Azure Advanced Threat Protection for Azure SQL Database.
Backups
Ensure that the Azure databases are backed up with geo-redundant backup storage. This allows geo-restore to
bring backups into a paired region in case of a regional outage.
Important: Ensure that the Azure resource has a resource lock to prevent it from being deleted (see the
sketch below). Deleted servers cannot be restored.
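As an illustration, a CanNotDelete lock can be applied with the Azure CLI. The sketch below scopes the lock to the production resource group named in these scenarios; the lock name is a placeholder, and the lock could instead be scoped to an individual logical server.

```azurecli
# Prevent accidental deletion of the production resources that host the databases.
# "ContosoRG" comes from the scenario; the lock name is a placeholder.
az lock create \
  --name NoDelete-ContosoRG \
  --lock-type CanNotDelete \
  --resource-group ContosoRG
```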
Licensing and cost optimization
Many Azure database workloads can be scaled up or down, so performance monitoring of the server and
databases is important to ensure that you meet your needs while keeping costs to a minimum.
Both CPU and storage have associated costs, and there are several pricing tiers to select from. Be sure the
appropriate pricing plan is selected for the data workloads.
Implement elastic pools for databases that have compatible resource utilization patterns (a CLI sketch follows
this list).
Each read replica is billed based on the compute and storage selected.
Use reserved capacity to save on costs.
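As an illustration of the elastic pool recommendation above, the following Azure CLI sketch creates a pool and moves a database into it. The server, pool, and database names are hypothetical placeholders, and the sizing values are examples only.

```azurecli
# Create an elastic pool for databases with compatible utilization patterns.
# Names and sizes are placeholders for illustration.
az sql elastic-pool create \
  --resource-group ContosoRG \
  --server contoso-sql-server \
  --name contoso-pool \
  --edition GeneralPurpose \
  --family Gen5 \
  --capacity 4

# Move one of the migrated databases into the pool.
az sql db update \
  --resource-group ContosoRG \
  --server contoso-sql-server \
  --name contoso-db1 \
  --elastic-pool contoso-pool
```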
Conclusion
In this article, Contoso assessed, planned, and migrated its Microsoft SQL Server workloads to Azure.
Rehost an on-premises application by migrating to
Azure VMs and Azure SQL Managed Instance
10/30/2020 • 26 minutes to read
This article shows how the fictional company Contoso migrates a two-tier Windows .NET front-end application
running on VMware virtual machines (VMs) to an Azure VM by using Azure Migrate. It also shows how Contoso
migrates the application database to Azure SQL Managed Instance.
The SmartHotel360 application used in this example is provided as open source. If you want to use it for your own
testing purposes, download it from GitHub.
Business drivers
Contoso's IT leadership team has worked closely with the company's business partners to understand what the
business wants to achieve with this migration. They want to:
Address business growth. Contoso is growing. As a result, pressure has increased on the company's on-
premises systems and infrastructure.
Increase efficiency. Contoso needs to remove unnecessary procedures and streamline processes for its
developers and users. The business needs IT to be fast and not waste time or money for the company to deliver
faster on customer requirements.
Increase agility. Contoso IT needs to be more responsive to the needs of the business. It must react faster
than the changes that occur in the marketplace for the company to be successful in a global economy. IT at
Contoso must not get in the way or become a business blocker.
Scale. As the company's business grows successfully, Contoso IT must provide systems that can grow at the
same pace.
Migration goals
The Contoso cloud team has identified goals for this migration. The company uses migration goals to determine
the best migration method.
After migration, the application in Azure should have the same performance capabilities that the application has
today in Contoso's on-premises VMware environment. Moving to the cloud doesn't mean that application
performance is less critical.
Contoso doesn't want to invest in the application. The application is critical and important to the business, but
Contoso simply wants to move the application in its current form to the cloud.
Database administration tasks should be minimized after the application is migrated.
Contoso doesn't want to use Azure SQL Database for this application. It's looking for alternatives.
Solution design
After pinning down the company's goals and requirements, Contoso designs and reviews a deployment solution
and identifies the migration process. The Azure services that it will use for the migration also are identified.
Current architecture
Contoso has one main datacenter ( contoso-datacenter ). The datacenter is located in New York City in the
eastern United States.
Contoso has three additional local branches across the United States.
The main datacenter is connected to the internet with a fiber-optic Metro Ethernet connection (500 megabits
per second).
Each branch is connected locally to the internet by using business-class connections with IPsec VPN tunnels
back to the main datacenter. The setup allows Contoso's entire network to be permanently connected and
optimizes internet connectivity.
The main datacenter is fully virtualized with VMware. Contoso has two ESXi 6.5 virtualization hosts that are
managed by vCenter Server 6.5.
Contoso uses Active Directory for identity management. Contoso uses DNS servers on the internal network.
Contoso has an on-premises domain controller ( contosodc1 ).
The domain controllers run on VMware VMs. The domain controllers at local branches run on physical servers.
The SmartHotel360 application is tiered across two VMs ( WEBVM and SQLVM ) that are located on a VMware ESXi
version 6.5 host ( contosohost1.contoso.com ).
The VMware environment is managed by vCenter Server 6.5 ( vcenter.contoso.com ) running on a VM.
Proposed architecture
In this scenario, Contoso wants to migrate its two-tier on-premises travel application as follows:
Migrate the application database ( SmartHotelDB ) to a SQL managed instance.
Migrate the front end, WEBVM , to an Azure VM.
The on-premises VMs in the Contoso datacenter will be decommissioned when the migration is finished.
Database considerations
As part of the solution design process, Contoso did a feature comparison between Azure SQL Database and SQL
Managed Instance. The following considerations helped the company decide to use SQL Managed Instance.
SQL Managed Instance aims to deliver almost 100% compatibility with the latest on-premises SQL Server
version. We recommend SQL Managed Instance for customers who are running SQL Server on-premises or on
infrastructure as a service (IaaS) VMs and want to migrate their applications to a fully managed service with
minimal design changes.
Contoso is planning to migrate a large number of applications from on-premises to IaaS. Many of these
applications are ISV provided. Contoso realizes that using SQL Managed Instance will help ensure database
compatibility for these applications, rather than using SQL Database, which might not be supported.
Contoso can perform a lift-and-shift migration to SQL Managed Instance by using the fully automated Azure
Database Migration Service. With this service in place, Contoso can reuse it for future database migrations.
SQL Managed Instance supports SQL Server Agent, an important component of the SmartHotel360
application. Contoso needs this compatibility. Otherwise, it will have to redesign maintenance plans required by
the application.
With Software Assurance, Contoso can exchange its existing licenses for discounted rates on a SQL managed
instance by using the Azure Hybrid Benefit for SQL Server. For this reason, Contoso can save up to 30 percent
on SQL Managed Instance.
SQL Managed Instance is fully contained in the virtual network, so it provides greater isolation and security for
Contoso's data. Contoso can get the benefits of the public cloud while keeping the environment isolated from
the public internet.
SQL Managed Instance supports many security features. They include Always Encrypted, dynamic data
masking, row-level security, and threat detection.
Solution review
Contoso evaluates the proposed design by putting together a list of pros and cons.
For the data tier, SQL Managed Instance might not be the
best solution if Contoso wants to customize the operating
system or the database server, or if the company wants to run
third-party applications along with SQL Server. Running SQL
Server on an IaaS VM could provide this flexibility.
Migration process
Contoso will migrate the web and data tiers of its SmartHotel360 application to Azure by completing these steps:
1. Contoso already has its Azure infrastructure in place, so it just needs to add a couple of specific Azure
components for this scenario.
2. The data tier will be migrated by using Azure Database Migration Service. This service connects to the on-
premises SQL Server VM across a site-to-site VPN connection between the Contoso datacenter and Azure.
The service then migrates the database.
3. The web tier will be migrated with a lift-and-shift approach by using Azure Migrate. The process entails
preparing the on-premises VMware environment, setting up and enabling replication, and migrating the
VMs by failing them over to Azure.
Azure services
Azure Database Migration Service
  Description: Azure Database Migration Service enables seamless migration from multiple database sources to Azure data platforms with minimal downtime.
  Cost: Learn about supported regions and Azure Database Migration Service pricing.

Azure SQL Managed Instance
  Description: SQL Managed Instance is a managed database service that represents a fully managed SQL Server instance in the Azure cloud. It uses the same code as the latest version of SQL Server Database Engine and has the latest features, performance improvements, and security patches.
  Cost: Using a SQL managed instance running in Azure incurs charges based on capacity. Learn more about SQL Managed Instance pricing.

Azure Migrate
  Description: Contoso uses Azure Migrate to assess its VMware VMs. Azure Migrate assesses the migration suitability of the machines. It provides sizing and cost estimates for running in Azure.
  Cost: Azure Migrate is available at no additional charge. Contoso might incur charges depending on the tools (first-party or independent software vendor) it decides to use for assessment and migration. Learn more about Azure Migrate pricing.
Prerequisites
Contoso and other users must meet the following prerequisites for this scenario.
On-premises servers: The on-premises vCenter Server should be running version 5.5, 6.0, or 6.5.
On-premises VMs: Review Linux machines that are endorsed to run on Azure.
Database Migration Service: For Azure Database Migration Service, you need a compatible on-premises VPN device. Make sure that the service account running the source SQL Server instance has write permissions on the network share.
Scenario steps
Here's how Contoso plans to set up the deployment:
Step 1: Prepare a SQL managed instance. Contoso needs an existing managed instance to which the on-
premises SQL Server database will migrate.
Step 2: Prepare Azure Database Migration Service. Contoso must register the database migration
provider, create an instance, and then create a Database Migration Service project. Contoso also must set up a
shared access signature (SAS) uniform resource identifier (URI) for the Database Migration Service instance. An
SAS URI provides delegated access to resources in Contoso's storage account so that Contoso can grant limited
permissions to storage objects. Contoso sets up an SAS URI so that Azure Database Migration Service can
access the storage account container to which the service uploads the SQL Server backup files.
Step 3: Prepare Azure for the Azure Migrate: Server Migration tool. Contoso adds the server migration
tool to its Azure Migrate project.
Step 4: Prepare on-premises VMware for Azure Migrate: Server Migration. Contoso prepares
accounts for VM discovery and prepares to connect to Azure VMs after migration.
Step 5: Replicate the on-premises VMs. Contoso sets up replication and starts replicating VMs to Azure
Storage.
Step 6: Migrate the database via Azure Database Migration Service. Contoso migrates the database.
Step 7: Migrate the VMs with Azure Migrate: Server Migration. Contoso runs a test migration to make
sure everything's working and then runs a full migration to move the VM to Azure.
5. Set custom DNS settings. DNS points first to Contoso's Azure domain controllers. Azure DNS is secondary
(a CLI sketch follows this list). The Contoso Azure domain controllers are located as follows:
Located in the PROD-DC-EUS2 subnet, in the East US 2 production network ( VNET-PROD-EUS2 ).
CONTOSODC3 address: 10.245.42.4 .
CONTOSODC4 address: 10.245.42.5 .
Azure DNS resolver: 168.63.129.16 .
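The same DNS configuration can be applied from the Azure CLI. The sketch below is minimal and assumes the production network lives in the ContosoNetworkingRG resource group, which the scenario uses for networking resources; adjust the group if the network lives elsewhere.

```azurecli
# Point VNET-PROD-EUS2 at the Contoso Azure domain controllers first.
# The Azure-provided resolver (168.63.129.16) is listed last as the secondary entry.
az network vnet update \
  --resource-group ContosoNetworkingRG \
  --name VNET-PROD-EUS2 \
  --dns-servers 10.245.42.4 10.245.42.5 168.63.129.16
```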
Need more help?
Read the SQL Managed Instance overview.
Learn how to create a virtual network for a SQL managed instance.
Learn how to set up peering.
Learn how to update Azure Active Directory DNS settings.
Set up routing
The managed instance is placed in a private virtual network. Contoso needs a route table for the virtual network to
communicate with the Azure management service. If the virtual network can't communicate with the service that
manages it, the virtual network becomes inaccessible.
Contoso considers these factors:
The route table contains a set of rules (routes) that specify how packets sent from the managed instance should
be routed in the virtual network.
The route table is associated with subnets where managed instances are deployed. Each packet that leaves a
subnet is handled based on the associated route table.
A subnet can be associated with only one route table.
There are no additional charges for creating route tables in Microsoft Azure.
To set up routing, the Contoso admins complete the following steps (a CLI sketch follows the list):
1. Create a user-defined route table in the ContosoNetworkingRG resource group.
2. To comply with SQL Managed Instance requirements, after the route table ( MIRouteTable ) is deployed, they
add a route that has an address prefix of 0.0.0.0/0 . The Next hop type option is set to Internet .
3. Associate the route table with the SQLMI-DB-EUS2 subnet (in the VNET-SQLMI-EUS2 network).
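Although Contoso performs these steps in the portal, the same configuration can be scripted. The following Azure CLI sketch covers the three steps above; it assumes the VNET-SQLMI-EUS2 virtual network also lives in the ContosoNetworkingRG resource group, which the scenario doesn't state explicitly.

```azurecli
# 1. Create the user-defined route table.
az network route-table create \
  --resource-group ContosoNetworkingRG \
  --name MIRouteTable

# 2. Add the 0.0.0.0/0 route with the next hop set to Internet,
#    as required by SQL Managed Instance.
az network route-table route create \
  --resource-group ContosoNetworkingRG \
  --route-table-name MIRouteTable \
  --name default-to-internet \
  --address-prefix 0.0.0.0/0 \
  --next-hop-type Internet

# 3. Associate the route table with the managed instance subnet.
az network vnet subnet update \
  --resource-group ContosoNetworkingRG \
  --vnet-name VNET-SQLMI-EUS2 \
  --name SQLMI-DB-EUS2 \
  --route-table MIRouteTable
```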
2. Create an Azure Blob storage container. Contoso generates an SAS URI so that Azure Database Migration
Service can access it (see the CLI sketch after these steps).
4. Place the Database Migration Service instance in the PROD-DC-EUS2 subnet of the VNET-PROD-DC-EUS2 virtual
network.
The instance is placed here because the service must be in a virtual network that can access the on-
premises SQL Server VM via a VPN gateway.
VNET-PROD-EUS2 is peered to VNET-HUB-EUS2 and is allowed to use remote gateways. The Use
remote gateways option ensures that the instance can communicate as required.
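The container and SAS URI from step 2 can be produced with the Azure CLI. The storage account and container names below are hypothetical placeholders, and the expiry date is only an example.

```azurecli
# Create the container that will hold the SQL Server backup files.
az storage container create \
  --account-name contosomigstorage \
  --name dms-backups

# Generate a SAS token that grants Azure Database Migration Service
# read/write/list access to the container until the stated expiry.
az storage container generate-sas \
  --account-name contosomigstorage \
  --name dms-backups \
  --permissions rwl \
  --expiry 2020-12-31T23:59Z \
  --https-only \
  --output tsv
```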
Step 3: Prepare Azure for the Azure Migrate: Server Migration tool
Here are the Azure components Contoso needs to migrate the VMs to Azure:
A virtual network in which Azure VMs will be located when they're created during migration.
The Azure Migrate: Server Migration tool provisioned.
The Contoso admins set up these components:
1. Set up a network. Contoso already set up a network that can be used for Azure Migrate: Server Migration
when it deployed the Azure infrastructure.
The SmartHotel360 application is a production application, and the VMs will be migrated to the Azure
production network ( VNET-PROD-EUS2 ) in the primary region ( East US 2 ).
Both VMs will be placed in the ContosoRG resource group, which is used for production resources.
The application front-end VM ( WEBVM ) will migrate to the front-end subnet ( PROD-FE-EUS2 ) of the
production network.
The application database VM ( SQLVM ) will migrate to the database subnet ( PROD-DB-EUS2 ) of the
production network.
b. Start the imported image and configure the tool by following these steps:
a. Set up the prerequisites.
b. Point the tool to the Azure subscription.
c. Set the VMware vCenter credentials.
2. In Replicate > Source settings > Are your machines virtualized? , they select Yes, with VMware
vSphere .
3. In On-premises appliance , they select the name of the Azure Migrate appliance that was set up and then
select OK .
4. In Virtual machines , they select the machines they want to replicate:
If they've run an assessment for the VMs, they can apply VM sizing and disk type (premium/standard)
recommendations from the assessment results. In Import migration settings from an Azure
Migrate assessment? , they select the Yes option.
If they didn't run an assessment or they don't want to use the assessment settings, they select the No
option.
If they selected to use the assessment, they select the VM group and assessment name.
5. In Virtual machines , they search for VMs as needed and check each VM they want to migrate. Then they
select Next: Target settings .
6. In Target settings , they select the subscription and target region to which they'll migrate. They also specify
the resource group in which the Azure VMs will reside after migration. In Virtual Network , they select the
Azure virtual network/subnet to which the Azure VMs will be joined after migration.
7. In Azure Hybrid Benefit , they:
Select No if they don't want to apply Azure Hybrid Benefit. Then they select Next .
Select Yes if they have Windows Server machines that are covered with active Software Assurance or
Windows Server subscriptions and they want to apply the benefit to the machines they're migrating.
Then they select Next .
8. In Compute , they review the VM name, size, OS disk type, and availability set. VMs must conform with
Azure requirements.
VM size: If they're using assessment recommendations, the VM size drop-down list contains the
recommended size. Otherwise, Azure Migrate picks a size based on the closest match in the Azure
subscription. Alternatively, they can pick a manual size in Azure VM size.
OS disk : They specify the OS (boot) disk for the VM. The OS disk is the disk that has the operating
system bootloader and installer.
Availability set: If the VM should be in an Azure availability set after migration, they specify the set. The
set must be in the target resource group specified for the migration.
9. In Disks , they specify whether the VM disks should be replicated to Azure. Then they select the disk type
(standard SSD/HDD or premium-managed disks) in Azure and select Next .
They can exclude disks from replication.
If disks are excluded, they won't be present on the Azure VM after migration.
10. In Review + start replication , they review the settings. Then they select Replicate to start the initial
replication for the servers.
NOTE
Replication settings can be updated any time before replication starts in Manage > Replicating machines . Settings can't
be changed after replication starts.
3. For the target, they enter the name of the managed instance in Azure and the access credentials.
4. In New Activity > Run migration , they specify settings to run the migration:
Source and target credentials.
The database to migrate.
The network share created on the on-premises VM. Azure Database Migration Service takes source
backups to this share.
The service account that runs the source SQL Server instance must have write permissions on this
share.
The FQDN path to the share must be used.
The SAS URI that provides Azure Database Migration Service with access to the storage account
container to which the service uploads the backup files for migration.
5. They save the migration settings and then run the migration.
6. In Overview , they monitor the migration status.
7. When migration is finished, they verify that the target databases exist on the managed instance.
3. In Test migration , they select the Azure virtual network in which the Azure VM will be located after the
migration. We recommend using a nonproduction virtual network.
4. The Test migration job starts. They monitor the job in the portal notifications.
5. After the migration finishes, they view the migrated Azure VM in Virtual Machines in the Azure portal. The
machine name has a suffix -Test .
6. After the test is done, they select and hold (or right-click) the Azure VM in Replicating machines and then
select Clean up test migration .
Migrate the VM
Now the Contoso admins run a full migration to complete the move.
1. In the Azure Migrate project, they go to Servers > Azure Migrate: Server Migration and select
Replicating servers .
2. In Replicating machines , they select and hold (or right-click) the VM, and then they select Migrate .
3. In Migrate > Shut down virtual machines and perform a planned migration with no data loss ,
they select Yes > OK .
By default, Azure Migrate shuts down the on-premises VM and runs an on-demand replication to
synchronize any VM changes that occurred since the last replication occurred. This action ensures no
data loss.
If they don't want to shut down the VM, they select No .
4. A migration job starts for the VM. They track the job in Azure notifications.
5. After the job finishes, they can view and manage the VM from the Virtual Machines page.
6. Finally, they update the DNS records for WEBVM on one of the Contoso domain controllers.
Update the connection string
As the final step in the migration process, the Contoso admins update the connection string of the application to
point to the migrated database that's running on Contoso's SQL managed instance.
1. In the Azure portal, they find the connection string by selecting Settings > Connection strings .
2. They update the string with the user name and password of the SQL managed instance.
3. After the string is configured, they replace the current connection string in the web.config file of the
application.
4. After they update the file and save it, they restart IIS on WEBVM by running iisreset /restart in a
command prompt window.
5. After IIS is restarted, the application uses the database that's running on the SQL managed instance.
6. At this point, they can shut down the on-premises SQLVM machine. The migration is finished.
Need more help?
Learn how to run a test failover.
Learn how to create a recovery plan.
Learn how to fail over to Azure.
To learn more about security practices for VMs, see Security best practices for IaaS workloads in Azure.
Business continuity and disaster recovery
For business continuity and disaster recovery, Contoso takes the following actions:
Keep data safe. Contoso backs up the data on the VMs by using the Azure Backup service (see the CLI
sketch after this list). For more information, see An overview of Azure VM backup.
Keep applications up and running. Contoso replicates the application VMs in Azure to a secondary region
using Site Recovery. To learn more, see Set up disaster recovery to a secondary Azure region for an Azure VM.
Learn more. Contoso learns more about managing SQL Managed Instance, which includes database backups.
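As an example, enabling Azure Backup for the migrated front-end VM can be scripted with the Azure CLI. The Recovery Services vault and policy names below are hypothetical, because the scenario doesn't name them.

```azurecli
# Protect the migrated web front-end VM with the Azure Backup service.
az backup protection enable-for-vm \
  --resource-group ContosoRG \
  --vault-name contoso-rsv-eus2 \
  --vm WEBVM \
  --policy-name DefaultPolicy
```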
Licensing and cost optimization
Contoso has existing licensing for WEBVM. To take advantage of pricing with Azure Hybrid Benefit, Contoso
converts the existing Azure VM.
Contoso will use Azure Cost Management and Billing to ensure the company stays within budgets established
by the IT leadership.
Conclusion
In this article, Contoso rehosts the SmartHotel360 application in Azure by migrating the application front-end VM
to Azure by using Azure Migrate. Contoso migrates the on-premises database to a SQL managed instance by using
Azure Database Migration Service.
Rehost an on-premises application with Azure VMs
and SQL Server Always On availability groups
10/30/2020 • 24 minutes to read
This article demonstrates how the fictional company Contoso rehosts a two-tier Windows .NET application running
on VMware virtual machines (VMs) as part of a migration to Azure. Contoso migrates the application front-end VM
to an Azure VM, and the application database to an Azure SQL Server VM, running in a Windows Server failover
cluster with SQL Server Always On availability groups.
The SmartHotel360 application used in this example is provided as open source. If you want to use it for your own
testing purposes, download it from GitHub.
Business drivers
The IT leadership team has worked closely with business partners to understand what they want to achieve with
this migration. They want to:
Address business growth. Contoso is growing, and as a result there's pressure on on-premises systems and
infrastructure.
Increase efficiency. Contoso needs to remove unnecessary procedures and streamline processes for
developers and users. The business needs IT to be fast and not waste time or money to deliver faster on
customer requirements.
Increase agility. Contoso IT needs to be more responsive to the needs of the business. It must react faster than
the changes in the marketplace to enable success in a global economy. IT mustn't get in the way or become a
business blocker.
Scale. As the business grows successfully, Contoso IT must provide systems that grow at the same pace.
Migration goals
The Contoso cloud team has pinned down goals for this migration. These goals were used to determine the best
migration method:
After migration, the application in Azure should have the same performance capabilities as it does today in
VMware. The application will remain as critical in the cloud as it is on-premises.
Contoso doesn't want to invest in this application. It's important to the business, but in its current form, Contoso
simply wants to move it safely to the cloud.
The on-premises database for the application has had availability issues. Contoso wants to deploy it in Azure as a
high-availability cluster with failover capabilities.
Contoso wants to upgrade from its current SQL Server 2008 R2 platform to SQL Server 2017.
Contoso doesn't want to use Azure SQL Database for this application and is looking for alternatives.
Solution design
After pinning down the company's goals and requirements, Contoso designs and reviews a deployment solution
and identifies the migration process. The Azure services that it will use for the migration also are identified.
Current architecture
The application is tiered across two VMs ( WEBVM and SQLVM ).
The VMs are located on VMware ESXi host contosohost1.contoso.com (version 6.5).
The VMware environment is managed by vCenter Server 6.5 ( vcenter.contoso.com ) that runs on a VM.
Contoso has an on-premises datacenter ( contoso-datacenter ) with an on-premises domain controller (
contosodc1 ).
Proposed architecture
In this scenario:
Contoso will migrate the application front end WEBVM to an Azure infrastructure as a service (IaaS) VM.
The front-end VM in Azure will be deployed in the ContosoRG resource group (used for production
resources).
It will be located in the Azure production network ( VNET-PROD-EUS2 ) in the primary region ( East US 2 ).
The application database will be migrated to an Azure SQL Server VM.
It will be located in Contoso's Azure database network ( PROD-DB-EUS2 ) in the primary region ( East US 2 ).
It will be placed in a Windows Server failover cluster with two nodes that uses SQL Server Always On
availability groups.
In Azure, the two SQL Server VM nodes in the cluster will be deployed in the ContosoRG resource group.
The VM nodes will be located in the Azure production network ( VNET-PROD-EUS2 ) in the primary region (
East US 2 ).
VMs will run Windows Server 2016 with SQL Server 2017 Enterprise edition. Contoso doesn't have
licenses for this operating system. It will use an image in Azure Marketplace that provides the license as a
charge to the company's Azure Enterprise Agreement commitment.
Apart from unique names, both VMs use the same settings.
Contoso will deploy an internal load balancer that listens for traffic on the cluster and directs it to the
appropriate cluster node.
The internal load balancer will be deployed in ContosoNetworkingRG (used for networking resources).
The on-premises VMs in the Contoso datacenter will be decommissioned after the migration is done.
Database considerations
As part of the solution design process, Contoso did a feature comparison between Azure SQL Database and SQL
Server. The following considerations helped the company to decide to use an Azure IaaS VM running SQL Server:
Using an Azure VM running SQL Server seems to be an optimal solution if Contoso needs to customize the
operating system or the database server, or if it might want to colocate and run third-party applications on the
same VM.
Solution review
Contoso evaluates its proposed design by putting together a list of pros and cons.
The SQL Server tier will run on SQL Server 2017 and Windows Server 2016, which retires the current Windows
Server 2008 R2 operating system. Running SQL Server 2017 supports Contoso's technical requirements and
goals. It provides 100 percent compatibility while moving away from SQL Server 2008 R2.
Azure services
Azure Database Migration Service
  Description: Azure Database Migration Service enables seamless migration from multiple database sources to Azure data platforms with minimal downtime.
  Cost: Learn about supported regions and Azure Database Migration Service pricing.

Azure Migrate
  Description: Contoso uses Azure Migrate to assess its VMware VMs. Azure Migrate assesses the migration suitability of the machines. It provides sizing and cost estimates for running in Azure.
  Cost: Azure Migrate is available at no additional charge. Contoso might incur charges depending on the tools (first-party or independent software vendor) it decides to use for assessment and migration. Learn more about Azure Migrate pricing.
Migration process
The Contoso admins will migrate the application VMs to Azure.
They'll migrate the front-end VM to an Azure VM by using Azure Migrate:
As a first step, they'll prepare and set up Azure components and prepare the on-premises VMware
infrastructure.
With everything prepared, they can start replicating the VM.
After replication is enabled and working, they migrate the VM by using Azure Migrate.
After they've verified the database, they'll migrate the database to a SQL Server cluster in Azure by using
Azure Database Migration Service.
As a first step, they'll need to provision SQL Server VMs in Azure, set up the cluster and an internal load
balancer, and configure Always On availability groups.
With this in place, they can migrate the database.
After the migration, they'll enable Always On availability groups for the database.
Prerequisites
Here's what Contoso needs to do for this scenario.
On-premises servers: The on-premises vCenter Server should be running version 5.5, 6.0, 6.5, or 6.7.
On-premises VMs: Review Linux machines that are endorsed to run on Azure.
Scenario steps
Here's how Contoso will run the migration:
Step 1: Prepare a SQL Server Always On availability group cluster. Create a cluster for deploying two
SQL Server VM nodes in Azure.
Step 2: Deploy and set up the cluster. Prepare an Azure SQL Server cluster. Databases are migrated into
this existing cluster.
Step 3: Deploy Azure Load Balancer. Deploy a load balancer to balance traffic to the SQL Server nodes.
Step 4: Prepare Azure for Azure Migrate. Create an Azure Storage account to hold replicated data.
Step 5: Prepare on-premises VMware for Azure Migrate. Prepare accounts for VM discovery and agent
installation. Prepare on-premises VMs so that users can connect to Azure VMs after migration.
Step 6: Replicate the on-premises VMs to Azure. Enable VM replication to Azure.
Step 7: Migrate the database via Azure Database Migration Service. Migrate the database to Azure by
using Azure Database Migration Service.
Step 8: Protect the database with SQL Server Always On. Create an Always On availability group for the
cluster.
Step 9: Migrate the VM with Azure Migrate. Run a test migration to make sure everything's working as
expected. Then run a migration to Azure.
They create a new availability set ( SQLAOGAVSET ) with two fault domains and five update domains.
5. In SQL Server settings , they limit SQL connectivity to the virtual network (private) on default port 1433.
For authentication, they use the same credentials as used on-site ( contosoadmin ).
5. When they create the storage account, primary and secondary access keys are generated for it. They need
the primary access key to create the cloud witness. The key appears under the storage account name >
Access keys . (A CLI sketch for the availability set and key retrieval follows.)
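Both of these items, the availability set and the witness storage account key, can also be handled from the Azure CLI. The sketch below assumes the resources live in the ContosoRG resource group and uses a placeholder storage account name.

```azurecli
# Availability set for the two SQL Server cluster nodes:
# two fault domains and five update domains, as described above.
az vm availability-set create \
  --resource-group ContosoRG \
  --name SQLAOGAVSET \
  --platform-fault-domain-count 2 \
  --platform-update-domain-count 5

# Retrieve the primary access key needed to configure the cloud witness.
az storage account keys list \
  --resource-group ContosoRG \
  --account-name contosocloudwitness \
  --query "[0].value" \
  --output tsv
```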
Add SQL Server VMs to Contoso domain
1. Contoso adds SQLAOG1 and SQLAOG2 to the contoso.com domain.
2. On each VM, the admins install the Windows Failover Cluster feature and tools.
Set up the cluster
Before the Contoso admins set up the cluster, they take a snapshot of the OS disk on each machine.
2. Place the load balancer in the database subnet ( PROD-DB-EUS2 ) of the production network ( VNET-PROD-EUS2 ).
3. Assign it a static IP address ( 10.245.40.100 ).
4. As a networking element, deploy the load balancer in the networking resource group ContosoNetworkingRG (see the CLI sketch below).
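A minimal Azure CLI sketch of steps 2 through 4 follows. It assumes a Standard SKU internal load balancer; the load balancer and front-end IP configuration names are placeholders, while the subnet, static IP address, and resource group come from the steps above.

```azurecli
# Internal load balancer for the SQL Server Always On cluster.
# Placed in the database subnet with the static front-end IP 10.245.40.100.
az network lb create \
  --resource-group ContosoNetworkingRG \
  --name ILB-PROD-DB-EUS2-SQLAOG \
  --sku Standard \
  --vnet-name VNET-PROD-EUS2 \
  --subnet PROD-DB-EUS2 \
  --frontend-ip-name ILB-Frontend \
  --private-ip-address 10.245.40.100
```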
After the internal load balancer is deployed, the Contoso admins need to set it up. They create a back-end address
pool, set up a health probe, and configure a load-balancing rule.
Add a back-end pool
To distribute traffic to the VMs in the cluster, the Contoso admins set up a back-end address pool that contains the
IP addresses of the NICs for VMs that will receive network traffic from the load balancer.
1. In the load balancer settings in the portal, Contoso adds a back-end pool: ILB-PROD-DB-EUS-SQLAOG-BEPOOL .
2. The admins associate the pool with availability set SQLAOGAVSET . The VMs in the set ( SQLAOG1 and SQLAOG2 )
are added to the pool.
Create a health probe
The Contoso admins create a health probe so that the load balancer can monitor the application health. The probe
dynamically adds or removes VMs from the load balancer rotation based on how they respond to health checks.
To create the probe, the Contoso admins:
1. In the load balancer settings in the portal, create a health probe: SQLAlwaysOnEndPointProbe .
2. Set the probe to monitor VMs on TCP port 59999.
3. Set an interval of 5 seconds between probes and a threshold of 2. If two probes fail, the VM will be
considered unhealthy.
Configure the load balancer to receive traffic
Now, the Contoso admins set up a load balancer rule to define how traffic is distributed to the VMs.
The front-end IP address handles incoming traffic.
The back-end IP pool receives the traffic.
To create the rule, the Contoso admins:
1. In the load balancer settings in the portal, add a new rule: SQLAlwaysOnEndPointListener .
2. Set a front-end listener to receive incoming SQL client traffic on TCP port 1433.
3. Specify the back-end pool to which traffic will be routed and the port on which VMs listen for traffic.
4. Enable floating IP (direct server return), which is always required for SQL Server Always On. (A CLI sketch of the back-end pool, probe, and rule follows.)
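The back-end pool, probe, and rule can likewise be scripted. The following Azure CLI sketch reuses the placeholder load balancer and front-end names from the earlier sketch; the pool, probe, and rule names come from the steps above.

```azurecli
# Back-end address pool that will receive traffic from the load balancer.
# (Associating the SQLAOG1/SQLAOG2 NIC IP configurations with this pool is a separate step.)
az network lb address-pool create \
  --resource-group ContosoNetworkingRG \
  --lb-name ILB-PROD-DB-EUS2-SQLAOG \
  --name ILB-PROD-DB-EUS-SQLAOG-BEPOOL

# Health probe: TCP 59999, 5-second interval, unhealthy after 2 failed probes.
az network lb probe create \
  --resource-group ContosoNetworkingRG \
  --lb-name ILB-PROD-DB-EUS2-SQLAOG \
  --name SQLAlwaysOnEndPointProbe \
  --protocol Tcp \
  --port 59999 \
  --interval 5 \
  --threshold 2

# Load-balancing rule: listen for SQL client traffic on TCP 1433 and route it to
# the back-end pool, with floating IP (direct server return) enabled.
az network lb rule create \
  --resource-group ContosoNetworkingRG \
  --lb-name ILB-PROD-DB-EUS2-SQLAOG \
  --name SQLAlwaysOnEndPointListener \
  --protocol Tcp \
  --frontend-port 1433 \
  --backend-port 1433 \
  --frontend-ip-name ILB-Frontend \
  --backend-pool-name ILB-PROD-DB-EUS-SQLAOG-BEPOOL \
  --probe-name SQLAlwaysOnEndPointProbe \
  --floating-ip true
```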
Need more help?
Get an overview of Azure Load Balancer.
Learn about how to create a load balancer.
5. In Virtual machines , they search for VMs as needed and check each VM to migrate. Then they select Next:
Target settings .
6. In Target settings , they select the subscription and target region to which they'll migrate, and specify the
resource group in which the Azure VMs will reside after migration. In Virtual Network , they select the
Azure virtual network/subnet to which the Azure VMs will be joined after migration.
7. In Azure Hybrid Benefit , the Contoso admins:
Select No if they don't want to apply Azure Hybrid Benefit. Then they select Next .
Select Yes if they have Windows Server machines that are covered with active Software Assurance or
Windows Server subscriptions, and they want to apply the benefit to the machines they're migrating.
Then they select Next .
8. In Compute , they review the VM name, size, OS disk type, and availability set. VMs must conform with
Azure requirements.
VM size: If they're using assessment recommendations, the VM size drop-down list contains the
recommended size. Otherwise, Azure Migrate picks a size based on the closest match in the Azure
subscription. Alternatively, they can pick a manual size in Azure VM size.
OS disk : They specify the OS (boot) disk for the VM. The OS disk is the disk that has the operating
system bootloader and installer.
Availability set: If the VM should be in an Azure availability set after migration, they specify the set. The
set must be in the target resource group specified for the migration.
9. In Disks , they specify whether the VM disks should be replicated to Azure. Then they select the disk type
(standard SSD/HDD or premium managed disks) in Azure and select Next .
They can exclude disks from replication.
If disks are excluded, they won't be present on the Azure VM after migration.
10. In Review + Start replication , they review the settings. Then they select Replicate to start the initial
replication for the servers.
NOTE
Replication settings can be updated any time before replication starts in Manage > Replicating machines . Settings can't
be changed after replication starts.
3. In Specify Replicas , they add the two SQL nodes as availability replicas and configure them to provide
automatic failover with synchronous commit.
4. They configure a listener for the group ( SHAOG ) and port. The IP address of the internal load balancer is
added as a static IP address ( 10.245.40.100 ).
5. In Select Data Synchronization , they enable automatic seeding. With this option, SQL Server
automatically creates secondary replicas for every database in the group, so Contoso doesn't have to
manually back up and restore them. After validation, the availability group is created.
6. Contoso ran into an issue when creating the group: because it isn't using Active Directory Windows
integrated security, it needs to grant permissions to the SQL login to create the Windows failover cluster roles.
7. After the group is created, it appears in SQL Server Management Studio.
Configure a listener on the cluster
As a last step in setting up the SQL deployment, the Contoso admins configure the internal load balancer as the
listener on the cluster and bring the listener online. They use a script to do this task.
3. After the failover, they verify that the Azure VM appears as expected in the Azure portal.
4. After verifying the VM in Azure, they complete the migration to finish the migration process, stop replication
for the VM, and stop Azure Migrate billing for the VM.
2. After updating the file and saving it, they restart IIS on WEBVM . They use iisreset /restart from a
command prompt.
3. After IIS is restarted, the application uses the database running on the new SQL Server availability group.
Need more help?
Learn about how to run a test failover.
Learn how to create a recovery plan.
Learn about failing over to Azure.
Clean up after migration
After migration, the SmartHotel360 application is running on an Azure VM. The SmartHotel360 database is located
in the SQL Server cluster in Azure.
Now, Contoso needs to finish these cleanup steps:
Remove the on-premises VMs from the vCenter inventory.
Remove the VMs from local backup jobs.
Update internal documentation to show the new locations and IP addresses for VMs.
Review any resources that interact with the decommissioned VMs. Update any relevant settings or
documentation to reflect the new configuration.
Add the two new VMs ( SQLAOG1 and SQLAOG2 ) to production monitoring systems.
Review the deployment
With the migrated resources in Azure, Contoso needs to fully operationalize and secure its new infrastructure.
Security
The Contoso security team reviews the virtual machines WEBVM , SQLAOG1 , and SQLAOG2 to determine any security
issues. They need to:
Review the network security groups (NSGs) for the VM to control access. NSGs are used to ensure that only
traffic allowed to the application can pass.
Consider securing the data on the disk by using Azure Disk Encryption and Azure Key Vault.
Evaluate transparent data encryption. Then enable it on the SmartHotel360 database running on the new
Always On availability group. Learn more about transparent data encryption.
For more information, see Security best practices for IaaS workloads in Azure.
Conclusion
In this article, Contoso rehosted the SmartHotel360 application in Azure by migrating the application front-end VM
to Azure by using Azure Migrate. Contoso migrated the application database to a SQL Server cluster provisioned in
Azure by using Azure Database Migration Service and protected it in a SQL Server Always On availability group.
Migrate open-source databases to Azure
10/30/2020 • 7 minutes to read
This article demonstrates how the fictional company Contoso assessed, planned, and migrated its various on-
premises open-source databases to Azure.
As Contoso considers migrating to Azure, the company needs a technical and financial assessment to determine
whether its on-premises workloads are good candidates for cloud migration. In particular, the Contoso team wants
to assess machine and database compatibility for migration. Additionally, it wants to estimate capacity and costs
for running Contoso's resources in Azure.
Business drivers
Contoso is having various issues with maintaining the wide array of versions of open-source database workloads
that exist on its network. After the latest investor's meeting, the CFO and CTO decided to move all these workloads
to Azure. This move will shift them from a structured capital expense model to a fluid operating expense model.
The IT leadership team has worked closely with business partners to understand the business and technical
requirements. They want to:
Increase security. Contoso needs to be able to monitor and protect all data resources in a more timely and
efficient manner. The company also wants to get a more centralized reporting system set up on database access
patterns.
Optimize compute resources. Contoso has deployed a large on-premises server infrastructure. The company has several SQL Server instances that don't use their allocated CPU, memory, and disk efficiently.
Increase efficiency. Contoso needs to remove unnecessary procedures and streamline processes for developers and users. The business needs IT to be fast and not waste time or money, so that it can deliver faster on customer requirements. Database administration should be reduced or minimized after the migration.
Increase agility. Contoso IT needs to be more responsive to the needs of the business. It must be able to react more quickly to changes in the marketplace to enable success in a global economy. It mustn't get in the way or become a business blocker.
Scale. As the business grows successfully, Contoso IT must provide systems that grow at the same pace.
Understand costs. Business and application owners want assurance that they won't be stuck with cloud costs that are higher than what they pay to run the applications on-premises.
Migration goals
The Contoso cloud team has pinned down goals for the various migrations. These goals were used to determine
the best migration methods.
Limitations | Initially, not all branch offices that run applications will have a direct Azure ExpressRoute link to Azure. These offices will need to connect through virtual network gateways.
Solution design
Contoso has already performed a migration assessment of its digital estate by using Azure Migrate.
Pros | Azure will provide a single pane of glass into the database workloads.
IMPORTANT
Make sure that the Azure resource has a resource lock to prevent it from being deleted. Deleted servers can't be restored.
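As a hedged illustration of the resource-lock guidance above, the following Azure CLI sketch applies a CanNotDelete lock to a database server. The resource group, lock, and server names are hypothetical placeholders, and the resource type shown (an Azure Database for MySQL server) is only an example; substitute the type of the resource you're protecting.

    # Hypothetical names; replace the resource group, server, and resource type with your own.
    az lock create \
      --name ProtectDatabaseServer \
      --lock-type CanNotDelete \
      --resource-group ContosoRG \
      --resource-name contoso-db-server \
      --resource-type Microsoft.DBforMySQL/servers

A CanNotDelete lock blocks delete operations until the lock itself is removed, which guards against the unrecoverable deletions described in the note.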
Conclusion
In this article, Contoso assessed, planned, and migrated its open-source databases to Azure PaaS and IaaS
solutions.
Migrate MySQL databases to Azure
10/30/2020 • 7 minutes to read
This article demonstrates how the fictional company Contoso planned and migrated its on-premises MySQL open-
source database platform to Azure.
Business drivers
The IT leadership team has worked closely with business partners to understand what they want to achieve with
this migration. They want to:
Increase availability. Contoso has had availability issues with its MySQL on-premises environment. The
business requires the applications that use this data store to be more reliable.
Increase efficiency. Contoso needs to remove unnecessary procedures and streamline processes for developers and users. The business needs IT to be fast and not waste time or money, so that it can deliver faster on customer requirements.
Increase agility. Contoso IT needs to be more responsive to the needs of the business. It must be able to react more quickly to changes in the marketplace to enable success in a global economy. It mustn't become a business blocker.
Scale. As the business grows successfully, Contoso IT must provide systems that grow at the same pace.
Migration goals
The Contoso cloud team has pinned down goals for this migration. These goals were used to determine the best
migration method.
Availability | Currently, internal staff are having a hard time with the hosting environment for the MySQL instance. Contoso wants to have close to 99.99 percent availability for the database layer.
Solution design
After pinning down goals and requirements, Contoso designs and reviews a deployment solution and identifies the
migration process. The tools and services that it will use for migration are also identified.
Current application
The MySQL database stores employee data that's used for all aspects of the company's HR department. A LAMP-
based application is used as the front end to handle employee HR requests. Contoso has 100,000 employees
worldwide, so uptime is important.
Proposed solution
Use Azure Database Migration Service to migrate the database to an Azure Database for MySQL instance. Modify
all applications and processes to use the new Azure Database for MySQL instance.
Database considerations
As part of the solution design process, Contoso reviewed the features in Azure for hosting its MySQL data. The
following considerations helped the company decide to use Azure:
Similar to Azure SQL Database, Azure Database for MySQL allows for firewall rules.
Azure Database for MySQL can be used with Azure Virtual Network to prevent the instance from being publicly
accessible.
Azure Database for MySQL has the required compliance and privacy certifications that Contoso must meet for
its auditors.
Report and application processing performance will be enhanced by using read replicas.
Ability to expose the service to internal network traffic only (no public access) by using Azure Private Link.
Contoso chose not to move to Azure Database for MySQL because it's considering using the MariaDB
ColumnStore and graph database model in the future.
Aside from MySQL features, Contoso is a proponent of true open-source projects and chose not to use MySQL.
The bandwidth and latency from the application to the database will be sufficient based on the chosen gateway (either Azure ExpressRoute or Site-to-Site VPN).
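The firewall and networking capabilities listed above can be illustrated with a brief Azure CLI sketch. This is a minimal example rather than Contoso's actual deployment; the server name, administrator credentials, SKU, and IP address are hypothetical, and the commands target the Azure Database for MySQL single server offering.

    # Create the server (hypothetical names and values).
    az mysql server create \
      --resource-group ContosoRG \
      --name contoso-hr-mysql \
      --location eastus2 \
      --admin-user contosoadmin \
      --admin-password '<strong-password>' \
      --sku-name GP_Gen5_2 \
      --version 5.7

    # Allow traffic only from a known address, such as the on-premises gateway.
    az mysql server firewall-rule create \
      --resource-group ContosoRG \
      --server-name contoso-hr-mysql \
      --name AllowContosoGateway \
      --start-ip-address 203.0.113.10 \
      --end-ip-address 203.0.113.10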
Solution review
Contoso evaluates the proposed design by putting together a pros and cons list.
CONSIDERATION | DETAILS
Proposed architecture
NOTE
MySQL 8.0 is supported in Azure Database for MySQL. The Database Migration Service tool doesn't yet support that
version.
IMPORTANT
Ensure that the Azure Database for MySQL resource has a resource lock to prevent it from being deleted. Deleted servers
can't be restored.
Conclusion
In this article, Contoso migrated its MySQL databases to an Azure Database for MySQL instance.
Migrate PostgreSQL databases to Azure
10/30/2020 • 9 minutes to read
This article demonstrates how the fictional company Contoso planned and migrated its on-premises PostgreSQL
open-source database platform to Azure.
Business drivers
The IT leadership team has worked closely with business partners to understand what they want to achieve with
this migration. They want to:
Automate big data. Contoso uses PostgreSQL for several of its big data and AI initiatives. The company wants
to build scalable repeatable pipelines to automate many of these analytical workloads.
Increase efficiency. Contoso needs to remove unnecessary procedures and streamline processes for developers and users. The business needs IT to be fast and not waste time or money, so that it can deliver more quickly on customer requirements.
Increase agility. Contoso IT needs to be more responsive to the needs of the business. It must be able to react more quickly to changes in the marketplace to enable success in a global economy and to avoid becoming a business blocker.
Scale. As the business grows successfully, Contoso IT must provide systems that can grow at the same pace.
Increase security. Contoso realizes that regulatory issues will cause the company to adjust its on-premises
strategy based on auditing, logging, and compliance requirements.
Migration goals
The Contoso cloud team has pinned down goals for this migration and will use them to determine the best
migration method.
Integrations | Contoso wants to integrate the data in the database with data and AI pipelines for machine learning.
Backup and restore | Contoso is looking for the ability to do point-in-time restores when and if data updates fail or are corrupted for any reason.
Solution design
After pinning down goals and requirements, Contoso designs and reviews a deployment solution and identifies the
migration process. The tools and services it will use for migration are also identified.
Current environment
PostgreSQL 9.6.7 is running on a physical Linux machine ( sql-pg-01.contoso.com ) in the Contoso datacenter.
Contoso already has an Azure subscription with a Site-to-Site VPN gateway to an on-premises datacenter network.
Proposed solution
Use Azure Database Migration Service to migrate the database to an Azure Database for PostgreSQL instance.
Modify all applications and processes to use the new Azure Database for PostgreSQL instance.
Build a new data processing pipeline using Azure Data Factory that connects to the Azure Database for
PostgreSQL instance.
Database considerations
As part of the solution design process, Contoso reviewed the features in Azure for hosting its PostgreSQL data. The
following considerations helped the company decide to use Azure:
Similar to Azure SQL Database, Azure Database for PostgreSQL supports firewall rules.
Azure Database for PostgreSQL can be used with virtual networks to prevent the instance from being publicly
accessible.
Azure Database for PostgreSQL has the required compliance certifications that Contoso must meet.
Integration with DevOps and Azure Data Factory will allow for automated data processing pipelines to be built.
Processing performance can be enhanced by using read replicas.
Support for bring your own key (BYOK) for data encryption.
Ability to expose the service to internal network traffic only (no public access) by using Azure Private Link.
The bandwidth and latency from the application to the database will be sufficient based on the chosen gateway (either Azure ExpressRoute or Site-to-Site VPN).
Solution review
Contoso evaluates its proposed design by putting together a list of pros and cons.
CONSIDERATION | DETAILS
Proposed architecture
Figure 1: Scenario architecture.
Migration process
Preparation
Before Contoso can migrate its PostgreSQL databases, it ensures that Contoso's instances meet all the Azure
prerequisites for a successful migration.
Supported versions
Only migrations to the same or a higher version are supported. Migrating PostgreSQL 9.5 to Azure Database for
PostgreSQL 9.6 or 10 is supported, but migrating from PostgreSQL 11 to PostgreSQL 9.6 isn't supported.
Microsoft aims to support n-2 versions of the PostgreSQL engine in Azure Database for PostgreSQL - Single
Server. The versions would be the current major version on Azure (n) and the two prior major versions (-2).
For the latest updates on supported versions, see Supported PostgreSQL major versions.
NOTE
Automatic major version upgrade isn't supported. For example, there isn't an automatic upgrade from PostgreSQL 9.5 to
PostgreSQL 9.6. To upgrade to the next major version, dump the database and restore it to a server created with the target
engine version.
Network
Contoso will need to set up a virtual network gateway connection from its on-premises environment to the virtual network where its Azure Database for PostgreSQL database is located. This connection allows the on-premises application, which isn't being migrated to the cloud, to access the database.
Assessment
Contoso will need to assess the current database for replication issues. These checks include:
Whether the source database version is compatible for migration to the target database version.
Primary keys must exist on all tables to be replicated.
Database names can't include a semicolon ( ; ).
Migrating multiple tables with the same name but different casing might cause unpredictable behavior.
Figure 2: The migration process.
Migration
Contoso can perform the migration in several ways:
Dump and restore
Azure Database Migration Service
Import/export
Contoso has selected Azure Database Migration Service to allow the company to reuse the migration project
whenever it needs to perform major-to-major upgrades. Because a single Database Migration Service activity only
accommodates up to four databases, Contoso sets up several jobs by using the following steps.
To prepare, Contoso sets up a virtual network so that Database Migration Service can access the source database. The virtual network connection can be created in various ways by using VPN gateways.
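For illustration, the following Azure CLI sketch provisions a route-based VPN gateway on the target virtual network; it's one of the "various ways" mentioned above rather than a prescribed path. The names are hypothetical, the virtual network is assumed to already contain a GatewaySubnet, and the local network gateway and connection objects needed to complete a Site-to-Site link are omitted.

    # Hypothetical names; the virtual network must already have a GatewaySubnet.
    az network public-ip create \
      --resource-group ContosoNetworkingRG \
      --name contoso-vpn-gw-ip

    az network vnet-gateway create \
      --resource-group ContosoNetworkingRG \
      --name contoso-vpn-gw \
      --vnet VNET-PROD-EUS2 \
      --public-ip-address contoso-vpn-gw-ip \
      --gateway-type Vpn \
      --vpn-type RouteBased \
      --sku VpnGw1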
Create an Azure Database Migration Service instance
1. In the Azure portal, select Add a resource .
2. Search for Azure Database Migration Services , and select it.
3. Select + Add .
4. Select the subscription and resource group for the service.
5. Enter a name for the instance.
6. Select the closest location to the Contoso datacenter or VPN gateway.
7. Select Azure for the service mode.
8. Select a pricing tier.
9. Select Review + create .
Figure 3: Review and create.
10. Select Create .
Create an Azure Database for PostgreSQL instance
1. On the on-premises server, configure the postgresql.conf file.
2. Set the server to listen on the proper IP address that Azure Database Migration Service will use to access the
server and databases.
Set the listen_addresses variable.
3. Enable SSL.
a. Set the ssl=on variable.
b. Verify that Contoso is using a publicly signed SSL certificate for the server that supports TLS 1.2.
Otherwise, the Database Migration Service tool will raise an error.
4. Update the pg_hba.conf file.
Add entries that are specific to the Database Migration Service instance.
5. Logical replication must be enabled on the source server by modifying the values in the postgresql.conf
file for each server.
a. wal_level = logical
b. max_replication_slots = [at least the maximum number of databases for migration]
For example, if Contoso wants to migrate four databases, it sets the value to 4.
c. max_wal_senders = [number of databases running concurrently]
The recommended value is 10.
6. The migration user must have the REPLICATION role on the source database.
7. Add the Database Migration Service instance IP address to the PostgreSQL pg_hba.conf file.
8. To export the database schemas, run the export commands (a sketch follows this list):
9. Copy the file, name the copy dvdrental_schema_foreign.sql , and remove all non-foreign key and trigger-
related items.
10. Remove all foreign key and trigger-related items from the dvdrental_schema.sql file.
11. Import the database schema (step 1):
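The configuration and schema export/import in the steps above might look like the following sketch. It's illustrative only: the dvdrental database name comes from the steps above, while the host names, user names, and paths are hypothetical, and the target server uses the user@server sign-in format expected by Azure Database for PostgreSQL single server.

    # Source-server settings in postgresql.conf (example values for migrating four databases):
    #   listen_addresses = '*'      # or the specific address that Database Migration Service will use
    #   ssl = on
    #   wal_level = logical
    #   max_replication_slots = 4
    #   max_wal_senders = 10

    # Export the schema only from the source server.
    pg_dump -h localhost -U postgres -d dvdrental --schema-only > dvdrental_schema.sql

    # After splitting the foreign-key and trigger items into dvdrental_schema_foreign.sql,
    # import the base schema into the target Azure Database for PostgreSQL server.
    psql -h contoso-pg.postgres.database.azure.com -U postgres@contoso-pg -d dvdrental -f dvdrental_schema.sql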
Migration
1. In the Azure portal, Contoso goes to its Database Migration Service resource.
2. If the service isn't started, select Start Service .
3. Select New Migration Project .
NOTE
The previous Database Migration Service steps can also be performed via the Azure CLI.
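As a rough sketch of that CLI path, the command below provisions the Database Migration Service instance itself; the resource names and subnet ID are hypothetical placeholders. Creating the migration project and tasks afterward uses the az dms project commands, whose parameters depend on the source and target platforms.

    # Hypothetical names; the subnet must be able to reach both the source and target servers.
    az dms create \
      --resource-group ContosoRG \
      --name contoso-dms \
      --location eastus2 \
      --sku-name Premium_4vCores \
      --subnet "/subscriptions/<subscription-id>/resourceGroups/<network-rg>/providers/Microsoft.Network/virtualNetworks/<vnet-name>/subnets/<subnet-name>"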
19. Reconfigure any applications or processes that use the on-premises database to point to the new Azure
Database for PostgreSQL database instance.
20. After the migration is finished, Contoso will set up cross-region read replicas, if necessary.
Conclusion
In this article, Contoso migrated its PostgreSQL databases to an Azure Database for PostgreSQL instance.
Migrate MariaDB databases to Azure
10/30/2020 • 8 minutes to read
This article demonstrates how the fictional company Contoso planned and migrated its on-premises MariaDB
open-source database platform to Azure.
Contoso is using MariaDB instead of MySQL because of its:
Numerous storage engine options.
Cache and index performance.
Open-source support with features and extensions.
ColumnStore storage engine for analytical workloads.
The company's migration goal is to continue to use MariaDB but not worry about managing the environment
needed to support it.
Business drivers
The IT leadership team has worked closely with business partners to understand what they want to achieve with
this migration. They want to:
Increase availability. Contoso has had availability issues with its MariaDB on-premises environment. The
business requires the applications that use this data store to be more reliable.
Increase efficiency. Contoso needs to remove unnecessary procedures and streamline processes for developers and users. The business needs IT to be fast and not waste time or money, so that it can deliver faster on customer requirements.
Increase agility. Contoso IT needs to be more responsive to the needs of the business. It must be able to react more quickly to changes in the marketplace to enable success in a global economy. It mustn't get in the way or become a business blocker.
Scale. As the business grows successfully, Contoso IT must provide systems that grow at the same pace.
Migration goals
The Contoso cloud team has pinned down goals for this migration. These goals were used to determine the best
migration method.
Availability | Currently, internal staff are having a hard time with the hosting environment for the MariaDB instance. Contoso wants to have close to 99.99 percent availability for the database layer.
Solution design
After pinning down goals and requirements, Contoso designs and reviews a deployment solution and identifies the
migration process. The tools and services that it will use for migration are also identified.
Current application
The MariaDB database hosts employee data that's used for all aspects of the company's HR department. A LAMP-
based application is used as the front end to handle employee HR requests. Contoso has 100,000 employees
worldwide, so uptime is important for its databases.
Proposed solution
Evaluate the environments for migration compatibility.
Use common open-source tools to migrate databases to the Azure Database for MariaDB instance.
Modify all applications and processes to use the new Azure Database for MariaDB instance.
Database considerations
As part of the solution design process, Contoso reviewed the features in Azure for hosting its MariaDB databases.
The following considerations helped the company decide to use Azure:
Similar to Azure SQL Database, Azure Database for MariaDB allows for firewall rules.
Azure Database for MariaDB can be used with Azure Virtual Network to prevent the instance from being
publicly accessible.
Azure Database for MariaDB has the required compliance and privacy certifications that Contoso must meet for
its auditors.
Report and application processing performance will be enhanced by using read replicas.
Ability to expose the service to internal network traffic only (no public access) by using Azure Private Link.
Contoso chose not to move to Azure Database for MySQL because it's looking at potentially using the MariaDB ColumnStore and graph database model in the future.
The bandwidth and latency from the application to the database will be sufficient based on the chosen gateway (either Azure ExpressRoute or Site-to-Site VPN).
Solution review
Contoso evaluates the proposed design by putting together a pros and cons list.
CONSIDERATION | DETAILS
Proposed architecture
Restore the database, replacing the placeholder values with the endpoint for your Azure Database for MariaDB instance and the username (see the sketch after this list).
Use phpMyAdmin or a similar tool, such as MySQL Workbench, Toad, or Navicat, to verify the restore by checking record counts in each table.
Update all application connection strings to point to the migrated database.
Test all applications for proper operation.
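The restore step at the top of this list might look like the following command-line sketch. The server endpoint, administrator name, database, and backup file are hypothetical; for Azure Database for MariaDB single server the sign-in name takes the user@server-name form, and TLS should be enabled with whatever option your client supports if the server enforces it.

    # Hypothetical names; replace the endpoint, user, database, and file with your own.
    mysql -h contoso-hr-mariadb.mariadb.database.azure.com \
          -u contosoadmin@contoso-hr-mariadb -p \
          employeesdb < employeesdb_backup.sql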
IMPORTANT
Make sure that the Azure Database for MariaDB instance has a resource lock to prevent it from being deleted. Deleted
servers can't be restored.
Conclusion
In this article, Contoso migrated its MariaDB databases to an Azure Database for MariaDB instance.
Rehost an on-premises Linux application to Azure
VMs
10/30/2020 • 12 minutes to read
This article shows how the fictional company Contoso rehosts a two-tier LAMP-based application by using Azure
infrastructure as a service (IaaS) virtual machines (VMs).
The service desk application used in this example, osTicket, is provided as open source. If you want to use it for
your own testing purposes, you can download it from GitHub.
Business drivers
The IT leadership team has worked closely with business partners to understand what they want to achieve with
this migration:
Address business growth. Contoso is growing, and as a result there's pressure on the on-premises systems
and infrastructure.
Limit risk. The service desk application is critical for the Contoso business. Contoso wants to move it to Azure
with zero risk.
Extend. Contoso doesn't want to change the application right now. It wants to ensure that the application is
stable.
Migration goals
The Contoso cloud team has pinned down goals for this migration to determine the best migration method:
After migration, the application in Azure should have the same performance capabilities as it does today in the
company's on-premises VMware environment. The application will remain as critical in the cloud as it is on-
premises.
Contoso doesn't want to invest in this application. It's important to the business, but in its current form Contoso
simply wants to move it safely to the cloud.
Contoso doesn't want to change the ops model for this application. It wants to interact with the application in
the cloud in the same way that it does now.
Contoso doesn't want to change application functionality. Only the application location will change.
Having completed a couple of Windows application migrations, Contoso wants to learn how to use a Linux-
based infrastructure in Azure.
Solution design
After pinning down goals and requirements, Contoso designs and reviews a deployment solution and identifies the
migration process. The Azure services that Contoso will use for the migration also are identified.
Current application
The osTicket application is tiered across two VMs ( OSTICKETWEB and OSTICKETMYSQL ).
The VMs are located on VMware ESXi host contosohost1.contoso.com (version 6.5).
The VMware environment is managed by vCenter Server 6.5 ( vcenter.contoso.com ) and runs on a VM.
Contoso has an on-premises datacenter ( contoso-datacenter ) with an on-premises domain controller (
contosodc1 ).
Proposed architecture
Because the application is a production workload, the VMs in Azure will reside in the production resource group
ContosoRG .
The VMs will be migrated to the primary region (East US 2) and placed in the production network (
VNET-PROD-EUS2 ):
The web VM will reside in the front-end subnet ( PROD-FE-EUS2 ).
The database VM will reside in the database subnet ( PROD-DB-EUS2 ).
The on-premises VMs in the Contoso datacenter will be decommissioned after the migration is done.
Solution review
Contoso evaluates the proposed design by putting together a list of pros and cons.
CONSIDERATION | DETAILS
Cons | The web and data tier of the application remain single points of failure.
Migration process
Contoso will complete the migration process as follows:
As a first step, Contoso prepares and sets up Azure components for Azure Migrate: Server Migration and
prepares the on-premises VMware infrastructure.
The company already has the Azure infrastructure in place, so it just needs to configure the replication of the
VMs through the Azure Migrate: Server Migration tool.
With everything prepared, Contoso can start replicating the VMs.
After replication is enabled and working, Contoso will migrate the VM by failing it over to Azure.
Azure services
SERVICE | DESCRIPTION | COST
Azure Migrate: Server Migration | The service orchestrates and manages migration of your on-premises applications and workloads and Amazon Web Services (AWS)/Google Cloud Platform (GCP) VM instances. | During replication to Azure, Azure Storage charges are incurred. Azure VMs are created, and incur charges, when migration occurs. Learn more about charges and pricing.
Prerequisites
Here's what Contoso needs for this scenario.
REQUIREMENTS | DETAILS
On-premises servers | The on-premises vCenter Server should be running version 5.5, 6.0, or 6.5.
On-premises VMs | Review Linux distros that are endorsed to run on Azure.
Scenario steps
Here's how Contoso will complete the migration:
Step 1: Prepare Azure for Azure Migrate: Server Migration. Add the Azure Migrate: Server Migration tool to the Azure Migrate project.
Step 2: Prepare on-premises VMware for Azure Migrate: Server Migration. Prepare accounts for VM discovery, and prepare to connect to Azure VMs after migration.
Step 3: Replicate VMs. Set up replication, and start replicating VMs to Azure Storage.
Step 4: Migrate the VMs with Azure Migrate: Server Migration. Run a test migration to make sure everything's working, and then run a migration to move the VMs to Azure.
Step 1: Prepare Azure for the Azure Migrate: Server Migration tool
Here are the Azure components Contoso needs to migrate the VMs to Azure:
A virtual network in which Azure VMs will be located when they're created during migration.
The Azure Migrate: Server Migration tool provisioned.
They set up these components as follows:
1. Set up a network. Contoso already set up a network that can be used for Azure Migrate: Server Migration when the company deployed the Azure infrastructure.
The osTicket application is a production application. The VMs will be migrated to the Azure production network ( VNET-PROD-EUS2 ) in the primary region ( East US 2 ).
Both VMs will be placed in the ContosoRG resource group, which is used for production resources.
The application front-end VM ( OSTICKETWEB ) will migrate to the front-end subnet ( PROD-FE-EUS2 ) in the
production network.
The application database VM ( OSTICKETMYSQL ) will migrate to the database subnet ( PROD-DB-EUS2 ) in the
production network.
2. Provision the Azure Migrate: Server Migration tool. With the network and storage account in place, Contoso
now creates a Recovery Services vault ( ContosoMigrationVault ) and places it in the ContosoFailoverRG
resource group in the primary region ( East US 2 ).
5. In Virtual machines , search for VMs as needed, and select each VM you want to migrate. Then select
Next: Target settings .
6. In Target settings , select the subscription and target region to which you'll migrate. Specify the resource
group in which the Azure VMs will reside after migration. In Virtual Network , select the Azure virtual
network/subnet to which the Azure VMs will be joined after migration.
7. In Azure Hybrid Benefit :
Select No if you don't want to apply Azure Hybrid Benefit. Then select Next .
Select Yes if you have Windows Server machines that are covered with active Software Assurance or
Windows Server subscriptions and you want to apply the benefit to the machines you're migrating. Then
select Next .
8. In Compute , review the VM name, size, OS disk type, and availability set. VMs must conform with Azure
requirements.
VM size: If you're using assessment recommendations, the VM size drop-down list will contain the
recommended size. Otherwise, Azure Migrate picks a size based on the closest match in the Azure
subscription. Alternatively, pick a manual size in Azure VM size .
OS disk : Specify the OS (boot) disk for the VM. The OS disk is the disk that has the operating system
bootloader and installer.
Availability set: If the VM should be in an Azure availability set after migration, specify the set. The set
must be in the target resource group you specify for the migration.
9. In Disks , specify whether the VM disks should be replicated to Azure. Select the disk type (standard
SSD/HDD or premium-managed disks) in Azure. Then select Next .
You can exclude disks from replication.
If you exclude disks, they won't be present on the Azure VM after migration.
10. In Review + Start replication , review the settings. Then select Replicate to start the initial replication for
the servers.
NOTE
You can update replication settings any time before replication starts in Manage > Replicating machines . Settings can't
be changed after replication starts.
2. Select and hold (or right-click) the VM to test. Then select Test migrate .
3. In Test Migration , select the Azure virtual network in which the Azure VM will be located after the
migration. We recommend you use a nonproduction virtual network.
4. The Test migration job starts. Monitor the job in the portal notifications.
5. After the migration finishes, view the migrated Azure VM in Virtual Machines in the Azure portal. The
machine name has a suffix -Test .
6. After the test is done, select and hold (or right-click) the Azure VM in Replicating machines . Then select
Clean up test migration .
2. In Replicating machines , select and hold (or right-click) the VM and select Migrate .
3. In Migrate > Shut down virtual machines and perform a planned migration with no data loss ,
select Yes > OK .
By default, Azure Migrate shuts down the on-premises VM and runs an on-demand replication to synchronize any VM changes that occurred since the last replication. This action ensures no data loss.
If you don't want to shut down the VM, select No .
4. A migration job starts for the VM. Track the job in Azure notifications.
5. After the job finishes, you can view and manage the VM from the Virtual Machines page.
Connect the VM to the database
As the final step in the migration process, Contoso admins update the connection string of the application to point
to the application database running on the OSTICKETMYSQL VM.
1. Make an SSH connection to the OSTICKETWEB VM by using PuTTY or another SSH client. The VM is private,
so connect by using the private IP address.
2. Make sure that the OSTICKETWEB VM can communicate with the OSTICKETMYSQL VM. Currently, the
configuration is hardcoded with the on-premises IP address 172.16.0.43 .
Before the update:
After the update:
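As a hedged sketch of that edit, the commands below assume osTicket's standard configuration file (include/ost-config.php under the web root) and an example private IP for the migrated OSTICKETMYSQL VM; the actual path and address will differ in your environment.

    # Replace the hardcoded on-premises address with the Azure VM's private IP (example address shown).
    sudo sed -i 's/172.16.0.43/10.245.34.5/g' /var/www/osticket/include/ost-config.php

    # Restart the web server so the change takes effect (assumes an Apache-based LAMP VM).
    sudo systemctl restart apache2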
4. Finally, update the DNS records for OSTICKETWEB and OSTICKETMYSQL on one of the Contoso domain
controllers.
Need more help?
Learn about how to run a test migration.
Learn about how to migrate VMs to Azure.
This article shows how the fictional company Contoso rehosts a two-tier LAMP-based application and migrates it
from on-premises to Azure by using Azure Virtual Machines (VMs) and Azure Database for MySQL.
The service desk application used in this example, osTicket, is provided as open source. If you want to use it for
your own testing, you can download it from GitHub.
Business drivers
The IT leadership team has worked closely with business partners to understand what they want to achieve:
Address business growth. Contoso is growing, and as a result there's pressure on the on-premises systems
and infrastructure.
Limit risk. The service desk application is critical for the business. Contoso wants to move it to Azure with zero
risk.
Extend. Contoso doesn't want to change the application right now. The company wants to keep the application
stable.
Migration goals
The Contoso cloud team has pinned down goals for this migration to determine the best migration method:
After migration, the application in Azure should have the same performance capabilities as it does today in the
company's on-premises VMware environment. The application will remain as critical in the cloud as it is on-
premises.
Contoso doesn't want to invest in this application. It's important to the business, but in its current form Contoso
simply wants to move it safely to the cloud.
Having completed a couple of Windows application migrations, Contoso wants to learn how to use a Linux-
based infrastructure in Azure.
Contoso wants to minimize database admin tasks after the application is moved to the cloud.
Proposed architecture
In this scenario:
Currently the application is tiered across two VMs ( OSTICKETWEB and OSTICKETMYSQL ).
The VMs are located on VMware ESXi host contosohost1.contoso.com (version 6.5).
The VMware environment is managed by vCenter Server 6.5 ( vcenter.contoso.com ) and runs on a VM.
Contoso has an on-premises datacenter ( contoso-datacenter ), with an on-premises domain controller (
contosodc1 ).
The web application on OSTICKETWEB will be migrated to an Azure infrastructure as a service (IaaS) VM.
The application database will be migrated to the Azure Database for MySQL platform as a service.
Because Contoso is migrating a production workload, the resources will reside in the production resource
group ContosoRG .
The OSTICKETWEB resource will be replicated to the primary region (East US 2) and placed in the production
network ( VNET-PROD-EUS2 ):
The web VM will reside in the front-end subnet ( PROD-FE-EUS2 ).
The application database will be migrated to Azure Database for MySQL by using Azure Database Migration
Service.
The on-premises VMs in the Contoso datacenter will be decommissioned after the migration is done.
Migration process
Contoso will complete the migration process as follows:
To migrate the web VM:
As a first step, Contoso sets up the Azure and on-premises infrastructure needed to deploy Azure Migrate.
The company already has the Azure infrastructure in place, so it just needs to add and configure the replication
of the VMs through the Azure Migrate: Server Migration tool.
With everything prepared, Contoso can start replicating the VM.
After replication is enabled and working, Contoso will complete the move by using Azure Migrate.
To migrate the database:
1. Contoso provisions a MySQL instance in Azure.
2. Contoso sets up Database Migration Service, ensuring access to the on-premises database server.
3. Contoso migrates the database to Azure Database for MySQL.
Azure services
SERVICE | DESCRIPTION | COST
Azure Migrate | Contoso uses Azure Migrate to assess its VMware VMs. Azure Migrate assesses the migration suitability of the machines. It provides sizing and cost estimates for running in Azure. | Azure Migrate is available at no additional charge. You might incur charges depending on the tools (first-party or ISV) you decide to use for assessment and migration.
Azure Database Migration Service | Database Migration Service enables seamless migration from multiple database sources to Azure data platforms with minimal downtime. | Learn about supported regions and Database Migration Service pricing.
Azure Database for MySQL | The database is based on the open-source MySQL database engine. It provides a fully managed enterprise-ready community MySQL database for application development and deployment. | Learn more about Azure Database for MySQL pricing and scalability options.
Prerequisites
Here's what Contoso needs for this scenario.
On-premises servers | The on-premises vCenter Server should be running version 5.5, 6.0, 6.5, or 6.7.
On-premises VMs | Review Linux machines that are endorsed to run on Azure.
Scenario steps
Here's how Contoso admins will complete the migration:
Step 1: Prepare Azure for Azure Migrate: Server Migration. Add the server migration tool to the Azure Migrate project.
Step 2: Prepare on-premises VMware for Azure Migrate: Server Migration. Prepare accounts for VM discovery, and prepare to connect to Azure Virtual Machines after migration.
Step 3: Replicate VMs. Set up replication and start replicating VMs to Azure Storage.
Step 4: Migrate the application VM with Azure Migrate: Server Migration. Run a test migration to make sure everything's working, and then run a full migration to move the VM to Azure.
Step 5: Migrate the database. Set up migration by using Azure Database Migration Service.
Step 1: Prepare Azure for the Azure Migrate: Server Migration tool
Here are the Azure components Contoso needs to migrate the VMs to Azure:
A virtual network in which Azure VMs will be located when they're created during migration.
The Azure Migrate: Server Migration tool (OVA) provisioned and configured.
To set up the components, Contoso admins follow these steps:
1. Set up a network. Contoso already set up a network that can be used for Azure Migrate: Server Migration
when it deployed the Azure infrastructure.
2. Provision the Azure Migrate: Server Migration tool.
a. From Azure Migrate, download the OVA image, and import it into VMware.
b. Start the imported image, and configure the tool by using the following steps:
a. Set up the prerequisites.
5. In Virtual machines , search for VMs as needed, and select each VM you want to migrate. Then select Next:
Target settings .
6. In Target settings , select the subscription and target region to which you'll migrate. Specify the resource
group in which the Azure VMs will reside after migration. In Virtual Network , select the Azure virtual
network/subnet to which the Azure VMs will be joined after migration.
7. In Azure Hybrid Benefit :
Select No if you don't want to apply Azure Hybrid Benefit. Then select Next .
8. In Compute , review the VM name, size, OS disk type, and availability set. VMs must conform with Azure
requirements.
VM size: If you use assessment recommendations, the VM size drop-down list contains the
recommended size. Otherwise, Azure Migrate picks a size based on the closest match in the Azure
subscription. Alternatively, pick a manual size in Azure VM size .
OS disk : Specify the OS (boot) disk for the VM. The OS disk is the disk that has the operating system
bootloader and installer.
Availability set: If the VM should be in an Azure availability set after migration, specify the set. The set
must be in the target resource group you specify for the migration.
9. In Disks , specify whether the VM disks should be replicated to Azure. Then select the disk type (standard
SSD/HDD or premium-managed disks) in Azure, and select Next .
You can exclude disks from replication.
If you exclude disks, they won't be present on the Azure VM after migration.
10. In Review + Start replication , review the settings. Then select Replicate to start the initial replication for
the servers.
NOTE
You can update replication settings any time before replication starts in Manage > Replicating machines . Settings can't be
changed after replication starts.
3. In Test Migration , select the Azure virtual network in which the Azure VM will be located after the
migration. We recommend you use a nonproduction virtual network.
4. The Test migration job starts. Monitor the job in the portal notifications.
5. After the migration finishes, view the migrated Azure VM in Virtual Machines in the Azure portal. The
machine name has a suffix -Test .
6. After the test is done, select and hold (or right-click) the Azure VM in Replicating machines . Then select
Clean up test migration .
Migrate the VM
Now Contoso admins run a full migration to complete the move.
1. In the Azure Migrate project, go to Servers > Azure Migrate: Server Migration , and select Replicating servers .
2. In Replicating machines , select and hold (or right-click) the VM, and then select Migrate .
3. In Migrate > Shut down virtual machines and perform a planned migration with no data loss ,
select Yes > OK .
By default, Azure Migrate shuts down the on-premises VM and runs an on-demand replication to synchronize any VM changes that occurred since the last replication. This action ensures no data loss.
If you don't want to shut down the VM, select No .
4. A migration job starts for the VM. Track the job in Azure notifications.
5. After the job finishes, you can view and manage the VM from the Virtual Machines page.
3. The on-premises MySQL database is version 5.7, so select this version for compatibility. Use the default
sizes, which match database requirements.
4. For Backup Redundancy Options , select Geo-Redundant . This option allows you to restore the database
in the secondary region ( Central US ) if an outage occurs. You can configure this option only when you
provision the database.
5. In the VNET-PROD-EUS2 network, go to Service endpoints , and add a service endpoint for the SQL service on the database subnet.
6. After adding the subnet, create a virtual network rule that allows access from the database subnet in the
production network.
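A minimal Azure CLI sketch of steps 5 and 6 follows; the resource group names, server name, and rule name are hypothetical, while the virtual network and subnet names come from the scenario above.

    # Enable the Microsoft.Sql service endpoint on the database subnet.
    az network vnet subnet update \
      --resource-group ContosoNetworkingRG \
      --vnet-name VNET-PROD-EUS2 \
      --name PROD-DB-EUS2 \
      --service-endpoints Microsoft.Sql

    # Allow that subnet on the Azure Database for MySQL server.
    az mysql server vnet-rule create \
      --resource-group ContosoRG \
      --server-name contoso-osticket-mysql \
      --name AllowProdDbSubnet \
      --subnet "/subscriptions/<subscription-id>/resourceGroups/ContosoNetworkingRG/providers/Microsoft.Network/virtualNetworks/VNET-PROD-EUS2/subnets/PROD-DB-EUS2"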
Step 6: Migrate the database
There are several ways to move the MySQL database. Each option requires the Contoso admins to create an Azure
Database for MySQL instance for the target. After it's created, they can perform the migration by using two paths
that are described in the following steps:
6a: Database Migration Service
6b: MySQL Workbench backup and restore
Step 6a: Migrate the database via Database Migration Service
Contoso admins migrate the database via Database Migration Service by following the step-by-step migration
tutorial. They can perform online, offline, and hybrid (preview) migrations by using MySQL 5.6 or 5.7.
NOTE
MySQL 8.0 is supported in Azure Database for MySQL, but the Database Migration Service tool doesn't yet support that
version.
6. Now, import (restore) the database in the Azure Database for MySQL instance from the self-contained file. A
new schema ( osticket ) is created for the instance.
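For reference, the same restore can be run from the command line instead of MySQL Workbench. This is a sketch with hypothetical server, user, and file names; the user name uses the user@server-name form that Azure Database for MySQL single server expects.

    # Import the self-contained dump into the Azure Database for MySQL instance.
    mysql -h contoso-osticket-mysql.mysql.database.azure.com \
          -u contosoadmin@contoso-osticket-mysql -p < osticket_selfcontained.sql

After the import, the new osticket schema described above should be visible on the server.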
Connect the VM to the database
As the final step in the migration process, Contoso admins update the connection string of the application to point
to the application database running on the OSTICKETMYSQL VM.
1. Make an SSH connection to the OSTICKETWEB VM by using PuTTY or another SSH client. The VM is private, so
connect by using the private IP address.
2. Make sure that the OSTICKETWEB VM can communicate with the OSTICKETMYSQL VM. Currently, the
configuration is hardcoded with the on-premises IP address 172.16.0.43 .
Before the update:
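With the database now on Azure Database for MySQL rather than on a VM, the update replaces the hardcoded on-premises address with the managed server's endpoint. The sketch below assumes osTicket's standard include/ost-config.php location and hypothetical server and user names; DBUSER and DBPASS in the same file must also be changed to the user@server-name sign-in and its password.

    # Point the application at the Azure Database for MySQL endpoint (names are examples).
    sudo sed -i 's/172.16.0.43/contoso-osticket-mysql.mysql.database.azure.com/g' /var/www/osticket/include/ost-config.php

    # Edit DBUSER and DBPASS in the same file to the user@server-name form, then restart the web server.
    sudo systemctl restart apache2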
4. Finally, update the DNS records for OSTICKETWEB and OSTICKETMYSQL on one of the Contoso domain
controllers.
Need more help?
Learn about how to run a test migration.
Learn about how to migrate VMs to Azure.
This article demonstrates how the fictional company Contoso rehosts its dev/test environment for two applications
running on VMware virtual machines (VMs) by migrating to Azure Virtual Machines.
The SmartHotel360 and osTicket applications used in this example are open source. You can download them for
your own testing purposes.
Migration options
Contoso has several options available for moving dev/test environments to Azure:
MIGRATION OPTIONS | OUTCOME
NOTE
Read how Contoso moved its dev/test environment to Azure by using DevTest Labs.
Business drivers
The development leadership team has outlined what it wants to achieve with this migration. It aims to quickly
move dev/test capabilities out of an on-premises datacenter and no longer purchase hardware to develop
software. It also seeks to empower developers to create and run their environments without involvement from IT.
NOTE
Contoso will use the Pay-As-You-Go Dev/Test subscription offer for its environments. Each active Visual Studio subscriber on
the team can use the Microsoft software included with the subscription virtual machines for dev/test at no extra charge.
Contoso will just pay the Linux rate for VMs that it runs. That includes VMs with SQL Server, SharePoint Server, or other
software that's normally billed at a higher rate.
Migration goals
The Contoso development team has pinned down goals for this migration. These goals are used to determine the
best migration method:
Contoso wants to quickly move out of its on-premises dev/test environments.
After migration, Contoso's dev/test environment in Azure should have enhanced capabilities over the current
system in VMware.
The operations model will move from IT provisioned to DevOps with self-service provisioning.
Solution design
After pinning down goals and requirements, Contoso designs and reviews a deployment solution and identifies the
migration process. The process includes the Azure services that Contoso will use for the migration.
Current application
The dev/test VMs for the two applications are running on VMs ( WEBVMDEV , SQLVMDEV , OSTICKETWEBDEV ,
OSTICKETMYSQLDEV ). These VMs are used for development before code is promoted to the production VMs.
The VMs are located on VMware ESXi host contosohost1.contoso.com (version 6.5).
The VMware environment is managed by vCenter Server 6.5 ( vcenter.contoso.com ), running on a VM.
Contoso has an on-premises datacenter ( contoso-datacenter ) with an on-premises domain controller (
contosodc1 ).
Proposed architecture
Because the VMs are used for dev/test, they'll reside in the ContosoDevRG resource group in Azure.
The VMs will be migrated to the primary Azure region ( East US 2 ) and placed in the development virtual
network ( VNET-DEV-EUS2 ).
The web front-end VMs will reside in the front-end subnet ( DEV-FE-EUS2 ) in the development network.
The database VM will reside in the database subnet ( DEV-DB-EUS2 ) in the development network.
The on-premises VMs in the Contoso datacenter will be decommissioned after the migration is done.
Figure 1: Proposed architecture.
Database considerations
To support ongoing development, Contoso has decided to continue using existing VMs and migrate them to Azure.
In the future, Contoso will pursue the use of platform as a service (PaaS) services such as Azure SQL Database and
Azure Database for MySQL.
Database VMs will be migrated as is without changes.
With the use of the Azure Dev/Test subscription offer, the computers running Windows Server and SQL Server
will not incur licensing fees. Avoiding fees will keep the compute costs to a minimum.
In the future, Contoso will look to integrate its development with PaaS services.
Solution review
Contoso evaluates the proposed design by putting together a list of pros and cons.
CONSIDERATION | DETAILS
Cons | The migration will only move the VMs, without yet moving to PaaS services for development. This means that Contoso will have to start supporting the operations of its VMs, including security patches. IT maintained this in the past, so Contoso will need to find a solution for this new operational task.
NOTE
Contoso could address the cons in its list by using DevTest Labs.
Migration process
Contoso will migrate its development front end and database to Azure VMs by using the agentless method in the
Azure Migrate: Server Migration tool.
Contoso prepares and sets up Azure components for Azure Migrate: Server Migration, and prepares the on-
premises VMware infrastructure.
The Azure infrastructure is in place, so Contoso just needs to configure the replication of the VMs through the
Azure Migrate: Server Migration tool.
With everything prepared, Contoso can start replicating the VMs.
After replication is enabled and working, Contoso migrates the VMs by testing the migration and, if successful, failing them over to Azure.
After the development VMs are up and running in Azure, Contoso will reconfigure its development
workstations to point at the VMs now running in Azure.
Figure 2: An overview of the migration process.
Azure services
SERVICE | DESCRIPTION | COST
Azure Migrate: Server Migration | The service orchestrates and manages migrating on-premises applications and workloads and AWS or GCP VM instances. | During replication to Azure, Azure Storage charges are incurred. Azure VMs are created and incur charges when the migration occurs and the VMs are running in Azure. Learn more about charges and pricing.
Prerequisites
This is what Contoso needs to run this scenario:
On-premises servers | On-premises vCenter servers should be running version 5.5, 6.0, 6.5, or 6.7.
Scenario steps
Here's how Contoso admins will run the migration:
Step 1: Prepare Azure for Azure Migrate: Server Migration. They add the server migration tool to their Azure Migrate project.
Step 2: Prepare on-premises VMware for Azure Migrate: Server Migration. They prepare accounts for VM discovery and prepare to connect to Azure VMs after migration.
Step 3: Replicate VMs. They set up replication and start replicating VMs to Azure Storage.
Step 4: Migrate the VMs with Azure Migrate: Server Migration. They run a test migration to make sure
everything's working and then run a full migration to move the VMs to Azure.
Step 1: Prepare Azure for the Azure Migrate: Server Migration tool
Contoso needs to migrate the VMs to a virtual network where the Azure VMs will reside when they're created,
provisioned, and configured through the Azure Migrate: Server Migration tool.
1. Set up a network: Contoso already set up a network that can be used for Azure Migrate: Server Migration when it deployed the Azure infrastructure.
The VMs to be migrated are used for development. They will migrate to the Azure development virtual
network ( VNET-DEV-EUS2 ) in the primary East US 2 region.
Both VMs will be placed in the ContosoDevRG resource group, which is used for development resources.
The application front-end VMs ( WEBVMDEV and OSTICKETWEBDEV ) will migrate to the front-end subnet (
DEV-FE-EUS2 ), in the development virtual network.
The application database VMs ( SQLVMDEV and OSTICKETMYSQLDEV ) will migrate to the database subnet (
DEV-DB-EUS2 ), in the development virtual network.
2. Provision the Azure Migrate: Server Migration tool.
a. From Azure Migrate, download the .OVA image and import it into VMware.
NOTE
In the case of Contoso, the admins will select No to Azure Hybrid Benefit because this is an Azure Dev/Test
subscription. This means they'll pay for the compute only. Azure Hybrid Benefit should be used only for production
systems that have Software Assurance benefits.
8. In Compute , review the VM name, size, OS disk type, and availability set. VMs must conform with Azure
requirements.
VM size: If you're using assessment recommendations, this drop-down list contains the recommended
size. Otherwise, Azure Migrate selects a size based on the closest match in the Azure subscription. You
can choose a manual size instead in Azure VM size .
OS disk : Specify the OS (boot) disk for the VM. The OS disk has the operating system bootloader and
installer.
Availability set: If the VM should be in an Azure availability set after migration, then specify the set. The
set must be in the target resource group that you specify for the migration.
9. In Disks , specify whether the VM disks should be replicated to Azure and select the disk type (standard
SSD/HDD or premium managed disks) in Azure. Then select Next . You can exclude disks from replication. If
you do, they won't be present on the Azure VM after migration.
10. In Review and start replication , review the settings and select Replicate to start the initial replication for
the servers.
NOTE
You can update replication settings at any time before replication starts in Manage > Replicating machines . Settings can't
be changed after replication starts.
Conclusion
In this article, Contoso rehosted the development VMs used for its SmartHotel360 and osTicket applications in
Azure. The admins migrated the application VMs to Azure VMs by using the Azure Migrate: Server Migration tool.
Migrate a dev/test environment to Azure DevTest
Labs
10/30/2020 • 11 minutes to read
This article demonstrates how the fictional company Contoso migrates its dev/test environment to Azure DevTest
Labs.
Migration options
Contoso has several options available when moving its dev/test environment to Azure.
MIGRATION OPTIONS | OUTCOME
NOTE
This article focuses on using DevTest Labs to move an on-premises dev/test environment to Azure. Read how Contoso
moved dev/test to Azure IaaS via Azure Migrate.
Business drivers
The development leadership team has outlined what it wants to achieve with this migration:
Empower developers with access to DevOps tools and self-service environments.
Give access to DevOps tools for continuous integration/continuous delivery (CI/CD) pipelines and cloud-native
tools for dev/test, such as AI, machine learning, and serverless.
Ensure governance and compliance in dev/test environments.
Save costs by moving all dev/test environments out of the datacenter and no longer purchase hardware to
develop software.
NOTE
Contoso will use the Pay-As-You-Go Dev/Test subscription offer for its environments. Each active Visual Studio subscriber on
the team can use the Microsoft software included with the subscription on Azure Virtual Machines for dev/test at no extra
charge. Contoso will just pay the Linux rate for VMs that it runs. That includes VMs with SQL Server, SharePoint Server, or
other software that's normally billed at a higher rate.
NOTE
Azure customers with an Enterprise Agreement can also benefit from the Azure Dev/Test subscription offer. To learn more,
review this video on creating an Azure Dev/Test subscription by using the Enterprise Agreement portal.
Migration goals
The Contoso development team has pinned down goals for this migration. These goals are used to determine the
best migration method:
Quickly provision development and test environments. It should take minutes, not months, to build the
infrastructure that a developer needs to write or test software.
After migration, Contoso's dev/test environment in Azure should have enhanced capabilities over the current
system on-premises.
The operations model will move from IT-provisioned VMs to DevOps with self-service provisioning.
Contoso wants to quickly move out of its on-premises dev/test environments.
All developers will connect to dev/test environments remotely and securely.
Solution design
After pinning down goals and requirements, Contoso designs and reviews a deployment solution. The solution
includes the Azure services that it will use for dev/test.
Current architecture
The dev/test VMs for Contoso's applications are running on VMware in the on-premises datacenter.
These VMs are used for development and testing before code is promoted to the production VMs.
Developers maintain their own workstations, but they need new solutions for connecting remotely from home
offices.
Proposed architecture
Contoso will use an Azure Dev/Test subscription to reduce costs for Azure resources. This subscription offers
significant savings, including VMs that don't incur licensing fees for Microsoft software.
Contoso will use DevTest Labs for managing the environments. New VMs will be created in DevTest Labs to
support the move to new tools for development and testing in the cloud.
The on-premises dev/test VMs in the Contoso datacenter will be decommissioned after the migration is done.
Developers and testers will have access to Windows Virtual Desktop for their workstations.
Figure 1: Scenario architecture.
Database considerations
To support ongoing development, Contoso has decided to continue using databases running on VMs. But the
current VMs will be replaced with new ones running in DevTest Labs. In the future, Contoso will pursue the use of
platform as a service (PaaS) services such as Azure SQL Database and Azure Database for MySQL.
Current VMware database VMs will be decommissioned and replaced with Azure VMs in DevTest Labs. The existing
databases will be migrated with simple backups and restores. Using the Azure Dev/Test subscription offer won't
incur licensing fees for the Windows Server and SQL Server instances, minimizing compute costs.
Solution review
Contoso evaluates the proposed design by putting together a list of pros and cons.
CONSIDERATION | DETAILS
Migration process
Contoso will migrate its development application and database VMs to new Azure VMs by using DevTest Labs.
Contoso already has the Azure infrastructure in place, including the development virtual network.
With everything prepared, Contoso will provision and configure DevTest Labs.
Contoso will configure the development virtual network, assign a resource group, and set policies.
Contoso will create Windows Virtual Desktop instances for developers to use at remote locations.
Contoso will create VMs within DevTest Labs for development and migrate databases.
Prerequisites
Here's what Contoso needs to run this scenario.
Scenario steps
Here's how Contoso admins will run the migration:
Step 1: Provision a new Azure Dev/Test subscription and create a DevTest Labs instance.
Step 2: Configure the development virtual network, assign a resource group, and set policies.
Step 3: Create Windows 10 Enterprise multi-session virtual desktops for developers to use from remote
locations.
Step 4: Create formulas and VMs within DevTest Labs for development and migrate databases.
NOTE
The Contributor role is an administrator-level role with all rights except the ability to provide access to other users.
Read more about Azure role-based access control.
Figure 9: Auto-start.
d. Contoso configures the allowed VM sizes, ensuring that large and expensive VMs can't be started.
Figure 10: Allowed VM sizes.
e. Contoso configures the support message.
Step 4: Create formulas and VMs within DevTest Labs for development
and migrate databases
With DevTest Labs configured and the remote developers' workstation up and running, Contoso focuses on
building its VMs for development. To get started, Contoso completes the following steps:
1. Contoso creates formulas (reusable bases) for application and database VMs, and it provisions application
and database VMs by using the formulas.
Contoso selects Formulas > + Add , and then a Windows Server 2012 R2 Datacenter base.
Conclusion
In this article, Contoso moved its development environments to DevTest Labs. It also implemented Windows
Virtual Desktop as a platform for remote and contract developers.
Need more help?
Create a DevTest Labs instance in your subscription now, and learn how to use DevTest Labs for developers.
Migrate an application to Azure App Service and
SQL Database
10/30/2020 • 15 minutes to read
This article demonstrates how the fictional company Contoso refactors a two-tier Windows .NET application that's
running on VMware VMs as part of a migration to Azure. The Contoso team migrates the application front-end
virtual machine (VM) to an Azure App Service web app and the application database to Azure SQL Database.
The SmartHotel360 application that we use in this example is provided as open source. If you want to use it for
your own testing purposes, you can download it from GitHub.
Business drivers
The Contoso IT leadership team has worked closely with business partners to understand what they want to
achieve with this migration:
Address business growth. Contoso is growing, and there is pressure on its on-premises systems and infrastructure.
Increase efficiency. Contoso needs to remove unnecessary procedures and streamline processes for developers and users. The business needs IT to move quickly, without wasting time or money, so it can deliver on customer requirements faster.
Increase agility. To succeed in a global economy, Contoso IT needs to be more responsive to the needs of the business. It must be able to react more quickly to changes in the marketplace. IT must not get in the way or become a business blocker.
Scale. As the business grows successfully, Contoso IT must provide systems that are able to grow at the same pace.
Reduce costs. Contoso wants to minimize licensing costs.
Migration goals
To help determine the best migration method, the Contoso cloud team pinned down the following goals:
The team also wants to move away from SQL Server 2008 R2
to a modern platform as a service (PaaS) database, which will
minimize the need for management.
Solution design
After pinning down their goals and requirements, Contoso designs and reviews a deployment solution. They also
identify the migration process, including the Azure services that they'll use for the migration.
Current application
The SmartHotel360 on-premises application is tiered across two VMs, WEBVM and SQLVM .
The VMs are located on VMware ESXi host contosohost1.contoso.com version 6.5.
The VMware environment is managed by vCenter Server 6.5 (vcenter.contoso.com), which runs on a VM.
Contoso has an on-premises datacenter (contoso-datacenter), with an on-premises domain controller
(contosodc1).
The on-premises VMs in the Contoso datacenter will be decommissioned after the migration is done.
Proposed solution
For the database tier of the application, Contoso compared Azure SQL Database to SQL Server by referring to
Features comparison: Azure SQL Database and Azure SQL Managed Instance. Contoso decided to use Azure
SQL Database for a few reasons:
Azure SQL Database is a managed relational database service. It delivers predictable performance at
multiple service levels, with near-zero administration. Advantages include dynamic scalability with no
downtime, built-in intelligent optimization, and global scalability and availability.
Contoso can use the lightweight Data Migration Assistant to assess the on-premises database migration
to Azure SQL.
Contoso can use Azure Database Migration Service to migrate the on-premises database to Azure SQL.
With Software Assurance, Contoso can exchange existing licenses for discounted rates on a database in
SQL Database by using the Azure Hybrid Benefit for SQL Server. This approach could provide a cost
saving of up to 30 percent.
SQL Database provides security features such as Always Encrypted, dynamic data masking, row-level
security, and SQL threat detection.
For the application web tier, Contoso has decided to use Azure App Service. This PaaS service enables them to
deploy the application with just a few configuration changes. Contoso will use Visual Studio to make the
change, and they'll deploy two web apps, one for the website and one for the WCF service.
To meet requirements for a DevOps pipeline, Contoso will use Azure DevOps for source code management with
Git repos. They'll use automated builds and release to build the code and deploy it to the Azure App Service.
Solution review
Contoso evaluates their proposed design by putting together a pros and cons list, as shown in the following table:
CONSIDERATION | DETAILS
Proposed architecture
Migration process
1. Contoso provisions a database in Azure SQL Database and then migrates the SmartHotel360 database to it by using Azure Database Migration Service.
2. Contoso provisions and configures web apps, and deploys the SmartHotel360 application to them.
Azure services
SERVICE | DESCRIPTION | COST
Azure App Service Migration Assistant | A free and simple path to seamlessly migrate .NET web applications from on-premises to the cloud with minimal to no code changes. | It's a downloadable tool, free of charge.
Data Migration Assistant | Contoso will use Data Migration Assistant to assess and detect compatibility issues that might affect database functionality in Azure. Data Migration Assistant assesses feature parity between SQL sources and targets, and it recommends performance and reliability improvements. | It's a downloadable tool, free of charge.
Azure Database Migration Service | Azure Database Migration Service enables seamless migration from multiple database sources to Azure data platforms with minimal downtime. | Learn about supported regions and Database Migration Service pricing.
Azure SQL Database | An intelligent, fully managed relational cloud database service. | Cost is based on features, throughput, and size. Learn more.
Azure App Service | Helps create powerful cloud applications that use a fully managed platform. | Pricing is based on size, location, and usage duration. Learn more.
Prerequisites
To run this scenario, Contoso must meet the following prerequisites:
Scenario steps
Here's how Contoso will run the migration:
Step 1: Assess and migrate the web apps. Contoso uses the Azure App Service Migration Assistant tool to run pre-migration compatibility checks and migrate their web apps to Azure App Service.
Step 2: Provision a database in Azure SQL Database . Contoso provisions an Azure SQL Database
instance. After the application website is migrated to Azure, the WCF service web app will point to this instance.
Step 3: Assess the database . Contoso assesses the database for migration by using Data Migration Assistant
and then migrates it via Azure Database Migration Service.
Step 4: Set up Azure DevOps . Contoso creates a new Azure DevOps project, and imports the Git repo.
Step 5: Configure connection strings . Contoso configures connection strings so that the web tier web app,
the WCF service web app, and the SQL instance can communicate.
Step 6: Set up build and release pipelines in Azure DevOps . As a final step, Contoso sets up build and
release pipelines in Azure DevOps to create the application, and then deploys them to two separate web apps.
2. They specify a database name to match the database, SmartHotel.Registration , that's running on the on-
premises VM. They place the database in the ContosoRG resource group. This is the resource group they
use for production resources in Azure.
3. They set up a new SQL Server instance, sql-smarthotel-eus2, in the primary region.
4. They set the pricing tier to match their server and database needs. And they select to save money with
Azure Hybrid Benefit because they already have a SQL Server license.
5. For sizing, they use vCore-based purchasing and set the limits for their expected requirements.
6. They create the database instance.
7. They open the database and note the details they'll need when they use Data Migration Assistant for
migration.
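The provisioning steps above could also be scripted with the Azure CLI rather than done in the portal. This is a minimal sketch under the assumptions of this scenario: the admin credentials and the vCore sizing are placeholders, and --license-type BasePrice is the flag that applies the Azure Hybrid Benefit for an existing SQL Server license.
# Create the logical server and a vCore-based database with Azure Hybrid Benefit.
az sql server create --name sql-smarthotel-eus2 --resource-group ContosoRG \
  --location eastus2 --admin-user sqladmin --admin-password '<password>'
az sql db create --name SmartHotel.Registration --server sql-smarthotel-eus2 \
  --resource-group ContosoRG --edition GeneralPurpose --family Gen5 --capacity 2 \
  --license-type BasePrice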
2. They import the Git repo that currently holds their application code. They download it from the public
GitHub repository.
3. They connect Visual Studio to the repo and then clone the code to the developer machine by using Team
Explorer.
4. They open the solution file for the application. The web app and the WCF service have separate projects
within the file.
Step 5: Configure connection strings
The Contoso admins make sure that the web apps and database can communicate with each other. To do this, they
configure connection strings in the code and in the web apps.
1. In the web app for the WCF service, SHWCF-EUS2 , under Settings > Application settings , they add a new
connection string named DefaultConnection .
2. They pull the connection string from the SmartHotel-Registration database and then update it with the
correct credentials.
3. In Visual Studio, the admins open the SmartHotel.Registration.wcf project from the solution file. In the
project, they update the connectionStrings section of the web.config file with the connection string.
4. They change the client section of the web.config file for SmartHotel.Registration.Web to point to the new
location of the WCF service. This is the URL of the WCF web app that hosts the service endpoint.
5. With the code changes now in place, the admins commit and sync them by using Team Explorer in Visual
Studio.
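Besides setting it in the portal, the DefaultConnection string could be applied to the WCF web app with the Azure CLI. The sketch below is illustrative; the server name, database name, and credentials are placeholders drawn from this scenario rather than verified values.
# Set the connection string on the WCF service web app (placeholders shown).
az webapp config connection-string set \
  --name SHWCF-EUS2 --resource-group ContosoRG \
  --connection-string-type SQLAzure \
  --settings DefaultConnection="Server=tcp:sql-smarthotel-eus2.database.windows.net,1433;Database=SmartHotel.Registration;User ID=<user>;Password=<password>;Encrypt=true;"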
2. They select Azure Repos Git and, in the Repositor y drop-down list, they select the relevant repo.
3. Under Select a template , they select the ASP.NET template for their build.
4. They use the name ContosoSmartHotelRefactor-ASP.NET-CI for the build and then select Save & queue, which kicks off the first build.
5. They select the build number to watch the process. After it's finished, the admins can see the process
feedback, and they select Artifacts to review the build results.
The Artifacts explorer pane opens, and the drop folder displays the build results.
The two .zip files are the packages that contain the applications.
These .zip files are used in the release pipeline for deployment to Azure App Service.
6. They select Releases > + New pipeline .
9. Under the stages, they select 1 job, 1 task to configure deployment of the WCF service.
10. They verify that the subscription is selected and authorized, and then they select the app service name.
11. On the pipeline, under Artifacts, they select + Add an artifact, and then select to build with the ContosoSmarthotel360Refactor pipeline.
12. To enable the continuous deployment trigger, the admins select the lightning bolt icon on the artifact.
15. In Select a file or folder , they expand the drop folder, select the SmartHotel.Registration.Wcf.zip file that
was created during the build, and then select Save .
16. They select Pipeline > Stages , and then select + Add to add an environment for SHWEB-EUS2 . They select
another Azure App Service deployment.
17. They repeat the process to publish the SmartHotel.Registration.Web.zip file to the correct web app, and
then select Save .
18. They go back to Build , select Triggers , and then select the Enable continuous integration check box.
This action enables the pipeline so that when changes are committed to the code, the full build and release
occur.
19. They select Save & queue to run the full pipeline. A new build is triggered, which in turn creates the first
release of the application to the Azure App Service.
20. Contoso admins can follow the build and release pipeline process from Azure DevOps. After the build
finishes, the release starts.
21. After the pipeline finishes, both sites have been deployed, and the application is up and running online.
The application has been successfully migrated to Azure.
Conclusion
In this article, Contoso refactored the SmartHotel360 application in Azure by migrating the application front-end
VM to two Azure App Service web apps. The application database was migrated to Azure SQL Database.
Refactor an on-premises application to an Azure App
Service web app and a SQL managed instance
10/30/2020 • 18 minutes to read
This article demonstrates how the fictional company Contoso refactors a two-tier Windows .NET application that's
running on VMware virtual machines (VMs) as part of a migration to Azure. The Contoso team migrates the
application front-end VM to an Azure App Service web app. The article also shows how Contoso migrates the
application database to an Azure SQL managed instance.
The SmartHotel360 application that we use in this example is provided as open source. If you want to use it for
your own testing purposes, you can download it from GitHub.
Business drivers
The Contoso IT leadership team has worked closely with business partners to understand what they want to
achieve with this migration:
Address business growth. Contoso is growing, and there is pressure on its on-premises systems and infrastructure.
Increase efficiency. Contoso needs to remove unnecessary procedures and streamline processes for developers and users. The business needs IT to move quickly, without wasting time or money, so it can deliver on customer requirements faster.
Increase agility. Contoso IT needs to be more responsive to the needs of the business. To succeed in a global economy, it must be able to react quickly to changes in the marketplace and must not get in the way or become a business blocker.
Scale. As the business grows successfully, Contoso IT must provide systems that are able to grow at the same pace.
Reduce costs. Contoso wants to minimize licensing costs.
Migration goals
To help determine the best migration method, the Contoso cloud team pinned down the following goals:
The team also wants to move away from SQL Server 2008 R2
to a modern platform as a service (PaaS) database, which will
minimize the need for management.
Solution design
After pinning down their goals and requirements, Contoso designs and reviews a deployment solution. They also
identify the migration process, including the Azure services that they'll use for the migration.
Current application
The SmartHotel360 on-premises application is tiered across two VMs, WEBVM and SQLVM .
The VMs are located on VMware ESXi host contosohost1.contoso.com version 6.5.
The VMware environment is managed by vCenter Server 6.5 (vcenter.contoso.com), which runs on a VM.
Contoso has an on-premises datacenter (contoso-datacenter), with an on-premises domain controller
(contosodc1).
The on-premises VMs in the Contoso datacenter will be decommissioned after the migration is done.
Proposed solution
For the application web tier, Contoso has decided to use Azure App Service. This PaaS service enables them to
deploy the application with just a few configuration changes. Contoso will use Visual Studio to make the change,
and they'll deploy two web apps, one for the website and one for the WCF service.
To meet requirements for a DevOps pipeline, Contoso will use Azure DevOps for source code management with
Git repos. They'll use automated builds and release to build the code and deploy it to the Azure App Service.
Database considerations
As part of the solution design process, Contoso did a feature comparison between Azure SQL Database and SQL
Managed Instance. They decided to use SQL Managed Instance based on the following considerations:
SQL Managed Instance aims to deliver almost 100 percent compatibility with the latest on-premises SQL Server
version. Microsoft recommends SQL Managed Instance for customers who are running SQL Server on-premises or on infrastructure as a service (IaaS) VMs and who want to migrate their applications to a fully managed service with minimal design changes.
Contoso is planning to migrate a large number of applications from on-premises to IaaS VMs. Many of these
VMs are provided by independent software vendors. Contoso realizes that using SQL Managed Instance will
help ensure database compatibility for these applications. They'll use SQL Managed Instance rather than SQL
Database, which might not be supported.
Contoso can simply do a lift and shift migration to SQL Managed Instance by using the fully automated Azure
Database Migration Service. With this service in place, Contoso can reuse it for future database migrations.
SQL Managed Instance supports SQL Server Agent, an important component of the SmartHotel360 application.
Contoso needs this compatibility; otherwise, they'll have to redesign the maintenance plans required by the
application.
With Software Assurance, Contoso can exchange their existing licenses for discounted rates on a SQL managed
instance by using the Azure Hybrid Benefit for SQL Server. This allows Contoso to save up to 30 percent by
using SQL Managed Instance.
Their SQL managed instance is fully contained in the virtual network, so it provides greater isolation and
security for Contoso's data. Contoso can get the benefits of the public cloud, while keeping the environment
isolated from the public internet.
SQL Managed Instance supports many security features, including always-encrypted, dynamic data masking,
row-level security, and threat detection.
Solution review
Contoso evaluates their proposed design by putting together a pros and cons list, as shown in the following table:
CONSIDERATION | DETAILS
For the data tier, SQL Managed Instance might not be the best solution if Contoso wants to customize the operating system or the database server, or if they want to run third-party applications along with SQL Server. Running SQL Server on an IaaS VM could provide this flexibility.
Proposed architecture
Migration process
1. Contoso provisions an Azure SQL managed instance and then migrates the SmartHotel360 database to it by
using Azure Database Migration Service.
2. Contoso provisions and configures web apps and deploys the SmartHotel360 application to them.
Azure services
SERVICE | DESCRIPTION | COST
Azure App Service Migration Assistant | A free and simple path to seamlessly migrate .NET web applications from on-premises to the cloud with minimal to no code changes. | It's a downloadable tool, free of charge.
Azure Database Migration Service | Azure Database Migration Service enables seamless migration from multiple database sources to Azure data platforms with minimal downtime. | Learn about supported regions and Azure Database Migration Service pricing.
Azure SQL Managed Instance | SQL Managed Instance is a managed database service that represents a fully managed SQL Server instance in Azure. It uses the same code as the latest version of SQL Server Database Engine, and has the latest features, performance improvements, and security patches. | Using a SQL managed instance that runs in Azure incurs charges based on capacity. Learn more about SQL Managed Instance pricing.
Azure App Service | Helps create powerful cloud applications that use a fully managed platform. | Pricing is based on size, location, and usage duration. Learn more.
Prerequisites
To run this scenario, Contoso must meet the following prerequisites:
Scenario steps
Here's how Contoso will run the migration:
Step 1: Assess and migrate the web apps. Contoso uses the Azure App Service Migration Assistant tool to run pre-migration compatibility checks and migrate their web apps to Azure App Service.
Step 2: Set up a SQL managed instance. Contoso needs an existing managed instance to which the on-premises SQL Server database will migrate.
Step 3: Migrate via Azure Database Migration Service. Contoso migrates the application database via Azure Database Migration Service.
Step 4: Set up Azure DevOps . Contoso creates a new Azure DevOps project, and imports the Git repo.
Step 5: Configure connection strings . Contoso configures connection strings so that the web tier web app,
the WCF service web app, and the SQL managed instance can communicate.
Step 6: Set up build and release pipelines in Azure DevOps . As a final step, Contoso sets up build and
release pipelines in Azure DevOps to create the application. The team then deploys the pipelines to two separate
web apps.
5. They set custom DNS settings. The DNS settings point first to Contoso's Azure domain controllers. Azure
DNS is secondary. The Contoso Azure domain controllers are located as follows:
Located in the PROD-DC-EUS2 subnet of the production network (VNET-PROD-EUS2) in the East US 2
region.
CONTOSODC3 address: 10.245.42.4
CONTOSODC4 address: 10.245.42.5
Azure DNS resolver: 168.63.129.16
Need more help?
Read the SQL Managed Instance overview.
Learn how to create a virtual network for a SQL managed instance.
Learn how to set up peering.
Learn how to update Azure Active Directory DNS settings.
Set up routing
The managed instance is placed in a private virtual network. Contoso needs a route table for the virtual network to
communicate with the Azure management service. If the virtual network can't communicate with the service that
manages it, the virtual network becomes inaccessible.
Contoso considers these factors:
The route table contains a set of rules (routes) that specify how packets that are sent from the managed instance
should be routed in the virtual network.
The route table is associated with subnets where managed instances are deployed. Each packet that leaves a
subnet is handled based on the associated route table.
A subnet can be associated with only one route table.
There are no additional charges for creating route tables in Microsoft Azure.
To set up routing, Contoso admins do the following:
1. They create a user-defined route table in the ContosoNetworkingRG resource group.
2. To comply with SQL Managed Instance requirements, after the route table (MIRouteTable) is deployed, the
admins add a route with an address prefix of 0.0.0.0/0 . The Next hop type option is set to Internet .
3. They associate the route table with the SQLMI-DB-EUS2 subnet (in the VNET-SQLMI-EUS2 network).
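A minimal Azure CLI sketch of the same routing setup follows. It assumes the virtual network lives in the ContosoNetworkingRG resource group, and the route name is hypothetical.
# User-defined route table with a 0.0.0.0/0 route whose next hop is Internet,
# associated with the managed instance subnet.
az network route-table create --name MIRouteTable \
  --resource-group ContosoNetworkingRG --location eastus2
az network route-table route create --resource-group ContosoNetworkingRG \
  --route-table-name MIRouteTable --name DefaultToInternet \
  --address-prefix 0.0.0.0/0 --next-hop-type Internet
az network vnet subnet update --resource-group ContosoNetworkingRG \
  --vnet-name VNET-SQLMI-EUS2 --name SQLMI-DB-EUS2 --route-table MIRouteTable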
2. They import the Git repo that currently holds their application code. They download it from the public
GitHub repository.
3. They connect Visual Studio to the repo and then clone the code to the developer machine by using Team
Explorer.
4. They open the solution file for the application. The web app and WCF service have separate projects within
the file.
Step 5: Configure connection strings
The Contoso admins make sure that the web apps and database can communicate with each other. To do this, they
configure connection strings in the code and in the web apps.
1. In the web app for the WCF service, SHWCF-EUS2, under Settings > Application settings , they add a new
connection string named DefaultConnection .
2. They pull the connection string from the SmartHotel-Registration database and then update it with the
correct credentials.
3. In Visual Studio, the admins open the SmartHotel.Registration.wcf project from the solution file. In the
project, they update the connectionStrings section of the web.config file with the connection string.
4. They change the client section of the web.config file for SmartHotel.Registration.Web to point to the new
location of the WCF service. This is the URL of the WCF web app that hosts the service endpoint.
5. With the code changes now in place, the admins commit and sync them by using Team Explorer in Visual
Studio.
4. They use the name ContosoSmartHotelRefactor-ASP.NET-CI for the build and then select Save & Queue, which kicks off the first build.
5. They select the build number to watch the process. After it's finished, the admins can see the process
feedback, and they select Artifacts to review the build results.
The Artifacts explorer pane opens, and the drop folder displays the build results.
The two .zip files are the packages that contain the applications.
These .zip files are used in the release pipeline for deployment to Azure App Service.
6. They select Releases > + New pipeline .
9. Under the stages, they select 1 job, 1 task to configure deployment of the WCF service.
10. They verify that the subscription is selected and authorized, and then they select the app service name.
11. On the pipeline, they select Artifacts, select + Add an artifact, select Build as the source type, and then build with the ContosoSmarthotel360Refactor pipeline.
12. To enable the continuous deployment trigger, the admins select the lightning bolt icon on the artifact.
15. In Select a file or folder , they expand the drop folder, select the SmartHotel.Registration.Wcf.zip file that
was created during the build, and then select Save .
16. They select Pipeline > Stages , and then select + Add to add an environment for SHWEB-EUS2 . They select
another Azure App Service deployment.
17. They repeat the process to publish the web app SmartHotel.Registration.Web.zip file to the correct web app,
and then select Save .
18. They go back to Build , select Triggers , and then select the Enable continuous integration check box.
This action enables the pipeline so that when changes are committed to the code, the full build and release
occur.
19. They select Save & Queue to run the full pipeline. A new build is triggered, which in turn creates the first
release of the application to the Azure App Service.
20. Contoso admins can follow the build and release pipeline process from Azure DevOps. After the build
finishes, the release starts.
21. After the pipeline finishes, both sites have been deployed and the application is up and running online.
The application has been successfully migrated to Azure.
Conclusion
In this article, Contoso refactored the SmartHotel360 application in Azure by migrating the application front-end
VM to two Azure App Service web apps. The application database was migrated to an Azure SQL managed
instance.
Refactor a Linux application by using Azure App
Service, Traffic Manager, and Azure Database for
MySQL
10/30/2020 • 14 minutes to read
This article shows how the fictional company Contoso refactors a two-tier LAMP-based application, migrating it
from on-premises to Azure by using Azure App Service with GitHub integration and Azure Database for MySQL.
osTicket, the service desk application that we use in this example, is provided as open source. If you want to use it
for your own testing purposes, you can download it from the osTicket repo in GitHub.
Business drivers
The IT leadership team has worked closely with business partners to understand what they want to achieve:
Address business growth. Contoso is growing and moving into new markets. It needs additional customer service agents.
Scale. The solution should be built so that Contoso can add more customer service agents as the business scales.
Improve resiliency. In the past, issues with the system affected internal users only. With the new business model, external users will be affected, and Contoso needs the application up and running at all times.
Migration goals
To determine the best migration method, the Contoso cloud team has pinned down their goals for this migration:
The application should scale beyond current on-premises capacity and performance. Contoso is moving the
application to take advantage of Azure's on-demand scaling.
Contoso wants to move the application code base to a continuous delivery pipeline. As application changes are
pushed to GitHub, Contoso wants to deploy those changes without tasks for operations staff.
The application must be resilient, with capabilities for growth and failover. Contoso wants to deploy the
application in two different Azure regions and set it up to scale automatically.
Contoso wants to minimize database admin tasks after the application is moved to the cloud.
Solution design
After pinning down their goals and requirements, Contoso designs and reviews a deployment solution, and
identifies the migration process, including the Azure services that will be used for the migration.
Current architecture
The application is tiered across two virtual machines (VMs) ( OSTICKETWEB and OSTICKETMYSQL ).
The VMs are located on VMware ESXi host contosohost1.contoso.com (version 6.5).
The VMware environment is managed by vCenter Server 6.5 ( vcenter.contoso.com ), running on a VM.
Contoso has an on-premises datacenter ( contoso-datacenter ), with an on-premises domain controller (
contosodc1 ).
Proposed architecture
Here's the proposed architecture:
The web tier application on OSTICKETWEB will be migrated by building an Azure App Service web app in two
Azure regions. The Contoso team will implement Azure App Service for Linux by using the PHP 7.0 Docker
container.
The application code will be moved to GitHub, and the Azure App Service web app will be configured for
continuous delivery with GitHub.
Azure App Service will be deployed in both the primary region ( East US 2 ) and secondary region ( Central US
).
Azure Traffic Manager will be set up in front of the two web apps in both regions.
Traffic Manager will be configured in priority mode to force the traffic through East US 2 .
If the Azure app server in East US 2 goes offline, users can access the failed over application in Central US .
The application database will be migrated to the Azure Database for MySQL service by using Azure Database
Migration Service. The on-premises database will be backed up locally, and restored directly to Azure Database
for MySQL.
The database will reside in the primary region ( East US 2 ) in the database subnet ( PROD-DB-EUS2 ) of the
production network ( VNET-PROD-EUS2 ).
Since they're migrating a production workload, Azure resources for the application will reside in the production
resource group ContosoRG .
The Traffic Manager resource will be deployed in Contoso's infrastructure resource group ContosoInfraRG .
The on-premises VMs in the Contoso datacenter will be decommissioned after the migration is done.
Migration process
Contoso completes the migration process as follows:
1. As a first step, Contoso admins set up the Azure infrastructure, including provisioning Azure App Service,
setting up Traffic Manager, and provisioning an Azure Database for MySQL instance.
2. After preparing the Azure infrastructure, they migrate the database by using Azure Database Migration Service.
3. After the database is running in Azure, they upload a GitHub private repository for Azure App Service with
continuous delivery, and load it with the osTicket application.
4. In the Azure portal, they load the application from GitHub to the Docker container by running Azure App
Service.
5. They tweak DNS settings and configure autoscaling for the application.
Azure services
SERVICE | DESCRIPTION | COST
Azure App Service | The service runs and scales applications by using Azure platform as a service (PaaS) for websites. | Pricing is based on the size of the instances and the features required. Learn more.
Azure Traffic Manager | A load balancer that uses Domain Name System (DNS) to direct users to Azure or to external websites and services. | Pricing is based on the number of received DNS queries and the number of monitored endpoints. Learn more.
Azure Database for MySQL | The database is based on the open-source MySQL database engine. It provides a fully managed, enterprise-ready community MySQL database for application development and deployment. | Pricing is based on compute, storage, and backup requirements. Learn more.
Prerequisites
To run this scenario, Contoso must meet the following prerequisites:
Scenario steps
Here's the Contoso plan for completing the migration:
Step 1: Provision Azure App Service. Contoso admins will provision web apps in the primary and secondary regions.
Step 2: Set up Traffic Manager. They set up Traffic Manager in front of the web apps, for routing and load balancing traffic.
Step 3: Provision Azure Database for MySQL. In Azure, they provision an instance of Azure Database for MySQL.
Step 4: Migrate the database. They migrate the database by using Azure Database Migration Service.
Step 5: Set up GitHub. They set up a local GitHub repository for the application websites and code.
Step 6: Configure the web apps. They configure the web apps with the osTicket websites.
4. They select a Linux OS with PHP 7.0 runtime stack, which is a Docker container.
5. They create a second web app, osticket-cus , and an Azure App Service plan for Central US .
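Creating these Linux web apps could look roughly like the Azure CLI sketch below. The plan names and SKU are assumptions, and the exact runtime string format ("PHP|7.0") varies by CLI version, so treat this as illustrative rather than Contoso's actual commands.
# Primary region web app on a Linux App Service plan with the PHP 7.0 runtime.
az appservice plan create --name osticket-plan-eus2 --resource-group ContosoRG \
  --location eastus2 --is-linux --sku P1V2
az webapp create --name osticket-eus2 --resource-group ContosoRG \
  --plan osticket-plan-eus2 --runtime "PHP|7.0"
# Repeat for the secondary region (Central US).
az appservice plan create --name osticket-plan-cus --resource-group ContosoRG \
  --location centralus --is-linux --sku P1V2
az webapp create --name osticket-cus --resource-group ContosoRG \
  --plan osticket-plan-cus --runtime "PHP|7.0"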
Need more help?
Learn about Azure App Service web apps.
Learn about Azure App Service on Linux.
3. After they add the endpoints, the admins can monitor them.
Need more help?
Learn about Traffic Manager.
Learn about routing traffic to a priority endpoint.
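For reference, a priority-routed Traffic Manager profile with the two web app endpoints might be created with the Azure CLI as sketched below. The profile name and DNS prefix are hypothetical; only the resource groups and web app names come from this scenario.
# Priority routing: East US 2 is primary (priority 1), Central US is the failover.
az network traffic-manager profile create --name osticket-tm \
  --resource-group ContosoInfraRG --routing-method Priority \
  --unique-dns-name contoso-osticket --ttl 30 --protocol HTTPS --port 443 --path "/"
az network traffic-manager endpoint create --name osticket-eus2 \
  --profile-name osticket-tm --resource-group ContosoInfraRG \
  --type azureEndpoints --priority 1 \
  --target-resource-id $(az webapp show -g ContosoRG -n osticket-eus2 --query id -o tsv)
az network traffic-manager endpoint create --name osticket-cus \
  --profile-name osticket-tm --resource-group ContosoInfraRG \
  --type azureEndpoints --priority 2 \
  --target-resource-id $(az webapp show -g ContosoRG -n osticket-cus --query id -o tsv)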
2. They add the name contosoosticket for the Azure database. They add the database to the production
resource group ContosoRG and then specify credentials for it.
3. The on-premises MySQL database is version 5.7, so they select this version for compatibility. They use the
default sizes, which match their database requirements.
4. For Backup Redundancy Options , they select Geo-Redundant . This option allows them to restore the
database in their secondary region (Central US) if an outage occurs. They can configure this option only
when they provision the database.
5. They set up connection security. In the database, they select Connection security and then set up firewall
rules to allow the database to access Azure services.
6. They add the local workstation client IP address to the start and end IP addresses. This allows the web apps
to access the MySQL database, along with the database client that's performing the migration.
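The same provisioning could be expressed with the Azure CLI (single server deployment model). A sketch follows; the admin credentials, SKU, and client IP address are placeholders, not values from this scenario.
# MySQL 5.7 server with geo-redundant backup, plus a firewall rule for the
# workstation that runs the migration (placeholder IP).
az mysql server create --name contosoosticket --resource-group ContosoRG \
  --location eastus2 --admin-user osadmin --admin-password '<password>' \
  --sku-name GP_Gen5_2 --version 5.7 --geo-redundant-backup Enabled
az mysql server firewall-rule create --server contosoosticket --resource-group ContosoRG \
  --name ClientWorkstation --start-ip-address 203.0.113.10 --end-ip-address 203.0.113.10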
Step 4: Migrate the database
There are several ways to move the MySQL database. Each option requires Contoso admins to create an Azure
Database for MySQL instance for the target. After they create the instance, they can migrate the database by using
either of two paths:
Step 4a: Azure Database Migration Service
Step 4b: MySQL Workbench backup and restore
Step 4a: Migrate the database via Azure Database Migration Service
Contoso admins migrate the database via Azure Database Migration Service by following the step-by-step
migration tutorial. They can perform online, offline, and hybrid (preview) migrations by using MySQL 5.6 or 5.7.
NOTE
MySQL 8.0 is supported in Azure Database for MySQL, but the Database Migration Service tool does not yet support this
version.
c. Select a target.
6. Now, they can import (restore) the database in the Azure Database for MySQL instance from the self-
contained file. A new schema, osticket , is created for the instance.
7. After they've restored the data, the admins can query it by using MySQL Workbench. The data is displayed
in the Azure portal.
8. The admins update the database information on the web apps. On the MySQL instance, they open
Connection Strings .
9. In the connection strings list, they select the web app settings and then copy them by selecting Click to
copy .
10. They open a new file in Notepad, paste the string into it, and update the string to match the osTicket
database, MySQL instance, and credentials settings.
11. They can verify the server name and login on the Overview pane in the MySQL instance in the Azure portal.
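The backup-and-restore path that MySQL Workbench performs can also be run from the command line. This is a sketch under the assumptions of this scenario (server name contosoosticket, schema osticket, placeholder host and credentials), not the exact commands Contoso used.
# Export the on-premises database to a self-contained file, then restore it
# into the Azure Database for MySQL instance.
mysqldump --host=OSTICKETMYSQL --user=root --password \
  --single-transaction --databases osticket > osticket.sql
mysql --host=contosoosticket.mysql.database.azure.com \
  --user=osadmin@contosoosticket --password < osticket.sql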
2. After they fork the repo, they go to the include folder and then look for and select the ost-config.php file.
3. The file opens in the browser, and they edit it.
4. In the editor, the admins update the database details, specifically for DBHOST and DBUSER .
6. For each web app (osticket-eus2 and osticket-cus), in the Azure portal, they select Application settings on
the left pane and then modify the settings.
7. They enter the connection string with the name osticket , and copy the string from Notepad into the value
area . They select MySQL in the dropdown list next to the string, and save the settings.
4. After the configuration is updated and the osTicket web app is loaded from GitHub to the Docker container
that runs the Azure App Service, the site shows as Active.
5. They repeat the preceding steps for the secondary web app, osticket-cus.
6. After the site is configured, it's accessible via the Traffic Manager profile. The DNS name is the new location
of the osTicket application. Learn more.
7. Contoso wants to use a DNS name that's easy to remember. On the New Resource Record pane, they
create an alias, CNAME , and a full qualified domain name, osticket.contoso.com , which points to the
Traffic Manager name in the DNS on their domain controllers.
8. They configure both the osticket-eus2 and osticket-cus web apps to allow the custom host names.
Set up autoscaling
Finally, the Contoso admins set up automatic scaling for the application. Automatic scaling ensures that, as agents
use the application, the application instances increase and decrease according to business needs.
1. In App Service APP-SVP-EUS2 , they open Scale Unit .
2. They configure a new autoscale setting with a single rule that increases the instance count by one when the
CPU usage for the current instance is above 70 percent for 10 minutes.
3. They configure the same setting on APP-SVP-CUS to ensure that the same behavior applies if the
application fails over to the secondary region. The only difference is that they set the default instance to 1,
because this is for failovers only.
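The same autoscale rule can be defined with the Azure CLI. The sketch below is illustrative; the setting name and the maximum instance count are assumptions.
# Scale out by one instance when average CPU exceeds 70 percent for 10 minutes.
az monitor autoscale create --resource-group ContosoRG --name osticket-autoscale-eus2 \
  --resource APP-SVP-EUS2 --resource-type Microsoft.Web/serverfarms \
  --min-count 1 --max-count 5 --count 2
az monitor autoscale rule create --resource-group ContosoRG \
  --autoscale-name osticket-autoscale-eus2 \
  --condition "CpuPercentage > 70 avg 10m" --scale out 1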
This article demonstrates how the fictional company Contoso rebuilds a two-tier Windows .NET application that's
running on VMware virtual machines (VMs) as part of a migration to Azure. Contoso migrates the front-end VM to
an Azure App Service web app. Contoso builds the application back end by using microservices that are deployed
to containers managed by Azure Kubernetes Service (AKS). The site interacts with Azure Functions to provide pet
photo functionality.
The SmartHotel360 application used in this example is provided under an open-source license. If you want to use it
for your own testing purposes, you can download it from GitHub.
Business drivers
The Contoso IT leadership team has worked closely with business partners to understand what they want to
achieve with this migration:
Address business growth. Contoso is growing and wants to provide differentiated experiences for customers
on Contoso websites.
Be agile. Contoso must be able to react faster than the changes in the marketplace to enable their success in a
global economy.
Scale. As the business grows successfully, the Contoso IT team must provide systems that can grow at the same
pace.
Reduce costs. Contoso wants to minimize licensing costs.
Migration goals
The Contoso cloud team has pinned down application requirements for this migration. These requirements were
used to determine the best migration method:
The application in Azure must remain as critical as it is today on-premises. It should perform well and scale
easily.
The application shouldn't use infrastructure as a service (IaaS) components. Everything should be built to use
platform as a service (PaaS) or serverless services.
Application builds should run in cloud services, and containers should reside in a private, enterprise-wide
registry in the cloud.
The API service that's used for pet photos should be accurate and reliable in the real world, because decisions
made by the application must be honored in their hotels. Any pet granted access is allowed to stay at the hotels.
To meet requirements for a DevOps pipeline, Contoso will use a Git repository in Azure Repos for source code
management. Automated builds and releases will be used to build code and deploy to Azure App Service, Azure
Functions, and AKS.
Separate continuous integration/continuous development (CI/CD) pipelines are needed for microservices on
the back end and for the website on the front end.
The back-end services and the front-end web app have different release cycles. To meet this requirement,
Contoso will deploy two different pipelines.
Contoso needs management approval for all front-end website deployment, and the CI/CD pipeline must
provide this.
Solution design
After pinning down their goals and requirements, Contoso designs and reviews a deployment solution, and
identifies the migration process, including the Azure services that will be used for the migration.
Current application
The SmartHotel360 on-premises application is tiered across two VMs ( WEBVM and SQLVM ).
The VMs are located on VMware ESXi host contosohost1.contoso.com (version 6.5).
The VMware environment is managed by vCenter Server 6.5 ( vcenter.contoso.com ), running on a VM.
Contoso has an on-premises datacenter ( contoso-datacenter ), with an on-premises domain controller (
contosodc1 ).
The on-premises VMs in the Contoso datacenter will be decommissioned after the migration is done.
Proposed architecture
The front end of the application is deployed as an Azure App Service web app in the primary Azure region.
An Azure function provides uploads of pet photos, and the site interacts with this functionality.
The pet photo function uses the Computer Vision API of Azure Cognitive Services along with Azure Cosmos
DB.
The back end of the site is built by using microservices. These microservices will be deployed to containers
that are managed in AKS.
Containers will be built using Azure DevOps and then pushed to Azure Container Registry.
For now, Contoso will manually deploy the web app and function code by using Visual Studio.
Contoso will deploy microservices by using a PowerShell script that calls Kubernetes command-line tools.
Migration process
1. Contoso provisions Azure Container Registry, AKS, and Azure Cosmos DB.
2. Contoso provisions the infrastructure for the deployment, including the Azure App Service web app, storage
account, function, and API.
3. After the infrastructure is in place, Contoso builds their microservices container images by using Azure
DevOps, which pushes the images to the container registry.
4. Contoso deploys these microservices to AKS by using a PowerShell script.
5. Finally, Contoso deploys the function and web app.
Figure 2: The migration process.
Azure services
SERVICE | DESCRIPTION | COST
AKS | Simplifies Kubernetes management, deployment, and operations. Provides a fully managed Kubernetes container orchestration service. | AKS is a free service. Pay for only the VMs and the associated storage and networking resources that are consumed. Learn more.
Azure Functions | Accelerates development with an event-driven, serverless compute experience. Scale on demand. | Pay only for consumed resources. The plan is billed based on per-second resource consumption and executions. Learn more.
Azure Container Registry | Stores images for all types of container deployments. | Cost is based on features, storage, and usage duration. Learn more.
Azure App Service | Quickly build, deploy, and scale enterprise-grade web, mobile, and API apps that run on any platform. | App Service plans are billed on a per-second basis. Learn more.
Prerequisites
Here's what Contoso needs for this scenario:
Scenario steps
Here's how Contoso will run the migration:
Step 1: Provision AKS and Azure Container Registry. Contoso provisions the managed AKS cluster and the container registry by using PowerShell (an Azure CLI sketch follows this list).
Step 2: Build Docker containers. Contoso sets up continuous integration (CI) for Docker containers by using Azure DevOps and pushes the containers to the container registry.
Step 3: Deploy back-end microservices. Contoso deploys the rest of the infrastructure that will be used by back-end microservices.
Step 4: Deploy front-end infrastructure. Contoso deploys the front-end infrastructure, including Blob storage for the pet photos, Azure Cosmos DB, and the Computer Vision API.
Step 5: Migrate the back end. Contoso deploys microservices and runs them on AKS to migrate the back end.
Step 6: Publish the front end. Contoso publishes the SmartHotel360 application to Azure App Service along with the function app to be called by the pet service.
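Step 1 could equally be done with the Azure CLI instead of the PowerShell script. This is a minimal sketch: the registry and cluster names are hypothetical, and --attach-acr (which grants the cluster pull access to the registry) requires a reasonably recent CLI version.
# Container registry plus a two-node managed AKS cluster that can pull from it.
az acr create --name contosoacreus2 --resource-group ContosoRG --sku Basic --location eastus2
az aks create --name smarthotel-aks-eus2 --resource-group ContosoRG \
  --node-count 2 --generate-ssh-keys --attach-acr contosoacreus2
az aks get-credentials --name smarthotel-aks-eus2 --resource-group ContosoRG
kubectl get nodes   # verify the connection to the cluster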
2. They run the script to create the managed Kubernetes cluster, using AKS and Azure Container Registry.
Figure 3: Creating the managed Kubernetes cluster.
3. With the file open, they update the $location parameter to eastus2 , and save the file.
10. They verify the connection to the cluster by running the kubectl get nodes command. The node has the
same name as the VM in the automatically created resource group.
12. A browser tab opens to the dashboard. This is a tunneled connection that uses the Azure CLI.
Figure 11: A tunneled connection.
20. After the build finishes and the release is deployed to the slot, Azure DevOps emails the dev lead for
approval.
21. The dev lead selects View approval and can approve or reject the request in the Azure DevOps portal.
Conclusion
In this article, Contoso rebuilds the SmartHotel360 application in Azure. The on-premises application front-end VM
is rebuilt for Azure App Service web apps. The application back end is built by using microservices that are
deployed to containers managed by AKS. Contoso enhanced functionality with a pet photo application.
Suggested skills
Microsoft Learn is a new approach to learning. Readiness for the new skills and responsibilities that come with
cloud adoption doesn't come easily. Microsoft Learn provides a more rewarding approach to hands-on learning
that helps you achieve your goals faster. With Microsoft Learn, you can earn points, rise through levels, and achieve
more.
Here are two examples of tailored learning paths on Microsoft Learn that align with the Contoso SmartHotel360
application in Azure.
Deploy a website to Azure with Azure App Service: By creating web apps in Azure, you can publish
and manage your website easily without having to work with the underlying servers, storage, or network
assets. Instead, you can focus on your website features and rely on the robust Azure platform to help
provide secure access to your site.
Process and classify images with the Azure Cognitive Vision Services: Azure Cognitive Services
offers prebuilt functionality to enable computer vision functionality in your applications. Learn how to use
the Azure Cognitive Vision Services to detect faces, tag and classify images, and identify objects.
Refactor a Team Foundation Server deployment to
Azure DevOps Services
10/30/2020 • 17 minutes to read
This article shows how the fictional company Contoso refactors its on-premises Visual Studio Team Foundation
Server deployment by migrating it to Azure DevOps Services in Azure. The Contoso development team has used
Team Foundation Server for team collaboration and source control for the past five years. Now, the team wants to
move to a cloud-based solution for dev and test work and for source control. Azure DevOps Services will play a
role as the Contoso team moves to an Azure DevOps model and develops new cloud-native applications.
Business drivers
The Contoso IT leadership team has worked closely with business partners to identify future goals. The partners
aren't overly concerned with dev tools and technologies, but the team has captured these points:
Software: Regardless of the core business, all companies are now software companies, including Contoso. Business leadership is interested in how IT can help lead the company with new working practices for users and new experiences for its customers.
Efficiency: Contoso needs to streamline its processes and remove unnecessary procedures for developers and users. Doing so will allow the company to deliver on customer requirements more efficiently. The business needs IT to move quickly, without wasting time or money.
Agility: To enable its success in a global economy, Contoso IT needs to be more responsive to the needs of the business. It must be able to react more quickly to changes in the marketplace. IT must not get in the way or become a business blocker.
Migration goals
The Contoso cloud team has pinned down the following goals for its migration to Azure DevOps Services:
The team needs a tool to migrate its data to the cloud. Few manual processes should be needed.
Work item data and history for the last year must be migrated.
The team doesn't want to set up new user names and passwords. All current system assignments must be
maintained.
The team wants to move away from Team Foundation Version Control (TFVC) to Git for source control.
The transition to Git will be a tip migration that imports only the latest version of the source code. The transition
will happen during a downtime, when all work will be halted as the code base shifts. The team understands that
only the current master branch history will be available after the move.
The team is concerned about the change and wants to test it before it does a full move. The team wants to retain
access to Team Foundation Server even after the move to Azure DevOps Services.
The team has multiple collections and, to better understand the process, it wants to start with one that has only
a few projects.
The team understands that Team Foundation Server collections are a one-to-one relationship with Azure
DevOps Services organizations, so it will have multiple URLs. But this matches its current model of separation
for code bases and projects.
Proposed architecture
Contoso will move its Team Foundation Server projects to the cloud, and it will no longer host its projects or
source control on-premises.
Team Foundation Server will be migrated to Azure DevOps Services.
Currently, Contoso has one Team Foundation Server collection, named ContosoDev , which will be migrated to
an Azure DevOps Services organization called contosodevmigration.visualstudio.com .
The projects, work items, bugs, and iterations from the last year will be migrated to Azure DevOps Services.
Contoso will use its Azure Active Directory (Azure AD) instance, which it set up when it deployed its Azure
infrastructure at the beginning of the migration planning.
Migration process
Contoso will complete the migration process as follows:
1. Significant preparation is required. First, Contoso must upgrade its Team Foundation Server implementation to
a supported level. Contoso is currently running Team Foundation Server 2017 Update 3, but to use database
migration it needs to run a supported 2018 version with the latest updates.
2. After Contoso upgrades, it will run the Team Foundation Server migration tool and validate its collection.
3. Contoso will build a set of preparation files and then perform a migration dry run for testing.
4. Contoso will then run another migration, this time a full migration that includes work items, bugs, sprints, and
code.
5. After the migration, Contoso will move its code from TFVC to Git.
Prerequisites
To run this scenario, Contoso needs to meet the following prerequisites:
On-premises Team Foundation Server instance | The on-premises instance needs to either run Team Foundation Server 2018 upgrade 2 or be upgraded to it as part of this process.
Scenario steps
Here's how Contoso will complete the migration:
Step 1: Create an Azure storage account. This storage account will be used during the migration process.
Step 2: Upgrade Team Foundation Server. Contoso will upgrade its deployment to Team Foundation Server 2018 upgrade 2.
Step 3: Validate the Team Foundation Server collection. Contoso will validate the Team Foundation Server collection in preparation for the migration.
Step 4: Build the migration files. Contoso will create the migration files by using the Team Foundation Server migration tool.
5. The admins verify the Team Foundation Server installation by reviewing projects, work items, and code.
NOTE
Some Team Foundation Server upgrades need to run the Configure Features wizard after the upgrade finishes. Learn more.
2. They run the tool to perform the validation by specifying the URL of the project collection, as shown in the
following command:
TfsMigrator validate /collection:http://contosotfs:8080/tfs/ContosoDev
5. They run TfsMigrator validate /help at the command line, and they see that the /tenantDomainName parameter seems to be required to validate identities.
6. They run the validation command again, including this parameter with their Azure AD tenant name:
TfsMigrator validate /collection:http://contosotfs:8080/tfs/ContosoDev /tenantDomainName:contosomigration.onmicrosoft.com
7. In the Azure AD sign-in window that opens, they enter the credentials of a global admin user.
8. The validation passes and is confirmed by the tool.
3. The preparation is completed, and the tool reports that the import files have been generated successfully.
4. The admins can now see that both the IdentityMapLog.csv file and the import.json file have been created in
a new folder.
5. The import.json file provides import settings. It includes information such as the desired organization
name, and storage account details. Most of the fields are populated automatically. Some fields require user
input. The admins open the file and add the Azure DevOps Services organization name to be created,
contosodevmigration . With this name, the Contoso Azure DevOps Services URL will be
contosodevmigration.visualstudio.com .
NOTE
The organization must be created before the migration begins. It can be changed after the migration is completed.
6. The admins review the identity log map file, which shows the accounts that will be brought into Azure
DevOps Services during the import.
Active identities refer to identities that will become users in Azure DevOps Services after the import.
In Azure DevOps Services, these identities will be licensed and displayed as users in the organization
after migration.
The identities are marked as Active in the Expected Import Status column in the file.
Step 5: Migrate to Azure DevOps Services
With the preparation completed, Contoso admins can focus on the migration. After they run the migration, they'll
switch from using TFVC to Git for version control.
Before they start, the admins schedule downtime with the dev team, so that they can plan to take the collection
offline for migration.
Here is the migration process they'll follow:
1. Detach the collection . Identity data for the collection resides in the configuration database for the Team
Foundation Server instance while the collection is attached and online.
When a collection is detached from the Team Foundation Server instance, a copy of that identity data is
made and then packaged with the collection for transport. Without this data, the identity portion of the
import can't be executed.
We recommend that the collection stay detached until the import has been completed, because changes
that occur during the import can't be imported.
2. Generate a backup . The next step is to generate a backup that can be imported into Azure DevOps
Services. The data-tier application component package (DACPAC) is a SQL Server feature that allows
database changes to be packaged into a single file and then deployed to other instances of SQL.
The backup can also be restored directly to Azure DevOps Services, and it's used as the packaging method
for getting collection data to the cloud. Contoso will use the sqlpackage.exe tool to generate the DACPAC.
This tool is included in SQL Server Data Tools.
3. Upload to storage . After the DACPAC is created, the admins upload it to Azure Storage. After they've
uploaded it, they get a shared access signature (SAS) to allow the Team Foundation Server migration tool
access to the storage.
4. Fill out the import. Contoso can then complete the missing fields in the import file, including the DACPAC
setting. To ensure that everything's working properly before the full migration, the admins will specify that
they want to perform a dry-run import.
5. Perform a dry-run import. A dry-run import helps them test the collection migration. Dry runs have a
limited life, so they're deleted before a production migration runs. They're deleted automatically after a set
duration. A note that informs Contoso when the dry run will be deleted is included in the success email
that's sent after the import finishes. The team takes note and plans accordingly.
6. Complete the production migration . With the dry-run migration completed, Contoso admins do the
final migration by updating the import.json file and then running import again.
Detach the collection
Before they detach the collection, Contoso admins take a local SQL Server instance backup and a VMware snapshot
of the Team Foundation Server instance.
1. In the Team Foundation Server Administration Console, the admins select the collection they want to detach,
ContosoDev .
2. They select the General tab and then select Detach Collection .
3. In the Detach Team Project Collection wizard, on the Servicing Message pane, the admins provide a
message for users who might try to connect to projects in the collection.
4. On the Detach Progress pane, they monitor progress. When the process finishes, they select Next .
5. On the Readiness Checks pane, when the checks finish, they select Detach .
6. When the collection has been successfully detached, they select Close to finish up.
The collection is no longer referenced in the Team Foundation Server Administration Console.
Generate a DACPAC
Contoso admins create a backup, or DACPAC, to import into Azure DevOps Services.
The admins use the sqlpackage.exe utility in SQL Server Data Tools (SSDT) to create the DACPAC. There are
multiple versions of sqlpackage.exe installed with SQL Server Data Tools, and they're located under folders
with names like 120, 130, and 140. It's important to use the right version to prepare the DACPAC.
Team Foundation Server 2018 imports need to use sqlpackage.exe from the 140 folder or higher. For
CONTOSOTFS, this file is located in:
C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\Common7\IDE\Extensions\Microsoft\SQLDB\DAC\140
Contoso admins generate the DACPAC as follows:
1. They open a command prompt and go to the sqlpackage.exe location. To generate the DACPAC, they run
the following command:
SqlPackage.exe /sourceconnectionstring:"Data Source=SQLSERVERNAME\INSTANCENAME;Initial
Catalog=Tfs_ContosoDev;Integrated Security=True" /targetFile:C:\TFSMigrator\Tfs_ContosoDev.dacpac
/action:extract /p:ExtractAllTableData=true /p:IgnoreUserLoginMappings=true /p:IgnorePermissions=true
/p:Storage=Memory
2. In Storage Explorer, the admins connect to their subscription and then search for and select the storage
account they created for the migration (contosodevmigration). They create a new blob container,
azuredevopsmigration.
3. On the Upload files pane, in the Blob type drop-down list, the admins specify Block Blob for the DACPAC
file upload.
4. After they upload the file, they select the file name and then select Generate SAS. They expand the Blob
Containers list under the storage account, select the container with the import files, and then select Get
Shared Access Signature.
5. On the Shared Access Signature pane, they accept the default settings and then select Create. This
enables access for 24 hours.
6. They copy the shared access signature URL, so that it can be used by the Team Foundation Server migration
tool.
NOTE
The migration must happen within the allowed time window or the permissions will expire. Do not generate an SAS key from
the Azure portal. Keys that are generated from the portal are account-scoped and won't work with the import.
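The upload and SAS generation can also be scripted instead of using Storage Explorer. The following PowerShell sketch uses the Az.Storage module; the resource group name is an assumption for illustration, and the other names match the values above.

# Sketch only: the resource group name is assumed; account, container, and file names match the steps above.
$account = Get-AzStorageAccount -ResourceGroupName "ContosoDevMigrationRG" -Name "contosodevmigration"
$ctx = $account.Context

# Create the container and upload the DACPAC as a block blob.
New-AzStorageContainer -Name "azuredevopsmigration" -Context $ctx
Set-AzStorageBlobContent -File "C:\TFSMigrator\Tfs_ContosoDev.dacpac" -Container "azuredevopsmigration" `
    -Blob "Tfs_ContosoDev.dacpac" -BlobType Block -Context $ctx

# Generate a container-scoped SAS (not an account-scoped key from the portal) valid for seven days.
New-AzStorageContainerSASToken -Name "azuredevopsmigration" -Permission rwdl `
    -ExpiryTime (Get-Date).AddDays(7) -Context $ctx -FullUri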
When the admins run the Team Foundation Server migration tool to validate the import file, the validation returns an error saying that the SAS key needs a longer period before it expires.
3. They use Azure Storage Explorer to create a new SAS key with the period before expiration set to seven days.
4. They update the import.json file and rerun the command. This time, the validation is completed
successfully.
TfsMigrator import /importFile:C:\TFSMigrator\import.json /validateonly
A message is displayed asking them to confirm that they want to continue with the migration. Note the
seven-day period after the dry run during which the staged data will be maintained.
6. The Azure AD sign-in window opens. Contoso admins sign in to Azure AD with admin permissions.
A message is displayed confirming that the import has been started successfully.
7. After about 15 minutes, the admins go to the website and check the progress of the import.
8. After the migration finishes, a Contoso dev lead signs in to Azure DevOps Services to ensure that the dry
run worked properly. After authentication, Azure DevOps Services needs a few details to confirm the
organization.
The dev lead can see that the projects have been migrated successfully. A notice near the top of the page
warns that the dry run account will be deleted in 15 days.
9. The dev lead opens one of the projects and then selects Work Items > Assigned to me. This page verifies
that the work item data has been migrated successfully, along with the identity.
10. To confirm that the source code and history have been migrated, the dev lead checks other projects and
code.
5. After about 15 minutes, the admins go to the website and check the progress of the migration.
6. After the migration finishes, a dev lead signs into Azure DevOps Services to ensure that the migration
worked properly. After signing in, the dev lead can see that projects have been migrated.
7. The dev lead opens one of the projects and selects Work Items > Assigned to me. This shows that the
work item data has been migrated, along with the identity.
8. The dev lead checks to confirm that other work item data has been migrated.
9. To confirm that the source code and history have been migrated, the dev lead checks other projects and
code.
Move source control from TFVC to Git
With the migration now completed, Contoso admins want to move source code management from TFVC to Git.
The admins need to import the source code that's currently in their Azure DevOps Services organization as Git
repos in the same organization.
1. In the Azure DevOps Services portal, they open one of the TFVC repos, $/PolicyConnect, and review it.
NOTE
Because TFVC and Git store version control information differently, we recommend that Contoso not migrate its
repository history. This is the approach that Microsoft took when we migrated Windows and other products from
centralized version control to Git.
4. After the import finishes, the admins review the code.
6. After the dev lead reviews the source, they agree that the migration to Azure DevOps Services is done. Azure
DevOps Services now becomes the source for all development within the teams involved in the migration.
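If a team prefers a manual, tip-only move instead of the portal's Import repository feature, the equivalent steps look roughly like the following sketch. The workspace path, organization, and repository URL are placeholders, and history is intentionally left behind.

# Manual tip-only move (placeholders; not the portal Import repository feature used above).
cd C:\Workspaces\PolicyConnect                 # local folder containing the latest TFVC version
git init
git add .
git commit -m 'Initial commit: tip of $/PolicyConnect (history intentionally not migrated)'
git branch -M main
git remote add origin https://dev.azure.com/<organization>/<project>/_git/PolicyConnect
git push -u origin main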
Need more help?
For more information, see Import repositories from TFVC to Git.
Post-migration training
The Contoso team will need to provide Azure DevOps Services and Git training for relevant team members.
Move on-premises VMware infrastructure to Azure
10/30/2020 • 10 minutes to read • Edit Online
When fictional company Contoso migrates its VMware virtual machines (VMs) from an on-premises datacenter to
Azure, two options are available to the team. This article focuses on Azure VMware Solution, which Contoso has
determined to be the better migration option.
Migration option | Outcome
Azure VMware Solution | Use VMware Hybrid Cloud Extension (HCX) or vMotion to move on-premises VMs. Run
native VMware workloads on Azure bare-metal hardware. Manage VMs by using vSphere.
In this article, Contoso uses Azure VMware Solution to create a private cloud in Azure with native access to
VMware vCenter and other tools that are supported by VMware for workload migration. Contoso can confidently
use Azure VMware Solution, knowing that it's a first-party Microsoft offering backed by VMware.
Business drivers
Working closely with business partners, the Contoso IT team defines the business drivers for a VMware migration
to Azure. These drivers can include:
Datacenter evacuation or shutdown: Seamlessly move VMware-based workloads as the company consolidates or
retires existing datacenters.
Disaster recovery and business continuity: Use a VMware stack deployed in Azure as a primary or
secondary on-demand disaster recovery site for on-premises datacenter infrastructure.
Application modernization: Tap into the Azure ecosystem to modernize Contoso applications without having
to rebuild VMware-based environments.
Implementing DevOps: Bring Azure DevOps toolchains to VMware environments and modernize applications at
Contoso's own pace.
Ensure operational continuity: Redeploy vSphere-based applications to Azure while avoiding hypervisor
conversions and application refactoring. Extend support for legacy applications that run Windows and SQL
Server.
NOTE
For information about pricing, see Azure VMware Solution pricing.
Migration process
Contoso will move its VMs to Azure VMware Solution by using the VMware HCX tool. The VMs will run in an Azure
VMware Solution private cloud. VMware HCX migration methods include bulk and cold migration. VMware
vMotion and Replication-assisted vMotion (RAV) are used for workloads that require a live migration.
To complete the process, the Contoso team:
Plans its networking in Azure and ExpressRoute.
Creates the Azure VMware Solution private cloud by using the Azure portal.
Configures the network to include the ExpressRoute circuits.
Configures the HCX components to connect its on-premises vSphere environment to the Azure VMware
Solution private cloud.
Replicates the VMs and then moves them to Azure by using VMware HCX.
Scenario steps
Step 1: Network planning
Step 2: Create an Azure VMware Solution private cloud
Step 3: Configure networking
Step 4: Migrate VMs using HCX
Step 1: Network planning
Contoso needs to plan out its networking to include Azure Virtual Network and connectivity between on-premises
and Azure. The company needs to provide a high-speed connection between its on-premises and Azure-based
environments, along with a connection to the Azure VMware Solution private cloud.
This connectivity is delivered through Azure ExpressRoute and will require some specific network address ranges
and firewall ports for enabling the services. This high-bandwidth, low-latency connection allows Contoso to access
services that run in its Azure subscription from the Azure VMware Solution private cloud environment.
Contoso will need to plan an IP address scheme that includes non-overlapping address space for its virtual
networks. The company will need to include a gateway subnet for the ExpressRoute gateway.
The Azure VMware Solution private cloud is connected to Contoso's Azure virtual network by using another Azure
ExpressRoute connection. ExpressRoute Global Reach will be enabled to allow direct connection from on-premises
VMs to VMs running on the Azure VMware Solution private cloud. The ExpressRoute Premium SKU is required to
enable Global Reach.
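Enabling Global Reach can be scripted from the on-premises circuit once the private cloud's circuit details are known. A minimal PowerShell sketch, in which the circuit name, resource group, /29 prefix, peering ID, and authorization key are placeholders:

# Placeholders throughout; the peering ID and key come from the Azure VMware Solution private cloud.
$onPremCircuit = Get-AzExpressRouteCircuit -Name "contoso-onprem-er" -ResourceGroupName "ContosoNetworkRG"
Add-AzExpressRouteCircuitConnectionConfig -Name "onprem-to-avs" -ExpressRouteCircuit $onPremCircuit `
    -PeerExpressRouteCircuitPeering "<AVS ExpressRoute circuit private peering resource ID>" `
    -AddressPrefix "192.168.255.0/29" -AuthorizationKey "<authorization key from the private cloud>"
Set-AzExpressRouteCircuit -ExpressRouteCircuit $onPremCircuit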
Azure VMware Solution private clouds require, at minimum, a /22 CIDR network address block for subnets. To
connect to on-premises environments and virtual networks, this must be a non-overlapping network address
block.
NOTE
To learn about network planning for Azure VMware Solution, see Networking checklist for Azure VMware Solution.
Step 2: Create an Azure VMware Solution private cloud
2. In the Azure portal, the team creates the Azure VMware Solution private cloud by providing the networking
information from the plan. The team then selects Review + create. This step takes about two hours.
3. The team verifies that the Azure VMware Solution private cloud deployment is complete by going to the
resource group and selecting the private cloud resource. The status is displayed as Succeeded.
Step 3: Configure networking
An Azure VMware Solution private cloud requires a virtual network. Because Azure VMware Solution doesn't
support an on-premises vCenter during preview, Contoso requires additional steps for integration with its on-
premises environment. By setting up an ExpressRoute circuit and a virtual network gateway, the team connects its
virtual networks to the Azure VMware Solution private cloud.
For more information, see Configure networking for your VMware private cloud in Azure.
1. The Contoso team first creates a virtual network with a gateway subnet.
IMPORTANT
The team must use an address space that does not overlap with the address space that it used when it created the
private cloud.
2. The team creates the ExpressRoute VPN gateway, making sure to select the correct SKU, and then selects
Review + create.
3. The team gets the authorization key to connect ExpressRoute to the virtual network. The key is found on the
connectivity screen of the Azure VMware Solution private cloud resource in the Azure portal.
4. The team connects the ExpressRoute to the VPN gateway that connects the Azure VMware Solution private
cloud to the Contoso virtual network. It does this by creating a connection in Azure.
For more information, see Learn how to access an Azure VMware Solution private cloud.
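The portal steps above can also be scripted. The following PowerShell sketch assumes illustrative names, region, and address ranges, and placeholder values for the circuit ID and authorization key taken from the private cloud's connectivity page:

# Assumed names, region, and address ranges for illustration only.
$rg = "ContosoAVSRG"
$location = "eastus"

# Virtual network with a gateway subnet; the address space must not overlap the /22 block
# used by the Azure VMware Solution private cloud.
$gwSubnet = New-AzVirtualNetworkSubnetConfig -Name "GatewaySubnet" -AddressPrefix "10.10.255.0/27"
$vnet = New-AzVirtualNetwork -Name "contoso-avs-vnet" -ResourceGroupName $rg -Location $location `
    -AddressPrefix "10.10.0.0/16" -Subnet $gwSubnet

# ExpressRoute virtual network gateway in that subnet.
$pip = New-AzPublicIpAddress -Name "avs-ergw-pip" -ResourceGroupName $rg -Location $location `
    -AllocationMethod Dynamic
$ipConfig = New-AzVirtualNetworkGatewayIpConfig -Name "ergw-ipconfig" `
    -SubnetId ($vnet.Subnets | Where-Object Name -eq "GatewaySubnet").Id -PublicIpAddressId $pip.Id
$gateway = New-AzVirtualNetworkGateway -Name "avs-ergw" -ResourceGroupName $rg -Location $location `
    -GatewayType ExpressRoute -GatewaySku Standard -IpConfigurations $ipConfig

# Connection that redeems the private cloud's authorization key against its ExpressRoute circuit.
New-AzVirtualNetworkGatewayConnection -Name "avs-connection" -ResourceGroupName $rg -Location $location `
    -VirtualNetworkGateway1 $gateway -ConnectionType ExpressRoute `
    -PeerId "<ExpressRoute circuit resource ID from the private cloud>" `
    -AuthorizationKey "<authorization key from the private cloud>"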
Step 4: Migrate by using VMware HCX
To move VMware VMs to Azure using HCX, the Contoso team will need to follow these high-level steps:
Install and configure VMware HCX.
Perform migrations to Azure by using HCX.
For more information, see Install HCX for Azure VMware Solution.
Install and configure VMware HCX for the public cloud
VMware HCX is a VMware product that's part of the Azure VMware Solution default installation. HCX Advanced is
installed by default, but it can be upgraded to HCX Enterprise as additional features and functionality are required.
Azure VMware Solution automates deployment of the HCX Cloud Manager component within the private cloud. It
provides the customer with activation keys and a download link for the HCX connector appliance, which must be
configured on the on-premises side in the customer's vCenter domain. These elements are then paired with the
Azure VMware Solution cloud appliance, so that customers can take advantage of services such as migration and
L2 stretch.
The Contoso team deploys HCX by using an OVF package that's provided by VMware.
To install and configure HCX for your Azure VMware Solution private cloud, see Install HCX for Azure
VMware Solution.
As the team is configuring HCX, it has chosen to enable migration and other options, including disaster
recovery.
For more information, see HCX installation workflow for HCX public clouds.
Migrate VMs to Azure by using HCX
When both the on-premises datacenter (source) and the Azure VMware Solution private cloud (destination) are
configured with the VMware cloud and HCX, Contoso can begin migrating its VMs. The team can move VMs to and
from VMware HCX-enabled datacenters by using multiple migration technologies.
Contoso's HCX application is online, and the status is green. The team is now ready to migrate and protect
Azure VMware Solution VMs by using HCX.
VMware HCX bulk migration
This migration method uses the VMware vSphere replication protocols to move multiple VMs simultaneously to a
destination site. Benefits include:
This method is designed to move multiple VMs in parallel.
The migration can be set to finish on a predefined schedule.
The VMs run at the source site until failover begins. The service interruption is equivalent to a reboot.
VMware HCX vMotion live migration
This migration method uses the VMware vMotion protocol to move a single VM to a remote site. Benefits include:
This method is designed to move one VM at a time.
There's no service interruption when the VM state is migrated.
VMware HCX cold migration
This migration method uses the VMware NFC (Network File Copy) protocol. The option is automatically selected
when the source VM is powered off.
VMware HCX Replication-assisted vMotion
VMware HCX RAV combines the benefits of VMware HCX bulk migration, which include parallel operations,
resiliency, and scheduling, with the benefits of VMware HCX vMotion migration, which include zero downtime
during VM state migration.
Additional resources
For additional VMware technical documentation, see:
VMware HCX documentation
Migrate virtual machines by using VMware HCX
Move on-premises Remote Desktop Services to
Azure Windows Virtual Desktop scenario
10/30/2020 • 11 minutes to read • Edit Online
Windows Virtual Desktop is a comprehensive desktop and application virtualization service running in the cloud.
It's the only virtual desktop infrastructure (VDI) that delivers simplified management, Windows 10 Enterprise multi-
session optimizations for Microsoft 365 Apps for enterprise, and support for Remote Desktop Services (RDS)
environments. Deploy and scale Windows desktops and applications on Azure in minutes, and get built-in security
and compliance features.
NOTE
This article focuses on using Windows Virtual Desktop in Azure to move an on-premises RDS environment to Azure.
Business drivers
Working closely with business partners, the Contoso IT team will define the business drivers for a VDI migration to
Azure. These drivers might include:
Current environment end-of-life: A datacenter might be running out of capacity, reaching the end of a lease, or
closing down. Migrating to the cloud provides virtually unlimited capacity. Current software might also be
reaching its end of life, making it necessary to upgrade the software that runs Contoso's current VDI
solution.
Multi-session Windows 10 VDI: Provide Contoso users with the only multi-session Windows 10 desktop
virtualized in the cloud that's highly scalable, up to date, and available on any device.
Optimize for Microsoft 365 Apps for enterprise: Deliver the best Microsoft 365 Apps for enterprise
experience, with multi-session virtual desktop scenarios providing the most productive virtualized experience
for Contoso's users.
Deploy and scale in minutes: Quickly virtualize and deploy modern and legacy desktop applications to the
cloud in minutes with unified management in the Azure portal.
Secure and productive on Azure and Microsoft 365: Deploy a complete, intelligent solution that enhances
creativity and collaboration for everyone. Shift to Microsoft 365 and get Office 365, Windows 10, and Enterprise
Mobility + Security.
Solutions design
After pinning down goals and requirements, Contoso designs and reviews a deployment solution and identifies the
migration process.
Current architecture
RDS is deployed to an on-premises datacenter. Microsoft 365 is licensed and in use by the organization.
Proposed architecture
Sync Active Directory or Azure Active Directory Domain Services.
Deploy Windows Virtual Desktop to Azure.
Migrate on-premises RDS servers to Azure.
Convert user profile disks (UPDs) to FSLogix profile containers.
Solution review
Contoso evaluates the proposed design by putting together a list of pros and cons.
Migration process
Contoso will move VMs to Windows Virtual Desktop in Azure by using the Lakeside assessment tool and Azure
Migrate. Contoso will need to:
Run the assessment tool against its on-premises RDS infrastructure to establish the scale of the Windows
Virtual Desktop deployment in Azure.
Migrate to Windows Virtual Desktop via either Windows 10 Enterprise multi-session or persistent virtual
machines.
Optimize the Windows Virtual Desktop multi-session by scaling up and down as needed to manage costs.
Virtualize applications and assign users as needed to continue to secure and manage the Windows Virtual
Desktop environment.
Scenario steps
1. Assess the current RDS environment.
2. Create the VDI and new images in Azure and migrate and persist VMs to Azure.
3. Convert UPDs to FSLogix profile containers.
4. Replicate any persistent VMs to Azure.
1. Make sure that domain services, either Active Directory or Azure Active Directory Domain Services, are
synchronized with Azure Active Directory (Azure AD). Ensure that the domain service is accessible from the
Azure subscription and virtual network where Windows Virtual Desktop will be deployed.
NOTE
Learn more about Azure AD Connect for synchronizing Active Directory on-premises with Azure AD.
NOTE
Learn about provisioning Azure Active Directory Domain Services and synchronizing Azure AD to it.
IMPORTANT
This location isn't where the new Windows Virtual Desktop environment will be deployed. Only the data related to
the Azure Migrate project will be stored here.
Figure 6: Adding tools to the migration.
8. Start the assessment of the current environment by selecting Register with Azure Migrate in the
Lakeside tool.
NOTE
Contoso will also need to migrate application servers to Azure to get the company closer to the Windows Virtual Desktop
environment and reduce network latency for its users.
NOTE
Contoso can't create a new virtual network at this step. Before reaching this step, Contoso should have already
created a virtual network that has access to Active Directory.
NOTE
Contoso can't use a user account that requires multi-factor authentication in this step. If Contoso plans to use multi-
factor authentication for its users, it will need to create a service principal for this purpose.
3. Contoso performs one more validation of the Windows Virtual Desktop settings, and creates the new
environment of pooled Windows Virtual Desktop virtual machines.
IMPORTANT
The PowerShell modules for Hyper-V, Active Directory, and Pester are prerequisites to running the cmdlets to convert UPDs
to FSLogix.
A UPD conversion:
Convert-RoamingProfile -ParentPath "C:\Users\" -Target "\\Server\FSLogixProfiles$" -MaxVHDSize 20 -VHDLogicalSectorSize 512
At this point, the migration has enabled using pooled resources with Windows 10 Enterprise multi-session.
Contoso can begin to deploy the necessary applications to the users who will use Windows 10 Enterprise multi-
session.
But now Contoso must migrate the persistent virtual machines to Azure.
6. As the last step before the final migration, Contoso selects the Users item in the Azure Windows Virtual
Desktop settings to map the servers to their respective users and groups.
Conclusion
In this article, Contoso moved its RDS deployment to Windows Virtual Desktop hosted in Azure.
VMware host migration best practices for Azure
10/30/2020 • 2 minutes to read • Edit Online
Migrating VMware hosts to Azure can accelerate the standard methodology outlined in the Cloud Adoption
Framework, and pictured here.
Figure 1
The table of contents on the left outlines best practices across multiple Microsoft web properties. These best
practices can guide the execution of your VMware host migration to Azure VMware Solution. Bookmark this page
to quickly reference the full list of best practices.
Migrate or deploy Windows Virtual Desktop
instances to Azure
10/30/2020 • 2 minutes to read • Edit Online
Migrating an organization's end-user desktops to the cloud is a common scenario in cloud migrations. Doing so
helps improve employee productivity and accelerate the migration of various workloads to support the
organization's user experience. Common motivations include:
Organizations want to extend productivity to PCs, phones, tablets, or browsers that might not be under the
direct control of the IT team.
Employees need to access corporate data and applications from their devices.
As workloads are migrated to the cloud, employees need more support for a low-latency, more optimized
experience.
The costs of current or proposed virtual desktop experiences need to be optimized to help organizations scale
their remote work more effectively.
The IT team wants to transform the workplace, which often starts with transforming employees' user
experience.
Virtualization of your end users' desktops in the cloud can help your team realize each of these outcomes.
Next steps
For guidance on specific elements of the cloud adoption journey, see:
Plan for Windows Virtual Desktop migration or deployment
Review your environment or Azure landing zone(s)
Complete a Windows Virtual Desktop proof-of-concept
Assess for Windows Virtual Desktop migration or deployment
Deploy or migrate Windows Virtual Desktop instances
Release your Windows Virtual Desktop deployment to production
Windows Virtual Desktop planning
10/30/2020 • 2 minutes to read • Edit Online
Windows Virtual Desktop migration and deployment scenarios follow the same Migrate methodology as other migration
efforts. This consistent approach allows migration factories or existing migration teams to adopt the process with
little change to non-technical requirements.
Next steps
For guidance on specific elements of the cloud adoption journey, see:
Review your environment or Azure landing zones
Complete a Windows Virtual Desktop proof of concept
Assess Windows Virtual Desktop migration or deployment
Deploy or migrate Windows Virtual Desktop instances
Release your Windows Virtual Desktop deployment to production
Windows Virtual Desktop Azure landing zone review
10/30/2020 • 2 minutes to read • Edit Online
Before the Contoso cloud adoption team migrates to Windows Virtual Desktop, it will need an Azure landing zone
that's capable of hosting desktops and any supporting workloads. The following checklist can help the team
evaluate the landing zone for compatibility. Guidance in the Ready methodology of this framework can help the
team build a compatible Azure landing zone, if one has not been provided.
Evaluate compatibility
Resource organization plan: The landing zone should include references to the subscription or subscriptions
to be used, guidance on resource group usage, and the tagging and naming standards to be used when the
team deploys resources.
Azure AD: An Azure Active Directory (Azure AD) instance or an Azure AD tenant should be provided for end-
user authentication.
Network: Any required network configuration should be established in the landing zone prior to migration.
VPN or ExpressRoute: Additionally, any landing zone that supports virtual desktops will need a network
connection so that end users can connect to the landing zone and hosted assets. If an existing set of endpoints
is configured for virtual desktops, end users can still be routed through those on-premises devices via a VPN or
Azure ExpressRoute connection. If a connection doesn't already exist, you might want to review the guidance on
configuring network connectivity options in the Ready methodology.
Governance, users, and identity: For consistent enforcement, any requirements to govern access from
virtual desktops and to govern users and their identities should be configured as Azure policies and applied to
the landing zone (a minimal assignment sketch follows this checklist).
Security: The security team has reviewed the landing zone configurations and approved each landing zone for
its intended use, including landing zones for the external connection and landing zones for any mission-critical
applications or sensitive data.
Windows Virtual Desktop: Windows Virtual Desktop platform as a service has been enabled.
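As an example of the governance item above, a minimal PowerShell policy assignment sketch that uses the built-in Allowed locations policy; the subscription ID, resource group, and allowed region are placeholders:

# Placeholders for scope and region; any applicable built-in or custom policy can be assigned the same way.
$definition = Get-AzPolicyDefinition -Builtin |
    Where-Object { $_.Properties.DisplayName -eq "Allowed locations" }
New-AzPolicyAssignment -Name "wvd-allowed-locations" -DisplayName "WVD allowed locations" `
    -Scope "/subscriptions/<subscription-id>/resourceGroups/<wvd-landing-zone-rg>" `
    -PolicyDefinition $definition `
    -PolicyParameterObject @{ listOfAllowedLocations = @("eastus") }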
Any landing zone that the team develops by using the best practices in the Ready methodology and that can meet
the previously mentioned specialized requirements would qualify as a landing zone for this migration.
To understand how to architect Windows Virtual Desktop, review the Windows Virtual Desktop requirements.
Next steps
For guidance on specific elements of the cloud adoption journey, see:
Complete a Windows Virtual Desktop proof of concept
Assess for Windows Virtual Desktop migration or deployment
Deploy or migrate Windows Virtual Desktop instances
Release your Windows Virtual Desktop deployment to production
Windows Virtual Desktop proof of concept
10/30/2020 • 2 minutes to read • Edit Online
Before the Contoso cloud adoption team deploys its end-user desktops, it validates the configuration of the Azure
landing zone and end-user network capacity by completing and testing a proof of concept.
The following approach to the migration process is simplified to outline a proof-of-concept implementation.
1. Assess: the team deploys host pools by using the default virtual machine (VM) sizes. Assessment data helps
the team identify the expected number of concurrent user sessions and the number of VMs required to
support those concurrent sessions.
2. Deploy: the team creates a host pool for pooled desktops by using a Windows 10 gallery image from Azure
Marketplace and the sizing from assessment step 1.
3. Deploy: the team creates RemoteApp application groups for workloads that it has already migrated.
4. Deploy: the team creates an FSLogix profile container to store user profiles.
5. Release: the team tests the performance and latency of application groups and deployed desktops for a
sampling of users.
6. Release: the team onboards its end users to teach them how to connect through Windows desktop client, web
client, Android client, macOS client, or iOS client.
Assumptions
The proof of concept approach could meet some production needs, but it's built on a number of assumptions.
It's unlikely that all the following assumptions will prove to be true for any enterprise migration of Windows
Virtual Desktop. The adoption team should assume that the production deployment will require a separate
deployment that more closely aligns to the production requirements that it identifies during the Windows Virtual
Desktop assessment. The assumptions are:
End users have a low-latency connection to the assigned landing zone in Azure.
All users can work from a shared pool of desktops.
All users can use the Windows 10 Enterprise multi-session image from Azure Marketplace.
All user profiles will be migrated to either Azure Files, Azure NetApp Files, or a VM-based storage service for
the FSLogix profile containers.
All users can be described by a common persona with a density of six users per virtual central processing unit
(vCPU) and 4 gigabytes (GB) of RAM, as per the VM sizing recommendations.
All workloads are compatible with Windows 10 multi-session.
Latency between the virtual desktops and application groups is acceptable for production usage.
To calculate the cost of the Windows Virtual Desktop scenario based on the proof-of-concept configuration
reference, the team uses the pricing calculator for East US, West Europe, or Southeast Asia.
NOTE
These examples all use Azure Files as the storage service for user profiles.
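Before using the calculator, the team needs a rough host count. A small worked example of the six-users-per-vCPU assumption, in which the concurrent user count and session host size are illustrative:

# Illustrative numbers only; the persona density comes from the assumption above.
$concurrentUsers = 300
$usersPerVcpu    = 6
$vcpusPerHost    = 8          # for example, an 8-vCPU session host

$vcpusNeeded = [math]::Ceiling($concurrentUsers / $usersPerVcpu)   # 50 vCPUs
$hostsNeeded = [math]::Ceiling($vcpusNeeded / $vcpusPerHost)       # 7 session hosts
"$concurrentUsers users -> $vcpusNeeded vCPUs -> $hostsNeeded session hosts"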
Next steps
For guidance on specific elements of the cloud adoption journey, see:
Assess for Windows Virtual Desktop migration or deployment
Deploy or migrate Windows Virtual Desktop instances
Release your Windows Virtual Desktop deployment to production
Windows Virtual Desktop assessment
10/30/2020 • 5 minutes to read • Edit Online
The Windows Virtual Desktop proof of concept provides an initial scope as a baseline implementation for the
Contoso cloud adoption team. But the output of that proof of concept is unlikely to meet their production needs.
The Windows Virtual Desktop assessment exercise serves as a focused means of testing assumptions through a
data-driven process. Assessment data will help the team answer a series of important questions, validate or
invalidate their assumptions, and refine the scope as necessary to support the team's Windows Virtual Desktop
scenario. By using this assumption-validation approach, the team can accelerate the migration or deployment of
its end-user desktops to Windows Virtual Desktop.
Each persona, or each group of users with distinct business functions and technical requirements, would require a
specific host-pool configuration.
The end-user assessment provides the required data: pool type, density, size, CPU/GPU, landing zone region, and
so on.
Host-pool configuration assessment now maps that data to a deployment plan. Aligning the technical
requirements, business requirements, and cost will help determine the proper number and configuration of host
pools.
See examples for pricing in the East US, West Europe, or Southeast Asia regions.
Application groups
Both Movere and Lakeside scans of the current on-premises environment can provide data about the applications
that run on end-user desktops. By using that data, you can create a list of all applications required for each
persona. For each required application, the answers to the following questions will shape deployment iterations:
Do any applications need to be installed for the persona to use this desktop? Unless the persona uses 100
percent web-based software as a service applications, you'll likely need to configure a custom master VHD
image for each persona, with the required applications installed on the master image.
Does this persona need Microsoft 365 applications? If so, you'll need to add Microsoft 365 to a customized
master VHD image.
Is this application compatible with Windows 10 multi-session? If an application isn't compatible, a personal
pool might be required to run the custom VHD image. For assistance with application and Windows Virtual
Desktop compatibility issues, see the Desktop App Assure service.
Are mission-critical applications likely to suffer from latency between the Windows Virtual Desktop instance
and any back-end systems? If so, you might want to consider migrating the back-end systems that support the
application to Azure.
The answers to these questions might require the plan to include remediation to the desktop images or
supporting application components prior to desktop migration or deployment.
Next steps
For guidance on specific elements of the cloud adoption journey, see:
Deploy or migrate Windows Virtual Desktop instances
Release your Windows Virtual Desktop deployment to production
Windows Virtual Desktop deployment or migration
10/30/2020 • 4 minutes to read • Edit Online
The guidance in this article assumes that you've established a plan for Windows Virtual Desktop, assessed the
desktop deployment requirements, completed a proof of concept, and are now ready to migrate or deploy your
Windows Virtual Desktop instances.
Initial scope
The deployment of Windows Virtual Desktop instances follows a process that's similar to the proof of concept
process. Use this initial scope as a baseline to explain the various scope changes that are required by the output of
the assessment.
Create a host pool for pooled desktops by using a Windows 10 gallery image from Azure Marketplace and the
sizing from step 1 of the proof-of-concept process (a scripted sketch follows this list).
Create RemoteApp application groups for workloads that have already been migrated.
Create an FSLogix profile container to store user profiles.
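As referenced above, a scripted sketch of that initial scope using the Az.DesktopVirtualization module; resource names, region, and the session limit are assumptions, and session hosts and the FSLogix container are configured separately:

# Assumed names and values for illustration only.
New-AzWvdHostPool -ResourceGroupName "ContosoWvdRG" -Name "contoso-pooled-hp" -Location "eastus" `
    -HostPoolType Pooled -LoadBalancerType BreadthFirst -PreferredAppGroupType Desktop -MaxSessionLimit 16

# RemoteApp application group for workloads that have already been migrated.
$hostPool = Get-AzWvdHostPool -ResourceGroupName "ContosoWvdRG" -Name "contoso-pooled-hp"
New-AzWvdApplicationGroup -ResourceGroupName "ContosoWvdRG" -Name "contoso-remoteapps" -Location "eastus" `
    -ApplicationGroupType RemoteApp -HostPoolArmPath $hostPool.Id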
Deployment and migration consist of persona migration, application migration, and user profile migration.
Depending on the results of the workload assessment, there will likely be changes to each of those migration
tasks. This article helps identify ways that the scope would change based on the assessment feedback.
Iterative methodology
Each persona will likely require an iteration of the previously outlined initial scope, resulting in multiple host
pools. Depending on the Windows Virtual Desktop assessment, the adoption team should define iterations that
are based on the number of personas or users per persona. Breaking the process into persona-driven iterations
helps to reduce the change velocity impact on the business and allows the team to focus on proper testing or
onboarding of each of the persona pools.
Scope considerations
Each of the following sets of considerations should be included in the design documentation for each persona
group to be migrated or deployed. After the scope considerations are factored in to the previously discussed
initial scope, the deployment or migration can begin.
Azure landing zone considerations
Before you deploy the persona groups, a landing zone should be created in the Azure region that's required to
support each persona to be deployed. Each assigned landing zone should be evaluated against the landing zone
review requirements.
If the assigned Azure landing zone doesn't meet your requirements, scope should be added for any modifications
to be made to the environment.
Application and desktop considerations
Some personas might have a dependency on legacy solutions, which are not compatible with Windows 10 multi-
session. In these cases, some personas might require dedicated desktops. This dependency might not be
discovered until deployment and testing.
If they're discovered late in the process, future iterations should be allocated to modernization or migration of the
legacy application. This will reduce the long-term cost of the desktop experience. Those future iterations should be
prioritized and completed based on the overall pricing impact of modernization versus the extra cost associated
with dedicated desktops. To avoid pipeline disruptions and the realization of business outcomes, this prioritization
should not affect current iterations.
Some applications might require remediation, modernization, or migration to Azure to support the desired end-
user experience. Those changes are likely to come after release. Alternately, when desktop latency can affect
business functions, the application changes might create blocking dependencies for the migration of some
personas.
User profile considerations
The initial scope assumes that you're using a VM-based FSLogix user profile container.
You can use Azure NetApp Files to host user profiles. Doing so will require a few extra steps in the scope,
including:
Per NetApp instance: Configure NetApp files, volumes, and Active Directory connections.
Per host/persona: Configure FSLogix on session host virtual machines.
Per user: Assign users to the host session.
You can also use Azure Files to host user profiles. Doing so will require a few extra steps in the scope, including:
Per Azure Files instance: Configure the storage account, disk type, and Active Directory connection (Active
Directory Domain Services (AD DS) is also supported), assign role-based access control access for an Active
Directory user group, apply NTFS permissions, and get the storage account access key (a scripted sketch
follows this list).
Per host/persona: Configure FSLogix on session host virtual machines.
Per user: Assign users to the host session.
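A minimal PowerShell sketch of those steps, with placeholder names; identity integration and the RBAC/NTFS permissions from the checklist above still have to be configured:

# Placeholder names; run the storage commands from an admin workstation and the registry
# commands on each session host.
$rg = "ContosoWvdRG"
New-AzStorageAccount -ResourceGroupName $rg -Name "contosofslogix" -Location "eastus" `
    -SkuName Premium_LRS -Kind FileStorage
New-AzRmStorageShare -ResourceGroupName $rg -StorageAccountName "contosofslogix" -Name "profiles" -QuotaGiB 1024

# Storage account access key referenced in the checklist.
$key = (Get-AzStorageAccountKey -ResourceGroupName $rg -Name "contosofslogix")[0].Value

# Per host: point FSLogix at the share.
New-Item -Path "HKLM:\SOFTWARE\FSLogix\Profiles" -Force | Out-Null
New-ItemProperty -Path "HKLM:\SOFTWARE\FSLogix\Profiles" -Name "Enabled" -Value 1 -PropertyType DWord -Force
New-ItemProperty -Path "HKLM:\SOFTWARE\FSLogix\Profiles" -Name "VHDLocations" `
    -Value "\\contosofslogix.file.core.windows.net\profiles" -PropertyType MultiString -Force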
The user profiles for some personas or users might also require a data migration effort, which can delay the
migration of specific personas until user profiles can be remediated within your local Active Directory or
individual user desktops. This delay could significantly affect the scope outside of the Windows Virtual Desktop
scenario. After they've been remediated, the initial scope and the preceding approaches can be resumed.
Next steps
Release your Windows Virtual Desktop deployment to production
Windows Virtual Desktop post-deployment
10/30/2020 • 2 minutes to read • Edit Online
The release process for the migration or deployment of Windows Virtual Desktop instances is relatively
straightforward. This process mirrors the one that's used during the Windows Virtual Desktop proof of concept:
Test the performance and latency of application groups and deployed desktops for a sampling of users.
Onboard end users to teach them how to connect via:
Windows desktop client
Web client
Android client
macOS client
iOS client
Post-deployment
After the release has been completed, it's common to add logging and diagnostics to better operate Windows
Virtual Desktop. It's also common for operations teams to onboard the pooled hosts and desktop virtual
machines into the Azure server management best practices to manage reporting, patching, and business
continuity and disaster recovery configurations.
Although the release process is out of scope for this migration scenario, the process might expose the need to
migrate additional workloads to Azure during subsequent iterations of migration. If you haven't configured
Microsoft 365 or Azure Active Directory, your cloud adoption team might choose to onboard into those services
upon the release of the desktop scenarios. For a hybrid operating model, operations teams might also choose to
integrate Intune, System Center, or other configuration management tools to improve operations, compliance,
and security.
Next steps
After the Windows Virtual Desktop migration is complete, your cloud adoption team can begin the next scenario-
specific migration. Alternately, if there are additional desktops to be migrated, you can reuse this article series to
guide your next Windows Virtual Desktop migration or deployment.
Plan for Windows Virtual Desktop migration or deployment
Review your environment or Azure landing zones
Complete a Windows Virtual Desktop proof of concept
Assess for Windows Virtual Desktop migration or deployment
Deploy or migrate Windows Virtual Desktop instances
Release your Windows Virtual Desktop deployment to production
SQL Server migration best practices for Azure
10/30/2020 • 2 minutes to read • Edit Online
Migrating SQL Server to Azure can accelerate the standard methodology outlined in the Cloud Adoption
Framework. The process is shown in the following diagram.
Figure 1
The table of contents on the left outlines best practices that can guide the execution of your SQL Server migration.
You can migrate by using Azure Database Migration Guide, Azure Database Migration Service, or other tools.
Bookmark this page for quick reference to the full list of best practices.
Azure Stack: A strategic option for running Azure in
your datacenter
10/30/2020 • 2 minutes to read • Edit Online
Microsoft takes a cloud-first approach to application and data storage. The priority is to move applications and
data to one or more of the hyperscale clouds, including the global Azure option or a sovereign, locale-specific
cloud such as Azure Germany or Azure Government.
Azure Stack Hub acts as another instance of a sovereign cloud, whether it's operated by customers in their
datacenters or consumed through a cloud service provider. However, Azure Stack Hub is not a hyperscale cloud,
and Microsoft doesn't publish or support any service-level agreements for Azure Stack Hub.
Next steps
For guidance on specific elements of the cloud adoption journey, see:
Plan for Azure Stack Hub migration
Environmental readiness
Assess workloads for Azure Stack Hub
Deploy workloads to Azure Stack Hub
Govern Azure Stack Hub
Manage Azure Stack Hub
Plan your Azure Stack Hub migration
10/30/2020 • 2 minutes to read • Edit Online
This article assumes that you've reviewed how to integrate Azure Stack into your cloud strategy and that your
journey aligns with the examples in that article.
Before you move directly into your organization's migration efforts, it's important to set expectations appropriately
about Azure and Azure Stack Hub. Doing so can help avoid pitfalls or setbacks later in the project. The key to a
successful implementation is a good understanding of when to use Azure and when to use Azure Stack Hub.
Next steps
For guidance on specific elements of the cloud adoption journey, see:
Ready your cloud environment for Azure Stack Hub migration
Assess workloads for Azure Stack Hub
Deploy workloads to Azure Stack Hub
Govern Azure Stack Hub
Manage Azure Stack Hub
Ready your cloud environment for Azure Stack Hub
migration
10/30/2020 • 2 minutes to read • Edit Online
This article assumes that you've decided to integrate Azure Stack into your cloud strategy and you've developed a
plan for Azure Stack Hub migration.
Assess the infrastructure dependencies that must be addressed first:
Identity
Connectivity
Security
Encryption
Next steps
For guidance on specific elements of the cloud adoption journey, see:
Assess workloads for Azure Stack Hub
Deploy workloads to Azure Stack Hub
Govern Azure Stack Hub
Manage Azure Stack Hub
Assess workloads for Azure Stack Hub migration
10/30/2020 • 2 minutes to read • Edit Online
This article assumes that you've decided to integrate Azure Stack into your cloud strategy, you've developed a plan
for Azure Stack Hub migration, and your environment is ready for migration.
During the rationalization of your organization's digital estate in the Plan methodology, each workload was
discovered and inventoried, and initial decisions were made based on quantitative data. Before you deploy each
workload, it's important to validate the data and the decisions with qualitative data.
Placement
The first data point to consider is placement. That is, will this workload be migrated to your public cloud, private
cloud, or some other cloud platform, such as a sovereign cloud or service provider's Azure environment?
The information in each of the following sections can help validate your decisions about placement. The
information will also help surface data that will be useful during the deployment of your workloads.
Stakeholder value
Evaluate the value of migrating this workload with business and IT stakeholders:
Less friction: short-term focus, limited long-term viability.
More friction: long-term investment, easier to iterate and continue to modernize.
A balance of the two.
Success metrics
Determine success metrics and availability tolerances:
Performance
Availability
Resiliency
Deployment or migration approach
Licensing
Assess the impact of licensing and support:
Are there product licensing restrictions that will limit transformation?
Is the application or dataset supportable in the new environment?
Are there third-party software vendors who need to provide support statements?
Operations requirements
Avoid duplication of effort and optimize service-level agreements (SLAs) by examining the correlation between
IT-managed cloud services and application-specific services.
Consider the automation that's required to orchestrate the provisioning of services during deployment and
migration of applications.
To help meet your operations requirements, consider scalability and availability services such as pay per use,
virtual machine (VM) availability sets, VM scale sets, network adapters, and the ability to add and resize VMs
and disks.
Monitoring
Monitor system health and operational status and performance by using well-defined metrics that form the
basis of the SLAs that you offer your end users.
Check security and compliance, evaluating how well the cloud environment meets the regulatory and
compliance requirements that are imposed by the application.
What are the processes for backup/restore and replication/failover?
Find data-protection services for infrastructure as a service, platform as a service, and software as a service
resources.
Incorporate multiple vendors, technologies, and capabilities to achieve a comprehensive protection strategy.
Next steps
For guidance on specific elements of the cloud adoption journey, see:
Assess workloads for Azure Stack Hub
Deploy workloads to Azure Stack Hub
Govern Azure Stack Hub
Manage Azure Stack Hub
Deploy workloads to Azure Stack Hub
10/30/2020 • 2 minutes to read • Edit Online
By using Azure Stack, your organization can run its own instance of Azure in its datacenter. Organizations include
Azure Stack in their cloud strategy because it helps them handle situations when the public cloud won't work for
them. The three most common reasons to use Azure Stack are:
Poor network connectivity to the public cloud.
Regulatory or contractual requirements.
Back-end systems that can't be exposed to the internet.
Deploy workloads
After the Azure Stack Hub administrator has properly configured your stack instance, migrations can continue as
they would with most other Azure migration efforts. By using Azure Stack, your team can run any of the following
types of migration:
Ethereum blockchain network
AKS engine
Azure Cognitive Services
C# ASP.NET web app
Linux VM
Java web app
Additional considerations during migration
The following articles can help your team during migration and modernization:
Scalability and availability services such as pay per use, VM availability sets, VM scale sets, network adapters,
and the ability to add and resize VMs and disks
Storage capacity, including the ability to upload and download and also capture and deploy VM images
Azure Stack quickstart templates GitHub repository
Azure quickstart templates GitHub repository
Next steps
For guidance on specific elements of the cloud adoption journey, see:
Govern Azure Stack Hub
Manage Azure Stack Hub
Govern an Azure instance in your datacenter
10/30/2020 • 2 minutes to read • Edit Online
Governing hybrid solutions across public and private cloud platforms adds complexity. Because your Azure Stack
Hub is your own private instance of Azure running in your datacenter, that complexity is inherently reduced.
The business processes, disciplines, and many of the best practices outlined in the Govern methodology of the
Cloud Adoption Framework can still be applied to hybrid governance with Azure Stack Hub. Many cloud-native
tools used in the public cloud version of Azure can also be used in your Azure Stack Hub.
Next steps
For guidance on specific elements of the cloud adoption journey, see:
Manage Azure Stack Hub
Manage workloads that run on Azure Stack Hub
10/30/2020 • 2 minutes to read • Edit Online
Operations and management of hybrid solutions across public and private cloud platforms is complex and could
introduce risk to business operations. Because Azure Stack Hub is your organization's private instance of Azure
running in your datacenter, the risk of hybrid operations is greatly reduced.
As outlined in the Manage methodology of the Cloud Adoption Framework, suggested operations management
activities focus on the following list of core responsibilities. The same responsibilities hold true for the operations
management teams that support Azure Stack Hub.
Inventory and visibility: Create an inventory of assets across multiple clouds. Develop visibility into the run
state of each asset.
Operational compliance: Establish controls and processes to ensure that each state is properly configured
and running in a well-governed environment.
Protect and recover: Ensure that all managed assets are protected and can be recovered by using baseline
management tooling.
Enhanced baseline options: Evaluate common additions to the baseline that might meet business needs.
Platform operations: Extend the management baseline with a well-defined service catalog and centrally
managed platforms.
Workload operations: Extend the management baseline to include a focus on mission-critical workloads.
Next steps
After your Azure Stack Hub migration reaches an operational state, you can begin the next iteration of migrations
by using Azure Stack Hub or other migration scenarios in the Azure public cloud.
Plan for Azure Stack Hub migrations
Environmental readiness
Assess workloads for Azure Stack Hub
Deploy workloads to Azure Stack Hub
Govern Azure Stack Hub
Manage Azure Stack Hub
Azure Synapse Analytics solutions
10/30/2020 • 2 minutes to read • Edit Online
Current market offerings fall short in meeting an organization's growing needs. Legacy on-premises environments,
including Teradata, Netezza, and Oracle Exadata, are expensive, slow to innovate, and inelastic. Organizations that
use on-premises systems are now considering taking advantage of innovative cloud, infrastructure as a service,
and platform as a service offerings in newer environments like Azure.
Many organizations are ready to take the step of shifting expensive tasks like infrastructure maintenance and
platform development to a cloud provider. In Microsoft Azure, an organization has access to a globally available,
highly secure, scalable cloud environment that includes Azure Synapse Analytics in an ecosystem of supporting
tools and capabilities.
Azure Synapse Analytics provides best-of-class relational database performance through techniques like massively
parallel processing and automatic in-memory caching. The results of this approach can be seen in independent
benchmarks like the one run recently by GigaOm, which compares Azure Synapse to other popular cloud data
warehouse offerings.
Organizations that have already migrated to Azure Synapse Analytics have seen many benefits, including:
Improved performance and price for performance.
Increased agility and shorter time to value.
Faster server deployment and application development.
Elastic scalability to ensure that you pay only for what you use.
Improved security and compliance.
Reduced storage and disaster recovery costs.
Lower overall TCO and better cost control (operating expenses).
To maximize these benefits, it's necessary to migrate existing data and applications to the Azure Synapse platform.
In many organizations, this approach includes migrating an existing data warehouse from a legacy on-premises
platform like Teradata, Netezza, or Exadata. Organizations need to modernize their data estate with an analytics
offering that is price-performant, rapidly innovative, scalable, and truly elastic. Learn more in the following sections
for migration best practices on Teradata, Netezza, and Exadata.
Next steps
Azure Synapse Analytics solutions for Teradata
Azure Synapse Analytics solutions for Netezza
Azure Synapse Analytics solutions for Exadata
Azure Synapse Analytics solutions and migration for
Teradata
10/30/2020 • 18 minutes to read • Edit Online
Many organizations are ready to take the step of shifting expensive data warehouse tasks like infrastructure
maintenance and platform development to a cloud provider. Organizations are now looking to take advantage of
innovative cloud, infrastructure as a service, and platform as a service offerings in newer environments like Azure.
Azure Synapse Analytics is a limitless analytics service that brings together enterprise data warehousing and big
data analytics. It gives you the freedom to query data on your terms at scale by using either serverless on-demand
or provisioned resources. Learn what to plan for as you migrate a legacy Teradata system to Azure Synapse.
Although Teradata and Azure Synapse are similar in that they're both SQL databases that are designed to use
massively parallel processing techniques to achieve high query performance on large data volumes, they have
some basic differences:
Legacy Teradata systems are installed on-premises, and they use proprietary hardware. Azure Synapse is cloud-
based and uses Azure compute and storage resources.
Upgrading a Teradata configuration is a major task that involves extra physical hardware and a potentially
lengthy database reconfiguration or dump and reload. In Azure Synapse, compute and storage resources are
separate, so you can easily scale up or down independently by using the elastic scalability of Azure.
Without a physical system to support, you can pause or resize Azure Synapse as needed to reduce resource
utilization and cost. In Azure, you have access to a globally available, highly secure, and scalable cloud
environment that includes Azure Synapse in an ecosystem of supporting tools and capabilities.
In this article, we look at schema migration, with an objective of obtaining equivalent or increased performance of
your migrated Teradata data warehouse and data marts on Azure Synapse. We consider concerns that apply
specifically to migrating from an existing Teradata environment.
At a high level, the migration process includes the steps that are listed in the following table:
Preparation:
Define scope: what do we want to migrate?
Build an inventory of data and processes to migrate.
Define any data model changes.
Identify the best Azure and third-party tools and features to use.
Train staff early on the new platform.
Set up the Azure target platform.

Migration:
Start small and simple.
Automate where possible.
Use Azure built-in tools and features to reduce the migration effort.
Migrate metadata for tables and views.
Migrate relevant historical data.
Migrate or refactor stored procedures and business processes.
Migrate or refactor ETL/ELT incremental load processes.

Post-migration:
Monitor and document all stages of the migration process.
Use experience gained to build a template for future migrations.
Reengineer the data model if necessary by using the new platform's performance and scalability.
Test applications and query tools.
Benchmark and optimize query performance.
When you migrate from a legacy Teradata environment to Azure Synapse, in addition to the more general subjects
that are described in the Teradata documentation, you must consider some specific factors.
Metadata migration
It makes sense to automate and orchestrate the migration process by using the capabilities of the Azure
environment. This approach minimizes impact on the existing Teradata environment, which might already be
running close to full capacity.
Azure Data Factory is a cloud-based data integration service. You can use Data Factory to create data-driven
workflows in the cloud to orchestrate and automate data movement and data transformation. Data Factory
pipelines can ingest data from disparate datastores. Then, they process and transform the data by using compute
services like Azure HDInsight for Apache Hadoop and Apache Spark, Azure Data Lake Analytics, and Azure Machine
Learning.
Start by creating metadata that lists the data tables you want to migrate along with their locations. Then, use Data
Factory capabilities to manage the migration process.
In Azure Synapse, you can achieve the same result by using the following syntax:
SELECT * FROM (SELECT col1, ROW_NUMBER() OVER (PARTITION by col1 ORDER BY col1) rn FROM tab1 WHERE
c1='XYZ' ) WHERE rn = 1;
Date arithmetic: Azure Synapse has functions like DATEADD and DATEDIFF, which you can use on DATE or
DATETIME fields.
Depending on system settings, character comparisons in Teradata might not be case-specific by default. In
Azure Synapse, these comparisons are always case-specific.
Performance-tuning recommendations
The platforms have some differences when it comes to optimization. In the following list of performance-tuning
recommendations, lower-level implementation differences between Teradata and Azure Synapse and alternatives
for your migration are highlighted:
Data distribution options: In Azure, you can set the data distribution methods for individual tables. The
purpose of the functionality is to reduce the amount of data that moves between processing nodes when a
query is executed.
For large table/large table joins, hash distributing in one or both (ideally, both) tables on the join columns
helps ensure that join processing can be performed locally because the data rows to be joined are already
colocated on the same processing node.
Azure Synapse provides an additional way to achieve local joins for small table/large table joins (often called
a dimension table/fact table join in a star schema model). You replicate the smaller table across all nodes,
thereby ensuring that any value of the join key for the larger table has a matching dimension row that's
locally available. The overhead of replicating the dimension table is relatively low, provided the tables aren't large.
For large tables, using the hash distribution approach described earlier is preferable.
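As a hedged sketch of how these distribution choices are expressed (the table and column names are hypothetical), both hash distribution and replication are declared in the WITH clause of CREATE TABLE in Azure Synapse:

-- Hash-distribute a large fact table on the join column so that
-- large table/large table joins can be processed locally.
CREATE TABLE dbo.FactSales
(
    customer_key INT NOT NULL,
    sale_date    DATE,
    amount       DECIMAL(18,2)
)
WITH (DISTRIBUTION = HASH(customer_key), CLUSTERED COLUMNSTORE INDEX);

-- Replicate a small dimension table across all compute nodes so that
-- dimension/fact joins are always local.
CREATE TABLE dbo.DimCustomer
(
    customer_key  INT NOT NULL,
    customer_name VARCHAR(100)
)
WITH (DISTRIBUTION = REPLICATE);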
Data indexing: Azure Synapse provides various indexing options, but the options are different in operation
and usage from indexing options in Teradata. To learn about the indexing options in Azure Synapse, see
Design tables in an Azure Synapse pool.
Existing indexes in the source Teradata environment can provide a useful indication of how data is used and
provide an indication of candidate columns for indexing in the Azure Synapse environment.
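The following sketch illustrates the main table-level indexing choices in an Azure Synapse dedicated SQL pool; the table names are illustrative, and the linked article covers when each option is appropriate:

-- Default for large fact tables: clustered columnstore index.
CREATE TABLE dbo.FactSales_CCI
(
    sale_id INT NOT NULL, amount DECIMAL(18,2)
)
WITH (CLUSTERED COLUMNSTORE INDEX);

-- Clustered index (row store), often suited to smaller tables or highly selective lookups.
CREATE TABLE dbo.DimDate_CI
(
    date_key INT NOT NULL, calendar_date DATE
)
WITH (CLUSTERED INDEX (date_key));

-- Heap, useful as a fast-loading staging table.
CREATE TABLE dbo.Staging_Sales
(
    sale_id INT NOT NULL, amount DECIMAL(18,2)
)
WITH (HEAP);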
Data partitioning: In an enterprise data warehouse, fact tables might contain many billions of rows of
data. Partitioning is a way to optimize maintenance and querying in these tables. Splitting the tables into
separate parts reduces the amount of data processed at one time. Partitioning for a table is defined in the
CREATE TABLE statement.
Only one field per table can be used for partitioning. The field that's used for partitioning frequently is a date
field because many queries are filtered by date or by a date range. You can change the partitioning of a table
after initial load. To change a table's partitioning, re-create the table with the new distribution by using the
CREATE TABLE AS SELECT statement. For a detailed description of partitioning in Azure Synapse, see Partition
tables in an Azure Synapse SQL pool.
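As an illustrative example (the table, column, and boundary values are hypothetical), a table can be partitioned on a date column at creation time, and CREATE TABLE AS SELECT can later re-create it with different partition boundaries:

-- Partition a fact table by date; only one partitioning column is allowed.
CREATE TABLE dbo.FactSales_Part
(
    customer_key INT NOT NULL,
    sale_date    DATE NOT NULL,
    amount       DECIMAL(18,2)
)
WITH
(
    DISTRIBUTION = HASH(customer_key),
    CLUSTERED COLUMNSTORE INDEX,
    PARTITION (sale_date RANGE RIGHT FOR VALUES ('2019-01-01', '2020-01-01'))
);

-- To change the partitioning after the initial load, re-create the table with CTAS.
CREATE TABLE dbo.FactSales_Repart
WITH
(
    DISTRIBUTION = HASH(customer_key),
    CLUSTERED COLUMNSTORE INDEX,
    PARTITION (sale_date RANGE RIGHT FOR VALUES ('2019-01-01', '2019-07-01', '2020-01-01'))
)
AS SELECT * FROM dbo.FactSales_Part;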
Data table statistics: You can ensure that statistics about data tables are up to date by adding a
COLLECT STATISTICS step in ETL/ELT jobs or by enabling automatic statistics collection on the table.
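In Azure Synapse, the equivalent of a statistics-collection step is CREATE STATISTICS and UPDATE STATISTICS. A minimal sketch, with illustrative object names and a placeholder database name:

-- Create single-column statistics on a frequently filtered column,
-- then refresh them as part of the ETL/ELT job after each load.
CREATE STATISTICS stat_FactSales_sale_date ON dbo.FactSales (sale_date);
UPDATE STATISTICS dbo.FactSales;

-- Alternatively, enable automatic statistics creation at the database level.
-- <database-name> is a placeholder for your dedicated SQL pool database.
ALTER DATABASE <database-name> SET AUTO_CREATE_STATISTICS ON;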
PolyBase for data loading: PolyBase is the most efficient method to use to load large amounts of data
into a warehouse. You can use PolyBase to load data in parallel streams.
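The following is a hedged sketch of a PolyBase load; the storage URL and object names are illustrative, and credential setup for non-public storage is omitted. Data exported from Teradata and uploaded to Azure storage is exposed as an external table and then loaded in parallel with CREATE TABLE AS SELECT:

-- External data source and file format point PolyBase at exported data
-- that has been uploaded to Azure storage (URL and names are illustrative).
CREATE EXTERNAL DATA SOURCE MigrationStage
WITH (TYPE = HADOOP, LOCATION = 'abfss://stage@mystorageaccount.dfs.core.windows.net');

CREATE EXTERNAL FILE FORMAT CsvFormat
WITH (FORMAT_TYPE = DELIMITEDTEXT, FORMAT_OPTIONS (FIELD_TERMINATOR = ','));

CREATE EXTERNAL TABLE dbo.FactSales_ext
(
    customer_key INT, sale_date DATE, amount DECIMAL(18,2)
)
WITH (LOCATION = '/factsales/', DATA_SOURCE = MigrationStage, FILE_FORMAT = CsvFormat);

-- CTAS loads the external data into a distributed internal table in parallel.
CREATE TABLE dbo.FactSales
WITH (DISTRIBUTION = HASH(customer_key), CLUSTERED COLUMNSTORE INDEX)
AS SELECT * FROM dbo.FactSales_ext;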
Resource classes for workload management: Azure Synapse uses resource classes to manage
workloads. In general, large resource classes provide better individual query performance. Smaller resource
classes give you higher levels of concurrency. You can use dynamic management views to monitor
utilization to help ensure that the appropriate resources are used efficiently.
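As a brief illustration (the user name is hypothetical), a resource class is assigned by adding a user to the corresponding database role, and the dynamic management views show which resource class each running request is using:

-- Assign a loading user to a larger static resource class so that its
-- queries get more memory (at the cost of lower concurrency).
EXEC sp_addrolemember 'largerc', 'etl_load_user';

-- Monitor active and queued requests and their resource classes.
SELECT request_id, session_id, status, resource_class
FROM   sys.dm_pdw_exec_requests
WHERE  status NOT IN ('Completed', 'Failed', 'Cancelled');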
Next steps
For more information about implementing a Teradata migration, talk with your Microsoft account representative
about on-premises migration offers.
Azure Synapse Analytics solutions and migration for
Netezza
10/30/2020 • 17 minutes to read • Edit Online
As IBM support for Netezza ends, many organizations that currently use Netezza data warehouse systems are
looking to take advantage of innovative cloud, infrastructure as a service, and platform as a service offerings in
newer environments like Azure. Many organizations are ready to take the step of shifting expensive tasks like
infrastructure maintenance and platform development to a cloud provider.
Azure Synapse Analytics is a limitless analytics service that brings together enterprise data warehousing and big
data analytics. It gives you the freedom to query data on your terms at scale by using either serverless on-demand
or provisioned resources. Learn what to plan for as you migrate a legacy Netezza system to Azure Synapse.
Netezza and Azure Synapse are similar in that each is a SQL database that's designed to use massively parallel
processing techniques to achieve high query performance on large data volumes. But the two platforms are
different in key aspects:
Legacy Netezza systems are installed on-premises, and they use proprietary hardware. Azure Synapse is cloud-
based and uses Azure compute and storage resources.
Upgrading a Netezza configuration is a major task that involves extra physical hardware and a potentially
lengthy database reconfiguration or dump and reload. In Azure Synapse, storage and compute resources are
separate. You can use the elastic scalability of Azure to independently scale up or down.
Without a physical system to support, you can pause or resize Azure Synapse as needed to reduce resource
utilization and cost. In Azure, you have access to a globally available, highly secure, and scalable cloud
environment that includes Azure Synapse in an ecosystem of supporting tools and capabilities.
In this article, we look at schema migration, with a view to obtaining equivalent or increased performance of your
migrated Netezza data warehouse and data marts on Azure Synapse. We consider concerns that apply specifically
to migrating from an existing Netezza environment.
At a high level, the migration process includes the steps that are listed in the following table:
Preparation:
Define scope: what do we want to migrate?
Build an inventory of data and processes to migrate.
Define any data model changes.
Identify the best Azure and third-party tools and features to use.
Train staff early on the new platform.
Set up the Azure target platform.
Migration:
Start small and simple.
Automate where possible.
Use Azure built-in tools and features to reduce the migration effort.
Migrate metadata for tables and views.
Migrate relevant historical data.
Migrate or refactor stored procedures and business processes.
Migrate or refactor ETL or ELT incremental load processes.
Post-migration:
Monitor and document all stages of the migration process.
Use experience gained to build a template for future migrations.
Reengineer the data model if necessary by using the new platform's performance and scalability.
Test applications and query tools.
Benchmark and optimize query performance.
When you migrate from a legacy Netezza environment to Azure Synapse, you must consider some specific factors,
in addition to the more general subjects described in the Netezza documentation.
Initial migration workload
Legacy Netezza environments typically evolve over time to encompass multiple subject areas and mixed
workloads. When you are deciding where to start on an initial migration project, it makes sense to choose an area
that:
Proves the viability of migrating to Azure Synapse by quickly delivering the benefits of the new environment.
Allows in-house technical staff to gain experience with new processes and tools so that they can use them to
migrate other areas.
Creates a template based on the current tools and processes to use in additional migrations from the source
Netezza environment.
A good candidate for an initial migration from a Netezza environment that would support these objectives typically
is one that implements a Power BI/analytics workload rather than an OLTP workload. The workload should have a
data model that can be migrated with minimal modifications, such as a star or snowflake schema.
For size, it's important that the data volume you migrate in the initial exercise is large enough to demonstrate the
capabilities and benefits of the Azure Synapse environment while keeping the time to demonstrate value short. The
size that typically meets these requirements is in the range of 1 terabyte (TB) to 10 TB.
An approach for the initial migration project that minimizes risk and implementation time is to confine the scope of
the migration to data marts. This approach is a good starting point because it clearly limits the scope of the
migration and typically can be achieved on a short timescale. An initial migration of data marts only doesn't
address broader concerns like how to migrate ETL and historical data. You must address these areas in later phases
and backfill the migrated data mart layer with the data and processes that are required to build them.
Metadata migration
It makes sense to automate and orchestrate the migration process by using the capabilities of the Azure
environment. This approach minimizes the effect on the existing Netezza environment, which might already be
running close to full capacity.
Azure Data Factory is a cloud-based data integration service. You can use Data Factory to create data-driven
workflows in the cloud to orchestrate and automate data movement and data transformation. Data Factory
pipelines can ingest data from disparate datastores, and then process and transform the data by using compute
services like Azure HDInsight for Apache Hadoop and Apache Spark, Azure Data Lake Analytics, and Azure Machine
Learning. You start by creating metadata to list the data tables you want to migrate, with their locations, and then
use Data Factory capabilities to manage the migration process.
AGE: Netezza supports the AGE operator to give the interval between two temporal values (for
example, timestamps and dates). For example:
SELECT AGE ('23-03-1956', '01-01-2019') FROM ...
You can achieve the same result in Azure Synapse by using DATEDIFF (note the date representation
sequence):
SELECT DATEDIFF(day, '1956-03-23', '2019-01-01') FROM ...
In Netezza, the information that specifies the current table and view definitions is maintained in system
catalog tables. System catalog tables are the best source of the information because the tables likely are up
to date and complete. User-maintained documentation might not be in sync with current table definitions.
You can access system catalog tables in Netezza by using a utility like nz_ddl_table. You can use the tables to
generate CREATE TABLE DDL statements, which you can then edit for the equivalent tables in Azure Synapse. Third-
party migration and ETL tools also use the catalog information to achieve the same results.
Data extraction: You can extract raw data to migrate from an existing Netezza table into a flat, delimited
file by using standard Netezza utilities like nzsql and nzunload, and by using external tables. Compress the
files by using gzip, and then use AzCopy or an Azure data transport service like Azure Data Box to upload
the files to Azure Blob storage.
During a migration exercise, it's important to extract data as efficiently as possible. The recommended
approach for Netezza is to use external tables, which also is the fastest method. You can complete multiple
extracts in parallel to maximize the throughput for data extraction.
Here's a simple example of an external table extract:
CREATE EXTERNAL TABLE '/tmp/export_tab1.CSV' USING (DELIM ',') AS SELECT * from <TABLE-NAME>;
If you have sufficient network bandwidth, you can extract data directly from an on-premises Netezza system into
Azure Synapse tables or into Azure data storage by using Data Factory processes or third-party data migration or
ETL products.
Recommended data formats for extracted data are delimited text files (also called comma-separated values),
optimized row columnar (ORC) files, or Parquet files.
For more detailed information about the process of migrating data and ETL from a Netezza environment, see the
Netezza documentation about data migration ETL and load.
Performance-tuning recommendations
When you move to Azure Synapse from a Netezza environment, many of the performance-tuning concepts you
use will be familiar.
For example, these concepts are the same for both environments:
Data distribution colocates data to be joined onto the same processing node.
Using the smallest data type for a specific column saves storage space and accelerates query processing.
Ensuring that data types of columns to be joined are identical optimizes join processing by reducing the need to
transform data for matching.
Ensuring that statistics are up to date helps the optimizer produce the best execution plan.
There are some differences between platforms when it comes to optimization. In the following list of performance-
tuning recommendations, lower-level implementation differences between Netezza and Azure Synapse, and
alternatives for your migration, are highlighted:
Data distribution options: In both Netezza and Azure Synapse, you can use a CREATE TABLE statement to
specify a distribution definition. Use DISTRIBUTE ON for Netezza and DISTRIBUTION = for Azure Synapse.
Azure Synapse provides an additional way to achieve local joins for small table/large table joins, often called
a dimension table/fact table join in a star schema model. The approach is to replicate the smaller dimension
table across all nodes, thereby ensuring that any value of the join key for the larger table will have a
matching dimension row that's locally available. The overhead of replicating the dimension table is relatively
low, provided the tables aren't large. For large tables, using the hash distribution approach described earlier is
preferable.
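To illustrate the syntax difference (the table and column names are hypothetical), compare a Netezza distribution declaration with its Azure Synapse equivalent:

-- Netezza: distribution is declared with DISTRIBUTE ON.
CREATE TABLE fact_sales
(
    customer_key INTEGER, sale_date DATE, amount NUMERIC(18,2)
)
DISTRIBUTE ON (customer_key);

-- Azure Synapse: the equivalent is declared in the WITH clause.
CREATE TABLE dbo.FactSales
(
    customer_key INT, sale_date DATE, amount DECIMAL(18,2)
)
WITH (DISTRIBUTION = HASH(customer_key));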
Data indexing: Azure Synapse provides various user-definable indexing options, but the options are
different in operation and usage than system-managed zone maps in Netezza. To learn about the indexing
options in Azure Synapse, see Index tables in an Azure Synapse SQL pool.
Existing system-managed zone maps in the source Netezza environment can provide a useful indication of
how data is used and provide an indication of candidate columns for indexing in the Azure Synapse
environment.
Data partitioning: In an enterprise data warehouse, fact tables might contain many billions of rows of
data. Partitioning is a way to optimize maintenance and querying in these tables. Splitting the tables into
separate parts reduces the amount of data processed at one time. Partitioning for a table is defined in the
CREATE TABLE statement.
Only one field per table can be used for partitioning. The field that's used for partitioning frequently is a date
field because many queries are filtered by date or by a date range. You can change the partitioning of a table
after initial load. To change a table's partitioning, re-create the table with the new distribution by using the
CREATE TABLE AS SELECT statement. For a detailed description of partitioning in Azure Synapse, see Partition
tables in an Azure Synapse SQL pool.
PolyBase for data loading: PolyBase is the most efficient method to use to load large amounts of data
into a warehouse. You can use PolyBase to load data in parallel streams.
Resource classes for workload management: Azure Synapse uses resource classes to manage
workloads. In general, large resource classes provide better individual query performance. Smaller resource
classes give you higher levels of concurrency. You can use dynamic management views to monitor
utilization to help ensure that the appropriate resources are used efficiently.
Next steps
For more information about implementing a Netezza migration, talk with your Microsoft account representative
about on-premises migration offers.
Azure Synapse Analytics solutions and migration for
an Oracle data warehouse
10/30/2020 • 2 minutes to read • Edit Online
An Oracle data warehouse schema is different from Azure Synapse Analytics in several ways. The differences
include databases, data types, and a range of Oracle Database object types that aren't supported in Azure Synapse.
When you migrate an Oracle data warehouse to Azure Synapse, you'll find that, like other database management
systems, Oracle can have multiple, separate databases, whereas Azure Synapse has only one database. You might
need to use a new naming convention, such as concatenating Oracle schema and table names, to move tables and
views in your Oracle data warehouse staging database, production database, and data mart databases to Azure Synapse.
Several Oracle Database objects aren't supported in Azure Synapse. Database objects that aren't supported in
Azure Synapse include Oracle bit-mapped indexes, function-based indexes, domain indexes, Oracle clustered tables,
row-level triggers, user-defined data types, and PL/SQL stored procedures. You can identify these objects by
querying various Oracle system catalog tables and views. In some cases, you can use workarounds. For example,
you can use partitioning or other index types in Azure Synapse to work around the unsupported index types in
Oracle. You might be able to use materialized views instead of Oracle clustered tables, and migration tools like SQL
Server Migration Assistant (SSMA) for Oracle can translate at least some PL/SQL.
When you migrate an Oracle data warehouse schema, you also must take into account data type differences on
columns. To find the columns in your Oracle data warehouse and data mart schemas that have data types that
don't map to data types in Azure Synapse, query the Oracle catalog. You can use workarounds for several of these
instances.
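As a hedged sketch (the schema names and the list of data types to flag are illustrative, not a definitive mapping), a query against Oracle's standard dictionary view ALL_TAB_COLUMNS can surface columns whose types need attention before migration:

-- List columns whose Oracle data types may need mapping work before
-- migration to Azure Synapse. The set of types flagged here is illustrative.
SELECT owner, table_name, column_name, data_type
FROM   all_tab_columns
WHERE  owner IN ('STAGING', 'PROD', 'MART')   -- your Oracle schemas
AND    data_type IN ('XMLTYPE', 'BFILE', 'LONG', 'LONG RAW', 'ROWID', 'UROWID')
ORDER BY owner, table_name, column_name;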
To maintain or improve performance of your schema after migration, consider performance mechanisms, like
Oracle indexing, that you currently have in place. For example, bit-mapped indexes that Oracle queries frequently
use might indicate that creating a nonclustered index in the migrated schema on Azure Synapse would be
advantageous.
A good practice in Azure Synapse includes using data distribution to colocate data to be joined onto the same
processing node. Another good practice in Azure Synapse is ensuring that data types of columns to be joined are
identical. Using identical joined columns optimizes join processing by reducing the need to transform data for
matching. In Azure Synapse, often it isn't necessary to migrate every Oracle index because other features provide
high performance. You can instead use parallel query processing, in-memory data, and result set caching and data
distribution options that reduce I/O.
SSMA for Oracle can help you migrate an Oracle data warehouse or data mart to Azure Synapse. SSMA is designed
to automate the process of migrating tables, views, and data from an existing Oracle environment. Among other
features, SSMA recommends index types and data distributions for target Azure Synapse tables, and it applies data
type mappings during migration. Although SSMA isn't the most efficient approach for very high volumes of data,
it's useful for smaller tables.
Mainframe migration overview
10/30/2020 • 4 minutes to read • Edit Online
Many companies and organizations benefit from moving some or all their mainframe workloads, applications, and
databases to the cloud. Azure provides mainframe-like features at cloud scale without many of the drawbacks
associated with mainframes.
The term mainframe generally refers to a large computer system, but the vast majority of mainframes currently
deployed are IBM System Z servers or IBM plug-compatible systems running MVS, DOS, VSE, OS/390, or z/OS.
Mainframe systems continue to be used in many industries to run vital information systems, and they have a place
in highly specific scenarios, such as large, high-volume, transaction-intensive IT environments.
Migrating to the cloud enables companies to modernize their infrastructure. With cloud services you can make
mainframe applications, and the value that they provide, available as a workload whenever your organization
needs it. Many workloads can be transferred to Azure with only minor code changes, such as updating the names
of databases. You can migrate more complex workloads using a phased approach.
Most Fortune 500 companies are already running Azure for their critical workloads. Azure's significant bottom-line
incentives motivate many migration projects. Companies typically move development and test workloads to Azure
first, followed by DevOps, email, and disaster recovery.
Intended audience
If you're considering a migration or the addition of cloud services as an option for your IT environment, this guide
is for you.
This guidance helps IT organizations start the migration conversation. You may be more familiar with Azure and
cloud-based infrastructures than you are with mainframes, so this guide starts with an overview of how
mainframes work, and continues with various strategies for determining what and how to migrate.
Mainframe architecture
In the late 1950s, mainframes were designed as scale-up servers to run high-volume online transactions and batch
processing. Because of this, mainframes have software for online transaction forms (sometimes called green
screens) and high-performance I/O systems for processing batch runs.
Mainframes are known for high reliability and availability as well as their ability to run huge online transactions
and batch jobs. A transaction results from a piece of processing initiated by a single request, typically from a user at
a terminal. Transactions can also come from multiple other sources, including web pages, remote workstations, and
applications from other information systems. A transaction can also be triggered automatically at a predefined time
as the following figure shows.
A typical IBM mainframe architecture includes these common components:
Front-end systems: Users can initiate transactions from terminals, web pages, or remote workstations.
Mainframe applications often have custom user interfaces that can be preserved after migration to Azure.
Terminal emulators (also called green-screen terminals) are still used to access mainframe applications.
Application tier: Mainframes typically include a customer information control system (CICS), a leading
transaction management suite for the IBM z/OS mainframe that is often used with IBM Information
Management System (IMS), a message-based transaction manager. Batch systems handle high-throughput
data updates for large volumes of account records.
Code: Programming languages used by mainframes include COBOL, Fortran, PL/I, and Natural. Job control
language (JCL) is used to work with z/OS.
Database tier: A common relational database management system (DBMS) for z/OS is IBM DB2. It
manages data structures called dbspaces that contain one or more tables and are assigned to storage pools
of physical data sets called dbextents. Two important database components are the directory that identifies
data locations in the storage pools, and the log that contains a record of operations performed on the
database. Various flat-file data formats are supported. DB2 for z/OS typically uses virtual storage access
method (VSAM) datasets to store the data.
Management tier: IBM mainframes include scheduling software such as TWS-OPC, tools for print and
output management such as CA-SAR and SPOOL, and a source control system for code. Secure access
control for z/OS is handled by resource access control facility (RACF). A database manager provides access
to data in the database and runs in its own partition in a z/OS environment.
LPAR: Logical partitions, or LPARs, are used to divide compute resources. A physical mainframe is
partitioned into multiple LPARs.
z/OS: A 64-bit operating system that is most commonly used for IBM mainframes.
IBM systems use a transaction monitor such as CICS to track and manage all aspects of a business transaction.
CICS manages the sharing of resources, the integrity of data, and prioritization of execution. CICS authorizes users,
allocates resources, and passes database requests by the application to a database manager, such as IBM DB2.
For more precise tuning, CICS is commonly used with IMS/TM (formerly IMS/Data Communications or IMS/DC).
IMS was designed to reduce data redundancy by maintaining a single copy of the data. It complements CICS as a
transaction monitor by maintaining state throughout the process and recording business functions in a data store.
Mainframe operations
The following are typical mainframe operations:
Online: Workloads include transaction processing, database management, and connections. They are often
implemented using IBM DB2, CICS, and z/OS connectors.
Batch: Jobs run without user interaction, typically on a regular schedule such as every weekday morning.
Batch jobs can be run on systems based on Windows or Linux by using a JCL emulator such as Micro Focus
Enterprise Server or BMC Control-M software.
Job control language (JCL): Specifies the resources needed to process batch jobs. JCL conveys this
information to z/OS through a set of job control statements. Basic JCL contains six types of statements: JOB,
ASSGN, DLBL, EXTENT, LIBDEF, and EXEC. A job can contain several EXEC statements (steps), and each step
could have several LIBDEF, ASSGN, DLBL, and EXTENT statements.
Initial program load (IPL): Refers to loading a copy of the operating system from disk into a processor's
real storage and running it. IPLs are used to recover from downtime. An IPL is like booting the operating
system on Windows or Linux VMs.
Next steps
Myths and facts
Mainframe myths and facts
10/30/2020 • 2 minutes to read • Edit Online
Mainframes figure prominently in the history of computing and remain viable for highly specific workloads. Most
agree that mainframes are a proven platform with long-established operating procedures that make them reliable,
robust environments. Software costs are based on usage, measured in millions of instructions per second (MIPS), and
extensive usage reports are available for chargebacks.
The reliability, availability, and processing power of mainframes have taken on almost mythical proportions. To
evaluate the mainframe workloads that are most suitable for Azure, you first want to distinguish the myths from
the reality.
Summary
By comparison, Azure offers an alternative platform that is capable of delivering equivalent mainframe
functionality and features at a much lower cost. In addition, the total cost of ownership (TCO) of the cloud's
subscription-based, usage-driven cost model is far lower than that of mainframe computers.
Next steps
Make the switch from mainframes to Azure
Make the switch from mainframes to Azure
10/30/2020 • 4 minutes to read • Edit Online
As an alternative platform for running traditional mainframe applications, Azure offers hyperscale compute and
storage in a high availability environment. You get the value and agility of a modern, cloud-based platform without
the costs associated with a mainframe environment.
This section provides technical guidance for making the switch from a mainframe platform to Azure.
NOTE
These estimates are subject to change as new virtual machine (VM) series become available in Azure.
Scalability
Mainframes typically scale up, while cloud environments scale out. Mainframes can scale out with the use of a
coupling facility (CF), but the high cost of hardware and storage makes mainframes expensive to scale out.
A CF also offers tightly coupled compute, whereas the scale-out features of Azure are loosely coupled. The cloud
can scale up or down to match exact user specifications, with compute power, storage, and services scaling on
demand under a usage-based billing model.
Storage
Part of understanding how mainframes work involves decoding various overlapping terms. For example, central
storage, real memory, real storage, and main storage all generally refer to storage attached directly to the
mainframe processor.
Mainframe hardware includes processors and many other devices, such as direct-access storage devices (DASDs),
magnetic tape drives, and several types of user consoles. Tapes and DASDs are used for system functions and by
user programs.
Types of physical storage for mainframes include:
Central storage: Located directly on the mainframe processor, this is also known as processor or real storage.
Auxiliary storage: Located separately from the mainframe, this type includes storage on DASDs and is also
known as paging storage.
The cloud offers a range of flexible, scalable options, and you will pay only for those options that you need. Azure
Storage offers a massively scalable object store for data objects, a file system service for the cloud, a reliable
messaging store, and a NoSQL store. For VMs, managed and unmanaged disks provide persistent, secure disk
storage.
Next steps
Mainframe application migration
Mainframe application migration
10/30/2020 • 10 minutes to read • Edit Online
When migrating applications from mainframe environments to Azure, most teams follow a pragmatic approach:
reuse wherever and whenever possible, and then start a phased deployment where applications are rewritten or
replaced.
Application migration typically involves one or more of the following strategies:
Rehost: You can move existing code, programs, and applications from the mainframe, and then recompile
the code to run in a mainframe emulator hosted in a cloud instance. This approach typically starts with
moving applications to a cloud-based emulator, and then migrating the database to a cloud-based database.
Some engineering and refactoring are required along with data and file conversions.
Alternatively, you can rehost using a traditional hosting provider. One of the principal benefits of the cloud is
outsourcing infrastructure management. You can find a datacenter provider that will host your mainframe
workloads for you. This model may buy time, reduce vendor lock-in, and produce interim cost savings.
Retire: All applications that are no longer needed should be retired before migration.
Rebuild: Some organizations choose to completely rewrite programs using modern techniques. Given the
added cost and complexity of this approach, it's not as common as a lift and shift approach. Often after this
type of migration, it makes sense to begin replacing modules and code using code transformation engines.
Replace: This approach replaces mainframe functionality with equivalent features in the cloud. One option is
software as a service (SaaS): using a solution created specifically for an enterprise concern, such as finance,
human resources, manufacturing, or enterprise resource planning. In addition, many industry-specific apps are
now available to solve problems that custom mainframe solutions previously solved.
Consider starting by planning the workloads that you want to migrate first, and then determining the
requirements for moving the associated applications, legacy code bases, and databases.
On Azure, emulation environments are used to run the TP manager and the batch jobs that use JCL. In the data tier,
DB2 is replaced by Azure SQL Database, although Microsoft SQL Server, DB2 LUW, or Oracle Database can also be
used. An emulator supports IMS, VSAM, and SEQ. The mainframe's system management tools are replaced by
Azure services, and software from other vendors, that run in VMs.
The screen handling and form entry functionality is commonly implemented using web servers, which can be
combined with database APIs, such as ADO, ODBC, and JDBC for data access and transactions. The exact line-up of
Azure IaaS components to use depends on the operating system you prefer. For example:
Windows–based VMs: Internet Information Server (IIS) along with ASP.NET for the screen handling and
business logic. Use ADO.NET for data access and transactions.
Linux–based VMs: The Java-based application servers that are available, such as Apache Tomcat for screen
handling and Java-based business functionality. Use JDBC for data access and transactions.
Migrate batch workloads to Azure
Batch operations in Azure differ from the typical batch environment on mainframes. Mainframe batch jobs are
typically serial in nature and depend on the IOPS provided by the mainframe backbone for performance. Cloud-
based batch environments use parallel computing and high-speed networks for performance.
To optimize batch performance using Azure, consider the compute, storage, networking, and monitoring options as
follows.
Compute
Use:
VMs with the highest clock speed. Mainframe applications are often single-threaded and mainframe CPUs
have a very high clock speed.
VMs with large memory capacity to allow caching of data and application work areas.
VMs with higher density vCPUs to take advantage of multithreaded processing if the application supports
multiple threads.
Parallel processing, because Azure easily scales out to deliver more compute power for a batch run.
Storage
Use:
Azure premium SSD or Azure ultra SSD for maximum available IOPS.
Striping with multiple disks for more IOPS per storage size.
Partitioning for storage to spread IO over multiple Azure storage devices.
Networking
Use Azure Accelerated Networking to minimize latency.
Monitoring
Use monitoring tools such as Azure Monitor, Application Insights, and Azure Monitor Logs to enable administrators
to monitor the performance of batch runs and to help eliminate bottlenecks.
Partner solutions
If you are considering a mainframe migration, the partner ecosystem is available to assist you.
Azure provides a proven, highly available, and scalable infrastructure for systems that currently run on mainframes.
Some workloads can be migrated with relative ease. Other workloads that depend on legacy system software, such
as CICS and IMS, can be rehosted using partner solutions and migrated to Azure over time. Regardless of the
choice you make, Microsoft and our partners are available to assist you in optimizing for Azure while maintaining
mainframe system software functionality.
Learn more
For more information, see the following resources:
Get started with Azure
Deploy IBM DB2 pureScale on Azure
Host Integration Server documentation
Best practices to secure and manage workloads
migrated to Azure
10/30/2020 • 29 minutes to read • Edit Online
As you plan and design for migration, in addition to thinking about the migration itself, you need to consider your
security and management model in Azure after migration. This article describes planning and best practices for
securing your Azure deployment after migrating. It also covers ongoing tasks to keep your deployment running at
an optimal level.
IMPORTANT
The best practices and opinions described in this article are based on the Azure platform and service features available at the
time of writing. Features and capabilities change over time.
Next steps
Review other best practices:
Best practices for networking after migration.
Best practices for cost management after migration.
Azure cloud migration best practices checklist
10/30/2020 • 2 minutes to read • Edit Online
Start with the Azure migration guide in the Cloud Adoption Framework if you're interested in migrating to Azure.
That guide walks you through a set of tools and basic approaches to migrating virtual machines to the cloud.
The following checklists provide Azure cloud migration best practices that go beyond the basic cloud-native tools.
These outline the common areas of complexity that might require the scope of the migration to expand beyond
the Azure migration guide.
Next steps
The following is a good starting point for reviewing Azure migration best practices.
Multiple datacenters
Multiple datacenters
10/30/2020 • 2 minutes to read • Edit Online
Often the scope of a migration involves the transition of multiple datacenters. The following guidance expands the
scope of the Azure migration guide to address multiple datacenters.
Suggested prerequisites
Before starting the migration, you should create epics within the project management tool for each datacenter
that's to be migrated. Each epic represents a datacenter. It's important to understand the business outcomes and
motivations for this migration. Use those motivations to prioritize the list of epics (or datacenters). For instance, if
migration is driven by a desire to exit datacenters before leases must be renewed, then each epic would be
prioritized based on lease renewal date.
Within each epic, the workloads to be assessed and migrated are managed as features. Each asset within that
workload is managed as a user story. The work required to assess, migrate, optimize, promote, secure, and manage
each asset is represented as tasks for each asset.
Sprints or iterations then consist of a series of tasks required to migrate the assets and user stories committed to
by the cloud adoption team. Releases then consist of one or more workloads or features to be promoted to
production.
IMPORTANT
A subject matter expert with an understanding of asset placement and IP address schemas is required to identify assets
that reside in a secondary datacenter.
Evaluate both downstream dependencies and clients in the visualization to understand bidirectional dependencies.
Next steps
Return to the checklist to ensure that your migration method is fully aligned.
Migration best practices checklist
Azure regions decision guide
10/30/2020 • 15 minutes to read • Edit Online
Azure comprises many regions around the world. Each Azure region has specific characteristics that make
choosing which region to use incredibly important. These include available services, capacity, constraints, and
sovereignty:
Available ser vices: Services that are deployed to each region differ, based on various factors. Select a region
for your workload that contains your desired service. For more information, see Products available by region.
Capacity: Each region has a maximum capacity. This can affect which types of subscriptions can deploy which
types of services and under what circumstances. This is different from subscription quotas. If you're planning a
large-scale datacenter migration to Azure, you might want to consult with your local Azure field team or
account manager to confirm that you can deploy at the scale necessary.
Constraints: Certain constraints are placed on the deployment of services in certain regions. For example,
some regions are only available as a backup or failover target. Other constraints that are important to note are
data sovereignty requirements.
Sovereignty: Certain regions are dedicated to specific sovereign entities. While all regions are Azure regions,
these sovereign regions are completely isolated from the rest of Azure. They aren't necessarily managed by
Microsoft and might be restricted to certain types of customers. These sovereign regions are:
Azure China
Azure Germany: Azure Germany is being deprecated in favor of standard nonsovereign Azure regions in
Germany.
Azure US government
Two regions in Australia are managed by Microsoft but are provided for the Australian government and
its customers and contractors. Therefore, these regions carry client constraints similar to the other
sovereign clouds.
Network considerations
Any robust cloud deployment requires a well-considered network that takes into account Azure regions. You
should account for the following:
Azure regions are deployed in pairs. In the event of a catastrophic region failure, another region within the
same geopolitical boundary is designated as its paired region. Consider deploying into paired regions as a
primary and secondary resiliency strategy. One exception to this strategy is Brazil South, which is paired
with South Central US. For more information, see Azure paired regions.
Azure Storage supports geo-redundant storage (GRS). This means that three copies of your data are
stored within your primary region, and three additional copies are stored in the paired region. You can't
change the storage pairing for GRS.
Services that rely on Azure Storage GRS can take advantage of this paired region capability. To do so,
your applications and the network must be oriented to support that.
If you don't plan to use GRS to support your regional resiliency needs, you shouldn't use the paired
region as your secondary. In the event of a regional failure, there will be intense pressure on resources in
the paired region as resources migrate. You can avoid that pressure by recovering to an alternate site
and gaining additional speed during your recovery.
WARNING
Do not attempt to use Azure GRS for VM backups or recovery. Instead, use Azure Backup and Azure Site Recovery,
along with Azure managed disks, to support your infrastructure as a service (IaaS) workload resiliency.
Azure Backup and Azure Site Recovery work in tandem with your network design to facilitate regional
resiliency for your IaaS and data backup needs. Make sure the network is optimized so data transfers
remain on the Microsoft backbone and use virtual network peering, if possible. Some larger organizations
with global deployments might instead use ExpressRoute premium, to route traffic between regions and
potentially save regional egress charges.
Azure resource groups are region-specific. It's normal, however, for resources within a resource group to
span multiple regions. Consider that in the event of a regional failure, control plane operations against a
resource group will fail in the affected region, even though the resources in other regions (within that
resource group) will continue to operate. This can affect both your network design and your resource group
design.
Many platform as a service (PaaS) services within Azure support service endpoints or Azure Private Link.
Both of these solutions affect your network considerations substantially with regard to regional resiliency,
migration, and governance.
Many PaaS services rely on their own regional resiliency solutions. For example, both Azure SQL Database
and Azure Cosmos DB allow you to easily replicate to additional regions. Services such as Azure DNS don't
have regional dependencies. As you consider which services you will use in your adoption process, make
sure to clearly understand the failover capabilities and recovery steps that can be required for each Azure
service.
In addition to deploying to multiple regions to support disaster recovery, many organizations choose to
deploy in an active-active pattern to not rely on failover. This method offers the additional benefits of global
load balancing, additional fault tolerance, and network performance boosts. To take advantage of this
pattern, your applications must support running active-active in multiple regions.
WARNING
Azure regions are highly available constructs, with SLAs applied to the services running in them. But you should never take a
single region dependency on mission-critical applications. Always plan for regional failure, and practice recovery and
mitigation steps.
After considering the network topology, you must next look at additional documentation and process alignment
that might be necessary. The following approach can help assess the potential challenges and establish a general
course of action:
Consider a more robust readiness and governance implementation.
Inventory the affected geographies. Compile a list of the regions and countries that are affected.
Document data sovereignty requirements. Do the countries identified have compliance requirements that
govern data sovereignty?
Document the user base. Will employees, partners, or customers in the identified country be affected by the
cloud migration?
Document datacenters and assets. Are there assets in the identified country that might be included in the
migration effort?
Document regional SKU availability and failover requirements.
Align changes across the migration process to address the initial inventory.
Document complexity
The following table can aid in documenting the findings from the previous steps:
REGION   COUNTRY   LOCAL EMPLOYEES   LOCAL EXTERNAL USERS   LOCAL DATACENTERS OR ASSETS   LOCAL DATA SOVEREIGNTY REQUIREMENTS
IMPORTANT
A subject matter expert with an understanding of asset placement and IP address schemas is required to identify assets
that reside in a secondary datacenter.
Evaluate both downstream dependencies and clients in the visualization to understand bidirectional dependencies.
Identify global user impact: The outputs from the prerequisite user profile analysis should identify any
workload affected by global user profiles. When a migration candidate is in the affected workload list, the architect
preparing for migration should consult networking and operations subject matter experts. They help to validate
network routing and performance expectations. At a minimum, the architecture should include an ExpressRoute
connection between the closest network operations center and Azure. The reference architecture for ExpressRoute
connections can aid in the configuration of the necessary connection.
Design for compliance: The outputs from the prerequisite user profile analysis should identify any workload
affected by data sovereignty requirements. During the architecture activities of the assess process, the assigned
architect should consult compliance subject matter experts. They help to understand any requirements for
migration and deployment across multiple regions. Those requirements significantly affect design strategies. The
reference architectures for multiregion web applications and multiregion n-tier applications can assist design.
WARNING
When you're using either of the reference architectures above, it might be necessary to exclude specific data elements from
replication processes to adhere to data sovereignty requirements. This will add an additional step to the promotion process.
NOTE
This approach can increase short-term migration costs through additional egress bandwidth charges.
In a cloud migration, you replicate and synchronize assets over the network between the existing datacenter and
the cloud. It's not uncommon for the existing data size requirements of various workloads to exceed network
capacity. In such a scenario, the process of migration can be radically slowed, or in some cases, stopped entirely.
The following guidance expands the scope of the Azure migration guide to provide a solution that works around
network limitations.
Suggested prerequisites
Validate network capacity risks: Digital estate rationalization is a highly recommended prerequisite, especially
if there are concerns of overburdening the available network capacity. During digital estate rationalization, you
collect an inventory of digital assets. That inventory should include existing storage requirements across the digital
estate.
As outlined in Replication risks: physics of replication, you can use that inventory to estimate total migration data
size, which can be compared to total available migration bandwidth. If that comparison doesn't align with the
required time to business change, then this article can help accelerate migration velocity by reducing the time
required to migrate the datacenter.
Offline transfer of independent data stores: The following diagram shows examples of both online and
offline data transfers with Azure Data Box. You can use these approaches to ship large volumes of data to the cloud,
prior to workload migration. In an offline data transfer, you copy source data to Azure Data Box, which is then
physically shipped to Microsoft for transfer into an Azure Storage account as a file or a blob. Prior to other
migration efforts, you can use this process to ship data that isn't directly tied to a specific workload. Doing this
reduces the amount of data that needs to be shipped over the network and supports completing a migration
within network constraints.
You can use this approach to transfer data from HDFS, backups, archives, file servers, and applications. Existing
technical guidance explains how to use this approach to transfer data from an HDFS store or from disks by using
SMB, NFS, REST, or the data copy service to Data Box.
There are also third-party partner solutions that use Azure Data Box for a migration. With these solutions, you
move a large volume of data via an offline transfer, but you synchronize it later at a lower scale over the network.
Assess process changes
If the storage requirements of a workload (or workloads) exceed network capacity, then you can still use Azure
Data Box in an offline data transfer.
Network transmission is the recommended approach unless the network is unavailable. The speed of transferring
data over the network, even when bandwidth is constrained, is typically faster than physically shipping the data by
using an offline transfer mechanism.
If connectivity to Azure is available, you should conduct an analysis before using Data Box, especially if migration of
the workload is time sensitive. Data Box is only advisable when the time to transfer the necessary data exceeds the
time to populate, ship, and restore it.
Suggested action during the assess process
Network capacity analysis: When workload-related data transfer requirements are at risk of exceeding network
capacity, the cloud adoption team adds an additional analysis task to the assess process called network capacity
analysis. During this analysis, a member of the team estimates the amount of available network capacity and
required data transfer time. Note that this team member should have subject matter expertise regarding the local
network and network connectivity.
Available capacity is compared to the storage requirements of all assets to be migrated during the current release.
If the storage requirements exceed the available bandwidth, then assets supporting the workload are selected for
offline transfer.
IMPORTANT
At the conclusion of the analysis, you might need to update the release plan to reflect the time required to ship, restore, and
synchronize the assets to be transferred offline.
Drift analysis: Analyze each asset to be transferred offline for storage and configuration drift. Storage drift is the
amount of change in the underlying storage over time. Configuration drift is change in the configuration of the
asset over time. From the time the storage is copied to the time the asset is promoted to production, any drift
might be lost. If that drift needs to be reflected in the migrated asset, you'll need to synchronize the local asset and
the migrated asset. Flag this for consideration during migration execution.
Migration process changes
When you're using offline transfer mechanisms, replication processes aren't typically required, whereas
synchronization processes might still be necessary. If an asset is being transferred offline, understanding the drift
analysis results from the assess process will inform the tasks required during migration.
Suggested action during the migration process
Copy storage: You can use this approach to transfer data from HDFS, backups, archives, file servers, or applications.
Existing technical guidance explains how to use this approach to transfer data from an HDFS store or from disks by
using SMB, NFS, REST, or the data copy service to Data Box.
There are also third-party partner solutions that use Azure Data Box for a migration. With these solutions, you
move a large volume of data via an offline transfer, but you synchronize it later at a lower scale over the network.
Ship the device: After you copy the data, you can ship the device to Microsoft. After the data is received and
imported, it's available in an Azure Storage account.
Restore the asset: Verify that the data is available in the storage account. If so, you can use the data as a blob or
in Azure Files. If the data is a VHD/VHDX file, you can convert the file to managed disks. Those managed disks can
then be used to instantiate a virtual machine, which creates a replica of the original on-premises asset.
Synchronization: If synchronization of drift is a requirement for a migrated asset, you can use one of the third-
party partner solutions to synchronize the files until the asset is restored.
Next steps
Return to the checklist to ensure that your migration method is fully aligned.
Migration best practices checklist
Best practices to set up networking for workloads
migrated to Azure
10/30/2020 • 28 minutes to read • Edit Online
As you plan and design for migration, in addition to the migration itself, one of the most critical steps is the design
and implementation of Azure networking. This article describes best practices for networking when you're
migrating to infrastructure as a service (IaaS) and platform as a service (PaaS) implementations in Azure.
IMPORTANT
The best practices and opinions described in this article are based on the Azure platform and service features available at the
time of writing. Features and capabilities change over time. Not all recommendations might be applicable for your
deployment, so select those that work for you.
Learn more:
Learn about designing subnets.
Learn how a fictional company (Contoso) prepared their networking infrastructure for migration.
Figure 9: Application security group example.
Network interface    Application security group
NIC1                 AsgWeb
NIC2                 AsgWeb
NIC3                 AsgLogic
NIC4                 AsgDb
In our example, each network interface belongs to only one application security group, but in fact an interface can
belong to multiple groups, in accordance with Azure limits. None of the network interfaces have an associated
NSG. NSG1 is associated with both subnets, and contains the following rules:
Destination port: 80
Protocol: TCP
Access: Allow
Destination: AsgDb
Protocol: All
Access: Deny
Protocol: TCP
Access: Allow
The rules that specify an application security group as the source or destination are only applied to the network
interfaces that are members of the application security group. If the network interface isn't a member of an
application security group, the rule is not applied to the network interface, even though the network security group
is associated with the subnet.
Learn more:
Learn about application security groups.
Best practice: Secure access to PaaS by using virtual network service endpoints
Virtual network service endpoints extend your virtual network private address space and identity to Azure services
over a direct connection.
Endpoints allow you to secure critical Azure service resources to your virtual networks only. Traffic from your
virtual network to the Azure service always remains on the Azure backbone network.
Virtual network private address space can be overlapping, and thus can't be used to uniquely identify traffic
originating from a virtual network.
After you enable service endpoints in your virtual network, you can secure Azure service resources by adding a
virtual network rule to the service resources. This provides improved security by fully removing public internet
access to resources, and allowing traffic only from your virtual network.
Azure Firewall Like NVA firewall farms, Azure Firewall uses a common
administration mechanism and a set of security rules to
protect workloads hosted in spoke networks. Azure Firewall
also helps control access to on-premises networks. Azure
Firewall has built-in scalability.
NVA firewalls Like Azure Firewall, NVA firewall farms have a common
administration mechanism and a set of security rules to
protect workloads hosted in spoke networks. NVA firewalls
also help control access to on-premises networks. NVA
firewalls can be manually scaled behind a load balancer.
We recommend using one set of Azure firewalls (or NVAs) for traffic originating on the internet, and another for
traffic originating on-premises. Using only one set of firewalls for both is a security risk, as it provides no security
perimeter between the two sets of network traffic. Using separate firewall layers reduces the complexity of
checking security rules, and it's clear which rules correspond to which incoming network request.
Learn more:
Learn about using NVAs in an Azure Virtual Network.
Next steps
Review other best practices:
Best practices for security and management after migration.
Best practices for cost management after migration.
Deploy a migration infrastructure
10/30/2020 • 38 minutes to read • Edit Online
This article shows how the fictional company Contoso prepares its on-premises infrastructure for migration,
sets up an Azure infrastructure in preparation for migration, and runs the business in a hybrid environment.
When you use this example to help plan your own infrastructure migration efforts, keep in mind that the
provided sample architecture is specific to Contoso. Review your organization's business needs, structure,
and technical requirements when making important infrastructure decisions about subscription design or
network architecture.
Whether you need all the elements described in this article depends on your migration strategy. For
example, you might need a less complex network structure if you're building only cloud-native applications
in Azure.
Overview
Before Contoso can migrate to Azure, it's critical to prepare an Azure infrastructure. Generally, Contoso
needs to think about six areas:
Step 1: Azure subscriptions. How will it purchase Azure and interact with the Azure platform and
services?
Step 2: Hybrid identity. How will it manage and control access to on-premises and Azure resources
after migration? How does it extend or move identity management to the cloud?
Step 3: Disaster recover y and resilience. How will it ensure that its applications and infrastructure
are resilient if outages and disasters occur?
Step 4: Network . How should it design a network infrastructure and establish connectivity between its
on-premises datacenter and Azure?
Step 5: Security. How will it secure the hybrid deployment?
Step 6: Governance. How will it keep the deployment aligned with security and governance
requirements?
On-premises architecture
Here's a diagram that shows the current Contoso on-premises infrastructure.
Figure 1: Contoso on-premises architecture.
Contoso has one main datacenter located in New York City in the eastern United States.
There are three additional local branches across the United States.
The main datacenter is connected to the internet with a fiber-optic Metro Ethernet connection (500
Mbps).
Each branch is connected locally to the internet through business-class connections, with IPsec VPN
tunnels back to the main datacenter. This approach allows the entire network to be permanently
connected and optimizes internet connectivity.
The main datacenter is fully virtualized with VMware. Contoso has two ESXi 6.5 virtualization hosts
managed by vCenter Server 6.5.
Contoso uses Active Directory for identity management and domain name system (DNS) servers on the
internal network.
The domain controllers in the datacenter run on VMware virtual machines (VMs). The domain controllers
at local branches run on physical servers.
NOTE
The directory that's created has an initial domain name in the form domain-name.onmicrosoft.com . The name can't
be changed or deleted. Instead, the admins need to add its registered domain name to Azure AD.
In the future, Contoso will add other resource groups based on its needs. For example, it might define a resource
group for each application or service so that each can be managed and secured independently.
Create matching security groups on-premises
In the on-premises Active Directory instance, Contoso admins set up security groups with names that match
the names of the Azure resource groups.
As it thinks about the hybrid environment, Contoso needs to consider how to build resilience and a disaster
recovery strategy into the region design. The simplest strategy is a single-region deployment, which relies
on Azure platform features such as fault domains and regional pairing for resilience. The most complex is a
full active-active model in which cloud services and databases are deployed in, and serve users from, two regions.
Contoso has decided to take a middle road. It will deploy applications and resources in a primary region and
keep a full copy of the infrastructure in the secondary region. With that strategy, the copy is ready to act as a
full backup if a complete application disaster or regional failure occurs.
Set up availability
Availability sets
Availability sets help protect applications and data from a local hardware and network outage within a
datacenter. Availability sets distribute Azure VMs across physical hardware within a datacenter.
Fault domains represent underlying hardware with a common power source and network switch within the
datacenter. VMs in an availability set are distributed across fault domains to minimize outages caused by a
single hardware or network failure.
Update domains represent underlying hardware that can undergo maintenance or be rebooted at the same
time. Availability sets also distribute VMs across multiple update domains to ensure that at least one
instance will be running at all times.
Contoso will implement availability sets whenever VM workloads require high availability. For more
information, see Manage the availability of Windows VMs in Azure.
Availability Zones
Availability Zones help protect applications and data from failures that affect an entire datacenter within a
region.
Each Availability Zone represents a unique physical location within an Azure region. Each zone consists of
one or more datacenters equipped with independent power, cooling, and networking.
There's a minimum of three separate zones in all enabled regions. The physical separation of zones within a
region protects applications and data from datacenter failures.
Contoso will use Availability Zones whenever applications need greater scalability, availability, and
resilience. For more information, see Regions and Availability Zones in Azure.
Configure backup
Azure Backup
You can use Azure Backup to back up and restore Azure VM disks.
Azure Backup allows automated backups of VM disk images stored in Azure Storage. Backups are
application consistent to ensure that backed-up data is transactionally consistent and that applications will
start post-restore.
Azure Backup supports locally redundant storage (LRS) to replicate multiple copies of backup data within a
datacenter if a local hardware failure occurs. If a regional outage occurs, Azure Backup also supports geo-
redundant storage (GRS), which replicates backup data to a secondary paired region.
Azure Backup encrypts data in transit by using AES-256. Backed-up data at rest is encrypted through Azure
Storage encryption.
Contoso will use Azure Backup with GRS on all production VMs to ensure that workload data is backed up
and can be quickly restored if a disruption occurs. For more information, see An overview of Azure VM
backup.
Set up disaster recovery
Azure Site Recovery
Azure Site Recovery helps ensure business continuity by keeping business applications and workloads
running during regional outages.
Azure Site Recovery continuously replicates Azure VMs from a primary to a secondary region, ensuring
functional copies in both locations. In the event of an outage in the primary region, the application or
service fails over to the replicated VM instances in the secondary region. This failover minimizes
potential disruption. When operations return to normal, the applications or services can fail back to VMs in
the primary region.
Contoso will implement Azure Site Recovery for all production VMs used in mission-critical workloads,
ensuring minimal disruption during an outage in the primary region.
East US 2 is the primary region that Contoso will use to deploy resources and services. Here's how
Contoso will design networks in that region:
Hub: The hub virtual network in East US 2 is considered Contoso's primary connectivity to the on-
premises datacenter.
Virtual networks: The spoke virtual networks in East US 2 can be used to isolate workloads if
necessary. In addition to the hub virtual network, Contoso will have two spoke virtual networks in
East US 2 :
VNET-DEV-EUS2 . This virtual network will provide the dev/test team with a fully functional
network for dev projects. It will act as a production pilot area, and will rely on the production
infrastructure to function.
VNET-PROD-EUS2 . Azure IaaS production components will be located in this network. Each
virtual network will have its own unique address space without overlap. Contoso intends to
configure routing without requiring network address translation (NAT).
Subnets: There will be a subnet in each network for each application tier. Each subnet in the
production network will have a matching subnet in the development virtual network. The production
network has a subnet for domain controllers.
The following table summarizes virtual networks in East US 2 .
VIRTUAL NETWORK    RANGE    PEER
For the domain controllers in the VNET-PROD-EUS2 network, Contoso wants traffic to flow both between the
EUS2 hub/production network and over the VPN connection to on-premises. To do this, Contoso admins
must allow the following:
1. Allow forwarded traffic and Allow gateway transit configurations on the peered connection.
In our example, this would be the connection from VNET-HUB-EUS2 to VNET-PROD-EUS2 .
Figure 23: A peered connection.
2. Allow forwarded traffic and Use remote gateways on the other side of the peering, on the
connection from VNET-PROD-EUS2 to VNET-HUB-EUS2 .
A spoke network peered to a hub can't see a spoke network peered to a hub in another region. For Contoso's
production networks in both regions to see each other, Contoso admins need to create a direct peering
connection between VNET-PROD-EUS2 and VNET-PROD-CUS .
After the virtual networks are deployed, the on-premises domain controllers are configured as DNS
servers in the networks.
If an optional custom DNS is specified for the virtual network, the virtual IP address 168.63.129.16
for the recursive resolvers in Azure must be added to the list. To do this, Contoso configures DNS
server settings on each virtual network. For example, the custom DNS settings for the VNET-HUB-EUS2
network would be as follows:
Figure 27: A custom DNS.
In addition to the on-premises domain controllers, Contoso will implement four domain controllers to
support the Azure networks (two for each region):
After deploying the domain controllers in Azure, Contoso needs to update the DNS settings on networks in
either region to include the new domain controllers in the DNS server list.
Set up domain controllers in Azure
After updating network settings, Contoso admins are ready to build out the domain controllers in Azure.
1. In the Azure portal, they deploy a new Windows Server VM to the appropriate virtual network.
2. They create availability sets in each location for the VM. Availability sets ensure that the Azure fabric
separates the VMs into different infrastructures in the Azure region. Availability sets also allow
Contoso to be eligible for the 99.95 percent service-level agreement (SLA) for VMs in Azure.
NOTE
The disk shouldn't be set to read/write for host caching. Active Directory databases don't support this.
2. After creating the sites, they create subnets in those sites to match the virtual networks and the datacenter.
TAG NAME    VALUE
ApplicationTeam Email alias of the team that owns support for the
application.
ServiceManager Email alias of the ITIL Service Manager for the resource.
For example:
Figure 42: Azure tags.
After creating the tags, Contoso will go back and create new policy definitions and assignments to enforce
the use of the required tags across the organization.
Encrypt data
Azure Disk Encryption integrates with Azure Key Vault to help control and manage the disk-encryption keys
and secrets for a subscription. It ensures that all data on VM disks is encrypted at rest in Azure Storage.
Contoso has determined that specific VMs require encryption. Contoso will apply encryption to VMs with
customer, confidential, or personal data.
Conclusion
In this article, Contoso set up an Azure infrastructure and policies for Azure subscriptions, hybrid identity,
disaster recovery, networking, governance, and security.
Not every step taken here is required for a cloud migration. In this case, Contoso planned a network
infrastructure that can handle all types of migrations while being secure, resilient, and scalable.
Next steps
After setting up its Azure infrastructure, Contoso is ready to begin migrating workloads to the cloud. See the
migration patterns and examples overview for a selection of scenarios that use this sample infrastructure as
a migration target.
Best practices for costing and sizing workloads migrated to
Azure
10/30/2020 • 17 minutes to read • Edit Online
As you plan and design for migration, focusing on costs ensures the long-term success of your Azure migration.
During a migration project, it's critical that all teams (such as finance, management, and application development
teams) understand associated costs.
Before migration, it's important to have a baseline for monthly, quarterly, and yearly budget targets in order to
estimate the amount you'd spend on your migration and ensure its success.
After migration, you should optimize costs, continually monitor workloads, and plan for future usage patterns.
Migrated resources might start out as one type of workload, but shift to another type over time, based on
usage, costs, and shifting business requirements.
This article describes best practices for preparing for and managing cost and size, both before and after migration.
IMPORTANT
The best practices and opinions described in this article are based on Azure platform and service features available at the
time of writing. Features and capabilities change over time. Not all recommendations might be applicable for your
deployment, so select what works for you.
Before migration
Before you move your workloads to the cloud, estimate the monthly cost of running them in Azure. Proactively
managing cloud costs helps you adhere to your operating expense budget. If budget is limited, take this into
account before migration. Consider converting workloads to Azure serverless technologies, where appropriate, to
reduce costs.
The best practices in this section help you:
Estimate costs.
Perform right-sizing for virtual machines (VMs) and storage.
Use Azure Hybrid Benefit.
Use Azure Reserved Virtual Machine Instances.
Estimate cloud spending across subscriptions.
Storage optimized: High disk throughput and I/O. Good for big data, and SQL and NoSQL databases.
GPU optimized: Specialized VMs with single or multiple GPUs. Use for heavy graphics and video editing.
High performance: Fastest and most powerful CPUs, with optional high-throughput network interfaces (RDMA).
Use for critical high-performance applications.
It's important to understand the pricing differences between these VMs, and the long-term budget effects.
Each type has several VM series within it.
Additionally, when you select a VM within a series, you can only scale the VM up and down within that series.
For example, a DS2_v2 instance can scale up to DS4_v2 , but it can't be changed to an instance of a different
series, such as an F2s_v2 instance.
Learn more:
Learn more about VM types and sizing, and map sizes to types.
Plan sizes for VM instances.
Review a sample assessment for the fictional Contoso company.
Blobs: Optimized to store massive amounts of unstructured objects, such as text or binary data. Access data from
everywhere over HTTP/HTTPS. Use for streaming and random access scenarios, such as serving images and
documents directly to a browser, streaming video and audio, and storing backup and disaster recovery data.
Files: Managed file shares accessed over SMB 3.0. Use when migrating on-premises file shares, and to provide
multiple access points or connections to file data.
Disks: Based on page blobs. Disk types are standard (HDD or SSD) or premium (SSD). Use premium disks for VMs.
Use managed disks for simple management and scaling.
Queues: Store and retrieve large numbers of messages accessed via authenticated calls (HTTP or HTTPS). Connect
application components with asynchronous message queueing.
Access tiers
Azure Storage provides different options for accessing block blob data. Selecting the right access tier helps ensure
that you store block blob data in the most cost-effective manner.
Hot: Higher storage cost than cool, but lower access charges. Use for data in active use that's accessed frequently.
Cool: Lower storage cost than hot, but higher access charges. Use for short-term storage of data that's available but
accessed infrequently.
Archive: Used for individual block blobs. The most cost-effective option for storage, but data access is more
expensive than hot and cool. Use for data that can tolerate several hours of retrieval latency and will remain in the
tier for at least 180 days.
General-purpose v2 standard: Supports blobs (block, page, and append), files, disks, queues, and tables. Supports
hot, cool, and archive access tiers. Zone-redundant storage (ZRS) is supported. Use for most scenarios and most
types of data. Standard storage accounts can be HDD- or SSD-based.
General-purpose v2 premium: Supports Blob storage data (page blobs). Supports hot, cool, and archive access tiers.
ZRS is supported. Stored on SSD. Microsoft recommends using it for all VMs.
General-purpose v1: Access tiering isn't supported. Doesn't support ZRS. Use if applications need the Azure classic
deployment model.
Blob: Specialized storage account for storing unstructured objects. Provides block blobs and append blobs only (no
file, queue, table, or disk storage services). Provides the same durability, availability, scalability, and performance as
general-purpose v2. You can't store page blobs in these accounts, and therefore can't store VHD files. You can set an
access tier to hot or cool.
Locally redundant storage (LRS): Protects against a local outage by replicating within a single storage unit to a
separate fault domain and update domain. Keeps multiple copies of your data in one datacenter. Provides at least
99.999999999 percent (eleven nines) durability of objects over a particular year. Consider whether your application
stores data that can be easily reconstructed.
Zone-redundant storage (ZRS): Protects against a datacenter outage by replicating across three storage clusters in
a single region. Each storage cluster is physically separated and located in its own Availability Zone. Provides at
least 99.9999999999 percent (twelve nines) durability of objects over a particular year by keeping multiple copies
of your data across multiple datacenters or regions. Consider whether you need consistency, durability, and high
availability. Might not protect against a regional disaster when multiple zones are permanently affected.
Geo-redundant storage (GRS): Protects against an entire region outage by replicating data to a secondary region
hundreds of miles away from the primary. Provides at least 99.99999999999999 percent (sixteen nines) durability
of objects over a particular year. Replica data isn't available unless Microsoft initiates a failover to the secondary
region. If failover occurs, read and write access is available.
Read-access geo-redundant storage (RA-GRS): Similar to GRS. Provides at least 99.99999999999999 percent
(sixteen nines) durability of objects over a particular year. Provides 99.99 percent read availability by allowing read
access from the secondary region used for GRS.
Learn more:
Review Azure Storage pricing.
Learn about Azure Import/Export.
Compare blobs, files, and disk storage data types.
Learn more about access tiers.
Review different types of storage accounts.
Learn about Azure Storage redundancy, including LRS, ZRS, GRS, and read-access GRS.
Learn more about Azure Files.
After migration
After a successful migration of your workloads and a few weeks of collecting consumption data, you'll have a clear
idea of resource costs. As you analyze the data, you can start to generate a budget baseline for Azure resource
groups and resources. Then, as you understand where your cloud budget is being spent, you can analyze how to
further reduce your costs.
NOTE
In addition to VM monitoring, you should monitor other networking resources, such as Azure ExpressRoute and virtual
network gateways, for underuse and overuse.
Learn more:
Read overviews of Azure Monitor and Azure Advisor.
Get Azure Advisor cost recommendations.
Learn how to optimize costs from recommendations, and prevent unexpected charges.
Learn about the Azure resource optimization (ARO) toolkit.
Best practices: Use Azure Logic Apps and runbooks with Budgets API
Azure provides a REST API that has access to your tenant billing information. You can use the Budgets API to
integrate external systems and workflows that are triggered by metrics that you build from the API data. You can
pull usage and resource data into your preferred data analysis tools. You can integrate the Budgets API with Azure
Logic Apps and runbooks.
The Azure Resource Usage and RateCard APIs can help you accurately predict and manage your costs. The APIs
are implemented as a resource provider and are included in the APIs exposed by the Azure Resource Manager.
Learn more:
Review the Azure Budgets API.
Get insights into usage with the Azure Billing APIs.
Next steps
Review other best practices:
Best practices for security and management after migration.
Best practices for networking after migration.
Scale a migration to Azure
10/30/2020 • 20 minutes to read • Edit Online
This article demonstrates how the fictional company Contoso performs a migration at scale to Azure. The company
considers how to plan and perform a migration of more than 3,000 workloads, 8,000 databases, and 10,000 virtual
machines (VMs).
Business drivers
The IT leadership team has worked closely with business partners to understand what they want to achieve with
this migration:
Address business growth. Contoso is growing, causing pressure on on-premises systems and infrastructure.
Increase efficiency. Contoso needs to remove unnecessary procedures, and streamline processes for
developers and users. The business needs IT to be fast and not waste time or money, thus delivering faster on
customer requirements.
Increase agility. Contoso IT needs to be more responsive to the needs of the business. It must be able to react
faster than the changes in the marketplace, to enable success in a global economy. It mustn't get in the way or
become a business blocker.
Scale. As the business grows successfully, the Contoso IT team must provide systems that are able to grow at
the same pace.
Improve cost models. Contoso wants to lessen capital requirements in the IT budget. Contoso wants to use
cloud abilities to scale and reduce the need for expensive hardware.
Lower licensing costs. Contoso wants to minimize cloud costs.
Migration goals
The Contoso cloud team has pinned down goals for this migration. It used these goals to determine the best
migration method.
Move to Azure quickly: Contoso wants to start moving applications and VMs to Azure as quickly as possible.
Assess and classify applications: Contoso wants to take full advantage of the cloud. As a default, Contoso assumes
that all services will run as platform as a service (PaaS). Infrastructure as a service (IaaS) will be used where PaaS
isn't appropriate.
Train and move to DevOps: Contoso wants to move to a DevOps model. Contoso will provide Azure and DevOps
training and will reorganize teams as necessary.
After establishing goals and requirements, Contoso reviews the IT footprint and identifies the migration process.
Current deployment
Contoso has planned and set up an Azure infrastructure and tried out different proof-of-concept (POC) migration
combinations as detailed in the preceding table. It's now ready to embark on a full migration to Azure at scale.
Here's what Contoso wants to migrate.
VMs: > 35,000 VMs. The VMs run on VMware hosts and are managed by vCenter servers.
Migration process
Now that Contoso has established business drivers and migration goals, it can align to the Migrate methodology. It
can build on the application of migration waves and migration sprints to iteratively plan and execute migration
efforts.
Plan
Contoso kicks off the planning process by discovering and assessing on-premises applications, data, and
infrastructure. Here's what Contoso will do:
Contoso needs to discover applications, map dependencies across applications, and decide on migration order
and priority.
As Contoso assesses, it will build out a comprehensive inventory of applications and resources. Along with the
new inventory, Contoso will use and update these existing items:
The configuration management database (CMDB). It holds technical configurations for Contoso
applications.
The service catalog. It documents the operational details of applications, including associated business
partners and service-level agreements.
Discover applications
Contoso runs thousands of applications across a range of servers. In addition to the CMDB and service catalog,
Contoso needs discovery and assessment tools.
The tools must provide a mechanism that can feed assessment data into the migration process. Assessment tools
must provide data that helps build up an intelligent inventory of Contoso's physical and virtual resources. Data
should include profile information and performance metrics.
When discovery is complete, Contoso should have a full inventory of assets and the metadata associated with
them. The company will use this inventory to define the migration plan.
Identify classifications
Contoso identifies some common categories to classify assets in the inventory. These classifications are critical to
Contoso's decision making for migration. The classification list helps to establish migration priorities and identify
complex issues.
Business group: List of business group names. Which group is responsible for the inventory item?
Migration risk: 1-5. What's the risk level for migrating the application? Contoso DevOps and relevant partners
should agree on this value.
Determine costs
To determine costs and the potential savings of Azure migration, Contoso can use the total cost of ownership (TCO)
calculator to calculate and compare the TCO for Azure to a comparable on-premises deployment.
Identify assessment tools
Contoso decides which tool to use for discovery, assessment, and building the inventory. Contoso identifies a mix
of Azure tools and services, native application tools and scripts, and partner tools. In particular, Contoso is
interested in how Azure Migrate can be used to assess at scale.
Azure Migrate
The Azure Migrate service helps you to discover and assess on-premises VMware VMs, in preparation for
migration to Azure. Here's what Azure Migrate does:
1. Discover : Discover on-premises VMware VMs.
Azure Migrate supports discovery from multiple vCenter servers (serially) and can run discoveries in
separate Azure Migrate projects.
Azure Migrate performs discovery via a VMware VM running the Azure Migrate Collector. The same
collector can discover VMs on different vCenter servers and send data to different projects.
2. Assess readiness: Assess whether on-premises machines are suitable for running in Azure. An assessment
includes:
Size recommendations: Get size recommendations for Azure VMs, based on the performance history
of on-premises VMs.
Estimated monthly costs: Get estimated costs for running on-premises machines in Azure.
3. Identify dependencies: Visualize dependencies of on-premises machines to create optimal machine
groups for assessment and migration.
Contoso needs to use Azure Migrate correctly, given the scale of this migration:
Contoso does an app-by-app assessment with Azure Migrate. This assessment ensures that Azure Migrate
returns timely data to the Azure portal.
Contoso admins learn how to deploy Azure Migrate at scale.
Contoso notes the Azure Migrate limits summarized in the following table.
ACTION    LIMIT
Phase 2: Migrate
With the assessment complete, Contoso needs to identify tools to move its applications, data, and infrastructure to
Azure.
Migration strategies
Contoso can consider four broad migration strategies.
Refactor: Also called repackaging, this strategy requires minimal application code or configuration changes to
connect the application to Azure PaaS and take better advantage of cloud capabilities. Contoso can refactor
strategic applications to retain the same basic functionality, but move them to run on an Azure platform such as
Azure App Service. This requires minimum code changes. On the other hand, Contoso will have to maintain a VM
platform because Microsoft won't manage this.
Data must also be considered, especially with the volume of databases that Contoso has. Contoso's default
approach is to use PaaS services such as Azure SQL Database to take full advantage of cloud features. By moving
to a PaaS service for databases, Contoso will only have to maintain data. It will leave the underlying platform to
Microsoft.
Evaluate migration tools
Contoso is primarily using these Azure services and tools for the migration:
Azure Migrate: Service for migrating on-premises virtual machines and other resources to Azure.
Azure Database Migration Service: Migrates on-premises databases such as SQL Server, MySQL, and Oracle to
Azure.
Azure Migrate
Azure Migrate is the primary Azure service for orchestrating migration from within Azure and from on-premises
sites to Azure.
Azure Migrate orchestrates replication from on-premises locations to Azure. When replication is set up and
running, on-premises machines can be failed over to Azure, completing the migration.
Contoso already completed a POC to see how Azure Migrate can help it to migrate to the cloud.
Use Azure Migrate at scale
Contoso plans to perform multiple lift-and-shift migrations. To ensure that this works, Azure Migrate will replicate
batches of around 100 VMs at a time. To determine how this will work, Contoso must perform capacity planning
for the proposed migration.
Contoso needs to gather information about their traffic volumes. In particular:
It needs to determine the rate of change for VMs that it wants to replicate.
It needs to take network connectivity from the on-premises site to Azure into account.
In response to capacity and volume requirements, Contoso will need to allocate sufficient bandwidth based on the
daily data change rate for the required VMs, to meet its recovery point objective (RPO). Last, it must determine
how many servers are needed to run the Azure Migrate components for the deployment.
Gather on-premises information
In addition to the VMs being replicated, Site Recovery requires several components for VMware migration.
COMPONENT    DETAILS
Contoso needs to figure out how to deploy these components, based on capacity considerations.
Maximum daily change rate: A single process server can handle a daily change rate of up to 2 terabytes (TB).
Because a VM can use only one process server, the maximum daily data change rate that's supported for a
replicated VM is 2 TB.
Azure Storage: For migration, Contoso must identify the right type and number of target Azure Storage accounts.
Site Recovery replicates VM data to Azure Storage and can replicate to standard or premium SSD storage accounts.
To decide about storage, Contoso must review storage limits and consider expected growth and future increased
usage. Given the speed and priority of migrations, Contoso has decided to use premium SSDs. Contoso has also
decided to use managed disks for all VMs deployed to Azure. The IOPS required will help determine whether the
disks will be standard HDD, standard SSD, or premium SSD.
Contoso will use Database Migration Service when migrating from SQL Server.
When provisioning Database Migration Service, Contoso needs to size it correctly and set it to optimize
performance for data migrations. Contoso will select the Business Critical tier with four vCores. This option allows
the service to take advantage of multiple vCPUs for parallelization and faster data transfer.
Another scaling tactic that Contoso can use is to temporarily scale up the Azure SQL Database or Azure Database
for MySQL target instance to the Premium pricing tier during data migration. This minimizes database throttling
that might affect data transfer activities when an organization is using lower tiers.
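As a minimal, hypothetical sketch of this scale-up tactic (the database name and service objectives are placeholders,
not values from the Contoso scenario), the target Azure SQL Database can be scaled with T-SQL before the bulk
migration starts and scaled back afterward:

```sql
-- Run against the logical server that hosts the target Azure SQL Database.
-- Scale up before the bulk data migration starts:
ALTER DATABASE [ContosoTargetDb] MODIFY (SERVICE_OBJECTIVE = 'P2');

-- Scale back down after the migration completes:
ALTER DATABASE [ContosoTargetDb] MODIFY (SERVICE_OBJECTIVE = 'S3');
```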
Use other tools
In addition to Database Migration Service, Contoso can use other tools and services to identify VM information:
Scripts to help with manual migrations. These are available in the GitHub repo.
Various partner tools for migration.
Conclusion
In this article, Contoso planned for an Azure migration at scale. It divided the migration process into four stages.
The stages ran from assessment and migration, through to optimization, security, and management after migration
was complete.
It's important for an organization to plan a migration project as a whole process and to migrate its systems by
breaking down sets into classifications and numbers that make sense for the business. By assessing data and
applying classifications, projects can be broken down into a series of smaller migrations, which can run safely and
rapidly. The sum of these smaller migrations quickly turns into a large successful migration to Azure.
Data definition languages for schema migration
10/30/2020 • 23 minutes to read • Edit Online
This article describes design considerations and performance options for data definition languages (DDLs) when
you're migrating schemas to Azure Synapse Analytics.
Design considerations
Preparation for migration
When you're preparing to migrate existing data to Azure Synapse Analytics, it's important to clearly define the
scope of the exercise (especially for an initial migration project). The time spent up front to understand how
database objects and related processes will migrate can reduce both effort and risk later in the project.
Create an inventory of database objects to be migrated. Depending on the source platform, this inventory will
include some or all of the following objects:
Tables
Views
Indexes
Functions
Stored procedures
Data distribution and partitioning
The basic information for these objects should include metrics such as row counts, physical size, data compression
ratios, and object dependencies. This information should be available via queries against system catalog tables in
the source system. The system metadata is the best source for this information. External documentation might be
stale and not in sync with changes that have been applied to the data structure since the initial implementation.
You might also be able to analyze actual object usage from query logs or use tooling from Microsoft partners, such
as Attunity Visibility, to help. It's possible that some tables don't need to be migrated because they're no longer
used in production queries.
Data size and workload information is important for Azure Synapse Analytics because it helps to define
appropriate configurations. One example is the required levels of concurrency. Understanding the expected growth
of data and workloads might affect a recommended target configuration, and it's a good practice to also harness
this information.
When you're using data volumes to estimate the storage required for the new target platform, it's important to
understand the data compression ratio, if any, on the source database. Simply taking the amount of storage used
on the source system is likely to be a false basis for sizing. Monitoring and metadata information can help you
determine uncompressed raw data size and overheads for indexing, data replication, logging, or other processes in
the current system.
The uncompressed raw data size of the tables to be migrated is a good starting point for estimating the storage
required in the new target Azure Synapse Analytics environment.
The new target platform will also include a compression factor and indexing overhead, but these will probably be
different from the source system. Azure Synapse Analytics storage pricing also includes seven days of snapshot
backups. When compared to the existing environment, this can have an impact on the overall cost of storage
required.
You can delay performance tuning for the data model until late in the migration process and time this with when
real data volumes are in the data warehouse. However, we recommend that you implement some performance
tuning options earlier on.
For example, in Azure Synapse Analytics, it makes sense to define small dimension tables as replicated tables and
to define large fact tables as clustered columnstore indexes. Similarly, indexes defined in the source environment
provide a good indication of which columns might benefit from indexing in the new environment. Using this
information when you're initially defining the tables before loading will save time later in the process.
It's good practice to measure the compression ratio and index overhead for your own data in Azure Synapse
Analytics as the migration project progresses. This measure enables future capacity planning.
It might be possible to simplify your existing data warehouse before migration by reducing complexity to ease
migration. This effort might include:
Removing or archiving unused tables before migrating to avoid migrating data that's not used. Archiving to
Azure Blob storage and defining the data as an external table might keep the data available for a lower cost.
Converting physical data marts to virtual data marts by using data virtualization software to reduce what you
have to migrate. This conversion also improves agility and reduces total cost of ownership. You might consider
it as modernization during migration.
One objective of the migration exercise might also be to modernize the warehouse by changing the underlying
data model. One example is moving from an Inmon-style data model to a data vault approach. You should decide
this as part of the preparation phase and incorporate a strategy for the transition into the migration plan.
The recommended approach in this scenario is to first migrate the data model as is to the new platform and then
transition to the new model in Azure Synapse Analytics. Use the platform's scalability and performance
characteristics to execute the transformation without affecting the source system.
Data model migration
Depending on the platform and the origins of the source system, the data model of some or all parts may already
be in a star or snowflake schema form. If so, you can directly migrate it to Azure Synapse Analytics as is. This
scenario is the easiest and lowest-risk migration to achieve. An as-is migration can also be the first stage of a more
complex migration that includes a transition to a new underlying data model such as a data vault, as described
earlier.
Any set of relational tables and views can be migrated to Azure Synapse Analytics. For analytical query workloads
against a large data set, a star or snowflake data model generally gives the best overall performance. If the source
data model is not already in this form, it might be worth using the migration process to reengineer the model.
If the migration project includes any changes to the data model, the best practice is to perform these changes in the
new target environment. That is, migrate the existing model first, and then use the power and flexibility of Azure
Synapse Analytics to transform the data to the new model. This approach minimizes the impact on the existing
system and uses the performance and scalability of Azure Synapse Analytics to make any changes quickly and cost-
effectively.
You can migrate the existing system as several layers (for example, data ingest/staging layer, data warehouse layer,
and reporting or data mart layer). Each layer consists of relational tables and views. Although you can migrate all
these to Azure Synapse Analytics as is, it might be more cost-effective and reliable to use some of the features and
capabilities of the Azure ecosystem. For example:
Data ingest and staging: You can use Azure Blob storage in conjunction with PolyBase for fast parallel
data loading as part of the ETL (extract, transform, load) or ELT (extract, load, transform) process, rather than
relational staging tables. (A sketch of this pattern follows this list.)
Reporting layer and data marts: The performance characteristics of Azure Synapse Analytics might
eliminate the need to physically instantiate aggregated tables for reporting purposes or data marts. It might
be possible to implement these as views onto the core data warehouse or via a third-party data
virtualization layer. At the basic level, the data migration process for historical data, and possibly also
incremental updates, can follow this approach.
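The following T-SQL sketch illustrates the ingest and staging pattern from the first item above. The external data
source, file format, schema, and table names are hypothetical, and the example assumes that the PolyBase objects
(data source and file format) have already been created:

```sql
-- External table over files staged in Azure Blob storage (PolyBase).
-- AzureBlobStage and ParquetFileFormat are assumed, pre-created objects.
CREATE EXTERNAL TABLE ext.StagedSales
(
    SalesOrderId INT,
    ProductKey   INT,
    SalesAmount  DECIMAL(18, 2)
)
WITH
(
    LOCATION    = '/staged/sales/',
    DATA_SOURCE = AzureBlobStage,
    FILE_FORMAT = ParquetFileFormat
);

-- Load the staged data into the warehouse with a fast, parallel CTAS operation.
CREATE TABLE dbo.StagingSales
WITH (DISTRIBUTION = ROUND_ROBIN, HEAP)
AS
SELECT * FROM ext.StagedSales;
```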
If you can use these or similar approaches, the number of tables to be migrated is reduced. Some processes might
be simplified or eliminated, again reducing the migration workload. The applicability of these approaches depends
on the individual use case. But the general principle is to consider using the features and facilities of the Azure
ecosystem, where possible, to reduce the migration workload and build a cost-effective target environment. This
also holds true for other functions, such as backup/restore and workflow management and monitoring.
Products and services available from Microsoft partners can assist in data warehouse migration and in some cases
automate parts of the process. If the existing system incorporates a third-party ETL product, it might already
support Azure Synapse Analytics as a target environment. The existing ETL workflows can be redirected to the new
target data warehouse.
Data marts: Physical or virtual
It's a common practice for organizations with older data warehouse environments to create data marts that
provide their departments or business functions with good ad hoc self-service query and report performance. A
data mart typically consists of a subset of the data warehouse that contains aggregated versions of the original
data. Its form, typically a dimensional data model, makes it easy for users to query the data and receive fast
response times from user-friendly tools like Tableau, MicroStrategy, or Microsoft Power BI.
One use of data marts is to expose the data in a usable form, even if the underlying warehouse data model is
something different (such as a data vault). This approach is also known as a three-tier model.
You can use separate data marts for individual business units within an organization to implement robust data
security regimes. For example, you can allow user access to specific data marts relevant to them and eliminate,
obfuscate, or anonymize sensitive data.
If these data marts are implemented as physical tables, they require additional storage resources to house them
and additional processing to build and refresh them regularly. Physical tables also mean that the data in the mart is
only as current as the last refresh operation, so they might not be suitable for highly volatile data dashboards.
With the advent of relatively cheap scalable massively parallel processing (MPP) architectures such as Azure
Synapse Analytics and their inherent performance characteristics, you might be able to provide data mart
functionality without having to instantiate the mart as a set of physical tables. You achieve this by effectively
virtualizing the data marts through one of these methods:
SQL views on the main data warehouse.
A virtualization layer that uses features such as views in Azure Synapse Analytics or third-party virtualization
products such as Denodo.
This approach simplifies or eliminates the need for additional storage and aggregation processing. It reduces the
overall number of database objects to be migrated.
Another benefit of this approach is the data warehouse's capacity to run operations such as joins and aggregations
on large data volumes. For example, implementing the aggregation and join logic within a virtualization layer, and
displaying external reporting through a virtualized view, pushes the heavy processing required to create these
views into the data warehouse.
The primary drivers for choosing a physical or virtual data mart implementation are:
More agility. A virtual data mart is easier to change than physical tables and the associated ETL processes.
Lower total cost of ownership because of fewer data stores and copies of data in a virtualized implementation.
Elimination of ETL jobs to migrate and simplified data warehouse architecture in a virtualized environment.
Performance. Historically, physical data marts have been more reliable. Virtualization products are now
implementing intelligent caching techniques to mitigate this.
You can also use data virtualization to display data to users consistently during a migration project.
Data mapping
Key and integrity constraints in Azure Synapse Analytics
Primary key and foreign key constraints are not currently enforced within Azure Synapse Analytics. However, you
can include the definition for PRIMARY KEY in the CREATE TABLE statement with the NOT ENFORCED clause. This
means that third-party reporting products can use the metadata for the table to understand the keys within the
data model and therefore generate the most efficient queries.
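For example, here's a minimal sketch (the table and column names are hypothetical) of declaring an unenforced
primary key on a dimension table in a dedicated SQL pool:

```sql
CREATE TABLE dbo.DimCustomer
(
    CustomerKey  INT NOT NULL,
    CustomerName NVARCHAR(100) NOT NULL
)
WITH
(
    DISTRIBUTION = REPLICATE,
    CLUSTERED COLUMNSTORE INDEX
);

-- The constraint is metadata only: it isn't enforced, but reporting tools can use it
-- to understand the keys in the data model.
ALTER TABLE dbo.DimCustomer
ADD CONSTRAINT PK_DimCustomer_CustomerKey
    PRIMARY KEY NONCLUSTERED (CustomerKey) NOT ENFORCED;
```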
Data type support in Azure Synapse Analytics
Some older database systems include support for data types that are not directly supported within Azure Synapse
Analytics. You can handle these data types by using a supported data type to store the data as is or by transforming
the data to a supported data type.
Here's an alphabetical list of supported data types:
bigint
binary [ (n) ]
bit
char [ (n) ]
date
datetime
datetime2 [ (n) ]
datetimeoffset [ (n) ]
decimal [ (precision [, scale ]) ]
float [ (n) ]
int
money
nchar [ (n) ]
numeric [ (precision [ , scale ]) ]
nvarchar [ (n | MAX) ]
real [ (n) ]
smalldatetime
smallint
smallmoney
time [ (n) ]
tinyint
uniqueidentifier
varbinary [ (n | MAX) ]
varchar [ (n | MAX) ]
The following table lists common data types that are not currently supported, together with the recommended
approach for storing them in Azure Synapse Analytics. For specific environments such as Teradata or Netezza, see
the associated documents for more detailed information.
UNSUPPORTED DATA TYPE    WORKAROUND
geometry varbinary
geography varbinary
hierarchyid nvarchar(4000)
image varbinary
text varchar
ntext nvarchar
xml varchar
User-defined type Convert back to the native data type when possible
Test these thoroughly to determine whether the desired results are achieved in the target environment. The
migration exercise can uncover bugs or incorrect results that are currently part of the existing source system, and
the migration process is a good opportunity to correct anomalies.
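As one illustrative sketch of such a workaround (the table and column names are hypothetical), a geography column
on a SQL Server source can be serialized to binary before the data is exported for loading into an Azure Synapse
Analytics varbinary column:

```sql
-- Run on the source SQL Server: serialize the spatial column to well-known binary,
-- which can then be stored in a varbinary(max) column in the target.
SELECT
    StoreId,
    StoreLocation.STAsBinary() AS StoreLocationWkb
FROM dbo.Stores;
```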
Best practices for defining columns in Azure Synapse Analytics
It's common for older systems to contain columns with inefficient data types. For example, you might find a field
defined as VARCHAR(20) when the actual data values would fit into a CHAR(5) field. Or, you might find the use of
INTEGER fields when all values would fit within a SMALLINT field. Oversized data types can lead to inefficiencies in
both storage and query performance, especially in large fact tables.
It's a good time to check and rationalize current data definitions during a migration exercise. You can automate
these tasks by using SQL queries to find the maximum numeric value or character length within a data field and
comparing the result to the data type.
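A simple profiling query along these lines (the table and column names are hypothetical) returns the actual extents
of the data so that you can compare them against the declared types:

```sql
SELECT
    MAX(LEN(CustomerName)) AS MaxCustomerNameLength,  -- compare to the declared VARCHAR length
    MIN(OrderQuantity)     AS MinOrderQuantity,       -- compare to the declared integer type's range
    MAX(OrderQuantity)     AS MaxOrderQuantity
FROM dbo.SalesOrders;
```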
In general, it's a good practice to minimize the total defined row length for a table. For the best query performance,
you can use the smallest data type for each column, as described earlier. The recommended approach to load data
from external tables in Azure Synapse Analytics is to use the PolyBase utility, which supports a maximum defined
row length of 1 megabyte (MB). PolyBase won't load tables with rows longer than 1 MB, and you must use bcp
instead.
For the most efficient join execution, define the columns on both sides of the join as the same data type. If the key
of a dimension table is defined as SMALLINT , then the corresponding reference columns in fact tables using that
dimension should also be defined as SMALLINT .
Avoid defining character fields with a large default size. If the maximum size of data within a field is 50 characters,
use VARCHAR(50) . Similarly, don't use NVARCHAR if VARCHAR will suffice. NVARCHAR stores Unicode data to allow for
different language character sets. VARCHAR stores ASCII data and takes less space.
Performance options
This section describes the features available within Azure Synapse Analytics that you can use to improve
performance for a data model.
General approach
The database that will be migrated has probably already had performance tuning applied. Indexes, data
partitioning, and data distribution are examples of such performance tuning. When you're preparing for migration,
documenting that tuning can capture and reveal optimizations that you can apply in the Azure Synapse Analytics
target environment.
For example, the presence of a non-unique index on a table can indicate that fields used in the index are used
frequently for filtering, grouping, or joining. This will still be the case in the new environment, so keep it in mind
when you're choosing which fields to index there. Migration recommendations for specific source platforms such
as Teradata and Netezza are described in detail in separate documents.
Use the performance and scalability of the target Azure Synapse Analytics environment to experiment with
different performance options like data distribution. Determine the best choice of alternative approaches (for
example, replicated versus hash-distributed for a large dimension table). This doesn't mean that data must be
reloaded from external sources. It's relatively quick and easy to test alternative approaches in Azure Synapse
Analytics by creating copies of any table with different partitioning or distribution options via a
CREATE TABLE AS SELECT statement.
Use the monitoring tools provided by the Azure environment to understand how queries are executed and where
bottlenecks might be occurring. Tools are also available from third-party Microsoft partners to provide monitoring
dashboards and automated resource management and alerting.
Each SQL operation in Azure Synapse Analytics, along with the resources it uses (such as memory or CPU), is
logged into system tables. A series of dynamic management views simplifies access to this information.
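For example, a quick look at recent requests in a dedicated SQL pool can use the sys.dm_pdw_exec_requests view,
shown here as a minimal sketch:

```sql
-- List the ten most recently submitted requests and how long they ran.
SELECT TOP 10
    request_id,
    status,
    submit_time,
    total_elapsed_time,
    command
FROM sys.dm_pdw_exec_requests
ORDER BY submit_time DESC;
```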
The following sections explain the key options within Azure Synapse Analytics for tuning query performance. Your
existing environment will contain information about potential optimizations to apply in the target environment.
Temporary tables
Azure Synapse Analytics supports temporary tables that are visible only to the session in which they were created.
They exist for the duration of a user session and are automatically dropped at the end of the session.
To create a temporary table, prefix the table name with the hash character ( # ). You can use all the usual indexing
and distribution options with temporary tables, as described in the next section.
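A minimal sketch of this pattern, with hypothetical table and column names, creates a session-scoped staging table
by using CREATE TABLE AS SELECT:

```sql
-- The temporary table exists only for this session and is dropped automatically.
CREATE TABLE #StagingSales
WITH
(
    DISTRIBUTION = ROUND_ROBIN,
    HEAP
)
AS
SELECT SalesOrderId, ProductKey, SalesAmount
FROM dbo.DailySalesLoad;
```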
Temporary tables have some restrictions:
Renaming them isn't allowed.
Viewing or partitioning them isn't allowed.
Changing permissions isn't allowed.
Temporary tables are commonly used within ETL/ELT processing, where transient intermediate results are used as
part of a transformation process.
Table distribution options
Azure Synapse Analytics is an MPP database system that achieves performance and scalability by running in
parallel across multiple processing nodes.
The ideal processing scenario for running an SQL query in a multinode environment is to balance the workload
and give all nodes an equal amount of data to process. This approach also allows you to minimize or eliminate the
amount of data that has to be moved between nodes to satisfy the query.
It can be challenging to achieve the ideal scenario because there are often aggregations in typical analytics queries
and multiple joins between several tables, as between fact and dimension tables.
One way to influence how queries are processed is to use the distribution options within Azure Synapse Analytics
to specify where each table's individual data rows are stored. For example, assume that two large tables are joined
on the data column, CUSTOMER_ID . By distributing the two tables through the CUSTOMER_ID columns whenever that
join is performed, you can ensure that the data from each side of the join will already be co-located on the same
processing node. This method eliminates the need to move data between nodes. The distribution specification for a
table is defined in the CREATE TABLE statement.
The following sections describe the available distribution options and recommendations for when to use them. It's
possible to change the distribution of a table after the initial load, if necessary: re-create the table with the new
distribution by using the CREATE TABLE AS SELECT statement.
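As a sketch of that technique (the object names are hypothetical, apart from the CUSTOMER_ID column used in the
earlier example), a table can be re-created with a new distribution and then swapped into place:

```sql
-- Re-create the table with hash distribution on the join key.
CREATE TABLE dbo.FactOrders_hash
WITH
(
    DISTRIBUTION = HASH(CUSTOMER_ID),
    CLUSTERED COLUMNSTORE INDEX
)
AS
SELECT * FROM dbo.FactOrders;

-- Swap the new table into place.
RENAME OBJECT dbo.FactOrders TO FactOrders_old;
RENAME OBJECT dbo.FactOrders_hash TO FactOrders;
```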
Round robin
Round-robin table distribution is the default option and spreads the data evenly across the nodes in the system.
This method is good for fast data loading and for data that's relatively low in volume and doesn't have an obvious
candidate for hashing. It's frequently used for staging tables as part of an ETL or ELT process.
Hashed
The system assigns each row to a hash bucket based on a hashing algorithm applied to a user-defined key, such as
CUSTOMER_ID in the preceding example. The bucket is then assigned to a specific node, and all data rows hash-
distributed on the same value end up on the same processing node.
This method is useful for large tables that are frequently joined or aggregated on a key. Other large tables to be
joined should be hashed on the same key if possible. If there are multiple candidates for the hash key, choose the
most frequently joined one.
The hash column shouldn't contain nulls and isn't typically a date because many queries filter on date. Hashing is
typically more efficient if the key to hash is an integer value instead of CHAR or VARCHAR . Avoid choosing keys with a
highly skewed range of values, like when a small number of key values represent a large percentage of the data
rows.
Replicated
Choosing replicated as the distribution option for a table will cause a complete copy of that table to be replicated
on each compute node for query processing purposes.
This approach is useful for relatively small tables (typically less than 2 GB compressed) that are relatively static and
frequently joined to larger tables via an equi-join. These tables are often dimensional tables in a star schema.
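A minimal sketch of a replicated dimension table, using hypothetical names:

```sql
CREATE TABLE dbo.DimGeography
(
    GeographyKey  INT NOT NULL,
    City          NVARCHAR(50) NOT NULL,
    StateProvince NVARCHAR(50) NOT NULL
)
WITH
(
    DISTRIBUTION = REPLICATE,        -- full copy on each compute node
    CLUSTERED INDEX (GeographyKey)
);
```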
Indexing
Azure Synapse Analytics includes options for indexing data in large tables to reduce the resources and time
required to retrieve records:
Clustered columnstore index
Clustered index
Non-clustered index
A non-indexed option, HEAP , exists for tables that don't benefit from any of the index options. Using indexes is a
trade-off between improved query times versus longer load times and usage of more storage space. Indexes often
speed up SELECT , UPDATE , DELETE , and MERGE operations on large tables that affect a small percentage of the
data rows, and they can minimize full table scans.
Indexes are automatically created when UNIQUE or PRIMARY KEY constraints are defined on columns.
Clustered columnstore index
Clustered columnstore index is the default indexing option within Azure Synapse Analytics. It provides the best
compression and query performance for large tables. For smaller tables of fewer than 60 million rows, these
indexes aren't efficient, so you should use the HEAP option. Similarly, a heap or a temporary table might be more
efficient if the data in a table is transient and part of an ETL/ELT process.
Clustered index
If there's a requirement to regularly retrieve a single row or small number of rows from a large table based on a
strong filter condition, a clustered index might be more efficient than a clustered columnstore index. Only one
clustered index is allowed per table.
Non-clustered index
Non-clustered indexes are similar to clustered indexes in that they can speed up retrieval of single rows or a small
number of rows based on a filter condition. Internally, non-clustered indexes are stored separately from the data,
and multiple non-clustered indexes can be defined on a table. However, each additional index will require more
storage and will reduce the throughput of data insert or loading.
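For example, a non-clustered index on a frequently filtered column might look like this (the names are hypothetical):

```sql
CREATE INDEX IX_FactSales_StoreCode
ON dbo.FactSales (StoreCode);
```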
Heap
Heap tables incur none of the overhead associated with the creation and maintenance of indexes at data load time.
They can help to quickly load transient data during processes, including ELT processes. Caching can also assist
when the data is read immediately afterward. Because clustered columnstore indexes are inefficient below 60
million rows, heap tables are also a good choice for storing tables with fewer rows than this threshold.
Data partitioning
In an enterprise data warehouse, fact tables can contain many billions of rows. Partitioning is a way to optimize the
maintenance and querying of these tables by splitting them into separate parts to reduce the amount of data
processed when running queries. The partitioning specification for a table is defined in the CREATE TABLE
statement.
You can use only one field per table for partitioning. It's frequently a date field, because many queries are filtered by
a date or date range. You can change the partitioning of a table after initial load, if necessary, by re-creating the
table with the new partitioning through the CREATE TABLE AS SELECT statement.
Partitioning for query optimization
If queries against a large fact table are frequently filtered by a certain data column, then partitioning on that
column can significantly reduce the amount of data that needs to be processed to perform the queries. A common
example is to use a date field to split the table into smaller groups. Each group contains data for a single day. When
a query contains a WHERE clause that filters on the date, only partitions that match the date filter need to be
accessed.
Partitioning for optimization of table maintenance
It's common in data warehouse environments to maintain a rolling window of detailed fact data. An example is
sales transactions that go back five years. By partitioning on the sales date, the removal of old data beyond the
rolling window becomes much more efficient. Dropping the oldest partition is quicker and uses fewer resources
than deleting all the individual rows.
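As a sketch of this pattern (the object names are hypothetical, and the archive table is assumed to have an identical schema and partition scheme), the oldest partition can be switched out and then emptied rather than deleting rows one by one:

-- Move the oldest partition to an empty archive table, then clear the archive.
ALTER TABLE dbo.FactSales SWITCH PARTITION 2 TO dbo.FactSales_Archive PARTITION 2;
TRUNCATE TABLE dbo.FactSales_Archive;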
Statistics
When a query is submitted to Azure Synapse Analytics, it's first processed by the query optimizer. The optimizer
determines the best internal methods to execute the query efficiently.
The optimizer compares the various query-execution plans that are available based on a cost-based algorithm. The
accuracy of the cost estimates is dependent on the statistics available. It's a good practice to ensure that statistics
are up to date.
In Azure Synapse Analytics, turning on the AUTO_CREATE_STATISTICS option causes statistics to be created automatically on
columns used in queries. You can also create or update statistics manually via the CREATE STATISTICS and UPDATE STATISTICS commands.
Refresh statistics when the contents have changed substantially, such as in a daily update. This refresh can be
incorporated into an ETL process.
All tables in the database should have statistics collected on at least one column. This ensures that basic information
such as row count and table size is available to the optimizer. Other columns that should have statistics collected
are columns used in JOIN, DISTINCT, ORDER BY, and GROUP BY processing.
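As a hedged example (the database, table, and column names are placeholders), statistics management can be handled as follows:

-- Enable automatic creation of single-column statistics (database name is a placeholder).
ALTER DATABASE MySynapseDW SET AUTO_CREATE_STATISTICS ON;

-- Create statistics on a frequently joined or filtered column, then refresh
-- them as part of a scheduled ETL step after substantial data changes.
CREATE STATISTICS stat_FactSales_SaleDate ON dbo.FactSales (SaleDate);
UPDATE STATISTICS dbo.FactSales;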
Workload management
Azure Synapse Analytics incorporates comprehensive features for managing resource utilization across mixed
workloads. Creating resource classes for different workload types, such as queries versus data load, helps you
manage your workload. It sets limits on the number of queries that run concurrently and on the compute resources
assigned to each query. There's a trade-off between memory and concurrency:
Smaller resource classes reduce the maximum memory per query but increase concurrency.
Larger resource classes increase the maximum memory per query but reduce concurrency.
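As an illustrative sketch only (the user name, group name, and percentage values are hypothetical and should be tuned to your environment), a load user can be assigned a larger resource class, or requests can be classified into a dedicated workload group:

-- Assign an ETL login to a larger static resource class.
EXEC sp_addrolemember 'largerc', 'LoaderUser';

-- Or use workload isolation: reserve resources for loads and classify requests.
CREATE WORKLOAD GROUP DataLoads
WITH ( MIN_PERCENTAGE_RESOURCE = 30,
       CAP_PERCENTAGE_RESOURCE = 60,
       REQUEST_MIN_RESOURCE_GRANT_PERCENT = 30 );

CREATE WORKLOAD CLASSIFIER LoadClassifier
WITH ( WORKLOAD_GROUP = 'DataLoads', MEMBERNAME = 'LoaderUser' );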
Performance recommendations
Use existing performance improvement methods, such as indexes or data distribution, as a guide to candidates for similar
optimizations in the new target environment, but benchmark to confirm whether they're necessary in Azure Synapse Analytics. Build
statistics collection steps into ETL/ELT processes to ensure that statistics are up to date, or enable the automatic
creation of statistics.
Understand the tuning options available in Azure Synapse Analytics and the performance characteristics of
associated utilities, such as PolyBase for fast parallel data loading. Use these options to build an efficient end-to-
end implementation.
Use the flexibility, scalability, and performance of the Azure environment to implement any data model changes or
performance tuning options in place. This effort will reduce the impact on existing source systems.
Understand the dynamic management views available in Azure Synapse Analytics. These views provide both
system-wide resource utilization information and detailed execution information for individual queries; see the example query after these recommendations.
Understand Azure resource classes and allocate them appropriately to ensure efficient management of mixed
workloads and concurrency.
Consider using a virtualization layer as part of the Azure Synapse Analytics environment. It can shield changes in
the warehouse implementation from business users and reporting tools.
Research partner-provided migration tools and services such as Qlik Replicate for Microsoft migrations,
WhereScape, and Datometry. These services can automate parts of the migration process and reduce the elapsed
time and risk involved in a migration project.
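For example, a query along the following lines (a sketch only; the filter and column list can be adjusted) lists active or queued requests together with their resource class from the sys.dm_pdw_exec_requests dynamic management view:

-- Show recent requests that are still running, queued, or suspended.
SELECT TOP 20 request_id, [status], submit_time, total_elapsed_time, resource_class, command
FROM sys.dm_pdw_exec_requests
WHERE [status] NOT IN ('Completed', 'Failed', 'Cancelled')
ORDER BY submit_time DESC;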
High availability for Azure Synapse Analytics
One of the key benefits of a modern cloud-based infrastructure such as Microsoft Azure is that features for high
availability (HA) and disaster recovery (DR) are built in and simple to implement and customize. These facilities are
often lower in cost than the equivalent functionality within an on-premises environment. Using these built-in
functions also means that the backup and recovery mechanisms in the existing data warehouse don't need to be
migrated.
The following sections describe the standard Azure Synapse Analytics features that address requirements for high
availability and disaster recovery.
High availability
Azure Synapse Analytics uses database snapshots to provide high availability of the warehouse. A data warehouse
snapshot creates a restore point that can be used to recover or copy a data warehouse to a previous state. Because
Azure Synapse Analytics is a distributed system, a data warehouse snapshot consists of many files that are located
in Azure Storage. Snapshots capture incremental changes from the data stored in your data warehouse.
Azure Synapse Analytics automatically takes snapshots throughout the day to create restore points that are
available for seven days. This retention period can't be changed. Azure Synapse Analytics supports an eight-hour
recovery point objective (RPO). You can restore a data warehouse in the primary region from any one of the
snapshots taken in the past seven days.
The service also supports user-defined restore points. Manually triggering snapshots can create restore points of a
data warehouse before and after large modifications. This capability ensures that restore points are logically
consistent. Logical consistency provides additional data protection against workload interruptions or user errors
for quick recovery time.
Disaster recovery
In addition to the snapshots described earlier, Azure Synapse Analytics performs a standard geo-backup once per
day to a paired datacenter. The RPO for a geo-restore is 24 hours. You can restore the geo-backup to a server in
any other region where Azure Synapse Analytics is supported. A geo-backup ensures that a data warehouse can be
restored in case the restore points in the primary region are not available.
Governance or compliance strategy
When governance or compliance is required throughout a migration effort, you need to broaden your scope to
account for these requirements. The following guidance expands the scope of the Azure migration guide to cover
different approaches to meeting governance or compliance requirements.
Suggested prerequisites
Configuration of the base Azure environment can change significantly when you're integrating governance or
compliance requirements. To understand how prerequisites change, it's important to understand the nature of the
requirements. Prior to beginning any migration that requires governance or compliance, you should choose and
implement an approach in the cloud environment. The following are a few high-level approaches commonly seen
during migrations:
Common governance approach: For most organizations, the Cloud Adoption Framework governance model is
a sufficient approach. It consists of a minimum viable product (MVP) implementation, followed by targeted
iterations of governance maturity to address tangible risks identified in the adoption plan. This approach provides
the minimum tooling needed to establish consistent governance, so the team can understand the tools. It then
expands on those tools to address common governance concerns.
International Organization for Standardization (ISO) 27001 compliance blueprints: If your organization
is required to adhere to ISO compliance standards, the ISO 27001 Shared Services blueprint samples can serve as
a more effective MVP. The blueprint can produce richer governance constraints, earlier in the iterative process. The
ISO 27001 App Service Environment/SQL Database workload blueprint sample expands on the Shared Services
blueprint, to map controls and deploy a common architecture for an application environment.
Cloud Adoption Framework enterprise-scale landing zone: You might require a more robust governance
starting point. If so, consider the Cloud Adoption Framework enterprise-scale landing zone. The Cloud Adoption
Framework enterprise-scale landing zone approach focuses on adoption teams who have a mid-term objective
(within 24 months) to host more than 1,000 assets (applications, infrastructure, or data assets) in the cloud. The
Cloud Adoption Framework enterprise-scale landing zone is the de facto choice for complex governance scenarios
for these larger cloud adoption efforts.
Partnership option to complete prerequisites
Microsoft Services: Microsoft Services provides solution offerings that can align to the Cloud Adoption
Framework governance model, compliance blueprints, or Cloud Adoption Framework enterprise-scale landing
zone options. This option helps you ensure that you're using the most appropriate governance or compliance
model. Use the Secure Cloud Insights solution to establish a data-driven picture of a customer deployment in
Azure. This solution also validates the customer's Azure implementation maturity while identifying optimization
opportunities in existing deployment architectures. Secure Cloud Insights also helps you reduce risk pertaining to
governance, security, and availability. Based on customer insights, you should lead with the following approaches:
Cloud foundation: Establish the customer's core Azure designs, patterns, and governance architecture with
the hybrid cloud foundation solution. Map the customer's requirements to the most appropriate reference
architecture. Implement a minimum viable product consisting of shared services and IaaS workloads.
Cloud modernization: Use the cloud modernization solution as a comprehensive approach to move
applications, data, and infrastructure to an enterprise-ready cloud. You can also optimize and modernize after
cloud deployment.
Innovate with cloud: Engage the customer through the cloud center of excellence (CCoE) solution. It
implements an agile approach to capture business requirements, and to reuse deployment packages aligned
with security, compliance, and service management policies. It also maintains the alignment of the Azure
platform with operational procedures.
Next steps
Return to the checklist to reevaluate any additional scope requirements for the migration effort.
Migration best practices checklist
Cloud Adoption Framework migration model
This section of the Cloud Adoption Framework explains the principles behind its migration model. Wherever
possible, this content attempts to maintain a vendor-neutral position while guiding you through the processes and
activities that can be applied to any cloud migration, regardless of your chosen cloud vendor.
NOTE
While business planning is important, a growth mindset is equally important. In parallel with broader business planning
efforts by the cloud strategy team, it's suggested that the cloud adoption team begin migrating a first workload as a
precursor to wider scale migration efforts. This initial migration will allow the team to gain practical experience with the
business and technical issues involved in a migration.
Migration and modernization of workloads range from simple rehost migrations (also called lift and shift
migrations) using infrastructure as a service (IaaS) capabilities that don't require code and application changes,
through refactoring with minimal changes, to rearchitecting to modify and extend code and application
functionality to take advantage of cloud technologies.
Cloud-native strategies and platform as a service (PaaS) strategies rebuild on-premises workloads using Azure
platform offerings and managed services. Workloads that have equivalent fully managed software as a service
(SaaS) cloud-based offerings can often be fully replaced by these services as part of the migration process.
NOTE
During the public preview of the Cloud Adoption Framework, this section of the framework emphasizes a rehost migration
strategy. Although PaaS and SaaS solutions are discussed as alternatives when appropriate, the migration of virtual
machine-based workloads using IaaS capabilities is the primary focus.
Other sections and future iterations of this content will expand on other approaches. For a high-level discussion on
expanding the scope of your migration to include more complicated migration strategies, see the article balancing the
portfolio.
Incremental migration
The Cloud Adoption Framework migration model is based on an incremental cloud transformation process. It
assumes that your organization will start with an initial, limited-scope, cloud migration effort, which we refer to
commonly as the first workload. This effort will expand iteratively to include more workloads as your operations
teams refine and improve your migration processes.
Cloud migration tools like Azure Site Recovery can migrate entire datacenters consisting of tens of thousands of
VMs. However, the business and existing IT operations can seldom handle such a high pace of change. As such,
many organizations break up a migration effort into multiple iterations, moving one workload (or a collection of
workloads) per iteration.
The principles behind this incremental model are based on the execution of processes and prerequisites
referenced in the following infographic.
The consistent application of these principles represents an end goal for your cloud migration processes and
should not be viewed as a required starting point. As your migration efforts mature, refer to the guidance in this
section to help define the best process to support your organizational needs.
Next steps
Begin learning about this model by investigating the prerequisites to migration.
Prerequisites to migration
Prerequisites for migration
Prior to beginning any migrations, your migration target environment must be prepared for the coming changes.
In this case, environment refers to the technical foundation in the cloud. Environment also means the business
environment and mindset driving the migration. Likewise, the environment includes the culture of the teams
executing the changes and those receiving the output. Lack of preparation for these changes is the most common
reason for failure of migrations. This series of articles walks you through suggested prerequisites to prepare the
environment.
Objective
Ensure business, culture, and technical readiness prior to beginning an iterative migration plan.
Definition of done
Prerequisites are completed when the following are true:
Business readiness. The cloud strategy team has defined and prioritized a high-level migration backlog
representing the portion of the digital estate to be migrated in the next two or three releases. The cloud
strategy team and the cloud adoption team have agreed to an initial strategy for managing change.
Culture readiness. The roles, responsibilities, and expectations of the cloud adoption team, cloud strategy
team, and affected users have been agreed on regarding the workloads to be migrated in the next two or three
releases.
Technical readiness. The landing zone (or allocated hosting space in the cloud) that will receive the migrated
assets meets minimum requirements to host the first migrated workload.
Caution
Preparation is key to the success of a migration. However, too much preparation can lead to analysis paralysis,
where excessive time spent on planning seriously delays a migration effort. The processes and prerequisites
defined in this section are meant to help you make decisions, but don't let them block you from making
meaningful progress.
Choose a relatively simple workload for your initial migration. Use the processes discussed in this section as you
plan and implement this first migration. This first migration effort will quickly demonstrate cloud principles to
your team and force them to learn about how the cloud works. As your team gains experience, integrate these
learnings as you take on larger and more complex migrations.
Next steps
With a general understanding of the prerequisites, you're ready to address the first prerequisite: early migration
decisions.
Early migration decisions
Decisions that affect migration
During migration, several factors affect decisions and execution activities. This article explains the central theme of
those decisions and explores a few questions that carry through the discussions of migration principles in this
section of the Cloud Adoption Framework guidance.
Business outcomes
The objective or goal of any adoption effort can have a significant impact on the suggested approach to execution.
Migration. Urgent business drivers, speed of adoption, or cost savings are examples of operational outcomes.
These outcomes are central to efforts that drive business value from transitive change in IT or operations
models. The Migrate methodology of the Cloud Adoption Framework focuses heavily on migration-focused
business outcomes.
Application innovation. Improving customer experience and growing market share are examples of
incremental outcomes. The outcomes result from a collection of incremental changes focused on the needs and
desires of current customers.
Data-driven innovation. New products or services, especially ones that come from the power of data, are
examples of disruptive outcomes. These outcomes are the result of experimentation and predictions that use
data to disrupt status quo in the market.
No business would pursue just one of these outcomes. Without operations, there are no customers, and vice versa.
Cloud adoption is no different. Companies commonly work to achieve each of these outcomes, but trying to focus
on all of them simultaneously can spread your efforts too thin and slow progress on work that could most benefit
your business needs.
This prerequisite isn't a demand for you to pick one of these three goals, but instead to help your cloud strategy
team and your cloud adoption team establish a set of operational priorities that will guide execution for the next
three to six months. These priorities are set by ranking each of the three itemized options from most significant to
least significant, as they relate to the efforts this team can contribute to in the next one or two quarters.
Act on migration outcomes
If operational outcomes rank highest in the list, this section of the Cloud Adoption Framework will suit your team
well. This section assumes that you need to prioritize speed and cost savings as your primary key performance
indicators (KPIs), in which case a migration-focused adoption model aligns well with those outcomes. A
migration-focused model is heavily predicated on lift and shift migration of infrastructure as a service (IaaS) assets
to deplete a datacenter and to produce cost savings. In such a model, modernization may occur but is a secondary
focus until the primary migration mission is realized.
Act on application innovations
If market share and customer experience are your primary drivers, this section may not be the best section of the
Cloud Adoption Framework to guide your teams' efforts. Application innovation requires a plan that focuses on
the modernization and transition of workloads, regardless of the underlying infrastructure. In such a case, the
guidance in this section can be informative but may not be the best approach to guide core decisions.
Act on data innovations
If data, experimentation, research and development (R&D), or new products are your priority for the next six
months or so, this section may not be the best section of the Cloud Adoption Framework to guide your teams'
efforts. Any data innovation effort could benefit from guidance regarding the migration of existing source data.
However, the broader focus of that effort would be on the ingress and integration of additional data sources.
Extending that guidance with predictions and new experiences is much more important than the migration of IaaS
assets.
Effort
Migration effort can vary widely depending on the size and complexities of the workloads involved. A smaller
workload migration involving a few hundred virtual machines (VMs) is a tactical process, potentially being
implemented using automated tools such as Azure Migrate. Conversely, a large enterprise migration of tens of
thousands of workloads requires a highly strategic process and can involve extensive refactoring, rebuilding, and
replacing of existing applications, integrating platform as a service (PaaS) and software as a service (SaaS)
capabilities. Identifying and balancing the scope of your planned migrations is critical.
Before making any decisions that could have a long-term impact on the current migration program, it is vital that
you create consensus on the following decisions.
Effort type
In any migration of significant scale (more than 250 VMs), assets are migrated using a variety of transition options,
discussed in the five Rs of rationalization: rehost, refactor, rearchitect, rebuild, and replace.
Some workloads are modernized through a rebuild or rearchitect process, creating more modern applications with
new features and technical capabilities. Other assets go through a refactor process, for instance a move to
containers or other more modern hosting and operational approaches that don't necessarily affect the solution's
code base. Commonly, well-established virtual machines and other assets go through a rehost process,
transitioning those assets from the datacenter to the cloud. Some workloads could potentially be migrated
to the cloud, but instead should be replaced using service-based (SaaS-based) cloud services that meet the same
business need—for example, by using Microsoft 365 as an alternative to migrating Exchange Server instances.
In the majority of scenarios, some business event creates a forcing function that causes a high percentage of assets
to temporarily migrate using the rehost process, followed by a more significant secondary transition using one of
the other migration strategies after they're in the cloud. This process is commonly known as a cloud transition.
During the process of rationalizing the digital estate, these types of decisions are applied to each asset to migrate.
However, the prerequisite needed at this time is to make a baseline assumption. Of the five migration strategies,
which best aligns with the business objectives or business outcomes driving this migration effort? This decision
serves as a guiding assumption throughout the migration effort.
Effort scale
Scale of the migration is the next important prerequisite decision. The processes needed to migrate 1,000 assets
are different from the processes required to move 10,000 assets. Before beginning any migration effort, it is
important to answer the following questions:
How many assets support the migrating workloads today? Assets include data structures, applications,
VMs, and necessary IT appliances. Choose a relatively small workload for your first migration candidate.
Of those assets, how many are planned for migration? It's common for some assets to be terminated
during a migration process, due to lack of sustained end-user dependency.
What are the top-down estimates of the scale of migratable assets? For the workloads included for
migration, estimate the number of supporting assets such as applications, virtual machines, data sources, and IT
appliances. See the digital estate section of the Cloud Adoption Framework for guidance on identifying relevant
assets.
Effort timing
Often, migrations are driven by a compelling business event that is time sensitive. For instance, one common
driver is the termination or renewal of a third-party hosting contract. Although there are many potential business
events necessitating a migration, they share a common factor: an end date. It is important to understand the
timing of any approaching business events, so activities and velocity can be planned and validated properly.
Recap
Before proceeding, document the following assumptions and share them with the cloud strategy team and the
cloud adoption teams:
Business outcomes.
Roles, documented and refined for the Assess, Migrate, Optimize, and Secure and manage migration processes.
Definition of done, documented and refined separately for these migration processes.
Effort type.
Effort scale.
Effort timing.
Next steps
After the process is understood among the team, it's time to review technical prerequisites. The migration
environment planning checklist helps to ensure that the technical foundation is ready for migration.
Review the migration planning checklist
Migration environment planning checklist: Validate
environmental readiness prior to migration
As an initial step in the migration process, you need to create the right environment in the cloud to receive, host,
and support migrating assets. This article provides a list of things to validate in the current environment prior to
migration.
The following checklist aligns with the guidance in the Ready methodology of the Cloud Adoption Framework.
Review that section for guidance regarding execution of any of the following.
Governance alignment
The first and most important decision regarding any migration-ready environment is the choice of governance
alignment. Has a consensus been achieved regarding alignment of governance with the migration foundation? At
a minimum, the cloud adoption team should understand whether this migration is landing in a single
environment with limited governance, a fully governed environment factory, or some variant in between. For
additional guidance on governance alignment, see the Govern methodology.
We highly recommend that you develop a governance strategy for anything beyond your initial workload
migration.
Regardless of your level of governance alignment, you will need to make decisions related to the following topics.
Resource organization
Based on the governance alignment decision, an approach to the organization and deployment of resources
should be established prior to migration.
Nomenclature
A consistent approach for naming resources, along with consistent naming schemas, should be established prior
to migration.
Resource governance
A decision regarding the tools to govern resources should be made prior to migration. The tools do not need to
be fully implemented, but a direction should be selected and tested. The cloud governance team should define
and require the implementation of a minimum viable product (MVP) for governance tooling prior to migration.
Network
Your cloud-based workloads will require the provisioning of virtual networks to support end-user and
administrative access. Based on resource organization and resource governance decisions, you should select a
network approach and align it to IT security requirements. Further, your networking decisions should be aligned with
any hybrid network constraints required to operate the workloads in the migration backlog and support any
access to resources hosted on-premises.
Identity
Cloud-based identity services are a prerequisite for offering identity and access management (IAM) for your
cloud resources. Align your identity management strategy with your cloud adoption plans before proceeding. For
example, when migrating existing on-premises assets, consider supporting a hybrid identity approach that uses
directory synchronization to allow a consistent set of user credentials across your on-premises and cloud
environments during and after the migration.
Next steps
If the environment meets the minimum requirements, it may be deemed approved for migration readiness.
Cultural complexity and change management helps to align roles and responsibilities to ensure proper
expectations during execution of the plan.
Cultural complexity and change management
Prepare for cultural complexity: Aligning roles and
responsibilities
An understanding of the culture required to operate the existing datacenters is important to the success of any
migration. In some organizations, datacenter management is contained within centralized IT operations teams. In
these centralized teams, roles and responsibilities tend to be well defined and well understood throughout the
team. For larger enterprises, especially those bound by third-party compliance requirements, the culture tends to
be more nuanced and complex. Cultural complexity can lead to roadblocks that are difficult to understand and time
consuming to overcome.
In either scenario, it's wise to invest in the documentation of roles and responsibilities required to complete a
migration. This article outlines some of the roles and responsibilities seen in a datacenter migration, to serve as a
template for documentation that can drive clarity throughout execution.
Business functions
In any migration, there are a few key functions that are best executed by the business, whenever possible. Often, IT
is capable of completing the following tasks. However, engaging members of the business could aid in reducing
barriers later in the adoption process. It also ensures mutual investment from key stakeholders throughout the
migration process.
Process: Secure and manage. Activity: Interruption impact. Description: Aid the cloud adoption team in quantifying the impact of a business process interruption.
Process: Secure and manage. Activity: Service-level agreement (SLA) validation. Description: Aid the cloud adoption team in defining service-level agreements and acceptable tolerances for business outages.
Ultimately, the cloud adoption team is accountable for each of these activities. However, establishing
responsibilities and a regular cadence with the business for the completion of these activities on an established
rhythm can improve stakeholder alignment and cohesiveness with the business.
NOTE
In the following table, an accountable party should start the alignment of roles. That column should be customized to fit
existing processes for efficient execution. Ideally, a single person should be assigned as the accountable party.
Process: Prerequisite. Activity: Digital estate. Description: Align the existing inventory to basic assumptions, based on business outcomes. Accountable party: Cloud strategy team.
Process: Secure and manage. Activity: Ops transition. Description: Document production systems prior to production operations. Accountable party: Cloud adoption team.
Caution
For these activities, permissions and authorization heavily influence the accountable party, who must have direct
access to production systems in the existing environment or must have means of securing access through other
responsible actors. Determining this accountable party directly affects the promotion strategy during the migrate
and optimize processes.
Next steps
When the team has a general understanding of roles and responsibilities, it's time to begin preparing the technical
details of the migration. Understanding technical complexity and change management can help prepare the cloud
adoption team for the technical complexity of migration by aligning to an incremental change management
process.
Technical complexity and change management
Prepare for technical complexity: Agile change
management
When an entire datacenter can be deprovisioned and re-created with a single line of code, traditional processes
struggle to keep up. The guidance throughout the Cloud Adoption Framework is built on practices like IT service
management (ITSM), The Open Group Architecture Framework (TOGAF), and others. However, to ensure agility and
responsiveness to business change, this framework molds those practices to fit agile methodologies and DevOps
approaches.
When shifting to an agile model where flexibility and iteration are emphasized, technical complexity and change
management are handled differently than they are in a traditional waterfall model that focuses on a linear series of
migration steps. This article outlines a high-level approach to change management in an agile-based migration
effort. At the end of this article, you should have a general understanding of the levels of change management and
documentation involved in an incremental migration approach. Additional training and decisions are required to
select and implement agile practices based on that understanding. The intention of this article is to prepare cloud
architects for a facilitated conversation with project management to explain the general concept of change
management in this approach.
INVEST in workloads
The term workload appears throughout the Cloud Adoption Framework. A workload is a unit of application
functionality that can be migrated to the cloud. It could be a single application, a layer of an application, or a
collection of applications. The definition is flexible and may change at various phases of migration. The Cloud
Adoption Framework uses the term INVEST to define a workload.
INVEST is a common acronym in many agile methodologies for writing user stories or product backlog items, both
of which are units of output in agile project management tools. The measurable unit of output in a migration is a
migrated workload. The Cloud Adoption Framework modifies the INVEST acronym a bit to create a construct for
defining workloads:
Independent: A workload should not have any inaccessible dependencies. For a workload to be considered
migrated, all dependencies should be accessible and included in the migration effort.
Negotiable: As additional discovery is performed, the definition of a workload changes. The architects
planning the migration could negotiate factors regarding dependencies. Examples of negotiation points could
include prerelease of features, making features accessible over a hybrid network, or packaging all
dependencies in a single release.
Valuable: Value in a workload is measured by the ability to provide users with access to a production
workload.
Estimable: Dependencies, assets, migration time, performance, and cloud costs should all be estimable and
should be estimated prior to migration.
Small: The goal is to package workloads in a single sprint. However, this may not always be feasible. Instead,
teams are encouraged to plan sprints and releases to minimize the time required to move a workload to
production.
Testable: There should always be a defined means of testing or validating completion of the migration of a
workload.
This acronym is not intended as a basis for rigid adherence but should help guide the definition of the term
workload.
The migration, release, and iteration backlogs track different levels of activity during migration processes.
In any migration backlog, the change management team should strive to obtain the following information for any
workload in the plan. At a minimum, this data should be available for any workloads prioritized for migration in
the next two or three releases.
Migration backlog data points
Business impact. Understanding of the impact to the business of missing the expected timeline or reducing
functionality during freeze windows.
Relative business priority. A ranked list of workloads based on business priorities.
Business owner. Document the one individual responsible for making business decisions regarding this
workload.
Technical owner. Document the one individual responsible for technical decisions related to this workload.
Expected timelines. When the migration is scheduled for completion.
Workload freezes. Time frames in which the workload should be ineligible for change.
Workload name.
Initial inventory. Any assets required to provide the functionality of the workload, including VMs, IT
appliances, data, applications, deployment pipelines, and others. This information is likely to be inaccurate.
Next steps
After change management approaches have been established, it's time to address the final prerequisite: the
migration backlog review.
Migration backlog review
Migration backlog review
The actionable output of the Plan phase is a migration backlog, which influences all of the prerequisites discussed
so far. Development of the migration backlog should be completed as a first prerequisite. This article serves as a
milestone to complete prerequisite activities. The cloud strategy team is accountable for the care and maintenance
of the digital estate. However, the realization of the resultant backlog is the responsibility of every member of the
migration effort. As a final prerequisite, the cloud strategy team and the cloud adoption team should review and
understand the migration backlog. During that review, the members of both teams must gain sufficient knowledge
to articulate the following key points in the migration backlog.
Business priorities
Sometimes, prioritizing one workload over another may seem illogical to the cloud adoption team. Understanding
the business priorities that drove those decisions can help maintain the team's motivation. It also allows the team
to make a stronger contribution to the prioritization process.
Core assumptions
The article on digital estate rationalization discusses the agility and time-saving impact of basic assumptions when
evaluating a digital estate. To fully realize those values, the cloud adoption team needs to understand the
assumptions and the reasons that they were established. That knowledge better equips the cloud adoption team to
challenge those assumptions.
Next steps
With a general understanding of the digital estate and migration backlog, the team is ready to move beyond
prerequisites and begin assessing workloads.
Assess workloads
Assess workloads and validate assumptions before
migration
Many of your existing workloads are ideal candidates for cloud migration, but not every asset is compatible with
cloud platforms and not all workloads can benefit from hosting in the cloud. Digital estate planning allows you to
generate an overall migration backlog of potential workloads to migrate. However, this planning effort is high-
level. It relies on assumptions made by the cloud strategy team and does not dig deeply into technical
considerations.
As a result, before migrating a workload to the cloud it's critical to assess the individual assets associated with
that workload for their migration suitability. During this assessment, your cloud adoption team should evaluate
technical compatibility, required architecture, performance/sizing expectations, and dependencies to ensure that
the migrated workload can be deployed to the cloud effectively.
The assess process is the first of four incremental activities that occur within an iteration. As discussed in the
prerequisite article regarding technical complexity and change management, a decision should be made in
advance to determine how this phase is executed. In particular, will assessments be completed by the cloud
adoption team during the same sprint as the actual migration effort? Alternatively, will a wave or factory model
be used to complete assessments in a separate iteration? If this basic process question can't be answered by every
member of the team, it may be wise to revisit the prerequisites section.
Objective
Assess a migration candidate, evaluating the workload, associated assets, and dependencies prior to migration.
Definition of done
This process is complete when the following are known about a single migration candidate:
The path from on-premises to cloud, including production promotion approach decision, has been defined.
Any required approvals, changes, cost estimates, or validation processes have been completed to allow the
cloud adoption team to execute the migration.
This full list of responsibilities and actions can support large and complex migrations involving multiple roles with
varying levels of responsibility, and requiring a detailed approval process. Smaller and simpler migration efforts
may not require all of the roles and actions described here. To determine which of these activities add value and
which are unnecessary, your cloud adoption team and the cloud strategy team should use this complete process
as part of your first workload migration. After the workload has been verified and tested, the team can evaluate
this process and choose which actions to use moving forward.
Next steps
With a general understanding of the assessment process, you're ready to begin the process by classifying
workloads.
Classify workloads
Workload classification before migration
During each iteration of any migration process, one or more workloads will be migrated and promoted to
production. Prior to either of those migration activities, it is important to classify each workload. Classification
helps clarify governance, security, operations, and data management requirements.
The following guidance builds on the suggested tagging requirements outlined in the naming and tagging
standards article by adding important operations and governance elements.
In this article, we specifically suggest adding criticality and data sensitivity to your existing tagging standards. Each
of these data points will help other teams understand which workloads may require additional attention or
support.
Data sensitivity
As outlined in the article on data classification, data classification measures the impact that a data leak would have
on the business or customers. The governance and security teams use data sensitivity or data classification as an
indicator of security risks. During assessment, the cloud adoption team should evaluate the data classification for
each workload targeted for migration and share that classification with supporting teams. Workloads that deal
strictly in "public data" may not have any impact on supporting teams. However, as data moves further towards
the "highly confidential" end of the spectrum, both governance and security teams will likely have a vested interest
in participating in the assessment of the workload.
Work with your security and governance teams as early as possible to define the following items:
A clear process for sharing any workloads on the backlog with sensitive data.
An understanding of the governance requirements and security baseline required for different levels of
data sensitivity.
Any impact data sensitivity may have on subscription design, management group hierarchies, or landing zone
requirements.
Any requirements for testing data classification, which may include specific tooling or defined scope of
classification.
Mission criticality
As outlined in the article on workload criticality, the criticality of a workload is a measure of how significantly the
business will be affected during an outage. This data point helps operations management and security teams
evaluate risks regarding outages and breaches. During assessment, the cloud adoption team should evaluate
mission criticality for each workload targeted for migration and share that classification with supporting teams.
"Low" or "unsupported" workloads are likely to have little impact on the supporting teams. However, as workloads
approach "mission critical" or "unit critical" classifications, their operational dependencies become more apparent.
Work with your security and operations teams as early as possible to define the following items:
A clear process for sharing any workloads on the backlog with support requirements.
An understanding of the operations management and resource consistency requirements for different
levels of criticality.
Any impact criticality may have on subscription design, management group hierarchies, or landing zone
requirements.
Any requirements for documenting criticality, which might include specific traffic or usage reports, financial
analyses, or other tools.
Next steps
Once workloads are properly classified, it's much easier to align business priorities.
Align business priorities
Business priorities: Maintaining alignment
Transformation is often defined as a dramatic or spontaneous change. At the board level, change can look like a
dramatic transformation. However, for those who work through the process of change in an organization,
transformation is a bit misleading. Under the surface, transformation is better described as a series of properly
executed transitions from one state to another.
The amount of time required to rationalize or transition a workload will vary, depending on the technical
complexity involved. However, even when this process can be applied to a single workload or group of applications
quickly, it takes time to produce substantial changes among a user base. It takes longer for changes to propagate
through various layers of existing business processes. If a transformation is expected to shape consumer behavior
patterns, it can take even longer to produce significant results.
Unfortunately, the market doesn't wait for businesses to transition. Consumer behavior patterns change on their
own, often unexpectedly. The market's perception of a company and its products can be swayed by social media or
a competitor's positioning. Fast and unexpected market changes require companies to be nimble and responsive.
The ability to execute processes and technical transitions requires a consistent, stable effort. Quick decisions and
nimble actions are needed to respond to market conditions. These two are at odds, making it easy for priorities to
fall out of alignment. This article describes approaches to maintaining transitional alignment during migration
efforts.
Next steps
With properly aligned business priorities, the cloud adoption team can confidently begin to evaluate workloads to
develop architecture and migration plans.
Evaluate workloads
Evaluate workload readiness
This activity focuses on evaluating readiness of a workload to migrate to the cloud. During this activity, the cloud
adoption team validates that all assets and associated dependencies are compatible with the chosen deployment
model and cloud provider. During the process, the team documents any efforts required to remediate
compatibility issues.
Evaluation assumptions
Most of the content discussing principles in the Cloud Adoption Framework is cloud agnostic. However, the
readiness evaluation process must be largely specific to each cloud platform. The following guidance
assumes an intention to migrate to Azure. It also assumes the use of Azure Migrate for replication activities.
For alternative tools, see Replication options.
This article doesn't capture all possible evaluation activities. It is assumed that each environment and business
outcome will dictate specific requirements. To help accelerate the creation of those requirements, the remainder of
this article shares a few common evaluation activities related to infrastructure, database, and network evaluation.
NOTE
Total storage directly affects bandwidth requirements during initial replication. However, storage drift continues from the
point of replication until release. This means that drift has a cumulative effect on available bandwidth.
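As a rough, hypothetical illustration: replicating 20 TB of storage over a dedicated 1-Gbps connection (about 125 MB per second of usable throughput) takes on the order of 20,000,000 MB / 125 MB per second, or roughly 160,000 seconds (about 44 hours), before any allowance for ongoing storage drift or network overhead.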
Next steps
After the evaluation of a system is complete, the outputs feed the development of a new cloud architecture.
Architect workloads prior to migration
Architect workloads prior to migration
This article expands on the assessment process by reviewing activities associated with defining the architecture of
a workload within a given iteration. As discussed in the article on incremental rationalization, some architectural
assumptions are made during any business transformation that requires a migration. This article clarifies those
assumptions, shares a few roadblocks that can be avoided, and identifies opportunities to accelerate business
value by challenging those assumptions. This incremental model for architecture allows teams to move faster and
to obtain business outcomes sooner.
Next steps
After the new architecture is defined, accurate cost estimations can be calculated.
Estimate cloud costs
Estimate cloud costs
During migration, there are several factors that can affect decisions and execution activities. To help understand
which of those options are best for different situations, this article discusses various options for estimating cloud
costs.
Accounting models
If you're familiar with traditional IT procurement processes, estimation in the cloud may seem foreign. When
adopting cloud technologies, acquisition shifts from a rigid, structured capital expense model to a fluid operating
expense model. In the traditional capital expense model, the IT team would attempt to consolidate buying power
for multiple workloads across various programs to centralize a pool of shared IT assets that could support each of
those solutions. In the operating expenses cloud model, costs can be directly attributed to the support needs of
individual workloads, teams, or business units. This approach allows for a more direct attribution of costs to the
supported internal customer. When estimating costs, it's important to first understand how much of this new
accounting capability will be used by the IT team.
For those wanting to replicate the legacy capital expense approach to accounting, use the outputs of either
approach suggested in the digital estate size section above to get an annual cost basis. Next, multiply that annual
cost by the company's typical hardware refresh cycle. Hardware refresh cycle is the rate at which a company
replaces aging hardware, typically measured in years. Annual run rate multiplied by hardware refresh cycle
creates a cost structure similar to a capital expense investment pattern.
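As a hypothetical illustration: if the estimated annual run rate is $800,000 and the company typically replaces hardware every four years, the comparable capital expense figure is $800,000 x 4 = $3.2 million.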
Next steps
After estimating costs, migration can begin. However, it would be wise to review partnership and support options
before beginning any migration.
Understand partnership and support options
Understand partnership and support options
During migration, the cloud adoption team performs the actual migration of workloads to the cloud. Unlike the
collaborative and problem-solving tasks when defining the digital estate or building the core cloud infrastructure,
migration tends to be a series of repetitive execution tasks. Beyond the repetitive aspects, there are likely testing
and tuning efforts that require deep knowledge of the chosen cloud provider. The repetitive nature of this process
can sometimes be best addressed by a partner, reducing strain on full-time staff. Additionally, partners may be able
to better align deep technical expertise when the repetitive processes encounter execution anomalies.
Partners tend to be closely aligned with a single cloud vendor or a small number of cloud vendors. To better
illustrate partnership options, the remainder of this article assumes that Microsoft Azure is the chosen cloud
provider.
During plan, build, or migrate, a company generally has four execution partnership options:
Guided self-service. The existing technical team executes the migration, with help from Microsoft.
FastTrack for Azure. Use the Microsoft FastTrack for Azure program to accelerate migration.
Solutions partner. Get connected with Azure partners or cloud solution providers (CSPs) to accelerate
migration.
Supported self-service. Execution is completed by the existing technical staff with support from Microsoft.
Guided self-service
If an organization is planning an Azure migration on its own, Microsoft is always there to assist throughout the
journey. To help fast-track migration to Azure, Microsoft and its partners have developed an extensive set of
architectures, guides, tools, and services to reduce risk and to speed migration of virtual machines, applications,
and databases. These tools and services support a broad selection of operating systems, programming languages,
frameworks, and databases.
Assessment and migration tools. Azure provides a wide range of tools to be used in different phases of
your cloud transformation, including assessing your existing infrastructure. For more information, refer to the
"assess" section in the "migration" chapter that follows.
Microsoft Cloud Adoption Framework. This framework presents a structured approach to cloud adoption
and migration. It is based on best practices across many Microsoft-supported customer engagements and is
organized as a series of steps, from architecture and design to implementation. For each step, supporting
guidance helps you with the design of your application architecture.
Cloud design patterns. Azure provides some useful cloud design patterns for building reliable, scalable,
secure workloads in the cloud. Each pattern describes the problem that the pattern addresses, considerations
for applying the pattern, and an example based on Azure. Most of the patterns include code samples or snippets
that show how to implement the pattern on Azure. However, they're relevant to any distributed system, whether
hosted on Azure or on other cloud platforms.
Cloud fundamentals. Fundamentals help teach the basic approaches to implementation of core concepts.
This guide helps technicians think about solutions that go beyond a single Azure service.
Example scenarios. The guide provides references from real customer implementations, outlining the tools,
approaches, and processes that past customers have followed to accomplish specific business goals.
Reference architectures. Reference architectures are arranged by scenario, with related architectures
grouped together. Each architecture includes best practices, along with considerations for scalability, availability,
manageability, and security. Most also include a deployable solution.
FastTrack for Azure
FastTrack for Azure provides direct assistance from Azure engineers, working hand in hand with partners, to help
customers build Azure solutions quickly and confidently. FastTrack brings best practices and tools from real
customer experiences to guide customers from setup, configuration, and development to production of Azure
solutions, including:
Datacenter migration
Windows Server on Azure
Linux on Azure
SAP on Azure
Business continuity and disaster recovery (BCDR)
High-performance computing
Cloud-native applications
DevOps
Application modernization
Cloud-scale analytics
Intelligent applications
Intelligent agents
Data modernization to Azure
Security and management
Globally distributed data
Windows Virtual Desktop
Azure Marketplace
Fundamentals and governance
During a typical FastTrack for Azure engagement, Microsoft helps to define the business vision to plan and develop
Azure solutions successfully. The team assesses architectural needs and provides guidance, design principles, tools,
and resources to help build, deploy, and manage Azure solutions. The team matches skilled partners for
deployment services on request and periodically checks in to ensure that deployment is on track and to help
remove blockers.
The main phases of a typical FastTrack for Azure engagement are:
Discovery. Identify key stakeholders, understand the goal or vision for problems to be solved, and then assess
architectural needs.
Solution enablement. Learn design principles for building applications, review architecture of applications
and solutions, and receive guidance and tools to drive proof of concept (PoC) work through to production.
Continuous partnership. Azure engineers and program managers check in every so often to ensure that
deployment is on track and to help remove blockers.
Azure support
If you have questions or need help, create a support request. If your support request requires deep technical
guidance, visit Azure support plans to find the best plan for your needs.
Next steps
After a partner and support strategy is selected, the release and iteration backlogs can be updated to reflect
planned efforts and assignments.
Manage change using release and iteration backlogs
Manage change in an incremental migration effort
This article assumes that migration processes are incremental in nature, running parallel to the govern process.
However, the same guidance could be used to populate initial tasks in a work breakdown structure for traditional
waterfall change management approaches.
Release backlog
A release backlog consists of a series of assets (VMs, databases, files, and applications, among others) that must
be migrated before a workload can be released for production usage in the cloud. During each iteration, the cloud
adoption team documents and estimates the efforts required to move each asset to the cloud. See the "iteration
backlog" section that follows.
Iteration backlog
An iteration backlog is a list of the detailed work required to migrate a specific number of assets from the existing
digital estate to the cloud. The entries on this list are often stored in an agile management tool, like Azure DevOps,
as work items.
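For teams that track the iteration backlog in Azure DevOps, work items can also be created programmatically through the Azure DevOps REST API. The following Python sketch is illustrative only: the organization, project, personal access token, and effort field are placeholder assumptions that you'd replace with values from your own process template.

```python
import requests

# Placeholder values -- replace with your own organization, project, and PAT.
ORG = "contoso"
PROJECT = "cloud-migration"
PAT = "<personal-access-token>"

def create_migration_work_item(title: str, description: str, effort_hours: int) -> int:
    """Create a 'Task' work item representing one asset migration in the iteration backlog."""
    url = (f"https://dev.azure.com/{ORG}/{PROJECT}/_apis/wit/workitems/$Task"
           "?api-version=6.0")
    # Work item creation uses a JSON Patch document.
    body = [
        {"op": "add", "path": "/fields/System.Title", "value": title},
        {"op": "add", "path": "/fields/System.Description", "value": description},
        # The effort field name varies by process template; this one is an assumption.
        {"op": "add", "path": "/fields/Microsoft.VSTS.Scheduling.OriginalEstimate",
         "value": effort_hours},
    ]
    response = requests.post(
        url,
        json=body,
        headers={"Content-Type": "application/json-patch+json"},
        auth=("", PAT),  # Basic auth with an empty username and a PAT.
    )
    response.raise_for_status()
    return response.json()["id"]

work_item_id = create_migration_work_item(
    "Replicate SQL01 to Azure", "Replicate and stage the SQL01 database VM.", 16)
print(f"Created work item {work_item_id}")
```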
Prior to starting the first iteration, the cloud adoption team specifies an iteration duration, usually two to four
weeks. This time box is important to create a start and finish time period for each set of committed activities.
Maintaining consistent execution windows makes it easy to gauge velocity (pace of migration) and alignment to
changing business needs.
Prior to each iteration, the team reviews the release backlog, estimating the effort and priorities of assets to be
migrated. It then commits to deliver a specific number of agreed-on migrations. Once the cloud adoption team
agrees to that commitment, the list of activities becomes the current iteration backlog.
During each iteration, team members work as a self-organizing team to fulfill commitments in the current
iteration backlog.
Next steps
After an iteration backlog is defined and accepted by the cloud adoption team, change management approvals
can be finalized.
Approve architecture changes prior to migration
Approve architecture changes before migration
10/30/2020 • 4 minutes to read • Edit Online
During the assess process of migration, each workload is evaluated, architected, and estimated to develop a future
state plan for the workload. Some workloads can be migrated to the cloud with no change to the architecture.
Maintaining on-premises configuration and architecture can reduce risk and streamline the migration process.
Unfortunately, not every application can run in the cloud without changes to the architecture. When architecture
changes are required, this article can help classify the change and can provide some guidance on the proper
approval activities.
Existing culture
Your IT teams likely have existing mechanisms for managing change involving your on-premises assets. Typically
these mechanisms are governed by traditional Information Technology Infrastructure Library-based (ITIL-based)
change management processes. In many enterprise migrations, these processes involve a change advisory board
(CAB) that's responsible for reviewing, documenting, and approving all IT-related requests for change (RFCs).
The CAB generally includes experts from multiple IT and business teams, offering a variety of perspectives and
detailed review for all IT-related changes. A CAB approval process is a proven way to reduce risk and minimize the
business impact of changes involving stable workloads managed by IT operations.
Technical approval
Organizational readiness for the approval of technical change is among the most common reasons for cloud
migration failure. More projects are stalled by a series of technical approvals than by any deficit in a cloud platform.
Preparing the organization for technical change approval is an important requirement for migration success. The
following are a few best practices to ensure that the organization is ready for technical approval.
ITIL change advisory board challenges
Every change management approach has its own set of controls and approval processes. Migration is a series of
continuous changes that start with a high degree of ambiguity and develop additional clarity through the course of
execution. As such, migration is best governed by agile-based change management approaches, with the cloud
strategy team serving as a product owner.
However, the scale and frequency of change during a cloud migration doesn't fit well with the nature of ITIL
processes. The requirements of a CAB approval can risk the success of a migration, slowing or stopping the effort.
Further, in the early stages of migration, ambiguity is high and subject matter expertise tends to be low. For the
first several workload migrations or releases, the cloud adoption team is often in a learning mode. As such, it could
be difficult for the team to provide the types of data needed to pass a CAB approval.
The following best practices can help the CAB maintain a degree of comfort during migration without becoming a
painful blocker.
Standardize change
It is tempting for a cloud adoption team to consider detailed architectural decisions for each workload being
migrated to the cloud. It is equally tempting to use cloud migration as a catalyst to refactor past architectural
decisions. For organizations that are migrating a few hundred VMs or a few dozen workloads, either approach can
be properly managed. When migrating a datacenter consisting of 1,000 or more assets, each of these approaches
is considered a high-risk antipattern that significantly reduces the likelihood of success. Modernizing, refactoring,
and rearchitecting every application requires a diverse skill set and a wide variety of changes, and these tasks
create dependencies on human efforts at scale. Each of these dependencies injects risk into the migration effort.
The article on digital estate rationalization discusses the agility and time-saving impact of basic assumptions when
rationalizing a digital estate. There is an additional benefit of standardized change. By choosing a default
rationalization approach to govern the migration effort, the change advisory board or product owner can review and
approve the application of one change to a long list of workloads. This limits technical approval of individual
workloads to only those that require a significant architecture change to be cloud compatible.
Clarify expectations and roles of approvers
Before the first workload is assessed, the cloud strategy team should document and communicate the expectations
of anyone involved in the approval of change. This simple activity can avoid costly delays when the cloud adoption
team is fully engaged.
Seek approval early
When possible, technical change should be detected and documented during the assessment process. Regardless
of approval processes, the cloud adoption team should engage approvers early. The sooner that change approval
can begin, the less likely an approval process is to block migration activities.
Next steps
With the help of these best practices, it should be easier to integrate proper, low-risk approval into migration
efforts. After workload changes are approved, the cloud adoption team is ready to migrate workloads.
Migrate workloads
Deploy workloads
10/30/2020 • 2 minutes to read • Edit Online
After workloads have been assessed, they can be deployed to the cloud or staged for release. This series of articles
explains the various activities that may be involved in this phase of migration effort.
Objective
The objective of a migration is to migrate a single workload to the cloud.
Definition of done
The migration phase is complete when a workload is staged and ready for testing in the cloud, including all
dependent assets required for the workload to function. During the optimize process, the workload is prepared for
production usage.
This definition of done can vary, depending on your testing and release processes. The next article in this series
covers deciding on a promotion model and can help you understand when it would be best to promote a migrated
workload to production.
Next steps
With a general understanding of the migration process, you're ready to decide on a promotion model.
Decide on a promotion model
Promotion models: Single-step, staged, or flight
10/30/2020 • 5 minutes to read • Edit Online
Workload migration is often discussed as a single activity. In practice, it's a collection of smaller activities that
facilitate the movement of a digital asset to the cloud. One of the last activities in a migration is the promotion of
an asset to production. Promotion is the point at which the production system changes for end users. It can often
be as simple as changing the network routing, redirecting end users to the new production asset. Promotion is
also the point at which IT operations or cloud operations change the focus of operational management processes
from the previous production system to the new production systems.
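As a concrete illustration of promotion by rerouting, the sketch below updates a DNS alias so end users resolve to the migrated endpoint instead of the previous one. It assumes an Azure DNS zone and the azure-mgmt-dns Python SDK; the subscription, zone, record, and target names are placeholders, and a real promotion might instead adjust a load balancer, Traffic Manager profile, or firewall rule.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.dns import DnsManagementClient
from azure.mgmt.dns.models import CnameRecord, RecordSet

# Placeholder identifiers -- substitute your own subscription, zone, and targets.
SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "dns-rg"
ZONE = "contoso.com"

client = DnsManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Point the workload's CNAME at the migrated (cloud) endpoint; end users pick up
# the change as the old record's TTL expires.
client.record_sets.create_or_update(
    RESOURCE_GROUP,
    ZONE,
    "app",  # app.contoso.com
    "CNAME",
    RecordSet(ttl=300, cname_record=CnameRecord(cname="app-prod.azurewebsites.net")),
)
```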
There are several promotion models. This article outlines three of the most common ones used in cloud
migrations. The choice of a promotion model changes the activities seen within the migrate and optimize
processes. As such, promotion model should be decided early in a release.
NOTE
The table of contents for this site lists the promotion activity as part of the optimize process. In a single-step model,
promotion occurs during the Migrate phase. When using this model, roles and responsibilities should be updated to
reflect this.
Single-step. In a single-step promotion model, migration automation tools replicate, stage, and promote the
assets in one continuous motion; the workload goes to production as part of the migration process itself.
Staged. In a staged promotion model, the workload is considered migrated after it is staged, but it is not yet
promoted. Prior to promotion, the migrated workload undergoes a series of performance tests, business tests,
and optimization changes. It is then promoted at a future date in conjunction with a business test plan. This
approach improves the balance between cost and performance, while making it easier to obtain business
validation.
Flight. The flight promotion model combines single-step and staged models. In a flight model, the assets in the
workload are treated like production after landing in staging. After a condensed period of automated testing,
production traffic is routed to the workload. However, only a subset of the traffic is routed at first. That traffic
serves as the first flight of production and testing. Assuming the workload performs well from both a feature and
performance perspective,
additional traffic is migrated. After all production traffic has been moved onto the new assets, the workload is
considered fully promoted.
The chosen promotion model affects the sequence of activities to be performed. It also affects the roles and
responsibilities of the cloud adoption team. It may even impact the composition of a sprint or multiple sprints.
Single-step promotion
This model uses migration automation tools to replicate, stage, and promote assets. The assets are replicated into
a contained staging environment controlled by the migration tool. After all assets have been replicated, the tool
can execute an automated process to promote the assets into the chosen subscription in a single step. While in
staging, the tool continues to replicate the asset, minimizing loss of data between the two environments. After an
asset is promoted, the linkage between the source system and the replicated system is severed. In this approach, if
additional changes occur in the initial source systems, the changes are lost.
Pros. Positive benefits of this approach include:
This model introduces less change to the target systems.
Continuous replication minimizes data loss.
If a staging process fails, it can quickly be deleted and repeated.
Replication and repeated staging tests enable an incremental scripting and testing process.
Cons. Negative aspects of this approach include:
Assets staged within the tool's isolated sandbox don't allow for complex testing models.
During replication, the migration tool consumes bandwidth in the local datacenter. Staging a large volume of
assets over an extended duration has an exponential impact on available bandwidth, hurting the migration
process and potentially affecting performance of production workloads in the on-premises environment.
Staged promotion
In this model, the staging sandbox managed by the migration tool is used for limited testing purposes. The
replicated assets are then deployed into the cloud environment, which serves as an extended staging environment.
The migrated assets run in the cloud, while additional assets are replicated, staged, and migrated. When full
workloads become available, richer testing is initiated. When all assets associated with a subscription have been
migrated, the subscription and all hosted workloads are promoted to production. In this scenario, there is no
change to the workloads during the promotion process. Instead, the changes tend to be at the network and
identity layers, routing users to the new environment and revoking access of the cloud adoption team.
Pros. Positive benefits of this approach include:
This model provides more accurate business testing opportunities.
The workload can be studied more closely to better optimize performance and cost of the assets.
A larger number of assets can be replicated within similar time and bandwidth constraints.
Cons. Negative aspects of this approach include:
The chosen migration tool can't facilitate ongoing replication after migration.
A secondary means of data replication is required to synchronize data platforms during the staged time frame.
Flight promotion
This model is similar to the staged promotion model. However, there is one fundamental difference. When the
subscription is ready for promotion, end-user routing happens in stages or flights. At each flight, additional users
are rerouted to the production systems.
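One common way to stage the flights is with weighted traffic routing: each flight raises the weight assigned to the migrated endpoint. The sketch below uses the azure-mgmt-trafficmanager Python SDK as one possible mechanism; the profile, endpoint names, and weights are illustrative placeholders, and other routing layers (Front Door, Application Gateway, DNS) can play the same role.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.trafficmanager import TrafficManagerManagementClient

# Placeholder identifiers -- adjust to your environment.
SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "tm-rg"
PROFILE = "contoso-app"
CLOUD_ENDPOINT = "azure-endpoint"

client = TrafficManagerManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

def set_flight_weight(weight: int) -> None:
    """Shift a larger share of weighted production traffic to the migrated endpoint."""
    endpoint = client.endpoints.get(
        RESOURCE_GROUP, PROFILE, "ExternalEndpoints", CLOUD_ENDPOINT)
    endpoint.weight = weight
    client.endpoints.create_or_update(
        RESOURCE_GROUP, PROFILE, "ExternalEndpoints", CLOUD_ENDPOINT, endpoint)

# Example flights: the share of traffic depends on the weight of the remaining
# on-premises endpoint, so these values are only illustrative.
for flight_weight in (100, 500, 1000):
    set_flight_weight(flight_weight)
    # ...observe workload behavior before starting the next flight...
```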
Pros. Positive benefits of this approach include:
This model mitigates the risks associated with a big migration or promotion activity. Errors in the migrated
solution can be identified with less impact to business processes.
It allows for monitoring of workload performance demands in the cloud environment for an extended duration,
increasing accuracy of asset-sizing decisions.
Larger numbers of assets can be replicated within similar time and bandwidth constraints.
Cons. Negative aspects of this approach include:
The chosen migration tool can't facilitate ongoing replication after migration.
A secondary means of data replication is required to synchronize data platforms during the staged time frame.
Next steps
After a promotion model is defined and accepted by the cloud adoption team, remediation of assets can begin.
Remediating assets prior to migration
Remediate assets prior to migration
10/30/2020 • 4 minutes to read • Edit Online
During the assessment process of migration, the team seeks to identify any configurations that would make an
asset incompatible with the chosen cloud provider. Remediate is a checkpoint in the migration process to ensure
that those incompatibilities have been resolved. This article discusses a few common remediation tasks for
reference. It also establishes a skeleton process for deciding whether remediation is a wise investment.
NOTE
This isn't production routing to the new assets, but rather configuration to allow for proper routing to the assets in
general.
Decision framework
Because remediation for smaller workloads can be straightforward, you should choose a smaller workload for
your initial migration. However, as your migration efforts mature and you begin to tackle larger workloads,
remediation can be a time-consuming and costly process. For example, remediation efforts for a Windows Server
2003 migration involving a 5,000+ VM pool of assets can delay a migration by months. When such large-scale
remediation is required, the following questions can help guide decisions:
Have all workloads affected by the remediation been identified and notated in the migration backlog?
For workloads that are not affected, will a migration produce a similar return on investment (ROI)?
Can the affected assets be remediated in alignment with the original migration timeline? What impact would
timeline changes have on ROI?
Is it economically feasible to remediate the assets in parallel with migration efforts?
Is there sufficient bandwidth on staff to remediate and migrate? Should a partner be engaged to execute one
or both tasks?
If these questions don't yield favorable answers, a few alternative approaches that move beyond a basic IaaS
rehosting strategy may be worth considering:
Containerization. Some assets can be hosted in a containerized environment without remediation. This could
produce less-than-favorable performance and doesn't resolve security or compliance issues.
Automation. Depending on the workload and remediation requirements, it may be more profitable to script
the deployment to new assets using a DevOps approach.
Rebuild. When remediation costs are very high and business value is equally high, a workload may be a good
fit as a candidate for rebuilding or rearchitecting.
Next steps
After remediation is complete, replication activities are ready.
Replicate assets
What role does replication play in the migration
process?
10/30/2020 • 4 minutes to read • Edit Online
On-premises datacenters are filled with physical assets like servers, appliances, and network devices. However,
each server is only a physical shell. The real value comes from the binary running on the server. The applications
and data are the purpose for the datacenter. Those are the primary binaries to migrate. Powering these
applications and data stores are other digital assets and binary sources, like operating systems, network routes,
files, and security protocols.
Replication is the workhorse of migration efforts. It is the process of copying a point-in-time version of various
binaries. The binary snapshots are then copied to a new platform and deployed onto new hardware, in a process
referred to as seeding. When executed properly, the seeded copy of the binary should behave identically to the
original binary on the old hardware. However, that snapshot of the binary is immediately out of date and
misaligned with the original source. To keep the new binary and the old binary aligned, a process referred to as
synchronization continuously updates the copy stored in the new platform. Synchronization continues until the
asset is promoted in alignment with the chosen promotion model. At that point, the synchronization is severed.
Next steps
After replication is complete, staging activities can begin.
Staging activities during a migration
Replication options
10/30/2020 • 2 minutes to read • Edit Online
Before any migration, you should ensure that primary systems are safe and will continue to run without issues.
Any downtime disrupts users or customers, and it costs time and money. Migration is not as simple as turning off
the virtual machines on-premises and copying them across to Azure. Migration tools must take into account
asynchronous or synchronous replication to ensure that live systems can be copied to Azure with no downtime.
Most of all, systems must be kept in lockstep with on-premises counterparts. You might want to test migrated
resources in isolated partitions in Azure, to ensure that workloads work as expected.
The content within the Cloud Adoption Framework assumes that Azure Migrate (or Azure Site Recovery) is the
most appropriate tool for replicating assets to the cloud. However, there are other options available. This article
discusses those options to help enable decision-making.
Next steps
After replication is complete, staging activities can begin.
Staging activities during a migration
Understand staging activities during a migration
10/30/2020 • 2 minutes to read • Edit Online
As described in the article on promotion models, staging is the point at which assets have been migrated to the
cloud. However, they're not ready to be promoted to production yet. This is often the last step in the migrate
process of a migration. After staging, the workload is managed by an IT operations or cloud operations team to
prepare it for production usage.
Deliverables
Staged assets may not be ready for use in production. There are several production readiness checks that should
be finalized before this stage is considered complete. The following is a list of deliverables often associated with
completion of asset staging.
Automated testing. Any automated tests available to validate workload performance should be run before
concluding the staging process. After the asset leaves staging, synchronization with the original source system
is terminated. As such, it is harder to redeploy the replicated assets after they are staged for optimization.
Migration documentation. Most migration tools can produce an automated report of the assets being
migrated. Before concluding the staging activity, all migrated assets should be documented for clarity.
Configuration documentation. Any changes made to an asset (during remediation, replication, or staging)
should be documented for operational readiness.
Backlog documentation. The migration backlog should be updated to reflect the workload and assets
staged.
Next steps
After staged assets are tested and documented, you can proceed to optimization activities.
Optimize migrated workloads
Release workloads
10/30/2020 • 2 minutes to read • Edit Online
After a collection of workloads and their supporting assets has been deployed to the cloud, it must be prepared
before it can be released. In this phase of the migration effort, the workloads are load tested and
validated with the business. They're then optimized and documented. Once the business and IT teams have
reviewed and signed off on workload deployments, those workloads can be released or handed off to governance,
security, and operations teams for ongoing operations.
The objective of "release workloads" is to prepare migrated workloads for promotion to production usage.
Definition of done
The optimization process is complete when a workload has been properly configured, sized, and deployed to
production.
Traditionally, IT has overseen the release of new workloads. During a major transformation, like a datacenter
migration or a cloud migration, a similar pattern of IT-led adoption could be applied. However, the traditional
approach might miss opportunities to realize additional business value. For this reason, before a migrated
workload is promoted to production, implementing a broader approach to user adoption is suggested. This article
outlines the ways in which a business change plan adds to a standard user adoption plan.
Next steps
After business change is documented and planned, business testing can begin.
Guidance for business testing (UAT) during migration
Guidance for business testing (UAT) during migration
3/31/2020 • 3 minutes to read • Edit Online
Traditionally seen as an IT function, user acceptance testing during a business transformation can be orchestrated
solely by IT. However, this function is often most effectively executed as a business function. IT then supports this
business activity by facilitating the testing, developing testing plans, and automating tests when possible. Although
IT can often serve as a surrogate for testing, there is no replacement for firsthand observation of real users
attempting to take advantage of a new solution in the context of a real or replicated business process.
NOTE
When available, automated testing is a much more effective and efficient means of testing any system. However, cloud
migrations often focus most heavily on legacy systems or at least stable production systems. Often, those systems aren't
managed by thorough and well-maintained automated tests. This article assumes that no such tests are available at the
time of migration.
Second to automated testing is testing of the process and technology changes by power users. Power users are the
people that commonly execute a real-world process that requires interactions with a technology tool or set of
tools. They could be represented by an external customer using an e-commerce site to acquire goods or services.
Power users could also be represented by a group of employees executing a business process, such as a call center
servicing customers and recording their experiences.
The goal of business testing is to solicit validation from power users to certify that the new solution performs in
line with expectations and does not impede business processes. If that goal isn't met, the business testing serves as
a feedback loop that can help define why and how the workload isn't meeting expectations.
Next steps
In conjunction with business testing, optimization of migrated assets can refine cost and workload performance.
Benchmark and resize cloud assets
Benchmark and resize cloud assets
10/30/2020 • 3 minutes to read • Edit Online
Monitoring usage and spending is critically important for cloud infrastructures. Organizations pay for the
resources they consume over time. When usage exceeds agreement thresholds, unexpected cost overages can
quickly accumulate. Cost management reports monitor spending to analyze and track cloud usage, costs, and
trends. Use over-time reports to detect anomalies that differ from normal trends. Inefficiencies in cloud deployment
are visible in optimization reports. Note inefficiencies in cost-analysis reports.
In the traditional on-premises models of IT, requisition of IT systems is costly and time consuming. The processes
often require lengthy capital expenditure review cycles and may even require an annual planning process. As such,
it is common practice to buy more than is needed. It is equally common for IT administrators to then overprovision
assets in preparation for anticipated future demands.
In the cloud, the accounting and provisioning models eliminate the time delays that lead to overbuying. When an
asset needs additional resources, it can be scaled up or out almost instantly. This means that assets can safely be
reduced in size to minimize resources and costs consumed. During benchmarking and optimization, the cloud
adoption team seeks to find the balance between performance and costs, provisioning assets to be no larger and
no smaller than necessary to meet production demands.
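As a hedged illustration of the resize step, the following Python sketch uses the azure-mgmt-compute SDK to change a VM's size once benchmarking shows it is overprovisioned. The subscription, resource group, VM name, and target size are placeholders; confirm the target size against measured demand and exact SDK model names against the package version you use.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient
from azure.mgmt.compute.models import HardwareProfile, VirtualMachineUpdate

# Placeholder identifiers -- replace with your own values.
SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "migrated-workload-rg"
VM_NAME = "app-vm-01"
TARGET_SIZE = "Standard_D2s_v3"  # size chosen from benchmarking data

compute = ComputeManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

current = compute.virtual_machines.get(RESOURCE_GROUP, VM_NAME)
print(f"Current size: {current.hardware_profile.vm_size}")

# Resizing restarts the machine, so schedule this in a maintenance window.
poller = compute.virtual_machines.begin_update(
    RESOURCE_GROUP,
    VM_NAME,
    VirtualMachineUpdate(hardware_profile=HardwareProfile(vm_size=TARGET_SIZE)),
)
poller.result()
```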
Next steps
After a workload has been tested and optimized, it is time to ready the workload for promotion.
Getting a migrated workload ready for production promotion
Prepare a migrated application for production
promotion
5/12/2020 • 2 minutes to read • Edit Online
After a workload is promoted, production user traffic is routed to the migrated assets. Readiness activities provide
an opportunity to prepare the workload for that traffic. The following are a few business and technology
considerations to help guide readiness activities.
Next steps
After all readiness activities have been completed, it's time to promote the workload.
What is required to promote a migrated resource to production?
What is required to promote a migrated resource to
production?
10/30/2020 • 2 minutes to read • Edit Online
Promotion to production marks the completion of a workload's migration to the cloud. After the asset and all of its
dependencies are promoted, production traffic is rerouted. The rerouting of traffic makes the on-premises assets
obsolete, allowing them to be decommissioned.
The process of promotion varies according to the workload's architecture. However, there are several consistent
prerequisites and a few common tasks. This article describes each and serves as a kind of pre-promotion checklist.
Prerequisite processes
Each of the following processes should be executed, documented, and validated prior to production deployment:
Assess: The workload has been assessed for cloud compatibility.
Architect: The structure of the workload has been properly designed to align with the chosen cloud provider.
Replicate: The assets have been replicated to the cloud environment.
Stage: The replicated assets have been restored in a staged instance of the cloud environment.
Business testing: The workload has been fully tested and validated by business users.
Business change plan: The business has shared a plan for the changes to be made in accordance with the
production promotion; this should include a user adoption plan, changes to business processes, users that
require training, and timelines for various activities.
Ready: Generally, a series of technical changes must be made before promotion.
Next steps
Promotion of a workload signals the completion of a release. However, in parallel with migration, retired assets
need to be decommissioned, taking them out of service.
Decommission retired assets
Decommission retired assets
3/31/2020 • 2 minutes to read • Edit Online
After a workload is promoted to production, the assets that previously hosted the production workload are no
longer required to support business operations. At that point, the older assets are considered retired. Retired assets
can then be decommissioned, reducing operational costs. Decommissioning a resource can be as simple as turning
off the power to the asset and disposing of the asset responsibly. Unfortunately, decommissioning resources can
sometimes have undesired consequences. The following guidance can aid in properly decommissioning retired
resources, with minimal business interruptions.
Continued monitoring
After a migrated workload is promoted, the assets to be retired should continue to be monitored to validate that
no additional production traffic is being routed to the wrong assets.
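Where the retired assets are visible to Azure Monitor (for example, Azure VMs being retired after a workload move, or machines connected through monitoring agents), this check can be automated by querying their network metrics for residual traffic. The Python sketch below uses the azure-monitor-query package; the resource ID, metric name, and aggregation are assumptions, and on-premises assets would use your existing monitoring tooling for the same purpose.

```python
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricsQueryClient

# Hypothetical resource ID of an asset awaiting decommissioning.
RESOURCE_ID = (
    "/subscriptions/<subscription-id>/resourceGroups/legacy-rg"
    "/providers/Microsoft.Compute/virtualMachines/old-app-vm"
)

client = MetricsQueryClient(DefaultAzureCredential())
result = client.query_resource(
    RESOURCE_ID,
    metric_names=["Network In Total"],
    timespan=timedelta(days=7),
    granularity=timedelta(hours=1),
    aggregations=["Total"],
)

# If inbound traffic is effectively zero for the whole window, the asset is a
# safer candidate for decommissioning.
for metric in result.metrics:
    for series in metric.timeseries:
        nonzero = [point.total for point in series.data if point.total]
        print(f"{metric.name}: {len(nonzero)} non-zero hourly samples in the last 7 days")
```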
Next steps
After retired assets are decommissioned, the migration is completed. This creates a good opportunity to improve
the migration process, and a retrospective engages the cloud adoption team in a review of the release in an effort
to learn and improve.
Retrospective
How do retrospectives help build a growth mindset?
10/30/2020 • 2 minutes to read • Edit Online
"Culture eats strategy for breakfast." The best migration plan can easily be undone, if it doesn't have executive
support and encouragement from leadership. Learning, growing, and even failure are at the heart of a growth
mindset. They're also at the heart of any transformation effort.
Humility and curiosity are never more important than during a business transformation. Embracing digital
transformation requires both in ample supply. These traits are strengthened by regular introspection and an
environment of encouragement. When employees are encouraged to take risks, they find better solutions. When
employees are allowed to fail and learn, they succeed. Retrospectives are an opportunity for such investigation and
growth.
Retrospectives reinforce the principles of a growth mindset: experimentation, testing, learning, sharing, growing,
and empowering. They provide a safe place for team members to share the challenges faced in the current sprint.
And they allow the team to discuss and collaborate on ways to overcome those challenges. Retrospectives
empower the team to create sustainable growth.
Retrospective structure
A quick search on any search engine will offer many different approaches and tools for running a retrospective.
Depending on the maturity of the culture and experience level of the team, these could prove useful. However, the
general structure of a retrospective remains roughly the same. During these meetings, each member of the team is
expected to contribute a thought regarding three basic questions:
What went well?
What could have been better?
What did we learn?
Although these questions are simple in nature, they require employees to pause and reflect on their work over the
last iteration. This small pause for introspection is the primary building block of a growth mindset. The humility
and honesty produced when sharing the answers can become infectious well beyond the retrospective meeting
itself.
Lessons learned
Highly effective teams don't just run retrospective meetings. They live retrospective processes. The lessons learned
and shared in these meetings can influence process, shape future work, and help the team execute more
effectively. Lessons learned in a retrospective should help the team grow organically. The primary byproducts of a
retrospective are an increase in experimentation and a refinement of the lessons learned by the team.
That new growth is most tangibly represented in changes to the release or iteration backlog.
The retrospective marks the end of a release or iteration. As teams gain experience and learn lessons, they
adjust the release and iteration backlog to reflect new processes and experiments to be tested. This starts the
next iteration through the migration processes.
Skills readiness for cloud migration
10/30/2020 • 2 minutes to read • Edit Online
During a cloud migration, it is likely that employees, as well as some incumbent systems integration partners or
managed services partners, will need to develop new skills to be effective during migration efforts.
There are four distinct processes that are completed iteratively in the Migrate methodology. The following sections
align the necessary skills for each of those processes with references to two prerequisites for skilling resources.
Next steps
Return to the migration best practices checklist to ensure your migration method is fully aligned.
Migration best practices checklist
Cloud innovation in the Cloud Adoption Framework
10/30/2020 • 2 minutes to read • Edit Online
All IT portfolios contain a few workloads and ideas that could significantly improve a company's position in the
market. Most cloud adoption efforts focus on the migration and modernization of existing workloads. It's
innovation, however, that can provide the greatest business value. Cloud adoption-related innovation can unlock
new technical skills and expanded business capabilities.
This section of the Cloud Adoption Framework focuses on the elements of your portfolio that drive the greatest
return on investment.
To prepare you for this phase of the cloud adoption lifecycle, the framework suggests the following exercises:
Innovation summary
The considerations overview establishes a common language for innovation across application development,
DevOps, IT, and business teams. The following approach builds on existing lean methodologies. It's designed to
help you create a cloud-focused conversation about customer adoption and a scientific model for creating
business value. The approach also maps existing Azure services to manageable decision processes. This alignment
can help you find the right technical options to address specific customer needs or hypotheses.
Suggested skills
Readiness for the new skills and responsibilities that come with cloud adoption doesn't come easily. Microsoft
Learn provides a rewarding approach to hands-on learning that helps you achieve your goals faster. Earn points
and levels, and achieve more!
Here are a couple of examples of role-specific learning paths on Microsoft Learn that align with the Innovate
methodology of the Cloud Adoption Framework.
Administer containers in Azure: Azure Container Instances are the quickest and easiest way to run containers in
Azure. This learning path will teach you how to create and manage your containers, and how you can use Azure
Container Instances to provide elastic scale for Kubernetes.
Create serverless applications: Azure Functions enable the creation of event-driven, compute-on-demand systems
that can be triggered by various external events. Learn to use functions to execute server-side logic and build
serverless architectures.
To discover additional learning paths, browse the Microsoft Learn catalog. Use the Roles filter to align learning
paths with your role.
Azure innovation guide overview
10/30/2020 • 2 minutes to read • Edit Online
NOTE
This guide provides a starting point for innovation guidance in the Cloud Adoption Framework. It is also available in the
Azure Quickstart Center.
Before you start developing innovative solutions by using Azure services, you need to prepare your environment,
which includes preparing to manage customer feedback loops. In this guide, we introduce features that help you
engage customers, build solutions, and drive adoption. For more information, best practices, and considerations
related to preparing your cloud environment, see the Cloud Adoption Framework innovate section.
In this guide, you'll learn how to:
Manage customer feedback : Set up tools and processes to manage the build-measure-learn feedback loop
by using GitHub and Azure DevOps.
Democratize data: Data alone might be enough to drive innovative solutions to your customers. Deploy
common data options in Azure.
Engage via applications: Some innovation requires an engaging experience. Use cloud-native application
platforms to create engaging experiences.
Empower adoption: Invention is great, but a plan to reduce friction is needed to empower and scale adoption.
Deploy a foundation for CI/CD, DevOps, and other adoption enablers.
Interact through devices: Create ambient experiences to bring your applications and data closer to the
customers' point of need. IoT, mixed reality, and mobile experiences are easier with Azure.
Predict and influence: Find patterns in data. Put those patterns to work to predict and influence customer
behaviors by using Azure-based predictive analytics tools.
TIP
For an interactive experience, view this guide in the Azure portal. Go to the Azure Quickstart Center in the Azure portal,
select Azure innovation guide , and then follow the step-by-step instructions.
Next steps:
Prepare for innovation with a shared repository and ideation management tools
This guide provides interactive steps that let you try features as they're introduced. To come back to where you left
off, use the breadcrumb for navigation.
Prepare for customer feedback
10/30/2020 • 3 minutes to read • Edit Online
User adoption, engagement, and retention are key to successful innovation. Why?
Building an innovative new solution isn't about giving users what they want or think they want. It's about the
formulation of a hypothesis that can be tested and improved upon. That testing comes in two forms:
Quantitative (testing feedback): This feedback measures the actions we hope to see.
Qualitative (customer feedback): This feedback tells us what those metrics mean in the customer's voice.
Before you integrate feedback loops, you need to have a shared repository for your solution. A centralized repo will
provide a way to record and act on all the feedback coming in about your project. GitHub is the home for open
source software. It's also one of the most commonly used platforms for hosting source code repositories for
commercially developed applications. The article on building GitHub repositories can help you get started with
your repo.
Each of the following tools in Azure integrates with (or is compatible with) projects hosted in GitHub:
Quantitative feedback for web apps
Quantitative feedback for APIs
Qualitative feedback
Close the loop with pipelines
Application Insights is a monitoring tool that provides near-real-time quantitative feedback on the usage of your
application. This feedback can help you test and validate your current hypothesis to shape the next feature or user
story in your backlog.
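If your application doesn't emit telemetry yet, instrumenting it can be as light as attaching an Application Insights exporter to standard Python logging. The sketch below uses the opencensus-ext-azure package; the connection string and custom dimensions are placeholders, and richer request or dependency tracking would use the full SDK or auto-instrumentation.

```python
import logging

from opencensus.ext.azure.log_exporter import AzureLogHandler

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)

# Placeholder connection string from your Application Insights resource.
logger.addHandler(AzureLogHandler(
    connection_string="InstrumentationKey=00000000-0000-0000-0000-000000000000"))

# Custom dimensions become queryable properties that you can pin to the
# application dashboard to track a specific hypothesis.
logger.info(
    "checkout_completed",
    extra={"custom_dimensions": {"experiment": "new-cart-flow", "cart_value": 42}},
)
```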
Action
To view quantitative data on your applications:
1. Go to Application Insights.
If your application doesn't appear in the list, select Add and follow the prompts to start configuring
Application Insights.
If the desired application is in the list, select it.
2. The overview pane includes some statistics on the application. Select Application dashboard to build a
custom dashboard for data that's more relevant to your hypothesis.
One of the first steps in democratizing data is to enhance data discoverability. Cataloging and managing data
sharing can help enterprises get the most value from their existing information assets. A data catalog makes data
sources easy to discover and understand by the users who manage the data. Azure Data Catalog enables
management inside an enterprise, whereas Azure Data Share enables management and sharing outside the
enterprise.
Azure services that provide data processing, like Azure Time Series Insights and Stream Analytics, are other
capabilities that customers and partners are successfully using for their innovation needs.
Catalog
Share
Insights
Azure Data Catalog
Azure Data Catalog addresses the discovery challenges of data consumers and enables data producers who
maintain information assets. It bridges the gap between IT and the business, allowing everyone to contribute their
insights. You can store your data where you want it and connect with the tools you want to use. With Azure Data
Catalog, you can control who can discover registered data assets. You can integrate into existing tools and
processes by using open REST APIs.
Register
Search and annotate
Connect and manage
Go to the Azure Data Catalog documentation
Action
You can use only one Azure Data Catalog per organization. If a catalog has already been created for your
organization, you can't add more catalogs.
To create a catalog for your organization:
1. Go to Azure Data Catalog.
2. Select Create.
Engage customers through applications
10/30/2020 • 9 minutes to read • Edit Online
Innovation with applications includes both modernizing your existing applications that are hosted on-premises and
building cloud-native applications by using containers or serverless technologies. Azure provides PaaS services like
Azure App Service to help you easily modernize your existing web and API apps written in .NET, .NET Core, Java,
Node.js, Ruby, Python, or PHP for deployment in Azure.
With an open-standard container model, building microservices or containerizing your existing applications and
deploying them on Azure is simple when you use managed services like Azure Kubernetes Service, Azure Container
Instances, and Web App for Containers. Serverless technologies like Azure Functions and Azure Logic Apps use a
consumption model (pay for what you use) and help you focus on building your application rather than deploying
and managing infrastructure.
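As a minimal illustration of the consumption model, here is a hypothetical HTTP-triggered Azure Function written with the Python programming model. The function name, payload fields, and response are placeholders, and a real project would also include the function.json binding configuration or the v2 decorator-based equivalent.

```python
import json
import logging

import azure.functions as func


def main(req: func.HttpRequest) -> func.HttpResponse:
    """HTTP-triggered function: it runs only when called, so you pay per execution."""
    logging.info("Processing a feedback submission.")

    try:
        feedback = req.get_json()
    except ValueError:
        return func.HttpResponse("Send feedback as a JSON body.", status_code=400)

    # In a real solution this might enqueue the feedback for the
    # build-measure-learn loop (for example, through an output binding).
    return func.HttpResponse(
        json.dumps({"received": feedback.get("message", "")}),
        mimetype="application/json",
        status_code=200,
    )
```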
Deliver value faster
Create cloud-native applications
Isolate points of failure
One of the advantages of cloud-based solutions is the ability to gather feedback faster and start delivering value to
your user. Whether that user is an external customer or a user in your own company, the faster you can get
feedback on your applications, the better.
Azure App Service
Azure App Service provides a hosting environment for your applications that removes the burden of infrastructure
management and OS patching. It automatically scales to meet the demands of your users, within limits that you
define to keep costs in check.
Azure App Service provides first-class support for languages like ASP.NET, ASP.NET Core, Java, Ruby, Node.js, PHP,
and Python. If you need to host another runtime stack, Web App for Containers lets you quickly and easily host a
Docker container within App Service, so you can host your custom code stack in an environment that gets you out
of the server business.
Action
To configure or monitor Azure App Service deployments:
1. Go to App Services.
2. Configure a new service: select Add and follow the prompts.
3. Manage existing services: select the desired application from the list of hosted applications.
Azure DevOps
During your innovation journey, you'll eventually find yourself on the path to DevOps. Microsoft has long had an
on-premises product known as Team Foundation Server (TFS). During our own innovation journey, Microsoft
developed Azure DevOps, a cloud-based service that provides build and release tools supporting many languages
and destinations for your releases. For more information, see Azure DevOps.
Visual Studio App Center
As mobile apps continue to grow in popularity, the need for a platform that can provide automated testing on real
devices of various configurations grows. Visual Studio App Center not only provides a place where you can test
your applications across iOS, Android, Windows, and macOS, it also provides a monitoring platform that can use
Azure Application Insights to analyze your telemetry quickly and easily. For more information, see Visual Studio
App Center.
Visual Studio App Center also provides a notification service that lets you use a single call to send notifications to
your application across platforms without having to contact each notification service individually. For more
information, see Visual Studio App Center Push (ACP).
Learn more
App Service overview
Web App for Containers: Run a custom container
Introduction to Azure Functions
Azure for .NET and .NET Core developers
Azure SDK for Python documentation
Azure for Java cloud developers
Create a PHP web app in Azure
Azure SDK for JavaScript documentation
Azure SDK for Go documentation
DevOps solutions
Empower adoption
10/30/2020 • 7 minutes to read • Edit Online
You know that innovation is critical to business success. You don't accomplish innovation solely through the
introduction of new technologies. You need to focus on supporting the people who catalyze change and create the
new value that you seek. Developers are at the center of digital transformation, and to empower them to achieve
more, you need to accelerate developer velocity. To unleash the creative energy of developer teams, you need to
help them build productively, foster global and secure collaboration, and remove barriers so they can scale
innovation.
Generate value
In every industry, every organization is trying to do one thing: drive constant value generation.
The focus on innovation is essentially a process to help your organization find new ways to generate value.
Perhaps the biggest mistake organizations make is trying to create new value by introducing new technologies.
Sometimes the attitude is "if we just use more technology, we'll see things improve." But innovation is first and
foremost a people story.
Innovation is about the combination of people and technology.
Organizations that successfully innovate see vision, strategy, culture, unique potential, and capabilities as the
foundational elements. They then turn to technology with a specific purpose in mind. Every company is becoming a
software company. The hiring of software engineers is growing at a faster rate outside the tech industry than inside,
according to LinkedIn data.
Innovation is accomplished when organizations support their people to create the value they seek. One group of
those people, developers, is a catalyst for innovation. They play an increasingly vital role in value creation and
growth across every industry. They're the builders of our era, writing the world's code and sitting at the heart of
innovation. Innovative organizations build a culture that empowers developers to achieve more.
Developer productivity
Innovate collaboratively
Innovation characteristics
LiveOps innovation
Developer velocity
Empowering developers to invent means accelerating developer velocity, enabling them to create more, innovate
more, and solve more problems. Developer velocity is the underpinning of each organization's tech intensity.
Developer velocity isn't just about speed. It's also about unleashing developer ingenuity, turning your developers'
ideas into software with speed and agility so that innovative solutions can be built. The differentiated Azure solution
is uniquely positioned to unleash innovation in your organization.
Build productively
There are several areas of opportunity where Azure can help you build productively:
Ensure developers become and stay proficient in their domain by helping them advance their knowledge.
Hone the right skills by giving them the right tools.
One of the best ways to improve your developers' skills is by giving them tools they know and love. Azure tools
meet developers where they are today and introduce them to new technologies in the context of the code they're
writing. With the Azure commitment to open-source software and support for all languages and frameworks in
Azure tools, your developers can build how they want and deploy where you want.
Azure DevOps provides best-in-class tools for every developer. Azure developer services infuse modern
development practices and emerging trends into our tools. With the Azure platform, developers have access to the
latest technologies and a cutting-edge toolchain that supports the way they work.
AI-assisted development tools
Integrated tools and cloud
Remote development and pair programming
Go to Azure DevOps documentation
Action
To create a DevOps project:
1. Go to Azure DevOps Projects.
2. Select Create DevOps project.
3. Select the Runtime, Framework, and Service.
Interact through devices
10/30/2020 • 5 minutes to read • Edit Online
Innovate through intermittently connected and perceptive edge devices. Orchestrate millions of such devices,
acquire and process limitless data, and take advantage of a growing number of multisensory, multidevice
experiences. For devices at the edge of your network, Azure provides a framework for building immersive and
effective business solutions. With ubiquitous computing, enabled by Azure combined with AI technology, you can
build every type of intelligent application and system you can envision.
Azure customers employ a continually expanding set of connected systems and devices that gather and analyze
data (close to their users, the data, or both). Users get real-time insights and experiences, delivered by highly
responsive and contextually aware applications. By moving parts of the workload to the edge, these devices can
spend less time sending messages to the cloud and react more quickly to spatial events.
Industrial assets
Microsoft HoloLens 2
Azure Sphere
Azure Kinect DK
Drones
Azure SQL Edge
IoT plug and play
Global scale IoT service
Azure Digital Twins
Location intelligence
Spatial experiences
Azure Remote Rendering
Architect solutions that exercise bidirectional communication with IoT devices at the scale of billions of devices. Use out-of-box,
device-to-cloud telemetry data to understand the state of your devices and define message routes to other Azure
services just through configuration. By taking advantage of cloud-to-device messages, you can reliably send
commands and notifications to your connected devices and track message delivery with acknowledgment receipts.
And you'll automatically resend device messages as needed to accommodate intermittent connectivity.
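As a small, hedged example of device-to-cloud telemetry, the following Python sketch uses the azure-iot-device SDK to send a message that IoT Hub can then route to other services through configuration. The connection string, payload fields, and message properties are placeholders.

```python
import json

from azure.iot.device import IoTHubDeviceClient, Message

# Placeholder device connection string from your IoT hub.
CONNECTION_STRING = (
    "HostName=<hub>.azure-devices.net;DeviceId=<device>;SharedAccessKey=<key>"
)

client = IoTHubDeviceClient.create_from_connection_string(CONNECTION_STRING)
client.connect()

# Device-to-cloud telemetry; message properties can drive routing rules in IoT Hub.
message = Message(json.dumps({"temperature": 21.7, "humidity": 0.43}))
message.content_type = "application/json"
message.content_encoding = "utf-8"
message.custom_properties["alert"] = "false"

client.send_message(message)
client.shutdown()
```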
Here are a few features you'll find:
Security-enhanced communication channel for sending and receiving data from IoT devices.
Built-in device management and provisioning to connect and manage IoT devices at scale.
Full integration with Event Grid and serverless compute, simplifying IoT application development.
Compatibility with Azure IoT Edge for building hybrid IoT applications.
Learn more
Azure IoT Hub
Azure IoT Hub Device Provisioning Service (DPS)
Action
To create an IoT hub:
1. Go to IoT Hub.
2. Select Create IoT hub.
The Azure IoT Hub Device Provisioning Service is a helper service for Azure IoT Hub that enables zero-touch, just-
in-time provisioning.
Action
To create an Azure IoT Hub Device Provisioning Service:
1. Go to Device Provisioning Services.
2. Select Add.
Innovate with AI
10/30/2020 • 3 minutes to read • Edit Online
As an innovator, your company has rich information about its business and its customers. Using AI, your company
can:
Make predictions about customer needs.
Automate business processes.
Discover information that's latent in unstructured data.
Engage with customers in new ways to deliver better experiences.
This article introduces a few approaches to innovating with AI. The following table can help you find the best
solution for your implementation needs.
Machine learning
Azure provides advanced machine learning capabilities. Build, train, and deploy your machine learning models
across the cloud and edge by using Azure Machine Learning. Develop models faster by using automated machine
learning. Use tools and frameworks of your choice without being locked in.
For more information, see Azure Machine Learning overview and getting started with your first machine learning
experiment. For more information on the open source model format and runtime for machine learning, see ONNX
Runtime.
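To make the workflow concrete, here is a small hedged sketch using the Azure Machine Learning Python SDK (v1, azureml-core) to connect to a workspace and log metrics for an experiment run. The workspace configuration, experiment name, and metric values are placeholders, and newer projects may prefer the v2 azure-ai-ml SDK.

```python
from azureml.core import Experiment, Workspace

# Assumes a config.json downloaded from your workspace (placeholder setup).
ws = Workspace.from_config()

experiment = Experiment(workspace=ws, name="churn-prediction")
run = experiment.start_logging()

# ...train a model with the framework of your choice...
accuracy = 0.87  # illustrative value from your own evaluation step

run.log("accuracy", accuracy)
run.complete()
print(f"Logged run {run.id} to experiment {experiment.name}")
```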
Action
A data scientist can use Azure Machine Learning to train and build a model by using advanced languages such as
Python and R, as well as by using a drag-and-drop visual experience. To get started with Azure Machine Learning:
1. In the Azure portal, search for and select Machine Learning.
2. Select Add, and follow the steps in the portal to create a workspace.
3. The new workspace provides both low-code and code-driven approaches for data scientists to train, build,
deploy, and manage models.
Knowledge mining
Use Azure Cognitive Search to uncover latent insights from your content, including documents, images, and media.
You can discover patterns and relationships in your content, understand sentiment, and extract key phrases.
Azure Cognitive Search uses the same natural language stack that Bing and Microsoft Office use. Spend more time
innovating and less time maintaining a complex cloud search solution.
For more information, see What is Azure Cognitive Search?
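For a sense of the developer surface, the sketch below queries an existing search index with the azure-search-documents Python SDK. The endpoint, key, index name, and field names are placeholders that depend on your own index schema.

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# Placeholder service details -- replace with your own search service and index.
client = SearchClient(
    endpoint="https://<service-name>.search.windows.net",
    index_name="documents-index",
    credential=AzureKeyCredential("<query-api-key>"),
)

# Full-text query over content enriched at indexing time (key phrases, sentiment, and so on).
results = client.search(search_text="contract renewal terms", top=5)

for doc in results:
    # 'title' is an assumed field name; substitute a field from your index schema.
    print(doc.get("title"), "-", doc["@search.score"])
```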
Action
To get started:
1. In the Azure portal, search for and select Azure Cognitive Search.
2. Follow the steps in the portal to provision the service.
Review a prescriptive framework that includes the tools, programs, and content (best practices, configuration
templates, and architecture guidance) to simplify adoption of Kubernetes and cloud-native practices at scale.
The list of required actions is categorized by persona to drive a successful deployment of applications on
Kubernetes, from proof of concept to production, then scaling and optimization.
Get started
To prepare for this phase of the cloud adoption lifecycle, use the following exercises:
Application development and deployment: Examine patterns and practices of application development,
configure CI/CD pipelines, and implement site reliability engineering (SRE) best practices.
Cluster design and operations: Identify considerations for cluster configuration and network design. Ensure future scalability by
automating infrastructure provisioning. Maintain high availability by planning for business continuity and
disaster recovery.
Cluster and application security: Familiarize yourself with Kubernetes security essentials. Review the secure
setup for clusters and application security guidance.
Application development and deployment
10/30/2020 • 4 minutes to read • Edit Online
Examine patterns and practices of application development, configure Azure Pipelines, and implement site
reliability engineering (SRE) best practices.
Prepare your development environment. Configure your Working with Docker in Visual Studio Code
environment with the tools you need to create containers and Working with Kubernetes in Visual Studio Code
set up your development workflow. Introduction to Azure Dev Spaces
Containerize your application. Familiarize yourself with Containerize your applications with Docker and Kubernetes
the end-to-end Kubernetes development experience, (e-book)
including application scaffolding, inner-loop workflows, End-to-end Kubernetes development experience on Azure
application-management frameworks, CI/CD pipelines, log (webinar)
aggregation, monitoring, and application metrics.
Review common Kubernetes scenarios. Kubernetes is Common scenarios to use Kubernetes (video)
often thought of as a platform for delivering microservices,
but it's becoming a much broader platform. Watch this video
to learn about common Kubernetes scenarios such as batch
analytics and workflow.
Prepare your application for Kubernetes. Prepare your Project design and layout for successful Kubernetes
application file system layout for Kubernetes and organize for applications (webinar)
weekly or daily releases. Learn how the Kubernetes How Kubernetes deployments work (video)
deployment process enables reliable, zero-downtime Go through an AKS workshop
upgrades.
Manage application storage. Understand the performance The basics of stateful applications in Kubernetes (video)
needs and access methods for pods so that you can provide State and data in Docker applications
the appropriate storage options. You should also plan for Storage options in Azure Kubernetes Service
ways to back up and test the restore process for attached
storage.
Manage application secrets. Don't store credentials in How Kubernetes and configuration management work (video)
your application code. A key vault should be used to store Understand secrets management in Kubernetes (video)
and retrieve keys and credentials. Using Azure Key Vault with Kubernetes
Use Azure AD pod identity to authenticate and access Azure
resources
Configure readiness and liveness health checks. Liveness and readiness checks
Kubernetes uses readiness and liveness checks to know when
to your application is ready to receive traffic and when it
needs to be restarted. Without defining such checks,
Kubernetes will not be able to determine if your application is
up and running.
Define resource requirements for the application. A primary way to manage the compute resources within a Kubernetes cluster is using pod requests and limits. These requests and limits tell the Kubernetes scheduler what compute resources a pod should be assigned.
Resources: Define pod resource requests and limits
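For illustration, a hedged sketch of requests and limits expressed with the Kubernetes Python client (the values and image name are assumptions, not recommendations):

```python
# Hypothetical container spec with resource requests and limits.
from kubernetes import client

resources = client.V1ResourceRequirements(
    requests={"cpu": "250m", "memory": "256Mi"},  # what the scheduler reserves for the pod
    limits={"cpu": "500m", "memory": "512Mi"},    # hard ceiling enforced at runtime
)

api_container = client.V1Container(
    name="api",
    image="myregistry.azurecr.io/api:1.0",  # placeholder image
    resources=resources,
)
```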
Deploy applications using an automated pipeline and DevOps. The full automation of all steps between code commit and production deployment allows teams to focus on building code and removes the overhead and potential human error of mundane manual steps. Deploying new code is quicker and less risky, helping teams become more agile, more productive, and more confident about their running code.
Resources: Evolve your DevOps practices; Setting up a Kubernetes build pipeline (video); Deployment Center for Azure Kubernetes Service; GitHub Actions for deploying to Azure Kubernetes Service; CI/CD to Azure Kubernetes Service with Jenkins
Deploy an API gateway. An API gateway serves as an entry point to microservices, decouples clients from your microservices, adds an additional layer of security, and decreases the complexity of your microservices by removing the burden of handling cross-cutting concerns.
Resources: Use Azure API Management with microservices deployed in Azure Kubernetes Service
Deploy a service mesh. A service mesh provides capabilities like traffic management, resiliency, policy, security, strong identity, and observability to your workloads. Your application is decoupled from these operational capabilities; the service mesh moves them out of the application layer and down to the infrastructure layer.
Resources: How service meshes work in Kubernetes (video); Learn about service meshes; Use Istio with Azure Kubernetes Service; Use Linkerd with Azure Kubernetes Service; Use Consul with Azure Kubernetes Service
Implement site reliability engineering (SRE) practices. Site reliability engineering (SRE) is a proven approach to maintaining crucial system and application reliability while iterating at the speed demanded by the marketplace.
Resources: Introduction to site reliability engineering (SRE); DevOps at Microsoft: Game streaming SRE
Cluster design and operations
10/30/2020 • 3 minutes to read • Edit Online
Identify considerations for cluster configuration and network design. Ensure future scalability by automating infrastructure
provisioning. Maintain high availability by planning for business continuity and disaster recovery.
Identify network design considerations. Understand cluster network design considerations, compare network models, and choose the Kubernetes networking plug-in that fits your needs. For Azure Container Networking Interface (CNI), consider the number of IP addresses required as a multiple of the maximum pods per node (default of 30) and the number of nodes, plus one additional node required during upgrades. When choosing load balancer services, consider using an ingress controller when there are many services, to reduce the number of exposed endpoints. For Azure CNI, the service CIDR has to be unique across the virtual network and all connected virtual networks to ensure appropriate routing.
Resources: Kubenet and Azure Container Networking Interface (CNI); Use kubenet networking with your own IP address ranges in Azure Kubernetes Service (AKS); Configure Azure CNI networking in Azure Kubernetes Service (AKS); Secure network design for an AKS cluster
Create multiple node pools. To support applications that have different compute or storage demands, you can optionally configure your cluster with multiple node pools. For example, use additional node pools to provide GPUs for compute-intensive applications or access to high-performance SSD storage.
Resources: Create and manage multiple node pools for a cluster in Azure Kubernetes Service
Decide on availability requirements. A minimum of two pods behind Azure Kubernetes Service ensures high availability of your application in case of pod failures or restarts. Use three or more pods to handle load during pod failures and restarts. For the cluster configuration, a minimum of two nodes in an availability set or virtual machine scale set is required to meet the service-level agreement of 99.95%. Use at least three nodes to ensure pod scheduling during node failures and reboots. To provide a higher level of availability to your applications, clusters can be distributed across availability zones. These zones are physically separate datacenters within a given region. When the cluster components are distributed across multiple zones, your cluster can tolerate a failure in one of those zones. Your applications and management operations remain available even if an entire datacenter experiences an outage.
Resources: Create an Azure Kubernetes Service (AKS) cluster that uses availability zones
Automate cluster provisioning. With infrastructure as code, you can automate infrastructure provisioning to provide more resiliency during disasters and gain agility to quickly redeploy the infrastructure as needed.
Resources: Create a Kubernetes cluster with Azure Kubernetes Service using Terraform
Plan for availability using pod disruption budgets. To maintain the availability of applications, define pod disruption budgets (PDBs) to ensure that a minimum number of pods are available in the cluster during hardware failures or cluster upgrades.
Resources: Plan for availability using pod disruption budgets
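As a rough sketch of what such a budget can look like (an illustrative assumption using the Kubernetes Python client with the policy/v1 API available in recent client versions; the label, namespace, and threshold are placeholders):

```python
# Hypothetical PodDisruptionBudget keeping at least two "app=web" pods available
# during voluntary disruptions such as node upgrades.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() when running inside the cluster

pdb = client.V1PodDisruptionBudget(
    metadata=client.V1ObjectMeta(name="web-pdb", namespace="production"),
    spec=client.V1PodDisruptionBudgetSpec(
        min_available=2,
        selector=client.V1LabelSelector(match_labels={"app": "web"}),
    ),
)

client.PolicyV1Api().create_namespaced_pod_disruption_budget(namespace="production", body=pdb)
```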
Enforce resource quotas on namespaces. Plan and apply resource quotas at the namespace level. Quotas can be set on compute resources, storage resources, and object count.
Resources: Enforce resource quotas
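A minimal sketch of a namespace quota, again using the Kubernetes Python client (the namespace name and limits are illustrative assumptions):

```python
# Hypothetical ResourceQuota limiting compute and object count in a "team-a" namespace.
from kubernetes import client, config

config.load_kube_config()

quota = client.V1ResourceQuota(
    metadata=client.V1ObjectMeta(name="team-a-quota"),
    spec=client.V1ResourceQuotaSpec(
        hard={
            "requests.cpu": "4",
            "requests.memory": "8Gi",
            "limits.cpu": "8",
            "limits.memory": "16Gi",
            "pods": "20",
        }
    ),
)

client.CoreV1Api().create_namespaced_resource_quota(namespace="team-a", body=quota)
```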
Plan for business continuity and disaster recovery. Plan for multiregion deployment, create a storage migration plan, and enable geo-replication for container images.
Resources: Best practices for region deployments; Azure Container Registry geo-replication
Configure monitoring and troubleshooting at scale. Set up alerting and monitoring for applications in Kubernetes. Learn about the default configuration, how to integrate more advanced metrics, and how to add your own custom monitoring and alerting to reliably operate your application.
Resources: Get started with monitoring and alerting for Kubernetes (video); Configure alerts using Azure Monitor for containers; Review diagnostic logs for master components; Azure Kubernetes Service (AKS) diagnostics
Cluster and application security
10/30/2020 • 3 minutes to read • Edit Online
Familiarize yourself with Kubernetes security essentials and review the secure setup for clusters and application
security guidance.
Familiarize yourself with the security essentials white paper. The primary goals of a secure Kubernetes environment are ensuring that the applications it runs are protected, that security issues can be identified and addressed quickly, and that future similar issues will be prevented.
Resources: The definitive guide to securing Kubernetes (white paper)
Review the security hardening setup for the cluster nodes. A security-hardened host OS reduces the attack surface and allows you to deploy containers securely.
Resources: Security hardening in AKS virtual machine hosts
Set up cluster role-based access control (RBAC). This control mechanism lets you assign users, or groups of users, permission to do things like create or modify resources, or view logs from running application workloads.
Resources: Understand role-based access control (RBAC) in Kubernetes (video); Integrate Azure AD with Azure Kubernetes Service; Limit access to cluster configuration file
Control access to clusters using group membership. Configure Kubernetes role-based access control (RBAC) to limit access to cluster resources based on user identity or group membership.
Resources: Control access to clusters using RBAC and Azure AD groups
Create a secrets management policy. Securely deploy and manage sensitive information, such as passwords and certificates, using secrets management in Kubernetes.
Resources: Understand secrets management in Kubernetes (video)
Secure intra-pod network traffic with network policies. Apply the principle of least privilege to control network traffic flow between pods in the cluster.
Resources: Secure intra-pod traffic with network policies
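For illustration only, a common least-privilege starting point expressed with the Kubernetes Python client (an assumption, not part of the original guidance; the namespace is a placeholder): a default-deny ingress policy that you would follow with explicit allow rules per workload.

```python
# Hypothetical default-deny ingress NetworkPolicy for the "production" namespace.
from kubernetes import client, config

config.load_kube_config()

policy = client.V1NetworkPolicy(
    metadata=client.V1ObjectMeta(name="default-deny-ingress", namespace="production"),
    spec=client.V1NetworkPolicySpec(
        pod_selector=client.V1LabelSelector(),  # an empty selector matches every pod in the namespace
        policy_types=["Ingress"],               # no ingress rules listed, so all inbound traffic is denied
    ),
)

client.NetworkingV1Api().create_namespaced_network_policy(namespace="production", body=policy)
```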
Restrict access to the API server using authorized IPs. Improve cluster security and minimize the attack surface by limiting access to the API server to a limited set of IP address ranges.
Resources: Secure access to the API server
Restrict cluster egress traffic. Learn what ports and addresses to allow if you restrict egress traffic for the cluster. You can use Azure Firewall or a third-party firewall appliance to secure your egress traffic and define these required ports and addresses.
Resources: Control egress traffic for cluster nodes in AKS
Secure traffic with Web Application Firewall (WAF). Use Azure Application Gateway as an ingress controller for Kubernetes clusters.
Resources: Configure Azure Application Gateway as an ingress controller
Apply security and kernel updates to worker nodes. Understand the AKS node update experience. To protect your clusters, security updates are automatically applied to Linux nodes in AKS. These updates include OS security fixes or kernel updates. Some of these updates require a node reboot to complete the process.
Resources: Use kured to automatically reboot nodes to apply updates
Configure a container and cluster scanning solution. Scan containers pushed into Azure Container Registry and gain deeper visibility into your cluster nodes, cloud traffic, and security controls.
Resources: Azure Container Registry integration with Security Center; Azure Kubernetes Service integration with Security Center
Enforce cluster governance policies. Apply at-scale enforcement and safeguards on your clusters in a centralized, consistent manner.
Resources: Control deployments with Azure Policy
Rotate cluster certificates periodically. Kubernetes uses certificates for authentication with many of its components. You might want to periodically rotate those certificates for security or policy reasons.
Resources: Rotate certificates in Azure Kubernetes Service (AKS)
AI in the Cloud Adoption Framework
10/30/2020 • 2 minutes to read • Edit Online
Review a prescriptive framework that includes the tools, programs, and content (best practices, configuration
templates, and architecture guidance) to simplify adoption of AI and cloud-native practices at scale.
The list of required actions is categorized by persona to drive a successful deployment of AI in applications, from
proof of concept to production, then scaling and optimization.
Get started
To prepare for this phase of the cloud adoption lifecycle, use the following exercises:
Machine Learning model development, deployment, and management: Examine patterns and practices of
building your own machine learning models, including MLOps, automated machine learning (AutoML), and
Responsible ML tools such as InterpretML and Fairlearn.
Adding domain-specific AI models to your applications: Learn about best practices for adding AI capabilities into
your applications with Cognitive Services. Also learn about the key scenarios these services help you address.
Build your own conversational AI solution: Learn how to build your own Virtual Assistant, a conversational AI
application that can understand language and speech, perceive vast amounts of information, and respond
intelligently.
Build AI-driven knowledge mining solutions: Learn how to use knowledge mining to extract structured data
from your unstructured content and discover actionable information across your organization's data.
Machine learning
10/30/2020 • 2 minutes to read • Edit Online
Azure empowers you with the most advanced machine learning capabilities. Quickly and easily build, train, and
deploy your machine learning models by using Azure Machine Learning. Machine Learning can be used for any
kind of machine learning, from classical to deep, supervised, and unsupervised learning. Whether you prefer to
write Python or R code, or use zero-code or low-code options such as the designer, you can build, train, and track
highly accurate machine learning and deep learning models in a Machine Learning workspace.
You can even start training on your local machine and then scale out to the cloud. The service also interoperates
with popular open-source deep learning and reinforcement learning tools such as PyTorch, TensorFlow, scikit-learn, Ray, and RLlib.
Get started with Machine Learning. You'll find a tutorial on how to get started with your first machine learning
experiment. To learn more about the open-source model format and runtime for machine learning, see ONNX
Runtime.
Common scenarios for machine learning solutions include:
Predictive maintenance
Inventory management
Fraud detection
Demand forecasting
Intelligent recommendations
Sales forecasting
Checklist
Get started by first familiarizing yourself with Machine Learning, and then choose which
experience to begin with. You can follow along with steps to use a Jupyter notebook with Python, the visual
drag-and-drop experience, or automated machine learning (AutoML). A minimal SDK sketch follows the resource list below.
Machine Learning overview
Create your first machine learning experiment with a Jupyter notebook with Python
Visual drag-and-drop experiments
Use the AutoML UI
Configure your dev environment
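The following sketch is an assumption for illustration, using the Azure Machine Learning SDK for Python (v1, azureml-core); the workspace configuration, experiment name, and metric are placeholders. It shows the basic shape of a first experiment run:

```python
# Hypothetical first experiment: connect to a workspace, start a run, and log a metric.
from azureml.core import Workspace, Experiment

ws = Workspace.from_config()  # reads the config.json downloaded from your workspace
experiment = Experiment(workspace=ws, name="first-experiment")

run = experiment.start_logging()
run.log("accuracy", 0.87)  # log any metric you compute during training
run.complete()
```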
Experiment with more advanced tutorials to predict taxi fees, classify images, and build a pipeline for
batch scoring.
Use AutoML to predict taxi fees
Classify images with scikit-learn
Build a Machine Learning pipeline for batch scoring
Follow along with video tutorials to learn more about the benefits of Machine Learning, such as no-
code model building, MLOps, ONNX Runtime, model interpretability and transparency, and more.
What's new with Machine Learning
Use AutoML to build models
Build zero-code models with Machine Learning designer
MLOps for managing the end-to-end lifecycle
Incorporating ONNX Runtime into your models
Model interpretability and transparency
Building models with R
Review reference architectures for AI solutions
Next steps
Explore other AI solution categories:
AI applications and agents
Knowledge mining
AI applications and agents
10/30/2020 • 4 minutes to read • Edit Online
Infusing AI into an application can be difficult and time-consuming. Until recently, you needed both a deep
understanding of machine learning and months of development to acquire data, train models, and deploy them at
scale. Even then, success was not guaranteed. The path was filled with blockers, gotchas, and pitfalls causing teams
to fail to realize value from their AI investments.
AI applications
Microsoft Azure Cognitive Services remove these challenges and are designed to be productive, enterprise-ready,
and trusted. They make it possible for you to build on the latest breakthroughs in AI without building and
deploying your own models. Instead, you can deploy AI models using just a few simple lines of code, so that even
without a large data science team, you can quickly create applications that see, hear, speak, understand, and even
begin to reason.
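As a hedged illustration of the "few lines of code" idea (not an excerpt from the guidance), the following Python sketch calls the Text Analytics capability with the azure-ai-textanalytics client; the endpoint, key, and sample text are placeholders.

```python
# Hypothetical sentiment analysis call against your own Cognitive Services resource.
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

client = TextAnalyticsClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

documents = ["The setup was quick and the support team was fantastic."]
result = client.analyze_sentiment(documents)[0]
print(result.sentiment, result.confidence_scores.positive)
```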
Common scenarios for AI applications include:
Sentiment analysis
Object detection
Translation
Personalization
Robotic process automation
As you get started, the checklist and resources below will help you plan your application development and
deployment.
Are you familiar with the multitude of capabilities and services offered within Azure Cognitive Services, and
do you know which ones in particular you will be using?
Determine whether or not you have custom data with which you want to train and customize these models.
There are Cognitive Services that are customizable.
There are several ways to use Azure Cognitive Services. Explore the quickstart tutorials for getting up and
running for both SDK and REST APIs. Note: the Cognitive Services SDKs are available for many popular dev
languages, including C#, Python, Java, JavaScript and Go.
Determine if you will need to deploy these Cognitive Services in containers.
AI applications checklist
To get started, first familiarize yourself with the various categories and services within Azure Cognitive Services.
Visit the product pages to learn more and to interact with demos to learn more about the capabilities available,
such as vision, speech, language, and decision. There is also an e-book that walks through common scenarios and
how to build your first application with Cognitive Services.
Cognitive Services
Interactive demos across the product/service pages
Building intelligent applications with cognitive APIs (e-book)
Select the service you want to use across vision, language, speech, decision, or web search. Each category on the
page offers a set of quickstarts, tutorials, and how-to guides, whether you want to use the REST API or the SDKs.
You can also download the Intelligent Kiosk to experience and demo these services.
Cognitive Services documentation
Building intelligent applications with cognitive APIs (e-book)
Install the Intelligent Kiosk to familiarize yourself with Cognitive Services capabilities
Learn more about container support for Azure Cognitive Services.
Container support in Azure Cognitive Services
Review reference architectures for AI solutions.
AI + Machine Learning
AI agents
Microsoft's Azure AI platform aims to empower developers to innovate and accelerate their projects. Specifically
for conversational AI, Azure offers the Azure Bot Service and Bot Framework SDK and tools enabling developers to
build rich conversational experiences. Additionally, developers can use Azure Cognitive Services (domain-specific
AI services available as APIs) like Language Understanding (LUIS), QnA Maker, and Speech service to add the
abilities for your chatbot to understand and speak with your end users.
Common scenarios for conversational AI or chatbot solutions include:
Informational Q&A chatbot
Customer service or support chatbot
IT help desk or HR chatbot
e-commerce or sales chatbot
Speech-enabled devices
NOTE
We offer Power Virtual Agents, built on top of the Bot Framework, for developers who want to build a chatbot with a
primarily no-code/low-code experience. Additionally, developers neither host the bot themselves nor control the natural
language or other AI models themselves with Cognitive Services.
AI agents checklist
Familiarize yourself with Azure Bot Service and Microsoft Bot Framework.
Bot Framework is an open-source offering that provides an SDK (available in C#, JavaScript, Python, and
Java) to help you design, build, and test your bot. It also offers a free visual authoring canvas in Bot
Framework Composer and a testing tool in Bot Framework Emulator.
Azure Bot Service is a dedicated service within Azure that allows you to host or publish your bot in Azure
and connect to popular channels.
Learn about Azure Bot Service and Bot Framework overview
Principles of bot design
Find the latest versions of Bot Framework SDK and tools
One of the simplest ways to get started is to use QnA Maker, part of Azure Cognitive Services, which can
intelligently convert an FAQ document or website into a Q&A experience in minutes.
Learn how to create a bot with Q&A abilities quickly with QnA Maker
Test out the QnA Maker service directly
Download and use Bot Framework SDK and tools for bot development
5 minute quickstart with Bot Framework Composer
Build and test bots with Bot Framework SDK (C#, JavaScript, Python)
Learn how to add Cognitive Services to make your bot even more intelligent.
A developer's guide to building AI applications (e-book)
Learn more about Cognitive Services
Learn how to build your own Virtual Assistant with Bot Framework solution accelerators, and select a common set
of skills such as calendar, e-mail, point of interest, and to-do.
Bot Framework Virtual Assistant solution
Next steps
Explore other AI solution categories:
Machine learning
Knowledge mining
Knowledge mining
10/30/2020 • 2 minutes to read • Edit Online
Knowledge mining refers to an emerging category of AI designed to simplify the process of accessing the latent
insights contained within structured and unstructured data. It defines the process of using an AI pipeline to
discover hidden patterns and actionable information from sets of structured and unstructured data in a scalable
way.
Azure Cognitive Search is a managed cloud solution that gives developers APIs and tools for adding a rich search
experience over private, heterogeneous content in web, mobile, and enterprise applications. It offers capabilities
such as scoring, faceting, suggestions, synonyms, and geo-search to provide a rich user experience. Azure
Cognitive Search is also the only cloud search service with built-in knowledge mining capabilities. Azure Cognitive
Search acts as the orchestrator for your knowledge mining enrichment pipeline by following the steps to ingest,
enrich, and explore and analyze.
Key scenarios for knowledge mining include:
Digital content management: Help customers consume content more quickly by providing them relevant
search results in your content catalog.
Customer support and feedback analysis: Quickly find the right answer in documents and discover trends
of what customers are asking for to improve customer experiences.
Data extraction and process management: Accelerate processing documents by extracting key information
and populating it in other business documentation.
Technical content review and research: Quickly review documents and extract key information to make
informed decisions.
Auditing and compliance management: Quickly identify key areas and flag important ideas or information
in documents.
Checklist
Get started: Access free knowledge mining solution accelerators, boot camps, and workshops.
Knowledge mining solution accelerator
Knowledge mining workshop
Knowledge mining boot camp
Knowledge mining e-book
Use power skills: Azure search power skills provide useful functions deployable as custom skills for Azure
Cognitive Search. The skills can be used as templates or starting points for your own custom skills. They
also can be deployed and used as is if they happen to meet your requirements. We also invite you to
contribute your own work by submitting a pull request.
Explore additional resources:
Azure Cognitive Search overview
Built-in cognitive skills for text and image processing during indexing
Documentation resources for AI enrichment in Azure Cognitive Search
Design tips and tricks for AI enrichment
Full text search
Next steps
Explore other AI solution categories:
AI applications and agents
Machine learning
Develop digital inventions in Azure
10/30/2020 • 2 minutes to read • Edit Online
Azure can help accelerate the development of each area of digital invention. This section of the Cloud Adoption
Framework builds on the Innovate methodology. This section shows how you can combine Azure services to
create a toolchain for digital invention.
Toolchain
Start with the overview page that relates to the type of digital invention you require to test your hypothesis. You
start with that page for guidance you can act on and so that you can build with customer empathy.
Here are the types of digital invention in this article series:
Democratize data: Tools for sharing data to solve information-related customer needs.
Engage via applications: Tools to create applications that engage customers beyond raw data.
Empower adoption: Tools to accelerate customer adoption through digital support for your build-measure-
learn cycles.
Interact with devices: Tools to create different levels of ambient experiences for your customers.
Predict and influence: Tools for predictive analysis and for integrating the resulting predictions into applications.
Tools to democratize data in Azure
10/30/2020 • 2 minutes to read • Edit Online
As described in the conceptual article on democratizing data, you can deliver many innovations with little technical
investment. Many major innovations require little more than raw data. Democratizing data is about investing as
little resource as needed to engage your customers who use data to take advantage of their existing knowledge.
Starting with data is a quick way to test a hypothesis before expanding into broader, more costly digital inventions.
As you refine more of the hypothesis and begin to adopt the inventions at scale, the following processes will help
you prepare for operational support of the innovation.
Toolchain
In Azure, the following tools are commonly used to accelerate digital invention across the preceding phases:
Power BI
Azure Data Catalog
Azure Synapse Analytics
Azure Cosmos DB
Azure Database for PostgreSQL
Azure Database for MySQL
Azure Database for MariaDB
Azure Database for PostgreSQL hyperscale
Azure Data Lake Storage
Azure Database Migration Service
Azure SQL Database, with or without Azure SQL Managed Instance
Azure Data Factory
Azure Stream Analytics
SQL Server Integration Services
Azure Stack
SQL Server Stretch Database
Azure StorSimple
Azure Files
Azure File Sync
PolyBase
As the invention approaches adoption at scale, the aspects of each solution require refinement and technical
maturity. As that happens, more of these services are likely to be required. Use the table of contents on the left side
of this page for Azure tools guidance relevant to your hypothesis-testing process.
Get started
The table of contents on the left side of this page outlines many articles. These articles help you get started with
each of the tools in this toolchain.
NOTE
Some links might leave the Cloud Adoption Framework to help you go beyond the scope of this framework.
What is data classification?
10/30/2020 • 2 minutes to read • Edit Online
Data classification allows you to determine and assign value to your organization's data and provides a common
starting point for governance. The data classification process categorizes data by sensitivity and business impact in
order to identify risks. When data is classified, you can manage it in ways that protect sensitive or important data
from theft or loss.
Take action
Take action by defining and tagging assets with a defined data classification.
Choose one of the actionable governance guides for examples of applying tags across your portfolio.
Review the naming and tagging standards article to define a more comprehensive tagging standard.
For additional information on resource tagging in Azure, see Use tags to organize your Azure resources and
management hierarchy.
Next steps
Continue learning from this article series by reviewing the article on securing sensitive data. The next article
contains applicable insights if you are working with data that is classified as confidential or highly confidential.
Secure sensitive data
Collect data through the migration and
modernization of existing data sources
10/30/2020 • 2 minutes to read • Edit Online
Companies often have different kinds of existing data that they can democratize. When a customer hypothesis
requires the use of existing data to build modern solutions, a first step might be the migration and modernization
of data to prepare for inventions and innovations. To align with existing migration efforts within a cloud adoption
plan, you can more easily do the migration and modernization within the Migrate methodology.
Primary toolset
When you migrate and modernize on-premises data, the most common Azure tool choice is Azure Database
Migration Service. This service is part of the broader Azure Migrate toolchain. For existing SQL Server data sources,
Data Migration Assistant can help you assess and migrate a small number of data structures.
To support Oracle and NoSQL migrations, you can also use Database Migration Service for certain types of source-
to-target databases. Examples include migrating Oracle databases to PostgreSQL or MongoDB databases to Azure
Cosmos DB. More commonly, adoption teams use partner tools or custom scripts to migrate to Azure Cosmos DB,
Azure HDInsight, or virtual machine options based on infrastructure as a service (IaaS).
For example, a supported path is migrating an RDS SQL Server source to Azure SQL Database or Azure SQL Managed Instance by using Database Migration Service in online mode (a tutorial is available).
Tools to engage via applications in Azure
As described in Engage via applications, applications can be an important aspect of an MVP solution. Applications
are often required for testing a hypothesis. This article helps you learn the tools Azure provides to accelerate
development of those applications.
Toolchain
Depending on the path that the cloud adoption team takes, Azure provides tools to accelerate the team's ability to
build with customer empathy. The following list of Azure offerings is grouped based on the preceding decision
paths. These offerings include:
Azure App Service
Azure Kubernetes Service (AKS)
Azure Migrate
Azure Stack
Power Apps
Power Automate
Power BI
Get started
The table of contents on the left side of this page outlines many articles. These articles help you get started with
each of the tools in this toolchain.
NOTE
Some links might leave the Cloud Adoption Framework to help you go beyond the scope of this framework.
Tools to empower adoption in Azure
10/30/2020 • 2 minutes to read • Edit Online
As described in Empower adoption, building true innovation at scale requires an investment in removing friction
that could slow adoption. In the early stages of testing a hypothesis, a solution is small. The investment in
removing friction is likely small as well. As hypotheses prove true, the solution and the investment in empowering
adoption grows. This article provides key links to help you get started with each stage of maturity.
Toolchain
For adoption teams that are mature professional development teams with many contributors, the Azure toolchain
starts with GitHub and Azure DevOps.
As your need grows, you can expand this foundation to use other tool features. The expanded foundation might
involve tools like:
Azure Blueprints
Azure Policy
Azure Resource Manager templates
Azure Monitor
The table of contents on the left side of this page lists guidance for each tool and aligns with the previously
described maturity model.
Get started
The table of contents on the left side of this page outlines many articles. These articles help you get started with
each of the tools in this toolchain.
NOTE
Some links might leave the Cloud Adoption Framework to help you go beyond the scope of this framework.
Machine Learning Operations with Azure Machine
Learning
10/30/2020 • 2 minutes to read • Edit Online
Machine Learning Operations (MLOps) is based on DevOps principles and practices that increase workflow
efficiencies like continuous integration, delivery, and deployment. MLOps applies these principles to the machine
learning process in order to:
Experiment and develop models more quickly.
Deploy models to production more quickly.
Practice and refine quality assurance.
Azure Machine Learning provides the following MLOps capabilities:
Create reproducible pipelines. Machine Learning pipelines enable you to define repeatable and reusable
steps for your data preparation, training, and scoring processes. (A minimal pipeline sketch follows this list.)
Create reusable software environments for training and deploying models.
Register, package, and deploy models from anywhere. You can track the associated metadata required to
use the model.
Capture the governance data for the end-to-end lifecycle. The logged information can include who is
publishing models, why changes were made, and when models were deployed or used in production.
Notify and alert on events in the lifecycle. For example, you can get alerts for experiment completion,
model registration, model deployment, and data drift detection.
Monitor applications for operational and machine learning-related issues. Compare model inputs
between training and inference, explore model-specific metrics, and provide monitoring and alerts on your
machine learning infrastructure.
Automate the end-to-end machine learning lifecycle with Azure Machine Learning and Azure
Pipelines. With pipelines, you can frequently update models, test new models, and continuously roll out new
machine learning models alongside your other applications and services.
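A rough sketch of such a pipeline with the Azure Machine Learning SDK v1 (an illustrative assumption; the script names, compute target, and experiment name are placeholders):

```python
# Hypothetical two-step pipeline: data preparation followed by training.
from azureml.core import Workspace
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import PythonScriptStep

ws = Workspace.from_config()

prepare = PythonScriptStep(name="prepare-data", script_name="prepare.py",
                           compute_target="cpu-cluster", source_directory="src")
train = PythonScriptStep(name="train-model", script_name="train.py",
                         compute_target="cpu-cluster", source_directory="src")
train.run_after(prepare)  # enforce ordering between the two steps

pipeline = Pipeline(workspace=ws, steps=[train])
pipeline.submit(experiment_name="mlops-pipeline")
```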
Next steps
Learn more by reading and exploring the following resources:
MLOps: Model management, deployment, and monitoring with Azure Machine Learning
How and where to deploy models with Azure Machine Learning
Tutorial: Deploy an image classification model in ACI
End-to-end MLOps examples repo
CI/CD of machine learning models with Azure Pipelines
Create clients that consume a deployed model
Machine learning at scale
Azure AI reference architectures and best practices repo
Tools to interact with devices in Azure
10/30/2020 • 2 minutes to read • Edit Online
As described in the conceptual article on interacting with devices, the devices used to interact with a customer
depend on the amount of ambient experience required to deliver the customer's need and empower adoption.
The speed from the trigger that prompts the customer's need to your solution's ability to meet that need is a
determining factor in repeat usage. Ambient experiences help accelerate that response time and create a better
experience for your customers by embedding your solution in the customers' immediate surroundings.
Toolchain
In Azure, you commonly use the following tools to accelerate digital invention across each of the preceding levels
of ambient solutions. These tools are grouped based on the amount of experience required to reduce complexity in
aligning tools with those experiences.
Get started
The table of contents on the left side of this page outlines many articles. These articles help you get started with
each of the tools in this toolchain.
NOTE
Some links might leave the Cloud Adoption Framework to help you go beyond the scope of this framework.
Use innovation tools to predict and influence
10/30/2020 • 2 minutes to read • Edit Online
Using AI, your company can make predictions about customers' needs and automate business processes. You also
can discover information lying latent in unstructured data and deliver new ways to engage with customers to
deliver better experiences.
You can accelerate this type of digital invention through each of the following solution areas. Best practices and
technical guidance to accelerate digital invention are listed in the table of contents on the left side of this page.
Those articles are grouped by solution area:
Machine learning
AI applications and agents
Knowledge mining
Get started
The table of contents on the left side of this page outlines many articles. These articles help you get started with
each of the solution areas.
NOTE
Some links might leave the Cloud Adoption Framework to help you go beyond the scope of this framework.
What is machine learning?
10/30/2020 • 6 minutes to read • Edit Online
Machine learning is a data science technique that allows computers to use existing data to forecast future
behaviors, outcomes, and trends. By using machine learning, computers learn without being explicitly programmed.
Forecasts or predictions from machine learning can make applications and devices smarter. For example, when you
shop online, machine learning helps recommend other products you might want based on what you've bought. Or
when your credit card is swiped, machine learning compares the transaction to a database of transactions and
helps detect fraud. And when your robot vacuum cleaner vacuums a room, machine learning helps it decide
whether the job is done.
Responsible ML
Throughout the development and use of AI systems, trust must be at the core. Trust in the platform, process, and
models. As AI and autonomous systems integrate more into the fabric of society, it's important to proactively make
an effort to anticipate and mitigate the unintended consequences of these technologies.
Understand your models and build for fairness: explain model behavior and uncover features that have the most
impact on predictions. Use built-in explainers for both glass-box and black-box models during model training
and inference. Use interactive visualizations to compare models and perform what-if analysis to improve model
accuracy. Test your models for fairness using state-of-the-art algorithms. Mitigate unfairness throughout the
machine learning lifecycle, compare mitigated models, and make intentional fairness versus accuracy trade-offs
as desired. (A minimal fairness-assessment sketch follows this list.)
Protect data privacy and confidentiality: build models that preserve privacy using the latest innovations in
differential privacy, which injects precise levels of statistical noise in data to limit the disclosure of sensitive
information. Identify data leaks and intelligently limit repeat queries to manage exposure risk. Use encryption
and confidential machine learning (coming soon) techniques specifically designed for machine learning to
securely build models using confidential data.
Control and govern through every step of the machine learning process: access built-in capabilities to
automatically track lineage and create an audit trail across the machine learning lifecycle. Obtain full visibility
into the machine learning process by tracking datasets, models, experiments, code, and more. Use custom tags
to implement model data sheets, document key model metadata, increase accountability, and ensure a
responsible process.
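To make the fairness assessment concrete, here is a minimal, self-contained sketch with Fairlearn's MetricFrame (the data is made up purely for illustration; in practice you would pass your own labels, predictions, and sensitive attribute):

```python
# Hypothetical fairness check: compare accuracy and selection rate across groups.
import pandas as pd
from fairlearn.metrics import MetricFrame, selection_rate
from sklearn.metrics import accuracy_score

# Tiny made-up dataset: true labels, model predictions, and a sensitive attribute.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
sensitive = pd.Series(["A", "A", "A", "B", "B", "B", "A", "B"], name="group")

metrics = MetricFrame(
    metrics={"accuracy": accuracy_score, "selection_rate": selection_rate},
    y_true=y_true, y_pred=y_pred, sensitive_features=sensitive,
)
print(metrics.by_group)      # the metrics broken down per group
print(metrics.difference())  # largest between-group gap for each metric, a simple disparity measure
```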
Learn more about how to implement Responsible ML.
Next steps
Review machine learning white papers and e-books on Machine Learning studio and Machine Learning service.
Review AI + Machine Learning architectures.
What are AI applications?
10/30/2020 • 4 minutes to read • Edit Online
In Azure, you can build intelligent applications faster by using the tools and technologies of your choice and built-in
AI.
Easily build and deploy anywhere: Use your team's existing skill set and the tools you know to build
intelligent applications and deploy them without a change in code. You can build once and then deploy in the
cloud, on-premises, and to edge devices. You can be confident of global distribution to more datacenters than
with any other provider.
Create an impact by using an open platform: Choose your favorite technologies, which can be open
source. Azure supports a range of deployment options, popular stacks and languages, and a comprehensive set
of data engines. Capitalize on this flexibility, plus the performance, scale, and security delivered by Microsoft
technologies to build applications for any scenario.
Develop applications with built-in intelligence: Building intelligent applications using Azure is easy,
because no other platform brings analytics and native AI to your data wherever it lives and in the languages you
use. You can take advantage of a rich set of cognitive APIs to easily build new experiences into your applications
for human-like intelligence.
Vision APIs
Custom Vision: Custom Vision allows you to build custom image classifiers.
Form Recognizer (preview): Form Recognizer identifies and extracts key-value pairs and table data from form documents. It then outputs structured data, which includes the relationships, in the original file.
Ink Recognizer (preview): Ink Recognizer allows you to recognize and analyze digital ink-stroke data, shapes, and handwritten content, and output a document structure with all recognized entities.
Video Indexer: Video Indexer enables you to extract insights from your videos.
Speech APIs
Speaker Recognition (preview): The Speaker Recognition API provides algorithms for speaker identification and verification.
Bing Speech (retiring): The Bing Speech API provides you with an easy way to create speech-enabled features in your applications.
Language APIs
Language Understanding (LUIS): The Language Understanding service (LUIS) allows your application to understand what a person wants in their own words.
Text Analytics: Text Analytics provides natural language processing over raw text for sentiment analysis, key phrase extraction, and language detection.
Decision APIs
Anomaly Detector (preview): Anomaly Detector allows you to monitor and detect abnormalities in your time series data.
Next steps
Learn more about Cognitive Services.
Find best practices for AI architectures.
What are AI agents?
10/30/2020 • 6 minutes to read • Edit Online
Users are engaging more and more with conversational interfaces, which can present a more natural experience
where humans express their needs through natural language and quickly complete tasks. For many companies,
conversational AI applications are becoming a competitive differentiator. Many organizations are strategically
making bots available within the same messaging platforms in which their customers spend time.
Organizations around the world are transforming their businesses with conversational AI, which can promote more
efficient and natural interactions with both their customers and their employees. Here are a few common use cases:
Customer support
Enterprise assistant
Call center optimization
In-car voice assistant
Build a bot
Azure Bot Service and Bot Framework offer an integrated set of tools and services to help with this process. Choose
your favorite development environment or command-line tools to create your bot. SDKs exist for C#, JavaScript,
TypeScript, and Python. The SDK for Java is under development. We provide tools for various stages of bot
development to help you design and build bots.
Plan
Having a thorough understanding of the goals, processes, and user needs is important to the process of creating a
successful bot. Before you write code, review the bot design guidelines for best practices, and identify the needs for
your bot. You can create a simple bot or include more sophisticated capabilities such as speech, natural language
understanding, and question answering.
While you design your bot in the Plan phase, consider these aspects:
Define bot personas:
What should your bot look like?
What should it be named?
What's your bot's personality? Does it have a gender?
How should your bot handle difficult situations and questions?
Design conversation flow:
What type of conversations can you expect for your use cases?
Define an evaluation plan:
How would you measure success?
What measurements do you want to use to improve your service?
To learn more about how to design your bot, see Principles of bot design.
Build
Your bot is a web service that implements a conversational interface and communicates with the Bot Framework
Service to send and receive messages and events. The Bot Framework Service is one of the components of Azure
Bot Service and Bot Framework. You can create bots in any number of environments and languages. You can start
your bot development in the Azure portal or use C#, JavaScript, or Python templates for local development. You
also have access to a variety of samples that showcase many of the capabilities available through the SDK. These
samples are great for developers who want a more feature-rich starting point.
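To make the shape of that web service concrete, here is a minimal, hedged sketch of a bot's message handler using the Bot Framework SDK for Python (botbuilder-core). It is an illustrative assumption, and a hosting layer (for example, aiohttp plus an adapter) is still needed to expose it on an endpoint such as /api/messages.

```python
# Hypothetical minimal bot: echo every message and greet new members.
from botbuilder.core import ActivityHandler, TurnContext

class EchoBot(ActivityHandler):
    async def on_message_activity(self, turn_context: TurnContext):
        # Real bots would call LUIS, QnA Maker, or other services here.
        await turn_context.send_activity(f"You said: {turn_context.activity.text}")

    async def on_members_added_activity(self, members_added, turn_context: TurnContext):
        for member in members_added:
            if member.id != turn_context.activity.recipient.id:
                await turn_context.send_activity("Hello! I'm a minimal echo bot.")
```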
As part of the Azure Bot Service and Bot Framework, we offer additional components you can use to extend the
functionality of your bot.
Add natural language processing: Enable your bot to understand natural language, understand spelling errors, use speech, and recognize the user's intent. Resource: How to use LUIS
Answer questions: Add a knowledge base to answer questions users ask in a more natural, conversational way. Resource: How to use QnA Maker
Manage multiple models: If you use more than one model, such as for LUIS and QnA Maker, intelligently determine when to use which one during your bot's conversation. Resource: Dispatch tool
Add cards and buttons: Enhance the user experience with media other than text, such as graphics, menus, and cards. Resource: How to add cards
NOTE
This table isn't a comprehensive list. For more information, see the Azure Bot Service documentation.
Test
Bots are complex applications with many different parts that work together. As with any other complex application, this
complexity can lead to some interesting bugs or cause your bot to behave differently than expected. Before you
publish your bot, test it. We provide several ways to test bots before they're released for use:
Test your bot locally with the emulator. The Bot Framework Emulator is a stand-alone application that not only
provides a chat interface but also debugging and interrogation tools to help you understand how and why your
bot does what it does. The emulator can be run locally alongside your in-development bot application.
Test your bot on the web. After your bot is configured through the Azure portal, it can also be reached through a
web chat interface. The web chat interface is a great way to grant access to your bot to testers and other people
who don't have direct access to the running code.
Unit test your bot with the July update of the Bot Framework SDK.
Publish
When you're ready to make your bot available on the web, publish it to Azure or to your own web service or
datacenter. Having an address on the public internet is the first step to bringing your bot to life on your site or
inside chat channels.
Connect
Connect your bot to channels such as Facebook, Messenger, Kik, Skype, Slack, Microsoft Teams, Telegram, text/SMS,
Twilio, Cortana, and Skype. Bot Framework does most of the work necessary to send and receive messages from all
of these different platforms. Your bot application receives a unified, normalized stream of messages regardless of the
number and type of channels to which it's connected. For information on how to add channels, see Channels.
Evaluate
Use the data collected in the Azure portal to identify opportunities to improve the capabilities and performance of
your bot. You can get service-level and instrumentation data like traffic, latency, and integrations. Analytics also
provide conversation-level reporting on user, message, and channel data. For more information, see How to gather
analytics.
Patterns for common use cases
There are common patterns used for implementation of a conversational AI application:
Knowledge base: A knowledge bot can be designed to provide information about virtually any subject. For
example, one knowledge bot might answer questions about events such as "What bot events are there at this
conference?" or "When is the next reggae show?" Another bot might answer IT-related questions such as
"How do I update my operating system?" Yet another bot might answer questions about contacts such as
"Who is John Doe?" or "What is Jane Doe's email address?"
For information on the design elements for knowledge bots, see Design knowledge bots.
Hand off to a human: No matter how much AI a bot possesses, there might still be times when it needs to
hand off the conversation to a human being. In such cases, the bot should recognize when it needs to hand
off and provide the user with a smooth transition.
For information on the patterns to hand off, see Transition conversations from bot to human.
Embed a bot in an application: Although bots most commonly exist outside of applications, they can also
be integrated with applications. For example, you could embed a knowledge bot within an application to
help users find information. You could also embed a bot within a help desk application to act as the first
responder to incoming user requests. The bot could independently resolve simple issues and hand off more
complex issues to a human agent.
For information on the ways to integrate your bot within an application, see Embed a bot in an application.
Embed a bot in a website: Like embedding bots in applications, bots can also be embedded within a
website to enable multiple modes of communication across channels.
For information on the ways to integrate your bot within a website, see Embed a bot in a website.
Next steps
Review machine learning white papers and e-books about Azure Bot Service.
Review AI + Machine Learning architectures.
Building intelligent applications with cognitive APIs (e-book).
FAQ chatbot architecture.
What is Azure Cognitive Search?
10/30/2020 • 4 minutes to read • Edit Online
Formerly known as Azure Search, Azure Cognitive Search is a managed cloud solution that gives developers APIs
and tools for adding a rich search experience over private, heterogeneous content in web, mobile, and enterprise
applications. Your code or a tool invokes data ingestion (indexing) to create and load an index. Optionally, you can
add cognitive skills to apply AI processes during indexing. Doing so can add new information and structures that
are useful for search and other scenarios.
On the other side of your service, your application code issues query requests and handles responses. The search
experience is defined in your client by using functionality from Azure Cognitive Search, with query execution over a
persisted index that you create, own, and store in your service.
Functionality is exposed through a simple REST API or .NET SDK that masks the inherent complexity of information
retrieval. In addition to APIs, the Azure portal provides administration and content management support, with tools
for prototyping and querying your indexes. Because the service runs in the cloud, infrastructure and availability are
managed by Microsoft.
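As a rough illustration of the query side (an assumption, not part of the article; the endpoint, key, index name, and field names are placeholders that depend on your own index schema), a Python sketch with the azure-search-documents client:

```python
# Hypothetical query against an existing search index.
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

search_client = SearchClient(
    endpoint="https://<your-search-service>.search.windows.net",
    index_name="hotels-sample-index",  # placeholder index name
    credential=AzureKeyCredential("<query-api-key>"),
)

results = search_client.search(search_text="waterfront", top=5)
for doc in results:
    print(doc["HotelName"], doc["@search.score"])  # field names depend on your index schema
```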
Next steps
Learn more about Azure Cognitive Search.
Browse more AI architectures.
See an example knowledge mining solution in the article JFK Files example architecture and solution.
Innovation in the digital economy
10/30/2020 • 4 minutes to read • Edit Online
The digital economy is an undeniable force in almost every industry. During the industrial revolution, gasoline,
conveyor belts, and human ingenuity were key resources for promoting market innovation. Product quality, price,
and logistics drove markets as companies sought to deliver better products to their customers more quickly.
Today's digital economy shifts the way in which customers interact with corporations. The primary forms of
capital and market differentiators have all shifted as a result. In the digital economy, customers are less
concerned with logistics and more concerned with their overall experience of using a product. This shift arises
from direct interaction with technology in our daily lives and from a realization of the value associated with those
interactions.
In the Innovate methodology of the Cloud Adoption Framework, we'll focus on understanding customer needs
and rapidly building innovations that shape how your customers interact with your products. We'll also illustrate
an approach to delivering on the value of a minimum viable product (MVP). Finally, we'll map decisions common
to innovation cycles to help you understand how the cloud can unlock innovation and create partnerships with
your customers.
Innovate methodology
The simple methodology for cloud innovation within the Cloud Adoption Framework is illustrated in the
following image. Subsequent articles in this section will show how to establish core processes, approaches, and
mechanisms for finding and driving innovation within your company.
Cultural commitments
Adopting the Innovate methodology requires some cultural commitments to effectively use the metrics outlined
in this article. Before you change your approach to driving innovation, make sure the adoption and leadership
teams are ready to make these important commitments.
Commitment to transparency
To understand measurement in an innovation approach, you must first understand the commitment to
transparency. Innovation can only thrive in an environment that adheres to a growth mindset. At the root of a
growth mindset is a cultural imperative to learn from experiences. Successful innovation and continuous learning
start with a commitment to transparency in measurement. This is a brave commitment for the cloud adoption
team. However, that commitment is meaningless if it's not matched by a commitment to preserve transparency
within the leadership and cloud strategy teams.
Transparency is important because measuring customer impact doesn't address the question of right or wrong.
Nor are impact measurements indicative of the quality of work or the performance of the adoption team. Instead,
they represent an opportunity to learn and better meet your customers' needs. Misuse of innovation metrics can
stifle that culture. Eventually, such misuse will lead to manipulation of metrics, which in turn causes long-term
failure of the invention, the supporting staff, and ultimately the management structure who misused the data.
Leaders and contributors alike should avoid using measurements for anything other than an opportunity to learn
and improve the MVP solution.
Commitment to iteration
Only one promise rings true across all innovation cycles: you won't get it right on the first try. Measurement
helps you understand what adjustments you should make to achieve the desired results. Changes that lead to
favorable outcomes stem from iterations of the build-measure-learn process. The cloud adoption team and the
cloud strategy team must commit to an iterative mindset before adopting a growth mindset or a build-measure-
learn approach.
Next steps
Before building the next great invention, get started with customer adoption by understanding the build-
measure-learn feedback loop.
Customer adoption with the build-measure-learn feedback loop
Build consensus on the business value of innovation
10/30/2020 • 5 minutes to read • Edit Online
The first step to developing any new innovation is to identify how that innovation can drive business value. In this
exercise, you answer a series of questions that highlight the importance of investing ample time when your
organization defines business value.
Qualifying questions
Before you develop any solution (in the cloud or on-premises), validate your business value criteria by answering
the following questions:
1. What is the defined customer need that you seek to address with this solution?
2. What opportunities would this solution create for your business?
3. Which business outcomes would be achieved with this solution?
4. Which of your company's motivations would be served with this solution?
If the answers to all four questions are well documented, you might not need to complete the rest of this exercise.
Fortunately, you can easily test any documentation. Set up two short meetings to test both the documentation and
your organization's internal alignment. Invite committed business stakeholders to one meeting and set up a
separate meeting with the engaged development team. Ask the four questions above to each group, and then
compare the results.
NOTE
The existing documentation should not be shared with either team before the meeting. If true alignment exists, the
guiding hypotheses should be referenced or even recited by members of each group.
WARNING
Don't facilitate the meeting. This test is to determine alignment; it's not an alignment creation exercise. When you start the
meeting, remind the attendees that the objective is to test directional alignment to existing agreements within the team.
Establish a five-minute time limit for each question. Set a timer and close each question after five minutes even if the
attendees haven't agreed upon an answer.
Account for the different languages and interests of each group. If the test results in answers that are directionally
aligned, consider this exercise a victory. You're ready to move on to solution development.
If one or two of the answers are directionally aligned, recognize that your hard work is paying off. You're already
better aligned than most organizations. Future success is likely with minor continuing investment in alignment.
Review each of the following sections for ideas that may help you build further alignment.
If either team fails to answer all four questions in 30 minutes, then alignment and the considerations in the
following sections are likely to have a significant impact on this effort and others. Pay careful attention to each of
the following sections.
Next steps
After you've aligned your business value proposition and communicated it, you're ready to start building your
solution.
Return to the innovate exercises for next steps
Create customer partnerships through the build-
measure-learn feedback loop
10/30/2020 • 2 minutes to read • Edit Online
True innovation comes from the hard work of building solutions that demonstrate customer empathy, from
measuring the impact of those changes on the customer, and from learning with the customer. Most importantly, it
comes from feedback over multiple iterations.
If the past decade has taught us anything about innovation, it's that the old rules of business have changed. Large,
wealthy incumbents no longer have an unbreakable hold on the market. The first or best players to market are not
always the winners. Having the best idea doesn't lead to market dominance. In a rapidly changing business
climate, market leaders are the most agile. Those who can adapt to changing conditions lead.
Large or small, the companies that thrive in the digital economy as innovative leaders are those with the greatest
ability to listen to their customer base. That skill can be cultivated and managed. At the core of all good
partnerships is a clear feedback loop. The process for building customer partnerships within the Cloud Adoption
Framework is the build-measure-learn feedback loop.
Next steps
Learn how to build with customer empathy to begin your build-measure-learn cycle.
Build with customer empathy
Build with customer empathy
10/30/2020 • 11 minutes to read
"Necessity is the mother of invention." This proverb captures the indelibility of the human spirit and our natural
drive to invent. As explained in the Oxford English Dictionary, "When the need for something becomes
imperative, you're forced to find ways of getting or achieving it." Few would deny these universal truths about
invention. However, as described in Innovation in the digital economy, innovation requires a balance between
invention and adoption.
Continuing with the analogy, innovation comes from a more extended family. Customer empathy is the proud
parent of innovation. Creating a solution that drives innovation requires a legitimate customer need that keeps
the customer coming back to solve critical challenges. These solutions are based on what a customer needs
rather than on their wants or whims. To find their true needs, we start with empathy, a deep understanding of
the customer's experience. Empathy is an underdeveloped skill for many engineers, product managers, and even
business leaders. Fortunately, the diverse interactions and rapid pace of the cloud architect role have already
started fostering this skill.
Why is empathy so important? From the first release of a minimum viable product (MVP) to the general
availability of a market-grade solution, customer empathy helps us understand and share in the experience of
the customer. Empathy helps us build a better solution. More importantly, it better positions us to invent
solutions that will encourage adoption. In a digital economy, those who can most readily empathize with
customer needs can build a brighter future that redefines and leads the market.
Properly defining what to build can be tricky and requires some practice. If you build something too quickly, it
might not reflect customer needs. If you spend too much time trying to understand initial customer needs and
solution requirements, the market may meet them before you have a chance to build anything at all. In either
scenario, the opportunity to learn can be significantly delayed or reduced. Sometimes the data can even be
corrupted.
The most innovative solutions in history began with an intuitive belief. That gut feeling comes from both existing
expertise and firsthand observation. We start with the build phase because it allows for a rapid test of that
intuition. From there, we can cultivate deeper understanding and clearer degrees of empathy. At every iteration
or release of a solution, balance comes from building MVPs that demonstrate customer empathy.
To steady this balancing act, the following two sections discuss the concepts of building with empathy and
defining an MVP.
Define a customer-focused hypothesis
Building with empathy means creating a solution based on defined hypotheses that illustrate a specific customer
need. The following steps aim to formulate a hypothesis that will encourage building with empathy.
1. When you build with empathy, the customer is always the focus. This intention can take many shapes. You
could reference a customer archetype, a specific persona, or even a picture of a customer in the midst of the
problem you want to solve. And keep in mind that customers can be internal (employees or partners) or
external (consumers or business customers). This definition is the first hypothesis to be tested: can we help
this specific customer?
2. Understand the customer experience. Building with empathy means you can relate to the customer's
experience and understand their challenges. This mindset indicates the next hypothesis to be tested: can we
help this specific customer with this manageable challenge?
3. Define a simple solution to a single challenge. Relying on expertise across people, processes, and subject
matter experts will lead to a potential solution. This is the full hypothesis to be tested: can we help this
specific customer with this manageable challenge through the proposed solution?
4. Arrive at a value statement. What long-term value do you hope to provide to these customers? The answer to
this question creates your full hypothesis: how will these customers' lives be improved by using the
proposed solution to address this manageable challenge?
This last step is the culmination of an empathy-driven hypothesis. It defines the audience, the problem, the
solution, and the metric by which improvement is to be made, all of which center on the customer. During the
measure and learn phases, each hypothesis should be tested. Changes in the customer, problem statement, or
solution are anticipated as the team develops greater empathy for the addressable customer base.
Caution
The goal is to build with customer empathy, not to plan with it. It's all too easy to get stuck in endless cycles of
planning and tweaking to hit upon the perfect customer empathy statement. Before you try to develop such a
statement, review the following sections on defining and building an MVP.
After core assumptions are proven, later iterations will focus on growth tests in addition to empathy tests. After
empathy is built, tested, and validated, you can begin to understand the addressable market at scale. This can be
done through an expansion of the standard hypothesis formula described earlier. Based on available data,
estimate the size of the total market (the number of potential customers).
From there, estimate the percentage of that total market that experiences a similar challenge and that might
therefore be interested in this solution. This is your addressable market. The next hypothesis to be tested is: how
will x% of customers' lives be improved by using the proposed solution to address this manageable challenge?
A small sampling of customers will reveal leading indicators that suggest a percentage impact on the pool of
customers engaged.
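The arithmetic behind this expanded hypothesis is simple. The following sketch combines a hypothetical total market, the share believed to face the challenge, and a leading indicator from a small sample of customer partners into a rough projection. All figures and names are placeholders, not guidance values.

```python
# Illustrative only: rough sizing of an addressable market for a hypothesis.
# Every figure below is a hypothetical placeholder.

total_market = 500_000          # estimated number of potential customers
share_with_challenge = 0.12     # fraction believed to face this challenge
addressable_market = int(total_market * share_with_challenge)

# Early sample of engaged customer partners used as a leading indicator.
sampled_customers = 40
sampled_customers_helped = 9
observed_impact_rate = sampled_customers_helped / sampled_customers

projected_customers_helped = int(addressable_market * observed_impact_rate)
print(f"Addressable market: {addressable_market:,}")
print(f"Projected customers helped at scale: {projected_customers_helped:,}")
```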
Define a solution to test the hypothesis
During each iteration of a build-measure-learn feedback loop, your attempt to build with empathy is defined by
an MVP.
An MVP is the smallest unit of effort (invention, engineering, application development, or data architecture)
required to create enough of a solution to learn with the customer. The goal of every MVP is to test some or all
of the prior hypotheses and to receive feedback directly from the customer. The output is not a beautiful
application with all the features required to change your industry. The desired output of each iteration is a
learning opportunity, a chance to more deeply test a hypothesis.
Timeboxing is a standard way to make sure a product remains lean. For example, make sure your development
team thinks the solution can be created in a single iteration to allow for rapid testing. To better understand using
velocity, iterations, and releases to define what minimal means, see Planning velocity, iterations, release, and
iteration paths.
Reduce complexity and delay technical spikes
The disciplines of invention found in the Innovate methodology describe the functionality that's often required
to deliver a mature innovation or scale-ready MVP solution. Use these disciplines as a long-term guide for
feature inclusion. Likewise, use them as a cautionary guide during early testing of customer value and empathy
in your solution.
Feature breadth and the different disciplines of invention can't all be created in a single iteration. It might take
several releases for an MVP solution to include the complexity of multiple disciplines. Depending on the
investment in development, there might be multiple parallel teams working within different disciplines to test
multiple hypotheses. Although it's smart to maintain architectural alignment between those teams, it's unwise to
try to build complex, integrated solutions until value hypotheses can be validated.
Complexity is best detected in the frequency or volume of technical spikes. Technical spikes are efforts to create
technical solutions that can't be easily tested with customers. When customer value and customer empathy are
untested, technical spikes represent a risk to innovation and should be minimized. For the types of mature tested
solutions found in a migration effort, technical spikes can be common throughout adoption. However, they delay
the testing of hypotheses in innovation efforts and should be postponed whenever possible.
A relentless simplification approach is suggested for any MVP definition. This approach means removing
anything that doesn't add to your ability to validate the hypothesis. To minimize complexity, reduce the number
of integrations and features that aren't required to test the hypothesis.
Build an MVP
At each iteration, an MVP solution can take many different shapes. The common requirement is only that the
output allows for measurement and testing of the hypothesis. This simple requirement initiates the scientific
process and allows the team to build with empathy. To deliver this customer-first focus, an initial MVP might rely
on only one of the disciplines of invention.
In some cases, the fastest path to innovation means temporarily avoiding these disciplines entirely, until the
cloud adoption team is confident that the hypothesis has been accurately validated. Coming from a technology
company like Microsoft, this guidance might sound counterintuitive. However, this simply emphasizes that
customer needs, not a specific technology decision, are the highest priority in an MVP solution.
Typically, an MVP solution consists of a simple application or data solution with minimal features and limited
polish. For organizations that have professional development expertise, this path is often the fastest one to
learning and iteration. The following list includes several other approaches a team might take to build an MVP:
A predictive algorithm that's wrong 99 percent of the time but that demonstrates specific desired outcomes.
An IoT device that doesn't communicate securely at production scale but that demonstrates the value of
nearly real-time data within a process.
An application built by a citizen developer to test a hypothesis or meet smaller-scale needs.
A manual process that re-creates the benefits of the application to follow.
A wireframe or video that's detailed enough to allow the customer to interact.
Developing an MVP shouldn't require massive amounts of development investment. Preferably, investment
should be as constrained as possible to minimize the number of hypotheses being tested at one time. Then, in
each iteration and with each release, the solution is intentionally improved toward a scale-ready solution that
represents multiple disciplines of invention.
Accelerate MVP development
Time to market is crucial to the success of any innovation. Faster releases lead to faster learning. Faster learning
leads to products that can scale more quickly. At times, traditional application development cycles can slow this
process. More frequently, innovation is constrained by limits on available expertise. Budgets, headcount, and
availability of staff can all create limits to the number of new innovations a team can handle.
Staffing constraints and the desire to build with empathy have spawned a rapidly growing trend toward citizen
developers. These developers reduce risk and provide scale within an organization's professional development
community. Citizen developers are subject matter experts where the customer experience is concerned, but
they're not trained as engineers. These individuals use prototyping tools or lighter-weight development tools
that might be frowned upon by professional developers. These business-aligned developers create MVP
solutions and test theories. When aligned well, this process can produce production solutions that provide value
but don't yet prove an effective scale hypothesis. They can also be used to validate a prototype before scale
efforts begin.
Within any innovate plan, cloud adoption teams should diversify their portfolios to include citizen developer
efforts. By scaling development efforts, more hypotheses can be formed and tested at a reduced investment.
When a hypothesis is validated and an addressable market is identified, professional developers can harden and
scale the solution by using modern development tools.
Final build gate: Customer pain
When customer empathy is strong, a clearly existing problem should be easy to identify. The customer's pain
should be obvious. During build, the cloud adoption team is building a solution to test a hypothesis based on a
customer pain point. If the hypothesis is well-defined but the pain point is not, the solution is not truly based on
customer empathy. In this scenario, build is not the right starting point. Instead, invest first in building empathy
and learning from real customers. The best approach for building empathy and validating pain is simple: listen
to your customers. Invest time in meeting with and observing them until you can identify a pain point that
occurs frequently. After the pain point is well-understood, you're ready to test a hypothesized solution for
addressing that pain.
References
Some of the concepts in this article build on topics discussed in The Lean Startup by Eric Ries.
Next steps
After you've built an MVP solution, you can measure the empathy value and scale value. Learn how to measure
for customer impact.
Measure for customer impact
Measure for customer impact
10/30/2020 • 4 minutes to read
There are several ways to measure for customer impact. This article will help you define metrics to validate
hypotheses that arise out of an effort to build with customer empathy.
Strategic metrics
During the Strategy phase of the cloud adoption lifecycle, we examine motivations and business outcomes. These
practices provide a set of metrics by which to test customer impact. When innovation is successful, you tend to
see results that are aligned with your strategic objectives.
Before establishing learning metrics, define a small number of strategic metrics that you want this innovation to
affect. Generally those strategic metrics align with one or more of the following outcome areas:
Business agility
Customer engagement
Customer reach
Financial impact
Solution performance, in the case of operational innovation
Document the agreed-upon metrics and track their impact frequently. But don't expect results in any of these
metrics to emerge for several iterations. For more information about setting and aligning expectations across the
parties involved, see Commitment to iteration.
Aside from motivation and business outcome metrics, the remainder of this article focuses on learning metrics
designed to guide transparent discovery and customer-focused iterations. For more information about these
aspects, see Commitment to transparency.
Learning metrics
When the first version of any minimum viable product (MVP) is shared with customers, preferably at the end of
the first development iteration, there will be no impact on strategic metrics. Several iterations later, the team may
still be struggling to change behaviors enough to materially affect strategic metrics. During learning processes,
such as build-measure-learn cycles, we advise the team to adopt learning metrics. These metrics maximize tracking
and learning opportunities.
Customer flow and learning metrics
If an MVP solution validates a customer-focused hypothesis, the solution will drive some change in customer
behaviors. Those behavior changes across customer cohorts should improve business outcomes. Keep in mind
that changing customer behavior is typically a multistep process. Because each step provides an opportunity to
measure impact, the adoption team can keep learning along the way and build a better solution.
Learning about changes to customer behavior starts by mapping the flow that you hope to see from an MVP
solution.
In most cases, a customer flow will have an easily defined starting point and no more than two endpoints.
Between the start and endpoints are a variety of learning metrics to be used as measures in the feedback loop:
1. Starting point (initial trigger): The starting point is the scenario that triggers the need for this solution.
When the solution is built with customer empathy, that initial trigger should inspire a customer to try the MVP
solution.
2. Customer need met: The hypothesis is validated when a customer need has been met by using the solution.
3. Solution steps: This term refers to the steps that are required to move the customer from the initial trigger to
a successful outcome. Each step produces a learning metric based on a customer decision to move on to the
next step.
4. Individual adoption achieved: The next time the trigger is encountered, if the customer returns to the
solution to get their need met, individual adoption has been achieved.
5. Business outcome indicator: When a customer behaves in a way that contributes to the defined business
outcome, a business outcome indicator is observed.
6. True innovation: When business outcome indicators and individual adoption both occur at the desired scale,
you've realized true innovation.
Each step of the customer flow generates learning metrics. After each iteration (or release), a new version of the
hypothesis is tested. At the same time, tweaks to the solution are tested to reflect adjustments in the hypothesis.
When customers follow the prescribed path in any given step, a positive metric is recorded. When customers
deviate from the prescribed path, a negative metric is recorded.
These alignment and deviation counters create learning metrics. Each should be recorded and tracked as the
cloud adoption team progresses toward business outcomes and true innovation. In Learn with customers, we'll
discuss ways to apply these metrics to learn and build better solutions.
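As a minimal illustration, the alignment and deviation counters can be tracked with very little machinery. The step names below are hypothetical examples, not a prescribed customer flow.

```python
# Illustrative only: recording a positive metric when a customer follows the
# prescribed path and a negative metric when they deviate.
flow_steps = ["initial_trigger", "solution_step_1", "solution_step_2",
              "need_met", "individual_adoption"]
counters = {step: {"aligned": 0, "deviated": 0} for step in flow_steps}

def record(step: str, followed_prescribed_path: bool) -> None:
    # Alignment and deviation counts become the learning metrics for this step.
    outcome = "aligned" if followed_prescribed_path else "deviated"
    counters[step][outcome] += 1

# Example: one customer completed the first step but abandoned the second.
record("solution_step_1", True)
record("solution_step_2", False)
print(counters["solution_step_2"])  # {'aligned': 0, 'deviated': 1}
```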
Group and observe customer partners
The first measurement in defining learning metrics is the customer partner definition. Any customer who
participates in innovation cycles qualifies as a customer partner. To accurately measure behavior, you should use a
cohort model to define customer partners. In this model, customers are grouped to sharpen your understanding
of their responses to changes in the MVP. These groups typically resemble the following:
Experiment or focus group: Grouping customers based on their participation in a specific experiment
designed to test changes over time.
Segment: Grouping customers by the size of the company.
Vertical: Grouping customers by the industry vertical they represent.
Individual demographics: Grouping based on personal demographics like age and physical location.
These types of groupings help you validate learning metrics across various cross-sections of those customers
who choose to partner with you during your innovation efforts. All subsequent metrics should be derived from
definable customer grouping.
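A lightweight sketch of this cohort model follows. The customer records and field names are hypothetical; the point is only that the same partners can be grouped along any of the dimensions above so learning metrics can be compared across cross-sections.

```python
from collections import defaultdict

# Illustrative only: grouping customer partners into cohorts for comparison.
customer_partners = [
    {"id": 1, "experiment": "new-onboarding", "segment": "smb", "vertical": "retail"},
    {"id": 2, "experiment": "new-onboarding", "segment": "enterprise", "vertical": "healthcare"},
    {"id": 3, "experiment": "control", "segment": "smb", "vertical": "retail"},
]

def group_by(partners, dimension):
    cohorts = defaultdict(list)
    for partner in partners:
        cohorts[partner[dimension]].append(partner["id"])
    return dict(cohorts)

print(group_by(customer_partners, "segment"))   # {'smb': [1, 3], 'enterprise': [2]}
print(group_by(customer_partners, "vertical"))  # {'retail': [1, 3], 'healthcare': [2]}
```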
Next steps
As learning metrics accumulate, the team can begin to learn with customers.
Learn with customers
Some of the concepts in this article build on topics first described in The Lean Startup, written by Eric Ries.
Learn with customers
10/30/2020 • 4 minutes to read
Our current customers represent our best resource for learning. By partnering with us, they help us build with
customer empathy to find the best solution to their needs. They also help create a minimum viable product (MVP)
solution by generating metrics from which we measure customer impact. In this article, we'll describe how to learn
with and from our customer-partners.
Continuous learning
At the end of every iteration, we have an opportunity to learn from the build and measure cycles. This process of
continuous learning is quite simple. The following image offers an overview of the process flow.
At its most basic, continuous learning is a method for responding to learning metrics and assessing their impact
on customer needs. This process consists of three primary decisions to be made at the end of each iteration:
Did the hypothesis prove true? When the answer is yes, celebrate for a moment and then move on. There
are always more things to learn, more hypotheses to test, and more ways to help the customer in your next
iteration. When a hypothesis proves true, it's often a good time for teams to decide on a new feature that will
enhance the solution's utility for the customer.
Can you get closer to a validated hypothesis by iterating on the current solution? The answer is
usually yes. Learning metrics typically suggest points in the process that lead to customer deviation. Use these
data points to find the root of a failed hypothesis. At times, the metrics may also suggest a solution.
Is a reset of the hypothesis required? The scariest thing to learn in any iteration is that the hypothesis or
underlying need was flawed. When this happens, an iteration alone isn't necessarily the right answer. When a
reset is required, the hypothesis should be rewritten and the solution reviewed in light of the new hypothesis.
The sooner this type of learning occurs, the easier it will be to pivot. Early hypotheses should focus on testing
the riskiest aspects of the solution in service of avoiding pivots later in development.
Unsure? The second most common response after "iterate" is "we're not sure." Embrace this response. It
represents an opportunity to engage the customer and to look beyond the data.
The answers to these questions will shape the iteration to follow. Companies that demonstrate an ability to apply
continuous learning and boldly make the right decisions for their customers are more likely to emerge as leaders
in their markets.
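As a simplified illustration only, the end-of-iteration decisions above can be captured in a small decision helper. The thresholds and parameter names below are hypothetical, and real decisions also depend on direct customer conversations, not metrics alone.

```python
# Illustrative only: a simplified helper for the end-of-iteration decision.

def end_of_iteration_decision(hypothesis_validated: bool,
                              need_confirmed: bool,
                              deviation_rate: float) -> str:
    if hypothesis_validated:
        return "validated: celebrate, then pick the next hypothesis or feature"
    if not need_confirmed:
        return "reset: rewrite the hypothesis and review the solution against it"
    if deviation_rate > 0.8:
        return "unsure: engage customers directly and look beyond the data"
    return "iterate: use deviation points to adjust the current solution"

print(end_of_iteration_decision(False, need_confirmed=True, deviation_rate=0.35))
```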
For better or worse, the practice of continuous learning is an art that requires a great deal of trial and error. It also
requires some science and data-driven decision-making. Perhaps the most difficult part of adopting continuous
learning concerns the cultural requirements. To effectively adopt continuous learning, your business culture must
be open to a fail first, customer-focused approach. The following section provides more details about this
approach.
Growth mindset
Few could deny the radical transformation within Microsoft culture that's occurred over the last several years. This
multifaceted transformation, led by Satya Nadella, has been hailed as a surprising business success story. At the
heart of this story is the simple belief we call the growth mindset. An entire section of this framework could be
dedicated to the adoption of a growth mindset. But to simplify this guidance, we'll focus on a few key points that
inform the process of learning with customers:
Customer first: If a hypothesis is designed to improve the experience of real customers, you have to meet
real customers where they are. Don't just rely on metrics. Compare and analyze metrics based on firsthand
observation of customer experiences.
Continuous learning: Customer focus and customer empathy stem from a learn-it-all mindset. The Innovate
methodology strives to be learn-it-all, not know-it-all.
Beginner's mindset: Demonstrate empathy by approaching every conversation with a beginner's mindset.
Whether you're new to your field or a 30-year veteran, assume you know little, and you'll learn a lot.
Listen more: Customers want to partner with you. Unfortunately, an ego-driven need to be right blocks that
partnership. To learn beyond the metrics, speak less and listen more.
Encourage others: Don't just listen; use the things you do say to encourage others. In every meeting, find
ways to pull in diverse perspectives from those who may not be quick to share.
Share the code: When we feel our obligation is to the ownership of a code base, we lose sight of the true
power of innovation. Focus on owning and driving outcomes for your customers. Share your code (publicly
with the world or privately within your company) to invite diverse perspectives into the solution and the code
base.
Challenge what works: Success doesn't necessarily mean you're demonstrating true customer empathy.
Avoid having a fixed mindset and a bias toward doing what's worked before. Look for learning in positive and
negative metrics by engaging your customers.
Be inclusive: Work hard to invite diverse perspectives into the mix. There are many variables that can divide
humans into segregated groups: cultural norms, past behaviors, gender, religion, sexual preference, even
physical abilities. True innovation comes when we challenge ourselves to see past our differences and
consciously strive to include all customers, partners, and coworkers.
Next steps
As a next step to understanding this methodology, see Common blockers and challenges to innovation to prepare
for the changes ahead.
Understanding common blockers and challenges
Some of the concepts in this article build on topics first described in The Lean Startup, written by Eric Ries.
Common blockers and challenges to innovation
10/30/2020 • 5 minutes to read
As described in Innovation in the digital economy, innovation requires a balance between invention and adoption.
This article expands on the common challenges and blockers to innovation, and it aims to help you understand how
this approach can add value during your innovation cycles.
Formula for innovation: innovation = invention + adoption
Adoption challenges
Cloud technology advances have reduced some of the friction related to adoption. However, adoption is more
people-centric than technology-centric. And unfortunately, the cloud can't fix people.
The following list elaborates on some of the most common adoption challenges related to innovation. As you
progress through the Innovate methodology, each of the challenges in the following sections will be identified and
addressed. Before you apply this methodology, evaluate your current innovation cycles to determine which are the
most important challenges or blockers for you. Then, use the methodology to address or remove those blockers.
External challenges
Time to market: In a digital economy, time to market is one of the most crucial indicators of market
domination. Surprisingly, time to market impact has little to do with positioning or early market share. Both of
those factors are fickle and temporary. The time to market advantage comes from the simple truth that the more
time your solution has on the market, the more time you have to learn, iterate, and improve. Focus heavily on
quick definition and rapid build of an effective minimum viable product to shorten time to market and
accelerate learning opportunities.
Competitive challenges: Dominant incumbents reduce opportunities to engage and learn from customers.
Competitors also create external pressure to deliver more quickly. Build fast but invest heavily in understanding
the proper measures. Well-defined niches produce more actionable feedback measures and enhance your
ability to partner and learn, resulting in better overall solutions.
Understand your customer: Customer empathy starts with an understanding of the customer and customer
base. One of the biggest challenges for innovators is the ability to rapidly categorize measurements and
learning within the build-measure-learn cycle. It's important to understand your customer through the lenses of
market segmentation, channels, and types of relationships. Throughout the build-measure-learn cycle, these
data points help create empathy and shape the lessons learned.
Internal challenges
Choosing innovation candidates: When investing in innovation, healthy companies spawn an endless
supply of potential inventions. Many of these create compelling business cases that suggest high returns and
generate enticing business justification spreadsheets. As described in the build article, building with customer
empathy should be prioritized over invention that's based only on gain projections. If customer empathy isn't
visible in the proposal, long-term adoption is unlikely.
Balancing the portfolio: Most technology implementations don't focus on changing the market or improving
the lives of customers. In the average IT department, more than 80% of workloads are maintained for basic
process automation. With the ease of innovation, it's tempting to innovate and rearchitect those solutions. Most
of the time, those workloads can experience similar or better returns by migrating or modernizing the solution,
with no change to core business logic or data processes. Balance your portfolio to favor innovation strategies
that can be built with clear empathy for the customer (internal or external). For all other workloads, follow a
migrate path to financial returns.
Maintaining focus and protecting priorities: When you've made a commitment to innovation, it's
important to maintain your team's focus. During the first iteration of a build phase, it's relatively easy to keep a
team excited about the possibilities of changing the future for your customers. However, that first MVP release
is just the beginning. True innovation comes with each build-measure-learn cycle, by learning from the
feedback loops to produce a better solution. As a leader in any innovation process, you should concentrate on
keeping the team focused and on maintaining your innovation priorities through the subsequent, less-
glamorous build iterations.
Invention challenges
Before the widespread adoption of the cloud, invention cycles that depended on information technology were
laborious and time-consuming. Procurement and provisioning cycles frequently delayed the crucial first steps
toward any new solutions. The cost of DevOps solutions and feedback loops delayed teams' abilities to collaborate
on early stage ideation and invention. Costs related to developer environments and data platforms prevented
anyone but highly trained professional developers from participating in the creation of new solutions.
The cloud has overcome many of these invention challenges by providing self-service automated provisioning,
light-weight development and deployment tools, and opportunities for professional developers and citizen
developers to cooperate in creating rapid solutions. Using the cloud for innovation dramatically reduces customer
challenges and blockers to the invention side of the innovation equation.
Invention challenges in a digital economy
The invention challenges of today are different. The endless potential of cloud technologies also produces more
implementation options and deeper considerations about how those implementations might be used.
The Innovate methodology uses the following innovation disciplines to help align your implementation decisions
with your invention and adoption goals:
Data platforms: New sources and variations on data are available. Many of these couldn't be integrated into
legacy or on-premises applications to create cost-effective solutions. Understanding the change you hope to
drive in customers will inform your data platform decisions. Those decisions will be an extension of selected
approaches to ingest, integrate, categorize, and share data. Microsoft refers to this decision-making process as
the democratization of data.
Device interactions: IoT, mobile, and augmented reality blur the lines between digital and physical,
accelerating the digital economy. Understanding the real-world interactions surrounding customer behavior
will drive decisions about device integration.
Applications: Applications are no longer the exclusive domain of professional developers. Nor do they require
traditional server-based approaches. Empowering professional developers, enabling business specialists to
become citizen developers, and expanding compute options for API, micro-services, and PaaS solutions expand
application interface options. Understanding the digital experience required to shape customer behavior will
improve your decision-making about application options.
Source code and deployment: Collaboration between developers of all backgrounds improves both quality and
speed to market. Integration of feedback and a rapid response to learning shape market leaders. Commitments
to the build, measure, and learn processes help accelerate tool adoption decisions.
Predictive solutions: In a digital economy, it's seldom sufficient to simply meet the current needs of your
customers. Customers expect businesses to anticipate their next steps and predict their future needs.
Continuous learning often evolves into prediction tooling. The complexity of customer needs and the
availability of data will help define the best tools and approaches to predict and influence.
In a digital economy, the greatest challenge architects face is to clearly understand their customers' invention and
adoption needs and to then determine the best cloud-based toolchain to deliver on those needs.
Next steps
With the knowledge you've gained about the build-measure-learn model and a growth mindset, you're ready to
develop digital inventions within the Innovate methodology.
Develop digital inventions
Develop digital inventions
10/30/2020 • 2 minutes to read
As described in Innovation in the digital economy, innovation requires a balance between invention and adoption.
Customer feedback and partnership are required to drive adoption. The disciplines described in the next section
define a series of approaches to developing digital inventions while keeping adoption and customer empathy in
mind. Each of the disciplines is briefly described, along with deeper links into each process.
Next steps
Democratization of data is the first discipline of innovation to consider and evaluate.
Democratize data
Democratize data with digital invention
10/30/2020 • 7 minutes to read
Coal, oil, and human potential were the three most consequential assets during the industrial revolution. These
assets built companies, shifted markets, and ultimately changed nations. In the digital economy, there are three
equally important assets: data, devices, and human potential. Each of these assets holds great innovation
potential. For any innovation effort in the modern era, data is the new oil.
Across every company today, there are pockets of data that could be used to find and meet customer needs more
effectively. Unfortunately, the process of mining that data to drive innovation has long been costly and time-
consuming. Many of the most valuable solutions to customer needs go unmet because the right people can't
access the data they need.
Democratization of data is the process of getting this data into the right hands to drive innovation. This process
can take several forms, but they generally include solutions for ingested or integrated raw data, centralization of
data, sharing data, and securing data. When these methods are successful, experts around the company can use
the data to test hypotheses. In many cases, cloud adoption teams can build with customer empathy by using only
data, rapidly addressing existing customer needs.
Share data
When you build with customer empathy, all processes elevate customer need over a technical solution.
Democratizing data is no exception, so we start by sharing data. Any effort to democratize data must include a
solution that shares data with a data consumer. The data consumer could be a direct customer or a proxy who makes decisions
for customers. Approved data consumers can analyze, interrogate, and report on centralized data, with no
support from IT staff.
Many successful innovations have been launched as a minimum viable product (MVP) that delivers manual, data-
driven processes on behalf of the customer. In this concierge model, an employee is the data consumer. That
employee uses data to aid the customer. Each time the customer engages manual support, a hypothesis can be
tested and validated. This approach is often a cost-effective means of testing a customer-focused hypothesis
before you invest heavily in integrated solutions.
The primary tools for sharing data directly with data consumers include self-service reporting or data embedded
within other experiences, using tools like Power BI.
NOTE
Before you share data, make sure you've read the following sections. Sharing data might require governance to provide
protection for the shared data. Also, that data might be spread across multiple clouds and could require centralization.
Much of the data might even reside within applications, which will require data collection before you can share it.
Govern data
Sharing data can quickly produce an MVP that you can use in customer conversations. However, to turn that
shared data into useful and actionable knowledge, a bit more is generally required. After a hypothesis has been
validated through data sharing, the next phase of development is typically data governance.
Data governance is a broad topic that could require its own dedicated framework. That degree of granularity is
outside the scope of the Cloud Adoption Framework. However, there are several aspects of data governance that
you should consider as soon as the customer hypothesis is validated. For example:
Is the shared data sensitive? Data should be classified before being shared publicly to protect the interests
of customers and the company.
If the data is sensitive, has it been secured? Protection of sensitive data should be a requirement for any
democratized data. The example workload focused on securing data solutions provides a few references for
securing data.
Is the data cataloged? Capturing details about the data being shared will aid in long-term data
management. Tools for documenting data, like Azure Data Catalog, can make this process much easier in the
cloud. Guidance regarding the annotation of data and the documentation of data sources can help accelerate
the process.
When democratization of data is important to a customer-focused hypothesis, make sure the governance of
shared data is somewhere in the release plan. This will help protect customers, data consumers, and the company.
Centralize data
When data is dispersed across an IT environment, opportunities to innovate can be extremely constrained,
expensive, and time-consuming. The cloud provides new opportunities to centralize data across data silos. When
centralization of multiple data sources is required to build with customer empathy, the cloud can accelerate the
testing of hypotheses.
Caution
Centralization of data represents a risk point in any innovation process. When data centralization is a technical
spike, and not a source of customer value, we suggest that you delay centralization until the customer hypotheses
have been validated.
If centralization of data is required, you should first define the appropriate data store for the centralized data. It's a
good practice to establish a data warehouse in the cloud. This scalable option provides a central location for all
your data. This type of solution is available in online analytical processing (OLAP) or big data options.
The reference architectures for OLAP and big data solutions can help you choose the most relevant solution in
Azure. If a hybrid solution is required, the reference architecture for extending on-premises data can also help
accelerate solution development.
IMPORTANT
Depending on the customer need and the aligned solution, a simpler approach may be sufficient. The cloud architect should
challenge the team to consider lower cost solutions that could result in faster validation of the customer hypothesis,
especially during early development. The following section on collecting data covers some scenarios that might suggest a
different solution for your situation.
Collect data
When you need data to be centralized to address a customer need, it's very likely that you'll also have to collect
the data from various sources and move it into the centralized data store. The two primary forms of data
collection are integration and ingestion.
Integration: Data that resides in an existing data store can be integrated into the centralized data store by using
traditional data movement techniques. This is especially common for scenarios that involve multicloud data
storage. These techniques involve extracting the data from the existing data store and then loading it into the
central data store. At some point in this process, the data is typically transformed to be more usable and relevant
in the central store.
Cloud-based tools have turned these techniques into pay-per-use tools, reducing the barrier to entry for data
collection and centralization. Tools like Azure Database Migration Service and Azure Data Factory are two
examples. The reference architecture for data factory with an OLAP data store is an example of one such solution.
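As a minimal sketch of this extract-transform-load pattern, the following example uses in-memory SQLite databases to stand in for the existing and central stores. In practice a managed service such as Azure Data Factory would handle the data movement; all table and column names here are hypothetical.

```python
import sqlite3

# Illustrative only: extract, transform, and load between two stores.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE legacy_orders (order_id INTEGER, amount_cents INTEGER)")
source.executemany("INSERT INTO legacy_orders VALUES (?, ?)", [(1, 1250), (2, 999)])

central = sqlite3.connect(":memory:")
central.execute("CREATE TABLE orders (id INTEGER, amount_usd REAL)")

# Extract from the existing data store.
rows = source.execute("SELECT order_id, amount_cents FROM legacy_orders").fetchall()

# Transform the data so it's more usable and relevant in the central store.
transformed = [(order_id, cents / 100.0) for order_id, cents in rows]

# Load into the centralized data store.
central.executemany("INSERT INTO orders VALUES (?, ?)", transformed)
central.commit()
print(central.execute("SELECT * FROM orders").fetchall())
```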
Ingestion: Some data doesn't reside in an existing data store. When this transient data is a primary source of
innovation, you'll want to consider alternative approaches. Transient data can be found in a variety of existing
sources like applications, APIs, data streams, IoT devices, a blockchain, an application cache, in media content, or
even in flat files.
You can integrate these various forms of data into a central data store on an OLAP or big data solution. However,
for early iterations of the build-measure-learn cycle, an online transactional processing (OLTP) solution might be
more than sufficient to validate a customer hypothesis. OLTP solutions aren't the best option for any reporting
scenario. However, when you're building with customer empathy, it's more important to focus on customer needs
than on technical tooling decisions. After the customer hypothesis is validated at scale, a more suitable platform
might be required. The reference architecture on OLTP data stores can help you determine which data store is
most appropriate for your solution.
Virtualize: Integration and ingestion of data can sometimes slow innovation. When a solution for data
virtualization is already available, it might represent a more reasonable approach. Ingestion and integration can
both duplicate storage and development requirements, add data latency, increase attack surface area, trigger
quality issues, and increase governance efforts. Data virtualization is a more contemporary alternative that leaves
the original data in a single location and creates pass-through or cached queries of the source data.
SQL Server 2017 and Azure SQL Data Warehouse both support PolyBase, which is the approach to data
virtualization most commonly used in Azure.
Next steps
With a strategy for democratizing data in place, you'll next want to evaluate approaches to engaging customers
through applications.
Engaging customers through applications
Engage via applications
10/30/2020 • 8 minutes to read
As discussed in Democratize data, data is the new oil. It fuels most innovations across the digital economy.
Building on that analogy, applications are the fueling stations and infrastructure required to get that fuel into the
right hands.
In some cases, data alone is enough to drive change and meet customer needs. More commonly though, solutions
to customer needs require applications to shape the data and create an experience. Applications are the way we
engage the user and the home for the processes required to respond to customer triggers. Applications are how
customers provide data and receive guidance. This article summarizes several principles that can help align you
with the right application solution, based on the hypotheses to be validated.
Shared code
Teams that more quickly and accurately respond to customer feedback, market changes, and opportunities to
innovate typically lead their respective markets in innovation. The first principle of innovative applications is
summed up in the growth mindset overview: "Share the code." Over time, innovation emerges from a cultural
focus. To sustain innovation, diverse perspectives and contributions are required.
To be ready for innovation, all application development should start with a shared code repository. The most
widely adopted tool for managing code repositories is GitHub, which allows you to create a shared code
repository quickly. Alternatively, Azure Repos is a set of version control tools in Azure DevOps Services that you
can use to manage your code. Azure Repos provides two types of version control:
Git: Distributed version control.
Team Foundation Version Control (TFVC): Centralized version control.
Citizen developers
Professional developers are a vital component of innovation. When a hypothesis proves accurate at scale,
professional developers are required to stabilize and prepare the solution for scale. Most of the principles
referenced in this article require support from professional developers. Unfortunately, current trends suggest
there's a greater demand for professional developers than the available supply can meet. Moreover, the cost and pace of
innovation can be less favorable when professional development is deemed necessary. In response to these
challenges, citizen developers provide a way to scale development efforts and accelerate early hypothesis testing.
The use of citizen developers can be viable and effective when early hypotheses can be validated through tools like
Power Apps for application interfaces, AI Builder for processes and predictions, Microsoft Power Automate for
workflows, and Power BI for data consumption.
NOTE
When you rely on citizen developers to test hypotheses, it's advisable to have some professional developers on hand to
provide support, review, and guidance. After a hypothesis is validated at scale, a process for transitioning the application into
a more robust programming model will accelerate returns on the innovation. By involving professional developers in process
definitions early on, you can realize cleaner transitions later.
Intelligent experiences
Intelligent experiences combine the speed and scale of modern web applications with the intelligence of Cognitive
Services and bots. Alone, each of these technologies might be sufficient to meet your customers' needs. When
smartly combined, they broaden the spectrum of needs that can be met through a digital experience, while
helping to contain development costs.
Modern web apps
When an application or experience is required to meet a customer need, modern web applications can be the
fastest way to go. Modern web experiences can engage internal or external customers quickly and allow for rapid
iteration on the solution.
Infusing intelligence
Machine learning and AI are increasingly available to developers. The widespread availability of common APIs
with predictive capabilities allows developers to better meet the needs of the customer through expanded access
to data and predictions.
Adding intelligence to a solution can enable speech to text, text translation, Computer Vision, and even visual
search. With these expanded capabilities, it's easier for developers to build solutions that take advantage of
intelligence to create an interactive and modern experience.
Bots
Bots provide an experience that feels less like using a computer and more like dealing with a person — at least
with an intelligent robot. They can be used to shift simple, repetitive tasks (such as making a dinner reservation or
gathering profile information) onto automated systems that might no longer require direct human intervention.
Users converse with a bot through text, interactive cards, and speech. A bot interaction can range from a quick
question-and-answer to a sophisticated conversation that intelligently provides access to services.
Bots are a lot like modern web applications: they live on the internet and use APIs to send and receive messages.
What's in a bot varies widely depending on what kind of bot it is. Modern bot software relies on a stack of
technology and tools to deliver increasingly complex experiences on a variety of platforms. However, a simple bot
could just receive a message and echo it back to the user with very little code involved.
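The following sketch illustrates that point. It uses only the Python standard library and a hypothetical HTTP endpoint rather than any specific bot framework; a production bot would typically rely on a bot SDK and messaging channels.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Illustrative only: a minimal "echo" bot that receives a message over HTTP
# and sends the same text back. The route and payload shape are hypothetical.

class EchoBot(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        incoming = json.loads(self.rfile.read(length) or b"{}")
        reply = {"text": f"You said: {incoming.get('text', '')}"}
        body = json.dumps(reply).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), EchoBot).serve_forever()
```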
Bots can do the same things as other types of software: read and write files, use databases and APIs, and handle
regular computational tasks. What makes bots unique is their use of mechanisms generally reserved for human-
to-human communication.
Cloud-native solutions
Cloud-native applications are built from the ground up, and they're optimized for cloud scale and performance.
Cloud-native applications are typically built using microservices, serverless, event-based, or container-based
approaches. Most commonly, cloud-native solutions use a combination of microservices architectures, managed
services, and continuous delivery to achieve reliability and faster time to market.
A cloud-native solution allows centralized development teams to maintain control of the business logic without
the need for monolithic, centralized solutions. This type of solution also creates an anchor to drive consistency
across the input of citizen developers and modern experiences. Finally, cloud-native solutions provide an
innovation accelerator by freeing citizen and professional developers to innovate safely and with a minimum of
blockers.
Refactoring or rearchitecting solutions or centralizing business logic can quickly trigger a time-consuming
technical spike instead of a source of customer value. This is a risk to innovation, especially early in hypothesis
validation. With a bit of creativity in the design of a solution, there should be a path to MVP that doesn't require
refactoring of existing solutions. It's wise to delay refactoring until the initial hypothesis can be validated at scale.
Next steps
Depending on the hypothesis and solution, the principles in this article can aid in designing applications that meet
MVP definitions and engage users. Up next are the principles for empowering adoption, which offer ways to get
the application and data into the hands of customers more quickly and efficiently.
Empower adoption
Empower adoption with digital invention
10/30/2020 • 8 minutes to read
The ultimate test of innovation is customer reaction to your invention. Did the hypothesis prove true? Do
customers use the solution? Does it scale to meet the needs of the desired percentage of users? Most importantly,
do they keep coming back? None of these questions can be asked until the minimum viable product (MVP)
solution has been deployed. In this article, we'll focus on the discipline of empowering adoption.
Shared solution: Establish a centralized repository for all aspects of the solution.
Feedback loops: Make sure that feedback loops can be managed consistently through iterations.
Continuous integration: Regularly build and consolidate the solution.
Reliable testing: Validate solution quality and expected changes to ensure the reliability of your testing metrics.
Solution deployment: Deploy solutions so that the team can quickly share changes with customers.
Integrated measurement: Add learning metrics to the feedback loop for clear analysis by the full team.
To minimize technical spikes, assume that maturity will initially be low across each of these principles. But
definitely plan ahead by aligning to tools and processes that can scale as hypotheses become more fine-grained.
In Azure, GitHub and Azure DevOps allow small teams to get started with little friction. These teams might
grow to include thousands of developers who collaborate on scale solutions and test hundreds of customer
hypotheses. The remainder of this article illustrates the "plan big, start small" approach to empowering adoption
across each of these principles.
Shared solution
As described in Measure for customer impact, positive validation of any hypothesis requires iteration and
determination. You'll experience far more failures than wins during any innovation cycle. This is expected. However,
when a customer need, hypothesis, and solution align at scale, the world changes quickly.
When you're scaling innovation, there's no more valuable tool than a shared code base for the solution.
Unfortunately, there's no reliable way of predicting which iteration or which MVP will yield the winning
combination. That's why it's never too early to establish a shared code base or repository. This is the one technical
spike that should never be delayed. As the team iterates through various MVP solutions, a shared repo enables
easy collaboration and accelerated development. When changes to the solution drag down learning metrics,
version control lets you roll back to an earlier, more effective version of the solution.
The most widely adopted tool for managing code repositories is GitHub, which lets you create a shared code
repository in just a few steps. Additionally, the Azure Repos feature of Azure DevOps can be used to create a Git or
TFVC repository.
Feedback loops
Making the customer part of the solution is the key to building customer partnerships during innovation cycles.
That's accomplished, in part, by measuring customer impact. It requires conversations and direct testing with the
customer. Both generate feedback that must be managed effectively.
Every point of feedback is a potential solution to the customer need. More importantly, every bit of direct
customer feedback represents an opportunity to improve the partnership. If feedback makes it into an MVP
solution, celebrate that with the customer. Even if some feedback isn't actionable, simply being transparent with
the decision to deprioritize the feedback demonstrates a growth mindset and a focus on continuous learning.
Azure DevOps includes ways to request, provide, and manage feedback. Each of these tools centralizes feedback
so that the team can take action and provide follow-up in service of a transparent feedback loop.
Continuous integration
As adoption scales and a hypothesis gets closer to true innovation at scale, the number of smaller hypotheses to
be tested tends to grow rapidly. For accurate feedback loops and smooth adoption processes, it's important that
each of those hypotheses is integrated and supportive of the primary hypothesis behind the innovation. This
means that you also have to move quickly to innovate and grow, which requires multiple developers for testing
variations of the core hypothesis. For later stage development efforts, you might even need multiple teams of
developers, each building toward a shared solution. Continuous integration is the first step toward management
of all the moving parts.
In continuous integration, code changes are frequently merged into the main branch. Automated build and test
processes make sure that code in the main branch is always production quality. This ensures that developers are
working together to develop shared solutions that provide accurate and reliable feedback loops.
Azure DevOps and Azure Pipelines provide continuous integration capabilities with just a few steps in GitHub or a
variety of other repositories. Learn more about continuous integration, or check out the
hands-on lab. Solution architectures are available that can accelerate creation of your CI/CD pipelines via Azure
DevOps.
Reliable testing
Defects in any solution can create false positives or false negatives. Unexpected errors can easily lead to
misinterpretation of user adoption metrics. They can also generate negative feedback from customers that doesn't
accurately represent the test of your hypothesis.
During early iterations of an MVP solution, defects are expected; early adopters might even find them endearing.
In early releases, acceptance testing is typically nonexistent. However, one aspect of building with empathy
concerns the validation of the need and hypothesis. Both can be completed through unit tests at a code level and
manual acceptance tests before deployment. Together, these provide some means of reliability in testing. You
should strive to automate a well-defined series of build, unit, and acceptance tests. These will ensure reliable
metrics related to more granular tweaks to the hypothesis and the resulting solution.
The Azure Test Plans feature provides tooling to develop and operate test plans during manual or automated test
execution.
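As a minimal sketch of automating such checks, the following example uses Python's built-in unittest module to guard a hypothetical learning-metric function. In practice, tests like these would run automatically as part of your build pipeline; the function and its behavior are assumptions for illustration only.

```python
import unittest

# Illustrative only: a tiny automated check that protects a learning metric.

def classify_flow_event(followed_prescribed_path: bool) -> str:
    """Return the learning-metric label recorded for a customer-flow event."""
    return "aligned" if followed_prescribed_path else "deviated"

class FlowMetricTests(unittest.TestCase):
    def test_prescribed_path_records_positive_metric(self):
        self.assertEqual(classify_flow_event(True), "aligned")

    def test_deviation_records_negative_metric(self):
        self.assertEqual(classify_flow_event(False), "deviated")

if __name__ == "__main__":
    unittest.main()
```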
Solution deployment
Perhaps the most meaningful aspect of empowering adoption concerns your ability to control the release of a
solution to customers. By providing a self-service or automated pipeline for releasing a solution to customers,
you'll accelerate the feedback loop. By allowing customers to quickly interact with changes in the solution, you
invite them into the process. This approach also triggers quicker testing of hypotheses, thereby reducing
assumptions and potential rework.
There are several methods for solution deployment. The following are the three most common:
Continuous deployment is the most advanced method, as it automatically deploys code changes into
production. For mature teams that are testing mature hypotheses, continuous deployment can be extremely
valuable.
During early stages of development, continuous delivery might be more appropriate. In continuous delivery,
any code changes are automatically deployed to a production-like environment. Developers, business decision-
makers, and others on the team can use this environment to verify that their work is production-ready. You can
also use this method to test a hypothesis with customers without affecting ongoing business activities.
Manual deployment is the least sophisticated approach to release management. As the name suggests,
someone on the team manually deploys the most recent code changes. This approach is error prone,
unreliable, and considered an antipattern by most seasoned engineers.
During the first iteration of an MVP solution, manual deployment is common, despite the preceding assessment.
When the solution is extremely fluid and customer feedback is unknown, there's a significant risk in resetting the
entire solution (or even the core hypothesis). Here's the general rule for manual deployment: no customer proof,
no deployment automation.
Investing early can lead to lost time. More importantly, it can create dependencies on the release pipeline that
make the team more resistant to an early pivot. After the first few iterations or when customer feedback suggests
potential success, a more advanced model of deployment should be quickly adopted.
At any stage of hypothesis validation, Azure DevOps and Azure Pipelines provide continuous delivery and
continuous deployment capabilities. Learn more about continuous delivery, or check out the hands-on lab.
Solution architecture can also accelerate creation of your CI/CD pipelines through Azure DevOps.
Integrated measurements
When you measure for customer impact, it's important to understand how customers react to changes in the
solution. This data, known as telemetry, provides insights into the actions a user (or cohort of users) took when
working with the solution. From this data, it's easy to get a quantitative validation of the hypothesis. Those metrics
can then be used to adjust the solution and generate more fine-grained hypotheses. Those subtler changes help
mature the initial solution in subsequent iterations, ultimately driving to repeat adoption at scale.
In Azure, Azure Monitor provides the tools and interface to collect and review data from customer experiences. You
can apply those observations and insights to refine the backlog by using Azure Boards.
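As a rough sketch of what such telemetry might look like in code, the snippet below emits a custom span for a hypothetical checkout action to Application Insights. It assumes the azure-monitor-opentelemetry package and a valid connection string; the span and attribute names are illustrative only.

```python
# A sketch of emitting custom telemetry to Azure Monitor / Application Insights.
# Assumes: pip install azure-monitor-opentelemetry, and a valid connection string.
from azure.monitor.opentelemetry import configure_azure_monitor
from opentelemetry import trace

# Route OpenTelemetry data to Application Insights (the connection string is a placeholder).
configure_azure_monitor(connection_string="InstrumentationKey=00000000-0000-0000-0000-000000000000")

tracer = trace.get_tracer(__name__)

def record_checkout(user_cohort: str, items_in_cart: int) -> None:
    # Each span becomes telemetry you can query in Azure Monitor to validate the hypothesis.
    with tracer.start_as_current_span("checkout-completed") as span:
        span.set_attribute("user.cohort", user_cohort)         # hypothetical attribute names
        span.set_attribute("cart.item_count", items_in_cart)

record_checkout(user_cohort="early-adopters", items_in_cart=3)
```

Observations like these can then feed backlog refinement in Azure Boards, as described above.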
Next steps
After you've gained an understanding of the tools and processes needed to empower adoption, it's time to
examine a more advanced innovation discipline: interact with devices. This discipline can help reduce the barriers
between physical and digital experiences, making your solution even easier to adopt.
Interact with devices
Ambient experiences: Interact with devices
10/30/2020 • 8 minutes to read • Edit Online
In Build with customer empathy, we discussed the three tests of true innovation: solve a customer need, keep the
customer coming back, and scale across a base of customer cohorts. Each test of your hypothesis requires effort
and iterations on the approach to adoption. This article offers insights on some advanced approaches to reduce
that effort through ambient experiences. By interacting with devices, instead of an application, the customer may
be more likely to turn to your solution first.
Ambient experiences
An ambient experience is a digital experience that relates to the immediate surroundings. A solution that features
ambient experiences strives to meet the customer in their moment of need. When possible, the solution meets the
customer need without leaving the flow of activity that triggered it.
Life in the digital economy is full of distractions. We're all bombarded with social, email, web, visual, and verbal
messaging, each of which is a risk of distraction. This risk increases with every second that elapses between the
customer's point of need and the moment they encounter a solution. Countless customers are lost in that brief
time gap. To foster an increase in repeat adoption, you have to reduce the number of distractions by reducing the
time to solution.
Ambient experiences typically require more than a web application these days. Through measurement and
learning with the customer, the behavior that triggers the customer's need can be observed, tracked, and used to
build a more ambient experience. The following list summarizes a few approaches to integration of ambient
solutions into your hypotheses, with more details about each in the following paragraphs.
Mobile experience: As with laptops, mobile apps are ubiquitous in customer environments. In some
situations, this might provide a sufficient level of interactivity to make a solution ambient.
Mixed reality: Sometimes a customer's typical surroundings must be altered to make an interaction ambient.
This factor creates something of a false reality in which the customer interacts with the solution and has a need
met. In this case, the solution is ambient within the false reality.
Integrated reality: Moving closer to true ambience, integrated reality solutions focus on the use of a device
that exists within the customer's reality to integrate the solution into their natural behaviors. A Virtual Assistant
is a great example of integrating reality into the surrounding environment. A less well-known option concerns
Internet of Things (IoT) technologies, which integrate devices that already exist in the customer's surroundings.
Adjusted reality: When any of these ambient solutions use predictive analysis in the cloud to define and
provide an interaction with the customer through the natural surroundings, the solution has adjusted reality.
Understanding the customer need and measuring customer impact both help you determine whether a device
interaction or ambient experience is necessary to validate your hypothesis. With each of those data points, the
following sections will help you find the best solution.
Mobile experience
In the first stage of ambient experience, the user moves away from the computer. Today's consumers and business
professionals move fluidly between mobile and PC devices. Each of the platforms or devices used by your
customer creates a new potential experience. Adding a mobile experience that extends the primary solution is the
fastest way to improve integration into the customer's immediate surroundings. While a mobile device is far from
ambient, it might edge closer to the customer's point of need.
When customers are mobile and change locations frequently, that may represent the most relevant form of
ambient experience for a particular solution. Over the past decade, innovation has frequently been triggered by
the integration of existing solutions with a mobile experience.
Azure App Service is a great example of this approach. During early iterations, the web app feature of Azure App
Service can be used to test the hypothesis. As the hypotheses become more complex, the mobile app feature of
Azure App Service can extend the web app to run in a variety of mobile platforms.
Mixed reality
Mixed reality solutions represent the next level of maturity for ambient experiences. This approach augments or
replicates the customer's surroundings; it creates an extension of reality for the customer to operate within.
IMPORTANT
If a VR device is required and it's not already part of a customer's immediate surroundings or natural behaviors, augmented
or virtual reality is more of an alternative experience and less of an ambient experience.
Mixed reality experiences are increasingly common among remote workforces. Their use is growing even faster in
industries that require collaboration or specialty skills that aren't readily available in the local market. Situations
that require centralized implementation support of a complex product for a remote labor force are particularly
fertile ground for augmented reality. In these scenarios, the central support team and remote employees might
use augmented reality to work on, troubleshoot, and install the product.
For example, consider the case of spatial anchors. Spatial anchors allow you to create mixed reality experiences
with objects that persist their respective locations across devices over time. Through spatial anchors, a specific
behavior can be captured, recorded, and persisted, thereby providing an ambient experience the next time the user
operates within that augmented environment. Azure Spatial Anchors is a service that moves this logic to the cloud,
allowing experiences to be shared across devices and even across solutions.
Integrated reality
Beyond mobile reality or even mixed reality lies integrated reality. Integrated reality aims to remove the digital
experience entirely. All around us are devices with compute and connectivity capabilities. These devices can be
used to collect data from the immediate surroundings without the customer having to ever touch a phone, laptop,
or virtual reality (VR) device.
This experience is ideal when some form of device is consistently within the same surroundings in which the
customer need occurs. Common scenarios include factory floors, elevators, and even your car. These types of large
devices already contain compute power. You can also use data from the device itself to detect customer behaviors
and send those behaviors to the cloud. This automatic capture of customer behavior data dramatically reduces the
need for a customer to input data. Additionally, the web, mobile, or VR experience can function as a feedback loop
to share what's been learned from the integrated reality solution.
Examples of integrated reality in Azure could include:
Azure Internet of Things (IoT) solutions: A collection of services in Azure that each aid in managing devices and
the flow of data from those devices into the cloud and back out to end users.
Azure Sphere: A combination of hardware and software that provides an intrinsically secure way to enable an
existing device to securely transmit data between the device and Azure IoT solutions.
Azure Kinect DK: AI sensors with advanced computer vision and speech models. These sensors can collect
visual and audio data from the immediate surroundings and feed those inputs into your solution.
You can use all three of these tools to collect data from the natural surroundings and at the point of customer
need. From there, your solution can respond to those data inputs to solve the need, sometimes before the
customer is even aware that a trigger for that need has occurred.
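As a hedged sketch of that data flow, the snippet below shows a device reporting an ambient reading to Azure IoT Hub by using the azure-iot-device SDK; the connection string and payload fields are placeholders, not part of any specific solution.

```python
# A sketch of an integrated reality data flow: a device reports its surroundings to Azure IoT Hub.
# Assumes: pip install azure-iot-device, plus a device connection string from your IoT hub.
import json
from azure.iot.device import IoTHubDeviceClient, Message

CONNECTION_STRING = "HostName=<your-hub>.azure-devices.net;DeviceId=<device>;SharedAccessKey=<key>"  # placeholder

def send_ambient_reading(temperature_c: float, motion_detected: bool) -> None:
    client = IoTHubDeviceClient.create_from_connection_string(CONNECTION_STRING)
    try:
        # The payload fields are illustrative; a real solution would match its own telemetry schema.
        payload = {"temperatureC": temperature_c, "motionDetected": motion_detected}
        client.send_message(Message(json.dumps(payload)))
    finally:
        client.disconnect()

send_ambient_reading(temperature_c=72.5, motion_detected=True)
```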
Adjusted reality
The highest form of ambient experience is adjusted reality, often referred to as ambient intelligence. Adjusted
reality is an approach to using information from your solution to change the customer's reality without requiring
them to interact directly with an application. In this approach, the application you initially built to prove your
hypothesis might no longer be relevant at all. Instead, devices in the environment help modulate the inputs and
outputs to meet customer needs.
Virtual assistants and smart speakers offer great examples of adjusted reality. Alone, a smart speaker is an
example of simple integrated reality. But add a smart light and motion sensor to a smart speaker solution and it's
easy to create a basic solution that turns on the lights when you enter a room.
Factory floors around the world provide additional examples of adjusted reality. During early stages of integrated
reality, sensors on devices detected conditions like overheating, and then alerted a human being through an
application. In adjusted reality, the customer might still be involved, but the feedback loop is tighter. On an
adjusted reality factory floor, one device might detect overheating in a vital machine somewhere along the
assembly line. Somewhere else on the floor, a second device then slows production slightly to allow the machine
to cool and then resume full pace when the condition is resolved. In this situation, the customer is a second-hand
participant. The customer uses your application to set the rules and understand how those rules have affected
production, but they're not necessary to the feedback loop.
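The overheating scenario above can be reduced to a small rule loop. The following plain-Python sketch is purely illustrative: the thresholds, the line_controller interface, and the speed values are hypothetical stand-ins for logic that would normally run in the cloud or at the edge.

```python
# Illustrative sketch of an adjusted reality feedback loop on a factory floor.
# The controller object and thresholds are hypothetical stand-ins for real device integrations.
OVERHEAT_THRESHOLD_C = 85.0   # rule set by the customer through your application
COOLDOWN_THRESHOLD_C = 70.0

def adjust_line_speed(machine_temp_c: float, line_controller) -> str:
    """Slow the line when a vital machine overheats; resume full pace once it cools."""
    if machine_temp_c >= OVERHEAT_THRESHOLD_C:
        line_controller.set_speed_percent(60)   # a second device slows production slightly
        return "slowed"
    if machine_temp_c <= COOLDOWN_THRESHOLD_C:
        line_controller.set_speed_percent(100)  # condition resolved, resume full pace
        return "resumed"
    return "unchanged"
```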
The Azure services described in Azure Internet of Things (IoT) solutions, Azure Sphere, and Azure Kinect DK can all
be components of an adjusted reality solution. Your original application and business logic would then serve as
the intermediary between the environmental input and the change that should be made in the physical
environment.
A digital twin is another example of adjusted reality. This term refers to a digital representation of a physical
device, presented through computer, mobile, or mixed-reality formats. Unlike less sophisticated 3D models, a
digital twin reflects data collected from an actual device in the physical environment. This solution allows the user
to interact with the digital representation in ways that could never be done in the real world. In this approach,
physical devices adjust a mixed reality environment. However, the solution still gathers data from an integrated
reality solution and uses that data to shape the reality of the customer's current surroundings.
In Azure, digital twins are created and accessed through a service called Azure Digital Twins.
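As a rough sketch of working with a twin programmatically, the snippet below reads and patches a temperature property by using the azure-digitaltwins-core client library. The instance URL, twin ID, and property name are placeholders, and the twin's model is assumed to define a temperature property.

```python
# A sketch of reading and updating a digital twin with Azure Digital Twins.
# Assumes: pip install azure-digitaltwins-core azure-identity, an existing instance, and a twin
# whose model defines a 'temperature' property (all names below are placeholders).
from azure.identity import DefaultAzureCredential
from azure.digitaltwins.core import DigitalTwinsClient

client = DigitalTwinsClient(
    "https://<your-instance>.api.<region>.digitaltwins.azure.net", DefaultAzureCredential()
)

twin = client.get_digital_twin("factory-press-01")   # digital representation of a physical device
print(twin.get("temperature"))

# Reflect the latest physical reading on the twin with a JSON Patch document.
client.update_digital_twin("factory-press-01", [
    {"op": "replace", "path": "/temperature", "value": 78.4}
])
```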
Next steps
Now that you have a deeper understanding of device interactions and the ambient experience that's right for your
solution, you're ready to explore the final discipline of innovation, predict and influence.
Predict and influence
Predict and influence
10/30/2020 • 5 minutes to read • Edit Online
There are two classes of applications in the digital economy: historical and predictive. Many customer needs can be
met solely by using historical data, including nearly real-time data. Most solutions focus primarily on aggregating
data in the moment. They then process and share that data back to the customer in the form of a digital or
ambient experience.
As predictive modeling becomes more cost-effective and readily available, customers demand forward-thinking
experiences that lead to better decisions and actions. However, that demand doesn't always suggest a predictive
solution. In most cases, a historical view can provide enough data to empower the customer to make a decision on
their own.
Unfortunately, customers often take a myopic view that leads to decisions based on their immediate surroundings
and sphere of influence. As options and decisions grow in number and impact, that myopic view may not serve the
customer's needs. At the same time, as a hypothesis is proven at scale, the company providing the solution can see
across thousands or millions of customer decisions. This big-picture approach makes it possible to see broad
patterns and the impacts of those patterns. Predictive capability is a wise investment when an understanding of
those patterns is necessary to make decisions that best serve the customer.
If the customer hypothesis developed in Build with customer empathy includes predictive capabilities, the
principles described there might well apply. However, predictive capabilities require significant investment of time
and energy. When predictive capabilities are technical spikes, as opposed to a source of real customer value, we
suggest that you delay predictions until the customer hypotheses have been validated at scale.
Data
Data is the most elemental of the characteristics mentioned earlier. Each of the disciplines for developing digital
inventions generates data. That data, of course, contributes to the development of predictions. For more guidance
on ways to get data into a predictive solution, see Democratizing data and interacting with devices.
A variety of data sources can be used to deliver predictive capabilities. The sections that follow describe how insights, patterns, predictions, and interactions build on that data.
Insights
Subject matter experts use data about customer needs and behaviors to develop basic business insights from a
study of raw data. Those insights can pinpoint occurrences of the desired customer behaviors (or, alternatively,
undesirable results). During iterations on the predictions, these insights can aid in identifying potential correlations
that could ultimately generate positive outcomes. For guidance on enabling subject matter experts to develop
insights, see Democratizing data.
Patterns
People have always tried to detect patterns in large volumes of data. Computers were designed for that purpose.
Machine learning accelerates that quest by detecting precisely such patterns; the detected patterns make up the
machine learning model. That model is then applied through machine learning algorithms to predict outcomes
when a new set of data is entered into the algorithms.
Using insights as a starting point, machine learning develops and applies predictive models to capitalize on the
patterns in data. Through multiple iterations of training, testing, and adoption, those models and algorithms can
accurately predict future outcomes.
Azure Machine Learning is the cloud-native service in Azure for building and training models based on your data.
This tool also includes a workflow for accelerating the development of machine learning algorithms. This workflow
can be used to develop algorithms through a visual interface or Python.
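To illustrate the pattern-detection step in Python, independent of any specific Azure Machine Learning workflow, the following sketch trains a simple model on synthetic behavior data; the features, labels, and thresholds are invented for illustration only.

```python
# Illustrative pattern detection: train a simple model on (synthetic) customer behavior data.
# This is a generic scikit-learn sketch, not an Azure Machine Learning workflow.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
# Hypothetical features: sessions per week, minutes per session. Label: did the customer return?
X = rng.normal(loc=[3.0, 12.0], scale=[1.5, 5.0], size=(500, 2))
y = (X[:, 0] + 0.2 * X[:, 1] + rng.normal(scale=1.0, size=500) > 5.5).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = LogisticRegression().fit(X_train, y_train)       # the detected pattern becomes the model
print("holdout accuracy:", model.score(X_test, y_test))  # iterate on training and testing before adoption
```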
For more robust machine learning models, ML Services in Azure HDInsight provides a machine learning platform
built on Apache Hadoop clusters. This approach enables more granular control of the underlying clusters, storage,
and compute nodes. Azure HDInsight also offers more advanced integration through tools like ScaleR and SparkR
to create predictions based on integrated and ingested data, even working with data from a stream. The flight
delay prediction solution demonstrates each of these advanced capabilities when used to predict flight delays
based on weather conditions. The HDInsight solution also allows for enterprise controls, such as data security,
network access, and performance monitoring to operationalize patterns.
Predictions
After a pattern is built and trained, you can apply it through APIs, which can make predictions during the delivery
of a digital experience. Most of these APIs are built from a well-trained model based on a pattern in your data. As
more customers deploy everyday workloads to the cloud, the prediction APIs used by cloud providers lead to ever-
faster adoption.
Azure Cognitive Services is an example of a predictive API built by a cloud vendor. This service includes predictive
APIs for content moderation, anomaly detection, and suggestions to personalize content. These APIs are ready to
use and are based on well-known content patterns, which Microsoft has used to train models. Each of those APIs
makes predictions based on the data you feed into the API.
Azure Machine Learning lets you deploy custom-built algorithms, which you can create and train based solely on
your own data. Learn more about deploying predictions with Azure Machine Learning.
Set up HDInsight clusters discusses the processes for exposing predictions developed for ML Services on Azure
HDInsight.
Interactions
After a prediction is made available through an API, you can use it to influence customer behavior. That influence
takes the form of interactions. An interaction with a machine learning algorithm happens within your other digital
or ambient experiences. As data is collected through the application or experience, it's run through the machine
learning algorithms. When the algorithm predicts an outcome, that prediction can be shared back with the
customer through the existing experience.
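As a hedged sketch of that interaction, the snippet below posts newly collected data to a deployed model's scoring endpoint and surfaces the prediction back to the experience. The endpoint URI, key, and request schema are placeholders that depend entirely on how the model was deployed.

```python
# A sketch of an interaction: send collected data to a deployed prediction API and use the result.
# The scoring URI, key, and payload schema are placeholders; they depend on your own deployment.
import requests

SCORING_URI = "https://<your-endpoint>.example.azure.com/score"   # placeholder
API_KEY = "<key>"                                                  # placeholder

def predict_and_respond(session_features: list[float]) -> None:
    response = requests.post(
        SCORING_URI,
        json={"data": [session_features]},
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=10,
    )
    response.raise_for_status()
    prediction = response.json()
    # Share the prediction back with the customer through the existing digital or ambient experience.
    print("predicted outcome:", prediction)

predict_and_respond([3.2, 14.0])
```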
Learn more about how to create an ambient experience through an adjusted reality solution.
Next steps
Having acquainted yourself with disciplines of invention and the Innovate methodology, you're now ready to learn
how to build with customer empathy.
Build with empathy
Governance in the Microsoft Cloud Adoption
Framework for Azure
10/30/2020 • 3 minutes to read • Edit Online
The cloud creates new paradigms for the technologies that support the business. These new paradigms also
change how those technologies are adopted, managed, and governed. When entire datacenters can be
virtually torn down and rebuilt with one line of code executed by an unattended process, we have to rethink
traditional approaches. This is especially true for governance.
Cloud governance is an iterative process. For organizations with existing policies that govern on-premises
IT environments, cloud governance should complement those policies. The level of corporate policy
integration between on-premises and the cloud varies depending on cloud governance maturity and the
digital estate in the cloud. As the cloud estate changes over time, so do cloud governance processes and
policies. The following exercises help you start building your initial governance foundation.
1. Methodology: Establish a basic understanding of the methodology that drives cloud governance in the
Cloud Adoption Framework to begin thinking through the end state solution.
2. Benchmark: Assess your current state and future state to establish a vision for applying the framework.
3. Initial governance foundation: Begin your governance journey with a small, easily implemented set of
governance tools. This initial governance foundation is called a minimum viable product (MVP).
4. Improve the initial governance foundation: Throughout implementation of the cloud adoption plan,
iteratively add governance controls to address tangible risks as you progress toward the end state.
Intended audience
The content in the Cloud Adoption Framework affects the business, technology, and culture of enterprises.
This section of the Cloud Adoption Framework interacts heavily with IT security, IT governance, finance, line-
of-business leaders, networking, identity, and cloud adoption teams. Various dependencies on these
personnel require a facilitative approach by the cloud architects using this guidance. Facilitation with these
teams might be a one-time effort. In some cases, interactions with these other personnel will be ongoing.
The cloud architect serves as the thought leader and facilitator to bring these audiences together. The
content in this collection of guides is designed to help the cloud architect facilitate the right conversation,
with the right audience, to drive necessary decisions. Business transformation that's empowered by the
cloud depends on the cloud architect to help guide decisions throughout the business and IT.
Cloud architect specialization in this section: Each section of the Cloud Adoption Framework
represents a different specialization or variant of the cloud architect role. This section of the Cloud Adoption
Framework is designed for cloud architects with a passion for mitigating or reducing technical risks. Some
cloud providers refer to these specialists as cloud custodians, but we prefer cloud guardians or, collectively,
the cloud governance team. The actionable governance guides show how the composition and role of the
cloud governance team might change over time.
Adopting the cloud is a journey, not a destination. Along the way, there are clear milestones and tangible business
benefits. The final state of cloud adoption is unknown when a company begins the journey. Cloud governance
creates guardrails that keep the company on a safe path throughout the journey.
The Cloud Adoption Framework provides governance guides that describe the experiences of fictional companies
that are based on the experiences of real customers. Each guide follows the customer through the governance
aspects of their cloud adoption.
The Cloud Adoption Framework governance model identifies key areas of importance during the journey. Each
area relates to different types of risks the company must address as it adopts more cloud services. Within this
framework, the governance guide identifies required actions for the cloud governance team. Along the way, each
principle of the Cloud Adoption Framework governance model is described further. Broadly, these include:
Corporate policies: Corporate policies drive cloud governance. The governance guide focuses on specific
aspects of corporate policy:
Business risks: Identifying and understanding corporate risks.
Policy and compliance: Converting risks into policy statements that support any compliance requirements.
Processes: Ensuring adherence to the stated policies.
Five Disciplines of Cloud Governance: These disciplines support the corporate policies. Each discipline
protects the company from potential pitfalls:
Cost Management discipline
Security Baseline discipline
Resource Consistency discipline
Identity Baseline discipline
Deployment Acceleration discipline
Essentially, corporate policies serve as the early warning system to detect potential problems. The disciplines help
the company manage risks and create guardrails.
NOTE
Governance is not a replacement for key functions such as security, networking, identity, finance, DevOps, or operations.
Along the way, there will be interactions with and dependencies on members from each function. Those members should be
included on the cloud governance team to accelerate decisions and actions.
Next steps
Learn to use the Cloud Adoption Framework governance benchmark tool to assess your transformation journey
and help you identify gaps in your organization across six key domains as defined in the framework.
Assess your transformation journey
Assess your transformation journey
5/19/2020 • 2 minutes to read • Edit Online
The Cloud Adoption Framework provides a governance benchmark tool to help you identify gaps in your
organization across six key domains as defined in the framework.
Next steps
Begin your governance journey with a small, easily implemented set of governance tools. This initial governance
foundation is called a minimum viable product (MVP).
Establish an initial governance foundation
Establish an initial cloud governance foundation
10/30/2020 • 2 minutes to read • Edit Online
Establishing cloud governance is a broad iterative effort. It is challenging to strike an effective balance between
speed and control, especially during execution of early methodologies within the cloud adoption effort. The governance
guidance in the Cloud Adoption Framework helps provide that balance via an agile approach to adoption.
This article provides two options for establishing an initial foundation for governance. Either option ensures that
governance constraints can be scaled and expanded as the adoption plan is implemented and requirements
become more clearly defined. By default, the initial foundation assumes an isolate-and-control position. It also
focuses more on resource organization than on resource governance. This lightweight starting point is called a
minimum viable product (MVP) for governance. The objective of the MVP is reducing barriers to establishing an
initial governance position, and then enabling rapid maturation of the solution to address a variety of tangible
risks.
Next steps
Once a governance foundation is in place, apply suitable recommendations to improve the solution and protect
against tangible risks.
Improve the initial governance foundation
Improve your initial cloud governance foundation
10/30/2020 • 2 minutes to read • Edit Online
This article assumes that you have established an initial cloud governance foundation. As your cloud adoption
plan is implemented, tangible risks will emerge from the proposed approaches by which teams want to adopt the
cloud. As these risks surface in release planning conversations, use the following grid to quickly identify a few best
practices for getting ahead of the adoption plan to prevent risks from becoming real threats.
Maturity vectors
At any time, the following best practices can be applied to the initial governance foundation to address the risk or
need mentioned in the table below.
IMPORTANT
Resource organization can affect how these best practices are applied. It is important to start with the recommendations
that best align with the initial cloud governance foundation you implemented in the previous step.
Next steps
In addition to the application of best practices, the Govern methodology of the Cloud Adoption Framework can be
customized to fit unique business constraints. After following the applicable recommendations, evaluate corporate
policy to understand additional customization requirements.
Evaluate corporate policy
Cloud governance guides
10/30/2020 • 5 minutes to read • Edit Online
The actionable governance guides in this section illustrate the incremental approach of the Cloud Adoption
Framework governance model, based on the Govern methodology previously described. You can establish an
agile approach to cloud governance that will grow to meet the needs of any cloud governance scenario.
WARNING
A more robust governance starting point may be required. In such cases, consider the CAF enterprise-scale landing zone.
This approach focuses on adoption teams who have a mid-term objective (within 24 months) to host more than 1,000
assets (infrastructure, apps, or data) in the cloud. The CAF enterprise-scale landing zone is the typical choice for complex
governance scenarios in large cloud adoption efforts.
NOTE
It's unlikely that either guide aligns entirely with your situation. Choose whichever guide is closest and use it as a starting
point. Throughout the guide, additional information is provided to help you customize decisions to meet specific criteria.
Business characteristics
Characteristic | Standard organization | Complex enterprise
Geography (country or geopolitical region) | Customers or staff reside largely in one geography | Customers or staff reside in multiple geographies or require sovereign clouds.
Business units affected | Business units that share a common IT infrastructure | Multiple business units that do not share a common IT infrastructure.
Datacenter or third-party hosting providers | Fewer than five datacenters | More than five datacenters
Cost Management: cloud accounting | Showback model. Billing is centralized through IT. | Chargeback model. Billing could be distributed through IT procurement.
Security Baseline: protected data | Company financial data and IP. Limited customer data. No third-party compliance requirements. | Multiple collections of customers' financial and personal data. Might need to consider third-party compliance.
Next steps
Choose one of these guides:
Standard enterprise governance guide
Governance guide for complex enterprises
Standard enterprise governance guide
10/30/2020 • 8 minutes to read • Edit Online
WARNING
This MVP is a baseline starting point, based on a set of assumptions. Even this minimal set of best practices is based on
corporate policies that are driven by unique business risks and risk tolerances. To see whether these assumptions apply to
you, read the longer narrative that follows this article.
Every application should be deployed in the proper area of the management group, subscription, and resource
group hierarchy. During deployment planning, the cloud governance team will create the necessary nodes in the
hierarchy to empower the cloud adoption teams.
1. One management group for each type of environment (such as production, development, and test).
2. Two subscriptions, one for production workloads and another for nonproduction workloads.
3. Consistent nomenclature should be applied at each level of this grouping hierarchy.
4. Resource groups should be deployed in a manner that considers their contents' lifecycle: everything that is
developed together, managed together, and retired together should go together. For more information about
resource group best practices, see here.
5. Region selection is incredibly important and must be considered so that networking, monitoring, and auditing can
be in place for failover/failback as well as confirmation that needed SKUs are available in the preferred
regions.
Here is an example of this pattern in use:
These patterns provide room for growth without complicating the hierarchy unnecessarily.
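As a brief, hedged illustration of consistent nomenclature and lifecycle-based grouping, the sketch below creates a tagged resource group with the Azure SDK for Python; the subscription ID, naming convention, and tag values are placeholders, not a prescribed standard.

```python
# A sketch of applying consistent nomenclature and lifecycle grouping when creating a resource group.
# Assumes: pip install azure-identity azure-mgmt-resource; names, tags, and IDs are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

subscription_id = "<nonproduction-subscription-id>"   # placeholder
client = ResourceManagementClient(DefaultAzureCredential(), subscription_id)

# Everything in this group is developed, managed, and retired together.
client.resource_groups.create_or_update(
    "rg-contoso-crm-dev-eastus2",          # hypothetical naming convention: rg-<org>-<app>-<env>-<region>
    {
        "location": "eastus2",             # confirm required SKUs are available in the chosen region
        "tags": {"Environment": "Dev", "Application": "crm", "ApplicationOwner": "owner@contoso.com"},
    },
)
```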
NOTE
In the event of changes to your business requirements, Azure management groups allow you to easily reorganize your
management hierarchy and subscription group assignments. However, keep in mind that policy and role assignments
applied to a management group are inherited by all subscriptions underneath that group in the hierarchy. If you plan to
reassign subscriptions between management groups, make sure that you are aware of any policy and role assignment
changes that may result. See the Azure management groups documentation for more information.
Governance of resources
A set of global policies and RBAC roles will provide a baseline level of governance enforcement. To meet the cloud
governance team's policy requirements, implementing the governance MVP requires completing the following
tasks:
1. Identify the Azure Policy definitions needed to enforce business requirements. This might include using built-in
definitions and creating new custom definitions. To keep up with the pace of newly released built-in
definitions, there's an atom feed of all the commits for built-in policies, which you can use as an RSS feed.
Alternatively, you can check AzAdvertizer.
2. Create a blueprint definition using these built-in and custom policy definitions and the role assignments required by the
governance MVP.
3. Apply policies and configuration globally by assigning the blueprint definition to all subscriptions.
Identify policy definitions
Azure provides several built-in policies and role definitions that you can assign to any management group,
subscription, or resource group. Many common governance requirements can be handled using built-in
definitions. However, it's likely that you will also need to create custom policy definitions to handle your specific
requirements.
Custom policy definitions are saved to either a management group or a subscription and are inherited through
the management group hierarchy. If a policy definition's save location is a management group, that policy
definition is available to assign to any of that group's child management groups or subscriptions.
Since the policies required to support the governance MVP are meant to apply to all current subscriptions, the
following business requirements will be implemented using a combination of built-in definitions and custom
definitions created in the root management group:
1. Restrict the list of available role assignments to a set of built-in Azure roles authorized by your cloud
governance team. This requires a custom policy definition.
2. Require the following tags on all resources: Department/Billing Unit, Geography, Data Classification, Criticality,
SLA, Environment, Application Archetype, Application, and Application Owner. This can be handled using the
Require specified tag built-in definition.
3. Require that the Application tag for resources match the name of the relevant resource group. This
can be handled using the "Require tag and its value" built-in definition.
For information on defining custom policies see the Azure Policy documentation. For guidance and examples of
custom policies, consult the Azure Policy samples site and the associated GitHub repository.
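To make the tag requirement concrete, the following hedged sketch defines a minimal custom policy that denies resources missing an Environment tag and saves it with the Azure SDK for Python. The names and rule are illustrative; in this guide the definition would be created at the root management group, while the sketch saves it at subscription scope for simplicity, and a built-in definition may already meet the requirement.

```python
# A sketch of a custom Azure Policy definition that denies resources missing an 'Environment' tag.
# Assumes: pip install azure-identity azure-mgmt-resource; the subscription ID and names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import PolicyClient

policy_client = PolicyClient(DefaultAzureCredential(), "<subscription-id>")

policy_client.policy_definitions.create_or_update(
    "require-environment-tag",
    {
        "policy_type": "Custom",
        "mode": "Indexed",
        "display_name": "Require an Environment tag on all resources",
        "policy_rule": {
            "if": {"field": "tags['Environment']", "exists": "false"},
            "then": {"effect": "deny"},
        },
    },
)
```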
Assign Azure Policy and RBAC roles using Azure Blueprints
Azure policies can be assigned at the resource group, subscription, and management group level, and can be
included in Azure Blueprints definitions. Although the policy requirements defined in this governance MVP apply
to all current subscriptions, it's very likely that future deployments will require exceptions or alternative policies.
As a result, assigning policy using management groups, with all child subscriptions inheriting these assignments,
may not be flexible enough to support these scenarios.
Azure Blueprints allows consistent assignment of policy and roles, application of Resource Manager templates,
and deployment of resource groups across multiple subscriptions. Like policy definitions, blueprint definitions are
saved to management groups or subscriptions. These definitions are available through inheritance to any
children in the management group hierarchy.
The cloud governance team has decided that enforcement of required Azure Policy and RBAC assignments across
subscriptions will be implemented through Azure Blueprints and associated artifacts:
1. In the root management group, create a blueprint definition named governance-baseline .
2. Add the following blueprint artifacts to the blueprint definition:
a. Policy assignments for the custom Azure Policy definitions defined at the management group root.
b. Resource group definitions for any groups required in subscriptions created or governed by the
Governance MVP.
c. Standard role assignments required in subscriptions created or governed by the Governance MVP.
3. Publish the blueprint definition.
4. Assign the governance-baseline blueprint definition to all subscriptions.
See the Azure Blueprints documentation for more information on creating and using blueprint definitions.
Secure hybrid VNet
Specific subscriptions often require some level of access to on-premises resources. This is common in migration
scenarios or dev scenarios where dependent resources reside in the on-premises datacenter.
Until trust in the cloud environment is fully established, it's important to tightly control and monitor any allowed
communication between the on-premises environment and cloud workloads, and to ensure that the on-premises
network is secured against potential unauthorized access from cloud-based resources. To support these scenarios, the
governance MVP adds the following best practices:
1. Establish a cloud secure hybrid VNet.
a. The VPN reference architecture establishes a pattern and deployment model for creating a VPN
Gateway in Azure.
b. Validate that on-premises security and traffic management mechanisms treat connected cloud
networks as untrusted. Resources and services hosted in the cloud should only have access to
authorized on-premises services.
c. Validate that the local edge device in the on-premises datacenter is compatible with Azure VPN
Gateway requirements and is configured to access the public internet.
d. Note that VPN tunnels should not be considered production-ready circuits for anything but the
simplest workloads. Anything beyond a few simple workloads requiring on-premises connectivity should
use Azure ExpressRoute.
2. In the root management group, create a second blueprint definition named secure-hybrid-vnet .
a. Add the Resource Manager template for the VPN Gateway as an artifact to the blueprint definition.
b. Add the Resource Manager template for the virtual network as an artifact to the blueprint definition.
c. Publish the blueprint definition.
3. Assign the secure-hybrid-vnet blueprint definition to any subscriptions requiring on-premises connectivity.
This definition should be assigned in addition to the governance-baseline blueprint definition.
One of the biggest concerns raised by IT security and traditional governance teams is the risk that early stage
cloud adoption will compromise existing assets. The above approach allows cloud adoption teams to build and
migrate hybrid solutions, with reduced risk to on-premises assets. As trust in the cloud environment increases,
later evolutions may remove this temporary solution.
NOTE
The above is a starting point to quickly create a baseline governance MVP. This is only the beginning of the governance
journey. Further evolution will be needed as the company continues to adopt the cloud and takes on more risk in the
following areas:
Mission-critical workloads
Protected data
Cost management
Multicloud scenarios
Moreover, the specific details of this MVP are based on the example journey of a fictional company, described in the articles
that follow. We highly recommend becoming familiar with the other articles in this series before implementing this best
practice.
Next steps
Now that you're familiar with the governance MVP and have an idea of the governance improvements to follow,
read the supporting narrative for additional context.
Read the supporting narrative
Standard enterprise governance guide: The narrative
behind the governance strategy
10/30/2020 • 2 minutes to read • Edit Online
The following narrative describes the use case for governance during a standard enterprise's cloud adoption
journey. Before implementing the journey, it's important to understand the assumptions and rationale that are
reflected in this narrative. Then you can better align the governance strategy to your organization's journey.
Back story
The board of directors started the year with plans to energize the business in several ways. They're pushing
leadership to improve customer experiences to gain market share. They're also pushing for new products and
services that will position the company as a thought leader in the industry. They also initiated a parallel effort to
reduce waste and cut unnecessary costs. Though intimidating, the actions of the board and leadership show that
this effort is focusing as much capital as possible on future growth.
In the past, the company's CIO has been excluded from these strategic conversations. Because the future vision is
intrinsically linked to technical growth, IT has a seat at the table to help guide these big plans. IT is now expected to
deliver in new ways. The team isn't prepared for these changes and is likely to struggle with the learning curve.
Business characteristics
The company has the following business profile:
All sales and operations reside in a single country, with a low percentage of global customers.
The business operates as a single business unit, with budget aligned to functions, including sales, marketing,
operations, and IT.
The business views most of IT as a capital drain or a cost center.
Current state
Here is the current state of the company's IT and cloud operations:
IT operates two hosted infrastructure environments. One environment contains production assets. The
second environment contains disaster recovery and some dev/test assets. These environments are hosted
by two different providers. IT refers to these two datacenters as Prod and DR respectively.
IT entered the cloud by migrating all end-user email accounts to Microsoft 365. This migration was
completed six months ago. Few other IT assets have been deployed to the cloud.
The application development teams are working in a dev/test capacity to learn about cloud-native
capabilities.
The business intelligence (BI) team is experimenting with big data in the cloud and curation of data on new
platforms.
The company has a loosely defined policy stating that personal customer data and financial data cannot be
hosted in the cloud, which limits mission-critical applications in the current deployments.
IT investments are controlled largely by capital expense. Those investments are planned yearly. In the past
several years, investments have included little more than basic maintenance requirements.
Future state
The following changes are anticipated over the next several years:
The CIO is reviewing the policy on personal data and financial data to allow for the future state goals.
The application development and BI teams want to release cloud-based solutions to production over the
next 24 months based on the vision for customer engagement and new products.
This year, the IT team will finish retiring the disaster recovery workloads of the DR datacenter by migrating
2,000 VMs to the cloud. This is expected to produce an estimated $25m USD cost savings over the next five
years.
The company plans to change how it makes IT investments by repositioning the committed capital expense
as an operating expense within IT. This change will provide greater cost control and enable IT to accelerate
other planned efforts.
Next steps
The company has developed a corporate policy to shape the governance implementation. The corporate policy
drives many of the technical decisions.
Review the initial corporate policy
Standard enterprise governance guide: Initial
corporate policy behind the governance strategy
10/30/2020 • 4 minutes to read • Edit Online
The following corporate policy defines an initial governance position, which is the starting point for this guide. This
article defines early-stage risks, initial policy statements, and early processes to enforce policy statements.
NOTE
The corporate policy is not a technical document, but it drives many technical decisions. The governance MVP described in
the overview ultimately derives from this policy. Before implementing a governance MVP, your organization should develop a
corporate policy based on your own objectives and business risks.
Objective
The initial objective is to establish a foundation for governance agility. An effective Governance MVP allows the
governance team to stay ahead of cloud adoption and implement guardrails as the adoption plan changes.
Business risks
The company is at an early stage of cloud adoption, experimenting and building proofs of concept. Risks are now
relatively low, but future risks are likely to have a significant impact. There is little definition around the final state
of the technical solutions to be deployed to the cloud. In addition, the cloud readiness of IT employees is low. A
foundation for cloud adoption will help the team safely learn and grow.
Future-proofing: There is a risk of not empowering growth, but also a risk of not providing the right protections
against future risks.
An agile yet robust governance approach is needed to support the board's vision for corporate and technical
growth. Failure to implement such a strategy will slow technical growth, potentially risking current and future
market share growth. The impact of such a business risk is unquestionably high. However, the role IT will play in
those potential future states is unknown, making the risk associated with current IT efforts relatively high. That
said, until more concrete plans are aligned, the business has a high tolerance for risk.
This business risk can be broken down tactically into several technical risks:
Well-intended corporate policies could slow transformation efforts or break critical business processes, if not
considered within a structured approval flow.
The application of governance to deployed assets could be difficult and costly.
Governance may not be properly applied across an application or workload, creating gaps in security.
With so many teams working in the cloud, there is a risk of inconsistency.
Costs may not properly align to business units, teams, or other budgetary management units.
The use of multiple identities to manage various deployments could lead to security issues.
Despite current policies, there is a risk that protected data could be mistakenly deployed to the cloud.
Tolerance indicators
The current tolerance for risk is high and the appetite for investing in cloud governance is low. As such, the
tolerance indicators act as an early warning system to trigger more investment of time and energy. If and when the
following indicators are observed, you should iteratively improve the governance strategy.
Cost management: The scale of deployment exceeds predetermined limits on number of resources or
monthly cost.
Security baseline: Inclusion of protected data in defined cloud adoption plans.
Resource consistency: Inclusion of any mission-critical applications in defined cloud adoption plans.
Policy statements
The following policy statements establish the requirements needed to remediate the defined risks. These policies
define the functional requirements for the governance MVP. Each will be represented in the implementation of the
governance MVP.
Cost Management:
For tracking purposes, all assets must be assigned to an application owner within one of the core business
functions.
When cost concerns arise, additional governance requirements will be established with the finance team.
Security Baseline:
Any asset deployed to the cloud must have an approved data classification.
No assets identified with a protected level of data may be deployed to the cloud, until sufficient requirements
for security and governance can be approved and implemented.
Until minimum network security requirements can be validated and governed, cloud environments are seen as
perimeter networks and should meet similar connection requirements to other datacenters or internal
networks.
Resource Consistency:
Because no mission-critical workloads are deployed at this stage, there are no SLA, performance, or BCDR
requirements to be governed.
When mission-critical workloads are deployed, additional governance requirements will be established with IT
operations.
Identity Baseline:
All assets deployed to the cloud should be controlled using identities and roles approved by current governance
policies.
All groups in the on-premises Active Directory infrastructure that have elevated privileges should be mapped to
an approved RBAC role.
Deployment Acceleration:
All assets must be grouped and tagged according to defined grouping and tagging strategies.
All assets must use an approved deployment model.
Once a governance foundation has been established for a cloud provider, any deployment tooling must be
compatible with the tools defined by the governance team.
Processes
No budget has been allocated for ongoing monitoring and enforcement of these governance policies. Because of
that, the cloud governance team has improvised ways to monitor adherence to policy statements.
Education: The cloud governance team is investing time to educate the cloud adoption teams on the
governance guides that support these policies.
Deployment reviews: Before deploying any asset, the cloud governance team will review the governance
guide with the cloud adoption teams.
Next steps
This corporate policy prepares the cloud governance team to implement the governance MVP, which will be the
foundation for adoption. The next step is to implement this MVP.
Best practices explained
Standard enterprise governance guide: Best practices
explained
10/30/2020 • 10 minutes to read • Edit Online
The governance guide starts with a set of initial corporate policies. These policies are used to establish a
governance MVP that reflects best practices.
In this article, we discuss the high-level strategies that are required to create a governance MVP. The core of the
governance MVP is the Deployment Acceleration discipline. The tools and patterns applied at this stage will enable
the incremental improvements needed to expand governance in the future.
Implementation process
The implementation of the governance MVP has dependencies on identity, security, and networking. Once the
dependencies are resolved, the cloud governance team will decide a few aspects of governance. The decisions from
the cloud governance team and from supporting teams will be implemented through a single package of
enforcement assets.
This implementation can also be described using a simple checklist:
1. Solicit decisions regarding core dependencies: identity, networking, monitoring, and encryption.
2. Determine the pattern to be used during corporate policy enforcement.
3. Determine the appropriate governance patterns for the resource consistency, resource tagging, and logging and
reporting disciplines.
4. Implement the governance tools aligned to the chosen policy enforcement pattern to apply the dependent
decisions and governance decisions.
Dependent decisions
The following decisions come from teams outside of the cloud governance team. The implementation of each will
come from those same teams. However, the cloud governance team is responsible for implementing a solution to
validate that those implementations are consistently applied.
Identity Baseline
Identity Baseline is the fundamental starting point for all governance. Before attempting to apply governance,
identity must be established. The established identity strategy will then be enforced by the governance solutions. In
this governance guide, the Identity Management team implements the Directory Synchronization pattern:
RBAC will be provided by Azure Active Directory (Azure AD), using the directory synchronization or "Same
Sign-On" that was implemented during the company's migration to Microsoft 365. For implementation guidance,
see Reference Architecture for Azure AD Integration.
The Azure AD tenant will also govern authentication and access for assets deployed to Azure.
In the governance MVP, the governance team will enforce application of the replicated tenant through subscription
governance tooling, discussed later in this article. In future iterations, the governance team could also enforce rich
tooling in Azure AD to extend this capability.
Security Baseline: Networking
Software Defined Networking is an important initial aspect of the Security Baseline. Establishing the governance MVP
depends on early decisions from the Security Management team to define how networks can be safely configured.
Given the lack of requirements, IT security is playing it safe and requires a Cloud DMZ pattern. That means
governance of the Azure deployments themselves will be very light.
Azure subscriptions may connect to an existing datacenter via VPN, but must follow all existing on-premises IT
governance policies regarding connection of a perimeter network to protected resources. For implementation
guidance regarding VPN connectivity, see On-premises network connected to Azure using a VPN gateway.
Decisions regarding subnet, firewall, and routing are currently being deferred to each application/workload
lead.
Additional analysis is required before releasing any protected data or mission-critical workloads.
In this pattern, cloud networks can only connect to on-premises resources over an existing VPN that is compatible
with Azure. Traffic over that connection will be treated like any traffic coming from a perimeter network. Additional
considerations may be required on the on-premises edge device to securely handle traffic from Azure.
The cloud governance team has proactively invited members of the networking and IT security teams to regular
meetings, in order to stay ahead of networking demands and risks.
Security Baseline: Encryption
Encryption is another fundamental decision within the Security Baseline discipline. Because the company does
does not yet store any protected data in the cloud, the Security Team has decided on a less aggressive pattern for
encryption. At this point, a cloud-native pattern for encryption is suggested but not required of any development
team.
No governance requirements have been set regarding the use of encryption, because the current corporate
policy does not permit mission-critical or protected data in the cloud.
Additional analysis will be required before releasing any protected data or mission-critical workloads.
Policy enforcement
The first decision to make regarding Deployment Acceleration is the pattern for enforcement. In this narrative, the
governance team decided to implement the Automated Enforcement pattern.
Azure Security Center will be made available to the security and identity teams to monitor security risks. Both
teams are also likely to use Security Center to identify new risks and improve corporate policy.
RBAC is required in all subscriptions to govern authentication enforcement.
Azure Policy will be published to each management group and applied to all subscriptions. However, the level of
policies being enforced will be very limited in this initial Governance MVP.
Although Azure management groups are being used, a relatively simple hierarchy is expected.
Azure Blueprints will be used to deploy and update subscriptions by applying RBAC requirements, Resource
Manager Templates, and Azure Policy across management groups.
IMPORTANT
Any time a resource in a resource group no longer shares the same lifecycle, it should be moved to another resource group.
Examples include common databases and networking components. While they may serve the application being developed,
they may also serve other purposes and should therefore exist in other resource groups.
Resource tagging
Resource tagging decisions determine how metadata is applied to Azure resources within a subscription to support
operations, management, and accounting purposes. In this narrative, the classification pattern has been chosen as
the default model for resource tagging.
Deployed assets should be tagged with:
Data classification
Criticality
SLA
Environment
These four values will drive governance, operations, and security decisions.
If this governance guide is being implemented for a business unit or team within a larger corporation, tagging
should also include metadata for the billing unit.
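To illustrate how these tags might be checked before a deployment is approved, the following sketch in Python validates that the four classification-pattern tags are present and applies the current corporate policy that protected data stays out of the cloud. The validate_tags helper and the sample tag values are hypothetical illustrations, not part of the framework.

```python
# Illustrative sketch only: the tag names mirror the classification pattern above;
# the classification values and this helper are hypothetical examples.
REQUIRED_TAGS = ("DataClassification", "Criticality", "SLA", "Environment")

def validate_tags(tags: dict) -> list:
    """Return a list of governance violations for a proposed deployment."""
    violations = [f"Missing required tag: {name}" for name in REQUIRED_TAGS if name not in tags]
    # Current corporate policy does not permit protected data in the cloud.
    if tags.get("DataClassification", "").lower() == "protected":
        violations.append("Protected data may not be deployed to the cloud at this stage.")
    return violations

# Example usage: a compliant nonproduction asset produces no violations.
print(validate_tags({"DataClassification": "General", "Criticality": "Low",
                     "SLA": "99.9%", "Environment": "Dev"}))   # -> []
```

A check like this could run in a deployment pipeline before any resources are created.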
Logging and reporting
Logging and reporting decisions determine how you store log data and how you structure the monitoring and
reporting tools that keep IT staff informed on operational health. In this narrative, a cloud-native pattern for
logging and reporting is suggested.
Alternative patterns
If any of the patterns selected in this governance guide don't align with the reader's requirements, alternatives to
each pattern are available:
Encryption patterns
Identity patterns
Logging and reporting patterns
Policy enforcement patterns
Resource consistency patterns
Resource tagging patterns
Software Defined Networking patterns
Subscription design patterns
Next steps
Once this guide is implemented, each cloud adoption team can go forth with a sound governance foundation. At
the same time, the cloud governance team will work to continuously update the corporate policies and governance
disciplines.
The two teams will use the tolerance indicators to identify the next set of improvements needed to continue
supporting cloud adoption. For the fictional company in this guide, the next step is improving the security baseline
to support moving protected data to the cloud.
Improve the Security Baseline discipline
Standard enterprise governance guide: Improve the
Security Baseline discipline
10/30/2020 • 9 minutes to read • Edit Online
This article advances the governance strategy narrative by adding security controls that support moving protected
data to the cloud.
Conclusion
Adding the above processes and changes to the governance MVP will help to remediate many of the risks
associated with security governance. Together, they add the network, identity, and security monitoring tools
needed to protect data.
Next steps
As cloud adoption continues and delivers additional business value, risks and cloud governance needs also
change. For the fictional company in this guide, the next step is to support mission-critical workloads. At this point,
resource consistency controls are needed.
Improve the Resource Consistency discipline
Standard enterprise governance guide: Improve the
Resource Consistency discipline
10/30/2020 • 6 minutes to read • Edit Online
This article advances the narrative by adding resource consistency controls to support mission-critical
applications.
Conclusion
These additional processes and changes to the governance MVP help remediate many of the risks associated with
resource governance. Together they add recovery, sizing, and monitoring controls that empower cloud-aware
operations.
Next steps
As cloud adoption continues and delivers additional business value, risks and cloud governance needs will also
change. For the fictional company in this guide, the next trigger is when the scale of deployment exceeds 100
assets in the cloud or monthly spending exceeds $1,000 USD. At this point, the cloud governance team adds
cost management controls.
Improve the Cost Management discipline
Standard enterprise governance guide: Improve the
Cost Management discipline
10/30/2020 • 4 minutes to read • Edit Online
This article advances the narrative by adding cost controls to the governance MVP.
Conclusion
Adding these processes and changes to the governance MVP helps remediate many of the risks associated with
cost governance. Together, they create the visibility, accountability, and optimization needed to control costs.
Next steps
As cloud adoption continues and delivers additional business value, risks and cloud governance needs will also
change. For the fictional company in this guide, the next step is using this governance investment to manage
multiple clouds.
Multicloud evolution
Standard enterprise governance guide: Multicloud
improvement
10/30/2020 • 4 minutes to read • Edit Online
This article advances the narrative by adding controls for multicloud adoption.
Conclusion
This series of articles described the incremental development of governance best practices, aligned with the
experiences of this fictional company. By starting small, but with the right foundation, the company could move
quickly and yet still apply the right amount of governance at the right time. The MVP by itself did not protect the
customer. Instead, it created the foundation to manage risks and add protections. From there, layers of governance
were applied to remediate tangible risks. The exact journey presented here won't align 100% with the experiences
of any reader. Rather, it serves as a pattern for incremental governance. You should mold these best practices to fit
your own unique constraints and governance requirements.
Governance guide for complex enterprises
10/30/2020 • 8 minutes to read • Edit Online
WARNING
This MVP is a baseline starting point, based on a set of assumptions. Even this minimal set of best practices is based on
corporate policies that are driven by unique business risks and risk tolerances. To see whether these assumptions apply to
you, read the longer narrative that follows this article.
Every application should be deployed in the proper area of the management group, subscription, and resource
group hierarchy. During deployment planning, the cloud governance team will create the necessary nodes in the
hierarchy to empower the cloud adoption teams.
1. Define a management group for each business unit with a detailed hierarchy that reflects geography first,
then environment type (for example, production or nonproduction environments).
2. Create a production subscription and a nonproduction subscription for each unique combination of
discrete business unit or geography. Creating multiple subscriptions requires careful consideration. For
more information, see the subscription decision guide.
3. Apply consistent nomenclature at each level of this grouping hierarchy.
4. Resource groups should be deployed in a manner that considers the lifecycle of their contents. Resources that
are developed together, managed together, and retired together belong in the same resource group. For more
information about best practices for using resource groups, see here.
5. Region selection is incredibly important and must be considered so that networking, monitoring, and auditing
can be in place for failover and failback, and to confirm that the needed SKUs are available in the
preferred regions.
These patterns provide room for growth without making the hierarchy needlessly complicated.
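As a small illustration of the consistent nomenclature called for in step 3, the sketch below enumerates hypothetical management group and subscription names from business unit, geography, and environment values. The business units, geographies, and the hyphenated naming convention are assumptions for the example only.

```python
# Minimal sketch, assuming a <business-unit>-<geography>-<environment> convention;
# all names below are hypothetical examples, not prescribed values.
BUSINESS_UNITS = ["retail", "wholesale"]
GEOGRAPHIES = ["nam", "emea"]
ENVIRONMENTS = ["prod", "nonprod"]

def hierarchy():
    """Yield (management_group, subscription_name) pairs for each combination."""
    for bu in BUSINESS_UNITS:
        for geo in GEOGRAPHIES:
            mg = f"{bu}-{geo}"                # management group per business unit and geography
            for env in ENVIRONMENTS:
                yield mg, f"{mg}-{env}"       # one production and one nonproduction subscription

for mg, sub in hierarchy():
    print(mg, sub)
```

Generating names from a single function keeps the hierarchy predictable as new business units or geographies are added.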
NOTE
In the event of changes to your business requirements, Azure management groups allow you to easily reorganize your
management hierarchy and subscription group assignments. However, keep in mind that policy and role assignments
applied to a management group are inherited by all subscriptions underneath that group in the hierarchy. If you plan to
reassign subscriptions between management groups, make sure that you are aware of any policy and role assignment
changes that may result. See the Azure management groups documentation for more information.
Governance of resources
A set of global policies and RBAC roles will provide a baseline level of governance enforcement. To meet the
cloud governance team's policy requirements, implementing the governance MVP requires completing the
following tasks:
1. Identify the Azure Policy definitions needed to enforce business requirements. This might include using built-
in definitions and creating new custom definitions. To keep up with the pace of newly released built-in
definitions, there's an atom feed of all the commits for built-in policies, which you can use for an RSS feed.
Alternatively, you can check AzAdvertizer.
2. Create a blueprint definition using these built-in and custom policy definitions and the role assignments
required by the governance MVP.
3. Apply policies and configuration globally by assigning the blueprint definition to all subscriptions.
Identify policy definitions
Azure provides several built-in policies and role definitions that you can assign to any management group,
subscription, or resource group. Many common governance requirements can be handled using built-in
definitions. However, it's likely that you will also need to create custom policy definitions to handle your specific
requirements.
Custom policy definitions are saved to either a management group or a subscription and are inherited through
the management group hierarchy. If a policy definition's save location is a management group, that policy
definition is available to assign to any of that group's child management groups or subscriptions.
Since the policies required to support the governance MVP are meant to apply to all current subscriptions, the
following business requirements will be implemented using a combination of built-in definitions and custom
definitions created in the root management group:
1. Restrict the list of available role assignments to a set of built-in Azure roles authorized by your cloud
governance team. This requires a custom policy definition.
2. Require the following tags on all resources: Department/Billing Unit, Geography, Data Classification,
Criticality, SLA, Environment, Application Archetype, Application, and Application Owner. This can be handled
using the Require specified tag built-in definition.
3. Require that the Application tag for resources matches the name of the relevant resource group. This
can be handled using the "Require tag and its value" built-in definition.
For information on defining custom policies see the Azure Policy documentation. For guidance and examples of
custom policies, consult the Azure Policy samples site and the associated GitHub repository.
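As a rough illustration of what a custom definition for the tagging requirement above might contain, the following Python sketch builds a deny-if-tag-missing rule as a dictionary that mirrors the general Azure Policy JSON structure. Treat it as a sketch of the shape of the rule only; confirm the exact schema, built-in definition names, and creation steps against the Azure Policy documentation.

```python
# Sketch of an Azure Policy rule that denies resources missing a required tag,
# expressed as a Python dictionary mirroring the policy JSON structure.
# Verify the exact schema against the Azure Policy documentation before use.
import json

def require_tag_policy(tag_name: str) -> dict:
    """Build a policy rule that denies any resource missing the given tag."""
    return {
        "mode": "Indexed",
        "policyRule": {
            "if": {"field": f"tags['{tag_name}']", "exists": "false"},
            "then": {"effect": "deny"},
        },
    }

print(json.dumps(require_tag_policy("Environment"), indent=2))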
Assign Azure Policy and RBAC roles using Azure Blueprints
Azure policies can be assigned at the resource group, subscription, and management group level, and can be
included in Azure Blueprints definitions. Although the policy requirements defined in this governance MVP apply
to all current subscriptions, it's very likely that future deployments will require exceptions or alternative policies.
As a result, assigning policy using management groups, with all child subscriptions inheriting these assignments,
may not be flexible enough to support these scenarios.
Azure Blueprints allows consistent assignment of policy and roles, application of Resource Manager templates,
and deployment of resource groups across multiple subscriptions. Like policy definitions, blueprint definitions
are saved to management groups or subscriptions. The policy definitions are available through inheritance to
any children in the management group hierarchy.
The cloud governance team has decided that enforcement of required Azure Policy and RBAC assignments
across subscriptions will be implemented through Azure Blueprints and associated artifacts:
1. In the root management group, create a blueprint definition named governance-baseline .
2. Add the following blueprint artifacts to the blueprint definition:
a. Policy assignments for the custom Azure Policy definitions defined at the management group root.
b. Resource group definitions for any groups required in subscriptions created or governed by the
Governance MVP.
c. Standard role assignments required in subscriptions created or governed by the Governance MVP.
3. Publish the blueprint definition.
4. Assign the governance-baseline blueprint definition to all subscriptions.
See the Azure Blueprints documentation for more information on creating and using blueprint definitions.
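The four steps above can be summarized as a blueprint description plus an assignment loop. The sketch below is purely illustrative: the artifact and subscription names are hypothetical, and real blueprint definitions are authored and assigned through the Azure portal, REST API, or SDKs rather than plain Python.

```python
# Illustrative outline only: models the governance-baseline blueprint as plain data
# and shows the intended assignment flow. All names are hypothetical examples.
governance_baseline = {
    "name": "governance-baseline",
    "artifacts": [
        {"kind": "policyAssignment", "ref": "require-environment-tag"},
        {"kind": "resourceGroup", "ref": "rg-network-hub"},
        {"kind": "roleAssignment", "ref": "governance-team-reader"},
    ],
}

def assign_to_all(blueprint, subscriptions):
    for sub in subscriptions:
        # In practice this step would call the Azure Blueprints assignment API.
        print(f"Assigning blueprint '{blueprint['name']}' to subscription {sub}")

assign_to_all(governance_baseline, ["sub-prod-001", "sub-nonprod-001"])
```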
Secure hybrid VNet
Specific subscriptions often require some level of access to on-premises resources. This is common in migration
scenarios or dev scenarios where dependent resources reside in the on-premises datacenter.
Until trust in the cloud environment is fully established, it's important to tightly control and monitor any allowed
communication between the on-premises environment and cloud workloads, and to ensure that the on-premises
network is secured against potential unauthorized access from cloud-based resources. To support these scenarios, the
governance MVP adds the following best practices:
1. Establish a cloud secure hybrid VNet.
a. The VPN reference architecture establishes a pattern and deployment model for creating a VPN
Gateway in Azure.
b. Validate that on-premises security and traffic management mechanisms treat connected cloud
networks as untrusted. Resources and services hosted in the cloud should only have access to
authorized on-premises services.
c. Validate that the local edge device in the on-premises datacenter is compatible with Azure VPN
Gateway requirements and is configured to access the public internet.
d. Note that VPN tunnels should not be considered production-ready circuits for anything but the most
simple workloads. Anything beyond a few simple workloads requiring on-premises connectivity
should use Azure ExpressRoute.
2. In the root management group, create a second blueprint definition named secure-hybrid-vnet .
a. Add the Resource Manager template for the VPN Gateway as an artifact to the blueprint definition.
b. Add the Resource Manager template for the virtual network as an artifact to the blueprint definition.
c. Publish the blueprint definition.
3. Assign the secure-hybrid-vnet blueprint definition to any subscriptions requiring on-premises connectivity.
This definition should be assigned in addition to the governance-baseline blueprint definition.
One of the biggest concerns raised by IT security and traditional governance teams is the risk that early stage
cloud adoption will compromise existing assets. The above approach allows cloud adoption teams to build and
migrate hybrid solutions, with reduced risk to on-premises assets. As trust in the cloud environment increases,
later evolutions may remove this temporary solution.
NOTE
The above is a starting point to quickly create a baseline governance MVP. This is only the beginning of the governance
journey. Further evolution will be needed as the company continues to adopt the cloud and takes on more risk in the
following areas:
Mission-critical workloads
Protected data
Cost management
Multicloud scenarios
Moreover, the specific details of this MVP are based on the example journey of a fictional company, described in the
articles that follow. We highly recommend becoming familiar with the other articles in this series before implementing this
best practice.
Next steps
Now that you're familiar with the governance MVP and the forthcoming governance changes, read the
supporting narrative for additional context.
Read the supporting narrative
Governance guide for complex enterprises: The
supporting narrative
10/30/2020 • 4 minutes to read • Edit Online
The following narrative establishes a use case for governance during a complex enterprise's cloud adoption journey.
Before acting on the recommendations in the guide, it's important to understand the assumptions and reasoning
that are reflected in this narrative. Then you can better align the governance strategy to your organization's cloud
adoption journey.
Back story
Customers are demanding a better experience when interacting with this company. The current experience caused
market erosion and led the board to hire a chief digital officer (CDO). The CDO is working with marketing and
sales to drive a digital transformation that will power improved experiences. Additionally, several business units
recently hired data scientists to farm data and improve many of the manual experiences through learning and
prediction. IT is supporting these efforts where it can. There are "shadow IT" activities occurring that fall outside of
needed governance and security controls.
The IT organization is also facing its own challenges. Finance is planning continued reductions in the IT budget over
the next five years, leading to some necessary spending cuts starting this year. Conversely, GDPR and other data
sovereignty requirements are forcing IT to invest in assets in additional countries to localize data. Two of the
existing datacenters are overdue for hardware refreshes, causing further problems with employee and customer
satisfaction. Three more datacenters require hardware refreshes during the execution of the five-year plan. The
CFO is pushing the CIO to consider the cloud as an alternative for those datacenters, to free up capital expenses.
The CIO has innovative ideas that could help the company, but she and her teams are limited to fighting fires and
controlling costs. At a luncheon with the CDO and one of the business unit leaders, the cloud migration
conversation generated interest from the CIO's peers. The three leaders aim to support each other using the cloud
to achieve their business objectives, and they have begun the exploration and planning stages of cloud adoption.
Business characteristics
The company has the following business profile:
Sales and operations span multiple geographic areas with global customers in multiple markets.
The business grew through acquisition and operates across three business units based on the target
customer base. Budgeting is a complex matrix across business units and functions.
The business views most of IT as a capital drain or a cost center.
Current state
Here is the current state of the company's IT and cloud operations:
IT operates more than 20 privately owned datacenters around the globe.
Due to organic growth and multiple geographies, there are a few IT teams that have unique data
sovereignty and compliance requirements that impact a single business unit operating within a specific
geography.
Each datacenter is connected by a series of regional leased lines, creating a loosely coupled global WAN.
IT entered the cloud by migrating all end-user email accounts to Microsoft 365. This migration was
completed more than six months ago. Since then, only a few IT assets have been deployed to the cloud.
The CDO's primary development team is working in a dev/test capacity to learn about cloud-native
capabilities.
One business unit is experimenting with big data in the cloud. The BI team inside of IT is participating in that
effort.
The existing IT governance policy states that personal customer data and financial data must be hosted on
assets owned directly by the company. This policy blocks cloud adoption for any mission-critical
applications or protected data.
IT investments are controlled largely by capital expense. Those investments are planned yearly and often
include plans for ongoing maintenance, as well as established refresh cycles of three to five years
depending on the datacenter.
Most investments in technology that don't align to the annual plan are addressed by shadow IT efforts.
Those efforts are usually managed by business units and funded through the business unit's operating
expenses.
Future state
The following changes are anticipated over the next several years:
The CIO is leading an effort to modernize the policy on personal and financial data to support future goals.
Two members of the IT governance team have visibility into this effort.
The CIO wants to use the cloud migration as a forcing function to improve consistency and stability across
business units and geographies. The future state must respect any external compliance requirements that
would require deviation from standard approaches by specific IT teams.
If the early experiments in application development and BI show leading indicators of success, they would
each like to release small-scale production solutions to the cloud in the next 24 months.
The CIO and CFO have assigned an architect and the vice president of infrastructure to create a cost analysis
and feasibility study. These efforts will determine whether the company can and should move 5,000 assets
to the cloud over the next 36 months. A successful migration would allow the CIO to eliminate two
datacenters, reducing costs by over $100m USD during the five-year plan. If three to four datacenters can
experience similar results, the budget will be back in the black, giving the CIO budget to support more
innovative initiatives.
Along with this cost savings, the company plans to change the management of some IT investments by
repositioning the committed capital expense as an operating expense within IT. This change will provide
greater cost control that IT can use to accelerate other planned efforts.
Next steps
The company has developed a corporate policy to shape the governance implementation. The corporate policy
drives many of the technical decisions.
Review the initial corporate policy
Governance guide for complex enterprises: Initial
corporate policy behind the governance strategy
10/30/2020 • 5 minutes to read • Edit Online
The following corporate policy defines the initial governance position that's the starting point for this guide. This
article defines early-stage risks, initial policy statements, and early processes to enforce policy statements.
NOTE
The corporate policy is not a technical document, but it drives many technical decisions. The governance MVP described in
the overview ultimately derives from this policy. Before implementing a governance MVP, your organization should develop a
corporate policy based on your own objectives and business risks.
Objective
The initial objective is to establish a foundation for governance agility. An effective Governance MVP allows the
governance team to stay ahead of cloud adoption and implement guardrails as the adoption plan changes.
Business risks
The company is at an early stage of cloud adoption, experimenting and building proofs of concept. Risks are now
relatively low, but future risks are likely to have a significant impact. There is little definition around the final state
of the technical solutions to be deployed to the cloud. In addition, the cloud readiness of IT employees is low. A
foundation for cloud adoption will help the team safely learn and grow.
Future-proofing: There is a risk of not empowering growth, but also a risk of not providing the right protections
against future risks.
An agile yet robust governance approach is needed to support the board's vision for corporate and technical
growth. Failure to implement such a strategy will slow technical growth, potentially risking current and future
market share growth. The impact of such a business risk is unquestionably high. However, the role IT will play in
those potential future states is unknown, making the risk associated with current IT efforts relatively high. That
said, until more concrete plans are aligned, the business has a high tolerance for risk.
This business risk can be broken down tactically into several technical risks:
Well-intended corporate policies could slow transformation efforts or break critical business processes, if not
considered within a structured approval flow.
The application of governance to deployed assets could be difficult and costly.
Governance may not be properly applied across an application or workload, creating gaps in security.
With so many teams working in the cloud, there is a risk of inconsistency.
Costs may not properly align to business units, teams, or other budgetary management units.
The use of multiple identities to manage various deployments could lead to security issues.
Despite current policies, there is a risk that protected data could be mistakenly deployed to the cloud.
Tolerance indicators
The current risk tolerance is high and the appetite for investing in cloud governance is low. As such, the tolerance
indicators act as an early warning system to trigger the investment of time and energy. If the following indicators
are observed, it would be wise to advance the governance strategy.
Cost Management discipline: Scale of deployment exceeds 1,000 assets in the cloud, or monthly spending
exceeds $10,000 USD.
Identity Baseline discipline: Inclusion of applications with legacy or third-party multi-factor authentication
requirements.
Security Baseline discipline: Inclusion of protected data in defined cloud adoption plans.
Resource Consistency discipline: Inclusion of any mission-critical applications in defined cloud adoption
plans.
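Because these indicators act as an early warning system, they lend themselves to a simple periodic check. The sketch below assumes the asset count, monthly spend, and adoption-plan flags are already collected from inventory, billing, and planning data; the thresholds match the indicators above, and the helper itself is a hypothetical illustration.

```python
# Minimal sketch of evaluating the tolerance indicators listed above. Inputs are
# assumed to come from existing inventory, billing, and planning reports.
def governance_triggers(asset_count, monthly_spend_usd, legacy_mfa_required,
                        protected_data_planned, mission_critical_planned):
    """Return the disciplines whose tolerance indicators have been crossed."""
    triggers = []
    if asset_count > 1000 or monthly_spend_usd > 10_000:
        triggers.append("Cost Management: add cost management controls.")
    if legacy_mfa_required:
        triggers.append("Identity Baseline: improve the Identity Baseline discipline.")
    if protected_data_planned:
        triggers.append("Security Baseline: improve the Security Baseline discipline.")
    if mission_critical_planned:
        triggers.append("Resource Consistency: improve the Resource Consistency discipline.")
    return triggers

print(governance_triggers(1200, 8_500, False, False, True))
```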
Policy statements
The following policy statements establish the requirements needed to remediate the defined risks. These policies
define the functional requirements for the governance MVP. Each will be represented in the implementation of the
governance MVP.
Cost Management:
For tracking purposes, all assets must be assigned to an application owner within one of the core business
functions.
When cost concerns arise, additional governance requirements will be established with the finance team.
Security Baseline:
Any asset deployed to the cloud must have an approved data classification.
No assets identified with a protected level of data may be deployed to the cloud, until sufficient requirements
for security and governance can be approved and implemented.
Until minimum network security requirements can be validated and governed, cloud environments are seen as
perimeter networks and should meet similar connection requirements to other datacenters or internal
networks.
Resource Consistency:
Because no mission-critical workloads are deployed at this stage, there are no SLA, performance, or BCDR
requirements to be governed.
When mission-critical workloads are deployed, additional governance requirements will be established with IT
operations.
Identity Baseline:
All assets deployed to the cloud should be controlled using identities and roles approved by current governance
policies.
All groups in the on-premises Active Directory infrastructure that have elevated privileges should be mapped to
an approved RBAC role.
Deployment Acceleration:
All assets must be grouped and tagged according to defined grouping and tagging strategies.
All assets must use an approved deployment model.
Once a governance foundation has been established for a cloud provider, any deployment tooling must be
compatible with the tools defined by the governance team.
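To illustrate the Identity Baseline policy statement above, the following sketch maps elevated on-premises Active Directory groups to approved built-in Azure roles and fails closed when no mapping exists. The group names are hypothetical examples; Contributor and Reader are built-in Azure roles.

```python
# Illustrative sketch: elevated AD groups mapped to approved built-in Azure RBAC roles.
# The group names are hypothetical; unmapped groups are rejected rather than guessed.
AD_GROUP_TO_RBAC_ROLE = {
    "CORP\\CloudOps-Admins": "Contributor",
    "CORP\\Security-Auditors": "Reader",
}

def approved_role_for(ad_group: str) -> str:
    """Look up the approved RBAC role for an elevated AD group; fail closed if unmapped."""
    try:
        return AD_GROUP_TO_RBAC_ROLE[ad_group]
    except KeyError:
        raise ValueError(f"{ad_group} has no approved RBAC role mapping; deployment blocked.")

print(approved_role_for("CORP\\CloudOps-Admins"))
```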
Processes
No budget has been allocated for ongoing monitoring and enforcement of these governance policies. Because of
that, the cloud governance team has improvised ways to monitor adherence to policy statements.
Education: The cloud governance team is investing time to educate the cloud adoption teams on the
governance guides that support these policies.
Deployment reviews: Before deploying any asset, the cloud governance team will review the governance
guide with the cloud adoption teams.
Next steps
This corporate policy prepares the cloud governance team to implement the governance MVP as the foundation
for adoption. The next step is to implement this MVP.
Best practices explained
Governance guide for complex enterprises: Best
practices explained
10/30/2020 • 11 minutes to read • Edit Online
The governance guide begins with a set of initial corporate policies. These policies are used to establish a
minimum viable product (MVP) for governance that reflects best practices.
In this article, we discuss the high-level strategies that are required to create a governance MVP. The core of the
governance MVP is the Deployment Acceleration discipline. The tools and patterns applied at this stage will enable
the incremental improvements needed to expand governance in the future.
Implementation process
The implementation of the governance MVP has dependencies on identity, security, and networking. Once the
dependencies are resolved, the cloud governance team will decide a few aspects of governance. The decisions
from the cloud governance team and from supporting teams will be implemented through a single package of
enforcement assets.
This implementation can also be described using a simple checklist:
1. Solicit decisions regarding core dependencies: identity, network, and encryption.
2. Determine the pattern to be used during corporate policy enforcement.
3. Determine the appropriate governance patterns for resource consistency, resource tagging, and logging and
reporting.
4. Implement the governance tools aligned to the chosen policy enforcement pattern to apply the dependent
decisions and governance decisions.
Dependent decisions
The following decisions come from teams outside of the cloud governance team. The implementation of each will
come from those same teams. However, the cloud governance team is responsible for implementing a solution to
validate that those implementations are consistently applied.
Identity Baseline
Identity Baseline is the fundamental starting point for all governance. Before attempting to apply governance,
identity must be established. The established identity strategy will then be enforced by the governance solutions.
In this governance guide, the Identity Management team implements the Directory Synchronization pattern:
RBAC will be provided by Azure Active Directory (Azure AD), using the directory synchronization or "Same
Sign-On" that was implemented during the company's migration to Microsoft 365. For implementation guidance,
see Reference Architecture for Azure AD Integration.
The Azure AD tenant will also govern authentication and access for assets deployed to Azure.
In the governance MVP, the governance team will enforce application of the replicated tenant through subscription
governance tooling, discussed later in this article. In future iterations, the governance team could also enforce rich
tooling in Azure AD to extend this capability.
Security Baseline: Networking
Software Defined Network is an important initial aspect of the Security Baseline. Establishing the governance MVP
depends on early decisions from the Security Management team to define how networks can be safely configured.
Given the lack of requirements, IT security is playing it safe and requires a Cloud DMZ pattern. That means
governance of the Azure deployments themselves will be very light.
Azure subscriptions may connect to an existing datacenter via VPN, but must follow all existing on-premises IT
governance policies regarding connection of a perimeter network to protected resources. For implementation
guidance regarding VPN connectivity, see On-premises network connected to Azure using a VPN gateway.
Decisions regarding subnet, firewall, and routing are currently being deferred to each application/workload
lead.
Additional analysis is required before the release of any protected data or mission-critical workloads.
In this pattern, cloud networks can only connect to on-premises resources over an existing VPN that is compatible
with Azure. Traffic over that connection will be treated like any traffic coming from a perimeter network. Additional
considerations may be required on the on-premises edge device to securely handle traffic from Azure.
The cloud governance team has proactively invited members of the networking and IT security teams to regular
meetings, in order to stay ahead of networking demands and risks.
Security Baseline: Encryption
Encryption is another fundamental decision within the Security Baseline discipline. Because the company does
not yet store any protected data in the cloud, the Security Team has decided on a less aggressive pattern for
encryption. At this point, a cloud-native pattern for encryption is suggested but not required of any development
team.
No governance requirements have been set regarding the use of encryption, because the current corporate
policy does not permit mission-critical or protected data in the cloud.
Additional analysis will be required before releasing any protected data or mission-critical workloads.
Policy enforcement
The first decision to make regarding Deployment Acceleration is the pattern for enforcement. In this narrative, the
governance team decided to implement the Automated Enforcement pattern.
Azure Security Center will be made available to the security and identity teams to monitor security risks. Both
teams are also likely to use Security Center to identify new risks and improve corporate policy.
RBAC is required in all subscriptions to govern authentication enforcement.
Azure Policy will be published to each management group and applied to all subscriptions. However, the level
of policies being enforced will be very limited in this initial Governance MVP.
Although Azure management groups are being used, a relatively simple hierarchy is expected.
Azure Blueprints will be used to deploy and update subscriptions by applying RBAC requirements, Resource
Manager Templates, and Azure Policy across management groups.
IMPORTANT
Any time a resource in a resource group no longer shares the same lifecycle, it should be moved to another resource group.
Examples include common databases and networking components. While they may serve the application being developed,
they may also serve other purposes and should therefore exist in other resource groups.
Resource tagging
Resource tagging decisions determine how metadata is applied to Azure resources within a subscription to
support operations, management, and accounting purposes. In this narrative, the accounting pattern has been
chosen as the default model for resource tagging.
Deployed assets should be tagged with values for:
Department or billing unit
Geography
Data classification
Criticality
SLA
Environment
Application archetype
Application
Application owner
These values, along with the Azure management group and subscription associated with a deployed asset, will
drive governance, operations, and security decisions.
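Because the accounting pattern exists largely to support chargeback, a simple roll-up of cost by billing unit shows how these tags would be used. In this sketch the tag keys mirror the list above, while the sample resource, cost figures, and the spend_by_billing_unit helper are hypothetical; a real implementation would read tags and costs from Azure Resource Graph or cost reports.

```python
# Sketch only: rolls up monthly cost per billing unit and requires the full
# accounting-pattern tag set. Tag keys mirror the list above; values are examples.
from collections import defaultdict

ACCOUNTING_TAGS = ("BillingUnit", "Geography", "DataClassification", "Criticality", "SLA",
                   "Environment", "ApplicationArchetype", "Application", "ApplicationOwner")

def spend_by_billing_unit(resources):
    """Roll up monthly cost per billing unit, rejecting assets with incomplete tags."""
    totals = defaultdict(float)
    for res in resources:
        missing = [t for t in ACCOUNTING_TAGS if t not in res["tags"]]
        if missing:
            raise ValueError(f"{res['name']} is missing required tags: {missing}")
        totals[res["tags"]["BillingUnit"]] += res["monthly_cost_usd"]
    return dict(totals)

example = {"name": "vm-app-01", "monthly_cost_usd": 320.0,
           "tags": {t: "example" for t in ACCOUNTING_TAGS}}
example["tags"]["BillingUnit"] = "retail"
print(spend_by_billing_unit([example]))   # {'retail': 320.0}
```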
Logging and reporting
Logging and reporting decisions determine how you store log data and how you structure the monitoring and
reporting tools that keep IT staff informed on operational health. In this narrative, a hybrid monitoring pattern for
logging and reporting is suggested, but not required of any development team at this point.
No governance requirements are currently set regarding the specific data points to be collected for logging or
reporting purposes. This is specific to this fictional narrative and should be considered an antipattern. Logging
standards should be determined and enforced as soon as possible.
Additional analysis is required before the release of any protected data or mission-critical workloads.
Before supporting protected data or mission-critical workloads, the existing on-premises operational
monitoring solution must be granted access to the workspace used for logging. Applications are required to
meet security and logging requirements associated with the use of that tenant, if the application is to be
supported with a defined SLA.
Alternative patterns
If any of the patterns chosen in this governance guide don't align with the reader's requirements, alternatives to
each pattern are available:
Encryption patterns
Identity patterns
Logging and reporting patterns
Policy enforcement patterns
Resource consistency patterns
Resource tagging patterns
Software Defined Networking patterns
Subscription design patterns
Next steps
Once this guidance is implemented, each cloud adoption team can proceed with a solid governance foundation. At
the same time, the cloud governance team will work to continually update the corporate policies and governance
disciplines.
Both teams will use the tolerance indicators to identify the next set of improvements needed to continue
supporting cloud adoption. The next step for this company is incremental improvement of their governance
baseline to support applications with legacy or third-party multi-factor authentication requirements.
Improve the Identity Baseline discipline
Governance guide for complex enterprises: Improve
the Identity Baseline discipline
10/30/2020 • 4 minutes to read • Edit Online
This article advances the narrative by adding identity baseline controls to the governance MVP.
Conclusion
Adding these changes to the governance MVP helps remediate many of the risks in this article, allowing each
cloud adoption team to quickly move past this roadblock.
Next steps
As cloud adoption continues and delivers additional business value, risks and cloud governance needs will also
change. The following are a few changes that may occur. For this fictional company, the next trigger is the inclusion
of protected data in the cloud adoption plan. This change requires additional security controls.
Improve the Security Baseline discipline
Governance guide for complex enterprises: Improve
the Security Baseline discipline
10/30/2020 • 13 minutes to read • Edit Online
This article advances the narrative by adding security controls that support moving protected data to the cloud.
Conclusion
Adding these processes and changes to the governance MVP helps remediate many of the risks associated with
security governance. Together, they add the network, identity, and security monitoring tools needed to protect data.
Next steps
As cloud adoption continues and delivers additional business value, risks and cloud governance needs also
change. For the fictional company in this guide, the next step is to support mission-critical workloads. This is the
point when resource consistency controls are needed.
Improve the Resource Consistency discipline
Governance guide for complex enterprises: Improve
the Resource Consistency discipline
10/30/2020 • 6 minutes to read • Edit Online
This article advances the narrative by adding resource consistency controls to the governance MVP to support
mission-critical applications.
Conclusion
Adding these processes and changes to the governance MVP helps remediate many of the risks associated with
resource governance. Together, they add the recovery, sizing, and monitoring controls necessary to empower
cloud-aware operations.
Next steps
As cloud adoption grows and delivers additional business value, the risks and cloud governance needs will also
change. For the fictional company in this guide, the next trigger is when the scale of deployment exceeds 1,000
assets in the cloud or monthly spending exceeds $10,000 USD. At this point, the cloud governance
team adds cost management controls.
Improve the Cost Management discipline
Governance guide for complex enterprises: Improve
the Cost Management discipline
10/30/2020 • 4 minutes to read • Edit Online
This article advances the narrative by adding cost controls to the minimum viable product (MVP) governance.
Changes in risk
Budget control: There is an inherent risk that self-service capabilities will result in excessive and unexpected
costs on the new platform. Governance processes for monitoring costs and mitigating ongoing cost risks must
be in place to ensure continued alignment with the planned budget.
This business risk can be expanded into a few technical risks:
There is a risk of actual costs exceeding the plan.
Business conditions change. When they do, there will be cases when a business function needs to consume
more cloud services than expected, leading to spending anomalies. There is a risk that these additional costs
will be considered overages as opposed to a required adjustment to the plan. If successful, the Canadian
experiment should help remediate this risk.
There is a risk of systems being overprovisioned, resulting in excess spending.
Conclusion
Adding the above processes and changes to the governance MVP helps remediate many of the risks associated
with cost governance. Together, they create the visibility, accountability, and optimization needed to control costs.
Next steps
As cloud adoption grows and delivers additional business value, risks and cloud governance needs will also
change. For this fictional company, the next step is using this governance investment to manage multiple clouds.
Multicloud improvement
Governance guide for complex enterprises:
Multicloud improvement
10/30/2020 • 3 minutes to read • Edit Online
Next steps
In many large enterprises, the Five Disciplines of Cloud Governance can be blockers to adoption. The next article
has some additional thoughts on making governance a team sport to help ensure long-term success in the cloud.
Multiple layers of governance
Governance guide for complex enterprises: Multiple
layers of governance
10/30/2020 • 3 minutes to read • Edit Online
When large enterprises require multiple layers of governance, there are greater levels of complexity that must be
factored into the governance MVP and later governance improvements.
A few common examples of such complexities include:
Distributed governance functions.
Corporate IT supporting business unit IT organizations.
Corporate IT supporting geographically distributed IT organizations.
This article explores some ways to navigate this type of complexity.
Any change to business processes or technology platforms introduces risk to the business. Cloud governance
teams, whose members are sometimes known as cloud custodians, are tasked with mitigating these risks with
minimal interruption to adoption or innovation efforts.
But cloud governance requires more than technical implementation. Subtle changes in the corporate narrative or
corporate policies can affect adoption efforts significantly. Before implementation, it's important to look beyond
IT while defining corporate policy.
Figure 1: Visual of corporate policy and the Five Disciplines of Cloud Governance.
Next steps
Learn how to prepare your corporate policy for the cloud.
Prepare your corporate policy for the cloud
Prepare corporate IT policy for the cloud
10/30/2020 • 4 minutes to read • Edit Online
Cloud governance is the product of an ongoing adoption effort over time, as a true lasting transformation doesn't
happen overnight. Attempting to deliver complete cloud governance before addressing key corporate policy
changes using a fast aggressive method seldom produces the desired results. Instead we recommend an
incremental approach.
What is different about our Cloud Adoption Framework is the purchasing cycle and how it can enable authentic
transformation. Since there is not a big capital expenditure acquisition requirement, engineers can begin
experimentation and adoption sooner. In most corporate cultures, elimination of the capital expense barrier to
adoption can lead to tighter feedback loops, organic growth, and incremental execution.
The shift to cloud adoption requires a shift in governance. In many organizations, corporate policy transformation
allows for improved governance and higher rates of adherence through incremental policy changes and
automated enforcement of those changes, powered by newly defined capabilities that you configure with your
cloud service provider.
This article outlines key activities that can help you shape your corporate policies to enable an expanded
governance model.
TIP
If your organization is governed by third-party compliance, one of the biggest business risks to consider may be a risk of
adherence to regulatory compliance. This risk often cannot be remediated, and instead may require a strict adherence. Be
sure to understand your third-party compliance requirements before beginning a policy review.
Next steps
Effective cloud governance strategy begins with understanding business risk.
Understand business risk
Understand business risk during cloud migration
10/30/2020 • 4 minutes to read • Edit Online
An understanding of business risk is one of the most important elements of any cloud transformation. Risk drives
policy, and it influences monitoring and enforcement requirements. Risk heavily influences how we manage the
digital estate, on-premises or in the cloud.
Relativity of risk
Risk is relative. A small company with a few IT assets in a closed building has little risk. Add users and an internet
connection with access to those assets, and the risk intensifies. When that small company grows to Fortune 500
status, the risks are exponentially greater. As revenue, business process, employee counts, and IT assets
accumulate, risks increase and coalesce. IT assets that aid in generating revenue are at tangible risk of stopping
that revenue stream in the event of an outage. Every moment of downtime equates to losses. Likewise, as data
accumulates, the risk of harming customers grows.
In the traditional on-premises world, IT governance teams focus on assessing risks, creating processes to manage
those risks, and deploying systems to ensure remediation measures are successfully implemented. These efforts
work to balance risks required to operate in a connected, modern business environment.
Next steps
Learn to evaluate risk tolerance during cloud adoption.
Evaluate risk tolerance
Evaluate risk tolerance
10/30/2020 • 8 minutes to read • Edit Online
Every business decision creates new risks. Making an investment in anything creates risk of losses. New products
or services create risks of market failure. Changes to current products or services could reduce market share.
Cloud transformation does not provide a magical solution to everyday business risk. To the contrary, connected
solutions (cloud or on-premises) introduce new risks. Deploying assets to any network connected facility also
expands the potential threat profile by exposing security weaknesses to a much broader, global community.
Fortunately, cloud providers are aware of the changes, increases, and addition of risks. They invest heavily to
reduce and manage those risks on the behalf of their customers.
This article is not focused on cloud risks. Instead it discusses the business risks associated with various forms of
cloud transformation. Later in the article, the discussion shifts focus to discuss ways of understanding the
business's tolerance for risk.
IMPORTANT
Before reading the following, be aware that each of these risks can be managed. The goal of this article is to inform and
prepare readers for more productive risk management discussions.
Data breach: The top risk associated with any transformation is a data breach. Data leaks can cause
significant damage to your company, leading to loss of customers, decrease in business, or even legal
liability. Any changes to the way data is stored, processed, or used creates risk. Cloud transformations
create a high degree of change regarding data management, so the risk should not be taken lightly. The
Security Baseline discipline, data classification, and incremental rationalization can each help manage this
risk.
Service disruption: Business operations and customer experiences rely heavily on technical operations.
Cloud transformations will create change in IT operations. In some organizations, that change is small and
easily adjusted. In other organizations, these changes could require retooling, retraining, or new
approaches to support cloud operations. The bigger the change, the bigger the potential impact on
business operations and customer experience. Managing this risk will require the involvement of the
business in transformation planning. Release planning and first workload selection in the incremental
rationalization article discuss ways to choose workloads for transformation projects. The business's role in
that activity is to communicate the business operations risk of changing prioritized workloads. Helping IT
choose workloads that have a lower impact on operations will reduce the overall risk.
Budget control: Cost models change in the cloud. This change can create risks associated with cost
overruns or increases in the cost of goods sold (COGS), especially directly attributed operating expenses.
When business works closely with IT, it is feasible to create transparency regarding costs and services
consumed by various business units, programs, or projects. The Cost Management discipline provides
examples of ways business and IT can partner on this topic.
The above are a few of the most common risks mentioned by customers. The cloud governance team and the
cloud adoption teams can begin to develop a risk profile, as workloads are migrated and readied for production
release. Be prepared for conversations to define, refine, and manage risks based on the desired business
outcomes and transformation effort.
Next steps
This type of conversation can help the business and IT evaluate tolerance more effectively. These conversations
can be used during the creation of MVP policies and during incremental policy reviews.
Define corporate policy
Define corporate policy for cloud governance
10/30/2020 • 3 minutes to read • Edit Online
Once you've analyzed the known risks and related risk tolerances for your organization's cloud transformation
journey, your next step is to establish policy that will explicitly address those risks and define the steps needed to
remediate them where possible.
TIP
If your organization uses vendors or other trusted business partners, one of the biggest business risks to consider may be a
lack of adherence to regulatory compliance by these external organizations. This risk often cannot be remediated, and
instead may require a strict adherence to requirements by all parties. Make sure you've identified and understand any third-
party compliance requirements before beginning a policy review.
Next steps
After defining your policies, draft an architecture design guide to provide IT staff and developers with actionable
guidance.
Align your governance design guide with corporate policy
Align your cloud governance design guide with
corporate policy
10/30/2020 • 2 minutes to read • Edit Online
After you've defined cloud policies based on your identified risks, you'll need to generate actionable guidance that
aligns with these policies for your IT staff and developers to refer to. Drafting a cloud governance design guide
allows you to define specific structural, technological, and process choices based on the policy statements you
generated for each of the five governance disciplines.
A cloud governance design guide should establish the architecture choices and design patterns for each of the core
infrastructure components of cloud deployments that best meet your policy requirements. Alongside these you
should provide a high-level explanation of the technology, tools, and processes that will support each of these
design decisions.
Although your risk analysis and policy statements may, to some degree, be cloud platform agnostic, your design
guide should provide platform-specific implementation details that your IT and dev teams can use when creating
and deploying cloud-based workloads. Focus on the architecture, tools, and features of your chosen platform when
making design decisions and providing guidance.
While cloud design guides should take into account some of the technical details associated with each
infrastructure component, they're not meant to be extensive technical documents or specifications. Make sure your
guides address your policy statements and clearly state design decisions in a format easy for staff to understand
and reference.
Next steps
With design guidance in place, establish policy adherence processes to ensure policy compliance.
Establish policy adherence processes
Establish policy adherence processes
10/30/2020 • 5 minutes to read • Edit Online
After establishing your cloud policy statements and drafting a design guide, you'll need to create a strategy for
ensuring your cloud deployment stays in compliance with your policy requirements. This strategy needs to
encompass your cloud governance team's ongoing review and communication processes, establish criteria for
when policy violations require action, and define the requirements for automated monitoring and compliance
systems that will detect violations and trigger remediation actions.
See the corporate policy sections of the actionable governance guides for examples of how policy adherence
processes fit into a cloud governance plan.
The following examples pair a sample violation trigger with a sample action for several disciplines:
Cost Management. Trigger: monthly cloud spending is more than 20% higher than expected. Action: notify the
billing unit leader, who will begin a review of resource usage.
Security Baseline. Trigger: suspicious user activity is detected. Action: notify the IT security team and disable the
suspect user account.
Resource Consistency. Trigger: CPU utilization for a workload is greater than 90%. Action: notify the IT operations
team and scale out additional resources to handle the load.
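These trigger-and-action pairs map naturally onto automated checks. The sketch below encodes the three samples as plain conditions; the metric inputs and the notify placeholder stand in for real monitoring and alerting integrations such as Azure Monitor alerts, and are hypothetical.

```python
# Sketch of encoding the sample triggers above as automated checks. The metrics
# dictionary and notify() are placeholders for real monitoring and alerting tooling.
def notify(team, message):
    print(f"[{team}] {message}")

def evaluate(metrics):
    if metrics["monthly_spend"] > 1.20 * metrics["expected_spend"]:
        notify("billing unit leader", "Cloud spending is more than 20% over plan; review resource usage.")
    if metrics["suspicious_user_activity"]:
        notify("IT security", "Suspicious user activity detected; disable the suspect account.")
    if metrics["cpu_utilization"] > 0.90:
        notify("IT operations", "Workload CPU above 90%; scale out additional resources.")

evaluate({"monthly_spend": 13_000, "expected_spend": 10_000,
          "suspicious_user_activity": False, "cpu_utilization": 0.95})
```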
Next steps
Learn more about regulatory compliance in the cloud.
Regulatory compliance
Introduction to regulatory compliance
10/30/2020 • 3 minutes to read • Edit Online
This is an introductory article about regulatory compliance; it's not intended to guide the implementation of a
compliance strategy. More detailed information about Azure compliance offerings is available at the Microsoft
Trust Center. Moreover, all downloadable documentation is available to certain Azure customers from the
Microsoft Service Trust Portal.
Regulatory compliance refers to the discipline and process of ensuring that a company follows the laws enforced
by governing bodies in their geography or rules required by voluntarily adopted industry standards. For IT
regulatory compliance, people and processes monitor corporate systems in an effort to detect and prevent
violations of policies and procedures established by these governing laws, regulations, and standards. This in turn
applies to a wide array of monitoring and enforcement processes. Depending on the industry and geography,
these processes can become lengthy and complex.
Compliance is challenging for multinational organizations, especially in heavily regulated industries like healthcare
and financial services. Standards and regulations abound, and in certain cases may change frequently, making it
difficult for businesses to keep up with changing international electronic data handling laws.
As with security controls, organizations should understand the division of responsibilities regarding regulatory
compliance in the cloud. Cloud providers strive to ensure that their platforms and services are compliant.
Organizations also need to confirm that their applications, the infrastructure those applications depend on, and
services supplied by third parties are also certified as compliant.
The following are descriptions of compliance regulations in various industries and geographies:
HIPAA
A healthcare application that processes protected health information (PHI) is subject to both the privacy rule and
the security rule encompassed within the Health Insurance Portability and Accountability Act (HIPAA). At a
minimum, HIPAA could likely require that a healthcare business must receive written assurances from the cloud
provider that it will safeguard any PHI received or created.
PCI
The Payment Card Industry Data Security Standard (PCI DSS) is a proprietary information security standard for
organizations that handle branded credit cards from the major card payment systems, including Visa, Mastercard,
American Express, Discover, and JCB. The PCI standard is mandated by the card brands and administered by the
Payment Card Industry Security Standards Council. The standard was created to increase controls around
cardholder data to reduce credit-card fraud. Validation of compliance is performed annually, either by an external
qualified security assessor (QSA) or by a firm-specific internal security assessor (ISA) who creates a report on
compliance (ROC) for organizations handling large volumes of transactions, or by a self-assessment questionnaire
(SAQ) for companies.
Personal data
Personal data is information that could be used to identify a consumer, employee, partner, or any other living or
legal entity. Many emerging laws, particularly those dealing with privacy and personal data, require that
businesses themselves comply and report on compliance and any breaches that might occur.
GDPR
One of the most important developments in this area is the General Data Protection Regulation (GDPR), designed
to strengthen data protection for individuals within the European Union. GDPR requires that data about individuals
(such as "a name, a home address, a photo, an email address, bank details, posts on social networking websites,
medical information, or a computer's IP address") be maintained on servers within the EU and not transferred out
of it. It also requires that companies notify individuals of any data breaches, and mandates that companies have a
data protection officer (DPO). Other countries have, or are developing, similar types of regulations.
Next steps
Learn more about cloud security readiness.
Cloud security readiness
CISO cloud readiness guide
10/30/2020 • 3 minutes to read • Edit Online
Microsoft guidance like the Cloud Adoption Framework is not positioned to determine or guide the unique security
constraints of the thousands of enterprises supported by this documentation. When moving to the cloud, the role
of the chief information security officer or chief information security office (CISO) isn't supplanted by cloud
technologies. Quite the contrary: the CISO and the office of the CISO become more ingrained and integrated. This
guide assumes the reader is familiar with CISO processes and is seeking to modernize those processes to enable
cloud transformation.
Cloud adoption enables services that weren't often considered in traditional IT environments. Self-service or
automated deployments are commonly executed by application development or other IT teams not traditionally
aligned to production deployment. In some organizations, business constituents similarly have self-service
capabilities. This can trigger new security requirements that weren't needed in the on-premises world. Centralized
security is more challenging, and security often becomes a shared responsibility across the business and IT culture. This
article can help a CISO prepare for that approach and engage in incremental governance.
Next steps
The first step to taking action in any governance strategy is a policy review. Policy and compliance could be a useful
guide during your policy review.
Prepare for a policy review
Conduct a cloud policy review
10/30/2020 • 3 minutes to read • Edit Online
A cloud policy review is the first step toward governance maturity in the cloud. The objective of this process is to
modernize existing corporate IT policies. When completed, the updated policies provide an equivalent level of
risk management for cloud-based resources. This article explains the cloud policy review process and its
importance.
Next steps
Learn more about including data classification in your cloud governance strategy.
Data classification
What is data classification?
10/30/2020 • 2 minutes to read • Edit Online
Data classification allows you to determine and assign value to your organization's data and provides a common
starting point for governance. The data classification process categorizes data by sensitivity and business impact
in order to identify risks. When data is classified, you can manage it in ways that protect sensitive or important
data from theft or loss.
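A small illustration can make the classification step concrete. The following Python sketch is hypothetical: the classification levels, field names, and rules are placeholders and should be replaced by your organization's own sensitivity and business impact standards.

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical classification levels; align these with your organization's standards.
class Classification(Enum):
    PUBLIC = "public"
    GENERAL = "general"
    CONFIDENTIAL = "confidential"
    HIGHLY_CONFIDENTIAL = "highly-confidential"

@dataclass
class DataAsset:
    name: str
    contains_personal_data: bool
    business_impact: str  # "low", "medium", or "high"

def classify(asset: DataAsset) -> Classification:
    """Assign a classification from sensitivity and business impact (simplified rules)."""
    if asset.contains_personal_data:
        return Classification.HIGHLY_CONFIDENTIAL
    if asset.business_impact == "high":
        return Classification.CONFIDENTIAL
    if asset.business_impact == "medium":
        return Classification.GENERAL
    return Classification.PUBLIC

inventory = [
    DataAsset("customer-db", contains_personal_data=True, business_impact="high"),
    DataAsset("marketing-site-assets", contains_personal_data=False, business_impact="low"),
]

for asset in inventory:
    # The resulting value could be applied as a resource tag (for example, "dataClassification").
    print(asset.name, classify(asset).value)
```

The output of such a sketch would typically feed the tagging step described below.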
Take action
Take action by defining and tagging assets with a defined data classification.
Choose one of the actionable governance guides for examples of applying tags across your portfolio.
Review the naming and tagging standards article to define a more comprehensive tagging standard.
For additional information on resource tagging in Azure, see Use tags to organize your Azure resources and
management hierarchy.
Next steps
Continue learning from this article series by reviewing the article on securing sensitive data. The next article
contains applicable insights if you are working with data that is classified as confidential or highly confidential.
Secure sensitive data
The Five Disciplines of Cloud Governance
10/30/2020 • 2 minutes to read • Edit Online
Any change to business processes or technology platforms introduces risk. Cloud governance teams, whose
members are sometimes known as cloud custodians, are tasked with mitigating these risks and ensuring
minimal interruption to adoption or innovation efforts.
The Cloud Adoption Framework governance model guides these decisions, irrespective of the chosen cloud
platform, by focusing on development of corporate policy and the Five Disciplines of Cloud Governance.
Actionable design guides demonstrate this model using Azure services. Learn about the disciplines of the Cloud
Adoption Framework governance model below.
Figure 1: Visual of corporate policy and the Five Disciplines of Cloud Governance.
The Cost Management discipline is one of the Five Disciplines of Cloud Governance within the Cloud Adoption
Framework governance model. For many customers, governing their costs is a major concern when adopting
cloud technologies. Balancing performance demands, adoption pacing, and cloud services costs can be
challenging. This is especially relevant during major business transformations that implement cloud
technologies. This section outlines the approach to developing a Cost Management discipline as part of a cloud
governance strategy.
NOTE
Cost Management discipline does not replace the existing business teams, accounting practices, and procedures that are
involved in your organization's financial management of IT-related costs. The primary purpose of this discipline is to
identify potential cloud-related risks related to IT spending, and provide risk-mitigation guidance to the business and IT
teams responsible for deploying and managing cloud resources.
The primary audience for this guidance is your organization's cloud architects and other members of your cloud
governance team. The decisions, policies, and processes that emerge from this discipline should involve
engagement and discussions with relevant members of your business and IT teams, especially those leaders
responsible for owning, managing, and paying for cloud-based workloads.
Policy statements
Actionable policy statements and the resulting architecture requirements serve as the foundation of a Cost
Management discipline. Use sample policy statements as a starting point for defining your Cost Management
policies.
Caution
The sample policies come from common customer experiences. To better align these policies to specific cloud
governance needs, execute the following steps to create policy statements that meet your unique business needs.
Next steps
Get started by evaluating business risks in a specific environment.
Understand business risks
Cost Management discipline template
10/30/2020 • 2 minutes to read • Edit Online
The first step to implementing change is communicating the desired change. The same is true when changing
governance practices. The template below serves as a starting point for documenting and communicating policy
statements that govern cost management issues in the cloud.
As your discussions progress, use this template's structure as a model for capturing the business risks, risk
tolerances, compliance processes, and tooling needed to define your organization's Cost Management policy
statements.
IMPORTANT
This template is a limited sample. Before updating this template to reflect your requirements, you should review the
subsequent steps for defining an effective Cost Management discipline within your cloud governance strategy.
Next steps
Solid governance practices start with an understanding of business risk. Review the article on business risks and
begin to document the business risks that align with your current cloud adoption plan.
Understand business risks
Motivations and business risks in the Cost
Management discipline
10/30/2020 • 2 minutes to read • Edit Online
This article discusses the reasons that customers typically adopt a Cost Management discipline within a cloud
governance strategy. It also provides a few examples of business risks that drive policy statements.
Relevance
In terms of cost governance, cloud adoption creates a paradigm shift. Managing costs in a traditional on-premises
world is based on refresh cycles, datacenter acquisitions, host renewals, and recurring maintenance issues. You can
forecast, plan, and refine these costs to align with annual capital expenditure budgets.
For cloud solutions, many businesses tend to take a more reactive approach to cost management. In many cases,
businesses will prepurchase, or commit to use, a set amount of cloud services. This model assumes that
maximizing discounts, based on how much the business plans on spending with a specific cloud vendor, creates
the perception of a proactive, planned cost cycle. That perception will only become reality if the business also
implements a mature Cost Management discipline.
The cloud offers self-service capabilities that were previously unheard of in traditional on-premises datacenters.
These new capabilities empower businesses to be more agile, less restrictive, and more open to adopt new
technologies. The downside of self-service is that end users can unknowingly exceed allocated budgets.
Conversely, the same users can experience a change in plans and unexpectedly not use the amount of cloud
services forecasted. The potential for a shift in either direction justifies investment in a Cost Management discipline
within the governance team.
Business risk
The Cost Management discipline attempts to address core business risks related to expenses incurred when
hosting cloud-based workloads. Work with your business to identify these risks and monitor each of them for
relevance as you plan for and implement your cloud deployments.
Risks will differ between organizations, but the following are common cost-related risks that you can use as a
starting point for discussions within your cloud governance team:
Budget control: Not controlling budget can lead to excessive spending with a cloud vendor.
Utilization loss: Prepurchases or precommitments that go unused can result in lost investments.
Spending anomalies: Unexpected spikes in either direction can be indicators of improper usage.
Overprovisioned assets: When assets are deployed in a configuration that exceeds the needs of an
application or virtual machine (VM), they can create waste.
Next steps
Use the Cost Management policy template to document business risks that are likely to be introduced by the
current cloud adoption plan.
After you've gained an understanding of realistic business risks, the next step is to document the business's
tolerance for risk and the indicators and key metrics to monitor that tolerance.
Understand indicators, metrics, and risk tolerance
Risk tolerance metrics and indicators in the Cost
Management discipline
10/30/2020 • 3 minutes to read • Edit Online
Learn to quantify business risk tolerance associated with the Cost Management discipline. Defining metrics and
indicators helps to create a business case for investing in the maturity of this discipline.
Metrics
Cost management generally focuses on metrics related to costs. As part of your risk analysis, you'll want to gather
data related to your current and planned spending on cloud-based workloads to determine how much risk you
face, and how important investment in your Cost Management discipline is for your planned cloud deployments.
The following are examples of useful metrics that you should gather to help evaluate risk tolerance within the Cost
Management discipline (a short calculation sketch follows the list):
Annual spending: The total annual cost for services provided by a cloud provider.
Monthly spending: The total monthly cost for services provided by a cloud provider.
Forecasted versus actual ratio: The ratio comparing forecasted and actual spending (monthly or annual).
Pace of adoption (month-over-month) ratio: The percentage of the delta in cloud costs from month to
month.
Accumulated cost: Total accrued daily spending, starting from the beginning of the month.
Spending trends: Spending trend against the budget.
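To make these metrics concrete, the following Python sketch computes a few of them from hypothetical figures. The month labels, daily costs, and thresholds are placeholders, not real data; substitute exported cost data from your billing system.

```python
from itertools import accumulate

# Hypothetical monthly figures (in your billing currency).
forecast = {"2020-08": 10000, "2020-09": 11000, "2020-10": 12000}
actual   = {"2020-08":  9500, "2020-09": 12650, "2020-10": 13100}

# Forecasted versus actual ratio.
for month in sorted(actual):
    ratio = actual[month] / forecast[month]
    print(f"{month}: forecast-versus-actual ratio = {ratio:.2f}")

# Pace of adoption: month-over-month percentage change in actual spending.
months = sorted(actual)
for previous, current in zip(months, months[1:]):
    pace = (actual[current] - actual[previous]) / actual[previous] * 100
    print(f"{current}: pace of adoption = {pace:+.1f}% month over month")

# Accumulated cost: total accrued daily spending from the start of the month.
daily_costs = [310, 295, 402, 388]  # hypothetical daily figures
print("accumulated cost by day:", list(accumulate(daily_costs)))
```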
Next steps
Use the Cost Management discipline template to document metrics and tolerance indicators that align to the
current cloud adoption plan.
Review sample Cost Management policies as a starting point to develop your own policies to address specific
business risks aligned with your cloud adoption plans.
Review sample policies
Cost Management sample policy statements
10/30/2020 • 3 minutes to read • Edit Online
Individual cloud policy statements are guidelines for addressing specific risks identified during your risk
assessment process. These statements should provide a concise summary of risks and plans to deal with them.
Each statement definition should include these pieces of information:
Business risk : A summary of the risk this policy will address.
Policy statement: A clear summary explanation of the policy requirements.
Design options: Actionable recommendations, specifications, or other guidance that IT teams and developers
can use when implementing the policy.
The following sample policy statements address common cost-related business risks. These statements are
examples you can reference when drafting policy statements to address your organization's needs. These examples
are not meant to be prescriptive, and there are potentially several policy options for dealing with each identified
risk. Work closely with business and IT teams to identify the best policies for your unique set of risks.
Future-proofing
Business risk : Current criteria that don't warrant an investment in a Cost Management discipline from the
governance team, but you anticipate such an investment in the future.
Policy statement: You should associate all assets deployed to the cloud with a billing unit and application or
workload. This policy will ensure that your Cost Management discipline is effective.
Design options: For information on establishing a future-proof foundation, see the discussions related to
creating a governance MVP in the actionable design guides included as part of the Cloud Adoption Framework
guidance.
Budget overruns
Business risk : Self-service deployment creates a risk of overspending.
Policy statement: Any cloud deployment must be allocated to a billing unit with approved budget and a
mechanism for budgetary limits.
Design options: In Azure, budget can be controlled with Azure Cost Management + Billing.
Underutilization
Business risk : The company has prepaid for cloud services or has made an annual commitment to spend a
specific amount. There is a risk that the agreed-on amount won't be used, resulting in a lost investment.
Policy statement: Each billing unit with an allocated cloud budget will meet annually to set budgets, quarterly to
adjust budgets, and monthly to allocate time for reviewing planned versus actual spending. Discuss any deviations
greater than 20% with the billing unit leader monthly. For tracking purposes, assign all assets to a billing unit.
Design options:
In Azure, planned versus actual spending can be managed via Azure Cost Management + Billing.
There are several options for grouping resources by billing unit. In Azure, a resource consistency model should
be chosen in conjunction with the governance team and applied to all assets.
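As an illustration of the 20% deviation rule in the policy statement above, the following Python sketch flags billing units whose actual spending deviates from plan by more than the threshold. The billing unit names and figures are hypothetical.

```python
# Hypothetical planned and actual monthly spending per billing unit.
planned = {"retail-web": 40000, "data-platform": 25000, "internal-it": 10000}
actual = {"retail-web": 52300, "data-platform": 24100, "internal-it": 6200}

DEVIATION_THRESHOLD = 0.20  # deviations over 20% trigger a review with the billing unit leader

for unit, plan in planned.items():
    spend = actual.get(unit, 0)
    deviation = (spend - plan) / plan
    if abs(deviation) > DEVIATION_THRESHOLD:
        print(f"{unit}: {deviation:+.0%} versus plan - schedule a review")
    else:
        print(f"{unit}: {deviation:+.0%} versus plan - within tolerance")
```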
Overprovisioned assets
Business risk : In traditional on-premises datacenters, it is common practice to deploy assets with extra capacity
planning for growth in the distant future. The cloud can scale more quickly than traditional equipment. Assets in
the cloud are also priced based on the technical capacity. There is a risk of the old on-premises practice artificially
inflating cloud spending.
Policy statement: Any asset deployed to the cloud must be enrolled in a program that can monitor utilization
and report any capacity in excess of 50% of utilization. Any asset deployed to the cloud must be grouped or
tagged in a logical manner, so governance team members can engage the workload owner regarding any
optimization of overprovisioned assets.
Design options:
In Azure, Azure Advisor can provide optimization recommendations.
There are several options for grouping resources by billing unit. In Azure, a resource consistency model should
be chosen in conjunction with the governance team and applied to all assets.
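The following Python sketch shows one hypothetical way a governance team might flag potentially overprovisioned assets from exported utilization data. The asset names, utilization figures, and the interpretation of the 50% threshold are illustrative only.

```python
# Hypothetical 30-day average utilization per tagged asset, as a fraction of provisioned capacity.
utilization = {
    "vm-app-frontend-01": 0.12,
    "vm-app-frontend-02": 0.67,
    "vm-reporting-batch": 0.31,
}

# Flag assets using less than half of their provisioned capacity so the governance
# team can engage the workload owner about right-sizing.
for asset, usage in sorted(utilization.items()):
    if usage < 0.50:
        print(f"{asset}: {usage:.0%} average utilization - candidate for right-sizing")
```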
Overoptimization
Business risk: Effective cost management creates its own risks. Optimizing spending works against system
performance, so when reducing costs there is a risk of overtightening spending and producing poor user
experiences.
Policy statement: Any asset that directly affects customer experiences must be identified through grouping or
tagging. Before optimizing any asset that affects customer experience, the cloud governance team must adjust
optimization based on at least 90 days of utilization trends. Document any seasonal or event-driven bursts
considered when optimizing assets.
Design options:
In Azure, Azure Monitor's insights features can help with analysis of system utilization.
There are several options for grouping and tagging resources based on roles. In Azure, you should choose a
resource consistency model in conjunction with the governance team and apply this to all assets.
Next steps
Use the samples mentioned in this article as a starting point to develop policies that address specific business risks
that align with your cloud adoption plans.
To begin developing your own custom Cost Management policy statements, download the Cost Management
policy template.
To accelerate adoption of this discipline, choose the actionable governance guide that most closely aligns with your
environment. Then modify the design to incorporate your specific corporate policy decisions.
Building on risks and tolerance, establish a process for governing and communicating Cost Management policy
adherence.
Establish policy compliance processes
Cost Management policy compliance processes
10/30/2020 • 3 minutes to read • Edit Online
This article discusses an approach to creating processes that support an effective Cost Management discipline.
Effective governance of cloud costs starts with recurring manual processes designed to support policy compliance.
This requires regular involvement of the cloud governance team and interested business stakeholders to review
and update policy and ensure policy compliance. In addition, many ongoing monitoring and enforcement
processes can be automated or supplemented with tooling to reduce the overhead of governance and allow for
faster response to policy deviation.
Next steps
Use the Cost Management discipline template to document the processes and triggers that align to the current
cloud adoption plan.
For guidance on executing Cost Management policies in alignment with adoption plans, see Cost Management
discipline improvement.
Cost Management discipline improvement
Cost Management discipline improvement
10/30/2020 • 4 minutes to read • Edit Online
The Cost Management discipline attempts to address core business risks related to expenses incurred when
hosting cloud-based workloads. Within the Five Disciplines of Cloud Governance, the Cost Management discipline
is involved in controlling cost and usage of cloud resources with the goal of creating and maintaining a planned
cost cycle.
This article outlines potential tasks your company can perform to develop and mature your Cost Management
discipline. These tasks can be broken down into planning, building, adopting, and operating phases of
implementing a cloud solution. The tasks are then iterated on to allow the development of an incremental
approach to cloud governance.
Neither the minimum nor the potential activities outlined in this article are aligned to specific corporate policies or
third-party compliance requirements. This guidance is designed to help facilitate the conversations that will lead to
alignment of both requirements with a cloud governance model.
Next steps
Now that you understand the concept of cloud cost governance, review the Cost Management discipline best
practices to find ways to reduce your overall spend.
Cost Management discipline best practices
Best practices for costing and sizing resources hosted
in Azure
10/30/2020 • 22 minutes to read • Edit Online
While delivering the disciplines of governance, cost management is a recurring theme at the enterprise level. By
optimizing and managing costs, you can ensure the long-term success of your Azure environment. It's critical that
all teams (such as finance, management, and application development teams) understand associated costs and
review them on a recurring basis.
IMPORTANT
The best practices and opinions described in this article are based on platform and service features in Azure that were
available at the time of writing. Features and capabilities change over time. Not all recommendations will apply to your
deployment, so choose what works best for your situation.
Before adoption
Before you move your workloads to the cloud, estimate the monthly cost of running them in Azure. Proactively
managing cloud costs helps you adhere to your operating expense budget. The best practices in this section help
you to estimate costs, and perform right-sizing for VMs and storage before a workload is deployed to the cloud.
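As a simple illustration of pre-adoption estimating, the following Python sketch totals a monthly run cost for a small workload inventory. The hourly rates and running hours are placeholders; real estimates should come from the Azure pricing calculator or an assessment tool.

```python
# Hypothetical workload inventory: (workload, vm_count, hourly_rate_usd, hours_per_month)
workloads = [
    ("web front end", 4, 0.20, 730),    # runs continuously
    ("batch processing", 2, 0.40, 200),  # runs only during nightly windows
]

total = 0.0
for name, count, rate, hours in workloads:
    monthly = count * rate * hours
    total += monthly
    print(f"{name}: estimated ${monthly:,.2f} per month")

print(f"Estimated total: ${total:,.2f} per month")
```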
Storage optimized: High disk throughput and I/O. Suitable for big data, SQL, and NoSQL databases.
GPU optimized: Specialized VMs with single or multiple GPUs. Suitable for heavy graphics and video editing.
High performance: Fastest and most powerful CPUs; VMs with optional high-throughput network interfaces
(RDMA). Suitable for critical high-performance applications.
It's important to understand the pricing differences between these VMs, and the long-term budget effects.
Each type has several VM series within it.
Additionally, when you select a VM within a series, you can only scale the VM up and down within that series.
For example, a DS2_v2 instance can scale up to DS4_v2, but it can't be changed to an instance of a different
series such as an F2S_v2 instance.
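If you codify this constraint in review tooling, a rough check might look like the following Python sketch. It assumes size names follow the letters-then-digits-then-suffix pattern of the example above (such as DS2_v2); actual Azure size names vary, so treat this as illustrative only.

```python
import re

def series_of(size_name: str) -> str:
    """Return the series portion of a simplified size name, for example 'DS2_v2' -> 'DS_v2'."""
    match = re.match(r"([A-Za-z]+)\d+(.*)", size_name)
    if not match:
        raise ValueError(f"Unrecognized size name: {size_name}")
    letters, suffix = match.groups()
    return letters + suffix

def same_series(current: str, proposed: str) -> bool:
    # A resize request is acceptable only when both sizes belong to the same series.
    return series_of(current) == series_of(proposed)

print(same_series("DS2_v2", "DS4_v2"))  # True: scaling up within the series
print(same_series("DS2_v2", "F2S_v2"))  # False: a different series
```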
Learn more:
Learn more about VM types and sizing, and map sizes to types.
Plan VM sizing.
Review a sample assessment for the fictional Contoso company.
Blobs: Optimized to store massive amounts of unstructured objects, such as text or binary data. Access data from
everywhere over HTTP/HTTPS. Use for streaming and random access scenarios, for example to serve images and
documents directly to a browser, stream video and audio, and store backup and disaster recovery data.
Files: Managed file shares accessed over SMB 3.0. Use when migrating on-premises file shares, and to provide
multiple access points and connections to file data.
Disks: Based on page blobs. Disk types (speed): standard HDD, standard SSD, premium SSD, or ultra disks. Use
premium disks for VMs, and use managed disks for simple management and scaling.
Queues: Store and retrieve large numbers of messages accessed via authenticated calls (HTTP or HTTPS). Use to
connect application components with asynchronous message queueing.
Access tiers
Azure Storage provides different options for accessing block blob data. Selecting the right access tier helps ensure
that you store block blob data in the most cost-effective manner.
Hot: Higher storage costs, lower access and transaction costs. Use for data in active use that's accessed frequently.
Cool: Lower storage costs, higher access and transaction costs. Use for short-term data that's available but
accessed infrequently.
Archive: Used for individual block blobs. The most cost-effective option for storage, with the lowest storage costs
and the highest access and transaction costs. Use for data that can tolerate several hours of retrieval latency and
will reside in the archive tier for at least 180 days.
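To illustrate the trade-off, the following Python sketch compares a monthly cost across tiers for a given data size and read volume. All prices are made-up placeholders; use current Azure Storage pricing for real comparisons, and remember that the archive tier also adds retrieval latency.

```python
# Hypothetical per-GB storage prices and per-10,000-operation read prices by tier.
tiers = {
    "hot":     {"storage_per_gb": 0.020, "read_per_10k": 0.005},
    "cool":    {"storage_per_gb": 0.010, "read_per_10k": 0.010},
    "archive": {"storage_per_gb": 0.002, "read_per_10k": 5.000},
}

data_gb = 5000            # amount of block blob data stored
reads_per_month = 200000  # read operations per month

for name, price in tiers.items():
    cost = data_gb * price["storage_per_gb"] + (reads_per_month / 10000) * price["read_per_10k"]
    print(f"{name}: ~${cost:,.2f} per month")
```

With these placeholder numbers, frequent reads make the archive tier the most expensive option despite its low storage price, which is the kind of trade-off the tier selection should weigh.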
Account types
General-purpose v2, Standard tier: Supports blobs (block, page, append), files, disks, queues, and tables. Supports
hot, cool, and archive access tiers. Zone-redundant storage (ZRS) is supported. Use for most scenarios and most
types of data. Standard storage accounts can be HDD-based or SSD-based.
General-purpose v2, Premium tier: Supports Blob storage data (page blobs). Supports hot, cool, and archive access
tiers. ZRS is supported. Stored on SSD. Microsoft recommends using this account type for all VMs.
General-purpose v1: Access tiering isn't supported. Doesn't support ZRS. Use if applications need the Azure classic
deployment model.
Blob: A specialized storage account for storing unstructured objects. Provides block blobs and append blobs only
(no file, queue, table, or disk storage services). Provides the same durability, availability, scalability, and
performance as general-purpose v2. You can't store page blobs in these accounts, and therefore can't store VHD
files. You can set an access tier to hot or cool.
Redundancy
Locally redundant storage (LRS): Protects against a local outage by replicating within a single storage unit to a
separate fault domain and update domain. Keeps multiple copies of your data in one datacenter. Provides at least
99.999999999 percent (eleven 9's) durability of objects over a given year. Consider whether your application
stores data that can be easily reconstructed.
Zone-redundant storage (ZRS): Protects against a datacenter outage by replicating across three storage clusters in
a single region. Each storage cluster is physically separated and located in its own Availability Zone. Provides at
least 99.9999999999 percent (twelve 9's) durability of objects over a given year by keeping multiple copies of your
data across multiple datacenters or regions. Consider whether you need consistency, durability, and high
availability. Might not protect against a regional disaster when multiple zones are permanently affected.
Geo-redundant storage (GRS): Protects against an entire region outage by replicating data to a secondary region
that's hundreds of miles away from the primary. Provides at least 99.99999999999999 percent (sixteen 9's)
durability of objects over a given year. Replica data isn't available unless Microsoft initiates a failover to the
secondary region; if failover occurs, read and write access is available.
Read-access geo-redundant storage (RA-GRS): Similar to GRS. Provides at least 99.99999999999999 percent
(sixteen 9's) durability of objects over a given year, and 99.99 percent read availability by allowing read access
from the secondary region used for GRS.
Learn more:
Review Azure Storage pricing.
Learn to use the Azure Import/Export service to securely import large amounts of data to Azure Blob storage
and Azure Files.
Compare blobs, files, and disk storage data types.
Learn more about access tiers.
Review different types of storage accounts.
Learn about Azure Storage redundancy, including LRS, ZRS, GRS, and read-access GRS.
Learn more about Azure Files.
After adoption
Prior to adoption, cost forecasts are dependent upon decisions made by workload owners and the cloud adoption
team. While the governance team can aid in influencing those decisions, there's likely to be little action for the
governance team to take.
Once resources are in production, data can be aggregated and trends analyzed at an environment level. This data
will help the governance team make sizing and usage decisions independently, based on actual usage patterns and
current state architecture.
Analyze data to generate a budget baseline for Azure resource groups and resources, as sketched after this list.
Identify patterns of use that would allow you to reduce size and stop or pause resources to further reduce your
costs.
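The following Python sketch shows one hypothetical way to aggregate exported cost records into a per-resource-group baseline. The record format, resource group names, and figures are placeholders for whatever your cost export actually provides.

```python
from collections import defaultdict

# Hypothetical daily cost records exported from the billing system:
# (resource_group, resource_name, cost)
records = [
    ("rg-ecommerce", "vm-web-01", 14.20),
    ("rg-ecommerce", "sql-orders", 38.50),
    ("rg-analytics", "synapse-pool", 92.10),
    ("rg-analytics", "storage-raw", 6.75),
]

baseline = defaultdict(float)
for resource_group, _resource, cost in records:
    baseline[resource_group] += cost

# A simple daily baseline per resource group; trend this over time to set budgets.
for resource_group, cost in sorted(baseline.items()):
    print(f"{resource_group}: ${cost:,.2f} per day")
```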
Best practices in this section include using Azure Hybrid Benefit and Azure Reserved Virtual Machine Instances,
reducing cloud spending across subscriptions, using Azure Cost Management + Billing for cost budgeting and
analysis, monitoring resources and implementing resource group budgets, and optimizing monitoring, storage,
and VMs.
Next steps
With an understanding of best practices, examine the Cost Management toolchain to identify Azure tools and
features to help you execute these best practices.
Cost Management toolchain for Azure
Cost Management tools in Azure
10/30/2020 • 2 minutes to read • Edit Online
The Cost Management discipline is one of the Five Disciplines of Cloud Governance. This discipline focuses on
ways of establishing cloud spending plans, allocating cloud budgets, monitoring and enforcing cloud budgets,
detecting costly anomalies, and adjusting the cloud governance plan when actual spending is misaligned.
The following is a list of Azure native tools that can help mature the policies and processes that support this
discipline.
These tools include the Azure portal, Azure Cost Management + Billing, the Power BI desktop connector, and
Azure Policy.
Security Baseline discipline overview
Security baseline is one of the Five Disciplines of Cloud Governance within the Cloud Adoption Framework
governance model. Security is a component of any IT deployment, and the cloud introduces unique security
concerns. Many businesses are subject to regulatory requirements that make protecting sensitive data a major
organizational priority when considering a cloud transformation. Identifying potential security threats to your
cloud environment and establishing processes and procedures for addressing these threats should be a priority
for any IT security or cybersecurity team. The Security Baseline discipline ensures technical requirements and
security constraints are consistently applied to cloud environments, as those requirements mature.
NOTE
Security Baseline discipline does not replace the existing IT teams, processes, and procedures that your organization uses to
secure cloud-deployed resources. The primary purpose of this discipline is to identify security-related business risks and
provide risk-mitigation guidance to the IT staff responsible for security infrastructure. As you develop governance policies
and processes, make sure to involve relevant IT teams in your planning and review processes.
This article outlines the approach to developing a Security Baseline discipline as part of your cloud governance
strategy. The primary audience for this guidance is your organization's cloud architects and other members of
your cloud governance team. The decisions, policies, and processes that emerge from this discipline should
involve engagement and discussions with relevant members of your IT and security teams, especially those
technical leaders responsible for implementing networking, encryption, and identity services.
Making the correct security decisions is critical to the success of your cloud deployments and wider business
success. If your organization lacks in-house expertise in cybersecurity, consider engaging external security
consultants as a component of this discipline. Also consider engaging Microsoft Consulting Services, the
Microsoft FastTrack cloud adoption service, or other external cloud adoption experts to discuss concerns related to
this discipline.
Policy statements
Actionable policy statements and the resulting architecture requirements serve as the foundation of a Security
Baseline discipline. Use sample policy statements as a starting point for defining your Security Baseline policies.
Caution
The sample policies come from common customer experiences. To better align these policies to specific cloud
governance needs, execute the following steps to create policy statements that meet your unique business needs.
Next steps
Get started by evaluating business risks in a specific environment.
Understand business risks
Security Baseline discipline template
10/30/2020 • 2 minutes to read • Edit Online
The first step to implementing change is communicating what is desired. The same is true when changing
governance practices. The template below provides a starting point for documenting and communicating policy
statements that govern security-related issues in the cloud.
As your discussions progress, use this template's structure as a model for capturing the business risks, risk
tolerances, compliance processes, and tooling needed to define your organization's Security Baseline policy
statements.
IMPORTANT
This template is a limited sample. Before updating this template to reflect your requirements, you should review the
subsequent steps for defining an effective Security Baseline discipline within your cloud governance strategy.
Next steps
Solid governance practices start with an understanding of business risk. Review the article on business risks and
begin to document the business risks that align with your current cloud adoption plan.
Understand business risks
Motivations and business risks in the Security
Baseline discipline
10/30/2020 • 2 minutes to read • Edit Online
This article discusses the reasons that customers typically adopt a Security Baseline discipline within a cloud
governance strategy. It also provides a few examples of potential business risks that can drive policy statements.
Relevance
Security is a key concern for any IT organization. Cloud deployments face many of the same security risks as
workloads hosted in traditional on-premises datacenters. The nature of public cloud platforms, with a lack of direct
ownership of the physical hardware storing and running your workloads, means cloud security requires its own
policy and processes.
One of the primary things that sets cloud security governance apart from traditional security policy is the ease
with which resources can be created, potentially adding vulnerabilities if security isn't considered before
deployment. The flexibility that technologies like Software Defined Networking (SDN) provide for rapidly changing
your cloud-based network topology can also easily modify your overall network attack surface in unforeseen
ways. Cloud platforms also provide tools and features that can improve your security capabilities in ways not
always possible in on-premises environments.
The amount you invest into security policy and processes will depend a great deal on the nature of your cloud
deployment. Initial test deployments may only need the most basic of security policies in place, while a mission-
critical workload will entail addressing complex and extensive security needs. All deployments will need to engage
with the discipline at some level.
The Security Baseline discipline covers the corporate policies and manual processes that you can put in place to
protect your cloud deployment against security risks.
NOTE
While it's important to understand the Identity Baseline discipline in the context of the Security Baseline discipline and how
that relates to access control, the Five Disciplines of Cloud Governance treat it as a separate discipline.
Business risk
The Security Baseline discipline attempts to address core security-related business risks. Work with your business
to identify these risks and monitor each of them for relevance as you plan for and implement your cloud
deployments.
Risks differ between organizations. Use this list of common security-related risks as a starting point for
discussions within your cloud governance team:
Data breach: Inadvertent exposure or loss of sensitive cloud-hosted data can lead to losing customers,
contractual issues, or legal consequences.
Service disruption: Outages and other performance issues due to insecure infrastructure can interrupt normal
operations and result in lost productivity or lost business.
Next steps
Use the Security Baseline discipline template to document business risks that are likely to be introduced by the
current cloud adoption plan.
Once an understanding of realistic business risks is established, the next step is to document the business's
tolerance for risk and the indicators and key metrics to monitor that tolerance.
Understand indicators, metrics, and risk tolerance
Risk tolerance metrics and indicators in the Security
Baseline discipline
10/30/2020 • 4 minutes to read • Edit Online
Learn to quantify business risk tolerance associated with the Security Baseline discipline. Defining metrics and
indicators helps to create a business case for investing in the maturity of this discipline.
Metrics
The Security Baseline discipline generally focuses on identifying potential vulnerabilities in your cloud
deployments. As part of your risk analysis you'll want to gather data related to your security environment to
determine how much risk you face, and how important investment in your Security Baseline discipline is for your
planned cloud deployments.
Every organization has different security environments and requirements and different potential sources of
security data. The following are examples of useful metrics that you should gather to help evaluate risk tolerance
within the Security Baseline discipline (a short calculation sketch follows the list):
Data classification: The amount of cloud-stored data and the number of services that are unclassified according to
your organization's privacy, compliance, or business impact standards.
Number of sensitive data stores: Number of storage endpoints or databases that contain sensitive data and
should be protected.
Number of unencrypted data stores: Number of sensitive data stores that are not encrypted.
Attack surface: How many total data sources, services, and applications will be cloud-hosted. What
percentage of these data sources are classified as sensitive? What percentage of these applications and services
are mission-critical?
Covered standards: Number of security standards defined by the security team.
Covered resources: Deployed assets that are covered by security standards.
Overall standards compliance: Ratio of compliance adherence to security standards.
Attacks by severity: How many coordinated attempts to disrupt your cloud-hosted services, such as through
distributed denial of service (DDoS) attacks, does your infrastructure experience? What is the size and severity
of these attacks?
Malware protection: Percentage of deployed virtual machines (VMs) that have all required anti-malware,
firewall, or other security software installed.
Patch latency: How long has it been since VMs have had OS and software patches applied.
Security health recommendations: Number of security software recommendations for resolving health
standards for deployed resources, organized by severity.
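As an example of turning inventory data into numbers, the following Python sketch computes two of the metrics above (overall standards compliance and malware protection coverage) from a hypothetical asset list. The field names and values are illustrative and would come from your security tooling in practice.

```python
# Hypothetical deployed assets with simple security attributes.
assets = [
    {"name": "vm-web-01", "meets_standards": True,  "antimalware_installed": True},
    {"name": "vm-web-02", "meets_standards": False, "antimalware_installed": True},
    {"name": "vm-db-01",  "meets_standards": True,  "antimalware_installed": False},
]

total = len(assets)
compliant = sum(1 for asset in assets if asset["meets_standards"])
protected = sum(1 for asset in assets if asset["antimalware_installed"])

print(f"Overall standards compliance: {compliant / total:.0%}")
print(f"Malware protection coverage: {protected / total:.0%}")
```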
Next steps
Use the Security Baseline discipline template to document metrics and tolerance indicators that align to the current
cloud adoption plan.
Review sample Security Baseline policies as a starting point to develop your own policies to address specific
business risks aligned with your cloud adoption plans.
Review sample policies
Security Baseline sample policy statements
10/30/2020 • 4 minutes to read • Edit Online
Individual cloud policy statements are guidelines for addressing specific risks identified during your risk
assessment process. These statements should provide a concise summary of risks and plans to deal with them.
Each statement definition should include these pieces of information:
Technical risk : A summary of the risk this policy will address.
Policy statement: A clear summary explanation of the policy requirements.
Technical options: Actionable recommendations, specifications, or other guidance that IT teams and
developers can use when implementing the policy.
The following sample policy statements address common security-related business risks. These statements are
examples you can reference when drafting policy statements to address your organization's needs. These examples
are not meant to be prescriptive, and there are potentially several policy options for dealing with each identified
risk. Work closely with business, security, and IT teams to identify the best policies for your unique set of risks.
Asset classification
Technical risk : Assets that are not correctly identified as mission-critical or involving sensitive data may not
receive sufficient protections, leading to potential data leaks or business disruptions.
Policy statement: All deployed assets must be categorized by criticality and data classification. Classifications
must be reviewed by the cloud governance team and the application owner before deployment to the cloud.
Potential design option: Establish resource tagging standards and ensure IT staff apply them consistently to any
deployed resources using Azure resource tags.
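The following Python sketch illustrates a simple pre-deployment audit against such a tagging standard. The tag names (criticality, dataClassification) and the resources are hypothetical and should match whatever standard your governance team actually defines.

```python
REQUIRED_TAGS = {"criticality", "dataClassification"}

# Hypothetical resources and their tags, as they might appear in an exported inventory.
resources = [
    {"name": "sql-payments", "tags": {"criticality": "mission-critical", "dataClassification": "confidential"}},
    {"name": "vm-dev-sandbox", "tags": {"environment": "dev"}},
]

for resource in resources:
    missing = REQUIRED_TAGS - resource["tags"].keys()
    if missing:
        print(f"{resource['name']}: missing required tags {sorted(missing)} - review before deployment")
    else:
        print(f"{resource['name']}: classification tags present")
```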
Data encryption
Technical risk : There is a risk of protected data being exposed during storage.
Policy statement: All protected data must be encrypted when at rest.
Potential design option: See the Azure encryption overview article for a discussion of how data at rest
encryption is performed on the Azure platform. Additional controls, such as in-account data encryption and control
over how storage account settings can be changed, should also be considered.
Network isolation
Technical risk : Connectivity between networks and subnets within networks introduces potential vulnerabilities
that can result in data leaks or disruption of mission-critical services.
Policy statement: Network subnets containing protected data must be isolated from any other subnets. Network
traffic between protected data subnets is to be audited regularly.
Potential design option: In Azure, network and subnet isolation is managed through Azure Virtual Network.
DDoS protection
Technical risk : Distributed denial of service (DDoS) attacks can result in a business interruption.
Policy statement: Deploy automated DDoS mitigation mechanisms to all publicly accessible network endpoints.
No public-facing website backed by IaaS should be exposed to the internet without DDoS protection.
Potential design option: Use Azure DDoS Protection Standard to minimize disruptions caused by DDoS attacks.
Security review
Technical risk : Over time, new security threats and attack types emerge, increasing the risk of exposure or
disruption of your cloud resources.
Policy statement: Trends and potential exploits that could affect cloud deployments should be reviewed regularly
by the security team to provide updates to Security Baseline tools used in the cloud.
Potential design option: Establish a regular security review meeting that includes relevant IT and governance
team members. Review existing security data and metrics to establish gaps in current policy and Security Baseline
tools, and update policy to remediate any new risks. Use Azure Advisor and Azure Security Center to gain
actionable insights on emerging threats specific to your deployments.
Next steps
Use the samples mentioned in this article as a starting point to develop policies that address specific security risks
that align with your cloud adoption plans.
To begin developing your own custom Security Baseline policy statements, download the Security Baseline
discipline template.
To accelerate adoption of this discipline, choose the actionable governance guide that most closely aligns with your
environment. Then modify the design to incorporate your specific corporate policy decisions.
Building on risks and tolerance, establish a process for governing and communicating Security Baseline policy
adherence.
Establish policy compliance processes
Security Baseline policy compliance processes
10/30/2020 • 5 minutes to read • Edit Online
This article discusses an approach to policy adherence processes that govern the Security Baseline discipline.
Effective governance of cloud security starts with recurring manual processes designed to detect vulnerabilities
and impose policies to remediate those security risks. This requires regular involvement of the cloud governance
team and interested business and IT stakeholders to review and update policy and ensure policy compliance. In
addition, many ongoing monitoring and enforcement processes can be automated or supplemented with tooling
to reduce the overhead of governance and allow for faster response to policy deviation.
Next steps
Use the Security Baseline discipline template to document the processes and triggers that align to the current
cloud adoption plan.
For guidance on executing cloud management policies in alignment with adoption plans, see the article on
discipline improvement.
Security Baseline discipline improvement
Security Baseline discipline improvement
10/30/2020 • 5 minutes to read • Edit Online
The Security Baseline discipline focuses on ways of establishing policies that protect the network, assets, and most
importantly the data that will reside on a cloud provider's solution. Within the Five Disciplines of Cloud
Governance, the Security Baseline discipline includes classification of the digital estate and data. It also includes
documentation of risks, business tolerance, and mitigation strategies associated with the security of the data,
assets, and network. From a technical perspective, this also includes involvement in decisions regarding
encryption, network requirements, hybrid identity strategies, and the processes used to develop Security Baseline
policies for the cloud.
This article outlines some potential tasks your company can engage in to better develop and mature the Security
Baseline discipline. These tasks can be broken down into planning, building, adopting, and operating phases of
implementing a cloud solution, which are then iterated on, allowing the development of an incremental approach
to cloud governance.
Neither the minimum nor the potential activities outlined in this article are aligned to specific corporate policies or
third-party compliance requirements. This guidance is designed to help facilitate the conversations that will lead to
alignment of both requirements with a cloud governance model.
Next steps
Now that you understand the concept of cloud security governance, move on to learn more about what security
and best practices guidance Microsoft provides for Azure.
Learn about security guidance for Azure
Introduction to Azure security
Learn about logging, reporting, and monitoring
Cloud-native Security Baseline policy
10/30/2020 • 6 minutes to read • Edit Online
The Security Baseline discipline is one of the Five Disciplines of Cloud Governance. This discipline focuses on
general security topics including protection of the network, digital assets, and data. As outlined in the policy review
guide, the Cloud Adoption Framework includes three levels of sample policy: cloud-native, enterprise, and cloud-
design-principle-compliant for each of the disciplines. This article discusses the cloud-native sample policy for the
Security Baseline discipline.
NOTE
Microsoft is in no position to dictate corporate or IT policy. This article will help you prepare for an internal policy review. It is
assumed that this sample policy will be extended, validated, and tested against your corporate policy before attempting to
use it. Any use of this sample policy as-is is discouraged.
Policy alignment
This sample policy synthesizes a cloud-native scenario, meaning that the tools and platforms provided by Azure are
sufficient to manage business risks involved in a deployment. In this scenario, it is assumed that a simple
configuration of the default Azure services provides sufficient asset protection.
Next steps
Now that you've reviewed the sample Security Baseline policy for cloud-native solutions, return to the policy review
guide to start building on this sample to create your own policies for cloud adoption.
Build your own policies using the policy review guide
Microsoft security guidance
10/30/2020 • 5 minutes to read • Edit Online
Tools
The Microsoft Service Trust Portal and Compliance Manager can help meet these needs:
Overcome compliance management challenges.
Fulfill responsibilities of meeting regulatory requirements.
Conduct self-service audits and risk assessments of enterprise cloud service utilization.
These tools are designed to help organizations meet complex compliance obligations and improve data protection
capabilities when choosing and using Microsoft cloud services.
The Microsoft Service Trust Portal provides in-depth information and tools to help meet your needs for using
Microsoft cloud services, including Azure, Microsoft 365, Dynamics 365, and Windows. The portal is a one-stop
shop for security, regulatory, compliance, and privacy information related to the Microsoft cloud. It is where we
publish the information and resources needed to perform self-service risk assessments of cloud services and tools.
The portal was created to help track regulatory compliance activities within Azure, including:
Compliance Manager: Compliance Manager, a workflow-based risk assessment tool in the Microsoft Service
Trust Portal, enables you to track, assign, and verify your organization's regulatory compliance activities related
to Microsoft cloud services, such as Microsoft 365, Dynamics 365, and Azure. You can find more details in the
next section.
Trust documents: Three categories of guides provide abundant resources to assess the Microsoft cloud, learn
about Microsoft operations in security, compliance, and privacy, and help you act on improving your data
protection capabilities. These guides include:
Audit reports: Audit reports allow you to stay current on the latest privacy, security, and compliance-related
information for Microsoft cloud services. This information includes ISO, SOC, FedRAMP, and other audit reports,
bridge letters, and materials related to independent third-party audits of Microsoft cloud services such as
Azure, Microsoft 365, Dynamics 365, and others.
Data protection guides: Data protection guides provide information about how Microsoft cloud services
protect your data, and how you can manage cloud data security and compliance for your organization. These
guides include detailed white papers about the design and operation of Microsoft cloud services, FAQ
documents, reports of end-of-year security assessments, penetration test results, and guidance to help you
conduct risk assessment and improve your data protection capabilities.
Azure security and compliance blueprint: Blueprints provide resources to assist you in building and
launching cloud-powered applications that help you comply with stringent regulations and standards. Azure has
more certifications than any other cloud provider, so you can deploy your critical workloads to Azure with
confidence. Blueprints include:
Industry-specific overview and guidance.
Customer responsibilities matrix.
Reference architectures with threat models.
Control implementation matrices.
Automation to deploy reference architectures.
Privacy resources. Documentation for data protection impact assessments, data subject requests, and
data breach notification is provided to incorporate into your own accountability program in support of
the General Data Protection Regulation (GDPR).
Get started with GDPR: Microsoft products and services help organizations meet GDPR requirements while
collecting or processing personal data. The Microsoft Service Trust Portal is designed to give you information
about the capabilities in Microsoft services that you can use to address specific requirements of the GDPR. The
documentation can help your GDPR accountability and your understanding of technical and organizational
measures. Documentation for data protection impact assessments, data subject requests, and data breach
notification is provided to incorporate into your own accountability program in support of the GDPR.
Data subject requests: The GDPR grants individuals (or data subjects) certain rights in connection with
the processing of their personal data. These rights include the right to correct inaccurate data, erase data,
or restrict its processing, as well as the right to receive their data and fulfill a request to transmit their
data to another controller.
Data breach: The GDPR mandates notification requirements for data controllers and processors if a
breach of personal data occurs. The Microsoft Service Trust Portal provides you with information about
how Microsoft works to prevent breaches, how Microsoft detects a breach, and how Microsoft will
respond and notify you as a data controller if a breach occurs.
Data protection impact assessment: Microsoft helps controllers complete GDPR data protection
impact assessments (DPIAs). The GDPR provides a non-exhaustive list of cases in which DPIAs must be
performed, such as automated processing for the purposes of profiling and similar activities; large-scale
processing of special categories of personal data; and systematic monitoring of a publicly accessible area on a
large scale.
Other resources: In addition to the tools and guidance discussed in the above sections, the Microsoft Service
Trust Portal also provides other resources including regional compliance, additional resources for the
security and compliance center, and frequently asked questions about the Microsoft Service Trust Portal,
Compliance Manager, privacy, and GDPR.
Regional compliance: The Microsoft Service Trust Portal provides numerous compliance documents and
guidance for Microsoft online services to meet compliance requirements for different regions including Czech
Republic, Poland, and Romania.
Behavioral analytics
Behavioral analytics is a technique that analyzes and compares data to a collection of known patterns. These
patterns are not simple signatures. They're determined through complex machine learning algorithms that are
applied to massive data sets. They're also determined through careful analysis of malicious behaviors by expert
analysts. Azure Security Center can use behavioral analytics to identify compromised resources based on analysis
of virtual machine logs, virtual network device logs, fabric logs, crash dumps, and other sources.
Security Baseline tools in Azure
10/30/2020 • 2 minutes to read • Edit Online
The Security Baseline discipline is one of the Five Disciplines of Cloud Governance. This discipline focuses on ways
of establishing policies that protect the network, assets, and most importantly the data that will reside on a cloud
provider's solution. Within the Five Disciplines of Cloud Governance, the Security Baseline discipline involves
classification of the digital estate and data. It also involves documentation of risks, business tolerance, and
mitigation strategies associated with the security of data, assets, and networks. From a technical perspective, this
discipline also includes involvement in decisions regarding encryption, network requirements, hybrid identity
strategies, and tools to automate enforcement of security policies across resource groups.
The following list of Azure tools can help mature the policies and processes that support this discipline.
Encrypt virtual drives: Azure Key Vault
Manage hybrid identity services: Azure Active Directory (Azure AD)
Restrict allowed types of resource: Azure Policy
Preemptively detect vulnerabilities: Azure Security Center
Configure backup and disaster recovery: Azure portal and Azure Resource Manager
Azure Monitor is part of this toolchain but isn't the primary tool for any of the activities listed above.
For a complete list of Azure security tools and services, see Security services and technologies available on Azure.
Customers commonly use third-party tools to enable Security Baseline discipline activities. For more information,
see the article integrate security solutions in Azure Security Center.
In addition to security tools, the Microsoft Trust Center contains extensive guidance, reports, and related
documentation that can help you perform risk assessments as part of your migration planning process.
Identity Baseline discipline overview
10/30/2020 • 2 minutes to read • Edit Online
Identity baseline is one of the Five Disciplines of Cloud Governance within the Cloud Adoption Framework
governance model. Identity is increasingly considered the primary security perimeter in the cloud, which is a shift
from the traditional focus on network security. Identity services provide the core mechanisms supporting access
control and organization within IT environments, and the Identity Baseline discipline complements the Security
Baseline discipline by consistently applying authentication and authorization requirements across cloud adoption
efforts.
NOTE
Identity Baseline discipline does not replace the existing IT teams, processes, and procedures that allow your organization to
manage and secure identity services. The primary purpose of this discipline is to identify potential identity-related business
risks and provide risk-mitigation guidance to IT staff that are responsible for implementing, maintaining, and operating your
identity management infrastructure. As you develop governance policies and processes, make sure to involve relevant IT
teams in your planning and review processes.
This section of the Cloud Adoption Framework outlines the approach to developing an Identity Baseline discipline
as part of your cloud governance strategy. The primary audience for this guidance is your organization's cloud
architects and other members of your cloud governance team. The decisions, policies, and processes that emerge
from this discipline should involve engagement and discussions with relevant members of the IT teams
responsible for implementing and managing your organization's identity management solutions.
If your organization lacks in-house expertise in identity and security, consider engaging external consultants as a
part of this discipline. Also consider engaging Microsoft Consulting Services, the Microsoft FastTrack cloud
adoption service, or other external cloud adoption partners to discuss concerns related to this discipline.
Policy statements
Actionable policy statements and the resulting architecture requirements serve as the foundation of an Identity
Baseline discipline. Use sample policy statements as a starting point for defining your Identity Baseline policies.
Caution
The sample policies come from common customer experiences. To better align these policies to specific cloud
governance needs, execute the following steps to create policy statements that meet your unique business needs.
Next steps
Get started by evaluating business risks in a specific environment.
Understand business risks
Identity Baseline discipline template
10/30/2020 • 2 minutes to read • Edit Online
The first step to implementing change is communicating the desired change. The same is true when changing
governance practices. The template below serves as a starting point for documenting and communicating policy
statements that govern identity services in the cloud.
As your discussions progress, use this template's structure as a model for capturing the business risks, risk
tolerances, compliance processes, and tooling needed to define your organization's Identity Baseline policy
statements.
IMPORTANT
This template is a limited sample. Before updating this template to reflect your requirements, you should review the
subsequent steps for defining an effective Identity Baseline discipline within your cloud governance strategy.
Next steps
Solid governance practices start with an understanding of business risk. Review the article on business risks and
begin to document the business risks that align with your current cloud adoption plan.
Understand business risks
Motivations and business risks in the Identity Baseline
discipline
10/30/2020 • 2 minutes to read • Edit Online
This article discusses the reasons that customers typically adopt an Identity Baseline discipline within a cloud
governance strategy. It also provides a few examples of business risks that drive policy statements.
Relevance
Traditional on-premises directories are designed to allow businesses to strictly control permissions and policies
for users, groups, and roles within their internal networks and datacenters. These directories typically support
single-tenant implementations, with services applicable only within the on-premises environment.
Cloud identity services expand an organization's authentication and access control capabilities to the internet.
They support multitenancy and can be used to manage users and access policy across cloud applications and
deployments. Public cloud platforms have cloud-native identity services supporting management and
deployment tasks and are capable of varying levels of integration with your existing on-premises identity
solutions. All of these features can result in cloud identity policy being more complicated than your traditional on-
premises solutions require.
The importance of the Identity Baseline discipline to your cloud deployment will depend on the size of your team
and the need to integrate your cloud-based identity solution with an existing on-premises identity service. Initial test
deployments may not require much in the way of user organization or management, but as your cloud estate
matures, you will likely need to support more complicated organizational integration and centralized
management.
Business risk
The Identity Baseline discipline attempts to address core business risks related to identity services and access
control. Work with your business to identify these risks and monitor each of them for relevance as you plan for
and implement your cloud deployments.
Risks will differ between organizations, but the following serve as common identity-related risks that you can use
as a starting point for discussions within your cloud governance team:
Unauthorized access. Sensitive data and resources that can be accessed by unauthorized users can lead to
data leaks or service disruptions, violating your organization's security perimeter and risking business or legal
liabilities.
Inefficiency due to multiple identity solutions. Organizations with multiple identity services tenants can
require multiple accounts for users. This can lead to inefficiency for users who need to remember multiple sets
of credentials and for IT in managing accounts across multiple systems. If user access assignments are not
updated across identity solutions as staff, teams, and business goals change, your cloud resources may be
vulnerable to unauthorized access, or users may be unable to access required resources.
Inability to share resources with external partners. Difficulty adding external business partners to your
existing identity solutions can prevent efficient resource sharing and business communication.
On-premises identity dependencies. Legacy authentication mechanisms or third-party multi-factor
authentication might not be available in the cloud, requiring either migrating workloads to be retooled, or
additional identity services to be deployed to the cloud. Either requirement could delay or prevent migration,
and increase costs.
Next steps
Use the Identity Baseline discipline template to document business risks that are likely to be introduced by the
current cloud adoption plan.
Once an understanding of realistic business risks is established, the next step is to document the business's
tolerance for risk and the indicators and key metrics to monitor that tolerance.
Understand indicators, metrics, and risk tolerance
Identity baseline metrics, indicators, and risk
tolerance
10/30/2020 • 4 minutes to read • Edit Online
Learn to quantify business risk tolerance associated with the Identity Baseline discipline. Defining metrics and
indicators helps to create a business case for investing in the maturity of this discipline.
Metrics
Identity management focuses on identifying, authenticating, and authorizing individuals, groups of users, or
automated processes, and providing them appropriate access to resources in your cloud deployments. As part of
your risk analysis you'll want to gather data related to your identity services to determine how much risk you face,
and how important investment in your Identity Baseline discipline is for your planned cloud deployments.
The following are examples of useful metrics that you should gather to help evaluate risk tolerance within the
Identity Baseline discipline:
Identity systems size. Total number of users, groups, or other objects managed through your identity
systems.
Overall size of directory services infrastructure. Number of directory forests, domains, and tenants used
by your organization.
Dependency on legacy or on-premises authentication mechanisms. Number of workloads that depend
on legacy, third-party, or multi-factor authentication mechanisms.
Extent of cloud-deployed directory services. Number of directory forests, domains, and tenants you've
deployed to the cloud.
Cloud-deployed Active Directory servers. Number of Active Directory servers deployed to the cloud.
Cloud-deployed organizational units. Number of Active Directory organizational units (OUs) deployed to
the cloud.
Extent of federation. Number of identity management systems federated with your organization's systems.
Elevated users. Number of user accounts with elevated access to resources or management tools.
Use of role-based access control. Number of subscriptions, resource groups, or individual resources not
managed through role-based access control (RBAC) via groups.
Authentication claims. Number of successful and failed user authentication attempts.
Authorization claims. Number of successful and failed attempts by users to access resources.
Compromised accounts. Number of user accounts that have been compromised.
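How you collect these metrics depends on your tooling. As one hedged illustration only, and not part of the framework guidance, the following Python sketch counts subscription-scope assignments of the built-in Owner role as a rough proxy for the elevated users metric. It assumes the azure-identity and azure-mgmt-authorization packages, a placeholder subscription ID, and Reader access; the exact list method name can vary between SDK versions.

from azure.identity import DefaultAzureCredential
from azure.mgmt.authorization import AuthorizationManagementClient

SUBSCRIPTION_ID = "<subscription-id>"  # placeholder value

credential = DefaultAzureCredential()
auth_client = AuthorizationManagementClient(credential, SUBSCRIPTION_ID)

# Well-known ID of the built-in Owner role definition.
OWNER_ROLE_ID_SUFFIX = "/providers/Microsoft.Authorization/roleDefinitions/8e3af657-a8ff-443c-a75c-2fe8c4bcb635"

# Count role assignments anywhere in the subscription that grant the Owner role;
# this number feeds the "elevated users" metric described above.
elevated = [
    ra for ra in auth_client.role_assignments.list_for_subscription()
    if (ra.role_definition_id or "").endswith(OWNER_ROLE_ID_SUFFIX)
]
print(f"Owner role assignments in subscription: {len(elevated)}")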
Next steps
Use the Identity Baseline discipline template to document metrics and tolerance indicators that align to the current
cloud adoption plan.
Review sample Identity Baseline policies as a starting point to develop your own policies to address specific
business risks aligned with your cloud adoption plans.
Review sample policies
Identity Baseline sample policy statements
10/30/2020 • 3 minutes to read • Edit Online
Individual cloud policy statements are guidelines for addressing specific risks identified during your risk
assessment process. These statements should provide a concise summary of risks and plans to deal with them.
Each statement definition should include these pieces of information:
Technical risk : A summary of the risk this policy will address.
Policy statement: A clear summary explanation of the policy requirements.
Design options: Actionable recommendations, specifications, or other guidance that IT teams and developers
can use when implementing the policy.
The following sample policy statements address common identity-related business risks. These statements are
examples you can reference when drafting policy statements to address your organization's needs. These examples
are not meant to be prescriptive, and there are potentially several policy options for dealing with each identified
risk. Work closely with business and IT teams to identify the best policies for your unique set of risks.
Overprovisioned access
Technical risk : Users and groups with control over resources beyond their area of responsibility can result in
unauthorized modifications leading to outages or security vulnerabilities.
Policy statement: The following policies will be implemented:
A least-privilege access model will be applied to any resources involved in mission-critical applications or
protected data.
Elevated permissions should be an exception, and any such exceptions must be recorded with the cloud
governance team. Exceptions will be audited regularly.
Potential design options: Consult the Azure identity management best practices to implement a role-based
access control (RBAC) strategy that restricts access based on the need to know and least-privilege security
principles.
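If you automate role assignment, the assignment can be scoped to a single resource group rather than to the whole subscription. The sketch below is a hedged illustration, not a prescribed implementation: it grants the built-in Reader role at resource group scope with the azure-mgmt-authorization package. The subscription ID, resource group name, and principal object ID are placeholders, and the parameter shape differs slightly between SDK versions.

import uuid
from azure.identity import DefaultAzureCredential
from azure.mgmt.authorization import AuthorizationManagementClient

SUBSCRIPTION_ID = "<subscription-id>"      # placeholder
RESOURCE_GROUP = "<resource-group-name>"   # placeholder
PRINCIPAL_OBJECT_ID = "<user-object-id>"   # placeholder Azure AD object ID

# Well-known ID of the built-in Reader role definition.
READER_ROLE_DEFINITION_ID = (
    f"/subscriptions/{SUBSCRIPTION_ID}/providers/Microsoft.Authorization/"
    "roleDefinitions/acdd72a7-3385-48ef-bd42-f606fba81ae7"
)

scope = f"/subscriptions/{SUBSCRIPTION_ID}/resourceGroups/{RESOURCE_GROUP}"
client = AuthorizationManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Create the assignment at resource group scope only, keeping access narrow
# in line with the least-privilege policy statement above.
client.role_assignments.create(
    scope,
    str(uuid.uuid4()),  # role assignment name must be a GUID
    {
        "role_definition_id": READER_ROLE_DEFINITION_ID,
        "principal_id": PRINCIPAL_OBJECT_ID,
    },
)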
Identity reviews
Technical risk : As business changes over time, the addition of new cloud deployments or other security concerns
can increase the risks of unauthorized access to secure resources.
Policy statement: Cloud governance processes must include quarterly review with identity management teams
to identify malicious actors or usage patterns that should be prevented by cloud asset configuration.
Potential design options: Establish a quarterly security review meeting that includes both governance team
members and IT staff responsible for managing identity services. Review existing security data and metrics to
establish gaps in current identity management policy and tooling, and update policy to remediate any new risks.
Next steps
Use the samples mentioned in this article as a starting point for developing policies to address specific business
risks that align with your cloud adoption plans.
To begin developing your own custom Identity Baseline policy statements, download the Identity Baseline
discipline template.
To accelerate adoption of this discipline, choose the actionable governance guide that most closely aligns with your
environment. Then modify the design to incorporate your specific corporate policy decisions.
Building on risks and tolerance, establish a process for governing and communicating Identity Baseline policy
adherence.
Establish policy compliance processes
Identity Baseline policy compliance processes
10/30/2020 • 4 minutes to read • Edit Online
This article discusses an approach to policy adherence processes that govern the Identity Baseline discipline.
Effective governance of identity starts with recurring manual processes that guide identity policy adoption and
revisions. This requires regular involvement of the cloud governance team and interested business and IT
stakeholders to review and update policy and ensure policy compliance. In addition, many ongoing monitoring and
enforcement processes can be automated or supplemented with tooling to reduce the overhead of governance
and allow for faster response to policy deviation.
Next steps
Use the Identity Baseline discipline template to document the processes and triggers that align to the current cloud
adoption plan.
For guidance on executing cloud management policies in alignment with adoption plans, see the article on
discipline improvement.
Identity Baseline discipline improvement
Identity Baseline discipline improvement
10/30/2020 • 6 minutes to read • Edit Online
The Identity Baseline discipline focuses on ways of establishing policies that ensure consistency and continuity of
user identities regardless of the cloud provider that hosts the application or workload. Within the Five Disciplines
of Cloud Governance, the Identity Baseline discipline includes decisions regarding the hybrid identity strategy,
evaluation and extension of identity repositories, implementation of single sign-on (same sign-on), auditing and
monitoring for unauthorized use or malicious actors. In some cases, it may also involve decisions to modernize,
consolidate, or integrate multiple identity providers.
This article outlines some potential tasks your company can engage in to better develop and mature the Identity
Baseline discipline. These tasks can be broken down into planning, building, adopting, and operating phases of
implementing a cloud solution, which are then iterated on, allowing the development of an incremental approach
to cloud governance.
Neither the minimum nor the potential activities outlined in this article are aligned to specific corporate policies or
third-party compliance requirements. This guidance is designed to help facilitate the conversations that will lead to
alignment of both requirements with a cloud governance model.
Next steps
Now that you understand the concept of cloud identity governance, examine the Identity Baseline toolchain to
identify Azure tools and features that you'll need when developing the Identity Baseline discipline on the Azure
platform.
Identity Baseline toolchain for Azure
Identity Baseline tools in Azure
10/30/2020 • 4 minutes to read • Edit Online
The Identity Baseline discipline is one of the Five Disciplines of Cloud Governance. This discipline focuses on ways
of establishing policies that ensure consistency and continuity of user identities regardless of the cloud provider
that hosts the application or workload.
The following tools are included in the discovery guide for hybrid identity.
Active Director y (on-premises): Active Directory is the identity provider most frequently used in the
enterprise to store and validate user credentials.
Azure Active Director y: A software as a service (SaaS) equivalent to Active Directory, capable of federating
with an on-premises Active Directory.
Active Director y (IaaS): An instance of the Active Directory application running in a virtual machine in Azure.
Identity is the control plane for IT security, so authentication is an organization's access guard to the cloud.
Organizations need an identity control plane that strengthens their security and keeps their cloud applications
safe from intruders.
Cloud authentication
Choosing the correct authentication method is the first concern for organizations wanting to move their
applications to the cloud.
When you choose cloud authentication, Azure AD handles the users' sign-in process. Coupled with seamless single
sign-on (SSO), users can sign in to cloud applications without having to reenter their credentials. With cloud
authentication, you can choose from two options:
Azure AD password hash synchronization: The simplest way to enable authentication for on-premises
directory objects in Azure AD. It can also serve as a backup failover authentication method for any other option in
case your on-premises servers go down.
Azure AD Pass-through Authentication: Provides a persistent password validation for Azure AD
authentication services by using a software agent that runs on one or more on-premises servers.
NOTE
Companies with a security requirement to immediately enforce on-premises user account states, password policies, and
sign-in hours should consider the pass-through authentication method.
Federated authentication:
When you choose this method, Azure AD passes the authentication process to a separate trusted authentication
system, such as on-premises Active Directory Federation Services (AD FS) or a trusted third-party federation
provider, to validate the user's password.
For a decision tree that helps you choose the best solution for your organization, see Choose the right
authentication method for Azure Active Directory.
The following table lists the native tools that can help mature the policies and processes that support this
discipline.
The considerations below compare password hash synchronization + seamless SSO, pass-through authentication +
seamless SSO, and federation with AD FS.
Where does authentication happen?
Password hash synchronization + seamless SSO: In the cloud.
Pass-through authentication + seamless SSO: In the cloud, after a secure password verification exchange with the
on-premises authentication agent.
Federation with AD FS: On-premises.
What are the on-premises server requirements beyond the provisioning system, Azure AD Connect?
Password hash synchronization + seamless SSO: None.
Pass-through authentication + seamless SSO: One server for each additional authentication agent.
Federation with AD FS: Two or more AD FS servers, plus two or more WAP servers in the perimeter network.
What are the requirements for on-premises internet and networking beyond the provisioning system?
Password hash synchronization + seamless SSO: None.
Pass-through authentication + seamless SSO: Outbound internet access from the servers running the
authentication agents.
Federation with AD FS: Inbound internet access to the WAP servers in the perimeter, and inbound network access
to the AD FS servers from the WAP servers in the perimeter.
Is there a health monitoring solution?
Password hash synchronization + seamless SSO: Not required.
Pass-through authentication + seamless SSO: Agent status provided by the Azure Active Directory admin center.
Federation with AD FS: Azure AD Connect Health.
Do users get single sign-on to cloud resources from domain-joined devices within the company network?
Password hash synchronization + seamless SSO: Yes, with Seamless SSO.
Pass-through authentication + seamless SSO: Yes, with Seamless SSO.
Federation with AD FS: Yes.
Alternate login ID
Is Windows Hello for Business supported?
Password hash synchronization + seamless SSO: Key trust model; certificate trust model with Intune.
Pass-through authentication + seamless SSO: Key trust model; certificate trust model with Intune.
Federation with AD FS: Key trust model; certificate trust model.
What are the multi-factor authentication options?
Password hash synchronization + seamless SSO: Azure Multi-Factor Authentication; custom controls with Azure AD
Conditional Access*.
Pass-through authentication + seamless SSO: Azure Multi-Factor Authentication; custom controls with Azure AD
Conditional Access*.
Federation with AD FS: Azure Multi-Factor Authentication; Azure Multi-Factor Authentication server; third-party
multi-factor authentication.
What user account states are supported?
Password hash synchronization + seamless SSO: Disabled accounts (up to 30-minute delay).
Pass-through authentication + seamless SSO: Disabled accounts; account locked out.
Federation with AD FS: Disabled accounts; account locked out.
What are the Azure AD Conditional Access options?
Password hash synchronization + seamless SSO: Azure AD Conditional Access.
Pass-through authentication + seamless SSO: Azure AD Conditional Access.
Federation with AD FS: Azure AD Conditional Access; AD FS claim rules.
Can you customize the logo, image, and description on the sign-in pages?
Password hash synchronization + seamless SSO: Yes, with Azure AD Premium.
Pass-through authentication + seamless SSO: Yes, with Azure AD Premium.
Federation with AD FS: Yes.
What advanced scenarios are supported?
Password hash synchronization + seamless SSO: Smart password lockout; leaked credentials reports.
Pass-through authentication + seamless SSO: Smart password lockout.
Federation with AD FS: Multisite low-latency authentication system; AD FS extranet lockout.
NOTE
Custom controls in Azure AD Conditional Access do not currently support device registration.
Next steps
The Hybrid Identity Digital Transformation Framework white paper outlines combinations and solutions for
choosing and integrating each of these components.
The Azure AD Connect tool helps you to integrate your on-premises directories with Azure AD.
Resource Consistency discipline overview
10/30/2020 • 2 minutes to read • Edit Online
Resource consistency is one of the Five Disciplines of Cloud Governance within the Cloud Adoption Framework
governance model. This discipline focuses on ways of establishing policies related to the operational management
of an environment, application, or workload. IT operations teams often provide monitoring of applications,
workload, and asset performance. They also commonly execute the tasks required to meet scale demands,
remediate performance service-level agreement (SLA) violations, and proactively avoid performance SLA
violations through automated remediation. Within the Five Disciplines of Cloud Governance, the Resource
Consistency discipline ensures that resources are consistently configured in such a way that they are discoverable
by IT operations, are included in recovery solutions, and can be onboarded into repeatable operations processes.
NOTE
The Resource Consistency discipline does not replace the existing IT teams, processes, and procedures that allow your
organization to effectively manage cloud-based resources. The primary purpose of this discipline is to identify potential
business risks and provide risk-mitigation guidance to the IT staff responsible for managing your resources in the
cloud. As you develop governance policies and processes, make sure to involve relevant IT teams in your planning and
review processes.
This section of the Cloud Adoption Framework outlines how to develop a Resource Consistency discipline as part
of your cloud governance strategy. The primary audience for this guidance is your organization's cloud architects
and other members of your cloud governance team. The decisions, policies, and processes that emerge from this
discipline should involve engagement and discussions with relevant members of the IT teams responsible for
implementing and managing your organization's resource consistency solutions.
If your organization lacks in-house expertise in resource consistency strategies, consider engaging external
consultants as a part of this discipline. Also consider engaging Microsoft Consulting Services, the Microsoft
FastTrack cloud adoption service, or other external cloud adoption experts for discussing how best to organize,
track, and optimize your cloud-based assets.
Policy statements
Actionable policy statements and the resulting architecture requirements serve as the foundation of a Resource
Consistency discipline. Use sample policy statements. These samples can serve as a starting point for defining
your Resource Consistency policies.
Caution
The sample policies come from common customer experiences. To better align these policies to specific cloud
governance needs, execute the following steps to create policy statements that meet your unique business needs.
Next steps
Get started by evaluating business risks in a specific environment.
Understand business risks
Resource Consistency discipline template
10/30/2020 • 2 minutes to read • Edit Online
The first step to implementing change is communicating what is desired. The same is true when changing
governance practices. The template below serves as a starting point for documenting and communicating policy
statements that govern IT operations and management in the cloud.
As your discussions progress, use this template's structure as a model for capturing the business risks, risk
tolerances, compliance processes, and tooling needed to define your organization's Resource Consistency policy
statements.
IMPORTANT
This template is a limited sample. Before updating this template to reflect your requirements, you should review the
subsequent steps for defining an effective Resource Consistency discipline within your cloud governance strategy.
Next steps
Solid governance practices start with an understanding of business risk. Review the article on business risks and
begin to document the business risks that align with your current cloud adoption plan.
Understand business risks
Motivations and business risks in the Resource
Consistency discipline
10/30/2020 • 2 minutes to read • Edit Online
This article discusses the reasons that customers typically adopt a Resource Consistency discipline within a cloud
governance strategy. It also provides a few examples of potential business risks that can drive policy statements.
Relevance
When it comes to deploying resources and workloads, the cloud offers increased agility and flexibility over most
traditional on-premises datacenters. These potential cloud-based advantages also come with potential
management drawbacks that can seriously jeopardize the success of your cloud adoption. What assets have you
deployed? What teams own what assets? Do you have enough resources supporting a workload? How do you
know whether workloads are healthy?
Resource consistency is crucial to ensure that resources are deployed, updated, and configured consistently in a
repeatable manner, and that service disruptions are minimized and remedied in as little time as possible.
The Resource Consistency discipline is concerned with identifying and mitigating business risks related to the
operational aspects of your cloud deployment. Resource consistency includes monitoring of applications,
workloads, and asset performance. It also includes the tasks required to meet scale demands, provide disaster
recovery capabilities, mitigate performance service-level agreement (SLA) violations, and proactively avoid those
SLA violations through automated remediation.
Initial test deployments may not require much beyond adopting some cursory naming and tagging standards to
support your resource consistency needs. As your cloud adoption matures and you deploy more complicated and
mission-critical assets, the need to invest in the Resource Consistency discipline increases rapidly.
Business risk
The Resource Consistency discipline attempts to address core operational business risks. Work with your business
and IT teams to identify these risks and monitor each of them for relevance as you plan for and implement your
cloud deployments.
Risks will differ between organizations, but the following serve as common risks that you can use as a starting
point for discussions within your cloud governance team:
Unnecessary operational cost. Obsolete or unused resources, or resources that are overprovisioned during
times of low demand, add unnecessary operational costs.
Underprovisioned resources. Resources that experience higher than anticipated demand can result in
business disruption as cloud resources are overwhelmed by demand.
Management inefficiencies. Lack of consistent naming and tagging metadata associated with resources can
lead to IT staff having difficulty finding resources for management tasks or identifying ownership and
accounting information related to assets. This results in management inefficiencies that can increase cost and
slow IT responsiveness to service disruption or other operational issues.
Business interruption. Service disruptions that result in violations of your organization's established service-
level agreements (SLAs) can result in loss of business or other financial impacts to your company.
Next steps
Use the Resource Consistency discipline template to document business risks that are likely to be introduced by
the current cloud adoption plan.
Once an understanding of realistic business risks is established, the next step is to document the business's
tolerance for risk and the indicators and key metrics to monitor that tolerance.
Understand indicators, metrics, and risk tolerance
Risk tolerance metrics and indicators in the Resource
Consistency discipline
10/30/2020 • 5 minutes to read • Edit Online
Learn to quantify business risk tolerance associated with the Resource Consistency discipline. Defining metrics and
indicators helps to create a business case for investing in the maturity of this discipline.
Metrics
Resource consistency focuses on addressing risks related to the operational management of your cloud
deployments. As part of your risk analysis you'll want to gather data related to your IT operations to determine
how much risk you face, and how important investment in your Resource Consistency discipline is for your
planned cloud deployments.
Every organization has different operational scenarios, but the following items represent useful examples of the
metrics you should gather when evaluating risk tolerance within the Resource Consistency discipline:
Cloud assets. Total number of cloud-deployed resources.
Untagged resources. Number of resources without required accounting, business impact, or organizational
tags.
Underused assets. Number of resources where memory, CPU, or network capabilities are all consistently
underutilized.
Resource depletion. Number of resources where memory, CPU, or network capabilities are exhausted by
load.
Resource age. Time since resource was last deployed or modified.
VMs in critical condition. Number of deployed VMs where one or more critical issues are detected that must
be addressed in order to restore normal functionality.
Alerts by severity. Total number of alerts on a deployed asset, broken down by severity.
Unhealthy network links. Number of resources with network connectivity issues.
Unhealthy service endpoints. Number of issues with external network endpoints.
Cloud provider service health incidents. Number of disruptions or performance incidents caused by the
cloud provider.
Service-level agreements. This can include both Microsoft's commitments for uptime and connectivity of
Azure services, as well as commitments made by the business to its external and internal customers.
Service availability. Percentage of actual uptime of cloud-hosted workloads compared to the expected uptime.
Recovery time objective (RTO). The maximum acceptable time that an application can be unavailable after
an incident.
Recovery point objective (RPO). The maximum duration of data loss that is acceptable during a disaster. For
example, if you store data in a single database, with no replication to other databases, and perform hourly
backups, you could lose up to an hour of data.
Mean time to recover (MTTR). The average time required to restore a component after a failure.
Mean time between failures (MTBF). The duration that a component can reasonably expect to run between
outages. This metric can help you calculate how often a service will become unavailable.
Backup health. Number of backups actively being synchronized.
Recovery health. Number of recovery operations successfully performed.
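Several of these metrics can be pulled directly from Azure Resource Manager. As a hedged illustration only, the Python sketch below counts deployed resources and how many lack a required tag, which maps to the cloud assets and untagged resources metrics above. It assumes the azure-identity and azure-mgmt-resource packages; the subscription ID and tag name are placeholders.

from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

SUBSCRIPTION_ID = "<subscription-id>"  # placeholder
REQUIRED_TAG = "Criticality"           # placeholder tag name

client = ResourceManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Walk every resource in the subscription and check for the required tag.
total = 0
untagged = 0
for resource in client.resources.list():
    total += 1
    tags = resource.tags or {}
    if REQUIRED_TAG not in tags:
        untagged += 1

print(f"Cloud assets: {total}")
print(f"Untagged resources (missing '{REQUIRED_TAG}'): {untagged}")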
Risk tolerance indicators
Cloud platforms offer a baseline set of features that allow deployment teams to effectively manage small
deployments without extensive additional planning or processes. As a result, small dev/test or experimental first
workloads that include a relatively small number of cloud-based assets represent a low level of risk, and will likely
not need much in the way of a formal Resource Consistency policy.
As the size of your cloud estate grows, the complexity of managing your assets increases significantly. With more
assets in the cloud, the ability to identify ownership of resources and to control how resources are used becomes
critical to minimizing risk. As more mission-critical workloads are deployed to the cloud, service uptime becomes
more critical, and tolerance for service disruption and potential cost overruns diminishes rapidly.
In the early stages of cloud adoption, work with your IT operations team and business stakeholders to identify
business risks related to resource consistency, then determine an acceptable baseline for risk tolerance. This
section of the Cloud Adoption Framework provides examples, but the detailed risks and baselines for your
company or deployments may be different.
Once you have a baseline, establish minimum benchmarks representing an unacceptable increase in your
identified risks. These benchmarks act as triggers for when you need to take action to remediate these risks. The
following are a few examples of how operational metrics, such as those discussed above, can justify an increased
investment in the Resource Consistency discipline.
Tagging and naming trigger. A company with more than x resources lacking required tagging information
or not obeying naming standards should consider investing in the Resource Consistency discipline to help
refine these standards and ensure consistent application of them to cloud-deployed assets.
Overprovisioned resources trigger. If a company has more than x% of assets regularly using small
amounts of their available memory, CPU, or network capabilities, investment in the Resource Consistency
discipline is suggested to help optimize resource usage for these items.
Underprovisioned resources trigger. If a company has more than x% of assets regularly exhausting most of
their available memory, CPU, or network capabilities, investment in the Resource Consistency discipline is
suggested to help ensure these assets have the resources necessary to prevent service interruptions.
Resource age trigger. A company with more than x resources that haven't been updated in over y months
could benefit from investment in the Resource Consistency discipline aimed at ensuring active resources are
patched and healthy, while retiring obsolete or otherwise unused assets.
Service-level agreement trigger. A company that cannot meet its service-level agreements to its external
customers or internal partners should invest in the Deployment Acceleration discipline to reduce system
downtime.
Recovery time triggers. If a company exceeds the required thresholds for recovery time following a system
failure, it should invest in improving its Deployment Acceleration discipline and systems design to reduce or
eliminate failures or the effect of individual component downtime.
VM health trigger. A company that has more than x% of VMs experiencing a critical health issue should invest
in the Resource Consistency discipline to identify issues and improve VM stability.
Network health trigger. A company that has more than x% of network subnets or endpoints experiencing
connectivity issues should invest in the Resource Consistency discipline to identify and resolve network issues.
Backup coverage trigger. A company with x% of mission-critical assets without up-to-date backups in place
would benefit from an increased investment in the Resource Consistency discipline to ensure a consistent
backup strategy.
Backup health trigger. A company experiencing more than x% failure of restore operations should invest in
the Resource Consistency discipline to identify problems with backup and ensure important resources are
protected.
The exact metrics and triggers you use to gauge risk tolerance and the level of investment in the Resource
Consistency discipline will be specific to your organization, but the examples above should serve as a useful base
for discussion within your cloud governance team.
Next steps
Use the Resource Consistency discipline template to document metrics and tolerance indicators that align to the
current cloud adoption plan.
Review sample Resource Consistency policies as a starting point to develop your own policies to address specific
business risks aligned with your cloud adoption plans.
Review sample policies
Resource Consistency sample policy statements
10/30/2020 • 4 minutes to read • Edit Online
Individual cloud policy statements are guidelines for addressing specific risks identified during your risk
assessment process. These statements should provide a concise summary of risks and plans to deal with them.
Each statement definition should include these pieces of information:
Technical risk : A summary of the risk this policy will address.
Policy statement: A clear summary explanation of the policy requirements.
Design options: Actionable recommendations, specifications, or other guidance that IT teams and developers
can use when implementing the policy.
The following sample policy statements address common business risks related to resource consistency. These
statements are examples you can reference when drafting policy statements to address your organization's needs.
These examples are not meant to be prescriptive, and there are potentially several policy options for dealing with
each identified risk. Work closely with business and IT teams to identify the best policies for your unique set of
risks.
Tagging
Technical risk : Without proper metadata tagging associated with deployed resources, IT operations cannot
prioritize support or optimization of resources based on required SLA, importance to business operations, or
operational cost. This can result in misallocation of IT resources and potential delays in incident resolution.
Policy statement: The following policies will be implemented:
Deployed assets should be tagged with the following values:
Cost
Criticality
SLA
Environment
Governance tooling must validate tagging related to cost, criticality, SLA, application, and environment. All
values must align to predefined values managed by the governance team.
Potential design options: In Azure, standard name-value metadata tags are supported on most resource types.
Azure Policy can be used to enforce specific tags as part of resource creation.
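One hedged way to express that design option programmatically, not a prescribed implementation, is to define a deny policy for resources missing a required tag and create it with the azure-mgmt-resource SDK, as in the Python sketch below. The definition name, tag name, and subscription ID are placeholders, and package layout can vary by SDK version.

from azure.identity import DefaultAzureCredential
from azure.mgmt.resource.policy import PolicyClient

SUBSCRIPTION_ID = "<subscription-id>"  # placeholder

# Deny creation of any taggable resource that is missing the 'Criticality' tag.
policy_rule = {
    "if": {"field": "tags['Criticality']", "exists": "false"},
    "then": {"effect": "deny"},
}

client = PolicyClient(DefaultAzureCredential(), SUBSCRIPTION_ID)
client.policy_definitions.create_or_update(
    "require-criticality-tag",  # placeholder definition name
    {
        "display_name": "Require a Criticality tag on resources",
        "mode": "Indexed",  # apply only to resource types that support tags and location
        "policy_rule": policy_rule,
    },
)

Once the definition exists, it still has to be assigned at a management group, subscription, or resource group scope before it takes effect.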
Ungoverned subscriptions
Technical risk : Arbitrary creation of subscriptions and management groups can lead to isolated sections of your
cloud estate that are not properly subject to your governance policies.
Policy statement: Creation of new subscriptions or management groups for any mission-critical applications or
protected data will require a review from the cloud governance team. Approved changes will be integrated into a
proper blueprint assignment.
Potential design options: Lock down administrative access to your organization's Azure management groups to
only approved governance team members who will control the subscription creation and access control process.
Deployment compliance
Technical risk : Deployment scripts and automation tooling that are not fully vetted by the cloud governance team
can result in resource deployments that violate policy.
Policy statement: The following policies will be implemented:
Deployment tooling must be approved by the cloud governance team to ensure ongoing governance of
deployed assets.
Deployment scripts must be maintained in a central repository accessible by the cloud governance team for
periodic review and auditing.
Potential design options: Consistent use of Azure Blueprints to manage automated deployments allows
consistent deployments of Azure resources that adhere to your organization's governance standards and policies.
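Azure Blueprints is the design option named above. As a narrower, hedged illustration of scripted and reviewable deployments (not a Blueprints example), the sketch below submits an ARM template through the azure-mgmt-resource SDK so that the template itself can live in the central, audited repository. The template path, deployment name, resource group, and subscription ID are placeholders, and begin_create_or_update is the method name in recent (track 2) SDK versions.

import json
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

SUBSCRIPTION_ID = "<subscription-id>"      # placeholder
RESOURCE_GROUP = "<resource-group-name>"   # placeholder

client = ResourceManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Load the approved template from the central repository checkout.
with open("templates/storage.json") as f:  # placeholder path
    template = json.load(f)

# Submit the deployment; 'Incremental' mode leaves unrelated resources untouched.
poller = client.deployments.begin_create_or_update(
    RESOURCE_GROUP,
    "governed-deployment-001",  # placeholder deployment name
    {
        "properties": {
            "mode": "Incremental",
            "template": template,
            "parameters": {},
        }
    },
)
poller.result()  # wait for the deployment to complete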
Monitoring
Technical risk : Improperly implemented or inconsistently instrumented monitoring can prevent the detection of
workload health issues or other policy compliance violations.
Policy statement: The following policies will be implemented:
Governance tooling must validate that all assets are included in monitoring for resource depletion, security,
compliance, and optimization.
Governance tooling must validate that the appropriate level of logging data is being collected for all
applications and data.
Potential design options: Azure Monitor is the default monitoring service in Azure, and consistent monitoring
can be enforced via Azure Blueprints when deploying resources.
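As one hedged example of pulling monitoring data programmatically (not a prescribed implementation), the Python sketch below queries the last day of average CPU for a single VM through the azure-mgmt-monitor package. The subscription ID, resource ID, and time span are placeholders, and metric names vary by resource type.

from datetime import datetime, timedelta
from azure.identity import DefaultAzureCredential
from azure.mgmt.monitor import MonitorManagementClient

SUBSCRIPTION_ID = "<subscription-id>"  # placeholder
VM_RESOURCE_ID = (
    "/subscriptions/<subscription-id>/resourceGroups/<rg>/providers/"
    "Microsoft.Compute/virtualMachines/<vm-name>"
)  # placeholder resource ID

client = MonitorManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Query the last 24 hours of average CPU utilization for the VM.
end = datetime.utcnow()
start = end - timedelta(hours=24)
metrics = client.metrics.list(
    VM_RESOURCE_ID,
    timespan=f"{start.isoformat()}Z/{end.isoformat()}Z",
    interval="PT1H",
    metricnames="Percentage CPU",
    aggregation="Average",
)

# Print one data point per hour.
for metric in metrics.value:
    for series in metric.timeseries:
        for point in series.data:
            print(point.time_stamp, point.average)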
Disaster recovery
Technical risk : Resource failure, deletions, or corruption can result in disruption of mission-critical applications or
services and the loss of sensitive data.
Policy statement: All mission-critical applications and protected data must have backup and recovery solutions
implemented to minimize business impact of outages or system failures.
Potential design options: The Azure Site Recovery service provides backup, recovery, and replication
capabilities that minimize outage duration in business continuity and disaster recovery (BCDR) scenarios.
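As a small, hedged illustration of checking whether backup infrastructure exists at all (it does not configure Site Recovery itself), the sketch below lists Recovery Services vaults in a subscription by filtering the generic resource list. It assumes the azure-identity and azure-mgmt-resource packages and a placeholder subscription ID.

from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

SUBSCRIPTION_ID = "<subscription-id>"  # placeholder

client = ResourceManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Recovery Services vaults host both Azure Backup and Azure Site Recovery;
# their presence (or absence) is a quick signal when auditing BCDR coverage.
vaults = client.resources.list(
    filter="resourceType eq 'Microsoft.RecoveryServices/vaults'"
)
for vault in vaults:
    print(vault.name, vault.location)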
Next steps
Use the samples mentioned in this article as a starting point to develop policies that address specific business risks
that align with your cloud adoption plans.
To begin developing your own custom Resource Consistency policy statements, download the Resource
Consistency discipline template.
To accelerate adoption of this discipline, choose the actionable governance guide that most closely aligns with your
environment. Then modify the design to incorporate your specific corporate policy decisions.
Building on risks and tolerance, establish a process for governing and communicating Resource Consistency policy
adherence.
Establish policy compliance processes
Resource Consistency policy compliance processes
10/30/2020 • 5 minutes to read • Edit Online
This article discusses an approach to policy adherence processes that govern resource consistency. Effective cloud
resource consistency governance starts with recurring manual processes designed to identify operational
inefficiencies, improve management of deployed resources, and ensure mission-critical workloads have minimal
disruptions. These manual processes are supplemented with monitoring, automation, and tooling to help reduce
the overhead of governance and allow for faster response to policy deviation.
Next steps
Use the Resource Consistency discipline template to document the processes and triggers that align to the current
cloud adoption plan.
For guidance on executing cloud management policies in alignment with adoption plans, see the article on
discipline improvement.
Resource Consistency discipline improvement
Resource Consistency discipline improvement
10/30/2020 • 6 minutes to read • Edit Online
The Resource Consistency discipline focuses on ways of establishing policies related to the operational
management of an environment, application, or workload. Within the Five Disciplines of Cloud Governance, the
Resource Consistency discipline includes the monitoring of application, workload, and asset performance. It also
includes the tasks required to meet scale demands, remediate performance service-level agreement (SLA)
violations, and proactively avoid SLA violations through automated remediation.
This article outlines some potential tasks your company can engage in to better develop and mature the Resource
Consistency discipline. These tasks can be broken down into planning, building, adopting, and operating phases of
implementing a cloud solution, which are then iterated on, allowing the development of an incremental approach
to cloud governance.
Neither the minimum nor the potential activities outlined in this article are aligned to specific corporate policies or
third-party compliance requirements. This guidance is designed to help facilitate the conversations that will lead to
alignment of both requirements with a cloud governance model.
Next steps
Now that you understand the concept of cloud resource governance, move on to learn more about how resource
access is managed in Azure in preparation for learning how to design a governance model for a simple workload
or for multiple teams.
Learn about resource access management in Azure
Learn about service-level agreements for Azure
Learn about logging, reporting, and monitoring
Resource Consistency tools in Azure
10/30/2020 • 2 minutes to read • Edit Online
Resource consistency is one of the Five Disciplines of Cloud Governance. This discipline focuses on ways of
establishing policies related to the operational management of an environment, application, or workload. Within
the Five Disciplines of Cloud Governance, the Resource Consistency discipline involves monitoring of application,
workload, and asset performance. It also involves the tasks required to meet scale demands, remediate
performance SLA violations, and proactively avoid performance SLA violations through automated remediation.
The following is a list of Azure tools that can help mature the policies and processes that support this discipline.
Orchestrated environment deployment | No | No | Yes | No | No | No | No
Assess availability and scalability | No | No | No | Yes | No | No | No
Apply automated remediation | No | No | No | Yes | No | No | No
Manage billing | Yes | No | No | No | No | No | No
Along with these Resource Consistency tools and features, you will need to monitor your deployed resources for
performance and health issues. Azure Monitor is the default monitoring and reporting solution in Azure. Azure
Monitor provides features for monitoring your cloud resources. This list shows which feature addresses common
monitoring requirements.
TOOL | AZURE PORTAL | APPLICATION INSIGHTS | LOG ANALYTICS | AZURE MONITOR REST API
Schedule regular reports or custom analysis | No | No | No | No
When planning your deployment, you will need to consider where logging data is stored and how you integrate
cloud-based reporting and monitoring services with your existing processes and tools.
NOTE
Organizations also use third-party DevOps tools to monitor workloads and resources. For more information, see DevOps
tool integrations.
Next steps
Learn to create, assign, and manage policy definitions in Azure.
Resource access management in Azure
10/30/2020 • 4 minutes to read • Edit Online
The Govern methodology outlines the Five Disciplines of Cloud Governance, which include resource
management. What is resource access governance further explains how resource access management fits into
the resource management discipline. Before you move on to learn how to design a governance model, it's
important to understand the resource access management controls in Azure. The configuration of these resource
access management controls forms the basis of your governance model.
Begin by taking a closer look at how resources are deployed in Azure.
Figure 1: A resource.
Summary
In this article, you learned about how resource access is managed in Azure using Azure Resource Manager.
Next steps
Now that you understand how to manage resource access in Azure, move on to learn how to design a governance
model for a simple workload or for multiple teams using these services.
An overview of governance
Governance design for a simple workload
10/30/2020 • 6 minutes to read • Edit Online
The goal of this guidance is to help you learn the process for designing a resource governance model in Azure to
support a single team and a simple workload. You'll look at a set of hypothetical governance requirements, then go
through several example implementations that satisfy those requirements.
In the foundational adoption stage, our goal is to deploy a simple workload to Azure. This results in the following
requirements:
Identity management for a single workload owner who is responsible for deploying and maintaining the
simple workload. The workload owner requires permission to create, read, update, and delete resources as well
as permission to delegate these rights to other users in the identity management system.
Manage all resources for the simple workload as a single management unit.
Azure licensing
Before you begin designing your governance model, it's important to understand how Azure is licensed. This is
because the administrative accounts associated with your Azure license have the highest level of access to your
Azure resources. These administrative accounts form the basis of your governance model.
NOTE
If your organization has an existing Microsoft Enterprise Agreement that does not include Azure, Azure can be added by
making an upfront monetary commitment. For more information, see Licensing Azure for the enterprise.
When Azure was added to your organization's Enterprise Agreement, your organization was prompted to create an
Azure account. During the account creation process, an Azure account owner was created, as well as an Azure
Active Directory (Azure AD) tenant with a global administrator account. An Azure AD tenant is a logical
construct that represents a secure, dedicated instance of Azure AD.
Figure 1: An Azure account with an Azure account owner and Azure AD global administrator.
Identity management
Azure only trusts Azure AD to authenticate users and authorize user access to resources, so Azure AD is our
identity management system. The Azure AD global administrator has the highest level of permissions and can
perform all actions related to identity, including creating users and assigning permissions.
Our requirement is identity management for a single workload owner who is responsible for deploying and
maintaining the simple workload. The workload owner requires permission to create, read, update, and delete
resources as well as permission to delegate these rights to other users in the identity management system.
Our Azure AD global administrator will create the workload owner account for the workload owner:
Figure 2: The Azure AD global administrator creates the workload owner user account.
You can't assign resource access permission until this user is added to a subscription, so you'll do that in the next
two sections.
Figure 4: The Azure account owner associates the Azure AD tenant with the subscription.
You may have noticed that there is currently no user associated with the subscription, which means that no one
has permission to manage resources. In practice, the account owner is the owner of the subscription and has
permission to take any action on a resource in the subscription. In practical terms, the account owner is more
than likely a finance person in your organization and is not responsible for creating, reading, updating, and
deleting resources. Those tasks will be performed by the workload owner, so you need to add the workload
owner to the subscription and assign permissions.
Since the account owner is currently the only user with permission to add the workload owner to the
subscription, they add the workload owner to the subscription:
Figure 5: The Azure account owner adds the workload owner to the subscription.
The Azure account owner grants permissions to the workload owner by assigning a role-based access control
(RBAC) role. The RBAC role specifies a set of permissions that the workload owner has for an individual resource
type or a set of resource types.
Notice that in this example, the account owner has assigned the built-in owner role:
Figure 6: The workload owner was assigned the built-in owner role.
The built-in owner role grants all permissions to the workload owner at the subscription scope.
IMPORTANT
The Azure account owner is responsible for the financial commitment associated with the subscription, but the workload
owner has the same permissions. The account owner must trust the workload owner to deploy resources that are
within the subscription budget.
The next level of management scope is the resource group level. A resource group is a logical container for
resources. Operations applied at the resource group level apply to all resources in a group. Also, it's important to
note that permissions for each user are inherited from the next level up unless they're explicitly changed at that
scope.
To illustrate this, let's look at what happens when the workload owner creates a resource group:
Figure 7: The workload owner creates a resource group and inherits the built-in owner role at the resource group
scope.
Again, the built-in owner role grants all permissions to the workload owner at the resource group scope. As
discussed earlier, this role is inherited from the subscription level. If a different role is assigned to this user at this
scope, it applies to this scope only.
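To make that sequence concrete, here is a minimal, hedged sketch (not part of the original walkthrough) of the workload owner creating a resource group with the azure-mgmt-resource package. The subscription ID, group name, and region are placeholders; any role assigned at the subscription scope, such as the built-in owner role above, is inherited by the new group automatically.

from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

SUBSCRIPTION_ID = "<subscription-id>"  # placeholder

client = ResourceManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Create (or update) the workload's resource group. No extra role assignment
# is needed here: the workload owner's subscription-scope owner role is
# inherited at the resource group scope, as described above.
client.resource_groups.create_or_update(
    "app1-workload-rg",      # placeholder resource group name
    {"location": "eastus"},  # placeholder region
)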
The lowest level of management scope is at the resource level. Operations applied at the resource level apply
only to the resource itself. Again, permissions at the resource level are inherited from resource group scope. For
example, let's look at what happens if the workload owner deploys a virtual network into the resource group:
Figure 8: The workload owner creates a resource and inherits the built-in owner role at the resource scope.
The workload owner inherits the owner role at the resource scope, which means the workload owner has all
permissions for the virtual network.
Next steps
Learn about resource access for multiple teams
Governance design for multiple teams
10/30/2020 • 24 minutes to read • Edit Online
The goal of this guidance is to help you learn the process for designing a resource governance model in Azure to
support multiple teams, multiple workloads, and multiple environments. First you'll look at a set of hypothetical
governance requirements, then go through several example implementations that satisfy those requirements.
The requirements are:
The enterprise plans to transition new cloud roles and responsibilities to a set of users and therefore requires
identity management for multiple teams with different resource access needs in Azure. This identity
management system is required to store the identity of the following users:
The individual in your organization responsible for ownership of subscriptions .
The individual in your organization responsible for the shared infrastructure resources used to
connect your on-premises network to a virtual network in Azure.
Two individuals in your organization responsible for managing a workload .
Support for multiple environments . An environment is a logical grouping of resources, such as virtual
machines, virtual networking, and network traffic routing services. These groups of resources have similar
management and security requirements and are typically used for a specific purpose such as testing or
production. In this example, the requirement is for four environments:
A shared infrastructure environment that includes resources shared by workloads in other
environments. For example, a virtual network with a gateway subnet that provides connectivity to on-
premises.
A production environment with the most restrictive security policies. It could include internal or
external-facing workloads.
A nonproduction environment for development and testing work. This environment has security,
compliance, and cost policies that differ from those in the production environment. In Azure, this takes
the form of an Enterprise Dev/Test subscription.
A sandbox environment for proof of concept and education purposes. This environment is typically
assigned per employee participating in development activities and has strict procedural and operational
security controls in place to prevent corporate data from landing here. In Azure, these take the form of
Visual Studio subscriptions. These subscriptions should also not be tied to the enterprise Azure Active
Directory.
A permissions model of least privilege in which users have no permissions by default. The model must
support the following:
A single trusted user at the subscription scope, treated like a service account and granted permission to
assign resource access rights.
Each workload owner is denied access to resources by default. Resource access rights are granted
explicitly by the single trusted user at the resource group scope.
Management access for the shared infrastructure resources, limited to the shared infrastructure owners.
Management access for each workload restricted to the workload owner in production, and increasing
levels of control as development proceeds through the various deployment environments
(development, test, staging, and production).
The enterprise does not want to have to manage roles independently in each of the three main
environments, and therefore requires the use of only built-in roles available in Azure's role-based access
control (RBAC). If the enterprise absolutely requires custom RBAC roles, additional processes would be
needed to synchronize custom roles across the three environments.
Cost tracking by workload owner name, environment, or both.
Identity management
Before you can design identity management for your governance model, it's important to understand the four
major areas it encompasses:
Administration: The processes and tools for creating, editing, and deleting user identity.
Authentication: Verifying user identity by validating credentials, such as a user name and password.
Authorization: Determining which resources an authenticated user is allowed to access or what operations
they have permission to perform.
Auditing: Periodically reviewing logs and other information to discover security issues related to user identity.
This includes reviewing suspicious usage patterns, periodically reviewing user permissions to verify they're
accurate, and other functions.
There is only one service trusted by Azure for identity, and that is Azure Active Directory (Azure AD). You'll be
adding users to Azure AD and using it for all of the functions listed above. Before looking at how to configure
Azure AD, it's important to understand the privileged accounts that are used to manage access to these services.
When your organization signed up for an Azure account, at least one Azure account owner was assigned. Also,
an Azure AD tenant was created, unless an existing tenant was already associated with your organization's use of
other Microsoft services such as Microsoft 365. A global administrator with full permissions on the Azure AD
tenant was associated when it was created.
The user identities for both the Azure account owner and the Azure AD global administrator are stored in a highly
secure identity system that is managed by Microsoft. The Azure account owner is authorized to create, update, and
delete subscriptions. The Azure AD global administrator is authorized to perform many actions in Azure AD, but
for this design guide you'll focus on the creation and deletion of user identity.
NOTE
Your organization may already have an existing Azure AD tenant if there's an existing Microsoft 365, Intune, or Dynamics
365 license associated with your account.
The Azure account owner has permission to create, update, and delete subscriptions:
Figure 1: An Azure account with an Azure account owner and Azure AD global administrator.
The Azure AD global administrator has permission to create user accounts:
Figure 2: The Azure AD global administrator creates the required user accounts in the tenant.
The first two accounts, app1 workload owner and app2 workload owner , are each associated with an
individual in your organization responsible for managing a workload. The network operations account is owned
by the individual who is responsible for the shared infrastructure resources. Finally, the subscription owner
account is associated with the individual responsible for ownership of subscriptions.
2. The service administrator reviews their request and creates resource group A . At this point, workload
owner A still doesn't have permission to do anything.
3. The service administrator adds workload owner A to resource group A and assigns the built-in
Contributor role. The Contributor role grants all permissions on resource group A except managing access
permission.
4. Let's assume that workload owner A has a requirement for a pair of team members to view the CPU and
network traffic monitoring data as part of capacity planning for the workload. Because workload owner A is
assigned the Contributor role, they do not have permission to add a user to resource group A . They must
send this request to the service administrator.
5. The service administrator reviews the request, and adds the two workload contributor users to resource
group A . Neither of these two users require permission to manage resources, so they're assigned the built-in
reader role.
6. Next, workload owner B also requires a resource group to contain the resources for their workload. As with
workload owner A , workload owner B initially does not have permission to take any action at the
subscription scope, so they must send a request to the service administrator.
7. The service administrator reviews the request and creates resource group B.
8. The service administrator then adds workload owner B to resource group B and assigns the built-in
Contributor role.
At this point, each of the workload owners is isolated in their own resource group. None of the workload owners
or their team members have management access to the resources in any other resource group.
Figure 4: A subscription with two workload owners, each isolated in their own resource group.
This model is a least-privilege model. Each user is assigned the correct permission at the correct resource
management scope.
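Steps like these can also be scripted. The following Python sketch, which assumes the azure-identity and azure-mgmt-authorization packages and a hypothetical workload owner object ID, shows how the service administrator might grant the built-in Contributor role at the resource group scope; the exact parameter shape can vary by SDK version.

```python
import uuid

from azure.identity import DefaultAzureCredential
from azure.mgmt.authorization import AuthorizationManagementClient

subscription_id = "<subscription-id>"            # assumption: your subscription ID
workload_owner_object_id = "<aad-object-id>"     # assumption: workload owner A's Azure AD object ID

credential = DefaultAzureCredential()
auth_client = AuthorizationManagementClient(credential, subscription_id)

# Built-in Contributor role (well-known role definition ID).
contributor_role_id = (
    f"/subscriptions/{subscription_id}/providers/Microsoft.Authorization/"
    "roleDefinitions/b24988ac-6180-42a0-ab88-20f7382dd24c"
)

# Scope the assignment to resource group A only, not to the whole subscription.
scope = f"/subscriptions/{subscription_id}/resourceGroups/resource-group-a"

auth_client.role_assignments.create(
    scope,
    str(uuid.uuid4()),  # role assignment names are GUIDs
    {
        "role_definition_id": contributor_role_id,
        "principal_id": workload_owner_object_id,
    },
)
```

Assigning the built-in Reader role to the two workload contributors follows the same pattern with a different role definition ID.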
Consider that every task in this example was performed by the service administrator. While this is a simple
example and may not appear to be an issue because there were only two workload owners, it's easy to imagine
the types of issues that would result for a large organization. For example, the service administrator can
become a bottleneck with a large backlog of requests that result in delays.
Let's take a look at a second example that reduces the number of tasks performed by the service administrator.
1. In this model, workload owner A is assigned the built-in owner role at the subscription scope, enabling them
to create their own resource group: resource group A .
2. When resource group A is created, workload owner A is added by default and inherits the built-in owner
role from the subscription scope.
3. The built-in owner role grants workload owner A permission to manage access to the resource group.
Workload owner A adds two workload contributors and assigns the built-in reader role to each of them.
4. The service administrator now adds workload owner B to the subscription with the built-in owner role.
5. Workload owner B creates resource group B and is added by default. Again, workload owner B inherits
the built-in owner role from the subscription scope.
Note that in this model, the service administrator performed fewer actions than they did in the first example
due to the delegation of management access to each of the individual workload owners.
Figure 5: A subscription with a service administrator and two workload owners, all assigned the built-in owner
role.
Because both workload owner A and workload owner B are assigned the built-in owner role at the
subscription scope, they have each inherited the built-in owner role for each other's resource group. This means
that not only do they have full access to each other's resources, they can also delegate management access to each
other's resource groups. For example, workload owner B has rights to add any other user to resource group A
and can assign any role to them, including the built-in owner role.
If you compare each example to the requirements, you'll see that both examples support a single trusted user at
the subscription scope with permission to grant resource access rights to the two workload owners. Neither of the
two workload owners had access to resource management by default, and each required the service
administrator to explicitly assign permissions to them. Only the first example supports the requirement that the
resources associated with each workload are isolated from one another such that no workload owner has access
to the resources of any other workload.
3. The network operations user creates a VPN gateway and configures it to connect to the on-premises VPN
appliance. The network operations user also applies a pair of tags to each of the resources:
environment:shared and managedBy:netops . When the subscription service administrator exports a cost
report, costs will be aligned with each of these tags. This allows the subscription service administrator to
pivot costs using the environment tag and the managedBy tag. Notice the resource limits counter at the top
right-hand side of the figure. Each Azure subscription has service limits, and to help you understand the effect
of these limits you'll follow the virtual network limit for each subscription. There is a limit of 1,000 virtual
networks per subscription, and after the first virtual network is deployed there are now 999 available.
4. Two more resource groups are deployed. The first is named prod-rg . This resource group is aligned with the
production environment. The second is named dev-rg and is aligned with the development environment. All
resources associated with production workloads are deployed to the production environment and all resources
associated with development workloads are deployed to the development environment. In this example, you'll
only deploy two workloads to each of these two environments, so you won't encounter any Azure subscription
service limits. Consider that each resource group has a limit of 800 resources per resource group. If you
continue to add workloads to each resource group, you'll eventually reach this limit.
5. The first workload owner sends a request to the subscription service administrator and is added to each
of the development and production environment resource groups with the contributor role. As you learned
earlier, the contributor role allows the user to perform any operation other than assigning a role to another
user. The first workload owner can now create the resources associated with their workload.
6. The first workload owner creates a virtual network in each of the two resource groups with a pair of virtual
machines in each. The first workload owner applies the environment and managedBy tags to all resources.
Note that the Azure service limit counter is now at 997 virtual networks remaining.
7. None of the virtual networks has connectivity to on-premises when created. In this type of architecture, each
virtual network must be peered to the hub-vnet in the shared infrastructure environment. Virtual network
peering creates a connection between two separate virtual networks and allows network traffic to travel
between them. Note that virtual network peering is not inherently transitive. A peering must be specified in
each of the two virtual networks that are connected, and if only one of the virtual networks specifies a peering,
then the connection is incomplete. To illustrate the effect of this, the first workload owner specifies a peering
between prod-vnet and hub-vnet . The first peering is created, but no traffic flows because the complementary
peering from hub-vnet to prod-vnet has not yet been specified. The first workload owner contacts the
network operations user and requests this complementary peering connection.
8. The network operations user reviews the request, approves it, then specifies the peering in the settings for
the hub-vnet . The peering connection is now complete, and network traffic flows between the two virtual
networks. (A scripted sketch of this two-sided peering appears after this list.)
9. Now, a second workload owner sends a request to the subscription service administrator and is added
to the existing production and development environment resource groups with the contributor role. The
second workload owner has the same permissions on all resources as the first workload owner in each
resource group.
10. The second workload owner creates a subnet in the prod-vnet virtual network, then adds two virtual
machines. The second workload owner applies the environment and managedBy tags to each resource.
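Because peering must be defined on both virtual networks, steps 7 and 8 translate into two peering operations: one by the workload owner on prod-vnet and one by the network operations user on hub-vnet . A minimal Python sketch using azure-mgmt-network follows; the resource group and virtual network names are assumptions based on this example:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

subscription_id = "<subscription-id>"   # assumption: the single subscription in this example
credential = DefaultAzureCredential()
network_client = NetworkManagementClient(credential, subscription_id)

vnet_id = (
    "/subscriptions/{sub}/resourceGroups/{rg}"
    "/providers/Microsoft.Network/virtualNetworks/{vnet}"
)
hub_vnet_id = vnet_id.format(sub=subscription_id, rg="shared-infra-rg", vnet="hub-vnet")
prod_vnet_id = vnet_id.format(sub=subscription_id, rg="prod-rg", vnet="prod-vnet")

# Step 7: the workload owner peers prod-vnet to hub-vnet. No traffic flows yet.
network_client.virtual_network_peerings.begin_create_or_update(
    "prod-rg", "prod-vnet", "prod-to-hub",
    {"remote_virtual_network": {"id": hub_vnet_id}, "allow_virtual_network_access": True},
).result()

# Step 8: the network operations user completes the peering from hub-vnet back to prod-vnet.
network_client.virtual_network_peerings.begin_create_or_update(
    "shared-infra-rg", "hub-vnet", "hub-to-prod",
    {"remote_virtual_network": {"id": prod_vnet_id}, "allow_virtual_network_access": True},
).result()
```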
This example resource management model enables us to manage resources in the three required environments.
The shared infrastructure resources are protected because only a single user in the subscription has permission to
access those resources. Each of the workload owners can use the shared infrastructure resources without having
any permissions on the shared resources themselves. This management model fails the requirement for workload
isolation, because both workload owners can access the resources of each other's workload.
There's another important consideration with this model that may not be immediately obvious. In the example, it
was app1 workload owner that requested the network peering connection with the hub-vnet to provide
connectivity to the on-premises network. The network operations user evaluated that request based on the
resources deployed with that workload. When the subscription owner account added app2 workload owner
with the contributor role, that user had management access rights to all resources in the prod-rg resource
group.
This means app2 workload owner had permission to deploy their own subnet with virtual machines in the
prod-vnet virtual network. By default, those virtual machines have access to the on-premises network. The
network operations user is not aware of those machines and did not approve their connectivity to on-premises.
Next, let's look at a single subscription with multiple resource groups for different environments and workloads.
Note that in the previous example, the resources for each environment were easily identifiable because they were
in the same resource group. Now that you no longer have that grouping, you will have to rely on a resource group
naming convention to provide that functionality.
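Applying the naming convention consistently is easier when resource group creation is scripted. The following is a minimal Python sketch using azure-mgmt-resource; the region, workload names, and tag values are illustrative assumptions:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

subscription_id = "<subscription-id>"   # assumption: your subscription ID
credential = DefaultAzureCredential()
resource_client = ResourceManagementClient(credential, subscription_id)

# The resource group name encodes both the workload and the environment.
for workload in ("app1", "app2"):
    for environment in ("prod", "dev"):
        resource_client.resource_groups.create_or_update(
            f"{workload}-{environment}-rg",
            {
                "location": "eastus",  # assumption: your deployment region
                "tags": {"environment": environment, "managedBy": workload},
            },
        )
```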
1. The shared infrastructure resources will still have a separate resource group in this model, so that remains
the same. Each workload requires two resource groups, one for each of the development and production
environments. For the first workload, the subscription owner account creates two resource groups. The first
is named app1-prod-rg and the second is named app1-dev-rg . As discussed earlier, this naming convention
identifies the resources as being associated with the first workload, app1 , and either the development or
production environment. Again, the subscription owner account adds app1 workload owner to the
resource group with the contributor role.
2. Similar to the first example, app1 workload owner deploys a virtual network named app1-prod-vnet to the
production environment, and another named app1-dev-vnet to the development environment. Again, app1
workload owner sends a request to the network operations user to create a peering connection. Note that
app1 workload owner adds the same tags as in the first example, and the limit counter has been
decremented to 997 virtual networks remaining in the subscription.
3. The subscription owner account now creates two resource groups for app2 workload owner . Following
the same conventions as for app1 workload owner , the resource groups are named app2-prod-rg and
app2-dev-rg . The subscription owner account adds app2 workload owner to each of the resource groups
with the contributor role.
4. The app2 workload owner account deploys virtual networks and virtual machines to the resource groups
with the same naming conventions. Tags are added and the limit counter has been decremented to 995 virtual
networks remaining in the subscription.
5. The app2 workload owner account sends a request to the network operations user to peer the
app2-prod-vnet with the hub-vnet . The network operations user creates the peering connection.
The resulting management model is similar to the first example, with several key differences:
Each of the two workloads is isolated by workload and by environment.
This model required two more virtual networks than the first example model. While this is not an important
distinction with only two workloads, the theoretical limit on the number of workloads for this model is 24.
Resources are no longer grouped in a single resource group for each environment. Grouping resources
requires an understanding of the naming conventions used for each environment.
Each of the peered virtual network connections was reviewed and approved by the network operations
user .
Now let's look at a resource management model using multiple subscriptions. In this model, you'll align each of
the three environments to a separate subscription: a shared services subscription, a production subscription, and
a development subscription. The considerations for this model are similar to a model using a single
subscription in that you have to decide how to align resource groups to workloads. You've already determined
that creating a resource group for each workload satisfies the workload isolation requirement, so you'll stick with
that model in this example.
1. In this model, there are three subscriptions: shared infrastructure , production , and development . Each
of these three subscriptions requires a subscription owner, and in the simple example you'll use the same
user account for all three. The shared infrastructure resources are managed similarly to the first two
examples above, and the first workload is associated with the app1-rg resource group in the production
environment and the same-named resource group in the development environment. The app1
workload owner account is added to each of the resource groups with the contributor role.
2. As with the earlier examples, app1 workload owner creates the resources and requests the peering
connection with the shared infrastructure virtual network. The app1 workload owner account adds
only the managedBy tag because there is no longer a need for the environment tag. That is, resources for
each environment are now grouped in the same subscription, so the environment tag is redundant. The
limit counter is decremented to 999 virtual networks remaining.
3. Finally, the subscription owner account repeats the process for the second workload, adding the resource
groups with app2 workload owner in the contributor role. The limit counter for each of the
environment subscriptions is decremented to 998 virtual networks remaining.
This management model has the benefits of the second example above. The key difference is that limits are less of
an issue due to the fact that they're spread over two subscriptions. The drawback is that the cost data tracked by
tags must be aggregated across all three subscriptions.
Therefore, you can select either of these two example resource management models, depending on the priority
of your requirements. If you anticipate that your organization will not reach the service limits for a single
subscription, you can use a single subscription with multiple resource groups. Conversely, if your organization
anticipates many workloads, multiple subscriptions for each environment may be better.
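Because this decision hinges on how quickly you expect to approach service limits, it can help to track the virtual network count in each subscription. The following is a hedged Python sketch using azure-mgmt-network; the subscription IDs and region are placeholders:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

credential = DefaultAzureCredential()

# Placeholders for the shared services, production, and development subscriptions.
subscription_ids = ("<shared-sub-id>", "<prod-sub-id>", "<dev-sub-id>")

for subscription_id in subscription_ids:
    network_client = NetworkManagementClient(credential, subscription_id)
    # Network usage counters are reported per region.
    for usage in network_client.usages.list("eastus"):  # assumption: your deployment region
        if usage.name.value == "VirtualNetworks":
            print(f"{subscription_id}: {usage.current_value} of {usage.limit} virtual networks used")
```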
Related resources
Built-in roles for Azure resources
Deployment Acceleration discipline overview
10/30/2020 • 2 minutes to read • Edit Online
Deployment acceleration is one of the Five Disciplines of Cloud Governance within the Cloud Adoption
Framework governance model. This discipline focuses on ways of establishing policies to govern asset
configuration or deployment. Within the Five Disciplines of Cloud Governance, the Deployment Acceleration
discipline includes deployment, configuration alignment, and script reusability. This could be through manual
activities or fully automated DevOps activities. In either case, the policies would remain largely the same. As this
discipline matures, the cloud governance team can serve as a partner in DevOps and deployment strategies by
accelerating deployments and removing barriers to cloud adoption, through the application of reusable assets.
This article outlines the deployment acceleration process that a company experiences during the planning,
building, adopting, and operating phases of implementing a cloud solution. It's impossible for any one document
to account for all of the requirements of any business. As such, each section of this article outlines suggested
minimum and potential activities. The objective of these activities is to help you build a policy MVP, but establish
a framework for incremental policy improvement. The cloud governance team should decide how much to invest
in these activities to improve the deployment acceleration position.
NOTE
The Deployment Acceleration discipline does not replace the existing IT teams, processes, and procedures that allow your
organization to effectively deploy and configure cloud-based resources. The primary purpose of this discipline is to identify
potential business risks and provide risk-mitigation guidance to the IT staff that are responsible for managing your
resources in the cloud. As you develop governance policies and processes make sure to involve relevant IT teams in your
planning and review processes.
The primary audience for this guidance is your organization's cloud architects and other members of your cloud
governance team. The decisions, policies, and processes that emerge from this discipline should involve
engagement and discussions with relevant members of your business and IT teams, especially those leaders
responsible for deploying and configuring cloud-based workloads.
Policy statements
Actionable policy statements and the resulting architecture requirements serve as the foundation of a
Deployment Acceleration discipline. Use sample policy statements as a starting point for defining your
Deployment Acceleration policies.
Caution
The sample policies come from common customer experiences. To better align these policies to specific cloud
governance needs, execute the following steps to create policy statements that meet your unique business needs.
Next steps
Get started by evaluating business risks in a specific environment.
Understand business risks
Deployment acceleration template
10/30/2020 • 2 minutes to read • Edit Online
The first step to implementing change is communicating the desired change. The same is true when changing
governance practices. The template below serves as a starting point for documenting and communicating policy
statements that govern configuration and deployment issues in the cloud. The template also outlines the business
criteria that may have led you to create the documented policy statements.
As your discussions progress, use this template's structure as a model for capturing the business risks, risk
tolerances, compliance processes, and tooling needed to define your organization's Deployment Acceleration
policy statements.
IMPORTANT
This template is a limited sample. Before updating this template to reflect your requirements, you should review the
subsequent steps for defining an effective Deployment Acceleration discipline within your cloud governance strategy.
Next steps
Solid governance practices start with an understanding of business risk. Review the article on business risks and
begin to document the business risks that align with your current cloud adoption plan.
Understand business risks
Motivations and business risks in the Deployment
Acceleration discipline
10/30/2020 • 2 minutes to read • Edit Online
This article discusses the reasons that customers typically adopt a Deployment Acceleration discipline within a
cloud governance strategy. It also provides a few examples of business risks that drive policy statements.
Relevance
On-premises systems are often deployed using baseline images or installation scripts. Additional configuration is
usually necessary, which may involve multiple steps or human intervention. These manual processes are error-
prone and often result in "configuration drift", requiring time-consuming troubleshooting and remediation tasks.
Most Azure resources can be deployed and configured manually via the Azure portal. This approach may be
sufficient for your needs when you only have a few resources to manage. As your cloud estate grows, your
organization should begin to integrate automation into your deployment processes to ensure your cloud
resources avoid configuration drift or other problems introduced by manual processes. Adopting a DevOps or
DevSecOps approach is often the best way to manage your deployments as your cloud adoption efforts mature.
A robust deployment acceleration plan ensures that your cloud resources are deployed, updated, and configured
correctly and consistently, and remain that way. The maturity of your Deployment Acceleration strategy can also
be a significant factor in your Cost Management strategy. Automated provisioning and configuration of your
cloud resources allows you to scale down or deallocate resources when demand is low or time-bound, so you only
pay for resources as you need them.
Business risk
The Deployment Acceleration discipline attempts to address the following business risks. During cloud adoption,
monitor each of the following for relevance:
Service disruption: Lack of predictable, repeatable deployment processes or unmanaged changes to system
configurations can disrupt normal operations and can result in lost productivity or lost business.
Cost overruns: Unexpected changes in configuration of system resources can make identifying root cause of
issues more difficult, raising the costs of development, operations, and maintenance.
Organizational inefficiencies: Barriers between development, operations, and security teams can cause
numerous challenges to effective adoption of cloud technologies and the development of a unified cloud
governance model.
Next steps
Use the Deployment Acceleration discipline template to document business risks that are likely to be introduced
by the current cloud adoption plan.
Once an understanding of realistic business risks is established, the next step is to document the business's
tolerance for risk and the indicators and key metrics to monitor that tolerance.
Metrics, indicators, and risk tolerance
Risk tolerance metrics and indicators in the
Deployment Acceleration discipline
5/21/2020 • 2 minutes to read • Edit Online
Learn to quantify business risk tolerance associated with the Deployment Acceleration discipline. Defining metrics
and indicators helps to create a business case for investing in the maturity of this discipline.
Metrics
Deployment acceleration focuses on risks related to how cloud resources are configured, deployed, updated, and
maintained. The following information is useful when adopting the Deployment Acceleration discipline:
Deployment failures: Percentage of deployments that fail or result in misconfigured resources.
Time to deployment: The amount of time needed to deploy updates to an existing system.
Assets out-of-compliance: The number or percentage of resources that are out of compliance with defined
policies. (A query sketch for this metric appears after this list.)
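As an illustration of the last metric, the following Python sketch counts non-compliant policy states by using Azure Resource Graph. It assumes the azure-mgmt-resourcegraph package and a placeholder subscription ID; the query follows the Resource Graph policy-states schema and may need adjusting for your environment.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.resourcegraph import ResourceGraphClient
from azure.mgmt.resourcegraph.models import QueryRequest

credential = DefaultAzureCredential()
graph_client = ResourceGraphClient(credential)

# Count resources that are currently non-compliant, grouped by policy assignment.
query = QueryRequest(
    subscriptions=["<subscription-id>"],  # assumption: your subscription ID
    query=(
        "policyresources "
        "| where type == 'microsoft.policyinsights/policystates' "
        "| where properties.complianceState == 'NonCompliant' "
        "| summarize nonCompliantCount = count() by tostring(properties.policyAssignmentName)"
    ),
)

result = graph_client.resources(query)
for row in result.data:
    print(row)
```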
Next steps
Use the Deployment Acceleration discipline template to document metrics and tolerance indicators that align to
the current cloud adoption plan.
Review sample Deployment Acceleration policies as a starting point to develop your own policies to address
specific business risks aligned with your cloud adoption plans.
Review sample policies
Deployment Acceleration sample policy statements
10/30/2020 • 3 minutes to read • Edit Online
Individual cloud policy statements are guidelines for addressing specific risks identified during your risk
assessment process. These statements should provide a concise summary of risks and plans to deal with them.
Each statement definition should include these pieces of information:
Technical risk : A summary of the risk this policy will address.
Policy statement: A clear summary explanation of the policy requirements.
Design options: Actionable recommendations, specifications, or other guidance that IT teams and developers
can use when implementing the policy.
The following sample policy statements address common configuration-related business risks. These statements
are examples you can reference when drafting policy statements to address your organization's needs. These
examples are not meant to be prescriptive, and there are potentially several policy options for dealing with each
identified risk. Work closely with business and IT teams to identify the best policies for your unique set of risks.
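Many of these policy statements end up being enforced as Azure Policy assignments. As a hedged example, the following Python sketch assigns a policy definition at resource group scope with the PolicyClient from azure-mgmt-resource; the definition ID, parameter names, and scope are placeholders that depend on the policy you choose.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import PolicyClient

subscription_id = "<subscription-id>"   # assumption: your subscription ID
credential = DefaultAzureCredential()
policy_client = PolicyClient(credential, subscription_id)

scope = f"/subscriptions/{subscription_id}/resourceGroups/app1-prod-rg"  # assumption: target scope

policy_client.policy_assignments.create(
    scope,
    "enforce-allowed-locations",
    {
        "display_name": "Restrict deployments to approved regions",
        # Placeholder: the ID of the built-in or custom policy definition you selected.
        "policy_definition_id": "/providers/Microsoft.Authorization/policyDefinitions/<definition-id>",
        # Parameter names depend on the chosen definition.
        "parameters": {"listOfAllowedLocations": {"value": ["eastus", "eastus2"]}},
    },
)
```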
Next steps
Use the samples mentioned in this article as a starting point to develop policies that address specific business risks
that align with your cloud adoption plans.
To begin developing your own custom Deployment Acceleration policy statements, download the Deployment
Acceleration discipline template.
To accelerate adoption of this discipline, choose the actionable governance guide that most closely aligns with your
environment. Then modify the design to incorporate your specific corporate policy decisions.
Building on risks and tolerance, establish a process for governing and communicating Deployment Acceleration
policy adherence.
Establish policy compliance processes
Deployment Acceleration policy compliance
processes
10/30/2020 • 4 minutes to read • Edit Online
This article discusses an approach to policy-adherence processes that govern the Deployment Acceleration
discipline. Effective governance of cloud configuration starts with recurring manual processes designed to detect
issues and impose policies to remediate those risks. You can automate these processes and supplement them with
tooling to reduce the overhead of governance and allow for faster responses to deviations.
Next steps
Use the Deployment Acceleration discipline template to document the processes and triggers that align to the
current cloud adoption plan.
For guidance on executing cloud management policies in alignment with adoption plans, see Deployment
Acceleration discipline improvement.
Deployment Acceleration discipline improvement
Deployment Acceleration discipline improvement
10/30/2020 • 4 minutes to read • Edit Online
The Deployment Acceleration discipline focuses on establishing policies that ensure that resources are deployed
and configured consistently and repeatably, and remain in compliance throughout their lifecycle. Within the Five
Disciplines of Cloud Governance, the Deployment Acceleration discipline includes decisions regarding automating
deployments, source-controlling deployment artifacts, monitoring deployed resources to maintain desired state,
and auditing any compliance issues.
This article outlines some potential tasks your company can engage in to better develop and mature the
Deployment Acceleration discipline. These tasks can be broken down into planning, building, adopting, and
operating phases of implementing a cloud solution, which are then iterated on allowing the development of an
incremental approach to cloud governance.
Neither the minimum nor the potential activities outlined in this article are aligned to specific corporate policies
or third-party compliance requirements. This guidance is designed to help facilitate the conversations that will
lead to alignment of both sets of requirements with a cloud governance model.
Next steps
Now that you understand the concept of deployment acceleration governance, examine the Deployment
Acceleration toolchain to identify Azure tools and features that you'll need when developing your Deployment
Acceleration discipline on the Azure platform.
Deployment Acceleration toolchain for Azure
Deployment Acceleration tools in Azure
10/30/2020 • 2 minutes to read • Edit Online
The Deployment Acceleration discipline is one of the Five Disciplines of Cloud Governance. This discipline focuses
on ways of establishing policies to govern asset configuration or deployment. Within the Five Disciplines of
Cloud Governance, the Deployment Acceleration discipline involves deployment and configuration alignment.
This could be through manual activities or fully automated DevOps activities. In either case, the policies involved
would remain largely the same.
Cloud custodians, cloud guardians, and cloud architects with an interest in governance are each likely to invest a
lot of time in the Deployment Acceleration discipline, which codifies policies and requirements across multiple
cloud adoption efforts. The tools in this toolchain are important to the cloud governance team and should be a
high priority on the learning path for the team.
The following is a list of Azure tools that can help mature the policies and processes that support this discipline.
Implement corporate policies | Yes | No | No | No | No | No
Deploy defined resources | No | No | Yes | No | No | No
Audit policies | Yes | No | No | No | No | No
Query Azure resources | No | No | No | No | Yes | No
Report on cost of resources | No | No | No | No | No | Yes
The following are additional tools that may be required to accomplish specific deployment acceleration
objectives. Often these tools are used outside of the governance team, but are still considered an aspect of the
Deployment Acceleration discipline.
AZURE PORTAL | AZURE RESOURCE MANAGER | AZURE POLICY | AZURE DEVOPS | AZURE BACKUP | AZURE SITE RECOVERY
Create an automated pipeline to deploy code and configure assets (DevOps) | No | No | No | Yes | No | No
Aside from the Azure native tools mentioned above, it is common for customers to use third-party tools to
facilitate deployment acceleration and DevOps deployments.
Cloud management in the Cloud Adoption
Framework
10/30/2020 • 2 minutes to read • Edit Online
Delivering on a cloud strategy requires solid planning, readiness, and adoption. But it's the ongoing operation
of the digital assets that delivers tangible business outcomes. Without a plan for reliable, well-managed
operations of the cloud solutions, those efforts will yield little value. The following exercises help develop the
business and technical approaches needed to provide cloud management that powers ongoing operations.
Get started
To prepare you for this phase of the cloud adoption lifecycle, the framework suggests the following exercises:
The preceding steps create actionable approaches to deliver on the Cloud Adoption Framework's Manage
methodology.
As discussed in the business alignment article, not all workloads are mission critical. Within any portfolio are
various degrees of operational management needs. Business alignment efforts aid in capturing the business
impact and negotiating management costs with the business, to ensure the most appropriate operational
management processes and tools.
The guidance in the manage section of the Cloud Adoption Framework serves two purposes:
Provides examples of actionable operations management approaches that represent common experiences
often encountered by customers.
Helps you create personalized management solutions based on business commitments.
This content is intended for use by the cloud operations team. It's also relevant to cloud architects who need to
develop a strong foundation in cloud operations or cloud design principles.
The content in the Cloud Adoption Framework affects the business, technology, and culture of enterprises. This
section of the Cloud Adoption Framework interacts heavily with IT operations, IT governance, finance, line-of-
business leaders, networking, identity, and cloud adoption teams. Various dependencies on these personnel
require a facilitative approach by the cloud architects who are using this guidance. Facilitation with these teams
is seldom a one-time effort.
The cloud architect serves as the thought leader and facilitator to bring these audiences together. The content in
this collection of guides is designed to help the cloud architect facilitate the right conversation, with the right
audience, to drive necessary decisions. Business transformation that's empowered by the cloud depends on the
cloud architect to help guide decisions throughout the business and IT.
Each section of the Cloud Adoption Framework represents a different specialization or variant of the cloud
architect role. This section of the Cloud Adoption Framework is designed for cloud architects with a passion for
operations and management of deployment solutions. Within this framework, these specialists are referred to
frequently as cloud operations, or collectively as the cloud operations team.
If you want to follow this guide from beginning to end, this content aids in developing a robust cloud
operations strategy. The guidance walks you through the theory and implementation of such a strategy.
You can also apply the methodology to Establish clear business commitments.
Azure management guide: Before you start
5/12/2020 • 2 minutes to read • Edit Online
Management baseline
A management baseline is the minimum set of tools and processes that should be applied to every asset in an
environment. Several additional options can be included in the management baseline. The next few articles
accelerate cloud management capabilities by focusing on the minimum options necessary instead of on all of the
available options.
The next step is Inventory and visibility.
This guide provides interactive steps that let you try features as they're introduced. To come back to where you left
off, use the breadcrumb for navigation.
Inventory and visibility in Azure
10/30/2020 • 3 minutes to read • Edit Online
Inventory and visibility is the first of three disciplines in a cloud management baseline.
This discipline comes first because collecting proper operational data is vital when you make decisions about
operations. Cloud management teams must understand what is managed and how well those assets are operated.
This article describes the different tools that provide both an inventory and visibility into the inventory's run state.
For any enterprise-grade environment, the following table outlines the suggested minimum for a management
baseline.
PROCESS | TOOL | PURPOSE
Monitor health of Azure services | Azure Service Health | Health, performance, and diagnostics for services running in Azure
Log centralization | Log Analytics | Central logging for all visibility purposes
Virtual machine inventory and change tracking | Azure Change Tracking and Inventory | Inventory VMs and monitor changes for guest OS level
Guest OS monitoring | Azure Monitor for VMs | Monitoring changes and performance of VMs
Log Analytics
A Log Analytics workspace is a unique environment for storing Azure Monitor log data. Each workspace has its
own data repository and configuration. Data sources and solutions are configured to store their data in particular
workspaces. Azure monitoring solutions require all servers to be connected to a workspace, so that their log data
can be stored and accessed.
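Creating the workspace itself can be scripted. The following is a minimal Python sketch using azure-mgmt-loganalytics; the resource group, workspace name, region, and retention setting are assumptions:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.loganalytics import LogAnalyticsManagementClient

subscription_id = "<subscription-id>"   # assumption: your subscription ID
credential = DefaultAzureCredential()
la_client = LogAnalyticsManagementClient(credential, subscription_id)

workspace = la_client.workspaces.begin_create_or_update(
    "management-rg",              # assumption: a resource group for shared management tooling
    "caf-management-workspace",   # assumption: workspace name
    {
        "location": "eastus",
        "sku": {"name": "PerGB2018"},
        "retention_in_days": 30,
    },
).result()

print(workspace.customer_id)  # the workspace ID that agents use when connecting
```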
Learn more
To learn more, see the Log Analytics workspace creation documentation.
Azure Monitor
Azure Monitor provides a single unified hub for all monitoring and diagnostics data in Azure and gives you
visibility across your resources. With Azure Monitor, you can find and fix problems and optimize performance. You
can also understand customer behavior.
Monitor and visualize metrics. Metrics are numerical values available from Azure resources. They help
you understand the health of your systems. Customize charts for your dashboards, and use workbooks for
reporting. (A metrics query sketch appears after this list.)
Query and analyze logs. Logs include activity logs and diagnostic logs from Azure. Collect additional logs
from other monitoring and management solutions for your cloud or on-premises resources. Log Analytics
provides a central repository to aggregate all of this data. From there, you can run queries to help
troubleshoot issues or to visualize data.
Set up alerts and actions. Alerts notify you of critical conditions. Corrective actions can be taken based
on triggers from metrics, logs, or service-health issues. You can set up different notifications and actions and
can also send data to your IT service management tools.
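As a small illustration of the metrics capability described above, the following Python sketch queries 24 hours of CPU metrics for a virtual machine by using the azure-monitor-query library; the resource ID is a placeholder.

```python
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricsQueryClient

credential = DefaultAzureCredential()
metrics_client = MetricsQueryClient(credential)

vm_resource_id = (
    "/subscriptions/<subscription-id>/resourceGroups/app1-prod-rg"
    "/providers/Microsoft.Compute/virtualMachines/app1-vm1"
)  # assumption: an existing virtual machine

response = metrics_client.query_resource(
    vm_resource_id,
    metric_names=["Percentage CPU"],
    timespan=timedelta(hours=24),
    granularity=timedelta(hours=1),
    aggregations=["Average"],
)

# Walk the time series and print hourly average CPU values.
for metric in response.metrics:
    for series in metric.timeseries:
        for point in series.data:
            print(point.timestamp, point.average)
```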
Onboard solutions
To enable solutions, you need to configure the Log Analytics workspace. Onboarded Azure VMs and on-premises
servers get the solutions from the Log Analytics workspaces they're connected to.
There are two approaches to onboarding:
Single VM
Entire subscription
Each article guides you through a series of steps to onboard these solutions:
Update Management
Change Tracking and Inventory
Azure Activity Log
Azure Log Analytics Agent Health
Antimalware Assessment
Azure Monitor for VMs
Azure Security Center
Each of the previous steps helps establish inventory and visibility.
Operational compliance in Azure
10/30/2020 • 3 minutes to read • Edit Online
Improving operational compliance reduces the likelihood of an outage related to configuration drift or
vulnerabilities related to systems being improperly patched.
For any enterprise-grade environment, this table outlines the suggested minimum for a management baseline.
PROCESS | TOOL | PURPOSE
Update Management
Computers that are managed by Update Management use the following configurations to do assessment and
update deployments:
Microsoft Monitoring Agent (MMA) for Windows or Linux.
PowerShell Desired State Configuration (DSC) for Linux.
Azure Automation Hybrid Runbook Worker.
Microsoft Update or Windows Server Update Services (WSUS) for Windows computers.
For more information, see Update Management solution.
WARNING
Before using Update Management, you must onboard virtual machines or an entire subscription into Log Analytics and
Azure Automation.
There are two approaches to onboarding:
Single VM
Entire subscription
You should follow one of these approaches before proceeding with Update Management.
Manage updates
To manage updates and configuration for your VMs:
1. Go to Azure Automation.
2. Select Automation accounts , and choose one of the listed accounts.
3. Go to Configuration Management .
4. Inventory, Change Management, and State Configuration can be used to control the state and operational
compliance of the managed VMs.
Azure Policy
Azure Policy is used throughout governance processes. It's also highly valuable within cloud management
processes. Azure Policy can audit and remediate Azure resources and can also audit settings inside a machine. The
validation is performed by the Guest Configuration extension and client. The extension, through the client, validates
settings like:
Operating system configuration.
Application configuration or presence.
Environment settings.
Azure Policy Guest Configuration currently only audits settings inside the machine. It doesn't apply configurations.
Action
Assign a built-in policy to a management group, subscription, or resource group.
Apply a policy
To apply a policy to a resource group:
1. Go to Azure Policy.
2. Select Assign a policy .
Learn more
To learn more, see:
Azure Policy
Azure Policy: Guest configuration
Cloud Adoption Framework: Policy enforcement decision guide
Azure Blueprints
With Azure Blueprints, cloud architects and central information-technology groups can define a repeatable set of
Azure resources. These resources implement and adhere to an organization's standards, patterns, and
requirements.
With Azure Blueprints, development teams can rapidly build and stand up new environments. Teams can also trust
they're building within organizational compliance. They do so by using a set of built-in components like networking
to speed up development and delivery.
Blueprints are a declarative way to orchestrate the deployment of different resource templates and other artifacts
like:
Role assignments.
Policy assignments.
Azure Resource Manager templates.
Resource groups.
Applying a blueprint can enforce operational compliance in an environment if this enforcement isn't done by the
cloud governance team.
Create a blueprint
To create a blueprint:
1. Go to Blueprints: Getting started.
2. On the Create a Blueprint pane, select Create .
3. Filter the list of blueprints to select the appropriate blueprint.
4. In the Blueprint name box, enter the blueprint name.
5. Select Definition location , and choose the appropriate location.
6. Select Next: Artifacts >>, and review the artifacts included in the blueprint.
7. Select Save draft .
Protect and recover is the third and final discipline in any cloud-management baseline.
In Operational compliance in Azure, the objective is to reduce the likelihood of a business interruption. The current
article aims to reduce the duration and impact of outages that can't be prevented.
For any enterprise-grade environment, this table outlines the suggested minimum for any management baseline:
PROCESS | TOOL | PURPOSE
Protect the environment | Azure Security Center | Strengthen security and provide advanced threat protection across your hybrid workloads.
Azure Backup
With Azure Backup, you can back up, protect, and recover your data in the Microsoft cloud. Azure Backup replaces
your existing on-premises or offsite backup solution with a cloud-based solution. This new solution is reliable,
secure, and cost competitive. Azure Backup can also help protect and recover on-premises assets through one
consistent solution.
For data present in Azure, Azure Backup offers varied levels of protection. For example, for key cloud
infrastructure such as Azure Virtual Machines and Azure Files, it offers Azure virtual machine backup and
Azure Files backup. For more critical components, such as databases running in Azure virtual machines, it offers
dedicated database backup solutions for SQL Server and SAP HANA with a far lower recovery point objective (RPO).
To see how easy it is to enable backup with Azure Backup, follow the steps in the next section to enable backup
for an Azure virtual machine.
Enable backup for an Azure VM
1. In the Azure portal, select Virtual machines, then select the VM you want to back up.
2. On the Operations pane, select Backup .
3. Create or select an existing Azure Recovery Services vault.
4. Select Create (or edit) a new policy .
5. Configure the schedule and retention period.
6. Select OK .
7. Select Enable backup .
For more details about Azure Backup and its varied offerings, refer to the Azure Backup overview.
TIP
Depending on your scenario, the exact steps might differ slightly.
Verify settings
After the replication job has finished, you can check the replication status, verify replication health, and test the
deployment.
1. In the VM menu, select Disaster recovery.
2. Verify replication health, the recovery points that have been created, and source and target regions on the map.
Learn more
Azure Site Recovery overview
Replicate an Azure VM to another region
Enhanced management baseline in Azure
10/30/2020 • 4 minutes to read • Edit Online
The first three cloud management disciplines describe a management baseline. The preceding articles in this guide
outline a minimum viable product (MVP) for cloud management services, which is referred to as a management
baseline. This article outlines a few common improvements to the baseline.
The purpose of a management baseline is to create a consistent offering that provides a minimum level of
business commitment for all supported workloads. With this baseline of common, repeatable management
offerings, the team can deliver highly optimized operational management with minimal deviation.
However, you might need a greater commitment to the business beyond the standard offering. The following
image and list show three ways to go beyond the management baseline.
DISCIPLINE | PROCESS | TOOL | POTENTIAL IMPACT | LEARN MORE
Inventory and visibility | Service change tracking | Azure Resource Graph | Greater visibility into changes to Azure services might help detect negative effects sooner or remediate faster. | Overview of Azure Resource Graph
Protect and recover | Breach notification | Azure Security Center | Extend protection to include security-breach recovery triggers. | See the following sections
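As a hedged illustration of the service change tracking row above, the following Python sketch queries recent resource changes through Azure Resource Graph; the resourcechanges table and property names follow the Resource Graph change-analysis schema and may need adjusting for your environment.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.resourcegraph import ResourceGraphClient
from azure.mgmt.resourcegraph.models import QueryRequest

credential = DefaultAzureCredential()
graph_client = ResourceGraphClient(credential)

# List changes recorded in the last day, newest first.
query = QueryRequest(
    subscriptions=["<subscription-id>"],  # assumption: your subscription ID
    query=(
        "resourcechanges "
        "| extend changeTime = todatetime(properties.changeAttributes.timestamp) "
        "| where changeTime > ago(1d) "
        "| project changeTime, resourceId = tostring(properties.targetResourceId), "
        "  changeType = tostring(properties.changeType) "
        "| order by changeTime desc"
    ),
)

for row in graph_client.resources(query).data:
    print(row)
```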
Azure Automation
Azure Automation provides a centralized system for the management of automated controls. In Azure Automation,
you can run simple remediation, scale, and optimization processes in response to environmental metrics. These
processes reduce the overhead associated with manual incident processing.
Most importantly, automated remediation can be delivered in near-real-time, significantly reducing interruptions
to business processes. A study of your most common business interruptions can identify activities within your
environment that could be automated.
Runbooks
The basic unit of code for delivering automated remediation is a runbook. Runbooks contain the instructions for
remediating or recovering from an incident.
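Runbooks can be written in PowerShell or Python. As a hedged illustration of the kind of remediation logic a runbook might contain, this standalone Python sketch restarts a virtual machine with azure-mgmt-compute; inside Azure Automation you would typically authenticate with the account's managed identity, and the resource names are placeholders.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

subscription_id = "<subscription-id>"   # assumption: your subscription ID
credential = DefaultAzureCredential()   # in Azure Automation, a managed identity credential is typical
compute_client = ComputeManagementClient(credential, subscription_id)

# Example remediation: restart a VM that an alert has flagged as unresponsive.
poller = compute_client.virtual_machines.begin_restart("app1-prod-rg", "app1-vm1")
poller.result()
print("Restart completed for app1-vm1")
```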
To create or manage runbooks:
1. Go to Azure Automation .
2. Select Automation accounts and choose one of the listed accounts.
3. Go to Process automation .
4. With the options presented, you can create or manage runbooks, schedules, and other automated remediation
functionality.
Much like the enhanced management baseline, platform specialization is an extension beyond the standard
management baseline. See the following image and list that show the ways to expand the management baseline.
This article addresses the platform specialization options.
Workload operations: The largest per-workload operations investment and the highest degree of resiliency.
We suggest workload operations for the approximately 20% of workloads that drive business value. This
specialization is usually reserved for high criticality or mission-critical workloads.
Platform operations: Operations investment is spread across many workloads. Resiliency improvements
affect all workloads that use the defined platform. We suggest platform operations for the approximately 20%
of platforms that have the highest criticality. This specialization is usually reserved for medium to high criticality
workloads.
Enhanced management baseline: The relatively lowest operations investment. This specialization slightly
improves business commitments by using additional cloud-native operations tools and processes.
Both workload and platform operations require changes to design and architecture principles. Those changes can
take time and might result in increased operating expenses. To reduce the number of workloads requiring such
investments, an enhanced management baseline might provide enough of an improvement to the business
commitment.
This table outlines a few common processes, tools, and potential effects common in customers' enhanced
management baselines:
PROCESS | TOOL | PURPOSE | SUGGESTED MANAGEMENT LEVEL
Improve system design | Microsoft Azure Well-Architected Framework | Improving the architectural design of the platform to improve operations | N/A
Container performance | Azure Monitor for containers | Monitoring and diagnostics of containers | Platform operations
Platform as a service (PaaS) data performance | Azure SQL Analytics | Monitoring and diagnostics for PaaS databases | Platform operations
Infrastructure as a service (IaaS) data performance | SQL Server Health Check | Monitoring and diagnostics for IaaS databases | Platform operations
High-level process
Platform specialization consists of a disciplined execution of the following four processes in an iterative approach.
Each process is explained in more detail in later sections of this article.
Improve system design: Improve the design of common systems or platforms to effectively minimize
interruptions.
Automate remediation: Some improvements aren't cost effective. In such cases, it might make more sense to
automate remediation and reduce the effect of interruptions.
Scale the solution: As systems design and automated remediation are improved, those changes can be
scaled across the environment through the service catalog.
Continuous improvement: Different monitoring tools can be used to discover incremental improvements.
These improvements can be addressed in the next pass of system design, automation, and scale.
Automated remediation
Some technical debt can't be addressed. Resolution might be too expensive, or it might be planned but have a
long project duration. The business interruption might not have a significant business effect. Or the
business priority might be to recover quickly instead of investing in resiliency.
When resolution of technical debt isn't the desired approach, automated remediation is commonly the next step.
Using Azure Automation and Azure Monitor to detect trends and provide automated remediation is the most
common approach to automated remediation.
For guidance on automated remediation, see Azure Automation and alerts.
Continuous improvement
Platform specialization and platform operations both depend on strong feedback loops among adoption, platform,
automation, and management teams. Grounding those feedback loops in data helps each team make wise
decisions. For platform operations to achieve long-term business commitments, it's important to use insights
specific to the centralized platform.
Containers and SQL Server are the two most common centrally managed platforms. These articles can help you
get started with continuous-improvement data collection on those platforms:
Container performance
PaaS database performance
IaaS database performance
Workload specialization for cloud management
10/30/2020 • 2 minutes to read • Edit Online
Workload operations: The largest per-workload operations investment and highest degree of resiliency. We
suggest workload operations for the approximately 20% of workloads that drive business value. This
specialization is usually reserved for high criticality or mission-critical workloads.
Platform operations: Operations investment is spread across many workloads. Resiliency improvements
affect all workloads that use the defined platform. We suggest platform operations for the approximately 20%
of platforms that have the highest criticality. This specialization is usually reserved for medium to high criticality
workloads.
Enhanced management baseline: The relatively lowest operations investment. This specialization slightly
improves business commitments by using additional cloud-native operations tools and processes.
High-level process
Workload specialization consists of a disciplined execution of the following four processes in an iterative approach.
Each process is explained in more detail in Platform Specialization.
Improve system design: Improve the design of a specific workload to effectively minimize interruptions.
Automate remediation: Some improvements aren't cost effective. In such cases, it might make more sense to
automate remediation and reduce the effect of interruptions.
Scale the solution: As you improve systems design and automated remediation, you can scale those changes
across the environment through the service catalog.
Continuous improvement: You can use different monitoring tools to discover incremental improvements.
These improvements can be addressed in the next pass of system design, automation, and scale.
Cultural change
Workload specialization often triggers a cultural change in traditional IT build processes that focus on delivering a
management baseline, enhanced baselines, and platform operations. Those types of offerings can be scaled across
the environment. Workload specialization is similar in execution to platform specialization. But unlike common
platforms, the specialization required by individual workloads often doesn't scale.
When workload specialization is required, operational management commonly evolves beyond a centralized IT
perspective. The approach suggested in Cloud Adoption Framework is a distribution of cloud management
functionality.
In this model, operational tasks like monitoring, deployment, DevOps, and other innovation-focused functions shift
to an application-development or business-unit organization. The cloud platform team and the core cloud
monitoring team still deliver on the management baseline across the environment.
Those centralized teams also guide and instruct workload-specialized teams on operations of their workloads. But
the day-to-day operational responsibility falls on a cloud management team that is managed outside of IT. This
type of distributed control is one of the primary indicators of maturity in a cloud center of excellence.
Performance, availability, and usage | Application Insights | Advanced application monitoring with the application dashboard, composite maps, usage, and tracing
Cloud adoption is a catalyst for enabling business value. However, real business value is realized through ongoing,
stable operations of the technology assets deployed to the cloud. This section of the Cloud Adoption Framework
guides you through various transitions into operational management in the cloud.
Cloud operations
Both of these best practices build toward a future-state methodology for operations management, as illustrated in
the following diagram:
Business alignment: In the Manage methodology, all workloads are classified by criticality and business value.
That classification can then be measured through an impact analysis, which calculates the lost value associated
with performance degradation or business interruptions. Using that tangible revenue impact, cloud operations
teams can work with the business to establish a commitment that balances cost and performance.
Cloud operations disciplines: After the business is aligned, it's much easier to track and report on the proper
disciplines of cloud operations for each workload. Making decisions along each discipline can then be converted
to commitment terms that can be easily understood by the business. This collaborative approach makes the
business stakeholder a partner in finding the right balance between cost and performance.
Inventory and visibility: At a minimum, operations management requires a means of inventorying assets
and creating visibility into the run state of each asset.
Operational compliance: Regular management of configuration, sizing, cost, and performance of assets is
key to maintaining performance expectations.
Protect and recover : Minimizing operational interruptions and expediting recovery help the business avoid
performance losses and adverse revenue impacts. Detection and recovery are essential aspects of this
discipline.
Platform operations: All IT environments contain a set of commonly used platforms. Those platforms could
include data stores such as SQL Server or Azure HDInsight. Other common platforms could include container
solutions such as Azure Kubernetes Service (AKS). Regardless of the platform, platform operations maturity
focuses on customizing operations based on how the common platforms are deployed, configured, and used
by workloads.
Workload operations: At the highest level of operational maturity, cloud operations teams can tune
operations for critical workloads. For those workloads, available data can assist in automating the remediation,
sizing, or protection of workloads based on their utilization.
Additional guidance, such as the Design Review Framework (Code name: Cloud Design Principles), can help you
make detailed architectural decisions about each workload, within the previously described disciplines.
This section of the Cloud Adoption Framework will build on each of the preceding topics to help promote mature
cloud operations within your organization.
Overview of Azure server management services
10/30/2020 • 2 minutes to read • Edit Online
Azure server management services provide a consistent experience for managing servers at scale. These services
cover both Linux and Windows operating systems. They can be used in production, development, and test
environments. The server management services can support Azure IaaS virtual machines, physical servers, and
virtual machines that are hosted on-premises or in other hosting environments.
The Azure server management services suite includes the services in the following diagram:
This section of the Microsoft Cloud Adoption Framework provides an actionable and prescriptive plan for
deploying server management services in your environment. This plan helps orient you quickly to these services,
guiding you through an incremental set of management stages for all environment sizes.
For simplicity, we've categorized this guidance into three stages:
Next steps
Familiarize yourself with the tools, services, and planning involved with adopting the Azure server management
suite.
Prerequisite tools and planning
Phase 1: Prerequisite planning for Azure server
management services
10/30/2020 • 6 minutes to read • Edit Online
In this phase, you'll become familiar with the Azure server management suite of services, and plan how to deploy
the resources needed to implement these management solutions.
Planning considerations
When preparing the workspaces and accounts that you need for onboarding management services, consider the
following issues:
Azure geographies and regulatory compliance: Azure regions are organized into geographies. An Azure
geography ensures that data residency, sovereignty, compliance, and resiliency requirements are honored
within geographical boundaries. If your workloads are subject to data-sovereignty or other compliance
requirements, workspace and Automation accounts must be deployed to regions within the same Azure
geography as the workload resources they support.
Number of workspaces: As a guiding principle, create the minimum number of workspaces required per
Azure geography. We recommend at least one workspace for each Azure geography where your compute or
storage resources are located. This initial alignment helps avoid future regulatory issues when you migrate data
to different geographies.
Data retention and capping: You might also need to take data retention policies or data capping requirements
into consideration when creating workspaces or Automation accounts. For more information about these
principles, and for additional considerations when planning your workspaces, see Manage log data and
workspaces in Azure Monitor.
Region mapping: Linking a Log Analytics workspace and an Azure Automation account is supported only
between certain Azure regions. For example, if the Log Analytics workspace is hosted in the East US region,
the linked Automation account must be created in the East US 2 region to be used with management services.
If you have an Automation account that was created in another region, it can't link to a workspace in East US .
The choice of deployment region can significantly affect Azure geography requirements. Consult the region
mapping table to decide which region should host your workspaces and Automation accounts. A brief example follows this list.
Workspace multihoming: The Azure Log Analytics agent supports multihoming in some scenarios, but the
agent faces several limitations and challenges when running in this configuration. Unless Microsoft has
recommended it for your specific scenario, don't configure multihoming on the Log Analytics agent.
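For example, a minimal PowerShell sketch of the region mapping guidance above, using hypothetical resource names and assuming the Az.OperationalInsights and Az.Automation modules, might look like this:

# Create a resource group to hold the management resources.
New-AzResourceGroup -Name "mgmt-rg" -Location "eastus"

# Create the Log Analytics workspace in East US.
New-AzOperationalInsightsWorkspace -ResourceGroupName "mgmt-rg" `
    -Name "contoso-mgmt-workspace" `
    -Location "eastus" `
    -Sku "PerGB2018"

# Create the Automation account in East US 2, the region mapped to East US for workspace linking.
New-AzAutomationAccount -ResourceGroupName "mgmt-rg" `
    -Name "contoso-mgmt-automation" `
    -Location "eastus2"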
NOTE
When you create an Automation account by using the Azure portal, the portal attempts by default to create Run As
accounts for both Azure Resource Manager and the classic deployment model resources. If you don't have classic virtual
machines in your environment and you're not the co-administrator on the subscription, the portal creates a Run As account
for Resource Manager, but it generates an error when deploying the classic Run As account. If you don't intend to support
classic resources, you can ignore this error.
You can also create Run As accounts by using PowerShell.
Next steps
Learn how to onboard your servers to Azure server management services.
Onboard to Azure server management services
Phase 2: Onboarding Azure server management
services
10/30/2020 • 2 minutes to read • Edit Online
After you're familiar with the tools and planning involved in Azure management services, you're ready for the
second phase. Phase 2 provides step-by-step guidance for onboarding these services for use with your Azure
resources. Start by evaluating this onboarding process before adopting it broadly in your environment.
NOTE
The automation approaches discussed in later sections of this guidance are meant for deployments that don't already have
servers deployed to the cloud. They require that you have the Owner role on a subscription to create all the required
resources and policies. If you've already created Log Analytics workspaces and Automation accounts, we recommend that
you pass these resources in the appropriate parameters when you start the example automation scripts.
Onboarding processes
This section of the guidance covers the following onboarding processes for both Azure virtual machines and on-
premises servers:
Enable management services on a single VM for evaluation by using the portal. Use this process to
familiarize yourself with the Azure server management services.
Configure management services for a subscription by using the portal. This process helps you
configure the Azure environment so that any new VMs that are provisioned will automatically use management
services. Use this approach if you prefer the Azure portal experience to scripts and command lines.
Configure management services for a subscription by using Azure Automation. This process is fully
automated. Just create a subscription, and the scripts will configure the environment to use management
services for any newly provisioned VM. Use this approach if you're familiar with PowerShell scripts and Azure
Resource Manager templates, or if you want to learn to use them.
The procedures for each of these approaches are different.
NOTE
When you use the Azure portal, the sequence of onboarding steps differs from the automated onboarding steps. The portal
offers a simpler onboarding experience.
The following diagram shows the recommended deployment model for management services:
As shown in the preceding diagram, the Log Analytics agent has two configurations for on-premises servers:
Auto-enroll: When the Log Analytics agent is installed on a server and configured to connect to a workspace,
the solutions that are enabled on that workspace are applied to the server automatically.
Opt-in: Even if the agent is installed and connected to the workspace, the solution isn't applied unless it's added
to the server's scope configuration in the workspace.
Next steps
Learn how to onboard a single VM by using the portal to evaluate the onboarding process.
Onboard a single Azure VM for evaluation
Enable server management services on a single VM
for evaluation
10/30/2020 • 2 minutes to read • Edit Online
NOTE
Create the required Log Analytics workspace and Azure Automation account before you implement Azure management
services on a VM.
It's simple to onboard Azure server management services to individual virtual machines in the Azure portal. You
can familiarize yourself with these services before you onboard them. When you select a VM instance, all the
solutions on the list of management tools and services appear on the Operations or Monitoring menu. You
select a solution and follow the wizard to onboard it.
Related resources
For more information about how to onboard these solutions to individual VMs, see:
Onboard Update Management, Change Tracking, and Inventory solutions from Azure virtual machine
Onboard Azure Monitoring for VMs
Next steps
Learn how to use Azure Policy to onboard Azure VMs at scale.
Configure Azure management services for a subscription
Configure Azure server management services at
scale
10/30/2020 • 7 minutes to read • Edit Online
You must complete these two tasks to onboard Azure server management services to your servers:
Deploy service agents to your servers.
Enable the management solutions.
This article covers the three processes that are necessary to complete these tasks:
1. Deploy the required agents to Azure VMs by using Azure Policy.
2. Deploy the required agents to on-premises servers.
3. Enable and configure the solutions.
NOTE
Create the required Log Analytics workspace and Azure Automation account before you onboard virtual machines to Azure
server management services.
NOTE
For more information about various agents for Azure monitoring, see Overview of the Azure monitoring agents.
Assign policies
To assign the policies described in the previous section:
1. In the Azure portal, go to Policy > Assignments > Assign initiative .
2. On the Assign policy page, set the Scope by selecting the ellipsis (...) and then selecting either a
management group or subscription. Optionally, select a resource group. Then choose Select at the bottom
of the Scope page. The scope determines which resources or group of resources the policy is assigned to.
3. Select the ellipsis (...) next to Policy definition to open the list of available definitions. To filter the initiative
definitions, enter Azure Monitor in the Search box:
4. The Assignment name is automatically populated with the policy name that you selected, but you can
change it. You can also add an optional description to provide more information about this policy
assignment. The Assigned by field is automatically filled based on who is signed in. This field is optional,
and it supports custom values.
5. For this policy, select the Log Analytics workspace to associate with the Log Analytics agent.
6. Select the Managed Identity location check box. If this policy is of the type DeployIfNotExists , a
managed identity will be required to deploy the policy. In the portal, the account will be created as indicated
by the check box selection.
7. Select Assign .
After you complete the wizard, the policy assignment will be deployed to the environment. It can take up to 30
minutes for the policy to take effect. To test it, create new VMs after 30 minutes, and check if the Microsoft
Monitoring Agent is enabled on the VM by default.
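If you prefer to script the assignment rather than use the portal wizard, a rough sketch using the Az.Resources cmdlets might look like the following. The initiative display name and the logAnalytics_1 parameter name are assumptions here; confirm them against the built-in definition in your environment.

# Find the built-in initiative that deploys the monitoring agents to VMs.
$initiative = Get-AzPolicySetDefinition |
    Where-Object { $_.Properties.displayName -eq "Enable Azure Monitor for VMs" }

$scope = "/subscriptions/<subscription ID>"

# Assign the initiative with a system-assigned managed identity so DeployIfNotExists policies can remediate.
New-AzPolicyAssignment -Name "Enable Azure Monitor for VMs" `
    -Scope $scope `
    -PolicySetDefinition $initiative `
    -PolicyParameterObject @{ "logAnalytics_1" = "<Log Analytics workspace resource ID>" } `
    -AssignIdentity `
    -Location eastus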
For on-premises servers, you need to download and install the Log Analytics Agent and the Microsoft Dependency
Agent manually and configure them to connect to the correct workspace. You must specify the workspace ID and
key information. To get that information, go to your Log Analytics workspace in the Azure portal, then select
Settings > Advanced settings .
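You can also retrieve the workspace ID and key with PowerShell. A minimal sketch, assuming the Az.OperationalInsights module and placeholder resource names:

# Get the workspace ID (CustomerId) and primary key used when you install the agent on-premises.
$workspace = Get-AzOperationalInsightsWorkspace -ResourceGroupName "mgmt-rg" -Name "contoso-mgmt-workspace"
$keys = Get-AzOperationalInsightsWorkspaceSharedKey -ResourceGroupName "mgmt-rg" -Name "contoso-mgmt-workspace"

$workspace.CustomerId     # Workspace ID
$keys.PrimarySharedKey    # Workspace key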
Enable and configure solutions
To enable solutions, you need to configure the Log Analytics workspace. Onboarded Azure VMs and on-premises
servers will get the solutions from the Log Analytics workspaces that they're connected to.
Update Management
The Update Management, Change Tracking, and Inventory solutions require both a Log Analytics workspace and
an Automation account. To ensure that these resources are properly configured, we recommend that you onboard
through your Automation account. For more information, see Onboard Update Management, Change Tracking,
and Inventory solutions.
We recommend that you enable the Update Management solution for all servers. Update Management is free for
Azure VMs and on-premises servers. If you enable Update Management through your Automation account, a
scope configuration is created in the workspace. Manually update the scope to include machines that are covered
by the Update Management service.
To cover your existing servers as well as future servers, you need to remove the scope configuration. To do this,
view your Automation account in the Azure portal. Select Update Management > Manage machine > Enable
on all available and future machines . This setting allows all Azure VMs that are connected to the workspace to
use Update Management.
Heartbeat
| where AzureEnvironment=~"Azure" or Computer in~ ("list of the on-premises server names", "server1")
| distinct Computer
NOTE
The server name must exactly match the value in the expression, and it shouldn't contain a domain name suffix.
1. Select Save . By default, the scope configuration is linked to the MicrosoftDefaultComputerGroup saved
search. It will be automatically updated.
Azure Activity Log
Azure Activity Log is also part of Azure Monitor. It provides insight into subscription-level events that occur in
Azure.
To implement this solution:
1. In the Azure portal, open All services, then select Management + Governance > Solutions.
2. In the Solutions view, select Add .
3. Search for Activity Log Analytics and select it.
4. Select Create .
Specify the name of the workspace that you created in the previous section; that's the workspace where the
solution is enabled.
Azure Log Analytics Agent Health
The Azure Log Analytics Agent Health solution reports on the health, performance, and availability of your
Windows and Linux servers.
To implement this solution:
1. In the Azure portal, open All services, then select Management + Governance > Solutions.
2. In the Solutions view, select Add .
3. Search for Azure Log Analytics agent health and select it.
4. Select Create .
Specify the name of the workspace that you created in the previous section; that's the workspace where the
solution is enabled.
After creation is complete, the workspace resource instance displays AgentHealthAssessment when you select
View > Solutions .
Antimalware Assessment
The Antimalware Assessment solution helps you identify servers that are infected or at increased risk of infection
by malware.
To implement this solution:
1. In the Azure portal, open All services, then select Management + Governance > Solutions.
2. In the Solutions view, select Add .
3. Search for and then select Antimalware Assessment .
4. Select Create .
Specify the name of the workspace that you created in the previous section; that's the workspace where the
solution is enabled.
After creation is complete, the workspace resource instance displays AntiMalware when you select View >
Solutions .
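Rather than repeating the portal steps for each solution, you can enable them on the workspace with PowerShell. This is a hedged sketch; the solution (intelligence pack) names shown are our assumption based on how the solutions appear under View > Solutions:

# Enable the Activity Log Analytics, Agent Health, and Antimalware Assessment solutions on the workspace.
$solutions = @("AzureActivity", "AgentHealthAssessment", "AntiMalware")

foreach ($solution in $solutions) {
    Set-AzOperationalInsightsIntelligencePack -ResourceGroupName "mgmt-rg" `
        -WorkspaceName "contoso-mgmt-workspace" `
        -IntelligencePackName $solution `
        -Enabled $true
}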
Azure Monitor for VMs
You can enable Azure Monitor for VMs through the view page for the VM instance, as described in Enable
management services on a single VM for evaluation. You shouldn't enable solutions directly from the Solutions
page as you do for the other solutions that are described in this article. For large-scale deployments, it may be
easier to use automation to enable the correct solutions in the workspace.
Azure Security Center
We recommend that you onboard all your servers at least to the Free tier of Azure Security Center. This option
provides basic security assessments and actionable security recommendations for your environment. The
Standard tier provides additional benefits. For more information, see Azure Security Center pricing.
To enable the Free tier of Azure Security Center, follow these steps:
1. Go to the Security Center portal page.
2. Under POLICY & COMPLIANCE , select Security policy .
3. Find the Log Analytics workspace resource that you created in the pane on the right side.
4. Select Edit settings for that workspace.
5. Select Pricing tier .
6. Choose the Free option.
7. Select Save .
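As an alternative to the portal steps above, the pricing tier can also be set from PowerShell. This is a minimal sketch that assumes the Az.Security module and the VirtualMachines coverage name:

# Set the Security Center pricing tier for virtual machines to Free for the current subscription.
Set-AzSecurityPricing -Name "VirtualMachines" -PricingTier "Free"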
Next steps
Learn how to use automation to onboard servers and create alerts.
Automate onboarding and alert configuration
Automate onboarding
10/30/2020 • 2 minutes to read • Edit Online
To improve the efficiency of deploying Azure server management services, consider automating deployment as
discussed in previous sections of this guidance. The script and the example templates provided in the following
sections are starting points for developing your own automation of onboarding processes.
This guidance has a supporting GitHub repository of sample code, CloudAdoptionFramework. The repository
provides example scripts and Azure Resource Manager templates to help you automate the deployment of Azure
server management services.
The sample files illustrate how to use Azure PowerShell cmdlets to automate the following tasks:
Create a Log Analytics workspace. (Or, use an existing workspace if it meets the requirements. For details,
see Workspace planning.)
Create an Automation account, or use an existing account that meets the requirements. For more
information, see Workspace planning.
Link the Automation account and the Log Analytics workspace. This step isn't required if you're onboarding
by using the Azure portal. (A sketch of this step follows the list.)
Enable Update Management, and Change Tracking and Inventory, for the workspace.
Onboard Azure VMs by using Azure Policy. A policy installs the Log Analytics agent and the Microsoft
Dependency Agent on the Azure VMs.
Auto-enable Azure backup for VMs using Azure Policy
Onboard on-premises servers by installing the Log Analytics agent on them.
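The linking step called out above can be scripted. The following is a rough sketch, not the repository's script: the resource names are placeholders, and the Automation linked-service call reflects our understanding of the Az.OperationalInsights module.

# Get the resource ID of the Automation account.
$automationAccountId = (Get-AzResource -ResourceGroupName "mgmt-rg" `
    -ResourceType "Microsoft.Automation/automationAccounts" `
    -Name "contoso-mgmt-automation").ResourceId

# Link the Automation account to the Log Analytics workspace so Update Management and Change Tracking can use it.
Set-AzOperationalInsightsLinkedService -ResourceGroupName "mgmt-rg" `
    -WorkspaceName "contoso-mgmt-workspace" `
    -LinkedServiceName "Automation" `
    -WriteAccessResourceId $automationAccountId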
The files described in the following table are used in this sample. You can customize them to support your own
deployment scenarios.
F IL E N A M E DESC RIP T IO N
ScopeConfig.json A Resource Manager template that uses the opt-in model for
on-premises servers with the Change Tracking solution. Using
the opt-in model is optional.
ChangeTracking-FileList.json A Resource Manager template that defines the list of files that
will be monitored by Change Tracking.
Next steps
Learn how to set up basic alerts to notify your team of key management events and issues.
Set up basic alerts
Set up basic alerts
10/30/2020 • 2 minutes to read • Edit Online
A key part of managing resources is getting notified when problems occur. Alerts proactively notify you of critical
conditions, based on triggers from metrics, logs, or service-health issues. As part of onboarding the Azure server
management services, you can set up alerts and notifications that help keep your IT teams aware of any problems.
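For example, here's a minimal sketch of a metric alert that notifies an existing action group when a VM's CPU stays high. The resource IDs are placeholders, and the cmdlet names assume the Az.Monitor module:

# Alert when average CPU on the VM exceeds 90 percent, evaluated every minute over a 5-minute window.
$condition = New-AzMetricAlertRuleV2Criteria -MetricName "Percentage CPU" `
    -TimeAggregation Average -Operator GreaterThan -Threshold 90

Add-AzMetricAlertRuleV2 -Name "vm-high-cpu" `
    -ResourceGroupName "mgmt-rg" `
    -TargetResourceId "/subscriptions/<subscription ID>/resourceGroups/mgmt-rg/providers/Microsoft.Compute/virtualMachines/<vm name>" `
    -Condition $condition `
    -WindowSize (New-TimeSpan -Minutes 5) `
    -Frequency (New-TimeSpan -Minutes 1) `
    -ActionGroupId "/subscriptions/<subscription ID>/resourceGroups/mgmt-rg/providers/microsoft.insights/actionGroups/<action group name>" `
    -Severity 3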
Next steps
Learn about operations and security mechanisms that support your ongoing operations.
Ongoing management and security
Phase 3: Ongoing management and security
10/30/2020 • 2 minutes to read • Edit Online
After you've onboarded Azure server management services, you'll need to focus on the operations and security
configurations that will support your ongoing operations. We'll start with securing your environment by reviewing
the Azure Security Center. We'll then configure policies to keep your servers in compliance and automate common
tasks. This section covers the following topics:
Address security recommendations. Azure Security Center provides suggestions to improve the security of
your environment. When you implement these recommendations, you see the impact reflected in a security
score.
Enable the Guest Configuration policy. Use the Azure Policy Guest Configuration feature to audit the
settings in a virtual machine. For example, you can check whether any certificates are about to expire.
Track and alert on critical changes. When you're troubleshooting, the first question to consider is, "What's
changed?" In this article, you'll learn how to track changes and create alerts to proactively monitor critical
components.
Create update schedules. Schedule the installation of updates to ensure that all your servers have the latest
ones.
Common Azure Policy examples. This article provides examples of common management policies.
Next steps
Learn how to enable the Azure Policy Guest Configuration feature.
Guest Configuration policy
Guest Configuration policy
10/30/2020 • 2 minutes to read • Edit Online
You can use the Azure Policy Guest Configuration extension to audit the configuration settings in a virtual machine.
Guest Configuration is currently supported only on Azure VMs.
To find the list of Guest Configuration policies, search for "Guest Configuration" on the Azure Policy portal page. Or
run this cmdlet in a PowerShell window to find the list:
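The cmdlet isn't reproduced in this extract. A minimal sketch that filters the built-in definitions by the Guest Configuration category, which we assume is what the original pointed to, would be:

# List built-in initiative definitions in the Guest Configuration category.
Get-AzPolicySetDefinition | Where-Object { $_.Properties.metadata.category -eq "Guest Configuration" }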
NOTE
Guest Configuration functionality is regularly updated to support additional policy sets. Check for new supported policies
periodically and evaluate whether they'll be useful.
Deployment
Use the following example PowerShell script to deploy these policies to:
Verify that password security settings in Windows and Linux computers are set correctly.
Verify that certificates aren't close to expiration on Windows VMs.
Before you run this script, use the Connect-AzAccount cmdlet to sign in. When you run the script, you must
provide the name of the subscription that you want to apply the policies to.
param (
    [Parameter(Mandatory=$true)]
    [string] $SubscriptionName
)

# Build the assignment scope from the subscription and look up the built-in certificate-expiration initiative.
# The display-name filter is illustrative; adjust it to match the definition in your environment.
$subscription = Get-AzSubscription -SubscriptionName $SubscriptionName
$scope = "/subscriptions/" + $subscription.Id
$CertExpirePolicy = Get-AzPolicySetDefinition | Where-Object { $_.Properties.displayName -like "*certificates*expiring*" }

New-AzPolicyAssignment -Name "CertExpirePolicy" -DisplayName "[Preview]: Audit that certificates are not expiring on Windows VMs" -Scope $scope -PolicySetDefinition $CertExpirePolicy -AssignIdentity -Location eastus
Next steps
Learn how to enable change tracking and alerting for critical file, service, software, and registry changes.
Enable tracking and alerting for critical changes
Enable tracking and alerting for critical changes
10/30/2020 • 3 minutes to read • Edit Online
Azure Change Tracking and Inventory provide alerts on the configuration state of your hybrid environment and
changes to that environment. It can report critical file, service, software, and registry changes that might affect
your deployed servers.
By default, the Azure Automation inventory service doesn't monitor files or registry settings. The solution does
provide a list of registry keys that we recommend for monitoring. To see this list, go to your Automation account in
the Azure portal, then select Inventory > Edit settings.
For more information about each registry key, see Registry key change tracking. Select any key to evaluate and
then enable it. The setting is applied to all VMs that are enabled in the current workspace.
You can also use the service to track critical file changes. For example, you might want to track the
C:\windows\system32\drivers\etc\hosts file because the OS uses it to map host names to IP addresses. Changes to
this file could cause connectivity problems or redirect traffic to dangerous websites.
To enable file-content tracking for the hosts file, follow the steps in Enable file content tracking.
You can also add an alert for changes to files that you're tracking. For example, say you want to set an alert for
changes to the hosts file. Select Log Analytics on the command bar or Log Search for the linked Log Analytics
workspace. In Log Analytics, use the following query to search for changes to the hosts file:
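The query itself isn't included in this extract. A sketch consistent with the description that follows, using the Change Tracking ConfigurationChange table (treat the exact table and field names as assumptions), would be:

ConfigurationChange
| where FieldsChanged contains "FileContentChecksum" and FileSystemPath contains "hosts"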
This query searches for changes to the contents of files that have a path that contains the word "hosts." You can
also search for a specific file by changing the path parameter. (For example,
FileSystemPath == "c:\\windows\\system32\\drivers\\etc\\hosts" .)
After the query returns the results, select New alert rule to open the alert-rule editor. You can also get to this
editor via Azure Monitor in the Azure portal.
In the alert-rule editor, review the query and change the alert logic if you need to. In this case, we want the alert to
be raised if any changes are detected on any machine in the environment.
After you set the condition logic, you can assign action groups to perform actions in response to the alert. In this
example, when the alert is raised, emails are sent and an ITSM ticket is created. You can take many other useful
actions, like triggering an Azure function, an Azure Automation runbook, a webhook, or a logic app.
After you've set all the parameters and logic, apply the alert to the environment.
Next steps
Learn how Azure Automation can create update schedules to manage updates for your servers.
Create update schedules
Create update schedules
10/30/2020 • 2 minutes to read • Edit Online
You can manage update schedules by using the Azure portal or the new PowerShell cmdlet modules.
To create an update schedule via the Azure portal, see Schedule an update deployment.
The Az.Automation module now supports configuring update management by using Azure PowerShell. Version
1.7.0 of the module adds support for the New-AzAutomationUpdateManagementAzureQuery cmdlet. This cmdlet
lets you use tags, location, and saved searches to configure update schedules for a flexible group of machines.
Example script
The example script in this section illustrates the use of tagging and querying to create dynamic groups of
machines that you can apply update schedules to. It performs the following actions. You can refer to the
implementations of the specific actions when you create your own scripts.
Creates an Azure Automation update schedule that runs every Saturday at 8:00 AM.
Creates a query for any machines that match these criteria:
Deployed in the westus , eastus , or eastus2 Azure location.
Has an Owner tag applied with a value set to JaneSmith .
Has a Production tag applied with a value set to true .
Applies the update schedule to the queried machines and sets a two-hour update window.
Before you run the example script, you'll need to sign in by using the Connect-AzAccount cmdlet. When you start
the script, provide the following information:
The target subscription ID
The target resource group
Your Log Analytics workspace name
Your Azure Automation account name
<#
.SYNOPSIS
    This script orchestrates the deployment of the solutions and the agents.
.PARAMETER SubscriptionId
.PARAMETER ResourceGroupName
.PARAMETER WorkspaceName
.PARAMETER AutomationAccountName
#>
param (
    [Parameter(Mandatory=$true)]
    [string] $SubscriptionId,
    [Parameter(Mandatory=$true)]
    [string] $ResourceGroupName,
    [Parameter(Mandatory=$true)]
    [string] $WorkspaceName,
    [Parameter(Mandatory=$true)]
    [string] $AutomationAccountName,
    [Parameter(Mandatory=$false)]
    [string] $scheduleName = "SaturdayCriticalSecurity"
)

Import-Module Az.Automation

# Select the target subscription.
Select-AzSubscription -SubscriptionId $SubscriptionId

# Create a weekly schedule that runs every Saturday, starting 10 minutes from now.
$startTime = ([DateTime]::Now).AddMinutes(10)
$schedule = New-AzAutomationSchedule -ResourceGroupName $ResourceGroupName `
    -AutomationAccountName $AutomationAccountName `
    -StartTime $startTime `
    -Name $scheduleName `
    -Description "Saturday patches" `
    -DaysOfWeek Saturday `
    -WeekInterval 1 `
    -ForUpdateConfiguration

# Build a dynamic group query for machines in the listed regions that carry both tags.
# The tag hashtable shape reflects our understanding of the cmdlet; adjust it if your module version differs.
$queryScope = @("/subscriptions/$SubscriptionId/resourceGroups/")
$queryLocations = @("westus", "eastus", "eastus2")
$queryTags = @{ "Owner" = @("JaneSmith"); "Production" = @("true") }

$DGQuery = New-AzAutomationUpdateManagementAzureQuery -ResourceGroupName $ResourceGroupName `
    -AutomationAccountName $AutomationAccountName `
    -Scope $queryScope `
    -Location $queryLocations `
    -Tag $queryTags

$AzureQueries = @($DGQuery)

# Apply the update schedule to the queried machines with a two-hour update window (classification values here are illustrative).
New-AzAutomationSoftwareUpdateConfiguration -ResourceGroupName $ResourceGroupName `
    -AutomationAccountName $AutomationAccountName `
    -Schedule $schedule `
    -Windows `
    -Duration (New-TimeSpan -Hours 2) `
    -AzureQuery $AzureQueries `
    -IncludedUpdateClassification Security,Critical
Next steps
See examples of how to implement common policies in Azure that can help manage your servers.
Common policies in Azure
Common Azure Policy examples
10/30/2020 • 2 minutes to read • Edit Online
Azure Policy can help you apply governance to your cloud resources. This service can help you create guardrails
that ensure company-wide compliance to governance policy requirements. To create policies, use either the Azure
portal or PowerShell cmdlets. This article provides PowerShell cmdlet examples.
NOTE
With Azure Policy, enforcement policies ( DeployIfNotExists ) aren't automatically deployed to existing VMs. Remediation is
required to keep VMs in compliance. For more information, see Remediate noncompliant resources with Azure Policy.
The following script shows how to assign the policy. Change the $SubscriptionID value to point to the
subscription that you want to assign the policy to. Before you run the script, use the Connect-AzAccount cmdlet to
sign in.
# Replace the -Name GUID with the policy GUID you want to assign.
$AllowedLocationPolicy = Get-AzPolicyDefinition -Name "e56962a6-4747-49cd-b67b-bf8b01975c4c"
You can also use this script to apply the other policies that are discussed in this article. Just replace the GUID in the
line that sets $AllowedLocationPolicy with the GUID of the policy that you want to apply.
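The snippet above only retrieves the definition. A minimal sketch of the rest of the assignment, using placeholder values and the built-in allowed-locations parameter name as we understand it, might be:

# Build the scope and the list of allowed locations, then assign the policy.
$SubscriptionID = "<subscription ID>"
$scope = "/subscriptions/" + $SubscriptionID
$AllowedLocations = @{ "listOfAllowedLocations" = @("eastus", "eastus2") }

New-AzPolicyAssignment -Name "Allowed locations" `
    -DisplayName "Allowed locations for resource creation" `
    -Scope $scope `
    -PolicyDefinition $AllowedLocationPolicy `
    -PolicyParameterObject $AllowedLocations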
Block certain resource types
Another common built-in policy that's used to control costs can also be used to block certain resource types.
To find this policy in the portal, search for "allowed resource types" on the policy definition page. Or run this
cmdlet to find the policy:
Get-AzPolicyDefinition | Where-Object { ($_.Properties.policyType -eq "BuiltIn") -and
($_.Properties.displayName -like "*allowed resource types") }
After you identify the policy that you want to use, you can modify the PowerShell sample in the Restrict resource
regions section to assign the policy.
Restrict VM size
Azure offers a wide range of VM sizes to support various workloads. To control your budget, you could create a
policy that allows only a subset of VM sizes in your subscriptions.
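To find the built-in definition for this restriction, you can filter by display name. The filter string below is an assumption; confirm the exact display name in your environment:

# Find the built-in policy that restricts which VM sizes (SKUs) can be deployed.
Get-AzPolicyDefinition | Where-Object { ($_.Properties.policyType -eq "BuiltIn") -and ($_.Properties.displayName -like "*virtual machine*SKUs*") }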
Deploy antimalware
You can use this policy to deploy a Microsoft Antimalware extension with a default configuration to VMs that aren't
protected by antimalware.
The policy GUID is 2835b622-407b-4114-9198-6f7064cbe0dc .
The following script shows how to assign the policy. To use the script, change the $SubscriptionID value to point
to the subscription that you want to assign the policy to. Before you run the script, use the Connect-AzAccount
cmdlet to sign in.
# Replace the subscription ID and location "eastus" with the values that you want to use.
$SubscriptionID = "<subscription ID>"
$scope = "/subscriptions/" + $SubscriptionID
$antimalwarePolicy = Get-AzPolicyDefinition -Name "2835b622-407b-4114-9198-6f7064cbe0dc"

New-AzPolicyAssignment -Name "Deploy Antimalware" -DisplayName "Deploy default Microsoft IaaSAntimalware extension for Windows Server" -Scope $scope -PolicyDefinition $antimalwarePolicy -Location eastus -AssignIdentity
Next steps
Learn about other server-management tools and services that are available.
Azure server management tools and services
Azure server management tools and services
10/30/2020 • 4 minutes to read • Edit Online
As is discussed in the overview of this guidance, the suite of Azure server management services covers these
areas:
Migrate
Secure
Protect
Monitor
Configure
Govern
The following sections briefly describe these management areas and provide links to detailed content about the
main Azure services that support them.
Migrate
Migration services can help you migrate your workloads into Azure. To provide the best guidance, the Azure
Migrate service starts by measuring on-premises server performance and assessing suitability for migration.
After Azure Migrate completes the assessment, you can use Azure Site Recovery and Azure Database Migration
Service to migrate your on-premises machines to Azure.
Secure
Azure Security Center is a comprehensive security management application. By onboarding to Security Center,
you can quickly get an assessment of the security and regulatory compliance status of your environment. For
instructions on onboarding your servers to Azure Security Center, see Configure Azure management services for a
subscription.
Protect
To protect your data, you need to plan for backup, high availability, encryption, authorization, and related
operational issues. These topics are covered extensively online, so here we'll focus on building a business
continuity and disaster recovery (BCDR) plan. We'll include references to documentation that describes in detail
how to implement and deploy this type of plan.
When you build data-protection strategies, first consider breaking down your workload applications into their
different tiers. This approach helps because each tier typically requires its own unique protection plan. To learn
more about designing applications to be resilient, see Designing resilient applications for Azure.
The most basic data protection is backup. To speed up the recovery process if servers are lost, back up not just
data but also server configurations. Backup is an effective mechanism to handle accidental data deletion and
ransomware attacks. Azure Backup can help you protect your data on Azure and on-premises servers running
Windows or Linux. For details about what Backup can do and for how-to guides, see the Azure Backup service
overview.
If a workload requires real-time business continuity for hardware failures or datacenter outage, consider using
data replication. Azure Site Recovery provides continuous replication of your VMs, a solution that provides bare-
minimum data loss. Site Recovery also supports several replication scenarios, such as replication:
Of Azure VMs between two Azure regions.
Between servers on-premises.
Between on-premises servers and Azure.
For more information, see the complete Azure Site Recovery replication matrix.
For your file-server data, another service to consider is Azure File Sync. This service helps you centralize your
organization's file shares in Azure Files, while preserving the flexibility, performance, and compatibility of an on-
premises file server. To use this service, follow the instructions for deploying Azure File Sync.
Monitor
Azure Monitor provides a view into various resources, like applications, containers, and virtual machines. It also
collects data from several sources:
Azure Monitor for VMs provides an in-depth view of VM health, performance trends, and dependencies. The
service monitors the health of the operating systems of your Azure virtual machines, virtual-machine scale
sets, and machines in your on-premises environment.
Log Analytics is a feature of Azure Monitor. Its role is central to the overall Azure management story. It serves
as the data store for log analysis and for many other Azure services. It offers a rich query language and an
analytics engine that provides insights into the operation of your applications and resources.
Azure Activity Log is also a feature of Azure Monitor. It provides insight into subscription-level events that
occur in Azure.
Configure
Several services fit into this category. They can help you to:
Automate operational tasks.
Manage server configurations.
Measure update compliance.
Schedule updates.
Detect changes to your servers.
These services are essential to supporting ongoing operations:
Update Management automates the deployment of patches across your environment, including deployment to
operating-system instances running outside of Azure. It supports both Windows and Linux operating systems,
and tracks key OS vulnerabilities and nonconformance caused by missing patches.
Change Tracking and Inventory provides insight into the software that's running in your environment, and
highlights any changes that have occurred.
Azure Automation lets you run Python and PowerShell scripts or runbooks to automate tasks across your
environment. When you use Automation with the Hybrid Runbook Worker, you can extend your runbooks to
your on-premises resources as well.
Azure Automation State Configuration enables you to push PowerShell Desired State Configuration (DSC)
configurations directly from Azure. DSC also lets you monitor and preserve configurations for guest operating
systems and workloads.
Govern
Adopting and moving to the cloud creates new management challenges. It requires a different mindset as you
shift from an operational management burden to monitoring and governance. The Cloud Adoption Framework
starts with governance. The framework explains how to migrate to the cloud, what the journey will look like, and
who should be involved.
The governance design for standard organizations often differs from governance design for complex enterprises.
To learn more about governance best practices for a standard organization, see the standard enterprise
governance guide. To learn more about governance best practices for a complex enterprise, see the governance
guide for complex enterprises.
Billing information
To learn about pricing for Azure management services, go to these pages:
Azure Site Recovery
Azure Backup
Azure Monitor
Azure Security Center
Azure Automation, including:
Desired State Configuration
Azure Update Management service
Azure Change Tracking and Inventory services
Azure Policy
Azure File Sync service
NOTE
The Azure Update Management solution is free, but there's a small cost related to data ingestion. As a rule of thumb, the
first 5 gigabytes (GB) per month of data ingestion are free. We generally observe that each machine uses about 25 MB per
month. So, about 200 machines per month are covered for free. For more servers, multiply the number of additional servers
by 25 MB per month. Then, multiply the result by the storage price for the additional storage that you need. For
information about costs, see Azure Storage Overview pricing. Each additional server typically has a nominal impact on cost.
Cloud monitoring guide: Introduction
10/30/2020 • 3 minutes to read • Edit Online
The cloud fundamentally changes how enterprises procure and use technology resources. In the past, enterprises
assumed ownership of and responsibility for all levels of technology, from infrastructure to software. Now, the
cloud offers the potential for enterprises to provision and consume resources as needed.
Although the cloud offers nearly unlimited flexibility in terms of design choices, enterprises seek proven and
consistent methodology for the adoption of cloud technologies. Each enterprise has different goals and timelines
for cloud adoption, making a one-size-fits-all approach to adoption nearly impossible.
This digital transformation also enables an opportunity to modernize your infrastructure, workloads, and
applications. Depending on business strategy and objectives, adopting a hybrid cloud model is likely part of the
migration journey from on-premises to operating fully in the cloud. During this journey, IT teams are challenged to
adopt and realize rapid value from the cloud. IT must also understand how to effectively monitor the application or
service that's migrating to Azure, and continue to deliver effective IT operations and DevOps.
Stakeholders want to use cloud-based, software as a service (SaaS) monitoring and management tools. They need
to understand what services and solutions deliver to achieve end-to-end visibility, reduce costs, and focus less on
infrastructure and maintenance of traditional software-based IT operations tools.
However, IT often prefers to continue using the tools they've already invested in significantly. This approach
supports their service operations processes to monitor both cloud models, with the eventual goal of transitioning
to a SaaS-based offering. IT prefers this approach not only because switching takes time, planning, resources, and
funding, but also because of confusion about which products or Azure services are appropriate or applicable to
achieve the transition.
The goal of this guide is to provide a detailed reference to help enterprise IT managers, business decision makers,
application architects, and application developers understand:
Azure monitoring platforms, with an overview and comparison of their capabilities.
The best-fit solution for monitoring hybrid, private, and Azure native workloads.
The recommended end-to-end monitoring approach for both infrastructure and applications. This approach
includes deployable solutions for migrating these common workloads to Azure.
This guide isn't a how-to article for using or configuring individual Azure services and solutions, but it does
reference those sources when they're applicable or available. After you've read it, you'll understand how to
successfully operate a workload by following best practices and patterns.
If you're unfamiliar with Azure Monitor and System Center Operations Manager, and you want to get a better
understanding of what makes them unique and how they compare to each other, review the Overview of our
monitoring platforms.
Audience
This guide is useful primarily for enterprise administrators, IT operations, IT security and compliance, application
architects, workload development owners, and workload operations owners.
Next steps
Monitoring strategy for cloud deployment models
Cloud monitoring guide: Monitoring strategy for
cloud deployment models
10/30/2020 • 17 minutes to read • Edit Online
This article includes our recommended monitoring strategy for each of the cloud deployment models, based on
the following criteria:
You must maintain your commitment to Operations Manager or another enterprise monitoring platform,
because it's integrated with your IT operations processes, knowledge, and expertise, or certain functionality isn't
available yet in Azure Monitor.
You must monitor workloads both on-premises and in the public cloud, or just in the cloud.
Your cloud migration strategy includes modernizing IT operations and moving to our cloud monitoring services
and solutions.
You might have critical systems that are air-gapped or physically isolated, or are hosted in a private cloud or on
physical hardware, and these systems need to be monitored.
Our strategy includes support for monitoring infrastructure (compute, storage, and server workloads), application
(end-user, exceptions, and client), and network resources. It delivers a complete, service-oriented monitoring
perspective.
Layer: Azure resources - platform as a service (PaaS)
Resource: Azure Database services (for example, SQL or MySQL).
Scope: Azure Database for SQL performance metrics.
Method: Enable diagnostics logging to stream SQL data to Azure Monitor Logs.

Layer: Azure resources - infrastructure as a service (IaaS)
Resource: 1. Azure Storage. 2. Azure load balancing services. 3. Network security groups. 4. Azure Virtual
Machines. 5. Azure Kubernetes Service/Azure Container Instances.
Scope: 1. Capacity, availability, and performance. 2. Performance and diagnostics logs (activity, access,
performance, and firewall). 3. Monitor events when rules are applied, and the rule counter for how many times a
rule is applied to deny or allow. 4. Monitor capacity, availability, and performance in a guest VM operating
system (OS); map app dependencies hosted on each VM, including the visibility of active network connections
between servers, inbound and outbound connection latency, and ports across any TCP-connected architecture.
5. Monitor capacity, availability, and performance of workloads running on containers and container instances.
Method: For items 1 through 5, platform metrics and the Activity log are automatically collected and available in
Azure Monitor for analysis and alerting; configure diagnostic settings to forward resource logs to Azure Monitor
Logs. For item 4, enable Azure Monitor for VMs. For item 5, enable Azure Monitor for containers.

Layer: Azure subscription
Resource: Azure Service Health and basic resource health from the perspective of the Azure service.
Scope: Administrative actions performed on a service or resource; service health of an Azure service in a
degraded or unavailable state; health issues detected with an Azure resource from the Azure service perspective;
operations performed with Azure Autoscale indicating a failure or exception; operations performed with Azure
Policy indicating that an allowed or denied action occurred; records of alerts generated by Azure Security Center.
Method: Delivered in the Activity Log for monitoring and alerting by using Azure Monitor.

Layer: Azure tenant
Resource: Azure Active Directory
Scope: Azure AD audit logs and sign-in logs.
Method: Enable diagnostics logging, and configure streaming to Azure Monitor Logs.
Can collect IIS and SQL Server error logs, Windows events, and performance counters. Requires creating custom
queries, alerts, and visualizations.
Supports monitoring most of the server workloads with available management packs. Requires either the Log
Analytics Windows agent or the Operations Manager agent on the VM, reporting back to the management group on
the corporate network.
Legacy web application monitoring: Yes, limited, varies by SDK. Yes, limited.
Next steps
Collect the right data
Cloud monitoring guide: Collect the right data
10/30/2020 • 2 minutes to read • Edit Online
This article describes some considerations for collecting monitoring data in a cloud application.
To observe the health and availability of your cloud solution, you must configure the monitoring tools to collect a
level of signals that are based on predictable failure states. These signals are the symptoms of the failure, not the
cause. The monitoring tools use metrics and, for advanced diagnostics and root cause analysis, logs.
Plan for monitoring and migration carefully. Start by including the monitoring service owner, the manager of
operations, and other related personnel during the Plan phase, and continue engaging them throughout the
development and release cycle. Their focus will be to develop a monitoring configuration that's based on the
following criteria:
What is the composition of the service? Are those dependencies monitored today? If so, are there multiple tools
involved? Is there an opportunity to consolidate, without introducing risks?
What is the SLA of the service, and how will I measure and report it?
What should the service dashboard look like when an incident is raised? What should the dashboard look like
for the service owner, and for the team that supports the service?
What metrics does the resource produce that I need to monitor?
How will the service owner, support teams, and other personnel be searching the logs?
How you answer those questions, and the criteria for alerting, determines how you'll use the monitoring platform.
If you're migrating from an existing monitoring platform or set of monitoring tools, use the migration as an
opportunity to reevaluate the signals you collect. This is especially true now that there are several cost factors to
consider when you migrate or integrate with a cloud-based monitoring platform like Azure Monitor. Remember,
monitoring data needs to be actionable. You need to have optimized data collected to give you "a 10,000 foot
view" of the overall health of the service. The instrumentation that's defined to identify real incidents should be as
simple, predictable, and reliable as possible.
Next steps
Alerting strategy
Cloud monitoring guide: Alerting
10/30/2020 • 11 minutes to read • Edit Online
For years, IT organizations have struggled to combat the alert fatigue that's created by the monitoring tools
deployed in the enterprise. Many systems generate a high volume of alerts often considered meaningless, while
other alerts are relevant but are either overlooked or ignored. As a result, IT and developer operations have
struggled to meet the service-level quality promised to internal or external customers. To ensure reliability, it's
essential to understand the state of your infrastructure and applications. To minimize service degradation and
disruption, or to decrease the effect of or reduce the number of incidents, you need to identify causes quickly.
Azure Monitor for containers: Calculated average performance data from nodes and pods are written to the
metrics database. Create metric alerts if you want to be alerted based on variation of measured utilization
performance, aggregated over time.
Calculated performance data that uses percentiles from nodes, controllers, containers, and pods are written to
the workspace; container logs and inventory information are also written to the workspace. Create log query
alerts if you want to be alerted based on variation of measured utilization from clusters and containers. Log
query alerts can also be configured based on pod-phase counts and status node counts.
Azure Monitor for VMs: Health criteria are metrics stored in the metrics database. Alerts are generated when the
health state changes from healthy to unhealthy. This alert supports only Action Groups that are configured to
send SMS or email notifications.
NOTE
These features apply only to metric alerts, alerts based on data that's being sent to the Azure Monitor metric database. The
features don't apply to the other types of alerts. As mentioned previously, the primary objective of metric alerts is speed. If
getting an alert in less than five minutes isn't of primary concern, you can use a log query alert instead.
Dynamic thresholds: Dynamic thresholds look at the activity of the resource over a time period, and create
upper and lower "normal behavior" thresholds. When the metric being monitored falls outside of these
thresholds, you get an alert.
Multisignal alerts: You can create a metric alert that uses the combination of two different inputs from two
different resource types. For example, if you want to fire an alert when the CPU utilization of a VM is over 90
percent, and the number of messages in a certain Azure Service Bus queue feeding that VM exceeds a
certain amount, you can do so without creating a log query. This feature works for only two signals. If you
have a more complex query, feed your metric data into the Log Analytics workspace, and use a log query.
Multiresource alerts: Azure Monitor allows a single metric alert rule that applies to all VM resources. This
feature can save you time because you don't need to create individual alerts for each VM. Pricing for this
type of alert is the same. Whether you create 50 alerts for monitoring CPU utilization for 50 VMs, or one
alert that monitors CPU utilization for all 50 VMs, it costs you the same amount. You can use these types of
alerts in combination with dynamic thresholds as well.
Used together, these features can save time by minimizing alert notifications and the management of the
underlying alerts.
Limits on alerts
Be sure to note the limits on the number of alerts you can create. Some limits (but not all of them) can be
increased by calling support.
Best query experience
If you're looking for trends across all your data, it makes sense to import all your data into Azure Logs, unless it's
already in Application Insights. You can create queries across both workspaces, so there's no need to move data
between them. You can also import activity log and Azure Service Health data into your Log Analytics workspace.
You pay for this ingestion and storage, but you get all your data in one place for analysis and querying. This
approach also gives you the ability to create complex query conditions and alert on them.
Cloud monitoring guide: Monitoring platforms
overview
10/30/2020 • 14 minutes to read • Edit Online
Microsoft provides a range of monitoring capabilities from two products: System Center Operations Manager,
which was designed for on-premises and then extended to the cloud, and Azure Monitor, which was designed for
the cloud but can also monitor on-premises systems. These two offerings deliver core monitoring services, such as
alerting, service uptime tracking, application and infrastructure health monitoring, diagnostics, and analytics.
Many organizations are embracing the latest practices for DevOps agility and cloud innovations to manage their
heterogenous environments. Yet they are also concerned about their ability to make appropriate and responsible
decisions about how to monitor those workloads.
This article provides a high-level overview of our monitoring platforms to help you understand how each delivers
core monitoring functionality.
Infrastructure requirements
Operations Manager
Operations Manager requires significant infrastructure and maintenance to support a management group, which is
a basic unit of functionality. At a minimum, a management group consists of one or more management servers, a
SQL Server instance hosting the operational and reporting data warehouse databases, and agents. The complexity
of a management group design depends on multiple factors, such as the scope of workloads to monitor, and the
number of devices or computers supporting the workloads. If you require high availability and site resiliency, as is
commonly the case with enterprise monitoring platforms, the infrastructure requirements and associated
maintenance can increase dramatically.
(Diagram: An Operations Manager management group, consisting of management servers, the Operations console, the web console, an operational database, a data warehouse database, and a reporting database, managing agent-managed and agentless-managed systems, UNIX/Linux systems, and network devices.)
Azure Monitor
Azure Monitor is a software as a service (SaaS) offering, so its supporting infrastructure runs in Azure and is
managed by Microsoft. It performs monitoring, analytics, and diagnostics at scale, and it's available in all national
clouds. Core parts of the infrastructure (collectors, metrics and logs store, and analytics) that support Azure
Monitor are maintained by Microsoft.
(Diagram: Azure Monitor data flow from sources through collectors into storage, and on to usage: insights for applications, containers, and VMs; monitoring solutions; and integration through export APIs and Logic Apps. Items shown in gray aren't part of Azure Monitor itself but are part of the Azure Monitor story.)
Data collection
Operations Manager
Agents
Operations Manager collects data directly only from agents that are installed on Windows computers. It can accept
data from the Operations Manager SDK, but this approach is typically used for partners that extend the product
with custom applications, not for collecting monitoring data. It can collect data from other sources, such as Linux
computers and network devices, by using special modules that run on the Windows agent that remotely accesses
these other devices.
(Diagram: A management server receiving data from an agent-managed Windows computer, which also collects data remotely from a Linux server and a network device.)
The Operations Manager agent can collect from multiple data sources on the local computer, such as the event log,
custom logs, and performance counters. It can also run scripts, which can collect data from the local computer or
from external sources. You can write custom scripts to collect data that can't be collected by other means, or to
collect data from a variety of remote devices that can't otherwise be monitored.
Management packs
Operations Manager performs all monitoring with workflows (rules, monitors, and object discoveries). These
workflows are packaged together in a management pack and deployed to agents. Management packs are available
for a variety of products and services, which include predefined rules and monitors. You can also author your own
management pack for your own applications and custom scenarios.
Monitoring configuration
Management packs can contain hundreds of rules, monitors, and object discovery rules. An agent runs all these
monitoring settings from all the management packs that apply, which are determined by discovery rules. Each
instance of each monitoring setting runs independently and acts immediately on the data that it collects. This is
how Operations Manager can achieve near-real-time alerting and the current health state of monitored resources.
For example, a monitor might sample a performance counter every few minutes. If that counter exceeds a
threshold, it immediately sets the health state of its target object, which immediately triggers an alert in the
management group. A scheduled rule might watch for a particular event to be created and immediately fire an
alert when that event is created in the local event log.
Because these monitoring settings are isolated from each other and work from the individual sources of data,
Operations Manager has challenges correlating data between multiple sources. It's also difficult to react to data
after it's been collected. You can run workflows that access the Operations Manager database, but this scenario isn't
common, and it's typically used for a limited number of special purpose workflows.
Azure Monitor
Data sources
Azure Monitor collects data from a variety of sources, including Azure infrastructure and platform resources,
agents on Windows and Linux computers, and monitoring data collected in Azure storage. Any REST client can
write log data to Azure Monitor by using an API, and you can define custom metrics for your web applications.
Some metric data can be routed to different locations, depending on its usage. For example, you might use the data
for "fast-as-possible" alerting or for long-term trend analysis searches in conjunction with other log data.
Monitoring solutions and insights
Monitoring solutions use the logs platform in Azure Monitor to provide monitoring for a particular application or
service. They typically define data collection from agents or from Azure services, and provide log queries and views
to analyze that data. They typically don't provide alert rules, which means that you must define your own alert
criteria based on collected data.
Insights, such as Azure Monitor for containers and Azure Monitor for VMs, use the logs and metrics platform of
Azure Monitor to provide a customized monitoring experience for an application or service in the Azure portal.
They might provide health monitoring and alerting conditions, in addition to customized analysis of collected data.
Monitoring configuration
Azure Monitor separates data collection from actions taken against that data, which supports distributed
microservices in a cloud environment. It consolidates data from multiple sources into a common data platform,
and provides analysis, visualization, and alerting capabilities based on the collected data.
Data collected by Azure Monitor is stored as either logs or metrics, and different features of Azure Monitor rely on
either. Metrics contain numerical values in time series that are well suited for near-real-time alerting and quick
detection of issues. Logs contain text or numerical data and can be queried using a powerful language especially
useful for performing complex analysis.
Because Azure Monitor separates data collection from actions against that data, it might be unable to provide near-
real-time alerting in many cases. To alert on log data, queries are run on a recurring schedule defined in the alert.
This behavior allows Azure Monitor to easily correlate data from all monitored sources, and you can interactively
analyze data in a variety of ways. This is especially helpful for doing root cause analysis and identifying where else
an issue might occur.
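For example, a minimal sketch of this kind of cross-source analysis, assuming the azure-monitor-query and azure-identity Python packages and a placeholder workspace ID, might run a single Kusto query that correlates heartbeat and performance data from every connected source:

```python
from datetime import timedelta
from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient, LogsQueryStatus

# Join heartbeat and CPU data across all reporting computers in one query.
query = """
Heartbeat
| summarize LastSeen = max(TimeGenerated) by Computer
| join kind=inner (
    Perf
    | where CounterName == "% Processor Time"
    | summarize AvgCpu = avg(CounterValue) by Computer
) on Computer
| top 10 by AvgCpu desc
"""

client = LogsQueryClient(DefaultAzureCredential())
response = client.query_workspace(
    workspace_id="<log-analytics-workspace-guid>",  # placeholder
    query=query,
    timespan=timedelta(hours=24),
)

if response.status == LogsQueryStatus.SUCCESS:
    for table in response.tables:
        for row in table.rows:
            print(row)
```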
Health monitoring
Operations Manager
Management Packs in Operations Manager include a service model that describes the components of the
application being monitored and their relationship. Monitors identify the current health state of each component
based on data and scripts on the agent. Health states roll up so that you can quickly view the summarized health
state of monitored computers and applications.
Azure Monitor
Azure Monitor doesn't provide a user-definable method of implementing a service model or monitors that indicate
the current health state of any service components. Because monitoring solutions are based on standard features
of Azure Monitor, they don't provide state-level monitoring. The following features of Azure Monitor can be helpful:
Application Insights: Builds a composite map of your web application, and provides a health state for
each application component or dependency. This includes alerts status and drill-down to more detailed
diagnostics of your application.
Azure Monitor for VMs: Delivers a health-monitoring experience for the guest Azure VMs, similar to that
of Operations Manager, when it monitors Windows and Linux virtual machines. It evaluates the health of key
operating system components from the perspective of availability and performance to determine the
current health state. When it determines that the guest VM is experiencing sustained resource utilization,
disk-space capacity, or an issue related to core operating system functionality, it generates an alert to bring
this state to your attention.
Azure Monitor for containers: Monitors the performance and health of Azure Kubernetes Service or
Azure Container Instances. It collects memory and processor metrics from controllers, nodes, and containers
that are available in Kubernetes through the Metrics API. It also collects container logs and inventory data
about containers and their images. Predefined health criteria that are based on the collected performance
data help you identify whether a resource bottleneck or capacity issue exists. You can also understand the
overall performance, or the performance from a specific Kubernetes object type (pod, node, controller, or
container).
Analyze data
Operations Manager
Operations Manager provides four basic ways to analyze data after it has been collected:
Health Explorer: Helps you discover which monitors are identifying a health state issue and review
knowledge about the monitor and possible causes for actions related to it.
Views: Offers predefined visualizations of collected data, such as a graph of performance data or a list of
monitored components and their current health state. Diagram views visually present the service model of
an application.
Reports: Allow you to summarize historical data that's stored in the Operations Manager data warehouse.
You can customize the data that views and reports are based on. However, there is no feature to allow for
complex or interactive analysis of collected data.
Operations Manager Command Shell: Extends Windows PowerShell with an additional set of cmdlets,
and can query and visualize collected data. This includes graphs and other visualizations, natively with
PowerShell, or with the Operations Manager HTML-based web console.
Azure Monitor
With the powerful Azure Monitor analytics engine, you can interactively work with log data and combine it with
other monitoring data for trending and other data analysis. Views and dashboards allow you to visualize query
data in a variety of ways from the Azure portal, and import it into Power BI. Monitoring solutions include queries
and views to present the data they collect. Insights such as Application Insights, Azure Monitor for VMs, and Azure
Monitor for containers include customized visualizations to support interactive monitoring scenarios.
Alerting
Operations Manager
Operations Manager creates alerts in response to predefined events, when a performance threshold is met, and
when the health state of a monitored component changes. It includes the complete management of alerts, allowing
you to set their resolution and assign them to various operators or system engineers. You can set notification rules
that specify which alerts will send proactive notifications.
Management packs include various predefined alerting rules for different critical conditions in the application
being monitored. You can tune these rules or create custom rules to meet the particular requirements of your
environment.
Azure Monitor
With Azure Monitor, you can create alerts based on a metric crossing a threshold, or based on a scheduled query
result. Although alerts based on metrics can achieve near-real-time results, scheduled queries have a longer
response time, depending on the speed of data ingestion and indexing. Instead of being limited to a specific agent,
log query alerts in Azure Monitor let you analyze data stored across multiple workspaces. These alerts
also include data from a specific Application Insights application by using a cross-workspace query.
Although monitoring solutions can include alert rules, you ordinarily create them based on your own
requirements.
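As a sketch of the metric-based option, the following uses the azure-mgmt-monitor Python package to define a near-real-time alert when a VM's average CPU crosses a threshold. The subscription, resource group, and resource IDs are placeholders, and the model names and parameters should be checked against the SDK version in use.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.monitor import MonitorManagementClient
from azure.mgmt.monitor.models import (
    MetricAlertResource,
    MetricAlertSingleResourceMultipleMetricCriteria,
    MetricCriteria,
)

subscription_id = "<subscription-id>"        # placeholder
vm_resource_id = "<resource-id-of-the-vm>"   # placeholder

client = MonitorManagementClient(DefaultAzureCredential(), subscription_id)

client.metric_alerts.create_or_update(
    resource_group_name="rg-monitoring",     # placeholder
    rule_name="high-cpu",
    parameters=MetricAlertResource(
        location="global",
        description="Alert when average CPU exceeds 80 percent.",
        severity=2,
        enabled=True,
        scopes=[vm_resource_id],
        evaluation_frequency="PT1M",          # evaluate every minute
        window_size="PT5M",                   # over a five-minute window
        criteria=MetricAlertSingleResourceMultipleMetricCriteria(
            all_of=[
                MetricCriteria(
                    name="HighCpu",
                    metric_name="Percentage CPU",
                    metric_namespace="Microsoft.Compute/virtualMachines",
                    operator="GreaterThan",
                    threshold=80.0,
                    time_aggregation="Average",
                )
            ]
        ),
    ),
)
```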
Workflows
Operations Manager
Management packs in Operations Manager contain hundreds of individual workflows, and they determine both
what data to collect and what action to perform with that data. For example, a rule might sample a performance
counter every few minutes, storing its results for analysis. A monitor might sample the same performance counter
and compare its value to a threshold to determine the health state of a monitored object. Another rule might run a
script to collect and analyze some data on an agent computer, and then fire an alert if it returns a particular value.
Workflows in Operations Manager are independent of each other, which makes analysis across multiple monitored
objects difficult. These monitoring scenarios must be based on data after it's collected, which is possible but can be
difficult, and it isn't common.
Azure Monitor
Azure Monitor separates data collection from actions and analysis taken from that data. Agents and other data
sources write log data to a Log Analytics workspace and write metric data to the metric database, without any
analysis of that data or knowledge of how it might be used. Monitor performs alerting and other actions from the
stored data, which allows you to perform analysis across data from all sources.
Next steps
Monitoring the cloud deployment models
Skills readiness for cloud monitoring
10/30/2020 • 6 minutes to read • Edit Online
During the Plan phase of your migration journey, the objective is to develop the plans necessary to guide
implementation. The plans also need to include how you will operate these workloads before they are transitioned or released into production, not afterwards. Business stakeholders expect valuable services, and they expect
them without disruption. IT staff members realize they need to learn new skills and adapt so they are prepared to
confidently use the integrated Azure services to effectively monitor resources in Azure and hybrid environments.
Developing the necessary skills can be accelerated with the following learning paths. They are organized starting
with learning the fundamentals and then divided across three primary subject domains - infrastructure, application,
and data analysis.
Fundamentals
Introduction to Azure Resource Manager discusses the basic concepts of management and deployment of
Azure resources. The IT staff managing the monitoring experience across the enterprise should understand
management scopes, role-based access control (RBAC), Azure Resource Manager templates, and management of resources by using Azure CLI and Azure PowerShell.
Introduction to Azure Policy helps you learn how you can use Azure Policy to create, assign, and manage
policies. Azure Policy can deploy and configure the Azure Monitor agents, enable monitoring with Azure
Monitor for VMs and Azure Security Center, deploy Diagnostic Settings, audit guest configuration settings,
and more.
Introduction to the Azure command-line interface (CLI), our cross-platform command-line experience for managing Azure resources. Also review the introduction to Azure PowerShell. As part of its beginner-level course Learning Azure Management Tools, LinkedIn offers sessions covering the Azure CLI and Azure PowerShell:
Use the Azure CLI.
Get started with Azure PowerShell
Learn how to secure resources using policy, role-based access control, and other Azure services by viewing
Implement resource management security in Azure.
Monitoring Microsoft Azure Resources and Workloads helps you learn how to use Azure monitoring tools to
monitor Azure network resources as well as resources located on-premises.
Learn about planning and designing your monitoring deployments at scale and automating actions by
viewing Azure Monitor best practices and recommendations.
Infrastructure monitoring
Design a Monitoring Strategy for Infrastructure in Microsoft Azure helps you learn foundational knowledge
of monitoring capabilities and solutions in Azure.
How to monitor your Kubernetes clusters provides an intermediate-level deep dive to help you learn about
monitoring your Kubernetes cluster with Azure Monitor for containers.
Learn with Azure Monitor how to monitor data from Azure Storage and HDInsight.
Microsoft Azure Database Monitoring Playbook explores the key monitoring capabilities that can be used to
gain insight and actionable steps for Azure SQL Database, Azure SQL Data Warehouse, and Azure Cosmos
DB.
Monitoring Microsoft Azure Hybrid Cloud Networks is an advanced-level course that helps you learn how to
use Azure monitoring tools to visualize, maintain, and optimize Azure virtual networks and virtual private
network connections for your hybrid cloud implementation.
With Azure Arc for servers, learn how you can manage your Windows and Linux machines hosted outside of
Azure similarly to how you manage native Azure virtual machines.
How to monitor your VMs provides an intermediate-level deep dive to help you learn about monitoring your
hybrid machines or servers, and Azure VM or virtual machine scale sets with Azure Monitor for VMs.
Application monitoring
Understand how Azure Monitor helps you view availability and performance of your applications and
services together from one place. Pluralsight offers the following courses to help:
Microsoft Azure DevOps Engineer: Optimize Feedback Mechanisms helps prepare you to use
Azure Monitor, including Application Insights, to monitor and optimize your web applications.
Capture and view page load times in your Azure web app. Get started with this course on using Azure
Monitor Application Insights for end-to-end monitoring of your application components running in
Azure.
Microsoft Azure Database Monitoring Playbook helps you learn how to implement and use
monitoring of Azure SQL Database, Azure SQL Data Warehouse, and Azure Cosmos DB.
Instrument Applications with Azure Monitor Application Insights is a deep-dive course on using the
Application Insights SDK to collect telemetry and events from an app with Angular and Node.js
components.
Application Debugging and Profiling is a recording from a Microsoft conference session on using and
interpreting data provided by the Azure Monitor Application Insights Snapshot Debugger and Profiler.
Data analysis
Learn how to write log queries in Azure Monitor. The Kusto Query Language is the primary tool for writing Azure Monitor log queries to explore and analyze the log data collected from Azure and hybrid resources, including application dependencies and the live application.
Kusto Query Language (KQL) from Scratch is a comprehensive course that includes detailed examples
covering a wide range of use-cases and techniques for log analysis in Azure Monitor logs.
Other considerations
Customers often struggle to manage, maintain, and deliver the outcomes that the business (and the IT organization) expects from the services that IT is charged with delivering. Monitoring is considered core to managing infrastructure and the business, with a focus on measuring quality of service and customer experience. To achieve those goals, lay the groundwork by using ITSM in conjunction with DevOps, which will help the monitoring
team mature how they manage, deliver, and support the monitoring service. Adopting an ITSM framework allows
the monitoring team to function as a provider and gain recognition as a trusted business enabler by aligning to the
strategic goals and needs of the organization.
Review the ITIL 4 and cloud computing white paper to understand the updates made to ITIL, the most popular ITSM framework; the paper focuses on joining existing ITIL guidance with best practices from DevOps, agile, and lean. Also consider the IT4IT reference architecture, which delivers an alternative blueprint for transforming IT by using a process-agnostic framework.
Learn more
To discover additional learning paths, browse the Microsoft Learn catalog. Use the Roles filter to align learning
paths with your role.
Centralize management operations
10/30/2020 • 2 minutes to read • Edit Online
For most organizations, using a single Azure Active Directory (Azure AD) tenant for all users simplifies management operations and reduces maintenance costs. This is because all management tasks can be performed by designated users, user groups, or service principals within that tenant.
We recommend that you use only one Azure AD tenant for your organization, if possible. However, some situations
might require an organization to maintain multiple Azure AD tenants for the following reasons:
They are wholly independent subsidiaries.
They're operating independently in multiple geographies.
Certain legal or compliance requirements apply.
There are acquisitions of other organizations (sometimes temporary until a long-term tenant consolidation
strategy is defined).
When a multiple-tenant architecture is required, Azure Lighthouse provides a way to centralize and streamline
management operations. Subscriptions from multiple tenants can be onboarded for Azure delegated resource
management. This option allows specified users in the managing tenant to perform cross-tenant management
functions in a centralized and scalable manner.
For example, let's say your organization has a single tenant, Tenant A. The organization then acquires two additional tenants, Tenant B and Tenant C, and you have business reasons that require you to maintain them as separate tenants.
Your organization wants to use the same policy definitions, backup practices, and security processes across all
tenants. Because you already have users (including user groups and service principals) that are responsible for
performing these tasks within Tenant A, you can onboard all of the subscriptions within Tenant B and Tenant C so
that those same users in Tenant A can perform those tasks. Tenant A then becomes the managing tenant for Tenant
B and Tenant C.
As your enterprise begins to operate workloads in Azure, the next step is to establish a process for operational fitness review. This process enumerates, implements, and iteratively reviews the nonfunctional requirements for
these workloads. Nonfunctional requirements are related to the expected operational behavior of the service.
There are five essential categories of nonfunctional requirements, known as the pillars of architecture excellence:
Cost optimization
Operational excellence
Performance efficiency
Reliability
Security
A process for operational fitness review ensures that your mission-critical workloads meet the expectations of your
business with respect to the pillars.
Create a process for operational fitness review to fully understand the problems that result from running
workloads in a production environment, and how to remediate and resolve those problems. This article outlines a
high-level process for operational fitness review that your enterprise can use to achieve this goal.
At a high level, the process has two phases. In the prerequisites phase, the requirements are established and
mapped to supporting services. This phase occurs infrequently: perhaps annually or when new operations are
introduced. The output of the prerequisites phase is used in the flow phase. The flow phase occurs more frequently,
such as monthly.
Prerequisites phase
The steps in this phase capture the requirements for conducting a regular review of the important services.
1. Identify critical business operations. Identify the enterprise's mission-critical business operations.
Business operations are independent from any supporting service functionality. In other words, business
operations represent the actual activities that the business needs to perform and that are supported by a set
of IT services.
The term mission-critical (or business critical) reflects a severe impact on the business if the operation is
impeded. For example, an online retailer might have a business operation, such as "enable a customer to
add an item to a shopping cart" or "process a credit card payment." If either of these operations fails, a
customer can't complete the transaction and the enterprise fails to realize sales.
2. Map operations to services. Map the critical business operations to the services that support them. In the
shopping-cart example, several services might be involved, including an inventory stock-management
service and a shopping-cart service. To process a credit-card payment, an on-premises payment service
might interact with a third-party, payment-processing service.
3. Analyze service dependencies. Most business operations require orchestration among multiple
supporting services. It's important to understand the dependencies between the services, and the flow of
mission-critical transactions through these services.
Also consider the dependencies between on-premises services and Azure services. In the shopping-cart
example, the inventory stock-management service might be hosted on-premises and ingest data entered by
employees from a physical warehouse. However, it might store data off-premises in an Azure service, such
as Azure Storage, or a database, such as Azure Cosmos DB.
An output from these activities is a set of scorecard metrics for service operations. The scorecard measures criteria
such as availability, scalability, and disaster recovery. Scorecard metrics express the operational criteria that you
expect the service to meet. These metrics can be expressed at any level of granularity that's appropriate for the
service operation.
The scorecard should be expressed in simple terms to facilitate meaningful discussion between the business
owners and engineering. For example, a scorecard metric for scalability might be color-coded in a simple way.
Green means meeting the defined criteria, yellow means failing to meet the defined criteria but actively
implementing a planned remediation, and red means failing to meet the defined criteria with no plan or action.
It's important to emphasize that these metrics should directly reflect business needs.
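A minimal sketch of such a scorecard entry, using hypothetical service operations and thresholds, might look like this:

```python
from dataclasses import dataclass

# Hypothetical scorecard entry for one service operation; the metric names and
# thresholds are illustrative, not prescriptive.
@dataclass
class ScorecardMetric:
    service_operation: str
    criterion: str            # for example, "availability" or "scalability"
    target: float             # agreed business target
    measured: float           # value observed from production telemetry
    remediation_planned: bool = False

    @property
    def status(self) -> str:
        if self.measured >= self.target:
            return "green"    # meets the defined criteria
        if self.remediation_planned:
            return "yellow"   # failing, but remediation is underway
        return "red"          # failing, with no plan or action

scorecard = [
    ScorecardMetric("shopping cart", "availability", target=99.95, measured=99.97),
    ScorecardMetric("payment processing", "availability", target=99.99, measured=99.90,
                    remediation_planned=True),
]
for metric in scorecard:
    print(f"{metric.service_operation} / {metric.criterion}: {metric.status}")
```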
Service-review phase
The service-review phase is the core of the operational fitness review. It involves these steps:
1. Measure service metrics. Use the scorecard metrics to monitor the services, to ensure that the services
meet the business expectations. Service monitoring is essential. If you can't monitor a set of services with
respect to the nonfunctional requirements, consider the corresponding scorecard metrics to be red. In this
case, the first step for remediation is to implement the appropriate service monitoring. For example, if the
business expects a service to operate with 99.99 percent availability, but there is no production telemetry in
place to measure availability, assume that you're not meeting the requirement.
2. Plan remediation. For each service operation for which metrics fall below an acceptable threshold,
determine the cost of remediating the service to bring operation to an acceptable level. If the cost of
remediating the service is greater than the expected revenue generation of the service, move on to consider
the intangible costs, such as customer experience. For example, if customers have difficulty placing a
successful order by using the service, they might choose a competitor instead.
3. Implement remediation. After the business owners and engineering team agree on a plan, implement it.
Report the status of the implementation whenever you review scorecard metrics.
This process is iterative, and ideally your enterprise has a team dedicated to it. This team should meet regularly to
review existing remediation projects, kick off the fundamental review of new workloads, and track the enterprise's
overall scorecard. The team should also have the authority to hold remediation teams accountable if they're behind
schedule or fail to meet metrics.
Recommended resources
Microsoft Azure Well-Architected Framework: Learn about guiding tenets for improving the quality of a
workload. The framework consists of five pillars of architecture excellence:
Cost optimization
Operational excellence
Performance efficiency
Reliability
Security
Ten design principles for Azure applications. Follow these design principles to make your application more
scalable, resilient, and manageable.
Designing resilient applications for Azure. Build and maintain reliable systems using a structured approach over
the lifetime of an application, from design and implementation to deployment and operations.
Cloud design patterns. Use design patterns to build applications on the pillars of architecture excellence.
Azure Advisor. Azure Advisor provides personalized recommendations based on your usage and configurations
to help optimize your resources for high availability, security, performance, and cost.
IT management and operations in the cloud
10/30/2020 • 2 minutes to read • Edit Online
As a business moves to a cloud-based model, the importance of proper management and operations can't be
overstated. Unfortunately, few organizations are prepared for the IT management shift that's required for success in
building a cloud-first operating model. This section of the Cloud Adoption Framework outlines the operating
model, processes, and tooling that have proven successful in the cloud. Each of these areas represents a minor but
fundamental change in the way the business should view IT operations and management as it begins to adopt the
cloud.
Cloud management
The historical IT operating model was sufficient for over 20 years. But that model is now outdated and is less
desirable than cloud-first alternatives. When IT management teams move to the cloud, they have an opportunity to
rethink this model and drive greater value for the business. This article series outlines a modernized model of IT
management.
Next steps
For a deeper understanding of the new cloud management model, start with Understand business alignment.
Understand business alignment
Create business alignment in cloud management
10/30/2020 • 2 minutes to read • Edit Online
In on-premises environments, IT assets (applications, virtual machines, VM hosts, disks, servers, devices, and
data sources) are managed by IT to support workload operations. In IT terms, a workload is a collection of IT
assets that support a specific business operation. To help support business operations, IT management delivers
processes that are designed to minimize disruptions to those assets. When an organization moves to the cloud,
management and operations shift a bit, creating an opportunity to develop tighter business alignment.
Business vernacular
The first step in creating business alignment is to ensure term alignment. IT management, like most
engineering professions, has amassed a collection of jargon, or highly technical terms. Such terms can lead to
confusion for business stakeholders and make it difficult to map management services to business value.
Fortunately, the process of developing a cloud adoption strategy and cloud adoption plan creates an ideal
opportunity to remap these terms. The process also creates opportunities to rethink commitments to
operational management, in partnership with the business. The following article series walks you through this
new approach across three specific terms that can help improve conversations among business stakeholders:
Criticality: Mapping workloads to business processes. Ranking criticality to focus investments.
Impact: Understanding the impact of potential outages to aid in evaluating return on investment for cloud management.
Commitment: Developing true partnerships by creating and documenting agreements with the business.
NOTE
Underlying these terms are classic IT terms such as SLA, RTO, and RPO. Mapping specific business and IT terms is
covered in more detail in the Commitment article.
Next steps
Start creating business alignment by defining workload criticality.
Define workload criticality
Business criticality in cloud management
10/30/2020 • 3 minutes to read • Edit Online
Across every business, there exist a small number of workloads that are too important to fail. These workloads are
considered mission critical. When those workloads experience outages or performance degradation, the adverse
impact on revenue and profitability can be felt across the entire company.
At the other end of the spectrum, some workloads can go months at a time without being used. Poor performance
or outages for those workloads are not desirable, but the impact is isolated and limited.
Understanding the criticality of each workload in the IT portfolio is the first step toward establishing mutual
commitments to cloud management. The following diagram illustrates a common alignment between the
criticality scale to follow and the standard commitments made by the business.
Criticality scale
The first step in any business criticality alignment effort is to create a criticality scale. The following table presents
a sample scale to be used as a reference, or template, for creating your own scale.
Unit-critical Affects the mission of a specific business unit and its profit-
and-loss statements.
It's common for businesses to include additional criticality classifications that are specific to their industry, vertical,
or specific business processes. Examples of additional classifications include:
Compliance-critical: In heavily regulated industries, some workloads might be critical as part of an effort to
maintain compliance requirements.
Security-critical: Some workloads might not be mission critical, but outages could result in loss of data or
unintended access to protected information.
Safety-critical: When lives or the physical safety of employees and customers is at risk during an outage, it
can be wise to classify workloads as safety-critical.
Next steps
After your team has defined business criticality, you can calculate and record business impact.
Calculate and record business impact
Business impact in cloud management
5/22/2020 • 4 minutes to read • Edit Online
Assume the best, prepare for the worst. In IT management, it's safe to assume that the workloads required to
support business operations will be available and will perform within agreed-upon constraints, based on the
selected criticality. However, to manage investments wisely, it's important to understand the impact on the
business when an outage or performance degradation occurs. This importance is illustrated in the following
graph, which maps potential business interruptions of specific workloads to the business impact of outages across
a relative value scale.
To create a fair basis of comparison for the impact on various workloads across a portfolio, a time/value metric is
suggested. The time/value metric captures the adverse impact of a workload outage. Generally, this impact is
recorded as a direct loss of revenue or operating revenue during a typical outage period. More specifically, it
calculates the amount of lost revenue for a unit of time. The most common time/value metric is Impact per hour,
which measures operating revenue losses per hour of outage.
A few approaches can be used to calculate impact. You can apply any of the options in the following sections to
achieve similar outcomes. It's important to use the same approach for each workload when you calculate
projected losses across a portfolio.
Calculate time
Depending on the nature of the workload, you could calculate losses differently. For high-paced transactional
systems such as a real-time trading platform, losses per millisecond might be significant. Less frequently used
systems, such as payroll, might not be used every hour. Whether the frequency of usage is high or low, it's
important to normalize the time variable when you calculate financial impact.
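For example, a small sketch of normalizing to an Impact-per-hour figure, using purely hypothetical revenue and usage numbers, shows why the normalization matters:

```python
# Normalize financial impact to a common Impact-per-hour figure, regardless of
# how frequently the workload is actually used. All figures are hypothetical.

def impact_per_hour(revenue_at_risk: float, hours_in_use: float) -> float:
    """Revenue lost per hour of outage during the hours the workload is in use."""
    return revenue_at_risk / hours_in_use

# A trading platform in constant use: $2.4M of monthly operating revenue over 720 hours.
trading = impact_per_hour(2_400_000, 720)

# A payroll system used only 16 hours per month, but processing $800K each run.
payroll = impact_per_hour(800_000, 16)

print(f"Trading platform: ${trading:,.0f}/hour, payroll: ${payroll:,.0f}/hour")
```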
Next steps
After the business has defined impact, you can align commitments.
Align management commitments with the business
Business commitment in cloud management
10/30/2020 • 9 minutes to read • Edit Online
Defining business commitment is an exercise in balancing priorities. The objective is to align the proper level of
operational management at an acceptable operating cost. Finding that balance requires a few data points and
calculations, which we've outlined in this article.
Commitments to business stability, via technical resiliency or other service-level agreement (SLA) impacts, are a
business justification decision. For most workloads in an environment, a baseline level of cloud management is
sufficient. For others, a 2x to 4x cost increase is easily justified because of the potential impact of any business
interruptions.
The previous articles in this series can help you understand the classification and impact of interruptions to
various workloads. This article helps you calculate the returns. As illustrated in the preceding image, each level of
cloud management has inflection points where cost can rise faster than increases in resiliency. Those inflection
points will prompt detailed business decisions and business commitments.
IT operations prerequisites
The Azure Management Guide outlines the management tools that are available in Azure. Before reaching a
commitment with the business, IT should determine an acceptable standard-level management baseline to be
applied to all managed workloads. IT would then calculate a standard management cost for each of the managed
workloads in the IT portfolio, based on counts of CPU cores, disk space, and other asset-related variables. IT
would also estimate a composite SLA for each workload, based on the architecture.
TIP
IT operations teams often use a default minimum of 99.9 percent uptime for the initial composite SLA. They might also
choose to normalize management costs based on the average workload, especially for solutions with minimal logging and
storage needs. Averaging the costs of a few medium criticality workloads can provide a starting point for initial
conversations.
TIP
If you're using the operations management workbook to plan for cloud management, the operations management fields
should be updated to reflect these prerequisites. Those fields include Commitment level, Composite SLA, and Monthly cost.
Monthly cost should represent the cost of the added operational management tools on a monthly basis.
The operations management baseline serves as an initial starting point to be validated in each of the following
sections.
Management responsibility
In a traditional on-premises environment, the cost of managing the environment is commonly assumed to be a
sunk cost that's owned by IT operations. In the cloud, management is a purposeful decision with direct budgetary
impact. The costs of each management function can be more directly attributed to each workload that's deployed
to the cloud. This approach allows for greater control, but it does create a requirement for cloud operations teams
and cloud strategy teams to first commit to an agreement about responsibilities.
Organizations might also choose to outsource some of their ongoing management functions to a service
provider. These service providers can use Azure Lighthouse to give organizations more precise control in
granting access to their resources, along with greater visibility into the actions performed by the service
providers.
Delegated responsibility: Because there's no need to centralize and assume operational management
overhead, IT operations for many organizations are considering new approaches. One common approach
is referred to as delegated responsibility. In a cloud center of excellence model, platform operations and
platform automation provide self-service management tools that can be used by business-led operations
teams, independent of a centralized IT operations team. This approach gives business stakeholders
complete control over management-related budgets. It also allows the cloud center of excellence (CCoE)
team to ensure that a minimum set of guardrails has been properly implemented. In this model, IT acts as
a broker and a guide to help the business make wise decisions. Business operations oversee day-to-day
operations of dependent workloads.
Centralized responsibility: Compliance requirements, technical complexity, and some shared service
models might require a Central IT team model. In this model, IT continues to exercise its operations
management responsibilities. Environmental design, management controls, and governance tooling might
be centrally managed and controlled, which restricts the role of business stakeholders in making
management commitments. But the visibility into the cost and architecture of cloud approaches makes it
much easier for centralized IT to communicate the cost and level of management for each workload.
Mixed model: Classification is at the heart of a mixed model of management responsibilities. Companies
that are in the midst of a transformation from on-premises to cloud might require an on-premises-first
operating model for a while. Companies with strict compliance requirements, or that depend on long-term
contracts with IT outsourcing vendors, might require a centralized operating model.
Regardless of their constraints, today's businesses must innovate. When rapid innovation must flourish, in
the midst of a central-IT, centralized-responsibility model, a mixed-model approach might provide balance.
In this approach, a central IT team provides a centralized operating model for all workloads that are
mission-critical or contain sensitive information. At the same time, all other workload classifications might
be placed in a cloud environment that's designed for delegated responsibilities. The centralized
responsibility approach serves as the general operating model. The business then has flexibility to adopt a
specialized operating model, based on its required level of support and sensitivity.
The first step is committing to a responsibility approach, which then shapes the following commitments.
Which organization will be responsible for day-to-day operations management for this workload?
Cloud tenancy
For most businesses, management is easier when all assets reside in a single tenant. However, some
organizations might need to maintain multiple tenants. To learn why a business might require a multitenant
Azure environment, see Centralize management operations with Azure Lighthouse.
Will this workload reside in a single Azure tenant, alongside all other workloads?
Soft-cost factors
The next section outlines an approach to comparative returns that are associated with levels of management
processes and tooling. At the end of that section, the cost of managing each analyzed workload is measured relative to the forecast impact of business disruptions. That approach provides a relatively easy way to
understand whether an investment in richer management approaches is warranted.
Before you run the numbers, it's important to look at the soft-cost factors. Soft-cost factors produce a return, but
that return is difficult to measure through direct hard-cost savings that would be visible in a profit-and-loss
statement. Soft-cost factors are important because they can indicate a need to invest in a higher level of
management than is fiscally prudent.
A few examples of soft-cost factors would include:
Daily workload usage by the board or CEO.
Workload usage by the top x% of customers that leads to a greater revenue impact elsewhere.
Impact on employee satisfaction.
The next data point that's required to make a commitment is a list of soft-cost factors. These factors don't need to
be documented at this stage, but business stakeholders should be aware of the importance of these factors and
their exclusion from the following calculations.
TIP
If you're using the operations management workbook to plan for cloud management, update the operations management
fields to reflect each conversation. Those fields include Commitment level, Composite SLA, and Monthly cost.
Monthly cost should represent the monthly cost of the added operational management tools. After they're updated, the
fields will update the ROI formulas and each of the following fields.
The workbook uses the default value of 8,760 hours per year.
Standard loss impact
Standard loss impact (labeled Standard Impact in the workbook) forecasts the financial impact of any outage,
assuming that the estimated outage prediction proves accurate. To calculate this forecast without using the
workbook, apply the following formula:
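As a rough approximation, assuming the workbook's 8,760-hours-per-year default and the Impact-per-hour metric described earlier:

Standard loss impact = Impact per hour × (8,760 hours × (100% − current composite SLA))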
This serves as a baseline for cost, should the business stakeholders choose to invest in a higher level of
management.
Composite -SLA impact
Composite-SLA impact (labeled Commitment level impact in the workbook) provides updated fiscal impact, based
on the changes to the uptime SLA. This calculation allows you to compare the projected financial impact of both
options. To calculate this forecast impact without the spreadsheet, apply the following formula:
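As a rough approximation, substituting the proposed commitment-level SLA for the current composite SLA:

Composite-SLA impact = Impact per hour × (8,760 hours × (100% − proposed composite SLA))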
The value represents the potential losses to be avoided by the changed commitment level and new composite
SLA.
Comparison basis
Comparison basis evaluates standard impact and composite SLA impact to determine which is most appropriate
in the return column.
Return on loss avoidance
If the cost of managing a workload exceeds the potential losses, the proposed investment in cloud management
might not be fruitful. To compare the Return on Loss Avoidance, see the column labeled Annual ROI. To
calculate this column on your own, use the following formula:
Return on Loss Avoidance = (Comparison basis − (Monthly cost × 12)) ÷ (Monthly cost × 12)
Unless there are other soft-cost factors to consider, this comparison can quickly suggest whether there should be
a deeper investment in cloud operations, resiliency, reliability, or other areas.
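A minimal Python sketch of the end-to-end calculation, treating the comparison basis as the losses avoided by the higher commitment level (an assumption; the workbook's exact formulas may differ):

```python
# Illustrative loss-avoidance comparison based on the formulas sketched above.
HOURS_PER_YEAR = 8760  # the workbook's default

def loss_impact(impact_per_hour: float, composite_sla: float) -> float:
    """Forecast financial impact of outages at a given composite SLA (e.g. 0.999)."""
    return impact_per_hour * HOURS_PER_YEAR * (1 - composite_sla)

def return_on_loss_avoidance(impact_per_hour: float, current_sla: float,
                             proposed_sla: float, monthly_cost: float) -> float:
    """Compare avoided losses against the annual cost of the richer management level."""
    avoided = loss_impact(impact_per_hour, current_sla) - loss_impact(impact_per_hour, proposed_sla)
    annual_cost = monthly_cost * 12
    return (avoided - annual_cost) / annual_cost

# Example: $5,000/hour impact, moving from 99.9% to 99.95% uptime for $1,000/month.
print(f"{return_on_loss_avoidance(5000, 0.999, 0.9995, 1000):.1%}")
```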
Next steps
After the commitments are made, the responsible operations teams can begin configuring the workload in
question. To get started, evaluate various approaches to inventory and visibility.
Inventory and visibility options
Management leveling across cloud management
disciplines
10/30/2020 • 3 minutes to read • Edit Online
The keys to proper management in any environment are consistency and repeatable processes. There are endless options for what can be done in Azure. Likewise, there are countless approaches to cloud
management. To provide consistency and repeatability, it's important to narrow those options to a consistent set of
management processes and tools that will be offered for workloads hosted in the cloud.
As a starting point, consider establishing the management levels that are shown in the preceding diagram and
suggested in the following list:
Management baseline: A cloud management baseline (or management baseline) is a defined set of tools,
processes, and consistent pricing that serve as the foundation for all cloud management in Azure. To establish a
cloud management baseline and determine which tools to include in the baseline offering to your business,
review the list in the "Cloud management disciplines" section.
Enhanced baseline: Some workloads might require enhancements to the baseline that aren't necessarily
specific to a single platform or workload. Although these enhancements aren't cost effective for every workload,
there should be common processes, tools, and solutions for any workload that can justify the cost of the extra
management support.
Platform specialization: In any given environment, some common platforms are used by a variety of
workloads. This general architectural commonality doesn't change when businesses adopt the cloud. Platform
specialization is an elevated level of management that applies data and architectural subject matter expertise to
provide a higher level of operational management. Examples of platform specialization would include
management functions specific to SQL Server, Containers, Active Directory, or other services that can be better
managed through consistent, repeatable processes, tools, and architectures.
Workload specialization: For workloads that are truly mission critical, there might be a cost justification to
go much deeper into the management of that workload. Workload specialization applies workload telemetry to
determine more advanced approaches to daily management. That same data often identifies automation,
deployment, and design improvements that would lead to greater stability, reliability, and resiliency beyond
what's possible with operational management alone.
Unsupported: It's equally important to communicate common management processes that won't be delivered
through cloud management disciplines for workloads that are classified as not supported or not critical.
Organizations might also choose to outsource functions related to one or more of these management levels to a
service provider. These service providers can use Azure Lighthouse to provide greater precision and transparency.
The remaining articles in this series outline processes that are commonly found within each of these disciplines. In
parallel, the Azure Management Guide demonstrates the tools that can support each of those processes. For
assistance with building your management baseline, start with the Azure Management Guide. After you've
established the baseline, this article series and the accompanying best practices can help expand that baseline to
define other levels of management support.
Next steps
The next step toward defining each level of cloud management is an understanding of inventory and visibility.
Inventory and visibility options
Inventory and visibility in cloud management
10/30/2020 • 6 minutes to read • Edit Online
Operational management has a clear dependency on data. Consistent management requires an understanding
about what is managed (inventory) and how those managed workloads and assets change over time (visibility).
Clear insights about inventory and visibility help empower the team to manage the environment effectively. All
other operational management activities and processes build on these two areas.
A few classic phrases about the importance of measurements set the tone for this article:
Manage what matters.
You can only manage what you can measure.
If you can't measure it, it might not matter.
The inventory and visibility discipline builds on these timeless phrases. Before you can effectively establish
operational management processes, it's important to gather data and create the right level of visibility for the
right teams.
Processes
Perhaps more important than the features of the cloud management platform, the cloud management processes
will realize operations commitments with the business. Any cloud management methodology should include, at a
minimum, the following processes:
Reactive monitoring: When deviations adversely affect business operations, who addresses those
deviations? What actions do they take to remediate the deviations?
Proactive monitoring: When deviations are detected but business operations are not affected, how are
those deviations addressed, and by whom?
Commitment reporting: How is adherence to the business commitment communicated to business
stakeholders?
Budgetary reviews: What is the process for reviewing those commitments against budgeted costs? What is
the process for adjusting the deployed solution or the commitments to create alignment?
Escalation paths: What escalation paths are available when any of the preceding processes fail to meet the
needs of the business?
There are several more processes related to inventory and visibility. The preceding list is designed to provoke
thought within the operations team. Answering these questions will help develop some of the necessary
processes, as well as likely trigger new, deeper questions.
Responsibilities
When you're developing processes for operational monitoring, it's equally important to determine responsibilities
for daily operation and regular support of each process.
In a centralized IT organization, IT provides the operational expertise. The business would be consultative in
nature, when issues require remediation.
In a cloud center of excellence organization, business operations would provide the expertise and hold
responsibility for management of these processes. IT would focus on the automation and support of teams, as
they operate the environment.
But these are the common responsibilities. Organizations often require a mixture of responsibilities to meet
business commitments.
Next steps
Operational compliance builds on inventory capabilities by applying management automation and controls. See
how operational compliance maps to your processes.
Plan for operational compliance
Operational compliance in cloud management
5/12/2020 • 2 minutes to read • Edit Online
Operational compliance builds on the discipline of inventory and visibility. As the first actionable step of cloud
management, this discipline focuses on regular telemetry reviews and remediation efforts (both proactive and
reactive remediation). This discipline is the cornerstone for maintaining balance between security, governance,
performance, and cost.
Next steps
Protection and recovery are the next areas to consider in a cloud management baseline.
Protect and recover
Protect and recover in cloud management
10/30/2020 • 5 minutes to read • Edit Online
After they've met the requirements for inventory and visibility and operational compliance, cloud management
teams can anticipate and prepare for a potential workload outage. As they're planning for cloud management, the
teams must start with an assumption that something will fail.
No technical solution can consistently offer a 100 percent uptime SLA. Solutions with the most redundant
architectures claim to deliver on "six 9s" or 99.9999 percent uptime. But even a "six 9s" solution goes down for
31.6 seconds in any given year. Sadly, it's rare for a solution to warrant the large, ongoing operational investment that's required to reach "six 9s" of uptime.
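A quick, illustrative calculation makes the uptime trade-off concrete:

```python
# Annual downtime implied by common uptime SLAs (illustrative only).
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # about 31,557,600 seconds

for label, uptime in [("three 9s", 0.999), ("four 9s", 0.9999),
                      ("five 9s", 0.99999), ("six 9s", 0.999999)]:
    downtime = SECONDS_PER_YEAR * (1 - uptime)
    print(f"{label} ({uptime:.4%} uptime): {downtime:,.1f} seconds of downtime per year")
```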
Preparation for an outage allows the team to detect failures sooner and recover more quickly. The focus of this
discipline is on the steps that come immediately after a system fails. How do you protect workloads, so that they
can be recovered quickly when an outage occurs?
Next steps
After this management baseline component is met, the team can look ahead to avoid outages in its platform
operations and workload operations.
Platform operations Workload operations
Platform operations in cloud management
10/30/2020 • 6 minutes to read • Edit Online
A cloud management baseline that spans inventory and visibility, operational compliance, and protection and
recovery might provide a sufficient level of cloud management for most workloads in the IT portfolio. However,
that baseline is seldom enough to support the full portfolio. This article builds on the most common next step in cloud management, platform operations.
A quick study of the assets in the IT portfolio highlights patterns across the workloads that are being supported.
Within those workloads, there will be common platforms. Depending on the past technical decisions within the
company, those platforms could vary widely.
For some organizations, there will be a heavy dependence on SQL Server, Oracle, or other open-source data
platforms. In other organizations, the commonalities might be rooted in the hosting platforms for virtual machines
(VMs) or containers. Still others might have a common dependency on applications or Enterprise Resource
Planning (ERP) systems, such as SAP, Oracle, or others.
By understanding these commonalities, the cloud management team can specialize in higher levels of support for
those prioritized platforms.
NOTE
Building a service catalog requires a great deal of effort and time from multiple teams. Using the service catalog or
approved list as a gating mechanism will slow innovation. When innovation is a priority, service catalogs should be developed in parallel with other adoption efforts.
Next steps
In parallel with improvements to platform operations, cloud management teams also focus on improving
workload operations for the top 20 percent or less of production workloads.
Improve workload operations
Workload operations in cloud management
10/30/2020 • 5 minutes to read • Edit Online
Some workloads are critical to the success of the business. For those workloads, a management baseline is
insufficient to meet the required business commitments to cloud management. Platform operations might not
even be sufficient to meet business commitments. This highly important subset of workloads requires a
specialized focus on the way the workload functions and how it is supported.
In return, the investment in workload operations can lead to improved performance, decreased risk of business
interruption, and faster recovery when system failures occur. This article discusses an approach to investing in the
continued operations of these high priority workloads to drive improved business commitments.
Continued observation
Initial data and ongoing telemetry can help formulate and test theories about the performance of a workload. But
ongoing workload operations are rooted in a continued and expanded observation of workload performance,
with a heavy focus on application and data performance.
Test the automation
At the application level, the first requirement of workload operations is an investment in deep testing. For any
application that's supported through workload operations, a test plan should be established and regularly
executed to deliver functional and scale testing across the applications.
Regular test telemetry can provide immediate validation of various hypotheses about the operation of the
workload. Improvements to operational and architectural patterns can be implemented and tested. The resulting deltas
provide a clear impact analysis to guide continued investments.
Understand releases
A clear understanding of release cycles and release pipelines is an important element of workload operations.
An understanding of cycles can prepare for potential interruptions and allow the team to proactively address any
releases that might produce an adverse effect on operations. This understanding also allows the cloud
management team to partner with adoption teams to continuously improve the quality of the product and
address any bugs that might affect stability.
More importantly, an understanding of release pipelines can significantly improve the recovery point objective
(RPO) of a workload. In many scenarios, the fastest and most accurate path to the recovery of an application is a
release pipeline. For application layers that change only when a new release happens, it might be wise to invest
more heavily in pipeline optimization than on the recovery of the application from traditional back-up processes.
Although a deployment pipeline can be the fastest path to recovery, it can also be the fastest path to remediation.
When an application has a fast, efficient, and reliable release pipeline, the cloud management team has an option
to automate deployment to a new host as a form of automated remediation.
There might be many other faster, more effective mechanisms for remediation and recovery. However, when the
use of an existing pipeline can meet business commitments and capitalize on existing DevOps investments, the
existing pipeline might be a viable alternative.
Clearly communicate changes to the workload
Change to any workload is among the biggest risks to workload operations. For any workload in the workload
operations level of cloud management, the cloud management team should closely align with the cloud adoption
teams to understand the changes coming from each release. This investment in proactive understanding will have
a direct, positive impact on operational stability.
Improve outcomes
The data and communication investments in a workload will yield suggestions for improvements to ongoing
operations in one of three areas:
Technical debt resolution
Automated remediation
Improved system design
Technical debt resolution
The best workload operations plans still require remediation. As your cloud management team seeks to stay
connected to understand adoption efforts and releases, the team likewise should regularly share remediation
requirements to ensure that technical debt and bugs are a continued priority for your development teams.
Automated remediation
By applying the Pareto principle, we can say that 80 percent of negative business impact likely comes from 20
percent of the service incidents. When those incidents can't be addressed in normal development cycles,
investments in remediation automation can significantly reduce business interruptions.
Improved system design
In the cases of technical debt resolution and automated remediation, system flaws are the common cause of most
system outages. You can have the greatest impact on overall workload operations by adhering to a few design
principles:
Scalability: The ability of a system to handle increased load.
Availability: The percentage of time that a system is functional and working.
Resiliency: The ability of a system to recover from failures and continue to function.
Management: Operations processes that keep a system running in production.
Security: Protecting applications and data from threats.
To help improve overall operations, the Microsoft Azure Well-Architected Framework provides an approach to
evaluating specific workloads for adherence to these pillars. Apply the pillars to both platform operations and
workload operations.
Next steps
With a full understanding of the manage methodology within the Cloud Adoption Framework, you are now
armed to implement cloud management principles. Learn how to make this methodology actionable within your
operations environment.
Apply this methodology
Apply design principles and advanced operations
10/30/2020 • 6 minutes to read • Edit Online
The first three cloud management disciplines describe a management baseline. At a minimum, a management
baseline should include a standard business commitment to minimize business interruptions and accelerate
recovery if service is interrupted. Most management baselines include a disciplined focus on maintaining
"inventory and visibility," "operational compliance," and "protection and recovery."
The purpose of a management baseline is to create a consistent offering that provides a minimum level of
business commitment for all supported workloads. This baseline of common, repeatable management offerings
allows the team to deliver a highly optimized degree of operational management, with minimal deviation. But
that standard offering might not provide a rich enough commitment to the business.
The diagram in the next section illustrates three ways to go beyond the management baseline.
The management baseline should meet the minimum commitment required by 80 percent of the lowest criticality
workloads in the portfolio. The baseline should not be applied to mission-critical workloads. Nor should it be
applied to common platforms that are shared across workloads. Those workloads require a focus on design
principles and advanced operations.
Management specialization
Aspects of workload and platform operations might require changes to design and architecture principles. Those
changes could take time and might result in increased operating expenses. To reduce the number of workloads
requiring such investments, an enhanced management baseline could provide enough of an improvement to the
business commitment.
For workloads that warrant a higher investment to meet a business commitment, specialization of operations is
key.
Cloud adoption can't happen without well-organized people. Successful cloud adoption is the result of properly
skilled people doing the appropriate types of work, in alignment with clearly defined business goals, and in a
well-managed environment. To deliver an effective operating model for the cloud, it's important to establish
appropriately staffed organizational structures. This article outlines an approach to establishing and maintaining
the proper organizational structures in four steps.
The following exercises will help guide the process of establishing the organizational structures needed to support cloud adoption.
Structure type
The following organizational structures do not necessarily have to map to an organizational chart (org chart).
Org charts generally reflect command and control management structures. Conversely, the following
organizational structures are designed to capture alignment of roles and responsibilities. In an agile, matrix
organization, these structures may be best represented as virtual teams. There is no limitation suggesting that
these organizational structures couldn't be represented in an org chart, but it is not necessary in order to
produce an effective operating model.
The first step of managing organizational alignment is to determine how the following organizational structures
will be fulfilled:
Org chart alignment: Management hierarchies, manager responsibilities, and staff alignment will align to
organizational structures.
Virtual teams: Management structures and org charts remain unchanged. Instead, virtual teams will be
created and tasked with the required functions.
Mixed model: More commonly, a mixture of org chart and virtual team alignment will be required to deliver
on transformation goals.
The article on determining organizational structure maturity provides additional detail regarding each level of
maturity.
To track organization structure decisions over time, download and modify the RACI template.
Cloud strategy functions
A cloud strategy team defines motivations and business outcomes, and validates and maintains alignment
between business priorities and cloud adoption efforts. In the absence of a defined cloud strategy team,
someone must still provide the functionality that aligns technical activities to business outcomes. That same
person or group should also manage change across the project.
Cloud strategy functions are commonly provided by the following types of roles. When a cloud strategy team is
defined, it should include many of the following roles:
Finance
Line of business
Human resources
Operations
Enterprise architecture
IT infrastructure
Application groups
Project managers (often with agile project management experience)
This team helps guide critical prioritization and discovery efforts during cloud adoption. It may also trigger changes
in business processes, the execution of operations, customer interactions, or even product development. If
these functions are confined to IT, the success of cloud adoption efforts will be constrained. To drive true
business change, business leaders should be the primary source of this functionality. A defined cloud strategy
team provides a means for involving key participants in a structured way.
NOTE
The organization's CEO and CIO often assign the team. Assignments are typically based on empowering this team to
drive change that cuts across various organizations within the enterprise. The cloud strategy team members
should be assigned based on the motivations for cloud adoption, business outcomes, and relevant financial models.
Preparation
Learn the business value of Microsoft Azure.
Learn how the Cloud Adoption Framework can help you align the strategy for business, people, and
technology.
Review the cloud adoption strategy process.
Download the strategy and plan template.
Minimum scope
Align business stakeholders to maximize the business value of cloud adoption investments.
Whenever possible, business outcomes and the cloud strategy should both be defined early in the process. As
investments in cloud adoption grow and business value is realized, business stakeholders often become
more engaged. When cloud adoption efforts are led by the business, the focus might be on an operating model
and the organization.
Establish a vision
Adoption motivations: Document and articulate the reasons behind the technical effort.
Business outcomes: Clearly articulate what's expected of the technical team in terms of business changes.
Learning metrics: Establish short-term metrics that can show progress toward longer-term business
outcomes.
Build business justification
Cloud migration business case. Establish a business case for cloud migration.
Rationalize the digital estate
Incremental rationalization: An agile approach to rationalization that properly aligns late-bound technical
decisions.
The five Rs of rationalization: Understand the various rationalization options.
Deliverable
The cloud strategy team drives critical prioritization and discovery efforts during cloud adoption. They may
also change business processes, the execution of operations, customer interactions, or even product
development. The primary focus of the cloud strategy team is to validate and maintain alignment between
business priorities and cloud adoption efforts. Secondarily, this team should focus on change management
across the adoption efforts. The cloud strategy team should be capable of delivering on the following tasks.
Early planning tasks:
Review and provide feedback on business outcomes and financial models.
Aid in establishing clear motivations for cloud adoption that align with corporate objectives.
Define relevant learning metrics that clearly communicate progress toward business outcomes.
Understand business risks introduced by the plan, and represent the business's tolerance for risk.
Review and approve the rationalization of the digital estate.
Ongoing monthly tasks:
Support the cloud governance team during risk/tolerance conversations.
Review release plans to understand timelines and business impact of technical change.
Define business change plans associated with planned releases.
Ensure business teams are ready to execute business testing and the business change plan.
Meeting cadence:
Cloud strategy team members must be able to allocate time to planning and developing the cloud strategy:
During early planning efforts, allocate an hour each week to meet with the team. After the adoption plan is
solidified (usually within 4-6 weeks), the time requirements can be reduced.
Throughout the adoption efforts, allocate 1-2 hours each month to review progress and validate continued
priorities.
Additional time is likely required from delegated members of the executive's team on an as-needed basis.
Each member of the cloud strategy team should appoint a delegate who can allocate 5-10 hours per week
to support ongoing prioritization questions and report on any urgent needs.
Next steps
Start a cloud strategy team
Align your strategy with the cloud adoption functions by creating a cloud adoption team to work with.
Use the RACI template to align responsibilities across teams.
Cloud adoption functions
Cloud adoption functions enable the implementation of technical solutions in the cloud. Like any IT project, the
people delivering the actual work will determine success. The teams providing the necessary cloud adoption
functions can be staffed from multiple subject matter experts or implementation partners.
Cloud adoption teams are the modern-day equivalent of technical implementation teams or project teams. But
the nature of the cloud may require a more fluid team structure. Some teams focus exclusively on cloud
migration, while other teams focus on innovations that take advantage of cloud technologies. Some teams
include the broad technical expertise required to complete large adoption efforts, like a full datacenter
migration. Other teams have a tighter technical focus and may move between projects to accomplish specific
goals. One example would be a team of data platform specialists who help convert SQL VMs to SQL PaaS
instances.
Regardless of the type or number of cloud adoption teams, the functionality required for cloud adoption is
provided by subject matter experts found in IT, business analysis, or implementation partners.
Depending on the desired business outcomes, the skills needed to provide full cloud adoption functions could
include:
Infrastructure implementers
DevOps engineers
Application developers
Data scientists
Data or application platform specialists
For optimal collaboration and efficiency, we recommend that cloud adoption teams have an average team size
of six people. These teams should be self-organizing from a technical execution perspective. We highly
recommend that these teams also include project management expertise, with deep experience in agile, scrum,
or other iterative models. This team is most effective when managed using a flat structure.
Preparation
Create an Azure account: The first step to using Azure is to create an account.
Azure portal: Tour the Azure portal features and services, and customize the portal.
Introduction to Azure: Get started with Azure. Create and configure your first virtual machine in the cloud.
Azure fundamentals: Learn cloud concepts, understand the benefits, compare and contrast basic strategies,
and explore the breadth of services available in Azure.
Review the Migrate methodology.
Minimum scope
The nucleus of all cloud adoption efforts is the cloud migration team. This team drives the technical changes
that enable adoption. Depending on the objectives of the adoption effort, this team may include a diverse range
of team members who handle a broad set of technical and business tasks.
At a minimum, the team scope includes:
Rationalization of the digital estate
Review, validation, and advancement of the prioritized migration backlog
The execution of the first workload as a learning opportunity.
Deliverable
The primary need from any cloud adoption function is the timely, high-quality implementation of the technical
solutions outlined in the adoption plan. These solutions should align with governance requirements and
business outcomes, and should take advantage of technology, tools, and automation solutions that are available
to the team.
Early planning tasks:
Execute the rationalization of the digital estate.
Review, validate, and advance the prioritized migration backlog.
Begin execution of the first workload as a learning opportunity.
Ongoing monthly tasks:
Oversee change management processes.
Manage the release and sprint backlogs.
Build and maintain the adoption landing zone in conjunction with governance requirements.
Execute the technical tasks outlined in the sprint backlogs.
Meeting cadence:
We recommend that teams providing cloud adoption functions be dedicated to the effort full-time.
It's best if these teams meet daily in a self-organizing way. The goal of daily meetings is to quickly update the
backlog and to communicate what has been completed, what will be done today, and what is blocked and
requires additional external support.
Release schedules and iteration durations are unique to each company. But a range of one to four weeks per
iteration seems to be the average duration. Regardless of iteration or release cadence, we recommend that the
team meets with all supporting teams at the end of each release to communicate the outcome of the release and to
reprioritize upcoming efforts. Likewise, it's valuable to meet as a team at the end of each sprint with the cloud
center of excellence or cloud governance team to stay aligned on common efforts and any needs for support.
Some of the technical tasks associated with cloud adoption can become repetitive. Team members should
rotate every 3–6 months to avoid employee satisfaction issues and maintain relevant skills. A rotating seat on a
cloud center of excellence or cloud governance team can provide an excellent opportunity to keep employees
fresh and harness new innovations.
Learn more about the function of a cloud center of excellence or cloud governance team.
Next steps
Build a cloud adoption team
Align cloud adoption efforts with cloud governance functions to accelerate adoption and implement best
practices, while reducing business and technical risks.
Cloud governance functions
A cloud governance team ensures that risks and risk tolerance are properly evaluated and managed. This team
ensures the proper identification of risks that can't be tolerated by the business. The people on this team
convert risks into governing corporate policies.
Depending on the desired business outcomes, the skills needed to provide full cloud governance functions
include:
IT governance
Enterprise architecture
Security
IT operations
IT infrastructure
Networking
Identity
Virtualization
Business continuity and disaster recovery
Application owners within IT
Finance owners
These baseline functions help you identify risks related to current and future releases. These efforts help you
evaluate risk, understand the potential impacts, and make decisions regarding risk tolerance. When doing so,
quickly update plans to reflect the changing needs of the cloud migration team.
Preparation
Review the Govern methodology.
Take the governance benchmark assessment.
Introduction to security in Azure: Learn the basic concepts to protect your infrastructure and data in the
cloud. Understand what responsibilities are yours and what Azure handles for you.
Understand how to work across groups to manage cost.
Minimum scope
Understand business risks introduced by the plan.
Represent the business's tolerance for risk.
Help create a governance MVP.
Involve the following participants in cloud governance activities:
Leaders from middle management and direct contributors in key roles should represent the business and
help evaluate risk tolerances.
The cloud governance functions are delivered by an extension of the cloud strategy team. Just as the CIO
and business leaders are expected to participate in cloud strategy functions, their direct reports are expected
to participate in cloud governance activities.
Business employees who are members of the business unit and work closely with line-of-business leadership
should be empowered to make decisions regarding corporate and technical risk.
Information technology (IT) and information security (IS) employees who understand the technical aspects
of the cloud transformation may serve in a rotating capacity instead of being a consistent provider of cloud
governance functions.
Deliverable
The cloud governance mission is to balance competing forces of transformation and risk mitigation.
Additionally, cloud governance ensures that the cloud migration team is aware of data and asset classification,
as well as architecture guidelines that govern adoption. Governance teams or individuals also work with the
cloud center of excellence to apply automated approaches to governing cloud environments.
Ongoing monthly tasks:
Understand business risks introduced during each release.
Represent the business's tolerance for risk.
Aid in the incremental improvement of policy and compliance requirements.
Meeting cadence:
The time commitment from each team member of the cloud governance team will represent a large percentage
of their daily schedules. Contributions will not be limited to meetings and feedback cycles.
Out of scope
As adoption scales, the cloud governance team may struggle to keep pace with innovations. This is especially
true if your environment has heavy compliance, operations, or security requirements. If this happens you can
shift some responsibilities to an existing IT team to reduce scope for the governance team.
Next steps
Some large organizations have dedicated teams that focus on IT governance. These teams specialize in risk
management across the IT portfolio. When those teams exist, the following maturity models can be accelerated
quickly. But the IT governance team is encouraged to review the cloud governance model to understand how
governance shifts slightly in the cloud. Key articles include extending corporate policy to the cloud and the Five
Disciplines of Cloud Governance.
No governance: It is common for organizations to move into the cloud with
no clear plans for governance. Before long, concerns around security, cost, scale, and operations begin to
trigger conversations about the need for a governance model and people to staff the processes associated with
that model. Starting those conversations before they become concerns is always a good first step to overcome
the antipattern of "no governance." The section on defining corporate policy can help facilitate those
conversations.
Governance blocked: When concerns around security, cost, scale, and operations go unanswered, projects
and business goals tend to get blocked. Lack of proper governance generates fear, uncertainty, and doubt
among stakeholders and engineers. Stop this in its tracks by taking action early. The two governance guides
defined in the Cloud Adoption Framework can help you start small, set initially limiting policies to minimize
uncertainty and mature governance over time. Choose from the complex enterprise guide or standard
enterprise guide.
Voluntary governance: There tend to be brave souls in every enterprise, those gallant few who are willing to
jump in and help the team learn from its mistakes. Often this is how governance starts, especially in smaller
companies. These brave souls volunteer time to fix some issues and push cloud adoption teams toward a
consistent, well-managed set of best practices.
The efforts of these individuals are much better than "no governance" or "governance blocked" scenarios.
While their efforts should be commended, this approach should not be confused with governance. Proper
governance requires more than sporadic support to drive consistency, which is the goal of any good
governance approach. The guidance in the Five Disciplines of Cloud Governance can help develop this
discipline.
Cloud custodian: This moniker has become a badge of honor for many cloud architects who specialize in
early stage governance. When governance practices first start out, the results appear similar to those of
governance volunteers. But there is one fundamental difference. A cloud custodian has a plan in mind. At this
stage of maturity, the team is spending time cleaning up the messes made by the cloud architects who came
before them. But the cloud custodian aligns that effort to well-structured corporate policy. They also use
governance tools, like those outlined in the governance MVP.
Another fundamental difference between a cloud custodian and a governance volunteer is leadership support.
The volunteer puts in extra hours above regular expectations because of their quest to learn and do. The cloud
custodian gets support from leadership to reduce their daily duties to ensure regular allocations of time can be
invested in improving cloud governance.
Cloud guardian: As governance practices solidify and become accepted by cloud adoption teams, the role of
cloud architects who specialize in governance changes a bit, as does the role of the cloud governance team.
Generally, the more mature practices gain the attention of other subject matter experts who can help
strengthen the protections provided by governance implementations.
While the difference is subtle, it is an important distinction when building a governance-focused IT culture. A
cloud custodian cleans up the messes made by innovative cloud architects, and the two roles have natural
friction and opposing objectives. A cloud guardian helps keep the cloud safe, so other cloud architects can
move more quickly with fewer messes. Cloud guardians begin using more advanced governance approaches to
accelerate platform deployment and help teams self-service their environmental needs, so they can move
faster. Examples of these more advanced functions are seen in the incremental improvements to the
governance MVP, such as improvement of the security baseline.
Cloud accelerators: Cloud guardians and cloud custodians naturally harvest scripts and governance tools
that accelerate the deployment of environments, platforms, or even components of various applications.
Curating and sharing these scripts in addition to centralized governance responsibilities develops a high degree
of respect for these architects throughout IT.
Those governance practitioners who openly share their curated scripts help deliver technology projects faster
and embed governance into the architecture of the workloads. This workload influence and support of good
design patterns elevate cloud accelerators to a higher rank of governance specialist.
Global governance: When organizations depend on globally dispersed IT needs, there can be significant
deviations in operations and governance in various geographies. Business unit demands and even local data
sovereignty requirements can cause governance best practices to interfere with required operations. In these
scenarios, a tiered governance model allows for minimally viable consistency and localized governance. The
article on multiple layers of governance provides more insights on reaching this level of maturity.
Every company is unique, and so are their governance needs. Choose the level of maturity that fits your
organization and use the Cloud Adoption Framework to guide the practices, processes, and tooling to help you
get there.
As cloud governance matures, teams are empowered to adopt the cloud at faster paces. Continued cloud
adoption efforts tend to trigger maturity in IT operations. Either develop a cloud operations team, or sync with
your cloud operations team to ensure governance is a part of operations development.
Learn more about starting a cloud governance team or a cloud operations team.
After you've established an initial cloud governance foundation, use these best practices in Governance
foundation improvements to get ahead of your adoption plan and prevent risks.
Central IT team functions
As cloud adoption scales, cloud governance functions alone may not be sufficient to govern adoption efforts.
When adoption is gradual, teams tend to organically develop the skills and processes needed to be ready for the
cloud over time.
But when one cloud adoption team uses the cloud to achieve a high-profile business outcome, gradual adoption
is seldom the case. Success follows success. This is also true for cloud adoption, but it happens at cloud scale.
When cloud adoption expands from one team to multiple teams relatively quickly, additional support from
existing IT staff is needed. But those staff members may lack the training and experience required to support the
cloud using cloud-native IT tools. This often drives the formation of a Central IT team governing the cloud.
Caution
While this is a common maturity step, it can present a high risk to adoption, potentially blocking innovation and
migration efforts if not managed effectively. See the risk section below to learn how to mitigate the risk of
centralization becoming a cultural antipattern.
The skills needed to provide centralized IT functions could be provided by:
An existing Central IT team
Enterprise architects
IT operations
IT governance
IT infrastructure
Networking
Identity
Virtualization
Business continuity and disaster recovery
Application owners within IT
WARNING
Centralized IT should only be applied in the cloud when existing delivery on-premises is based on a Central IT team model.
If the current on-premises model is based on delegated control, consider a cloud center of excellence (CCoE) approach for
a more cloud-compatible alternative.
Key responsibilities
Adapt existing IT practices to ensure adoption efforts result in well-governed, well-managed environments in the
cloud.
The following tasks are typically executed regularly:
Strategic tasks
Review:
Business outcomes.
Financial models.
Motivations for cloud adoption.
Business risks.
Rationalization of the digital estate.
Monitor adoption plans and progress against the prioritized migration backlog.
Identify and prioritize platform changes that are required to support the migration backlog.
Act as an intermediary or translation layer between cloud adoption needs and existing IT teams.
Take advantage of existing IT teams to accelerate platform functions and enable adoption.
Technical tasks
Build and maintain the cloud platform to support solutions.
Define and implement the platform architecture.
Operate and manage the cloud platform.
Continuously improve the platform.
Keep up with new innovations in the cloud platform.
Deliver new cloud functionality to support business value creation.
Suggest self-service solutions.
Ensure that solutions meet existing governance and compliance requirements.
Create and validate deployment of platform architecture.
Review release plans for sources of new platform requirements.
Meeting cadence
Central IT team expertise usually comes from a working team. Expect participants to commit much of their daily
schedules to alignment efforts. Contributions aren't limited to meetings and feedback cycles.
Next steps
As a central IT team matures its cloud capabilities, the next maturity step is typically looser coupling of cloud
operations. The availability of cloud-native operations management tooling and lower operating costs for
PaaS-first solutions often lead to business teams (or more specifically, DevOps teams within the business)
assuming responsibility for cloud operations.
Learn more about:
Building a cloud operations team
Cloud operations functions
Cloud operations functions
An operations team focuses on monitoring, repairing, and the remediation of issues related to traditional IT
operations and assets. In the cloud, many of the capital costs and operations activities are transferred to the cloud
provider, giving IT operations the opportunity to improve and provide significant additional value.
The skills needed to provide cloud operations functions can be provided by:
IT operations
Outsource IT operations vendors
Cloud service providers
Cloud-managed service providers
Application-specific operations teams
Business application operations teams
DevOps teams
IMPORTANT
The individuals or teams accountable for cloud operations are generally responsible for making reactive changes to
configuration during remediation. They're also likely to be responsible for proactive configuration changes to minimize
operational disruptions. Depending on the organization's cloud operating model, those changes could be delivered via
infrastructure-as-code, Azure Pipelines, or direct configuration in the portal. Because the operations team will likely have elevated
permissions, it is extremely important that those who fill this role follow identity and access control best practices to
minimize unintended access or production changes.
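As a minimal sketch of that least-privilege guidance, assuming Azure role-based access control is in use, an operations group can be granted a narrowly scoped built-in role rather than broad owner rights. The group object ID, subscription ID, and resource group name below are placeholders.

```azurecli
# Grant the cloud operations group a scoped built-in role instead of Owner.
# All identifiers here are hypothetical.
az role assignment create \
  --assignee "<operations-group-object-id>" \
  --role "Virtual Machine Contributor" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/rg-workload-prod"
```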
Preparation
Manage resources in Azure: Learn how to work through the Azure CLI and web portal to create, manage, and
control cloud-based resources.
Azure network services: Learn Azure networking basics and how to improve resiliency and reduce latency.
Review the following:
Business outcomes
Financial models
Motivations for cloud adoption
Business risks
Rationalization of the digital estate
Minimum scope
The duties of the people on the cloud operations team involve delivering maximum workload performance and
minimum business interruptions within an agreed-upon operations budget.
Determine workload criticality and the impact of disruptions or performance degradation.
Establish business-approved cost and performance commitments.
Monitor and operate cloud workloads.
Deliverables
Maintain asset and workload inventory
Monitor performance of workloads
Maintain operational compliance
Protect workloads and associated assets
Recover assets if there is performance degradation or business interruption
Mature functionality of core platforms
Continuously improve workload performance
Improve budgetary and design requirements of workloads to fit commitments to the business
Meeting cadence
The cloud operations team should be involved in release planning and cloud center of excellence planning to
provide feedback and prepare for operational requirements.
Out of scope
Traditional IT operations that focus on maintaining current-state operations for low-level technical assets are out of
scope for the cloud operations team. Things like storage, CPU, memory, network equipment, servers, and virtual
machine hosts require continuous maintenance, monitoring, repair, and remediation of issues to maintain peak
operations. In the cloud, many of these capital costs and operations activities are transferred to the cloud
provider.
Next steps
As adoption and operations scale, it's important to define and automate governance best practices that extend
existing IT requirements. Forming a cloud center of excellence is an important step to scaling cloud adoption,
cloud operations, and cloud governance efforts.
Learn more about:
Cloud center of excellence functions.
Organizational antipatterns: Silos and fiefdoms.
Learn to align responsibilities across teams by developing a cross-team matrix that identifies responsible,
accountable, consulted, and informed (RACI) parties. Download and modify the RACI template.
Cloud center of excellence (CCoE) functions
Business and technical agility are core objectives of most IT organizations. A cloud center of excellence (CCoE)
is a function that creates a balance between speed and stability.
Function structure
A CCoE model requires collaboration between each of the following:
Cloud adoption (specifically solution architects)
Cloud strategy (specifically the program and project managers)
Cloud governance
Cloud platform
Cloud automation
Key responsibilities
The primary duty of the CCoE team is to accelerate cloud adoption through cloud-native or hybrid solutions.
The objective of the CCoE is to:
Help build a modern IT organization through agile approaches to capture and implement business
requirements.
Use reusable deployment packages that align with security, compliance, and service management policies.
Maintain a functional Azure platform in alignment with operational procedures.
Review and approve the use of cloud-native tools.
Over time, standardize and automate commonly needed platform components and solutions.
Meeting cadence
The CCoE is a function staffed by four high-demand teams. It is important to allow for organic collaboration
and track growth through a common repository/solution catalog. Maximize natural interactions, but
minimize meetings. When this function matures, the teams should try to limit dedicated meetings.
Attendance at recurring meetings, like release meetings hosted by the cloud adoption team, will provide data
inputs. In parallel, a meeting after each release plan is shared can provide a minimum touch point for this
team.
The following examples contrast how common requests are handled under a central IT team model versus a CCoE model:

Scenario: Provision a production SQL server.
Central IT team: Network, IT, and data platform teams provision various components over the course of days or even weeks.
CCoE: The team requiring the server deploys a PaaS instance of Azure SQL Database. Alternatively, a preapproved template could be used to deploy all of the IaaS assets to the cloud in hours.

Scenario: Provision a development environment.
Central IT team: Network, IT, development, and DevOps teams agree to specs and deploy an environment.
CCoE: The development team defines their own specs and deploys an environment based on an allocated budget.

Scenario: Update security requirements to improve data protection.
Central IT team: Networking, IT, and security teams update various networking devices and VMs across multiple environments to add protections.
CCoE: Cloud governance tools are used to update policies that can be applied immediately to all assets in all cloud environments.
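As a hedged sketch of the last scenario, a governance policy change can be rolled out to every asset in a scope by assigning an Azure Policy definition at the subscription level. The assignment name, policy identifier, and subscription ID below are placeholders rather than recommended values.

```azurecli
# Assign a built-in policy definition at subscription scope so it applies to
# all current and future assets in that scope. Identifiers are placeholders.
az policy assignment create \
  --name "enforce-https-storage" \
  --display-name "Secure transfer to storage accounts should be enabled" \
  --policy "<built-in-policy-definition-name-or-id>" \
  --scope "/subscriptions/<subscription-id>"
```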
Negotiations
At the root of any CCoE effort is an ongoing negotiation process. The CCoE team negotiates with existing IT
functions to reduce central control. The trade-offs for the business in this negotiation are freedom, agility,
and speed. The value of the trade-off for existing IT teams is delivered as new solutions. The new solutions
provide the existing IT team with one or more of the following benefits:
Ability to automate common issues.
Improvements in consistency (reduction in day-to-day frustrations).
Opportunity to learn and deploy new technical solutions.
Reductions in high severity incidents (fewer quick fixes or late-night pager-duty responses).
Ability to broaden their technical scope, addressing broader topics.
Participation in higher-level business solutions, addressing the impact of technology.
Reduction in menial maintenance tasks.
Increase in technology strategy and automation.
In exchange for these benefits, the existing IT function may be trading the following values, whether real or
perceived:
Sense of control from manual approval processes.
Sense of stability from change control.
Sense of job security from completion of necessary yet repetitive tasks.
Sense of consistency that comes from adherence to existing IT solution vendors.
In healthy cloud-forward companies, this negotiation process is a dynamic conversation between peers and
partnering IT teams. The technical details may be complex, but are manageable when IT understands the
objective and is supportive of the CCoE efforts. When IT is less than supportive, the following section on
enabling CCoE success can help overcome cultural blockers.
Next steps
A CCoE model requires cloud platform functions and cloud automation functions. The next step is to align
cloud platform functions.
Learn more about:
Cloud platform functions
Cloud automation functions
Cloud platform functions
The cloud introduces many technical changes as well as opportunities to streamline technical solutions. But
general IT principles and business needs stay the same. You still need to protect sensitive business data. If your IT
platform depends on a local area network, there's a good chance that you'll need network definitions in the cloud.
Users who need to access applications and data will want their current identities to access relevant cloud
resources.
While the cloud presents the opportunity to learn new skills, your current architects should be able to directly
apply their experiences and subject matter expertise. Cloud platform functions are usually provided by a select
group of architects who focus on learning about the cloud platform. These architects then aid others in decision
making and the proper application of controls to cloud environments.
The skills needed to provide full platform functionality can be provided by:
Enterprise architecture
IT operations
IT governance
IT infrastructure
Networking
Identity
Virtualization
Business continuity and disaster recovery
Application owners within IT
Preparation
Foundations for cloud architecture: A Pluralsight course to help architect the right foundational solutions.
Microsoft Azure architecture: A Pluralsight course to ground architects in Azure architecture.
Azure network services: Learn Azure networking basics and how to improve resiliency and reduce latency.
Review the following:
Business outcomes
Financial models
Motivations for cloud adoption
Business risks
Rationalization of the digital estate
Minimum scope
Cloud platform duties center around the creation and support of your cloud platform or landing zones.
The following tasks are typically executed on a regular basis:
Monitor adoption plans and progress against the prioritized migration backlog.
Identify and prioritize platform changes that are required to support the migration backlog.
Meeting cadence:
Cloud platform expertise usually comes from a working team. Expect participants to commit a large portion of
their daily schedules to cloud platform work. Contributions aren't limited to meetings and feedback cycles.
Deliverables
Build and maintain the cloud platform to support solutions.
Define and implement the platform architecture.
Operate and manage the cloud platform.
Continuously improve the platform.
Keep up with new innovations in the cloud platform.
Bring new cloud functionality to support business value creation.
Suggest self-service solutions.
Ensure solutions meet existing governance and compliance requirements.
Create and validate deployment of platform architecture.
Review release plans for sources of new platform requirements.
Next steps
As your cloud platform becomes better defined, aligning cloud automation functions can accelerate adoption. It
can also help establish best practices while reducing business and technical risks.
Learn to align responsibilities across teams by developing a cross-team matrix that identifies responsible,
accountable, consulted, and informed (RACI) parties. Download and modify the RACI template.
Cloud automation functions
During cloud adoption efforts, cloud automation functions unlock the potential of DevOps and a cloud-native
approach. Expertise in each of these areas can accelerate adoption and innovation.
The skills needed to provide cloud automation functions can be provided by:
DevOps engineers
Developers with DevOps and infrastructure expertise
IT engineers with DevOps and automation expertise
These subject matter experts might be providing functions in other areas such as cloud adoption, cloud
governance, or cloud platform. After they demonstrate proficiency at automating complex workloads, you can
recruit these experts to deliver automation value.
Preparation
Before you admit a team member to this group, they should demonstrate three key characteristics:
Expertise in any cloud platform with a special emphasis on DevOps and automation.
A growth mindset or openness to changing the way IT operates today.
A desire to accelerate business change and remove traditional IT roadblocks.
Minimum scope
The primary duty of cloud automation is to own and advance the solution catalog. The solution catalog is a
collection of prebuilt solutions or automation templates. These solutions can rapidly deploy various platforms as
required to support needed workloads. These solutions are building blocks that accelerate cloud adoption and
reduce the time to market during migration or innovation efforts.
Examples of solutions in the catalog include:
A script to deploy a containerized application.
An Azure Resource Manager template to deploy a SQL Server Always On availability group cluster.
Sample code to build a deployment pipeline using Azure DevOps.
An Azure DevTest Labs instance of the corporate ERP for development purposes.
Automated deployment of a self-service environment commonly requested by business users.
The solutions in the solution catalog aren't deployment pipelines for a workload. Instead, you might use
automation scripts in the catalog to quickly create a deployment pipeline. You might also use a solution in the
catalog to quickly provision platform components to support workload tasks like automated deployment, manual
deployment, or migration.
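As one hedged example of what a catalog entry might look like, a short Azure CLI script can stand up a containerized sample application with Azure Container Instances. The resource names, region, and DNS label below are hypothetical, and the image is Microsoft's public hello-world sample.

```azurecli
# Sketch of a solution catalog script: deploy a containerized sample app.
# Resource names, region, and DNS label are placeholders.
az group create --name rg-catalog-demo --location eastus

az container create \
  --resource-group rg-catalog-demo \
  --name demo-app \
  --image mcr.microsoft.com/azuredocs/aci-helloworld \
  --ports 80 \
  --dns-name-label demo-app-sample
```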
Strategic tasks
Rationalization of the digital estate:
Monitor adoption plans and progress against the prioritized migration backlog.
Identify opportunities to accelerate cloud adoption, reduce effort through automation, and improve
security, stability, and consistency.
Prioritize a backlog of solutions for the solution catalog that delivers the most value given other
strategic inputs.
Review release plans for sources of new automation opportunities.
Meeting cadence:
Cloud automation is a working team. Expect participants to commit a large portion of their daily schedules to
cloud automation work. Contributions aren't limited to meetings and feedback cycles.
The cloud automation team should align activities with other areas of capability. This alignment might result in
meeting fatigue. To ensure cloud automation has sufficient time to manage the solution catalog, you should
review meeting cadences to maximize collaboration and minimize disruptions to development activities.
Deliverables
Curate or develop solutions based on the prioritized backlog.
Ensure solutions align to platform requirements.
Ensure solutions are consistently applied and meet existing governance and compliance requirements.
Create and validate solutions in the catalog.
Next steps
As essential cloud functions align, the collective teams can help develop necessary technical skills.
Cloud data functions
There are multiple audiences involved in an analytics conversation, including the typical seller, database architect,
and infrastructure team. In addition, analytics solutions involve influencers, recommenders, and decision-makers
from enterprise architecture, data science, business analysts, and executive leadership roles.
Azure Synapse Analytics enables the entire business, from the IT stakeholder to the business analyst, to collaborate
on analytics solutions and understand cloud data functions. The following sections discuss these roles in more
detail.
Infrastructure teams
These teams deal with the provisioning and architecture of the underlying compute resources required for large
analytics systems. In many cases, they are managing transitions between datacenter-based and cloud-based
systems, and current needs for interoperability across both. Disaster recovery, business continuity, and high
availability are common concerns.
With Azure Synapse Analytics, IT professionals can protect and manage their organization's data more efficiently.
They can enable big data processing with both on-demand and provisioned compute. Through tight integration
with Azure Active Directory, the service helps secure access to cloud and hybrid configurations. IT pros can enforce
privacy requirements by using data masking, as well as row-level and column-level security.
Data scientists
Data scientists understand how to build advanced models for huge volumes of critical, yet often disparate data.
Their work involves translating the needs of the business into the technology requirements for normalization and
transformation of data. They create statistical and other analytical models, and ensure that line-of-business teams
can get the analysis they need to run the business.
Using Azure Synapse Analytics, data scientists can build proofs of concept in minutes, and create or adjust end-to-
end solutions. They can provision resources as needed, or simply query existing resources on demand across
massive amounts of data. They can do their work in a variety of languages, including T-SQL, R, Python, Scala, .NET,
and Spark SQL.
Business analysts
These teams build and use dashboards, reports, and other forms of data visualization to gain rapid insights
required for operations. Often, each line-of-business department will have dedicated business analysts who gather
and package information and analytics from specialized applications. These specialized apps can be for credit cards,
retail banking, commercial banking, treasury, marketing, and other organizations.
Using Azure Synapse Analytics, business analysts can securely access datasets and use Power BI to build
dashboards. They can also securely share data within and outside their organization through Azure Data Share.
Executives
Executives are responsible for charting strategy and ensuring strategic initiatives are implemented effectively
across both IT and line-of-business departments. Solutions must be cost-effective, prevent disruption to the
business, allow for easy extensibility as requirements change and grow, and deliver results to the business.
Cloud security functions
This article provides a summary of the organizational functions required to manage information security risk in an
enterprise. These organizational functions collectively form the human portion of an overall cybersecurity system.
Each function may be performed by one or more people, and each person may perform one or more functions,
depending on various factors such as culture, budget, and available resources.
The following diagram and documentation represent an ideal view of the functions of an enterprise security team.
The diagram represents an aspirational view for smaller organizations or smaller security teams who might not
have significant resources and formal responsibilities defined around all of these functions.
Security is a team sport: It's critical that individuals on the security team see each other as part of a whole
security team, part of the whole organization, and part of a larger security community defending against the same
adversaries. This holistic view enables the team to work well in general. It's especially important as the teams work
through any unplanned gaps and overlaps discovered during the evolution of roles and responsibilities.
Security functions
Each of the following articles provides information about one of these functions. Each article includes a summary of
objectives, how the function can evolve because of the threat environment or cloud technology changes, and the
relationships and dependencies that are critical to its success.
Policy and standards
Security operations
Security architecture
Security compliance management
People security
Application security and DevSecOps
Data security
Infrastructure and endpoint security
Identity and key management
Threat intelligence
Posture management
Incident preparation
Function of cloud security policy and standards
Security policy and standards teams author, approve, and publish security policy and standards to guide security
decisions within the organization.
The policies and standards should:
Reflect the organization's security strategy in enough detail to guide decisions made by various teams across
the organization
Enable productivity throughout the organization while reducing risk to the organization's business and mission
Security policy should reflect long-term, sustainable objectives that align to the organization's security strategy
and risk tolerance. Policy should always address:
Regulatory compliance requirements and current compliance status (requirements met, risks accepted, etc.)
Architectural assessment of current state and what is technically possible to design, implement, and enforce
Organizational culture and preferences
Industry best practices
Accountability of security risk assigned to appropriate business stakeholders who are accountable for other
risks and business outcomes.
Security standards define the processes and rules to support execution of the security policy.
Modernization
While policy should remain static, standards should be dynamic and continuously revisited to keep up with pace
of change in cloud technology, threat environment, and business competitive landscape.
Because of this high rate of change, you should keep a close eye on how many exceptions are being made as this
may indicate a need to adjust standards (or policy).
Security standards should include guidance specific to the adoption of cloud such as:
Secure use of cloud platforms for hosting workloads
Secure use of DevOps model and inclusion of cloud applications, APIs, and services in development
Use of identity perimeter controls to supplement or replace network perimeter controls
Define your segmentation strategy prior to moving your workloads to IaaS platform
Tagging and classifying the sensitivity of assets
Define process for assessing and ensuring your assets are configured and secured properly
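The following sketch shows one way the tagging and classification standard above could be applied with the Azure CLI; the resource ID and tag values are placeholders, not a recommended taxonomy.

```azurecli
# Apply environment and data-classification tags to an existing resource so
# governance standards can be evaluated against it.
# Note: az resource tag replaces any tags already on the resource by default.
az resource tag \
  --ids "/subscriptions/<subscription-id>/resourceGroups/rg-app/providers/Microsoft.Storage/storageAccounts/stapp001" \
  --tags environment=production dataClassification=confidential
```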
Next steps
Review the function of a cloud security operations center (SOC).
Cloud SOC functions
The main objective of a cloud security operations center (SOC) is to detect, respond to, and recover from active
attacks on enterprise assets.
As the SOC matures, security operations should:
Reactively respond to attacks detected by tools
Proactively hunt for attacks that slipped past reactive detections
Modernization
Detecting and responding to threats is currently undergoing significant modernization at all levels.
Elevation to business risk management: SOC is growing into a key component of managing business risk
for the organization
Metrics and goals: Tracking SOC effectiveness is evolving from "time to detect" to these key indicators:
Responsiveness via mean time to acknowledge (MTTA).
Remediation speed via mean time to remediate (MTTR).
Technology evolution: SOC technology is evolving from exclusive use of static analysis of logs in a SIEM to
add the use of specialized tooling and sophisticated analysis techniques. This provides deep insights into
assets, high-quality alerts, and an investigation experience that complements the breadth view of the
SIEM. Both types of tooling increasingly use AI and machine learning, behavior analytics, and integrated
threat intelligence to help spot and prioritize anomalous actions that could indicate a malicious attacker.
Threat hunting: SOCs are adding hypothesis driven threat hunting to proactively identify advanced attackers
and shift noisy alerts out of frontline analyst queues.
Incident management: Discipline is becoming formalized to coordinate nontechnical elements of incidents
with legal, communications, and other teams.
Integration of internal context: To help prioritize SOC activities, such as the relative risk scores of user
accounts and devices, the sensitivity of data and applications, and key security isolation boundaries to closely
defend.
For more information, see:
Strategy and architecture standards—security operations
CISO workshop module 4b: Threat protection strategy
Cyber Defense Operations Center (CDOC) blog series part 1, part 2a, part 2b, part 3a, part 3b
NIST computer security incident handling guide
NIST guide for cybersecurity event recovery
Next steps
Review the function of security architecture.
Cloud security architecture functions
Security architecture translates the organization's business and assurance goals into documentation and
diagrams to guide technical security decisions.
Modernization
Security architecture is affected by different factors:
Continuous engagement model: Continuous release of software updates and cloud features makes fixed
engagement models obsolete. Architects should be engaged with all teams working in technical topic areas to
guide decision making along those teams' capability lifecycles.
Security from the cloud: Incorporate security capabilities from the cloud to reduce enablement time and
ongoing maintenance costs (hardware, software, time, and effort).
Security of the cloud: Ensure coverage of all cloud assets including software as a service (SaaS)
applications, infrastructure as a service (IaaS) VMs, and platform as a service (PaaS) applications and services.
This should include discovery and security of both sanctioned and unsanctioned services.
Identity integration: Security architects should ensure tight alignment with identity teams to help
organizations meet the dual goals of enabling productivity and providing security assurances.
Integration of internal context in security designs, such as context from posture management and
incidents investigated by the security operations center (SOC). This should include elements like relative risk
scores of user accounts and devices, sensitivity of data, and key security isolation boundaries to actively
defend.
Next steps
Review the function of cloud security compliance management.
Cloud security compliance management functions
The objective of cloud security compliance management is to ensure that the organization is compliant with
regulatory requirements (and internal policies) and efficiently tracks and reports status.
Modernization
Cloud introduces changes to security compliance including:
Requirement to validate the compliance status of the cloud provider against your regulatory requirements. This
is a shared responsibility; see adopting the shared responsibility model for how these responsibilities differ
across cloud types.
Pre-cloud guidance: While many regulatory requirements have been updated to incorporate the dynamic
nature of cloud services, some do not yet include this. Organizations should work with regulatory bodies to get
these updated and be prepared to explain these differences during audit exercises.
Linking compliance to risk: Ensure that organizations tie compliance violations and exceptions to
organizational risks to ensure the right level of attention and funding to correct issues.
Tracking and reporting enabled by cloud: This function should actively embrace the software-defined
nature of the cloud, which offers comprehensive logging, configuration data, and analytical insight that make
reporting on compliance more efficient than traditional on-premises approaches.
Cloud-based compliance tools are available to facilitate easier reporting of regulatory compliance such as
Microsoft Compliance Manager, which can reduce overhead costs of this function.
Next steps
Review the function of people security.
People security functions in the cloud
People security protects the organization from risk of inadvertent human mistakes and malicious insider actions.
Modernization
Modernization of this function includes:
Increase positive engagement with users by using gamification, positive reinforcement, and education rather
than relying solely on negative reinforcement approaches like traditional "phish and punish" solutions.
High-quality human engagement: Security awareness communications and training should be high-quality
productions that drive empathy and emotional engagement to connect with the human side of employees and
the organization's mission.
Realistic expectations: Accept that users will sometimes open phishing emails, and instead focus success
metrics on reducing the rate rather than expecting to prevent 100 percent of opens.
Organizational culture change: Organizational leadership must drive an intentional culture change to make
security a priority for each member of the organization.
Increased insider risk focus to help organizations protect valuable trade secrets and other data with highly
profitable illicit use cases (such as customer locations or communication records).
Improved insider risk detection, which takes advantage of cloud capabilities for activity logging, behavior
analytics, and machine learning.
Next steps
Review the function of application security and DevSecOps.
Application security and DevSecOps functions
The objective of application security and DevSecOps is to integrate security assurances into development
processes and custom line of business (LOB) applications.
Modernization
Application development is rapidly being reshaped in multiple aspects simultaneously including the DevOps team
model, DevOps rapid release cadence, and the technical composition of applications via cloud services and APIs.
See how the cloud is changing security relationships and responsibilities to understand these changes.
This modernization of antiquated development models presents both opportunity and a requirement to modernize
security of applications and development processes. The fusion of security into DevOps processes is often referred
to as DevSecOps and drives changes including:
Security is integrated, not outside approval: The rapid pace of change in application development makes
classic arms-length "scan and report" approaches obsolete. These legacy approaches can't keep up with
releases without grinding development to a halt and creating time-to-market delays, developer underutilization,
and growth of issue backlog.
Shift left to engage security earlier in application development processes as fixing issues earlier is
cheaper, faster, and more effective. If you wait until after the cake is baked, it is harder to change the
shape.
Native integration: Security practices must be integrated seamlessly to avoid unhealthy friction in
development workflows and continuous integration/continuous deployment (CI/CD) processes. For
more information about the GitHub approach, see Securing software, together.
High-quality security: Security must provide high-quality findings and guidance that enable
developers to fix issues fast and don't waste developer time with false positives.
Converged culture: Security, development, and operations roles should contribute key elements into a
shared culture, shared values, and shared goals and accountabilities.
Agile security: Shift security from a "must be perfect to ship" approach to an agile approach that starts with
minimum viable security for applications (and for the processes to develop them) that is continuously improved
incrementally.
Embrace cloud-native infrastructure and security features to streamline development processes while
integrating security.
Supply chain risk management: Take a zero-trust approach to open-source software (OSS) and third-party
components by validating their integrity and ensuring that bug fixes and updates are applied to these
components (see the integrity-check sketch after this list).
Continuous learning: The rapid release pace of developer services, sometimes called platform as a service
(PaaS) services, and the changing composition of applications mean that dev, ops, and security team members will
be constantly learning new technology.
Programmatic approach to application security to ensure that continuous improvement of the agile approach
actually happens.
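To make the supply chain point concrete, the following is a minimal Python sketch of a zero-trust integrity check: a third-party artifact is verified against a digest pinned when the dependency was reviewed. The file path and expected digest are hypothetical placeholders; real pipelines would more often rely on a package manager's lock file or signature verification.

import hashlib
import sys

ARTIFACT_PATH = "third_party/component-1.2.3.tar.gz"  # hypothetical artifact
PINNED_SHA256 = "digest-recorded-when-the-dependency-was-reviewed"  # placeholder

def sha256_of(path: str) -> str:
    """Compute the SHA-256 digest of a file in streaming fashion."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

actual = sha256_of(ARTIFACT_PATH)
if actual != PINNED_SHA256:
    sys.exit(f"Integrity check failed for {ARTIFACT_PATH}: got {actual}")
print("Integrity check passed; the component can be promoted into the build.")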
For additional context, see Microsoft secure development lifecycle.
Next steps
Review the function of data security.
Function of cloud data security
10/30/2020 • 2 minutes to read • Edit Online
The main objective for a data security team is to provide security protections and detective controls for sensitive
enterprise data in any format in any location.
Modernization
Data security strategies are being shaped primarily by:
Data sprawl: Sensitive data is being generated and stored on a nearly limitless variety of devices and cloud
services where people creatively collaborate.
New model: The cloud enables new "phone home for key" models that supplement and replace classic data
loss prevention (DLP) models that "catch it on the way out the door" (a minimal sketch of the phone-home pattern follows this list).
Regulations like the General Data Protection Regulation (GDPR) require organizations to closely track private
data and how applications use it.
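As an illustration of the "phone home for key" model, the following Python sketch uses Azure Key Vault key wrapping: the content key that protects a document is itself wrapped by a key that never leaves the vault, so every unwrap is an authenticated call home that can be audited or revoked. The vault URL and key name are hypothetical, and the sketch assumes the azure-identity and azure-keyvault-keys packages and an RSA key in the vault that permits wrap and unwrap operations.

import os

from azure.identity import DefaultAzureCredential
from azure.keyvault.keys import KeyClient
from azure.keyvault.keys.crypto import CryptographyClient, KeyWrapAlgorithm

VAULT_URL = "https://contoso-example-vault.vault.azure.net"  # hypothetical vault

credential = DefaultAzureCredential()
wrapping_key = KeyClient(vault_url=VAULT_URL, credential=credential).get_key("doc-wrapping-key")
crypto = CryptographyClient(wrapping_key, credential=credential)

# Generate a local content key, wrap it with the vault key, and store only the
# wrapped form alongside the protected document.
content_key = os.urandom(32)
wrapped = crypto.wrap_key(KeyWrapAlgorithm.rsa_oaep, content_key).encrypted_key

# Opening the document later requires "phoning home" to unwrap the key, so
# access can be audited and revoked centrally wherever the document travels.
recovered = crypto.unwrap_key(KeyWrapAlgorithm.rsa_oaep, wrapped).key
assert recovered == content_key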
Next steps
Review the function of cloud infrastructure and endpoint security.
Function of cloud infrastructure and endpoint
security
10/30/2020 • 2 minutes to read • Edit Online
A cloud security team working on infrastructure and endpoint security provides security protections and detective
and response controls for infrastructure and network components used by enterprise applications and users.
Modernization
Software defined datacenters and other cloud technologies are helping solve longstanding challenges with
infrastructure and endpoint security including:
Inventory and configuration error discovery are much more reliable for cloud-hosted assets, as they're all
immediately visible (unlike in a physical datacenter).
Vulnerability management evolving into a critical part of overall security posture management.
Addition of container technologies to be managed and secured by infrastructure and network teams as
the organization adopts this technology broadly. See container security in Security Center for an example.
Security agent consolidation and tool simplification to reduce the maintenance and performance overhead
of security agents and tools.
Allow-listing of applications and internal network filtering is becoming much easier to configure and
deploy for cloud hosted servers (using machine learning generated rule sets). See adaptive application controls
and adaptive network hardening for Azure examples.
Automated templates for configuring infrastructure and security are much easier with software-defined
datacenters in the cloud. An Azure example is Azure Blueprints.
Just in time (JIT) and just enough access (JEA) enable practical application of least privilege principles to
privileged access for servers and endpoints.
User experience becomes critical as users increasingly can choose or purchase their endpoint devices.
Unified endpoint management allows managing security posture of all endpoint devices including mobile
and traditional PCs as well as providing critical device integrity signals for zero trust access control solutions.
Network security architectures and controls are partially diminished with the shift to cloud application
architectures, but they remain a fundamental security measure. For more information, see Network security
and containment.
Next steps
Review the function of threat intelligence.
Function of identity and key management in the
cloud
10/30/2020 • 2 minutes to read • Edit Online
The main objective of a security team working on identity management is to provide authentication and
authorization of humans, services, devices, and applications. Key and certificate management provides secure
distribution and access to key material for cryptographic operations (which often support similar outcomes as
identity management).
Modernization
Identity and key management modernization is being shaped by:
Identity and key/certificate management disciplines are coming closer together as they both provide
assurances for authentication and authorization to enable secure communications.
Identity controls are emerging as a primary security perimeter for cloud applications.
Key-based authentication for cloud services is being replaced with identity management because of the
difficulty of storing and securely providing access to those keys.
Critical importance of carrying positive lessons learned from on-premises identity architectures such as single
identity, single sign-on (SSO), and native application integration.
Critical importance of avoiding common mistakes of on-premises architectures that often overcomplicated
them, making support difficult and attacks easier. These include:
Sprawling groups and organizational units (OUs).
Sprawling set of third-party directories and identity management systems.
Lack of clear standardization and ownership of application identity strategy.
Credential theft attacks remain a high impact and high likelihood threat to mitigate.
Service accounts and application accounts remain a top challenge but are becoming easier to solve. Identity
teams should actively embrace the cloud capabilities that are beginning to solve this, such as Azure AD managed
identities.
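As a minimal sketch of that direction, the following Python snippet uses a managed identity (through DefaultAzureCredential) to read a secret from Azure Key Vault, so no key or password is stored in code or configuration. The vault URL and secret name are hypothetical, and the code assumes it runs on an Azure resource whose managed identity has been granted access to the vault.

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

VAULT_URL = "https://contoso-example-vault.vault.azure.net"  # hypothetical vault

# DefaultAzureCredential falls back through several mechanisms, including
# managed identity, so the application authenticates without a stored key.
credential = DefaultAzureCredential()
client = SecretClient(vault_url=VAULT_URL, credential=credential)

secret = client.get_secret("example-connection-string")  # hypothetical secret name
print(f"Retrieved secret '{secret.name}' without storing any credentials.")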
Next steps
Review the function of infrastructure and endpoint security.
Function of cloud threat intelligence
10/30/2020 • 2 minutes to read • Edit Online
Security threat intelligence provides context and actionable insights on active attacks and potential threats to
enable decision making by security teams, technical teams, and organizational leaders.
Modernization
Threat intelligence teams are emerging and evolving to meet the needs of the security operations center (SOC)
and others managing security risk for the organization.
These teams should focus on a strategy that includes:
Strategic threat intelligence tailored to executive audiences to increase awareness of cybersecurity risk and
funding requirements and to support sound risk decision making by organizational leadership.
Incremental program growth to provide quick wins with direct incident support and evolving into a threat
intelligence platform to track and inform stakeholders.
Tactical and operational threat intelligence to guide decision making during incident investigation and
threat detections.
Next steps
Review the function of cloud security posture management.
Function of cloud security posture management
10/30/2020 • 2 minutes to read • Edit Online
The main objective for a cloud security team working on posture management is to continuously report on and
improve the security posture of the organization by focusing on disrupting a potential attacker's return on
investment (ROI).
Modernization
Posture management is a set of new functions that realize many previously imagined or attempted ideas that were
difficult, impossible, or extremely manual before the advent of the cloud. Some elements of posture
management can be traced to zero trust, deperimeterization, continuous monitoring, and manual scoring of risk by
expert consultancies.
Posture management introduces a structured approach to this, using the following:
Zero trust-based access control: Considers the active threat level during access control decisions.
Real-time risk scoring: Provides visibility into top risks.
Threat and vulnerability management (TVM): Establishes a holistic view of the organization's attack
surface and risk and integrates it into operations and engineering decision making.
Discover sharing risks: Understand the data exposure of enterprise intellectual property on both
sanctioned and unsanctioned cloud services.
Cloud security posture management to take advantage of cloud instrumentation to monitor and prioritize
security improvements.
Technical policy: Apply guardrails to audit and enforce the organization's standards and policies on technical
systems. See Azure Policy and Azure Blueprints (a minimal policy-assignment sketch follows this list).
Threat modeling of systems and architectures, as well as specific applications.
Emerging discipline: Security posture management will disrupt many norms of the security organization in a
healthy way with these new capabilities and may shift responsibilities among roles or create new roles.
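As a minimal sketch of a technical-policy guardrail, the following Python snippet assigns a built-in Azure Policy definition at subscription scope using the azure-mgmt-resource package. The subscription ID is a placeholder, and the "Allowed locations" definition GUID and its parameter name are assumptions that should be verified against the built-in definitions in your tenant.

from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import PolicyClient

subscription_id = "00000000-0000-0000-0000-000000000000"  # placeholder
scope = f"/subscriptions/{subscription_id}"

policy_client = PolicyClient(DefaultAzureCredential(), subscription_id)

assignment = policy_client.policy_assignments.create(
    scope=scope,
    policy_assignment_name="allowed-locations-guardrail",
    parameters={
        "display_name": "Restrict deployments to approved regions",
        # Built-in "Allowed locations" definition (assumed GUID; verify in your tenant).
        "policy_definition_id": (
            "/providers/Microsoft.Authorization/policyDefinitions/"
            "e56962a6-4747-49cd-b67b-bf8b01975c4c"
        ),
        "parameters": {"listOfAllowedLocations": {"value": ["eastus2", "westeurope"]}},
    },
)
print(f"Assigned guardrail: {assignment.name}")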
Next steps
Review the function of cloud security incident preparation.
Function of cloud security incident preparation
10/30/2020 • 2 minutes to read • Edit Online
The primary objective for an incident preparation team is to build process maturity and muscle memory for
responding to major incidents throughout the organization. This includes helping prepare security teams, executive
leadership, and many stakeholders outside of security.
Modernization
Practice exercises have become powerful tools to ensure stakeholders are informed and familiar with their role in a
major security incident. Participants in these exercises should include:
Executive leadership and board of directors to make strategic risk decisions and provide oversight.
Communications and public relations to ensure internal users, customers, and other external stakeholders
are informed of relevant and appropriate information.
Internal stakeholders to provide legal counsel and other business advice.
Incident management to coordinate activities and communications.
Technical team members to investigate and remediate the incident.
Business continuity integration with organizational functions that own crisis management, disaster
recovery, and business continuity plans.
Microsoft has published lessons learned and recommendations in the incident response reference guide (IRRG).
Establish team structures
Every cloud function is provided by someone during every cloud adoption effort. These assignments and team
structures can develop organically, or they can be intentionally designed to match a defined team structure.
As adoption needs grow, so does the need for balance and structure. Watch this video to get an overview of
common team structures at various stages of organizational maturity.
The following graphic and list outline those structures based on typical maturation stages. Use these examples to
find the organizational structure that best aligns with your operational needs.
Organizational structures tend to move through the common maturity model that's outlined here:
1. Cloud adoption team only
2. MVP best practice
3. Central IT team
4. Strategic alignment
5. Operational alignment
6. Cloud center of excellence (CCoE)
Most companies start with little more than a cloud adoption team. But we recommend that you establish an
organizational structure that more closely resembles the MVP best practice structure.
For small-scale or early-stage adoption efforts, this team might be as small as one person. In larger-scale or late-
stage efforts, it's common to have several cloud adoption teams, each with around six engineers. Regardless of size
or tasks, the consistent aspect of any cloud adoption team is that it provides the means to onboarding solutions
into the cloud. For some organizations, this might be a sufficient organizational structure. The cloud adoption team
article provides more insight into the structure, composition, and function of the cloud adoption team.
WARNING
Operating with only a cloud adoption team (or multiple cloud adoption teams) is considered an antipattern and should be
avoided. At a minimum, consider the MVP best practice.
We recommend that you have two teams to create balance across cloud adoption efforts. These two teams are
responsible for various functions throughout the adoption effort.
Cloud adoption team: This team is accountable for technical solutions, business alignment, project
management, and operations for solutions that are adopted.
Cloud governance team: To balance the cloud adoption team, a cloud governance team is dedicated to
ensuring excellence in the solutions that are adopted. The cloud governance team is accountable for platform
maturity, platform operations, governance, and automation.
This proven approach is considered an MVP because it might not be sustainable. Each team is wearing many hats,
as outlined in the responsible, accountable, consulted, informed (RACI) charts.
The following sections describe a fully staffed, proven organizational structure along with approaches to aligning
the appropriate structure to your organization.
Central IT team
As adoption scales, the cloud governance team might struggle to keep pace with the flow of innovation from
multiple cloud adoption teams. This is especially true in environments that have heavy compliance, operations, or
security requirements. At this stage, it is common for companies to shift cloud responsibilities to an existing central
IT team. If that team can reassess tools, processes, and people to better support cloud adoption at scale, then
including the central IT team can add significant value. Bringing in subject matter experts from operations,
automation, security, and administration to modernize the central IT team can drive effective operational
innovations.
Unfortunately, the central IT team phase can be one of the riskiest phases of organizational maturity. The central IT
team must come to the table with a strong growth mindset. If the team views the cloud as an opportunity to grow
and adapt, then it can provide great value throughout the process. But if the central IT team views cloud adoption
primarily as a threat to their existing model, then the central IT team becomes an obstacle to the cloud adoption
teams and the business objectives they support. Some central IT teams have spent months or even years
attempting to force the cloud into alignment with on-premises approaches, with only negative results. The cloud
doesn't require that everything change within the central IT team, but it does require significant change. If
resistance to change is prevalent within the central IT team, this phase of maturity can quickly become a cultural
antipattern.
Cloud adoption plans heavily focused on platform as a service (PaaS), DevOps, or other solutions that require less
operations support are less likely to see value during this phase of maturity. On the contrary, these types of
solutions are the most likely to be hindered or blocked by attempts to centralize IT. A higher level of maturity, like a
cloud center of excellence (CCoE), is more likely to yield positive results for those types of transformational efforts.
To understand the differences between centralized IT in the cloud and a CCoE, see Cloud center of excellence.
Strategic alignment
As the investment in cloud adoption grows and business values are realized, business stakeholders often become
more engaged. A defined cloud strategy team, as the following image illustrates, aligns those business
stakeholders to maximize the value realized by cloud adoption investments.
When maturity happens organically, as a result of IT-led cloud adoption efforts, strategic alignment is usually
preceded by a governance or central IT team. When cloud adoption efforts are led by the business, the focus on
operating model and organization tends to happen earlier. Whenever possible, business outcomes and the cloud
strategy team should both be defined early in the process.
Operational alignment
Realizing business value from cloud adoption efforts requires stable operations. Operations in the cloud might
require new tools, processes, or skills. When stable IT operations are required to achieve business outcomes, it's
important to add a defined cloud operations team, as shown here.
Cloud operations can be delivered by the existing IT operations roles. But it's not uncommon for cloud operations
to be delegated to other parties outside of IT operations. Managed service providers, DevOps teams, and business
unit IT often assume the responsibilities associated with cloud operations, with support and guardrails provided by
IT operations. This is increasingly common for cloud adoption efforts that focus heavily on DevOps or PaaS
deployments.
At the highest state of maturity, a cloud center of excellence aligns teams around a modern cloud-first operating
model. This approach provides centralized IT functions like governance, security, platform, and automation.
The primary difference between this structure and the central IT team structure above is a strong focus on self-
service and democratization. The teams in this structure organize with the intent of delegating control as much as
possible. Aligning governance and compliance practices to cloud-native solutions creates guardrails and protection
mechanisms. Unlike the central IT team model, the cloud-native approach maximizes innovation and minimizes
operational overhead. For this model to be adopted, mutual agreement to modernize IT processes will be required
from business and IT leadership. This model is unlikely to occur organically and often requires executive support.
Next steps
After aligning to a certain stage of organizational structure maturity, you can use RACI charts to align
accountability and responsibility across each team.
Align the appropriate RACI chart
Align responsibilities across teams
10/30/2020 • 3 minutes to read • Edit Online
Learn to align responsibilities across teams by developing a cross-team matrix that identifies responsible,
accountable, consulted, and informed (RACI) parties. This article provides an example RACI matrix for the
organizational structures described in Establish team structures:
Cloud adoption team only
MVP best practice
Central IT team
Strategic alignment
Operational alignment
Cloud center of excellence (CCoE)
To track organizational structure decisions over time, download and modify the RACI template.
The examples in this article specify these RACI constructs:
The one team that is accountable for a function.
The teams that are responsible for the outcomes.
The teams that should be consulted during planning.
The teams that should be informed when work is completed.
The last row of each table (except the first) contains a link to the most-aligned cloud capability for additional
information.
[Tables omitted: the RACI matrices for the team structures listed above, including the Central IT team, strategic alignment, and operational alignment stages. Each matrix assigns the functions of solution delivery, business alignment, change management, solution operations, platform governance, platform maturity, platform operations, and platform automation across the teams in that structure, and its last row maps each function to the most-aligned cloud capability: cloud adoption, cloud strategy, cloud operations, CCoE and cloud governance, CCoE and cloud platform, and CCoE and cloud automation. See the downloadable RACI template for the complete matrices.]
Next steps
To track decisions about organization structure over time, download and modify the RACI template. Copy and
modify the most closely aligned sample from the RACI matrices in this article.
Download the RACI template
Build technical skills
10/30/2020 • 3 minutes to read • Edit Online
Organizational and environmental (technical) readiness can require new skills for technical and nontechnical
contributors. The following information can help your organization build the necessary skills.
Learn more
For additional learning paths, browse the Microsoft Learn catalog. Use the roles filter to align learning paths with
your role.
Build a cost-conscious organization
5/22/2020 • 6 minutes to read • Edit Online
As outlined in Motivations: why are we moving to the cloud?, there are many sound reasons for a company to
adopt the cloud. When cost reduction is a primary driver, it's important to create a cost-conscious organization.
Ensuring cost consciousness is not a one-time activity. Like other cloud-adoption topics, it's iterative. The following
diagram outlines this process to focus on three interdependent activities: visibility, accountability, and
optimization. These processes play out at macro and micro levels, which we describe in detail in this article.
Next steps
Practicing these responsibilities at each level of the business helps drive a cost-conscious organization. To begin
acting on this guidance, review the organizational readiness introduction to help identify the right team structures.
Identify the right team structures
Organizational antipatterns: Silos and fiefdoms
10/30/2020 • 13 minutes to read • Edit Online
Success in any major change to business practices, culture, or technology operations requires a growth mindset.
At the heart of the growth mindset is an acceptance of change and the ability to lead in spite of ambiguity.
Some antipatterns can block a growth mindset in organizations that want to grow and transform, including
micromanagement, biased thinking, and exclusionary practices. Many of these blockers are personal challenges
that create personal growth opportunities for everyone. But two common antipatterns in IT require more than
individual growth or maturity: silos and fiefdoms.
These antipatterns are a result of organic changes within various teams, which result in unhealthy organizational
behaviors. To address the resistance caused by each antipattern, it's important to understand the root cause of
this formation.
Antipatterns
The organic and responsive growth within IT that creates healthy IT teams can also result in antipatterns that
block transformation and cloud adoption. IT silos and fiefdoms are different from the natural microcultures within
healthy IT teams. In either pattern, the team focus tends to be directed toward protecting their "turf". When team
members are confronted with an opportunity to drive change and improve operations, they will invest more time
and energy into blocking the change than finding a positive solution.
As mentioned earlier, healthy IT teams can create natural resistance and positive friction. Silos and fiefdoms are a
different challenge. There is no documented leading indicator for either antipattern. These antipatterns tend to be
identified after months of cloud center of excellence and cloud governance team efforts. They're discovered as the
result of ongoing resistance.
Even in toxic cultures, the efforts of the CCoE and the cloud governance team should help drive cultural growth
and technical progress. After months of effort, a few teams might still show no signs of inclusive behaviors and
stand firm in their resistance to change. These teams are likely operating in one of the following antipattern
models: silos and fiefdoms. Although these models have similar symptoms, the root cause and approaches to
addressing resistance are radically different between them.
IT silos
Team members in an IT silo are likely to define themselves through their alignment to a small number of IT
vendors or an area of technical specialization. But don't confuse silos with IT fiefdoms. Silos tend to be driven by
comfort and passion, and silos are often easier to overcome than the fear-driven motives behind fiefdoms.
This antipattern often emerges from a common passion for a specific solution. IT silos are then reinforced by
the team's advanced skills as a result of the investment in that specific solution. This superior skill can be an
accelerator to cloud adoption efforts if the resistance to change can be overcome. It can also become a major
blocker if the silos are broken down or if the team members can't accurately evaluate options. Fortunately, IT silos
can often be overcome without any significant changes to the organizational chart.
Address resistance from IT silos
IT silos can be addressed through the following approaches. The best approach will depend on the root cause of
the resistance.
Create virtual teams: The organizational readiness section of the Cloud Adoption Framework describes a
multilayered structure for integrating and defining four virtual teams. One benefit of this structure is cross-
organization visibility and inclusion. Introducing a cloud center of excellence creates a high-profile aspirational
team that top engineers will want to participate in. This helps create new cross-solution alignments that aren't
bound by organizational-chart constraints, and will drive inclusion of top engineers who have been sheltered by
IT silos.
Introduction of a cloud strategy team will create immediate visibility to IT contributions regarding cloud adoption
efforts. When IT silos fight for separation, this visibility can help motivate IT and business leaders to properly
support those resistant team members. This process is a quick path to stakeholder engagement and support.
Consider experimentation and exposure: Team members in an IT silo have likely been constrained to think a
certain way for some time. Breaking the one-track mind is a first step to addressing resistance.
Experimentation and exposure are powerful tools for breaking down barriers in silos. The team members might
be resistant to competing solutions, so it's not wise to put them in charge of an experiment that competes with
their existing solution. But as part of a first workload test of the cloud, the organization should implement
competing solutions. The siloed team should be invited to participate as an input and review source, but not as a
decision maker. This should be clearly communicated to the team, along with a commitment to engage the team
more deeply as a decision maker before moving into production solutions.
During review of the competing solution, use the practices outlined in Define corporate policy to document
tangible risks of the experiment and establish policies that help the siloed team become more comfortable with
the future state. This will expose the team to new solutions and harden the future solution.
Be "boundar y-less": The teams that drive cloud adoption find it easy to push boundaries by exploring exciting,
new cloud-native solutions. This is one half of the approach to removing boundaries. But that thinking can further
reinforce IT silos. Pushing for change too quickly and without respect to existing cultures can create unhealthy
friction and lead to natural resistance.
When IT silos start to resist, it's important to be "boundary-less" in your own solutions. Be mindful of one simple
truth: cloud-native isn't always the best solution. Consider hybrid solutions that might provide an opportunity to
extend the existing investments of the IT silo into the future.
Also consider cloud-based versions of the solution that the IT silo team uses now. Experiment with those solutions
and expose yourself to the viewpoint of those living in the IT silo. At a minimum, you will gain a fresh perspective.
In many situations, you might earn enough of the IT silo's respect to lessen resistance.
Invest in education: Many people living in an IT silo became passionate about the current solution as a result of
expanding their own education. Investing in the education of these teams is seldom misplaced. Allocate time for
these individuals to engage in self-learning, classes, or even conferences to break the day-to-day focus on the
current solution.
For education to be an investment, some return must come as a result of the expense. In exchange for the
investment, the team might demonstrate the proposed solution to the rest of the teams involved in cloud
adoption. They might also provide documentation of the tangible risks, risk management approaches, and
desired policies in adopting the proposed solution. Each will engage these teams in the solution and help take
advantage of their tribal knowledge.
Turn roadblocks into speed bumps: IT silos can slow or stop any transformation. Experimentation and
iteration will find a way, but only if the project keeps moving. Focus on turning roadblocks into merely speed
bumps. Define policies that everyone can be temporarily comfortable with in exchange for continued
progression.
For instance, if IT security is the roadblock because its security solution can't monitor compromises of protected
data in the cloud, establish data classification policies. Prevent deployment of classified data into the cloud until
an agreeable solution can be found. Invite IT security into experimentation with hybrid or cloud-native solutions
to monitor protected data.
If the network team operates as a silo, identify workloads that are self-contained and don't have network
dependencies. In parallel, experiment, expose, and educate the network team while working on hybrid or
alternative solutions.
Be patient and be inclusive: It's tempting to move on without support of an IT silo. But this decision will cause
disruptions and roadblocks down the road. Changing minds in members of the IT silo can take time. Be patient with
their natural resistance and convert it to value. Be inclusive and invite healthy friction to improve the future solution.
Never compete: The IT silo exists for a reason. It persists for a reason. There is an investment in maintaining the
solution that the team members are passionate about. Directly competing with the solution or the IT silo will
distract from the real goal of achieving business outcomes. This trap has blocked many transformation projects.
Stay focused on the goal, as opposed to a single component of the goal. Help accentuate the positive aspects of
the IT silo's solution and help the team members make wise decisions about the best solutions for the future.
Don't insult or degrade the current solution, because that would be counterproductive.
Partner with the business: If the IT silo isn't blocking business outcomes, why do you care? There is no perfect
solution or perfect IT vendor. Competition exists for a reason; each has its own benefits.
Embrace diversity and include the business by supporting and aligning to a strong cloud strategy team. When an
IT silo supports a solution that blocks business outcomes, it will be easier to communicate that roadblock without
the noise of technical squabbles. Supporting nonblocking IT silos will show an ability to partner for the desired
business outcomes. These efforts will earn more respect and greater support from the business when an IT silo
presents a legitimate blocker.
IT fiefdoms
Team members in an IT fiefdom are likely to define themselves through their alignment to a specific process or
area of responsibility. The team operates under an assumption that external influence on its area of responsibility
will lead to problems. Fiefdoms tend to be a fear-driven antipattern, which will require significant leadership
support to overcome.
Fiefdoms are especially common in organizations that have experienced IT downsizing, frequent turbulence in IT
staff, or poor IT leadership. When the business sees IT purely as a cost center, fiefdoms are much more likely to
arise.
Generally, fiefdoms are the result of a line manager who fears loss of the team and the associated power base.
These leaders often have a sense of duty to their team and feel a need to protect their subordinates from
negative consequences. Phrases like "shelter the team from change" and "protect the team from process
disruption" can be indicators of an overly guarded manager who might need more support from leadership.
Address resistance from IT fiefdoms
IT fiefdoms can demonstrate some growth by following the approaches to addressing IT silo resistance. Before
you try to address resistance from an IT fiefdom, we recommend that you treat the team like an IT silo first. If
those types of approaches fail to yield any significant change, the resistant team might be suffering from an IT
fiefdom antipattern. The root cause of IT fiefdoms is a little more complex to address, because that resistance
tends to come from the direct line manager (or a leader higher up the organizational chart). Challenges that are IT
silo-driven are typically simpler to overcome.
When continued resistance from IT fiefdoms blocks cloud adoption efforts, it might be wise for a combined effort
to evaluate the situation with existing IT leaders. IT leaders should carefully consider insights from the cloud
strategy team, cloud center of excellence, and cloud governance team before making decisions.
NOTE
IT leaders should never take changes to the organizational chart lightly. They should also validate and analyze feedback
from each of the supporting teams. But transformational efforts like cloud adoption tend to magnify underlying issues that
have gone unnoticed or unaddressed long before this effort. When fiefdoms are preventing the company's success,
leadership changes are a likely necessity.
Fortunately, removing the leader of a fiefdom doesn't often end in termination. These strong, passionate leaders can often
move into a management role after a brief period of reflection. With the right support, this change can be healthy for the
leader of the fiefdom and the current team.
Caution
For managers of IT fiefdoms, protecting the team from risk is a clear leadership value. But there's a fine line
between protection and isolation. When the team is blocked from participating in driving changes, it can have
psychological and professional consequences on the team. The urge to resist change might be strong, especially
during times of visible change.
The manager of any isolated team can best demonstrate a growth mindset by experimenting with the guidance
associated with healthy IT teams in the preceding sections. Active and optimistic participation in governance and
CCoE activities can lead to personal growth. Managers of IT fiefdoms are best positioned to change stifling
mindsets and help the team develop new ideas.
IT fiefdoms can be a sign of systemic leadership issues. To overcome an IT fiefdom, IT leaders need the ability to
make changes to operations, responsibilities, and occasionally even the people who provide line management of
specific teams. When those changes are required, it's wise to approach those changes with clear and defensible
data points.
Alignment with business stakeholders, business motivations, and business outcomes might be required to drive
the necessary change. Partnership with the cloud strategy team, cloud center of excellence, and the cloud
governance team can provide the data points needed for a defensible position. When necessary, these teams
should be involved in a group escalation to address challenges that can't be addressed with IT leadership alone.
Next steps
Disrupting organizational antipatterns is a team effort. To act on this guidance, review the organizational
readiness introduction to identify the right team structures and participants:
Identify the right team structures and participants
Tools and templates
10/30/2020 • 3 minutes to read • Edit Online
The Cloud Adoption Framework includes tools that help you quickly implement technical change. Use these tools,
templates, and assessments to accelerate cloud adoption. The following resources can help you in each phase of
adoption. Some of the tools and templates can be used in multiple phases.
Strategy
RESOURCE DESCRIPTION
Cloud journey tracker Identify your cloud adoption path based on the needs of your
business.
Strategy and plan template Document decisions as you execute your cloud adoption
strategy and plan.
Plan
RESOURCE DESCRIPTION
Cloud journey tracker Identify your cloud adoption path based on the needs of your
business.
Strategy and plan template Document decisions as you execute your cloud adoption
strategy and plan.
Cloud adoption plan generator Standardize processes by deploying a backlog to Azure Boards
using a template.
Ready
RESOURCE DESCRIPTION
Readiness checklist Use this checklist to prepare your environment for adoption,
including preparing your first migration landing zone,
personalizing the blueprint, and expanding it.
Naming and tagging conventions tracking template Document decisions about naming and tagging standards to
ensure consistency and reduce onboarding time.
CAF Migration landing zone blueprint Provision and prepare to host workloads being migrated from
an on-premises environment into Azure. For more information
about this blueprint, see Deploy a migration landing zone.
Terraform modules Open-source code base for the Terraform version of the CAF
landing zones.
Terraform registry The Terraform registry website, filtered to list all of the Cloud
Adoption Framework modules needed to create a landing
zone via Terraform.
Govern
RESOURCE DESCRIPTION
Governance benchmark assessment Identify gaps between your current state and business
priorities, and get the right resources to help you address
those gaps.
Governance discipline template Define the basic set of governance processes used to enforce
each governance discipline.
Cost Management discipline template Define the policy statements and design guidance that allow
you to mature the cloud governance within your organization
with a focus on cost management.
Deployment Acceleration discipline template Define the policy statements and design guidance that allow
you to mature the cloud governance within your organization
with a focus on deployment acceleration.
Identity Baseline discipline template Define the policy statements and design guidance that allow
you to mature the cloud governance within your organization
with a focus on identity requirements.
Resource Consistency discipline template Define the policy statements and design guidance that allow
you to mature the cloud governance within your organization
with a focus on resource consistency.
Security Baseline discipline template Define the policy statements and design guidance that allow
you to mature the cloud governance within your organization
with a focus on security baseline.
Azure governance visualizer The Azure governance visualizer is a PowerShell script that
iterates through an Azure tenant's management group
hierarchy down to the subscription level. It captures data from
the most relevant Azure governance capabilities such as Azure
Policy, role-based access control (RBAC), and Azure Blueprints.
From the collected data, the visualizer shows your hierarchy
map, creates a tenant summary, and builds granular scope
insights about your management groups and subscriptions.
Migrate
RESOURCE DESCRIPTION
Datacenter migration discovery checklist Review this checklist for information that helps identify
workloads, servers, and other assets in your datacenter. Use
this information to help plan your migration.
Manage
RESOURCE DESCRIPTION
Microsoft Azure Well-Architected Review This online assessment will aid in defining workload-specific
architectures and operations options.
Best practices source code This deployable source code complements and accelerates
adoption of best practices for Azure server management
services. Use this source code to quickly enable operations
management and establish an operations baseline.
Organize
RESOURCE DESCRIPTION
Cross-team RACI diagram Download and modify the RACI spreadsheet template to track
organizational structure decisions over time.
Azure security best practices
10/30/2020 • 26 minutes to read • Edit Online
These are the top Azure security best practices that Microsoft recommends based on lessons learned across
customers and our own environments.
You can view a video presentation of these best practices in the Microsoft Tech Community.
IMPORTANT
Identity protocols are critical to access control in the cloud but often not prioritized in on-premises security, so security teams
should make a point of developing familiarity with these protocols and logs.
Microsoft provides extensive resources to help technical professionals ramp up on securing Azure resources and
report compliance:
Azure Security
AZ-500 learning path (and Certification)
Azure security benchmark (ASB) – Prescriptive Best Practices and Controls for Azure Security
Security Baselines for Azure – Application of ASB to individual Azure Services
Microsoft security best practices - Videos and Documentation
Azure Compliance
Regulatory compliance evaluation with Azure Security Center
Identity Protocols and Security
Azure security documentation site
Azure AD Authentication YouTube series
Securing Azure environments with Azure Active Directory
Also see the Azure Security Benchmark GS-3: Align organization roles, responsibilities, and accountabilities
Network Management: Enterprise-wide virtual network and subnet allocation. Typically the existing network operations team in Central IT Operations.
Server Endpoint Security: Monitor and remediate server security (patching, configuration, endpoint security, etc.). Typically Central IT Operations and Infrastructure and endpoint security teams jointly.
Incident Monitoring and Response: Investigate and remediate security incidents in SIEM or source console (Azure Security Center, Azure AD Identity Protection, etc.). Typically the security operations team.
Policy Management: Set direction for use of role-based access control (RBAC), Azure Security Center, administrator protection strategy, and Azure Policy to govern Azure resources. Typically Policy and Standards + Security Architecture teams jointly.
Identity Security and Standards: Set direction for Azure AD directories, PIM/PAM usage, MFA, password/synchronization configuration, and application identity standards. Typically Identity and Key Management + Policy and Standards + Security Architecture teams jointly.
NOTE
Ensure decision makers have the appropriate education in their area of the cloud to accompany this responsibility.
Ensure decisions are documented in policy and standards to provide a record and guide the organization over the long
term.
Also see the Azure Security Benchmark GS-3: Align organization roles, responsibilities, and accountabilities
NOTE
The goal of simplification and automation isn’t about getting rid of jobs, but about removing the burden of repetitive tasks
from people so they can focus on higher value human activities like engaging with and educating IT and DevOps teams.
IMPORTANT
The explanations for why, what, and how to secure resources are often similar across different resource types and
applications, but it's critical to relate these to what each team already knows and cares about. Security teams should engage
with their IT and DevOps counterparts as a trusted advisor and partner focused on enabling these teams to be successful.
Tooling: Secure Score in Azure Security Center provides an assessment of the most important security information
in Azure for a wide variety of assets. This should be your starting point on posture management and can be
supplemented with custom Azure policies and other mechanisms as needed.
Frequency: Set up a regular cadence (typically monthly) to review Azure secure score and plan initiatives with
specific improvement goals. The frequency can be increased as needed.
TIP
Gamify the activity if possible to increase engagement, such as creating fun competitions and prizes for the DevOps teams
that improve their score the most.
Also see the Azure Security Benchmark GS-2: Define security posture management strategy.
NOTE
Text message-based MFA is now relatively inexpensive for attackers to bypass, so focus on passwordless and stronger MFA.
Also see the Azure Security Benchmark ID-4: Use strong authentication controls for all Azure Active Directory based
access.
NOTE
This best practice refers specifically to enterprise resources. For partner accounts, use Azure AD B2B so you don’t have to
create and maintain accounts in your directory. For customer/citizen accounts, use Azure AD B2C to manage them.
Why : Multiple accounts and identity directories create unnecessary friction and confusion in daily workflows for
productivity users, developers, IT and Identity Admins, security analysts, and other roles.
Managing multiple accounts and directories also creates an incentive for poor security practices such as reusing the
same password across accounts and increases the likelihood of stale/abandoned accounts that attackers can target.
While it sometimes seems easier to quickly stand up a custom directory (based on LDAP, etc.) for a particular
application or workload, this creates much more integration and maintenance work to set up and manage. This is
similar in many ways to the decision of setting up an additional Azure tenant or additional on-premises Active
Directory Forest vs. using the existing enterprise one. See also the “Drive Simplicity” security principle.
Who: This is often a cross-team effort driven by Security Architecture or Identity and Key Management teams.
Sponsorship: This is typically sponsored by Identity and Key Management and Security Architecture (though
some organizations may require sponsorship by the CISO or CIO).
Execution: This is a collaborative effort involving:
Security Architecture: Incorporates into security and IT architecture documents and diagrams.
Policy and standards: Document policy and monitor for compliance.
Identity and Key Management or Central IT Operations: Implement the policy by enabling
features and supporting developers with accounts, education, and so on.
Application developers and/or Central IT Operations: Use identity in applications and Azure service
configurations (responsibilities will vary based on level of DevOps adoption).
How: Adopt a pragmatic approach that starts with new ‘greenfield’ capabilities (growing today) and then cleans up
challenges with the ‘brownfield’ of existing applications and services as a follow-up exercise:
Greenfield: Establish and implement a clear policy that all enterprise identity going forward should use a
single Azure AD directory with a single account for each user.
Brownfield: Many organizations often have multiple legacy directories and identity systems. Address these
when the cost of ongoing management friction exceeds the investment to clean it up. While identity
management and synchronization solutions can mitigate some of these issues, they lack deep integration of
security and productivity features that enable a seamless experience for users, admins, and developers.
The ideal time to consolidate your use of identity is during application development cycles as you:
Modernize applications for the cloud
Update cloud applications with DevOps processes
While there are valid reasons for a separate directory in the case of extremely independent business units or
regulatory requirements, multiple directories should be avoided in all other circumstances.
Also see the Azure Security Benchmark ID-1: Standardize Azure Active Directory as the central identity and
authentication system.
IMPORTANT
The only exception to the single accounts rule is that privileged users (including IT administrators and security analysts)
should have separate accounts for standard user tasks vs. administrative tasks.
For more information, see Azure Security Benchmark Privileged Access.
The architectural decision guides in the Cloud Adoption Framework describe patterns and models that help when
creating cloud governance design guidance. Each decision guide focuses on one core infrastructure component of
cloud deployments and lists patterns and models that can support specific cloud deployment scenarios.
When you begin to establish cloud governance for your organization, actionable governance journeys provide a
baseline roadmap. These journeys make assumptions about requirements and priorities that might not reflect
those of your organization.
These decision guides supplement the sample governance journeys by providing alternative patterns and models
that help you align the architectural design choices made in the example design guidance with your own
requirements.
Next steps
Learn how subscriptions and accounts serve as the base of a cloud deployment.
Subscriptions design
Subscription decision guide
10/30/2020 • 3 minutes to read • Edit Online
Effective subscription design helps organizations establish a structure to organize and manage assets in Azure
during cloud adoption. This guide will help you decide when to create additional subscriptions and expand your
management group hierarchy to support your business priorities.
Prerequisites
Adopting Azure begins by creating an Azure subscription, associating it with an account, and deploying
resources like virtual machines and databases to the subscription. For an overview of these concepts, see Azure
fundamental concepts.
Create your initial subscriptions.
Create additional subscriptions to scale your Azure environment.
Organize and manage your subscriptions using Azure management groups.
NOTE
An Azure Enterprise Agreement (EA) allows you to define another organizational hierarchy for billing purposes. This
hierarchy is distinct from your management group hierarchy, which focuses on providing an inheritance model for easily
applying suitable policies and access control to your resources.
Each organization will categorize its applications differently, often separating subscriptions based on specific
applications or services or along the lines of application archetypes. This categorization is often designed to
support workloads that are likely to consume most of the resource limits of a subscription, or to separate mission-
critical workloads to ensure they don't compete with other workloads under these limits. Some workloads that
might justify a separate subscription include:
Mission-critical workloads.
Applications that are part of cost of goods sold (COGS) within your company. For example, every widget
manufactured by a company contains an Azure IoT module that sends telemetry. This may require a
dedicated subscription for accounting or governance purposes as part of COGS.
Applications subject to regulatory requirements such as HIPAA or FedRAMP.
Functional strategy
The functional strategy organizes subscriptions and accounts along functional lines, such as finance, sales, or IT
support, using a management group hierarchy.
Business unit strategy
The business unit strategy groups subscriptions and accounts based on profit and loss category, business unit,
division, profit center, or similar business structure using a management group hierarchy.
Geographic strategy
For organizations with global operations, the geographic strategy groups subscriptions and accounts based on
geographic regions using a management group hierarchy.
Related resources
Resource access management in Azure
Multiple layers of governance in large enterprises
Multiple geographic regions
Next steps
Subscription design is just one of the core infrastructure components requiring architectural decisions during a
cloud adoption process. Visit the architectural decision guides overview to learn about additional strategies used
when making design decisions for other types of infrastructure.
Architectural decision guides
Identity decision guide
10/30/2020 • 7 minutes to read • Edit Online
In any environment, whether on-premises, hybrid, or cloud-only, IT needs to control which administrators, users,
and groups have access to resources. Identity and access management (IAM) services enable you to manage
access control in the cloud.
Several options are available for managing identity in a cloud environment. These options vary in cost and
complexity. A key factor in structuring your cloud-based identity services is the level of integration required with
your existing on-premises identity infrastructure.
Azure Active Directory (Azure AD) provides a base level of access control and identity management for Azure
resources. If your organization's on-premises Active Directory infrastructure has a complex forest structure or
customized organizational units (OUs), your cloud-based workloads might require directory synchronization
with Azure AD for a consistent set of identities, groups, and roles between your on-premises and cloud
environments. Additionally, support for applications that depend on legacy authentication mechanisms might
require the deployment of Active Directory Domain Services (AD DS) in the cloud.
Cloud-based identity management is an iterative process. You could start with a cloud-native solution with a
small set of users and corresponding roles for an initial deployment. As your migration matures, you might
need to integrate your identity solution using directory synchronization or add domain services as part of your
cloud deployments. Revisit your identity strategy in every iteration of your migration process.
As part of planning your migration to Azure, you will need to determine how best to integrate your existing
identity management and cloud identity services. The following are common integration scenarios.
Cloud baseline
Azure AD is the native identity and access management (IAM) system for granting users and groups access to
management features on the Azure platform. If your organization lacks a significant on-premises identity
solution, and you plan to migrate workloads to be compatible with cloud-based authentication mechanisms,
you should begin developing your identity infrastructure using Azure AD as a base.
Cloud baseline assumptions: Using a purely cloud-native identity infrastructure assumes the following:
Your cloud-based resources will not have dependencies on on-premises directory services or Active
Directory servers, or workloads can be modified to remove those dependencies.
The application or service workloads being migrated either support authentication mechanisms compatible
with Azure AD or can be modified easily to support them. Azure AD relies on internet-ready authentication
mechanisms such as SAML, OAuth, and OpenID Connect. Existing workloads that depend on legacy
authentication methods using protocols such as Kerberos or NTLM might need to be refactored before
migrating to the cloud using the cloud baseline pattern.
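For illustration, the following Python sketch shows the kind of internet-ready authentication the cloud baseline pattern expects: an OAuth 2.0 client credentials flow against Azure AD using the MSAL library. The tenant ID, client ID, and secret are hypothetical placeholders; a workload that can authenticate this way (or with SAML or OpenID Connect) fits the pattern without refactoring.

import msal

TENANT_ID = "00000000-0000-0000-0000-000000000000"   # placeholder
CLIENT_ID = "11111111-1111-1111-1111-111111111111"   # placeholder
CLIENT_SECRET = "replace-with-a-secret-kept-in-key-vault"  # placeholder

app = msal.ConfidentialClientApplication(
    CLIENT_ID,
    authority=f"https://login.microsoftonline.com/{TENANT_ID}",
    client_credential=CLIENT_SECRET,
)

# Request an app-only token for Microsoft Graph; any Azure AD-protected API
# that exposes an application permission works the same way.
result = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])

if "access_token" in result:
    print("Token acquired; expires in", result.get("expires_in"), "seconds")
else:
    print("Authentication failed:", result.get("error_description"))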
TIP
Completely migrating your identity services to Azure AD eliminates the need to maintain your own identity infrastructure,
significantly simplifying your IT management.
But Azure AD is not a full replacement for a traditional on-premises Active Directory infrastructure. Directory features
such as legacy authentication methods, computer management, or group policy might not be available without deploying
additional tools or services to the cloud.
For scenarios where you need to integrate your on-premises identities or domain services with your cloud deployments,
see the directory synchronization and cloud-hosted domain services patterns discussed below.
Directory synchronization
For organizations with existing on-premises Active Directory infrastructure, directory synchronization is often
the best solution for preserving existing user and access management while providing the required IAM
capabilities for managing cloud resources. This process continuously replicates directory information between
Azure AD and on-premises directory services, allowing common credentials for users and a consistent identity,
role, and permission system across your entire organization.
NOTE
Organizations that have adopted Microsoft 365 might have already implemented directory synchronization between their
on-premises Active Directory infrastructure and Azure Active Directory.
Directory synchronization assumptions: Using a synchronized identity solution assumes the following:
You need to maintain a common set of user accounts and groups across your cloud and on-premises IT
infrastructure.
Your on-premises identity services support replication with Azure AD.
TIP
Any cloud-based workloads that depend on legacy authentication mechanisms provided by on-premises Active Directory
servers and that are not supported by Azure AD will still require either connectivity to on-premises domain services or
virtual servers in the cloud environment providing these services. Using on-premises identity services also introduces
dependencies on connectivity between the cloud and on-premises networks.
TIP
While a directory migration coupled with cloud-hosted domain services provides great flexibility when migrating existing
workloads, hosting virtual machines within your cloud virtual network to provide these services does increase the
complexity of your IT management tasks. As your cloud migration experience matures, examine the long-term
maintenance requirements of hosting these servers. Consider whether refactoring existing workloads for compatibility
with cloud identity providers such as Azure Active Directory can reduce the need for these cloud-hosted servers.
Learn more
For more information about identity services in Azure, see:
Azure AD. Azure AD provides cloud-based identity services. It allows you to manage access to your Azure
resources and control identity management, device registration, user provisioning, application access control,
and data protection.
Azure AD Connect. The Azure AD Connect tool allows you to connect Azure AD instances with your
existing identity management solutions, allowing synchronization of your existing directory in the cloud.
Role-based access control (RBAC). Azure AD provides RBAC to efficiently and securely manage access to
resources in the management plane. Jobs and responsibilities are organized into roles, and users are
assigned to these roles. RBAC allows you to control who has access to a resource along with which actions a
user can perform on that resource.
Azure AD Privileged Identity Management (PIM). PIM lowers the exposure time of resource access
privileges and increases your visibility into their use through reports and alerts. It limits users to taking on
their privileges "just in time" (JIT), or by assigning privileges for a shorter duration, after which privileges are
revoked automatically.
Integrate on-premises Active Directory domains with Azure Active Directory. This reference
architecture provides an example of directory synchronization between on-premises Active Directory
domains and Azure AD.
Extend Active Directory Domain Services (AD DS) to Azure. This reference architecture provides an
example of deploying AD DS servers to extend domain services to cloud-based resources.
Extend Active Directory Federation Services (AD FS) to Azure. This reference architecture configures
Active Directory Federation Services (AD FS) to perform federated authentication and authorization with
your Azure AD directory.
Next steps
Identity is just one of the core infrastructure components requiring architectural decisions during a cloud
adoption process. To learn about alternative patterns or models used when making design decisions for other
types of infrastructure, see the architectural decision guides overview.
Architectural decision guides overview
Policy enforcement decision guide
Defining organizational policy is not effective unless it can be enforced across your organization. A key aspect of
planning any cloud migration is determining how best to combine tools provided by the cloud platform with
your existing IT processes to maximize policy compliance across your entire cloud estate.
Jump to: Baseline best practices | Policy compliance monitoring | Policy enforcement | Cross-organization policy |
Automated enforcement
As your cloud estate grows, you will be faced with a corresponding need to maintain and enforce policy across a
larger array of resources and subscriptions. As your estate gets larger and your organization's policy
requirements increase, the scope of your policy enforcement processes needs to expand to ensure consistent
policy adherence and fast violation detection.
Platform-provided policy enforcement mechanisms at the resource or subscription level are usually sufficient for
smaller cloud estates. Larger deployments justify a larger enforcement scope and may need to take advantage of
more sophisticated enforcement mechanisms involving deployment standards, resource grouping and
organization, and integrating policy enforcement with your logging and reporting systems.
The primary factors in determining the scope of your policy enforcement processes are your organization's cloud
governance requirements, the size and nature of your cloud estate, and how your organization is reflected in your
subscription design. An increase in size of your estate or a greater need to centrally manage policy enforcement
can both justify an increase in enforcement scope.
Policy enforcement
In Azure, you can apply configuration settings and resource creation rules at the management group,
subscription, or resource group level to help ensure policy alignment.
Azure Policy is an Azure service for creating, assigning, and managing policies. These policies enforce different
rules and effects over your resources, so those resources stay compliant with your corporate standards and
service-level agreements. Azure Policy evaluates your resources for noncompliance with assigned policies. For
example, you might want to limit the SKU size of virtual machines in your environment. After implementing a
corresponding policy, new and existing resources are evaluated for compliance. With the right policy, existing
resources can be brought into compliance.
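As a minimal illustration of this pattern, the following Azure CLI sketch assigns the built-in Allowed virtual machine size SKUs definition at resource group scope. The resource group, subscription ID, and SKU list are placeholder values, and the parameter name should be confirmed against the definition you assign.

```azurecli
# Look up the name (GUID) of the built-in "Allowed virtual machine size SKUs" definition.
az policy definition list \
  --query "[?displayName=='Allowed virtual machine size SKUs'].name" -o tsv

# Assign the definition at resource group scope, restricting new VMs to two SKUs.
# <definition-name>, <subscription-id>, and <resource-group> are placeholders.
az policy assignment create \
  --name "limit-vm-skus" \
  --policy "<definition-name>" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>" \
  --params '{ "listOfAllowedSKUs": { "value": [ "Standard_B2s", "Standard_D2s_v3" ] } }'
```

After the assignment is created, Azure Policy evaluates both new and existing virtual machines in that scope and reports any noncompliant resources.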
Cross-organization policy
As your cloud estate grows to span many subscriptions that require enforcement, you will need to focus on a
cloud-estate-wide enforcement strategy to ensure policy consistency.
Your subscription design must account for policy in relation to your organizational structure. In addition to
helping support complex organization within your subscription design, Azure management groups can be used
to assign Azure Policy rules across multiple subscriptions.
Automated enforcement
While standardized deployment templates are effective at a smaller scale, Azure Blueprints allows large-scale
standardized provisioning and deployment orchestration of Azure solutions. Workloads across multiple
subscriptions can be deployed with consistent policy settings for any resources created.
For IT environments integrating cloud and on-premises resources, you may need to use logging and reporting
systems to provide hybrid monitoring capabilities. Your third-party or custom operational monitoring systems
may offer additional policy enforcement capabilities. For larger or more mature cloud estates, consider how best
to integrate these systems with your cloud assets.
Next steps
Policy enforcement is just one of the core infrastructure components requiring architectural decisions during a
cloud adoption process. Visit the architectural decision guides overview to learn about alternative patterns or
models used when making design decisions for other types of infrastructure.
Architectural decision guides
Resource consistency decision guide
Azure subscription design defines how you organize your cloud assets in relation to your organization's
structure, accounting practices, and workload requirements. In addition to this level of structure, addressing
your organizational governance policy requirements across your cloud estate requires the ability to
consistently organize, deploy, and manage resources within a subscription.
Jump to: Basic grouping | Deployment consistency | Policy consistency | Hierarchical consistency | Automated
consistency
Decisions regarding the level of your cloud estate's resource consistency requirements are primarily driven by
these factors: post-migration digital estate size, business or environmental requirements that don't fit neatly
within your existing subscription design approaches, or the need to enforce governance over time after
resources have been deployed.
As these factors increase in importance, the benefits of ensuring consistent deployment, grouping, and
management of cloud-based resources become more important. Achieving more advanced levels of resource
consistency to meet increasing requirements requires more effort spent in automation, tooling, and
consistency enforcement, and this results in additional time spent on change management and tracking.
Basic grouping
In Azure, resource groups are a core resource organization mechanism to logically group resources within a
subscription.
Resource groups act as containers for resources with a common lifecycle as well as shared management
constraints such as policy or role-based access control (RBAC) requirements. Resource groups can't be nested,
and resources can only belong to one resource group. All control plane actions act on all resources in a
resource group. For example, deleting a resource group also deletes all resources within that group. The
preferred pattern for resource group management is to consider:
1. Are the contents of the resource group developed together?
2. Are the contents of the resource group managed, updated, and monitored together and done so by the
same people or teams?
3. Are the contents of the resource group retired together?
If you answered no to any of the above points, the resource in question should be placed elsewhere, in another
resource group.
IMPORTANT
Resource groups are also region specific; however, it is common for resources to be in different regions within the same
resource group because they're managed together as described above. For more information about region selection, see
Multiple regions.
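As a simple sketch of this grouping pattern, the following Azure CLI commands create a resource group for a single workload whose contents are developed, managed, and retired together; all names are illustrative.

```azurecli
# Create a resource group to hold all assets that share the workload's lifecycle.
az group create \
  --name "app1-prod-rg" \
  --location "eastus2" \
  --tags workload=app1 env=prod

# When the workload is retired, deleting the group removes every resource it contains.
az group delete --name "app1-prod-rg"
```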
Deployment consistency
Building on top of the base resource grouping mechanism, the Azure platform provides a system for using
templates to deploy your resources to the cloud environment. You can use templates to create consistent
organization and naming conventions when deploying workloads, enforcing those aspects of your resource
deployment and management design.
Azure Resource Manager templates allow you to repeatedly deploy your resources in a consistent state using a
predetermined configuration and resource group structure. Resource Manager templates help you define a set
of standards as a basis for your deployments.
For example, you can have a standard template for deploying a web server workload that contains two virtual
machines as web servers combined with a load balancer to distribute traffic between the servers. You can then
reuse this template to create a structurally identical set of virtual machines and load balancer whenever this type
of workload is needed, only changing the deployment name and IP addresses involved.
You can also programmatically deploy these templates and integrate them with your CI/CD systems.
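The following sketch shows what that reuse might look like with the Azure CLI, assuming a hypothetical webserver.json template that parameterizes the deployment name and target subnet; the template's VM and load balancer resource definitions are omitted for brevity.

```azurecli
# webserver.json (outline only): a parameterized template reused for each
# structurally identical deployment.
# {
#   "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
#   "contentVersion": "1.0.0.0",
#   "parameters": {
#     "deploymentName": { "type": "string" },
#     "subnetId":       { "type": "string" }
#   },
#   "resources": [ /* two web server VMs and a load balancer would be declared here */ ]
# }

# Deploy the same template into a resource group with deployment-specific parameters.
az deployment group create \
  --resource-group "app1-prod-rg" \
  --template-file webserver.json \
  --parameters deploymentName=app1 subnetId="<subnet-resource-id>"
```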
Policy consistency
To ensure that governance policies are applied when resources are created, part of resource grouping design
involves using a common configuration when deploying resources.
By combining resource groups and standardized Resource Manager templates, you can enforce standards for
what settings are required in a deployment and what Azure Policy rules are applied to each resource group or
resource.
For example, you may have a requirement that all virtual machines deployed within your subscription connect
to a common subnet managed by your central IT team. You can create a standard template for deploying
workload VMs to create a separate resource group for the workload and deploy the required VMs there. This
resource group would have a policy rule to only allow network interfaces within the resource group to be
joined to the shared subnet.
For a more in-depth discussion of enforcing your policy decisions within a cloud deployment, see Policy
enforcement.
Hierarchical consistency
Resource groups allow you to support an additional level of hierarchy inside a subscription, applying Azure
Policy rules and access controls at the resource group level. As the size of your cloud
estate grows, you may need to support more complicated cross-subscription governance requirements than
can be supported using the Azure Enterprise Agreement's enterprise/department/account/subscription
hierarchy.
Azure management groups allow you to organize subscriptions into more sophisticated organizational
structures by grouping subscriptions in a hierarchy distinct from your Enterprise Agreement's hierarchy. This
alternate hierarchy allows you to apply access control and policy enforcement mechanisms across multiple
subscriptions and the resources they contain. Management group hierarchies can be used to match your cloud
estate's subscriptions with operations or business governance requirements. For more information, see the
subscription decision guide.
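A minimal sketch of building such a hierarchy with the Azure CLI follows; the management group names and subscription ID are illustrative.

```azurecli
# Create a management group hierarchy that is independent of the Enterprise Agreement hierarchy.
az account management-group create --name "contoso" --display-name "Contoso"
az account management-group create --name "contoso-corp" --display-name "Corp" --parent "contoso"

# Place an existing subscription under the new group so that policy and access
# control assigned at "contoso-corp" apply to it.
az account management-group subscription add \
  --name "contoso-corp" \
  --subscription "<subscription-id>"
```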
Automated consistency
For large cloud deployments, global governance becomes both more important and more complex. It is crucial
to automatically apply and enforce governance requirements when deploying resources, as well as meet
updated requirements for existing deployments.
Azure Blueprints enable organizations to support global governance of large cloud estates in Azure. Blueprints
move beyond the capabilities provided by standard Azure Resource Manager templates to create complete
deployment orchestrations capable of deploying resources and applying policy rules. Blueprints support
versioning, the ability to update all subscriptions where the blueprint was used, and the ability to lock down
deployed subscriptions to avoid the unauthorized creation and modification of resources.
These deployment packages allow IT and development teams to rapidly deploy new workloads and networking
assets that comply with changing organizational policy requirements. Blueprints can also be integrated into
CI/CD pipelines to apply revised governance standards to deployments as they're updated.
Next steps
Resource consistency is just one of the core infrastructure components requiring architectural decisions during
a cloud adoption process. Visit the architectural decision guides overview to learn about alternative patterns or
models used when making design decisions for other types of infrastructure.
Architectural decision guides
Resource naming and tagging decision guide
Organizing cloud-based resources is a crucial task for IT, unless you only have simple deployments. Use naming
and tagging standards to organize your resources for these reasons:
Resource management: Your IT teams will need to quickly locate resources associated with specific
workloads, environments, ownership groups, or other important information. Organizing resources is
critical to assigning organizational roles and access permissions for resource management.
Cost management and optimization: Making business groups aware of cloud resource consumption
requires IT to understand the resources and workloads each team is using. The following topics are
supported by cost-related tags:
Cloud accounting models
ROI calculations
Cost tracking
Budgets
Alerts
Recurring spend tracking and reporting
Post-implementation optimizations
Cost-optimization tactics
Operations management: Visibility for the operations management team regarding business
commitments and SLAs is an important aspect of ongoing operations. To be well-managed, tagging for
mission criticality is a requirement.
Security: Classification of data and security impact is a vital data point for the team when breaches or
other security issues arise. To operate securely, tagging for data classification is required.
Governance and regulatory compliance: Maintaining consistency across resources helps identify
deviation from agreed-upon policies. This governance foundation article demonstrates how one of the
patterns below can help when deploying governance practices. Similar patterns are available to evaluate
regulatory compliance using tags.
Automation: In addition to making resources easier for IT to manage, a proper organizational scheme
allows you to take advantage of automation as part of resource creation, operational monitoring, and the
creation of DevOps processes.
Workload optimization: Tagging can help identify patterns and resolve broad issues. Tags can also help
identify the assets required to support a single workload. Tagging all assets associated with each workload
enables deeper analysis of your mission-critical workloads to make sound architectural decisions.
NOTE
Naming rules and restrictions vary per Azure resource. Your naming conventions must comply with these rules.
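To illustrate how such a scheme might be applied and then used for reporting, the following Azure CLI sketch tags a resource group and an individual resource and then queries by tag; the tag names, values, and resource names are illustrative.

```azurecli
# Apply tags when creating a resource group.
az group create \
  --name "app1-prod-rg" \
  --location "eastus2" \
  --tags workload=app1 env=prod costCenter=CC1234 dataClassification=confidential

# Tag an individual resource after it has been created
# (by default this replaces any tags already set on the resource).
az resource tag \
  --ids "<resource-id>" \
  --tags workload=app1 env=prod costCenter=CC1234

# Find every resource charged to a given cost center.
az resource list --tag costCenter=CC1234 --query "[].name" -o tsv
```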
Learn more
For more information about naming and tagging in Azure, see:
Naming conventions for Azure resources. Refer to this guidance for recommended naming conventions for
Azure resources.
Use tags to organize your Azure resources and management hierarchy. You can apply tags in Azure at both the
resource group and individual resource level, giving you flexibility in the granularity of any accounting reports
based on applied tags.
Next steps
Resource tagging is just one of the core infrastructure components requiring architectural decisions during a
cloud adoption process. Visit the architectural decision guides overview to learn about alternative patterns or
models used when making design decisions for other types of infrastructure.
Architectural decision guides
Encryption decision guide
Encrypting data protects it against unauthorized access. Properly implemented encryption policy provides
additional layers of security for your cloud-based workloads and guards against attackers and other unauthorized
users from both inside and outside your organization and networks.
Jump to: Key management | Data encryption | Learn more
Cloud encryption strategy focuses on corporate policy and compliance mandates. Encrypting resources is
desirable, and many Azure services such as Azure Storage and Azure SQL Database enable encryption by default.
But encryption has costs that can increase latency and overall resource usage.
For demanding workloads, striking the correct balance between encryption and performance, and determining
how data and traffic is encrypted can be essential. Encryption mechanisms can vary in cost and complexity, and
both technical and policy requirements can influence your decisions on how encryption is applied and how you
store and manage critical secrets and keys.
Corporate policy and third-party compliance are the biggest drivers when planning an encryption strategy. Azure
provides multiple standard mechanisms that can meet common requirements for encrypting data, whether at rest
or in transit. For policies and compliance requirements that demand tighter controls, such as standardized secrets
and key management, encryption in-use, or data-specific encryption, you'll need to develop a more sophisticated
encryption strategy to support these requirements.
Key management
Encryption of data in the cloud depends on the secure storage, management, and operational use of encryption
keys. A key management system is critical to your organization's ability to create, store, and manage
cryptographic keys, as well as important passwords, connection strings, and other confidential IT information.
Modern key management systems such as Azure Key Vault support storage and management of software
protected keys for dev and test usage and hardware security module (HSM) protected keys for maximum
protection of production workloads or sensitive data.
When planning a cloud migration, the following patterns can help you decide how to store and manage encryption
keys, certificates, and secrets that are critical for creating secure and manageable cloud deployments:
Cloud-native
With cloud-native key management, all keys and secrets are generated, managed, and stored in a cloud-based
vault such as Azure Key Vault. This approach simplifies many IT tasks related to key management, such as key
backup, storage, and renewal.
Cloud-native assumptions: Using a cloud-native key management system includes these assumptions:
You trust the cloud key management solution with creating, managing, and hosting your organization's secrets
and keys.
You enable all on-premises applications and services that rely on accessing encryption services or secrets to
access the cloud key management system.
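As a minimal sketch of the cloud-native pattern, the following Azure CLI commands create a vault and store a software-protected key and an application secret; the vault, key, and secret names are illustrative.

```azurecli
# Create a vault to hold keys, secrets, and certificates.
az keyvault create --name "contoso-kv" --resource-group "security-rg" --location "eastus2"

# Generate a software-protected key for dev/test workloads.
az keyvault key create --vault-name "contoso-kv" --name "app1-encryption-key" --protection software

# Store an application secret such as a connection string.
az keyvault secret set --vault-name "contoso-kv" --name "app1-sql-connection" --value "<connection-string>"
```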
Bring your own key (BYOK)
With a BYOK approach, you generate keys on dedicated HSM hardware within your on-premises environment,
then securely transfer these keys to a cloud-based management system such as Azure Key Vault for use with
your cloud-hosted resources.
Bring-your-own-key assumptions: Generating keys on-premises and using them with a cloud-based key
management system includes these assumptions:
You trust the underlying security and access control infrastructure of the cloud platform for hosting and using
your keys and secrets.
Your cloud-hosted applications or services can access and use keys and secrets in a robust and secure way.
You're required by regulatory or organizational policy to keep the creation and management of your
organization's secrets and keys on-premises.
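As a hedged illustration of the BYOK pattern, the following Azure CLI command imports a key transfer package produced by on-premises HSM tooling into the vault created above; the exact transfer procedure depends on your HSM vendor and the Key Vault tier you use.

```azurecli
# Import a key that was generated and wrapped on an on-premises HSM.
# The .byok transfer file is produced by your HSM vendor's tooling.
az keyvault key import \
  --vault-name "contoso-kv" \
  --name "app1-byok-key" \
  --byok-file ./app1-byok-key.byok
```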
On-premises (hold your own key)
Certain scenarios might have regulatory, policy, or technical reasons prohibiting the storage of keys on a cloud-
based key management system. If so, you must generate keys using on-premises hardware, store and manage
them using an on-premises key management system, and establish a way for cloud-based resources to access
these keys for encryption purposes. Note that holding your own key might not be compatible with all Azure-
based services.
On-premises key management assumptions: Using an on-premises key management system includes these
assumptions:
You're required by regulatory or organizational policy to keep the creation, management, and hosting of your
organization's secrets and keys on-premises.
Any cloud-based applications or services that rely on accessing encryption services or secrets can access the
on-premises key management system.
Data encryption
Consider several different states of data with different encryption needs when planning your encryption policy:
Data in transit
Data in transit is data moving between resources on the internal network, between datacenters or external networks, or
over the internet.
Data in transit is usually encrypted by requiring SSL/TLS protocols for network traffic. Always encrypt traffic
between your cloud-hosted resources and external networks or the public internet. PaaS resources typically
enforce SSL/TLS encryption by default. Your cloud adoption teams and workload owners should consider
enforcing encryption for traffic between IaaS resources hosted inside your virtual networks.
Assumptions about encrypting data in transit: Implementing proper encryption policy for data in transit
assumes the following:
All publicly accessible endpoints in your cloud environment will communicate with the public internet using
SSL/TLS protocols.
When connecting cloud networks with on-premises or other external networks over the public internet, you will
use encrypted VPN protocols.
When connecting cloud networks with on-premises or other external networks using a dedicated WAN
connection such as ExpressRoute, you will use a VPN or other encryption appliance on-premises paired with a
corresponding virtual VPN or encryption appliance deployed to your cloud network.
If you have sensitive data that shouldn't be included in traffic logs or other diagnostics reports visible to IT
staff, you will encrypt all traffic between resources in your virtual network.
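For example, the following Azure CLI sketch enforces TLS for a storage account and a web app, two common places where in-transit encryption is configured; the resource names are illustrative.

```azurecli
# Require HTTPS and a minimum TLS version for a storage account.
az storage account update \
  --name "contosostorage01" \
  --resource-group "app1-prod-rg" \
  --https-only true \
  --min-tls-version TLS1_2

# Redirect all HTTP traffic to HTTPS for an App Service web app.
az webapp update --name "contoso-app1" --resource-group "app1-prod-rg" --https-only true
```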
Data at rest
Data at rest represents any data not being actively moved or processed, including files, databases, virtual machine
drives, PaaS storage accounts, or similar assets. Encrypting stored data protects virtual devices or files against
unauthorized access either from external network penetration, rogue internal users, or accidental releases.
PaaS storage and database resources generally enforce encryption by default. IaaS resources can be secured by
encrypting data at the virtual disk level or by encrypting the entire storage account hosting your virtual drives. All
of these assets can make use of either Microsoft-managed or customer-managed keys stored in Azure Key Vault.
Encryption for data at rest also encompasses more advanced database encryption techniques, such as column-
level and row-level encryption, providing much more control over exactly what data is being secured.
Your overall policy and compliance requirements, the sensitivity of the data being stored, and the performance
requirements of your workloads should determine which assets require encryption.
Assumptions about encrypting data at rest
Encrypting data at rest assumes the following:
You're storing data that is not meant for public consumption.
Your workloads can accept the added latency cost of disk encryption.
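As one example of encrypting IaaS data at rest, the following Azure CLI sketch enables Azure Disk Encryption on a virtual machine using a Key Vault that allows disk encryption; the resource names are illustrative.

```azurecli
# The vault must be enabled for disk encryption before it can protect VM disks.
az keyvault create \
  --name "contoso-kv" \
  --resource-group "security-rg" \
  --location "eastus2" \
  --enabled-for-disk-encryption true

# Encrypt the VM's OS and data disks with keys stored in the vault.
az vm encryption enable \
  --resource-group "app1-prod-rg" \
  --name "app1-vm01" \
  --disk-encryption-keyvault "contoso-kv" \
  --volume-type ALL
```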
Data in use
Encryption for data in use involves securing data in nonpersistent storage, such as RAM or CPU caches. It can be
enabled through technologies such as full memory encryption and enclave technologies like Intel's Software
Guard Extensions (SGX). It also includes cryptographic techniques, such as homomorphic encryption, that can be
used to create secure, trusted execution environments.
Assumptions about encrypting data in use: Encrypting data in use assumes the following:
You're required to maintain data ownership separate from the underlying cloud platform at all times, even at
the RAM and CPU level.
Learn more
For more information about encryption and key management in Azure, see:
Azure encryption overview: A detailed description of how Azure uses encryption to secure both data at rest
and data in transit.
Azure Key Vault: Key Vault is the primary key management system for storing and managing cryptographic
keys, secrets, and certificates within Azure.
Azure data security and encryption best practices: A discussion of Azure data security and encryption
best practices.
Confidential computing in Azure: Azure's confidential computing initiative provides tools and technology
to create trusted execution environments or other encryption mechanisms to secure data in use.
Next steps
Encryption is just one of the core infrastructure components requiring architectural decisions during a cloud
adoption process. To learn about alternative patterns or models used when making design decisions for other
types of infrastructure, see the architectural decision guides overview.
Architectural decision guides overview
Software Defined Networking decision guide
Software Defined Networking (SDN) is a network architecture designed to allow virtualized networking
functionality that can be centrally managed, configured, and modified through software. SDN enables the
creation of cloud-based networks using the virtualized equivalents to physical routers, firewalls, and other
networking devices used in on-premises networks. SDN is critical to creating secure virtual networks on public
cloud platforms such as Azure.
Jump to: PaaS only | Cloud-native | Cloud DMZ | Hybrid | Hub and spoke model | Learn more
SDN provides several options with varying degrees of pricing and complexity. This discovery guide provides a
reference to quickly personalize these options to best align with specific business and technology
strategies.
The inflection point in this guide depends on several key decisions that your cloud strategy team has made
before making decisions about networking architecture. Most important among these are decisions involving
your digital estate definition and subscription design, which may also require inputs from decisions made related
to your cloud accounting and global markets strategies.
Small single-region deployments of fewer than 1,000 VMs are less likely to be significantly affected by this
inflection point. Conversely, large adoption efforts with more than 1,000 VMs, multiple business units, or
multiple geopolitical markets, could be substantially affected by your SDN decision and this key inflection point.
Learn more
For more information about Software Defined Networking in Azure, see:
Azure Virtual Network. On Azure, the core SDN capability is provided by Azure Virtual Network, which acts as
a cloud analog to physical on-premises networks. Virtual networks also act as a default isolation boundary
between resources on the platform.
Azure best practices for network security. Recommendations from the Azure security team on how to
configure your virtual networks to minimize security vulnerabilities.
Next steps
Software Defined Networking is just one of the core infrastructure components requiring architectural decisions
during a cloud adoption process. Visit the architectural decision guides overview to learn about alternative
patterns or models used when making design decisions for other types of infrastructure.
Architectural decision guides
Software Defined Networking: PaaS-only
When you implement a platform as a service (PaaS) resource, the deployment process automatically creates an
assumed underlying network with a limited number of controls over that network, including load balancing, port
blocking, and connections to other PaaS services.
In Azure, several PaaS resource types can be deployed into a virtual network or connected to a virtual network,
integrating these resources with your existing virtual networking infrastructure. Other services, such as App
Service Environment, Azure Kubernetes Service (AKS), and Service Fabric must be deployed within a virtual
network. In many cases, a PaaS-only networking architecture, relying solely on the default native networking
capabilities provided by PaaS resources, is sufficient to meet a workload's connectivity and traffic management
requirements.
If you're considering a PaaS only networking architecture, be sure you validate that the required assumptions align
with your requirements.
PaaS-only assumptions
Deploying a PaaS-only networking architecture assumes the following:
The application being deployed is a standalone application or depends only on other PaaS resources that do
not require a virtual network.
Your IT operations teams can update their tools, training, and processes to support management, configuration,
and deployment of standalone PaaS applications.
The PaaS application is not part of a broader cloud migration effort that will include IaaS resources.
These assumptions are minimum qualifiers aligned to deploying a PaaS-only network. While this approach may
align with the requirements of a single application deployment, each cloud adoption team should consider these
long-term questions:
Will this deployment expand in scope or scale to require access to other non-PaaS resources?
Are other PaaS deployments planned beyond the current solution?
Does the organization have plans for other future cloud migrations?
The answers to these questions would not preclude a team from choosing a PaaS only option but should be
considered before making a final decision.
Software Defined Networking: Cloud-native
A cloud-native virtual network is required when deploying IaaS resources such as virtual machines to a cloud
platform. Access to virtual networks from external sources, such as the web, needs to be explicitly provisioned.
These types of virtual networks support the creation of subnets, routing rules, and virtual firewall and traffic
management devices.
A cloud-native virtual network has no dependencies on your organization's on-premises or other non-cloud
resources to support the cloud-hosted workloads. All required resources are provisioned either in the virtual
network itself or by using managed PaaS offerings.
Cloud-native assumptions
Deploying a cloud-native virtual network assumes the following:
The workloads you deploy to the virtual network have no dependencies on applications or services that are
accessible only from inside your on-premises network. Unless they provide endpoints accessible over the
public internet, applications and services hosted internally on-premises are not usable by resources hosted on
a cloud platform.
Your workload's identity management and access control depends on the cloud platform's identity services or
IaaS servers hosted in your cloud environment. You will not need to directly connect to identity services hosted
on-premises or other external locations.
Your identity services do not need to support single sign-on (SSO) with on-premises directories.
Cloud-native virtual networks have no external dependencies. This makes them simple to deploy and configure,
and as a result this architecture is often the best choice for experiments or other smaller self-contained or rapidly
iterating deployments.
Additional issues your cloud adoption teams should consider when discussing a cloud-native virtual networking
architecture include:
Existing workloads designed to run in an on-premises datacenter may need extensive modification to take
advantage of cloud-based functionality, such as storage or authentication services.
Cloud-native networks are managed solely through the cloud platform management tools, and therefore may
lead to management and policy divergence from your existing IT standards as time goes on.
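A minimal sketch of a cloud-native network with the Azure CLI follows; it creates an isolated virtual network with one subnet and attaches a network security group that only allows inbound HTTPS. Names and address prefixes are illustrative.

```azurecli
# Create an isolated virtual network with a single subnet.
az network vnet create \
  --resource-group "app1-prod-rg" \
  --name "app1-vnet" \
  --address-prefix 10.1.0.0/16 \
  --subnet-name "web" \
  --subnet-prefix 10.1.1.0/24

# Create a network security group that allows only inbound HTTPS.
az network nsg create --resource-group "app1-prod-rg" --name "web-nsg"
az network nsg rule create \
  --resource-group "app1-prod-rg" \
  --nsg-name "web-nsg" \
  --name "allow-https-inbound" \
  --priority 100 \
  --direction Inbound \
  --access Allow \
  --protocol Tcp \
  --destination-port-ranges 443

# Associate the NSG with the subnet.
az network vnet subnet update \
  --resource-group "app1-prod-rg" \
  --vnet-name "app1-vnet" \
  --name "web" \
  --network-security-group "web-nsg"
```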
Next steps
For more information about cloud-native virtual networking in Azure, see:
Azure Virtual Network guides: Newly created virtual networks are cloud-native by default. Use these guides to
help plan the design and deployment of your virtual networks.
Azure networking limits: Each virtual network and its connected resources exist in a single subscription. These
resources are bound by subscription limits.
Software Defined Networking: Cloud DMZ
The Cloud DMZ network architecture allows limited access between your on-premises and cloud-based networks,
using a virtual private network (VPN) to connect the networks. Although a DMZ model is commonly used when
you want to secure external access to a network, the Cloud DMZ architecture discussed here is intended
specifically to secure access to the on-premises network from cloud-based resources and vice versa.
This architecture is designed to support scenarios where your organization wants to start integrating cloud-based
workloads with on-premises workloads but may not have fully matured cloud security policies or acquired a
secure dedicated WAN connection between the two environments. As a result, cloud networks should be treated
like a DMZ to ensure on-premises services are secure.
The DMZ deploys network virtual appliances (NVAs) to implement security functionality such as firewalls and
packet inspection. Traffic passing between on-premises and cloud-based applications or services must pass
through the DMZ where it can be audited. VPN connections and the rules determining what traffic is allowed
through the DMZ network are strictly controlled by IT security teams.
Learn more
For more information about implementing a Cloud DMZ in Azure, see:
Implement a DMZ between Azure and your on-premises datacenter. This article discusses how to implement a
secure hybrid network architecture in Azure.
Software Defined Networking: Hybrid network
The hybrid cloud network architecture allows virtual networks to access your on-premises resources and services
and vice versa, using a dedicated WAN connection such as ExpressRoute or other connection method to directly
connect the networks.
Building on the cloud-native virtual network architecture, a hybrid virtual network is isolated when initially
created. Adding connectivity to the on-premises environment grants access to and from the on-premises network,
although all other inbound traffic targeting resources in the virtual network needs to be explicitly allowed. You can
secure the connection using virtual firewall devices and routing rules to limit access or you can specify exactly
what services can be accessed between the two networks using cloud-native routing features or deploying
network virtual appliances (NVAs) to manage traffic.
Although the hybrid networking architecture supports VPN connections, dedicated WAN connections like
ExpressRoute are preferred due to higher performance and increased security.
Hybrid assumptions
Deploying a hybrid virtual network includes the following assumptions:
Your IT security teams have aligned on-premises and cloud-based network security policy to ensure cloud-
based virtual networks can be trusted to communicate directly with on-premises systems.
Your cloud-based workloads require access to storage, applications, and services hosted on your on-premises
or third-party networks, or your users or applications on your on-premises network need access to cloud-hosted
resources.
You need to migrate existing applications and services that depend on on-premises resources, but don't want to
expend the resources on redevelopment to remove those dependencies.
Connecting your on-premises networks to cloud resources over VPN or dedicated WAN is not prevented by
corporate policy, data sovereignty requirements, or other regulatory compliance issues.
Your workloads either do not require multiple subscriptions to bypass subscription resource limits, or your
workloads involve multiple subscriptions but do not require central management of connectivity or shared
services used by resources spread across multiple subscriptions.
Your cloud adoption teams should consider the following issues when looking at implementing a hybrid virtual
networking architecture:
Connecting on-premises networks with cloud networks increases the complexity of your security requirements.
Both networks must be secured against external vulnerabilities and unauthorized access from both sides of the
hybrid environment.
Scaling the number and size of workloads within a hybrid cloud environment can add significant complexity to
routing and traffic management.
You will need to develop compatible management and access control policies to maintain consistent
governance throughout your organization.
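As a hedged sketch of the VPN-based variant of this architecture, the following Azure CLI commands create a route-based VPN gateway in a hub network that already contains a GatewaySubnet, describe the on-premises VPN device, and create the site-to-site connection; the names, address ranges, and SKU are illustrative.

```azurecli
# Public IP for the virtual network gateway (the hub VNet must contain a GatewaySubnet).
az network public-ip create --resource-group "network-rg" --name "hub-vpn-pip"

# Create a route-based VPN gateway in the hub network.
az network vnet-gateway create \
  --resource-group "network-rg" \
  --name "hub-vpn-gw" \
  --vnet "hub-vnet" \
  --public-ip-address "hub-vpn-pip" \
  --gateway-type Vpn \
  --vpn-type RouteBased \
  --sku VpnGw1

# Describe the on-premises VPN device and address space.
az network local-gateway create \
  --resource-group "network-rg" \
  --name "onprem-gw" \
  --gateway-ip-address 203.0.113.10 \
  --local-address-prefixes 10.0.0.0/16

# Create the site-to-site connection between the two gateways.
az network vpn-connection create \
  --resource-group "network-rg" \
  --name "hub-to-onprem" \
  --vnet-gateway1 "hub-vpn-gw" \
  --local-gateway2 "onprem-gw" \
  --shared-key "<pre-shared-key>"
```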
Learn more
For more information about hybrid networking in Azure, see:
Hybrid network reference architecture. Azure hybrid virtual networks use either an ExpressRoute circuit or
Azure VPN to connect your virtual network with your organization's existing IT assets not hosted in Azure. This
article discusses the options for creating a hybrid network in Azure.
Software Defined Networking: Hub and spoke
The hub and spoke networking model organizes your Azure-based cloud network infrastructure into multiple
connected virtual networks. This model allows you to more efficiently manage common communication or
security requirements and deal with potential subscription limitations.
In the hub and spoke model, the hub is a virtual network that acts as a central location for managing external
connectivity and hosting services used by multiple workloads. The spokes are virtual networks that host
workloads and connect to the central hub through virtual network peering.
All traffic passing in or out of the workload spoke networks is routed through the hub network where it can be
routed, inspected, or otherwise managed by centrally managed IT rules or processes.
This model aims to address the following concerns:
Cost savings and management efficiency. Centralizing services that can be shared by multiple workloads,
such as network virtual appliances (NVAs) and DNS servers, in a single location allows IT to minimize
redundant resources and management effort across multiple workloads.
Overcoming subscription limits. Large cloud-based workloads may require the use of more resources than
are allowed within a single Azure subscription. Peering workload virtual networks from different subscriptions
to a central hub can overcome these limits. For more information, see Azure networking limits.
Separation of concerns. The ability to divide responsibility for individual workloads between central IT
teams and workload teams.
The following diagram shows an example hub and spoke architecture including centrally managed hybrid
connectivity.
The hub and spoke architecture is often used alongside the hybrid networking architecture, providing a centrally
managed connection to your on-premises environment shared between multiple workloads. In this scenario, all
traffic traveling between the workloads and on-premises passes through the hub where it can be managed and
secured.
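For example, connecting a workload spoke to the hub might look like the following Azure CLI sketch, which peers the two virtual networks in both directions; the resource groups, network names, and subscription IDs are illustrative, and the gateway options assume the hub hosts the hybrid connection.

```azurecli
# Peer the hub to the spoke, allowing the spoke to use the hub's gateway.
az network vnet peering create \
  --resource-group "network-rg" \
  --vnet-name "hub-vnet" \
  --name "hub-to-app1" \
  --remote-vnet "/subscriptions/<spoke-sub-id>/resourceGroups/app1-prod-rg/providers/Microsoft.Network/virtualNetworks/app1-vnet" \
  --allow-vnet-access \
  --allow-gateway-transit

# Peer the spoke back to the hub and route its hybrid traffic through the hub's gateway.
az network vnet peering create \
  --resource-group "app1-prod-rg" \
  --vnet-name "app1-vnet" \
  --name "app1-to-hub" \
  --remote-vnet "/subscriptions/<hub-sub-id>/resourceGroups/network-rg/providers/Microsoft.Network/virtualNetworks/hub-vnet" \
  --allow-vnet-access \
  --use-remote-gateways
```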
Learn more
For reference architectures showing how to implement hub and spoke networks on Azure, see:
Implement a hub and spoke network topology in Azure
Implement a hub and spoke network topology with shared services in Azure
Logging and reporting decision guide
All organizations need mechanisms for notifying IT teams of performance, uptime, and security issues before
they become serious problems. A successful monitoring strategy allows you to understand how the individual
components that make up your workloads and networking infrastructure are performing. Within the context of a
public cloud migration, integrating logging and reporting with any of your existing monitoring systems, while
surfacing important events and metrics to the appropriate IT staff, is critical in ensuring your organization is
meeting uptime, security, and policy compliance goals.
Jump to: Planning your monitoring infrastructure | Cloud-native | On-premises extension | Gateway aggregation
| Hybrid monitoring (on-premises) | Hybrid monitoring (cloud-based) | Multicloud | Learn more
The inflection point when determining a cloud logging and reporting strategy is based primarily on existing
investments your organization has made in operational processes, and to some degree any requirements you
have to support a multicloud strategy.
Activities in the cloud can be logged and reported in multiple ways. Cloud-native and centralized logging are two
common managed service options that are driven by the subscription design and the number of subscriptions.
Cloud-native
If your organization currently lacks established logging and reporting systems, or if your planned deployment
does not need to be integrated with existing on-premises or other external monitoring systems, a cloud-native
SaaS solution such as Azure Monitor is the simplest choice.
In this scenario, all log data is recorded and stored in the cloud, while the logging and reporting tools that
process and surface information to IT staff are provided by the Azure platform and Azure Monitor.
Custom logging solutions based on Azure Monitor can be implemented as needed for each subscription or
workload in smaller or experimental deployments. These solutions are organized centrally to monitor log data
across your entire cloud estate.
Cloud-native assumptions: Using a cloud-native logging and reporting system assumes the following:
You do not need to integrate the log data from your cloud workloads into existing on-premises systems.
You will not be using your cloud-based reporting systems to monitor on-premises systems.
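As a minimal sketch of the cloud-native pattern, the following Azure CLI commands create a central Log Analytics workspace and route one resource's platform logs and metrics to it; the names and resource ID are illustrative, and available log categories vary by resource type (older CLI versions may require explicit categories instead of a category group).

```azurecli
# Create a central Log Analytics workspace for the cloud estate.
az monitor log-analytics workspace create \
  --resource-group "monitor-rg" \
  --workspace-name "contoso-logs"

WORKSPACE_ID=$(az monitor log-analytics workspace show \
  --resource-group "monitor-rg" --workspace-name "contoso-logs" --query id -o tsv)

# Send a resource's platform metrics and logs to the workspace.
az monitor diagnostic-settings create \
  --name "send-to-workspace" \
  --resource "<resource-id>" \
  --workspace "$WORKSPACE_ID" \
  --metrics '[{ "category": "AllMetrics", "enabled": true }]' \
  --logs '[{ "categoryGroup": "allLogs", "enabled": true }]'
```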
On-premises extension
It might require substantial redevelopment effort for applications and services migrating to the cloud to use
cloud-based logging and reporting solutions such as Azure Monitor. In these cases, consider allowing these
workloads to continue sending telemetry data to existing on-premises systems.
To support this approach, your cloud resources must communicate directly with your on-premises systems
through a combination of hybrid networking and cloud-hosted domain services. With this in place, the cloud
virtual network functions as a network extension of the on-premises environment. Therefore, cloud-hosted
workloads can communicate directly with your on-premises logging and reporting system.
This approach capitalizes on your existing investment in monitoring tooling with limited modification to any
cloud-deployed applications or services. This is often the fastest approach to support monitoring during a lift
and shift migration. But it won't capture log data produced by cloud-based PaaS and SaaS resources, and it will
omit any VM-related logs generated by the cloud platform itself such as VM status. As a result, this pattern
should be a temporary solution until a more comprehensive hybrid monitoring solution is implemented.
On-premises-only assumptions:
You need to maintain log data only in your on-premises environment, either in support of technical
requirements or due to regulatory or policy requirements.
Your on-premises systems do not support hybrid logging and reporting or gateway aggregation solutions.
Your cloud-based applications can submit telemetry directly to your on-premises logging systems or
monitoring agents that submit to on-premises can be deployed to workload VMs.
Your workloads don't depend on PaaS or SaaS services that require cloud-based logging and reporting.
Gateway aggregation
For scenarios where the amount of cloud-based telemetry data is large or existing on-premises monitoring
systems need log data modified before it can be processed, a log data gateway aggregation service might be
required.
A gateway service is deployed to your cloud provider. Then, relevant applications and services are configured to
submit telemetry data to the gateway instead of a default logging system. The gateway can then process the
data: aggregating, combining, or otherwise formatting it before then submitting it to your monitoring service for
ingestion and analysis.
Also, a gateway can be used to aggregate and preprocess telemetry data bound for cloud-native or hybrid
systems.
Gateway aggregation assumptions:
You expect large volumes of telemetry data from your cloud-based applications or services.
You need to format or otherwise optimize telemetry data before submitting it to your monitoring systems.
Your monitoring systems have APIs or other mechanisms available to ingest log data after processing by the
gateway.
Hybrid monitoring (on-premises)
A hybrid monitoring solution combines log data from both your on-premises and cloud resources to provide an
integrated view into your IT estate's operational status.
If you have an existing investment in on-premises monitoring systems that would be difficult or costly to replace,
you might need to integrate the telemetry from your cloud workloads into preexisting on-premises monitoring
solutions. In a hybrid on-premises monitoring system, on-premises telemetry data continues to use the existing
on-premises monitoring system. Cloud-based telemetry data is either sent to the on-premises monitoring
system directly, or the data is sent to Azure Monitor then compiled and ingested into the on-premises system at
regular intervals.
On-premises hybrid monitoring assumptions: Using an on-premises logging and reporting system for
hybrid monitoring assumes the following:
You need to use existing on-premises reporting systems to monitor cloud workloads.
You need to maintain ownership of log data on-premises.
Your on-premises management systems have APIs or other mechanisms available to ingest log data from
cloud-based systems.
TIP
As part of the iterative nature of cloud migration, transitioning from distinct cloud-native and on-premises monitoring to a
partial hybrid approach is likely as the integration of cloud-based resources and services into your overall IT estate
matures.
Learn more
Azure Monitor is the default reporting and monitoring service for Azure. It provides:
A unified platform for collecting application telemetry, host telemetry (such as VMs), container metrics, Azure
platform metrics, and event logs.
Visualization, queries, alerts, and analytical tools. It can provide insights into virtual machines, guest operating
systems, virtual networks, and workload application events.
REST APIs for integration with external services and automation of monitoring and alerting services.
Integration with many popular third-party vendors.
Next steps
Logging and reporting is just one of the core infrastructure components requiring architectural decisions during
a cloud adoption process. Visit the architectural decision guides overview to learn about alternative patterns or
models used when making design decisions for other types of infrastructure.
Architectural decision guides
Migration tools decision guide
The strategy and tools you use to migrate an application to Azure will largely depend on your business
motivations, technology strategies, and timelines, as well as a deep understanding of the actual workload and
assets (infrastructure, apps, and data) being migrated. The following decision tree serves as high-level guidance
for selecting the best tools to use based on migration decisions. Treat this decision tree as a starting point.
The choice to migrate using platform as a service (PaaS) or infrastructure as a service (IaaS) technologies is driven
by the balance between cost, time, existing technical debt, and long-term returns. IaaS is often the fastest path to
the cloud with the least amount of required change to the workload. PaaS could require modifications to data
structures or source code, but produces substantial long-term returns in the form of reduced operating costs and
greater technical flexibility. In the following diagram, the term modernize is used to reflect a decision to modernize
an asset during migration and migrate the modernized asset to a PaaS platform.
Key questions
Answering the following questions will allow you to make decisions based on the above tree.
Would modernization of the application platform during migration prove to be a wise investment
of time, energy, and budget? PaaS technologies such as Azure App Service or Azure Functions can increase
deployment flexibility and reduce the complexity of managing virtual machines to host applications.
Applications may require refactoring before they can take advantage of these cloud-native capabilities,
potentially adding significant time and cost to a migration effort. If your application can migrate to PaaS
technologies with a minimum of modifications, it is likely a good candidate for modernization. If extensive
refactoring would be required, a migration using IaaS-based virtual machines may be a better choice.
Would modernization of the data platform during migration prove to be a wise investment of
time, energy, and budget? As with application migration, Azure PaaS managed storage options, such as
Azure SQL Database, Azure Cosmos DB, and Azure Storage, offer significant management and flexibility
benefits, but migrating to these services may require refactoring of existing data and the applications that use
that data. Data platforms typically require less refactoring than the application platform would. Therefore, it's
common for the data platform to be modernized, even though the application platform remains the same. If
your data can be migrated to a managed data service with minimal changes, it is a good candidate for
modernization. Data that would require extensive time or cost to be refactored to use these PaaS services may
be better migrated using IaaS-based virtual machines to better match existing hosting capabilities.
Is your application currently running on dedicated virtual machines or sharing hosting with other
applications? Applications running on dedicated virtual machines may be more easily migrated to PaaS
hosting options than applications running on shared servers.
Will your data migration exceed your network bandwidth? Network capacity between your on-
premises data sources and Azure can be a bottleneck on data migration. If the data you need to transfer faces
bandwidth limitations that prevent efficient or timely migration, you may need to look into alternative or offline
transfer mechanisms. The Cloud Adoption Framework's article on migration replication discusses how
replication limits can affect migration efforts. As part of your migration assessment, consult your IT teams to
verify your local and WAN bandwidth is capable of handling your migration requirements. Also see the
migration scenario for handling storage requirements that exceed network capacity during a migration.
Does your application make use of an existing DevOps pipeline? In many cases, Azure Pipelines can be
easily refactored to deploy applications to cloud-based hosting environments.
Does your data have complex data storage requirements? Production applications usually require data
storage that is highly available, offers always-on functionality and similar service uptime and continuity
features. Azure PaaS-based managed database options, such as Azure SQL Database, Azure Database for
MySQL, and Azure Cosmos DB all offer 99.99 percent uptime service-level agreements. Conversely, IaaS-based
SQL Server on Azure VMs offers single-instance service-level agreements of 99.95 percent. If your data cannot
be modernized to use PaaS storage options, guaranteeing higher IaaS uptime will involve more complex data
storage scenarios such as running SQL Server Always On clusters and continuously syncing data between
instances. This can involve significant hosting and maintenance costs, so balancing uptime requirements,
modernization effort, and overall budgetary impact is important when considering your data migration
options.
Learn more
Cloud fundamentals: Overview of Azure compute options: Provides information on the capabilities of
Azure IaaS and PaaS compute options.
Cloud fundamentals: Choose the right data store: Discusses PaaS storage options available on the Azure
platform.
Migration best practices: Data requirements exceed network capacity during a migration effort:
Discusses alternative data migration mechanisms for scenarios where data migration is hindered by available
network bandwidth.
SQL Database: Choose the right SQL Server option in Azure: Discussion of the options and business
justifications for choosing to host your SQL Server workloads in a managed infrastructure (IaaS) or a managed
service (PaaS) environment.
Cloud Operating Model is now part of the Microsoft
Cloud Adoption Framework for Azure
In early 2018, Microsoft released the Cloud Operating Model (COM). The COM was a guide that helped customers
understand the what and the why of digital transformation. This helped customers get a sense of all the areas that
needed to be addressed: business strategy, culture strategy, and technology strategy. What was not included in the
COM were the specific how-to steps, which left customers wondering, "Where do we go from here?"
The Microsoft Cloud Adoption Framework for Azure is designed to help you understand the what and why and to
provide unified guidance on the how to help accelerate your cloud adoption efforts.
The Azure enterprise scaffold has been integrated into the Microsoft Cloud Adoption Framework for Azure. The
goals of the enterprise scaffold are now addressed in the Ready methodology of the Cloud Adoption Framework.
The enterprise scaffold content has been deprecated.
To begin using the Cloud Adoption Framework, see:
Ready overview
Azure landing zones
Landing zone considerations.
If you need to review the deprecated content, see the Azure enterprise scaffold.
Azure Virtual Datacenter
A more robust platform architecture and implementation has been created to build on the prior Azure Virtual
Datacenter (VDC) approach. Enterprise-scale landing zones in the Microsoft Cloud Adoption Framework for Azure
are now the recommended approach for larger cloud-adoption efforts.
The following guidance serves as a significant part of the foundation for the Ready methodology and the Govern
methodology in the Cloud Adoption Framework. To support customers making this transition, the following
resources are archived and maintained in a separate GitHub repository.
Azure Virtual Datacenter: This eBook shows you how to deploy enterprise workloads to the Azure cloud
platform while respecting your existing security and networking policies.
Azure Virtual Datacenter lift-and-shift guide: This white paper discusses the process that enterprise IT staff and
decision makers can use to identify and plan the migration of applications and servers to Azure using a lift-and-
shift approach while minimizing any additional development costs and optimizing cloud hosting options.