David West
Chandrakandh Mouleeswaran
Xiaotong Jiang
Markesha Parker
This reference architecture is intended for IT decision makers and for infrastructure and application architects who plan to implement a hybrid cloud, leveraging the Google Kubernetes Engine container platform to build modern applications in their on-prem data centers and connect them to Google Cloud. Knowledge of containers, Kubernetes, cloud, and data center infrastructure architecture will be helpful.
Lenovo has certified the ThinkAgile VX solution as an Anthos Ready virtualized platform, with bare-metal certification coming soon. Lenovo successfully deployed the Anthos version 1.12 solution in a three-node ThinkAgile VX VMware 7 environment to validate and complete the certification. The Anthos Ready platform
ensures Lenovo servers support the latest versions and have been tested for Anthos clusters on VMware, as
part of Google Distributed Cloud Virtual (GDC Virtual). The validation includes deployment and scaling of a
cluster along with interaction of GDC Virtual with the overall Lenovo compute, network, and storage solution.
Finally, it validates cluster deployment on-premises and exposes a workload through one of the supported
load balancer options on GDC Virtual clusters. With this, Lenovo demonstrates a strong understanding of
Anthos architectures, features, and the hybrid cloud benefits available to customers from Anthos.
https://cloud.google.com/anthos/docs/resources/partner-platforms
In addition, the ThinkAgile VX has been certified as an Intel Select Solution for Anthos.
https://www.intel.com/content/www/us/en/products/solutions/select-solutions/cloud/google-cloud-anthos.html
This document provides an overview of the business problem that is addressed by Anthos and the business
value that is provided by the various Anthos components. A description of customer requirements is followed
by an architectural overview of the solution and the logical components. The operational model describes the
architecture for deploying Anthos on the ThinkAgile VX platform, deployment considerations, network
architecture, and other requirements. The last section provides the Bill of Materials for the hardware
configurations for the Lenovo ThinkAgile VX certified nodes and appliances and networking hardware that is
used in the solution.
In order to take advantage of these emerging technologies, many companies are going through a modernization phase to transform their legacy IT systems, processes, and culture to become lean and agile and to build the right technology capabilities while controlling costs.
With increasing cost and competitive pressures, organizations are being forced to rethink their business strategy. Below are some of the common concerns that arise in almost every business today:
• How can they continue to stay innovative and relevant in the marketplace while controlling costs?
• How can they improve the profitability of their products and services in a crowded market?
• How can they improve customer satisfaction by quickly delivering new products, features, and better service?
• How can they establish market leadership by bringing innovative products to market before the competition?
• How can they build an agile workforce that reacts quickly to customer requests and market trends?
• How can they withstand disruption by new market entrants?
While some of the business problems above require non-technical solutions, the key technical challenges can
be addressed by the following capabilities:
One of the key on-prem components required to deploy Anthos is the infrastructure platform. Lenovo has
partnered with Google to certify the Lenovo ThinkAgile VX platform for Anthos. VX provides VMware vSAN
certified hyperconverged infrastructure (HCI) servers with a rich set of configurable options depending upon
the application workload and business needs. Anthos works directly on the ThinkAgile VX platform without
any modifications. Together with Lenovo ThinkAgile VX hyperconverged infrastructure, Anthos provides a
turnkey on-prem cloud solution and enables management of Kubernetes container engine clusters from a
central control plane.
3.1 Introduction
The following section describes some background on emerging customer requirements around new technologies. This will help set up the discussion for the rest of the reference architecture.
Software that follows a “monolithic” architecture is designed with simplicity and manageability in mind, and typically has a single code-base that includes all functional modules, such as the user interface tier, business logic, data access, and authentication/authorization, within a single application unit. Even when software best practices are followed in writing the code, e.g. modularity and loose coupling to allow various functional modules to be independently written and maintained, the application development, integration testing, production deployment, and release lifecycle must be managed with tight coordination across all development teams because all functional modules have to be combined to produce a single application.
• Fixes to one or more modules or new features force the entire program to be rebuilt, tested, and released.
• Bugs in any part of the code can affect the entire application.
• Different functional units cannot be independently scaled or made highly available.
Micro-services are a newer architectural paradigm that addresses some of the issues in monolithic application design. In this design the code is broken down into smaller functional modules, which are deployed as independent applications, each exposing a service API. This architecture preserves the benefits of loose coupling and modularity, while also detaching the modules such that they act as independent services. In addition, a micro-services architecture enables smaller functional modules to be developed by independent teams and deployed anywhere: on on-prem data center infrastructure or in the cloud. For instance, the checkout function on an e-commerce website can be implemented as a micro-service and shared by multiple product lines on the same site. Figure 2 illustrates such an example for an n-tier traditional business application broken down into several micro-services interacting with each other.
3.1.2 Containers
Containers are a new way of running traditional applications and micro-services at scale and managing their lifecycle with an orchestration engine such as Kubernetes. Containers provide a lightweight packaging of the application and its required runtime libraries into an image, and run this image on top of a traditional operating system, either on bare-metal hardware or on virtualized infrastructure. Figure 3 shows the new architectural layers from hardware up to the applications. Docker is an open source project that provides the runtime, libraries, and tools to build, deploy, and manage containers. Kubernetes orchestrates containers at scale on a cluster of machines running Docker.
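As a minimal sketch of this model (the application name and container image are chosen only for illustration), a Kubernetes Deployment manifest declares a container image once and Kubernetes keeps the requested number of replicas running across the cluster:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-frontend            # illustrative application name
spec:
  replicas: 3                   # Kubernetes maintains three running copies
  selector:
    matchLabels:
      app: web-frontend
  template:
    metadata:
      labels:
        app: web-frontend
    spec:
      containers:
      - name: web-frontend
        image: nginx:1.21       # a container image built and published with Docker
        ports:
        - containerPort: 80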
As modern application development accelerates along with the proliferation of data, hybrid cloud becomes a necessary capability for any company, small or large. In addition, new applications will be “cloud-
native” by design, which enables them to run on the cloud, allowing for greater flexibility, security, availability,
scalability, and other advantages provided by the cloud. Figure 5 illustrates how applications now share a
common architecture spanning on-prem and public cloud. Hybrid cloud provides the necessary bridging
between on-prem and cloud.
The following list summarizes the various components of Anthos and how they address the functional requirements, both on Google Cloud and on-prem:
• Container orchestration – Google Kubernetes Engine (GKE): managed Kubernetes on Google Cloud Platform (GCP) in the cloud, and GKE on-prem version 1.12 in the data center.
• Multicluster management – via the GCP console and control plane, for both cloud and on-prem clusters.
• Traffic management – Traffic Director.
• Fault tolerance – a single component error will not lead to whole-system unavailability.
In this architecture, the Google cloud platform console provides a single control plane for managing
Kubernetes clusters deployed in multiple locations – on the Google public cloud, on the on-prem data center,
or another cloud provider such as AWS. In this sense, the on-prem GKE cluster is essentially an extension of the
public cloud. GCP provides the centralized configuration and security management across the clusters, and
the services running in different clusters can be managed and connected through the Istio service mesh. This
centralized control provides a consistent mechanism to manage distributed Kubernetes clusters, configuration
policy, and security. In addition, the Google container marketplace is available to deploy workloads to any of
the clusters managed from the control plane.
The deployment of Anthos requires a VMware vSAN cluster, which provides the compute and storage
virtualization. The GKE on-prem clusters from Anthos will be deployed as virtual machines running on top of
the vSAN cluster. Hence, the master and worker nodes that are part of the GKE on-prem clusters are
implemented as virtual machines instead of physical hosts. This simplifies the Anthos deployments as well
because you do not need dedicated hosts for implementing GKE clusters. Instead, multiple GKE clusters can
be installed on the same vSAN cluster.
Most of this document refers to a VMware vSAN configuration. However, as an alternate storage option, GKE
on-prem clusters for Anthos can also be deployed in a traditional SAN environment. SAN volumes are
mapped to all VMware cluster nodes as usual, where VMware data stores reside. The Anthos deployment
uses this storage for all volumes and persistent storage created on the GKE cluster nodes. Both of these
storage options have been validated on the Lenovo ThinkAgile VX VMware certified platform.
See Figure 7 for the architecture of the GKE on-prem clusters when deployed on VMware vSAN. There are
three main components:
Admin workstation – this is the VM that acts as the deployment host for all GKE clusters on-prem. You log in to the admin workstation to kick off the Anthos cluster deployments and configuration.
Admin cluster – this cluster runs the Kubernetes control plane for itself and for all associated user clusters, and provides connectivity to Google cloud.
User GKE cluster – you can deploy one or more user GKE clusters once the admin workstation and admin cluster are installed. The user GKE clusters execute the user container workloads. These clusters can be managed from the GCP console once they are installed and registered with GCP.
For more information on the GKE on-prem, see Google documentation below:
https://cloud.google.com/anthos/docs/concepts/overview
As shown in Figure 8, the centralized control plane is hosted on GCP, which provides the common UI and the
core services required to connect and operate Kubernetes clusters. Below is a brief description of these
components:
Multi-cluster Ingress
If you have an application running on multiple Google Kubernetes Engine clusters located in different regions,
then you can route traffic to a cluster in the region closest to the user by configuring the multi-cluster ingress.
The applications that need the multi-cluster ingress should be configured identically in all Kubernetes clusters with regard to the deployment configuration, including the namespace, name, port number, etc.
https://cloud.google.com/kubernetes-engine/docs/how-to/multi-cluster-ingress
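As a sketch of what this identical configuration looks like (the names, namespace, and port below are illustrative assumptions), a MultiClusterIngress is paired with a MultiClusterService that selects the identically deployed application in each member cluster:

apiVersion: networking.gke.io/v1
kind: MultiClusterIngress
metadata:
  name: store-ingress
  namespace: store              # must exist in every member cluster
spec:
  template:
    spec:
      backend:
        serviceName: store-mcs  # routes to the MultiClusterService below
        servicePort: 8080
---
apiVersion: networking.gke.io/v1
kind: MultiClusterService
metadata:
  name: store-mcs
  namespace: store
spec:
  template:
    spec:
      selector:
        app: store              # the app must carry this label in all clusters
      ports:
      - name: web
        protocol: TCP
        port: 8080
        targetPort: 8080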
Google Cloud's operations suite provides powerful monitoring, logging, and diagnostics. It equips you with
insight into the health, performance, and availability of cloud-powered applications, enabling you to find and
fix issues faster. It is integrated with Google Cloud Platform, Amazon Web Services, and popular open source
packages.
Google Cloud's operations suite combines metrics, logs, and metadata from all of your cloud accounts and
projects into a single comprehensive view of your environment, enabling rapid drill-down and root cause analysis.
More information about Google Cloud's operations suite can be found here:
https://cloud.google.com/products/operations
Cloud identity aware proxy (IAP) provides unified access control to the workloads running on the Google
cloud. With IAP, you can use centralized authentication and authorization to secure users and applications,
running inside VMs and containers. Hence, IAP simplifies management of user and service identities as well
as access to the workloads and resources across multiple Kubernetes clusters.
https://cloud.google.com/iap/
API Management
API management enables cloud administrators to control access to various service APIs by service agents, users, and tools. Only authorized users (or service accounts) have access to the secured APIs.
For example, in order to connect and manage the Anthos on-prem cluster via the Google cloud console, the
user account should be authorized for the GKE connect API. In addition to the access control to APIs, you can
also monitor the API access via the GCP dashboard for the respective APIs. Some of the monitored metrics
include traffic (calls/sec), errors, latency, etc.
As described previously, Anthos as a developer platform enables modern application development with micro-
services architecture. With micro-services, the traditional monolithic applications can be broken down into
smaller and more manageable functional modules and deployed as self-contained services in the cloud.
Micro-services expose APIs that other services can access. A service mesh essentially enables connectivity
among the micro-services distributed across clouds.
Istio is an open source project developed by Google, IBM and others. Istio provides a scalable service mesh
implementation for connecting applications running on Kubernetes. Developers do not need to modify their
code to use Istio. With simple descriptive annotations in a YAML file, you can specify the service-mesh rules and apply them to running container workloads. Istio will then apply the rules to the deployed applications and start managing the security, configuration, traffic policies, etc., as defined in the configuration rules. Istio provides the following key capabilities:
• Automatic load balancing for HTTP, gRPC, WebSocket, MongoDB, and TCP traffic.
• Fine-grained control of traffic behavior with rich routing rules, retries, failovers, and fault injection.
• A configurable policy layer and API supporting access controls, rate limits, and quotas.
• Automatic metrics, logs, and traces for all traffic within a cluster, including cluster ingress, and egress.
• Secure service-to-service communication in a cluster with strong identity based authentication and
authorization.
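For example, the traffic-control rules above are expressed as plain YAML resources. The following minimal sketch (service name, subsets, and weights are illustrative) splits traffic between two versions of a service and adds automatic retries:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: checkout
spec:
  hosts:
  - checkout                    # in-mesh service name (illustrative)
  http:
  - route:
    - destination:
        host: checkout
        subset: v1
      weight: 90                # 90% of traffic to the stable version
    - destination:
        host: checkout
        subset: v2
      weight: 10                # 10% canary traffic
    retries:
      attempts: 3
      perTryTimeout: 2s
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: checkout
spec:
  host: checkout
  subsets:                      # subsets map to version labels on the deployments
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2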
This is one of the core components of Anthos. When deploying and managing GKE clusters in multiple locations, it becomes difficult to keep all clusters in sync with respect to their configuration, security policies (RBAC), resource configurations, namespaces, and so forth. As people use these clusters and make configuration changes, over time you run into “configuration drift”, which results in different clusters behaving differently when the same application is deployed in different places. Anthos Config Management enables centralized configuration management via descriptive templates maintained as code in a repository. This makes it easy to ensure consistent behavior across the clusters, and any deviations can be easily rectified by reverting the changes to the last known good state.
https://cloud.google.com/anthos-config-management/docs/
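As a sketch of how a cluster is pointed at the config repository (the repository URL, branch, and directory below are illustrative assumptions), the Config Management operator is configured with a resource similar to the following:

apiVersion: configmanagement.gke.io/v1
kind: ConfigManagement
metadata:
  name: config-management
spec:
  clusterName: user-cluster-1                           # name the cluster is registered under
  git:
    syncRepo: git@github.com:example/anthos-config.git  # hypothetical config repo
    syncBranch: master
    secretType: ssh                                     # credential type for the repo
    policyDir: "config-root"                            # directory holding the config templates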
GCP marketplace
The GCP container marketplace provides access to a large ecosystem of open source and commercial
container application images that can be deployed on the GKE clusters running anywhere. With the
marketplace, customers can utilize pre-created containers for common applications such as databases or web
servers without having to create them on their own. Figure 11 shows a screenshot of the GCP Kubernetes apps marketplace.
Figure 12 shows Anthos GKE On-Prem deployment Architecture with the ThinkAgile VX vSAN certified nodes.
The on-prem deployment consists of a single vSAN cluster with four or more servers. Each system runs the
VMware vSphere 6.5U3 hypervisor host operating system. The hosts are managed via a vCenter 6.5 virtual
appliance. The shared vSAN cluster provides the persistent storage via the VMFS distributed file system. The
Anthos deployment consists of an Admin workstation VM, an admin GKE cluster, and one or more user GKE
clusters, all implemented as virtual machines. More details on the system requirements, hardware options,
and deployment steps are described in the following sections.
In addition to the simplified deployment architecture, vSAN HCI provides several advantages:
• vSAN is built on top of the popular VMware vSphere (ESXi) hypervisor operating system. Hence,
applications that run virtualized on top of vSphere can directly take advantage of vSAN without any
modifications or additional software requirements.
• vSAN is managed through the familiar vCenter software, which provides a single-pane-of-glass
management to vSphere clusters. Hence, administrators that are already familiar with vCenter do not
need to learn a new tool to manage vSAN clusters.
• Health monitoring and lifecycle management of vSAN is built into the vCenter/vSphere.
• All the enterprise-class storage features such as data replication, deduplication, compression,
encryption, scaling of storage, etc., are standard.
• vSAN also supports the Container Storage Interface (CSI) to provide persistent storage for
containers running in a Kubernetes environment.
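As an illustration of the last point, a Kubernetes StorageClass can hand provisioning off to the vSphere CSI driver. In this minimal sketch the class name and the vSAN storage policy name are assumptions for the example:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: vsan-gold                                   # illustrative class name
provisioner: csi.vsphere.vmware.com                 # vSphere CSI driver
parameters:
  storagepolicyname: "vSAN Default Storage Policy"  # assumed SPBM policy defined in vCenter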
The Lenovo ThinkAgile VX Series appliance arrives with the hardware configured, VMware Hyper-Converged
Infrastructure (HCI) software preinstalled, and Lenovo professional services to integrate it into your
environment. This makes the ThinkAgile VX Series easy to deploy, provides faster time-to-value, and reduces your costs.
There are two types of ThinkAgile VX systems: VX Certified Nodes and VX Series appliances. The VX Certified Nodes are vSAN certified build-your-own (BYO) servers. They provide all the certified disk and I/O options for vSAN and allow the most flexibility for customers to configure the systems. The VX Series appliances are optimized around specific workload use cases such as transactional databases, web, storage-rich, and high-performance workloads. Hence, the VX Series appliances are purpose-built vSAN certified servers. The VX Series comes in a wide range of platforms and provides the flexibility to configure the system you need to meet any use case. The appliances are preloaded with VMware ESXi and preconfigured with vSAN, along with licenses and subscriptions. Both all-flash and hybrid platforms are supported.
Figure 14: Lenovo ThinkAgile VX 2U Certified Node with 16x SFF (top), 12x LFF (middle), or 24x SFF
(bottom) drive bays
You can find more detailed information about the various ThinkAgile VX certified nodes as well as appliances
on the Lenovo press website here:
https://lenovopress.com/servers/thinkagile/vx-series
6.1.3 Lenovo ThinkAgile VX with 3rd Generation Intel Xeon Scalable Processors
ThinkAgile VX servers with 3rd Generation Intel Xeon Scalable Processors provide up to 40 cores per processor, 4TB of memory per server, and support for the new PCIe 4.0 standard for I/O. The VX systems offer the ultimate in two-socket performance in 1U and 2U form factors. The 1U servers are suitable for compute-heavy workloads, and the 2U servers scale for compute, storage, and inference workloads with support for more drives and GPUs.
All of these models support 32x DDR4 3200 MHz DIMMs (4TB maximum memory per server). Drive bay options range from 4x 3.5-inch SAS/SATA or NVMe bays and 12x 2.5-inch SAS/SATA or NVMe bays on the 1U models, to 24x 2.5-inch or 16x 3.5-inch hot-swap bays on the 2U models, with storage-dense 2U configurations supporting up to 40x 2.5-inch SAS/SATA plus 32x 2.5-inch NVMe bays, or 16x 3.5-inch SAS/SATA bays.
Persistent storage, on the other hand, is used for data that needs to be persisted across container
instantiations. An example is a two- or three-tier application that has separate containers for the web and business
logic tier and the database tier. The web and business logic tier can be scaled out using multiple containers
for high availability. The database that is used in the database tier requires persistent storage that is not
destroyed.
Kubernetes uses a persistent volume framework that operates on two concepts: persistent volumes and persistent volume claims. Persistent volumes (PVs) are the physical storage volumes that are created and managed by the Kubernetes cluster administrator. When an application container requires persistent storage, it creates a persistent volume claim (PVC). The PVC is a unique pointer/handle to a persistent volume in the cluster.
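A minimal sketch of such a claim follows; the claim name, size, and storage class (matching the vSAN-backed class sketched earlier) are illustrative:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-pvc            # hypothetical claim for a database tier
spec:
  accessModes:
  - ReadWriteOnce               # mounted read-write by a single node
  resources:
    requests:
      storage: 20Gi
  storageClassName: vsan-gold   # assumed class backed by the vSAN datastore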
As described previously, the GKE on-prem clusters are installed on a VMware vSAN HCI platform, which
provides both the compute and storage capacity for the workloads. Since the GKE clusters are deployed as
virtual machines on vSphere, they will have direct access to the vSAN cluster storage or SAN based data
stores if a SAN storage environment is used.
As shown in Figure 15, the nodes in the GKE cluster run as virtual machines with the corresponding virtual
disks attached to them, which are stored on the shared vSAN datastore. When the Kubernetes pods running
within the nodes make persistent volume claim requests, the data is persisted as part of the VM’s virtual
disks, which are the VMDK files. This makes management of persistent volume stores and PVCs much easier
from a Kubernetes administration standpoint.
6.3 Networking
There are three logical networks defined in this RA:
• External: The external network is used for internet access to the clusters and ingress to the exposed applications (services and routes). Anthos in its current release requires a layer 4 load-balancer to
support traffic routing across the internal and external networks, as well as the communication across
the on-prem GKE clusters.
• Internal: This is the primary, non-routable network used for cluster management and inter-node
communication. Domain Name Servers (DNS) and Dynamic Host Configuration Protocol (DHCP)
services also reside on this network to provide the functionality necessary for the deployment process
and the cluster to work. Communication with the Internet is handled by the F5 gateway, which runs as
a separate virtual appliance on the VMware cluster.
• Out-of-band network: This is a secured and isolated network used for switch and server hardware
management, such as access to the xClarity Controller (XCC) module on the servers and SoL (Serial-
over-LAN).
The two primary network fabrics shown in the diagram are the systems management network and the internal
data/user network. Typically, 1Gbps Ethernet is sufficient for the systems management network, which
provides out-of-band access to the on-board management processors on the servers and network switches.
The data/cluster internal fabric is recommended to be 10Gbps Ethernet. This fabric is also recommended to
have redundant switches for high-availability of the network fabric. The Lenovo ThinkSystem network switches
support the Cloud Network Operating System (CNOS), which provides advanced data center networking
features including virtual link aggregation (VLAG).
Figure 18 shows the redundant network architecture and the VLAG configuration.
The Lenovo XClarity Administrator provides agent-free hardware management for Lenovo’s ThinkSystem®
rack servers, System x® rack servers, and Flex System™ compute nodes and components, including the
Chassis Management Module (CMM) and Flex System I/O modules. Figure 19 shows the Lenovo XClarity
administrator interface, in which Flex System components and rack servers are managed and are seen on the
dashboard. Lenovo XClarity Administrator is a virtual appliance that can be quickly imported into a virtualized environment.
Some of the key new features since earlier Anthos 1.x releases are described in the release notes:
https://cloud.google.com/anthos/gke/docs/on-prem/release-notes
More detailed instructions to prepare for Anthos deployment can be found here:
https://cloud.google.com/gke-on-prem/docs/how-to/installation/getting-started
The second tier of sizing is the physical hardware sizing. Anthos allows you to deploy multiple user clusters on
top of the same VMware vSAN cluster. Hence, you need to determine how many GKE clusters you will install, how many worker VMs will be in each cluster, and their individual resource requirements such as vCPUs and vRAM, and then aggregate the totals to determine the physical resources needed to implement the clusters. This translates to a number of vSAN servers with specific physical CPU cores, core
speed, physical memory, and disk.
In this reference architecture, we provided three recommended hardware configurations based on a rough
workload profile estimate:
Admin workstation: Used for the rest of the cluster deployment. This is a Google provided VM template.
Admin cluster: Provides the administrative control plane for all on-prem clusters and for connectivity to
Google cloud.
The following are the minimum hardware requirements for these different virtual machines:
• Add-ons VMs: two VMs, each with 4 vCPU and 16384 MB RAM, which run the admin control plane's add-ons.
• User cluster nodes: a minimum of 4 vCPU per node. Each user cluster has its own control plane, which runs in the admin cluster.
See the following Google document for more detailed hardware requirements for Anthos:
https://cloud.google.com/gke-on-prem/docs/how-to/installation/requirements
The cluster administrator can scale up the clusters later by adding more user nodes or by creating additional user clusters based on requirements.
The high-level cluster architecture for this example implementation is shown in Figure 20.
The installation uses the Google Cloud CLI and the gkeadm tool to set up the installation environment and
deploy the admin workstation image. All subsequent setup tasks are completed from this admin workstation
VM.
1. On the VX7531 nodes, change the UEFI processor MWAIT setting to enabled. By default, it is disabled, and UEFI must be in custom mode to allow this setting change. This is a VMware requirement to allow the VMware CLS system VMs to start.
2. Set up the VMware vSphere platform. This includes loading ESXi on the VX7531 certified nodes and a vSphere vCenter server, which as of version 7.0 runs on a VM.
3. Set up the vSAN data stores or SAN based shared storage accessible by all ESXi hosts.
4. Plan IP addresses and networking. Addresses can come either from a DHCP server scope or from a list of allowed static IP addresses. Provide a DNS server and default gateway.
5. Configure vSphere resource levels such as the virtual data center, resource pool, network and
data store.
Note: Google’s online documentation walks you through these steps and includes the Linux commands and
cluster config file templates needed to accomplish each one. See this link for the latest install details:
https://cloud.google.com/anthos/clusters/docs/on-prem/latest/how-to/install-overview
6. Create the admin workstation. This is a Google provided VMware template that creates a VM to
use for the remaining deployment steps.
7. Create the admin cluster load balancer VM. Google includes a bundled VM-based load balancer called Seesaw, which works well for this purpose.
8. Create an admin cluster. This cluster runs the Kubernetes control plane for itself and any
associated user clusters.
9. Create the user cluster load balancer VM. The bundled Seesaw load balancer can be used here as well.
10. Create user clusters. These clusters are made up of worker nodes and run the workload
containers.
11. Deploy a sample workload on a user cluster. This includes deploying an application, creating a
service and an ingress for it.
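To give a feel for the user cluster definition referenced in the steps above, the following is a minimal sketch patterned after the Anthos 1.12 user cluster config template. The cluster name, version string, CIDRs, VIPs, and node pool sizing are all placeholders; Google's template should be treated as the authoritative schema:

apiVersion: v1
kind: UserCluster
name: user-cluster-1            # placeholder cluster name
gkeOnPremVersion: 1.12.0-gke.0  # placeholder; use the exact bundled version
network:
  ipMode:
    type: dhcp                  # or "static" with an IP block file
  serviceCIDR: 10.96.0.0/20
  podCIDR: 192.168.0.0/16
loadBalancer:
  vips:
    controlPlaneVIP: 172.16.20.20   # placeholder VIPs on the cluster network
    ingressVIP: 172.16.20.21
  kind: Seesaw                  # the bundled load balancer from the steps above
nodePools:
- name: pool-1
  cpus: 4
  memoryMB: 8192
  replicas: 3                   # worker node count for this pool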
For a production level GKE on-prem implementation, you need to consider system reliability, availability, performance, and scalability requirements. Since there is technically no limit to scaling the GKE clusters, you are limited only by the underlying infrastructure capabilities. VMware vSAN can scale up to 64 physical hosts in a single vSAN cluster. This scale provides a lot of capacity, from both the compute and storage perspectives. The sweet spot for production cluster deployments is between 4 and 16 nodes. When designing
the vSAN clusters for production environments you should follow the best practices for data redundancy,
availability, and disaster recovery.
https://storagehub.vmware.com/t/vmware-vsan/vmware-r-vsan-tm-design-and-sizing-guide-2/version-6-5/
To deliver high performance for mission-critical workloads, consider vSAN All-flash node configurations which
use solid state disks (SSDs) to deliver high IOPS and throughput. The high-performance configuration BOM in
the appendix is based on all-flash vSAN.
Figure 21 shows the network architecture for a production level vSAN cluster for an Anthos deployment. As shown in the figure, the data and user network fabric uses redundant 10Gbps Ethernet switches. The nodes each have two 10Gbps ports connected into the fabric for redundancy, as well as aggregation of the ports to deliver 20Gbps bandwidth. The aggregation configuration can be done in vSphere. The switches also have an ISL VLAG across them, which essentially makes the two switches act as a single logical switch for the downstream links. If you lose one of the switches, the nodes will still have the other port active without any disruption to the network traffic. In addition, there is a 1Gbps switch used for out-of-band management access to the hardware for functions such as remote power management, events/alerts, firmware updates, etc.
External network:
As the name suggests, this network connects the GKE on-prem clusters into the customer’s campus network
as well as the internet. The external network traffic is only allowed to access services inside the GKE on-prem
clusters via the ingress routing provided by the F5 BIG-IP load-balancer. In addition, you will need a gateway
host that provides NAT based access to the GKE cluster nodes to access the internet during the initial
deployment phase and optionally to provide access to the internet to running pods. This will be a
unidirectional link. When the on-prem GKE clusters are connected to the GCP console, a TLS connection is
established from the GCP to the admin Kubernetes cluster through the gateway.
Internal VM network:
This is a private network segment used for the management network across the GKE cluster virtual machines
(admin and worker nodes). The Kubernetes cluster management and API traffic is accessible over this
network segment. The IP addresses for this segment can be assigned statically or via a DHCP server running
on the network. It’s recommended to isolate the VM network segment for each of the user clusters into its own
VLAN on the virtual switch so that the traffic is isolated across the different clusters. The deployment definition
file for the user cluster should specify the IP addresses used for various API end-points.
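When static addressing is used, GKE on-prem reads the node addresses from an IP block file referenced by the cluster config. A minimal sketch follows; the netmask, gateway, addresses, and hostnames are illustrative:

blocks:
- netmask: 255.255.255.0
  gateway: 172.16.21.1          # placeholder gateway on the VM network segment
  ips:
  - ip: 172.16.21.10            # one entry per cluster node VM
    hostname: user-node-1
  - ip: 172.16.21.11
    hostname: user-node-2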
Internal management network:
This is also a private network segment, similar to the VM network segment. This management network is for
communication across the vSphere ESXi hosts, vCenter, and the GKE on-prem admin workstation.
More detailed network configuration and F5 BIG IP load-balancer requirements can be found here:
https://cloud.google.com/gke-on-prem/docs/how-to/installation/requirements#f5_big-ip_requirements
This section provides a high-level overview of the requirements that the customer's IT environment must address for deploying this reference architecture.
Hybrid cloud implementations tend to be complex because of the integration required between the on-prem
data center and the public cloud data centers, concerns with network/internet security, complex bi-directional
traffic routing, data access and provisioning requirements, etc. Hence, hybrid cloud implementations tend to
require several third party tools and services depending upon the specific capabilities expected. Google
Cloud’s Anthos simplifies hybrid cloud by providing the necessary tools and services for a secure and scalable
hybrid cloud implementation.
Figure 23 shows the core pieces of the hybrid cloud architecture with Google Cloud and Anthos. You will notice that the components running on the Google public cloud are largely the same components running on-prem. Google Kubernetes Engine (GKE) is the common denominator between the public and on-prem clouds. Since
Anthos is primarily focused on enabling containerized workloads running on top of Kubernetes, the hybrid
cloud implemented with Anthos enables cross-cloud orchestration of containers and micro-services across the
Kubernetes clusters. The migration of container workloads between on-prem and GKE clusters running in
other clouds can be achieved through the public container registry such as Google container registry (GCR),
or through your own private registries that are secured with the central identity and access control to only
allow authenticated users or service accounts to push or pull container images from the registry. There is no
need for converting container images running on-prem to run on the Google managed Kubernetes engine on
the public cloud.
As shown in Figure 24, the master node(s) run the core cluster management services such as the kube-apiserver, kube-scheduler, and kube-controller-manager. The worker nodes interact with the master nodes via the kubelet service, which is responsible for managing the Kubernetes pods on the local servers. The pods run one or more containers.
https://kubernetes.io/docs/concepts/architecture/cloud-controller/
With the introduction of Anthos, Google cloud platform also enables managing Kubernetes clusters running anywhere – on Google cloud, in on-prem data centers, or on other cloud providers such as Amazon AWS – from a single place. In addition, the multi-cluster capability also enables configuration management across cloud and
on-prem environments as well as workloads running in different environments.
The core components of the multi-cluster management are GKE connection hub (Connect for short), Google
cloud platform console, and the Anthos Config management.
Figure 26: Secure (TLS) based connection to Google cloud platform from on-prem
https://cloud.google.com/anthos/multicluster-management/connect/overview
Note that the Anthos GKE clusters running in your on-premises data center need to be connected and
registered with GCP to be reachable by Google and displayed in GCP Console. The GKE on-prem clusters
deployed through Anthos are automatically registered and connected with GCP as part of the setup process.
Figure 28: Managing workloads across GKE clusters from GCP console
https://cloud.google.com/anthos/multicluster-management/console/
Continuous integration is the process in which code developed by multiple developers concurrently is
continuously pulled from the source code repository, integrated, built, and tested.
Jenkins itself can be deployed containerized on top of the GKE cluster on-prem, which makes the deployment
quite straightforward. We are not covering Jenkins deployment in detail in this paper. Please see the following
tutorial for a step-by-step implementation of Jenkins on Kubernetes.
https://cloud.google.com/solutions/jenkins-on-kubernetes-engine
Once Jenkins has been deployed on the cluster you will see the Jenkins container pods successfully created
and running. See Figure 30 for the kubectl commands to check the Jenkins pod status and to access the
Jenkins website at that point.
After installing Jenkins successfully, you need to configure Jenkins and attach the Anthos GKE cluster to the Jenkins master to run the CI/CD pipelines. From the Jenkins portal, select “configure Jenkins” and
create the “cloud” configuration (Figure 32). This is where you need to specify “kubernetes” as the name of
the cloud because this same cloud needs to be specified with the pipeline definition later. Also, you need to
specify the URL for the Jenkins master and the Kubernetes agent.
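Behind this cloud configuration, the Kubernetes plugin schedules Jenkins agents as pods. A minimal sketch of an agent pod template follows, with container names matching the 'golang' and 'gcloud' containers used by the pipeline stages later in this section; the labels and image tags are assumptions:

apiVersion: v1
kind: Pod
metadata:
  labels:
    jenkins: agent              # label the cloud configuration selects on
spec:
  containers:
  - name: golang                # used by the Test stage
    image: golang:1.17
    command: ["cat"]            # keep the container alive for exec'd build steps
    tty: true
  - name: gcloud                # used by the build-and-push stage
    image: gcr.io/cloud-builders/gcloud
    command: ["cat"]
    tty: true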
In order for the Jenkins master to successfully deploy and run the Jenkins slaves on the Kubernetes cluster, you also need to configure the credentials for the Kubernetes cluster in the global Jenkins credentials. You can simply copy and paste the kubeconfig file from the Anthos GKE cluster. See Figure 33.
Figure 34: Kubeconfig file for the GKE on-prem cluster with Jenkins
https://cloud.google.com/solutions/continuous-delivery-jenkins-kubernetes-engine
You will also need to register your Git repository credentials in Jenkins so that the Jenkins build agent can
access the repository and checkout the code. In addition, for your personal development workstation to pull
and push code to the Git repository you need to register the SSH keys with the Git repo and enable them. See
the Git documentation on how to do that.
Figure 36: Creating a multi-branch pipeline for the sample CI/CD application
For the continuous integration setup, we want the source code repo to be scanned periodically for changes.
You can also configure build triggers via web hooks within Git configuration such that Git will trigger the
pipeline build when new code is committed to the repos. However, typically your Anthos GKE clusters will be behind corporate firewalls, which will prevent Git webhook traffic from reaching the Jenkins server. There are ways of working around this issue. See the following article for details on making Git webhooks work with firewalls.
For this article, we will configure a periodic repository scanner in the Jenkins pipeline. See Figure 38, where we specify one minute as the interval for scanning the repositories on GitHub. Every minute, the Jenkins agent will scan the repository for any code updates and then trigger the pipeline build.
Once the multi-branch pipeline is created, you can see that pipeline in the Jenkins dashboard, as shown in the figure. You can see there are two branches detected by Jenkins – Canary and Master. Jenkins will trigger the build on one or both of these branches individually when the code scanner detects changes.
During the init step, various settings for the build are specified, including the project ID for the GCP project
where the Anthos on-prem cluster is registered, the application parameters, build tag for the docker image for
the application, the IAM cloud service account, and the Jenkins account authorized for the Jenkins slave
running on the Kubernetes cluster.
The Google cloud service account is used for accessing the Google container registry, where we would push
the built docker container images. This account should have read/write access to the Google cloud storage bucket that is used by GCR as the image repository. The same account should then be specified in
the later stage during the deployment of the code to the on-prem Kubernetes cluster. The service account
credentials should be registered in the cluster image pull secret store and then used as part of the deployment
definition of the pods.
spec:
  containers:
  - name: backend
    image: gcr.io/cloud-solutions-images/gceme:1.0.0
    resources:
      limits:
        memory: "1000Mi"
        cpu: "100m"
    imagePullPolicy: Always        # always pull the image from GCR
    readinessProbe:                # pod is ready only when /healthz responds
      httpGet:
        path: /healthz
        port: 8080
    command: ["sh", "-c", "app -port=8080"]
    ports:
    - name: backend
      containerPort: 8080
  imagePullSecrets:                # credentials used to pull from the private registry
  - name: gcr-secret
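The gcr-secret referenced above is a standard Kubernetes docker-registry pull secret created from the service account key. A minimal sketch of its shape follows; the base64 payload is elided:

apiVersion: v1
kind: Secret
metadata:
  name: gcr-secret              # must match the imagePullSecrets entry above
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: <base64-encoded Docker config containing the GCR service account key>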
The first step is for Jenkins to checkout the source code from Git and extract the Jenkinsfile, which describes
the pipeline steps.
Once the Jenkinsfile is obtained, the pipeline build stages will execute. The full Jenkinsfile is posted at the end
of this section.
The test phase will execute the unit tests (and any other additional QA to be done) for the code. The following
is the console output in Jenkins from the pipeline execution.
[Pipeline] { (Test)
[Pipeline] container
[Pipeline] {
[Pipeline] sh
+ pwd
+ ln -s /home/jenkins/workspace/sample-pipeline_master /go/src/sample-app
+ cd /go/src/sample-app
+ go test
PASS
ok sample-app 0.015s
Once the test phase is complete and successful, the next stage is to build the container image for the
application and push it to the container registry on GCR.
stage('Test') {
steps {
container('golang') {
sh """
ln -s `pwd` /go/src/sample-app
cd /go/src/sample-app
go test
"""
}
}
}
stage('Build and push image with Container Builder') {
steps {
container('gcloud') {
sh "PYTHONUNBUFFERED=1 gcloud builds submit -t ${imageTag} ."
}
}
}
• Micro-services simplify integration of businesses, processes, technology, and people by breaking down a monolithic application into a smaller set of services that can be handled independently.
• They help build an application as a suite of small services, each running in its own process and independently deployable.
• Micro-services can be written in different programming languages and may use different data storage techniques.
• Micro-services are scalable and flexible, and connected via APIs.
• They leverage many of the reusable tools and solutions in the RESTful and web service ecosystem.
• A micro-service architecture enables the rapid, frequent, and reliable delivery of large, complex applications.
• They enable an organization to quickly evolve its technology stack.
Micro-services apps are deployed as a set of containers in a Kubernetes cluster. Istio is a service mesh platform used to connect micro-services. Istio makes it easy to manage load balancing, service-to-service authentication, monitoring, etc., in the services network. Figure 42 shows the official architecture diagram of Istio 1.1.
https://istio.io/docs/
Figure 43 shows architecture of Istio deployed on Anthos GKE On-Prem on ThinkAgile VX platform.
Istio is installed in user clusters on Anthos GKE On-Prem with Lenovo ThinkAgile VX platform. Users can
leverage Istio to deploy applications and provide service to their customers.
There are three configurations provided below based on the Anthos deployment use cases for dev/test, QA,
and production environments and increasing workload resource requirements in each environment.
The BOM lists in this appendix are not meant to be exhaustive and must always be double-checked with the
configuration tools. Any discussion of pricing, support, and maintenance options is outside the scope of this
document.
• 4x ThinkAgile VX3330 1U compute nodes with 2x 3.1GHz 16C CPUs + 512GB memory
• (2x 800GB cache SSDs + 8x 3.84TB SATA capacity SSDs) per node
• 4x 10Gbps CAT6 Ethernet ports per node
• 2x 480GB M.2 SSDs per node for OS
• 4x ThinkAgile VX7530 2U compute nodes with 2x 2.6GHz 32C CPUs + 1TB memory
• (2x 800GB NVMe cache SSDs + 8x 6.4TB SAS capacity SSDs) per node
• 4x 10Gbps CAT6 Ethernet ports per node
• 2x 480GB M.2 SSDs per node for OS