Google Cloud Essential
Before getting to know Google Cloud Platform, let us first gain some
knowledge of Cloud Computing.
The following are five important characteristics that explain what Cloud
Computing is:
o First, you get all your computing resources on-demand and through
self-service. That is, you obtain processing power, storage, and network
resources over the internet by using a simple user interface.
o Second, you can access those resources over the internet from wherever
you have a network connection.
o Third, the resources are provisioned from a pool by a provider, who
allocates them to customers.
o Fourth, the resources are elastic, which means you can scale them up
and down as your needs change.
o Fifth, you pay only for the resources that you use or reserve, which is
known as the pay-for-use model. You are not billed for resources that
you do not use.
Google Cloud Platform offers several services, such as Compute, Storage, Big
Data, and Machine Learning, which you can use to develop applications
such as web, mobile, analytics, or back-end solutions.
Google Cloud offers reasonable, pay-per-use billing for all the resources
that it provides.
The primary reasons to choose Google Cloud Platform are that it is global,
cost-effective, open-source friendly, and secure.
Let us learn about the services that Google Cloud Platform provides to its
customers.
Security
The following are the measures that Google Cloud takes to keep its
customers' data safe:
Google designs its own server boards and network equipment.
Google server machines use cryptographic signatures to check that they
are booting the correct software.
Google designs and builds its own data centers, which incorporate multiple
layers of physical security protections.
Google's infrastructure provides cryptographic privacy and integrity for
data on the network.
Google also has encryption support in hard drives and SSDs.
Google services that are made available over the internet register with an
infrastructure service called the Google Front End, whose job is to check
incoming network connections for correct certificates.
Google also has multi-tier, multi-layer denial-of-service protections.
Phishing attacks against Google employees are guarded against with
U2F-compatible security keys.
There are four ways you can interact with Google Cloud Platform:
1. GCP Console
2. Google Cloud SDK and Cloud Shell
3. Mobile App
4. APIs
Prelude
If you have heard the term Cloud Computing, then you are probably aware of
Virtual Machines (VMs). Google Compute Engine (GCE) lets you run VMs on
Google Cloud Platform infrastructure. You configure a VM much the way you
would build a physical server: by specifying its CPU power, memory, storage
types, and OS.
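For instance, creating a VM from the command line might look like the
following sketch; the instance name, zone, machine type, and image family
are illustrative, not prescriptive.

  # Create a VM, specifying its CPU/memory (machine type), OS image, and disk size.
  # All names and values here are examples.
  gcloud compute instances create my-vm \
      --zone=us-central1-a \
      --machine-type=n1-standard-1 \
      --image-family=debian-9 \
      --image-project=debian-cloud \
      --boot-disk-size=50GB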
In this topic, you will learn how GCE works with Google virtual networking.
You can have resources in different zones on the same subnet.
When you create your own subnet, you can dynamically increase its size by
expanding the range of IP addresses allocated to it.
For example, consider a VPC network with a subnet that contains two Compute
Engine VMs. Even though the two VMs are in different zones, they are
considered neighbors because they are on the same subnet.
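The subnet expansion mentioned above can be done with a single gcloud
command and no downtime; a minimal sketch, with an assumed subnet name and
region:

  # Expand the subnet's IP range by giving it a shorter (larger) prefix.
  # Existing addresses are unaffected; a range can only grow, never shrink.
  gcloud compute networks subnets expand-ip-range my-subnet \
      --region=us-central1 \
      --prefix-length=20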
Introduction
Developers build many applications that require storing large amounts of
data. The data can take many forms, such as media, confidential data from
devices, customer account balances, and so on.
Earlier, we read that data can be stored on persistent disks. GCP also
provides other storage options for structured or unstructured data and for
transactional and relational data.
In this topic, you will be learning about the various storage options:
Cloud Storage, Cloud Bigtable, Cloud SQL, Cloud Spanner, and Cloud
Datastore.
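As a small taste of Cloud Storage, the gsutil tool that ships with the
Cloud SDK can create buckets and copy objects; the bucket and file names
below are examples only.

  # Make a bucket, upload a file, and list the bucket's contents.
  gsutil mb -l us-central1 gs://my-example-bucket/
  gsutil cp local-file.txt gs://my-example-bucket/
  gsutil ls gs://my-example-bucket/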
Scalability: with Bigtable, you can increase your machine count without any
downtime. Bigtable also handles administration tasks like upgrades and
restarts.
The data in Cloud Bigtable is encrypted, and you can use IAM roles to
specify who can access it.
From an application API perspective, data is written to and read from
Bigtable through data service layers such as Managed VMs, an HBase REST
server, or a Java server using the HBase client. These layers serve the
data to applications, dashboards, and data services.
Data can also be read from and written to Bigtable through batch processes
such as Hadoop MapReduce, Dataflow, or Spark.
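For a quick feel of Bigtable's data model, the cbt command-line tool (part
of the Cloud SDK) can create a table, write a cell, and read it back; the
project, instance, table, and column names here are assumptions.

  # Create a table with one column family, write a value, and read it back.
  cbt -project my-project -instance my-instance createtable my-table
  cbt -project my-project -instance my-instance createfamily my-table stats
  cbt -project my-project -instance my-instance set my-table row1 stats:count=42
  cbt -project my-project -instance my-instance read my-table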
Containers
Containers are preferred because they address drawbacks of both Compute
Engine and App Engine.
Containers give workloads independent scalability and act as an abstraction
layer over the OS and hardware.
All you need is a host with an OS that supports containers, plus a
container runtime.
Your code becomes portable, and you can treat the OS and hardware as a
black box. With this, you can move between stages like development,
staging, and production, and from on-premises to the cloud.
For example, if you want to scale a web server a hundred times, you can do
it in seconds on a single host, depending on the size of the workload.
Kubernetes
For example, if you want to build your application out of many containers
acting like microservices connected over the network, you can make them
scale independently, remain modular, and be easily deployable across a
group of hosts.
The hosts can scale the containers up or down and start or stop them on
demand as your application changes or as a host fails.
This orchestration is done by a tool called Kubernetes.
The job of Kubernetes is to orchestrate many containers across hosts, scale
them as microservices, and manage rollouts and rollbacks.
Now you will get to know how you can build and run containers. Docker is an
open-source tool that defines a format for bundling your application, its
dependencies, and machine-specific settings into a container.
Google Cloud Platform also has a separate tool called Google Container
Builder, a managed service for building containers.
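As an illustration, building a container image locally with Docker, or
building the same image with Container Builder, might look like this; the
project and image names are placeholders, and a Dockerfile is assumed to
exist in the current directory.

  # Build the image locally from a Dockerfile, then push it to Container Registry.
  docker build -t gcr.io/my-project/my-app:v1 .
  docker push gcr.io/my-project/my-app:v1

  # Or let Google Container Builder do the build in the cloud
  # (in newer SDK releases this command is "gcloud builds submit").
  gcloud container builds submit --tag gcr.io/my-project/my-app:v1 .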
Kubernetes Functionality
Kubernetes is an open-source orchestrator for containers.
Kubernetes provides a set of APIs that you can use to deploy containers on
a set of nodes called a cluster.
The cluster is divided into master and node components; the master manages
the nodes in the cluster.
In Kubernetes Engine, the nodes are virtual machines running in Google
Compute Engine.
When you define a set of applications, Kubernetes determines how to make
them interact with each other.
Kubernetes makes it easy to run containerized applications yourself on your
own hardware, but then you have to maintain the environment yourself.
To avoid that maintenance burden, Google Cloud provides Google Kubernetes
Engine, which offers Kubernetes as a managed service.
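Creating a managed cluster takes just a couple of commands; the cluster
name, zone, and node count below are illustrative.

  # Create a three-node cluster and fetch credentials so kubectl can talk to it.
  gcloud container clusters create k1 --zone=us-central1-a --num-nodes=3
  gcloud container clusters get-credentials k1 --zone=us-central1-a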
You deploy containers onto the nodes using a wrapper called a pod.
A pod is the smallest unit in Kubernetes that you can create or deploy.
Think of a pod as a running process on your cluster.
A pod can be one component of an application, or an entire application, and
it can contain one or more containers.
When you package multiple containers into a single pod, they automatically
share networking and can have disk storage volumes in common.
Each pod gets a unique IP address and a set of ports for your containers.
Containers inside a pod communicate with each other using localhost and
fixed ports.
The kubectl run command is used to run a container in a pod; it results in
the immediate deployment of a container in a running pod, as sketched
below.
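A sketch of that flow follows; the image and names are examples, and note
that in older kubectl releases, kubectl run creates a deployment.

  # Start a deployment running the nginx image, then check its pods.
  kubectl run nginx --image=nginx:1.15.7
  kubectl get pods

  # Expose the deployment to the internet through a load balancer service.
  kubectl expose deployment nginx --port=80 --type=LoadBalancer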
Pods
The kubectl scale command can be used to scale a deployment. You can also
enable autoscaling with all kinds of parameters.
Instead of issuing commands, you can also provide a configuration file that
tells Kubernetes what you want your desired state to look like, and
Kubernetes will figure out how to reach it.
You can check the deployment to make sure the replicas are running by using
the command kubectl get deployments or kubectl describe deployments.
You can get the external IP of the service by running the command kubectl
get services.
For example, if there are 3 replicas of an nginx pod and you want to move
to 5 replicas, you can edit the deployment configuration file, changing the
replica count from three to five.
Run the kubectl apply command to apply the changed configuration file, and
run the kubectl get replicasets command to see the replicas and their
updated state.
The kubectl get services command confirms that the external IP of the
service is unaffected; the service will send traffic to all 5 pods.
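Putting the preceding bullets together, a scaling session might look like
the following sketch; the deployment and file names are assumed.

  # Imperative scaling, or autoscaling between bounds on CPU usage.
  kubectl scale deployment nginx --replicas=3
  kubectl autoscale deployment nginx --min=3 --max=10 --cpu-percent=80

  # Declarative scaling: edit replicas in the config file, then apply it.
  kubectl apply -f nginx-deployment.yaml
  kubectl get replicasets
  kubectl get services    # the external IP is unchanged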
If you want to update the version of the application, rolling out the
change all at once is risky, so you use a feature called a rolling update.
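One way to trigger a rolling update from the command line is to change the
deployment's image; Kubernetes then replaces pods gradually. The names and
versions below are examples.

  # Roll out a new image version pod by pod, watch progress, roll back if needed.
  kubectl set image deployment/nginx nginx=nginx:1.16.0
  kubectl rollout status deployment/nginx
  kubectl rollout undo deployment/nginx    # revert if something goes wrong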
Introduction
By now you are familiar with two crucial GCP products, Compute
Engine and Kubernetes Engine.
One common feature of these two products is that you choose the
infrastructure in which your application runs: virtual machines for Compute
Engine and containers for Kubernetes Engine.
If you want to focus on the application code, then your choice should be App
Engine.
App Engine comes under Platform as a Service (PaaS). App Engine manages
both the hardware and the network infrastructure.
App Engine has many built-in services, such as NoSQL databases, in-memory
caching, load balancing, health checks, logging, and user authentication,
which your application can use.
App Engine automatically scales applications such as web applications and
mobile backends.
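Deploying to App Engine is correspondingly short. Assuming you have an
app.yaml describing your application, a deployment might look like this
sketch:

  # One-time region setup for the project, then deploy and open the app.
  gcloud app create --region=us-central
  gcloud app deploy app.yaml
  gcloud app browse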
App Engine Standard
App Engine offers two main environments namely App Engine Standard and App
Engine Flexible.
                    Kubernetes Engine   App Engine Standard   App Engine Flexible
Service Model       Hybrid              PaaS                  PaaS
Primary Use-Case    Container-based     Web and mobile        Both web and mobile applications
                    workloads           applications          and container-based workloads
The next topics walk through three stages of working with an application in
the cloud:
1. Development
2. Deployment
3. Monitoring
Most developers store and maintain their code in a Git repository. You can
run your own Git instance (which gives you greater control) or use a hosted
Git provider (which lessens your work).
In addition to those options, you can keep your code private and protect it
with IAM permissions by using Cloud Source Repositories (CSR).
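A minimal sketch of using CSR from the Cloud SDK, with an assumed
repository name:

  # Create a private repository, then clone it like any other Git remote.
  gcloud source repos create my-repo
  gcloud source repos clone my-repo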
Cloud Functions
When you use Cloud Functions, you do not need to bother about servers or
runtime binaries.
For example, you write JavaScript code for the Node.js environment provided
by GCP, and then configure when it should fire.
You pay only while your functions run, billed in 100-millisecond intervals.
Functions can trigger on events in Cloud Pub/Sub, Cloud Storage, or on an
HTTP call, and you can choose which events interest you.
For each event type, you tell Cloud Functions which events you are
interested in; these declarations are called triggers.
You then attach JavaScript functions to the triggers, and the functions
respond whenever the event happens.
Applications that have a microservices architecture can be implemented with
Cloud Functions.
Cloud Functions can also be used to enhance an application without worrying
about scaling, as sketched below.
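A tiny HTTP-triggered function could be deployed as in the following
sketch; the function name and runtime are illustrative, and exact runtime
names vary by SDK release.

  # Assume index.js exports an HTTP function named helloHttp, e.g.:
  #   exports.helloHttp = (req, res) => res.send('Hello from Cloud Functions');
  gcloud functions deploy helloHttp --runtime=nodejs8 --trigger-http

  # Invoke the deployed function once to test it.
  gcloud functions call helloHttp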
Deployment
Setting up an environment in GCP can take many steps:
1. You set up the compute, network, and storage resources and their
configurations.
2. If you want to change the environment, you do so with a series of
commands.
3. You can also clone the environment by executing further commands.
These steps take a lot of time, and you can reduce that time by using a
template: a specification of what you want the environment to look like.
GCP provides Deployment Manager to automate the creation and management of
resources through such a template.
You can create the template file in either YAML or Python. The template is
then consumed by Deployment Manager, which performs the actions that need
to be done, as sketched below.
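Assuming you have written a template in config.yaml, driving Deployment
Manager looks roughly like this; the deployment name is an example.

  # Create the environment described by the template.
  gcloud deployment-manager deployments create my-deployment --config=config.yaml

  # After editing config.yaml, ask Deployment Manager to apply the new state.
  gcloud deployment-manager deployments update my-deployment --config=config.yaml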
You can edit the template and tell Deployment Manager to make changes
accordingly.
Deployment Manager templates can be stored and version-controlled in Cloud
Source Repositories.
Monitoring
You cannot run your application stably without monitoring it.
Monitoring lets you know whether the changes you made are working, and it
provides information whenever your application is down.
Stackdriver is the GCP tool for monitoring, logging, and diagnostics.
Stackdriver gives you a single place to receive signals from the
infrastructure platform, virtual machines, middleware, and the application
tier, as well as logs, metrics, and traces.
It also helps you check application health, performance, and availability.
With Stackdriver you can do monitoring, logging, tracing, error reporting,
and debugging.
You can configure uptime checks associated with URLs and resources such as
instances and load balancers.
You can also set up alerts on health-check results or when uptime checks
fail.
Stackdriver
You can use the monitoring tool together with notification tools, and you
can view your application's state by creating dashboards in Stackdriver.
Stackdriver Logging allows you to view logs from your application.
Logging also lets you define metrics based on log content; these metrics
can be included in dashboards and alerts.
You can export logs to Cloud Pub/Sub, Cloud Storage and BigQuery.
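For example, reading recent error logs and exporting warnings to BigQuery
might look like the following sketch; the sink name, project, and dataset
are assumptions.

  # Show the ten most recent ERROR-or-worse log entries.
  gcloud logging read 'severity>=ERROR' --limit=10

  # Export all WARNING-or-worse entries to a BigQuery dataset.
  gcloud logging sinks create my-sink \
      bigquery.googleapis.com/projects/my-project/datasets/my_dataset \
      --log-filter='severity>=WARNING'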
Stackdriver Error Reporting tracks and groups all the errors in your cloud
application.
Stackdriver Trace is used to sample the latency of App Engine applications.
Stackdriver Debugger connects your production data to your source code. It
works best when your source code is available in Cloud Source Repositories.
Google Big Data Platform
Google's big data solutions, known together as the Integrated Serverless
Platform, help transform users' business experiences through data insights.
GCP Services are fully maintained and managed, and you only pay for the
resources that you consume.
Let us discuss the integrated data services that help you create custom
solutions.
Apache Hadoop is an open-source framework that is based on the MapReduce
programming model.
In the MapReduce model, a map function runs over a large dataset to
generate intermediate results, and a reduce function takes those
intermediate results as input and produces the final output.
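As a loose analogy (not Hadoop itself), the classic word-count job can be
mimicked in a shell pipeline: the tr stage plays the map role, sort the
shuffle, and uniq -c the reduce; input.txt is an assumed local file.

  # "Map": split text into one word per line; "shuffle": sort; "reduce": count.
  tr -s ' ' '\n' < input.txt | sort | uniq -c | sort -rn | head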
Along with Apache Hadoop, there are related projects such as Apache Pig,
Hive, and Spark.
On Google Cloud Platform, Cloud Dataproc can be used to run Hadoop, Spark,
Hive, and Pig.
Cloud Dataproc
When you request a Hadoop cluster, it is built for you in less than 90
seconds on top of Compute Engine virtual machines, and you can scale it up
and down based on the processing power you need.
You can also monitor your cluster using Stackdriver.
Running clusters on-premises requires an investment in hardware, but
running them in Dataproc allows you to pay only for the hardware resources
you use during the life of the cluster.
Cloud Dataproc is billed by the second; when you are finished, you can
delete the cluster, and billing immediately stops.
You can also use preemptible instances for batch processing to save costs.
Once your cluster contains data, you can use Spark and Spark SQL for data
mining.
You can also use Apache Spark's machine learning libraries (MLlib) to
discover patterns through machine learning.
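A sketch of that lifecycle with assumed names, using the SparkPi example
that ships on Dataproc images (flags vary slightly by SDK release):

  # Create a small cluster with two preemptible workers to cut costs.
  gcloud dataproc clusters create my-cluster \
      --zone=us-central1-a --num-workers=2 --num-preemptible-workers=2

  # Submit a Spark job, then delete the cluster so billing stops.
  gcloud dataproc jobs submit spark --cluster=my-cluster \
      --class=org.apache.spark.examples.SparkPi \
      --jars=file:///usr/lib/spark/examples/jars/spark-examples.jar -- 1000
  gcloud dataproc clusters delete my-cluster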
Cloud Dataflow
Cloud Dataproc is suitable when you know your cluster size. But if your
cluster size is unpredictable, or your data shows up in real time, then
your choice should be Cloud Dataflow.
Dataflow is a managed service that lets you develop and execute a wide
range of data processing patterns: extract-transform-load (ETL), batch
computation, and continuous computation.
Dataflow is used to build Data pipelines for both Batch and Streaming data.
It automates the management of processing resources, freeing you from
operational tasks like performance optimization and resource management.
For example, Dataflow can read data from BigQuery, process it, apply
transforms such as map and reduce operations, and write the results to
Cloud Storage.
Use cases include fraud detection in financial services, IoT analytics,
manufacturing, logistics, healthcare, and so on.
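One low-risk way to try Dataflow is to run a Google-provided template such
as WordCount; the job name and output bucket below are assumptions.

  # Run the provided WordCount template over a public sample file.
  gcloud dataflow jobs run wordcount-job \
      --gcs-location=gs://dataflow-templates/latest/Word_Count \
      --parameters=inputFile=gs://dataflow-samples/shakespeare/kinglear.txt,output=gs://my-bucket/wordcount/output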
BigQuery
Suppose you possess a large dataset and you need to perform ad-hoc SQL
queries on it; then BigQuery should be your choice.
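For instance, an ad-hoc query against one of the BigQuery public datasets
can be issued straight from the bq tool that comes with the Cloud SDK:

  # Count babies named William in a public dataset, using standard SQL.
  bq query --use_legacy_sql=false \
      'SELECT SUM(number) AS total
       FROM `bigquery-public-data.usa_names.usa_1910_2013`
       WHERE name = "William"'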
Cloud Pub/Sub is a simple, reliable, and scalable foundation for stream
analytics. Using it, you can build independent applications that send and
receive messages.
The "Pub" in Pub/Sub stands for publishers, and the "Sub" for subscribers.
Applications publish their messages to Pub/Sub, and the subscribers that
have subscribed to them receive the messages.
Cloud Pub/Sub can also integrate with Cloud Dataflow, and you can configure
subscribers to receive messages on either a push or a pull basis.
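A minimal end-to-end sketch with assumed topic and subscription names:

  # Create a topic and a pull subscription, publish a message, then pull it.
  gcloud pubsub topics create my-topic
  gcloud pubsub subscriptions create my-sub --topic=my-topic
  gcloud pubsub topics publish my-topic --message="hello"
  gcloud pubsub subscriptions pull my-sub --auto-ack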
Cloud Datalab helps you explore your data, and it integrates with multiple
GCP services such as BigQuery, Cloud Storage, and Compute Engine.
Cloud Datalab runs on a Compute Engine VM, and you can specify the region
in which that VM runs.
More on ML Platform
If you want to add machine learning capabilities to your applications, you
can do so through the machine learning APIs provided by Google Cloud.
The Cloud Machine Learning Platform can be used in different applications
depending on whether the data involved is structured or unstructured.
For structured data, ML can be used for classification and regression tasks
such as customer churn analysis, product diagnostics, and forecasting.
For unstructured data, you can use ML for image analytics, such as
identifying damaged shipments, identifying styles, and flagging content.
You can also perform text analytics such as blog analysis, language
identification, and topic classification.
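These APIs can also be exercised directly from the Cloud SDK; a hedged
sketch follows (the image path is a placeholder, and command availability
varies by SDK release).

  # Score the sentiment of a sentence with the Natural Language API.
  gcloud ml language analyze-sentiment --content="I love this product."

  # Label the contents of an image with the Vision API.
  gcloud ml vision detect-labels gs://my-bucket/shipment.jpg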