DLT Unit 5
Q) Machine vision?
Machine vision is often conflated with computer vision, but the terms are distinct. The technology is frequently integrated with artificial intelligence (AI), machine learning and deep learning to accelerate image processing.
Another distinction that's often made is in processing power -- that is, the
difference between a machine and a computer. A machine vision system
typically has less processing power and is used in Lean
manufacturing environments, performing practical tasks at a high speed to
acquire the data needed to complete a specified job. Quality control,
inspection of items and guiding of objects through an assembly line are
common applications of machine vision.
…………………………….. end………………………
Q) Natural Language Processing (NLP)?
NLP drives computer programs that translate text from one language to another, respond to spoken commands, and summarize large volumes of text rapidly, even in real time. There's a good chance you've interacted with NLP
in the form of voice-operated GPS systems, digital assistants, speech-to-text
dictation software, customer service chatbots, and other consumer
conveniences. But NLP also plays a growing role in enterprise solutions that
help streamline business operations, increase employee productivity, and
simplify mission-critical business processes.
Several NLP tasks break down human text and voice data in ways that help the computer make sense of what it is ingesting; examples include speech recognition, part-of-speech tagging, and sentiment analysis.
…………………………….. end………………
Q) Applications of Deep Reinforcement Learning?
These achievements set the basis for the development of real-world deep reinforcement learning applications:
Healthcare: In the medical field, Artificial Intelligence (AI) has enabled the
development of advanced intelligent systems able to learn about clinical
treatments, provide clinical decision support, and discover new medical
knowledge from the huge amount of data collected. Reinforcement learning has enabled advances such as personalized medicine, which systematically optimizes patient health care, in particular for chronic conditions and cancers, using individual patient information.
……………………….. end……………
Q) Autoencoders?
Principal Component Analysis (PCA), which finds the directions of greatest variance onto which data can be projected with the least reconstruction error, and autoencoders, which reconstruct our original input from a compressed version of it, differ from one another: PCA is restricted to linear mappings, whereas an autoencoder can learn non-linear ones.
Architecture
Encoder
Code
Decoder
The encoder layer converts the input image into a latent space representation.
It produces a compressed image in a reduced dimension from the provided
image.
The decoder layer reconstructs the image from the latent space representation and returns it to its original dimensions. Because the bottleneck discards information, this reconstruction is lossy.
The size of the code, or bottleneck, is the first and most crucial hyperparameter for configuring the autoencoder. It determines how much the data is compressed, and it can also act as a regularization term.
Second, keep in mind that the number of layers is important when tuning autoencoders. A shallower network is easier to train, whereas greater depth makes the model more complex.
Thirdly, we need to consider the number of nodes in each layer. Typically, the number of nodes per layer decreases with each successive layer of the encoder, reaching its minimum at the bottleneck.
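The encoder/code/decoder pipeline above can be sketched with plain matrix operations. This is a minimal illustration, not a trained model: the 784-input / 32-code sizes, the random weights, and the sigmoid activation are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Encoder: 784 -> 32 (compression); Decoder: 32 -> 784 (reconstruction).
# Weights are random here; a real autoencoder would learn them.
W_enc = rng.normal(0, 0.01, size=(784, 32))
W_dec = rng.normal(0, 0.01, size=(32, 784))

def encode(x):
    return sigmoid(x @ W_enc)      # latent-space representation (the "code")

def decode(code):
    return sigmoid(code @ W_dec)   # lossy reconstruction

x = rng.random((1, 784))           # one flattened 28x28 "image"
code = encode(x)                   # compressed to the bottleneck size
x_hat = decode(code)               # back to the original dimensions

print(code.shape)   # (1, 32)
print(x_hat.shape)  # (1, 784)
```

Note how the bottleneck size (32 here) fixes how much the data is compressed, which is exactly the first hyperparameter discussed above.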
…………………………… end……………..
Q)Types of Autoencoders?
An autoencoder is a neural network that operates completely unsupervised and can be used to compress the input data.
It takes an input image and tries to predict that same image as its output, reconstructing the image from its compressed bottleneck region.
Sparse Autoencoders
Sparse autoencoders are controlled by limiting how many nodes at each hidden layer are allowed to be active.
A penalty that grows with the number of active neurons is imposed on the loss function, encouraging the network to use only a few neurons at a time.
Contractive Autoencoders
Prior to rebuilding the input in the decoder, a contractive autoencoder funnels it through a bottleneck. The bottleneck is used to learn a representation of the image that is robust to small variations in the input.
To train a model that satisfies this requirement, we must ensure that the derivatives of the hidden layer activations with respect to the input are small.
Denoising Autoencoders
Have you ever wanted to remove background noise from an image but didn't know where to begin? If so, denoising autoencoders are the answer, because removing image noise by hand is difficult when working with photographs.
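The denoising training setup can be sketched as follows: the network sees a corrupted input but is scored against the clean original. The noise level (0.1), the array sizes, and the use of the noisy input itself as a stand-in "model" output are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

x_clean = rng.random((4, 64))                    # clean training inputs
noise = rng.normal(0.0, 0.1, size=x_clean.shape)
x_noisy = np.clip(x_clean + noise, 0.0, 1.0)     # corrupted inputs fed to the model

def mse(x_hat, target):
    # Reconstruction loss compares against the *clean* target, not the noisy input
    return float(np.mean((x_hat - target) ** 2))

# A trained denoiser would map x_noisy -> x_hat; here we measure the loss
# an identity "model" would get, just to show what is being minimized.
loss = mse(x_noisy, x_clean)
```

Training drives this loss down, so the network learns to strip the noise rather than merely copy its input.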
Variational Autoencoders
Variational autoencoders (VAEs) are models created to address a specific
problem with conventional autoencoders. An autoencoder learns to solely
represent the input in the so called latent space or bottleneck during training.
The post-training latent space is not necessarily continuous, which makes
interpolation challenging.
Variational autoencoders address this problem by expressing their latent features as probability distributions, resulting in a continuous latent space that is easy to sample and interpolate.
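The idea of latent features as probability distributions can be sketched with the reparameterization trick commonly used to sample a VAE's latent space (z = mu + sigma * eps). The latent size of 8 and the zero-initialized mean and log-variance are illustrative assumptions; a real VAE's encoder would predict them.

```python
import numpy as np

rng = np.random.default_rng(0)

mu = np.zeros((1, 8))                 # per-feature latent means (from the encoder)
log_var = np.zeros((1, 8))            # per-feature latent log-variances

eps = rng.standard_normal(mu.shape)   # noise drawn from N(0, 1)
sigma = np.exp(0.5 * log_var)
z = mu + sigma * eps                  # a sample from the continuous latent space

print(z.shape)  # (1, 8)
```

Because each latent feature is a distribution rather than a point, nearby samples of z decode to similar outputs, which is what makes interpolation in the latent space work.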
…………………… end……………
Q) Deep Belief Networks (DBNs)?
Deep belief networks (DBNs) are a type of deep learning algorithm that addresses some of the problems associated with classic neural networks. They do this by using layers of stochastic latent variables to make up the network.
These latent variables, also called feature detectors or hidden units, are binary, and they are known as stochastic because they can take on either value with some probability.
The top two layers in a DBN are connected by undirected links, while each of the remaining layers has directed links pointing toward the layer below it. DBNs differ from traditional neural networks because they can act as both generative and discriminative models; a conventional neural network, by contrast, can only be trained to discriminate, for example to classify images.
DBNs also differ from other deep learning algorithms like restricted Boltzmann machines (RBMs) or autoencoders in how they process data: they start from an input layer with one neuron per element of the input vector and proceed through many layers, with each layer's outputs generated using probabilities derived from the previous layer's activations.
The lowest layer of visible units receives the input data as binary or real values. As in an RBM, there are no intralayer connections in a DBN. The hidden units represent features that capture the correlations in the data.
A matrix of symmetric weights W connects each pair of adjacent layers, with every unit in one layer linked to every unit in the layer above it.
First, we train a layer of features that receives input directly from the pixels. We then learn features of these previously obtained features by treating the activations of the trained layer as if they were pixels. The lower bound on the log-likelihood of the training data improves every time a fresh layer of features is added to the network.
First, we run numerous steps of Gibbs sampling in the top two hidden layers. Because the top two hidden layers define an RBM, this stage effectively draws a sample from it.
Then we generate a sample from the visible units using a single pass of ancestral sampling through the rest of the model.
Finally, we use a single bottom-up pass to infer the values of the latent variables in each layer. Greedy pretraining begins with an observed data vector in the bottom layer; the generative weights are then fine-tuned in the opposite, top-down direction.
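The bottom-up inference pass described above can be sketched as follows: each layer's activations become the "pixels" for the next layer. The layer sizes (784 → 256 → 64) and random weights are illustrative assumptions; in a real DBN each weight matrix would come from a trained RBM.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One weight matrix per pair of adjacent layers (784 -> 256 -> 64).
weights = [rng.normal(0, 0.01, size=(784, 256)),
           rng.normal(0, 0.01, size=(256, 64))]

v = rng.random((1, 784))        # observed data vector at the bottom layer
activations = [v]
for W in weights:               # greedy, layer-by-layer bottom-up pass
    activations.append(sigmoid(activations[-1] @ W))

print([a.shape for a in activations])  # [(1, 784), (1, 256), (1, 64)]
```

Each loop iteration treats the previous layer's activations as the input "pixels" for the next RBM, mirroring the greedy layer-wise procedure in the text.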
In an implementation, each DBN record contains the RBMs that make up the network's layers, as well as a vector giving each layer's size and, in the case of classification DBNs, the number of classes in the data set.
Applications:
……………………… end………………….
Q) Boltzmann Machine?
A Boltzmann machine is a network of neurons in which all the neurons are connected to each other. The machine has two layers, a visible (input) layer denoted v and a hidden layer denoted h; there is no output layer. Boltzmann machines are stochastic, generative neural networks capable of learning internal representations, and they are able to represent and (given enough time) solve tough combinatoric problems.
The model is based on the Boltzmann distribution (also known as the Gibbs distribution), an integral part of statistical mechanics that explains the impact of parameters like entropy and temperature on quantum states in thermodynamics. For this reason, Boltzmann machines are also known as energy-based models (EBMs). The Boltzmann machine was invented in 1985 by Geoffrey Hinton, then a professor at Carnegie Mellon University, and Terry Sejnowski, then a professor at Johns Hopkins University.
……………………….. end……………
Q) Restricted Boltzmann Machine (RBM)?
The term "restricted" refers to the fact that we are not allowed to connect units within the same layer: two neurons of the input layer, or two of the hidden layer, cannot be connected to each other, although the hidden layer and visible layer can be connected to each other.
Since this machine has no output layer, the question arises of how we adjust the weights and how we measure whether our predictions are accurate. The two-phase training procedure described below answers both questions.
The RBM training algorithm popularized by Geoffrey Hinton learns a probability distribution over its sample training data inputs. RBMs have seen wide application in different areas of supervised and unsupervised machine learning such as feature learning, dimensionality reduction, classification, collaborative filtering, and topic modeling.
Consider the example movie rating discussed in the recommender system
section.
Movies like Avengers, Avatar, and Interstellar have strong associations with a latent fantasy and science fiction factor. Based on the user ratings, the RBM will discover latent factors that can explain the activation of movie choices. In short, an RBM describes variability among correlated input variables in terms of a potentially smaller number of unobserved variables.
The energy function is given by
E(v, h) = −Σi ai·vi − Σj bj·hj − Σi,j vi·wij·hj
where v and h are the states of the visible and hidden units, a and b are their biases, and W is the matrix of weights between them.
In RBM there are two phases through which the entire RBM works:
1st Phase: In this phase, we take the input layer and, using the weights and biases, activate the hidden layer. This process is called the Feed Forward Pass. In the Feed Forward Pass we identify the positive and negative associations.
Feed Forward Equation: p(h = 1 | v) = σ(vW + b), where σ is the sigmoid function.
Positive Association — When the association between the visible unit and
the hidden unit is positive.
Negative Association — When the association between the visible unit and
the hidden unit is negative.
2nd Phase: Since there is no output layer, instead of calculating an output we reconstruct the input layer through the activated hidden state. This process is called the Feed Backward Pass: we backtrack to the input layer through the activated hidden neurons. Having reconstructed the input from the activated hidden state, we can calculate the error and adjust the weights as follows:
Feed Backward Equation: v′ = σ(hWᵀ + a)
Error = Reconstructed Input Layer − Actual Input Layer
Adjust Weight = Input × Error × Learning Rate (0.1)
After doing all the steps we get the pattern that is responsible to activate the
hidden neurons. To understand how it works:
Let us consider an example in which visible unit V1 activates hidden units h1 and h2, and visible unit V2 activates hidden units h2 and h3. Now suppose a new visible unit, V5, comes into the machine and also activates h1 and h2. We can then backtrack through the hidden units and see that the characteristics of the new V5 unit match those of V1, because V1 activated the same hidden units earlier.
……………………….. end………………….
Q)Types of RBM?
There are mainly two types of Restricted Boltzmann Machine (RBM) based on
the types of variables they use:
1. Binary RBM: In a binary RBM, the input and hidden units are binary
variables. Binary RBMs are often used in modeling binary data such as
images or text.
2. Gaussian RBM: In a Gaussian RBM, the input and hidden units are
continuous variables that follow a Gaussian distribution. Gaussian RBMs are
often used in modeling continuous data such as audio signals or sensor
data.
Apart from these two types, there are also variations of RBMs such as:
1. Deep Belief Network (DBN): A DBN is a type of generative model that
consists of multiple layers of RBMs. DBNs are often used in modeling high-
dimensional data such as images or videos.
2. Convolutional RBM (CRBM): A CRBM is a type of RBM that is designed
specifically for processing images or other grid-like structures. In a CRBM,
the connections between the input and hidden units are local and shared,
which makes it possible to capture spatial relationships between the input
units.
3. Temporal RBM (TRBM): A TRBM is a type of RBM that is designed for
processing temporal data such as time series or video frames. In a TRBM,
the hidden units are connected across time steps, which allows the network
to model temporal dependencies in the data.
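The difference between a binary unit and a Gaussian unit can be sketched by sampling both from the same pre-activation. The unit variance of the Gaussian unit is an assumption matching the standard Gaussian RBM formulation, and the pre-activation values are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

pre_activation = rng.normal(0, 1, size=(1, 5))   # same input to both unit types

# Binary RBM unit: state is 0 or 1, sampled with probability sigmoid(pre-activation).
p = sigmoid(pre_activation)
binary_state = (rng.random(p.shape) < p).astype(float)

# Gaussian RBM unit: state is a real value drawn from N(pre-activation, 1).
gaussian_state = rng.normal(pre_activation, 1.0)
```

This is why binary RBMs suit discrete data such as binarized images, while Gaussian RBMs suit continuous data such as audio or sensor readings.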
………………………… end……………….
Q) Generative Adversarial Networks (GANs)?
A GAN consists of two neural networks, a generator and a discriminator, that compete with each other to scrutinize, capture, and replicate the variations within a dataset. GANs can be used to generate new examples that plausibly could have been drawn from the original dataset.
As an example, consider a GAN built around a database of real 100-rupee notes: the generator network generates fake 100-rupee notes, and the discriminator network helps identify which notes are real and which are fake.
How Do GANs Work?
The generator network takes a random noise sample and generates a fake sample of data. The generator is trained to increase the probability that the discriminator network makes a mistake.
In the 100-rupee example, a noise vector (the input vector) is first fed to the generator network, which creates fake 100-rupee notes. The real images of 100-rupee notes stored in a database are passed to the discriminator along with the fake notes, and the discriminator classifies each note as real or fake.
We train the model, calculate the loss function at the end of the discriminator
network, and backpropagate the loss into both discriminator and generator models.
Mathematical Equation
min over G, max over D of V(D, G) = E[x ~ p_data] log D(x) + E[z ~ p_z] log(1 − D(G(z)))
Here,
G = Generator
D = Discriminator
x = a sample of real data
z = a random noise vector fed to the generator
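The loss computed at the end of the discriminator can be sketched as a binary cross-entropy that rewards D for scoring real notes near 1 and fake notes near 0. The score values below are made up purely for illustration.

```python
import numpy as np

def bce(prediction, target):
    # Binary cross-entropy; clip to avoid log(0)
    eps = 1e-12
    p = np.clip(prediction, eps, 1 - eps)
    return float(-np.mean(target * np.log(p) + (1 - target) * np.log(1 - p)))

d_real = np.array([0.9, 0.8])   # D's scores on real 100-rupee notes
d_fake = np.array([0.2, 0.1])   # D's scores on the generator's fakes

# Discriminator loss: push scores on real notes toward 1, on fakes toward 0.
d_loss = bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))

# Generator loss: the generator wants D to mistake its fakes for real notes.
g_loss = bce(d_fake, np.ones_like(d_fake))

print(d_loss < g_loss)  # True: here D classifies well, so G's loss is larger
```

Backpropagating d_loss updates the discriminator, while backpropagating g_loss (through the discriminator into the generator) updates the generator, matching the training loop described above.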
Application of GANs
With the help of DCGANs, you can train a model on images of cartoon characters to generate faces of anime characters as well as Pokemon characters.
GANs can be trained on images of humans to generate realistic faces; such generated faces can look photorealistic even though the people in them do not exist.
GANs can build realistic images from textual descriptions of objects like birds,
humans, and other animals. We input a sentence and generate multiple images
fitting the description.
………………………………….. end……………