DLT Unit 5


Deep Learning Techniques (Unit 5)

Q) Machine vision?

What is machine vision?


Machine vision is the ability of a computer to see; it employs one or more
video cameras, analog-to-digital conversion and digital signal processing. The
resulting data goes to a computer or robot controller. Machine vision is
similar in complexity to voice recognition.

Machine vision is sometimes conflated with the term computer vision. The
technology is often integrated with artificial intelligence (AI), machine
learning and deep learning to accelerate image processing.

How does machine vision work?


Machine vision uses cameras to capture visual information from the
surrounding environment. It then processes the images using a combination
of hardware and software and prepares the information for use in various
applications. Machine vision technology often uses specialized optics to
acquire images. This approach lets certain characteristics of the image be
processed, analyzed and measured.

For example, a machine vision application as part of a manufacturing system can be used to analyze a certain characteristic of a part being manufactured on an assembly line. It could determine if the part meets product quality criteria and, if not, dispose of the part.

In manufacturing settings, machine vision systems typically need the following items:

 Lighting. Lighting illuminates the object or scene to make its features visible.
 Lens. This captures the image and delivers it to the sensor in the camera as
light.
 Capture board, frame grabber or sensor. These devices work together
to process the image from the camera and convert it to a digital format as
pixels. Image sensors convert light into electric signals using
either complementary metal-oxide semiconductor technology or a charge-
coupled device.
 Processor. The processor runs software and related algorithms that
process the digital image and extract the required information.
 Communication. These systems enable the machine vision cameras and
processing system to communicate with other elements of the bigger
system, usually using a discrete input/output signal or a serial connection.

There are two types of cameras used in manufacturing machine vision:

1. Area scan. These cameras take pictures in a single frame using a rectangular sensor. The number of pixels in the sensor corresponds to the width and height of the image. Area scan cameras are used for scanning objects that are the same size in terms of width and height.
2. Line scan. These cameras build an image pixel by pixel. They're suited for
taking images of items in motion or of irregular sizes. The sensor passes in
a linear motion over an object when taking the picture. Line scan cameras
aren't as limited to specific resolutions the way area scan cameras are.
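To make the inspection idea concrete, here is a minimal Python sketch using OpenCV. The file name, the use of Otsu thresholding, and the tolerance band are all hypothetical choices; a real system would calibrate them to its camera and part:

import cv2

# Grab one inspection image (the file name stands in for a camera frame).
image = cv2.imread("part.png", cv2.IMREAD_GRAYSCALE)

# Separate the part from the background with Otsu's automatic threshold.
_, mask = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Measure the largest blob's area in pixels as the quality characteristic.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
area = max(cv2.contourArea(c) for c in contours) if contours else 0.0

# Hypothetical tolerance band for a conforming part.
MIN_AREA, MAX_AREA = 9_000, 11_000
print("PASS" if MIN_AREA <= area <= MAX_AREA else "REJECT")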
…………………… end………………

Q) Types of machine vision?


Machine vision systems can operate across various dimensions based on the
specific needs and requirements of a particular application.

Common types of machine vision systems include the following:


 2D vision systems. These are the most widely used systems that excel in
pattern recognition tasks.
 3D vision systems. Operating in multiple dimensions, 3D vision systems
provide enhanced accuracy for measurement and inspection purposes.
 Smart camera-based vision systems. These systems use integrated
cameras and software to perform a variety of inspection-related tasks.
 Compact vision systems. These systems are designed to be self-contained
and can seamlessly integrate into existing equipment and manufacturing
processes.
 PC-based vision systems. By using computer processing and image
analysis, these systems enable the execution of more complex visual
inspection tasks.
 Multispectral imaging. As an alternative to conventional 2D imaging, this
method involves capturing images at multiple wavelengths.
 Hyperspectral imaging. Similar to multispectral imaging, hyperspectral
imaging captures images at a significantly larger number of wavelengths,
facilitating detailed analysis of spectral data.
 Variable magnification lenses. Equipped with adjustable magnification
levels, these lenses provide greater flexibility in carrying out inspection
tasks.
…………………………… end……………….

Q) How are machine vision systems used?


Machine vision applications are used in a range of industries to perform
various tasks, including the following:

 Electronic component analysis. Machine vision is used in the construction of circuit boards for tasks such as solder paste inspection and component placement.
 Optical character recognition (OCR). OCR enables a computer to extract
printed or handwritten text from images.
 Handwriting and signature recognition. With these features, a computer
can detect patterns in images of handwriting and signatures.
 Object recognition. In the automotive industry, self-driving
cars use object recognition on images taken by cameras to identify
obstacles on the road. Machine vision systems also determine the position
of objects, such as the proper placement of a label on a pill bottle.
 Pattern recognition. Medical imaging analysis uses pattern recognition to
make diagnoses based on technologies such as magnetic resonance
imaging, blood scans and brain scans.
 Materials inspection. Machine vision capabilities in materials inspection
systems ensure quality control. Machine vision checks for flaws, defects
and contaminants in a range of materials and products. For example, these
systems can inspect pills and tablets for issues during manufacturing.
 Currency inspection. Machine vision is used to analyze currencies to
detect counterfeit notes.
 Item counting. This capability is used to tally items such as pills in a
packet or bottles in a case.
 Barcode tracking. This common application uses the capabilities of machine vision systems to read and track barcodes in real time (a decoding sketch follows this list).
 Robotics. The use of cameras for robot guiding is a rapidly growing area of
machine vision. Both 2D and 3D cameras are important in instructing
robots to handle individual or bulk components effectively. These
applications provide high return on investment by reducing the
requirement for physical labor.
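As a small illustration of the barcode tracking item above, the sketch below uses OpenCV's built-in QR code detector as a stand-in for a dedicated barcode reader; the file name is a placeholder for a camera frame:

import cv2

# One frame from the tracking camera (placeholder file name).
frame = cv2.imread("package.png")

# OpenCV ships a QR code detector; industrial barcode readers work similarly.
detector = cv2.QRCodeDetector()
payload, points, _ = detector.detectAndDecode(frame)

if payload:
    print("Decoded:", payload)   # e.g., an item or tracking identifier
else:
    print("No code found in this frame")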
……………………………… end…………………………..

Q) What is the difference between machine vision and computer vision?


In some cases, the terms machine vision and computer vision are used
synonymously. In other cases, distinctions are made.

Machine vision is often associated with industrial applications of a computer's ability to see. The term computer vision is often used to describe any technology in which a computer is tasked with digitizing an image, processing the data it contains and taking some kind of action.

Another distinction that's often made is in processing power -- that is, the
difference between a machine and a computer. A machine vision system
typically has less processing power and is used in Lean
manufacturing environments, performing practical tasks at a high speed to
acquire the data needed to complete a specified job. Quality control,
inspection of items and guiding of objects through an assembly line are
common applications of machine vision.

Computer vision systems collect as much data as possible about objects or scenes and aim to fully understand them. Computer vision is better for collecting general, transferable information that may be applied to a variety of
collecting general, transferable information that may be applied to a variety of
tasks. It also can be performed without a camera as the term can refer to a
computer's ability to process images from any source, including the internet.
Common applications of computer vision include self-driving cars,
reading barcodes and RFID tags, and inspecting for product defects.

Machine vision is one of the many applications of AI in manufacturing.

…………………………….. end………………………

Q) Natural language processing?

Natural language processing (NLP) refers to the branch of computer science—and more specifically, the branch of artificial intelligence or AI—concerned with giving computers the ability to understand text and spoken words in much the same way human beings can.

NLP combines computational linguistics—rule-based modeling of human language—with statistical, machine learning, and deep learning models.
Together, these technologies enable computers to process human language in
the form of text or voice data and to ‘understand’ its full meaning, complete
with the speaker or writer’s intent and sentiment.

NLP drives computer programs that translate text from one language to
another, respond to spoken commands, and summarize large volumes of text
rapidly—even in real time. There’s a good chance you’ve interacted with NLP
in the form of voice-operated GPS systems, digital assistants, speech-to-text
dictation software, customer service chatbots, and other consumer
conveniences. But NLP also plays a growing role in enterprise solutions that
help streamline business operations, increase employee productivity, and
simplify mission-critical business processes.

Several NLP tasks break down human text and voice data in ways that help
the computer make sense of what it's ingesting. Some of these tasks include
the following:

 Speech recognition, also called speech-to-text, is the task of reliably converting voice data into text data. Speech recognition is required for
any application that follows voice commands or answers spoken
questions. What makes speech recognition especially challenging is the
way people talk—quickly, slurring words together, with varying
emphasis and intonation, in different accents, and often using incorrect
grammar.
 Part of speech tagging, also called grammatical tagging, is the process
of determining the part of speech of a particular word or piece of text
based on its use and context. Part of speech identifies ‘make’ as a verb in
‘I can make a paper plane,’ and as a noun in ‘What make of car do you
own?’
 Word sense disambiguation is the selection of the meaning of a word with multiple meanings through a process of semantic analysis that determines the meaning that makes the most sense in the given context. For example, word sense disambiguation helps distinguish the meaning of the verb 'make' in ‘make the grade’ (achieve) vs. ‘make a bet’ (place).
 Named entity recognition, or NER, identifies words or phrases as useful entities. NER identifies ‘Kentucky’ as a location or ‘Fred’ as a man's name (see the sketch after this list).
 Co-reference resolution is the task of identifying if and when two
words refer to the same entity. The most common example is
determining the person or object to which a certain pronoun refers (e.g.,
‘she’ = ‘Mary’), but it can also involve identifying a metaphor or an
idiom in the text (e.g., an instance in which 'bear' isn't an animal but a
large hairy person).
 Sentiment analysis attempts to extract subjective qualities—attitudes,
emotions, sarcasm, confusion, suspicion—from text.
 Natural language generation is sometimes described as the opposite
of speech recognition or speech-to-text; it's the task of putting
structured information into human language.
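Two of these tasks, part of speech tagging and named entity recognition, can be sketched in a few lines with the spaCy library (this assumes spaCy and its small English model are installed; any comparable NLP toolkit would work):

import spacy

# Assumes: pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

doc = nlp("Fred moved to Kentucky, where he can make a paper plane.")

# Part of speech tagging: 'make' is tagged as a verb in this context.
print([(token.text, token.pos_) for token in doc])

# Named entity recognition: 'Fred' -> PERSON, 'Kentucky' -> GPE (location).
print([(ent.text, ent.label_) for ent in doc.ents])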

NLP use cases:

 Spam detection: You may not think of spam detection as an NLP solution, but the best spam detection technologies use NLP's text classification capabilities to scan emails for language that often indicates spam or phishing. These indicators can include overuse of financial terms, characteristic bad grammar, threatening language, inappropriate urgency, misspelled company names, and more. Spam detection is one of a handful of NLP problems that experts consider 'mostly solved' (although you may argue that this doesn’t match your email experience). A minimal classifier sketch for this task appears after this list.
 Virtual agents and chatbots: Virtual agents such as Apple's Siri and
Amazon's Alexa use speech recognition to recognize patterns in voice
commands and natural language generation to respond with
appropriate action or helpful comments. Chatbots perform the same
magic in response to typed text entries. The best of these also learn to
recognize contextual clues about human requests and use them to
provide even better responses or options over time. The next
enhancement for these applications is question answering, the ability to
respond to our questions—anticipated or not—with relevant and
helpful answers in their own words.
 Social media sentiment analysis: NLP has become an essential
business tool for uncovering hidden data insights from social media
channels. Sentiment analysis can analyze language used in social media
posts, responses, reviews, and more to extract attitudes and emotions in
response to products, promotions, and events–information companies
can use in product designs, advertising campaigns, and more.
 Text summarization: Text summarization uses NLP techniques to
digest huge volumes of digital text and create summaries and synopses
for indexes, research databases, or busy readers who don't have time to
read full text. The best text summarization applications use semantic
reasoning and natural language generation (NLG) to add useful context
and conclusions to summaries.
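The text classification idea behind spam detection can be sketched with scikit-learn. The four training emails and their labels below are invented for illustration; a real filter would train on a large labeled corpus:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny made-up training set; note the spam-like indicators described above.
emails = [
    "URGENT!!! Claim your cash prize now",
    "Meeting moved to 3pm, see agenda attached",
    "Verify your acount to avoid suspension",   # misspellings often signal spam
    "Lunch tomorrow? Let me know",
]
labels = ["spam", "ham", "spam", "ham"]

# TF-IDF features + Naive Bayes: a classic text classification baseline.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(emails, labels)

print(model.predict(["You have won a FREE prize, act now"]))   # -> ['spam']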
……………………………………… end…………………..

Q) Deep Reinforcement Learning?

Deep Reinforcement Learning is the combination of Reinforcement Learning and Deep Learning. This technology enables machines to solve a wide range
of complex decision-making tasks. Hence, it opens up many new applications
in industries such as healthcare, security and surveillance, robotics, smart
grids, self-driving cars, and many more.
In the past few years, Deep Learning techniques have become very popular.
Deep Reinforcement Learning is the combination of Reinforcement Learning
with Deep Learning techniques to solve challenging sequential decision-
making problems. The use of deep learning is most useful in problems with
high-dimensional state space. This means that with deep learning, Reinforcement Learning is able to solve more complicated tasks with less prior knowledge because of its ability to learn different levels of abstraction from data. To use reinforcement learning successfully in situations
approaching real-world complexity, however, agents are confronted with a
difficult task: they must derive efficient representations of the environment
from high-dimensional sensory inputs, and use these to generalize past
experience to new situations. This makes it possible for machines to mimic
some human problem-solving capabilities, even in high-dimensional space,
which only a few years ago was difficult to conceive.
Applications of Deep Reinforcement Learning

Some prominent projects used
deep Reinforcement Learning in games with results that are far beyond what
is humanly possible. Deep RL techniques have demonstrated their ability to
tackle a wide range of problems that were previously unsolved. Deep RL has
achieved human-level or superhuman performance for many two-player or
even multi-player games. Such achievements with popular games are
significant because they show the potential of deep Reinforcement Learning in
a variety of complex and diverse tasks that are based on high-dimensional
inputs. With games, we have good or even perfect simulators, and can easily
generate unlimited data.
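To make the core mechanism concrete, here is a minimal PyTorch sketch of the deep Q-learning update behind agents like the Atari players. The state size, action count, and hyperparameters are illustrative, and a real agent would add experience replay, a target network, and an exploration policy:

import torch
import torch.nn as nn

# Q-network: maps a state vector to one Q-value per action.
STATE_DIM, N_ACTIONS, GAMMA = 8, 4, 0.99   # illustrative values

q_net = nn.Sequential(
    nn.Linear(STATE_DIM, 64), nn.ReLU(),
    nn.Linear(64, N_ACTIONS),
)
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def dqn_update(state, action, reward, next_state, done):
    # One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a').
    q_sa = q_net(state)[action]
    with torch.no_grad():
        target = reward + GAMMA * q_net(next_state).max() * (1.0 - done)
    loss = (q_sa - target) ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Example call with a random transition (stand-in for real environment data).
s, s2 = torch.randn(STATE_DIM), torch.randn(STATE_DIM)
dqn_update(s, action=1, reward=1.0, next_state=s2, done=0.0)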

Atari 2600 games: Machines achieved superhuman-level performance in playing Atari games.

Go: Mastering the game of Go with deep neural networks.

Poker: AI is able to beat professional poker players in the game of heads-up no-limit Texas hold’em.

Quake III: An agent achieved human-level performance in a 3D multiplayer first-person video game, using only pixels and game points as input.

Dota 2: An AI agent learned to play Dota 2 by playing over 10,000 years of games against itself (OpenAI Five).
StarCraft II: An agent was able to learn to play StarCraft II with a 99% win rate, using only 1.08 hours on a single commercial machine.

Those achievements set the basis for the development of real-world deep
reinforcement learning applications:

Robot control: Robotics is a classical application area for reinforcement learning. Robust adversarial reinforcement learning is applied as an agent
operates in the presence of a destabilizing adversary that applies disturbance
forces to the system. The machine is trained to learn an optimal
destabilization policy. AI-powered robots have a wide range of applications,
e.g. in manufacturing, supply chain automation, healthcare, and many more.

Self-driving cars: Deep Reinforcement Learning is prominently used with autonomous driving. Autonomous driving scenarios involve interacting agents
and require negotiation and dynamic decision-making which suits
Reinforcement Learning.

Healthcare: In the medical field, Artificial Intelligence (AI) has enabled the
development of advanced intelligent systems able to learn about clinical
treatments, provide clinical decision support, and discover new medical
knowledge from the huge amount of data collected. Reinforcement Learning
enabled advances such as personalized medicine that is used to
systematically optimize patient health care, in particular, for chronic conditions
and cancers using individual patient information.

Other: In terms of applications, many areas are likely to be impacted by the possibilities brought by deep Reinforcement Learning, such as finance,
business management, marketing, resource management, education, smart
grids, transportation, science, engineering, or art. In fact, Deep RL systems
are already in production environments. For example, Facebook uses Deep
Reinforcement Learning for pushing notifications and for faster video loading
with smart prefetching.
…………………….. end……………

Q) Autoencoders?

Autoencoders are very useful in the field of unsupervised machine learning. They can be used to reduce the data's size and compress it.

Autoencoders differ from Principal Component Analysis (PCA): PCA finds the linear directions onto which the data can be projected with the least loss of variance, while autoencoders learn to reconstruct the original input from a compressed version of it and can capture nonlinear structure.

If necessary, an approximation of the original data can be recovered from the compressed data using the autoencoder.

Architecture

An autoencoder is a form of neural network that can learn to recreate images, text, and other types of input from compressed versions of themselves.

Typically, an autoencoder has three layers −

 Encoder
 Code
 Decoder

The encoder layer converts the input image into a latent space representation.
It produces a compressed image in a reduced dimension from the provided
image.

In this compressed form, the original image is distorted and some detail is lost.

The code layer represents the compressed input that is passed to the decoder layer.

The decoder layer then maps the compressed representation back to the original dimensions. Because the decoded image is reconstructed from the latent space representation alone, it is a lossy approximation of the original image.
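A minimal PyTorch sketch of this encoder-code-decoder structure (the layer sizes are illustrative and assume flattened 28x28 inputs such as MNIST):

import torch.nn as nn

# Encoder -> code (bottleneck) -> decoder; sizes are illustrative.
class Autoencoder(nn.Module):
    def __init__(self, code_size=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(784, 128), nn.ReLU(),
            nn.Linear(128, code_size),          # latent space representation
        )
        self.decoder = nn.Sequential(
            nn.Linear(code_size, 128), nn.ReLU(),
            nn.Linear(128, 784), nn.Sigmoid(),  # back to the original size
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))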

When developing an autoencoder, the following factors should be considered:
The size of the code or bottleneck is the first and most crucial hyperparameter for configuring the autoencoder. It determines how much the data is compressed, and it can also act as a regularization term.

Second, keep in mind that the number of layers is important for fine-tuning
autoencoders. A shallower depth is easier to process, whereas a deeper depth
complicates the model.

Thirdly, we need to think about how many nodes each layer should have. The number of nodes per layer typically decreases through the autoencoder as the input to each successive layer becomes smaller.
…………………………… end……………..

Q) How to train autoencoders?

You need to set 4 hyperparameters before training an autoencoder:


1. Code size: The code size or the size of the bottleneck is the most important
hyperparameter used to tune the autoencoder. The bottleneck size decides
how much the data has to be compressed. This can also act as a
regularisation term.
2. Number of layers: Like all neural networks, an important hyperparameter
to tune autoencoders is the depth of the encoder and the decoder. While a
higher depth increases model complexity, a lower depth is faster to process.
3. Number of nodes per layer: The number of nodes per layer defines the
weights we use per layer. Typically, the number of nodes decreases with
each subsequent layer in the autoencoder as the input to each of these
layers becomes smaller across the layers.
4. Reconstruction Loss: The loss function we use to train the autoencoder is
highly dependent on the type of input and output we want the autoencoder
to adapt to. If we are working with image data, the most popular loss
functions for reconstruction are MSE Loss and L1 Loss. In case the inputs
and outputs are within the range [0,1], as in MNIST, we can also make use
of Binary Cross Entropy as the reconstruction loss.
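Putting these hyperparameters together, a minimal training-loop sketch in PyTorch might look like this; the layer sizes and the random batch standing in for real data are illustrative, with MSE as the reconstruction loss:

import torch
import torch.nn as nn

# Tiny stand-in for the encoder-decoder pair; sizes are illustrative.
model = nn.Sequential(
    nn.Linear(784, 32),   # encoder down to a 32-unit bottleneck (code size)
    nn.ReLU(),
    nn.Linear(32, 784),   # decoder back up to the input size
    nn.Sigmoid(),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()    # reconstruction loss; BCE also fits [0,1] inputs

batch = torch.rand(64, 784)                  # stand-in for a batch of inputs
for epoch in range(5):
    reconstruction = model(batch)
    loss = loss_fn(reconstruction, batch)    # compare output to its own input
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: reconstruction loss {loss.item():.4f}")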
………………………. End………………….

Q) Types of Autoencoders?
Undercomplete Autoencoders
An undercomplete autoencoder is a completely unsupervised neural network that can be used to compress the input data. It takes an input image and tries to predict the same image as output, reconstructing the image from its compressed bottleneck region.

These autoencoders are typically employed to produce a latent space or bottleneck, which acts as a compressed version of the input data and can be rapidly and easily decompressed when required with the help of the network.

Sparse Autoencoders
To control sparse autoencoders, one can alter the number of nodes at every
hidden layer.

Since it is challenging to construct a neural network with a flexible number of nodes in its hidden layers, sparse autoencoders work by suppressing the activity of certain neurons in those layers.

This means that a penalty proportional to the number of active neurons is imposed on the loss function.

The sparsity penalty discourages additional neurons from activating.

Regularizers come in two varieties:

 The L1 Loss technique can be used as a general regularizer that penalizes the magnitude of the activations.
 The KL-divergence method considers the activations over a collection of samples at once, in contrast to the L1 Loss approach, which merely sums the activations over all samples. Upper and lower bounds are set on the average activation of each neuron over this collection.

Contractive Autoencoders
A contractive autoencoder funnels the input through a bottleneck before rebuilding it in the decoder. The bottleneck is used to learn a representation of the image while it is being processed.

The contractive autoencoder additionally has a regularization term to prevent the network from simply learning the identity function and copying input to output.

To train a model that satisfies this requirement, we must ensure that the derivatives of the hidden layer activations are small with respect to the input.

Denoising Autoencoders
Have there ever been times when you wanted to remove background noise from
an image but didn't know where to begin? If so, denoising autoencoders are the
answer for you!

Denoising autoencoders perform similarly to traditional autoencoders in that they accept an input and produce an output. But they differ in that they don't treat the input image as the absolute truth. Instead, they use a noisier version of it.

This is because removing noise is difficult when working directly with images.

Instead, we can use a denoising autoencoder to map the noisy image into a lower-dimensional space, where filtering out the noise is much easier to manage.

The standard loss function employed with these networks is L2 or L1 loss.
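A minimal PyTorch sketch of the denoising setup; the model, noise level, and random stand-in data are illustrative. The key point is that the input is corrupted but the reconstruction target is the clean original:

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 32), nn.ReLU(),
                      nn.Linear(32, 784), nn.Sigmoid())
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

clean = torch.rand(64, 784)                                  # clean inputs
noisy = (clean + 0.2 * torch.randn_like(clean)).clamp(0, 1)  # corrupted copy

# Unlike a plain autoencoder, the input is the NOISY version but the
# reconstruction target is the CLEAN original.
loss = nn.MSELoss()(model(noisy), clean)
optimizer.zero_grad(); loss.backward(); optimizer.step()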

Variational Autoencoders
Variational autoencoders (VAEs) are models created to address a specific
problem with conventional autoencoders. A standard autoencoder learns only to represent the input in the so-called latent space or bottleneck during training. The latent space after training is not necessarily continuous, which makes interpolation challenging.

Variational autoencoders address this issue by expressing their latent features as probability distributions, resulting in a continuous latent space that is easy to sample from and interpolate.
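The sampling step that gives a VAE its continuous latent space (the reparameterization trick) can be sketched as follows; the tensor shapes stand in for real encoder outputs:

import torch

# The encoder of a VAE outputs a mean and log-variance per latent dimension.
mu = torch.zeros(1, 32)          # predicted means for 32 latent dimensions
log_var = torch.zeros(1, 32)     # predicted log-variances

eps = torch.randn_like(mu)                # noise from a standard normal
z = mu + torch.exp(0.5 * log_var) * eps   # a sample from the latent space

# The KL term pushes these distributions toward N(0, 1), which keeps the
# latent space continuous and easy to sample from.
kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())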

…………………… end……………

Q) Deep Belief Networks?

Deep belief networks (DBNs) are a type of deep learning algorithm that
addresses the problems associated with classic neural networks. They do this
by using layers of stochastic latent variables, which make up the network.
These latent variables, also known as feature detectors or hidden units, are binary variables, and they are known as stochastic because they can take on any value within a specific range with some probability.

The top two layers in DBNs have undirected connections between them, while the lower layers receive directed, top-down links from the layers above. DBNs differ from traditional neural
networks because they can be generative and discriminative models. For
example, you can only train a conventional neural network to classify images.

DBNs also differ from other deep learning algorithms like restricted Boltzmann machines (RBMs) or autoencoders because they don't work directly with raw inputs the way RBMs do. Instead, they rely on an input layer with one neuron per input vector and then proceed through many layers until reaching a final layer where outputs are generated using probabilities derived from the previous layers' activations.

The Architecture of DBN:


In the DBN, we have a hierarchy of layers. The top two layers form the associative memory, and the bottom layer contains the visible units. The arrows pointing towards the layer closest to the data indicate the relationships between all the lower layers.

Directed acyclic connections in the lower layers translate associative memory to observable variables.

The lowest layer of visible units receives input data as binary or real-valued data. As in an RBM, there are no intralayer connections in a DBN. The hidden units
represent features that encapsulate the data’s correlations.

A matrix of weights W connects two layers. Every unit in each layer is linked to every unit in the layer above it.

How Does a DBN Work?

Getting from pixels to feature layers is not a straightforward process.

First, we train a feature layer that can directly receive the pixel input signals. Then we learn features of the previously obtained features by treating the values of this layer as pixels. The lower bound on the log-likelihood of the training data set improves every time a fresh layer of features is added to the network.

The deep belief network's operational pipeline is as follows:

 First, we run numerous steps of Gibbs sampling in the top two hidden
layers. The top two hidden layers define the RBM. Thus, this stage
effectively extracts a sample from it.
 Then generate a sample from the visible units using a single pass of
ancestral sampling through the rest of the model.
 Finally, we’ll use a single bottom-up pass to infer the values of the latent variables in each layer. Greedy pretraining begins with an observed data vector in the bottom layer and then fine-tunes the generative weights in the reverse direction.
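A schematic NumPy sketch of this greedy, layer-by-layer idea follows. The train_rbm function here is a placeholder that returns random parameters; a real implementation would train each RBM with contrastive divergence:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden):
    # Placeholder: returns random weights/biases instead of real CD training.
    rng = np.random.default_rng(0)
    return rng.normal(0, 0.1, (data.shape[1], n_hidden)), np.zeros(n_hidden)

data = np.random.rand(100, 784)   # stand-in for a batch of input vectors
layer_sizes = [256, 64]           # illustrative hidden layer sizes

weights = []
activations = data
for n_hidden in layer_sizes:
    W, b = train_rbm(activations, n_hidden)     # train one RBM greedily
    weights.append((W, b))
    activations = sigmoid(activations @ W + b)  # feed features upward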

Creating a Deep Belief Network:

DBNs are made up of multiple Restricted Boltzmann Machines (RBMs). RBMs are just like regular Boltzmann Machines, except they're not fully connected; that's why we call them "restricted."

We use the fully unsupervised form of DBNs to initialize Deep Neural Networks, whereas we use the classification form of DBNs as classifiers on their own.

It's pretty simple: each record type contains the RBMs that make up the network’s layers, as well as a vector indicating the layer sizes and, in the case of classification DBNs, the number of classes in the representative data set.

Applications:

Deep Belief Networks are a more computationally efficient version of feedforward neural networks and can be used for image recognition, video sequences, motion capture data, speech recognition, and more.

……………………… end………………….

Q) What are Boltzmann Machines?

It is a network of neurons in which all the neurons are connected to each other. In this machine, there are two layers: the visible layer (or input layer) and the hidden layer. The visible layer is denoted as v and the hidden layer is denoted as h. In a Boltzmann machine, there is no output layer. Boltzmann machines are random and generative neural networks capable of learning internal representations, and they are able to represent and (given enough time) solve tough combinatoric problems.
They are named after the Boltzmann distribution (also known as the Gibbs distribution), which is an integral part of statistical mechanics and explains the impact of parameters like entropy and temperature on quantum states in thermodynamics. Because of this, they are also known as Energy-Based Models (EBMs). The Boltzmann machine was invented in 1985 by Geoffrey Hinton, then a professor at Carnegie Mellon University, and Terry Sejnowski, then a professor at Johns Hopkins University.

……………………….. end……………

Q) What are Restricted Boltzmann Machines (RBM)?

The term "restricted" refers to the fact that we are not allowed to connect layers of the same type to each other. In other words, two neurons of the input layer or of the hidden layer can't connect to each other, although the visible layer and the hidden layer can be connected to each other.
Since there is no output layer in this machine, the question arises: how are we going to identify and adjust the weights, and how do we measure whether our prediction is accurate or not? All these questions have one answer: the Restricted Boltzmann Machine.
The RBM algorithm was proposed by Geoffrey Hinton (2007); it learns a probability distribution over its sample training data inputs. It has seen wide applications in different areas of supervised/unsupervised machine learning such as feature learning, dimensionality reduction, classification, collaborative filtering, and topic modeling.
Consider the movie rating example discussed in the recommender system section.
Movies like Avengers, Avatar, and Interstellar have strong associations with a latent fantasy and science fiction factor. Based on user ratings, the RBM will discover latent factors that can explain the activation of movie choices. In short, an RBM describes variability among correlated variables of the input dataset in terms of a potentially lower number of unobserved variables.
The energy function is given by

E(v, h) = − Σ_i a_i v_i − Σ_j b_j h_j − Σ_{i,j} v_i w_{ij} h_j

where v_i and h_j are the states of visible unit i and hidden unit j, a_i and b_j are their biases, and w_{ij} is the weight between them.
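A tiny worked example of this energy function in NumPy, with made-up values for the unit states, biases, and weights:

import numpy as np

v = np.array([1.0, 0.0, 1.0])      # visible unit states
h = np.array([1.0, 0.0])           # hidden unit states
a = np.array([0.1, -0.2, 0.05])    # visible biases
b = np.array([0.3, -0.1])          # hidden biases
W = np.array([[0.5, -0.3],
              [0.2,  0.4],
              [-0.1, 0.6]])        # weights w_ij

E = -a @ v - b @ h - v @ W @ h
print(E)   # lower energy corresponds to a more probable configuration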

Applications of Restricted Boltzmann Machine

Restricted Boltzmann Machines (RBMs) have found numerous applications in various fields, some of which are:
 Collaborative filtering: RBMs are widely used in collaborative filtering for
recommender systems. They learn to predict user preferences based on
their past behavior and recommend items that are likely to be of interest to
the user.
 Image and video processing: RBMs can be used for image and video
processing tasks such as object recognition, image denoising, and image
reconstruction. They can also be used for tasks such as video segmentation
and tracking.
 Natural language processing: RBMs can be used for natural language
processing tasks such as language modeling, text classification, and
sentiment analysis. They can also be used for tasks such as speech
recognition and speech synthesis.
 Bioinformatics: RBMs have found applications in bioinformatics for tasks
such as protein structure prediction, gene expression analysis, and drug
discovery.
 Financial modeling: RBMs can be used for financial modeling tasks such
as predicting stock prices, risk analysis, and portfolio optimization.
 Anomaly detection: RBMs can be used for anomaly detection tasks such
as fraud detection in financial transactions, network intrusion detection, and
medical diagnosis.
 More broadly, RBMs are used in filtering, feature learning, classification, risk detection, and business and economic analysis.
……………………… end……………

Q) How do Restricted Boltzmann Machines work?

In an RBM there are two phases through which the entire RBM works:
1st Phase: In this phase, we take the input layer and, using the weights and biases, activate the hidden layer. This process is called the Feed Forward Pass. In the Feed Forward Pass, we identify the positive and negative associations.
Feed Forward Equation:
 Positive Association — When the association between the visible unit and
the hidden unit is positive.
 Negative Association — When the association between the visible unit and
the hidden unit is negative.
2nd Phase: Since we don’t have an output layer, instead of calculating an output we reconstruct the input layer through the activated hidden state. This process is called the Feed Backward Pass. We are just backtracking to the input layer through the activated hidden neurons. After performing this step we have a reconstructed input based on the activated hidden state, so we can calculate the error and adjust the weights as follows:
Feed Backward Equation:
 Error = Reconstructed Input Layer − Actual Input Layer
 Adjust Weight = Input × Error × Learning Rate (e.g., 0.1)
After performing these steps, we obtain the pattern that is responsible for activating the hidden neurons. To understand how this works, consider an example: assume that visible unit V1 activates hidden units h1 and h2, and visible unit V2 activates hidden units h2 and h3. Now a new visible unit, V5, comes into the machine, and it also activates h1 and h2. We can then backtrack through the hidden units easily and identify that the characteristics of the new V5 unit match those of V1, because V1 activated the same hidden units earlier. A minimal sketch of these two phases follows.
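A minimal NumPy sketch of the two phases, using illustrative sizes and the simplified update rule from the text (real RBM training uses contrastive divergence with sampled binary states):

import numpy as np

rng = np.random.default_rng(0)
N_VISIBLE, N_HIDDEN, LEARNING_RATE = 6, 3, 0.1   # illustrative sizes

W = rng.normal(0, 0.1, size=(N_VISIBLE, N_HIDDEN))   # weight matrix
a = np.zeros(N_VISIBLE)                               # visible biases
b = np.zeros(N_HIDDEN)                                # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

v = rng.integers(0, 2, N_VISIBLE).astype(float)   # one binary input vector

# 1st phase (Feed Forward Pass): activate the hidden layer.
h = sigmoid(v @ W + b)

# 2nd phase (Feed Backward Pass): reconstruct the input from the hidden state.
v_reconstructed = sigmoid(h @ W.T + a)

# Error and weight adjustment, following the simplified rule above.
error = v_reconstructed - v
W -= LEARNING_RATE * np.outer(error, h)
print("reconstruction error:", np.round(error, 3))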

……………………….. end………………….

Q) Types of RBM?

There are mainly two types of Restricted Boltzmann Machine (RBM) based on
the types of variables they use:
1. Binary RBM: In a binary RBM, the input and hidden units are binary
variables. Binary RBMs are often used in modeling binary data such as
images or text.
2. Gaussian RBM: In a Gaussian RBM, the input and hidden units are
continuous variables that follow a Gaussian distribution. Gaussian RBMs are
often used in modeling continuous data such as audio signals or sensor
data.
Apart from these two types, there are also variations of RBMs such as:
1. Deep Belief Network (DBN): A DBN is a type of generative model that
consists of multiple layers of RBMs. DBNs are often used in modeling high-
dimensional data such as images or videos.
2. Convolutional RBM (CRBM): A CRBM is a type of RBM that is designed
specifically for processing images or other grid-like structures. In a CRBM,
the connections between the input and hidden units are local and shared,
which makes it possible to capture spatial relationships between the input
units.
3. Temporal RBM (TRBM): A TRBM is a type of RBM that is designed for
processing temporal data such as time series or video frames. In a TRBM,
the hidden units are connected across time steps, which allows the network
to model temporal dependencies in the data.
………………………… end……………….

Q) Generative Adversarial Networks?

Generative Adversarial Networks (GANs) were introduced in 2014 by Ian J. Goodfellow and co-authors. GANs perform unsupervised learning tasks in machine learning. A GAN consists of two models that automatically discover and learn the patterns in input data.

The two models are known as Generator and Discriminator.

They compete with each other to scrutinize, capture, and replicate the variations
within a dataset. GANs can be used to generate new examples that plausibly could
have been drawn from the original dataset.

Shown below is an example of a GAN. There is a database that has real 100 rupee
notes. The generator neural network generates fake 100 rupee notes. The
discriminator network will help identify the real and fake notes.
How Do GANs Work?

A GAN consists of two neural networks: a Generator G and a Discriminator D. The two play an adversarial game. The generator's aim is to fool the discriminator by producing data that are similar to those in the training set. The discriminator tries not to be fooled by identifying fake data among the real data. Both of them work simultaneously to learn and train on complex data like audio, video, or image files.

The Generator network takes a sample and generates a fake sample of data. The
Generator is trained to increase the Discriminator network's probability of making
mistakes.
Below is an example of a GAN trying to identify if the 100 rupee notes are real or
fake. So, first, a noise vector or the input vector is fed to the Generator network. The
generator creates fake 100 rupee notes. The real images of 100 rupee notes stored
in a database are passed to the discriminator along with the fake notes. The
Discriminator then classifies the notes as real or fake.

We train the model, calculate the loss function at the end of the discriminator
network, and backpropagate the loss into both discriminator and generator models.

Mathematical Equation

The mathematical equation for training a GAN can be represented as:

min_G max_D V(D, G) = E_{x~pdata(x)}[log D(x)] + E_{z~p(z)}[log(1 − D(G(z)))]

Here,

G = Generator

D = Discriminator

pdata(x) = distribution of real data

p(z) = distribution of the noise input to the generator

x = sample from pdata(x)

z = sample from p(z)

D(x) = Discriminator network

G(z) = Generator network
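A compact PyTorch sketch of this adversarial training step; the network sizes, the random batch standing in for real data, and the hyperparameters are illustrative:

import torch
import torch.nn as nn

NOISE_DIM, DATA_DIM, BATCH = 100, 784, 16   # illustrative sizes

G = nn.Sequential(nn.Linear(NOISE_DIM, 128), nn.ReLU(),
                  nn.Linear(128, DATA_DIM), nn.Tanh())
D = nn.Sequential(nn.Linear(DATA_DIM, 128), nn.LeakyReLU(0.2),
                  nn.Linear(128, 1), nn.Sigmoid())

bce = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

real = torch.randn(BATCH, DATA_DIM)   # stand-in for a batch of real samples
ones, zeros = torch.ones(BATCH, 1), torch.zeros(BATCH, 1)

# Discriminator step: push D(real) toward 1 and D(fake) toward 0.
fake = G(torch.randn(BATCH, NOISE_DIM))
d_loss = bce(D(real), ones) + bce(D(fake.detach()), zeros)
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: try to make the discriminator label the fakes as real.
g_loss = bce(D(fake), ones)
opt_g.zero_grad(); g_loss.backward(); opt_g.step()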

Applications of GANs

 With the help of DCGANs, you can train on images of cartoon characters to generate faces of anime characters as well as Pokemon characters.

 GANs can be trained on images of humans to generate realistic faces. Faces generated this way can look authentic but do not exist in reality.
 GANs can build realistic images from textual descriptions of objects like birds,
humans, and other animals. We input a sentence and generate multiple images
fitting the description.

………………………………….. end……………
