AI Lab Practicals


Experiment-1

Objective: For a given network of cities, find an optimal path to reach from a given source city
to any other destination city using an admissible heuristic.

Theory:

Heuristics: The heuristic function h(n) tells A* an estimate of the minimum cost from any vertex
n to the goal. It’s important to choose a good heuristic function.

The heuristic can be used to control A*’s behavior.

● At one extreme, if h(n) is 0, then only g(n) plays a role, and A* turns into Dijkstra’s
Algorithm, which is guaranteed to find a shortest path.
● If h(n) is always lower than (or equal to) the cost of moving from n to the goal, then A* is
guaranteed to find a shortest path. The lower h(n) is, the more nodes A* expands, making
it slower.
● If h(n) is exactly equal to the cost of moving from n to the goal, then A* will only follow
the best path and never expand anything else, making it very fast. Although you can’t
make this happen in all cases, you can make it exact in some special cases. It’s nice to
know that given perfect information, A* will behave perfectly.
● If h(n) is sometimes greater than the cost of moving from n to the goal, then A* is not
guaranteed to find a shortest path, but it can run faster.
● At the other extreme, if h(n) is very high relative to g(n), then only h(n) plays a role, and
A* turns into Greedy Best-First-Search.

So we have an interesting situation in that we can decide what we want to get out of A*. With
100% accurate estimates, we’ll get shortest paths really quickly. If we’re too low, then we’ll
continue to get shortest paths, but it’ll slow down. If we’re too high, then we give up shortest
paths, but A* will run faster.
Procedure:

1. Put the start node s on a list called OPEN of unexpanded nodes.
2. If OPEN is empty, exit with failure; no solution exists.
3. Remove from OPEN the node n at which f is minimum (break ties arbitrarily), and place it on a
list called CLOSED to be used for expanded nodes.
4. If n is a goal node, exit successfully with the solution obtained by tracing the path along the
pointers from the goal back to s.
5. Otherwise expand node n, generating all its successors with pointers back to n.
6. For every successor n' of n:
   a. Calculate f(n').
   b. If n' was neither on OPEN nor on CLOSED, add it to OPEN. Attach a pointer from n' back to n. Assign the newly computed f(n') to node n'.
   c. If n' already resided on OPEN or CLOSED, compare the newly computed f(n') with the value previously assigned to n'. If the old value is lower, discard the newly generated node. If the new value is lower, substitute it for the old (n' now points back to n instead of to its previous predecessor). If the matching node n' resides on CLOSED, move it back to OPEN.
7. Go to step 2.
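A minimal Python sketch of this procedure follows; the toy city network, edge costs and heuristic values are illustrative assumptions, with the heuristic kept admissible (it never overestimates the remaining cost).

import heapq
from itertools import count

def a_star(graph, heuristic, start, goal):
    # graph: {city: [(neighbour, edge_cost), ...]}; heuristic: {city: h value}
    tie = count()                                    # tie-breaker for equal f values
    open_list = [(heuristic[start], next(tie), 0, start, None)]  # OPEN: (f, -, g, node, parent)
    closed = {}                                      # CLOSED: node -> best g found
    parents = {}                                     # pointers for tracing the path back
    while open_list:                                 # step 2: fail when OPEN is empty
        f, _, g, n, parent = heapq.heappop(open_list)    # step 3: node with minimum f
        if n in closed and closed[n] <= g:
            continue                                 # stale OPEN entry; skip it
        closed[n] = g
        parents[n] = parent
        if n == goal:                                # step 4: trace pointers back to s
            path = []
            while n is not None:
                path.append(n)
                n = parents[n]
            return path[::-1], g
        for succ, cost in graph[n]:                  # steps 5-6: expand n's successors
            g2 = g + cost
            if succ not in closed or g2 < closed[succ]:
                heapq.heappush(open_list, (g2 + heuristic[succ], next(tie), g2, succ, n))
    return None                                      # step 2: no solution exists

# Toy network of cities (names, costs and h values are invented for illustration):
graph = {'A': [('B', 4), ('C', 3)], 'B': [('D', 5)], 'C': [('D', 7)], 'D': []}
h = {'A': 6, 'B': 4, 'C': 5, 'D': 0}
print(a_star(graph, h, 'A', 'D'))                    # (['A', 'B', 'D'], 9)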
Conclusion: When h is consistent, the f values of nodes expanded by A* are non-decreasing,
and by the time A* selects n for expansion it has already found the shortest path to n. When h is
consistent, every node is expanded at most once. The heuristics we normally encounter are consistent:
– the number of misplaced tiles
– Manhattan distance
– straight-line distance

Experiment-2

Objective: Solve the weather problem to predict the possibility of rain under
known parameters (e.g., temperature, humidity, wind flow, sunny or cloudy conditions) using
Bayesian learning.

Theory: The basic idea of Bayesian networks (BNs) is to reproduce the most important
dependencies and independencies among a set of variables in a graphical form (a directed
acyclic graph) that is easy to understand and interpret. Let us consider the subset of climatic
stations shown in the graph in the Figure, where the variables (rainfall) are represented pictorially
by a set of nodes, one node for each variable (for clarity of exposition, the set of nodes is
denoted {y1, ..., yn}). These nodes are connected by arrows, which represent a cause-and-effect
relationship. That is, if there is an arrow from node yi to node yj, we say that yi is the
cause of yj, or equivalently, yj is the effect of yi. Another popular terminology is to say that
yi is a parent of yj or yj is a child of yi. For example, in the Figure, the nodes Amieva and
Proaza are children of Gijon and Rioseco (the set of parents of a node yi is denoted by πi).
Directed graphs provide a simple definition of independence (d-separation) based on the
existence or not of certain paths between the variables.

The dependency/independency structure displayed by an acyclic directed graph can also be
expressed in terms of the joint probability distribution (JPD), factorized as a product of several
conditional distributions as follows:

$$\Pr(y_1, y_2, \ldots, y_n) = \prod_{i=1}^{n} P(y_i \mid \pi_i)$$

Therefore, the independencies from the graph are easily translated to the probabilistic model in
a sound form. For instance, the JPD of a BN defined by the graph given in the Figure requires the
specification of 100 conditional probability tables, one for each variable conditioned on its
parents' set. Hereafter we shall consider rainfall discretized into three different states (0 = "no
rain", 1 = "weak rain", 2 = "heavy rain"), associated with the thresholds 0, 2, and 10 mm,
respectively.

Procedure:

1) Learning Bayesian Networks from Data-


In addition to the graph structure, a BN requires that we specify the conditional probability of
each node given its parents. However, in many practical problems we know neither the
complete topology of the graph nor some of the required probabilities. For this reason, several
methods have been introduced for learning the graphical structure (structure learning)
and estimating the probabilities (parametric learning) from data. A learning algorithm consists of two
parts:
1. A quality measure, which is used for computing the quality of the candidate BNs. This is a
global measure, since it measures both the quality of the graphical structure and the quality of
the estimated parameters.
2. A search algorithm, which is used to efficiently search the space of possible BNs to find the
one with the highest quality. Note that the number of possible networks is enormous even for a
small number of variables, and therefore the search space is huge.
Among the different quality measures proposed in the literature, the basic idea of Bayesian
quality measures is to assign to every BN a quality value that is a function of the posterior
probability distribution of the available data $D = \{y_1^t, \ldots, y_{100}^t\}$ (with the index t running daily from
1979 to 1993), given the BN $(M, \theta)$ with network structure M and the corresponding estimated
probabilities $\theta$. By Bayes' rule, the posterior probability distribution is

$$p(M, \theta \mid D) \propto p(M)\, p(\theta \mid M)\, p(D \mid M, \theta).$$

Geiger and Heckerman consider multinomial networks and assume certain hypotheses about the
prior distributions of the parameters, leading to a quality measure of the standard Bayesian Dirichlet form

$$Q(M, D) = \log p(M) + \sum_{i=1}^{n} \sum_{k=1}^{s_i} \left[ \log \frac{\Gamma(\eta_{ik})}{\Gamma(\eta_{ik} + N_{ik})} + \sum_{j=1}^{r_i} \log \frac{\Gamma(\eta_{ijk} + N_{ijk})}{\Gamma(\eta_{ijk})} \right], \qquad \eta_{ik} = \sum_{j} \eta_{ijk},$$

where n is the number of variables, $r_i$ is the cardinality of the i-th variable, $s_i$ the number of
realizations of the parent set $\pi_i$, $\eta_{ijk}$ are the "a priori" Dirichlet hyper-parameters for the
conditional distribution of node i, $N_{ijk}$ is the number of realizations in the database consistent
with $y_i = j$ and $\pi_i = k$, $N_{ik}$ is the number of realizations in the database consistent with $\pi_i = k$, and
$\Gamma$ is the gamma function.

2) Inference- Once a model describing the relationships among the set of variables has been
selected, it can then be used to answer queries when evidence becomes available.
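As a toy illustration of such a query, the sketch below performs inference by enumeration on a hypothetical two-variable rain network, using the JPD factorization given earlier; the variable names and CPT numbers are invented for the example.

# Hypothetical CPTs for a two-node network Cloudy -> Rain (both binary here).
p_cloudy = {True: 0.4, False: 0.6}
p_rain_given_cloudy = {True: {True: 0.7, False: 0.3},
                       False: {True: 0.1, False: 0.9}}

def joint(cloudy, rain):
    # JPD factorized over the graph: P(c, r) = P(c) * P(r | c)
    return p_cloudy[cloudy] * p_rain_given_cloudy[cloudy][rain]

def p_cloudy_given_rain(rain=True):
    # Inference by enumeration: fix the evidence, sum out the rest, normalize
    num = joint(True, rain)
    den = sum(joint(c, rain) for c in (True, False))
    return num / den

print(round(p_cloudy_given_rain(True), 3))   # 0.824: cloudy is the likely cause of rain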

3) Validation of the Bayesian Network Forecast Model- To check the quality of a BN in a simple
case, we shall apply this methodology to a nowcasting problem. In this case we are given a
forecast in a given subset of stations and we need to infer a prediction for the remaining stations
in the network. To this aim, consider that we are given predictions in the five stations of the
primary network. These predictions shall be plugged into the network as evidence, obtaining the
probabilities for the remaining stations in the secondary network.

4) Connecting With Numerical Atmospheric Models- Since we are interested in rainfall
forecasts, we shall use the gridded forecasts of total precipitation given by the operative
ECMWF model (these values are obtained by adding both the convective and the large-scale
precipitation outputs). The forecasts are obtained 24 hours ahead; therefore, they give a
numeric estimation of the future precipitation pattern (one day ahead) on a coarse-grained
resolution grid.

Output:

Bayesian network of precipitation grid points and local precipitation at the network of local
stations.

Conclusion: We have used Bayesian network learning and shown its applicability for local
weather forecasting and downscaling. The preliminary results presented show how such models can
be built and how they can be used for performing inference.

Experiment-3

Objective: Solve the problem of recognizing humans from their faces using machine learning
techniques.

Theory: Let us introduce a new benchmark data set of face images with variable makeup,
hairstyles and occlusions, named the BookClub artistic makeup data set, and then examine the
performance of ANNs under different conditions. Makeup and other occlusions can be used
not only to disguise a person's identity from ANN algorithms, but also to spoof a wrong
identification.
ANN Algorithm: Artificial Neural Networks (ANNs) are capable of learning patterns of interest
from data in the presence of variations. An artificial neural network is an attempt, in the field of
artificial intelligence, to mimic the network of neurons that makes up the human brain, so that
computers have an option to understand things and make decisions in a human-like
manner. An artificial neural network is designed by programming computers to behave simply
like interconnected brain cells.

Artificial Neural Network primarily consists of three layers:

● Input Layer
● Hidden Layer
● Output Layer

Procedure:

1. The images used in this experiment are kept coloured, downsized and compressed into JPEG
format with dimensions of 48x48 pixels.
2. The downsizing is done due to computational restrictions, to keep processing times
reasonable. However, observations made on the small-size images are extendable to
larger sizes.
3. For the computational experiments, the 'Keras' library with a TensorFlow back-end was used.
4. The ANN consists of four sequential groups of layers: Gaussian noise,
convolution with ReLU activation functions, normalization, pooling and dropout layers.
5. It is topped with fully connected layers, the softmax activation function on the last
layer and a cross-entropy loss function. The "Adam" learning algorithm with a 0.001 learning-rate
coefficient, mini-batch size 32 and 100 epochs is used.
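A hedged Keras sketch of the described architecture follows; the filter counts, dropout rates and number of subjects (num_classes), as well as the training arrays x_train and y_train, are assumptions for illustration, while the optimizer, loss, batch size and epoch count come from the procedure above.

from tensorflow.keras import layers, models, optimizers

num_classes = 21   # hypothetical number of subjects; set to the real count

model = models.Sequential()
# First of four sequential groups: noise -> conv+ReLU -> normalization -> pooling -> dropout
model.add(layers.GaussianNoise(0.1, input_shape=(48, 48, 3)))   # 48x48 colour JPEGs
model.add(layers.Conv2D(32, (3, 3), padding='same', activation='relu'))
model.add(layers.BatchNormalization())
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Dropout(0.25))
for filters in (64, 128, 256):   # three more groups of the same pattern
    model.add(layers.GaussianNoise(0.1))
    model.add(layers.Conv2D(filters, (3, 3), padding='same', activation='relu'))
    model.add(layers.BatchNormalization())
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Dropout(0.25))
# Fully connected top with softmax on the last layer
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(num_classes, activation='softmax'))

model.compile(optimizer=optimizers.Adam(learning_rate=0.001),   # "Adam", 0.001
              loss='categorical_crossentropy',                  # cross-entropy loss
              metrics=['accuracy'])
# model.fit(x_train, y_train, batch_size=32, epochs=100)        # parameters from step 5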

Output:

Conclusion: Despite the small size the images were scaled to and the use of a not very deep ANN, the mean
accuracy of the face recognition of the model trained on the samples from all photo-sessions of
all subjects is quite high at 92%, and higher still (up to 99.9%).

Experiment-4

Objective: Classify the objects using deep learning techniques.


Theory: Image classification involves assigning a class label to an image, whereas object
localization involves drawing a bounding box around one or more objects in an image. Object
detection is more challenging: it combines these two tasks, drawing a bounding box around
each object of interest in the image and assigning each a class label. Together, all of these
problems are referred to as object recognition.

● Image Classification: Predict the type or class of an object in an image.


○ Input: An image with a single object, such as a photograph.
○ Output: A class label (e.g. one or more integers that are mapped to class labels).
● Object Localization: Locate the presence of objects in an image and indicate their
location with a bounding box.
○ Input: An image with one or more objects, such as a photograph.
○ Output: One or more bounding boxes (e.g. defined by a point, width, and height).
● Object Detection: Locate the presence of objects with a bounding box and types or
classes of the located objects in an image.
○ Input: An image with one or more objects, such as a photograph.
○ Output: One or more bounding boxes (e.g. defined by a point, width, and height),
and a class label for each bounding box.
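To make the three input/output contracts concrete, the following hypothetical Python structures illustrate what each task returns; the names and numbers are purely illustrative.

from dataclasses import dataclass
from typing import List

@dataclass
class BoundingBox:
    x: int          # top-left corner
    y: int
    width: int
    height: int

@dataclass
class Detection:
    box: BoundingBox
    label: str      # class label attached to each box

# Image classification: one label per image.
classification_output = "dog"

# Object localization: one or more boxes, no labels.
localization_output: List[BoundingBox] = [BoundingBox(34, 50, 120, 80)]

# Object detection: a (box, label) pair per object of interest.
detection_output: List[Detection] = [Detection(BoundingBox(34, 50, 120, 80), "dog")]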

Conclusion: Object detection can be used in many areas to reduce human effort and
increase the efficiency of processes in various fields. Object detection, as well as deep learning
in general, are areas that will be blooming in the future, making their presence felt across numerous
fields. There is a lot of scope in these fields and also many opportunities for improvement.
Experiment-5

Objective: Validate the principles of transfer learning for solving any real-life
classification/recognition problem.

Theory: Humans have an inherent ability to transfer knowledge across tasks. What we
acquire as knowledge while learning about one task, we utilize in the same way to
solve related tasks. The more related the tasks, the easier it is for us to transfer, or
cross-utilize our knowledge. Some simple examples would be,

Know how to ride a motorbike ⮫ Learn how to ride a car

Transfer learning, as we have seen so far, is having the ability to utilize existing
knowledge from the source learner in the target task. During the process of transfer
learning, the following three important questions must be answered:

What to transfer: This is the first and the most important step in the whole
process. We try to seek answers about which part of the knowledge can be transferred
from the source to the target in order to improve the performance of the target task.
When trying to answer this question, we try to identify which portion of knowledge is
source-specific and what is common between the source and the target.

When to transfer: There can be scenarios where transferring knowledge for the
sake of it may make matters worse than improving anything (also known as negative
transfer). We should aim at utilizing transfer learning to improve target task
performance/results and not degrade them. We need to be careful about when to
transfer and when not to.

How to transfer: Once the what and when have been answered, we can proceed
towards identifying ways of actually transferring the knowledge across
domains/tasks. This involves changes to existing algorithms and different techniques,
which we will cover in later sections of this article. Also, specific case studies are lined
up in the end for a better understanding of how to transfer.

Image Classification with a Data Availability Constraint

The dataset that we will be using, comes from the very


popular Dog vs Cat Challenge, where our primary objective is to
build a deep learning model that can successfully recognize and
categorize images into either a cat or a dog.

Creating Datasets:

import glob
import numpy as np
import os
import shutil

np.random.seed(42)

files = glob.glob('train/*')
cat_files = [fn for fn in files if 'cat' in fn]
dog_files = [fn for fn in files if 'dog' in fn]
len(cat_files), len(dog_files)

cat_train = np.random.choice(cat_files, size=1500, replace=False)
dog_train = np.random.choice(dog_files, size=1500, replace=False)
cat_files = list(set(cat_files) - set(cat_train))
dog_files = list(set(dog_files) - set(dog_train))

cat_val = np.random.choice(cat_files, size=500, replace=False)
dog_val = np.random.choice(dog_files, size=500, replace=False)
cat_files = list(set(cat_files) - set(cat_val))
dog_files = list(set(dog_files) - set(dog_val))

cat_test = np.random.choice(cat_files, size=500, replace=False)
dog_test = np.random.choice(dog_files, size=500, replace=False)

print('Cat datasets:', cat_train.shape, cat_val.shape, cat_test.shape)
print('Dog datasets:', dog_train.shape, dog_val.shape, dog_test.shape)

Writing to disk:

train_dir = 'training_data'
val_dir = 'validation_data'
test_dir = 'test_data'

train_files = np.concatenate([cat_train, dog_train])
validate_files = np.concatenate([cat_val, dog_val])
test_files = np.concatenate([cat_test, dog_test])

os.mkdir(train_dir) if not os.path.isdir(train_dir) else None
os.mkdir(val_dir) if not os.path.isdir(val_dir) else None
os.mkdir(test_dir) if not os.path.isdir(test_dir) else None

for fn in train_files:
    shutil.copy(fn, train_dir)
for fn in validate_files:
    shutil.copy(fn, val_dir)
for fn in test_files:   # copy the held-out test split as well
    shutil.copy(fn, test_dir)

Testing on a CNN Model:

import glob
import numpy as np
import matplotlib.pyplot as plt
from keras.preprocessing.image import ImageDataGenerator, load_img, img_to_array, array_to_img
%matplotlib inline

IMG_DIM = (150, 150)

train_files = glob.glob('training_data/*')
train_imgs = [img_to_array(load_img(img, target_size=IMG_DIM)) for img in train_files]
train_imgs = np.array(train_imgs)
train_labels = [fn.split('\\')[1].split('.')[0].strip() for fn in train_files]

validation_files = glob.glob('validation_data/*')
validation_imgs = [img_to_array(load_img(img, target_size=IMG_DIM)) for img in validation_files]
validation_imgs = np.array(validation_imgs)
validation_labels = [fn.split('\\')[1].split('.')[0].strip() for fn in validation_files]

print('Train dataset shape:', train_imgs.shape)
print('Validation dataset shape:', validation_imgs.shape)

Conclusion

We can clearly see that we have 3000 training images and 1000
validation images. Each image is of size 150 x 150 and has three
channels for red, green, and blue (RGB), hence giving each
image the (150, 150, 3) dimensions. We will now scale each image
with pixel values between (0, 255) to values between (0, 1) because
deep learning models work really well with small input values.
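Concretely, this scaling step might look like the following, reusing the arrays from the listings above (the names on the left are our own):

train_imgs_scaled = train_imgs.astype('float32') / 255
validation_imgs_scaled = validation_imgs.astype('float32') / 255
print(train_imgs_scaled.min(), train_imgs_scaled.max())  # values now within (0, 1)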
Experiment-6

Objective: Write a program using CloudSim to create a datacenter having three
hosts and run five cloudlets on it. The cloudlets run in virtual machines (VMs) with different
million instructions per second (MIPS) requirements, and will take different times to complete
execution depending on the requested VM performance.

Theory: The CloudSim simulation tool is one of the most popular simulators used by researchers and
developers for cloud-related issues in the research field. This manual eases learning by
providing simple steps for installing and understanding the simulation tool, and is intended to
help the research community working in the cloud computing domain.

package org.cloudbus.cloudsim.examples;

import java.text.DecimalFormat;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.LinkedList;
import java.util.List;
import org.cloudbus.cloudsim.Cloudlet;
import org.cloudbus.cloudsim.CloudletSchedulerTimeShared;
import org.cloudbus.cloudsim.Datacenter;
import org.cloudbus.cloudsim.DatacenterBroker;
import org.cloudbus.cloudsim.DatacenterCharacteristics;
import org.cloudbus.cloudsim.Host;
import org.cloudbus.cloudsim.Log;
import org.cloudbus.cloudsim.Pe;
import org.cloudbus.cloudsim.Storage;
import org.cloudbus.cloudsim.UtilizationModel;
import org.cloudbus.cloudsim.UtilizationModelFull;
import org.cloudbus.cloudsim.Vm;
import org.cloudbus.cloudsim.VmAllocationPolicySimple;
import org.cloudbus.cloudsim.VmSchedulerTimeShared;
import org.cloudbus.cloudsim.core.CloudSim;
import org.cloudbus.cloudsim.provisioners.BwProvisionerSimple;
import org.cloudbus.cloudsim.provisioners.PeProvisionerSimple;
import org.cloudbus.cloudsim.provisioners.RamProvisionerSimple;

/**
 * A simple example showing how to create a datacenter with two hosts and run
 * cloudlets on VMs with different MIPS requirements.
 */
public class CloudSimExample3 {

/** The cloudlet list. */


private static List<Cloudlet> cloudletList;

/** The vmlist. */


private static List<Vm> vmlist;
/** Creates main() to run this example. */
public static void main(String[] args) {
Log.printLine("Starting CloudSimExample3...");
try {
// First step: Initialize the CloudSim package. It should be called before creating any entities.
int num_user = 1; // number of cloud users
Calendar calendar = Calendar.getInstance();
boolean trace_flag = false; // mean trace events
CloudSim.init(num_user, calendar, trace_flag);

// Second step: Create Datacenters


@SuppressWarnings("unused")
Datacenter datacenter0 = createDatacenter("Datacenter_0");

//Third step: Create Broker


DatacenterBroker broker = createBroker();
int brokerId = broker.getId();

//Fourth step: Create one virtual machine


vmlist = new ArrayList<Vm>();

//VM description
int vmid = 0;
int mips = 250;
long size = 10000; //image size (MB)
int ram = 2048; //vm memory (MB)
long bw = 1000;
int pesNumber = 1; //number of cpus
String vmm = "Xen"; //VMM name

//create two VMs


Vm vm1 = new Vm(vmid, brokerId, mips, pesNumber, ram, bw, size, vmm,
new CloudletSchedulerTimeShared());

//the second VM will have twice the priority of VM1 and so will receive twice the CPU time
vmid++;
Vm vm2 = new Vm(vmid, brokerId, mips * 2, pesNumber, ram, bw, size, vmm,
new CloudletSchedulerTimeShared());

//add the VMs to the vmList


vmlist.add(vm1);
vmlist.add(vm2);

//submit vm list to the broker


broker.submitVmList(vmlist);

//Fifth step: Create two Cloudlets


cloudletList = new ArrayList<Cloudlet>();
//Cloudlet properties
int id = 0;
long length = 40000;
long fileSize = 300;
long outputSize = 300;
UtilizationModel utilizationModel = new UtilizationModelFull();

Cloudlet cloudlet1 = new Cloudlet(id, length, pesNumber, fileSize, outputSize,
utilizationModel, utilizationModel, utilizationModel);
cloudlet1.setUserId(brokerId);
id++;
Cloudlet cloudlet2 = new Cloudlet(id, length, pesNumber,
fileSize, outputSize, utilizationModel, utilizationModel, utilizationModel);
cloudlet2.setUserId(brokerId);

//add the cloudlets to the list


cloudletList.add(cloudlet1);
cloudletList.add(cloudlet2);

//submit cloudlet list to the broker


broker.submitCloudletList(cloudletList);

//bind the cloudlets to the vms. This way, the broker
//will submit the bound cloudlets only to the specific VM

broker.bindCloudletToVm(cloudlet1.getCloudletId(),vm1.getId());

broker.bindCloudletToVm(cloudlet2.getCloudletId(),vm2.getId());

// Sixth step: Starts the simulation


CloudSim.startSimulation();
// Final step: Print results when simulation is over
List<Cloudlet> newList = broker.getCloudletReceivedList();
CloudSim.stopSimulation();

printCloudletList(newList);
Log.printLine("CloudSimExample3 finished!");
}
catch (Exception e) {
e.printStackTrace();
Log.printLine("The simulation has been terminated due to an
unexpected error");
}
}

private static Datacenter createDatacenter(String name){

// Here are the steps needed to create a PowerDatacenter:


// 1. We need to create a list to store
// our machine
List<Host> hostList = new ArrayList<Host>();

// 2. A Machine contains one or more PEs or CPUs/Cores.


// In this example, it will have only one core.
List<Pe> peList = new ArrayList<Pe>();

int mips = 1000;

// 3. Create PEs and add these into a list.


peList.add(new Pe(0, new PeProvisionerSimple(mips))); // need to store Pe id and MIPS Rating

//4. Create Hosts with its id and list of PEs and add them to the list of machines
int hostId=0;
int ram = 2048; //host memory (MB)
long storage = 1000000; //host storage
int bw = 10000;

hostList.add(
new Host(
hostId,
new RamProvisionerSimple(ram),
new BwProvisionerSimple(bw),
storage,
peList,
new VmSchedulerTimeShared(peList)
)
); // This is our first machine

//create another machine in the Data center


List<Pe> peList2 = new ArrayList<Pe>();
peList2.add(new Pe(0, new PeProvisionerSimple(mips)));

hostId++;

hostList.add(
new Host(
hostId,
new RamProvisionerSimple(ram),
new BwProvisionerSimple(bw),
storage,
peList2,
new VmSchedulerTimeShared(peList2)
)
); // This is our second machine

// 5. Create a DatacenterCharacteristics object that stores the


// properties of a data center: architecture, OS, list of
// Machines, allocation policy: time- or space-shared, time zone
// and its price (G$/Pe time unit).
String arch = "x86"; // system architecture
String os = "Linux"; // operating system
String vmm = "Xen";
double time_zone = 10.0; // time zone this resource located
double cost = 3.0; // the cost of using processing in this resource
double costPerMem = 0.05; // the cost of using memory in this resource
double costPerStorage = 0.001;// the cost of using storage in this resource
double costPerBw = 0.0; // the cost of using bw in this resource
LinkedList<Storage> storageList = new LinkedList<Storage>(); //we are
not adding SAN devices by now
DatacenterCharacteristics characteristics = new DatacenterCharacteristics(
arch, os, vmm, hostList, time_zone, cost, costPerMem, costPerStorage,
costPerBw);
// 6. Finally, we need to create a PowerDatacenter object.
Datacenter datacenter = null;
try {
datacenter = new Datacenter(name, characteristics, new VmAllocationPolicySimple(hostList), storageList, 0);
} catch (Exception e) {
e.printStackTrace();
}
return datacenter;
}

//We strongly encourage users to develop their own broker policies, to submit vms
//and cloudlets according to the specific rules of the simulated scenario
private static DatacenterBroker createBroker(){
DatacenterBroker broker = null;
try {
broker = new DatacenterBroker("Broker");
} catch (Exception e) {
e.printStackTrace();
return null;
}
return broker;
}
/** Prints the Cloudlet objects. */
private static void printCloudletList(List<Cloudlet> list) {
int size = list.size();
Cloudlet cloudlet;

String indent = " ";


Log.printLine();
Log.printLine("========== OUTPUT ==========");
Log.printLine("Cloudlet ID" + indent + "STATUS" + indent +
"Data center ID" + indent + "VM ID" + indent + "Time" +
indent + "Start Time" + indent + "Finish Time");

DecimalFormat dft = new DecimalFormat("###.##");


for (int i = 0; i < size; i++) {
cloudlet = list.get(i);
Log.print(indent + cloudlet.getCloudletId() + indent + indent);

if (cloudlet.getCloudletStatus() == Cloudlet.SUCCESS){
Log.print("SUCCESS");

Log.printLine( indent + indent + cloudlet.getResourceId()


+ indent + indent + indent + cloudlet.getVmId() +
indent + indent +
dft.format(cloudlet.getActualCPUTime()) + indent + indent +
dft.format(cloudlet.getExecStartTime())+
indent + indent +
dft.format(cloudlet.getFinishTime()));
}
}
}
}
Experiment-7

Objective: Application of Multi-Layer Perceptron on a classification problem.

Theory: The multi-layer perceptron (MLP) is a supplement of the feed-forward neural network.
It consists of three types of layers: the input layer, the output layer, and the hidden layer, as
shown in the figure below. The input layer receives the input signal to be processed. The
required task, such as prediction or classification, is performed by the output layer. An
arbitrary number of hidden layers placed between the input and output layers
are the true computational engine of the MLP. As in a feed-forward network, the data in an
MLP flows in the forward direction from the input to the output layer. The neurons in
the MLP are trained with the backpropagation learning algorithm. MLPs are designed to
approximate any continuous function and can solve problems that are not linearly
separable. The major use cases of MLPs are pattern classification, recognition, prediction,
and approximation.
 ‌
The computations taking place at every neuron in the output and hidden layer are as follows:

$$o(x) = G\big(b^{(2)} + W^{(2)} h(x)\big) \qquad (1)$$

$$h(x) = \Phi(x) = s\big(b^{(1)} + W^{(1)} x\big) \qquad (2)$$

with bias vectors $b^{(1)}$, $b^{(2)}$; weight matrices $W^{(1)}$, $W^{(2)}$; and activation functions G and s. The set of parameters to learn is $\theta = \{W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)}\}$. Typical choices for s include the tanh function, $\tanh(a) = (e^{a} - e^{-a})/(e^{a} + e^{-a})$, or the logistic sigmoid function, $\mathrm{sigmoid}(a) = 1/(1 + e^{-a})$.
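Equations (1) and (2) can be read directly as code. The following NumPy sketch performs one forward pass; the layer sizes and the choice of softmax for G are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(5, 3)), np.zeros(5)   # hidden layer: 3 inputs -> 5 units
W2, b2 = rng.normal(size=(2, 5)), np.zeros(2)   # output layer: 5 units -> 2 classes

def softmax(z):                                  # one common choice for G
    e = np.exp(z - z.max())
    return e / e.sum()

def forward(x):
    h = np.tanh(b1 + W1 @ x)                     # eq. (2): h(x) = s(b1 + W1 x)
    return softmax(b2 + W2 @ h)                  # eq. (1): o(x) = G(b2 + W2 h(x))

print(forward(np.array([0.5, -1.2, 0.3])))       # two class probabilities summing to 1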

 ‌

Perceptron for Binary Classification

With this discrete output, controlled by the activation function, the perceptron can be
used as a binary classification model, defining a linear decision boundary. It finds the
separating hyperplane that minimizes the distance between misclassified points and the
decision boundary.
 ‌

To minimize this distance, the perceptron uses stochastic gradient descent as the
optimization function. If the data is linearly separable, it is guaranteed that stochastic
gradient descent will converge in a finite number of steps.

The last piece that the perceptron needs is the activation function, the function that
determines whether the neuron will fire or not. Initial perceptron models used the sigmoid
function, and just by looking at its shape, it makes a lot of sense! The sigmoid
function maps any real input to a value between 0 and 1 and encodes a
non-linear function. The neuron can receive negative numbers as input, and it will
still produce an output between 0 and 1.

A multilayer perceptron has input and output layers, and one or more hidden
layers with many neurons stacked together. And while in the perceptron the
neuron must have an activation function that imposes a threshold, like ReLU or
sigmoid, neurons in a multilayer perceptron can use any arbitrary activation
function.
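As a small illustration, scikit-learn's Perceptron (which trains with SGD-style updates) typically reaches perfect training accuracy on linearly separable synthetic data; the dataset below is generated, not real.

from sklearn.datasets import make_blobs
from sklearn.linear_model import Perceptron

X, y = make_blobs(n_samples=200, centers=2, random_state=42)  # two separable clusters
clf = Perceptron(max_iter=1000, tol=1e-3).fit(X, y)           # SGD-style training
print("training accuracy:", clf.score(X, y))                  # ~1.0 when separable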
 ‌
Conclusion

The perceptron is a neural network with only one neuron and can only understand
linear relationships between the input and output data provided. With the multilayer
perceptron, however, horizons are expanded: the neural network can have many layers of neurons.
Experiment-8

Objective: Application of LSTM in time series prediction / speech recognition /
COVID-19 forecasting.

Theory: LSTM (Long Short-Term Memory) is a Recurrent Neural Network (RNN)
based architecture that is widely used in natural language processing and time
series forecasting. The LSTM rectifies a huge issue that recurrent neural networks
suffer from: short memory. Using a series of 'gates', each with its own RNN, the
LSTM manages to keep, forget or ignore data points based on a probabilistic
model.

LSTMs also help solve exploding and vanishing gradient problems. In simple terms,
these problems are a result of repeated weight adjustments as a neural network
trains. With repeated epochs, gradients become larger or smaller, and with each
adjustment it becomes easier for the network's gradients to compound in either
direction. This compounding either makes the gradients way too large or way too
small. While exploding and vanishing gradients are huge downsides of using
traditional RNNs, LSTM architecture severely mitigates these issues.

After a prediction is made, it is fed back into the model to predict the next value in
the sequence. With each prediction, some error is introduced into the model. To
avoid exploding gradients, values are 'squashed' via (typically) sigmoid and tanh
activation functions prior to gate entrance and output. Below is a diagram of the LSTM
architecture.

 ‌
 ‌
  ‌
# Time Series
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller

def create_series(df, xcol, datecol):
    features_considered = [xcol]
    features = df[features_considered]
    features.index = df[datecol]
    features.plot(subplots=True)
    return features

def stationarity_test(X, log_x="Y", return_p=False, print_res=True):
    # Optionally log-transform the series first
    if log_x == "Y":
        X = np.log(X[X > 0])

    # Once we have the series as needed we can do the ADF test
    dickey_fuller = adfuller(X)

    if print_res:
        print('ADF Stat is: {}.'.format(dickey_fuller[0]))
        print('P Val is: {}.'.format(dickey_fuller[1]))
        print('Critical Values (Significance Levels):')
        for key, val in dickey_fuller[4].items():
            print(key, ":", round(val, 3))

    if return_p:
        return dickey_fuller[1]

def difference(X):
    diff = X.diff()
    plt.plot(diff)
    plt.show()
    return diff
 ‌
 ‌
Procedure: Before building the model, we create a series and check for
stationarity. While stationarity is not an explicit assumption of LSTM, it does help
immensely in controlling error. A non-stationary series will introduce more errors
in predictions and force errors to compound faster.

We filter out one 'sequence length' of data points for later validation; in this case,
60 points.

The data format required for an LSTM is 3-dimensional, with a moving window:
● The first data point is the first 60 days of data.
● The second data point is the first 61 days of data but not including the first.
● The third data point is the first 62 days of data but not including the first and second.

The last major step of prep is to scale the data. Here we use a simple min-max
scaler. Our sequence length is 60 days for this part of the code.
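A sketch of this moving-window preparation and a small Keras LSTM is given below; `series` is assumed to be a 1-D NumPy array of observations, and the layer sizes are illustrative.

import numpy as np
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras import layers, models

SEQ_LEN = 60   # sequence length: 60 days

# Scale the raw observations to (0, 1) with a simple min-max scaler
scaler = MinMaxScaler()
scaled = scaler.fit_transform(series.reshape(-1, 1))   # `series`: 1-D NumPy array

# Build the 3-D moving-window dataset: window i covers days i-60 .. i-1
X, y = [], []
for i in range(SEQ_LEN, len(scaled)):
    X.append(scaled[i - SEQ_LEN:i, 0])
    y.append(scaled[i, 0])                  # target: the next value after the window
X = np.array(X).reshape(-1, SEQ_LEN, 1)     # (samples, time steps, features)
y = np.array(y)

model = models.Sequential([
    layers.LSTM(50, input_shape=(SEQ_LEN, 1)),   # 50 units chosen arbitrarily
    layers.Dense(1),                             # next-step regression output
])
model.compile(optimizer='adam', loss='mse')
# model.fit(X, y, epochs=20, batch_size=32)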
 ‌

 ‌
Conclusion: Since this article is mainly about building an LSTM, I didn't discuss
many advantages/disadvantages of using an LSTM over classical methods. I'd like
to offer some guidelines in this conclusion.

Technical Considerations
1. ARIMA (and MA-based models in general) are designed for time series
data, while RNN-based models are designed for sequence data. Because
of this distinction, it's harder to build RNN-based models out of the box.
2. ARIMA models are highly parameterized and, due to this, they don't
generalize well. Using a parameterized ARIMA on a new dataset may
not return accurate results. RNN-based models are non-parametric and
are more generalizable.
3. Depending on window size, data, and desired prediction time, LSTM
models can be very computationally expensive. Sometimes they're not
feasible without powerful cloud computing.
4. It's good practice to have a 'no-skill' model to compare results to. A
good start would be to compare the model results to a model
predicting only the mean for each time step over the period (a horizontal
line).
 ‌
Experiment-9

Objective: Application of Convolutional Neural Networks in disease detection, such as
pneumonia/COVID detection through chest X-rays, heartbeat classification, etc.

Abstract:
CNNs are powerful image-processing, artificial intelligence (AI) systems that use deep learning to
perform both generative and descriptive tasks, often using machine vision that includes image
and video recognition, along with recommender systems and natural language processing
(NLP).

A CNN uses a system much like a multilayer perceptron that has been designed for reduced
processing requirements. The layers of a CNN consist of an input layer, an output layer and
hidden layers that include multiple convolutional layers, pooling layers, fully connected layers
and normalization layers. The removal of limitations and the increase in efficiency for image
processing result in a system that is far more effective and simpler to train, though limited to
domains such as image processing and natural language processing.

a) Application of a Convolutional Neural Network to pneumonia detection through chest X-ray classification

Introduction:

Pneumonia is a lung parenchyma inflammation often caused by pathogenic
microorganisms, physical and chemical factors, immunologic injury and some
pharmaceuticals. There are several popular pneumonia classification schemes: (1)
pneumonia is classified as infectious or non-infectious based on different pathogeneses, in
which infectious pneumonia is further classified into bacterial, viral, mycoplasmal, chlamydial
pneumonia, and others, while non-infectious pneumonia is classified as immune-associated
pneumonia, aspiration pneumonia caused by physical and chemical factors, and radiation
pneumonia. (2) Pneumonia is classified as CAP (community-acquired pneumonia), HAP
(hospital-acquired pneumonia) and VAP (ventilator-associated pneumonia) based on
where the infection is acquired, among which CAP accounts for the larger part. Because of the
different range of pathogens, HAP develops resistance to various antibiotics more easily, making
treatment more difficult.

Related Work:

Several methods have been introduced to describe the process of pneumonia detection
using chest X-ray images in recent years, especially deep learning methods. Deep
learning has been successfully applied to improve the performance of computer-aided
diagnosis (CAD) technology, especially in the fields of medical imaging [5], image
segmentation [6,7] and image reconstruction [8,9]. In 2017, Rajpurkar et al. [10] applied a
classical deep learning network named DenseNet-121 [11], a 121-layer CNN
model, to accelerate the diagnosis of pneumonia.

Background:
In the past few decades, machine learning (ML) algorithms have gradually attracted
researchers' attention. This type of algorithm can take full advantage of the giant computing
power of computers in image processing through given algorithms or specified steps.
However, traditional ML methods for classification tasks need manually designed algorithms
or manually set feature-extraction layers to classify images.

Proposed CNN Model:
Figure 4 illustrates the architecture of our proposed model, which has been applied to
detect whether an input image shows pneumonia. Figure 5 displays our model, which
contains a total of six layers, where we employed 3 × 3 kernel convolution layers whose
strides are 1 × 1 and whose activation function is ReLU. After each convolution layer, a 2 × 2
strided kernel operation was employed as a max-pooling operation to retain the maximum of
each sub-region, split according to the strides. Besides, we set several dropout layers to
randomly set weights to zero, aiming to improve model performance. Then two dense
fully-connected layers followed by a sigmoid function are utilized to take full advantage of the
features extracted by previous layers, outputting the probability of the patient suffering
from pneumonia or not. As illustrated above, the input channel is 224 × 224 × 1 and the
output is y ∈ {0, 1}, where 0 denotes that the image does not show pneumonia, while 1
denotes that the image shows pneumonia.
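The description above might translate into Keras roughly as follows; the exact filter counts, dense width and dropout rates are assumptions rather than the paper's values.

from tensorflow.keras import layers, models

model = models.Sequential([
    # 3x3 kernels, 1x1 strides, ReLU, as described; filter counts are assumed
    layers.Conv2D(32, (3, 3), strides=(1, 1), activation='relu',
                  input_shape=(224, 224, 1)),          # grayscale chest X-ray
    layers.MaxPooling2D((2, 2)),                       # 2x2 strided max-pooling
    layers.Dropout(0.2),                               # dropout layer
    layers.Conv2D(64, (3, 3), strides=(1, 1), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Dropout(0.2),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),              # two dense layers, the last
    layers.Dense(1, activation='sigmoid'),             # one ending in a sigmoid
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# The output y ∈ {0, 1}: 0 = no pneumonia, 1 = pneumonia.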

b) Application of a Convolutional Neural Network to COVID detection through heartbeat classification:

Introduction:
CNNs are used in pattern recognition for their superior feature-learning capabilities, being a
suitable model for image data. Indeed, the CNN is a dominant deep-learning architecture for image
classification and can rival human accuracy in many tasks. A CNN uses hierarchical layers of
tiled convolutional filters to mimic the effects of human receptive fields on feedforward
processing in the early visual cortex, thereby exploiting the local spatial correlations present in
images while developing robustness to natural transformations such as changes of viewpoint
or scale. A CNN-based model generally requires a large set of training samples to achieve
good generalization capabilities. Its basic structure is represented as a sequence of
Convolutional, Pooling, and Fully Connected layers, possibly with other intermediary layers
for normalization and/or dropout.

Network architecture:-
1. Input layer.
The input layer basically depends on the dimensions of the images. In our network, all images
must have the same dimensions, presented as grayscale (single colour channel) images.

2. Batch Normalization layer.
Batch normalization converts the distribution of the inputs to a standard normal distribution
with mean 0 and variance 1, avoiding the problem of gradient dispersion and accelerating the
training process (the standard transform is written out after this list).

3. Convolutional layer.
Convolutions are the main building blocks of a CNN. Filter kernels are slid over the image
and for each position the dot product of the filter kernel and the part of the image covered by
the kernel is taken. All kernels used in this layer are 3 × 3 pixels. The chosen activation
function of convolutional layers is the rectified linear unit (ReLU), which is easy to train due
to its piecewise linear and sparse characteristics.

4. Max pooling layer.


Max pooling is a sub-sampling procedure that uses the maximum value of a window as the
output. The size of such a window was chosen as 2 × 2 pixels.

5. Fire layer.
A fire module comprises a squeeze convolutional layer (which has only 1 × 1 filters)
feeding into an expand layer that has a mix of 1 × 1 and 3 × 3 convolution filters. Using a
fire layer can reduce training time while still extracting data characteristics, in comparison
with dense layers with the same number of parameters. The layer is represented in Fig 4, in
which the input and output have the same dimensions.
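For reference, the batch normalization layer in item 2 applies the standard transform (this is the textbook definition, not a formula taken from the paper):

$$\hat{x} = \frac{x - \mu_{\mathcal{B}}}{\sqrt{\sigma_{\mathcal{B}}^{2} + \epsilon}}, \qquad y = \gamma \hat{x} + \beta,$$

where $\mu_{\mathcal{B}}$ and $\sigma_{\mathcal{B}}^{2}$ are the mini-batch mean and variance, $\epsilon$ is a small constant for numerical stability, and $\gamma$, $\beta$ are learned scale and shift parameters.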

Proposed model:-

Thanks to their self-learning capacity and superior prediction performance, LWL and SOM
models achieve human-like precision in image description and prediction tasks. Our
framework aims mainly at providing distinguishing visual properties and a quick diagnostic
system that can be used to classify new COVID-19 X-rays. This technique can also be useful
to clinicians in choosing a treatment plan, depending on the type of infection, and can
support prompt decisions.
Related Work:-
Real-time reverse transcription-polymerase chain reaction (RT-PCR) is the primary
technique currently in use for COVID-19 diagnosis. Chest radiographic images, such as CT
images and X-rays, are critical for the early diagnosis and treatment of the condition. Because of
the low sensitivity of RT-PCR (60-70%), symptoms can be detected by analysing radiographic
images of patients even when negative RT-PCR findings are obtained.

Conclusion:-
Within this context, the literature suggests that diagnosis may be assisted by the use of
data-mining methods to classify pneumonia disease in chest X-rays. However, the issue is
much more difficult when we look at chest images of patients suffering from pneumonia
caused by multiple types of pathogens and attempt to forecast a particular form of pneumonia
(COVID-19).
Experiment-10

Objective: Designing new methods for the DAG scheduling problem in cloud computing.

Abstract:-

The DAG scheduler is a scheduling layer in Spark which implements stage-oriented scheduling. It
converts a logical execution plan to a physical execution plan. When an action is called,
Spark hands control directly to the DAG scheduler, which executes the tasks that are submitted to
the scheduler.

The objective of DAG scheduling is to minimize the overall program finish-time by proper
allocation of the tasks to the processors and arrangement of execution sequencing of the
tasks. Scheduling is done in such a manner that the precedence constraints among the
program tasks are preserved. The overall finish-time of a parallel program is commonly called
the schedule length or makespan. Some variations to this goal have been suggested. For
example, some researchers proposed algorithms to minimize the mean flow-time or mean
finish-time, which is the average of the finish-times of all the program tasks [25], [110]. The
significance of the mean finish-time criterion is that minimizing it in the final schedule leads to
the reduction of the mean number of unfinished tasks at each point in the schedule. Some
other algorithms try to reduce the setup costs of the parallel processors [159]. We focus on
algorithms that minimize the schedule length.

INTRODUCTION:-

The cloud is a huge, interconnected system of powerful servers that provides businesses
and individuals with services [1]. The concept of cloud computing refers to the ability of
online users to share resources offered by a service provider and to leverage the provider's
high-end capabilities without needing to buy expensive hardware [2]. The main goal of
the cloud computing model is to allow users to share resources and data through software as a
service (SaaS), platform as a service (PaaS), and infrastructure as a service (IaaS). As the
number of cloud users has grown in recent years, the number of tasks that must be
managed has increased proportionally, necessitating task scheduling [3]. The proposed
methodology is based on reinforcement learning.

RELATED WORK:-

The task scheduling algorithm's main goal is to ensure that tasks are completed as
efficiently as possible. List scheduling algorithms are used in the task scheduling process. In
list scheduling algorithms there are two distinct phases: the first phase entails
determining the tasks' priority, and the second phase entails assigning tasks to the
processor in the order determined [3]. They are discussed as follows. In 2017, Wei et al. [4]
proposed a task scheduling algorithm based on Q-learning and the mutual value
function (QS).

Workflow model:-

A directed acyclic graph, G = (V, E), represents an application, with V representing the set of v
tasks and E representing the set of e edges between the tasks. Each edge (i, j) ∈ E
represents a precedence constraint, requiring task i to finish before task j can begin.
Data is a v×v matrix of communication data, with Data(i, j) indicating the amount of data to be
transmitted from task i to task j. The DAG scheduling objective: node tasks are assigned to
resources so as to satisfy the precedence-order constraints while reducing the total time
to completion.

Components of proposed algorithm:-

RL, MDP, and the Q-learning algorithm

Proposed Scheduling Algorithm:-

Input: DAG of all tasks.

Output: The makespan.

Procedure:

1: Create DAG for all tasks.

2: Set gamma parameter, environment rewards in matrix R.

3: Initialize matrix Q to zero.

4: Repeat for each episode.

5: Select an initial state.

6: While the goal state not reached Do.

7: Select possible actions for the current state.

8: Go to the next state.

9: Get the maximum Q value using Eq. (6).

10: Set next state as a current state.

11: Update Q(state, action) using Eq. (6).

12: Obtain tasks order according to updated Q-table.

13: Map each task to the processor which has the minimum execution time.

14: Calculate the makespan.

15: Repeat until the makespan no longer changes.
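The steps above can be sketched in Python for a toy DAG as follows; the reward matrix, gamma and topology are illustrative assumptions, and Eq. (6) is taken to be the standard Q-learning update Q(s, a) = R(s, a) + γ · max_a' Q(s', a').

import numpy as np

n_tasks = 4
R = np.array([                       # step 2: environment rewards (hypothetical)
    [-1,  0,  0, -1],                # task 0 may precede tasks 1 and 2
    [-1, -1, -1, 100],               # reaching task 3 (the goal) is rewarded
    [-1, -1, -1, 100],
    [-1, -1, -1, -1],
])
gamma = 0.8                          # step 2: discount parameter
Q = np.zeros_like(R, dtype=float)    # step 3: initialize matrix Q to zero
goal = 3

rng = np.random.default_rng(0)
for episode in range(200):           # step 4: repeat for each episode
    state = rng.integers(n_tasks)    # step 5: select an initial state
    while state != goal:             # step 6: loop until the goal is reached
        actions = np.flatnonzero(R[state] >= 0)      # step 7: possible actions
        action = rng.choice(actions)                 # step 8: go to the next state
        Q[state, action] = R[state, action] + gamma * Q[action].max()  # steps 9-11
        state = action

order = [0]                          # step 12: greedy task order from the Q-table
while order[-1] != goal:
    order.append(int(Q[order[-1]].argmax()))
print("task order:", order)          # steps 13-14: map to the fastest processor, compute makespan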

CONCLUSION:-

Existing scheduling algorithms have focused on time: the main goal of these schedulers
is to reduce the overall makespan of the workflow. Gaps in current workflow
scheduling strategies in cloud environments were studied in this thesis, and an effective
scheduling method for workflow management in the cloud setting was proposed based on
the gap analysis. It has been determined that the current scheme is effective enough to
make the best use of the available resources. There are two stages to the algorithm design
theory.
