Cloud Computing


EX NO : 1 VIRTUALIZATION

AIM:

To run virtual machines of different configurations and to check how many virtual
machines can be utilized at a particular time.

REQUIREMENTS:

1. ORACLE VIRTUAL BOX


2. OPEN NEBULA SANDBOX

PROCEDURE:

1. Open VirtualBox


2. File → Import Appliance
3. Browse to the OpenNebula-Sandbox-5.0.ova file
4. Then go to Settings, select USB and choose USB 1.1
5. Then start the OpenNebula sandbox
6. Login using username: root, password: opennebula
7. Open a browser and type localhost:9869
8. Login using username: oneadmin, password: opennebula
9. Click on Instances, select VMs, then follow these steps to create a virtual machine:
a. Click the + symbol
b. Select the user oneadmin
c. Then enter the VM name, number of instances and CPU.
d. Then click on the Create button.
e. Repeat steps (c) and (d) to create more than one VM.
OUTPUT:
RESULT:

Thus virtual machines of different configurations are created successfully and it is
checked how many virtual machines can be utilized at a particular time.
EX NO: 2 ATTACH A VIRTUAL BLOCK TO A VIRTUAL MACHINE

AIM:

To attach a virtual block to a virtual machine and to check whether it holds the data even
after the release of the virtual machine.

REQUIREMENTS:

1. ORACLE VIRTUAL BOX


2. OPEN NEBULA SANDBOX

PROCEDURE:

METHOD 1:

1. Open VirtualBox


2. Power off the VM to which you want to add the virtual disk
3. Then right-click on that VM and select Settings
4. Then click on Storage and find Controller: IDE
5. At the top right, find the Add Hard Disk icon; a pop-up window is displayed
6. In that window select Create new disk, then click Next, Next and Finish
7. Then, under Attributes, set the hard disk as IDE Secondary Slave
METHOD 2:

1. Open a browser and type localhost:9869


2. Login using username: oneadmin, password: opennebula
3. Click on Instances, select VMs, then follow these steps to add a virtual block:
a. Select any one VM from the list and power off that VM
b. Then click on that VM, find the Storage tab and click on it
c. Then find the Attach Disk button
d. Click on that button; a new pop-up window is displayed
e. In that window select either Image or Volatile Disk
f. Click on the Attach button.
OUTPUT:
RESULT:
Thus a virtual block is attached to a virtual machine and it is checked whether the block holds the
data even after the release of the virtual machine.
EX NO: 3 INSTALL A C COMPILER IN THE VIRTUAL MACHINE AND
EXECUTE A SAMPLE PROGRAM

AIM:

To install a C Compiler in the Virtual Machine and execute a sample program

REQUIREMENTS:

1. ORACLE VIRTUAL BOX


2. OPEN NEBULA SANDBOX
3. UBUNTU Gt6.Ova

PROCEDURE:
STEP 1:

ubuntu_gt6 installation:

• Open VirtualBox


• File → Import Appliance
• Browse to the ubuntu_gt6.ova file
• Then go to Settings, select USB and choose USB 1.1
• Then start the ubuntu_gt6 VM
• Login using username: dinesh, password: 99425.
STEP 2:

Open the terminal

STEP 3:

//to install gcc

sudo add-apt-repository ppa:ubuntu-toolchain-r/test

sudo apt-get update

sudo apt-get install gcc-6 gcc-6-base

STEP 4:

To type a sample C program and save it

gedit hello.c
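A minimal hello.c (any simple C program will do; this one is only an illustrative example) could be:

#include <stdio.h>

int main(void)
{
    /* print a test message so the run in Step 5 produces visible output */
    printf("Hello, world!\n");
    return 0;
}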

STEP 5:

To compile and run the sample C program

gcc hello.c

./a.out
OUTPUT:
RESULT:

Thus the C Compiler in the Virtual Machine is installed and a sample program is
executed successfully.
EX NO: 4 SHOW THE VIRTUAL MACHINE MIGRATION BASED ON CERTAIN
CONDITION FROM ONE NODE TO OTHER

AIM:

To show virtual machine migration, based on a certain condition, from one node to the
other.

REQUIREMENTS:

1. ORACLE VIRTUAL BOX


2. OPEN NEBULA SANDBOX
3. UBUNTU Gt6.Ova

PROCEDURE:

STEP:1

Open Browser, type localhost:9869

STEP:2

Login using username: oneadmin, password: opennebula

STEP:3

Then follow the steps to migrate VMs

a. Click on Infrastructure
b. Select Clusters and enter the cluster name
c. Then select the Hosts tab and select all hosts
d. Then select the VNets tab and select all vnets
e. Then select the Datastores tab and select all datastores
f. And then choose Hosts under the Infrastructure tab
g. Click on the + symbol to add a new host, name the host, then click on Create.

STEP:4

On Instances, select the VMs to migrate, then follow these steps:

h. Click on the 8th icon; a drop-down list is displayed


i. Select Migrate from that list; a pop-up window is displayed
j. In that window select the target host to migrate to, then click on Migrate.
OUTPUT:

Before migration

Host:one-sandbox
After Migration:

Host:one-sandbox
RESULT:

Thus virtual machine migration, based on a certain condition, from one node to the other
has been executed successfully.
EX NO: 5 STORAGE CONTROLLER

AIM:

To install Storage Controller and to interact with it.

REQUIREMENTS:

1. ORACLE VIRTUAL BOX


2. OPEN NEBULA SANDBOX
3. UBUNTU Gt6.Ova

PROCEDURE:

Nova compute instances support the attachment and detachment of Cinder storage
volumes. This procedure details the steps involved in creating a logical volume in the cinder-
volumes volume group using the cinder command line interface.

METHOD: 1 (using Ubuntu GT6)


1. After login, plug in the USB drive
2. Right-click on the USB icon at the bottom right corner (4th icon)
3. Select your device name, e.g. JetFlash, SanDisk, etc.
4. A file explorer window opens.
5. Then perform read and write operations on the USB.

METHOD: 2

STEP: 1

Log in to the dashboard.


STEP: 2

Select the appropriate project from the drop down menu at the top left.
STEP: 3
On the Project tab, open the Compute tab and click Access & Security category.
STEP: 4
On the Access & Security tab, click the API Access category and click Download OpenStack RC File
v2.0
STEP: 5
source ./admin-openrc.sh

STEP: 6
Create Cinder Volume. Use the cinder create command to create a new volume.
$ cinder create --display_name NAME SIZE
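For example, following the syntax above, a 1 GB volume with an illustrative name could be created as:

$ cinder create --display_name demo-volume 1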

OUTPUT:

RESULT:

Thus the Storage Controller is installed and interaction with it is done successfully.
EX NO: 6 HADOOP INSTALLATION

AIM:

To install Hadoop Software and to set up the one node Hadoop cluster.

PROCEDURE:

STEP 1:INSTALLING JAVA

Step 1.1: Download the JDK tar.gz file for Ubuntu 64-bit OS

$tar zxvf jdk-8u60-linux-x64.tar.gz

$cd jdk1.8.0_60/

$pwd

/home/Downloads/jdk1.8.0_60 → copy the path of the JDK

Step 1.2:To set the environment variables for java

$ sudo nano /etc/profile

Pwd:

Add the following three lines in the middle of the file

JAVA_HOME=/home/Downloads/jdk1.8.0_60 (paste the JDK path here)

PATH=$PATH:$JAVA_HOME/bin

export PATH JAVA_HOME

Save the file by pressing ctrl+x, press y & enter

Step 1.3:source /etc/profile

Step 1.4:java -version

You will get "Java HotSpot(TM) 64-Bit Server VM" in the last line

If you are not getting this, update the Java alternatives:

#update-alternatives --install /usr/bin/java java /home/Downloads/jdk1.8.0_60/bin/java 1

#java -version
STEP 2:INSTALLING HADOOP

Step 2.1:Download latest version of hadoop tar.gz(hadoop-2.7.0)

#cd ..

#tar zxvf hadoop-2.7.0.tar.gz

#cd hadoop-2.7.0/ (to go into the directory)

#pwd

/home/Downloads/hadoop-2.7.0 → copy the path of Hadoop

Step 2.2:To set the environment variables for hadoop

#sudo nano /etc/profile

Pwd:

Add the following three lines

HADOOP_PREFIX=/home/Downloads/hadoop-2.7.0

PATH=$PATH:$HADOOP_PREFIX/bin

export PATH JAVA_HOME HADOOP_PREFIX

Save the file by pressing ctrl+x, press y & enter

Step 2.3:#source /etc/profile

#cd $HADOOP_PREFIX

#bin/hadoop version

Hadoop 2.7.0

Step 2.4: Update the Java and Hadoop paths in the Hadoop environment file

#cat /etc/profile (copy the JAVA_HOME and HADOOP_PREFIX lines)

#cd $HADOOP_PREFIX/etc/hadoop

#nano hadoop-env.sh

After the last line, paste the paths and add export in front of each:

export JAVA_HOME=/home/Downloads/jdk1.8.0_60

export HADOOP_PREFIX=/home/Downloads/hadoop-2.7.0
STEP 3: INSTALLING THE OPENSSH PACKAGE (configuring SSH)

Step 3.1:#sudo apt-get install openssh-server

Press y for all

If you get an "Unable to fetch" error:

#sudo nano /etc/resolv.conf

Add nameserver 8.8.8.8

Save the file by pressing ctrl+x, press y and enter

#sudo apt-get install openssh-server

Press y for all

Step 3.2: To establish passwordless communication between systems

#ssh localhost

#ssh-keygen

Press Enter for all prompts

#ssh-copy-id -i localhost

Press y

Yes

Pwd:

You will get "Number of key(s) added: 1"

STEP 4:TO CONFIGURE FOUR XML FILES

(to configure which file system Hadoop uses and related settings)

Step 4.1:Modify core-site.xml

$nano core-site.xml

In that file, paste


<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Save the file by pressing ctrl+x, press y and enter

Step 4.2:To configure number of replication, modify hdfs-site.xml

$nano hdfs-site.xml

In hdfs-site.xml file, paste

<configuration>
<property>

<name>dfs.replication</name>

<value>1</value>

</property>
</configuration>

ctrl+x, press y and enter

Step 4.3:Modify mapred-site.xml

$cp mapred-site.xml.template mapred-site.xml

$nano mapred-site.xml

In that file, paste

<configuration>

<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>

</configuration>

Step 4.4:Modify yarn-site.xml

$nano yarn-site.xml

In that file, paste

<configuration>

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>

</configuration>

STEP 5:FORMAT HDFS FILE SYSTEM (via name node for first time)

#cd $HADOOP_PREFIX

#bin/hadoop namenode -format

Step 5.1:Start name node and data node(port 50070)

#sbin/start-dfs.sh

Press y, yes

If an error occurs, run ssh-add

To see the running name node and data node daemons, type

#jps [In the browser, type localhost:50070 and press Enter; the name node information is displayed]

Step 5.2:Start resource manager and node manager Daemon(port 8088)

#sbin/start-yarn.sh

#jps

Step 5.3:To stop the running process

#sbin/stop-dfs.sh

#sbin/stop-yarn.sh
OUTPUT:

RESULT:

Thus the Hadoop software is installed and the one-node Hadoop cluster is set up
successfully.
EX NO: 7 MOUNT THE ONE NODE HADOOP CLUSTER USING FUSE

AIM:
To mount the one-node Hadoop cluster using FUSE.
PROCEDURE:

STEP: 1
wget http://archive.cloudera.com/cdh5/one-click-install/trusty/amd64/cdh5-repository_1.0_all.deb

STEP: 2
sudo dpkg -i cdh5-repository_1.0_all.deb

STEP: 3
sudo apt-get update

STEP: 4
sudo apt-get install hadoop-hdfs-fuse

STEP: 5
sudo mkdir -p xyz

STEP: 6
cd hadoop-2.7.0/

STEP: 7
bin/hadoop namenode -format

STEP: 8
sbin/start-all.sh
STEP: 9
hadoop-fuse-dfs dfs://localhost:9000 /home/it08/Downloads/xyz/

STEP: 10
sudo chmod 777 /home/it08/Downloads/xyz/

STEP: 11
hadoop-fuse-dfs dfs://localhost:9000 /home/it08/Downloads/xyz/

STEP: 12
cd /home/it08/Downloads/xyz/

STEP: 13
mkdir a
ls
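STEP: 14 (optional check)
To confirm that the directory created through the FUSE mount also exists in HDFS, list it from the Hadoop side (adjust the path to where hadoop-2.7.0 was extracted):
cd ~/hadoop-2.7.0
bin/hdfs dfs -ls /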
OUTPUT:
RESULT:
Thus the one-node Hadoop cluster is mounted using FUSE successfully.
EX NO: 8 A WORD COUNT PROGRAM USING MAP-REDUCE TASKS

AIM:

To write a wordcount program to demonstrate the use of Map and Reduce tasks.

PROCEDURE:

STEP: 1
Download Hadoop-core-1.2.1.jar, which is used to compile and execute the MapReduce
program. Visit the following link http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-
core/1.2.1 to download the jar. Let us assume the downloaded folder is /home/hadoop/.
STEP: 2
The following commands are used for compiling the WordCount.java program.
javac -classpath hadoop-core-1.2.1.jar -d . WordCount.java

STEP: 3

Create a jar for the program.


jar -cvf sample1.jar sample1/

STEP: 4

cd $HADOOP_PREFIX

bin/hadoop namenode -format

sbin/start-dfs.sh

sbin/start-yarn.sh

jps

STEP: 5

The following command is used to create an input directory in HDFS.


bin/hdfs dfs -mkdir /input

STEP: 6
The following command is used to copy the input file named sal.txt in the input directory of
HDFS.
bin/hdfs dfs -put /home/it08/Downloads/sal.txt /input

STEP: 7

The following command is used to run the application by taking the input files from the input
directory.
bin/hadoop jar /home/it08/Downloads/sample1.jar sample1.WordCount /input /output
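Once the job finishes, the word counts can be viewed (the exact part-file name may vary) with, for example:

bin/hdfs dfs -cat /output/part-r-00000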

PROGRAM:

WordCount.java

package sample1;
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
public class WordCount {
public static class TokenizerMapper
extends Mapper<Object, Text, Text, IntWritable>{

private final static IntWritable one = new IntWritable(1);


private Text word = new Text();
public void map(Object key, Text value, Context context
) throws IOException, InterruptedException {
StringTokenizer itr = new StringTokenizer(value.toString());
while (itr.hasMoreTokens()) {
word.set(itr.nextToken());
context.write(word, one);
}
}
}

public static class IntSumReducer


extends Reducer<Text,IntWritable,Text,IntWritable> {
private IntWritable result = new IntWritable();
public void reduce(Text key, Iterable<IntWritable> values,
Context context
) throws IOException, InterruptedException {
int sum = 0;
for (IntWritable val : values) {
sum += val.get();
}
result.set(sum);
context.write(key, result);
}
}
public static void main(String[] args) throws Exception {
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "word count");
job.setJarByClass(WordCount.class);
job.setMapperClass(TokenizerMapper.class);
job.setCombinerClass(IntSumReducer.class);
job.setReducerClass(IntSumReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
System.exit(job.waitForCompletion(true) ? 0 : 1);
}
}

INPUT FILE: (sal.txt)

1949 12 12 34 23 45 34 12 23
1987 13 11 32 34 45 56 12 34
1997 12 12 12 12 12 11 34 12
1998 23 34 23 34 45 56 23 34
2000 10 11 12 23 14 13 15 16

OUTPUT:
RESULT:

Thus the word count program to demonstrate the use of Map and Reduce tasks is
executed successfully.
EX NO: 9 Convert unstructured data into NoSQL data and perform operations such as NoSQL queries with an API

AIM:

To convert unstructured data into NoSQL format and perform operations on it, such as NoSQL
queries, through an API. This experiment uses Python and MongoDB as an example of a NoSQL
database; the specifics may vary depending on the NoSQL database being used.

PROCEDURE:

Step 1: Install MongoDB and pymongo

Ensure you have MongoDB installed and running. You can install the pymongo library using
pip:

pip install pymongo

Step 2: Connect to MongoDB

from pymongo import MongoClient

# Connect to MongoDB

client = MongoClient('localhost', 27017)

db = client['your_database_name']

collection = db['your_collection_name']

Step 3: Insert Data

Assuming your unstructured data is in a list of dictionaries:

data = [

{"name": "John", "age": 25, "city": "New York"},


{"name": "Jane", "age": 30, "city": "San Francisco"},

# ... more data

]

# Insert data into MongoDB

collection.insert_many(data)

Step 4: Query Data

# Query data

result = collection.find({"age": {"$gt": 25}})

for document in result:

print(document)

Step 5: Update Data

# Update data

collection.update_one({"name": "John"}, {"$set": {"age": 26}})

Step 6: Delete Data

# Delete data

collection.delete_one({"name": "Jane"})

Step 7: Set up API (Optional)

You can expose your MongoDB operations through an API using a framework like Flask or
FastAPI. Here's a basic example using Flask:

from flask import Flask, jsonify, request

app = Flask(__name__)
@app.route('/get_data', methods=['GET'])

def get_data():

result = collection.find({}, {'_id': 0})  # exclude MongoDB's ObjectId so the documents are JSON-serializable

data = [document for document in result]

return jsonify(data)

if __name__ == '__main__':

app.run(debug=True)

This is a simple example, and you might need to add more routes for different operations.

Remember to secure your API and handle errors appropriately. Additionally, adjust the code
according to the NoSQL database you're using, as the syntax for queries and operations may
differ between databases.

RESULT:
Thus unstructured data is converted into NoSQL data and operations such as NoSQL queries
with an API are performed successfully.
EX NO: 10 K-means clustering using MapReduce

K-means clustering is an iterative algorithm that partitions a dataset into K clusters, where each
data point belongs to the cluster with the nearest mean. MapReduce is a programming model for
processing and generating large datasets that can be parallelized across a distributed cluster of
computers. Here's a high-level overview of how you might implement K-means clustering using
MapReduce:

Step-by-Step Procedure:

Initialization:

Randomly select K initial cluster centroids.

Map Phase:

Read the input data distributed across your cluster of machines.

For each data point, calculate the distance to each centroid and emit the data point with the ID of
the nearest centroid.

Mapper Output: <centroid_id, data_point>

Combine/Group Phase:

Group the emitted data points by centroid ID.

This step is typically handled automatically in MapReduce frameworks.

Reduce Phase:

For each group (centroid), calculate the new centroid by computing the mean of the data points
assigned to that cluster.

Reducer Output: <centroid_id, new_centroid>

Update Centroids:
Collect the new centroids from the reducers.

Use these new centroids as the input centroids for the next iteration.

Convergence Check:

Check for convergence by comparing the new centroids with the previous centroids.

If the centroids have not changed significantly, stop the iterations.

Iteration:

If the convergence criteria are not met, repeat the map-reduce steps with the updated centroids.

Final Output:

The final output will be the K centroids representing the clusters.

Pseudocode:

Here is a simplified pseudocode representation of the MapReduce steps:

Map(data_point):
    nearest_id = the centroid_id with the smallest calculate_distance(data_point, centroid)
    emit(nearest_id, data_point)

Reduce(centroid_id, data_points):
    new_centroid = calculate_mean(data_points)
    emit(centroid_id, new_centroid)
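As a concrete illustration of these two phases, here is a minimal, self-contained Python sketch of the same map/reduce round run locally (the sample points and the initial centroids below are made-up values chosen only for demonstration):

import math
from collections import defaultdict

def distance(p, q):
    # Euclidean distance between two points
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def map_phase(points, centroids):
    # Map: emit (nearest_centroid_id, data_point) pairs
    pairs = []
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: distance(p, centroids[i]))
        pairs.append((nearest, p))
    return pairs

def reduce_phase(pairs, old_centroids):
    # Reduce: group points by centroid id and recompute each centroid as the group mean
    groups = defaultdict(list)
    for cid, p in pairs:
        groups[cid].append(p)
    new_centroids = []
    for cid, old in enumerate(old_centroids):
        pts = groups[cid]
        if pts:
            new_centroids.append(tuple(sum(coord) / len(pts) for coord in zip(*pts)))
        else:
            new_centroids.append(old)  # keep the old centroid if no point was assigned to it
    return new_centroids

# Illustrative data: two obvious clusters in 2-D and K = 2 initial centroids
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.0)]
centroids = [(0.0, 0.0), (5.0, 5.0)]

for _ in range(5):  # a fixed number of iterations stands in for the convergence check
    centroids = reduce_phase(map_phase(points, centroids), centroids)

print(centroids)

On a real cluster each map_phase call would run in parallel on a partition of the data, and the updated centroids would be exchanged between iterations as described in the steps above.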

Notes:
The distance calculation and mean calculation functions depend on your data and application
context.

The MapReduce framework, like Hadoop or Apache Spark, handles the distribution of data and
tasks across the cluster.

The number of iterations needed for convergence can vary based on the data and initial centroids.

Keep in mind that the MapReduce paradigm might not be the most efficient for iterative
algorithms like K-means due to the overhead of repeated map and reduce phases. More modern
distributed computing frameworks, like Apache Spark, provide iterative algorithms that are more
optimized for these types of tasks.

RESULT:
Thus K-means clustering using MapReduce is executed successfully.
EX NO: 11 Page Rank Computation

PageRank computation in the context of cloud computing and big data often involves
distributing the computation across multiple nodes to handle the large-scale processing required
for web-scale graphs. PageRank is an algorithm that measures the importance of webpages in a
hyperlink graph, and it was famously used by Google to rank search engine results.

Here's a high-level overview of how PageRank computation can be implemented in a cloud
computing and big data lab environment:

1. Data Representation:

Web graphs are often represented as adjacency lists or matrices, where each node
represents a webpage, and edges represent hyperlinks.

2. Data Partitioning:

Break the web graph into smaller partitions that can be distributed across multiple nodes
in the cloud. This step is crucial for parallel processing.

3. Distributed Storage:

Store the graph data in a distributed storage system like Hadoop Distributed File System
(HDFS) or a cloud-based equivalent (e.g., Amazon S3, Google Cloud Storage).

4. Distributed Computation:

Leverage a distributed computing framework like Apache Hadoop or Apache Spark to
perform the PageRank computation in parallel across multiple nodes.
The MapReduce paradigm is commonly used for such computations. The graph is partitioned
into chunks, and each chunk is processed independently by different nodes.
5. Iterative Algorithm:

PageRank is an iterative algorithm where the scores are updated until convergence. Each
iteration involves processing the graph data and updating the PageRank scores.

6. Fault Tolerance:

Implement fault-tolerant mechanisms to handle failures that may occur in a distributed
environment. For instance, Hadoop and Spark have built-in fault tolerance features.

7. Scaling:
Cloud computing allows you to dynamically scale resources based on the workload. As
the size of the web graph grows or changes, you can allocate more resources to the
computation.

8. Optimizations:

Apply optimization techniques to enhance the efficiency of PageRank computation. This
may include using graph partitioning strategies, optimizing communication between
nodes, and employing caching mechanisms.

9. Result Aggregation:

After several iterations, the final PageRank scores need to be aggregated and presented.
This result can be stored in a distributed storage system or used for further analysis.

10. Monitoring and Visualization:

Implement monitoring tools to keep track of the progress of the computation.
Visualization tools can help in understanding the structure of the web graph and the
distribution of PageRank scores.

In summary, cloud computing and big data technologies provide the infrastructure and tools
necessary to efficiently compute PageRank on large-scale graphs. Leveraging distributed
computing frameworks and storage systems allows for the parallel processing and storage of vast
amounts of data, making it feasible to compute PageRank for the entire web graph.
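As a concrete, single-machine illustration of the iterative update described in step 5 (the distributed version would partition the graph and run the same update with Hadoop or Spark), here is a minimal Python sketch; the tiny four-page graph and the damping factor of 0.85 are illustrative choices, not values from the text:

damping = 0.85
graph = {                 # adjacency list: page -> pages it links to
    'A': ['B', 'C'],
    'B': ['C'],
    'C': ['A'],
    'D': ['A', 'C'],
}
ranks = {page: 1.0 / len(graph) for page in graph}    # uniform initial ranks

for _ in range(20):       # a fixed number of iterations stands in for the convergence check
    contributions = {page: 0.0 for page in graph}
    for page, links in graph.items():
        share = ranks[page] / len(links)              # spread this page's rank over its out-links
        for target in links:
            contributions[target] += share
    # standard damped PageRank update
    ranks = {page: (1 - damping) / len(graph) + damping * contributions[page]
             for page in graph}

for page, rank in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(rank, 4))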

RESULT:
Thus the procedure of the PageRank computation is executed successfully.
