Cloud Computing
EX NO: 1 RUN VIRTUAL MACHINES OF DIFFERENT CONFIGURATIONS
AIM:
To run virtual machines of different configurations and to check how many virtual machines can be utilized at a particular time.
REQUIREMENTS:
PROCEDURE:
RESULT:
Thus virtual machines of different configurations are created successfully and it is checked how many virtual machines can be utilized at a particular time.
EX NO: 2 ATTACH A VIRTUAL BLOCK TO A VIRTUAL MACHINE
AIM:
To attach a virtual block to a virtual machine and to check whether it holds the data even after the release of the virtual machine.
REQUIREMENTS:
PROCEDURE:
EX NO: 3 INSTALL A C COMPILER IN THE VIRTUAL MACHINE AND EXECUTE A SAMPLE PROGRAM
METHOD 1:
AIM:
To install a C compiler in the virtual machine and execute a sample program.
REQUIREMENTS:
PROCEDURE:
STEP 1:
ubuntu_gt6 installation:
STEP 3:
STEP 4:
gedit hello.c
STEP 5:
gcc hello.c
./a.out
OUTPUT:
RESULT:
Thus the C Compiler in the Virtual Machine is installed and a sample program is
executed successfully.
EX NO: 4 SHOW THE VIRTUAL MACHINE MIGRATION BASED ON A CERTAIN CONDITION FROM ONE NODE TO ANOTHER
AIM:
To show the virtual machine migration based on a certain condition from one node to another.
REQUIREMENTS:
PROCEDURE:
STEP:1
STEP:2
STEP:3
a. Click on Infrastructure.
b. Select Clusters and enter the cluster name.
c. Then select the Hosts tab and select all hosts.
d. Then select the VNets tab and select all vnets.
e. Then select the Datastores tab and select all datastores.
f. Then choose Hosts under the Infrastructure tab.
g. Click on the + symbol to add a new host, name the host, then click on Create.
STEP:4
Before migration
Host:one-sandbox
After Migration:
Host:one-sandbox
RESULT:
Thus the virtual machine migration based on a certain condition from one node to another has been executed successfully.
EX NO: 5 STORAGE CONTROLLER
AIM:
To install the storage controller (Cinder) and to interact with it.
REQUIREMENTS:
PROCEDURE:
Nova compute instances support the attachment and detachment of Cinder storage volumes. This procedure details the steps involved in creating a logical volume in the cinder-volumes volume group using the Cinder command-line interface.
METHOD: 2
STEP: 1
Select the appropriate project from the drop down menu at the top left.
STEP: 3
On the Project tab, open the Compute tab and click Access & Security category.
STEP: 4
On the Access & Security tab, click the API Access category and click Download OpenStack RC File v2.0.
STEP: 5
source ./admin-openrc.sh
STEP: 6
Create a Cinder volume. Use the cinder create command to create a new volume:
$ cinder create --display_name NAME SIZE
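For example, a 1 GB volume could be created as follows (the volume name demo-volume and the size are illustrative values, not part of the exercise):
$ cinder create --display_name demo-volume 1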
OUTPUT:
RESULT:
Thus the storage controller is installed and interaction with it is done successfully.
EX NO: 6 HADOOP INSTALLATION
AIM:
To install the Hadoop software and to set up a one-node Hadoop cluster.
PROCEDURE:
STEP 1: INSTALLING JAVA
$cd jdk1.8.0-60/
$pwd
(the path printed by pwd is the JDK location; set JAVA_HOME to this path)
PATH=$PATH:$JAVA_HOME/bin
#java -version
STEP 2:INSTALLING HADOOP
#cd ..
#pwd
(note the path printed by pwd)
HADOOP_PREFIX=/home/Downloads/hadoop-2.7.0
PATH=$PATH:$HADOOP_PREFIX/bin
#cd $HADOOP_PREFIX
#bin/hadoop version
Hadoop 2.7.0
#nano hadoop-env.sh
export JAVA_HOME=/home/Downloads/jdk
export HADOOP_PREFIX=/home/Downloads/hadoop-2.7.0
STEP 3: INSTALLING THE OPENSSH PACKAGE (configuring SSH)
#ssh localhost
#ssh-keygen
#ssh-copy-id -i localhost
Press y and type yes when prompted, then enter the password.
STEP 4: EDIT THE HADOOP CONFIGURATION FILES (in etc/hadoop under the Hadoop directory)
$nano core-site.xml
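A typical single-node core-site.xml setting (an assumption here, chosen to be consistent with the dfs://localhost:9000 address used in EX NO: 7) is:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>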
$nano hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
$nano mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
$nano yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
STEP 5: FORMAT THE HDFS FILE SYSTEM (via the NameNode, first time only) AND START THE DAEMONS
#cd $HADOOP_PREFIX
#bin/hadoop namenode -format
#sbin/start-dfs.sh
Press y and type yes when prompted.
#jps
(In a browser, open http://localhost:50070; the NameNode information is displayed.)
#sbin/start-yarn.sh
#jps
#sbin/stop-dfs.sh
#sbin/stop-yarn.sh
OUTPUT:
RESULT:
Thus the Hadoop software is installed and the one-node Hadoop cluster is set up successfully.
EX NO: 7 MOUNT THE ONE NODE HADOOP CLUSTER USING FUSE
AIM:
To mount the one-node Hadoop cluster using FUSE.
PROCEDURE:
STEP: 1
wget http://archive.cloudera.com/cdh5/one-click-install/trusty/amd64/cdh5-repository_1.0_all.deb
STEP: 2
sudo dpkg -i cdh5-repository_1.0_all.deb
STEP: 3
sudo apt-get update
STEP: 4
sudo apt-get install hadoop-hdfs-fuse
STEP: 5
sudo mkdir -p xyz
STEP: 6
cd hadoop-2.7.0/
STEP: 7
bin/hadoop namenode -format
STEP: 8
sbin/start-all.sh
STEP: 9
hadoop-fuse-dfs dfs://localhost:9000 /home/it08/Downloads/xyz/
STEP: 10
sudo chmod 777 /home/it08/Downloads/xyz/
STEP: 11
hadoop-fuse-dfs dfs://localhost:9000 /home/it08/Downloads/xyz/
STEP: 12
cd /home/it08/Downloads/xyz/
STEP: 13
mkdir a
ls
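The directory a created above is written through the FUSE mount point into HDFS; it can also be verified from the Hadoop directory with, for example, bin/hdfs dfs -ls /.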
OUTPUT:
RESULT:
Thus one node Hadoop cluster is mounted using FUSE successfully.
EX NO: 8 A WORD COUNT PROGRAM USING MAP-REDUCE TASKS
AIM:
To write a word count program to demonstrate the use of Map and Reduce tasks.
PROCEDURE:
STEP: 1
Download hadoop-core-1.2.1.jar, which is used to compile and execute the MapReduce program. Visit the following link http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-core/1.2.1 to download the jar. Let us assume the downloaded folder is /home/hadoop/.
STEP: 2
The following commands are used for compiling the WordCount.java program.
javac -classpath hadoop-core-1.2.1.jar -d . WordCount.java
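Because WordCount.java declares package sample;, the -d . option places the compiled .class files in a sample/ sub-directory of the current directory.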
STEP: 3
Package the compiled classes into a jar file (for example, with the jar -cvf command) so that it can be submitted with the hadoop jar command in STEP: 7.
STEP: 4
cd $HADOOP_PREFIX
sbin/start-dfs.sh
sbin/start-yarn.sh
jps
STEP: 5
Create an input directory in HDFS, for example with bin/hdfs dfs -mkdir /input, into which the input file is copied in the next step.
STEP: 6
The following command is used to copy the input file named sal.txt in the input directory of
HDFS.
bin/hdfs dfs -put /home/it08/Downloads/sal.txt /input
STEP: 7
The following command is used to run the application by taking the input files from the input
directory.
bin/hadoop jar /home/it08/Downloads/sample1.jar sample1.WordCount /input /output
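After the job completes, the resulting word counts can be inspected with, for example, bin/hdfs dfs -cat /output/*.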
PROGRAM:
WordCount.java
package sample;
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
public class WordCount {
public static class TokenizerMapper
extends Mapper<Object, Text, Text, IntWritable> {
private final static IntWritable one = new IntWritable(1);
private Text word = new Text();
public void map(Object key, Text value, Context context)
throws IOException, InterruptedException {
StringTokenizer itr = new StringTokenizer(value.toString());
while (itr.hasMoreTokens()) {
word.set(itr.nextToken());
context.write(word, one);
}
}
}
// The IntSumReducer class and the main() job driver follow the standard Hadoop WordCount pattern.
}
Sample input data (sal.txt):
1949 12 12 34 23 45 34 12 23
1987 13 11 32 34 45 56 12 34
1997 12 12 12 12 12 11 34 12
1998 23 34 23 34 45 56 23 34
2000 10 11 12 23 14 13 15 16
OUTPUT:
RESULT:
Thus the word count program demonstrating the use of Map and Reduce tasks is executed successfully.
EX NO: 9 CONVERT UNSTRUCTURED DATA INTO NoSQL DATA AND PERFORM ALL OPERATIONS SUCH AS NoSQL QUERIES WITH AN API
AIM:
To convert unstructured data into NoSQL data and to perform operations such as NoSQL queries with an API. This exercise uses Python and MongoDB as an example of a NoSQL database; the specifics may vary depending on the NoSQL database being used.
PROCEDURE:
Ensure you have MongoDB installed and running. You can install the pymongo library using pip:
pip install pymongo
from pymongo import MongoClient
# Connect to MongoDB
client = MongoClient('mongodb://localhost:27017/')
db = client['your_database_name']
collection = db['your_collection_name']
# Insert documents (illustrative example records)
data = [
    {"name": "John", "age": 30, "city": "New York"},
    {"name": "Jane", "age": 25, "city": "London"}
]
collection.insert_many(data)
# Query data
for document in collection.find({}):
    print(document)
# Update data
collection.update_one({"name": "John"}, {"$set": {"age": 31}})
# Delete data
collection.delete_one({"name": "Jane"})
You can expose your MongoDB operations through an API using a framework like Flask or
FastAPI. Here's a basic example using Flask:
from flask import Flask, jsonify
from pymongo import MongoClient
app = Flask(__name__)
client = MongoClient('mongodb://localhost:27017/')
collection = client['your_database_name']['your_collection_name']
@app.route('/get_data', methods=['GET'])
def get_data():
    # Exclude _id, which is not JSON serializable by default
    data = list(collection.find({}, {"_id": 0}))
    return jsonify(data)
if __name__ == '__main__':
    app.run(debug=True)
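Once the server is running, the stored documents can be retrieved by opening http://127.0.0.1:5000/get_data (Flask's default address and port) in a browser or any HTTP client.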
This is a simple example, and you might need to add more routes for different operations.
Remember to secure your API and handle errors appropriately. Additionally, adjust the code
according to the NoSQL database you're using, as the syntax for queries and operations may
differ between databases.
RESULT:
Thus unstructured data is converted into NoSQL data and all operations such as NoSQL queries with an API are performed successfully.
EX NO: 10 K-MEANS CLUSTERING USING MAPREDUCE
K-means clustering is an iterative algorithm that partitions a dataset into K clusters, where each
data point belongs to the cluster with the nearest mean. MapReduce is a programming model for
processing and generating large datasets that can be parallelized across a distributed cluster of
computers. Here's a high-level overview of how you might implement K-means clustering using
MapReduce:
Step-by-Step Procedure:
Initialization:
Choose K initial centroids (for example, K randomly selected data points) and make them available to every mapper.
Map Phase:
For each data point, calculate the distance to each centroid and emit the data point with the ID of
the nearest centroid.
Combine/Group Phase:
Group the emitted (centroid ID, data point) pairs by centroid ID; an optional combiner can pre-aggregate partial sums locally to reduce network traffic.
Reduce Phase:
For each group (centroid), calculate the new centroid by computing the mean of the data points
assigned to that cluster.
Update Centroids:
Collect the new centroids from the reducers.
Use these new centroids as the input centroids for the next iteration.
Convergence Check:
Check for convergence by comparing the new centroids with the previous centroids, for example by testing whether the maximum centroid movement falls below a small tolerance.
Iteration:
If the convergence criteria are not met, repeat the map-reduce steps with the updated centroids.
Final Output:
Output the final centroids and the cluster assignment of each data point.
Pseudocode:
Map(data_point):
    nearest_id = id of the centroid closest to data_point
    emit(nearest_id, data_point)
Reduce(centroid_id, data_points):
    new_centroid = calculate_mean(data_points)
    emit(centroid_id, new_centroid)
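The following is a minimal Python sketch that simulates this MapReduce-style K-means on a single machine; the sample points, the two initial centroids, the tolerance, and the iteration limit are illustrative assumptions, not values from the exercise.
import math
from collections import defaultdict

def distance(p, q):
    # Euclidean distance between two points
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def map_phase(points, centroids):
    # Map: emit (nearest centroid id, data point) for every point
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: distance(p, centroids[i]))
        yield nearest, p

def reduce_phase(grouped):
    # Reduce: new centroid = mean of the points assigned to it
    return {cid: tuple(sum(xs) / len(pts) for xs in zip(*pts))
            for cid, pts in grouped.items()}

def kmeans(points, centroids, tol=1e-4, max_iter=20):
    for _ in range(max_iter):
        grouped = defaultdict(list)
        for cid, p in map_phase(points, centroids):      # map
            grouped[cid].append(p)                       # shuffle/group by centroid id
        new = reduce_phase(grouped)                      # reduce
        new_centroids = [new.get(i, c) for i, c in enumerate(centroids)]
        moved = max(distance(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if moved < tol:                                  # convergence check
            break
    return centroids

if __name__ == '__main__':
    data = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 8.5)]
    print(kmeans(data, centroids=[(1.0, 1.0), (8.0, 8.0)]))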
Notes:
The distance calculation and mean calculation functions depend on your data and application
context.
The MapReduce framework, like Hadoop or Apache Spark, handles the distribution of data and
tasks across the cluster.
The number of iterations needed for convergence can vary based on the data and initial centroids.
Keep in mind that the MapReduce paradigm might not be the most efficient for iterative
algorithms like K-means due to the overhead of repeated map and reduce phases. More modern
distributed computing frameworks, like Apache Spark, provide iterative algorithms that are more
optimized for these types of tasks.
RESULT:
Thus K-means clustering using map reduce is executed successfully
EX NO: 11 PAGERANK COMPUTATION
PageRank computation in the context of cloud computing and big data often involves
distributing the computation across multiple nodes to handle the large-scale processing required
for web-scale graphs. PageRank is an algorithm that measures the importance of webpages in a
hyperlink graph, and it was famously used by Google to rank search engine results.
1. Data Representation:
Web graphs are often represented as adjacency lists or matrices, where each node
represents a webpage, and edges represent hyperlinks.
2. Data Partitioning:
Break the web graph into smaller partitions that can be distributed across multiple nodes
in the cloud. This step is crucial for parallel processing.
3. Distributed Storage:
Store the graph data in a distributed storage system like Hadoop Distributed File System
(HDFS) or a cloud-based equivalent (e.g., Amazon S3, Google Cloud Storage).
4. Distributed Computation:
PageRank is an iterative algorithm in which the scores are updated until convergence. Each iteration processes the graph data and updates the PageRank scores (a minimal single-machine sketch of one such iteration is shown after this list).
6. Fault Tolerance:
Distributed frameworks provide fault tolerance by replicating data and re-executing failed tasks, so the iterative computation can survive node failures.
7. Scaling:
Cloud computing allows you to dynamically scale resources based on the workload. As
the size of the web graph grows or changes, you can allocate more resources to the
computation.
8. Optimizations:
9. Result Aggregation:
After several iterations, the final PageRank scores need to be aggregated and presented.
This result can be stored in a distributed storage system or used for further analysis.
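As a concrete illustration of the iterative computation in step 4 above, the following is a minimal single-machine Python sketch of PageRank written as per-iteration map and reduce style phases; the four-page graph, the damping factor of 0.85, and the iteration count are illustrative assumptions, not values from the exercise.
from collections import defaultdict

def pagerank(graph, damping=0.85, iterations=20):
    # graph: dict mapping each page to the list of pages it links to
    pages = list(graph)
    n = len(pages)
    ranks = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Map phase: each page distributes its current rank evenly over its outlinks
        contributions = defaultdict(float)
        for page, outlinks in graph.items():
            if outlinks:
                share = ranks[page] / len(outlinks)
                for target in outlinks:
                    contributions[target] += share
        # Reduce phase: sum the contributions per page and apply the damping factor
        ranks = {p: (1 - damping) / n + damping * contributions[p] for p in pages}
    return ranks

if __name__ == '__main__':
    # Hypothetical four-page web graph (adjacency-list representation, as in step 1)
    graph = {
        'A': ['B', 'C'],
        'B': ['C'],
        'C': ['A'],
        'D': ['C'],
    }
    for page, score in sorted(pagerank(graph).items()):
        print(page, round(score, 4))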
In summary, cloud computing and big data technologies provide the infrastructure and tools
necessary to efficiently compute PageRank on large-scale graphs. Leveraging distributed
computing frameworks and storage systems allows for the parallel processing and storage of vast
amounts of data, making it feasible to compute PageRank for the entire web graph.
RESULT:
Thus the procedure of the Page Rank Computation and process is executed successfully