Hadoop Lab


Approved by AICTE | Affiliated to VTU | Recognized by UGC with 2(f) & 12(B) status | Accredited by NBA and NAAC

Department of Information Science and Engineering

Course Code: MVJ19ISL77 | Course Title: Bigdata and Hadoop Lab
Lecture Hours: 40 | L:T:P: 0:0:40 | Credit: 2
List of Universities/Institutes referred in framing the Syllabus:
AICTE, VTU, BMSCE, Anna University, VIT
Course objectives: This course will enable students to
• Understand the Hadoop Distributed File System and examine MapReduce programming
• Explore Hadoop tools and manage Hadoop with Ambari
• Appraise the role of Business Intelligence and its applications across industries
• Assess core data mining techniques for data analytics
• Identify various text mining techniques
Prerequisites:
• Basics of Java and Linux
Description:
• The programs can be implemented in Java.
• Hadoop can be installed in three modes of operation:
  • Standalone Mode: Hadoop is distributed software designed to run on a cluster of commodity machines. However, it can also be installed on a single node in standalone mode, where it runs as a single monolithic Java process. This mode is extremely useful for debugging: you can first test your MapReduce application on a small data set before executing it on a cluster with big data.
  • Pseudo-Distributed Mode: Hadoop is again installed on a single node, but its daemons run on that machine as separate Java processes. All the daemons, namely NameNode, DataNode, Secondary NameNode, JobTracker, and TaskTracker, therefore run on a single machine.
  • Fully Distributed Mode: The NameNode, JobTracker, and SecondaryNameNode daemons run on the master node (the SecondaryNameNode is optional; it can also be run on a separate node). The DataNode and TaskTracker daemons run on the slave nodes.
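The daemon placement described for the three modes can be summarised in a small lookup. The following is only an illustrative sketch in plain Java: the class `HadoopModesSketch`, its enums, and the method `daemonsOn` are hypothetical names invented for this example and are not part of Hadoop. It encodes nothing beyond the placement stated in the description.

```java
import java.util.List;

// Illustrative only: models which daemons the description places on which node.
// None of these names belong to the Hadoop API; they are assumptions for this sketch.
class HadoopModesSketch {
    enum Mode { STANDALONE, PSEUDO_DISTRIBUTED, FULLY_DISTRIBUTED }
    enum Node { SINGLE, MASTER, SLAVE }

    // Returns the daemons that run as separate Java processes on the given node.
    static List<String> daemonsOn(Mode mode, Node node) {
        switch (mode) {
            case STANDALONE:
                // One monolithic Java process, so no separate daemons.
                return List.of();
            case PSEUDO_DISTRIBUTED:
                // Every daemon runs on the same single machine.
                return node == Node.SINGLE
                        ? List.of("NameNode", "DataNode", "SecondaryNameNode",
                                  "JobTracker", "TaskTracker")
                        : List.of();
            case FULLY_DISTRIBUTED:
                // SecondaryNameNode is optional and may instead run on a separate node.
                if (node == Node.MASTER) {
                    return List.of("NameNode", "JobTracker", "SecondaryNameNode");
                }
                if (node == Node.SLAVE) {
                    return List.of("DataNode", "TaskTracker");
                }
                return List.of();
            default:
                throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        System.out.println(daemonsOn(Mode.FULLY_DISTRIBUTED, Node.MASTER));
        System.out.println(daemonsOn(Mode.FULLY_DISTRIBUTED, Node.SLAVE));
    }
}
```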

Experiment No | Experiment Name | Revised Bloom's Taxonomy (RBT) Level
1 | Implement the following data structures in Java: a) Linked Lists b) Stacks | L3
2 | Implement the following data structures in Java: a) Queues b) Set c) Map | L3
3 | Perform setting up and installing Hadoop in its three operating modes: Standalone, Pseudo-distributed, Fully distributed | L3
4 | Use web-based tools to monitor your Hadoop setup. | L3
5 | Implement the following file management tasks in Hadoop: adding files and directories; retrieving files; deleting files. Hint: A typical Hadoop workflow creates data files (such as log files) elsewhere and copies them into HDFS using one of the above command line utilities. | L3
6 | Run a basic Word Count MapReduce program to understand the MapReduce paradigm. | L3
7 | Write a MapReduce program that mines weather data. Weather sensors collecting data every hour at many locations across the globe gather a large volume of log data, which is a good candidate for analysis with MapReduce, since it is semi-structured and record-oriented. | L3
8 | Implement Matrix Multiplication with Hadoop MapReduce | L3
9 | Install and run Pig, then write Pig Latin scripts to sort, group, join, project, and filter your data. | L3
10 | Install and run Hive, then use Hive to create, alter, and drop databases, tables, views, functions, and indexes | L3
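As preparation for the Word Count experiment, the map, shuffle, and reduce steps can be traced in plain Java before moving to a cluster. The block below is a minimal sketch under stated assumptions: it does not use the Hadoop API, it runs in a single JVM, and the class `WordCountSketch` and its method names are hypothetical. It only imitates the data flow that a real MapReduce job distributes across nodes.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the map -> shuffle -> reduce flow behind Word Count.
// It does NOT use the Hadoop API; all names here are assumptions for this sketch.
class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                pairs.add(new SimpleEntry<>(token, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by key, with keys kept in sorted order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for one word.
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Driver: run the three phases in sequence over all input lines.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(emitted).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("Hadoop stores big data", "Hadoop processes big data");
        // Prints {big=2, data=2, hadoop=2, processes=1, stores=1}
        System.out.println(wordCount(lines));
    }
}
```

In an actual Hadoop job the map and reduce steps would be written as Mapper and Reducer classes and the grouping would be done by the framework; the sketch keeps everything in one process so the intermediate pairs are easy to inspect.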
Course Code (CO) | Course Outcome
The students should be able to:
CO1 | Master the concepts of HDFS and MapReduce framework
CO2 | Investigate Hadoop-related tools for Big Data Analytics and perform basic Hadoop Administration
CO3 | Recognize the role of Business Intelligence, Data Warehousing and Visualization in decision making
CO4 | Infer the importance of core data mining techniques for data analytics
CO5 | Compare and contrast different Text Mining Techniques
Mapping of Course Outcomes to Program Outcomes
Course Code PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2

CO1 3 3 3 3 2 2
CO2 3 3 3 3 2 2
CO3 3 3 3 3 2 2
CO4 3 3 3 2 2
CO5 3 3 3 3

Average 3 3 3 3 3 2 2

High:3 Medium:2 Low:1


Faculty Name & Signature
(Prepared by)
