Hadoop Lab

This document provides information about the Big Data and Hadoop Lab course offered by the Department of Information Science and Engineering. The course objectives are to understand Hadoop, MapReduce programming, Hadoop tools, and applications of data mining and text mining techniques. The course involves experiments implementing data structures in Java, setting up Hadoop in different modes, writing MapReduce programs, and using tools like Pig and Hive. The course outcomes are to master Hadoop concepts and tools for big data analytics and understand the role of business intelligence in decision making.

Uploaded by: sharmila

Approved by AICTE | Affiliated to VTU | Recognized by UGC with 2(f) & 12(B) status | Accredited by NBA and NAAC

Department of Information Science and Engineering

Course Code: MVJ19ISL77
Course Title: Bigdata and Hadoop Lab
Lecture Hours: 40
L:T:P: 0:0:40
Credit: 2
List of Universities/Institutes referred in framing the Syllabus:
AICTE, VTU, BMSCE, Anna University, VIT
Course objectives: This course will enable students to
• Understand the Hadoop Distributed File System and examine MapReduce programming
• Explore Hadoop tools and manage Hadoop with Ambari
• Appraise the role of business intelligence and its applications across industries
• Assess core data mining techniques for data analytics
• Identify various text mining techniques
Prerequisites:
• Basics of Java and Linux
Description:
• The programs can be implemented in Java.
• Hadoop can be installed in three modes of operation:
  - Standalone Mode: Hadoop is distributed software designed to run on a cluster of commodity machines. However, it can also be installed on a single node in standalone mode, where it runs as a single monolithic Java process. This mode is extremely useful for debugging: you can first test-run your MapReduce application on small data before executing it on a cluster with big data.
  - Pseudo-Distributed Mode: Hadoop is again installed on a single node, but its daemons run on the same machine as separate Java processes. Hence all the daemons, namely NameNode, DataNode, Secondary NameNode, JobTracker, and TaskTracker, run on a single machine. (JobTracker and TaskTracker are Hadoop 1.x daemons; in Hadoop 2 and later, YARN's ResourceManager and NodeManager take over their roles.)
  - Fully Distributed Mode: The daemons NameNode, JobTracker, and Secondary NameNode (optional; it can also run on a separate node) run on the master node. The daemons DataNode and TaskTracker run on the slave nodes.
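
The idea of test-running MapReduce logic on small data before moving to a cluster can be illustrated without Hadoop at all. The following is a minimal sketch, assuming plain Java 9 or later; the class name LocalWordCount and its methods are hypothetical, and no Hadoop classes are used. It imitates the map, shuffle, and reduce phases of a word count in memory. In a real job, Hadoop's own Mapper and Reducer classes would take the place of the map and reduce methods, and the framework would perform the shuffle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory imitation of the map, shuffle, and reduce phases (no Hadoop classes). */
class LocalWordCount {

    /** Map phase: emit a (word, 1) pair for every word on one input line. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                pairs.add(Map.entry(token, 1));
            }
        }
        return pairs;
    }

    /** Reduce phase: sum all values collected for a single word. */
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    /** Runs map, then groups values by key (the shuffle), then reduces each group. */
    static Map<String, Integer> run(List<String> lines) {
        // A TreeMap keeps keys sorted, mirroring the sorted keys a reducer receives.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            counts.put(group.getKey(), reduce(group.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> sample = List.of("big data and hadoop", "hadoop and pig and hive");
        // prints {and=3, big=1, data=1, hadoop=2, hive=1, pig=1}
        System.out.println(run(sample));
    }
}
```

Keeping the map and reduce logic in small, side-effect-free methods like these makes it easy to check on a handful of lines before the same logic is moved into a Hadoop job.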

Experiment No | Experiment Name | Revised Bloom's Taxonomy (RBT) Level
1 | Implement the following data structures in Java: a) Linked Lists b) Stacks | L3
2 | Implement the following data structures in Java: a) Queues b) Set c) Map | L3
3 | Set up and install Hadoop in its three operating modes: Standalone, Pseudo-distributed, Fully distributed | L3
4 | Use web-based tools to monitor your Hadoop setup. | L3
5 | Implement the following file management tasks in Hadoop: adding files and directories; retrieving files; deleting files. Hint: A typical Hadoop workflow creates data files (such as log files) elsewhere and copies them into HDFS using the HDFS command-line utilities. | L3
6 | Run a basic Word Count MapReduce program to understand the MapReduce paradigm. | L3
7 | Write a MapReduce program that mines weather data. Weather sensors collecting data every hour at many locations across the globe gather a large volume of log data, which is a good candidate for analysis with MapReduce, since it is semi-structured and record-oriented. | L3
8 | Implement matrix multiplication with Hadoop MapReduce. | L3
9 | Install and run Pig, then write Pig Latin scripts to sort, group, join, project, and filter your data. | L3
10 | Install and run Hive, then use Hive to create, alter, and drop databases, tables, views, functions, and indexes. | L3
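
As an illustration of how matrix multiplication can be phrased in MapReduce terms, here is a minimal sketch of the common one-pass formulation, run in memory in plain Java. It is not a Hadoop job: the class LocalMatrixMultiply, the nested class Tagged, and the "i,k" string key are all hypothetical names chosen for this example, and the sketch assumes the column count of A equals the row count of B. The map step sends every cell of A and B to each output cell (i, k) that needs it, and the reduce step for one key pairs the values on the shared index j and sums the products.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory sketch of one-pass MapReduce matrix multiplication C = A x B. */
class LocalMatrixMultiply {

    /** One emitted value: source matrix ('A' or 'B'), shared index j, and the cell value. */
    static final class Tagged {
        final char matrix;
        final int j;
        final int value;

        Tagged(char matrix, int j, int value) {
            this.matrix = matrix;
            this.j = j;
            this.value = value;
        }
    }

    /** Map and shuffle: replicate each cell to every output key "i,k" that uses it. */
    static Map<String, List<Tagged>> mapAndShuffle(int[][] a, int[][] b) {
        int m = a.length, n = b.length, p = b[0].length;
        Map<String, List<Tagged>> grouped = new HashMap<>();
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                for (int k = 0; k < p; k++) {
                    // a[i][j] contributes to every cell in row i of C.
                    grouped.computeIfAbsent(i + "," + k, key -> new ArrayList<>())
                           .add(new Tagged('A', j, a[i][j]));
                    // b[j][k] contributes to every cell in column k of C.
                    grouped.computeIfAbsent(i + "," + k, key -> new ArrayList<>())
                           .add(new Tagged('B', j, b[j][k]));
                }
            }
        }
        return grouped;
    }

    /** Reduce for one key (i, k): match A and B values on j and sum the products. */
    static int reduce(List<Tagged> values, int n) {
        int[] fromA = new int[n];
        int[] fromB = new int[n];
        for (Tagged t : values) {
            if (t.matrix == 'A') {
                fromA[t.j] = t.value;
            } else {
                fromB[t.j] = t.value;
            }
        }
        int sum = 0;
        for (int j = 0; j < n; j++) {
            sum += fromA[j] * fromB[j];
        }
        return sum;
    }

    static int[][] multiply(int[][] a, int[][] b) {
        int m = a.length, n = b.length, p = b[0].length;
        int[][] c = new int[m][p];
        for (Map.Entry<String, List<Tagged>> group : mapAndShuffle(a, b).entrySet()) {
            String[] ik = group.getKey().split(",");
            c[Integer.parseInt(ik[0])][Integer.parseInt(ik[1])] = reduce(group.getValue(), n);
        }
        return c;
    }

    public static void main(String[] args) {
        int[][] a = {{1, 2}, {3, 4}};
        int[][] b = {{5, 6}, {7, 8}};
        // prints [[19, 22], [43, 50]]
        System.out.println(Arrays.deepToString(multiply(a, b)));
    }
}
```

The replication in the map step is what makes this formulation costly in network traffic for large matrices; it is shown here only because it maps directly onto a single map phase and a single reduce phase.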
Course Outcomes (CO)
The students should be able to:
CO1: Master the concepts of HDFS and the MapReduce framework
CO2: Investigate Hadoop-related tools for Big Data Analytics and perform basic Hadoop administration
CO3: Recognize the role of Business Intelligence, Data Warehousing and Visualization in decision making
CO4: Infer the importance of core data mining techniques for data analytics
CO5: Compare and contrast different text mining techniques
Mapping of Course Outcomes to Program Outcomes
Course Code PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2

CO1 3 3 3 3 2 2
CO2 3 3 3 3 2 2
CO3 3 3 3 3 2 2
CO4 3 3 3 2 2
CO5 3 3 3 3

Average 3 3 3 3 3 2 2

High:3 Medium:2 Low:1

Faculty Name & Signature
(Prepared by)
