Hadoop Distributed File System
Bhavneet Kaur
B.Tech Computer Science, 2nd year
What is HDFS?
A distributed file system that provides reliable data storage for very large data sets across clusters of commodity machines.
Basic Features
Highly fault-tolerant:
Hardware failure is the norm rather than the exception.
An HDFS instance may consist of hundreds or thousands of server machines, each storing part of the file system's data.
Because each component has a non-trivial probability of failure, some component of HDFS is always non-functional.
Therefore, detection of faults and quick, automatic recovery from them is a core architectural goal of HDFS.
Large Data Sets
Applications that run on HDFS have large data sets.
A typical file in HDFS is gigabytes to terabytes in size. Thus, HDFS is tuned to support large files.
It should provide high aggregate data bandwidth and scale to hundreds of nodes in a single cluster.
It should support tens of millions of files in a single instance.
Architecture
HDFS Architecture
[Figure: HDFS architecture. The Namenode holds the metadata (file names and replica counts, e.g. /home/foo/data, replicas: 6) and answers clients' metadata ops; it issues block ops to the Datanodes. Clients read from and write to the Datanodes directly. Datanodes, spread across racks (Rack 1, Rack 2), store the blocks and replicate them to one another.]
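The client-side view of this architecture can be sketched with the standard Java FileSystem API (the Namenode address and file path below are placeholders, not from the slides): open() is a metadata operation answered by the Namenode, while read() streams bytes directly from Datanodes.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ReadExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // fs.defaultFS points at the Namenode; the address is a placeholder.
            conf.set("fs.defaultFS", "hdfs://namenode:8020");
            FileSystem fs = FileSystem.get(conf);

            // open() is a metadata op: the Namenode returns the block locations.
            try (FSDataInputStream in = fs.open(new Path("/home/foo/data"))) {
                byte[] buf = new byte[4096];
                int n;
                // read() pulls the bytes from Datanodes, not the Namenode.
                while ((n = in.read(buf)) > 0) {
                    System.out.write(buf, 0, n);
                }
            }
        }
    }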
File System Namespace
HDFS exposes a hierarchical namespace of directories and files.
Files and directories can be created, removed, moved, renamed, etc.
The Namenode maintains the file system namespace.
Any change to the file system meta information is recorded by the Namenode.
An application can specify the number of replicas of a file it needs: the replication factor of the file. This information is stored in the Namenode.
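A short sketch of these namespace operations through the Java FileSystem API (paths are hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class NamespaceOps {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());

            fs.mkdirs(new Path("/home/foo"));                         // create a directory
            fs.rename(new Path("/home/foo"), new Path("/home/bar"));  // move/rename
            fs.delete(new Path("/home/bar"), true);                   // remove, recursively
            // Each call is a metadata op recorded by the Namenode.
        }
    }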
Data Replication
HDFS is designed to reliably store very large files across machines in a large cluster.
Each file is stored as a sequence of blocks, and the blocks are replicated for fault tolerance. The block size and the replication factor are configurable per file.
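As a sketch, both parameters can be chosen per file at creation time through the Java API (all values below are illustrative):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CreateWithReplication {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());

            // create(path, overwrite, bufferSize, replication, blockSize)
            try (FSDataOutputStream out = fs.create(
                    new Path("/home/foo/data"),
                    true,                   // overwrite if the file exists
                    4096,                   // I/O buffer size in bytes
                    (short) 6,              // replication factor for this file
                    128L * 1024 * 1024)) {  // block size: 128 MB
                out.writeUTF("hello hdfs");
            }
        }
    }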
Replica Placement
The placement of the replicas is critical to reliability and bandwidth utilization, and is still a research topic.
A cluster spans many racks; communication between racks goes through switches and is slower than communication within a rack. The Namenode knows the rack id of each Datanode.
Replicas are typically placed on unique racks.
This is simple but non-optimal: writes are expensive, since each replica is pushed to a different rack.
The default replication factor is 3.
Better placement policies are another research topic.
Replica Selection
For a READ operation, HDFS tries to satisfy the request from the replica closest to the reader: a replica on the same rack as the reader node is preferred, to minimize global bandwidth consumption and read latency.
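A small sketch of how a client can observe where the replicas of each block were placed, using FileSystem.getFileBlockLocations (hypothetical path):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ShowBlockPlacement {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            FileStatus status = fs.getFileStatus(new Path("/home/foo/data"));

            // One BlockLocation per block; getHosts() lists the Datanodes
            // holding that block's replicas.
            for (BlockLocation loc : fs.getFileBlockLocations(status, 0, status.getLen())) {
                System.out.printf("offset %d, length %d, hosts %s%n",
                        loc.getOffset(), loc.getLength(),
                        String.join(",", loc.getHosts()));
            }
        }
    }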
Safemode Startup
On startup, the Namenode enters Safemode.
Replication of data blocks does not occur in Safemode.
Each Datanode checks in with a Heartbeat and a BlockReport.
The Namenode verifies that each block has an acceptable number of replicas.
After a configurable percentage of safely replicated blocks has checked in, the Namenode exits Safemode.
It then compiles the list of blocks that still need to be replicated, and proceeds to replicate these blocks on other Datanodes.
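A minimal sketch of querying Safemode from a client, assuming the Hadoop 2.x DistributedFileSystem API (the shell equivalent is hdfs dfsadmin -safemode get):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.hdfs.DistributedFileSystem;
    import org.apache.hadoop.hdfs.protocol.HdfsConstants.SafeModeAction;

    public class SafemodeCheck {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            DistributedFileSystem dfs = (DistributedFileSystem) fs;

            // SAFEMODE_GET queries the current state without changing it.
            boolean inSafemode = dfs.setSafeMode(SafeModeAction.SAFEMODE_GET);
            System.out.println("Namenode in Safemode: " + inSafemode);
        }
    }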
Filesystem Metadata
The HDFS namespace is stored by the Namenode.
The Namenode uses a transaction log called the EditLog to record every change to the file system meta-data; the entire namespace, including the mapping of blocks to files, is kept in a file called the FsImage.
Namenode
Keeps an image of the entire file system namespace in memory.
Datanode
A Datanode stores data in files in its local file system.
The Datanode has no knowledge of the HDFS file system as a whole.
It stores each block of HDFS data in a separate file.
The Datanode does not create all files in the same directory.
It uses heuristics to determine the optimal number of files per directory and creates subdirectories appropriately: a research issue?
Robustness
Objectives
The primary objective of HDFS is to store data reliably even in the presence of failures.
The three common types of failure are Namenode failure, Datanode failure, and network partitions.
Re-replication
The necessity for re-replication may arise because:
A Datanode may become unavailable,
A replica may become corrupted,
A hard disk on a Datanode may fail, or
The replication factor of the block may be increased (see the sketch after this list).
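For that last case, a sketch of raising a file's replication factor from a client (hypothetical path); the call updates the Namenode's metadata and returns, and the extra block copies are made in the background:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class RaiseReplication {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());

            // Ask for more replicas of an existing file; the Namenode then
            // schedules re-replication of each block until the new factor is met.
            fs.setReplication(new Path("/home/foo/data"), (short) 5);
        }
    }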
Cluster Rebalancing
HDFS architecture is compatible with data
rebalancing schemes.
A scheme might move data from one Datanode
to another if the free space on a Datanode falls
below a certain threshold.
In the event of a sudden high demand for a
particular file, a scheme might dynamically
create additional replicas and rebalance other
data in the cluster.
These types of data rebalancing are not yet
implemented: research issue.
Data Integrity
Consider a situation: a block of data fetched from a Datanode arrives corrupted, due to a fault in the storage device, in the network, or in software.
The HDFS client software therefore implements checksum checking: a checksum of each block of a file is computed and stored in a separate hidden file in the same HDFS namespace.
When a client retrieves file contents, it verifies the data against these checksums; if a block fails verification, the client fetches that block from another Datanode holding a replica.
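A toy illustration of the idea, not HDFS's internal implementation (which checksums fixed-size chunks with CRC variants): compute a checksum when storing a block and verify it when reading the block back.

    import java.util.zip.CRC32;

    public class ChecksumSketch {
        // Compute a CRC32 checksum over a block of bytes.
        static long checksum(byte[] block) {
            CRC32 crc = new CRC32();
            crc.update(block, 0, block.length);
            return crc.getValue();
        }

        public static void main(String[] args) {
            byte[] block = "example block contents".getBytes();
            long stored = checksum(block);    // kept alongside the data

            // Later, after fetching the block, verify before trusting it.
            if (checksum(block) != stored) {
                System.out.println("corrupt: fetch this block from another replica");
            } else {
                System.out.println("checksum OK");
            }
        }
    }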
Metadata Disk Failure
The FsImage and the EditLog are central data structures of HDFS.
A corruption of these files can cause an HDFS instance to be non-functional.
For this reason, a Namenode can be configured to maintain multiple copies of the FsImage and EditLog.
The multiple copies of the FsImage and EditLog files are updated synchronously.
This is affordable because meta-data is not data-intensive.
The Namenode remains a single point of failure: automatic failover is NOT supported! Another research topic.
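A sketch of such a configuration, assuming the standard dfs.namenode.name.dir property of hdfs-site.xml: each listed directory (the paths below are placeholders) receives a synchronously updated copy of the metadata.

    <!-- hdfs-site.xml on the Namenode; paths are placeholders -->
    <property>
      <name>dfs.namenode.name.dir</name>
      <!-- Each directory holds a synchronous copy of the FsImage and EditLog;
           placing them on different disks survives a single disk failure. -->
      <value>/disk1/hdfs/name,/disk2/hdfs/name</value>
    </property>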
Data Organization
Data Blocks
HDFS supports write-once-read-many semantics on files, with reads at streaming speeds.
A typical block size used by HDFS is 64 MB. An HDFS file is chopped into 64 MB chunks and, where possible, each chunk resides on a different Datanode.
Staging
A client request to create a file does not reach the Namenode immediately.
The HDFS client caches the file data in a temporary local file. When the cached data reaches the HDFS block size, the client contacts the Namenode.
The Namenode inserts the file name into the file system hierarchy and allocates a data block for it.
The Namenode responds to the client with the identity of the Datanodes that are to hold the replicas of the block.
The client then flushes the block from its local temporary file to the designated Datanode.
Staging (contd.)
When the file is closed, the client sends a message to the Namenode.
The Namenode then commits the file creation operation into its persistent store.
If the Namenode dies before the file is closed, the file is lost.
This client-side caching is required to avoid network congestion; it also has precedent in AFS (the Andrew File System).
A sketch of the client-side view of such a write follows.
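A minimal client-side write, assuming the standard Java FileSystem API (hypothetical path); the buffering and staging described above happen inside the output stream, and close() triggers the final commit:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class WriteExample {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());

            // Data written here is staged by the client library and shipped
            // to Datanodes block by block; nothing is final until close().
            try (FSDataOutputStream out = fs.create(new Path("/home/foo/out.txt"))) {
                out.write("first block of data".getBytes());
            } // close() completes the file; the Namenode commits its creation
        }
    }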
Replication Pipelining
When the client receives the response from the Namenode, it flushes its data block to the first replica Datanode in small portions (4 KB). The first Datanode stores each portion and forwards it to the second replica, which stores and forwards it to the third: the data is pipelined from Datanode to Datanode.
Space Reclamation
When a file is deleted by a client, HDFS does not immediately reclaim the space. Instead, it renames the file into the /trash directory, where it can be quickly restored for as long as it remains.
After a configurable amount of time, the Namenode deletes the file from the HDFS namespace, and the blocks associated with the file are freed on the Datanodes.
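A sketch of the client side of this mechanism, assuming trash is enabled via the fs.trash.interval property: org.apache.hadoop.fs.Trash can move a file into the trash directory instead of deleting it outright (hypothetical path).

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.Trash;

    public class DeleteToTrash {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // Moves the file under the trash directory rather than deleting it;
            // the Namenode reclaims the blocks only after the trash interval.
            boolean moved = Trash.moveToAppropriateTrash(
                    fs, new Path("/home/foo/old-data"), conf);
            System.out.println("moved to trash: " + moved);
        }
    }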