DS Chapter 5

Chapter 5 - Naming
Introduction
names play an important role to:
share resources
uniquely identify entities
refer to locations
etc.
an important issue is that a name can be resolved to the entity
it refers to
to resolve names, it is necessary to implement a naming
system
in a distributed system, the implementation of a naming
system is itself often distributed, unlike in nondistributed
systems
efficiency and scalability of the naming system are the main
issues
2
Objectives of the Chapter
we will discuss
some general issues in naming
how human-friendly names are organized and implemented;
e.g., those for file systems and the WWW; classes of
naming systems
flat naming
structured naming, and
attribute-based naming
3
5.1 Names, Identifiers, and Addresses
a name in a distributed system is a string of bits or characters
that is used to refer to an entity
an entity is anything; e.g., resources such as hosts, printers,
disks, files, objects, processes, users, Web pages,
newsgroups, mailboxes, network connections, ...
entities can be operated on
e.g., a resource such as a printer offers an interface
containing operations for printing a document, requesting
the status of a job, etc.
a network connection may provide operations for sending
and receiving data, setting quality of service parameters, etc.
to operate on an entity, it is necessary to access it through its
access point, itself an entity (special)
4
access point
the name of an access point is called an address (such as
IP address and port number as used by the transport layer)
the address of the access point of an entity is also referred
to as the address of the entity
an entity can have more than one access point (similar to
accessing an individual through different telephone
numbers)
an entity may change its access point in the course of time
(e.g., a mobile computer getting a new IP address as it
moves)
5
an address is a special kind of name
it refers to at most one entity
each entity is referred by at most one address; even when
replicated such as in Web pages
an entity may change an access point, or an access point
may be reassigned to a different entity (like telephone
numbers in offices)
separating the name of an entity and its address makes it
easier and more flexible; such a name is called location
independent
there are also other types of names that uniquely identify an
entity; in any case a true identifier is a name with the
following properties
it refers to at most one entity
each entity is referred by at most one identifier
it always refers to the same entity (never reused)
identifiers allow us to unambiguously refer to an entity
6
examples
name of an FTP server (entity)
URL of the FTP server
address of the FTP server
IP number:port number
the address of the FTP server may change
there are three classes on naming systems: flat naming,
structured naming, and attribute-based naming
7
5.2 Flat Naming
a name is a sequence of characters without structure; like
human names? may be if it is not an Ethiopian name!
difficult to be used in a large system since it must be centrally
controlled to avoid duplication
moreover, it does not contain any information on how to
locate the access point of its associated entity
how are flat names resolved (or how to locate an entity when
a flat name is given)
name resolution: mapping a name to an address or an
address to a name is called name-address resolution
possible solutions: simple solutions, home-based
approaches, and hierarchical approaches
8
1. Simple Solutions
two solutions (for LANs only): Broadcasting and
Multicasting, and Forwarding Pointers

a. Broadcasting and Multicasting
broadcast a message containing the identifier of an
entity; only machines that can offer an access point for
the entity send a reply
e.g., ARP (Address Resolution Protocol) in the Internet to
find the data link address (MAC address) of a machine
a computer that wants to access another computer for
which it knows its IP address broadcasts this address
the owner responds by sending its Ethernet address
broadcasting is inefficient when the network grows
(wastage of bandwidth and too much interruption to other
machines)
multicasting is better when the network grows - send only
to a restricted group of hosts
9
multicasting can also be used to locate the nearest replica
- choose the one whose reply comes in first
b. Forwarding Pointers
how to look for mobile entities
when an entity moves from A to B, it leaves behind a
reference to its new location
advantage
simple: as soon as the first name is located using
traditional naming service, the chain of forwarding
pointers can be used to find the current address
drawbacks
the chain can be too long - locating becomes expensive
all the intermediary locations in a chain have to maintain
their pointers
vulnerability if links are broken
hence, making sure that chains are short and that
forwarding pointers are robust is an important issue 10
2. Home-Based Approaches
broadcasting and multicasting have scalability problems;
performance and broken links are problems in forwarding
pointers
a home location keeps track of the current location of an
entity; often it is the place where an entity was created
it is a two-tiered approach
an example where it is used in Mobile IP
each mobile host uses a fixed IP address
all communication to that IP address is initially directly
sent to the host’s home agent located on the LAN
corresponding to the network address contained in the
mobile host’s IP address
whenever the mobile host moves to another network, it
requests a temporary address in the new network
(called care-of-address) and informs the new address
to the home agent
11
when the home agent receives a message for the mobile host
(from a correspondent agent) it forwards it to its new address
(if it has moved) and also informs the sender the host’s
current location for sending other packets
home-based approach: the principle of Mobile IP 12

problems:
creates communication latency (Triangle routing:
correspondent-home network-mobile)
the home location must always exist; the host is
unreachable if the home does no more exist (permanently
changed); the solution is to register the home at a traditional
name service and let a client first look up the location of the
home
13
3. Hierarchical Approaches
a generalization of the two-tiered approach into multiple
layers
a network is divided into a collection of domains, similar
to DNS
a single top-level domain spans the entire network
each domain can be subdivided into multiple, smaller
domains
the lowest-level domain is called a leaf domain; typically a
LAN
each domain D has an associated directory node dir(D)
that keeps track of the entities in that domain leading to a
tree of directory nodes
the root (directory) node knows about all entities
14
hierarchical organization of a location service into domains, each having an
associated directory node
15
each entity is represented by a location record in the
directory node dir(D) to keep track of its whereabouts
a location record for an entity in a leaf domain contains the
entity’s current address; all other high-level domains will
have only pointers to this address; this means the root node
will store only pointers to all entities
an entity may have multiple addresses, for instance, if it is
replicated; a higher level domain containing the two
subdomains where the entity has addresses will have two
pointers
16
an example of storing information of an entity having two addresses in
different leaf domains D1 and D2
17
example of a look up operation
a client (in Domain D) would like to locate an entity E
looking up a location in a hierarchically organized location service

18
update operations (i.e., inserting and deleting addresses)
read pages 194 - 195)
in addition to the three methods discussed so far (simple
solutions, home-based approaches, and hierarchical
approaches), another approach for resolution in flat naming is
Distributed Hash Tables (DHT)
read pages 188 - 191
19
5.3 Structured Naming
flat names are not convenient for humans
Name Spaces
names are organized into a name space
each name is made of several parts; the first may define
the nature of the organization, the second the name, the
third departments, ...
the authority to assign and control the name spaces can be
decentralized where a central authority assigns only the
first two parts
a name space is generally organized as a labeled, directed
graph with two types of nodes
leaf node: represents the named entity and stores
information such as its address or the state of that entity
directory node: a special entity that has a number of
outgoing edges, each labeled with a name
each node in a naming graph is considered as another entity
with an identifier
20
a general naming graph with a single root node, no
a directory node stores a table in which an outgoing edge is
represented as a pair (edge label, node identifier), called a
directory table
each path in a naming graph can be referred to by the
sequence of labels corresponding to the edges of the path
and the first node in the path, such as
N:<label-1, label-2, ..., label-n>, where N refers to the first
node in the path 21
such a sequence is called a path name
if the first node is the root of the naming graph, it is called an
absolute path name; otherwise it is a relative path name
instead of the path name n0:<home, steen, mbox>, we often
use its string representation /home/steen/mbox
there may also be several paths leading to the same node,
e.g., node n5 can be represented as /keys or
/home/steen/keys
although the above naming graph is directed acyclic graph (a
node can have more than one incoming edge but is not
permitted to have a cycle), the common way is to use a tree
(hierarchical) with a single root (as is used in file systems)
in a tree structure, each node except the root has exactly
one incoming edge; the root has no incoming edges
Â each node also has exactly one associated (absolute) path
name
22
e.g., file naming in UNIX file system
a directory node represents a directory and a leaf node
represents a file
there is a single root directory, represented in the naming
graph by the root node
we have a contiguous series of blocks from a logical disk
the boot block is used to load the operating system
the superblock contains information on the entire file
system such as its size, etc.
inodes are referred to by an index number, starting at
number zero, which is for the inode representing the root
directory
given the index number of an inode, it is possible to access
its associated file
23
Name Resolution
given a path name, the process of looking up a name
stored in the node is referred to as name resolution; it
consists of finding the address when the name is given (by
following the path)
knowing how and where to start name resolution is referred
to as closure mechanism; e.g., UNIX file system
Linking and Mounting
Linking: giving another name for the same entity (an alias)
e.g., environment variables in UNIX such as HOME that
refer to the home directory of a user
two types of links (or two ways to implement an alias):
hard link and symbolic link
hard link: to allow multiple absolute path names to
refer to the same node in a naming graph
e.g., in the previous graph, there are two different path
names for node n5: /keys and /home/steen/keys
24
symbolic link: representing an entity by a leaf node and
instead of storing the address or state of the entity, the
node stores an absolute path name
the concept of a symbolic link explained in a naming graph
when first resolving an absolute path name stored in a

node (e.g., /home/steen/keys in node n6), name
resolution will return the path name stored in the node
(/keys), at which point it can continue with resolving that
new path name, i.e., closure mechanism
25
so far name resolution was discussed as taking place
within a single name space
name resolution can also be used to merge different name
spaces in a transparent way
the solution is to use mounting
Mounting
as an example, consider a mounted file system, which
can be generalized to other name spaces as well
let a directory node store the directory node from a
different (foreign) name space
the directory node storing the node identifier is called a
mount point
the directory node in the foreign name space is called a
mounting point, normally the root of a name space
during name resolution, the mounting point is looked up
and resolution proceeds by accessing its directory table
26
consider a collection of name spaces distributed across
different machines (each name space implemented by a
different server)
to mount a foreign name space in a distributed system, the
following are at least required
the name of an access protocol (for communication)
the name of the server
the name of the mounting point in the foreign name space
each of these names needs to be resolved
to the implementation of the protocol so that
communication can take place properly
to an address where the server can be reached
to a node identifier in the foreign name space (to be
resolved by the server of the foreign name space)
the three names can be listed as a URL
27
example: Sun’s Network File System (NFS) is a distributed file
system with a protocol that describes how a client can
access a file stored on a (remote) NFS file server
an NFS URL may look like nfs://flits.cs.vu.nl/home/steen
- nfs is an implementation of a protocol
- flits.cs.vu.nl is a server name to be resolved using DNS
- /home/steen is resolved by the foreign server
e.g., the subdirectory /remote includes mount points for
foreign name spaces on the client machine
a directory node named /remote/vu is used to store
nfs://flits.cs.vu.nl/home/steen
consider /remote/vu/mbox
this name is resolved by starting at the root directory on
the client’s machine until node /remote/vu, which returns
the URL nfs://flits.cs.vu.nl/home/steen
this leads the client machine to contact flits.cs.vu.nl
using the NFS protocol
then the file mbox is read in the directory /home/steen
28
mount point
mounting point
mounting remote name spaces through a specific process protocol
29
distributed systems that allow mounting a remote file system
also allow to execute some commands
example commands to access the file system
cd /remote/vu /*changing directory on the remote machine
ls -l /*listing the files on the remote machine
by doing so the user is not supposed to worry about the
details of the actual access; the name space on the local
machine and that on the remote machine look to form a
single name space
30
The Implementation of a Name Space
a name space forms the heart of a naming service
a naming service allows users and processes to add,
remove, and lookup names
a naming service is implemented by name servers
for a distributed system on a single LAN, a single server
might suffice; for a large-scale distributed system the
implementation of a name space is distributed over multiple
name servers
Name Space Distribution

in large scale distributed systems, it is necessary to
distribute the name service over multiple name servers,
usually organized hierarchically
a name service can be partitioned into logical layers
the following three layers can be distinguished (according to
Cheriton and Mann)
31
global layer
formed by highest level nodes (root node and nodes close
to it or its children)
nodes on this layer are characterized by their stability, i.e.,
directory tables are rarely changed
they may represent organizations, groups of
organizations, ..., where names are stored in the name
space
administrational layer
groups of entities that belong to the same organization or
administrational unit, e.g., departments
relatively stable
managerial layer
nodes that may change regularly, e.g., nodes representing
hosts of a LAN, shared files such as libraries or binaries,
…
nodes are managed not only by system administrators,
but also by end users 32
an example partitioning of the DNS name space, including Internet-
accessible files, into three layers 33
the name space is divided into nonoverlapping parts, called
zones in DNS
a zone is a part of the name space that is implemented by a
separate name server
some requirements of servers at different layers: performance
(responsiveness to lookups), availability (failure rate), etc.
high availability is critical for the global layer, since name
resolution cannot proceed beyond the failing server; it is
also important at the administrational layer for clients in the
same organization
performance is very important in the lowest layer, since
results of lookups can be cached and used due to the
relative stability of the higher layers
they may be enhanced by client side caching (for global and
administrational layers since names do not change often)
and replication; they create implementation problems since
they may introduce inconsistency (see Chapter 7)
34
Item Global Administrational Managerial
Geographical scale of network Worldwide Organization Department
Total number of nodes Few Many Vast numbers
Responsiveness to lookups Seconds Milliseconds Immediate
Update propagation Lazy Immediate Immediate
Availability requirement Very High High low
Number of replicas Many None or few None
Is client-side caching applied? Yes Yes Sometimes
a comparison between name servers for implementing nodes from a large-

scale name space partitioned into a global layer, an administrational
layer, and a managerial layer
35
Implementation of Name Resolution
recall that name resolution consists of finding the address
when the name is given
assume that name servers are not replicated and that no
client-side caches are allowed
each client has access to a local name resolver, responsible
for ensuring that the name resolution process is carried out
e.g., assume the path name
root:<nl, vu, cs, ftp, pub, globe, index.txt>
is to be resolved
or using a URL notation, this path name would correspond
to ftp://ftp.cs.vu.nl/pub/globe/index.txt
36
a host that needs to map a name to an address calls a DNS
client named a resolver (and provides it the name to be
resolved - ftp.cs.vu.nl)
the resolver accesses the closest DNS server with a mapping
request
if the server has the information it satisfies the resolver;
otherwise, it either refers the resolver to other servers (called
Iterative Resolution) or asks other servers to provide it with
the information (called Recursive Resolution)
Iterative Resolution
a name resolver hands over the complete name to the root
name server
the root name server will resolve the name as far as it can
and return the result to the client; at the minimum it can
resolve the first level and sends the name of the first level
name server to the client
the client calls the first level name server, then the second,
..., until it finds the address of the entity 37
the principle of iterative name resolution
38
Recursive Resolution
a name resolver hands over the whole name to the root name
server
the root name server will try to resolve the name and if it
can’t, it requests the first level name server to resolve it and
to return the address
the first level will do the same thing recursively
the principle of recursive name resolution 39

Advantages and drawbacks
recursive name resolution puts a higher performance
demand on each name server; hence name servers in the
global layer support only iterative name resolution
caching is more effective with recursive name resolution
each name server gradually learns the address of each
name server responsible for implementing lower-level
nodes
eventually lookup operations can be handled efficiently
40
Server for Should Looks Passes to Receives Returns to
node resolve up child and caches requester
cs <ftp> #<ftp> -- -- #<ftp>
vu <cs,ftp> #<cs> <ftp> #<ftp> #<cs>
#<cs, ftp>
nl <vu,cs,ftp> #<vu> <cs,ftp> #<cs> #<vu>
#<cs,ftp> #<vu,cs>
#<vu,cs,ftp>
root <nl,vu,cs,ftp> #<nl> <vu,cs,ftp> #<vu> #<nl>
#<vu,cs> #<nl,vu>
#<vu,cs,ftp> #<nl,vu,cs>
#<nl,vu,cs,ftp>
recursive name resolution of <nl, vu, cs, ftp>; name servers cache
intermediate results for subsequent lookups
41
communication costs may be reduced in recursive name
resolution
the comparison between recursive and iterative name resolution with

respect to communication costs; assume the client is in Ethiopia and
Summary the name servers in the Netherlands
Method Advantages
Recursive Less Communication cost; Caching is more effective
Iterative Less performance demand on name servers
42
Example - The Domain Name System (DNS)
one of the largest distributed naming services is the
Internet DNS
it is used for looking up host addresses and mail servers
hierarchical, defined in an inverted tree structure with the
root at the top
the tree can have only 128 levels
43
Label
each node has a label, a string with a maximum of 63
characters (case insensitive)
the root label is null (has no label)
children of a node must have different names (to guarantee
uniqueness)
Domain Name
each node has a domain
name; it is a path name to its
root node
a full domain name is a
sequence of labels separated
by dots (the last character is a
dot)
domain names are read from
the node up to the root
full path names must not
exceed 255 characters
44
the contents of a node is formed by a collection of resource
records; the important ones are the following
Type of Associated
Description
record entity
SOA (start of Holds information on the represented zone, such as an
Zone
authority) e-mail address of the system administrator
A (address) Host Contains an IP address of the host this node represents
MX (mail Refers to a mail server to handle mail addressed to this
Domain
exchange) node; it is a symbolic link; e.g. name of a mail server
SRV Domain Refers to a server handling a specific service
NS (name Refers to a name server that implements the
Zone
server) represented zone
CNAME Node Contains the canonical name of a host; an alias
Symbolic link with the primary name of the represented
PTR (pointer) Host
node; for mapping an IP address to a name
HINFO (host Holds information on the host this node represents;
Host
info) such as machine type and OS
Contains any entity-specific information considered
TXT Any kind
useful; cannot be automatically processed
45
cs.vu.nl represents the domain as well as the zone; it has 4
name servers (ns, star, top, solo) and 3 mail servers
name server for this zone with 2 network addresses (star)
mail servers; the numbers preceding the name show
priorities; first the one with the lowest number is tried
an excerpt from the DNS database for the zone cs.vu.nl 46

a Web server and an FTP server, implemented by a single
machine (soling)
older server clusters (vucs-das1)
two printers (inkt and pen) with a local address; i.e., they
cannot be accessed from outside
an excerpt from the DNS database for the zone cs.vu.nl, cont’d
47
cs.vu.nl is implemented as a single zone
hence, the records in the previous slides do not include
references to other zones
nodes in a subdomain that are implemented in a different
zone are specified by giving the domain name and IP
address
part of the description for the vu.nl domain which contains the cs.vu.nl domain
48
5.4 Attribute-Based Naming
flat naming: provides a unique and location-independent way
of referring to entities
structured naming: also provides a unique and location-
independent way of referring to entities as well as human-
friendly names
but both do not allow searching entities by giving a
description of an entity
in attribute-based naming, each entity is assumed to have a
collection of attributes that say something about the entity
then a user can search an entity by specifying (attribute, value)
pairs known as attribute-based naming
Directory Services
attribute-based naming systems are also called directory
services whereas systems that support structured naming
are called naming systems
49
how are resources described? one possibility is to use RDF
(Resource Description Framework) that uses triplets
consisting of a subject, a predicate, and an object
e.g., (person, name, Alice) to describe a resource Person
whose Name is Alice
or in e-mail systems, we can use sender, recipient, subject,
etc. for searching
Hierarchical Implementations: LDAP
distributed directory services are implemented by combining
structured naming with attribute-based naming
e.g., Microsoft’s Active Directory service
such systems rely on the lightweight directory access
protocol or LADP which is derived from OSI’s X.500
directory service
a LADP directory service consists of a number of records
called directory entries (attribute, value) pairs, similar to a
resource record in DNS; could be single- or multiple-valued
(e.g., Mail_Servers on next slide) 50
Attribute Abbr. Value
Country C NL
Locality L Amsterdam
Organization O Vrije Universiteit
OrganizationalUnit OU Comp. Sc.
CommonName CN Main server
Mail_Servers -- 137.37.20.3, 130.37.24.6,137.37.20.10
FTP_Server -- 130.37.20.20
WWW_Server -- 130.37.20.20
a simple example of an LDAP directory entry using LDAP naming

conventions to identify the network addresses of some servers
51
the collection of all directory entries is called a Directory
Information Base (DIB)
each record is uniquely named so that it can be looked up
each naming attribute is called a Relative Distinguished
Name (RDN); the first 5 entries above
a globally unique name is formed using abbreviations of
naming attributes, e.g.,
/C=NL/O=Vrije Universiteit/OU=Comp. Sc.
this is similar to the DNS name nl.vu.cs
listing RDNs in sequence leads to a hierarchy of the
collection of directory entries, called a Directory
Information Tree (DIT)
a DIT forms the naming graph of an LDAP directory service
where each node represents a directory entry
52
node N corresponds to the directory entry shown earlier; it
also acts as a parent of other directory entries that have an
additional attribute, Host_Name; such entries may be used
to represent hosts
part of the directory information tree
53
Attribute Value Attribute Value
Country NL Country NL
Locality Amsterdam Locality Amsterdam
Organization Vrije Universiteit Organization Vrije Universiteit
OrganizationalUnit Comp. Sc. OrganizationalUnit Comp. Sc.
CommonName Main server CommonName Main server

Host_Name star Host_Name zephyr
Host_Address 192.31.231.42 Host_Address 137.37.20.10
two directory entries having Host_Name as RDN
read pages 221 - 226 about Decentralized Implementations
54

DS Chapter 5

Uploaded by

Copyright:

Available Formats

DS Chapter 5

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

DS Chapter 5

Uploaded by

Copyright:

Available Formats

Chapter 5 - Naming

Multicasting, and Forwarding Pointers

home-based approach: the principle of Mobile IP 12

looking up a location in a hierarchically organized location service

the concept of a symbolic link explained in a naming graph

 when first resolving an absolute path name stored in a

mounting remote name spaces through a specific process protocol

 Name Space Distribution

Geographical scale of network Worldwide Organization Department

Total number of nodes Few Many Vast numbers

Responsiveness to lookups Seconds Milliseconds Immediate

Update propagation Lazy Immediate Immediate

Availability requirement Very High High low

Number of replicas Many None or few None

Is client-side caching applied? Yes Yes Sometimes

a comparison between name servers for implementing nodes from a large-

the principle of recursive name resolution 39

the comparison between recursive and iterative name resolution with

an excerpt from the DNS database for the zone cs.vu.nl 46

a simple example of an LDAP directory entry using LDAP naming

part of the directory information tree

OrganizationalUnit Comp. Sc. OrganizationalUnit Comp. Sc.

CommonName Main server CommonName Main server

two directory entries having Host_Name as RDN

 read pages 221 - 226 about Decentralized Implementations

You might also like

when first resolving an absolute path name stored in a

Name Space Distribution

read pages 221 - 226 about Decentralized Implementations