Unit-II: Socket Address Structures


Socket Address Structures

UNIT-II
Socket Address Structures
Most socket functions require a pointer to a socket address structure as an argument. Each supported
protocol suite defines its own socket address structure.

IPv4 Socket Address Structure

An IPv4 socket address structure, commonly called an "Internet socket address structure," is named
sockaddr_in and is defined by including the <netinet/in.h> header. The POSIX definition of the IPv4
socket address structure is shown below.
struct in_addr {
    in_addr_t s_addr;           /* 32-bit IPv4 address */
                                /* network byte ordered */
};

struct sockaddr_in {
    uint8_t        sin_len;     /* length of structure (16) */
    sa_family_t    sin_family;  /* AF_INET */
    in_port_t      sin_port;    /* 16-bit TCP or UDP port number */
                                /* network byte ordered */
    struct in_addr sin_addr;    /* 32-bit IPv4 address */
                                /* network byte ordered */
    char           sin_zero[8]; /* unused */
};

Datatype      Description                                             Header
int8_t        Signed 8-bit integer                                    <sys/types.h>
uint8_t       Unsigned 8-bit integer                                  <sys/types.h>
int16_t       Signed 16-bit integer                                   <sys/types.h>
uint16_t      Unsigned 16-bit integer                                 <sys/types.h>
int32_t       Signed 32-bit integer                                   <sys/types.h>
uint32_t      Unsigned 32-bit integer                                 <sys/types.h>
sa_family_t   Address family of socket address structure              <sys/socket.h>
socklen_t     Length of socket address structure, normally uint32_t   <sys/socket.h>
in_addr_t     IPv4 address, normally uint32_t                         <netinet/in.h>
in_port_t     TCP or UDP port, normally uint16_t                      <netinet/in.h>
Table 1: Datatype, description, and header file of IPv4 socket address structure members

• The POSIX specification requires only three members in the structure: sin_family, sin_addr, and
sin_port.
• You may also encounter the older datatypes u_char, u_short, u_int, and u_long, which are all unsigned;
they are retained for backward compatibility.
• Both the IPv4 address and the TCP or UDP port number are always stored in the structure in network
byte order.
• The 32-bit IPv4 address can be accessed in two different ways.
o For example, if serv is defined as an Internet socket address structure, then serv.sin_addr
references the 32-bit IPv4 address as an in_addr structure, while serv.sin_addr.s_addr
references the same 32-bit IPv4 address as an in_addr_t (typically an unsigned 32-bit
integer).
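
As a minimal sketch of the points above, the following C function fills in a sockaddr_in. The helper name fill_ipv4_addr is hypothetical (it is not part of any library), and it assumes the address arrives as a dotted-decimal string.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: fill *serv for a dotted-decimal address and a port
 * given in host byte order. Returns 0 on success, -1 on a bad address. */
int fill_ipv4_addr(struct sockaddr_in *serv, const char *ip, uint16_t port)
{
    memset(serv, 0, sizeof(*serv));     /* clears sin_zero (and sin_len, if present) */
    serv->sin_family = AF_INET;
    serv->sin_port = htons(port);       /* port is stored in network byte order */

    /* serv->sin_addr is the in_addr structure; inet_pton stores the
     * address in it, already in network byte order. */
    if (inet_pton(AF_INET, ip, &serv->sin_addr) != 1)
        return -1;
    return 0;
}
```

After a successful call, serv.sin_addr names the address as a structure, while serv.sin_addr.s_addr names the same 32 bits as an in_addr_t, so ntohl(serv.sin_addr.s_addr) gives the address as a host-order integer.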

Generic Socket Address Structure

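• A socket address structure is always passed by reference when passed as an argument to any
socket function.
• The socket functions must accept socket address structures from any supported protocol family.
Today void * would serve as the generic pointer type, but the socket functions predate ANSI C, so a
generic socket address structure was defined instead.
• The generic socket address structure is defined in <sys/socket.h>:
struct sockaddr {
    uint8_t     sa_len;
    sa_family_t sa_family;   /* address family: AF_xxx value */
    char        sa_data[14]; /* protocol-specific address */
};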

The socket functions are then defined as taking a pointer to the generic socket address structure, as shown
here in the ANSI C function prototype for the bind function:

int bind(int, struct sockaddr *, socklen_t);

Example:
struct sockaddr_in serv; /* IPv4 socket address structure */

/* fill in serv{} */

bind(sockfd, (struct sockaddr *) &serv, sizeof(serv));

IPv6 Socket Address Structure


struct in6_addr {
    uint8_t s6_addr[16];            /* 128-bit IPv6 address */
                                    /* network byte ordered */
};

#define SIN6_LEN                    /* required for compile-time tests */

struct sockaddr_in6 {
    uint8_t         sin6_len;       /* length of this struct (28) */
    sa_family_t     sin6_family;    /* AF_INET6 */
    in_port_t       sin6_port;      /* transport layer port# */
                                    /* network byte ordered */
    uint32_t        sin6_flowinfo;  /* flow information, undefined */
    struct in6_addr sin6_addr;      /* IPv6 address */
                                    /* network byte ordered */
    uint32_t        sin6_scope_id;  /* set of interfaces for a scope */
};

• The sin6_flowinfo member is divided into two fields:
o The low-order 20 bits are the flow label
o The high-order 12 bits are reserved


• The sin6_scope_id member identifies the scope zone in which a scoped address is meaningful, most
commonly an interface index for a link-local address.

New Generic Socket Address Structure

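• sockaddr_storage is defined as part of the IPv6 sockets API.
• Unlike struct sockaddr, it is large enough to hold any socket address type supported by the system.
• It is defined by including the <netinet/in.h> header (POSIX places the definition in <sys/socket.h>).
struct sockaddr_storage {
    uint8_t     ss_len;     /* length of this struct (implementation dependent) */
    sa_family_t ss_family;  /* address family: AF_xxx value */

    /* implementation-dependent elements to provide:
     * a) alignment sufficient to fulfill the alignment requirements of
     *    all socket address types that the system supports.
     * b) enough storage to hold any type of socket address that the
     *    system supports.
     */
};
• It differs from struct sockaddr in two ways:
o If any supported socket address structure has alignment requirements, sockaddr_storage satisfies
the strictest of them.
o It is large enough to contain any socket address structure that the system supports.
• Apart from ss_family (and ss_len, if present), its fields are opaque to the user, so it must be cast or
copied to the appropriate socket address structure before any other field is accessed.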
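
The cast described above can be sketched as follows. The function storage_port is a hypothetical name chosen for illustration; it inspects ss_family and then casts to the matching structure to read the port.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: return the port (host byte order) held in *ss,
 * or -1 if the address family is not one this sketch handles. */
int storage_port(const struct sockaddr_storage *ss)
{
    switch (ss->ss_family) {
    case AF_INET:
        /* cast to the IPv4 structure before touching IPv4-only fields */
        return ntohs(((const struct sockaddr_in *)ss)->sin_port);
    case AF_INET6:
        /* cast to the IPv6 structure before touching IPv6-only fields */
        return ntohs(((const struct sockaddr_in6 *)ss)->sin6_port);
    default:
        return -1;
    }
}
```

A caller would typically declare one struct sockaddr_storage, pass its address (cast to struct sockaddr *) to a function such as accept, and only afterwards decide which concrete structure it holds.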

Comparison of Socket Address Structures

Figure 1: Comparison of various socket address structures


Value-Result Arguments

Socket address structure passed from process to kernel


Three functions, bind, connect, and sendto, pass a socket address structure from the process to the kernel. One
argument to these three functions is the pointer to the socket address structure and another argument is the
integer size of the structure. Since the kernel is passed both the pointer and the size of what the pointer
points to, it knows exactly how much data to copy from the process into the kernel.

Figure 2: Socket address structure passed from process to kernel

Socket address structure passed from kernel to process


Four functions, accept, recvfrom, getsockname, and getpeername, pass a socket address structure from the
kernel to the process, the reverse direction from the previous scenario. Two of the arguments to these four
functions are the pointer to the socket address structure and a pointer to an integer containing the size
of the structure.

The size changes from an integer to a pointer to an integer because the size is both a
value when the function is called (it tells the kernel the size of the structure so that the kernel does not write
past the end of the structure when filling it in) and a result when the function returns (it tells the process
how much information the kernel actually stored in the structure). This type of argument
is called a value-result argument.
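
A small sketch may make the value-result idea concrete. The helper local_address_length is a hypothetical name; it uses getsockname, one of the four functions listed above, and reports how many bytes the kernel stored.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: return the number of bytes the kernel stored for the
 * socket's local address, or -1 on error. */
int local_address_length(int sockfd)
{
    struct sockaddr_storage ss;
    socklen_t len = sizeof(ss);     /* value: size of the buffer we provide */

    if (getsockname(sockfd, (struct sockaddr *)&ss, &len) < 0)
        return -1;

    return (int)len;                /* result: size the kernel actually filled in */
}
```

For an IPv4 socket the result is normally sizeof(struct sockaddr_in), which is smaller than the sizeof(struct sockaddr_storage) that was passed in as the value.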


Figure 3: Socket address structure passed from kernel to process

Byte Ordering Functions

Consider a 16-bit integer that is made up of 2 bytes. There are two ways to store the two bytes in memory:
with the low-order byte at the starting address, known as little-endian byte order, or with the high-order byte
at the starting address, known as big-endian byte order.

Figure 4: Little-endian byte order and big-endian byte order for a 16-bit integer
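
One way to see which of the two orders the host uses is to store a known 16-bit value and look at the byte at the lowest address. This is only an illustrative sketch; the function name host_is_little_endian is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: return 1 on a little-endian host, 0 on a big-endian host. */
int host_is_little_endian(void)
{
    union {
        uint16_t value;
        uint8_t  bytes[2];
    } probe;

    probe.value = 0x0102;           /* high-order byte 0x01, low-order byte 0x02 */

    /* On a little-endian host the low-order byte sits at the starting address. */
    return probe.bytes[0] == 0x02;
}
```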

We must deal with these byte ordering differences as network programmers because networking protocols
must specify a network byte order. For example, in a TCP segment, there is a 16-bit port number and a 32-
bit IPv4 address. The sending protocol stack and the receiving protocol stack must agree on the order in
which the bytes of these multibyte fields will be transmitted. The Internet protocols use big-endian byte
ordering for these multibyte integers.


In theory, an implementation could store the fields in a socket address structure in host byte order and then
convert to and from the network byte order when moving the fields to and from the protocol headers, saving
us from having to worry about this detail. But both history and the POSIX specification say that certain
fields in the socket address structures must be maintained in network byte order. Our concern is therefore
converting between host byte order and network byte order. We use the following four functions to convert
between these two byte orders.
#include <netinet/in.h>
uint16_t htons(uint16_t host16bitvalue);
uint32_t htonl(uint32_t host32bitvalue);
Both return: value in network byte order
uint16_t ntohs(uint16_t net16bitvalue);
uint32_t ntohl(uint32_t net32bitvalue);
Both return: value in host byte order

In the names of these functions, h stands for host, n stands for network, s stands for short, and l stands for
long. The terms "short" and "long" are historical artifacts from the Digital VAX implementation of 4.2BSD.
We should instead think of s as a 16-bit value (such as a TCP or UDP port number) and l as a 32-bit value
(such as an IPv4 address). Indeed, on the 64-bit Digital Alpha, a long integer occupies 64 bits, yet the htonl
and ntohl functions operate on 32-bit values.
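
The following sketch shows the conversions in both directions. The helper names port_to_wire and ipv4_from_wire are hypothetical; they copy the converted value to or from a byte array so the byte order on the wire is visible.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Hypothetical helper: write a host-order port as 2 bytes in network order. */
void port_to_wire(uint16_t port, unsigned char out[2])
{
    uint16_t net = htons(port);         /* host -> network, 16 bits */
    memcpy(out, &net, sizeof(net));     /* out[0] is now the high-order byte */
}

/* Hypothetical helper: read 4 bytes in network order as a host-order address. */
uint32_t ipv4_from_wire(const unsigned char in[4])
{
    uint32_t net;
    memcpy(&net, in, sizeof(net));
    return ntohl(net);                  /* network -> host, 32 bits */
}
```

Port 8080 is 0x1F90, so port_to_wire stores 0x1F followed by 0x90 on any host, and ipv4_from_wire applied to the bytes 127, 0, 0, 1 yields 0x7F000001.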

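NOTE: These functions only convert the integer values stored in socket address structures and protocol
fields (such as ports and addresses); they do not send, receive, or otherwise process data.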

Byte Manipulation Functions

There are two groups of functions that operate on multibyte fields, without interpreting the data, and without
assuming that the data is a null-terminated C string. We need these types of functions when dealing with
socket address structures because we need to manipulate fields such as IP addresses, which can contain
bytes of 0, but are not C character strings.

The first group of functions, whose names begin with b (for byte), are from 4.2BSD and are still provided by
almost any system that supports the socket functions. The second group of functions, whose names begin
with mem (for memory), are from the ANSI C standard and are provided with any system that supports an
ANSI C library.
#include <strings.h>
void bzero(void *dest, size_t nbytes);
void bcopy(const void *src, void *dest, size_t nbytes);
int bcmp(const void *ptr1, const void *ptr2, size_t nbytes);
Returns: 0 if equal, nonzero if unequal

The following functions are the ANSI C functions:

#include <string.h>
void *memset(void *dest, int c, size_t len);
void *memcpy(void *dest, const void *src, size_t nbytes);
int memcmp(const void *ptr1, const void *ptr2, size_t nbytes);
Returns: 0 if equal, <0 or >0 if unequal (see text)


src might represent application space and dest might represent socket send buffer space (socket receive
buffer space).
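
A short sketch of the mem functions applied to a socket address structure follows. The helper names init_ipv4_endpoint and same_ipv4_address are hypothetical. An address such as 10.0.0.1 contains zero bytes, which is why string functions like strcpy or strcmp would not work here.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: build an IPv4 endpoint from 4 raw address bytes
 * (already in network order) and a host-order port. */
void init_ipv4_endpoint(struct sockaddr_in *dest,
                        const unsigned char addr_bytes[4], uint16_t port)
{
    memset(dest, 0, sizeof(*dest));                     /* zero every byte first */
    dest->sin_family = AF_INET;
    dest->sin_port = htons(port);
    memcpy(&dest->sin_addr, addr_bytes, 4);             /* copies embedded zero bytes too */
}

/* Hypothetical helper: return 1 if both endpoints hold the same IPv4 address. */
int same_ipv4_address(const struct sockaddr_in *a, const struct sockaddr_in *b)
{
    return memcmp(&a->sin_addr, &b->sin_addr, sizeof(a->sin_addr)) == 0;
}
```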

inet_aton, inet_addr, and inet_ntoa Functions

These functions convert IPv4 addresses between a dotted-decimal string (for example, "206.168.112.96") and
the 32-bit network byte ordered binary value stored in a socket address structure. They work only with IPv4.
#include <arpa/inet.h>
int inet_aton(const char *strptr, struct in_addr *addrptr);
Returns: 1 if string was valid, 0 on error
in_addr_t inet_addr(const char *strptr);
Returns: 32-bit binary network byte ordered IPv4 address; INADDR_NONE if error
char *inet_ntoa(struct in_addr inaddr);
Returns: pointer to dotted-decimal string
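
As a minimal sketch, the function below parses a dotted-decimal string with inet_aton and converts it back with inet_ntoa. The name normalize_ipv4 is hypothetical. Note that inet_ntoa returns a pointer to a static buffer, so the result is copied out immediately.

```c
#include <stdio.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical helper: parse text as an IPv4 address and write its
 * dotted-decimal form into out. Returns 0 on success, -1 on a bad string. */
int normalize_ipv4(const char *text, char *out, size_t outlen)
{
    struct in_addr addr;

    if (inet_aton(text, &addr) == 0)    /* 0 means the string was not valid */
        return -1;

    /* inet_ntoa's static buffer is overwritten by the next call, so copy now */
    snprintf(out, outlen, "%s", inet_ntoa(addr));
    return 0;
}
```

inet_aton is usually preferred over inet_addr because the error value INADDR_NONE of inet_addr is itself the bit pattern of the valid address 255.255.255.255.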

inet_pton and inet_ntop Functions

The following two functions work with both IPv4 and IPv6 addresses; the family argument (AF_INET or
AF_INET6) selects which. The letters p and n stand for presentation and numeric.
#include <arpa/inet.h>
int inet_pton(int family, const char *strptr, void *addrptr);
Returns: 1 if OK, 0 if input not a valid presentation format, -1 on error
const char *inet_ntop(int family, const void *addrptr, char *strptr, socklen_t len);
Returns: pointer to result if OK, NULL on error


Figure 5: Summary of address conversion functions

sock_ntop Function

A basic problem with inet_ntop is that it requires the caller to pass a pointer to a binary address. This
address is normally contained in a socket address structure, requiring the caller to know the format of the
structure and the address family.

To solve this problem, sock_ntop takes a pointer to a socket address structure as an argument, looks inside
the structure, calls the appropriate function, and returns the presentation format of the address. It is not
a standard system function; it is a helper declared in unp.h.
#include "unp.h"
char *sock_ntop(const struct sockaddr *sockaddr, socklen_t addrlen);
Returns: non-null pointer if OK, NULL on error
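
The idea can be sketched as below. This is not the unp.h implementation: the name sock_ntop_sketch, the caller-supplied buffer, and the "address:port" output layout are assumptions made for illustration.

```c
#include <stdio.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical sketch: format the address in *sa into buf.
 * Returns buf on success, NULL for an unsupported family or on error. */
char *sock_ntop_sketch(const struct sockaddr *sa, char *buf, size_t buflen)
{
    char host[INET6_ADDRSTRLEN];

    switch (sa->sa_family) {            /* the family tells us which structure this is */
    case AF_INET: {
        const struct sockaddr_in *sin = (const struct sockaddr_in *)sa;
        if (inet_ntop(AF_INET, &sin->sin_addr, host, sizeof(host)) == NULL)
            return NULL;
        snprintf(buf, buflen, "%s:%d", host, ntohs(sin->sin_port));
        return buf;
    }
    case AF_INET6: {
        const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)sa;
        if (inet_ntop(AF_INET6, &sin6->sin6_addr, host, sizeof(host)) == NULL)
            return NULL;
        snprintf(buf, buflen, "[%s]:%d", host, ntohs(sin6->sin6_port));
        return buf;
    }
    default:
        return NULL;                    /* family not handled in this sketch */
    }
}
```

The caller no longer needs to know the address family: it passes whatever structure a socket function filled in and gets a printable string back.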

readn, writen, and readline Functions

Stream sockets (e.g., TCP sockets) exhibit a behavior with the read and write functions that differs from
normal file I/O. A read or write on a stream socket might input or output fewer bytes than requested, but this
is not an error condition. The reason is that buffer limits might be reached for the socket in the kernel. All
that is required to input or output the remaining bytes is for the caller to invoke the read or write function
again. Some versions of UNIX also exhibit this behavior when writing more than 4,096 bytes to a pipe. This
scenario is always a possibility on a stream socket with read, but is normally seen with write only if the
socket is nonblocking. Nevertheless, we always call our writen function instead of write, in case the
implementation returns a short count.
#include "unp.h"
ssize_t readn(int filedes, void *buff, size_t nbytes);
ssize_t writen(int filedes, const void *buff, size_t nbytes);
ssize_t readline(int filedes, void *buff, size_t maxlen);
All return: number of bytes read or written, –1 on error
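
The loop that "invokes read or write again" can be sketched as follows. These are illustrative versions under the hypothetical names readn_sketch and writen_sketch, not the unp.h sources; they also retry when a call is interrupted by a signal (EINTR).

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to n bytes, looping over short reads.
 * Returns the number of bytes read (less than n only at EOF), or -1 on error. */
ssize_t readn_sketch(int fd, void *buf, size_t n)
{
    size_t nleft = n;
    char *ptr = buf;

    while (nleft > 0) {
        ssize_t nread = read(fd, ptr, nleft);
        if (nread < 0) {
            if (errno == EINTR)
                continue;               /* interrupted: call read again */
            return -1;
        }
        if (nread == 0)
            break;                      /* EOF */
        nleft -= (size_t)nread;
        ptr += nread;
    }
    return (ssize_t)(n - nleft);
}

/* Write exactly n bytes, looping over short writes. Returns n, or -1 on error. */
ssize_t writen_sketch(int fd, const void *buf, size_t n)
{
    size_t nleft = n;
    const char *ptr = buf;

    while (nleft > 0) {
        ssize_t nwritten = write(fd, ptr, nleft);
        if (nwritten <= 0) {
            if (nwritten < 0 && errno == EINTR)
                continue;               /* interrupted: call write again */
            return -1;
        }
        nleft -= (size_t)nwritten;
        ptr += nwritten;
    }
    return (ssize_t)n;
}
```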


Elementary TCP Sockets

This chapter describes the elementary socket functions required to write a complete TCP client and server.

Figure 6: Socket functions for elementary TCP client/server

Socket Function

To perform network I/O, the first thing a process must do is call the socket function, specifying the type of
communication protocol desired (TCP using IPv4, UDP using IPv6, Unix domain stream protocol, etc.).

#include <sys/socket.h>

int socket (int family, int type, int protocol);


Returns: non-negative descriptor if OK, -1 on error
The socket function creates a socket on demand (placing it in an unconnected state) and returns an integer
descriptor identifying the socket. Its arguments specify:

Family - the protocol family (for example, AF_INET or AF_INET6).

Type - the type of communication socket (for example, SOCK_STREAM or SOCK_DGRAM).

Protocol - the specific protocol within the family, which accommodates multiple protocols within a family;
0 selects the default for the given family and type.

Figure 7: Protocol family constants for socket function

Figure 8: type of socket for socket function

Figure 9: protocol of sockets for AF_INET or AF_INET6

Connect Function

The connect function is used by a TCP client to establish a connection with a TCP server.

#include <sys/socket.h>

int connect(int sockfd, const struct sockaddr *servaddr, socklen_t addrlen);


Returns: 0 if OK, -1 on error

EX: connect (socket, destaddr, addrlen);

connect binds a permanent destination to a socket, placing it in a connected state. Sockets using a
connectionless service do not have to call connect (they can specify the address in every datagram), but may.

Socket - socket descriptor.

Destaddr - pointer to a sockaddr structure (which also includes the protocol port number) specifying the
destination address.

Addrlen - length of destination address (in bytes).


bind Function

The bind function assigns a local protocol address to a socket. With the Internet protocols, the protocol
address is the combination of either a 32-bit IPv4 address or a 128-bit IPv6 address, along with a 16-bit TCP
or UDP port number.

#include <sys/socket.h>

int bind (int sockfd, const struct sockaddr *myaddr, socklen_t addrlen);
Returns: 0 if OK, -1 on error

EX: bind (socket, localaddr, addrlen);

A socket is created without any association to local or destination addresses, so a program uses bind to
establish a local address for it.

Socket - integer descriptor of the socket.

Localaddr - structure that specifies the local address to be bound.

Addrlen - integer length of the address (in bytes).

listen Function

The listen function is called only by a TCP server and it performs two actions:

1. When a socket is created by the socket function, it is assumed to be an active socket, that is, a client
socket that will issue a connect. The listen function converts an unconnected socket into a passive
socket, indicating that the kernel should accept incoming connection requests directed to this socket.
In terms of the TCP state transition diagram, the call to listen moves the socket from the CLOSED
state to the LISTEN state.
2. The second argument to this function specifies the maximum number of connections the kernel
should queue for this socket.

#include <sys/socket.h>

int listen (int sockfd, int backlog);


Returns: 0 if OK, -1 on error

This function is normally called after both the socket and bind functions and must be called before calling
the accept function.

EX: listen (socket, backlog);

A server creates a socket, binds it to a well-known port, and waits for requests. To avoid rejecting
requests that arrive while it is busy, the server calls listen, which creates a queue for them and puts the
socket into passive mode, ready for incoming connections. listen only works with sockets using a
reliable stream service.

Socket - Integer descriptor.

Backlog (qlength) - length of the request queue for that socket (historically limited to a maximum of 5;
current systems accept larger values).
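
The socket, bind, and listen sequence described above can be sketched as one function. The name tcp_listen_loopback is hypothetical, and binding to the loopback address with port 0 (so the kernel picks a free port) is an assumption made to keep the example safe to run; a real server would bind its well-known port.

```c
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a passive TCP socket on 127.0.0.1.
 * Returns the listening descriptor, or -1 on error. */
int tcp_listen_loopback(int backlog)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);   /* active, unconnected socket */
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);                   /* 0: let the kernel choose a port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0) {              /* listen makes the socket passive */
        close(fd);
        return -1;
    }
    return fd;
}
```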

accept Function

accept is called by a TCP server to return the next completed connection from the front of the completed
connection queue. If the completed connection queue is empty, the process is put to sleep (assuming the
default of a blocking socket).

#include <sys/socket.h>

int accept (int sockfd, struct sockaddr *cliaddr, socklen_t *addrlen);


Returns: non-negative descriptor if OK, -1 on error

EX: accept (socket, addr, addrlen);

bind associates a socket with a port, but that socket is not connected to a foreign destination.

When a connection request has completed, accept returns a new descriptor for the connected socket. It
blocks until a connection request arrives.

Addr - pointer to the sockaddr structure.

Addrlen - pointer to integer size of address.

fork and exec Functions

The fork function (including the variants of it provided by some systems) is the only way in Unix to create
a new process.

#include <unistd.h>

pid_t fork(void);

Returns: 0 in child, process ID of child in parent, -1 on error


If you have never seen this function before, the hard part in understanding fork is that it is called once but it
returns twice. It returns once in the calling process (called the parent) with a return value that is the process
ID of the newly created process (the child). It also returns once in the child, with a return value of 0. Hence,
the return value tells the process whether it is the parent or the child.

The reason fork returns 0 in the child, instead of the parent's process ID, is because a child has only one
parent and it can always obtain the parent's process ID by calling getppid. A parent, on the other hand, can
have any number of children, and there is no way to obtain the process IDs of its children. If a parent wants
to keep track of the process IDs of all its children, it must record the return values from fork.

All descriptors open in the parent before the call to fork are shared with the child after fork returns. We will
see this feature used by network servers: The parent calls accept and then calls fork. The connected socket is
then shared between the parent and child. Normally, the child then reads and writes the connected socket
and the parent closes the connected socket.
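
The "called once, returns twice" behavior can be sketched as below. The helper fork_and_wait is a hypothetical name; the child exits at once with the given status and the parent collects it with waitpid.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with child_status.
 * Returns the status seen by the parent, or -1 on error. */
int fork_and_wait(int child_status)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                      /* fork failed: no child was created */

    if (pid == 0)
        _exit(child_status);            /* child: fork returned 0 */

    /* parent: fork returned the child's process ID, which we must remember */
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```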


There are two typical uses of fork:

1. A process makes a copy of itself so that one copy can handle one operation while the other copy does
another task. This is typical for network servers. We will see many examples of this later in the text.
2. A process wants to execute another program. Since the only way to create a new process is by calling
fork, the process first calls fork to make a copy of itself, and then one of the copies (typically the
child process) calls exec (described next) to replace itself with the new program. This is typical for
programs such as shells.

exec replaces the current process image with the new program file, and this new program normally starts at the
main function. The process ID does not change. We refer to the process that calls exec as the calling process
and the newly executed program as the new program.

The differences in the six exec functions are: (a) whether the program file to execute is specified by a
filename or a pathname; (b) whether the arguments to the new program are listed one by one or referenced
through an array of pointers; and (c) whether the environment of the calling process is passed to the new
program or whether a new environment is specified.

#include <unistd.h>

int execl (const char *pathname, const char *arg0, ... /* (char *) 0 */ );

int execv (const char *pathname, char *const argv[]);

int execle (const char *pathname, const char *arg0, ...
            /* (char *) 0, char *const envp[] */ );

int execve (const char *pathname, char *const argv[], char *const envp[]);

int execlp (const char *filename, const char *arg0, ... /* (char *) 0 */ );

int execvp (const char *filename, char *const argv[]);

All six return: -1 on error, no return on success

These functions return to the caller only if an error occurs. Otherwise, control passes to the start of the new
program, normally the main function.

Figure 10 : Relationship among the six exec functions.

Note the following differences among these six functions:


1. The three functions in the top row specify each argument string as a separate argument to the exec
function, with a null pointer terminating the variable number of arguments. The three functions in the
second row have an argv array, containing pointers to the argument strings. This argv array must
contain a null pointer to specify its end, since a count is not specified.
2. The two functions in the left column specify a filename argument. This is converted into a pathname
using the current PATH environment variable. If the filename argument to execlp or execvp contains
a slash (/) anywhere in the string, the PATH variable is not used. The four functions in the right two
columns specify a fully qualified pathname argument.
3. The four functions in the left two columns do not specify an explicit environment pointer. Instead,
the current value of the external variable environ is used for building an environment list that is
passed to the new program. The two functions in the right column specify an explicit environment
list. The envp array of pointers must be terminated by a null pointer.
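
The fork-then-exec pattern used by shells can be sketched with execvp, which takes a filename searched via PATH and a null-terminated argv array. The helper name run_program and the exit status 127 for a failed exec are choices made for this sketch.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with arguments argv in a child process.
 * Returns the program's exit status, 127 if exec failed, or -1 on error. */
int run_program(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        execvp(argv[0], argv);          /* returns only if an error occurred */
        _exit(127);                     /* exec failed: leave the child at once */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing the array { "sh", "-c", "exit 3", NULL } runs the shell found through PATH and returns its exit status.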

Close: (A system call from traditional UNIX Environment)

close (socket descriptor);

When a client or server finishes with a socket, it calls close to release its resources. close decrements
the socket's reference count, and the socket is closed completely only when the count reaches 0, so the
connection terminates immediately unless several processes share the same socket.

Order of Socket System Calls:

Client Side (depends on connection type):    Server Side (depends on connection type):
Socket                                       Socket
Connect                                      Bind
Write (may be repeated)                      Listen
Read (may be repeated)                       Accept
Close                                        Read (may be repeated)
                                             Write (may be repeated)
                                             Close (go back to Accept)

Shutdown:

shutdown (socket, direction);

The shutdown function applies to full-duplex sockets (such as a connected TCP socket) and is used to partially
close the connection.

Socket - socket descriptor of a connected socket.

Direction - direction in which shutdown is desired:

0 = terminate further input (SHUT_RD).
1 = terminate further output (SHUT_WR).
2 = terminate both input and output (SHUT_RDWR).
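
A common use of a partial close is to finish sending while still being able to receive. The sketch below, under the hypothetical name send_then_half_close, writes a buffer and then shuts down only the output direction. For brevity it does not retry writes interrupted by a signal.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send len bytes on a connected stream socket, then
 * signal end of output to the peer. Returns 0 if OK, -1 on error. */
int send_then_half_close(int sockfd, const void *data, size_t len)
{
    const char *ptr = data;

    while (len > 0) {
        ssize_t n = write(sockfd, ptr, len);
        if (n <= 0)
            return -1;
        ptr += n;
        len -= (size_t)n;
    }

    /* Direction 1 (SHUT_WR): no more output; input on sockfd still works. */
    return shutdown(sockfd, SHUT_WR);
}
```

After this call the peer reads the data followed by end-of-file, yet it can still send a reply that the caller reads on the same descriptor.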


Concurrent Servers

Outline for a typical concurrent server


pid_t pid;
int listenfd, connfd;

listenfd = Socket( ... );

/* fill in sockaddr_in{} with server's well-known port */
Bind(listenfd, ... );
Listen(listenfd, LISTENQ);

for ( ; ; ) {
    connfd = Accept(listenfd, ... );    /* probably blocks */

    if ( (pid = Fork()) == 0) {
        Close(listenfd);    /* child closes listening socket */
        doit(connfd);       /* process the request */
        Close(connfd);      /* done with this client */
        exit(0);            /* child terminates */
    }
    Close(connfd);          /* parent closes connected socket */
}

When a connection is established, accept returns, the server calls fork, and the child process services the
client (on connfd, the connected socket) and the parent process waits for another connection (on listenfd, the
listening socket). The parent closes the connected socket since the child handles the new client.

Figure 11 : Status of client/server before call to accept returns

First, Figure 11 shows the status of the client and server while the server is blocked in the call to accept and
the connection request arrives from the client.

Figure 12 : Status of client/server after return from accept

Immediately after accept returns, we have the scenario shown in Figure 12. The connection is accepted by
the kernel and a new socket, connfd, is created. This is a connected socket and data can now be read and
written across the connection.


Figure 13 : Status of client/server after fork returns

The next step in the concurrent server is to call fork. Figure 13 shows the status after fork returns.

Figure 14 : Status of client/server after parent and child close appropriate sockets

Notice that both descriptors, listenfd and connfd, are shared (duplicated) between the parent and child.

The next step is for the parent to close the connected socket and the child to close the listening socket. This
is shown in Figure 14.

This is the desired final state of the sockets. The child is handling the connection with the client and the
parent can call accept again on the listening socket, to handle the next client connection.

getsockname and getpeername Functions

These two functions return either the local protocol address associated with a socket (getsockname) or the
foreign protocol address associated with a socket (getpeername).


#include <sys/socket.h>

int getsockname(int sockfd, struct sockaddr *localaddr, socklen_t *addrlen);

int getpeername(int sockfd, struct sockaddr *peeraddr, socklen_t *addrlen);

Both return: 0 if OK, -1 on error
Notice that the final argument for both functions is a value-result argument. That is, both functions fill in
the socket address structure pointed to by localaddr or peeraddr. The term "name" in these function names
is misleading. These two functions return the protocol address associated with one of
the two ends of a network connection, which for IPv4 and IPv6 is the combination of an IP address and
port number. These functions have nothing to do with domain names.

These two functions are required for the following reasons:

• After connect successfully returns in a TCP client that does not call bind, getsockname returns the
local IP address and local port number assigned to the connection by the kernel.
• After calling bind with a port number of 0 (telling the kernel to choose the local port number),
getsockname returns the local port number that was assigned.
• getsockname can be called to obtain the address family of a socket.
• In a TCP server that binds the wildcard IP address, once a connection is established with a client
(accept returns successfully), the server can call getsockname to obtain the local IP address
assigned to the connection. The socket descriptor argument in this call must be that of the connected
socket, and not the listening socket.
• When a server is execed by the process that calls accept, the only way the server can obtain the
identity of the client is to call getpeername.
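
The following sketch covers the port-number case from the list above. The helper names local_port and peer_port are hypothetical; both pass the length as a value-result argument and then read the port according to the address family.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Extract the host-order port from a filled-in address, or -1 if the
 * family is not handled in this sketch. */
static int port_of(const struct sockaddr_storage *ss)
{
    if (ss->ss_family == AF_INET)
        return ntohs(((const struct sockaddr_in *)ss)->sin_port);
    if (ss->ss_family == AF_INET6)
        return ntohs(((const struct sockaddr_in6 *)ss)->sin6_port);
    return -1;
}

/* Hypothetical helper: local port of sockfd, or -1 on error. */
int local_port(int sockfd)
{
    struct sockaddr_storage ss;
    socklen_t len = sizeof(ss);         /* value-result argument */

    if (getsockname(sockfd, (struct sockaddr *)&ss, &len) < 0)
        return -1;
    return port_of(&ss);
}

/* Hypothetical helper: peer's port of a connected sockfd, or -1 on error. */
int peer_port(int sockfd)
{
    struct sockaddr_storage ss;
    socklen_t len = sizeof(ss);         /* value-result argument */

    if (getpeername(sockfd, (struct sockaddr *)&ss, &len) < 0)
        return -1;                      /* for example, the socket is not connected */
    return port_of(&ss);
}
```

After a bind with port 0, local_port reports the port the kernel chose; peer_port succeeds only once the socket is connected.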
