Unit-Ii: Socket Address Structures
UNIT-II
Socket Address Structures
Most socket functions require a pointer to a socket address structure as an argument. Each supported
protocol suite defines its own socket address structure.
An IPv4 socket address structure, commonly called an "Internet socket address structure," is named
sockaddr_in and is defined by including the <netinet/in.h> header. The POSIX definition of the IPv4 socket
address structure is shown below.
struct in_addr {
    in_addr_t s_addr;            /* 32-bit IPv4 address */
                                 /* network byte ordered */
};
struct sockaddr_in {
    uint8_t        sin_len;      /* length of structure (16) */
    sa_family_t    sin_family;   /* AF_INET */
    in_port_t      sin_port;     /* 16-bit TCP or UDP port number */
                                 /* network byte ordered */
    struct in_addr sin_addr;     /* 32-bit IPv4 address */
                                 /* network byte ordered */
    char           sin_zero[8];  /* unused */
};
• The POSIX specification requires only three members in the structure: sin_family, sin_addr, and
sin_port.
• You may also encounter the datatypes u_char, u_short, u_int, and u_long, which are all unsigned; they are
provided for backward compatibility.
• Both the IPv4 address and the TCP or UDP port number are always stored in the structure in network
byte order.
• The 32-bit IPv4 address can be accessed in two different ways.
o For example, if serv is defined as an Internet socket address structure, then serv.sin_addr
references the 32-bit IPv4 address as an in_addr structure, while serv.sin_addr.s_addr
references the same 32-bit IPv4 address as an in_addr_t (typically an unsigned 32-bit
integer).
• A socket address structure is always passed by reference when passed as an argument to any
socket functions.
• void * is the generic pointer type.
• The generic socket address structure is defined by including the <sys/socket.h> header:
struct sockaddr {
    uint8_t     sa_len;
    sa_family_t sa_family;   /* address family: AF_xxx value */
    char        sa_data[14]; /* protocol-specific address */
};
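The socket functions are then defined as taking a pointer to the generic socket address structure, as shown
here in the ANSI C function prototype for the bind function:
int bind(int sockfd, const struct sockaddr *myaddr, socklen_t addrlen);
Any call to these functions must therefore cast the pointer to the protocol-specific socket address structure
to a pointer to a generic socket address structure.
Example:
struct sockaddr_in serv; /* IPv4 socket address structure */
/* fill in serv{} */
bind(sockfd, (struct sockaddr *) &serv, sizeof(serv));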
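The "fill in serv{}" step can be sketched as follows. This is only an illustration: the helper names
fill_ipv4_any and bind_ipv4_any are hypothetical, and the sketch assumes a system (such as Linux) whose
sockaddr_in has no sin_len member, so that member is not set.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill in an IPv4 socket address structure with the
 * wildcard address and the given port (port is in host byte order). */
void fill_ipv4_any(struct sockaddr_in *serv, uint16_t port)
{
    memset(serv, 0, sizeof(*serv));             /* also clears sin_zero */
    serv->sin_family = AF_INET;
    serv->sin_port = htons(port);               /* network byte order */
    serv->sin_addr.s_addr = htonl(INADDR_ANY);  /* network byte order */
}

/* Hypothetical helper: bind sockfd to the wildcard address and port.
 * Note the cast to the generic socket address structure. */
int bind_ipv4_any(int sockfd, uint16_t port)
{
    struct sockaddr_in serv;

    fill_ipv4_any(&serv, port);
    return bind(sockfd, (struct sockaddr *) &serv, sizeof(serv));
}
```

The structure is always passed by reference, together with its size, so the kernel knows how many bytes to
copy in.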
The IPv6 socket address structure, sockaddr_in6, is also defined by including the <netinet/in.h> header:
struct sockaddr_in6 {
    uint8_t         sin6_len;      /* length of this struct (28) */
    sa_family_t     sin6_family;   /* AF_INET6 */
    in_port_t       sin6_port;     /* transport layer port# */
                                   /* network byte ordered */
    uint32_t        sin6_flowinfo; /* flow information, undefined */
    struct in6_addr sin6_addr;     /* IPv6 address */
                                   /* network byte ordered */
    uint32_t        sin6_scope_id; /* set of interfaces for a scope */
};
The sin6_scope_id identifies the scope zone in which a scoped address is meaningful, most commonly an
interface index for a link-local address.
Value-Result Arguments
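Functions that pass a socket address structure from the kernel to the process (accept, recvfrom, getsockname,
and getpeername) take a pointer to the structure and a pointer to an integer containing its size, rather than
the size itself. The size changes from an integer to a pointer to an integer because it is both a value when
the function is called (it tells the kernel the size of the structure so that the kernel does not write past
the end of the structure when filling it in) and a result when the function returns (it tells the process how
much information the kernel actually stored in the structure). This type of argument is called a value-result
argument.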
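A minimal sketch of a value-result argument, using getsockname, is shown below. The helper name
local_port_ipv4 is hypothetical, and the sketch assumes the descriptor refers to an IPv4 socket.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: return the local port of sockfd in host byte
 * order, or -1 on error. Assumes an IPv4 socket. */
int local_port_ipv4(int sockfd)
{
    struct sockaddr_in local;
    socklen_t len = sizeof(local);   /* value: size of our structure */

    memset(&local, 0, sizeof(local));
    if (getsockname(sockfd, (struct sockaddr *) &local, &len) < 0)
        return -1;

    /* result: len now holds the number of bytes the kernel stored */
    if (len < sizeof(local) || local.sin_family != AF_INET)
        return -1;
    return ntohs(local.sin_port);
}
```

Because len is passed by address, the kernel can both read the size we supplied and overwrite it with the
size it actually used.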
Consider a 16-bit integer that is made up of 2 bytes. There are two ways to store the two bytes in memory:
with the low-order byte at the starting address, known as little-endian byte order, or with the high-order byte
at the starting address, known as big-endian byte order.
Figure 4: Little-endian byte order and big-endian byte order for a 16-bit integer
We must deal with these byte ordering differences as network programmers because networking protocols
must specify a network byte order. For example, in a TCP segment, there is a 16-bit port number and a
32-bit IPv4 address. The sending protocol stack and the receiving protocol stack must agree on the order in
which the bytes of these multibyte fields will be transmitted. The Internet protocols use big-endian byte
ordering for these multibyte integers.
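One way to find out which byte order a host uses is to store a known 16-bit value and look at the byte at
the lowest address. The following is a small sketch of that idea; the function name host_is_little_endian is
our own.

```c
#include <stdint.h>

/* Returns 1 if the host stores the low-order byte at the starting
 * address (little-endian), 0 otherwise (big-endian). */
int host_is_little_endian(void)
{
    union {
        uint16_t      value;
        unsigned char bytes[sizeof(uint16_t)];
    } probe;

    probe.value = 0x0102;           /* high-order byte 0x01, low-order 0x02 */
    return probe.bytes[0] == 0x02;  /* is the low-order byte stored first? */
}
```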
In theory, an implementation could store the fields in a socket address structure in host byte order and then
convert to and from the network byte order when moving the fields to and from the protocol headers, saving
us from having to worry about this detail. But, both history and the POSIX specification say that certain
fields in the socket address structures must be maintained in network byte order. Our concern is therefore
converting between host byte order and network byte order. We use the following four functions to convert
between these two byte orders.
#include <netinet/in.h>
uint16_t htons(uint16_t host16bitvalue);
uint32_t htonl(uint32_t host32bitvalue);
Both return: value in network byte order
uint16_t ntohs(uint16_t net16bitvalue);
uint32_t ntohl(uint32_t net32bitvalue);
Both return: value in host byte order
In the names of these functions, h stands for host, n stands for network, s stands for short, and l stands for
long. The terms "short" and "long" are historical artifacts from the Digital VAX implementation of 4.2BSD.
We should instead think of s as a 16-bit value (such as a TCP or UDP port number) and l as a 32-bit value
(such as an IPv4 address). Indeed, on the 64-bit Digital Alpha, a long integer occupies 64 bits, yet the htonl
and ntohl functions operate on 32-bit values.
NOTE: These functions only convert the byte order of values held in memory (for example, the fields stored in
a socket address structure); they do not send or receive any data.
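The effect of htons can be observed by copying its result into a byte array. The sketch below does this for
a port number; the helper name port_wire_bytes is hypothetical.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Store the network byte order form of a 16-bit port in out[0..1].
 * Whatever the host byte order, out[0] receives the high-order byte. */
void port_wire_bytes(uint16_t host_port, unsigned char out[2])
{
    uint16_t net = htons(host_port);

    memcpy(out, &net, sizeof(net));
}
```

For port 8080 (0x1F90) the two bytes are 0x1F followed by 0x90 on every host, which is exactly the
big-endian order the Internet protocols use.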
There are two groups of functions that operate on multibyte fields, without interpreting the data, and without
assuming that the data is a null-terminated C string. We need these types of functions when dealing with
socket address structures because we need to manipulate fields such as IP addresses, which can contain
bytes of 0, but are not C character strings.
The first group of functions, whose names begin with b (for byte), are from 4.2BSD and are still provided by
almost any system that supports the socket functions. The second group of functions, whose names begin
with mem (for memory), are from the ANSI C standard and are provided with any system that supports an
ANSI C library.
#include <strings.h>
void bzero(void *dest, size_t nbytes);
void bcopy(const void *src, void *dest, size_t nbytes);
int bcmp(const void *ptr1, const void *ptr2, size_t nbytes);
Returns: 0 if equal, nonzero if unequal
The following functions are the ANSI C functions:
#include <string.h>
void *memset(void *dest, int c, size_t len);
void *memcpy(void *dest, const void *src, size_t nbytes);
int memcmp(const void *ptr1, const void *ptr2, size_t nbytes);
Returns: 0 if equal, <0 or >0 if unequal (see text)
src might represent application space and dest might represent socket send buffer space (socket receive
buffer space).
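As an illustration of why the byte functions are needed, the sketch below copies and compares IPv4 addresses
with memset, memcpy, and memcmp. The helper names are hypothetical. An address such as 10.0.0.1 contains zero
bytes, so string functions like strcpy or strcmp would stop too early.

```c
#include <netinet/in.h>
#include <string.h>

/* Returns 1 if the two IPv4 addresses are identical, 0 otherwise. */
int same_ipv4_addr(const struct in_addr *a, const struct in_addr *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}

/* Zero dst, then copy the family, port, and address from src. */
void copy_ipv4_endpoint(struct sockaddr_in *dst, const struct sockaddr_in *src)
{
    memset(dst, 0, sizeof(*dst));   /* same effect as bzero(dst, sizeof(*dst)) */
    dst->sin_family = src->sin_family;
    dst->sin_port = src->sin_port;
    memcpy(&dst->sin_addr, &src->sin_addr, sizeof(dst->sin_addr));
}
```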
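The address conversion functions convert Internet addresses between ASCII strings (such as dotted-decimal
notation) and the network byte ordered binary values stored in socket address structures. The following
functions are for IPv4 only.
#include <arpa/inet.h>
int inet_aton(const char *strptr, struct in_addr *addrptr);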
Returns: 1 if string was valid, 0 on error
in_addr_t inet_addr(const char *strptr);
Returns: 32-bit binary network byte ordered IPv4 address; INADDR_NONE if error
char *inet_ntoa(struct in_addr inaddr);
Returns: pointer to dotted-decimal string
The following two functions work with IPv6 addresses. They can also be used for IPv4 addresses; the family
argument (AF_INET or AF_INET6) selects the address family.
#include <arpa/inet.h>
int inet_pton(int family, const char *strptr, void *addrptr);
Returns: 1 if OK, 0 if input not a valid presentation format, -1 on error
const char *inet_ntop(int family, const void *addrptr, char *strptr, size_t len);
Returns: pointer to result if OK, NULL on error
sock_ntop Function
A basic problem with inet_ntop is that it requires the caller to pass a pointer to a binary address. This
address is normally contained in a socket address structure, requiring the caller to know the format of the
structure and the address family.
To solve this problem, sock_ntop takes a pointer to a socket address structure as an argument, looks inside
the structure, calls the appropriate function, and returns the presentation format of the address.
#include "unp.h"
char *sock_ntop(const struct sockaddr *sockaddr, socklen_t addrlen);
Returns: non-null pointer if OK, NULL on error
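The idea behind sock_ntop can be sketched as below. This is not the unp.h implementation: the name
my_sock_ntop is hypothetical, the caller supplies the output buffer (an assumption made here to keep the
sketch reentrant), and only AF_INET and AF_INET6 are handled.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Write "address:port" (IPv4) or "[address]:port" (IPv6) into buf.
 * Returns buf if OK, NULL on error or for an unsupported family. */
char *my_sock_ntop(const struct sockaddr *sa, char *buf, size_t buflen)
{
    char host[INET6_ADDRSTRLEN];

    if (sa->sa_family == AF_INET) {
        const struct sockaddr_in *sin = (const struct sockaddr_in *) sa;

        if (inet_ntop(AF_INET, &sin->sin_addr, host, sizeof(host)) == NULL)
            return NULL;
        snprintf(buf, buflen, "%s:%u", host, (unsigned) ntohs(sin->sin_port));
        return buf;
    }
    if (sa->sa_family == AF_INET6) {
        const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *) sa;

        if (inet_ntop(AF_INET6, &sin6->sin6_addr, host, sizeof(host)) == NULL)
            return NULL;
        snprintf(buf, buflen, "[%s]:%u", host, (unsigned) ntohs(sin6->sin6_port));
        return buf;
    }
    return NULL;   /* unknown address family */
}
```

The caller no longer needs to know which address family the structure holds; the function inspects
sa_family and dispatches accordingly.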
Stream sockets (e.g., TCP sockets) exhibit a behavior with the read and write functions that differs from
normal file I/O. A read or write on a stream socket might input or output fewer bytes than requested, but this
is not an error condition. The reason is that buffer limits might be reached for the socket in the kernel. All
that is required to input or output the remaining bytes is for the caller to invoke the read or write function
again. Some versions of UNIX also exhibit this behavior when writing more than 4,096 bytes to a pipe. This
scenario is always a possibility on a stream socket with read, but is normally seen with write only if the
socket is nonblocking. Nevertheless, we always call our writen function instead of write, in case the
implementation returns a short count.
#include "unp.h"
ssize_t readn(int filedes, void *buff, size_t nbytes);
ssize_t writen(int filedes, const void *buff, size_t nbytes);
ssize_t readline(int filedes, void *buff, size_t maxlen);
All return: number of bytes read or written, -1 on error
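The looping behavior described above can be sketched as follows. These are illustrative versions, named
writen_sketch and readn_sketch to make clear they are not the unp.h sources; they retry when a call is
interrupted by a signal (EINTR), which is an assumption about the desired error handling.

```c
#include <errno.h>
#include <unistd.h>

/* Write exactly n bytes to fd. Returns n if OK, -1 on error. */
ssize_t writen_sketch(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t w = write(fd, p, left);

        if (w <= 0) {
            if (w < 0 && errno == EINTR)
                continue;          /* interrupted: call write again */
            return -1;
        }
        p += w;                    /* short count: advance and retry */
        left -= (size_t) w;
    }
    return (ssize_t) n;
}

/* Read up to n bytes from fd, stopping early only at end-of-file.
 * Returns the number of bytes read (0 at EOF), -1 on error. */
ssize_t readn_sketch(int fd, void *buf, size_t n)
{
    char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t r = read(fd, p, left);

        if (r < 0) {
            if (errno == EINTR)
                continue;          /* interrupted: call read again */
            return -1;
        }
        if (r == 0)
            break;                 /* EOF */
        p += r;
        left -= (size_t) r;
    }
    return (ssize_t) (n - left);
}
```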
This chapter describes the elementary socket functions required to write a complete TCP client and server.
socket Function
To perform network I/O, the first thing a process must do is call the socket function, specifying the type of
communication protocol desired (TCP using IPv4, UDP using IPv6, Unix domain stream protocol, etc.).
#include <sys/socket.h>
int socket(int family, int type, int protocol);
Returns: non-negative descriptor if OK, -1 on error
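connect Function
The connect function is used by a TCP client to establish a connection with a TCP server.
#include <sys/socket.h>
int connect(int sockfd, const struct sockaddr *destaddr, socklen_t addrlen);
Returns: 0 if OK, -1 on error
destaddr: a socket address structure (which also includes the protocol port number) specifying the
destination address.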
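A TCP client typically fills in the destination address, creates a socket, and calls connect. The sketch
below shows these steps for IPv4; the helper name tcp_connect_ipv4 is hypothetical, and the server address
and port are whatever the caller supplies.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Connect a TCP socket to ip:port (port in host byte order).
 * Returns the connected descriptor, or -1 on error. */
int tcp_connect_ipv4(const char *ip, uint16_t port)
{
    struct sockaddr_in servaddr;
    int sockfd;

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &servaddr.sin_addr) != 1)
        return -1;                     /* not a valid IPv4 address */

    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
        return -1;
    if (connect(sockfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0) {
        close(sockfd);                 /* do not leak the descriptor */
        return -1;
    }
    return sockfd;
}
```

The client does not call bind here, so the kernel chooses the local address and an ephemeral port when
connect is called.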
bind Function
The bind function assigns a local protocol address to a socket. With the Internet protocols, the protocol
address is the combination of either a 32-bit IPv4 address or a 128-bit IPv6 address, along with a 16-bit TCP
or UDP port number.
#include <sys/socket.h>
int bind (int sockfd, const struct sockaddr *myaddr, socklen_t addrlen);
Returns: 0 if OK, -1 on error
A socket is created without any association to local or destination addresses, so a program uses bind to
establish a local address for it.
listen Function
The listen function is called only by a TCP server and it performs two actions:
1. When a socket is created by the socket function, it is assumed to be an active socket, that is, a client
socket that will issue a connect. The listen function converts an unconnected socket into a passive
socket, indicating that the kernel should accept incoming connection requests directed to this socket.
In terms of the TCP state transition diagram, the call to listen moves the socket from the CLOSED
state to the LISTEN state.
2. The second argument to this function specifies the maximum number of connections the kernel
should queue for this socket.
#include <sys/socket.h>
int listen(int sockfd, int backlog);
Returns: 0 if OK, -1 on error
This function is normally called after both the socket and bind functions and must be called before calling
the accept function.
The server creates a socket, binds it to a well-known port, and waits for requests. So that connection
requests arriving while the server is busy are not rejected, listen creates a queue for them and puts the
socket into passive mode, ready to accept incoming connections. listen works only with sockets that use a
reliable stream service.
backlog (queue length): the length of the request queue for that socket (historically limited to a maximum
of 5).
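The usual server sequence of socket, bind, and listen can be sketched as one helper. The name
tcp_listen_ipv4 is hypothetical; the sketch binds the wildcard address, and passing a port of 0 lets the
kernel choose the port.

```c
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket, bind it to the wildcard address and the given
 * port, and move it to the LISTEN state.
 * Returns the listening descriptor, or -1 on error. */
int tcp_listen_ipv4(uint16_t port, int backlog)
{
    struct sockaddr_in servaddr;
    int listenfd = socket(AF_INET, SOCK_STREAM, 0);

    if (listenfd < 0)
        return -1;

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port = htons(port);

    if (bind(listenfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0 ||
        listen(listenfd, backlog) < 0) {
        close(listenfd);
        return -1;
    }
    return listenfd;   /* passive socket, ready for accept */
}
```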
accept Function
accept is called by a TCP server to return the next completed connection from the front of the completed
connection queue. If the completed connection queue is empty, the process is put to sleep (assuming the
default of a blocking socket).
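#include <sys/socket.h>
int accept(int sockfd, struct sockaddr *cliaddr, socklen_t *addrlen);
Returns: non-negative descriptor if OK, -1 on error
bind associates a socket with a port, but that socket is not connected to a foreign destination.
When a request comes in, accept establishes the full connection. It blocks until a connection request arrives.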
The fork function (including the variants of it provided by some systems) is the only way in Unix to create
a new process.
#include <unistd.h>
pid_t fork(void);
Returns: 0 in child, process ID of child in parent, -1 on error
The reason fork returns 0 in the child, instead of the parent's process ID, is because a child has only one
parent and it can always obtain the parent's process ID by calling getppid. A parent, on the other hand, can
have any number of children, and there is no way to obtain the process IDs of its children. If a parent wants
to keep track of the process IDs of all its children, it must record the return values from fork.
All descriptors open in the parent before the call to fork are shared with the child after fork returns. We will
see this feature used by network servers: The parent calls accept and then calls fork. The connected socket is
then shared between the parent and child. Normally, the child then reads and writes the connected socket
and the parent closes the connected socket.
There are two typical uses of fork:
1. A process makes a copy of itself so that one copy can handle one operation while the other copy does
another task. This is typical for network servers. We will see many examples of this later in the text.
2. A process wants to execute another program. Since the only way to create a new process is by calling
fork, the process first calls fork to make a copy of itself, and then one of the copies (typically the
child process) calls exec (described next) to replace itself with the new program. This is typical for
programs such as shells.
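The two return values of fork can be seen in a small sketch. The function name fork_and_collect is our own;
the child exits at once with a chosen status and the parent collects that status with waitpid.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits immediately with child_status. The parent
 * waits for it and returns the exit status it observed, or -1 on error. */
int fork_and_collect(int child_status)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                 /* fork failed */
    if (pid == 0)
        _exit(child_status);       /* child: fork returned 0 */

    /* parent: fork returned the child's process ID, which we must
     * remember ourselves if we want to wait for this child */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```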
exec replaces the current process image with the new program file, and this new program normally starts at the
main function. The process ID does not change. We refer to the process that calls exec as the calling process
and the newly executed program as the new program.
The differences in the six exec functions are: (a) whether the program file to execute is specified by a
filename or a pathname; (b) whether the arguments to the new program are listed one by one or referenced
through an array of pointers; and (c) whether the environment of the calling process is passed to the new
program or whether a new environment is specified.
#include <unistd.h>
int execl (const char *pathname, const char *arg0, ... /* (char *) 0 */ );
int execlp (const char *filename, const char *arg0, ... /* (char *) 0 */ );
/* The other four exec functions are execv, execle, execve, and execvp. */
These functions return to the caller only if an error occurs. Otherwise, control passes to the start of the new
program, normally the main function.
1. The three functions in the top row (execlp, execl, and execle) specify each argument string as a separate
argument to the exec function, with a null pointer terminating the variable number of arguments. The
three functions in the second row (execvp, execv, and execve) have an argv array, containing pointers
to the argument strings. This argv array must contain a null pointer to specify its end, since a count is
not specified.
2. The two functions in the left column (execlp and execvp) specify a filename argument. This is converted
into a pathname using the current PATH environment variable. If the filename argument to execlp or
execvp contains a slash (/) anywhere in the string, the PATH variable is not used. The four functions in
the right two columns (execl, execv, execle, and execve) specify a fully qualified pathname argument.
3. The four functions in the left two columns (execlp, execvp, execl, and execv) do not specify an explicit
environment pointer. Instead, the current value of the external variable environ is used for building an
environment list that is passed to the new program. The two functions in the right column (execle and
execve) specify an explicit environment list. The envp array of pointers must be terminated by a null
pointer.
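The fork-then-exec pattern used by shells can be sketched with execl, which takes a pathname and a list of
arguments ended by a null pointer. The helper name run_shell_command is hypothetical, and the sketch assumes
that /bin/sh exists on the system.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run "sh -c command" in a child process.
 * Returns the child's exit status, or -1 on error. */
int run_shell_command(const char *command)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* child: replace this process image with the shell */
        execl("/bin/sh", "sh", "-c", command, (char *) 0);
        _exit(127);                /* reached only if execl failed */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because execl returns only on error, the _exit(127) line runs only when the new program could not be
started.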
When a client or server finishes with a socket, it calls close to deallocate the socket's resources. The
connection terminates immediately unless several processes share the same socket. In that case, close only
decrements the socket's reference count, and the socket is closed completely when the reference count reaches 0.
Client Side                        Server Side
(depends on connection type):      (depends on connection type):
Socket                             Socket
Connect                            Bind
Write (may be repeated)            Listen
Read (may be repeated)             Accept
Close                              Read (may be repeated)
                                   Write (may be repeated)
                                   Close (go back to Accept)
Shutdown:
The shutdown function applies to full-duplex sockets (connected using a TCP socket) and is used to partially
close the connection.
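A common use of a partial close is to finish sending while still being able to read the peer's reply. The
sketch below uses shutdown with SHUT_WR for this; the helper name send_then_half_close is hypothetical.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Send a request, then close only the write half of the connection.
 * The peer sees end-of-file, but we can still read its reply.
 * Returns 0 if OK, -1 on error. */
int send_then_half_close(int sockfd, const void *msg, size_t len)
{
    if (write(sockfd, msg, len) != (ssize_t) len)
        return -1;
    return shutdown(sockfd, SHUT_WR);   /* no more writes on sockfd */
}
```

Unlike close, shutdown acts on the connection regardless of how many descriptors refer to the socket, and
the descriptor itself stays open until close is called.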
Concurrent Servers
for ( ; ; ) {
    connfd = Accept(listenfd, ... ); /* probably blocks */
    /* fork a child to service connfd; the parent closes connfd */
}
When a connection is established, accept returns, the server calls fork, and the child process services the
client (on connfd, the connected socket) and the parent process waits for another connection (on listenfd, the
listening socket). The parent closes the connected socket since the child handles the new client.
First, Figure 10 shows the status of the client and server while the server is blocked in the call to accept and
the connection request arrives from the client.
Immediately after accept returns, we have the scenario shown in Figure 11. The connection is accepted by
the kernel and a new socket, connfd, is created. This is a connected socket and data can now be read and
written across the connection.
The next step in the concurrent server is to call fork. Figure 12 shows the status after fork returns.
Figure 14: Status of client/server after parent and child close appropriate sockets
Notice that both descriptors, listenfd and connfd, are shared (duplicated) between the parent and child.
The next step is for the parent to close the connected socket and the child to close the listening socket. This
is shown in Figure 13.
This is the desired final state of the sockets. The child is handling the connection with the client and the
parent can call accept again on the listening socket, to handle the next client connection.
These two functions return either the local protocol address associated with a socket (getsockname) or the
foreign protocol address associated with a socket (getpeername).
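#include <sys/socket.h>
int getsockname(int sockfd, struct sockaddr *localaddr, socklen_t *addrlen);
int getpeername(int sockfd, struct sockaddr *peeraddr, socklen_t *addrlen);
Both return: 0 if OK, -1 on error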
• After connect successfully returns in a TCP client that does not call bind, getsockname returns the
local IP address and local port number assigned to the connection by the kernel.
• After calling bind with a port number of 0 (telling the kernel to choose the local port number),
getsockname returns the local port number that was assigned. getsockname can be called to obtain
the address family of a socket.
• In a TCP server that binds the wildcard IP address, once a connection is established with a client
(accept returns successfully), the server can call getsockname to obtain the local IP address
assigned to the connection. The socket descriptor argument in this call must be that of the connected
socket, and not the listening socket.
• When a server is execed by the process that calls accept, the only way the server can obtain the
identity of the client is to call getpeername.
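Obtaining the address family of a socket with getsockname can be sketched as below. The function name
socket_family is our own; sockaddr_storage is used as the buffer on the assumption that it is large enough
for any address family the system supports.

```c
#include <sys/socket.h>

/* Returns the address family of sockfd (AF_INET, AF_INET6, ...),
 * or -1 on error. */
int socket_family(int sockfd)
{
    struct sockaddr_storage ss;
    socklen_t len = sizeof(ss);   /* value-result argument */

    if (getsockname(sockfd, (struct sockaddr *) &ss, &len) < 0)
        return -1;
    return ss.ss_family;
}
```

getpeername is called in the same way, but it succeeds only on a connected socket and fills in the address
of the remote end.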