Sockets
NJIT
Introduction to Sockets
A socket is one of the most fundamental technologies of computer networking. The socket is the BSD method for accomplishing interprocess communication (IPC). What this means is a socket is used to allow one process to speak to another, very much like the telephone is used to allow one person to speak to another. Many of today's most popular software packages -- including Web Browsers, Instant Messaging and File Sharing -- rely on sockets.
Funded by ARPA (Advanced Research Projects Agency) in 1980.
Developed at UC Berkeley.
Objective: to transport TCP/IP software to UNIX.
The socket interface has become a de facto standard.
History of Sockets
Sockets were introduced in 1981 as the Unix BSD 4.2 generic interface for Unix to Unix communications over networks. In 1985, SunOS introduced NFS and RPC over sockets. In 1986, AT&T introduced the Transport Layer Interface (TLI) with socket-like functionality but more network independent.
The Windows socket API, Winsock, is a multivendor specification to standardize the use of TCP/IP under Windows. It is based on the Berkeley sockets interface. In BSD Unix, Sockets are part of the kernel and provide standalone and networked IPC services. MS-DOS, Windows, Mac OS, and OS/2 provide sockets in the form of libraries.
3 Types of Socket
Stream sockets interface to the TCP (transmission control protocol). Datagram sockets interface to the UDP (user datagram protocol). Raw sockets interface to the IP (Internet protocol).
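As a minimal sketch of the first two types, the snippet below uses Python's standard socket module (an assumption; the slides do not prescribe a language). SOCK_STREAM requests a stream (TCP) socket and SOCK_DGRAM a datagram (UDP) socket. Raw sockets are left out here because they normally require root privileges.

```python
import socket

# AF_INET selects IPv4; the second argument selects the socket type.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as stream_sock, \
        socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as datagram_sock:
    # The .type attribute reports which kind of socket was created.
    stream_kind = stream_sock.type
    datagram_kind = datagram_sock.type
```

Neither socket is connected or bound yet; they are only allocated with a type.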
TCP is used for services with a large data capacity and a persistent connection, while UDP is more commonly used for quick lookups and single-use query-reply actions. Some common examples of TCP and UDP services with their default ports:
Service                              Protocol  Port
DNS lookup                           UDP       53
FTP                                  TCP       21
HTTP                                 TCP       80
POP3                                 TCP       110
Windows shared printer name lookup   UDP       137
Telnet                               TCP       23
In 1978-1982, when the TCP/IP protocols were developed, provisions were made for 2^32 (about 4 billion) hosts. The address protocol, IPv4, has proven inadequate due to the unexpectedly rapid growth of the Internet and inefficient use of the address space. IPv6 uses 16 byte (128 bit) addresses, allowing 2^128 (about 3.4 x 10^38) addressable entities. Even under pessimistic assumptions about how efficiently addresses are allocated, this is roughly 1,000 IP addresses for each square meter of the surface of the earth, including the oceans.
Try to avoid confusion between an IP address and an IP header. An IP header includes the addresses of both the source and destination nodes, along with other information, and is followed by the attached data. Port numbers are carried in the transport-layer (TCP or UDP) header rather than in the IP header. The address is just an identifier for a network location.
IPv4 addresses
The 32 bits of an IPv4 address are broken into 4 octets, or 8 bit fields. In decimal notation, an 8 bit number can be represented by the values 0 to 255. For networks of different size, the first one (for large networks) to three (for small networks) octets can be used to identify the network, while the rest of the octets can be used to identify the node on the network.
Class A, B, C, D, E Addresses
Using reserved values for the first octet, network addresses are broken into classes:
Class A: very large networks (up to 2^24 hosts)
Class B: large networks (up to 2^16 hosts)
Class C: small networks (up to 254 hosts)
Class D: multicast messages to multiple hosts
Class E: addresses not allocated and reserved
This addressing scheme is shown graphically on the following slides.
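The class of an address can be read directly from its first octet. The sketch below is illustrative only: the function name address_class is hypothetical, and the boundaries used (128, 192, 224, 240) are the standard classful first-octet ranges.

```python
def address_class(dotted: str) -> str:
    """Return the classful category (A to E) of a dotted-decimal IPv4 address."""
    octets = [int(part) for part in dotted.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a valid IPv4 address: {dotted!r}")
    first = octets[0]
    if first < 128:
        return "A"  # leading bit 0 (first octets 0 and 127 are special cases)
    if first < 192:
        return "B"  # leading bits 10
    if first < 224:
        return "C"  # leading bits 110
    if first < 240:
        return "D"  # leading bits 1110, multicast
    return "E"      # leading bits 1111, reserved
```

For example, address_class("192.168.1.5") falls in the 192 to 223 range and is reported as class C.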
[Figure: decimal representation of IPv4 address classes (Coulouris et al). Recoverable first-octet ranges: Class C, 192 to 223; Class D (multicast), 224 to 239; Class E (reserved), 240 to 255. Middle octets range from 0 to 255 and the final octet from 1 to 254.]
IP Protocol Approach
Define functions that support network communications in general, and use parameters to make TCP/IP communication a special case. Socket calls refer to all TCP/IP protocols as a single protocol family.
IP Protocol
The IP protocol transmits datagrams from one host to another with unreliable or best-effort semantics. Delivery is not guaranteed. The IP layer puts datagrams into packets suitable for transmission in the underlying network, such as Ethernet. It must also inform the underlying network of the address of the message destination using address resolution.
Address Resolution
The address resolution module must convert an internet address so that it can be understood by the underlying network. For example, the 32 bit IPv4 address has to be converted to a 48 bit Ethernet address on an Ethernet network. This process is specific for each network, and network addressing schemes do not correlate directly to one another. Typically, known address resolutions will be cached, while new addresses are found by querying each node on the network.
In 1996, due largely to the allocation of class B network addresses to small networks, the Internet began to run out of addresses. Network administrators who could not be certain that their network would not grow past 255 nodes used class B addresses instead of Class C. The CIDR scheme was developed to allow a series of contiguous class C addresses to be used for a subnet requiring more than 255 addresses. This also allowed existing Class B addresses to be subdivided.
CIDR required redesign of the routing tables to avoid inefficiency, since a former class B network address might now represent many widely separated CIDR networks. The solution was to add a mask field to each routing table entry. The mask selects the portion of the IP address that identifies the network, as opposed to the portion that identifies the node.
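Applying a mask is a bitwise AND. The following sketch splits an address into its network and node parts; the helper names and the example mask 255.255.254.0 (two contiguous class C blocks treated as one subnet) are assumptions chosen for illustration.

```python
import socket
import struct

def to_int(dotted: str) -> int:
    """Convert dotted-decimal notation to a 32-bit integer."""
    return struct.unpack("!I", socket.inet_aton(dotted))[0]

def to_dotted(value: int) -> str:
    """Convert a 32-bit integer back to dotted-decimal notation."""
    return socket.inet_ntoa(struct.pack("!I", value))

def split_address(address: str, mask: str) -> tuple:
    """Return (network identifier, node identifier) for an address and mask."""
    addr, mask_bits = to_int(address), to_int(mask)
    network = addr & mask_bits            # bits selected by the mask
    node = addr & ~mask_bits & 0xFFFFFFFF  # the remaining bits
    return to_dotted(network), node
```

With the mask 255.255.254.0, the address 192.168.5.130 belongs to network 192.168.4.0, because the lowest bit of the third octet is part of the node identifier.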
Unregistered Addresses
Not all of the computers and devices that access the Internet need globally unique IP addresses. Computers that are attached to a local network and access the Internet through a router can use the router to redirect packets to the correct computer. For example, the instructor's home network is connected through a router to a cable modem to an Internet provider. The single globally unique IP address provided by the Internet service is the address of the cable modem, and is shared by the four computers on the home network.
Unregistered, Internet-enabled devices on the internal network are assigned addresses, usually by the Dynamic Host Configuration Protocol (DHCP). Normally, small networks are assigned addresses on the 192.168.1.x class C subnet, while larger networks use either the 10.z.y.x class A subnet or the 172.16.y.x class B subnet. NAT-enabled routers maintain an address translation table and use available source and destination port numbers to assign packets to local nodes.
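The translation table can be pictured as a pair of dictionaries keyed by port. This is a toy sketch, not how any particular router is implemented: the class NatTable, its method names, the starting port 40000, and the public address 203.0.113.7 (a documentation-only address) are all hypothetical.

```python
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)

class NatTable:
    """Toy address translation table for a single public IP address."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound: Dict[Endpoint, int] = {}  # local endpoint -> public port
        self.inbound: Dict[int, Endpoint] = {}   # public port -> local endpoint

    def translate_outbound(self, local: Endpoint) -> Endpoint:
        """Map a local source endpoint to the shared public address."""
        if local not in self.outbound:
            port = self.next_port
            self.next_port += 1
            self.outbound[local] = port
            self.inbound[port] = local
        return (self.public_ip, self.outbound[local])

    def translate_inbound(self, public_port: int) -> Endpoint:
        """Find the local node that a reply on this public port belongs to."""
        return self.inbound[public_port]
```

An outgoing packet from ("192.168.1.5", 5000) would leave with the public address and an allocated port; a reply arriving on that port is looked up and forwarded back to 192.168.1.5.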
[Figure: a home network behind a NAT router (Coulouris et al). Labelled devices: printer, PC 1, laptop, game box, media hub, camera. Addresses shown: 192.168.1.5, 192.168.1.101, 192.168.1.104, 192.168.1.105, 192.168.1.106.]
IPv6
In 1994, IPv6 was adopted as a more permanent solution to the shortage of IP addresses and migration to it over a period of time was recommended. IPv6 contains not only a much larger address space, but also provisions desired by large Internet service providers. Some of these are controversial, such as the ability to assign classes to packets, so a provider can give a higher quality of service to its own subscribers than to transient traffic on its network.
Larger address space.
Partitioned address space.
Reduced header complexity for faster routing.
Traffic class and flow label headers to identify traffic for special handling, such as a multimedia stream.
The IPv6 header format is shown on the next slide.
[Figure: IPv6 header format (Coulouris et al)]
Also known as session-based protocols, virtual circuits, or sequenced packet exchanges. Provide reliable two-way connection service over a session. Packets are given unique sequence numbers. Delivered packets are individually acknowledged. Duplicated packets are detected and discarded.
The first phase is the connection setup phase, during which the corresponding entities establish the connection and negotiate the parameters defining the connection. The second phase is the data transfer phase, during which the corresponding entities exchange messages under the auspices of the connection. Finally, the connection release phase is when the correspondents "tear down" the connection because it is no longer needed.
TCP/IP
TCP/IP is a family of protocols. TCP/IP is built on "connectionless" technology. Information is transferred as a sequence of "datagrams". Generally, TCP/IP applications use 4 layers:
An application protocol such as mail.
A protocol such as TCP that provides services needed by many applications.
IP, which provides the basic service of getting datagrams to their destination.
The protocols needed to manage a specific physical medium, such as Ethernet or a point to point line.
Reliable service has an overhead cost. You must create and manage the session. A lost session must be reestablished by one of the parties, a problem for fault tolerant servers that switch automatically to backup. Sessions are a two party affair, and not well suited to broadcasting.
Sockets extend these basic I/O functions:
open
close
read (see also recv and recvfrom)
write (see also send and sendto)
lseek
ioctl
They extended the conventional UNIX I/O facilities.
It became possible to use file descriptors for network communication.
Extended the read and write system calls so they work with the new network descriptors.
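To illustrate that a socket is reached through an ordinary descriptor, the sketch below writes to and reads from a connected pair of sockets using the generic os.write and os.read calls rather than send and recv. It assumes a Unix-like system; the function name roundtrip_via_descriptors is hypothetical.

```python
import os
import socket

def roundtrip_via_descriptors(message: bytes) -> bytes:
    """Send a short message through a socket pair using plain descriptor I/O."""
    left, right = socket.socketpair()  # two connected sockets
    try:
        # fileno() exposes the integer descriptor behind each socket.
        os.write(left.fileno(), message)
        return os.read(right.fileno(), len(message))
    finally:
        left.close()
        right.close()
```

The same two calls would work unchanged on a descriptor obtained from opening a file, which is the point of the extension.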
Descriptor Table
[Figure: descriptor table with entries 0, 1, 2, ... An entry for a file points to the internal data structure for that file. An entry for a socket points to a structure holding Family (PF_INET), Service (SOCK_STREAM), Local IP, Remote IP, Local Port, and Remote Port.]
Passive/Active Socket
A passive socket is used by a server to wait for an incoming connection. An active socket is used by a client to initiate a connection.
Sockets
When a socket is created it does not contain information about how it will be used. TCP/IP protocols define a communication endpoint to consist of an IP address and a protocol port number.
Sockets
[Figures A and B: socket call sequences between a server process and a client process, for TCP and UDP. Labels recoverable from the UNIX version: socket(), bind(), recvfrom(), sendto(), read(), write(), "get a blocked client", "process request". The Winsock or Unix version shows recv() and send() in place of read() and write().]
Connection-oriented
Reliability in delivery of messages.
Splitting messages into datagrams.
Keep track of order (or sequence).
Use checksums for detecting errors.
Connectionless
No attempt to fragment messages.
No reassembly and synchronization.
In case of error, message is retransmitted.
No acknowledgment.
Datagrams
Also known as connectionless or transmit and pray protocols. Simple, but unreliable. They are not tracked by sequence number or acknowledged. NetBIOS and some others provide broadcast capability. LAN Server and some others have acknowledged datagrams.
Datagrams
A datagram, often called a packet, is much more atomic in nature. A datagram is an independent, self-contained message sent over the network whose arrival, arrival time, and content are not guaranteed. In a stream, by contrast, all data sent over the channel is received in the same order in which it was sent; this is guaranteed by the channel. In modern data networking, it is important to distinguish between datagrams and streams.
Datagram Uses
Good for discovery: is someone listening?
Broadcast applications, a form of network junk mail.
Dynamic environments where the name of the recipient is unknown.
Quick messages ("The network is up") where it is not critical that the message be received.
Selecting UDP
Remote procedures are idempotent (repeating a request produces the same result).
Server and client messages fit completely within a packet.
The server handles multiple clients (UDP is stateless).
Selecting TCP
Procedures are not idempotent.
Reliability is a must.
Messages exceed UDP packet size.
IP (Raw) Socket
To use raw sockets in Unix, one must have root authority. To create a raw socket, write: s=socket(AF_INET,SOCK_RAW,[protocol]) Then you can send or receive over it. Raw sockets are used to generate or receive packets of a type that the kernel doesn't explicitly support.
IP Socket example
A familiar example is PING. Ping works by sending out an ICMP (internet control message protocol - another IP protocol distinct from TCP or UDP) echo packet. The kernel has built-in code to respond to echo/ping packets. It doesn't have code to generate these packets, because it isn't required. The "ping packet generator" is a program in user space. It formats an ICMP echo packet and sends it out over a SOCK_RAW, waiting for a response.
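The formatting step a ping program performs can be sketched without root privileges, since only the sending requires a raw socket. The code below builds an ICMP echo request (type 8, code 0) and fills in the standard Internet checksum. The function names and the default payload are assumptions for illustration; this is not the code of any actual ping implementation.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, as used by ICMP."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(identifier: int, sequence: int,
                       payload: bytes = b"ping") -> bytes:
    """Format an ICMP echo request packet ready to send on a raw socket."""
    # Header fields: type, code, checksum, identifier, sequence number.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence)
    return header + payload
```

A receiver can validate the packet by running the same checksum over the whole message; a correct packet yields zero. Sending it would use a socket created with SOCK_RAW and the ICMP protocol, which is where root authority comes in.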
Concurrent Connectionless
Summary
Algorithms for TCP and UDP Clients and Servers
Find IP address and protocol port number on server.
Allocate a socket.
Allow TCP to allocate an arbitrary local port.
Connect the socket to the server.
Send requests and receive replies.
Close the connection.
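The client steps above can be sketched as follows. The function name tcp_request, the 5 second timeout, and the convention of reading until the server closes the connection are assumptions; a real protocol would define its own message framing.

```python
import socket

def tcp_request(host: str, port: int, request: bytes) -> bytes:
    """Send one request over TCP and return everything the server replies."""
    server_ip = socket.gethostbyname(host)  # find the server's IP address
    # Allocate a socket; closing it at the end of the block ends the connection.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(5)
        # No bind() call: TCP picks an arbitrary local port during connect().
        sock.connect((server_ip, port))
        sock.sendall(request)
        sock.shutdown(socket.SHUT_WR)  # tell the server the request is complete
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # empty read: the server closed its side
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

Skipping bind() is what lets the protocol software choose the local port, as the third step describes.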
Create a socket and bind to the well known address for the service offered.
Place socket in passive mode.
Accept next connection request and obtain a new socket.
Repeatedly receive requests and send replies.
When client is done, close the connection and return to waiting for connection requests.
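A minimal sketch of these server steps, handling one client at a time, might look like this. The echo behaviour, the function names, and the max_clients limit (added only so the demonstration terminates; a real server would loop indefinitely) are assumptions.

```python
import socket

def make_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create a TCP socket, bind it, and place it in passive mode."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))  # port 0 lets the OS choose a free port
    listener.listen(5)           # passive mode: wait for connection requests
    return listener

def serve_iteratively(listener: socket.socket, max_clients: int) -> None:
    """Echo server that fully serves each client before accepting the next."""
    for _ in range(max_clients):
        conn, _addr = listener.accept()  # new socket for this connection
        with conn:
            while True:
                data = conn.recv(4096)
                if not data:  # client is done
                    break
                conn.sendall(data)
```

Note that accept() returns a new socket for each connection, while the original listening socket stays in passive mode.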
Master:
Create a socket and bind to the well known address for the service offered. Leave socket unconnected.
Place socket in passive mode.
Repeatedly call accept to get requests and create a new slave thread.
Slave:
Receive connection request and socket.
Receive requests and send responses to client.
Close connection and exit.
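The master and slave roles of a concurrent TCP server can be sketched with one thread per connection. The echo behaviour, the function names, and the max_connections limit (present only so the example terminates) are assumptions made for illustration.

```python
import socket
import threading

def slave(conn: socket.socket) -> None:
    """Serve one client: echo requests until the client closes, then exit."""
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)

def master(listener: socket.socket, max_connections: int) -> None:
    """Accept connections on a passive socket and hand each to a new thread."""
    workers = []
    for _ in range(max_connections):
        conn, _addr = listener.accept()
        worker = threading.Thread(target=slave, args=(conn,))
        worker.start()
        workers.append(worker)
    for worker in workers:
        worker.join()

def make_listener() -> socket.socket:
    """Bind to a free loopback port and enter passive mode."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    return listener
```

Because each connection has its own thread, a slow client does not delay others that connected after it.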
Find IP address and protocol port number on server.
Allocate a socket.
Allow UDP to allocate an arbitrary local port.
Specify the server.
Send requests and receive replies.
Close the socket.
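A sketch of the UDP client steps follows. Here "specify the server" is done with connect(), which on a datagram socket only records the default destination and sends nothing. The function name udp_request and the 2 second timeout are assumptions; since UDP gives no delivery guarantee, a lost reply surfaces as a timeout.

```python
import socket

def udp_request(host: str, port: int, request: bytes,
                timeout: float = 2.0) -> bytes:
    """Send one datagram to the server and return one reply datagram."""
    server = (socket.gethostbyname(host), port)  # find the server's address
    # Allocate a datagram socket; it is closed when the block ends.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)  # raises socket.timeout if no reply arrives
        # No bind(): UDP assigns an arbitrary local port on first use.
        sock.connect(server)      # specify the server; no packet is sent yet
        sock.send(request)
        return sock.recv(65535)
```

An alternative is to leave the socket unconnected and pass the server address to sendto() on every request.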
Create a socket and bind to the well known address for the service offered.
Repeatedly receive requests and send replies.
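The two server steps translate almost directly into code. In this sketch the reply is simply the request in upper case, and max_requests exists only so the example terminates; both, along with the function names, are assumptions.

```python
import socket

def make_udp_server(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create a datagram socket bound to the service address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))  # port 0 lets the OS choose a free port
    return sock

def serve_datagrams(sock: socket.socket, max_requests: int) -> None:
    """Receive each request and reply to whichever client sent it."""
    for _ in range(max_requests):
        request, client = sock.recvfrom(65535)  # client is (ip, port)
        sock.sendto(request.upper(), client)
```

There is no listen() or accept(): a single unconnected socket serves every client, and recvfrom() reports the sender's address for each datagram.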
Master:
Create a socket and bind to the well known address for the service offered. Leave socket unconnected.
Repeatedly call recvfrom to get requests and create a new slave thread.
Slave:
Receive request and access to socket.
Form reply and send to client with sendto.
Exit.
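For the concurrent connectionless case, the master reads each datagram and a short-lived slave thread forms and sends the reply on the shared socket. The reversed-bytes reply, the function names, and the max_requests limit are assumptions used to keep the sketch small and terminating.

```python
import socket
import threading

def slave(sock: socket.socket, request: bytes, client: tuple) -> None:
    """Form a reply for one request, send it with sendto, and exit."""
    reply = request[::-1]  # placeholder for real request processing
    sock.sendto(reply, client)

def master(sock: socket.socket, max_requests: int) -> None:
    """Receive requests on an unconnected socket, one slave thread each."""
    workers = []
    for _ in range(max_requests):
        request, client = sock.recvfrom(65535)
        worker = threading.Thread(target=slave, args=(sock, request, client))
        worker.start()
        workers.append(worker)
    for worker in workers:
        worker.join()

def make_udp_server() -> socket.socket:
    """Bind a datagram socket to a free loopback port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))
    return sock
```

Creating a thread per datagram only pays off when forming the reply is expensive compared with thread startup.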
Named Pipes
An Alternative to Sockets
Named Pipes
Named Pipes are a file system from which the kernel can assign data blocks or memory spaces. They provide a file-like programming API for session-based two way exchange of data. Exchange data as if reading from or writing to a file. Good for many-to-one server programs, with scheduling and synchronization. Part of the base interprocess communications service for Windows. Provided on Unix by LAN Manager/X.
References
Robert Orfali, Dan Harkey, Jeri Edwards, Client Server Survival Guide, Third Edition, Wiley, 1999.
Douglas E. Comer and David L. Stevens, Internetworking With TCP/IP, Volume III, Prentice Hall, multiple editions and dates.
George Coulouris, Jean Dollimore and Tim Kindberg, Distributed Systems: Concepts and Design, Fourth Edition, Addison Wesley, 2005.
Figures from the Coulouris text are from the instructor's guide and are copyrighted by Pearson Education, 2005.