Sockets

Introduction to Sockets and TCP/IP Protocols
Leon Jololian and George Blank
NJIT
Introduction to Sockets

A socket is one of the most fundamental technologies of computer networking. The socket is the BSD method for accomplishing interprocess communication (IPC): it allows one process to speak to another, much as the telephone allows one person to speak to another. Many of today's most popular software packages, including web browsers, instant messaging, and file sharing, rely on sockets.
The Socket Interface

The socket interface was funded by ARPA (Advanced Research Projects Agency) in 1980 and developed at UC Berkeley. Its objective was to transport TCP/IP software to UNIX. The socket interface has become a de facto standard.
History of Sockets

Sockets were introduced in 1981 as the Unix BSD 4.2 generic interface for Unix to Unix communications over networks. In 1985, SunOS introduced NFS and RPC over sockets. In 1986, AT&T introduced the Transport Layer Interface (TLI), which offered socket-like functionality with greater network independence.
TCP/IP Network Standard

The Windows socket API, Winsock, is a multivendor specification to standardize the use of TCP/IP under Windows. It is based on the Berkeley sockets interface. In BSD Unix, Sockets are part of the kernel and provide standalone and networked IPC services. MS-DOS, Windows, Mac OS, and OS/2 provide sockets in the form of libraries.
3 Types of Socket

Stream sockets interface to the TCP (transmission control protocol). Datagram sockets interface to the UDP (user datagram protocol). Raw sockets interface to the IP (Internet protocol).
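The three socket types map directly onto the second argument of the socket() call. The following is a minimal sketch in C; the helper names are illustrative and not from the slides, and IPPROTO_ICMP is an assumed protocol for the raw socket, which is expected to fail unless the program runs with root authority.

```c
#include <netinet/in.h>   /* IPPROTO_ICMP */
#include <sys/socket.h>   /* socket, SOCK_STREAM, SOCK_DGRAM, SOCK_RAW */
#include <unistd.h>       /* close */

/* Open one IPv4 socket of the requested type and return its descriptor,
 * or -1 on failure. Hypothetical helper, named for illustration only. */
int open_ipv4_socket(int type, int protocol)
{
    return socket(AF_INET, type, protocol);
}

/* Try each of the three socket types once and return how many of them
 * could be created by the calling process. */
int count_creatable_socket_types(void)
{
    int types[3]     = { SOCK_STREAM, SOCK_DGRAM, SOCK_RAW };
    int protocols[3] = { 0, 0, IPPROTO_ICMP };  /* 0 selects the default: TCP, UDP */
    int created = 0;
    int i;

    for (i = 0; i < 3; i++) {
        int fd = open_ipv4_socket(types[i], protocols[i]);
        if (fd >= 0) {          /* the raw socket normally fails without root */
            created++;
            close(fd);
        }
    }
    return created;
}
```

An unprivileged process would typically be able to create the stream and datagram sockets but not the raw one.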
TCP vs. UDP

TCP is used for services that move large amounts of data over a persistent connection, while UDP is more commonly used for quick lookups and single-use query-reply actions. Some common examples of TCP and UDP services with their default ports:

Service                               Protocol   Port
DNS lookup                            UDP        53
FTP                                   TCP        21
HTTP                                  TCP        80
POP3                                  TCP        110
Windows shared printer name lookup    UDP        137
Telnet                                TCP        23
IPv4 and IPv6

In 1978-1982, when the TCP/IP protocols were developed, provisions were made for 2^32 (about 4 billion) hosts. The address protocol, IPv4, has proven inadequate due to the unexpectedly rapid growth of the internet and inefficient use of the address space. IPv6 uses 16 byte (128 bit) addresses, allowing 2^128 (about 3.4 x 10^38) addressable entities. Divided evenly, this is roughly 7 x 10^23 IP addresses for each square meter of the surface of the earth, including the oceans.
Addresses and Headers

Try to avoid confusion between an IP address and an IP header. An IP header includes the addresses of both the source and destination nodes, along with other information, and has attached data; the port numbers of the two endpoints are carried in the TCP or UDP header that follows it. The address is just an identifier for a network location.
IPv4 addresses

The 32 bits of an IPv4 address are broken into 4 octets, or 8 bit fields. In decimal notation, an 8 bit number can be represented by the values 0 to 255. For networks of different size, the first one (for large networks) to three (for small networks) octets can be used to identify the network, while the rest of the octets can be used to identify the node on the network.
Class A, B, C, D, E Addresses

Using reserved values for the first octet, network addresses are broken into classes:
Class A: very large networks (up to 2^24 hosts)
Class B: large networks (up to 2^16 hosts)
Class C: small networks (up to 255 hosts)
Class D: multicast messages to multiple hosts
Class E: addresses not allocated and reserved
This addressing scheme is shown graphically on the following slides.
Figure 3.15 IP addresses (bits)

Class                 Leading bits   Remaining fields
Class A               0              Network ID (7 bits), Host ID (24 bits)
Class B               1 0            Network ID (14 bits), Host ID (16 bits)
Class C               1 1 0          Network ID (21 bits), Host ID (8 bits)
Class D (multicast)   1 1 1 0        Multicast address (28 bits)
Class E (reserved)    1 1 1 1 0      unused (27 bits)
Coulouris et al
Figure 3.16 IP addresses (decimal)

Class                 octet 1      octet 2    octet 3    octet 4    Range of addresses
Class A               1 to 127     0 to 255   0 to 255   0 to 255   1.0.0.0 to 127.255.255.255
Class B               128 to 191   0 to 255   0 to 255   0 to 255   128.0.0.0 to 191.255.255.255
Class C               192 to 223   0 to 255   0 to 255   1 to 254   192.0.0.0 to 223.255.255.255
Class D (multicast)   224 to 239   0 to 255   0 to 255   1 to 254   224.0.0.0 to 239.255.255.255
Class E (reserved)    240 to 255   0 to 255   0 to 255   1 to 254   240.0.0.0 to 255.255.255.255

Class A: octet 1 is the Network ID; octets 2 to 4 are the Host ID.
Class B: octets 1 and 2 are the Network ID; octets 3 and 4 are the Host ID.
Class C: octets 1 to 3 are the Network ID; octet 4 is the Host ID.
Class D: all four octets form the multicast address.
Coulouris et al
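The leading-bit patterns in the two figures above can be checked mechanically. The sketch below is one possible way to do it in C; ipv4_class is a hypothetical helper, not part of any library, and it relies on the standard inet_pton and ntohl calls to obtain the first octet.

```c
#include <arpa/inet.h>    /* inet_pton, ntohl */
#include <netinet/in.h>   /* struct in_addr */
#include <stdint.h>
#include <sys/socket.h>   /* AF_INET */

/* Return the class letter ('A' to 'E') of a dotted-decimal IPv4 address,
 * or '?' if the string is not a valid address. Hypothetical helper. */
char ipv4_class(const char *dotted)
{
    struct in_addr addr;
    uint32_t host_order;
    unsigned first_octet;

    if (inet_pton(AF_INET, dotted, &addr) != 1)
        return '?';

    host_order = ntohl(addr.s_addr);           /* first octet in the top byte */
    first_octet = (host_order >> 24) & 0xFF;

    if ((first_octet & 0x80) == 0x00) return 'A';   /* leading bit   0    */
    if ((first_octet & 0xC0) == 0x80) return 'B';   /* leading bits  10   */
    if ((first_octet & 0xE0) == 0xC0) return 'C';   /* leading bits  110  */
    if ((first_octet & 0xF0) == 0xE0) return 'D';   /* leading bits  1110 */
    return 'E';                                     /* leading bits  1111 */
}
```

For example, ipv4_class("192.168.1.5") yields 'C', because 192 falls in the range 192 to 223.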
Socket Address (IPv4)

A socket address on the TCP/IP internet consists of two parts:

An internet (IP) address, a 32 bit number usually represented by 4 decimal numbers separated by dots. It is a unique identifier for a network interface card within an administered AF_INET domain. A TCP/IP host may have as many addresses as it has network interfaces. (The newer IPv6 addresses are 128 bits long and are written as eight groups of hexadecimal digits separated by colons.)

A 16 bit port number, which is an entry point to an application that resides on a host. Ports define entry points for services provided by server applications. Important commercial applications such as Oracle have their own well known ports.
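In the sockets API, the two parts of an IPv4 socket address are stored together in a struct sockaddr_in. A minimal sketch, assuming a hypothetical helper named make_ipv4_endpoint; both fields are kept in network byte order, which is why htons and inet_pton are used.

```c
#include <arpa/inet.h>    /* inet_pton, htons */
#include <netinet/in.h>   /* struct sockaddr_in */
#include <stdint.h>
#include <string.h>       /* memset */
#include <sys/socket.h>   /* AF_INET */

/* Fill in an IPv4 socket address from a dotted-decimal string and a
 * 16 bit port number. Returns 0 on success, -1 if the address string
 * is not valid. Hypothetical helper, for illustration. */
int make_ipv4_endpoint(const char *dotted, uint16_t port,
                       struct sockaddr_in *out)
{
    memset(out, 0, sizeof *out);
    out->sin_family = AF_INET;
    out->sin_port = htons(port);                 /* 16 bit port number */
    if (inet_pton(AF_INET, dotted, &out->sin_addr) != 1)
        return -1;                               /* not a dotted-decimal address */
    return 0;                                    /* 32 bit address stored */
}
```

A structure filled in this way can be passed (cast to struct sockaddr *) to calls such as bind, connect, or sendto.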

IP Protocol Approach

Define functions that support network communications in general, and use parameters to make TCP/IP communication a special case. Socket calls refer to all TCP/IP protocols as a single protocol family.
IP Protocol

The IP protocol transmits datagrams from one host to another with unreliable or best-effort semantics. Delivery is not guaranteed. The IP layer puts datagrams into packets suitable for transmission in the underlying network, such as Ethernet. It must also inform the underlying network of the address of the message destination using address resolution.
Address Resolution

The address resolution module must convert an internet address so that it can be understood by the underlying network. For example, the 32 bit IPv4 address has to be converted to a 48 bit Ethernet address on an Ethernet network. This process is specific for each network, and network addressing schemes do not correlate directly to one another. Typically, known address resolutions will be cached, while new addresses are found by querying each node on the network.
Classless Interdomain Routing (CIDR)

In 1996, due largely to the allocation of class B network addresses to small networks, the Internet began to run out of addresses. Network administrators who could not be certain that their network would not grow past 255 nodes used class B addresses instead of Class C. The CIDR scheme was developed to allow a series of contiguous class C addresses to be used for a subnet requiring more than 255 addresses. This also allowed existing Class B addresses to be subdivided.
CIDR Routing Tables

CIDR required redesign of the routing tables to avoid inefficiency, since a former class B network address might now represent many widely separated CIDR networks. The solution was to add a mask field to each routing table entry. The mask selects the portion of the IP address that holds the network identifier, as opposed to the node identifier.
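The role of the mask can be sketched with a few bit operations. The code below is an illustration under stated assumptions: addresses are held as 32 bit integers in host byte order, and the route_entry layout and all function names are hypothetical rather than taken from any real routing implementation.

```c
#include <stdint.h>

/* Build a mask with prefix_len leading one bits (0 to 32). */
uint32_t cidr_mask(unsigned prefix_len)
{
    if (prefix_len == 0)
        return 0;                       /* a shift by 32 would be undefined */
    if (prefix_len >= 32)
        return 0xFFFFFFFFu;
    return (uint32_t)(0xFFFFFFFFu << (32 - prefix_len));
}

/* The masked-in bits identify the network ... */
uint32_t network_id(uint32_t addr, unsigned prefix_len)
{
    return addr & cidr_mask(prefix_len);
}

/* ... and the remaining bits identify the node on that network. */
uint32_t node_id(uint32_t addr, unsigned prefix_len)
{
    return addr & ~cidr_mask(prefix_len);
}

/* A routing table entry carrying the added mask field (assumed layout). */
struct route_entry {
    uint32_t network;    /* network identifier */
    uint32_t mask;       /* selects the network portion of an address */
    int      next_hop;   /* placeholder for the forwarding decision */
};

/* Does the destination address fall inside this entry's network? */
int route_matches(const struct route_entry *entry, uint32_t dest)
{
    return (dest & entry->mask) == entry->network;
}
```

With a 22 bit prefix, for instance, the address 192.168.5.77 (0xC0A8054D) has network identifier 192.168.4.0 (0xC0A80400), which spans four contiguous former class C networks.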
Unregistered Addresses

Not all of the computers and devices that access the Internet need globally unique IP addresses. Computers that are attached to a local network and access the Internet through a router can use the router to redirect packets to the correct computer. For example, the instructor's home network is connected through a router to a cable modem to an Internet provider. The single globally unique IP address provided by the Internet service is the address of the cable modem, and is shared by the four computers on the home network.
Network Address Translation

Unregistered internal Internet-enabled devices are assigned addresses, usually by the Dynamic Host Configuration Protocol (DHCP). Normally, small networks are assigned addresses on the 192.168.1.x class C subnet, while larger networks use either the 10.z.y.x class A subnet or the 172.16.y.x class B subnet. NAT-enabled routers maintain an address translation table and use available source and destination port numbers to assign packets to local nodes.
Figure 3.18 A Home Network

[Figure: a DSL or cable connection to the ISP (address 83.215.152.95) feeds a NAT-enabled modem / firewall / router at 192.168.1.1. Behind it, an Ethernet switch and a WiFi base station / access point connect the devices of the 192.168.1.xx subnet: a printer, PC 1, PC 2, a laptop, a game box, a media hub, a TV monitor, a camera, a Bluetooth adapter, and a Bluetooth printer. Addresses shown on the subnet: 192.168.1.2, 192.168.1.5, 192.168.1.10, 192.168.1.101, 192.168.1.104, 192.168.1.105, 192.168.1.106.]
Coulouris et al
IPv6

In 1994, IPv6 was adopted as a more permanent solution to the shortage of IP addresses and migration to it over a period of time was recommended. IPv6 contains not only a much larger address space, but also provisions desired by large Internet service providers. Some of these are controversial, such as the ability to assign classes to packets, so a provider can give a higher quality of service to its own subscribers than to transient traffic on its network.
New IPv6 Provisions

Larger address space
Partitioned address space
Reduced header complexity for faster routing
Traffic class and flow label headers to identify traffic for special handling, such as a multimedia stream

The IPv6 header format is shown on the next slide.
Figure 3.19 IPv6 Header

Field                  Size
Version                4 bits
Traffic class          8 bits
Flow label             20 bits
Payload length         16 bits
Next header            8 bits
Hop limit              8 bits
Source address         128 bits
Destination address    128 bits
Coulouris et al
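The header fields listed in Figure 3.19 add up to 40 bytes. The sketch below shows one way to lay them out in C and to pack the three fields that share the first 32 bit word; the struct and function names are illustrative assumptions, and the packed word is in host byte order, so it would still need htonl before being placed on the wire.

```c
#include <stdint.h>

/* Fixed IPv6 header, field sizes as in Figure 3.19 (illustrative layout). */
struct ipv6_header {
    uint32_t version_class_flow;  /* version (4) | traffic class (8) | flow label (20) */
    uint16_t payload_length;      /* 16 bits */
    uint8_t  next_header;         /* 8 bits */
    uint8_t  hop_limit;           /* 8 bits */
    uint8_t  source[16];          /* 128 bits */
    uint8_t  destination[16];     /* 128 bits */
};

/* Pack version, traffic class, and flow label into one 32 bit word. */
uint32_t pack_first_word(unsigned version, unsigned traffic_class,
                         uint32_t flow_label)
{
    return ((uint32_t)(version & 0xFu) << 28) |
           ((uint32_t)(traffic_class & 0xFFu) << 20) |
           (flow_label & 0xFFFFFu);
}

/* Extract the individual fields again. */
unsigned unpack_version(uint32_t word)       { return word >> 28; }
unsigned unpack_traffic_class(uint32_t word) { return (word >> 20) & 0xFFu; }
uint32_t unpack_flow_label(uint32_t word)    { return word & 0xFFFFFu; }
```

A router can read the traffic class and flow label from this first word without parsing the rest of the packet, which is part of what makes the special handling mentioned above cheap.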
Connection Oriented Protocols

Also known as session-based protocols, virtual circuits, or sequenced packet exchanges.
Provide reliable two-way connection service over a session.
Packets are given unique sequence numbers.
Delivered packets are individually acknowledged.
Duplicated packets are detected and discarded.
Connection Oriented Protocols

Connection-oriented protocols operate in three phases.

The first phase is the connection setup phase, during which the corresponding entities establish the connection and negotiate the parameters defining the connection. The second phase is the data transfer phase, during which the corresponding entities exchange messages under the auspices of the connection. Finally, the connection release phase is when the correspondents "tear down" the connection because it is no longer needed.
TCP/IP

TCP/IP is a family of protocols. TCP/IP is built on "connectionless" technology. Information is transferred as a sequence of "datagrams". Generally, TCP/IP applications use 4 layers:

An application protocol such as mail.
A protocol such as TCP that provides services needed by many applications.
IP, which provides the basic service of getting datagrams to their destination.
The protocols needed to manage a specific physical medium, such as Ethernet or a point to point line.
Cost of Session Oriented

Reliable service has an overhead cost. You must create and manage the session. A lost session must be reestablished by one of the parties, a problem for fault tolerant servers that switch automatically to backup. Sessions are a two party affair, and not well suited to broadcasting.
Basic I/O Functions in UNIX

Sockets extend these basic I/O functions:
open
close
read (see also recv and recvfrom)
write (see also send and sendto)
lseek
ioctl
Using I/O in UNIX

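    #include <fcntl.h>     /* open, O_RDWR */
    #include <unistd.h>    /* read, close */

    int desc;
    ...
    desc = open(file, O_RDWR, 0);   /* obtain a descriptor for the file */
    read(desc, buffer, 128);        /* read up to 128 bytes through it */
    close(desc);                    /* release the descriptor */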
Using UNIX I/O with TCP/IP

The designers of the socket interface extended the conventional UNIX I/O facilities. It became possible to use file descriptors for network communication. The read and write system calls were extended so that they work with the new network descriptors.
Descriptor Table
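[Figure: a descriptor table with entries 0, 1, 2, ...; entry 0 points to the internal data structure for file 0.]

[Figure: the same descriptor table with an entry that points to the internal data structure for a socket, holding the fields Family: PF_INET, Service: SOCK_STREAM, Local IP, Remote IP, Local Port, Remote Port, ...]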
Passive/Active Socket

A passive socket is used by a server to wait for an incoming connection. An active socket is used by a client to initiate a connection.
Sockets

When a socket is created it does not contain information about how it will be used. TCP/IP protocols define a communication endpoint to consist of an IP address and a protocol port number.
Sockets
Figure A
Figure B
UNIX version

TCP
Server Process: socket(), bind(), listen(), accept() (blocks until a client connects), read(), process request, write()
Client Process: socket(), connect(), write(), read()

UDP
Server Process: socket(), bind(), recvfrom() (blocks until a client sends a request), process request, sendto()
Client Process: socket(), bind(), sendto(), recvfrom()
Winsock or Unix version

TCP
Server Process: socket(), bind(), listen(), accept() (blocks until a client connects), recv(), process request, send()
Client Process: socket(), connect(), send(), recv()

UDP
Server Process: socket(), bind(), recvfrom() (blocks until a client sends a request), process request, sendto()
Client Process: socket(), bind(), sendto(), recvfrom()
TCP vs. UDP

TCP (Transmission Control Protocol)

Connection-oriented
Reliability in delivery of messages
Splits messages into datagrams and keeps track of their order (or sequence)
Uses checksums for detecting errors
TCP vs. UDP (Contd)

UDP (User Datagram Protocol)

Connectionless
No attempt to fragment messages
No reassembly and synchronization
In case of error, the message must be retransmitted by the application
No acknowledgment
Datagrams

Also known as connectionless or transmit and pray protocols. Simple, but unreliable. They are not tracked by sequence number or acknowledged. NetBIOS and some others provide broadcast capability. LAN Server and some others have acknowledged datagrams.
Datagrams

A datagram, often called a packet, is much more atomic in nature. A datagram is an independent, self-contained message sent over the network whose arrival, arrival time, and content are not guaranteed. In a stream, by contrast, all data sent over the channel is received in the same order in which it was sent; this is guaranteed by the channel. In modern data networking, it is important to distinguish between datagrams and streams.
Datagram Uses

Good for discovery: is someone listening?
Broadcast applications, a form of network junk mail.
Dynamic environments where the name of the recipient is unknown.
Quick messages ("The network is up") where it is not critical that the message be received.
Selecting UDP

Remote procedures are idempotent*
Server and client messages fit completely within a packet
The server handles multiple clients (UDP is stateless)

*a mathematical operation that always produces the same result, however many times it is applied
Selecting TCP

Procedures are not idempotent
Reliability is a must
Messages exceed UDP packet size
IP (Raw) Socket

To use raw sockets in Unix, it is mandatory that one have root authority. To create a raw socket, write:

    s=socket(AF_INET,SOCK_RAW,[protocol])

Then you can send or receive over it. Raw sockets are used to generate and receive packets of a type that the kernel doesn't explicitly support.
IP Socket example

A familiar example is ping. Ping works by sending out an ICMP (Internet Control Message Protocol, another IP protocol distinct from TCP or UDP) echo packet. The kernel has built-in code to respond to echo/ping packets. It doesn't have code to generate these packets, because it isn't required. The "ping packet generator" is a program in user space. It formats an ICMP echo packet, sends it out over a SOCK_RAW socket, and waits for a response.
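Formatting the ICMP echo packet is the part of ping that can be shown without root authority. The sketch below builds the 8 byte echo request header and fills in its checksum using the Internet checksum of RFC 1071; the function names are hypothetical, the payload is omitted, and sending the result over a SOCK_RAW socket is left out because it requires privileges.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071): one's complement of the one's-complement
 * sum of the data taken as big-endian 16 bit words. */
uint16_t internet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len % 2 == 1)
        sum += (uint32_t)data[len - 1] << 8;     /* pad the odd byte with zero */
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);     /* fold carries back in */
    return (uint16_t)~sum;
}

/* Fill in an 8 byte ICMP echo request header (no payload). */
void build_echo_request(uint8_t packet[8], uint16_t id, uint16_t seq)
{
    uint16_t sum;

    packet[0] = 8;                       /* type 8: echo request */
    packet[1] = 0;                       /* code 0 */
    packet[2] = 0;                       /* checksum field is zero ... */
    packet[3] = 0;                       /* ... while the sum is computed */
    packet[4] = (uint8_t)(id >> 8);
    packet[5] = (uint8_t)(id & 0xFF);
    packet[6] = (uint8_t)(seq >> 8);
    packet[7] = (uint8_t)(seq & 0xFF);

    sum = internet_checksum(packet, 8);
    packet[2] = (uint8_t)(sum >> 8);
    packet[3] = (uint8_t)(sum & 0xFF);
}
```

A receiver can validate the packet by running internet_checksum over the whole header again: a correct packet yields zero.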
OSI Layers vs. TCP/IP

OSI layers                   TCP/IP
5-7. Session                 User Application
4. Transport                 TCP, UDP
3. Network                   IP
1-2. Data Link/ Physical     Hardware Interface, Network
Four Types of Servers

Iterative Connectionless
Iterative Connection-Oriented
Concurrent Connectionless
Concurrent Connection-Oriented
Summary
Algorithms for TCP and UDP Clients and Servers
TCP Client Algorithm

Comer and Stevens, Algorithm 6.1

1. Find IP address and protocol port number on server
2. Allocate a socket
3. Allow TCP to allocate an arbitrary local port
4. Connect the socket to the server
5. Send requests and receive replies
6. Close the connection
TCP Iterative Server Algorithm


1. Create a socket and bind to the well known address for the service offered
2. Place socket in passive mode
3. Accept next connection request and obtain a new socket
4. Repeatedly receive requests and send replies
5. When client is done, close the connection and return to waiting for connection requests
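The TCP client steps and the iterative server steps can be exercised together in a single process over the loopback interface. The following is only a sketch under stated assumptions: the function names are hypothetical, the server binds to port 0 so that the kernel picks a free port instead of a well known one, only one request is exchanged, and the request is assumed to be short enough to arrive in a single read.

```c
#include <arpa/inet.h>    /* htonl, htons, ntohs */
#include <netinet/in.h>   /* struct sockaddr_in, INADDR_LOOPBACK */
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill in the loopback address 127.0.0.1 with the given port. */
static void loopback_address(struct sockaddr_in *addr, unsigned short port)
{
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = htons(port);
}

/* Server: create a socket, bind it, and place it in passive mode.
 * Returns the listening descriptor (or -1) and stores the chosen port. */
int tcp_listen_loopback(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    loopback_address(&addr, 0);            /* port 0: the kernel chooses */
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 5) < 0 ||               /* passive mode */
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Client: allocate a socket and connect it to the server. TCP allocates
 * an arbitrary local port because the client never calls bind. */
int tcp_connect_loopback(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    loopback_address(&addr, port);
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* One request/reply exchange. Returns the reply length or -1. */
int tcp_echo_once(const char *request, char *reply, size_t reply_size)
{
    unsigned short port;
    char buf[256];
    ssize_t n = -1;
    int listener, client, conn;

    if (reply_size == 0 || (listener = tcp_listen_loopback(&port)) < 0)
        return -1;
    client = tcp_connect_loopback(port);   /* waits in the listen backlog */
    if (client < 0) {
        close(listener);
        return -1;
    }
    conn = accept(listener, NULL, NULL);   /* server obtains a new socket */
    if (conn >= 0 &&
        write(client, request, strlen(request)) > 0 &&  /* client sends request */
        (n = read(conn, buf, sizeof buf)) > 0 &&        /* server receives it */
        write(conn, buf, (size_t)n) == n)               /* server sends reply */
        n = read(client, reply, reply_size - 1);        /* client receives reply */
    else
        n = -1;
    if (n >= 0)
        reply[n] = '\0';
    if (conn >= 0)
        close(conn);                       /* close the connection */
    close(client);
    close(listener);
    return (int)n;
}
```

A real iterative server would wrap the accept call in a loop and keep serving requests on each connection until the client closes it; the single pass here keeps the example self-contained.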
TCP Concurrent Server Algorithm


Master:

1. Create a socket and bind to the well known address for the service offered. Leave socket unconnected
2. Place socket in passive mode
3. Repeatedly call accept to get requests and create a new slave thread

Slave:

1. Receive connection request and socket
2. Receive requests and send responses to client
3. Close connection and exit
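The master/slave split can be sketched as follows. Note the substitution: this example creates a slave process with fork rather than a slave thread, which is a common UNIX variant of the same algorithm. All names are hypothetical, the port is chosen by the kernel rather than being well known, and the driver function runs a single client in the same process purely so that the example is self-contained.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Slave: receive requests and send responses (an echo here) until the
 * client closes, then close the connection and exit. */
static void slave(int conn)
{
    char buf[256];
    ssize_t n;

    while ((n = read(conn, buf, sizeof buf)) > 0)
        if (write(conn, buf, (size_t)n) != n)
            break;
    close(conn);
    _exit(0);
}

/* Master: accept one connection and hand it to a new slave process.
 * Returns the slave's process id, or -1 on failure. */
pid_t master_accept_one(int listener)
{
    int conn = accept(listener, NULL, NULL);
    pid_t pid;

    if (conn < 0)
        return -1;
    pid = fork();
    if (pid == 0) {
        close(listener);        /* the slave does not accept connections */
        slave(conn);            /* never returns */
    }
    close(conn);                /* the master keeps only the listening socket */
    return pid;
}

/* Driver: set up the master socket, connect one client, and exchange
 * one message with the slave. Returns the reply length or -1. */
int concurrent_echo_demo(const char *request, char *reply, size_t reply_size)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    ssize_t n = -1;
    pid_t pid = -1;
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    int client = socket(AF_INET, SOCK_STREAM, 0);

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);   /* port 0: kernel chooses */
    if (listener >= 0 && client >= 0 && reply_size > 0 &&
        bind(listener, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        listen(listener, 5) == 0 &&                  /* passive mode */
        getsockname(listener, (struct sockaddr *)&addr, &len) == 0 &&
        connect(client, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        (pid = master_accept_one(listener)) > 0 &&
        write(client, request, strlen(request)) > 0) {
        n = read(client, reply, reply_size - 1);
        if (n >= 0)
            reply[n] = '\0';
    }
    if (client >= 0) {
        /* shutdown ends the stream even though the slave inherited a
         * copy of this descriptor when the process was forked */
        shutdown(client, SHUT_RDWR);
        close(client);
    }
    if (pid > 0)
        waitpid(pid, NULL, 0);  /* reap the slave */
    if (listener >= 0)
        close(listener);
    return (int)n;
}
```

In a real server the master would call master_accept_one in a loop and reap finished slaves as they exit, so that many clients can be served at the same time.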
UDP Client Algorithm


1. Find IP address and protocol port number on server
2. Allocate a socket
3. Allow UDP to allocate an arbitrary local port
4. Specify the server
5. Send requests and receive replies
6. Close the socket
UDP Iterative Server Algorithm


1. Create a socket and bind to the well known address for the service offered
2. Repeatedly receive requests and send replies
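The UDP client and iterative server steps can likewise be sketched in one process over loopback. This is an illustration under assumptions: udp_echo_once is a hypothetical name, the server port is chosen by the kernel rather than being well known, and a two second receive timeout (SO_RCVTIMEO) stands in for the error handling an application needs because UDP itself gives no delivery guarantee.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>     /* struct timeval */
#include <unistd.h>

/* One datagram request and one datagram reply between a UDP client and
 * a UDP server on 127.0.0.1. Returns the reply length or -1. */
int udp_echo_once(const char *request, char *reply, size_t reply_size)
{
    struct sockaddr_in server_addr, peer;
    socklen_t len = sizeof server_addr, peer_len = sizeof peer;
    struct timeval timeout = { 2, 0 };     /* give up after 2 seconds */
    char buf[256];
    ssize_t n = -1;
    int server = socket(AF_INET, SOCK_DGRAM, 0);
    int client = socket(AF_INET, SOCK_DGRAM, 0);

    if (server < 0 || client < 0 || reply_size == 0)
        goto done;

    /* Server: bind to an address; no listen or accept is needed. */
    memset(&server_addr, 0, sizeof server_addr);
    server_addr.sin_family = AF_INET;
    server_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    server_addr.sin_port = htons(0);       /* port 0: the kernel chooses */
    if (bind(server, (struct sockaddr *)&server_addr, sizeof server_addr) < 0 ||
        getsockname(server, (struct sockaddr *)&server_addr, &len) < 0)
        goto done;

    /* Datagrams may be lost, so neither side waits forever. */
    setsockopt(server, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof timeout);
    setsockopt(client, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof timeout);

    /* Client: specify the server in sendto; UDP allocates an arbitrary
     * local port because the client never calls bind. */
    if (sendto(client, request, strlen(request), 0,
               (struct sockaddr *)&server_addr, sizeof server_addr) < 0)
        goto done;

    /* Server: receive the request together with the sender's address,
     * then send the reply back to that address. */
    n = recvfrom(server, buf, sizeof buf, 0,
                 (struct sockaddr *)&peer, &peer_len);
    if (n < 0 || sendto(server, buf, (size_t)n, 0,
                        (struct sockaddr *)&peer, peer_len) < 0) {
        n = -1;
        goto done;
    }

    /* Client: receive the reply. */
    n = recvfrom(client, reply, reply_size - 1, 0, NULL, NULL);
    if (n >= 0)
        reply[n] = '\0';

done:
    if (server >= 0)
        close(server);
    if (client >= 0)
        close(client);                     /* close the socket */
    return (int)n;
}
```

Compared with the TCP sequence there is no connection setup: the server learns who the client is only from the address that recvfrom returns with each datagram.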
UDP Concurrent Server Algorithm


Master:

1. Create a socket and bind to the well known address for the service offered. Leave socket unconnected
2. Repeatedly call recvfrom to get requests and create a new slave thread

Slave:

1. Receive request and access to socket
2. Form reply and send to client with sendto
3. Exit
Named Pipes
An Alternative to Sockets
Named Pipes

Named Pipes are a file system from which the kernel can assign data blocks or memory spaces.
They provide a file-like programming API for session-based two way exchange of data.
Exchange data as if reading from or writing to a file.
Good for many-to-one server programs, with scheduling and synchronization.
Part of the base interprocess communications service for Windows.
Provided on Unix by LAN Manager/X.
Code fragment: Named Pipes

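    #include <fcntl.h>      /* open, O_WRONLY, O_RDONLY */
    #include <sys/stat.h>   /* mknod, S_IFIFO */
    #include <unistd.h>     /* read, write */

    char string[] = "Hello";

    int main(int argc, char *argv[])
    {
        int fd;
        char buf[256];

        mknod("fifo", S_IFIFO | 0777, 0);   /* Create Pipe RW=all (octal 010777) */
        if (argc == 2)
            fd = open("fifo", O_WRONLY);    /* one argument given: writer */
        else
            fd = open("fifo", O_RDONLY);    /* no argument: reader */
        for (;;)
            if (argc == 2)
                write(fd, string, 6);       /* "Hello" plus its terminating NUL */
            else
                read(fd, buf, 6);
    }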
References

Robert Orfali, Dan Harkey, Jeri Edwards, Client Server Survival Guide, Third Edition, Wiley, 1999.
Douglas E. Comer and David L. Stevens, Internetworking With TCP/IP, Volume III, Prentice Hall, multiple editions and dates.
George Coulouris, Jean Dollimore and Tim Kindberg, Distributed Systems, Concepts and Design, Fourth Edition, Addison Wesley, 2005.
Figures from the Coulouris text are from the instructor's guide and are copyrighted by Pearson Education 2005.