CN HR Unit-4


Unit – IV

Transport Layer
Syllabus
Transport Layer: Transport Services, Elements of Transport protocols,
Connection management, TCP and UDP protocols.

Compiled by: Mr.Harish Reddy.G, Associate Professor, CSE Dept’, VITS, Hyderabad.
Transport layer
 The Transport layer is responsible for process to process delivery
of the entire message.
• End-to-end delivery
• Segmentation and Re-assembly
• Connection control
• Flow control
• Error control
• Multiplexing and Demultiplexing.

Segmentation and Reassembly
The Transport Service
Services Provided to the upper layers :
1. The main goal of the Transport layer is to provide efficient, reliable, and cost-
effective service to its users, normally processes in the application layer.
To achieve this goal, it makes use of the services provided by the network layer.
2. The hardware/software within the transport layer that does this work is called
the transport entity. It may be located in the O/S kernel, in a user process, in a
library package, or on the NIC.

The (logical) relationship of the network, transport, and application layers is
shown in the following figure.
3. There are 2 types of transport services (similar to network layer services):
- Connection oriented transport service.
- Connectionless transport service.

Reasons for having both layers, when both offer similar services:
1. The transport layer code runs on the user’s machines, whereas the network
layer code runs on routers.

2. What happens if the network layer offers inadequate service? Suppose it
frequently loses packets, or routers crash from time to time?

Users have no control over the network layer, so they cannot solve the problem
of poor service by using better routers or putting more error-handling
techniques in the data link layer.
The only possibility is to put another layer above the network layer that
improves the quality of service.
Transport Service Primitives :
To allow the users to access the transport service, the transport layer provides
several operations to application programs, i.e., a transport service interface.
Each transport service has its own interface.

To get an idea of a transport service, let us consider the primitives listed in the
following figure :

The primitives for a simple transport service.
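These conceptual primitives map naturally onto the Berkeley sockets API. The
following minimal Python sketch (an illustration, not part of the original notes;
the port number 6000 is an arbitrary choice) shows the correspondence:

    # Minimal sketch: the five conceptual primitives expressed with
    # Berkeley sockets in Python. Run server() and client() in two processes.
    import socket

    def server():
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.bind(("", 6000))
        s.listen(1)                     # LISTEN: wait for a connection
        conn, addr = s.accept()
        data = conn.recv(1024)          # RECEIVE: block until data arrives
        conn.sendall(b"got " + data)    # SEND
        conn.close()                    # DISCONNECT

    def client():
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.connect(("127.0.0.1", 6000))  # CONNECT: actively set up a connection
        s.sendall(b"hello")             # SEND
        print(s.recv(1024))             # RECEIVE
        s.close()                       # DISCONNECT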


TPDUs (Transport Protocol Data Units) are the messages sent from one
transport entity to another. Thus, TPDUs (exchanged by the transport layer) are
contained in packets (exchanged by the network layer), which in turn are
contained in frames (exchanged by the data link layer).
When a frame arrives, the data link layer processes the frame header
and passes the contents of the frame payload field up to the network entity. The
network entity processes the packet header and passes the contents of the packet
payload up to the transport entity. This nesting is shown in the following figure:
State diagram for connection establishment & release using simple primitives:

Transitions labeled in italics are caused by packet arrivals. The solid lines
show the client's state sequence. The dashed lines show the server's state
sequence.
Elements of Transport Protocols
 Addressing
 Connection Establishment
 Connection Release
 Flow Control and Buffering
 Multiplexing
 Crash Recovery
The transport service is implemented by a transport protocol used between the 2
transport entities. These transport protocols resemble the DLL protocols. Both
have to deal with error control, sequencing and flow control etc.
However, significant differences also exist, owing to dissimilarities between the
environments in which the two protocols operate.

•At DLL, two routers communicate directly via a physical channel , whereas at
the transport layer this physical channel is replaced by the entire subnet.
•In DLL, it is not necessary for a router to specify which router it wants to talk to
– each outgoing line specifies a particular router, whereas in transport layer,
explicit addressing of destinations is required.
•Establishing a connection over a wire (fig(a)) is simple compared to
establishing one over the subnet.
•Buffering and Flow control are needed in both layers, but the transport layer
takes a different approach than that of data link layer.
Addressing
When an application/user process wishes to set up a connection to a remote
application process, it must specify which one(port address) to connect to.
In the network layer, we use IP addresses; such an address is called an NSAP
(Network Service Access Point).
In the transport layer, the end points are called ports; the corresponding term is
TSAP (Transport Service Access Point).
Initial connection protocol:
Instead of every server listening at a well-known TSAP, each machine that wishes
to offer services to remote users has a special process server that acts as a proxy.
It listens to a set of ports at the same time, waiting for a connection request.
Potential users of a service begin by doing a CONNECT request, specifying the
service(TSAP address of the service) they want.
If no server is waiting for them , they get a connection to the process server, as
shown in the fig(a).
The initial connection protocol works fine for servers that can be created when
required. But there are situations in which services exist independently of the
process server (e.g., a file server needs to run on special hardware and cannot
simply be created on demand). To handle this, an alternative scheme is used. In
this model, there exists a special process called a name server or directory
server.
How a user process in host 1 establishes a connection with a time-of-day server in host 2.
Name server scheme:
To find the TSAP address corresponding to a given service name, such as time-of-day:
-A user sets up a connection to the name server.
-The user then sends a message specifying the service name.
-The name server sends back the TSAP address.
-The user releases the connection with the name server and establishes a new one
with the desired service.
-When a new service is created, it must register itself with the name server,
giving both its service name and its TSAP.
-The name server records this information in its internal database so that when
queries come in later, it can answer them.
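The scheme can be sketched as a toy registry (illustrative only; the service name
"time-of-day" matches the figure, while the host name and port are made up):

    # Toy sketch of the name-server scheme: services register (name, TSAP)
    # pairs; a user queries the name server, releases that connection, and
    # then connects to the TSAP it was given.
    name_server_db = {}                      # the name server's internal database

    def register(service_name, tsap):
        """A newly created service registers its name and TSAP."""
        name_server_db[service_name] = tsap

    def lookup(service_name):
        """A user asks the name server for a service's TSAP."""
        return name_server_db.get(service_name)   # None if unknown

    register("time-of-day", ("host2", 302))        # hypothetical TSAP
    print("connect to", lookup("time-of-day"))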
Connection Establishment
For establishing a connection, the source will send a CONNECTION REQUEST
TPDU to the destination. Then the destination sends a CONNECTION ACCEPTED
reply. Then they’ll start the connection.
While transferring data, problems can arise: (i) packets may be lost in the
network, (ii) duplicate packets may be stored in the subnet, and (iii) there may
be heavy congestion in the subnet.

Delayed Duplicates: While packets are being sent, duplicates may be generated
and stored somewhere in the subnet; they can suddenly arrive at the destination
later, causing problems. Such packets are called delayed duplicates.

Consider a worst-case possibility:


A user establishes a connection with a bank, sends messages telling the bank to
transfer a large amount of money to the account of a person, and then releases the
connection.
Unfortunately, each packet in the scenario is duplicated and stored in the subnet.
After the connection has been released, all the packets pop out of the subnet and
arrive at the destination in order, asking the bank to establish a new connection,
transfer the money (again), and release the connection.
The bank has no way of telling that this is a duplicate; it assumes that it is a
second, independent transaction and transfers the money again.
Thus, the problem is the existence of the delayed duplicates.

1. Connection identifier: To avoid this, each connection is given a connection
identifier (i.e., a sequence number incremented for each connection established).
After each connection is released, each transport entity updates a table listing
obsolete connections.
This method has a problem: it requires each transport entity to maintain a certain
amount of history information. If a machine crashes and loses its memory, it will
no longer know which connection identifiers have already been used.

2. Age field: Each packet carries an age field that is decremented as it travels;
when it reaches 0, the packet is discarded. One problem with this: due to
congestion, a packet may take a long time to reach the destination, and it may be
discarded (age = 0) before it ever arrives.

3. Sequence number field: Each connection starts numbering its TPDUs with a
different initial sequence number. So that the numbers do not wrap around quickly,
a 32-bit field can be used. A problem occurs when a host crashes and loses its
memory: when it comes up again, its transport entity does not know where it was
in the sequence space. To keep the same sequence number from appearing in more
than one live packet, a forbidden region of sequence numbers is used.
To solve these problems, a technique called the three-way handshake protocol is used.
Case 1: Normal Operation
The normal setup procedure when host 1 initiates is shown in fig(a). Host 1
chooses a sequence no. x, and sends a CONNECTION REQUEST TPDU
containing it to host 2. Host 2 replies with an ACK TPDU acknowledging x
and announcing its own initial sequence no. y. Finally, host 1 acknowledges
host 2’s choice of an initial sequence no. in the first data TPDU that it sends.
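A minimal sketch of this exchange, with TPDUs modeled as plain values rather than
packets (the stale-acknowledgement check anticipates the duplicate cases below):

    # Sketch of the three-way handshake sequence-number logic.
    import random

    def handshake(ack_from_host1=None):
        x = random.randrange(2**32)      # host 1 chooses x, sends CR(seq=x)
        y = random.randrange(2**32)      # host 2 replies ACK(seq=y, ack=x)
        final_ack = y if ack_from_host1 is None else ack_from_host1
        if final_ack != y:               # an old value z was acked, not y
            return "REJECT (delayed duplicate)"
        return "ESTABLISHED"             # first data TPDU acknowledged y

    print(handshake())                   # normal case -> ESTABLISHED
    print(handshake(ack_from_host1=7))   # stale ack   -> REJECT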
Case 2: Delayed Duplicate CR TPDU:
In the fig(b), the first TPDU is a delayed duplicate CONNECTION REQUEST
from an old connection. This TPDU arrives at host 2 without host 1’s
knowledge. Host 2 reacts to this TPDU by sending host 1 an ACK TPDU, in
effect asking for verification. When host 1 rejects host 2’s attempt to establish a
connection, host 2 realizes that it was a delayed duplicate and abandons the
connection. In this way, a delayed duplicate does no damage.

Case 3: Delayed Duplicate CR TPDU & ACK:
This case is shown in fig(c). Here host 2 gets a delayed CONNECTION REQUEST
and replies to it. When the second delayed TPDU arrives at host 2, the fact that
z has been acknowledged rather than y tells host 2 that this, too, is an old
duplicate.
Connection Release
There are two ways of terminating a connection :
a) Asymmetric release b) Symmetric release.
Asymmetric release is the way the telephone system works: when one party
hangs up, the connection is broken.
Symmetric release treats the connection as 2 separate unidirectional
connections and requires each one to release separately.
Note :- Asymmetric release may result in loss of data.
Consider the scenario of figure. After the
connection is established, host 1 sends a
TPDU that arrives properly at host 2.
Then host 1 sends another TPDU.
Suppose host 2 issues a DISCONNECT
before the second TPDU arrives. The
result is that the connection is released
and data is lost.
Thus, a sophisticated release protocol is
needed to avoid data loss. One way is to
use symmetric release, in which each
direction is released independently of the other.
Symmetric release does the job when each process has a fixed amount of data to send and
knows when it has sent it. Otherwise, determining when the connection should be
terminated is a problem, famously illustrated by the "two-army problem".

Imagine that a white army is encamped in a valley, as shown in the figure. On both the
sides are blue armies. The white army is larger than either of the blue armies alone, but
together the blue armies are larger than the white army. If either blue army attacks
individually, it will be defeated, but if the two blue armies attack simultaneously,
they will be victorious.
Note :- The blue armies want to synchronize their attacks.
Suppose that the commander of the blue army #1 sends a message :
“ I propose to attack at dawn on Oct 1. How about it ? “. Now suppose that the message
arrives , the commander of blue army #2 agrees, and his reply reaches safely back to blue
army #1. Will the attack happen? Probably not, because the commander #2 does not know
if his reply got through. If it did not, blue army #1 will not attack, so it would be foolish for
them to attack.
Let us improve the protocol by using a three-way handshake: the initiator of
the original proposal must acknowledge the response. Assuming no messages
are lost, blue army #2 will get the acknowledgement, but the commander of
blue army #1 will hesitate; after all, he does not know whether his
acknowledgement got through. No matter how many rounds of confirmation are
added, the last message sent can always be lost.

Now let us consider 4 scenarios for releasing the connection using the three-
way handshaking.

In figure(a), we see the normal procedure in which one of the users sends a
DR ( DISCONNECTION REQUEST ) TPDU to initiate the connection
release. When it arrives, the recipient sends back a DR TPDU, too, and starts
a timer, just in case its DR is lost. When this DR arrives, the original sender
sends back an ACK TPDU and releases the connection. Finally, when the
ACK TPDU arrives, the receiver also releases the connection.

If the final ACK TPDU is lost, as shown in figure(b), the situation is saved by
the timer. When the timer expires, the connection is released anyway.
Four protocol scenarios for releasing a connection. (a) Normal
case of a three-way handshake. (b) final ACK lost.
Now consider the case of second DR being lost. The user initiating the disconnection
will not receive the expected response, will time out, and will start all over again. In
figure( c), we assume that the second time no TPDUs are lost and all TPDUs are
delivered correctly and on time.

Our last scenario, figure(d), is similar to figure(c) except that now we assume
all the repeated attempts to retransmit the DR also fail due to lost TPDUs. After N
retries, the sender just gives up and releases the connection. Meanwhile, the
receiver times out and also exits.

Note :- In theory, the initial DR and all N retransmissions may be lost. The sender
will give up and release the connection, while the other side knows nothing at all
about the attempts to disconnect and is still active. This results in a half-open
connection.

One way to kill off half-open connections is to have a rule saying that if no TPDUs
have arrived for a certain number of seconds, the connection is then automatically
disconnected. That way, if one side ever disconnects, the other side will detect the lack
of activity and also disconnect.

Note :- To implement this, it is necessary for each transport entity to have a timer that
is stopped and then restarted whenever a TPDU is sent.
(c) Response lost. (d) Response lost and subsequent DRs lost.
Flow Control and Buffering
Flow control problem in transport layer is similar to that of data link layer’s
concept , but with some differences. The basic similarity is that in both layers a
sliding window or other method is needed on each connection to keep a fast
sender from overrunning a slow receiver. The main difference is that a router
usually has a relatively few lines, whereas a host may have numerous
connections. Due to this , it is impractical to implement the data link buffering
methods.

In the data link layer, the sending side must buffer outgoing frames because
they might have to be retransmitted. If the subnet uses datagram service, the
sending transport entity must also buffer, for the same reason. If the receiver
knows that the sender buffers all TPDUs until they are acknowledged, the receiver
may or may not dedicate specific buffers to specific connections as it sees fit;
either way, buffers are needed at the receiving side to hold arriving TPDUs.

We have two ways of organizing the buffers:
i) Static buffer allocation ii) Dynamic buffer allocation

i) Static buffer allocation is sub-divided into 3 types:
a) Chained fixed-size buffers b) Chained variable-size buffers
c) One large circular buffer per connection

a) Chained fixed-size buffers: Here the buffers are identically sized, with one
TPDU per buffer. However, if there is a wide variation in TPDU size, a pool of
fixed-size buffers is a problem.
If the buffer size is chosen equal to the largest possible TPDU, space is
wasted whenever a short TPDU arrives. If the buffer size is chosen equal to the
minimum TPDU size, multiple buffers are needed for long TPDUs, with the
attendant complexity.
b) Chained variable-size buffers: Another approach is to use variable-size
buffers. The advantage is better memory utilization, at the price of more
complicated buffer management.

c) One large circular buffer per connection: A third possibility is to dedicate
a single large circular buffer to each connection. This makes efficient use of
memory provided that all connections are heavily loaded, but it is poor if some
connections are lightly loaded.

ii) Dynamic buffer allocation: As connections are opened and closed and as the
traffic pattern changes, the sender and receiver need to dynamically adjust their
buffer allocations. Assume that buffer allocation information travels in separate
TPDUs, as shown, and is not piggybacked onto the reverse traffic. The sender
requests a certain number of buffers based on its needs, and the receiver grants
as many of them as it can spare.
The following figure shows an example of how dynamic window management might
work in a datagram subnet with 4-bit sequence numbers.

Dynamic buffer allocation: the arrows show the direction of transmission. An
ellipsis (…) indicates a lost TPDU.
Initially, A wants 8 buffers but is granted only 4 of them. It then sends 3
TPDUs, of which the third is lost. TPDU 6 acknowledges receipt of all TPDUs up
to and including sequence number 1, thus allowing A to release those buffers, and
furthermore informs A that it has permission to send 3 more TPDUs starting
beyond 1 (i.e., TPDUs 2, 3, and 4). A knows that it has already sent number 2, so
it assumes that it may send 3 and 4, which it proceeds to do. At this point, it is
blocked and must wait for more buffer allocation. Timeout-induced retransmissions
(line 9), however, may occur while blocked, since they use buffers that have
already been allocated. In line 10, B acknowledges receipt of all TPDUs up to and
including 4 but refuses to let A continue. The next TPDU from B to A allocates
another buffer and allows A to continue.

Note :- Problems with this scheme arise if control TPDUs can get lost.
Consider line 16: B has allocated more buffers to A, but the allocation TPDU
was lost. Since control TPDUs are not sequenced or timed out, A is now
deadlocked. To prevent this, each host should periodically send control TPDUs
giving the acknowledgement and buffer status on each connection. In this way,
the deadlock will (sooner or later) be broken.
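The credit mechanism can be sketched as follows (illustrative only): the sender
may transmit only while it holds buffer credits granted by the receiver, and it
blocks when the credits run out.

    # Sketch of credit-based dynamic buffer allocation.
    class Sender:
        def __init__(self):
            self.credits = 0          # buffers currently granted by the receiver

        def grant(self, n):           # an allocation TPDU arrived
            self.credits += n

        def send(self, seq):
            if self.credits == 0:
                return False          # blocked: must wait for more buffers
            self.credits -= 1
            print("sent TPDU", seq)
            return True

    s = Sender()
    s.grant(4)                        # requested 8 buffers, granted only 4
    for i in range(5):
        if not s.send(i):
            print("blocked at TPDU", i)

If the grant TPDU itself is lost, the sender stays blocked forever, which is
exactly why the periodic status TPDUs described above are needed.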
Multiplexing
Multiplexing is a technique used to combine and send the multiple
conversations over a single medium. It is of two types:
i) Upward multiplexing ii) Downward multiplexing
Upward multiplexing: Here, multiple transport connections are multiplexed onto
a single network connection. When a TPDU arrives, there must be some way to
tell which process to give it to. If only one network address is available on a
host, all transport connections on that machine have to use it.

Downward multiplexing: Here, a single transport connection is split and its
traffic distributed among multiple network connections on a round-robin basis.
Suppose that a subnet uses virtual circuits internally and imposes a maximum
data rate on each one. If a user needs more bandwidth than one virtual circuit
can provide, a way out is to open multiple network connections and distribute
the traffic among them, as sketched below.
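A sketch of the round-robin distribution (the virtual-circuit names are hypothetical):

    # Sketch of downward multiplexing: one transport connection's traffic is
    # spread round-robin over several network connections (virtual circuits).
    from itertools import cycle

    virtual_circuits = cycle(["vc0", "vc1", "vc2"])   # three open circuits
    for packet in range(6):
        print("packet", packet, "->", next(virtual_circuits))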
Crash Recovery
If hosts and routers are subject to crashes, recovery becomes an issue.
Main problem is how to recover from host crashes. It may be desirable for some
clients to be able to continue working when servers crash and then quickly reboot.
Let us assume that one host, client, is sending a long file to another host, file server,
using a stop-and-wait protocol. The transport layer on the server simply passes the
incoming TPDUs to the transport user, one by one. During the transmission, the
server crashes. When it comes back up, its tables are reinitialized, so it no longer
knows where it was.
Thus to recover from the previous status, the server might send a broadcast TPDU
to all other hosts, announcing that it had just crashed and requesting that its clients
inform it of the status of all open connections.
Each client can be in one of the 2 states : one TPDU outstanding, S1, or no TPDUs
outstanding , S0.
Based on this, the client must decide whether to retransmit the most recent TPDU.
We assume that the client should retransmit if and only if it has an unacknowledged
TPDU outstanding ( in state S1 ) when it learns of the crash.
The server can first send the acknowledgement and then do the write; if it crashes in
between, the client (in state S0) will not retransmit, and the data is lost. Now let us
reprogram the transport entity to first do the write and then send the ack. Imagine that
the write has been done but the crash occurs before the ack can be sent. The client will
be in state S1 and thus retransmit, leading to an undetected duplicate TPDU in the
output stream to the server application.
The server can be programmed in one of the 2 ways :
i)Acknowledge first (or) ii)Write first.
The client can be programmed in one of the 4 ways :
-always retransmit the last TPDU,
-never retransmit the last TPDU,
-retransmit only in state S0, or
-retransmit only in the state S1.
This gives 8 combinations. ( for each combination there is some set of events that
makes the protocol fail )

Three events are possible at the server:
i) sending an acknowledgement (A),
ii) writing to the output process (W), and
iii) crashing (C).
These three events can occur in six different orderings: AC(W), AWC, C(AW), C(WA),
WAC, WC(A), where the parentheses indicate events that cannot occur after the crash.
Note :- For each strategy, there is some sequence of events that causes the
protocol to fail. For eg., if the client always retransmits, the AWC event will
generate an undetected duplicate, even though the other 2 events work properly.
Note :- Thus, recovery from a layer N crash can only be done by layer N+1,
and only if the higher layer retains enough status information.
The following figure shows all 8 combinations of client and server strategy and
the valid event sequences for each one.

Different combinations of client and server strategy.

The Internet Transport Protocols : UDP
 Introduction to UDP
 Remote Procedure Call
 The Real-Time Transport Protocol
The Internet has two main protocols in the transport layer, a connectionless
protocol(UDP) and a connection-oriented protocol(TCP).

Introduction to UDP
UDP provides a way for applications /users to send encapsulated IP datagrams
without having to establish a connection.

UDP transmits segments consisting of an 8-byte header followed by the payload.
The header is shown below:

The UDP header


The 2 ports(source port, destination port) are used to identify the end points in
the source & destination machines.
When a UDP packet arrives, its payload is handed to the process attached to the
destination port. This attachment occurs when the BIND primitive is used.

Source Port(16 bit): It defines the source port of the application process on the
sending device.
Destination Port(16 bit): It defines the destination port of the application
process on the receiving device.
The UDP length field includes the 8-byte header and the data.
The UDP checksum is optional and stored as 0 if not computed.
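Because the header is a fixed 8 bytes, it is easy to build and parse with Python's
struct module; a sketch (the port numbers and payload are arbitrary examples):

    # Sketch: packing and unpacking the 8-byte UDP header.
    # Fields: source port, destination port, length (header + data), checksum.
    import struct

    def build_udp_header(src_port, dst_port, payload, checksum=0):
        length = 8 + len(payload)             # header is always 8 bytes
        return struct.pack("!HHHH", src_port, dst_port, length, checksum)

    hdr = build_udp_header(5000, 53, b"query")        # e.g., toward DNS
    src, dst, length, csum = struct.unpack("!HHHH", hdr)
    print(src, dst, length, csum)                     # 5000 53 13 0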

Advantages: UDP is mainly useful in client-server applications.
-No setup (connection establishment) is needed in advance and no release is needed afterwards.
-An application that uses UDP is DNS (Domain Name System).
Disadvantages: UDP does not provide reliability, flow control, error control,
ACK or retransmission upon receipt of a bad segment.
UDP just provides an interface to the IP protocol with the added feature of
demultiplexing multiple processes using the ports.
Applications of UDP: used in route-updating protocols, multicast, broadcast,
DNS, and live streaming (lost data is skipped rather than retransmitted, so
playback stays live instead of replaying what was missed).
Remote Procedure Call
When a process on machine-1 calls a procedure on machine -2, the calling
process on 1 is suspended and execution of the called procedure takes place on 2.
Information can be transported from the caller to the callee in the parameters
and can come back in the procedure result. Message passing is not visible to the
programmer. This technique is known as RPC ( Remote Procedure Call ) and it
has become basis for many networking applications.
The calling procedure is known as the client and the called procedure is known
as the server.
To call a remote procedure, the client program is bound with a small library
procedure, called the client stub. Similarly, the server is bound with a procedure
called the server stub.

Steps in making a remote procedure call. The stubs are shaded.


Steps in making an RPC:
Step 1 : Client program will call the client stub. This call is a local procedure
call, with the parameters pushed onto the stack in the normal way.
Step 2 : The client stub packs the parameters into a message and makes a
system call to send the message. This packing process is called marshaling.
Step 3 : The kernel sends the message from the client machine to the
server machine.
Step 4 : The server kernel passes the incoming packet to the server stub.
Step 5 : After unmarshaling the parameters, the server stub calls the server
procedure, and the procedure runs. The reply traces the same path back in the
other direction.
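A toy version of the two stubs (an illustration only: JSON over UDP stands in
for real marshaling, and port 9000 is invented for the sketch):

    # Sketch of RPC stubs: the client stub marshals the call into a message,
    # the server stub unmarshals it, invokes the real procedure, and replies.
    import json, socket

    def client_stub(host, proc_name, *args):          # looks like a local call
        msg = json.dumps({"proc": proc_name, "args": args}).encode()  # marshal
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.sendto(msg, (host, 9000))                   # kernel ships the message
        reply, _ = s.recvfrom(4096)
        return json.loads(reply)["result"]            # unmarshal the result

    def server_stub(procedures):                      # e.g., {"add": lambda a, b: a + b}
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind(("", 9000))
        while True:
            msg, addr = s.recvfrom(4096)
            call = json.loads(msg)                    # unmarshal the request
            result = procedures[call["proc"]](*call["args"])   # the real call
            s.sendto(json.dumps({"result": result}).encode(), addr)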

Problems in using RPC:
-Problems with the use of global variables.
-Impossible to pass pointers directly.
-Array operations cannot be performed.
-It is not always possible to deduce the types of the parameters.

Irrespective of the problems , RPC is widely used , but with some restrictions.
UDP is commonly used for RPC. However, when the parameters or results may
be larger than the maximum UDP packet, it may be necessary to set up a TCP
connection and send the request over it rather than using UDP.
Real-time Transport Protocol(RTP)
UDP is also widely used in the area of real-time multimedia applications.
In particular, internet telephony, video conferencing, video-on-demand and other
multimedia applications became more common. Thus, a generic real-time transport
protocol for multimedia applications is needed. Thus RTP came into existence.
RTP is the protocol designed to handle real-time traffic on the Internet. It does not have
a delivery mechanism; it must be used with UDP. The main contributions of RTP are
time-stamping, sequencing, and mixing facilities.
RTP is in user space and runs over UDP. It operates as follows:
The multimedia application, consisting of multiple audio, video, text, and possibly
other streams, is fed into the RTP library, which is in user space along with the
application. The library multiplexes the streams and encodes them in RTP packets,
which it then stuffs into a socket. At the other end of the socket (in the O/S
kernel), UDP packets are generated and embedded in IP packets. The IP packets are
then put in Ethernet frames for transmission.
The basic function of RTP is to multiplex several real time data streams onto
a single stream of UDP packets. The UDP stream can be sent to a single
destination(unicasting) or to multiple destinations(multicasting).

Each packet sent in an RTP stream is given a number one higher than its
predecessor. This numbering allows the destination to determine whether any
packets are missing. If a packet is missing, the destination can approximate the
missing value by interpolation. Retransmission is not useful, since the
retransmitted packet would probably arrive too late. Consequently, RTP has no
flow control, no error control, no acknowledgements, and no retransmissions.

The RTP header is shown in the following figure. It consists of three 32-bit
words and possibly some extensions.

RTP Header

The first word contains the Version field, which is currently 2. The P bit
indicates that the packet has been padded to a multiple of 4 bytes. The X bit
indicates that an extension header is present. The CC field tells how many
contributing sources are present, from 0 to 15. The M bit is an
application-specific marker bit; it can be used to mark the start of a video
frame, the start of a word in an audio channel, and so on.
The Payload type field tells which encoding algorithm has been used.
The sequence number is a counter that is incremented on each RTP packet sent.
It is used to detect lost packets. The timestamp is produced by the stream’s source
to note when the first sample in the packet was made. This can be used to reduce
the jitter at the receiver.
The synchronization source identifier tells which stream the packet belongs to.
(It is the method used to multiplex and demultiplex multiple data streams onto a
single stream of UDP packets).
The contributing source identifiers, if any, are used when mixers are present;
the mixer is the synchronizing source, and the contributing sources are the
streams being mixed.
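Given this layout, the 12-byte fixed header can be decoded with a few shifts and
masks; a Python sketch (the example field values are arbitrary):

    # Sketch: decoding the fixed RTP header.
    import struct

    def parse_rtp_header(data):
        b0, b1, seq = struct.unpack("!BBH", data[:4])
        ts, ssrc = struct.unpack("!II", data[4:12])
        return {
            "version": b0 >> 6,           # currently 2
            "P": (b0 >> 5) & 1,           # padded to a multiple of 4 bytes
            "X": (b0 >> 4) & 1,           # extension header present
            "CC": b0 & 0x0F,              # number of contributing sources
            "M": (b1 >> 7) & 1,           # application-specific marker
            "payload_type": b1 & 0x7F,    # which encoding algorithm is used
            "sequence": seq,              # detects lost packets
            "timestamp": ts,              # reduces jitter at the receiver
            "ssrc": ssrc,                 # which stream the packet belongs to
        }

    hdr = struct.pack("!BBHII", 2 << 6, 96, 1, 160, 0xDEADBEEF)  # sample values
    print(parse_rtp_header(hdr))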
The Internet Transport Protocols: TCP
 Introduction to TCP
 The TCP Service Model
 The TCP Protocol
 The TCP Segment Header
 TCP Connection Establishment
 TCP Connection Release
 TCP Connection Management Modeling
 TCP Transmission Policy
 TCP Congestion Control
 TCP Timer Management
 Wireless TCP and UDP
 Transactional TCP
Introduction to TCP
TCP (Transmission Control Protocol ) was specifically designed to provide a
reliable, end-to-end, byte stream over an unreliable internetwork.
An internetwork differs because different parts may have different topologies,
bandwidths, delays, packet sizes etc. TCP was designed to dynamically adapt
to these properties.

Each machine supporting TCP has a TCP transport entity, either a library
procedure, a user process, or part of the kernel.
A TCP entity accepts user data streams from local processes,
breaks them into pieces not exceeding 64 KB and
sends each piece as a separate IP datagram.
When datagrams containing TCP data arrive at a machine, they are given to
the TCP entity which reconstructs the original byte streams.
The TCP Service Model
TCP service is obtained by creating end points by both sender and the receiver.
These end points are called sockets.
Each socket has a socket no. consisting of the i)IP address of the host and
ii)16-bit no. local to that host, called a port.
A port is nothing but TSAP(Transport layer Service Access Point). Port numbers
below 1024 are called well-known ports and are reserved for standard services.
For TCP service to be obtained, a connection must be explicitly established
between a socket on the sending machine and a socket on the receiving machine.
A socket may be used for multiple connections at the same time; connections are
identified by the socket identifiers at both ends, i.e., (socket1, socket2).
Socket address = IP address of the host + port address.

Note :- All TCP connections are full duplex and point-to-point. TCP does not
support multicasting or broadcasting. A TCP connection is a byte stream.
The TCP Protocol
The sending and receiving TCP entities exchange data in the form of segments. A
TCP segment consists of a fixed 20-byte header (plus an optional part) followed
by zero or more data bytes.
Two limits restrict the segment size :
i) each segment, including the TCP header, must fit in the 65,515-byte IP payload.
ii) each network has a maximum transfer unit(MTU), and each segment must fit in
the MTU. MTU is generally 1500 bytes (default).
The basic protocol used by TCP entities is the sliding window protocol. When
a sender transmits a segment, it also starts a timer.
When the segment arrives at the destination, the receiving TCP entity sends back a
segment (with data if any exist, otherwise without) bearing an acknowledgement
number equal to the next sequence number it expects to receive.
If the sender’s timer goes off before the ack is received, the sender retransmits
the segment.
Segments can arrive out of order, acknowledgements can be lost, and segments
can be delayed in transit. TCP must deal with all of these problems and solve
them efficiently.
The TCP Segment Header
TCP Header
Every segment begins with a
20-byte header. This may be
followed by header options.
After the options, data bytes
may appear.
Note : Segments without any
data are legal and are
commonly used for ACKs and
control messages.

-The Source port and Destination port fields identify the local end points of
the connection.
-The Sequence no. and ACK no. fields perform their usual functions. Both are
32-bit long. ACK no. specifies the next byte expected, not the byte received.
-The TCP header length tells how many 32-bit words are present in TCP
header. This is needed because the Options field is of variable length.
-Next comes a 6-bit field that is not used.
Next comes six 1-bit flags: URG, ACK, PSH, RST, SYN and FIN.
-URG(Urgent pointer ) is set to 1 if it is in use.
-The ACK(ACK no.) bit is set to 1 to indicate that it is valid; otherwise it is
ignored.
-The PSH(Push) bit indicates pushed data. The receiver is requested to deliver
the data to the application upon arrival and not buffer it until a full buffer has
been received.
-The RST(Reset) bit is used to reset a connection that has been confused due to
crash or some other reason.
It is also used to reject an invalid segment or refuse an attempt to open a
connection.
-The SYN (synchronize) bit is used to establish connections. The connection
request has SYN=1 and ACK=0 to indicate that the piggyback acknowledgement field
is not in use. The connection reply does bear an acknowledgement, so it has SYN=1
and ACK=1.
Note :- SYN bit is used to denote CONNECTION REQUEST and
CONNECTION ACCEPTED, with the ACK bit used to distinguish between
them.
-The FIN(Finish) bit is used to release a connection. It specifies that the sender
has no more data to transmit. However, it may receive data.
-Flow control in TCP is handled using a variable sized sliding window. The
Window size field tells how many bytes may be sent. Window size 0 is also valid .
The Options field provides a way to add extra facilities not covered in the
regular header. For example, it allows each host to specify the maximum TCP
payload it is willing to accept; using larger segments is more efficient. If a
host does not use this option, it defaults to a 536-byte payload (536 + 20 =
556-byte segments, which all Internet hosts are required to accept).

A checksum is provided for extra reliability. It checksums the header, the data,
and the conceptual pseudoheader shown in the following figure. When performing
this computation, the Checksum field itself is set to 0, and the data field is
padded with one zero byte if its length is odd.

The pseudoheader contains the 32-bit IP addresses of the source and the
destination machines, protocol number for TCP (6) and the byte count for the
TCP segment(including the header).
Including the pseudoheader in the TCP checksum helps detect misdelivered
packets, but it violates the protocol hierarchy, since IP addresses belong to
the IP layer, not the TCP layer.
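A sketch of the computation in Python (assuming the standard Internet
one's-complement checksum; the IP addresses are made-up examples, and the
segment's own Checksum field is taken to be zero while computing):

    # Sketch: Internet checksum over pseudoheader + TCP segment. The
    # pseudoheader is prepended only for the computation, never transmitted.
    import struct, socket

    def internet_checksum(data):
        if len(data) % 2:
            data += b"\x00"                           # pad to an even length
        total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
        while total >> 16:
            total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
        return ~total & 0xFFFF                        # one's complement

    def tcp_checksum(src_ip, dst_ip, segment):
        pseudo = struct.pack("!4s4sBBH",
                             socket.inet_aton(src_ip), socket.inet_aton(dst_ip),
                             0, 6, len(segment))      # zero, protocol 6, length
        return internet_checksum(pseudo + segment)

    print(hex(tcp_checksum("10.0.0.1", "10.0.0.2", b"\x00" * 20)))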
TCP Connection Management
TCP Connection Management includes:
i)Connection Establishment ,
ii)Connection Release and
iii)Connection Management Modeling

TCP Connection Establishment


Connections are established in TCP by means of three-way handshake method.
To establish a connection, one side, say, server when it is ready to accept an
incoming connection; it executes the LISTEN and ACCEPT primitives.
The other side, say, client establishes a connection by executing a CONNECT
primitive specifying i) the IP address and port to which it wants to connect,
ii)the maximum TCP segment size it is willing to accept, and other optional
user data. The CONNECT primitive sends a TCP segment with the SYN bit ON
and ACK bit OFF and waits for a response.
When this segment arrives at the destination, the TCP entity checks to see if
there is a process that has done a LISTEN on the port given in the Destination
port field. If not, it sends a reply with RST bit ON to reject the connection.
If some process is listening to the port, that process is given the incoming TCP
segment. It can then either accept or reject the connection.
If it accepts, an ACK segment is sent back. The sequence of TCP segments sent
in the normal case is shown in fig(a).

If two hosts simultaneously attempt to establish a connection between the same
two sockets, the sequence of events is given in fig(b). The result is that only
one connection is established, not two, because connections are identified by
their end points. If the first setup results in a connection identified by
(x, y) and the second one does the same, only one table entry is made, namely
for (x, y).
TCP Connection Release
To release a connection, either parties can send a TCP segment with the FIN bit set,
which means that it has finished transmitting data and no more data left.
When the FIN is acknowledged, that direction is shut down. Data may
continue to flow in the other direction.
When both directions have been shut down, the connection is released.
Note :- Both ends of a TCP connection may send FIN segments at the same time
also.
TCP Connection Management Modeling
Steps required to establish & release the connections can be represented in a finite state machine
with the 11 states listed in the figure
-Each connection starts with the CLOSED state. It leaves this state when it does
either a passive open (LISTEN) or an active open (CONNECT).
-If the other side does the opposite one, a connection is established and the state
becomes ESTABLISHED. Connection release can be initiated by either side.
When it is complete, the state returns to CLOSED.

-The finite state machine is shown in the next figure.
-The common case of a client actively connecting to a passive server is shown
with heavy lines – solid for the client, dotted for the server.
-The light lines are unusual event sequences.
-Each line is marked by an event/action pair. The event can be a user-initiated
system call (CONNECT, LISTEN, SEND, or CLOSE), a segment arrival (SYN, FIN,
ACK, or RST), or a timeout. The action is the sending of a control segment
(SYN, FIN, or RST) or nothing; comments are shown in parentheses.

Note :- We can understand the diagram by first following the path of a client
(heavy solid line), and then the path of a server (heavy dashed line).
TCP Transmission Policy
Window management in TCP is not
directly tied to acknowledgements.
Suppose the receiver has a 4096-byte
buffer, as shown in the figure. If the
sender transmits a 2048-byte segment
that is correctly received, the receiver
will acknowledge it. Since, it has only
2048-bytes of buffer space now ( until
the application removes some data
from the buffer ), it will advertise a
window of 2048, starting at the next
byte expected.
Now the sender transmits another 2048 bytes, which are acknowledged, but the advertised
window is 0. The sender must stop until the application process on the receiving host has
removed some data from the buffer, at which time TCP can advertise a larger window.
Note :- When the window is 0, the sender may not send segments normally, but there
are 2 exceptions.
-First, urgent data may be sent.
-Second, the sender may send a 1-byte segment to make the receiver re-announce the
next byte expected and the window size.
TCP explicitly provides this option to prevent deadlock if a window announcement
ever gets lost.

Senders are not required to transmit data as soon as it arrives, nor are
receivers required to send acknowledgements as soon as possible.
For example, in the figure above, when the first 2 KB of data came in, TCP,
knowing it had a 4-KB window, could buffer the data until another 2 KB arrived,
so as to transmit a single segment with a 4-KB payload.

One approach that many TCP implementations use is to delay the ACKs and
window updates in the hope of acquiring some more data.
Although this rule reduces the load placed on the network by the receiver, the
sender is still operating inefficiently. A way to solve this problem is Nagle’s
algorithm.
Nagle’s algorithm: According to this, when data come into the sender 1 byte at a
time, send the first byte and buffer all the rest until the outstanding byte is
acknowledged. Then send all the buffered characters in 1 TCP segment and start
buffering again.
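A minimal sketch of the rule (illustrative; real TCP also sends as soon as a
full segment's worth of data accumulates):

    # Sketch of Nagle's algorithm: the first byte goes out at once; further
    # bytes are buffered until the outstanding data is acknowledged.
    class NagleSender:
        def __init__(self):
            self.buffer = b""
            self.unacked = False            # is a segment still outstanding?

        def write(self, byte):
            if not self.unacked:
                self.transmit(byte)         # first byte is sent immediately
            else:
                self.buffer += byte         # everything else is buffered

        def on_ack(self):
            self.unacked = False
            if self.buffer:
                self.transmit(self.buffer)  # flush the buffer as one segment
                self.buffer = b""

        def transmit(self, data):
            print("segment:", data)
            self.unacked = True

    s = NagleSender()
    for ch in b"hello":
        s.write(bytes([ch]))                # only b"h" goes out immediately
    s.on_ack()                              # b"ello" leaves in a single segment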
Silly window syndrome degrades TCP performance.
This occurs when data are passed to the sending TCP
entity in large blocks, but an interactive application on
the receiving side reads 1 byte at a time.

Consider the following figure. Initially, the TCP buffer on the receiving side
is full and the sender knows this. Then the interactive application reads one
character from the TCP stream. This action makes the receiver send a window
update to the sender saying that it can send 1 byte. The sender obliges and
sends 1 byte; the buffer fills again, and the cycle can go on forever.

Clark’s solution is to prevent the receiver from sending a window update for 1
byte. Specifically, the receiver should not send a window update until it can
handle the maximum segment size it advertised, when the connection was
established.
Further, the sender can also help by not sending small segments. Sender should wait
until it has accumulated enough space in the window to send a full segment.
TCP Congestion Control
When the load offered to any network is more than it can handle, congestion
builds up.
The idea is to refrain from injecting a new packet into the network until an old
one leaves (ie., delivered). TCP attempts to achieve this by dynamically
manipulating the window size.
Let us try to prevent congestion from occurring initially. When a connection is
established, a suitable window size has to be chosen. The receiver can specify a window
based on its buffer size.

If the sender sticks to this window, problems will not occur due to buffer
overflow at the receiving side, but they may still occur due to internal
congestion within the network.

In figure(a), we see a thick pipe leading to a small-capacity receiver. As long
as the sender does not send more water than the bucket can contain, no water
will be lost.
In figure(b), the limiting factor is not the bucket capacity, but the internal
carrying capacity of the network. If too much water comes in too fast, it will
back up and some will be lost (in this case by overflowing the funnel).

Thus two problems exist: network capacity and receiver capacity. Each is dealt
with separately. The sender maintains two windows:
i) the window the receiver has granted ii) the congestion window.
Each reflects the number of bytes the sender may transmit; the number of bytes
that may be sent is the minimum of the two windows.

When a connection is established, the sender initializes the congestion window
to the size of the maximum segment in use on the connection.
It then sends one maximum segment. If this segment is acknowledged before
the timer goes off, it adds 1 more segment to the congestion window and sends
2 segments. As each of the segments is acknowledged, the congestion window
is increased by 1 maximum segment size.
When the congestion window is n segments, if all n are acknowledged on time,
the congestion window is increased by the byte count corresponding to n
segments.

The congestion window keeps growing exponentially until either a timeout occurs
or the receiver's window is reached. This algorithm is called slow start,
although its growth is actually exponential. All TCP implementations are
required to support it.

Internet congestion control algorithm: This algorithm uses 3 parameters:
i) the threshold ii) the receiver window and iii) the congestion window.
When congestion (a timeout) occurs, the threshold is set to half of the current
congestion window, and the congestion window is reset to one maximum segment.

Slow start is then used to determine what the network can handle,
except that exponential growth stops when the threshold is hit.
Let us consider the following
figure. The maximum segment
size here, is 1 KB. Initially, the
congestion window was 64 KB,
but a timeout occurred, so the
threshold is set to 32 KB and
the congestion window to 1 KB.
The congestion window then
grows exponentially until it hits
threshold. Starting then, it
grows linearly.

An example of the Internet congestion algorithm

Transmission 13 is unlucky, and a timeout occurs. The threshold is set to half
the current window (it is now 40 KB, so half is 20 KB), and slow start is
initiated all over again. When the acknowledgements from transmission 14 start
coming in, the first four each double the congestion window, but after that,
growth becomes linear again.
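The growth rule in the example can be sketched as a small simulation (a
simplification: the window is counted in KB, updated once per round trip, and
the timeout is injected by hand):

    # Sketch: slow start up to the threshold, then linear growth;
    # on a timeout, threshold = cwnd / 2 and cwnd resets to 1 segment.
    def step(cwnd, threshold, timeout=False):
        if timeout:
            return 1, max(cwnd // 2, 1)
        if cwnd < threshold:
            return cwnd * 2, threshold     # exponential (slow start) phase
        return cwnd + 1, threshold         # linear phase

    cwnd, threshold = 1, 32                # in KB, as in the example figure
    for rtt in range(12):
        print("rtt", rtt, "cwnd", cwnd, "threshold", threshold)
        cwnd, threshold = step(cwnd, threshold, timeout=(rtt == 9))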
TCP Timer Management
TCP uses multiple timers. The important one is retransmission timer. When a
segment is sent, a retransmission timer is started. If the segment is
acknowledged before the timer expires, the timer is stopped. If the timer goes
off before the acknowledgement comes in, the segment is retransmitted ( and
timer is started again). How long should the timeout interval be ?

This problem is much harder in the Internet transport layer than in the data link layer.
In data link layer, the expected delay is highly predictable, so the timer can be
set to go off slightly after the acknowledgement is expected, as shown in the
figure(a). The probability density function for the time it takes for a TCP
acknowledgement to come back looks like in figure(b) than in figure(a).
Determining round-trip time in TCP is very difficult.
(a) Probability density of ACK arrival times in the data link layer.
(b) Probability density of ACK arrival times for TCP.
If the timeout is set too short, say T1, in figure(b), unnecessary retransmissions
will occur, clogging the Internet with the useless packets. If it is set too long,
say T2, performance will suffer due to the long retransmission delay whenever
a packet is lost.

The solution is to use a highly dynamic algorithm that constantly adjusts the
timeout interval, based on continuous measurements of network performance.
The algorithm used by TCP works as follows : For each connection, TCP
maintains a variable, RTT, that is the best current estimate of the round-trip
time to the destination.
When a segment is sent, a timer is started, to see how long the
acknowledgement takes and to trigger a retransmission if it takes too long. If
the acknowledgement gets back before the timer expires, TCP measures how
long the acknowledgement took, say M. It then updates RTT according to the
formula :

RTT = αRTT + ( 1 – α )M
where α is a smoothing factor that determines how much weight is given to the
old value. Even given a good value of RTT, choosing a suitable retransmission
timeout is difficult.
There is a subtlety: when a segment is retransmitted and an acknowledgement
arrives, it is unclear whether the ack refers to the first transmission or to
the retransmission, so the measurement is ambiguous. Karn's algorithm therefore
does not update RTT for retransmitted segments; instead, the timeout is simply
doubled on each failure until the segments get through the first time. Most TCP
implementations use it.
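A sketch of the estimator and backoff (ALPHA = 7/8 is the classic smoothing
weight; the timeout rule shown is a simplification, since modern TCP also
tracks the mean deviation of the samples):

    # Sketch: exponentially weighted RTT estimate plus Karn-style backoff.
    ALPHA = 0.875

    def update_rtt(rtt, m):
        return ALPHA * rtt + (1 - ALPHA) * m     # RTT = aRTT + (1 - a)M

    rtt, timeout = 100.0, 200.0                  # milliseconds
    for m in (90, 110, 400):                     # measured ack delays
        rtt = update_rtt(rtt, m)
        timeout = 2 * rtt                        # crude timeout choice
        print(round(rtt, 1), round(timeout, 1))

    # Karn's algorithm: after a retransmission the sample is ambiguous,
    # so RTT is left alone and the timeout is simply doubled.
    timeout *= 2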

A second timer is the persistence timer. It is designed to prevent the following
deadlock: the receiver sends an acknowledgement with a window size of 0, telling
the sender to wait; later, the receiver updates the window, but the packet with
the update is lost. Now both the sender and receiver are waiting for each other
to do something. When the persistence timer goes off, the sender transmits a
probe to the receiver, and the response to the probe gives the window size. If
it is still 0, the persistence timer is set again and the cycle repeats. If it
is nonzero, data can now be sent.

A third timer is the keepalive timer. When a connection has been idle for a
long time, the keepalive timer may go off, causing one side to check whether
the other side is still there. If it fails to respond, the connection is
terminated.

You might also like