Network Layer


Chapter 3

Transport Layer

Dr/ Hala Hassan


Chapter 3 outline
3.1 transport-layer services
3.2 multiplexing and demultiplexing
3.3 connectionless transport: UDP
3.4 principles of reliable data transfer
3.5 connection-oriented transport: TCP
• segment structure
• reliable data transfer
• flow control
• connection management
3.6 principles of congestion control
3.7 TCP congestion control

Transport Layer 3-2


TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

❖ point-to-point:
• one sender, one receiver
❖ reliable, in-order byte stream:
• no “message boundaries”
❖ pipelined:
• TCP congestion and flow control set window size
❖ full duplex data:
• bi-directional data flow in same connection
• MSS: maximum segment size
❖ connection-oriented:
• handshaking (exchange of control msgs) inits sender, receiver state before data exchange
❖ flow controlled:
• sender will not overwhelm receiver

TCP segment structure

header layout (each row is 32 bits wide):
• source port #, dest port #
• sequence number
• acknowledgement number
• head len, not used, flag bits U A P R S F, receive window
• checksum, urg data pointer
• options (variable length)
• application data (variable length)

field notes:
• sequence number, acknowledgement number: counting by bytes of data (not segments!)
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• receive window: # bytes rcvr willing to accept
• checksum: Internet checksum (as in UDP)
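To make the segment layout concrete, here is a minimal Python sketch that unpacks the fixed 20-byte part of a TCP header. The function name parse_tcp_header and the dictionary keys are illustrative choices, not part of the slides, and the sketch ignores the two newer flag bits that share the flags byte.

```python
import struct

# Fixed 20-byte TCP header in network byte order:
# src port, dst port, seq #, ack #, offset byte, flags byte, rwnd, checksum, urg ptr
TCP_HEADER = struct.Struct("!HHIIBBHHH")

# Flag bit positions for the six flags named on the slide (U A P R S F).
FLAG_BITS = {"F": 0x01, "S": 0x02, "R": 0x04, "P": 0x08, "A": 0x10, "U": 0x20}


def parse_tcp_header(segment: bytes) -> dict:
    """Split a raw TCP segment into header fields, options and data."""
    (src, dst, seq, ack, offset_byte, flag_byte,
     rwnd, checksum, urg_ptr) = TCP_HEADER.unpack_from(segment)
    head_len = (offset_byte >> 4) * 4  # header length is stored in 32-bit words
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "head_len": head_len,
        "flags": {name for name, bit in FLAG_BITS.items() if flag_byte & bit},
        "rwnd": rwnd,
        "checksum": checksum,
        "urg_ptr": urg_ptr,
        "options": segment[20:head_len],
        "data": segment[head_len:],
    }


# Example: a segment carrying the single byte 'C' with ACK and PSH set.
raw = TCP_HEADER.pack(1234, 80, 42, 79, 5 << 4, 0x18, 4096, 0, 0) + b"C"
header = parse_tcp_header(raw)
```

The checksum is left at zero in the example because computing it also needs fields from the IP header, which this sketch does not model.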


TCP seq. numbers, ACKs

sequence numbers:
• byte stream “number” of first byte in segment’s data
acknowledgements:
• seq # of next byte expected from other side
• cumulative ACK
Q: how receiver handles out-of-order segments
A: TCP spec doesn’t say, up to implementor

[Figure: outgoing segment from sender and incoming segment to sender, each with source port #, dest port #, sequence number, acknowledgement number, rwnd, checksum, urg pointer. Sender sequence number space with window size N: sent ACKed | sent, not-yet ACKed (“in-flight”) | usable but not yet sent | not usable]

TCP seq. numbers, ACKs

simple telnet scenario:
• Host A to Host B: user types ‘C’: Seq=42, ACK=79, data = ‘C’
• Host B to Host A: host ACKs receipt of ‘C’, echoes back ‘C’: Seq=79, ACK=43, data = ‘C’
• Host A to Host B: host ACKs receipt of echoed ‘C’: Seq=43, ACK=80


TCP round trip time, timeout

Q: how to set TCP timeout value?
❖ longer than RTT
• but RTT varies
❖ too short: premature timeout, unnecessary retransmissions
❖ too long: slow reaction to segment loss

Q: how to estimate RTT?
❖ SampleRTT: measured time from segment transmission until ACK receipt
▪ ignore retransmissions
❖ SampleRTT will vary, want estimated RTT “smoother”
▪ average several recent measurements, not just current SampleRTT


TCP round trip time, timeout

EstimatedRTT = (1 - α)*EstimatedRTT + α*SampleRTT

❖ exponential weighted moving average
❖ influence of past sample decreases exponentially fast
❖ typical value: α = 0.125

[Figure: RTT: gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds); y-axis: RTT (milliseconds); curves: SampleRTT, EstimatedRTT]

TCP round trip time, timeout

❖ timeout interval: EstimatedRTT plus “safety margin”
▪ large variation in EstimatedRTT -> larger safety margin
❖ estimate SampleRTT deviation from EstimatedRTT:
DevRTT = (1 - β)*DevRTT + β*|SampleRTT - EstimatedRTT|
(typically, β = 0.25)

TimeoutInterval = EstimatedRTT + 4*DevRTT
(EstimatedRTT is the estimated RTT; 4*DevRTT is the “safety margin”)
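The three formulas (EstimatedRTT, DevRTT, TimeoutInterval) can be combined into one small estimator. The sketch below is one possible reading, not a definitive implementation: the class name RttEstimator is made up for illustration, and two details are assumptions the slides do not specify. First, the very first sample initializes EstimatedRTT to the sample and DevRTT to half of it. Second, DevRTT is updated before EstimatedRTT, so the deviation is measured against the old estimate.

```python
class RttEstimator:
    """EWMA-based RTT estimator following the slide formulas."""

    def __init__(self, alpha=0.125, beta=0.25):
        self.alpha = alpha          # weight of the newest SampleRTT
        self.beta = beta            # weight of the newest deviation
        self.estimated_rtt = None   # no estimate until the first sample
        self.dev_rtt = None

    def update(self, sample_rtt):
        """Feed one SampleRTT (not from a retransmitted segment)."""
        if self.estimated_rtt is None:
            # Assumption: initialization rule for the first sample.
            self.estimated_rtt = sample_rtt
            self.dev_rtt = sample_rtt / 2
        else:
            # Assumption: deviation is computed against the old estimate.
            self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                            + self.beta * abs(sample_rtt - self.estimated_rtt))
            self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                                  + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        # EstimatedRTT plus the 4*DevRTT safety margin
        return self.estimated_rtt + 4 * self.dev_rtt


estimator = RttEstimator()
first_timeout = estimator.update(100.0)   # milliseconds
second_timeout = estimator.update(200.0)
```

With samples of 100 ms and then 200 ms, the estimate moves only from 100 to 112.5 ms, while the safety margin grows because the second sample deviates strongly from the estimate.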


TCP reliable data transfer

❖ TCP creates rdt service on top of IP’s unreliable service
▪ pipelined segments
▪ cumulative acks
▪ single retransmission timer
❖ retransmissions triggered by:
▪ timeout events
▪ duplicate acks

let’s initially consider simplified TCP sender:
▪ ignore duplicate acks
▪ ignore flow control, congestion control

TCP sender events:

data rcvd from app:
❖ create segment with seq #
❖ seq # is byte-stream number of first data byte in segment
❖ start timer if not already running
• think of timer as for oldest unacked segment
• expiration interval: TimeOutInterval

timeout:
❖ retransmit segment that caused timeout
❖ restart timer

ack rcvd:
❖ if ack acknowledges previously unacked segments
• update what is known to be ACKed
• start timer if there are still unacked segments


TCP sender (simplified)

initially:
    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

wait for event:

event: data received from application above
    create segment, seq. #: NextSeqNum
    pass segment to IP (i.e., “send”)
    NextSeqNum = NextSeqNum + length(data)
    if (timer currently not running)
        start timer

event: timeout
    retransmit not-yet-acked segment with smallest seq. #
    start timer

event: ACK received, with ACK field value y
    if (y > SendBase) {
        SendBase = y
        /* SendBase-1: last cumulatively ACKed byte */
        if (there are currently not-yet-acked segments)
            start timer
        else stop timer
    }

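The three sender events can be sketched as a small Python class. This is an illustration under stated assumptions rather than a real TCP sender: the class and attribute names are made up, the timer is reduced to a boolean flag, "passing a segment to IP" is modelled by appending to a list, and sequence-number wraparound is ignored.

```python
class SimplifiedTcpSender:
    """Simplified sender: no duplicate-ACK handling, no flow or congestion control."""

    def __init__(self, initial_seq_num=0):
        self.next_seq_num = initial_seq_num
        self.send_base = initial_seq_num
        self.unacked = {}           # seq # -> data for segments not yet ACKed
        self.timer_running = False  # stand-in for the single retransmission timer
        self.sent = []              # log of (seq #, data) handed to IP

    def on_app_data(self, data):
        seq = self.next_seq_num
        self.unacked[seq] = data
        self.sent.append((seq, data))        # pass segment to IP
        self.next_seq_num += len(data)
        if not self.timer_running:
            self.timer_running = True        # start timer

    def on_timeout(self):
        if not self.unacked:
            self.timer_running = False
            return
        seq = min(self.unacked)              # not-yet-acked segment with smallest seq #
        self.sent.append((seq, self.unacked[seq]))
        self.timer_running = True            # restart timer

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y               # bytes up to y-1 are cumulatively ACKed
            for seq in [s for s in self.unacked
                        if s + len(self.unacked[s]) <= y]:
                del self.unacked[seq]
            self.timer_running = bool(self.unacked)


sender = SimplifiedTcpSender(initial_seq_num=92)
sender.on_app_data(b"a" * 8)     # Seq=92, 8 bytes of data
sender.on_app_data(b"b" * 20)    # Seq=100, 20 bytes of data
sender.on_timeout()              # resends Seq=92
sender.on_ack(120)               # cumulative ACK covers both segments
sender.on_ack(100)               # stale ACK, ignored because 100 <= SendBase
```

The usage lines replay the byte counts used in the retransmission scenarios: a single ACK=120 is enough to clear both outstanding segments and stop the timer.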
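TCP: retransmission scenarios

lost ACK scenario:
• Host A sends Seq=92, 8 bytes of data
• Host B replies ACK=100, which is lost (X)
• timeout at Host A: resends Seq=92, 8 bytes of data
• Host B replies ACK=100 again; SendBase=100

premature timeout:
• Host A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data (SendBase=92)
• Host B replies ACK=100, then ACK=120
• timeout at Host A expires before the ACKs arrive: resends Seq=92, 8 bytes of data
• the ACKs arrive: SendBase=100, then SendBase=120
• Host B replies to the retransmission with ACK=120; SendBase=120
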
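TCP: retransmission scenarios

cumulative ACK:
• Host A sends Seq=92, 8 bytes of data, then Seq=100, 20 bytes of data
• Host B replies ACK=100, which is lost (X), then ACK=120
• ACK=120 arrives before the timeout, so nothing is retransmitted
• Host A continues with Seq=120, 15 bytes of data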


TCP ACK generation [RFC 1122, RFC 2581]

event at receiver: arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
TCP receiver action: delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK

event at receiver: arrival of in-order segment with expected seq #. One other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

event at receiver: arrival of out-of-order segment with higher-than-expected seq #. Gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte

event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap

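The cumulative-ACK part of these rules can be sketched as follows. This is a simplified model, not the behavior of any particular TCP stack: it always answers immediately (the 500ms delayed-ACK timer is left out), and it buffers out-of-order segments, which is one implementation choice among several since the spec leaves this open. The class name is hypothetical.

```python
class CumulativeAckReceiver:
    """Returns the ACK number (next expected byte) for each arriving segment."""

    def __init__(self, expected_seq):
        self.expected_seq = expected_seq
        self.buffered = {}   # out-of-order segments: seq # -> data

    def on_segment(self, seq, data):
        if seq > self.expected_seq:
            # Gap detected: keep the segment, the ACK below is a duplicate.
            self.buffered[seq] = data
        elif seq == self.expected_seq:
            self.expected_seq += len(data)
            # The new segment may have filled a gap: absorb buffered data.
            while self.expected_seq in self.buffered:
                self.expected_seq += len(self.buffered.pop(self.expected_seq))
        # Segments below expected_seq are old duplicates: just re-ACK.
        return self.expected_seq


receiver = CumulativeAckReceiver(expected_seq=92)
ack_after_gap = receiver.on_segment(100, b"b" * 20)   # out of order
ack_after_fill = receiver.on_segment(92, b"a" * 8)    # fills the gap
```

When the segment starting at byte 92 arrives second, the ACK jumps straight to 120 because the buffered segment at byte 100 is absorbed in the same step.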
TCP fast retransmit

❖ time-out period often relatively long:
• long delay before resending lost packet
❖ detect lost segments via duplicate ACKs.
• sender often sends many segments back-to-back
• if segment is lost, there will likely be many duplicate ACKs.

TCP fast retransmit:
if sender receives 3 ACKs for same data (“triple duplicate ACKs”), resend unacked segment with smallest seq #
▪ likely that unacked segment lost, so don’t wait for timeout


TCP fast retransmit

fast retransmit after sender receipt of triple duplicate ACK:
• Host A sends Seq=92, 8 bytes of data
• Host A sends Seq=100, 20 bytes of data, which is lost (X)
• Host B replies ACK=100 four times (the first ACK plus three duplicates)
• Host A resends Seq=100, 20 bytes of data before the timeout expires

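The duplicate-ACK rule amounts to a counter on the sender side. The sketch below is a minimal illustration with hypothetical names; it only decides when a fast retransmit should fire and which sequence number to resend, and leaves out timers and congestion control.

```python
class FastRetransmitDetector:
    """Counts duplicate ACKs and signals a fast retransmit on the third one."""

    def __init__(self, send_base, dup_threshold=3):
        self.send_base = send_base
        self.dup_threshold = dup_threshold
        self.dup_count = 0

    def on_ack(self, y):
        """Return the seq # to retransmit now, or None."""
        if y > self.send_base:
            # New data ACKed: advance SendBase and reset the counter.
            self.send_base = y
            self.dup_count = 0
            return None
        if y == self.send_base:
            self.dup_count += 1
            if self.dup_count == self.dup_threshold:
                return self.send_base   # unacked segment with smallest seq #
        return None


detector = FastRetransmitDetector(send_base=92)
decisions = [detector.on_ack(100) for _ in range(4)]
```

The first ACK=100 acknowledges new data; the following three are duplicates, so only the fourth call returns 100 as the segment to resend.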
TCP flow control

application may remove data from TCP socket buffers … slower than TCP receiver is delivering (sender is sending)

flow control: receiver controls sender, so sender won’t overflow receiver’s buffer by transmitting too much, too fast

[Figure: receiver protocol stack: segments from sender pass through IP code and TCP code into the TCP socket receiver buffers in the OS, from which the application process reads]

TCP flow control

❖ receiver “advertises” free buffer space by including rwnd value in TCP header of receiver-to-sender segments
▪ RcvBuffer size set via socket options (typical default is 4096 bytes)
▪ many operating systems autoadjust RcvBuffer
❖ sender limits amount of unacked (“in-flight”) data to receiver’s rwnd value
❖ guarantees receive buffer will not overflow

[Figure: receiver-side buffering: TCP segment payloads enter RcvBuffer, which is split into buffered data (read out to application process) and free buffer space (rwnd)]

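The rwnd bookkeeping can be illustrated with two small functions. The slides only name RcvBuffer and rwnd; the counters last_byte_rcvd, last_byte_read, last_byte_sent and last_byte_acked are the usual textbook-style variables and are assumptions here, as are the function names.

```python
def receive_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Receiver side: free buffer space to advertise as rwnd."""
    buffered = last_byte_rcvd - last_byte_read   # data not yet read by the app
    return rcv_buffer - buffered


def sendable_bytes(last_byte_sent, last_byte_acked, rwnd):
    """Sender side: how much more may be sent without exceeding rwnd."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, rwnd - in_flight)


# Receiver with a 4096-byte buffer that holds 2000 unread bytes.
rwnd = receive_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
# Sender with 1000 bytes in flight.
room = sendable_bytes(last_byte_sent=5000, last_byte_acked=4000, rwnd=rwnd)
```

Because the sender never keeps more than rwnd bytes unacknowledged, the data in flight always fits into the free space the receiver advertised.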
Connection Management

before exchanging data, sender/receiver “handshake”:
❖ agree to establish connection (each knowing the other willing to establish connection)
❖ agree on connection parameters

on both the client and the server side, the application sits above the network and holds:
connection state: ESTAB
connection variables:
• seq # client-to-server, server-to-client
• rcvBuffer size at server, client

client: Socket clientSocket = new Socket("hostname", "port number");
server: Socket connectionSocket = welcomeSocket.accept();

Agreeing to establish a connection

2-way handshake:
Q: will 2-way handshake always work in network?
❖ variable delays
❖ retransmitted messages (e.g. req_conn(x)) due to message loss
❖ message reordering
❖ can’t “see” other side

[Figure: 2-way handshake. Human analogy: “Let’s talk”, “OK”, both sides ESTAB. Protocol: client chooses x and sends req_conn(x), server replies acc_conn(x), both sides ESTAB]


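Agreeing to establish a connection

2-way handshake failure scenarios:

half open connection! (no client!):
• client chooses x and sends req_conn(x); server enters ESTAB and sends acc_conn(x); client enters ESTAB
• client retransmits req_conn(x)
• connection x completes: client terminates, server forgets x
• retransmitted req_conn(x) arrives: server enters ESTAB again, with no client

duplicate data accepted:
• client chooses x and sends req_conn(x); server enters ESTAB and sends acc_conn(x); client enters ESTAB
• client sends data(x+1); server accepts it
• client retransmits req_conn(x) and data(x+1)
• connection x completes: client terminates, server forgets x
• retransmitted req_conn(x) arrives: server enters ESTAB again
• retransmitted data(x+1) arrives: server accepts it a second time
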
TCP 3-way handshake

client state: LISTEN, SYNSENT, ESTAB
server state: LISTEN, SYN RCVD, ESTAB

• client: choose init seq num, x; send TCP SYN msg (SYNbit=1, Seq=x); enter SYNSENT
• server: choose init seq num, y; send TCP SYNACK msg, acking SYN (SYNbit=1, Seq=y; ACKbit=1; ACKnum=x+1); enter SYN RCVD
• client: received SYNACK(x) indicates server is live; send ACK for SYNACK (ACKbit=1, ACKnum=y+1); this segment may contain client-to-server data; enter ESTAB
• server: received ACK(y) indicates client is live; enter ESTAB
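The three messages and the state changes on each side can be traced with a short Python sketch. It is a loss-free walk-through under simplifying assumptions: messages are plain dictionaries, the initial sequence numbers x and y are passed in rather than chosen at random, and 32-bit wraparound of sequence numbers is ignored. The function name is hypothetical.

```python
def three_way_handshake(x, y):
    """Return (messages, client_states, server_states) for a loss-free handshake."""
    client_states = ["LISTEN"]
    server_states = ["LISTEN"]
    messages = []

    # 1. client chooses init seq num x and sends SYN
    syn = {"SYNbit": 1, "Seq": x}
    messages.append(("client->server", syn))
    client_states.append("SYNSENT")

    # 2. server chooses init seq num y and sends SYNACK, acking the SYN
    synack = {"SYNbit": 1, "Seq": y, "ACKbit": 1, "ACKnum": syn["Seq"] + 1}
    messages.append(("server->client", synack))
    server_states.append("SYN RCVD")

    # 3. SYNACK shows the server is live; client ACKs it
    ack = {"ACKbit": 1, "ACKnum": synack["Seq"] + 1}
    messages.append(("client->server", ack))
    client_states.append("ESTAB")

    # 4. the ACK shows the client is live
    server_states.append("ESTAB")
    return messages, client_states, server_states


messages, client_states, server_states = three_way_handshake(x=100, y=300)
```

Each side acknowledges the other's initial sequence number plus one, which is what lets a host tell a fresh connection request from a delayed duplicate.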


TCP: closing a connection

❖ client, server each close their side of connection
• send TCP segment with FIN bit = 1
❖ respond to received FIN with ACK
• on receiving FIN, ACK can be combined with own FIN
❖ simultaneous FIN exchanges can be handled


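TCP: closing a connection

client state: ESTAB, FIN_WAIT_1, FIN_WAIT_2, TIMED_WAIT, CLOSED
server state: ESTAB, CLOSE_WAIT, LAST_ACK, CLOSED

• client: clientSocket.close(); sends FINbit=1, seq=x; enters FIN_WAIT_1 (can no longer send but can receive data)
• server: sends ACKbit=1; ACKnum=x+1; enters CLOSE_WAIT (can still send data)
• client: enters FIN_WAIT_2, wait for server close
• server: sends FINbit=1, seq=y; enters LAST_ACK (can no longer send data)
• client: sends ACKbit=1; ACKnum=y+1; enters TIMED_WAIT (timed wait for 2*max segment lifetime), then CLOSED
• server: on receiving the ACK, enters CLOSED
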
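The close sequence reads naturally as two small state machines. The sketch below encodes only the client-initiated, loss-free path from the slide as lookup tables; the event names (close, recv_ack, recv_fin, timer_expired) are made up for illustration, and simultaneous close is not modelled.

```python
# (current state, event) -> next state, for the side that closes first
CLIENT_TRANSITIONS = {
    ("ESTAB", "close"): "FIN_WAIT_1",              # FIN sent
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",      # wait for server close
    ("FIN_WAIT_2", "recv_fin"): "TIMED_WAIT",      # ACK sent for server FIN
    ("TIMED_WAIT", "timer_expired"): "CLOSED",     # after 2*max segment lifetime
}

# (current state, event) -> next state, for the side that closes second
SERVER_TRANSITIONS = {
    ("ESTAB", "recv_fin"): "CLOSE_WAIT",           # ACK sent, can still send data
    ("CLOSE_WAIT", "close"): "LAST_ACK",           # own FIN sent
    ("LAST_ACK", "recv_ack"): "CLOSED",
}


def run(transitions, events, state="ESTAB"):
    """Apply events in order and return the list of visited states."""
    trace = [state]
    for event in events:
        state = transitions[(state, event)]
        trace.append(state)
    return trace


client_trace = run(CLIENT_TRANSITIONS,
                   ["close", "recv_ack", "recv_fin", "timer_expired"])
server_trace = run(SERVER_TRANSITIONS, ["recv_fin", "close", "recv_ack"])
```

An event that is not valid in the current state raises a KeyError, which makes unexpected orderings visible when experimenting with the tables.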


Principles of congestion control

congestion:
❖ informally: “too many sources sending too much data too fast for network to handle”
❖ different from flow control!
❖ manifestations:
▪ lost packets (buffer overflow at routers)
▪ long delays (queueing in router buffers)
❖ a top-10 problem!


Causes/costs of congestion: scenario 1

❖ two senders, two receivers
❖ one router, infinite buffers
❖ output link capacity: R
❖ no retransmission

❖ maximum per-connection throughput: R/2
❖ large delays as arrival rate, λin, approaches capacity

[Figure: Host A and Host B send original data at rate λin into a router with unlimited shared output link buffers; throughput is λout. Plots: λout vs. λin and delay vs. λin, with R/2 marked on the axes]

Causes/costs of congestion: scenario 2

❖ one router, finite buffers
❖ sender retransmission of timed-out packet
• application-layer input = application-layer output: λin = λout
• transport-layer input includes retransmissions: λ'in ≥ λin

[Figure: Host A sends λin (original data) and λ'in (original data, plus retransmitted data) into a router with finite shared output link buffers; Host B receives λout]



TCP congestion control: additive increase multiplicative decrease

❖ approach: sender increases transmission rate (window size), probing for usable bandwidth, until loss occurs
• additive increase: increase cwnd by 1 MSS every RTT until loss detected
• multiplicative decrease: cut cwnd in half after loss

[Figure: AIMD saw tooth behavior: probing for bandwidth. x-axis: time; y-axis: cwnd: TCP sender congestion window size. Additively increase window size … until loss occurs (then cut window in half)]

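The saw tooth can be reproduced with a few lines of Python. This is a toy model with assumptions that are not in the slides: time advances in whole RTTs, cwnd is counted in MSS units, and the RTTs in which a loss is detected are supplied by the caller. The function name aimd is illustrative.

```python
def aimd(cwnd, rtts, loss_at, mss=1):
    """Return cwnd at the start of each RTT under additive increase, multiplicative decrease."""
    trace = []
    for rtt in range(rtts):
        trace.append(cwnd)
        if rtt in loss_at:
            cwnd = max(mss, cwnd / 2)   # multiplicative decrease: halve after loss
        else:
            cwnd += mss                 # additive increase: +1 MSS per RTT
    return trace


# Start at 8 MSS, detect a loss during the third RTT (index 2).
sawtooth = aimd(cwnd=8, rtts=6, loss_at={2})
```

The trace climbs 8, 9, 10, drops to 5 after the loss, and then resumes the linear climb, which is the shape described on the slide.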
TCP Congestion Control: details

❖ sender limits transmission:
LastByteSent - LastByteAcked ≤ cwnd
❖ cwnd is dynamic, function of perceived network congestion

TCP sending rate:
❖ roughly: send cwnd bytes, wait RTT for ACKS, then send more bytes

rate ≈ cwnd / RTT bytes/sec

[Figure: sender sequence number space: last byte ACKed | sent, not-yet ACKed (“in-flight”), spanning cwnd | last byte sent]

TCP Slow Start

❖ when connection begins, increase rate exponentially until first loss event:
• initially cwnd = 1 MSS
• double cwnd every RTT
• done by incrementing cwnd for every ACK received
❖ summary: initial rate is slow but ramps up exponentially fast

[Figure: Host A to Host B timeline with one RTT marked; vertical axis: time]
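Why incrementing cwnd for every ACK doubles it every RTT can be checked with a short simulation. The sketch assumes no loss, one ACK per segment (no delayed ACKs), and cwnd counted in MSS units; the function name slow_start is illustrative.

```python
def slow_start(rtts, mss=1):
    """Return cwnd after each RTT when cwnd grows by 1 MSS per ACK received."""
    cwnd = mss                      # initially cwnd = 1 MSS
    trace = [cwnd]
    for _ in range(rtts):
        acks = cwnd // mss          # one ACK comes back per segment sent this RTT
        for _ in range(acks):
            cwnd += mss             # increment cwnd for every ACK received
        trace.append(cwnd)
    return trace


growth = slow_start(rtts=4)
```

A window of n segments produces n ACKs, each adding one MSS, so the window is 2n after one RTT: 1, 2, 4, 8, 16.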


TCP: detecting, reacting to loss

❖ loss indicated by timeout:
• cwnd set to 1 MSS;
• window then grows exponentially (as in slow start) to threshold, then grows linearly
❖ loss indicated by 3 duplicate ACKs: TCP RENO
• dup ACKs indicate network capable of delivering some segments
• cwnd is cut in half; window then grows linearly
❖ TCP Tahoe always sets cwnd to 1 MSS (timeout or 3 duplicate acks)

TCP: switching from slow start to CA

Q: when should the exponential increase switch to linear?
A: when cwnd gets to 1/2 of its value before timeout.

Implementation:
❖ variable ssthresh
❖ on loss event, ssthresh is set to 1/2 of cwnd just before loss event
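The loss reactions and the ssthresh switch can be put together in a small sketch. It follows the slide-level description only: real Reno implementations add a fast-recovery phase that is not modelled here. The function names, the loss kinds "timeout" and "3dupacks", and the choice to cap exponential growth exactly at ssthresh are assumptions for illustration; cwnd is counted in MSS units.

```python
def on_loss(cwnd, kind, variant="reno", mss=1):
    """Return (new_cwnd, new_ssthresh) after a loss event."""
    ssthresh = cwnd / 2                     # 1/2 of cwnd just before the loss
    if kind == "timeout" or variant == "tahoe":
        return mss, ssthresh                # restart from 1 MSS
    return ssthresh, ssthresh               # Reno on 3 duplicate ACKs: halve cwnd


def grow(cwnd, ssthresh, mss=1):
    """Return cwnd after one loss-free RTT."""
    if cwnd < ssthresh:
        return min(cwnd * 2, ssthresh)      # slow start: exponential, up to threshold
    return cwnd + mss                       # congestion avoidance (CA): linear


reno_timeout = on_loss(16, "timeout")
reno_dupacks = on_loss(16, "3dupacks", variant="reno")
tahoe_dupacks = on_loss(16, "3dupacks", variant="tahoe")
```

Starting from cwnd = 16, all three cases set ssthresh to 8; only Reno after three duplicate ACKs keeps cwnd at 8 instead of falling back to 1.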


TCP throughput

❖ avg. TCP thruput as function of window size, RTT?
▪ ignore slow start, assume always data to send
❖ W: window size (measured in bytes) where loss occurs
▪ avg. window size (# in-flight bytes) is ¾ W
▪ avg. thruput is 3/4 W per RTT

avg TCP thruput = (3/4) * W / RTT bytes/sec

[Figure: saw tooth of window size over time, with W/2 marked]
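The ¾ W figure comes from averaging a window that ramps linearly from W/2 (just after a loss) back up to W (where the next loss occurs). The sketch below checks that average numerically and evaluates the throughput formula; the function names and the sample values for W and RTT are illustrative assumptions.

```python
def sawtooth_average(w_bytes, steps=1000):
    """Average window size over one linear ramp from W/2 up to W."""
    samples = [w_bytes / 2 + (w_bytes / 2) * i / steps for i in range(steps + 1)]
    return sum(samples) / len(samples)


def avg_tcp_throughput(w_bytes, rtt_seconds):
    """Average throughput in bytes/sec: (3/4) * W / RTT."""
    return 0.75 * w_bytes / rtt_seconds


avg_window = sawtooth_average(100_000)              # W = 100,000 bytes
throughput = avg_tcp_throughput(100_000, 0.1)       # RTT = 100 ms
```

The midpoint of W/2 and W is (W/2 + W)/2 = 3W/4, so a window of 100,000 bytes and an RTT of 100 ms give an average of about 750,000 bytes/sec.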


Chapter 3: summary

❖ principles behind transport layer services:
• multiplexing, demultiplexing
• reliable data transfer
• flow control
• congestion control
❖ instantiation, implementation in the Internet
• UDP
• TCP
