TCP Flow Control and Congestion Control: EECS 489 Computer Networks Z. Morley Mao Monday Feb 5, 2007
Acknowledgement: Some slides taken from Kurose&Ross and Katz&Stoica Mao W07 1
TCP Flow Control
receive side of TCP connection has a receive buffer
flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
speed-matching service: matching the send rate to the receiving app's drain rate
Mao W07 2
TCP Flow control: how it works
spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
- guarantees receive buffer doesn't overflow
(Suppose TCP receiver discards out-of-order segments)
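As a minimal sketch of this bookkeeping (hypothetical variable and function names, not a real TCP stack), the receiver computes its spare room and the sender checks unACKed data against the advertised window:

```python
# Sketch of TCP flow-control bookkeeping (illustrative names only).

def advertised_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Receiver side: spare room = RcvBuffer - [LastByteRcvd - LastByteRead]."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)

def can_send(last_byte_sent, last_byte_acked, rcv_window, bytes_to_send):
    """Sender side: keep unACKed data (LastByteSent - LastByteAcked) within RcvWindow."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + bytes_to_send <= rcv_window

# Example: 64 KB buffer, 20 KB received so far, but the app has read only 5 KB.
rwnd = advertised_window(64_000, 20_000, 5_000)      # 49_000 bytes of spare room
print(rwnd, can_send(30_000, 0, rwnd, 10_000))       # 49000 True
```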
Mao W07 3
TCP Connection Management
Mao W07 4
TCP Connection Management (cont.)
[State diagram: closing a TCP connection; ACK exchange, timed wait, closed]
Mao W07 5
TCP Connection Management (cont.)
[State diagram: connection close, continued; ACK exchange, timed wait, both sides closed]
Mao W07 6
TCP Connection Management
(cont)
[State diagrams: TCP server lifecycle and TCP client lifecycle]
Mao W07 7
Principles of Congestion Control
Congestion:
informally: “too many sources sending too much data too
fast for network to handle”
different from flow control!
manifestations:
- lost packets (buffer overflow at routers)
- long delays (queueing in router buffers)
a top-10 problem!
Mao W07 8
Causes/costs of congestion:
scenario 1
two senders, two receivers
one router, infinite buffers
no retransmission
[Diagram: Host A and Host B send λin (original data) into a shared router with unlimited output link buffers; λout is the delivered throughput]
Mao W07 9
Causes/costs of congestion:
scenario 2
one router, finite buffers
sender retransmission of lost packet
Mao W07 10
Causes/costs of congestion:
scenario 2
always: λin = λout (goodput)
"perfect" retransmission only when loss: λ'in > λout
retransmission of delayed (not lost) packet makes λ'in larger (than perfect case) for same λout
[Graphs a, b, c: λout versus λ'in (axes up to R/2); achievable goodput falls from R/2 toward R/3 and R/4 as more retransmissions are carried]
“costs” of congestion:
more work (retrans) for given “goodput”
unneeded retransmissions: link carries multiple copies of pkt
Mao W07 11
Causes/costs of congestion:
scenario 3
four senders
multihop paths
timeout/retransmit
Q: what happens as λin and λ'in increase?
[Diagram: Host A and Host B; λin: original data, λ'in: original data plus retransmitted data, passing through finite shared output link buffers; λout is the delivered throughput]
Mao W07 12
Causes/costs of congestion:
scenario 3
[Graph: delivered throughput (λout) for the Host A to Host B traffic as offered load grows]
Mao W07 13
Approaches towards congestion
control
Two broad approaches towards congestion control:
- end-to-end congestion control: no explicit feedback from the network; congestion is inferred by the end systems from observed loss and delay (the approach TCP takes)
- network-assisted congestion control: routers provide explicit feedback to the senders (as in the ATM ABR case study below)
Mao W07 14
Case study: ATM ABR congestion
control
ABR: available bit rate: "elastic service"
- if sender's path "underloaded": sender should use available bandwidth
- if sender's path congested: sender throttled to minimum guaranteed rate
RM (resource management) cells:
- sent by sender, interspersed with data cells
- bits in RM cell set by switches ("network-assisted")
  - NI bit: no increase in rate (mild congestion)
  - CI bit: congestion indication
- RM cells returned to sender by receiver, with bits intact
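A simplified sketch of how a sender might react to the feedback bits in returned RM cells. This is illustrative only: the increment, decrease factor, and rate bounds are made-up parameters, not the actual ABR rate-adjustment rules.

```python
# Illustrative reaction to RM-cell feedback (not the real ABR rate-adjustment rules).
def adjust_rate(rate, ci, ni, increment, decrease_factor, min_rate, peak_rate):
    if ci:                     # CI bit: congestion indication, back off
        rate *= decrease_factor
    elif not ni:               # neither CI nor NI: path underloaded, probe for more bandwidth
        rate += increment
    # NI bit alone (mild congestion): hold the current rate
    return max(min_rate, min(rate, peak_rate))   # never drop below the guaranteed minimum

rate = 10.0                                      # Mbps, arbitrary starting point
for ci, ni in [(0, 0), (0, 0), (0, 1), (1, 0)]:  # feedback from four returned RM cells
    rate = adjust_rate(rate, ci, ni, increment=1.0, decrease_factor=0.5,
                       min_rate=1.0, peak_rate=100.0)
    print(rate)                                  # 11.0, 12.0, 12.0, 6.0
```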
Mao W07 15
Case study: ATM ABR congestion
control
Mao W07 16
TCP Congestion Control
Mao W07 17
TCP AIMD
additive increase: increase CongWin by 1 MSS every RTT in the absence of loss events: probing
multiplicative decrease: cut CongWin in half after loss event
[Figure: congestion window sawtooth over time (y-axis ticks at 8, 16, 24 Kbytes)]
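The sawtooth can be reproduced with a few lines. This is a sketch under an assumed model in which a loss is detected whenever CongWin reaches a made-up path capacity of 24 segments:

```python
# AIMD sketch: +1 MSS per RTT, halve on loss (loss assumed when CongWin reaches capacity).
MSS, capacity = 1, 24
congwin, trace = 10, []
for rtt in range(30):
    trace.append(congwin)
    if congwin >= capacity:      # loss event: multiplicative decrease
        congwin //= 2
    else:                        # no loss: additive increase
        congwin += MSS
print(trace)                     # sawtooth oscillating between 12 and 24 segments
```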
Mao W07 19
TCP Slow Start (more)
- double CongWin every RTT
- done by incrementing CongWin for every ACK received
Summary: initial rate is slow but ramps up exponentially fast
[Figure: timeline showing two segments sent in one RTT, then four segments in the next]
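The per-ACK increment is exactly what produces the doubling: if every segment of the current window is ACKed within one RTT, CongWin gains one MSS per ACK and therefore doubles each RTT. A tiny sketch (one loop iteration per RTT, counting segments rather than bytes):

```python
# Slow-start sketch: CongWin += MSS for every ACK; a full window of ACKs per RTT doubles it.
MSS = 1
congwin = 1
for rtt in range(4):
    acks_this_rtt = congwin        # one ACK per segment sent in this window
    congwin += acks_this_rtt * MSS
    print(congwin)                 # 2, 4, 8, 16 segments
```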
Mao W07 20
Refinement
After 3 dup ACKs:
- CongWin is cut in half
- window then grows linearly
But after timeout event:
- CongWin instead set to 1 MSS
- window then grows exponentially
- to a threshold, then grows linearly
Philosophy:
• 3 dup ACKs indicates network capable of delivering some segments
• timeout before 3 dup ACKs is "more alarming"
Mao W07 21
Refinement (more)
Implementation:
Variable Threshold
At loss event, Threshold is
set to 1/2 of CongWin just
before loss event
Mao W07 22
Summary: TCP Congestion
Control
When CongWin is below Threshold, sender in slow-
start phase, window grows exponentially.
When CongWin is above Threshold, sender is in
congestion-avoidance phase, window grows linearly.
When a triple duplicate ACK occurs, Threshold set
to CongWin/2 and CongWin set to Threshold.
When timeout occurs, Threshold set to CongWin/2
and CongWin is set to 1 MSS.
Mao W07 23
TCP sender congestion control
Event | State | TCP Sender Action | Commentary
ACK receipt for previously unacked data | Slow Start (SS) | CongWin = CongWin + MSS; if (CongWin > Threshold) set state to "Congestion Avoidance" | Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data | Congestion Avoidance (CA) | CongWin = CongWin + MSS * (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK | SS or CA | Threshold = CongWin/2, CongWin = Threshold, set state to "Congestion Avoidance" | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.
Timeout | SS or CA | Threshold = CongWin/2, CongWin = 1 MSS, set state to "Slow Start" | Enter slow start
Duplicate ACK | SS or CA | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed
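The table reads directly as a small state machine. A sketch (window kept in bytes; class and method names are made up for illustration):

```python
# Sketch of the sender actions tabulated above (hypothetical class/method names).
class TcpSender:
    def __init__(self, mss=1460, threshold=65535):
        self.mss, self.congwin, self.threshold = mss, mss, threshold
        self.state = "SS"                     # "SS" = Slow Start, "CA" = Congestion Avoidance
        self.dup_acks = 0

    def on_new_ack(self):                     # ACK for previously unacked data
        if self.state == "SS":
            self.congwin += self.mss          # exponential growth: +MSS per ACK
            if self.congwin > self.threshold:
                self.state = "CA"
        else:
            self.congwin += self.mss * self.mss / self.congwin   # +1 MSS per RTT overall
        self.dup_acks = 0

    def on_dup_ack(self):
        self.dup_acks += 1
        if self.dup_acks == 3:                # triple duplicate ACK: multiplicative decrease
            self.threshold = self.congwin / 2
            self.congwin = max(self.threshold, self.mss)   # never below 1 MSS
            self.state = "CA"

    def on_timeout(self):                     # timeout: back to slow start at 1 MSS
        self.threshold = self.congwin / 2
        self.congwin = self.mss
        self.state = "SS"
```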
Mao W07 24
TCP throughput
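A back-of-envelope estimate that follows from the AIMD sawtooth (a sketch that ignores slow start; W denotes the window size, in bytes, when a loss occurs): the window oscillates between W/2 and W, so the average window is 0.75 W and

```latex
% Average throughput under the idealized sawtooth (window oscillates between W/2 and W):
\text{average throughput} \;\approx\; \frac{0.75\,W}{RTT}
```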
Mao W07 25
TCP Futures
Mao W07 26
TCP Fairness
Fairness goal: if K TCP sessions share same
bottleneck link of bandwidth R, each should have
average rate of R/K
[Diagram: TCP connection 1 and TCP connection 2 sharing a bottleneck router link of capacity R]
Mao W07 27
Why is TCP fair?
Two competing sessions:
Additive increase gives slope of 1, as throughput increases
multiplicative decrease decreases throughput proportionally
[Graph: Connection 1 throughput versus Connection 2 throughput (each axis up to R); the trajectory converges toward the equal-share point R/2, R/2]
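A sketch of the argument: simulate two AIMD flows sharing a link of assumed capacity R, with both flows halving whenever their combined rate exceeds R and adding one unit otherwise. Whatever the starting split, the gap between the two rates halves at every loss event, so the flows converge toward the equal share:

```python
# Two AIMD flows sharing a link of capacity R (sketch with made-up units).
R = 100.0
x1, x2 = 80.0, 10.0           # deliberately unfair starting point
for step in range(200):
    if x1 + x2 > R:           # loss: multiplicative decrease for both flows
        x1, x2 = x1 / 2, x2 / 2
    else:                     # no loss: additive increase for both flows
        x1, x2 = x1 + 1, x2 + 1
print(round(x1, 1), round(x2, 1))   # nearly equal; the gap halves at every loss event
```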
Mao W07 28
Fairness (more)
Fairness and UDP
Multimedia apps often do not use TCP
- do not want rate throttled by congestion control
Instead use UDP:
- pump audio/video at constant rate, tolerate packet loss
Research area: TCP friendly
Fairness and parallel TCP connections
nothing prevents app from opening parallel connections between 2 hosts.
Web browsers do this
Example: link of rate R supporting 9 connections;
- new app asks for 1 TCP, gets rate R/10
- new app asks for 11 TCPs, gets R/2 (11 of the now 20 connections)!
Mao W07 29
Delay modeling
Q: How long does it take to receive an object from a Web server after sending a request?
Ignoring congestion, delay is influenced by:
- TCP connection establishment
- data transmission delay
- slow start
Notation, assumptions:
- Assume one link between client and server of rate R
- S: MSS (bits)
- O: object size (bits)
- no retransmissions (no loss, no corruption)
Window size:
- First assume: fixed congestion window, W segments
- Then dynamic window, modeling slow start
Mao W07 30
TCP Delay Modeling:
Slow Start (1)
Now suppose window grows according to slow start
Latency = 2 RTT + O/R + P [RTT + S/R] - (2^P - 1) S/R

where P = min{Q, K - 1}

Server idles P = min{K - 1, Q} times

Example:
• O/S = 15 segments
• K = 4 windows
• Q = 2
• P = min{K - 1, Q} = 2
Server idles P = 2 times

[Timeline figure: third window = 4 S/R, fourth window = 8 S/R; complete transmission, object delivered; time at client, time at server]
Mao W07 32
TCP Delay Modeling (3)
S/R + RTT = time from when server starts to send segment until server receives acknowledgement

2^(k-1) S/R = time to transmit the k-th window

S/R + RTT - 2^(k-1) S/R = idle time after the k-th window (when positive)

delay = 2 RTT + O/R + sum_{p=1}^{P} idleTime_p
      = 2 RTT + O/R + sum_{k=1}^{P} [S/R + RTT - 2^(k-1) S/R]
      = 2 RTT + O/R + P [RTT + S/R] - (2^P - 1) S/R

[Timeline figure: initiate TCP connection, request object; first window = S/R, second window = 2 S/R, third window = 4 S/R, fourth window = 8 S/R; complete transmission, object delivered; time at client, time at server]
Mao W07 33
TCP Delay Modeling (4)
Recall K = number of windows that cover object
How do we calculate K ?
K = min{k : 2^0 S + 2^1 S + ... + 2^(k-1) S >= O}
  = min{k : 2^0 + 2^1 + ... + 2^(k-1) >= O/S}
  = min{k : 2^k - 1 >= O/S}
  = min{k : k >= log2(O/S + 1)}
  = ceil( log2(O/S + 1) )
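A direct transcription of the model into code (a sketch: the idle times are summed term by term instead of using the closed form with P, and the example parameters, 1 KB segments over a 1 Mbps link with a 100 ms RTT, are assumed values rather than numbers from the slide):

```python
import math

# Latency of fetching an O-bit object over a rate-R link with slow start, per the model above:
# 2*RTT for connection setup and request, O/R of transmission time, plus the server's idle
# time (if any) after each of the first K-1 windows.
def latency_slow_start(O, S, R, RTT):
    K = math.ceil(math.log2(O / S + 1))                 # windows needed to cover the object
    total = 2 * RTT + O / R
    for k in range(1, K):
        total += max(0.0, S / R + RTT - (2 ** (k - 1)) * S / R)
    return total

S, R, RTT = 8_000, 1_000_000, 0.1                        # 1 KB segments, 1 Mbps, 100 ms
print(round(latency_slow_start(15 * S, S, R, RTT), 3))   # O/S = 15 segments, so K = 4
```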
Mao W07 34
HTTP Modeling
Assume Web page consists of:
- 1 base HTML page (of size O bits)
- M images (each of size O bits)
Non-persistent HTTP:
- M+1 TCP connections in series
- Response time = (M+1)O/R + (M+1)2RTT + sum of idle times
Persistent HTTP:
- 2 RTT to request and receive base HTML file
- 1 RTT to request and receive M images
- Response time = (M+1)O/R + 3RTT + sum of idle times
Non-persistent HTTP with X parallel connections
- Suppose M/X is an integer.
- 1 TCP connection for base file
- M/X sets of parallel connections for images.
- Response time = (M+1)O/R + (M/X + 1)2RTT + sum of idle times
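The three formulas evaluated directly (a sketch that drops the "sum of idle times" term, so it slightly understates the response times plotted on the next slide; the parameters match that slide):

```python
# Response-time formulas from above, with the slow-start idle times omitted.
def non_persistent(M, O, R, RTT):
    return (M + 1) * O / R + (M + 1) * 2 * RTT

def persistent(M, O, R, RTT):
    return (M + 1) * O / R + 3 * RTT

def parallel_non_persistent(M, O, R, RTT, X):
    return (M + 1) * O / R + (M / X + 1) * 2 * RTT

M, O, RTT, X = 10, 5 * 8_000, 0.1, 5                 # RTT = 100 ms, O = 5 Kbytes, M = 10, X = 5
for R in (28_000, 100_000, 1_000_000, 10_000_000):   # 28 Kbps ... 10 Mbps
    print(R, round(non_persistent(M, O, R, RTT), 2),
          round(persistent(M, O, R, RTT), 2),
          round(parallel_non_persistent(M, O, R, RTT, X), 2))
```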
Mao W07 35
HTTP Response time (in seconds)
RTT = 100 msec, O = 5 Kbytes, M=10 and X=5
[Bar chart: response time (seconds) versus link rate (28 Kbps, 100 Kbps, 1 Mbps, 10 Mbps) for non-persistent, persistent, and parallel non-persistent HTTP]
For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.
Mao W07 36
HTTP Response time (in seconds)
RTT =1 sec, O = 5 Kbytes, M=10 and X=5
[Bar chart: response time (seconds) versus link rate (28 Kbps, 100 Kbps, 1 Mbps, 10 Mbps) for non-persistent, persistent, and parallel non-persistent HTTP]
For larger RTT, response time dominated by TCP establishment & slow start delays.
Persistent connections now give important improvement: particularly in high delay-bandwidth networks.
Mao W07 37
Issues to Think About
High speeds?
- to sustain 10 Gbps, the loss rate must be so low that losses occur only about once every 90 minutes!
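One way to see where that figure comes from is the standard loss-based throughput approximation, throughput ≈ 1.22 · MSS / (RTT · √L). With assumed values of a 1500-byte MSS and a 100 ms RTT, solving for the loss rate L that sustains 10 Gbps gives:

```latex
% Loss rate needed for 10 Gbps with MSS = 1500 bytes (12000 bits) and RTT = 100 ms,
% using throughput ~ 1.22 * MSS / (RTT * sqrt(L)):
L \;=\; \left(\frac{1.22 \cdot MSS}{RTT \cdot \text{throughput}}\right)^{2}
  \;=\; \left(\frac{1.22 \cdot 12000\ \text{bits}}{0.1\ \text{s} \cdot 10^{10}\ \text{bits/s}}\right)^{2}
  \;\approx\; 2 \times 10^{-10}
```

At roughly 833,000 segments per second, that loss rate works out to about one loss every hour and a half, which is where the 90-minute figure comes from.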
Mao W07 38
Security issues with TCP
Example attacks:
- Sequence number spoofing
- Routing attacks
- Source address spoofing
- Authentication attacks
Mao W07 39
Network Layer
goals:
understand principles behind network layer
services:
- routing (path selection)
- dealing with scale
- how a router works
- advanced topics: IPv6, mobility
instantiation and implementation in the Internet
Mao W07 40
Network layer
transport segment from sending to receiving host
on sending side encapsulates segments into datagrams
on rcving side, delivers segments to transport layer
network layer protocols in every host, router
Router examines header fields in all IP datagrams passing through it
[Diagram: protocol stacks (application, transport, network, data link, physical) at the end hosts, with network, data link, and physical layers at every router along the path]
Mao W07 41
Key Network-Layer Functions
Mao W07 42
Interplay between routing and forwarding
[Diagram: routing algorithm fills the local forwarding table; the value in an arriving packet's header (e.g., 0111) is looked up to choose the output link (links 1, 2, 3)]
Mao W07 43
Connection setup
Mao W07 44
Network service model
Q: What service model for “channel” transporting
datagrams from sender to rcvr?
Example services for individual datagrams:
- guaranteed delivery
Example services for a flow of datagrams:
- in-order datagram delivery
Mao W07 45
Network layer service models:
Guarantees ?
[Table: Network Architecture | Service Model | Bandwidth | Loss | Order | Timing | Congestion feedback]
Mao W07 46
Network layer connection and
connection-less service
Mao W07 47
Virtual circuits
“source-to-dest path behaves much like telephone circuit”
- performance-wise
- network actions along source-to-dest path
call setup, teardown for each call before data can flow
each packet carries VC identifier (not destination host address)
every router on source-dest path maintains “state” for each
passing connection
link, router resources (bandwidth, buffers) may be allocated to
VC
Mao W07 48
VC implementation
A VC consists of:
1. Path from source to destination
2. VC numbers, one number for each link along path
3. Entries in forwarding tables in routers along path
Packet belonging to VC carries a VC number.
VC number must be changed on each link.
- New VC number comes from forwarding table
Mao W07 49
Forwarding table in a router (VC numbers 12, 22, 32 on successive links; interfaces 1, 2, 3):

Incoming interface | Incoming VC # | Outgoing interface | Outgoing VC #
1                  | 12            | 2                  | 22
2                  | 63            | 1                  | 18
3                  | 7             | 2                  | 17
1                  | 97            | 3                  | 87
...                | ...           | ...                | ...
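A sketch of what the per-hop lookup amounts to: the table above keyed by (incoming interface, incoming VC number), with the packet relabeled on the way out (names are illustrative):

```python
# Per-router VC forwarding sketch:
# (incoming interface, incoming VC#) -> (outgoing interface, outgoing VC#)
forwarding_table = {
    (1, 12): (2, 22),
    (2, 63): (1, 18),
    (3, 7):  (2, 17),
    (1, 97): (3, 87),
}

def forward(in_interface, in_vc):
    out_interface, out_vc = forwarding_table[(in_interface, in_vc)]
    return out_interface, out_vc      # packet leaves carrying the new VC number

print(forward(1, 12))                 # (2, 22): the VC number changes on each link
```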
[Diagram: VC setup signaling between end-system protocol stacks: 1. initiate call, 2. incoming call, 3. accept call, 4. call connected, 5. data flow begins, 6. receive data]
Mao W07 51
Datagram networks
no call setup at network layer
routers: no state about end-to-end connections
- no network-level concept of “connection”
packets forwarded using destination host address
- packets between same source-dest pair may take different paths
[Diagram: end-system protocol stacks; 1. send data, 2. receive data (no call setup)]
Mao W07 52
Forwarding table: 4 billion possible entries
[Table: ranges of destination addresses mapped to link interfaces; final "otherwise" entry maps to link interface 3]
Mao W07 53
Longest prefix matching
When looking up a destination address, use the forwarding table entry with the longest address prefix that matches the destination.
Examples
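A sketch of the rule on a small, made-up forwarding table (real routers use specialized structures such as tries or TCAMs rather than a linear scan):

```python
# Longest prefix matching over a tiny, hypothetical forwarding table.
# Each entry is (address prefix as a bit string, link interface); the longest match wins.
TABLE = [
    ("11001000 00010111 00010",    0),
    ("11001000 00010111 00011000", 1),
    ("11001000 00010111 00011",    2),
]
DEFAULT_LINK = 3                       # the "otherwise" entry

def lookup(dest_bits):
    best_len, best_link = -1, DEFAULT_LINK
    for prefix, link in TABLE:
        p = prefix.replace(" ", "")
        if dest_bits.startswith(p) and len(p) > best_len:
            best_len, best_link = len(p), link
    return best_link

addr = "11001000" "00010111" "00011000" "10101010"   # 32 destination address bits
print(lookup(addr))   # 1: two prefixes match, but the 24-bit one is longer than the 21-bit one
```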
Mao W07 54
Datagram or VC network: why?
Internet
- data exchange among computers: "elastic" service, no strict timing req.
- "smart" end systems (computers): can adapt, perform control, error recovery
- simple inside network, complexity at "edge"
- many link types: different characteristics; uniform service difficult
ATM
- evolved from telephony
- human conversation: strict timing, reliability requirements; need for guaranteed service
- "dumb" end systems: telephones; complexity inside network
Mao W07 55