TCP Retransmission and Flow Control (Part II)
CEN 223 – Internet Communication
Assoc. Prof. Dr. Fatih ABUT
Department of Computer Engineering
Çukurova University
What does TCP do for you?
Stream-based in-order delivery
Segments are ordered according to sequence numbers
Only consecutive bytes are delivered
Reliability
Missing segments are detected (ACK is missing) and retransmitted
Flow control
Receiver is protected against overload (window based)
Congestion control
Network is protected against overload (window based)
Protocol tries to fill available capacity
Connection handling
Explicit establishment + teardown
Full-duplex communication
e.g., an ACK can be a data segment at the same time (piggybacking)
TCP Retransmission Strategy
TCP relies on positive acknowledgements
Retransmission on timeout
Timer associated with each segment as it is sent
If timer expires before acknowledgement, sender must retransmit
TCP uses an adaptive retransmission algorithm because internet delays are so variable
Round trip time of each connection is recomputed every time an acknowledgment arrives
Timeout value is adjusted accordingly
Retransmit Timeout Mechanism (1)
RTO timer value difficult to determine:
too long → slow reaction in case of message loss!
too short → risk of false alarms (unnecessary retransmissions)!
General consensus: too short is worse than too long; use conservative estimate
Calculation: measure RTT (Seg# ... ACK#)
Original suggestion in RFC 793 (1981)
Exponentially Weighted Moving Average (EWMA)
SRTTnew = α * SRTTold + (1 - α) * RTTcurrent
RTO = min(UBOUND, max(LBOUND, β * SRTT))
SRTT: Smoothed Round Trip Time
α: Smoothing factor (typically 0.8-0.9)
β: Variance factor (typically 2)
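The RFC 793 calculation above can be tried out with a few lines of Python. This is only a sketch: the function name `rfc793_rto` and the default values (alpha = 0.85, bounds of 1 s and 60 s) are assumptions chosen for illustration, not values taken from the slides.

```python
def rfc793_rto(srtt, rtt_sample, alpha=0.85, beta=2.0, lbound=1.0, ubound=60.0):
    """Return (new_srtt, rto) in seconds after one RTT measurement.

    alpha: smoothing factor, beta: variance factor.
    lbound/ubound are assumed example bounds for the timeout.
    """
    # EWMA: the old estimate keeps weight alpha, the new sample gets 1 - alpha.
    new_srtt = alpha * srtt + (1 - alpha) * rtt_sample
    # The timeout is beta * SRTT, clamped to [lbound, ubound].
    rto = min(ubound, max(lbound, beta * new_srtt))
    return new_srtt, rto


srtt = 0.100
for sample in (0.100, 0.300, 0.120):
    srtt, rto = rfc793_rto(srtt, sample)
    print(f"sample={sample:.3f}s  SRTT={srtt:.4f}s  RTO={rto:.2f}s")
```

With RTTs this small the lower bound dominates the result, which shows how conservative the original estimate is.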
Retransmit Timeout Mechanism (2)
Depending on variation, this RTO may be too small or too large; thus, final
algorithm includes deviation
Key observation:
Original smoothed RTT can’t keep up with wide/rapid variations in RTT
Jacobson/Karels' Retransmission Timeout (1988)
SRTTnew = α * SRTTold + (1 - α) * RTTcurrent
SDEVnew = β * SDEVold + (1 - β) * abs(RTTcurrent - SRTTnew)
RTOnew = SRTTnew + γ * SDEVnew
SDEV: Smoothed deviation
β: Smoothing factor for standard deviation (typically 0.8-0.9)
γ: Adjustment factor (typically 4)
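A minimal sketch of the Jacobson/Karels update, following the three formulas above term by term. The function name and the default smoothing factors (0.875 for both, inside the 0.8-0.9 range given on the slide) are assumptions for illustration.

```python
def jacobson_karels(srtt, sdev, rtt_sample, alpha=0.875, beta=0.875, gamma=4.0):
    """Return (new_srtt, new_sdev, rto) after one RTT measurement."""
    # Smoothed RTT, same EWMA as in the original RFC 793 scheme.
    new_srtt = alpha * srtt + (1 - alpha) * rtt_sample
    # Smoothed deviation of the sample from the updated SRTT.
    new_sdev = beta * sdev + (1 - beta) * abs(rtt_sample - new_srtt)
    # The timeout grows with the observed variation.
    rto = new_srtt + gamma * new_sdev
    return new_srtt, new_sdev, rto


srtt, sdev = 0.100, 0.0
for sample in (0.100, 0.400, 0.110, 0.105):
    srtt, sdev, rto = jacobson_karels(srtt, sdev, sample)
    print(f"sample={sample:.3f}s  SRTT={srtt:.4f}s  SDEV={sdev:.4f}s  RTO={rto:.4f}s")
```

A single RTT spike raises SDEV and therefore the RTO much more quickly than the SRTT alone would.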
RTO calculation
Problem: retransmission ambiguity
Segment #1 sent, no ACK received → segment #1 retransmitted
Incoming ACK #2: cannot distinguish whether original or retransmitted segment #1 was ACKed
Thus, cannot reliably calculate RTO!
Solution [Karn/Partridge]: ignore RTT values from retransmits
Problem: RTT calculation especially important when loss occurs; sampling theorem suggests that RTT samples should be taken more often
Solution: Timestamps option
Sender writes current time into packet header (option)
Receiver reflects value
At sender, when ACK arrives, RTT = (current time) - (value carried in option)
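The two solutions above can be sketched as follows. The class `KarnSampler` and the function `timestamp_rtt` are hypothetical names, and the sketch simplifies by identifying an ACK with the segment number it confirms rather than modelling cumulative ACK numbers.

```python
class KarnSampler:
    """Karn/Partridge rule: take no RTT sample from a retransmitted segment."""

    def __init__(self):
        self.sent = {}  # segment number -> (first send time, retransmitted?)

    def on_send(self, seg, now):
        if seg in self.sent:
            # Second transmission of the same segment: mark it as ambiguous.
            first_time, _ = self.sent[seg]
            self.sent[seg] = (first_time, True)
        else:
            self.sent[seg] = (now, False)

    def on_ack(self, seg, now):
        """Return an RTT sample, or None if the segment was retransmitted."""
        send_time, retransmitted = self.sent.pop(seg)
        if retransmitted:
            return None
        return now - send_time


def timestamp_rtt(now, echoed_timestamp):
    """Timestamps option: RTT = current time - value reflected by the receiver."""
    return now - echoed_timestamp


sampler = KarnSampler()
sampler.on_send(1, now=0.0)
print(sampler.on_ack(1, now=0.25))    # usable sample
sampler.on_send(2, now=1.0)
sampler.on_send(2, now=2.0)           # retransmission
print(sampler.on_ack(2, now=2.1))     # None: ambiguous, ignored
print(timestamp_rtt(now=5.0, echoed_timestamp=4.5))
```

With timestamps, a retransmission carries its own send time, so the ACK is no longer ambiguous and every ACK can yield a sample.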
Sliding Window Management
Receiver "grants" credit (receiver window, rwnd)
Sender restricts sent data with window
Receiver buffer not specified
i.e., receiver may buffer reordered segments (i.e., with gaps)
TCP Flow Control – Segments on the sender side
[Figure: TCP header layout, shown once for the segment to be sent and once for the segment to be received. Fields: Source Port, Destination Port, Sequence Number, Acknowledgement Number, Offset, Reserved, flags (URG, ACK, PSH, RST, SYN, FIN), Advertised Window Size, Checksum, Urgent Pointer, Options (variable length), Padding]
Advertised Window Size:
⚫ Number of bytes the receiver is ready to receive
[Figure: sender buffer, filled by the sending application. Labels: past, LastByteAcked+1, NextByteToBeSent, "LastByteSendable", sender window, not usable]
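The sender-side bookkeeping named in the figure (LastByteAcked, NextByteToBeSent, advertised window) can be sketched as a small calculation. The function name `usable_window` is an assumption, and sequence number wrap-around is deliberately ignored.

```python
def usable_window(last_byte_acked, next_byte_to_be_sent, advertised_window):
    """Bytes the sender may still transmit without overrunning the receiver.

    The sender window starts at last_byte_acked + 1 and spans
    advertised_window bytes. Sequence number wrap-around is ignored here.
    """
    # Bytes already sent but not yet acknowledged.
    in_flight = next_byte_to_be_sent - (last_byte_acked + 1)
    return max(0, advertised_window - in_flight)


print(usable_window(999, 1000, 4096))   # nothing in flight: full window usable
print(usable_window(999, 3000, 4096))   # 2000 bytes in flight
print(usable_window(999, 6000, 4096))   # window exhausted
```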
TCP Flow Control – "Zero advertised window size"
Problem:
⚫ The receiver closes the window (AdWin=0). If the segment that reopens it (AdWin=4096) is lost, sender and receiver wait for each other: deadlock.
[Figure: Sender/Receiver sequence diagram. The receiver sends AdWin=0, then AdWin=4096; the second segment is marked "This segment will be lost!", followed by "deadlock"]
Solution:
⚫ Sender uses a persist timer to periodically send a zero window probe segment as soon as the receiver window is closed.
(Algorithm: exponential back-off: start value 1.5 seconds, doubling after each ACK until a limit, e.g., 60 s, is reached.
→ 1.5, 3, 6, 12, 24, 48, 60, 60, 60, 60, 60, ...)
⚫ Probe segment does not contain any user data. Specification explicitly allows probe segments to be sent even after the receiver window is closed.
(Note: As long as the receiver cannot process the probe segment, the ACK contains the sequence number of the last accepted byte)
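The back-off sequence of the probe timer can be reproduced with a short sketch. The function name `probe_intervals` is an assumption; the start value and limit are the ones given on the slide.

```python
def probe_intervals(count, start=1.5, limit=60.0):
    """Waiting times in seconds before each of the first `count` zero window probes."""
    intervals = []
    interval = start
    for _ in range(count):
        intervals.append(interval)
        # Double the interval after each probe, but never exceed the limit.
        interval = min(interval * 2, limit)
    return intervals


print(probe_intervals(8))  # [1.5, 3.0, 6.0, 12.0, 24.0, 48.0, 60.0, 60.0]
```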
TCP Flow Control – "Small Packet Problem"
Problem (Application: Telnet):
⚫ Sending 2 bytes creates 240 bytes of overhead
(2 data segments of 40 bytes each,
2 acknowledgments of 40 bytes each,
2 window updates of 40 bytes each)
[Figure: Sender/Receiver sequence diagram. The sender sends 1 Byte; the receiver returns ACK, AdWin=4 and then ACK, AdWin=5; the exchange repeats for the next byte]
Solutions:
⚫ Receiver side: delayed acknowledgement: delay acknowledgements and window updates by up to 200 ms (idea behind it: "piggybacking", "collecting ACKs")
⚫ Sender side: Nagle's algorithm (RFC 896, 1984): If the application delivers data byte by byte, send only the first byte, then collect further bytes and send them in one segment as soon as MSS is reached or the first byte is acknowledged.
Little influence with IP over LAN, but noticeable over WAN.
Problems with interactive applications (Nagle's algorithm leads to jerky behavior, e.g., with X Window). Therefore Nagle's algorithm can be switched off via the socket API (TCP_NODELAY option).
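The overhead arithmetic and the socket-level switch can be illustrated as follows. The helper names are assumptions; the 40 bytes per segment come from the slide and are assumed here to be a 20-byte IP header plus a 20-byte TCP header without options. `TCP_NODELAY` is the standard socket option for disabling Nagle's algorithm.

```python
import socket

HEADER_BYTES = 40  # per-segment overhead from the slide (assumed: 20 B IP + 20 B TCP)


def overhead_bytes(data_segments, acks, window_updates):
    """Header overhead when every data segment, ACK and window update costs 40 bytes."""
    return (data_segments + acks + window_updates) * HEADER_BYTES


def disable_nagle(sock):
    """Switch off Nagle's algorithm on a TCP socket."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)


print(overhead_bytes(2, 2, 2))  # 240 bytes of overhead for 2 bytes of payload

if __name__ == "__main__":
    # The option can be set on an unconnected socket.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        disable_nagle(s)
        print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
```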
TCP Flow Control – "Silly Window Syndrome (SWS)", RFC 813
Problem:
⚫ Byte-wise processing on the receiver side: the sender is prompted to send small segments.
[Figure: Sender/Receiver sequence diagram. The receiver sends ACK, AdWin=0 and then ACK, AdWin=1; the sender sends 1 Byte; the exchange repeats]
Solution:
⚫ Sender side:
(1) Avoid sending small amounts of data.
(2) Nagle's algorithm.
⚫ Receiver side:
Window updates only for a larger amount (e.g., if 30% of the receive buffer or 2 MSS are free)
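The receiver-side rule can be written as a single predicate. This is a sketch using the example thresholds from the slide (30% of the buffer or 2 MSS); the function and parameter names are assumptions, and real stacks use their own thresholds.

```python
def should_send_window_update(free_bytes, buffer_size, mss,
                              fraction=0.3, mss_count=2):
    """Return True once enough receive buffer is free to announce a new window."""
    # "30% of the buffer OR 2 MSS free" is met as soon as the smaller
    # of the two thresholds is reached.
    threshold = min(fraction * buffer_size, mss_count * mss)
    return free_bytes >= threshold


print(should_send_window_update(1, 65536, 1460))     # False: stay silent
print(should_send_window_update(2920, 65536, 1460))  # True: 2 MSS are free
```

Suppressing the 1-byte window updates removes the trigger that makes the sender emit 1-byte segments.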
Nagle’s Algorithm (1)
If there is data to send but the window is open less than MSS, then we may want to wait some amount of time before sending the available data
If we wait too long, then we hurt interactive applications
If we don't wait long enough, then we risk sending a bunch of tiny packets and falling into the silly window syndrome
The solution is to introduce a timer and to transmit when the timer expires
Nagle’s Algorithm (2)
When the application produces data to send
    if both the available data and the window ≥ MSS
        send a full segment
    else /* available data or window < MSS */
        if there is unACKed data in flight
            buffer the new data until an ACK arrives
        else
            send all the new data now
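The pseudocode above can be turned into a small simulation. This is a sketch under simplifying assumptions: the class `NagleSender` is hypothetical, the window is a fixed value that is not reduced by data in flight, and one ACK is treated as acknowledging everything sent so far.

```python
class NagleSender:
    """Toy model of the sender-side decision in Nagle's algorithm."""

    def __init__(self, mss, window):
        self.mss = mss
        self.window = window      # fixed receiver window (simplification)
        self.buffer = b""         # data written by the application, not yet sent
        self.in_flight = 0        # bytes sent but not yet ACKed
        self.sent = []            # segments handed to the network

    def write(self, data):
        self.buffer += data
        self._try_send()

    def on_ack(self):
        # Simplification: one ACK confirms all outstanding data.
        self.in_flight = 0
        self._try_send()

    def _try_send(self):
        while self.buffer and self.window > 0:
            if len(self.buffer) >= self.mss and self.window >= self.mss:
                self._emit(self.mss)                            # full segment
            elif self.in_flight == 0:
                self._emit(min(len(self.buffer), self.window))  # send all now
            else:
                break                                           # wait for an ACK

    def _emit(self, n):
        segment, self.buffer = self.buffer[:n], self.buffer[n:]
        self.sent.append(segment)
        self.in_flight += len(segment)


s = NagleSender(mss=4, window=100)
for ch in (b"a", b"b", b"c"):
    s.write(ch)          # "a" goes out at once, "b" and "c" are held back
s.on_ack()               # the ACK releases "bc" as one segment
print(s.sent)            # [b'a', b'bc']
```

The ACK acts as the clock: over a fast LAN the ACK returns quickly and little is coalesced, over a slow WAN more bytes accumulate per segment.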
TCP Acknowledgements (RFC 1122, RFC 2581)
Event: Arrival of a directly following segment, no gap, all previous segments have already been confirmed.
Reaction of TCP receiver: Delayed acknowledgement: wait up to 200 ms to see whether a new segment comes; if none comes by then, an ACK must be sent.

Event: Arrival of a directly following segment, no gap, one segment waiting for confirmation.
Reaction of TCP receiver: Sending a cumulative ACK: confirmation of several segments with a single acknowledgment.

Event: Arrival of an out-of-order segment larger than NextByteExpected (there is a gap).
Reaction of TCP receiver: Sending a duplicate ACK: repeated sending of the last ACK (= beginning of the gap).

Event: Arrival of a segment that completely or partially fills a gap (so that part of the byte stream is completed).
Reaction of TCP receiver: Immediate acknowledgement: immediate sending of an ACK.
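The four cases can be expressed as one decision function. This is a sketch with assumed names; it only covers segments starting at or beyond NextByteExpected, and it treats a segment as gap-filling only if it starts exactly at NextByteExpected.

```python
def receiver_reaction(seq, next_byte_expected, ack_pending, gap_exists):
    """Pick the receiver's ACK behaviour for an arriving segment.

    seq:          sequence number of the first byte of the segment
    ack_pending:  an earlier in-order segment still waits for its delayed ACK
    gap_exists:   out-of-order data is already buffered beyond next_byte_expected
    """
    if seq > next_byte_expected:
        return "duplicate ACK"    # segment lies beyond a gap
    if gap_exists:
        return "immediate ACK"    # segment fills (part of) the gap
    if ack_pending:
        return "cumulative ACK"   # confirm both segments at once
    return "delayed ACK"          # wait up to 200 ms for more data


print(receiver_reaction(100, 100, ack_pending=False, gap_exists=False))
print(receiver_reaction(100, 100, ack_pending=True, gap_exists=False))
print(receiver_reaction(300, 100, ack_pending=False, gap_exists=False))
print(receiver_reaction(100, 100, ack_pending=False, gap_exists=True))
```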
Fast Retransmit (1)
Coarse timeouts remained a problem, and Fast Retransmit was added with TCP Tahoe.
Since the receiver responds every time a packet arrives, the sender will see duplicate ACKs when a packet is lost and later packets still arrive.
Basic Idea: use duplicate ACKs to signal a lost packet.
Fast Retransmit
Upon receipt of three duplicate ACKs, the TCP sender retransmits the lost packet.
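The duplicate ACK counting on the sender side can be sketched as follows. The class name is an assumption; the threshold of three duplicates is the one stated above.

```python
class FastRetransmitDetector:
    """Count duplicate ACKs and signal when a fast retransmit is due."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack_no):
        """Return True exactly when the third duplicate of an ACK arrives."""
        if ack_no == self.last_ack:
            self.dup_count += 1
            return self.dup_count == self.threshold
        # A new ACK number resets the counter.
        self.last_ack = ack_no
        self.dup_count = 0
        return False


detector = FastRetransmitDetector()
for ack in (1, 2, 2, 2, 2, 6):
    print(ack, detector.on_ack(ack))  # True only for the third duplicate of ACK 2
```

The first ACK 2 is a regular acknowledgement; only the three repeats after it count as duplicates.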
Fast Retransmit (2)
Problem:
⚫ Coarse setting of the RTO leads to waiting times in the event of packet loss
Solution:
⚫ In the event of a triple duplicate ACK, retransmission of the missing packet without waiting for the retransmission timer to expire
New Problem:
⚫ A lost packet is a sign of network congestion. Therefore, after Fast Retransmit, a reaction to network overload is necessary. → Congestion Control
[Figure: Sender/Receiver sequence diagram. The sender sends Seg 1 to Seg 6; the receiver returns ACK 1, ACK 2 and then three duplicate ACKs (dACK 2); the sender retransmits Seg 3 and receives ACK 6. The RTO interval is also marked]
Note:
Fast Recovery: Algorithm to get the data flow going again after Fast Retransmit
SACK (Selective Acknowledgement, RFC 2018): Confirmation of the segments actually received