Transport Layer Protocols
Sender:
▪ is passed an application-layer message
▪ determines segment header field values
▪ creates segment
▪ passes segment to IP
Receiver:
▪ receives segment from IP
▪ checks header values
▪ extracts application-layer message
▪ demultiplexes message up to application via socket
▪ TCP: Transmission Control Protocol
• congestion control
• flow control
• connection setup
▪ UDP: User Datagram Protocol
[Figure: sending and receiving hosts, each with application and transport layers; multiplexing at the sender, de-multiplexing at the receiver]
User Datagram Protocol (UDP)
[Figure: UDP segment format; the payload carries data to/from the application layer]
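The segment format figure did not survive extraction, so here is a minimal sketch of building and parsing a UDP segment. It assumes the standard UDP header of four 16-bit fields (source port, destination port, length, checksum); the helper names `build_udp_segment` and `parse_udp_segment` are hypothetical, and the checksum is left at 0 rather than computed.

```python
import struct

# Assumed layout (standard UDP header): four 16-bit big-endian fields.
UDP_HEADER = struct.Struct("!HHHH")  # source port, dest port, length, checksum


def build_udp_segment(src_port, dst_port, payload, checksum=0):
    """Prepend a UDP header to the application-layer payload."""
    length = UDP_HEADER.size + len(payload)  # length counts header plus data
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_segment(segment):
    """Split a segment back into its header fields and payload."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(segment)
    payload = segment[UDP_HEADER.size:length]
    return src_port, dst_port, length, checksum, payload


segment = build_udp_segment(5000, 53, b"hello")
print(parse_udp_segment(segment))
```

The receiver would use the destination port from the parsed header to pick the socket that gets the payload.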
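Transmitted: 5 6 11 (two numbers and their sum)
Received: 4 6 11
Is the receiver-computed checksum equal to the sender-computed checksum (as received)? Here 4 + 6 = 10, not 11, so the receiver detects an error.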
sum       1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
checksum  0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1  (one's complement of the sum)
Note: when adding numbers, a carry out of the most significant bit must be added back into the result (wraparound).
* Check out the online interactive exercises for more examples: http://gaia.cs.umass.edu/kurose_ross/interactive/
Transport Layer: 3-17
Internet checksum: weak protection!
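example: add two 16-bit integers
               1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
               1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
wraparound   1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
sum            1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
checksum       0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1
Bit flips: the last two bits of the first number change from 1 0 to 0 1, and the last two bits of the second number change from 0 1 to 1 0.
Even though the numbers have changed (bit flips), there is no change in the checksum!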
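The example above can be checked with a short sketch of the one's-complement sum with wraparound. The function name `internet_checksum` is an assumption for illustration; the two 16-bit values are the ones from the example, and the "flipped" pair toggles the last two bits of each number as on the slide.

```python
def internet_checksum(words):
    """One's-complement sum of 16-bit words, then complemented."""
    total = 0
    for word in words:
        total += word
        # Wraparound: add any carry out of bit 15 back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


a = 0b1110011001100110
b = 0b1101010101010101

original = internet_checksum([a, b])
# Flip the last two bits of each number (1 0 becomes 0 1, and 0 1 becomes 1 0).
flipped = internet_checksum([a ^ 0b11, b ^ 0b11])

print(format(original, "016b"))  # 0100010001000011
print(original == flipped)       # True: the bit flips go undetected
```

The two changes cancel in the sum (one number drops by 1, the other grows by 1), which is why the checksum cannot see them.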
[Figure: TCP sender/receiver timelines. Panels: lost ACK scenario; early timeout; cumulative ACK covers for earlier lost ACK. Labels: Seq=92, 8 bytes of data; Seq=100, 20 bytes of data; timeout; SendBase=92; SendBase=120]
[Figure: client and server exchanging Data and ACK segments]
TCP Segment Structure
[Figure: TCP segment header, 32 bits wide]
▪ S=1 ➔ Seq. num. field carries the ISN (initial sequence number) to be used
▪ S=0 ➔ Seq. num. = Seq. # of the first data byte in the segment
TCP Sequence Numbers, ACKs
sequence numbers:
• byte stream “number” of first byte in segment’s data
acknowledgements:
• seq # of next byte expected from other side
[Figure: outgoing segment from sender, with header fields source port #, dest port #, sequence number, acknowledgement number, rwnd, checksum, urg pointer; below it, the sender sequence number space with window size N]
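A minimal sketch of how the acknowledgement number follows from the sequence number, using the values that appear in the timeline figure labels (Seq=92 with 8 bytes, Seq=100 with 20 bytes). The helper name `next_ack` is hypothetical, and the modulo reflects the assumption that the sequence number field is 32 bits wide and wraps around.

```python
SEQ_SPACE = 2 ** 32  # assumed: 32-bit sequence number field


def next_ack(seq, data_len):
    """ACK number after receiving data_len in-order bytes starting at seq.

    The ACK carries the sequence number of the next byte expected.
    """
    return (seq + data_len) % SEQ_SPACE


print(next_ack(92, 8))    # 100
print(next_ack(100, 20))  # 120
```

An ACK of 120 also covers the earlier segment, which is why a later cumulative ACK can make up for a lost one.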
TCP: Header
▪ URG: ‘1’ => Urgent Pointer is valid
▪ ACK: ‘1’ => Acknowledgement number is valid
▪ PSH:
• ‘1’: The receiving TCP module passes the data to the application immediately
• ‘0’: The receiving TCP module may delay passing the data to the application
▪ RST: ‘1’ => Tells the receiver to abort the connection
▪ SYN: ‘1’ => Requests a connection
▪ FIN:
• ‘1’: Sender has no more data to send, but is still ready to receive
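The flag bits listed above can be read out of a raw header with a few bit masks. This is a sketch that assumes the standard 20-byte TCP header layout, where the flags sit in byte 13; the names `FLAG_BITS` and `decode_flags` and the sample port and sequence values are made up for illustration.

```python
import struct

# Assumed bit positions within the flags byte of the standard TCP header.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def decode_flags(header):
    """Return the names of the flags set in a raw TCP header."""
    flags = header[13]  # byte 12 holds the data offset, byte 13 the flags
    return [name for name, bit in FLAG_BITS.items() if flags & bit]


def make_header(flags):
    """Build a 20-byte header with illustrative field values."""
    # src port, dst port, seq, ack, data offset, flags, window, checksum, urg ptr
    return struct.pack("!HHIIBBHHH", 12345, 80, 1000, 0, 5 << 4, flags, 65535, 0, 0)


print(decode_flags(make_header(0x02)))  # ['SYN']
print(decode_flags(make_header(0x12)))  # ['SYN', 'ACK']
```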
TCP: Header
▪ Window Size
• The number of bytes the sender of this segment is willing to receive
• Used in flow control and congestion control
▪ Checksum: For error detection; scope: complete segment
▪ Urgent Pointer: Valid only if URG = ‘1’
• Marks urgent data
• Start byte is not specified; it is taken to be the start of the segment
• Final byte of urgent data in receiver’s buffer: Seq# + Urgent Pointer
• The sender can send “control” information to the receiver to be processed on a priority basis
TCP: Header Options
▪ MSS
• The Max Segment Size that the sender of the option will accept
• Specified during connection setup
▪ Window Scale
• Allows the use of a larger advertised Window Size
▪ Time Stamp
• Used in Round-Trip Time (RTT) calculation
• Intended for use on high-speed connections
• Sequence numbers may wrap around during a connection
• New segments are distinguished from old segments by means of time stamps
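TCP Connection
[Figure: closing a connection. The server sends FINbit=1, seq=y and enters LAST_ACK (can no longer send data); the client replies with ACKbit=1; ACKnum=y+1 and enters TIMED_WAIT (timed wait for 2*max segment lifetime); both sides end in CLOSED]
[Figure: connection state diagram]
Client states:
▪ Closed ➔ (active open: send SYN) ➔ SYN SENT
▪ SYN SENT ➔ (receive SYN+ACK, send ACK) ➔ Established (Read/Write)
▪ Established ➔ (active close: send FIN) ➔ FIN WAIT-1
▪ FIN WAIT-1 ➔ (receive ACK) ➔ FIN WAIT-2
▪ FIN WAIT-2 ➔ (receive FIN, send ACK) ➔ TIME WAIT
▪ TIME WAIT ➔ (2MSL timer expires) ➔ Closed
Server states:
▪ Closed ➔ (passive open) ➔ LISTEN
▪ LISTEN ➔ (receive SYN, send SYN+ACK) ➔ SYN RCVD
▪ SYN RCVD ➔ (receive ACK) ➔ Established (Read/Write)
▪ Established ➔ (receive FIN, send ACK) ➔ CLOSE WAIT
▪ CLOSE WAIT ➔ (passive close: send FIN) ➔ LAST ACK
▪ LAST ACK ➔ (receive ACK) ➔ Closed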
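The client and server state sequences in the diagram can be written as a small transition table. This is only a sketch: the event names (`active_open`, `recv_syn_ack`, and so on) are labels chosen here, not protocol identifiers, and transitions not shown in the diagram (simultaneous open, RST handling) are left out.

```python
TRANSITIONS = {
    # client side
    ("CLOSED", "active_open"): "SYN_SENT",        # send SYN
    ("SYN_SENT", "recv_syn_ack"): "ESTABLISHED",  # send ACK
    ("ESTABLISHED", "active_close"): "FIN_WAIT_1",  # send FIN
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",      # send ACK
    ("TIME_WAIT", "2msl_timeout"): "CLOSED",
    # server side
    ("CLOSED", "passive_open"): "LISTEN",
    ("LISTEN", "recv_syn"): "SYN_RCVD",           # send SYN+ACK
    ("SYN_RCVD", "recv_ack"): "ESTABLISHED",
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",    # send ACK
    ("CLOSE_WAIT", "passive_close"): "LAST_ACK",  # send FIN
    ("LAST_ACK", "recv_ack"): "CLOSED",
}


def walk(state, events):
    """Return the list of states visited, starting from `state`."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]  # KeyError means an invalid event
        path.append(state)
    return path


client = walk("CLOSED", ["active_open", "recv_syn_ack", "active_close",
                         "recv_ack", "recv_fin", "2msl_timeout"])
server = walk("CLOSED", ["passive_open", "recv_syn", "recv_ack",
                         "recv_fin", "passive_close", "recv_ack"])
print(" -> ".join(client))
print(" -> ".join(server))
```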
ACK Generation Rules
▪ When an in-order data segment is received, delay the ACK until
• Another data segment is received, OR
• 500 milliseconds have elapsed
▪ When an out-of-order segment with a higher sequence # arrives
• Send an ACK with the expected seq#
▪ When a missing segment arrives
• Send an ACK to announce the next seq# expected
▪ If a duplicate segment arrives, immediately send an ACK
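The rules above can be sketched as a small receiver-side class. This is an illustrative simplification, not a full receiver: the class name `AckGenerator` is hypothetical, the 500 ms timer is modeled by an explicit `on_timer()` call, and partially overlapping segments are not handled.

```python
class AckGenerator:
    def __init__(self, expected_seq):
        self.expected = expected_seq  # next in-order byte expected
        self.buffered = {}            # seq -> length of out-of-order segments
        self.ack_delayed = False      # True while an ACK is being held back

    def on_segment(self, seq, length):
        """Return (when, ack_number) for an arriving data segment."""
        if seq + length <= self.expected or seq in self.buffered:
            self.ack_delayed = False          # duplicate: ACK immediately
            return ("immediate", self.expected)
        if seq > self.expected:
            self.buffered[seq] = length       # out of order: re-ACK expected seq#
            self.ack_delayed = False
            return ("immediate", self.expected)
        filled_gap = bool(self.buffered)      # a missing segment has arrived
        self.expected = seq + length
        while self.expected in self.buffered:  # absorb data buffered after the gap
            self.expected += self.buffered.pop(self.expected)
        if filled_gap or self.ack_delayed:    # second segment or gap fill: ACK now
            self.ack_delayed = False
            return ("immediate", self.expected)
        self.ack_delayed = True               # first in-order segment: hold the ACK
        return ("delayed", self.expected)

    def on_timer(self):
        """Called when the 500 ms delay expires."""
        if self.ack_delayed:
            self.ack_delayed = False
            return ("immediate", self.expected)
        return None


rx = AckGenerator(92)
print(rx.on_segment(92, 8))    # ('delayed', 100)
print(rx.on_segment(100, 20))  # ('immediate', 120)
print(rx.on_segment(140, 20))  # ('immediate', 120)
print(rx.on_segment(120, 20))  # ('immediate', 160)
```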
TCP Flow Control
flow control: receiver controls sender, so sender won’t overflow receiver’s buffer by transmitting too much, too fast
[Figure: receiver protocol stack. The network layer delivers the IP datagram payload (from sender) into the TCP socket buffers; labels: IP code, TCP code]
TCP Silly Window Syndrome
TCP: Silly Window Syndrome (Sender produces small data blocks)
[Figure: client and server stacks (read/write port, TCP, IP/Link/PHY) connected through the Internet]
Nagle’s solution
▪ Sender sends the first segment even if it is a small one.
▪ Next, wait until an ACK is received OR a maximum-size segment is accumulated before sending the next segment.
▪ Repeat the “Next” step for every subsequent segment.
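Nagle's rule can be sketched as a sender that buffers small writes while a segment is unacknowledged. This is a simplified model, not a kernel implementation: the class name `NagleSender` is hypothetical, `sent` just records segments instead of transmitting them, and one ACK is assumed to acknowledge everything in flight.

```python
class NagleSender:
    def __init__(self, mss):
        self.mss = mss
        self.pending = b""    # bytes written by the application, not yet sent
        self.unacked = False  # True while a sent segment awaits its ACK
        self.sent = []        # segments handed to the network, for inspection

    def _try_send(self):
        # Send when nothing is in flight (first segment, or an ACK arrived),
        # or when a maximum-size segment has accumulated.
        while self.pending and (not self.unacked or len(self.pending) >= self.mss):
            segment = self.pending[:self.mss]
            self.pending = self.pending[self.mss:]
            self.sent.append(segment)
            self.unacked = True

    def write(self, data):
        self.pending += data
        self._try_send()

    def on_ack(self):
        self.unacked = False
        self._try_send()


s = NagleSender(mss=4)
s.write(b"a")      # first segment goes out even though it is small
s.write(b"b")      # held back: waiting for an ACK
s.write(b"c")      # still held back
s.on_ack()         # ACK arrives: the accumulated b"bc" goes out as one segment
s.write(b"defgh")  # a full MSS has accumulated: b"defg" is sent right away
s.on_ack()         # the leftover b"h" follows
print(s.sent)      # [b'a', b'bc', b'defg', b'h']
```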
TCP: Silly Window Syndrome (Slow Receiver)
Client is emptying the buffer slowly ➔ RWND is small
[Figure: client and server stacks (read/write port, TCP, IP/Link/PHY) connected through the Internet; the client’s receive buffer is shown]
Clark’s solution
▪ Send an ACK but close the window (advertise RWND = 0) until another full segment can be received or the buffer is ½ empty.
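A minimal sketch of the receiver-side rule, assuming "another segment can be received" means room for one maximum-size segment (MSS). The function name `advertised_window` and the byte counts in the usage lines are hypothetical.

```python
def advertised_window(free_space, buffer_size, mss):
    """Window to advertise in the ACK, following the rule above.

    Advertise 0 (window closed) until there is room for a full segment
    or at least half of the receive buffer is empty.
    """
    if free_space >= mss or 2 * free_space >= buffer_size:
        return free_space
    return 0


print(advertised_window(100, 4000, 1000))   # 0: too little room, keep the window closed
print(advertised_window(1000, 4000, 1000))  # 1000: a full segment fits
print(advertised_window(300, 500, 1000))    # 300: the buffer is at least half empty
```

Keeping the window closed stops the sender from filling tiny openings with tiny segments.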
TCP Congestion Control
[Figure: hosts (H) attached to a network; labels: network input, network output]
Causes of congestion
General Principles of Congestion Control
Monitor: A variety of metrics can be monitored.
• Fraction of all packets discarded due to lack of buffer space
• Average queue length
• Number of retransmitted packets
• Average packet delay
- Network layer
Next …
TCP: Congestion Control (CC)
▪ CC is achieved by controlling the transmission rate at the sender after “detecting” congestion.
• Tx rate is controlled by controlling the window size.
• Main idea in controlling CW (congestion window):
❖ Slow start: begin at CW = 1 MSS, but quickly speed up to the congestion threshold (CT): 1, 2, 4, 8, …, CT
❖ Congestion avoidance: beyond the threshold, increase linearly: CW++, CW++, …, up to RWND
❖ Congestion detection: go back to slow start
TCP: Congestion Control
▪ Slow start
✓ Initially, CW = 1: Tx 1 Seg. (MSS)
✓ If ACK received before TO (timeout): CW = 2 (= CW x 2): Tx 2 Segs.
✓ If ACKs received before TO: CW = 4 (= CW x 2): Tx 4 Segs.
✓ If ACKs received before TO: CW = 8 (= CW x 2): Tx 8 Segs.
:
✓ Continue until you hit a threshold: Congestion Threshold (CT)
▪ Congestion Avoidance: Additive Inc.
✓ Each time the whole window of segs. is ACKed: CW = CW + 1 (CWmax = RWND)
▪ Congestion Detection
✓ RTO timer goes off: CT = CW/2 and CW = 1
▪ Variable CT
▪ Congestion Threshold is also known as ssthresh
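The three phases can be traced with a short per-round sketch. It follows the slide's rules (double below CT, add 1 at or above CT, and on timeout set CT = CW/2 and CW = 1), with a few assumptions made for illustration: windows are counted in whole segments, slow start is capped at CT, CT never drops below 1, and the function name `next_window` is hypothetical.

```python
def next_window(cw, ct, rwnd, timeout):
    """Return the (CW, CT) pair for the next round, in segments."""
    if timeout:
        # Congestion detection: halve the threshold and restart slow start.
        return 1, max(cw // 2, 1)
    if cw < ct:
        # Slow start: double, but do not overshoot CT or RWND.
        return min(cw * 2, ct, rwnd), ct
    # Congestion avoidance: additive increase up to RWND.
    return min(cw + 1, rwnd), ct


cw, ct, rwnd = 1, 8, 12
trace = []
# One entry per round: False = whole window ACKed, True = RTO timer went off.
for timeout in [False] * 6 + [True] + [False] * 4:
    cw, ct = next_window(cw, ct, rwnd, timeout)
    trace.append(cw)

print(trace)  # [2, 4, 8, 9, 10, 11, 1, 2, 4, 5, 6]
print(ct)     # 5
```

After the timeout at CW = 11, the threshold drops to 5, so the second slow start stops doubling much earlier than the first.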
TCP: Timers
Four kinds of timers
▪ Retransmission (RTO) Timer
▪ Persistence Timer
▪ Keep-Alive Timer
▪ TIME-WAIT Timer
Example RTT estimation:
[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr; y-axis: RTT (milliseconds), 100 to 350; x-axis: time (seconds), 1 to 106]
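TCP: Persistence Timer
• A receiver can close the window (RWND = 0) and later reopen it with an ACK
▪ Problem: If that ACK is lost, there is deadlock: the sender waits for the window to open while the receiver waits for data.
▪ Solution:
✓ When a sending TCP receives a segment with RWND = 0, start a persistence timer.
✓ When the timer goes off, send a probe segment so that the receiver re-announces its window.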
TCP: Timers (Keepalive and TIME-WAIT)
• Keepalive Timer
✓ To sustain mostly idle connections (as between BGP routers)
✓ Each time the server hears from a client, reset the timer to 2 hours.
✓ If the server does not hear from the client for 2 hours, send a probe segment.
✓ If there is no response after 10 probes (75 sec apart), assume that the client is down.
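As a quick worked example of the numbers above, the sketch below adds up how long a silent client goes unnoticed. It assumes the first probe is sent when the 2-hour timer expires and that the server waits a full 75 seconds after each of the 10 probes, including the last; the function name is hypothetical.

```python
IDLE_SECONDS = 2 * 60 * 60  # keepalive timer: 2 hours
PROBES = 10                 # unanswered probes before giving up
PROBE_INTERVAL = 75         # seconds between probes


def seconds_until_declared_down(idle=IDLE_SECONDS, probes=PROBES,
                                interval=PROBE_INTERVAL):
    """Time from the client's last segment until the server gives up."""
    return idle + probes * interval


total = seconds_until_declared_down()
minutes, seconds = divmod(total, 60)
print(total)             # 7950
print(minutes, seconds)  # 132 30  (2 hours, 12 minutes, 30 seconds)
```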
[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R; plot axis: Connection 1 throughput, from 0 to R]
Fairness: must all network apps be “fair”?
Fairness and UDP
▪ multimedia apps often do not use TCP
• do not want rate throttled by congestion control
▪ instead use UDP:
• send audio/video at constant rate, tolerate packet loss
▪ there is no “Internet police” policing use of congestion control
Fairness, parallel TCP connections
▪ application can open multiple parallel connections between two hosts
▪ web browsers do this, e.g., link of rate R with 9 existing connections:
• new app asks for 1 TCP, gets rate R/10
• new app asks for 11 TCPs, gets R/2
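The parallel-connection numbers can be checked with exact fractions, assuming the link rate R is split equally among all TCP connections. The function name `app_share` is hypothetical.

```python
from fractions import Fraction


def app_share(existing, new):
    """Fraction of link rate R obtained by an app that opens `new`
    connections next to `existing` ones, with equal per-connection shares."""
    return Fraction(new, existing + new)


print(app_share(9, 1))   # 1/10
print(app_share(9, 11))  # 11/20
```

With 11 connections the new app gets 11/20 of R, slightly more than the R/2 quoted on the slide, which is a rounded figure.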
Extra Slide: TCP Variants
▪ The most common TCP variant used today is CUBIC, which is the default congestion control
algorithm in many Linux kernels. Other notable TCP variants include Reno, New Reno, Vegas, and
SACK.
▪ CUBIC: CUBIC TCP is an updated version of TCP designed to handle high-bandwidth networks
effectively. It uses a cubic function for congestion window increase, improving scalability.
▪ Reno: Reno is a standard TCP implementation and was one of the earlier TCP variants. It's known
for its simple implementation but can be less efficient in certain network conditions.
▪ New Reno: New Reno improves upon Reno by handling multiple packet losses within a single
window, which makes the fast recovery algorithm more robust.
▪ Vegas: Vegas is another TCP variant that attempts to avoid congestion by dynamically adjusting
its window size based on observed round trip times (RTTs).
▪ SACK (Selective Acknowledgment): SACK allows the receiver to acknowledge blocks of data
that were received out of order, in addition to the cumulative acknowledgment. The sender can
then retransmit only the missing segments, which speeds up recovery from packet loss.