
Chapter 3
Transport Layer

Computer Networking: A Top Down Approach
6th edition
Jim Kurose, Keith Ross
Addison-Wesley
March 2012

A note on the use of these ppt slides:
We’re making these slides freely available to all (faculty, students, readers). They’re in PowerPoint form so you see the animations; and can add, modify, and delete slides (including this one) and slide content to suit your needs. They obviously represent a lot of work on our part. In return for use, we only ask the following:
- If you use these slides (e.g., in a class) that you mention their source (after all, we’d like people to use our book!)
- If you post any slides on a www site, that you note that they are adapted from (or perhaps identical to) our slides, and note our copyright of this material.
Thanks and enjoy! JFK/KWR

All material copyright 1996-2013
J.F Kurose and K.W. Ross, All Rights Reserved

Chapter 3: Transport Layer

our goals:
- understand principles behind transport layer services:
  - multiplexing, demultiplexing
  - reliable data transfer
  - flow control
  - congestion control
- learn about Internet transport layer protocols:
  - UDP: connectionless transport
  - TCP: connection-oriented reliable transport
  - TCP congestion control

Transport Layer 3-1 Transport Layer 3-2

Chapter 3 outline

3.1 transport-layer services
3.2 multiplexing and demultiplexing
3.3 connectionless transport: UDP
3.4 principles of reliable data transfer
3.5 connection-oriented transport: TCP
  - segment structure
  - reliable data transfer
  - flow control
  - connection management
3.6 principles of congestion control
3.7 TCP congestion control

Transport services and protocols

- provide logical communication between app processes running on different hosts
- transport protocols run in end systems
  - send side: breaks app messages into segments, passes to network layer
  - rcv side: reassembles segments into messages, passes to app layer
- more than one transport protocol available to apps
  - Internet: TCP and UDP

[Figure: logical end-end transport between two end systems, each with the stack application, transport, network, data link, physical]
Transport vs. network layer

- network layer: logical communication between hosts
- transport layer: logical communication between processes
  - relies on, enhances, network layer services

household analogy:
12 kids in Ann’s house sending letters to 12 kids in Bill’s house:
- hosts = houses
- processes = kids
- app messages = letters in envelopes
- transport protocol = Ann and Bill who demux to in-house siblings
- network-layer protocol = postal service

Internet transport-layer protocols

- reliable, in-order delivery (TCP)
  - congestion control
  - flow control
  - connection setup
- unreliable, unordered delivery: UDP
  - no-frills extension of “best-effort” IP
- services not available:
  - delay guarantees
  - bandwidth guarantees

[Figure: logical end-end transport between two end systems across a path of routers, each router with the stack network, data link, physical]

Multiplexing/demultiplexing

multiplexing at sender: handle data from multiple sockets, add transport header (later used for demultiplexing)

demultiplexing at receiver: use header info to deliver received segments to correct socket

[Figure: processes P1, P2, P3, P4 attached through sockets to the transport layer of three hosts, each with the stack application, transport, network, link, physical]


How demultiplexing works

- host receives IP datagrams
  - each datagram has source IP address, destination IP address
  - each datagram carries one transport-layer segment
  - each segment has source, destination port number
- host uses IP addresses & port numbers to direct segment to appropriate socket

TCP/UDP segment format (32 bits wide):
  source port # | dest port #
  other header fields
  application data (payload)

Connectionless demultiplexing

- recall: created socket has host-local port #:
  DatagramSocket mySocket1 = new DatagramSocket(12534);
- recall: when creating datagram to send into UDP socket, must specify
  - destination IP address
  - destination port #
- when host receives UDP segment:
  - checks destination port # in segment
  - directs UDP segment to socket with that port #

IP datagrams with same dest. port #, but different source IP addresses and/or source port numbers will be directed to same socket at dest

Connectionless demux: example

DatagramSocket serverSocket = new DatagramSocket(6428);
DatagramSocket mySocket2 = new DatagramSocket(9157);
DatagramSocket mySocket1 = new DatagramSocket(5775);

segments between mySocket2 and serverSocket:
  source port: 9157, dest port: 6428
  source port: 6428, dest port: 9157
segments between mySocket1 and serverSocket:
  source port: ?, dest port: ?
  source port: ?, dest port: ?

Connection-oriented demux

- TCP socket identified by 4-tuple:
  - source IP address
  - source port number
  - dest IP address
  - dest port number
- demux: receiver uses all four values to direct segment to appropriate socket
- server host may support many simultaneous TCP sockets:
  - each socket identified by its own 4-tuple
- web servers have different sockets for each connecting client
  - non-persistent HTTP will have different socket for each request
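The difference between the two demultiplexing rules can be sketched as two lookup tables. This is a minimal illustration, not how any real operating system stores its sockets: the table names, the function names, and the socket labels are hypothetical, and the addresses A, B, C and port numbers are taken from the slide examples.

```python
from typing import Dict, Tuple

# Hypothetical socket tables (illustrative names, not a real API).
udp_sockets: Dict[int, str] = {}                        # dest port -> socket
tcp_sockets: Dict[Tuple[str, int, str, int], str] = {}  # 4-tuple -> socket


def demux_udp(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> str:
    """Connectionless demux: only the destination port selects the socket."""
    return udp_sockets[dst_port]


def demux_tcp(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> str:
    """Connection-oriented demux: all four values select the socket."""
    return tcp_sockets[(src_ip, src_port, dst_ip, dst_port)]


# One UDP socket on port 6428 serves every sender.
udp_sockets[6428] = "serverSocket"

# Three TCP connections to server B, port 80, each with its own socket.
tcp_sockets[("A", 9157, "B", 80)] = "socket for A:9157"
tcp_sockets[("C", 5775, "B", 80)] = "socket for C:5775"
tcp_sockets[("C", 9157, "B", 80)] = "socket for C:9157"
```

With these tables, UDP segments from different sources to port 6428 reach the same socket, while the three TCP segments destined to B, port 80 each reach a different socket.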
Connection-oriented demux: example

hosts: host with IP address A (process P3); server with IP address B (processes P4, P5, P6); host with IP address C (processes P2, P3)

segments:
  source IP,port: B,80     dest IP,port: A,9157
  source IP,port: A,9157   dest IP,port: B,80
  source IP,port: C,5775   dest IP,port: B,80
  source IP,port: C,9157   dest IP,port: B,80

three segments, all destined to IP address: B, dest port: 80 are demultiplexed to different sockets

Connection-oriented demux: example

threaded server: same hosts and segments, but the server at IP address B runs a single process P4 that holds the sockets

UDP: User Datagram Protocol [RFC 768]

- “no frills,” “bare bones” Internet transport protocol
- “best effort” service, UDP segments may be:
  - lost
  - delivered out-of-order to app
- connectionless:
  - no handshaking between UDP sender, receiver
  - each UDP segment handled independently of others
- UDP use:
  - streaming multimedia apps (loss tolerant, rate sensitive)
  - DNS
  - SNMP
- reliable transfer over UDP:
  - add reliability at application layer
  - application-specific error recovery!
UDP: segment header

UDP segment format (32 bits wide):
  source port # | dest port #
  length        | checksum
  application data (payload)

length: length, in bytes, of UDP segment, including header

why is there a UDP?
- no connection establishment (which can add delay)
- simple: no connection state at sender, receiver
- small header size
- no congestion control: UDP can blast away as fast as desired

UDP checksum

Goal: detect “errors” (e.g., flipped bits) in transmitted segment

sender:
- treat segment contents, including header fields, as sequence of 16-bit integers
- checksum: addition (one’s complement sum) of segment contents
- sender puts checksum value into UDP checksum field

receiver:
- compute checksum of received segment
- check if computed checksum equals checksum field value:
  - NO - error detected
  - YES - no error detected. But maybe errors nonetheless? More later ….

Internet checksum: example

example: add two 16-bit integers

                1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
                1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
wraparound    1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
sum             1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
checksum        0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1

Note: when adding numbers, a carryout from the most significant bit needs to be added to the result
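The checksum example can be checked with a short sketch. The function names are illustrative, and the two operands are the 16-bit integers from the example, read as 1110011001100110 and 1101010101010101. The code follows the rule stated in the note: any carry out of the most significant bit is added back into the result.

```python
def ones_complement_sum16(words):
    """Add 16-bit words, wrapping any carry out of bit 15 back into the sum."""
    total = 0
    for w in words:
        total += w
        total = (total & 0xFFFF) + (total >> 16)  # wraparound of the carry
    return total


def internet_checksum(words):
    """Checksum = bitwise complement of the one's complement sum."""
    return ~ones_complement_sum16(words) & 0xFFFF


a = 0b1110011001100110
b = 0b1101010101010101

print(format(ones_complement_sum16([a, b]), "016b"))  # sum row of the example
print(format(internet_checksum([a, b]), "016b"))      # checksum row of the example
```

One common way to do the receiver check is to add the received checksum to the sum of the words: if no bits were flipped, the result is all ones (0xFFFF).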


Principles of reliable data transfer

- important in application, transport, link layers
  - top-10 list of important networking topics!
- characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

Reliable data transfer: getting started

send side:
- rdt_send(): called from above (e.g., by app.). Passed data to deliver to receiver upper layer
- udt_send(): called by rdt, to transfer packet over unreliable channel to receiver

receive side:
- rdt_rcv(): called when packet arrives on rcv-side of channel
- deliver_data(): called by rdt to deliver data to upper layer


Reliable data transfer: getting started

we’ll:
- incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
- consider only unidirectional data transfer
  - but control info will flow in both directions!
- use finite state machines (FSM) to specify sender, receiver

FSM notation: each transition from state 1 to state 2 is labeled with the event causing the state transition, above the actions taken on the state transition
state: when in this “state”, next state uniquely determined by next event

rdt1.0: reliable transfer over a reliable channel

- underlying channel perfectly reliable
  - no bit errors
  - no loss of packets
- separate FSMs for sender, receiver:
  - sender sends data into underlying channel
  - receiver reads data from underlying channel

sender FSM (single state: Wait for call from above)
  rdt_send(data)
    packet = make_pkt(data)
    udt_send(packet)

receiver FSM (single state: Wait for call from below)
  rdt_rcv(packet)
    extract(packet,data)
    deliver_data(data)

rdt2.0: channel with bit errors

- underlying channel may flip bits in packet
  - checksum to detect bit errors
- the question: how to recover from errors:

  How do humans recover from “errors” during conversation?

  - acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
  - negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  - sender retransmits pkt on receipt of NAK
- new mechanisms in rdt2.0 (beyond rdt1.0):
  - error detection
  - feedback: control msgs (ACK,NAK) from receiver to sender


rdt2.0: FSM specification

sender FSM

Wait for call from above
  rdt_send(data)
    sndpkt = make_pkt(data, checksum)
    udt_send(sndpkt)
    -> Wait for ACK or NAK

Wait for ACK or NAK
  rdt_rcv(rcvpkt) && isNAK(rcvpkt)
    udt_send(sndpkt)
    -> Wait for ACK or NAK
  rdt_rcv(rcvpkt) && isACK(rcvpkt)
    Λ
    -> Wait for call from above

receiver FSM

Wait for call from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    udt_send(NAK)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    extract(rcvpkt,data)
    deliver_data(data)
    udt_send(ACK)

rdt2.0: operation with no errors

Path through the rdt2.0 FSM: the sender handles rdt_send(data) by building sndpkt = make_pkt(data, checksum) and calling udt_send(sndpkt); the receiver finds notcorrupt(rcvpkt), extracts and delivers the data, and calls udt_send(ACK); the sender sees isACK(rcvpkt) and returns to Wait for call from above.

rdt2.0: error scenario

Path through the rdt2.0 FSM: the sender handles rdt_send(data) by building sndpkt = make_pkt(data, checksum) and calling udt_send(sndpkt); the receiver finds corrupt(rcvpkt) and calls udt_send(NAK); the sender sees isNAK(rcvpkt), calls udt_send(sndpkt) again and stays in Wait for ACK or NAK.

rdt2.0 has a fatal flaw!

what happens if ACK/NAK corrupted?
- sender doesn’t know what happened at receiver!
- can’t just retransmit: possible duplicate

handling duplicates:
- sender retransmits current pkt if ACK/NAK corrupted
- sender adds sequence number to each pkt
- receiver discards (doesn’t deliver up) duplicate pkt

stop and wait
sender sends one packet, then waits for receiver response


rdt2.1: sender, handles garbled ACK/NAKs

Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    -> Wait for ACK or NAK 0

Wait for ACK or NAK 0
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
    udt_send(sndpkt)
    -> Wait for ACK or NAK 0
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ
    -> Wait for call 1 from above

Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    -> Wait for ACK or NAK 1

Wait for ACK or NAK 1
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
    udt_send(sndpkt)
    -> Wait for ACK or NAK 1
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
    Λ
    -> Wait for call 0 from above

rdt2.1: receiver, handles garbled ACK/NAKs

Wait for 0 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    -> Wait for 1 from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
    -> Wait for 0 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    -> Wait for 0 from below

Wait for 1 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    -> Wait for 0 from below
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    sndpkt = make_pkt(NAK, chksum)
    udt_send(sndpkt)
    -> Wait for 1 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt)
    sndpkt = make_pkt(ACK, chksum)
    udt_send(sndpkt)
    -> Wait for 1 from below

rdt2.1: discussion

sender:
- seq # added to pkt
- two seq. #’s (0,1) will suffice. Why?
- must check if received ACK/NAK corrupted
- twice as many states
  - state must “remember” whether “expected” pkt should have seq # of 0 or 1

receiver:
- must check if received packet is duplicate
  - state indicates whether 0 or 1 is expected pkt seq #
- note: receiver can not know if its last ACK/NAK received OK at sender

rdt2.2: a NAK-free protocol

- same functionality as rdt2.1, using ACKs only
- instead of NAK, receiver sends ACK for last pkt received OK
  - receiver must explicitly include seq # of pkt being ACKed
- duplicate ACK at sender results in same action as NAK: retransmit current pkt


rdt2.2: sender, receiver fragments

sender FSM fragment

Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    -> Wait for ACK 0

Wait for ACK 0
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
    udt_send(sndpkt)
    -> Wait for ACK 0
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
    Λ

receiver FSM fragment

Wait for 0 from below
  rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt))
    udt_send(sndpkt)
    -> Wait for 0 from below

transition into Wait for 0 from below
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(ACK1, chksum)
    udt_send(sndpkt)

rdt3.0: channels with errors and loss

new assumption: underlying channel can also lose packets (data, ACKs)
  - checksum, seq. #, ACKs, retransmissions will be of help … but not enough

approach: sender waits “reasonable” amount of time for ACK
- retransmits if no ACK received in this time
- if pkt (or ACK) just delayed (not lost):
  - retransmission will be duplicate, but seq. #’s already handles this
  - receiver must specify seq # of pkt being ACKed
- requires countdown timer

rdt3.0 sender

Wait for call 0 from above
  rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
    start_timer
    -> Wait for ACK0
  rdt_rcv(rcvpkt)
    Λ

Wait for ACK0
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
    Λ
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
    stop_timer
    -> Wait for call 1 from above

Wait for call 1 from above
  rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
    start_timer
    -> Wait for ACK1
  rdt_rcv(rcvpkt)
    Λ

Wait for ACK1
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,0) )
    Λ
  timeout
    udt_send(sndpkt)
    start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1)
    stop_timer
    -> Wait for call 0 from above

rdt3.0 in action

(a) no loss
  sender: send pkt0
  receiver: rcv pkt0, send ack0
  sender: rcv ack0, send pkt1
  receiver: rcv pkt1, send ack1
  sender: rcv ack1, send pkt0
  receiver: rcv pkt0, send ack0

(b) packet loss
  sender: send pkt0
  receiver: rcv pkt0, send ack0
  sender: rcv ack0, send pkt1 (X loss)
  sender: timeout, resend pkt1
  receiver: rcv pkt1, send ack1
  sender: rcv ack1, send pkt0
  receiver: rcv pkt0, send ack0
rdt3.0 in action

(c) ACK loss
  sender: send pkt0
  receiver: rcv pkt0, send ack0
  sender: rcv ack0, send pkt1
  receiver: rcv pkt1, send ack1 (X loss)
  sender: timeout, resend pkt1
  receiver: rcv pkt1 (detect duplicate), send ack1
  sender: rcv ack1, send pkt0
  receiver: rcv pkt0, send ack0

(d) premature timeout/ delayed ACK
  sender: send pkt0
  receiver: rcv pkt0, send ack0
  sender: rcv ack0, send pkt1
  receiver: rcv pkt1, send ack1
  sender: timeout, resend pkt1
  receiver: rcv pkt1 (detect duplicate), send ack1
  sender: rcv ack1, send pkt0
  receiver: rcv pkt0, send ack0
  sender: rcv ack1, send pkt0
  receiver: rcv pkt0 (detect duplicate), send ack0

Performance of rdt3.0

- rdt3.0 is correct, but performance stinks
- e.g.: 1 Gbps link, 15 ms prop. delay, 8000 bit packet:

  \[ D_{trans} = \frac{L}{R} = \frac{8000\ \text{bits}}{10^{9}\ \text{bits/sec}} = 8\ \text{microsecs} \]

  - U_sender: utilization, the fraction of time sender busy sending

  \[ U_{sender} = \frac{L/R}{RTT + L/R} = \frac{.008}{30.008} = 0.00027 \]

  - if RTT=30 msec, 1KB pkt every 30 msec: 33kB/sec thruput over 1 Gbps link
- network protocol limits use of physical resources!
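The packet-loss and ACK-loss scenarios for rdt3.0 can be replayed with a small simulation. This is a sketch under simplifying assumptions, not the FSM itself: there is no real timer (a dropped packet or ACK is treated as an immediate timeout), there is no corruption, and the function name and the `lost_data` / `lost_acks` parameters are hypothetical.

```python
def stop_and_wait(messages, lost_data=(), lost_acks=()):
    """Simulate an alternating-bit, stop-and-wait transfer.

    lost_data / lost_acks hold the 0-based indices of the data and ACK
    transmissions that the (hypothetical) channel drops.
    Returns (messages delivered to the upper layer, data transmissions used).
    """
    delivered = []
    expected = 0   # receiver: seq # it is waiting for
    seq = 0        # sender: seq # of the current packet
    data_tx = 0    # count of data packets put on the channel
    ack_tx = 0     # count of ACKs put on the channel
    for msg in messages:
        while True:
            this_tx = data_tx
            data_tx += 1
            if this_tx in lost_data:
                continue               # packet lost: timeout, resend
            if seq == expected:        # new packet: deliver it
                delivered.append(msg)
                expected ^= 1
            # otherwise it is a duplicate: discard, but still ACK it
            this_ack = ack_tx
            ack_tx += 1
            if this_ack in lost_acks:
                continue               # ACK lost: timeout, resend
            break                      # ACK arrived: move on
        seq ^= 1
    return delivered, data_tx


print(stop_and_wait(["m0", "m1", "m2"]))                 # no loss
print(stop_and_wait(["m0", "m1", "m2"], lost_data={1}))  # packet loss
print(stop_and_wait(["m0", "m1", "m2"], lost_acks={1}))  # ACK loss
```

In both loss cases every message is delivered exactly once, at the cost of one extra data transmission; in the ACK-loss case the receiver recognises the retransmission as a duplicate from its sequence number.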

rdt3.0: stop-and-wait operation

sender:
- first packet bit transmitted, t = 0
- last packet bit transmitted, t = L / R
- ACK arrives, send next packet, t = RTT + L / R

receiver:
- first packet bit arrives
- last packet bit arrives, send ACK

\[ U_{sender} = \frac{L/R}{RTT + L/R} = \frac{.008}{30.008} = 0.00027 \]

Pipelined protocols

pipelining: sender allows multiple, “in-flight”, yet-to-be-acknowledged pkts
  - range of sequence numbers must be increased
  - buffering at sender and/or receiver
- two generic forms of pipelined protocols: go-Back-N, selective repeat
Pipelining: increased utilization

sender:
- first packet bit transmitted, t = 0
- last bit transmitted, t = L / R
- ACK arrives, send next packet, t = RTT + L / R

receiver:
- first packet bit arrives
- last packet bit arrives, send ACK
- last bit of 2nd packet arrives, send ACK
- last bit of 3rd packet arrives, send ACK

3-packet pipelining increases utilization by a factor of 3!

\[ U_{sender} = \frac{3L/R}{RTT + L/R} = \frac{.024}{30.008} = 0.0008 \]

Pipelined protocols: overview

Go-back-N:
- sender can have up to N unacked packets in pipeline
- receiver only sends cumulative ack
  - doesn’t ack packet if there’s a gap
- sender has timer for oldest unacked packet
  - when timer expires, retransmit all unacked packets

Selective Repeat:
- sender can have up to N unack’ed packets in pipeline
- rcvr sends individual ack for each packet
- sender maintains timer for each unacked packet
  - when timer expires, retransmit only that unacked packet
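The utilization figures can be recomputed with a few lines. The function name and its `window` parameter are illustrative; the numbers are the ones from the example (8000-bit packet, 1 Gbps link, RTT of 30 ms).

```python
def sender_utilization(packet_bits, rate_bps, rtt_s, window=1):
    """Fraction of time the sender is busy: window * (L/R) / (RTT + L/R)."""
    d_trans = packet_bits / rate_bps   # L / R, in seconds
    return min(1.0, window * d_trans / (rtt_s + d_trans))


stop_and_wait_u = sender_utilization(8000, 1e9, 0.030)           # about 0.00027
pipelined_u = sender_utilization(8000, 1e9, 0.030, window=3)     # about 0.0008
print(stop_and_wait_u, pipelined_u)
```

The `min(1.0, ...)` cap is an added assumption: once the window is large enough to fill the pipe, the sender cannot be more than fully busy.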

Go-Back-N: sender

- k-bit seq # in pkt header
- “window” of up to N, consecutive unack’ed pkts allowed
- ACK(n): ACKs all pkts up to, including seq # n - “cumulative ACK”
  - may receive duplicate ACKs (see receiver)
- timer for oldest in-flight pkt
- timeout(n): retransmit packet n and all higher seq # pkts in window

GBN: sender extended FSM

initial transition into Wait
  Λ
    base=1
    nextseqnum=1

Wait
  rdt_send(data)
    if (nextseqnum < base+N) {
      sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
      udt_send(sndpkt[nextseqnum])
      if (base == nextseqnum)
        start_timer
      nextseqnum++
    }
    else
      refuse_data(data)
  timeout
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    …
    udt_send(sndpkt[nextseqnum-1])
  rdt_rcv(rcvpkt) && corrupt(rcvpkt)
    Λ
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum)
      stop_timer
    else
      start_timer
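The GBN sender FSM translates almost line for line into a small class. This is a sketch, not a network implementation: the class name is hypothetical, `udt_send` is replaced by appending the sequence number to a `sent` log, the timer is just a boolean flag, and corrupt ACKs are not modelled.

```python
class GBNSender:
    """Go-Back-N sender following the extended FSM, with a log instead of a channel."""

    def __init__(self, N):
        self.N = N
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}            # seq # -> buffered data
        self.timer_running = False
        self.sent = []              # stand-in for udt_send(): seq #s sent, in order

    def rdt_send(self, data):
        if self.nextseqnum < self.base + self.N:
            self.sndpkt[self.nextseqnum] = data
            self.sent.append(self.nextseqnum)
            if self.base == self.nextseqnum:
                self.timer_running = True     # start_timer
            self.nextseqnum += 1
            return True
        return False                          # refuse_data: window is full

    def timeout(self):
        self.timer_running = True             # start_timer
        for n in range(self.base, self.nextseqnum):
            self.sent.append(n)               # resend every unacked packet

    def rcv_ack(self, n):
        self.base = n + 1                     # cumulative ACK
        self.timer_running = self.base != self.nextseqnum


s = GBNSender(N=4)
print([s.rdt_send(x) for x in "abcde"])  # fifth call is refused
s.rcv_ack(1)                             # window slides by one
s.rdt_send("e")
s.timeout()                              # go back N: resend 2, 3, 4, 5
print(s.sent)
```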
GBN: receiver extended FSM

initial transition into Wait
  Λ
    expectedseqnum=1
    sndpkt = make_pkt(expectedseqnum,ACK,chksum)

Wait
  default
    udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum)
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum,ACK,chksum)
    udt_send(sndpkt)
    expectedseqnum++

ACK-only: always send ACK for correctly-received pkt with highest in-order seq #
  - may generate duplicate ACKs
  - need only remember expectedseqnum
- out-of-order pkt:
  - discard (don’t buffer): no receiver buffering!
  - re-ACK pkt with highest in-order seq #

GBN in action

sender window (N=4)

  sender: send pkt0
  sender: send pkt1
  sender: send pkt2 (X loss)
  sender: send pkt3
  sender: (wait)
  receiver: receive pkt0, send ack0
  receiver: receive pkt1, send ack1
  receiver: receive pkt3, discard, (re)send ack1
  sender: rcv ack0, send pkt4
  sender: rcv ack1, send pkt5
  sender: ignore duplicate ACK
  receiver: receive pkt4, discard, (re)send ack1
  receiver: receive pkt5, discard, (re)send ack1
  sender: pkt 2 timeout
  sender: send pkt2
  sender: send pkt3
  sender: send pkt4
  sender: send pkt5
  receiver: rcv pkt2, deliver, send ack2
  receiver: rcv pkt3, deliver, send ack3
  receiver: rcv pkt4, deliver, send ack4
  receiver: rcv pkt5, deliver, send ack5

Selective repeat

- receiver individually acknowledges all correctly received pkts
  - buffers pkts, as needed, for eventual in-order delivery to upper layer
- sender only resends pkts for which ACK not received
  - sender timer for each unACKed pkt
- sender window
  - N consecutive seq #’s
  - limits seq #s of sent, unACKed pkts

Selective repeat: sender, receiver windows

[Figure: sender and receiver windows over the sequence number space]


Selective repeat

sender

data from above:
- if next available seq # in window, send pkt
timeout(n):
- resend pkt n, restart timer
ACK(n) in [sendbase,sendbase+N-1]:
- mark pkt n as received
- if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver

pkt n in [rcvbase, rcvbase+N-1]
- send ACK(n)
- out-of-order: buffer
- in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
pkt n in [rcvbase-N,rcvbase-1]
- ACK(n)
otherwise:
- ignore

Selective repeat in action

sender window (N=4)

  sender: send pkt0
  sender: send pkt1
  sender: send pkt2 (X loss)
  sender: send pkt3
  sender: (wait)
  receiver: receive pkt0, send ack0
  receiver: receive pkt1, send ack1
  receiver: receive pkt3, buffer, send ack3
  sender: rcv ack0, send pkt4
  sender: rcv ack1, send pkt5
  sender: record ack3 arrived
  receiver: receive pkt4, buffer, send ack4
  receiver: receive pkt5, buffer, send ack5
  sender: pkt 2 timeout
  sender: send pkt2
  sender: record ack4 arrived
  sender: record ack5 arrived
  receiver: rcv pkt2; deliver pkt2, pkt3, pkt4, pkt5; send ack2

Q: what happens when ack2 arrives?

Selective repeat: dilemma

example:
- seq #’s: 0, 1, 2, 3
- window size=3
- receiver sees no difference in two scenarios!
- duplicate data accepted as new in (b)

Q: what relationship between seq # size and window size to avoid problem in (b)?

(a) no problem
  sender: send pkt0, pkt1, pkt2
  receiver: receives all three, window advances (after receipt)
  sender: window advances (after receipt of ACKs), send pkt3 (X lost), send pkt0
  receiver: will accept packet with seq number 0

(b) oops!
  sender: send pkt0, pkt1, pkt2
  receiver: receives all three, window advances (after receipt); all three ACKs lost (X)
  sender: timeout, retransmit pkt0
  receiver: will accept packet with seq number 0

receiver can’t see sender side. receiver behavior identical in both cases! something’s (very) wrong!
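The dilemma can be checked mechanically. The slide leaves the question open; a commonly stated answer is that the window size must be at most half the size of the sequence number space. The sketch below tests exactly the situation in scenario (b): the sender may still retransmit its old window while the receiver has already advanced to the next one, so the two windows must not share a sequence number. The function name is hypothetical.

```python
def window_is_safe(seq_space, window):
    """True if a retransmitted old packet can never look like a new one.

    old: seq #s of the window the sender may still retransmit
    new: seq #s the receiver accepts after receiving that whole window
    """
    old = {n % seq_space for n in range(0, window)}
    new = {n % seq_space for n in range(window, 2 * window)}
    return old.isdisjoint(new)


print(window_is_safe(seq_space=4, window=3))  # the example from the slide
print(window_is_safe(seq_space=4, window=2))  # window at most half the seq # space
```

With seq #’s 0, 1, 2, 3 and window size 3, the receiver’s new window is {3, 0, 1}, which overlaps the sender’s old window {0, 1, 2}; with window size 2 the windows {0, 1} and {2, 3} are disjoint.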
TCP: Overview RFCs: 793,1122,1323, 2018, 2581

- point-to-point:
  - one sender, one receiver
- reliable, in-order byte stream:
  - no “message boundaries”
- pipelined:
  - TCP congestion and flow control set window size
- full duplex data:
  - bi-directional data flow in same connection
  - MSS: maximum segment size
- connection-oriented:
  - handshaking (exchange of control msgs) inits sender, receiver state before data exchange
- flow controlled:
  - sender will not overwhelm receiver

TCP segment structure

segment format (32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | receive window
  checksum | Urg data pointer
  options (variable length)
  application data (variable length)

field notes:
- URG: urgent data (generally not used)
- ACK: ACK # valid
- PSH: push data now (generally not used)
- RST, SYN, FIN: connection estab (setup, teardown commands)
- checksum: Internet checksum (as in UDP)
- sequence number, acknowledgement number: counting by bytes of data (not segments!)
- receive window: # bytes rcvr willing to accept

TCP seq. numbers, ACKs

sequence numbers:
  - byte stream “number” of first byte in segment’s data
acknowledgements:
  - seq # of next byte expected from other side
  - cumulative ACK
Q: how receiver handles out-of-order segments
  - A: TCP spec doesn’t say, - up to implementor

[Figure: outgoing segment from sender and incoming segment to sender, each showing source port #, dest port #, sequence number, acknowledgement number, rwnd, checksum, urg pointer; sender sequence number space with window size N, divided into: sent ACKed; sent, not-yet ACKed (“in-flight”); usable but not yet sent; not usable]

TCP seq. numbers, ACKs

simple telnet scenario (Host A, Host B):
- User types ‘C’: A sends Seq=42, ACK=79, data = ‘C’
- host ACKs receipt of ‘C’, echoes back ‘C’: B sends Seq=79, ACK=43, data = ‘C’
- host ACKs receipt of echoed ‘C’: A sends Seq=43, ACK=80
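The numbers in the telnet scenario follow from one rule: the acknowledgement number is the sequence number of the next byte expected, that is, the received segment’s sequence number plus its data length. A minimal sketch, with a hypothetical function name, and assuming the segment arrives in order with no earlier gap:

```python
def ack_for(seq, data_len):
    """ACK number sent in response to an in-order segment: next byte expected."""
    return seq + data_len


# Host A sends Seq=42 carrying the single byte 'C'; Host B answers with ACK=43.
print(ack_for(42, len(b"C")))
# Host B echoes 'C' with Seq=79; Host A answers with ACK=80.
print(ack_for(79, len(b"C")))
```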


TCP round trip time, timeout

Q: how to set TCP timeout value?
- longer than RTT
  - but RTT varies
- too short: premature timeout, unnecessary retransmissions
- too long: slow reaction to segment loss

Q: how to estimate RTT?
- SampleRTT: measured time from segment transmission until ACK receipt
  - ignore retransmissions
- SampleRTT will vary, want estimated RTT “smoother”
  - average several recent measurements, not just current SampleRTT

TCP round trip time, timeout

EstimatedRTT = (1-α)*EstimatedRTT + α*SampleRTT

- exponential weighted moving average
- influence of past sample decreases exponentially fast
- typical value: α = 0.125

[Figure: RTT: gaia.cs.umass.edu to fantasia.eurecom.fr; RTT (milliseconds) versus time (seconds); curves: sampleRTT, EstimatedRTT]

TCP round trip time, timeout

- timeout interval: EstimatedRTT plus “safety margin”
  - large variation in EstimatedRTT -> larger safety margin
- estimate SampleRTT deviation from EstimatedRTT:

  DevRTT = (1-β)*DevRTT + β*|SampleRTT-EstimatedRTT|
  (typically, β = 0.25)

  TimeoutInterval = EstimatedRTT + 4*DevRTT

  (EstimatedRTT: estimated RTT; 4*DevRTT: “safety margin”)
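The three formulas fit together as a small estimator. This is a sketch under stated assumptions: the class name is hypothetical, EstimatedRTT is initialised to the first sample and DevRTT to zero (the slides do not give initial values), and DevRTT is updated before EstimatedRTT so that it uses the previous estimate (the slides do not fix the order).

```python
class RttEstimator:
    """EWMA-based RTT estimate and timeout interval, per the formulas above."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample  # assumption: start from first sample
        self.dev_rtt = 0.0                 # assumption: no deviation known yet

    def update(self, sample_rtt):
        # DevRTT = (1-beta)*DevRTT + beta*|SampleRTT - EstimatedRTT|
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        # EstimatedRTT = (1-alpha)*EstimatedRTT + alpha*SampleRTT
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self):
        # TimeoutInterval = EstimatedRTT + 4*DevRTT
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)   # milliseconds
print(est.update(200.0))                 # one slow sample widens the safety margin
```

A single sample of 200 ms after an estimate of 100 ms moves EstimatedRTT only to 112.5 ms, but raises DevRTT to 25 ms, so the timeout interval grows mostly through the safety margin.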


TCP reliable data transfer

- TCP creates rdt service on top of IP’s unreliable service
  - pipelined segments
  - cumulative acks
  - single retransmission timer
- retransmissions triggered by:
  - timeout events
  - duplicate acks

let’s initially consider simplified TCP sender:
  - ignore duplicate acks
  - ignore flow control, congestion control

TCP sender events:

data rcvd from app:
- create segment with seq #
- seq # is byte-stream number of first data byte in segment
- start timer if not already running
  - think of timer as for oldest unacked segment
  - expiration interval: TimeOutInterval

timeout:
- retransmit segment that caused timeout
- restart timer

ack rcvd:
- if ack acknowledges previously unacked segments
  - update what is known to be ACKed
  - start timer if there are still unacked segments

TCP sender (simplified)

initial transition into wait for event
  Λ
    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

wait for event
  data received from application above
    create segment, seq. #: NextSeqNum
    pass segment to IP (i.e., “send”)
    NextSeqNum = NextSeqNum + length(data)
    if (timer currently not running)
      start timer
  timeout
    retransmit not-yet-acked segment with smallest seq. #
    start timer
  ACK received, with ACK field value y
    if (y > SendBase) {
      SendBase = y
      /* SendBase-1: last cumulatively ACKed byte */
      if (there are currently not-yet-acked segments)
        start timer
      else stop timer
    }

TCP: retransmission scenarios

lost ACK scenario (SendBase=92)
  Host A: Seq=92, 8 bytes of data
  Host B: ACK=100 (X lost)
  Host A: timeout, Seq=92, 8 bytes of data
  Host B: ACK=100
  Host A: SendBase=100

premature timeout (SendBase=92)
  Host A: Seq=92, 8 bytes of data
  Host A: Seq=100, 20 bytes of data
  Host B: ACK=100
  Host B: ACK=120
  Host A: timeout before ACK=100 arrives, Seq=92, 8 bytes of data
  Host A: SendBase=100, then SendBase=120
  Host B: ACK=120
  Host A: SendBase=120
TCP: retransmission scenarios

cumulative ACK
  Host A: Seq=92, 8 bytes of data
  Host A: Seq=100, 20 bytes of data
  Host B: ACK=100 (X lost)
  Host B: ACK=120 (arrives before timeout)
  Host A: Seq=120, 15 bytes of data

TCP ACK generation [RFC 1122, RFC 2581]

event at receiver: arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
TCP receiver action: delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK

event at receiver: arrival of in-order segment with expected seq #. One other segment has ACK pending
TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments

event at receiver: arrival of out-of-order segment with higher-than-expected seq. #. Gap detected
TCP receiver action: immediately send duplicate ACK, indicating seq. # of next expected byte

event at receiver: arrival of segment that partially or completely fills gap
TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap

TCP fast retransmit

v time-out period often relatively long:
  § long delay before resending lost packet
v detect lost segments via duplicate ACKs.
  § sender often sends many segments back-to-back
  § if segment is lost, there will likely be many duplicate ACKs.

TCP fast retransmit:
if sender receives 3 ACKs for same data (“triple duplicate ACKs”), resend unacked segment with smallest seq #
  § likely that unacked segment lost, so don’t wait for timeout

TCP fast retransmit

fast retransmit after sender receipt of triple duplicate ACK:
    Host A -> Host B: Seq=92, 8 bytes of data
    Host A -> Host B: Seq=100, 20 bytes of data (lost, X)
    Host B -> Host A: ACK=100
    Host B -> Host A: ACK=100
    Host B -> Host A: ACK=100
    Host B -> Host A: ACK=100
    Host A -> Host B: Seq=100, 20 bytes of data (resent before timeout expires)
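The triple-duplicate-ACK rule amounts to counting repeats of the same ACK number. The sketch below is illustrative only: the function name `fast_retransmit_points` and the idea of feeding it a plain list of ACK numbers are assumptions, not part of the slides.

```python
def fast_retransmit_points(acks, threshold=3):
    """Return the ACK numbers at which a fast retransmit would be triggered.

    The first ACK for a given number is the original; each repeat is a
    duplicate. The retransmit fires when the duplicate count hits `threshold`.
    """
    retransmits = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == threshold:
                retransmits.append(ack)   # resend segment starting at this seq #
        else:
            last_ack = ack
            dup_count = 0
    return retransmits


# Scenario from the slide: one ACK=100 followed by three duplicates.
triggered = fast_retransmit_points([100, 100, 100, 100])
# Only two duplicates: not enough, the sender keeps waiting for the timeout.
not_triggered = fast_retransmit_points([100, 100, 100])
```

Here `triggered` is `[100]`, meaning the segment with Seq=100 is resent, while `not_triggered` is empty.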
TCP flow control

application may remove data from TCP socket buffers … slower than TCP receiver is delivering (sender is sending)

flow control: receiver controls sender, so sender won’t overflow receiver’s buffer by transmitting too much, too fast

(figure: receiver protocol stack. Segments from sender pass up through IP code and TCP code into the TCP socket receiver buffers in the OS, from which the application process reads.)

TCP flow control

v receiver “advertises” free buffer space by including rwnd value in TCP header of receiver-to-sender segments
  § RcvBuffer size set via socket options (typical default is 4096 bytes)
  § many operating systems autoadjust RcvBuffer
v sender limits amount of unacked (“in-flight”) data to receiver’s rwnd value
v guarantees receive buffer will not overflow

(figure: receiver-side buffering. TCP segment payloads arrive into RcvBuffer, which holds buffered data plus rwnd free buffer space; data leaves to application process.)
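A small model can make the rwnd bookkeeping concrete. The sketch assumes the textbook-style counters `last_byte_rcvd` and `last_byte_read` (these names do not appear on the slides) and computes rwnd as RcvBuffer minus the data still sitting in the buffer. The class `ReceiveBuffer` and the helper `usable_window` are hypothetical names.

```python
class ReceiveBuffer:
    """Receiver side: tracks buffered data and the advertised window rwnd."""

    def __init__(self, rcv_buffer=4096):      # 4096 bytes: default quoted on the slide
        self.rcv_buffer = rcv_buffer
        self.last_byte_rcvd = 0
        self.last_byte_read = 0

    @property
    def rwnd(self):
        # Free space = buffer size minus data received but not yet read.
        return self.rcv_buffer - (self.last_byte_rcvd - self.last_byte_read)

    def receive(self, nbytes):
        """TCP delivers nbytes of segment payload into the buffer."""
        if nbytes > self.rwnd:
            raise ValueError("sender exceeded advertised window")
        self.last_byte_rcvd += nbytes

    def app_read(self, nbytes):
        """Application removes up to nbytes from the buffer."""
        nbytes = min(nbytes, self.last_byte_rcvd - self.last_byte_read)
        self.last_byte_read += nbytes
        return nbytes


def usable_window(last_byte_sent, last_byte_acked, rwnd):
    """Sender side: bytes that may still be sent without exceeding rwnd."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, rwnd - in_flight)


buf = ReceiveBuffer(4096)
buf.receive(3000)                 # buffer fills faster than the app reads
rwnd_after_receive = buf.rwnd     # 4096 - 3000
buf.app_read(1000)                # app drains part of the buffer
rwnd_after_read = buf.rwnd        # 4096 - 2000
```

As long as the sender keeps its in-flight data within the advertised rwnd, `receive` never raises, which is the no-overflow guarantee stated on the slide.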
Connection Management

before exchanging data, sender/receiver “handshake”:
v agree to establish connection (each knowing the other willing to establish connection)
v agree on connection parameters

(figure: client and server, each with application above network, and each holding:)
    connection state: ESTAB
    connection variables:
        seq # client-to-server, server-to-client
        rcvBuffer size at server, client

client: Socket clientSocket = new Socket("hostname","port number");
server: Socket connectionSocket = welcomeSocket.accept();

Agreeing to establish a connection

2-way handshake:
    “Let’s talk” -> ESTAB
    “OK” -> ESTAB

    client: choose x, send req_conn(x)
    server: ESTAB, send acc_conn(x)
    client: ESTAB

Q: will 2-way handshake always work in network?
v variable delays
v retransmitted messages (e.g. req_conn(x)) due to message loss
v message reordering
v can’t “see” other side

Agreeing to establish a connection

2-way handshake failure scenarios:

scenario 1: half open connection
    client: choose x, send req_conn(x)
    server: ESTAB, send acc_conn(x)
    client: retransmit req_conn(x)
    client: ESTAB
    connection x completes: client terminates, server forgets x
    server: retransmitted req_conn(x) arrives, ESTAB
    half open connection! (no client!)

scenario 2: duplicate data accepted
    client: choose x, send req_conn(x)
    server: ESTAB, send acc_conn(x)
    client: retransmit req_conn(x)
    client: ESTAB, send data(x+1)
    server: accept data(x+1)
    client: retransmit data(x+1)
    connection x completes: client terminates, server forgets x
    server: retransmitted req_conn(x) arrives, ESTAB
    server: retransmitted data(x+1) arrives, accept data(x+1)

TCP 3-way handshake

client state: LISTEN -> SYNSENT -> ESTAB
server state: LISTEN -> SYN RCVD -> ESTAB

1. client: choose init seq num, x; send TCP SYN msg (SYNbit=1, Seq=x); client enters SYNSENT
2. server: choose init seq num, y; send TCP SYNACK msg, acking SYN (SYNbit=1, Seq=y, ACKbit=1; ACKnum=x+1); server enters SYN RCVD
3. client: received SYNACK(x) indicates server is live; send ACK for SYNACK (ACKbit=1, ACKnum=y+1); this segment may contain client-to-server data; client enters ESTAB
4. server: received ACK(y) indicates client is live; server enters ESTAB
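The sequence and acknowledgment numbers exchanged in the TCP 3-way handshake can be traced with a short simulation. This is a sketch, not a socket implementation: segments are plain dictionaries, the function name `three_way_handshake` is hypothetical, and both endpoints start in LISTEN as drawn on the slide.

```python
def three_way_handshake(x, y):
    """Walk through SYN, SYNACK, ACK for initial sequence numbers x and y.

    Returns the final (client_state, server_state) and the list of segments.
    """
    client_state, server_state = "LISTEN", "LISTEN"
    segments = []

    # Step 1: client chooses x and sends SYN.
    syn = {"from": "client", "SYNbit": 1, "Seq": x}
    segments.append(syn)
    client_state = "SYNSENT"

    # Step 2: server chooses y and sends SYNACK, acking the SYN.
    synack = {"from": "server", "SYNbit": 1, "Seq": y,
              "ACKbit": 1, "ACKnum": syn["Seq"] + 1}
    segments.append(synack)
    server_state = "SYN RCVD"

    # Step 3: a SYNACK acking x+1 tells the client the server is live.
    if synack["ACKnum"] == x + 1:
        ack = {"from": "client", "ACKbit": 1, "ACKnum": synack["Seq"] + 1}
        segments.append(ack)
        client_state = "ESTAB"

        # Step 4: an ACK for y+1 tells the server the client is live.
        if ack["ACKnum"] == y + 1:
            server_state = "ESTAB"

    return client_state, server_state, segments


client_state, server_state, segments = three_way_handshake(x=1000, y=5000)
```

With x=1000 and y=5000 the SYNACK carries ACKnum=1001 and the final ACK carries ACKnum=5001, after which both sides are in ESTAB.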
TCP 3-way handshake: FSM

server side:
    closed -> listen
        event: Socket connectionSocket = welcomeSocket.accept();
        action: Λ
    listen -> SYN rcvd
        event: SYN(x)
        action: SYNACK(seq=y,ACKnum=x+1); create new socket for communication back to client
    SYN rcvd -> ESTAB
        event: ACK(ACKnum=y+1)
        action: Λ

client side:
    closed -> SYN sent
        event: Socket clientSocket = new Socket("hostname","port number");
        action: SYN(seq=x)
    SYN sent -> ESTAB
        event: SYNACK(seq=y,ACKnum=x+1)
        action: ACK(ACKnum=y+1)

TCP: closing a connection

v client, server each close their side of connection
  § send TCP segment with FIN bit = 1
v respond to received FIN with ACK
  § on receiving FIN, ACK can be combined with own FIN
v simultaneous FIN exchanges can be handled

TCP: closing a connection

client state: ESTAB -> FIN_WAIT_1 -> FIN_WAIT_2 -> TIMED_WAIT -> CLOSED
server state: ESTAB -> CLOSE_WAIT -> LAST_ACK -> CLOSED

1. client: clientSocket.close(); sends FINbit=1, seq=x; enters FIN_WAIT_1 (can no longer send but can receive data)
2. server: sends ACKbit=1; ACKnum=x+1; enters CLOSE_WAIT (can still send data); client enters FIN_WAIT_2 (wait for server close)
3. server: sends FINbit=1, seq=y; enters LAST_ACK (can no longer send data)
4. client: sends ACKbit=1; ACKnum=y+1; enters TIMED_WAIT (timed wait for 2*max segment lifetime), then CLOSED
5. server: receives ACK; enters CLOSED


Principles of congestion control

congestion:
v informally: “too many sources sending too much data too fast for network to handle”
v different from flow control!
v manifestations:
  § lost packets (buffer overflow at routers)
  § long delays (queueing in router buffers)
v a top-10 problem!

Causes/costs of congestion: scenario 1

v two senders, two receivers
v one router, infinite buffers
v output link capacity: R
v no retransmission

(figure: Host A and Host B send original data at rate λin into a router with unlimited shared output link buffers; throughput: λout)

(plots: λout vs. λin, rising to R/2 at λin = R/2; delay vs. λin, growing sharply as λin approaches R/2)

v maximum per-connection throughput: R/2
v large delays as arrival rate, λin, approaches capacity

Causes/costs of congestion: scenario 2

v one router, finite buffers
v sender retransmission of timed-out packet
  § application-layer input = application-layer output: λin = λout
  § transport-layer input includes retransmissions: λ'in ≥ λin

(figure: Host A sends λin: original data, and λ'in: original data, plus retransmitted data, into a router with finite shared output link buffers; λout is delivered; Host B shares the router)

Causes/costs of congestion: scenario 2

idealization: perfect knowledge
v sender sends only when router buffers available

(plot: λout vs. λin, both axes up to R/2)

(figure: as above; sender keeps a copy of each packet and sends when there is free buffer space!)
Causes/costs of congestion: scenario 2

Idealization: known loss
packets can be lost, dropped at router due to full buffers
v sender only resends if packet known to be lost

(figure: λin: original data; λ'in: original data, plus retransmitted data; λout. Sender A keeps a copy; packet is dropped when there is no buffer space! and resent when there is free buffer space!)

(plot: λout vs. λin, axes up to R/2)
when sending at R/2, some packets are retransmissions but asymptotic goodput is still R/2 (why?)

Causes/costs of congestion: scenario 2

Realistic: duplicates
v packets can be lost, dropped at router due to full buffers
v sender times out prematurely, sending two copies, both of which are delivered

(figure: λin; λ'in; λout. Sender A keeps a copy, its timeout fires, and the copy is sent into free buffer space!)

(plot: λout vs. λin, axes up to R/2)
when sending at R/2, some packets are retransmissions, including duplicates that are delivered!

“costs” of congestion:
v more work (retrans) for given “goodput”
v unneeded retransmissions: link carries multiple copies of pkt
  § decreasing goodput
Causes/costs of congestion: scenario 3

v four senders
v multihop paths
v timeout/retransmit

Q: what happens as λin and λ'in increase?
A: as red λ'in increases, all arriving blue pkts at upper queue are dropped, blue throughput → 0

(figure: Hosts A, B, C, D connected through routers with finite shared output link buffers; λin: original data; λ'in: original data, plus retransmitted data; λout)

Causes/costs of congestion: scenario 3

(plot: λout vs. λ'in, axes up to C/2)

another “cost” of congestion:
v when packet dropped, any “upstream” transmission capacity used for that packet was wasted!

Approaches towards congestion control

two broad approaches towards congestion control:

end-end congestion control:
v no explicit feedback from network
v congestion inferred from end-system observed loss, delay
v approach taken by TCP

network-assisted congestion control:
v routers provide feedback to end systems
  § single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  § explicit rate for sender to send at

Case study: ATM ABR congestion control

ABR: available bit rate:
v “elastic service”
v if sender’s path “underloaded”:
  § sender should use available bandwidth
v if sender’s path congested:
  § sender throttled to minimum guaranteed rate

RM (resource management) cells:
v sent by sender, interspersed with data cells
v bits in RM cell set by switches (“network-assisted”)
  § NI bit: no increase in rate (mild congestion)
  § CI bit: congestion indication
v RM cells returned to sender by receiver, with bits intact


Case study: ATM ABR congestion control

(figure: RM cells interspersed with data cells)

v two-byte ER (explicit rate) field in RM cell
  § congested switch may lower ER value in cell
  § senders’ send rate thus max supportable rate on path
v EFCI bit in data cells: set to 1 in congested switch
  § if data cell preceding RM cell has EFCI set, receiver sets CI bit in returned RM cell

TCP congestion control: additive increase multiplicative decrease

v approach: sender increases transmission rate (window size), probing for usable bandwidth, until loss occurs
  § additive increase: increase cwnd by 1 MSS every RTT until loss detected
  § multiplicative decrease: cut cwnd in half after loss

(plot: cwnd: TCP sender congestion window size vs. time. AIMD saw tooth behavior: probing for bandwidth. additively increase window size … until loss occurs (then cut window in half))

TCP Congestion Control: details

(figure: sender sequence number space: last byte ACKed; sent, not-yet ACKed (“in-flight”) bytes spanning cwnd; last byte sent)

v sender limits transmission:

    $\text{LastByteSent} - \text{LastByteAcked} \le \text{cwnd}$

v cwnd is dynamic, function of perceived network congestion

TCP sending rate:
v roughly: send cwnd bytes, wait RTT for ACKS, then send more bytes

    $\text{rate} \approx \dfrac{\text{cwnd}}{\text{RTT}}$ bytes/sec
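The AIMD saw tooth and the cwnd/RTT rate estimate can be reproduced numerically. The sketch below is a deliberately crude model: cwnd is counted in whole MSS units, a loss is assumed to occur whenever cwnd reaches a fixed value `loss_at`, and the function names are hypothetical.

```python
def aimd_trace(rounds, loss_at, start=1):
    """cwnd (in MSS) at the start of each RTT.

    Additive increase: +1 MSS per RTT.
    Multiplicative decrease: halve cwnd in the RTT after it reaches loss_at.
    """
    cwnd = start
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd >= loss_at:
            cwnd = max(1, cwnd // 2)   # loss detected: cut window in half
        else:
            cwnd += 1                  # no loss: probe for more bandwidth
    return trace


def sending_rate(cwnd_bytes, rtt_seconds):
    """Rough sending rate in bytes/sec: one window per round-trip time."""
    return cwnd_bytes / rtt_seconds


saw_tooth = aimd_trace(rounds=12, loss_at=8, start=4)
rate = sending_rate(cwnd_bytes=10_000, rtt_seconds=0.25)
```

The trace climbs 4, 5, 6, 7, 8 and then drops back to 4, repeating the saw tooth; a 10,000-byte window over a 250 ms RTT gives roughly 40,000 bytes/sec.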
TCP Slow Start

v when connection begins, increase rate exponentially until first loss event:
  § initially cwnd = 1 MSS
  § double cwnd every RTT
  § done by incrementing cwnd for every ACK received
v summary: initial rate is slow but ramps up exponentially fast

(figure: Host A sends one segment, then two segments, then four segments to Host B, one burst per RTT, over time)

TCP: detecting, reacting to loss

v loss indicated by timeout:
  § cwnd set to 1 MSS;
  § window then grows exponentially (as in slow start) to threshold, then grows linearly
v loss indicated by 3 duplicate ACKs: TCP RENO
  § dup ACKs indicate network capable of delivering some segments
  § cwnd is cut in half; window then grows linearly
v TCP Tahoe always sets cwnd to 1 (timeout or 3 duplicate acks)
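Slow start's doubling falls out of the per-ACK increment, and the Tahoe and Reno reactions to loss differ only in the duplicate-ACK case. The sketch below counts cwnd in whole MSS units and uses the simplified "cut in half" reaction from the bullets above (it leaves out fast recovery's window inflation); both function names are hypothetical.

```python
def slow_start_round(cwnd_mss):
    """One RTT of slow start: every segment in the window is ACKed,
    and each ACK increments cwnd by 1 MSS, so the window doubles."""
    for _ in range(cwnd_mss):      # one ACK per segment sent this round
        cwnd_mss += 1
    return cwnd_mss


def react_to_loss(cwnd_mss, event, variant="reno"):
    """Return (new cwnd, ssthresh) in MSS after a loss event.

    event is "timeout" or "3dup"; variant is "reno" or "tahoe".
    """
    ssthresh = max(cwnd_mss // 2, 1)
    if event == "timeout" or variant == "tahoe":
        return 1, ssthresh         # restart from 1 MSS, slow start to ssthresh
    return ssthresh, ssthresh      # Reno on 3 dup ACKs: halve, then grow linearly


growth = [1]
for _ in range(3):
    growth.append(slow_start_round(growth[-1]))
```

Starting from 1 MSS, three RTTs give windows of 1, 2, 4 and 8 segments, matching the one, two, four segment bursts in the slow start figure.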

TCP: switching from slow start to CA

Q: when should the exponential increase switch to linear?
A: when cwnd gets to 1/2 of its value before timeout.

Implementation:
v variable ssthresh
v on loss event, ssthresh is set to 1/2 of cwnd just before loss event

Summary: TCP Congestion Control

initialization (Λ):
    cwnd = 1 MSS
    ssthresh = 64 KB
    dupACKcount = 0
    enter slow start

state: slow start
    new ACK:
        cwnd = cwnd + MSS
        dupACKcount = 0
        transmit new segment(s), as allowed
    duplicate ACK:
        dupACKcount++
    cwnd > ssthresh (action Λ):
        go to congestion avoidance
    timeout:
        ssthresh = cwnd/2
        cwnd = 1 MSS
        dupACKcount = 0
        retransmit missing segment
    dupACKcount == 3:
        ssthresh = cwnd/2
        cwnd = ssthresh + 3
        retransmit missing segment
        go to fast recovery

state: congestion avoidance
    new ACK:
        cwnd = cwnd + MSS · (MSS/cwnd)
        dupACKcount = 0
        transmit new segment(s), as allowed
    duplicate ACK:
        dupACKcount++
    timeout:
        ssthresh = cwnd/2
        cwnd = 1 MSS
        dupACKcount = 0
        retransmit missing segment
        go to slow start
    dupACKcount == 3:
        ssthresh = cwnd/2
        cwnd = ssthresh + 3
        retransmit missing segment
        go to fast recovery

state: fast recovery
    duplicate ACK:
        cwnd = cwnd + MSS
        transmit new segment(s), as allowed
    new ACK:
        cwnd = ssthresh
        dupACKcount = 0
        go to congestion avoidance
    timeout:
        ssthresh = cwnd/2
        cwnd = 1
        dupACKcount = 0
        retransmit missing segment
        go to slow start
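The congestion control summary FSM translates fairly directly into a small state machine. The sketch below tracks only cwnd, ssthresh and dupACKcount in bytes; it does not transmit or retransmit anything. The class name `RenoFsm` and its method names are hypothetical, and the slide's "ssthresh + 3" is read here as ssthresh plus 3 MSS, which is an assumption about units.

```python
class RenoFsm:
    """Slow start / congestion avoidance / fast recovery, window state only."""

    def __init__(self, mss=1000, ssthresh=64_000):
        self.mss = mss
        self.cwnd = float(mss)            # cwnd = 1 MSS
        self.ssthresh = float(ssthresh)
        self.dup_acks = 0
        self.state = "slow_start"

    def on_new_ack(self):
        if self.state == "slow_start":
            self.cwnd += self.mss
            self.dup_acks = 0
            if self.cwnd > self.ssthresh:
                self.state = "congestion_avoidance"
        elif self.state == "congestion_avoidance":
            self.cwnd += self.mss * (self.mss / self.cwnd)
            self.dup_acks = 0
        else:                             # fast recovery: deflate the window
            self.cwnd = self.ssthresh
            self.dup_acks = 0
            self.state = "congestion_avoidance"

    def on_dup_ack(self):
        if self.state == "fast_recovery":
            self.cwnd += self.mss
            return
        self.dup_acks += 1
        if self.dup_acks == 3:            # triple duplicate ACK
            self.ssthresh = self.cwnd / 2
            self.cwnd = self.ssthresh + 3 * self.mss
            self.state = "fast_recovery"

    def on_timeout(self):                 # same reaction in every state
        self.ssthresh = self.cwnd / 2
        self.cwnd = float(self.mss)
        self.dup_acks = 0
        self.state = "slow_start"


fsm = RenoFsm(mss=1000, ssthresh=4000)
for _ in range(4):
    fsm.on_new_ack()                      # cwnd: 2000, 3000, 4000, 5000
state_after_acks = (fsm.state, fsm.cwnd)
for _ in range(3):
    fsm.on_dup_ack()
state_after_dups = (fsm.state, fsm.cwnd, fsm.ssthresh)
fsm.on_new_ack()
state_after_recovery = (fsm.state, fsm.cwnd)
fsm.on_timeout()
state_after_timeout = (fsm.state, fsm.cwnd, fsm.ssthresh)
```

The run leaves slow start once cwnd exceeds ssthresh, enters fast recovery on the third duplicate ACK, returns to congestion avoidance on the next new ACK, and falls back to slow start with cwnd = 1 MSS on timeout.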


TCP throughput

v avg. TCP thruput as function of window size, RTT?
  § ignore slow start, assume always data to send
v W: window size (measured in bytes) where loss occurs
  § avg. window size (# in-flight bytes) is ¾ W
  § avg. thruput is 3/4W per RTT

    $\text{avg TCP thruput} = \dfrac{3}{4}\,\dfrac{W}{\text{RTT}}$ bytes/sec

(plot: window size oscillating between W/2 and W)

TCP Futures: TCP over “long, fat pipes”

v example: 1500 byte segments, 100ms RTT, want 10 Gbps throughput
v requires W = 83,333 in-flight segments
v throughput in terms of segment loss probability, L [Mathis 1997]:

    $\text{TCP throughput} = \dfrac{1.22 \cdot \text{MSS}}{\text{RTT}\,\sqrt{L}}$

➜ to achieve 10 Gbps throughput, need a loss rate of $L = 2 \cdot 10^{-10}$, a very small loss rate!
v new versions of TCP for high-speed
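The two numbers quoted for the "long, fat pipe" example can be checked by plugging the slide's values into the formulas. This is a worked calculation only; the variable names are illustrative.

```python
MSS_BITS = 1500 * 8        # 1500 byte segments, in bits
RTT = 0.1                  # 100 ms
TARGET_BPS = 10e9          # 10 Gbps

# Window needed to keep the pipe full: throughput * RTT, in segments.
window_segments = TARGET_BPS * RTT / MSS_BITS

# Solve throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss probability L.
loss_probability = (1.22 * MSS_BITS / (RTT * TARGET_BPS)) ** 2

print(f"W is about {window_segments:,.0f} segments")
print(f"L is about {loss_probability:.1e}")
```

The window comes out at about 83,333 segments and the loss probability at about 2.1e-10, in line with the slide's rounded figure of 2·10^-10.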

TCP Fairness

fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

(figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R)

Why is TCP fair?

two competing sessions:
v additive increase gives slope of 1, as throughput increases
v multiplicative decrease decreases throughput proportionally

(plot: Connection 2 throughput vs. Connection 1 throughput, each axis up to R, with the equal bandwidth share line. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2", moving toward the equal share line.)
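The convergence argument for two competing sessions can be checked with a toy simulation. The model below is an idealization and not how real TCP connections interact: both flows see every loss at the same instant, rates are in arbitrary units, and the function name `two_flow_aimd` is hypothetical.

```python
def two_flow_aimd(x1, x2, capacity, rounds):
    """Evolve two AIMD flows sharing one bottleneck.

    While the combined rate fits within capacity, both flows add 1 unit.
    When it exceeds capacity, both flows see a loss and halve their rate.
    """
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2    # multiplicative decrease
        else:
            x1, x2 = x1 + 1, x2 + 1    # additive increase
    return x1, x2


# Start far from the fair share: flow 1 at 10, flow 2 at 70, capacity R = 100.
final_1, final_2 = two_flow_aimd(10, 70, capacity=100, rounds=200)
gap = abs(final_1 - final_2)
```

Additive increase leaves the gap between the flows unchanged and every halving cuts it in two, so the initial gap of 60 shrinks to well under 1 unit after 200 rounds.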
Fairness (more)

Fairness and UDP
v multimedia apps often do not use TCP
  § do not want rate throttled by congestion control
v instead use UDP:
  § send audio/video at constant rate, tolerate packet loss

Fairness, parallel TCP connections
v application can open multiple parallel connections between two hosts
v web browsers do this
v e.g., link of rate R with 9 existing connections:
  § new app asks for 1 TCP, gets rate R/10
  § new app asks for 11 TCPs, gets 11R/20, roughly R/2

Chapter 3: summary

v principles behind transport layer services:
  § multiplexing, demultiplexing
  § reliable data transfer
  § flow control
  § congestion control
v instantiation, implementation in the Internet
  § UDP
  § TCP

next:
v leaving the network “edge” (application, transport layers)
v into the network “core”
