Data Communication Networks
Lecture 2
Saad Mneimneh
Computer Science
Hunter College of CUNY
New York
DLC
Framing
Character based framing
Character based framing (cont.)
Length field
Maximum frame size
Maximum frame size (cont.)
Fixed length packets/frames
Bit oriented framing
Bit stuffing
Overhead of bit stuffing
Overhead of bit stuffing (cont.)
Can we do better?
Error detection
How to detect errors?
Single parity check
Horizontal and vertical parity checks
Arbitrary parity check codes
Effectiveness of a code
Effectiveness of a code (cont.)
Cyclic Redundancy Check
Obtaining c(x)
Another example
Using bits only
Using bits only (cont.)
Feedback shift register
How does c(x) help?
Undetected errors
DLC
We are not going to study the physical layer and how communication signals are sent and received
We will assume that we are capable of sending bits over a link
DLC is responsible for reliable transmission of packets over a link
every packet is delivered once,
only once,
without errors,
and in order
To achieve this goal, we have:
Framing: determine start and end of packets
Error detection: determine when errors exist
Error correction: retransmit packets containing errors
Framing
Recall that DLC adds its own header and trailer to the packet, forming a frame
[Figure: frame = header | packet | trailer]
The problem is to decide where successive frames start and end
in some cases, there is a period of idle fill between successive frames (e.g. synchronous bit pipe)
it is also necessary to separate idle fill from frames
even when idle fill is replaced by dead periods (intermittent bit pipe), the problem is not simplified, e.g. often there are no dead periods
. . . 01011010110010110101101110101110 . . .
Where is the data?
Character based framing
Character based codes, such as ASCII (7 bits and 1 parity bit), provide binary representation for keyboard
characters and terminal control characters
Such codes can also provide representation for various communication characters
SYN: a string of SYN characters provides idle fill between frames when a sending DLC has no data to send (but a synchronous modem requires bits)
STX: start of text
ETX: end of text
[Figure: frame layout: SYN SYN | STX | header | packet | ETX | CRC | SYN SYN; the CRC is the trailer added by DLC for error detection (later)]
Frame must contain integer number of characters
Frame is character-code dependent: how do we send binary data?
e.g. the packet is an arbitrary binary string and may contain the ETX character, which could be wrongly interpreted as end of frame
Character based framing (cont.)
A special control character DLE (Data Link Escape) is inserted before any intentional use of communication control characters
i.e. DLE is not inserted before the possible appearance of these characters as part of the binary data
But what if DLE appears itself in the data?
Insert a DLE before each appearance of DLE in data
e.g. DLE ETX (but not DLE DLE ETX): end of frame
e.g. DLE DLE ETX (but not DLE DLE DLE ETX): DLE ETX in data
[Figure: frame layout: SYN SYN | DLE STX | header | packet | DLE ETX | CRC | SYN SYN]
Too much overhead: at least 6 characters/packet
Primary framing method from 1960-1975
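The DLE rule above can be sketched in a few lines of Python. This is a minimal illustration, not a full DLC: the function names `frame` and `deframe` are hypothetical, the control characters use their standard ASCII codes, and the SYN fill, header, and CRC are left out.

```python
# ASCII control characters used for character based framing.
DLE, STX, ETX = b"\x10", b"\x02", b"\x03"

def frame(payload: bytes) -> bytes:
    """Wrap payload as DLE STX ... DLE ETX, doubling every DLE in the data."""
    return DLE + STX + payload.replace(DLE, DLE + DLE) + DLE + ETX

def deframe(framed: bytes) -> bytes:
    """Recover the payload; raise ValueError if the frame is malformed."""
    if not framed.startswith(DLE + STX):
        raise ValueError("frame does not start with DLE STX")
    out = bytearray()
    i = 2
    while i < len(framed):
        if framed[i] == DLE[0]:
            if i + 1 >= len(framed):
                raise ValueError("dangling DLE")
            nxt = framed[i + 1]
            if nxt == ETX[0]:      # DLE ETX: end of frame
                return bytes(out)
            if nxt == DLE[0]:      # DLE DLE: one DLE of data
                out.append(DLE[0])
                i += 2
                continue
            raise ValueError("unexpected character after DLE")
        out.append(framed[i])
        i += 1
    raise ValueError("no DLE ETX found")
```

A payload that itself contains DLE ETX survives the round trip, because the receiver sees DLE DLE ETX inside the frame and reads it as data.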
Length field
The basic problem is to inform the receiving DLC where each idle fill ends and where each frame
ends
idle fill: in principle, easy to identify because it is represented by a fixed string
frame: harder to indicate where it ends because it consists of an arbitrary and unknown bit string
Include a length field of a certain number of bits in the header (e.g. DECNET)
[Figure: bit stream of idle fill followed by frames of the form header (containing the length field) | packet | CRC, with idle fill between frames]
Once synchronized, DLC can always tell where next frame starts
Length field restricts packet size
length field must be $\lfloor \log_2 \text{MaxFrameSize} \rfloor + 1$ bits (that's the overhead)
Difficult to recover from error in length field
e.g. resynchronization needed after error in length field
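As a rough illustration of length-field framing, the sketch below works on strings of '0'/'1' characters. The names `length_field_bits`, `encode`, and `decode` are hypothetical, and the header is reduced to the length field alone (no CRC, no idle fill), which is a simplification of the slide.

```python
def length_field_bits(max_frame_size: int) -> int:
    """Bits needed to write any length up to max_frame_size: floor(log2(n)) + 1."""
    return max_frame_size.bit_length()

def encode(packets, width):
    """Prefix each packet (a '0'/'1' string) with its length in `width` bits."""
    return "".join(format(len(p), f"0{width}b") + p for p in packets)

def decode(stream, width):
    """Split a synchronized stream back into packets using the length fields."""
    packets, i = [], 0
    while i < len(stream):
        n = int(stream[i:i + width], 2)   # read the length field
        i += width
        packets.append(stream[i:i + n])   # the next n bits are the packet
        i += n
    return packets
```

Flipping a single bit inside one length field makes `decode` misread every later frame boundary, which is the resynchronization problem noted above.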
Maximum frame size
How should transport layer choose maximum packet size?
not a big deal since IP fragments packets further if necessary
usually about 1500 bytes (from Ethernet)
but theoretically speaking?
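Let $K_{max}$ be the maximum packet size. Assume V overhead bits per packet. Let M be the message length.
Then we have
$$M + \left\lceil \frac{M}{K_{max}} \right\rceil V \text{ bits to send}$$
[Figure: message split into packets of $K_{max}$, $K_{max}$, ..., Rest bits, each carrying V overhead bits]
Therefore,
larger $K_{max}$: smaller overhead per message
smaller $K_{max}$: faster delivery of message (why?)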
Maximum frame size (cont.)
What is the time needed for the message to traverse j links?
[Figure: path from sender to receiver over successive links 1, 2, ...]
Assume capacity of link is c bps (bandwidth)
$$T = \frac{(K_{max} + V)(j - 1)}{c} + \frac{M + \lceil M / K_{max} \rceil V}{c} + P + Q$$
Dropping the factor $1/c$ and the terms $P + Q$, and approximating $\lceil M / K_{max} \rceil$ by $M / K_{max}$:
$$E[T] \approx (K_{max} + V)(j - 1) + E[M] + \frac{E[M]}{K_{max}} V$$
Minimizing (take first derivative and set it to zero)
$$K_{max} = \sqrt{\frac{E[M] V}{j - 1}}$$
Fixed length packets/frames
Length field is implicit (not needed)
e.g. ATM, all packets are 53 bytes
Requires synchronization upon initialization
Message length not multiple of packet size
last packet contains idle fill (efficiency?)
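The trade-off in the choice of the maximum packet size can be checked numerically. The sketch below evaluates the approximate expected delay from the maximum frame size slides, measured in bits (the factor 1/c and the terms P + Q are dropped). The function and parameter names are hypothetical, and the numbers in the usage note are made up for illustration.

```python
import math

def expected_delay_bits(k_max, mean_msg, overhead, links):
    """(Kmax + V)(j - 1) + E[M] + E[M] V / Kmax, in bits."""
    return (k_max + overhead) * (links - 1) + mean_msg + mean_msg * overhead / k_max

def optimal_k_max(mean_msg, overhead, links):
    """Kmax = sqrt(E[M] V / (j - 1)), from setting the derivative to zero."""
    return math.sqrt(mean_msg * overhead / (links - 1))
```

For example, with E[M] = 10000 bits, V = 100 bits, and j = 5 links, `optimal_k_max` returns 500, and `expected_delay_bits` is larger at both 400 and 600 than at 500.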
Bit oriented framing
In character based framing, DLE ETX indicates the end of frame
avoided within frame by doubling each DLE character
In bit oriented framing, a special binary flag indicates the end of frame
avoided within frame using a technique called bit stuffing
The difference is that a flag can be of any length (later we see how to set length to minimize overhead)
Standard protocols use 01111110, which we denote by $01^60$
The same flag can be used to indicate start of frame
01111110 . . . . . . . . . . . . . . . . . . 01111110
Standard DLCs also have an abort capability in which a frame can be aborted by sending 7 or more consecutive 1s (15 consecutive 1s: link is idle)
Therefore, 0111111, i.e. $01^6$, is the actual bit string that must be avoided in data
Bit stuffing, 1970 by IBM
Bit stuffing
Sender DLC
insert (stuff) a 0 after each appearance of five consecutive 1s
append the flag $01^60$ (without stuffing) at the end of frame
Receiver DLC
delete the first 0 after each string of five consecutive 1s
if six consecutive 1s are seen: end of frame
[Figure: original frame with four stuffed 0s, one inserted after each run of five consecutive 1s]
original frame: 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 0
Which ones of these stuffed bits can be avoided (provided the receiver's rule for deleting stuffed bits is changed accordingly)?
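The sender and receiver rules can be sketched directly on strings of '0'/'1' characters. The names `stuff` and `unstuff` are hypothetical, and the sketch covers only the frame body: adding the flag and detecting it in a continuous stream are left out.

```python
def stuff(bits: str) -> str:
    """Sender: insert a 0 after each run of five consecutive 1s."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:
            out.append("0")   # stuffed bit
            run = 0
    return "".join(out)

def unstuff(bits: str) -> str:
    """Receiver: delete the first 0 after each run of five consecutive 1s."""
    out, run, i = [], 0, 0
    while i < len(bits):
        b = bits[i]
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:
            if i + 1 < len(bits) and bits[i + 1] == "1":
                raise ValueError("six consecutive 1s: flag or abort, not data")
            i += 1            # skip the stuffed 0
            run = 0
        i += 1
    return "".join(out)
```

Applied to the example frame of this slide (six 1s, a 0, eleven 1s, a 0, five 1s, a 0), `stuff` inserts four 0s and `unstuff` recovers the original string.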
Overhead of bit stuffing
What is the overhead of bit stuffing (assume flag = $01^j0$)?
if j increases: less stuffing but longer flag
if j decreases: more stuffing but shorter flag
For the sake of analysis, assume a random string of bits with p(0) = p(1) = 1/2
insertion after the ith bit occurs with probability $(1/2)^j$, when the last j bits are $0\underbrace{1 \ldots 1}_{j-1}$
also insertion after the ith bit occurs with probability $(1/2)^{2j-1}$, when the last $2j-1$ bits are $0\underbrace{1 \ldots 1}_{j-1}\underbrace{1 \ldots 1}_{j-1}$
since $(1/2)^{2j-1} \ll (1/2)^j$, we can ignore such an event and events of insertion due to yet longer strings of 1s
The probability of stuffing after the ith bit is approximately $2^{-j}$, which is also $E[\text{stuffed} @ i]$
By linearity of expectation, $E[\text{stuffed} \mid K] \approx K 2^{-j}$ (be careful at boundary), where K is the length of the frame
Overhead of bit stuffing (cont.)
From previous slide
$$E[\text{stuffed}] = E[K] 2^{-j}$$
$$E[\text{overhead}] = \underbrace{E[K] 2^{-j}}_{\text{stuffed}} + \underbrace{j + 2}_{\text{flag}}$$
Minimizing with respect to j, we need smallest j such that
$$E[K] 2^{-j} + j + 2 < E[K] 2^{-(j+1)} + (j + 1) + 2$$
$$E[K] 2^{-(j+1)} < 1$$
The smallest j that satisfies above is $j = \lfloor \log_2 E[K] \rfloor$
We can show that (homework?)
$$\log_2 E[K] + 2.914 \le E[\text{overhead}] \le \log_2 E[K] + 3$$
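The choice of j and the resulting bounds can be checked numerically. The sketch below uses the approximate overhead formula from these slides. The names `expected_overhead` and `best_j` are hypothetical, and the frame length in the usage note is an arbitrary example.

```python
import math

def expected_overhead(mean_len, j):
    """E[K] 2^-j stuffed bits plus j + 2 flag bits."""
    return mean_len * 2.0 ** (-j) + j + 2

def best_j(mean_len):
    """Smallest j with E[K] 2^-(j+1) < 1, i.e. floor(log2 E[K])."""
    return math.floor(math.log2(mean_len))
```

For E[K] = 1000 bits, `best_j` gives 9 and the overhead is about 12.95 bits, which lies between log2(1000) + 2.914 and log2(1000) + 3.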
Can we do better?
Length field based framing and bit oriented framing are comparable in their overhead: roughly $\log_2 K$ bits, where K is the length of the frame
Can we do better? Information theory tells us NO.
Essentially, we are encoding information about the length of the frame at the sending DLC and
transmitting it to the receiving DLC
At least we need a number of bits equal to the entropy
$$H = \sum_{k=1}^{K_{max}} p(k) \log_2 \frac{1}{p(k)}$$
When distribution is uniform, i.e. $p(k) = \frac{1}{K_{max}}$, $H = \log_2 K_{max}$.
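The entropy bound is easy to evaluate for a given length distribution. A minimal sketch follows; the function name `entropy_bits` is hypothetical, and the distribution is passed as a plain list of probabilities.

```python
import math

def entropy_bits(probs):
    """H = sum of p(k) log2(1 / p(k)), skipping lengths with zero probability."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

# Uniform distribution over Kmax = 1024 possible frame lengths.
uniform = [1 / 1024] * 1024
```

With the uniform distribution above, `entropy_bits(uniform)` equals log2(1024) = 10 bits. Any non-uniform distribution over the same lengths gives a smaller value.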
Error detection
All framing techniques are sensitive to errors
error in DLE ETX
error in length field (re-sync needed)
error in flag
error in data itself
Flag approach is least sensitive to errors because a flag will eventually appear
flag ruined (frame disappears)
flag created by error (extra frame appears)
the only thing is that an erroneous packet/frame is created
but this can be removed by error detection techniques
Error detection is used by receiving DLC to determine if a frame contains errors
If frame contains errors, receiver requires the transmitter to resend the frame
How to detect errors?
The problem (simply stated)
Assume that DLC knows where frames begin and end (solved earlier)
Determine which frames contain errors
Error cannot be detected by analyzing the packet itself (why?)
Therefore, extra bits must be used
Parity check
single parity
multiple parity (e.g. horizontal and vertical)
Cyclic Redundancy Check CRC
Single parity check
Add one parity bit
Parity bit is 1 if frame contains ODD number of 1s, and 0 otherwise
e.g. 1001010 1
e.g. 0111010 0
Therefore, a frame always contains an even number of 1s
Receiver counts number of 1s
ODD number of 1s: an error must have occurred
EVEN number of 1s: interpret as no error (why?)
even number of errors cannot be detected!
$$p(\text{undetected error}) = \sum_{i \text{ even},\ i \ge 2} \binom{k}{i} p^i (1 - p)^{k - i}$$
assuming independent errors (simplification), where k is the length of the frame and p is the probability of a bit error (binary symmetric channel)
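The single parity check and its undetected-error probability can be sketched as follows. The names `parity_bit`, `looks_error_free`, and `p_undetected` are hypothetical, frames are strings of '0'/'1' characters, and bit errors are assumed independent as on the slide.

```python
from math import comb

def parity_bit(data: str) -> str:
    """1 if the data contains an odd number of 1s, 0 otherwise."""
    return str(data.count("1") % 2)

def looks_error_free(frame: str) -> bool:
    """Receiver check: a valid frame has an even number of 1s."""
    return frame.count("1") % 2 == 0

def p_undetected(k: int, p: float) -> float:
    """Probability of a nonzero even number of bit errors in a k-bit frame."""
    return sum(comb(k, i) * p**i * (1 - p) ** (k - i) for i in range(2, k + 1, 2))
```

The two examples from the slide give `parity_bit("1001010") == "1"` and `parity_bit("0111010") == "0"`. Flipping one bit of a valid frame fails the check, while flipping two bits passes it.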
Horizontal and vertical parity checks
Data is visualized as a rectangular array
1 0 0 1 0 1 0
0 1 1 1 0 1 0
1 1 1 0 0 0 1
1 0 0 0 1 1 1
0 0 1 1 0 0 1
Parity bit is computed for every row and every column
If an even number of errors is confined to a single row, each of them can be detected by the corresponding column parity checks (and vice-versa)
Horizontal and vertical parity checks
Data is visualized as a rectangular array
1 0 0 1 0 1 0 | 1
0 1 1 1 0 1 0 | 0
1 1 1 0 0 0 1 | 0
1 0 0 0 1 1 1 | 0
0 0 1 1 0 0 1 | 1
--------------+--
1 0 1 1 1 1 1 | 0
last column: horizontal checks
last row: vertical checks
corner bit: always consistent with both checks (why?) (addition modulo 2)
Parity bit is computed for every row and every column
If an even number of errors is confined to a single row, each of them can be detected by the corresponding column parity checks (and vice-versa)
Some errors are still undetected
e.g. any 4 errors forming a rectangle
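The row and column checks, and the rectangle of four undetected errors, can be reproduced with a short sketch. The names `add_parity` and `is_consistent` are hypothetical, and `DATA` is the 5 x 7 data array from the slide.

```python
DATA = [
    [1, 0, 0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [1, 1, 1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 1],
]

def add_parity(rows):
    """Append a horizontal check to every row, then a row of vertical checks."""
    with_row = [r + [sum(r) % 2] for r in rows]
    col_checks = [sum(col) % 2 for col in zip(*with_row)]
    return with_row + [col_checks]

def is_consistent(block):
    """True if every row and every column has an even number of 1s."""
    rows_ok = all(sum(r) % 2 == 0 for r in block)
    cols_ok = all(sum(c) % 2 == 0 for c in zip(*block))
    return rows_ok and cols_ok
```

Flipping two bits in the same row of `add_parity(DATA)` makes two column checks fail. Flipping the four corners of any rectangle leaves every row and column with even parity, so the errors go undetected.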
Arbitrary parity check codes
Parity check code is simply addition modulo 2
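$s_1 s_2 \ldots s_K$ (K-bit frame) followed by $c_1 c_2 \ldots c_L$ (L-bit parity check)
every $c_i$ is the sum of some bits in $s_1 \ldots s_K$
$$c_i = \sum_{j=1}^{K} \alpha_{ij} s_j$$
where $\alpha$ is an $L \times K$ 0-1 matrix
Example (K = 3, L = 4):
s1 s2 s3 | c1 c2 c3 c4
0  0  0  | 0  0  0  0
0  0  1  | 1  1  0  1
0  1  0  | 0  1  1  1
0  1  1  | 1  0  1  0
1  0  0  | 1  1  1  0
1  0  1  | 0  0  1  1
1  1  0  | 1  0  0  1
1  1  1  | 0  1  0  0
$c_1 = s_1 + s_3$
$c_2 = s_1 + s_2 + s_3$
$c_3 = s_1 + s_2$
$c_4 = s_2 + s_3$
$$\alpha = \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}$$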
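The example code can be regenerated from its 0-1 matrix with a matrix-vector product modulo 2. In the sketch below, `ALPHA` holds the 4 x 3 matrix of the example (one row per check bit) and `parity_checks` is a hypothetical helper name.

```python
# Row i lists which data bits s1, s2, s3 enter check bit c_(i+1).
ALPHA = [
    [1, 0, 1],  # c1 = s1 + s3
    [1, 1, 1],  # c2 = s1 + s2 + s3
    [1, 1, 0],  # c3 = s1 + s2
    [0, 1, 1],  # c4 = s2 + s3
]

def parity_checks(s, alpha=ALPHA):
    """Return [c1, ..., cL] with c_i = sum_j alpha_ij * s_j (modulo 2)."""
    return [sum(a * b for a, b in zip(row, s)) % 2 for row in alpha]
```

Calling `parity_checks` on each of the eight 3-bit inputs reproduces the eight rows of the codeword table.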
Effectiveness of a code
[Figure: codeword = K data bits followed by L check bits]
The effectiveness of a code is usually measured by three parameters
minimum distance of the code d: the smallest number of errors that can convert one codeword into another
burst detecting capability
burst = number of bits from first error to last error (inclusive)
defined as: largest integer B such that a code can detect all bursts of length $\le B$
probability that a random string is accepted as error free
useful when framing is lost, e.g. check code is random with respect to received frame
We have $2^K$ codewords (why?) and $2^{K+L}$ random strings
therefore, the probability is $2^{-L}$
Effectiveness of a code (cont.)
Minimum distance d
single parity:
horz. and vert. parity:
Burst detecting capability B
single parity:
horz. and vert. parity (assumes data sent by rows):
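For small codes, the minimum distance can be found by brute force: enumerate every codeword and take the smallest Hamming distance over all pairs. The sketch below does this for a single parity code and for a small horizontal and vertical parity code. All names are hypothetical, and the data sizes are kept tiny so that the enumeration stays cheap.

```python
from itertools import combinations, product

def single_parity(data):
    """Data bits (a tuple) followed by one parity bit."""
    return data + (sum(data) % 2,)

def row_col_parity(data, cols):
    """Data arranged in rows of `cols` bits, with row and column parity added."""
    rows = [data[i:i + cols] for i in range(0, len(data), cols)]
    block = [r + (sum(r) % 2,) for r in rows]
    block.append(tuple(sum(c) % 2 for c in zip(*block)))
    return tuple(b for r in block for b in r)

def min_distance(encode, k):
    """Smallest Hamming distance between two distinct codewords of a k-bit code."""
    words = [encode(d) for d in product((0, 1), repeat=k)]
    return min(sum(a != b for a, b in zip(u, v)) for u, v in combinations(words, 2))
```

Typical calls are `min_distance(single_parity, 4)` and `min_distance(lambda d: row_col_parity(d, 2), 4)`, which can be compared with the values worked out by hand.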
Cyclic Redundancy Check
For convenience, denote the data bits as
$s_{K-1}, s_{K-2}, \ldots, s_0$
Represent the string as a polynomial
$$s(x) = s_{K-1} x^{K-1} + s_{K-2} x^{K-2} + \ldots + s_1 x + s_0$$
Similarly, we can represent the CRC (with L bits) as
$$c(x) = c_{L-1} x^{L-1} + c_{L-2} x^{L-2} + \ldots + c_1 x + c_0$$
The whole frame can be represented as a polynomial
$$f(x) = s(x) x^L + c(x) = s_{K-1} x^{L+K-1} + \ldots + s_0 x^L + c_{L-1} x^{L-1} + \ldots + c_0$$
Why this polynomial representation? Because we are going to obtain c as c(x) by dividing $s(x) x^L$ by some polynomial g(x)
Obtaining c(x)
We know $s_i$ (data) for $i = 0 \ldots K-1$
How do we compute $c_i$ (CRC) for $i = 0 \ldots L-1$?
let $g(x) = x^L + g_{L-1} x^{L-1} + \ldots + g_1 x + 1$ be given ($g_L = g_0 = 1$)
then
$$c(x) = \text{Remainder}\left[ \frac{s(x) x^L}{g(x)} \right]$$
- division modulo 2
result is a polynomial of degree at most $L-1$ ⇒ L bits
Example: s = 101 (K = 3) and $g(x) = x^3 + x^2 + 1$ (L = 3)
$s(x) = ?$
$s(x) x^L = ?$
divide $s(x) x^L$ by g(x) and obtain remainder (I'll do it on the board?)
Another example
s = 110101
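$g(x) = x^3 + 1$ (L = 3)
$s(x) = x^5 + x^4 + x^2 + 1$
$s(x) x^L = x^8 + x^7 + x^5 + x^3$
Long division of $s(x) x^L$ by $g(x) = x^3 + 1$ (modulo 2), with quotient $x^5 + x^4 + x + 1$:
$x^8 + x^7 + x^5 + x^3$
$x^8 + x^5$ (subtract $x^5 g(x)$)
$x^7 + x^3$
$x^7 + x^4$ (subtract $x^4 g(x)$)
$x^4 + x^3$
$x^4 + x$ (subtract $x\, g(x)$)
$x^3 + x$
$x^3 + 1$ (subtract $g(x)$)
$x + 1$ (remainder)
$c(x) = 0 \cdot x^2 + 1 \cdot x + 1$ (L = 3) ⇒ c = 011
frame: 110101 011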
Using bits only
s = 110101
g(x) = x3 + 1 (L = 3)
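g = 1 0 0 1 (coefficients of $x^3, x^2, x^1, x^0$)
$s(x) x^L$ = 1 1 0 1 0 1 0 0 0 (coefficients of $x^8, \ldots, x^0$; the last L = 3 bits are the appended zeros)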
Using bits only (cont.)
                1 1 0 0 1 1          (quotient)
1 0 0 1 ) 1 1 0 1 0 1 0 0 0
          1 0 0 1
            1 0 0 0
            1 0 0 1
              0 0 1 1
              0 0 0 0
                0 1 1 0
                0 0 0 0
                  1 1 0 0
                  1 0 0 1
                    1 0 1 0
                    1 0 0 1
                      0 1 1          (remainder = CRC)
At each step:
Multiply g by first bit
Add
Shift
Can be implemented using a feedback shift register
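The multiply, add, shift procedure can be sketched as a bitwise long division. The name `crc_remainder` is hypothetical; data and generator are strings of '0'/'1' characters, most significant bit first.

```python
def crc_remainder(data: str, gen: str) -> str:
    """Remainder of s(x) x^L divided by g(x), modulo 2, as an L-bit string."""
    L = len(gen) - 1
    work = [int(b) for b in data] + [0] * L   # s(x) x^L: append L zeros
    g = [int(b) for b in gen]
    for i in range(len(data)):
        if work[i]:                           # quotient bit is 1: add g here
            for j, gj in enumerate(g):
                work[i + j] ^= gj             # addition modulo 2 is XOR
        # quotient bit 0: adding 0000 changes nothing; just move on (shift)
    return "".join(str(b) for b in work[-L:])
```

For s = 110101 and g = 1001 the function returns 011, matching the worked division. For the earlier exercise (s = 101, g = 1101) it returns 110.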
Feedback shift register
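[Figure: feedback shift register with feedback taps g0, g1, g2, ..., gL-2, gL-1; register stages labeled sK-L, sK-L+1, ..., sK-2, sK-1; remaining input bits s0 ... sK-L-1]
Register is initialized with first L bits of s
After K shifts, switch is moved and CRC is read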
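A software analogue of the register can follow the two rules above: preload the first L data bits, then perform K shifts. This is a sketch under assumptions, not a description of the exact circuit in the figure: the name `crc_lfsr` is hypothetical, the remaining data bits followed by L zeros are taken as the shift input, and K >= L is required.

```python
def crc_lfsr(data: str, gen: str) -> str:
    """CRC via a simulated L-stage feedback shift register (requires K >= L)."""
    L = len(gen) - 1
    taps = [int(b) for b in gen[1:]]                 # g_{L-1}, ..., g_0
    reg = [int(b) for b in data[:L]]                 # preload first L bits of s
    stream = [int(b) for b in data[L:]] + [0] * L    # K input bits in total
    for bit in stream:
        feedback = reg[0]                            # bit about to be shifted out
        reg = reg[1:] + [bit]                        # shift by one position
        if feedback:
            reg = [r ^ t for r, t in zip(reg, taps)] # add g (modulo 2)
    return "".join(str(b) for b in reg)
```

For s = 110101 and g = 1001 the register ends in state 011 after K = 6 shifts, and the successive feedback bits 1, 1, 0, 0, 1, 1 are the quotient bits of the long division.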
How does c(x) help?
$$s(x) x^L = g(x) z(x) + c(x)$$
$$s(x) x^L + c(x) = g(x) z(x) + \underbrace{c(x) + c(x)}_{0} \quad \text{(modulo 2)}$$
$$s(x) x^L + c(x) = g(x) z(x)$$
$$f(x) = g(x) z(x)$$
Polynomial representation of the frame is a multiple of g(x)
Assume f(x) is received as y(x)
Receiver DLC computes $\text{Remainder}\left[ \frac{y(x)}{g(x)} \right]$
if remainder is not zero ⇒ error in frame
if remainder is zero, declare the frame error free
Undetected errors
Assume error is e(x), i.e. $y(x) = f(x) + e(x)$
Then, $\frac{y(x)}{g(x)} = \frac{f(x)}{g(x)} + \frac{e(x)}{g(x)}$
Therefore, we have undetected errors iff $e(x) \neq 0$ is divisible by g(x)
Single errors are always detected
assume undetected, i.e. $e(x) = x^i = g(x) z(x)$ for some i
since $g(x) = x^L + \ldots + 1$, multiplying g(x) by any $z(x) \neq 0$ cannot produce $x^i$ (must produce at least 2 terms)
g(x) can be chosen such that
all odd numbers of errors are detected
all double errors are detected (if $K + L < 2^L$)
therefore, minimum distance d = 4
burst detecting capability B = L
probability of random string accepted is $2^{-L}$
e.g. (L=16) $g(x) = x^{16} + x^{15} + x^2 + 1$ CRC-16
e.g. (L=16) $g(x) = x^{16} + x^{12} + x^5 + 1$ CRC-CCITT
e.g. (L=32) g(x) = ... (see book page 64)
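The receiver's test and the condition for undetected errors can be tried on the small example frame 110101 011 with g = 1001. The sketch below is illustrative only: `remainder` and `flip` are hypothetical helper names, and frames and error patterns are strings of '0'/'1' characters.

```python
def remainder(bits: str, gen: str) -> str:
    """Remainder of the received polynomial y(x) divided by g(x), modulo 2."""
    L = len(gen) - 1
    work = [int(b) for b in bits]
    g = [int(b) for b in gen]
    for i in range(len(work) - L):
        if work[i]:
            for j, gj in enumerate(g):
                work[i + j] ^= gj
    return "".join(str(b) for b in work[-L:])

def flip(frame: str, error: str) -> str:
    """y(x) = f(x) + e(x): XOR the frame with an error pattern of equal length."""
    return "".join(str(int(a) ^ int(b)) for a, b in zip(frame, error))

FRAME, GEN = "110101011", "1001"
```

The error-free frame leaves remainder 000, and every single-bit error leaves a nonzero remainder. An error pattern equal to g(x) itself, such as 000001001, is a multiple of g(x) and leaves remainder 000, so it goes undetected.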