
Communication Theory II

Lecture 7: Source Coding Theorem, Huffman Coding


Source Coding Theorem
• An important problem in communication is the efficient
representation of data generated by a discrete source.
• The device that performs the representation is called a
source encoder.
• A binary code encodes each character as a
binary string or codeword.
• We would like to find a binary code that encodes
the file using as few bits as possible, i.e.,
compresses it as much as possible.
• In a fixed-length code each codeword has the
same length.
• In a variable-length code codewords may have
different lengths.
Example

• Suppose that we have a 100,000-character data file that we wish to store. The file contains only 6 characters, appearing with the following frequencies (in thousands): 45, 13, 12, 16, 9, and 5.
Examples of fixed- and variable-length codes

The fixed-length code requires 300,000 bits to store the file.

The variable-length code uses only (45·1 + 13·3 + 12·3 + 16·3 + 9·4 + 5·4) · 1000 = 224,000 bits, saving a lot of space!
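These totals are easy to verify; a minimal Python sketch, with the frequencies and codeword lengths taken from the computation above:

```python
# Frequencies (in thousands of characters) and codeword lengths (bits),
# taken from the computation above.
freqs_thousands = [45, 13, 12, 16, 9, 5]
variable_lengths = [1, 3, 3, 3, 4, 4]

fixed_bits = sum(freqs_thousands) * 1000 * 3     # 3 bits for every character
variable_bits = sum(f * 1000 * l for f, l in zip(freqs_thousands, variable_lengths))

print(fixed_bits, variable_bits)   # 300000 224000
```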

Can we do better?
Code

• A code is a set of codewords, e.g., {000, 001, 010, 011, 100, 101} and {0, 101, 100, 111, 1101, 1100}.
Example
Morse code is a method of encoding text characters as
sequences of two different signal durations, such as short
and long signals, dots and dashes, or "dits" and "dahs." It
was developed in the 1830s and 1840s by Samuel Morse
and Alfred Vail for use with their electric telegraph system.
In Morse code, each letter of the alphabet, as well as
numbers and some punctuation marks, is represented by a
unique combination of short and long signals. The duration
of a short signal is typically referred to as a "dot" or a
"dit," while the duration of a long signal is called a "dash"
or a "dah." The pauses between signals within a letter are
short, and the pauses between letters and words are
longer.

More probable letters have shorter codes: compare letter E with letter Q, or letter A with letter J.

Morse code - Wikipedia


Encoding

• Encoding replaces each character of the message by its codeword.

Example: Γ = {a, b, c, d}

If the code is C1 = {a = 00, b = 01, c = 10, d = 11}, then "bad" is encoded into 010011.

If the code is C3 = {a = 1, b = 110, c = 10, d = 111}, then "bad" is encoded into 1101111.
Decoding
• Given an encoded message, decoding is the process of turning it back into the original message.

C1 = {a = 00, b = 01, c = 10, d = 11}

For example, relative to C1, 010011 is uniquely decodable to "bad".

C3 = {a = 1, b = 110, c = 10, d = 111}

But, relative to C3, 1101111 is not uniquely decipherable, since it could have encoded either "bad" or "acad".
Codeword length and entropy

Average codeword length: $\bar{L} = \sum_{k} p_k l_k$, where $p_k$ is the probability of symbol $s_k$ and $l_k$ is its encoded length (in bits).

Coding efficiency: $\eta = L_{\min} / \bar{L}$, where $L_{\min}$ is the minimum possible value of $\bar{L}$.
Codeword length and entropy

Shannon's source coding theorem: for a discrete memoryless source of entropy $H(S)$, the average codeword length of any uniquely decodable code satisfies $\bar{L} \geq H(S)$; hence $L_{\min} = H(S)$ and the coding efficiency is $\eta = H(S)/\bar{L}$.

This means that the average number of bits per source symbol must be greater than or equal to the entropy of the source. It implies that to achieve efficient compression, the average number of bits per symbol should be close to the entropy value.

Entropy therefore provides a theoretical lower bound on the average number of bits per symbol in an optimal coding scheme, and it serves as a fundamental measure of the information content of the source.
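A quick numerical check of the bound, using the six-symbol file example from earlier (the probabilities are the frequencies divided by 100,000, and the codeword lengths are those of the variable-length code):

```python
import math

# Probabilities from the 100,000-character file example (frequencies
# 45, 13, 12, 16, 9 and 5 thousand) and the variable-length code's bit lengths.
probs = [0.45, 0.13, 0.12, 0.16, 0.09, 0.05]
lengths = [1, 3, 3, 3, 4, 4]

H = -sum(p * math.log2(p) for p in probs)            # entropy, bits/symbol
L_bar = sum(p * l for p, l in zip(probs, lengths))   # average codeword length

print(round(H, 2), round(L_bar, 2))   # 2.22 2.24  (L_bar >= H, as the theorem requires)
print(round(H / L_bar, 2))            # 0.99 -- coding efficiency close to 1
```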
Data compression
➢ Data compression reduces data size while maintaining
essential information.
➢ Entropy measures the average information or uncertainty in
a source.
➢ Entropy indicates compressibility and redundancy within the
data.
➢ The average number of bits per symbol should be close to
entropy for efficient compression.
➢ Coding schemes assign shorter codes to more frequent
symbols and longer codes to less frequent ones.
➢ Shannon's source coding theorem establishes a lower
bound: average bits per symbol ≥ entropy.
➢ Practical compression algorithms like Huffman coding aim to
achieve close-to-entropy compression.
Prefix Code
• A code is called a prefix code if no codeword is a prefix of any other codeword.

• A prefix code has the important property that it is always uniquely decodable.

In a prefix code, each symbol is represented by a unique binary codeword. The key characteristic of a prefix code is that no codeword is a prefix (initial segment) of another codeword. This property guarantees unambiguous decoding because when we encounter a sequence of bits during decoding, we can determine the corresponding symbol without the need for lookahead or further examination.
Prefix Code- example

Symbol A: 0    Symbol B: 10    Symbol C: 110    Symbol D: 111

If we encounter the bit sequence "10" during decoding, we can be certain that it corresponds to symbol B, because "10" is a complete codeword and no other codeword begins with it. We don't need to look ahead or check for any additional bits.

Symbol A: 0    Symbol B: 10    Symbol C: 100

In this case, if we encounter the bit sequence "10", we can't determine the symbol yet, because "10" could either represent symbol B or be the prefix of another codeword (here, "100" for symbol C). The code lacks the prefix property, leading to ambiguity during decoding.
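The "no lookahead" property translates directly into a decoder that emits a symbol as soon as the buffered bits form a codeword; a minimal sketch for the prefix code A/B/C/D above:

```python
prefix_code = {"A": "0", "B": "10", "C": "110", "D": "111"}
inverse = {cw: sym for sym, cw in prefix_code.items()}

def decode(bits: str) -> str:
    """Decode left to right, emitting a symbol as soon as the buffered
    bits form a complete codeword -- safe only because the code is prefix-free."""
    out, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:        # complete codeword, no lookahead needed
            out.append(inverse[buffer])
            buffer = ""
    return "".join(out)

print(decode("10"))           # B
print(decode("0100110111"))   # ABACD
```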

Prefix codes, such as Huffman codes, are widely used in various data compression applications because they provide efficient and reliable compression while ensuring clear and unambiguous decoding.
Optimum Source Coding Problem
The objective is to find a binary prefix code that minimizes the average number of bits required to encode symbols from a given alphabet, based on their frequency distribution.

• The problem: given an alphabet A = {a1, . . . , an} with frequency distribution f(ai), find a binary prefix code C for A that minimizes the average number of bits needed to encode the symbols.

• Start with the given alphabet A and its corresponding frequency distribution f(ai).

• Sort the symbols in A based on their frequencies in non-decreasing order. The symbol with the lowest frequency will have the longest codeword, and the symbol with the highest frequency will have the shortest codeword.
Huffman Code

• Huffman developed a nice greedy algorithm for solving this problem and producing a minimum-cost (optimum) prefix code. The code that it produces is called a Huffman code.
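A minimal sketch of the greedy procedure in Python, building codewords bottom-up with a binary heap (the probabilities in the usage line are those of the six-symbol file example; the exact bit patterns depend on tie-breaking, but the codeword lengths are optimal):

```python
import heapq
from itertools import count

def huffman_code(probabilities: dict) -> dict:
    """Build a binary prefix code by repeatedly merging the two least
    probable entries (Huffman's greedy algorithm)."""
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(p, next(tie), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)

    if len(heap) == 1:                        # degenerate single-symbol source
        return {sym: "0" for sym in heap[0][2]}

    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap)   # least probable subtree
        p2, _, codes2 = heapq.heappop(heap)   # second least probable subtree
        # Prepend '0' to every codeword of one subtree and '1' to the other,
        # then push the merged subtree back with the summed probability.
        merged = {sym: "0" + cw for sym, cw in codes1.items()}
        merged.update({sym: "1" + cw for sym, cw in codes2.items()})
        heapq.heappush(heap, (p1 + p2, next(tie), merged))

    return heap[0][2]

# Usage with the probabilities of the earlier six-symbol file example:
print(huffman_code({"a": 0.45, "b": 0.13, "c": 0.12,
                    "d": 0.16, "e": 0.09, "f": 0.05}))
```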
Example of Huffman Coding

Codewords are obtained by assigning '0' to the left branch and '1' to the right branch at each node of the tree.
Example of Huffman Coding – Continued

(Intermediate tree-construction steps: the two least probable symbols are merged repeatedly until a single tree remains.)
Example of Huffman Coding
Calculate the average code length $\bar{L} = \sum_k p_k l_k$ and the entropy $H(S) = -\sum_k p_k \log_2 p_k$, where $p_k$ is the probability of symbol $s_k$ and $l_k$ is its encoded (codeword) length.
Huffman Coding
➢ Is coding unique?

• The Huffman encoding process is not unique due to two variations in the process.

The first variation is in the assignment of '0' and '1' to the last two source symbols during the splitting stage. However, the resulting differences are trivial.

The second variation occurs when the probability of a combined symbol equals another probability in the list. Different placements can result in code words of different lengths, but the average code-word length remains the same.
Huffman Coding- variance of the average code-word length
Variance of the average code-word length: $\sigma^2 = \sum_k p_k (l_k - \bar{L})^2$, where $\bar{L}$ is the average codeword length, $l_k$ is the length of the code word for symbol $s_k$, and $p_k$ is its probability.
• When a combined symbol is moved as high as possible during the
Huffman coding process, the resulting Huffman code tends to have a
significantly smaller variance compared to when it is moved as low as
possible.
• Based on this observation, it is reasonable to choose the former
Huffman code (combined symbol moved as high as possible) over the
latter (combined symbol moved as low as possible) to reduce the
variability in code-word lengths.
Huffman Coding
Weaknesses

• Data with uniform probabilities (equal frequencies): the overhead of storing the Huffman tree or codebook can outweigh the benefits of compression. In such cases, the compression achieved by Huffman coding might not be significant.

• Sensitivity to input distribution: the effectiveness of Huffman coding heavily depends on the frequency distribution of the input data. If the distribution changes significantly, the entire Huffman tree must be recomputed, which might not be practical in real-time scenarios or streaming data.

• Encoding and decoding complexity: constructing the Huffman tree and encoding data can be computationally intensive, especially for large alphabets or data streams. While decoding is generally efficient, the encoding process requires traversing the tree for each character to obtain its corresponding code.
Huffman Coding

Suppose we have the following string of characters that we want to encode using a Huffman code:

"ABRACADABRA"

What is the bit-length ratio of the Huffman code to the following fixed-length (3 bits per character) coding?
A -> 000
B -> 001
C -> 010
D -> 011
R-> 100
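One way to check the ratio; a sketch that tracks only codeword lengths, since every Huffman tree for these frequencies yields the same total bit count:

```python
import heapq
from collections import Counter
from itertools import count

def huffman_lengths(freqs: dict) -> dict:
    """Codeword length per symbol, from Huffman's merging procedure."""
    tie = count()
    heap = [(f, next(tie), {sym: 0}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in d1.items()}   # both subtrees go one level deeper
        merged.update({s: depth + 1 for s, depth in d2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]

text = "ABRACADABRA"
freqs = Counter(text)                        # A:5, B:2, R:2, C:1, D:1
lengths = huffman_lengths(freqs)

huffman_bits = sum(freqs[s] * lengths[s] for s in freqs)   # 23 bits
fixed_bits = 3 * len(text)                                 # 33 bits
print(huffman_bits, fixed_bits, round(huffman_bits / fixed_bits, 2))  # 23 33 0.7
```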
Example

Letter:       S0     S1     S2     S3     S4
Probability:  0.55   0.15   0.15   0.10   0.05

Compute two different Huffman codes for this alphabet. In one case, move a combined symbol in the coding procedure as high as possible, and in the second case, move it as low as possible. Hence, for each of the two codes, find the average code-word length and the variance of the average code-word length over the ensemble of letters.

Steps:
1. Generate the Huffman code.
2. Calculate the average code length.
3. Calculate the variance of the average code-word length.
Example

1. Compute the Huffman code for this source, moving a “combined” symbol as low as possible.
2. Compute the Huffman code for this source, moving a “combined” symbol as high as possible.
Comparison

Combined symbol moved as high as possible: average code length = 1.9, variance of the average code-word length = 0.99.
Combined symbol moved as low as possible: average code length = 1.9, variance of the average code-word length = 1.29.

Both codes have the same average code length but different variance values. Based on this observation, it is reasonable to choose the former Huffman code (combined symbol moved as high as possible) over the latter (combined symbol moved as low as possible) to reduce the variability in code-word lengths.
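The figures can be reproduced from the codeword lengths of the two trees. Assuming the high-placement tree assigns lengths (1, 3, 3, 3, 3) and the low-placement tree (1, 2, 3, 4, 4) to S0 through S4 (an assumption about the trees, chosen because it reproduces the numbers quoted above):

```python
probs = [0.55, 0.15, 0.15, 0.10, 0.05]
lengths_high = [1, 3, 3, 3, 3]   # combined symbol moved as high as possible (assumed tree)
lengths_low  = [1, 2, 3, 4, 4]   # combined symbol moved as low as possible (assumed tree)

def stats(probs, lengths):
    L_bar = sum(p * l for p, l in zip(probs, lengths))               # average length
    var = sum(p * (l - L_bar) ** 2 for p, l in zip(probs, lengths))  # variance
    return round(L_bar, 2), round(var, 2)

print(stats(probs, lengths_high))  # (1.9, 0.99)
print(stats(probs, lengths_low))   # (1.9, 1.29)
```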
In Huffman coding, the tree construction process involves iteratively combining symbols with the lowest
probabilities until all symbols are combined into a single tree. During this process, there is flexibility in the
order of combining the symbols, which can result in different code-word lengths.

For the former Huffman code (combined symbol moved as high as possible):
When the combined symbol (formed by merging two least probable symbols) is placed higher up in the tree, it
will have a shorter code length than if it were placed lower down. This is because, in a binary tree, the depth
of the nodes determines the code length. Placing the combined symbol higher up means it will be closer to the
root, resulting in a shorter code length.

For the latter Huffman code (combined symbol moved as low as possible):
When the combined symbol is placed lower down in the tree, it will have a longer code length compared to if it
were placed higher up, as it will be farther from the root in the binary tree structure.

Since the former Huffman code places combined symbols higher up in the tree, it tends to result in shorter
code lengths for the most probable symbols. This leads to a Huffman code with smaller variance in code-word
lengths because the difference between the longest and shortest code lengths is minimized. Consequently, the
average code-word length remains closer to the optimal value, resulting in a more efficient compression
scheme.

In summary, choosing the former Huffman code with combined symbols moved as high as possible reduces
the variability in code-word lengths, resulting in a more balanced and efficient encoding, compared to the
latter Huffman code where combined symbols are moved as low as possible.
Example
A discrete memoryless source has an alphabet of seven symbols whose
probabilities of occurrence are as described here:

Compute the Huffman code for this source, moving a “combined” symbol as high as possible. Explain why the computed source code has an efficiency of 100 percent.
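A compact way to see when 100 percent efficiency is possible, assuming each symbol probability in this source is a negative integer power of two, so that the Huffman procedure can assign each symbol $s_k$ a codeword of exactly $l_k = \log_2(1/p_k)$ bits:

$\bar{L} = \sum_k p_k l_k = \sum_k p_k \log_2 \frac{1}{p_k} = H(S) \quad\Rightarrow\quad \eta = \frac{H(S)}{\bar{L}} = 1 = 100\%$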
