QUANTUM Series
Even Semester Session
+ Topic-wise coverage of entire syllabus in Question-Answer form.
+ Short Questions (2 Marks)
PUBLISHED BY : Apram Singh
Quantum Publications
(A Unit of Quantum Page Pvt. Ltd.)
Plot No. 59/2/7, Site - 4, Industrial Area, Sahibabad, Ghaziabad-201 010
Phone : 0120-4160879
Email : [email protected]
Delhi Office : 1/6590, East Rohtas Nagar, Delhi

© All Rights Reserved
No part of this publication may be reproduced or transmitted, in any form or by any means, without permission.
Information contained in this work is derived from sources believed to be reliable. Every effort has been made to ensure accuracy; however, neither the publisher nor the authors guarantee the accuracy or completeness of any information published herein, and neither the publisher nor the authors shall be responsible for any errors or omissions.

4th Edition : 2014-15
5th Edition : 2015-16
6th Edition : 2016-17
7th Edition : 2017-18
8th Edition : 2018-19
9th Edition : 2019-20

Price : Rs. 110/- only
Printed at : Bala...

CONTENTS
RCS 087 : DATA COMPRESSION

UNIT-1 : INTRODUCTION (4D - 27D)
Compression Techniques : Lossless Compression, Lossy Compression, Measures of Performance, Modeling and Coding. Mathematical Preliminaries for Lossless Compression : A Brief Introduction to Information Theory, Models : Physical Models, Probability Models, Markov Models, Composite Source Model, Coding : Uniquely Decodable Codes, Prefix Codes.

UNIT-2 : HUFFMAN CODING (28D - 61D)
The Huffman Coding Algorithm : Minimum Variance Huffman Codes, Adaptive Huffman Coding, Golomb Codes, Rice Codes, Tunstall Codes, Applications of Huffman Coding.

UNIT-3 : ARITHMETIC CODING (62D - 106D)
Coding a Sequence, Generating a Binary Code, Comparison of Binary and Huffman Coding, Dictionary Techniques, The LZ77 Approach, Applications : UNIX Compress, Image Compression : The Graphics Interchange Format (GIF), Compression over Modems : V.42 bis, Predictive Coding : Prediction with Partial Match (PPM), The Burrows-Wheeler Transform, Dynamic Markov Compression.

UNIT-4 : MATHEMATICAL PRELIMINARIES FOR LOSSY CODING (107D - 135D)
Distortion Criteria, Models, Scalar Quantization : The Quantization Problem, Uniform Quantizer, Adaptive Quantization, Non-uniform Quantization.

UNIT-5 : VECTOR QUANTIZATION (136D - 146D)
Advantages of Vector Quantization over Scalar Quantization, The Linde-Buzo-Gray Algorithm, Tree-structured Vector Quantizers, Structured Vector Quantizers.

SHORT QUESTIONS (147D - 165D)
SOLVED PAPERS (2011-12 TO 2018-19) (166D - 176D)

Introduction
Part-1 (5D - 10D)
* Compression Techniques : Lossless Compression and Lossy Compression
* Measures of Performance
* Modeling and Coding
A. Concept Outline : Part-1 ............................................. 5D
B. Long and Medium Answer Type Questions ................ 5D

Part-2 (10D - 22D)
* Mathematical Preliminaries for Lossless Compression : A Brief Introduction to Information Theory
* Models : Physical Models, Probability Models, Markov Models, Composite Source Model
A. Concept Outline : Part-2 ............................................. 11D
B. Long and Medium Answer Type Questions ................ 11D

Part-3 (22D - 27D)
* Coding : Uniquely Decodable Codes, Prefix Codes
A. Concept Outline : Part-3 ............................................. 22D
B. Long and Medium Answer Type Questions ................ 22D
PART-1
Compression Techniques : Lossless Compression and Lossy Compression, Measures of Performance, Modeling and Coding.
CONCEPT OUTLINE : PART-1
* Data compression is the art or science of representing information in a compact form.
* Lossless compression involves no loss of information.
* Lossy compression involves some loss of information, and the data cannot be reconstructed exactly the same as the original.
* Modeling and coding are the two phases in the development of any data compression algorithm for a variety of data.
Questions-Answers
Long Answer Type and Medium Answer Type Questions
Que 1.1. What is data compression and why do we need it ? Explain compression and reconstruction with the help of a block diagram.
UPTU 2013-14, Marks 05
UPTU 2016-16, Marks 02
UPTU 2015-16, Marks 10
Answer
1. In computer science and information theory, data compression is the process of encoding information using fewer bits than an unencoded representation would use, through the use of specific encoding schemes.
2. It is the art or science of representing information in a compact form. This compaction of information is done by identifying the structure that exists in the data.
3. Compressed data communication only works when both the sender and the receiver of the information understand the encoding scheme.
4. For example, any text makes sense only if the receiver understands that it is intended to be interpreted as characters representing the English language.
5. Similarly, the compressed data can only be understood if the decoding method is known by the receiver.
Need of data compression :
1. Compression is needed because it helps to reduce the consumption of expensive resources, such as hard disk space or transmission bandwidth.
2. As uncompressed text or multimedia (speech, image or video) data requires a huge number of bits to represent it, and thus requires large bandwidth, the storage space and bandwidth requirement can be decreased by applying a proper encoding scheme for compression.
3. The design of data compression schemes involves trade-offs among various factors, including the degree of compression, the amount of distortion introduced, and the computational resources required to compress and decompress the data.
Compression and reconstruction :
1. A compression technique or compression algorithm actually refers to two algorithms, i.e., the compression algorithm and the reconstruction algorithm.
2. The compression algorithm takes an input X and generates a representation X_c that requires fewer bits, and the reconstruction algorithm operates on the compressed representation X_c to generate the reconstruction Y. These operations are shown in Fig. 1.1.1.
[Fig. 1.1.1 : Block diagram of compression and reconstruction.]
Que 1.2. What do you mean by data compression ? Explain its application areas.
UPTU 2011-12, Marks 05
Answer
Data compression : Refer Q. 1.1, Page 5D, Unit-1.
Applications of data compression :
1. Audio :
a. Audio data compression reduces the transmission bandwidth and storage requirements of audio data.
b. Audio compression algorithms are implemented in software as audio codecs.
c. Lossy audio compression algorithms provide higher compression at the cost of fidelity and are used in numerous audio applications.
d. These algorithms rely on psychoacoustics to eliminate or reduce the fidelity of less audible sounds, thereby reducing the space required to store or transmit them.
2. Video :
a. Video compression uses modern coding techniques to reduce redundancy in video data.
b. Most video compression algorithms and codecs combine spatial image compression and temporal motion compensation.
c. Video compression is a practical implementation of source coding in information theory.
3. Genetics : Genetics compression algorithms are the latest generation of lossless algorithms that compress data using both conventional compression algorithms and genetic algorithms adapted to the specific data type.
4. Emulation :
a. In order to emulate CD-based consoles such as the PlayStation 2, data compression is desirable to reduce the huge amounts of disk space used by ISOs.
b. For example, Final Fantasy XII (computer game) is normally 2.9 gigabytes; with proper compression, it is reduced to around 90 % of ...
Que 1.3. What do you understand by lossless and lossy compression ?
OR
What do you mean by lossless compression ? Compare lossless compression with lossy compression.
UPTU 2011-12, Marks 05
UPTU 2015-16, Marks 02
Answer
Lossless compression :
1. In lossless compression, the redundant information contained in the data is removed.
2. Due to the removal of such information, there is no loss of information of interest. Hence it is called lossless compression.
3. Lossless compression is also known as data compaction.
4. Lossless compression techniques, as their name implies, involve no loss of information.
5. If data have been losslessly compressed, the original data can be recovered exactly from the compressed data.
6. Lossless compression is generally used for applications that cannot tolerate any difference between the original and reconstructed data.
7. Text compression is an important area for lossless compression.
8. It is very important that the reconstruction is identical to the original text, as very small differences can result in statements with very different meanings.
Lossy compression :
1. In this type of compression, there is a loss of information in a controlled manner.
2. Lossy compression is therefore not completely reversible.
3. But the advantage of this type is higher compression ratios than lossless compression.
4. Lossy compression is used for digital audio, image and video data. In many applications, lossy compression is preferred because it gives much higher compression without a significant loss of the important information. For audio and video applications, we need standard compression algorithms.
5. Lossy compression techniques involve some loss of information, and data that have been compressed using lossy techniques generally cannot be recovered or reconstructed exactly.
6. In return for accepting this distortion in the reconstruction, we can generally obtain much higher compression ratios than is possible with lossless compression.
Que 1.4. What is data compression and why do we need it ? Describe a few applications where lossy compression technique is necessary for data compression.
UPTU 2014-15, Marks 10
Answer
Data compression and its need : Refer Q. 1.1, Page 5D, Unit-1.
Applications where lossy compression is necessary :
1. Lossy image compression can be used in digital cameras, to increase storage capacities with minimal degradation of picture quality.
2. In lossy audio compression, methods of psychoacoustics are used to remove non-audible (or less audible) components of the audio signal.
Que 1.5. What is data compression and why do we need it ? Explain compression and reconstruction with the help of a block diagram. What are the measures of performance of data compression algorithms ?
UPTU 2012-13, Marks 10
OR
What are the measures of performance of data compression algorithms ?
UPTU 2013-14, Marks 05
UPTU 2015-16, Marks 10
Answer
Data compression and its need : Refer Q. 1.1, Page 5D, Unit-1.
Compression and reconstruction : Refer Q. 1.1, Page 6D, Unit-1.
Measures of performance of data compression :
1. A compression algorithm can be evaluated in a number of different ways.
2. We could measure the relative complexity of the algorithm, the memory required to implement the algorithm, how fast the algorithm performs on a given machine, the amount of compression, and how closely the reconstruction resembles the original.
3. A very logical way of measuring how well a compression algorithm compresses a given set of data is to look at the ratio of the number of bits required to represent the data before compression to the number of bits required to represent the data after compression. This ratio is called the compression ratio.
4. Another way of reporting compression performance is to provide the average number of bits required to represent a single sample.
5. This is generally referred to as the rate.
6. In lossy compression, the reconstruction differs from the original data.
7. Therefore, in order to determine the efficiency of a compression algorithm, we have to have some way of quantifying the difference.
8. The difference between the original and the reconstruction is often called the distortion.
9. Lossy techniques are generally used for the compression of data that originate as analog signals, such as speech and video.
10. In compression of speech and video, the final arbiter of quality is human.
11. Because human responses are difficult to model mathematically, many approximate measures of distortion are used to determine the quality of the reconstructed waveforms.
12. Other terms that are also used when talking about differences between the reconstruction and the original are fidelity and quality.
13. When the fidelity or quality of a reconstruction is high, the difference between the reconstruction and the original is small.
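The compression ratio and rate described in points 3 to 5 above can be computed directly; below is a minimal Python sketch. The image size and compressed size used here are illustrative values, not figures from the text.

def compression_ratio(original_bits, compressed_bits):
    # Ratio of bits needed before compression to bits needed after compression.
    return original_bits / compressed_bits

def rate_bits_per_sample(compressed_bits, num_samples):
    # Average number of bits required to represent a single sample.
    return compressed_bits / num_samples

# Example : a 256 x 256 greyscale image stored at 8 bits/pixel,
# hypothetically compressed to 16,384 bytes.
pixels = 256 * 256
original_bits = pixels * 8
compressed_bits = 16384 * 8

print(compression_ratio(original_bits, compressed_bits))   # 4.0, i.e., 4:1
print(rate_bits_per_sample(compressed_bits, pixels))       # 2.0 bits/pixel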
Que 1.6. Explain modeling and coding with the help of a suitable example.
UPTU 2013-14, Marks 05
Answer
The development of any data compression algorithm for a variety of data can be divided into two phases : modeling and coding.
1. Modeling :
a. In this phase, we try to extract information about any redundancy or similarity that exists in the data and describe the redundancy in the form of a model.
b. This model acts as the basis of any data compression algorithm, and the performance of the algorithm will depend on how well the model is formed.
2. Coding :
a. This is the second phase. It is the description of the model and a description of how the data differ from the model; the encoding is generally done using binary digits.
b. Example :
Consider the following sequence of numbers : 9, 10, 11, 12, 13.
By examining and exploiting the structure of the data (for example, by plotting it on graph paper), it appears to fall on a straight line, so we model it with the equation
x_n = n + 9,   n = 0, 1, 2, ...
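A minimal Python sketch of this modeling-and-coding idea, using the sequence above : the model x_n = n + 9 captures the structure, so only the residuals (here all zero) would need to be coded.

data = [9, 10, 11, 12, 13]

# Modeling phase : assume the structure x_n = n + 9.
model = [n + 9 for n in range(len(data))]

# Coding phase : encode only how the data differ from the model.
residuals = [x - m for x, m in zip(data, model)]

print(residuals)   # [0, 0, 0, 0, 0] -> the model describes the data exactly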
PART-2
Mathematical Preliminaries for Lossless Compression : A Brief Introduction to Information Theory, Models : Physical Models, Probability Models, Markov Models, Composite Source Model.
CONCEPT OUTLINE : PART-2
* A fundamental limit to lossless data compression is called the entropy rate.
* Entropy is the measure of the uncertainty associated with a random variable.
* Physical models, probability models and Markov models are the three approaches for building a mathematical model.

Questions-Answers
Long Answer Type and Medium Answer Type Questions
Que 1.7. Explain entropy or entropy rate as given by Shannon.
Answer
1. In information theory, entropy is the measure of the uncertainty associated with a random variable.
2. It is usually referred to as Shannon entropy, which quantifies, in the sense of an expected value, the information contained in a message, usually in units such as bits.
3. Shannon entropy is a measure of the average information content, i.e., the average number of binary symbols needed to code the output of the source.
4. Shannon entropy represents an absolute limit on the best possible lossless compression of any communication, under the constraint that the message to be encoded is a sequence of independent and identically distributed random variables.
5. The entropy rate of a source is a number which depends only on the statistical nature of the source. Consider an arbitrary source
X = {X1, X2, X3, ...}
Following are the various models :
i. Zero-order model : The characters are statistically independent of each other and every letter of the alphabet is equally likely to occur. Let m be the size of the alphabet. In this case, the entropy rate is given by
H = log₂ m bits/character
For example, if the alphabet size is m = 27, then the entropy rate would be
H = log₂ 27 ≈ 4.75 bits/character
ii. First-order model : The characters are statistically independent but need not be equally likely. Let m be the size of the alphabet and let P_i be the probability of the i-th letter in the alphabet. The entropy rate is
H = - Σ_i P_i log₂ P_i bits/character
iii. Second-order model : Let P_{j|i} be the conditional probability that the present character is the j-th letter in the alphabet given that the previous character is the i-th letter. The entropy rate is
H = - Σ_i P_i Σ_j P_{j|i} log₂ P_{j|i} bits/character
iv. Third-order model : Let P_{k|j,i} be the conditional probability that the present character is the k-th letter in the alphabet given that the previous character is the j-th letter and the one before that is the i-th letter. The entropy rate is
H = - Σ_i P_i Σ_j P_{j|i} Σ_k P_{k|j,i} log₂ P_{k|j,i} bits/character
v. General model : Let B_n represent a block of the first n characters. The entropy rate in the general case is given by
H = - lim_{n→∞} (1/n) Σ P(B_n) log₂ P(B_n) bits/character
Que 1.8. Consider the following sequence :
1, 2, 3, 2, 3, 4, 5, 4, 5, 6, 7, 8, 9, 8, 9, 10
Assuming the frequency of occurrence of each number is reflected
accurately in the number of times it appears in the sequence and
the sequence is independent and identically distributed, find the
first order entropy of the sequence.
Answer
1. The frequencies of the symbols in the 16-element sequence give the probabilities :
P(1) = P(6) = P(7) = P(10) = 1/16
P(2) = P(3) = P(4) = P(5) = P(8) = P(9) = 2/16 = 1/8
2. The first-order entropy is
H = - Σ P(X) log₂ P(X)
  = - [4 × (1/16) log₂ (1/16) + 6 × (1/8) log₂ (1/8)]
  = 4 × (1/16) × 4 + 6 × (1/8) × 3
  = 1 + 2.25
  = 3.25 bits
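The same first-order entropy can be checked with a few lines of Python, estimating probabilities from the symbol frequencies in the sequence :

from collections import Counter
from math import log2

seq = [1, 2, 3, 2, 3, 4, 5, 4, 5, 6, 7, 8, 9, 8, 9, 10]
counts = Counter(seq)
n = len(seq)

# H = - sum P(x) log2 P(x), with P(x) taken as the relative frequency.
H = -sum((c / n) * log2(c / n) for c in counts.values())
print(H)   # 3.25 bits/symbol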
Que 1.9. What do you understand by information and entropy ? Find the first-order entropy over the alphabet A = {a1, a2, a3, a4} where P(a1) = P(a2) = P(a3) = P(a4) = 1/4.
UPTU 2013-14, Marks 05
Answer
Information :
1. The amount of information conveyed by a message increases as the amount of uncertainty regarding the message becomes greater.
2. The more that is known about the message a source will produce, the less the uncertainty, and the less the information conveyed.
3. The entropy of communication theory is a measure of this uncertainty conveyed by a message from a source.
4. The starting point of information theory is the concept of uncertainty.
5. Let us define an event as an occurrence which can result in one of many possible outcomes.
6. The outcome of the event is known only after it has occurred, and before its occurrence we do not know which one of the several possible outcomes will actually result.
7. We are thus uncertain with regard to the outcome before the occurrence of the event.
8. After the event has occurred, we are no longer uncertain about it.
9. If we know or can assign a probability to each one of the outcomes, then we will have some information as to which one of the outcomes is most likely to occur.
Entropy : Refer Q. 1.7, Page 11D, Unit-1.
Numerical :
First-order entropy is given by,
H = - Σ P(X) log₂ P(X)
  = - 4 × (1/4) log₂ (1/4)
  = log₂ 4 = 2 bits
Que 1.10. Prove that the average codeword length l of an optimal code for a source S is greater than or equal to the entropy H(S).
Answer
1. According to the Kraft-McMillan inequality, if we have a uniquely decodable code C with K codewords of lengths l_1, l_2, ..., l_K, then
Σ_{i=1}^{K} 2^(-l_i) ≤ 1   ...(1.10.1)
Conversely, if we have a sequence of positive integers {l_i} which satisfy equation (1.10.1), then there exists a uniquely decodable code whose codeword lengths are given by {l_i}.
2. Consider a source S with alphabet A = {a_1, a_2, ..., a_K} and probability model {P(a_1), P(a_2), ..., P(a_K)}. The average codeword length is given by
l = Σ_{i=1}^{K} P(a_i) l_i
3. Therefore, the difference between the entropy of the source H(S) and the average length is
H(S) - l = - Σ P(a_i) log₂ P(a_i) - Σ P(a_i) l_i
         = Σ P(a_i) [log₂ (1/P(a_i)) - l_i]
         = Σ P(a_i) [log₂ (1/P(a_i)) + log₂ 2^(-l_i)]
         = Σ P(a_i) log₂ [2^(-l_i) / P(a_i)]
         ≤ log₂ [Σ 2^(-l_i)]
4. The last inequality is obtained using Jensen's inequality, which states that if f(x) is a concave function, then E[f(X)] ≤ f(E[X]). The log function is a concave function.
5. By the Kraft-McMillan inequality, Σ 2^(-l_i) ≤ 1, so log₂ [Σ 2^(-l_i)] ≤ 0 and therefore H(S) - l ≤ 0. Hence the average codeword length l of an optimal (indeed, of any uniquely decodable) code is greater than or equal to the entropy H(S).
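A small numerical illustration of the two facts used in the proof : for a uniquely decodable code the Kraft-McMillan sum is at most 1, and the average length is at least the entropy. The probabilities and codeword lengths below are illustrative, not from the text.

from math import log2

# Hypothetical source and the codeword lengths of a code for it.
probs   = [0.5, 0.25, 0.125, 0.125]
lengths = [1, 2, 3, 3]

kraft_sum  = sum(2 ** -l for l in lengths)
avg_length = sum(p * l for p, l in zip(probs, lengths))
entropy    = -sum(p * log2(p) for p in probs)

print(kraft_sum)             # 1.0  -> satisfies the Kraft-McMillan inequality
print(entropy, avg_length)   # 1.75 1.75 -> average length >= entropy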
Que 1.11. The joint probabilities of the transmitted message X and the received message Y of a communication system are given by the joint probability matrix P(X, Y). Calculate H(X) and H(Y).
[The 5 × 4 joint probability table P(X_i, Y_j) is as given in the question paper; its row sums give P(X_i) and its column sums give P(Y_j).]
Answer
1. The marginal probabilities of X are obtained by summing the rows of P(X, Y) :
P(X_1) = 0.35, P(X_2) = 0.3, P(X_3) = 0.15, P(X_4) = 0.15, P(X_5) = 0.05
2. Therefore,
H(X) = Σ P(X_i) log₂ [1/P(X_i)]
     = 0.35 log₂ (1/0.35) + 0.3 log₂ (1/0.3) + 0.15 log₂ (1/0.15) + 0.15 log₂ (1/0.15) + 0.05 log₂ (1/0.05)
     = 0.5301 + 0.5211 + 0.4105 + 0.4105 + 0.2161
H(X) = 2.0883 bits/message
3. Similarly, the marginal probabilities of Y are obtained by summing the columns of P(X, Y) :
P(Y_1) = 0.25, P(Y_2) = 0.3, P(Y_3) = 0.2, P(Y_4) = 0.25
4. Therefore,
H(Y) = Σ P(Y_j) log₂ [1/P(Y_j)]
     = 0.25 log₂ (1/0.25) + 0.3 log₂ (1/0.3) + 0.2 log₂ (1/0.2) + 0.25 log₂ (1/0.25)
     = 0.5000 + 0.5211 + 0.4644 + 0.5000
H(Y) = 1.9855 bits/message
Que 1.12. What do you understand by information and entropy ? Given an alphabet A = {a1, a2, a3, a4}, find the first-order entropy in the following cases :
i. P(a1) = 1/2, P(a2) = 1/4, P(a3) = P(a4) = 1/8
ii. P(a1) = 0.505, P(a2) = 1/4, P(a3) = 1/8 and P(a4) = 0.12
And also differentiate between static length and variable length coding schemes. Explain with the help of examples.
UPTU 2012-13, Marks 10
OR
Differentiate between static length and variable length coding schemes. Explain with the help of an example.
UPTU 2011-12, Marks 05
UPTU 2015-16, Marks 10
Answer
Information and entropy : Refer Q. 1.7, Page 11D, Unit-1.
i. First-order entropy is given by,
H = (1/2) log₂ 2 + (1/4) log₂ 4 + (1/8) log₂ 8 + (1/8) log₂ 8
  = 1/2 + 1/2 + 3/8 + 3/8
  = 1.75 bits
ii. P(a1) = 0.505, P(a2) = 1/4, P(a3) = 1/8 and P(a4) = 0.12
H = - [0.505 log₂ 0.505 + (1/4) log₂ (1/4) + (1/8) log₂ (1/8) + 0.12 log₂ 0.12]
  = - [0.505 (- 0.985684) + 0.25 (- 2) + 0.125 (- 3) + 0.12 (- 3.0589)]
  = [0.49777 + 0.5 + 0.375 + 0.36707]
  ≈ 1.74 bits
Difference between static length and variable length coding schemes :
Static length codes :
1. Static length codes are also known as fixed length codes. A fixed length code is one whose codeword length is fixed.
2. For example, the ASCII code for the letter 'a' is 1100001 and the letter 'A' is coded as 1000001.
3. Here, it should be noticed that the ASCII code uses the same number of bits to represent each symbol. Such codes are called static or fixed length codes.
Variable length codes :
1. A variable length code is one whose codeword length is not fixed.
2. For example, consider the table given below :

Letter | Code 1 | Code 2 | Code 3
a1     | 00     | 0      | 0
a2     | 01     | 1      | 10
a3     | 10     | 00     | 110
a4     | 11     | 11     | 111

In this table, code 1 is a fixed length code, and code 2 and code 3 are variable length codes.
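A short Python sketch showing why a variable length code such as code 3 above can still be decoded without ambiguity : code 3 is a prefix code, so the decoder can emit a symbol as soon as a codeword is matched. The code table is the one reconstructed above.

# Code 3 from the table above (a prefix code).
code3 = {'a1': '0', 'a2': '10', 'a3': '110', 'a4': '111'}

def encode(symbols, code):
    return ''.join(code[s] for s in symbols)

def decode(bits, code):
    inverse = {v: k for k, v in code.items()}
    out, current = [], ''
    for b in bits:
        current += b
        if current in inverse:            # no codeword is a prefix of another,
            out.append(inverse[current])  # so the first match is the correct one
            current = ''
    return out

msg = ['a2', 'a1', 'a4', 'a3']
bits = encode(msg, code3)
print(bits)                        # 100111110
print(decode(bits, code3) == msg)  # True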
Que 1.13. What is average information ? What are the properties used in the measure of average information ?
UPTU 2011-12, Marks 05
UPTU 2015-16, Marks 10
Answer
Average information :
1. Average information is also called entropy.
2. If we have a set of independent events A_i, which are the outcomes of some experiment S, such that
∪ A_i = S
where S is the sample space, then the average self-information associated with the random experiment is given by,
H = Σ P(A_i) i(A_i) = - Σ P(A_i) log₂ P(A_i)
3. This quantity is called the entropy associated with the experiment.
4. One of the many contributions of Shannon was that he showed that if the experiment is a source that puts out symbols A_i from a set A, then the entropy is a measure of the average number of binary symbols needed to code the output of the source.
5. Shannon showed that the best that a lossless compression scheme can do is to encode the output of a source with an average number of bits equal to the entropy of the source.
Given a set of independent events A_1, A_2, ..., A_n with probability p_i = P(A_i), the following properties are used in the measure of average information H :
1. We want H to be a continuous function of the probabilities p_i. That is, a small change in p_i should only cause a small change in the average information.
2. If all events are equally likely, that is, p_i = 1/n for all i, then H should be a monotonically increasing function of n. The more possible outcomes there are, the more information should be contained in the occurrence of any particular outcome.
3. Suppose we divide the possible outcomes into a number of groups. We indicate the occurrence of a particular event by first indicating the group it belongs to, then indicating which particular member of the group it is.
4. Thus, we get some information first by knowing which group the event belongs to and then we get additional information by learning which particular event (from the events in the group) has occurred. The sum of these two amounts of information should be the same as the information associated with indicating the outcome in a single stage.
Que 1.14. Explain different approaches for building mathematical models. Also define the two-state Markov model for binary images.
UPTU 2014-15, Marks 10
UPTU 2011-12, Marks 05
UPTU 2015-16, Marks 10
Answer
There are several approaches to building mathematical models
Physical models :
1. In speech-related applications, knowledge about the physics of speech production can be used to construct a mathematical model for the sampled speech process. Sampled speech can then be encoded using this model.
2. Models for certain telemetry data can also be obtained through knowledge of the underlying process.
3. For example, if residential electrical meter readings at hourly intervals were to be coded, knowledge about the living habits of the populace could be used to determine when electricity usage would be high and when the usage would be low. Then, instead of the actual readings, the difference (residual) between the actual readings and those predicted by the model could be coded.
Probability models :
1. The simplest statistical model for the source is to assume that each letter that is generated by the source is independent of every other letter, and each occurs with the same probability.
2. We could call this the ignorance model, as it would generally be useful only when we know nothing about the source. The next step up in complexity is to keep the independence assumption, but remove the equal probability assumption and assign a probability of occurrence to each letter in the alphabet.
3. For a source that generates letters from an alphabet A = {a_1, a_2, ..., a_M}, we can have a probability model P = {P(a_1), P(a_2), ..., P(a_M)}.
4. Given a probability model (and the independence assumption), we can compute the entropy of the source using equation (1.14.1) :
H(S) = - Σ P(X_i) log₂ P(X_i)   ...(1.14.1)
5. We can also construct some very efficient codes to represent the letters in A.
6. Of course, these codes are only efficient if our mathematical assumptions are in accord with reality.
Markov models :
1. One of the most popular ways of representing dependence in the data is through the use of Markov models.
2. For models used in lossless compression, we use a specific type of Markov process called a discrete time Markov chain.
3. Let {x_n} be a sequence of observations. This sequence is said to follow a k-th order Markov model if
P(x_n | x_{n-1}, ..., x_{n-k}) = P(x_n | x_{n-1}, ..., x_{n-k}, ...)   ...(1.14.2)
4. In other words, knowledge of the past k symbols is equivalent to the knowledge of the entire past history of the process.
5. The values taken on by the set {x_{n-1}, ..., x_{n-k}} are called the states of the process. If the size of the source alphabet is l, then the number of states is l^k.
6. The most commonly used Markov model is the first-order Markov model, for which
P(x_n | x_{n-1}) = P(x_n | x_{n-1}, x_{n-2}, x_{n-3}, ...)   ...(1.14.3)
7. Equations (1.14.2) and (1.14.3) indicate the existence of dependence between samples.
8. However, they do not describe the form of the dependence. We can develop different first-order Markov models depending on our assumption about the form of the dependence between samples.
9. If we assumed that the dependence was introduced in a linear manner, we could view the data sequence as the output of a linear filter driven by white noise.
10. The output of such a filter can be given by the difference equation,
x_n = ρ x_{n-1} + ε_n
where ε_n is a white noise process.
11. The Markov model does not, however, require the assumption of linearity.
12. For example :
a. Consider a binary image.
b. The image has only two types of pixels, white pixels and black pixels.
c. We know that the appearance of a white pixel as the next observation depends, to some extent, on whether the current pixel is white or black.
d. Therefore, we can model the pixel process as a discrete time Markov chain.
e. Define two states S_w and S_b (S_w would correspond to the case where the current pixel is a white pixel, and S_b corresponds to the case where the current pixel is a black pixel).
f. We define the transition probabilities P(w|b) and P(b|w), and the probability of being in each state, P(S_w) and P(S_b). The Markov model can then be represented by the state diagram shown in Fig. 1.14.1.
g. The entropy of a finite state process with states S_i is simply the average value of the entropy at each state :
H = Σ P(S_i) H(S_i)
[Fig. 1.14.1 : A two-state Markov model for binary images.]
h. For our particular example of a binary image,
H(S_w) = - P(b|w) log₂ P(b|w) - P(w|w) log₂ P(w|w)
where P(w|w) = 1 - P(b|w). H(S_b) can be calculated in a similar manner.
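A small numerical sketch of the two-state model above in Python. The transition probabilities are illustrative assumptions, not values from the text; the stationary state probabilities follow from the transition probabilities.

from math import log2

def binary_entropy(p):
    # Entropy of a two-way choice made with probability p, 0 < p < 1.
    return -p * log2(p) - (1 - p) * log2(1 - p)

# Hypothetical transition probabilities for the binary image model.
P_b_given_w = 0.01   # P(b|w) : a black pixel follows a white pixel
P_w_given_b = 0.30   # P(w|b) : a white pixel follows a black pixel

# Stationary probabilities P(S_w), P(S_b) of the two-state chain.
P_Sw = P_w_given_b / (P_b_given_w + P_w_given_b)
P_Sb = 1 - P_Sw

H_Sw = binary_entropy(P_b_given_w)   # entropy in state S_w
H_Sb = binary_entropy(P_w_given_b)   # entropy in state S_b
H = P_Sw * H_Sw + P_Sb * H_Sb        # H = sum of P(S_i) H(S_i)

print(round(H, 4), "bits/pixel")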
Composite source model :
1. In many applications, it is not easy to use a single model to describe the source.
2. In such cases, we can define a composite source, which can be viewed as a combination or composition of several sources, with only one source being active at any given time.
3. A composite source can be represented as a number of individual sources S_i, each with its own model M_i, and a switch that selects a source S_i with probability P_i (as shown in Fig. 1.14.2).
[Fig. 1.14.2 : A composite source.]
4. This is an exceptionally rich model and can be used to describe some very complicated processes.
Que 1.15. What is the zero frequency model in Markov models in text compression ?
UPTU 2013-14, Marks 05
UPTU 2015-16, Marks 05
Answer
1. As expected, Markov models are particularly useful in text compression, where the probability of the next letter is heavily influenced by the preceding letters.
2. In current text compression literature, the k-th order Markov models are more widely known as finite context models.
3. Consider the word preceding.
4. Suppose we have already processed precedin and are going to encode the next letter.
5. If we take no account of the context and treat each letter as a surprise, the probability of the letter g occurring is relatively low.
6. If we use a first-order Markov model or single-letter context (that is, we look at the probability model given n), we can see that the probability of g would increase substantially.
7. As we increase the context size (go from n to in to din and so on), the probability of the alphabet becomes more and more skewed, which results in lower entropy.
8. The longer the context, the better its predictive value.
9. If we were to store the probability model with respect to all contexts of a given length, the number of contexts would grow exponentially with the length of the context.
10. Consider a context model of order four (the context is determined by the last four symbols).
11. If we take an alphabet size of 95, the possible number of contexts is 95⁴ (more than 81 million).
12. Context modeling in text compression schemes tends to be an adaptive strategy in which the probabilities for different symbols in the different contexts are updated as they are encountered.
13. However, this means that we will often encounter symbols that have not been encountered before for any of the given contexts (this is known as the zero frequency problem).
14. The larger the context, the more often this will happen.
15. This problem could be resolved by sending a code to indicate that the following symbol was being encountered for the first time, followed by a prearranged code for that symbol.
16. This would significantly increase the length of the code for the symbol on its first occurrence.
17. However, if this situation did not occur too often, the overhead associated with such occurrences would be small compared to the total number of bits used to encode the output of the source.
18. Unfortunately, in context-based encoding, the zero frequency problem is encountered often enough for the overhead to be a problem, especially for longer contexts.
PART-3
Coding : Uniquely Decodable Codes, Prefix Codes.

CONCEPT OUTLINE : PART-3
* A code is uniquely decodable if the mapping C : A_S* → A_C* is one-to-one, that is, for all x and x' in A_S*, x ≠ x' ⇒ C(x) ≠ C(x').
* A code C is a prefix code if no codeword w_i is the prefix of another codeword w_j (i ≠ j).
Long Answer Type and Medium Answer Type Questions

Que 1.16. Write a short note on coding. What do you understand by uniquely decodable codes ?
Answer
Coding :
1. Coding means the assignment of binary sequences to elements of an alphabet.
2. The set of binary sequences is called a code, and the individual members of the set are called codewords.
3. An alphabet is a collection of symbols called letters.
4. The ASCII code uses the same number of bits to represent each symbol.
5. Such a code is called a fixed-length code.
6. If we want to reduce the number of bits required to represent different messages, we need to use a different number of bits to represent different symbols.
7. If we use fewer bits to represent symbols that occur more often, on the average we would use fewer bits per symbol.
8. The average number of bits per symbol is often called the rate of the code.
Uniquely decodable codes :
1. A code is uniquely decodable if the mapping C : A_S* → A_C* is one-to-one, that is, for all x and x' in A_S*, x ≠ x' ⇒ C(x) ≠ C(x').
2. Suppose we have two binary codewords a and b, where a is k bits long, b is n bits long, and k < n. If the first k bits of b are identical to a, then a is called a prefix of b; the last n - k bits of b are called the dangling suffix.
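The prefix condition mentioned above is easy to test mechanically. A minimal Python sketch follows; note that it only checks the prefix condition, which is a sufficient (not necessary) condition for unique decodability, and it does not perform the full dangling-suffix (Sardinas-Patterson) test.

def is_prefix_code(codewords):
    # True if no codeword is a prefix of a different codeword.
    for a in codewords:
        for b in codewords:
            if a != b and b.startswith(a):
                return False
    return True

print(is_prefix_code(['0', '10', '110', '111']))   # True  -> prefix code
print(is_prefix_code(['0', '01', '11']))           # False -> '0' is a prefix of '01'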
a. Condition 1 : Given any two letters a_i and a_j, if P(a_i) ≥ P(a_j), then l_i ≤ l_j, where l_i is the number of bits in the codeword for a_i.
b. Condition 2 : The two least probable letters have codewords with the same maximum length l_m.
c. Condition 3 : In the tree corresponding to the optimum code, there must be two branches stemming from each intermediate node.
If there were any intermediate node with only one branch coming from that node, we could remove it without affecting the decipherability of the code while reducing its average length.
d. Condition 4 : Suppose we change an intermediate node into a leaf node by combining all the leaves descending from it into a composite word of a reduced alphabet. Then, if the original tree was optimal for the original alphabet, the reduced tree is optimal for the reduced alphabet.
Que 2.6. Write a short note on non-binary Huffman code.
Answer
1. The binary Huffman coding procedure can be easily extended to the non-binary case where the code elements come from an m-ary alphabet, and m is not equal to two.
2. We obtained the Huffman algorithm based on the observations that, in an optimum binary prefix code :
a. Symbols that occur more frequently (have a higher probability of occurrence) will have shorter codewords than symbols that occur less frequently, and
b. The two symbols that occur least frequently will have codewords of the same length,
together with the additional requirement that the two symbols with the lowest probability differ only in the last position.
3. We can obtain a non-binary Huffman code in almost exactly the same way.
4. The obvious thing to do would be to modify the second observation to read : "The m symbols that occur least frequently will have codewords of the same length," and also to modify the additional requirement to read : "The m symbols with the lowest probability differ only in the last position."

Que 2.7. Consider the source alphabet A, B, C, D, E with probabilities P(A) = 0.16, P(B) = 0.51, P(C) = 0.09, P(D) = 0.13 and P(E) = 0.11. Design the Huffman code.
Answer
Step 1 : Sort all symbols according to their probabilities :
B       A       D       E       C
0.51    0.16    0.13    0.11    0.09
Step 2 : Build the tree.
a. Pick the two symbols with the lowest frequency count and form a tree with the root as the sum of the two, taking the lowest frequency symbol as the left child and the other as the right child of the root.
b. Here the two smallest nodes are C and E, with probabilities 0.09 and 0.11 respectively; their combined node CE has probability 0.20.
c. Again arrange the combined node into the original list and delete the children from the original list :
B       CE      A       D
0.51    0.20    0.16    0.13
Repeating step 2, we get :
B       AD      CE
0.51    0.29    0.20
B       CEDA
0.51    0.49
Step 3 : Label the left branches of the tree with 0 and the right branches with 1.
Step 4 : Create the Huffman code by traversing from the root to each leaf.
Symbol | Codeword
B      | 1
A      | 011
D      | 010
E      | 001
C      | 000
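The construction in Que 2.7 can be reproduced with a compact heap-based sketch in Python. The codeword lengths match the tree built above; the exact bit patterns depend on how ties and the left/right (0/1) labels are chosen.

import heapq

def huffman_code(probs):
    # probs : dict symbol -> probability. Returns dict symbol -> codeword.
    heap = [(p, i, {s: ''}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # lowest probability subtree
        p2, _, c2 = heapq.heappop(heap)   # second lowest
        merged = {s: '0' + w for s, w in c1.items()}   # 0 for the lower subtree
        merged.update({s: '1' + w for s, w in c2.items()})
        count += 1
        heapq.heappush(heap, (p1 + p2, count, merged))
    return heap[0][2]

probs = {'A': 0.16, 'B': 0.51, 'C': 0.09, 'D': 0.13, 'E': 0.11}
code = huffman_code(probs)
print(code)   # B gets a 1-bit codeword, the other symbols get 3-bit codewords
print(sum(probs[s] * len(w) for s, w in code.items()))   # average length = 1.98 bits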
Que 2.8. Considering the following source alphabet and their frequencies, create the Huffman tree and calculate the average length of the code.
Answer
Applying the Huffman algorithm, the tree formed is :
PART-2

CONCEPT OUTLINE : PART-2
* Adaptive Huffman coding : In the adaptive Huffman coding procedure, neither the transmitter nor the receiver knows anything about the statistics of the source sequence at the start of transmission.
* Update procedure : The update procedure requires that the nodes be in a fixed order. This ordering is preserved by numbering the nodes.

Questions-Answers
Long Answer Type and Medium Answer Type Questions

Que 2.9. Explain adaptive Huffman coding.
Answer
1. Huffman coding requires knowledge of the probabilities of the source sequence.
2. If this knowledge is not available, Huffman coding becomes a two-pass procedure : the statistics are collected in the first pass, and the source is encoded in the second pass.
3. In order to convert this algorithm into a one-pass procedure, Faller and Gallager independently developed adaptive algorithms to construct the Huffman code based on the statistics of the symbols already encountered.
4. Theoretically, if we wanted to encode the (k + 1)-th symbol using the statistics of the first k symbols, we could recompute the code using the Huffman coding procedure each time a symbol is transmitted.
5. However, this would not be a very practical approach due to the large amount of computation involved; hence, the adaptive Huffman coding procedure.
6. In order to describe how the adaptive Huffman code works, we add two other parameters to the binary tree : the weight of each leaf, which is written as a number inside the node, and a node number.
7. The weight of each external node is simply the number of times the symbol corresponding to the leaf has been encountered.
8. The weight of each internal node is the sum of the weights of its offspring.
9. The node number y_i is a unique number assigned to each internal and external node.
10. If we have an alphabet of size n, then the 2n - 1 internal and external nodes can be numbered as y_1, ..., y_{2n-1} such that, if x_j is the weight of node y_j, we have x_1 ≤ x_2 ≤ ... ≤ x_{2n-1}.
11. Furthermore, the nodes y_{2j-1} and y_{2j} are offspring of the same parent node, or siblings, for 1 ≤ j < n, and the node number of the parent node is greater than y_{2j-1} and y_{2j}.
[Fig. 2.12.2 : Flowchart of the decoding procedure.]
1. The flowchart for the decoding procedure is shown in Fig. 2.12.2.
2. As we read in the received binary string, we traverse the tree in a manner identical to that used in the encoding procedure.
3. Once a leaf is encountered, the symbol corresponding to that leaf is decoded.
4. If the leaf is the NYT node, then we check the next e bits to see if the resulting number is less than r.
5. If it is less than r, we read in another bit to complete the code for the symbol.
6. The index for the symbol is obtained by adding one to the decimal number corresponding to the e-bit or (e + 1)-bit binary string.
7. Once the symbol has been decoded, the tree is updated and the next received bit is used to start another traversal down the tree.
PART-3
Golomb Codes, Rice Codes, Tunstall Codes, Applications of Huffman Coding : Lossless Image Compression, Text Compression, Audio Compression.

CONCEPT OUTLINE : PART-3
* Golomb coding is a data compression scheme based on entropy encoding and is optimal for geometrically distributed data.
* In the Rice code, a sequence of non-negative integers is divided into blocks of J integers apiece. Each block is then coded.
* The Tunstall code is an important exception : all codewords are of equal length, but each codeword represents a different number of letters.

Questions-Answers
Long Answer Type and Medium Answer Type Questions
Que 2.13. Explain the Golomb code with the help of an example.
Answer
1. The Golomb-Rice codes belong to a family of codes designed to encode integers with the assumption that the larger an integer, the lower its probability of occurrence.
2. The simplest code for this situation is the unary code.
3. The unary code for a positive integer n is simply n 1s followed by a 0.
4. Thus, the code for 4 is 11110, and the code for 7 is 11111110.
5. The unary code is the same as the Huffman code for the semi-infinite alphabet {1, 2, 3, ...} with the probability model
P(k) = 1/2^k
6. Because the Huffman code is optimal, the unary code is also optimal for this probability model.
7. Although the unary code is optimal only in very restricted conditions, we can see that it is certainly very simple to implement.
8. One step higher in complexity are a number of coding schemes that split the integer into two parts, representing one part with a unary code and the other part with a different code.
9. An example of such a code is the Golomb code.
10. The description of the Golomb code begins : "Secret Agent 00111 is back at the casino again, playing a game of chance, while the fate of mankind hangs in the balance."
11. Agent 00111 requires a code to represent runs of success in a roulette game, and Golomb provides it.
12. The Golomb code is actually a family of codes parameterized by an integer m > 0.
13. In the Golomb code with parameter m, we represent an integer n > 0 using two numbers q and r, where
q = ⌊n/m⌋ and r = n - qm
14. ⌊x⌋ is the integer part of x.
15. In other words, q is the quotient and r is the remainder when n is divided by m.
16. The quotient q can take on the values 0, 1, 2, ... and is represented by the unary code of q.
17. The remainder r can take on the values 0, 1, 2, ..., m - 1.
18. If m is a power of two, we use the log₂ m-bit binary representation of r.
19. If m is not a power of two, we could still use ⌈log₂ m⌉ bits, where ⌈x⌉ is the smallest integer greater than or equal to x.
20. We can reduce the number of bits required if we use the ⌊log₂ m⌋-bit binary representation of r for the first 2^⌈log₂ m⌉ - m values of r, and the ⌈log₂ m⌉-bit binary representation of r + 2^⌈log₂ m⌉ - m for the rest of the values.
Que 2.14. Design a Golomb code for m = 5.
Answer
1. ⌈log₂ 5⌉ = 3 and ⌊log₂ 5⌋ = 2.
2. Since 2^⌈log₂ m⌉ - m = 2³ - 5 = 3, the first three values of r (i.e., r = 0, 1, 2) will be represented by the 2-bit binary representation of r, and the next two values (i.e., r = 3, 4) will be represented by the 3-bit binary representation of r + 2^⌈log₂ m⌉ - m = r + 3.
3. The Golomb code for m = 5 is :

n  | q | r | Codeword
0  | 0 | 0 | 0 00
1  | 0 | 1 | 0 01
2  | 0 | 2 | 0 10
3  | 0 | 3 | 0 110
4  | 0 | 4 | 0 111
5  | 1 | 0 | 10 00
6  | 1 | 1 | 10 01
7  | 1 | 2 | 10 10
8  | 1 | 3 | 10 110
9  | 1 | 4 | 10 111
10 | 2 | 0 | 110 00
11 | 2 | 1 | 110 01
12 | 2 | 2 | 110 10
13 | 2 | 3 | 110 110
14 | 2 | 4 | 110 111
15 | 3 | 0 | 1110 00
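A minimal Python sketch of the Golomb encoder described in Que 2.13; for m = 5 it reproduces the table above (the unary code of q followed by the truncated binary code of r).

from math import ceil, log2

def golomb_encode(n, m):
    q, r = divmod(n, m)
    unary = '1' * q + '0'          # unary code of the quotient
    c = ceil(log2(m))
    cutoff = 2 ** c - m            # the first `cutoff` remainders use c - 1 bits
    if r < cutoff:
        binary = format(r, '0{}b'.format(c - 1))
    else:
        binary = format(r + cutoff, '0{}b'.format(c))
    return unary + binary

for n in range(16):
    print(n, golomb_encode(n, 5))
# 0 -> 000, 3 -> 0110, 5 -> 1000, 13 -> 110110, 15 -> 111000, ...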
Que 2.15. How can the Rice code be viewed ? Explain the implementation of the Rice code in the recommendation for lossless compression from the Consultative Committee on Space Data Standards (CCSDS). Explain adaptive Huffman coding. How is it different from conventional Huffman coding ?
UPTU, Marks 10
Answer
Rice code :
1. The Rice code was originally developed by Robert F. Rice.
2. The Rice code can be viewed as an adaptive Golomb code.
3. In the Rice code, a sequence of non-negative integers (which might have been obtained from the preprocessing of other data) is divided into blocks of J integers apiece.
4. Each block is then coded using one of several options, most of which are a form of Golomb code.
5. Each block is encoded with each of these options, and the option resulting in the least number of coded bits is selected.
6. The particular option used is indicated by an identifier attached to the code for each block.
CCSDS recommendation for lossless compression :
1. The algorithm consists of a preprocessor (the modeling step) and a binary coder (the coding step).
2. The preprocessor removes correlation from the input and generates a sequence of non-negative integers.
3. This sequence has the property that smaller values are more probable than larger values.
4. The binary coder generates a bitstream to represent the integer sequence.
5. The binary coder is our main focus at this point.
6. The preprocessor functions as follows : given a sequence {y_i}, for each y_i we generate a prediction ŷ_i.
7. A simple way to generate a prediction would be to take the previous value of the sequence to be the prediction of the current value of the sequence :
ŷ_i = y_{i-1}
8. We then generate a sequence whose elements are the difference between y_i and its predicted value ŷ_i :
d_i = y_i - ŷ_i
9. The d_i value will have a small magnitude when our prediction is good and a large value when it is not.
10. Assuming an accurate modeling of the data, the former situation is more likely than the latter.
11. Let y_max and y_min be the largest and smallest values that the sequence {y_i} takes on.
12. It is reasonable to assume that the values of d_i will be confined to the range [y_min - y_max, y_max - y_min].
13. Define
T_i = min (y_max - ŷ_i, ŷ_i - y_min)   ...(2.15.1)
14. The sequence {d_i} can be converted into a sequence of non-negative integers {x_i} using the following mapping :
x_i = 2d_i             0 ≤ d_i ≤ T_i
    = 2|d_i| - 1       -T_i ≤ d_i < 0
    = T_i + |d_i|      otherwise   ...(2.15.2)
15. The value of x_i will be small whenever the magnitude of d_i is small.
16. Therefore, the value of x_i will be small with higher probability.
17. The sequence {x_i} is divided into segments, with each segment being further divided into blocks of size J.
18. It is recommended by CCSDS that J have a value of 16.
19. The coded block is transmitted along with an identifier that indicates which particular option was used.
20. Each block is then coded using one of the following options :
a. Fundamental sequence :
i. This is a unary code.
ii. A number n is represented by a sequence of n 0s followed by a 1 (or a sequence of n 1s followed by a 0).
b. Split sample option :
i. These options consist of a set of codes indexed by a parameter m.
ii. The code for a k-bit number n using the m-th split sample option consists of the m least significant bits of n followed by a unary code representing the k - m most significant bits.
iii. For example, suppose we wanted to encode the 8-bit number 23 using the third split sample option.
iv. The 8-bit representation of 23 is 00010111.
v. The three least significant bits are 111.
vi. The remaining bits (00010) correspond to the number 2, which has the unary code 001.
vii. Therefore, the code for 23 using the third split sample option is 111001.
viii. Notice that different values of m will be preferable for different values of x_i, with higher values of m used for higher-entropy sequences.
c. Second extension option :
i. The second extension option is useful for sequences with low entropy, when, in general, many of the values of x_i will be zero.
ii. In the second extension option, the sequence is divided into consecutive pairs of samples.
iii. Each pair is used to obtain an index γ using the following transformation :
γ = (1/2)(x_i + x_{i+1})(x_i + x_{i+1} + 1) + x_{i+1}   ...(2.15.3)
and the value of γ is encoded using a unary code.
iv. The value of γ is an index to a lookup table, with each value of γ corresponding to a pair of values x_i, x_{i+1}.
d. Zero block option :
i. The zero block option is used when one or more of the blocks of x_i are zero, generally when we have long sequences of y_i that have the same value.
ii. In this case, the number of zero blocks is transmitted using the code shown in Table 2.15.1.
iii. The ROS code is used when the last five or more blocks in a segment are all zero.
21. The Rice code has been used in several space applications, and variations of the Rice code have been proposed for a number of different applications.

Table 2.15.1 : Code used for the zero block option
Number of All-Zero Blocks | Codeword
1                         | 1
2                         | 01
3                         | 001
4                         | 0001
5                         | 000001
6                         | 0000001
...                       | ...
63                        | 0000...01
ROS                       | 00001
Adaptive Huffman coding : Refer Q. 2.9, Page 40D, Unit-2.
Difference from conventional Huffman coding : Refer Q. 2.10, Page 42D, Unit-2.
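A small Python sketch of the preprocessing mapping (2.15.1)-(2.15.2) described above : prediction residuals d_i are folded into non-negative integers x_i so that small magnitudes map to small values. The sample values are illustrative.

def ccsds_map(y, y_min, y_max):
    # y : input sequence; the prediction is simply the previous sample.
    xs = []
    prev = y[0]
    for yi in y[1:]:
        pred = prev                            # y_hat_i = y_(i-1)
        d = yi - pred                          # prediction residual d_i
        T = min(y_max - pred, pred - y_min)    # equation (2.15.1)
        if 0 <= d <= T:
            x = 2 * d                          # equation (2.15.2)
        elif -T <= d < 0:
            x = 2 * abs(d) - 1
        else:
            x = T + abs(d)
        xs.append(x)
        prev = yi
    return xs

y = [100, 102, 101, 101, 120, 95]              # illustrative 8-bit samples
print(ccsds_map(y, y_min=0, y_max=255))        # [4, 1, 0, 38, 49]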
Que 2.16. Explain Golomb codes and Tunstall codes.
UPTU, Marks 05
Answer
Golomb code : Refer Q. 2.13, Page 48D, Unit-2.
Tunstall codes :
1. Most of the variable-length codes that we have looked at encode letters from the source alphabet using codewords with varying numbers of bits : codewords with fewer bits for letters that occur more frequently and codewords with more bits for letters that occur less frequently.
2. The Tunstall code is an important exception.
3. In the Tunstall code, all codewords are of equal length.
4. However, each codeword represents a different number of letters.
5. An example of a 2-bit Tunstall code for an alphabet A = {A, B} is shown in Table 2.16.1.
6. The main advantage of a Tunstall code is that errors in codewords do not propagate, unlike other variable-length codes (such as Huffman codes) in which an error in one codeword will cause a series of errors to occur.
Table 2.16.1 : A 2-bit Tunstall code
Sequence | Codeword
AAA      | 00
AAB      | 01
AB       | 10
B        | 11

Que 2.17. For an alphabet A = {a1, a2, a3, a4, a5} with probabilities P(a1) = 0.15, P(a2) = 0.04, P(a3) = 0.26, P(a4) = 0.05 and P(a5) = 0.50 :
i. Calculate the entropy of this source.
ii. Find a Huffman code for this source.
iii. Find the average length of the code in (ii) and its redundancy.
And also design a 3-bit Tunstall code.
For an alphabet A = {a1, a2, a3} with probabilities P(a1) = 0.7, P(a2) = 0.2, P(a3) = 0.1, design a 3-bit Tunstall code.
UPTU 2012-13, Marks 10
Answer
i. The entropy of the source is given by,
H = - Σ P(a_i) log₂ P(a_i)
  = - [0.15 log₂ 0.15 + 0.04 log₂ 0.04 + 0.26 log₂ 0.26 + 0.05 log₂ 0.05 + 0.5 log₂ 0.5]
  = [0.412 + 0.186 + 0.507 + 0.217 + 0.500]
  ≈ 1.82 bits/symbol
ii. A Huffman code for this source (one valid assignment of codewords) :
Symbol | Probability | Codeword | Code length
a5     | 0.50        | 0        | 1
a3     | 0.26        | 10       | 2
a1     | 0.15        | 110      | 3
a4     | 0.05        | 1110     | 4
a2     | 0.04        | 1111     | 4
iii. Average length of code
= 0.5 × 1 + 0.26 × 2 + 0.15 × 3 + 0.05 × 4 + 0.04 × 4
= 0.5 + 0.52 + 0.45 + 0.2 + 0.16
= 1.83 bits/symbol
Redundancy = Average length of code - Entropy
= 1.83 - 1.82
= 0.01 bits/symbol
Tunstall code : For a 3-bit Tunstall code, the maximum number of codewords in the codebook is 2³ = 8.
Symbol | Initial probability
a1     | 0.7
a2     | 0.2
a3     | 0.1
Take the symbol with the highest probability (i.e., a1), concatenate it with every other symbol, and remove a1 from the list. The entries in the codebook will be :
Sequence | Probability
a1a1     | 0.49
a1a2     | 0.14
a1a3     | 0.07
a2       | 0.20
a3       | 0.10
As the number of codewords is less than the maximum value, again apply the same procedure to a1a1 (the entry with maximum probability) :
Sequence | Probability
a1a1a1   | 0.343
a1a1a2   | 0.098
a1a1a3   | 0.049
a1a2     | 0.14
a1a3     | 0.07
a2       | 0.20
a3       | 0.10
We have to stop here because, if we applied one more iteration, the number of codewords would exceed the maximum limit. Thus, the final codebook becomes :
Sequence | Codeword
a2       | 000
a3       | 001
a1a2     | 010
a1a3     | 011
a1a1a1   | 100
a1a1a2   | 101
a1a1a3   | 110
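The Tunstall construction used above can be sketched in a few lines of Python : repeatedly expand the most probable codebook entry until one more expansion would exceed 2^n entries. For P(a1) = 0.7, P(a2) = 0.2, P(a3) = 0.1 and n = 3 it reproduces the 7-entry codebook above.

def tunstall(probs, n_bits):
    # probs : dict letter -> probability; n_bits : fixed codeword length.
    max_entries = 2 ** n_bits
    book = dict(probs)
    # Each expansion removes one entry and adds len(probs) new entries.
    while len(book) + len(probs) - 1 <= max_entries:
        seq = max(book, key=book.get)          # most probable entry
        p = book.pop(seq)
        for letter, lp in probs.items():
            book[seq + letter] = p * lp
    # Assign fixed-length codewords to the final entries.
    ordered = sorted(book, key=len)
    return {seq: format(i, '0{}b'.format(n_bits)) for i, seq in enumerate(ordered)}

probs = {'a1': 0.7, 'a2': 0.2, 'a3': 0.1}
for seq, cw in tunstall(probs, 3).items():
    print(seq, cw)     # a2 000, a3 001, a1a2 010, ..., a1a1a3 110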
Que 2.18. For an alphabet A = {a1, a2, a3} with probabilities P(a1) = 0.7, P(a2) = 0.2, P(a3) = 0.1, design a 3-bit Tunstall code.
UPTU 2013-14, Marks 05
Answer
Refer Q. 2.17, Page 55D, Unit-2.
Que 2.19. Write short notes on the following :
i. Rice codes
ii. Non-binary Huffman code
UPTU 2013-14, Marks 05
Answer
i. Rice codes : Refer Q. 2.15, Page 50D, Unit-2.
ii. Non-binary Huffman code : Refer Q. 2.6, Page 36D, Unit-2.

Que 2.20. Write short notes on the following :
i. Golomb codes
ii. Rice code
UPTU 2014-15, Marks 05
Answer
i. Golomb codes : Refer Q. 2.13, Page 48D, Unit-2.
ii. Rice code : Refer Q. 2.15, Page 50D, Unit-2.

Que 2.21. What is redundancy of code ? How can it be defined and calculated ?
OR
Explain redundancy of code with the help of an example.
Answer
1. A sequence of symbols which are dependent upon one another is termed redundant.
2. The main concept in detecting or correcting errors is redundancy.
3. Redundancy is achieved through various coding schemes.
4. The sender adds redundant bits through a process that creates a relationship between the redundant bits and the actual bits.
5. The receiver checks the relationship between the two sets of bits to detect or correct errors.
6. The redundancy E is defined as,
E = 1 - H(y|x) / H(x)
where E is the redundancy of the message, H(y|x) is the entropy of the message when intersymbol influence is taken into account, and H(x) is the entropy assuming all symbols are independent.
7. The redundancy of a sequence of symbols can therefore be measured by noting the amount by which the entropy has been reduced.
Que 2.22. What are the various applications of Huffman coding ? Also give the various steps required in the encoding procedure.
UPTU 2013-14, Marks 10
Answer
Applications of Huffman coding :
1. Lossless image compression :
a. A simple application of Huffman coding to image compression would be to generate a Huffman code for the set of values that any pixel may take.
b. For monochrome images, this set usually consists of integers from 0 to 255.
c. The original (uncompressed) image representation uses 8 bits/pixel.
d. If the image consists of 256 rows of 256 pixels, the uncompressed representation uses 65,536 bytes.
e. The compression ratio is simply the ratio of the number of bytes in the uncompressed representation to the number of bytes in the compressed representation.
f. The number of bytes in the compressed representation includes the number of bytes needed to store the Huffman code.
2. Text compression :
a. Text compression seems natural for Huffman coding.
b. In text, we have a discrete alphabet that, in a given class, has relatively stationary probabilities.
c. For example, the probability model for a particular novel will not differ significantly from the probability model for another novel.
d. Huffman coding is very much used for text compression.
e. It is found that for two documents that are substantially different, the two sets of probabilities are very much alike.
f. Experiments show that, on an average, the file size dropped from 70,000 bytes to about 43,000 bytes with Huffman coding.
3. Audio compression :
a. Huffman coding shows a significant amount of reduction in the size of an audio file.
b. Uncompressed CD-quality audio data requires an enormous amount of bandwidth for transmission.
c. Applying Huffman coding reduces its size, so a lower bandwidth is required for the transmission of audio data over a communication link.
Que 2.23. Write down the applications of Huffman coding in text compression and audio compression.
OR
Explain lossless image compression with an example.
UPTU, Marks 05
Answer
Refer Q. 2.22, Page 57D, Unit-2.
Que 2.24. Explain minimum variance Huffman code and the encoding procedure, taking a suitable example. What are the various applications of Huffman coding ?
UPTU 2012-13, Marks 10
Answer
Minimum variance Huffman code : Refer Q. 2.4, Page 34D, Unit-2.
Encoding procedure :
1. When more than two "symbols" in a Huffman tree have the same probability, different merge orders produce different Huffman codes.
2. Table 2.24.1 and Fig. 2.24.1 show this :
[Table 2.24.1 : Huffman code construction steps, giving two different codeword assignments for the same symbol probabilities.]
[Fig. 2.24.1 : Two code trees with the same symbol probabilities.]
We prefer a code with smaller length variance.
3. Huffman's well-known coding method constructs a minimum redundancy code which minimizes the expected value of the word length.
4. We characterize the minimum redundancy code with the minimum variance of the word length.
5. An algorithm can be given to construct such a code, and it can be shown that the code is unique in a certain sense.
6. Furthermore, the code is also shown to have a strong property in that it minimizes a general class of functions of the minimum redundancy codes, as long as the functions are non-decreasing with respect to the path lengths from the root to the internal nodes of the corresponding decoding trees.
7. To build a Huffman code with minimum variance, you need to break ties with one of the following methods (of course, the probability of a node is the most important) :
a. Select the node that represents the shortest tree.
b. Select the node that was created earliest (consider leaves as created at the start).
8. According to where in the list the combined source is placed, we obtain different Huffman codes with the same average length (same compression performance).
9. In some applications we do not want the codeword lengths to vary significantly from one symbol to another (example : fixed-rate channels).
10. To obtain a minimum variance Huffman code, we always put the combined symbol as high in the list as possible.
11. For example, for the alphabet {a1, a2, a3, a4, a5} with P(a1) = 0.2, P(a2) = 0.4, P(a3) = 0.2, P(a4) = 0.1 and P(a5) = 0.1, the minimum variance Huffman code is :
a1 (0.2)  10
a2 (0.4)  00
a3 (0.2)  11
a4 (0.1)  010
a5 (0.1)  011
Average length = 2.2 bits
Entropy = 2.122 bits
Applications of Huffman coding : Refer Q. 2.22, Page 57D, Unit-2.
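The figures quoted in the example of Q. 2.24 can be verified quickly in Python; the minimum variance assignment gives codeword lengths 2, 2, 2, 3, 3 :

from math import log2

probs   = {'a1': 0.2, 'a2': 0.4, 'a3': 0.2, 'a4': 0.1, 'a5': 0.1}
lengths = {'a1': 2, 'a2': 2, 'a3': 2, 'a4': 3, 'a5': 3}

avg_len = sum(probs[s] * lengths[s] for s in probs)
entropy = -sum(p * log2(p) for p in probs.values())

print(avg_len)            # 2.2 bits
print(round(entropy, 3))  # 2.122 bits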
Q. 1. Define Huffman coding and write its properties.
Ans. Refer Q. 2.1.
Q. 2. Explain minimum variance Huffman codes.
Ans. Refer Q. 2.4.
Q. 3. Draw the flowchart of the encoding and decoding procedure.
Ans. Refer Q. 2.12.
Q. 4. Write a short note on :
i. Golomb code
ii. Rice code
iii. Tunstall code
Ans.
i. Refer Q. 2.13.
ii. Refer Q. 2.15.
iii. Refer Q. 2.16.
Q. 5. Design a Golomb code for m = 8, for values of n = 0, 1, 2, ...
Arithmetic Coding

Part-1 (63D - 74D)
* Coding a Sequence : Generating a Binary Code
* Comparison of Binary and Huffman Coding
* Applications : Bi-Level Image Compression (The JBIG Standard), JBIG2 Image Compression
A. Concept Outline : Part-1 ............................................. 63D
B. Long and Medium Answer Type Questions ................ 63D

Part-2 (74D - 90D)
* Dictionary Techniques : Static Dictionary : Digram Coding
* Adaptive Dictionary : The LZ77 Approach
* Applications : UNIX Compress, Image Compression : The Graphics Interchange Format (GIF)
A. Concept Outline : Part-2 ............................................. 74D
B. Long and Medium Answer Type Questions ................ 75D

Part-3 (90D - 105D)
* Prediction with Partial Match (PPM) : The Basic Algorithm, The ESCAPE SYMBOL, Length of Context, The Exclusion Principle
* The Burrows-Wheeler Transform : Move-to-Front Coding
* CALIC, JPEG-LS, Multiresolution Approaches, Facsimile Encoding, Dynamic Markov Compression
A. Concept Outline : Part-3 ............................................. 90D
B. Long and Medium Answer Type Questions ................ 90D
PART-1
Coding a Sequence : Generating a Binary Code, Comparison of Binary and Huffman Coding, Applications : Bi-Level Image Compression (The JBIG Standard), JBIG2 Image Compression.

CONCEPT OUTLINE : PART-1
* Arithmetic coding is useful when dealing with sources with small alphabets, such as binary sources.
* JBIG recommends arithmetic coding as the coding scheme for coding binary images.
* JBIG2 is an advanced version of the JBIG algorithm; it uses the same arithmetic coding scheme as JBIG.
* By using arithmetic coding, we get rates closer to the entropy than by using Huffman codes.
RESTA] Derine arithmetic coding in brief. Write about its
application.
1. Arithmetic coding is a method of generating variable-length codes.
2. Arithmetic coding is especially useful when dealing with sources with small alphabets, such as binary sources, and alphabets with highly skewed probabilities.
3. It is also a very useful approach when, for various reasons, the modeling and coding aspects of lossless compression are to be kept separate.
4. In arithmetic coding, a unique identifier or tag is generated for the sequence to be encoded.
5. This tag corresponds to a binary fraction, which becomes the binary code for the sequence.
6. The generation of the tag and the binary code are the same process.
7. However, the arithmetic coding approach is easier to understand if we conceptually divide it into two phases.
8. In the first phase, a unique identifier or tag is generated for a given sequence of symbols.
9. This tag is then given a unique binary code.
10. A unique arithmetic code can be generated for a sequence of length m without the need for generating codewords for all sequences of length m.
Applications of arithmetic coding :
1. Arithmetic coding is used in a variety of lossless and lossy compression applications.
2. It is part of many international standards. In the area of multimedia, there are a few principal organizations that develop standards.
3. Arithmetic coding is used in image compression, audio compression, and video compression standards.
Que 3.2. What do you mean by binary code ? How is a sequence coded ?
Answer
Binary code :
1. A binary code is a way of representing text or computer instructions by the use of the binary number system (0 and 1).
2. This is accomplished by assigning a bit string to each particular symbol or instruction.
3. For example, a binary string of eight binary digits (bits) can represent any of 256 symbols, letters or instructions (see the short example after this list).
4. In computing, binary codes are used for many methods of encoding data, such as character strings into bit strings.
5. These methods may be fixed-width or variable-width. In a fixed-width binary code, each letter, digit, or other character is represented by a bit string of the same length; that bit string, interpreted as a binary number, is usually displayed in code tables in octal, decimal or hexadecimal notation.
6. There are many character sets and character encodings for them.
7. A bit string, interpreted as a binary number, can be translated into a decimal number.
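As a small illustration of point 3 above (the helper names are hypothetical), an 8-bit fixed-width code can represent any of 256 symbols; the sketch below converts a character string to such a bit string and back.

```python
def to_fixed_width_bits(text):
    # Each character becomes an 8-bit string, so 256 distinct symbols are possible.
    return "".join(format(ord(ch), "08b") for ch in text)

def from_fixed_width_bits(bits):
    # Read the bit string back in 8-bit groups.
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))

print(to_fixed_width_bits("Hi"))                  # 0100100001101001
print(from_fixed_width_bits("0100100001101001"))  # Hi
```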
Coding a sequence :
1. In order to distinguish one sequence of symbols from another, we need to tag it with a unique identifier.
2. One possible set of tags for representing sequences of symbols is the set of numbers in the unit interval [0, 1).
3. Because the number of numbers in the unit interval is infinite, it should be possible to assign a unique tag to each distinct sequence of symbols.
4. In order to do this, we need a function that will map sequences of symbols into the unit interval.
5. A function that maps random variables, and sequences of random variables, into the unit interval is the cumulative distribution function (cdf) of the random variable associated with the source.
6. A random variable maps the outcomes, or sets of outcomes, of an experiment to values on the real number line.
7. For example, in a coin-tossing experiment, the random variable could map a head to zero and a tail to one (or it could map a head to 23675 and a tail to -192).
8. To use this technique, we need to map the source symbols or letters to numbers.
9. For convenience, we will use the mapping
X(a_i) = i,   a_i ∈ A
where A = {a_1, a_2, ..., a_m} is the alphabet for a discrete source and X is a random variable.
10. This mapping means that, given a probability model P for the source, we also have a probability density function for the random variable
P(X = i) = P(a_i)
and the cumulative density function can be defined as
F_X(i) = Σ_{k=1}^{i} P(X = k).
Que 3.3. How is a tag generated in arithmetic coding ?
Answer
1. The procedure for generating the tag works by reducing the size of the interval in which the tag resides as more and more elements of the sequence are received.
2. Start by first dividing the unit interval into sub-intervals of the form [F_X(i - 1), F_X(i)), i = 1, ..., m.
3. Because the minimum value of the cdf is zero and the maximum value is one, this exactly partitions the unit interval.
4. We associate the sub-interval [F_X(i - 1), F_X(i)) with the symbol a_i.
5. The appearance of the first symbol in the sequence restricts the interval containing the tag to one of these sub-intervals.
6. Suppose the first symbol was a_k. Then the interval containing the tag value will be the sub-interval [F_X(k - 1), F_X(k)).
7. This sub-interval is now partitioned in exactly the same proportions as the original interval.
8. That is, within [F_X(k - 1), F_X(k)), the interval corresponding to the symbol a_i is given by
[F_X(k - 1) + F_X(i - 1)(F_X(k) - F_X(k - 1)), F_X(k - 1) + F_X(i)(F_X(k) - F_X(k - 1))).
9. Thus, if the second symbol in the sequence is a_i, the interval containing the tag is restricted to this sub-interval.
10. Each succeeding symbol causes the tag to be restricted to a sub-interval that is further partitioned in the same proportions (this restriction can be written as the recursion below).
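Written out explicitly, with x_n denoting the index of the n-th symbol in the sequence, the limits of the tag interval are updated by the following recursion (this is the form used in the worked example in the next question) :

$$l^{(n)} = l^{(n-1)} + \left(u^{(n-1)} - l^{(n-1)}\right) F_X(x_n - 1)$$
$$u^{(n)} = l^{(n-1)} + \left(u^{(n-1)} - l^{(n-1)}\right) F_X(x_n)$$

with $l^{(0)} = 0$ and $u^{(0)} = 1$; the tag can be taken as the midpoint of the final interval.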
Que 3.4. Find the real valued tag for the sequence a_1 a_2 a_3 a_2 a_3 a_1 over the alphabet {a_1, a_2, a_3} with probabilities {0.2, 0.3, 0.5}.
Answer
Given : P(a_1) = 0.2, P(a_2) = 0.3, P(a_3) = 0.5.
To find : Real valued tag for the sequence a_1 a_2 a_3 a_2 a_3 a_1 (index sequence 1 2 3 2 3 1).
As we know, F_X(i) = Σ_{k=1}^{i} P(X = k), so
F_X(1) = 0.2
F_X(2) = 0.2 + 0.3 = 0.5
F_X(3) = 0.5 + 0.5 = 1
For any sequence, the lower and upper limits are given by
l^(n) = l^(n-1) + (u^(n-1) - l^(n-1)) F_X(x_n - 1)
u^(n) = l^(n-1) + (u^(n-1) - l^(n-1)) F_X(x_n)
Initialize l^(0) = 0 and u^(0) = 1.
The 1st element of the sequence is 1, so
l^(1) = l^(0) + (u^(0) - l^(0)) F_X(0) = 0 + (1 - 0) × 0 = 0
u^(1) = l^(0) + (u^(0) - l^(0)) F_X(1) = 0 + (1 - 0) × 0.2 = 0.2
So, the tag is contained in the interval [0, 0.2).
The 2nd element of the sequence is 2, so
l^(2) = l^(1) + (u^(1) - l^(1)) F_X(1) = 0 + (0.2 - 0) × 0.2 = 0.04
u^(2) = l^(1) + (u^(1) - l^(1)) F_X(2) = 0 + (0.2 - 0) × 0.5 = 0.1
So, the tag is contained in the interval [0.04, 0.1).
The 3rd element of the sequence is 3, so
l^(3) = l^(2) + (u^(2) - l^(2)) F_X(2) = 0.04 + (0.1 - 0.04) × 0.5 = 0.07
u^(3) = l^(2) + (u^(2) - l^(2)) F_X(3) = 0.04 + (0.1 - 0.04) × 1 = 0.1
So, the tag is contained in the interval [0.07, 0.1).
The 4th element of the sequence is 2, so
l^(4) = l^(3) + (u^(3) - l^(3)) F_X(1) = 0.07 + (0.1 - 0.07) × 0.2 = 0.076
u^(4) = l^(3) + (u^(3) - l^(3)) F_X(2) = 0.07 + (0.1 - 0.07) × 0.5 = 0.085
So, the tag is contained in the interval [0.076, 0.085).
The 5th element of the sequence is 3, so
l^(5) = l^(4) + (u^(4) - l^(4)) F_X(2) = 0.076 + (0.085 - 0.076) × 0.5 = 0.0805
u^(5) = l^(4) + (u^(4) - l^(4)) F_X(3) = 0.076 + (0.085 - 0.076) × 1 = 0.085
So, the tag is contained in the interval [0.0805, 0.085).
The 6th element of the sequence is 1, so
l^(6) = l^(5) + (u^(5) - l^(5)) F_X(0) = 0.0805 + (0.085 - 0.0805) × 0 = 0.0805
u^(6) = l^(5) + (u^(5) - l^(5)) F_X(1) = 0.0805 + (0.085 - 0.0805) × 0.2 = 0.0814
Tag (1 2 3 2 3 1) = (0.0805 + 0.0814) / 2 = 0.08095
The real valued tag for the given sequence is 0.08095.
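The same computation can be checked with a short script. This is a minimal, self-contained sketch (the name arithmetic_tag is illustrative, not from the text); for the index sequence 1 2 3 2 3 1 with probabilities {0.2, 0.3, 0.5} it prints approximately 0.08095.

```python
def arithmetic_tag(sequence, probabilities):
    """Midpoint tag for a sequence of 1-based symbol indices."""
    # cdf[i] = F_X(i), with cdf[0] = 0.
    cdf = [0.0]
    for p in probabilities:
        cdf.append(cdf[-1] + p)

    low, high = 0.0, 1.0
    for x in sequence:
        span = high - low
        high = low + span * cdf[x]        # u(n) = l(n-1) + (u(n-1) - l(n-1)) F_X(x_n)
        low = low + span * cdf[x - 1]     # l(n) = l(n-1) + (u(n-1) - l(n-1)) F_X(x_n - 1)
    return (low + high) / 2

print(arithmetic_tag([1, 2, 3, 2, 3, 1], [0.2, 0.3, 0.5]))  # ~0.08095
```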
Que 3.5. What are the advantages and problems of arithmetic coding ?
Answer
Advantages of arithmetic coding :
1. Arithmetic coding is naturally suitable for adaptation strategies.
2. Arithmetic coding is close to optimal compression performance for sources
with very low entropies.
3. It is easy to implement a system with multiple arithmetic codes.
Problems of arithmetic coding :
1. Unlike Huffman coding, the decoder requires knowledge of the number of symbols to be decoded.
2. Transmission of coded data cannot begin until the whole sequence is encoded.
3. Infinite precision is required.
Que 3.6. Compare Huffman coding and arithmetic coding. How is a tag generated in arithmetic coding ? [UPTU 2011-12, Marks 05]
Answer
1. The Huffman algorithm requires building the entire code for all possible sequences of length m; if the original alphabet size is k, the size of the codebook is k^m. For arithmetic coding, we do not need to build the entire codebook; we simply obtain the code corresponding to a given sequence.
2. By using arithmetic coding, we get rates closer to the entropy than by using Huffman coding.
3. In Huffman coding, we are guaranteed to obtain a rate within p_max + 0.086 of the entropy. When the alphabet size is relatively large and the probabilities are not too skewed, Huffman coding provides better results than arithmetic coding.
4. Arithmetic coding comes with extra complexity compared to Huffman coding.
5. It is easier to implement multiple arithmetic codes in a system than multiple Huffman codes.
6. It is much easier to adapt arithmetic codes to changing input statistics.
Generating a tag in arithmetic coding : Refer Q. 3.3, Unit-3.
Que 3.7. What do you mean by binary code ? Compare binary code with Huffman code. [UPTU 2011-12, Marks 05]
Answer
Binary code : Refer Q. 3.2, Page 64D, Unit-3.
Comparison of binary code with Huffman code :
Binary code : In binary code, we can generate codewords for a group or sequence of symbols.
Huffman code : In Huffman code, we need a separate codeword for each symbol in a sequence.
Que 3.8. Explain the JBIG standard of bi-level image compression.
Answer
1. The arithmetic coding approach is particularly amenable to the use of
multiple coders.
2. All coders use the same computational machinery, with each coder using a different set of probabilities.
3. The JBIG algorithm makes full use of this feature of arithmetic coding.
4. Instead of checking to see if most of the pixels in the neighbourhood are white or black, the JBIG encoder uses the pattern of pixels in the neighbourhood, or context, to decide which set of probabilities to use in encoding a particular pixel.
5. If the neighbourhood consists of 10 pixels, with each pixel capable of taking on two different values, the number of possible patterns is 1024 (a sketch of such context indexing follows this answer).
6. The JBIG coder uses 1024 to 4096 coders, depending on whether a low- or high-resolution layer is being encoded.
7. For the low-resolution layer, the JBIG encoder uses one of two different neighbourhoods (templates), as shown in Fig. 3.8.1.
[Fig. 3.8.1 : (a) Three-line and (b) two-line neighbourhoods.]
8. The pixel to be coded is marked X, while the pixels to be used for templates are marked O or A.
9. The A and O pixels are previously encoded pixels and are available to both encoder and decoder.
10. The A pixel can be thought of as a floating member of the neighbourhood; its placement is dependent on the input being encoded.
11. In the JBIG standard, the 1024 arithmetic coders are a variation of the arithmetic coder known as the QM coder.
12. The QM coder is a modification of an adaptive binary arithmetic coder called the Q coder, which in turn is an extension of another binary adaptive arithmetic coder called the skew coder.
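Points 4 and 5 can be made concrete with a small sketch (illustrative only; it is not the actual JBIG template layout or the QM coder). A 10-pixel context is packed into an index from 0 to 1023, and each index selects its own adaptive probability estimate.

```python
def context_index(pixels):
    """Pack 10 binary neighbourhood pixels (0 or 1) into an index in [0, 1023]."""
    assert len(pixels) == 10
    index = 0
    for p in pixels:
        index = (index << 1) | p
    return index

# One adaptive count pair [ones, total] per context, i.e. 1024 separate models.
counts = [[1, 2] for _ in range(1024)]

def probability_of_black(ctx):
    ones, total = counts[ctx]
    return ones / total

def update(ctx, pixel):
    counts[ctx][0] += pixel
    counts[ctx][1] += 1

ctx = context_index([0, 1, 1, 0, 0, 0, 1, 0, 1, 1])
print(ctx, probability_of_black(ctx))   # 395 0.5
```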
Que 3.9. Define progressive transmission. Compare MH, MR, MMR and JBIG.
Answer
1. In some applications, we may not always need to view an image at full resolution.
2. For example, if we are looking at the layout of a page, we may not need to know what each word or letter on the page is.
3. The JBIG standard allows for the generation of progressively lower-resolution images.
4. If the user is interested in some gross patterns in the image (for example, if they were interested in seeing whether there were any figures on a particular page), they could request a lower-resolution image, which could be transmitted using fewer bits.
5. Once the lower-resolution image was available, the user could decide whether a higher-resolution image was necessary.
6. The JBIG specification recommends generating one lower-resolution pixel for each 2 × 2 block in the higher-resolution image.
7. The number of lower-resolution images (called layers) is not specified by JBIG.
8. A straightforward method for generating lower-resolution images is to replace every 2 × 2 block of pixels with the average value of the four pixels, thus reducing the resolution by two in both the horizontal and vertical directions. This approach works well as long as three of the four pixels are either black or white.
9. However, when we have two pixels of each kind, we run into trouble; consistently replacing the four pixels with either a white or a black pixel causes a severe loss of detail, and randomly replacing with a black or white pixel introduces a considerable amount of noise into the image.
10. Instead of simply taking the average of every 2 × 2 block, the JBIG specification provides a table-based method for resolution reduction. The table is indexed by the neighbouring pixels as shown in Fig. 3.9.1, in which the circles represent the lower-resolution layer pixels and the squares represent the higher-resolution layer pixels.
11. Each pixel contributes a bit to the index. The table is formed by computing the expression
4e + 2(b + d + f + h) + (a + c + g + i) - 3(B + C) - A
where a to i are the higher-resolution pixels and A, B, C are previously determined lower-resolution pixels.
[Fig. 3.9.1 : Pixels used for resolution reduction (circles : lower-resolution layer; squares : higher-resolution layer).]
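A rough sketch of how such a table-driven reduction could be computed is shown below. The weighting expression follows point 11; the 4.5 decision threshold is an assumption taken from common descriptions of the JBIG reduction rule, not something stated above.

```python
def low_res_pixel(a, b, c, d, e, f, g, h, i, A, B, C):
    """Decide one lower-resolution pixel from nine high-resolution pixels
    (a..i, centre pixel e) and three neighbouring low-resolution pixels (A, B, C)."""
    score = 4 * e + 2 * (b + d + f + h) + (a + c + g + i) - 3 * (B + C) - A
    # Assumed rule: output black (1) when the weighted count of black
    # high-resolution pixels dominates.
    return 1 if score > 4.5 else 0
```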
Comparison of MH, MR, MMR and JBIG
1. The JBIG algorithm performs better than the MMR algorithm, which
performs better than the MR algorithm, which in turn performs better
than the MH algorithm.
2. The level of complexity also follows the same trend, although we could
argue that MMR is actually less complex than MR.
3. A comparison of the schemes for some facsimile sources is shown in
Table 3.9.1.
4. The modified READ algorithm was used with K = 4, while the JBIG algorithm was used with an adaptive three-line template and adaptive arithmetic coder to obtain the results.
5. As we go from the one-dimensional MH coder to the two-dimensional MMR coder, we get a factor of two reduction in file size for the sparse text sources.
6. We get even more reduction when we use an adaptive coder and an adaptive model, as is true for the JBIG coder.
7. When we come to the dense text, the advantage of the two-dimensional MMR over the one-dimensional MH is not as significant, as the amount of two-dimensional correlation becomes substantially less.
Table 3.9.1 : Comparison of binary image coding schemes.
Source description | Original size (pixels) | MH (bytes) | MR (bytes) | MMR (bytes) | JBIG (bytes)
Letter             | 4352 × 3072            | 20,605     | 14,290     | 8,531       | 6,682
Sparse text        | 4352 × 3072            | 26,155     | 16,676     | 9,056       | 7,606
Dense text         | 4352 × 3072            | 135,705    | 105,684    | 92,100      | 70,703
Que 3.10. Write a short note on JBIG2.
Answer
1. The JBIG2 standard was approved in February of 2000.
2. Besides facsimile transmission, the standard is also intended for document
storage, archiving, wireless transmission, print spooling, and coding of
images on the web.
3. The standard provides specifications only for the decoder, leaving the
encoder design open.
4. This means that the encoder design can be constantly refined, subject
only to compatibility with the decoder specifications.
5. This situation also allows for lossy compression, because the encoder
can incorporate lossy transformations to the data that enhance the level
of compression.
6. The compression algorithm in JBIG provides excellent compression of a generic bi-level image.
7. The compression algorithm proposed for JBIG2 uses the same arithmetic coding scheme as JBIG.
8. However, it takes advantage of the fact that a significant number of bi-level images contain structure that can be used to enhance the compression performance.
9. A large percentage of bi-level images consist of text on some background, while another significant percentage of bi-level images are, or contain, halftone images.
10. The JBIG2 approach allows the encoder to select the compression technique that would provide the best performance for the type of data.
11. To do so, the encoder divides the page to be compressed into three types of regions called symbol regions, halftone regions, and generic regions.
12. The symbol regions are those containing text data, the halftone regions are those containing halftone images, and the generic regions are the regions that do not fit into either category.
13. The partitioning information has to be supplied to the decoder.
14. The decoder requires that all information provided to it be organized in segments that are made up of a segment header, a data header, and segment data.
15. The page information segment contains information about the page, including its size and resolution.
16. The decoder uses this information to set up the page buffer.
17. It then decodes the various regions using the appropriate decoding procedure and places the different regions in the appropriate locations.
Que 3.11. What do you mean by binary code ? Compare binary code with Huffman code. What are adaptive compression schemes ? What is the basic difference between adaptive and statistical compression schemes ? Discuss with the model of adaptive compression.
Answer
Binary code : Refer Q. 3.2, Page 64D, Unit-3.
Comparison of binary code with Huffman code : Refer Q. 3.7, Page 69D, Unit-3.
Adaptive compression :
1. So far we have seen two algorithms for compression of messages or text which used static data about the input text.
2. They generated the code based on statistical data which contains a code for each symbol, calculated beforehand, and then coded the text accordingly.
3. These were the modes of statistical compression, which retain the advantage of compression based on previously known statistics.
4. In adaptive compression, the input file is scanned only once.
5. Examples of adaptive compression methods include the PPMC method.
Difference between adaptive and statistical compression scheme :
1. With respect to the statistical compression model, the adaptive model has the advantage of a better compression ratio.
2. This model also corresponds to the probability (or frequency) of characters in the input message stream.
3. Here, in this method, the static data is updated according to the input symbol and then coded as per the regulations of the compression model.
4. A similar procedure is followed at the receiving end to decode the data.
Model of adaptive compression :
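A minimal sketch of such a model is given below (class and method names are illustrative, not from the text). Encoder and decoder start from the same initial frequency table and update it after every symbol, so the probabilities adapt to the message without any statistics being transmitted in advance.

```python
class AdaptiveModel:
    """Order-0 adaptive frequency model: counts are updated as symbols arrive."""

    def __init__(self, alphabet):
        # Start every symbol at count 1 so nothing has zero probability.
        self.counts = {s: 1 for s in alphabet}
        self.total = len(alphabet)

    def probability(self, symbol):
        return self.counts[symbol] / self.total

    def update(self, symbol):
        # Both encoder and decoder call this after coding each symbol,
        # so their models stay synchronized.
        self.counts[symbol] += 1
        self.total += 1

model = AdaptiveModel("ab")
for s in "aaab":
    p = model.probability(s)   # probability used to code this symbol
    model.update(s)
print(round(model.probability("a"), 2))  # 0.67 after seeing 'aaab'
```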
CONCEPT OUTLINE : PART-2
* Static dictionary is static in nature i.e., the entries in the dictionary are predefined.
* Adaptive dictionary is adaptive in nature i.e., the entries in the dictionary are constructed as the symbols are received for encoding.
* The LZ77 and LZ78 algorithms are used in adaptive dictionary techniques.
* One of the more common forms of static dictionary coding is static digram coding.
Que 3.12. Discuss the role of dictionary techniques. Explain the concept of static dictionary.
Answer
Role of dictionary technique :
1. In many applications, the output of the source consists of recurring patterns, i.e., certain patterns recur more frequently while some patterns do not occur, or if they do, occur with great rarity.
2. In such cases, encoding of the source can be done by keeping a list, or dictionary, of frequently occurring patterns.
3. When a pattern in the dictionary appears in the source output, it is encoded with a reference to the dictionary.
4. If the pattern does not appear in the dictionary, then it can be encoded using some other, less efficient, method.
5. In effect, we are splitting the input into two classes : frequently occurring patterns and infrequently occurring patterns.
6. For this technique to be effective, the class of frequently occurring patterns, and hence the size of the dictionary, must be much smaller than the number of all possible patterns.
Static dictionary :
1. A static dictionary is static in nature, i.e., the entries in the dictionary are predefined.
2. A static dictionary is mainly useful when prior knowledge of the output of the source is available.
3. They form application-specific dictionaries because the content of the dictionary will vary from application to application.
4. Two applications from different domains will have different recurring patterns of words, thus making different application-specific dictionaries.
5. For example, if we want to compress the library records of a university, a static dictionary approach may be useful because we know in advance that words such as book title and ISBN number are going to appear in almost all records, and they can be included in the static dictionary.
6. There could be a number of other situations in which an application-specific or data-specific static dictionary based coding scheme would be the most efficient (a digram-coding sketch follows this answer).
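The concept outline above mentions static digram coding as one of the more common forms of static dictionary coding. The sketch below (illustrative, with a hypothetical five-entry dictionary) encodes a string by replacing dictionary digrams with their index and falling back to single characters for everything else.

```python
# Toy static digram dictionary: index -> pattern. In practice the dictionary
# would be built in advance from the known statistics of the application.
DICTIONARY = {0: "th", 1: "he", 2: "in", 3: "er", 4: "an"}
LOOKUP = {pattern: idx for idx, pattern in DICTIONARY.items()}

def digram_encode(text):
    """Encode text as a list of ('D', index) or ('C', char) tokens."""
    tokens = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in LOOKUP:                # frequently occurring pattern
            tokens.append(("D", LOOKUP[pair]))
            i += 2
        else:                             # infrequent pattern: fall back to one character
            tokens.append(("C", text[i]))
            i += 1
    return tokens

def digram_decode(tokens):
    return "".join(DICTIONARY[v] if kind == "D" else v for kind, v in tokens)

if __name__ == "__main__":
    coded = digram_encode("the rain")
    print(coded)                 # [('D', 0), ('C', 'e'), ('C', ' '), ('C', 'r'), ('C', 'a'), ('D', 2)]
    print(digram_decode(coded))  # the rain
```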