Assignment 2 (Data Compression)

Submitted by:                              Submitted to:
Name: Mohammad Avaish Khan                 Tabassum Mam
Branch: B.Tech CSE B                       (Assistant Professor)
Year/Sem: 3rd year / 5th semester
Enrollment: 2200101tot
Q1. What is a Dictionary? Difference Between the LZW, LZ77, and LZ78 Approaches
A dictionary in data compression refers to a set of mappings (or a table) used to encode repeated patterns or sequences in the data. Dictionary-based compression methods replace sequences with shorter references to dictionary entries, making the data more compact.
Differences:
LZW (Lempel-Ziv-Welch): Builds a dictionary dynamically from the input data and uses it to encode future sequences. LZW is widely used in GIF files. (A short encoder sketch follows this list.)
LZ77: Uses a sliding-window approach in which the encoder searches for matching sequences within a fixed-size window of the most recent symbols.
LZ78: Also builds a dictionary from the input data, but stores unique sequences as entries. Unlike LZ77, it does not use a sliding window; instead, it builds an explicit dictionary of previously seen sequences.
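To make the dictionary idea concrete, here is a minimal LZW encoder sketch in Python (an illustrative addition, not part of the original answer; it seeds the dictionary from the input alphabet rather than the usual 256-entry byte table to keep it short):

    def lzw_encode(data: str) -> list[int]:
        # Start with a code for every distinct character in the input.
        dictionary = {ch: i for i, ch in enumerate(sorted(set(data)))}
        next_code = len(dictionary)
        current = ""
        output = []
        for ch in data:
            candidate = current + ch
            if candidate in dictionary:
                current = candidate                 # keep extending the current match
            else:
                output.append(dictionary[current])  # emit the code for the longest match
                dictionary[candidate] = next_code   # add the new sequence to the dictionary
                next_code += 1
                current = ch
        if current:
            output.append(dictionary[current])
        return output

    print(lzw_encode("ababababa"))  # -> [0, 1, 2, 4, 3]

Here codes 2 and 4 refer to the entries "ab" and "aba" added during encoding, which is exactly the dictionary reuse described above.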
Q2. BWT (Burrows-Wheeler Transform) Decoding
The BWT is a transformation used in data compression in which the original data is rearranged so that similar characters cluster together. For example, the BWT of "banana" (with an end-of-string marker $ appended) produces "annb$aa".
BWT Decoding Example:
Given the BWT output "annb$aa", we want to reconstruct "banana".
Step 1: Sort the characters alphabetically to obtain the first column.
Step 2: Use the last and first columns to reconstruct the original string by tracking the position of each character.
Through this process, we eventually rebuild the string "banana".
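One simple (though inefficient) way to carry out this reconstruction in Python is to repeatedly prepend the BWT column to a growing table and re-sort, rebuilding the rotation table column by column; this is only an illustration of the idea, not the exact worked method above:

    def inverse_bwt(last_col: str) -> str:
        table = [""] * len(last_col)
        for _ in range(len(last_col)):
            # Prepend the known last column, then sort to recover the next column.
            table = sorted(last_col[i] + table[i] for i in range(len(last_col)))
        # The row ending with the sentinel '$' is the original string.
        original = next(row for row in table if row.endswith("$"))
        return original.rstrip("$")

    print(inverse_bwt("annb$aa"))  # -> "banana"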
Q3. Move-to-Front (MTF) Coding
Move-to-front coding is a technique that maintains a list of symbols. Whenever a symbol appears, it is moved to the front of the list, reducing redundancy when similar characters occur close together.
Example: Suppose we have the list [a, b, c, d] and the string "cabc". The encoding proceeds by outputting each accessed character's current (0-based) position and then moving it to the front:
'c' becomes 2, list changes to [c, a, b, d]
'a' becomes 1, list changes to [a, c, b, d]
'b' becomes 2, list changes to [b, a, c, d]
'c' becomes 2, list changes to [c, b, a, d]
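A short Python sketch of this encoding (using 0-based indices, as in the example; mtf_encode is an illustrative name, not from the assignment):

    def mtf_encode(data: str, alphabet: list[str]) -> list[int]:
        symbols = list(alphabet)  # working list that gets reordered as we go
        output = []
        for ch in data:
            index = symbols.index(ch)              # current position of the symbol
            output.append(index)
            symbols.insert(0, symbols.pop(index))  # move the symbol to the front
        return output

    print(mtf_encode("cabc", ["a", "b", "c", "d"]))  # -> [2, 1, 2, 2]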
Q4. Prediction with Partial Match (PPM)
PPM is a statistical model that predicts the next symbol based on previous symbols, using a variable-length context.
Example: If the input is "abracadabra", PPM can predict the next symbol by looking at the most recent context (like "bra") and building a probability model. It assigns higher probabilities to symbols seen frequently in that context, improving compression efficiency.
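A much-simplified sketch of the idea in Python, counting which symbols follow each fixed-length context (a full PPM coder would also fall back to shorter contexts via escape symbols; the helper names here are illustrative):

    from collections import defaultdict, Counter

    def build_context_counts(text: str, order: int = 3):
        counts = defaultdict(Counter)
        for i in range(order, len(text)):
            context = text[i - order:i]
            counts[context][text[i]] += 1  # record the symbol seen after this context
        return counts

    def predict(counts, context: str):
        seen = counts.get(context, Counter())
        total = sum(seen.values())
        return {sym: n / total for sym, n in seen.items()} if total else {}

    counts = build_context_counts("abracadabra", order=3)
    print(predict(counts, "bra"))  # -> {'c': 1.0}, since "bra" was followed by 'c'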
Q5. Generating a Binary Code in Arithmetic Coding
Arithmetic coding encodes a sequence of symbols as a single fraction, using intervals determined by the symbol probabilities.
Example: Suppose we want to encode the string "AB" with probabilities:
P(A) = 0.6
P(B) = 0.4
The interval for A is [0.0, 0.6) and for B is [0.6, 1.0).
Encoding "AB" involves narrowing the interval at each step: "A" restricts it to [0.0, 0.6), and "B" then restricts it to [0.36, 0.6). Any value in this final interval, for example 0.5 (binary 0.1), represents the string uniquely and can be emitted as the binary code.
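The interval narrowing can be sketched in Python as follows (assuming the probabilities above; encode_interval is an illustrative helper, and a real coder would also handle precision and termination):

    # Cumulative probability ranges for each symbol: P(A) = 0.6, P(B) = 0.4.
    ranges = {"A": (0.0, 0.6), "B": (0.6, 1.0)}

    def encode_interval(message: str):
        low, high = 0.0, 1.0
        for symbol in message:
            span = high - low
            sym_low, sym_high = ranges[symbol]
            # Shrink the current interval to this symbol's sub-range.
            low, high = low + span * sym_low, low + span * sym_high
        return low, high

    print(encode_interval("AB"))  # -> (0.36, 0.6); 0.5 (binary 0.1) lies inside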