
Chapter 10

Image Compression

(c) Scott E Umbaugh, SIUE 2005 1


 Introduction and Overview

 The field of image compression continues


to grow at a rapid pace

 As we look to the future, the need to store


and transmit images will only continue to
increase faster than the available
capability to process all the data

(c) Scott E Umbaugh, SIUE 2005 2


 Applications that require image
compression are many and varied such
as:

1. Internet,
2. Businesses,
3. Multimedia,
4. Satellite imaging,
5. Medical imaging

(c) Scott E Umbaugh, SIUE 2005 3


 Compression algorithm development
starts with applications to two-dimensional
(2-D) still images

 After the 2-D methods are developed, they


are often extended to video (motion
imaging)

 However, we will focus on image


compression of single frames of image
data

(c) Scott E Umbaugh, SIUE 2005 4


 Image compression involves reducing the
size of image data files, while retaining
necessary information

 Retaining necessary information depends


upon the application

 Image segmentation methods, which are


primarily a data reduction process, can be
used for compression

(c) Scott E Umbaugh, SIUE 2005 5


 The reduced file created by the
compression process is called the
compressed file and is used to reconstruct
the image, resulting in the decompressed
image
 The original image, before any
compression is performed, is called the
uncompressed image file
 The ratio of the size of the original,
uncompressed image file to the size of the
compressed file is referred to as the
compression ratio

 The compression ratio is denoted by:

   Compression Ratio = Uncompressed File Size / Compressed File Size


 The reduction in file size is necessary to
meet the bandwidth requirements for
many transmission systems, and for the
storage requirements in computer
databases

 Also, the amount of data required for


digital images is enormous

(c) Scott E Umbaugh, SIUE 2005 8
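As a rough illustration of the storage and bandwidth problem, the short Python sketch below computes the raw size of a single uncompressed image and the time needed to send it over a slow link; the image dimensions and the 56 kbps line rate are illustrative assumptions, not the values used in the textbook examples.

```python
# Illustrative calculation: raw size of an uncompressed image and the
# time needed to transmit it. The numbers below are assumptions chosen
# for illustration, not values from the textbook examples.

rows, cols = 512, 512        # spatial resolution (assumed)
bytes_per_pixel = 3          # 24-bit RGB color (assumed)
line_rate_bps = 56_000       # 56 kbps modem link (assumed)

size_bytes = rows * cols * bytes_per_pixel
size_bits = size_bytes * 8
transmit_seconds = size_bits / line_rate_bps

print(f"Raw image size : {size_bytes:,} bytes ({size_bits:,} bits)")
print(f"Time at 56 kbps: {transmit_seconds:.1f} seconds")
```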


 Such transmission-time estimates assume
the actual transmission rate is the maximum,
which is typically not the case due to
Internet traffic, overhead bits and
transmission errors

(c) Scott E Umbaugh, SIUE 2005 9


 Additionally, considering that a web page
might contain more than one of these
images, the time it takes is simply too long

 For high quality images the required


resolution can be much higher than the
previous example

(c) Scott E Umbaugh, SIUE 2005 10


Example 10.1.5 applies maximum data rate to Example 10.1.4

(c) Scott E Umbaugh, SIUE 2005 11


 Now, consider the transmission of video
images, where we need multiple frames
per second
 If we consider just one second of video
data that has been digitized at 640x480
pixels per frame, and requiring 15 frames
per second for interlaced video, then:

(c) Scott E Umbaugh, SIUE 2005 12


 Waiting 35 seconds for one second’s
worth of video is not exactly real time!

 Even attempting to transmit


uncompressed video over the highest
speed Internet connection is impractical

 For example: The Japanese Advanced


Earth Observing Satellite (ADEOS)
transmits image data at the rate of 120
Mbps

(c) Scott E Umbaugh, SIUE 2005 13


 Applications requiring high speed
connections, such as high definition
television, real-time teleconferencing, and
transmission of multiband high resolution
satellite images, lead us to the conclusion
that image compression is not only
desirable but necessary

 Key to a successful compression scheme is


retaining necessary information

(c) Scott E Umbaugh, SIUE 2005 14


 To understand “retaining necessary
information”, we must differentiate
between data and information

1. Data:
• For digital images, data refers to the
pixel gray level values that correspond to
the brightness of a pixel at a point in
space
• Data are used to convey information,
much like the way the alphabet is used to
convey information via words

(c) Scott E Umbaugh, SIUE 2005 15


2. Information:

• Information is an interpretation of the


data in a meaningful way

• Information is an elusive concept; it can


be application specific

(c) Scott E Umbaugh, SIUE 2005 16


 There are two primary types of image
compression methods:

1. Lossless compression methods:


• Allows for the exact recreation of the original
image data, and can compress complex
images to a maximum 1/2 to 1/3 the original
size – 2:1 to 3:1 compression ratios
• Preserves the data exactly

(c) Scott E Umbaugh, SIUE 2005 17


2. Lossy compression methods:

• Data loss, original image cannot be re-


created exactly

• Can compress complex images 10:1 to


50:1 and retain high quality, and 100 to
200 times for lower quality, but
acceptable, images

(c) Scott E Umbaugh, SIUE 2005 18


 Compression algorithms are developed
by taking advantage of the redundancy
that is inherent in image data

 Four primary types of redundancy that


can be found in images are:
1. Coding
2. Interpixel
3. Interband
4. Psychovisual redundancy

(c) Scott E Umbaugh, SIUE 2005 19


1. Coding redundancy
 Occurs when the data used to represent
the image is not utilized in an optimal
manner

2. Interpixel redundancy
 Occurs because adjacent pixels tend to
be highly correlated, in most images the
brightness levels do not change rapidly,
but change gradually

(c) Scott E Umbaugh, SIUE 2005 20


3. Interband redundancy
 Occurs in color images due to the
correlation between bands within an
image – if we extract the red, green and
blue bands they look similar

4. Psychovisual redundancy
 Some information is more important to
the human visual system than other
types of information

(c) Scott E Umbaugh, SIUE 2005 21


 The key in image compression algorithm
development is to determine the minimal
data required to retain the necessary
information
 The compression is achieved by taking
advantage of the redundancy that exists in
images
 If the redundancies are removed prior to
compression, for example with a
decorrelation process, a more effective
compression can be achieved

(c) Scott E Umbaugh, SIUE 2005 22


 To help determine which information can
be removed and which information is
important, the image fidelity criteria are
used

 These measures provide metrics for


determining image quality

 It should be noted that the information
required is application specific, and that,
with lossless schemes, there is no need
for fidelity criteria
(c) Scott E Umbaugh, SIUE 2005 23
 Most of the compressed images shown in
this chapter are generated with CVIPtools,
which consists of code that has been
developed for educational and research
purposes
 The compressed images shown are not
necessarily representative of the best
commercial applications that use the
techniques described, because the
commercial compression algorithms are
often combinations of the techniques
described herein

(c) Scott E Umbaugh, SIUE 2005 24


 Compression System Model

• The compression system model consists


of two parts:

1. The compressor
2. The decompressor

• The compressor consists of a


preprocessing stage and encoding stage,
whereas the decompressor consists of a
decoding stage followed by a
postprocessing stage
(c) Scott E Umbaugh, SIUE 2005 25

(c) Scott E Umbaugh, SIUE 2005 26


• Before encoding, preprocessing is
performed to prepare the image for the
encoding process, and consists of any
number of operations that are application
specific

• After the compressed file has been


decoded, postprocessing can be
performed to eliminate some of the
potentially undesirable artifacts brought
about by the compression process
(c) Scott E Umbaugh, SIUE 2005 27
• The compressor can be broken into
following stages:
1. Data reduction: Image data can be
reduced by gray level and/or spatial
quantization, or can undergo any desired
image improvement (for example, noise
removal) process
2. Mapping: Involves mapping the original
image data into another mathematical
space where it is easier to compress the
data
(c) Scott E Umbaugh, SIUE 2005 28
3. Quantization: Involves taking potentially
continuous data from the mapping stage
and putting it in discrete form

4. Coding: Involves mapping the discrete


data from the quantizer onto a code in an
optimal manner

• A compression algorithm may consist of


all the stages, or it may consist of only
one or two of the stages

(c) Scott E Umbaugh, SIUE 2005 29


(c) Scott E Umbaugh, SIUE 2005 30
• The decompressor can be broken down
into following stages:

1. Decoding: Takes the compressed file


and reverses the original coding by
mapping the codes to the original,
quantized values

2. Inverse mapping: Involves reversing the


original mapping process

(c) Scott E Umbaugh, SIUE 2005 31


3. Postprocessing: Involves enhancing the
look of the final image

• This may be done to reverse any


preprocessing, for example, enlarging an
image that was shrunk in the data
reduction process
• In other cases the postprocessing may
be used to simply enhance the image to
ameliorate any artifacts from the
compression process itself

(c) Scott E Umbaugh, SIUE 2005 32



(c) Scott E Umbaugh, SIUE 2005 33


• The development of a compression
algorithm is highly application specific

• The preprocessing stage of compression
consists of processes such as
enhancement, noise removal, or
quantization

• The goal of preprocessing is to prepare


the image for the encoding process by
eliminating any irrelevant information,
where irrelevant is defined by the
application

(c) Scott E Umbaugh, SIUE 2005 34


• For example, many images that are for
viewing purposes only can be
preprocessed by eliminating the lower bit
planes, without losing any useful
information

(c) Scott E Umbaugh, SIUE 2005 35


Figure 10.1.4 Bit plane images

a) Original image
b) Bit plane 7, the most significant bit
c) Bit plane 6

(c) Scott E Umbaugh, SIUE 2005 36


Figure 10.1.4 Bit plane images (Contd)

d) Bit plane 5
e) Bit plane 4
f) Bit plane 3

(c) Scott E Umbaugh, SIUE 2005 37


Figure 10.1.4 Bit plane images (Contd)

g) Bit plane 2
h) Bit plane 1
i) Bit plane 0, the least significant bit

(c) Scott E Umbaugh, SIUE 2005 38


• The mapping process is important
because image data tends to be highly
correlated

• Specifically, if the value of one pixel is


known, it is highly likely that the adjacent
pixel value is similar

• By finding a mapping equation that


decorrelates the data this type of data
redundancy can be removed

(c) Scott E Umbaugh, SIUE 2005 39


• Differential coding: Method of reducing
data redundancy, by finding the difference
between adjacent pixels and encoding
those values

• The principal components transform can


also be used, which provides a
theoretically optimal decorrelation

• Color transforms are used to decorrelate


data between image bands

(c) Scott E Umbaugh, SIUE 2005 40


Figure 5.6.1 Principal Components Transform (PCT)

a) Red band of a color image
b) Green band
c) Blue band
d) Principal component band 1
e) Principal component band 2
f) Principal component band 3
(c) Scott E Umbaugh, SIUE 2005 41


• The spectral domain can also be used
for image compression, so the first stage
may include mapping into the frequency or
sequency domain, where the energy in the
image is compacted primarily into the
lower frequency/sequency components

• These methods are reversible, that is,
information preserving, although not all
mapping methods are reversible

(c) Scott E Umbaugh, SIUE 2005 42


• Quantization may be necessary to convert
the data into digital form (BYTE data type),
depending on the mapping equation used

• This is because many of these mapping
methods result in floating point data,
which requires multiple bytes per value
and is therefore not very efficient if
the goal is data reduction

(c) Scott E Umbaugh, SIUE 2005 43


• Quantization can be performed in the
following ways:

1. Uniform quantization: In it, all the quanta,


or subdivisions into which the range is
divided, are of equal width

2. Nonuniform quantization: In it the


quantization bins are not all of equal width

(c) Scott E Umbaugh, SIUE 2005 44


(c) Scott E Umbaugh, SIUE 2005 45
• Often, nonuniform quantization bins are
designed to take advantage of the
response of the human visual system

• In the spectral domain, the higher


frequencies may also be quantized with
wider bins because we are more sensitive
to lower and midrange spatial frequencies
and most images have little energy at high
frequencies

(c) Scott E Umbaugh, SIUE 2005 46


• The concept of nonuniform quantization
bin sizes is also described as a variable bit
rate, since the wider quantization bins
imply fewer bits to encode, while the
smaller bins need more bits

• It is important to note that the quantization
process is not reversible, so there is no
corresponding stage in the decompression
model, and some information may be lost
during quantization
(c) Scott E Umbaugh, SIUE 2005 47
• The coder in the coding stage provides a
one-to-one mapping, each input is
mapped to a unique output by the coder,
so it is a reversible process

• The code can be an equal length code,


where all the code words are the same
size, or an unequal length code with
variable length code words

(c) Scott E Umbaugh, SIUE 2005 48


• In most cases, an unequal length code is
the most efficient for data compression,
but requires more overhead in the coding
and decoding stages

(c) Scott E Umbaugh, SIUE 2005 49


 LOSSLESS COMPRESSION
METHODS

 No loss of data – the decompressed image
is exactly the same as the uncompressed image
 Required for applications such as medical
images or any images used as evidence in courts
 Lossless compression methods typically
provide about a 10% reduction in file size
for complex images

(c) Scott E Umbaugh, SIUE 2005 50


 Lossless compression methods can
provide substantial compression for simple
images

 However, lossless compression


techniques may be used for both
preprocessing and postprocessing in
image compression algorithms to obtain
the extra 10% compression

(c) Scott E Umbaugh, SIUE 2005 51


 The underlying theory for lossless
compression (also called data compaction)
comes from the area of communications
and information theory, with a
mathematical basis in probability theory

 One of the most important concepts used


is the idea of information content and
randomness in data

(c) Scott E Umbaugh, SIUE 2005 52


 Information theory defines information
based on the probability of an event,
knowledge of an unlikely event has more
information than knowledge of a likely
event
 For example:
• The earth will continue to revolve around the
sun; little information, 100% probability
• An earthquake will occur tomorrow; more info.
Less than 100% probability
• A matter transporter will be invented in the next
10 years; highly unlikely – low probability, high
information content
(c) Scott E Umbaugh, SIUE 2005 53
 This perspective on information is the
information theoretic definition and should
not be confused with our working definition
that requires information in images to be
useful, not simply novel

 Entropy is the measurement of the


average information in an image

(c) Scott E Umbaugh, SIUE 2005 54


 The entropy for an N x N image can be
calculated by this equation:

   Entropy = - SUM over i [ p_i * log2(p_i) ]   (bits per pixel)

 where p_i is the probability of the i-th gray level,
 found from the normalized histogram, and the sum is
 taken over all possible gray levels

(c) Scott E Umbaugh, SIUE 2005 55
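A minimal sketch of this entropy calculation in Python, assuming NumPy and an 8-bit grayscale image stored as a 2-D array:

```python
import numpy as np

def entropy_bits_per_pixel(image: np.ndarray, num_levels: int = 256) -> float:
    """Estimate the entropy of a gray level image in bits per pixel."""
    # Normalized histogram = probability of each gray level.
    hist, _ = np.histogram(image, bins=num_levels, range=(0, num_levels))
    p = hist / hist.sum()
    p = p[p > 0]                      # ignore gray levels that never occur
    return float(-np.sum(p * np.log2(p)))

# A uniform random 8-bit image has entropy close to 8 bits/pixel,
# while a constant image has entropy 0.
rng = np.random.default_rng(0)
print(entropy_bits_per_pixel(rng.integers(0, 256, size=(256, 256))))
print(entropy_bits_per_pixel(np.full((256, 256), 128)))
```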


 This measure provides us with a
theoretical minimum for the average
number of bits per pixel that could be used
to code the image

 It can also be used as a metric for judging


the success of a coding scheme, as it is
theoretically optimal

(c) Scott E Umbaugh, SIUE 2005 56


 The two preceding examples (10.2.1 and
10.2.2) illustrate the range of the entropy:
for an 8-bit image it extends from 0 bits/pixel
(only one gray level present) to 8 bits/pixel
(all gray levels equally likely)

 The examples also illustrate the


information theory perspective regarding
information and randomness
 The more randomness that exists in an
image, the more evenly distributed the
gray levels, and more bits per pixel are
required to represent the data

(c) Scott E Umbaugh, SIUE 2005 59


Figure 10.2-1 Entropy

a) Original image, entropy = 7.032 bpp
b) Image after local histogram equalization, block size 4, entropy = 4.348 bpp
c) Image after binary threshold, entropy = 0.976 bpp

(c) Scott E Umbaugh, SIUE 2005 60


Figure 10.2-1 Entropy (contd)

d) Circle with a radius of 32, entropy = 0.283 bpp
e) Circle with a radius of 64, entropy = 0.716 bpp
f) Circle with a radius of 32 and a linear blur radius of 64, entropy = 2.030 bpp

(c) Scott E Umbaugh, SIUE 2005 61


 Figure 10.2.1 depicts that a minimum
overall file size will be achieved if a
smaller number of bits is used to code the
most frequent gray levels
 The average number of bits per pixel (Length)
in a coder can be measured by the
following equation:

   Length = SUM over i [ l_i * p_i ]

 where l_i is the number of bits in the code word
 for the i-th gray level and p_i is its probability

(c) Scott E Umbaugh, SIUE 2005 62


 Huffman Coding

• The Huffman code, developed by D.


Huffman in 1952, is a minimum length
code
• This means that given the statistical
distribution of the gray levels (the
histogram), the Huffman algorithm will
generate a code that is as close as
possible to the minimum bound, the
entropy

(c) Scott E Umbaugh, SIUE 2005 63


• The method results in an unequal (or
variable) length code, where the size of
the code words can vary

• For complex images, Huffman coding


alone will typically reduce the file by 10%
to 50% (1.1:1 to 1.5:1), but this ratio can
be improved to 2:1 or 3:1 by
preprocessing for irrelevant information
removal

(c) Scott E Umbaugh, SIUE 2005 64


• The Huffman algorithm can be described
in five steps:

1. Find the gray level probabilities for the image


by finding the histogram
2. Order the input probabilities (histogram
magnitudes) from smallest to largest
3. Combine the smallest two by addition
4. GOTO step 2, until only two probabilities are
left
5. By working backward along the tree, generate
code by alternating assignment of 0 and 1
(c) Scott E Umbaugh, SIUE 2005 65
• In the worked example, the average length
drops from 2.0 to 1.9 bits per pixel, which
is about a 1.05:1 compression ratio,
providing about 5% compression

• From the example we can see that the


Huffman code is highly dependent on the
histogram, so any preprocessing to
simplify the histogram will help improve
the compression ratio

(c) Scott E Umbaugh, SIUE 2005 72
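The five-step procedure above maps onto a short Python sketch; the heap-based construction below is one standard way to implement it, and the exact 0/1 assignments may differ from the textbook's tree-walking example even though the code word lengths are equivalent.

```python
import heapq
from collections import Counter

def huffman_code(probabilities: dict) -> dict:
    """Build a Huffman code table {symbol: bit string} from {symbol: probability}."""
    # Each heap entry: (probability, tie-breaker, symbols in this subtree).
    heap = [(p, i, [sym]) for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    codes = {sym: "" for sym in probabilities}
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)   # two smallest probabilities
        p2, i2, syms2 = heapq.heappop(heap)
        for s in syms1:                      # prepend a bit as we move up the tree
            codes[s] = "0" + codes[s]
        for s in syms2:
            codes[s] = "1" + codes[s]
        heapq.heappush(heap, (p1 + p2, i2, syms1 + syms2))
    return codes

# Example: gray level probabilities from a tiny histogram.
pixels = [0, 0, 0, 0, 1, 1, 2, 3]
probs = {g: n / len(pixels) for g, n in Counter(pixels).items()}
table = huffman_code(probs)
avg_len = sum(probs[g] * len(code) for g, code in table.items())
print(table, avg_len)
```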


 Run-Length Coding

• Run-length coding (RLC) works by
counting runs of adjacent pixels with the
same gray level value; the count, called the
run length, is then encoded and stored

• RLC works best for binary, two-valued,


images

(c) Scott E Umbaugh, SIUE 2005 73


• RLC can also work with complex images
that have been preprocessed by
thresholding to reduce the number of gray
levels to two
• RLC can be implemented in various ways,
but the first step is to define the required
parameters
• Horizontal RLC (counting along the rows)
or vertical RLC (counting along the
columns) can be used

(c) Scott E Umbaugh, SIUE 2005 74


• In basic horizontal RLC, the number of bits
used for the encoding depends on the
number of pixels in a row

• If the row has 2^n pixels, then the required
number of bits is n, so that a run that is the
length of the entire row can be encoded

(c) Scott E Umbaugh, SIUE 2005 75


• The next step is to define a convention for
the first RLC number in a row – does it
represent a run of 0's or 1's?
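A minimal sketch of horizontal RLC for one row of a binary image, using the convention (assumed here) that the first count always gives the length of the leading run of 0's:

```python
from itertools import groupby

def rlc_encode_row(row):
    """Horizontal run-length code for one row of a binary (0/1) image.

    Convention assumed here: the first count is the length of the leading
    run of 0's (0 if the row starts with a 1), then counts alternate.
    """
    runs = [(value, sum(1 for _ in group)) for value, group in groupby(row)]
    counts = []
    if runs and runs[0][0] == 1:      # row starts with 1's: leading zero-run length 0
        counts.append(0)
    counts.extend(length for _, length in runs)
    return counts

def rlc_decode_row(counts):
    """Invert rlc_encode_row: rebuild the row of 0's and 1's."""
    row, value = [], 0
    for length in counts:
        row.extend([value] * length)
        value ^= 1                    # alternate 0 -> 1 -> 0 ...
    return row

row = [0, 0, 0, 1, 1, 0, 1, 1, 1, 1]
codes = rlc_encode_row(row)           # [3, 2, 1, 4]
assert rlc_decode_row(codes) == row
print(codes)
```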
• Bitplane-RLC : A technique which involves
extension of basic RLC method to gray
level images, by applying basic RLC to
each bit-plane independently

• For each binary digit in the gray level


value, an image plane is created, and this
image plane (a string of 0's and 1's) is
then encoded using RLC

(c) Scott E Umbaugh, SIUE 2005 79


(c) Scott E Umbaugh, SIUE 2005 80
• Typical compression ratios of only 0.5 to 1.2
are achieved with complex 8-bit
monochrome images (a ratio below 1 means
the "compressed" file is actually larger
than the original)

• Thus without further processing, this is not


a good compression technique for
complex images

• Bitplane-RLC is most useful for simple


images, such as graphics files, where
much higher compression ratios are
achieved
(c) Scott E Umbaugh, SIUE 2005 81
• The compression results using this
method can be improved by preprocessing
to reduce the number of gray levels, but
then the compression is not lossless

• With lossless bitplane RLC we can


improve the compression results by taking
our original pixel data (in natural code) and
mapping it to a Gray code (named after
Frank Gray), where adjacent numbers
differ in only one bit

(c) Scott E Umbaugh, SIUE 2005 82


• Adjacent pixel values are highly correlated,
so they tend to be relatively close in gray
level value; in natural binary code even a
small change in value can flip many bits at
once, and this is problematic for bitplane RLC

(c) Scott E Umbaugh, SIUE 2005 83


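The natural-binary-to-Gray-code mapping is a simple bit manipulation; the sketch below shows it for 8-bit gray levels and illustrates why adjacent levels such as 127 and 128 become much friendlier to bitplane RLC.

```python
def binary_to_gray(value: int) -> int:
    """Map a natural binary gray level to its Gray code equivalent."""
    return value ^ (value >> 1)

def gray_to_binary(value: int) -> int:
    """Invert the Gray code mapping."""
    result = value
    while value:
        value >>= 1
        result ^= value
    return result

# Adjacent gray levels differ in only one bit after the mapping:
for g in (127, 128):
    print(f"{g:3d}  natural {g:08b}  Gray {binary_to_gray(g):08b}")
# 127 -> natural 01111111, Gray 01000000
# 128 -> natural 10000000, Gray 11000000  (one bit away from 127's Gray code)
```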
• When such a transition occurs (for example,
between gray levels 127 and 128, where every
bit changes in natural code), each bitplane
experiences a transition, which adds a
code for the run in each bitplane
• However, with the Gray code, only one
bitplane experiences the transition, so it
only adds one extra code word
• By preprocessing with a Gray code we can
achieve about a 10% to 15% increase in
compression with bitplane-RLC for typical
images
(c) Scott E Umbaugh, SIUE 2005 86
• Another way to extend basic RLC to gray
level images is to include the gray level of
a particular run as part of the code
• Here, instead of a single value for a run,
two parameters are used to characterize
the run
• The pair (G,L) corresponds to the gray level
value, G, and the run length, L
• This technique is only effective with
images containing a small number of gray
levels

(c) Scott E Umbaugh, SIUE 2005 87
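A minimal sketch of this (gray level, run length) pair encoding for one row:

```python
from itertools import groupby

def gray_level_rlc(row):
    """Encode one row as (G, L) pairs: gray level value and run length."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(row)]

def gray_level_rlc_decode(pairs):
    """Rebuild the row from (G, L) pairs."""
    return [value for value, length in pairs for _ in range(length)]

row = [5, 5, 5, 5, 9, 9, 9, 5, 5, 0]
pairs = gray_level_rlc(row)          # [(5, 4), (9, 3), (5, 2), (0, 1)]
assert gray_level_rlc_decode(pairs) == row
print(pairs)
```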


• The decompression process requires the
number of pixels in a row, and the type of
encoding used

• Standards for RLC have been defined by


the International Telecommunications
Union-Radio (ITU-R, previously CCIR)

• These standards use horizontal RLC, but


postprocess the resulting RLC with a
Huffman encoding scheme

(c) Scott E Umbaugh, SIUE 2005 90


• Newer versions of this standard also utilize
a two-dimensional technique where the
current line is encoded based on a
previous line, which helps to reduce the
file size

• These encoding methods provide


compression ratios of about 15 to 20 for
typical documents

(c) Scott E Umbaugh, SIUE 2005 91


 Lempel-Ziv-Welch Coding

• The Lempel-Ziv-Welch (LZW) coding


algorithm works by encoding strings of
data, which correspond to sequences of
pixel values in images

• It works by creating a string table that


contains the strings and their
corresponding codes

(c) Scott E Umbaugh, SIUE 2005 92


• The string table is updated as the file is
read, with new codes being inserted
whenever a new string is encountered

• If a string is encountered that is already in


the table, the corresponding code for that
string is put into the compressed file

• LZW coding uses code words with more


bits than the original data

(c) Scott E Umbaugh, SIUE 2005 93


For Example:

• With 8-bit image data, an LZW coding


method could employ 10-bit words
• The corresponding string table would then
have 2^10 = 1024 entries
• This table consists of the original 256
entries, corresponding to the original 8-bit
data, and allows 768 other entries for
string codes

(c) Scott E Umbaugh, SIUE 2005 94


• The string codes are assigned during the
compression process, but the actual string
table is not stored with the compressed
data

• During decompression the information in


the string table is extracted from the
compressed data itself

(c) Scott E Umbaugh, SIUE 2005 95
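A compact sketch of the LZW encoder described above, operating on a 1-D sequence of 8-bit pixel values (real implementations also limit the code size and reset the table, which is omitted here):

```python
def lzw_encode(pixels):
    """LZW-encode a sequence of 8-bit values into a list of integer codes.

    The table starts with the 256 single-value strings; new strings are
    added as they are encountered. The table itself is not stored -- the
    decoder rebuilds it from the code stream.
    """
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for value in pixels:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate               # keep extending the string
        else:
            codes.append(table[current])      # emit code for the known string
            table[candidate] = next_code      # add the new string to the table
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes

print(lzw_encode([10, 10, 10, 10, 20, 10, 10, 20, 20]))
```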


• For the GIF (and TIFF) image file format
the LZW algorithm is specified, but there
has been some controversy over this,
since the algorithm is patented by Unisys
Corporation

• Since these image formats are widely


used, other methods similar in nature to
the LZW algorithm have been developed
to be used with these, or similar, image file
formats
(c) Scott E Umbaugh, SIUE 2005 96
• Similar versions of this algorithm include
the adaptive Lempel-Ziv, used in the UNIX
compress function, and the Lempel-Ziv 77
algorithm used in the UNIX gzip function

(c) Scott E Umbaugh, SIUE 2005 97


 Arithmetic Coding

• Arithmetic coding transforms input data


into a single floating point number
between 0 and 1

• There is not a direct correspondence


between the code and the individual pixel
values

(c) Scott E Umbaugh, SIUE 2005 98


• As each input symbol (pixel value) is read
the precision required for the number
becomes greater

• As the images are very large and the


precision of digital computers is finite, the
entire image must be divided into small
subimages to be encoded

(c) Scott E Umbaugh, SIUE 2005 99


• Arithmetic coding uses the probability
distribution of the data (histogram), so it
can theoretically achieve the maximum
compression specified by the entropy

• It works by successively subdividing the


interval between 0 and 1, based on the
placement of the current pixel value in the
probability distribution

(c) Scott E Umbaugh, SIUE 2005 100


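A minimal sketch of the interval-subdivision idea, using plain floating point; the limited precision is exactly why, in practice, the image is divided into small subimages before encoding.

```python
def arithmetic_encode(symbols, probabilities):
    """Return a single number in [0, 1) identifying the symbol sequence.

    probabilities: dict {symbol: probability}. Floating point precision
    limits this toy version to short sequences (hence small subimages).
    """
    # Cumulative distribution: each symbol owns a sub-interval of [0, 1).
    cumulative = {}
    low_edge = 0.0
    for sym, p in probabilities.items():
        cumulative[sym] = (low_edge, low_edge + p)
        low_edge += p

    low, high = 0.0, 1.0
    for sym in symbols:
        span = high - low
        sym_low, sym_high = cumulative[sym]
        # Narrow [low, high) to the part assigned to this symbol.
        low, high = low + span * sym_low, low + span * sym_high
    return (low + high) / 2          # any number in the final interval works

probs = {"a": 0.5, "b": 0.3, "c": 0.2}
print(arithmetic_encode("aab", probs))
```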
• In practice, this technique may be used as
part of an image compression scheme, but
is impractical to use alone

• It is one of the options available in the


JPEG standard

(c) Scott E Umbaugh, SIUE 2005 104


 Lossy Compression Methods

 Lossy compression methods are required


to achieve high compression ratios with
complex images

 They provide tradeoffs between image


quality and degree of compression, which
allows the compression algorithm to be
customized to the application

(c) Scott E Umbaugh, SIUE 2005 105


(c) Scott E Umbaugh, SIUE 2005 106
 With more advanced methods, images can
be compressed 10 to 20 times with
virtually no visible information loss, and 30
to 50 times with minimal degradation
 Newer techniques, such as JPEG2000,
can achieve reasonably good image
quality with compression ratios as high as
100 to 200
 Image enhancement and restoration
techniques can be combined with lossy
compression schemes to improve the
appearance of the decompressed image

(c) Scott E Umbaugh, SIUE 2005 107


 In general, a higher compression ratio
results in a poorer image, but the results
are highly image dependent – application
specific

 Lossy compression can be performed in


both the spatial and transform domains.
Hybrid methods use both domains.

(c) Scott E Umbaugh, SIUE 2005 108


 Gray-Level Run Length Coding

• The RLC technique can also be used for


lossy image compression, by reducing the
number of gray levels, and then applying
standard RLC techniques

• As with the lossless techniques,


preprocessing by Gray code mapping will
improve the compression ratio

(c) Scott E Umbaugh, SIUE 2005 109


Figure 10.3-2 Lossy Bitplane Run Length Coding

Note: No compression occurs until reduction to 5 bits/pixel

a) Original image, 8 bits/pixel, 256 gray levels
b) Image after reduction to 7 bits/pixel, 128 gray levels, compression ratio 0.55, with Gray code preprocessing 0.66

(c) Scott E Umbaugh, SIUE 2005 110


Figure 10.3-2 Lossy Bitplane Run Length Coding (contd)

c) Image after reduction to 6 bits/pixel, 64 gray levels, compression ratio 0.77, with Gray code preprocessing 0.97
d) Image after reduction to 5 bits/pixel, 32 gray levels, compression ratio 1.20, with Gray code preprocessing 1.60

(c) Scott E Umbaugh, SIUE 2005 111


Figure 10.3-2 Lossy Bitplane Run Length Coding (contd)

e) Image after reduction to 4 bits/pixel, 16 gray levels, compression ratio 2.17, with Gray code preprocessing 2.79
f) Image after reduction to 3 bits/pixel, 8 gray levels, compression ratio 4.86, with Gray code preprocessing 5.82

(c) Scott E Umbaugh, SIUE 2005 112


Figure 10.3-2 Lossy Bitplane Run Length Coding (contd)

g) Image after reduction to 2 bits/pixel, 4 gray levels, compression ratio 13.18, with Gray code preprocessing 15.44
h) Image after reduction to 1 bit/pixel, 2 gray levels, compression ratio 44.46, with Gray code preprocessing 44.46

(c) Scott E Umbaugh, SIUE 2005 113


• A more sophisticated method is dynamic
window-based RLC
• This algorithm relaxes the criterion of the
runs being the same value and allows for
the runs to fall within a gray level range,
called the dynamic window range
• This range is dynamic because it starts out
larger than the actual gray level window
range, and maximum and minimum values
are narrowed down to the actual range as
each pixel value is encountered

(c) Scott E Umbaugh, SIUE 2005 114


• This process continues until a pixel is
found out of the actual range

• The image is encoded with two values,


one for the run length and one to
approximate the gray level value of the run

• This approximation can simply be the


average of all the gray level values in the
run

(c) Scott E Umbaugh, SIUE 2005 115


• This particular algorithm also uses some
preprocessing to allow for the run-length
mapping to be coded so that a run can be
any length and is not constrained by the
length of a row

(c) Scott E Umbaugh, SIUE 2005 119


 Block Truncation Coding

• Block truncation coding (BTC) works by


dividing the image into small subimages
and then reducing the number of gray
levels within each block

• The gray levels are reduced by a quantizer


that adapts to local statistics

(c) Scott E Umbaugh, SIUE 2005 120


• The levels for the quantizer are chosen to
minimize a specified error criterion, and
then all the pixel values within each block
are mapped to the quantized levels

• The necessary information to decompress


the image is then encoded and stored

• The basic form of BTC divides the image


into N * N blocks and codes each block
using a two-level quantizer

(c) Scott E Umbaugh, SIUE 2005 121


• The two levels are selected so that the
mean and variance of the gray levels
within the block are preserved
• Each pixel value within the block is then
compared with a threshold, typically the
block mean, and then is assigned to one
of the two levels
• If it is above the mean it is assigned the
high level code, if it is below the mean, it is
assigned the low level code

(c) Scott E Umbaugh, SIUE 2005 122


• If we call the high value H and the low
value L, we can find these values via the
following equations, where m is the block
mean, sigma the block standard deviation,
n the number of pixels in the block, and q
the number of pixels above the mean:

   H = m + sigma * sqrt( (n - q) / q )
   L = m - sigma * sqrt( q / (n - q) )

(c) Scott E Umbaugh, SIUE 2005 123


• If n = 4, then after the H and L values are
found, the 4x4 block is encoded with four
bytes

• Two bytes to store the two levels, H and L,


and two bytes to store a bit string of 1's
and 0's corresponding to the high and low
codes for that particular block

(c) Scott E Umbaugh, SIUE 2005 124
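A minimal sketch of basic BTC for a single block, using the mean- and variance-preserving levels given above (NumPy assumed):

```python
import numpy as np

def btc_encode_block(block: np.ndarray):
    """Basic BTC for one block: returns (H, L, bitmap), preserving mean and variance."""
    mean = block.mean()
    sigma = block.std()
    bitmap = block > mean                 # 1 -> high level, 0 -> low level
    n = block.size
    q = int(bitmap.sum())                 # number of pixels above the mean
    if q in (0, n):                       # flat block: both levels equal the mean
        return mean, mean, bitmap
    high = mean + sigma * np.sqrt((n - q) / q)
    low = mean - sigma * np.sqrt(q / (n - q))
    return high, low, bitmap

def btc_decode_block(high, low, bitmap):
    return np.where(bitmap, high, low)

block = np.array([[120, 130, 125, 121],
                  [200, 210, 205, 119],
                  [118, 122, 209, 207],
                  [121, 119, 124, 126]], dtype=float)
H, L, bits = btc_encode_block(block)
print(H, L)
print(btc_decode_block(H, L, bits))
```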


• This algorithm tends to produce images
with blocky effects

• These artifacts can be smoothed by


applying enhancement techniques such as
median and average (lowpass) filters

(c) Scott E Umbaugh, SIUE 2005 128


• The multilevel BTC algorithm, which uses
a 4-level quantizer, allows for varying the
block size, and a larger block size should
provide higher compression, but with a
corresponding decrease in image quality

• With this particular implementation, we get


decreasing image quality, but the
compression ratio is fixed

(c) Scott E Umbaugh, SIUE 2005 131


 Vector Quantization

• Vector quantization (VQ) is the process of


mapping a vector that can have many
values to a vector that has a smaller
(quantized) number of values

• For image compression, the vector


corresponds to a small subimage, or block

(c) Scott E Umbaugh, SIUE 2005 134


(c) Scott E Umbaugh, SIUE 2005 135
• VQ can be applied in either the spectral or
the spatial domain

• Information theory tells us that better


compression can be achieved with vector
quantization than with scalar quantization
(rounding or truncating individual values)

(c) Scott E Umbaugh, SIUE 2005 136


• Vector quantization treats the entire
subimage (vector) as a single entity and
quantizes it by reducing the total number
of bits required to represent the subimage

• This is done by utilizing a codebook, which


stores a fixed set of vectors, and then
coding the subimage by using the index
(address) into the codebook

(c) Scott E Umbaugh, SIUE 2005 137


• In the example we achieved a 16:1
compression, but note that this assumes
that the codebook is not stored with the
compressed file

(c) Scott E Umbaugh, SIUE 2005 138
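A minimal sketch of the VQ encoding and decoding steps: each 4x4 block is replaced by the index of its nearest codeword. The random codebook here is purely for illustration; a real codebook comes from training, as discussed next.

```python
import numpy as np

def vq_encode(image: np.ndarray, codebook: np.ndarray, block: int = 4):
    """Replace each block x block subimage with the index of the nearest codeword."""
    rows, cols = image.shape
    indices = np.zeros((rows // block, cols // block), dtype=np.uint16)
    for r in range(0, rows - rows % block, block):
        for c in range(0, cols - cols % block, block):
            vector = image[r:r + block, c:c + block].reshape(-1).astype(float)
            distances = np.sum((codebook - vector) ** 2, axis=1)   # squared Euclidean
            indices[r // block, c // block] = int(np.argmin(distances))
    return indices

def vq_decode(indices: np.ndarray, codebook: np.ndarray, block: int = 4):
    rows, cols = indices.shape
    out = np.zeros((rows * block, cols * block))
    for r in range(rows):
        for c in range(cols):
            out[r * block:(r + 1) * block,
                c * block:(c + 1) * block] = codebook[indices[r, c]].reshape(block, block)
    return out

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(16, 16)).astype(float)
codebook = rng.integers(0, 256, size=(128, 16)).astype(float)   # 128 entries of 4x4 = 16 values
idx = vq_encode(image, codebook, block=4)
recon = vq_decode(idx, codebook, block=4)
print(idx.shape, recon.shape)   # (4, 4) (16, 16)
```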


(c) Scott E Umbaugh, SIUE 2005 139
• However, the codebook will need to be
stored unless a generic codebook is
devised which could be used for a
particular type of image; in that case we
need only store the name of that particular
codebook file

• In the general case, better results will be


obtained with a codebook that is designed
for a particular image

(c) Scott E Umbaugh, SIUE 2005 140


(c) Scott E Umbaugh, SIUE 2005 141
• A training algorithm determines which
vectors will be stored in the codebook by
finding a set of vectors that best represent
the blocks in the image

• This set of vectors is determined by


optimizing some error criterion, where the
error is defined as the sum of the vector
distances between the original subimages
and the resulting decompressed
subimages
(c) Scott E Umbaugh, SIUE 2005 142
• The standard algorithm to generate the
codebook is the Linde-Buzo-Gray (LBG)
algorithm, also called the K-means or the
clustering algorithm:

(c) Scott E Umbaugh, SIUE 2005 143


• The LBG algorithm, along with other
iterative codebook design algorithms, does
not, in general, yield globally optimum
codes
• These algorithms will converge to a local
minimum in the error (distortion) space
• Theoretically, to improve the codebook,
the algorithm is repeated with different
initial random codebooks and the one
codebook that minimizes distortion is
chosen
(c) Scott E Umbaugh, SIUE 2005 144
• However, the LBG algorithm will typically
yield "good" codes if the initial codebook is
carefully chosen by subdividing the vector
space and finding the centroid for the
sample vectors within each division
• These centroids are then used as the
initial codebook
• Alternately, a subset of the training
vectors, preferably spread across the
vector space, can be randomly selected
and used to initialize the codebook
(c) Scott E Umbaugh, SIUE 2005 145
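A compact sketch of LBG/k-means codebook training on a set of block vectors, initialized from a randomly selected subset of the training vectors (one of the initialization options mentioned above); NumPy assumed.

```python
import numpy as np

def lbg_codebook(vectors: np.ndarray, size: int, iterations: int = 20, seed: int = 0):
    """Train a VQ codebook with the k-means / LBG iteration.

    vectors: array of shape (num_training_vectors, vector_length),
             e.g. flattened 4x4 image blocks.
    """
    rng = np.random.default_rng(seed)
    # Initialize from a random subset of the training vectors.
    codebook = vectors[rng.choice(len(vectors), size=size, replace=False)].astype(float)
    for _ in range(iterations):
        # Assignment step: nearest codeword for every training vector.
        d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        nearest = d.argmin(axis=1)
        # Update step: each codeword becomes the centroid of its cluster.
        for k in range(size):
            members = vectors[nearest == k]
            if len(members) > 0:
                codebook[k] = members.mean(axis=0)
    return codebook

rng = np.random.default_rng(1)
blocks = rng.integers(0, 256, size=(500, 16)).astype(float)   # 500 flattened 4x4 blocks
cb = lbg_codebook(blocks, size=32)
print(cb.shape)   # (32, 16)
```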
• The primary advantage of vector
quantization is its simple and fast
decompression, which comes at the high
cost of complex compression

• The decompression process requires the


use of the codebook to recreate the
image, which can be easily implemented
with a look-up table (LUT)

(c) Scott E Umbaugh, SIUE 2005 146


• This type of compression is useful for
applications where the images are
compressed once and decompressed
many times, such as images on an
Internet site

• However, it cannot be used for real-time


applications

(c) Scott E Umbaugh, SIUE 2005 147


Figure 10.3-8
Vector Quantization in the Spatial Domain

a) Original image
b) VQ with 4x4 vectors and a codebook of 128 entries, compression ratio = 11.49

(c) Scott E Umbaugh, SIUE 2005 148


Figure 10.3-8
Vector Quantization in the Spatial Domain (contd)

c) VQ with 4x4 vectors and a codebook of 256 entries, compression ratio = 7.93
d) VQ with 4x4 vectors and a codebook of 512 entries, compression ratio = 5.09

Note: As the codebook size is increased, the image quality improves and the
compression ratio decreases
(c) Scott E Umbaugh, SIUE 2005 149
Figure 10.3-9
Vector Quantization in the Transform Domain

Note: The original image is the image in Figure 10.3-8a

a) VQ with the discrete cosine transform, compression ratio = 9.21
b) VQ with the wavelet transform, compression ratio = 9.21

(c) Scott E Umbaugh, SIUE 2005 150


Figure 10.3-9
Vector Quantization in the Transform Domain (contd)

c) VQ with the discrete cosine transform, compression ratio = 3.44
d) VQ with the wavelet transform, compression ratio = 3.44

(c) Scott E Umbaugh, SIUE 2005 151


 Differential Predictive Coding

• Differential predictive coding (DPC)


predicts the next pixel value based on
previous values, and encodes the
difference between predicted and actual
value – the error signal
• This technique takes advantage of the fact
that adjacent pixels are highly correlated,
except at object boundaries

(c) Scott E Umbaugh, SIUE 2005 152


• Typically the difference, or error, will be
small, which minimizes the number of bits
required for the compressed file

• This error is then quantized, to further


reduce the data and to optimize visual
results, and can then be coded

(c) Scott E Umbaugh, SIUE 2005 153


(c) Scott E Umbaugh, SIUE 2005 154
• From the block diagram, the error signal is
the difference between the actual pixel value
and the predicted value; this error is what
is quantized and then coded

• The prediction equation is typically a


function of the previous pixel(s), and can
also include global or application-specific
information

(c) Scott E Umbaugh, SIUE 2005 155


(c) Scott E Umbaugh, SIUE 2005 156
• This quantized error can be encoded using
a lossless encoder, such as a Huffman
coder

• It should be noted that it is important that


the predictor uses the same values during
both compression and decompression;
specifically the reconstructed values and
not the original values

(c) Scott E Umbaugh, SIUE 2005 157
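A minimal sketch of one-dimensional DPC along a row, with a crude step-size quantizer standing in for the Lloyd-Max quantizer; note that the predictor uses the reconstructed value, not the original, so the encoder and decoder stay in step.

```python
def dpc_encode_row(row, step=4):
    """1-D differential predictive coding of one image row.

    Predictor: the previous *reconstructed* pixel. Quantizer: round the
    prediction error to the nearest multiple of `step` (a crude stand-in
    for a Lloyd-Max quantizer). The first pixel is transmitted directly.
    """
    errors = [int(row[0])]                        # first value sent as-is
    reconstructed = float(row[0])
    for value in row[1:]:
        prediction = reconstructed                # reconstructed, not original
        q_error = int(round((float(value) - prediction) / step)) * step
        errors.append(q_error)
        reconstructed = prediction + q_error      # what the decoder will also compute
    return errors

def dpc_decode_row(errors):
    out = [float(errors[0])]
    for q_error in errors[1:]:
        out.append(out[-1] + q_error)             # same predictor as the encoder
    return out

row = [100, 102, 105, 110, 140, 139, 137, 90]
enc = dpc_encode_row(row)
print(enc)
print(dpc_decode_row(enc))
```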


• The prediction equation can be one-
dimensional or two-dimensional, that is, it
can be based on previous values in the
current row only, or on previous rows also
• The following prediction equations are
typical examples of those used in practice,
with the first being one-dimensional and
the next two being two-dimensional:

(c) Scott E Umbaugh, SIUE 2005 160


(c) Scott E Umbaugh, SIUE 2005 161
• Using more of the previous values in the
predictor increases the complexity of the
computations for both compression and
decompression

• It has been determined that using more


than three of the previous values provides
no significant improvement in the resulting
image

(c) Scott E Umbaugh, SIUE 2005 162


• The results of DPC can be improved by
using an optimal quantizer, such as the
Lloyd-Max quantizer, instead of simply
truncating the resulting error

• The Lloyd-Max quantizer assumes a


specific distribution for the prediction error

(c) Scott E Umbaugh, SIUE 2005 163


• Assuming a 2-bit code for the error, and a
Laplacian distribution for the error, the
Lloyd-Max quantizer is defined as follows:

(c) Scott E Umbaugh, SIUE 2005 164


(c) Scott E Umbaugh, SIUE 2005 165
• For most images, the standard deviation
for the error signal is between 3 and 15

• After the data is quantized it can be further


compressed with a lossless coder such as
Huffman or arithmetic coding

(c) Scott E Umbaugh, SIUE 2005 166


Figure 10.3-15 DPC Quantization (contd)

h) Lloyd-Max quantizer, using 4 bits/pixel, normalized correlation = 0.90, with standard deviation = 10
i) Error image for (h)
j) Lloyd-Max quantizer, using 5 bits/pixel, normalized correlation = 0.90, with standard deviation = 10
k) Error image for (j)

(c) Scott E Umbaugh, SIUE 2005 171


 Model-based and Fractal Compression

• Model-based or intelligent compression


works by finding models for objects within
the image and using model parameters for
the compressed file

• The techniques used are similar to


computer vision methods where the goal is
to find descriptions of the objects in the
image

(c) Scott E Umbaugh, SIUE 2005 172


• The objects are often defined by lines or
shapes (boundaries), so a Hough
transform (Chap 4) may be used, while the
object interiors can be defined by
statistical texture modeling
• The model-based methods can achieve
very high compression ratios, but the
decompressed images often have an
artificial look to them
• Fractal methods are an example of model-
based compression techniques
(c) Scott E Umbaugh, SIUE 2005 173
• Fractal image compression is based on
the idea that if an image is divided into
subimages, many of the subimages will be
self-similar

• Self-similar means that one subimage can


be represented as a skewed, stretched,
rotated, scaled and/or translated version of
another subimage

(c) Scott E Umbaugh, SIUE 2005 174


• Treating the image as a geometric plane,
the mathematical operations (skew,
stretch, scale, rotate, translate) are called
affine transformations and can be
represented by the following general
equations:

   x' = a*x + b*y + e
   y' = c*x + d*y + f

(c) Scott E Umbaugh, SIUE 2005 175


• Fractal compression is somewhat like
vector quantization, except that the
subimages, or blocks, can vary in size and
shape
• The idea is to find a good set of basis
images, or fractals, that can undergo affine
transformations, and then be assembled
into a good representation of the image
• The fractals (basis images), and the
necessary affine transformation
coefficients are then stored in the
compressed file

(c) Scott E Umbaugh, SIUE 2005 176


• Fractal compression can provide high
quality images and very high compression
rates, but often at a very high cost

• The quality of the resulting decompressed


image is directly related to the amount of
time taken in generating the fractal
compressed image

• If the compression is done offline, one


time, and the images are to be used many
times, it may be worth the cost
(c) Scott E Umbaugh, SIUE 2005 177
• An advantage of fractals is that they can
be magnified as much as is desired, so
one fractal compressed image file can be
used for any resolution or size of image
• To apply fractal compression, the image is
first divided into non-overlapping regions
that completely cover the image, called
domains
• Then, regions of various size and shape
are chosen for the basis images, called
the range regions
(c) Scott E Umbaugh, SIUE 2005 178
• The range regions are typically larger than
the domain regions, can be overlapping
and do not cover the entire image

• The goal is to find the set of affine
transformations that best match the range
regions to the domain regions

• The methods used to find the best range


regions for the image, as well as the best
transformations, are many and varied

(c) Scott E Umbaugh, SIUE 2005 179


Figure 10.3-16 Fractal Compression

a) Cameraman image compressed with fractal encoding, compression ratio = 9.19
b) Error image for (a)

(c) Scott E Umbaugh, SIUE 2005 180


Figure 10.3-16 Fractal Compression (contd)

c) Compression ratio = 15.65 d) Error image for (c)

(c) Scott E Umbaugh, SIUE 2005 181


Figure 10.3-16 Fractal Compression (contd)

e) Compression ratio = 34.06 f) Error image for (e)

(c) Scott E Umbaugh, SIUE 2005 182


Figure 10.3-16 Fractal Compression (contd)

g) A checkerboard, compression ratio = 564.97
h) Error image for (g)

Note: Error images have been remapped for display so the background gray corresponds to zero,
then they were enhanced by a histogram stretch to show detail

(c) Scott E Umbaugh, SIUE 2005 183


 Transform Coding

• Transform coding is a form of block
coding done in the transform domain

• The image is divided into blocks, or


subimages, and the transform is
calculated for each block

(c) Scott E Umbaugh, SIUE 2005 184


• Any of the previously defined transforms
can be used, frequency (e.g. Fourier) or
sequency (e.g. Walsh/Hadamard), but it
has been determined that the discrete
cosine transform (DCT) is optimal for most
images

• The newer JPEG2000 algorithm uses the
wavelet transform, which has been found
to provide even better compression

(c) Scott E Umbaugh, SIUE 2005 185


• After the transform has been calculated,
the transform coefficients are quantized
and coded

• This method is effective because the


frequency/sequency transform of images
is very efficient at putting most of the
information into relatively few coefficients,
so many of the high frequency coefficients
can be quantized to 0 (eliminated
completely)
(c) Scott E Umbaugh, SIUE 2005 186
• This type of transform is a special type of
mapping that uses spatial frequency
concepts as a basis for the mapping

• The main reason for mapping the original


data into another mathematical space is to
pack the information (or energy) into as
few coefficients as possible

(c) Scott E Umbaugh, SIUE 2005 187


• The simplest form of transform coding is
achieved by filtering – simply eliminating
some of the high frequency coefficients

• However, this will not provide much


compression, since the transform data is
typically floating point and thus 4 or 8
bytes per pixel (compared to the original
pixel data at 1 byte per pixel), so
quantization and coding is applied to the
reduced data
(c) Scott E Umbaugh, SIUE 2005 188
• Quantization includes a process called bit
allocation, which determines the number
of bits to be used to code each coefficient
based on its importance

• Typically, more bits are used for lower


frequency components where the energy
is concentrated for most images, resulting
in a variable bit rate or nonuniform
quantization and better resolution

(c) Scott E Umbaugh, SIUE 2005 189


(c) Scott E Umbaugh, SIUE 2005 190
• Then a quantization scheme, such as
Lloyd-Max quantization is applied
• As the zero-frequency coefficient for real
images contains a large portion of the
energy in the image and is always
positive, it is typically treated differently
than the higher frequency coefficients
• Often this term is not quantized at all, or
the differential between blocks is encoded
• After they have been quantized, the
coefficients can be coded using, for
example, a Huffman or arithmetic coding
method

(c) Scott E Umbaugh, SIUE 2005 191


• Two particular types of transform coding
have been widely explored:
1. Zonal coding
2. Threshold coding

• These two vary in the method they use


for selecting the transform coefficients to
retain (using ideal filters for transform
coding selects the coefficients based on
their location in the transform domain)

(c) Scott E Umbaugh, SIUE 2005 192


1. Zonal coding

• It involves selecting specific coefficients


based on maximal variance
• A zonal mask is determined for the entire
image by finding the variance for each
frequency component
• This variance is calculated by using each
subimage within the image as a separate
sample and then finding the variance
within this group of subimages

(c) Scott E Umbaugh, SIUE 2005 193


(c) Scott E Umbaugh, SIUE 2005 194
• The zonal mask is a bitmap of 1's and 0's,
where the 1's correspond to the
coefficients to retain, and the 0's to the
ones to eliminate

• As the zonal mask applies to the entire


image, only one mask is required

(c) Scott E Umbaugh, SIUE 2005 195


2. Threshold coding

• It selects the transform coefficients


based on specific value

• A different threshold mask is required for


each block, which increases file size as
well as algorithmic complexity

(c) Scott E Umbaugh, SIUE 2005 196


• In practice, the zonal mask is often
predetermined because the low frequency
terms tend to contain the most information,
and hence exhibit the most variance

• In this case we select a fixed mask of a


given shape and desired compression
ratio, which streamlines the compression
process

(c) Scott E Umbaugh, SIUE 2005 197


• It also saves the overhead involved in
calculating the variance of each group of
subimages for compression and also
eases the decompression process

• Typical masks may be square, triangular


or circular and the cutoff frequency is
determined by the compression ratio

(c) Scott E Umbaugh, SIUE 2005 198
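A minimal sketch of zonal coding on an 8x8 block with a fixed triangular low-frequency mask, using SciPy's DCT; the mask shape and cutoff are illustrative choices, not the ones used in Figure 10.3-18.

```python
import numpy as np
from scipy.fft import dctn, idctn

def triangular_zonal_mask(size: int = 8, cutoff: int = 6) -> np.ndarray:
    """Keep coefficient (u, v) only if u + v < cutoff (low frequencies)."""
    u, v = np.indices((size, size))
    return (u + v) < cutoff

def zonal_code_block(block: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Transform a block, zero the masked-out coefficients, transform back."""
    coeffs = dctn(block, norm="ortho")
    coeffs[~mask] = 0.0                      # discard high-frequency coefficients
    return idctn(coeffs, norm="ortho")

rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(8, 8)).astype(float)
mask = triangular_zonal_mask(8, cutoff=6)
rebuilt = zonal_code_block(block, mask)
print(f"coefficients kept: {mask.sum()} of 64")
print(f"RMS error: {np.sqrt(np.mean((block - rebuilt) ** 2)):.2f}")
```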


Figure 10.3-18
Zonal Compression with DCT and Walsh Transforms

A block size of 64x64 was used, a circular zonal mask, and DC coefficients were not quantized

a) Original image, a view of St. Louis, Missouri, from the Gateway Arch
b) Results from using the DCT with a compression ratio = 4.27
c) Error image comparing the original and (b), histogram stretched to show detail

(c) Scott E Umbaugh, SIUE 2005 199


Figure 10.3-18
Zonal Compression with DCT and Walsh Transforms (contd)

d) Results from using the DCT with a compression ratio = 14.94
e) Error image comparing the original and (d), histogram stretched to show detail

(c) Scott E Umbaugh, SIUE 2005 200


Figure 10.3-18
Zonal Compression with DCT and Walsh Transforms (contd)

f) Results from using the Walsh Transform (WHT) with a compression ratio = 4.27
g) Error image comparing the original and (f), histogram stretched to show detail

(c) Scott E Umbaugh, SIUE 2005 201


Figure 10.3-18
Zonal Compression with DCT and Walsh Transforms (contd)

h) Results from using the WHT with a compression ratio = 14.94
i) Error image comparing the original and (h), histogram stretched to show detail

(c) Scott E Umbaugh, SIUE 2005 202


• One of the most commonly used image
compression standards is primarily a form
of transform coding
• The Joint Photographic Expert Group
(JPEG) under the auspices of the
International Standards Organization (ISO)
devised a family of image compression
methods for still images
• The original JPEG standard uses the DCT
and 8x8 pixel blocks as the basis for
compression
(c) Scott E Umbaugh, SIUE 2005 203
• Before computing the DCT, the pixel
values are level shifted so that they are
centered at zero

• EXAMPLE 10.3.7:
A typical 8-bit image has a range of gray
levels of 0 to 255. Level shifting this range
to be centered at zero involves subtracting
128 from each pixel value, so the resulting
range is from -128 to 127

(c) Scott E Umbaugh, SIUE 2005 204


• After level shifting, the DCT is computed

• Next, the DCT coefficients are quantized


by dividing by the values in a quantization
table and then truncated

• For color signals JPEG transforms the


RGB components into the YCrCb color
space, and subsamples the two color
difference signals (Cr and Cb), since we
perceive more detail in the luminance
(brightness) than in the color information
(c) Scott E Umbaugh, SIUE 2005 205
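A minimal sketch of these steps for a single 8x8 block; the quantization table below is a simple illustrative one whose step size grows with frequency, not the actual JPEG luminance table.

```python
import numpy as np
from scipy.fft import dctn, idctn

# Illustrative quantization table: coarser steps at higher frequencies.
# (The real JPEG luminance table is different; this one just shows the idea.)
u, v = np.indices((8, 8))
Q = 8 + 4 * (u + v)

def jpeg_style_encode_block(block: np.ndarray) -> np.ndarray:
    shifted = block.astype(float) - 128          # level shift to center at zero
    coeffs = dctn(shifted, norm="ortho")         # 8x8 DCT
    return np.round(coeffs / Q).astype(int)      # quantize: divide by table, round

def jpeg_style_decode_block(quantized: np.ndarray) -> np.ndarray:
    coeffs = quantized * Q                       # dequantize
    return np.clip(idctn(coeffs, norm="ortho") + 128, 0, 255)

rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(8, 8))
q = jpeg_style_encode_block(block)
print("nonzero quantized coefficients:", np.count_nonzero(q), "of 64")
print(np.round(jpeg_style_decode_block(q)).astype(int))
```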
• Once the coefficients are quantized, they
are coded using a Huffman code

• The zero-frequency coefficient (DC term)


is differentially encoded relative to the
previous block

(c) Scott E Umbaugh, SIUE 2005 206


These quantization tables were experimentally determined by JPEG to take
advantage of the human visual system’s response to spatial frequency which
peaks around 4 or 5 cycles per degree

(c) Scott E Umbaugh, SIUE 2005 207


Figure 10.3-21:The Original DCT-based JPEG
Algorithm Applied to a Color Image

a) The original image b) Compression ratio = 34.34

(c) Scott E Umbaugh, SIUE 2005 210


Figure 10.3-21:The Original DCT-based JPEG
Algorithm Applied to a Color Image (contd)

c) Compression ratio = 57.62 d) Compression ratio = 79.95

(c) Scott E Umbaugh, SIUE 2005 211


Figure 10.3-21:The Original DCT-based JPEG
Algorithm Applied to a Color Image (contd)

e) Compression ratio = 131.03 f) Compression ratio = 201.39

(c) Scott E Umbaugh, SIUE 2005 212


 Hybrid and Wavelet Methods

• Hybrid methods use both the spatial and


spectral domains

• Algorithms exist that combine differential


coding and spectral transforms for analog
video compression

(c) Scott E Umbaugh, SIUE 2005 213


• For digital images these techniques can
be applied to blocks (subimages), as well
as rows or columns
• Vector quantization is often combined with
these methods to achieve higher
compression ratios
• The wavelet transform, which localizes
information in both the spatial and
frequency domain, is used in newer hybrid
compression methods like the JPEG2000
standard
(c) Scott E Umbaugh, SIUE 2005 214
• The wavelet transform provides superior
performance to the DCT-based
techniques, and also is useful in
progressive transmission for Internet and
database use

• Progressive transmission allows low


quality images to appear quickly and then
gradually improve over time as more detail
information is transmitted or retrieved

(c) Scott E Umbaugh, SIUE 2005 215


• Thus the user need not wait for an entire
high quality image before they decide to
view it or move on

• The wavelet transform combined with


vector quantization has led to the
development of experimental
compression algorithms

(c) Scott E Umbaugh, SIUE 2005 216


• The general algorithm is as follows:

1. Perform the wavelet transform on the


image by using convolution masks

2. Number the different wavelet bands from


0 to N−1, where N is the total number of
wavelet bands, and 0 is the lowest
frequency (in both horizontal and vertical
directions) band

(c) Scott E Umbaugh, SIUE 2005 217


3. Scalar quantize the 0 band linearly to 8
bits

4. Vector quantize the middle bands using


a small block size (e.g. 2x2). Decrease
the codebook size as the band number
increases

5. Eliminate the highest frequency bands

(c) Scott E Umbaugh, SIUE 2005 218


(c) Scott E Umbaugh, SIUE 2005 219
• The example algorithms shown here utilize
a 10-band wavelet decomposition
(Figure 10.3-22b), with the Daubechies 4
element basis vectors, in combination with
the vector quantization technique

• They are called Wavelet/Vector


Quantization followed by a number
(WVQ#); specifically WVQ2, WVQ3 and
WVQ4

(c) Scott E Umbaugh, SIUE 2005 220


• One algorithm (WVQ4) employs the PCT
for preprocessing, before subsampling the
second and third PCT bands by a factor of
2:1 in the horizontal and vertical direction

(c) Scott E Umbaugh, SIUE 2005 221


(c) Scott E Umbaugh, SIUE 2005 222
• The table (10.2) lists the wavelet band
numbers versus the three WVQ algorithms

• For each WVQ algorithm, we have a


blocksize, which corresponds to the vector
size, and the number of bits, which, for
vector quantization, corresponds to the
codebook size

• The lowest wavelet band is coded linearly


using 8-bit scalar quantization

(c) Scott E Umbaugh, SIUE 2005 223


• Vector quantization is used for bands 1-8,
where the number of bits per vector
defines the size of the codebook
• The highest band is completely eliminated
(0 bits are used to code them) in WVQ2
and WVQ4, while the highest three bands
are eliminated in WVQ3
• For WVQ2 and WVQ3, each of the red,
green and blue color planes are
individually encoded using the parameters
in the table
(c) Scott E Umbaugh, SIUE 2005 224
Figure 10.3-23 Wavelet/Vector Quantization (WVQ)
Compression Example (contd)

h) WVQ4, compression ratio 36:1
i) Error of image (h)

(c) Scott E Umbaugh, SIUE 2005 227


• The JPEG2000 standard is also based
on the wavelet transform

• It provides high quality images at very


high compression ratios

• The committee that developed the


standard had certain goals for
JPEG2000

(c) Scott E Umbaugh, SIUE 2005 228


• The goals are as follows:

1. To provide better compression than the


DCT-based JPEG algorithm

2. To allow for progressive transmission of


high quality images

3. To be able to compress binary and


continuous tone images by allowing 1 to
16 bits for image components

(c) Scott E Umbaugh, SIUE 2005 229


4. To allow random access to subimages

5. To be robust to transmission errors

6. To allow for sequential image encoding

• The JPEG2000 compression method


begins by level shifting the data to center
it at zero, followed by an optional
transform to decorrelate the data, such
as a color transform for color images

(c) Scott E Umbaugh, SIUE 2005 230


• The one-dimensional wavelet transform is
applied to the rows and columns, and the
coefficients are quantized based on the
image size and number of wavelet bands
utilized

• These quantized coefficients are then


arithmetically coded on a bitplane basis

(c) Scott E Umbaugh, SIUE 2005 231


Figure 10.3-24: The JPEG2000 Algorithm Applied to a
Color Image

a) The original image

(c) Scott E Umbaugh, SIUE 2005 232


Figure 10.3-24: The JPEG2000 Algorithm Applied to a
Color Image (contd)

b) Compression ratio = 130, compare to Fig 10.3-21e (next slide)
c) Compression ratio = 200, compare to Fig 10.3-21f

(c) Scott E Umbaugh, SIUE 2005 233


Figure 10.3-21:The Original DCT-based JPEG
Algorithm Applied to a Color Image (contd)

e) Compression ratio = 131.03 f) Compression ratio = 201.39

(c) Scott E Umbaugh, SIUE 2005 234


Figure 10.3-24: The JPEG2000 Algorithm Applied to a
Color Image (contd)

d) A 128x128 subimage cropped from the standard JPEG image and enlarged to 256x256 using zero-order hold
e) A 128x128 subimage cropped from the JPEG2000 image and enlarged to 256x256 using zero-order hold

Note: The JPEG2000 image is much smoother, even with the zero-order hold enlargement

(c) Scott E Umbaugh, SIUE 2005 235
