What is Data Compression
Data Compression is also referred to as bit-rate reduction or source coding. This technique is used to reduce the
size of large files.
The advantage of data compression is that it saves disk space and reduces data transmission time.
There are different ways to classify compression techniques:
• with respect to the type of data to be compressed
• with respect to the target application area
• with respect to the fundamental building blocks of the algorithms used
There are mainly two types of data compression techniques:
1. Lossless Data Compression
2. Lossy Data Compression
What is Lossless data compression
Lossless data compression is used to compress the files without losing an original file's quality and data. Simply,
we can say that in lossless data compression, file size is reduced, but the quality of data remains the same.
The main advantage of lossless data compression is that we can restore the original data in its original form after
the decompression.
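The restore property described above can be checked with a small sketch. It assumes Python's standard zlib
module as the lossless compressor; the sample data is made up for illustration and is not from the original text.

```python
import zlib

# Highly repetitive sample data (an illustrative assumption).
original = b"ABABABABAB" * 100

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

# The compressed form is smaller, yet decompression returns every byte unchanged.
print(len(original), len(compressed))
print(restored == original)
```

The final comparison is the defining test of a lossless method: the restored bytes must equal the original bytes exactly.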
Lossless data compression is mainly used for sensitive documents, confidential information, and file formats such
as PNG, RAW, GIF, and BMP.
Some of the most important lossless data compression techniques are:
1. Run Length Encoding (RLE)
2. Lempel-Ziv-Welch (LZW)
3. Huffman Coding
4. Arithmetic Coding
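As an illustration of the first technique in the list, here is a minimal sketch of Run Length Encoding. It
assumes the input is a plain string and stores each run as a (character, count) pair; the function names
rle_encode and rle_decode are hypothetical, and real formats pack the runs into bytes in different ways.

```python
from itertools import groupby

def rle_encode(text):
    """Collapse each run of identical characters into a (char, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]

def rle_decode(pairs):
    """Expand (char, count) pairs back into the original string."""
    return "".join(ch * count for ch, count in pairs)

encoded = rle_encode("AAAABBBCCD")
print(encoded)              # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(encoded))  # AAAABBBCCD
```

RLE only pays off when the data contains long runs; on data without repeats the pair list is larger than the input.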
What is Lossy data compression
Lossy data compression is used to compress larger files into smaller files. In this compression technique, a
certain amount of data and quality is removed (lost) from the original file. The compressed file takes less space
than the original file because of this loss. This technique is generally useful when preserving the exact data is
not the first priority.
Note: At moderate compression levels, the human eye often does not notice the loss of data.
Lossy data compression is most widely used in JPEG images, MPEG video, and MP3 audio formats.
Some important lossy data compression techniques are:
1. Transform coding
2. Discrete Cosine Transform (DCT)
3. Discrete Wavelet Transform (DWT)
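The idea behind transform coding with the DCT can be sketched as follows: transform the samples, round the
coefficients to a coarse grid (this is where data is lost), and transform back. This is a simplified
one-dimensional sketch, not the procedure of any particular standard; the sample values, the step size, and
the helper names are assumptions chosen for illustration.

```python
import math

def _scale(k, n):
    # Normalisation factors that make the transform orthonormal.
    return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)

def dct(signal):
    """Orthonormal DCT-II of a list of numbers."""
    n = len(signal)
    return [
        _scale(k, n) * sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(signal))
        for k in range(n)
    ]

def idct(coeffs):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(coeffs)
    return [
        sum(_scale(k, n) * c * math.cos(math.pi / n * (i + 0.5) * k) for k, c in enumerate(coeffs))
        for i in range(n)
    ]

def lossy_roundtrip(signal, step):
    """Quantise the DCT coefficients to multiples of `step`, then reconstruct."""
    quantized = [round(c / step) for c in dct(signal)]  # small integers, cheap to store
    return idct([q * step for q in quantized])

samples = [52, 55, 61, 66, 70, 61, 64, 73]
restored = lossy_roundtrip(samples, step=10)
max_error = max(abs(a - b) for a, b in zip(samples, restored))
print([round(v, 1) for v in restored])
print(max_error)
```

A larger step gives smaller integers to store but a larger reconstruction error, which is the basic trade-off of lossy compression.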
Difference between lossless and lossy data compression
As we know, both lossless and lossy data compression techniques are used to compress data from its
original size. The main difference between lossless and lossy data compression is that losslessly
compressed data can be restored to its original form after decompression, but lossily compressed data
cannot.
The table below shows the difference between lossless and lossy data compression:
| S.No. | Lossless data compression | Lossy data compression |
|-------|---------------------------|------------------------|
| 1. | There is no loss of data or quality. | There is a loss of data and quality, which is often not noticeable. |
| 2. | The file can be restored to its original form. | The file cannot be restored to its original form. |
| 3. | Algorithms include Run Length Encoding, Huffman coding, Shannon-Fano coding, Arithmetic coding, Lempel-Ziv-Welch coding, etc. | Algorithms include transform coding, Discrete Cosine Transform, Discrete Wavelet Transform, fractal compression, etc. |
| 4. | Mainly used to compress text, sound, and images. | Mainly used to compress audio, video, and images. |
| 5. | Compared with lossy data compression, it retains more data. | Compared with lossless data compression, it retains less data. |
| 6. | File quality is high. | File quality is lower. |
| 7. | Mainly supports RAW, BMP, PNG, GIF, WAV, FLAC, and ALAC file types. | Mainly supports JPEG, MP3, MP4, MKV, and OGG file types. |
Why do we still need compression?
Compression technology is employed to use storage space efficiently and to save transmission capacity and
transmission time. Basically, it is all about saving resources and money. Despite the overwhelming advances in
storage media and transmission networks, compression technology is, perhaps surprisingly, still required. One
important reason is that the resolution and amount of digital data have increased as well (e.g. HD-TV resolution,
ever-increasing sensor sizes in consumer cameras), and there are still application areas where resources are
limited, e.g. wireless networks. Apart from the aim of simply reducing the amount of data, standards like MPEG-4,
MPEG-7, and MPEG-21 offer additional functionalities.
In recent years, three important trends have made compression technology more important than ever. This
development has already changed the way we work with multimedia data like text, speech, audio, images, and
video, and it will lead to new products and applications:
● The availability of highly effective methods for compressing various types of data.
● The availability of fast and cheap hardware components to conduct compression on single-chip systems,
microprocessors, DSPs and VLSI systems.
● Convergence of computer, communication, consumer electronics, publishing, and entertainment industries.
Selection criteria for choosing a compression scheme
If it is evident that an application requires compression technology, it has to be decided which type of
technology should be employed. Even if it is not evident at first sight, compression may lead to surprising
benefits and can offer additional functionalities due to saved resources. When selecting a specific system, quite
often we are not entirely free to choose because of standards or de-facto standards. Due to the increasing
development of open systems and the eventual declining importance of standards (example: MPEG-4
standardization!), these criteria might gain even more importance in the future. Important selection criteria are,
for example:
• data dimensionality: 1-D, 2-D, 3-D, ...
• lossy or lossless compression: dependent on data type, required quality, compression rate
• quality: which target quality is required for my target application?
• algorithm complexity, speed, delay: on- or off-line application, real-time requirements
• hardware or software solution: speed vs. price (video encoder boards are still costly)
• encoder / decoder symmetry: repeated encoding (e.g. video conferencing) vs. encoding only once but decoding
often (image databases, DVD, VoD, ...)
• error robustness and error resilience: do we expect storage or transmission errors?
• scalability: is it possible to decode in different qualities / resolutions without hassle?
• progressive transmission: the more data we transmit, the better the quality gets.
• standard required: do we build a closed system which is not intended to interact with other systems
(which can be desired due to market position, e.g. medical imaging) or do we want to exchange data
across many systems?
• multiple encoding / decoding: repeated application of lossy compression, e.g. in video editing
• adaptivity: do we expect data with highly varying properties or can we pre-adapt to specific properties
(e.g. fingerprint compression)?
• synchronisation issues: audio and video data should both preferably be frame-based
• transcoding: can the data be converted into other data types easily (e.g. MJPEG -> MPEG)?
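The trade-off between algorithm effort and compression rate named in the criteria above can be explored with a
small experiment. This sketch assumes Python's standard zlib module and made-up sample data; it only compares
output sizes at different effort levels, and the numbers it prints depend on the data and the zlib version.

```python
import zlib

# Made-up, fairly repetitive sample data (an illustrative assumption).
data = "".join(f"sensor reading {i}: value={i * 7 % 13}\n" for i in range(2000)).encode()

sizes = {}
for level in (0, 1, 6, 9):  # 0 = store only, 9 = maximum effort
    packed = zlib.compress(data, level)
    assert zlib.decompress(packed) == data  # still lossless at every level
    sizes[level] = len(packed)

print(len(data), sizes)
```

Higher levels usually produce smaller output at the cost of more encoding time, which matters for on-line or real-time applications.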