Data Compression

The document discusses the significance of data compression in the context of the Internet of Things (IoT), highlighting the need for efficient data transmission and storage due to the vast amounts of data generated by IoT devices. It reviews various data compression techniques and formats, categorizing them based on data quality, coding schemes, data types, and application suitability, while also addressing the relationship between data compression and cryptography. The conclusion emphasizes the importance of compressing data before encryption to enhance security and reduce data size for easier transmission and storage.

Data Compression and Cryptography

Samson Otieno Ooko


African Center of Excellence in IoT
[email protected]
I. INTRODUCTION
The increasing popularity of, and need for, the Internet of Things (IoT) has led to the generation
of huge amounts of data. The data generated by IoT devices may be in the form of text, images,
audio, video or a combination of these. The number of connected IoT devices is forecast to grow
to 41.6 billion by 2025, and these devices are expected to generate 79.4 zettabytes (ZB) of
data [1].
IoT devices are characterized by constrained energy, storage and processing resources and
therefore need efficient data transmission and storage. Data compression (DC) provides
an opportunity to minimize the size of data stored or transmitted, saving bandwidth and
storage space. There is therefore a need to review DC approaches that can be used for IoT.
In this paper, an overview of data compression is outlined, different compression formats are
reviewed, the reasons for the existence of many DC formats are highlighted, the relationship
between compression and cryptology is given, and thereafter a conclusion is drawn.
II. DATA COMPRESSION OVERVIEW
A. Definition
Data compression deals with taking a string of bytes and compressing it down to a smaller set of
bytes, so that it takes less bandwidth to transmit the string or less space to store it on disk [2].
In general, data can be compressed by eliminating redundancy and irrelevancy. Compression
proceeds in two stages, modeling and coding. In the first stage, the data is analyzed for
redundant information, which is extracted to develop a model. In the second stage, the difference
between the modeled and the actual data, called the residual, is computed and coded by an
encoding technique [3].
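The two stages can be illustrated with a minimal sketch. It assumes a very simple model (each sample is predicted to equal the previous one) and uses Python's standard `zlib` module as the coder; the sensor trace and the helper names `pack`, `residuals` and `reconstruct` are hypothetical and are not taken from [3].

```python
import struct
import zlib

def pack(values):
    """Serialize integers as 4-byte little-endian signed values."""
    return struct.pack(f"<{len(values)}i", *values)

def residuals(values):
    """Modeling stage: predict each sample as the previous one, keep the error."""
    prev = 0
    out = []
    for v in values:
        out.append(v - prev)
        prev = v
    return out

def reconstruct(res):
    """Invert the model by accumulating the residuals."""
    prev = 0
    out = []
    for r in res:
        prev += r
        out.append(prev)
    return out

# Hypothetical sensor trace: a slowly rising reading in arbitrary units.
readings = [20000 + 3 * i for i in range(1000)]

# Coding stage: compress the raw samples and the residuals with the same coder.
raw_size = len(zlib.compress(pack(readings)))
res = residuals(readings)
modeled_size = len(zlib.compress(pack(res)))

print("raw:", raw_size, "bytes; residuals:", modeled_size, "bytes")
```

Because almost every residual in this trace is the same small number, the coder has far more redundancy to exploit than in the raw samples, and `reconstruct(res)` recovers the original readings exactly.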
B. Reason for many DC methods
There are several ways to characterize data, and the different characterizations have led to the
development of numerous DC approaches [3]. In addition, data is generated by many
heterogeneous devices in many applications, hence the many DC methods.
III. COMPRESSION TECHNIQUES AND FORMATS
A. Data Compression
Data compression techniques can be grouped into four categories [3], based on reconstructed data
quality, coding schemes, data type and application suitability, as shown in Figure 1.

Figure 1: Data compression techniques [3]


i. Based on data quality
The importance of the reconstructed data quality of a DC technique depends heavily on the type of
data or application involved. On this basis, compression can be either lossless or lossy.
Lossless compression means that no information is lost and the reconstructed data is
identical to the original data. It is used in applications where loss of information is undesirable,
such as text, medical imaging, law forensics, military imagery and satellite imaging. Lossy
compression techniques are preferable where the reconstructed data need not match the original
data perfectly and an approximation of the original data is acceptable [4].
Example formats - JPEG is a lossy image format, while RAW, BMP, GIF and PNG are all lossless
image formats.
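The difference between the two classes can be sketched as follows. This is an illustration under stated assumptions, not a method from the cited works: `zlib` stands in for a lossless coder, a uniform quantizer with a hypothetical step of 0.5 stands in for a lossy one, and the sample values are made up.

```python
import zlib

# Lossless: the decompressed bytes are identical to the input.
text = b"temperature=21.5;temperature=21.5;temperature=21.6;" * 20
packed = zlib.compress(text)
restored = zlib.decompress(packed)

def quantize(samples, step):
    """Lossy step: map each sample to the index of the nearest multiple of step."""
    return [round(s / step) for s in samples]

def dequantize(codes, step):
    """Rebuild approximate samples from the quantization indices."""
    return [c * step for c in codes]

samples = [21.48, 21.52, 21.61, 21.57, 21.49]  # made-up readings
step = 0.5                                     # assumed quantization step
codes = quantize(samples, step)
approx = dequantize(codes, step)
max_error = max(abs(a - b) for a, b in zip(samples, approx))

print("lossless round trip identical:", restored == text)
print("lossy reconstruction:", approx, "max error:", max_error)
```

The quantized codes are all the same small integer, which is cheap to store, but the original readings can only be recovered to within half a step.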
ii. Based on coding Techniques
Coding techniques include Huffman coding, arithmetic coding, LZ coding, Burrows-Wheeler
transform (BWT) coding, run-length encoding (RLE), transform coding, predictive coding,
dictionary-based methods, fractal compression, and scalar and vector quantization.
Example formats - images (e.g. JPEG, HEIF), video (e.g. MPEG, AVC, HEVC) and audio (e.g.
MP3, AAC, Vorbis).
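As an illustration of the simplest of these schemes, the following is a minimal sketch of run-length encoding (RLE). The function names `rle_encode` and `rle_decode` and the (value, count) pair representation are choices made for this example; real formats pack the runs into bytes in format-specific ways.

```python
from itertools import groupby

def rle_encode(data):
    """Collapse each run of identical bytes into a (byte value, run length) pair."""
    return [(value, len(list(run))) for value, run in groupby(data)]

def rle_decode(pairs):
    """Expand (byte value, run length) pairs back into the original bytes."""
    return b"".join(bytes([value]) * count for value, count in pairs)

# Made-up input with long runs, the case RLE handles well.
scanline = b"\x00" * 12 + b"\xff" * 3 + b"\x00" * 9
encoded = rle_encode(scanline)

print(encoded)
print(rle_decode(encoded) == scanline)
```

RLE only pays off when the data contains long runs; on data without runs, every byte becomes its own pair and the output grows.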
iii. Based on data Type
Some kinds of data, such as text, compress very well. Examples include the DOCX
and PPTX formats for Microsoft Office, and the LAZ format for Lidar point clouds, which reduces
file sizes to 10-20% of the original.
For images, PNG compresses better than GIF by around 5-25%. PNG is a
still-image-only format, while GIF also supports animation. JPG, or JPEG (Joint
Photographic Experts Group), is an image format with a variable compression level.
Examples of audio coding formats include MP3, AAC, Vorbis, FLAC, and Opus.
A video coding format is a content representation format for the storage or transmission of
digital video content. It typically uses a standardized video compression algorithm, most
commonly based on discrete cosine transform (DCT) coding and motion compensation.
Examples of video coding formats include MPEG, HEVC, Theora, RealVideo RV40, VP9 and
AV1.
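The idea behind DCT-based coding can be sketched in one dimension. This is a simplified illustration and not the algorithm of any particular video standard: real codecs apply two-dimensional transforms to blocks, quantize the coefficients and add motion compensation. The sample row and the choice to keep only two coefficients are assumptions made for the example.

```python
import math

def dct(samples):
    """Orthonormal DCT-II of a list of numbers."""
    n = len(samples)
    coeffs = []
    for k in range(n):
        total = sum(s * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, s in enumerate(samples))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        coeffs.append(scale * total)
    return coeffs

def idct(coeffs):
    """Inverse of dct() above (an orthonormal DCT-III)."""
    n = len(coeffs)
    samples = []
    for i in range(n):
        total = coeffs[0] * math.sqrt(1 / n)
        total += sum(c * math.sqrt(2 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                     for k, c in enumerate(coeffs) if k > 0)
        samples.append(total)
    return samples

# Made-up row of eight smoothly varying pixel values.
row = [10, 11, 12, 13, 14, 15, 16, 17]
coeffs = dct(row)

# Lossy step: keep the two lowest-frequency coefficients, zero the rest.
kept = coeffs[:2] + [0.0] * (len(coeffs) - 2)
approx = idct(kept)
max_error = max(abs(a - b) for a, b in zip(row, approx))

print([round(c, 3) for c in coeffs])
print([round(a, 2) for a in approx], "max error:", round(max_error, 3))
```

For smooth data most of the energy lands in the first few coefficients, so discarding the rest changes each reconstructed value by less than one unit here while leaving far fewer numbers to store.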
iv. Based on Application [5]
Some common compression techniques are better suited to particular applications than to others.
This is because the effectiveness of a technique depends on the nature of the data involved in
the application.
B. File compression formats [6]
File compression is the process of reducing the size of one or more files. It shrinks large files
into much smaller ones by removing redundant data. A compressed file archive makes it easier to
send and back up large files or groups of files.
Examples of file compression formats include ZIP, 7Z (7-Zip), BZ2, GZ and RAR, among others.
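Some of these formats can be tried directly from Python's standard library, as in the sketch below. The payload and the file name `readings.csv` are made up. Note that the `lzma` module writes the XZ container rather than 7Z (it is included only because 7Z archives commonly use LZMA-family compression), and that the standard library has no RAR writer.

```python
import bz2
import gzip
import io
import lzma
import zipfile

# Made-up, highly repetitive sensor log.
payload = b"sensor_id=42,temp=21.5,humidity=40\n" * 500

sizes = {
    "GZ (gzip)": len(gzip.compress(payload)),
    "BZ2 (bzip2)": len(bz2.compress(payload)),
    "XZ (LZMA)": len(lzma.compress(payload)),
}

# ZIP is an archive format: it stores named files, each compressed separately.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("readings.csv", payload)
sizes["ZIP (deflate)"] = len(buffer.getvalue())

print("original:", len(payload), "bytes")
for name, size in sizes.items():
    print(name, size, "bytes")
```

The relative sizes depend heavily on the input, so results on a repetitive log like this one should not be read as a general ranking of the formats.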
C. Relationship between cryptology and compression
Cryptology is the mathematics, formulas and algorithms that underpin cryptography and
cryptanalysis, while compression is the process of reducing the number of bits
or bytes needed to represent a given set of data for easy transmission and storage.
The enemy of compression is randomness, whereas encryption must introduce randomness into
digital data to provide security. Encrypted data therefore looks random and leaves a compressor
little redundancy to remove. This is why, when both compression and encryption are required,
the data is always compressed first and then encrypted [7].
Compressing data before encryption not only makes for shorter messages to be transmitted or
stored, but also improves security by reducing the redundancy in the plaintext and making
cryptanalysis harder.
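The effect of the ordering on size can be demonstrated with a small sketch. The `toy_encrypt` function below is a hypothetical stand-in that XORs the data with a SHA-256 counter keystream; it is for illustration only, is not secure, and is not the scheme discussed in [7]. The message and key are made up.

```python
import hashlib
import zlib

def toy_encrypt(data, key):
    """XOR data with a SHA-256 counter keystream. Illustration only, NOT secure.

    Applying the function twice with the same key returns the original data.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

message = b"temp=21.5;" * 1000       # made-up, highly redundant plaintext
key = b"hypothetical demo key"

compress_then_encrypt = toy_encrypt(zlib.compress(message), key)
encrypt_then_compress = zlib.compress(toy_encrypt(message, key))

print("plaintext:            ", len(message), "bytes")
print("compress then encrypt:", len(compress_then_encrypt), "bytes")
print("encrypt then compress:", len(encrypt_then_compress), "bytes")
```

Compressing first shrinks the redundant message to a small fraction of its size, whereas compressing the random-looking ciphertext gains essentially nothing.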
IV. CONCLUSION
Data compression involves reducing the number of bits or bytes needed to represent a given set
of data for easy transmission and storage. It is widely used together with file
compression to reduce the amount of data stored and transmitted by IoT devices. Data
compression techniques can be grouped based on reconstructed data quality, coding schemes,
data type and application suitability. As a result, many data compression formats exist, mainly
because of the various forms of data and data compression techniques. Compressing
data before encryption not only makes for shorter messages to be transmitted or stored, but also
improves security.

References

[1] E. Estopace, "Future IoT," 6 2019. [Online]. Available:
https://futureiot.tech/idc-forecasts-connected-iot-devices-to-generate-79-4zb-of-data-in-2025/#:~:text=The%20amount%20of%20data%20generated,include%20machines%2C%20sensors%20and%20cameras..
[Accessed 30 10 2020].

[2] K. S. C. P. Anton Chuvakin, "Analysis Goals, Planning, and Preparation: What Are We
Looking for?," in Logging and Log Management, 2013, pp. 115-125.

[3] J. Uthayakumar, T. Vengattaraman and D. p., "A survey on data compression techniques:
From the perspective of data," Journal of King Saud University - Computer and Information
Sciences, 2018.

[4] S. Drost and N. Bourbakis, "A hybrid system for real-time lossless image compression,"
Microprocess. Microsyst., vol. 25, pp. 19–31, 2001.

[5] K. Sayood, Introduction to data compression, San Francisco, CA., 2000.

[6] [Online]. Available: https://docs.fileformat.com/compression/.

[7] B. Carpentieri, "Efficient Compression and Encryption for Digital Data Transmission,"
Security and Communication Networks, 2018.
