Dhandapani and Ramachandran EURASIP Journal on Advances in Signal Processing 2014, 2014:180

http://asp.eurasipjournals.com/content/2014/1/180

RESEARCH Open Access

Area and power efficient DCT architecture for image compression
Vaithiyanathan Dhandapani* and Seshasayanan Ramachandran

Abstract
The discrete cosine transform (DCT) is one of the major components in image and video compression systems.
The final output of these systems is interpreted by the human visual system (HVS), which is not perfect. The limited
perception of the HVS allows the algorithm to be numerically approximate rather than exact. In this
paper, we propose a new approximation matrix for the discrete cosine transform. The proposed 8 × 8 transformation
matrix contains only the entries 0 and ±1, so it requires only adders and avoids multiplication and shift operations.
The new transform requires only 12 additions, which greatly reduces the computational complexity while achieving
image compression performance comparable to that of the existing approximated DCTs. Another important
aspect of the proposed transform is that it offers efficient area and power optimization when implemented in
hardware. To ensure the versatility of the proposal and to further evaluate the performance and correctness of
the structure in terms of speed, area, and power consumption, the model is implemented on Xilinx Virtex 7 field
programmable gate array (FPGA) device and synthesized with Cadence® RTL Compiler® using UMC 90 nm standard cell
library. The analysis obtained from the implementation indicates that the proposed structure is superior to the existing
approximation techniques with a 30% reduction in power and 12% reduction in area.
Keywords: Discrete cosine transform (DCT); Multiplication-free transform; Low complexity; FPGA implementation;
Image compression; VLSI architecture

1 Introduction
Discrete cosine transform (DCT) [1] has become one of the basic tools in signal and image processing, mainly because of its good energy compaction properties. The Karhunen-Loeve transform (KLT) is considered to be statistically optimal for energy concentration [2,3], but it is data dependent and requires more computation than the DCT. The DCT, although suboptimal, is therefore the best substitute for the KLT. Indeed, the DCT has found applications in many image and video compression standards such as JPEG [4], MPEG-1 [5], MPEG-2 [6], H.261 [7], H.263 [8], and H.264/AVC [9,10]. During the JPEG process, an image is divided into several 8 × 8 blocks, and then the two-dimensional discrete cosine transform (2-D DCT) is applied to encode each block. The two-dimensional DCT of order N × N is defined as

$$
T_{DCT}(u,v) = \alpha(u)\,\alpha(v) \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} X(i,j)\, \cos\frac{\pi(2i+1)u}{2N}\, \cos\frac{\pi(2j+1)v}{2N}, \quad 0 \le i, j, u, v \le N-1 \tag{1}
$$

where

$$
\alpha(u) = \alpha(v) =
\begin{cases}
\sqrt{1/N} & \text{for } u, v = 0 \\
\sqrt{2/N} & \text{otherwise}
\end{cases}
$$

In general, the floating point DCT decorrelates the data being transformed so that most of its energy is
packed in the low-frequency region, which is best suited
* Correspondence: [email protected]
Department of Electronics and Communication Engineering, College of
Engineering Guindy, Anna University, Chennai, Tamil Nadu 600025, India

© 2014 Dhandapani and Ramachandran; licensee Springer. This is an Open Access article distributed under the terms of the
Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use,
distribution, and reproduction in any medium, provided the original work is properly credited.

for well-known image compression techniques [11-15] but does not meet the requirements of very fast real-time compression applications. For this reason, there has been huge interest in finding fixed point multiplication-free DCT algorithms [16-32] that can be implemented as low power and area efficient digital circuits and are thus useful for mobile imaging devices.

In this scenario, a large number of DCT approximations have recently been proposed. Approximated algorithms provide a meaningful estimate of the 8-point DCT at low complexity. Cham [16] proposed the integer cosine transforms (ICT) using the principle of dyadic symmetry. The performance of ICT is very close to that of DCT. Haweel [17] proposed a signed DCT (SDCT) by applying a signum function to the DCT matrix, which maintains the good de-correlation and power compaction properties of the DCT but requires 24 additions and is not orthogonal. Lengwehasatit and Ortega [18] suggested two 8 × 8 transform matrices, one for the coarsest and another for the finest approximation. Using these two matrices, a trade-off between speedup and accuracy in various bit ranges can be achieved. The coding performance shows a 73% reduction in complexity with only 0.2 dB degradation in peak signal-to-noise ratio (PSNR). Tran [13] proposed the family of 8 × 8 biorthogonal transforms called binDCT, which are approximations of the popular 8 × 8 DCT. The binDCT requires 31 additions and 14 shift operations with a coding gain ranging from 8.77 to 8.82 dB; it gives finer approximations to the exact DCT and is suitable for VLSI implementation. Bouguezel et al. proposed a series of DCT approximation techniques [19-23] which trade off computational complexity against image compression performance. Cintra and Bayer [24] proposed an approximate DCT based on the round-off function which requires 22 additions with less blocking artifacts. Bouguezel et al. [23] proposed a low complexity parametric transform for image compression, which requires 18 additions and 2 multiplications. This computational complexity can be reduced by varying the parameter a. Usually, the parameter a is selected as a small integer in order to minimize the computational complexity. In Bouguezel et al. [23], the suggested values are a ∊ {0, 1/2, 1}. For the value a = 1/2, the two multiplications become just bit-shift operations. If a = 1, then no shift operation is necessary, and the transform requires only 18 additions. In the case of a = 0, the complexity reduces to 16 additions. Brahimi and Bouguezel [25] proposed an efficient fast integer DCT transform which is also claimed to require only 16 additions, and it is not orthogonal. Senapati et al. [26] proposed a low complexity orthogonal 8 × 8 transform matrix for fast image compression, which requires 14 additions and two shift operations. This computational complexity is further reduced by Bayer and Cintra [27] to 14 additions, which gives better image compression performance than the classic SDCT [17] and Bouguezel et al. [23] transforms. Cintra et al. [28] proposed a very low complexity DCT approximation obtained via pruning, which is claimed to require only 10 additions. However, the performance results reported in [28] are not reproduced here, since the proposed work concentrates on non-pruned techniques. On the other hand, integrating multiple standard encoding or decoding hardware into a single chip increases the area and power consumption. Numerous low power, high speed, and area efficient hardware architectures have been proposed for DCT computation [32-35].

In general, DCT approximations with low computational complexity and low bit rates are preferred. In this paper, a low complexity multiplier-less DCT approximation is proposed, which is well suited to hardware realization. The derived fast algorithm requires only 12 additions, which is fewer than the number of additions required for any existing DCT approximation [17-27,29-31]. To examine the performance and trade-offs associated with the algorithm, we have coded the proposed as well as the existing algorithms [17,19,21-24,26,27] in MATLAB and Verilog HDL and synthesized them for the Xilinx Virtex 7 XC7V585T-2LFFG1761C device (Xilinx, Inc., San Jose, CA, USA) [36] and with Cadence® RTL Compiler® [37] using the UMC 90 nm standard cell library.

The rest of the paper is structured as follows. In Section 2, the proposed transform and the factors influencing its performance improvements and computational complexity are compared with the existing methods. An image compression simulation and hardware implementation for the proposed and existing approximation DCT are detailed and analyzed in Section 3. Conclusion and final remarks are given in Section 4.

2 Proposed transform
Haweel [17] introduces the approximation DCT method by applying the signum function operator to the DCT element in Equation 1. The SDCT is given by

$$
T_{SDCT}(u,v) = \frac{1}{\sqrt{N}}\, \operatorname{sign}\{T_{DCT}(u,v)\} \tag{2}
$$

where sign{·} is the signum function, defined as follows:

$$
\operatorname{sign}\{x\} =
\begin{cases}
+1 & \text{if } x > 0 \\
0 & \text{if } x = 0 \\
-1 & \text{if } x < 0
\end{cases} \tag{3}
$$

Signed DCT has many advantages, one of which is apparent from Equations 1 to 3: all the elements in the transform are 0 or ±1, which eliminates the need for a multiplication operation or a transcendental
expression. The transform order need not be a specific
integer or a power of 2. The SDCT also maintains the periodicity and spectral structure of its originating DCT and in turn maintains good de-correlation and energy compaction characteristics. Therefore, SDCT is highly preferred for low computation applications.

There have been many recent approaches for reducing the computational complexity of the DCT transform, but the reduction in computational complexity comes at the cost of PSNR. In this paper, a new DCT approximation scheme is developed by reproducing the reported butterfly structures [17,23,26,27]. After reviewing these structures, the common computations were identified and shared to remove the redundancy in the DCT matrix, and the result was simulated in MATLAB. The image compression performance was evaluated based on the PSNR values, the matrix was altered, and the procedure was repeated. First, the transform matrix was reduced to 16 additions [29], then to 14 additions [30], and finally to 12 additions. The forward and inverse transform matrices are obtained as follows:

$$
T = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
1 & 1 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & -1 & -1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & -1 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 & 0 & -1 & -1 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & -1
\end{bmatrix} \tag{4}
$$

$$
T^{-1} = \begin{bmatrix}
1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
-1 & 1 & 0 & 0 & -1 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & -1 & 1 & 0 & 0 & -1 & 1 \\
0 & 0 & -1 & 1 & 0 & 0 & 1 & -1 \\
0 & 0 & 1 & 0 & 0 & 0 & -1 & 0 \\
-1 & 1 & 0 & 0 & 1 & -1 & 0 & 0 \\
1 & 0 & 0 & 0 & -1 & 0 & 0 & 0
\end{bmatrix} D \tag{5}
$$

where $D = \tfrac{1}{2}\,\mathrm{diag}(1, 1, 1, 1, 1, 1, 1, 1)$.

It can be seen from Equations 4 and 5 that the entries of $T$ and $T^{-1}$ are in {0, ±1}. The proposed transform therefore requires only 12 additions and avoids multiplication and bit shift operations. In terms of complexity assessment, the diagonal matrix $D$ need not introduce any computational overhead. In JPEG, the DCT operation is a preprocessing step for a subsequent coefficient quantization procedure. Consequently, the scaling factors in the diagonal matrix $D$ can be merged into the de-quantization matrix. This procedure is clearly suggested and adopted in several works [19-27].

The number of additions in the proposed transform can be clearly understood from the butterfly diagram shown in Figure 1. Input data $x_n$, where n = 0, 1, 2, …, 7, is related to the output $X_k$, where k = 0, 1, 2, …, 7. Continuous and dashed lines represent multiplication by +1 and −1, respectively. The number of additions is reduced by sharing common terms, without considerably affecting the PSNR.

Figure 1 Signal flow graph for the proposed transform of order N = 8.

Table 1 Arithmetic computation complexity assessment
Transform Additions Multiplications Shifts
DCT (by definition) 56 64 0
Arai et al. [38] 29 5 0
SDCT [17] 24 0 0
Level 1 approximation [18] 24 0 2
Bouguezel et al. [20] 21 0 0
Bouguezel et al. [19] 18 0 2
Bouguezel et al. [21] 18 0 0
Bouguezel et al. [22] 24 0 4
Bouguezel et al. [23] (a = 0) 16 0 0
Bouguezel et al. [23] (a = 1) 18 0 0
Bouguezel et al. [23] (a = 2) 18 2 0
Senapati et al. [26] 14 0 2
Cintra and Bayer [24] 22 0 0
Bayer and Cintra [27] 14 0 0
Transform in [29] 16 0 0
Transform in [30] 14 0 0
Proposed transform 12 0 0

The numbers of additions, multiplications, and bit-shift operations required for the proposed transform and for the transforms that have evolved from the SDCT are presented in Table 1. This clearly shows that the proposed matrix saves 14.29%, 25%, 33.3%, and 50% of the computation compared with Bayer and Cintra
[27], Senapati et al. [26], Bouguezel et al. [23], and SDCT [17], respectively.

3 Experimental results and analysis
3.1. Application to image compression
To evaluate the performance of the proposed transform matrix in image compression, we used the experimental methodology described in [17] and supported by [18-27], as shown in Figure 2. A set of 30 512 × 512 8-bit grayscale images obtained from a standard public image bank [39] was considered, grouped into three image types. For example, Lena, Cameraman, Goldhill, and Boat are low-frequency (LF) images; Barbara and House are medium frequency (MF) images; and Mandrill and Grass are high frequency (HF) images. The proposed fast DCT and existing transforms [17,19,21-24,26,27] have been implemented in MATLAB, and performance parameters such as PSNR and compression ratio (CR) are determined.

Figure 2 Implementation of proposed transform matrix in image coding.

A simulation has been carried out for the proposed and existing approximated discrete cosine transforms by incorporating them into the international standard lossy image compression algorithm produced by the Joint Photographic Experts Group, which employs the DCT. Each image is divided into non-overlapping blocks of 8 × 8 pixels. The pixel values in the original block are converted from the unsigned integer format to signed integer format, and then an approximate DCT is applied. After the transform coefficients are quantized, they are arranged in the standard zigzag sequence and the less significant coefficients are set to zero, so that only r of the 64 transform coefficients in each block are used to reconstruct the image. The inverse procedure is applied to reconstruct the processed data and image.

The transform matrices that have evolved from the SDCT so far are used to evaluate and position the performance of the proposed transform. The original and reconstructed images using the proposed and existing methods are illustrated, and the PSNR comparisons are presented in Table 2 and Figure 3. It is clear from Table 2 that the PSNR obtained by Bouguezel et al. [22] is significantly higher than that of the other recent algorithms, but it requires a greater number of arithmetic operations. This proposal concentrates on low computational complexity algorithms. We therefore single out Bouguezel et al. [23], Cintra and Bayer [24], Senapati et al. [26], and Bayer and Cintra [27] for further comparisons. Table 2 and Figure 3 show that the proposed transform has a better PSNR than Bouguezel et al. [23], Cintra and Bayer [24], Senapati et al. [26], and Bayer and Cintra

Table 2 PSNR obtained by different 8 × 8 transform matrices

Transform Lena Boat Goldhill Barbara Lighthouse Mandrill Grass
SDCT [17] 47.6842 48.0605 47.3815 47.7712 48.0659 47.1800 47.0199
Bouguezel et al. [19] 46.1432 46.5150 45.7904 46.1774 46.5060 45.6001 45.4208
Bouguezel et al. [21] 38.4446 38.7755 38.1416 38.4739 38.7539 37.8809 37.7553
Bouguezel et al. [22] 50.7247 51.0442 50.3861 50.7648 51.0748 50.1816 50.0178
Bouguezel et al. [23] (a = 1) 39.1987 39.5451 38.9039 39.2220 39.5325 38.6401 38.5118
Senapati et al. [26] 40.0021 40.247 39.6411 39.7347 40.0654 38.9019 38.8482
Cintra and Bayer [24] 40.3542 40.5915 39.9636 40.3566 40.7288 39.6412 39.4864
Bayer and Cintra [27] 40.4921 40.8259 40.1348 40.5130 40.8217 39.9135 39.7726
Transform in [29] 42.5718 42.9205 42.2187 42.6054 42.9414 42.0134 41.8338
Transform in [30] 41.1196 41.4667 40.7955 41.1598 41.4880 40.5666 40.3979
Proposed transform 41.7576 42.1115 41.4527 41.7853 42.1274 41.1899 41.0367
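
As a quick arithmetic check on Table 2, the following sketch recomputes the average and peak PSNR gaps between the proposed transform and three low-complexity competitors from the seven tabulated images. The dictionary layout and the helper name `psnr_gaps` are illustrative assumptions, and "peak" is taken here to mean the largest value in a row. Because the paper's averages may be taken over the full 30-image set rather than the seven images shown, the last digit can differ slightly from the gains quoted in the text.

```python
# PSNR values (dB) copied from Table 2, in the column order
# Lena, Boat, Goldhill, Barbara, Lighthouse, Mandrill, Grass.
TABLE2 = {
    "Proposed": [41.7576, 42.1115, 41.4527, 41.7853, 42.1274, 41.1899, 41.0367],
    "Bayer and Cintra [27]": [40.4921, 40.8259, 40.1348, 40.5130, 40.8217, 39.9135, 39.7726],
    "Bouguezel et al. [23] (a = 1)": [39.1987, 39.5451, 38.9039, 39.2220, 39.5325, 38.6401, 38.5118],
    "Senapati et al. [26]": [40.0021, 40.247, 39.6411, 39.7347, 40.0654, 38.9019, 38.8482],
}


def psnr_gaps(reference, other):
    """Return (average gap, peak gap) in dB of `reference` over `other`."""
    avg_gap = sum(reference) / len(reference) - sum(other) / len(other)
    peak_gap = max(reference) - max(other)  # assumption: peak = best image in the row
    return avg_gap, peak_gap


if __name__ == "__main__":
    proposed = TABLE2["Proposed"]
    for name, values in TABLE2.items():
        if name != "Proposed":
            avg_gap, peak_gap = psnr_gaps(proposed, values)
            print(f"{name}: average +{avg_gap:.2f} dB, peak +{peak_gap:.2f} dB")
```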

Figure 3 Reconstructed with proposed transform, Bayer and Cintra [27] and Bouguezel et al. [23]. For (a) Lena, (b) Boat, (c) Barbara, and (d) Airplane images. Panels in each row: original image, proposed transform, CB-2012 [27], BAS2011 [23]. PSNR of the three reconstructions, in that order: (a) 41.7576, 40.4921, 39.1987; (b) 42.1115, 40.8259, 39.5451; (c) 41.7853, 40.5130, 39.2220; (d) 41.3747, 39.7569, 38.6541.

[27] for almost all types of images. Compared with Bayer and Cintra [27], Bouguezel et al. [23], and Senapati et al. [26], the proposed method improves the average PSNR by 1.28, 2.56, and 2.01 dB and the peak PSNR by 1.30, 2.59, and 1.88 dB, respectively.

Further, to show the efficiency of the proposed transform matrix in image compression, the PSNR is obtained by varying the number of transform coefficients retained, in steps of four, to reconstruct the image. For the sake of reference, the DCT results are also included. Figure 4 shows that the proposed approximated transform is comparable when r < 32 and that it outperforms the others when r ≥ 32 for all the types (LF, MF, and HF) of images. The overall results show that the proposed transform gives comparable or better image

Figure 4 PSNR obtained by different transforms. (a) Lena, (b) Cameraman, (c) Barbara, and (d) Mandrill images.
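
The retained-coefficient experiment behind these curves can be sketched as follows. This is a minimal NumPy illustration, not the authors' MATLAB code: it applies the matrix T of Equation 4 separably to one level-shifted 8 × 8 block, keeps the first r coefficients in zigzag order, and inverts numerically with `numpy.linalg.inv` rather than transcribing Equation 5. Quantization and entropy coding are omitted, and the function names (`zigzag_indices`, `reconstruct_block`, `psnr`) and the random test block are assumptions introduced for illustration.

```python
import numpy as np

# Forward matrix T as printed in Equation 4.
T = np.array([
    [1, 0, 0, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 0, 1, 1],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [0, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, -1, -1, 0, 0],
    [0, 0, 1, 0, 0, -1, 0, 0],
    [1, 1, 0, 0, 0, 0, -1, -1],
    [1, 0, 0, 0, 0, 0, 0, -1],
], dtype=float)


def zigzag_indices(n=8):
    """(row, col) pairs of an n x n block in JPEG zigzag order."""
    cells = [(i, j) for i in range(n) for j in range(n)]
    # Odd anti-diagonals run top to bottom, even ones bottom to top.
    return sorted(cells, key=lambda p: (p[0] + p[1],
                                        p[0] if (p[0] + p[1]) % 2 else -p[0]))


def reconstruct_block(block, r, transform=T):
    """Transform a block, keep the first r zigzag coefficients, invert."""
    inverse = np.linalg.inv(transform)  # numerical inverse, scaling included
    coeffs = transform @ (block - 128.0) @ transform.T  # level shift + 2-D transform
    kept = np.zeros_like(coeffs)
    for i, j in zigzag_indices(block.shape[0])[:r]:
        kept[i, j] = coeffs[i, j]
    return inverse @ kept @ inverse.T + 128.0


def psnr(original, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB for 8-bit data."""
    mse = np.mean((original - restored) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    block = rng.integers(0, 256, size=(8, 8)).astype(float)  # synthetic test block
    for r in range(4, 65, 4):  # steps of four, as in the text
        print(r, psnr(block, reconstruct_block(block, r)))
```

With r = 64 every coefficient is kept, so the block is recovered up to floating-point rounding; smaller r trades reconstruction quality for fewer coefficients.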

compression performance than the transforms that have evolved from the SDCT so far. At the same time, it provides an ample reduction in the number of arithmetic operations, which is essential for hardware realization.

3.2. Hardware implementation
In this section, the performance of the proposed and the existing DCT matrices is compared in terms of hardware cost and computing time. The digital architecture of the proposed approximate DCT is shown in Figure 5. The hardware cost is measured by the number of adders, multipliers, and shifters used in the architecture, and the computing time is normalized to clock cycles.

3.2.1 Field programmable gate array implementation
The proposed approximation DCT matrix and the reported matrices [17,19,21-24,26,27] were physically implemented on a Xilinx Virtex 7 XC7V585T-2LFFG1761C device [36]. The inputs were assumed to have 8-bit resolution, and the designs are realized with pipelining in order to increase the throughput. To obtain accurate timing results, post-place and route (PAR) analysis is done for each run of the design flow. Since the hardware resource requirements are low for the proposed method, it gains greater flexibility in placement and routing to reach an optimized delay. The implementation is evaluated in terms of hardware complexity, time delay, and area consumption. The resource utilization (area) is measured as the cell usage (input/output buffers and global clock buffers) and the number of lookup tables (LUTs). The resources used by the implementation are listed in Table 3. It is observed from Table 3 that the proposed structure uses 13.72%, 35.29%, and 29% fewer LUTs than Bayer and Cintra [27], Bouguezel et al. [23], and Senapati et al. [26], respectively.

3.2.2 ASIC implementation
The field programmable gate array (FPGA) verified register transfer level (RTL) code was targeted to the UMC 90 nm standard cell library using Cadence Encounter® RTL Compiler [37]. The supply voltage of the CMOS was fixed at VDD = 1 V during the estimation of area and power consumption. The design was realized up to the synthesis and place and route levels, leading to the estimated results tabulated in Table 4. Table 4 shows that the Bayer and Cintra [27] transform consumes the least area among the existing structures. The proposed structure consumes 12% less area and offers a 30% power reduction with a 9% reduction in critical path delay compared to Bayer and Cintra [27].

4 Conclusions
Low power and area minimization are the two indispensable requirements for portable multimedia devices, which employ various signal and image processing algorithms. In this paper, we proposed a new 8 × 8
transformation matrix, which requires only 12 additions, thus avoiding the need for multiplication and bit shift operations. The proposed approximation DCT for image compression is a simple, efficient architecture having lower computational complexity with improvement in the peak signal-to-noise ratio. According to the results, the proposed transform has a comparable or better image compression performance than the Bouguezel et al. [23], Cintra and Bayer [24], Senapati et al. [26], and Bayer and Cintra [27] transforms. Compared with the most recent transform of Bayer and Cintra [27], the proposed method achieves a 1.28 dB improvement in the average PSNR and a 1.30 dB improvement in the peak PSNR, while providing a 14% reduction in the number of arithmetic operations. Further, the efficiency of the proposed transform, which was implemented on a Xilinx Virtex 7 device and later synthesized with Cadence RTL Compiler using the UMC 90 nm standard cell library, has been determined.

Table 3 Comparison of hardware resource consumption with the reported architectures on Xilinx Virtex-7 XC7V585T-2LFFG1761C device
Transform LUTs Cell usage Delay (ns)
SDCT [17] 272 274 5.113
Bouguezel et al. [19] 267 269 4.149
Bouguezel et al. [21] 204 206 5.716
Bouguezel et al. [22] 271 273 5.153
Bouguezel et al. [23] (a = 1) 204 205 5.593
Senapati et al. [26] 186 189 5.914
Cintra and Bayer [24] 226 228 5.171
Bayer and Cintra [27] 153 155 4.580
Transform in [29] 167 168 6.738
Transform in [30] 156 157 5.924
Proposed transform 132 134 3.247

Figure 5 Digital architecture for proposed approximate DCT.
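
A software model can make the 12-addition count concrete. The sketch below is one possible grouping of the additions implied by the matrix T of Equation 4 (eight first-stage sums and differences followed by four second-stage sums). It is derived here directly from the matrix and is not claimed to be the exact signal flow graph of Figure 1 or the architecture of Figure 5. The function name `fast_transform` is an illustrative assumption. The inverse is computed numerically to confirm that, after scaling by 2, its entries are also in {0, ±1}.

```python
import numpy as np

# Forward matrix T as printed in Equation 4.
T = np.array([
    [1, 0, 0, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 0, 1, 1],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [0, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, -1, -1, 0, 0],
    [0, 0, 1, 0, 0, -1, 0, 0],
    [1, 1, 0, 0, 0, 0, -1, -1],
    [1, 0, 0, 0, 0, 0, 0, -1],
])


def fast_transform(x):
    """Compute T @ x with 12 additions/subtractions and no multiplications."""
    x0, x1, x2, x3, x4, x5, x6, x7 = x
    # Stage 1: sums and differences of mirrored inputs (8 operations).
    s0, s1, s2, s3 = x0 + x7, x1 + x6, x2 + x5, x3 + x4
    d0, d1, d2, d3 = x0 - x7, x1 - x6, x2 - x5, x3 - x4
    # Stage 2: four more additions; the other four outputs reuse stage-1 results.
    return [s0, s0 + s1, s2, s2 + s3, d2 + d3, d2, d0 + d1, d0]


if __name__ == "__main__":
    x = np.arange(1, 9)
    print(fast_transform(x))      # adder-only evaluation
    print(T @ x)                  # matrix-vector product for comparison
    print(2 * np.linalg.inv(T))   # scaled inverse, computed numerically
```

Each output is either a stage-1 result or the sum of two stage-1 results, which is why no output needs more than two adder levels in this grouping.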

Table 4 Comparison of hardware resource consumption with the reported architectures for CMOS 90 nm ASIC implementation
Transform Area (μm²) Leakage power (mW) Dynamic power (mW) Total power (mW) Critical path delay (ns)
SDCT [17] 3,892 0.0120 0.7251 0.7371 0.809
Bouguezel et al. [19] 4,042 0.0123 0.6725 0.6848 0.823
Bouguezel et al. [21] 2,864 0.0088 0.4249 0.4337 0.783
Bouguezel et al. [22] 3,787 0.0115 0.6662 0.6777 0.787
Bouguezel et al. [23] (a = 1) 2,907 0.0088 0.4354 0.4442 0.775
Senapati et al. [26] 2,273 0.0069 0.2799 0.2868 0.980
Cintra and Bayer [24] 3,072 0.0094 0.4541 0.4635 0.773
Bayer and Cintra [27] 2,221 0.0063 0.2687 0.2750 0.675
Transform in [29] 2,459 0.0077 0.3096 0.3173 0.103
Transform in [30] 2,301 0.0073 0.2831 0.2904 0.987
Proposed transform 1,954 0.0061 0.1893 0.1954 0.616
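
The percentage savings quoted for the FPGA and ASIC implementations follow directly from Tables 3 and 4. The short sketch below recomputes them; the helper name `reduction_percent` and the dictionary layout are assumptions introduced for illustration, and the numbers are copied from the two tables.

```python
def reduction_percent(baseline, proposed):
    """Relative saving of `proposed` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline


# LUT counts from Table 3 (Xilinx Virtex-7).
LUTS = {
    "Proposed": 132,
    "Bayer and Cintra [27]": 153,
    "Bouguezel et al. [23] (a = 1)": 204,
    "Senapati et al. [26]": 186,
}

# (area in um^2, total power in mW, critical path delay in ns) from Table 4.
ASIC = {
    "Proposed": (1954, 0.1954, 0.616),
    "Bayer and Cintra [27]": (2221, 0.2750, 0.675),
}

if __name__ == "__main__":
    for name, luts in LUTS.items():
        if name != "Proposed":
            print(f"LUTs vs {name}: {reduction_percent(luts, LUTS['Proposed']):.2f}%")
    labels = ("area", "total power", "critical path delay")
    for label, base, prop in zip(labels, ASIC["Bayer and Cintra [27]"], ASIC["Proposed"]):
        print(f"{label} vs Bayer and Cintra [27]: {reduction_percent(base, prop):.2f}%")
```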

It has been found to have a 30% reduction in power and a 12% reduction in area when compared to the existing approximation transform of Bayer and Cintra [27]. The implementation that has been carried out in this work clearly shows that the architecture is best suited for real-time low power and high speed applications.

Competing interests
The authors declare that they have no competing interests.

Acknowledgement
The authors would like to thank S. Anith, who has completed his master's degree in VLSI design, and S. Mehanathan and P.S. Tulasiram, currently pursuing their master's degree in VLSI design, Department of Electronics and Communication Engineering, College of Engineering Guindy, Anna University, Chennai, India, for their contribution towards this work. The authors would also like to thank the associate editor and anonymous reviewers for their valuable comments, which significantly helped to improve this paper.

Received: 26 February 2014 Accepted: 20 November 2014
Published: 13 December 2014

References
1. N Ahmed, T Natarajan, KR Rao, Discrete cosine transform. IEEE Trans On Computers C-23(1), 90–93 (1974). 10.1109/T-C.1974.223784
2. RJ Clark, Relation between Karhunen-Loeve and cosine transform. Communications, Radar and Signal Processing 128(6), 359–360 (1981). doi: 10.1049/ip-f-1:19810061, (IET)
3. RJ Clark, Transform Coding of Images (Academic Press, London, UK, 1985)
4. WB Pennebaker, JL Mitchell, JPEG Still Image Data Compression Standard (Van Nostrand Reinhold, New York, NY, USA, 1992)
5. N Roma, L Sousa, Efficient hybrid DCT-domain algorithm for video spatial downscaling. EURASIP Journal on Advances in Signal Processing 2007, (2007). doi:10.1155/2007/57291
6. International Organisation for Standardisation, Generic coding of moving pictures and associated audio information - Part 2: video, ISO/IEC JTC1/SC29/WG11 - coding of moving picture and audio (ISO, 1994)
7. International Telecommunication Union, ITU-T Recommendation H.261 Version 1: Video Codec for Audiovisual Services at p x 64 kbits (ITU-T, Geneva, Switzerland, 1990)
8. International Telecommunication Union, ITU-T Recommendation H.263 Version 1: Video Coding for Low Bit Rate Communication (ITU-T, Geneva, Switzerland, 1995)
9. International Telecommunication Union, ITU-T Recommendation H.264 Version 1: Advanced Video Coding for Generic Audio-Visual Services (ITU-T, Geneva, Switzerland, 2003)
10. T Wiegand, GJ Sullivan, G Bjontegaard, A Luthra, Overview of the H.264/AVC video coding standard. IEEE Trans Circuits Systems for Video Tech 13(7), 560–576 (2003)
11. KR Rao, JJ Hwang, Techniques and Standards for Image, Video and Audio Coding (PrenticeHall, Upper Saddle River, NJ, USA, 1996)
12. T Chang, C Kung, C Jen, A simple processor core design for DCT/IDCT. IEEE Trans Circuits Syst for Video Technology 10, 439–447 (2000)
13. TD Tran, The BinDCT: fast multiplierless approximation of the DCT. IEEE Signal Processing Letter 7(6), 141–144 (2000)
14. M Lin, L Dung, P Weng, An ultra low power image compressor for capsule endoscope. BioMedical Engg Online 5(14), 1–8 (2006). 10.1186/1475-925X-5-14
15. A Puri, X Chen, A Luthra, Video coding using the H.264/MPEG-4 AVC compression standard. Signal Process Image Commun 19(9), 793–849 (2004). 10.1016/j.image.2004.06.003
16. WK Cham, Development of integer cosine transforms by the principle of dyadic symmetry. IEE Proceeding 136(4), 276–282 (1989)
17. TI Haweel, A new square wave transform based on the DCT. Signal Process 81, 2309–2319 (2001). 10.1016/S0165-1684(01)00106-2
18. K Lengwehasatit, A Ortega, Scalable variable complexity approximate forward DCT. IEEE Trans Circuits Syst Video Tech 14, 1236–1248 (2004). 10.1109/TCSVT.2004.835151
19. S Bouguezel, MO Ahmad, MNS Swamy, A multiplication-free transform for image compression, in The 2nd Int. Conf. Signals, Circuits and Systems, 2008, pp. 1–4
20. S Bouguezel, MO Ahmad, MNS Swamy, Low-complexity 8×8 transform for image compression. Electronics Lett 44, 1249–1250 (2008). doi: 10.1049/el:20082239
21. S Bouguezel, MO Ahmad, MNS Swamy, A fast 8x8 transform for image compression, in Proceeding of the 2009 Int. Conf. on Microelectronics (ICM) (Marrakech, 2009), pp. 74–77. doi: 10.1109/ICM.2009.5418584
22. S Bouguezel, MO Ahmad, MNS Swamy, A novel transform for image compression, in The 53rd IEEE Int. Midwest Symp. Circuits and Systems (MWSCAS), 2010, pp. 509–512
23. S Bouguezel, MO Ahmad, MNS Swamy, A low-complexity parametric transform for image compression, in Proceeding of the 2011 IEEE Int. Symp. Circuits and Systems (Rio de Janeiro, 2011), pp. 2145–2148
24. RJ Cintra, FM Bayer, A DCT approximation for image compression. IEEE Signal Proc Let 18(10), 579–582 (2011). doi: 10.1109/LSP.2011.2163394
25. N Brahimi, S Bouguezel, An efficient fast integer DCT transform for images
compression with 16 additions only. Paper presented at the 7th international
workshop on systems, signal processing and their applications (WOSSPA,
Tipaza, Algeria, 2011), pp. 71–74
26. RK Senapati, UC Pati, KK Mahapatra, A low complexity orthogonal 8 × 8
transform matrix for fast image compression, in Proceeding of the Annual
IEEE India Conference (INDICON) (Kolkata, India, 2010), pp. 1–4
27. FM Bayer, RJ Cintra, DCT-like transform for image compression requires 14
additions only. Electron Lett 48(15), 919–921 (2012). 10.1049/el.2012.1148
28. RJ Cintra, FM Bayer, VA Coutinho, S Kulasekera, A Madanayake, DCT-Like
Transform for Image and Video Compression Requires 10 Additions only, 2014.
http://arxiv.org/abs/1402.5979v1. Accessed 5 June 2014
29. D Vaithiyanathan, R Seshasayanan, Low power DCT architecture for image
compression, in Proceeding of the International Conference on Advanced
Computing and Communication Systems (ICACCS) (Coimbatore, Tamil Nadu,
India, 2013), pp. 1–6. doi: 10.1109/ICACCS.2013.6938745
30. D Vaithiyanathan, R Seshasayanan, S Anith, K Kunaraj, A low-complexity DCT
approximation for image compression with 14 additions o, in Proceeding of
the International Conference on Green Computing, Communication and
Conservation of Energy (ICGCE 2013), Chennai, Tamil Nadu, India, 2013,
pp. 303–307. doi: 10.1109/ICGCE.2013.6823450
31. K Saraswathy, D Vaithiyanathan, R Seshasayanan, A DCT approximation with
low complexity for image compression, in Proceeding of the International
conference on Communication and Signal Processing - ICCSP - 2013
(Melmaruvathur, Tamil Nadu, India, 2013), pp. 465–468. doi: 10.1109/
iccsp.2013.6577097
32. FM Bayer, RJ Cintra, A Madanayake, US Potluri, Multiplierless approximate
4-point DCT VLSI architecture for transform block coding. Electron Lett
49(24), 1532–1534 (2013). doi: 10.1109/TLA.2010.5688099
33. KA Wahid, M Martuza, M Das, C McCrosky, Efficient hardware implementation
of 8x8 integer cosine transforms for multiple video codecs. J Real-Time Image
Proc, 403–410 (2013). doi: 10.1007/s11554-011-0209-6
34. PK Meher, SY Park, BK Mohanty, KS Lim, C Yeo, Efficient integer DCT
architecture for HEVC. IEEE Trans on Circuits and Systems for Video
Technology 24(1), 168–178 (2014). 10.1109/TCSVT.2013.2276862
35. FM Bayer, RJ Cintra, A Edirisuriya, A Madanayake, A digital hardware fast
algorithm and FPGA-based prototype for a novel 16-point approximate DCT
for image compression applications. Meas Sci Technol 23, 1–10 (2013).
10.1088/0957-0233/23/11/114010
36. Virtex-7 FPGA data sheet (Xilinx, Inc., San Jose, CA, February 18, 2014).
http://www.xilinx.com/support/documentation/data_sheets/ds180_7Series_
Overview.pdf Accessed 9 June 2014
37. Cadence, Encounter User Guide Version 6.2.4 (Cadence Design Systems,
Inc, USA, 2008)
38. Y Arai, T Agui, M Nakajima, A fast DCT-SQ scheme for images. Trans IEICE
E-71(11), 1095–1097 (1988)
39. The USC-SIPI Image Database (Univ. Southern California, Signal and Image
Processing Inst., 2011). http://sipi.usc.edu/database/ Accessed 6 November
2013

doi:10.1186/1687-6180-2014-180
Cite this article as: Dhandapani and Ramachandran: Area and power
efficient DCT architecture for image compression. EURASIP Journal on
Advances in Signal Processing 2014, 2014:180.


