
Parameter Optimization for H.265/HEVC Encoder Using NSGA II

Chapter in Advances in Intelligent Systems and Computing · April 2017


DOI: 10.1007/978-981-10-3325-4_11


All content following this page was uploaded by Prashant Singh Rana on 07 November 2017.




Saurav Kumar, Satvik Gupta, Vishvender Singh, Mohit Khokhar and Prashant Singh Rana*.
Computer Science and Engineering Department, Thapar University, Punjab, India.
Email: [email protected]

Abstract High Efficiency Video Coding (H.265/HEVC) is the latest video coding standard proposed by the
Joint Collaborative Team on Video Coding (JCT-VC). Its encoder exposes quite a few parameters that must
be tuned to achieve efficient compression. If a single standard configuration file is used for all genres
of video, optimal quality may not be maintained across all encoded videos, because objects move at
different speeds in different videos. Therefore, the encoding parameters should be customized for each
video separately. The work proposed here uses NSGA II for multi-objective optimization in order to find
personalized encoding parameters that yield a higher Compression Ratio and Peak Signal-to-Noise Ratio
(PSNR). Experiments have been performed on six QCIF videos with resolution 176 × 144 and different
configuration files. The results demonstrate that the proposed technique gives enhanced video compression
quality. The test videos and code used in the research are available as a supplement
at http://bit.ly/HEVC-NSGA-II.
Keywords H.265/HEVC Parameter · NSGA II · H.265/HEVC Configuration File · Video Compression.

1 Introduction

H.265/HEVC is the latest video compression standard, set by the Joint Collaborative Team on Video Coding
(JCT-VC) in January 2013 [1] [2] [3]. It delivers roughly 50% better coding efficiency than H.264/AVC
[4]. The standard has been designed to address essentially all existing applications of H.264/MPEG-4
AVC, which was developed during 1999-2003 and extended in 2003-2009 [5]. The first edition of H.265/HEVC
was officially released in January 2013. The final aligned specification was approved by ITU-T as H.265
and by ISO/IEC as MPEG-H Part 2 [6]. The H.265 compression standard is used for a wide range of
applications such as broadcasting of High Definition TV signals over cable, internet and
satellite/terrestrial transmission systems, and video transmission over mobile networks, to name a few [5].
Compared with previous standards, it provides a better compression ratio. The encoder requires tuning of
several parameters, each of which varies over a fixed range of discrete values.
Parameter tuning is the key step in finding the best possible video compression.
For optimal encoding we seek the lowest possible bit rate without compromising subjective video
quality as perceived by humans. Thus the objective is to compress the video subject to constraints
on a benchmark subjective video quality. In the H.265/HEVC encoder, a single standard configuration file
is used for all videos, which is a simplistic assumption about the characteristics of the underlying data.
In contrast to this static model, an agile approach is required that gauges the diverse motion attributes
of a video and sets the encoding parameters in the configuration file accordingly. Efficient video encoding
can also provide a power-economical architecture, which is of key interest to battery-operated mobile
applications. Thus, by achieving high resolution with lower resource requirements, we can get the best of
both worlds [7].
A configuration file of an encoder should set the parameter values dynamically for each video sequence
in order to achieve high PSNR and low RMSE for the encoded sequence. This motivates us to find the
optimal encoder parameters for each test video sequence, which requires examining the encoder parameters
for each video separately. We propose an organized approach to solve this parameter optimization problem.
The algorithm can trade off encoding delay, compression rate, computational efficiency and robustness
while supporting competitive parallel processing techniques with elevated video resolution.
The underlying approach is to use NSGA II for multi-objective optimization and compute optimum
encoder parameters that provide a higher compression ratio and higher PSNR values.
This paper is organized as follows: Section 2 discusses the fundamentals of the H.265/HEVC encoder along
with its parameters and quantitative metrics. NSGA II is concisely explained in Section 3. Section 4
covers parameter optimization using NSGA II. Results are discussed in Section 5, and conclusions and
future directions are given in Section 6.

2 H.265/HEVC Encoder

2.1 Basics of H.265/HEVC Encoder

H.265/HEVC uses a complex video encoding process and is designed with several goals in mind, such as
mobile friendliness, resilience to data loss and a parallel processing architecture. A basic encoder block
diagram is shown in Fig. 1.

Fig. 1: H.265/HEVC Encoder Block diagram.

Hanhart et al. [8] set the group of pictures (GOP) size to 8; for videos with 24, 30, 50 and 60 frames
per second, the corresponding intra period was taken as 24, 32, 56 and 64, respectively.
Kim et al. [9] describe the HM reference encoder based upon the random access configuration. This improves
on the coding efficiency of the low delay configuration considered by Horowitz et al. [10]. The coding
order is set to 0, 8, 4, 2, 1, 3, 6, 5, 7 and the test conditions are selected as in [6]. Additional
encoder parameters are taken from the "CFG 16" configuration file presented by Correa et al. [6]. This file
was shown to be optimal from both the computational complexity and the coding efficiency points of view.
Table 1 shows the configuration parameters of the HM reference software encoder [9].

2.2 H.265/HEVC Encoder parameters

H.265/HEVC has a number of encoding parameters that affect video compression. These are the
quantization parameter factor, the coding unit's height and width, file I/O, source parameters, profile
and level parameters, coding structure parameters, motion estimation parameters, mode decision parameters,
slice coding parameters and de-blocking filter parameters. The parameters that most strongly influence
encoded video quality are listed in Table 1 along with their ranges [5].

Table 1: Configuration Parameters of H.265/HEVC.

SN   Parameter                                 Value
1    Encoder Version                           HM 15.0
2    Profile                                   Main
3    RD optimization                           Enabled
4    Motion Estimation                         TZ Search
5    Transform Skip                            Enabled
6    Intra Period                              1 Sec
7    Rate Control                              Disabled
8    Maximum coding unit Width                 {16, 32, 64}
9    Maximum coding unit Height                {16, 32, 64}
10   Quad tree TU Log2 min/max size            {2, 3, 4, 5}
11   Quantization parameter (QP)               [0-51]
12   Quantization parameter factor (QPf)       [0-1]
13   Search Range                              [2-6]
14   Maximum partition depth                   [1-4]

– Maximum and Minimum Coding Unit Width and Height: The prediction unit structure depends upon the
coding unit level. Luma and chroma coding blocks are further subdivided on the basis of the prediction
type decision. The core of the coding layer in H.265/HEVC is the coding tree unit (CTU). Its size is
selected by the encoder and can be larger than a traditional macroblock [5]. The coding tree unit consists
of a luma coding tree block (CTB) and the corresponding chroma CTBs, along with syntax elements. The size
L of the luma CTB may be taken as L = 16, 32 or 64 samples, so the maximum coding unit (CU) size is
L × L (Width × Height); larger luma CTBs generally enable better compression [4].
– Log2 of Max/Min Size of Quad Tree Transform Unit: The transform unit is represented as a quad tree with
the coding unit positioned at its root. The luma and chroma sample arrays that exist in a CU are called
coding blocks (CB). The subdivision of the chroma CTBs of a CTU is always aligned with that of the luma
CTB, as shown in Fig. 2 [1]. The luma CB residual can either be identical to a single luma Transform Block
(TB) or be split into smaller luma TBs; the same partitioning is then applied to the chroma TBs [5].
QuadTreeTULog2MaxSize is the base-2 logarithm of the maximum luma transform block size and
QuadTreeTULog2MinSize is the base-2 logarithm of the minimum luma transform block size. The latter is
generally set to 2 [1].

Fig. 2: Coding Tree Structure.

– Quantization Parameter (QP) and Quantization Parameter Factor (QPf): QP determines the quantization
of the transformed coefficients. Values of QP range from 0 to 51. Incrementing QP by 1 increases the
quantization step size by ≈ 12%, because the mapping from QP to step size is logarithmic (the step size
doubles for every increase of 6 in QP). A larger step size in turn reduces the bit rate. The range of QP
taken in the HM reference software is 20 to 51. The Quantization Parameter Factor is the weight assigned
during rate distortion optimization. A low value signifies high quality and a larger number of bits. Its
value typically ranges from 0.3 to 1 [1].
– Search Range: In motion vector signaling, a motion vector predictor is prepared using the Advanced
Motion Vector Prediction (AMVP) scheme. It is based upon a motion compensation scheme that exploits
both spatial and temporal motion vectors, and works by creating a candidate set of best predictors
from the PU neighbors [11]. To select a motion vector predictor (MVP), a merge scheme chooses from
a merge candidate set that contains one temporal MVP and four spatial MVPs. The Search Range is then
used for motion estimation and is defined around the predictor. The motion vector may take values
outside the search range. For QCIF video sequences, the maximum coding unit width/height takes values
in {8, 16, 32, 64}, i.e. up to 64. In the HM reference software, the search range is set to 8 [1].
– Maximum Partition Depth: It defines the depth of coding unit tree and ranges from 1 to 4. In HM
reference software, its value is assumed to be 4 [1].
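
The ≈ 12% step-size growth per QP increment mentioned above can be checked numerically. The sketch below uses the commonly quoted approximation Qstep ≈ 2^((QP − 4)/6); this formula and the function name `qstep` are assumptions introduced for illustration and are not taken from the paper or from the HM source code.

```python
def qstep(qp: int) -> float:
    """Approximate quantization step size for a QP in [0, 51].

    Assumption: the widely used approximation Qstep = 2 ** ((QP - 4) / 6),
    under which the step size doubles every 6 QP units.
    """
    if not 0 <= qp <= 51:
        raise ValueError("QP must lie in [0, 51]")
    return 2.0 ** ((qp - 4) / 6.0)


# Relative growth of the step size when QP goes from 32 to 33.
growth_per_qp = qstep(33) / qstep(32) - 1.0   # about 0.12, i.e. roughly 12%

# Growth factor over six QP units.
factor_per_6 = qstep(38) / qstep(32)          # about 2.0

print(f"growth per QP step: {growth_per_qp:.4f}")
print(f"factor per 6 QP steps: {factor_per_6:.4f}")
```

Under this approximation, moving from the reference QP of 32 towards 25 (as the optimizer does for the fast-motion sequences reported later) shrinks the step size by a factor of more than two, which is consistent with the larger encoded files observed there.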

2.3 Quality Assessment Metrics

The quality of digital video is ultimately judged by subjective perception as experienced by humans.
Subjective analysis is essential because PSNR may not correlate with viewers' ratings, so subjective
scores may be used as ground truth when developing new techniques [12]. Subjective and objective
measures are the two main ways to assess digital video quality.

2.4 Objective Metrics

Objective assessment provides numerical estimates of video quality through mathematical models.
Such calculation is essential because human perception is limited and varies among different viewers of
the same video. Moreover, objective metrics enable automation and consistent comparison of the output.
Several objective metrics have been proposed. Among the most common are Peak Signal-to-Noise Ratio
(PSNR) and Mean Squared Error (MSE) between the input and output video frames, as
described below:

1. PSNR: Peak signal-to-noise ratio (PSNR) is a measure of signal quality computed on a logarithmic
scale. It is defined in terms of the MSE between the original picture and the distorted picture. For video,
it is calculated per picture as a weighted sum of the PSNR of the individual components (Y-PSNR,
U-PSNR and V-PSNR) [13]. Generally, Y-PSNR (luma component) is used since it is closest to human
perception; it is defined by equation 1.

\[ \mathrm{PSNR}_{dB} = 10 \log_{10}\left(\frac{MAX_I^2}{MSE}\right) \qquad (1) \]

Here MAX_I represents the maximum possible pixel value of an image. For images represented using 8 bits
per sample, MAX_I = 255. For compressed video of acceptable perceptual quality, values
of PSNR typically lie between 30 dB and 50 dB.
2. RMSE: It is defined as the square root of the mean of the squared differences between the luminance
values of the source and reconstructed video frames:

\[ \mathrm{RMSE} = \sqrt{\frac{\sum_{x=0}^{X-1} \sum_{y=0}^{Y-1} \left(F_m(x, y) - f_m(x, y)\right)^2}{XY}} \qquad (2) \]

where F_m(x, y) and f_m(x, y) are the luminance (Y-component) values of the source and encoded pixel of
video frame m at point (x, y) in 2-dimensional space. X and Y are the resolution in pixels, which may
vary according to the size of the image (set to 256 for our research).
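
Equations 1 and 2 can be illustrated with a short, dependency-free sketch. It assumes each luma frame is given as a list of rows of 8-bit sample values; the function names `mse`, `rmse` and `psnr_db` and the tiny 2 × 2 frames are hypothetical and serve only as an example.

```python
import math


def mse(src, rec):
    """Mean squared error between two equally sized luma frames (lists of rows)."""
    height = len(src)
    width = len(src[0])
    total = 0.0
    for row_src, row_rec in zip(src, rec):
        for a, b in zip(row_src, row_rec):
            total += (a - b) ** 2
    return total / (width * height)


def rmse(src, rec):
    """Root mean squared error, following equation 2."""
    return math.sqrt(mse(src, rec))


def psnr_db(src, rec, max_i=255):
    """PSNR in dB, following equation 1; infinite for identical frames."""
    err = mse(src, rec)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_i ** 2 / err)


# Hypothetical 2 x 2 source and reconstructed luma frames.
source = [[100, 100], [100, 100]]
decoded = [[101, 99], [100, 102]]

print("MSE :", mse(source, decoded))
print("RMSE:", rmse(source, decoded))
print("PSNR:", psnr_db(source, decoded), "dB")
```

For a whole sequence, a per-frame Y-PSNR computed this way would typically be averaged over the encoded frames; that averaging step is an assumption about common practice rather than something the text specifies.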

2.5 Subjective Metrics

Subjective quality measurement is time consuming and costly, and requires human experts, sometimes with
special equipment. Nevertheless, it is crucial for quality assurance and user satisfaction. PSNR does not
account for the saturation effect of the human eye, so the underlying nature of artifacts is not fully
captured. Subjective quality can be assessed using two metrics, Mean Opinion Score (MOS) and Structural
Similarity Index, as described below:
Mean Opinion Score (MOS): It measures the human impression of quality on a scale from 1 to 5. This rating
is then correlated with PSNR values [12], computed on a frame-by-frame basis. The relation between PSNR
and MOS is given in Table 2 [12].
Structural Similarity Index: Human preferences in evaluating visual quality are measured by the
Structural Similarity Index (SSIM) [14]. It is based on the Human Visual System (HVS), which is well
suited to extracting structural information from an observed scene. Therefore, a measure of structural
similarity (or distortion) captures perceptual image quality well. In other words, SSIM attempts to
identify the distortions which have a negligible effect on image structure. Sampat et al. [14] have shown
that SSIM provides better prediction of image quality for a range of distortions associated with images.
The maximum value of the SSIM index is 1, reached when the original and encoded images are identical.

Table 2: Relationship between MOS and PSNR.

Scale   Quality     Impairment                      PSNR (dB)
5       Excellent   Imperceptible                   > 37
4       Good        Perceptible, but not annoying   31-37
3       Fair        Slightly annoying               25-31
2       Poor        Annoying                        20-25
1       Bad         Very annoying                   < 20
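
The PSNR-to-MOS relationship of Table 2 can be written as a small lookup. This is only a sketch: the table's ranges share their end points (for example 31 appears in both "31-37" and "25-31"), so assigning each shared boundary to the higher grade is an assumption made here, and the function name `mos_from_psnr` is hypothetical.

```python
def mos_from_psnr(psnr_db: float) -> int:
    """Map a PSNR value in dB to the MOS scale of Table 2.

    Assumption: a value lying exactly on a shared boundary (20, 25, 31)
    is assigned to the higher grade; 37 itself stays in the "Good" band
    because the top band is given as "> 37".
    """
    if psnr_db > 37:
        return 5  # Excellent
    if psnr_db >= 31:
        return 4  # Good
    if psnr_db >= 25:
        return 3  # Fair
    if psnr_db >= 20:
        return 2  # Poor
    return 1      # Bad


for value in (40.0, 35.0, 28.64, 22.0, 19.0):
    print(value, "->", mos_from_psnr(value))
```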
In this work PSNR, RMSE and MOS have been used to quantify video quality. The computation
of encoder parameters cannot be expressed as a single objective function to be optimized, so classical
single-objective optimization techniques are not appropriate for computing the optimum values of the
encoder parameters. An alternative modern approach to parameter estimation is soft computing. The
non-dominated sorting genetic algorithm II (NSGA II) has been used for this purpose; details
are given in Section 3.

3 Non-dominated sorting genetic algorithm II (NSGA II)

Non-dominated sorting genetic algorithm II (NSGA II) [15] searches for a set of non-dominated solutions
using a population of potential solutions, known as chromosomes. Each chromosome is represented by a
D-dimensional vector [t_i1, t_i2, ..., t_iD], where i = 1, 2, ..., NP and NP is the population size
(number of chromosomes). Each decision variable t_ij, for j = 1, 2, ..., D, in the i-th chromosome
represents the j-th threshold value. The population is initialized randomly and the corresponding
objective function values are calculated using equation 3 and equation 4. Before the selection, crossover
and mutation operators are applied to generate new offspring, the population is sorted using a fast
non-dominated sorting technique and crowding distance. Fast non-dominated sorting assigns a rank to each
individual with O(F · NP^2) computational complexity, where F is the number of objectives. Crowding
distance measures how close an individual is to its neighbors, i.e. the density of solutions around it.
It is calculated from the average distance to its immediate neighbors along the same front in each
dimension. Parents are selected from the population using binary tournament selection, which is based on
two factors: rank and crowding distance. From these parents, offspring are generated using the crossover
and mutation operators. This process is repeated until the termination condition, a fixed number of
generations, is satisfied. In this work, binary tournament selection [16], single point crossover [17] and
a single point mutation operator [17] are used for the NSGA II implementation.

Algorithm 1: Non-dominated sorting genetic algorithm II (NSGA II).

Input: Population size (NP) and number of generations.
Output: A set of best individuals known as the Pareto front.
1. Initialize the population PP of size NP randomly.
2. Calculate the objective values using equation 3 and equation 4.
3. Arrange the population using the non-dominated sorting technique and crowding distance.
4. Select the parents using binary tournament selection.
5. Apply the crossover and mutation operators on the selected parents.
6. Perform selection from the parents and their offspring.
7. Replace unfit individuals with fit ones to maintain a constant population size.
8. Repeat steps 2-7 until the termination condition is satisfied.
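
Step 3 of Algorithm 1 can be sketched as follows. This is a minimal illustration of fast non-dominated sorting and crowding distance in the spirit of [15], not the authors' Octave implementation. It assumes every objective is expressed as a quantity to minimize, so a (Y-PSNR, file size) pair is stored as (-Y-PSNR, file size); the four sample points are hypothetical.

```python
import math


def dominates(a, b):
    """True if a is no worse than b in all objectives and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def fast_non_dominated_sort(objs):
    """Return a list of fronts; each front is a list of indices into objs."""
    n = len(objs)
    dominated_set = [[] for _ in range(n)]  # solutions that p dominates
    dom_count = [0] * n                     # number of solutions dominating p
    fronts = [[]]
    for p in range(n):
        for q in range(n):
            if p == q:
                continue
            if dominates(objs[p], objs[q]):
                dominated_set[p].append(q)
            elif dominates(objs[q], objs[p]):
                dom_count[p] += 1
        if dom_count[p] == 0:
            fronts[0].append(p)             # rank 1 (non-dominated)
    i = 0
    while fronts[i]:
        next_front = []
        for p in fronts[i]:
            for q in dominated_set[p]:
                dom_count[q] -= 1
                if dom_count[q] == 0:
                    next_front.append(q)
        i += 1
        fronts.append(next_front)
    return fronts[:-1]                      # drop the trailing empty front


def crowding_distance(objs, front):
    """Crowding distance of each index in one front (boundary points get inf)."""
    dist = {i: 0.0 for i in front}
    for m in range(len(objs[front[0]])):
        ordered = sorted(front, key=lambda i: objs[i][m])
        low, high = objs[ordered[0]][m], objs[ordered[-1]][m]
        dist[ordered[0]] = dist[ordered[-1]] = math.inf
        if high == low:
            continue
        for k in range(1, len(ordered) - 1):
            gap = objs[ordered[k + 1]][m] - objs[ordered[k - 1]][m]
            dist[ordered[k]] += gap / (high - low)
    return dist


# Hypothetical (-Y-PSNR, file size in bytes) pairs for four candidates.
objectives = [(-36.0, 2500), (-35.0, 1900), (-34.0, 2600), (-35.5, 2600)]
fronts = fast_non_dominated_sort(objectives)
print(fronts)
print(crowding_distance(objectives, fronts[0]))
```

The pairwise comparison loop is what gives the O(F · NP^2) cost: each of the NP^2 ordered pairs is compared on F objectives.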

4 Parameter Optimization for H.265/HEVC using NSGA II

The H.265/HEVC encoder uses several parameters, some of which remain constant while others
vary within a specified range. The literature indicates that a total of six parameters affect
video quality and compression ratio. These parameters are described in Table 1 along with their ranges of
values. There are two objectives for video encoding: the first is to maximize the Y-PSNR (YPSNR(x)) and
the second is to minimize the file size (FileSize(x)) for a given set of parameters. The two objective
functions are defined as:

\[ \text{Objective Function 1} = \max \; \mathrm{YPSNR}(x) \qquad (3) \]

\[ \text{Objective Function 2} = \min \; \mathrm{FileSize}(x) \qquad (4) \]

The parameter set is represented as x = (x1, x2, x3, x4, x5, x6), where x1 → maximum coding unit
width, x2 → maximum coding unit height, x3 → quad tree TU log2 maximum size, x4 → quad tree
TU log2 minimum size, x5 → maximum partition depth and x6 → QP. The detailed steps for estimating
these parameters are given in Algorithm 1. The algorithm returns a set of best individuals known as the
Pareto-optimal front. The initial population is generated randomly; a sample individual is shown
below:
x1  x2  x3  x4  x5  x6
32  32  4   2   3   33
In this work, the target quality is set to 35 dB, as it falls within the 'good' quality range indicated
in Table 2.
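
A minimal sketch of how the six-parameter chromosome and the two objectives might be represented is given below. The value ranges follow Table 1; the class name `Chromosome`, the feasibility check (minimum TU size not exceeding the maximum) and the helper `to_minimization` are assumptions added for illustration. Evaluating Y-PSNR and file size requires running the encoder on each candidate, which is deliberately left out here.

```python
import random
from typing import NamedTuple


class Chromosome(NamedTuple):
    x1: int  # maximum coding unit width
    x2: int  # maximum coding unit height
    x3: int  # quad tree TU log2 maximum size
    x4: int  # quad tree TU log2 minimum size
    x5: int  # maximum partition depth
    x6: int  # QP


# Allowed values per gene, taken from Table 1.
RANGES = {
    "x1": [16, 32, 64],
    "x2": [16, 32, 64],
    "x3": [2, 3, 4, 5],
    "x4": [2, 3, 4, 5],
    "x5": [1, 2, 3, 4],
    "x6": list(range(0, 52)),
}


def random_chromosome(rng: random.Random) -> Chromosome:
    """Draw each gene uniformly from its allowed values."""
    return Chromosome(*(rng.choice(RANGES[name]) for name in Chromosome._fields))


def is_feasible(c: Chromosome) -> bool:
    """Assumed constraint: the minimum TU size must not exceed the maximum."""
    return c.x4 <= c.x3


def to_minimization(y_psnr: float, file_size: int):
    """Express equations 3 and 4 as a pair of quantities to minimize."""
    return (-y_psnr, float(file_size))


rng = random.Random(0)
population = [random_chromosome(rng) for _ in range(5)]
sample = Chromosome(32, 32, 4, 2, 3, 33)  # the sample individual shown above
print(population)
print(sample, is_feasible(sample))
```

Negating Y-PSNR is a common way to let a single "smaller is better" dominance test handle a mixed maximize/minimize problem.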

5 Results and Discussion

This section describes the environment setup, the NSGA II implementation and its parameter settings, the
test video sequence files, the reference parameters for the H.265/HEVC encoder and the results.

5.1 Environment setup

Simulations were carried out using the HM (version 15.0) reference software for the H.265/HEVC encoder,
with a GOP structure of IBBBBBBB and the Hierarchical-B structure enabled. Details of the machine
configuration and software used are listed in Table 3.

Table 3: Environment used for simulation.

Category                  Model/Configuration
Motherboard               DV-6000 AMD
Processor                 Intel® Core™ i5-M370, 3.2 GHz
Graphics Card             ATI Radeon Fire Pro V8800
RAM                       2 x 4 GB PC3-10700
HDD (Storage)             Western Digital 2 x 2 TB
Operating system          Linux (Ubuntu 14.10)
Video player              VLC/Media Player Classic 64 bit
HM Version                16.0
NSGA II Implementation    Octave 3.6.4

5.2 Parameter Setting for NSGA II

NSGA II is implemented in Octave (version 3.6.4, see Table 3), an open source software package licensed
under the GNU GPL. The parameters are set to Crossover Rate (CR) = 0.9 and Mutation Rate (MR) = 0.01,
with binary tournament selection [16], single point crossover [17] and a single point mutation
operator [17]. The population size is set to 50 and the maximum number of generations is set to 200.
The code is available in the supplement information.
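
The three operators named above can be sketched as follows, using CR = 0.9 and MR = 0.01. This is an illustrative Python version, not the authors' Octave code. Two points are assumptions: the mutation rate is applied once per child (changing at most one gene), and ties in rank are broken in favor of the larger crowding distance. All function names are hypothetical.

```python
import random

CR = 0.9    # crossover rate
MR = 0.01   # mutation rate


def binary_tournament(pop, rank, crowd, rng):
    """Pick two distinct individuals; prefer lower rank, then larger crowding distance."""
    a, b = rng.sample(range(len(pop)), 2)
    if rank[a] != rank[b]:
        winner = a if rank[a] < rank[b] else b
    else:
        winner = a if crowd[a] >= crowd[b] else b
    return pop[winner]


def single_point_crossover(p1, p2, rng, rate=CR):
    """Swap the tails of two parents after one random cut point."""
    if rng.random() >= rate or len(p1) < 2:
        return list(p1), list(p2)
    cut = rng.randrange(1, len(p1))
    return list(p1[:cut]) + list(p2[cut:]), list(p2[:cut]) + list(p1[cut:])


def single_point_mutation(child, choices, rng, rate=MR):
    """With probability rate, replace one randomly chosen gene by an allowed value."""
    child = list(child)
    if rng.random() < rate:
        j = rng.randrange(len(child))
        child[j] = rng.choice(choices[j])
    return child


# Allowed values per gene (x1..x6), taken from Table 1.
CHOICES = [[16, 32, 64], [16, 32, 64], [2, 3, 4, 5], [2, 3, 4, 5],
           [1, 2, 3, 4], list(range(0, 52))]

rng = random.Random(1)
parent_a = [64, 64, 5, 2, 4, 32]
parent_b = [32, 32, 4, 2, 3, 33]
child_a, child_b = single_point_crossover(parent_a, parent_b, rng)
child_a = single_point_mutation(child_a, CHOICES, rng)
print(child_a, child_b)
```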

5.3 Test Video Sequences

Videos with fewer than 10 frames per second are sometimes used for very low bit rate (< 64 kbps) video
communication. Typically, 10 to 20 frames per second are used for low bit rate video communication.
Standard television signals are generally sampled at 25 to 30 frames per second, while 50 to 60 frames per
second give smooth apparent motion [18]. The RGB format is used to capture and display color images,
while the YCbCr format is more efficient for compression. Here, Y represents pixel brightness (luminance),
and Cb and Cr are the chrominance components of the pixel. A standard video supports several sampling
patterns for the Y, Cb, Cr format; typical patterns are 4:4:4, 4:2:2 and 4:2:0. The numbers in the ratio
N1:N2:N3 represent relative sampling rates in the horizontal direction: N1 = number of Y samples in both
odd and even rows, N2 = number of Cb and Cr samples in odd rows, and N3 = number of Cb and Cr samples in
even rows. Cb = U and Cr = V represent the color components of the YUV color space. On the basis of
resolution, some commonly used 4:2:0 YUV formats are listed below according to the number of pixels [12]:

Video Format                                           Resolution
Sub Quarter Common Intermediate Format (SQCIF)         128×96 pixels
Quarter Common Intermediate Format (QCIF)              176×144 pixels
Common Intermediate Format (CIF)                       352×288 pixels
Source Input Format (SIF)                              352×240 pixels

In this work, six QCIF format videos with a resolution of 176×144 have been used; the
corresponding details are given in Table 4. The test videos contain objects moving at diverse speeds,
ranging from very fast and moderately paced to slow and sluggish. We recommend using
different configuration files for video encoding in order to get optimal results. All the test videos are
made available as supplement information to this paper.

Table 4: Sample Test Sequences [18][19].

Test Video            Resolution   Total no. of frames   No. of frames to be encoded
AkiyoQcif.yuv         176 × 144    75                    30
SuzieQcif.yuv         176 × 144    75                    30
ForemanQcif.yuv       176 × 144    75                    30
FootballQcif.yuv      176 × 144    75                    30
CoastguardQcif.yuv    176 × 144    75                    30
BusQcif.yuv           176 × 144    75                    30
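
For orientation, the raw size of the encoded portion of each test sequence can be estimated from the 4:2:0 sampling pattern. The sketch below assumes 8 bits per sample and defines the compression ratio as raw size divided by encoded size; the paper does not state its exact definition, so both the definition and the function names are assumptions.

```python
def yuv420_frame_bytes(width: int, height: int, bits_per_sample: int = 8) -> int:
    """Bytes in one 4:2:0 frame: full-resolution Y plus quarter-resolution Cb and Cr."""
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)
    return (luma + chroma) * bits_per_sample // 8


def compression_ratio(raw_bytes: int, encoded_bytes: int) -> float:
    """Assumed definition: uncompressed size divided by encoded size."""
    return raw_bytes / encoded_bytes


frames_encoded = 30
raw = frames_encoded * yuv420_frame_bytes(176, 144)
print("bytes per QCIF frame:", yuv420_frame_bytes(176, 144))
print("raw bytes for 30 frames:", raw)
print("ratio for a 2542-byte bitstream:", round(compression_ratio(raw, 2542), 2))
```

With these assumptions a 2542-byte bitstream gives a ratio of about 448.65, close to but not identical with the 448.755 reported for AkiyoQcif in Table 5, so the authors' exact accounting of the raw size may differ slightly.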

5.4 Results and Discussion

The above-mentioned videos were encoded with HM 15.0. Experiments were first carried out with the
encoder's default settings for the six parameters. Subsequently, the optimal parameter values were
computed using the proposed technique. The default reference parameters are given below:
x1  x2  x3  x4  x5  x6
64  64  5   2   4   32

Table 5 compares the performance of the encoded test video sequences using reference and
optimized parameters. With the reference parameters, the Y-PSNR of the test videos is inconsistent.
For BusQcif and FootballQcif, Y-PSNR drops to 28.98 dB and 28.64 dB respectively, which corresponds to
slightly annoying quality (Table 2). In both videos, most of the objects are moving. Therefore, motion
information takes up a large share of the encoded size and the quality is reduced. It is evident from
Table 5 that, with the optimized parameters, the PSNR value is maintained at approximately 35 dB. This
ensures that the quality of the encoded videos is good when the proposed technique is employed. For
AkiyoQcif and SuzieQcif, the reference parameters give a Y-PSNR of 36.477 dB and 34.419 dB respectively.
Thus, the reference model provides good quality for these videos, as does the proposed model. Most of the
frames in these videos contain slow movement. Therefore it is established that the proposed technique
behaves quite effectively for videos with a variety of motion activity. Fig. 3 shows the Pareto optimal
graphs generated for each test video. Here, the x-axis represents Y-PSNR (dB) and the y-axis represents
compression ratio. It is observed that for the video sequences SuzieQcif, ForemanQcif, FootballQcif,
CoastguardQcif and BusQcif, Y-PSNR is kept high with a reasonably good compression ratio when the videos
are encoded with the optimized parameters rather than the reference parameters.

Fig. 3: Pareto Optimal graph between encoded file size and Y-PSNR for test video sequences.
Panels: (a) AkiyoQcif (b) SuzieQcif (c) ForemanQcif (d) FootballQcif (e) CoastguardQcif (f) BusQcif

Fig. 4: Box plots for Y-PSNR and Compression Ratio for test video sequences.
Panels: (a) Box Plot for Y-PSNR (b) Box Plot for Compression Ratio

Table 5: Performance comparison of encoded test video sequences using optimized parameters and reference
parameters.

                  Encoding using Reference Parameters*   Encoding using Optimized Parameters
Test video        Y-PSNR    Encoded     Compression      Y-PSNR    Encoded     Compression     Optimized Parameters
                            File Size   Ratio                      File Size   Ratio           x1  x2  x3  x4  x5  x6
                            (bytes)                                (bytes)
AkiyoQcif         36.477    2542        448.755          35.0215   1935        589.527         64  64  4   2   4   34
SuzieQcif         34.4189   2382        478.898          35.1142   2521        452.493         64  64  4   2   4   32
ForemanQcif       32.7594   5966        191.206          34.9189   7279        156.716         64  64  3   2   3   30
FootballQcif      28.6437   33944       33.606           34.9989   92650       12.31           64  64  2   2   4   25
CoastguardQcif    31.3543   7709        147.974          35.2879   19764       57.717          64  64  3   2   4   26
BusQcif           28.9785   26342       43.304           34.8586   73454       15.529          64  64  4   2   3   25
* x1 = 64, x2 = 64, x3 = 5, x4 = 2, x5 = 4, x6 = 32

Table 6: Change in encoded test video sequences using optimized parameters and reference parameters.

Test video        Percentage change in Y-PSNR   Percentage change in compression ratio
AkiyoQcif         -3.99                         +31.37
SuzieQcif         +2.02                         -5.51
ForemanQcif       +6.60                         -18.03
FootballQcif      +22.18                        -63.36
CoastguardQcif    +12.54                        -60.99
BusQcif           +20.29                        -64.14
Table 6 shows the percentage change in PSNR and compression ratio after using the proposed algorithm.
For all test videos except AkiyoQcif, Y-PSNR increases by between about 2% and 22% while retaining a
reasonably good compression ratio. These results justify the need for video-specific encoding parameters.
To validate the performance of the proposed methodology, 30 simulations were carried out for each video
sequence. Fig. 4(a) and 4(b) present the box plots for Y-PSNR and compression ratio. For all the test
videos (Fig. 4(a)), Y-PSNR is maintained near 35 dB with an error of ±2. Fig. 4(b) further shows that
BusQcif and FootballQcif have an extremely low compression ratio because these videos contain fast-moving
objects. For AkiyoQcif and SuzieQcif, the compression ratio is very high because these videos contain
very slow-moving objects. These quantitative results substantiate the fact that different videos require
encoding parameters based upon their inherent characteristics.
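
The percentage changes in Table 6 follow from the Table 5 values as (optimized − reference) / reference × 100. The sketch below reproduces the AkiyoQcif and SuzieQcif rows; the function name `percent_change` is hypothetical, and the remaining rows can be checked the same way up to rounding in the last digit.

```python
def percent_change(reference: float, optimized: float) -> float:
    """Relative change of the optimized value with respect to the reference, in percent."""
    return 100.0 * (optimized - reference) / reference


# (reference, optimized) pairs taken from Table 5.
table5 = {
    "AkiyoQcif": {"y_psnr": (36.477, 35.0215), "ratio": (448.755, 589.527)},
    "SuzieQcif": {"y_psnr": (34.4189, 35.1142), "ratio": (478.898, 452.493)},
}

for name, row in table5.items():
    d_psnr = percent_change(*row["y_psnr"])
    d_ratio = percent_change(*row["ratio"])
    print(f"{name}: Y-PSNR {d_psnr:+.2f}%, compression ratio {d_ratio:+.2f}%")
```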

6 Conclusion

In this work, we have carried out parameter optimization of the H.265/HEVC encoder using the NSGA II
algorithm. It maintains a proper balance between Y-PSNR and compression ratio while encoding a video.
Two objectives are formulated for video encoding: the first is to maximize Y-PSNR and the second is to
minimize the encoded file size for the set of parameters. Six parameters are chosen for optimization.
These are maximum coding unit width, maximum coding unit height, quad tree TU Log2 min/max size,
quantization parameter (QP), quantization parameter factor (QPf) and search range. NSGA II is used
for multi-objective optimization to find the optimal encoder parameters, which provide a good-quality
compressed video file. For testing, six YUV video files with 176×144 resolution, each comprising
30 frames, have been used. The results show that different videos require distinct configuration
profiles for good quality and compression ratio. The tests covered videos ranging from extremely fast to
slowly moving objects, which demanded individual treatment for compression. To validate the performance
of the proposed methodology, 30 simulations were performed for each video sequence. It was found that
(i) Y-PSNR ≈ 35 dB with an error of ±2 for each video and (ii) Compression Ratio ∝ 1/(Object Movements),
i.e. a video with slow movements can be compressed to a greater ratio.
Lastly, the reference parameters cannot maintain Y-PSNR at a good level for all videos because the
videos contain objects moving at varying speeds. There is therefore a need to develop a new, self-reliant
approach that can implicitly optimize the H.265/HEVC video encoder parameters according to the motion in
the picture. Although NSGA II is time-consuming for video encoding, it can be implemented in parallel to
speed up the encoding process.

Supplement Information

The test videos and code used in the study are available as a supplement at http://bit.ly/HEVC-NSGA-II.

References

1. Vivienne Sze and Madhukar Budagavi. A comparison of CABAC throughput for HEVC/H.265 vs. AVC/H.264. In IEEE
Workshop on Signal Processing Systems (SiPS), pages 165–170, 2013.
2. J-P Henot, Michaël Ropert, Julien Le Tanou, Jean Kypréos, and Thomas Guionnet. High efficiency video coding (HEVC):
replacing or complementing existing compression standards? In IEEE Int. Symposium on Broadband Multimedia Systems
and Broadcasting, pages 1–6, 2013.
3. Dan Grois, Detlev Marpe, Amit Mulayoff, Benaya Itzhaky, and Ofer Hadar. Performance comparison of H.265/MPEG-
HEVC, VP9, and H.264/MPEG-AVC encoders. In PCS, volume 13, pages 8–11, 2013.
4. Tiesong Zhao, Zhou Wang, and Sam Kwong. Flexible mode selection and complexity allocation in high efficiency video
coding. IEEE Journal of Selected Topics in Signal Processing, 7(6):1135–1144, 2013.
5. Gary J Sullivan, Jens Ohm, Woo-Jin Han, and Thomas Wiegand. Overview of the high efficiency video coding (HEVC)
standard. IEEE Tran. on Circuits and Systems for Video Tech., 22(12):1649–1668, 2012.
6. Guilherme Correa, Pedro Assuncao, Luciano Agostini, and Luis A da Silva Cruz. Performance and computational
complexity assessment of high-efficiency video encoders. IEEE Transactions on Circuits and Systems for Video Technology,
22(12):1899–1909, 2012.
7. Gang He, Dajiang Zhou, Yunsong Li, Zhixiang Chen, Tianruo Zhang, and Satoshi Goto. High-throughput power-efficient
VLSI architecture of fractional motion estimation for ultra-HD HEVC video encoding. 2015.
8. Philippe Hanhart, Martin Rerabek, Francesca De Simone, and Touradj Ebrahimi. Subjective quality evaluation of the
upcoming HEVC video compression standard. In SPIE Optical Engineering + Applications, pages 84990–99. International
Society for Optics and Photonics, 2012.
9. Il-Koo Kim, Ken McCann, Kazuo Sugimoto, Benjamin Bross, and Woo-Jin Han. HM10: High efficiency video coding
(HEVC) test model 10 encoder description. Proc. Joint Collab. Team Video Coding, 2013.
10. Michael Horowitz, Faouzi Kossentini, Nader Mahdi, Shilin Xu, Hsan Guermazi, Hassene Tmar, Bin Li, Gary J Sullivan,
and Jizheng Xu. Informal subjective quality comparison of video compression performance of the HEVC and H.264/MPEG-4
AVC standards for low-delay applications. In SPIE Optical Engineering + Applications, pages 84990–84999. International
Society for Optics and Photonics, 2012.
11. Elie Gabriel Mora. Multiview video plus depth coding for new multimedia services. PhD thesis, Telecom ParisTech, 2014.
12. Jirka Klaue, Berthold Rathke, and Adam Wolisz. Evalvid–A framework for video transmission and quality evaluation.
In Comp. Perf. Eval. Modelling Techniques and Tools, pages 255–272. Springer, 2003.
13. Fan Zhang. Quality of Experience-driven Multi-Dimensional Video Adaptation. PhD thesis, Universität München, 2014.
14. Mehul P Sampat, Zhou Wang, Shalini Gupta, Alan Conrad Bovik, and Mia K Markey. Complex wavelet structural
similarity: A new image similarity index. IEEE Transactions on Image Processing, 18(11):2385–2401, 2009.
15. Kalyanmoy Deb, Amrit Pratap, Sameer Agarwal, and TAMT Meyarivan. A fast and elitist multiobjective genetic
algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2):182–197, 2002.
16. Brad L Miller and David E Goldberg. Genetic algorithms, selection schemes, and the varying effects of noise. Evolutionary
Computation, 4(2):113–131, 1996.
17. Soo-Yong Shin, In-Hee Lee, Dongmin Kim, and Byoung-Tak Zhang. Multiobjective evolutionary optimization of DNA
sequences for reliable DNA computing. IEEE Tran. on Evol. Comp., 9(2):143–158, 2005.
18. Kamisetty Ramamohan Rao, Do Nyeon Kim, and Jae Jeong Hwang. Video coding standards: AVS China, H.264/MPEG-4
Part 10, HEVC, VP6, DIRAC and VC-1. Springer Science & Business Media, 2013.
19. YUV data set. http://trace.eas.asu.edu/YUV/.
