
IEEE TRANSACTIONS ON BROADCASTING, VOL. 67, NO. 3, SEPTEMBER 2021

User-Priority Based AV1 Coding Tool Selection


Motong Xu, Student Member, IEEE, and Byeungwoo Jeon, Senior Member, IEEE

Abstract—AV1 is an open-source video coding technique developed by the Alliance for Open Media (AOMedia). The many powerful coding tools introduced into AV1 have significantly increased its encoding time. In this paper, we aim at simplifying the AV1 encoder by facilitating the selection of coding tools according to the user's specific application preference. We first design a coding tool OFF-test to evaluate the coding performance of the intra, inter, and in-loop filter coding tools in the AV1 encoder, and define a criterion for the importance of each coding tool by measuring the Bjøntegaard delta bit rate (BDBR) loss and the time saving (TS) when that tool is turned OFF. The importance of each coding tool is then analyzed based on the predefined criterion. Lastly, we suggest two intra coding tool selection methods, three inter coding tool selection methods, and three overall coding tool selection methods to simplify the AV1 encoder according to the user's preference for quality priority or low-complexity priority. Experimental results show that our proposed low-complexity-priority selection method saves 30.72% of encoding time with only 0.91% loss in the BDBR sense, and the quality-priority selection method saves 4.64% of encoding time with 0.02% BDBR loss.

Index Terms—AV1, coding tool evaluation, encoder simplification.

Manuscript received September 30, 2020; revised December 26, 2020 and March 13, 2021; accepted March 16, 2021. Date of publication April 14, 2021; date of current version September 3, 2021. This work was supported in part by the Basic Science Research Program through the National Research Foundation of Korea (NRF) MFI under Grant NRF-2020R1A2C2007673, and in part by the Samsung Electronics Company Ltd, System LSI Division. (Corresponding author: Byeungwoo Jeon.) The authors are with the Digital Media Laboratory, Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon 16410, South Korea (e-mail: [email protected]; [email protected]). Digital Object Identifier 10.1109/TBC.2021.3071013. This work is licensed under a Creative Commons Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/).

I. INTRODUCTION

AS DEMAND for high-quality video content by users continues to grow rapidly, the development of more powerful video compression techniques becomes of utmost importance. The High Efficiency Video Coding (MPEG HEVC/ITU-T H.265) technique [1], standardized in 2013, achieves approximately 50% bit-rate reduction while maintaining equivalent or even better visual quality compared to its predecessor, MPEG AVC/ITU-T H.264 [2]. In the meantime, the open-source coding techniques VP8 [3] and VP9 [4] were developed in 2008 and 2013, respectively. Recently, ultra-high definition (UHD) video content has become common in multimedia services, so more advanced and powerful video coding techniques are required to efficiently compress 4K/8K content as well as high frame rate video. In 2018, another open-source coding technique, AV1 [5], was developed by the Alliance for Open Media (AOMedia) with more enhanced coding tools on top of VP9. AV1 is reported to achieve nearly 30% bitrate reduction over the latest VP9 encoder [6]. An overview of its core coding tools is given in [6]. In addition, many comparative studies [7]–[17] have evaluated its performance against several existing video coding techniques, including HEVC, VP9, and Versatile Video Coding (VVC) [18], the most recent video coding standard, which achieves around 35% average bitrate saving for the same video quality over HEVC [19]. The authors in [7]–[17] investigated the compression efficiency and encoding time complexity of different coding techniques under multiple coding configurations, which together helped researchers better understand the benefits and drawbacks of each technique.

The more advanced and powerful coding tools implemented in AV1 greatly enhance its coding efficiency; however, they also demand more encoding time and computational power. According to the experiments in [7], AV1 provides around 21% Bjøntegaard delta bit rate (BDBR) [20] reduction over HEVC, but it also suffers from more than six times the encoding time complexity of the HEVC encoder. In practical applications, it is always desirable to design a fast encoder with only a small coding performance loss. In this context, many works [21]–[29] have focused on designing low-complexity encoders irrespective of the underlying coding technique. In particular, some representative works related to AV1 are as follows. The authors in [30] proposed a novel rate estimation model for each transform type in order to speed up the transform type selection process in AV1. A fast inter prediction algorithm using decision trees is proposed in [31] to select a reduced set of inter modes to subject to the rate-distortion (RD) test, so that the AV1 encoding time is reduced. AV1 intra mode decision is also accelerated in [32] with adaptive early termination. Moreover, a machine-learning-based fast AV1 encoder is studied in [33]. In [34] and [35], the authors evaluate the coding performance and time complexity of the intra coding tools adopted in the AV1 encoder, and one intra coding tool selection method is proposed in [35] to speed up the AV1 encoder. However, its coding tool evaluation is limited to intra coding tools only. Since users can have many different encoding preferences depending on their applications or usage scenarios, our target in this paper is to design a simplified AV1 encoder optimized for the user's preference by selecting only the necessary coding tools under a given user priority. For example, a user with a low-complexity-priority application in mind wants to keep the encoding complexity as low as possible even if the coding efficiency suffers a little; that is, the user prefers a fast video encoder even with a slight but acceptable sacrifice in quality. The other case is the quality-priority case, in which the encoding complexity is lowered but with emphasis on keeping the encoding quality intact as much as possible.
In this paper, we first design a coding tool OFF-test to evaluate the potential changes in coding performance caused by turning off each coding tool. For this, we turn off one particular coding tool under test and measure its impact on coding efficiency and time complexity in terms of BDBR loss and time saving (TS). Several coding tools from the intra, inter, and in-loop filter coding tools of the AV1 encoder are tested in this process. After the coding tool OFF-tests are finished for all coding tools under consideration, we further evaluate the importance of each coding tool based on our predefined criterion, which considers both BDBR loss and TS at the same time. Then, according to this evaluation, two intra coding tool selection methods, three inter coding tool selection methods, and three overall AV1 coding tool selection methods are suggested to simplify the AV1 encoder based on the user's target priority.

The rest of this paper is organized as follows. A brief overview of AV1 coding tools is given in Section II. Section III describes how we design the coding tool OFF-test and a criterion for evaluating the importance of each coding tool. Experimental results of coding performance as well as the coding tool selection addressing the user's different priorities are provided in Section IV. Lastly, Section V concludes this paper.

II. AV1 CODING TOOL OVERVIEW

The AV1 coding tools discussed in this paper are listed in Table I, which contains 8 intra coding tools, 12 inter coding tools, and 2 in-loop filter coding tools. In this section, we provide a brief overview of those coding tools.

TABLE I. AV1 Coding Tools Under Test.

A. Intra Coding Tools

An intra edge filter is applied to smooth and up-sample the reference samples used in the directional prediction process. Four filtering strengths (from 0 to 3) can be selected according to the width and height of a block and the prediction modes of the above and left adjacent blocks. No filtering takes place for filter strength 0, and as the filter strength increases, stronger averaging is applied to the reference sample values.

The extended directional prediction is designed on top of VP9's directional prediction. The number of directions is increased from 8 to 56, including 8 basic direction modes and 48 extended direction modes. The angle of an extended direction mode is derived as

pAngle = Angle_basic + angleDelta × AngleStep,   (1)

where Angle_basic denotes the degree of the basic mode (45, 67, 90, 113, 135, 157, 180, 203), angleDelta is an integer ranging from −3 to 3 that controls the angle of the extended directions, and AngleStep is the step size of the angle in degrees, which equals 3.
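As an illustration of Eq. (1), the sketch below enumerates the 56 directional prediction angles. It follows the description above rather than the aomenc source code, so the names are illustrative only.

```python
# Illustrative enumeration of AV1's 56 directional prediction angles per Eq. (1).
# The basic angles and the 3-degree step follow the text above; this is a sketch,
# not code taken from the AV1 reference implementation.

BASIC_ANGLES = [45, 67, 90, 113, 135, 157, 180, 203]  # degrees of the 8 basic modes
ANGLE_STEP = 3                                        # degrees per angleDelta step

def prediction_angles():
    """Return all pAngle values: 8 basic modes plus 48 extended modes."""
    angles = []
    for base in BASIC_ANGLES:
        for delta in range(-3, 4):   # angleDelta in [-3, 3]; delta = 0 gives the basic mode
            angles.append(base + delta * ANGLE_STEP)
    return angles

assert len(prediction_angles()) == 56
```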
In Paeth prediction, a base value is first calculated as the above reference sample plus the left reference sample minus the above-left reference sample. The reference sample closest to this base value is then copied directly as the prediction of the corresponding pixel.
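A minimal per-pixel sketch of this rule is shown below; it is illustrative only, since the reference encoder operates on whole blocks and handles boundary cases separately.

```python
def paeth_predict(left: int, above: int, above_left: int) -> int:
    """Predict one pixel from its left, above, and above-left reference samples."""
    base = above + left - above_left
    # Copy the reference sample whose value is closest to the base value.
    candidates = [left, above, above_left]
    return min(candidates, key=lambda ref: abs(base - ref))
```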
Fig. 1. Example of reference samples used for the smooth prediction.

The predicted pixel of the smooth prediction is calculated as a weighted sum of its corresponding reference sample values. Figure 1 shows the location of the reference samples for the three types of smooth prediction. In the case of SMOOTH_PRED, four reference samples are used for prediction, while only two reference samples are used for SMOOTH_V_PRED or SMOOTH_H_PRED.

Fig. 2. Seven neighboring reference samples (p[0] to p[6]) used in recursive intra prediction.

In the case of recursive intra prediction, the prediction is generated recursively on a 4×2 patch basis inside each block. Eight 7-tap filters are designed for the eight different locations in the patch, and the predicted value is calculated by filtering the seven neighboring reference samples (or the already predicted samples in the same block) with the corresponding filter taps. Figure 2 illustrates the 4×2 patch and its seven neighboring reference samples p[0] to p[6].

The chroma from luma prediction first generates a down-sampled luma block of the same size as its chroma block. The high-frequency components of the down-sampled luma block are added to the DC-predicted chroma block to form the final chroma from luma predictor [36]. The chroma from luma prediction can be expressed as

CfL(α) = α × L_AC + DC,   (2)

where α is a scaling parameter and L_AC is the AC contribution of the subsampled reconstructed luma block [36]. The α parameter is signaled in the bitstream to reduce decoder complexity.
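The sketch below restates Eq. (2) in code form, assuming the AC contribution is obtained by removing the mean of the subsampled luma block; rounding, clipping, and the actual signaling of α in the bitstream are omitted.

```python
import numpy as np

def cfl_predict(luma_subsampled: np.ndarray, dc_pred_chroma: np.ndarray,
                alpha: float) -> np.ndarray:
    """Sketch of CfL per Eq. (2): chroma = alpha * (luma AC contribution) + DC prediction."""
    # AC contribution: remove the mean (DC) of the subsampled reconstructed luma block.
    luma_ac = luma_subsampled - luma_subsampled.mean()
    return alpha * luma_ac + dc_pred_chroma
```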
In addition, two coding tools designed especially for screen content (SCC) videos, namely intra block copy and palette prediction, are also implemented in AV1; in this paper, we consider these two as intra coding tools as well. Intra block copy predicts a block in a similar way to motion compensation in inter prediction, but finds its reference block only in the already encoded area of the current picture. The color palette prediction is effective when a block can be represented by a small number of colors. Color values in the predictor are represented by color indices into the palette, which is generated from the colors of neighboring pixels weighted according to distance and occurrence frequency.

B. Inter Coding Tools

Fig. 3. Illustration of compound prediction modes in AV1.

AV1 adopts advanced compound prediction for inter coding, where the final compound predictor p_f is obtained from two predictors p_1 and p_2 as illustrated in Figure 3. Here m is a compound mask, which can be either a scalar value or a matrix depending on the compound mode used. In the case of distance-weighted compound prediction, m is a scalar value based on the expected output times of the reference pictures. The wedge compound uses 16 predefined wedge patterns, respectively for square and non-square blocks, as the mask m, and it can be applied to both inter-inter and inter-intra compound. The difference-weighted compound generates the mask m from the difference between the two predictors, so that the final prediction can emphasize either one predictor over the other, or the opposite. Moreover, the interintra compound prediction blends the inter predictor with an intra predicted block. Four intra modes (DC, vertical, horizontal, and smooth prediction) can be applied, and the mask m is defined differently for each intra prediction mode.
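The common structure of these compound modes can be summarized as a mask-weighted blend of the two predictors. The sketch below is an illustrative formulation only; the real AV1 blending uses fixed-point mask weights and rounding, and the scalar-or-matrix mask m corresponds to the modes described above.

```python
import numpy as np

def compound_blend(p1: np.ndarray, p2: np.ndarray, m) -> np.ndarray:
    """Blend two predictors with a mask m in [0, 1]; m may be a scalar
    (distance-weighted) or a per-pixel array (wedge, difference-weighted, interintra)."""
    m = np.asarray(m, dtype=np.float64)
    return m * p1 + (1.0 - m) * p2
```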
There are two types of warped motion compensation in AV1: global warped motion and local warped motion. Global warped motion compensates the motion of a whole frame caused by camera movement, while local warped motion usually handles object translation, rotation, zooming, or affine motion at the block level.

The overlapped block motion compensation (OBMC) uses both the current inter predicted block and the inter predicted blocks generated from the motion vectors of the above and left blocks. The three predictors are blended together to form the final OBMC predictor. In the OBMC blending process, only predictions from the first reference picture are used.

Lastly, AV1 also has dual interpolation filters to interpolate pixels for more accurate motion compensation. This allows the interpolation process to use different filter taps in the horizontal and vertical directions.

C. In-Loop Filter Coding Tools

Fig. 4. In-loop filter process employed in AV1.

After the deblocking process in AV1, two in-loop filter coding tools are successively applied at the post-processing stage, as shown in Figure 4. The constrained directional enhancement filter (CDEF) is applied first as a deringing filter that also preserves detailed information. Afterwards, a set of loop restoration filters can be applied after CDEF. The filtering is done on a loop-restoration-unit basis, where the unit size can be 64×64, 128×128, or 256×256. One of the two supported filters (a separable symmetric normalized Wiener filter or a dual self-guided filter) is then selected for each loop-restoration unit, and the necessary filter parameters of the selected filter are signaled to the decoder in the bitstream.

III. DESIGN OF CODING TOOL OFF-TEST

In this paper, we aim at accelerating the AV1 encoder according to the user's preference on how to make it faster, i.e., low-complexity priority or quality priority. This section describes the test environment and configuration we applied, our test design, and the performance evaluation methodology.

A. Test Settings and Configurations

The version of the AV1 encoder used for the tests is AOMedia Project AV1 Encoder 1.0.0-1634-g0a0368368, which can be accessed from [37].

TABLE II. AV1 Test Configurations.

The AV1 test configurations and corresponding values are listed in Table II.
AV1 allows users to choose either 1-pass or 2-pass encoding, where the first pass gathers useful statistics over the full sequence to be encoded, and the second pass does the actual encoding guided by the statistics from the first pass. AOM recommends the 2-pass encoding configuration so that the best coding performance can be achieved. An experiment was done with the class B, C, and D sequences suggested in the JVET common test conditions [38] under the random access configuration [38] to compare the performance of 1-pass and 2-pass encoding. The results show that 1-pass encoding saves only 15.31% of the encoding time but with as much as 4.78% BDBR loss compared to 2-pass encoding. Thus, the experiments in this work are all carried out under the 2-pass encoding configuration.

The parameter end-usage is set equal to "q", which fixes the quantization at a constant quality, and four constant quality levels (cq-level) of 32, 40, 48, and 56 are chosen to approximate the QP values suggested in the JVET common test conditions, following [7]. The parameter cpu-used controls the tradeoff between quality and encoding time. In our experiment, cpu-used is set equal to 0 so that the AV1 encoder achieves its best coding performance despite a longer encoding time; this is because we would like to evaluate the importance of each coding tool at its best performance.

In addition, it is noted that AV1 does not have the concept of a "group of pictures" as in HEVC or VVC. Instead, in order to achieve a hierarchical temporal coding structure, a frame called the golden frame is encoded with higher quality and used as a reference for multiple inter frames. The two parameters min-gf-interval and max-gf-interval control the size of the interval between two golden frames, which lets the AV1 encoder provide a temporal coding structure similar to HEVC. In AV1, a key frame is an intra frame that resets the decoding process; the distance between two key frames defines the intra period and is controlled by kf-min-dist and kf-max-dist. The researchers in [7] claimed that fixing the golden and key frame intervals does not significantly affect the coding performance, so we decide not to fix these values under the RA encoding configuration.
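For concreteness, the sketch below assembles an aomenc command line from the settings above. It is a hedged example: the option spellings follow the parameter names in Table II and common aomenc usage, but they should be checked against the exact encoder build [37] before use.

```python
# Illustrative aomenc invocation reflecting Table II (2-pass, constant-quality mode,
# cq-level in {32, 40, 48, 56}, slowest/best-quality speed preset). Flag spellings are
# assumed from the parameter names above and may differ across aomenc versions.
import subprocess

def encode(src_y4m: str, out_ivf: str, cq_level: int = 32) -> None:
    cmd = [
        "aomenc", src_y4m,
        "--passes=2",             # 2-pass encoding (first pass gathers statistics)
        "--end-usage=q",          # constant-quality rate control
        f"--cq-level={cq_level}", # one of 32, 40, 48, 56
        "--cpu-used=0",           # best quality, slowest preset
        "-o", out_ivf,
    ]
    subprocess.run(cmd, check=True)
```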
B. Test Design and Performance Evaluation

In order to clearly understand the performance of each individual coding tool, we design an AV1 coding tool OFF-test as follows (a sketch of this procedure is given after Eq. (3)):
(1) Disable the coding tool under study while keeping all other coding tools turned ON, and encode.
(2) Compare the coding performance loss (BDBR) and the complexity reduction (TS) caused by turning OFF the coding tool under study.
(3) Decide whether we should suggest keeping this coding tool ON or turning it OFF.
(4) Repeat steps (1) to (3) for each coding tool.
(5) After finishing the OFF-tests of all the coding tools listed in Table I, suggest several user-priority based coding tool selection methods.

In order to evaluate the encoder performance change while a single coding tool is turned OFF, BDBR and encoding time saving (TS) are measured after each OFF-test. TS is calculated as

TS = (T_anchor − T_test) / T_anchor × 100,   (3)

where T_anchor denotes the encoding time of the anchor, which is the AV1 encoder with all coding tools in Table I turned ON, and T_test denotes the encoding time when only the tool under test is turned OFF.
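The OFF-test loop can be automated as sketched below. The per-tool "--enable-...=0" switches are assumptions based on current aomenc option naming (the paper does not list the exact switches used), and the BDBR computation [20] against the anchor is delegated to a caller-supplied routine.

```python
import subprocess
import time
from typing import Callable

# Hypothetical mapping from coding tools in Table I to encoder disable switches.
# Switch names are assumed from current aomenc conventions; verify them against
# the encoder build under test [37].
TOOL_SWITCHES = {
    "palette": "--enable-palette=0",
    "intra_block_copy": "--enable-intrabc=0",
    "cdef": "--enable-cdef=0",
    "loop_restoration": "--enable-restoration=0",
    # ... one entry per coding tool under test
}

def off_test(src: str, base_args: list, anchor_time: float,
             bdbr_fn: Callable[[str], float]) -> dict:
    """Encode once per tool with only that tool disabled; report TS per Eq. (3) and BDBR."""
    results = {}
    for tool, switch in TOOL_SWITCHES.items():
        out = f"{tool}_off.ivf"
        start = time.time()
        subprocess.run(["aomenc", src, *base_args, switch, "-o", out], check=True)
        t_test = time.time() - start
        ts = (anchor_time - t_test) / anchor_time * 100.0   # Eq. (3)
        results[tool] = {"TS": ts, "BDBR": bdbr_fn(out)}    # BDBR vs. the all-tools-ON anchor
    return results
```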
Moreover, a BDBR loss and TS comparison scheme is designed to check, at the same time, the BDBR increase and the time saving caused by turning OFF each coding tool, which will later help to decide whether a coding tool should be kept ON or turned OFF.

Fig. 5. BDBR-TS decision boundaries of coding tool selection.

TABLE III. Explanation and Suggestion for the BDBR-TS Decision Boundaries.

Figure 5 shows an example of BDBR-TS decision boundaries for coding tool selection. Note that the decision boundaries can be flexibly changed according to the user's encoding choice. The horizontal axis refers to BDBR, and the vertical axis refers to TS. A detailed explanation and the corresponding suggestion for each region are given in Table III. When a coding tool's performance lies outside either of the two limitation borderlines (that is, in the hatched area, region D), too much BDBR loss or too little TS is expected by turning OFF this coding tool; those coding tools are therefore suggested to always be used in the encoder. A coding tool inside region A incurs little BDBR loss but a large TS when it is turned OFF, so the tools in region A can be deselected to save encoding time. In contrast, region B indicates that turning its coding tools OFF causes a much larger BDBR loss than those in region A but with only similar or even less TS, so the coding tools in region B should stay ON. Region C is a gray area between regions A and B, so we consider the coding tools in region C as optional tools which can be either selected or deselected depending on the user's preference: when the user prefers high compression quality, the coding tools in region C can be selected; conversely, they can be turned OFF if the user prefers low complexity with a tolerable coding loss.
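A small classifier following this scheme is sketched below. The boundary values are placeholders, since Figure 5 and Table III define them graphically and the paper notes that they can be changed to suit the user's encoding choice.

```python
def classify_tool(bdbr_loss: float, time_saving: float,
                  bdbr_limit: float = 1.0, ts_limit: float = 1.0,
                  slope_a: float = 10.0, slope_b: float = 2.0) -> str:
    """Assign an OFF-test result to region A, B, C, or D of the BDBR-TS plane.
    All boundary parameters are illustrative placeholders, not values from the paper."""
    if bdbr_loss > bdbr_limit or time_saving < ts_limit:
        return "D"  # too much loss or too little saving: always keep the tool ON
    if time_saving >= slope_a * bdbr_loss:
        return "A"  # large TS for little BDBR loss: safe to turn OFF
    if time_saving <= slope_b * bdbr_loss:
        return "B"  # poor trade-off: better to keep ON
    return "C"      # gray area: optional, decided by user priority
```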
IV. AV1 CODING TOOL EVALUATION

In total, 26 sequences suggested in the JVET common test conditions [38] are used for our experiments. Intra and in-loop filter coding tools are tested under both the all intra (AI) and random access (RA) configurations, while inter coding tools are tested only under the RA configuration. Note that sequences from class E are not tested under the random access configuration, following [38]. To save running time, only the first one second of each test sequence is encoded, which is shown to be a valid simplification in [7].

A. Intra Coding Tool OFF-Test Results

TABLE IV. Summary of BDBR (%) and TS (%) of Each Intra Coding Tool Under OFF-Test.

The experimental results of the intra coding tools under the AI and RA encoding configurations are summarized in Table IV; detailed per-class results can be found in [35]. We observe that an average of 5.62% and 1.76% BDBR loss is caused by turning OFF the extended directional prediction under the AI and RA configurations, respectively. Similarly, an AV1 encoder without the chroma from luma prediction suffers as much as 21.67% (Cb channel in class F) and 14.37% (Cr channel in class A1) coding performance loss in the chrominance channels [35] for the AI and RA configurations, respectively, while only very little encoding time is saved. Paeth prediction shows very limited coding performance improvement except for screen content videos (class F). In addition, our experimental results show that when intra block copy and color palette prediction are not applied, only sequences from class F suffer significant BDBR loss, indicating that intra block copy and color palette prediction have almost no effect on natural content videos.

Note that some coding tools (such as the intra edge filter and the chroma from luma prediction) provide high coding efficiency while consuming little time complexity. Turning OFF such coding tools may result in larger residual blocks after prediction, which consequently increases the processing time at the later quantization and entropy coding stages; therefore, the total encoding time can even increase when such coding tools are turned OFF.

Fig. 6. BDBR vs. TS comparison of intra prediction tools under AI configuration.

Fig. 7. BDBR vs. TS comparison of intra prediction tools under RA configuration.

Figures 6 and 7 show the BDBR and TS comparison of all the intra coding tools under the AI and RA configurations, respectively.
The extended directional prediction and the chroma from luma prediction bring a large benefit in BDBR performance, so it is desirable to always keep these two coding tools ON. However, Paeth prediction, intra block copy, and color palette prediction are suggested to be turned OFF for natural videos, since the corresponding BDBR loss caused by turning them OFF is negligible. In addition, we suggest that the smooth prediction and the recursive intra prediction remain optional, depending on the user's choice, because their OFF-test results fall into the gray region C.
TABLE V. Suggested Intra Coding Tool Selection.

TABLE VI. BDBR (%) and TS (%) of the Suggested Intra Coding Tool Selection.

Two intra coding tool selection methods are suggested, one for low-complexity-priority usage and one for quality-priority usage. Table V shows our suggested intra coding tool selection methods, where "X" indicates that the corresponding coding tool is suggested to be deselected and "O" indicates that it is suggested to be turned ON. The corresponding intra coding tool selection results are shown in Table VI. The quality-priority selection method saves 5.21% and 2.07% of encoding time with 0.16% and 0.01% BDBR loss under the AI and RA configurations, respectively. The low-complexity-priority selection method saves 23.89% and 6.07% of encoding time with BDBR losses of 2.81% and 1.01% under the AI and RA configurations, respectively. We observe that under RA, even the low-complexity-priority selection achieves less than 10% encoding TS while sacrificing around 1% of coding performance in the BDBR sense. Therefore, in the case of the RA configuration, turning OFF intra prediction tools to save encoding time does not make much sense, since the prediction performance of the key frames (which are all intra coded) heavily influences the subsequent inter coded pictures.

B. Inter Coding Tool OFF-Test Results

TABLE VII. BDBR (%) and TS (%) of Each Inter Coding Tool Under OFF-Test (RA).

The experimental results of the inter coding tools under the RA encoding configuration are given in Table VII. Note that the masked compound controls four compound predictions at the same time: the difference-weighted compound, the interinter wedge compound, the interintra wedge compound, and the interintra compound. This means that when the masked compound is turned OFF, those four coding tools are all turned OFF together. Masked compound, interintra compound, warped motion, and OBMC cause BDBR losses of 0.41%, 0.39%, 0.36%, and 0.60%, respectively, when they are deselected; thus, these coding tools are important for maintaining the compression performance. In contrast, when the distance-weighted compound and the dual filter are turned OFF, the BDBR changes are only 0.02% and −0.03%, which motivates us to consider switching them off if a user prefers a simpler encoder. In addition, turning OFF the smooth interintra mode does not provide any time saving but suffers 0.19% loss in the BDBR sense, so the smooth interintra is suggested to be kept ON. Global motion, interintra wedge compound, difference-weighted compound, and one-sided compound are considered optional coding tools that can be either ON or OFF according to the user's priorities.
Note that the interintra wedge compound prediction does not tend to influence the encoding time complexity for SCC sequences, so it is also suggested to be kept ON for SCC videos.

Fig. 8. BDBR vs. TS comparison of inter prediction tools under RA configuration.

Figure 8 shows the BDBR and TS comparison of all the inter coding tools under the RA encoding configuration. Masked compound, interintra compound, warped motion, and OBMC are suggested to be kept ON in order to ensure the basic inter prediction performance, since they provide a relatively large benefit to BDBR performance. In contrast, the distance-weighted compound and the dual filter have very limited effect on coding performance, so they are suggested to be turned OFF. The other inter coding tools (interinter wedge compound, interintra wedge compound, difference-weighted compound, one-sided compound, and global motion) are optional depending on the user's priority.

TABLE VIII. Suggested Inter Coding Tool Selection.

TABLE IX. BDBR (%) and TS (%) of the Suggested Inter Coding Tool Selection.

Three inter coding tool selection methods are suggested, respectively, for low-complexity-priority usage, moderate quality-complexity usage, and quality-priority usage. Table VIII shows our suggested inter coding tool selection methods, and the corresponding test results of the three different user-priority based applications are shown in Table IX. The low-complexity-priority application saves around 30% of encoding time with only 0.87% coding efficiency loss in the BDBR sense, the moderate quality-complexity selection saves approximately 20% of encoding time and loses less than 0.5% in BDBR, and the quality-priority application saves 4.64% of encoding time with almost no coding performance loss.

C. In-Loop Filter Coding Tool OFF-Test Results

TABLE X. BDBR (%) and TS (%) of Each In-Loop Filter Coding Tool Under OFF-Test.

The OFF-test experimental results of the in-loop filter coding tools under both the AI and RA encoding configurations are listed in Table X. From the results we can observe that the in-loop filter coding tools are very important to the coding performance: turning OFF either CDEF or the restoration filter causes more than approximately 0.8% BDBR loss but saves very little time complexity, especially under the RA encoding configuration. Thus, both CDEF and the restoration filter are suggested to be kept ON in the encoding process for all user-priority selections.

D. User-Priority Based AV1 Coding Tool Selection

TABLE XI. Suggested AV1 Coding Tool Selection.

TABLE XII. BDBR (%) and TS (%) of the Suggested AV1 Coding Tool Selection.

With comprehensive consideration of the above experimental results, we suggest three AV1 coding tool selection methods, respectively, for three different usages: low-complexity-priority (lowest complexity), quality-complexity-priority (moderate quality and complexity), and quality-priority (best quality). The selection methods are described in Table XI, and the test results of each suggested selection method are given in Table XII. In the case of the quality-priority selection, we suggest turning ON all the intra coding tools to ensure the encoding performance of both the key frames and the frames that follow them. The low-complexity-priority selection achieves 30.72% encoding time saving with less than 1% BDBR loss, the moderate quality-complexity-priority application suffers 0.46% BDBR loss while saving 21.37% of encoding time, and the quality-priority application saves 4.64% of encoding time while sacrificing 0.02% in BDBR performance.
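As an illustration only, the sketch below expresses such priority-dependent tool selections as sets of encoder disable switches. Table XI is not reproduced here, so these sets are merely indicative, assembled from the per-tool discussion in Sections IV-A to IV-C, and the switch names are assumptions based on current aomenc option naming.

```python
# Indicative user-priority presets (NOT the exact contents of Table XI).
# Each entry lists tools to disable; everything not listed stays ON.
PRESETS = {
    # Quality priority: keep all intra and in-loop tools ON; drop only the
    # inter tools found above to have negligible BDBR impact.
    "quality": ["--enable-dist-wtd-comp=0", "--enable-dual-filter=0"],
    # Moderate quality-complexity priority: additionally drop the "optional"
    # (region C) intra tools for natural content.
    "quality_complexity": ["--enable-dist-wtd-comp=0", "--enable-dual-filter=0",
                           "--enable-smooth-intra=0", "--enable-filter-intra=0"],
    # Low-complexity priority: also drop the SCC-oriented intra tools and
    # Paeth prediction, which barely affect natural video.
    "low_complexity": ["--enable-dist-wtd-comp=0", "--enable-dual-filter=0",
                       "--enable-smooth-intra=0", "--enable-filter-intra=0",
                       "--enable-paeth-intra=0", "--enable-palette=0",
                       "--enable-intrabc=0"],
}

def args_for(priority: str) -> list:
    """Return the extra aomenc arguments for the chosen user priority."""
    return PRESETS[priority]
```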
V. CONCLUSION

This paper first provides a brief introduction to the intra, inter, and in-loop filter coding tools in AV1. We then design an AV1 coding tool OFF-test to evaluate the importance and time complexity of each coding tool in AV1. According to the experimental results, several coding tool selection methods are suggested. For the AI encoding configuration, two coding tool selection methods are suggested based on different users' priorities. For the RA configuration, intra coding tools are not turned OFF in the simplified AV1 encoder, since intra prediction is essential for maintaining the encoding performance of the whole sequence; instead, three inter coding tool selection methods are suggested according to different users' priorities. The in-loop filter coding tools are suggested to be used in all cases because they are not time-consuming but are important to coding performance, especially under the RA configuration. To conclude, three overall AV1 coding tool selection methods are suggested to simplify the AV1 encoder for three different usages: low-complexity-priority, quality-complexity-priority, and quality-priority.

REFERENCES

[1] G. Sullivan, J.-R. Ohm, W.-J. Han, and T. Wiegand, "Overview of the high efficiency video coding (HEVC) standard," IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1649–1668, Dec. 2012.
[2] T. Wiegand, G. Sullivan, G. Bjøntegaard, and A. Luthra, "Overview of the H.264/AVC video coding standard," IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 560–576, Jul. 2003.
[3] J. Bankoski, P. Wilkins, and Y. Xu, "Technical overview of VP8, an open source video codec for the Web," in Proc. IEEE Int. Conf. Multimedia Expo, Dec. 2011, pp. 1–6.
[4] D. Mukherjee et al., "The latest open-source video codec VP9—An overview and preliminary results," in Proc. Picture Coding Symp. (PCS), Dec. 2013, pp. 390–393.
[5] Alliance for Open Media. Accessed: Jan. 2019. [Online]. Available: http://aomedia.org
[6] Y. Chen et al., "An overview of core coding tools in the AV1 video codec," in Proc. Picture Coding Symp. (PCS), Jun. 2018, pp. 41–45.
[7] Y. Chen, E. François, F. Galpin, R. Jullian, and M. Kerdranvat, "Comparative study of video coding solutions VVC, AV1, EVC versus HEVC," Doc. JVET-N0605, 14th Meeting of the Joint Video Experts Team (JVET), ITU, Mar. 2019.
[8] D. Grois, T. Nguyen, and D. Marpe, "Coding efficiency comparison of AV1/VP9, H.265/MPEG-HEVC, and H.264/MPEG-AVC encoders," in Proc. Picture Coding Symp. (PCS), Dec. 2016, pp. 1–5.
[9] D. Grois, T. Nguyen, and D. Marpe, "Performance comparison of AV1, JEM, VP9, and HEVC encoders," in Proc. Appl. Digit. Image Process. XL, vol. 10396, Feb. 2018, Art. no. 103960L.
[10] M. Layek et al., "Performance analysis of H.264, H.265, VP9 and AV1 video encoders," in Proc. Asia–Pac. Netw. Oper. Manag. Symp., Sep. 2017, pp. 322–325.
[11] T. Nguyen and D. Marpe, "Future video coding technologies: A performance evaluation of AV1, JEM, VP9, and HM," in Proc. Picture Coding Symp. (PCS), Jun. 2018, pp. 31–35.
[12] P. Akyazi and T. Ebrahimi, "Comparison of compression efficiency between HEVC/H.265, VP9 and AV1 based on subjective quality assessments," in Proc. 10th Int. Conf. Qual. Multimedia Exp., May 2018, pp. 1–6.
[13] T. Laude, Y. G. Adhisantoso, J. Voges, M. Munderloh, and J. Ostermann, "A comparison of JEM and AV1 with HEVC: Coding tools, coding efficiency and complexity," in Proc. Picture Coding Symp. (PCS), Jun. 2018, pp. 36–40.
[14] F. Zhang, A. V. Katsenou, M. Afonso, G. Dimitrov, and D. R. Bull, "Comparing VVC, HEVC and AV1 using objective and subjective assessments," Mar. 2020. [Online]. Available: arXiv:2003.10282.
[15] G. Esakki, A. Panayides, S. Teeparthi, and M. Pattichis, "A comparative performance evaluation of VP9, x265, SVT-AV1, VVC codecs leveraging the VMAF perceptual quality metric," in Proc. SPIE Appl. Digit. Image Process. XLIII, Aug. 2020, Art. no. 1151010.
[16] X. Zhao, S. Liu, L. Zhao, X. Xu, B. Zhu, and X. Li, "A comparative study of HEVC, VVC, VP9, AV1 and AVS3 video codecs," in Proc. SPIE Appl. Digit. Image Process. XLIII, Aug. 2020, Art. no. 1151011.
[17] Y. Park, H. Lee, and B. Jeon, "Performance analysis of open Web video codec VP8," IEIE Trans. Smart Process. Comput., vol. 2, no. 2, pp. 86–96, Apr. 2013.
[18] J. Chen, Y. Ye, and S. Kim, "Algorithm description for versatile video coding and test model 10 (VTM 10)," Doc. JVET-S2002-v1, 19th Meeting of the Joint Video Experts Team (JVET), ITU, Jun. 2020.
[19] F. Bossen, X. Li, and K. Suehring, "AHG report: Test model software development (AHG3)," Doc. JVET-S0003, 14th Meeting of the Joint Video Experts Team (JVET), ITU, Jul. 2020.
[20] G. Bjøntegaard, "Calculation of average PSNR differences between RD-curves," Document VCEG-M33, ITU-T SG16 Q.6, Geneva, Switzerland, Apr. 2001.
[21] C. Grecos and M. Y. Yang, "Fast inter mode prediction for P slices in the H264 video coding standard," IEEE Trans. Broadcast., vol. 51, no. 2, pp. 256–263, Jun. 2005.
[22] Y. Zhang, S. Kwong, G. Jiang, X. Wang, and M. Yu, "Statistical early termination model for fast mode decision and reference frame selection in multiview video coding," IEEE Trans. Broadcast., vol. 58, no. 1, pp. 10–23, Mar. 2012.
[23] K. Won and B. Jeon, "Complexity-efficient rate estimation for mode decision of the HEVC encoder," IEEE Trans. Broadcast., vol. 61, no. 3, pp. 425–435, Sep. 2015.
[24] H. L. Tan, C. C. Ko, and S. Rahardja, "Fast coding quad-tree decisions using prediction residuals statistics for high efficiency video coding (HEVC)," IEEE Trans. Broadcast., vol. 62, no. 1, pp. 128–133, Mar. 2016.
[25] Z. Pan, J. Lei, Y. Zhang, X. Sun, and S. Kwong, "Fast motion estimation based on content property for low-complexity H.265/HEVC encoder," IEEE Trans. Broadcast., vol. 62, no. 3, pp. 675–684, Sep. 2016.
[26] K. Tai, M. Hsieh, M. Chen, C. Chen, and C. Yeh, "A fast HEVC encoding method using depth information of collocated CUs and RD cost characteristics of PU modes," IEEE Trans. Broadcast., vol. 63, no. 4, pp. 680–692, Dec. 2017.
[27] M. Jamali and S. Coulombe, "Fast HEVC intra mode decision based on RDO cost prediction," IEEE Trans. Broadcast., vol. 65, no. 1, pp. 109–122, Mar. 2019.
[28] M. Xu, T. N. Canh, and B. Jeon, "Simplified level estimation for rate-distortion optimized quantization of HEVC," IEEE Trans. Broadcast., vol. 66, no. 1, pp. 88–99, Mar. 2020.
[29] J. Jeong, S. Kim, and Y. Kim, "Fast HEVC intra coding by predicting the rate-distortion cost for a low-complexity encoder," IEIE Trans. Smart Process. Comput., vol. 7, no. 3, pp. 210–220, Aug. 2018.
[30] B. Li, J. Han, and Y. Xu, "Fast transform type selection using conditional Laplacian distribution based rate estimation," in Proc. SPIE Appl. Digit. Image Process. XLIII, Aug. 2020, Art. no. 115101Z.
[31] J. Kim, S. Blasi, A. S. Dias, M. Mrak, and E. Izquierdo, "Fast inter-prediction based on decision trees for AV1 encoding," in Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP), 2019, pp. 1627–1631.
[32] J. Jeong, G. Gankhuyag, and Y.-H. Kim, "A fast intra mode decision based on accuracy of rate distortion model for AV1 intra encoding," in Proc. 34th Int. Tech. Conf. Circuits/Syst. Comput. Commun. (ITC-CSCC), 2019, pp. 1–3.
[33] H. Su, C.-Y. Tsai, Y. Wang, and Y. Xu, "Machine learning accelerated partition search for video encoding," in Proc. IEEE Int. Conf. Image Process. (ICIP), 2019, pp. 2661–2665.
[34] T. N. Canh, M. Xu, and B. Jeon, "Performance evaluation of AV1 intra coding tools," in Proc. IEEE Symp. Broadband Multimedia Syst. Broadcast. (BMSB), Jun. 2019, pp. 1–4.
[35] M. Xu and B. Jeon, "Selection of intra prediction tools for fast AV1 encoding," in Proc. IEEE Symp. Broadband Multimedia Syst. Broadcast. (BMSB), Jun. 2020, pp. 1–4.
[36] L. N. Trudeau, N. E. Egge, and D. Barr, "Predicting chroma from luma in AV1," in Proc. Data Compression Conf. (DCC), Mar. 2018, pp. 374–382.
[37] AOMedia Project AV1 Encoder. Accessed: Jan. 2019. [Online]. Available: https://aomedia.googlesource.com/aom/
[38] J. Boyce, "JVET common test conditions and software reference configurations," Document JVET-J1010, Joint Video Experts Team, ITU, Geneva, Switzerland, Apr. 2018.
Motong Xu (Student Member, IEEE) received the B.S. degree from the Department of Electronic Information Engineering, Harbin Institute of Technology, China, in 2012. She is currently pursuing the Ph.D. degree with the Department of Electrical and Computer Engineering, Sungkyunkwan University, South Korea. Her research interests include image/video coding.

Byeungwoo Jeon (Senior Member, IEEE) received the B.S. degree (magna cum laude) in 1985 and the M.S. degree in 1987 from the Department of Electronics Engineering, Seoul National University, Seoul, Korea, and the Ph.D. degree from the School of Electrical Engineering, Purdue University, West Lafayette, IN, USA, in 1992. From 1993 to 1997, he was with the Signal Processing Laboratory, Samsung Electronics, South Korea, where he worked on research and development of video compression algorithms, the design of digital broadcasting satellite receivers, and other MPEG-related research for multimedia applications. Since September 1997, he has been a Faculty Member with the School of Electronic and Electrical Engineering, Sungkyunkwan University (SKKU), South Korea, where he is currently a Professor. He served as the Project Manager of Digital TV and Broadcasting in the Korean Ministry of Information and Communications from March 2004 to February 2006, supervising all digital TV-related R&D in Korea. From January 2015 to December 2016, he was the Dean of the College of Information and Communication Engineering, SKKU. His research interests include multimedia signal processing, video compression, statistical pattern recognition, and remote sensing. He was a recipient of the 2005 IEEK Haedong Paper Award of the Signal Processing Society, South Korea, as well as the 2012 Special Service Award and the 2019 BTS Distinguished Volunteer Award, both from the IEEE Broadcast Technology Society. In 2019, he was the President of the Korean Institute of Broadcast and Media Engineers. He is a member of SPIE and an Associate Editor of the IEEE TRANSACTIONS ON BROADCASTING.
