User-Priority Based AV1 Coding Tool Selection
Abstract—AV1 is an open-source video coding technique developed by the Alliance for Open Media (AOMedia). The many powerful coding tools introduced in AV1 have significantly increased its encoding time. In this paper, we aim at simplifying the AV1 encoder by facilitating the selection of some coding tools depending on users' specific application preference. We first design a coding tool OFF-test to evaluate the coding performance of intra, inter, and in-loop filter coding tools in the AV1 encoder, in order to define a criterion for the importance of each coding tool by measuring the Bjøntegaard delta bit rate (BDBR) loss and time saving (TS) when each coding tool is turned OFF. Furthermore, the importance of each coding tool is analyzed based on the predefined criterion. Lastly, we suggest two intra coding tool selection methods, three inter coding tool selection methods, and three overall coding tool selection methods to simplify the AV1 encoder based on users' preference of quality-priority or low-complexity-priority. Experimental results show that our proposed low-complexity-priority selection method saves 30.72% of encoding time with only 0.91% loss in the BDBR sense, and the quality-priority selection method saves 4.64% of encoding time with 0.02% BDBR loss.

Index Terms—AV1, coding tool evaluation, encoder simplification.

Manuscript received September 30, 2020; revised December 26, 2020 and March 13, 2021; accepted March 16, 2021. Date of publication April 14, 2021; date of current version September 3, 2021. This work was supported in part by the Basic Science Research Program through the National Research Foundation of Korea (NRF) MFI under Grant NRF-2020R1A2C2007673, and in part by the Samsung Electronics Company Ltd., System LSI Division. (Corresponding author: Byeungwoo Jeon.) The authors are with the Digital Media Laboratory, Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon 16410, South Korea (e-mail: [email protected]; [email protected]). Digital Object Identifier 10.1109/TBC.2021.3071013

I. INTRODUCTION

AS THE demand for high-quality video contents by users continues to grow fast, the development of more powerful video compression techniques becomes of utmost importance.

The High Efficiency Video Coding (MPEG HEVC/ITU-T H.265) technique [1], standardized in 2013, achieves approximately 50% of bit-rate reduction while maintaining equivalent or even better visual quality compared to its prior video coding standard, MPEG AVC/ITU-T H.264 [2]. In the meantime, the open-source coding techniques VP8 [3] and VP9 [4] were developed in 2008 and 2013, respectively. Recently, ultra-high definition (UHD) video contents are more commonly provided in multimedia services, thus more advanced and powerful video coding techniques are required to efficiently compress 4K/8K contents as well as high frame rate video. Later, in 2018, another open-source coding technique, AV1 [5], was developed by the Alliance for Open Media (AOMedia) with more enhanced coding tools on top of VP9. AV1 is reported to achieve nearly 30% of bitrate reduction over the latest VP9 encoder [6]. An overview of its core coding tools is given in [6]. Besides, many comparative studies [7]–[17] have been carried out to evaluate its performance by comparing with several existing video coding techniques, including HEVC, VP9, and also Versatile Video Coding (VVC) [18], the most recent video coding standard, which achieves around 35% average bitrate saving for the same video quality over HEVC [19]. The authors in [7]–[17] investigated the compression efficiency and encoding time complexity of different coding techniques under multiple coding configurations. These works together helped researchers to better understand the benefits and drawbacks of each technique.

The more advanced and powerful coding tools implemented in AV1 greatly enhance its coding efficiency; however, they also demand more encoding time and computational power. According to the experiments in [7], AV1 is claimed to provide around 21% of Bjøntegaard delta bit rate (BDBR) [20] reduction over HEVC. In the meantime, however, it is also said to suffer from more than six times the encoding time complexity of the HEVC encoder. In practical applications, it is always very desirable to design a fast encoder with only little coding performance loss. In this context, many works [21]–[29] have focused on designing low-complexity encoders irrespective of the coding technique. In particular, some sample works related to AV1 are as follows.

Authors in [30] proposed a novel rate estimation model for each transform type in order to speed up the transform type selection process in AV1. A fast inter prediction algorithm using a decision tree is proposed in [31] to selectively decide a reduced set of inter modes to be subjected to the rate-distortion (RD) test so that the AV1 encoding time is reduced. AV1 intra mode decision is also accelerated in [32] with adaptive early termination. Moreover, a machine learning based fast AV1 encoder is studied in [33]. In [34] and [35], the authors evaluate the coding performance and time complexity of intra coding tools adopted in the AV1 encoder. One intra coding tool selection method was also proposed in [35] to speed up the AV1 encoder. However, its coding tool evaluation is limited only to intra coding tools. Since users can have many different encoding preferences based on their applications or usage scenarios, our target in this paper is to design a simplified AV1 encoder optimized for users' different preferences by selecting only the necessary coding tools under a certain user priority. For example, a user with a low-complexity-priority application in mind likes to keep the encoding complexity as low as possible even if its coding efficiency may suffer a little. That is, such a user prefers a fast video encoder even at the cost of a small loss in coding efficiency.
Fig. 3. Illustration of compound prediction modes in AV1.

Fig. 4. In-loop filter process employed in AV1.

TABLE I
AV1 Coding Tools Under Test
Intra block copy and color palette prediction are also considered as intra coding tools in this paper. Intra block copy predicts a block in a similar way as motion compensation in inter prediction, but it finds its reference block only in the already encoded area of the current picture. The color palette prediction is effective when a block can be represented by a small number of colors. Color values in the predictor are represented by color indices into the palette, which is generated based on the colors of neighboring pixels weighted according to distance and occurrence frequency.
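As a rough illustration of the palette idea only, the sketch below builds a small palette for a block and maps each pixel to its nearest palette entry. For simplicity, the palette here is derived from the block's own most frequent sample values rather than from distance- and frequency-weighted neighboring colors as described above, and the helper names are ours, not AV1 API calls.

```python
# Illustrative palette-style prediction (not the actual AV1 palette
# derivation): build a small palette for a block and represent every pixel
# by the index of its nearest palette color.
import numpy as np

def build_palette(block, max_colors=8):
    """Use the block's most frequent sample values as the palette (a
    simplification; AV1 derives the palette from weighted neighboring colors)."""
    values, counts = np.unique(block, return_counts=True)
    order = np.argsort(-counts)                      # most frequent first
    return values[order][:max_colors]

def palette_predict(block, palette):
    """Map every pixel to its nearest palette entry; return the index map
    and the reconstructed (predicted) block."""
    dist = np.abs(block[..., None].astype(int) - palette[None, None, :].astype(int))
    indices = dist.argmin(axis=-1)                   # per-pixel palette index
    return indices, palette[indices]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic screen-content-like block drawn from only a few sample values.
    block = rng.choice(np.array([0, 64, 128, 255], dtype=np.uint8), size=(8, 8))
    pal = build_palette(block, max_colors=4)
    idx, pred = palette_predict(block, pal)
    print("palette:", pal, "| exact for few-color block:", bool(np.array_equal(pred, block)))
```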
B. Inter Coding Tools

AV1 adopted advanced compound prediction for inter coding, where the final compound predictor p_f can be obtained from two predictors p_1 and p_2 as illustrated in Figure 3. Here, m is a compound mask, which can be either a scalar value or a matrix depending on which compound mode is used. In case of the distance-weighted compound prediction, m is a scalar value based on the expected output time of the reference pictures. The wedge compound utilizes 16 predefined wedge patterns, defined respectively for square and non-square blocks, as the mask m, and it can be applied to both inter-inter and inter-intra compound. The difference-weighted compound generates the mask m based on the difference between the two predictors, so that the final prediction can emphasize either one predictor over the other or the opposite. Moreover, the inter-intra compound prediction blends the inter predictor together with an intra predicted block. Four intra modes (DC prediction, vertical prediction, horizontal prediction, and smooth prediction) can be applied, and the mask m is defined differently for each intra prediction mode.
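To make the masked blending concrete, the following minimal sketch forms p_f from p_1 and p_2 using either a scalar mask (distance-weighted style) or a per-pixel mask (wedge or difference-weighted style). It assumes 6-bit blend weights in the range [0, 64]; the exact integer rounding used by the AV1 reference encoder may differ.

```python
# Minimal sketch of masked compound blending: p_f from p_1, p_2 and a mask m.
# Assumes 6-bit blend weights in [0, 64]; exact AV1 rounding may differ.
import numpy as np

def compound_blend(p1, p2, m):
    """Blend two predictors. A scalar m models distance-weighted compound;
    a per-pixel matrix m models wedge or difference-weighted compound."""
    m = np.asarray(m, dtype=np.int32)
    blended = (m * p1.astype(np.int32) + (64 - m) * p2.astype(np.int32) + 32) >> 6
    return np.clip(blended, 0, 255).astype(np.uint8)

if __name__ == "__main__":
    p1 = np.full((8, 8), 200, dtype=np.uint8)
    p2 = np.full((8, 8), 100, dtype=np.uint8)
    # Scalar mask (distance-weighted style): the nearer reference gets more weight.
    print(compound_blend(p1, p2, 48)[0, 0])
    # Per-pixel mask (wedge style): left half from p1, right half from p2.
    wedge = np.zeros((8, 8), dtype=np.int32)
    wedge[:, :4] = 64
    print(compound_blend(p1, p2, wedge)[0, 0], compound_blend(p1, p2, wedge)[0, 7])
```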
There are two types of warped motion compensation in AV1, namely global warped motion and local warped motion. The global warped motion is used to compensate for the motion of a whole frame caused by camera movement, and the local warped motion usually handles object translation, rotation, zooming, or affine motions at a block level.
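For reference, the local warped motion can be viewed as a 6-parameter affine mapping of pixel coordinates. The expression below is only a sketch of the general form; AV1 additionally supports restricted cases such as pure translation and rotation-zoom, and its exact parameter signaling is not reproduced here.

```latex
% Generic 6-parameter affine warp of a pixel position (x, y)
\begin{pmatrix} x' \\ y' \end{pmatrix} =
\begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix} +
\begin{pmatrix} b_1 \\ b_2 \end{pmatrix}
```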
The overlapped block motion compensation (OBMC) utilizes both the current inter predicted block and the inter predicted blocks generated based on the motion vectors of the above and the left blocks. Three predictors are blended together to form the final OBMC predictor. In the OBMC blending process, predictions from only the first reference picture are used.
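The sketch below conveys the OBMC idea of re-blending boundary samples with predictions obtained from the above and left neighbors' motion vectors. The overlap width and the blending weights here are hypothetical choices for illustration; the actual AV1 blending masks are different.

```python
# Illustrative OBMC-style blending (weights are hypothetical, chosen only to
# show the idea of smoothing toward the above/left neighbor predictions).
import numpy as np

def obmc_blend(cur, above, left, overlap=4):
    """Blend the current prediction with predictions built from the above
    and left neighbors' motion vectors over a small overlap region."""
    out = cur.astype(np.float64)
    # Weight of the neighbor prediction decays with distance from the edge.
    w = np.linspace(0.5, 0.0, overlap, endpoint=False)
    for i in range(overlap):                      # top rows: blend with 'above'
        out[i, :] = (1 - w[i]) * out[i, :] + w[i] * above[i, :]
    for j in range(overlap):                      # left cols: blend with 'left'
        out[:, j] = (1 - w[j]) * out[:, j] + w[j] * left[:, j]
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

if __name__ == "__main__":
    cur = np.full((16, 16), 120, dtype=np.uint8)
    above = np.full((16, 16), 180, dtype=np.uint8)   # prediction using the above block's MV
    left = np.full((16, 16), 60, dtype=np.uint8)     # prediction using the left block's MV
    print(obmc_blend(cur, above, left)[0, 8], obmc_blend(cur, above, left)[8, 0])
```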
Lastly, AV1 also has dual interpolation filters to interpolate pixels for more accurate motion compensation. This basically enables the interpolation process to use different filter taps in the horizontal and vertical directions, respectively.
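A minimal sketch of what different filter taps per direction means in practice is given below: the horizontal pass and the vertical pass use separate 1-D filters. The tap values are simple examples, not the AV1 interpolation filter coefficients.

```python
# Separable interpolation with independent horizontal and vertical filters,
# illustrating the idea behind AV1's dual interpolation filters. The tap
# values are simple examples, not the actual AV1 coefficients.
import numpy as np

SMOOTH_4TAP = np.array([1, 3, 3, 1], dtype=np.float64) / 8.0
SHARP_4TAP = np.array([-1, 5, 5, -1], dtype=np.float64) / 8.0

def interpolate_2d(block, h_taps, v_taps):
    """Filter every row with h_taps, then every column with v_taps
    (possibly different), keeping the same output size via 'same' convolution."""
    horiz = np.apply_along_axis(lambda r: np.convolve(r, h_taps, mode="same"), 1,
                                block.astype(np.float64))
    vert = np.apply_along_axis(lambda c: np.convolve(c, v_taps, mode="same"), 0, horiz)
    return vert

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    ref = rng.integers(0, 256, size=(8, 8)).astype(np.float64)
    # e.g., smooth filtering horizontally, sharper filtering vertically.
    pred = interpolate_2d(ref, SMOOTH_4TAP, SHARP_4TAP)
    print(pred.shape)
```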
C. In-Loop Filter Coding Tools

After the deblocking process in AV1, two in-loop filter coding tools are successively applied at the post-processing stage, as in Figure 4. The constrained directional enhancement filter (CDEF) is applied first as a deringing filter that can also preserve detailed information. Afterwards, a set of loop restoration filters can be applied after CDEF. The filtering process is done on a loop-restoration unit basis, where the unit size can be 64×64, 128×128, or 256×256. Then, one of the two supported filters (the separable symmetric normalized Wiener filter and the dual self-guided filter) is selected for each loop-restoration unit. The necessary filter parameters of the separable symmetric normalized Wiener filter and the dual self-guided filter are signaled to the decoder in the bitstream.
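The toy sketch below mirrors the per-unit selection step: the frame is split into restoration units and, for each unit, the candidate that minimizes the error against the source is kept. The candidate filters here are deliberately trivial stand-ins (identity and a 3x3 box blur), not the Wiener or self-guided restoration filters themselves.

```python
# Toy encoder-side loop-restoration selection: per restoration unit, keep
# whichever candidate filter gives the lower SSE against the source. The
# candidates are stand-ins, not the actual Wiener / self-guided filters.
import numpy as np

def box_blur(unit):
    """3x3 mean filter with edge padding (stand-in for a restoration filter)."""
    padded = np.pad(unit.astype(np.float64), 1, mode="edge")
    out = np.zeros_like(unit, dtype=np.float64)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += padded[1 + dy:1 + dy + unit.shape[0], 1 + dx:1 + dx + unit.shape[1]]
    return out / 9.0

def select_restoration(source, decoded, unit_size=64):
    """Return a per-unit choice ('none' or 'blur') minimizing SSE vs. source."""
    choices = {}
    for y in range(0, decoded.shape[0], unit_size):
        for x in range(0, decoded.shape[1], unit_size):
            src = source[y:y + unit_size, x:x + unit_size].astype(np.float64)
            rec = decoded[y:y + unit_size, x:x + unit_size].astype(np.float64)
            candidates = {"none": rec, "blur": box_blur(rec)}
            choices[(y, x)] = min(candidates, key=lambda k: np.sum((candidates[k] - src) ** 2))
    return choices

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    source = rng.integers(0, 256, size=(128, 128)).astype(np.float64)
    decoded = source + rng.normal(0, 5, size=source.shape)   # noisy reconstruction
    print(select_restoration(source, decoded, unit_size=64))
```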
III. DESIGN OF CODING TOOL OFF-TEST

In this paper, we aim at accelerating the AV1 encoder according to the user's preference on the aspect to make it faster, i.e., low-complexity priority or quality priority. This section describes the test environment and configuration we applied, our test design, and also the performance evaluation methodology.

A. Test Settings and Configurations

The version of the AV1 encoder that we used for the test is AOMedia Project AV1 Encoder 1.0.0-1634-g0a0368368, which can be accessed from [37]. The AV1 test configurations and corresponding values are listed in Table II. AV1 supports users to choose either 1-pass or 2-pass encoding.

TABLE II
AV1 Test Configurations

IV. AV1 CODING TOOL EVALUATION

In total, 26 sequences suggested in the JVET common test conditions [38] are used for our experiments. Intra and in-loop filter coding tools are tested under both the all intra (AI) and random access (RA) configurations, while inter coding tools are tested only under the RA configuration. Note that sequences from class E are not tested under the random access configuration according to [38]. To save running time, only the first one second of each test sequence is encoded, which is proved to be valid in [7].

TABLE IV
Summary of BDBR (%) and TS (%) of Each Intra Coding Tool Under OFF-Test

TABLE V
Suggested Intra Coding Tool Selection

A. Intra Coding Tool OFF-Test Results

The experimental results of the intra coding tools under the AI and RA encoding configurations are summarized in Table IV, and detailed experimental results for each class can be found in [35]. We observe that average BDBR losses of 5.62% and 1.76% are caused by turning OFF the extended directional prediction under the AI and RA configurations, respectively. Similarly, an AV1 encoder without the chroma from luma prediction suffers from as much as 21.67% (Cb channel in class F) and 14.37% (Cr channel in class A1) of coding performance loss in the chrominance channels [35], respectively for the AI and RA configurations; however, only very little encoding time is saved. Paeth prediction shows very limited coding performance improvement except for screen content videos (class F). Besides, our experimental results show that when intra block copy and color palette prediction are not applied, only sequences from class F suffer from significant BDBR loss. It indicates that intra block copy and color palette prediction have almost no effect on natural content videos. Note that some coding tools (such as the intra edge filter and chroma from luma prediction) provide high coding efficiency while consuming little additional encoding time. Turning OFF such coding tools may result in larger residual blocks after prediction, which consequently increases the processing time at the quantization and entropy coding stages. Therefore, the total encoding time can even increase when such coding tools are turned OFF.
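As a sketch of how such per-tool OFF-test figures can be assembled, the snippet below encodes an anchor with all tools ON and, for every tool, a test with only that tool OFF, then reports the BDBR loss and the time saving computed as TS = (T_anchor - T_test) / T_anchor x 100%. The functions run_encoder and bd_rate are placeholders standing in for the actual encoder invocation and the Bjøntegaard computation; they are not real aomenc options or library calls.

```python
# Hypothetical harness for the coding tool OFF-test: encode once with all
# tools ON (anchor), then once per tool with that tool OFF, and report BDBR
# loss and time saving (TS). run_encoder and bd_rate are placeholders, not
# real aomenc options or an existing library API.
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class EncodeResult:
    bitrates: List[float]   # one bitrate per QP point
    psnrs: List[float]      # one PSNR per QP point
    time_sec: float         # total encoding time

def run_encoder(sequence: str, disabled_tool: Optional[str] = None) -> EncodeResult:
    raise NotImplementedError("placeholder: run the AV1 encoder here")

def bd_rate(anchor: EncodeResult, test: EncodeResult) -> float:
    raise NotImplementedError("placeholder: Bjontegaard delta bit rate computation")

def off_test(sequence: str, tools: List[str]) -> Dict[str, Dict[str, float]]:
    anchor = run_encoder(sequence)                        # all coding tools ON
    results = {}
    for tool in tools:
        test = run_encoder(sequence, disabled_tool=tool)  # only this tool OFF
        ts = (anchor.time_sec - test.time_sec) / anchor.time_sec * 100.0
        results[tool] = {"BDBR(%)": bd_rate(anchor, test), "TS(%)": ts}
    return results
```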
TABLE VI
BDBR (%) and TS (%) of Suggested Intra Coding Tool Selection

TABLE VII
BDBR (%) and TS (%) of Each Inter Coding Tool Under OFF-Test (RA)
Figures 6 and 7 show the BDBR and TS comparison of all the intra coding tools under the AI and RA configurations, respectively. Extended directional prediction and chroma from luma prediction have a huge benefit in BDBR performance, so it is desirable to always keep these two coding tools ON. However, Paeth prediction, intra block copy, and color palette prediction are suggested to be turned OFF for natural videos, since the corresponding BDBR loss caused by turning them OFF is negligible. Besides, we suggest that the smooth prediction and recursive intra prediction are optional, depending on the user's choice, since their OFF-test results fall into the gray region C.
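To illustrate how such a user-priority choice could be expressed programmatically, the toy function below keeps a tool ON when its OFF-test BDBR loss is large, turns it OFF when the loss is negligible but the time saving is useful, and otherwise marks it optional (the gray region C mentioned above), resolving the optional tools by the user's priority. The numeric thresholds and the handling of the non-optional regions are illustrative assumptions, not the criterion defined in this paper.

```python
# Toy classification of coding tools from their OFF-test results. The
# thresholds below are illustrative assumptions only; the paper's actual
# region boundaries are defined by its own criterion.
def classify_tool(bdbr_loss, time_saving, bdbr_high=1.0, bdbr_low=0.1, ts_min=2.0):
    if bdbr_loss >= bdbr_high:
        return "keep ON"                      # turning it OFF hurts quality too much
    if bdbr_loss <= bdbr_low and time_saving >= ts_min:
        return "turn OFF"                     # negligible loss, useful speed-up
    return "optional (region C)"              # left to the user's priority

def select_tools(off_test_results, low_complexity_priority):
    decisions = {}
    for tool, r in off_test_results.items():
        label = classify_tool(r["BDBR(%)"], r["TS(%)"])
        if label.startswith("optional"):
            label = "turn OFF" if low_complexity_priority else "keep ON"
        decisions[tool] = label
    return decisions
```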
TABLE X
BDBR (%) and TS (%) of Each In-Loop Filter Coding Tool Under OFF-Test
Motong Xu (Student Member, IEEE) received the B.S. degree from the Department of Electronic Information Engineering, Harbin Institute of Technology, China, in 2012. She is currently pursuing the Ph.D. degree with the Department of Electrical and Computer Engineering, Sungkyunkwan University, South Korea. Her research interests include image/video coding.

Byeungwoo Jeon (Senior Member, IEEE) received the B.S. degree (magna cum laude) in 1985 and the M.S. degree from the Department of Electronics Engineering, Seoul National University, Seoul, Korea, in 1987, and the Ph.D. degree from the School of Electrical Engineering, Purdue University, West Lafayette, IN, USA, in 1992. From 1993 to 1997, he was with the Signal Processing Laboratory, Samsung Electronics, South Korea, where he worked on research and development of video compression algorithms, the design of digital broadcasting satellite receivers, and other MPEG-related research for multimedia applications. Since September 1997, he has been a Faculty Member with the School of Electronic and Electrical Engineering, Sungkyunkwan University (SKKU), South Korea, where he is currently a Professor. He served as the Project Manager of Digital TV and Broadcasting in the Korean Ministry of Information and Communications from March 2004 to February 2006, where he supervised all digital TV-related R&D in Korea. From January 2015 to December 2016, he was the Dean of the College of Information and Communication Engineering, SKKU. His research interests include multimedia signal processing, video compression, statistical pattern recognition, and remote sensing. He was a recipient of the 2005 IEEK Haedong Paper Award of the Signal Processing Society, South Korea, and of the 2012 Special Service Award and the 2019 BTS Distinguished Volunteer Award, both from the IEEE Broadcast Technology Society. In 2019, he was the President of the Korean Institute of Broadcast and Media Engineers. He is a member of SPIE and an Associate Editor of the IEEE TRANSACTIONS ON BROADCASTING.