Sullivan 1998
…compression schemes is based on a sophisticated interaction between various motion representation possibilities, waveform coding of differences, and waveform coding of various refreshed regions. Hence, a key problem in high-compression video coding is the operational control of the encoder. This problem is compounded by the widely varying content and motion found in typical video sequences, necessitating the selection between different representation possibilities with varying rate-distortion efficiency. This article addresses the problem of video encoder optimization and discusses its consequences on the compression architecture of the overall coding system. Based on the well-known hybrid video coding structure, Lagrangian optimization techniques are presented that try to answer the question: "What part of the video signal should be coded using what method and parameter settings?"

Motion video data consists essentially of a time-ordered sequence of pictures, and cameras typically generate approximately 24, 25, or 30 pictures (or frames) per second. This results in a large amount of data that demands the use of compression. For example, assume that each picture has a relatively low "QCIF" (quarter-common-intermediate-format) resolution (i.e., 176 x 144 samples) for which each sample is digitally represented with 8 bits, and assume that we skip two out of every three pictures in order to cut down the bit rate. For color pictures, three color component samples are necessary to represent a sufficient color space for each pixel. In order to transmit even this relatively low-fidelity sequence of pictures, the raw source data rate is still more than 6 Mbit/s. However, today's low-cost transmission channels often operate at much lower data rates, so the data rate of the video signal needs to be further compressed.
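The 6 Mbit/s figure can be verified with a quick calculation (a sketch; it assumes, per the text, three 8-bit color components per pixel at full QCIF resolution, with one of every three pictures kept):

```python
# Raw data rate for QCIF video with two of every three pictures skipped.
width, height = 176, 144           # QCIF resolution
components = 3                     # three color components per pixel
bits_per_sample = 8
frame_rate = 30 / 3                # keep 1 of every 3 pictures -> 10 pictures/s

bits_per_picture = width * height * components * bits_per_sample
rate_bits_per_second = bits_per_picture * frame_rate
print(rate_bits_per_second / 1e6, "Mbit/s")  # -> 6.08256 Mbit/s
```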
…scheme consists of breaking up the image into equal-size blocks. These blocks are transformed by a discrete cosine transform (DCT), and the DCT coefficients are then quantized and transmitted using variable-length codes. We will refer to this kind of coding scheme as INTRA-frame coding, since the picture is coded without referring to other pictures in the video sequence. In fact, such INTRA coding alone (often called "motion JPEG") is in common use as a video coding method today in production-quality editing systems that demand rapid access to any frame of video content.

However, improved compression performance can be attained by taking advantage of the large amount of temporal redundancy in video content. We will refer to such techniques as INTER-frame coding. Usually, much of the depicted scene is essentially unchanged from one picture to the next, and exploiting this temporal-domain redundancy to improve coding efficiency is what fundamentally distinguishes video coding from still-picture coding.

[Figure 1: block diagram of a hybrid video coder. The input frame passes through DCT, quantization, and entropy coding to produce the encoded residual (to channel); a motion-compensated prediction loop with a frame buffer (delay) is shown in a dotted box.]

The concept of frame difference refinement can also be taken a step further, by adding motion-compensated prediction (MCP). Most changes in video content are typically due to the motion of objects in the depicted scene relative to the imaging plane, and a small amount of motion can result in a large difference in the values of the pixels in a picture area (especially near the edges of an object). Often, displacing an area of the prior picture by a few pixels in spatial location can result in a significant reduction in the amount of information that needs to be sent as a frame difference approximation. This use of spatial displacement to form an approximation is known as motion compensation.
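The benefit of displacing an area of the prior picture can be illustrated with a toy one-dimensional example (synthetic values, not data from the article): shifting the reference by the true motion leaves a far smaller frame difference than differencing co-located samples.

```python
# Toy illustration: an "edge" that moves 2 pixels to the right between frames.
prior = [0, 0, 0, 0, 200, 200, 200, 200]   # one row of the prior picture
current = [0, 0, 0, 0, 0, 0, 200, 200]     # same edge, displaced by 2 pixels

def sad(pred, cur):
    # Sum of absolute differences between a prediction and the current row.
    return sum(abs(a - b) for a, b in zip(pred, cur))

no_mc = sad(prior, current)                # plain frame difference
mc_pred = [prior[max(i - 2, 0)] for i in range(len(current))]  # displace by 2
with_mc = sad(mc_pred, current)            # displaced frame difference
print(no_mc, with_mc)                      # -> 400 0
```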
The coding of the difference signal used for the refinement of the MCP signal is known as displaced frame difference (DFD) coding.

Hence, the most successful class of video compression designs are called hybrid codecs. The naming of this coder is due to its construction as a hybrid of motion-handling and picture-coding techniques, and the term codec is used to refer to both the coder and decoder of a video compression system. Figure 1 shows such a hybrid coder. Its design and operation involve the optimization of a number of decisions, including:

1. How to segment each picture into areas,
2. Whether or not to replace each area of the picture with completely new INTRA-picture content,
3. If not replacing an area with new INTRA content,
   (a) How to do motion estimation; i.e., how to select the spatial shifting displacement to use for INTER-picture predictive coding (with a zero-valued displacement being an important special case),
   (b) How to do DFD coding; i.e., how to select the approximation to use as a refinement of the INTER prediction (with a zero-valued approximation being an important special case), and
4. If replacing an area with new INTRA content, what approximation to send as the replacement content.

At this point, we have introduced a problem for the engineer who designs such a video coding system, which is: What part of the image should be coded using what method? If the possible modes of operation are restricted to INTRA coding and SKIP, the choice is relatively simple. However, hybrid video codecs achieve their compression performance by employing several modes of operation that are adaptively assigned to parts of the encoded picture, and there is a dependency between the effects of the motion estimation and DFD coding stages of INTER coding. The modes of operation are generally associated with signal-dependent rate-distortion characteristics, and rate-distortion trade-offs are inherent in the design of each of these aspects. The second and third items above, in particular, are unique to motion video coding. The optimization of these decisions in the design and operation of a video coder is the primary topic of this article. Some further techniques that go somewhat beyond this model will also be discussed.

…whether or not the area is predicted from the prior picture. For the areas that are predicted from the prior picture, a motion vector (MV), denoted v, is received. The MV specifies a spatial displacement for motion compensation of that region. Using the prediction mode and …

An Overview of Future Visual Coding Standardization Projects

MPEG-4: A future visual coding standard for both still and moving visual content. The ISO/IEC SC29 WG11 organization is currently developing two drafts, called version 1 and version 2 of MPEG-4 visual. Final approval of version 1 is planned in January 1999 (with technical content completed in October 1998), and approval of version 2 is currently planned for approximately one year later. MPEG-4 visual (which will become IS 14496-2) will include most technical features of the prior video and still-picture coding standards, and will also include a number of new features such as zero-tree wavelet coding of still pictures, segmented shape coding of objects, and coding of hybrids of synthetic and natural video content. It will cover essentially all bit rates, picture formats, and frame rates, including both interlaced and progressive-scan video pictures. Its efficiency for predictive coding of normal camera-view video content will be similar to that of H.263 for noninterlaced video sources and similar to that of MPEG-2 for interlaced sources. For some special-purpose and artificially generated scenes, it will provide significantly superior compression performance and new object-oriented capabilities. It will also contain a still-picture coder that has improved compression quality relative to JPEG at low bit rates.

H.263++: Future enhancements of H.263. The H.263++ project is considering adding more optional enhancements to H.263 and is currently scheduled for completion late in the year 2000. It is a project of the ITU-T Advanced Video Coding Experts Group (SG16/Q15).

JPEG-2000: A future new still-picture coding standard. JPEG-2000 is a joint project of the ITU-T SG8 and ISO/IEC JTC1 SC29 WG1 organizations. It is scheduled for completion late in the year 2000.

H.26L: A future new generation of video coding standard with improved efficiency, error resilience, and streaming support. H.26L is currently scheduled for approval in 2002. It is a project of the ITU-T Advanced Video Coding Experts Group (SG16/Q15).
Complicating Factors in Video Coding Optimization

The video coder model described in this article is useful for illustration purposes, but in practice actual video coder designs often differ from it in various ways that complicate design and analysis. Some of the important differences are described in the following few paragraphs.

Color chrominance components (e.g., Cb_t(s) and Cr_t(s)) are often represented with lower resolution (e.g., W/2 x H/2) than the luminance component of the image Y_t(s). This is because the human psycho-visual system is much more sensitive to brightness than to chrominance, allowing bit-rate savings by coding the chrominance at lower resolution. In such a system, the method of operation must be adjusted to account for the difference in resolution (for example, by dividing the MV values by two for chrominance components).

Since image values I_t(s) are defined only for integer pixel locations s = (x, y) within the rectangular picture area, the above model will work properly in the strict sense only if every motion vector v is restricted to have an integer value and only a value that causes access to locations in the prior picture that are within the picture's rectangular boundary. These restrictions, which are maintained in some early video-coding methods such as ITU-T Rec. H.261 [4], are detrimental to performance. More recent designs such as ITU-T Rec. H.263 [10] support the removal of these restrictions by using interpolation of the prior picture for any fractional-valued MVs (normally half-integer values, resulting in what is called half-pixel motion) and MVs that access locations outside the boundary of the picture (resulting in what we call picture-extrapolating MVs). The prediction of an image area may also be filtered to avoid high-frequency artifacts (as in Rec. H.261 [4]).

Often there are interactions between the coding of different regions in a video coder. The number of bits needed to specify an MV value may depend on the values of the MVs in neighboring regions. The areas of influence of different MVs can be overlapping due to overlapped-block motion compensation (OBMC) [16]-[19], and the areas of influence of coded transform blocks can also overlap due to the application of deblocking filters. While these cross-dependencies can improve coding performance, they can also complicate the task of optimizing the decisions made in an encoder. For this reason these cross-dependencies are often neglected (or only partially accounted for) during encoder optimization.

One important and often-neglected interaction between the coding of video regions is the temporal propagation of error. The fidelity of each area of a particular picture will affect the ability to use that picture area for the prediction of subsequent pictures. Real-time encoders must neglect this aspect to a large extent, since they cannot tolerate the delay necessary for optimizing a long temporal sequence of decisions while accounting for the temporal effects on many pictures. However, even non-real-time encoders often neglect to account for this propagation in any significant way, due to the sheer complexity of adding this extra dimension to the analysis. An example of the exploitation of temporal dependencies in video coding can be found in [20]. The work of Ramchandran, Ortega, and Vetterli in [20] was extended by Lee and Dickinson in [21].
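A minimal 1-D sketch of the two relaxations just described, half-pixel interpolation and picture-extrapolating access, assuming simple linear interpolation and edge clamping (the function names and details are illustrative, not taken from H.263):

```python
import math

def ref_sample(prior, i):
    # Picture-extrapolating access: clamp out-of-range positions to the edge
    # (a simplified stand-in for boundary extrapolation).
    return prior[max(0, min(len(prior) - 1, i))]

def predict_pixel(prior, x, mv):
    # Half-pixel motion: linearly interpolate between the two nearest
    # integer-position samples of the prior picture (1-D sketch).
    pos = x - mv
    i = math.floor(pos)
    frac = pos - i
    return (1 - frac) * ref_sample(prior, i) + frac * ref_sample(prior, i + 1)

row = [10, 20, 30, 40]                 # one row of the prior picture
print(predict_pixel(row, 2, 0.5))      # half-pel MV -> interpolated 25.0
print(predict_pixel(row, 0, 2.0))      # reference outside picture -> clamped

# For half-resolution chrominance, the same MV is reused divided by two:
chroma_mv = 0.5 / 2
```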
…with Î_t(s) being a predicted pixel value and I_{t-Δt}(s - v) being a motion-compensated pixel from a decoded frame Δt time instants in the past (normally Δt = 1). This scheme is a generalization of Eq. (1), and it includes concepts like subpixel-accurate MCP [43, 44], B-frames [45], spatial filtering [4], and OBMC [16]-[19]. Using the linear filtering approach of Eq. (10), the accuracy of motion compensation can be significantly improved. A rationale for this approach is that if there are P different plausible hypotheses for the MV that properly represents the motion of a pixel s, and if each of these can be associated with a hypothesis probability h_p(s), then the expected value of the pixel prediction is given by Eq. (10); an expected value is the estimator that minimizes the mean-square error in the prediction of any random variable. Another rationale is that if each hypothesis prediction is viewed as a noisy representation of the pixel, then performing an optimized weighted averaging of the results of several hypotheses as performed in Eq. (10) can reduce the noise. It should be obvious that if an optimized set of weights {h_p(s)} is used in the linear combination (Eq. (10)), the result cannot be worse on average than the result obtained from a single hypothesis as in Eq. (1). The multi-hypothesis MCP concept was introduced in [18], …

▲ 3. Prediction gain vs. MV bit rate for the sequences Mother & Daughter (top) and Foreman (bottom) when employing H.263 MV median prediction and original frames as reference frames. [Plots: prediction gain vs. bit rate in kbps; Foreman, QCIF, SKIP=2, original reference.]
…case, the portion of the bit rate used for sending MVs shows an increase of 30% compared to the one-frame case [49]. Embedded in a complete video coder, the approach still yields significant coding gains, expressed in bit-rate savings of 23% for the sequence Foreman and 17% for the sequence Mother & Daughter, due to the impact of long-term memory MCP when comparing it to the rate-distortion optimized H.263 coder, which is outlined in this article [49].

Complex motion models (the third item) have been … video frame down to a granularity of 8 x 8 blocks. Bit-rate savings of more than 25% were reported for the sequence Foreman [51].

▲ 4. Coding performance for the sequences Mother & Daughter (top) and Foreman (bottom) when employing variable block sizes. [Plots: PSNR vs. bit rate in kbps; QCIF, SKIP=2, Q=4,5,7,10,15,25; curves for 8x8 blocks versus 16x16 and 8x8 blocks.]

If the distortion of the residual coding stage is controlled by the selection of a quantizer step size Q, then rate-distortion optimized mode decision refers to the minimization of the following Lagrangian functional:

    J(A, M, Q) = D_REC(A, M, Q) + λ_MODE · R_REC(A, M, Q),        (11)

where, for instance, M ∈ {INTRA, SKIP, INTER, INTER+4V} indicates a mode chosen for a particular macroblock, Q is the selected quantizer step size, D_REC(A, M, Q) is the SSD between the original macroblock A and its reconstruction, and R_REC(A, M, Q) is the number of bits associated with choosing M and Q.

Table 1. Bit-rate partition of the various H.263 modes.

    Mode     Motion Coding Bit Rate [%]    Texture Coding Bit Rate [%]
    INTRA    0                             100
    INTER    30 ± 15                       70 ± 15

A simple algorithm for rate-constrained mode decision minimizes Eq. (11) given all mode decisions of past macroblocks [26, 27]. This procedure partially neglects dependencies between macroblocks, such as prediction of MV values from those of neighboring blocks and OBMC. In [52, 53], Wiegand et al. proposed the exploitation of mode decision dependencies between macroblocks using dynamic programming methods. Later work on the subject that also included the option to change the quantizer …
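The rate-constrained mode decision of Eq. (11) can be sketched directly: for each macroblock, evaluate J = D + λ·R per mode and keep the minimizer. The cost numbers below are illustrative, not measurements from the article.

```python
def choose_mode(costs, lam):
    # costs: dict mapping mode name -> (SSD distortion, bits).
    # Returns the mode minimizing the Eq. (11)-style Lagrangian cost D + lam*R.
    return min(costs, key=lambda m: costs[m][0] + lam * costs[m][1])

costs = {
    "SKIP":     (9000.0, 1),    # almost no bits, high distortion
    "INTER":    (2500.0, 120),  # one MV plus DFD coding
    "INTER+4V": (2100.0, 190),  # four MVs per macroblock
    "INTRA":    (1500.0, 600),  # completely new replacement content
}
print(choose_mode(costs, lam=5.0))    # -> INTER+4V (low lambda favors fidelity)
print(choose_mode(costs, lam=100.0))  # -> SKIP (high lambda favors rate)
```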
▲ 5. Relative occurrence vs. macroblock QUANT for various Lagrange parameter settings. The relative occurrences of macroblock QUANT values are gathered while coding 100 frames of the video sequences Foreman (a), Mobile & Calendar (b), Mother & Daughter (c), and News (d). [Histograms: relative occurrence vs. QUANT in {1, ..., 31}, for settings including λ = 25, λ = 100, and λ = 1000.]
…complex structure of the entropy coding of H.263 has recently appeared [57, 58]. Trellis-based quantization was reported to provide approximately a 3% reduction in the bit rate needed for a given level of fidelity when applied to H.263-based DCT coding [57, 58].

[Figure: Lagrange parameter vs. average QUANT.]

Choosing λ and the Quantization Step Size Q

The algorithm for the rate-constrained mode decision can be modified in order to incorporate macroblock …
…which is an approximation of the functional relationship between the macroblock QUANT and the Lagrange parameter λ up to QUANT values of 25, and H.263 allows only a choice of QUANT ∈ {1, 2, ..., 31}. Particularly remarkable is the strong dependency between λ_MODE and QUANT, even for sequences with widely varying content. Note, however, that for a given value of λ_MODE, the chosen QUANT tends to be higher for sequences that require higher amounts of bits (Mobile & Calendar) in comparison to sequences requiring smaller amounts of bits for coding at that particular λ_MODE (Mother & Daughter); but these differences are rather small.

…where c = 4 / (12a). Although our assumptions may not be completely realistic, the derivation reveals at least the qualitative insight that it may be reasonable for the value of the Lagrange parameter λ_MODE to be proportional to the square of the quantization parameter. As shown above, 0.85 appears to be a reasonable value for use as the constant c.

This ties together two of the three optimization parameters, QUANT and λ_MODE. For the third, λ_MOTION, we make an adjustment to the relationship to allow use of the SAD measure rather than the SSD measure in that stage of encoding. Experimentally, we have found that an effective …
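The λ_MODE ≈ 0.85·QUANT² relationship can be written down directly. Because the text above is truncated before the effective λ_MOTION rule is stated, the square-root adjustment below is an assumption (a convention used in later H.263/H.264 test models to match SAD to SSD scaling), not a value taken from this article:

```python
import math

def lagrange_mode(quant, c=0.85):
    # Empirical relationship discussed above: lambda_MODE ~ c * QUANT^2.
    return c * quant * quant

def lagrange_motion(quant):
    # ASSUMED adjustment for SAD-based motion search: sqrt of lambda_MODE.
    # The exact rule is lost in the truncated text; this is a common convention.
    return math.sqrt(lagrange_mode(quant))

for q in (4, 10, 25):
    print(q, lagrange_mode(q), round(lagrange_motion(q), 2))
```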
The ITU-T Video Coding Experts Group (ITU-T Q.15/SG16) maintains an internal document describing examples of encoding strategies, which is called its test model [33, 61]. The mode decision and motion estimation optimization strategies described above, along with the method of choosing λ_MODE based on the quantizer step size as shown above, were recently proposed by the second author and others for inclusion into this test model [62, 63]. The group, which is chaired by the first author, had previously been using a less-optimized encoding approach for its internal evaluations [61], but accepted these methods in the creation of a more recent model [33]. The test model documents, the other referenced Q.15 documents, and other information relating to ITU-T Video Coding Experts Group work can be found on an ftp site maintained by the group (ftp://standard.pictel.com/video-site). Reference software for the test model is available by ftp from the University of British Columbia (ftp://dspftp.ee.ubc.ca/pub/tmn, with further information at http://www.ece.ubc.ca/spmg/research/motion/h263plus).

The less-sophisticated TMN-9 mode-decision method is based on thresholds. It compared the sum of absolute differences of the 16 x 16 macroblock with respect to its mean value to the minimum prediction SAD obtained by an integer-pixel motion search, in order to make its decision between INTRA and INTER modes according to whether … would be chosen for that particular macroblock. The min{SAD(full pixel, 16 x 16)} value above corresponds to the minimum SAD value after integer-pixel motion compensation using a 16 x 16 motion compensation block size, where the SAD value of the (0, 0) MV is reduced by …

▲ 7. Coding performance for the sequences Mother & Daughter (top) and Foreman (bottom) when comparing the TMN-9 to the TMN-10 encoding strategy. [Plots: PSNR vs. bit rate in kbps; QCIF, SKIP=2, Q=4,5,7,10,15,25; curves: Annexes D+F with TMN-10 MD and TMN-9 ME, and Annexes D+F with TMN-10 MD and ME.]
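The TMN-9 threshold rule can be sketched as follows. Because the inequality itself is lost in the source text, the margin constant below (`bias=500`) and the exact comparison are placeholder assumptions, not values taken from the article:

```python
def tmn9_intra_inter_decision(mb, min_sad_16x16, bias=500):
    # Threshold-based INTRA/INTER choice in the spirit of TMN-9.
    # mb: the 256 luminance samples of a 16x16 macroblock.
    # min_sad_16x16: best integer-pixel prediction SAD for the macroblock.
    # bias: ASSUMED margin; the actual TMN-9 constant is not recoverable
    # from the truncated text above.
    mean = sum(mb) / len(mb)
    deviation = sum(abs(p - mean) for p in mb)   # spread about the MB mean
    return "INTRA" if deviation < min_sad_16x16 - bias else "INTER"

# A flat macroblock that is poorly predicted tends toward INTRA;
# a busy macroblock with a decent prediction tends toward INTER.
print(tmn9_intra_inter_decision([100] * 256, 2000))
print(tmn9_intra_inter_decision([0, 255] * 128, 300))
```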
Acknowledgments

The authors wish to thank Klaus Stuhlmüller, Niko Färber, Bernd Girod, Barry Andrews, Philip Chou, and the PictureTel research team for their support and useful discussions. They also wish to thank guest editors Antonio Ortega and Kannan Ramchandran, as well as Jonathan Su, John Villasenor and his UCLA research team, Faouzi Kossentini and his UBC research team including Michael Gallant, and the anonymous reviewers for their valuable comments. The work of Thomas Wiegand was partially funded by 8x8, Inc.

Gary Sullivan is the manager of communication core research with PictureTel Corporation in Andover, Massachusetts, USA. Thomas Wiegand is a Ph.D. student with the University of Erlangen-Nuremberg in Erlangen, Germany.

References

1. ITU-T (formerly CCITT) and ISO/IEC JTC1, "Digital Compression and Coding of Continuous-Tone Still Images," ISO/IEC 10918-1 - ITU-T Recommendation T.81 (JPEG), Sept. 1992.
2. W.B. Pennebaker and J.L. Mitchell, JPEG: Still Image Data Compression Standard, Van Nostrand Reinhold, New York, USA, 1993.
3. ITU-T (formerly CCITT), "Codec for Videoconferencing Using Primary Digital Group Transmission," ITU-T Recommendation H.120; version 1, 1984; version 2, 1988.
4. ITU-T (formerly CCITT), "Video codec for audiovisual services at p x 64 kbit/s," ITU-T Recommendation H.261; version 1, Nov. 1990; version 2, Mar. 1993.
16. H. Watanabe and S. Singhal, "Windowed motion compensation," in Proceedings of the SPIE Conference on Visual Communications and Image Processing, vol. 1605, pp. 582-589, 1991.
17. S. Nogaki and M. Ohta, "An overlapped block motion compensation for high quality motion picture coding," in Proceedings of the IEEE International Symposium on Circuits and Systems, vol. 1, pp. 184-187, May 1992.
18. G.J. Sullivan, "Multi-hypothesis motion compensation for low bit-rate video coding," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Minneapolis, MN, USA, vol. 5, pp. 437-440, Apr. 1993.
19. M.T. Orchard and G.J. Sullivan, "Overlapped block motion compensation: an estimation-theoretic approach," IEEE Transactions on Image Processing, vol. 3, no. 5, pp. 693-699, Sept. 1994.
20. K. Ramchandran, A. Ortega, and M. Vetterli, "Bit allocation for dependent quantization with applications to multiresolution and MPEG video coders," IEEE Transactions on Image Processing, vol. 3, no. 5, pp. 533-545, Sept. 1994.
21. J. Lee and B.W. Dickinson, "Joint optimization of frame type selection and bit allocation for MPEG video coders," in Proceedings of the IEEE International Conference on Image Processing, Austin, USA, vol. 2, pp. 962-966, Nov. 1994.
22. Y. Shoham and A. Gersho, "Efficient bit allocation for an arbitrary set of quantizers," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 36, pp. 1445-1453, Sept. 1988.
23. H. Everett III, "Generalized Lagrange multiplier method for solving problems of optimum allocation of resources," Operations Research, vol. 11, pp. 399-417, 1963.
24. P.A. Chou, T. Lookabaugh, and R.M. Gray, "Entropy-constrained vector quantization," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 37, no. 1, pp. 31-42, Jan. 1989.
25. A. Gersho and R.M. Gray, Vector Quantization and Signal Compression, Kluwer Academic Publishers, Boston, USA, 1991.
33. ITU-T SG16/Q15 (T. Gardos, ed.), "Video codec test model number 10 (TMN-10)," ITU-T SG16/Q15 document Q15-D-65 (downloadable via ftp://standard.pictel.com/video-site), Apr. 1998.
34. J. Choi and D. Park, "A stable feedback control of the buffer state using the controlled Lagrange multiplier method," IEEE Transactions on Image Processing, vol. 3, no. 5, pp. 546-558, Sept. 1994.
35. A. Ortega, K. Ramchandran, and M. Vetterli, "Optimal trellis-based buffered compression and fast approximations," IEEE Transactions on Image Processing, vol. 3, no. 1, pp. 26-40, Jan. 1994.
36. J. Ribas-Corbera and S. Lei, "Rate control for low-delay video communications," ITU-T SG16/Q15 document Q15-A-20 (downloadable via ftp://standard.pictel.com/video-site), June 1997.
37. M.C. Chen and A.N. Willson, "Rate-distortion optimal motion estimation for video coding," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Atlanta, USA, vol. 4, pp. 2096-2099, May 1996.
38. M.C. Chen and A.N. Willson, "Design and optimization of a differentially coded variable block size motion compensation system," in Proceedings of the IEEE International Conference on Image Processing, Lausanne, Switzerland, vol. 3, pp. 259-262, Sept. 1996.
39. M.C. Chen and A.N. Willson, "Rate-distortion optimal motion estimation algorithms for motion-compensated transform video coding," IEEE Transactions on Circuits and Systems for Video Technology, vol. 8, no. 2, pp. 147-158, Apr. 1998.
40. W.C. Chung, F. Kossentini, and M.J.T. Smith, "An efficient motion estimation technique based on a rate-distortion criterion," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Atlanta, USA, vol. 4, pp. 1926-1929, May 1996.
41. F. Kossentini, Y.-W. Lee, M.J.T. Smith, and R. Ward, "Predictive RD optimized motion estimation for very low bit rate video coding," IEEE Journal on Selected Areas in Communications, vol. 15, no. 9, pp. 1752-1763, Dec. 1997.
42. G.M. Schuster and A.K. Katsaggelos, "A video compression scheme with optimal bit allocation among segmentation, motion, and residual error," IEEE Transactions on Image Processing, vol. 6, pp. 1487-1502, Nov. 1997.
43. B. Girod, "The efficiency of motion-compensating prediction for hybrid coding of video sequences," IEEE Journal on Selected Areas in Communications, vol. 5, no. 7, pp. 1140-1154, Aug. 1987.
44. B. Girod, "Motion-compensating prediction with fractional-pel accuracy," IEEE Transactions on Communications, vol. 41, no. 4, pp. 604-612, Apr. 1993.
45. H.G. Musmann, P. Pirsch, and H.-J. Grallert, "Advances in picture coding," Proceedings of the IEEE, vol. 73, no. 4, pp. 523-548, Apr. 1985.
46. B. Girod, "Efficiency analysis of multi-hypothesis motion-compensated prediction for video coding," IEEE Transactions on Image Processing, 1997, submitted for publication.
49. T. Wiegand, X. Zhang, and B. Girod, "Long-term memory motion-compensated prediction," IEEE Transactions on Circuits and Systems for Video Technology, Sept. 1998.
50. Nokia Research Center (P. Haavisto et al.), "Proposal for … ing," ISO/IEC JTC1/SC29/WG11, MPEG document MPEG96/M0904, July 1996.
51. M. Karczewicz, J. Nieweglowski, and P. Haavisto, "Video coding using motion compensation with polynomial motion vector fields," Signal Processing: Image Communication, vol. 10, pp. 63-91, 1997.
52. T. Wiegand, M. Lightstone, T.G. Campbell, and S.K. Mitra, "Efficient mode selection for block-based motion compensated video coding," in Proceedings of the IEEE International Conference on Image Processing, Washington, D.C., USA, Oct. 1995.
53. T. Wiegand, M. Lightstone, D. Mukherjee, T.G. Campbell, and S.K. Mitra, "Rate-distortion optimized mode selection for very low bit rate video coding and the emerging H.263 standard," IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, no. 2, pp. 182-190, Apr. 1996.
54. G.M. Schuster and A.K. Katsaggelos, "Fast and efficient mode and quantizer selection in the rate distortion sense for H.263," in Proceedings of the SPIE Conference on Visual Communications and Image Processing, Orlando, USA, pp. 784-795, Mar. 1996.
55. G.J. Sullivan, "Efficient scalar quantization of exponential and Laplacian random variables," IEEE Transactions on Information Theory, vol. 42, no. 5, pp. 1365-1374, Sept. 1996.
56. A. Ortega and K. Ramchandran, "Forward-adaptive quantization with optimal overhead cost for image and video coding with applications to MPEG video coders," in Proceedings of the SPIE, Digital Video Compression: Algorithms and Technologies, San Jose, USA, Feb. 1995.
57. J. Wen, M. Luttrell, and J. Villasenor, "Simulation results on … adaptive quantization," ITU-T SG16/Q15 document Q15-D-40 (downloadable via ftp://standard.pictel.com/video-site), Apr. 1998.
58. J. Wen, M. Luttrell, and J. Villasenor, "Trellis-based R-D optimal quantization in H.263+," IEEE Transactions on Image Processing, 1998, submitted for publication.
59. N.S. Jayant and P. Noll, Digital Coding of Waveforms, Prentice-Hall, Englewood Cliffs, USA, 1984.
60. H. Gish and J.N. Pierce, "Asymptotically efficient quantizing," IEEE Transactions on Information Theory, vol. 14, pp. 676-683, Sept. 1968.
61. ITU-T SG16/Q15 (T. Gardos, ed.), "Video codec test model number 9 (TMN-9)," ITU-T SG16/Q15 document Q15-C-15 (downloadable via ftp://standard.pictel.com/video-site), Dec. 1997.
62. T. Wiegand and B.D. Andrews, "An improved H.263 coder using rate-distortion optimization," ITU-T SG16/Q15 document Q15-D-13 (downloadable via ftp://standard.pictel.com/video-site), Apr. 1998.
63. M. Gallant, G. Cote, and F. Kossentini, "Description of and results for rate-distortion-based coder," ITU-T SG16/Q15 document Q15-D-49 (downloadable via ftp://standard.pictel.com/video-site), Apr. 1998.