VIDEO FORMATS AND MPEG COMPRESSION
N. C. State University CSC557 Multimedia Computing and Networking Fall 2001 Lecture # 15
Copyright 2001 Douglas S. Reeves (http://reeves.csc.ncsu.edu)
Video Standards
Broadcast TV (analog)
VCR (analog)
Film (analog)
Computer video
Digital TV, HDTV
Digital (compressed) video
Frame Rates
Sampling rates must be high enough to avoid motion "aliasing"
At least 50 frames/sec needed in the ideal case
Few video standards support this high a rate
Raster Scanning Order
Interlaced Video
Display of alternating, interleaved fields at 2X the frame rate
The net bandwidth, or number of bits being transmitted, remains the same
The fields are taken at half-frame intervals; vertical lines don't line up exactly!
[Figure: timeline of Frames 1-3, each split into an even field and an odd field]
"Fools" the eye into thinking the sampling rate is greater than the frame rate
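A minimal sketch of how a frame splits into two interleaved fields, assuming a frame is simply a list of scan lines. The names split_fields and weave are made up for illustration and do not come from any video standard.

```python
def split_fields(frame):
    """Split a frame (a list of scan lines) into its even and odd fields."""
    even = frame[0::2]   # lines 0, 2, 4, ...
    odd = frame[1::2]    # lines 1, 3, 5, ...
    return even, odd


def weave(even, odd):
    """Re-interleave two fields into one frame (valid only if nothing moved)."""
    frame = []
    for e, o in zip(even, odd):
        frame.extend([e, o])
    if len(even) > len(odd):     # odd number of lines: one even line is left over
        frame.append(even[-1])
    return frame


frame = ["line %d" % k for k in range(6)]
even, odd = split_fields(frame)
assert weave(even, odd) == frame
```

Each field carries half the lines, so sending fields at twice the frame rate costs the same number of bits as sending whole frames.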
Video Standards
Each entry lists: digital or analog; color; VxH resolution; aspect ratio; frame rate (fps); interlaced?; comments. Fields not given are omitted.
TV (NTSC standard): analog; YIQ; 525 lines (40 not displayed); aspect ratio 4x3 (1.33); 29.97 fps; interlaced; used in North America and Japan
TV (PAL standard): analog; YUV; 625 lines; 25 fps; interlaced; used in Europe
VHS: analog; 240 lines
Film (motion pictures): analog; subtractive color; aspect ratio 1.85; 24 fps; not interlaced
VGA (computer video): digital; RGB; 640x480 to 1600x1200; aspect ratio 4:3 (1.33); (70-80 fps); not interlaced
CCIR-601 (Digital Component Video): SIF: digital; YCbCr; 352x240; 29.97 fps; ITU-R standard, many resolutions supported. SIF is typical size used in MPEG-I
HDTV: digital; 1920x1080; aspect ratio 16x9 (1.78); 30 fps; interlaced; many competing proposals exist
Issues in Converting Between Standards
Differences in pixel aspect ratios (square vs. non-square)
Differences in screen resolution and aspect ratio
Interlaced vs. non-interlaced
Differences in frame rates
E.g., speed up film by 4% for 25 fps TV
E.g., use 3:2 pulldown to convert 2 frames at 24 fps into 5 fields (= 2.5 frames) at 30 fps
Differences in color systems
Differences in chroma sub-sampling
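The 3:2 pulldown arithmetic can be sketched as follows. This is an illustration only: the function name pulldown_32 and the "top"/"bottom" field labels are assumptions, and real converters also have to deal with field order and audio sync.

```python
def pulldown_32(film_frames):
    """Map film frames to video fields, alternating 3 fields and 2 fields per frame."""
    fields = []
    for k, frame in enumerate(film_frames):
        repeat = 3 if k % 2 == 0 else 2
        for _ in range(repeat):
            # Field parity simply alternates along the output stream.
            parity = "top" if len(fields) % 2 == 0 else "bottom"
            fields.append((frame, parity))
    return fields


fields = pulldown_32(["A", "B"])
assert len(fields) == 5                       # 2 film frames -> 5 fields
assert len(pulldown_32(list(range(24)))) == 60  # 24 frames -> 60 fields = 30 frames
```

One second of film (24 frames) becomes 60 fields, which is exactly one second of 30 frames/sec interlaced video.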
Video Data Requirements (example)
30 frames / second * 640x480 pixels / frame (VGA) * 3 bytes / pixel = 26.4 MB/second! (211 Mb/s), counting 1 MB as 2^20 bytes
Or, 92 GB for a one-hour video!
Many computer components are too slow/small for this
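The arithmetic for uncompressed video can be checked with a few lines of code. The function name raw_video_bytes_per_second is made up for this sketch, and the unit conversions assume binary prefixes (1 MB = 2^20 bytes, 1 GB = 2^30 bytes), which is what reproduces the figures on the slide.

```python
def raw_video_bytes_per_second(width, height, bytes_per_pixel, fps):
    """Uncompressed data rate of a video stream, in bytes per second."""
    return width * height * bytes_per_pixel * fps


rate = raw_video_bytes_per_second(640, 480, 3, 30)
mbytes_per_sec = rate / 2**20          # megabytes per second
mbits_per_sec = rate * 8 / 2**20       # megabits per second
gbytes_per_hour = rate * 3600 / 2**30  # gigabytes for one hour

print(round(mbytes_per_sec, 1), round(mbits_per_sec, 1), round(gbytes_per_hour, 1))
```

With decimal prefixes (1 MB = 10^6 bytes) the same stream works out to somewhat larger numbers, so the choice of unit matters when comparing against component specifications.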
MPEG Compression Standard
An international (open) standard
MPEG = Moving Picture Experts Group
Best quality / compression combination
Asymmetric
Decoding is not too hard: can be done in software
Encoding requires a lot of computation: typically done in hardware
MPEG standard specifies syntax and semantics of the bit stream
not the method of compression!
MPEG Video Standards
MPEG-I
Targeted for 1.2-1.5 Mb/s
352x240, 30 frames / second
"VHS quality"
MPEG-II
Targeted for 4-15 Mb/s
Up to 1920x1152, 30 frames/sec
Interlaced video
"TV and HDTV quality"
MPEG-IV
Very low bandwidth: 10-64 kb/sec
"Videoconference quality"
Basic Idea
Use spatial compression on some frames (i.e., JPEG)
Add temporal compression (motion compensation) as well
Improves compression ratio by 2-4X
Macro Blocks
Macro block = 16x16 pixel region
Four 8x8 blocks of Y
One 8x8 block (subsampled) of Cb
One 8x8 block (subsampled) of Cr
Macro block is basic unit of temporal coding
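A minimal sketch of how one 16x16 region turns into six 8x8 blocks. It assumes the Y, Cb, and Cr planes are given at full resolution as 16x16 lists of lists, and that chroma is subsampled by averaging each 2x2 neighborhood; the averaging filter and the name split_macroblock are assumptions made for illustration.

```python
def split_macroblock(y, cb, cr):
    """Return [Y0, Y1, Y2, Y3, Cb, Cr], each an 8x8 list of lists."""

    def block(plane, r0, c0):
        # Cut an 8x8 block whose top-left corner is at (r0, c0).
        return [row[c0:c0 + 8] for row in plane[r0:r0 + 8]]

    def subsample(plane):
        # Halve the resolution in both directions by averaging 2x2 groups.
        return [[(plane[2 * r][2 * c] + plane[2 * r][2 * c + 1] +
                  plane[2 * r + 1][2 * c] + plane[2 * r + 1][2 * c + 1]) // 4
                 for c in range(8)]
                for r in range(8)]

    y_blocks = [block(y, r0, c0) for r0 in (0, 8) for c0 in (0, 8)]
    return y_blocks + [subsample(cb), subsample(cr)]


y = [[r * 16 + c for c in range(16)] for r in range(16)]
cb = [[100] * 16 for _ in range(16)]
cr = [[200] * 16 for _ in range(16)]
blocks = split_macroblock(y, cb, cr)
assert len(blocks) == 6
```

The six blocks hold 384 samples in place of the 768 in the full-resolution region, before any further compression is applied.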
Macroblocks
Temporal Encoding
Search forward or backward (or both) in time for a "similar" macro block
Interpolated match: average of forward and backward blocks
Temporal Encoding
Search a spatial region for a similar macro block
Searching For Similar Macroblocks
Search method not specified by standard
Can search on full pixel or half-pixel boundaries
Interpolated images are effectively low-pass filtered
How to define "best" match?
Minimize absolute difference =
$$\sum_{i=0}^{15} \sum_{j=0}^{15} \left| V_n(x+i,\; y+j) - V_m(x+\Delta x+i,\; y+\Delta y+j) \right|$$
Where V_n is the macroblock, V_m is the matching macroblock (V_m is the compressed/decompressed representation of the matching macroblock, i.e., the prediction algorithm is feedforward)
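The matching criterion can be written directly as code. This is a sketch under stated assumptions: frames are lists of rows indexed as frame[row][column], (x, y) is the top-left corner of the current macroblock, (dx, dy) is the candidate displacement, and the name sad is chosen here for illustration.

```python
def sad(current, reference, x, y, dx, dy, size=16):
    """Sum of absolute differences between a macroblock and a displaced candidate."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(current[y + j][x + i] -
                         reference[y + dy + j][x + dx + i])
    return total


# Synthetic test data: the current frame is the reference shifted by (3, 2).
reference = [[(r * 7 + c * 13) % 256 for c in range(40)] for r in range(40)]
current = [[reference[r + 2][c + 3] for c in range(32)] for r in range(32)]

assert sad(current, reference, 4, 4, 3, 2) == 0   # perfect match at the true offset
assert sad(current, reference, 4, 4, 0, 0) > 0    # no match at zero displacement
```

An encoder would evaluate this cost for many candidate displacements and keep the smallest; how the candidates are chosen is the search strategy.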
Size Of Search Region
How big is big enough?
More separation in time means a greater amount of motion is possible
Motion Vector Encoding
Motion vector = X, Y (i.e., offset or displacement) to best match
Sign is always relative to the current picture
Number of vectors needed
1 for forward or backward prediction
2 for bi-directional (interpolated) prediction
Then, differentially encode relative to the motion vectors of the previous (adjacent) macroblock
Entropy code the resulting differentially-coded vectors
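Differential coding of motion vectors can be sketched as below. The function names are hypothetical, and resetting the predictor to (0, 0) at the start of the list is an assumption made for this example; the entropy-coding step is not shown.

```python
def differential_encode(vectors):
    """Replace each (dx, dy) by its difference from the previous vector."""
    prev = (0, 0)          # assumed predictor for the first macroblock
    diffs = []
    for dx, dy in vectors:
        diffs.append((dx - prev[0], dy - prev[1]))
        prev = (dx, dy)
    return diffs


def differential_decode(diffs):
    """Invert differential_encode by accumulating the differences."""
    prev = (0, 0)
    vectors = []
    for ddx, ddy in diffs:
        prev = (prev[0] + ddx, prev[1] + ddy)
        vectors.append(prev)
    return vectors


vectors = [(3, 2), (4, 2), (4, 1)]
diffs = differential_encode(vectors)
assert differential_decode(diffs) == vectors
```

Neighboring macroblocks tend to move together, so the differences cluster near zero and entropy code more compactly than the raw vectors.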
Effects Of Scene Changes
Temporal compression not nearly as effective after scene changes
Temporal Results
If closest match is "close enough", just output the motion vector
If not close enough, then spatially code (JPEG) all 6 blocks (i.e., no temporal compression possible)
But if almost close enough...
"Almost" Close Enough
Output motion vector
Create 6-bit block code
For each block...
Subtract from the reference (matched) block
Spatially compress the difference block
If the result is completely zero, clear the block bit and skip the block
Otherwise, set the block bit and output the (spatially compressed) difference block
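The 6-bit block code can be sketched as follows. The input is assumed to be the six difference blocks after spatial compression (quantized coefficients as 8x8 lists); the name block_code and the bit order (first block in the most significant bit) are assumptions for illustration, not taken from the standard.

```python
def block_code(quantized_diff_blocks):
    """Return (6-bit pattern, list of blocks that actually need to be sent)."""
    pattern = 0
    coded = []
    for blk in quantized_diff_blocks:
        pattern <<= 1
        if any(v != 0 for row in blk for v in row):
            pattern |= 1        # set the block bit: this block is transmitted
            coded.append(blk)
        # otherwise the bit stays clear and the block is skipped
    return pattern, coded


zero = [[0] * 8 for _ in range(8)]
nonzero = [[0] * 8 for _ in range(8)]
nonzero[0][0] = 5

pattern, coded = block_code([nonzero, zero, zero, zero, nonzero, zero])
assert pattern == 0b100010
assert len(coded) == 2
```

Only the blocks whose bit is set follow the pattern in the output, so a macroblock whose prediction error is mostly zero costs very few bits.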
Quantization
Can tolerate higher quantization (i.e., greater loss of quality) for difference blocks than for I blocks
Possible to adapt the quantizer scale to video content
Q factor can be part of the macroblock
Default quantization matrices for predicted and nonpredicted blocks
Can be overridden for entire video, or frame-by-frame
Default "non-predicted block" quantization matrix
 8 16 19 22 26 27 29 34
16 16 22 24 27 29 34 37
19 22 26 27 29 34 34 38
22 22 26 27 29 34 37 40
22 26 27 29 32 35 40 48
26 27 29 32 35 40 48 58
26 27 29 34 38 46 56 69
27 29 35 38 46 56 69 83
Default "predicted block" quantization matrix
16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16
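A simplified sketch of how a quantization matrix and a quantizer scale combine. It assumes the common form in which each coefficient is divided by (matrix entry * scale / 16) and truncated toward zero; the standard's exact rounding rules and its separate treatment of the DC coefficient are not reproduced here, and the function names are made up.

```python
FLAT_16 = [[16] * 8 for _ in range(8)]   # the flat "predicted block" matrix


def quantize(coeffs, matrix, scale):
    """Divide each DCT coefficient by its step size and truncate toward zero."""
    return [[int(16 * coeffs[r][c] / (matrix[r][c] * scale))
             for c in range(8)]
            for r in range(8)]


def dequantize(levels, matrix, scale):
    """Approximate reconstruction: multiply each level by its step size."""
    return [[levels[r][c] * matrix[r][c] * scale // 16
             for c in range(8)]
            for r in range(8)]


coeffs = [[40] * 8 for _ in range(8)]
levels = quantize(coeffs, FLAT_16, 2)
assert levels[0][0] == 20
assert dequantize(levels, FLAT_16, 2)[0][0] == 40

small = [[3] * 8 for _ in range(8)]
assert quantize(small, FLAT_16, 8)[0][0] == 0   # small values vanish at coarse scales
```

Raising the scale makes more coefficients collapse to zero, which is how an encoder trades quality for bit rate from one macroblock to the next.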
Run-length and Huffman Coding
Zigzag ordering: like JPEG
Run-length coding: like JPEG
Huffman coding: just encode a subset of the possible inputs, consisting of the most probable symbols
Escape sequence used to represent remaining (infrequent) symbols in unencoded form
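The zigzag scan and run-length step can be sketched as below. The (run, value) pair format and the "EOB" end-of-block marker are simplifications chosen for this example; the actual code tables and escape sequences are not shown.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):                 # s = row + col of each anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            diag.reverse()                     # even diagonals run bottom-left to top-right
        order.extend(diag)
    return order


def run_length(block):
    """Encode a block as (zero_run, value) pairs followed by an end-of-block marker."""
    seq = [block[r][c] for r, c in zigzag_order(len(block))]
    pairs, run = [], 0
    for v in seq:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append("EOB")                        # trailing zeros are implied
    return pairs


block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[2][0] = 5, 3, -1
assert run_length(block) == [(0, 5), (0, 3), (1, -1), "EOB"]
```

After quantization most high-frequency coefficients are zero, so the zigzag scan groups them into one long trailing run that the end-of-block marker replaces.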
Types Of Macroblocks
I macro blocks
Spatial information only
Temporal macro blocks
Motion vector(s)
Block code
Spatially-coded difference blocks
"Skip" macro block (no change)
Types Of Frames
I frames
Spatial macro blocks only
Serves as "anchor" frame
P frames
Spatial, forward motion, and skip macro blocks
Can refer backwards to previous I and P frames
Can serve as "anchor" frame
Frames (Cont.)
B frames
Spatial, forward/backward/interpolated motion, and skip macro blocks
Can refer backwards or forwards to I and P frames
Never used as "anchor" frame
Depends on a "future" frame which must be received and decoded first
Anchor frames are transmitted before the frames that reference them
Reordering required at decoder
Typical size ratios of I:P:B frames = 5:3:1
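The reordering between display order and transmission order can be sketched as follows. The function name transmission_order is hypothetical; frames are represented only by their type letter and display index.

```python
def transmission_order(display_types):
    """Move each anchor (I or P) ahead of the B frames that depend on it."""
    out = []
    pending_b = []
    for idx, ftype in enumerate(display_types):
        if ftype == "B":
            pending_b.append((idx, ftype))   # must wait for the next anchor
        else:
            out.append((idx, ftype))         # send the anchor first
            out.extend(pending_b)            # then the B frames displayed before it
            pending_b = []
    out.extend(pending_b)                    # B frames whose future anchor is not in this list
    return out


sent = transmission_order("IBBPBBP")
assert [idx for idx, _ in sent] == [0, 3, 1, 2, 6, 4, 5]
```

The decoder reverses this: it holds each decoded anchor until the B frames that precede it in display order have been shown.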
Group Of Pictures (GOP)
Repetitive Sequence of Frame Types
(IB^m(PB^m)^n)^o
Examples
I I I ....
IPP IPP IPP ...
Common: IBBPBBPBB IBBPBBPBB IBBPBBPBB ...
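The pattern notation expands mechanically into a frame-type string. In this sketch the function names are made up, m is the number of B frames after each anchor, n is the number of P frames per GOP, and the size weights use the typical 5:3:1 ratio of I:P:B frame sizes quoted in these notes.

```python
def gop_pattern(m, n):
    """Expand I B^m (P B^m)^n into a string of frame types for one GOP."""
    return "I" + "B" * m + ("P" + "B" * m) * n


def relative_size(pattern, weights=None):
    """Total relative size of a GOP using typical I:P:B size ratios."""
    weights = weights or {"I": 5, "P": 3, "B": 1}
    return sum(weights[t] for t in pattern)


assert gop_pattern(0, 0) == "I"
assert gop_pattern(0, 2) == "IPP"
assert gop_pattern(2, 2) == "IBBPBBPBB"
assert relative_size("IBBPBBPBB") == 17     # versus 45 for nine I frames
```

Under these typical ratios the common IBBPBBPBB pattern needs well under half the bits of an all-I sequence of the same length.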
GOP Sequence (example)
GOP Tradeoffs
How much compression desired?
More anchor frames = larger file size
How much buffer space required?
More anchor frames = less reordering buffer space needed
How much compression / decode delay is tolerable?
More anchor frames = less delay
What degree of random access desired?
More anchor frames = easier and faster random access, with less memory required
How much error recovery is needed?
More anchor frames = less sensitive to errors
MPEG Systems
The size of the decoder buffer is specified
An encoder constraint is that the decoder buffers must never overflow or underflow
MPEG Layers
Sequence: picture size, picture rate, buffer sizes, quantization matrices, GOP pattern
GOP: SMPTE time code
Picture (frame): frame type, motion vector range for frame
Slice: unit of resynchronization, collection of consecutive macroblocks preceded by resynchronization code
Macroblock: unit of motion compensation (16x16)
Block: unit of spatial coding (8x8)
Levels and Profiles
MPEG I constrained parameters (reduce possible combinations):
# columns <= 720, # rows <= 576
# macroblocks / frame <= 396
# macroblocks / sec <= 396 * 25 = 330 * 30
Frame rate <= 30 / sec
Bit rate <= 1.86 Mb/s
Decoder buffer <= 368 Kb
MPEG-2
Level = resolution
Profile = features used
Levels and Profiles (cont.)
Levels
Low: 352H x 288V
Main: either 720H x 576V x 25 fps, or 720H x 480V x 29.97 fps
High: up to 1920H x 1152V
Profiles
Simple: no B pictures allowed
Main: features we have discussed
SNR: layered coder (low quality picture and quality enhancement layer both encoded)
Spatial: low resolution and resolution enhancement layer both encoded
High: all of the above, plus a low frame rate and frame rate enhancement layer both encoded
Compression Example
Exhaustive Strategy
Check every possible match within some spatial region
Search Strategies
Not part of the standard, but the most computationally expensive part of compression
One possibility (expensive): exhaustive search
Example: offset from -15 to +15 in X and Y directions
Approximately 3600 possible half-pixel matches
3600 * 256 subtracts / match * # predicted macroblocks / frame * # predicted frames / sec = around 7 billion operations / second, just for matching
More for bidirectionally-predicted!
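The operation count can be reproduced with a short calculation. The slide does not give the macroblock count or the number of predicted frames per second, so the values 330 macroblocks per frame (a 352x240 picture) and 25 predicted frames per second used here are assumptions, as are the function names.

```python
def candidate_count(search_range=15, half_pixel=True):
    """Number of candidate displacements in a square search window."""
    steps_per_pixel = 2 if half_pixel else 1
    per_axis = 2 * search_range * steps_per_pixel + 1   # -15 ... +15 in half-pixel steps
    return per_axis * per_axis


def matching_ops_per_second(macroblocks_per_frame=330, predicted_frames_per_sec=25,
                            mb_size=16, search_range=15, half_pixel=True):
    """Subtractions per second for exhaustive block matching."""
    per_match = mb_size * mb_size                       # 256 subtracts per candidate
    return (candidate_count(search_range, half_pixel) * per_match *
            macroblocks_per_frame * predicted_frames_per_sec)


assert candidate_count() == 3721          # roughly the 3600 quoted above
ops = matching_ops_per_second()
assert 7e9 < ops < 8e9                    # several billion operations per second
```

Bidirectional prediction searches two reference frames, so the count for B frames is roughly double under the same assumptions.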
Pyramid Search
Form lower and lower resolution versions of the frames to match and be matched, using low-pass filtering and subsampling
Compare lowest-resolution versions first to find best match
Then, do search within this region at a higher resolution to find best match
Etc., until regions are 16x16 macroblocks and best match is found
Logarithmic Search
One method: search corners and center of a diagonal square
If a corner is the best match, shift diagonal square to be centered on that, and repeat
If the center is the best match, do a local exhaustive search around the center
Another method: search boundary and center of a large square
If a boundary point is the best match, shift the square to center on that point, reduce the square size, and repeat
If the center is the best match, do a local exhaustive search
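A rough sketch of the second method (search the boundary and center of a square, then shrink). Several details are assumptions made for this example: the square is sampled at its 8 boundary points plus the center, the initial step is 4 pixels, ties go to the center, the final local exhaustive search covers only the 3x3 neighborhood, and the names sad and log_search are hypothetical.

```python
def sad(cur, ref, x, y, dx, dy, size=16):
    """Sum of absolute differences; infinite if the candidate leaves the frame."""
    rx, ry = x + dx, y + dy
    if rx < 0 or ry < 0 or ry + size > len(ref) or rx + size > len(ref[0]):
        return float("inf")
    return sum(abs(cur[y + j][x + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))


def log_search(cur, ref, x, y, step=4, size=16):
    """Return (cost, dx, dy) found by a shrinking-square search."""

    def best_around(center, s):
        cands = [(center[0] + i * s, center[1] + j * s)
                 for j in (-1, 0, 1) for i in (-1, 0, 1)]
        # Lowest cost wins; on ties the center is preferred.
        return min(cands, key=lambda d: (sad(cur, ref, x, y, d[0], d[1], size),
                                         d != center))

    center = (0, 0)
    while step > 1:
        best = best_around(center, step)
        if best == center:
            break                     # center is best: go to the local search
        center = best                 # shift the square and shrink it
        step //= 2
    best = best_around(center, 1)     # local exhaustive search (3x3 here)
    return sad(cur, ref, x, y, best[0], best[1], size), best[0], best[1]


frame = [[(r * 7 + c * 13) % 256 for c in range(48)] for r in range(48)]
assert log_search(frame, frame, 16, 16) == (0, 0, 0)
```

This evaluates a few dozen candidates at most, compared with thousands for the exhaustive search, at the risk of settling on a local rather than the global minimum.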
Decompression
Much faster; no searching or matching required
If motion vector present
Use tables to undo entropy coding of motion vector
Undo differential encoding of motion vector
Read reference block(s) from buffer
If quantizer present
Scale the quantization matrix
Decompression (Cont.)
If block code present
Use tables to entropy decode coefficients
Undo RLE
Undo zigzag
Undo quantization
Undo DCT
Combine differential and reference blocks
Decompression (Cont.)
Combine 6 blocks into a macro block (upsampling the subsampled Cb and Cr blocks)
Combine macroblocks into an image
Convert YCbCr to RGB for display
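The last two steps can be sketched as below. Two assumptions are made: chroma is upsampled by simple pixel replication, and the color conversion uses the full-range (JPEG/JFIF-style) form of the ITU-R BT.601 equations. Video streams that use the limited 16-235 range need an extra scaling step that is not shown, and the function names are hypothetical.

```python
def upsample_chroma(block):
    """Double an 8x8 chroma block to 16x16 by repeating each sample 2x2."""
    rows, cols = len(block), len(block[0])
    return [[block[r // 2][c // 2] for c in range(2 * cols)]
            for r in range(2 * rows)]


def ycbcr_to_rgb(y, cb, cr):
    """Convert one full-range YCbCr sample to an 8-bit RGB triple."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)

    def clamp(v):
        return max(0, min(255, int(round(v))))

    return clamp(r), clamp(g), clamp(b)


assert ycbcr_to_rgb(128, 128, 128) == (128, 128, 128)   # mid gray stays gray
assert ycbcr_to_rgb(255, 128, 128) == (255, 255, 255)   # white
up = upsample_chroma([[1, 2], [3, 4]])
assert up == [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

Replication is the cheapest upsampling choice; a decoder may interpolate between chroma samples instead for smoother color edges.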
H.261
Standard for videoconferencing
Much lower bit rates / quality than MPEG
Also called px64, p = 1..30
YCbCr color coding
288 lines by 352 pixels (CIF) or 144 by 176 (QCIF)
H.261 (cont.)
Macro blocks like MPEG
Intraframe coding like JPEG
Interframe coding like P frames from MPEG