Digital Image Processing - COMP4173

Lecture 14: Image and Video Formats and Standards

Prof. Hongjian Shi (时红建)


Department of Computer Science and Technology
BNU-HKBU United International University
Email: [email protected]
Office: T3-601-R3;
Office Hours: Tues., Wed., Thur. 9:00-11:50; Wed. 16:00-16:50
DCT Based JPEG (JPEG Baseline)
• JPEG – Joint Photographic Experts Group
• The basic DCT-based JPEG standard:

• The standard specifies the procedure, the input, and the output,
but it does not prescribe a detailed implementation, so different
companies have different implementations.
• Each 8x8 block to be processed goes through four encoding steps:
Step 1: Discrete Cosine Transform (DCT) - Encoding
Step 2: Quantization Procedure - Encoding
Step 3: Coefficient-to-Symbol Mapping - Encoding
Step 4: Entropy Coding - Encoding

• Symbols are mostly encoded using Huffman coding.
• Huffman coding is a method of variable-length coding in which shorter
codewords are assigned to the more frequently occurring symbols.
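The Huffman idea above can be sketched in a few lines of Python. This is a minimal illustration that builds a code from the symbol frequencies of one short, made-up sequence. Baseline JPEG encoders normally work from Huffman tables rather than building a tree per block, and the function name `huffman_code` and the sample symbols are assumptions for illustration only.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from a sequence of symbols."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # Degenerate case: a single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: partial codeword}).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees; prefix their codewords.
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, counter, merged))
        counter += 1
    return heap[0][2]

# Hypothetical symbol stream: 0 occurs most often, 5 least often.
symbols = [0, 0, 0, 0, 0, 0, 1, 1, 1, -1, -1, 5]
code = huffman_code(symbols)
encoded = "".join(code[s] for s in symbols)
print(code, len(encoded))
```

In this example the most frequent symbol (0) receives a 1-bit codeword and the rarest symbols receive 3-bit codewords, so the 12 symbols take 21 bits in total.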
DCT-Based JPEG – cont.
• Once the receiver has the encoded JPEG file, decoding is the inverse
of the encoding process: entropy decoding, symbol-to-coefficient
mapping, inverse quantization, and inverse DCT.
• The decoder must include the inverse quantizer step.
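The encode/decode symmetry for one 8x8 block can be sketched as below: DCT and quantization on the encoder side, inverse quantization and inverse DCT on the decoder side. This is a simplified sketch and not the full standard. It assumes a single uniform quantization step (`STEP`, a hypothetical value) where JPEG uses an 8x8 quantization table, it uses the common level shift by 128, and it omits the symbol mapping and entropy coding stages.

```python
import math

N = 8  # JPEG baseline works on 8x8 blocks

def _alpha(k):
    # Normalization factors of the orthonormal DCT-II.
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

def _basis(n, k):
    return math.cos((2 * n + 1) * k * math.pi / (2 * N))

def dct2(block):
    """Forward 2-D DCT of an 8x8 block (list of lists)."""
    return [[_alpha(u) * _alpha(v) * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                                         for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def idct2(coeffs):
    """Inverse 2-D DCT, recovering the 8x8 block of samples."""
    return [[sum(_alpha(u) * _alpha(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]

def quantize(coeffs, step):
    # Lossy step: each coefficient is rounded to a multiple of `step`.
    return [[round(c / step) for c in row] for row in coeffs]

def dequantize(levels, step):
    # Inverse quantizer: scale the integer levels back up.
    return [[q * step for q in row] for row in levels]

STEP = 16  # assumed uniform step; real JPEG uses a per-coefficient table
block = [[128 + 4 * x + 2 * y for y in range(N)] for x in range(N)]
shifted = [[p - 128 for p in row] for row in block]

levels = quantize(dct2(shifted), STEP)              # encoder side
restored = [[v + 128 for v in row]
            for row in idct2(dequantize(levels, STEP))]  # decoder side

max_error = max(abs(restored[x][y] - block[x][y])
                for x in range(N) for y in range(N))
print(max_error)
```

Without the quantizer the round trip is exact up to floating-point error. With it, the reconstruction error is bounded by the step size, which is where the loss in lossy JPEG comes from.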


Progressive Videos vs. Interlaced Videos
• Video frame rates are typically 30 or 25 frames/second, while the limit of
the human eye for perceiving flicker in a scene is taken to be about 60 Hz.
• In an interlaced video, each frame consists of two fields. Whether the lines
are delivered as one full frame or as two fields determines if a video is
progressive or interlaced.

[Figure: progressive scan vs. interlaced scan]

Progressive Videos vs. Interlaced Videos – cont.
• A progressive video displays all the lines of a picture in every frame.
Progressive video requires a higher data rate than interlaced video but
avoids flickery display.
• An interlaced video carries half of the lines in one field and the other half
in the next field. The disadvantage is that a moving object may
appear distorted when the two fields are merged into a frame.
• Most DVD and media videos are interlaced videos.
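The field structure and the motion distortion described above can be illustrated with a small sketch. It assumes frames are lists of scan lines, with even-numbered lines forming one field and odd-numbered lines the other. The helper names `split_fields` and `weave` are hypothetical.

```python
def split_fields(frame):
    """Split a frame (list of scan lines) into top (even) and bottom (odd) fields."""
    return frame[0::2], frame[1::2]

def weave(top, bottom):
    """Merge two fields back into one frame by interleaving their lines."""
    frame = []
    for t, b in zip(top, bottom):
        frame.append(t)
        frame.append(b)
    return frame

# A static frame survives the split/weave round trip unchanged.
static = [[row * 10 + col for col in range(6)] for row in range(4)]
round_trip = weave(*split_fields(static))

# A vertical bar that moves between the two field capture times.
frame_t0 = [[0, 1, 0, 0, 0, 0] for _ in range(4)]  # bar at column 1
frame_t1 = [[0, 0, 0, 1, 0, 0] for _ in range(4)]  # bar at column 3
top, _ = split_fields(frame_t0)      # top field captured at time t0
_, bottom = split_fields(frame_t1)   # bottom field captured at time t1
combed = weave(top, bottom)
for line in combed:
    print(line)
```

In the woven frame the bar alternates between column 1 and column 3 on successive lines. This comb pattern is the distortion seen on moving objects when two fields from different instants are merged.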



Image and Video Compression Standards
• ISO – International Organization for Standardization
• IEC – International Electrotechnical Commission
• ITU-T – International Telecommunication Union, Telecommunication Standardization
Sector (a United Nations (UN) organization)
• CCITT – International Telegraph and Telephone Consultative Committee
• SMPTE – Society of Motion Picture and Television Engineers
• MII – Chinese Ministry of Information Industry
• NTSC – National Television System Committee (US and North America standards,
30 frames/s broadcast)
• PAL – Phase Alternating Line (Asia and Europe except France, 25 frames/s broadcast)
• SECAM – Séquentiel Couleur à Mémoire (France and Russia, 25 frames/s broadcast)
• JBIG – Joint Bi-level Image Experts Group
• JPEG – Joint Photographic Experts Group
• MPEG – Moving Picture Experts Group
Video Coding Standards
Digital Video Processing
• A video signal is basically any sequence of time-varying images.
• In a digital video, the picture information is digitized both spatially
and temporally, and the resultant pixel intensities are quantized.

Spatial Sampling
Each image is sampled on a grid of pixels. In the digital representation of the image,
the value of each pixel needs to be quantized using some finite precision. In practice,
8 bits are used per luminance sample.
Temporal Sampling
A video consists of a sequence of images, displayed in rapid succession, to give an
illusion of continuous motion. If the time gap between successive frames is too large,
the viewer will observe jerky motion. In practice, most video formats use temporal
sampling rates of 24 frames per second or above.
The Need for Compression
• A typical television frame is 720x576. There are three basic formats to be used:

[Table not recoverable; surviving labels: 4:2:0, NTSC, PAL/SECAM]
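A quick calculation shows why compression is needed. The sketch below estimates the uncompressed bit rate of a 720x576 frame at 25 frames/s with 8 bits per sample. The three chroma sampling schemes used here (4:4:4, 4:2:2, 4:2:0) and their samples-per-pixel factors are assumptions, since the slide's own table did not survive. The function name `raw_bitrate` is hypothetical.

```python
# Average number of samples per pixel (Y plus subsampled Cb and Cr).
SAMPLES_PER_PIXEL = {"4:4:4": 3.0, "4:2:2": 2.0, "4:2:0": 1.5}

def raw_bitrate(width, height, fps, chroma="4:2:0", bits_per_sample=8):
    """Uncompressed video bit rate in bits per second."""
    return width * height * SAMPLES_PER_PIXEL[chroma] * bits_per_sample * fps

for chroma in ("4:4:4", "4:2:2", "4:2:0"):
    mbps = raw_bitrate(720, 576, 25, chroma) / 1e6
    print(f"{chroma}: {mbps:.3f} Mbit/s")
```

Even with 4:2:0 subsampling, the raw stream is about 124 Mbit/s, far above what broadcast or disc formats can carry without compression.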
Video Format
• A digital video consists of frames that are displayed at a prescribed frame rate.
For example, a frame rate of 30 frames/sec is used in NTSC (National Television
System Committee) video.
• The Common Intermediate Format (CIF) has 352 x 288 pixels, and the
Quarter CIF (QCIF) format has 176 x 144 pixels.
• Pixel resolution (pixels per line × number of lines)
• SD Resolution: 640 × 480 (480p)
• HD Resolution: 1280 × 720 (720p) / 1920 × 1080 (1080p)
Video Format – cont.
• Each pixel is represented by three components: the luminance component Y, and
the two chrominance components Cb and Cr
• RGB to YCbCr Conversion

• Video quality is commonly evaluated using the PSNR of the luminance (Y) channel,
which is referred to as the Y-PSNR (dB)
• Conversion of RGB to YUV:
Y= 0.3*R + 0.59*G + 0.11*B
U= (B-Y) * 0.493
V= (R-Y) * 0.877
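The conversion formulas and the Y-PSNR measure can be combined in one short sketch. The conversion uses the coefficients given above. The PSNR uses the usual definition 10·log10(peak² / MSE) with a peak of 255 for 8-bit samples. The function names and the pixel-list representation (lists of (R, G, B) tuples) are assumptions for illustration.

```python
import math

def rgb_to_yuv(r, g, b):
    """RGB to YUV using the coefficients from the slide."""
    y = 0.3 * r + 0.59 * g + 0.11 * b
    u = (b - y) * 0.493
    v = (r - y) * 0.877
    return y, u, v

def y_psnr(ref_pixels, test_pixels, peak=255.0):
    """PSNR in dB computed on the luminance (Y) channel only."""
    squared_error = 0.0
    for ref, test in zip(ref_pixels, test_pixels):
        diff = rgb_to_yuv(*ref)[0] - rgb_to_yuv(*test)[0]
        squared_error += diff * diff
    mse = squared_error / len(ref_pixels)
    if mse == 0:
        return math.inf  # identical luminance: PSNR is unbounded
    return 10 * math.log10(peak * peak / mse)

reference = [(100, 100, 100)] * 4
distorted = [(101, 101, 101)] * 4   # every pixel is off by one gray level
print(rgb_to_yuv(255, 255, 255))
print(y_psnr(reference, distorted))
```

For gray pixels (R = G = B) the chrominance components U and V are zero, and a uniform luminance error of one gray level gives a Y-PSNR of about 48.13 dB.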
Video Frame Types (in MPEG standard)
• The three types of video frames are I-frame, P-frame and B-frame. ‘I’ stands for Intra
coded frame, ‘P’ stands for Predictive frame and ‘B’ stands for Bidirectional
predictive frame.
• ‘I’ frames (also called key frames) are encoded without any motion compensation
and are used as a reference for future predicted ‘P’ and ‘B’ type frames. ‘I’ frames,
however, require a relatively large number of bits for encoding. Using more ‘I’ frames
gives better quality of the video sequence but needs more storage space.
• ‘P’ frames are encoded using motion compensated prediction from a reference
frame, which can be either an ‘I’ or a ‘P’ frame. ‘P’ frames are more efficient in terms of
number of bits required compared to ‘I’ frames, but still require more bits than ‘B’
frames. ‘P’ frames contain only changes from the preceding reference frame.
• ‘B’ frames require the lowest number of bits compared to both ‘I’ and ‘P’ frames
but incur additional computational complexity. B-frames are predicted from
reference frames both preceding and following them.
• ‘P’ frames and ‘B’ frames are called delta frames or derived frames.
Video Coding

Group of pictures (GOP)

• Intraframe coding (I-coding):
Removing the spatial redundancy within a frame is generally termed intraframe
coding. The spatial redundancy within a frame is minimized by using a transform.
The commonly used transform is the DCT, the same as in JPEG compression.
• Interframe coding (P- and B-coding):
The temporal redundancy between successive frames is removed by interframe
coding. Interframe coding exploits the interdependencies of video frames.
Interframe coding relies on the fact that adjacent pictures in a video sequence have
high temporal correlation
Video Coding – cont.
• Intra (I-coding)
• Each MB (macroblock) is encoded as is, without motion compensation
• DCT followed by Q (quantization), zig-zag scan, run-length and Huffman coding
• Encoded without referencing other frames
• All MBs are intra coded
• Inter (P- and B-coding)
• Block matching is used for motion estimation
• The prediction residue from the best-match block is DCT encoded
• The motion vector is differentially encoded
• Encoded by referencing other frames
• Some MBs are intra coded, and some are inter coded
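The block-matching step of inter coding can be sketched as an exhaustive search that minimizes the sum of absolute differences (SAD). The slide does not name a matching criterion or search strategy, so SAD and full search are assumptions here, as are the function names and the tiny 8x8 test frames.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between a current block and a reference block."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))

def best_match(cur, ref, cx, cy, size, search):
    """Full search in a +/-search window; returns (cost, dx, dy) of the best match."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate block falls outside the reference frame
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best

def residual(cur, ref, cx, cy, dx, dy, size):
    """Prediction residue: current block minus the motion-compensated block."""
    return [[cur[cy + j][cx + i] - ref[cy + dy + j][cx + dx + i]
             for i in range(size)] for j in range(size)]

# Reference frame with a distinct value per pixel, so the match is unambiguous.
ref = [[y * 8 + x for x in range(8)] for y in range(8)]
# Current frame: the reference content moved 2 pixels right and 1 pixel down.
cur = [[ref[y - 1][x - 2] if y >= 1 and x >= 2 else 0 for x in range(8)]
       for y in range(8)]

cost, dx, dy = best_match(cur, ref, cx=4, cy=4, size=2, search=2)
res = residual(cur, ref, 4, 4, dx, dy, 2)
print(cost, (dx, dy), res)
```

Here the search finds the motion vector (-2, -1) with zero cost, so the residue that would go on to the DCT stage is all zeros. In real video the residue is small but nonzero, and only the residue and the motion vector need to be coded.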
Video Coding – cont.
[Figure: intra coding block diagram, including the quantization and entropy coding stages]

Group of Pictures (GOP)

Typical GOP in MPEG2
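A GOP structure can be sketched as a string of frame types. The example below assumes a GOP length of 12 with a reference frame (I or P) every 3 frames, a pattern often quoted for MPEG-2. The slide's own figure is not available, so these values and the function names are assumptions. The second function shows why B-frames change the transmission order: each B-frame needs the following reference frame to be decoded first.

```python
def gop_pattern(length, anchor_spacing):
    """Frame types in display order: I first, then P every anchor_spacing frames."""
    return "".join(
        "I" if i == 0 else ("P" if i % anchor_spacing == 0 else "B")
        for i in range(length)
    )

def coding_order(pattern):
    """Display indices reordered so each reference frame precedes the B-frames before it."""
    order, pending_b = [], []
    for index, frame_type in enumerate(pattern):
        if frame_type == "B":
            pending_b.append(index)   # wait for the next reference frame
        else:
            order.append(index)       # reference frame (I or P) goes first
            order.extend(pending_b)
            pending_b = []
    # Trailing B-frames would really wait for the next GOP's I-frame;
    # they are simply appended here to keep the sketch to a single GOP.
    order.extend(pending_b)
    return order

pattern = gop_pattern(12, 3)
print(pattern)
print(coding_order(pattern))
```

The display-order pattern is IBBPBBPBBPBB, while the coding order starts 0, 3, 1, 2, 6, 4, 5: each P-frame is sent ahead of the two B-frames that precede it on screen.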


Video Sequence and Picture
