1. Three-Step Search (TSS)
TSS is a fast block-matching method for motion estimation in video compression. Instead of
testing every position in the search window, it tests a coarse grid of candidate positions
and refines around the best one in three steps.
How TSS Works:
1. Start at the center of the search area.
2. Step 1: Compare the center with 8 points around it at a large step size. Choose the best
match.
3. Step 2: Halve the step size and check 8 new points around the best match.
4. Step 3: Halve the step size again and check 8 more points. The best match found is the
motion vector.
Example (If search range is ±7 pixels):
Step 1: Check points at ±4 pixels from the center.
Step 2: Move to the best point and check ±2 pixels around it.
Step 3: Move again and check ±1 pixel for final refinement.
The step sizes add up to 4 + 2 + 1 = 7, so any displacement within ±7 pixels can be reached.
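The three steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a reference implementation: frames are lists of lists of integers, the matching cost is the sum of absolute differences (SAD), and the names sad, three_step_search, n (block size) and step (initial step size) are illustrative only.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by (dx, dy)."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def three_step_search(cur, ref, bx, by, n=4, step=4):
    """Return ((dx, dy), cost) for the block of `cur` at (bx, by)."""
    height, width = len(ref), len(ref[0])

    def cost(dx, dy):
        # Reject candidates whose block would leave the reference frame.
        if not (0 <= bx + dx <= width - n and 0 <= by + dy <= height - n):
            return float("inf")
        return sad(cur, ref, bx, by, dx, dy, n)

    best, best_cost = (0, 0), cost(0, 0)
    for _ in range(3):  # exactly three refinement steps
        cx, cy = best
        for oy in (-step, 0, step):
            for ox in (-step, 0, step):
                if (ox, oy) == (0, 0):
                    continue  # the current center was already evaluated
                c = cost(cx + ox, cy + oy)
                if c < best_cost:
                    best, best_cost = (cx + ox, cy + oy), c
        step = max(1, step // 2)  # halve the step for the next pass
    return best, best_cost
```

Each pass evaluates at most 8 new positions, so the search never costs more than 1 + 3 × 8 = 25 SAD evaluations per block, whatever the image content.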
Advantages of TSS:
Simple and easy to implement.
Much faster than Full Search: for a ±7 pixel window it checks 25 positions instead of all 225.
Disadvantages of TSS:
The coarse first step is poorly suited to small motions.
Can settle on a local minimum and miss the exact best match.
2. Logarithmic Search (Log Search)
Logarithmic search is another fast motion estimation method. It works like a shrinking
grid, narrowing down the best match.
How Logarithmic Search Works:
1. Start at the center of the search area.
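2. Step 1: Check 4 points at a large distance from the center, arranged in a plus (+) shape
(up, down, left, right).
3. Step 2: If one of the 4 points is the best match, move the center there. If the center is
still the best, halve the step size.
4. Step 3: Continue until the step size is 1, then check all 8 neighbors of the center to
pick the final match.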
Example (If search range is 16 pixels):
Step 1: Check points at ±8 pixels.
Step 2: Move to the best match. Once the center gives the best match, check ±4 pixels.
Step 3: Repeat with ±2, then ±1 pixels for final refinement.
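One common formulation of the 2D logarithmic search can be sketched as below. This is a hedged sketch, not a definitive implementation: it assumes the step size is halved only when the center wins, that the last stage checks all 8 neighbors at distance 1, and that SAD is the matching cost. The names log_search, sad, n, step and search_range are illustrative.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by (dx, dy)."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def log_search(cur, ref, bx, by, n=4, step=4, search_range=7):
    """Return ((dx, dy), cost) found by a 2D logarithmic search."""
    height, width = len(ref), len(ref[0])

    def cost(dx, dy):
        # Reject candidates outside the search range or the reference frame.
        if max(abs(dx), abs(dy)) > search_range:
            return float("inf")
        if not (0 <= bx + dx <= width - n and 0 <= by + dy <= height - n):
            return float("inf")
        return sad(cur, ref, bx, by, dx, dy, n)

    best, best_cost = (0, 0), cost(0, 0)
    while step > 1:
        cx, cy = best
        moved = False
        # Plus-shaped pattern: up, left, right, down.
        for ox, oy in ((0, -step), (-step, 0), (step, 0), (0, step)):
            c = cost(cx + ox, cy + oy)
            if c < best_cost:
                best, best_cost, moved = (cx + ox, cy + oy), c, True
        if not moved:
            step //= 2  # the center won, so shrink the pattern
    # Final stage: all 8 neighbors at distance 1.
    cx, cy = best
    for oy in (-1, 0, 1):
        for ox in (-1, 0, 1):
            c = cost(cx + ox, cy + oy)
            if c < best_cost:
                best, best_cost = (cx + ox, cy + oy), c
    return best, best_cost
```

The loop always terminates: each pass either moves to a strictly cheaper position inside a bounded window or halves the step.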
Advantages of Log Search:
Checks fewer points per step than TSS (4 instead of 8), so it often needs fewer comparisons
overall, although the number of steps is not fixed.
Works well for large and small movements.
Disadvantages of Log Search:
Can sometimes skip the exact best match if motion is irregular.
Comparison of TSS vs. Logarithmic Search
Feature | Three-Step Search (TSS) | Logarithmic Search
Speed | Fast | Often faster
Accuracy | Good | Slightly less accurate
Best for | Moderate motion | Large and small motion
Steps Taken | 3 steps | Varies (fewer points per step)
New Three-Step Search (NTSS) – An Improvement Over TSS
New Three-Step Search (NTSS) improves the original Three-Step Search (TSS) by making it
more accurate and adaptive, especially for small motion vectors.
Key Enhancements in NTSS
1. Extra Search for Small Motions
o In TSS, the search pattern is fixed to 3 steps and always starts with a large step,
regardless of motion size.
o NTSS adds 8 extra points directly around the center (step size 1) to the first step, so
the first step checks 17 points in total.
o This center-biased pattern catches small motions that the coarse TSS grid can miss.
2. Refinement of the Best Match
o If the best first-step match is one of the 8 points next to the center, NTSS checks only
the 3 or 5 unvisited points adjacent to it and then stops.
o This keeps NTSS accurate for small motions without giving up the large steps needed for
large motions.
3. Efficiency Improvement
o NTSS stops after the first step when the best match is the center itself (zero motion).
o If motion is large, it behaves like TSS, ensuring it still works well for fast-moving
objects.
NTSS Algorithm Steps
1. Step 1: Start at Center
o Begin at the center of the search window.
o Set an initial step size S = 4 (or 8, depending on window size).
2. Step 2: First Search (17 Points)
o Check 8 points at distance S around the center:
(±S, 0), (0, ±S), (±S, ±S)
o Also check the 8 points at distance 1 around the center:
(±1, 0), (0, ±1), (±1, ±1)
o Find the best match using the cost function.
3. Step 3: Adapt the Search
o If the best match is the center, stop. The motion vector is zero.
o If the best match is one of the 8 points at distance 1, check its 3 or 5 unvisited
neighbors and stop.
o Otherwise, continue like TSS around the best point with a reduced step size
(S = S/2).
4. Step 4: Repeat Until S = 1
o Continue refining the motion vector until step size S = 1.
o The final best match gives the motion vector.
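The NTSS flow can be sketched as below. It is a minimal sketch assuming a ±7 window (outer step 4), SAD as the cost function, and the two early-stop rules usually attributed to NTSS (stop at the center, or refine once around an inner point). The names ntss, ring and sad are illustrative.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by (dx, dy)."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def ring(s):
    """The 8 offsets at distance s around a point."""
    return [(ox * s, oy * s) for oy in (-1, 0, 1) for ox in (-1, 0, 1)
            if (ox, oy) != (0, 0)]


def ntss(cur, ref, bx, by, n=4):
    """Return ((dx, dy), cost) found by a New Three-Step Search."""
    height, width = len(ref), len(ref[0])

    def cost(dx, dy):
        # Reject candidates whose block would leave the reference frame.
        if not (0 <= bx + dx <= width - n and 0 <= by + dy <= height - n):
            return float("inf")
        return sad(cur, ref, bx, by, dx, dy, n)

    # First step: center + 8 inner points (step 1) + 8 outer points (step 4).
    best, best_cost = (0, 0), cost(0, 0)
    for p in ring(1) + ring(4):
        c = cost(*p)
        if c < best_cost:
            best, best_cost = p, c
    if best == (0, 0):
        return best, best_cost  # stationary block: stop immediately
    if max(abs(best[0]), abs(best[1])) == 1:
        # Inner point won: check only its unvisited neighbors, then stop.
        cx, cy = best
        for ox, oy in ring(1):
            p = (cx + ox, cy + oy)
            if max(abs(p[0]), abs(p[1])) <= 1:
                continue  # already evaluated in the first step
            c = cost(*p)
            if c < best_cost:
                best, best_cost = p, c
        return best, best_cost
    # Outer point won: finish like TSS with steps 2 and 1.
    for step in (2, 1):
        cx, cy = best
        for ox, oy in ring(step):
            c = cost(cx + ox, cy + oy)
            if c < best_cost:
                best, best_cost = (cx + ox, cy + oy), c
    return best, best_cost
```

In this sketch a stationary block costs 17 evaluations, and the worst case (an outer point wins) costs 17 + 8 + 8 = 33.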
Comparison: NTSS vs. TSS
Feature | Three-Step Search (TSS) | New Three-Step Search (NTSS)
Search Steps | Always 3 steps (fixed) | Adaptive (adds extra search if needed)
Extra Refinement | No additional search | Extra 8-point search for small motions
Efficiency | Can waste steps if motion is small | Avoids unnecessary large steps
Accuracy | Good, but may miss small motions | Higher accuracy for both small and large motions
Computational Cost | Moderate (25 points per block) | 17 points for a still block, up to 33 for large motion
Why Is NTSS Better?
More accurate for small motions, and as accurate as TSS for large motions.
Stops early when motion is small, making it faster in many cases.
Balances efficiency and accuracy, improving upon the original TSS.
Motion Estimation and Motion Compensation in Video Compression
Motion estimation and motion compensation are fundamental techniques used in video
compression to reduce redundant data by exploiting similarities between video frames.
These methods are widely used in modern video codecs like H.264, H.265 (HEVC), and AV1
to achieve efficient compression while maintaining visual quality.
1. Motion Estimation (ME)
Motion estimation is the process of finding movement between frames in a video
sequence. Instead of storing each frame independently, the encoder identifies how parts
of an image move from one frame to another and reuses that information.
Key Concepts of Motion Estimation
Block-based approach → The frame is divided into small blocks (e.g., 16×16 or 8×8
pixels).
Search algorithms → Algorithms find the best-matching block from the reference
frame (previous or future frame).
Motion vectors (MV) → The displacement of a block from one frame to another is
represented as a motion vector.
Common Motion Estimation Algorithms
1. Full Search (Brute Force) → Checks all possible positions for the best match (high
accuracy but slow).
2. Three-Step Search → Searches coarse-to-fine with a shrinking step size to reduce complexity.
3. Diamond Search → Uses a diamond-shaped pattern to find the best match
efficiently.
4. Hierarchical (Multi-Resolution) Search → Uses a lower-resolution version of the
frame to estimate motion faster.
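As a baseline for the faster algorithms, Full Search can be sketched in a few lines. This is a minimal sketch assuming frames stored as lists of lists of integers and SAD as the matching cost. The names full_search, sad, n and search_range are illustrative.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by (dx, dy)."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def full_search(cur, ref, bx, by, n=4, search_range=7):
    """Test every displacement within +/- search_range and return
    ((dx, dy), cost) for the cheapest one."""
    height, width = len(ref), len(ref[0])
    best, best_cost = None, float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip displacements whose block would leave the reference frame.
            if not (0 <= bx + dx <= width - n and 0 <= by + dy <= height - n):
                continue
            c = sad(cur, ref, bx, by, dx, dy, n)
            if c < best_cost:
                best, best_cost = (dx, dy), c
    return best, best_cost
```

With a search range of ±7 this evaluates up to 15 × 15 = 225 positions per block, which is the cost the fast algorithms try to avoid.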
2. Motion Compensation (MC)
Motion compensation is the process of using motion vectors from motion estimation to
build a prediction of the current frame from a reference frame. Instead of encoding the
whole frame, only the motion vectors and the differences between the prediction and the
actual frame (residual errors) are stored, which saves storage and bandwidth.
Key Steps in Motion Compensation
1. Apply motion vectors to reconstruct the predicted frame.
2. Calculate the residual error (difference between the predicted and actual frame).
3. Encode the residual error along with motion vectors for efficient compression.
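The three steps above can be sketched as follows. This is a simplified illustration: it assumes one integer motion vector per block, clamps source coordinates at the frame edge, and skips the transform and entropy coding a real encoder would apply to the residual. The names motion_compensate, residual, reconstruct and the vectors dictionary are hypothetical.

```python
def motion_compensate(ref, vectors, n):
    """Build the predicted frame: each n x n block is copied from `ref` at
    the position displaced by its motion vector.
    `vectors` maps a block's top-left corner (bx, by) to (dx, dy);
    blocks without an entry use a zero motion vector."""
    height, width = len(ref), len(ref[0])
    pred = [[0] * width for _ in range(height)]
    for by in range(0, height, n):
        for bx in range(0, width, n):
            dx, dy = vectors.get((bx, by), (0, 0))
            for y in range(by, min(by + n, height)):
                for x in range(bx, min(bx + n, width)):
                    # Clamp so displaced reads stay inside the frame.
                    sy = min(max(y + dy, 0), height - 1)
                    sx = min(max(x + dx, 0), width - 1)
                    pred[y][x] = ref[sy][sx]
    return pred


def residual(cur, pred):
    """Per-pixel difference between the actual and predicted frame."""
    return [[c - p for c, p in zip(rc, rp)] for rc, rp in zip(cur, pred)]


def reconstruct(pred, res):
    """Decoder side: prediction plus residual gives back the frame."""
    return [[p + r for p, r in zip(rp, rr)] for rp, rr in zip(pred, res)]
```

When the motion vectors describe the motion well, the residual is mostly zeros, which is what makes it cheap to encode.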
Types of Motion Compensation
Forward Prediction → Uses the previous frame to predict the current frame.
Backward Prediction → Uses a future frame (in display order) to predict the current frame.
Bidirectional Prediction (B-Frames) → Uses both past and future frames to predict
the current frame (used in H.264, H.265).
Evaluation metrics:
1. Mean Square Error (MSE):
- Measures the average squared difference between predicted and actual pixel values
- Formula: MSE = (1/N) ∑(actual - predicted)²
- Lower MSE indicates better motion estimation
2. Mean Absolute Error (MAE):
- Calculates average absolute difference between predicted and actual values
- Formula: MAE = (1/N) ∑|actual - predicted|
- Less sensitive to outliers than MSE
- More intuitive to interpret as it's in original pixel value units
3. Peak Signal-to-Noise Ratio (PSNR):
- Measures ratio between maximum possible signal power and noise power
- Formula: PSNR = 20 * log₁₀(MAX_PIXEL_VALUE/√MSE)
- Usually expressed in decibels (dB)
- Higher PSNR indicates better quality
- Typical values for lossy 8-bit video fall roughly between 30 and 50 dB; values near 20 dB
indicate clearly visible degradation
4. Sum of Absolute Differences (SAD):
- Measures total absolute differences between pixels in reference and current blocks
- Formula: SAD = ∑|current - reference|
- Popular in real-time applications due to computational simplicity
- Used to determine best matching block
- Lower SAD indicates better match
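The four metrics can be written directly from their formulas. This is a minimal sketch assuming frames are equally sized lists of lists of numbers and a maximum pixel value of 255 (8-bit video). The function names are illustrative.

```python
import math


def _pairs(actual, predicted):
    """Yield (actual, predicted) pixel pairs from two equally sized frames."""
    for row_a, row_p in zip(actual, predicted):
        yield from zip(row_a, row_p)


def sad(actual, predicted):
    """Sum of absolute differences."""
    return sum(abs(a - p) for a, p in _pairs(actual, predicted))


def mae(actual, predicted):
    """Mean absolute error: SAD divided by the number of pixels."""
    count = sum(len(row) for row in actual)
    return sad(actual, predicted) / count


def mse(actual, predicted):
    """Mean squared error."""
    count = sum(len(row) for row in actual)
    return sum((a - p) ** 2 for a, p in _pairs(actual, predicted)) / count


def psnr(actual, predicted, max_pixel_value=255):
    """Peak signal-to-noise ratio in dB; infinite for identical frames."""
    error = mse(actual, predicted)
    if error == 0:
        return math.inf
    return 20 * math.log10(max_pixel_value / math.sqrt(error))
```

SAD and MAE differ only by the division by the pixel count, which is why SAD is preferred inside the search loop, where only the ranking of candidates matters.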
Chroma Subsampling Explained
Chroma subsampling is a technique used in video compression to reduce the amount of
color information while keeping the brightness (luminance) details intact. Since the human
eye is more sensitive to brightness than color, chroma subsampling helps reduce file size
without significantly affecting perceived quality.
Understanding the Chroma Subsampling Notation (J:a:b)
The chroma subsampling format is usually written as J:a:b, where:
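J → Width of the reference region in luminance (Y) samples (typically 4). The region is 2
rows high.
a → Number of chroma (color) samples in the first row of the region.
b → Number of chroma samples in the second row of the region (0 means the second row
reuses the chroma samples of the first row).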
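The notation can be turned into a quick size estimate. The sketch below counts samples in a region J pixels wide and 2 rows high, assuming two chroma channels (Cb and Cr) and equal bit depth for every sample. The function name relative_size is illustrative.

```python
from fractions import Fraction


def relative_size(j, a, b):
    """Raw size of a J:a:b frame relative to the same frame with no
    chroma subsampling, counted over a J x 2 pixel region."""
    luma = 2 * j            # one Y sample per pixel, 2 rows
    chroma = 2 * (a + b)    # (a + b) samples for each of Cb and Cr
    full = 3 * 2 * j        # Y, Cb and Cr for every pixel
    return Fraction(luma + chroma, full)


if __name__ == "__main__":
    for fmt in ((4, 4, 4), (4, 2, 2), (4, 2, 0), (4, 1, 1)):
        print(fmt, relative_size(*fmt))
```

Under these assumptions 4:2:2 needs two thirds of the raw data of 4:4:4, while 4:2:0 and 4:1:1 both need half.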
Common Chroma Subsampling Formats
1. 4:4:4 (No Subsampling)
o No chroma subsampling: every pixel keeps its own chroma sample (full color resolution).
o Used in high-end film production and graphics.
2. 4:2:2 (Half Horizontal Chroma Resolution)
o Every two horizontally adjacent pixels share one chroma sample.
o Used in professional video formats like broadcast TV.
3. 4:2:0 (Half Horizontal & Vertical Chroma Resolution)
o Every 2×2 block of four pixels shares one chroma sample.
o Common in streaming, Blu-ray, and video conferencing.
4. 4:1:1 (Quarter Horizontal Chroma Resolution)
o Every four horizontally adjacent pixels share one chroma sample, but vertical chroma
resolution stays full.
o Used in older digital video formats such as NTSC DV.
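The 4:2:2 and 4:2:0 formats can be illustrated by downsampling a single chroma plane. This is a simplified sketch assuming even plane dimensions and plain averaging of the shared pixels. Real encoders may use different filters and sample positions, and the function names are illustrative.

```python
def subsample_422(plane):
    """Halve horizontal chroma resolution: average each horizontal pair."""
    return [
        [(row[x] + row[x + 1] + 1) // 2 for x in range(0, len(row) - 1, 2)]
        for row in plane
    ]


def subsample_420(plane):
    """Halve horizontal and vertical chroma resolution: average each
    2 x 2 block (rounded to the nearest integer)."""
    out = []
    for y in range(0, len(plane) - 1, 2):
        top, bottom = plane[y], plane[y + 1]
        out.append([
            (top[x] + top[x + 1] + bottom[x] + bottom[x + 1] + 2) // 4
            for x in range(0, len(top) - 1, 2)
        ])
    return out
```

The luminance plane is left untouched in both cases. Only the two chroma planes (Cb and Cr) are passed through these functions.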
Impact on Video Quality
4:4:4 provides the best quality but requires more storage.
4:2:2 is a good balance between quality and compression.
4:2:0 is widely used due to its efficiency.
4:1:1 results in noticeable color degradation.