
Unit-1

1. What is a digital image? Explain the advantages and disadvantages of digital images.

i. A digital image can be defined as a two-dimensional function, f(x, y), where x and y are spatial (plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or grey level of the image at that point.
ii. When x, y and amplitude value of f all are discrete values, we call the image as
digital image.
iii. An image is an array, or a matrix, of square pixels (picture elements) arranged in
columns and rows (hence two-dimensional).
iv. Unlike an analogue image, which has continuous values for its coordinates and intensities, a digital image has a finite (limited) set of values representing its coordinates and intensities.
v. There are various types of digital image such as:
a. Monochrome image: a binary image in which each pixel is either black or white.
b. Grayscale image: a grey image with 256 shades (0 to 255).
c. Coloured image: a colour image in which each pixel holds RGB values.
d. Half-toned image: also a black-and-white image, but its dot pattern gives the illusion of a grey image.
vi. Advantages of Digital Images
a. Ease of Storage and Retrieval:
Digital images can be stored in a variety of formats (JPEG, PNG, BMP, etc.)
and retrieved quickly for editing, analysis, or display.
b. Efficient Transmission:
Digital images can be easily shared over the internet or via electronic devices.
c. Processing and Manipulation:
Digital images can be processed using algorithms to enhance quality, detect
features, or extract information (e.g., image editing software or AI
applications).
d. Reproducibility:
Digital images can be duplicated without loss of quality, unlike analog images,
which degrade with each copy.
e. Compact Storage:
High compression techniques (e.g., JPEG compression) allow efficient
storage without significant quality loss.
f. Integration with Technology:
Digital images are compatible with modern technologies such as machine
learning, augmented reality, and computer vision.
g. Metadata Inclusion:
Digital images can include metadata (e.g., EXIF data for photos) containing
information like the time, location, and device settings.
h. Durability:
Digital images do not degrade over time as physical photographs do.
vii. Disadvantages of Digital Images
a. Dependence on Technology:
Viewing, editing, and storing digital images require electronic devices and
software.
b. Quality Loss Due to Compression:
Lossy compression techniques, like JPEG, can reduce image quality over
repeated edits or compressions.
c. High Storage Requirements:
High-resolution or uncompressed digital images (e.g., RAW files) consume
significant storage space.
d. Privacy and Security Concerns:
Digital images can be easily copied, shared, or manipulated, leading to
privacy issues or misuse (e.g., deep fakes).
e. Device Compatibility Issues:
Some image formats may not be supported by certain devices or software.
f. Data Corruption Risks:
Digital files can be corrupted due to software errors, hardware failures, or
malicious attacks, leading to potential data loss.
2. Compare digital and analog image
3. What is digital image processing? Explain purpose of
image processing.

i. Digital image processing is a method to convert an image into a digital form and
perform some operations on it, in order to get an enhanced image, or to extract some
useful information.
ii. Digital image processing is the analysis and manipulation of a digitised image, especially in order to improve its quality, i.e., the processing of digital images by means of a digital computer.
iii. In this process, the input is an image and the output may be an image or
characteristics of the image
iv. Digital image processing includes 3 basic steps:
a. Importing the image
b. Analysing and manipulating the image which may include data compression
and image enhancement
c. Output which is an enhanced image or a report based on image analysis
v. Some purposes of digital image processing are as follows:
a. Visualisation
b. Image enhancement.
c. Image sharpening and restoration
d. Image retrieval
e. Pattern and object recognition.
f. Image analysis
vi. There are three types of digital image processing:
a. Low level processing: input and output both are images.
b. Mid-level processing: the input is an image, but the output is an extracted part of the image (e.g., image segmentation).
c. High level processing: the input is an image, but the output is analysis result
of the image. Example: face recognition.
vii. There are various fundamental steps in digital image processing, they are as follows:
a. Image acquisition:
This is the first step or process of the fundamental steps of digital image
processing. Image acquisition could be as simple as being given an image
that is already in digital form. Generally, the image acquisition stage involves
pre-processing, such as scaling etc.
b. Image enhancement:
Image enhancement is among the simplest and most appealing areas of
digital image processing. Basically, the idea behind enhancement techniques
is to bring out detail that is obscured, or simply to highlight certain features of
interest in an image. Such as, changing brightness & contrast etc.
c. Image restoration:
Image restoration is about recovering an image that has been degraded due
to factors like noise, blur, or distortion. Image restoration is an area that also
deals with improving the appearance of an image. However, unlike
enhancement, which is subjective, image restoration is objective, in the sense
that restoration techniques tend to be based on mathematical or probabilistic
models of image degradation
d. Colour image processing:
Colour image processing is an area that has been gaining its importance
because of the significant increase in the use of digital images over the
Internet. This phase involves working with color images and processing them
using color models. Color image processing can be performed using different
color representations, such as RGB (Red, Green, Blue), HSV (Hue,
Saturation, Value), or CMYK (Cyan, Magenta, Yellow, Key).
e. Wavelets and multi resolution processing:
This phase deals with the representation of images at different levels of
resolution. It involves decomposing an image into different frequency
components, which can help capture various levels of detail
f. Image compression:
Image compression is the process of reducing the size of an image file
without significantly degrading its quality, which helps in efficient storage and
transmission
g. Morphological processing:
Morphological processing is a set of image processing techniques that deal
with the shape or structure of objects within an image. It focuses on the
extraction and analysis of geometric structures.
h. Segmentation:
Image segmentation is the process of dividing an image into meaningful
segments or regions. It’s crucial for tasks like object detection or recognition.
i. Representation and description:
After segmentation, the image is represented in a form that can be processed
and analyzed further. The representation is crucial for understanding the
structure and characteristics of the segmented objects.
j. Object recognition:
Object detection and recognition involves identifying and labeling objects
within an image based on predefined criteria or descriptors
k. Knowledge base:
Knowledge may be as simple as detailing regions of an image where the
information of interest is known to be located, thus limiting the search that has
to be conducted in seeking that information.
4. Briefly describe relationship and connectivity between
pixels.

i. In digital images, pixels are the fundamental units of the image.


ii. Each pixel represents a color or intensity value.
iii. The relationship and connectivity between pixels play a crucial role in various image
processing tasks, such as segmentation, enhancement, and object recognition.
iv. The relationship between pixels is defined by the neighbourhood of pixels
v. A pixel p at coordinates (x,y) has 2 horizontal and 2 vertical neighbours:
(x+1,y), (x-1,y), (x,y+1), (x,y-1).
vi. This set of pixels is called the 4-neighbors of p denoted by N4(p).
vii. Each pixel is a unit distance from (x,y).
viii. The 4 diagonal neighbors of p, denoted ND(p), are: (x+1,y+1), (x+1,y-1), (x-1,y+1), (x-1,y-1).
ix. These points, together with the 4-neighbors, are called the 8-neighbors of p:
N8(p) = N4(p) + ND(p) (see the sketch below).
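
Illustrative Python sketch (not part of the original notes; the function name is a made-up example): listing the 4-, diagonal- and 8-neighbours of a pixel p = (x, y), clipped to the image bounds.

def neighbours(x, y, width, height):
    # N4(p): horizontal and vertical neighbours
    n4 = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    # ND(p): diagonal neighbours
    nd = [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]
    inside = lambda p: 0 <= p[0] < width and 0 <= p[1] < height
    n4 = [p for p in n4 if inside(p)]
    nd = [p for p in nd if inside(p)]
    return n4, nd, n4 + nd          # N4(p), ND(p), N8(p) = N4(p) + ND(p)

print(neighbours(0, 0, 5, 5))       # corner pixel: only 3 of the 8 neighbours exist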
5. Define
a. 4-adjancency
b. 8-adjancency
c. m-adjacency
refer question 4 answer

6. Short note on
a. Chessboard distance.
b. City-block distance.

Chessboard distance (D8): the set of pixels within a given chessboard distance from a centre pixel forms a square-like structure.

City-block distance (D4): the set of pixels within a given city-block distance from a centre pixel forms a diamond-like structure.
7. Explain Image sampling and quantization
i. Sampling and quantization are the two important processes used to convert
continuous analog image into digital image.
ii. These processes are crucial for converting an analog image into a digital form that
can be stored, manipulated, and displayed by computers.
iii. Image sampling refers to discretization of the spatial coordinates (the x and y axes), whereas quantization refers to discretization of the gray-level values (the amplitude axis).
iv. Given a continuous image, f (x, y), digitizing the coordinate values is called sampling
and digitizing the amplitude (intensity) values is called quantization.
v. In simple terms, Image sampling is the process of converting a continuous image
(analog) into a discrete image (digital). Digitizing co-ordinate values.
vi. Image quantization is the process of converting the continuous range of pixel values
(intensities) into a limited set of discrete values. Digitizing amplitude values.
vii. Sampling determines the spatial resolution (number of pixels).
viii. Quantization determines the intensity/color resolution (number of levels). A brief sketch follows.
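
A minimal Python/NumPy sketch (an illustrative assumption, not from the notes) of the two steps: the grid of (x, y) samples fixes the spatial resolution, and rounding the amplitude to a limited number of levels is the quantization.

import numpy as np

def sample_and_quantize(f, width, height, levels):
    xs = np.linspace(0.0, 1.0, width)             # sampling: discrete x coordinates
    ys = np.linspace(0.0, 1.0, height)            # sampling: discrete y coordinates
    img = f(xs[None, :], ys[:, None])             # continuous amplitudes in [0, 1]
    return np.round(img * (levels - 1)).astype(np.uint8)   # quantization to L levels

# Example: a smooth gradient sampled at 8x8 pixels with only 4 grey levels.
print(sample_and_quantize(lambda x, y: (x + y) / 2, 8, 8, levels=4))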

8. What is resolution?
i. Resolution is an important aspect of digital image processing.
ii. The word "resolution" can mean many things. In the context of digital technology, it describes the crispness and clarity of images seen on screens, based on the number of pixels arranged horizontally and vertically.
iii. Image resolution quantifies how close two lines (say, one dark and one light) can be to each other and still be visibly resolved, i.e., the clarity of the image.
iv. There are 2 common types of resolution: Spatial and Gray-Level (intensity)
resolution.
v. Spatial Resolution:
a. Definition: It represents the number of pixels used to define an image in the spatial domain. Spatial resolution is the smallest discernible change (detail) in an image.
b. Units: Usually measured in pixels per inch (PPI) or dots per inch (DPI).
c. High Spatial Resolution: An image with a large number of small pixels. It
provides more detail and clarity.
d. Low Spatial Resolution: An image with fewer, larger pixels. It appears blocky
or pixelated.
e. Example:
i. A high-resolution image might be 1920x1080 pixels (Full HD).
ii. A low-resolution image might be 640x480 pixels (VGA).
vi. Intensity Resolution:
a. Definition: It refers to the number of intensity or color levels used to represent
each pixel in the image.
b. Units: Measured in bits per pixel (bpp).
i. 1 bpp: Black and white (2 intensity levels).
ii. 8 bpp: 256 grayscale levels.
iii. 24 bpp: 16.7 million colors (true color).
c. Higher Intensity Resolution: Captures finer intensity or color variations.
d. Lower Intensity Resolution: Causes banding or posterization effects.
e. Example:
i. A grayscale image with 8-bit intensity resolution has 256 levels of
gray.
ii. A color image with 24-bit resolution can represent 16.7 million colors.
vii. Trade-offs in Resolution:
a. High Resolution:
i. Advantages: Greater detail and accuracy.
ii. Disadvantages: Requires more storage and processing power.
b. Low Resolution:
i. Advantages: Smaller file size and faster processing.
ii. Disadvantages: Loss of detail and quality.
9. Explain structure of human eye
Skip

10. Explain image formation in the human eye


Skip

11. Explain fundamentals steps in digital image processing.


Refer question 3 answer.

12. Explain components of an image processing system


i. Sensors:
a. Sensors produce an electrical output proportional to light intensity.
b. With reference to sensing, two elements are required to acquire digital
images.
i. The first is a physical device(sensor) that is sensitive to the energy
radiated by the object we wish to image.
ii. The second, called a digitizer, is a device for converting the output of
the physical sensing device into digital form.
iii. For instance, in a digital video camera, the sensors produce an
electrical output proportional to light intensity. The digitizer converts
these outputs to digital data.
ii. Specialized Image Processing Hardware:
a. Purpose: Performs high-speed, specialized processing tasks for efficient
image manipulation and analysis.
b. Details:
i. Designed for tasks like filtering, compression, and feature extraction.
ii. Examples include GPUs (Graphics Processing Units) or FPGAs (Field
Programmable Gate Arrays).
c. Usage: Enhances processing performance in applications such as medical
imaging, real-time video processing, and remote sensing.
iii. The Computer:
a. The computer in an image processing system is a general-purpose machine and can range from a PC to a supercomputer.
b. In dedicated applications, sometimes specially designed computers are used
to achieve a required level of performance, but our interest here is on
general-purpose image processing systems.
iv. Image processing software:
a. Software for image processing consists of specialized modules that perform
specific tasks.
b. Provides the tools and algorithms for analyzing, enhancing, and manipulating
images.
c. Example: OpenCV, MATLAB, and Photoshop.
v. Mass storage:
a. Mass storage capability is a must in image processing applications.
b. An image of size 1024*1024 pixels, in which the intensity of each pixel is an
8-bit quantity, requires one megabyte of storage space if the image is not
compressed.
c. Digital storage for image processing applications falls into three principal
categories:
i. Short term storage for use during processing.
ii. On-line storage for relatively fast recall.
iii. Archival storage, characterized by infrequent access.
vi. Image Displays:
a. Purpose: Outputs the processed images for visualization or analysis.
b. Details:
i. High-resolution monitors or specialized displays are often used.
vii. Hardcopy:
a. Purpose: Produces a physical copy of the processed image.
b. Devices:
i. Printers or plotters.
viii. Networking:
a. Networking means exchange of information or services (eg through internet)
among individuals, groups, or institutions.
b. Networking is almost a default function in any computer system in use today.
c. Because of the large amount of data inherent in image processing
applications, the key consideration in image transmission is bandwidth.

13. What are the applications of digital image processing


i. Medical Imaging:
a. Purpose: Enhances medical images for better diagnosis and treatment
planning.
b. Examples:
i. MRI (Magnetic Resonance Imaging)
ii. CT (Computed Tomography) scans
iii. X-rays and Ultrasound image enhancement
c. Benefits: Helps detect diseases like tumors, fractures, and cardiovascular
issues with precision.
ii. Remote Sensing:
a. Purpose: Analyzes images from satellites and aerial systems.
b. Examples:
i. Monitoring deforestation and urban growth
ii. Crop health analysis
iii. Disaster management (floods, wildfires, etc.)
c. Benefits: Supports environmental monitoring and resource management.
iii. Surveillance and Security:
a. Purpose: Provides image-based monitoring and security systems.
b. Examples:
i. Facial recognition systems
ii. Automatic license plate recognition (ALPR)
iii. Intruder detection in CCTV footage
c. Benefits: Enhances public safety and reduces manual effort.
iv. Entertainment and Multimedia:
a. Purpose: Enhances visual media for better user experiences.
b. Examples:
i. Image and video editing (Photoshop, After Effects)
ii. Visual effects (VFX) in movies and games
iii. Enhancing old photographs or videos
c. Benefits: Creates visually appealing content.
v. Biometric Recognition:
a. Purpose: Identifies individuals based on physical traits.
b. Examples:
i. Fingerprint scanning
ii. Facial recognition
iii. Iris and retina scanning
c. Benefits: Improves security and personal identification systems.
vi. Document Processing:
a. Purpose: Converts images into digital text and formats.
b. Examples:
i. Optical Character Recognition (OCR)
ii. Document restoration (e.g., restoring old texts)
iii. Automatic form recognition
c. Benefits: Digitizes records and speeds up workflows.

14. Explain types of digital images


Refer question 1 answer

15. Explain types of image processing


Refer question 3 answer
16. Explain types of sensors with a diagram

i. There are 3 types of imaging sensors


ii. Single imaging sensor
a. A single imaging sensor consists of a single photodetector, which captures
light from one pixel at a time.
b. To form a complete image, the sensor is moved mechanically across the field
of view or the object itself moves relative to the sensor.

iii. Line sensor


a. A line sensor consists of a single row of photodetectors (pixels) arranged in a
linear configuration.
b. It captures one line of the image at a time, and the entire image is formed by
scanning the object or scene line by line.
iv. Array sensor
a. An array sensor consists of a 2D grid (matrix) of photodetectors, where each
photodetector corresponds to a pixel in the image.
b. These sensors capture the entire image simultaneously without requiring
mechanical movement.
17. Explain Fourier transform and types of it in detail.
i. Almost every imaginable signal can be broken down into a combination of simple
waves. This fact is the central philosophy behind Fourier transforms.
ii. Fourier transforms (FT) take a signal and express it in terms of the frequencies of the
waves that make up that signal.
iii. The Fourier Transform (FT) is a mathematical technique used to convert spatial
domain data (like an image) into its frequency domain representation.
iv. Fourier transform is a mathematical model that decomposes a function or signal into
its constituent frequencies.
v. Fourier transform works like a mathematical prism. It separates the individual signals
from a complex signal.
vi. And the exact opposite, Inverse Fourier transform combines the individual signals to
form a complex signal.
vii. Types:
a. One Dimensional Fourier Transform
b. Two Dimensional Fourier Transform
18. Short note on Convolution.
1. Convolution is a fundamental concept in digital image processing used for filtering, edge detection, sharpening, blurring and many other tasks.
2. Convolution is the process of combining 2 functions, the image and the kernel, in
order to produce a new function i.e modified image.
3. Convolution involves taking the weighted sum of a neighbourhood of pixels.
4. The weights are taken from a convolution kernel; the kernel is first flipped (rotated 180°), then each value in the neighbourhood of pixels is multiplied by the kernel value that lies over it.
5. Steps:
a. Overlay the kernel: place the kernel over the image, starting from top left
corner
b. Element wise multiplication: multiply each kernel value with the corresponding
image pixel till it overlaps.
c. Calculate sum: sum up all the multiplied values to compute the new pixel value.
d. Move and repeat: move the kernel across the image and repeat the process (a small sketch follows below).
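
A minimal Python/NumPy sketch of the steps above (illustrative only; the edge padding and the averaging kernel are arbitrary choices): the kernel is flipped, slid over the image, and a weighted sum is taken at each position.

import numpy as np

def convolve2d(image, kernel):
    kernel = np.flipud(np.fliplr(kernel))          # flip (rotate 180 deg) for true convolution
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="edge")
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            window = padded[i:i + kh, j:j + kw]    # neighbourhood under the kernel
            out[i, j] = np.sum(window * kernel)    # element-wise multiply and sum
    return out

img = np.arange(25, dtype=float).reshape(5, 5)
blur = np.ones((3, 3)) / 9.0                       # 3x3 averaging (blur) kernel
print(convolve2d(img, blur))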
19. Short note on Correlation.
Same answer as question 18, just don't flip the kernel.

Unit-2
1. Explain Image Enhancement in spatial domain
i. Spatial domain refers to the image plane itself, and approaches in this category
are based on direct manipulation of pixels in an image.
ii. The spatial domain is used to define the actual spatial coordinates of pixels within
an image
iii. Image enhancement in the spatial domain refers to techniques that improve the visual quality of an image by modifying its pixel values directly, unlike frequency-domain techniques, which are based on modifying the Fourier transform of the image.
iv. Suppose we have a digital image which can be represented by a two dimensional
random field f ( x, y ) .
v. Spatial domain processes will be denoted by the expression:
g(x, y) = T[f(x, y)], or s = T(r)
vi. The value of pixels, before and after processing, will be denoted by r and s,
respectively.
vii. Here f(x, y) is the input image, g(x, y) is the processed (output) image, and T is an operator on f.
viii. T stands for Transformation function which works on the input image in order to
enhance the image and get the output image.
ix. The three basic types of functions used frequently for image enhancement in
spatial domain:
a. Linear Functions.
b. Logarithmic functions.
c. Power Law functions.

2. Short note on Enhancement through Point operations


i. A point operation on a digital image f(n) is a function T of a single variable applied identically to every pixel in the image, thus creating a new, modified image g(n). Hence at each coordinate n: g(n) = T[f(n)].
ii. Point operations are image enhancement techniques where the transformation of
each pixel's intensity depends only on its original value and not on the
surrounding pixels.
iii. These operations are simple, efficient, and widely used in image processing for
tasks like brightness adjustment, contrast enhancement, and histogram
equalization.
iv. Point operations are simple, pixel-wise enhancement techniques that modify
intensity values independently.
v. They are computationally efficient and form the foundation of many image
enhancement workflows.
vi. Applications:
a. Adjusting brightness and contrast in images.
b. Enhancing details in medical or satellite images.
c. Pre-processing for computer vision tasks like object detection.
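
A small hedged Python sketch of a point operation g(n) = T[f(n)]: a simple brightness/contrast adjustment applied independently to every pixel (the gain and bias values are arbitrary examples, not from the notes).

import numpy as np

def adjust(image, gain=1.2, bias=20):
    out = gain * image.astype(float) + bias        # T(r) = a*r + b, the same T for every pixel
    return np.clip(out, 0, 255).astype(np.uint8)   # keep the result in the 8-bit range

img = np.array([[0, 64], [128, 255]], dtype=np.uint8)
print(adjust(img))                                 # brighter, higher-contrast version of the input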

3. Explain Histogram manipulation.

i. Histogram manipulation is a set of image enhancement techniques that modify


the intensity distribution of an image to improve its appearance or make specific
features more prominent.
ii. The histogram of an image is a graphical representation of the frequency of
occurrence of different pixel intensity values.
iii. By altering the histogram, we can achieve goals such as improving contrast,
brightness, or detail visibility.
iv. In a histogram graph, the horizontal axis of the graph is used to represent
intensity level of the pixel, whereas the vertical axis is used to represent the
number of pixels in that particular intensity level.
v. Black and dark areas are represented in the left side of the horizontal axis,
medium grey color is represented in the middle, and the White and light areas are
represented in the right side of the horizontal axis.
vi. The histogram of a digital image with gray levels in the range [0, L-1] is a discrete
function of the form:
H(rk)=nk
where rk is the kth gray level and nk is the number of pixels in the image having
the level rk.
vii. A normalized histogram is given by the equation p(rk)=nk/n for k=0,1,2,…..,L-1
P(rk) gives the estimate of the probability of occurrence of gray level rk
viii. In the dark image the components of the histogram are concentrated on the low
(dark) side of the gray scale.
ix. In case of bright image, the histogram components are biased towards the high
side of the gray scale.
x. The histogram of a low contrast image will be narrow and will be centered
towards the middle of the gray scale.
xi. The components of the histogram in the high contrast image cover a broad range
of the gray scale.
xii. Various types of histogram manipulation are:
a. Histogram Equalization:
i. Redistributes the intensity values to spread them more evenly
across the available range.
ii. This enhances contrast, especially in images where the intensity
values are concentrated in a narrow range.
b. Histogram Matching:
i. Modifies the histogram of an image to match the histogram of a
reference image.
ii. Enhances image appearance.

4. Short note on HISTOGRAM EQUALISATION.


i. In image processing, there frequently arises the need to improve the contrast of
the image. In such cases, we use an intensity transformation technique known as
histogram equalization.
ii. Histogram equalization is a common technique for enhancing the appearance of
images.
iii. Suppose we have an image which is predominantly dark. Then its histogram
would be skewed towards the lower end of the grey scale and all the image detail
are compressed into the dark end of the histogram.
iv. Histogram equalization helps ‘stretch out’ the grey levels at the dark end to
produce a more uniformly distributed histogram then the image would become
much clearer.
v. Histogram equalization is the process of uniformly distributing the image
histogram over the entire intensity axis by choosing a proper intensity
transformation function.
vi. It ensures that the intensity levels are spread more evenly across the available
range, making the image clearer and more visually appealing.
vii. Key Steps:
a. Compute the Histogram: Calculate the frequency of each intensity level in
the image.
b. Calculate the PDF: P(r_k) = n_k / N for each intensity level k.
c. Calculate the Cumulative Distribution Function (CDF): this is the cumulative sum of the PDF.
d. Multiply the CDF by (L - 1), e.g., by 7 for an image with 8 grey levels, or by 255 for an 8-bit image.
e. Map pixel values: use the CDF x (L - 1) values to map the original pixel intensities to new intensity values, redistributing them across the full range (e.g., [0, 255] for 8-bit images). A small sketch follows below.
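
A minimal Python/NumPy sketch of these steps for an 8-bit image (the 3x3 test image is an arbitrary example): histogram, then PDF, then CDF, then scaling by (L - 1), then mapping the original intensities.

import numpy as np

def equalize(image, levels=256):
    hist = np.bincount(image.ravel(), minlength=levels)      # frequency of each intensity level
    pdf = hist / image.size                                   # P(r_k) = n_k / N
    cdf = np.cumsum(pdf)                                      # cumulative distribution
    mapping = np.round(cdf * (levels - 1)).astype(np.uint8)   # CDF x (L - 1)
    return mapping[image]                                     # map every pixel to its new level

img = np.array([[52, 55, 61], [59, 79, 61], [85, 61, 59]], dtype=np.uint8)
print(equalize(img))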
5. Short note on HISTOGRAM MATCHING (SPECIFICATION).

i. Histogram equalization does not allow interactive image enhancement and generates only one result: an approximation to a uniform histogram.
ii. Sometimes we need to be able to specify particular histogram shapes capable of
highlighting certain gray-level ranges.
iii. The method used to generate a processed image that has a specified histogram
is called histogram matching or histogram specification
iv. Outline of a worked example:
v. Plot the frequency table and histogram of image a and of the reference image b.
vi. Equalize histogram a.
vii. Equalize histogram b.
viii. Map the equalized levels of a to the equalized levels of b (using the first and last columns of b's table and the last two columns of a's table).
ix. Match and plot the frequency table of the output image.

6. Short note on LOCAL HISTOGRAM EQUALISATION

7. Explain Linear and nonlinear Gray Level Transformation.


i. Gray level transformations are used in image processing to enhance the visual
appearance of an image by changing the gray level values of pixels.
ii. The two main types of gray level transformations are linear and nonlinear
iii. Linear transformations:
a. These include identity and negative transformations. In an identity
transformation, each input image value is directly mapped to the output
image values. In a negative transformation, each input image value is
subtracted from L-1 and mapped to the output image.
iv. Nonlinear transformations:
a. These include transformations that stretch dark regions, lighter regions, or
the middle, while compressing the other regions. For example, one
nonlinear transformation replaces the gray level value at a point with the
minimum or maximum of gray levels in its neighbourhood.
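
Illustrative Python sketch (an assumption, not from the notes) of one linear and one nonlinear transformation named above: the negative s = (L - 1) - r, and a log transform that stretches dark regions.

import numpy as np

def negative(image, L=256):
    return (L - 1) - image                          # linear: maps dark to light and vice versa

def log_transform(image, L=256):
    c = (L - 1) / np.log(L)                         # scale so the output stays in [0, L-1]
    return (c * np.log1p(image.astype(float))).astype(np.uint8)   # nonlinear: expands dark values

img = np.array([[0, 50], [150, 255]], dtype=np.uint8)
print(negative(img))
print(log_transform(img))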
Unit-3
1. Explain Image Enhancement in frequency domain.
2. Short note on frequency component

3. Explain steps in frequency domain filtering


4. Explain smoothing in frequency domain filter.

5. Short note on low pass filters: Ideal, Butterworth and


Gaussian.
i. Low pass filter: Low pass filter is the type of frequency domain filter that is used
for smoothing the image. It attenuates the high-frequency components and
preserves the low-frequency components.
ii. Low frequency is preserved in it.
iii. It allows the frequencies below cut off frequency to pass through it.
iv. G(u, v) = H(u, v) . F(u, v)
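
A hedged Python/NumPy sketch of the three classic low-pass transfer functions H(u, v) and of the relation G(u, v) = H(u, v) . F(u, v); the cut-off D0 and the random test image are arbitrary assumptions.

import numpy as np

def distance_grid(rows, cols):
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    V, U = np.meshgrid(v, u)
    return np.sqrt(U**2 + V**2)                     # D(u, v): distance from the spectrum centre

def lowpass(shape, d0, kind="gaussian", n=2):
    D = distance_grid(*shape)
    if kind == "ideal":
        return (D <= d0).astype(float)              # ideal: hard cut-off at D0
    if kind == "butterworth":
        return 1.0 / (1.0 + (D / d0) ** (2 * n))    # Butterworth of order n: smooth roll-off
    return np.exp(-(D**2) / (2.0 * d0**2))          # Gaussian: no ringing

def filter_image(image, H):
    F = np.fft.fftshift(np.fft.fft2(image))         # centre the spectrum
    G = H * F                                       # G(u, v) = H(u, v) . F(u, v)
    return np.real(np.fft.ifft2(np.fft.ifftshift(G)))

img = np.random.rand(64, 64)
smooth = filter_image(img, lowpass(img.shape, d0=10, kind="butterworth"))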
6. Explain sharpening in the frequency domain
i. Sharpening in the frequency domain involves emphasizing high-frequency
components of an image while suppressing low-frequency components. High
frequencies correspond to rapid intensity changes, such as edges and fine details,
making them crucial for sharpening.
ii. Key Concepts of Frequency Domain Sharpening
a. High-Pass Filtering:
i. High-pass filters enhance the high-frequency components by allowing
them to pass while attenuating low frequencies. This emphasizes
edges and details.
ii. High frequencies: Represent rapid intensity changes, such as edges.
iii. Low frequencies: Represent smooth regions or gradual intensity
changes.
b. Frequency Domain Representation:
i. An image is first transformed into the frequency domain using the
Fourier Transform. This converts the spatial domain image into its
frequency components.
ii. Filters are applied to the frequency domain representation.
iii. After filtering, the image is converted back to the spatial domain using
the Inverse Fourier Transform.
7. Short note on high pass filters: Ideal, Butterworth and
Gaussian
i. High-pass filters are used in image processing to emphasize high-frequency
components, such as edges and fine details, while suppressing low-frequency
components, like smooth regions or gradual intensity variations. These filters are
applied in the frequency domain.
ii. It is used for sharpening the image
iii. It attenuates low frequency
iv. High frequency is preserved in it
v. It allows the frequency above cut-off frequency to pass through it.
vi. It helps in removal of noise
8. Short note on Homomorphic filter.
i. Homomorphic filtering is an image enhancement technique that simultaneously
performs contrast enhancement and dynamic range compression by separating and
manipulating the illumination and reflectance components of an image.
ii. This method is widely used in image processing for improving images with uneven
lighting or poor contrast.
9. Short note on Mathematical morphology
i. Mathematical morphology is a technique in image processing based on set theory,
topology, and geometry.
ii. It is primarily used for analysing and processing geometric structures within binary or
grayscale images, focusing on shape and form.
iii. Mathematical morphology involves applying structuring elements to images to extract
meaningful features, such as boundaries, skeletons, and regions.
iv. Some key operations in morphology are:
a. Erosion
b. Dilation
c. Opening
d. Closing
v. Erosion: shrinks object boundaries; a pixel is kept in the output only if the structuring element, centred on it, fits entirely inside the object.
vi. Dilation: grows object boundaries; a pixel is set in the output if the structuring element, centred on it, overlaps the object at all.
vii. Opening: erosion followed by dilation; equivalently, it is the union of all translates of the structuring element B that are entirely contained in A. It removes small specks and thin protrusions.
viii. Closing: dilation followed by erosion; it fills small holes and gaps. (An OpenCV sketch of these four operations follows below.)
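
A brief OpenCV sketch (assuming the cv2 package is available; the toy square image is made up) of the four basic operations on a binary image:

import cv2
import numpy as np

binary = np.zeros((64, 64), dtype=np.uint8)
cv2.rectangle(binary, (16, 16), (48, 48), 255, -1)         # a filled square as a toy object

se = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))     # 3x3 structuring element
eroded = cv2.erode(binary, se)                             # shrinks the square
dilated = cv2.dilate(binary, se)                           # grows the square
opened = cv2.morphologyEx(binary, cv2.MORPH_OPEN, se)      # erosion then dilation
closed = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, se)     # dilation then erosion
boundary = cv2.subtract(binary, eroded)                    # boundary extraction: A minus its erosion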
10. What is a structuring element?
A structuring element is a small set or binary matrix (e.g., a 3x3 square, cross, or disk) that is translated over the image in morphological operations; its size and shape determine how the operation affects the objects in the image.
11. Short note on Hit or miss transform (with example).
i. The Hit-or-Miss Transform is a morphological operation used to detect specific
shapes or patterns in a binary image.
ii. It identifies regions in an image that exactly match a predefined shape represented
by a structuring element (SE).
12. Explain morphological algorithms:
a. Boundary Extraction
b. Region Filling
c. Connected component extraction
d. Thinning
e. Thickening
f. Convex hull
g. Skeleton

i. Morphological algorithms are techniques derived from basic morphological


operations like erosion, dilation, opening, and closing.
ii. They are applied to binary images to analyze and manipulate the structure of objects
for tasks such as shape analysis, feature extraction, and region manipulation.
iii. Boundary extraction: the boundary of a set A is obtained as β(A) = A − (A ⊖ B), i.e., the set minus its erosion by a small structuring element B.
iv. Region filling: starting from a seed point inside a boundary, repeatedly dilate the seed and intersect the result with the complement of A until the filled region stops changing.
v. Connected component extraction: starting from a seed pixel of a component, repeatedly dilate and intersect with A until the component stops growing.
vi. Thinning: successively removes boundary pixels (using hit-or-miss templates) without breaking connectivity, producing a thin version of the object.
vii. Thickening: the morphological dual of thinning; it adds selected background pixels to grow the object.
viii. Convex hull: the smallest convex set containing the object, obtained by repeatedly applying hit-or-miss operations with corner-detecting structuring elements.

ix. Skeleton
a. Skeletonization is a process for reducing foreground regions in a binary
image to a skeletal remnant that largely preserves the extent and connectivity
of the original region while throwing away most of the original foreground
pixels.
Unit-4
1. Short note on Color images.
i. A color image is a type of digital image that represents visual information using
multiple color channels, typically red, green, and blue (RGB).
ii. These images are used to represent the real-world appearance of objects by
combining colors in varying intensities.
iii. They are described using color models like RGB, HSV, or CMYK, and are widely
used in diverse applications such as multimedia, computer vision, and scientific
imaging.
iv. RGB (Red, Green, Blue): Most common model for digital displays. Each pixel has
three values corresponding to red, green, and blue intensities.
v. CMY/CMYK (Cyan, Magenta, Yellow, Black): Used in printing.
vi. HSV (Hue, Saturation, Value): Represents color in a more intuitive way based on its
attributes.
vii. Advantages of Color Images
a. Realistic representation of scenes and objects.
b. Essential for applications like photography, video, and visualizations.
viii. Limitations
a. Storage and Bandwidth: Higher data requirements compared to grayscale
images.
b. Processing Complexity: Requires advanced algorithms for analysis and
enhancement.
c. Illumination Sensitivity: Color perception may vary under different lighting
conditions.

2. Explain following color models.


a. RGB Color models
b. CMY and CMYK model
c. HSI Model
i. Color models are mathematical systems used to represent and describe colors
numerically.
ii. Different color models are suited for various applications, such as digital displays,
printing, or color perception analysis.
iii. The purpose of a color model (or color space or color system) is to facilitate the
specification of color in some standard fashion
iv. Most color models in use today are either based on hardware (color camera, printer)
or on applications involving color manipulation (computer graphics, animation).
v. RGB Color Model
a. Definition:
i. The RGB color model uses three primary colors—Red (R), Green (G),
and Blue (B)—to create other colors by combining them in varying
intensities.
ii. The RGB colour model is the most common colour model used in
Digital image processing
iii. 0 represents the black and as the value increases the colour intensity
increases.
b. Representation:
i. Each pixel in an image is represented as a combination of R, G, and B
values.
c. Intensity ranges:
i. For 8-bit images: 0 to 255 per channel.
ii. Black: (0,0,0), White: (255,255,255).
d. Applications:
i. Used in digital displays, such as monitors, TVs, and cameras.
ii. Additive color system: Combining all colors at maximum intensity
produces white.
e. Advantages:
i. Simple and widely supported for digital devices.
f. Disadvantages:
i. Not intuitive for human perception (e.g., it is hard to visualize "hue" or
"saturation" directly).

vi. CMY and CMYK Color Models:


a. CMY (Cyan, Magenta, Yellow):
i. Definition: A subtractive color model based on the absorption of light.
It is the inverse of the RGB model:
C=1−R, M=1−G, Y=1−B
Where R, G, and B are normalized values.
In this model, point (1, 1, 1) represents black, and (0,0,0) represents
white.
ii. Applications:
1. Used in digital printing systems and color mixing for paint and
dyes.
b. CMYK (Cyan, Magenta, Yellow, Black):
i. Definition: An extension of the CMY model with an additional Black (K)
channel to improve print quality and reduce ink usage.
ii. Advantages:
1. Reduces over-inking by adding black ink for darker tones.
2. Produces high-quality printed materials.
iii. Disadvantages:
1. Requires calibration between digital and print devices for
accurate color reproduction.

vii. HSI (Hue, Saturation, Intensity) Model


a. Definition: The HSI model describes colors based on how humans perceive
them:
i. Hue (H): Represents the type of color (e.g., red, green, blue) as an angle in the range [0°, 360°].
ii. Saturation (S): Measures the purity or vividness of the color, ranging
from 0 (gray) to 1 (pure color).
iii. Intensity (I): Represents the brightness, calculated as the average of
RGB values.
b. Applications:
i. Image processing for tasks like histogram equalization and
segmentation.
ii. Better suited for human perception than RGB.
c. Advantages:
i. Intuitive representation for humans.
ii. Separates luminance (intensity) from color information.
d. Disadvantages:
i. Conversion between HSI and RGB is computationally complex.
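
A small Python sketch (an illustrative assumption) of two of the conversions described above, working on normalized RGB values in [0, 1]: the CMY complement and the HSI intensity.

import numpy as np

def rgb_to_cmy(rgb):
    return 1.0 - rgb                                # C = 1 - R, M = 1 - G, Y = 1 - B

def rgb_intensity(rgb):
    return rgb.mean(axis=-1)                        # HSI intensity: average of R, G, B

pixel = np.array([1.0, 0.0, 0.0])                   # pure red
print(rgb_to_cmy(pixel))                            # [0. 1. 1.]: no cyan, full magenta and yellow
print(rgb_intensity(pixel))                         # 0.333...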

3. Explain Image segmentation techniques.


i. Image segmentation is a fundamental technique in digital image processing and
computer vision.
ii. It involves partitioning a digital image into multiple segments (regions or objects) to
simplify and analyze an image by separating it into meaningful components, Which
makes the image processing more efficient by focusing on specific regions of
interest.
iii. A typical image segmentation task goes through the following steps:
a. Groups pixels in an image based on shared characteristics like colour,
intensity, or texture.
b. Assigns a label to each pixel, indicating its belonging to a specific segment or
object.
c. The resulting output is a segmented image, often visualized as a mask or
overlay highlighting the different segments.
iv. Image segmentation is crucial in computer vision tasks because it breaks down
complex images into manageable pieces. It's like separating ingredients in a dish.
v. By isolating objects (things) and backgrounds (stuff), image analysis becomes more
efficient and accurate.
vi. This is essential for tasks like self-driving cars identifying objects.
vii. Understanding the image's content at this granular level unlocks a wider range of
applications in computer vision.
viii. Image segmentation techniques rely on principles of image processing, mathematical operations and heuristics to separate an image into meaningful regions; some of them are described below:
a. Thresholding: This method involves selecting a threshold value and
classifying image pixels between foreground and background based on
intensity values
b. Edge Detection: Edge detection method identify abrupt change in intensity or
discontinuation in the image. It uses algorithms like Sobel, Canny or
Laplacian edge detectors.
c. Region-based segmentation: This method segments the image into smaller
regions and iteratively merges them based on predefined attributes in colour,
intensity and texture to handle noise and irregularities in the image.
d. Clustering Algorithm: This method uses algorithms like K-means or Gaussian
models to group object pixels in an image into clusters based on similar
features like colour or texture.
e. Watershed Segmentation: The watershed segmentation treats the image like
a topographical map where the watershed lines are identified based on pixel
intensity and connectivity like water flowing down different valleys.
4. Explain following region approach.
a. Region-Growing Based segmentation
b. Region Splitting
c. Region Merging
d. Split& Merge
e. Region Growing
i. Region splitting: start with the whole image as one region and recursively split any region that does not satisfy a homogeneity criterion (e.g., similar intensity), typically using a quadtree structure.
ii. Region merging: start with many small regions (or individual pixels) and merge adjacent regions whose properties (colour, intensity, texture) are similar enough.
iii. Split and merge: combine the two, first splitting non-homogeneous regions and then merging adjacent regions that are similar, which avoids over-segmentation.
iv. Region growing: start from one or more seed pixels and append neighbouring pixels that satisfy a similarity criterion until no more pixels can be added.
5. Short note on Thresholding
i. Thresholding is a simple and widely used technique in image segmentation. It divides
the image into regions by comparing pixel intensity values to a predefined threshold
or a set of thresholds.
ii. The result is a binary or multi-level segmented image.
iii. Thresholding is an essential foundation for many advanced segmentation
techniques, providing a quick and effective method for separating regions of interest.

iv. Advantages:
a. Simple and computationally efficient.
b. Works well for images with distinct intensity differences.
v. Limitations:
a. Fails for images with overlapping intensity ranges between regions.
b. Sensitive to noise and illumination variations.
c. Ineffective for complex textures or gradients.
vi. Applications:
a. Document text extraction.
b. Medical imaging.
c. Object detection in computer vision.
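
A minimal Python sketch of global thresholding (the threshold T = 127 and the 2x2 test image are arbitrary examples): pixels above T become foreground, the rest become background.

import numpy as np

def threshold(image, T=127):
    return np.where(image > T, 255, 0).astype(np.uint8)   # binary segmented image

img = np.array([[10, 200], [130, 40]], dtype=np.uint8)
print(threshold(img))                                      # [[0 255] [255 0]]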
6. Short note on Edge-based segmentation using:
a. Laplacian mask
b. Sobel mask
c. Prewitt mask
d. Canny edge detection
i. Laplacian Operator
a. Laplacian Operator is also a derivative operator which is used to find
edges in an image. Laplacian is a second order derivative mask. It can be
further divided into positive laplacian and negative laplacian.
ii. The Sobel operator is very similar to the Prewitt operator. It is also a derivative mask and is used for edge detection. It calculates edges in both the horizontal and vertical directions.
iii. Prewitt operator is used for detecting edges horizontally and vertically
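
Illustrative sketch (assuming NumPy and SciPy are available) of the Sobel, Prewitt and Laplacian masks referred to above, applied with a library convolution; for Canny edge detection, OpenCV's cv2.Canny performs the full pipeline (smoothing, gradient, non-maximum suppression, hysteresis).

import numpy as np
from scipy.ndimage import convolve

sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])     # x-direction Sobel mask
prewitt_x = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]])   # x-direction Prewitt mask
laplacian = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]])     # second-derivative (Laplacian) mask

img = np.random.rand(32, 32)
gx = convolve(img, sobel_x)                # horizontal gradient
gy = convolve(img, sobel_x.T)              # vertical gradient (transposed mask)
edges = np.hypot(gx, gy)                   # gradient magnitude
lap = convolve(img, laplacian)             # Laplacian response (zero-crossings mark edges)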
7. Explain Hough Transform
i. The Hough Transform is a popular technique in computer vision and image
processing, used for detecting geometric shapes like lines, circles, and other
parametric curves.
ii. It has numerous applications in various domains such as medical imaging, robotics,
and autonomous driving.
iii. Fundamentally, it transfers these shapes' representation from the spatial domain to
the parameter space, allowing for effective detection even in the face of distortions
like noise or occlusion.
iv. The accumulator array, sometimes referred to as the parameter space or Hough
space, is the first thing that the Hough Transform creates.
v. The available parameter values for the shapes that are being detected are
represented by this space. The slope (m) and y-intercept (b) of a line, for instance,
could be the parameters in the line detection scenario.
vi. The Hough Transform calculates the matching curves in the parameter space for
each edge point in the image. This is accomplished by finding the curve that
intersects the parameter values at the spot by iterating over all possible values of the
parameters. The "votes" or intersections for every combination of parameters are
recorded by the accumulator array.
vii. In the end, the programme finds peaks in the accumulator array that match the
parameters of the shapes it has identified. These peaks show whether the image
contains lines, circles, or other shapes.
viii. Applications of Hough Transform
a. Line Detection: In applications such as road lane detection, architectural
feature recognition, and industrial inspection, Hough Transform is often used
to detect straight lines.
b. Circle Detection: It is widely used in applications like detecting circular objects
(e.g., detecting traffic signs, medical imaging for detecting circular structures
like blood vessels or tumors).
c. Shape Matching: In pattern recognition, Hough Transform can be used to
match specific geometric shapes to a given image.
d. Robotics: In robotic vision, the Hough Transform helps in detecting features
like walls, obstacles, or paths.
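
A hedged OpenCV sketch of Hough line detection (the synthetic test image and the vote threshold of 100 are arbitrary assumptions): Canny edges feed the accumulator, and each detected line is returned as (rho, theta) in parameter space.

import cv2
import numpy as np

img = np.zeros((200, 200), dtype=np.uint8)
cv2.line(img, (0, 50), (199, 50), 255, 2)            # a horizontal white line as the toy input

edges = cv2.Canny(img, 50, 150)                      # edge map
lines = cv2.HoughLines(edges, 1, np.pi / 180, 100)   # rho step, theta step, accumulator threshold
if lines is not None:
    for rho, theta in lines[:, 0]:
        print(f"rho={rho:.1f}, theta={np.degrees(theta):.1f} deg")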

8. Explain edge models: a. Step edge, b. Ramp edge, c. Roof edge
i. In image processing, edge models are mathematical representations used to
describe different types of edges in an image.
ii. These models help in understanding the characteristics of edges, which are
important for edge detection and segmentation tasks.
iii. Below are three common types of edge models:
a. Step edge: an abrupt transition between two intensity levels, ideally occurring over a single pixel.
b. Ramp edge: a gradual intensity transition spread over several pixels, typically produced by blurring or defocus; the slope of the ramp depends on the degree of blurring.
c. Roof edge: the intensity rises and then falls over a short distance; it models thin lines or ridges in an image.
Unit-5
1. What is image compression? Explain need of image
compression.

i. Image Compression is the process of reducing the size of an image file without
compromising the image quality beyond a certain acceptable limit.
ii. This reduction in file size is achieved by eliminating redundant data within the image.
iii. The goal is to reduce the storage space needed for the image and minimize the
bandwidth required for its transmission, while maintaining the image quality as much
as possible.
iv. There are two main types of image compression:
a. Lossy Compression:
i. In lossy compression, some of the image data is permanently
removed, leading to a loss of quality.
ii. The reduction in quality is typically not noticeable to the human eye
but can be significant if compressed too much.
iii. JPEG is the most common example of a lossy compression format.
b. Lossless Compression:
i. In lossless compression, no image data is lost, and the original image
can be perfectly reconstructed from the compressed version.
ii. While lossless compression may not achieve as much reduction in
size as lossy compression, it ensures that the image quality is
preserved.
iii. PNG and GIF are examples of lossless compression formats.
v. The need for image compression arises due to several practical reasons, including:
a. Storage Efficiency:
i. Large File Sizes: Digital images, especially high-resolution ones, can
have large file sizes. Storing high-quality images can consume
significant disk space, especially for large collections, such as medical
images, satellite images, or photographs.
ii. Reduces Storage Requirements: Image compression reduces the
storage space required for each image, allowing more images to be
stored on devices with limited storage, such as mobile phones,
computers, and cloud storage systems.
b. Faster Transmission and Downloading:
i. Bandwidth Limitation: Sending high-resolution images over the
internet requires significant bandwidth. This can result in slower
transmission times, especially over slow or congested networks. By
compressing images, their file size decreases, enabling faster loading,
downloading, and sharing.
ii. Web Usage: For websites, reducing the size of images helps in faster
page loading times, leading to better user experience and potentially
improved search engine rankings (since page speed is a ranking
factor).
c. Efficient Network Communication
i. Transmission Over Limited Networks: In scenarios like satellite
communication, mobile networks, and video conferencing, transmitting
large image files can be problematic due to bandwidth constraints.
Compressing images reduces the amount of data transmitted over
these limited networks, improving communication efficiency.
ii. Streaming Media: Image compression is also crucial in streaming
applications, where large amounts of image data (such as video
frames) need to be transmitted in real-time. Compression ensures
smooth playback with reduced buffering.
d. Improved Performance for Applications
i. Real-Time Processing: In applications like video surveillance, medical
imaging, and remote sensing, large image files may need to be
processed in real-time. By compressing the images, the computational
load and memory requirements are reduced, leading to faster
processing times.
ii. Storage in Embedded Systems: Devices like cameras, smartphones,
and drones often use image compression to store images in a limited
memory. Compressing images allows these devices to store more
images without compromising on image capture quality.
e. Preservation of Data Integrity
i. Reducing Data Redundancy: In an image, certain data is redundant,
such as repetitive patterns, similar pixel values, or areas of uniform
color. Image compression helps remove this redundant data, leading
to efficient storage without losing valuable information in the image,
especially in lossless compression.
2. Explain following Redundancy in images
a. Data redundancy
b. Coding redundancy
c. Interpixel redundancy
d. Psychovisual redundancy
i. Redundancy in images refers to the presence of repetitive or unnecessary
information that can be exploited to compress the image without losing important
details.
ii. Image redundancy is the key factor that enables image compression techniques to
reduce file size while maintaining quality.
iii. There are different types of redundancy in images, which can be classified into the
following categories:
iv. Data Redundancy
a. Definition:
i. Data redundancy refers to the repetition of data within the image file. It
occurs when the same information is stored multiple times in the
image, and it can be removed or reduced during compression without
affecting the image's content.
b. Examples:
i. Repeated Patterns: In an image, large areas with uniform colors or
patterns can cause data redundancy. For example, an image with a
blue sky or a flat-colored background can be compressed by noting
that the color remains constant over a large area.
ii. Same Pixel Values: Multiple adjacent pixels may have the same color
or intensity values, especially in images with smooth gradients or
areas of uniform color. This redundancy can be reduced by storing
only one value and referencing it multiple times.
c. Compression Techniques:
i. Run-Length Encoding (RLE): This technique is used to encode data
redundancy. It stores sequences of identical values as a single value
and a count, reducing the amount of data required to represent
repetitive pixel values.
v. Coding Redundancy
a. Definition:
i. Coding redundancy arises from inefficient coding schemes or the use
of fixed-length codes to represent pixel values. It refers to the use of
codes that take up more bits than necessary to represent the
information.
b. Examples:
i. Huffman Coding: In images, some pixel values (or combinations of
pixel values) occur more frequently than others. Coding redundancy
can be reduced by assigning shorter codes to frequently occurring
values and longer codes to less frequent values, which optimizes the
storage.
c. Compression Techniques:
i. Entropy Encoding: This is a technique used to reduce coding
redundancy by assigning shorter codes to more frequent symbols
(such as pixel values). Examples include Huffman Coding and
Arithmetic Coding. These algorithms effectively compress data by
minimizing the average number of bits per symbol.
vi. Interpixel Redundancy
a. Definition:
i. Interpixel redundancy refers to the correlation or similarity between
adjacent pixels or neighboring regions in an image. This redundancy
is present because adjacent pixels often have similar values,
especially in regions of uniform color or smooth gradients.
b. Examples:
i. Smooth Gradients: In natural images or photographs, neighboring
pixels tend to have similar color values, creating interpixel
redundancy. For example, in an image of a sunset, the pixels that
make up the sky may be similar to each other, leading to redundancy.
ii. Edges: Even in areas with high contrast (such as object boundaries),
there is often some degree of correlation between adjacent pixels,
where certain patterns of edge transitions repeat across the image.
c. Compression Techniques:
i. Prediction: One method to exploit interpixel redundancy is to predict
the value of a pixel based on its neighboring pixels, encoding only the
difference (error) between the predicted and actual pixel value. This is
done in techniques such as Differential Pulse Code Modulation
(DPCM).
ii. Transform Coding: In methods like Discrete Cosine Transform (DCT)
(used in JPEG compression), the image is transformed into a
frequency domain where redundant low-frequency information is
grouped together, and high-frequency components are discarded.
vii. Psychovisual Redundancy
a. Definition:
i. Psychovisual redundancy exploits the limitations of human vision. The
human visual system is less sensitive to certain kinds of image details,
such as small color variations or high-frequency components, making
them redundant from a perceptual perspective.
b. Examples:
i. Color Sensitivity: Humans are more sensitive to changes in brightness
(luminance) than to changes in color (chrominance). As a result, small
color variations may not be noticeable, even though they are present
in the image.
ii. High-frequency Details: The human eye has limited resolution for
detecting fine details, especially at high frequencies. Therefore, high-
frequency image components (such as sharp textures or noise) can
be removed or compressed without a noticeable loss in perceived
quality.
c. Compression Techniques:
i. Chroma Subsampling: This technique reduces the resolution of the
color components (chrominance) of an image while keeping the
brightness component (luminance) at full resolution.
ii. Perceptual Quantization: This method reduces the precision of certain
image details that are less noticeable to the human eye, thus
achieving better compression without significant perceptual loss.
3. Explain Run-length coding.
i. Run-length encoding (RLE) is a form of lossless data compression in which runs of
data (consecutive occurrences of the same data value) are stored as a single
occurrence of that data value and a count of its consecutive occurrences, rather than
as the original run.
ii. This method replaces a run of consecutive repetitions of a symbol by a single occurrence of that symbol followed by the number of repetitions.
iii. Example: when encoding an image built up from colored dots, the sequence "green
green green green " is shortened to "green x 4"
iv. For example, if the input string is “wwwwaaadexxxxxx”, then the function should
return “w04a03d01e01x06”
v. Follow the steps below to solve this problem:
a. Pick the first character from the source string.
b. Append the picked character to the destination string.
c. Count the number of subsequent occurrences of the picked character and
append the count to the destination string.
d. Pick the next character and repeat steps (b) and (c) if the end of the string has not been reached.
vi. Can compress any type of data but cannot achieve high compression ratios
compared to other compression methods.
vii. The original data can be perfectly reconstructed from the encoded data.
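
A short Python sketch of the encoder described above, producing the "w04a03d01e01x06"-style output (the two-digit count format is taken from the example in the notes):

def rle_encode(s):
    out = []
    i = 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:    # count the run of identical symbols
            j += 1
        out.append(f"{s[i]}{j - i:02d}")      # symbol followed by its run length
        i = j
    return "".join(out)

print(rle_encode("wwwwaaadexxxxxx"))          # w04a03d01e01x06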

4. Short note on Huffman Coding.


i. Huffman Coding is a lossless data compression algorithm used to efficiently encode
data by assigning shorter codes to more frequently occurring symbols and longer
codes to less frequent symbols.
ii. It is a widely used method in data compression, providing an optimal prefix code for a
given set of symbol frequencies.
iii. Steps to build Huffman Tree:
a. Input is an array of unique characters along with their frequency of
occurrences and output is Huffman Tree.
b. Sort the characters by frequency, ascending. These are kept in a Q/min-heap
priority queue.
c. For each distinct character and its frequency in the data stream, create a leaf
node.
d. Remove the two nodes with the two lowest frequencies from the nodes, and
the new root of the tree is created using the sum of these frequencies.
i. Make the first extracted node its left child and the second extracted
node its right child while extracting the nodes with the lowest
frequency from the min-heap.
ii. To the min-heap, add this node.
iii. Since the left side of the root should always contain the minimum
frequency.
e. Repeat step (d) until there is only one node left on the heap, i.e., until all characters are represented in a single tree. The tree is finished when just the root node remains.
(Symbols with equal frequencies, such as A and E in the worked example, can switch places in the tree without changing the code lengths. A code sketch follows below.)
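
A compact Python sketch of the procedure above using a min-heap (the input string is an arbitrary example); frequent symbols end up with shorter codes.

import heapq
from collections import Counter

def huffman_codes(text):
    heap = [[freq, [sym, ""]] for sym, freq in Counter(text).items()]
    heapq.heapify(heap)                        # priority queue ordered by frequency
    while len(heap) > 1:
        lo = heapq.heappop(heap)               # the two lowest-frequency nodes...
        hi = heapq.heappop(heap)
        for pair in lo[1:]:
            pair[1] = "0" + pair[1]            # left branch contributes a 0
        for pair in hi[1:]:
            pair[1] = "1" + pair[1]            # right branch contributes a 1
        heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])   # ...merged, with summed frequency
    return dict(heap[0][1:])

print(huffman_codes("AAAABBBCCD"))             # e.g. A gets a 1-bit code, D a 3-bit code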
5. Explain Arithmetic Coding.
i. Arithmetic coding is a lossless data compression technique that represents an entire
message as a single number, a fraction between 0 and 1.
ii. Unlike other compression methods (e.g., Huffman coding), which assign discrete
codes to individual symbols, arithmetic coding encodes the entire message into a
single continuous range.
iii. This allows for more efficient use of available bits, especially for data with skewed
symbol probabilities.
iv. At each step, the current interval [lower, upper) is narrowed using d = upper bound - lower bound.
v. New sub-range of the current symbol: new lower = lower + d x (cumulative probability below the symbol); new upper = lower + d x (cumulative probability up to and including the symbol). (A small sketch follows.)
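
A tiny Python sketch of the interval-narrowing rule above (the symbol probabilities are an arbitrary example); the final interval identifies the whole message.

def arithmetic_interval(message, probs):
    cum, total = {}, 0.0
    for sym, p in probs.items():               # cumulative probability below each symbol
        cum[sym] = total
        total += p
    low, high = 0.0, 1.0
    for sym in message:
        d = high - low                              # d = upper bound - lower bound
        high = low + d * (cum[sym] + probs[sym])    # new upper bound
        low = low + d * cum[sym]                    # new lower bound
    return low, high                            # any number in [low, high) encodes the message

print(arithmetic_interval("AAB", {"A": 0.6, "B": 0.4}))   # (0.216, 0.36)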
6. Explain Lempel-Ziv coding.
i. Lempel-Ziv (LZ) coding is a family of dictionary-based lossless compression methods (LZ77, LZ78, LZW).
ii. Instead of coding individual symbols, it replaces repeated strings of symbols with references to earlier occurrences (LZ77) or with indices into a dictionary that is built while the data is read (LZ78/LZW).
iii. No probability model is required in advance; the dictionary adapts to the data, which is why LZ variants are used in formats such as GIF, PNG (via DEFLATE) and ZIP.
