Important Questions
1) Using the background information provided in Section 2.1, and thinking purely in
geometric terms, estimate the diameter of the smallest printed dot that the eye can discern
if the page on which the dot is printed is 0.2 m away from the eyes. Assume for simplicity
that the visual system ceases to detect the dot when the image of the dot on the fovea
becomes smaller than the diameter of one receptor (cone) in that area of the retina. Assume
further that the fovea can be modeled as a square array of dimensions and that the cones
and spaces between the cones are distributed uniformly throughout this array.
Answer: The diameter, x, of the retinal image corresponding to the dot is obtained from
similar triangles, as shown in the figure:
(d/2) / 0.2 = (x/2) / 0.017,
which gives x = 0.085d.
The fovea can be thought of as a square sensor array having on the order of 337,000
elements, which translates into an array of size 580 × 580 elements. Assuming equal
spacing between elements, this gives 580 elements and 579 spaces on a line 1.5 mm long.
The size of each element and each space is then s = (1.5 mm)/1,159 ≈ 1.3 × 10^-6 m.
If the size (on the fovea) of the imaged dot is less than the size of a single resolution
element, we assume that the dot will be invisible to the eye. In other words, the eye will
not detect a dot if its diameter, d, is such that 0.085(d) < 1.3 × 10^-6 m, that is, if
d < 15.3 × 10^-6 m.
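As a numeric check, the calculation above can be reproduced with a short Python sketch; the constants are those stated in the answer, and the variable names are illustrative, not part of the original text.

```python
# Reproduce the arithmetic of Question 1 (values from the answer above).
fovea_width_m = 1.5e-3                      # side of the square foveal array
num_cones = 337_000                         # cones in the fovea
elements_per_line = int(num_cones ** 0.5)   # ~580 elements per line (truncated)
spaces_per_line = elements_per_line - 1     # 579 spaces between them
s = fovea_width_m / (elements_per_line + spaces_per_line)   # ~1.3e-6 m per element/space

# Similar triangles: x / 0.017 = d / 0.2  ->  x = 0.085 * d.
# The dot becomes invisible when x < s, i.e. when d < s / 0.085.
d_min = s / 0.085
print(f"element size s = {s:.2e} m, smallest visible dot d = {d_min:.2e} m")
```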
2) When you enter a dark theater on a bright day, it takes an appreciable interval of time
before you can see well enough to find an empty seat. Which of the visual processes
explained in Section 2.1 is at play in this situation?
Answer: - The process at play is brightness adaptation: the visual system needs time to adapt from the bright outdoor illumination level to the much lower level inside the theater.
3) You are hired to design the front end of an imaging system for studying the boundary
shapes of cells, bacteria, viruses, and protein. The front end consists, in this case, of the
illumination source(s) and corresponding imaging camera(s). The diameters of circles
required to enclose individual specimens in each of these categories are 50, 1, 0.1, and
0.01 μm, respectively.
a) Can you solve the imaging aspects of this problem with a single sensor and camera?
If your answer is yes, specify the illumination wavelength band and the type of
camera needed. By “type,” we mean the band of the electromagnetic spectrum to
which the camera is most sensitive (e.g., infrared).
b) If your answer in (a) is no, what type of illumination sources and corresponding
imaging sensors would you recommend? Specify the light sources and cameras as
requested in part (a). Use the minimum number of illumination sources and
cameras needed to solve the problem.
By “solving the problem,” we mean being able to detect circular details of diameter 50, 1,
0.1, and 0.01 μm, respectively.
Answer: - (a) From the discussion of the electromagnetic spectrum in Section 2.2, the
source of the illumination required to see an object must have a wavelength the same size
as or smaller than the object. Because interest lies only in the boundary shape and not in
other spectral characteristics of the specimens, a single illumination source in the far
ultraviolet (wavelength of 0.001 μm or less) will be able to detect all objects. A far-
ultraviolet camera sensor would be needed to image the specimens.
(b) No answer is required because the answer to (a) is affirmative.
4) A CCD camera chip of dimensions 7 × 7 mm, and having 1024 × 1024 elements, is focused
on a square, flat area located 0.5 m away. How many line pairs per mm will this camera be
able to resolve? The camera is equipped with a 35-mm lens.
(Hint: Model the imaging process as shown in the figure, with the focal length of the
camera lens substituting for the focal length of the eye.)
Answer: - From the geometry of the figure, (7 mm)/(35 mm) = z/(500 mm), or z = 100 mm.
So the target size is 100 mm on a side. We have a total of 1024 elements per line, so the
resolution of one line is 1024/100 ≈ 10 elements/mm. For line pairs we divide by 2, giving
an answer of approximately 5 lp/mm.
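The same arithmetic, in a short Python sketch; the chip size and element count are the values used in the answer above, and the variable names are illustrative.

```python
# Resolution of the CCD camera of Question 4 (values from the answer above).
chip_side_mm = 7.0       # side of the CCD chip
focal_mm     = 35.0      # lens focal length
distance_mm  = 500.0     # distance to the target plane
elements     = 1024      # elements per line on the chip

target_side_mm   = chip_side_mm / focal_mm * distance_mm   # z = 100 mm on a side
elements_per_mm  = elements / target_side_mm               # ~10 elements per mm
line_pairs_per_mm = elements_per_mm / 2                     # 2 elements per line pair
print(target_side_mm, elements_per_mm, line_pairs_per_mm)   # 100.0 10.24 5.12
```

The exact figures are 10.24 elements/mm and 5.12 lp/mm; the answer above rounds them to 10 and 5.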
5) An automobile manufacturer is automating the placement of certain components on the
bumpers of a limited-edition line of sports cars. The components are color coordinated, so
the robots need to know the color of each car to select the appropriate bumper component.
Models come in only four colors: blue, green, red, and white. You are hired to propose a
solution based on imaging. How would you solve the problem of automatically
determining the color of each car, keeping in mind that cost is the most important
consideration in your choice of components?
Answer:- One possible solution is to equip a monochrome camera with a mechanical
device that sequentially places a red, a green and a blue pass filter in front of the lens. The
strongest camera response determines the color. If all three responses are approximately
equal, the object is white. A faster system would utilize three different cameras, each
equipped with an individual filter. The analysis then would be based on polling the
response of each camera. This system would be a little more expensive, but it would be
faster and more reliable. Note that both solutions assume that the field of view of the
camera(s) is such that it is completely filled by a uniform area of the car's color; otherwise,
the region of uniform color would first have to be isolated before the responses are compared.
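A minimal sketch of the single-camera, three-filter decision logic might look as follows. The response values and the white-tolerance threshold are illustrative assumptions, not part of the original answer.

```python
# Classify a car's color from mean camera responses taken through R, G and B pass filters.
def classify_car_color(r, g, b, white_tolerance=0.1):
    responses = {"red": r, "green": g, "blue": b}
    lo, hi = min(responses.values()), max(responses.values())
    # If all three filtered responses are roughly equal, the surface is white.
    if hi - lo <= white_tolerance * hi:
        return "white"
    # Otherwise the strongest response determines the color.
    return max(responses, key=responses.get)

print(classify_car_color(0.9, 0.3, 0.2))    # red
print(classify_car_color(0.8, 0.82, 0.79))  # white
```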
6) Consider the two image subsets S1 and S2 shown in the following figure. For V = {1}, determine
whether these two subsets are
(a) 4-adjacent,
(b) 8-adjacent, or
(c) m-adjacent.
Answer:-
Let p ∈ S1 and q ∈ S2 be as shown in the figure above. Then,
(a) S1 and S2 are not 4-adjacent because q is not in the set N4(p);
(b) S1 and S2 are 8-adjacent because q is in the set N8(p);
(c) S1 and S2 are m-adjacent because (i) q is in ND(p), and (ii) the set N4(p) ∩ N4(q) has no pixels whose values are from V.
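The adjacency tests above can also be written directly in code. The Python sketch below uses a hypothetical pair of foreground pixels (the subsets from the figure are not reproduced here); the helper names are illustrative.

```python
# Adjacency tests for pixels whose values are in V; coordinates are (row, col).
def n4(p):
    r, c = p
    return {(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)}

def nd(p):
    r, c = p
    return {(r - 1, c - 1), (r - 1, c + 1), (r + 1, c - 1), (r + 1, c + 1)}

def m_adjacent(p, q, fg):
    """p, q are m-adjacent if q is in N4(p), or q is in ND(p) and
    N4(p) ∩ N4(q) contains no foreground pixels (pixels with values from V)."""
    if q in n4(p):
        return True
    return q in nd(p) and not (n4(p) & n4(q) & fg)

foreground = {(0, 0), (1, 1)}          # hypothetical pixels whose values are in V = {1}
p, q = (0, 0), (1, 1)
print(q in n4(p), q in nd(p), m_adjacent(p, q, foreground))   # False True True
```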
7) Consider the image segment shown.
a) Let V = {0, 1} and compute the lengths of the shortest 4-, 8-,
and m-path between p and q. If a particular path does not exist
between these two points, explain why.
b) Repeat for V = {1, 2}
Answer: -
a) When V = {0,1}, a 4-path does not exist between p and q because it is impossible to get
from p to q by traveling along points that are both 4-adjacent and also have values from V.
Figure (a) shows this condition; it is not possible to get to q. The shortest 8-path is shown
in Fig. (b); its length is 4. The length of the shortest m-path (shown dashed) is 5. Both of
these shortest paths are unique in this case.
b) One possibility for the shortest 4-path when V = {1,2} is shown in Fig. (c); its length is 6.
It is easily verified that another 4-path of the same length exists between p and q. One
possibility for the shortest 8-path (it is not unique) is shown in Fig. (d); its length is 4. The
length of a shortest m-path (shown dashed) is 6. This path is not unique.
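For completeness, shortest 4- and 8-paths of this kind can be found by breadth-first search. The Python sketch below uses an assumed example grid chosen to be consistent with the path lengths quoted above (the original image segment is not reproduced in this document); m-paths are not handled.

```python
# Breadth-first search for the shortest 4- or 8-path through pixels with values in V.
from collections import deque

def shortest_path_length(img, p, q, V, connectivity=4):
    steps4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    steps8 = steps4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    steps = steps4 if connectivity == 4 else steps8
    rows, cols = len(img), len(img[0])
    queue, seen = deque([(p, 0)]), {p}
    while queue:
        (r, c), d = queue.popleft()
        if (r, c) == q:
            return d
        for dr, dc in steps:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in seen and img[nr][nc] in V:
                seen.add((nr, nc))
                queue.append(((nr, nc), d + 1))
    return None  # no such path exists

# Assumed example grid; p is the lower-left pixel, q the upper-right pixel.
grid = [[3, 1, 2, 1],
        [2, 2, 0, 2],
        [1, 2, 1, 1],
        [1, 0, 1, 2]]
p, q = (3, 0), (0, 3)
print(shortest_path_length(grid, p, q, V={0, 1}, connectivity=4))  # None (no 4-path)
print(shortest_path_length(grid, p, q, V={0, 1}, connectivity=8))  # 4
print(shortest_path_length(grid, p, q, V={1, 2}, connectivity=4))  # 6
print(shortest_path_length(grid, p, q, V={1, 2}, connectivity=8))  # 4
```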
8) Image subtraction is used often in industrial applications for detecting missing
components in product assembly. The approach is to store a “golden” image that
corresponds to a correct assembly; this image is then subtracted from incoming images of
the same product. Ideally, the differences would be zero if the new products are
assembled correctly. Difference images for products with missing components would be
nonzero in the area where they differ from the golden image. What conditions do you
think have to be met in practice for this method to work?
Answer: - Let g(x,y) denote the golden image, and let f(x,y) denote any input image
acquired during routine operation of the system. Change detection via subtraction is
based on computing the simple difference d(x,y)= g(x,y)−f(x,y). The resulting image, d(x,y),
can be used in two fundamental ways for change detection.
One way is using pixel-by-pixel analysis. In this case, we say that f(x,y) is “close enough”
to the golden image if all the pixels in d(x,y) fall within a specified threshold band [Tmin,
Tmax] where Tmin is negative and Tmax is positive. Usually, the same value of threshold is
used for both negative and positive differences, so that we have a band [−T, T] in which
all pixels of d(x,y) must fall in order for f(x,y) to be declared acceptable. The second major
approach is simply to sum the absolute values of all the pixels in d(x,y) and compare the sum
against a threshold Q; the absolute value is needed to keep positive and negative differences
from canceling out. This is a much cruder test, so the remaining discussion concentrates on
the first approach. A short sketch of both tests is given below.
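A minimal NumPy sketch of the two tests described above; the array sizes and the thresholds T and Q are illustrative assumptions.

```python
# Difference-based inspection against a "golden" image.
import numpy as np

def inspect(golden, test, T=10, Q=500):
    d = golden.astype(np.int32) - test.astype(np.int32)   # d(x,y) = g(x,y) - f(x,y)
    pixel_ok = np.all(np.abs(d) <= T)                      # every pixel within the band [-T, T]
    sum_ok = np.abs(d).sum() <= Q                          # cruder test on the sum of |d(x,y)|
    return bool(pixel_ok), bool(sum_ok)

golden = np.full((64, 64), 200, dtype=np.uint8)
test = golden.copy()
test[20:24, 30:34] = 60                                    # simulate a missing component
print(inspect(golden, golden))   # (True, True)
print(inspect(golden, test))     # (False, False)
```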
There are three fundamental factors that need tight control for difference-based inspection
to work:
(1) proper registration,
(2) controlled illumination, and
(3) noise levels that are low enough so that difference values are not affected appreciably
by variations due to noise.
The first condition basically addresses the requirement that comparisons be made
between corresponding pixels. Two images can be identical, but if they are displaced with
respect to each other, comparing the differences between them makes no sense. Often,
special markings are manufactured into the product for mechanical or image-based
alignment.
Controlled illumination (note that “illumination” is not limited to visible light) obviously
is important because changes in illumination can affect dramatically the values in a
difference image. One approach used often in conjunction with illumination control is
intensity scaling based on actual conditions. For example, the products could have one or
more small patches of a tightly controlled color, and the intensity (and perhaps even color)
of each pixel in the entire image would be modified based on the actual versus expected
intensity and/or color of the patches in the image being processed.
Finally, the noise content of a difference image needs to be low enough so that it does not
materially affect comparisons between the golden and input images. Good signal strength
goes a long way toward reducing the effects of noise.
9) With reference to Table 2.3 (4th ed.), provide single, composite transformation functions
for performing the following operations:
a) Scaling and translation.
b) Scaling, translation, and rotation.
c) Vertical shear, scaling, translation, and rotation.
d) Does the order of multiplication of the individual matrices to produce a single
transformation make a difference? Give an example based on a scaling/translation
transformation to support your answer.
Answer: -
a) Scaling and translation: to perform scaling and translation together, the two
transformation matrices are combined into a single composite transformation. The order
in which the matrices are multiplied matters, as the two products below show.
The composite transformation for a 2-D object is:

[ sx  0  0 ] [ 1  0  tx ]   [ sx  0   sx·tx ]
[ 0  sy  0 ] [ 0  1  ty ] = [ 0   sy  sy·ty ]
[ 0   0  1 ] [ 0  0  1  ]   [ 0   0   1     ]

If the order is changed, the result is different:

[ 1  0  tx ] [ sx  0  0 ]   [ sx  0   tx ]
[ 0  1  ty ] [ 0  sy  0 ] = [ 0   sy  ty ]
[ 0  0  1  ] [ 0   0  1 ]   [ 0   0   1  ]
The remaining parts can be completed in the same fashion:
b) Scaling, translation, and rotation (about the origin):

[ sx  0  0 ] [ 1  0  tx ] [ cosθ  -sinθ  0 ]
[ 0  sy  0 ] [ 0  1  ty ] [ sinθ   cosθ  0 ] = complete the multiplication yourself (CMYS)!
[ 0   0  1 ] [ 0  0  1  ] [ 0      0     1 ]
c) Vertical shear, scaling, translation, and rotation (about the origin):

[ 1  sv  0 ] [ sx  0  0 ] [ 1  0  tx ] [ cosθ  -sinθ  0 ]
[ 0  1   0 ] [ 0  sy  0 ] [ 0  1  ty ] [ sinθ   cosθ  0 ] = (CMYS)!
[ 0  0   1 ] [ 0   0  1 ] [ 0  0  1  ] [ 0      0     1 ]
d) Yes, the order of multiplication matters, as shown in part (a); a numeric check is given below.
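A quick numeric check of part (d) in Python; the scaling and translation values are arbitrary illustrations.

```python
# Scaling·translation vs translation·scaling give different composite matrices.
import numpy as np

sx, sy, tx, ty = 2.0, 3.0, 5.0, 7.0
S = np.array([[sx, 0, 0], [0, sy, 0], [0, 0, 1]])
T = np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]])

print(S @ T)                       # translation column becomes (sx*tx, sy*ty) = (10, 21)
print(T @ S)                       # translation column stays (tx, ty) = (5, 7)
print(np.allclose(S @ T, T @ S))   # False: the order of multiplication matters
```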
10) What do you mean by image enhancement? List some image enhancement techniques.
Answer: - Enhancement is the process of manipulating an image so that the result is more
suitable than the original for a specific application.
Some of the basic techniques are listed below:
a) Contrast adjustment:
i. Intensity transformations are among the simplest of all image processing
techniques (a short sketch of the transformations below is given after this list),
ii. Linear (negative and identity transformations),
iii. Logarithmic (log and inverse-log transformations),
iv. Power-law (nth power and nth root transformations)
b) Sharpening: This technique sharpens the edges of an image to make it appear crisper
and clearer.
c) Noise reduction: This technique reduces the noise or graininess in an image, which
can occur due to low light conditions.
d) Histogram equalization: This technique adjusts the histogram of an image to enhance
its contrast and brightness.
e) Image filtering: This technique involves applying filters to an image to enhance its
features or modify its appearance.
f) Derivatives: The first derivative of an image responds to intensity transitions and can
be used to enhance edges and overall sharpness. The second derivative (e.g., the Laplacian)
responds strongly to fine detail and is widely used for sharpening; because it also amplifies
noise, it is usually combined with smoothing rather than used to remove noise.
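As referenced in item (a), here is a short Python sketch of the basic intensity transformations; the normalizing constant c and the test ramp are illustrative assumptions.

```python
# Negative, log, and power-law intensity transformations for an 8-bit image r in [0, 255].
import numpy as np

def negative(r):                       # linear: s = (L - 1) - r
    return 255 - r

def log_transform(r):                  # logarithmic: s = c * log(1 + r)
    c = 255 / np.log(1 + 255)          # scale the output back to [0, 255]
    return (c * np.log1p(r.astype(np.float64))).astype(np.uint8)

def power_law(r, gamma):               # power law: s = c * r^gamma
    rn = r.astype(np.float64) / 255.0
    return (255 * rn ** gamma).astype(np.uint8)

r = np.arange(0, 256, dtype=np.uint8).reshape(16, 16)   # simple test ramp
print(negative(r)[0, :4], log_transform(r)[0, :4], power_law(r, 0.4)[0, :4])
```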
11) Compare ‘Convolution’ and ‘Correlation’.
Answer: - Convolution and correlation are two mathematical operations used in image
processing and computer vision. Although they share some similarities, they have
different applications and produce different results.
Convolution is a mathematical operation that involves two functions: the input signal and
a kernel or filter. The kernel is a small matrix that is rotated by 180°, slid over the input
signal, and combined with it by a sum of products at each location, producing a new output
signal. In computer vision and image processing, convolution is used for tasks such as
blurring, edge detection, and feature extraction.
Correlation, on the other hand, is a mathematical operation that also involves two
functions: the kernel is slid over the input signal in the same way, but it is not flipped
before the sum of products is computed at each location. Correlation is commonly used in
template matching, where a template image is compared with a larger image to find
instances of the template.
Some key differences between them are:
Operation: Convolution flips (rotates by 180°) the kernel before sliding it over the signal,
whereas correlation slides the kernel as-is.
Order: Convolution is a commutative operation, meaning that the order of the input
signals does not matter. Correlation is not commutative, so the order of the input signals
does matter.
Output: For the same padding choice, convolution and correlation produce outputs of the
same size; their values differ because of the kernel flip, and they coincide only when the
kernel is symmetric.
Symmetry: The two operations are related through mirroring: convolving a signal with a
kernel is equivalent to correlating it with a 180°-rotated version of that kernel.
Applications: Convolution is commonly used for filtering, feature extraction, and pattern
recognition in computer vision, while correlation is often used for template matching and
motion detection in computer vision and image processing, and for signal detection in
signal processing.
In summary, both operations slide a kernel over an input signal to produce an output
signal; convolution flips the kernel first, while correlation does not. Convolution is
commutative and is typically used for filtering and feature extraction; correlation is not
commutative and is often used for template matching and signal detection. The sketch
below illustrates the flip relationship.
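A small Python/SciPy sketch of the flip relationship described above; the array values are illustrative.

```python
# Correlating with a kernel w equals convolving with w rotated by 180 degrees.
import numpy as np
from scipy.ndimage import convolve, correlate

f = np.array([[0, 0, 0, 0, 0],
              [0, 0, 1, 0, 0],
              [0, 0, 0, 0, 0]], dtype=float)
w = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]], dtype=float)

conv = convolve(f, w, mode='constant')
corr = correlate(f, w, mode='constant')
print(np.allclose(corr, convolve(f, np.flip(w), mode='constant')))   # True
print(np.allclose(conv, corr))                                       # False
```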
12) The first and second derivatives are important mathematical operations in image
processing that have several applications. First, compare the first and second derivatives
with the help of the image given below; second, write some applications of derivatives in
computer vision and image processing.
Answer: -
First derivative:
a) Must be zero in areas of constant intensity.
b) Must be nonzero at the onset of an intensity step or ramp.
c) Must be nonzero along intensity ramps.
Second derivative:
a) Must be zero in areas of constant intensity.
b) Must be nonzero at the onset and end of an intensity step or ramp.
c) Must be zero along intensity ramps.
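These properties can be checked on a simple 1-D intensity profile. The profile below is an assumed example containing a ramp followed by a step, not the image from the question.

```python
# First and second differences of a 1-D profile with a ramp and a step.
import numpy as np

f = np.array([6, 6, 6, 5, 4, 3, 2, 1, 1, 1, 1, 6, 6, 6], dtype=float)
first = np.diff(f)          # first difference: f(x+1) - f(x)
second = np.diff(f, n=2)    # second difference: f(x+1) - 2 f(x) + f(x-1)

print(first)    # nonzero all along the ramp and at the step
print(second)   # nonzero only at the onset/end of the ramp and at the step
```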
Some of the applications of first and second
derivatives in image processing include:
Edge detection: The first derivative of an image can be used to detect edges in the image,
as edges correspond to regions of rapid change in pixel intensity. The second derivative
can be used to locate the edges more precisely and to distinguish between different types
of edges.
Image enhancement: The first derivative of an image can be used to enhance the contrast
and sharpness of the image by emphasizing intensity transitions. The second derivative
responds strongly to fine detail and is widely used for sharpening; because it also amplifies
noise, it is usually paired with smoothing.
Feature extraction: The first derivative can be used to extract features such as corners and
curves from an image, which can be useful for object recognition and tracking. The second
derivative can be used to extract features such as blobs and ridges from an image.
Image segmentation: The first derivative can be used to segment an image into regions of
similar pixel intensity, while the second derivative can be used to segment an image into
regions of similar texture.
Image registration: The first derivative can be used to register images by aligning the
edges of the images, while the second derivative can be used to refine the registration by
minimizing the difference in the pixel intensities.
In summary, the first and second derivatives have several applications in image
processing, including edge detection, image enhancement, feature extraction, image
segmentation, and image registration.
13) What is image sharpening?
Answer: - Image sharpening is a technique used in image processing to enhance the edges
and details in an image, resulting in a more crisp and clear appearance. Some of the
commonly used image sharpening techniques include:
Unsharp masking: This technique involves creating a blurred version of the original
image, subtracting it from the original image to obtain the high-frequency components
(the mask), and then adding the mask back to the original image to enhance its edges and
details (a short sketch is given at the end of this answer).
High-pass filtering: This technique involves applying a high-pass filter to the original
image to enhance the high-frequency components and suppress the low-frequency
components.
Laplacian filtering: This technique involves applying a Laplacian filter to the original
image to enhance its edges and details. The Laplacian filter is a second-order derivative
filter that highlights areas of rapid intensity change.
Gaussian Filter: This technique involves applying a Gaussian filter to the image to blur it
slightly and then subtracting it from the original image to enhance its edges.
Median Filter: This technique involves applying a median filter to the image to remove
the noise and then subtracting it from the original image to enhance its edges.
Contrast enhancement: This technique involves adjusting the contrast of an image to
increase the difference between its bright and dark regions, thereby enhancing its edges
and details.
Local Adaptive Contrast Enhancement: This technique involves enhancing the contrast
of the image in small regions to preserve the details while enhancing its edges.
In summary, image sharpening techniques are used to enhance the edges of the image
and make it appear crisper and clearer. The techniques discussed above include unsharp
masking, high-pass filtering, Laplacian filtering, Gaussian-based sharpening, median
filtering, contrast enhancement, and local adaptive contrast enhancement.
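As referenced above, a minimal Python sketch of unsharp masking; the blur sigma, the amount k, and the test image are illustrative assumptions, not part of the original answer.

```python
# Unsharp masking: original + k * (original - blurred).
import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(image, sigma=2.0, k=1.0):
    img = image.astype(np.float64)
    blurred = gaussian_filter(img, sigma=sigma)   # low-pass version of the image
    mask = img - blurred                          # high-frequency components (the mask)
    sharpened = img + k * mask                    # add detail back to the original
    return np.clip(sharpened, 0, 255).astype(np.uint8)

# Example: sharpen a synthetic image containing a bright square.
img = np.zeros((64, 64), dtype=np.uint8)
img[24:40, 24:40] = 200
print(unsharp_mask(img).max(), unsharp_mask(img, k=2.0).max())
```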
14) Consider the image f of size 7 × 7 and a mask m of size 3 × 3 given below:

f =
1   1   0   2   2   2   7
2   5   5   2   0   5   7
0   0   1   5   1   1   8
6   2   1   3   2   0   5
1   0   2   0   0   8   2
5   1   0   0   5   9   0
2   37  0   10  8   8   9

m =
1  0  1
1  0  1
1  0  1
Assume that image f is to be modified with the given mask ‘m’, and the result is to be put
into image g given as:
g1  g2  g3  g4  g5  g6  g7
g8  g9  g10 g11 g12 g13 g14
g15 g16 g17 g18 g19 g20 g21
g22 g23 g24 g25 g26 g27 g28
g29 g30 g31 g32 g33 g34 g35
g36 g37 g38 g39 g40 g41 g42
g43 g44 g45 g46 g47 g48 g49
Compute g10, g20, g25.
Answer: -
g10 = (1·1 + 1·5 + 1·0) + (0·0 + 0·5 + 0·1) + (1·2 + 1·2 + 1·5)
    = (1 + 5 + 0) + (0 + 0 + 0) + (2 + 2 + 5)
    = 15
g20 = (1·0 + 1·1 + 1·2) + (0·5 + 0·1 + 0·0) + (1·7 + 1·8 + 1·5)
    = (0 + 1 + 2) + (0 + 0 + 0) + (7 + 8 + 5)
    = 23
g25 = (1·1 + 1·1 + 1·2) + (0·5 + 0·3 + 0·0) + (1·1 + 1·2 + 1·0)
    = (1 + 1 + 2) + (0 + 0 + 0) + (1 + 2 + 0)
    = 7
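These three values can be checked by correlating f with m (with zero padding outside the border), for example with SciPy.

```python
# Verify g10, g20 and g25 by correlating f with the mask m.
import numpy as np
from scipy.ndimage import correlate

f = np.array([[1, 1, 0, 2, 2, 2, 7],
              [2, 5, 5, 2, 0, 5, 7],
              [0, 0, 1, 5, 1, 1, 8],
              [6, 2, 1, 3, 2, 0, 5],
              [1, 0, 2, 0, 0, 8, 2],
              [5, 1, 0, 0, 5, 9, 0],
              [2, 37, 0, 10, 8, 8, 9]])
m = np.array([[1, 0, 1],
              [1, 0, 1],
              [1, 0, 1]])

g = correlate(f, m, mode='constant', cval=0)
# g10, g20 and g25 sit at (row, col) = (1, 2), (2, 5) and (3, 3) in 0-based indexing.
print(g[1, 2], g[2, 5], g[3, 3])   # 15 23 7
```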
15) Why does the discrete histogram equalization technique not yield a flat histogram?
Answer: - The discrete histogram equalization technique is a method of adjusting the
contrast of an image by redistributing its pixel values. The basic idea is to transform the
image's intensity values so that they are spread evenly across the available range of
intensity levels. While this approach can enhance the contrast of an image, it typically
does not result in a perfectly flat histogram.
There are several reasons why the histogram equalization technique does not yield a flat
histogram. The main one is that the transformation is discrete: each input intensity level is
mapped to a single output level, so the pixels of one level can only be moved together, and
levels can be merged but never split. Because the number of pixels at each level is preserved
or combined rather than redistributed, the resulting histogram generally cannot be uniform.
In addition, rounding the transformed values to the nearest available level can leave gaps in
the output histogram and cause some levels to be overrepresented.
Overall, while the discrete histogram equalization technique can improve the contrast of
an image, it is not guaranteed to result in a perfectly flat histogram; the small example
below illustrates this.
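A tiny numeric illustration in Python; the 3-bit, 8-pixel image is an assumed example. Equalization can only map or merge the existing levels, so the histogram counts are moved around rather than flattened.

```python
# Discrete histogram equalization of a tiny 3-bit image (levels 0..7).
import numpy as np

img = np.array([0, 0, 0, 0, 1, 1, 2, 3], dtype=np.uint8)   # 8 pixels
L = 8
hist = np.bincount(img, minlength=L)
cdf = np.cumsum(hist) / img.size
mapping = np.round((L - 1) * cdf).astype(np.uint8)          # s_k = (L - 1) * CDF(r_k)
equalized = mapping[img]

print(hist)                                   # [4 2 1 1 0 0 0 0]
print(np.bincount(equalized, minlength=L))    # [0 0 0 0 4 2 1 1] -- shifted, not flat
```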