Computer Vision Lab Manual 2023
Computer Vision
(3171614)
B.E. Semester 7
Place: __________
Date: __________
Preface
The main aim of any laboratory, practical or field work is to enhance the required skills and to
develop in students the ability to solve real-time problems by building the relevant competencies
in the psychomotor domain. Keeping this in view, GTU has designed a competency-focused,
outcome-based curriculum for its engineering degree programmes in which sufficient weightage is
given to practical work. This underlines the importance of skill enhancement and encourages
students, instructors and faculty members to use every moment allotted to practical work to achieve
the relevant outcomes by actually performing the experiments, rather than treating them as mere
study exercises. For effective implementation of a competency-focused, outcome-based curriculum,
every practical must be carefully designed to serve as a tool for developing and enhancing the
industry-relevant competencies in every student. These psychomotor skills are very difficult to
develop through the traditional chalk-and-board method of content delivery in the classroom.
Accordingly, this lab manual is designed to focus on industry-defined, relevant outcomes rather
than the old practice of conducting practicals merely to prove a concept or theory.
By using this lab manual, students can go through the relevant theory and procedure in advance of
the actual performance, which creates interest and gives them a basic idea before the session; this
in turn strengthens the intended outcomes. Each experiment in this manual begins with the
competency, industry-relevant skills, course outcomes and practical outcomes (objectives). Students
are also made aware of the safety measures and necessary precautions to be taken while performing
the practical.
This manual also provides guidelines to faculty members for facilitating student-centric lab
activities in each experiment by arranging and managing the necessary resources, so that students
follow the procedures with the required safety and precautions to achieve the outcomes. It also
indicates, through rubrics, how students will be assessed.
Computer vision is a professional elective course which deals with principles of image
formation, image processing algorithms and recognition from single or multiple images (video).
This course emphasizes the core vision tasks of scene understanding and recognition.
Applications to object recognition, image analysis, image retrieval and object tracking will be
discussed.
Utmost care has been taken while preparing this lab manual; however, there is always scope for
improvement. We therefore welcome constructive suggestions for improvement and for the removal
of any errors.
Computer Vision (3171614)
The following industry relevant competencies are expected to be developed in the student by
undertaking the practical work of this laboratory.
1. Will be able to solve open design problems
2. Will be able to apply the knowledge, techniques, skills and modern tools to become
successful professionals in computer vision industries.
Index
(Progressive Assessment Sheet)
Experiment No: 1
Date:
Objectives:
Theory:
Read Image: The function to read an image essentially takes the grey values of all the pixels in a
greyscale image and puts them into a matrix. This matrix then becomes a variable of the
programming platform we use, such as MATLAB or Python (with OpenCV). For a greyscale image the size
of this matrix is MxN, whereas for a colour image with an RGB (Red, Green, Blue) colour palette the
matrix holds 3 x (MxN) values. Here MxN is the resolution of the image. In general, the read
function reads the pixel values from an image file and returns a matrix of all the pixel values.
Write Image: Once we have captured image data, i.e. a matrix with MxN resolution, either by
digitally capturing it, extracting it from a video sequence or by processing an input image, we
would want to save the image on the computer, i.e. write the image. The write function enables us
to write this data from the matrix variable onto the hard disk, at the desired location and in the
file format corresponding to the matrix. An MxN matrix generates a greyscale image, while a
3 x (MxN) matrix containing data for the R, G and B colour planes generates a colour image.
Image types: digital images are commonly handled as one of three types:
1. Binary images
2. Greyscale images
3. RGB images
Binary, as the name suggests, is an image whose pixels are either black or white. Greyscale images
have pixel values from 0 (black) to 255 (white). RGB images are true colour images with values
between 0 and 255 for each of the Red, Green and Blue components. Within the limitations of
arithmetic conversion, we can convert images from one type to another; RGB to grey and grey to RGB
are examples of such image conversions.
Image Complement: In the complement of a binary image, zeros become ones and ones become
zeros. Black and white are reversed. In the complement of a grayscale or color image, each pixel
value is subtracted from the maximum pixel value supported by the class (or 1.0 for double-
precision images). The difference is used as the pixel value in the output image. In the output
image, dark areas become lighter and light areas become darker. For color images, reds become
cyan, greens become magenta, blues become yellow, and vice versa.
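For illustration, a minimal OpenCV sketch of the complement of an 8-bit greyscale image (the image path is an assumption):
import cv2
img = cv2.imread('input_image.jpg', cv2.IMREAD_GRAYSCALE)
complement = 255 - img      # equivalent to cv2.bitwise_not(img) for 8-bit images
cv2.imwrite('complement_output.jpg', complement)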
Procedure:
1. Image read :
matrix variable = read image ( From_Location )
Display the image
2. Image Write: write image( To_Location, matrix variable)
3. Image conversion:
y=rgb2gray(x);
y=gray2rgb(x)
4. Complement of Image:
for each value of x in the image
y= 255-x
save y
Program:
import cv2
# Read the input image (path assumed; adjust as needed)
img = cv2.imread('input_image.jpg')
if img is not None:
    # Convert to greyscale and take the complement, then write the results to disk
    cv2.imwrite('output.jpg', img)
    cv2.imwrite('gray_output.jpg', cv2.cvtColor(img, cv2.COLOR_BGR2GRAY))
    cv2.imwrite('complement_output.jpg', 255 - img)
    cv2.imshow('Original image', img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
else:
    print("Image not found or couldn't be loaded.")
Output:
output.jpg
gray_output.jpg
complement_output.jpg
Conclusion:
Basic image processing operations are essential for various computer vision and image analysis
tasks. Understanding how to read, manipulate, and save images, as well as perform
conversions and enhancements, lays the foundation for more advanced image processing
techniques and applications. These operations are the building blocks for more complex
image analysis tasks such as object detection, image recognition, and image segmentation.
Quiz:
1. If you have access to a digital camera capable of capturing images with 1024x768
resolution for a fixed scene, using all possible camera settings, what is the smallest file you
can create?
➢ In general, the smallest file size can be achieved by using strong image
compression techniques (e.g., JPEG compression) and capturing a scene with
minimal detail or changes in color. However, the specific file size can vary widely
depending on the camera's compression algorithm, the image content, and the
desired image quality.
2. Is it possible to convert an original greyscale image to RGB?
➢ Yes, a greyscale image can be converted to an RGB (Red, Green, Blue) image, typically by
replicating the grey values into all three channels; the original colours, however, cannot be recovered.
Suggested Reference:
1. Digital Image Processing by S. Sridhar. Oxford Press.
2. https://www.mathworks.com/help/matlab/ref/imwrite.html
Criteria 1 2 3 4 5 Total
Marks
Experiment No: 2
Date:
Objectives:
Theory:
Contrast stretching: It is an image enhancement technique that tries to improve the contrast by
stretching the intensity values of an image to fill the entire dynamic range. The transformation
function used is always linear and monotonically increasing. If the minimum intensity value (r_min)
present in the image is, say, 100, it is stretched down to the lowest possible intensity value 0.
Likewise, if the maximum intensity value (r_max) is less than the highest possible intensity value
255, it is stretched up to 255. The range 0-255 is taken as the standard minimum and maximum
intensity for 8-bit images. The general formula for contrast stretching is given by equation (2.1):

    s = (r - r_min) * (s_max - s_min) / (r_max - r_min) + s_min        eq. (2.1)

where r is the current pixel intensity value, r_min and r_max are the minimum and maximum intensity
values present in the whole image, and s_min and s_max are the intended (output) minimum and
maximum intensity values (0 and 255 for 8-bit images).
(a) Input image before contrast stretching along with its histogram
(b) Input image after contrast stretching along with its histogram
Figure 2.1: Results of contrast stretching
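As an illustration of eq. (2.1), the following is a minimal NumPy sketch of contrast stretching for an 8-bit greyscale image (the function name and default output range are assumptions made for this example):
import numpy as np

def contrast_stretch(img, out_min=0, out_max=255):
    # eq. (2.1): map [r_min, r_max] of the input linearly onto [out_min, out_max]
    r_min, r_max = int(img.min()), int(img.max())
    stretched = (img.astype(np.float32) - r_min) * (out_max - out_min) / (r_max - r_min) + out_min
    return stretched.astype(np.uint8)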
Procedure in Matlab:
Contrast stretching:
I = imread('<input image>');
figure
imshow(I)
J = imadjust(I,stretchlim(I),[]);
figure
imshow(J)
Histogram Equalization:
I = imread('<input image>');
figure
subplot(1,3,1)
imshow(I)
subplot(1,3,2:3)
imhist(I)
J = histeq(I);
figure
subplot(1,3,1)
imshow(J)
subplot(1,3,2:3)
imhist(J)
Program:
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Read the input image in greyscale (path assumed; adjust as needed)
image = cv2.imread('input_image.jpg', cv2.IMREAD_GRAYSCALE)
# Contrast stretching: linearly map [min, max] of the image onto [0, 255]
min_val, max_val = int(np.min(image)), int(np.max(image))
contrast_adjusted_image = ((image - min_val) * (255.0 / (max_val - min_val))).astype(np.uint8)
# Histogram equalization
equalized_image = cv2.equalizeHist(image)
cv2.imwrite('contrast_adjusted_output.jpg', contrast_adjusted_image)
cv2.imwrite('equalized_output.jpg', equalized_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
# Original Image Histogram
plt.subplot(2, 2, 1)
plt.hist(image.ravel(), 256, [0, 256])
plt.title('Original Image Histogram')
# Contrast Adjusted Image Histogram
plt.subplot(2, 2, 2)
plt.hist(contrast_adjusted_image.ravel(), 256, [0, 256])
plt.title('Contrast Adjusted Histogram')
# Equalized Image Histogram
plt.subplot(2, 2, 3)
plt.hist(equalized_image.ravel(), 256, [0, 256])
plt.title('Equalized Image Histogram')
plt.show()
Output:
Original image Histogram
Conclusion:
This practical has provided valuable hands-on experience in image enhancement
techniques. By adjusting image contrast using contrast stretching and equalizing image
histograms, we have learned important tools for improving the quality and interpretability
of digital images, ultimately enhancing our skills in image analysis and processing.
Quiz:
1. Differentiate between contrast stretching and histogram equalization
→ Contrast stretching primarily stretches the intensity range, while histogram
equalization redistributes intensity values to achieve a more uniform histogram. The
choice between these techniques depends on the specific requirements and
characteristics of the image being processed
2. Is it possible to revert to the original image after applying contrast stretching or histogram
equalization?
→ Although one can attempt to revert to the original image after applying contrast
stretching or histogram equalization, the process may not give a perfect reconstruction,
because information is lost during enhancement (rounding in contrast stretching, and the
merging of intensity levels in histogram equalization). How well the reversal works depends
on the characteristics of the original image and the extent of the enhancement applied.
Suggested Reference:
Rubric wise marks obtained:
Program execution: Excellent (4) - program executes correctly with no syntax or runtime errors;
Good (3) - executes with a minor (easily fixed) error; Fair (2) - executes with multiple minor
errors; Beginning (0-1) - program does not execute.
Design - Correctness of output: Excellent (4) - program displays correct output with no errors;
Good (3) - output/design of output has minor errors; Fair (2) - output/design of output has
multiple errors; Beginning (0-1) - output is incorrect.
Design of logic: Excellent (4) - program is logically well designed; Good (3) - slight logic errors
that do not significantly affect the results; Fair (2) - significant logic errors;
Beginning (0-1) - logic is incorrect.
Standards: Excellent (4) - program is stylistically well designed; Good (3) - few inappropriate
design choices (i.e. poor variable names, improper indentation); Fair (2) - several inappropriate
design choices; Beginning (0-1) - program is poorly written.
Documentation: Excellent (4) - program is well documented; Good (3) - missing one required comment;
Fair (2) - missing two or more required comments; Beginning (0-1) - most or all documentation missing.
Criteria 1 2 3 4 5 Total
Marks
Experiment No: 3
Implement the various low pass and high pass filtering mechanisms.
Date:
Objectives:
1. Image enhancement such as smoothing, sharpening and edge enhancement using various
filters.
Theory:
Filtering is a technique for modifying or enhancing an image. For example, you can filter an
image to emphasize certain features or remove other features. Image processing operations
implemented with filtering include smoothing, sharpening, and edge enhancement.
Low pass filter (smoothing): Low pass filter is the type of frequency domain filter that is used
for smoothing the image. It attenuates the high-frequency components and preserves the low-
frequency components. High frequency content corresponds to boundaries of the objects. An
image is smoothed by decreasing the disparity between pixel values by averaging nearby pixels.
The low-pass filters usually employ moving window operator which affects one pixel of the
image at a time, changing its value by some function of a local region (window) of pixels. The
operator moves over the image to affect all the pixels in the image.
Mean filtering: It is used as a method of smoothing images, reducing the amount of intensity
variation between one pixel and the next resulting in reducing noise in images. The idea of mean
filtering is simply to replace each pixel value in an image with the mean (average) value of its
neighbors, including itself. This has the effect of eliminating pixel values which are
unrepresentative of their surroundings.
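For example, a 3x3 mean filter is a convolution with an averaging kernel; a minimal OpenCV sketch (the image path is an assumption):
import cv2
import numpy as np
img = cv2.imread('input_image.jpg', cv2.IMREAD_GRAYSCALE)
kernel = np.ones((3, 3), np.float32) / 9        # each output pixel is the mean of its 3x3 neighbourhood
mean_filtered = cv2.filter2D(img, -1, kernel)   # equivalent to cv2.blur(img, (3, 3))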
Median filter: Median filtering is a nonlinear operation often used in image processing to reduce
"salt and pepper" noise. Median filter replaces the pixel at the center of the filter with the median
value of the pixels falling beneath the mask. Median filter does not blur the image but it rounds
the corners.
Figure 3.1: Original image, mean filtered output and median filtered output in the order of left to
right
High pass filter (sharpening and edge enhancement): High pass filter is the type of frequency
domain filter that is used for sharpening the image. It attenuates the low-frequency components
and preserves the high-frequency components. A high-pass filter can be used to make an image
appear sharper. These filters emphasize fine details in the image - the opposite of the low-pass
filter. High-pass filtering works in the same way as low-pass filtering; it just uses a different
convolution kernel. Prewitt and Sobel are derivative filters used as edge detectors.
Laplacian filter: One of the best-known high-pass filters is the Laplacian edge enhancement. Its
meaning can be understood as follows: we subtract the image from a blurred version of itself created
by averaging the four nearest neighbours. This enhances edges and isolated pixels with
extreme values. Because this method is very sensitive to noise, the Laplacian of Gaussian (LoG) is often used instead.
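A minimal OpenCV sketch of this idea, smoothing with a Gaussian before applying the Laplacian (the image path and kernel sizes are assumptions):
import cv2
import numpy as np
img = cv2.imread('input_image.jpg', cv2.IMREAD_GRAYSCALE)
blurred = cv2.GaussianBlur(img, (3, 3), 0)                 # suppress noise first
log_edges = cv2.Laplacian(blurred, cv2.CV_64F, ksize=3)    # Laplacian of the smoothed image
log_edges = np.uint8(np.absolute(log_edges))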
Procedure in Matlab:
Low Pass Filters:
I = imread('<input image>');
h = 1/3*ones(3,1);
H = h*h';
imfilt = filter2(H,I);   % Mean filter for 3x3
J = medfilt2(I);         % Median filter
Program:
import cv2
import numpy as np
# Read the input image in greyscale (path assumed; adjust as needed)
img = cv2.imread('input_image.jpg', cv2.IMREAD_GRAYSCALE)
cv2.imshow('Mean filter (low pass)', cv2.blur(img, (3, 3)))
cv2.imshow('Median filter (low pass)', cv2.medianBlur(img, 3))
cv2.imshow('Laplacian filter (high pass)', np.uint8(np.absolute(cv2.Laplacian(img, cv2.CV_64F))))
cv2.waitKey(0)
cv2.destroyAllWindows()
Output:
Conclusion:
In this practical, we explored various image filtering techniques, including low-pass filtering and
high-pass filtering, using the OpenCV library in Python. The practical aimed to develop skills in
image enhancement for tasks such as smoothing, sharpening, and edge enhancement.
Quiz:
3. It is necessary to use Gaussian smoothing before using Laplacian filter. Justify.
→ Yes, it's necessary to use Gaussian smoothing before the Laplacian filter to reduce noise
and avoid amplifying noise artifacts during edge enhancement.
Suggested Reference:
Criteria 1 2 3 4 5 Total
Marks
Experiment No: 4
Date:
Objectives:
Theory:
Fourier Transform is an important image processing tool which is used to decompose an image
into its sine and cosine components. The output of the transformation represents the image in
the Fourier or frequency domain, while the input image is the spatial domain equivalent. In the
Fourier domain image, each point represents a particular frequency contained in the spatial domain
image. The Fourier Transform is used in a wide range of applications, such as image analysis,
image filtering, image reconstruction and image compression.
Procedure in Matlab:
Program:
import cv2
import numpy as np
from matplotlib import pyplot as plt
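# A minimal sketch of the DFT computation (the input path is an assumption):
# compute the 2-D DFT with NumPy and display the centred magnitude spectrum.
img = cv2.imread('input_image.jpg', cv2.IMREAD_GRAYSCALE)
f = np.fft.fft2(img)                          # 2-D Fourier transform of the image
fshift = np.fft.fftshift(f)                   # shift the zero-frequency component to the centre
magnitude_spectrum = 20 * np.log(np.abs(fshift) + 1)
plt.subplot(121), plt.imshow(img, cmap='gray'), plt.title('Input Image')
plt.subplot(122), plt.imshow(magnitude_spectrum, cmap='gray'), plt.title('Magnitude Spectrum')
plt.show()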
Output:
Conclusion:
Quiz:
1. Discuss Properties of Fourier Transform
Suggested Reference:
Criteria 1 2 3 4 5 Total
Marks
Experiment No: 5
Date:
Objectives:
Theory:
Scale-Invariant Feature Transform (SIFT): SIFT is invariant to image scale and rotation. In
general, the SIFT algorithm can be decomposed into four steps: (a) feature point (also called
keypoint) detection, (b) feature point localization, (c) orientation assignment and (d) feature
descriptor generation.
Histogram of Oriented Gradients (HOG): This feature descriptor is used for the purpose of
object detection. The technique counts occurrences of gradient orientation in localized portions of
an image. This method is similar to that of edge orientation histograms, scale-invariant feature
transform descriptors, and shape contexts, but differs in that it is computed on a dense grid of
uniformly spaced cells and uses overlapping local contrast normalization for improved accuracy.
The HOG feature vector is arranged by HOG blocks. The cell histogram, H(Cyx), is 1-by-NumBins.
The figure below shows the HOG feature vector with a 1-by-1 cell overlap between blocks.
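As a worked example of this arrangement (the 64x128 detection window is an assumption chosen for illustration): with 8x8-pixel cells, 2x2-cell blocks, a one-cell block stride and 9 orientation bins, a 64x128 window has 8x16 cells and (8-1) x (16-1) = 105 overlapping blocks, so the final HOG feature vector has 105 x 2 x 2 x 9 = 3780 elements.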
Procedure in OpenCV:
SIFT:
img = cv2.imread('image_name')
imgGray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
sift = cv2.SIFT_create()
keypoints,descriptors = sift.detectAndCompute(img, None)
sift_image = cv2.drawKeypoints(imgGray, keypoints, img)
HOG:
img = cv2.imread('image_name')
(hog, hog_image) = feature.hog(img, orientations=9,pixels_per_cell = (8,8),
cells_per_block=(2,2),block_norm='L2-Hys', visualize=True, transform_sqrt=True)
cv2.imshow("Ori", img)
cv2.imshow('HOG IMAGE', hog_image)
Program:
import cv2
from skimage import feature
import matplotlib.pyplot as plt

# Read the input image and convert it to greyscale
image_path = './input_image.jpg'
img = cv2.imread(image_path)
imgGray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Detect SIFT keypoints and compute descriptors
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(imgGray, None)
# Draw SIFT keypoints on the image
sift_image = cv2.drawKeypoints(imgGray, keypoints, img)

# Compute HOG features and the HOG visualisation image
(hog_vec, hog_image) = feature.hog(imgGray, orientations=9, pixels_per_cell=(8, 8),
    cells_per_block=(2, 2), block_norm='L2-Hys', visualize=True, transform_sqrt=True)

plt.subplot(121)
plt.title('SIFT Keypoints')
plt.imshow(cv2.cvtColor(sift_image, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.subplot(122)
plt.title('HOG Features')
plt.imshow(hog_image, cmap=plt.cm.gray)
plt.axis('off')
plt.show()
Output:
Conclusion:
In this experiment, we successfully utilized Scale-Invariant Feature Transform (SIFT) and
Histogram of Oriented Gradients (HOG) features for image analysis. SIFT provided robust
keypoint detection, while HOG described object shapes effectively. These techniques
enhance image processing and feature extraction for various computer vision applications.
Quiz:
1. Compare HOG and SIFT feature descriptors
◈ HOG is well-suited for tasks where capturing object shapes in various scales is critical,
whereas SIFT excels in scenarios where keypoint matching and recognition under scale
and rotation variations are important. The choice between them depends on the specific
requirements of the computer vision application.
Suggested Reference:
1. Digital Image Processing by S. Sridhar. Oxford Press.
2. https://in.mathworks.com/help/vision/ref/extracthogfeatures.html
3. https://towardsdatascience.com/hog-histogram-of-oriented-gradients-67ecd887675f
Criteria 1 2 3 4 5 Total
Marks
Experiment No: 6
Date:
Objectives:
Theory:
Segmentation: Instead of processing the entire image, a common practice is to extract the Region
of Interest (RoI). Image segmentation is a method of dividing a digital image into subgroups called
image segments, reducing the complexity of the image and enabling further processing or analysis
of each image segment. Technically, segmentation is the assignment of labels to pixels to identify
objects, people, or other important elements in the image. Image segmentation could involve
separating foreground from background, or clustering regions of pixels based on similarities in
color or shape. For example, a common application of image segmentation in medical imaging is
to detect and label pixels in an image or voxels of a 3D volume that represent a tumor in a patient’s
brain or other organs. Image segmentation is typically used to locate objects and boundaries (lines,
curves, etc.) in images. Types of segmentation are as below:
Edge-Based Segmentation: This technique identifies the edges of various objects in a given
image. It helps locate features of associated objects in the image using the information from the
edges. Edge detection helps strip images of redundant information, reducing their size and
facilitating analysis. Edge-based segmentation algorithms identify edges based on contrast, texture,
color, and saturation variations. They can accurately represent the borders of objects in an image
using edge chains comprising the individual edges.
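A minimal OpenCV sketch of edge-based segmentation using the Canny edge detector (the image path and hysteresis thresholds are assumptions):
import cv2
img = cv2.imread('input_image.jpg', cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 100, 200)     # lower and upper hysteresis thresholds
cv2.imwrite('edges.jpg', edges)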
Threshold Based: It is the simplest image segmentation method, dividing pixels based on their
intensity relative to a given value or threshold. It is suitable for segmenting objects with higher
intensity than other objects or backgrounds. The threshold value T can work as a constant in low-
noise images. In some cases, it is possible to use dynamic thresholds.
Once the mask is ready then the RoI can be segmented out of the given image with the help of the
mask.
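A minimal sketch of threshold-based segmentation followed by masking out the RoI, using Otsu's method to pick the threshold T automatically (the image path is an assumption):
import cv2
img = cv2.imread('input_image.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # binary mask
roi = cv2.bitwise_and(img, img, mask=mask)   # keep only the pixels selected by the mask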
Procedure:
Program:
import cv2
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans  # Import the KMeans class

# Read the image (path assumed) and reshape its pixels into an (N, 3) array of RGB values
image = cv2.cvtColor(cv2.imread('input_image.jpg'), cv2.COLOR_BGR2RGB)
pixels = image.reshape(-1, 3)

# Cluster the pixel colours with K-means and rebuild a label map with the image shape
num_clusters = 3
labels = KMeans(n_clusters=num_clusters, n_init=10).fit(pixels).labels_.reshape(image.shape[:2])

# One masked image per cluster (the segments)
segmented_images = [np.where(labels[..., None] == i, image, 0).astype(np.uint8) for i in range(num_clusters)]

plt.subplot(1, num_clusters + 1, 1), plt.imshow(image), plt.title('Original')
for i in range(num_clusters):
    plt.subplot(1, num_clusters + 1, i + 2)
    plt.imshow(segmented_images[i])
    plt.title(f'Segment {i + 1}')
plt.show()
Output:
Conclusion:
This practical exercise provided hands-on experience in image segmentation, showcasing
how this technique can be employed to extract meaningful information from complex
images. It opens the door to further exploration and experimentation with image
processing techniques for various real-world applications.
Quiz:
1. Discuss applications of different segmentation techniques
→ Each segmentation technique has its strengths and weaknesses, making them suitable for
specific tasks. The choice of technique depends on the nature of the data and the objectives
of the image analysis task. In many cases, a combination of these techniques or more
advanced methods like deep learning-based segmentation is used to achieve more accurate
and robust results.
Suggested Reference:
Rubric wise marks obtained:
Program execution: Excellent (4) - program executes correctly with no syntax or runtime errors;
Good (3) - executes with a minor (easily fixed) error; Fair (2) - executes with multiple minor
errors; Beginning (0-1) - program does not execute.
Design - Correctness of output: Excellent (4) - program displays correct output with no errors;
Good (3) - output/design of output has minor errors; Fair (2) - output/design of output has
multiple errors; Beginning (0-1) - output is incorrect.
Design of logic: Excellent (4) - program is logically well designed; Good (3) - slight logic errors
that do not significantly affect the results; Fair (2) - significant logic errors;
Beginning (0-1) - logic is incorrect.
Standards: Excellent (4) - program is stylistically well designed; Good (3) - few inappropriate
design choices (i.e. poor variable names, improper indentation); Fair (2) - several inappropriate
design choices; Beginning (0-1) - program is poorly written.
Documentation: Excellent (4) - program is well documented; Good (3) - missing one required comment;
Fair (2) - missing two or more required comments; Beginning (0-1) - most or all documentation missing.
Criteria 1 2 3 4 5 Total
Marks
Experiment No: 7
Date:
Objectives:
Theory:
Optical flow: It is the motion of objects between consecutive frames of a sequence, caused by the
relative movement between the object and the camera. The problem of optical flow may be expressed
as follows: between consecutive frames, we can express the image intensity (I) as a function of
space (x, y) and time (t). In other words, if we take the first image I(x, y, t) and move its pixels
by (dx, dy) over time dt, we obtain the new image I(x+dx, y+dy, t+dt).
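Assuming the brightness of a point does not change between the two frames, I(x, y, t) = I(x+dx, y+dy, t+dt); a first-order Taylor expansion of the right-hand side gives the optical flow (brightness constancy) constraint

    I_x u + I_y v + I_t = 0,

where I_x, I_y and I_t are the partial derivatives of the image intensity and (u, v) = (dx/dt, dy/dt) is the flow vector. This single equation has two unknowns per pixel, so the differential methods below add further assumptions in order to solve for the flow.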
Differential methods estimate optical flow from partial derivatives of the image signal and/or the
sought flow field, and from higher-order partial derivatives. Well-known examples include:
1. Lucas–Kanade method – regarding image patches and an affine model for the flow field
2. Horn–Schunck method – optimizing a functional based on residuals from the brightness
constancy constraint, and a particular regularization term expressing the expected
smoothness of the flow field
3. Buxton–Buxton method – based on a model of the motion of edges in image sequences
4. Black–Jepson method – coarse optical flow via correlation
1. Setting up your environment and opening sparse-starter.py with your text editor
2. Configuring OpenCV to read a video and setting up parameters
3. Grayscaling
4. Shi-Tomasi Corner Detector - selecting the pixels to track
5. Tracking Specific Objects
6. Lucas-Kanade: Sparse Optical Flow
7. Visualizing
Program:
import numpy as np
import cv2 as cv

cap = cv.VideoCapture('./input_video.mp4')

# Take the first frame and define an initial tracking window (x, y, w, h) - assumed values
ret, frame = cap.read()
tracker = (300, 200, 100, 50)
x, y, w, h = tracker

# Hue histogram of the region of interest, used for back-projection
roi_hsv = cv.cvtColor(frame[y:y+h, x:x+w], cv.COLOR_BGR2HSV)
reg_hist = cv.calcHist([roi_hsv], [0], None, [180], [0, 180])
cv.normalize(reg_hist, reg_hist, 0, 255, cv.NORM_MINMAX)

# Termination criteria: 10 iterations or movement smaller than 1 pixel
criteria = (cv.TERM_CRITERIA_EPS | cv.TERM_CRITERIA_COUNT, 10, 1)

while(1):
    ret, frame = cap.read()
    if ret == True:
        hsv = cv.cvtColor(frame, cv.COLOR_BGR2HSV)
        dst = cv.calcBackProject([hsv], [0], reg_hist, [0, 180], 1)
        # apply meanshift to find the new window position
        ret, tracker = cv.meanShift(dst, tracker, criteria)
        # Draw it on image
        x, y, w, h = tracker
        img = cv.rectangle(frame, (x, y), (x+w, y+h), 255, 2)
        cv.imshow('img', img)
        if cv.waitKey(30) & 0xFF == 27:  # press Esc to stop
            break
    else:
        break
cap.release()
cv.destroyAllWindows()
Output:
Conclusion:
In this experiment, we explored the concept of optical flow and implemented the Lucas-
Kanade Sparse Optical Flow algorithm using OpenCV. Optical flow is a critical technique
in computer vision that allows us to estimate the motion of objects between consecutive
frames in a video sequence. This technique finds application in various domains such as
object tracking, video stabilization, and motion analysis.
Quiz:
1. Compare Sparse vs. Dense Optical Flow
◈ The choice between sparse and dense optical flow depends on the specific requirements of
the computer vision task. Sparse optical flow is suitable when computational efficiency
and tracking specific features are essential, while dense optical flow is preferred for tasks
that require a detailed analysis of motion across the entire frame, even though it comes at
the cost of higher computational requirements.
Suggested Reference:
1. The Computation of Optical Flow by S. S. Beauchemin and J. L. Barron. ACM Digital Library.
2. https://nanonets.com/blog/optical-flow/#what-is-optical-flow
Criteria 1 2 3 4 5 Total
Marks
Experiment No: 8
Date:
Objectives:
Theory:
We can create a simple application which tracks some points in a video. To decide the points, we
use cv.goodFeaturesToTrack(): we take the first frame, detect some Shi-Tomasi corner points in
it, and then iteratively track those points using Lucas-Kanade optical flow. To the function
cv.calcOpticalFlowPyrLK() we pass the previous frame, the previous points and the next frame. It
returns the next points along with a status array whose entries are 1 if the corresponding next
point was found and 0 otherwise. We iteratively pass these next points as the previous points in
the next step.
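A minimal Python sketch of the same idea for a single pair of frames (the frame file names are assumptions):
import cv2
prev_gray = cv2.imread('frame1.jpg', cv2.IMREAD_GRAYSCALE)
next_gray = cv2.imread('frame2.jpg', cv2.IMREAD_GRAYSCALE)
# Shi-Tomasi corners in the first frame, then Lucas-Kanade flow into the second frame
p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=100, qualityLevel=0.3, minDistance=7)
p1, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, p0, None)
good_new, good_old = p1[status == 1], p0[status == 1]   # keep only the points that were found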
#include <iostream>
#include <opencv2/core.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/videoio.hpp>
#include <opencv2/video.hpp>
using namespace cv;
using namespace std;
int main(int argc, char **argv)
{
const string about =
    "This sample demonstrates Lucas-Kanade Optical Flow calculation.\n"
    "The example file can be downloaded from:\n"
    "https://www.bogotobogo.com/python/OpenCV_Python/images/mean_shift_tracking/slow_traffic_small.mp4";
const string keys =
"{ h help | | print this help message }"
"{ @image | vtest.avi | path to image file }";
CommandLineParser parser(argc, argv, keys);
parser.about(about);
if (parser.has("help"))
{
parser.printMessage();
return 0;
}
string filename = samples::findFile(parser.get<string>("@image"));
if (!parser.check())
{
parser.printErrors();
return 0;
}
VideoCapture capture(filename);
if (!capture.isOpened()){
//error in opening the video input
cerr<< "Unable to open file!" <<endl;
return 0;
}
// Create some random colors
vector<Scalar> colors;
RNG rng;
for(int i = 0; i< 100; i++)
{
int r = rng.uniform(0, 256);
int g = rng.uniform(0, 256);
int b = rng.uniform(0, 256);
colors.push_back(Scalar(r,g,b));
}
Mat old_frame, old_gray;
vector<Point2f> p0, p1;
// Take first frame and find corners in it
capture >>old_frame;
cvtColor(old_frame, old_gray, COLOR_BGR2GRAY);
goodFeaturesToTrack(old_gray, p0, 100, 0.3, 7, Mat(), 7, false, 0.04);
// Create a mask image for drawing purposes
Mat mask = Mat::zeros(old_frame.size(), old_frame.type());
while(true){
Mat frame, frame_gray;
capture >> frame;
if (frame.empty())
break;
cvtColor(frame, frame_gray, COLOR_BGR2GRAY);
// calculate optical flow
vector<uchar> status;
vector<float> err;
TermCriteria criteria = TermCriteria((TermCriteria::COUNT) + (TermCriteria::EPS), 10, 0.03);
calcOpticalFlowPyrLK(old_gray, frame_gray, p0, p1, status, err, Size(15,15), 2, criteria);
vector<Point2f> good_new;
for(uint i = 0; i < p0.size(); i++)
{
// Select good points
if(status[i] == 1) {
good_new.push_back(p1[i]);
// draw the tracks
line(mask,p1[i], p0[i], colors[i], 2);
circle(frame, p1[i], 5, colors[i], -1);
}
}
Mat img;
add(frame, mask, img);
imshow("Frame", img);
int keyboard = waitKey(30);
if (keyboard == 'q' || keyboard == 27)
break;
// Now update the previous frame and previous points
old_gray = frame_gray.clone();
p0 = good_new;
}
}
Program:
import cv2
import numpy as np

# Open the video source (path assumed; adjust as needed)
cap = cv2.VideoCapture('./input_video.mp4')

# Create some random colors for drawing the optical flow tracks
colors = np.random.randint(0, 255, (100, 3))

# Take the first frame and detect Shi-Tomasi corners to track
ret, old_frame = cap.read()
old_gray = cv2.cvtColor(old_frame, cv2.COLOR_BGR2GRAY)
p0 = cv2.goodFeaturesToTrack(old_gray, maxCorners=100, qualityLevel=0.3, minDistance=7, blockSize=7)
mask = np.zeros_like(old_frame)   # mask image for drawing the tracks

while True:
    ret, frame = cap.read()
    if not ret:
        break
    frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Calculate sparse optical flow with the Lucas-Kanade method
    p1, status, err = cv2.calcOpticalFlowPyrLK(old_gray, frame_gray, p0, None, winSize=(15, 15), maxLevel=2)
    good_new, good_old = p1[status == 1], p0[status == 1]
    # Draw the tracks
    for i, (new, old) in enumerate(zip(good_new, good_old)):
        a, b = new.ravel().astype(int)
        c, d = old.ravel().astype(int)
        mask = cv2.line(mask, (a, b), (c, d), colors[i].tolist(), 2)
        frame = cv2.circle(frame, (a, b), 5, colors[i].tolist(), -1)
    # Combine the frame and the mask to visualize the optical flow
    img = cv2.add(frame, mask)
    cv2.imshow('Optical Flow', img)
    if cv2.waitKey(30) & 0xFF == 27:  # press Esc to stop
        break
    # Update the previous frame and points
    old_gray = frame_gray.copy()
    p0 = good_new.reshape(-1, 1, 2)

cap.release()
cv2.destroyAllWindows()
Output:
Conclusion:
In this experiment, we successfully demonstrated the practical application of optical flow
in image processing using the Lucas-Kanade method with OpenCV. Optical flow is a
valuable technique in computer vision that allows us to analyze the motion of objects
within a video sequence or between consecutive image frames. We applied this technique
to track points in a video and visualize their motion, offering insights into the movement
patterns present in the video.
Quiz:
1. Is it necessary to detect corner points in particular intervals ?
◈ The decision to detect corner points at particular intervals or adaptively depends on the
specific demands of your image processing or computer vision application. You should
consider factors such as the nature of the scene, changes in the scene over time, and the
available computational resources when determining the best strategy for feature point
detection in optical flow and motion tracking.
Suggested Reference:
1. The Computation of Optical Flow by S. S. Beauchemin and J. L. Barron. ACM Digital Library.
2. https://nanonets.com/blog/optical-flow/#what-is-optical-flow
3. https://docs.opencv.org/3.4/d4/dee/tutorial_optical_flow.html
Criteria 1 2 3 4 5 Total
Marks
Experiment No: 9
Date:
Objectives:
Theory:
Object detection and object recognition are similar techniques for identifying objects, but they
vary in their execution. Object detection is the process of finding instances of objects in images.
In the case of deep learning, object detection is a subset of object recognition, where the object is
not only identified but also located in an image. This allows for multiple objects to be identified
and located within the same image. Object recognition is a key technology behind driverless cars,
enabling them to recognize a stop sign or to distinguish a pedestrian from a lamppost. It is also useful in a
variety of applications such as disease identification in bioimaging, industrial inspection, and robotic
vision.
Procedure:
Program:
from imageai.Detection import ObjectDetection
# Define the paths to the model, input image, and output image
model_path = "yolo-tiny.h5"
input_path = "cars.jpg"
output_path = "output_image.jpg"
# Create the detector and load the pre-trained Tiny YOLOv3 model
detector = ObjectDetection()
detector.setModelTypeAsTinyYOLOv3()
detector.setModelPath(model_path)
detector.loadModel()
# Perform object detection on the input image and save the output image
detections = detector.detectObjectsFromImage(
input_image=input_path,
output_image_path=output_path
)
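Each element of the returned detections list is a dictionary describing one detected object; a short usage sketch for listing them:
# Print the name and confidence of every detected object
for detection in detections:
    print(detection["name"], ":", detection["percentage_probability"])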
Conclusion:
In this experiment, we successfully performed object detection and recognition using a
pre-trained model (Tiny YOLOv3). We applied it to an online dataset image, detecting and
labeling multiple objects. This demonstrates the practicality of object detection and
recognition in various applications.
Quiz:
1. Differentiate between machine learning approach and deep learning approach for object
recognition
◈ The primary difference lies in the feature engineering and the ability to learn features
directly from data. Deep learning excels in tasks like object recognition, where the data is
high-dimensional and complex, but the models can be less interpretable compared to
traditional machine learning approaches.
Suggested Reference:
Criteria 1 2 3 4 5 Total
Marks
Experiment No: 10
Date:
1. Problem solving
Objectives:
Theory:
Face detection, also called facial detection, is an artificial intelligence-based computer technology
used to find and identify human faces in digital images and video. Face detection technology is
often used for surveillance and tracking of people in real time. It is used in various fields
including security, biometrics, law enforcement, entertainment and social media.
To perform face recognition, face detection is first carried out to determine the position of the face in the
picture; OpenCV provides functions for this. The detector is built by extracting Haar features of faces from a
large sample set of images and then using the AdaBoost algorithm to train the face detector. In face detection,
the algorithm can adapt effectively to complex conditions such as insufficient illumination and background blur,
which greatly improves the accuracy of detection. For a given training set, different training sets are obtained
for subsequent work by changing the distribution probabilities of the individual samples; each training set is
trained to obtain a weak classifier, and these classifiers are then combined with appropriate weights.
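A minimal sketch of this Haar-cascade approach on a single image, using the cascade file that ships with OpenCV (the image path is an assumption); the program below applies the same detector frame by frame to a video:
import cv2
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
img = cv2.imread('input_image.jpg')
faces = face_cascade.detectMultiScale(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)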
Procedure:
Program:
import cv2

# Path to the input video and the Haar cascade shipped with OpenCV (video path assumed)
video_file_path = './input_video.mp4'
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

# Open the video source (use the provided video file path)
cap = cv2.VideoCapture(video_file_path)

while True:
    ret, frame = cap.read()
    if not ret:
        break
    # Detect faces in the greyscale frame and draw a rectangle around each one
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5):
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow('Face Detection', frame)
    if cv2.waitKey(30) & 0xFF == 27:  # press Esc to stop
        break

# Release the video capture object and close the OpenCV windows
cap.release()
cv2.destroyAllWindows()
Output:
Conclusion:
Overall, this experiment provides a fundamental understanding of face detection, which is
a crucial component in many computer vision applications, including face recognition,
surveillance, and security systems.
For more advanced applications, such as face recognition, deep learning models and
custom datasets would be required. However, this experiment serves as a starting point for
understanding the basic concepts of face detection in computer vision.
Quiz:
1. Compare approaches for face detection from images and videos
◈ Face detection in images and videos relies on similar techniques, like Haar cascades, but
videos require real-time processing. Videos present added challenges due to frame rate,
tracking, and performance optimization.
Suggested Reference:
1. https://www.hindawi.com/journals/js/2021/4796768/
Criteria 1 2 3 4 5 Total
Marks