
Computer Vision
(ACDC21)

AAT-II ASSIGNMENT
Submitted by
PRANAV TEJA – 21951A6688

Department of CSE (Artificial Intelligence & Machine Learning)
INSTITUTE OF AERONAUTICAL ENGINEERING (Autonomous)
Dundigal, Hyderabad-500043, Telangana
April 2025

Q1: How does the Sobel operator detect edges?


Answer: The Sobel operator is a discrete differentiation operator used to
compute an approximation of the gradient of the image intensity function. It
works by convolving the original image with small, separable, integer-valued
filter kernels in both the horizontal (Gx) and vertical (Gy) directions. The operator
consists of two 3x3 convolution kernels:
Gx Kernel:
-1 0 +1
-2 0 +2
-1 0 +1
Gy Kernel:
-1 -2 -1
0 0 0
+1 +2 +1
The convolution of these kernels with the image results in two gradient images
that represent changes in intensity in the horizontal and vertical directions. The
magnitude of the gradient at each pixel is then calculated as:
||G|| = sqrt(Gx^2 + Gy^2)
or approximated as:
||G|| = |Gx| + |Gy|
Edges correspond to regions in the image where the gradient magnitude is large,
indicating significant intensity changes.
Additionally, the Sobel operator incorporates a smoothing effect because of the
weighted averaging in its kernel, which makes it less sensitive to noise compared
to simple gradient operators. This makes the Sobel operator highly effective for
preliminary edge detection in many computer vision applications.
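
For illustration, a minimal Python sketch of this computation using NumPy and
SciPy (the image here is a random placeholder; any 2D grayscale array works):

import numpy as np
from scipy.signal import convolve2d

# Sobel kernels for the horizontal (Gx) and vertical (Gy) gradients
Kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
Ky = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)

image = np.random.rand(64, 64)        # placeholder grayscale image
Gx = convolve2d(image, Kx, mode='same', boundary='symm')
Gy = convolve2d(image, Ky, mode='same', boundary='symm')

magnitude = np.sqrt(Gx**2 + Gy**2)    # ||G|| = sqrt(Gx^2 + Gy^2)
edges = magnitude > magnitude.mean()  # simple threshold; the cutoff is application-specific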


Q2: Define a classical filtering operation.


Answer: A classical filtering operation refers to a mathematical operation
applied to an image to enhance certain features or suppress others. A very
common example is the Gaussian smoothing (or Gaussian blur) operation, which
is used to reduce image noise and detail. The Gaussian filter works by convolving
the image with a Gaussian function:
Gaussian Kernel (2D):
G(x, y) = (1 / 2πσ^2) * exp(-(x^2 + y^2) / (2σ^2))
Where σ is the standard deviation of the distribution. Larger σ values produce
more significant blurring. The convolution smooths the image by averaging the
pixels with their neighbors weighted according to the Gaussian distribution.
Classical filtering operations like Gaussian smoothing form the basis of more
advanced techniques such as edge detection, scale-space theory, and image
pyramids. They are fundamental to tasks requiring noise reduction and image
pre-processing in vision systems.
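
As a sketch, the Gaussian kernel above can be sampled directly and applied by
convolution (the kernel size and sigma here are illustrative choices):

import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size=5, sigma=1.0):
    # sample G(x, y) = (1 / 2*pi*sigma^2) * exp(-(x^2 + y^2) / (2*sigma^2))
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    return kernel / kernel.sum()      # normalize so overall brightness is preserved

image = np.random.rand(64, 64)        # placeholder grayscale image
smoothed = convolve2d(image, gaussian_kernel(5, 1.0), mode='same', boundary='symm')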

Q3: Define a metric for shape complexity and use it to compare simple
geometric shapes against complex real-world shapes.
Answer: One common metric for shape complexity is the Fractal Dimension (D).
It quantifies how the detail of a shape changes with the scale at which it is
measured. A simpler alternative metric is compactness, defined as:
Compactness = (Perimeter^2) / (4 * π * Area)
• For a perfect circle, the compactness value is minimal (1).
• For irregular and jagged shapes, the value increases.
Comparison:
• Circle (simple geometric shape): Compactness ~ 1
• Rectangle: Slightly higher compactness
• Complex real-world shape (e.g., coastline outline): Much higher
compactness, indicating more complexity.


• Using these metrics, we can objectively analyze and compare the intricacy
of different shapes, which is crucial in applications like pattern recognition,
biology (e.g., leaf structure analysis), and urban planning (e.g., city
boundary complexity).
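
A short sketch of this comparison, computing compactness from closed-form
perimeters and areas (real-world outlines would use measured values):

import numpy as np

def compactness(perimeter, area):
    # Compactness = Perimeter^2 / (4 * pi * Area); exactly 1 for a perfect circle
    return perimeter**2 / (4 * np.pi * area)

r = 1.0
print(compactness(2 * np.pi * r, np.pi * r**2))  # circle: 1.0
print(compactness(2 * (4 + 1), 4 * 1))           # 4x1 rectangle: ~1.99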

Q4: What is object labeling and counting in binary shape analysis?


Answer: Object labeling is the process of identifying and assigning unique labels
to connected components (groups of adjacent pixels having the same binary
value) in a binary image. Counting simply refers to tallying the number of such
uniquely labeled objects. The labeling typically uses methods such as:
• Two-pass algorithm: First pass assigns temporary labels and records label
equivalences. Second pass resolves label equivalences.
• Flood-fill algorithm: Recursively labels connected pixels.
Applications include counting cells in biomedical images, identifying objects in
industrial inspection, and detecting features in remote sensing.
Advanced labeling techniques also address challenges such as overlapping
objects and touching components, using methods like watershed segmentation
or deep learning-based segmentation for more accurate object identification
and counting.
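
For illustration, a minimal sketch of labeling and counting with SciPy's
connected-component routine (the binary image here is a toy example):

import numpy as np
from scipy import ndimage

binary_image = np.array([[0, 1, 1, 0, 0],
                         [0, 1, 0, 0, 1],
                         [0, 0, 0, 1, 1]])
# ndimage.label assigns a unique integer to each 4-connected component
labels, num_objects = ndimage.label(binary_image)
print(labels)        # per-pixel label map
print(num_objects)   # number of distinct objects (here: 2)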

Q5: Explain HT-based circular object detection.


Answer: Hough Transform (HT) based circular detection identifies circular
shapes by transforming the problem from the image space into the parameter
space (center coordinates (a, b) and radius r).
Process:
1. Detect edges using an edge detector (e.g., Sobel or Canny).
2. For each edge pixel (x, y), vote in an accumulator for all (a, b, r) satisfying:
(x - a)^2 + (y - b)^2 = r^2
3. Peaks in the accumulator space correspond to circles present in the image.


This method is robust to noise and occlusion but can be computationally
intensive.
To improve performance, modifications such as probabilistic Hough Transform
and gradient-based voting strategies have been introduced, significantly
speeding up circle detection in real-world applications like medical imaging and
autonomous driving.
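
A minimal sketch of the voting step for a single known radius (a full
implementation would loop over a range of radii; the edge pixels here are
assumed points on one circle):

import numpy as np

H, W, r = 100, 100, 20                  # image size and one candidate radius
edge_points = [(50, 30), (50, 70), (30, 50), (70, 50)]   # assumed edge pixels
accumulator = np.zeros((W, H))          # votes for candidate centers (a, b)

thetas = np.linspace(0, 2 * np.pi, 360, endpoint=False)
for x, y in edge_points:
    # every (a, b) with (x - a)^2 + (y - b)^2 = r^2 lies on a circle around (x, y)
    a = np.round(x - r * np.cos(thetas)).astype(int)
    b = np.round(y - r * np.sin(thetas)).astype(int)
    ok = (a >= 0) & (a < W) & (b >= 0) & (b < H)
    np.add.at(accumulator, (a[ok], b[ok]), 1)

print(np.unravel_index(accumulator.argmax(), accumulator.shape))  # peak near (50, 50)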

Q6: Case Study: Human Iris Location


Answer: Human iris location typically involves locating the inner and outer
boundaries of the iris in an eye image.
Steps:
1. Preprocessing: Apply histogram equalization to improve contrast.
2. Edge Detection: Use edge detection to highlight boundaries.
3. Circular Hough Transform: Detect circular edges corresponding to the
pupil and the outer iris boundary.
4. Refinement: Apply gradient ascent or active contour models (snakes) to
refine the detected boundaries.
Applications include iris recognition systems for secure authentication.
Modern iris detection systems also incorporate deep learning models like
convolutional neural networks (CNNs) for feature extraction and segmentation,
significantly improving accuracy and robustness even under challenging
conditions like low-light environments or occlusions.
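
A hedged OpenCV sketch of steps 1-3 (the file name and all parameter values are
illustrative and would need tuning per dataset):

import cv2

gray = cv2.imread('eye.png', cv2.IMREAD_GRAYSCALE)   # hypothetical eye image
eq = cv2.equalizeHist(gray)                          # step 1: contrast enhancement
blurred = cv2.medianBlur(eq, 5)                      # suppress noise before voting
# steps 2-3: HOUGH_GRADIENT runs Canny edge detection internally, then votes
circles = cv2.HoughCircles(blurred, cv2.HOUGH_GRADIENT, dp=1, minDist=50,
                           param1=100, param2=30, minRadius=20, maxRadius=80)
if circles is not None:
    for a, b, r in circles[0]:
        print(f"candidate boundary: center=({a:.0f}, {b:.0f}), radius={r:.0f}")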

Q7: How do projection schemes work in 3D vision?


Answer: Projection schemes in 3D vision map 3D points into a 2D image plane,
allowing for the interpretation of 3D data in 2D images. Common types include:
• Orthographic Projection: Projects points perpendicularly onto the image
plane, preserving parallelism but not depth.


• Perspective Projection: Projects points based on a viewpoint, simulating
human visual perspective. Distant objects appear smaller.
Mathematically, perspective projection is modeled as:
(x', y') = (f * x / z, f * y / z)
Where (x, y, z) is the 3D point, (x', y') is the 2D projected point, and f is the focal
length.
Projection schemes are fundamental for tasks such as camera calibration, pose
estimation, and 3D reconstruction, enabling seamless integration between real-
world 3D environments and 2D imaging systems.
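
A small NumPy sketch of perspective projection using the formula above (the
focal length and points are illustrative):

import numpy as np

f = 1.0                                 # illustrative focal length
points_3d = np.array([[1.0, 2.0, 4.0],  # (x, y, z), with z > 0 in front of the camera
                      [0.5, -1.0, 2.0]])
x, y, z = points_3d[:, 0], points_3d[:, 1], points_3d[:, 2]
points_2d = np.stack([f * x / z, f * y / z], axis=1)   # (x', y') = (f*x/z, f*y/z)
print(points_2d)   # more distant points project closer to the image center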
Q8: Explore the concept of translational alignment for registering 3D data from
multiple sensors or scans. Implement a basic alignment algorithm and evaluate
its accuracy.
Answer: Translational alignment involves shifting one 3D dataset to match
another by finding the optimal translation vector (Tx, Ty, Tz).
Basic Algorithm:
1. Compute centroids of source and target point clouds.
2. Translation vector = Target centroid - Source centroid.
3. Apply translation to source points.
Pseudo-code (runnable with NumPy, given N x 3 point arrays):
import numpy as np
source_centroid = np.mean(source_points, axis=0)
target_centroid = np.mean(target_points, axis=0)
translation_vector = target_centroid - source_centroid
aligned_source = source_points + translation_vector
Accuracy Evaluation:
• Compute Root Mean Square Error (RMSE) between aligned source points
and target points.
• Lower RMSE indicates better alignment.
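
A self-contained sketch of this evaluation on synthetic data (a pure
translation with exact correspondences, so the RMSE should be near zero):

import numpy as np

target_points = np.random.rand(100, 3)                     # synthetic target cloud
source_points = target_points - np.array([1.0, 0.5, 0.2])  # shifted copy
translation_vector = target_points.mean(axis=0) - source_points.mean(axis=0)
aligned_source = source_points + translation_vector

errors = np.linalg.norm(aligned_source - target_points, axis=1)
rmse = np.sqrt(np.mean(errors**2))
print(rmse)   # ~0 here; real scans with noise and partial overlap give larger values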


In practical applications, translational alignment is often combined with
rotational adjustments (rigid transformations) using algorithms like Iterative
Closest Point (ICP) for more accurate 3D registration across diverse and complex
datasets.

Q9: Develop a method to separate the foreground (faces) from the background
in images and videos. Experiment with different approaches, such as Gaussian
mixture models or deep learning.
Answer: Method 1: Gaussian Mixture Models (GMM)
• Model each pixel as a mixture of Gaussians.
• Background pixels have stable distributions, while foreground pixels
exhibit abrupt changes.
• Pixels deviating significantly from the background model are classified as
foreground.
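
A hedged OpenCV sketch of this method using the built-in MOG2 background
subtractor (the video file name and thresholds are illustrative):

import cv2

cap = cv2.VideoCapture('video.mp4')    # hypothetical input video
# MOG2 models each pixel as a mixture of Gaussians, as described above
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                                detectShadows=True)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    fg_mask = subtractor.apply(frame)  # 255 = foreground, 127 = shadow, 0 = background
    fg_mask = cv2.threshold(fg_mask, 200, 255, cv2.THRESH_BINARY)[1]  # drop shadows
cap.release()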
Method 2: Deep Learning (U-Net / Mask R-CNN)
• Train a CNN-based segmentation model to predict foreground masks.
• Models are pre-trained on large datasets like COCO and fine-tuned for
specific domains.
Experiment:
• Apply both methods on sample video sequences.
• Measure Intersection over Union (IoU) and F1 score against ground truth
masks.
Result:
• Deep learning methods outperform GMMs in complex scenarios with
dynamic backgrounds.
• Emerging techniques, including transformer-based architectures and
temporal deep models, further enhance segmentation accuracy by
leveraging contextual and motion information in videos.


Q10: Implement an AAM for facial feature alignment and tracking. Analyze its
ability to handle varying poses, expressions, and illumination conditions.
Answer: Active Appearance Models (AAM) combine shape and appearance
variations to model and track facial features.
Implementation Steps:
1. Annotate training images with facial landmarks.
2. Perform Principal Component Analysis (PCA) on shapes and textures.
3. Build a combined model that generates synthetic faces.
4. Fit the model to new images by minimizing the difference between the
model appearance and the image appearance.
Evaluation:
• Test on a dataset with varying poses, expressions, and lighting.
• Metrics: Landmark error (normalized by inter-ocular distance).
Findings:
• AAMs handle moderate pose and expression changes well.
• Performance degrades under large out-of-plane rotations or strong
shadows.
• Extensions like 3D AAMs or Deep AAMs can improve robustness.
• Recent advances integrate deep learning with traditional AAMs, enabling
more accurate and real-time facial tracking even under extreme variations
in pose, lighting, and facial expressions.
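
For illustration, a minimal sketch of step 2 (PCA on landmark shapes) with
synthetic data; a full AAM would also model texture and fit both jointly:

import numpy as np
from sklearn.decomposition import PCA

n_images, n_landmarks = 200, 68
# each training shape: 68 (x, y) landmarks flattened into a 136-vector (synthetic here)
shapes = np.random.rand(n_images, n_landmarks * 2)

pca = PCA(n_components=0.95)          # keep components explaining 95% of shape variance
coeffs = pca.fit_transform(shapes)    # low-dimensional shape parameters per image
mean_shape = pca.mean_.reshape(n_landmarks, 2)
# a shape is synthesized as the mean plus a weighted sum of eigen-shapes
synthetic = pca.inverse_transform(coeffs[:1])[0].reshape(n_landmarks, 2)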
