Image Analysis and Computer Vision
CoSc-6412
Dr. V. Anitha
1
Email: [email protected]
Course Description
This course is designed to give students the
fundamentals of 2D digital image processing, with
emphasis on image processing techniques, image
filter design, segmentation, enhancement,
morphological image processing, recognition of
objects in an image, and applications of image
processing.
2
Course Objective
On the successful completion of the course the students will be
able to:
➢ have a clear understanding of the principles of Digital Image
Processing
➢ understand the mathematical foundations for the digital manipulation of
images
➢ learn and understand Image Enhancement in the Spatial and
Frequency Domains
➢ understand Image Restoration, Compression, Segmentation,
Recognition, Representation and Description.
3
Course Content
Chapter/Topic and Sub-Topics
Chapter 1: Introduction to Image and Vision
➢ Elements of visual perception
➢ Image sensing and acquisition
➢ Image sampling and quantization
➢ Digital image representation
➢ Linear and nonlinear representation
Chapter 2: Image Enhancement
➢ Enhancement in the spatial domain
➢ Grey level transformation
➢ Histogram processing
➢ Smoothing and sharpening spatial filters
➢ Enhancement in the frequency domain
➢ Fourier transform
➢ Smoothing and sharpening frequency domain filtering
➢ Homomorphic filtering
4
Course Content
Chapter/Topic and Sub-Topics
Chapter 3: Morphological Image Processing
➢ Dilation and Erosion
➢ Morphological algorithms
Chapter 4: Image Segmentation
➢ Detection of discontinuities
➢ Boundary detection
➢ Thresholding
Chapter 5: Object Recognition
➢ Patterns and pattern classes
➢ Decision-Theoretic Methods
➢ Structural Methods
5
Reference
1. Gonzalez, R. C. and Woods, R. E. [2002/2008]. Digital Image
Processing, 2nd/3rd ed., Prentice Hall.
2. Sonka, M., Hlavac, V., and Boyle, R. [1999]. Image Processing, Analysis and
Machine Vision, 2nd ed., PWS Publishing; or 3rd ed., Thomson
Engineering, 2007.
3. Gonzalez, R. C., Woods, R. E., and Eddins, S. L. [2009]. Digital Image
Processing Using MATLAB, 2nd ed., Gatesmark Publishing, Knoxville, TN.
4. Anil K. Jain [2001]. Fundamentals of Digital Image Processing, 2nd
ed., Prentice-Hall, NJ.
5. William K. Pratt [2001]. Digital Image Processing, 3rd ed., John
Wiley & Sons, NY.
6. Burger, Wilhelm and Burge, Mark J. [2008]. Digital Image Processing:
An Algorithmic Introduction Using Java, Springer.
6
Course Evaluation
Journal Paper Review and Presentation: 15%
Mini Test: 25%
Group Mini-Project (Pair)/Presentation: 10%
Lab and Class Engagement and Activities: 10%
Final Exam: 40%
7
Scientific Paper Review
Individual Task
Select an article of your interest
Make a scientific review
Presentation Date ??????? (To be decided)
8
Image Analysis and Computer Vision
CoSc-6412
Topic 1:
Introduction to Image and Vision
9
Topic Coverage
Chapter 1: Introduction to Image and Vision
➢ Elements of visual perception
➢ Image sensing and acquisition
➢ Image sampling and quantization
➢ Linear and nonlinear representation
➢ Digital image representation
10
Every picture tells a story
An image carries a vast amount of information.
We humans are selective about what we take in
through the visual sense.
The goal of computer vision is to
write computer programs that can
interpret images.
Can computers match human
perception?
Yes and no (but mostly no!)
Humans are much better at “hard” things;
computers can be better at “easy” things.
11
Overview: Computer Imaging
• Definition of computer imaging:
– Acquisition and processing of visual
information by computer.
• Why is it important?
– The primary human sense is vision.
– Information can be conveyed well through
images (one picture is worth a thousand words).
– Computers are required because the amount of
data to be processed is huge.
Overview: Computer Imaging
• Computer imaging can be divided into two
main categories:
– Computer Vision: applications where the output
is used by a computer.
– Image Processing: applications where the output
is used by humans.
• These two categories are not totally
separate and distinct.
Overview: Computer Imaging
• They overlap each other in certain areas.
COMPUTER IMAGING: Computer Vision and Image Processing (two overlapping areas)
What is Computer Vision?
Deals with the development of the theoretical
and algorithmic basis by which useful
information about the 3D world can be
automatically extracted and analyzed from a
single 2D image or from multiple 2D images of the world.
27
Computer Vision
• Does not involve human in the visual loop.
• One of the major topics within this field is
image analysis.
• Image analysis involves the examination of
image data to help solve a vision
problem.
Computer Vision
• Image analysis process involves two other
topics:
– Feature extraction: acquiring higher-level
image information (e.g., shape and color).
– Pattern classification: using higher-level image
information to identify objects within the image.
Computer Vision
Reconstruction
Representation
Receive data from the real world and reconstruct it internally
How do humans perform this task?
Recover 3D information from data
Angle and lighting
Recognition
Feature extraction
Segmentation of image parts
Detect and identify objects
Understanding
Giving context to image parts
Knowing what is happening in the scene
13
Why is Computer Vision Difficult?
It is a many-to-one mapping
A variety of surfaces with different material and geometrical
properties, possibly under different lighting conditions, could
lead to identical images
The inverse mapping has a non-unique solution (a lot of information is
lost in the transformation from the 3D world to the 2D image)
It is computationally intensive
We do not understand the recognition problem
30
What is an Image?
The pattern is defined in a coordinate system whose origin is
conventionally the upper-left corner of the image.
We can describe the pattern by a function f(x, y).
(0,0)
Value: f(x, y, z, λ, t) (position, wavelength, time)
14
Image Processing and Related Fields
16
What Is Digital Image Processing?
Digital image processing helps us enhance images to make
them visually pleasing, or emphasize regions or features of an
image to better represent the content.
For example, we may wish to enhance the brightness and
contrast to make a better print of a photograph, similar to
popular photo-processing software.
In a magnetic resonance image (MRI) of the brain, we may
want to accentuate a certain region of image intensities to see
certain parts of the brain.
12
Image Processing
• Processed images are to be used by
humans.
– Therefore, it requires some understanding of
how the human visual system operates.
• Among the major topics are:
– Image restoration.
– Image enhancement.
– Image compression.
Image Processing
• Image restoration:
– The process of taking an image with some
known, or estimated, degradation and restoring
it to its original appearance.
– Done by performing the reverse of the
degradation process to the image.
– Examples: correcting distortion in the optical
system of a telescope.
Image Processing
An Example of Image Restoration
Image Processing
• Image enhancement:
– Improve an image visually by taking
advantage of the human visual system’s response.
– Examples: contrast improvement, image sharpening,
and image smoothing.
Image Processing
An Example of Image Enhancement
Image Processing
• Image compression:
– Reduce the amount of data required to
represent an image by:
• Removing data that are visually
unnecessary.
• Taking advantage of the redundancy that is
inherent in most images.
– Example: JPEG, MPEG, etc.
Elements of Visual Perception
How do people perceive images?
How are images formed in the eye?
How do human and electronic imaging compare in terms of
resolution and the ability to adapt to changes in illumination?
17
Structure of the Human Eye
The eye is nearly a sphere, with an average diameter of
approximately 20 mm.
Three membranes enclose the eye:
1. the cornea (the transparent exterior portion of the eye covering the iris, the colored
part of the eye) and the sclera (which protects the delicate structures inside), forming the outer cover;
2. the choroid (another layer found underneath the sclera); and
3. the retina (a collection of light-sensitive tissue).
The cornea is a tough, transparent tissue that covers the
anterior surface of the eye.
Continuous with the cornea, the sclera is an opaque
membrane that encloses the remainder of the optic globe.
The choroid lies directly below the sclera.
18
Structure of the Human Eye
The lens is made up of concentric layers of fibrous cells and is
suspended by fibers that attach to the ciliary body.
The lens contains 60 to 70% water, about 6% fat, and more
protein than any other tissue in the eye.
The innermost membrane of the eye is the retina, which lines
the inside of the wall’s entire posterior portion.
When the eye is properly focused, light from an object
outside the eye is imaged on the retina. Pattern vision is
afforded by the distribution of discrete light receptors over
the surface of the retina.
19
Structure of the Human Eye
20
Structure of the Human Eye
21
Image formation in the Eye
The principal difference between the lens of the eye and
an ordinary optical lens is that lens of the eye is
flexible.
The shape of the lens is controlled by tension in the
fibers of the ciliary body.
To focus on distant objects, the controlling muscles
cause the lens to be relatively flattened.
Similarly, these muscles allow the lens to become thicker
in order to focus on objects near the eye.
22
The Human Visual System
The Human Visual System
• This is how human visual system works:
– Light energy is focused by the lens of the eye
onto the sensors in the retina.
– The sensors respond to the light by an
electrochemical reaction that sends an
electrical signal to the brain (through the optic
nerve).
– The brain uses the signals to create
neurological patterns that we perceive as
images.
The Human Visual System
• Visible light is an electromagnetic wave
with a wavelength range of about 380 to 825
nanometers.
– However, response above 700 nanometers is
minimal.
• We cannot “see” many parts of the
electromagnetic spectrum.
The Human Visual System
The Human Visual System
• The visible spectrum can be divided into
three bands:
– Blue (400 to 500 nm).
– Green (500 to 600 nm).
– Red (600 to 700 nm).
• The sensors are distributed across retina.
The Human Visual System
The Human Visual System
• There are two types of sensors: rods and
cones.
• Rods:
– For night vision.
– See only brightness (gray level) and not color.
– Distributed across retina.
– Medium and low level resolution.
The Human Visual System
• Cones:
– For daylight vision.
– Sensitive to color.
– Concentrated in the central region of eye.
– High resolution capability (differentiate small
changes).
The Human Visual System
• Blind spot:
– No sensors.
– Place for optic nerve.
– We do not perceive it as a blind spot because
the brain fills in the missing visual information.
• Why must an object be in the center of the
field of vision in order to be perceived in fine
detail?
– This is where the cones are concentrated.
The Human Visual System
• Cones have higher resolution than rods
because they have individual nerves tied to
each sensor.
• Rods have multiple sensors tied to each
nerve.
• Rods react even in low light but see only a
single spectral band. They cannot
distinguish color.
Image formation in the Eye
The distance between the center of the lens and the retina
(called the focal length) varies from approximately 17 mm
to about 14 mm, as the refractive power of the lens increases
from its minimum to its maximum.
For example, suppose a person looks at a tree 15 m high at a
distance of 100 m. If h is the height in mm of that object in the
retinal image, the geometry of visualization yields 15/100 = h/17,
or h = 2.55 mm.
23
The Human Visual System
• There are three types of cones. Each
responding to different wavelengths of
light energy.
• The colors that we perceive are the
combined result of the response of the
three cones.
The Human Visual System
Issue of Contrast
Objects appear to the eye to become darker as the
background gets lighter.
The example below is a piece of paper that seems white
when lying on a desk, but can appear totally black against a
brighter background.
24
Issue of Illumination
Same objects and arrangement
Different angles of light – a many-to-one mapping
25
Perception—Illusions
The border of the square is visible despite there being no border line.
There seems to be a circle in the middle.
The short lines seem to be slanted but are actually parallel.
The lower horizontal line seems longer.
26
Image Processing
28
The Three Processing Levels
1. Low-level processing
Standard procedures are applied to improve image quality
The procedures require no intelligent capabilities.
32
The Three Processing Levels (cont’d)
2. Intermediate-level processing
Extract and characterize components in the image
Some intelligent capabilities are required.
33
The Three Processing Levels (cont’d)
3. High-level processing
Recognition and interpretation.
The procedures require highly intelligent capabilities.
34
Mathematics in Computer Vision
In the early days of computer vision, vision systems
employed simple heuristic methods.
Today, the domain is heavily inclined towards theoretically
well-founded methods involving non-trivial mathematics:
Calculus
Linear Algebra
Probability and Statistics
Signal Processing
Projective Geometry
Computational Geometry
Optimization Theory
Control Theory
42
Computer Vision Applications
Industrial inspection/quality control
Surveillance and security
Face recognition
Gesture recognition
Space applications
Medical image analysis
Autonomous vehicles
Virtual reality and much more …...
43
Face Detection / Face Blurring
44
Medical Image Analysis
Image-guided surgery
3D imaging: MRI, CT (Grimson et al., MIT)
45
Surveillance and Tracking
46
Surveillance and Tracking
47
Smart cars
49
Self-driving cars
51
Optical character recognition
Digit recognition (AT&T Labs); license plate recognition
http://www.research.att.com/~yann
52
Sports Video Analysis
Tennis review system
53
Image Sensing and Acquisition
The types of images in which we are interested
are generated by the combination of an
“illumination” source and the reflection or
absorption of energy from that source by the
elements of the “scene” being imaged.
54
Critical issues
What information should be extracted?
How can it be extracted?
How should it be represented?
How can it be used to aid analysis and understanding?
55
Computer Imaging Systems
• Computer imaging systems comprise
both hardware and software.
• The hardware components can be divided
into three subsystems:
– The computer
– Image acquisition: camera, scanner, video
recorder.
– Image display: monitor, printer, film, video
player.
Computer Imaging Systems
• The software is used for the following
tasks:
– Manipulate the image and perform any
desired processing on the image data.
– Control the image acquisition and storage
process.
• The computer system may be a general-
purpose computer with a frame grabber or
image digitizer board in it.
Computer Imaging Systems
• A frame grabber is a special-purpose piece of
hardware that digitizes a standard analog
video signal.
• Digitization of the analog video signal is
important because computers can only
process digital data.
Computer Imaging Systems
• Digitization is done by sampling the analog
signal, that is, instantaneously measuring the
voltage of the signal at fixed intervals in
time.
• The value of the voltage at each instant is
converted into a number and stored.
• The number represents the brightness of
the image at that point.
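As a minimal sketch of this sampling-and-quantization idea (assuming Python with NumPy, and a made-up sinusoidal signal standing in for the analog video voltage):

    import numpy as np

    # Hypothetical analog signal: the voltage along one scan line, modeled as a sine wave.
    def analog_voltage(t):
        return 0.5 + 0.5 * np.sin(2 * np.pi * 3 * t)   # values in [0, 1]

    # Sampling: measure the voltage at fixed intervals in time (64 samples here).
    t = np.linspace(0.0, 1.0, 64, endpoint=False)
    samples = analog_voltage(t)

    # Each measured value is converted into a number (8-bit brightness, 0..255) and stored.
    digital = np.round(samples * 255).astype(np.uint8)
    print(digital[:8])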
Computer Imaging Systems
• The “grabbed” image is now a digital image
and can be accessed as a two dimensional
array of data.
– Each data point is called a pixel (picture
element).
• The following notation is used to express a
digital image:
– I(r,c) = the brightness of the image at point
(r,c) where r = row and c = column.
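A small sketch of this notation, assuming Python with NumPy (the pixel values are made up):

    import numpy as np

    # A tiny 4x4 grayscale image as a two-dimensional array of brightness values.
    I = np.array([[ 10,  20,  30,  40],
                  [ 50,  60,  70,  80],
                  [ 90, 100, 110, 120],
                  [130, 140, 150, 160]], dtype=np.uint8)

    r, c = 2, 1                 # row 2, column 1
    print(I[r, c])              # I(r,c): brightness of the image at that point -> 100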
What do computers see?
Numbers…
What do these
numbers represent?
56
Image Sampling and Quantization
The objective of imaging is to generate digital images
(representation) from sensed data (observation).
In creating a digital image, we need to convert the continuous
sensed data into digital form. This involves two processes:
sampling and quantization.
An image may be continuous with respect to the x- and y-
coordinates, and also in amplitude.
To convert it to digital form, we have to sample the function in
both coordinates and in amplitude.
1. Digitizing the coordinate values is called sampling.
2. Digitizing the amplitude (brightness) values is called
quantization.
61
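A minimal sketch of both operations, assuming Python with NumPy and a random stand-in for the sensed image:

    import numpy as np

    # Stand-in for an already-digitized 8-bit image; in practice it would be loaded from a file.
    img = np.random.randint(0, 256, size=(256, 256), dtype=np.uint8)

    # Sampling: keep every 4th pixel in both coordinates (a coarser spatial grid).
    sampled = img[::4, ::4]

    # Quantization: reduce 256 brightness levels to 8 levels.
    levels = 8
    step = 256 // levels
    quantized = (img // step) * step

    print(sampled.shape, np.unique(quantized).size)   # (64, 64) and at most 8 distinct levels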
Image Sampling and Quantization
62
Image Sampling and Quantization
Sampling and quantization
Take sample pixels and map the light intensity to some predefined range
63
Image Representation
• Digital image I(r, c) is represented as a two-
dimensional array of data.
• Each pixel value corresponds to the
brightness of the image at point (r, c).
• This image model is for monochrome (one
color, or black and white) image data.
Image Representation
• Multiband images (color, multispectral) can
be modeled by a different I(r, c) function
for each separate band of brightness
information.
• Types of images that we will discuss:
– Binary
– Gray-scale
– Color
– Multispectral
Binary Images
• Takes only two values:
– Black and white (0 and 1)
– Requires 1 bit/pixel
• Used when the only information required
is shape or outline info. For example:
– To position a robotic gripper to grasp an
object.
– To check a manufactured object for
deformations.
– For facsimile (FAX) images.
Binary Images
Binary Images
• Binary images are often
created from gray-scale
images via a threshold
operation.
– White (‘1’) if pixel value is
larger than threshold.
– Black (‘0’) if it is less.
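A minimal sketch of such a threshold operation, assuming Python with NumPy and a random gray-scale stand-in image:

    import numpy as np

    gray = np.random.randint(0, 256, size=(128, 128), dtype=np.uint8)  # stand-in gray-scale image

    threshold = 128
    binary = (gray > threshold).astype(np.uint8)   # white ('1') above threshold, black ('0') otherwise

    print(binary.min(), binary.max())              # only the two values 0 and 1 remain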
Gray-Scale Images
• Also referred to as monochrome or one-
color images.
• Contain only brightness information. No
color information.
• Typically contain 8 bits/pixel data, which
corresponds to 256 (0 to 255) different
brightness (gray) levels.
Gray-Scale Images
• However, there are applications, such as
medical imaging or astronomy, that
require 12 or 16 bits/pixel.
– Useful when a small section of the image is
enlarged.
– Allows the user to repeatedly zoom a specific
area in the image.
Color Images
• Modeled as three band monochrome
image data.
• The values correspond to the brightness in
each spectral band.
• Typical color images are represented as
red, green and blue (RGB) images.
Color Images
• Using the 8-bit standard model, a color
image would have 24 bits/pixel.
– 8-bits for each of the three color bands (red,
green and blue).
Color Images
• For many applications, RGB is transformed to a
mathematical space that decouples (separates) the
brightness information from color information.
• The transformed images would have a:
– 1-D brightness or luminance.
– 2-D color space or chrominance.
• This creates a more people-oriented way of describing
colors.
Color Images
• One example is the
hue/saturation/lightness (HSL) color
transform.
– Hue: Color (green, blue, orange, etc).
– Saturation: How much white is in the color
(pink is red with more white, so it is less
saturated than pure red).
– Lightness: The brightness of the color.
Color Images
• Most people can relate to this method of
describing color.
– “A deep, bright orange” would have a large
intensity (bright), a hue of orange, and a high
saturation value (deep).
– It is easier to picture this color in the mind.
– If we define this color in terms of its RGB
components, R = 245, G = 110, B = 20, we have
no idea what this color looks like.
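The Python standard library can illustrate this: colorsys.rgb_to_hls() converts an RGB triple to the closely related hue/lightness/saturation representation. A small sketch using the slide's example color:

    import colorsys

    # The slide's example color: R = 245, G = 110, B = 20 (an orange).
    r, g, b = 245 / 255, 110 / 255, 20 / 255       # colorsys expects values in [0, 1]

    h, l, s = colorsys.rgb_to_hls(r, g, b)
    print(f"hue = {h * 360:.0f} deg, lightness = {l:.2f}, saturation = {s:.2f}")
    # roughly: hue ~24 deg (orange), lightness ~0.52, saturation ~0.92 (deep and bright)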
Color Images
• In addition to HSL, there are various other
formats used for representing color
images:
– YCrCb
– SCT (Spherical Coordinate Transform)
– PCT (Principal Component Transform)
– CIE XYZ
– L*u*v*
– L*a*b*
Color Images
• One color space can be converted to
another color space by using equations.
• Example: Converting RGB color space to
YCrCb color space.
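The slide's own equations are not included in the text, so the sketch below uses the widely used BT.601 (full-range) definition of the RGB-to-YCrCb conversion, assuming Python with NumPy:

    import numpy as np

    def rgb_to_ycrcb(rgb):
        # Convert an 8-bit RGB image (H x W x 3) to Y, Cr, Cb using BT.601 full-range weights.
        rgb = rgb.astype(np.float64)
        r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
        y  = 0.299 * r + 0.587 * g + 0.114 * b
        cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
        cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
        return np.clip(np.stack([y, cr, cb], axis=-1), 0, 255).astype(np.uint8)

    img = np.array([[[245, 110, 20]]], dtype=np.uint8)   # a single orange pixel as a stand-in image
    print(rgb_to_ycrcb(img)[0, 0])                       # its Y, Cr, Cb values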
Multispectral Images
• Typically contain information outside
normal human perceptual range.
– Infrared, ultraviolet, X-ray, acoustic or radar
data.
• They are not really images in usual sense
(not representing scene of physical world,
but rather information such as depth).
• Values are represented in visual form by
mapping the different spectral bands to
RGB.
Multispectral Images
• Sources include satellite system,
underwater sonar system, airborne radar,
infrared imaging systems, and medical
diagnostic imaging systems.
• The number of bands into which the data
are divided depends on the sensitivity of
the imaging sensors.
Multispectral Images
• Most satellite images contain two to seven
spectral bands.
– One to three in the visible spectrum.
– One or more in the infrared region.
• Newest satellites have sensors that collect
image information in 30 or more bands.
• Due to the large amount of data involved,
compression is essential.
Some Basic Relationships Between Pixels
64
Some Basic Relationships Between Pixels
65
Distance Measures
Euclidean distance:
D_e(p, q) = [(x - s)^2 + (y - t)^2]^(1/2)
City-block distance:
D_4(p, q) = |x - s| + |y - t|
Chessboard distance:
D_8(p, q) = max(|x - s|, |y - t|)
where p has coordinates (x, y) and q has coordinates (s, t).
66
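A minimal sketch of the three distance measures in plain Python (the two example points are made up):

    import math

    def euclidean(p, q):
        (x, y), (s, t) = p, q
        return math.hypot(x - s, y - t)        # D_e(p, q)

    def city_block(p, q):
        (x, y), (s, t) = p, q
        return abs(x - s) + abs(y - t)         # D_4(p, q)

    def chessboard(p, q):
        (x, y), (s, t) = p, q
        return max(abs(x - s), abs(y - t))     # D_8(p, q)

    p, q = (2, 3), (5, 7)
    print(euclidean(p, q), city_block(p, q), chessboard(p, q))   # 5.0 7 4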
Region/Boundary/Edge
Region
We call R a region of the image if R is a connected set
Boundary
The boundary of a region R is the set of pixels in the
region that have one or more neighbors that are not in R
Edge
Pixels with derivative values that exceed a preset
threshold
67
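A minimal sketch of the edge definition (derivative values above a preset threshold), assuming Python with NumPy and a synthetic step-edge image:

    import numpy as np

    # Synthetic image: dark on the left, bright on the right (a vertical step edge).
    gray = np.zeros((64, 64), dtype=np.float64)
    gray[:, 32:] = 255.0

    # Finite-difference derivatives along rows and columns, then the gradient magnitude.
    gy, gx = np.gradient(gray)
    magnitude = np.hypot(gx, gy)

    # Edge pixels: derivative values that exceed a preset threshold.
    edges = magnitude > 50.0
    print(int(edges.sum()), "edge pixels")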
Image Representation
1. Image capture
2. Image quality measurements
3. Image resolution
4. Colour representation
5. Camera calibration
6. Parallels with human visual system
68
Image Capture
Many sources
Consider requirements of system
Resolution
69
Representation
Sampled data
Spatial: area based
Amplitude: light intensity
On a rectangular array
Multidimensional array
70
Image Resolution
How many pixels
Spatial resolution
How many shades of grey/colours
Amplitude resolution
How many frames per second
Temporal resolution (Motion)
71
Spatial Resolution
n, n/2, n/4, n/8, n/16 and n/32 pixels on a side.
72
Spatial Frequency Resolution
• To understand the concept of spatial
frequency, we must first understand the
concept of resolution.
• Resolution: the ability to separate two
adjacent pixels.
– If we can see two adjacent pixels as being
separate, then we can say that we can resolve
the two.
Spatial Frequency Resolution
• Spatial frequency: how rapidly the signal
changes in space.
Spatial Frequency Resolution
• If we increase the frequency, the stripes get
closer until they finally blend together.
Spatial Frequency Resolution
• The distance between eye and image also affects the
resolution.
– The farther the image, the worse the resolution.
• Why is this important?
– The number of pixels per square inch on a display device must be
large enough for us to see an image as being realistic. Otherwise
we will end up seeing blocks of colors.
– There is an optimum distance between the viewer and the
display device.
Temporal Resolution
• Related to how we respond to visual
information as a function of time.
– Useful when considering video and motion in
images.
– Can be measured using flicker sensitivity.
• Flicker sensitivity refers to our ability to
observe a flicker in a video signal displayed
on a monitor.
Temporal Resolution
Temporal Resolution
• The cutoff frequency is about 50 hertz
(cycles per second).
– We will not perceive any flicker for a video
signal above 50Hz.
– TV uses a frequency of around 60 Hz.
• The brighter the lighting, the more
sensitive we are to changes.
Amplitude Resolution
Humans can see:
About 40 shades of brightness
About 7.5 million shades of colour
Cameras can see:
Depends on the signal-to-noise ratio of the device
40 dB equates to about 20 shades
Images captured:
256 shades
73
Shades of Grey
256, 16, 4 and 2 shades.
74
End of Topic 1
75