Digital Image Processing
Object Recognition
Christophoros Nikou
[email protected]
Images taken from: R. Gonzalez and R. Woods. Digital Image Processing, Prentice Hall, 2008
University of Ioannina - Department of Computer Science and Engineering
2 Object Recognition
One of the most interesting aspects of the world is
that it can be considered to be made up of patterns.
A pattern is essentially an arrangement.
It is characterized by the order of the elements of
which it is made, rather than by the intrinsic nature
of these elements.
Norbert Wiener
C. Nikou – Digital Image Processing
3 Introduction
• Recognition of individual image regions
(objects or patterns).
• Introduction to basic techniques.
– Decision theoretic techniques.
• Quantitative descriptors (e.g. area, length…).
• Patterns arranged in vectors.
– Structural techniques.
• Qualitative descriptors (relational descriptors for
repetitive structures, e.g. staircase).
• Patterns arranged in strings or trees.
• Central idea: Learning from sample patterns.
4 Patterns and pattern classes
• Pattern: an arrangement of descriptors (or
features).
• Pattern class: a family of patterns sharing
some common properties.
– They are denoted by ω1, ω2,…, ωW, W being
the number of classes.
• Goal of pattern recognition: assign
patterns to their classes with as little
human interaction as possible.
5 Pattern vectors
• Historical example
– Recognition of three
types of iris flowers by the
lengths and widths of
their petals (Fisher 1936).
• Variations between and
within classes.
• Class separability
depends strongly on the
choice of descriptors.
Pattern vector: x = [x1, x2]^T (petal length and width).
6 Pattern vectors (cont.)
• Shape signature represented by the sampled
amplitude values.
• Cloud of n-dimensional points.
• Other shape characteristics could have been
employed (e.g. moments).
• The choice of descriptors has a profound role in
the recognition performance.
Pattern vector: x = [x1, x2,…, xn]^T.
7 String descriptors
• Description of structural relationships.
• Example: fingerprint recognition.
• Primitive components that describe
fingerprint ridge properties.
– Interrelationships of print features (minutiae).
• Abrupt endings, branching, merging,
disconnected segments,…
– Relative sizes and locations of print features.
8 String descriptors (cont.)
• Staircase pattern described by a head-to-tail
structural relationship.
• The rule allows only alternating patterns.
• It excludes other types of structures.
• Other rules may be defined.
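A toy sketch of such a rule (Python; the primitive names 'a' for the vertical and 'b' for the horizontal segment are invented here), accepting only strings in which the two primitives alternate head-to-tail:

```python
def is_staircase(s: str) -> bool:
    """Accept only strings of strictly alternating primitives
    starting with 'a', e.g. 'ababab' -- a toy head-to-tail rule."""
    return len(s) > 0 and all(
        c == ('a' if i % 2 == 0 else 'b') for i, c in enumerate(s)
    )

print(is_staircase("ababab"))  # True
print(is_staircase("aabb"))    # False
```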
9 Tree descriptors
• Hierarchical ordering
• In the satellite image
example, the structural
relationship is defined as:
“composed of”
10 Decision-theoretic methods
• They are based on decision (discriminant)
functions.
• Let x = [x1, x2,…, xn]^T represent a pattern
vector.
• For W pattern classes ω1, ω2,…, ωW, the
basic problem is to find W decision functions
d1(x), d2(x),…, dW (x)
with the property that if x belongs to class ωi:
di(x) > dj(x) for j = 1, 2,…, W; j ≠ i
11 Decision-theoretic methods (cont.)
• The decision boundary separating class ωi
from class ωj is given by the values of x for
which di (x) = dj (x) or
dij(x) = di(x) − dj(x) = 0
• If x belongs to class ωi :
dij(x) > 0 for j = 1, 2,…, W; j ≠ i
12 Decision-theoretic methods (cont.)
• Matching: an unknown pattern is assigned
to the class to which it is closest with
respect to a metric.
– Minimum distance classifier
• Computes the Euclidean distance between the
unknown pattern and each of the prototype vectors.
– Correlation
• It can be directly formulated in terms of images
• Optimum statistical classifiers
• Neural networks
13 Minimum distance classifier
The prototype of each pattern class is the mean
vector:
mj = (1/Nj) Σ_{x∈ωj} x,   j = 1, 2,…, W
Using the Euclidean distance as a measure of
closeness:
Dj(x) = ||x − mj||,   j = 1, 2,…, W
We assign x to class ωj if Dj (x) is the smallest
distance. That is, the smallest distance implies the
best match in this formulation.
14 Minimum distance classifier (cont.)
It is easy to show that selecting the smallest
distance is equivalent to evaluating the
functions:
dj(x) = x^T mj − (1/2) mj^T mj,   j = 1, 2,…, W
and assigning x to class ωj if dj (x) yields the
largest numerical value. This formulation
agrees with the concept of a decision function.
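A minimal sketch of this rule (NumPy; the two class means below are invented for illustration):

```python
import numpy as np

def min_distance_classify(x, means):
    """Return the index j maximizing d_j(x) = x^T m_j - 0.5 m_j^T m_j,
    which is equivalent to picking the nearest mean in Euclidean distance."""
    scores = [x @ m - 0.5 * (m @ m) for m in means]
    return int(np.argmax(scores))

# Hypothetical prototypes of two classes
means = [np.array([0.0, 0.0]), np.array([4.0, 4.0])]
print(min_distance_classify(np.array([1.0, 0.5]), means))  # 0
print(min_distance_classify(np.array([3.5, 4.2]), means))  # 1
```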
15 Minimum distance classifier (cont.)
• The decision boundary between classes ωi and ωj
is given by:
dij(x) = di(x) − dj(x)
       = x^T(mi − mj) − (1/2)(mi + mj)^T(mi − mj) = 0
• The surface is the perpendicular bisector of the
line segment joining mi and mj.
• For n=2, the perpendicular bisector is a line, for
n=3 it is a plane and for n>3 it is called a
hyperplane.
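A quick numerical check of this property (NumPy, with invented class means): the midpoint of the segment joining mi and mj lies on the surface dij(x) = 0, and the two means fall on opposite sides.

```python
import numpy as np

mi, mj = np.array([0.0, 0.0]), np.array([4.0, 2.0])  # hypothetical means

def d_ij(x):
    # d_ij(x) = x^T (mi - mj) - 0.5 (mi + mj)^T (mi - mj)
    return x @ (mi - mj) - 0.5 * (mi + mj) @ (mi - mj)

midpoint = 0.5 * (mi + mj)
print(d_ij(midpoint))               # 0.0: on the boundary
print(d_ij(mi) > 0, d_ij(mj) < 0)  # True True: opposite sides
```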
16 Minimum distance classifier (cont.)
17 Minimum distance classifier (cont.)
• In practice, the classifier works well when the
distance between means is large compared
to the spread of each class.
• This seldom occurs unless the system
designer controls the nature of the input.
• An example is the recognition of characters
on bank checks.
– American Bankers Association E-13B font
character set.
18 Minimum distance classifier (cont.)
• Characters designed on a
9x7 grid.
• The characters are
scanned horizontally by a
head that is narrower but
taller than the character
which produces a 1D
signal proportional to the
rate of change of the
quantity of the ink.
• The waveforms
(signatures) are different
for each character.
19 Matching by correlation
• We have seen the definition of correlation
and its properties in the Fourier domain.
g(x, y) = Σs Σt w(s, t) f(x + s, y + t)
G(u, v) = F*(u, v) W(u, v)
• This definition is sensitive to scale changes
in both images.
• Instead, we use the normalized correlation
coefficient.
20 Matching by correlation (cont.)
• Normalized correlation coefficient:
γ(x, y) = Σs Σt [w(s, t) − w̄][f(x + s, y + t) − f̄xy] /
          {Σs Σt [w(s, t) − w̄]² Σs Σt [f(x + s, y + t) − f̄xy]²}^(1/2)
where w̄ is the mean of the template and f̄xy is the mean of f in the region under w.
• γ (x,y) takes values in [-1,1].
• The maximum occurs when the two regions
are identical.
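A direct, unoptimized sketch of the coefficient (NumPy; explicit loops for clarity, whereas practical implementations use FFT-based formulations):

```python
import numpy as np

def ncc_map(f, w):
    """Normalized correlation coefficient of template w at every
    valid position of image f; each value lies in [-1, 1]."""
    wh, ww = w.shape
    wz = w - w.mean()                       # zero-mean template
    out = np.full((f.shape[0] - wh + 1, f.shape[1] - ww + 1), -1.0)
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            patch = f[y:y + wh, x:x + ww]
            pz = patch - patch.mean()       # zero-mean image region
            denom = np.sqrt((wz ** 2).sum() * (pz ** 2).sum())
            if denom > 0:                   # flat regions stay at -1
                out[y, x] = (wz * pz).sum() / denom
    return out

# The peak of the map locates the template inside the image
f = np.zeros((10, 10))
w = np.array([[1.0, 2.0], [3.0, 4.0]])
f[4:6, 5:7] = w
g = ncc_map(f, w)
y, x = np.unravel_index(np.argmax(g), g.shape)
print(int(y), int(x))  # 4 5
```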
21 Matching by correlation (cont.)
γ(x, y) = Σs Σt [w(s, t) − w̄][f(x + s, y + t) − f̄xy] /
          {Σs Σt [w(s, t) − w̄]² Σs Σt [f(x + s, y + t) − f̄xy]²}^(1/2)
• It is robust to changes in
the amplitudes.
• Normalization with
respect to scale and
rotation is a challenging
task.
22 Matching by correlation (cont.)
• The compared
windows may be
seen as random
variables.
• The correlation
coefficient
measures the linear
dependence
between X and Y:
ρ(X, Y) = E[(X − mX)(Y − mY)] / (σX σY)
23 Matching by correlation (cont.)
Detection of the eye of the hurricane.
(Figure panels: image, template, correlation coefficients, best match.)
24 Optimum statistical classifiers
• A probabilistic approach to recognition.
• It is possible to derive an optimal
approach, in the sense that, on average, it
yields the lowest probability of committing
classification errors.
• The probability that a pattern x comes
from class ωj is denoted by p(ωj /x).
• If the classifier decides that x came from
ωj when it actually came from ωi it incurs a
loss denoted by Lij.
25 Bayes classifier (cont.)
• As pattern x may belong to any of W
classes, the average loss assigning x to
ωj is:
rj(x) = Σ_{k=1}^{W} Lkj p(ωk/x) = (1/p(x)) Σ_{k=1}^{W} Lkj p(x/ωk) P(ωk)
Because 1/p(x) is positive and common to
all rj(x), the expression reduces to:
rj(x) = Σ_{k=1}^{W} Lkj p(x/ωk) P(ωk)
26 Bayes classifier (cont.)
rj(x) = Σ_{k=1}^{W} Lkj p(x/ωk) P(ωk)
• p(x/ωj) is the pdf of patterns of class ωj
(class conditional density).
• P(ωj) is the probability of occurrence of
class ωj (a priori or prior probability).
• The classifier evaluates r1(x), r2(x),…,
rW(x), and assigns pattern x to the class
with the smallest average loss.
27 Bayes classifier (cont.)
• The classifier that minimizes the total
average loss is called the Bayes classifier.
• It assigns an unknown pattern x to class
ωi if:
ri(x) < rj(x),   for j = 1, 2,…, W; j ≠ i
or
Σ_{k=1}^{W} Lki p(x/ωk) P(ωk) < Σ_{q=1}^{W} Lqj p(x/ωq) P(ωq),   for all j, j ≠ i
28 Bayes classifier (cont.)
• The loss for a wrong decision is generally
assigned a nonzero value (e.g. 1).
• The loss for a correct decision is 0:
Lij = 1 − δij
where δij = 1 if i = j and 0 otherwise.
Therefore,
rj(x) = Σ_{k=1}^{W} Lkj p(x/ωk) P(ωk) = Σ_{k=1}^{W} (1 − δkj) p(x/ωk) P(ωk)
      = p(x) − p(x/ωj) P(ωj)
29 Bayes classifier (cont.)
• The Bayes classifier assigns pattern x to class
ωi if:
p(x) − p(x/ωi) P(ωi) < p(x) − p(x/ωj) P(ωj)
or
p(x/ωi) P(ωi) > p(x/ωj) P(ωj)
which is the computation of decision functions:
dj(x) = p(x/ωj) P(ωj),   j = 1, 2,…, W
It assigns pattern x to the class whose decision
function yields the largest numerical value.
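A minimal sketch of this rule for two hypothetical 1-D Gaussian classes (the means, variance, and priors below are invented for illustration):

```python
import math

def gaussian_pdf(m, s):
    """Return p(x/w) for a 1-D Gaussian class with mean m, std s."""
    return lambda x: math.exp(-(x - m) ** 2 / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

def bayes_classify(x, densities, priors):
    """Assign x to the class maximizing d_j(x) = p(x/w_j) P(w_j)."""
    scores = [p(x) * P for p, P in zip(densities, priors)]
    return scores.index(max(scores))

# w1 ~ N(0, 1), w2 ~ N(3, 1), equal priors: boundary at x = 1.5
densities = [gaussian_pdf(0.0, 1.0), gaussian_pdf(3.0, 1.0)]
print(bayes_classify(0.5, densities, [0.5, 0.5]))  # 0
print(bayes_classify(2.6, densities, [0.5, 0.5]))  # 1
```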
30 Bayes classifier (cont.)
• The probability of occurrence of each class P(ωj)
must be known.
– Generally, we consider them equal, P(ωj)=1/W.
• The probability densities of the patterns in each
class p(x/ωj) must be known.
– More difficult problem (especially for multidimensional
variables) which requires methods from pdf estimation.
– Generally, we assume:
• Analytic expressions for the pdf.
• The pdf parameters may be estimated from sample patterns.
• The Gaussian is the most common pdf.
31 Bayes classifier for Gaussian pattern classes
• We first consider the 1-D case for W=2 classes.
dj(x) = p(x/ωj) P(ωj) = (1/(√(2π) σj)) e^(−(x − mj)²/(2σj²)) P(ωj),   j = 1, 2,…, W
• For P(ωj)=1/2:
32 Bayes classifier for Gaussian pattern classes (cont.)
• In the n-D case:
p(x/ωj) = (1/((2π)^(n/2) |Cj|^(1/2))) e^(−(1/2)(x − mj)^T Cj⁻¹ (x − mj))
• Each density is specified by its mean vector and
its covariance matrix:
mj = Ej[x]
Cj = Ej[(x − mj)(x − mj)^T]
33 Bayes classifier for Gaussian pattern classes (cont.)
• Approximation of the mean vector and covariance
matrix from samples from the classes:
mj = (1/Nj) Σ_{x∈ωj} x
Cj = (1/Nj) Σ_{x∈ωj} (x − mj)(x − mj)^T
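These sample estimates can be sketched as (NumPy; note the 1/Nj factor from the slide rather than the unbiased 1/(Nj − 1)):

```python
import numpy as np

def estimate_gaussian(samples):
    """Sample mean vector and covariance matrix of one class:
    m = (1/N) sum x,  C = (1/N) sum (x - m)(x - m)^T."""
    X = np.asarray(samples, dtype=float)  # shape (N, n)
    m = X.mean(axis=0)
    D = X - m
    C = (D.T @ D) / X.shape[0]
    return m, C

m, C = estimate_gaussian([[0, 0], [2, 0], [0, 2], [2, 2]])
print(m)  # [1. 1.]
print(C)  # identity: the two coordinates are uncorrelated here
```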
34 Bayes classifier for Gaussian pattern classes (cont.)
• It is more convenient to work with the natural
logarithm of the decision function as it is
monotonically increasing and it does not change
the order of the decision functions:
dj(x) = ln[p(x/ωj) P(ωj)] = ln p(x/ωj) + ln P(ωj)
      = ln P(ωj) − (1/2) ln|Cj| − (1/2)(x − mj)^T Cj⁻¹ (x − mj)
(the constant term −(n/2) ln(2π), common to all classes, has been dropped)
• The decision functions are hyperquadrics.
35 Bayes classifier for Gaussian pattern classes (cont.)
• If all the classes have the same covariance Cj = C,
j = 1, 2,…, W, the decision functions are linear
(hyperplanes):
dj(x) = ln P(ωj) + x^T C⁻¹ mj − (1/2) mj^T C⁻¹ mj
• Moreover, if P(ωj)=1/W and Cj=I:
dj(x) = x^T mj − (1/2) mj^T mj
which is the minimum distance classifier
decision function.
36 Bayes classifier for Gaussian pattern classes (cont.)
• The minimum distance classifier is optimum in the
Bayes sense if:
− The pattern classes are Gaussian.
− All classes are equally likely to occur.
− All covariance matrices are equal to (the same multiple
of) the identity matrix.
• Gaussian pattern classes satisfying these
conditions are spherical clouds (hyperspheres).
• The classifier establishes a hyperplane between
every pair of classes.
− It is the perpendicular bisector of the line segment
joining the centers of the classes
37 Application to remotely sensed images
• 4-D vectors.
• Three classes
− Water
− Urban development
− Vegetation
• Mean vectors and
covariance matrices
learnt from samples
whose class is known.
− Here, we will use
samples from the image
to learn the pdf
parameters
38 Application to remotely sensed images (cont.)
39 Application to remotely sensed images (cont.)
40 Structural methods
• Matching shape numbers
• String matching
41 Matching shape numbers
• The degree of similarity, k, between two
shapes is defined as the largest order for
which their shape numbers still coincide.
− Reminder: The shape number of a boundary is
the first difference of smallest magnitude of its
chain code (invariance to rotation).
− The order n of a shape number is defined as
the number of digits in its representation.
42 Reminder: shape numbers
• Examples. All closed shapes of order n=4, 6 and 8.
• First differences are computed by treating the chain
as a circular sequence.
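A sketch of this computation (Python; assuming the 4-directional convention 0=right, 1=up, 2=left, 3=down, with first differences counted as counter-clockwise turns mod 4):

```python
def shape_number(chain):
    """Shape number of a closed 4-directional chain code: the circular
    first difference, rotated to the starting point that gives the
    sequence of smallest magnitude."""
    n = len(chain)
    # circular first difference (direction changes mod 4)
    diff = [(chain[i] - chain[i - 1]) % 4 for i in range(n)]
    # normalize the starting point: lexicographic min over all rotations
    return min(diff[i:] + diff[:i] for i in range(n))

# Unit square traversed counter-clockwise: same shape number
# regardless of where the chain starts
print(shape_number([0, 1, 2, 3]))  # [1, 1, 1, 1]
print(shape_number([1, 2, 3, 0]))  # [1, 1, 1, 1]
```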
43 Matching shape numbers (cont.)
• Let a and b denote two closed shapes which are
represented by 4-directional chain codes and s(a)
and s(b) their shape numbers.
• The shapes have a degree of similarity, k, if:
sj(a) = sj(b) for j = 4, 6, 8,…, k
sj(a) ≠ sj(b) for j = k + 2, k + 4,…
• This means that the first k digits should be equal.
• The subscript indicates the order. For 4-directional chain
codes, the minimum order for a closed boundary is 4.
44 Matching shape numbers (cont.)
• Alternatively, the distance between two shapes a
and b is defined as the inverse of their degree of
similarity:
D(a, b) = 1/k
• It satisfies the properties:
D(a, b) ≥ 0
D(a, b) = 0, iff a = b
D(a, c) ≤ max[D(a, b), D(b, c)]
45 String matching
• Region boundaries a and b are coded into strings
denoted a1a2a3 …an and b1b2b3 …bm.
• Let p represent the number of matches between
the two strings.
− A match at the k-th position occurs if ak=bk.
• The number of symbols that do not match is:
q = max(|a|, |b|) − p
• A simple measure of similarity is:
R = (number of matches)/(number of mismatches) = p/q = p/(max(|a|, |b|) − p)
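A sketch of this measure (Python; strings compared position by position):

```python
def similarity_R(a, b):
    """R = p / q with p = number of position-wise matches and
    q = max(|a|, |b|) - p; R is infinite for a perfect match."""
    p = sum(1 for x, y in zip(a, b) if x == y)
    q = max(len(a), len(b)) - p
    return float('inf') if q == 0 else p / q

print(similarity_R("abcde", "abxde"))  # 4.0  (p = 4, q = 1)
print(similarity_R("abab", "abab"))    # inf  (identical strings)
```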