Lecture1 Merged
1. Introduction
Remote sensing is the art and science of obtaining information about an object or feature
without physically coming in contact with that object or feature. Humans apply remote
sensing in their day-to-day lives through vision, hearing and the sense of smell. The data
collected can be of many forms: variations in acoustic wave distributions (e.g., sonar),
variations in force distributions (e.g., gravity meter), variations in electromagnetic energy
distributions (e.g., eye) etc. These remotely collected data through various sensors may be
analyzed to obtain information about the objects or features under investigation. In this
course we will deal with remote sensing through electromagnetic energy sensors only.
Thus, remote sensing is the process of inferring surface parameters from measurements of the
electromagnetic radiation (EMR) from the Earth’s surface. This EMR can either be reflected
or emitted from the Earth’s surface. In other words, remote sensing is detecting and
measuring electromagnetic (EM) energy emanating or reflected from distant objects made of
various materials, so that we can identify and categorize these objects by class or type,
substance and spatial distribution [American Society of Photogrammetry, 1975].
Remote sensing provides a means of observing large areas at finer spatial and temporal
frequencies. It finds extensive applications in civil engineering including watershed studies,
hydrological states and fluxes simulation, hydrological modeling, disaster management
services such as flood and drought warning and monitoring, damage assessment in case of
natural calamities, environmental monitoring, urban planning etc.
2. Electromagnetic Energy
Electromagnetic energy (E) can be expressed either in terms of frequency (f) or wavelength
(λ) of the radiation as

E = hf = hc/λ (1)

where h is Planck's constant (6.626 × 10⁻³⁴ J s), c is the speed of light (3 × 10⁸ m/s), f is the
frequency expressed in Hertz and λ is the wavelength expressed in micrometers (1 μm = 10⁻⁶ m).
As can be observed from equation (1), shorter wavelengths have higher energy content and
longer wavelengths have lower energy content.
Figure: The electromagnetic spectrum, spanning gamma rays, X rays, ultraviolet rays, visible
light, near infrared, thermal infrared, microwave and radio waves, arranged by wavelength
from 10⁻⁶ μm to 10⁹ μm.
All matter reflects, emits or radiates a range of electromagnetic energy, depending upon the
material characteristics. In remote sensing, the measurement of the electromagnetic radiation
reflected or emitted from an object is used to identify the target and to infer its properties.
Different objects reflect or emit different amounts of energy in different bands of the
electromagnetic spectrum. The amount of energy reflected or emitted depends on the
properties of both the material and the incident energy (angle of incidence, intensity and
wavelength). Detection and discrimination of objects or surface features is done through the
uniqueness of the reflected or emitted electromagnetic radiation from the object.
A device to detect this reflected or emitted electromagnetic radiation from an object is called
a “sensor” (e.g., cameras and scanners). A vehicle used to carry the sensor is called a
“platform” (e.g., aircraft and satellites).
In the case of passive remote sensing, the source of energy is one that is naturally available,
such as the Sun. Most remote sensing systems work in passive mode using solar energy as the
source of EMR. Solar energy reflected by the targets at specific wavelength bands is recorded
using sensors onboard air-borne or space-borne platforms. In order to ensure ample signal
strength at the sensor, wavelength/energy bands capable of traversing the atmosphere without
significant loss through atmospheric interactions are generally used in remote sensing.
Any object at a temperature above 0 K emits some radiation, which is approximately
proportional to the fourth power of its absolute temperature. Thus the Earth also emits some
radiation, since its ambient temperature is about 300 K. Passive sensors can also be used to
measure the Earth’s radiance, but they are not very popular as the energy content is very low.
In the case of active remote sensing, energy is generated and sent from the remote sensing
platform towards the targets. The energy reflected back from the targets is recorded using
sensors onboard the remote sensing platform. Most microwave remote sensing is done
through active remote sensing.
As a simple analogy, passive remote sensing is similar to taking a picture with an ordinary
camera, whereas active remote sensing is analogous to taking a picture with a camera having a
built-in flash (Fig. 5).
Remote sensing platforms can be classified as follows, based on the elevation from the
Earth’s surface at which these platforms are placed:
Ground-borne platforms
Air-borne platforms
Space-borne platforms
From each of these platforms, remote sensing can be done either in passive or active mode.
In airborne remote sensing, downward or sideward looking sensors mounted on aircraft are
used to obtain images of the Earth's surface. Very high spatial resolution images (20 cm or
less) can be obtained this way. However, it is not suitable for mapping large areas. Smaller
coverage area and higher cost per unit area of ground coverage are the major disadvantages of
airborne remote sensing. While airborne remote sensing missions are mainly one-time
operations, space-borne missions offer continuous monitoring of the earth features.
LiDAR, analog aerial photography, videography, thermal imagery and digital photography
are commonly used in airborne remote sensing.
In space-borne remote sensing, sensors mounted on space shuttles or satellites orbiting the
Earth are used. There are several remote sensing satellites (Geostationary and Polar orbiting)
providing imagery for research and operational applications. While Geostationary or
Geosynchronous Satellites are used for communication and meteorological purposes, polar
orbiting or sun-synchronous satellites are essentially used for remote sensing. The main
advantages of space-borne remote sensing are large area coverage, less cost per unit area of
coverage, continuous or frequent coverage of an area of interest, automatic/ semiautomatic
computerized processing and analysis. However, when compared to aerial photography,
satellite imagery has a lower resolution.
Landsat satellites, Indian remote sensing (IRS) satellites, IKONOS, SPOT satellites, AQUA
and TERRA of NASA and INSAT satellite series are a few examples.
An ideal remote sensing system would consist of the following components:
i. A Uniform Energy Source which provides energy over all wavelengths, at a constant,
known, high level of output
ii. A Non-interfering Atmosphere which will not modify either the energy transmitted
from the source or emitted (or reflected) from the object in any manner.
iii. A Series of Unique Energy/Matter Interactions at the Earth's Surface which generate
reflected and/or emitted signals that are selective with respect to wavelength and also
unique to each object or earth surface feature type.
iv. A Super Sensor which is highly sensitive to all wavelengths, yet simple, reliable, accurate
and economical, and which requires no power or space. This sensor
yields data on the absolute brightness (or radiance) from a scene as a function of
wavelength.
v. A Real-Time Data Handling System which generates the instantaneous radiance versus
wavelength response and processes it into an interpretable format in real time. The data
derived are unique to a particular terrain and hence provide insight into its physical-
chemical-biological state.
vi. Multiple Data Users having knowledge in their respective disciplines as well as in remote
sensing data acquisition and analysis techniques. The information collected will be
available to them faster and at lower expense. This information will aid the users in
various decision-making processes and in implementing those decisions.
Real remote sensing systems employed in general operation and utility have many
shortcomings when compared with an ideal system explained above.
i. Energy Source: The energy sources for real systems are usually non-uniform over
various wavelengths and also vary with time and space. This has a major effect on the
passive remote sensing systems. The spectral distribution of reflected sunlight varies
both temporally and spatially. Earth surface materials also emit energy to varying
degrees of efficiency. A real remote sensing system needs calibration for source
characteristics.
ii. The Atmosphere: The atmosphere modifies the spectral distribution and strength of the
energy received or emitted (Fig. 8). The effect of atmospheric interaction varies with
the wavelength associated, the sensor used and the sensing application. Calibration is
required to eliminate or compensate for these atmospheric effects.
iii. The Energy/Matter Interactions at the Earth's Surface: Remote sensing is based on the
principle that each and every material reflects or emits energy in a unique, known way.
However, spectral signatures may be similar for different material types. This makes
differentiation difficult. Also, the knowledge of most of the energy/matter interactions
for earth surface features is either at elementary level or even completely unknown.
iv. The Sensor: Real sensors have fixed limits of spectral sensitivity i.e., they are not
sensitive to all wavelengths. Also, they have limited spatial resolution (efficiency in
recording spatial details). Selection of a sensor requires a trade-off between spatial
resolution and spectral sensitivity. For example, while photographic systems have very
good spatial resolution but poor spectral sensitivity, non-photographic systems have good
spectral sensitivity but poor spatial resolution.
v. The Data Handling System: Human intervention is necessary for processing sensor
data, even though machines are also involved in data handling. This makes the idea of
real time data handling almost impossible. The amount of data generated by the sensors
far exceeds the data handling capacity.
vi. The Multiple Data Users: The success of any remote sensing mission lies with the user
who ultimately transforms the data into information. This is possible only if the user
understands the problem thoroughly and has a wide knowledge of the data generation process.
The user should know how to interpret the data generated and should know how best to
use them.
EMR SPECTRUM
1. Introduction
In remote sensing, some parameters of the target are measured without being in contact with it.
To measure any parameter using remotely located sensors, some process that conveys that
parameter to the sensor is required. The best example is natural remote sensing, by which we
are able to see the objects around us and to identify their properties. We are able to see the
objects around us when solar light hits them, gets reflected and is captured by our eyes. We are
able to identify the properties of the objects when these signals captured by our eyes are
transferred to the brain and analysed. The whole process is analogous to man-made remote
sensing techniques.
In remote sensing techniques, electromagnetic radiations emitted / reflected by the targets are
recorded at remotely located sensors and these signals are analysed to interpret the target
characteristics. Characteristics of the signals recorded at the sensor depend on the
characteristics of the source of radiation / energy, characteristics of the target and the
atmospheric interactions.
This lecture gives details of the electromagnetic spectrum. Details of the energy sources and
the radiation principles are also covered in this lecture.
2. Electromagnetic energy
Electromagnetic (EM) energy includes all energy moving in a harmonic sinusoidal wave
pattern with a velocity equal to that of light. Harmonic pattern means waves occurring at
frequent intervals of time.
Electromagnetic energy has both electric and magnetic components which oscillate
perpendicular to each other and also perpendicular to the direction of energy propagation as
shown in Fig. 1.
All EM waves travel at the speed of light, c, which is approximately equal to 3×10⁸ m/s.
Wavelength λ of an EM wave is the distance from any point on one wave to the same position
on the next wave (e.g., the distance between two successive peaks). The wavelengths commonly
used in remote sensing are very small. Wavelength is normally expressed in micrometers (μm);
1 μm is equal to 1×10⁻⁶ m.
Frequency f is the number of waves passing a fixed point per unit time. It is expressed in
Hertz (Hz).
c=λf (1)
which implies that wavelength and frequency are inversely related since c is a constant.
Longer wavelengths have lower frequencies than shorter wavelengths.
Engineers use the frequency attribute to indicate the radio and radar regions. However, in remote
sensing EM waves are categorized in terms of their wavelength location in the EMR
spectrum.
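As a quick numerical illustration of Eq. (1), the short sketch below converts a few representative wavelengths to frequencies. The particular wavelength values chosen here are illustrative assumptions, not values taken from the text.

```python
# Sketch: converting wavelength to frequency using c = lambda * f (Eq. 1).
# The example wavelengths are illustrative choices, not values from the text.

C = 3.0e8  # speed of light, m/s

def wavelength_to_frequency(wavelength_um):
    """Return the frequency (Hz) corresponding to a wavelength given in micrometers."""
    wavelength_m = wavelength_um * 1e-6  # micrometers to meters
    return C / wavelength_m

for band, wl_um in [("blue", 0.45), ("red", 0.65), ("thermal IR", 10.0), ("microwave", 1.0e5)]:
    print(f"{band:12s} {wl_um:12.2f} um -> {wavelength_to_frequency(wl_um):.3e} Hz")
```

The output confirms the inverse relation: the blue wavelength corresponds to a frequency of roughly 6.7×10¹⁴ Hz, while the 10 cm microwave wavelength corresponds to only about 3×10⁹ Hz.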
Another important theory about the electromagnetic radiation is the particle theory, which
suggests that electromagnetic radiation is composed of discrete units called photons or
quanta.
Distribution of the continuum of radiant energy can be plotted as a function of wavelength (or
frequency) and is known as the electromagnetic radiation (EMR) spectrum. EMR spectrum is
divided into regions or intervals of different wavelengths and such regions are denoted by
different names. However, there is no strict dividing line between one spectral region and its
adjacent one. Different regions in EMR spectrum are indicated in Fig. 2.
Fig. 2. Regions of the electromagnetic spectrum, spanning gamma rays, X rays, ultraviolet
rays, visible light, near infrared, thermal infrared, microwave and radio waves, arranged by
wavelength from 10⁻⁶ μm to 10⁹ μm.
The visible region (human eye is sensitive to this region) occupies a very small region in the
range between 0.4 and 0.7 μm. The approximate range of color “blue” is 0.4 – 0.5 μm,
“green” is 0.5-0.6 μm and “red” is 0.6-0.7 μm. Ultraviolet (UV) region adjoins the blue end
of the visible region and infrared (IR) region adjoins the red end.
The infrared (IR) region, spanning between 0.7 and 100 μm, has four subintervals of special
interest for remote sensing:
Longer wavelength intervals beyond this region are expressed in units ranging from 0.1 to 100
cm. The microwave region spreads across 0.1 to 100 cm, which includes all the intervals used
by radar systems. The radar systems generate their own active radiation and direct it towards
the targets of interest. The details of various regions and the corresponding wavelengths are
given in Table 1.
Energy in the gamma ray, X-ray and most of the UV regions is absorbed by the Earth’s
atmosphere and hence is not used in remote sensing. Most of the remote sensing systems
operate in visible, infrared (IR) and microwave regions of the spectrum. Some systems use
the long wave portion of the UV spectrum also.
The primary source of energy that illuminates different features on the Earth's surface is the Sun.
Solar radiation (also called insolation) arrives at the Earth at wavelengths determined by the
temperature of the Sun's photosphere (about 5,600 °C).
Although the Sun produces electromagnetic radiation in a wide range of wavelengths, the
amount of energy it produces is not uniform across all wavelengths.
Fig.3. shows the solar irradiance (power of electromagnetic radiation per unit area incident on
a surface) distribution of the Sun. Almost 99% of the solar energy is within the wavelength
range of 0.28-4.96 μm. Within this range, 43% is radiated in the visible wavelength region
between 0.4-0.7 μm. The maximum energy (E) is available at a wavelength of 0.48 μm, which is
in the visible green region.
Using the particle theory, the energy of a quantum (Q) is considered to be proportional to the
frequency. The relationship can be represented as shown below.
Q=hf (2)
where h is Planck's constant (6.626 × 10⁻³⁴ J s) and f is the frequency.
Using the relationship between c, λ and f (Eq.1), the above equation can be written as follows
Q=hc/λ (3)
The energy per unit quantum is thus inversely proportional to the wavelength. Shorter
wavelengths are associated with higher energy compared to the longer wavelengths. For
example, longer wavelength electromagnetic radiations like microwave radiations are
associated with lower energy compared to the IR regions and are difficult to sense in remote
sensing. For operating with long wavelength radiations, the coverage area should be large
enough to obtain a detectable signal.
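A small sketch (with assumed, illustrative wavelengths, not values from the text) makes the wavelength-energy relation of Eqs. (2) and (3) concrete by comparing the energy of a single thermal IR photon with that of a microwave photon.

```python
# Sketch: energy per quantum Q = h*c/lambda (Eqs. 2-3).
# The two wavelengths below are illustrative assumptions.

H = 6.626e-34   # Planck's constant, J s
C = 3.0e8       # speed of light, m/s

def photon_energy(wavelength_m):
    """Energy of a single quantum (J) at the given wavelength (m)."""
    return H * C / wavelength_m

q_ir = photon_energy(10e-6)         # thermal IR, 10 micrometers
q_microwave = photon_energy(0.1)    # microwave, 10 cm

print(f"Thermal IR photon (10 um):  {q_ir:.3e} J")
print(f"Microwave photon (10 cm):   {q_microwave:.3e} J")
print(f"Ratio IR / microwave:       {q_ir / q_microwave:.0f}")   # ~10,000
```

The microwave quantum carries roughly 10,000 times less energy than the thermal IR quantum, which is why large coverage areas (or active illumination) are needed when working at long wavelengths.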
Other than the solar radiation, the Earth and terrestrial objects are also sources of
electromagnetic radiation. All matter at a temperature above absolute zero (0 K or -273 °C)
emits electromagnetic radiation continuously. The amount of radiation from such objects is a
function of the temperature of the object as shown below.
M = σT⁴ (4)
This is known as the Stefan-Boltzmann law. M is the total radiant exitance from the source
(W/m²), σ is the Stefan-Boltzmann constant (5.6697 × 10⁻⁸ W m⁻² K⁻⁴) and T is the
absolute temperature of the emitting material in Kelvin.
Since the Earth’s ambient temperature is about 300 K, it emits electromagnetic radiation,
which peaks at a wavelength of about 9.7 μm, as shown in Fig. 3. This is considered
thermal IR radiation. This thermal IR emission from the Earth can be sensed using
scanners and radiometers.
According to the Stefan-Boltzmann law, the radiant exitance increases rapidly with the
temperature. However, this law is applicable for objects that behave as a blackbody.
A blackbody is a hypothetical, ideal radiator. It absorbs and reemits the entire energy incident
upon it.
Total energy emitted by a black body varies with temperature as given in Eq. 4. The total
energy is distributed over different wavelengths, which is called the spectral distribution or
spectral curve here. Area under the spectral curve gives the total radiant exitance M.
In addition to the total energy, the spectral distribution also varies with the temperature. Fig.
4 shows the spectral distribution of the energy radiated from black bodies at different
temperatures. The figure represents the Stefan-Boltzmann law graphically. As the
temperature increases, the area under the curve, and hence the total radiant exitance, increases.
From Fig. 4, it can be observed that the wavelength at which the radiant exitance peaks varies
with temperature. As the temperature increases, the peak shifts towards shorter wavelengths.
This is explained by Wien's displacement law, which states that the dominant wavelength λm at
which a black body radiates is inversely proportional to the absolute temperature of the black
body (in K), as given below.
λm = A / T (5)
where A is a constant, which is equal to 2898 μm K. The Sun’s temperature is around 6000 K,
and from the figure it can be observed that the visible part of the electromagnetic spectrum (0.4-
0.7 μm) dominates the radiant exitance from the Sun.
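Equations (4) and (5) can be checked together with a short sketch. The temperatures used below (6000 K for the Sun and 300 K for the Earth) are the approximate values quoted in this lecture, and both bodies are treated as blackbodies.

```python
# Sketch: Stefan-Boltzmann law (Eq. 4) and Wien's displacement law (Eq. 5)
# for blackbodies at roughly the Sun's and the Earth's temperatures.

SIGMA = 5.6697e-8   # Stefan-Boltzmann constant, W m^-2 K^-4
A = 2898.0          # Wien's constant, micrometer K

def radiant_exitance(T):
    """Total radiant exitance M (W/m^2) of a blackbody at temperature T (K)."""
    return SIGMA * T ** 4

def peak_wavelength(T):
    """Dominant (peak) emission wavelength (micrometers) at temperature T (K)."""
    return A / T

for name, T in [("Sun", 6000.0), ("Earth", 300.0)]:
    print(f"{name:5s} T={T:6.0f} K  M={radiant_exitance(T):.3e} W/m^2  "
          f"peak={peak_wavelength(T):.2f} um")
```

The computed peak wavelengths, about 0.48 μm for the Sun and 9.66 μm for the Earth, agree with the 0.48 μm and ~9.7 μm values quoted earlier in this lecture.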
As solar energy travels through atmosphere to reach the Earth, the atmosphere absorbs or
backscatters a fraction of it and transmits only the remainder. Wavelength regions, through
which most of the energy is transmitted through atmosphere are referred as atmospheric
windows. In Fig. 5 (Short, 1999), EMR spectrum is shown identifying different regions with
specific names starting from visible region to microwave regions. In the microwave region,
different radar bands are also shown, such as K, X, C, L and P.
(Source: http://www.geog.ucsb.edu/~jeff/115a/remote_sensing/thermal/thermalirinfo.html)
In Fig. 5, blue (or shaded) zones mark minimal passage of incoming and/or outgoing
radiation, whereas white areas denote atmospheric windows. Various constituents of the
atmosphere absorb electromagnetic energy in specific wavelength regions, producing these
absorption zones.
Most remote sensing instruments on air or space platforms operate in one or more of these
windows by making their measurements with detectors tuned to specific wavelengths that
pass through the atmosphere.
1. INTRODUCTION
In many respects, remote sensing can be thought of as a reading process. Using various
sensors, we remotely collect data that are analysed to obtain information about the objects,
areas or phenomena being investigated. In most cases the sensors are electromagnetic sensors,
either air-borne or space-borne, used for inventorying. The sensors record the energy reflected or
emitted by the target features. In remote sensing, all radiation traverses the atmosphere for some
distance before reaching the sensor. As the radiation passes through the atmosphere, the gases
and particles in the atmosphere interact with it, causing changes in its magnitude, wavelength,
velocity, direction and polarization.
In order to understand the interactions of the electromagnetic radiations with the atmospheric
particles, basic knowledge about the composition of the atmosphere is essential.
The atmosphere is the gaseous envelope that surrounds the Earth’s surface. Much of the gas is
concentrated within the lower 100 km of the atmosphere; only 3×10⁻⁵ percent of the gases are
found above 100 km (Gibson, 2000).
Oxygen and nitrogen are present in the ratio 1:4, and together they make up about 99 percent of
the total gaseous composition of the atmosphere. Ozone is present in very small quantities and is
mostly concentrated in the atmosphere between 19 and 23 km.
In addition to the above gases, the atmosphere also contains water vapor, methane, dust
particles, pollen from vegetation, smoke particles etc. Dust particles and pollen from
vegetation together form about 50 percent of the total particles present in the atmosphere.
Size of these particles in the atmosphere varies from approximately 0.01μm to 100μm.
The gases and the particles present in the atmosphere cause scattering and absorption of the
electromagnetic radiation passing through it.
3. Energy Interactions
The radiation from the energy source passes through some distance of atmosphere before
being detected by the remote sensor as shown in Fig. 1.
The distance travelled by the radiation through the atmosphere is called the path length. The
path length varies depending on the remote sensing techniques and sources.
For example, the path length is twice the thickness of the Earth’s atmosphere in the case of
space photography, which uses sunlight as its source. For airborne thermal sensors, which use
energy emitted from objects on the Earth, the path length is only the one-way distance from the
Earth’s surface to the sensor, and is considerably smaller.
The effect of atmosphere on the radiation depends on the properties of the radiation such as
magnitude and wavelength, atmospheric conditions and also the path length. Intensity and
spectral composition of the incident radiation are altered by the atmospheric effects. The
interaction of the electromagnetic radiation with the atmospheric particles may be a surface
phenomenon (e.g., scattering) or volume phenomenon (e.g., absorption). Scattering and
absorption are the main processes that alter the properties of the electromagnetic radiation in
the atmosphere.
4. Scattering
Atmospheric scattering is the process by which small particles in the atmosphere diffuse a
portion of the incident radiation in all directions. There is no energy transformation during
scattering, but the spatial distribution of the energy is altered.
Depending on the size of the atmospheric particles relative to the wavelength of the radiation,
three types of scattering occur:
Rayleigh scattering
Mie scattering
Non-selective scattering
Rayleigh scattering mainly consists of scattering caused by atmospheric molecules and other
tiny particles. This occurs when the particles causing the scattering are much smaller in
diameter (less than one tenth) than the wavelengths of radiation interacting with them.
Smaller particles present in the atmosphere scatter the shorter wavelengths more compared to
the longer wavelengths.
The scattering effect or the intensity of the scattered light is inversely proportional to the
fourth power of wavelength for Rayleigh scattering. Hence, the shorter wavelengths are
scattered more than longer wavelengths.
Molecules of Oxygen and Nitrogen (which are dominant in the atmosphere) cause this type of
scattering of the visible part of the electromagnetic radiation. Within the visible range,
shorter wavelength blue light is scattered more than green or red light; blue light is scattered
around 4 times, and UV light about 16 times, as much as red light. A "blue" sky is thus a
manifestation of Rayleigh scatter.
However, at sunrise and sunset, the sun's rays have to travel a longer path, causing complete
scattering (and absorption) of shorter wavelength radiations. As a result, only the longer
wavelength portions (orange and red) which are less scattered will be visible.
The haze in imagery and the bluish-grey cast in a color image when taken from high altitude
are mainly due to Rayleigh scatter.
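The λ⁻⁴ dependence can be verified numerically. In the sketch below the wavelengths (UV 0.32 μm, blue 0.45 μm, red 0.64 μm) are illustrative assumptions chosen to roughly reproduce the ratios quoted above; the text itself does not specify exact wavelengths.

```python
# Sketch: relative Rayleigh scattering intensity, proportional to 1/lambda^4.
# Wavelengths are illustrative assumptions, not values given in the text.

def rayleigh_relative(wavelength_um, reference_um):
    """Scattering intensity relative to a reference wavelength (both in micrometers)."""
    return (reference_um / wavelength_um) ** 4

RED = 0.64
for name, wl in [("blue", 0.45), ("UV", 0.32)]:
    ratio = rayleigh_relative(wl, RED)
    print(f"{name:4s} ({wl} um) is scattered about {ratio:.1f}x as much as red ({RED} um)")
```

With these values the blue/red ratio comes out near 4 and the UV/red ratio at 16, consistent with the figures quoted above.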
Another type of scattering is Mie scattering, which occurs when the wavelength of the
energy is almost equal to the diameter of the atmospheric particles. In this type of scattering,
longer wavelengths also get scattered, compared to Rayleigh scatter (Fig. 4).
In Mie scattering, intensity of the scattered light varies approximately as the inverse of the
wavelength.
Mie scattering is usually caused by the aerosol particles such as dust, smoke and pollen. Gas
molecules in the atmosphere are too small to cause Mie scattering of the radiation commonly
used for remote sensing.
A third type of scattering is nonselective scatter, which occurs when the diameters of the
atmospheric particles are much larger (approximately 10 times) than the wavelengths being
sensed. Particles such as pollen, cloud droplets, ice crystals and raindrops can cause non-
selective scattering of the visible light.
For visible light (wavelengths 0.4-0.7 μm), non-selective scattering is generally caused by
water droplets, which commonly have diameters in the range of 5 to 100 μm. This scattering is
non-selective with respect to wavelength, since all visible and IR wavelengths are scattered
about equally, giving a white or even grey colour to the clouds.
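The particle-size thresholds mentioned in this section can be collected into a rough classifier; the sketch below uses the approximate cut-offs stated in the text (particles below about one tenth of the wavelength for Rayleigh scattering, comparable to the wavelength for Mie scattering, and roughly ten times larger for non-selective scattering). The example particle sizes are assumptions.

```python
# Sketch: rough classification of the scattering regime from the ratio of
# particle diameter to wavelength, using the approximate thresholds in the text.

def scattering_regime(particle_diameter_um, wavelength_um):
    ratio = particle_diameter_um / wavelength_um
    if ratio < 0.1:
        return "Rayleigh scattering"
    elif ratio <= 10.0:
        return "Mie scattering"
    return "Non-selective scattering"

# Examples for visible light at 0.5 um (particle sizes are illustrative):
print(scattering_regime(0.001, 0.5))   # gas molecule    -> Rayleigh scattering
print(scattering_regime(0.5, 0.5))     # smoke / aerosol -> Mie scattering
print(scattering_regime(50.0, 0.5))    # cloud droplet   -> Non-selective scattering
```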
5. Absorption
Absorption is the process in which incident energy is retained by particles in the atmosphere
at a given wavelength. Unlike scattering, atmospheric absorption causes an effective loss of
energy to atmospheric constituents.
The absorbing medium will not only absorb a portion of the total energy, but will also reflect,
refract or scatter it. The absorbed energy may later be re-emitted back to the
atmosphere.
The most efficient absorbers of solar radiation are water vapour, carbon dioxide, and ozone.
Gaseous components of the atmosphere are selective absorbers of the electromagnetic
radiation, i.e., these gases absorb electromagnetic energy in specific wavelength bands.
Arrangement of the gaseous molecules and their energy levels determine the wavelengths that
are absorbed.
Since the atmosphere contains many different gases and particles, it absorbs and transmits
many different wavelengths of electromagnetic radiation. Even though all the wavelengths
from the Sun reach the top of the atmosphere, due to the atmospheric absorption, only limited
wavelengths can pass through the atmosphere. The ranges of wavelength that are partially or
wholly transmitted through the atmosphere are known as "atmospheric windows." Remote
sensing data acquisition is limited to these atmospheric windows. The atmospheric
windows and the absorption characteristics are shown in Fig. 5.
Fig. 5. (a) Spectral characteristics of main energy sources (b) Atmospheric windows and (c)
Common remote sensing systems at different wavelengths (Source: Lillesand et al., 2004)
Infrared (IR) radiation is mainly absorbed due to the rotational and vibrational transitions of
the molecules. The main atmospheric constituents responsible for infrared absorption are
water vapour (H2O) and carbon dioxide (CO2) molecules. Most of the radiation in the far
infrared region is also absorbed by the atmosphere. However, absorption is almost nil in the
microwave region.
The most common sources of energy are the incident solar energy and the radiation from the
Earth. The wavelength at which the Sun’s energy reaches its maximum coincides with the
visible band range. The energy radiated from the Earth is sensed through the windows at 3 to
5μm and 8 to 14μm using devices like thermal scanners.
Radar and passive microwave systems operate through a window in the 1 mm to 1 m region.
Major atmospheric windows used for remote sensing are given in Table 2.
Table 2. Major atmospheric windows used in remote sensing and their characteristics
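As a simple illustration of how such window ranges might be used in practice, the sketch below checks whether a given wavelength falls inside a small set of windows. The window boundaries listed here are assumptions assembled from values quoted in this lecture (visible 0.4-0.7 μm, near IR 0.7-1.3 μm, thermal windows at 3-5 μm and 8-14 μm, microwave 1 mm-1 m); they are not the full contents of Table 2.

```python
# Sketch: locating a wavelength (micrometers) within a few atmospheric windows.
# The window boundaries are assumptions assembled from this lecture, not Table 2.

WINDOWS = [
    ("visible", 0.4, 0.7),
    ("near infrared", 0.7, 1.3),
    ("thermal IR (3-5 um)", 3.0, 5.0),
    ("thermal IR (8-14 um)", 8.0, 14.0),
    ("microwave (1 mm - 1 m)", 1.0e3, 1.0e6),
]

def find_window(wavelength_um):
    for name, lower, upper in WINDOWS:
        if lower <= wavelength_um <= upper:
            return name
    return "not in a listed window (absorbed or scattered by the atmosphere)"

for wl in [0.55, 4.0, 10.0, 1.0e5]:
    print(f"{wl:10.2f} um -> {find_window(wl)}")
```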
Energy incident on the Earth’s surface is absorbed, transmitted or reflected depending on the
wavelength and characteristics of the surface features (such as barren soil, vegetation, water
body). Interaction of the electromagnetic radiation with the surface features is dependent on
the characteristics of the incident radiation and the feature characteristics. After interaction
with the surface features, the energy that is reflected or re-emitted from the features is recorded
at the sensors and analysed to identify the target features, interpret the distance of the object,
and/or infer its characteristics.
This lecture explains the interaction of the electromagnetic energy with the Earth’s surface
features.
2. Energy Interactions
The incident electromagnetic energy may interact with the earth surface features in three
possible ways: Reflection, Absorption and Transmission. These three interactions are
illustrated in Fig. 1.
Fig. 1. Interaction of incident radiation with the Earth's surface: reflection, absorption and
transmission.
Reflection occurs when radiation is redirected after hitting the target. According to the law of
reflection, the angle of incidence is equal to the angle of reflection (Fig. 2).
Absorption occurs when radiation is absorbed by the target. The portion of the EM energy
which is absorbed by the Earth’s surface becomes available for emission as thermal radiation at
longer wavelengths (Fig. 3).
Transmission occurs when radiation is allowed to pass through the target. Depending upon
the characteristics of the medium, the velocity and wavelength of the radiation change during
transmission, whereas the frequency remains the same. The transmitted energy may further
get scattered and/or absorbed in the medium.
These three processes are not mutually exclusive. Energy incident on a surface may be
partially reflected, absorbed or transmitted. Which process dominates at a surface depends on
factors such as the wavelength of the radiation, the angle of incidence, and the composition and
physical condition of the surface material.
The relationship between reflection, absorption and transmission can be expressed through
the principle of conservation of energy. Let EI denote the incident energy, ER the reflected
energy, EA the absorbed energy and ET the transmitted energy. Then the principle of
conservation of energy (as a function of wavelength λ) can be expressed as
EI(λ) = ER(λ) + EA(λ) + ET(λ) (1)
Since most remote sensing systems use reflected energy, the energy balance relationship can
be better expressed in the form
ER (λ) = EI (λ) - EA(λ) - ET (λ) (2)
The reflected energy is equal to the total energy incident on any given feature reduced by the
energy absorbed or transmitted by that feature.
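A minimal numerical sketch of Eqs. (1) and (2): given the incident energy and two of the three components at a wavelength, the third follows directly. The values used are purely illustrative.

```python
# Sketch: energy balance at the surface, E_I = E_R + E_A + E_T (Eq. 1),
# rearranged as E_R = E_I - E_A - E_T (Eq. 2). Values are purely illustrative.

def reflected_energy(incident, absorbed, transmitted):
    """Reflected energy at a given wavelength, from Eq. (2)."""
    return incident - absorbed - transmitted

E_I = 100.0   # incident energy (arbitrary units)
E_A = 55.0    # absorbed component
E_T = 20.0    # transmitted component

E_R = reflected_energy(E_I, E_A, E_T)
print(f"Reflected energy:     {E_R}")            # 25.0
print(f"Reflectance fraction: {E_R / E_I:.2f}")  # 0.25
```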
3. Reflection
Reflection is the process in which the incident energy is redirected in such a way that the
angle of incidence is equal to the angle of reflection. The reflected radiation leaves the
surface at the same angle as it approached.
Scattering is a special type of reflection wherein the incident energy is diffused in many
directions and is sometimes called diffuse reflection.
When electromagnetic energy is incident on the surface, it may get reflected or scattered
depending upon the roughness of the surface relative to the wavelength of the incident
energy. If the roughness of the surface is less than the wavelength of the radiation or the ratio
of roughness to wavelength is less than 1, the radiation is reflected. When the ratio is more
than 1 or if the roughness is more than the wavelength, the radiation is scattered.
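The roughness criterion described above can be written directly as a comparison of the roughness-to-wavelength ratio; the surfaces and numbers in the sketch below are illustrative assumptions in the spirit of the fine-sand example given later in this lecture.

```python
# Sketch: specular vs. diffuse behaviour from the roughness-to-wavelength ratio
# described in the text (ratio < 1: specular reflection; ratio > 1: scattering).

def reflection_type(surface_roughness_um, wavelength_um):
    ratio = surface_roughness_um / wavelength_um
    if ratio < 1.0:
        return "specular (mirror-like) reflection"
    return "diffuse reflection (scattering)"

# Fine sand (~100 um roughness) under visible light and under 5 cm microwaves:
print(reflection_type(100.0, 0.5))     # rough at visible wavelengths -> diffuse
print(reflection_type(100.0, 5.0e4))   # smooth at 5 cm microwaves   -> specular
```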
The fraction of energy that is reflected or scattered is unique for each material. This aids in
distinguishing different features in an image.
Variations in the spectral reflectance within the visible spectrum give the colour effect to the
features. For example, blue colour is the result of more reflection of blue light. An object
appears as “green” when it reflects highly in the green portion of the visible spectrum. Leaves
appear green since their chlorophyll pigment absorbs radiation in the red and blue wavelengths
but reflects green wavelengths. Similarly, water looks blue-green, blue or green when viewed
in the visible band because it reflects the shorter wavelengths and absorbs the longer
wavelengths of the visible band. Water also absorbs the near infrared wavelengths and hence
appears darker when viewed in the red or near infrared wavelengths. The human eye uses
reflected energy variations in the visible spectrum to discriminate between various features.
For example, Fig.5 shows a part of the Krishna River Basin as seen in different bands of the
Landsat ETM+ imagery. As the concepts of false color composite (FCC) have been covered
in module 4, readers are advised to refer to the material in module 4 for better understanding
of the color composite imageries as shown in Fig. 5. Reflectance of surface features such as
water, vegetation and fallow lands are different in different wavelength bands. A combination
of more than one spectral band helps to attain better differentiation of these features.
Fig. 5 A part of the Krishna River Basin as seen in different bands of the Landsat ETM+
images
Energy reflection from a surface depends on the wavelength of the radiation, angle of
incidence and the composition and physical properties of the surface.
Roughness of the target surface controls how the energy is reflected by the surface. Based on
the roughness of the surface, reflection occurs mainly in two ways.
i. Specular reflection: It occurs when the surface is smooth and flat. A mirror-like or
smooth reflection is obtained, in which all or nearly all of the incident energy is
reflected in one direction. The angle of reflection is equal to the angle of incidence.
Reflection from the surface is maximum along the angle of reflection, whereas in any
other direction it is negligible.
ii. Diffuse (Lambertian) reflection: It occurs when the surface is rough. The energy is
reflected uniformly in all directions. Since all the wavelengths are reflected uniformly in
all directions, diffuse reflection contains spectral information on the "colour" of the
reflecting surface. Hence, in remote sensing diffuse reflectance properties of terrain
features are measured. Since the reflection is uniform in all directions, sensors located in
any direction record the same reflectance, and hence it is easy to differentiate the features.
Based on the nature of reflection, surface features can be classified as specular reflectors or
Lambertian reflectors (Fig. 6).
An ideal specular reflector completely reflects the incident energy, with the angle of reflection
equal to the angle of incidence.
An ideal Lambertian or diffuse reflector scatters all the incident energy equally in all the
directions.
The specular or diffusive characteristic of any surface is determined by the roughness of the
surface in comparison to the wavelength of the incoming radiation. If the wavelengths of the
incident energy are much smaller than the surface variations or the particle sizes, diffuse
reflection will dominate. For example, in the relatively long wavelength radio range, rocky
terrain may appear smooth to incident energy. In the visible portion of the spectrum, even a
material such as fine sand appears rough while it appears fairly smooth to long wavelength
microwaves.
Most surface features of the earth are neither perfectly specular nor perfectly diffuse
reflectors. In near specular reflection, though the reflection is the maximum along the angle
of reflection, a fraction of the energy also gets reflected in some other angles as well. In near
Lambertian reflector, the reflection is not perfectly uniform in all the directions. The
characteristics of different types of reflectors are shown in Fig. 6.
Lambertian reflectors are considered ideal for remote sensing. The reflection from an ideal
Lambertian surface will be the same irrespective of the location of the sensor. On the other
hand, in case of an ideal specular reflector, maximum brightness will be obtained only at one
location and for the other locations dark tones will be obtained from the same target. This
variation in the spectral signature for the same feature affects the interpretation of the remote
sensing data.
Most natural surfaces observed using remote sensing are approximately Lambertian at visible
and IR wavelengths. However, water provides specular reflection. Water generally gives a
dark tone in the image, but due to the specular reflection it gives a pale tone when the
sensor is located in the direction of the reflected energy.
The reflectance characteristics of earth surface features are expressed as the ratio of energy
reflected by the surface to the energy incident on the surface. This is measured as a function
of wavelength and is called spectral reflectance, Rλ. It is also known as albedo of the surface.
Spectral reflectance or albedo can be mathematically defined as
Rλ = [ER(λ) / EI(λ)] × 100
= [Energy of wavelength λ reflected from the object / Energy of wavelength λ incident on the
object] × 100 (3)
Typical albedo values (in percent) of some common surface features are given below:
Grass: 25
Concrete: 20
Water: 5-70
Fresh snow: 80
Forest: 5-10
Thick cloud: 75
Dark soil: 5-10
Albedo of fresh snow is generally very high. Dry snow reflects almost 80% of the energy
incident on it. Clouds also reflect a majority of the incident energy. Dark soil and concrete
generally show very low albedo. Albedo of vegetation is also generally low, but varies with
the canopy density. Albedo of forest areas with good canopy cover is as low as 5-10%.
Albedo of water ranges from 5 to 70 percent, due to its specular reflection characteristics.
Albedo is low at lower incidence angles and increases at higher incidence angles.
The energy that is reflected by features on the earth's surface over a variety of different
wavelengths gives their spectral responses. The graphical representation of the spectral
response of an object over different wavelengths of the electromagnetic spectrum is termed the
spectral reflectance curve. These curves give an insight into the spectral characteristics of
different objects, and are hence used in the selection of a particular wavelength band for remote
sensing data acquisition.
For example, Fig. 7 shows the generalized spectral reflectance curves for deciduous (broad-
leaved) and coniferous (needle-bearing) trees. Spectral reflectance varies even within a given
material type, i.e., the spectral reflectance of one deciduous tree will not be identical to that of
another. Hence the generalized curves are shown as a “ribbon” and not as a single line. These
curves help in the selection of a proper sensor system in order to differentiate deciduous and
coniferous trees.
Fig. 7. Spectral reflectance curves for deciduous and coniferous trees (Lillesand et al., 2004)
As seen from Fig. 7, spectral reflectance curves for each tree type are overlapping in most of
the visible portion. A choice of visible spectrum is not a feasible option for differentiation
since both the deciduous and coniferous trees will essentially be seen in shades of green.
However, in the near infra red (NIR) they are quite different and distinguishable. Within the
electromagnetic spectrum, the NIR represents a wavelength range from (0.7-1) to 5 microns.
A comparison of photographs taken in visible band and NIR band is shown in Fig. 8. It
should be noted that panchromatic refers to black and white imagery that is exposed by all
visible light. In the visible band, the tone is the same for both tree types. However, on infrared
photographs, deciduous trees show a much lighter tone due to their higher infrared reflectance
than conifers.
Fig. 8. (a) Panchromatic photograph using reflected sunlight over the visible wavelength band
0.4 to 0.7 μm and (b) Black and white infrared photograph using reflected sunlight over the 0.7
to 0.9 μm wavelength band (Lillesand et al., 2004)
In remote sensing, the spectral reflectance characteristics of the surface features are used to
identify the surface features and to study their characteristics. This requires a basic
understanding of the general reflectance characteristics of different features, which is covered
in the next lecture.
Bibliography
1. Lillesand, T. M., Kiefer, R. W., Chipman, J. W. (2004). Remote Sensing and Image
Interpretation, John Wiley & Sons, New York, pp. 321-332.
Electromagnetic energy incident on surface features is partially reflected, absorbed or
transmitted. The fractions that are reflected, absorbed or transmitted vary with the material
type and the condition of the feature, and also with the wavelength of the incident energy. The
majority of remote sensing systems operate in the region in which the surface features mostly
reflect the incident energy. The reflectance characteristics of the
surface features are represented using spectral reflectance curves.
This lecture covers the spectral reflectance characteristics of some of the important surface
features.
The spectral reflectance curve for healthy green vegetation exhibits the "peak-and-valley"
configuration illustrated in Fig. 1. The peaks indicate strong reflection and the valleys
indicate predominant absorption of the energy in the corresponding wavelength bands.
In general, healthy vegetation is a very good absorber of electromagnetic energy in the
visible region. The absorption greatly reduces and the reflection increases at the red/infrared
boundary near 0.7 μm. The reflectance is nearly constant from 0.7 to 1.3 μm and then decreases
for the longer wavelengths.
Spectral response of vegetation depends on the structure of the plant leaves. Fig. 1 shows the
cell structure of a green leaf and the interaction with the electromagnetic radiation (Gibson
2000).
Fig.1. Cell structure of a green leaf and interactions with the electromagnetic radiation
(Gibson, 2000)
The valleys in the visible portion of the spectrum are due to the pigments in plant leaves. The
palisade cells containing sacs of green pigment (chlorophyll) strongly absorb energy in the
wavelength bands centered at 0.45 and 0.67 μm within visible region (corresponds to blue
and red), as shown in Fig.2. On the other hand, reflection peaks for the green colour in the
visible region, which makes our eyes perceive healthy vegetation as green in colour.
However, only 10-15% of the incident energy is reflected in the green band.
Fig. 2. Spectral reflectance of healthy vegetation in the visible and NIR wavelength bands
http://www.geog.ucsb.edu/
In the reflected infrared portion (or near infrared, NIR) of the spectrum, at 0.7 μm, the
reflectance of healthy vegetation increases dramatically. In the range from 0.7 to 1.3 μm, a
plant leaf reflects about 50 percent of the energy incident upon it. The infrared radiation
penetrates the palisade cells and reaches the irregularly packed mesophyll cells which make
up the body of the leaf. Mesophyll cells reflect almost 60% of the NIR radiation reaching this
layer. Most of the remaining energy is transmitted, since absorption in this spectral region is
minimal. Healthy vegetation therefore shows brighter response in the NIR region compared
to the green region. As the leaf structure is highly variable between plant species, reflectance
measurements in this range often permit discrimination between species, even if they look the
same in visible wavelengths, as seen in Fig. 3.
If a plant is subjected to some form of stress that interrupts its normal growth and
productivity, it may decrease or cease chlorophyll production. The result is less absorption in
the blue and red bands in the palisade. Hence, red and blue bands also get reflected along
with the green band, giving a yellow or brown colour to the stressed vegetation. Also, in
stressed vegetation the NIR bands are no longer reflected by the mesophyll cells; instead they
are absorbed by the stressed or dead cells, causing darker tones in the image (Fig. 3).
Fig. 3 Spectral reflectance curve for healthy and stressed vegetations (Gibson, 2000)
Beyond 1.3 μm, energy incident upon the plants is essentially absorbed or reflected, with
little to no transmittance of energy. Dips in reflectance occur at 1.4, 1.9, and 2.7 μm as water
in the leaf strongly absorbs the energy at these wavelengths. So, wavelengths in these
spectral regions are referred to as water absorption bands. Reflectance peaks occur at 1.6 and
2.2 μm, between the absorption bands. At wavelengths beyond 1.3 μm, leaf reflectance is
approximately inversely related to the total water present in a leaf. This total water is a
function of both the moisture content and the thickness of the leaf.
Similar to the reflection and absorption, transmittance of the electromagnetic radiation by the
vegetation also varies with wavelength. Transmittance of electromagnetic radiation is less in
the visible region and it increases in the infrared region. Vegetation canopies generally
display a layered structure. Therefore, the energy transmitted by one layer is available for
reflection or absorption by the layers below it (Fig. 4). Due to this multi-layer reflection,
total infrared reflection from thicker canopies will be more compared to thin canopy cover.
From the reflected NIR, the density of the vegetation canopy can thus be interpreted.
Fig. 4. Reflectance from dense forest and thin vegetation canopies (Gibson, 2000)
As the reflectance in the IR bands of the EMR spectrum varies with the leaf structure and the
canopy density, measurements in the IR region can be used to discriminate tree or vegetation
species. For example, the spectral reflectance of deciduous and coniferous trees may be similar
in the green band. However, the deciduous trees show higher reflectance in the NIR band, and
the two can be easily differentiated (Fig. 5). Similarly, for a densely grown agricultural area,
the NIR reflectance will be higher.
Fig. 5 Spectral reflectance curves for deciduous and coniferous trees (Lillesand et al., 2004)
Some of the factors affecting soil reflectance are moisture content, soil texture (proportion of
sand, silt and clay), surface roughness, presence of iron oxide and organic matter content.
These factors are complex, variable and interrelated.
For example, the presence of moisture in soil decreases its reflectance. As with vegetation,
this effect is greatest in the water absorption bands at 1.4, 1.9, and 2.7 μm. On the other hand,
similar absorption characteristics are displayed by the clay soils. Clay soils have hydroxyl ion
absorption bands at 1.4 and 2.2 μm.
Soil moisture content is strongly related to the soil texture. For example, coarse, sandy soils
are usually well drained, resulting in low moisture content and relatively high reflectance. On
the other hand, poorly drained fine textured soils generally have lower reflectance. In the
absence of water, however, the soil itself exhibits the reverse tendency i.e., coarse textured
soils appear darker than fine textured soils.
Two other factors that reduce soil reflectance are surface roughness and the content of
organic matter. Presence of iron oxide in a soil also significantly decreases reflectance, at
least in the visible region of wavelengths.
Water provides a semi-transparent medium for electromagnetic radiation. Thus
electromagnetic radiation gets reflected, transmitted or absorbed in water. The spectral
responses vary with the wavelength of the radiation and the physical and chemical
characteristics of the water.
Spectral reflectance of water varies with its physical condition. In the solid phase (ice or
snow), water gives good reflection at all visible wavelengths. On the other hand, reflection in
the visible region is poor for water in the liquid state. This difference in reflectance is due to
the difference in the atomic bonding in the liquid and solid states.
Water in liquid form shows relatively high reflectance in the visible region between 0.4 μm and
0.6 μm. Wavelengths beyond 0.7 μm are almost completely absorbed. Thus clear water appears in a
darker tone in the NIR image. Locating and delineating water bodies with remote sensing
data is done more easily in reflected infrared wavelengths because of this absorption
property.
For example, Fig. 6 shows a part of the Krishna River Basin in different bands of the Landsat
ETM+ imagery. The water body appears in dark colour in all bands and displays sharp
contrast in the IR bands.
Fig. 6 Landsat ETM+ images of a part of the Krishna river basin in different spectral bands
Clear water absorbs relatively less energy having wavelengths shorter than 0.6 μm. High
transmittance typifies these wavelengths with a maximum in the blue-green portion of the
spectrum. However, as the turbidity of water changes (because of the presence of organic or
inorganic materials), transmittance and therefore reflectance change dramatically. For
example, water bodies containing large quantities of suspended sediments normally have
much higher visible reflectance than clear water. Likewise, the reflectance of water changes
with the chlorophyll concentration involved. Increase in chlorophyll concentration tends to
decrease reflectance in blue wavelengths and increase reflectance in green wavelengths.
These changes have been used in remote sensing to monitor the presence and to estimate the
concentration of algae. Reflectance data have also been used to determine the presence or
absence of tannin dyes from bog vegetation in lowland areas, and to detect a number of
pollutants, such as oil and certain industrial wastes.
Many important characteristics of water such as dissolved oxygen concentration, pH, and salt
concentration cannot be observed directly through changes in water reflectance. However,
such parameters sometimes correlate with observed reflectance. Thus, there are many
complex interrelationships between the spectral reflectance of water and particular
characteristics.
Variation in the spectral reflectance in the visible region can be used to differentiate shallow
and deep waters, clear and turbid waters, as well as rough and smooth water bodies.
Reflectance in the NIR range is generally used for delineating water bodies and also to
study algal bloom and phytoplankton concentration in water. More details on the remote
sensing applications for monitoring water quality parameters can be found in Nagesh Kumar
and Reshmidevi (2013).
Further details on the spectral characteristics of vegetation, soil, and water can be found in
Swain and Davis (1978).
Sample spectral reflectance curves of some of the natural features like snow, healthy
vegetation, stressed vegetation, dry soil, turbid water and clear water are given in Fig. 8.
In a multispectral image, multiple sensors are used to sense the reflectance in different
wavelength bands. Reflectance recorded in multiple bands is analysed to find how the
spectral reflectance varies with wavelength. Using the average spectral reflectance curves as
the basic information, this spectral reflectance variation is used to identify the target features.
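As a toy illustration of this idea, the sketch below matches a set of measured band reflectances against a few reference signatures by nearest distance. Both the choice of bands (green, red, NIR) and the reference reflectance values are rough illustrative assumptions in the spirit of the average curves discussed above, not numbers read from the figures.

```python
# Sketch: identifying a target by comparing its band reflectances with reference
# spectral signatures (nearest-neighbour match). All values are illustrative.

import math

# Reference reflectances (%) in the [green, red, NIR] bands (assumed values)
REFERENCE_SIGNATURES = {
    "healthy vegetation": [15.0, 8.0, 50.0],
    "clear water":        [8.0,  4.0,  1.0],
    "dry bare soil":      [20.0, 25.0, 30.0],
}

def classify(measured):
    """Return the reference class whose signature is closest (Euclidean distance)."""
    def distance(reference):
        return math.sqrt(sum((m - r) ** 2 for m, r in zip(measured, reference)))
    return min(REFERENCE_SIGNATURES, key=lambda name: distance(REFERENCE_SIGNATURES[name]))

print(classify([14.0, 9.0, 47.0]))   # -> healthy vegetation
print(classify([7.0, 5.0, 2.0]))     # -> clear water
```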
For example, in Fig.9 aerial photographs of a stadium in normal colour and colour IR are
shown. In normal colour photograph, the artificial turf inside the stadium and the natural
vegetation outside the stadium appear in the same colour. On the other hand, the IR colour
photograph helps to differentiate both very clearly. The artificial turf appears dark in tone,
whereas the natural vegetation shows high reflectance in the IR region. Spectral reflectance
curves of the natural vegetation and the artificial turf are shown in Fig. 10. (Images are taken
from Lillesand et al., 2004).
Fig. 9 Aerial photograph of a football stadium with artificial turf (a) normal colour
photograph (b) colour IR photograph (from Lillesand et al., 2004)
Fig. 10 Spectral reflectance curves of the natural vegetation and the artificial turf (From
Lillesand et al., 2004)
1. Introduction
When a satellite is launched into space, it moves in a well-defined path around the Earth,
which is called the orbit of the satellite. The gravitational pull of the Earth and the velocity of
the satellite are the two basic factors that keep a satellite in a particular orbit. Spatial and
temporal coverage of the satellite depend on the orbit. There are three basic types of orbits
in use.
Geo-synchronous orbits
Polar or near polar orbits
Sun-synchronous orbits
Satellite orbits are matched to the capability and objective of the sensor(s) they carry. Orbit
selection can vary in terms of altitude (their height above the Earth's surface) and their
orientation and rotation relative to the Earth.
This lecture introduces some important terms related to the satellite orbits and the details of
different types of satellite orbits.
The path followed by a satellite in the space is called the orbit of the satellite. Orbits may be
circular (or near circular) or elliptical in shape.
Orbital period: The time taken by a satellite to complete one revolution in its orbit around the
Earth is called the orbital period.
It varies from around 100 minutes for a near-polar earth observing satellite to 24 hours for a
geo-stationary satellite.
Altitude: The altitude of a satellite is its height with respect to the surface immediately below it.
Depending on the purpose of the satellite, the orbit may be located at low (160-2,000 km),
moderate, or high (~36,000 km) altitude.
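The connection between altitude and orbital period quoted in this lecture can be checked with Kepler's third law, T = 2π√(a³/GM), which is standard orbital mechanics and is not derived in the text. A minimal sketch:

```python
# Sketch: circular-orbit period from altitude using Kepler's third law.
# GM and the Earth radius are standard approximate values, not taken from the text.

import math

GM = 3.986e14            # Earth's gravitational parameter, m^3/s^2
EARTH_RADIUS = 6.371e6   # mean Earth radius, m

def orbital_period_minutes(altitude_km):
    a = EARTH_RADIUS + altitude_km * 1.0e3   # semi-major axis of a circular orbit, m
    return 2.0 * math.pi * math.sqrt(a ** 3 / GM) / 60.0

print(f"Near-polar orbit (705 km):  {orbital_period_minutes(705):.1f} min")    # ~99 min
print(f"Geosynchronous (35,786 km): {orbital_period_minutes(35786):.0f} min")  # ~1436 min (~24 h)
```

The results, roughly 99 minutes for a 705 km orbit and about 24 hours at ~36,000 km, are consistent with the orbital periods and altitudes quoted in this lecture.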
Apogee and perigee: Apogee is the point in the orbit where the satellite is at maximum
distance from the Earth. Perigee is the point in the orbit where the satellite is nearest to the
Earth as shown in Fig.2.
Fig.2 Schematic representation of the satellite orbit showing the Apogee and Perigee
Inclination: The inclination of an orbit is the angle between the orbital plane and the Earth's
equatorial plane. The orbital inclination for a remote sensing satellite is typically about 99
degrees (near-polar). A satellite orbiting in the equatorial plane, such as a geostationary
satellite, has an inclination of nearly 0 degrees.
Nadir, ground track and zenith: Nadir is the point of interception on the surface of the
Earth of the radial line between the center of the Earth and the satellite. This is the point of
shortest distance from the satellite to the earth’s surface.
The point just opposite to the nadir, directly above the satellite, is called the zenith.
The circle on the Earth’s surface described by the nadir point as the satellite revolves is called
the ground track. In other words, it is the projection of the satellite's orbit on the ground
surface.
Swath
Swath of a satellite is the width of the area on the surface of the Earth, which is imaged by
the sensor during a single pass.
For example, the swath width of the IRS-1C LISS-3 sensor is 141 km in the visible bands and
148 km in the shortwave infrared band.
Overlap is the common area on consecutive images along the flight direction. For example,
IRS-1C LISS-3 sensors create 7 km overlap between two successive images.
Sidelap is the overlapping areas of the images taken in two adjacent flight lines. For example,
sidelap of the IRS 1C LISS-3 sensor at the equator is 23.5 km in the visible bands and 30km
in the shortwave infrared band.
As the distance between the successive orbital passes decreases towards the higher latitudes,
the sidelap increases. This helps to achieve more frequent coverage of the areas in the higher
latitudes. IRS-1C WiFS sensors provide nearly 80-85% overlap and sidelap.
3. Geosynchronous orbit
Geostationary or geosynchronous orbit is one in which the time required for the satellite
to complete one revolution is the same as that for the Earth to rotate once about its polar axis.
In order to achieve this orbital period, geo-synchronous orbits are at a very high altitude of
nearly 36,000 km.
Geo-synchronous orbits are located in the equatorial plane, i.e., with an inclination of nearly 0
degrees. Thus, from a point on the equator, the satellite appears to be stationary. The satellites
revolve in the same direction as the Earth's rotation (west to east).
Satellites in the geo-synchronous orbit are located above any particular longitude to get a
constant view of any desired region. For example, the GOES East and GOES West satellites
are placed in orbits over North America (normally at 75° W and 135° W, respectively),
INSAT 3-A is located at 93.5o E longitude, the Japanese Geostationary Meteorological
Satellite (GMS) is located over New Guinea, and Meteosat over Europe.
Because of the very high altitude, the footprints of geostationary satellites are generally very
large. They can yield area coverage of 45-50% of the globe.
Fig.7. Foot prints of a typical geosynchronous satellite and a polar orbiting satellite
(Source : http://www.geo-orbit.org)
4. Polar orbits
Polar orbits are medium or low altitude orbits (approximately 700-800 km) compared to the
geosynchronous orbits. Consequently, the orbital period is much shorter, typically 90-103
minutes, and a satellite in a polar orbit makes several revolutions around the Earth in a single
day. Fig.9 shows the multiple orbits of a satellite in a typical near-polar orbit. The National
Oceanic and Atmospheric Administration (NOAA) series of satellites, such as NOAA-17 and
NOAA-18, are examples of polar orbiting satellites.
Because the Earth rotates on its own axis beneath the orbit, each pass brings a new segment of
the Earth's surface into the satellite's view. The satellite's orbit and the rotation of the Earth
thus work together to give complete coverage of the Earth's surface once the satellite has
completed one full cycle of orbits.
An orbital cycle is completed when the satellite retraces its path, i.e., when the nadir point of
the satellite passes over the same point on the Earth’s surface for a second time. Orbital cycle
is also known as repeat cycle of the satellite. The orbital cycle need not be the same as the
revisit period.
Revisit period is the time elapsed between two successive views of the same area by a
satellite. Using steerable sensors, a satellite-borne instrument can view off-nadir areas before
and after the orbit passes over a target. In view of this off-nadir viewing capability of the
satellites, revisit time can be less than the orbital cycle.
In near-polar orbits, areas at high latitudes will be imaged more frequently than the equatorial
zone due to the increasing overlap in adjacent swaths as the orbit paths come closer together
near the poles.
5. Sun-synchronous orbits
It is a special case of the near-polar orbit. Like a polar orbit, the satellite travels from the north
pole to the south pole as the Earth turns below it. In a sun-synchronous orbit, the satellite
passes over the same part of the Earth at roughly the same local solar time each day. These
orbits are typically at 700-800 km altitude and are used for satellites that require consistent
illumination conditions.
A typical sun-synchronous satellite completes about 14 orbits a day, and each successive orbit
is shifted over the Earth's surface by around 2875 km at the equator. In addition, the satellite's
ground track is shifted westward in longitude by about 1.17 degrees (approximately 130.54 km
at the equator) every day, as shown in Fig.10.
Landsat satellites and IRS satellites are typical examples of sun-synchronous, near-polar
satellites. Fig.11 shows the orbits of the Landsat satellites (1, 2 and 3) in each successive pass
and on successive days. The repeat cycle of these satellites was 18 days, and 14 orbits were
completed each day.
Remote sensing applications generally use near polar, sun-synchronous, near circular orbits.
The near polar orientation helps to attain near global coverage, whereas the near circular orbit
helps to attain a uniform swath for the images. Sun-synchronous orbits are preferred because
they maintain a nearly constant sun illumination geometry at the time of imaging. Remote
sensing satellite orbits maintain an inclination close to (slightly greater than) 90 degrees to the
equatorial plane; the small offset from 90 degrees allows the Earth's non-uniform gravitational
pull (the equatorial bulge) to precess the orbital plane at the rate required for sun-synchronism.
Moderate orbital periods are adopted for remote sensing satellites so as to assure near-global
coverage each day.
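As a numerical illustration of the orbit concepts above, the Python sketch below estimates the orbital period of a circular orbit from its altitude (two-body Keplerian motion, spherical Earth, so the numbers are only approximate) and the westward shift of the ground track at the equator between successive orbits. The altitudes used are illustrative, not the specification of any particular mission.

```python
import math

MU = 3.986e14            # Earth's gravitational parameter, m^3/s^2
R_EARTH = 6.371e6        # mean Earth radius, m
SIDEREAL_DAY = 86164.0   # seconds

def orbital_period_s(altitude_km):
    """Keplerian period (s) of a circular orbit at the given altitude."""
    a = R_EARTH + altitude_km * 1e3
    return 2 * math.pi * math.sqrt(a ** 3 / MU)

def equatorial_shift_km(altitude_km):
    """Westward shift of the ground track at the equator between successive
    orbits, caused by the Earth rotating during one orbital period."""
    circumference_km = 2 * math.pi * R_EARTH / 1e3
    return circumference_km * orbital_period_s(altitude_km) / SIDEREAL_DAY

if __name__ == "__main__":
    for h in (705, 817, 35786):   # low earth orbits and a geostationary altitude
        print(f"altitude {h:5d} km: period = {orbital_period_s(h)/60:7.1f} min, "
              f"shift per orbit = {equatorial_shift_km(h):6.0f} km")
```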
1. Introduction
In general, the resolution is the minimum distance between two objects that can be
distinguished in the image. Objects closer than the resolution appear as a single object in the
image. However, in remote sensing the term resolution is used to represent the resolving
power, which includes not only the capability to identify the presence of two objects, but also
their properties. In qualitative terms, resolution is the amount of detail that can be observed
in an image. Thus an image that shows finer detail is said to be of finer resolution compared
to an image that shows only coarser detail. Four types of resolution are defined for remote
sensing systems.
Spatial resolution
Spectral resolution
Temporal resolution
Radiometric resolution
2. Spatial resolution
A digital image consists of an array of pixels. Each pixel contains information about a small
area on the land surface, which is considered as a single object.
Spatial resolution is a measure of the area or size of the smallest dimension on the Earth’s
surface over which an independent measurement can be made by the sensor.
It is expressed by the size of the pixel on the ground in meters. Fig.1 shows the examples of a
coarse resolution image and a fine resolution image.
A measure of size of pixel is given by the Instantaneous Field of View (IFOV). The IFOV is
the angular cone of visibility of the sensor, or the area on the Earth’s surface that is seen at
one particular moment of time. IFOV is dependent on the altitude of the sensor above the
ground level and the viewing angle of the sensor.
A narrow viewing angle produces a smaller IFOV, as shown in Fig. 2. Since viewing angle β
is greater than viewing angle α, IFOVβ is greater than IFOVα. The IFOV also increases with
the altitude of the sensor, as shown in Fig. 2: IFOVβ and IFOVα for a sensor at lower altitude
are smaller than those for the same sensor at a higher altitude.
Fig.2. IFOV variation with angle of view and altitude of the sensor
The size of the area viewed on the ground can be obtained by multiplying the IFOV (in
radians) by the distance from the ground to the sensor. This area on the ground is called the
ground resolution or ground resolution cell. It is also referred as the spatial resolution of the
remote sensing system.
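This relationship can be turned into a one-line calculation. In the Python sketch below, the 0.086 milliradian IFOV and 913 km altitude are the values commonly quoted for the Landsat MSS and are used here only as an illustration.

```python
def ground_resolution_m(ifov_mrad, altitude_km):
    """Diameter (m) of the ground resolution cell at nadir:
    the IFOV (in radians) multiplied by the sensor-to-ground distance."""
    return (ifov_mrad * 1e-3) * (altitude_km * 1e3)

if __name__ == "__main__":
    # Landsat MSS-like values: 0.086 mrad viewed from about 913 km
    print(f"ground resolution cell = {ground_resolution_m(0.086, 913):.0f} m")  # about 79 m
```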
For a homogeneous feature to be detected, its size generally has to be equal to or larger than
the resolution cell. If more than one feature is present within the IFOV or ground resolution
cell, the signal response recorded includes a mixture of the signals from all the features.
When the average brightness of all features in that resolution cell is recorded, any one
particular feature among them may not be detectable. However, smaller features may
sometimes be detectable if their reflectance dominates within a particular resolution cell
allowing sub-pixel or resolution cell detection.
Fig. 3 gives an example of how the identification of a feature (a house in this case) varies
with spatial resolution. In the example, for the 30m resolution image, the signature from the
“house” dominates for the cell and hence the entire cell is classified as “house”. On the other
hand, in the fine resolution images, the shape and the spatial extent of the feature is better
captured. In the 5m resolution image, along the boundary of the feature, some of the cells that
are partially covered under the feature are classified as “house” based on the dominance of
the signals from the feature. In the very fine resolution image, the feature shape and the
spatial extent is more precisely identified.
Remote sensing systems with spatial resolution coarser than 1 km are generally considered
low resolution systems. AVHRR (1.1 km) and the coarse (~1 km) bands of MODIS are
examples of low resolution sensors used in satellite remote sensing. Systems with a spatial
resolution of 100 m - 1 km are considered moderate resolution systems. IRS WiFS (188 m),
band 6 (the thermal infrared band) of the Landsat TM (120 m), and bands 1-7 of MODIS with
250-500 m resolution come under this class. Remote sensing systems with spatial resolution
approximately in the range 5-100 m are classified as high resolution systems. Landsat ETM+
(30 m), IRS LISS-III (23 m multispectral), IRS PAN (5.8 m), AWiFS (56-70 m) and SPOT 5
(2.5-5 m panchromatic)
are some of the high resolution sensors. Very high resolution systems are those which
provide less than 5m spatial resolution. GeoEye (0.45m for Panchromatic and 1.65m for
MSS), IKONOS (0.8-1m Panchromatic), and Quickbird (2.4-2.8 m) are examples of very
high resolution systems.
Fig. 4 shows how an area appears in images of different spatial resolution, how much
information can be retrieved from each, and the typical scale of application of these images.
Fig.4. False color composite image (red = 850 nm, green = 650 nm, blue = 555 nm) of
MODIS, ETM+ and IKONOS imagery (Courtesy: Morisette et al., 2002)
The ratio of distance on an image or map, to actual ground distance is referred to as scale.
If we have a map with a scale of 1:100,000, an object of 1cm length on the map would
actually be an object 100,000cm (1km) long on the ground. Maps or images with small "map-
to-ground ratios" are referred to as small scale (e.g. 1:100,000), and those with larger ratios
(e.g. 1:5,000) are called large scale. Thus, large scale maps/images provide finer spatial
resolution compared to small scale maps/images.
3. Spectral resolution
Spectral resolution represents the spectral band width of the filter and the sensitiveness of the
detector. The spectral resolution may be defined as the ability of a sensor to define fine
wavelength intervals or the ability of a sensor to resolve the energy received in a spectral
bandwidth to characterize different constituents of earth surface. The finer the spectral
resolution, the narrower the wavelength range for a particular channel or band.
Many remote sensing systems are multi-spectral, recording energy over several separate
wavelength ranges at various spectral resolutions. For example, IRS LISS-III uses 4 bands:
0.52-0.59 μm (green), 0.62-0.68 μm (red), 0.77-0.86 μm (near IR) and 1.55-1.70 μm (mid-IR).
The Aqua/Terra MODIS instruments use 36 spectral bands, including three in the visible
spectrum. A more recent development is the hyperspectral sensor, which detects hundreds of
very narrow spectral bands. Figure 5 shows a hypothetical representation of remote sensing systems with
different spectral resolution. The first representation shows the DN values obtained over 9
pixels using imagery captured in a single band. Similarly, the second and third
representations depict the DN values obtained in 3 and 6 bands using the respective sensors.
If the area imaged is, say, A km², then the same area is viewed using 1, 3 and 6 bands
respectively.
Generally surface features can be better distinguished from multiple narrow bands, than from
a single wide band.
For example, in Fig. 6, using the broad wavelength band 1, the features A and B cannot be
differentiated. However, the spectral reflectance values of the two features are different in the
narrow bands 2 and 3. Thus, a multi-spectral image involving bands 2 and 3 can be used to
differentiate the features A and B.
Fig.6. Two different surfaces (A and B) are indistinguishable on a single band but can be
differentiated in 2 narrow bands
In remote sensing, different features are identified from the image by comparing their
responses over different distinct spectral bands. Broad classes, such as water and vegetation,
can be easily separated using very broad wavelength ranges like visible and near-infrared.
However, for more specific classes viz., vegetation type, rock classification etc, much finer
wavelength ranges and hence finer spectral resolution are required. For example, Fig. 7
shows the difference in the spectral responses of an area in different bands of the Landsat TM
image.
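As a concrete illustration of how two narrow bands can separate classes that a single broad band cannot, the sketch below (Python with NumPy) computes the widely used normalized difference vegetation index, NDVI = (NIR − red) / (NIR + red): vegetation gives strongly positive values while water gives negative ones. The reflectance values are made-up numbers, not taken from any figure in this lecture.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index computed pixel by pixel."""
    nir = nir.astype(float)
    red = red.astype(float)
    return (nir - red) / (nir + red + 1e-10)   # small term avoids division by zero

# Illustrative 2x2 reflectances: top row vegetation-like, bottom row water-like
red = np.array([[0.05, 0.06],
                [0.04, 0.05]])
nir = np.array([[0.45, 0.50],
                [0.02, 0.03]])

print(ndvi(nir, red))
# vegetation pixels -> about +0.8, water pixels -> negative values
```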
Fig.8 shows a panchromatic image and Fig.9 and 10 show Landsat TM images taken in
different spectral bands. The figures clearly indicate how different bands and their
combinations help to extract different information.
Fig.8. A coarse resolution panchromatic image- Minimum information is visible from the
image
Fig. 9. Landsat TM (321) showing forest fire in Yellowstone NP- The smoke cover obstructs
the ground view
1. Introduction
In remote sensing the term resolution is used to represent the resolving power, which includes
not only the capability to identify the presence of two objects, but also their properties. In
qualitative terms, resolution is the amount of detail that can be observed in an image.
Four types of resolutions are defined for the remote sensing systems.
Spatial resolution
Spectral resolution
Temporal resolution
Radiometric resolution
The previous lecture covered the details of the spatial and spectral resolution. This lecture
covers the radiometric and temporal resolutions, in detail.
2. Radiometric resolution
Radiometric resolution of a sensor is a measure of the number of grey levels that can be
measured between pure black (no reflectance) and pure white (maximum reflectance). In
other words, radiometric resolution
represents the sensitivity of the sensor to the magnitude of the electromagnetic energy.
The finer the radiometric resolution of a sensor, the more sensitive it is to small differences
in reflected or emitted energy; in other words, the system can measure a larger number of
grey levels.
Each bit records an exponent of power 2 (e.g., 1 bit = 2^1 = 2 levels). The maximum number of
brightness levels available depends on the number of bits used in representing the recorded
energy. For example, Table 1 shows the radiometric resolution and the corresponding
brightness levels available.
Thus, if a sensor used 11 bits to record the data, there would be 2^11 = 2048 digital values
available, ranging from 0 to 2047. However, if only 8 bits were used, then only 2^8 = 256 values
ranging from 0 to 255 would be available. Thus, the radiometric resolution would be much
less.
Image data are generally displayed in a range of grey tones, with black representing a digital
number of 0 and white representing the maximum value (for example, 255 in 8-bit data). By
comparing a 2-bit image with an 8-bit image, we can see that there is a large difference in the
level of detail discernible depending on their radiometric resolutions. In an 8 bit system,
black is measured as 0 and white is measured as 255. The variation from black to white is
scaled into 256 levels ranging from 0 to 255. Similarly, 2048 levels are used in an 11-bit
system as shown in Fig.1.
The finer the radiometric resolution, the more grey levels the system can record, and hence
the more detail that can be captured in the image.
Fig.2 shows the comparison of a 2-bit image (coarse resolution) with an 8-bit image (fine
resolution), from which a large difference in the level of details is apparent depending on
their radiometric resolutions.
As radiometric resolution increases, the degree of details and precision available will also
increase. However, increased radiometric resolution may increase the data storage
requirements.
Fig.2 Comparison of a coarse resolution 2-bit image with a fine resolution 8-bit image
In an image, the energy received is recorded and represented using a Digital Number (DN).
The DN in an image may vary from 0 to a maximum value, depending upon the number of
grey levels that the system can identify, i.e., the radiometric resolution. Thus, in addition to
the energy received, the DN for any pixel varies with the radiometric resolution. For the same
amount of energy received, a coarse resolution image (which can record fewer energy levels)
assigns a lower value to the pixel than a fine resolution image (which can record more energy
levels). This is explained with the help of an example below.
The DNs recorded by the 3-bit system range from 0 to 7 and this range is equivalent to 0-63
for the 6 bit system.
3-bit DN:   0    1    2    3    4    5    6    7
6-bit DN:   0    9   18   27   36   45   54   63
Therefore a DN of 28 on the 6-bit system will be recorded as 3 in the 3-bit system. A 6-bit
system could record the difference in the energy at levels 45 and 47, whereas in a 3-bit
system both will be recorded as 5.
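The 3-bit/6-bit example can be written as a small conversion routine. The Python sketch below linearly rescales a DN from one radiometric resolution to another and reproduces the behaviour described above: a 6-bit DN of 28 maps to 3 in a 3-bit system, and the 6-bit values 45 and 47 both map to 5.

```python
def convert_dn(dn, bits_in, bits_out):
    """Rescale a digital number between radiometric resolutions by mapping
    the range [0, 2**bits_in - 1] linearly onto [0, 2**bits_out - 1]."""
    max_in = 2 ** bits_in - 1
    max_out = 2 ** bits_out - 1
    return round(dn * max_out / max_in)

if __name__ == "__main__":
    print(convert_dn(28, 6, 3))                        # -> 3
    print(convert_dn(45, 6, 3), convert_dn(47, 6, 3))  # -> 5 5 (the difference is lost)
```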
Therefore, when two images are to be compared, they must be of the same radiometric resolution.
3. Temporal Resolution
Temporal resolution describes how frequently an object is sampled, i.e., how often data are
obtained for the same area.
The absolute temporal resolution of a remote sensing system to image the same area at the
same viewing angle a second time is equal to the repeat cycle of a satellite.
The repeat cycle of a near-polar orbiting satellite is usually several days; for example, it is
24 days for IRS-1C and Resourcesat-2, and 18 days for the early Landsat missions (16 days
from Landsat-4 onwards). However, due to the off-nadir viewing capabilities of the sensors
and the sidelap of the satellite swaths in adjacent orbits, the actual revisit period is in general
less than the repeat cycle.
The actual temporal resolution of a sensor therefore depends on a variety of factors, including
the satellite/sensor capabilities, the swath overlap, and latitude.
Because of some degree of overlap in the imaging swaths of the adjacent orbits, more
frequent imaging of some of the areas is possible. Fig. 3 shows the schematic of the image
swath sidelap in a typical near polar orbital satellite.
From Fig.3 it can be seen that the sidelap increases with latitude. Towards the polar regions
the satellite orbits come closer to each other than they do near the equator, so the sidelap is
larger and more frequent images are available at high latitudes. Fig. 4 shows the path of a
typical near-polar satellite.
In addition to the sidelap, more frequent imaging of any particular area of interest is achieved
in some of the satellites by pointing their sensors to image the area of interest between
different satellite passes. This is referred as the off-nadir viewing capability.
For example, using pointable optics, a sampling frequency as high as once in 1-3 days is
achieved for IKONOS, whereas the nominal repeat cycle of the satellite is about 14 days.
Images of the same area of the Earth's surface at different periods of time show the variation
in the spectral characteristics of different features or areas over time. Such multi-temporal
data is essential for the following studies.
Flood studies: Satellite images before and after a flood event help to identify the areal extent
of inundation during the progress and recession of the flood. The Great Flood of 1993, also
known as the Great Mississippi and Missouri Rivers Flood of 1993, occurred from April to
October 1993 along the Mississippi and Missouri rivers and their tributaries. The flood
caused damage of around $15 billion and was one of the worst such disasters to occur in the
United States. Fig.5 shows Landsat TM images taken during a normal period and during the
flood. Comparison of the two images helps to identify the inundated areas during the flood.
Fig.5 Landsat TM images of the Mississippi River during a non-flood period and during the
great flood of 1993
Land use/ land cover classification: Temporal variation in the spectral signature is valuable in
land use/ land cover classification. Comparing multi-temporal images, the presence of
features over time can be identified, and this is widely adopted for classifying various types
of crops / vegetation. For example, during the growing season, the vegetation characteristics
change continuously. Using multi-temporal images it is possible to monitor such changes and
thus the crop duration and crop growth stage can be identified, which can be used to classify
the crop types viz., perennial crops, long or short duration crops.
Fig. 6 shows the MODIS data product for the Krishna River Basin in different months in
2001. Images of different months of the year help to differentiate the forest areas, perennial
crops and short duration crops.
Fig.6 False Color Composites (FCC) of the Krishna River Basin generated from the MODIS
data for different months in 2001.
The figure represents False Color Composites (FCC) of the river basin. The concepts
regarding color composites have been explained in module 4.
4. Signal-to-Noise Ratio
The data recorded on a sensor are composed of the signal (say reflectance) and noise (from
aberrations in the electronics, moving parts or defects in the scanning system as they degrade
over time). If the signal-to-noise ratio (SNR) is high, it becomes easy to differentiate the
noise from the actual signals. SNR depends on strength of signal available and the noise of
the system.
Increasing the spectral and spatial resolution reduces the energy received or the strength of
the signal. Consequently, the SNR decreases. Also, finer radiometric resolution results in a
larger number of grey levels; if the difference in energy between two adjacent levels is
smaller than the noise, the reliability of the recorded grey level diminishes.
In remote sensing, energy recorded at the sensor depends on the spatial and spectral
resolution of the sensor.
Radiometric resolution of the sensor varies with the amount of energy received at the sensor.
Fine spatial resolution requires a small IFOV. The smaller the IFOV, the smaller the area of
the ground resolution cell and hence the less energy received from that area. When the
energy received is less, the ability of the sensor to detect fine energy differences is reduced,
leading to poorer radiometric resolution.
Use of narrow spectral bands increases the spectral resolution, but reduces the energy
received at the sensor in each band. To increase the amount of energy received, and hence to
improve the radiometric resolution without reducing the spatial resolution, a broader
wavelength band can be used. However, this would reduce the spectral resolution of the
sensor.
Thus, there are trade-offs between spatial, spectral, and radiometric resolution. These three
types of resolution must be balanced against the desired capabilities and objectives of the
sensor.
Thus, finer spatial, spectral and radiometric resolutions of a system may decrease the SNR to
such an extent that the data may not be reliable.
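The direction of this trade-off can be caricatured with a toy model in which the signal collected in a band is taken to be proportional to the ground cell area multiplied by the spectral band width, while the detector noise stays fixed. The numbers in the Python sketch below are arbitrary illustrative values, not the characteristics of any real sensor.

```python
def relative_snr(pixel_size_m, bandwidth_um, noise=1.0, k=1.0):
    """Toy model: signal proportional to (ground cell area) x (band width)."""
    signal = k * (pixel_size_m ** 2) * bandwidth_um
    return signal / noise

# Coarse spatial / broad band versus fine spatial / narrow band
print(relative_snr(pixel_size_m=80, bandwidth_um=0.10))   # 640.0
print(relative_snr(pixel_size_m=20, bandwidth_um=0.01))   # 4.0
# Halving the pixel size alone already cuts the collected energy by a factor of 4.
```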
1. Introduction
Multi-band imaging employs the selective sensing of the energy reflected in multiple
wavelength bands in the range 0.3 to 0.9 μm. Generally broad bands are used in multi-band
imaging. Multi-spectral scanners operate on the same principle, but use a larger number of
narrower bands over a wider range, from about 0.3 to approximately 14 μm. Thus
multi-spectral scanners operate in visible, near infrared (NIR), mid-infrared (MIR) and
thermal infrared regions of the electro-magnetic radiation (EMR) spectrum.
Thermal scanners are special types of multi-spectral scanners that operate only in the thermal
portion of the EMR spectrum. Hyperspectral sensing is a more recent development of multi-
spectral scanning, in which hundreds of very narrow, contiguous spectral bands in the visible,
NIR and MIR portions of the EMR spectrum are employed.
This lecture gives a brief description of the multispectral remote sensing. Different types of
multispectral scanners and their operation principles are covered in this lecture. The lecture
also gives brief overview of the thermal and hyperspectral remote sensing.
2. Multispectral scanners
Multispectral scanners record the reflected or emitted energy in several discrete wavelength
bands. For example, the MSS onboard the first five Landsat missions was operational in 4
bands: 0.5-0.6, 0.6-0.7, 0.7-0.8 and 0.8-1.1 μm. Similarly, IRS LISS-III sensors operate in four bands
(0.52-0.59, 0.62-0.68, 0.77-0.86, 1.55-1.70 μm) three in the visible and NIR regions and one
in the MIR region of the EMR spectrum.
Spectral reflectance of the features differs in different wavelength bands. Features are
identified from the image by comparing their responses over different distinct spectral bands.
Broad classes, such as water and vegetation, can be easily separated using very broad
wavelength ranges like visible and near-infrared. However, for more specific classes viz.,
vegetation type, rock classification etc, much finer wavelength ranges and hence finer
spectral resolution are required.
Fig.1 shows bands 4, 5, 6 and 7 obtained from the Landsat-1 MSS and the standard FCC.
Fig.1. Landsat-1 MSS images of an area obtained in different spectral bands and the standard
FCC (source: http://www.fas.org/)
The figure clearly displays how water, vegetation and other features are displayed in different
bands, and how the combination of different bands helps the feature identification.
The across-track scanner is also known as a whisk-broom scanner. In an across-track scanner,
a rotating or oscillating mirror is used to scan the terrain in a series of lines, called scan lines,
which are at right angles to the flight line. As the aircraft or the platform moves forward, successive
lines are scanned giving a series of contiguous narrow strips. Schematic representation of the
operational principle of a whisk-broom scanner is shown in Fig.2.
The scanner thus continuously measures the energy from one side to the other side of the
platform and thus a two-dimensional image is generated.
The incoming reflected or emitted radiation is separated into several thermal and non-thermal
wavelength components using a dichroic grating and a prism. An array of electro-optical
detectors, each having peak spectral sensitivity in a specific wavelength band, is used to
measure each wavelength band separately.
Along-track scanners also use the forward motion of the platform to record successive scan
lines and build up a two-dimensional image, perpendicular to the flight direction. However,
along-track scanner does not use any scanning mirrors, instead a linear array of detectors is
used to simultaneously record the energy received from multiple ground resolution cells
along the scan line. This linear array typically consists of numerous charged coupled devices
(CCDs). A single array may contain more than 10,000 individual detectors. Each detector
element is dedicated to record the energy in a single column as shown in Fig. 3. Also, for
each spectral band, a separate linear array of detectors is used. The arrays of detectors are
arranged in the focal plane of the scanner in such a way that each scan line is viewed
simultaneously by all the arrays. The arrays of detectors are pushed along the flight direction
to scan successive scan lines, hence the name push-broom scanner. A two-
dimensional image is created by recording successive scan lines as the aircraft moves
forward.
(Source: http://stlab.iis.u-tokyo.ac.jp/)
The linear array of detectors provides longer dwell time over each ground resolution cell,
which increases the signal strength. This also increases the radiometric resolution. In a push-
broom scanner, size of the ground resolution cell is determined by the IFOV of a single
detector. Thus, finer spatial and spectral resolution can be achieved without impacting
radiometric resolution.
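The dwell-time advantage of the push-broom arrangement can be illustrated with a rough calculation. In the idealised Python sketch below, a whisk-broom scanner must cover one full scan line in the time the platform advances by one ground cell, so the time available per cell is the line time divided by the number of cells across the swath, whereas a push-broom detector stares at its cell for the whole time the platform takes to cross it. The 30 m cell, 185 km swath and 7 km/s ground speed are illustrative values, not the specification of a particular instrument.

```python
def whiskbroom_dwell_s(cell_m, swath_km, ground_speed_m_s):
    """Idealised dwell time per cell for a single-detector across-track scanner."""
    line_time = cell_m / ground_speed_m_s          # time available to scan one line
    cells_per_line = (swath_km * 1e3) / cell_m
    return line_time / cells_per_line

def pushbroom_dwell_s(cell_m, ground_speed_m_s):
    """Each detector of the linear array stares at its cell while the platform crosses it."""
    return cell_m / ground_speed_m_s

print(f"whisk-broom: {whiskbroom_dwell_s(30, 185, 7000) * 1e6:.2f} microseconds per cell")
print(f"push-broom : {pushbroom_dwell_s(30, 7000) * 1e3:.2f} milliseconds per cell")
```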
3. Thermal scanner
Thermal scanner is a special kind of across track multispectral scanner which senses the
energy in the thermal wavelength range of the EMR spectrum. Thermal infrared radiation
refers to electromagnetic waves with wavelength 3-14 μm. The atmosphere absorbs much of
the energy in the wavelength ranging from 5-8 μm. Due to the atmospheric effects, thermal
scanners are generally restricted to 3-5 μm and 8-14 μm wavelength ranges.
Fig. 5 shows a daytime thermal image of the San Francisco region recorded using the 8.5-13.0
μm thermal wavelength region. The runway of the airport appears in a light tone, as the
thermal emission from the runway is high during the daytime.
The Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) onboard
Terra and the Thermal Infrared Multispectral Scanner (TIMS), developed jointly by NASA
JPL and the Daedalus Corporation, are two examples. ASTER data is used to create detailed maps of land surface temperature,
reflectance, and elevation. TIMS is used as an airborne geologic remote sensing tool to
acquire mineral signatures to discriminate minerals like silicate and carbonate. It uses 6
wavelength channels as shown in Table 2.
Channel   Wavelength (μm)
1         8.2 - 8.6
2         8.6 - 9.0
3         9.0 - 9.4
4         9.4 - 10.2
5         10.2 - 11.2
6         11.2 - 12.2
Since the energy received at the sensor decreases as the wavelength increases, larger IFOVs
are generally used in thermal sensors to ensure that enough energy reaches the detector for a
reliable measurement. Therefore the spatial resolution of thermal sensors is usually fairly
coarse, relative to the spatial resolution possible in the visible and reflected infrared.
However, due to the relatively long wavelength, atmospheric scattering is minimal in thermal
scanning. Also since the reflected solar radiation is not measured in thermal scanning, it can
be operated in both day and night times.
In thermal scanning the energy radiated from the land surface is measured using thermal
sensors. The thermal emission is the portion of internal energy of surface that is transformed
into radiation energy.
A blackbody is a hypothetical, ideal radiator that totally absorbs and re-emits all energy
incident upon it. Emissivity (ε) is the factor used to represent the radiant exitance of a
material compared to that of a blackbody at the same temperature. Thus
ε = (radiant exitance of the material at temperature T) / (radiant exitance of a blackbody at the same temperature T)
An ideal blackbody (a body which transforms all of its internal energy into radiation energy)
has an emissivity equal to 1. The emissivity of real surfaces ranges from 0 to 1.
Emissivity of a material varies with the wavelength, viewing angle and temperature. If the
emissivity of a material varies with wavelength, it is called a selective radiator. If a material
has constant emissivity, which is less than 1, in all the wavelengths it is called a grey body.
In the thermal scanning, the radiant energy from the surface is measured.
According to the Stefan-Boltzmann law, the radiant exitance (M) of a blackbody is given by
M = σ T^4
where σ is the Stefan-Boltzmann constant (5.67 × 10^-8 W m^-2 K^-4) and T is the absolute
temperature. For a real surface with emissivity ε, the radiant exitance is
M = ε σ T^4
Thermal sensors record the radiant exitance M from the surface. Thus, if the emissivity ε is
known, the true (kinetic) surface temperature can be determined. In general, however, the
target features in satellite remote sensing are unknown and hence so are their emissivities. In
such cases, the brightness temperature of the surface is determined, which is the temperature
the surface would have if it were a blackbody emitting the measured radiance.
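A short calculation makes the emissivity/brightness-temperature relationship concrete: a grey body emits M = εσT^4, and interpreting this exitance as if it came from a blackbody gives a brightness temperature T_B = ε^(1/4)·T, which is always at or below the true kinetic temperature. The emissivity and temperature values in the Python sketch below are illustrative.

```python
SIGMA = 5.670e-8   # Stefan-Boltzmann constant, W m^-2 K^-4

def radiant_exitance(temp_k, emissivity=1.0):
    """Radiant exitance (W/m^2) of a surface with the given emissivity."""
    return emissivity * SIGMA * temp_k ** 4

def brightness_temperature(exitance):
    """Temperature of the blackbody that would emit the given exitance."""
    return (exitance / SIGMA) ** 0.25

kinetic_T = 300.0    # K, assumed true surface temperature
emissivity = 0.95    # illustrative value for a soil-like surface

M = radiant_exitance(kinetic_T, emissivity)
print(f"exitance = {M:.1f} W/m^2, brightness temperature = {brightness_temperature(M):.1f} K")
# brightness temperature = 0.95**0.25 * 300, about 296.2 K, below the kinetic temperature
```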
Thermal imaging
For the thermal energy sensing, typically quantum or photon detectors containing electrical
charge carriers are used. The principle behind the thermal scanning is the direct relationship
between the photons of radiation falling on the detector and the energy levels of the electrical
charge carriers.
Some of the commonly used detectors are mercury-doped germanium (sensitive in the range
3-14 μm), indium antimonide (sensitive in the region 3-5 μm) and mercury cadmium telluride
(sensitive in the region 8-14 μm).
Information about temperature extremes and about heating and cooling rates is used to
interpret the type and condition of an object. For example, water reaches its maximum
temperature more slowly than rocks or soils; therefore, terrain temperatures are normally
higher than water temperatures during the daytime and lower during the night.
These interpretations form the basis of many important applications of thermal remote sensing imagery.
4. Hyperspectral Sensors
Hyperspectral sensors (also known as imaging spectrometers) are instruments that acquire
images in several, narrow, contiguous spectral bands in the visible, NIR, MIR, and thermal
infrared regions of the EMR spectrum. Hyperspectral sensors may be along-track or across-
track.
A typical hyperspectral scanner records more than 100 bands and thus enables the
construction of a continuous reflectance spectrum for each pixel.
For example, the Hyperion sensor onboard NASA’s EO-1 satellite images the earth's surface
in 220 contiguous spectral bands, covering the region from 400 nm to 2.5 μm, at a ground
resolution of 30 m. The AVIRIS sensor developed by the JPL contains four spectrometers
with a total of 224 individual CCD detectors (channels), each with a spectral resolution of 10
nanometers and a spatial resolution of 20 meters.
From the data acquired in multiple, contiguous bands, a near-continuous spectral curve can be
constructed for any pixel, which may correspond to an extended ground feature.
Depending on whether the pixel is a pure feature class or the composition of more than one
feature class, the resulting plot will be either a definitive curve of a "pure" feature or a
composite curve containing contributions from the several features present. Spectral curves of
the pixels are compared with the existing spectral library to identify the targets. All pixels
whose spectra match the target spectrum to a specified level of confidence are marked as
potential targets.
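One common way of comparing a pixel spectrum with library spectra is the spectral angle: the angle between the two spectra treated as vectors in band space, with small angles indicating a close match. The lecture does not prescribe a particular matching algorithm, so the minimal Python/NumPy sketch below, with made-up five-band spectra, is only an illustration of the general idea.

```python
import numpy as np

def spectral_angle(pixel, reference):
    """Angle (radians) between a pixel spectrum and a reference spectrum."""
    pixel = np.asarray(pixel, dtype=float)
    reference = np.asarray(reference, dtype=float)
    cos_a = np.dot(pixel, reference) / (np.linalg.norm(pixel) * np.linalg.norm(reference))
    return np.arccos(np.clip(cos_a, -1.0, 1.0))

# Made-up 5-band spectra: the pixel resembles "vegetation" more than "soil"
pixel      = [0.04, 0.06, 0.05, 0.45, 0.23]
vegetation = [0.05, 0.07, 0.04, 0.50, 0.25]
soil       = [0.12, 0.18, 0.22, 0.28, 0.32]

for name, ref in (("vegetation", vegetation), ("soil", soil)):
    print(f"{name}: {spectral_angle(pixel, ref):.3f} rad")
# The library spectrum giving the smallest angle is taken as the best match.
```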
Hyperspectral AVIRIS image of the San Juan Valley of Colorado is shown below. Fig.8
below shows the spectral curves for different crop classes generated using the reflectance
from multiple bands of the AVIRIS image. Spectral curves generated from the image are
used to identify the vegetation or crop type in the circular fields and are verified with the
ground data.
Fig.8. Hyperspectral AVIRIS image of the San Juan Valley of Colorado and the spectral
signature curves generated for different fields
(Source : http://geoinfo.amu.edu.pl/wpk/rst/rst/Intro/Part2_24.html)
1. Introduction
This lecture covers the details of some of the important remote sensing satellites that operate
in the optical region of the electromagnetic spectrum. This includes the ultraviolet (UV),
visible, near-infrared (NIR), middle-infrared (MIR) and thermal infrared (approximately
3-14 μm) wavelength regions.
There are many characteristics that describe any satellite remote sensing systems. Satellite’s
orbit (including its altitude, period, inclination and the equatorial crossing time), repeat cycle,
spatial resolution, spectral characteristics, radiometric properties are a few of them.
This lecture gives details of the satellites of the Landsat, SPOT and IRS programs, and of
some of the very high resolution satellites such as IKONOS and QuickBird. Details of some
of the important geosynchronous satellite programs, viz., INSAT and GOES, are also covered
in this lecture.
Landsat is the longest-running program for acquiring satellite imagery of the Earth.
The first satellite in the series, Landsat-1, was launched in July 1972 as a collaborative effort
of NASA and the US Department of the Interior. The program was initially called the Earth
Resources Technology Satellite (ERTS) program and was renamed Landsat in 1975. The
mission consists of 8 satellites launched successively. The most recent in the series, Landsat-8,
also called the Landsat Data Continuity Mission (LDCM), was launched in February 2013.
Different types of sensors viz., Return Beam Vidicom (RBV), Multispectral Scanner (MSS),
Thematic Mapper, Enhanced Thematic Mapper (ETM), and Enhanced Thematic Mapper Plus
(ETM+) have been used in various Landsat missions.
Landsat missions use sun-synchronous, near polar orbits at different altitudes for each
mission
Table 1 gives the details of different Landsat missions including the type of sensors , spatial,
temporal and radiometric resolution.
Table 1. Details of the Landsat missions (all in sun-synchronous, near-polar orbits)
Landsat-1: 1972-1978; sensors RBV (bands 1-3), MSS (bands 4-7)
Landsat-2: 1975-1982; sensors RBV (bands 1-3), MSS (bands 4-7)
Landsat-3: 1978-1983; sensors RBV (bands 1-4), MSS (bands 4-8)
Landsat-4: 1982-2001; sensors MSS (bands 1-4), TM (bands 1-7)
Landsat-5: 1984-2012; sensors MSS (bands 1-4), TM (bands 1-7)
Landsat-6: 1993 (launch failed); sensor ETM (bands 1-8)
Landsat-7: April 1999 - ; sensor ETM+ (bands 1-8)
Landsat-8: February 2013 - ; sensors OLI (bands 1-9), TIRS (bands 1-2)
Landsat satellites typically complete 14 orbits in a day. Figure 3 shows the orbital path of a
Landsat satellite.
Landsat 4 and 5 maintained 8 days out of phase, so that when both were operational, 8-day
repeat coverage could be maintained. The MSS used in the Landsat program employs
across-track scanning to generate the two-dimensional image.
Spectral bands used in various sensors of the Landsat mission are given in Table 2.
The Landsat-8 mission carries two sensors, the Operational Land Imager (OLI) and the
Thermal Infrared Sensor (TIRS). The OLI is operational in 9 bands, including 1
panchromatic band. Spectral ranges of these bands are given in Table 3. The TIRS operates in
2 thermal bands; their spectral ranges are also given in Table 3.
Table 3. Spectral bands of the OLI and TIRS sensors of the Landsat-8 mission
(Source: landsat.usgs.gov/band_designations_landsat_satellites.php)
Operational Land Imager (OLI) Thermal Infrared Scanner (TIRS)
SPOT (Systeme Pour l’Observation de la Terre) was designed by the Centre National
d’Etudes Spatiales (CNES), France as a commercially oriented earth observation program.
The first satellite of the mission, SPOT-1 was launched in February, 1986. This was the first
earth observation satellite that used a linear array of sensors and the push broom scanning
techniques. Also these were the first system to have pointable/steerable optics, enabling side-
to-side off-nadir viewing capabilities.
Fig.4 shows the timeline of various missions in the SPOT satellite program.
The most recent satellite in the SPOT program, SPOT 6, was launched in September 2012.
SPOT 1, 2 and 3 carried two identical High Resolution Visible (HRV) imaging systems. Each
HRV was capable of operating either in the panchromatic mode or in the multispectral (MSS) mode.
HRVs used along-track, push-broom scanning methods. Each HRV contained four CCD sub-
arrays. A 6000-element sub-array was used for recording in the panchromatic mode and the
remaining 3 arrays, each with 3000 elements, were used for the MSS mode. Due to the off-
nadir viewing capability, HRV was also used for stereoscopic imaging. Frequency with
which the stereoscopic coverage can be obtained varies with the latitude; more frequent
imaging is possible near the polar region compared to the equatorial region.
SPOT 4 carried the High Resolution Visible and Infrared (HRVIR) sensor and the vegetation
instrument (VI). HRVIR also includes two identical sensors, both together capable of giving
120km swath width at nadir.
SPOT-5 carries two high resolution geometric (HRG) instruments, a single high resolution
stereoscopic (HRS) instrument, and a vegetation instrument (VI). Details of the sensors used
in various SPOT 4 and 5 missions are summarized in Table 4.
SPOT-6 mission employs two New AstroSat Optical Modular Instruments (NAOMI). The
instrument operates in 5 spectral bands, including one panchromatic band. Details of these
bands are given in Table 5.
Table 6 gives the details of various SPOT missions. Mission period, orbit characteristics,
sensors employed, and the resolution details are given in the table.
Spatial and radiometric resolution (Table 6, continued):
SPOT 1-3 (HRV): PAN 10 m, MSS 20 m; 8-bit
SPOT 4 (HRVIR): PAN 10 m, MSS (B1-B4) 20 m; 8-bit
SPOT 4 (Vegetation): 1000 m; 10-bit
SPOT 5 (HRG): PAN 2.5-5 m, MSS 10 m (B4: 20 m); 8-bit
SPOT 5 (Vegetation): 1000 m; 10-bit
SPOT 6 (NAOMI): PAN 2 m, MSS 8 m; 12-bit
Pointable optics used in the program enables off-nadir viewing. This increases the frequency
of viewing viz., 7 additional viewings at equator and 11 additional viewings at 45deg latitude.
Due to the off-nadir viewing capabilities, stereo imaging is also possible. Stereo pairs, used
for relief perception and elevation plotting (Digital Elevation Modelling), are formed from
two SPOT images.
Indian Remote Sensing (IRS) satellite system is one of the largest civilian remote sensing
satellite constellations in the world used for earth observation. Objective of the program is to
provide a long-term space-borne operational capability for the observation and management
of the natural resources. IRS satellite data have been widely used in studies related to
agriculture, hydrology, geology, drought and flood monitoring, marine studies and land use
analyses.
The first satellite of the program, IRS-1A, was launched in 1988. IRS satellites orbit the Earth
in sun-synchronous, near-polar orbits at low altitudes. The various missions of the IRS
program employ different sensors, viz., LISS-1, LISS-2, LISS-3, LISS-4, WiFS, AWiFS etc.
Spectral bands used in various sensors of the IRS satellite program are given in Table 7.
Details of various satellite missions of the IRS program, including the mission period, orbit
characteristics, sensors and resolutions are given in Table 8.
Sensors, bands, spatial resolution and radiometric resolution (Table 8, continued):
IRS-1A/1B: LISS-1 (bands B1-B4, 72.5 m), LISS-2A and 2B (bands B1-B4, 36.25 m); 7-bit
IRS-1C/1D: PAN (5.8 m), LISS-3 (bands B1-B4, 23 m; B4: 70 m), WiFS (bands B1-B2, 188 m); 7-bit
Resourcesat-1 (IRS-P6): LISS-3 (bands B1-B4, 23.5 m), LISS-4 (bands B1-B3, 5.8 m), AWiFS (bands B1-B4, 56 m); LISS-3 and 4: 7-bit, AWiFS: 10-bit
Cartosat-2: PAN camera (0.5-0.85 μm, 0.81 m); 10-bit
Resourcesat-2: LISS-3 (bands B1-B4, 23.5 m), LISS-4 (bands B1-B3, 5.8 m), AWiFS (bands B1-B4, 56 m); LISS-3 and 4: 10-bit, AWiFS: 12-bit
Fig. 5. IRS-P6 LISS-IV multispectral mode image shows the centre of Marseille, France, in
natural colours
IKONOS
IKONOS is a commercial high resolution system operated by GeoEye. The satellite was
launched in September 1999.
IKONOS employs linear array technology and collects data in four multispectral bands and
one panchromatic band. The panchromatic images give better than 1 m spatial resolution,
whereas the multispectral bands give nearly 4 m spatial resolution. IKONOS was the first
successful commercial satellite to collect sub-metre resolution images.
Imagery from the panchromatic and multispectral sensors can be merged to create 0.82-meter
color imagery (pan-sharpened).
Fig. 9 IKONOS image of the Denver Broncos Stadium, Denver, Colorado, USA
Details of the satellite orbit and the sensors of the IKONOS program are given in Table 9.
Table 9. Details of the satellite orbit and the sensors of the IKONOS program
Satellite IKONOS
Launch date Sep, 1999
Orbit Sun-synchronous
Eq. crossing 10:30am
Altitude 682 km
Inclination 98.1 deg
Repeat cycle 11 days (more frequent imaging due to the off-nadir
viewing capabilities up to 45 deg)
Radiometric 11 bits
resolution
QuickBird
The payloads on board QuickBird include a panchromatic camera and a four-band
multispectral scanner. The QuickBird sensors use linear array detectors to achieve a spatial
resolution as fine as 0.61 m in the panchromatic mode and 2.4 m in the multispectral mode.
Details of the QuickBird orbit and sensors are given in Table 10.
Table 10. Details of the satellite orbit and the sensors of the QuickBird satellite
Satellite QuickBird
Launch date Oct, 2001
Orbit Sun-synchronous
Eq. crossing 10:00 am
Altitude 450 km
Inclination 98 deg
Revisit period Average revisit time is 1-3.5days depending upon the
latitude and the image collection angle
Radiometric 11 bits
resolution
Fig. 10. QuickBird (61cm) true colour image for a small region in Nigeria
(Source: www.satimagingcorp.com)
7. Geo-stationary satellites
INSAT Program
The Indian National Satellite (INSAT) system is one of the largest domestic communication
systems in the Asia-Pacific region. Communication satellites of the INSAT program are
placed in Geo-stationary orbits at approximately 36,000 km altitude. The program was
established with the commissioning of INSAT-1B in 1983. The INSAT space segment
consists of 24 satellites, of which 10 are in service (INSAT-3A, INSAT-4B, INSAT-3C,
INSAT-3E, KALPANA-1, INSAT-4A, INSAT-4CR, GSAT-8, GSAT-12 and GSAT-10).
GSAT-10
The most recent satellite in the INSAT program, GSAT-10, was launched in September 2012.
The satellite occupies a geostationary orbit located at 83° E longitude. The mission is intended
for communication and navigation purposes.
KALPANA-1
Another satellite in the INSAT program, KALPANA-1, launched in September 2002, is the
first satellite launched by ISRO exclusively for meteorological purposes. The satellite occupies
a geostationary orbit at an altitude of ~35,786 km, positioned at 74° E longitude. It
carries two pay loads: Very High Resolution Radiometer (VHRR) and Data Relay
Transponder (DRT). The satellite was originally named Metsat, and was renamed in 2003 in
the memory of astronaut Kalpana Chawla.
The VHRR onboard the KALPANA satellite operates in 3 bands: visible, thermal infrared
and water vapour infrared. The instrument provides an image every half hour. Figs. 12-14
show the KALPANA images obtained in the three bands of the VHRR.
Fig. 12. Images from the KALPANA satellite in the Visible spectral band
Fig. 13. Images from the KALPANA satellite in the Thermal infrared band
Fig. 14. Images from the KALPANA satellite in the Water vapor band
8. CARTOSAT
The Cartosat series of earth observation satellites is built and maintained by the Indian Space
Research Organisation (ISRO); to date, four Cartosat satellites have been launched. Cartosat-1
(IRS-P5) is a stereoscopic earth observation satellite. It carries two panchromatic (PAN) cameras that
take imageries of the earth in the visible region of the electromagnetic spectrum. The imaging
capabilities of Cartosat-1 include 2.5 m spatial resolution, 5 day temporal resolution and a 10-
bit radiometric resolution.
The second among the series is Cartosat-2 which also images earth using a PAN camera in
the visible region of the electromagnetic spectrum. The data obtained has high potential for
detailed mapping and other applications at cadastral level. The imaging capabilities of
Cartosat-2 are up to 1 m (100 cm) in spatial resolution. The third in the series was named
Cartosat-2A. This satellite is dedicated to the Indian armed forces and can be steered up to
45 degrees along as well as across the direction of movement to enable more frequent
imaging. Cartosat-2B, the fourth in the Cartosat series, was launched in July 2010. In addition
to its imaging capabilities, Cartosat-2B can be steered up to 26 degrees along as well as across
the direction of its movement to facilitate frequent imaging of an area. Cartosat-3 is the fifth
satellite in the series.
9. RADARSAT
RADARSAT is a series of Canadian earth observation satellites carrying synthetic aperture
radar (SAR) sensors, developed and operated by the Canadian Space Agency. Details
regarding the instrument characteristics and project status can be obtained from the following
link: www.asc-csa.gc.ca/eng/satellites/radarsat
GEOMETRIC CORRECTIONS
1. Introduction
The flux radiance registered by a remote sensing system ideally represents the radiant energy
leaving the surface of earth like vegetation, urban land, water bodies etc. Unfortunately, this
energy flux is mixed with errors, both internal and external, which exist as noise within
the data. The internal errors, also known as systematic errors, are created by the sensor itself
and hence are quite predictable. The external errors are largely due to perturbations of the
platform or to atmospheric and scene characteristics. Image preprocessing is the
technique used to correct this degradation/noise created in the image, thereby to produce a
corrected image which replicates the surface characteristics as closely as possible. The
transformation of a remotely sensed image, so that it possesses the scale and projection
properties of a given map projection, is called geometric correction or georeferencing. A
related technique, essential for georeferencing, is registration, which deals with fitting the
coordinate system of one image to that of a second image of the same area.
2. Systematic Errors
The sources of systematic errors in a remote sensing system are explained below:
a. Scan skew:
Caused when the ground swath is not normal and is skewed due to the forward motion of the
platform during the time of scan.
For example, image from Landsat satellites will usually be skewed with respect to the earth’s
N-S axis. The skew angle can be expressed as:
θ_skew = 90° − cos⁻¹ [ sin(θe) / cos(L) ]
where θe is the inclination of the satellite orbit at the equator and L is the latitude of the scene.
b. Platform velocity: Caused due to a change in speed of the platform, resulting in along-track scale distortion.
c. Earth rotation:
Caused due to the rotation of the earth during the scan period, resulting in along-scan
distortion (Figure 1). While a satellite scans along its orbital track, the earth rotates beneath it
with a surface velocity that depends on the latitude of the nadir (it is proportional to the cosine
of the latitude); this produces a progressive westward displacement of the scan lines, so that
the last scan line of the image is shifted relative to the first. The distortion can be corrected
provided we know the distance travelled by the satellite and its velocity.
For example, consider the case when the line joining centres of first and last scan lines to be
aligned along a line of longitude and not skewed (as observed in the case of SPOT, Terra and
all sun synchronous satellites). Landsat satellites take 103.27 minutes for one full revolution.
Expressing the distance and velocity in angular units, the satellite's angular velocity is
2π / (103.27 × 60) ≈ 1.01 × 10⁻³ radians/sec. If the angular distance moved by a Landsat
satellite during the capture of one image is 0.029 radians, the time required for the satellite to
traverse this distance is 0.029 / (1.01 × 10⁻³) ≈ 28.6 seconds.
Once we know this time difference, the aim is to determine the distance traversed by the
centre of the last scan line during this time (28.6 seconds). The earth takes 23 hours, 56
minutes and 4 seconds (86,164 seconds) to complete one full rotation; its angular velocity is
therefore 2π / 86164 ≈ 7.29 × 10⁻⁵ radians/sec. The corresponding surface velocity of the
earth is about 292.7 m/sec (at 51° N). Hence, the distance traversed by the last scan line is
292.7 × 28.6 ≈ 8371 m.
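The same arithmetic can be packaged as a short routine. The Python sketch below simply reproduces the worked example above; the 0.029 radian frame length and the 51° N latitude are the illustrative values used in the text.

```python
import math

def earth_rotation_offset_m(frame_angle_rad=0.029,
                            orbit_period_s=103.27 * 60,
                            latitude_deg=51.0,
                            earth_radius_m=6.378e6):
    """Eastward ground displacement accumulated while one frame is scanned."""
    sat_angular_vel = 2 * math.pi / orbit_period_s        # rad/s
    frame_time = frame_angle_rad / sat_angular_vel        # about 28.6 s
    earth_angular_vel = 2 * math.pi / 86164.0             # rad/s (sidereal day)
    surface_speed = earth_angular_vel * earth_radius_m * math.cos(math.radians(latitude_deg))
    return surface_speed * frame_time

print(f"{earth_rotation_offset_m():.1f} m")   # close to the 8371 m obtained above
```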
d. Mirror scan velocity: Caused when the rate of scanning is not constant resulting in
along scan geometric distortion.
e. Aspect ratio:
Sensors like MSS of Landsat produce images whose pixels are not square. The instantaneous
field of view of MSS is 79m, while the spacing between pixels along each scan line is 56m.
This results in the creation of pixels which are not square due to over sampling in the across
track direction. For image processing purposes, square/rectangular pixels are preferred.
Therefore we make use of a transformation matrix so that the final aspect ratio becomes 1:1.
In the case of MSS, the transformation matrix will be as given below:
A = [ 0.00   1.41 ]
    [ 1.00   0.00 ]
This is because the Landsat MSS scan lines are nominally 79 m apart, while the pixels along
each scan line are spaced 56 m apart. Since users generally prefer square pixels to rectangular
ones, either 79 m or 56 m can be chosen to remove the unequal scale in the x and y directions.
In the case of Landsat MSS, the across-scan direction is over-sampled, so it is more reasonable
to choose 79 m square pixels. The aspect ratio, the ratio of the x and y dimensions, is then
56:79, which amounts to 1:1.41.
A schematic showing systematic and non-systematic errors is presented in Fig. 2. The
sources of non-systematic errors are explained below:
a. Altitude: Caused due to departures of the sensor altitude from its nominal value, resulting in a change of scale.
b. Attitude:
Errors due to attitude variations can be attributed to the roll, pitch and yaw of the satellite. A
schematic showing the attitude distortions (roll, pitch and yaw) pertaining to an aircraft is depicted in Fig. 3.
Some of these errors can be corrected with knowledge of the platform ephemeris, ground
control points, sensor characteristics and spacecraft velocity.
1. Introduction
Remotely sensed images obtained raw from the satellites contain errors in the form of
systematic and non systematic geometric errors. Some of the errors can be rectified by having
additional information about satellite ephemeris, sensor characteristics etc. Some of these can
be corrected by using ground control points (GCP). These are well defined points on the
surface of the earth whose coordinates can be estimated easily on a map as well as on the
image.
2. Properties of GCP
For geometric rectification of image from a map or from another registered image, selection
of GCP is the prime step. Hence, proper caution must be maintained while choosing the
points. Some of the properties which the GCP should possess is outlined below:
a. They should represent prominent features which are not likely to change over a long
duration of time. For example, a highway intersection or the corner of a steel bridge is a more
appropriate GCP than a tree or a meandering stretch of a river. Permanent features of this kind
make it easy to identify the same points on the image and on the map even after a long time.
b. GCPs should be well distributed. Rather than concentrating points close to each other,
points spread farther apart should be given priority. This makes the selected points
representative of the whole area, which is essential for proper geometric registration. This
aspect is discussed further in sections 3 and 4.
c. An optimum number of GCPs should be selected depending on the area to be
represented. The greater the number of carefully selected, well separated points, the higher
the accuracy of the registration.
Figure 1. (a) Insufficient distribution of GCP (b) Poor distribution of GCP (c) Well
distributed GCP
The GCP when selected from a map leads to image to map rectification whereas that chosen
from an image results in image to image rectification.
3. Geometric rectification
This process enables affixing projection details of a map/image onto an image to make the
image planimetric in nature. The image to be rectified, represented by means of pixels
arranged in rows and columns can be considered equivalent to a matrix of digital number
(DN) values accessed by means of their row and column numbers (R,C). Similarly, the
map/image (correct) coordinates of a same point can be represented by their geolocation
information (X, Y). The relationship between (R, C) and (X, Y) needs to be established so
that each pixel in the image can be properly positioned in the rectified output image. Let F1
and F2 be the coordinate transformation functions used to interrelate the geometrically
correct map coordinates and the distorted image coordinates, with (R, C) = distorted image
coordinates and (X, Y) = correct map coordinates, so that
R = F1(X, Y) and C = F2(X, Y)
The concept is similar to overlaying an empty array of geometrically correct cells over the
original distorted cells of the unrectified image and then filling in the value of each empty cell
using the values of the distorted image. Usually, the transformation functions used are
polynomial. The unrectified image is tied down to a map or a rectified image using a selected
number of GCPs and then the polynomials are calculated. The unrectified image is then
transformed using the polynomial equations.
To illustrate this method, let us assume that a sample of x and y values are available wherein
x and y are any two variables of interest such as the row number of an image pixel and the
easting coordinate of the same point on the corresponding map. The method of least squares
enables the estimation of the value of x given the corresponding value of y using a function
of the form
x_i = a_0 + a_1 y_i + e_i
where e_i is the residual error. Here, as only a single predictor variable (y) is used, the
expression is termed a univariate least squares equation. The number of predictor variables
can be greater than one. It should
be noted that a first order function can only accomplish scaling, rotation, shearing and
reflection but they will not account for warping effects. A higher order function can
efficiently model such distortions though in practice, polynomials of order greater than three
are rarely used for medium resolution satellite imagery.
3.1 Co-Registration
Errors generated due to satellite attitude variations like roll, pitch and yaw will generally be
unsystematic in nature that are removed best by identifying GCPs in the original imagery and
on the reference map followed by mathematical modeling of the geometric distortion present.
Rectification of image to map requires that the polynomial equations are fit to the GCP data
using least squares criteria in order to model the corrections directly in the image domain
without identifying the source of distortion. Depending on the image distortion, the order of
polynomial equations, degree of topographic relief displacement etc will vary. In general, for
moderate distortions in a relatively small area of an image, a first order, six parameter affine
transformation is sufficient to rectify the imagery. This is capable of modeling six kinds of
distortion in the remote sensor data, which, combined into a single expression, becomes:
x' = a0 + a1 x + a2 y
y' = b0 + b1 x + b2 y
Here, (x, y) denotes the position in the output rectified image or map and (x', y') denotes the
corresponding position in the original input image.
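A minimal sketch of estimating the six affine parameters by least squares is given below (Python/NumPy, with made-up GCP coordinates). Each GCP contributes one row [1, x, y] of the design matrix, and the coefficient sets (a0, a1, a2) and (b0, b1, b2) are solved independently for x' and y'.

```python
import numpy as np

def fit_affine(map_xy, image_xy):
    """Least-squares estimate of the six affine parameters mapping
    map coordinates (x, y) to image coordinates (x', y')."""
    map_xy = np.asarray(map_xy, dtype=float)
    image_xy = np.asarray(image_xy, dtype=float)
    A = np.column_stack([np.ones(len(map_xy)), map_xy[:, 0], map_xy[:, 1]])
    coeff_x, *_ = np.linalg.lstsq(A, image_xy[:, 0], rcond=None)   # a0, a1, a2
    coeff_y, *_ = np.linalg.lstsq(A, image_xy[:, 1], rcond=None)   # b0, b1, b2
    return coeff_x, coeff_y

# Illustrative GCPs: (easting, northing) on the map vs (column, row) on the image
map_xy   = [(1000, 2000), (1500, 2000), (1000, 2600), (1600, 2700)]
image_xy = [(10.0, 80.0), (35.2, 78.5), (12.1, 49.7), (41.0, 43.0)]

a, b = fit_affine(map_xy, image_xy)
print("x' coefficients:", np.round(a, 4))
print("y' coefficients:", np.round(b, 4))
```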
Resampling is the process used to estimate the pixel value (DN) used to fill the empty grid
cell from the original distorted image. A number of techniques are available for resampling
like nearest neighbour, bilinear and cubic interpolation. In the nearest neighbour technique,
the empty grid cell is assigned the DN of the nearest pixel in the original, distorted image.
This offers computational simplicity and does not alter the original DN values. However, it
suffers from the disadvantage of offsetting pixel values spatially, causing a rather disjointed
appearance in the rectified image. The bilinear interpolation technique takes a weighted
average of the nearest four pixel values (DNs). As this process is the two-dimensional
equivalent of linear interpolation, it is called 'bilinear'. The resulting image looks smoother,
at the cost of altering the original DN values. Cubic interpolation is an improved
version of bilinear resampling technique, where the 16 pixels surrounding a particular pixel
are analyzed to come out with a synthetic DN. Cubic resampling also tends to distort the
resampled image. In image processing studies, where the DN stands to represent the spectral
radiance emanating from the region encompassing field of view of sensor, alteration of DN
values may result in problems in spectral pattern analysis studies. This is the main reason
why image classification techniques are performed before image resampling.
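A minimal sketch of the two simplest resampling rules is given below, assuming a small hypothetical band and a non-integer position (xp, yp) obtained from a transformation such as the affine mapping above; a production implementation would vectorize this over the whole output grid.

```python
import numpy as np

def nearest_neighbour(img, xp, yp):
    """Assign the DN of the nearest original pixel (no DN alteration)."""
    r = int(round(yp)); c = int(round(xp))
    r = min(max(r, 0), img.shape[0] - 1)
    c = min(max(c, 0), img.shape[1] - 1)
    return img[r, c]

def bilinear(img, xp, yp):
    """Distance-weighted average of the four surrounding DNs."""
    c0, r0 = int(np.floor(xp)), int(np.floor(yp))
    c1, r1 = min(c0 + 1, img.shape[1] - 1), min(r0 + 1, img.shape[0] - 1)
    dx, dy = xp - c0, yp - r0
    top = (1 - dx) * img[r0, c0] + dx * img[r0, c1]
    bot = (1 - dx) * img[r1, c0] + dx * img[r1, c1]
    return (1 - dy) * top + dy * bot

band = np.array([[10, 20, 30],
                 [40, 50, 60],
                 [70, 80, 90]], dtype=float)
print(nearest_neighbour(band, 1.3, 0.8))   # -> 50.0 (unaltered DN of pixel (1, 1))
print(bilinear(band, 1.3, 0.8))            # -> 47.0 (weighted average, a new DN)
```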
ATMOSPHERIC CORRECTIONS
1. Introduction
The energy registered by the sensor will not be exactly equal to that emitted or reflected from the terrain surface, owing to radiometric and geometric errors; these are the commonly encountered errors that alter the original data. Of these, geometric error types and their methods of correction have been discussed in the previous lecture. Radiometric errors can be sensor driven or due to atmospheric attenuation. Before analysis of remote sensing images, it is essential that these error types are identified and removed to avoid error propagation.
3. Atmospheric Corrections
The DN measured or registered by a sensor is composed of two components: one is the actual radiance of the pixel, which we wish to record, and the other is the atmospheric component. The magnitude of the radiance leaving the ground is attenuated by atmospheric absorption, and its directional properties are altered by scattering. Other sources of error are the varying illumination geometry, which depends on the Sun's azimuth and elevation angles and on the ground terrain. As atmospheric properties vary from time to time, it becomes essential to correct the radiance values for atmospheric effects. However, because the atmosphere is a highly dynamic and complex system, it is practically not possible to fully model the interactions between the atmosphere and electromagnetic radiation. Fig. 4 shows schematically the DN measured by a remote sensing sensor. Nevertheless, the relationship between the radiance received at the sensor and the radiance leaving the ground can be summarized in the form of the following relation:
LS = T ρ D + LP
where D is the total downwelling radiance, ρ is the target reflectance and T is the atmospheric transmittance. LP is the atmospheric path radiance and LS is the total radiance received at the sensor. The second term represents scattered path radiance, which introduces "haze" in the imagery.
The regression method is applied by plotting the pixel values of, say, the near infrared (NIR) band against the values of the other bands. A best fit line is then fitted to represent the relationship; its offset (intercept) provides an estimate of the atmospheric path radiance.
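The sketch below illustrates the idea on synthetic data in which a constant additive "haze" of 12 DN has been imposed on one band; the intercept of the fitted line recovers an estimate of the path radiance, which can then be subtracted. The data and the haze value are assumptions for the illustration only.

```python
import numpy as np

# Hypothetical co-registered bands flattened to 1-D pixel arrays.
rng = np.random.default_rng(0)
nir = rng.uniform(20, 180, size=10_000)            # reference (NIR) band
red = 0.6 * nir + 12.0 + rng.normal(0, 3, 10_000)  # 12 DN of additive haze

# Fit red = slope * nir + intercept; the intercept estimates the path radiance.
slope, intercept = np.polyfit(nir, red, deg=1)
print(f"estimated haze (path radiance) in red band: {intercept:.1f} DN")

# Subtract the haze estimate from the affected band (clip at zero).
red_corrected = np.clip(red - intercept, 0, None)
```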
Assuming a constant distance between the Earth and the Sun and a constant amount of solar energy illuminating the Earth, the magnitude of irradiance that reaches a pixel on a slope is directly proportional to the cosine of the incidence angle. This can be written as:
LH = LT × (cos θ0 / cos i)
where LH is the equivalent radiance for a horizontal surface, LT is the radiance observed over the sloping terrain, θ0 is the solar zenith angle and i is the incidence angle.
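A minimal sketch of this cosine correction, assuming the per-pixel incidence angles have already been computed from a DEM and the solar geometry, is given below.

```python
import numpy as np

def cosine_correction(L_T, solar_zenith_deg, incidence_deg):
    """Normalize terrain radiance L_T to an equivalent horizontal surface."""
    cos_i = np.cos(np.radians(incidence_deg))
    cos_0 = np.cos(np.radians(solar_zenith_deg))
    # Guard against grazing incidence, where the correction factor blows up.
    cos_i = np.clip(cos_i, 0.1, None)
    return L_T * cos_0 / cos_i

L_T = np.array([[55.0, 60.0], [48.0, 52.0]])        # observed radiance (stand-in)
incidence = np.array([[70.0, 30.0], [45.0, 60.0]])  # per-pixel incidence angle (deg)
print(cosine_correction(L_T, solar_zenith_deg=40.0, incidence_deg=incidence))
```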
CONCEPTS OF COLOR
1. Introduction
The field of digital image processing relies on mathematical and probabilistic formulations
accompanied by human intuition and analysis based on visual and subjective judgements. As
such, among the various elements of image interpretation, color plays a crucial role in
identifying and extracting objects from an image. Color image processing can be broadly classified into two categories: full color and pseudo color processing. Full color images are usually acquired by a color TV camera or color scanner, whereas pseudo color images are created by assigning colors to a gray scale image of monochrome intensity. A beam of light
passing through a glass prism branches into a continuous spectrum of light ranging from
violet, blue, green, yellow, orange to red. The colors that all living beings perceive in an
object are basically due to the nature of light reflected from the object. Within the
electromagnetic spectrum, visible light is composed of a very narrow band of frequencies.
Achromatic light (i.e., light devoid of color) can be characterized by its scalar measure of intensity, ranging from black through gray to white. On the other hand, chromatic light can be described using radiance, luminance and brightness. Radiance, measured in watts, refers to the total amount of energy that flows from a light source. Of this, the amount of light energy perceived by an observer is termed luminance, measured in lumens (lm).
Brightness is one of the essential factors in describing color sensation. This module will
discuss the fundamentals of color space.
2. Fundamentals
In the human eye, the sensors responsible for color vision are termed as cones. Studies have
established that, of the 6-7 million cones of the human eye, approximately 65% are sensitive
to red light, 33% to green light and 2% to blue. Hence, colors to the human eye will primarily
be variable combinations of the primary colors of red, green and blue. The word primary
instills the impression that the three primary colors when mixed in varying proportions will
produce all visible colors. This interpretation will be proved wrong by the end of this chapter.
The best example of the additive nature of light colors is provided by a color television. The interior of a color TV tube comprises numerous arrays of triangular dot patterns of electron-sensitive phosphor. When excited, each dot within a triad produces light in one of the three primary colors. The intensity of the light is modulated by electron guns inside the tube that generate pulses corresponding to the red energy seen by the TV camera; the same function is performed for the other two colors as well. The final effect seen on the TV is the addition of the three primary colors from each triad, perceived by the cones of the eye as a full color image. The illusion of a continuously moving image is created by thirty such successive image changes per second.
Colors are distinguished from one another using the characteristics of hue, saturation and brightness. Hue refers to the dominant wavelength in a mixture of light waves and hence stands for the dominant color as perceived by an observer. When we describe an object as blue or yellow, we are actually referring to its hue. Saturation refers to the amount of white light mixed with the hue. Together, hue and saturation are termed chromaticity. If X, Y and Z are the amounts (intensities) of red, green and blue needed to generate a given color, then the color can be specified by the following terms:
x = X / (X + Y + Z);   y = Y / (X + Y + Z);   z = Z / (X + Y + Z)
where X, Y and Z are termed tristimulus values. Curves or tables compiled from experimental results (Poynton, 1996; Walsh, 1958; Kiver, 1965) can be used to estimate the tristimulus values required to generate the color corresponding to any wavelength of light in the visible spectrum.
Colors can also be specified using chromaticity diagrams, which show color composition as a function of x (red) and y (green), with z = 1 − (x + y). In the chromaticity diagram (Figure 1), the
pure colors from violet to red will usually be indicated along the boundary of the diagram
while the colors within the diagram will represent a mixture of spectrum colors. Within the
diagram, a straight line joining any two pure colors (of the boundary) will define all the
different color variations which can be generated using these two colors. Similarly, to
determine the range of colors which can be generated using any three given colors from the
chromaticity diagram, connecting lines need to be drawn to each of the three color points.
The resulting triangular region will enclose all the colors that can be produced by varying
combinations of the three corner colors.
3. Color Space
A color space/color system specifies a coordinate system within which each color can be
represented by a single point. The most commonly used models are the RGB (red, green,
blue), CMY (cyan, magenta, yellow), CMYK (cyan, magenta, yellow, black), IHS (intensity,
hue, saturation). The RGB model is usually used in video cameras and displays, the CMY/CMYK models for color printing, and the IHS model resembles the way humans interpret color. These models are briefly described here.
2.1 RGB
In the RGB model, the primary colors of red, green and blue are used within a Cartesian
coordinate system. The RGB color model is shown in Figure 2 where the primary colors of
red, blue and green represent the three corners with black at the origin and cyan, magenta and
yellow representing the other three corners of the cube. The cube shown is a unit cube with
the underlying assumption that all color values have been normalized. Pixel depth is the name given to the number of bits used to represent each pixel within an RGB space. A 24-bit RGB image can therefore represent (2^8)^3 = 16,777,216 colors.
Cyan, magenta and yellow comprise the secondary colors of light. A cyan color tends to
subtract red from reflected white light. CMY to RGB conversion can be performed using the
relation:
C = 1 − R,   M = 1 − G,   Y = 1 − B
The CMY color space (Figure 3) is usually used to generate hardcopy output. In practice, a combination of cyan, magenta and yellow produces a faint, muddy black rather than the pure black it should theoretically produce. Hence, in order to produce pure black, a fourth color, black, is added, resulting in the CMYK color model. Four-color printing refers to the CMY color model used along with black.
The RGB and CMY models fail to describe colors in terms that are of practical interest to humans. Humans describe the color of an object in terms of its hue, saturation and brightness. The IHS color model presents the intensity (I), hue (H) and saturation (S) within a color image and hence is suitable for algorithms based on color descriptions that are more intuitive to humans. Within the IHS solid, the intensity axis represents variations in brightness (0 for black to 255 for white). Hue represents the dominant wavelength of color (0 at the midpoint of the red tones, increasing anticlockwise to 255). Saturation represents the purity of color and ranges from 0 at the center of the solid to 255 at the circumference. A saturation of 0 represents a completely impure color, in which all wavelengths are equally represented (grey tones).
Figure 4: The IHS model which shows the IHS solid on the left and the IHS triangle on the
right formed by taking a slice through the IHS solid at a particular intensity.
[Source : http://cpsc.ualr.edu]
As shown in the figure, the corners of the equilateral triangle are located at the position of the
red, green and blue hue. The vertical axis represents intensity ranging from black (0) to white
(255); no color is associated with this axis. The circumference represents hue, i.e., the dominant wavelength of color, which commences at 0 at the midpoint of the red tones and increases in a counterclockwise direction around the triangle from red to green to blue and back to red. Saturation denotes the purity of color, ranging from 0 at the centre of the color solid to a maximum at the circumference. High values of saturation represent purer, more intense colors, while intermediate values correspond to pastel shades. The IHS system is described in detail in Buchanan (1979).
Bands of sensor data combined in the RGB system often result in color images that lack saturation, even after contrast stretching, due to the high degree of correlation between the spectral bands. To correct this, a technique that enhances the saturation is used. The conversion from the RGB model to the IHS model can be obtained using the relations:
H = θ          if B ≤ G
H = 360° − θ   if B > G
where
θ = cos⁻¹ { ½ [(R − G) + (R − B)] / [(R − G)² + (R − B)(G − B)]^(1/2) }
S = 1 − [3 / (R + G + B)] min(R, G, B)
I = (R + G + B) / 3
The intensity (I) image will generally have high values for sunlit slopes, low values for water and intermediate values for vegetation and rocks. In the hue (H) image, vegetation will have intermediate to light tones. In the saturation (S) image, only the shadows and rivers will be bright, indicating high saturation for these features. To enhance the clarity of a remotely sensed image, contrast stretching can be performed on the original saturation image. This essentially improves the overall discrimination between different terrain types owing to the increased brightness of the enhanced image. The IHS images can then be transformed back into the RGB system and used to prepare a new color composite image, which will be a significant improvement over its original counterpart. The IHS transformation and its inverse are useful for combining images of different types.
The stretched IHS representation can be subjected to the inverse transform to convert back to RGB coordinates. The color transformation method can also be used to combine images from different sources. For example, images with fine spatial resolution, such as those from the Landsat ETM+ and SPOT HRV sensors, can be combined with lower resolution images. Such transformations find wide use in geological applications. Further details regarding the IHS transformation can be obtained from Blom and Daily (1982), Foley et al. (1990), Mulder (1980) and Massonet (1993).
The inverse transformation, giving the RGB values from given values of I, H and S, depends on the range in which H lies. For the range 0° ≤ H < 120°, the R, G and B values are obtained as:
B = I (1 − S)
R = I [1 + S cos H / cos(60° − H)]
G = 3I − (R + B)
Similar expressions apply for the other two 120° sectors of hue.
A color composite of a multispectral image is usually formed by assigning three of its bands to the primary colors red, green and blue. When the colors of the resulting composite image closely resemble what would be observed by the human eye, the result is a true color composite image: the displayed image essentially resembles a visible color photograph, with vegetation appearing green, water blue, soil grey, etc. Such images are known as true color (or "natural color") images. For example, for a Landsat Thematic Mapper image, red can be assigned to band 3, green to band 2 and blue to band 1 to create a true color composite. The color assignment can, however, be made arbitrarily for any band of a multispectral image. When the color of the resulting image has no resemblance to the actual color of the scene, the result is a false color composite (FCC). The principal use of such images is to aid human visualization and interpretation of gray scale events in an image or a sequence of images. Examples of FCCs from Landsat are illustrated in the figures below. In TM images, healthy vegetation usually shows up in shades of red because photosynthesizing vegetation absorbs most of the green and red incident energy but reflects approximately half of the incident near-infrared energy. Urban areas may appear steel grey as they reflect approximately equal proportions of near-infrared, red and green energy. Moist wetlands appear in shades of greenish brown.
(a) (b)
Figure 5 : (a) True color composite (band 321) (b) False color composite (band 543)
(a) (b)
Figure 6 : (a) True color composite (band 321) (b) Standard false color composite (bands 432)
(a) (b)
Figure 7 : (a) False color composite (band 543) (b) Temperature image
It should be noted that TM data can be used to generate many different 3-band color composites. Chavez et al. (1984) proposed an index known as the optimum index factor (OIF), which can be used to select the best combination of bands. OIF ranks all the three-band combinations that can possibly be generated from the six bands of TM data. OIF is given by the expression:
OIF = Σk=1..3 Sk / Σj=1..3 |Rj|
Here, Sk denotes the standard deviation of band k and Rj is the absolute value of the correlation coefficient between any two of the three bands being evaluated. OIF is based on the total variance and the correlation within and between the various band combinations. The combination having the largest value of OIF is generally treated as having the highest amount of information (as measured by variance) with the least amount of duplication (as measured by correlation).
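The sketch below computes the OIF for every possible three-band combination of a hypothetical six-band stack and ranks them; the synthetic data are assumptions for the illustration only.

```python
import numpy as np
from itertools import combinations

def oif(bands):
    """OIF for one 3-band combination: sum of std devs / sum of |correlations|."""
    flat = [b.ravel().astype(float) for b in bands]
    std_sum = sum(np.std(b) for b in flat)
    corr_sum = sum(abs(np.corrcoef(a, b)[0, 1])
                   for a, b in combinations(flat, 2))
    return std_sum / corr_sum

# Hypothetical six-band stack with shape (6, rows, cols)
rng = np.random.default_rng(1)
stack = rng.normal(100, 20, size=(6, 50, 50))

# Rank every possible 3-band combination by OIF (largest first)
ranking = sorted(((oif(stack[list(c)]), c) for c in combinations(range(6), 3)),
                 reverse=True)
print("best band combination:", ranking[0][1])
```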
2. Chavez, P. S., Guptill, S. C. and Bowell, J. A., 1984. Image Processing Techniques for Thematic Mapper Data. Proceedings, ASPRS-ACSM Technical Papers, 2: 728-742.
4. Poynton, C. A., 1996. A Technical Introduction to Digital Video, John Wiley & Sons, New York.
CONTRAST STRETCHING
1. Introduction
Image enhancement is used to enhance the information available from an image for any
specific purpose. The enhancement may be done either in the spatial domain or in the
frequency domain. The frequency domain enhancement is based on the modification of the
Fourier transformation of the original image. On the other hand, spatial domain enhancement
involves manipulation of the pixels in an image from the image plane itself.
This lecture explains the basics of image enhancement in the spatial domain.
2. Contrast stretching
Contrast stretching is used to increase the dynamic range of the gray levels in an image. For example, in an 8-bit system the display can show a maximum of 256 gray levels. If the gray levels in the recorded image span only a narrower range, the image can be enhanced by expanding this range to a wider one. This process is called contrast stretching. The resulting image displays enhanced contrast between the features of interest.
For example, Fig. 1(a) shows an unstretched Landsat TM Band-5 image; the original image appears dull owing to its poor contrast. Fig. 1(b) shows the contrast-stretched image.
Fig. 1 Landsat TM Band-5 image before and after the contrast stretching
The transformation functions used to convert the values of the original image into
corresponding values in the output image may be linear or non-linear functions.
When the values in the original image are expanded uniformly to fill the total range of the
output device, the transformation is called linear contrast stretching. If DN is the Digital
Number of the pixel, DNst is the corresponding DN in the enhanced output image, DNmax and
DNmin are the maximum and minimum DN values in the original image, the linear contrast
stretching can be graphically represented as shown below.
For example, for an 8-bit display system, the linear contrast stretching transformation can be written as
DNst = [(DN − DNmin) / (DNmax − DNmin)] × 255    (1)
where DN values in the range DNmin-DNmax are rescaled to the range 0-255 in the output image.
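A minimal sketch of Eq. (1), assuming an 8-bit output range and a synthetic band whose DN values span 60-158, is given below.

```python
import numpy as np

def linear_stretch(dn, out_max=255):
    """Rescale the DNmin-DNmax range of the input to 0-out_max (Eq. 1)."""
    dn = dn.astype(float)
    dn_min, dn_max = dn.min(), dn.max()
    stretched = (dn - dn_min) / (dn_max - dn_min) * out_max
    return np.round(stretched).astype(np.uint8)

band = np.random.default_rng(2).integers(60, 159, size=(400, 400))
stretched = linear_stretch(band)
print(band.min(), band.max(), "->", stretched.min(), stretched.max())  # ... -> 0 255
```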
Fig. 3(a) shows histogram of the Landsat TM band-5 image, wherein the DN values range
from 60 to 158. Therefore in the display of the original image, display in the ranges 0-59 and
159-255 are not utilized, giving a low contrast image as shown in Fig. 1(a). Histogram of the
enhanced image is shown in Fig. 3(b), wherein the values are expanded to fill the entire range
0-255, giving better contrast. The enhanced output image is shown in Fig. 1(b).
Fig. 3 Histograms of the Landsat TM Band-5 image before and after contrast stretching
In the contrast stretched image the light tone areas appear lighter and the dark tone areas
appear darker. The variation in the input data, now being displayed in a wider range, thus
becomes easily differentiable.
From the histogram of the original image it can be observed that, although the DN values range from 60 to 158, the number of pixels having DN values in the range 60-90 is very small. Nevertheless, in linear stretching an equal number of display levels is assigned to this range. Consequently, not many display levels remain available for the higher, more frequent values. In other words, the number of display levels assigned to different DN ranges is not in proportion to the number of pixels having DN values in those ranges. Non-linear contrast stretching is used to solve this problem.
In non-linear stretching, the DN values are not stretched linearly to uniformly occupy the
entire display range. Different non-linear contrast stretching methods are available. Some of
them are the following.
Histogram-equalized stretch
In histogram-equalized stretch the DN values are enhanced based on their frequency in the
original image. Thus, DN values corresponding to the peaks of the histogram are assigned to
a wider range. Fig.4 compares the histogram of a raw image with that of the images enhanced
using linear stretching and histogram-equalized stretching.
Fig. 4. Histograms of (a) unstretched image (b) linear contrast stretched image and (c) histogram-equalized image
Fig.5 (a) shows a sample histogram of an image and Fig.5 (b) shows the corresponding
histogram-equalization stretch function. Input DN values corresponding to the peak of the
histogram are stretched to a wider range as shown in the figure.
Fig.5 (a) Sample histogram of an image and (b) Function used for histogram equalized stretch
For example, for an 8-bit display system, the histogram equalization function used for stretching the input image can be represented as
DNst = 255 × Σj=0..k (nj / N)    (2)
where nj is the number of pixels having DN values in the jth range, k is the index of the DN range containing the pixel being transformed (so the summation is the cumulative relative frequency up to that DN value), and N is the total number of pixels in the input image.
By assigning more display levels to the higher frequency region that contains majority of the
information, better information enhancement is possible using histogram-equalized stretch.
(a)
(b)
Fig. 6 Landsat ETM+ Band 5 images and the corresponding histograms (a) before contrast
stretch and (b) after histogram equalization stretch
In piece-wise linear stretch, different linear functions are used for enhancing the DN values in
different ranges within the same image. In other words, different parts of the histogram are
stretched by different amounts. It is generally useful in cases where the original image has bi-
modal histogram.
Fig. 7. A sample bi-modal histogram, piece wise linear function used for the contrast
stretching and the histogram after piece wise contrast stretch
Using the piece-wise linear stretch function, region between the two modes of the histogram
may be compressed, whereas the regions corresponding to the histogram peaks may be
enhanced as shown in Fig.7. It is also used to enhance any special features in the image.
In logarithmic stretching, curves having the shape of the logarithmic function are used for rescaling the original DN levels into the wider output range, as shown in Fig. 8. The transformation takes the general form
DNst = c log(1 + DN)
where c is a constant.
As shown in Fig. 8, in logarithmic stretching, smaller values are stretched to a wider range,
whereas narrower output range is used for higher values. This type of stretching is generally
used to enhance the information contained in the dark pixels, during which process the
information contained in the lighter pixels are compressed.
Application of a power law executes the stretching in the opposite way. Power-law contrast stretching generally uses the form
DNst = c (DN)^γ
where c and γ are positive constants, with γ greater than 1.
Fig. 9 shows the sample power function for contrast stretching. While using the power
functions, higher values are expanded to a wider range. This enhances information contained
in the higher DN values, whereas the lower DN values are compressed.
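Minimal sketches of both non-linear stretches, assuming 8-bit data and with the constants chosen simply to keep the output in the 0-255 range, are given below.

```python
import numpy as np

def log_stretch(dn):
    """Expand dark pixels: DN_st = c * log(1 + DN), scaled to 0-255."""
    c = 255.0 / np.log(1.0 + dn.max())
    return np.round(c * np.log1p(dn.astype(float))).astype(np.uint8)

def power_stretch(dn, gamma=2.0):
    """Expand bright pixels: DN_st = c * DN**gamma, scaled to 0-255."""
    dn = dn.astype(float) / dn.max()               # normalize to 0-1
    return np.round(255.0 * dn ** gamma).astype(np.uint8)

band = np.random.default_rng(4).integers(0, 256, size=(200, 200))
dark_enhanced = log_stretch(band)
bright_enhanced = power_stretch(band, gamma=3.0)
```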
In Gaussian contrast stretch, the DN values are modified in such a way that the stretched
brightness value distribution resembles a normal distribution. Fig. 10 shows the Landsat
ETM+ Band 5 image after applying the Gaussian contrast stretching. Original image is shown
in Fig. 6(a).
Fig.10. Landsat ETM+ Band 5 image after applying the Gaussian contrast stretching
Fig. 11 gives the schematic representation of all the above contrast stretching methods.
Histogram of the original image is shown in Fig. 11(a). The values are only in the range 60-
158. Therefore in an 8-bit display system, only the range 60-158 is used for the image display
resulting in poor contrast. Fig.11 (b) shows the linear stretching, wherein the range 60-158 is
equally transformed into the full range 0-255 using linear function. Fig.11 (c) shows the
schematic of the histogram equalization stretch. The range 60-108, having low frequency, is
transformed into a relatively narrower range 0-38, whereas the high frequency range 108-158
is transferred to a wider range 38-255. Fig.11 (d) shows special stretch wherein only the
range 60-92 is stretched to occupy the full display range. The remaining ranges are
compressed.
5. Look-up Table
In contrast stretching, DN values in the original image are transformed into another range and
the resulting new DN values (DNst) represent the enhanced image. If the transformation has
to be estimated for every pixel of an image using the transformation functions, the procedure
would involve significantly large amount of computation. For example, a 60 x 60 km
multispectral SPOT image would require new DNs to be calculated for 27 million pixels.
Assuming 500,000 arithmetic operations a second, this procedure would take nearly four
minutes.
On the other hand, since the transformation functions are well defined, it is possible to calculate DNst once for every possible DN value and present the results in tabular form, as shown in Fig. 12. Such a table is called a look-up table (LUT).
Fig. 12 (a) shows a sample linear transformation function used for contrast stretching. Fig. 12
(b) shows the LUT for this function.
Fig. 12 (a) A sample linear transformation function used for contrast stretching (b) LUT for
this function
LUT can be used to simplify the contrast stretching process. For any pixel in the original
image, the corresponding DNst may be obtained from the LUT, and thus the contrast
stretching procedure can be speeded up.
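A minimal sketch of the LUT approach, assuming the linear stretch of Eq. (1) with DNmin = 60 and DNmax = 158, is given below; the table is built once for all 256 possible input values and applied to the whole image by a single indexing operation.

```python
import numpy as np

# Build the LUT once: one output value for each of the 256 possible input DNs.
dn_min, dn_max = 60, 158
inputs = np.arange(256)
lut = np.clip((inputs - dn_min) / (dn_max - dn_min) * 255, 0, 255).astype(np.uint8)

# Apply the stretch to every pixel by indexing into the table.
band = np.random.default_rng(5).integers(60, 159, size=(3000, 3000), dtype=np.uint16)
stretched = lut[band]
```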
1. Introduction
Spatial feature manipulations are the processes which help to emphasize or deemphasize data
of various spatial frequencies. The term spatial frequency represents the tonal variations in
the images such that higher values indicate rough tonal variations whereas lower values
indicate smoother variations in the tone.
Spatial feature manipulations are generally local operations in which the pixel values of the original image are changed with respect to the gray levels of the neighboring pixels. They may be applied in either the spatial domain or the frequency domain. Filtering and edge enhancement are some of the commonly used local operations for image enhancement.
This lecture explains the mechanics of filtering and edge enhancement as applied to the
remote sensing satellite images.
2. Filtering Techniques
If a vertical or horizontal section is taken across a digital image and the image values are
plotted against distance, a complex curve is produced. An examination of this curve would
show sections where the gradients are low (corresponding to smooth tonal variations on the
image) and sections where the gradients are high (locations where the digital numbers change
by large amounts over short distances). Filtering is the process by which the tonal variations
in an image, in selected ranges or frequencies of the pixel values, are enhanced or suppressed.
Or in other words, filtering is the process that selectively enhances or suppresses particular
wavelengths or pixel DN values within an image.
Two widely used approaches to digitally filter images are convolution filtering in the spatial
domain and Fourier analysis in the frequency domain. This lecture explains the filtering
techniques with special reference only to the spatial domain.
A filter is a regular array or matrix of numbers which, using simple arithmetic operations,
allows the formation of a new image by assigning new pixel values depending on the results
of the arithmetic operations.
Consider the pixel having value e. A 3x3 window is considered in this case. The 8 neighbors
of the 3x3 window are marked in the figure. The figure also shows the corresponding 3x3
filter and the filter coefficients marked in it. The filter is applied to the neighborhood window
or the filter mask, and the modified pixel value ep is estimated. When the filter is applied to
the original image, this ep replaces the original value e.
The mechanics of the spatial filtering involves the movement of the filter mask over the
image and calculation of the modified pixel value at the center of the filter mask at every
location of the filter. Thus values of all the pixels are modified. When the spatial filter is
applied to the image, the ep values are estimated by using some pre-defined relationship using
the filter coefficients and the pixel values in the neighborhood or filter mask selected.
Convolution filter is a good example.
Filtering in the spatial domain manipulates the original pixel DN values directly. Frequency domain filtering techniques, on the other hand, use Fourier analysis to first transform the image into the frequency domain; the filtering is then performed on the transformed image, which represents the spatial frequency content of the scene. Applying the filter in the frequency domain thus gives a frequency-enhanced image.
The convolution filter is one of the most commonly used filters for image enhancement in the spatial domain. In a convolution filter, the filter mask is called the convolution mask or convolution kernel. Convolution kernels are square and generally have an odd number of pixels on a side, viz., 3x3, 5x5, 7x7 etc.
The kernel is moved over the input image pixel by pixel. A linear function of the kernel coefficients and the pixel values in the selected neighborhood is used to derive the modified DN of the pixel at the centre of the kernel in the output image: each coefficient in the kernel is multiplied by the corresponding DN in the input image, and the products are combined (summed, and averaged in the case of an averaging filter) to derive the modified DN value of the centre pixel.
For example, the filter shown in Fig. 1 is a convolution filter of kernel size 3x3. DN value of
the centre pixel in the input image is e. The modified DN value is obtained as given below.
Based on the elements used in the matrix and the procedure used for calculating the new
digital number, different digital filters are developed for different purposes. Using different
kernels, different type of enhancement can be achieved. For example, high pass and low pass
filters are special types of convolution filters, which emphasize the high frequency and low
frequency features, respectively.
Low pass filters are also called averaging filters, as the filter output is the average of the pixel values in the neighborhood. When such a filter is applied to an image, every pixel is replaced by the average of the surrounding pixel values. Thus, the low frequency components of the image are emphasized after filtering, and the effects of the noise component of the image are reduced.
A high pass filter, on the other hand, enhances the high frequency components of an image; accordingly, the low frequency components are de-emphasized in the resulting image.
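The sketch below applies a 3x3 averaging (low pass) kernel and a commonly used 3x3 high pass kernel to a synthetic band; the specific kernels and the simple border handling are assumptions for the illustration.

```python
import numpy as np

def convolve3x3(img, kernel):
    """Apply a 3x3 kernel; the one-pixel border is left unfiltered."""
    img = img.astype(float)
    out = img.copy()
    for i in range(1, img.shape[0] - 1):
        for j in range(1, img.shape[1] - 1):
            window = img[i - 1:i + 2, j - 1:j + 2]
            out[i, j] = np.sum(window * kernel)
    return out

low_pass = np.full((3, 3), 1.0 / 9.0)              # 3x3 averaging (smoothing) kernel
high_pass = np.array([[-1, -1, -1],
                      [-1,  9, -1],
                      [-1, -1, -1]], dtype=float)  # emphasizes local detail

band = np.random.default_rng(6).integers(0, 256, size=(100, 100)).astype(float)
smooth = convolve3x3(band, low_pass)
sharp = convolve3x3(band, high_pass)
```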
Directional gradient filters help to highlight features oriented in a specific direction, for example 30° NW or 45° NE. They are particularly useful in detecting and enhancing linear features with a specific orientation.
An emboss filter, when applied to an image, highlights the features that have a gradient in the specified direction.
The Prewitt and Sobel gradient operators are used to estimate the digital gradient in an image and are useful for edge detection. When their kernels are moved over an image, they highlight the higher gradients corresponding to edges and smooth the lower values.
The Laplacian filter is also useful for edge enhancement. It detects sharp changes in the pixel values and hence enhances fine edges.
In Fig. 3 image enhancement obtained using high pass and low pass filter kernels are
compared. ASTER Global DEM of 30m resolution is used as the input image.
In Fig.4 the input image is compared with that obtained after the application of NW
embossed filter.
Fig 5(a) shows the original input image. The NW emboss filtered output is combined with the
input data. This operation helps to retain much of the information from the input image,
however with enhanced edges as shown in Fig. 5(b).
Fig. 6 shows the images enhanced using the Sobel gradient and the Laplacian filter. The Sobel filter enhances the higher gradients, whereas the lower values are compressed. The Laplacian filter, on the other hand, enhances not only the high values but also edges in all orientations.
(a) (b)
(c) (d)
Fig. 3 (a) Input image : ASTER GDEM (b) 3x3 low pass filtered image (c ) 11x11 low pass
filtered image and (d) 3x3 high pass filtered image
(In the filtered image, gradients in the NW direction are highlighted.)
(a) (b)
Fig. 4 (a) Original image: ASTER GDEM (b) Image filtered using NW emboss filter
(a) (b)
Fig.5 (a) original image (b) Combination of original image and NW emboss filtered image
(a) (b)
It may be noted that filtering decreases the size of the original image depending on the kernel
size. In the case of 3 x 3 digital filter, there won’t be any filtered values for the pixels in first
row, last row, first column and last column, as these pixels cannot become central pixels for
any placement of 3 x 3 digital filter. Thus the filtered image from a 3 x 3 kernel will not
contain first row, last row, first column and last column of the original image.
3. Edge enhancement
Edges in the images are generally formed by long linear features such as ridges, rivers, roads,
railways, canals, folds and faults. Such linear features (edges) are important to geologists and
Civil Engineers. Some linear features occur as narrow lines, whereas others are marked by brightness differences that may be difficult to recognize. Such narrow linear features in the image can be enhanced using appropriate filtering techniques.
Fig. 7 (a) shows a part of IRS LISS III Band 4 (Near Infrared) data showing a portion of
Uttara Kannada district in Karnataka state, India. The image shows a river and the Arabian
Sea on the left. Fig.7 (b) shows the edges extracted from the image. Linear features such as
shore line and the river banks are extracted using the edge enhancement and edge detection
algorithms.
(a)
(b)
Fig. 7 (a) IRS LISS-III Band 4 image (b) Edges detected from the image using the digital
edge detection algorithm
Digital filters used for edge enhancement in images are of two broad categories:
Directional filters
Non-directional filters
Directional filters enhance linear features having a specified orientation (say, those oriented at 30° towards North), whereas non-directional filters enhance linear features in almost all orientations.
The Prewitt, Sobel, Canny and Laplacian filters are some examples of non-directional filters.
This section explains the directional filter and the Laplacian non-directional filter in detail.
A directional filter enhances linear features with a specific orientation (direction). The direction in which the edges are to be enhanced is specified in degrees with respect to North. Angles within the North-East quadrant are taken as negative and those falling in the North-West quadrant as positive (Fig. 8).
Fig.8 Concept of orientation angle of the linear features as used in directional filter
Directional filters consist of two kernels of size 3x3 pixels, which are referred as left and
right kernels. Both left and right kernels are moved over the original image and the pixel
values are multiplied using the kernel coefficients. The values obtained for the nine pixels
within the filter mask are added up for each kernel separately. The value added up for the
right kernel is multiplied by Sin(A), where A is the angle specified. Similarly, the value
added up for the left kernel is multiplied by Cos(A). The resulting two kernel values are
summed up to obtain the directional filter value.
In the original image, if there is any sharp gradient between the pixels in the specified
direction, the resultant directional filter would show the maximum absolute pixel value for
the corresponding pixels, and thereby highlight the edges. The directional filter gives
negative values if the feature is aligned in the NE direction and positive values if the feature
is aligned in the NW direction.
For example, Fig.9 shows the application of a directional filter (Fig. 9.a) to a hypothetical
sample data (Fig. 9.b). The data shows a sharp gradient from pixel value 50 to 45, aligned approximately at 45° in the NE direction (angle A = −45°).
The right kernel is first applied to the data and the resulting 9 values are summed up. For the
3x3 window marked in the figure, the resulting value is 10. The left kernel is then applied to
the image. For the same 3x3 window, the resultant value is obtained as -10.
The value obtained from the right kernel is multiplied by the sine of the angle [sin(−45°) = −0.71] and that obtained from the left kernel is multiplied by the cosine of the angle [cos(−45°) = 0.71], and the two are added. Thus, for the pixel at the center of the selected window,
the kernel value is obtained as -14. This procedure is repeated for all the pixels in the input
data. The resulting kernel values are shown in Fig. 9(c). Absolute values of the kernel values
are the maximum (14) along the line where there is a sharp change in the pixel values. Thus
the edge of the linear feature can be easily identified.
The kernel values are then added to the original data to generate the filtered output, which is
also shown in the Fig. 9(d).
The contrast ratio of the lineament in the original data set is 50/45 = 1.11. Application of the directional filter increases the contrast ratio along linear features in the specified direction: in the filtered output the contrast ratio is 50/31 = 1.61. Thus, in this example, the contrast along the lineament has been enhanced by 45 percent [100 × (1.61 − 1.11)/1.11 = 45].
Left kernel (weighted by Cos A):
-1 0 1
-1 0 1
-1 0 1
Right kernel (weighted by Sin A):
 1  1  1
 0  0  0
-1 -1 -1
a. Directional Filter
50 50 50 50 50 50
50 50 50 50 50 45
50 50 50 50 45 45
50 50 50 45 45 45
50 50 45 45 45 45
50 45 45 45 45 45
b. Original Data
0 0 0 0 -7 -14
0 0 0 -7 -14 -14
0 0 -7 -14 -14 -7
0 -7 -14 -14 -7 0
-7 -14 -14 -7 0 0
-14 -14 -7 0 0 0
c. Kernel Values
50 50 50 50 43 36
50 50 50 43 36 31
50 50 43 36 31 38
50 43 36 31 38 45
43 36 31 38 45 45
36 31 38 45 45 45
d. Filtered Output (Original Data + Kernel Values)
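The sketch below reproduces the interior kernel values of this worked example; the values along the border of the tables above depend on how the image edges are padded, which is not specified here, so only the interior of the output is printed.

```python
import numpy as np

left = np.array([[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]], dtype=float)
right = np.array([[ 1,  1,  1],
                  [ 0,  0,  0],
                  [-1, -1, -1]], dtype=float)

data = np.array([[50, 50, 50, 50, 50, 50],
                 [50, 50, 50, 50, 50, 45],
                 [50, 50, 50, 50, 45, 45],
                 [50, 50, 50, 45, 45, 45],
                 [50, 50, 45, 45, 45, 45],
                 [50, 45, 45, 45, 45, 45]], dtype=float)

A = np.radians(-45.0)                        # NE-trending feature -> negative angle
kernel_values = np.zeros_like(data)
for i in range(1, data.shape[0] - 1):
    for j in range(1, data.shape[1] - 1):
        window = data[i - 1:i + 2, j - 1:j + 2]
        kernel_values[i, j] = (np.cos(A) * np.sum(left * window)
                               + np.sin(A) * np.sum(right * window))

filtered = data + kernel_values              # add the kernel values to the original data
print(np.round(kernel_values[1:-1, 1:-1]))   # interior values follow the 0 / -7 / -14 pattern
```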
The directional filter also enhances linear features in directions other than the specified direction. In this example, the filter oriented along the N45°W direction also enhances linear features that trend obliquely to the direction of filter movement. As a result, many additional edges of diverse orientations get enhanced.
Fig. 10 (a) shows a small area from Landsat ETM+ band-5 image. A linear feature with NW
orientation, a river, can be observed in the image. Fig.10 (b) shows the filtered image
obtained after applying the right diagonal edge enhancement filter to the original image. The
edges formed by the main river are highlighted in the right diagonal edge enhancement. Fig
10 (c) shows the left diagonal edge enhanced output of the same image. The main river
channel which is oriented in the NW direction is not emphasized in the filtered image. On the
other hand, other linear features that have orientation mainly in the NE direction are
highlighted in this image.
Horizontal edge enhanced output is also shown in Fig 10 (d)
(a) (b)
(c) (d)
Fig. 10 (a) Landsat ETM+ Band-5 image and the output of (b) right diagonal edge
enhancement (c ) left diagonal edge enhancement (d) horizontal edge enhancement
Linear features in an image are identified using the contrast between the pixels on either side
of it. Contrast between the pixels varies with the difference in the pixel values between them.
For example, in Table 1 the contrast between the pixels (x+1) and x depends on the
difference in the pixel values ax+1 and ax, which is the first derivative of the pixel values. For
the sample data, pixels values (a) and the 1st and 2nd derivatives of the pixel values are shown
in Table 1.
A first order derivative simply gives the difference in pixel value between adjacent pixels. In Table 1, the first order derivative is found to be less capable of highlighting the edges and the noise at pixel 9. The second order derivative, on the other hand, is the difference of the first derivatives and is better able to identify thin linear features and noise in the image. As seen from Table 1, the second derivative gives sharper contrast along the edges, as shown by the higher magnitudes at pixels 2 and 6. It also gives very high values for the pixels corresponding to the noise in the data (pixel 9).
Table 1. Sample data showing the application of first and second order derivatives in edge
enhancement
Laplacian filter is a non-directional filter based on the second spatial derivative of the pixel
values.
The second order derivative in the x and y direction may be represented as given in Fig. 11
and Eq. 1-3.
∂²a/∂x² = a(x+1, y) + a(x−1, y) − 2a(x, y)    (1)
∂²a/∂y² = a(x, y+1) + a(x, y−1) − 2a(x, y)    (2)
∇²a = ∂²a/∂x² + ∂²a/∂y² = a(x+1, y) + a(x−1, y) + a(x, y+1) + a(x, y−1) − 4a(x, y)    (3)
This equation is implemented using a kernel with -4 at the center, and 1 at the 4 adjacent
directions as shown in Fig 12.
0 1 0
1 -4 1
0 1 0
For example, consider the application of the above mentioned Laplace filter to a sample data
given in Fig.13 (Source: http://www.ciesin.org/docs/005-477/005-477.html). Fig.13 (a)
shows the Laplace filter kernel and the sample data. Profile of the pixel values along the
section AB is shown in Fig. 13 (c). Contrast ratio along the edges is only 1.14 (40/35).
Fig.13(b) shows the filtered data set obtained using Laplace filter. The contrast ratio has been
increased to 45/30 = 1.5 (31% increase).
Fig. 14 compares the Landsat ETM+ Band-5 image with the edge enhanced image obtained
after applying the Laplace filter of kernel shown in Fig.12. As compared to the images in Fig.
10, edges in all important directions are enhanced by applying the Laplace filter.
Fig. 14 (a) Landsat ETM+ Band-5 image and (b) Edge enhanced image using Laplace filter
Having considered the variation in the 4 adjacent directions, the kernel shown in Fig.12 gives
isotropic results for rotations in 90° increments.
The four diagonal directions can also be incorporated in the second derivative by adding two more terms to Eq. 3, one for each pair of diagonals. The resulting kernel can be represented as shown in Fig. 15.
1 1 1
1 -8 1
1 1 1
Fig. 15 Laplacian filter kernel giving isotropic results for rotations in multiples of 45°
Having incorporated the contributions of all 8 neighboring directions, this kernel gives isotropic results for rotations in 45° increments. Due to this property, Laplacian filters are considered highly effective in detecting edges irrespective of the orientation of the lineament.
During edge enhancement using Laplacian filter, the kernel is placed over 3x3 array of
original pixels and each pixel is multiplied by the corresponding value in the kernel. The nine
resulting values are summed and resultant kernel value is combined with the central pixel of
3x3 array. This number replaces the original DN of central pixel and the process is repeated
for all the pixels in the input image.
The Laplacian filter will enhance edges in all directions except those in the direction of movement of the filter (i.e., linear features with an east-west orientation will not get enhanced).
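A minimal sketch of this procedure, using the kernel of Fig. 12 and simply adding the kernel response back to the central pixel (the sign convention used to combine the two is an assumption), is given below.

```python
import numpy as np

laplacian = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]], dtype=float)

def laplacian_enhance(img):
    """Compute the Laplacian response and combine it with the central pixel."""
    img = img.astype(float)
    out = img.copy()
    for i in range(1, img.shape[0] - 1):
        for j in range(1, img.shape[1] - 1):
            response = np.sum(img[i - 1:i + 2, j - 1:j + 2] * laplacian)
            # Combined here by addition; subtraction is also used, depending on
            # the sign convention adopted for the kernel.
            out[i, j] = img[i, j] + response
    return out

band = np.random.default_rng(13).integers(0, 256, size=(100, 100))
edges_enhanced = laplacian_enhance(band)
```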
1. Introduction
Digital image processing involves manipulation and interpretation of the digital images so as
to extract maximum information from the image. Image enhancement is used to enhance the
image display such that different features can be easily differentiated. In addition to the
contrast stretching and edge enhancement mentioned in the previous lectures, processes used
for image enhancement include the color manipulation and the use of other data sets. This
lecture covers a few of such methods used for image enhancement. These are
Density slicing
Thresholding
Intensity-Hue-Saturation (IHS) images
Time composite images
Synergic images
2. Density slicing
Density slicing is the process in which the pixel values are sliced into different ranges and a single value or color is assigned to each range in the output image. It is also known as level slicing.
For example, Fig.1(a) shows the ASTER GDEM for a small watershed in the Krishna River
Basin. Elevation values in the DEM range from 591-770 m above mean sea level. However,
the contrast in the image is not sufficient to clearly identify the variations.
The pixel values are sliced into 14 ranges and a color is assigned to each range. The resulting image is shown in Fig. 1(b).
Density slicing may be thus used to introduce color to a single band image. Density slicing is
useful in enhancing images, particularly if the pixel values are within a narrow range. It
enhances the contrast between different ranges of the pixel values.
However, a disadvantage of the density slicing is the subtle information loss as a single color
is assigned to each range. The variations in the pixel values within the range cannot be
identified from the density sliced image.
(a)
(b)
Fig. 1 (a) ASTER GDEM and (b) Density sliced image showing 14 levels of elevation
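A minimal sketch of density slicing, assuming a stand-in DEM and 14 equal-width elevation slices mapped to an arbitrary color table, is given below.

```python
import numpy as np

dem = np.random.default_rng(7).uniform(591, 770, size=(300, 300))  # stand-in DEM

# 14 equal-width elevation slices between the DEM minimum and maximum
bounds = np.linspace(dem.min(), dem.max(), 15)       # 15 edges -> 14 slices
sliced = np.digitize(dem, bounds[1:-1])              # slice index 0..13 per pixel

# Each slice index is then mapped to a display color, here via a 14x3 table
palette = np.random.default_rng(8).integers(0, 256, size=(14, 3), dtype=np.uint8)
rgb = palette[sliced]                                # (300, 300, 3) pseudo-color image
```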
3. Thresholding
Thresholding is used to divide the input image into two classes: pixels having values less than
the threshold and more than the threshold. The output image may be used for detailed
analysis of each of these classes separately.
For example, consider estimating the total area of lakes in the Landsat band-4 image given in Fig. 2(a). This can be done more easily if the non-water pixels are de-emphasized and the water pixels are emphasized. In this image the highest DN for water is 35; therefore a threshold of 35 is used here to mask out the water bodies. All pixels with DN greater than 35 are assigned 255 (saturated to white) and those with DN less than or equal to 35 are assigned zero (black). The output image is shown in Fig. 2(b), in which the lakes stand out clearly against the suppressed background, so that the area of the water bodies can be easily estimated.
Fig. 2 (a) Landsat TM Band-4 image and (b) Output images after using a threshold
DN value of 35 to mask out the water bodies
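A minimal sketch of this thresholding and area estimate, assuming a stand-in band and the 30 m pixel size of Landsat TM, is given below.

```python
import numpy as np

band4 = np.random.default_rng(9).integers(0, 256, size=(500, 500))  # stand-in image

threshold = 35                                    # highest DN observed for water
mask = np.where(band4 > threshold, 255, 0).astype(np.uint8)

# Pixel area: Landsat TM pixels are 30 m x 30 m = 900 m^2
water_pixels = np.count_nonzero(mask == 0)
water_area_km2 = water_pixels * 900 / 1e6
print(f"estimated water area: {water_area_km2:.2f} km^2")
```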
4. Intensity-Hue-Saturation (IHS) images
An image is generally a color composite of the three basic colors red, green and blue. Any color in the image is obtained through a combination of the three basic colors at varying intensities.
For example, each basic color can vary from 0 to 255 in an 8-bit display system, so a very large number of combinations of the three colors is possible. A color cube (Fig. 3), with red, green and blue as its axes, is one way of representing the color composite obtained by adding the three basic colors. This is called the RGB color scheme. More details are given in lecture 1.
Intensity: Intensity represents the brightness of the color. It varies from black (corresponding
to 0) to white (corresponding to 255 in an 8-bit system).
Hue: Hue represents the dominant wavelength of light contributing to the color. It varies from
0 to 255 corresponding to various ranges of red, blue and green.
Saturation: Saturation represents the purity of the color. A value 0 represents completely
impure color with all wavelengths equally represented in it (grey tones). The maximum value
(255 in an 8-bit system) represents the completely pure color (red, blue or green).
Any color is described using a combination of the intensity (I), hue (H) and saturation (S)
components as shown in Fig. 4.
The RGB color components may be transformed into the corresponding IHS components by
projecting the RGB color cube into a plane perpendicular to the gray line of the color cube,
and tangent to the cube at the corner farthest from the origin as shown in Fig. 5(a).
This gives a hexagon. If the plane of projection is moved from black to white, the size of the
hexagon increases. The size of the projected hexagon is the minimum at black, which gives
only a point, and maximum at white.
The series of hexagons developed by moving the plane of projection from black to white are
combined to form the hexacone, which is shown in Fig. 5(b).
In this projection, size of the hexagon at any point along the cone is determined by the
intensity. Within each hexagon, the representation of hue and saturation are shown in Fig.
5(c). Hue increases counterclockwise from the axis corresponding to red. Saturation is the
length of the vector from the origin.
(a)
(b)
(c)
Fig. 5 (a) Projection of a color cube in to a plane through black (b) Hexacone representing the
IHS color scheme (c) Hexagon showing the intensity, hue and saturation components in the
IHS representation
(Source: http://en.wikipedia.org/wiki/HSL_and_HSV)
Instead of the hexagonal plane, circular planes are also used to represent the IHS
transformations, which are called IHS cones (Fig. 6)
In the IHS color scheme the relationship between the IHS components with the
corresponding RGB components is established as shown in Fig. 7. Consider an equilateral
triangle in the circular plane with its corners located at the position of the red, green, and blue
hue. Hue changes in a counterclockwise direction around the triangle, from red (H=0), to
green (H=1) to blue (H=2) and again to red (H=3). Values of saturation are 0 at the center of
the triangle and increase to maximum of 1 at the corners.
IHS values can be derived from RGB values through the following transformations
(Gonzalez and Woods, 2006). Inverse of these relationships may be used for mapping IHS
values into RGB values. These have been covered in Section 2.4 of module 4, lecture 1 and
therefore will not be repeated here.
When any three spectral bands of a MSS (multi-spectral scanner) data are combined in the
RGB system, the resulting color image typically lacks saturation, even though the bands have
been contrast-stretched. This under-saturation is due to the high degree of correlation
between spectral bands. High reflectance values in the green band, for example, are
accompanied by high values in the blue and red bands, and hence pure colors are not
produced.
To correct this problem, a method of enhancing saturation was developed that consists of the
following steps:
Transform any three bands of data from the RGB system into the IHS system in which
the three component images represent intensity, hue and saturation.
Typically intensity image is dominated by albedo and topography. Sunlit slopes have
high intensity values (bright tones), and shadowed areas have low values (dark tones)
Saturation image will be dark because of the lack of saturation in the original data.
Apply a linear contrast stretch to the saturation image
Transform the intensity, hue and enhanced saturation images from the IHS system back into three images of the RGB system. These enhanced RGB images may be used to prepare the new color composite image.
Schematic of the steps involved in the image enhancement through IHS transformation is
shown in Fig.8. At this point, we assume that the reader is familiar with the RGB to IHS
transformation. In Fig. 8 below, the original RGB components are first transformed into their
corresponding IHS components (encode), then these IHS components are manipulated to
enhance the desired characteristics of the image (manipulate) and finally the modified IHS
components are transformed back into the RGB color system for display (decode).
Fig.8. Schematic of the steps involved in the image enhancement through IHS transformation
The color composite output after the saturation enhancement gives better color contrast
within the image.
For example, Fig. 9(a) shows a Landsat ETM+ standard FCC image (bands 2, 3 and 4 are used as the blue, green and red components). The color contrast between the features is not significant, which makes feature identification difficult. The image is converted from the RGB scheme to the IHS scheme; Fig. 9(b) shows the IHS transformation of the image, in which intensity and hue are displayed through red and green, respectively, and blue is used to display the saturation. From the image it is evident that the saturation is poor (as indicated by the small contribution of blue in the display).
Next, the saturation component is linearly stretched. The intensity, hue and linearly stretched saturation components are then transformed back into the corresponding RGB scheme. Fig. 10 shows the image displayed using the modified RGB components. A comparison with the original FCC image reveals the contrast enhancement achieved through the IHS transformation.
(a)
(b)
Fig. 9 (a) Standard FCC of the Landsat ETM+ image and (b) corresponding IHS transformed
image
The IHS system mimics the way the human eye perceives color more closely than the RGB system. One of the advantages of the IHS transformation in image enhancement is that data of different sensors, having different spatial and spectral resolutions, can be merged to enhance the information: high resolution data from one source may be displayed as the intensity component, and lower resolution data from another source as the hue and saturation components.
5. Synergic images
Synergic images are those generated by combining information from different data sources.
Images of different spatial and spectral resolutions are merged to enhance the information
contained in an image.
For synergic image generation, it is important that the separate bands are co-registered with each other and contain the same number of rows and columns. An FCC can then be produced from any three bands (possibly of different spectral or spatial resolution).
Examples: PAN data merged with LISS data (substituted for the Intensity image), TM data
merged with SPOT PAN data and Radar data merged with IRS LISS data.
Fig. 11 shows the synergic image produced by combining the IRS LISS-III image with high
resolution PAN image.
Fig. 11. IRS LISS III and PAN merged and enhanced Image of Hyderabad
The IRS LISS-III and PAN images are of different spatial and spectral resolutions. The LISS-III image has 23 m spatial resolution and uses four narrow wavelength bands, whereas the PAN image has coarse spectral resolution (a single broad band) but fine spatial resolution (5.8 m). Combining the benefits of both, a synergic image can be produced using the IHS transformation: the intensity component of the LISS-III composite is replaced by the PAN image. The resulting synergic image is transformed back to the RGB scheme, as shown in Fig. 11. In this image, the spectral information of the LISS-III data is merged with the fine spatial resolution of the PAN data.
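A minimal sketch of intensity substitution is given below. It uses the simplification that intensity is the mean of the three bands, so scaling each band by PAN/I substitutes the PAN band for the intensity while preserving the band ratios (and hence the hue and saturation); a full implementation would use the forward and inverse IHS transforms given earlier. It also assumes the multispectral composite has already been resampled to the PAN grid.

```python
import numpy as np

def pan_sharpen(ms_rgb, pan):
    """Replace the intensity of an (upsampled) RGB composite with the PAN band."""
    R, G, B = (ms_rgb[..., k].astype(float) for k in range(3))
    I = (R + G + B) / 3.0 + 1e-12
    scale = pan.astype(float) / I          # substitutes PAN for the mean intensity
    fused = np.stack([R * scale, G * scale, B * scale], axis=-1)
    return np.clip(fused, 0, 255).astype(np.uint8)

ms = np.random.default_rng(10).integers(0, 256, size=(200, 200, 3), dtype=np.uint8)
pan = np.random.default_rng(11).integers(0, 256, size=(200, 200), dtype=np.uint8)
sharpened = pan_sharpen(ms, pan)
```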
Non-remote sensing data such as topographic and elevation information may also be merged through a DEM, and non-image data such as location names can also be incorporated. A perspective view of the area southeast of Los Angeles, produced by draping TM and radar data over a DEM and viewing from the southwest, is shown in Fig. 12.
Fig. 12. Perspective view of southeast of Los Angeles produced by draping TM and radar
data over a digital elevation model and viewing from the southwest
Fig. 13 compares a Landsat TM image with TM/SPOT fused data for an airport southeast of Los Angeles. The fused image is considerably sharper than the standard TM image.
Fig. 13 (a) Landsat TM image (b) TM/SPOT fused data for an airport southeast of Los Angeles
6. Time composite images
Cloud cover in the atmosphere often restricts the visibility of the land area in optical images. However, if a portion of an image is cloud covered and imagery of the area can be acquired every day, as in the case of NOAA AVHRR, a cloud-free time composite image can be produced: for the cloud covered areas, the information is extracted from the successive images. The following are the steps followed for generating time composite images.
The National Remote Sensing Centre (NRSC) has used such time composited imagery of NOAA AVHRR over 15-day periods for agricultural drought assessment and analysis.
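A commonly used approach for generating such composites is maximum value compositing, sketched below under the assumption that, for each pixel, the largest value observed during the compositing period corresponds to the clearest (least cloud-affected) observation; the exact operational procedure may differ.

```python
import numpy as np

# Hypothetical stack of 15 co-registered daily images (e.g., NDVI or a band),
# in which cloud lowers the observed value on any given day.
daily = np.random.default_rng(12).uniform(0.0, 0.8, size=(15, 400, 400))

# Maximum value compositing: keep, for each pixel, the highest value observed
# during the 15-day period.
composite = daily.max(axis=0)
```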
1. Blom, R. G. and Daily, M., 1982. Radar image processing for rock type discrimination. IEEE Transactions on Geoscience Electronics, 20, 343-351.
2. Buchanan, M. D., 1979. Effective utilization of color in multidimensional data presentation. Proceedings of the Society of Photo-Optical Instrumentation Engineers, Vol. 199, pp. 9-19.
3. Foley, J. D., van Dam, A., Feiner, S. K. and Hughes, J. F., 1990. Computer Graphics: Principles and Practice, Second Edition in C. Reading, MA: Addison-Wesley.
4. Gonzalez, R. C. and Woods, R. E., 2006. Digital Image Processing, Prentice-Hall of India, New Delhi.
5. Kiver, M. S., 1965. Color Television Fundamentals, McGraw-Hill, New York.
6. Lillesand, T. M., Kiefer, R. W. and Chipman, J. W., 2004. Remote Sensing and Image Interpretation. Wiley India (P) Ltd., New Delhi.
7. Massonet, D., 1993. Geoscientific applications at CNES. In: Schreier, G. (1993a) (ed.), 397-415.
8. Mulder, N. J., 1980. A view on digital image processing. ITC Journal, 1980-1983, 452-476.
9. Poynton, C. A., 1996. A Technical Introduction to Digital Video, John Wiley & Sons, New York.
10. Walsh, J. W. T., 1958. Photometry, Dover, New York.
SUPERVISED CLASSIFICATION
1. Introduction
Classification is the technique by which real-world objects/land covers are identified within remotely sensed imagery. Consider a multispectral image with m bands. In the simplest case, the characteristics of a pixel are expressed in the form of a vector whose elements represent the spectral properties of that pixel as captured in these m bands. The number of classes can be determined a priori or using certain indices. The classes represented by a pixel may be water bodies, woodlands, grassland, agriculture, urban or other land cover types. Classification identifies the land cover represented by each pixel based on its spectral reflectance value (or digital number). The process also involves labelling each class entity with a numerical value, which is done using a classification rule or decision rule. In this context, clustering is an exploratory procedure whose aim is to estimate the number of distinct land cover classes present in an area and to allocate pixels to these classes. Image classification can be of many types; the two major types are supervised and unsupervised. These two techniques of pixel labelling can also be used to segment an image into regions of similar attributes. Features/patterns can be defined using the spectral information present in the bands; in other words, a pattern associated with each pixel position within an image is identified. Pattern recognition methods have been widely applied in various fields of engineering and science. This lecture explains the supervised classification technique. Please note that, for a better understanding of some of the terms discussed in this module, readers are expected to have a minimum knowledge of basic statistics.
2. Supervised Classification
In the supervised classification technique, the locations of the land cover types should be known a priori. The areas of each land cover type are known as training sites. The spectral characteristics of the pixel digital numbers within each land cover type are used to generate multivariate statistical parameters for each training site. As supervised classification methods are based on statistical concepts, this classification is also termed per-point or per-pixel classification. One of the earlier methods adopted to visualize the distribution of spectral values measured on two features (for example, water body and agricultural land) was to generate a scatter plot. Visual inspection reveals the existence of two separate land use types. This sheds light on two fundamental ideas of classification: first, the use of Euclidean space to represent the selected features of interest; and second, the use of a distance measure to estimate the resemblance of pairs of points as a decision rule in order to classify the pixels as water body or agricultural land. Visual interpretation is intuitive and simple in nature. The eye and the brain together recognize the existence of two clusters, or regions of feature space, having a tight distribution of points with a relatively empty region in between. A boundary line separating the two clusters, called the decision boundary, can then be drawn. This concept can be extended to three dimensions, where the distance between clouds of points is used to arrive at a decision boundary that is a plane within a three-dimensional feature space. Supervised classifiers require that the number of classes be specified in advance and that certain statistical characteristics of each class be known beforehand. This requires appropriate selection of training samples, which is discussed in the next section.
Once a classification type is finalized, the user must select training sites within the imagery that are representative of the various land cover types. Statistical classifiers depend on good quality training data to generate good results. Training is carried out to collect spectral statistics for each of the land cover types, so the quantity and quality of the training data should be carefully chosen. A general rule of thumb is to select more than 10m pixels for each class, where m is the number of bands. A standard approach is to assume a multinomial distribution of the population and estimate the optimum number of pixels using the expression:
n = \frac{B \, p_i (1 - p_i)}{b_i^2}

where p_i is the a priori probability of class i; it is taken as 1/2 if no information about the area is available, because n attains its maximum at p_i = 1/2. b_i is the required absolute precision, B is the upper (\alpha/k) \times 100th percentile of the \chi^2 distribution with one degree of freedom, k is the number of classes, and \alpha corresponds to the desired confidence level (Congalton and Green, 1998).
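As a hedged worked example of this expression, assume k = 5 classes, a 95% confidence level (alpha = 0.05), an absolute precision b_i = 0.05 and the "no prior information" choice p_i = 0.5; these numbers are illustrative only.

```python
from scipy.stats import chi2

k, alpha, p_i, b_i = 5, 0.05, 0.5, 0.05
B = chi2.ppf(1 - alpha / k, df=1)      # upper (alpha/k)*100th percentile, 1 degree of freedom
n = B * p_i * (1 - p_i) / b_i**2
print(round(B, 3), round(n))           # approx. 6.635 and about 663, i.e. roughly 664 pixels per class
```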
There exist a number of ways to select the training data. These include (a) collecting in situ information, (b) selecting on-screen polygonal training data, and (c) seeding of training data on screen. Most image processing software packages provide a polygon tool that enables selection of a region of interest (ROI); this can be used to select pixels corresponding to each of the land cover types as observed on screen. The user may also select a specific location (x, y) within the image using the cursor. A seed program can then be used to evaluate and expand to neighbouring pixels, like an amoeba, until it stops finding pixels with characteristics similar to the one originally selected. Both these methods are highly effective in training area selection. The final aim of training class selection is to obtain non-autocorrelated training data, as training data collected from autocorrelated data tend to possess reduced variance. An ideal approach would be to collect training data within a region using every nth pixel, based on some sampling technique. A common approach to random sampling is to overlay the classified data with a grid so that cells within the grid can be selected randomly and groups of pixels within the test cells can be evaluated. After training data have been selected for each land cover type, various statistical measures such as mean, standard deviation, variance, minimum value and maximum value can be calculated and used in further classification processes. After selection of training data for each land cover type, it is essential to judge which bands will be most effective in demarcating each class from all the others. This process is known as feature selection; it enables deletion of bands that provide redundant spectral information. Graphical methods of feature selection employ bar-graph spectral plots, ellipse plots, feature space plots etc., which are simple in nature and provide an effective visual presentation of inter-class separability. These plots are sufficient to estimate the inter-class separability of 2 or 3 classes, but they are not suitable for deciding whether a certain combination of four or five bands will perform better. For this, a statistical method of feature selection needs to be followed, which is discussed in the next section.
Spectral pattern recognition involving data values captured in m different bands needs to be subjected to suitable discrimination techniques so that the various land cover classes can be separated with a minimum of error. The greater the number of bands used, the greater the associated cost. Usually, in the case of an overlap, either a pixel is assigned to a class to which it does not belong (commission error) or it is not assigned to its actual class (omission error). Statistical measures of feature selection provide means to statistically separate the classes using the training data. Several measures exist, some of which are discussed below.
a) Divergence
The number of combinations of q bands that can be chosen from the m available bands is

C_q^m = \frac{m!}{q!\,(m-q)!}

For example, if there are 6 bands of multispectral imagery and we are interested in the three best bands to use, it would be necessary to evaluate

C_3^6 = \frac{6!}{3!\,(6-3)!} = 20 \text{ combinations}

The divergence between classes a and b is computed from the class mean vectors (M_a, M_b) and covariance matrices (V_a, V_b) obtained from the training data as

Diver_{ab} = \frac{1}{2} Tr\left[(V_a - V_b)(V_b^{-1} - V_a^{-1})\right] + \frac{1}{2} Tr\left[(V_a^{-1} + V_b^{-1})(M_a - M_b)(M_a - M_b)^T\right]

This expression enables one to identify the statistical separability between two classes using the given m bands of the training data. Now consider a situation with more than two classes. In such a situation, ideally the average over all possible pairs of classes needs to be computed, holding the subset of bands (q) constant. Another subset of bands is then selected and analysed in the same way. The subset that gives the maximum divergence value is selected as the superior set of bands to be used further in the classification algorithm.
Diver_{average} = \frac{1}{C} \sum_{a=1}^{m-1} \sum_{b=a+1}^{m} Diver_{ab}

where the double sum runs over all pairs of classes and C is the number of such pairs.
This has the drawback that outlying, easily separable classes can inflate the average divergence, which may result in the choice of a suboptimal reduced feature subset as the best. To compensate, it becomes essential to compute the transformed divergence as

TDiver_{ab} = 2000\left[1 - \exp\left(-\frac{Diver_{ab}}{8}\right)\right]

This measure scales the divergence values to lie between 0 and 2000, providing an exponentially decreasing weight to increasing distances between classes. As a result, a transformed divergence value of 2000 indicates excellent class separation.
b) Bhattacharyya distance
This measure separates two classes at a time, assuming that both classes are Gaussian in nature and that their mean vectors and covariance matrices are available. The expression is given as

Bhatt_{ab} = \frac{1}{8}(M_a - M_b)^T \left[\frac{V_a + V_b}{2}\right]^{-1} (M_a - M_b) + \frac{1}{2} \log_e \left[\frac{\det\left(\frac{V_a + V_b}{2}\right)}{\sqrt{\det(V_a)\,\det(V_b)}}\right]
c) Jeffries-Matusita (JM) distance
This measure shows a saturating behaviour with increasing class separation, similar to transformed divergence, but it is not as computationally efficient as the transformed divergence measure. The expression is given as

JM_{ab} = \sqrt{2\left(1 - e^{-Bhatt_{ab}}\right)}
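The following is a minimal sketch of these pairwise separability measures for two classes described by mean vectors Ma, Mb and covariance matrices Va, Vb (NumPy arrays estimated from training data). It assumes the covariance matrices are non-singular; all names are illustrative.

```python
import numpy as np

def divergence(Ma, Va, Mb, Vb):
    iVa, iVb = np.linalg.inv(Va), np.linalg.inv(Vb)
    d = Ma - Mb
    term1 = 0.5 * np.trace((Va - Vb) @ (iVb - iVa))
    term2 = 0.5 * np.trace((iVa + iVb) @ np.outer(d, d))
    return term1 + term2

def transformed_divergence(Ma, Va, Mb, Vb):
    return 2000.0 * (1.0 - np.exp(-divergence(Ma, Va, Mb, Vb) / 8.0))

def bhattacharyya(Ma, Va, Mb, Vb):
    d = Ma - Mb
    Vm = 0.5 * (Va + Vb)
    term1 = 0.125 * d @ np.linalg.inv(Vm) @ d
    term2 = 0.5 * np.log(np.linalg.det(Vm) /
                         np.sqrt(np.linalg.det(Va) * np.linalg.det(Vb)))
    return term1 + term2

def jeffries_matusita(Ma, Va, Mb, Vb):
    return np.sqrt(2.0 * (1.0 - np.exp(-bhattacharyya(Ma, Va, Mb, Vb))))
```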
5. Parallelepiped classifier
A parallelepiped refers to a prism whose bases are essentially parallelograms in nature. This
classifier requires minimal user information in order to classify pixels. Assume a multispectral
image with m bands or features that represents a total of n classes. For each of these n classes,
the user is asked to provide the maximum and minimum pixel values in each of these m bands. This allows a range to be created, which can also be expressed as a given number of standard deviation units on either side of the mean of each feature. These ranges set the boundaries of the parallelepipeds that are drawn to define regions within the m-dimensional feature space.
Hence, the decision rule is relatively simple: each pixel to be classified is taken one by one and it is determined whether its values in the m bands lie inside or outside any of the parallelepipeds. Some pixels may not lie inside any of the parallelepipeds; these are assigned to an 'unknown' category. In the extreme case, overlapping parallelepipeds may enclose the same pixels. Decision making in such cases involves allocating the pixel to the first parallelepiped inside whose boundary it falls.
This classification technique is simple and quick, and it gives good results provided the data are well structured. In practice, however, this is rarely the case. When the image data in various bands result in overlapping parallelepipeds, a sensible decision rule is to calculate the Euclidean distance between the doubtful pixel and the centre point of each of the overlapping parallelepipeds; a minimum distance rule can then be employed to decide the output. In other words, a boundary line is drawn in the area between two overlapping parallelepipeds, equidistant from their centre points, and pixels are classified based on their position relative to this boundary line. The technique is easy and fast to implement, but it suffers from major drawbacks. The use of just the minimum and maximum pixel values may not be a good representation of the actual spectral classes present within an image. It is also assumed that the shape enclosing a particular spectral class fits neatly inside a parallelepiped, which is not necessarily so. Hence, this method is considered a not very accurate representation of the feature space.
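A minimal sketch of the parallelepiped decision rule is given below, assuming per-class minimum and maximum vectors (one value per band) derived from the training data. Pixels falling outside every box are labelled 0 ("unknown"); ties go to the first matching class, as described above. Names and shapes are illustrative.

```python
import numpy as np

def parallelepiped(image, mins, maxs):
    """image: (rows, cols, m); mins, maxs: (n_classes, m) per-band limits."""
    rows, cols, _ = image.shape
    labels = np.zeros((rows, cols), dtype=int)
    for c in range(mins.shape[0]):          # the first box that contains a pixel wins
        inside = np.all((image >= mins[c]) & (image <= maxs[c]), axis=-1)
        labels[(labels == 0) & inside] = c + 1
    return labels
```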
6. Minimum distance classifier
As the name suggests, this classification technique uses a distance-based decision rule. It requires the user to provide the mean vectors for each class in each band from the training data. The Euclidean distance, based on the Pythagorean theorem, is normally employed as the distance measure. For example, the Euclidean distance from a pixel with values (20, 20) in bands k and l to the mean of class a at (16, 16) is computed as

Dist = \sqrt{(BV_{ijk} - \mu_{ak})^2 + (BV_{ijl} - \mu_{al})^2}

where BV_{ijk} and BV_{ijl} are the values of pixel (i, j) in bands k and l, and \mu_{ak}, \mu_{al} are the mean values of class a in bands k and l respectively.
Every pixel is assigned to a class based on its distance from the mean of each class. Many
minimum distance algorithms specify a distance threshold from the class mean beyond which a
pixel will not be assigned to that particular class even if it is nearest to the mean. This
classification type is commonly used and computationally simple in nature.
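Below is a minimal sketch of the minimum-distance-to-means rule, assuming class mean vectors from the training data and an optional distance threshold beyond which a pixel is left unclassified (label 0). Names and shapes are assumptions for illustration.

```python
import numpy as np

def minimum_distance(image, means, threshold=None):
    """image: (rows, cols, m); means: (n_classes, m)."""
    pixels = image.reshape(-1, image.shape[-1]).astype(float)
    dists = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=-1)
    labels = dists.argmin(axis=1) + 1                    # nearest class mean
    if threshold is not None:
        labels[dists.min(axis=1) > threshold] = 0        # too far from every class mean
    return labels.reshape(image.shape[:2])
```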
7. Maximum likelihood classifier
Maximum likelihood classification relies on the assumption that, geometrically, the cloud of points representing a particular class can be well represented by an ellipsoid. The orientation of the ellipsoid depends on the degree of covariance among the features: an upward-sloping major axis towards the left indicates negative covariance, an upward-sloping major axis towards the right indicates high positive covariance, and a near-circular ellipse (major axis approximately equal to minor axis) indicates low covariance between the features. The statistical descriptors of mean, variance and covariance of the features can be used to define the location, shape and size of the ellipse. We can consider a set of concentric ellipses, each representing a contour of probability of membership of the class, with the probability of membership decreasing away from the centroid more rapidly along the minor axis than along the major axis. A pixel is thus assigned to the class for which it has the highest probability of membership. This results in a classification that is generally more accurate than the output of the parallelepiped or k-means classifiers, as the training data are relied upon to produce the shape of the distribution of the membership of each class. The classification assumes that the frequency distribution of class membership can be approximated by a multivariate normal probability distribution. The assumption of normality usually holds reasonably well, and the method tolerates small departures from it, but it does not work well for strongly non-normal data.
Assume x represents a vector of pixel values captured in m bands. To estimate whether it belongs to class i, the probability of the pixel vector can be calculated using the multivariate normal density as

P(x) = (2\pi)^{-0.5\,m} \left|S_i\right|^{-0.5} \exp\left[-0.5\, y^{T} S_i^{-1} y\right]

where \left|S_i\right| denotes the determinant of the variance-covariance matrix of class i and y = x - M_i. The term y^{T} S_i^{-1} y is the Mahalanobis distance, used to estimate the distance of each pixel from its class mean.
The function P(x) gives the probability values, so that a pixel is allocated to the class for which this probability is maximum. The expression can be simplified by taking the logarithm to base e:

\ln P(x) = -0.5\, m \ln(2\pi) - 0.5 \ln\left|S_i\right| - 0.5\, y^{T} S_i^{-1} y

If both sides are multiplied by -2 and the constant terms in m and \ln(2\pi) are dropped, the expression becomes

g_i(x) = \ln\left|S_i\right| + (x - M_i)^{T} S_i^{-1} (x - M_i)

and the pixel is assigned to the class that minimizes g_i(x).
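The following is a minimal sketch of the Gaussian maximum likelihood rule using the simplified discriminant above: each pixel is assigned to the class with the smallest g_i(x). The class means and covariance matrices are assumed to come from the training statistics; names are illustrative.

```python
import numpy as np

def maximum_likelihood(pixels, means, covs):
    """pixels: (n, m); means: list of (m,) vectors; covs: list of (m, m) matrices."""
    scores = []
    for M, S in zip(means, covs):
        d = pixels - M
        # g_i(x) = ln|S_i| + (x - M_i)' S_i^{-1} (x - M_i)
        g = np.log(np.linalg.det(S)) + np.einsum('ij,jk,ik->i', d, np.linalg.inv(S), d)
        scores.append(g)
    return np.argmin(np.stack(scores, axis=1), axis=1) + 1   # class labels 1..n_classes
```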
8. Accuracy of Classification
Producer’s accuracy is calculated by dividing the number of correctly classified pixels in each
category by the number of training set pixels used for that category. User’s accuracy is calculated
by dividing the number of correctly classified pixels in each category by the total number of
pixels that were classified in that category.
A multivariate technique for accuracy assessment derived using the error matrix is the Kappa
statistic. Kappa statistic is a measure of agreement which can be computed using the expression:
\hat{K} = \frac{N \sum_{i=1}^{r} x_{ii} \; - \; \sum_{i=1}^{r} x_{i+} \, x_{+i}}{N^2 \; - \; \sum_{i=1}^{r} x_{i+} \, x_{+i}}

where r is the number of rows in the error matrix, x_{ii} is the number of observations in row i and column i, x_{i+} and x_{+i} are the marginal totals of row i and column i respectively, and N is the total number of observations.
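As a hedged illustration of these accuracy measures, the sketch below computes producer's accuracy, user's accuracy and the Kappa statistic from a small, purely illustrative error (confusion) matrix, with reference classes in columns and classified classes in rows.

```python
import numpy as np

cm = np.array([[50,  3,  2],
               [ 4, 45,  6],
               [ 1,  2, 40]], dtype=float)     # illustrative error matrix

N = cm.sum()
producers = np.diag(cm) / cm.sum(axis=0)       # correct / reference (column) totals
users     = np.diag(cm) / cm.sum(axis=1)       # correct / classified (row) totals
po = np.trace(cm) / N                          # observed agreement
pc = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / N**2   # chance agreement from the marginals
kappa = (po - pc) / (1 - pc)                   # equivalent to the K_hat expression above
print(producers.round(2), users.round(2), round(kappa, 2))   # kappa is about 0.82 here
```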
UNSUPERVISED CLASSIFICATION
1. Introduction
This algorithm operates in two steps. The first step reads through the dataset to identify possible clusters and estimates a mean pixel value for each cluster formed. The second step computes a distance measure on a pixel-by-pixel basis so that each pixel is assigned to one of the clusters created in the previous step. The steps are elaborated below:
a) Generation of clusters
This classification technique requires the user to provide the following information:
1. A radius in spectral space used to decide when a new cluster should be formed. In a spectral space where pixel values are digital numbers providing brightness/spectral information, the radius is usually specified in brightness value units (e.g., 30 brightness value units).
2. A distance parameter in spectral space, also expressed in brightness value units, below which two or more clusters that are close to one another are merged and treated as a single cluster.
3. The total number of pixels to be analysed/evaluated before a major merging of clusters is carried out (e.g., 3000 pixels).
4. The number of clusters to be generated. This can be based on expert advice or the user's familiarity with the area to be classified. Certain indices can be used to estimate the optimum number of clusters existing within an image; however, these are not discussed in this module.
Consider a remotely sensed image with two bands and n land cover types. As mentioned earlier, the values of an image can be referenced using row and column numbers. Beginning from the origin (row 1, column 1), the pixels are evaluated from left to right in a sequential manner, like a chain, hence the name. Assume that the pixel value at the first location is taken as the mean of cluster 1, M1 = (30, 30). A multispectral image may have m bands, in which case each pixel is represented by m values; for simplicity, this discussion considers just 2 bands and hence two values per pixel. Now consider pixel 2, with value (10, 10). The spectral distance between pixel 2 and the mean of cluster 1 is estimated using some distance measure (Euclidean, Mahalanobis, etc.). If this distance is greater than the user-specified radius (which determines the formation of a new cluster), pixel 2 becomes the mean of a new cluster, cluster 2, with M2 = (10, 10). Otherwise, pixel 2 fails the distance test for forming a new cluster and is assigned to cluster 1, and the mean of cluster 1 is recomputed as the average of the brightness values of the first and second pixels, which yields a new location for cluster 1. This process continues with pixel 3 and so on, until the number of pixels evaluated exceeds the total number specified by the user (e.g., 3000). At this point, the program calculates the distances between each pair of clusters, and the user-defined merging parameter is applied until no two clusters are separated by less than that distance. The entire image is analysed using this process.
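A minimal sketch of the cluster-generation pass described above is given below, assuming a user-specified radius in brightness value units. Pixels are visited sequentially in scan order; a pixel starts a new cluster only if it is farther than the radius from every existing cluster mean, otherwise the nearest mean is updated. The merging step is omitted for brevity, and the names are illustrative.

```python
import numpy as np

def chain_clusters(pixels, radius):
    """pixels: (n, m) array of pixel vectors in scan order; returns labels and cluster means."""
    means, counts, labels = [pixels[0].astype(float)], [1], [0]
    for x in pixels[1:]:
        d = [np.linalg.norm(x - m) for m in means]
        j = int(np.argmin(d))
        if d[j] > radius:                                  # far from all clusters: new cluster
            means.append(x.astype(float)); counts.append(1); labels.append(len(means) - 1)
        else:                                              # join nearest cluster, update its mean
            counts[j] += 1
            means[j] += (x - means[j]) / counts[j]
            labels.append(j)
    return np.array(labels), np.vstack(means)
```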
ISODATA is self-organizing because it requires relatively little human input. Unlike the chain method explained in section 2, ISODATA does not select the initial mean vectors based on the pixels present in the first line of data. Instead, it is iterative in nature, i.e. it passes through the image a sufficient number of times before coming to a meaningful conclusion. Classification using the ISODATA algorithm normally requires the analyst to specify the following criteria:
(i) The maximum number of clusters to be identified by the algorithm (C).
(ii) The maximum percentage of pixels whose class values are allowed to be unchanged
between iterations (T). This is regarded as a termination criterion for the ISODATA
algorithm.
(iii) The maximum number of times ISODATA is to classify pixels so that cluster mean
vectors can be recalculated (M) . This is also regarded as a decision rule to terminate the
ISODATA algorithm.
(iv) Minimum members in a cluster : In case a cluster seems to contain members that are less
than the minimum percentage of members, that cluster is deleted and the members are
assigned to an alternative cluster. In most of the image processing softwares, the default
minimum percentage of members is often set to 0.01.
(v) Maximum standard deviation: This specified value of standard deviation of cluster is used
to decide on whether a cluster needs to be split into two or not. When the standard deviation
for a cluster exceeds the specified maximum standard deviation and when the number of
members in a class is found to be greater than twice the specified minimum members in a
class, that cluster is split into two clusters.
(vi) Minimum distance between cluster means: This distance measure calculated in the
spectral feature space is used to merge two or more clusters with one another.
Once all the user defined inputs are supplied to the classifier, the algorithm performs in the
following manner:
The mean vectors of all the clusters are arbitrarily assigned along an m-dimensional feature space. This process of choosing mean vectors ensures that the first few lines of pixel data do
not influence or bias the creation of clusters. With the mean vectors, the algorithm searches
through the image data wherein each pixel gets compared to each cluster mean using some
distance measure and is assigned to that cluster mean to which it lies closest in the spectral
feature space. ISODATA can progress either line by line or in block by block manner. Either
way it will have some influence on the resulting mean vectors. At the end of first iteration, a
new mean vector is calculated for each cluster based on the new pixel locations that have
been assigned to each cluster based on distance measure. To perform this calculation, the
minimum members within a cluster, their maximum standard deviation and the minimum
distance between clusters need to be taken into consideration. Once a new set of cluster
means are calculated, this process is repeated wherein each pixel is again compared to the
new set of cluster means. This iteration continues until one of the thresholds is reached, i.e. either there is very little change in the class assignments between iterations or the maximum number of iterations is attained.
The ISODATA classification algorithm is a relatively slow technique: the analyst usually allows the algorithm to run through a large number of iterations in order to generate meaningful results. Details of the algorithm steps are shown schematically in Figure 1. The output of an unsupervised classification is a set of labelled pixels ranging from 1 to k, where k is the total number of classes identified by the classification algorithm. The classified image can be displayed by assigning a colour to each of these class labels. The geographical location of the pixels of each class can then be assessed to evaluate the land cover represented by those pixels. Unsupervised classification can be used as an initial method to refine the classes present within an image before carrying out a supervised classification.
Figure 1. Schematic of the ISODATA algorithm: iteration continues until there is little change in class assignment between iterations (the threshold is reached) or the maximum number of iterations is reached.
4. K means algorithm
One of the most commonly used non-parametric unsupervised clustering algorithms, well
known for its efficiency in clustering large data sets, is that of K-means. In general, all the
pixels are classified based on their distances from the cluster means. Once this is done, the
new mean vectors for each cluster are computed. This procedure is iteratively carried out
until there is no variation in the location of cluster mean vectors between successive
iterations.
Similar to the ISODATA algorithm, the K-means algorithm also assigns initial cluster vectors. The difference is that the K-means algorithm assumes that the number of clusters is known a priori. The main objective of the K-means clustering approach is to minimize the within-cluster variability.
Let N be the number of elements (pixel vectors) in the sample space and Bnd be the total number of bands; then the mean of the data points of the i-th cluster in the j-th dimension is given by Equation (3.1):

v_{ij} = \frac{1}{N_i} \sum_{k=1}^{N_i} x_{kj}, \qquad 1 \le i \le c, \; 1 \le j \le Bnd \qquad (3.1)

where the sum runs over the N_i pixels currently assigned to cluster i. The distance of each pixel from all existing cluster centres is computed, and the pixel is assigned to the cluster yielding the minimum distance. The cluster centres are then recalculated using Equation (3.1). The program terminates once the maximum number of iterations has been reached or when the objective function J, the within-cluster sum of squares given by Equation (3.2), is minimized:

J = \sum_{i=1}^{c} \sum_{j=1}^{Bnd} \sum_{k=1}^{N_i} \left( x_{kj} - v_{ij} \right)^2 \qquad (3.2)
Several measures are available for cluster merging; some commonly adopted ones are listed below:
1. Root mean square (RMS) Euclidean distance for each cluster
The RMS distance of the pixels in the i-th cluster from their cluster centre is given by Equation (3.3):

RMS_i = \sqrt{\frac{1}{N_i} \sum_{x \in X_i} \sum_{j=1}^{Bnd} \left( v_{ij} - x_j \right)^2} \qquad (3.3)

2. Average distance between cluster centres
The average Euclidean distance of the i-th cluster centre to the other cluster centres is given by Equation (3.4):

A_i = \frac{1}{c-1} \sum_{j=1, \; j \ne i}^{c} d_{ij} \qquad (3.4)

where d_{ij} is the distance between the centres of clusters i and j.
The advantages of using this technique are that it is a simple, computationally fast
clustering approach which produces tighter clusters.
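The sketch below implements the basic K-means loop described above, assuming the number of clusters c is known a priori and the initial cluster centres are drawn from the data. It is a simplified illustration and does not implement the merging measures above.

```python
import numpy as np

def kmeans(pixels, c, max_iter=100, seed=0):
    """pixels: (N, Bnd) array; returns labels (N,) and cluster means (c, Bnd)."""
    rng = np.random.default_rng(seed)
    means = pixels[rng.choice(len(pixels), size=c, replace=False)].astype(float)
    for _ in range(max_iter):
        d = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=-1)
        labels = d.argmin(axis=1)                                  # assign to nearest mean
        new_means = np.array([pixels[labels == j].mean(axis=0) if np.any(labels == j)
                              else means[j] for j in range(c)])    # recompute cluster means
        if np.allclose(new_means, means):                          # no change: converged
            break
        means = new_means
    return labels, means
```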
FUZZY CLASSIFICATION
1. Introduction
Hard classification is based on classical set theory in which precisely defined boundaries are
generated for a pixel as either belonging to a particular class or not belonging to that class.
During hard classification, each individual pixel within a remotely sensed imagery is given a
class label. This technique works efficiently when the area imaged is homogeneous in nature.
But geographical information is often heterogeneous in nature. This implies that the boundaries between different land cover classes are fuzzy and gradually blend into one another.
Fuzziness and hardness are characteristics of landscape at a particular scale of observation. If
the aim of end user is to label each pixel unambiguously, the existence of heterogeneous
pixels containing more than one land cover type will create a problem. This is owing to the
fact that the pixel may not fall clearly into one of the available classes as it represents mixed
classes. This problem also surfaces if the satellite borne instrument imaging earth has a large
field of view (1 km or more). Fuzzy set theory provides useful concepts to work with
imprecise data. Fuzzy logic can be used to discriminate among land cover types using
membership functions. These are elaborated in the sections below.
Researchers in the field of psycholinguistics have investigated the way humans evaluate concepts and derive decisions. Analysis of this kind of uncertainty usually results in a perceived probability rather than the mathematically defined probability, and this forms the basis of fuzzy sets (Zadeh, 1973). The theory of fuzzy sets was first introduced when it was realized that it may not be possible to model ill-defined systems with the precise mathematical assumptions of classical methods, such as probability theory (Chi et al., 1996). The underlying logic of fuzzy set theory is that it allows an event to belong to more than one sample space, where sharp boundaries between spaces are hardly found.
2. Fuzzy Set Operations
The operations on fuzzy sets presented in this section are based on the original work of Zadeh (1965) and should not be considered a complete collection.
1. Fuzzy Union: The union of two fuzzy sets A and B with respective membership functions \mu_A(x) and \mu_B(x) is a fuzzy set C, written as C = A \cup B, defined as the smallest fuzzy set containing both A and B; its membership function is \mu_C(x) = \max[\mu_A(x), \mu_B(x)].
2. Fuzzy Intersection: The intersection of two fuzzy sets A and B with respective membership functions \mu_A(x) and \mu_B(x) is a fuzzy set C, written as C = A \cap B, whose membership function is related to those of A and B by \mu_C(x) = \min[\mu_A(x), \mu_B(x)].
3. Fuzzy Complement: The complement A' of a fuzzy set A has the membership function

\mu_{A'}(x) = 1 - \mu_A(x) \qquad (4)
3. Membership Function
The membership function is the underlying power of every fuzzy model as it is capable of
modeling the gradual transition from a less distinct region to another in a subtle way (Chi et
al., 1996). Membership functions characterize the fuzziness in a fuzzy set, whether the
elements in the set are discrete or continuous, in a graphical form for eventual use in the
mathematical formalisms of fuzzy set theory. But the shapes used to describe fuzziness have
very few restrictions indeed. There are an infinite number of ways to graphically depict the
membership functions that describe this fuzziness (Ross et al., 2002). Since membership
functions essentially embody all fuzziness for a particular fuzzy set, its description is the
essence of a fuzzy property or operation. Because of the importance of the shape of the
membership function, a great deal of attention has been focused on development of these
functions. There are several standard shapes available for membership functions like
triangular, trapezoidal and Gaussian, etc. The direct use of available shapes for membership
function is found effective for image enhancement where different types of membership
functions are used to reduce the amount of iterations carried out by a relaxation technique and
provides a better way to handle the uncertainty of the image histogram. The choice of
membership function is problem dependent which requires expert knowledge (Zadeh, 1996).
In situations wherein prior information about data variation is not available, membership
values can also be generated from the available data using clustering algorithms, which is the
normal practice.
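As a small illustration of one of the standard membership function shapes mentioned above, the sketch below implements a triangular membership function; the break points a < b < c are hypothetical and would in practice be chosen from expert knowledge or from the data.

```python
import numpy as np

def triangular(x, a, b, c):
    """Membership rises linearly from 0 at a to 1 at b, then falls back to 0 at c (a < b < c)."""
    x = np.asarray(x, dtype=float)
    left = np.clip((x - a) / (b - a), 0.0, 1.0)
    right = np.clip((c - x) / (c - b), 0.0, 1.0)
    return np.minimum(left, right)
```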
Fuzzy set theory provides useful concepts and tools to deal with imprecise information and
partial membership allows that the information about more complex situations such as cover
mixture or intermediate conditions be better represented and utilized (Wang, 1990). Use of
fuzzy sets for partitioning of spectral space involves determining the membership grades
attached to each pixel with respect to every class. Instead of being assigned to a single class,
out of m possible classes, each pixel in fuzzy classification has m membership grade values,
where each pixel is associated with a probability of belonging to each of the m classes of
interest (Kumar, 2007). The membership grades may be chosen heuristically or subjectively.
Heuristically chosen membership functions do not reflect the actual data distribution in the
input and the output spaces. Another option is to build membership functions from the data
available for which, we can use a clustering technique to partition the data, and then generate
membership functions from the resulting clusters. A number of classification methods may be
used to classify remote sensing image into various land cover types. These methods may be
broadly grouped as supervised and unsupervised (Swain and Davis, 1978). In fuzzy
unsupervised classification, membership functions are obtained from clustering algorithms
like C-means or ISODATA method. In fuzzy supervised classification, these are generated
from training data.
The classification of remotely sensed imagery relies on the assumptions that the study
area is composed of a number of unique, internally homogeneous classes, classification
analysis is based on reflectance data and that ancillary data can be used to identify these
unique classes with the aid of ground data (Lillesand and Kiefer, 1994). The fuzzy
approaches are adopted as they take into account the fuzziness that may be characteristic of
the ground data (Foody, 1995). Zhang and Foody (1998) investigated fuzzy approach for land
cover classification and suggested that fully fuzzy approach holds advantages over both the
conventional hard methods and partially fuzzy approaches. Clustering algorithms can be
loosely categorized by the principle (objective function, graph-theoretical, hierarchical) or by
the model type (deterministic, statistical and fuzzy). In the literature on soft classification, the
fuzzy c-means (FCM) algorithm is the most popular method (Bastin, 1997; Wu and Yang,
2002; Yang et al., 2003). One of the popular parametric classifiers based on statistical theory
is the Fuzzy Gaussian Maximum Likelihood (FGML) classifier. This is an extension of
traditional crisp maximum likelihood classification wherein, the partition of spectral space is
based on the principles of classical set theory. In this method, land cover classes can be
represented as fuzzy sets by the generation of fuzzy parameters from the training data. Fuzzy
representation of geographical information makes it possible to calculate statistical
parameters which are closer to the real ones. This can be achieved by means of the
probability measures of fuzzy events (Zadeh, 1968). Compared with the conventional
methods, this method has proved to improve remote sensing image classification in the
aspects of geographical information representation, partitioning of spectral space, and
estimation of classification parameters (Wang, 1990). Despite the limitations due to its
assumption of normal distribution of class signature (Swain and Davis, 1978), it is perhaps
one of the most widely used classifiers (Wang, 1990; Hansen et al., 1996; Kumar, 2007).
In remote sensing, pixel measurement vectors are often considered as points in a spectral
space. Pixels with similar spectral characteristics form groups which correspond to various
ground-cover classes that the analyst defines. The groups of pixels are referred to as spectral
classes, while the cover classes are information classes. To classify pixels into groups, the
spectral space should be partitioned into regions, each of which corresponds to one of the
information classes defined. Decision surfaces are defined precisely by some decision rules
(for example, the decision rule of conventional maximum likelihood classifier) to separate the
regions. Pixels inside a region are classified into the corresponding information class. Such a
partition is usually called a hard partition. Fig. 4.1(a) illustrates a hard partition of spectral space and decision surfaces. A serious drawback of the hard partition is that a great quantity of spectral information is lost in determining the pixel membership. Let X be a universe of discourse whose generic elements are denoted x: X = {x}. Membership in a classical (crisp) set A of X is often viewed as a characteristic function \chi_A from X to {0, 1} such that \chi_A(x) = 1 if and only if x \in A. A fuzzy set (Zadeh, 1965) B in X is characterized by a membership function, f_B, which
associates with each x a real number in [0,1]. fB(x) represents the "grade of membership" of x
in B. The closer the value of fB(x) is to 1, the more x belongs to B. A fuzzy set does not have
sharply defined boundaries and an element may have partial and multiple memberships.
Fuzzy representation of geographical information enables a new method for spectral space
partition. When information classes can be represented as fuzzy sets, so can the cor-
responding spectral classes. Thus a spectral space is not partitioned by sharp surfaces. A pixel
may belong to a class to some extent and at the same time belong to another class to some
other extent. Membership grades are attached to indicate these extents. Such a partition is
referred to as a fuzzy partition of spectral space. Fig. 4.1(b) illustrates the membership grades of a
pixel in a fuzzy partition. A fuzzy partition of spectral space can represent a real situation
better than a hard partition and allows more spectral information to be utilized in subsequent
analysis. Membership grades can be used to describe cover class mixture and intermediate
cases.
Figure 4.1: (a) Hard partition of spectral space and decision surfaces; (b) membership grades of a pixel in a fuzzy partition of spectral space.
A fuzzy partition of spectral space over N pixels x_1, ..., x_N and c classes F_1, ..., F_c can be represented by the c \times N membership matrix

F(x) = \begin{bmatrix} \mu_{F_1}(x_1) & \mu_{F_1}(x_2) & \cdots & \mu_{F_1}(x_N) \\ \mu_{F_2}(x_1) & \mu_{F_2}(x_2) & \cdots & \mu_{F_2}(x_N) \\ \vdots & \vdots & & \vdots \\ \mu_{F_c}(x_1) & \mu_{F_c}(x_2) & \cdots & \mu_{F_c}(x_N) \end{bmatrix}

where \mu_{F_i}(x_j) is the membership grade of pixel x_j in class F_i.
Membership grades can be used to describe cover class mixture and intermediate classes. In
the process, the stray pixels between classes may be classified as such. In Supervised
Approach, which is similar to maximum likelihood classification approach, instead of normal
mean vector and covariance matrices, fuzzy mean vectors and fuzzy covariance matrices are
developed from statistically weighted training data, and the training areas may be a
combination of pure and mixed pixels. By knowing mixtures of various features, the fuzzy
training class weights are defined. A classified pixel is assigned a membership grade with
respect to its membership in each information class. In this procedure the conventional mean and covariance parameters of the training data are represented as fuzzy sets. The following two equations, (5) and (6), describe the fuzzy parameters of the training data:
M_c^* = \frac{\sum_{i=1}^{n} \mu_c(x_i)\, x_i}{\sum_{i=1}^{n} \mu_c(x_i)} \qquad (5)

V_c^* = \frac{\sum_{i=1}^{n} \mu_c(x_i)\,(x_i - M_c^*)(x_i - M_c^*)^T}{\sum_{i=1}^{n} \mu_c(x_i)} \qquad (6)
where M_c^* is the fuzzy mean of training class c; V_c^* is the fuzzy covariance of training class c; x_i is the vector value of pixel i; \mu_c(x_i) is the membership of pixel x_i to training class c; and n is the total number of pixels in the training data. In order to find the fuzzy mean (eqn. 5) and fuzzy covariance (eqn. 6) of every training class, the membership of pixel x_i to training class c must first be known. The membership function for class c, based on the conventional maximum likelihood classification algorithm with fuzzy mean and fuzzy covariance, is:
\mu_c(x_i) = \frac{P_c^*(x_i)}{\sum_{j=1}^{m} P_j^*(x_i)} \qquad (7)
where, Pc*(xi) is the maximum likelihood probability of pixel xi to class c, m is the number of
classes. The membership grades of a pixel vector x depend upon the pixel’s position in the
spectral space. The a posteriori probabilities are used to determine the class proportions in a
pixel. The algorithm iterates until there is no significant change in the membership values obtained.
As this method is an extension of MLC, it inherits its advantages and disadvantages. The
disadvantage is the normality of data assumption which it is based upon. Compared with the
conventional methods, FGML improves remote sensing image classification in the aspects of:
1) Representation of geographical information, 2) Partitioning of spectral space, and 3)
Estimation of classification parameters (Wang, 1990).
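A minimal sketch of the fuzzy mean and fuzzy covariance of equations (5) and (6) is given below, assuming training pixels x of shape (n, m) and membership grades mu of shape (n,) of those pixels to a single class (values in [0, 1]). Names are illustrative.

```python
import numpy as np

def fuzzy_statistics(x, mu):
    """Weighted (fuzzy) mean and covariance of training pixels for one class."""
    w = mu / mu.sum()
    mean = (w[:, None] * x).sum(axis=0)                                    # eq. (5)
    d = x - mean
    cov = (w[:, None, None] * np.einsum('ni,nj->nij', d, d)).sum(axis=0)   # eq. (6)
    return mean, cov
```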
Fuzzy c-means (FCM) clustering, also known as fuzzy ISODATA, is an iterative technique that differs from hard c-means, which employs hard partitioning. FCM employs fuzzy partitioning such that a data point can belong to all groups with different membership grades between 0 and 1. The aim of FCM is to find cluster centroids that minimize a dissimilarity function. Unlike hard clustering techniques such as c-means, which iteratively converge the objective function to a local minimum by assigning each sample to the nearest cluster centroid, fuzzy clustering methods assign each training sample a degree of uncertainty described by a membership grade. A pixel's membership grade with respect to a specific cluster indicates the extent to which its properties belong to that cluster: the larger the membership grade (close to 1), the more likely it is that the pixel belongs to that cluster. The FCM algorithm was first introduced by Dunn (1973), and the related formulation and algorithm were extended by Bezdek (1974). The purpose of the FCM approach, like that of conventional clustering techniques, is to minimize the criteria in the least squared error sense. For c \ge 2 and m any real number greater than 1, the algorithm chooses membership grades \mu_{i,j} \in [0, 1], with \sum_{j=1}^{c} \mu_{i,j} = 1 for each sample x_i, and cluster centroids w_j \in R^d for j = 1, 2, ..., c, so as to minimize the objective function
J_{FCM}(\mu, w) = \frac{1}{2} \sum_{j=1}^{c} \sum_{i=1}^{n} \mu_{i,j}^{m} \, \left\| x_i - w_j \right\|^2 \qquad (8)
where \mu_{i,j} is the membership grade of the i-th sample x_i in the j-th cluster. The vectors w_1, ..., w_j, ..., w_c, called cluster centroids, can be regarded as prototypes for the clusters represented by the membership grades. For the purpose of minimizing the objective function, the cluster centroids and membership grades are chosen so that a high degree of membership occurs for samples close to the corresponding centroids. The FCM algorithm, a well-known and powerful method in clustering analysis, is further modified by adding a penalty term, as follows:
J_{PFCM}(\mu, w) = \frac{1}{2} \sum_{j=1}^{c} \sum_{i=1}^{n} \mu_{i,j}^{m} \left\| x_i - w_j \right\|^2 \; - \; \frac{v}{2} \sum_{j=1}^{c} \sum_{i=1}^{n} \mu_{i,j}^{m} \ln \alpha_j \qquad (9)
where \alpha_j is a proportional constant for class j and v (\ge 0) is a constant. When v = 0, J_{PFCM} equals J_{FCM}. The penalty term is added to the J_{FCM} objective function, where
\alpha_j = \frac{\sum_{i=1}^{n} \mu_{i,j}^{m}}{\sum_{j=1}^{c} \sum_{i=1}^{n} \mu_{i,j}^{m}}, \qquad w_j = \frac{\sum_{i=1}^{n} \mu_{i,j}^{m} \, x_i}{\sum_{i=1}^{n} \mu_{i,j}^{m}}, \qquad j = 1, 2, \ldots, c \qquad (10)

\mu_{i,j} = \frac{1}{\sum_{l=1}^{c} \left[ \dfrac{\left\| x_i - w_j \right\|^2 - v \ln \alpha_j}{\left\| x_i - w_l \right\|^2 - v \ln \alpha_l} \right]^{1/(m-1)}}, \qquad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, c \qquad (11)
1. Randomly set the cluster centroids w_j (2 \le j \le c), the fuzzification parameter m (1 \le m < \infty), and a tolerance \varepsilon > 0. Give an initial fuzzy c-partition U^{(0)}.
2. Compute \alpha_j^{(t)} and w_j^{(t)} from U^{(t-1)} using equation (10). Calculate the membership matrix U = [\mu_{i,j}] with \alpha_j^{(t)} and w_j^{(t)} using equation (11).
3. Compute \Delta = \max\left( \left| U^{(t-1)} - U^{(t)} \right| \right). If \Delta > \varepsilon, go to step 2; otherwise go to step 4.
In the last step, a defuzzification process should be applied to the fuzzy partition data to
obtain the final segmentation. A pixel is assigned to a cluster when its membership grade in
that cluster is the highest. A disadvantage of FCM is that, owing to the use of an inner-product-norm-induced distance metric, its performance is good only when the data set contains clusters of roughly the same size and shape. Also, since it is unsupervised, the order of occurrence of the class fraction images cannot be predicted. However, the independence of this algorithm from any assumed data distribution makes it popular among clustering algorithms.
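The sketch below implements the fuzzy c-means iteration for the v = 0 case (i.e. the standard J_FCM objective); the penalty term of equation (9) is omitted for brevity, and a small constant guards against division by zero. All names and defaults are illustrative.

```python
import numpy as np

def fcm(pixels, c, m=2.0, eps=1e-4, max_iter=100, seed=0):
    """pixels: (n, d) array; returns membership matrix U (n, c) and centroids (c, d)."""
    rng = np.random.default_rng(seed)
    U = rng.random((pixels.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)                         # initial fuzzy c-partition U(0)
    centroids = None
    for _ in range(max_iter):
        W = U ** m
        centroids = (W.T @ pixels) / W.sum(axis=0)[:, None]   # centroid update, eq. (10) with v = 0
        dist = np.linalg.norm(pixels[:, None, :] - centroids[None], axis=-1) + 1e-12
        ratio = dist[:, :, None] / dist[:, None, :]           # d_ij / d_il for all clusters l
        U_new = 1.0 / (ratio ** (2.0 / (m - 1))).sum(axis=2)  # membership update, eq. (11) with v = 0
        converged = np.max(np.abs(U_new - U)) < eps           # step 3: termination check
        U = U_new
        if converged:
            break
    return U, centroids
```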
For accuracy assessment of a fuzzy classification, let R_n(x) and C_m(x) denote the gradual membership of a sample element x in class n of the reference data and in class m of the classification data respectively. The fuzzy set operators can be used within the matrix-building procedure to provide a fuzzy error matrix M; the assignment to the element M(m, n) involves the computation of the degree of membership in the fuzzy intersection of C_m and R_n over the sample elements.
(Table: fuzzy error matrix with elements M(1,1) ... M(c,c) for classes 1 to c, from which user's, producer's and overall accuracy are derived.)
The fuzzy error matrix can be used as the starting point for descriptive techniques in the same manner as the conventional error matrix.
Let X_{i+} denote the row marginal total of the i-th row and X_{+i} the column marginal total of the i-th column of the confusion matrix, both computed from the membership grades. Using the observed agreement p_0 and the chance agreement p_c obtained from these marginal totals, the Kappa coefficient or K_hat index is defined as (Stein, Meer and Gorte, 2002):

\kappa = \frac{p_0 - p_c}{1 - p_c} \qquad (17)
(e) Z statistic
This test determines whether the results of two error matrices are statistically similar or not (Congalton et al., 1983). It is calculated as:

Z = \frac{\left| \hat{\kappa}_a - \hat{\kappa}_b \right|}{\sqrt{\sigma_a^2 + \sigma_b^2}} \qquad (18)

where Z is the test statistic for a significant difference in large samples, \hat{\kappa}_a and \hat{\kappa}_b are the K_hat indices for the two error matrices a and b, and \sigma_a^2 and \sigma_b^2 are the variances of those K_hat indices.
6. Case Study
Fuzzy clustering algorithms explained in the previous section are applied to identify Paddy, Semi-dry and Sugarcane crops using IRS LISS-I (Linear Imaging Self Scanner) data of the Bhadra command area for the Rabi season of 1993. The Bhadra dam is located in Chikmagalur
District of Karnataka state. The dam is situated 50 km upstream of the point where Bhadra
river joins Tunga, another tributary of Krishna river, and intercepts a catchment of almost
2000 sq.km. Bhadra reservoir system consists of a storage reservoir with a capacity of 2025
M m3, a left bank canal and a right bank canal with irrigable areas of 7,031 ha and 92,360 ha
respectively. Figure 6.1 shows location map of Bhadra command area. Major crops cultivated
in the command area are Paddy, Semi-dry and Sugarcane. Paddy transplantation is staggered
over a period of more than a month and semi-dry crops are sown considerably earlier to
Paddy. The command area is divided into three administrative divisions, viz., Bhadravati,
Malebennur and Davangere.
Satellite imagery used for the study was acquired by IRS LISS-I (with a spatial resolution of 72.5 m) on 20th February, 14th March and 16th April in the year 1993. Figure 6.2 shows the standard FCC (false colour composite) of the Bhadra command area on 16th April
1993. For the study area, ground truth was collected for various crops by scientists from
National Remote Sensing Agency, Hyderabad, during Rabi 1993, by visiting the field.
Use of the penalized fuzzy c-means algorithm requires selection of values for the number of clusters c, the weighting exponent m, and the constant v. The algorithm was implemented with c = 20, 15, 9, 6 and 5 clusters, values of m between 1.4 and 1.6, and values of v between 1.0 and 1.5. The algorithm gave good results with c = 6 and 9, m = 1.5 and v = 1.0.
Since paddy transplantation is staggered across the command area, satellite data of any one date does not represent the same growth stage at all locations. In view of this heterogeneity in the crop calendar, in order to obtain a complete estimate of the area under each crop as well as to ensure better discriminability, satellite data of the three dates mentioned in the previous section are used.
Table 6.1. Semi-dry crop classified using single date imagery with c = 5 and 15

Date                  c    Available ground truth   Correctly classified   Misclassified   Accuracy (%)
20th February 1993    5            86                        84                  2              98
20th February 1993   15            86                        63                 23              73
14th March 1993       5            86                        82                  4              95
14th March 1993      15            86                        70                 16              81
Using the 14th March data with c = 5, the majority of the Paddy locations were classified into the water cluster and some locations into the Semi-dry crop cluster, because at that time paddy had just been transplanted in most of the areas and water therefore dominated the signal compared to the crop seedlings. With the 16th April data and c = 5, 42 Paddy locations were correctly classified out of the 53 available ground truth locations. Detailed results are given in Laxmi Raju (2003).
IMAGE TRANSFORMATION
1. Introduction
Image transformation is the technique used to re-express the information content within
several remotely sensed images, captured in different bands/wavelengths. Multitemporal or
multidate satellite imageries can be subjected to image transformation procedures. The
transformed images provide a different perspective of looking at satellite imagery. Such a
transformed image is bound to display properties suitable for particular applications, over and
above the information content that is present within the original image. Operations performed
include image addition, subtraction, multiplication and division. The operation chosen will
essentially depend on the application intended. For example, two images captured by the
same sensor on different dates shed light on the amount of change that has occurred between
the two dates, which is extremely important in disaster management studies in flood
aftermath etc. Another example is the wide usage of image ratioing techniques. For example,
the ratio of images captured in near infrared and red bands on a single date is used as a
vegetation index to measure hydrological variables like leaf area index (LAI), vegetation
moisture content, mapping biomass, vegetation health etc. Some popular vegetation indices
are Perpendicular Vegetation Index (PVI), Normalised Difference Vegetation Index (NDVI)
etc. In this context, it is essential to discuss the widely used technique of principal component
analysis, which also aims at re-expressing the information content of a set of images in terms
of various principal components that are extracted according to the order of their decreasing
variance. Image transformation operations also include expressing images within a frequency
domain like using wavelets that decompose the input signal in terms of wavelength (1D) or
space (2D) and scale simultaneously. With the increased availability of hyperspectral
imageries, image transformation techniques to extract details using the shape of spectral
reflectance curves have also equally advanced. All these topics are vast in their own way;
therefore this module will focus on the simpler image transformation techniques using
arithmetic operators followed by change detection.
2. Arithmetic Operations
Image addition essentially implies a form of image averaging which is mainly employed as a
means to reduce the overall noise contribution. It is usually carried out in a pixel by pixel
manner. In mathematical terms, assume that a single image captured at a certain day and time
be expressed in the form of:
I(x, y) = F(x, y) + N(x, y)
where I ( x, y) is the recorded image, F ( x, y) is the true image and the random noise
component is given by N ( x, y) . The true image value will generally be constant whereas the
noise component is usually assumed to be normally distributed with a mean of zero. The
noise component can either be positive or negative in nature. Hence, adding two images of
the same area taken at the same time will cause the error term at a particular pixel position to
get cancelled. Another point to note is the image display system. Suppose both the images
which are to be added posses an 8 bit display system whose values vary from 0-255. Addition
of these two images will cause the resulting image to have a dynamic range of 0-510 which is
not advisable. Hence, this condition is averted by dividing the sum of two images with a
factor of 2 in order to express in the dynamic range of 0-255. Image enhancement techniques
such as contrast stretching always tend to alter the dynamic range of an image. Therefore,
performing image addition in such contrast stretched images might not lead to meaningful
results. Hence, always it is advisable to carry on arithmetic operations on images before these
are subjected to any kind of enhancement techniques.
Similar to image addition, the operation of image subtraction is also performed on a pair of
images of the same area, which are co-registered but are taken during different times. Unlike
the operation of addition, image subtraction is essentially used to perform change detection
studies between the dates of imaging. This operation is also performed on a pixel by pixel
basis. Assume 2 images with an 8 bit display system showing their respective digital numbers
in a dynamic range of 0-255. The maximum positive and negative differences that can be
encountered with image subtraction will be +255 and -255. This implies that the resulting
subtracted image will need to be rescaled onto a 0-255 range. This can be done by adding 255
to the obtained difference which will automatically shift the dynamic range to 0-510. Now,
dividing this range with 2 will display the resulting subtracted image in 0-255 range.
Typically, a difference image has a histogram that is approximately Gaussian (normal) in shape, with a peak at a value of 127 indicating pixels that have not changed much and tails falling off in either direction. Difference images can be well represented using the density slicing technique or a pseudocolour transform, wherein selected parts of the dynamic range are assigned a particular colour or shade of grey. To perform this simple technique, suitable thresholds need to be selected; these can also be chosen arbitrarily or by trial and error. This is similar to assigning a numerical value to represent a particular land cover type while performing image classification. More details on change detection can be found in the book by Lunetta and Elvidge (1998), wherein various methods for the detection and measurement of change using remotely sensed images are explained.
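A minimal sketch of 8-bit image differencing with rescaling to the 0-255 range, as described above, is given below; img1 and img2 are assumed to be co-registered uint8 arrays of the same size.

```python
import numpy as np

def difference_image(img1, img2):
    diff = img1.astype(np.int16) - img2.astype(np.int16)   # range -255 .. +255
    return ((diff + 255) / 2).astype(np.uint8)             # unchanged pixels map to about 127
```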
Image multiplication is usually employed when an image consists of two or more distinctive
regions wherein an end user is just interested to view one of these regions. The best example
is masking operation which is generally used to separate regions of land from water using
information captured in the near infrared band as reflection from water bodies in the near
infrared band is very low and those from vegetated land regions is very high. A suitable
threshold can be chosen by visually inspecting the image histogram of the near infrared pixel
values. With this threshold, a binary mask can be generated wherein all the pixels having
values below the chosen threshold are labeled as ‘1’ and those having pixel values above the
chosen threshold are labeled as ‘0’. This immediately results in a black and white image
which can then be used to multiply with the original image to extract the required regions.
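The masking operation described above can be sketched as follows; the threshold value used here is purely hypothetical and would normally be chosen by inspecting the histogram of the near infrared band.

```python
import numpy as np

def apply_water_mask(nir_band, image, threshold=40):
    """Label pixels below the NIR threshold as 1 (e.g. water), others as 0, and multiply."""
    mask = (nir_band < threshold).astype(image.dtype)
    return image * mask[..., None] if image.ndim == 3 else image * mask
```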
Collecting accurate information regarding the world's food crops is important for various applications, but collecting it using in situ techniques is not only expensive and time consuming but often nearly impossible. An alternative method is to measure the vegetation amount using spectral measurements from remotely sensed imagery. The aim is to effectively extract
information using multiple bands of satellite imagery in order to predict the canopy
characteristics such as biomass, leaf area index (LAI) etc. Generation of ratio images is one
of the most commonly used transformations applied to remotely sensed images. Image
ratioing consists of dividing the pixel values of one image by the corresponding pixel values
in a second image. This has many advantages like reduction of the undesirable effects on
radiance values which may result either owing to variable illumination or due to varying
topography. Also, different aspects of spectral reflectance curves of different earth surface
cover types can be brought out by this technique of image ratioing. These two main
properties of ratio images (i.e., reduction of topographic effect and correlation between ratio
values and shape of spectral reflectance curves between two given wavebands) enable the
widespread usage of spectral ratios in geology. The most common use of image ratioing is to
study the vegetation status. Vegetation tends to reflect strongly in the near infrared band and to absorb radiation in the red band, so the ratio of these two bands results in a grayscale image in which vegetated areas appear bright. This can be subjected to low pass filtering and density sliced to create an image that shows variation in
biomass and in green leaf area index. It should be noted that the concepts of low pass filtering
have been introduced in module 4. Ratio images tend to display the spectral or color
characteristics of image features irrespective of the scene illumination. Consider a
hypothetical situation wherein the digital numbers observed for each land cover type are
tabulated below:
(Table: digital numbers of each land cover type in two bands, under sunlit and shadowed conditions, together with their band ratio; for example, a shadowed area with digital numbers 11 and 16 gives a ratio of 11/16 ≈ 0.69.)
It can be observed that the ratio of the two bands gives nearly the same value irrespective of the scene illumination. This is because ratio images display the variations in the slopes of the spectral reflectance curves between the two bands involved, irrespective of the absolute reflectance values observed in these two bands. The ratio of the near infrared band to the
red band is very high for healthy vegetation whereas it tends to be comparatively lower for
stressed vegetation. The resulting vegetation indices have been extensively employed to
quantify relative vegetation greenness and biomass values. Some of the popular vegetation
indices are discussed below:
Rouse et al. (1973) proposed a vegetation index based on the normalized difference of brightness values from MSS7 and MSS5, which was named the normalized difference vegetation index, hereinafter referred to as NDVI. This vegetation index uses the sums and differences of bands rather than their absolute values, which makes it more suitable for use in studies where change detection over a particular area is involved, since the absolute values might be affected by varying atmospheric conditions, illumination and viewing angles, soil background reflectance etc. If images are captured in the near infrared (NIR) and red (R) bands, then NDVI is defined as

NDVI = \frac{NIR - R}{NIR + R}

Vegetation indices based on NDVI have been used extensively to measure vegetation amount on a worldwide basis. Maximum-value 10-day NDVI composites can be corrected for cloud cover and used for crop forecasts, as proposed by Groten (1993).
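A minimal sketch of NDVI computation from co-registered NIR and red bands (reflectance or DN arrays) is shown below; a small constant guards against division by zero.

```python
import numpy as np

def ndvi(nir, red):
    nir, red = nir.astype(float), red.astype(float)
    return (nir - red) / (nir + red + 1e-10)   # values range from -1 to +1
```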
Deering et al. (1975) added 0.5 to NDVI and took the square root, which resulted in the transformed vegetation index (TVI). The TVI is expressed as

TVI = \sqrt{\frac{NIR - R}{NIR + R} + 0.5} \times 100

This index aids ranch management decisions, as TVI values correlate with the estimated forage level present in pastures covered by remotely sensed satellite imagery. Different versions of TVI have since been proposed by several scientists.
In order to overcome the limitations of the PVI index (discussed later in this section), Huete
(1988) proposed the soil adjusted vegetation index (SAVI). He transformed the NIR and red
reflectance axes in order to minimize the error owing to soil brightness variation. For this
purpose, two parameters (L1 and L2) were proposed which, when added to the NIR and red
reflectance bands respectively, were found to either remove or reduce the variation caused by
soil brightness. The expression is given as:
SAVI = (NIR + L1) / (R + L2)
A number of indices have been developed which correspond to this class. Several
modifications have also been proposed along the years. Steven (1998) proposed the
optimized soil adjusted vegetation index (OSAVI) which is given by the expression:
OSAVI = (NIR − R) / (NIR + R + 0.16)
This index minimizes soil effects. More information regarding this can be obtained from
Baret and Guyot (1991), Sellers (1989) etc.
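Continuing the hypothetical NumPy sketch introduced earlier (the formulas come from the expressions quoted above; the function and array names are assumptions, and float reflectance arrays with NIR + R > 0 are assumed):

import numpy as np

def tvi(nir, red):
    # TVI = sqrt(NDVI + 0.5) * 100, per the expression above (assumes NDVI >= -0.5).
    nd = (nir - red) / (nir + red)
    return np.sqrt(nd + 0.5) * 100.0

def osavi(nir, red):
    # OSAVI = (NIR - R) / (NIR + R + 0.16), after Steven (1998) as quoted above.
    return (nir - red) / (nir + red + 0.16)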
Developed by Richardson and Wiegand (1977), this index indicates plant development by
relying on a plot of radiance values obtained in the visible red band against those obtained in
the near infrared band. In such a plot, bare soil pixels with no vegetation essentially lie along a
straight line (the 45° ‘soil line’). Pixels depicting vegetation lie below and to the right of the
soil line. Richardson and Wiegand (1977) proposed that the perpendicular distance to the soil
line can be used as a measure that correlates with leaf area index and biomass, and they
expressed the PVI in terms of Landsat Multispectral Scanner (MSS) bands.
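For orientation only (this is the commonly cited generic soil-line form of PVI, not the specific MSS-band expression of Richardson and Wiegand), the perpendicular distance from a pixel with reflectances (R, NIR) to a soil line NIR = a·R + b can be written as

PVI = (NIR − a·R − b) / √(1 + a²)

where a and b are the slope and intercept of the soil line in NIR–red space.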
In 1984, Perry and Lautenschlager modified the PVI indices, finding them to be
computationally inefficient and incapable of distinguishing water from green vegetation.
They proposed a new perpendicular vegetation index which addressed these shortcomings. PVI has
been used extensively to account for background variation in soil conditions which
affects soil reflectance properties. However, this index is easily affected by rainfall when the
vegetation cover is incomplete. For wet soil conditions, PVI will essentially underestimate
leaf area index as canopy cover increases. Hence, PVI was considered to perform moderately
well but to be less efficient in detecting plant stress. It should be noted that this index is not widely
used at present; it has been described here mainly to explain the concept of the ‘soil line’. The Tasseled cap
transformation is a better alternative which is discussed in the next section.
Developed by Kauth and Thomas (1976), the tasseled cap transformation produces an
orthogonal transformation of the original four channel Multi Spectral Scanner (MSS) data
into a four dimensional space. This technique has found extensive usage in agricultural
research as it resulted in four new indices, namely the soil brightness index (SBI), the green
vegetation index (GVI), the yellow stuff index (YSI) and a non-such index (NSI) that is
associated with atmospheric effects. The soil line and the vegetation region are represented
better by this transformation. In this new, rotated coordinate system, there exist
four axes of ‘brightness’, ‘greenness’, ‘yellowness’ and ‘nonesuch’. The ‘brightness’ axis
represents variations in soil background reflectance whereas the ‘greenness’ axis displays the
vigour of green vegetation. The ‘yellowness’ axis represents the yellowing associated with
senescent vegetation and the ‘nonesuch’ axis is related to atmospheric conditions. The point
to note is that these four axes are statistically uncorrelated, and together they span the
four dimensional space defined by the four Landsat MSS bands.
This technique essentially provides a physically based coordinate system for interpreting
images of agricultural area captured during different growth stages of the crop. As the new
sets of axes are defined a priori, they will not be affected by variations in growth stages of
crop or variations from image to image captured over a period of time. This technique can be
very well compared with the principal component analysis wherein the new set of axes
(principal components) can be computed using statistical relationships between individual
bands of the image being analysed. As a direct consequence, in principal component analysis,
the correlations among various bands will differ based upon the statistics of pixel values in
each band which in turn will vary over a period of time (different for growing season and for
end of growing season of crops). Kauth et al (1979) and Thompson and Wehmanen (1980)
derived coefficients for the four tasseled cap indices based on Landsat 2 MSS data.
Similar to the principal component analysis, the tasseled cap transform relies on empirical
information for estimating the coefficients of the brightness axis.
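As a schematic illustration (not from the lecture), the tasseled cap transformation is simply a fixed linear mapping applied to the four MSS band values. The 4 × 4 coefficient matrix below is a placeholder; the published Kauth and Thomas (1976) coefficients must be substituted before any real use.

import numpy as np

# Placeholder coefficient matrix: rows = (brightness, greenness, yellowness, nonesuch),
# columns = the four MSS bands. Substitute the published Kauth and Thomas (1976) values.
COEFF = np.array([
    [0.25,  0.25,  0.25, 0.25],    # brightness  (placeholder values)
    [-0.25, -0.25, 0.25, 0.25],    # greenness   (placeholder values)
    [-0.25, 0.25, -0.25, 0.25],    # yellowness  (placeholder values)
    [0.25, -0.25, -0.25, 0.25],    # nonesuch    (placeholder values)
])

def tasseled_cap(mss_pixels):
    # mss_pixels: (n_pixels, 4) array of MSS band values; returns the four transformed indices.
    return mss_pixels @ COEFF.T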
Principal component analysis (PCA), also known as Karhunen-Loeve analysis, transforms the
information inherent in multispectral remotely sensed data into new principal component
images that are more interpretable than the original data. It compresses the information
content of a number of bands into a few principal component images. This enables
dimensionality reduction of hyperspectral data. Generally within a multispectral imagery, the
adjacent bands will depict mutual correlation. For example, if a sensor captures information
in the visible and near infrared wavelengths, the values observed over vegetated areas in the
two bands will be negatively correlated. Imagine a multispectral or a hyperspectral imagery
with more than 2 bands which are inter-correlated. The inter-correlation between bands
indicates repetition of information between the adjacent bands.
Consider two variables x and y that are mutually correlated and which are plotted using a
scatter diagram. The relationship between x and y can be very well represented using a
straight line sloping upwards towards right (assuming that x and y are positively correlated).
Now suppose that x and y are not perfectly correlated and that there exists a variability along
some other axis. Then the dominant direction of variability can be chosen as the major axis
while another second minor axes can be drawn at right angles to it. A plot with both these
major and minor axis may be a better representation of the x-y structure than the original
horizontal and vertical axes. Using this background information, assume that the pixel values
of two bands of Thematic Mapper are drawn using a scatter plot. Let X1 and X2 denote the
respective bands, and let µ1 and µ2 represent their corresponding mean values. The
spread of the points (pixel values) indicates the correlation, and hence the information content,
of the two bands. If the points are tightly clustered within a two dimensional
space, it means that they would provide very little information to the end user. It means that
the original axis of X1 and X2 might not be a very good representative of the 2D feature
space in order to analyze the information content associated with these two bands. Principal
component analysis can be used to rotate the original axes so that the original
brightness values (pixel values) are redistributed or reprojected onto a new set of axes
(the principal axes). For example, the new coordinate system (with axes X1′ and X2′) can be
represented as in the scatter plot below.

Figure: Scatter plot of the two bands, showing the original axes X1 and X2, their means µ1 and µ2,
and the rotated principal axes X1′ (PC1) and X2′ (PC2).
In order to arrive at the principal axes, certain transformation coefficients need to be obtained
which can be applied to the original pixel values. The steps for this transformation are
discussed below:
a) Compute the covariance matrix of the n dimensional remotely sensed data set.
The importance of variance to define the points represented by a scatter plot along the
dominant direction has already been stressed. If variance is used to define the shape of the
ellipsoid covering the points (in an n dimensional variable space), then the scales used for
measuring each variable must be comparable with one another. If they are not, the variance of
each variable will differ and the shape of the enclosing ellipsoid will not remain the same. This may also
create further complications, as the shape of one ellipsoid cannot then be related mathematically to
the shape of another. In these circumstances, the correlation coefficient can be
used rather than the covariance, i.e., the variables can be standardized. To standardize the
variables, the mean value is subtracted from all measurements and the result is
divided by the standard deviation, which converts the raw values to z-scores or
standard scores having zero mean and unit variance. It should be noted that use of the
covariance matrix yields an unstandardized PCA, while use of the correlation matrix yields a
standardized PCA.
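A short NumPy check of this distinction (illustrative only; the random array X is an assumption): the covariance matrix of the z-scored bands equals the correlation matrix of the raw bands.

import numpy as np

X = np.random.rand(1000, 6)                          # 1000 pixels, 6 hypothetical band values
Z = (X - X.mean(axis=0)) / X.std(axis=0)             # z-scores: zero mean, unit variance per band
# Unstandardized PCA uses cov(X); standardized PCA uses cov(Z), i.e. the correlation matrix of X.
print(np.allclose(np.cov(Z, rowvar=False, bias=True), np.corrcoef(X, rowvar=False)))  # True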
Consider that there are n number of bands within a multispectral remotely sensed imagery.
For these n bands, the covariance (or correlation) matrix will have n rows and n columns. Quantities known as eigenvalues can
be found for the chosen matrix. Eigenvalues are proportional to the lengths of the principal axes of
the ellipsoid whose units are measured using variance. In order that the variables be measured
on comparable scales, standardized units of variance must be used, as stated in previous
paragraph. Each eigenvalue will be associated with a set of coordinates which are known as
eigenvectors. The eigenvalues and eigenvectors will together describe the lengths and
directions of the principal axes. The eigenvalues will contain important information such as
the total percent of variance explained by each of the principal components using the
expression
Variance explained by the pth principal component (%) = ( λp / Σ i=1 to n λi ) × 100
If λi,i (for i = 1, 2, …, n) denote the eigenvalues of the n×n covariance matrix, they can be
arranged along the diagonal of a matrix:

Λ = diag( λ1,1 , λ2,2 , λ3,3 , … , λn,n )
The eigenvectors when scaled using the square roots of their corresponding eigenvalues can
be interpreted as correlations between the principal components and the individual bands of
the image. The correlation of each band with respect to each of the principal components can
be computed. This gives us an idea regarding how each band ‘loads’ or otherwise is
associated with respect to each principal component. The expression can be given as:
Rkp = ( akp × √λp ) / √Vark

where akp = the eigenvector element for band k and component p, λp = the pth eigenvalue, and
Vark = the variance of band k in the input image.
Numerical Example
PCA is based on four assumptions namely, linearity, sufficiency of mean and variance,
orthogonality of principal components and that large variances have important dynamics.
The second assumption states that the mean and variance are sufficient statistics to fully
define the probability distribution. For this assumption to hold, the probability distribution
of the variable considered must belong to the exponential family (e.g., Gaussian). This guarantees that the signal-to-
noise ratio together with the covariance matrix fully describes the noise and the
redundancies. The fourth assumption implies that the data have a high signal-to-noise ratio,
and hence that the principal components with larger variances represent the dynamics of
interest while those with lower variances represent noise. PCA can be solved using linear
algebra decomposition techniques.
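A minimal NumPy sketch of this eigen-decomposition route (an illustration only, not part of the lecture; the function and variable names, and the random test image, are assumptions):

import numpy as np

def pca(image, standardize=False):
    # image: (rows, cols, bands) array; returns component images, % variance and loadings.
    X = image.reshape(-1, image.shape[-1]).astype(np.float64)
    if standardize:
        X = (X - X.mean(axis=0)) / X.std(axis=0)     # z-scores -> standardized (correlation) PCA
    C = np.cov(X, rowvar=False)                      # step (a): n x n covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)             # eigenvalues and eigenvectors of C
    order = np.argsort(eigvals)[::-1]                # rank components by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    percent_variance = 100.0 * eigvals / eigvals.sum()
    scores = (X - X.mean(axis=0)) @ eigvecs          # project pixel vectors onto the principal axes
    loadings = eigvecs * np.sqrt(eigvals) / np.sqrt(np.diag(C))[:, None]   # Rkp of the expression above
    return scores.reshape(image.shape), percent_variance, loadings

# Example with synthetic data: a 100 x 100 image with 6 bands.
pcs, pct, load = pca(np.random.rand(100, 100, 6))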
Assume a hypothetical situation in which the brightness values of the pixel at row 1, column 1
in seven bands of a satellite sensor are represented by a vector X such that

X = [ BV1,1,1, BV1,1,2, BV1,1,3, BV1,1,4, BV1,1,5, BV1,1,6, BV1,1,7 ]T
  = [ 20, 30, 22, 60, 70, 50, 62 ]T
We will now apply the appropriate transformation to this data such that it is projected onto
the first principal component’s axes. In this way we will find out what the new brightness
value will be, for this component. It is computed using the formula:
newBVi,j,p = Σ k=1 to n ( akp × BVi,j,k )

where akp = the kth element of the pth eigenvector, and BVi,j,k = brightness value in band k for the pixel at row i, column j.
newBV1,1,1 = a1,1(BV1,1,1) + a2,1(BV1,1,2) + a3,1(BV1,1,3) + a4,1(BV1,1,4) + a5,1(BV1,1,5) + a6,1(BV1,1,6) + a7,1(BV1,1,7)
           = … + 0.106(50) + …
           = 119.53
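The same computation in NumPy form (illustrative only; the eigenvector values below are placeholders, so the result will not reproduce 119.53):

import numpy as np

bv = np.array([20, 30, 22, 60, 70, 50, 62], dtype=float)       # BV(1,1,k) for bands k = 1..7
a1 = np.array([0.30, 0.20, 0.20, 0.40, 0.60, 0.106, 0.30])     # placeholder first eigenvector a(k,1)
new_bv_pc1 = bv @ a1                                           # newBV(1,1,1) = sum_k a(k,1) * BV(1,1,k)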
The signal-to-noise ratio (SNR) is defined as

SNR = σ²signal / σ²noise
A high value of SNR indicates high precision data whereas a lower value indicates data
contaminated with noise.
Principal components are linear combinations of the original variables (like image pixel
values) with the coefficients being defined such that the criterion of maximum variance gets
satisfied. The question which needs to be asked is whether there exists any criterion other
than that of maximum variance that can be used to estimate the weights of such linear
combinations. In this context, a new criterion, i.e., maximizing the SNR, can be followed.
How can this criterion be maximized?
A method should be devised that is capable of separating the measurements into two parts,
with the first part showing the signal and the second part showing the contribution of noise. If
the dataset consists of n number of bands, firstly the covariance matrix can be computed (C).
Then, the horizontal and vertical pixel differences in each of these n bands can be
determined. This in turn can be used to compute their covariance matrices which when
combined produce the noise covariance matrix (CN). The covariance matrix of the signal (CS)
is estimated by subtracting the covariance matrix of the noise from that of the measurement.
This results in the criterion which can be written as maximizing the ratio of C S/CN. The
outcome of noise-adjusted PCA is a set of linear combinations of the n spectral bands that
are ranked from 1 (having the highest signal-to-noise ratio) to n (having the lowest signal-to-
noise ratio). The coefficients are applied to the data in exactly the same manner as PCA
coefficients. Hence, the components are estimated using the maximum signal-to-noise
ratio criterion instead of the maximum variance criterion.
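A sketch of this noise-adjusted procedure under the stated assumptions (SciPy's generalized eigensolver is used here purely for illustration; the function name and the simple averaging of the two difference covariances are assumptions):

import numpy as np
from scipy.linalg import eigh

def noise_adjusted_pca(image):
    # image: (rows, cols, bands). Returns SNR-ranked eigenvalues and transform coefficients.
    X = image.astype(np.float64)
    bands = X.shape[-1]
    C = np.cov(X.reshape(-1, bands), rowvar=False)                    # measurement covariance
    dh = (X[:, 1:, :] - X[:, :-1, :]).reshape(-1, bands)              # horizontal pixel differences
    dv = (X[1:, :, :] - X[:-1, :, :]).reshape(-1, bands)              # vertical pixel differences
    CN = 0.5 * (np.cov(dh, rowvar=False) + np.cov(dv, rowvar=False))  # combined noise covariance
    CS = C - CN                                                       # signal covariance estimate
    # Maximizing CS/CN is a generalized eigenproblem: CS v = lambda CN v.
    eigvals, eigvecs = eigh(CS, CN)
    order = np.argsort(eigvals)[::-1]                                 # 1 = highest SNR ... n = lowest
    return eigvals[order], eigvecs[:, order]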
Bibliography
6. Foody, G.M. (2002) ‘Status of land cover classification accuracy assessment’, Remote
Sensing of Environment, Vol.80,185–201.
7. Foody, G.M. and Cox, D.P. (1994) ‘Sub-pixel land cover composition estimation
using a linear mixture model and fuzzy membership functions’, International Journal
of Remote Sensing, Vol.15, 619–631.
8. Groten, S. M., 1993,” NDVI-Crop Monitoring and Early Warning Yield Assessment
of Burkina Faso,” International Journal of Remote Sensing, 14 (8):1495-1515.
9. Guyot, G. and Gu, X.-F., 1994, Effect of radiometric corrections on NDVI determined
from SPOT-HRV and Landsat – TM data. Remote Sensing of Environment, 49, 169-
180.
10. Huete, A., 1989, Soil influences in remotely sensed vegetation-canopy spectra. In:
Asrar, G. (ed.) (1989), 107-141.
11. Jensen, J. R., 1996, Introductory Digital Image Processing, Prentice Hall.
12. Kauth, R. J. and G. S. Thomas, 1976,”The Tasseled Cap-A Graphic description of the
spectral-temporal development of agricultural crops as seen by Landsat,”
Proceedings, Symposium on Machine Processing of remotely sensed data. West
Lafayette, IN:Laboratory for applications of remote sensing, pp. 41-51.
15. Lunetta, R. S. and Elvridge, C. D. (eds.), 1998, Remote Sensing Change Detection:
Environmental Monitoring, Methods and Applications. Chelsea, MI: Ann Arbor
Press.
16. Mather, P. M., 2004, Computer Processing of Remotely-Sensed Images, Wiley &
Sons.
17. Perry, C. R., and L. F. Lautenschlager, 1984, “ Functional Equivalence of spectral
vegetation indices,” Remote Sensing of Environment, 14:169-182.
18. Richardson, A. J. and C. L. Wiegand, 1977, “ Distinguishing vegetation from soil
background information,” Remote sensing of environment, 8:307-312.
19. Rouse, J. W., R. H. Haas, J. A. Schell, and D. W. Deering, 1973, “Monitoring
vegetation systems in the great plains with ERTS,” Proceedings, 3rd ERTS Symposium,
Vol. 1, pp. 48-62.
20. Sellers, P., 1989, Vegetation-canopy reflectance and biophysical properties. In: Asrar,
G. (ed.) (1989), 297-335.
21. Steven, M. D., 1998, The sensitivity of the OSAVI vegetation index to observational
parameters. Remote Sensing of Environment, 63, 49-60.
22. Thompson, D. R., and O. A. Wehmanen, 1980, “ Using Landsat Digital Data to detect
moisture stress in corn-soybean growing regions,” Photogrammetric Engineering &
Remote Sensing, 46:1082-1089.
23. Wang, F. (1990) ‘Improving Remote Sensing Image Analysis through Fuzzy
Information Representation’, Photogrammetric Engineering And Remote Sensing,
Vol. 56, 1163-1169.
24. Wu, K.L. and Yang, M.S. ( 2002) ‘Alternative c-means clustering algorithms’,
Pattern Recognition, Vol.35, 2267–2278.
25. Yang, M.S., Hwang, P.Y. and Chem, D.H. (2003) ‘Fuzzy clustering algorithms for
mixed feature variables’, Fuzzy Sets and Systems, Vol.141, 301–317.
26. Zadeh, L.A. (1973) ‘Outline of a new approach to the analysis of complex systems
and decision processes’, IEEE Transactions on systems, Man And Cybernetics,
Vol.SMC-3, No.1,28-44.
27. Zhang, J., Foody, G.M. (1998) ‘A fuzzy classification of sub-urban land cover from
remotely sensed imagery’, International Journal of Remote Sensing, Vol.19, No 14,
2721-2738.
1. Introduction
For effective data processing, various types of remote sensing software systems need to be
applied reasonably and flexibly. This enables effective feature extraction useful for various
applications. ERDAS, ENVI and ArcGIS are some of the popular professional image
processing software packages. They possess powerful image processing functions and are
capable of manipulating images in many ways. They are designed to provide
comprehensive analysis of satellite and aircraft remote sensing data. They are not only
innovative but also very user-friendly for displaying and analyzing images. For example, ENVI
allows the user to work with multiple bands, extract spectra, use spectral libraries, process
high spectral resolution datasets such as AVIRIS for hyperspectral analysis, and use specialized
capabilities for the analysis of advanced Synthetic Aperture Radar datasets like TOPSAR,
AIRSAR etc. Similarly, the ArcGIS desktop applications provide a very high level of
functionality like comprehensive mapping and analysis tools, geo processing tools, advanced
editing capabilities that allow data to be integrated from a wide variety of formats including
shape files, coverage tables, computed aided drafting drawings, triangulated irregular
networks etc. ERDAS Imagine uses a comprehensive set of tools such as image
orthorectifications, mosaicking, classification, reprojection, image enhancement techniques
that enable an end user to analyze remotely sensed satellite imagery and present the same in
various formats ranging from 2 dimensional maps to 3 dimensional models. These software
find applications in varied fields of engineering and sciences. Archaeologists use image
classification to identify features that cannot otherwise be observed visually by standing on
the ground. Biologists use them for delineating wetlands, to identify vegetation species and
land cover. Accurate estimates of slope, aspect and spot elevation can be ascertained.
Hydrologists process satellite imageries for water quality management studies, land use
classification, delineating watersheds etc. In geology, the image processing software can
enable identification of fracture zones, deposition of ore minerals, oil and gas, identification
of anomalies in the earth’s magnetic field, electrical fields or radiation patterns, to identify
geologic faults and folds, to study movement of crustal plates, disaster management study of
floods and landslides etc. This module introduces these popular software packages, explaining
one of their functionalities in detail.
2. ERDAS IMAGINE
ERDAS Imagine is a raster-based software package that is specifically designed for information
extraction from images. The ERDAS product suite consists of three products
for geographic imaging, remote sensing and GIS applications. The functions embedded
involve importing, viewing, altering and analyzing both raster and vector data sets. This
software is capable of handling an unlimited number of bands of image data in a single file.
These bands imported into ERDAS IMAGINE are often treated as layers. Additional layers
can be created and added to existing image file. It allows users to import a wide variety of
remotely sensed imagery from satellite and aerial platforms. Depending on user requirements,
this software is available in three levels namely: Essentials; Advantage and Professional. This
software also has many add-on modules. The functionality includes a range of classification,
feature extraction, change detection, georeferencing etc. The range of add-on modules
includes Virtual GIS, IMAGINE Vector, Radar Mapping Suite etc. One functionality of this
software is discussed in detail. More information regarding the product suite can be obtained
at geospatial.intergraph.com/Libraries/Tech.../ERDAS_Field_Guide.sflb.ashx
ERDAS IMAGINE allows an unlimited number of layers/ bands of data to be used for one
classification. Usual practice is to reduce dimensionality of the data as much as possible as
unnecessary data tend to consume disk space thereby slowing down the processing.
To perform supervised classification using ERDAS IMAGINE, open the image in a viewer.
ERDAS IMAGINE enables users to identify training samples using one or more of
the following methods:
Select the Signature Editor using the Classifier button, and collect signatures representing
each land cover class in the viewer. Use the Create Polygon AOI button from the AOI
tools; after outlining a polygonal area, double click to finish. The polygon
should lie completely within a homogeneous part of the land cover class of
interest. Now use the ‘Create New Signature’ button in the
Signature Editor tool in order to add the sample. Polygons selected
using the AOI tool are stored as vector layers, which can be used as input to the
AOI tool to create signatures. Training areas can also be selected using user-defined
polygons or using the seed pixel approach. Selecting training samples, being a crucial step
in supervised classification, is an iterative process: in order to identify signatures that
accurately represent the classes, repeated selection may be required, and the signatures
need to be evaluated and manipulated (merged, deleted, appended etc.) as necessary.
Three or more signatures should be collected for each land cover type to be classified.
Once this procedure is complete, save the signature file.
Use the ‘Classifier’ button from the menu and select ‘Supervised Classification’.
Select the satellite imagery as the ‘Input Raster File’. Similarly, load the file
created using the signature editor in the ‘Input Signature File’ box. Enter a
new file name for the classified image and press OK.
Figure 2.1: Thematic maps obtained after performing supervised classification in ERDAS IMAGINE using (a) the Mahalanobis distance, (b)
minimum distance to means and (c) maximum likelihood classification methods, and (d) the signature file used
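For readers who want to see what one of the classifiers named in the caption does numerically, here is a minimal minimum-distance-to-means sketch in NumPy (illustrative only; it is not ERDAS code, and the array names are assumptions):

import numpy as np

def minimum_distance_classify(image, class_means):
    # image: (rows, cols, bands); class_means: (n_classes, bands) built from the training signatures.
    pixels = image.reshape(-1, image.shape[-1]).astype(np.float64)
    dist = np.linalg.norm(pixels[:, None, :] - class_means[None, :, :], axis=2)  # pixel-to-mean distances
    return dist.argmin(axis=1).reshape(image.shape[:-1])                         # index of the nearest class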
3. ENVI
ENVI employs a graphical user interface (GUI) in order to select image processing functions.
ENVI uses a generalized raster data format that allows it to handle nearly any image file, including those
that contain their own embedded header information. Generalized raster data are
stored in either band sequential (BSQ), band interleaved by pixel (BIP) or band interleaved
by line (BIL) format. When using ENVI, a number of windows (like main, scroll and zoom
windows) and dialog boxes will appear on the screen. These allow an end user to manipulate
and analyze the image. The display menu provides access to interactive display and analysis
functions.
Figure 3.1: Main, scroll and zoom windows of ENVI showing image displayed
Mosaicing
This module provides a working knowledge regarding ENVI image mosaicing capabilities.
Basically, mosaicing refers to combining multiple images into a single composite image.
ENVI provides users with the ability of placing non georeferenced images (i.e., images with
no coordinate system attached to it) within a mosaic. The software allows creating and
displaying mosaics without the creation of large files. Most of the mosaics require contrast
stretching and histogram matching in order to minimize the image differences in the resulting
output mosaic. Hence, the first step in a mosaic using ENVI is to contrast stretch the images.
This technique is used such that the grayscales of images are matched with the base image.
ENVI’s interactive contrast stretching function can be used to perform this procedure. The
steps are:
Display the two images which are to be subjected to histogram matching in two
separate display windows.
Identify the overlap area and place the zoom windows of both images within the
overlap.
Apply the output histogram from the base image to the histogram of the second image
Save both the images and repeat for additional overlapping images as required.
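ENVI performs this matching internally; purely to illustrate the principle, a NumPy sketch of histogram matching is given below (the function and array names are assumptions).

import numpy as np

def match_histogram(source, reference):
    # Remap source grey levels so that their cumulative histogram follows the reference image.
    src_vals, src_idx, src_counts = np.unique(source.ravel(),
                                              return_inverse=True, return_counts=True)
    ref_vals, ref_counts = np.unique(reference.ravel(), return_counts=True)
    src_cdf = np.cumsum(src_counts) / source.size
    ref_cdf = np.cumsum(ref_counts) / reference.size
    mapped = np.interp(src_cdf, ref_cdf, ref_vals)   # reference level with the closest CDF value
    return mapped[src_idx].reshape(source.shape)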
3.2 Feathering
ENVI provides the functionality of feathering that is used to blend or blur the scenes
between mosaiced images using either edge feathering or cutline feathering.
a) Edge Feathering
This function requires the user to provide the edge feathering distance. The overlap is blended
using a linear ramp that averages the two images across that distance (a small numerical
sketch of this ramp is given after the two cases below). For example, if the specified
distance is 30 pixels, 0% of the top image and 100% of the bottom image are used to make
the output at the edge. At the specified distance of 30 pixels in from the edge, 100% of the
top image and 0% of the bottom image are used, and 50% of each image is used to make
the output at 15 pixels from the edge.
b) Cutline Feathering
This functionality uses the distance specified in the cutline feathering distance. For
example, if the specified distance is 20 pixels, 100% of the top image is used in the
blending at the cutline and 0% of the bottom image is used to make the output image. At
the specified distance of 20 pixels out from the cutline, 0% of the top image is used to
make the output image and 100% of the bottom image is used. 50% of each image is then
used to make the output at 10 pixels out from the cutline.
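A small NumPy sketch of the linear-ramp blending described above (illustrative only; ENVI performs this internally, and the function name and the 30-pixel default are assumptions):

import numpy as np

def edge_feather(top, bottom, distance=30):
    # Blend two overlapping image strips along the last axis: the weight of the top image
    # ramps linearly from 0 at the seam to 1 at `distance` pixels in from the seam.
    top = top.astype(np.float64)
    bottom = bottom.astype(np.float64)
    w = np.clip(np.arange(top.shape[-1]) / float(distance), 0.0, 1.0)
    return w * top + (1.0 - w) * bottom

# At the seam the output is 100% bottom image, at 30 pixels it is 100% top image,
# and at 15 pixels each image contributes 50%, as described above.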
4. ARCGIS
ArcView: It is the desktop version of ArcGIS which is the most popular of the GIS software
programs.
ArcEditor: This includes all the functionalities of ArcView together with the ability to edit
features in a multiuser geodatabase.
ArcInfo: This is Esri’s professional GIS software, which includes the functions of ArcView and
ArcEditor.
The basic ArcGIS desktop products include an enormous amount of functionality and
extensions that can be integrated with the existing functions. Hence, a comprehensive review
of these can be obtained at: http://webhelp.esri.com/arcgisdesktop
Tool               Use
Spatial Analyst    Enables modeling and analysis with raster data.
3D Analyst         Allows users to visualize and analyze spatial data in 3D.
Triangular Irregular Networks (TIN) have been used to represent surface morphology. These
are constructed by triangulating a set of vertices which can be interpolated using different
methods. ArcGIS supports the Delaunay triangulation method for creating a DEM (Digital
Elevation Model) using TIN. In a TIN, the elevations are represented at the vertices of
irregularly shaped triangles which may be small in number when the surface is flat and may
be numerous for a surface with steep slope. TIN is created by running an algorithm over a
raster to capture the nodes required for TIN. A TIN surface can be created using features of
either point, line or polygon that contain elevation information. It can also be generated from
raster data sets. For conversion of raster dataset to TIN, the Raster to TIN geoprocessing tool
can be used. This procedure by itself does not guarantee a better result unless accompanied
by ancillary data which is compatible with the surface definition.
Similarly, TIN surface can also be generated using a terrain dataset using the following steps:
3D Analyst Toolbox > Terrain to TIN geoprocessing tool
Select input terrain dataset and browse the dataset which is to be converted to TIN
Provide locations for saving the output TIN file
Select a Pyramid Level Resolution from the terrain dataset; this improves efficiency by
taking advantage of the fact that accuracy requirements diminish with scale.
Provide the Maximum number of nodes value which is to be used to create TIN
surface. Default is 5 million
Select Clip to Extent
Click OK
Similar operations can be performed for creating TIN surface from raster data. TIN for
Krishna basin in India created using USGS DEM data (http://www.usgs.gov) is shown in
Figure 4.5. It can be observed from this figure how the topographical variations are depicted
with the use of large triangles where change in slope is small and small triangles of different
shapes and sizes where there are large fluctuations in slope.
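A minimal sketch of building a TIN from scattered elevation points using Delaunay triangulation (SciPy is used here purely for illustration; it is not the ArcGIS implementation, and the point data are synthetic assumptions):

import numpy as np
from scipy.spatial import Delaunay

# Hypothetical scattered survey points: xy holds the (x, y) locations, z their elevations.
xy = np.random.rand(200, 2) * 1000.0
z = 100.0 + 5.0 * np.sin(xy[:, 0] / 200.0) + 3.0 * np.cos(xy[:, 1] / 300.0)

tin = Delaunay(xy)              # Delaunay triangulation of the (x, y) locations
facets = tin.simplices          # (n_triangles, 3) vertex indices; z[facets] gives facet elevations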
5. Map Projections
Maps are representations of all or any part of the earth on a 2 dimensional flat surface at a
specific scale. Map projections are relied upon to transfer the features of the globe onto the flat
surface of a map. The three types of map projections originally developed are cylindrical,
planar and conic. It is essential to note that while these map projections try to portray the globe
as accurately as possible, the globe is the only true representation of the spherical earth, and
therefore any attempt to represent it on a flat surface will result in some form of
distortion. A basic knowledge of datums and coordinate systems is expected before the user
tries to choose a suitable map projection system. However, these will not be covered in
this material. Extensive information regarding the same can be obtained from the following
link : http://kartoweb.itc.nl/geometrics/map%20projections/understanding%20map%20projections.pdf
1. Introduction
The MATLAB Image Processing Toolbox has extensive functions for image restoration,
enhancement and information extraction. Some of the basic features of the image processing
toolbox are explained in this lecture. The applications are demonstrated using IRS LISS-III
data covering a portion of the Uttara Kannada district in Karnataka.
Throughout this lecture the MATLAB command is given in pink color and the output is
presented in blue.
2. Images in MATLAB
MATLAB stores most images as two-dimensional arrays, in which each element of the
matrix corresponds to a single pixel in the displayed image.
For example, an image composed of 200 rows and 300 columns of different colored dots
would be stored in MATLAB as a 200-by-300 matrix. Some images, such as RGB, require a
three-dimensional array, where the first plane in the third dimension represents the red pixel
intensities, the second plane represents the green pixel intensities, and the third plane
represents the blue pixel intensities.
This convention makes working with images in MATLAB similar to working with any other
type of matrix data, and renders the full power of MATLAB available for image processing
applications. For example, a single pixel can be selected from an image matrix using normal
matrix subscripting.
I(2,15)
This command returns the value of the pixel at row 2, column 15 of the image.
Generally, the most convenient method for expressing locations in an image is to use pixel
coordinates. In this coordinate system, the image is treated as a grid of discrete elements,
ordered from top to bottom and left to right, as illustrated in Fig. 1.
For pixel coordinates, the first component r (the row) increases downward, while the second
component c (the column) increases to the right. Pixel coordinates are integer values and
range between 1 and the length of the row or column.
IRS LISS-III Band 4 image, which is a JPG file, is named here as image4.JPG. The following
commands are used to display the image.
Clear the MATLAB workspace of any variables and close the open figure windows.
clear; close all;
To read an image, use the imread command. Let's read in the JPEG image named image4.JPG
and store it in an array named I.
I = imread('image4.JPG');
imshow(I)
Image is displayed as shown in Fig 2. The image shows a part of the Uttara Kannada District,
Karnataka. Some features in the image are
(i) Arabian Sea on the left (as seen by the dark tone)
(ii) Kalinadi in top half (represented by the linear feature in dark tone)
(iii) Dense vegetation (the brighter tones in the image)
(iv) Small white patches in the image are clouds.
The command whos is used to see how the variable I (which is the image here) is stored in
memory.
whos
The image is stored as a 2 dimensional matrix with 342 rows and 342 columns. Each element
is saved as an unsigned 8-bit data.
Using MATLAB commands, it is possible to convert the data type in which the image is
stored. For example, uint8 (unsigned integer, 8 bit) and uint16 (unsigned integer, 16 bit) data
can be converted to double precision using the MATLAB function, double. However,
converting between storage classes changes the way MATLAB and the toolbox interpret the
image data. If it is desired to interpret the resulting array properly as image data, the original
data should be rescaled or offset to suit the conversion.
For easier conversion of storage classes, use one of these toolbox functions: im2double,
im2uint8, and im2uint16. These functions automatically handle the rescaling and offsetting of
the original data.
For example, the following command converts a double-precision RGB (Red Green Blue)
image with data in the range [0,1] to a uint8 RGB image with data in the range [0,255].
RGB2 = im2uint8(RGB1);
MATLAB commands can also be used to convert the images saved in one format to another.
To change the graphics format of an image, use imread to read the image and then save the
image with imwrite, specifying the appropriate format.
For example, to convert an image from a BMP to a PNG, read the BMP image using imread,
convert the storage class if necessary, and then write the image using imwrite, with 'PNG'
specified as your target format.
bitmap = imread('image4.BMP','bmp');
imwrite(bitmap,'image4.png','png');
4. Image Arithmetic
Standard arithmetic operations, such as addition, subtraction, multiplication, and division,
when implemented on images are generally called Image Arithmetic. Image arithmetic has
many uses in image processing both as a preliminary step and in more complex operations. It
can be used to enhance or suppress the information, to detect the differences between two or
more images of the same scene etc.
To add two images or add a constant value to an image, use the imadd function. imadd adds
the value of each pixel in one of the input images with the corresponding pixel in the other
input image and returns the sum in the corresponding pixel of the output image.
For example, the following commands use the image addition to superimpose one image on
top of another. The images must be of the same size and class.
I = imread('image3.JPG');
J = imread('image4.JPG');
K = imadd(I,J); imshow(K)
In this example, image3.JPG and image4.JPG are IRS LISS-III Band-3 (Red) and Band-4
(Near Infrared) images, respectively. Added image is shown in Fig. 3.
One can also use addition to brighten an image by adding a constant value to each pixel.
For example, the following code brightens image4.JPG by adding a value of 50 to all the pixel
values.
I = imread('image4.JPG');
J = imadd(I,50);
To subtract one image from another, or subtract a constant value from an image, use the
imsubtract function. imsubtract subtracts each pixel value in one of the input images from the
corresponding pixel in the other input image and returns the result in the corresponding pixel
in an output image.
X= imread('image5.JPG');
J= imread('image4.JPG');
K= imsubtract(X,J);
To multiply each pixel by a constant (rescaling), the immultiply function can be used. For
example, the following commands brighten image4.JPG by a factor of 3 and display the
brightened image.
I = imread('image4.JPG');
J = immultiply(I,3.0);
figure, imshow(J);
The imdivide function in the image processing toolbox is used for an element-by-element
division of each corresponding pixels in a pair of input images and to return the result in the
corresponding pixel in an output image.
Image division, like image subtraction, can be used to detect changes in two images.
However, instead of giving the absolute change for each pixel, division gives the fractional
change or ratio between corresponding pixel values.
The command imshow displays an image. Additional functions are available in the MATLAB
image processing toolbox to exercise more direct control over the display format. Adding a
color bar, image resizing, image rotation and image cropping are a few options used to
perform specialized display of an image.
The colorbar function can be used to add a color bar to an axes object. A color bar added to
an axes object that contains an image object indicates the data values that the different colors
or intensities in the image correspond to as shown in Fig. 5. The MATLAB commands for
image display with color bar are given below.
F = imread('image5.JPG');
imshow(F), colorbar
Fig. 5. Image with color bar
In Fig. 5, the IRS LISS-III Band-5 image is displayed in gray scale. A
colorbar shown alongside the image indicates the values corresponding to the
different gray levels.
An image can be resized using the imresize function. The function takes two primary
arguments viz., (i) The image to be resized and (ii) The magnification factor as given below.
F = imread ('image5.JPG');
J = imresize(F,0.5);
The first function reads the IRS LISS-III Band-5 image and saves it as F. The second
function is used to resize the image 0.5 times and to save the output as J.
Another way of using the imresize function is by specifying the actual size of the output
image, instead of the resize factor. The command below creates an output image of size 100-
by-150.
J = imresize(F,[100 150]);
To rotate an image, the imrotate function can be used. The function requires two primary
arguments as input. These are (i) The image to be rotated and (ii) The rotation angle. The
rotation angle should be specified in degrees. For a positive value, imrotate rotates the image
counterclockwise; and for a negative value, imrotate rotates the image clockwise. The
imrotate function in MATLAB allows the interpolation method to be chosen as either ‘nearest’,
‘bilinear’ or ‘bicubic’. They are popular interpolation techniques in which bicubic
interpolation can produce pixel values outside the original image.
For example, the following set of commands may be used to read the IRS LISS-III Band-5
image and to rotate the same through 35 degrees counterclockwise. The rotated image is
stored as J and is displayed at the end as shown in Fig. 6.
F = imread('image5.JPG');
J = imrotate (F,35,'bilinear');
figure, imshow(J)
Fig. 6. Image rotated by 35 degrees
To subset or to crop a rectangular portion of an image, the imcrop function can be used. The
function requires two arguments as input viz., (i) The image to be cropped and (ii) The
coordinates of a rectangle that defines the crop area.
The coordinates of the rectangle may be specified manually or can be selected from the
image display window.
If imcrop is called without specifying the coordinates of the rectangle, the cursor changes to a
cross hair when it is over the image. Click on one corner of the region to be selected and
while holding down the mouse button, drag across the image towards the diagonally opposite
corner of the required rectangle. Thus a rectangle is drawn around the selected area. When the
mouse button is released, imcrop extracts the corresponding coordinates and creates a new
image of the selected region.
6. Image Analysis
A range of standard image processing operations for image analysis are also supported by the
MATLAB image processing toolbox. Two categories of operations available for image
analysis and image enhancement are mentioned here:
Obtaining information about the pixel values and the basic statistics of an image
Analyzing images to extract information about their essential structure, e.g., contours
and edges
The toolbox includes two functions that provide information about the pixel values or the
color data values of an image viz., pixval and impixel.
The function pixval interactively displays the data values for pixels as the cursor is moved
over the image. It can also display the Euclidean distance between two pixels.
The second function, impixel, on the other hand, returns the data values for a selected pixel or
set of pixels. The coordinates of the pixel is used as the input argument. The coordinates can
be specified manually.
If the coordinates are not specified manually, they can be selected using the cursor as it moves
over the image. Below are the set of commands used to extract the pixel values.
imshow image4.JPG;
vals = impixel;
Standard statistics of an image such as mean, standard deviation, correlation coefficient etc.
can also be computed using the functions available in the MATLAB image processing
toolbox.
For example, the functions mean2 and std2 compute the mean and standard deviation of the
elements of the image, which is stored as a matrix. The function corr2 computes the
correlation coefficient between two matrices of the same size.
Similar to the contour function in MATLAB, the toolbox function imcontour can be used to
display the contour plot of the data in an image. Contours connect pixels of equal pixel
values. The imcontour function automatically sets up the axes so their orientation and aspect
ratio match the image.
For example, the following set of commands is used to read an image and to display the
image information in the form of contours.
I = imread('image5.JPG');
figure, imcontour(I)
This reads the IRS LISS-III Band-5 image (which is shown in Fig. 5) and generates the
contours within the image. The contour image displayed is shown in Fig. 7.
Fig. 7. Contour plot of an image
Edges are the places in an image corresponding to object boundaries. Therefore, in an image
edges generally correspond to rapid changes in the intensities or pixel values. The toolbox
function edge looks for places in the image where the intensity changes rapidly and hence
detects the edges. Function edge returns a binary image containing 1's where edges are found
and 0's elsewhere.
Any one of the following criteria is used to detect the rapid change in the intensity:
i. Places where the first derivative of the intensity is larger in magnitude than some
threshold
ii. Places where the second derivative of the intensity has a zero crossing
For some of these estimators, it can be specified whether the operation should be sensitive to
horizontal or vertical edges, or both.
The MATLAB toolbox contains more than one method for edge detection, e.g., the Sobel method
and the Canny method.
The edge function takes two arguments viz., the image for which the edges are to be
identified and the edge detection method.
For example, the following commands are used to detect the edges in the image5.JPG using
Canny’s methods. The edges are displayed as shown in Fig. 8
F = imread('image5.JPG');
BW1 = edge(F,'canny');
figure, imshow(BW1)
Fig. 8. Edge detection image
The Canny method is the most powerful edge-detection method provided, as it uses two different
thresholds to detect strong and weak edges separately. Further the weak edges connected to
strong edges are displayed in the output. This method is therefore less likely than the others
to be "fooled" by noise, and more likely to detect true weak edges.
7. Image Enhancement
Image enhancement techniques are used to improve an image or to enhance the information
contained in the image. For example, enhancement of the signal-to-noise ratio, enhancement
of certain features by modifying the colors or intensities so that they can be easily identified
or differentiated from others.
Intensity adjustment is a technique for mapping an image's intensity values to a new range.
For example, as seen from Fig. 2, contrast of the original display of image4.JPG is poor.
Although the pixels can be displayed in the intensity range of 0-255 (in the 8-bit system),
only a narrow range is used for the display.
To see the distribution of intensities in image4.JPG in its current state, a histogram can be
created by calling the imhist function. (Precede the call to imhist with the figure command so
that the histogram does not overwrite the display of the image in the current figure window.)
Fig. 9. Histogram of the raw image
From the histogram it can be seen that most of the values are concentrated in the region 10-
80. There are very few values above 80. On the other hand the display levels are equally
distributed for the entire range, which results in poor contrast in the image display. Contrast
in the image can be improved if the data values are remapped to fill the entire intensity range
[0,255].
This kind of adjustment can be achieved with the imadjust function. The function takes 3
arguments as input viz., the image to be adjusted, the input intensity range and the
corresponding output intensity range:
J = imadjust(I,[low_in high_in],[low_out high_out])
where low_in and high_in are the intensities in the input image, which are mapped to
low_out and high_out in the output image.
For example, the code below performs the adjustment described above.
I = imread('image4.JPG');
J = imadjust(I,[0.0 0.3],[0 1]);
The first vector passed to imadjust, [0.0 0.3], specifies the low and high intensity values of
the input image I. The second vector, [0 1], specifies the scale over which you want to map
them in the output image J. Thus, the example maps the intensity value 0.0 in the input image
to 0 in the output image, and 0.3 to 1.
Note that one must specify the intensities as values between 0 and 1 regardless of the class of
I. If I uses unsigned 8-bit data format (uint8), the values supplied are multiplied by 255 to
determine the actual values to use.
Histogram equalization is done in MATLAB using the command histeq. The command
spreads the intensity values of I (original image) over the full range, thereby improving the
contrast. Store the modified image in the variable I2.
I2 = histeq (I);
New equalized image, I2, is displayed in a new figure window using the command imshow.
figure, imshow(I2)
Fig. 10 shows the display of the image enhanced through histogram equalization.
The new image I2 can be written back to disk using the imwrite command. If it is to be saved
as a PNG file, use imwrite and specify a filename that includes the extension 'png'.
imwrite(I2, 'image4.png');
The contents of the newly written file can be checked using imfinfo function to see what was
written to disk.
imfinfo('image4.png')
The MATLAB image processing toolbox has many more capabilities, and only
a small portion of them has been explained in this lecture.
INTRODUCTION
1. Introduction
Digital Elevation Model (DEM) is the digital representation of the land surface elevation with
respect to any reference datum. DEMs are used to determine terrain attributes such as
elevation at any point, slope and aspect. Terrain features like drainage basins and channel
networks can also be identified from the DEMs. DEMs are widely used in hydrologic and
geologic analyses, hazard monitoring, natural resources exploration, agricultural management
etc. Hydrologic applications of the DEM include groundwater modeling, estimation of the
volume of proposed reservoirs, determining landslide probability, flood prone area mapping
etc.
DEM is generated from the elevation information from several points, which may be regular
or irregular over space. In the early days, DEMs used to be developed from the
contours mapped in topographic maps or from stereoscopic aerial images. With the
advancement of technology, high resolution DEMs for a large part of the globe are today
available from the radars flown onboard the space shuttle.
This lecture covers the definition of DEMs, different data structures used for DEMs and
various sources of DEMs.
2. Definition of a DEM
A DEM is defined as "any digital representation of the continuous variation of relief over
space," (Burrough, 1986), where relief refers to the height of earth’s surface with respect to
the datum considered. It can also be considered as regularly spaced grids of the elevation
information, used for the continuous spatial representation of any terrain.
Digital Terrain Model (DTM) and Digital Surface Model (DSM) are often used as synonyms
of the DEM. Technically a DEM contains only the elevation information of the surface, free
of vegetation, buildings and other non ground objects with reference to a datum such as Mean
Sea Level (MSL). The DSM differs from a DEM as it includes the tops of buildings, power
lines, trees and all objects as seen in a synoptic view. On the other hand, in a DTM, in
addition to the elevation information, several other attributes are included, viz., slope,
aspect, curvature and skeleton. It thus gives a continuous representation of the smoothed
surface.
Figure: Comparison of a DEM and a DTM (Source: www.zackenberg.dk)
3. Types of DEMs
DEMs are generated by using the elevation information from several points spaced at regular
or irregular intervals. The elevation information may be obtained from different sources like
field survey, topographic contours etc. DEMs use different structures to acquire or store the
elevation information from various sources. The three main types of structures used are the
following.
Figure 2. Different types of DEMs: (a) Gridded DEM (b) TIN DEM (c) Contour-based DEM
a) Gridded structure
Gridded DEM (GDEM) consists of regularly placed, uniform grids with the elevation
information of each grid. The GDEM thus gives a readily usable dataset that represents the
elevation of surface as a function of geographic location at regularly spaced horizontal
(square) grids. Since the GDEM data is stored in the form of a simple matrix, values can be
accessed easily without having to resort to a graphical index and interpolation procedures.
Accuracy of the GDEM and the size of the data depend on the grid size. Use of smaller grid
size increases the accuracy. However it increases the data size, and hence results in
computational difficulties when large area is to be analyzed. On the other hand, use of larger
grid size may lead to the omission of many important abrupt changes at sub-grid scale.
Some of the applications of the GDEMs include automatic delineation of drainage networks
and catchment areas, development of terrain characteristics, soil moisture estimation and
automated extraction of parameters for hydrological or hydraulic modeling.
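As an illustration of how terrain attributes fall out of a gridded DEM (a NumPy sketch with assumed names, not from the lecture; the aspect sign convention depends on how the grid rows and columns map to north and east):

import numpy as np

def slope_aspect(dem, cellsize=30.0):
    # Slope (degrees) and aspect (degrees, 0-360) from a regularly gridded DEM.
    dz_dy, dz_dx = np.gradient(dem.astype(np.float64), cellsize)   # elevation change per unit distance
    slope = np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))
    aspect = np.degrees(np.arctan2(-dz_dx, dz_dy)) % 360.0         # one common orientation convention
    return slope, aspect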
b) Triangulated Irregular Network (TIN) structure
TIN is a more robust way of storing spatially varying information. It uses irregular
sampling points connected through non-overlapping triangles. The vertices of the triangles
match with the surface elevation of the sampling point and the triangles (facets) represent the
planes connecting the points.
Location of the sampling points, and hence irregularity in the triangles are based on the
irregularity of the terrain. TIN uses a dense network of triangles in a rough terrain to capture
the abrupt changes, and a sparse network in a smooth terrain. The resulting TIN data size is
generally much less than the gridded DEM.
TIN is created by running an algorithm over a raster to capture the nodes required for the
triangles. Even though several methods exist, the Delaunay triangulation method is the most
preferred one for generating TIN. TIN for Krishna basin in India created using USGS DEM
data (http://www.usgs.gov) is shown in Fig.5. It can be observed from this figure that the
topographical variations are depicted with the use of large triangles where change in slope is
small. Small triangles of different shapes and sizes are used at locations where the
fluctuations in slope are high.
Figure 5. TIN for Krishna basin created from USGS DEM data
Due to its capability to capture the topographic irregularity, without significant increase in the
data size, for hydrologic modeling under certain circumstances, TIN DEM has been
considered to be better than the GDEM by some researchers (Turcotte et al., 2001). For
example, in gridded DEM-based watershed delineation, flow is considered to vary in
directions with 45° increments. Using TIN, flow paths can be computed along the steepest
lines of descent of the TIN facets (Jones et al., 1990).
c) Contour-based structure
Contours represent points having equal heights/elevations with respect to a particular datum
such as Mean Sea Level (MSL). In the contour-based structure, the contour lines are traced
from the topographic maps and are stored with their location (x, y) and elevation information.
These digital contours are used to generate polygons, and each polygon is tagged with the
elevation information from the bounding contour.
Contour-based DEM is often advantageous over the gridded structure in hydrological and
geomorphological analyses as it can easily show the water flow paths. Generally the
orthogonals to the contours are the water flow paths.
The major drawback of the contour-based structure is that the digitized contours give vertices only
along the contours: an infinite number of points is available along the contour lines, whereas not
many sampling points are available between the contours. Therefore, the accuracy of the DEM
depends on the contour interval. The smaller the contour interval, the better the
resulting DEM. If the contour interval of the source map is large, the surface model created
from it is generally poor, especially along drainages, ridge lines and in rocky topography.
Elevation information for a DEM may be acquired through field surveys, from topographic
contours, or from aerial photographs or satellite imageries using photogrammetric techniques.
Recently radar interferometric techniques and Laser altimetry have also been used to generate
a DEM.
Field surveys give the point elevation information at various locations. The points can be
selected based on the topographic variations. Contours are the lines joining points of equal
elevation. Therefore, contours give the elevation at an infinite number of points, but only
along the contour lines.
A digital elevation model can be generated from the points or contours using various
interpolation techniques like linear interpolation, kriging, TIN etc. Accuracy of the resulting
DEM depends on the density of data points available depicting the contour interval, and
precision of the input data.
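A minimal sketch of interpolating scattered point or contour elevations to a regular grid (SciPy's linear interpolation is used here for illustration; kriging would require a geostatistics package, and the sample data below are synthetic assumptions):

import numpy as np
from scipy.interpolate import griddata

# Hypothetical scattered elevation samples: pts holds the (x, y) locations, z their elevations.
pts = np.random.rand(500, 2) * 1000.0
z = 200.0 + 0.05 * pts[:, 0] - 0.02 * pts[:, 1]

gx, gy = np.meshgrid(np.arange(0.0, 1000.0, 10.0), np.arange(0.0, 1000.0, 10.0))
dem = griddata(pts, z, (gx, gy), method='linear')   # linear interpolation onto a 10 m grid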
On the other hand, photogrammetric techniques provide continuous elevation data using
pairs of stereo photographs or imageries taken by instruments onboard an aircraft or space
shuttle. Radar interferometry uses a pair of radar images for the same location, from two
different points. The difference observed between the two images is used to interpret the
height of the location. Lidar altimetry also uses a similar principle to generate the elevation
information.
Today very fine resolution DEMs at near global scale are readily available from various
sources. The following are some of the sources of global elevation data set.
GTOPO30
NOAA GLOBE project
SRTM
ASTER Global Digital Elevation Model
Lidar DEM
GTOPO30 is the global elevation data set published by the United State Geological Survey
(USGS). Spatial resolution of the data is 30 arc second (approximately 1 Kilometer). The data
for the selected areas can be downloaded from the following website.
http://www1.gsi.go.jp/geowww/globalmap-gsi/gtopo30/gtopo30.html
The Global Land One-km Base Elevation Project (GLOBE) provides a global DEM of 30 arc
second (approximately 1 kilometer) spatial resolution. Data from several sources were
combined to generate the DEM. The GLOBE DEM can be obtained from the NOAA
National Geophysical Data Centre.
Shuttle Radar Topographic Mission (SRTM) was a mission to generate the topographic data
of most of the land surface (56°S to 60°N) of the Earth, which was jointly run by the National
Geospatial-Intelligence Agency (NGA) and the National Aeronautics and Space
Administration (NASA). In this mission, stereo images were acquired using the
Interferometric Synthetic Aperture Radar (IFSAR) instruments onboard the space shuttle
Endeavour, and the DEM of the globe was generated using the radar interferometric
techniques. The SRTM digital elevation data for the world is available at 3 arc seconds
(approximately 90 m) spatial resolution from the website of the CGIAR Consortium for
Spatial Information website: http://srtm.csi.cgiar.org/. For the United States and Australia,
30m resolution data is also available.
ASTER Global Digital Elevation Model (GDEM) was generated from the stereo pair images
collected by the Advanced Space Borne Thermal Emission and Reflection Radiometer
(ASTER) instrument onboard the sun-synchronous Terra satellite. The data was released
jointly by the Ministry of Economy, Trade, and Industry (METI) of Japan and the United
States National Aeronautics and Space Administration (NASA). ASTER instruments
consisted of three separate instruments to operate in the Visible and Near Infrared (VNIR),
Shortwave Infrared (SWIR), and the Thermal Infrared (TIR) bands. ASTER GDEMs are
generated using the stereo pair images collected using the ASTER instruments, covering 99%
of the Earth’s land mass (ranging between latitudes 83°N and 83°S). The ASTER GDEM is
available at 30m spatial resolution in the GeoTIFF format. The ASTER GDEM is being
freely distributed by METI (Japan) and NASA (USA) through the Earth Remote Sensing
Data Analysis Center (ERSDAC) and the NASA Land Processes Distributed Active Archive
Center (LP DAAC) (https://lpdaac.usgs.gov/lpdaac/products/aster_products_table).
Light Detection and Ranging (LIDAR) sensors operate on a ranging principle similar to that of
radar, but use laser pulses. Pulses are sent from a laser onboard an aircraft and the
backscattered pulses are recorded. The time lapse of the returning pulses is used to determine
the two-way distance to the object. LIDAR uses a narrow, high-energy beam, and hence high
resolution can be achieved. It also enables DEM generation for a large area within a short
period of time with minimal human intervention. The main disadvantage of procuring high
resolution LIDAR data is the expense involved in data collection.
(Source: http://www.crwr.utexas.edu)
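A minimal numerical sketch of the two-way ranging relation used by radar and lidar altimetry (range = c × time lapse / 2); the travel-time value below is hypothetical.

```python
# Two-way ranging: the pulse travels to the target and back,
# so the one-way distance is half of (speed x round-trip time).
c = 3.0e8            # speed of light, m/s
t_lapse = 6.67e-6    # round-trip travel time of the pulse, s (hypothetical)
distance = c * t_lapse / 2.0
print(f"Range to target: {distance:.1f} m")   # ~1000 m
```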
RADAR INTERFEROMETRY
1. Introduction
RAdio Detection And Ranging (Radar) is a system which uses microwave signals to detect
the presence of an object and its properties. Pulses of radio waves are sent from the radar
antenna towards the objects. The objects scatter back a part of this energy falling on them,
which are collected at the same radar platform. Energy reflected from the terrain to the radar
antenna is called radar return.
When an object scatters the radar signal, the scattered signal differs from the original in
amplitude, polarization and phase. Thus, the radar return carries complex information, in
terms of both the amplitude and the phase of the signal. The differences between the pulses
sent and received indicate the properties of the object and its distance.
Principles of radar interferometry are used in the radar remote sensing to generate high
resolution DEMs with near global coverage.
This lecture covers the basic principles of radar imaging and radar interferometry. The
methodology used to derive DEM from radar interferometry is also covered in this lecture.
In radar imaging, the radar systems are generally operated from a moving platform, either
airborne or space-borne. The radar imaging is based on the Side Looking Airborne Radar
(SLAR) systems. In SLAR systems, the microwave pulses are emitted to the side of the
aircraft/ space shuttle as it moves forward, and the radar return is recorded using the antenna.
Each pulse has a finite duration and illuminates a narrow strip of land normal to the flight
direction, as shown in Fig. 2. The radar image is generated from the continuous strips swept
as the aircraft or space shuttle moves forward in the azimuth direction.
3.1. Wavelength
Microwave remote sensing uses the electromagnetic spectrum with wavelength ranging from
a few millimeters to 1 m. Each wavelength band is denoted by a letter, as shown in Table 1.
The bands that are commonly used in the radar remote sensing are highlighted in the Table.
The selection of wavelength depends on the objectives of the analysis. Smaller wavelengths
cannot penetrate through the clouds and hence are generally less preferred for imaging from
airborne/space-borne platforms. Larger wavelengths like L bands are capable of penetrating
through the cloud, and hence the satellite-based radar imaging uses the larger wavelength
bands.
Longer wavelengths can penetrate through the soil and hence can be used to retrieve soil
information. However, they provide less information about the surface characteristics. On the
other hand, the shorter wavelengths get scattered from the surface and give more information
about the surface characteristics. Hence shorter wavelength bands C and X are used in radar
interferometry to extract the topographic information.
3.2 Velocity
Microwave bands of the electromagnetic spectrum are used in the radar remote sensing.
Therefore these signals travel at the speed of light (c = 3x108 m.sec-1).
The pulses sent from the radar have a constant duration, which is called the pulse duration or
pulse length. The amount of energy transmitted is directly proportional to the pulse length.
Resolution of the radar imagery in the range direction is a function of the pulse length. When
the pulse length is long, larger area on the ground is scanned by a single pulse, leading to a
coarser resolution.
Phase refers to the position within the wave cycle (crest or trough). The phase of the radar
return depends on the total distance travelled from the radar to the terrain and back, expressed
in terms of complete wave cycles.
Look angle is the angle between the nadir direction and the line from the antenna to the point
of interest on the ground. The depression angle is the complement of the look angle, measured
from the horizontal. The angle between the incident radar beam and a line normal to the
ground is called the incident angle. These are shown in Fig. 4.
Figure 4. Concept of look angle, incident angle and depression angle in radar remote sensing
Spatial resolution of a radar image is controlled by the pulse length and the antenna beam
width. Radar image resolution is specified in terms of both azimuth and range resolutions. A
combination of both determines the ground resolution of a pixel.
In order to differentiate two objects in the range direction, the scattered signals from the
two objects must be received at the antenna without any overlap. If the slant distance between
the two objects is less than half of the pulse length, the reflected signals from the two objects
get mixed and are recorded as the radar return from a single object. The resolution in the
range direction is therefore controlled by the pulse duration or pulse length. The slant-range
resolution of the radar image is equal to one half of the pulse length expressed as a distance
(i.e., cτ/2).
(Courtesy: http://what-when-how.com/remote-sensing-from-air-and-space/theory-radar-
remote-sensing-part-1/)
Ground resolution (Rr) in the range direction, corresponding to the slant range resolution is
calculated as follows.
Rr = c τ / (2 cos β)

where τ is the pulse length measured in units of time, c is the velocity of the signal (equal to
the velocity of light, 3 × 10^8 m/sec) and β is the depression angle.
Thus, shorter pulse lengths give better range resolution. But shorter pulses reduce the total
energy transmitted, and hence the signal strength. Therefore electronic techniques have been
used to shorten the apparent pulse length and hence to improve the resolution.
Resolution in the azimuth direction (Ra) is controlled by the width of the terrain strip
illuminated by the radar beam (or the radar beam width of the antenna), and the ground range.
Smaller beam widths give better resolution in the azimuth direction. Since the angular width
of the beam is directly proportional to the wavelength λ and inversely proportional to the
length of the antenna (D), the resolution in the azimuth direction is calculated as follows.

Ra = S λ cos β / D

where S is the slant range and β is the depression angle. Since the radar antenna produces a
fan-shaped beam, in SLAR the width of the beam is smaller in the near range and larger in the
far range, as shown in Fig. 2.
Therefore, the resolution in the azimuth direction is better in the near range, than that in the
far range.
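As a quick numerical illustration of the two resolution expressions above, the following sketch evaluates Rr and Ra; all system parameters used here are hypothetical.

```python
# Worked example of Rr = c*tau / (2*cos(beta)) and Ra = S*lam*cos(beta) / D.
import math

c = 3.0e8                    # signal velocity, m/s
tau = 0.1e-6                 # pulse length, s (hypothetical)
beta = math.radians(45.0)    # depression angle (hypothetical)
lam = 0.056                  # wavelength, m (C band, 5.6 cm)
S = 20_000.0                 # slant range, m (hypothetical)
D = 10.0                     # antenna length, m (hypothetical)

Rr = c * tau / (2.0 * math.cos(beta))          # ground range resolution
Ra = S * lam * math.cos(beta) / D              # azimuth resolution
print(f"Range resolution   Rr = {Rr:.1f} m")   # ~21.2 m
print(f"Azimuth resolution Ra = {Ra:.1f} m")   # ~79.2 m
```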
Resolution in the azimuth direction improves with the use of shorter wavelengths and larger
antennas. Use of shorter waves gives finer resolution in the azimuth direction. However
shorter waves carry less energy and hence have poor penetration capacity. Therefore the
wave length cannot be reduced beyond certain limits. Also, there are practical limitations to
the maximum length of the antenna. Therefore, Synthetic Aperture Radars are used to
synthetically increase the antenna length and hence to improve the resolution in the azimuth
direction.
In radar remote sensing, since the spatial resolution is inversely related to the length of the
antenna, when the real aperture radars are used, the resolution in the azimuth direction is
limited. In Synthetic Aperture Radar (SAR), using the Doppler principle, fine resolution is
achieved using both short and long waves.
The Doppler principle states that if the source or the listener is in relative motion, the
frequency heard differs from the frequency at the source: the frequency is higher (or lower)
depending upon whether the source and the listener are moving towards (or away from) each
other, as shown in the figure below.
The Doppler principle is used in the SAR to synthesize the effect of a long antenna. In SAR,
each target is scanned using repeated radar beams as the aircraft moves forward, and the radar
returns are recorded. In reality, the radar returns are recorded by the same antenna at different
locations along the flight track (as shown in Fig.7), but these successive positions are treated
as if they are parts of a single long antenna, and thereby synthesize the effect of a long
antenna.
In SAR, an apparent motion of the target through the successive radar beams is assumed (from
A to B and then from B to C), as shown in Fig. 7. When the target is moving closer to the
antenna, according to the Doppler principle, the frequency of the radar return from the target
is shifted upward. On the other hand, when the target is moving away from the antenna, the
frequency of the radar return is shifted downward. The radar returns are processed according
to their Doppler frequency shift, by which a very small effective beam width is achieved.
7. Radar interferometry
Radar interferometry is a technique used to survey large areas, giving moderately accurate
values of elevation. The principle of data acquisition in the interferometric method is similar
to stereo-photographic techniques. When the same area is viewed from different orbits of a
satellite at different times, the differences in the phase of the scattered signals can be used to
derive terrain information. Radar interferometry makes use of the phase changes of the radar
return to measure terrain height, and two or more radar images can be combined to generate a
DEM. The technique is also widely used for hazard monitoring, such as the movement of
crustal plates in earthquake-prone areas, land subsidence, glacial movement and flood
monitoring, since differential interferometry can detect surface displacements of the order of
a centimetre.
Interference is the superposition of waves travelling through the same medium. Depending
upon the phases of the superposing waves, the amplitude of the resultant wave may be higher
or lower than that of the individual waves. When the two waves that meet are in phase, i.e.,
the crests and troughs of the two waves coincide with each other, the amplitude of the
resultant wave is greater than the amplitude of the individual waves; this is called
constructive interference. On the other hand, if the two waves are in opposite phase, the
amplitude of the resultant wave is less than that of the individual waves, which is called
destructive interference.
The phase of the resulting signal at any point depends on the distance of the point from the
sources, the distance between the sources, and the wavelength. Therefore, if the wavelength
of the signal, the phase of the resultant signal, and the distance between the two sources are
known, the distance of the point from the sources can be estimated.
In radar interferometry, the principles of light interference are used to estimate the terrain
height. The principle of data acquisition in interferometric method is similar to stereo-
photographic techniques. The radar return from the same object is recorded using two
antennas located at two different points in space. Due to the distance between the two
antennas, the slant ranges of the radar returns from the same object are different at the two
antennas. This difference causes some phase difference between the two radar returns, which
may range between 0 and 2π radians. This phase difference information is used to interpret
the height of the target.
The phase of the transmitted signal depends on the slant range, and the wavelength of the
pulse (λ). Total distance travelled by the pulse is twice the slant range (i.e., 2S). The phase
induced by the propagation of each signal is then given by
φ = (2π/λ) · 2S = 4πS/λ
When the radar returns are recorded at two antennas separated at a distance d as shown in
Fig. 9, the difference in the phase is given by
Δφ = φ1 − φ2 = (4π/λ)(S1 − S2)
When a single antenna is used to send the signals and the radar returns are received at two
antennas, the phase difference is given by
Δφ = φ1 − φ2 = (2π/λ)[2S1 − (S1 + S2)] = (2π/λ)(S1 − S2) = (2π/λ)·δS

where δS = S1 − S2 is the difference between the slant ranges measured at the two antennas.
In radar interferometry, the images recorded at two antennas are combined to generate the
interferogram (also known as fringe map), which gives the interference fringes, as shown in
Fig. 10.
Figure 10. L-band and C-band radar interferogram for the Fort Irwin in California (Source:
http://www2.jpl.nasa.gov)
The interference fringes are used to accurately calculate the phase difference between the two
signals for each point in the image. Knowing the wavelength and phase difference, δS can be
calculated.
From the principles of trigonometry, the slant range S1 can be calculated using the following
relation.
sin(θ − α) = [(S1 + δS)² − S1² − d²] / (2 S1 d)

where d is the base length or the distance between the two antennas, θ is the look angle and α
is the angle of inclination of the baseline from the reference horizontal plane.
The slant range S1 is related to the height of the antenna above the ground, terrain height and
the look angle as given below.
h = H − S1 cos θ
Thus, in radar interferometry, knowing the height of the antenna above the ground level (H),
look angle (θ), base length (d), inclination of the base from horizontal plane (α), and
wavelength of the signal (λ), the measured phase difference is used to estimate the elevation
(h) of the terrain.
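The chain of calculations described above can be illustrated with a short sketch; all numerical values (baseline inclination, slant range, unwrapped phase difference) are hypothetical, and sign conventions for the baseline geometry vary between formulations.

```python
# Sketch: phase difference -> path difference (delta S) -> look angle (theta)
# -> terrain height h = H - S1*cos(theta), for single-pass interferometry.
import math

lam   = 0.056                 # wavelength, m (C band)
d     = 60.0                  # baseline length, m
alpha = math.radians(45.0)    # baseline inclination from horizontal (hypothetical)
H     = 233_000.0             # antenna altitude above the reference surface, m
S1    = 329_000.0             # measured slant range at antenna 1, m (hypothetical)
dphi  = 2.1                   # unwrapped phase difference, rad (hypothetical)

# Single-pass, single-transmitter case: delta_phi = (2*pi/lam) * delta_S
dS = dphi * lam / (2.0 * math.pi)

# Geometry: sin(theta - alpha) = ((S1 + dS)**2 - S1**2 - d**2) / (2*S1*d)
sin_term = ((S1 + dS) ** 2 - S1 ** 2 - d ** 2) / (2.0 * S1 * d)
theta = alpha + math.asin(sin_term)

h = H - S1 * math.cos(theta)  # terrain height relative to the reference
print(f"delta S = {dS*100:.2f} cm, look angle = {math.degrees(theta):.2f} deg, "
      f"h = {h:.0f} m")       # ~1.87 cm, ~45.01 deg, ~413 m
```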
Single-pass interferometry: The two antennas are located a known, fixed distance apart.
Signals are transmitted from only one antenna and the energy scattered back is recorded at
both antennas.
Repeat-pass interferometry: Only one antenna is used to send and receive the signals. The
antenna is passed more than once over the area of interest, through different, closely spaced
orbits.
1. Introduction
Availability of a reasonably accurate elevation information for many parts of the world was
once very much limited. Dense forest, high mountain ranges etc. were remained unmapped,
mainly because of the difficulty in getting to these places. Objective of the Shuttle Radar
Topographic Mission (SRTM) was to create near global data set on land elevations, using
radar images. The mission was headed by the National Geospatial-Intelligence Agency
(NGA) and the National Aeronautics and Space Administration (NASA).
The space shuttle Endeavour was employed in the mission to carry the payloads to the space.
The space shuttle Endeavour with the SRTM payloads was launched on 11th February 2000.
The Endeavour orbited the Earth at an altitude of 233 km, with an orbital inclination of 57 degrees, and the
radar onboard the space shuttle was used to collect the images of the land surface. The
mission was completed in 11 days. These radar images were interpreted to generate a high
resolution elevation data, at a near global scale.
A radar system is advantageous over optical systems as it can operate day and night, and in
bad weather. Also, by using a space-borne radar system for the mapping, the accessibility
issues are eliminated. Thus, in the SRTM, around 80% of the land area was swept using the
radar and the digital elevation data were generated.
The near-global elevation data generated by the SRTM finds extensive applications in the
areas of earth system sciences, hydrologic analyses, land use planning, communication
system designing, and military purposes.
This lecture covers the details of the SRTM, and the near global SRTM elevation data.
The SRTM instruments consisted of two antennas. One antenna was located at the bay of the
space shuttle. A mast of 60 m length was connected to the main antenna truss and the second
antenna was connected at the end of the mast as shown in Fig.1. The mast provided the
baseline distance between the two antennas.
The main antenna consisted of two antennas to work in two different wavelengths. The two
microwave bands used in the SRTM were the C band and X band.
The C-band antenna could transmit and receive radar signals of wavelength 5.6 centimeters.
The swath width (the width of the radar beam on the Earth's surface) of the C-band antenna
was 225 kilometers. The C-band data were used to scan about 80% of the land surface of the
Earth (between 60°N and 56°S) to produce a near-global topographic map of the Earth.
The X-band antenna was used to transmit and receive radar signals of wavelength 3
centimeters. Using the shorter wavelengths, X-band radar could achieve higher resolution
compared to C-band radar. However, the swath width of the X-band radar was only 50 km.
Therefore, the X-band radar could not achieve near global coverage during the mission.
The mast was used to maintain the baseline distance between the main antenna and the
outboard antenna. The length of the mast was 60 m and was inclined at 45 deg. from the
vertical.
The outboard antenna was connected to the end of the mast. It was used only to receive the
radar signals scattered back from the land surface. No signal was transmitted from the
outboard antenna.
The outboard antenna also contained two antennas: one was used to receive radar signals in
the C-band, and the other in the X-band. Wavelengths of the C and X band signals were 5.6
cm and 3 cm, respectively.
In SRTM, principle of radar interferometry was used to extract the elevation data.
In SRTM, space-borne, fixed baseline, single-pass interferometry was adopted, in which the
signal was sent from a single source and the energy scattered back (radar return) was
recorded simultaneously using two antennas placed at a fixed known distance apart.
The main antenna located at the bay of the space shuttle was used to send the radar signals.
The radar return was recorded at both the main antenna and the outboard antenna (located at
60m away from the main antenna using the mast).
Fig.3 shows the schematic representation of the SRTM radar system employed for capturing
the topographic information.
The images recorded at the two antennas were combined to generate the interferogram (or the
interference fringes). Since the two antennas were separated by a fixed distance, the radar
returns from an object recorded at these two antennas differed in phase, depending upon the
distance of the target from the radar antenna. The interferogram was used to accurately
calculate the phase difference between two signals for each point in the image.
Knowing the wavelength of the signal (which were 5.6 cm and 3cm for the C and X bands,
respectively), fixed base length (which was 60m) and the phase difference, the slant range S
of the object was calculated, by using the principles of trigonometry. Further, the elevation
(h) of the target was calculated as shown below.
h = H − S1 cos θ

where H is the height of the antenna above the ground level, which in this case is the altitude
of the orbit (233 km), and θ is the look angle of the radar signal.
Various steps involved in the generation of SRTM elevation data are shown in Fig.4.
Two radar antennas were used to simultaneously capture the radar returns and the radar
images or the radar holograms were created. The radar holograms recorded at the main
antenna and the outboard antenna were combined to generate the interferogram (or the fringe
map), which displays bands of colors depending up on the phase difference (interferometric
phase) between the signals received at the two antennas. The phase difference was then used
to calculate the difference in the distance of the target from the two antennas, which was
further used to estimate the height of the target.
SRTM digital elevation data was generated from radar signals using the principles of radar
interferometry. This elevation data was then edited to fill small voids, remove spikes,
delineate and flatten water bodies, and to improve the coastlines.
The C-band antennas were used to scan almost 80% of the land surface of the Earth (between
60°N and 56°S) to produce the near-global topographic map of the Earth at a spatial
resolution of 1 arc-second. Due to the smaller swath width of the X-band antennas, near-
global coverage could not be achieved using the X-band.
Fig.5 and 6 show the coverage of C-band and X-band elevation data, respectively
Digital Terrain Elevation Data (DTED): Each file contains regularly spaced grids
containing the elevation information. This format can be directly used in the GIS
platform.
Band interleaved by line (BIL): In this format, the elevation data is available for
regular grids, in binary format. A header file is used to describe the layout of the grids
and the data format.
The publicly available SRTM DEM is geo-referenced to the WGS84 datum. In the data set,
unit of elevation is meters. The elevation data gives less than 16m accuracy in the absolute
elevation and less than 10m accuracy in the relative vertical height, at 90% confidence level.
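As an illustration of reading such raw binary elevation tiles with NumPy, the sketch below assumes a 3 arc-second ".hgt"-style tile of 1201 × 1201 big-endian 16-bit integers with −32768 marking voids; the file name is hypothetical and the actual layout should be confirmed from the product documentation before use.

```python
# Sketch: load a raw-binary SRTM tile into a NumPy array and mask voids.
import numpy as np

TILE = "N12E077.hgt"           # hypothetical file name
N = 1201                       # samples per row/column for a 3 arc-second tile

elev = np.fromfile(TILE, dtype=">i2").reshape(N, N).astype(float)
elev[elev == -32768] = np.nan  # mask data voids

print("min/max elevation (m):", np.nanmin(elev), np.nanmax(elev))
```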
Case study results presented by Jarvis conclude that SRTM-derived DEMs provide greater
accuracy than DEMs derived from topographic cartography (TOPO DEMs). At the same time,
this does not necessarily mean that they contain more detail. It was noted that 3 arc-second
SRTM DEMs fail to capture topographic features visible at scales of 1:25,000 and finer.
Hence, where only cartography at scales coarser than 1:25,000 (e.g., 1:50,000 and 1:100,000)
is available, the use of SRTM DEMs is recommended. For hydrological modeling
applications, if good quality cartographic data at a scale of 1:25,000 or finer are available,
digitizing and interpolating these cartographic data was deemed preferable for better results.
This is because, even though SRTM 3 arc-second DEMs perform well for hydrological
applications, they are on the margin of usability.
1. Introduction
Terrain attributes derived from the DEM are broadly classified as primary attributes and
secondary attributes. Primary attributes are those derived directly from the DEM, whereas
secondary attributes are derived using one or more of the primary attributes. Some of the
primary attributes, which are important in the hydrologic analysis, derived from the DEM
include slope, aspect, flow-path length, and upslope contributing area. Topographic wetness
index is an example of the secondary attribute derived from the DEM. Topographic wetness
index represents the extent of the zone of saturation as a function of the upslope contributing
area, soil transmissivity and slope.
Gridded DEM represents the surface as a matrix of regularly spaced grids carrying the
elevation information. Most of the terrain analysis algorithms using the gridded DEM assume
uniform spacing of grids throughout the DEM. Topographic attributes are derived based on
the changes in the surface elevation with respect to the distance.
This lecture covers brief explanation of various topographic indices that can be derived from
the raster DEM.
Slope is defined as the rate of change of elevation, expressed as gradient (in percentage) or in
degrees.
Using the finite difference approach, slope in any direction is expressed as the first derivative
of the elevation in that direction.
Slope in the x direction: Sx = ∂h/∂x

Slope in the y direction: Sy = ∂h/∂y

Resultant slope: S = √[(∂h/∂x)² + (∂h/∂y)²]
Consider the example grid given below, in which the elevations are marked as h1, h2, ..., h9.
Using the finite difference formulation, the slope of the central cell (cell 9) can be calculated
as follows.

S9 = √{[(h1 − h5)/(2a)]² + [(h3 − h7)/(2a)]²}

where a is the grid spacing.
In this approach, only the four directions are considered. However, the slope estimated using
the above equation generally gives reasonably accurate values of the slope.
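A small sketch of this finite-difference slope calculation on a gridded DEM, using NumPy central differences; the elevations and grid spacing below are hypothetical.

```python
# Sketch: slope magnitude of a gridded DEM from central differences.
import numpy as np

dem = np.array([[100., 98., 95.],
                [ 97., 93., 90.],
                [ 94., 90., 85.]])
a = 30.0                                   # grid spacing, m (hypothetical)

# np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x)
dz_dy, dz_dx = np.gradient(dem, a)
slope = np.sqrt(dz_dx ** 2 + dz_dy ** 2)   # slope magnitude (rise/run)
slope_pct = 100.0 * slope                  # gradient in percent
slope_deg = np.degrees(np.arctan(slope))   # slope in degrees

print(np.round(slope_pct, 1))
print(np.round(slope_deg, 1))
```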
On the other hand, deterministic eight-neighborhood (D8) algorithm estimates the slope by
calculating the rate of change of elevation in the steepest down slope direction among the 8
nearest neighbors, as shown below.
S9,D8 = Max[(h9 − h1)/L, (h9 − h2)/L, ..., (h9 − h8)/L]

where L is the distance between the cell centres (L equals the grid spacing for the cardinal
neighbours and √2 times the grid spacing for the diagonal neighbours).
When gridded DEMs are used for the analysis, each of the grids is compared with its nearest
8 neighbors and the slope is derived for each grid. The local slope calculated from a gridded
DEM decreases with increase in the DEM grid size. This is because, as the grid size
increases, the grids represent larger areas. In the slope calculation, since the spatial averages
of the elevation for such larger areas are used, it tends to result in smoother or less steep
surface (Wolock and Price 1994).
Selection of DEM grid size is therefore important to get appropriate slope map, particularly
in fields like erosion studies, where the processes are largely related to the slope.
Aspect is the orientation of the line of steepest descent, normally measured clockwise from
the north, and is expressed in degrees.
In watershed analysis using raster based DEM, water from each cell is assumed to flow or
drain into one of its eight neighboring cells which are towards left, right, up, down, and the
four diagonal directions. The flow vector algorithm scans each cell of the DEM, and
determines the direction of the steepest downward slope to an adjacent cell.
The most common method used for identifying the flow direction is the D8 (deterministic
eight-neighbors) method. The method was first proposed by O’Callaghan and Mark (1984).
In this method, a flow vector indicating the steepest slope is assigned to one of the eight
neighboring cells.
A detailed description of the D8 algorithm for drainage pattern extraction is provided in the
following sub-section.
4.1. D8 Algorithm
In this method, the flow direction for each cell is estimated from elevation differences
between the given cell and its eight neighboring cells, and hence the name D8 algorithm.
Most GIS implementations use the D8 algorithm to determine flow path.
In D-8 algorithm, water from each cell is assumed to flow or drain into one of its eight
neighboring cells which are towards left, right, up, down, and the four diagonal directions.
The flow is assumed to follow the direction towards the cell having the steepest slope. If the
steepest downward slope is encountered at more than one adjacent cell, flow direction is
assigned arbitrarily towards the first of these cells encountered in a row by row scan of the
adjacent cells. Where an adjacent cell is undefined (i.e. has a missing elevation value or lies
outside the DEM grid), the downward slope to that cell is assumed to be steeper than that to
any other adjacent cell with a defined elevation value.
Once the flow direction is identified, numerical values are assigned to the direction. The
general flow direction code or the eight-direction pour point model, followed for each
direction from the center cell is shown in the figure below. Each of the 8 flow directions is
assigned a numeric value (Fig. 4) from the series 2^x, where x = 0, 1, 2, ..., 7.
From Fig.4, it can be inferred that the flow direction is coded as 1 if the direction of steepest
drop is to the right of the current processing cell, and 128 if the steepest drop is towards top-
right of the current cell.
Consider a small part of a gridded DEM as shown in Fig. 5, and assume a cell size of 1 unit.
Consider the cell with elevation 67. There are 3 adjacent cells with elevations less than 67
(namely 56, 53 and 44), and these three are considered as the possible
flow directions. In the flow vector algorithm, the direction of steepest slope among these
three directions is identified.
Slope is calculated along the three directions as shown in Fig. 5 (b). Slope is the maximum
towards the bottom right cell, and hence water would follow that direction. As a result,
according to the eight direction pour point model, the center cell (with elevation 67) is
allotted a flow direction value of 2.
The same procedure is repeated for all the grids in a DEM. The resulting flow directions and
the corresponding flow direction values are shown in Fig. 5 (c) and (d), respectively.
(a) A sample 5 × 5 DEM:

78 72 69 71 58
74 67 56 49 46
69 53 44 37 38
64 58 55 22 31
68 61 47 21 16

(b) Slopes from the cell with elevation 67 towards its three lower neighbours:

Slope towards 56 (right) = (67 − 56)/1 = 11.00
Slope towards 44 (lower right, diagonal) = (67 − 44)/√2 = 16.26
Slope towards 53 (below) = (67 − 53)/1 = 14.00

Fig.5 (a) A sample DEM (b) Estimation of the steepest down slope direction (c) Flow
direction (d) Flow direction matrix with numerical values for each direction
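A minimal sketch of the D8 coding described above, applied to the sample DEM of Fig. 5; the tie-breaking and edge handling used in operational implementations are omitted.

```python
# Sketch: D8 flow direction for one interior cell, using the eight-direction
# pour point codes E=1, SE=2, S=4, SW=8, W=16, NW=32, N=64, NE=128.
import math
import numpy as np

# (row offset, column offset, direction code)
D8 = [(0, 1, 1), (1, 1, 2), (1, 0, 4), (1, -1, 8),
      (0, -1, 16), (-1, -1, 32), (-1, 0, 64), (-1, 1, 128)]

def d8_direction(dem, r, c, cell=1.0):
    """Return the D8 code of the steepest downward slope from cell (r, c)."""
    best_code, best_slope = 0, 0.0
    for dr, dc, code in D8:
        dist = cell * (math.sqrt(2.0) if dr and dc else 1.0)
        slope = (dem[r, c] - dem[r + dr, c + dc]) / dist
        if slope > best_slope:             # keep the steepest positive drop
            best_slope, best_code = slope, code
    return best_code

dem = np.array([[78, 72, 69, 71, 58],
                [74, 67, 56, 49, 46],
                [69, 53, 44, 37, 38],
                [64, 58, 55, 22, 31],
                [68, 61, 47, 21, 16]], dtype=float)

print(d8_direction(dem, 1, 1))   # cell with elevation 67 -> 2 (towards 44)
```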
The D8 method has the following limitations:
It limits the direction of flow between two adjacent nodes to only eight possibilities.
There is a discrepancy between the lengths of the drainages as calculated by the method.
The method fails to capture parallel flow lines.
Use of two outflow paths (Tarboton, 1997), partitioning of the flow into all downslope
directions (Quinn et al., 1991), and use of a stochastic approach to determine the gradient
(Fairfield and Leymarie, 1991) were some of the improvements made later on the D8
algorithm.
1. Introduction
Topography of the river basin plays an important role in hydrologic modelling, by providing
information on different terrain attributes which enhance the assessment and enable the
simulation of complex hydrological processes. In the past, topographical maps were one of
the major sources of information for derivation of the catchment characteristics in
hydrological models. With the rapidly increasing availability of topographic information in
digital form, like digital elevation models (DEMs), the automatic extraction of terrain
attributes to represent the catchment characteristics has become very popular. Automatic
algorithms used for the extraction of the catchment characteristics benefit from speed,
accuracy and standardization. The ability to create new, more meaningful characteristics,
thereby eliminating the need to draw and digitize the attributes, is another major advantage of
such algorithms.
Hydrologic models use information about the topography in the form of terrain attributes that
are extracted from DEM for modeling the different hydrological process such as interception,
infiltration, evaporation, runoff, groundwater recharge, water quality etc. In hydrologic
studies DEMs have also been used to derive the channel network and to delineate the
catchment area.
This lecture explains the algorithms used to extract the channel network and the catchment
area from a raster DEM.
Gridded DEM has been widely used in the hydrologic modeling to extract drainage patterns
of a basin required for flow routing in the hydrologic models. The gridded DEM provides
elevation information at regularly spaced grids over the area. The algorithm used must be
capable of identifying the slope variation and possible direction of flow of water using the
DEM.
While using the gridded DEM, inadequate elevation difference between the grids often
creates difficulty in tracing the drainage pattern. Also, gridded DEM may contain
depressions, which are grids surrounded by higher elevations in all directions. Such
depressions may be natural, or may be artifacts of interpolation errors. These depressions also create
problems in tracing the continuous flow path.
Prior to the application of the DEM in the hydrologic studies, preprocessing of the DEM is
therefore carried out to correct for the depressions and flat areas.
Depression or sink is defined here as a point which is lower than its eight nearest neighboring
grids, as shown in Fig.1. Such points may arise due to data errors introduced in the surface
generation process, or they represent real topographic features such as quarries or natural
potholes.
Spurious flat areas are present in a DEM when the elevation information is inadequate to
represent the actual relief of the area. Fig.2 shows example of spurious flat area in a raster
DEM.
There are many algorithms available in literature for treating depressions and flat areas in the
raster DEM.
The depression filling algorithms basically identify and delineate the depression. Outlet from
the depression is identified by filling the depression to the lowest value on its rim (Jenson and
Domingue, 1988).
Advanced algorithms available for the depression filling make use of the overall drainage
information of the watershed, either by burning the stream network (overlay the stream
network and modify the elevation along channel grids) or by tracing the flow direction from
the watershed outlet to the depression.
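The basic single-cell filling idea can be sketched as follows; operational algorithms (e.g., Jenson and Domingue, 1988) also handle multi-cell depressions and flat areas, which this sketch does not.

```python
# Sketch: raise every single-cell sink to the lowest elevation on its rim.
import numpy as np

def fill_single_cell_sinks(dem):
    filled = dem.copy().astype(float)
    for r in range(1, dem.shape[0] - 1):
        for c in range(1, dem.shape[1] - 1):
            # Neighbours in the 3x3 window, excluding the centre cell
            win = filled[r - 1:r + 2, c - 1:c + 2].copy()
            win[1, 1] = np.inf
            rim_min = win.min()
            if filled[r, c] < rim_min:     # cell is a sink
                filled[r, c] = rim_min     # fill it to the lowest rim value
    return filled

dem = np.array([[10, 10, 10],
                [10,  6, 10],
                [10,  9, 10]], dtype=float)
print(fill_single_cell_sinks(dem))         # centre cell 6 is raised to 9
```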
Relief algorithms are used to identify the flow direction through flat areas in the DEM. The
relief algorithm imposes relief over the flat areas to allow an unambiguous definition of flow
lines across these areas. The imposed relief is based on two assumptions:
The flat areas are not truly level, but have a relief that is not detectable at the vertical
resolution of the original DEM.
The relief in the flat area is such that any flow entering or originating from that area
will follow the shortest path over the flat area to a point on its perimeter where a
downward slope is available.
The relief algorithm proposed by Martz and Garbrecht (1998), and the priority-first-search
algorithm (PFS) proposed by Jones (2002), are some of the methods available for treating the
flat areas and to trace the flow paths.
A DEM, made free of sinks and flat areas is termed as a modified DEM.
Care should be taken, as these algorithms could change the natural terrain, enlarge a
depression, introduce looped flow paths within a depression and/or produce a spurious
outflow point while processing a flat area, which in turn could affect the accuracy of
watershed delineation, drainage network extraction and hydrologic event simulation.
The steps for delineating watershed from a depressionless DEM are the following.
Flow vector algorithms scan each cell of the modified DEM (corrected for depressions and
flat areas) and determine the direction of the steepest downward slope to an adjacent cell. The most
common method used for identifying the flow direction is the D8 (deterministic eight-
neighbors) method (Details of the D-8 algorithm are provided in Lecture 4).
Using D-8 algorithm, flow direction for each cell is estimated from elevation differences
between the given cell and its eight neighboring cells.
Consider a small sample of a DEM in raster format given in Fig.3 (a). The corresponding
flow direction grid and the matrix containing numerical values of the flow directions are
shown in Fig. 3(b) and 3(c), respectively.
(a)
(b) (c)
Fig.3 (a) A sample DEM (b) Flow direction grid (c) Flow direction matrix with numerical
values for each direction
5. Flow network
Once the flow direction grid has been obtained, the flow network is created by extending the
lines of steepest descent beyond each cell, as shown in Fig. 4.
(a) (b)
Once the flow directions are identified, the flow from each grid cell is traced to the watershed
outlet. During this flow tracing, a counter is initiated for each grid cell, and as the flow passes
through a cell its counter is incremented by 1. Using these counters, the total number of
upstream cells that drain into each cell is identified and the flow accumulation grid is
generated. In the flow accumulation grid, the number in each cell denotes the number of cells
that flow into that particular cell. Figure 5(b) illustrates the flow accumulation grid.
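A simple sketch of this counting procedure for a small, hypothetical D8 flow-direction grid is given below; a stream mask would then follow by thresholding the accumulation grid (e.g., acc >= 5).

```python
# Sketch: flow accumulation by tracing every cell downstream through a
# D8 flow-direction grid (codes E=1, SE=2, ..., NE=128; 0 = no outflow).
import numpy as np

OFFSETS = {1: (0, 1), 2: (1, 1), 4: (1, 0), 8: (1, -1),
           16: (0, -1), 32: (-1, -1), 64: (-1, 0), 128: (-1, 1)}

def flow_accumulation(fdir):
    nrows, ncols = fdir.shape
    acc = np.zeros_like(fdir, dtype=int)
    for r in range(nrows):
        for c in range(ncols):
            i, j = r, c
            # Trace downstream until the edge or an undefined direction
            while fdir[i, j] in OFFSETS:
                di, dj = OFFSETS[fdir[i, j]]
                i, j = i + di, j + dj
                if not (0 <= i < nrows and 0 <= j < ncols):
                    break
                acc[i, j] += 1             # one more upstream cell drains here
    return acc

# Hypothetical 3x3 flow-direction grid: all cells drain to the lower-right cell
fdir = np.array([[2, 2, 4],
                 [1, 2, 4],
                 [1, 1, 0]])
print(flow_accumulation(fdir))             # lower-right cell accumulates 8
```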
(a) (b)
Stream network is defined using the relative count of grids flowing to it, which is obtained
from the flow accumulation matrix. To delineate a stream from the flow accumulation grid, it
is necessary to specify a threshold flow accumulation. The threshold specifies the minimum
flow accumulation at which a grid can be considered as a part of the stream. Grids which
have flow accumulation greater than the threshold value are assumed to be the parts of the
stream and the remaining grids are considered as overland flow grids. Fig. 6 highlights the
stream grids for a threshold flow accumulation of 5 cells.
Figure 6. Grids that are parts of the stream network using the threshold flow accumulation of
5-cells
On specifying the threshold flow accumulation, the stream links associated with this
threshold are obtained with the help of flow network grid, as highlighted in Fig.7. Knowing
the stream links, the grids contributing flow to any point on the stream can be identified,
which can be used to delineate the subwatersheds.
Figure 7: Stream network for a 5-cell threshold flow accumulation (shown in red color)
Watershed boundaries or the locations of watershed divides can be obtained based on stream
channel information. Usually, watershed boundaries tend to lie roughly half way between the
stream of interest and the neighbouring streams. If a more precise boundary needs to be
determined, the use of topographic maps is essential. These maps show elevation contours,
small ephemeral streams, large water bodies etc. Pour point is the name given to the outlet of
the watershed that the user is interested in delineating. An estimate of the watershed boundary
from a topographic map requires the pour point locations. A pour point can be any point on
the stream/river where the surface flow exits the watershed. The upslope
catchment area at each grid of the modified gridded DEM is determined using the flow
direction information. Beginning at each grid with a defined elevation value and using the
flow direction vectors previously generated, the path of steepest descent is continuously
followed for each grid until the edge of the DEM is reached, and the flow accumulation is
derived for each grid. The flow accumulation represents the number of grids contributing
flow into the grid in the watershed, and hence gives the upslope catchment area for that grid.
All the above steps to extract sub-watersheds from a raster based DEM are shown in the form
of a flowchart in Fig. 8.
The SRTM DEMs for the two case study basins, the Krishna and the Cauvery, were
downloaded in raster format from the United States Geological Survey (USGS) website and
are presented in Figure 9.
(a)
(b)
Figure 9: SRTM data for (a) Krishna basin and (b) Cauvery basin
Flow Direction
Flow direction from each cell is estimated using the D8 algorithm, embedded in the GIS
framework of ArcGIS. Fig. 10 shows the flow direction images for the two case study basins.
(a)
(b)
Figure 10: Flow direction image for (a) Krishna basin and (b) Cauvery basin
Flow Accumulation
Using the flow direction information, the flow is traced for each grid and the flow
accumulation is derived. Here, the flow accumulation is derived using the flow accumulation
function available in the ArcGIS Spatial Analyst. Fig. 11 shows the flow accumulation
images for the two case study basins.
(a)
(b)
Figure 11: Flow accumulation image for (a) Krishna basin and (b) Cauvery basin
Stream Network
A vector dataset showing drainage network is derived based on combined information from
the flow accumulation and flow direction dataset. Stream network derived for the Krishna
and Cauvery basins are shown in Fig. 12.
(a)
(b)
Figure 12: Stream network of (a) Krishna basin and (b) Cauvery basin
Pour Points
In the next step, pour point locations are created. If the locations of hydrometric gauging
stations are not available, pour points need to be created manually. This gives a vector
dataset, created with probable outlets of drainage sub-basins in the drainage network. Fig.13
shows the location of sub basin outlet or pour points for Krishna and Cauvery basins.
Figure 13. Sub-basin outlets or pour points in the (a) Krishna basin (b) Cauvery basin
Watershed delineation
The watersheds in the two study area are delineated using ArcGIS, as shown in Fig. 14. The
sub basins delineated can be seen in different colors.
Figure 14. Watersheds delineated using ArcGIS for the (a) Krishna basin and (b) Cauvery
basin
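A hedged sketch of the ArcGIS Spatial Analyst sequence used above, written with arcpy; the file names and the snap distance are hypothetical, a Spatial Analyst licence is assumed, and the exact tool parameters should be checked against the ArcGIS documentation for the version in use.

```python
# Sketch: fill -> flow direction -> flow accumulation -> watershed delineation.
import arcpy
from arcpy.sa import Fill, FlowDirection, FlowAccumulation, SnapPourPoint, Watershed

arcpy.CheckOutExtension("Spatial")
arcpy.env.workspace = r"C:\data\krishna"          # hypothetical workspace

filled   = Fill("srtm_dem.tif")                   # remove depressions
flow_dir = FlowDirection(filled)                  # D8 flow direction
flow_acc = FlowAccumulation(flow_dir)             # upstream cell counts

# Snap manually digitised pour points to the highest-accumulation cell
# within 100 m, then delineate the contributing watersheds.
pour_pts  = SnapPourPoint("pour_points.shp", flow_acc, 100)
watershed = Watershed(flow_dir, pour_pts)
watershed.save("subbasins.tif")
```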
Accuracy of the DEM-derived stream network and the catchment boundary depends on the
following factors.
The source of the elevation data, including the techniques for measuring elevation
either on the ground or remotely, the locations of samples, and the density of samples.
The horizontal resolution and vertical precision at which the elevation data is
represented. If the actual topographic gradient is less than the precision of the DEM
elevation data, then these areas will be represented as flat lands according to DEM
data.
The topographic complexity of the landscape being represented. For example, it may
not be possible to determine if a given area of equal elevation is either a lake or a flat
area, where a river possibly flows.
The algorithms used to calculate different terrain attributes. For example, the
watershed drainage structure modeled using a DEM may not match the actual
drainage structure in flat areas.
1. Burrough, P. A., 1986. Principles of Geographic Information Systems for Land Resource
Assessment. Monographs on Soil and Resources Survey No. 12, Oxford Science
Publications, New York.
2. Fairfield, J., Leymarie, P., 1991. Drainage networks from grid digital elevation models.
Water Resources Research 27 (5), 709–717.
3. Jenson, S.K., Domingue, J.O., 1988. Extracting topographic structure from digital
elevation data for geographic information system analysis. Photogrammetric Engineering
and Remote Sensing 54 (11), 1593–1600.
4. Jones, N., Wright, S., Maidment, D., 1990. Watershed Delineation with Triangle-Based
Terrain Models. Journal of Hydraulic Engineering, 116(10), 1232–1251.
5. Jones, R., 2002. Algorithms for using a DEM for mapping catchment areas of stream
sediment samples. Computers and Geosciences, 28 (9), 1051–1060.
6. Martz, L.W., Garbrecht, J., 1998. The treatment of flat areas and depressions in
automated drainage analysis of raster digital elevation models. Hydrological Processes
12, 843–855.
7. O’Callaghan, J.F., Mark, D.M., 1984. The extraction of drainage networks from digital
elevation data. Computer Vision, Graphics, and Image Processing 28, 323–344.
8. Quinn, P., Beven, K., Chevallier, P., Planchon, O., 1991. The prediction of hillslope flow
paths for distributed hydrological modelling using digital terrain models. Hydrological
Processes 5, 59–79.
9. Sabins, F.F., Jr., 1978. Remote Sensing Principles and Interpretation, Freeman, San
Francisco.
10. Tarboton, D.G., 1997. A new method for the determination of flow directions and
upslope areas in grid digital elevation models. Water Resources Research 33 (2), 309–
319.
11. Turcotte, R., Fortin, J.-P., Rousseau, A.N., Massicotte, S., Villeneuve, J.-P., 2001.
Determination of drainage structure of a watershed using a digital elevation model and a
digital river and lake network. Journal of Hydrology 240, 225–242.
12. Wolock, D. M., Price, C. V., 1994. Effects of digital elevation model map scale and data
resolution on a topography-based watershed model. Water Resources Research, 30(11),
3041–3052.
1. Introduction
Scientific planning and management is essential for the conservation of land and water
resources for optimum productivity. Watersheds being the natural hydrologic units, such
studies are generally carried out at the watershed scale and are broadly referred to under the
term watershed management. It involves assessment of the current resource status, complex modeling
to assess the relationship between various hydrologic components, planning and
implementation of land and water conservation measures etc.
Remote sensing via aerial and space-borne platforms acts as a potential tool to supply the
essential inputs to the land and water resources analysis at different stages in watershed
planning and management. Water resource mapping, land cover classification, estimation of
water yield and soil erosion, estimation of physiographic parameters for land prioritization
and water harvesting are a few areas where remote sensing techniques have been used.
This lecture covers the remote sensing applications in water resources management under the
following five classes:
Water resources mapping
Estimation of watershed physiographic parameters
Estimation of hydrological and meteorological variables
Watershed prioritization
Water conservation
Identification and mapping of the surface water boundaries has been one of the simplest and
direct applications of remote sensing in water resources studies. Water resources mapping
using remote sensing data requires fine spatial resolution so as to achieve accurate delineation
of the boundaries of the water bodies.
Optical remote sensing techniques, with their capability to provide very fine spatial
resolution, have been widely used for water resources mapping. Water absorbs most of the
energy in the NIR and MIR wavelengths, giving darker tones in these bands, and can therefore
be easily differentiated from land and vegetation.
Fig. 1 shows images of a part of the Krishna river basin in different bands of the Landsat
ETM+. In the VIS bands (bands 1, 2 and 3) the contrast between water and other features is
not very significant. On the other hand, the IR bands (bands 4 and 5) show a sharp contrast
between them due to the poor reflectance of water in the IR region of the EMR spectrum.
Fig. 1 Landsat ETM+ images of a part of the Krishna river basin in different spectral bands
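One commonly used way of exploiting this low water reflectance in the NIR band, not described in the text above, is a normalised-difference water index, NDWI = (Green − NIR)/(Green + NIR); the reflectance values and the threshold in the sketch below are hypothetical.

```python
# Sketch: map open water by thresholding a normalised-difference water index.
import numpy as np

green = np.array([[0.08, 0.09, 0.10],
                  [0.07, 0.06, 0.11],
                  [0.05, 0.05, 0.12]])
nir   = np.array([[0.30, 0.28, 0.04],
                  [0.32, 0.25, 0.03],
                  [0.35, 0.27, 0.05]])

ndwi = (green - nir) / (green + nir)
water_mask = ndwi > 0.0          # True where the pixel is likely open water
print(water_mask)
```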
Poor cloud penetration capacity and poor capability to map water resources under thick
vegetation cover are the major drawbacks of the optical remote sensing techniques.
Use of active microwave sensor helps to overcome these limitations as the radar waves can
penetrate the clouds and the vegetation cover to some extent. In microwave remote sensing,
water surface provides specular reflection of the microwave radiation, and hence very little
energy is scattered back compared to the other land features. The difference in the energy
received back at the radar sensor is used for differentiating, and to mark the boundaries of the
water bodies.
This section covers the remote sensing applications in estimating watershed physiographic
parameters and the land use / land cover information.
Various watershed physiographic parameters that can be obtained from remotely sensed data
include watershed area, size and shape, topography, drainage pattern and landforms.
Fine resolution DEMs have been used to extract the drainage network/ pattern using the flow
tracing algorithms. The drainage information can also be extracted from the optical images
using digital image processing techniques.
The drainage information may be further used to generate secondary information such as
structure of the basin, basin boundary, stream orders, stream length, stream frequency,
bifurcation ratio, stream sinuosity, drainage density and linear aspects of channel systems etc.
Fig. 2 shows the ASTER GDEM for a small region in the Krishna Basin in North Karnataka
and the drainage network delineated from it using the flow tracing algorithm included in the
‘spatial analyst’ tool box of ArcGIS. Fig. 2(b) also shows the stream orders assigned to each
of the delineated streams.
Fig.2 (a) ASTER GDEM of a small region in the Krishna Basin (b) and the stream network
delineated from the DEM
Detailed land use / land cover map is another important input that remote sensing can yield
for hydrologic analysis.
Land cover classification using multispectral remote sensing data is one of the earliest, and
well established remote sensing applications in water resources studies. With the capability of
the remote sensing systems to provide frequent temporal sampling and the fine spatial
resolution, it is possible to analyze the dynamics of land use / land cover pattern, and also its
impact on the hydrologic processes.
Use of hyper-spectral imageries helps to achieve further improvement in the land use / land
cover classification, wherein the spectral reflectance values recorded in the narrow
contiguous bands are used to differentiate different land use classes which show close
resemblance with each other. Identification of crop types using hyper-spectral data is an
example.
With the help of satellite remote sensing, land use land cover maps at near global scale are
available today for hydrological applications. European Space Agency (ESA) has released a
global land cover map of 300 m resolution, with 22 land cover classes at 73% accuracy (Fig.
3).
Fig. 3. Global 300 m land cover classification from the European Space Agency (Source:
http://www.esa.int/Our_Activities/Observing_the_Earth/ESA_global_land_cover_map_avail
able_online)
4.1 Precipitation
Remote sensing techniques have been used to provide information about the occurrence of
rainfall and its intensity. Basic concept behind the satellite rainfall estimation is the
differentiation of precipitating clouds from the non-precipitating clouds (Gibson and Power,
2000) by relating the brightness of the cloud observed in the imagery to the rainfall
intensities.
Satellite remote sensing uses both optical and microwave remote sensing (both passive and
active) techniques.
Table 1 lists some of the important satellite rainfall data sets, satellites used for the data
collection and the organizations that control the generation and distribution of the data.
Table 1. Details of some of the important satellite rainfall products (Nagesh Kumar and
Reshmidevi, 2013)
Program: World Weather Watch
Organization: WMO
Spectral bands used: VIS, IR
Characteristics and source of data: 1-4 km spatial and 30 min. temporal resolution
(http://www.wmo.int/pages/prog/www/index_en.html)

Program: TRMM
Organization: NASA, JAXA
Spectral bands used: VIS, IR, passive and active microwave
Characteristics and source of data: sub-daily, 0.25° (~27 km) spatial resolution
(ftp://trmmopen.gsfc.nasa.gov/pub/merged)
4.2. Evapotranspiration
Evapotranspiration (ET) represents the water and energy flux between the land surface and
the lower atmosphere. ET fluxes are controlled by the feedback mechanism between the
atmosphere and the land surface, soil and vegetation characteristics, and the hydro-
meteorological conditions.
There are no direct methods available to estimate the actual ET by means of remote sensing
techniques. Remote sensing application in the ET estimation is limited to the estimation of
the surface conditions like albedo, soil moisture, surface temperature, and vegetation
characteristics like normalized differential vegetation index (NDVI) and leaf area index
(LAI). The data obtained from remote sensing are used in different models to simulate the
actual ET.
Courault et al. (2005) grouped the remote sensing data-based ET models into four different
classes:
Empirical direct methods: Use the empirical equations to relate the difference in the
surface air temperature to the ET.
Residual methods of the energy budget: Use both empirical and physical
parameterization. Example: SEBAL (Bastiaanssen et al., 1998), FAO-56 method
(Allen at al., 1998)
Deterministic models: Simulate the physical process between the soil, vegetation and
atmosphere making use of remote sensing data such as Leaf Area Index (LAI) and
soil moisture. SVAT (Soil-Vegetation-Atmosphere-Transfer) model is an example
(Olioso et al., 1999).
Vegetation index methods: Use the ground observation of the potential or reference
ET. Actual ET is estimated from the reference ET by using the crop coefficients
obtained from the remote sensing data (Allen et al., 2005; Neale et al., 2005).
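As an illustration of the vegetation index approach listed above, the sketch below scales a ground-based reference ET by a crop coefficient derived from NDVI; the linear Kc-NDVI relation and all coefficients are hypothetical placeholders, since operational coefficients are calibrated per crop and region (e.g., Allen et al., 2005; Neale et al., 2005).

```python
# Sketch: actual ET = crop coefficient (from NDVI) x reference ET.
def actual_et(ndvi, et_ref, a=1.25, b=-0.10):
    """Estimate actual ET (mm/day) from NDVI and reference ET (mm/day)."""
    kc = a * ndvi + b              # hypothetical linear Kc-NDVI relation
    kc = max(0.0, min(kc, 1.2))    # keep Kc within a plausible range
    return kc * et_ref

print(actual_et(ndvi=0.65, et_ref=5.0))   # ~3.6 mm/day
```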
Optical remote sensing using the VIS and NIR bands has been commonly used to estimate
the input data required for the ET estimation algorithms.
As a part of the NASA / EOS project to estimate global terrestrial ET from earth’s land
surface by using satellite remote sensing data, MODIS Global Terrestrial Evapotranspiration
Project (MOD16) provides global ET data sets at regular grids of 1 sq.km for the land surface
at 8-day, monthly and annual intervals for the period 2000-2010.
Remote sensing techniques of soil moisture estimation are advantageous over the
conventional in-situ measurement approaches owing to the capability of the sensors to
capture spatial variation over a large areal extent. Moreover, depending upon the revisit time
of the satellites, frequent sampling of an area and hence more frequent soil moisture
measurements are feasible.
Fig. 4 shows the global average monthly soil moisture in May extracted from the integrated
soil moisture database of the European Space Agency- Climate Change Initiative (ESA-CCI).
Fig 4. Global monthly average soil moisture in May from the CCI data
(Source: http://www.esa-soilmoisture-cci.org/)
Remote sensing of soil moisture requires information from below the ground surface and is
therefore mostly confined to the use of the thermal and microwave bands of the EMR
spectrum. Remote sensing of soil moisture is based on the variation in soil properties caused
by the presence of water. Soil properties generally monitored for soil moisture estimation
include the soil dielectric constant, brightness temperature, and thermal inertia.
Though remote sensing techniques give reasonably good estimates of soil moisture, due to the
poor surface penetration capacity of the microwave signals they are considered effective only
in retrieving the moisture content of the surface soil layer, of at most about 10 cm thickness.
In recent years, attempts have been made to extract the soil
moisture of the entire root zone with the help of remote sensing data. Such methods
assimilate the remote sensing derived surface soil moisture data with physically based
distributed models to simulate the root zone soil moisture. For example, Das et al. (2008)
used the Soil-Water-Atmosphere-Plant (SWAP) model for simulating the root zone soil
moisture by assimilating the aircraft-based remotely sensed soil moisture into the model.
Some of the satellite based sensors that have been used for retrieving the soil moisture
information are the following.
Thermal sensors: Data from the thermal bands of the MODIS sensor onboard Terra
satellite have also been used for retrieving soil moisture data.
Hyper-spectral remote sensing has recently been employed to improve soil moisture
estimation. Hyper-spectral monitoring of soil moisture uses the reflectivity in the VIS and
NIR bands to identify the changes in the spectral reflectance curves caused by the presence of
soil moisture (Yanmin et al., 2010). The spectral reflectance measured in multiple narrow,
contiguous bands helps to select the bands most appropriate for soil moisture estimation.
Examples:
Watershed prioritization considering the erosion risk, using parameters such as relief
ratio, drainage density, drainage texture and bifurcation ratio (Chaudhary and Sharma,
1998).
Watershed prioritization based on the sediment yield index (Khan et al., 2001)
Watershed characterization and land suitability evaluation using land use/ land cover,
soil data, slope, and soil degradation status (Saxena et al., 2000)
Fig. 5 shows a sample watershed characterization map of the Northern United States for
water quality risks
Fig.5 Watershed characterization of the Northern United States for water quality risk
Source: http://www.nrs.fs.fed.us/futures/current_conditions/soil_water_conservation/
Remote sensing techniques have been effectively used for watershed characterization and
prioritization to identify the water potential, erosion risk, management requirements etc.
Remote sensing helps in obtaining the database essential for such analyses. Input data that
have been generated using remote sensing techniques for such studies includes physiographic
and morphometric parameters, land use / land cover information and hydrological parameters
as mentioned in the previous section.
Kherthal catchment in Rajasthan lies between latitudes 24°51′ and 25°58′ N and longitudes
73°8′ and 73°19′ E. The catchment consists of 25 micro-catchments and spreads over an area
of approximately 159 km².
Raju and Nagesh Kumar (2012) considered a set of seven geomorphologic parameters to
prioritize the micro-catchments in the Kherthal catchment for the watershed conservation and
management practices.
These morphologic parameters were interpreted / calculated for all the micro-catchments by
using IRS LISS-III images and Survey of India topographic sheets of 1:50,000 scale.
22 out of the 25 micro-catchments were considered for the analysis. Values of the
geomorphologic parameters for these 22 micro-catchments are given in Table 3.
For the prioritization, the maximum values for the first 4 parameters and the minimum values
for the remaining 3 parameters were considered as the evaluation criteria. These criteria were
evaluated using three methods: compromise programming, technique for order preference by
similarity to an ideal solution (TOPSIS) and compound parameter approach (CPAP) (More
details can be found in Raju and Nagesh Kumar, 2012).
In the CPAP, micro-catchments were ranked for the seven parameters individually and the
average of the seven ranks was used as the compound parameter, which was then used to
rank the micro-catchments. Table 4 shows the individual ranks of the parameters, compound
parameter and the corresponding ranks of the micro-catchments.
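The ranking arithmetic of the compound parameter approach can be sketched as follows; the catchment names and parameter values are hypothetical, not those of Table 3, and all parameters are treated as "larger is more critical" for simplicity.

```python
# Sketch: CPAP ranking - rank catchments per parameter, average the ranks,
# and prioritise catchments by the smallest compound parameter.
import numpy as np

catchments = ["A1", "A2", "A3", "A4"]
# rows = catchments, columns = parameters (hypothetical values)
values = np.array([[2.1, 0.8, 3.4],
                   [3.5, 1.2, 2.9],
                   [1.9, 0.9, 4.1],
                   [2.8, 1.5, 3.0]])

# Rank 1 = most critical (largest value) for every parameter
order = np.argsort(-values, axis=0)
ranks = np.empty_like(order)
for j in range(values.shape[1]):
    ranks[order[:, j], j] = np.arange(1, len(catchments) + 1)

compound = ranks.mean(axis=1)              # compound parameter per catchment
priority = np.argsort(compound)            # smallest compound value first
for i in priority:
    print(catchments[i], ranks[i], round(float(compound[i]), 2))
```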
Micro-catchments A6, A3 and A10 were identified as the highest priority micro-catchments in the Kherthal watershed. The study found the analysis of geomorphologic parameters to be very effective in assessing the geomorphological and hydrological characteristics of the micro-catchments.
Rainwater harvesting, wherein rainfall is stored for future use, is an effective water conservation measure, particularly in arid and semi-arid regions. Rainwater harvesting techniques are highly location specific. Selection of an appropriate water harvesting technique requires extensive field analysis to identify the rainwater harvesting potential of the area and the physiographic and terrain characteristics of the locations. The selection depends on the amount of rainfall and its distribution, land topography, soil type and depth, and local socio-economic factors (Rao and Raju, 2010).
Rao and Raju (2010) listed a set of parameters that need to be analyzed to fix appropriate locations for the water harvesting structures. These are:
• Rainfall
• Land use or vegetation cover
• Topography and terrain profile
• Soil type & soil depth
• Hydrology and water resources
• Socio-economic and infrastructure conditions
• Environmental and ecological impacts
Remote sensing techniques have been identified as potential tools to generate the basic information required for arriving at the most appropriate methods for each area.
In the remote sensing aided analysis, various data layers were prepared and brought into a common GIS framework. Multi-criteria evaluation algorithms were then used to aggregate the information from the basic data layers, and various decision rules were evaluated to arrive at the most appropriate solution, as shown in Fig. 6.
Fig. 6 Schematic representation showing the remote sensing data aggregation in evaluating the suitability of various water harvesting techniques
(Images are taken from Rao and Raju, 2010 and aggregated here)
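The aggregation illustrated in Fig. 6 amounts to a weighted overlay of reclassified data layers. The sketch below shows this idea on tiny dummy arrays; the class breaks, suitability scores and weights are assumed for illustration, and a real analysis would read the actual GIS raster layers (e.g., with a raster I/O library such as rasterio) rather than generating random values.

# Multi-criteria weighted overlay sketch: reclassify each layer to a common
# suitability score (1-5) and combine with assumed weights.
import numpy as np

rng = np.random.default_rng(2)
shape = (4, 4)                                   # tiny dummy raster grid
rainfall_mm = rng.uniform(300, 900, shape)
slope_pct = rng.uniform(0, 25, shape)
soil_depth_cm = rng.uniform(20, 150, shape)

def reclass(layer, bins, scores):
    """Map continuous layer values to ordinal suitability scores via class breaks."""
    return np.asarray(scores)[np.digitize(layer, bins)]

rain_score = reclass(rainfall_mm, [400, 600, 800], [1, 2, 4, 5])
slope_score = reclass(slope_pct, [3, 8, 15], [5, 4, 2, 1])     # flatter terrain scores higher
soil_score = reclass(soil_depth_cm, [50, 100], [2, 4, 5])

weights = {"rain": 0.4, "slope": 0.3, "soil": 0.3}             # assumed weights
suitability = (weights["rain"] * rain_score +
               weights["slope"] * slope_score +
               weights["soil"] * soil_score)
print(np.round(suitability, 2))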
The capability to provide large areal coverage at a fine spatial resolution makes remote
sensing techniques highly advantageous over the conventional field-based surveys.
References
1. Allen RG, Pereira LS, Raes D, Smith M (1998). “Crop evapotranspiration: guidelines for computing crop water requirements” Irrigation and Drainage Paper 56, United Nations FAO, Rome.
2. Allen RG, Tasumi M, Morse A, Trezza R (2005). “A Landsat-based energy balance
and evapotranspiration model in Western US water rights regulation and planning”
Irrig. and Drain. Syst., 19(3/4), pp 251–268.
3. Bastiaanssen WGM, Menenti M, Feddes RA, Holtslag AAM (1998). “A remote sensing surface energy balance algorithm for land (SEBAL)” J. Hydrol., 212–213, pp 198-212.
4. Chaudhary RS, Sharma ED (1998). “Erosion hazard assessment and treatment prioritization of Giri River catchment, North Western Himalayas” Indian J. Soil Conservation, 26(1): 6-1.
5. Courault D, Seguin B, Olioso A (2005). “Review on estimation of evapotranspiration
from remote sensing data: From empirical to numerical modeling approaches” Irrig.
and Drain. Syst., 19, pp 223–249.
6. Das NN, Mohanty BP, Cosh MH, Jackson TJ (2008). “Modeling and assimilation of root zone soil moisture using remote sensing observations in Walnut Gulch Watershed during SMEX04” Remote Sens. Environ., 112, pp 415-429. doi:10.1016/j.rse.2006.10.027.
7. Gibson PJ, Power CH (2000). Introductory Remote Sensing- Digital Image
Processing and Applications. Routledge Pub., London.
8. Khan MA, Gupta VP, Moharana PC (2001). “Watershed prioritization using remote
sensing and geographical information system: a case study from Guhiya, India” J.
Arid Environ., 49, pp 465–475.
13. Rao VV and Raju PV (2010) “Water resources management” In Remote Sensing
Applications (Roy PS, Dwivedi RS, Vijayan D Eds.), National Remote Sensing Centre,
Hyderabad.
14. Saxena RK, Verma KS, Chary GR, Srivastava R, Barthwal AK (2000). “IRS-1C data application in watershed characterization and management” Int. J. Remote Sens., 21(17), pp 3197-3208.
15. Yanmin Y, Wei N, Youqi C, Yingbin H, Pengqin T (2010). “Soil Moisture Monitoring Using Hyper-Spectral Remote Sensing Technology” In 2010 Second IITA International Conference on Geosci. Remote Sens., pp 373-376, IEEE.