
ISPRS Journal of Photogrammetry and Remote Sensing 66 (2011) 247–259


Review article

Support vector machines in remote sensing: A review


Giorgos Mountrakis ∗, Jungho Im, Caesar Ogole
Department of Environmental Resources Engineering, SUNY College of Environmental Science and Forestry, 1 Forestry Dr, Syracuse, NY 13210, USA

Article info

Article history:
Received 6 June 2010
Received in revised form 17 September 2010
Accepted 1 November 2010
Available online 3 December 2010

Keywords: Support vector machines; Review; Remote sensing; SVM; SVMs

Abstract

A wide range of methods for analysis of airborne- and satellite-derived imagery continues to be proposed and assessed. In this paper, we review remote sensing implementations of support vector machines (SVMs), a promising machine learning methodology. This review is timely due to the exponentially increasing number of works published in recent years. SVMs are particularly appealing in the remote sensing field due to their ability to generalize well even with limited training samples, a common limitation for remote sensing applications. However, they also suffer from parameter assignment issues that can significantly affect obtained results. A summary of empirical results is provided for various applications of over one hundred published works (as of April, 2010). It is our hope that this survey will provide guidelines for future applications of SVMs and possible areas of algorithm enhancement.

© 2010 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.

1. Introduction

Remotely-sensed data are used in numerous applications. Typically, an image classification process is initiated to convert data into meaningful information. Unfortunately, image classification is not a trivial task. As noted by Chi et al. (2008), classification of remote sensing data is particularly daunting because most supervised learning schemes require a sufficiently large number of training samples, yet definition and acquisition of reference data is often a critical problem. Various classification techniques, both parametric and non-parametric, have been developed and used in different contexts, remote sensing included.

Previous reviews, such as that by Plaza et al. (2009), focused on recent developments in methodologies for processing a specific type of imagery, for example hyperspectral images. The review provided in this paper follows the algorithmic perspective rather than image characteristics. More specifically, we focus on applications of support vector machines (SVMs) in remote sensing.

The motivation to carry out this study comes from different sources. First, SVMs are not as well-known as other classifiers (e.g., decision trees, variants of neural networks) in the general remote sensing community, yet they can match if not exceed the performance of established methods. Second, their performance gains seem well-suited for remote sensing applications, where a limited amount of reference data is often provided. Third, even though the method is not widely popular, in recent years there has been a significant increase in SVM works on remote sensing problems, suggesting this review is current and appropriate.

This review focuses on recent research papers (available by April, 2010) published in eight major journals of remote sensing, namely, ISPRS Journal of Photogrammetry and Remote Sensing, Remote Sensing of Environment, Photogrammetric Engineering & Remote Sensing, IEEE Transactions on Geoscience and Remote Sensing, IEEE Geoscience and Remote Sensing Letters, International Journal of Remote Sensing, Canadian Journal of Remote Sensing and GIScience and Remote Sensing. A limited number of research papers relevant to the thematic point and thus included in this review came from additional sources. The selected papers represent a wide range of: (i) applications from coal reserve detection to urban growth monitoring, (ii) resolutions from sub-meter to several kilometers pixel size, (iii) spectral resolution from single to hundreds of bands, and (iv) comparative methods from maximum likelihood classifiers to neural networks. For completeness, we first recap the basics of SVM methodology before diving into specific works. Relevant papers are then summarized, while juxtaposition of general patterns enables us to derive conclusions and recommendations for further investigations.
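To make the limited-reference-data motivation above concrete, the following is a minimal sketch, not taken from any of the reviewed papers, of a linear soft-margin SVM trained on only 20 labeled samples and then evaluated on 2000 held-out ones. Everything in it is an assumption made for illustration: the synthetic two-band "pixels", the class centres, the helper names (`make_pixels`, `train_linear_svm`, `run_demo`), and the use of plain stochastic subgradient descent on the regularized hinge loss in place of the quadratic-programming solvers normally found in SVM libraries.

```python
import random


def make_pixels(n_per_class, rng):
    """Hypothetical two-band 'pixels': class -1 centred at (-2, -2), class +1 at (2, 2)."""
    pixels = []
    for label, centre in ((-1, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            bands = (rng.gauss(centre, 1.0), rng.gauss(centre, 1.0))
            pixels.append((bands, label))
    return pixels


def train_linear_svm(samples, rng, lam=0.01, eta=0.01, epochs=200):
    """Stochastic subgradient descent on the L2-regularized hinge loss."""
    w = [0.0, 0.0]
    b = 0.0
    samples = list(samples)  # work on a copy so the caller's list keeps its order
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, y in samples:
            margin = y * (w[0] * x[0] + w[1] * x[1] + b)
            # The regularization term shrinks w on every step.
            w = [wi * (1.0 - eta * lam) for wi in w]
            if margin < 1.0:
                # Sample is misclassified or inside the margin:
                # move the hyperplane away from it.
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
    return w, b


def predict(w, b, x):
    """Assign the label by the side of the hyperplane on which x falls."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else -1


def accuracy(w, b, samples):
    correct = sum(1 for x, y in samples if predict(w, b, x) == y)
    return correct / len(samples)


def run_demo(n_train_per_class=10, n_test_per_class=1000, seed=0):
    """Train on a small labeled sample, then score on a much larger held-out set."""
    rng = random.Random(seed)
    train = make_pixels(n_train_per_class, rng)
    test = make_pixels(n_test_per_class, rng)
    w, b = train_linear_svm(train, rng)
    return w, b, accuracy(w, b, test)


if __name__ == "__main__":
    w, b, acc = run_demo()
    print("weights:", w, "bias:", b, "held-out accuracy:", acc)
```

The synthetic classes here are well separated by construction, so the sketch only shows the mechanics of training with few samples; it says nothing about accuracy on real imagery, where class overlap, kernel choice, and parameter settings matter.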

∗ Corresponding address: Department of Environmental Resources Engineering, SUNY College of Environmental Science and Forestry, 419 Baker Hall, 1 Forestry Dr, Syracuse, NY 13210, USA. Tel.: +1 (315) 470 4824; fax: +1 (315) 470 6958.
E-mail address: [email protected] (G. Mountrakis).
URL: http://www.aboutgis.com (G. Mountrakis).
0924-2716/$ – see front matter © 2010 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.
doi:10.1016/j.isprsjprs.2010.11.001

2. Overview of support vector machines

Support vector machines (SVMs) are a supervised non-parametric statistical learning technique; therefore, no assumption is made on the underlying data distribution. In its original formulation (Vapnik, 1979) the method is presented with a set of labeled data instances and the SVM training algorithm aims to find a hyperplane that separates the dataset into a discrete predefined number of classes in a fashion consistent with the training examples. The term optimal separation hyperplane is used to refer to the decision boundary that minimizes misclassifications, obtained in the training step. Learning refers to the iterative process of finding a classifier with optimal decision boundary to separate the training patterns (in potentially high-dimensional space) and then to separate simulation data under the same configurations (dimensions) (Zhu and Blumberg, 2002).

In their simplest form, SVMs are linear binary classifiers that assign a given test sample a class from one of the two possible labels. An instance of a data sample to be labeled in the case of remote sensing classification is normally the individual pixel derived from the multi-spectral or hyperspectral image. Such a pixel is represented as a pattern vector, and for each image band, it consists of a set of numerical measurements. Elements of the feature vector may also include other discriminative variable measurements based on pixel spatial relationships such as texture. Fig. 1 illustrates a simple scenario of a two-class separable classification problem in a two-dimensional input space. An important generalization aspect of SVMs is that frequently not all the available training examples are used in the description and specification of the separating hyperplane. The subset of points that lie on the margin (called support vectors) are the only ones that define the hyperplane of maximum margin.

Fig. 1. Linear support vector machine example (figure labels: SVM hyperplane, support vectors, margin width, misclassified instances).
Source: adapted from Burges (1998).

The implementation of a linear SVM assumes that the multi-spectral feature data are linearly separable in the input space. In practice, data points of different class memberships (clusters) overlap one another. This makes linear separability difficult as the basic linear decision boundaries are often not sufficient to classify patterns with high accuracy. Techniques and workarounds such as the soft margin method (Cortes and Vapnik, 1995) and the kernel trick are used to solve the inseparability problem by introducing additional variables (called slack variables) in SVM optimization and mapping (using a suitable mathematical function) the non-linear correlations into a higher-dimensional (Euclidean or Hilbert) space, respectively. A kernel function typically needs to fulfill Mercer's Theorem in order to be a valid kernel in SVMs (Scholkopf and Smola, 2001). The choice of a kernel function often has a bearing on the results of analysis. Furthermore, typical remote sensing problems usually involve identification of multiple classes (more than two). Adjustments are made to the simple SVM binary classifier to operate as a multi-class classifier using methods such as one-against-all, one-against-others, and directed acyclic graph (Knerr et al., 1990).

SVMs are particularly appealing in the remote sensing field due to their ability to successfully handle small training data sets, often producing higher classification accuracy than the traditional methods (Mantero et al., 2005). The underlying principle that benefits SVMs is the learning process that follows what is known as structural risk minimization. Under this scheme, SVMs minimize classification error on unseen data without prior assumptions made on the probability distribution of the data. Statistical techniques such as maximum likelihood estimation usually assume that data distribution is known a priori. Burges (1998) in a well-organized SVM tutorial described a simple experiment to illustrate an advantage of SVMs in an image recognition problem. In that demonstration, the performance of a basic multi-way SVM-based recognizer was assessed on image classification in the presence of prior knowledge. The accuracy turned out to be approximately the same if the pixels were first shuffled, with each image instance suffering the same random permutation. Yet, when the act of 'vandalism' (or removal of prior knowledge) took place, SVM still outperformed even the best neural networks. This discovery is particularly appealing in remote sensing applications since data acquired from remotely sensed imagery usually have unknown distributions, and methods such as Maximum Likelihood Estimation (MLE) that assume a multivariate normal data model do not necessarily match that assumption. Even if the data, whose dimensionality is assumed to match the number of spectral bands, were normally distributed, the assumption that the distribution can be described using a bell-shaped (Gaussian) function ceases to be sound, since the concentration of data in higher dimensional space tends to be in the tails (Fauvel et al., 2009). This phenomenon will continue to be encountered in remote sensing as new sensors increase spectral resolution and therefore data dimensionality.

There is also another interesting concept that serves as a key attraction to SVMs. Commonly described by many authors under the notion of overfitting (Montgomery and Peck, 1992), yet variously referred to by others as bias-variance tradeoff (Geman et al., 1992) or capacity control (Guyon et al., 1992), SVM-based classification has been known to strike the right balance between accuracy attained on a given finite amount of training patterns and the ability to generalize to unseen data.

Alongside the benefits derived from the SVM formulation there are also several challenges. The major setback concerning the applicability of SVMs is the choice of kernels. Although many options are available, some of the kernel functions may not provide optimal SVM configuration for remote sensing applications. Empirical evidence indicates that kernels such as radial basis function and polynomial kernels applied on SVM-based classification of satellite image data produce different results (Zhu and Blumberg, 2002). A good explanation of SVM kernels and their functionality is presented in numerous papers (e.g., Kavzoglu and Colkesen, 2009). From the non-expert user point of view, SVM theory is a bit intimidating, particularly due to the fact that the more efficient SVM variants often incorporate some difficult to understand concepts. This limits effective cross-disciplinary applications of SVMs.

Numerous SVM tutorials are available (such as Cortes and Vapnik (1995) and Burges (1998)), but none of these contains an exhaustive discussion on the increasing number of newly proposed variants of SVMs. In the remote sensing field a good starting point would be a textbook by Tso and Mather (2009) that provides a review of the entire field of classification methods for remotely sensed data, including SVMs. For those interested in rule extraction from SVMs a recent computer science review is available (Barakat
and Bradley, in press). Chen and Ho (2008) provide an excellent general reference for statistical learning in remote sensing. It should be noted that for this review the term SVM is inclusive of the traditional SVM method as well as SVM-based variants, since most of the latter still heavily rely on the standard SVM method.

3. Brief overview

Support vector machines (SVMs) have recently found numerous applications in remote sensing. For this review we identified 108 relevant papers, with more than half published in the last 2.5 years (Fig. 2). This trend is expected to continue, making this a critical time for a review of existing work.

Fig. 2. Growth of SVMs popularity in remote sensing over the past decade.

The SVM papers included a wide range of remote sensing application domains and sensors. A summary of this diverse group is presented in Fig. 3. Satellite sensors are preferred, especially multispectral ones. There is some limited interest in change detection (10% of the papers), a pattern that is expected to significantly increase as the Landsat archive is now freely available. There is an almost equal split between high and medium resolution sensors, mostly related to a strong preference for Ikonos and Landsat imagery, but also for high resolution airborne sensors.

Fig. 3. Summary statistics of selected works.

4. SVM works focusing on algorithmic advancements

This section summarizes SVM advancements that were achieved during the past decade. Papers that merely contrasted SVM performance with other methods or papers incorporating SVMs for a specific application are discussed in the next section.

4.1. Classification

SVMs are typically a supervised classifier, which requires training samples. Literature shows that SVMs are relatively insensitive to training sample size and scientists have improved SVMs to successfully work with limited quantity and quality of training samples. For example, Foody and Mathur (2004b) showed that only a quarter of the original training samples acquired from SPOT HRV satellite imagery was sufficient to produce an equally high accuracy for a two-crop classifier. Mantero et al. (2005) estimated probability density of thematic classes using an SVM. The SVM-based approach used a recursive procedure to generate prior probability estimates for known and unknown classes by adapting the Bayesian minimum-error decision rule. The approach was tested using synthetic data and data from two optical sensors (i.e., Daedalus ATM and Landsat TM) and confirmed method effectiveness, especially when the availability of ground reference data was limited. Transductive inference learning theory was incorporated into an SVM for remote sensing classification in Bruzzone et al. (2006). Their SVM-based approach defines the separating hyperplane according to a process that integrates the unlabeled samples together with the training samples. Experiments showed that the proposed method was effective, particularly for a set of remote sensing classification problems that were ill-posed due to the limited training samples. Foody and Mathur (2006) proposed a focus on mixed pixel training samples over more tedious, conventional pure pixel acquisition, assuming an SVM classifier. The analysis of a three-waveband multispectral SPOT HRV image showed the benefits of mixed pixel sampling on a crop type classification task. Foody et al. (2006) evaluated four dataset reduction methods for a one-class problem
(cotton vs. others) using SVMs and LISS-III data and found that significant data reduction was feasible (∼90%) with minimal information loss.

Sahoo et al. (2007) investigated the incorporation of localized, highly sensitive transformations to capture subtle changes in hyperspectral signatures. They compared classifiers using the so-called S-transform to classifiers without it and found encouraging results. The implementation algorithm was an SVM that showed additional robustness to small data samples in a geological classification. Blanzieri and Melgani (2008) investigated a local k-nearest neighbor adaptation to formulate localized variants of SVM approaches. Their results indicated substantial improvements, especially with the integration of non-linear kernel functions. Tuia and Camps-Valls (2009) addressed the issue of kernel predetermination by proposing a regularization method that identifies kernel structure through analysis of unlabeled samples. Camps-Valls et al. (2010) proposed an improved methodology for assessing kernel independence in various imagery types using the Hilbert–Schmidt independence criterion. Marconcini et al. (2009) discussed the incorporation of spatial information through composite kernels, finding substantial improvements, however with an additional computational cost. Camps-Valls et al. (2008) proposed a methodological framework using composite kernels for multi-temporal classification of remote sensing data from different sources. Tests using both synthetic and real optical Landsat TM data found that the cross-information composite kernel was the best in general, although a simple summation kernel showed similar performance. Composite kernels that take advantage of the properties of Mercer's kernels were further discussed in their prior work (Camps-Valls et al., 2006c). Chi et al. (2008) proposed a method, called primal SVM, that is capable of differentiating land covers using a reasonably small amount of training examples. Their method sought to replace the regularization-based approach previously employed in SVMs. The primal SVM formulation makes it possible to optimize directly on the primal representation, and therefore limits the number of samples. Evaluation was performed using Hyperion imagery of the Okavango Delta (in Botswana) for vegetation classification. Primal SVM yielded accuracy values competitive with state-of-the-art alternative algorithms trained on larger datasets. Gómez-Chova et al. (2008) investigated the addition of a regularization term on the geometry of both labeled and unlabeled samples that was based on the graph Laplacian, leading to a Laplacian SVM variant. This semisupervised classification method offers improvements when compared with traditional SVMs, especially in small training datasets and underlying complex problems. Castillo et al. (2008) proposed a modified version of the SVM classifier, called bootstrapped SVM. The training strategy adopted in the bootstrapped SVM is such that an incorrectly classified training sample in a given learning step is removed from the training pool, re-assigned a correct label, and re-introduced into the training set in the subsequent training cycles. The key result was the ability to capture data variability in a highly biased binary dataset: only 0.05% of the total number of training pixels were needed to achieve about the same accuracy level as the standard SVM. An interesting SVM adaptation was proposed by Wang and Jia (2009), where the space between support vectors is considered to provide a soft classification in addition to the traditional hard classification. Demir and Ertürk (2009) offered an improvement to hyperspectral SVM classifiers by incorporating border training samples in a two-step classification process. Song et al. (2005) proposed an SVM adaptation for Landsat-based vegetation monitoring. The SVM ν parameter was tackled through an integration of one- and two-class SVM sequential classification steps.

Mathur and Foody (2008b) investigated methods for efficient reduction of field data. They concluded that for cropland mapping equivalent classification results can be obtained with a third of the original dataset assuming SVM methods are used for the classification process. At the 24 m ground pixel size acquired by the LISS-III sensor the reduced dataset yielded a small 1.34% accuracy loss at 90.66%.

Integration of a genetic algorithm (GA) and SVM for remote sensing classification was evaluated with a limited availability of training samples in Ghoggali et al. (2009). The experimental results revealed again an ability to improve classification accuracy with a small training sample size. However, the computational load was significant, mainly due to the slow GA convergence. Ghoggali and Melgani (2008) integrated genetic training into SVM classification in order to incorporate land cover transition rules in multi-temporal classification. The results indicated a mixed performance; however, the algorithmic flexibility and humanly intuitive process suggest promising future work. Bruzzone and Persello (2009) proposed a novel context-sensitive semi-supervised SVM classification model, which can be successfully utilized when some of the training data are not reliable. Their model explores the contextual information of the neighboring pixels of each training sample and improves the unreliable training data. They tested their model using Ikonos and Landsat TM data and compared the results with those based on some of the widely used classification algorithms such as the standard SVM, a progressive semi-supervised SVM, maximum likelihood and k-nearest neighbor. The proposed SVM algorithm outperformed the other classification models in terms of robustness and effectiveness, particularly when non-fully reliable training samples were used. Huang and Zhang (2010) compared multi-SVM methods with traditional vector stacking techniques on high resolution urban mapping.

Su (2009) investigated training data reduction using a hierarchical clustering analysis and Multiangle Imaging SpectroRadiometer (MISR) satellite data (250 m–1.1 km, 17 products) on a vegetation classification problem. It was shown that a two-thirds reduction of the dataset size was possible without significant accuracy degradation in SVM and maximum likelihood classifier (MLC) methods. Gómez-Chova et al. (2010) proposed a method to increase classification reliability and accuracy by combining labeled and unlabeled pixels using clustering and the mean map kernel. They tested their approach to classify clouds using Envisat's Medium Resolution Imaging Spectrometer (MERIS) data. They found that their method was particularly successful when sample selection bias (i.e., training and test data follow different distributions) exists.

Selecting an optimum SVM method for remote sensing classification is not an easy task. Foody and Mathur (2004a) proposed a single multiclass SVM classification method while typical multiclass SVMs are based mainly on the use of multiple binary analyses. They compared their approach with other classification methods such as discriminant analysis, decision trees, and neural networks, and found that the SVM-based approach outperformed the other methods with different sizes of training samples. Bazi and Melgani (2006) investigated the most appropriate feature subspace and model selection based on a genetic optimization framework using three feature selection methods including steepest ascent, recursive feature elimination technique, and the radius margin bound minimization method. They used two criteria, the simple support vector count and the radius margin bound, to identify an optimum SVM-based classification system for hyperspectral remote sensing data. The genetically optimized SVM using the support vector count as a criterion resulted in the best performance for both simulated and real-world AVIRIS hyperspectral data. Mathur and Foody (2008a) evaluated the performance of SVMs in non-binary classification tasks. Their results indicated their proposed one-shot SVM classifier outperformed the binary-based multiple classifiers in terms of obtained accuracy but also initial parameterization.

SVMs have also been used for feature selection. Pal (2006) investigated methods for feature selection based on SVMs. Citing the
unreasonably large computational requirements as a major dis- for vegetation mapping using hyperspectral imagery. Obtained re-
advantage of exhaustive search methods in practical applications, sults indicated a significantly lower usage of classification vec-
the researchers justified the use of a non-exhaustive search pro- tors, however a lower accuracy rate was obtained compared to
cedure in selecting features with high discriminating power from the typical SVM approach. The authors suggest their method as a
large search spaces. SVM-based methods combined with GA were viable solution for real time classification due to the highly effi-
compared with the random forest feature selection method in land cient simulation times. Tuia et al. (2009) combined morphological
cover classification problems with hyperspectral data and small filters and SVMs to conduct land use classification using high
benefits were identified. Zhang and Ma (2009) addressed the is- spatial resolution QuickBird panchromatic images. They tested
sue of feature selection in SVM approaches. They implemented a multiple morphology-based features and found that simple mor-
modified recursive SVM approach to classify hyperspectral AVIRIS phological features generated with opening and closing operators
data. The reduced dimensionality returned slightly better results, resulted in the best performance. Muñoz-Marí et al. (2007) pro-
however their method has higher computational demands com- vided an interesting comparison of available one-class classifiers
pared with others. On the same subject Archibald and Fann (2007) for both single and multiple class remote sensing problems. They
provided an interesting integration of feature selection within the also investigated a one-class classifier called support vector do-
SVM classification approach. They achieved comparable accuracy main description (SVDD) that is particularly attractive in the pres-
while significantly reducing the computational load. ence of incomplete training data. Tan et al. (2007) proposed a new
Some studies improved the performance of SVM-based classi- technique combining entropy decomposition and SVM for classifi-
fication through algorithm and/or data fusion. Zhang et al. (2006)
cation. The approach was tested using multi-temporal SAR images
proposed a pixel shape index describing the contextual informa-
for rice monitoring. Their approach was especially useful when re-
tion of nearby pixels and evaluated its usability for land cover clas-
trieving polarimetric information for each class resulting in good
sification using QuickBird data based on SVMs. The pixel shape
separation between classes. Tarabalka et al. (2009) proposed a new
indices were combined with transformed spectral bands such as
classification scheme emphasizing both spectral and spatial char-
principal component analysis or independent component analysis.
acteristics of hyperspectral images. Their method combined the
They found that integration of spectral and shape features as well
pixel-wise SVM classification results and the segmentation map
as the transformed spectral components in an SVM were able to
based on partitional clustering using the majority voting strat-
improve classification accuracy. Waske and Benediktsson (2007)
classified multi-sensor (SAR, Landsat TM, and SPOT) and multi- egy. The approach was specifically useful when large spatial struc-
temporal data through data fusion based on SVMs. Their method tures were included in data or when different classes had dissimilar
was based on the decision fusion of multiple SVMs that were in- spectral responses and a comparable number of pixels.
dividually trained on the different data sources. Their approach Although SVMs are typically employed for supervised classifica-
outperformed the other methods including maximum likelihood, tion tasks, they have also been used for unsupervised classification
decision trees, and a typical SVM. Mitra et al. (2004) proposed in combination with other techniques. For example, Bovolo et al.
an active learning-based approach to reduce the selected support (2008) combined an SVM and a selective Bayesian thresholding ap-
vectors. Their semi-supervised method gradually creates clusters proach for unsupervised change detection. They used a selective
based on interactive user input. Their method yielded better re- Bayesian thresholding to delineate pseudo-training samples and
sults than a typical SVM, however the authors caution on the al- conducted binary change detection (i.e., change vs. no change) us-
gorithm’s sensitivity to user-provided erroneous labeling. Zhang and Ma (2008) proposed an SVM variant, the Potential SVM, as an alternative for multispectral image classification. The Potential SVM is an attractive variant due to its ability to handle non-Mercer kernels and its mathematical formulation that addresses SVM scalability issues. Tests on very high (0.1 m) and medium (30 m) resolution indicated equal or better accuracy than the traditional SVM, while offering faster simulation times due to support vector reduction. A fusion approach to classification using extended morphological profiles was proposed in Fauvel et al. (2008). They evaluated the approach using high spatial/spectral resolution ROSIS data in urban areas based on SVM classification. Ensemble methods for multiple SVM integration were evaluated by Pal (2008). Two popular integration techniques, boosting (alternating observation weight) and bagging (alternating observations), were tested using Landsat ETM+ data for an agricultural classification. The findings suggest that an optimized ensemble method could lead to improved results, though further testing is suggested as others have found contradictory results. Chen et al. (2009) proposed an improved classification method by stacking multiple hierarchical SVM classifiers. The method also incorporates discrimination information of two feature spaces (i.e., magnitude and shape). Experiments showed that the method with the generalization ability and the use of multiple feature spaces was effective for hyperspectral image classification. Chen et al. (2008) investigated the integration of SVMs with pairwise decision trees on hyperspectral data. The one-against-one SVM adaptation provided similar results to their proposed method, which was attributed to the hierarchical structure of the decision tree-based method. Demir and Ertürk (2007) implemented a relevance vector machine (RVM) approach, which was originally proposed by Tipping (2000, 2001),

ing the samples based on an SVM approach. Their method outperformed the change vector analysis (CVA)-based method with the expectation-maximization algorithm, but required much longer computational time due to the model-selection strategy to identify an optimum structure of their model. Mukhopadhyay and Maulik (2009) integrated a multi-objective fuzzy clustering scheme with an SVM for unsupervised classification. Their method identified high-confidence points from certain clusters to train the SVM classifier. The method was tested on several satellite images (i.e., SPOT, Landsat TM, and IRS), and the authors concluded that it was more effective than other methods such as neural networks, k-nearest neighbor, and fuzzy c-means.

4.2. Regression

In addition to classification tasks, SVMs have been advanced to solve regression problems, where in essence a continuous prediction output is expected. A multiple estimator system for biophysical parameter estimation from remote sensing data was proposed by Bruzzone and Melgani (2005). They particularly focused on incorporation of SVMs into the system and combination with multilayer perceptron (MLP) neural networks. They simulated different operational conditions with SVM and MLP and pointed out that their system increased the robustness of the estimation process; it provided accuracy very close to that of the best estimator included in the ensemble based on the experiment of chlorophyll concentration estimation using MERIS data. Camps-Valls et al. (2006a) investigated an RVM-based approach, a variant of SVMs, in order to lessen the uncertainty inherent in handling satellite-derived and in-situ measurements of oceanic chlorophyll concentration. RVM
incorporated prior knowledge of the problem and proved to be useful in quantifying chlorophyll concentrations based on ocean surface reflectance. The technique was reported to be less sensitive to parameter selection; it also provided considerably high accuracy despite sparsity in the solution space. The authors recommended use of RVMs in other applications that involve estimation of biophysical parameters using remotely sensed data because of their robustness even with small amounts of training data and low sensitivity to free parameter setting. Camps-Valls et al. (2006b) further investigated an SVM variant providing a more robust regression model for ocean chlorophyll concentration that proved successful, especially with limited training samples.

A synthetic algorithm of wavelets and SVMs was developed to predict evapotranspiration by Kaheil et al. (2008). A range of remote sensing-derived input variables such as MODIS LAI, MODIS emissivity, and spectral data of Landsat TM and ASTER were fed into the algorithm to produce a spatial distribution of evapotranspiration at the finest spatial resolution of the input data. Moser and Serpico (2009) proposed an automatic parameter optimization method for SVM regression of land and sea surface temperatures. They tested their method using AVHRR and MSG satellite images with synchronous in situ measurements and compared it with typical grid-search based optimization methods such as cross-validation and hold-out. The proposed method achieved accuracy similar to the other methods but was much more efficient, particularly when a high number of training samples was available.

Regression SVM has also been used for data generation and fusion. Zheng et al. (2008) proposed a multiscale mapped least-squares SVM (LS-SVM) to sharpen multispectral bands using a higher resolution panchromatic band. QuickBird data were used in the experiments and multiscale Gaussian radial basis function kernels were incorporated. Their method was compared with other fusion algorithms such as discrete wavelet transform, curvelet transform, à trous wavelet transform, and extended fast IHS, and they found that both their method and the à trous wavelet transform resulted in the best performance. Shi et al. (2009) also used an LS-SVM approach to generate a digital surface model (DSM) from Light Detection And Ranging (LiDAR) data. Assessed visually and quantitatively against the radial basis function (fastRBF) and triangulation technique, LS-SVM was found to be more effective in terms of noise reduction, computational efficiency and accuracy in DSM generation. This reformulation of the standard SVM based on regression models bears similarity with regularization networks and Gaussian processes. LS-SVM incorporates pixel neighborhood and topographic analyses. As such, basic principles of differential geometry play a key role in generation of gradient and curvature equations, and in other such related tasks. Readers interested in SVM-based regression models could find useful information in Smola and Schölkopf (2004).

5. Application-oriented SVM papers

This section presents papers where the incorporated SVMs were not highly customized; instead the focus was their evaluation under a given task. Where applicable, results from works that contrasted SVMs with other methods are mentioned.

5.1. Biophysical tasks

SVMs have been used in remote sensing-based estimation and monitoring of biophysical parameters such as chlorophyll concentration, gross primary product, and evapotranspiration. For example, Kwiatkowska and Fargion (2003) employed SVMs to cross-calibrate the global chlorophyll concentration from different satellite sensors (i.e., SeaWiFS and MODIS). The goal was to extrapolate this cross-calibration knowledge on to other data in a different spatio-temporal domain and to identify representative products for the global chlorophyll concentration. They revealed there were significant discrepancies between the different sensor products; there is a high dependency on sensor calibrations and operational characteristics. Bazi and Melgani (2007) estimated chlorophyll concentrations in coastal waters based on a particle swarm optimization and SVM techniques using MERIS and SeaBAM data. They found that their method was more effective than the typical SVM and less sensitive to training sample size. Sun et al. (2009) investigated in situ hyperspectral measurements to estimate chlorophyll concentration in Lake Taihu using SVMs. They first identified the best three-wavelength factor using an iterative optimization and used it as input to an SVM to estimate chlorophyll concentration. Their approach proved more accurate than the typical linear regression models.

Knudby et al. (2010) studied reef fish richness, diversity and biomass using Ikonos images and predictive modeling. SVMs were compared with five other methods and performed almost equally to the highest ranked ensemble algorithms. Clevers et al. (2007) tested an SVM-based band shaving technique to reduce dimensionality in hyperspectral datasets. The application domain was grassland biomass estimation and three bands were identified as sufficient for field studies. Durbha et al. (2007) assessed leaf area index extraction for Multiangle Imaging SpectroRadiometer (MISR) satellite data. They proposed an adjusted support vector regression method which included parameter regularization.

An SVM-based model was used to calculate global ocean primary productivity (Tang et al., 2008); it was found to be more accurate than the vertically generalized production model (VGPM) approach due to its ability to identify the nonlinear relationship between the ocean’s primary productivity and other parameters. The problem was particularly difficult because of the sparse nature of the data. SVMs performed better than the traditional VGPM, making them appealing for undersampled applications such as oceanographic studies. Yang et al. (2007) modeled continental gross primary product using MODIS and other sources; SVMs were the underlying algorithmic methodology.

Xie et al. (2008) implemented SVR to calculate the moisture transport in oceanic environments using MISR and they found SVR outperforming linear regression and backpropagation neural networks. Yang et al. (2006) estimated evapotranspiration by combining MODIS and AmeriFlux data using SVMs at the continental scale. SVMs outperformed neural networks and multiple regression.

5.2. Land cover land use tasks

5.2.1. Vegetation/agriculture

In an early work, Gualtieri and Cromp (1998) evaluated SVM performance on vegetation classification. Hyperspectral AVIRIS imagery was used and results suggested SVM superiority over prior classifiers developed on the same dataset. Keramitsoglou et al. (2006) focused on vegetation mapping using Ikonos imagery. They contrasted SVMs with Kernel-based spatial Re-Classification (KRC) and RBF neural networks and found that even though SVMs showed slightly less robustness in the classification results over the RBF, their training time was considerably lower, suggesting improved applicability. KRC also performed well but not as well as the SVM and RBF methods. Knorn et al. (2009) evaluated binary forest classification using SVMs in a spatial sequence of Landsat scenes. The major goal was to assess chain classification accuracy, which proved accurate even for lengthy sequences (e.g., six images) as long as image overlapping portions represented well the different features on the ground.

Huang et al. (2008b) performed SVM-based classification to assess, in terms of forest classification accuracy, the influence of the
slope/aspect of the terrain, solar elevation and azimuth and the relative position of the trees. A 3.6% gain in overall classification accuracy was realized after topographic correction. Lardeux et al. (2009) used SVMs to classify dense tropical vegetation with SAR data. SVMs resulted in about 20% higher classification accuracy than the Wishart classification approach. They pointed out that SVMs can perform much better than the typical Wishart approach when radar data do not follow the Wishart distribution. SVMs were also evaluated against decision tree classifiers in a study that involved mapping of dynamic semi-natural habitat systems, technically known as fenlands (Boyd et al., 2006). In this problem, SVMs were implemented as a binary classifier, to classify the data into fen and ‘other’, while the ‘other’ class contained samples of several ground features. The performance of SVMs was slightly higher than that of the decision tree classifier.

Looking into forest species classification, Dalponte et al. (2008) used SVMs and data fusion of hyperspectral (AISA) and LiDAR data. SVMs outperformed Gaussian maximum likelihood classification and k-NN technique. They pointed out that the incorporation of LiDAR variables generally improved the classification performance and the first return data was the highest contributing factor. Heikkinen et al. (2010) applied a simulated optical radiation model to evaluate the tree species classification using an airborne four band sensor system. They employed SVMs with different kernel functions and found that the four bands were not sufficient to get successful classification results; the Mahalanobis kernel provided the best accuracy. Dalponte et al. (2009), in their study on hyperspectral image acquisition and analysis, focused on the choice of spectral resolution and associated method of classification. The study goal was to classify complex forest scenarios. Simulated data (consisting of degraded band sizes from 4.6 to 36.8 nm) was used to analyze the role of spectral resolution on classification accuracy in an investigation to determine the trade-off between spectral and spatial resolution. SVM-based classification resulted in higher accuracies than all other classifiers across all spectral band simulations. The authors attributed this to the effectiveness of SVMs in managing the complex hyperspectral classification. Another interesting conclusion was that different classifiers exhibited variable behavior with respect to spectral resolution.

With a focus on forest degradation, Cao et al. (2009a) proposed a burn index using MODIS data. An SVM method was implemented as part of an iterative classifier targeting burn scar mapping. The results were accurate when compared with Landsat-derived reference data; however, the method is also constrained by hotspot identification accuracy and the presence of clouds. Liu et al. (2006) investigated the use of high resolution (GSD 1 m), four band aerial photography for forest disease monitoring. Their findings indicated that a spatial–temporal contextual approach improved the initial classification results obtained with an SVM method. Using images captured by Landsat TM/ETM+ between 1988 and 2007, Kuemmerle et al. (2009) applied an SVM classifier to detect illegal logging in the Ukrainian Carpathians. The classification problem focused on mapping forest cover change in the subregions. Although no comparative assessments were carried out, SVM proved a very useful method in delineating forest/non-forest cover maps for all the stated time periods. Another study by Huang et al. (2008a) used Landsat TM and Landsat ETM+ images with a focus on developing an automated solution to forest cover change detection. The classification took place using SVMs and an extensive evaluation over multiple sites indicated an accuracy of approximately 90%.

Su and Huang (2009) conducted a study in southern New Mexico to evaluate the effect of different linkage techniques on classification accuracy for semiarid vegetation mapping. Four different linkage techniques were tested to calculate distances between clusters in an attempt to create a hierarchical structure of the training dataset and reduce its size. Results indicated that a reduced dataset of approximately 20% of the original size could provide comparable classification accuracy. Su et al. (2009) discussed further the application of SVMs on MISR imagery to detect semi-arid vegetation areas, where SVMs performed significantly better than MLC.

Undertaking a crop classification task, Wilson et al. (2004) investigated salt marsh and crop plants that have been exposed to heavy metal or petroleum toxicity with control treatments using in situ spectroradiometer measurements. They used two classification methods, SVMs and logistic discrimination based on partial least squares compression, and found that the SVM-based method was superior. SVMs were also implemented for crop classification using HyMap hyperspectral imagery in Camps-Valls et al. (2004). SVMs outperformed typical neural networks in terms of accuracy, simplicity, and robustness. They also found that SVMs were not as sensitive to training sample size, and SVMs were able to successfully detect noisy bands. Hyperspectral image data of a cornfield, acquired through an airborne mission (Compact Airborne Spectrographic Imager), was used in conjunction with the SVM method in automatic detection of weeds and nitrogen in the field (Karimi et al., 2006). The discriminant features were based on the general remote sensing principle: corn exhibits different spectral responses depending on the type or method of weed control used and nitrogen application rates. Waske and van der Linden (2008) segmented multi-sensor data (SAR and TM) at multiple levels and pre-classified each individual level of segmentation using SVM for crop classification. The pre-classification results were then fused to create a final classification output with an SVM and random forests as decision rules. They pointed out that it was more appropriate to define the kernel functions for each data source and level separately; their multiple classifier system improved the performance compared to a single classifier approach since the individual errors of multiple data sources at different aggregation scales were diverse and uncorrelated.

5.2.2. Impervious surfaces/Urban areas

Huang and Zhang (2009) targeted road extraction from Ikonos imagery. The underlying idea was to integrate spectral and shape characteristics at multiple scales. In every scale an SVM method was implemented and later results from each scale were fused leading to improved centerline extraction. Another road extraction work using Ikonos imagery was published by Song and Civco (2004). SVM methods were developed to create a binary road layer, that was further processed with shape-assisted and vectorization procedures. The SVM method yielded a lower classification error compared to the Gaussian maximum likelihood approach, a finding the investigators attributed to the fact that the assumption that class signatures (feature groups) follow a normal distribution may not always be appropriate. Inglada (2007) implemented SVMs to classify man-made features (e.g., bridges, roads, roundabouts) from 2.5 m SPOT 5 imagery. Classifier robustness and resilience to variability in illumination and changes in spectral bands was achieved by incorporating invariant geometric features. The results were reasonable (∼80% accuracy) considering the complexity of the underlying problem. SVMs were also used to classify bridges from Ikonos high-resolution panchromatic image data. Luo et al. (2007) used a simple yet effective contextual idea: bridges, in general, are adjacent to water and water is usually darker than other objects. Additional steps followed from this assumption. Gauss Markov Random Field SVM, an SVM adjustment that incorporates texture properties, was used to enhance classification performance of the traditional SVM. Higher overall accuracy and kappa values were recorded.
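Overall accuracy and the kappa coefficient, the two agreement measures that recur throughout the studies summarized here, can be computed directly from paired reference and predicted labels. The following is a minimal sketch of the standard definitions (observed agreement, and Cohen's kappa as agreement corrected for chance); the function name and the toy labels are illustrative assumptions and are not taken from any of the reviewed papers.

```python
from collections import Counter


def overall_accuracy_and_kappa(reference, predicted):
    """Return (overall accuracy, Cohen's kappa) for two equal-length label lists."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(reference)

    # Observed agreement: fraction of samples where the map matches the reference.
    observed = sum(r == p for r, p in zip(reference, predicted)) / n

    # Chance agreement: sum over classes of the product of the marginal proportions.
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    expected = sum(ref_counts[c] * pred_counts[c] for c in ref_counts) / (n * n)

    if expected == 1.0:
        # Only one class occurs in both lists, so kappa is undefined.
        return observed, float("nan")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical six-sample check set with one water sample mislabeled as bridge.
reference = ["bridge", "bridge", "water", "water", "other", "other"]
predicted = ["bridge", "water", "water", "water", "other", "other"]
accuracy, kappa = overall_accuracy_and_kappa(reference, predicted)
print(accuracy, kappa)
```

For this toy set the overall accuracy is 5/6 and kappa is approximately 0.75, which illustrates why kappa is usually reported alongside accuracy: it discounts the agreement that the class proportions alone would produce.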
ASTER imagery was used as input to an SVM-facilitated processing technique in an investigation to determine SVM’s suitability for mapping urban areas (Zhu and Blumberg, 2002). Different results were obtained depending on image resolution and on SVM kernel choices, namely the polynomial kernel and radial basis function (RBF). The RBF-based SVM yielded higher performance with respect to convergence speed; better classification precision on the sample data was achieved using an SVM based on the polynomial kernel. Esch et al. (2009) combined single date Landsat images to derive useful information about industrial, residential and transportation-related areas in Germany. SVMs were found to be effective in automatic estimation of impervious surfaces. The authors conclude that automatic extraction of urban areas is a difficult problem owing to the wide range of surface materials, and the heterogeneity of the classes. Brown et al. (1999) evaluated SVMs with linear spectral mixture models (LSMM) for land use subpixel analysis. Their analysis on Landsat data for binary urban classification revealed that under certain circumstances the LSMM is identical to linear SVMs. Walton (2008) compared urban subpixel classification performance from random forests, rule-based regression and SVMs using a Landsat image. Results indicated that the rule-based regression using Cubist provided improved accuracy and training time. Watanachaturaporn et al. (2008) found that SVM methods outperformed backpropagation and radial basis function neural networks, maximum likelihood and decision trees. Imagery from India’s Linear Imaging Self-scanning Sensor (LISS) III was used (23.5 m pixel size, 4 bands) for an urban-driven classification. The effects of off-nadir collection and vegetation cover on urban classification were investigated using hyperspectral, 4 m GSD aerial images (Linden and Hostert, 2009). SVMs were employed for the classification process but were not compared with other methods as it was outside the scope of their study. Confronted with the common challenge of selecting an appropriate set of parameter values, Cao et al. (2009b) used SVMs to overcome the setbacks of empirical trial and error methods in extracting urban areas from available samples of Defense Meteorological Satellite Program Operational Linescan System (DMSP-OLS) and SPOT-derived NDVI data. The study employed Chinese city datasets (apparently because of the rapid urbanization of the study area) and the problem was reduced to a non-threshold binary classification. Being non-parametric, SVMs proved to be a better choice for constructing a region-growing algorithm that semi-automatically discriminated urban pixels from any other type of background data. The main attraction was the ability of SVMs to achieve higher accuracy using a small number of training samples.

Nemmour and Chibani (2006) studied urban change detection using Landsat scenes and found SVM methods outperformed backpropagation neural networks. Interestingly, they found that user-defined SVM parameters did not have a significant influence on the SVM superiority. Another urban change detection study used multi-source data from Landsat TM/ETM+, European Remote Sensing Satellite (ERS) 1 and 2, and Advanced Synthetic Aperture Radar (ASAR) onboard the Environmental Satellite (ENVI-SAT) to map urban footprints in 1990, 2000 and 2006 (Griffiths et al., 2010). The classification method used SVMs and the authors developed an SVM-based forward feature selection procedure to rank input variable contribution. Licciardi et al. (2009) presented the five awarded algorithms useful for the classification of high resolution hyperspectral data over urban areas at the 2008 Data Fusion Contest. They found that SVMs were extremely useful for classification of hyperspectral data and that decision fusion using multiple algorithms would be a promising direction for future research regarding remote sensing classification.

5.2.3. General land cover land use tasks

Starting with high resolution imagery, Li et al. (2010) proposed an SVM-based classifier using QuickBird data. A scene segmentation algorithm was integrated with the SVM object classifier leading to better performance. It is also noted that the SVM classifier is highly dependent on the segmentation process, a typical drawback of object-based classifiers. Linear support vector machines were reported to be useful in classification of hyperspectral remote sensing data whose elements had been extracted using a technique called kernel principal component analysis (KPCA) (Fauvel et al., 2009). Although only the basic SVM was employed in the set of experiments, the improved feature provided a significant clue on the effectiveness of SVMs especially when applied on reliably clean datasets. Warner and Nerry (2009) performed a study to determine the effectiveness of thermal infrared data in land cover classification. An SVM classifier turned out to be an effective method at handling the complex distributions of the heterogeneous land cover classes that characterized the study area (Strasbourg, France). In their conclusion, the authors suggest that the inclusion of a single broad thermal band increased classification accuracy by as much as 20% for simulated Ikonos bands and provided a 4% improvement when hyperspectral VNIR and SWIR data were used.

In yet another remote sensing application, Huang et al. (2008c) presented an algorithmic fusion methodology to improve processing of very high-resolution (VHR) satellite imagery using a wavelet transformation. In justification of their undertaking, the authors noted that VHR images are characterized by complex multi-scale spectral and spatial information, therefore rendering the traditional fixed, single-band, single-window approach less efficient. The more relevant multi-scale spectral-spatial features were classified using support vector machines. Mladinich (2010) assessed three commercial software packages for object-based binary classification (disturbed vs. non-disturbed areas) over high resolution imagery (1 m). The ENVI software package was one of the three tools, and it incorporated an SVM algorithm adjusted from the library for support vector machines (LIBSVM). Results across the three tools were comparable, with Definiens classification showing higher consistency. In a bid to compare SVMs against maximum likelihood and backpropagation artificial neural networks, Pal and Mather (2005) experimented on Landsat 7 ETM+ and hyperspectral data. Results suggested SVM superiority as input dimensionality increased and as dataset size decreased.

Moving towards medium resolution imagery (15–30 m pixel size), in one of the earlier investigations Huang et al. (2002) provided an accuracy evaluation of SVMs versus three other classifiers, namely an MLC, a three-layer (input, hidden and output) backpropagation neural network classifier (NN), and a decision tree classifier (DTC). Variations of SVM classification results with different kernel configurations were also compared. The results showed that SVMs had the highest accuracy, followed by DTC and then MLC. The authors attributed the high classification accuracy of SVMs to their ability to locate an optimal separating hyperplane. It was also stated that while the SVM performance was influenced by choice of parameter sets, the results of NN and DTC classification, too, were affected by the classifier configurations. For example, NN behavior is affected by the network’s structure and random initializations and DTC is affected by the degree of pruning. Dixon and Candade (2008) performed an algorithmic comparison between an MLC, a backpropagation NN and an SVM-based classifier. A statistical assessment on a Landsat scene showed clear deficiencies for the MLC method; however, results were similar for NN and SVM classifiers. The authors noted the training speed as an advantage for the SVM method, while admitting that the relatively low dimensionality did not allow them to fully explore their investigation.
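The kernel configurations compared in several of these studies (linear, polynomial and radial basis function) differ only in how the similarity between two pixel vectors is computed. The sketch below writes out the three textbook kernel functions in plain Python; the parameter names `degree`, `gamma` and `coef0` and their default values are assumptions chosen for illustration, not settings reported by the reviewed authors.

```python
import math


def linear_kernel(x, y):
    """Inner product of two feature vectors (e.g. band values of two pixels)."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=2, gamma=1.0, coef0=1.0):
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree."""
    return (gamma * linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two hypothetical two-band pixel vectors.
pixel_a = [1.0, 2.0]
pixel_b = [3.0, 4.0]
print(linear_kernel(pixel_a, pixel_b))
print(polynomial_kernel(pixel_a, pixel_b))
print(rbf_kernel(pixel_a, pixel_b))
```

The RBF value always lies in (0, 1] and depends only on the distance between the two vectors, while the polynomial value grows with the inner product. This is one reason the user-chosen kernel parameters can affect classification results as strongly as the studies above report.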
Another study to compare SVMs and neural network classifiers using Landsat imagery was undertaken by Candade (2004). A major conclusion drawn was that a small number of training samples is sufficient to find the support vectors for near-optimal SVM learning. In the land use application domain using a Landsat scene, the study uncovered that SVM performed better than the backpropagation neural network not only in terms of classification accuracy but also when training times were compared. Three different SVM kernels (the polynomial, radial basis function and linear kernels) were analyzed for their performance. Overfitting and local minima were cited as the underlying cause of the relatively poor performance of neural networks on small training samples. Classification of Landsat 5 TM imagery was assessed from Tenerife (Canary Islands). This is an inherently difficult practical problem for ground truth data collection posed by the complex topographic relief. Keuchel et al. (2003) compared the classification accuracy of SVMs, maximum likelihood and iterated conditional modes (ICM). SVM and ICM methods outperformed maximum likelihood; however, the authors suggest caution should be exercised in parameter (SVM) and iteration number (ICM) assignments. Kavzoglu and Colkesen (2009) contrasted SVMs with radial basis and polynomial functions and MLC using Landsat ETM+ and ASTER imagery. In both image types the superiority of SVMs was underlined.

Melgani and Bruzzone (2004) classified AVIRIS hyperspectral data using SVMs and compared the results with those using radial basis function neural networks and the k-nearest neighbor classifier. They found SVMs outperformed the other methods and concluded SVMs are a valid and effective alternative to conventional pattern recognition approaches to hyperspectral remote sensing data. In a study involving land cover update analysis conducted by Marcal et al. (2005), Advanced Space-borne Thermal Emission and Reflectance Radiometer (ASTER) imagery from Vale de Sousa region (northwest of Portugal) was used to compare the effectiveness of various classification methods including SVMs, k-nearest neighbor (k-NN), logistic discrimination (LD) and training data-driven fuzzy classifiers. The SVM and LD classifiers produced higher overall accuracy than k-NN and the fuzzy classifiers. Kumar et al. (2007), in recognition of the fact that the proportion of mixed pixels in remote sensing images increases as spatial resolution decreases, proposed a method to deal with data fuzziness. The approach, called the full fuzzy method, was tested on a land cover mapping problem in India using the LISS-III sensor. The full fuzzy scheme involved SVM-based sub-pixel analysis at all three different stages. Performance variation with different distance metrics (e.g., Mahalanobis and Euclidean norm) was investigated. SVMs with the Euclidean norm gave the highest accuracy, outperforming a corresponding variant of k-means clustering algorithm.

At coarser spatial resolutions a study was undertaken to evaluate the discriminatory power of two vegetation indices (the global vegetation index and terrestrial chlorophyll index) obtained from MERIS for general land cover mapping (Dash et al., 2007). Although a moderate level of accuracy was achieved using a discriminant analysis method, a repetition of the experiment using an SVM technique revealed that the latter methodology resulted in a 6% gain in overall accuracy. Carrão et al. (2008) investigated the incorporation of multi-temporal MODIS data for a general 500 m LCLU classification with an SVM method as the underlying classifier.

5.3. Other tasks

Remote sensing data from the Himalayas (Nepal) were used to study soil erosion processes in tectonically active orogens (Andermann and Gloaguen, 2009). This research employed SVMs to provide a classification into land use, erosion and geomorphological processes. Although the maximum likelihood classifier yielded higher classification accuracy, the authors point out that SVM results could be improved by selecting suitable values of the user-chosen SVM parameters. Zebedin et al. (2006) implemented SVM methods as part of a complex three dimensional reconstruction task using aerial, multi-spectral, high resolution data. Using the freely available NOAA/AVHRR satellite image data, Gautam et al. (2008) tackled the problem of creating an automatic detector of coal field fire spots. The SVM method was used successfully to further refine detection results by removing points falsely highlighted by the threshold-based methodology in the regions deemed suspect. Rock glacier detection was undertaken using Landsat and SRTM terrain data (Brenning, 2009). Eleven different classifiers were tested and the SVM-based method did not rank highly when compared with the other methods.

SVMs have also been used for pure pixels (endmembers) identification. Brown et al. (2000) compared a linear SVM with linear spectral mixture models to identify pure pixels using Landsat TM data. They found that the SVM framework is more appropriate for nonlinear and/or empirical mixture modeling because SVMs can handle spectral confusion of pure pixels appropriately. Filippi and Archibald (2009) investigated SVMs to extract endmembers from hyperspectral data and pointed out that SVM-based endmember extraction has advantages in terms of efficiency and accuracy and is not sensitive to noise.

An integrative approach to information mining from large image datasets was proposed by Li and Narayanan (2004) based on SVMs. This framework was aimed at enabling users to make complex queries that would extend information search criteria beyond image metadata and actually access image content, a process called content-based image retrieval. The proposed system architecture provided three components, namely, the image processing module, database module and graphical user interface. The backend image processing intelligence incorporated an SVM method to facilitate land cover mapping from a set of Landsat TM images in eastern Nebraska. An identified challenge was the integration methodology that would yield optimal results. Melgani (2006) proposed two methods to reconstruct cloud-contaminated remote sensing data using a sequence of multi-temporal multispectral images. The first method was based on the expectation-maximization algorithm to implement the contextual prediction process. The second method used a single non-linear predictor based on SVMs. Both methods outperformed the other methods based on compositing algorithms for cloud removal, and the first method was slightly better than the second method. Finally, Mazzoni et al. (2007) described one of the few SVM-based operational remote sensing classifiers using MISR imagery. There is also an interesting reference to an SVM-based classifier running onboard NASA’s EO-1 spacecraft, as part of the Autonomous Sciencecraft Experiment (Mazzoni et al., 2005a,b) to automatically detect degraded images (e.g., from clouds) and avoid further processing and transmission on the satellite’s platform.

SVMs have also been used in landmine detection. Potin et al. (2006) utilized ground-penetrating radar data (GPSARs) to detect buried landmines. They developed an abrupt change detection algorithm based on SVM, which was effective in reducing the clutter noise to improve the landmine detection. Jin and Zhou (2007) introduced a fuzzy hypersphere SVM (FHSSVM) based on the reduced features using the sequential forward floating selection method. They tested the FHSSVM for detecting landmines using rail GPSAR and found their method significantly improved the performance of landmine detection in different scenarios.

6. Discussion and concluding remarks

This review discussed important contributions of SVM-based works in remote sensing. In order to summarize efficiently the

content we resorted to a textual summary of frequently appearing single terms in this document. Fig. 4 displays such a visual representation, where higher frequency results in a larger font size. Looking beyond expected terms such as SVM, classification and remote sensing we also see the trends of recently published works (2008, 2009). In addition, Landsat is the prevailing sensor, while forest and land use applications show a significant presence. From the algorithmic perspective there is a significant discussion on the kernel and feature selection and their consequences for accuracy. Even though the focus is on classification tasks, there are regression applications worth mentioning. Finally, the majority of comparative methods are neural networks, followed by maximum likelihood and decision trees.

Fig. 4. Textual summary of this review.

Most of the findings show that there is empirical evidence to support the theoretical formulation and motivation behind SVMs. The most important characteristic is SVM's ability to generalize well from a limited amount and/or quality of training data. Compared to alternative methods such as backpropagation neural networks, SVMs can yield comparable accuracy using a much smaller training sample size. This is in line with the "support vector" concept that relies only on a few data points to define the classifier's hyperplane. This property has been exploited and has proved to be very useful in many of the applications we have seen thus far, mainly because acquisition of ground truth for remote sensing data is generally an expensive process.
SVMs offer additional benefits in contrast to alternative classification models, such as neural networks. They are resilient to getting trapped in local minima because of the convexity of the cost function, which enables the classifier to consistently identify the optimal solution. In other words, SVM training is a convex quadratic optimization problem, so it always reaches the global minimum. An added advantage is that there is no need for repeating classifier training using different random initializations or architectures. Furthermore, being non-parametric, SVMs do not assume a known statistical distribution of the data to be classified. This is particularly useful because the data acquired from remotely sensed imagery usually have unknown distributions. This allows SVMs to outperform techniques based on maximum likelihood classification because normality is not always a correct assumption for the actual pixel distribution in each class (Su et al., 2009).
On the other hand, the majority of the studies uncovered common limitations to SVM methodologies, for example selection of SVM key parameters such as the kernel functions. To elaborate further, choosing a small value for the kernel width parameter (i.e. the kernel footprint in that multi-dimensional space) may lead to overfitting, while large kernel width values may lead to oversmoothing. This problem is not restricted to SVM methods; rather, it is a general drawback of kernel-based approaches (e.g., radial basis function neural networks). Choice of the parameter value (usually denoted by C), which controls the trade-off between maximizing the margin and minimizing the training error, is also an important consideration in SVM application. There exist no established heuristics for selection of these SVM parameters, which frequently leads to a trial-and-error approach. It has also been reported that the 'one-against-the-rest' strategy for SVM multi-class classification can be problematic as it may result in unclassified instances of data and therefore lower classification accuracies (Pal and Mather, 2005). Moreover, SVM approaches frequently map input data to higher dimensional spaces in order to discern patterns. As dimensionality increases, in addition to potential separability of patterns, SVMs exhibit typical dimensionality issues such as outlier behavior and increased computational demands. This is a critical drawback especially for hyperspectral analysis where the dimensionality of the original data is high and kernel mapping is more vulnerable to dimensionality problems.
Moreover, SVMs are not optimized to deal with the inherent problem of noisy data; outlier effects are commonly encountered in remotely sensed data. Measurement errors due to limited precision of image acquisition instruments, and atmospheric and topographic distortions are some of the causes of such impurities. The quality of both training and test patterns is important in construction (training) and evaluation of automatic classification, recognition and detection systems. The performance of an SVM classifier can dramatically decrease with a relatively small number of mislabeled examples. Perhaps more investigations into the potential of some of the relatively untapped lower level noise reduction techniques such as morphological image processing could provide a remedy to the problem of denoising. Citing one of the developments aimed at addressing this problem, Huang et al. (2008b) proposed a revised radiometric correction algorithm to counter the undesirable effects of atmospheric and topographic effect on data. Inglada (2007), supported by empirical evidence, similarly argued that a higher number of geometric image features enhances multi-way characterization of objects that naturally have many different geometric properties. Also, as pointed out by Dash et al. (2007), the choice of dataset source could help in remedying this hindrance by allowing the reduction in the size of the training set required.
There is significant room for extension of SVMs to address current pitfalls. For example, Foody (2008) assessed a relevance vector machine (RVM) approach as a way to address the need to define the parameter C. RVMs are considered as a Bayesian treatment alternative to SVMs and have several advantages including probabilistic predictions, automatic estimation of parameters, and the use of arbitrary kernel functions. The authors argued that the new method leads to reduced sensitivity to the hyperparameter settings, thereby making the use of non-Mercer kernels possible. Furthermore, RVMs allow for fuzzy (or sub-pixel) classification of data, making it possible to have a probabilistic output.
Typical SVM comparative assessment has not been as wide-reaching. Of particular interest would be comparison/fusion with algorithms such as self-organizing maps (Kohonen, 1997) that address efficiently high dimensionality problems and have already found fruitful ground in remote sensing (e.g., Hong et al., 2006; Goncalves et al., 2008). In addition, integration with methodologies that deal more naturally with multi-class problems without the SVM complexity may further advance SVM understanding, for example a learning vector quantization system (Schneider et al., 2009).
In a nutshell, we can conclude that SVM classifiers, characterized by self-adaptability, swift learning pace and limited requirements on training size, have proven a fairly reliable methodology in intelligent processing of data acquired through remote sensing. Past applications of the method on both real-world data and simulated environments have shown that SVMs exhibit superiority over most of the alternative algorithms, a big motivation and promise for future advances.
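The kernel-width trade-off noted in this discussion can be made concrete with a small numerical sketch. The snippet below is only an illustration under stated assumptions: it uses the common Gaussian (RBF) form k(x, y) = exp(-||x - y||^2 / (2 w^2)), and the two-band pixel values, the function names and the chosen widths are hypothetical rather than taken from any of the reviewed studies. With a very small width every training sample is similar only to itself, so a classifier can memorize the samples (overfitting); with a very large width all samples look alike (oversmoothing).

```python
import math


def rbf_kernel(x, y, width):
    """Gaussian (RBF) kernel: exp(-||x - y||^2 / (2 * width^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def kernel_matrix(samples, width):
    """Pairwise kernel values between all samples."""
    return [[rbf_kernel(a, b, width) for b in samples] for a in samples]


def mean_off_diagonal(matrix):
    """Average similarity between distinct samples."""
    n = len(matrix)
    total = sum(matrix[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))


# Hypothetical two-band reflectance values, for illustration only.
samples = [(0.10, 0.20), (0.15, 0.25), (0.60, 0.70), (0.65, 0.80)]

# Three assumed kernel widths: very small, moderate, very large.
for width in (0.01, 0.2, 10.0):
    k = kernel_matrix(samples, width)
    print(f"width={width}: mean similarity between distinct samples = "
          f"{mean_off_diagonal(k):.4f}")
```

In practice the width would be tuned together with C, for instance by cross-validated search over a grid of candidate values, which is the trial-and-error approach the text refers to.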

Acknowledgements

Support was provided by the National Science Foundation (award GRS-0648393), by the National Aeronautics and Space Administration (awards NNX08AR11G, NNX09AK16G) and by the Syracuse Center of Excellence CARTI Program.

References

Andermann, C., Gloaguen, R., 2009. Estimation of erosion in tectonically active orogenies. Example from the Bhotekoshi catchment, Himalaya (Nepal). International Journal of Remote Sensing 30 (12), 3075–3096.
Archibald, R., Fann, G., 2007. Feature selection and classification of hyperspectral images with support vector machines. IEEE Geoscience and Remote Sensing Letters 4 (4), 674–677.
Barakat, N., Bradley, A.P., Rule extraction from support vector machines: a review. Neurocomputing (in press). doi:10.1016/j.neucom.2010.02.016.
Bazi, Y., Melgani, F., 2006. Toward an optimal SVM classification system for hyperspectral remote sensing images. IEEE Transactions on Geoscience and Remote Sensing 44 (11), 3374–3385.
Bazi, Y., Melgani, F., 2007. Semisupervised PSO-SVM regression for biophysical parameter estimation. IEEE Transactions on Geoscience and Remote Sensing 45 (6), 1887–1895.
Blanzieri, E., Melgani, F., 2008. Nearest neighbor classification of remote sensing images with the maximal margin principle. IEEE Transactions on Geoscience and Remote Sensing 46 (6), 1804–1811.
Bovolo, F., Bruzzone, L., Marconcini, M., 2008. A novel approach to unsupervised change detection based on a semisupervised SVM and a similarity measure. IEEE Transactions on Geoscience and Remote Sensing 46 (7), 2070–2082.
Boyd, D.S., Sanchez-Hernandez, C., Foody, G.M., 2006. Mapping a specific class for priority habitats monitoring from satellite sensor data. International Journal of Remote Sensing 27 (13), 2631–2644.
Brenning, A., 2009. Benchmarking classifiers to optimally integrate terrain analysis and multispectral remote sensing in automatic rock glacier detection. Remote Sensing of Environment 113 (1), 239–247.
Brown, M., Gunn, S.R., Lewis, H.G., 1999. Support vector machines for optimal classification and spectral unmixing. Ecological Modelling 120 (2–3), 167–179.
Brown, M., Lewis, H.G., Gunn, S.R., 2000. Linear spectral mixture models and support vector machines for remote sensing. IEEE Transactions on Geoscience and Remote Sensing 38 (5), 2346–2360.
Bruzzone, L., Chi, M., Marconcini, M., 2006. A novel transductive SVM for semisupervised classification of remote-sensing images. IEEE Transactions on Geoscience and Remote Sensing 44 (11), 3363–3373.
Bruzzone, L., Melgani, F., 2005. Robust multiple estimator systems for the analysis of biophysical parameters from remotely sensed data. IEEE Transactions on Geoscience and Remote Sensing 43 (1), 159–174.
Bruzzone, L., Persello, C., 2009. A novel context-sensitive semisupervised SVM classifier robust to mislabeled training samples. IEEE Transactions on Geoscience and Remote Sensing 47 (7), 2142–2154.
Burges, C.J.C., 1998. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery 2 (2), 121–167.
Camps-Valls, G., Gomez-Chova, L., Calpe-Maravilla, J., Martin-Guerrero, J.D., Soria-Olivas, E., Alonso-Chorda, L., Moreno, J., 2004. Robust support vector method for hyperspectral data classification and knowledge discovery. IEEE Transactions on Geoscience and Remote Sensing 42 (7), 1530–1542.
Camps-Valls, G., Gómez-Chova, L., Muñoz-Marí, J., Vila-Francés, J., Amorós-López, J., Calpe-Maravilla, J., 2006a. Retrieval of oceanic chlorophyll concentration with relevance vector machines. Remote Sensing of Environment 105 (1), 23–33.
Camps-Valls, G., Bruzzone, L., Rojo-Alvarez, J.L., Melgani, F., 2006b. Robust support vector regression for biophysical variable estimation from remotely sensed images. IEEE Geoscience and Remote Sensing Letters 3 (3), 339–343.
Camps-Valls, G., Gomez-Chova, L., Munoz-Mari, J., Vila-Frances, J., Calpe-Maravilla, J., 2006c. Composite kernels for hyperspectral image classification. IEEE Geoscience and Remote Sensing Letters 3 (1), 93–97.
Camps-Valls, G., Gomez-Chova, L., Munoz-Mari, J., Rojo-Alvarez, J.L., Martinez-Ramon, M., 2008. Kernel-based framework for multitemporal and multisource remote sensing data classification and change detection. IEEE Transactions on Geoscience and Remote Sensing 46 (6), 1822–1835.
Camps-Valls, G., Mooij, J., Scholkopf, B., 2010. Remote sensing feature selection by kernel dependence measures. IEEE Geoscience and Remote Sensing Letters 7 (3), 587–591.
Candade, N., 2004. Multispectral classification of Landsat images: a comparison of support vector machine and neural network classifiers. ASPRS Annual Conference Proceedings, Denver, Colorado.
Cao, X., Chen, J., Matsushita, B., Imura, H., Wang, L., 2009a. An automatic method for burn scar mapping using support vector machines. International Journal of Remote Sensing 30 (3), 577–594.
Cao, X., Chen, J., Imura, H., Higashi, O., 2009b. A SVM-based method to extract urban areas from DMSP–OLS and SPOT VGT data. Remote Sensing of Environment 113 (10), 2205–2209.
Carrão, H., Gonçalves, P., Caetano, M., 2008. Contribution of multispectral and multitemporal information from MODIS images to land cover classification. Remote Sensing of Environment 112 (3), 986–997.
Castillo, C., Chollett, I., Klein, E., 2008. Enhanced duckweed detection using bootstrapped SVM classification on medium resolution RGB MODIS imagery. International Journal of Remote Sensing 29 (19), 5595–5604.
Chen, H., Ho, P., 2008. Statistical pattern recognition in remote sensing. Pattern Recognition 41 (9), 2731–2741.
Chen, J., Wang, C., Wang, R., 2008. Combining support vector machines with a pairwise decision tree. IEEE Geoscience and Remote Sensing Letters 5 (3), 409–413.
Chen, J., Wang, C., Wang, R., 2009. Using stacked generalization to combine SVMs in magnitude and shape feature spaces for classification of hyperspectral data. IEEE Transactions on Geoscience and Remote Sensing 47 (7), 2193–2205.
Chi, M., Feng, R., Bruzzone, L., 2008. Classification of hyperspectral remote-sensing data with primal SVM for small-sized training dataset problem. Advances in Space Research 41 (11), 1793–1799.
Clevers, J.G.P.W., van der Heijden, G.W.A.M., Verzakov, S., Schaepman, M.E., 2007. Estimating grassland biomass using SVM band shaving of hyperspectral data. Photogrammetric Engineering & Remote Sensing 73 (10), 1141–1148.
Cortes, C., Vapnik, V., 1995. Support-vector networks. Machine Learning 20 (3), 273–297.
Dalponte, M., Bruzzone, L., Gianelle, D., 2008. Fusion of hyperspectral and LIDAR remote sensing data for classification of complex forest areas. IEEE Transactions on Geoscience and Remote Sensing 46 (5), 1416–1427.
Dalponte, M., Bruzzone, L., Vescovo, L., Gianelle, D., 2009. The role of spectral resolution and classifier complexity in the analysis of hyperspectral images of forest areas. Remote Sensing of Environment 113 (11), 2345–2355.
Dash, J., Mathur, A., Foody, G.M., Curran, P.J., Chipman, J.W., Lillesand, T.M., 2007. Land cover classification using multi-temporal MERIS vegetation indices. International Journal of Remote Sensing 28 (6), 1137–1159.
Demir, B., Ertürk, S., 2007. Hyperspectral image classification using relevance vector machines. IEEE Geoscience and Remote Sensing Letters 4 (4), 586–590.
Demir, B., Erturk, S., 2009. Clustering-based extraction of border training patterns for accurate SVM classification of hyperspectral images. IEEE Geoscience and Remote Sensing Letters 6 (4), 840–844.
Dixon, B., Candade, N., 2008. Multispectral landuse classification using neural networks and support vector machines: one or the other, or both? International Journal of Remote Sensing 29 (4), 1185–1206.
Durbha, S.S., King, R.L., Younan, N.H., 2007. Support vector machines regression for retrieval of leaf area index from multiangle imaging spectroradiometer. Remote Sensing of Environment 107 (1–2), 348–361.
Esch, T., Himmler, V., Schorcht, G., Thiel, M., Wehrmann, T., Bachofer, F., Conrad, C., Schmidt, M., Dech, S., 2009. Large-area assessment of impervious surface based on integrated analysis of single-date Landsat-7 images and geospatial vector data. Remote Sensing of Environment 113 (8), 1678–1690.
Fauvel, M., Benediktsson, J.A., Chanussot, J., Sveinsson, J.R., 2008. Spectral and spatial classification of hyperspectral data using SVMs and morphological profiles. IEEE Transactions on Geoscience and Remote Sensing 46 (11), 3804–3814.
Fauvel, M., Chanussot, J., Benediktsson, J.A., 2009. Kernel principal component analysis for the classification of hyperspectral remote sensing data over urban areas. EURASIP Journal on Advances in Signal Processing, Article ID 783194.
Filippi, A.M., Archibald, R., 2009. Support vector machine-based endmember extraction. IEEE Transactions on Geoscience and Remote Sensing 47 (3), 771–791.
Foody, G.M., Mathur, A., 2004a. A relative evaluation of multiclass image classification by support vector machines. IEEE Transactions on Geoscience and Remote Sensing 42 (6), 1335–1343.
Foody, G.M., 2008. RVM-based multi-class classification of remotely sensed data. International Journal of Remote Sensing 29 (6), 1817–1823.
Foody, G.M., Mathur, A., 2004b. Toward intelligent training of supervised image classifications: directing training data acquisition for SVM classification. Remote Sensing of Environment 93 (1–2), 107–117.
Foody, G.M., Mathur, A., 2006. The use of small training sets containing mixed pixels for accurate hard image classification: training on mixed spectral responses for classification by a SVM. Remote Sensing of Environment 103 (2), 179–189.
Foody, G.M., Mathur, A., Sanchez-Hernandez, C., Boyd, D.S., 2006. Training set size requirements for the classification of a specific class. Remote Sensing of Environment 104 (1), 1–14.
Gautam, R.S., Singh, D., Mittal, A., Sajin, P., 2008. Application of SVM on satellite images to detect hotspots in Jharia coal field region of India. Advances in Space Research 41 (11), 1784–1792.
Geman, S., Bienenstock, E., Doursat, R., 1992. Neural networks and the bias/variance dilemma. Neural Computation 4 (1), 1–58.
Ghoggali, N., Melgani, F., 2008. Genetic SVM approach to semisupervised multitemporal classification. IEEE Geoscience and Remote Sensing Letters 5 (2), 212–216.
Ghoggali, N., Melgani, F., Bazi, Y., 2009. A multiobjective genetic SVM approach for classification problems with limited training samples. IEEE Transactions on Geoscience and Remote Sensing 47 (6), 1707–1718.
Gomez-Chova, L., Camps-Valls, G., Bruzzone, L., Calpe-Maravilla, J., 2010. Mean map kernel methods for semisupervised cloud classification. IEEE Transactions on Geoscience and Remote Sensing 48 (1), 207–220.
Gómez-Chova, L., Camps-Valls, G., Muñoz-Marí, J., Calpe, J., 2008. Semisupervised image classification with Laplacian support vector machines. IEEE Geoscience and Remote Sensing Letters 5 (3), 336–340.
Goncalves, M.L., Netto, M.L.A., Costa, J.A.F., Zullo Júnior, J., 2008. An unsupervised method of classifying remotely sensed images using Kohonen self-organizing maps and agglomerative hierarchical clustering methods. International Journal of Remote Sensing 29 (11), 3171–3207.

Griffiths, P., Hostert, P., Gruebner, O., Linden, S., 2010. Mapping megacity growth with multi-sensor data. Remote Sensing of Environment 114 (2), 426–439.
Gualtieri, J.A., Cromp, R.F., 1998. Support vector machines for hyperspectral remote sensing classification. In: Proceedings of the 27th AIPR Workshop: Advances in Computer Assisted Recognition, Washington, DC, 27 October. SPIE, Washington, DC, pp. 221–232.
Guyon, I., Vapnik, V., Boser, B., Solla, S.A., 1992. Capacity control in linear classifiers for pattern recognition. In: First IAPR International Conference on Pattern Recognition. IEEE Computer Society Press, pp. 385–388.
Heikkinen, V., Tokola, T., Parkkinen, J., Korpela, I., Jaaskelainen, T., 2010. Simulated multispectral imagery for tree species classification using support vector machines. IEEE Transactions on Geoscience and Remote Sensing 48 (3), 1355–1364.
Hong, Y., Chiang, Y.-M., Liu, Y., Hsu, K.-L., Sorooshian, S., 2006. Satellite-based precipitation estimation using watershed segmentation and growing hierarchical self-organizing map. International Journal of Remote Sensing 27 (23), 5165–5184.
Huang, C., Davis, L.S., Townshend, J.R.G., 2002. An assessment of support vector machines for land cover classification. International Journal of Remote Sensing 23 (4), 725–749.
Huang, C., Song, K., Kim, S., Townshend, J.R.G., Davis, P., Masek, J.G., Goward, S.N., 2008a. Use of a dark object concept and support vector machines to automate forest cover change analysis. Remote Sensing of Environment 112 (3), 970–985.
Huang, H., Gong, P., Clinton, N., Hui, F., 2008b. Reduction of atmospheric and topographic effect on Landsat TM data for forest classification. International Journal of Remote Sensing 29 (19), 5623–5642.
Huang, X., Zhang, L., Li, P., 2008c. A multiscale feature fusion approach for classification of very high resolution satellite imagery based on wavelet transform. International Journal of Remote Sensing 29 (20), 5923–5941.
Huang, X., Zhang, L., 2010. Comparison of vector stacking, multi-SVMs fuzzy output, and multi-SVMs voting methods for multiscale VHR urban mapping. IEEE Geoscience and Remote Sensing Letters 7 (2), 261–265.
Huang, X., Zhang, L., 2009. Road centreline extraction from high-resolution imagery based on multiscale structural features and support vector machines. International Journal of Remote Sensing 30 (8), 1977–1987.
Inglada, J., 2007. Automatic recognition of man-made objects in high resolution optical remote sensing images by SVM classification of geometric image features. ISPRS Journal of Photogrammetry and Remote Sensing 62 (3), 236–248.
Jin, T., Zhou, Z., 2007. Ultrawideband synthetic aperture radar landmine detection. IEEE Transactions on Geoscience and Remote Sensing 45 (11), 3561–3573.
Kaheil, Y.H., Rosero, E., Gill, M.K., McKee, M., Bastidas, L.A., 2008. Downscaling and forecasting of evapotranspiration using a synthetic model of wavelets and support vector machines. IEEE Transactions on Geoscience and Remote Sensing 46 (9), 2692–2707.
Karimi, Y., Prasher, S.O., Patel, R.M., Kim, S.H., 2006. Application of support vector machine technology for weed and nitrogen stress detection in corn. Computers and Electronics in Agriculture 51 (1–2), 99–109.
Kavzoglu, T., Colkesen, I., 2009. A kernel functions analysis for support vector machines for land cover classification. International Journal of Applied Earth Observation and Geoinformation 11 (5), 352–359.
Keramitsoglou, I., Sarimveis, H., Kiranoudis, C.T., Kontoes, C., Sifakis, N., Fitoka, E., 2006. The performance of pixel window algorithms in the classification of habitats using VHSR imagery. ISPRS Journal of Photogrammetry and Remote Sensing 60 (4), 225–238.
Keuchel, J., Naumann, S., Heiler, M., Siegmund, A., 2003. Automatic land cover analysis for Tenerife by supervised classification using remotely sensed data. Remote Sensing of Environment 86 (4), 530–541.
Knerr, S., Personnaz, L., Dreyfus, G., 1990. Single-layer learning revisited: a stepwise procedure for building and training a neural network. In: Neurocomputing: Algorithms, Architectures and Applications. In: NATO ASI Series, Springer.
Knorn, J., Rabe, A., Radeloff, V.C., Kuemmerle, T., Kozak, J., Hostert, P., 2009. Land cover mapping of large areas using chain classification of neighboring Landsat satellite images. Remote Sensing of Environment 113 (5), 957–964.
Knudby, A., LeDrew, E., Brenning, A., 2010. Predictive mapping of reef fish species richness, diversity and biomass in Zanzibar using IKONOS imagery and machine-learning techniques. Remote Sensing of Environment 114 (6), 1230–1241.
Kohonen, T., 1997. Self Organizing Maps, 2nd ed. Springer-Verlag, Berlin.
Kuemmerle, T., Chaskovskyy, T.K.O., Knorn, J., Radeloff, V.C., Kruhlov, I., Keeton, W.S., Hostert, P., 2009. Forest cover change and illegal logging in the Ukrainian Carpathians in the transition period from 1988 to 2007. Remote Sensing of Environment 113 (6), 1194–1207.
Kumar, A., Ghosh, S.K., Dadhwal, V.K., 2007. Full fuzzy land cover mapping using remote sensing data based on fuzzy k-means and density estimation. Canadian Journal of Remote Sensing 33 (2), 81–87.
Kwiatkowska, E.J., Fargion, G.S., 2003. Application of machine-learning techniques toward the creation of a consistent and calibrated global chlorophyll concentration baseline dataset using remotely sensed ocean color data. IEEE Transactions on Geoscience and Remote Sensing 41 (12), 2844–2860.
Lardeux, C., Frison, P.L., Tison, C., Souyris, J.C., Stoll, B., Fruneau, B., Rudant, J.P., 2009. Support vector machine for multifrequency SAR polarimetric data classification. IEEE Transactions on Geoscience and Remote Sensing 47 (12), 4143–4152.
Li, H., Gu, H., Han, Y., Yang, J., 2010. Object-oriented classification of high-resolution remote sensing imagery based on an improved colour structure code and a support vector machine. International Journal of Remote Sensing 31 (6), 1453–1470.
Li, J., Narayanan, R.M., 2004. Integrated spectral and spatial information mining in remote sensing imagery. IEEE Transactions on Geoscience and Remote Sensing 42 (3), 673–685.
Licciardi, G., Pacifici, F., Tuia, D., Prasad, S., West, T., Giacco, F., Thiel, C., Inglada, J., Christophe, E., Chanussot, J., Gamba, P., 2009. Decision fusion for the classification of hyperspectral data: outcome of the 2008 GRS-S data fusion contest. IEEE Transactions on Geoscience and Remote Sensing 47 (11), 3857–3865.
Linden, S., Hostert, P., 2009. The influence of urban structures on impervious surface maps from airborne hyperspectral data. Remote Sensing of Environment 113 (11), 2298–2305.
Liu, D., Kelly, M., Gong, P., 2006. A spatial-temporal approach to monitoring forest disease spread using multi-temporal high spatial resolution imagery. Remote Sensing of Environment 101 (2), 167–180.
Luo, J., Ming, D., Liu, W., Shen, Z., Wang, M., Sheng, H., 2007. Extraction of bridges over water from IKONOS panchromatic data. International Journal of Remote Sensing 28 (16), 3633–3648.
Mantero, P., Moser, G., Serpico, S.B., 2005. Partially supervised classification of remote sensing images through SVM-based probability density estimation. IEEE Transactions on Geoscience and Remote Sensing 43 (3), 559–570.
Marcal, A.R.S., Borges, J.S., Gomes, J.A., Pinto Da Costa, J.F., 2005. Land cover update by supervised classification of segmented ASTER images. International Journal of Remote Sensing 26 (7), 1347–1362.
Marconcini, M., Camps-Valls, G., Bruzzone, L., 2009. A composite semisupervised SVM for classification of hyperspectral images. IEEE Geoscience and Remote Sensing Letters 6 (2), 234–238.
Mathur, A., Foody, G.M., 2008a. Multiclass and binary SVM classification: implications for training and classification users. IEEE Geoscience and Remote Sensing Letters 5 (2), 241–245.
Mathur, A., Foody, G.M., 2008b. Crop classification by support vector machine with intelligently selected training data for an operational application. International Journal of Remote Sensing 29 (8), 2227–2240.
Mazzoni, D., Tang, N., Doggett, T., Chien, S., Greeley, R., Cichy, B., 2005a. Learning classifiers for science event detection in remote sensing imagery. In: Proceedings of the 8th International Symposium on Artificial Intelligence, Robotics and Automation in Space (i-SAIRAS 2005).
Mazzoni, D.M., Horváth, A., Garay, M.J., Tang, B., Davies, R., 2005b. A MISR cloud-type classifier using reduced support vector machines. In: Proceedings of the Eighth Workshop on Mining Scientific and Engineering Datasets, 2005 SIAM International Conference on Data Mining.
Mazzoni, D., Garay, M.J., Davies, R., Nelson, D., 2007. An operational MISR pixel classifier using support vector machines. Remote Sensing of Environment 107 (1–2), 149–158.
Melgani, F., 2006. Contextual reconstruction of cloud-contaminated multitemporal multispectral images. IEEE Transactions on Geoscience and Remote Sensing 44 (2), 442–455.
Melgani, F., Bruzzone, L., 2004. Classification of hyperspectral remote sensing images with support vector machines. IEEE Transactions on Geoscience and Remote Sensing 42 (8), 1778–1790.
Mitra, P., Shankar, B.U., Pal, S., 2004. Segmentation of multispectral remote sensing images using active support vector machines. Pattern Recognition Letters 25 (9), 1067–1074.
Mladinich, C.S., 2010. An evaluation of object-oriented image analysis techniques to identify motorized vehicle effects in semi-arid to arid ecosystems of the American West. GIScience & Remote Sensing 47 (1), 53–77.
Montgomery, D.C., Peck, E.A., 1992. Introduction to Linear Regression Analysis, 2nd ed. Wiley, New York.
Moser, G., Serpico, S.B., 2009. Automatic parameter optimization for support vector regression for land and sea surface temperature estimation from remote sensing data. IEEE Transactions on Geoscience and Remote Sensing 47 (3), 909–921.
Mukhopadhyay, A., Maulik, U., 2009. Unsupervised pixel classification in satellite imagery using multiobjective fuzzy clustering combined with SVM classifier. IEEE Transactions on Geoscience and Remote Sensing 47 (4), 1132–1138.
Muñoz-Marí, J., Bruzzone, L., Camps-Valls, G., 2007. A support vector domain description approach to supervised classification of remote sensing images. IEEE Transactions on Geoscience and Remote Sensing 45 (8), 2683–2692.
Nemmour, H., Chibani, Y., 2006. Multiple support vector machines for land cover change detection: an application for mapping urban extensions. ISPRS Journal of Photogrammetry and Remote Sensing 61 (2), 125–133.
Pal, M., 2006. Support vector machine-based feature selection for land cover classification: a case study with DAIS hyperspectral data. International Journal of Remote Sensing 27 (14), 2877–2894.
Pal, M., 2008. Ensemble of support vector machines for land cover classification. International Journal of Remote Sensing 29 (10), 3043–3049.
Pal, M., Mather, P.M., 2005. Support vector machines for classification in remote sensing. International Journal of Remote Sensing 26 (5), 1007–1011.
Plaza, A., Benediktsson, J.A., Boardman, J.W., Brazile, J., Bruzzone, L., Camps-Valls, G., Chanussot, J., Fauvel, M., Gamba, P., Gualtieri, A., Marconcini, M., Tilton, J.C., Trianni, G., 2009. Recent advances in techniques for hyperspectral image processing. Remote Sensing of Environment 113 (1), S110–S122.
Potin, D., Vanheeghe, P., Duflos, E., Davy, M., 2006. An abrupt change detection algorithm for buried landmines localization. IEEE Transactions on Geoscience and Remote Sensing 44 (2), 260–272.
Sahoo, B.C., Oommen, T., Misra, D., Newby, G., 2007. Using the one-dimensional s-transform as a discrimination tool in classification of hyperspectral images. Canadian Journal of Remote Sensing 33 (6), 551–560.
Schneider, P., Biehl, M., Hammer, B., 2009. Adaptive relevance matrices in learning vector quantization. Neural Computation 21 (12), 3532–3561.
Scholkopf, B., Smola, A.J., 2001. Learning with Kernels. The MIT Press. Wang, L., Jia, X., 2009. Integration of soft and hard classifications using extended
Shi, W., Zheng, S., Tian, Y., 2009. Adaptive mapped least squares SVM–based smooth support vector machines. IEEE Geoscience and Remote Sensing Letters 6 (3),
fitting method for DSM generation of LIDAR data. International Journal of 543–547.
Remote Sensing 30 (21), 5669–5683. Warner, T.A., Nerry, F., 2009. Does single broadband or multispectral thermal
Smola, A.J., Schölkopf, B., 2004. A tutorial on support vector regression. Statistics data add information for classification of visible, near- and shortwave infrared
and Computing 14 (3), 199–222. imagery of urban areas? International Journal of Remote Sensing 30 (9),
Song, M., Civco, D., 2004. Road extraction using SVM and image segmentation. 2155–2171.
Photogrammetric Engineering & Remote Sensing 70 (12), 1365–1371.
Waske, B., Benediktsson, J.A., 2007. Fusion of support vector machines for
Song, X., Cherian, G., Fan, G., 2005. A ν -insensitive SVM approach for compliance
classification of multisensor data. IEEE Transactions on Geoscience and Remote
monitoring of the conservation reserve program. IEEE Geoscience and Remote
Sensing 45 (12), 3858–3866.
Sensing Letters 2 (2), 99–103.
Su, L., 2009. Optimizing support vector machine learning for semi-arid vegetation Waske, B., van der Linden, S., 2008. Classifying multilevel imagery from SAR and
mapping by using clustering analysis. ISPRS Journal of Photogrammetry and optical sensors by decision fusion. IEEE Transactions on Geoscience and Remote
Remote Sensing 64 (4), 407–413. Sensing 46 (5), 1457–1466.
Su, L., Huang, Y., 2009. Support vector machine (SVM) classification: comparison of Watanachaturaporn, P., Arora, M.K., Varshney, P.K., 2008. Multisource classification
linkage techniques using a clustering–based method for training data selection. using support vector machines: an empirical comparison with decision tree and
GIScience & Remote Sensing 46 (4), 411–423. neural network classifiers. Photogrammetric Engineering & Remote Sensing 74
Su, L., Huang, Y., Chopping, M.J., Rango, A., Martonchik, J.V., 2009. An empirical (2), 239–246.
study on the utility of BRDF model parameters and topographic parameters Wilson, M.D., Ustin, S.L., Rocke, D.M., 2004. Classification of contamination in salt
for mapping vegetation in a semi-arid region with MISR imagery. International marsh plants using hyperspectral reflectance. IEEE Transactions on Geoscience
Journal of Remote Sensing 30 (13), 3463–3483. and Remote Sensing 42 (5), 1088–1095.
Sun, D., Li, Y., Wang, Q., 2009. A unified model for remotely estimating chlorophyll Xie, X., Liu, T., Tang, B., 2008. Spacebased estimation of moisture transport in marine
a in Lake Taihu, China, based on SVM and in situ hyperspectral data. IEEE atmosphere using support vector regression. Remote Sensing of Environment
Transactions on Geoscience and Remote Sensing 47 (8), 2957–2965. 112 (4), 1846–1855.
Tang, S., Chen, C., Zhan, H., Zhang, T., 2008. Determination of ocean primary
Yang, F., Ichii, K., White, M.A., Hashimoto, H., Michaelis, A.R., Votava, P., Zhu, A.,
productivity using support vector machines. International Journal of Remote
Huete, A., Running, S.W., Nemani, R.R., 2007. Developing a continental–scale
Sensing 29 (21), 6227–6236.
measure of gross primary production by combining MODIS and AmeriFlux data
Tipping, M.E., 2000. The relevance vector machine. In: Solla, S.A., Leen, T.K.,
Muller, K.R. (Eds.), Advances in Neural Information Processing Systems, vol. 12. through support vector machine approach. Remote Sensing of Environment 110
MIT Press, Cambridge, MA. (1), 109–122.
Tipping, M.E., 2001. Sparse Bayesian learning and the relevance vector machine. Yang, F., White, M.A., Michaelis, A.R., Ichii, K., Hashimoto, H., Votava, P., Zhu,
Journal of Machine Learning Research 1, 211–244. A., Nemani, R.R., 2006. Prediction of continental-scale evapotranspiration by
Tan, C.P., Koay, J.Y., Lim, K.S., Ewe, H.T., Chuah, H.T., 2007. Classification of multi- combining MODIS and AmeriFlux data through support vector machine. IEEE
temporal sar images for rice crops using combined entropy decomposition and Transactions on Geoscience and Remote Sensing 44 (11), 3452–3461.
support vector machine technique. Progress in Electromagnetics Research 71, Zebedin, L., Klaus, A., Gruber-Geymayer, B., Karner, K, 2006. Towards 3D map
19–39. generation from digital aerial images. ISPRS Journal of Photogrammetry and
Tarabalka, Y., Benediktsson, J.A., Chanussot, J., 2009. Spectral-spatial classification Remote Sensing 60 (6), 413–427.
of hyperspectral imagery based on partitional clustering techniques. IEEE Zhang, L., Huang, X., Huang, B., Li, P., 2006. A pixel shape index coupled with
Transactions on Geoscience and Remote Sensing 47 (8), 2973–2987. spectral information for classification of high spatial resolution remotely
Tso, B., Mather, P., 2009. Classification Methods for Remotely Sensed Data, 2nd ed. sensed imagery. IEEE Transactions on Geoscience and Remote Sensing 44 (10),
CRC Press, 376 p. 2950–2961.
Tuia, D., Camps-Valls, G., 2009. Semisupervised remote sensing image classification
Zhang, R., Ma, J., 2008. An improved SVM method P-SVM for classification
with cluster kernels. IEEE Geoscience and Remote Sensing Letters 6 (2),
of remotely sensed data. International Journal of Remote Sensing 29 (20),
224–228.
6029–6036.
Tuia, D., Pacifici, F., Kanevski, M., Emery, W.J., 2009. Classification of very high
spatial resolution imagery using mathematical morphology and support vector Zhang, R., Ma, J., 2009. Feature selection for hyperspectral data based on recursive
machines. IEEE Transactions on Geoscience and Remote Sensing 47 (11), support vector machines. International Journal of Remote Sensing 30 (14),
3866–3879. 3669–3677.
Vapnik, V., 1979. Estimation of Dependences Based on Empirical Data. Nauka, Zheng, S., Shi, W., Liu, J., Tian, J., 2008. Remote sensing image fusion using multiscale
Moscow, pp. 5165–5184, 27 (in Russian) (English translation: Springer Verlag, mapped LS-SVM. IEEE Transactions on Geoscience and Remote Sensing 46 (5),
New York, 1982). 1313–1322.
Walton, J.T., 2008. Subpixel urban land cover estimation: comparing cubist, random Zhu, G., Blumberg, D.G., 2002. Classification using ASTER data and SVM algorithms;
forests, and support vector regression. Photogrammetric Engineering & Remote The case study of Beer Sheva, Israel. Remote Sensing of Environment 80 (2),
Sensing 74 (10), 1213–1222. 233–240.
