Procedia Computer Science 59 (2015) 221 – 229
International Conference on Computer Science and Computational Intelligence (ICCSCI 2015)
Integrating Data Selection and Extreme Learning Machine for
Imbalanced Data
Umi Mahdiyaha,∗, M. Isa Irawana , Elly Matul Imahb
a Sepuluh Nopember Institute of Technology, Keputih, Surabaya, 60111, Indonesia
b The State University of Surabaya, Ketintang, Surabaya, 50231, Indonesia
Abstract
Extreme Learning Machine (ELM) is an artificial neural network method introduced by Huang that offers very fast learning. ELM was designed for balanced data, whereas imbalanced data are common in real-life problems. Imbalanced data require special treatment, because their characteristics can decrease classification accuracy. The method proposed in this study modifies ELM to overcome the imbalanced-data problem by integrating a data selection process into the learning; it is called Integrating Data Selection and Extreme Learning Machine (IDELM). The performance of the learning method is evaluated on 13 imbalanced datasets from the UCI Machine Learning Repository and the Benchmark Data Sets for Highly Imbalanced Binary Classification (BDS). The validation includes a comparison with several learning algorithms, and the results show that, on average, the proposed learning method is competitive with, and in some cases outperforms, the other algorithms.
© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Peer-review under responsibility of organizing committee of the International Conference on Computer Science and Computational Intelligence (ICCSCI 2015).
Keywords: Data Selection; Extreme Learning Machine; Imbalanced Data
1. Introduction
A successful understanding of how to make computers learn would open up many new uses of computers and new levels of competence and customization, which is why machine learning is so widely used in everyday life 3,4,15 . A detailed understanding of information-processing algorithms for machine learning might also lead to a better understanding of human learning abilities (and disabilities) 1 . Among the many types of machine learning, one is the extreme learning machine (ELM), which learns much faster than Backpropagation and the Support Vector Machine 2 .
Extreme learning machine is a relatively new learning algorithm for neural networks based on the single-hidden-layer feed-forward network (SLFN); it has very fast learning capability 4 . ELM was first introduced by Huang as a single-hidden-layer feed-forward network and was designed to overcome the slow learning speed of feed-forward neural networks.
∗ Corresponding author.
E-mail address: [email protected]
Traditionally, all the parameters of a feed-forward network (input weights and hidden biases) are determined iteratively by gradient-based training; to avoid this, the Extreme Learning Machine uses the minimum-norm least-squares (LS) solution of SLFNs.
Unlike traditional function approximation theories, which require input weights and hidden layer biases to be adjusted, in ELM the input weights and hidden layer biases can be assigned randomly provided the activation function is infinitely differentiable 5 . As a result, ELM can be faster than earlier neural network algorithms. ELM has been applied and developed in various fields 6,7,8,9,10 . The algorithm continues to be developed, mainly because it is simple and faster than several other neural network algorithms.
The imbalanced dataset problem is one of the important issues in classification and has recently become a new challenge in machine learning 11 . Raw data with an imbalanced class distribution can be found in almost every real-world problem, including medicine, security, the internet, and finance. When classifying data with an imbalanced class distribution, a regular learning algorithm has a natural tendency to favor the majority class, because it assumes a balanced class distribution or equal misclassification costs 12 . There are several ways to address the imbalanced data problem, namely undersampling, oversampling, and algorithm modification 13 . In previous studies, the problem has been addressed by modifying the algorithm, for example with hybrid algorithms 14 or by integrating several steps of the learning process 15,16,17,18 .
This paper proposes to address the imbalanced data problem by performing data selection during the training process. If the data selection and classification steps are done separately, inconsistencies can arise between them. To overcome this, this paper proposes integrating the data selection and classification steps within the Extreme Learning Machine.
The rest of the paper is organized as follows. In Section 2, we present ELM in detail. In Section 3, we describe the integration of data selection and the extreme learning machine (IDELM) for imbalanced data. In Section 4, we describe the datasets and experimental settings. Following that, Section 5 provides experimental results and discussion. Finally, we draw conclusions in Section 6.
2. Extreme Learning Machine
Extreme learning machine is a relatively new learning algorithm for neural networks based on the Single-hidden Layer Feed-forward Network (SLFN) 4,5 . ELM has a simple algorithm, very fast learning capability, and small training error. ELM was first introduced by Huang as a single-hidden-layer feed-forward network and was designed to overcome the slow learning speed of feed-forward neural networks. Traditionally, feed-forward neural networks are trained with gradient-based learning algorithms in which all the parameters (input weights and hidden biases) are determined iteratively; to avoid this, the extreme learning machine uses the minimum-norm least-squares (LS) solution of SLFNs.
A standard single-hidden-layer feed-forward neural network with $\tilde{N}$ hidden neurons and activation function $g(x)$ can be mathematically modelled as:

$$\sum_{i=1}^{\tilde{N}} \beta_i g_i(x_j) = \sum_{i=1}^{\tilde{N}} \beta_i g(w_i \cdot x_j + b_i) = o_j, \quad j = 1, \ldots, N \qquad (1)$$

where $w_i = [w_{i1}, w_{i2}, \ldots, w_{in}]^T$ is the input weight vector, $b_i$ is the bias of the $i$-th hidden node, $\beta_i = [\beta_{i1}, \beta_{i2}, \ldots, \beta_{im}]^T$ is the output weight vector, $w_i \cdot x_j$ denotes the inner product of $w_i$ and $x_j$, and $o_j$ is the network output.
A standard SLFN with $\tilde{N}$ hidden nodes and activation function $g(x)$ can approximate these $N$ samples with zero error, which means that

$$\sum_{i=1}^{\tilde{N}} \beta_i g(w_i \cdot x_j + b_i) = t_j, \quad j = 1, \ldots, N \qquad (2)$$

These $N$ equations can be written compactly as

$$H\beta = T \qquad (3)$$
where $H$ is the hidden layer output matrix:

$$H = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_{\tilde{N}} \cdot x_1 + b_{\tilde{N}}) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_{\tilde{N}} \cdot x_N + b_{\tilde{N}}) \end{bmatrix}_{N \times \tilde{N}} \qquad (4)$$

$$\beta = [\beta_1^T, \cdots, \beta_{\tilde{N}}^T]^T \qquad (5)$$

$$T = [t_1^T, \cdots, t_N^T]^T \qquad (6)$$
The ELM algorithm is derived from the minimum-norm least-squares solution of SLFNs. Although ELM is a "generalized" SLFN, the hidden layer (feature mapping) of ELM does not need to be tuned. The main steps of ELM, as presented by Huang 2 , are as follows. Given a training set

$$\aleph = \{(x_i, t_i) \mid x_i \in \mathbb{R}^n, t_i \in \mathbb{R}^m, i = 1, \ldots, N\} \qquad (7)$$

an activation function $g(x)$, and a number of hidden nodes $\tilde{N}$:

Step 1: Assign random input weights and biases $w_i$ and $b_i$, $i = 1, \ldots, \tilde{N}$.
Step 2: Calculate the hidden layer output matrix $H$.
Step 3: Calculate the output weights $\beta = H^{\dagger} T$, where $T = [t_1, t_2, \ldots, t_N]^T$ and $H^{\dagger}$ is the Moore–Penrose generalized inverse of $H$.

The activation function in ELM must be infinitely differentiable (e.g., sigmoid, RBF, sine, cosine, exponential). The number of hidden nodes $\tilde{N}$ depends on the number of training samples $N$, with $\tilde{N} < N$. Several methods can be used to compute the Moore–Penrose generalized inverse of $H$, such as orthogonal projection, the orthogonalization method, iterative methods, and singular value decomposition (SVD).
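As a concrete illustration of Steps 1–3, the following is a minimal sketch of a basic ELM in Python/NumPy. It assumes a sigmoid activation and a single real-valued output; the paper's own experiments were run in MATLAB, and the function names `elm_train` and `elm_predict` are ours, not the authors'.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_train(X, T, n_hidden, rng=None):
    """Basic ELM: random input weights and biases, output weights via the pseudo-inverse."""
    rng = rng or np.random.default_rng(0)
    W = rng.standard_normal((X.shape[1], n_hidden))  # Step 1: random input weights w_i
    b = rng.standard_normal(n_hidden)                # Step 1: random biases b_i
    H = sigmoid(X @ W + b)                           # Step 2: hidden layer output matrix H, Eq. (4)
    beta = np.linalg.pinv(H) @ T                     # Step 3: beta = H^dagger T (Moore-Penrose)
    return W, b, beta

def elm_predict(X, W, b, beta):
    return sigmoid(X @ W + b) @ beta

# Toy usage on synthetic binary data with targets in {0, 1}
rng = np.random.default_rng(1)
X = rng.random((200, 5))
T = (X[:, 0] + X[:, 1] > 1.0).astype(float)
W, b, beta = elm_train(X, T, n_hidden=20)
pred = (elm_predict(X, W, b, beta) > 0.5).astype(int)
```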
To determine the number of hidden nodes, this paper uses the following algorithm 19 :

Given a training set $\aleph = \{(x_i, t_i) \mid x_i \in \mathbb{R}^n, t_i \in \mathbb{R}^m, i = 1, \ldots, N\}$, an activation function $g(x)$, a maximum node number $\tilde{N}_{max}$, and an expected learning accuracy $\epsilon$:

Initialization: let $\tilde{N} = 0$ and residual error $E = t$, where $t = [t_1, t_2, \ldots, t_N]^T$.
Learning step: while $\tilde{N} < \tilde{N}_{max}$ and $\|E\| > \epsilon$:
a) increase the number of hidden nodes by one: $\tilde{N} = \tilde{N} + 1$;
b) assign a random input weight $a_{\tilde{N}}$ and bias $b_{\tilde{N}}$ (or a random centre $a_{\tilde{N}}$ and impact factor $b_{\tilde{N}}$) for the new hidden node $\tilde{N}$;
c) calculate the output weight $\beta_{\tilde{N}}$ for the new hidden node:

$$\beta_{\tilde{N}} = \frac{E \cdot H_{\tilde{N}}^T}{H_{\tilde{N}} \cdot H_{\tilde{N}}^T} \qquad (8)$$

d) calculate the residual error after adding the new hidden node $\tilde{N}$:

$$E = E - \beta_{\tilde{N}} \cdot H_{\tilde{N}} \qquad (9)$$

endwhile
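For clarity, a minimal Python/NumPy sketch of this node-growing loop is given below, assuming a single real-valued output and a sigmoid activation; the helper name `incremental_elm` is ours, and the stopping test compares the norm of the residual error against the expected accuracy, as in the pseudocode above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def incremental_elm(X, t, n_max, eps, rng=None):
    """Grow hidden nodes one at a time until the residual error E is small enough."""
    rng = rng or np.random.default_rng(0)
    E = np.asarray(t, dtype=float).copy()    # Initialization: residual error E = t
    weights, biases, betas = [], [], []
    while len(weights) < n_max and np.linalg.norm(E) > eps:
        a = rng.standard_normal(X.shape[1])  # b) random input weight a_N for the new node
        b = rng.standard_normal()            # b) random bias b_N for the new node
        h = sigmoid(X @ a + b)               #    output vector H_N of the new hidden node
        beta = (E @ h) / (h @ h)             # c) Eq. (8): beta_N = E.H_N^T / (H_N.H_N^T)
        E = E - beta * h                     # d) Eq. (9): update the residual error
        weights.append(a); biases.append(b); betas.append(beta)
    return np.array(weights), np.array(biases), np.array(betas)
```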
3. Integrating Data Selection and Extreme Learning Machine
Integrating data selection and classification is an improvement ELM, which is designed to be able to perform data
selection and classification as well. To integrate that step, in this paper used the MSE, because MSE is often used for
stopping condition in Extreme Learning Machine.
The first step of the proposed method is to divide the training data into several groups. The first group is used to initialize the number of hidden nodes and the initial MSE. Each subsequent group is passed through the learning process and its MSE is computed. When the MSE of the next group (group t + 1) is higher than that of the previous group (group t), the group is added to the selected data; conversely, if it is lower, the group is not used. The reasoning is that if the MSE on group t + 1 is greater than on group t, then group t + 1 contains unique data. The block diagram of the integrated learning method is shown in Figure 1.
Fig. 1. Block Diagram of Integrating Data Selection and ELM
Note: $\Delta MSE = MSE_{t+1} - MSE_t$
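A minimal sketch of this selection rule is shown below. It assumes the training data have already been split into groups of (X, t) pairs and that two caller-supplied helpers exist: `train_fn(X, t)`, returning a trained model (e.g. an ELM), and `mse_fn(model, X, t)`, returning its MSE on a group; both names are hypothetical. How the model is updated between groups is not spelled out in the paper, so here it is simply retrained on all data selected so far; this illustrates the ΔMSE rule rather than the authors' exact implementation.

```python
import numpy as np

def select_data_groups(groups, train_fn, mse_fn):
    """Keep a group only when its MSE rises relative to the previous group
    (DeltaMSE = MSE_{t+1} - MSE_t > 0), taken as a sign of 'unique' data."""
    selected = [groups[0]]                               # first group: initialise model and MSE
    X_sel = np.vstack([g[0] for g in selected])
    t_sel = np.concatenate([g[1] for g in selected])
    model = train_fn(X_sel, t_sel)
    prev_mse = mse_fn(model, *groups[0])
    for X_g, t_g in groups[1:]:
        mse = mse_fn(model, X_g, t_g)                    # MSE of the next group under the current model
        if mse > prev_mse:                               # DeltaMSE > 0: keep the group
            selected.append((X_g, t_g))
            X_sel = np.vstack([g[0] for g in selected])
            t_sel = np.concatenate([g[1] for g in selected])
            model = train_fn(X_sel, t_sel)               # retrain on the selected data so far (our assumption)
        prev_mse = mse
    return selected, model
```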
4. Experimental Settings and Dataset
4.1. Experimental Settings
This section describes the research procedure and the datasets in more detail. This study proposes a new learning method that performs data selection as well as classification. In this paper, the proposed method is compared with the basic ELM, among other algorithms. The research procedure is presented in the following flow chart:
The first step in this research is data pre-processing, in this case Z-score normalization, formulated as:

$$Z_{score} = \frac{x_i - \bar{x}}{\sigma(x)} \qquad (10)$$

where $x_i \in \{x_1, x_2, \ldots, x_n\} \subset \mathbb{R}$, $\bar{x}$ is the mean of the data, and $\sigma(x)$ is the standard deviation.
After pre-processing (normalization), training is conducted using 5-fold cross validation; in other words, each training run uses 80% of the data from the dataset, and the remaining 20% is used for testing. The sigmoid function is used as the activation function:

$$f(x) = \frac{1}{1 + e^{-x}} \qquad (11)$$
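As an illustration of the pre-processing and evaluation protocol described above, the sketch below performs the Z-score normalization of Eq. (10) and builds 5-fold splits (4 folds, roughly 80%, for training and 1 fold, roughly 20%, for testing). The function names are ours; the paper's experiments were run in MATLAB.

```python
import numpy as np

def zscore(X):
    """Eq. (10): column-wise Z-score normalisation."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def five_fold_indices(n, rng=None):
    """Shuffle n sample indices and split them into 5 folds."""
    rng = rng or np.random.default_rng(0)
    return np.array_split(rng.permutation(n), 5)

# Usage sketch: each round trains on ~80% of the data and tests on the remaining ~20%
X = np.random.rand(100, 8)
Xn = zscore(X)
folds = five_fold_indices(len(Xn))
for k in range(5):
    test_idx = folds[k]
    train_idx = np.concatenate([folds[j] for j in range(5) if j != k])
    X_train, X_test = Xn[train_idx], Xn[test_idx]   # feed these to the learning algorithm
```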
This research compares Integrating Data Selection and ELM (IDELM) with the standard Extreme Learning Machine, Backpropagation (BPNN), and the Support Vector Machine (SVM). They are compared on thirteen binary classification datasets.
Fig. 2. Scheme of Methodology
Table 1. Confusion matrix for a two-class problem

                      Actual True             Actual False
Prediction True       TP (True Positive)      FP (False Positive)
Prediction False      FN (False Negative)     TN (True Negative)
Accuracy, precision, recall, specificity, and G-mean were used as evaluation measures to compare the algorithms. The confusion matrix and the evaluation formulas are given below.
Accuracy is the standard evaluation measure for classification, but it is not suitable for imbalanced class classification. In an imbalanced dataset, not only is the class distribution skewed, but the misclassification costs are often uneven as well, and the minority class examples are often more important than the majority class examples. Precision is a measure of correctness: out of the examples labeled positive, how many are really positive 20 . Recall is a measure of completeness: how many examples of the positive class were labeled correctly 21 . Accuracy, precision, and recall are defined as:
$$Accuracy = \frac{TP + TN}{\text{total number of samples}} \qquad (12)$$

$$Precision = \frac{TP}{TP + FP} \qquad (13)$$

$$Recall = \frac{TP}{TP + FN} \qquad (14)$$
The G-mean (geometric mean) is an evaluation measure for imbalanced data that indicates the balance of performance between the majority and minority classes. To compute the G-mean, we first need the sensitivity and specificity. Sensitivity is the accuracy on the positive samples and is computed with the same equation as recall, whereas specificity is the accuracy on the negative samples. The G-mean is defined as:

$$Gmean = \sqrt{sensitivity \times specificity} \qquad (15)$$

and specificity is defined as:

$$Specificity = 1 - \frac{FP}{FP + TN} \qquad (16)$$
4.2. Dataset
The datasets in this study are taken from the UCI Machine Learning Repository and the Benchmark Data Sets for Highly Imbalanced Binary Classification (BDS) 22 . Some of the selected datasets contain records with missing attribute values; those records were removed. The datasets are described below:
Table 2. Datasets used in the experiments
Dataset    Data size    Input features    + data    − data    Source
QSAR biodegradation (QSAR) 1055 41 356 699 UCI
Spambase (Sp) 4601 51 1813 2788 UCI
Glass1 (G1) 214 10 70 144 UCI
Glass2 (G2) 214 10 76 138 UCI
Glass3 (G3) 214 10 17 197 UCI
Wilt (W) 4339 5 74 4265 UCI
Balance (B) 626 4 26 600 BDS
Abalone(A) 4175 8 210 3968 BDS
Yeast (Y) 1485 8 33 1452 BDS
Solar flare (Sf) 1390 10 44 1346 BDS
Mammographi (M) 11183 6 260 10923 BDS
Forest Cover (FC) 2267 12 189 2078 BDS
Letter Img (LI) 20000 16 734 19266 BDS
The Glass data are converted to binary classification problems: Glass1 separates building-windows-float-processed from non-building-windows-float-processed, Glass2 targets vehicle-windows-float-processed, and Glass3 targets containers. All simulations in this paper were carried out in the MATLAB 2012b environment running on an AMD E-350 1.60 GHz processor.
5. Experimental Results and Discussions
In this section we discuss the results of this research, namely a comparison of the performance of Integrating Data Selection and ELM with the standard Extreme Learning Machine, BPNN, and SVM. The performance measures are accuracy, precision, recall, specificity, and G-mean.
5.1. Accuracy and Precision
The accuracy results are shown in Table 3. The table shows that the accuracy of IDELM is not always better than that of ELM, BPNN, and SVM, because accuracy only counts correct classifications, regardless of whether the data are balanced. IDELM obtains the best accuracy on the Wilt, Spambase, Glass1, Yeast, Abalone, Solar flare, Mammography, Forest cover, and Letter Img data, with 0.99, 1, 1, 0.98, 0.95, 0.97, 0.99, 0.92, and 0.98 respectively. If we average the accuracy over all datasets, IDELM and BPNN give the highest values (0.9576 and 0.9538). However, we cannot judge the performance from accuracy alone, because the classical evaluation measure (overall accuracy) is not meaningful when evaluating the performance of a classifier over imbalanced domains 16 .
Table 3. Accuracy, Precision, and Recall

Data | Accuracy (ELM, IDELM, BP, SVM) | Precision (ELM, IDELM, BP, SVM) | Recall (ELM, IDELM, BP, SVM)
W 0.98 0.99 1.00 0.99 0 0.99 0.96 0.57 0 1 0.76 0.57
Sp 0.86 1.00 0.93 0.83 0.84 1.00 0.92 0.96 0.79 1 0.90 0.96
qsar 0.80 0.83 0.86 0.83 0.74 0.70 0.81 0.84 0.63 1 0.77 0.84
B 0.96 0.95 0.96 0.56 0 0.85 0 0.08 0 1 0 0.08
G1 0.87 1.00 0.99 0.98 0.87 1.00 0.97 1.00 0.74 1 0.99 1.
G2 0.69 0.93 0.94 0.89 0.50 0.89 0.89 0.85 0.60 1 0.95 0.85
G3 0.62 0.95 0.93 0.97 0.15 0.93 0.68 0.87 0.73 1 0.78 0.87
Y 0.59 0.98 0.98 0.81 0.02 0.97 0.00 0.10 0.40 1 0 0.10
A 0.63 0.95 0.95 0.80 0.10 0.78 0 0.19 0.80 1 0 0.19
Sf 0.51 0.97 0.97 0.68 0.04 0.93 0 0.08 0.76 1 0 0.08
M 0.99 0.99 0.99 0.85 0 0.99 0.89 0.07 0 1 0.12 0.87
FC 0.92 0.92 0.92 0.85 0 0.86 0.61 0.34 0 1 0.18 0.92
LI 0.98 0.99 0.98 0.9 0 0.84 0.83 0.15 0 1 0.1 0.92
Average 0.8 0.9576 0.9538 0.8415 0.2507 0.9023 0.5815 0.4692 0.4192 1 0.4269 0.6346
From the table above we can also see that IDELM has the best precision on almost all datasets. This means IDELM classifies positive data better than ELM, BPNN, and SVM. On some datasets, such as Wilt, Balance, Abalone, and Solar flare, ELM and BPNN cannot classify the positive data correctly at all, so their precision is 0 because both TP and FP are zero.
5.2. G-mean
Specificity and G-mean are shown in Table 4. The table shows that IDELM has the best G-mean on almost all datasets, which means IDELM performs better than ELM, BPNN, and SVM for imbalanced data. This is most visible on the strongly imbalanced datasets, namely Wilt, Spambase, Glass1, Yeast, Abalone, Solar flare, Mammography, Forest cover, and Letter Img. On the Balance, Yeast, Abalone, and Solar flare data, BPNN has a G-mean of 0 because its sensitivity is 0, meaning BPNN cannot classify the positive data correctly.
Table 4. Specificity and G-Mean

Data | Specificity (ELM, IDELM, BP, SVM) | G-Mean (ELM, IDELM, BP, SVM)
W 1 0.92 0.999 0.991 0 0.96 0.871 0.754
Sp 0.633 0.993 0.945 0.988 0.706 0.997 0.924 0.976
Qsar 0.907 0.85 0.918 0.939 0.755 0.92 0.842 0.890
B 0.925 0.95 1 0.522 0 0.98 0 0.203
G1 0.73424 1 0.93 0.993 0.739 1 0.957 0.997
G2 0.492 0.9 0.9 0.942 0.543 0.944 0.923 0.894
G3 0.514 0.91 0.984 0.9 0.614 0.95 0.875 0.883
Y 0.701 0.91 0.998 0.801 0.529 0.95 0 0.277
A 0.601 0.93 0.999 0.794 0.693 0.97 0 0.389
Sf 0.634 0.94 1 0.589 0.692 0.97 0 0.222
M 0.7 0.98 0.89 0.88 0.7 0.99 0.344 0.92
FC 1 0.85 0.99 0.82 0 0.92 0.42 0.87
LI 1 0.98 0.99 0.9 0 0.99 0.3 0.91
Average 0.7570 0.9317 0.9725 0.8583 0.4593 0.9649 0.4981 0.7065
5.3. CPU Time
We now discuss the CPU time; the results are shown in the following table:
Table 5. Table of CPU Time
Data ELM(s) IDELM(s) BP(s) SVM(s)
Wilt 0.0281 5.2947 23.2067 3.2448
Spambase 0.2714 4.829781 486.2551 4.8634
Qsar 0.0406 1.198 25.3065 1.10767
Balance 0.0187 0.911 2.6458 0.4930
Glass1 0.0094 0.2028 3.4195 0.1997
Glass2 0.0094 0.218 3.6099 0.1966
Glass3 0.0218 0.3762 3.2823 0.1903
Yeast 0.0218 1.95 10.8889 0.7363
Abalone 0.0281 7.279 38.8973 3.6878
Solar Flare 0.0125 1.797 7.4038 0.7582
Mammography 0.109 7.179 49.1247 5.257
Forest Cover 0.078 2.719 30.1706 1.7784
Letter Img 0.1903 87.2435 280.9079 69.4828
Average 0.0591 9.3229 74.2399 7.0766
The table shows that standard ELM has the fastest CPU time, followed by SVM and then IDELM; Backpropagation is slower than the other algorithms. The Spambase data has the slowest CPU time because it has many attributes and instances (51 attributes and 4601 records): the CPU time depends not only on the type of algorithm but also on the amount of data and the number of attributes.
6. Conclusion
This paper discusses and compares four classification models on thirteen imbalanced datasets. All four investigated models offer comparable classification accuracies. ELM has the best average CPU time over almost all datasets, namely 0.0591 seconds, whereas the CPU times of SVM, IDELM, and BPNN are 7.0766, 9.3229, and 74.2399 seconds respectively. The best average accuracy is obtained by IDELM, namely 95.76%. Performance on imbalanced data classification problems cannot be measured with conventional accuracy alone, so this study also uses precision, sensitivity (recall), specificity, and G-mean. The method with the best average precision, recall, and G-mean is IDELM, with 90.23%, 100%, and 96.49%, while the average precision of ELM, BPNN, and SVM is 25.07%, 58.15%, and 46.92%; their average recall is 41.92%, 42.69%, and 63.46%; and their average G-mean is 45.93%, 49.81%, and 70.65%. The G-mean can be used as a general measure of accuracy for imbalanced data problems, so from the G-mean results we conclude that the best method is IDELM.
References
1. Tom M. Mitchell, Machine Learning, McGraw-Hill, 1997.
2. Umi Mahdiyah, M.I. Irawan, and E.M. Imah. Study Comparison Backpropogation, Support Vector Machine, and Extreme Learning Machine for
Bioinformatics Data. Jurnal Ilmu Komputer dan Informasi. 2015; 50(1): 55-62.
3. M.I Irawan, Siti Amiroch.”Construction of Phylogenetic Tree Using Neighbor Joining Algorithms to Identify The Host and The Spreading of
SARS Epidemic” Journal of Theoretical and Applied Information Technology. 2015; 71(3): 424-429.
4. Guang-Bin Huang, Zhu, Qin-Yu and Chee-Kheong Siew. Extreme learning machine: Theory and applications, Neurocomputing. 2006; 70:489-
501.
5. Guang-Bin Huang, Qin-Yu Zhu, and Chee-Kheong Siew. Extreme Learning Machine: A New Learning Scheme of FeedForward Neural Net-
work. Neurocomputing. 2004; 40: 7803-8359.
6. Xiaozhuo Luo, F. Liu, Shuyuan Yang, Xiaodong Wang, Zhiguo Zhou. Joint sparse regularization based Sparse Semi-Supervised Extreme
Learning Machine (S3ELM) for classification. Neurocomputing. 2015; 73: 149-160.
7. Hong-Gui Han, Li-Dan Wang, Jun-Fei Qiao. Hierarchical extreme learning machine for feedforward neural network. Neurocomputing. 2014;
128: 128-135.
8. Guang-Bin Huang. An Insight into Extreme Learning Machines: Random Neurons, Random Features and Kernels. 2014; 376-390.
9. Junchang Xin, Zhiqiong Wang, Luxuan Qu, Guoren Wang. Elastic extreme learning machine for big data classification. Neurocomputing.
149:464-471
10. Wentao Maoa, Shengjie Zhao, Xiaoxia Mu. Haicheng Wang. Multi-dimensional extreme learning machine. Neurocomputing. 2015; 146:
160-170.
11. H. He, E.A. Garcia. Learning from imbalanced data. IEEE Trans. Knowl. Data Eng. 2009; 21 (9): 1263-1284.
12. Weiwei Zong, Guang-Bin Huang, Yiqiang Chen. Weighted extreme learning machine for imbalance learning. Neurocomputing. 2013;101:229-
242
13. Mohamed Bekkar and Taklit A. Imbalanced Data Learning Approaches. International Journal of Data Mining and Knowledge Management
Process (IJDKP). 2013; 3(4):15-33.
14. C.Y. Lee, M. R. Yang, L. Y. Chang, Z. J. Lee. A Hybrid Algorithm Applied to Classify Unbalanced Data. Proceeding of International Confer-
ence on Networked Computing and Advanced Information Management. 2010: 618-621.
15. E.M. Imah, W. Jatmiko, T. Basarudin. Adaptive Multilayer Generalized Learning Vector Quantization (AMGLVQ) as new algorithm with
integrating feature extraction and classification for Arrhythmia heartbeats classification. Systems, Man, and Cybernetics (SMC). 2012; 150-
155.
16. E.M. Imah, W. Jatmiko, T. Basarudin.Electrocardiogram for Biometrics by using Adaptive Multilayer Generalized Learning Vector Quantiza-
tion (AMGLVQ): Integrating Feature Extraction and Classification. 2013;5(6): 1891- 1917.
17. K.K. Paliwal, M. Bacchiani, and Y. Sagisaka.. Simultaneous design of feature extractor and pattern classifier using the minimum classification
error training algorithm. Neural Networks for Signal Processing V. Proceedings of the 1995 IEEE Workshop. 1995: 67-76.
18. S. Chen and H. He. Towards incremental learning of nonstationary imbalanced data stream: a multiple selectively recursive approach. Evolving
Systems. 2010;2(1):35-50
19. Guang-Bin Huang, Lei Chen, and Chee-Kheong Siew. Universal Approximation Using Incremental Constructive Feedforward Networks With
Random Hidden Nodes. Neural Network. 2006;17(4):879-892.
20. J. Weng, Cheng G. Poon. A New Evaluation Measure for Imbalanced Datasets, in Seventh Australasian Data Mining Conference (AusDM
2008), 2008.
21. Mohamed Bekkar, Dr.Hassiba Kheliouane Djemaa, Dr.Taklit Akrouf Alitouche. Evaluation Measures for Models Assessment over Imbalanced
Data Sets. Journal of Information Engineering and Applications. 2013; 3(10):27-38
22. Ding, Zejin, ”Diversified Ensemble Classifiers for Highly Imbalanced Data Learning and their Application in Bioinformatics.” Dissertation,
Georgia State University, 2011.