Research Article
A New Random Forest Algorithm Based on Learning Automata
Received 12 February 2021; Revised 9 March 2021; Accepted 16 March 2021; Published 27 March 2021
Copyright © 2021 Mohammad Savargiv et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
The goal of aggregating base classifiers is to achieve an aggregated classifier with higher accuracy than the individual classifiers. Random forest is an ensemble learning method that has received more attention than other ensemble learning methods due to its simple structure, ease of understanding, and higher efficiency than similar methods. The ability and efficiency of classical methods are always influenced by the data. Independence from the data domain and the ability to adapt to the conditions of the problem space are the most challenging issues for the different types of classifiers. In this paper, a method based on learning automata is presented through which the ability to adapt to the problem space, as well as independence from the data domain, is added to the random forest to increase its efficiency. Using the idea of reinforcement learning in the random forest makes it possible to address issues with data that have a dynamic behaviour. Dynamic behaviour refers to the variability in the behaviour of a data sample in different domains. Therefore, to evaluate the proposed method, and to create an environment with dynamic behaviour, different domains of data have been considered. In the proposed method, this idea is added to the random forest using learning automata. The reason for this choice is the simple structure of learning automata and their compatibility with the problem space. The evaluation results confirm the improvement in random forest efficiency.
difference in polarity is created without any change in the form of the word and without any change in the role of the word from a grammatical point of view. The word "small" in both the electronics domain and the restaurant domain has such a behaviour. This behaviour poses a major challenge to opinion mining algorithms [4].

The classical solution in the literature to overcome this challenge is based on the use of lexicon-based approaches. This approach is based on frameworks such as unigram, n-gram, aspect-based, and similar methods, all of which are data-dependent. In addition to the urgent need for predefined data, these methods lose their efficiency if they meet an unspecified word or a metaphor in the opinion mining field. In other words, they are not compatible with the problem space. The way random forest works is that, with the sequential placement of training data and feature vectors injected into each of the base learners, it tries to find the best subset of features and, by increasing their impact factor in the classifier, achieves the highest performance among all the aggregated base learners [5]. However, this method is not effective for data such as text, in which a word can have different polarities in different domains, because the classification algorithm has no ability to adapt to the conditions of the problem space.

In this paper, we intend to empower random forest with the idea of reinforcement learning and improve its efficiency. In the proposed method, learning automata are used to aggregate and weigh the base learners. The way a learning automaton works is to receive feedback from the environment and perform one of its actions based on the type of feedback. In learning automata, feedbacks are divided into two categories of reinforcement signals: reward signals and penalty signals. For each reinforcement signal received, the learning automaton updates the probability of selecting the action chosen in the previous step. This process continues until the action selection probabilities converge to one of the actions; in other words, until the best option for the current situation is found. In the proposed method, a learning automaton's action is appropriate when the selected base learner leads to the maximum reward that can be received from the environment. Since at each stage of learning automata execution the learning algorithm tries to select the best option, achieving the global optimum in the problem space is guaranteed. This is proof of the adaptability of the proposed method. In the proposed method, the subprocess of replacing features in the feature vector is removed, and all the features in the feature vector are used. As a practical application in the field of opinion mining, if the Bag of Words (BoW) method is used to create the feature vector, the advantage of considering all the features of the feature vector is that it also covers cases that occur rarely. In other words, in the proposed method, the aspect of independence from the domain in processes such as opinion mining is considered.

Our contribution is summarized as follows:

(i) In this paper, a brief review of random forest in terms of application scope is given.
(ii) In this paper, a learning automata-based method is proposed to improve the random forest performance.
(iii) The proposed method operates independently of the domain, and it is adaptable to the conditions of the problem space.

The rest of the paper is organized as follows. In Section 2, related work is introduced. Section 3 presents the introduction to learning automata. The proposed method is explained in Section 4. Section 5 includes the evaluation. The discussion is given in Section 6, and finally, the conclusion and future work are described in Section 7.

2. Related Work

In this section, theories and literature on the subject of random forest are examined. The purpose of this section is to review the innovations that have been introduced around random forest in recent years.

Random forest is considered one of the methods of ensemble learning, in the homogeneous ensemble learning subgroup. In the random forest, each decision tree, or in other words, each base learner, has access to a random subset of the feature vector [6]. Therefore, the feature vector is defined as follows:

x = (x1, x2, ..., xp), (1)

where p is the dimension of the feature vector available to the base learner. The main goal is to find a prediction function f(x) that predicts the parameter Y. The prediction function is defined with respect to the loss

L(Y, f(x)), (2)

where L is known as the loss function, and the goal is to minimize the expected value of the loss. For regression applications and classification applications, squared error loss and zero-one loss are the common choices, respectively. These two functions are defined in equations (3) and (4), respectively:

L(Y, f(x)) = (Y − f(x))², (3)

L(Y, f(x)) = I(Y ≠ f(x)) = { 0, if Y = f(x); 1, otherwise }. (4)

To create an ensemble, a set of base learners come together. If the base learners are defined as

h1(x), h2(x), ..., hJ(x), (5)

then for regression applications, the averaging will be based on equation (6), and for classification applications, the voting will be based on equation (7).
Let D = {(x1, y1), (x2, y2), ..., (xN, yN)} denote the training data, with xi = (xi,1, xi,2, ..., xi,p)T.
For j = 1 to J:
  Take a bootstrap sample Dj of size N from D.
  Using the bootstrap sample Dj as the training data, fit a tree:
  (a) Start with all observations in a single node.
  (b) Repeat the following steps recursively for each node until the stopping criterion is met:
    (i) Select m predictors at random from the p available predictors.
    (ii) Find the best binary split among all binary splits on the m predictors from step (i).
    (iii) Split the node into two descendant nodes using the split from step (ii).
To make a prediction at a new point x:
  f(x) = argmax_y Σ_{j=1}^{J} I(hj(x) = y),
where hj(x) is the prediction of the response variable at x using the jth tree.

Algorithm 1: Random forest.
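As a concrete, hedged illustration of Algorithm 1, the sketch below builds a forest from scikit-learn decision trees, using one bootstrap sample per tree and m randomly considered predictors per split; the use of scikit-learn and all names here are assumptions of convenience, not the authors' implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_random_forest(X, y, n_trees=100, m=None, seed=0):
    """Fit J = n_trees trees; each sees a bootstrap sample D_j of size N
    and considers m random predictors at every split (steps (i)-(iii))."""
    rng = np.random.default_rng(seed)
    N, p = X.shape
    m = m or max(1, int(np.sqrt(p)))  # a common heuristic for m
    forest = []
    for _ in range(n_trees):
        idx = rng.integers(0, N, size=N)               # bootstrap sample D_j
        tree = DecisionTreeClassifier(max_features=m)  # m predictors per split
        forest.append(tree.fit(X[idx], y[idx]))
    return forest

def predict_random_forest(forest, X):
    """f(x) = argmax_y sum_j I(h_j(x) = y): majority vote across the J trees."""
    votes = np.stack([tree.predict(X) for tree in forest]).astype(int)
    return np.array([np.bincount(col).argmax() for col in votes.T])
```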
construct a quantitative detection model. Improving the performance of mapping for minerals is the main goal of reference [54]. Liu et al. [55] propose an adaptive electrical period partition algorithm for open-circuit fault detection. Software fault prediction by ensemble techniques is investigated by [56]. In [57], RF is used to build a distributed energy system. A comprehensive image processing model is proposed by [58]. Ho et al. [59] use RF to propose a framework that uses climate data to model hydropower generation. Zhou et al. [60] use RF on small and unbalanced datasets to create a risk prediction model as a decision-making tool. Deng et al. [61] propose an authentication method for protecting high-value food products by RF. A forecast for agricultural products by RF is proposed by [62]. Jeong and Kim [63] use weighted random forest for a link prediction model. Khorshidpour et al. [64] present an approach to model an attack against classifiers with a non-differentiable decision boundary. Fusing multi-domain entropy and RF is the main goal of [65] for proposing a fault diagnosis method for the inter-shaft bearing. Analyzing wine quality is presented by [66]. In the network field, Madhumathi and Suresh [67] develop a model to predict the future location of a dynamic sensor node in wireless communications. Fang et al. [68] propose an encrypted malicious traffic identification method. Detecting intrusions in the network by typical RF is proposed by [69], and intrusion detection in network security by tuning the RF parameters with the Moth-Flame optimization algorithm is presented by [70].

2.5. Physics, Text Processing, Tourism, and Urban Planning Fields. In the physics field, Mingjing [71] measures and quantifies the pH of soil by RF. Reference [72] proposes a model for extracting complex relationships between energy modulation and device efficiency. Zhang et al. [73] propose a model to accurately and effectively predict the UCS of LWSCC, with a beetle antennae search algorithm for tuning the hyper-parameters of RF. The prediction of geotechnical parameters by typical RF is made by [74]. Creep index prediction by the RF algorithm to determine the optimal combination of variables is the main goal of [75]. In the text processing field, a comparison between RF and other classifiers is presented by [76] for finding the best classifiers in the subject literature of text classification. The random forest is used as one of the base learners of an ensemble model for fake news detection by [77]. Analyzing reviewers' comments for sentiment analysis is the main goal of [78]. Zhang et al. [79] propose two novel label flipping attacks to evaluate the robustness of NB under noise by random forest. Recognizing newspaper text by RF is done by [80]. Madichetty and Sridevi [81] use RF as one of the classifiers for detecting damage assessment tweets. Madasu and Elango [82] use the typical RF for feature selection for sentiment analysis. Chang et al. [83] use online customer reviews for opinion mining by RF. Text classification by simple RF is the goal of [84]. Onan and Toçouglu [85] present a method for document clustering and topic modeling on massive open online courses. Sentiment analysis of technical words in English by the Gini index for feature selection is done by [86]. Beck [87] uses ensemble learning and deep learning for a sentiment classification scheme with high predictive performance on massive open online courses' reviews. Onan [88] presents a deep learning based approach to sentiment analysis; this approach uses TF-IDF weighted GloVe word embeddings with a CNN-LSTM architecture. Onan and Tocoglu [89] present an effective sarcasm identification framework on social media data by pursuing the paradigms of neural language models and deep neural networks. In the tourism field, Rodriguez-Pardo et al. [90] propose a method based on simple RF for predicting the behaviour of tourists. Predicting the travel time to reduce traffic congestion is the main goal of [91]. Jamatia et al. [92] propose a method for tourist destination prediction. In urban planning, Baumeister et al. [93] rank the urban forest characteristics for cultural ecosystem services supply by typical RF. Forecasting road traffic conditions is done by [94]. The simulation of urban space development by RF is presented by [95]. Investigating the information on gross domestic product for the analysis of economic development is presented by [96]. Mei et al. [97] propose a method to identify the spatiotemporal commuting patterns of the transportation system. In this brief review, the mentioned references are categorized in terms of innovation and functionality.

As can be seen from Table 1, RF has a high range of applications and variations in scope. In contrast, both in
3. Learning Automata

Learning Automata (LA) is one of the learning algorithms that, after selecting different actions at different times, identifies the best action in terms of the responses received from a random environment. The LA selects an action from the set of actions according to a probability vector, and this action is evaluated in the environment. By using the signal received from the environment, the LA updates the probability vector and, by repeating this process, the optimal action is gradually identified. The classification problem can be formulated as a team of LA that operate collectively to optimize an objective function [102]. In Figure 1, the interaction of the learning automata and the environment is shown.

Figure 1: Interaction of learning automata with the environment.

Finding the global optimum in the solution space is another advantage of using the LA. The LA can be formally represented by the quadruple

LA = ⟨α, β, P, T⟩, (8)

in which

α = {α1, α2, ..., αr} (9)

is the set of actions (outputs) of the LA, in other words, the set of inputs of the environment;

β = {β1, β2, ..., βr} (10)

is the set of inputs of the LA, in other words, the set of outputs of the environment;

P = {p1, p2, ..., pr} (11)

is the probability vector of the LA actions; and

P(n + 1) = T[P(n), α(n), β(n)] (12)

is the learning algorithm.

In LA, three different models can be defined for the environment. In the P-Model, the environment presents the values 0 or 1 as its output. In the Q-Model, the output values of the environment are discrete numbers between 0 and 1. In the S-Model, the output of the environment is a continuous value between 0 and 1. The actions selected by the LA are updated using both the signal received from the environment and the reward and penalty functions. The amount of reward and penalty allocated to an LA action can be defined in four ways: LRP, where the amounts of reward and penalty are considered the same; LRεP, in which the amount of penalty is several times smaller than the reward; LRI, in which the penalty amount is considered 0; and LIP, where the reward amount is considered 0 [103].

At each instant n, the action probability vector pi(n) is updated by the linear learning algorithm given in equation (13) if the chosen action αi(n) is rewarded by the environment, and it is updated according to equation (14) if the chosen action is penalized [104]:

pi(n + 1) = pi(n) + a[1 − pi(n)],
pj(n + 1) = (1 − a) pj(n), ∀ j, j ≠ i, (13)

pi(n + 1) = (1 − b) pi(n),
pj(n + 1) = b/(r − 1) + (1 − b) pj(n), ∀ j, j ≠ i. (14)
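To make the formalism concrete, here is a minimal Python sketch of a P-Model learning automaton with the linear updates of equations (13) and (14); the class and method names are illustrative assumptions.

```python
import numpy as np

class LearningAutomaton:
    """Variable-structure LA over r actions with linear reward/penalty updates."""

    def __init__(self, r, a=0.5, b=0.5, seed=None):
        self.p = np.full(r, 1.0 / r)           # action probability vector P, eq. (11)
        self.a, self.b = a, b                  # reward and penalty parameters
        self.rng = np.random.default_rng(seed)

    def choose(self):
        """Sample an action index i according to P(n)."""
        return int(self.rng.choice(len(self.p), p=self.p))

    def update(self, i, beta):
        """P-Model feedback: beta = 0 means reward, beta = 1 means penalty."""
        r = len(self.p)
        if beta == 0:                          # reward, equation (13)
            self.p = (1 - self.a) * self.p
            self.p[i] += self.a                # p_i <- p_i + a(1 - p_i)
        else:                                  # penalty, equation (14)
            p_i = self.p[i]
            self.p = self.b / (r - 1) + (1 - self.b) * self.p
            self.p[i] = (1 - self.b) * p_i
        # Both updates keep the action probabilities summing to one.
```

Setting b = 0 yields the LRI scheme, a = b the LRP scheme, and b much smaller than a the LRεP scheme described above.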
Figure 2: The block diagram of the proposed method.
Input: D = {(x1, y1), (x2, y2), ..., (xN, yN)} denotes the training data, with xi = (xi,1, xi,2, ..., xi,p)
(1) Output: classified test data
(2) Assumption:
(3) LA: learning automata
(4) DTr = {DT1, DT2, ..., DTR} denotes the base learners
(5) αi: LA action // choose DTr
(6) a: reward parameter
(7) b: penalty parameter
(8) Pool: all the trained base learners
(9) Algorithm:
(10) For r = 1 to R do
(11)   Create a dataset Dt by sampling (N/R) items randomly with replacement from D
(12)   Train DTr using Dt, and add it to the pool
(13) end // for
(14) For each test sample
(15) {
(16)   LA = new LA // create an LA object from the LA class
(17)   While ((LA converges to an action) or (LA exceeds the predefined iteration number))
(18)   {
(19)     Select one of the actions at random, by the LA, and execute it; let it be αi
(20)     If (αi predicts the new test sample correctly) then // update the action selection probability vector
(21)       pi(n + 1) = pi(n) + a[1 − pi(n)]; pj(n + 1) = (1 − a) pj(n), ∀ j, j ≠ i // reward the selected αi
(22)     else
(23)       pi(n + 1) = (1 − b) pi(n); pj(n + 1) = b/(R − 1) + (1 − b) pj(n), ∀ j, j ≠ i // penalize the selected αi
(24)   } // end while
(25) } // end for
(26) Return DTr
(27) Classified test data = the prediction of DTr
(28) End // algorithm

Algorithm 2: The proposed method.
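Under the same assumptions, Algorithm 2 could be sketched in Python as follows, reusing the LearningAutomaton class sketched in Section 3; the convergence threshold and the iteration cap are illustrative choices, and the feedback of line (20), which compares the chosen learner's prediction with the known label of the test sample, is kept as the algorithm states it.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_pool(X, y, R=10, seed=0):
    """Lines (10)-(13): R base learners, each trained on N/R samples
    drawn with replacement from D."""
    rng = np.random.default_rng(seed)
    N = len(X)
    pool = []
    for _ in range(R):
        idx = rng.integers(0, N, size=N // R)
        pool.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return pool

def classify(pool, x, y, a=0.5, b=0.5, max_iter=1000, threshold=0.95):
    """Lines (14)-(27): one LA per test sample converges to a base learner."""
    la = LearningAutomaton(len(pool), a=a, b=b)
    for _ in range(max_iter):
        i = la.choose()                         # action alpha_i
        correct = pool[i].predict([x])[0] == y  # environment feedback, line (20)
        la.update(i, beta=0 if correct else 1)  # reward or penalize
        if la.p.max() >= threshold:             # convergence to an action
            break
    best = int(np.argmax(la.p))                 # line (26): return DT_r
    return pool[best].predict([x])[0]
```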
action must be rewarded or, in other words, the probability of its selection must be increased. The increase in the probability of the selected action is determined by the parameters "a" and "b," which are called the reward parameter and the penalty parameter, respectively.

To comply with (16), that is, to keep the sum of the probabilities of all actions equal to one, the probability of all other actions is reduced according to the size of the parameter "a." If the result of the selected action is not useful, that action must be penalized; in other words, the probability of that action must be reduced. To do this, the probability of selecting that action is reduced by the size of parameter "b," and, as in the rewarding mode and to observe (16), the probability of selecting the other actions is increased by the size of the parameter "b."

In the proposed method, the learning automata model of the environment is assumed to be the P-Model, where the environment defines zero and one as its output values. Zero means reward, and one means penalty. If the correct answer is received from the base learner selected by the LA, the chosen action will be rewarded; otherwise, it will be penalized.

5. Evaluation

In order to thoroughly evaluate the efficiency of the proposed method, this section presents the details of the evaluation separately for the data used and the experimental results.

5.1. Datasets. In order to evaluate the proposed method and to create an environment with the dynamic behaviour of data, different domains of applications have been selected. As mentioned in the previous sections, dynamic behaviour refers to the different results that an instance exhibits under different environmental conditions. Variety in the results of different environments is created by a specific domain. Text data are one of the most well-known types of data that exhibit such dynamic behaviour. In other words, these types of data are one of the optimal options for creating a dynamic environment, which proves the adaptability of the proposed method. The details of the data selected for the evaluation phase are shown in Table 2.

5.2. Experimental Result. In order to evaluate the proposed method, the eighteen datasets in different domains introduced in the previous section have been used. In the literature on learning automata, different modes have been considered for tuning learning automata; in this paper, three modes have been used to evaluate the proposed method. The LIP mode is not considered due to its poor results. The evaluation results of each of the LRI, LRεP, and LRP modes are shown in separate figures. In order to determine the optimal values for the reward and penalty parameters, six text datasets have been selected. The reason for this choice is the high diversity in the behaviour of textual data as well as the large number of samples and the large number of features of these six datasets.

In the LRI mode, the value of the penalty parameter is considered to be zero, and the results of the proposed method in this mode are shown in Figure 3. Based on the literature on learning automata, in the LRεP mode the value of the penalty parameter is considered to be much smaller than the value of the reward parameter. The results of the proposed method in the LRεP mode are shown in Figure 4. As mentioned in the learning automata section, in the LRP mode the values of the penalty and reward parameters are considered equal. The results of the proposed method in this mode are shown in Figure 5.

A comparison of the results obtained from the implementation of the proposed method in the three adjustable modes for learning automata shows that the settings in the LRP mode have resulted in the highest identification accuracy, followed by the LRεP and LRI modes. In the LRεP mode, the setting a = 0.01, b = 0.01 is not considered, because these values are equal to the first values set in the LRP mode; in order to prevent duplication of results in different tables, these settings have been removed from the LRεP mode. For this reason, the number of experiments performed in the LRεP mode evaluations is one less than in the other two. Considering that the reward and penalty parameter settings in the LRP mode with the values a = 0.5, b = 0.5 have resulted in the highest efficiency, the evaluation on the other datasets has been done with these settings. A comparison of the proposed method and similar approaches in the subject literature is shown in Table 3.

As can be seen in Table 3, from the point of view of accuracy, the proposed method offers better performance than the methods available in the subject literature, which indicates an improvement in the aggregation model of the base learners. This improvement is due to the use of the reinforcement learning idea in the method of aggregating the basic classifiers, which are known as base learners. The use of reinforcement learning ideas has improved the ability of the created ensemble, and in particular its ability to address issues in which data exhibit dynamic behaviour. The results of the experiments performed on different data confirm the capabilities added to the random forest by the proposed method. As mentioned earlier, in the field of opinion mining, text data are the most obvious data that exhibit such dynamic behaviour. Therefore, the optimal values for the reward and penalty parameters have been determined on these types of data, and these settings have then been used for the other types of data.

In addition to the accuracy criterion, other statistical criteria have been examined to evaluate the proposed method. As can be seen in Table 4, the proposed method has shown better results in both the positive and negative classes than the methods available in the literature. Among the statistical criteria, Precision (P) determines the exactness of the results obtained from the classifier, and Recall (R) determines the completeness of the results obtained from the classifier. The results obtained in the mentioned statistical criteria show that the proposed method has a high performance.
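For reference, the statistical criteria mentioned above can be computed as in the following small sketch; the toy labels and the use of scikit-learn's metrics are assumptions for illustration only.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1]   # toy ground truth: 1 = positive class
y_pred = [1, 0, 1, 0, 0, 1]   # toy classifier output

# Precision (exactness): of the samples predicted positive, the share truly positive.
# Recall (completeness): of the truly positive samples, the share that was found.
print(accuracy_score(y_true, y_pred))   # 5/6 ~ 0.83
print(precision_score(y_true, y_pred))  # 3/3 = 1.00
print(recall_score(y_true, y_pred))     # 3/4 = 0.75
```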
Figure 3: Accuracy of the proposed method in the LRI mode (penalty parameter b = 0).

Dataset                         a = 0.01  a = 0.05  a = 0.1  a = 0.3  a = 0.5  a = 0.7
Sentiment140 dataset            74.05     74.05     74.05    74.05    74.05    74.05
Large dataset of movie reviews  86.13     85.98     85.98    85.98    85.98    85.98
Sentence polarity dataset       72.94     72.94     72.94    72.94    72.94    72.94
Movie reviews dataset           82.31     82.31     82.31    82.31    82.31    82.31
Yelp review polarity            89.56     89.56     89.56    89.56    89.56    89.56
Amazon review polarity          81.41     81.41     81.41    81.41    81.41    81.41

Figure 4: Accuracy of the proposed method in the LRεP mode (a = 0.05, 0.1, 0.3, 0.5, 0.7; b = 0.01).

Figure 5: Accuracy of the proposed method in the LRP mode (a = b).

Dataset                         a = b = 0.01  a = b = 0.05  a = b = 0.1  a = b = 0.3  a = b = 0.5  a = b = 0.7
Sentiment140 dataset            74.7          74.8          75.85        76.85        76.3         75.65
Large dataset of movie reviews  85.98         86.23         87.06        87.35        86.62        86.82
Sentence polarity dataset       74.33         75.53         75.83        76.48        77.03        76.08
Movie reviews dataset           81.94         81.58         83.75        83.03        85.92        83.75
Yelp review polarity            89.78         89.67         89.34        89.78        90.76        89.67
Amazon review polarity          80.41         80.58         81.08        81.16        82.58        81.33
the evaluation, different data from different domains were examined. The preprocessing of textual data, along with the relevant details, is described below. It should be noted that preprocessing for the other types of data, such as feature extraction, feature selection, normalization, noise removal, and other related preprocessing, has not been performed, because all of them are taken as clean data from the UCI Repository [109], and their basis for accuracy is the previous research works that have used these data.

In order to prepare the textual data for the main process, the opinion mining domain is selected, and the related preprocessing is as follows. The details of the preprocessing step for text data in opinion mining are shown in Figure 6.
Table 3: Comparison of the proposed method with similar approaches in the subject literature.

Domain      Dataset                          Averaging  Majority voting  Random forest  Our method
Text        Sentiment140 dataset             74.54      75.50            74.30          76.30
            Large dataset of movie reviews   86.28      86.86            86.42          86.62
            Sentence polarity dataset        73.75      74.63            73.38          77.03
            Movie reviews dataset            81.58      81.58            81.67          85.92
            Yelp review polarity             89.47      90.32            89.74          90.76
            Amazon review polarity           80.86      81.66            80.97          82.58
Healthcare  Heart disease dataset            58.00      57.50            57.50          65.00
            Breast cancer dataset            97.41      97.36            96.49          98.24
            Arrhythmia dataset               80.71      85.71            81.31          85.71
            Parkinson dataset                63.95      64.58            64.58          68.75
            Caesarean section dataset        60.31      62.50            43.75          68.75
            Gene expression dataset          95.59      95.62            96.27          98.75
            Diabetes dataset                 75.77      75.32            74.67          76.62
            Statlog (heart) dataset          81.20      81.48            79.62          85.18
Physical    Ionosphere dataset               91.05      91.54            92.95          95.77
            Sonar, mines vs. rocks dataset   85.23      85.71            73.80          88.09
Sound       Voice dataset                    76.38      76.18            76.49          88.95
            Emotions from music dataset      78.23      78.15            82.35          84.03
Figure 6: The preprocessing steps for text data in opinion mining.
Expressive Lengthening. Word lengthening or word stretching refers to words that are elongated to express a particular emotion strongly; such words with wrong spellings are corrected and replaced with their original words.

Emoticons Handling. The emoticons mentioned in the text are replaced with their meaning, which makes it easier to analyze them.

HTML Markups Removal. HTML markups present in the text are removed, as they do not have any sentimental value attached to them.

Slangs Handling. Slangs are used for writing a given word in short syllables, which depicts the same meaning but saves typing time. In slangs handling, the slangs present in the text are replaced with their original words.

Punctuation Handling. Punctuation is used in a text to separate sentences and their elements and to clarify their meaning. In punctuation handling, once the apostrophes are handled, all the remaining punctuation marks and numbers are removed.

Stopwords Removal. Stopwords do not carry much meaning and have no importance in the text. Stopwords are removed to get a simplified text.

Stemming. Stemming refers to finding the root or stem of a word. Removing various suffixes to reduce the number of words is the purpose of stemming.

Lemmatization. Lemmatization returns the base or dictionary form of a word, which is known as the lemma. It is very similar to stemming, but it is more akin to synonym replacement.

BoW Creation. The bag of words creation is the last preprocessing step performed in the text preparation; a minimal sketch of the whole pipeline is given below.
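The following Python sketch illustrates such a preprocessing pipeline; the regular expressions, the toy slang and emoticon dictionaries, and the use of NLTK and scikit-learn are illustrative assumptions rather than the authors' exact tooling.

```python
import re
from nltk.corpus import stopwords                      # assumes the NLTK data is downloaded
from nltk.stem import PorterStemmer, WordNetLemmatizer
from sklearn.feature_extraction.text import CountVectorizer

SLANGS = {"gr8": "great", "u": "you"}                  # hypothetical slang dictionary
EMOTICONS = {":)": "happy", ":(": "sad"}               # hypothetical emoticon dictionary

def preprocess(text):
    text = re.sub(r"<[^>]+>", " ", text)               # HTML markups removal
    for emoticon, meaning in EMOTICONS.items():        # emoticons handling
        text = text.replace(emoticon, " " + meaning + " ")
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)         # expressive lengthening
    words = [SLANGS.get(w, w) for w in text.lower().split()]  # slangs handling
    words = [re.sub(r"[^a-z]", "", w) for w in words]  # punctuation and number removal
    stop = set(stopwords.words("english"))
    words = [w for w in words if w and w not in stop]  # stopwords removal
    stemmer, lemmatizer = PorterStemmer(), WordNetLemmatizer()
    return " ".join(stemmer.stem(lemmatizer.lemmatize(w)) for w in words)

docs = [preprocess("The movie was sooo gr8 :) <br/> !!!")]
bow = CountVectorizer().fit_transform(docs)            # BoW creation
```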
6.2. Tuning the Parameters of Reward and Penalty. In the subject literature of the learning automata, three different modes have been defined to tune the parameters of reward and penalty. In the proposed method, in which the idea of reinforcement learning is implemented using learning automata, all three adjustable modes of the reward and penalty parameters are examined. The results of these three modes were presented in the experimental result section. In this paper, Friedman test statistical verification is used to determine which mode and which settings are best for the reward and penalty parameters. The values set for parameters "a" and "b" are shown in Table 5. The numerical values of these parameters are determined based on the subject literature of learning automata. Of course, a wide variety of values can be considered for these two parameters. In this paper, an attempt has been made to tune the parameters in such a way that all the modes are considered, so that they can be used to prove the efficiency of the proposed method compared to the previous methods.

6.3. Ranking. Friedman test statistical verification [110] is a ranking method that, through the difference between the ranks assigned to each of the input samples, determines the optimal level of each option. In this paper, this verification method has been used to determine the optimal values of the reward and penalty parameters as well as to compare the proposed method with the conventional methods in the subject literature of ensemble learning. The results are shown in Table 6.

As can be seen in Table 6, there is a significant difference between the rankings of the proposed method and the rankings of the traditional methods, which indicates an improvement in the efficiency of the proposed method compared to other methods. Among the three modes considered for tuning the reward and penalty parameters, it is observed that the rankings increase in the LRI, LRεP, and LRP modes, respectively. In the LRP mode, where the values of the reward and penalty parameters are considered the same, the highest efficiency is also observed. There is a significant difference between the mean rank of the best setting of the reward and penalty parameters in the proposed method and the rank of the random forest method. The difference between the ranks is proof that the proposed method is optimal versus the traditional methods of aggregating classifiers to achieve a strong classification method.
Table 6: Friedman test statistical verification results for ranking the parameters of reward and penalty and comparing the proposed method with the literature.

Method  Tuning              Mean rank  Final rank
LRP     a = 0.5, b = 0.5    19.17      1
LRP     a = 0.3, b = 0.3    16.83      2
LRP     a = 0.7, b = 0.7    15.58      3
MV      Majority voting     14.67      4
LRP     a = 0.1, b = 0.1    13.92      5
LRεP    a = 0.05, b = 0.01  12.17      6
LRεP    a = 0.1, b = 0.01   11.83      7
LRεP    a = 0.5, b = 0.01   10.08      8
LRP     a = 0.05, b = 0.05  9.58       9
RF      Random forest       9.17       10
LRP     a = 0.01, b = 0.01  8.75       11
LRI     a = 0.01, b = 0     8.42       12
LRI     a = 0.05, b = 0     7.67       13
LRI     a = 0.1, b = 0      7.67       13
LRI     a = 0.3, b = 0      7.67       13
LRI     a = 0.5, b = 0      7.67       13
LRI     a = 0.7, b = 0      7.67       13
AV      Averaging           7.58       14
LRεP    a = 0.3, b = 0.01   7.17       15
LRεP    a = 0.7, b = 0.01   6.75       16
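In outline, such a ranking can be reproduced with SciPy's Friedman test, as in the hedged sketch below; the accuracy matrix is a hypothetical stand-in for the per-dataset results, not the values behind Table 6.

```python
import numpy as np
from scipy.stats import friedmanchisquare, rankdata

# Rows: datasets; columns: the methods/settings being compared (toy numbers).
acc = np.array([[76.30, 75.50, 74.30, 74.54],
                [77.03, 74.63, 73.38, 73.75],
                [85.92, 81.58, 81.67, 81.58]])

stat, p_value = friedmanchisquare(*acc.T)        # omnibus test across the columns
mean_rank = rankdata(acc, axis=1).mean(axis=0)   # mean rank per method (higher = better here)
print(stat, p_value, mean_rank)
```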
Figure 7: Convergence rate (LA action probabilities versus number of iterations) for different reward and penalty parameters. (a) a = 0.5, b = 0.5; (b) a = 0.3, b = 0.3; (c) a = 0.7, b = 0; (d) a = 0.1, b = 0.1; (e) a = 0.01, b = 0; (f) a = 0.05, b = 0.05; (g) a = 0.3, b = 0.
number of iterations. As shown in Table 5, convergence at a lower rate occurred in some of the other settings that scored lower on the Friedman test.
Figure 8: The evaluation of the proposed method in the presence of noise (accuracy on the Sentiment140, Diabetes, Sentence polarity, Parkinson, and Voice datasets).
that the proposed method, due to the use of learning automata, has high adaptability to the problem conditions, and in the presence of noise, contrary to the conventional methods in the literature, the proposed method does not suffer a sharp decline; in such conditions, it shows high efficiency compared to the traditional methods. The evaluation of the proposed method in the presence of noise is shown in Figure 8.

7. Conclusion and Future Work

Base learner aggregation in ensemble learning should be done in such a way that the following points are met. First point: selecting a base learner leads to the highest performance achievable in the current situation. Second point: if the situation changes due to the dynamics of the problem, the structure of the ensemble will change in such a way that it has the greatest amount of compatibility with the conditions of the new environment. Therefore, in order to meet the above points and achieve an ensemble that is able to adapt to the dynamic conditions of the problem, in this paper, a new method based on the idea of reinforcement learning is proposed to integrate the base learners in the random forest. In the proposed method, learning automata are used to receive feedback from the environment and perform actions on it. The general procedure is to receive feedback from the environment, where the environment is a set of base learners that we intend to combine to achieve a better performance than the individual base learners. Learning automata actions include choosing one of the base learners as the best base learner. The choice of action is based on receiving feedback from the environment. This causes the dynamic behaviour of data to be covered by using the idea of reinforcement learning. On the other hand, given that at each stage the learning automata strive to achieve the highest amount of achievable reward, they are guaranteed to find the global optimum in the problem space. Adaptability is another advantage of the proposed method compared to similar methods in the subject literature.

Due to the fact that in each step the learning automata operate based on environmental conditions and the feedback received from the environment, the ability to adapt to the problem is met. The results of the evaluations performed on different data show that the proposed method has the ability to achieve all the desired items mentioned above. Despite the fact that, unlike the random forest mechanism, all features are injected into all base learners in the proposed method, the efficiency of the proposed method in dealing with large-volume data has not decreased, and the results are more favorable than those of the classical methods. The proposed method is independent of the data type and has the ability to handle any other type of data in any field. In order to substantiate this claim, and in order to evaluate the proposed method, different types of data have been chosen. However, there are no restrictions on the proposed method for dealing with different types of data. In this paper, a new method for aggregating the base learners of the random forest using learning automata is proposed. Determining the optimal values for the parameters of reward and penalty in the form of self-tuning is one of the future works that the authors intend to do.
[32] S. K. Mohapatra and M. N. Mohanty, "Big data analysis and classification of biomedical signal using random forest algorithm," New Paradigm in Decision Science and Management, pp. 217–224, Springer, New York, NY, USA, 2020.
[33] A. Joshi, T. Choudhury, A. Sai Sabitha, and K. Srujan Raju, "Data mining in healthcare and predicting obesity," in Proceedings of the Third International Conference on Computational Intelligence and Informatics, pp. 877–888, Hyderabad, India, 2020.
[34] S. El-Sappagh, R. Sahal et al., "Alzheimer's disease progression detection model based on an early fusion of cost-effective multimodal data," Future Generation Computer Systems, vol. 115, pp. 680–699, 2021.
[35] Y. Saleh, A. Halidou, and P. T. Kapen, "A review of mathematical modeling, artificial intelligence and datasets used in the study, prediction and management of COVID-19," Applied Intelligence, vol. 50, no. 11, pp. 3913–3925, 2020.
[36] S. Khedkar, P. Gandhi, G. Shinde, and V. Subramanian, Deep Learning and Explainable AI in Healthcare Using EHR, pp. 129–148, Springer, New York, NY, USA, 2020.
[37] T. Han, N. Stone-Weiss, J. Huang, A. Goel, and A. Kumar, "Machine learning as a tool to design glasses with controlled dissolution for healthcare applications," Acta Biomaterialia, vol. 107, pp. 286–298, 2020.
[38] A. Subudhi, M. Dash, and S. Sabut, "Automated segmentation and classification of brain stroke using expectation-maximization and random forest classifier," Biocybernetics and Biomedical Engineering, vol. 40, no. 1, pp. 277–289, 2020.
[39] A. Javadi, A. Khamesipour, F. Monajemi, and M. Ghazisaeedi, "Computational modeling and analysis to predict intracellular parasite epitope characteristics using random forest technique," Journal of Public Health, vol. 49, no. 1, p. 125, 2020.
[40] T. Shaikhina, D. Lowe, S. Daga, D. Briggs, R. Higgins, and N. Khovanova, "Decision tree and random forest models for outcome prediction in antibody incompatible kidney transplantation," Biomedical Signal Processing and Control, vol. 52, pp. 456–462, 2019.
[41] K. K. Singh, S. Kumar, P. Dixit, and M. K. Bajpai, "Kalman filter based short term prediction model for COVID-19 spread," Applied Intelligence, pp. 1–13, 2020.
[42] S.-J. Na, J.-W. Shin, S.-H. Eom, and E.-H. Lee, "A study on random forest-based estimation model for changing the automatic walking mode of above knee prosthesis," The Journal of IKEEE, vol. 24, no. 1, pp. 9–18, 2020.
[43] M. Alloghani, T. Baker, D. Al-Jumeily, A. Hussain, J. Mustafina, and A. J. Aljaaf, "Prospects of machine and deep learning in analysis of vital signs for the improvement of healthcare services," Nature-Inspired Computation in Data Mining and Machine Learning, pp. 113–136, Springer, New York, NY, USA, 2020.
[44] Y. Zhu, W. Xu, G. Luo, H. Wang, J. Yang, and W. Lu, "Random forest enhancement using improved artificial fish swarm for the medial knee contact force prediction," Artificial Intelligence in Medicine, vol. 103, Article ID 101811, 2020.
[45] H. Zhang et al., "Deep multi-model cascade method based on CNN and random forest for pharmaceutical particle detection," IEEE Transactions on Instrumentation and Measurement, vol. 69, no. 9, pp. 7028–7042, 2020.
[46] H. Lee and E. Jung, An Analysis of Annual Changes on the Determining Factors for Teacher Attachment with Random Forest, pp. 463–470, Springer, New York, NY, USA, 2020.
[47] X. Liu, L. Liu et al., "Downscaling of solar-induced chlorophyll fluorescence from canopy level to photosystem level using a random forest model," Remote Sensing of Environment, vol. 231, Article ID 110772, 2019.
[48] S. Guanter and J. Santosh Kumar, "Performance evaluation of random forest with feature selection methods in prediction of diabetes," International Journal of Electrical and Computer Engineering, vol. 10, 2020.
[49] A. Subasi, A. Ahmed, E. Aličković, and A. Rashik Hassan, "Effect of photic stimulation for migraine detection using random forest and discrete wavelet transform," Biomedical Signal Processing and Control, vol. 49, pp. 231–239, 2019.
[50] N. El Haouij, J.-M. Poggi, R. Ghozi, S. Sevestre-Ghalila, and M. Jaïdane, "Random forest-based approach for physiological functional variable selection for driver's stress level classification," Statistical Methods & Applications, vol. 28, no. 1, pp. 157–185, 2019.
[51] D. Ayata, Y. Yaslan, and M. E. Kamasak, "Emotion recognition from multimodal physiological signals for emotion aware healthcare systems," Journal of Medical and Biological Engineering, vol. 40, pp. 149–157, 2020.
[52] M. Zeraatpisheh, E. Bakhshandeh, M. Hosseini, and S. M. Alavi, "Assessing the effects of deforestation and intensive agriculture on the soil quality through digital soil mapping," Geoderma, vol. 363, Article ID 114139, 2020.
[53] X. Du, P. Wang, L. Fu, H. Liu, Z. Zhang, and C. Yao, "Determination of chlorpyrifos in pears by Raman spectroscopy with random forest regression analysis," Analytical Letters, vol. 53, no. 6, pp. 821–833, 2020.
[54] J. Wang, R. Zuo, and Y. Xiong, "Mapping mineral prospectivity via semi-supervised random forest," Natural Resources Research, vol. 29, no. 1, pp. 189–202, 2020.
[55] S. Liu, X. Qian, H. Wan, Z. Ye, S. Wu, and X. Ren, "NPC three-level inverter open-circuit fault diagnosis based on adaptive electrical period partition and random forest," Journal of Sensor and Actuator Networks, vol. 2020, Article ID 9206579, 18 pages, 2020.
[56] S. S. Rathore and S. Kumar, "An empirical study of ensemble techniques for software fault prediction," Applied Intelligence, pp. 1–30, 2020.
[57] T. Ahmad and H. Chen, "Nonlinear autoregressive and random forest approaches to forecasting electricity load for utility energy management systems," Sustainable Cities and Society, vol. 45, pp. 460–473, 2019.
[58] S. Gupta, J. Sarkar, M. Kundu, N. R. Bandyopadhyay, and S. Ganguly, "Automatic recognition of SEM microstructure phases of steel using LBP and random decision forest operator," Measurement, vol. 151, Article ID 107224, 2020.
[59] L. T. T. Ho, L. Dubus, M. De Felice, and A. Troccoli, "Reconstruction of multidecadal country-aggregated hydro power generation in Europe based on a random forest model," Energies, vol. 13, no. 7, p. 1786, 2020.
[60] Y. Zhou, S. Li, C. Zhou, and H. Luo, "Intelligent approach based on random forest for safety risk prediction of deep foundation pit in subway stations," Journal of Computing in Civil Engineering, vol. 33, no. 1, Article ID 05018004, 2019.
[61] X. Deng, Y. Zhan et al., "Predictive geographical authentication of green tea with protected designation of origin using a random forest model," Food Control, vol. 107, Article ID 106807, 2020.
[62] S. A. Liu, P. Ngare, and D. Ikpe, "Probabilistic forecasting of crop yields via quantile random forest and Epanechnikov kernel function," Agricultural and Forest Meteorology, vol. 280, Article ID 107808, 2020.
[63] H. J. Jeong and M. H. Kim, "Utilizing adjacency of colleagues and type correlations for enhanced link prediction," Data & Knowledge Engineering, vol. 125, Article ID 101785, 2020.
[64] Z. Khorshidpour, S. Hashemi, and A. Hamzeh, "Evaluation of random forest classifier in security domain," Applied Intelligence, vol. 47, no. 2, pp. 558–569, 2017.
[65] J. Tian, L. Liu, F. Zhang, Y. Ai, R. Wang, and C. Fei, "Multi-domain entropy-random forest method for the fusion diagnosis of inter-shaft bearing faults with acoustic emission signals," Entropy, vol. 22, no. 1, p. 57, 2020.
[66] B. Shaw, A. K. Suman, and B. Chakraborty, Wine Quality Analysis Using Machine Learning, pp. 239–247, Springer, New York, NY, USA, 2020.
[67] K. Madhumathi and T. Suresh, Node Localization in Wireless Sensor Networks Using Multi-Output Random Forest Regression, pp. 177–186, Springer, New York, NY, USA, 2020.
[68] Y. Fang, Y. Xu, C. Huang, L. Liu, and L. Zhang, "Against malicious SSL/TLS encryption: identify malicious traffic based on random forest," in Proceedings of the Fourth International Congress on Information and Communication Technology, pp. 99–115, London, UK, 2020.
[69] T. T. Bhavani, M. K. Rao, and A. M. Reddy, "Network intrusion detection system using random forest and decision tree machine learning techniques," in Proceedings of the First International Conference on Sustainable Technologies for Computational Intelligence, pp. 637–643, London, UK, 2020.
[70] P. S. Chaithanya, M. R. G. Raman, S. Nivethitha, K. S. Seshan, and V. S. Sriram, "An efficient intrusion detection approach using enhanced random forest and moth-flame optimization technique," Computational Intelligence in Pattern Recognition, pp. 877–884, Springer, New York, NY, USA, 2020.
[71] Z. Mingjing, "A novel strategy for quantitative analysis of soil pH via laser-induced breakdown spectroscopy coupled with random forest," Plasma Science and Technology, vol. 22, no. 7, p. 74003, 2020.
[72] M.-H. Lee, "Robust random forest based non-fullerene organic solar cells efficiency prediction," Organic Electronics, vol. 76, Article ID 105465, 2020.
[73] J. Zhang, G. Ma, Y. Huang, J. Sun, and F. Aslani, "Modelling uniaxial compressive strength of lightweight self-compacting concrete using random forest regression," Construction and Building Materials, vol. 210, pp. 713–719, 2019.
[74] W. Nener, C. Wu, H. Zhong, Y. Li, and L. Wang, "Prediction of undrained shear strength using extreme gradient boosting and random forest based on Bayesian optimization," Geoscience Frontiers, vol. 12, no. 1, pp. 469–477, 2020.
[75] P. Zhang, Z.-Y. Yin, Y.-F. Jin, and T. H. T. Chan, "A novel hybrid surrogate intelligent model for creep index prediction based on particle swarm optimization and random forest," Engineering Geology, vol. 265, Article ID 105328, 2020.
[76] K. Shah, H. Patel, D. Sanghvi, and M. Shah, "A comparative analysis of logistic regression, random forest and KNN models for the text classification," Augmented Human Research, vol. 5, no. 1, pp. 1–16, 2020.
[77] S. Hakak, M. Alazab, S. Khan, T. R. Gadekallu, P. K. R. Maddikunta, and W. Z. Khan, "An ensemble machine learning approach through effective feature extraction to classify fake news," Future Generation Computer Systems, vol. 117, pp. 47–58, 2021.
[78] S. N. Singh and T. Sarraf, "Sentiment analysis of a product based on user reviews using random forests algorithm," Data Science & Engineering, vol. 32, pp. 112–116, 2020.
[79] H. Zhang, N. Cheng, Y. Zhang, and Z. Li, "Label flipping attacks against Naive Bayes on spam filtering systems," Applied Intelligence, 2021.
[80] R. P. Kaur, M. Kumar, and M. K. Jindal, "Newspaper text recognition of Gurumukhi script using random forest classifier," Multimedia Tools and Applications, pp. 1–14, 2019.
[81] S. Madichetty and M. Sridevi, "A novel method for identifying the damage assessment tweets during disaster," Future Generation Computer Systems, vol. 116, pp. 440–454, 2020.
[82] A. Madasu and S. Elango, "Efficient feature selection techniques for sentiment analysis," Multimedia Tools and Applications, vol. 79, no. 9-10, pp. 6313–6335, 2020.
[83] A.-C. Chang, C. V. Trappey, A. J. C. Trappey, and L. W. L. Chen, "Web mining customer perceptions to define product positions and design preferences," International Journal on Semantic Web and Information Systems, vol. 16, no. 2, pp. 42–58, 2020.
[84] R. Kumar and J. Kaur, "Random forest-based sarcastic tweet classification using multiple feature collection," in Multimedia Big Data Computing for IoT Applications, pp. 131–160, Springer, New York, NY, USA, 2020.
[85] A. Onan and M. A. Toçouglu, "Weighted word embeddings and clustering-based identification of question topics in MOOC discussion forum posts," Computer Applications in Engineering Education, 2020.
[86] O. M. Baez-Villanueva and M. Zambrano, "RF-MEP: a novel random forest method for merging gridded precipitation products and ground-based measurements," Remote Sensing of Environment, vol. 239, Article ID 111606, 2020.
[87] A. Beck, "Sentiment analysis on massive open online course evaluations: a text mining and deep learning approach," Computer Applications in Engineering Education, 2020.
[88] A. Onan, "Sentiment analysis on product reviews based on weighted word embeddings and deep neural networks," Computer Applications in Engineering Education, Article ID e5909, 2020.
[89] A. Onan and M. A. Tocoglu, "A term weighted neural language model and stacked bidirectional LSTM based framework for sarcasm identification," IEEE Access, vol. 9, pp. 7701–7722, 2021.
[90] C. Rodriguez-Pardo, M. A. Patricio, A. Berlanga, and J. M. Molina, Machine Learning for Smart Tourism and Retail, pp. 311–333, IGI Global, 2020.
[91] W. Song and Y. Zhou, "Road travel time prediction method based on random forest model," in Smart Trends in Computing and Communications, pp. 155–163, Springer, New York, NY, USA, 2020.
[92] A. Jamatia, U. Baidya, S. Paul, S. DebBarma, and S. Dey, "Rating prediction of tourist destinations based on supervised machine learning algorithms," Computational Intelligence in Data Mining, pp. 115–125, Springer, New York, NY, USA, 2020.
[93] C. F. Baumeister, T. Gerstenberg, T. Plieninger, and U. Schraml, "Exploring cultural ecosystem service hotspots: linking multiple urban forest features with public participation mapping data," Urban Forestry & Urban Greening, vol. 48, Article ID 126561, 2020.
[94] J. Evans, B. Waterson, and A. Hamilton, "Forecasting road traffic conditions using a context-based random forest