
Genetic Algorithm and Artificial Neural Network

The document discusses genetic algorithms and their application to vibration isolation optimization. It describes the basic concepts of genetic algorithms including populations, chromosomes, genes, encoding, fitness functions, and genetic operators such as selection, crossover and mutation. It also provides an example of using a genetic algorithm to optimize the parameters of a trench for vibration isolation.


3 Genetic Algorithm and Artificial Neural Network

3.1 Genetic Algorithm

Optimization consists of studying different aspects of an initial idea and using the information gained to improve it. A computer is a perfect tool for optimization when the factors influencing the idea can be input in a computer-readable format. The term "best solution" in optimization implies that there is more than one solution and that the solutions are not of equal value.
A Genetic Algorithm (GA) is a high-level, search-based optimization technique inspired by genetics and natural selection. It belongs to the much larger branch of computation known as evolutionary algorithms [140]. The principle of the search techniques in a GA is based on Darwin’s theory of evolution [31].
A GA offers a random search in a complex landscape. One general principle for implementing an algorithm for a specific problem is to create a proper balance between exploration and exploitation of the search space. To reach this aim, all operators of the GA should be examined carefully [101].
In a GA, a pool of candidate solutions (called individuals) to a given problem is evolved toward better solutions. The set of properties of each candidate solution is called a chromosome. A chromosome is composed of genes, and their values can be numerical, binary, symbols or characters, depending on the problem to be solved. The output is generated by a function to be minimized, evaluated on the set of properties of each candidate solution (its chromosome).
The fitness function can be an experimental result or a mathematical function. It calculates the difference between the desired and the calculated output. Therefore, determining a proper fitness function and recognizing the most important input variables is essential. In a GA, the output of the fitness function is the quantity to be minimized [90].

© The Author(s), under exclusive license to Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2024
M. Naghizadeh, Dynamic of Soil in Ground-Borne Vibration Mitigation,
https://doi.org/10.1007/978-3-658-44352-8_3

An attempt has to be made to select an optimal size for the initial population. Too small a population does not leave sufficient room for exploring the search space effectively, while too large a population increases the computational cost. Therefore, the population size should be selected based on the complexity of the fitness function, computational cost, memory, and time.
An attempt is made here to show the application of a genetic algorithm to vibration isolation with a single-wall trench. The aim is to find the parameters of a rectangular trench that give the highest efficiency. The parameters considered for optimization are the location (X), depth (D), width (W) and length (L) of the trench. Each parameter of the trench is defined as a gene, which is generated randomly within the range defined in Table 3.1. Each chromosome includes 4 genes, which are the parameters of the trench, and the population is the set of all chromosomes.

Table 3.1 Defined ranges for the selected parameters

Parameters      Min (m)   Max (m)
Location (X)    3         10
Depth (D)       2         6
Width (W)       0.3       1
Length (L)      5         15

Fig. 3.1 shows an example of a population, three chromosomes and randomly generated genes for a single-wall barrier.

Figure 3.1 Example of gene, chromosome and population in vibration isolation problem


Encoding is the process of representing individual genes. One of the most important decisions when implementing a genetic algorithm is how to represent the solutions. Encoding can be performed using binary or floating methods. In the binary encoding representation, which is illustrated in Fig. 3.2, each chromosome is a bit string. Each bit in the string can represent some characteristic of the solution, and the whole string represents a number. Every bit string is a solution, but not necessarily the best solution.

Figure 3.2 Binary encoding representation

In floating encoding, every chromosome is a string of values, and the values can be anything connected to the problem. This encoding method produces the best results for some special problems in which complicated values, such as real numbers, are used. For problems whose genes are continuous rather than discrete variables, the real-valued or floating-point representation is the best choice. Fig. 3.3 shows a floating encoding example.

0.5 0.2 0.6 0.8 0.7 0.4 0.3 0.2 0.1 0.9

Figure 3.3 Floating encoding representation

The encoding method is selected depending on the problem to be solved. In vibration isolation, floating encoding is used since all the parameters are real values and can be decimal numbers.
As explained, generating an initial population is one of the first steps in developing a GA model. For this purpose, an initial population of 10 chromosomes for the vibration isolation problem is generated randomly and illustrated in Fig. 3.4.

Figure 3.4 Randomly generated initial population for the GA within the defined ranges
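
The generation of such an initial population can be sketched in a few lines of Python. This is only an illustration: the text states that genes are generated randomly within the ranges of Table 3.1 but does not name a distribution, so the uniform draw, the fixed seed and the names `RANGES`, `random_chromosome` and `initial_population` are assumptions made for the example.

```python
import random

# Parameter ranges from Table 3.1 in metres: (min, max).
RANGES = {
    "X": (3.0, 10.0),   # location
    "D": (2.0, 6.0),    # depth
    "W": (0.3, 1.0),    # width
    "L": (5.0, 15.0),   # length
}

def random_chromosome(rng):
    """One chromosome [X, D, W, L]; each gene is drawn from its own range.
    Assumption: a uniform distribution is used for every gene."""
    return [rng.uniform(low, high) for low, high in RANGES.values()]

def initial_population(size=10, seed=0):
    """A population is a list of `size` chromosomes."""
    rng = random.Random(seed)   # fixed seed only to make the sketch repeatable
    return [random_chromosome(rng) for _ in range(size)]

population = initial_population()
for chromosome in population[:3]:
    print([round(gene, 2) for gene in chromosome])
```

Each row printed is one candidate trench, with floating encoding used directly for the four genes.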

3.1.1 Fitness Function

The fitness function evaluates how good a chromosome is as a solution to the problem. In a genetic algorithm, the chromosome and the solution it encodes are referred to as the genotype and the phenotype, respectively. The fitness value is calculated repeatedly in a GA, and therefore its calculation should be sufficiently fast. In most cases, the fitness function and the objective function are the same, as the objective is to either maximize or minimize the given objective function. However, for more complex problems with multiple objectives and constraints, an algorithm designer chooses different fitness functions.
In the vibration isolation problem, the goodness of each chromosome is evaluated through a finite element model (Plaxis). The dimensions of the trench encoded in a chromosome are entered into the Plaxis model, and the efficiency of the trench for that chromosome is then calculated. This process evaluates the goodness of different parameter sets. Fig. 3.5 represents the calculated efficiency of each chromosome for the initially generated population. As can be seen, different efficiencies are calculated for different chromosomes.


Figure 3.5 Calculated efficiency of each chromosome

3.1.2 Genetic Operators

Genetic operators are the heart of a genetic algorithm for guiding the algorithm
towards a solution to a given problem. Operators create new and fitter chromosomes
[79]. The three main operators are as follows:

1. Selection
2. Crossover
3. Mutation

Selection picks the best solutions, called parents, which contribute to the population of the next generation. Crossover combines the selected solutions of two parents into children for the next generation, and mutation creates and maintains random diversity in the population. Crossover and mutation together are called recombination.

Selection
Selection is the process of choosing two parents from the population to create a new population. Following the steps of encoding and evaluating chromosomes with a fitness function, the next step is to decide how to perform selection. A selection operator aims to emphasize fitter chromosomes in the population. Parents are selected from the initial population. According to Darwin’s theory of evolution, the best chromosome survives to create new offspring [113]. Selection is a procedure to pick chromosomes from the population according to the fitness function evaluation. A chromosome with a higher fitness value has a greater chance of being selected [102].
The methods for selecting two parents from the population for crossover are classified as fitness-based selection and ranking-based selection.

Fitness proportionate selection method


In this method, an individual can become a parent with a probability which is propor-
tional to its fitness. Therefore, fitter individuals have a higher chance of reproduction
and propagating their features to the next generation. Two implementations of fitness
proportionate selection are Roulette Wheel and Stochastic Universal.

1- Roulette wheel selection method


The principle of the roulette wheel selection method is to select a chromosome from the population with a probability proportional to its fitness value. A wheel, known as the roulette wheel, is divided into sections whose sizes are based on the fitness value of each chromosome. A fixed point is chosen on the wheel circumference, as shown in Fig. 3.6. The selection process is based on spinning the wheel. The region of the wheel that stops in front of the fixed point is chosen as the parent.

Figure 3.6 Roulette wheel selection


2- Stochastic universal method


In the stochastic universal method, each chromosome is mapped to a segment of a line, like the sections of the wheel in roulette wheel selection. N equally spaced points are placed on the line, where N is the number of chromosomes to be selected. The distance between the points is 1/N, and the position of the first point is set by a randomly generated number. For instance, for 6 individuals to be selected, the distance between the points is 1/6 = 0.167. The selection process for this example is shown in Fig. 3.7.

Figure 3.7 Stochastic universal method
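
The two fitness proportionate methods above can be sketched as follows. The sketch assumes non-negative fitness values where larger is better, and it assumes that the first pointer of the stochastic universal method is drawn from [0, 1/N] of the total fitness, which is the usual convention but is not stated in the text. The function names and the efficiency values are illustrative only.

```python
import random
from itertools import accumulate

def roulette_wheel(fitness, rng):
    """Spin once: return one index with probability proportional to fitness."""
    total = sum(fitness)
    pointer = rng.uniform(0.0, total)     # where the wheel stops
    running = 0.0
    for index, value in enumerate(fitness):
        running += value
        if pointer <= running:
            return index
    return len(fitness) - 1               # guard against rounding error

def stochastic_universal(fitness, n, rng):
    """Return n indices using n equally spaced pointers and one random start."""
    total = sum(fitness)
    step = total / n                      # equals 1/n for normalized fitness
    start = rng.uniform(0.0, step)        # assumption: start lies in [0, step]
    cumulative = list(accumulate(fitness))
    chosen, index = [], 0
    for k in range(n):
        pointer = start + k * step
        # Advance to the segment that contains this pointer.
        while index < len(cumulative) - 1 and cumulative[index] < pointer:
            index += 1
        chosen.append(index)
    return chosen

rng = random.Random(1)
efficiency = [0.10, 0.25, 0.05, 0.30, 0.20, 0.10]   # illustrative values only
print(roulette_wheel(efficiency, rng))
print(stochastic_universal(efficiency, 6, rng))
```

With 6 pointers spaced 1/6 apart, any chromosome whose segment is longer than 1/6 of the line is selected at least once, which is the main practical difference from repeated roulette wheel spins.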

Ranking-based selection method

In this method, the population is sorted according to objective values. The fitness assigned to each individual depends only on its position in the ranking of individuals. Ranking introduces a uniform scaling across the population and provides a simple and effective way of controlling selective pressure. The probability of each individual being selected for reproduction depends on its fitness normalized by the total fitness of the population.
Tournament selection is a ranking-based selection method that involves choosing a number of chromosomes randomly from the population and picking the best individual for the new population. The number of chromosomes in the set is called the tournament size. Assuming a tournament size of 3, three chromosomes are selected from the pool, their fitness is compared, and the fittest chromosome is selected to reproduce. Fig. 3.8 shows how the method works.


Figure 3.8 Tournament selection method

The tournament selection method is used to select the fittest candidates from the current population in the vibration isolation problem.
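
A minimal sketch of tournament selection with a tournament size of 3 is given below. It assumes that a higher fitness (efficiency) is better and that contestants are drawn without repetition; the placeholder chromosome labels and the efficiency values are invented for the illustration.

```python
import random

def tournament_select(population, fitness, size, rng):
    """Draw `size` distinct contestants at random and return the fittest one."""
    contestants = rng.sample(range(len(population)), size)
    winner = max(contestants, key=lambda i: fitness[i])
    return population[winner]

rng = random.Random(2)
population = ["c1", "c2", "c3", "c4", "c5", "c6"]   # placeholder chromosomes
efficiency = [0.42, 0.61, 0.35, 0.58, 0.27, 0.49]   # illustrative values only

parent_a = tournament_select(population, efficiency, 3, rng)
parent_b = tournament_select(population, efficiency, 3, rng)
print(parent_a, parent_b)
```

A larger tournament size raises the selective pressure: when the tournament contains the whole population, the best chromosome always wins.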

Crossover
A crossover operator is responsible for taking two parent solutions and producing offspring for the next generation, in order to explore a much wider area of the solution space and find the globally optimal solution for the problem. A crossover selects two or more chromosomes of the population as parents and produces one or more offspring by choosing genes from either of the chosen parents or from a combination of both parents.
The duty of a crossover is to share information between individuals: the features of two parent chromosomes are combined to produce two children, with the aim of generating better solutions. The crossover probability is a parameter that shows how often the crossover will be performed [80].
Summary of the process of the crossover operator:

1. Getting the selected parents from the selection operator
2. Selecting a random cross line in the chromosome
3. Finally, exchanging the values of the chromosomes across the cross line


Different kinds of crossover operators include:

1- A single-point crossover (for binary representation)

In this method, a single corresponding point is selected randomly to cut the two parents, and the parents are then combined at the crossover point to create new children. Fig. 3.9 shows an example of the single-point crossover method. The red line shows the crossover point. The contents beyond this point are swapped between the parents to produce new children for use in the next generation.

Figure 3.9 Example of single-point crossover creating children

2- A two-point crossover (for binary representation)

Two cut points are selected in this method to provide a greater combination of the parents and create better children. Fig. 3.10 shows an example of the two-point crossover method. The red lines show the crossover points. The contents between these points are swapped between the parents to produce new children for use in the next generation.

Figure 3.10 Example of two-point crossover creating children

3- A uniform crossover (for binary representation)

In this method, each gene is treated separately rather than dividing a chromosome into segments. The operator uses a randomly generated binary string, called the mask, and provides uniformity by swapping genes between the parents. The values of the children are copied from the parents according to the bits of the mask. When there is a 1 in the crossover mask, the gene is copied from the first parent, and when there is a 0 in the mask, the gene is copied from the second parent. Fig. 3.11 shows an example of the uniform crossover method.

Figure 3.11 Example of uniform crossover creating children

4- An arithmetic crossover (for floating representation)

This operator linearly combines two parents through Eq. 3.1. In this crossover, two parents are chosen randomly, and new children are generated by combining these chromosomes.

$$y_i^{(1)} = \alpha_i x_i^{(1)} + (1 - \alpha_i)\, x_i^{(2)}$$
$$y_i^{(2)} = \alpha_i x_i^{(2)} + (1 - \alpha_i)\, x_i^{(1)} \qquad (3.1)$$

where $\alpha_i$ is a uniform random number in the range [0, 1], and $y_i$ and $x_i$ are the new children and the selected parents, respectively.
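
The three binary crossover operators listed above (single-point, two-point and uniform) can be sketched as follows. The function names are hypothetical, the cut points are assumed to lie strictly inside the chromosome, and the second child of the uniform crossover is assumed to use the complementary mask, since the text only states the rule for one child.

```python
import random

def single_point_crossover(p1, p2, rng):
    cut = rng.randint(1, len(p1) - 1)             # cut strictly inside the string
    # Everything beyond the cut is swapped between the parents.
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def two_point_crossover(p1, p2, rng):
    a, b = sorted(rng.sample(range(1, len(p1)), 2))   # two distinct cut points
    # Only the segment between the two cuts is swapped.
    return p1[:a] + p2[a:b] + p1[b:], p2[:a] + p1[a:b] + p2[b:]

def uniform_crossover(p1, p2, rng):
    mask = [rng.randint(0, 1) for _ in p1]
    # Mask bit 1 copies the gene from the first parent, 0 from the second.
    child1 = [x if m else y for x, y, m in zip(p1, p2, mask)]
    # Assumption: the second child is built with the complementary mask.
    child2 = [y if m else x for x, y, m in zip(p1, p2, mask)]
    return child1, child2

rng = random.Random(4)
parent1 = [1, 0, 1, 1, 0, 0, 1, 0]
parent2 = [0, 1, 1, 0, 1, 0, 0, 1]
for operator in (single_point_crossover, two_point_crossover, uniform_crossover):
    print(operator.__name__, operator(parent1, parent2, rng))
```

In all three operators every gene position of the two children holds exactly the two values found at that position in the parents, so no genetic material is created or lost.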
The arithmetic crossover method is used in the vibration isolation problem to generate new children. As explained, two parents are selected randomly from the initial population to apply crossover; here, chromosomes 1 and 4 are selected. Before applying crossover, a random value for $\alpha_i$ is generated in the interval [0, 1], which is $\alpha_i = 0.3$ for this example. Substituting this value of $\alpha_i$ into Eq. 3.1 results in Eq. 3.2:

$$Y_1 = 0.3\,x_1 + (1 - 0.3)\,x_2$$
$$Y_2 = 0.3\,x_2 + (1 - 0.3)\,x_1 \qquad (3.2)$$

Crossover is now applied to the selected parents from the initial population based on Eq. 3.2. Fig. 3.12 illustrates the selected parents and the newly generated children after applying crossover.

Figure 3.12 Selected parents and newly generated children after applying crossover
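
The arithmetic crossover with a weight of 0.3 can be checked numerically with a short sketch. The two parent chromosomes below are hypothetical values chosen inside the ranges of Table 3.1, not the chromosomes of Fig. 3.12, and a single weight is applied to all genes as in the worked example.

```python
def arithmetic_crossover(x1, x2, alpha):
    """Blend two parents gene by gene; the second child swaps the parents' roles."""
    y1 = [alpha * a + (1 - alpha) * b for a, b in zip(x1, x2)]
    y2 = [alpha * b + (1 - alpha) * a for a, b in zip(x1, x2)]
    return y1, y2

# Hypothetical parents in the order [X, D, W, L], in metres.
parent1 = [4.0, 3.0, 0.5, 10.0]
parent2 = [8.0, 5.0, 0.9, 6.0]

child1, child2 = arithmetic_crossover(parent1, parent2, alpha=0.3)
print([round(g, 2) for g in child1])
print([round(g, 2) for g in child2])
```

Because each child gene is a weighted average of the two parent genes with weights in [0, 1], the children always stay inside the parameter ranges when both parents do.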

Mutation
The next step after the crossover is to prevent the algorithm from being trapped in a local minimum. This duty is performed by a mutation operator. A mutation operator is insurance for randomly distributing genetic information. If a crossover is considered to perform exploitation of the current solution to find a better one, mutation is viewed as an operator that assists in the exploration of the whole search space. A mutation introduces new structures in the population by randomly changing some of its building blocks and assists in escaping from local minimum traps.
In addition, it tries to maintain genetic diversity in the new population. Mutation of variables means adding randomly created values to the variables with a mutation probability (Pm). The mutation probability decides how often a chromosome will be mutated. There are many different kinds of mutation operators for different forms of representation, including binary and floating (real value) representations. Different kinds of mutation operators include:

• Flipping (for binary representation)

Flipping a bit involves changing 0 to 1 and 1 to 0 based on a mutation chromosome, which is generated randomly. Fig. 3.13 shows a parent, a randomly generated mutation chromosome and the child. For a value of 1 in the mutation chromosome, the corresponding bit in the parent chromosome is flipped (0 to 1 and 1 to 0) and the child chromosome is produced. In this case, the value of 1 occurs at three positions in the mutation chromosome. This operator is suitable for a bit-string representation, where every variable has only two values.

Figure 3.13 Example of flipping mutation creating a child

• Reversing (for binary representation)

In this method, a randomly generated position is selected, and the bits next to that position are reversed to produce a child chromosome. Fig. 3.14 shows a parent, a randomly generated point and the child.

Figure 3.14 Example of reversing mutation creating a child

• Uniform (for floating representation)

Uniform mutation randomly selects one variable from the chromosome and assigns it a uniform random value. Fig. 3.15 shows a parent and randomly selected points used to create random changes in the genes.

Figure 3.15 Example of uniform mutation creating a child

• Gaussian (for floating representation)

There are two main parameters in Gaussian mutation, which are the mean and the standard deviation (σ) of the Gaussian distribution. As can be seen in Fig. 3.16, a randomly generated value from the Gaussian distribution is added to each element of an individual to produce a new child.

Figure 3.16 Example of Gaussian mutation creating a child
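
Three of the mutation operators above can be sketched as follows. The function names and example values are hypothetical. The Gaussian sketch does not clip the mutated genes back into their allowed ranges, a step the text does not discuss, and the uniform sketch assumes that the new value is drawn from the range of the selected gene.

```python
import random

def flip_mutation(parent, mutation_chromosome):
    """Flip each parent bit where the mutation chromosome holds a 1."""
    return [bit ^ m for bit, m in zip(parent, mutation_chromosome)]

def uniform_mutation(parent, ranges, rng):
    """Replace one randomly chosen gene by a uniform draw from its range."""
    child = list(parent)
    k = rng.randrange(len(child))
    child[k] = rng.uniform(*ranges[k])
    return child

def gaussian_mutation(parent, sigma, rng, mean=0.0):
    """Add an independent Gaussian sample to every gene (no clipping)."""
    return [gene + rng.gauss(mean, sigma) for gene in parent]

rng = random.Random(5)
print(flip_mutation([1, 0, 1, 1, 0, 0, 1, 0], [0, 1, 0, 0, 1, 0, 0, 1]))

ranges = [(3.0, 10.0), (2.0, 6.0), (0.3, 1.0), (5.0, 15.0)]   # Table 3.1
trench = [6.8, 4.4, 0.78, 7.2]                                # hypothetical [X, D, W, L]
print(uniform_mutation(trench, ranges, rng))
print(gaussian_mutation(trench, sigma=0.1, rng=rng))
```

In the flipping example the mutation chromosome contains three 1s, so exactly three bits of the parent change.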

Returning to the vibration isolation example in Fig. 3.12, there are two selected parents (from the initial population) and two new children (after applying crossover). All four chromosomes are transferred to a mating pool, from which one chromosome is selected randomly for mutation. Fig. 3.17 shows the mating pool, the chromosome selected for mutation and the mutated child.

Figure 3.17 Mating pool, the chromosome selected for mutation and the mutated child

The mutated child replaces the selected chromosome in the mating pool. Fig. 3.18 demonstrates the new mating pool with the new children.

Figure 3.18 New mating pool


3.1.3 Replacement

The last step of the algorithm after recombination is replacement, which is the process of selecting chromosomes from a source population and substituting them to form a new population. The optimum solution can be lost after recombination with crossover and mutation, since the selection of a chromosome is completely random. When two new children are produced from two parents of a fixed-size population, the question is which of the newly generated children or parents are allowed to move forward to the next generation, so two of them must be replaced.
There are two kinds of replacement methods for creating a new population: steady-state and elitism replacement. In the steady-state method, a small fraction of the population is replaced in every iteration. This method involves inserting new chromosomes into the population as soon as they are produced. Elitism replacement is an almost complete replacement, except that the best members of each generation are carried over to the next generation without modification. This method increases the efficiency of the algorithm since it prevents the loss of the best solutions.
The elitism replacement method is selected to generate the new population in the vibration isolation problem. Fig. 3.19 represents the initial population and the newly generated population.

Figure 3.19 Initial population and the newly generated population

It can be seen from the figure that the parameters with low efficiencies are substituted with new parameters, which results in better efficiency.
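
One possible reading of elitism replacement is sketched below. The text only states that the best members are carried over unchanged; filling the remaining places with the best offspring, the number of elites and all names and values are assumptions for the illustration. Higher fitness (efficiency) is taken to be better.

```python
def elitism_replacement(population, fitness, offspring, offspring_fitness, n_elite=1):
    """Keep the n_elite best current members unchanged and fill the rest of
    the new population with the best offspring (assumed filling rule)."""
    size = len(population)
    ranked_old = sorted(zip(fitness, population), key=lambda t: t[0], reverse=True)
    ranked_new = sorted(zip(offspring_fitness, offspring), key=lambda t: t[0], reverse=True)
    survivors = ranked_old[:n_elite] + ranked_new[:size - n_elite]
    new_population = [chromosome for _, chromosome in survivors]
    new_fitness = [value for value, _ in survivors]
    return new_population, new_fitness

# Placeholder chromosomes and illustrative efficiencies.
population = ["a", "b", "c", "d"]
fitness = [0.2, 0.9, 0.4, 0.1]
offspring = ["e", "f", "g", "h"]
offspring_fitness = [0.5, 0.3, 0.8, 0.6]

new_population, new_fitness = elitism_replacement(
    population, fitness, offspring, offspring_fitness, n_elite=1)
print(new_population, new_fitness)
```

The best member of the old generation always survives, so the best fitness in the population can never decrease from one generation to the next.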

3.2 Application of Genetic Algorithms

A genetic algorithm (GA) is a well-known method for solving complex geotechnical engineering problems [26, 65]. Levasseur et al. used a GA to identify constitutive parameters of the Coulomb constitutive model from in situ geotechnical measurements [103]. A GA has also been used for slope stability estimation [99, 115].
The trench dimensions can also be optimized through a GA to reach the highest efficiency in mitigating vibration. Yarmohammadi et al. developed a coupled GA-FE methodology for designing single, double, and triple-trench wave barrier systems to reduce train-induced vibrations. They found that open trenches had much higher mitigation capacities than in-filled trenches and, importantly, that using double-trench barriers instead of single-trench ones increases the mitigation capacity by as much as 20% [149].
In another attempt, they evaluated the performance of overlapping jet-grouted
columns as stiff-wave barrier walls for mitigating train-induced ground vibrations.
It was observed that the developed GA/FEM could effectively find the optimal
topology of the jet-grouted barriers within the design domain and the barriers with
larger heights provided better mitigation [4].
Attenuation of earthquake-induced vibrations was studied by [25] through in-
stalling buried concrete-filled trenches. A GA model was developed for finding the
efficient layout of concrete barriers in a manipulated soil zone around the structure.
It was observed that some optimal layouts of limited volume can attenuate the elastic
demands of the structures in the range of 30–80%. In further investigations, they
proved that the introduced model could find effective wave barrier layouts with a
high potential capacity to mitigate horizontal vibrations, especially for earthquake
loadings that have a high-frequency content [9].

3.3 Artificial Neural Network

Since modern computers are becoming more powerful, researchers try to use machines to perform the calculations of complicated models. Artificial Intelligence (AI) is the process of simulating human intelligence in machines so that they think like humans and mimic their actions [105]. Artificial intelligence is able to interpret and learn from external data and to use what it has learned to reach specific aims through flexible adaptation. Deep learning is a subset of Machine Learning (ML) that uses a network for learning from data. Fig. 3.20 illustrates the classification of all three [72].

Figure 3.20 Classification of artificial intelligence [72]

Machine learning is a technique for deriving a model from data. After the model has been developed, it is applied to real field data. Fig. 3.21 shows the process, in which the vertical direction indicates the learning process, while the horizontal direction shows the use of the trained model [72].

Figure 3.21 Evaluating a model based on field data [49]

Machine learning techniques are categorized into three groups based on the training model [134]:


• Supervised learning
• Unsupervised learning
• Reinforcement learning

There is input and ground truth for each training dataset in supervised learning.
The main duty of the supervised learning technique is to produce a correct output
from the input using training data. Conversely, the unsupervised learning technique
contains inputs without ground truth.
Classification and regression are two types of application of the supervised learning technique. Classification is the problem of identifying the classes to which the data belong. In contrast, regression predicts a value. The vibration isolation problem is categorized as a regression problem [111].
Machine learning models can be used for different tasks including:

• Artificial Neural Networks (ANN) for regression and classification
• Convolutional Neural Networks (CNN) for computer vision
• Recurrent Neural Networks (RNN) for time series analysis
• Self-organizing Maps for feature extraction
• Deep Boltzmann Machines for recommendation systems
• Auto Encoders for recommendation systems

An Artificial Neural Network (ANN) is a computational model inspired by the structure and functions of biological neural networks in human brains. An ANN acts like an artificial human nervous system for receiving, training on, and transmitting information. ANNs are the most exciting and powerful branch of ML, in which a computer model learns to perform tasks directly from data. In Fig. 3.21, the ANN takes the place of the model, and the learning rule, which is supervised learning for a regression problem, takes the place of the machine learning. Table 3.2 shows different types of neural network and their application to different problems.

Table 3.2 Different types of neural network and their application

Abbreviation   Full name                      Problem solved
ANN            Artificial Neural Network      Pattern recognition
CNN            Convolutional Neural Network   Image processing
RNN            Recursive Neural Network       Speech recognition
DNN            Deep Neural Network            Acoustic modeling
DBN            Deep Belief Network            Cancer detection


3.3.1 Neurons

The principal idea behind a neural network is to imitate the structure of the neurons and cells in the brain in order to perform tasks such as recognizing patterns and making decisions.
Fig. 3.22 shows a biological neuron, which is the fundamental unit of the brain and the nervous system. These cells are responsible for receiving input from the external world via dendrites, processing it through a function and delivering the output through axons. A neuron does not have any storage for saving data; it just transmits signals from one neuron to another [49].

Figure 3.22 A biological neuron [135]

An artificial neuron is a mathematical function based on biological neurons. Table 3.3 represents the analogy between the human brain and an artificial neural network.

Table 3.3 Analogy between the human brain and artificial neural network
Biological Neuron Artificial Neuron
Cell Node
Dendrites Input
Synapse Weights
Axon Output


A perceptron is a model based on biological neurons for the supervised learning of binary classifiers. This model enables neurons to learn and processes the elements of a training set one at a time. There are two types of perceptron: single-layer and multilayer. A single perceptron with two layers, an input layer and an output layer, is presented in Fig. 3.23.

[Diagram labels: inputs X1, X2, ..., Xn; weights w1, w2, ..., wn; bias; net input function; activation function; output Y; error]

Figure 3.23 A single perceptron [49]

The nodes X1, X2, ..., Xn, which are called features, are the inputs, and w1, w2, ..., wn are the weights of the corresponding features. Weights show the strength of the features. Each feature is multiplied by its connection weight and passed through a summation function. Then a bias, which is a constant value, is added to the weighted sum to shift the result of the activation function.
The summation is passed through the activation function. An activation function introduces non-linearity into the neural network by converting the weighted sum of the features into the output signal.
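
The computation of a single artificial neuron can be sketched in a few lines. The feature values, weights and bias are made up for the illustration, and the sigmoid is used only as one possible activation function.

```python
import math

def net_input(features, weights, bias):
    """Weighted sum of the features plus the bias (net input function)."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(features, weights, bias, activation=sigmoid):
    """Pass the net input through the activation function."""
    return activation(net_input(features, weights, bias))

x = [0.5, -1.0, 2.0]      # features X1..X3 (illustrative)
w = [0.4, 0.3, -0.2]      # weights w1..w3 (illustrative)
b = 0.1                   # bias

z = net_input(x, w, b)    # 0.2 - 0.3 - 0.4 + 0.1
y = neuron_output(x, w, b)
print(round(z, 3), round(y, 3))
```

Changing the bias shifts the net input by a constant and therefore moves the point at which the activation function switches from low to high output.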
The activation function, which is attached to each neuron in the network, determines the output of a neural network. In addition, it helps to normalize the output of each neuron to a range of [0, 1] or [−1, 1]. A neural network without an activation function is a linear regression model. There are two kinds of activation functions: linear and non-linear. Non-linear activation functions are the most widely used in neural networks. The most popular non-linear activation functions include:

• Sigmoid activation function
• Tanh or hyperbolic tangent activation function
• ReLU (Rectified Linear Unit) activation function


A sigmoid function is an exponential function with a characteristic “S”-shaped curve; it takes a real value as input, and its output lies in the range [0, 1].
Tanh is an activation function similar to the sigmoid in shape, but its output range is [−1, 1]. Therefore, the inputs to the next layers will not always have the same sign. Both functions are mostly applied to classification between two classes, and both Tanh and sigmoid functions are used in feedforward neural networks.
ReLU, which stands for Rectified Linear Unit, is another type of non-linear activation function. Mathematically, this function R(z) is defined as the maximum of 0 and the input z, R(z) = max(0, z). Fig. 3.24 shows plots of all three activation functions.

(a) Sigmoid: $\sigma(z) = \dfrac{1}{1 + e^{-z}}$
(b) ReLU: $R(z) = \max(0, z)$
(c) Tanh: $\varphi(z) = \dfrac{1 - e^{-2z}}{1 + e^{-2z}}$

Figure 3.24 Activation functions in artificial neural networks
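
The three activation functions of Fig. 3.24 can be written directly from their formulas. This is a plain sketch for moderate inputs; the exponential in these formulas overflows for very large negative arguments, which production code would need to guard against.

```python
import math

def sigmoid(z):
    """S-shaped curve with output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    """Same shape as the sigmoid, but with output in (-1, 1)."""
    return (1.0 - math.exp(-2.0 * z)) / (1.0 + math.exp(-2.0 * z))

def relu(z):
    """Rectified Linear Unit: max(0, z)."""
    return max(0.0, z)

for z in (-2.0, 0.0, 2.0):
    print(z, round(sigmoid(z), 4), round(tanh(z), 4), relu(z))
```

The two S-shaped functions are related by tanh(z) = 2·sigmoid(2z) − 1, which explains why their curves have the same shape but different output ranges.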


In the last step, the information is transferred to the output. If the output predicted by the ANN is equal to the actual output, which is called the label, the algorithm finishes. Otherwise, there is an error, which is returned to the neurons to adjust the weights and bias, and this process continues until the error is minimized.
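
This predict, compare and adjust loop can be illustrated with the classic perceptron learning rule, used here as a stand-in because the text does not specify an update rule. The step activation, the learning rate, the logical AND data set and all names are assumptions made for the example.

```python
def step(z):
    """Threshold activation for a binary classifier."""
    return 1 if z >= 0 else 0

def predict(x, weights, bias):
    return step(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_perceptron(samples, labels, lr=1.0, max_epochs=100):
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, label in zip(samples, labels):
            error = label - predict(x, weights, bias)   # 0 when prediction equals label
            if error != 0:
                mistakes += 1
                # Send the error back: adjust each weight and the bias.
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
        if mistakes == 0:        # every prediction matches its label: stop
            break
    return weights, bias

samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]            # logical AND, a linearly separable toy problem
weights, bias = train_perceptron(samples, labels)
print(weights, bias)
print([predict(x, weights, bias) for x in samples])
```

The loop terminates here because the AND problem is linearly separable; for data that a single perceptron cannot separate, the `max_epochs` limit stops the training instead.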

3.4 Feedforward Networks

A feedforward network, or multilayered perceptron, is an artificial neural network
that combines many layers of perceptrons. A feedforward network passes information
only in a forward direction, from an input layer through one or more hidden layers
and finally to an output layer. In this network, all nodes are fully connected, and
the weights are organized into layers that feed into each other.
Fig. 3.25 shows a multilayered feedforward neural network in which the first
and the last layers are called the input and output layers, respectively. The layers
between the input and output are called hidden layers. The number of hidden layers
is adjustable based on the complexity of a problem. The hidden neurons mediate
between the input and output data. In a feedforward neural network, every
neuron in each layer is fully connected to every neuron in the next
layer.
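
As a rough sketch of how information moves through such a network, the following Python code performs a forward pass through fully connected layers. The 2-3-1 layout, the hand-picked weights, and the names `dense` and `forward` are hypothetical and serve only as an illustration.

```python
import math


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each neuron."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(x, layers):
    """Pass x through each (weights, biases, activation) layer in order."""
    for weights, biases, activation in layers:
        x = dense(x, weights, biases, activation)
    return x


# Hypothetical 2-3-1 network with hand-picked weights (illustration only).
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1], math.tanh),
    ([[1.0, -1.0, 0.5]], [0.2], lambda z: z),  # linear output neuron
]
prediction = forward([1.0, 2.0], layers)
```

Each row of a weight matrix holds the incoming weights of one neuron, so every neuron receives all outputs of the previous layer.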
Performing a forward pass of a network (Fig. 3.25) gives the predicted value. The
“goodness” of the predicted value is then evaluated by comparing the predicted
and actual values. This is the duty of a cost function. Common cost functions
include mean square error (MSE), root mean square error (RMSE) and correlation
(corr), which are used to measure the agreement between the value predicted by a
feedforward neural network and the actual value. MSE is the average squared
difference between the predicted and actual values. RMSE is the square root of MSE
and can be read as the standard deviation of the prediction errors; it shows how
concentrated the data are around the line of best fit. Corr measures how strongly
the predicted and actual values are linearly related.
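
The three measures can be written out as follows. This is a minimal sketch that assumes corr denotes the Pearson correlation coefficient between the predicted and actual values; the text does not give a formula, so that choice and the function names are assumptions.

```python
import math


def mse(actual, predicted):
    """Mean square error: average of the squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error: square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


def corr(actual, predicted):
    """Pearson correlation coefficient (assumed meaning of 'corr')."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]  # every prediction is off by 0.5
```

In this example the predictions follow the trend perfectly but carry a constant offset, so the correlation is 1 while MSE and RMSE are non-zero. This is why correlation alone is not a sufficient measure of error.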
The weights W and biases b determine how an input is converted into the output.
Therefore, training a neural network means finding proper values for the weights
and biases. Generally, the process of adjusting the weights and biases of the
connections to minimize the cost function is called training a neural network.
A neural network is considered shallow when it has an input layer, one hidden
layer, which processes the inputs, and an output layer, which presents
the final result of the model. On the other hand, a deep neural network usually has
between 2 and 8 hidden layers of neurons.


[Figure: input layer, hidden layer 1, output layer]

Figure 3.25 A multi-layered feedforward neural network

3.4.1 Backpropagation

Backpropagation is the most popular learning algorithm for training a
feedforward neural network on a supervised problem. In this procedure, the error
measured through a cost function is propagated back to all weights and biases in
order to decrease the error. In summary, backpropagation repeatedly adjusts the
weights and biases of all connections in a network in order to minimize the error.
The error describes how well a set of parameters in a network fits a data set.
There are several algorithms for minimizing the error, including:

• Gradient Descent
• Newton Method
• Gauss-Newton Method
• Levenberg-Marquardt Algorithm

The gradient descent method is a first-order iterative optimization technique used
to adjust the weights and biases in the backpropagation phase to reach the best
output. To find appropriate values for the weights and biases, the derivative of the
cost function is taken with respect to the weights and biases. A gradient descent
algorithm works as follows:


1. Initialize the weights randomly, $W \sim \mathcal{N}(0, \sigma^2)$
2. Loop until convergence:
   a. Compute the gradient $\frac{\partial J(W)}{\partial W}$ for all nodes
   b. Update the weights, $W = W - \eta \frac{\partial J(W)}{\partial W}$
3. Return the weights
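
The steps above can be sketched in Python for a deliberately small case: a single linear neuron y = w·x + b trained with an MSE cost. The model, the data, the default parameter values, and the stopping rule based on the gradient norm are assumptions made for this illustration; in a full network the gradients would come from backpropagation.

```python
import random


def gradient_descent(xs, ys, eta=0.05, sigma=0.1, tol=1e-10,
                     max_iter=10_000, seed=0):
    """Fit y = w*x + b by minimising the MSE cost J(w, b) with gradient descent."""
    rng = random.Random(seed)
    # Step 1: initialise the parameters from N(0, sigma^2).
    w, b = rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)
    n = len(xs)
    for _ in range(max_iter):
        # Step 2a: gradient of J = (1/n) * sum((w*x + b - y)^2) w.r.t. w and b.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step 2b: move against the gradient, scaled by the learning rate eta.
        w -= eta * grad_w
        b -= eta * grad_b
        # Convergence test (an assumption): stop once the gradient is tiny.
        if grad_w * grad_w + grad_b * grad_b < tol:
            break
    # Step 3: return the fitted parameters.
    return w, b


# Data generated from y = 2x + 1, so the fit should approach w = 2 and b = 1.
w, b = gradient_descent([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```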

where η is the learning rate, one of the most important parameters in
the gradient descent technique. The learning rate determines the speed of the
training process. A small learning rate can reach an optimal set of weights, but
training may take a long time. On the other hand, a large learning rate trains
the model faster but risks overshooting the optimal weights. Step decay is one
suitable method for scheduling the learning rate, in which the learning rate is
reduced by some percentage after a set number of training epochs.
Newton's method is a second-order algorithm, in which the Hessian matrix of the
error function is used in addition to its first derivatives. The goal of this algorithm
is to find better training directions by using the second derivatives of the error
function. The conjugate gradient algorithm can be regarded as a method between
Newton's method and gradient descent. The search in this method is performed along
conjugate directions, which can result in faster convergence than the gradient
descent directions.
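
A step decay schedule of the kind mentioned above might look as follows. The function name, the drop factor of 0.5, and the interval of 10 epochs are hypothetical values picked for illustration.

```python
def step_decay(initial_rate, drop, epochs_per_drop, epoch):
    """Learning rate at `epoch`, multiplied by `drop` every `epochs_per_drop` epochs."""
    return initial_rate * drop ** (epoch // epochs_per_drop)


# Halve an initial rate of 0.1 every 10 epochs and sample it at four epochs.
rates = [step_decay(0.1, 0.5, 10, epoch) for epoch in (0, 9, 10, 25)]
```

The rate stays constant within each block of epochs and drops only at the block boundaries, which allows large steps early in training and finer steps later.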
The Gauss-Newton method, which is used to solve non-linear least-squares problems,
is a modification of Newton's method. The method assumes that the objective
function is approximately quadratic in the parameters near the optimal solution.
On medium-sized problems, the Gauss-Newton method usually converges much faster
than the gradient descent method.
The Levenberg-Marquardt (LM) algorithm is an iterative technique, which is
commonly used to solve non-linear least-squares problems. The LM method can be
regarded as a combination of two methods: it behaves like gradient descent when
the parameters are far from their optimal values and like the Gauss-Newton
method when the parameters are close to their optimal values.
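
For comparison, the update rules of the four methods can be written side by side in their standard textbook form. The notation is assumed here and does not come from the text: the cost is taken as $J(W) = \tfrac{1}{2}\lVert r(W) \rVert^2$ with residual vector $r$, $g_k$ is the gradient of the cost, $H_k$ its Hessian, $\mathbf{J}_r$ the Jacobian of the residuals with respect to $W$, and $\lambda \ge 0$ a damping parameter.

```latex
\begin{align*}
\text{Gradient descent:} \quad & W_{k+1} = W_k - \eta\, g_k \\
\text{Newton:} \quad & W_{k+1} = W_k - H_k^{-1} g_k \\
\text{Gauss-Newton:} \quad & W_{k+1} = W_k
    - \left(\mathbf{J}_r^{\top} \mathbf{J}_r\right)^{-1} \mathbf{J}_r^{\top} r_k \\
\text{Levenberg-Marquardt:} \quad & W_{k+1} = W_k
    - \left(\mathbf{J}_r^{\top} \mathbf{J}_r + \lambda I\right)^{-1} \mathbf{J}_r^{\top} r_k
\end{align*}
```

Since $g_k = \mathbf{J}_r^{\top} r_k$ for this cost, a large $\lambda$ turns the LM step into a short gradient descent step, while $\lambda \to 0$ recovers the Gauss-Newton step.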

3.4.2 Data Preparation

The size of a database plays a significant role in training a neural network. A large
database results in a more accurate model but requires a lot of computational time.
On the other hand, a database that is too small requires less computational time
but results in a less accurate model. Therefore, the database should be big enough
to yield an accurate model while keeping the computational time reasonable.
When the input data in the database have different ranges, the database needs to
be normalized. Normalization produces a set of data values of the same magnitude.
When the features differ in magnitude, the fluctuations of parameters with bigger
ranges may mask the influence of parameters with smaller ranges, even though the
features with smaller ranges may be more important in predicting the desired
output. Therefore, all the data should be normalized to the same range to remove
the influence of different ranges in the ANN. All features are normalized to the
range [−1, 1] through Eq. (3.3).
$$(X_j)_n = 2\,\frac{X_j - \min X_j}{\max X_j - \min X_j} - 1 \tag{3.3}$$
where $X_j$ is the feature, $(X_j)_n$ is the scaled input feature, and $\min X_j$ and
$\max X_j$ are the lower and upper limits of the input feature, respectively.
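
Eq. (3.3) translates directly into code. The sketch below scales a single feature; the function name and the sample values are hypothetical, and the guard against a constant feature is an addition that the text does not discuss.

```python
def normalize_feature(values):
    """Scale one feature to the range [-1, 1] following Eq. (3.3)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature makes the denominator of Eq. (3.3) zero.
        raise ValueError("feature is constant; Eq. (3.3) is undefined")
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in values]


# Hypothetical feature values; the minimum maps to -1 and the maximum to 1.
scaled = normalize_feature([2.0, 4.0, 6.0, 10.0])
```

The same minimum and maximum found on the training database should be reused when scaling new inputs, so that the network sees data on the scale it was trained on.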

3.5 Application of ANN

Over the last few years, ANNs have been widely applied to several areas of
geotechnical engineering and have demonstrated some degree of success [122].
The method is well suited to modelling complex problems where the relationship
between model variables is unknown. ANNs have been used successfully in
pile capacity prediction [33], site characterization [120], earth retaining structures
[48], estimation of the bearing capacity of shallow foundations and settlement
prediction [151], slope stability [24], design of tunnels and underground openings [77],
liquefaction during earthquakes [94] and soil compaction and permeability [123].
Jayawardana et al. investigated the use of artificial neural networks (ANN) as
a smart and efficient tool to predict the effectiveness of geofoam-filled trenches
in mitigating ground vibrations. They used a multi-layer feedforward network with
two hidden layers, trained with a backpropagation algorithm [67].
A comprehensive parametric study was performed in [8], and the results
were used to train an ANN model for predicting the efficiency of geofoam-filled
trenches. The developed model exhibited a good generalization capacity beyond the
training stage, as validated by new finite element results within the range of the
training database.
In addition, soil-ground vibrations induced by moving trains were predicted
with an artificial neural network model [43]. The results indicate that the
prediction method keeps the maximum error below 6.41% and the average error below
2.29% when it is used to predict vibration acceleration levels.
Hung et al. focused on using multiple neural networks to estimate the screening
effect of in-filled trenches on surface waves [129]. Three artificial neural networks,
namely a backpropagation network (BPN), a generalized regression neural network
(GRNN), and a radial basis function network (RBF), were used to evaluate the
performance of a chosen physical model.
A neural network tool was also used to analyze the parametric effects of vibrations
versus the surface layer’s depth [40]. Important conclusions were derived from the
analysis regarding the mechanical and geometrical properties of multiple layers and
their varying effects with regard to the distance from the source.

3.6 Response Surface Methodology

Evaluating the effects of multiple factors and their interactions on one or more
response variables is a challenge for researchers in many fields [71]. In response
surface problems, $Y$ is the response variable of interest and $X_1, X_2, \ldots, X_n$
are regarded as a set of predictors. For instance, in a vibration isolation system,
$Y$ is the efficiency of a trench and the $X_i$ are governing factors such as depth,
location, etc.
In some systems, the relationship between $Y$ and the $X_i$ may be known
exactly and described by a linear function. Then Eq. (3.4), which is called a
mechanistic model, can be used to fit the system.

$$Y = g(X_1, X_2, \ldots, X_n) + \varepsilon \tag{3.4}$$

where $g$ is a linear function and $\varepsilon$ represents the error in the system.


Now consider a system for which a mechanistic model is not suitable,
so that the relationship must be approximated by an unknown, non-linear function,
as presented in Eq. (3.5).

$$Y = f(X_1, X_2, \ldots, X_n) + \varepsilon \tag{3.5}$$

where the function $f$ is usually a polynomial, typically of first to fourth order.
This empirical model is called a response surface model.
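
As an illustration, a second-order response surface model, a form commonly used in RSM, can be written out explicitly. The coefficient symbols $\beta$ are notation assumed here; they do not appear in the text.

```latex
Y = \beta_0 + \sum_{i=1}^{n} \beta_i X_i + \sum_{i=1}^{n} \beta_{ii} X_i^2
    + \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \beta_{ij} X_i X_j + \varepsilon
```

The linear terms capture the main effect of each factor, the squared terms capture curvature, and the cross terms $X_i X_j$ capture the interactions between factors. The coefficients are typically estimated by least-squares regression on the experimental data.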
Response Surface Methodology (RSM) is a statistical method for investigating
the interaction and relationship between the independent variables and different
responses [3]. RSM uses quantitative data from appropriate experiments to determine
regression model equations and operating conditions [117]. It is a collection of
mathematical and statistical techniques to model and analyse problems in which a
response of interest is influenced by several variables.
The most extensive applications of RSM are in the industrial world, particularly
in situations where several input variables potentially influence performance
measures or quality characteristics of a product or a process. These measured
performances are called responses. They are typically measured on a continuous
scale, although attribute responses, ranks, and sensory responses are not unusual.
Most applications of RSM involve more than one response. The input
variables are sometimes called independent variables, and they are subject to the
control of engineers or scientists, at least for the purposes of a test or an experiment.
The advantages offered by RSM can be summarized as determining the interaction
between independent variables, modelling the system mathematically, and
saving time and cost by reducing the number of trials. Another purpose of RSM
is to evaluate how a response changes in a desired direction when process or
design variables are adjusted. The method enables one to optimize responses
(dependent variables) that are influenced by various factors (independent
variables) [15, 97].
