Development of Cost
Michał JUSZCZYK1,
Cracow University of Technology, Faculty of Civil Engineering
Abstract
Cost estimation, as one of the key processes in construction projects, provides the basis for
a number of project-related decisions. This paper presents some results of studies on the
application of artificial intelligence and machine learning in cost estimation. The research
developed three original models based either on ensembles of neural networks or on support
vector machines for the cost prediction of the floor structural frames of buildings. According to
the criteria of general metrics (RMSE, MAPE), the three models demonstrate similar predictive
performance. MAPE values computed for the training and testing of the three developed models
range from 4.92% to 6.36%. The accuracy of cost predictions given by the three developed
models is acceptable for the cost estimates of the floor structural frames of buildings in the early
design stage of the construction project. Analysis of error distribution revealed a degree of
superiority for the model based on support vector machines.
Keywords: construction cost estimation, cost modelling, ensembles of neural
networks, support vector machine
1. INTRODUCTION
Cost estimation is a key process for any construction project. The objective of the
process is to deliver forecasts of construction costs on the basis of information
available at successive stages of projects. The accuracy of the forecasts has a
significant impact on project success as a number of decisions are made on the basis of
1
Corresponding author: Cracow University of Technology, Faculty of Civil Engineering,
Warszawska 24, 31-155 Kraków, Polska, [email protected], +48 12 628 30 90
DEVELOPMENT OF COST ESTIMATION MODELS BASED ON ANN ENSEMBLES 49
AND THE SVM METHOD
cost analyses. This paper presents some results of studies on the applicability of
artificial intelligence and machine learning-based methods for the process of
estimating construction costs. Alternative models are introduced which are based on
either ensembles of artificial neural networks (ANN) or on the support vector machine
method (SVM).
2. METHODOLOGY
In the course of the research, several cost estimation models based on nonparametric
statistical methods were developed. The models were designed to provide predictions
of the construction costs of the floor structural frames of buildings. The introduced
models were based either on ANN ensembles or on the SVM method. Both the former
and the latter were implemented for the problem as supervised learning models that
allowed regression analysis and the implicit realisation of the relationships between
costs and cost predictors.
The theory, fundamentals and details of both methods, omitted from this paper for the
sake of brevity, can be found in the literature: for ANN see, for example, [2, 11,
26, 32], and for SVM see, for example, [5, 10, 30, 36].
The basic assumption for the use of ANN ensembles is to combine a set of trained
ANN and to use this set to approximate a true regression function instead of using a
single ANN. Various kinds of ANN or ANN trained to different local minima might
be incorporated into the ensemble (compare [2]). Such an approach reduces the model's
error to some degree in comparison with models based on a single network.
Moreover, it is useful for practical implementations in problems for which the number
of training data samples is not large.
The rationale for the use of SVM is the method’s capability to deal with high
dimensional data. SVM enables finding a global solution for a given task; it also works
well on relatively small sets of training data. The use of both of these methods makes
it possible to take into account several cost predictors (describing variables) and
to model the relationships that bind these cost predictors with the construction costs
of the floor structural frames of buildings.
Three models were developed in the course of research:
- an ANN ensemble model based on a generalised averaging approach (later referred
to as ANN ENSGA),
- an ANN ensemble model based on a stacked generalisation approach (later
referred to as ANN ENSSG),
- a model based on SVM regression (later referred to as SVMREG).
The following subsections present the assumptions for the development of the models and
a concise description of the data used for the purposes of supervised learning and testing.
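\[ RMSE = \sqrt{\frac{1}{n}\sum_{p=1}^{n}\left(y^{p}-\hat{y}^{p}\right)^{2}} \tag{2.3} \]
\[ MAPE = \frac{100\%}{n}\sum_{p=1}^{n}\left|\frac{y^{p}-\hat{y}^{p}}{y^{p}}\right| \tag{2.4} \]
\[ PE^{p} = \frac{y^{p}-\hat{y}^{p}}{y^{p}} \cdot 100\% \tag{2.5} \]
\[ APE^{p} = \left|\frac{y^{p}-\hat{y}^{p}}{y^{p}}\right| \cdot 100\% \tag{2.6} \]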
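The general metrics used in the paper (RMSE and MAPE) can be illustrated with a minimal sketch. It assumes the standard definitions of both metrics; the function names and the three cost values are hypothetical and serve only as an example.

```python
import math


def rmse(y, y_hat):
    """Root mean square error of predictions y_hat against expected values y."""
    n = len(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / n)


def mape(y, y_hat):
    """Mean absolute percentage error, expressed in percent."""
    n = len(y)
    return 100.0 / n * sum(abs((a - b) / a) for a, b in zip(y, y_hat))


# Hypothetical expected and predicted costs (illustrative values only).
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 190.0, 400.0]

print(rmse(y_true, y_pred))
print(mape(y_true, y_pred))
```

RMSE is expressed in the units of the cost itself, whereas MAPE is relative, which is why the two metrics are reported side by side.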
In the case of the generalised averaging approach (assumed for model ANN ENSGA),
the members of an ensemble are assumed to be linearly combined (compare [2,
11]), so that the output of the ensemble ŷ is computed as the weighted average of the
outputs ŷk of the member ANN:
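\[ \hat{y} = \sum_{k=1}^{K} \alpha_{k}\,\hat{y}_{k} \tag{2.7} \]
\[ \sum_{k=1}^{K} \alpha_{k} = 1 \tag{2.8} \]
\[ \alpha_{k} = \frac{\sum_{j=1}^{K}\left(\mathbf{C}^{-1}\right)_{kj}}{\sum_{i=1}^{K}\sum_{j=1}^{K}\left(\mathbf{C}^{-1}\right)_{ij}} \tag{2.9} \]
where K is the number of ensemble members and \mathbf{C} denotes the error correlation matrix of the ensemble members (compare [2]).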
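The computation of the weights in eq. (2.9) can be sketched as follows. The sketch assumes, following the generalised ensemble formulation in [2], that the matrix in eq. (2.9) is the error correlation matrix of the members, estimated as the mean product of member errors over the samples. The function names and the error values are hypothetical. Because the row sums of the inverse matrix equal the solution of a linear system with a right-hand side of ones, the inverse is never formed explicitly.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def error_correlation(errors):
    """errors[k][p] is the error of member k on sample p; returns C[i][j] = mean(e_i * e_j)."""
    k, n = len(errors), len(errors[0])
    return [[sum(errors[i][p] * errors[j][p] for p in range(n)) / n
             for j in range(k)] for i in range(k)]


def ga_weights(errors):
    """Weights alpha_k: row sums of C^-1 divided by the sum of all its elements."""
    c = error_correlation(errors)
    z = solve(c, [1.0] * len(c))  # z[k] equals the k-th row sum of C^-1
    total = sum(z)
    return [v / total for v in z]


# Hypothetical, mutually uncorrelated member errors (illustrative values only).
member_errors = [[1.0, 0.0, 0.0],
                 [0.0, 2.0, 0.0],
                 [0.0, 0.0, 4.0]]
weights = ga_weights(member_errors)
print(weights)
```

In this illustrative case the member with the smallest errors receives the largest weight, and the weights sum to one as required by eq. (2.8).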
The SVM method application for the given regression problem was based on an
approximation of the assumed mapping as a linear regression hyperplane. The
hyperplane was computed with the use of an SVM method after the nonlinear
transformation of the input training data x to the high dimensional linear feature space
with the use of a kernel trick and the application of a nonlinear kernel function.
The aim was to find an approximation hyperplane which minimises the generalisation
error:
\[ \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{n}\left(\xi_{i}+\xi_{i}^{*}\right) \rightarrow \min \tag{2.11} \]
where w is the sought parameter vector of the hyperplane, C stands for the complexity
parameter of the model, and ξ and ξ* are slack variables introduced to make the method less
prone to noise and outliers. Slack variables are computed for each training data
sample: ξ for deviations above and ξ* for deviations below the ε-insensitive zone of
Vapnik's loss function (ε determines the borders within which the approximated hyperplane
must lie; for details see, for example, [30, 36]), so that the constraints for eq. (2.11) are:
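\[ \begin{cases} y_{i} - \langle x_{i}, w \rangle \leq \varepsilon + \xi_{i} \\ -y_{i} + \langle x_{i}, w \rangle \leq \varepsilon + \xi_{i}^{*} \\ \xi_{i} \geq 0;\ \xi_{i}^{*} \geq 0 \end{cases} \tag{2.12} \]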
The optimisation problem is solved with the use of Lagrange multipliers. Support
vectors are the data points that correspond to non-zero multipliers for the optimal
solution. Thus, the support vectors influence the position of the approximated
hyperplane. Moreover, the scalar products of the transformed inputs are replaced by the
values K(x, x') of the chosen kernel function K (the so-called kernel trick). Finally, the prediction can be
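formally given as:
\[ \hat{y}(x) = \sum_{i=1}^{n}\left(\alpha_{i}-\alpha_{i}^{*}\right)K(x_{i}, x) + w_{0} \]
where α and α* are the Lagrange multipliers for the optimal solution and w0 is the bias term.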
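The roles of the ε parameter and of the Lagrange multipliers can be illustrated with a short sketch. It assumes the standard form of SVM regression: Vapnik's ε-insensitive loss and a prediction given by a kernel expansion over the support vectors plus a bias term. All names and numerical values are hypothetical, and a plain scalar product stands in for the kernel function.

```python
def eps_insensitive_loss(y, y_hat, eps):
    """Vapnik's loss: deviations not larger than eps are not penalised."""
    return max(0.0, abs(y - y_hat) - eps)


def svr_predict(x, support_vectors, alpha, alpha_star, kernel, w0=0.0):
    """Kernel expansion: w0 + sum_i (alpha_i - alpha_i*) * K(x_i, x)."""
    return w0 + sum((a - a_s) * kernel(sv, x)
                    for sv, a, a_s in zip(support_vectors, alpha, alpha_star))


def dot(u, v):
    """Scalar product, used here as the simplest possible kernel."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical support vectors and multipliers (illustrative values only).
support_vectors = [[1.0, 0.0], [0.0, 1.0]]
alpha = [0.5, 0.0]
alpha_star = [0.0, 0.25]

prediction = svr_predict([2.0, 4.0], support_vectors, alpha, alpha_star, dot, w0=1.0)
print(prediction)
print(eps_insensitive_loss(10.0, 10.5, eps=0.05))
```

Only samples with a non-zero difference of multipliers contribute to the sum, which is why the prediction depends on the support vectors alone.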
Table 2 presents the frequencies of values for variables x1 – x3 coded as one-of-n. The
frequencies of values that were taken by variables coded with the use of a pseudo
fuzzy scale are presented in Table 3.
The total number of samples that were used for model development (both for the
purposes of training and testing models) was 162. The details of the data division into
training and testing subsets are explained in the scheme depicted in Figure 2.
Table 2. Frequencies of values for building height class coded as one-of-n
Symbol 1, 0, 0 0, 1, 0 0, 0, 1
x1 25.77% - -
x2 - 44.79% -
x3 - - 29.45%
Table 3. Frequencies of variables values coded with the use of pseudo-fuzzy scale
Symbol 0.1 0.3 0.5 0.7 0.9
x7 11.04% - 46.01% - 42.94%
x8 31.29% - 39.88% - 28.83%
x10 51.15% - - - 47.85%
x11 23.31% 19.63% 19.02% 25.77% 12.27%
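The two coding schemes reported in Tables 2 and 3 can be sketched as follows. The one-of-n coding follows directly from Table 2. For the pseudo-fuzzy scale, the sketch assumes that the ordinal levels of a variable are spread evenly between 0.1 and 0.9, which reproduces the values seen in Table 3 for variables with two, three and five levels; this mapping and the function names are assumptions, not the author's stated procedure.

```python
def one_of_n(index, n):
    """One-of-n coding: a vector of n zeros with a single 1 at position index."""
    return [1 if i == index else 0 for i in range(n)]


def pseudo_fuzzy(level, levels):
    """Map an ordinal level (0 .. levels-1) onto an even scale from 0.1 to 0.9.

    Assumed mapping: 2 levels -> 0.1, 0.9; 3 levels -> 0.1, 0.5, 0.9;
    5 levels -> 0.1, 0.3, 0.5, 0.7, 0.9.
    """
    return round(0.1 + 0.8 * level / (levels - 1), 1)


# Building height class coded as one-of-n (three classes, as in Table 2).
print(one_of_n(1, 3))

# A five-level variable coded with the pseudo-fuzzy scale.
print([pseudo_fuzzy(level, 5) for level in range(5)])
```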
3. RESULTS
For each of the five folds of the learning and validating subsets, a number of various ANN
of the multilayer perceptron type were trained (see the scheme in Figure 1). The ANN
differed in terms of their structure: the number of neurons in the networks' hidden
layers varied from 2 to 8. Moreover, various activation functions, namely
exponential (EXP), logistic (LOG), hyperbolic tangent (TANH) and linear (LIN), were
considered in both the hidden and the output layer. The Broyden-Fletcher-Goldfarb-Shanno
(BFGS) algorithm was used for the purposes of supervised learning.
For each of the 5 folds of the training data, a number of candidate ANN were
investigated. Assessment of their performance enabled the selection of 5 ANN (one
ANN for each of the folds) to be the members of the ensemble-based model.
Details regarding structures, activation functions and RMSE values computed for the
ensemble members are presented in Table 4.
The selection criteria reflected the expectation of equivalent learning, validating
and testing errors and of a high correlation between y and ŷ, that is, between the expected
and the predicted total construction costs of a floor structural frame of a building, for the
learning, validating and testing subsets. In Table 4, one can see that the RMSE values are
relatively close for each of the subsets and for all of the networks. For all of the
selected ANN, R > 0.960 for each of the subsets.
Table 4. Details of ANN that were selected to be the members of the ensemble
k-th   ANN         Activation functions          Training    RMSEL    RMSEV    RMSET
fold   structure   hidden layer / output layer   algorithm
1      11_7_1      EXP / LIN                     BFGS        16.742   16.369   17.012
2      11_7_1      EXP / LOG                     BFGS        16.348   16.594   16.639
3      11_5_1      EXP / LOG                     BFGS        16.392   16.546   17.230
4      11_8_1      EXP / LIN                     BFGS        16.160   16.294   15.731
5      11_3_1      TANH / LIN                    BFGS        17.198   16.093   16.405
Coefficients αk for the ANN ENSGA model were computed with the use of eq. (2.9).
The values of αk are given below:
α1 = 0.1760; α2 = 0.2962; α3 = 0.1761; α4 = 0.2344; α5 = 0.1173
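The use of the reported coefficients can be illustrated with a short sketch of the weighted average from eq. (2.7). The coefficients are the values of αk given above; the function name and the five member outputs are hypothetical.

```python
# Coefficients alpha_k reported above for the ANN ENSGA model.
ALPHAS = [0.1760, 0.2962, 0.1761, 0.2344, 0.1173]


def ensemble_output(member_outputs, weights=ALPHAS):
    """Weighted average of the member outputs, as in eq. (2.7)."""
    if len(member_outputs) != len(weights):
        raise ValueError("one output per ensemble member is required")
    return sum(w * y for w, y in zip(weights, member_outputs))


# The reported coefficients sum to one, as required by eq. (2.8).
print(sum(ALPHAS))

# Hypothetical predictions of the five member ANN for a single sample.
print(ensemble_output([100.0, 105.0, 95.0, 102.0, 98.0]))
```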
In the case of ANN ENSSG, structural parameters of the level-1 model (see eq. (2.10))
were found with the use of the commonly known linear regression analysis and the
least squares method. The parameters are given as follows (standard estimation errors
for each of the parameters are given in the brackets below):
b0 = -8.9556; b1 = 0.1336; b2 = 0.2920; b3 = 0.0490; b4 = 0.9024; b5 = 0.4808
(3.555) (0.0620) (0.0618) (0.0644) (0.0637) (0.0696)
For both of the ANN ensemble-based models, the outputs, which are the predictions of
the construction costs of the building’s floor structural frame, were computed with the
use of the coefficients αk for ANN ENSGA and structural parameters bk for ANN
ENSSG given above. The outputs were computed for training and testing subsets of
data on the basis of eq. (2.7) and eq. (2.10), respectively.
In the case of the SVM method, a number of models were investigated in order
to find the one to implement cost prediction mapping: x → y. For the investigated
models, a radial basis function was assumed as a kernel function:
\[ K(x, x') = \exp\left(-\gamma\|x - x'\|^{2}\right) \tag{3.1} \]
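A minimal sketch of the radial basis function kernel is given below, assuming the usual form with the squared Euclidean distance in the exponent. The function name and the sample vectors are hypothetical.

```python
import math


def rbf_kernel(x, x_prime, gamma):
    """Radial basis function kernel: exp(-gamma * ||x - x'||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    return math.exp(-gamma * squared_distance)


# Identical inputs give the maximum kernel value of 1.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.05))

# The value decays as the inputs move apart; a larger gamma gives a faster decay.
print(rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.05))
print(rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.15))
```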
In the course of the research, a range of meta-parameters of the models was
analysed. To find the model with the best performance, the values of C, ε and γ
were sought with the use of grid analysis. The grid was characterised by ranges of
values and steps specified for each of the parameters:
- in the case of ε, the values varied between 0.05 and 0.10 (step 0.01);
- in the case of C, the values varied between 1 and 20 (step 1);
- in the case of γ, the values varied between 0.05 and 0.15 (step 0.01).
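The grid described above can be enumerated with a few lines of code. The ranges and steps are taken from the text; the helper names are hypothetical, and the scoring function passed to `select_best` is a placeholder for the cross-validation error of a trained model.

```python
from itertools import product


def frange(start, stop, step, ndigits=2):
    """Inclusive range of floats, rounded to avoid accumulated floating-point error."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, ndigits) for i in range(count)]


eps_values = frange(0.05, 0.10, 0.01)    # 6 values
c_values = list(range(1, 21))            # 20 values
gamma_values = frange(0.05, 0.15, 0.01)  # 11 values

# Every combination of (C, gamma, epsilon) defines one candidate model.
grid = list(product(c_values, gamma_values, eps_values))
print(len(grid))


def select_best(candidates, cv_error):
    """Return the candidate with the lowest value of the supplied error function."""
    return min(candidates, key=cv_error)
```

With these ranges the grid contains 20 x 11 x 6 = 1320 combinations of meta-parameters.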
In the course of the computations, 5-fold cross validation was applied (see the
scheme in Figure 1). A number of SVM-based models were investigated and
analysed. Details for the five models with the best performance are presented in
Table 5. For the five models, one can see the values of C, ε and γ as well as the
number of unbound vectors (uv) and bound vectors (bv), the parameter w0 and the cross
validation error cverr. It was found that the best performance was obtained for ε =
0.05; this is reflected in Table 5.
Table 5. Five SVM-based models with the best performance
mod.   C    γ      ε      uv   bv   w0      cverr   RMSEL    RMSET
1      20   0.05   0.05   61   29   0.884   0.009   15.684   15.738
2      16   0.06   0.05   62   28   0.783   0.009   15.699   15.770
3      11   0.07   0.05   61   28   0.678   0.009   15.711   16.028
4      9    0.08   0.05   63   27   0.607   0.009   15.711   16.099
5      9    0.09   0.05   62   27   0.559   0.009   15.658   16.057
The selection criteria for SVM were similar to those used for ANN. Equivalence of
training and testing errors and a high correlation of y and ŷ were expected. In
Table 5, one can see RMSE values as performance measures. RMSE values were
relatively close for the subsets used for supervised learning and for those used for
testing the models. For all of the models presented in the table, R > 0.970 for each of the
subsets of data.
Finally, it was decided that model number 1, as the model with lowest RMSE values
for which RL=0.976 and RT=0.978, would be implemented as the core of the SVMREG
for predictions of the construction costs of the building’s floor structural frame. The
outputs of the model were computed for training and testing subsets of data.
Comparison of the ANN ENSGA, ANN ENSSG and SVMREG models’ predictive
performance of the total construction costs of a building’s floor structural frames in
terms of general metrics is presented in Table 6. The values of R, RMSE and MAPE
are given for the training and testing of the models. One can see that the differences
between the models with regard to the values of the general metrics are relatively
small, especially when the number of training and testing samples is taken into account.
Analysis of the values presented in Table 6 leads to the conclusion that the general
performance metrics are comparable for the three models.
Table 6. Comparison of general performance metrics for the three developed models
Model         ANN ENSGA         ANN ENSSG         SVMREG
perf. metr.   TRAIN.   TEST.    TRAIN.   TEST.    TRAIN.   TEST.
R             0.978    0.980    0.978    0.984    0.976    0.978
RMSE          14.616   15.747   12.976   15.984   15.684   15.738
MAPE          5.55%    5.38%    5.22%    6.36%    5.75%    4.92%
Fig. 4. Comparison of expected outputs and predictions of the three developed models:
a) training subset, b) testing subset
Table 7. Cumulative distribution of APEp values for the three developed models
computed for training and testing subsets
Model             ANN ENSGA           ANN ENSSG           SVMREG
APEp cum. dist.   TRAIN.    TEST.     TRAIN.    TEST.     TRAIN.    TEST.
APEp ≤ 5%         61.54%    56.25%    59.23%    43.75%    59.23%    65.63%
APEp ≤ 10%        83.85%    87.50%    86.92%    84.38%    87.69%    87.50%
APEp ≤ 15%        93.08%    96.88%    96.15%    90.63%    92.31%    87.50%
APEp ≤ 20%        98.46%    96.88%    98.46%    96.88%    95.38%    96.88%
APEp ≤ 25%        99.23%    100.00%   99.23%    100.00%   98.46%    100.00%
APEp ≤ 30%        99.23%    100.00%   99.23%    100.00%   98.46%    100.00%
APEp > 30%        100.00%   100.00%   100.00%   100.00%   100.00%   100.00%
In terms of the general performance metrics, all three of the introduced models offer
satisfactory prediction performance. With regard to the accuracy of cost predictions,
the models fulfil the expectations for estimates provided in the early stage of the
construction project as most of the percentage errors of cost predictions fall in the
range of <-20%; +20%>.
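A cumulative distribution of absolute percentage errors of the kind shown in Table 7 can be computed as in the sketch below. It assumes that the absolute percentage error of a sample is the absolute difference between the expected and the predicted cost, divided by the expected cost. The function name and the sample data are hypothetical.

```python
def ape_cumulative_distribution(y, y_hat, thresholds=(5, 10, 15, 20, 25, 30)):
    """Share of samples (in percent) whose absolute percentage error is within each threshold."""
    apes = [abs((a - b) / a) * 100.0 for a, b in zip(y, y_hat)]
    n = len(apes)
    return {t: 100.0 * sum(e <= t for e in apes) / n for t in thresholds}


# Hypothetical expected and predicted costs (illustrative values only).
y_true = [100.0, 100.0, 100.0, 100.0]
y_pred = [103.0, 92.0, 118.0, 140.0]

distribution = ape_cumulative_distribution(y_true, y_pred)
for threshold, share in distribution.items():
    print(f"APE <= {threshold}%: {share:.2f}%")
```

Reading the share at the 20% threshold gives a direct check of how many predictions fall within the range expected for early-stage estimates.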
4. DISCUSSION
The development of cost-estimation models based on ANN and SVM, as tools rooted
in artificial intelligence and machine learning, is a current area of research and
publications in the field of construction management. In the author's opinion, one of
the main reasons for this is the need for the introduction of new methods that are
alternatives to the traditional approach and capable of aiding cost estimates, especially
in the early phase of construction projects. The application of ANN or SVM brings the
following benefits:
- the results of cost estimates are based on the relationship of cost with multiple
describing variables related to analysed characteristics of objects, quantity measures
and technical parameters;
- there is no need to assume a priori functional relationships between the cost and
describing variables for regression analysis;
- cost estimates are based on the collected information (training patterns) which form
the basis for automated training processes and gaining knowledge;
- the developed models provide cost estimates in a very short time – it is also possible
to analyse many variants that differ from each other in the values of describing
variables.
Success in the development of models based on artificial intelligence and machine
learning that offer satisfactory performance of cost predictions depends on overcoming
a major obstacle: the collection of the data and information necessary for the supervised
training process is a challenging task in itself. From the author's previous experience
and research (see: [14, 15, 16, 18, 25]) it follows that, for construction cost estimation
problems, the datasets that can be collected are most likely to be of moderate size.
However, this limitation may be counterbalanced through the use of ensembles of ANN
or SVM. These tools work well for small or moderate datasets even if one must solve
high dimensional problems (both of the mentioned methods apply to construction cost
estimation problems).
The use of both ensembles of ANN and SVM for the investigated problem of cost
estimation of elements of buildings, namely floor structural frames, brought the
expected benefits. The models were developed for a multidimensional problem: there are
twelve describing variables that provide information about the floor structural frames.
Moreover, there was no need to assume functional relationships between the described
variable, that is, the cost of construction works, and the describing variables. The models
enable quick cost estimation of the specified building element.
The development of the two models based on ANN ensembles needed more
computational effort than similar models based on a single ANN (see
[14]); this is reflected by the two-step procedure (see Fig. 2). On the other hand, the
ensemble approach and the implementation of five combined ANN in the two
introduced models (ANN ENSGA, ANN ENSSG) resulted in a synergy effect and the
compensation of cost estimation errors obtained for the ANN acting in isolation.
The third of the introduced models, which was based on the SVM method, required
determination of the kernel function and ranges of values for meta-parameters. Several
candidate models were trained with the use of cross-validation for tuning meta-
parameters, from which one was finally selected to be implemented for the cost
estimation problem (SVMREG).
For models based on ensembles of ANN as well as for the model based on SVM, the
correlation of expected and predicted values of the construction costs of buildings’
floor structural frames is high (both for training and testing). General performance
metrics are comparable for the three models and lead to the conclusion that their
predictive capabilities are satisfactory. In particular, values of MAPE errors (see Table
6) confirm the applicability of the developed models in the investigated cost
estimation problem. Analysis of the distribution of PEp and APEp errors leads to two
main conclusions: firstly, the models are well suited to cost estimates in the early
stage of the construction project; secondly, the model based on the SVM method
appears to be somewhat better than the two models based on ANN ensembles due to
its more stable results of training and testing (see Figures 6-7).
Most of the models presented in the literature aiming to support cost estimates in
construction projects and based on artificial intelligence or machine learning are
focused on various types of construction objects as a whole. The models presented
herein are developed to aid cost estimates of certain elements of construction objects –
5. CONCLUDING REMARKS
The research presented herein allowed investigation of the applicability of artificial
intelligence and machine learning tools in estimating the costs of buildings’ floor
structural frames. The research resulted in the development of the three models
capable of aiding cost estimates. The developed models were based on:
- ensemble of 5 ANN and generalised averaging approach – ANN ENSGA;
- ensemble of 5 ANN and stacked generalisation approach – ANN ENSSG;
- SVM method – SVMREG.
All three models offer comparable performance in cost prediction in terms of
general metrics (especially RMSE and MAPE errors). The obtained accuracy of cost
estimates of the structural frames of building floors is acceptable for the early design
stage of a construction project. Analysis of the distribution of training and testing
errors for each model showed some superiority for the model based on support vector
machines.
ADDITIONAL INFORMATION
Computations for ANN and SVM based models were made with the use of the
TIBCO Statistica™ software suite.
REFERENCES
1. Attar, A, M, Khanzadi, M, Dabirian, S and Kalhor, E 2013. Forecasting
contractor's deviation from the client objectives in prequalification model using
support vector regression, International Journal of Project Management 31(6),
924-936.
2. Bishop, C, M 1995. Neural networks for pattern recognition. Oxford University
Press.
3. Bougoudis, I, Iliadis, L and Papaleonidas, A 2014. Fuzzy inference ANN
ensembles for air pollutants modeling in a major urban area: the case of Athens. In:
Mladenov, V, Jayne, C, Iliadis, L, (eds) Engineering Applications of Neural
Networks. EANN 2014. Communications in Computer and Information Science
459, Cham: Springer, 1-14.