Predictive Analysis of Fuel Prices Using Machine Learning

Andre P. Calitz, Department of Computing Sciences, Nelson Mandela University, Port Elizabeth, South Africa, [email protected]
Margaret Cullen, Business School, Nelson Mandela University, Port Elizabeth, South Africa, [email protected]
Simbarashe Mamombe, Department of Computing Sciences, Nelson Mandela University, Port Elizabeth, South Africa, [email protected]

2022 3rd International Conference on Next Generation Computing Applications (NextComp) | 978-1-6654-6954-8/22/$31.00 ©2022 IEEE | DOI: 10.1109/NextComp55567.2022.9932204

Abstract: Sales forecasting is seen as one of the most important indicators of the wellbeing of a business. The ability to accurately predict sales figures can influence the success of a business. This can be tied to the stock levels of products; however, businesses experience a number of problems, such as stock shortages, that stem from not being able to accurately predict customer spending in advance. Understocking discourages customers, while overstocking leads to unnecessary stock-holding costs. Several concepts have been introduced to help find useful insights from Big data to predict customer spending, among them Predictive Analysis and Machine Learning. This paper focuses on the real-time prediction of fuel prices. The predictive analytics model implemented in this study takes into consideration external factors such as time, the consumer price index, exchange rates, interest rates and oil prices. Relevant data, obtained from an agriculture organization and from various other sources, were integrated into a single dataset. An exploratory analysis, using an Elman neural network, was carried out to understand the relationships that exist between the datasets. Predictions were generated in two modes, namely daily and monthly fuel prices. The evaluation and validation of the model indicated accurate daily sales and spike predictions of diesel fuel.

Keywords: Machine Learning, Predictive Analysis, Sales Predictions, Big data, Fuel prices.

I. INTRODUCTION

Sales are one of the most important indicators of the financial performance of a business. The ability to correctly forecast and predict sales figures can make the difference between profit and loss, or even determine the future success of a business. Organizations that fail to predict sales and stock levels accurately can either understock or overstock their inventory, with major repercussions for the business. Overstocking can lead to unnecessary inventory costs, which ultimately leads to a decrease in profits. Understocking can impact an organization negatively, resulting in insufficient stock for customers to purchase and damaging customer relationships. Customers can switch to different suppliers, and this can lead to a decrease in long-term profits, which is detrimental to the organization’s bottom line.

Several techniques have been used over the years to predict sales and stock levels; however, the large amounts of data that organizations have access to require more efficient data analysis techniques for accurate predictions. These large amounts of data and diverse data sets are referred to as Big data. The concept of Big data can be defined as a large set of data that can be analyzed to reveal particular patterns, trends and associations. Big data is often defined with reference to the ‘5V’s’, namely volume, variety, velocity, value and veracity [1].

Currently, organizations have access to large amounts of data due to online and off-line datafication, and they also have access to the computing capabilities to analyze the data. Datafication refers to the technologies, tools and processes used to turn aspects of our daily lives into computerized data [2]. Predictive Analysis refers to creating data mining solutions consisting of algorithms and techniques, which can be used to determine outcomes based on structured or unstructured data [3]. Data mining is the science of extracting useful information from large data sets using statistics, Machine Learning and database systems [3]. Machine Learning is the art of programming computers to optimize a performance criterion using historical or past data [4]. It has gained popularity in the developer space, especially in the development of smart applications that are able to adapt to special circumstances [4].

The focus of predictive analysis is making discoveries from data in order to forecast what will happen in the future. Predictive analysis also involves the ability to improve previously existing models and to evaluate the predictive power of a model [5]. The process of predictive analytics starts with an investigation of the problem at hand. This identifies what type of data is required for the proposed system. Thereafter, an exploratory analysis is carried out to understand the data and make a preliminary identification of relationships that exist in the data. Based on the problem defined and the data available, a suitable analytical method can then be used to develop a model [6]. The development of the model and the evaluation task could be an iterative process until a model with the desired predictive outcome is achieved.

Different Machine Learning techniques have been successfully utilized for learning from historical data in order to make future predictions in areas such as financial markets, medicine and engineering. One of the biggest challenges of using Machine Learning techniques is the relevance of the resulting predictive model after a period of time [7]. In the field of agriculture, researchers are testing theories relating to Big data to help make more accurate, real-time predictions [8].

Concepts of using data for predictions using statistical and computational methods have been researched extensively in the past [5]. However, there is a need for research on the use of data in the agriculture industry. Most existing research is geared towards predicting the monetary value of customer spending. This is beneficial, but it overlooks the importance of having correct stock levels, as stock-holding costs are usually high. In this study, a case study is presented for predicting diesel fuel sales in the agriculture industry using Machine Learning. This study introduces the concepts of Predictive Analysis and Machine Learning for fuel sales, which can be used for better predictions of sales volumes using Big data.

Authorized licensed use limited to: University of Fribourg - Bibliothèque cantonale et universitaire. Downloaded on November 08,2024 at 16:45:25 UTC from IEEE Xplore. Restrictions apply.
Currently, most businesses make use of different techniques when making predictions of sales. These techniques include the Naïve Approach, Moving Average, Exponential Smoothing and Causal Forecasting. Although all these techniques are cost effective, they are prone to error and are limited [9, 10]. Businesses need improved modern methods and techniques to predict sales to ensure that they have the correct stock levels in their warehouses. The tool should predict future demand based on historical data provided by the organization. Such information would be helpful in minimizing inventory holding costs. An additional benefit is the improvement of overall business efficiency.

The layout of the paper is as follows. In Section 2 the factors that affect customer spending are discussed, followed by an overview of the research methodology followed for this study (Section 3). Predictive modelling using Machine Learning, and more specifically neural networks, is discussed in Section 4. In Section 5 the implementation of the model is discussed, which includes an overview of the data sources used, the neural network implementation, as well as an overview of the evaluation and analysis of the results. Future research work is discussed in the final section.

II. FACTORS AFFECTING CUSTOMER SPENDING

Sales are the lifeblood of most businesses and many businesses make use of cost-effective techniques of predicting sales volumes. These techniques are not tailored for the fast-paced, data-driven world that businesses operate in and do not cater for Big data extensively.

There are several factors that influence sales for an agricultural organization. These include:
• Consumer Price Index (CPI);
• Exchange Rates;
• Interest Rates; and
• Oil Prices.

The data for these external indices are made available in real time as monthly, daily and hourly data. The concept of Big data made it possible to collect and analyze the data by making use of tools, techniques and technologies. This study explores a predictive analysis for customer spending by considering the opportunity to collect, integrate and analyze data from different sources in real time.

III. RESEARCH OBJECTIVES AND METHODOLOGY

The research problem that has been identified is that businesses experience a number of problems, such as stock shortages, that stem from not being able to accurately predict sales volumes in advance. Based on the identified research problem, this paper specifically focuses on the timely prediction of diesel fuel sales at an agriculture organization. The primary objective of this research study was the development of a model that uses Machine Learning to produce accurate predictions of diesel fuel sales. This process would be iterative and therefore the Design Science Research (DSR) methodology was adopted. The DSR methodology involves guidelines that are used to develop an innovative artefact to solve a specific problem, contribute to research, evaluate designs and then communicate the research to a specific audience [11].

IV. PREDICTIVE MODELLING WITH MACHINE LEARNING AND NEURAL NETWORKS

A. Machine Learning

Machine Learning (ML) is an iterative process and Fig. 1 shows a simplified representation of the ML process [12]. The ML process involves a number of sub-processes, including data pre-processing, training and deployment. Data have to be collected, wrangled and pre-processed for analysis. The prepared data are used as input for ML algorithms, which process the data iteratively, identifying patterns. This is the training part of the process. The output of this step is a candidate model. Training continues until a model with the best fit is found, called the chosen model. The chosen model is the code that can recognize the patterns in the data that were used to train the model. The model is then deployed, used to make predictions and incorporated in applications [12].

Fig. 1. Graphical Representation of the Machine Learning Process [12]

B. Neural Networks

Neural networks are some of the most useful ML algorithms [13]. Neural networks can be defined as machines that are designed to model the way that a human brain performs a particular task or function [14]. They can also be defined as massively distributed processors that have a natural propensity for storing experiential knowledge and making it available for use.

They resemble the human brain in two respects, which are:
• The network acquires knowledge through a learning process; and
• Interneuron connection strengths, known as synaptic weights, are used to store the knowledge [15].

A neural network is made up of a number of connected neurons. Each neuron is made up of input synapses or signals, which are weighted, and has an output signal. A neuron works in the following way: it receives input signals from the environment or other neurons, gathers them and sums them up; it then gets triggered by an activation function and transmits as output the value supplied by the activation function [15].

Neural Network Architecture. A feed forward neural network (Fig. 2) is made up of a number of neurons that are connected together and arranged in three layers, namely input, hidden and output layers. Signals travel in order from the input layer to the hidden layer, then from the hidden layer to the output layer, during the learning process in order to determine the weights of the synapses [15].
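The neuron and layer behaviour described above can be illustrated with a minimal Python sketch. It is not the study's code (the authors worked in C# with Encog); the function names, the use of `tanh` as the activation function and the optional `bias` term are assumptions made for illustration only.

```python
import math


def neuron_output(inputs, weights, bias=0.0, activation=math.tanh):
    """Sum the weighted input signals, then pass the total through the activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(total)


def layer_output(inputs, weight_rows, biases):
    """Compute the output of every neuron in one layer (one weight row per neuron)."""
    return [neuron_output(inputs, row, b) for row, b in zip(weight_rows, biases)]


def feed_forward(inputs, hidden_weights, hidden_biases, output_weights, output_biases):
    """Signals travel from the input layer to the hidden layer, then to the output layer."""
    hidden = layer_output(inputs, hidden_weights, hidden_biases)
    return layer_output(hidden, output_weights, output_biases)
```

Training, which this sketch leaves out, is the process that adjusts the values in `hidden_weights` and `output_weights` so that the output approaches the known target.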

Fig. 2. Feed Forward Neural Network

Several types of neural network algorithms exist for modelling different prediction and classification problems. Temporal neural networks use context neurons to send values of a previous iteration back to the current iteration of the neural network. Examples of temporal neural networks are the Elman and Jordan neural networks [16]. In the Elman neural network, there are context neurons in the input layer, which get output from the hidden layer neurons. The Jordan neural network has context neurons in the hidden layer, and these context neurons get the output of the output layer as input. Recurrent neural networks are able to learn more about data than their feedforward counterparts. They are therefore more suited to making predictions when using temporal features of data [17].

Possible Extra Features.

Bias Neurons: In order to improve the effectiveness of neural networks, special neurons can be added to their topologies. These neurons are bias and context neurons. Bias neurons always output a value of one and never receive inputs from the previous layer, if there is one. They can only be present in the input or hidden layers. Bias neurons allow neural networks to learn patterns more effectively and they make it possible for neural networks to output non-zero values when the inputs are zero [17].

Context Neurons. Context neurons are present in recurrent neural network topologies. They give the neural network a short-term memory and allow feedback. Feedback is the term used when the output from a previous iteration in the neural network training is used as input for successive iterations. Depending on the type of neural network, they could be in the hidden or input layers. The addition of context neurons can make neural networks especially useful for predictive tasks [17]. Such a quality is useful for fulfilling the main goal of this project, which is to accurately predict customer spending.

Types of Neural Networks.

There are different types of neural networks, such as Feed Forward and Recurrent Neural Networks. Fig. 2 shows a feed forward neural network, where there are only forward connections between the layers of the neural network. Information thus only moves forward. In Recurrent Neural Networks, however, there are feedback connections that allow the neural network to learn the temporal features of a dataset. These networks use context neurons to send values of a previous iteration back to the current iteration of the neural network.

V. IMPLEMENTATION

A Big data approach was implemented that enabled more accurate and timely predictions of diesel sales in the agricultural sector. The proposed system made use of ML algorithms, in particular neural networks. Fig. 3 shows the framework used for the proposed system.

Fig. 3. Framework for proposed system

A. Data Sources

Data were gathered from multiple sources, cleaned, pre-processed and integrated into a single repository. A subset of the data was used in the ML algorithms for sales predictions. The data used for this research study were primarily provided by a local agricultural organization, which sells farming products, including diesel fuel. ML is data driven; therefore, a database was needed to store the Big data set.

The main component of the data is transactional data relating to sales and stock levels from the agricultural organization. The products on sale and the items of stock available included farming inputs from different classes, such as petroleum, diesel fuel, wool and grain. The data consist of 11 713 915 sales records that date from 3 June 2008 to 28 November 2016. Each data item has a record of the quantity bought, the location and the customer who bought the items. The data were stored in a Microsoft SQL Server database.

Other data sets that were used included the Consumer Price Index (CPI) of South Africa, the South African Rand vs the United States Dollar exchange rate, bank prime interest rates and the global oil prices for the period 2008 - 2016. The CPI data were provided by Statistics South Africa on their official website. The base month (where the CPI is 100) is December 2012. The data set has the monthly CPI for each South African province, is dated from January 2008 to May 2016 and is in Microsoft Excel format.

The bank prime interest rate and exchange rate data were obtained from the official website of the South African Reserve Bank. The prime interest rate dataset is in Comma-Separated Values (CSV) format and contains the dates when the interest rate was changed by the Reserve Bank, with the value of the interest rate as a percentage. The data are dated from 2 January 2008 to 8 June 2016. The exchange rate data are also in CSV format and contain the average South African Rand to United States Dollar exchange rate per day. The values are dated from 2 January 2008 to 1 July 2016.

The last dataset that was used in this study is the dataset for global oil prices. The dataset is available at the official website of the United States Energy Information Administration. The data set was in CSV format and consists

of data from 2 January 2007 to 18 July 2016. Each data row contains the date and the average oil price per barrel per day in United States dollars. The data were integrated and stored as entities. The integrated data were divided into two parts, namely the training and validation data. The training data were input to neural networks in order to adjust their weights in such a way that they could predict future sales. The prediction data were used to generate predictions and evaluate the accuracy of the ML algorithms.

B. Neural Networks Implementation

The neural network implementation was carried out in stages. This section describes in depth the neural network development process in terms of code and iterations. The neural networks that were used for the predictions in this study were developed using the C# programming language together with the Encog Machine Learning framework [17]. A class was also created to perform each overarching activity during the ML process (Fig. 1). In order to make the program efficient, threads were used in its architecture. A thread is defined as a program unit that executes independently from other parts of a program. Running multiple threads at a time allows for better utilization of resources and increased efficiency [18, 19]. The data were first loaded from the Sale table in the SQL Server database to a text file because of performance issues. The sales were grouped according to location and item description using a SQL query. Also, a list of all possible combinations of location and item description, along with the number of records for each combination, was downloaded from the database. At runtime, the code was designed to process only combinations that have at least 150 records.

At runtime, the text file with the sales data was loaded into data objects in a list in the C# runtime environment. Each data object stored information about a sale or a month’s worth of sales. The most important information was the numerical information, such as CPI, interest rates, oil prices, sales quantities and so on. The data were loaded per item description and location so that predictions could be made per item description and location.

Each neural network was designed to run on a separate thread to improve the efficiency of the program. An Engine object for each possible combination of location and item description was created and added to a queue of threads awaiting a core to become available.

The Engine object received the data that were relevant to it for processing by the neural networks, that is, the data which matched the given combination of item description and location. Using a multithreaded program made processing times thousands of times faster.

C. Neural Network Development Details.

The implementation of this study was done in an iterative and incremental manner. This section describes the basic elements of neural network design that were present in all iterations. The neural networks were created inside Engine objects.

The loading of data was the first element in the neural network process. Sales data were loaded into an Engine object and its properties were normalized. The input variables that were used in the neural network were normalized to the range [-1, 1]. Normalizing the data was done so that the inputs of the neural network would be in the same range. The NormalizedField class in the Encog framework was used for this. Seventy-five percent of the data were then used to train the neural network and the rest was used for validation and predictions. The normalized fields were then loaded into an array of doubles for the inputs and one for the output. Both arrays were loaded into an IMLDataSet Encog class object and this was used for training neural networks.

The second component in the neural network process was to create the neural network. The Elman recurrent neural network is used for predictions [17]. It was chosen for this reason and also because it produced the predictions that most closely resembled the actual data when different types of neural networks were run on petroleum products [20].

The third part of the neural network process was the training process. During the training of the neural networks, a hybrid training strategy was employed. The strategy made use of a combination of propagation and non-propagation training. According to the Encog guide, the Levenberg Marquardt algorithm (LMA) outperforms all the other propagation training methods that are supported by Encog [17]. Therefore, it was selected as the propagation training method. In terms of the non-propagation training method, Simulated Annealing was used. Simulated Annealing uses less memory than Genetic Algorithms, which is why it was selected. The reason why a hybrid training strategy was used is to ensure that the global minimum error is reached [21]. When the first training technique gets stuck in a local minimum, the alternate method is used to find the global minimum [17]. During the training process, a greedy technique was used. This technique allowed weights to be changed only if the change resulted in a lower error. The training only stopped when there was no substantial change in the error for more than 20 iterations of training [22].

The last element in the neural network process was predictions. The remaining 25% of the data was used to make predictions. This was fed into the neural network without outputs and the neural network was then expected to predict the output.

Fig. 4. Iteration 1

Eighty percent of the data set was used for training and 20% for predictions. In the first iteration (Fig. 4) the basic network topology was used to create neural networks. Each data attribute had a single neuron and experiments were run. There were 4 input neurons and 1 output neuron.

In the second iteration (Fig. 5) additional variables were added to the neural network topology. These variables had to do with the date of a sale. Using one-of-n normalization, 7 neurons were assigned to the day of the week, 1 to the day of the month and 12 to the month. The third iteration (Fig. 6) is exactly the same as the second iteration, except that the third iteration contained a neuron for the sum of all the sales quantities of the previous 30 days.
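The input preparation described for the three iterations can be sketched in Python as follows. This is an illustrative approximation, not the authors' C#/Encog code: the helper names are hypothetical, the one-of-n encoding is assumed to use 1 for the active category and -1 for the others, the day of the month is assumed to be scaled from the range 1 to 31, and the 30-day sum assumes one sales quantity per day in the list.

```python
from datetime import date


def normalize(value, lo, hi, new_lo=-1.0, new_hi=1.0):
    """Linearly rescale value from [lo, hi] to [new_lo, new_hi]."""
    return (value - lo) / (hi - lo) * (new_hi - new_lo) + new_lo


def one_of_n(index, n, on=1.0, off=-1.0):
    """One neuron per category: the active category is 'on', all others 'off'."""
    return [on if i == index else off for i in range(n)]


def encode_date(d):
    """Date inputs of the second iteration: 7 + 1 + 12 = 20 neurons."""
    day_of_week = one_of_n(d.weekday(), 7)        # 7 neurons
    day_of_month = [normalize(d.day, 1, 31)]      # 1 neuron
    month = one_of_n(d.month - 1, 12)             # 12 neurons
    return day_of_week + day_of_month + month


def previous_30_day_sum(daily_quantities, i):
    """Extra input of the third iteration: total quantity over the 30 days before day i."""
    return sum(daily_quantities[max(0, i - 30):i])
```

For example, `encode_date(date(2016, 11, 28))` yields a list of 20 values, which would be appended to the four normalized economic inputs of the first iteration.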

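The greedy acceptance rule and the stopping criterion mentioned in the training step (no substantial change in the error for more than 20 iterations) can be sketched generically in Python. This sketch does not reproduce Levenberg Marquardt or Simulated Annealing; `propose` is a hypothetical stand-in for whichever training step suggests new weights, and `min_improvement` is an assumed threshold for what counts as a substantial change.

```python
def greedy_train(error_fn, weights, propose, min_improvement=1e-6,
                 patience=20, max_iterations=10_000):
    """Keep a candidate only if it lowers the error; stop after a long stall."""
    best = list(weights)
    best_error = error_fn(best)
    stalled = 0
    for _ in range(max_iterations):
        candidate = propose(best)
        candidate_error = error_fn(candidate)
        improvement = best_error - candidate_error
        if candidate_error < best_error:
            # Greedy rule: weights change only when the error becomes smaller.
            best, best_error = candidate, candidate_error
        # Count consecutive iterations without a substantial improvement.
        stalled = 0 if improvement > min_improvement else stalled + 1
        if stalled > patience:
            break
    return best, best_error
```

Because rejected candidates are discarded, the returned error can never be higher than the error of the starting weights.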
Fig. 5. Iteration 2

Fig. 6. Iteration 3

VI. EMPIRICAL EVALUATION AND RESULTS

The evaluation of the predictive models involved running all the iterations of the neural networks. Data were divided into a number of data sets according to the location of purchase and the item description. Each of these data sets was then divided into two sets, namely the training set and the validation set. A neural network was created for each dataset and was trained using the training set. Predictions were then made using the validation set and accuracy calculations were then made.

Fig. 7. Average RMSE for Daily Predictions by Category

Various statistical metrics were considered for use in evaluating the accuracy of the predictions of the solution. These included Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE) and Estimated Sum of Squares (ESS) [23].

There were 11356 neural networks created and tested for each configuration. Fig. 7 and Fig. 8 indicate the average RMSE per product category for all iterations of the neural networks for daily and monthly predictions respectively. Generally, for most of the categories, daily or monthly, Iteration 3 was the most accurate of the models. This is followed by Iteration 2, then Iteration 1. To a large extent, this shows that the models get more accurate as more variables are added to the model. Most of the average RMSE values are below 100, except for Direkte Aankope (Direct Purchases) and Petroleum Produkte (Petroleum Products), which have RMSE values above 2000.

Fig. 8. Average RMSE per Category for Monthly predictions

The RMSE values for daily predictions showed that 15 out of 21 of the categories for Iteration 1 had RMSE values less than 100. Iterations 2 and 3 had 19 and 17 out of 21 respectively. In terms of average RMSE values, Iteration 1 had a value of 1345, while Iterations 2 and 3 had 589 and 446 respectively. On average, monthly predictions proved to be less accurate than their daily counterparts. The average RMSE values per category were 1828, 1405 and 1310 for Iterations 1, 2 and 3 respectively. The number of categories that had average RMSE values less than 100 was 14 for each of the iterations. This can be explained by the reduction in data points from which the neural network learns patterns.

The results further indicated the neural networks’ ability to predict spikes in the demand for products. All iterations were able to accurately predict big spikes in demand. This is illustrated by Fig. 9, Fig. 10 and Fig. 11. The figures show graphs of predicted and actual sales quantities obtained from running neural networks on Diesel (500 PPM) 0.5% NORMAL for one of the locations where it is sold. The prediction set was selected at random to show the spike predictions.

Fig. 9. Diesel (500 PPM) 0.5% NORMAL Iteration 1

Fig. 10. Diesel (500 PPM) 0.5% NORMAL Iteration 2

Fig. 11. Diesel (500 PPM) 0.5% NORMAL Iteration 3
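Two of the accuracy metrics named in this section, RMSE and MAPE, can be computed as in the Python sketch below. The function names are illustrative and the code is not taken from the study. Skipping periods with zero actual sales in `mape` is an assumption, since the percentage error is undefined there and the paper does not say how such periods were handled.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean Absolute Percentage Error, skipping periods whose actual value is zero."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must be of equal length")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted) if a != 0]
    if not ratios:
        raise ValueError("no non-zero actual values to compare against")
    return 100.0 * sum(ratios) / len(ratios)
```

RMSE is expressed in the same unit as the sales quantity, so averages taken across categories with very different sales volumes are dominated by the high-volume categories.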

However, Iteration 1 fared worse with the smaller spikes in demand for products. Iterations 2 and 3 were better at predicting the overall pattern of demand for the products. All iterations could predict customer spending several months into the future, including the spikes.

The predicted sales were compared to the actual sales for the three iterations of the algorithms using a statistical metric, Root Mean Square Error. Generally, the order of accuracy was Iteration 3, 2 and then 1. This strongly suggests that the accuracy increases as more variables are added to the model. In this case, the added variables were the date information and the cumulative sales totals for previous periods. Daily predictions were generally accurate. They could accurately predict customer spending several months into the future, including spikes in sales.

VII. CONCLUSIONS AND FUTURE RESEARCH

The main goal of the paper was to develop an application that accurately predicts future sales using ML and to use those predictions to help in decision making, increase profits and limit stock holding. The predictive model of the system had to take into account external factors that influence customer spending. These factors were time, CPI, exchange rates, interest rates and oil prices. This paper demonstrates the use of a predictive analysis modelling approach in providing an accurate and real-time forecast of sales figures. The Elman neural network was found to be effective at making predictions and was used as the ML algorithm for this study. The predictive model could accurately predict daily sales per item per location, a number of months into the future. These results were promising and show that ML can be highly useful for predictive analysis. It is recommended that these algorithms be integrated within the agricultural organization’s ERP system.

This study showed that ML can be used as a tool to assist with sales predictions. It also showed that the more variables were added to the models, the better the predictions were. Future work will incorporate additional variables, such as weather data, to improve the predictions. Most farming products are seasonal, meaning that the weather heavily impacts sales and customer spending. Historical data of daily temperature, rainfall, wind speeds and other variables will also be considered.

The farming index, industrial index and the GDP growth are other factors that could be added to the predictive models. All three economic indicators affect spending power and therefore customer spending. Adding such data would therefore be advantageous for the predictive model. One outcome of this project was the accurate spike prediction by the ML algorithms. Spikes cause major problems for organizations when it comes to stock holding. Spikes lead to organizations having inadequate stock levels and risking the loss of customers who become discouraged. Research into this field is very important and will prove to be beneficial to organizations.

REFERENCES

[1] M. Shabana and K. V. A. Sharma, “Study in Big Data Advancement and Big Data Analytics,” J. of App. Sc. and Computations, vol. VI, no. 1, 2019.
[2] K. Cukier and V. Mayer-Schoenberger, The Rise of Big Data. https://www.foreignaffairs.com/articles/2013-04-03/rise-big-data
[3] F. Halper, Predictive Analytics for Business Advantage, 2014. http://tdwi.org/research/2013/12/best-practices-report-predictive-analytics-for-business-advantage/asset.aspx?tc=assetpg
[4] E. Alpaydin, Introduction to Machine Learning (2nd ed.). Cambridge: The MIT Press, 2010.
[5] G. Shmueli and O. Koppius, “Predictive analytics in Information Systems Research,” Robert H. Smith School Research Paper No. RHS 06-138, 2010. http://dx.doi.org/10.2139/ssrn.1606674.
[6] J. Dean, Big Data, Data Mining and Machine Learning. John Wiley, 2014.
[7] G. Bontempi, S. B. Taieb, and Y. Le Borgne, Machine Learning strategies for time series forecasting. Berlin Heidelberg: Springer, 2013.
[8] N. Yethiraj, “Applying Data Mining Techniques in the field of agriculture and allied sciences,” Int. J. of Bus. Int., 2012, pp. 72-75.
[9] F. Chen and T. Ou, “Sales Forecasting systems based on Gray extreme learning machine with Taguchi method in retail industry,” Expert Sys. App., vol. 38, no. 3, 2011, pp. 1336-1345.
[10] J. Bosch, M. Tait, and E. Venter, Business Management: An Entrepreneurial Perspective (2nd ed.). Port Elizabeth: Lectern, 2011.
[11] A. Hevner, “Design Science in Information Systems Research,” MIS Quarterly, vol. 28, no. 1, 2004, pp. 75-105.
[12] D. Chappell, Introducing Azure Machine Learning, 2010. http://davidchappell.comwriting/white_papers/Introducing-Azure-ML-v1.0--Chappell.pdf.
[13] M. Khashei and M. Bijari, “An artificial neural network model for timeseries forecasting,” Exp. Sys. with App., vol. 37, no. 1, 2010, pp. 479-489.
[14] D. Larose, Discovering knowledge in data: An introduction to data mining. John Wiley, 2005.
[15] M. Hajek, Neural Networks. University of KwaZulu-Natal, 2005.
[16] J. Fulcher, Computational Intelligence: An introduction (2nd ed.). Chichester: John Wiley & Sons Ltd., 2008.
[17] J. Heaton, Programming Neural Networks with Encog3 in C# (2nd ed.). St. Louis, MO, USA: Heaton Research, 2011.
[18] C. Horstmann, Big Java: compatible with Java 5, 6 and 7 (4th ed.). Danvers: John Wiley & Sons Inc., 2010.
[19] Y. Zhang and W. Lin, “Efficient resource sharing algorithm for physical register file in simultaneous multi-threading processors,” Microprocessors and Microsystems, vol. 45, 2016, pp. 270-282.
[20] J. Shan Hu, Y. Hu, and R. Lin, “Applying Neural Networks to prices predictions of Crude oil futures,” Mathematical Problems in Engineering, 2012.
[21] H. Albeihdili, T. Ham, and N. Islam, “Hybrid Algorithm for the optimization of training convolutional neural network,” IJACSA, vol. 6, no. 10, 2015.
[22] S. Huang, X. Li, Z. Cheng, Z. Zhang, and A. Hauptmann, GNAS: A Greedy Neural Architecture Search Method for Multi-Attribute Learning, 2018. https://arxiv.org/pdf/1804.06964.pdf
[23] C. Chen, J. Twycross, and J. Garibaldi, “A new accuracy measure based on bounded relative error for time series forecasting,” PLoS One, vol. 12, no. 3, 2017.
[24] B. Armstrong and F. Collopy, “Error Measures for Generalizing About Forecasting Methods: Empirical Comparisons,” Int. J. of Forecasting, vol. 8, 1992, pp. 69-80. https://doi/10.1016/0169-2070(92)90008-W
[25] Kaggle, Root Mean Squared Error, 2016. http://www.kaggle.com/wiki/RootMeanSquaredError
