Mini-Project
On
BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING
Submitted
By
GADEELA AKHIL REDDY (208R1A0573)
This is to certify that the project entitled “Artificial Neural Networks For Edge and Fog Computing
Based Energy Prediction” is a bonafide work carried out by
GADEELA AKHIL REDDY (208R1A0573)
The results presented in this project have been verified and are found to be satisfactory. The results
embodied in this project have not been submitted to any other university for the award of any other
degree or diploma.
This is to certify that the work reported in the present project entitled “Artificial Neural Networks For
Edge and Fog Computing Based Energy Prediction” is a record of bonafide work done by us in the
Department of Computer Science and Engineering, CMR Engineering College, JNTU Hyderabad. The
report is based on project work done entirely by us and not copied from any other source. We
submit our project for further development by any interested students who share similar interests and
wish to improve the project in the future.
The results embodied in this project report have not been submitted to any other University or Institute
for the award of any degree or diploma to the best of our knowledge and belief.
We are extremely grateful to Dr. A. Srinivasula Reddy, Principal, and Dr. Sheo Kumar, HOD,
Department of CSE, CMR Engineering College, for their constant support.
We are extremely thankful to Mr. B. Prasad, Associate Professor, Internal Guide, Department of CSE,
for his constant guidance, encouragement and moral support throughout the project.
We would be failing in our duty if we did not acknowledge with grateful thanks the authors of the
references and other literature referred to in this project.
We express our thanks to all staff members and friends for all the help and co-ordination extended
in bringing out this project successfully in time.
Finally, We are very much thankful to our parents who guided us for every step.
TOPIC PAGE NO
ABSTRACT I
LIST OF FIGURES II
LIST OF TABLES III
1. INTRODUCTION 1
1.1. Introduction & Objectives 1
1.2. Project Objectives 2
1.3. Purpose of the project 2
1.4. Existing System with Disadvantages 2
1.5. Proposed System With features 2
1.6. Input and Output Design 4
2. LITERATURE SURVEY 6
5. SOFTWARE DESIGN 13
5.1. System Architecture 13
5.2. Dataflow Diagrams 13
5.3. UML Diagrams 15
7. SYSTEM TESTING 26
7.1. Types of System Testing 26
7.2. Test Cases 27
8. OUTPUT SCREENS 30
9. CONCLUSION 34
11. REFERENCES 36
ABSTRACT
Edge and Fog Computing have become increasingly popular in recent times as innovative distributed
computing paradigms that bring computational power closer to the data source. This proximity enables
real-time processing and analysis of data, making it particularly advantageous in energy management
systems. Accurate prediction of energy consumption is crucial for efficiently utilizing resources and
optimizing energy usage. In the realm of energy prediction, the traditional approach involves utilizing
statistical methods, time series analysis, and regression models. However, these methods have their
limitations: they often require manual feature engineering, overlook intricate relationships within the
data, and deliver limited predictive performance, especially on complex and non-linear datasets.
Furthermore, they may not fully capitalize on the benefits of distributed computing in
Edge and Fog environments. On the other hand, accurate energy prediction holds immense significance
for sustainable energy management, especially in the context of modern smart grid systems and Internet
of Things (IoT) applications. Conventional forecasting methods often struggle to adapt to rapidly
changing energy consumption patterns and grapple with the processing of large-scale data, resulting in
suboptimal resource allocation. Hence, there is a pressing need to explore new approaches that can offer
more accurate, reliable, and real-time energy predictions to enhance energy efficiency and reduce costs.
The objective of this project is to explore techniques and solutions for extending smart grids.
In this context, Artificial Neural Networks (ANNs) emerge as a powerful tool. ANNs can automatically
learn and extract patterns from data without requiring extensive manual feature engineering. By
harnessing the capabilities of distributed computing in Edge and Fog environments, the proposed
method can efficiently process vast amounts of data in real-time, leading to more accurate energy
predictions. The outcomes of this study carry wide-ranging implications, as they can significantly
enhance energy management systems, optimize resource allocation, and contribute to overall
sustainability efforts. With the integration of ANNs and distributed computing, the proposed approach
holds the potential to revolutionize energy prediction and further advance the field of energy
management in smart grids.
LIST OF FIGURES
22. 8.1 Line plot of the training and testing loss of a machine learning model
LIST OF TABLES
S.NO TABLE NO DESCRIPTION PAGE NO
1. INTRODUCTION

1.1 Introduction & Objectives

Edge and fog computing-based energy prediction is an advanced approach in the field of energy
management and forecasting. This method leverages the capabilities of edge and fog computing to
enhance the accuracy and efficiency of energy consumption predictions across various applications,
such as smart grids, industrial automation, and smart buildings. Edge computing, involving data
processing closer to the source of generation or consumption, and fog computing, which extends this
concept by distributing computing resources even closer to the data source, play pivotal roles in this
context. These paradigms reduce latency, minimize bandwidth usage, and improve real-time decision-
making, making them ideal for energy prediction applications.
Energy prediction, in this context, refers to estimating future energy consumption patterns, a critical
factor for optimizing energy distribution, load balancing, and resource allocation. By deploying edge
and fog computing technologies, several advantages are achieved: In the realm of edge and fog
computing-based energy prediction, real-time data processing is a pivotal advantage. These computing
nodes can process data from sensors and devices in real-time, enabling immediate responses to changing
energy consumption patterns. This feature is particularly valuable in applications where timely decision-
making is crucial.
Reduced latency is another significant benefit. With computing resources positioned closer to the data
source, the latency associated with transmitting data to remote servers is minimized, ensuring energy
predictions are based on the most up-to-date information available. Data privacy and security are
paramount concerns in energy-related applications. Edge and fog computing allow data to be processed
locally, reducing the need to transmit sensitive energy consumption data to external servers. This
enhances data privacy and security.
Scalability is a vital aspect of these architectures. They are highly scalable, allowing for the addition of
more computing resources as needed. This flexibility is essential in dynamic environments where
energy consumption patterns may change rapidly. Resilience is also a critical feature. Distributed
computing architectures are inherently more resilient. If one edge or fog node fails, others can take over,
ensuring that energy prediction services remain available.
1.2 Project Objectives
The project objective is to develop and implement Artificial Neural Networks (ANNs) tailored for
accurate energy prediction within the dynamic contexts of Edge and Fog Computing environments.
1.3 Purpose of the Project

The purpose of utilizing Artificial Neural Networks (ANNs) for Edge and Fog Computing-based Energy
Prediction is to enhance the efficiency and effectiveness of energy management in decentralized
computing environments.
1.4 Existing System with Disadvantages

SVM is one of the most popular Supervised Learning algorithms, which is used for Classification as
well as Regression problems. However, primarily, it is used for Classification problems in Machine
Learning. The goal of the SVM algorithm is to create the best line or decision boundary that can
segregate n-dimensional space into classes so that we can easily put the new data point in the correct
category in the future. This best decision boundary is called a hyperplane.
SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are
called support vectors, and hence the algorithm is termed the Support Vector Machine. Consider the
diagram below, in which two different categories are classified using a decision boundary or
hyperplane:
Linear SVM: Linear SVM is used for linearly separable data. If a dataset can be classified into two
classes by a single straight line, it is termed linearly separable data, and the classifier used is
called the Linear SVM classifier.
Non-linear SVM: Non-linear SVM is used for non-linearly separable data. If a dataset cannot be
classified by a straight line, it is termed non-linear data, and the classifier used is called the
Non-linear SVM classifier.
SVM working
Linear SVM: The working of the SVM algorithm can be understood by using an example. Suppose we
have a dataset that has two tags (green and blue), and the dataset has two features x1 and x2. We want a
classifier that can classify the pair (x1, x2) of coordinates in either green or blue. Consider the below
image:
Since this is a 2-D space, we can separate these two classes with a single straight line. But there
can be multiple lines that separate these classes. Consider the below image:
Hence, the SVM algorithm helps to find the best line or decision boundary; this best boundary or region
is called a hyperplane. The SVM algorithm finds the closest points of the lines from both classes. These
points are called support vectors. The distance between the vectors and the hyperplane is called
the margin, and the goal of SVM is to maximize this margin. The hyperplane with the maximum margin is
called the optimal hyperplane.
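The maximum-margin idea described above can be sketched with scikit-learn's SVC on toy data (the points and labels below are illustrative only, not the project's dataset):

```python
import numpy as np
from sklearn.svm import SVC

# toy 2-D data: two well-separated clusters (illustrative only)
X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0],
              [6.0, 6.0], [7.0, 8.0], [8.0, 8.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# a large C approximates a hard-margin linear SVM
clf = SVC(kernel='linear', C=1000.0)
clf.fit(X, y)

# the support vectors are the extreme points that define the hyperplane
print(clf.support_vectors_)
print(clf.predict([[2.0, 2.0], [7.0, 7.0]]))  # one query point per class
```

Only the support vectors (the closest points of each class) appear in `support_vectors_`; the interior points do not influence the hyperplane.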
Non-Linear SVM: If data is linearly arranged, then we can separate it by using a straight line, but for
non-linear data, we cannot draw a single straight line. Consider the below image:
So, to separate these data points, we need to add one more dimension. For linear data, we have used two
dimensions x and y, so for non-linear data, we will add a third-dimension z. It can be calculated as:
z = x² + y²
By adding the third dimension, the sample space will become as below image:
Figure 1.4.7. Non-Linear SVM data separation
So now, SVM will divide the datasets into classes in the following way. Consider the below image:
Since we are in 3-D space, it looks like a plane parallel to the x-axis. If we convert it back to 2-D
space with z = 1, it becomes:
Figure 1.4.9 Non-Linear SVM with ROC
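The z = x² + y² mapping above can be verified with a small NumPy sketch (synthetic circular data, assumed for illustration): points that no straight line can separate in (x, y) become separable by a single threshold on z:

```python
import numpy as np

rng = np.random.default_rng(0)
# synthetic data: one class inside the unit circle, one in an outer ring
theta = rng.uniform(0.0, 2.0 * np.pi, 50)
r_in = rng.uniform(0.0, 1.0, 50)
r_out = rng.uniform(2.0, 3.0, 50)
inner = np.c_[r_in * np.cos(theta), r_in * np.sin(theta)]
outer = np.c_[r_out * np.cos(theta), r_out * np.sin(theta)]
X = np.vstack([inner, outer])
y = np.r_[np.zeros(50), np.ones(50)]

# the added third dimension from the text: z = x^2 + y^2
z = X[:, 0] ** 2 + X[:, 1] ** 2

# in z, a single threshold (a plane in 3-D) separates the two classes
pred = (z > 2.0).astype(float)
print((pred == y).mean())  # 1.0: perfectly separable after the mapping
```

Inner points have z ≤ 1 and outer points have z ≥ 4, so any threshold between those values separates the classes perfectly.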
Disadvantages
Support Vector Machines (SVMs) have been widely used in various domains, but they come with some
drawbacks when applied to energy prediction:
One significant drawback of SVMs for energy prediction is the need for a substantial amount of
labeled data. Energy consumption datasets, particularly those with fine-grained sensor readings, can be
relatively small and expensive to obtain.
SVMs perform best when they have a large amount of data to learn from, and this limitation can
hinder their effectiveness in energy prediction, where data availability is often limited.
Another drawback of SVMs is their limited ability to handle imbalanced datasets. In energy prediction,
it is common to encounter imbalanced datasets where certain consumption patterns are rare compared to
others. SVMs may struggle to correctly classify minority-class samples, leading to biased predictions.
1.5 Proposed System with Features

The proposed system for ANN is designed to revolutionize energy management in decentralized
computing environments. By integrating ANNs into Edge and Fog Computing nodes, we create a
distributed prediction framework that can adapt to dynamic conditions, allowing for timely adjustments
in energy usage.
Artificial Neural Network (ANN) Algorithm
ANNs are networks that automatically adjust their parameters, iteratively computing over data according
to the chosen model. Training a deep learning model is essentially a process of continually tuning the
weight values of its nodes, which together describe the features of the data. Whether the model can
describe those features depends on the final trained value of each weight. The ANN takes the neural
network as its carrier and focuses on depth; it is a general term that includes recurrent neural
networks with multiple hidden layers, fully connected networks, and convolutional neural networks.
Recurrent neural networks (RNNs) are mainly used for sequence data processing and have a memory effect;
the long short-term memory (LSTM) networks derived from them are better at handling long-term
dependencies. Convolutional neural networks (CNNs) focus on spatial mapping and are particularly
suitable for feature extraction from image data. When the input data is dependent and sequential, CNN
results are generally poor, because there is no correlation between one CNN input and the next. The RNN
appeared in the 1980s; it is designed with hidden layers that store information and selectively forget
some of it, so the characteristics of sequential changes in the data can be extracted. RNNs have
achieved many results in text and speech processing and are widely used in speech recognition, machine
translation, text generation, sentiment analysis, and video behavior recognition. Therefore, this
project uses RNN modelling to predict energy consumption. The RNN is good at processing time series
data and can describe the context of the data along the time axis. The RNN structure is shown in
Fig. 1.5.
Advantages
Smart Learning
Faster Predictions
It Saves Energy
It handles lots of Data
1.6 Input and Output Design

INPUT DESIGN
Input design is a part of overall system design. The main objective during the input design is as given
below:
Error Avoidance
At this stage, care is to be taken to ensure that input data remains accurate from the stage at which it
is recorded up to the stage at which it is accepted by the system. This can be achieved only by
careful control each time the data is handled.
Data Validation
Procedures are designed to detect errors in data at a lower level of detail. Data validations have been
included in the system in almost every area where there is a possibility for the user to commit errors.
The system will not accept invalid data. Whenever invalid data is keyed in, the system immediately
prompts the user, who has to key in the data again; the system will accept the data only if it is
correct. Validations have been included where necessary.
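A minimal sketch of such field-level validation (the record layout and field names below are hypothetical, assumed purely for illustration):

```python
def validate_reading(record):
    """Return a list of error messages; an empty list means the record is accepted."""
    errors = []
    # the meter identifier must be a non-empty string
    if not isinstance(record.get('meter_id'), str) or not record.get('meter_id'):
        errors.append('meter_id must be a non-empty string')
    # usage must be a non-negative number
    usage = record.get('usage_kwh')
    if not isinstance(usage, (int, float)) or usage < 0:
        errors.append('usage_kwh must be a non-negative number')
    return errors

print(validate_reading({'meter_id': 'M-17', 'usage_kwh': 4.2}))  # []
print(validate_reading({'meter_id': '', 'usage_kwh': -1}))       # two error messages
```

As the section describes, the caller would re-prompt the user until the error list comes back empty.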
OBJECTIVES
Feature Selection and Engineering: Identify and select the most relevant features that have a direct
impact on energy consumption or generation. This involves understanding the characteristics of the data
and choosing inputs that provide valuable information for prediction.
Normalization and Standardization: Ensure that the input data is preprocessed appropriately through
techniques like normalization (scaling the values to a standard range) and standardization (adjusting the
mean and standard deviation). This helps in preventing bias towards certain features and ensures
consistent performance.
Handling Categorical Variables: Address any categorical variables in the dataset by employing
techniques like one-hot encoding or label encoding. This enables the neural network to effectively
process these variables as part of the input.
Temporal Considerations: If the energy data has a temporal component (e.g., time of day, day of the
week), design the input to incorporate these time-related features. This allows the ANN to capture
patterns that may be influenced by time.
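The preprocessing objectives above (scaling, categorical encoding, temporal features) can be sketched with pandas and scikit-learn. The column names and sample values are hypothetical, assumed for illustration:

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# hypothetical hourly readings with one categorical column
df = pd.DataFrame({
    'DateTime': pd.date_range('2023-01-02', periods=6, freq='h'),
    'TotalUsage': [3.2, 2.9, 3.8, 5.1, 4.4, 3.0],
    'DeviceType': ['hvac', 'lighting', 'hvac', 'hvac', 'lighting', 'hvac'],
})

# temporal considerations: expose time-of-day and day-of-week as features
df['Hour'] = df['DateTime'].dt.hour
df['DayOfWeek'] = df['DateTime'].dt.dayofweek

# handling categorical variables: one-hot encode DeviceType
df = pd.get_dummies(df, columns=['DeviceType'])

# normalization: scale usage into the [0, 1] range
df['UsageScaled'] = MinMaxScaler().fit_transform(df[['TotalUsage']]).ravel()
print(df.head())
```

The scaled column and the one-hot columns can then feed the ANN directly, preventing any single feature from dominating by magnitude alone.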
Output Design
Outputs from computer systems are required primarily to communicate the results of processing to
users. They are also used to provide a permanent copy of the results for later consultation. The various
types of outputs in general are:
2. LITERATURE SURVEY

Blockchain is one of the ways to strengthen IoT security, especially in distributed systems, and many
studies support the use of blockchain-based IoT systems and secure transactions. Blockchain technology
is based on a distributed ledger rather than a centralized database. All the participating entities in
the transaction hold a copy of the ledger. In the IoT paradigm, where a large amount of transaction
data is involved, the dysfunction of a centralized database results in a loss of data. The blockchain
can resolve this issue, in addition to ensuring the transparency of transactions. Due to these promising
features, blockchain is gaining traction for a wide range of applications, including IoT.
Balogh et al. used the application layer, network layer, and physical layer to make the basic architecture
of the IoT systems. Many devices are embedded in the physical layer and interlinked through the
gateway. These hardware devices have restricted potential exposure to the assailant. In such a scenario,
replacing each affected module is not possible; hence, a mechanism is needed to tackle such problems as
network layer attacks and application layer attacks. Many issues and related challenges are linked with
IoT interoperability and security. Similarly, the study identified that IoT/CPS systems are not
understood completely relative to traditional ones, as these are widely distributed in an uncontrolled
manner. The IoT environment is an amalgamation of heterogeneous technologies, various protocols, and
processes. Moreover, in IoT systems, there is no standard, stable architecture, nor security mechanism
present to integrate different systems. Consequently, varied addressing formats and models introduce
complexity in IoT systems, creating the issue of interoperability. Another concern in the IoT
environment is the presence of nodal platforms with controlled electrical power, limited memory, and
low computational power, which are therefore incapable of incorporating heavy firewalls, ultimately
giving rise to security concerns.
The study in provided a detailed aspect of the vulnerabilities and thus security weaknesses found in each
IoT layer and offered blockchain-enabled solutions. Cybersecurity aims at providing confidentiality,
integrity, and authentication as three main aspects of tackling cyber attacks and ensuring the protection
of cyber–physical systems. In the context of IoT, confidentiality indicates that data packets are not
seized and peeked into, or that the host is not compromised so that an unauthorized person can attain
sensitive data, information, or credentials. Integrity ensures that data received or sent are not modified in
any unauthorized way. Availability means that all the modules in the system work properly and are not
prevented from proper functioning if a module is infected by some malicious agent or intrusion; an
infected device should be disinfected immediately rather than operate in a compromised way.
The authors in discussed different modules and their functionality present in smart homes and outlined
the transactional methods associated with the smart home. They also proposed metrics for efficiency
in terms of processing time, traffic, and energy consumption. Integrity, confidentiality, and availability
are ensured as security goals. The study in proposed an Ethereum-based distributed smart contract in
place of the orthodox centralized system to tackle DDoS attacks. This was achieved by giving
resources to each device and enabling the proposed system to distinguish between untrusted and trusted
devices.
The study in investigated the vulnerability aspects of IoT networks regarding DDoS attacks.
Furthermore, how they affect services, blockchain methods for tackling DDoS attacks, and the
challenges in integrating IoT and blockchain were also analyzed. In, the authors developed an algorithm
that mines consumer behavior data exclusively and applied machine learning models to advise activities
for optimal energy consumption at home. The algorithm may be utilized for energy optimization in
smart homes without reducing the comfort of the occupants. These patterns are transformed into
association rules, given a priority order, and compared to the inhabitants’ recent behavior. The system
makes a recommendation to the occupants if it finds ways to conserve energy without reducing comfort.
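The pattern-to-rule-to-recommendation flow described above can be sketched in a few lines of standard-library Python. The behaviour labels and the support threshold are hypothetical, chosen only to make the mechanics concrete:

```python
from collections import Counter
from itertools import combinations

# hypothetical daily behaviour logs (sets of observed activities)
days = [
    {'oven_on', 'lights_on', 'hvac_high'},
    {'oven_on', 'hvac_high'},
    {'lights_on', 'hvac_high'},
    {'oven_on', 'hvac_high'},
]

# count co-occurring behaviour pairs across days
pair_counts = Counter()
for day in days:
    for pair in combinations(sorted(day), 2):
        pair_counts[pair] += 1

# keep pairs seen on at least 3 of 4 days as association rules
rules = [p for p, c in pair_counts.items() if c >= 3]

# compare the rules to today's behaviour and recommend the missing half
today = {'oven_on'}
recommendations = ({b for a, b in rules if a in today} |
                   {a for a, b in rules if b in today}) - today
print(rules, recommendations)
```

Here the frequent pair (hvac_high, oven_on) becomes a rule, so when the occupant turns the oven on, the system can anticipate the correlated HVAC load and suggest an energy-saving adjustment.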
3. SOFTWARE REQUIREMENTS ANALYSIS
Modules
When implementing Artificial Neural Networks (ANNs) for edge and fog computing-based energy
prediction, several modules and techniques can be applied to enhance the performance, efficiency, and
adaptability of the model to edge and fog environments. Here are some key modules:
Model Pruning: Remove unnecessary connections and parameters from the network to reduce the model
size and computational complexity.
Quantization: Represent weights and activations using lower-precision data types to reduce memory and
computation requirements.
Knowledge Distillation: Train a smaller model to mimic the behavior of a larger, more complex model,
which can lead to faster inference on edge devices.
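Of these, quantization is the easiest to sketch in NumPy: mapping float32 weights onto 8-bit integers cuts memory to a quarter at a small reconstruction cost. This is an illustrative affine scheme on random stand-in weights, not any specific framework's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.2, 1000).astype(np.float32)  # stand-in model weights

# affine quantization: map [min, max] onto the 8-bit range [0, 255]
scale = float(weights.max() - weights.min()) / 255.0
zero_point = float(weights.min())
q = np.round((weights - zero_point) / scale).astype(np.uint8)

# dequantize to estimate the reconstruction error
deq = q.astype(np.float32) * scale + zero_point

print(weights.nbytes, q.nbytes)            # 4000 vs 1000 bytes
print(float(np.abs(deq - weights).max()))  # bounded by about half a quantization step
```

The 4x memory reduction is what makes quantized models practical on memory-constrained edge and fog nodes.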
Data Preprocessing and Feature Engineering: Prepare the input data by scaling, normalizing, and
engineering features to extract relevant information for the energy prediction task.
Windowing and Time Series Handling: If dealing with time series data, consider how to structure the
input data (e.g., sliding windows) to capture temporal patterns.
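The sliding-window framing mentioned above can be sketched as follows (the window length and load values are illustrative only):

```python
import numpy as np

def make_windows(series, window, horizon=1):
    """Slice a 1-D series into (past window -> future value) training pairs."""
    X, y = [], []
    for i in range(len(series) - window - horizon + 1):
        X.append(series[i:i + window])              # inputs: the last `window` readings
        y.append(series[i + window + horizon - 1])  # target: the value `horizon` steps ahead
    return np.array(X), np.array(y)

# illustrative hourly load values (not real data)
load = np.array([3.1, 2.8, 3.0, 4.2, 5.0, 4.7, 3.9, 3.5])
X, y = make_windows(load, window=3)
print(X.shape, y.shape)  # (5, 3) (5,)
```

Each row of X holds three consecutive past readings and the matching entry of y is the next reading, which is exactly the supervised framing an ANN needs for time-series prediction.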
Functionalities
Artificial Neural Networks (ANNs) in the context of edge and fog computing-based energy prediction
serve various functionalities to achieve accurate and efficient energy predictions. Here are the key
functionalities:
ANNs at the edge and fog layers are capable of processing streaming data in real-time, enabling
immediate response to changing energy consumption patterns.
Distributed Computation:
ANNs can be distributed across edge and fog devices, allowing for parallelized computation and load
balancing, which is critical for handling large datasets.
Edge and fog computing environments require low-latency predictions, which ANNs can provide by
performing computations locally, reducing the need for round-trip communication to centralized servers.
By processing data locally, ANNs can significantly reduce the amount of data that needs to be
transmitted to the cloud or a centralized server, leading to lower bandwidth usage.
Resource Efficiency:
ANNs can be optimized for resource-constrained environments by employing techniques like model
pruning, quantization, and lightweight architectures.
Ability to process streaming energy data in real-time to enable immediate response to changes in
consumption patterns.
Distributed Computation:
Support for distributed computing across edge and fog nodes to allow for parallelized processing of
large datasets.
Capabilities for performing rapid computations locally, reducing the need for round-trip communication
to centralized servers and enabling low-latency predictions.
Resource Efficiency:
Optimization techniques, such as model pruning, quantization, and use of lightweight architectures, to
ensure efficient utilization of computing resources in resource-constrained environments.
Energy Efficiency:
Strategies to minimize energy consumption during inference, vital for prolonging battery life in edge
and fog devices.
Performance:
Response Time: The system should provide predictions within milliseconds or microseconds to meet
real-time requirements.
Throughput: It should be able to handle a high volume of concurrent requests for prediction without
significant degradation in performance.
Scalability:
The system should be capable of handling an increasing number of edge and fog devices, as well as
scaling with growing datasets.
Resource Utilization:
The ANN model should be optimized to make efficient use of memory, CPU, and other resources
available on edge and fog devices.
Availability:
The system should have a high level of availability, ensuring that predictions are consistently accessible,
even in the presence of device failures or network outages.
a. ECONOMICAL FEASIBILITY
b. TECHNICAL FEASIBILITY
c. OPERATIONAL FEASIBILITY
a. Economic Feasibility:
Cost of Implementation: Evaluate the initial investment required for setting up the edge and fog
computing infrastructure, including hardware, software licenses, and any additional resources needed for
development and deployment.
Operational Costs: Consider ongoing costs related to maintenance, support, and potential cloud
services (if used for training or storage).
Return on Investment (ROI): Estimate the potential benefits of using ANNs for energy prediction,
including potential energy savings and efficiency improvements, and compare them to the initial and
ongoing costs.
b. Technical Feasibility:
Hardware and Software Requirements: Evaluate whether the edge and fog computing infrastructure
has the necessary hardware capabilities (e.g., processing power, memory) to run the ANN model
efficiently. Additionally, assess the availability of suitable software frameworks and libraries for ANN
development.
Compatibility with Edge Devices: Check if the chosen ANN architecture can be deployed on a variety
of edge devices, ensuring compatibility with different hardware platforms.
Data Availability and Quality: Assess the availability, quality, and frequency of energy consumption
data. Ensure that the data is collected and preprocessed in a manner suitable for training the ANN.
Model Complexity and Resource Constraints: Consider the complexity of the ANN model and
whether it can operate within the resource constraints of edge and fog devices, including memory and
processing limitations.
c. Operational Feasibility:
Integration with Existing Systems: Assess how well the ANN-based energy prediction system can
integrate with existing energy management systems, IoT devices, and data collection infrastructure.
User Adoption and Training: Evaluate whether the intended users (e.g., energy managers, system
administrators) have the necessary skills and knowledge to effectively use and maintain the system.
Data Governance and Compliance: Ensure that data handling practices comply with regulatory
requirements and privacy standards, which may impact the operational feasibility of the system.
4. SOFTWARE AND HARDWARE REQUIREMENTS
The functional requirements or the overall description documents include the product perspective and
features, operating system and operating environment, graphics requirements, design constraints and
user documentation.
The appropriation of requirements and implementation constraints gives the general overview of the
project in regard to what the areas of strength and deficit are and how to tackle them.
Minimum hardware requirements are very dependent on the particular software being developed by a
given Enthought Python / Canopy / VS Code user. Applications that need to store large arrays/objects in
memory will require more RAM, whereas applications that need to perform numerous calculations or
tasks more quickly will require a faster processor.
RAM: minimum 4 GB
5. SOFTWARE DESIGN

5.1 System Architecture

1. Data Collection and Preprocessing:
Sensors and data sources at the edge (e.g., IoT devices, smart meters) collect energy-related data.
Data preprocessing: Clean, format, and normalize the data. Handle missing values and outliers.
2. Data Storage:
Store preprocessed data in a distributed database or data lake, which can be located at the edge or in fog
nodes.
3. Edge and Fog Node Deployment:
Edge nodes (e.g., IoT gateways) and fog nodes (closer to the core network but still at the network edge)
host ANNs for local predictions to reduce latency and bandwidth usage.
Fog nodes can perform more intensive data preprocessing and feature extraction tasks.
4. Model Training:
Train the ANNs centrally in the cloud or data center using historical data and advanced machine
learning frameworks.
Periodically update the models to adapt to changing patterns; techniques like transfer learning can
improve model updates.
5. Model Deployment:
Deploy trained models to edge and fog nodes for local predictions.
Ensure that models can be deployed on resource-constrained devices and take advantage of hardware
acceleration (e.g., GPUs or TPUs, if available).
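The deployment steps above can be sketched as a minimal edge-node prediction loop. All names here are hypothetical, and a simple moving-average forecaster stands in for the trained ANN:

```python
class EdgeNode:
    """An edge node that keeps a short local history and predicts without a cloud round-trip."""

    def __init__(self, model, history_size=3):
        self.model = model                  # locally deployed, pre-trained predictor
        self.history = []
        self.history_size = history_size

    def on_reading(self, value):
        """Ingest one sensor reading and return a local forecast."""
        self.history.append(value)
        self.history = self.history[-self.history_size:]  # keep only recent readings
        return self.model(self.history)

# a moving-average forecaster stands in for the trained ANN
def moving_average(window):
    return sum(window) / len(window)

node = EdgeNode(moving_average)
for reading in [3.0, 4.0, 5.0]:
    forecast = node.on_reading(reading)
print(forecast)  # 4.0
```

In the full system, `model` would be the trained network pushed from the cloud in step 5, and periodic retraining (step 4) would replace it in place on each node.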
5.3 UML Diagrams

The Unified Modeling Language (UML) is a standard language for specifying, visualizing, constructing,
and documenting the artifacts of a software system, as well as for business modeling and other
non-software systems. The UML represents a collection of best engineering practices that have proven
successful in the modeling of large and complex systems. The UML is a very important part of developing
object-oriented software and the software development process. The UML uses mostly graphical notations
to express the design of software projects.
a. Class diagram
b. Use case diagram
c. Sequence diagram
d. Activity diagram
a. Class diagram
The class diagram is used to refine the use case diagram and define a detailed design of the system. The
class diagram classifies the actors defined in the use case diagram into a set of interrelated classes. The
relationship or association between the classes can be either an "is-a" or "has-a" relationship. Each class
in the class diagram may be capable of providing certain functionalities. These functionalities provided
by the class are termed "methods" of the class. Apart from this, each class may have certain "attributes"
that uniquely identify the class.
b. Use Case Diagram
A use case diagram in the Unified Modeling Language (UML) is a type of behavioral diagram defined
by and created from a Use-case analysis. Its purpose is to present a graphical overview of the
functionality provided by a system in terms of actors, their goals (represented as use cases), and any
dependencies between those use cases. The main purpose of a use case diagram is to show what system
functions are performed for which actor. Roles of the actors in the system can be depicted.
c. Sequence Diagram
A sequence diagram is a kind of interaction diagram that shows how processes operate with one another
and in what order. It is a construct of a Message Sequence Chart. A sequence diagram shows, as parallel
vertical lines ("lifelines"), different processes or objects that live simultaneously, and as horizontal
arrows, the messages exchanged between them, in the order in which they occur. This allows the
specification of simple runtime scenarios in a graphical manner.
d. Activity Diagram
In the Unified Modeling Language, activity diagrams can be used to describe the business and
operational step-by-step workflows of components in a system. An activity diagram shows the overall
flow of control.
6. CODING AND ITS IMPLEMENTATION
6.1 SOURCE CODE
import pandas as pd
from pandas import DataFrame, concat, read_csv
from numpy import concatenate
from math import sqrt
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
from sklearn.metrics import mean_squared_error
from keras.models import Sequential
from keras.layers import Dense, LSTM
from matplotlib import pyplot
import matplotlib.pyplot as plt

# load data (source file name assumed)
dataset = read_csv('raw_energy_data.csv')
dataset.set_index('DateTime', inplace=True)
# save to file
dataset.to_csv('energy.csv')
dataset.head(5)

# load dataset and drop the 'Weekend' flag
dataset = read_csv('energy.csv', header=0, index_col=0)
del dataset['Weekend']
values = dataset.values
# plot each of the eight input series
groups = [0, 1, 2, 3, 4, 5, 6, 7]
i = 1
plt.figure()
for group in groups:
    plt.subplot(len(groups), 1, i)
    plt.plot(values[:, group])
    plt.title(dataset.columns[group], y=0.5, loc='right')
    i += 1
plt.show()
print(dataset.shape)

# group the data by 'Month' and calculate the average usage for each month
monthly_usage = dataset.groupby('Month')['TotalUsage'].mean().reset_index()
plt.figure(figsize=(10, 6))
plt.bar(monthly_usage['Month'], monthly_usage['TotalUsage'])
plt.xlabel('Month')
plt.xticks(range(1, 13))  # x-axis ticks for the months (1 to 12)
plt.show()
dataset.info()

# frame a time series as a supervised learning dataset
def series_to_supervised(data, n_in=1, n_out=1, dropnan=True):
    n_vars = 1 if type(data) is list else data.shape[1]
    df = DataFrame(data)
    cols, names = list(), list()
    # input sequence (t-n, ..., t-1)
    for i in range(n_in, 0, -1):
        cols.append(df.shift(i))
        names += [('var%d(t-%d)' % (j + 1, i)) for j in range(n_vars)]
    # forecast sequence (t, t+1, ..., t+n)
    for i in range(0, n_out):
        cols.append(df.shift(-i))
        if i == 0:
            names += [('var%d(t)' % (j + 1)) for j in range(n_vars)]
        else:
            names += [('var%d(t+%d)' % (j + 1, i)) for j in range(n_vars)]
    agg = concat(cols, axis=1)
    agg.columns = names
    # drop rows with NaN values introduced by shifting
    if dropnan:
        agg.dropna(inplace=True)
    return agg

# integer-encode the categorical column
values = dataset.values
encoder = LabelEncoder()
values[:, 4] = encoder.fit_transform(values[:, 4])
values = values.astype('float32')
# normalize features
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(values)
# frame as supervised learning with 1 lag hour and 8 features
n_hours = 1
n_features = 8
reframed = series_to_supervised(scaled, n_hours, 1)
print(reframed.shape)

# split into train and test sets
values = reframed.values
n_train_hours = 600 * 24
train = values[:n_train_hours, :]
test = values[n_train_hours:, :]
n_obs = n_hours * n_features
train_X, train_y = train[:, :n_obs], train[:, -n_features]
test_X, test_y = test[:, :n_obs], test[:, -n_features]
# reshape input to be 3D: [samples, timesteps, features]
train_X = train_X.reshape((train_X.shape[0], n_hours, n_features))
test_X = test_X.reshape((test_X.shape[0], n_hours, n_features))

# design network
model = Sequential()
model.add(LSTM(50, input_shape=(train_X.shape[1], train_X.shape[2])))
model.add(Dense(1))
model.compile(loss='mae', optimizer='adam')
print(model.summary())
# fit network
history = model.fit(train_X, train_y, epochs=50, batch_size=72,
                    validation_data=(test_X, test_y), verbose=2, shuffle=False)
# plot history
pyplot.plot(history.history['loss'], label='train')
pyplot.plot(history.history['val_loss'], label='test')
pyplot.legend()
pyplot.show()

# make a prediction and invert the scaling back to the original units
yhat = model.predict(test_X)
test_X = test_X.reshape((test_X.shape[0], n_hours * n_features))
inv_yhat = concatenate((yhat, test_X[:, -(n_features - 1):]), axis=1)
inv_yhat = scaler.inverse_transform(inv_yhat)
inv_yhat = inv_yhat[:, 0]
test_y = test_y.reshape((len(test_y), 1))
inv_y = concatenate((test_y, test_X[:, -(n_features - 1):]), axis=1)
inv_y = scaler.inverse_transform(inv_y)
inv_z = inv_y
inv_y = inv_y[:, 0]
# calculate RMSE
rmse = sqrt(mean_squared_error(inv_y, inv_yhat))
print('Test RMSE: %.3f' % rmse)

# scatter plot of the true target values (test_y) against the predicted values (yhat)
plt.scatter(test_y, yhat, color='red')
plt.xlabel('True Values')
plt.ylabel('Predicted Values')
plt.show()

fig = pyplot.figure(figsize=(30, 20))
ax2 = fig.add_subplot(211)
ax2.plot(inv_z[:, 3])
print(inv_y.shape, inv_yhat.shape)

# line plot of the true and predicted series
fig = pyplot.figure(figsize=(30, 20))
ax1 = fig.add_subplot(211)
ax1.plot(inv_y)
ax1.plot(inv_yhat)
print(inv_y)
6.2 Implementation
6.2.1 Python
Python is an interpreted high-level programming language for general-purpose programming. Created
by Guido van Rossum and first released in 1991, Python has a design philosophy that emphasizes code
readability, notably using significant whitespace. Python features a dynamic type system and automatic
memory management. It supports multiple programming paradigms, including object-oriented,
imperative, functional and procedural, and has a large and comprehensive standard library.
• Python is Interpreted: Python is processed at runtime by the interpreter, so you do not need to
compile your program before executing it. This is similar to Perl and PHP.
• Python is Interactive: you can sit at a Python prompt and interact with the interpreter directly to
write your programs.
Python also acknowledges that speed of development is important. Readable and terse code is part of
this, and so is access to powerful constructs that avoid tedious repetition of code. Maintainability also
ties into this: lines of code may be an all but useless metric, but it does say something about how much
code you have to scan, read and/or understand to troubleshoot problems or tweak behaviors. This speed
of development, the ease with which a programmer of other languages can pick up basic Python skills,
and the huge standard library are key to another area where Python excels. All its tools have been quick
to implement, have saved a lot of time, and several of them have later been patched and updated by
people with no Python background, without breaking.
TensorFlow
TensorFlow is a free and open-source software library for dataflow and differentiable programming
across a range of tasks. It is a symbolic math library and is also used for machine learning applications
such as neural networks. It is used for both research and production at Google.
TensorFlow was developed by the Google Brain team for internal Google use. It was released under
the Apache 2.0 open-source license on November 9, 2015.
NumPy
NumPy is the fundamental package for scientific computing with Python. It contains various features,
including these important ones: a powerful N-dimensional array object; sophisticated (broadcasting)
functions; tools for integrating C/C++ and Fortran code; and useful linear algebra, Fourier transform,
and random number capabilities.
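A minimal sketch of these array features in use (the data values below are illustrative, not taken from the project dataset):

```python
import numpy as np

# illustrative 2-D array of hourly readings (rows = hours, cols = sensors)
readings = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], dtype=np.float32)

# vectorized operations work on whole arrays without explicit loops
col_means = readings.mean(axis=0)   # per-sensor means -> [3. 4.]

# broadcasting subtracts the per-column means from every row
normalised = readings - col_means

print(col_means)          # [3. 4.]
print(normalised.sum())   # 0.0
```

Broadcasting is what lets the per-column means (shape `(2,)`) be subtracted from the full `(3, 2)` array in one expression.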
Pandas
Pandas is an open-source Python library providing high-performance data manipulation and analysis
tools built on its powerful data structures. Before Pandas, Python was mainly used for data munging
and preparation and contributed little to data analysis itself; Pandas solved this problem. Using Pandas,
we can accomplish five typical steps in the processing and analysis of data, regardless of the origin of
the data: load, prepare, manipulate, model, and analyze. Python with Pandas is used in a wide range of
academic and commercial domains, including finance, economics, statistics, and analytics.
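The load/prepare/manipulate/analyze steps can be sketched as follows; the in-memory `DataFrame` stands in for a CSV read with `pd.read_csv`, and the column names mirror those used in the source code above:

```python
import pandas as pd

# load: illustrative hourly energy readings
data = pd.DataFrame({
    'Month': [1, 1, 2, 2],
    'TotalUsage': [3.2, 2.8, 4.1, 3.9],
})

# prepare: drop missing rows and ensure usage values are floats
data = data.dropna().astype({'TotalUsage': 'float64'})

# manipulate: average usage per month, as in the plotting step above
monthly = data.groupby('Month')['TotalUsage'].mean().reset_index()

# analyze: simple summary statistics
print(monthly)
print(data['TotalUsage'].describe())
```

The same `groupby` / `mean` pattern drives the monthly-usage plot in Section 6.1.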
Matplotlib
Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of
hardcopy formats and interactive environments across platforms. Matplotlib can be used in Python
scripts, the Python and IPython shells, the Jupyter Notebook, web application servers, and several
graphical user interface toolkits. Matplotlib tries to make easy things easy and hard things possible. You
can generate plots, histograms, power spectra, bar charts, error charts, scatter plots, etc., with just a few
lines of code. For examples, see the sample plots and thumbnail gallery.
For simple plotting, the pyplot module provides a MATLAB-like interface, particularly when combined
with IPython. For the power user, you have full control of line styles, font properties, axes properties,
etc., via an object-oriented interface or via a set of functions familiar to MATLAB users.
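A small sketch of the object-oriented interface, producing a hardcopy figure like the monthly-usage plot in Section 6.1 (the usage values are illustrative, and the output file name is arbitrary):

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the script runs headless
import matplotlib.pyplot as plt

months = list(range(1, 13))
# illustrative average monthly usage values
usage = [3.1, 2.9, 2.7, 2.5, 2.8, 3.4, 4.0, 4.2, 3.6, 3.0, 3.2, 3.5]

fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(months, usage, marker='o', label='average usage')
ax.set_xlabel('Month')
ax.set_ylabel('kWh')
ax.set_xticks(months)   # one tick per month, as in the source code
ax.legend()
fig.savefig('monthly_usage.png')  # hardcopy output instead of plt.show()
```

Using `fig`/`ax` objects rather than the implicit pyplot state makes multi-panel figures (like the `add_subplot` calls in Section 6.1) easier to manage.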
7. SYSTEM TESTING
7.1 Types of Testing
1. Unit Testing:
Begin with unit testing for individual components, including the ANN models, data preprocessing
modules, and communication protocols.
2. Integration Testing:
Test the integration of components. For example, check whether the ANN models can successfully
interface with the data preprocessing and real-time data processing modules.
3. End-to-End Testing:
Perform end-to-end tests to validate the entire system's functionality. Test the flow of data from sensors
to predictions, including data collection, preprocessing, model inference, and response generation.
4. Performance Testing:
Assess the system's performance under various workloads and conditions. Measure latency, throughput,
and resource usage (CPU, memory, bandwidth) to ensure the system can handle the expected load.
Simulate various traffic patterns and usage scenarios to identify bottlenecks.
5. Scalability Testing:
Verify that the system can scale horizontally and vertically to accommodate an increasing number of
edge and fog nodes and devices. Test resource allocation and load-balancing mechanisms to ensure
effective scaling.
6. Stress Testing:
Subject the system to extreme conditions to evaluate its robustness and stability. Gradually increase the
load, simulate network disruptions, and monitor how the system handles such scenarios.
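The unit-testing step above can be sketched as a small self-checking test for a hypothetical preprocessing helper; the function name, column name, and scaling behaviour here are illustrative, not taken from the project code:

```python
import pandas as pd

def preprocess(df):
    """Illustrative preprocessing helper: drop missing rows and
    min-max scale the 'TotalUsage' column to the range [0, 1]."""
    df = df.dropna().copy()
    col = df['TotalUsage']
    df['TotalUsage'] = (col - col.min()) / (col.max() - col.min())
    return df

def test_preprocess_drops_nans_and_scales():
    raw = pd.DataFrame({'TotalUsage': [10.0, None, 20.0, 30.0]})
    clean = preprocess(raw)
    assert len(clean) == 3                     # the NaN row was removed
    assert clean['TotalUsage'].min() == 0.0    # scaled lower bound
    assert clean['TotalUsage'].max() == 1.0    # scaled upper bound

test_preprocess_drops_nans_and_scales()
print('unit test passed')
```

The same pattern (build a tiny fixture, run the component, assert on its output) extends to tests of the model-inference and communication modules.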
7.2 Testcases
1. Data Input Testing:
Test the system's ability to receive and preprocess data from edge devices and sensors.
2. Model Training Testing:
Test the training process to ensure that it can successfully update the ANNs with new data.
3. Model Deployment Testing:
Ensure that trained models can be successfully deployed to edge and fog nodes. Verify that the models
are compatible with the hardware and software environments on these nodes.
4. Real-Time Prediction Testing:
Test the accuracy and latency of real-time predictions made by the deployed ANNs.
5. Performance Testing:
Assess how the system handles peak loads and adjusts resource allocation accordingly.
8. OUTPUT SCREENS
8.1 Results and Description
Figure 1 displays a subset of the original dataset that is being used for fog and energy prediction.
It shows a few rows or samples of the data.
Figure 2 is a line plot that visualizes the trend of monthly energy consumption over time. It shows
how energy consumption varies across different months.
Figure 3 provides a summary or overview of the dataset used for fog and energy prediction. It
includes statistics like mean, median, standard deviation, and other relevant information about
the dataset.
Figure 4 displays an array or matrix representing the features of the dataset after some form of
preprocessing. Each row represents a sample, and each column could represent a specific feature
or attribute.
Figure 5 displays an array or vector representing the target column (the variable we want to
predict) of the dataset. Each element in the array corresponds to the target value for a specific
sample.
Figure 6 provides a summary or description of a machine learning model that is being proposed
for fog and energy prediction. It includes details about the architecture, algorithms used, and
other relevant information about the model.
Figure 7 is a line plot that shows how the loss (or error) of the machine learning model changes
during the training process. It has two lines representing training loss and testing (or validation)
loss over iterations or epochs.
Figure 8 displays the prediction results generated by the proposed machine learning model. It
shows a comparison between actual and predicted values for a subset of the data.
Figure 9 presents a line plot comparing the actual values (ground truth) with the predicted values
generated by the machine learning model. It helps visualize how well the model's predictions
align with the actual data.
These figures collectively provide insights into the dataset, preprocessing steps, model
performance, and prediction results for the fog and energy prediction task.
9. CONCLUSION
9.1 Conclusion
In conclusion, the integration of Artificial Neural Networks (ANNs) with Edge and Fog Computing for
energy prediction represents a promising and innovative approach to address the pressing challenges in
sustainable energy management. This project has demonstrated that ANNs, with their ability to
automatically learn and extract patterns from data, can overcome the limitations of traditional methods
for energy prediction. By leveraging the computational power and real-time processing capabilities of
Edge and Fog environments, this approach offers more accurate and reliable energy consumption
forecasts. It is particularly crucial in the context of modern smart grid systems and Internet of Things
(IoT) applications, where rapidly changing energy consumption patterns and large-scale data processing
demand sophisticated solutions. The successful implementation of this method has the potential to
revolutionize energy management systems, leading to optimized resource allocation, improved energy
efficiency, and cost reduction, ultimately contributing to sustainability efforts and a greener future.
10. FUTURE ENHANCEMENTS
Looking ahead, the future scope of this research is promising. Researchers can explore further
refinements and optimizations of the ANN models to enhance prediction accuracy. Additionally,
incorporating advanced techniques like deep learning and reinforcement learning into the framework
can potentially provide even more precise energy forecasts. Moreover, the scalability and adaptability of
the proposed approach to varying energy management scenarios and different types of data sources
should be investigated. Furthermore, integrating real-time sensor data from IoT devices and weather
forecasting data can improve the accuracy of predictions and enable proactive energy management.
Collaboration with industry partners and energy providers can facilitate the practical implementation of
this approach in real-world energy systems.
11. REFERENCES
[1].Akbari-Dibavar, A.; Nojavan, S.; Mohammadi-Ivatloo, B.; Zare, K. Smart home energy
management using hybrid robust-stochastic optimization. Comput. Ind. Eng. 2020, 143, 106425.
[2].Al-Qerem, A.; Alauthman, M.; Almomani, A.; Gupta, B. IoT transaction processing through
cooperative concurrency control on fog–cloud computing environment. Soft Comput. 2020, 24,
5695–5711.
[3].Ammi, M.; Alarabi, S.; Benkhelifa, E. Customized blockchain-based architecture for secure
smart home for lightweight IoT. Inf. Process. Manag. 2021, 58, 102482.
[4].Balogh, S.; Gallo, O.; Ploszek, R.; Špaček, P.; Zajac, P. IoT Security Challenges: Cloud and
Blockchain, Postquantum Cryptography, and Evolutionary Techniques. Electronics 2021, 10,
2647.
[5].Bansal, S.; Kumar, D. IoT ecosystem: A survey on devices, gateways, operating systems,
middleware and communication. Int. J. Wirel. Inf. Netw. 2020, 27, 1–25.
[6].Smart Home Dataset | Kaggle. Available online: https://www.kaggle.com/code/offmann/smart-
home-dataset (accessed on 26 October 2022).
[7].Chen, S.W.; Chiang, D.L.; Liu, C.H.; Chen, T.S.; Lai, F.; Wang, H.; Wei, W. Confidentiality
protection of digital health records in cloud computing. J. Med. Syst. 2016, 40, 124.
[8].Deshpande, V.M.; Nair, M.K.; Bihani, A. Optimization of security as an enabler for cloud
services and applications. In Cloud Computing for Optimization: Foundations, Applications, and
Challenges; Springer: Cham, Switzerland, 2018; pp. 235–270.
[9].Dorri, A.; Kanhere, S.S.; Jurdak, R.; Gauravaram, P. Blockchain for IoT security and privacy:
The case study of a smart home. In Proceedings of the 2017 IEEE International Conference on
Pervasive Computing and Communications Workshops (PerCom Workshops), Kona, HI, USA,
13–17 March 2017; pp. 618–623.
[10]. Fakhri, D.; Mutijarsa, K. Secure IoT communication using blockchain technology. In
Proceedings of the 2018 International Symposium on Electronics and Smart Devices (ISESD),
Bandung, Indonesia, 23–24 October 2018; pp. 1–6.
[11]. Autonomic interoperability manager: A service-oriented architecture for full-stack
interoperability in the Internet-of-Things. ICT Express 2021, 8, 507–512.
[12]. Jamil, F.; Ahmad, S.; Iqbal, N.; Kim, D.H. Towards a Remote Monitoring of Patient
Vital Signs Based on IoT-Based Blockchain Integrity Management Platforms in Smart
Hospitals. Sensors 2020, 20, 2195.