Short Term Forecasting day ahead Temperature and Humidity of a building using High level machine learning algorithms

Soumyadeep Podder
Netaji Subhash University of Technology
Dwarka Delhi, India
podder.soumyadeep.ug23@nsut.ac.in

Uday Goel
Netaji Subhash University of Technology
Dwarka Delhi, India
uday.goel.ug23@nsut.ac.in

Naina
Netaji Subhash University of Technology
Dwarka Delhi, India
[email protected]

Abstract— Accurate forecasting of temperature and humidity in buildings plays a crucial role in optimizing a building's environmental systems, which can unlock significant potential for energy savings in the buildings sector. Forecasting temperature and humidity therefore supports proper management of HVAC operation, leading to sustainable energy savings. The same parameters govern thermal comfort in buildings: maintaining optimal temperature and humidity levels can achieve higher energy efficiency than in a conventional building and reduce unnecessary energy costs. The models were trained using the “wisdom of the crowd” methodology, a category that includes ensemble supervised algorithms and decision trees. We trained the models for short-term forecasting, considering its real-time benefits. This study proposes Machine Learning (ML) models for day-ahead forecasting of temperature and humidity in a commercial smart building. The simulation results demonstrate that the proposed approach achieves satisfactory forecasting accuracy for both temperature and humidity predictions. The proposed model can be adapted to other real-world scenarios, such as smart applications that predict these values and smart devices that adjust themselves according to changes in the weather.

Keywords: Decision trees, Ensemble supervised learning, Thermal comfort, Wisdom of crowd

I. INTRODUCTION

The buildings sector contributes significantly to global energy demand, accounting for about 40% of it and approximately 36% of Greenhouse Gas (GHG) emissions. Consequently, reducing its environmental impact has become a global priority. Advances in technology and the Internet of Things (IoT) have revolutionized the buildings sector, transforming traditional structures into smart buildings. These smart buildings are equipped with sophisticated energy monitoring systems that manage energy resources and store operational data, such as total energy demand and occupancy profiles. Recent research has focused on modeling, analyzing, and predicting building energy consumption to enhance energy efficiency and sustainability by analyzing load patterns in historical data. The energy consumption within a building also depends upon environmental conditions: weather changes, temperature, humidity, and wind. Most research papers [1,2] state that, when forecasting the electricity load of any type of building, whether a school building, offices, laboratories, etc., accounting for environmental conditions is most important for accurate load prediction. The energy behavior, in turn, is impacted by the building's air ventilation, thermal conditions, lighting, and heating.

Balancing supply and demand saves resources and reduces the load cost for a building, and it draws upon other significant parameters, including temperature and humidity. Deploying accurate Short Term Forecasting (STF) on these parameters can accomplish satisfactory results. Earlier studies [1,2,3] have discussed the critical importance of short-term forecasting, given its dependency on historical load data, weather conditions, time variables, etc. For this reason, short-term load forecasting (STLF) stands out in terms of accuracy. In this paper the emphasis is on predicting day-ahead temperature and humidity for a building, given their importance in load forecasting.

Predicting the temperature and humidity of a building can contribute in many areas that directly relate to our work and define its focus areas. The operation of HVAC systems (cooling/heating) [1,2] places a significant load on power lines, and this load differs with the building type and with the month and week in which the HVAC load peaks. Knowing the indoor temperature and humidity, however, can minimize the line load by ensuring the efficient use of energy. The predicted temperature and humidity values can serve as add-on parameters in electricity load forecasting, giving prior knowledge of the future energy consumption of the building.

The forecasting is based on data-driven models that observe the historical pattern and energy flow. The models are trained using Machine Learning Algorithms (MLA). We trained three different regression algorithms on the dataset: 1. K-Nearest Neighbor (KNN) Regression, a simple, instance-based learning algorithm; 2. Extreme Gradient Boosting (XGBoost), which builds on decision trees, works at high speed, and is effective for handling big datasets; 3. Random Forest (RF), an ensemble supervised algorithm that uses bagging over multiple decision trees. The time series approach is used for visualization, including the date-time parameter. These models were developed for predicting day-ahead temperature and humidity for a building. A detailed comparison is made among these proposed models by
analyzing the performance metric values.

II. METHODOLOGY

The models were trained using Machine Learning algorithms. Machine Learning is one of the most effective and popular methods used for forecasting. Its popularity in research is increasing rapidly due to its efficiency and speed in analyzing complex datasets and identifying intricate patterns. Time series forecasting is applied for the visualization of the data, as it works best on historical data.

A. Data Processing and Analysis

The dataset used for predicting the temperature and humidity consists of indoor environmental conditions (indoor CO2 concentration, indoor operative temperature, indoor relative humidity) collected at 15-minute intervals, which gives 96 predicted temperature and humidity values per day. It includes two important columns, indoor operative temperature and indoor relative humidity, along with various other columns holding the difference between current and past temperature and relative humidity. Conditions that cause errors or affect accuracy were addressed by handling missing values, creating additional time-based features such as hour of the day and day of the week to capture temporal patterns, and generating lag features for temperature and humidity to incorporate historical data into the model. After this analysis, the data was used to train the ML models.

Fig 1. Basic Building Blocks Of Models

B. Model Processing on dataset

1. KNN Regression is the simplest non-parametric algorithm used to train the model. KNN regression does not build an explicit model from the training data. Instead, it makes predictions by finding the K nearest (n_neighbors) data points to the given input values and then averaging their target values. The input features include indoor operative temperature and relative humidity, together with time-based features (hour, day, month, year, day of week, quarter). Target columns for the temperature and humidity of the next 15-minute interval are created by shifting the original columns. In KNN regression tasks, the target values are continuous. To find the closest neighbors, KNN regression relies on a distance metric. The default metric is the Euclidean distance, which is used in this model.

Euclidean distance:

$$ d(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2} \tag{1} $$

In Fig 2, the prediction is made by calculating the distance between the query point and the other data points.

Fig 2 Understanding KNN Parameters

2. XGBoost is an optimized gradient boosting library designed to be highly efficient, flexible, robust, and portable, created by Tianqi Chen. XGBoost provides parallel tree boosting that solves many problems in a fast and accurate way. XGBoost builds an ensemble of decision trees in a sequential manner, where each tree aims to correct the errors of its predecessor. It can also handle missing values internally. Due to these features, XGBoost has become one of the most popular algorithms. In XGBoost, decision trees are a type of model used for classification and regression, but this paper focuses on regression models. The data is split into subsets based on the input features, and branches are created for each possible outcome. Each tree starts as a single leaf holding the residual values and an associated similarity score. Fig 4 shows the flow of XGBoost Regression.

Fig 3. Parallel Decision trees

In Fig 3, the input for tree 1 includes indoor operative temperature, indoor relative humidity, hour, day, month, year, day of week, and quarter. Each tree is fitted to the residuals of the previous tree. Once this is done, the values of all trees are summed.
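The sequential residual fitting described for Fig 3 can be sketched in a few lines. This is a simplified illustration, not the XGBoost library and not the authors' code: it uses one-split trees (stumps) on a single feature, plain least squares without the regularization term, and a hypothetical learning rate and toy dataset.

```python
from statistics import mean

def fit_stump(xs, residuals):
    """Fit a one-split regression tree (stump) to the residuals by least squares."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keep at least one point on each side
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = mean(left), mean(right)
        error = (sum((r - left_value) ** 2 for r in left)
                 + sum((r - right_value) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_value, right_value)
    return best[1:]  # (threshold, left_value, right_value)

def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value

def fit_boosted(xs, ys, n_trees=20, learning_rate=0.3):
    """Build trees sequentially; each one is fitted to the residuals left so far."""
    base = mean(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return {"base": base, "learning_rate": learning_rate, "stumps": stumps}

def predict_boosted(model, x):
    """Final prediction: base value plus the scaled sum of every tree's output."""
    correction = sum(predict_stump(s, x) for s in model["stumps"])
    return model["base"] + model["learning_rate"] * correction

# Hypothetical data: hour of day -> indoor operative temperature (°C)
hours = [0, 3, 6, 9, 12, 15, 18, 21]
temps = [21.0, 20.8, 21.1, 22.0, 23.2, 23.0, 22.1, 21.4]

model = fit_boosted(hours, temps)
baseline_mse = mean((t - mean(temps)) ** 2 for t in temps)
boosted_mse = mean((t - predict_boosted(model, h)) ** 2 for h, t in zip(hours, temps))
print(f"baseline MSE {baseline_mse:.4f}, boosted MSE {boosted_mse:.4f}")
```

Because every stump is fitted to what the previous trees left unexplained, the training error of the summed prediction falls below that of the constant baseline.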
The similarity score for the residuals is calculated as:

$$ \text{Similarity Score} = \frac{\left( \sum_{i \in \text{node}} g_i \right)^2}{\sum_{i \in \text{node}} h_i + \lambda} \tag{2} $$

3. Random Forest Regression is a versatile machine learning algorithm based on ensemble supervised learning, which gives Random Forest its accurate predictions and robustness. In the Random Forest algorithm, the predicted result passes through multiple decision trees that operate as an ensemble. Refer to Fig 3 for the decision tree.

4. Ensemble learning is a collection of multiple ML models that uses the concept of majority count. It works on the concept of the “wisdom of the crowd”, meaning that the answer given by the majority of the crowd has the highest probability of being correct. In an ensemble, the crowd can be seen as multiple decision trees that together give the final result. The first step is to define the base models, which have to differ from each other. This requires one of two conditions: either use a different algorithm for each model, or use the same algorithm for each model but train on different subsets of the data, as shown in Fig 4.

Fig 4. Base model of decision tree

Random Forest is a bagging-based technique. Bagging means Bootstrap and Aggregation. Bootstrap sampling: each decision tree in a Random Forest is trained on a different subset of the data. These subsets are created through a process called bootstrap sampling, where samples are randomly selected with or without replacement. If replacement is applied, some data points might be repeated in the subset while others might be left out. Sampling with replacement is used in our model.

In bootstrap sampling, rows, columns, or a combination of both can be sampled. In our Random Forest Regression model we used row sampling with replacement, with 200 base models and a maximum decision tree depth of 30.

Fig 5 Bootstrap sampling of multiple decision trees

Aggregation is then applied to predict the final result. Once these steps are complete, the predicted values of temperature and humidity can be calculated.

Fig 6 Aggregating final results

C. Evaluation Metrics

Mean Squared Error (MSE):

$$ \text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \tag{3} $$

Accuracy (based on MAE):

$$ \text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \tag{4} $$

$$ \text{Accuracy} = 100 - \left( \frac{\text{MAE} \times 100}{\text{Mean of Actual Values}} \right) \tag{5} $$

R-squared (R²):

$$ R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{6} $$

where $\bar{y}$ is the mean of the actual values.

III. RESULTS AND DISCUSSION

In this section, we evaluate the performance of the proposed regression-based models on short-term forecasting to predict day-ahead temperature and humidity. The analysis is done using the performance metric values. When looking directly at the obtained graphs, the difference between the models is not quite clear. For more clarity, it is necessary to understand the performance metric values and observe the obtained points on the graph.

A. Comparative analysis

The KNN Regression model was implemented on the dataset first because of its simplicity. Compared with the other ML models that followed, however, the results from the KNN Regression model did not hold up as expected. The accuracy is 99.94% for temperature and 99.59% for humidity. The MSE values obtained for temperature and humidity are 0.00 and 0.04. The R² score is 81.06% for temperature and 95.72% for humidity. Refer to Table I. The model worked best for humidity. For temperature the results are less consistent: the MSE rounds to 0.00, yet the R² score is only 81.06%, which means that errors remain in the predicted temperature values and indicates scope for improvement. In the case of humidity the fit is better, but the MSE could be decreased further to achieve a higher R² score and reduce the errors in the predicted humidity values.
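Equations (3) to (6) can be written out directly, which also helps in reading results where the MSE rounds to 0.00 while R² stays well below 100%. The sketch below is illustrative only: the function names and the six temperature readings are hypothetical and are not taken from the paper's dataset.

```python
from statistics import mean

def mse(actual, predicted):
    """Equation (3): mean of the squared errors."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))

def mae(actual, predicted):
    """Equation (4): mean of the absolute errors."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def accuracy_percent(actual, predicted):
    """Equation (5): 100 - MAE * 100 / mean of the actual values."""
    return 100 - mae(actual, predicted) * 100 / mean(actual)

def r_squared(actual, predicted):
    """Equation (6): 1 - residual sum of squares / total sum of squares."""
    y_bar = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - y_bar) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical indoor temperatures (°C) that vary only slightly over the period
actual = [22.00, 22.02, 22.04, 22.06, 22.08, 22.10]
predicted = [22.01, 22.01, 22.05, 22.05, 22.09, 22.09]

print(f"MSE      {mse(actual, predicted):.2f}")
print(f"Accuracy {accuracy_percent(actual, predicted):.2f}%")
print(f"R2       {r_squared(actual, predicted):.2%}")
```

With these toy values every prediction is off by 0.01 °C, so the MSE rounds to 0.00 and the accuracy is about 99.95%, while R² is only about 91%. R² compares the errors with the spread of the actual values, so a signal that barely varies can give a modest R² even when the absolute errors are tiny.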


TABLE I. KNN Regression

              MSE    Accuracy   R2
Temperature   0.00   99.94%     81.06%
Humidity      0.04   99.59%     95.72%

Fig 7. Prediction vs Actual Temperature and humidity by KNN

The XGBoost Regression model was implemented after analyzing the results of the KNN regression model. As described earlier, XGBoost builds decision trees in which each tree corrects the errors of its predecessor. As expected, the results were quite satisfying: the R² score for temperature (84.15%) improved by 3.09 percentage points, while the MSE value of 0.00 remained the same. For humidity, both the MSE (0.01) and the R² score (98.54%) are satisfying. This indicates a good fit, as the model predictions are close to the actual values and explain a large portion of the variance. The accuracy improved for both temperature (99.95%) and humidity (99.71%). This model reduces the errors in the predicted values of temperature and humidity. Refer to Table II.

TABLE II. XGBoost Regression

              MSE    Accuracy   R2
Temperature   0.00   99.95%     84.15%
Humidity      0.01   99.71%     98.54%

Fig 8. Prediction vs Actual Temperature and humidity by XGBoost

Random Forest Regression was the last model, implemented after KNN and XGBoost. We expected more from this model because XGBoost had performed best up to this point. For humidity, the model outperformed KNN, with lower error in the predicted values: both the MSE (0.02) and the R² score (98.16%) improved. For temperature we expected less error in the predicted values, but the R² score (75.20%) stands out as poor when compared with the KNN and XGBoost models. Refer to Table III.

TABLE III. Random Forest Regression

              MSE    Accuracy   R2
Temperature   0.00   99.93%     75.20%
Humidity      0.02   99.72%     98.16%

Fig 9. Prediction vs Actual Temperature and humidity by RF

From the obtained graphs of the models it is quite difficult to see a drastic change at a glance. This is because the predicted values of temperature and humidity are precise to six decimal places, which leaves only small gaps between the values, so the difference between the models' graphs cannot be seen immediately. The accuracy values of the models also differ only slightly: by about 0.01 to 0.02 percentage points for temperature and by at most 0.13 percentage points for humidity (Tables I to III). Refer to Figs 7, 8, and 9.

After analyzing the performance metrics of the models, it is clear that XGBoost Regression consistently showed the highest R² scores for both temperature and humidity predictions, indicating that it explains the most variance in the data. It also shows the best or near-best accuracy and the lowest MSE for humidity prediction. This paper's main focus was to present highly accurate predicted values for temperature and humidity through a comparative analysis of regression models for short-term forecasting.

B. Forecasting results

Fig 10. KNN predicted temperature and humidity time series

Fig 11 XGBoost predicted temperature and humidity time series

Fig 12 Random Forest predicted temperature and humidity time series

Figures 10, 11, and 12 show the final outcomes of the trained models employed for forecasting temperature and humidity. This study uses the time series to analyze the fluctuations of humidity and temperature by plotting the 96 predicted values for January 31st, commencing at midnight (00:00) and progressing at 3-hour intervals. The 96 predicted values of temperature and humidity follow from the 15-minute sampling interval of the dataset (24 hours × 4 samples per hour). The graphs show noticeable steps in the trend, indicating sudden increases and decreases over the course of the day.
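The day-ahead horizon and the calendar features named in Section II can be generated with the standard library alone. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the year 2024 is a placeholder because the paper gives only the date, January 31st.

```python
from datetime import datetime, timedelta

STEP = timedelta(minutes=15)  # sampling interval stated in the paper

def day_ahead_timestamps(day_start):
    """Return every 15-minute timestamp in the 24 hours starting at day_start."""
    steps_per_day = int(timedelta(days=1) / STEP)  # 24 h / 15 min = 96
    return [day_start + i * STEP for i in range(steps_per_day)]

def time_features(ts):
    """Calendar features listed in Section II-B for one timestamp."""
    return {
        "hour": ts.hour,
        "day": ts.day,
        "month": ts.month,
        "year": ts.year,
        "day_of_week": ts.weekday(),          # Monday = 0
        "quarter": (ts.month - 1) // 3 + 1,
    }

# Placeholder year: the paper does not state which year the data covers.
timestamps = day_ahead_timestamps(datetime(2024, 1, 31))
features = [time_features(ts) for ts in timestamps]

# Labels every 3 hours, matching the interval used on the plots
three_hour_ticks = [ts.strftime("%H:%M") for ts in timestamps
                    if ts.hour % 3 == 0 and ts.minute == 0]

print(len(timestamps), timestamps[0], timestamps[-1])
print(three_hour_ticks)
```

Each feature row can then be passed to a trained model to obtain one of the 96 day-ahead predictions. The day runs from 00:00 to 23:45, which gives eight 3-hour labels.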
With XGBoost, not much variation in temperature is observed throughout the day. It remained quite stable, decreasing slightly in the evening hours. Humidity showed a more dynamic pattern than temperature: as Fig 11 shows, it starts relatively stable and then increases steadily throughout the day. The Random Forest and KNN models, by contrast, exhibited more fluctuations.
The KNN model indicated a distinct trend, with temperature
values rising during the morning, peaking around midday and
then decreasing in the evening. For humidity, minor
fluctuations are observed throughout the day. Also, there are
periods where humidity levels off before continuing to rise.
As for the Random Forest model, the predicted temperature is relatively stable, with small fluctuations throughout the day. There is a slight increase around midday and a noticeable drop towards the evening, so the temperature remains fairly constant for most of the hours. The predicted humidity exhibits more variability than temperature. It remains stable initially, followed by a significant increase during the latter part of the day, with a few fluctuations. The overall trend is increasing, with noticeable step-like changes, indicating that humidity levels rise steadily throughout the day.
These time series visualizations help in understanding the
behavior of temperature and humidity predictions over the
course of a day, which is crucial for short-term load
forecasting in buildings.
IV. CONCLUSION

This paper emphasizes short-term forecasting of day-ahead temperature and humidity through a comparative analysis of regression models. Of all the models, XGBoost stands out as the best, with the least errors. This study addresses one of the most significant parameters contributing to the load of buildings. Predicting the day-ahead values of temperature and humidity for a building can help in understanding its energy demand and supply. It can also assist in controlling HVAC systems, since the predicted future values of temperature and humidity can be used to predict the future load of the building. In conclusion, this study helps predict environmental factors that directly affect the energy consumption of the building.

ACKNOWLEDGEMENT
The authors would like to express their gratitude to Dr.
Anuradha Tomar for suggesting this important research work
and also guiding us throughout the process of this study.

REFERENCES
[1] Xinda Ke, Anjie Jiang, Ning Lu, Electrical Engineering Department, North Carolina State University, Raleigh, USA.
[2] School of Science and Technology, International Hellenic University, 57001 Thessaloniki, Greece.
[3] Phanumat Saatwong, Surapong Suwankawin, Department of Electrical Engineering, Chulalongkorn University, Bangkok, Thailand.
[4] B. Yildiz, J.I. Bilbao, A.B. Sproul, School of Photovoltaics and Renewable Energy Engineering, University of New South Wales, Sydney, NSW 2052, Australia.
