Applied Sciences
Article
Predicting Commercial Building Energy Consumption Using a
Multivariate Multilayered Long-Short Term Memory
Time-Series Model
Tan Ngoc Dinh *, Gokul Sidarth Thirunavukkarasu, Mehdi Seyedmahmoudian *, Saad Mekhilef
and Alex Stojcevski
Abstract: The global demand for energy has been steadily increasing due to population growth,
urbanization, and industrialization. Numerous researchers worldwide are striving to create precise
forecasting models for predicting energy consumption to manage supply and demand effectively. In
this research, a time-series forecasting model based on multivariate multilayered long short-term
memory (LSTM) is proposed for forecasting energy consumption and tested using data obtained
from commercial buildings in Melbourne, Australia: the Advanced Technologies Center, Advanced
Manufacturing and Design Center, and Knox Innovation, Opportunity, and Sustainability Center
buildings. This research specifically identifies the best forecasting method for subtropical
conditions and evaluates its performance by comparing it with the most commonly used methods at
present, including LSTM, bidirectional LSTM, and linear regression. The proposed multivariate,
multilayered LSTM model was assessed by comparing mean absolute error (MAE), root-mean-square
error (RMSE), and mean absolute percentage error (MAPE) values with and without labeled time.
Results indicate that the proposed model exhibits optimal performance with improved precision and
accuracy. Specifically, the proposed LSTM model achieved a decrease in MAE of 30%, RMSE of 25%,
and MAPE of 20% compared with the LSTM method. Moreover, it outperformed the bidirectional
LSTM method with a reduction in MAE of 10%, RMSE of 20%, and MAPE of 18%. Furthermore, the
proposed model surpassed linear regression with a decrease in MAE by 2%, RMSE by 7%, and MAPE
by 10%. These findings highlight the significant performance increase achieved by the proposed
multivariate multilayered LSTM model in energy consumption forecasting.
Keywords: energy consumption; time-series forecasting; long short-term memory; machine learning
Citation: Dinh, T.N.; Thirunavukkarasu, G.S.; Seyedmahmoudian, M.; Mekhilef, S.; Stojcevski, A. Predicting Commercial Building Energy Consumption Using a Multivariate Multilayered Long-Short Term Memory Time-Series Model. Appl. Sci. 2023, 13, 7775. https://doi.org/10.3390/app13137775
is needed to address concerns regarding food security, feedstock selection, and their impact
on climate and human health [3].
According to the International Energy Agency (IEA) [4], global energy consumption
is expected to continue to increase in the coming decades, driven by population growth,
urbanization, and industrialization in developing countries. However, most of the current
research has focused on forecasting for countries or regions [5,6]. There are only a limited
number of studies that specifically focus on forecasting for individual buildings. Energy
consumption in buildings encompasses the precise measurement of energy utilized for
specific purposes such as heating, cooling, lighting, and other essential functions within
residential, commercial, and institutional structures. In China and India, buildings account
for 37% [7] and 35% [8] of total energy consumption, respectively, making them an important area for
energy efficiency and sustainability efforts [9].
In this study, we propose a model for forecasting energy consumption in several
commercial buildings: the ATC and AMDC buildings on the Hawthorn Campus and the
KIOSC building on the Wantirna Campus. In the study, the proposed model,
together with its data preprocessing method, demonstrated superior performance
compared with other popular models, both when sufficient training data are available
and when training data are scarce. The results indicate that
the proposed method and model can be used to accurately predict energy consumption
in commercial buildings, which is crucial for energy management and conservation. In
addition, the proposed method can be easily applied to other commercial buildings with
similar energy consumption patterns, providing a practical solution for energy management
in the commercial building sector.
The rest of this study is organized as follows. Section 2 presents background knowledge
by reviewing the existing research on forecasting models and their adoption for energy
consumption forecasting. Section 3 introduces the available data and the configuration of
the experiment. Section 4 provides the details of the proposed model. Section 5 describes
some popular bench-marking models. Section 6 presents the experiment to evaluate the
efficiency of the models on different datasets. Finally, the conclusion of this study is
given in Section 7.
2. Background
2.1. Forecasting Models
A forecasting model is a mathematical algorithm or statistical tool used to predict
future trends, values, or events based on historical data and patterns. In this work, a
forecasting model $M$ is built that analyzes the historical values up to the current time $t$
and returns the predicted future value at time $t+1$ (denoted as $\hat{y}_{t+1}$). The objective
of the forecasting model is to minimize the discrepancy between the estimated value
$\hat{y}_{t+1}$ and the actual value $y_{t+1}$. When the data points are indexed in time order,
a time-series forecasting model is typically used for this purpose.
A time series forecasting model is a predictive algorithm that utilizes historical time-
series data to anticipate future trends or patterns in the data over time. Two techniques
are available for building the model $M$ to achieve this objective, i.e., (i) univariate and
(ii) multivariate time-series (TS) [10–12]. In univariate TS, only a 1D sequence of energy
consumption values $\mathbf{y}_t = \{y_{t-(k-1)}, y_{t-(k-2)}, \ldots, y_{t-1}, y_t\}$ is utilized
to produce the estimated value $\hat{y}_{t+1}$, where $k$ is the number of time steps looking
back from the current time $t$ [13,14]. By contrast, for multivariate TS, one or more other
historical features can be employed in addition to the energy consumption for training
model $M$ [15]. These features can be time fields or other specific sources. Therefore, the
input for multivariate TS is a multi-dimensional sequence
$X_t = \{x_{t-(k-1)}, x_{t-(k-2)}, \ldots, x_{t-1}, x_t\}$, where each $x_i \in \mathbb{R}^n$ is a
vector of dimension $n$. In forecasting new values, the model can be enhanced if related
available features are taken into account. The additional information can help the model
capture the dependencies or correlations between the features and the target variable. As a
result, the model can better understand the context, mitigate the impact of missing values,
and make more precise predictions. For these reasons, multivariate TS has frequently been
employed for building forecasting models in recent work [16].
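The difference between the univariate and multivariate inputs described above can be illustrated with a minimal window-slicing sketch. The function name `slice_windows`, the toy values, and the hour-of-day feature are illustrative assumptions and are not taken from the original implementation.

```python
from typing import List, Sequence, Tuple


def slice_windows(
    rows: Sequence[Sequence[float]], k: int, target_col: int = 0
) -> Tuple[List[List[List[float]]], List[float]]:
    """Build (X, y) pairs: X[j] holds k consecutive rows, y[j] is the
    target column of the row that immediately follows that window."""
    X, y = [], []
    for end in range(k, len(rows)):
        X.append([list(r) for r in rows[end - k:end]])
        y.append(rows[end][target_col])
    return X, y


# Univariate TS: each row holds only the energy consumption (n = 1).
energy = [[1.0], [2.0], [3.0], [4.0], [5.0]]
X_uni, y_uni = slice_windows(energy, k=3)

# Multivariate TS: energy plus a hypothetical hour-of-day feature (n = 2).
energy_hour = [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0], [4.0, 3.0], [5.0, 4.0]]
X_multi, y_multi = slice_windows(energy_hour, k=3)
```

In both cases the target is the energy value at time t + 1; only the width n of each input vector changes.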
Table 1.

| No. | Forecasting Model | Year | Country | Ref. | Forecast Horizon | Accuracy (MAPE, Normalised-RMSE, MAE; values as reported) |
|---|---|---|---|---|---|---|
| 1 | ANN model with external variables (NARX) | 2019 | Korea | [18] | hour ahead | 1.69%, 85.44 |
| 2 | Long Short-term Memory Networks with attention (LSTM) | 2020 | USA | [19] | hour ahead | 5.96%, 7.21 |
| 3 | AdaBoost.R2 | 2021 | Portugal | [20] | hour ahead | 5.34% |
| 4 | Support Vector Machine (SVM) | 2022 | Ireland | [21] | hour ahead | 5.3%, 3.82, 11.94 kW |
| 5 | Seq2seq RNN | 2020 | USA | [22] | hour ahead | 3.74 kW |
| 6 | Bayesian regularized (BR) (12 inputs) | 2019 | Canada | [23] | hour ahead | 1.83%, 105.03 kW |
| 6 | Levenberg Marquardt (LM) (12 inputs) | 2019 | Canada | [23] | hour ahead | 1.82%, 104.21 kW |
| 7 | Hybrid convolutional neural network (CNN) with an LSTM autoencoder (LSTM-AE) | 2020 | Korea | [24] | hour ahead | 0.76%, 0.47, 0.31 |
| 8 | Hybrid method of Random Forest (RF) and Long Short-Term Memory (LSTM) based on Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) | 2022 | USA | [25] | hour ahead | 5.33%, 0.57, 0.43 |
| 9 | Seasonal autoregressive integrated moving average (SARIMAX) | 2020 | Korea | [26] | day ahead | 27.15%, 557.6 kW |
| 10 | Gated Recurrent Unit (GRU) | 2022 | Spain | [27] | day ahead | 7.86%, 156.11 |
| 11 | Hybrid Neural Fuzzy Inference System (HyFIS) | 2019 | Portugal | [28] | day ahead | 8.71% |
| 11 | Wang and Mendel’s Fuzzy Rule Learning Method (WM) | 2019 | Portugal | [28] | day ahead | 8.58% |
| 11 | A genetic fuzzy system for fuzzy rule learning based on the MOGUL methodology (GFS.FR.MOGUL) | 2019 | Portugal | [28] | day ahead | 9.87% |
| 12 | XGBoost | 2022 | Spain | [29] | day ahead | 8.83 |
[Figure: diagram with the labels “Energy Consumption Forecasting”, “Conventional Models”, and “Artificial Intelligence (AI) Models”.]
ity, and increased occupant comfort over time. The two datasets DatasetS1 and DatasetS2
contain the energy consumption from 2017 to 2019, and the dataset DatasetS3 contains the
energy consumption from 2018 to 2019. The predicted value is the difference in energy use
between one time step and the immediately following one, or the accumulated energy. The
model input is the previous 96 data points, recorded at 15 min intervals (i.e., 24 h of
history), which are used to predict the next value. Figure 2 indicates the location of the
buildings on the Hawthorn Campus and the Wantirna Campus in the context of Melbourne, and
Figures 3 and 4 indicate the electricity accumulation every 15 min and every hour for the
Hawthorn campus (ATC building).
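As a rough illustration of the data description above, the following sketch derives per-interval consumption from cumulative meter readings and aggregates 15 min intervals into hourly totals. The function names and the readings are hypothetical; the actual preprocessing pipeline used by the authors is not shown in this section.

```python
from typing import List

WINDOW = 96            # number of historical 15 min points used as input
MINUTES_PER_STEP = 15


def interval_consumption(cumulative: List[float]) -> List[float]:
    """Difference between each reading and the one immediately before it."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]


def to_hourly(per_15min: List[float]) -> List[float]:
    """Sum every complete group of four 15 min intervals into one hour."""
    return [sum(per_15min[i:i + 4]) for i in range(0, len(per_15min) - 3, 4)]


# Hypothetical cumulative meter readings taken every 15 min.
readings = [100.0, 101.5, 103.0, 104.0, 106.0, 107.0, 108.0, 110.0, 111.0]
steps = interval_consumption(readings)
hourly = to_hourly(steps)
history_hours = WINDOW * MINUTES_PER_STEP / 60
```

With 96 points at 15 min resolution, each input window covers exactly one day of history.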
Figure 2. Location of Hawthorn Campus and Wantirna Campus in Metropolitan Melbourne, Victoria, Australia.
layer contains two input types, as described in Section 4.1. In the following, the first LSTM
layer wraps eight LSTM units, and the second wraps four units. The last dense layer has
one unit for predicting energy consumption.
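To give a sense of how small the described network is, the sketch below counts trainable parameters for an 8-unit LSTM layer, a 4-unit LSTM layer, and a 1-unit dense layer. It assumes the standard LSTM cell with four gates and one bias vector per gate; the number of input features `n_features` is an assumption, since it depends on the inputs of Section 4.1.

```python
def lstm_params(input_dim: int, units: int) -> int:
    """Standard LSTM: 4 gates, each with input weights, recurrent weights, bias."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim: int, units: int) -> int:
    """Fully connected layer: one weight per input-output pair plus biases."""
    return input_dim * units + units


def m_lstm_params(n_features: int) -> int:
    """8-unit LSTM -> 4-unit LSTM -> 1-unit dense layer, as described above."""
    first = lstm_params(n_features, 8)
    second = lstm_params(8, 4)     # the second layer reads the 8 hidden outputs
    head = dense_params(4, 1)
    return first + second + head


# n_features is hypothetical: 1 for energy only, 2 with one extra time feature.
total_univariate = m_lstm_params(1)
total_two_features = m_lstm_params(2)
```

Under these assumptions the model has only a few hundred parameters, which is consistent with training on one or two years of building data.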
Configuration of bench-marking models. As mentioned earlier, four competitive
models are used for comparison: the LSTM, Bi-LSTM, LR, and SVM models. The LSTM model
consists of a single layer with one unit, followed by a Dense layer for prediction. The
Bi-LSTM model consists of a single bidirectional LSTM layer with one unit, followed by a
Dense layer for prediction, as in the LSTM model. The LR model trains a single dense layer.
Training Configuration. The M-LSTM, LSTM, Bi-LSTM, LR, and SVM models are all
trained using the same training set and evaluated on the same test set. In DatasetS1 and
DatasetS2, the models are trained on the data from 2017 and 2018. In DatasetS3, the models
are trained only on the data from 2018 to demonstrate the ability of the models when
training data are scarce.
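The year-based training and test split described above can be sketched as follows. The record layout (timestamp, value) and the function name `split_by_year` are assumptions made for illustration; the use of 2019 as the evaluation year follows the experiment description later in the paper.

```python
from datetime import datetime
from typing import List, Set, Tuple

Record = Tuple[datetime, float]


def split_by_year(
    records: List[Record], train_years: Set[int], test_year: int
) -> Tuple[List[Record], List[Record]]:
    """Keep the time order and assign each record to train or test by year."""
    train = [r for r in records if r[0].year in train_years]
    test = [r for r in records if r[0].year == test_year]
    return train, test


# Hypothetical readings, one per year, standing in for a full dataset.
records = [
    (datetime(2017, 6, 1, 0, 0), 10.0),
    (datetime(2018, 6, 1, 0, 0), 11.0),
    (datetime(2019, 6, 1, 0, 0), 12.0),
]

# DatasetS1 and DatasetS2: train on 2017 and 2018, evaluate on 2019.
train_s1, test_s1 = split_by_year(records, {2017, 2018}, 2019)
# DatasetS3: train on 2018 only, evaluate on 2019.
train_s3, test_s3 = split_by_year(records, {2018}, 2019)
```

Splitting by calendar year rather than at random keeps future observations out of the training set.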
4. Methodology
In this section, we propose the forecasting model (Multivariate Multilayered LSTM),
which is referred to as M-LSTM. The overview of the proposed method is illustrated in
Figure 5. There are three phases, i.e., data preprocessing, model training, and evaluation.
[Figure 5: flowchart of the proposed method. Recoverable labels: Start; Concatenation; Window Slicing; Input Layer (x1 ... xk); Hidden Layer (a number of LSTM layers); Output Layer (Dense layer); training mechanism with loss optimization using the Adam optimizer; Evaluation; Stop.]
scalar function, called the update function (denoted as ug) has a tanh activation function as
described in Equation (5).
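[Figure: LSTM cell diagram. Recoverable labels: inputs $h_{i-1}^{m}$, $c_{i-1}^{m}$, and $h_{i}^{m-1}$; gates σ, σ, tanh, σ; outputs $h_{i}^{m}$ and $c_{i}^{m}$.]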
5. Bench-Marking Models
In this study, we compare the proposed model with four well-known models, i.e.,
linear regression (LR), long short-term memory (LSTM), bidirectional long short-term
memory (Bi-LSTM), and Support Vector Machine (SVM).
(2001) looked into various univariate modeling approaches to project Lebanon’s monthly
electric energy usage [41]. Using statistical models, that research produced strong results.
The LR model works by fitting a line to a set of data points with the goal of minimizing
the sum of the squared differences between the predicted and actual values of the depen-
dent variable. The slope of the line represents the relationship between the dependent and
independent variables, while the intercept represents the value of the dependent variable
when the independent variable is equal to zero. The LR model describes the linear rela-
tionship between the previous values yt and the estimated future value ŷt+1 , formulated
as follows:
$$\hat{y}_{t+1} = \sum_{i=t-(k-1)}^{t} w_i \cdot y_i \tag{8}$$
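Equation (8) can be read as a weighted sum over the last k observed values. The sketch below evaluates that sum and the squared-error objective the LR model minimizes; the weights and data are hypothetical, and no fitting procedure is shown.

```python
from typing import List, Sequence


def lr_predict(window: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the k most recent values, as in Equation (8)."""
    if len(window) != len(weights):
        raise ValueError("window and weights must have the same length k")
    return sum(w * y for w, y in zip(weights, window))


def sum_squared_error(predicted: List[float], actual: List[float]) -> float:
    """Objective minimized when fitting the LR weights."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))


# Hypothetical window of k = 3 past values and hand-picked weights.
window = [2.0, 4.0, 6.0]
weights = [0.0, 0.5, 0.5]
y_hat = lr_predict(window, weights)
error = sum_squared_error([y_hat], [6.0])
```

Here the prediction is the average of the two most recent values, because the oldest value has weight zero.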
5.2. LSTM
The LSTM technique is a type of Recurrent Neural Network (RNN). In contrast to standard
neural networks, RNNs [42] are capable of processing data sequences, i.e., data that must
be read together in a precise order to have meaning. This ability is made possible by
the RNNs’ architectural design, which enables them to receive, at each time instant, the
input specific to that instant in addition to the activation value from the previous
instant. Because information from earlier instants is carried forward in this way, the
network has a certain amount of “memory”. In particular, an LSTM possesses a memory cell,
which maintains the state throughout time [43]. Figure 7 illustrates an overview of the
simple LSTM Model.
[Figure 7: simple LSTM model. Recoverable labels: inputs x1, x2, x3, ..., xk-1, xk; Dense layer.]
As noted in Section 4, the LSTM model [44] can remove information from, or add information
to, the cell state, thereby deciding what information passes through the network.
Different from the M-LSTM model, the LSTM model has just one LSTM layer, with the input
sequence $x$. Therefore, the hidden state ($h_i$) and cell state ($c_i$) for the $i$th LSTM
cell are calculated as in Equation (9).
$$(h_i, c_i) = C(h_{i-1}, c_{i-1}, x_i) \tag{9}$$
In the experiment, we compare the performance of the proposed model with the
univariate and multivariate LSTM models. The univariate LSTM takes the first input (i)
described in Section 4.1, and the multivariate LSTM takes both those inputs.
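The recurrence in Equation (9) can be sketched with a scalar LSTM cell. The gate equations below are the standard LSTM formulation (forget, input, candidate, and output gates); the parameter layout, the gate names, and the zero initial state are assumptions for illustration and may differ from the authors' notation.

```python
import math
from typing import Dict, Sequence, Tuple

# Each gate has (input weight, recurrent weight, bias); the layout is hypothetical.
Params = Dict[str, Tuple[float, float, float]]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def pre_activation(p: Params, gate: str, x: float, h_prev: float) -> float:
    wx, wh, b = p[gate]
    return wx * x + wh * h_prev + b


def lstm_cell(h_prev: float, c_prev: float, x: float, p: Params) -> Tuple[float, float]:
    """One application of C in Equation (9): returns (h_i, c_i)."""
    f = sigmoid(pre_activation(p, "forget", x, h_prev))       # what to keep
    i = sigmoid(pre_activation(p, "input", x, h_prev))        # what to write
    g = math.tanh(pre_activation(p, "candidate", x, h_prev))  # new content
    o = sigmoid(pre_activation(p, "output", x, h_prev))       # what to expose
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


def run_lstm(xs: Sequence[float], p: Params) -> Tuple[float, float]:
    """Apply the cell along the sequence, starting from a zero state."""
    h, c = 0.0, 0.0
    for x in xs:
        h, c = lstm_cell(h, c, x, p)
    return h, c


zero_params: Params = {g: (0.0, 0.0, 0.0) for g in ("forget", "input", "candidate", "output")}
h_one, c_one = lstm_cell(0.0, 1.0, 3.0, zero_params)
```

With all parameters set to zero, every sigmoid gate equals 0.5 and the candidate is 0, so the cell state is simply halved at each step.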
able to record the evolution of energy that would power both its history and its future [46].
This bidirectional processing is achieved by duplicating the hidden layers of the LSTM,
where one set of layers processes the input sequence in the forward direction and another
set of layers processes the input sequence in the reverse direction. As illustrated in
Figure 8, the hidden state ($h_i^f$) and the cell state ($c_i^f$) in the $i$th forward LSTM
cell are calculated similarly to Equation (9). In contrast, each LSTM cell in the backward
LSTM takes the following hidden state ($h_{i+1}^b$), the following cell state ($c_{i+1}^b$),
and $x_i$ as input. Therefore, the hidden state ($h_i^b$) and cell state ($c_i^b$) of the
$i$th backward LSTM cell are calculated as in Equation (10).
$$(h_i^b, c_i^b) = C(h_{i+1}^b, c_{i+1}^b, x_i) \tag{10}$$
[Figure 8: Bi-LSTM model. Recoverable labels: inputs x1, x2, x3, ..., xk-1, xk; σ combination at each step; Dense layer.]
After the calculations of both forward and backward LSTM cells, the hidden states of
the two directions could be concatenated or combined in some way to obtain the output.
The common combination is the sigmoid function, as noted in Figure 8. The output is fed
into the Dense layer to obtain the final prediction. Similar to the LSTM model, we also
compare the proposed model with univariate and multivariate Bi-LSTM models.
in more localized and complex decision boundaries. Furthermore, the decision function in
SVM with RBF kernel can be represented as Equation (12).
$$f(x) = b + \sum_{i=1}^{D} \alpha_i \times y_i \times K(x, x_i) \tag{12}$$
where $x$ is the input data point, $b$ is the bias term, $\alpha_i$ is the Lagrange multiplier
associated with the $i$th support vector, $y_i$ is the corresponding class label, $K(x, x_i)$
is the RBF kernel function, and the summation is performed over all support vectors.
The SVM with RBF kernel formulation aims to find the optimal hyperplane that
maximizes the margin between the classes while allowing some misclassifications. The RBF
kernel enables the SVM to capture complex, nonlinear patterns in the data by mapping the
data to a higher-dimensional feature space. The model is trained by solving the quadratic
programming problem to find the Lagrange multipliers ( αi ) and bias term b that define
the decision function.
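The decision function in Equation (12) can be evaluated directly once the support vectors and multipliers are known. The sketch below assumes the common RBF form K(x, x_i) = exp(-gamma * ||x - x_i||^2); the value of `gamma`, the support vectors, and the multipliers are hypothetical, and the quadratic programming step that finds them is not shown.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], xi: Sequence[float], gamma: float) -> float:
    """Assumed RBF form: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-gamma * sq_dist)


def decision_function(
    x: Sequence[float],
    support_vectors: Sequence[Sequence[float]],
    alphas: Sequence[float],
    labels: Sequence[float],
    b: float,
    gamma: float,
) -> float:
    """Equation (12): bias plus the kernel-weighted sum over support vectors."""
    return b + sum(
        a * y * rbf_kernel(x, sv, gamma)
        for a, y, sv in zip(alphas, labels, support_vectors)
    )


# Two hypothetical support vectors placed symmetrically around the origin.
svs = [(1.0, 0.0), (-1.0, 0.0)]
alphas = [0.7, 0.7]
labels = [1.0, -1.0]
at_origin = decision_function((0.0, 0.0), svs, alphas, labels, b=0.2, gamma=0.5)
at_first_sv = decision_function((1.0, 0.0), svs[:1], alphas[:1], labels[:1], b=0.2, gamma=0.5)
```

At the origin the two symmetric contributions cancel and only the bias remains; at a support vector the kernel equals 1, so that vector contributes its full weight.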
6. Experiment
6.1. Metric
To better evaluate the performance, a model is tested by making a set of predictions
$\hat{y} = \{\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_D\}$ and then comparing it with a set of
known actual values $Y = \{y_1, y_2, \ldots, y_D\}$, where $D$ is the size of the test set.
Three common metrics are used to compare the overall distance of these two sets, i.e.,
mean absolute percentage error (MAPE), normalized root mean squared error (NRMSE), and
R-squared score (R2 score).
MAPE. As shown in Equation (13), MAPE is calculated by taking the absolute difference
between the predicted and actual values, dividing it by the actual value, and then
taking the average of these values over the entire dataset. This calculation results in a
single number that represents the average percentage difference between the predicted and
actual values. The smaller the MAPE value, the better the model’s performance.
$$\mathrm{MAPE} = \frac{100\%}{D} \sum_{i=1}^{D} \left| \frac{\hat{y}_i - y_i}{y_i} \right| \tag{13}$$
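A minimal sketch of the MAPE calculation in Equation (13) is given below. The function name and sample values are illustrative; the guard against zero actual values is an added assumption, since the ratio is undefined in that case.

```python
from typing import Sequence


def mape(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean absolute percentage error in percent, as in Equation (13)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(y == 0 for y in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs((p - y) / y) for p, y in zip(predicted, actual)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical values: each prediction is off by 10% of the actual value.
mape_value = mape([110.0, 180.0], [100.0, 200.0])
```

Dividing the result by 100 gives the 0 to 1 scale used when MAPE is reported alongside the other metrics.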
NRMSE. Normalized Root Mean Squared Error (NRMSE) is a metric used to evaluate
the accuracy of a prediction model. It measures the normalized average magnitude of
the residuals or errors between the predicted values and the actual values, as shown in
Equation (14).
$$\mathrm{NRMSE} = \frac{\sqrt{\frac{1}{D} \sum_{i=1}^{D} (\hat{y}_i - y_i)^2}}{y_{\max} - y_{\min}} \tag{14}$$
where $\hat{y}_i$ represents the predicted values and $y_i$ represents the actual values.
The term $(\hat{y}_i - y_i)^2$ calculates the squared residuals or errors between the
predicted and actual values, and $y_{\max}$ and $y_{\min}$ represent the maximum and
minimum of the actual values, respectively. The smaller the NRMSE value, the
better the model’s performance.
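The NRMSE definition in Equation (14) can be sketched as follows, normalizing the root mean squared error by the range of the actual values. The function name and data are illustrative, and the check for a zero range is an added assumption.

```python
import math
from typing import Sequence


def nrmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """RMSE divided by (max - min) of the actual values, as in Equation (14)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    value_range = max(actual) - min(actual)
    if value_range == 0:
        raise ValueError("NRMSE is undefined when all actual values are equal")
    mse = sum((p - y) ** 2 for p, y in zip(predicted, actual)) / len(actual)
    return math.sqrt(mse) / value_range


# Hypothetical values: every prediction is off by 1 on a range of 10.
nrmse_value = nrmse([1.0, 9.0], [0.0, 10.0])
```

Normalizing by the range makes the error comparable across buildings with different consumption levels.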
R2 score. The R2 score, also known as the coefficient of determination, is a statistical
measure that indicates the proportion of the variance in the dependent variable that is
predictable from the independent variables in a regression model. The R2 score is typically
used to evaluate the fitness of a regression model, as formulated in Equation (15).
$$R^2 = 1 - \frac{\sum_{i=1}^{D} (\hat{y}_i - y_i)^2}{\sum_{i=1}^{D} (y_i - y^*)^2} \tag{15}$$
where $y^*$ is the mean of the actual values. In essence, the R2 score is a measure of how
well the regression model fits the data and provides an assessment of its predictive
performance. A higher R2 score indicates a better fit and stronger explanatory power of
the model.
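Equation (15) compares the model's squared error with that of a baseline that always predicts the mean. A minimal sketch follows; the function name and sample values are illustrative.

```python
from typing import Sequence


def r2_score(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Coefficient of determination, as in Equation (15)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((p - y) ** 2 for p, y in zip(predicted, actual))
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


actual = [1.0, 2.0, 3.0]
perfect = r2_score([1.0, 2.0, 3.0], actual)     # predictions equal the actual values
mean_only = r2_score([2.0, 2.0, 2.0], actual)   # always predicting the mean
```

A perfect fit gives a score of 1, predicting the mean gives 0, and a model worse than the mean baseline gives a negative score.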
[Figure 9 panels (a), (b), and (c): grouped bars for datasetS1, datasetS2, and datasetS3; legend: with labeled time, without labeled time.]
Figure 9. Comparison of M-LSTM trained with labelled time and without labelled time. (a) MAPE
error in the scale of (0, 1), (b) NRMSE error, (c) R2 score. Note that a higher R2 score indicates
better model performance.
For the first result (i), the model is sufficiently trained with data from 2017 and 2018 and
evaluated on the 2019 data from DatasetS1 and DatasetS2. Figure 9 shows that the model
achieves better performance in all three metrics with the labeled time field. The results
are similar under the same settings for the other models. The details are provided in
Table 2 and the line plot in Figure 10. Therefore, the models can learn and extract more
valuable features if they are trained with the appropriate data preprocessing strategy.
For the second result (ii), the model is trained only with data from 2018 and evaluated
on the 2019 data from DatasetS3. Figure 10 shows that the model’s predictions closely match
the actual values, with a better fit line than the other models in Figures 9 and 10. These
findings suggest that the proposed preprocessing method is effective, particularly in
situations with a limited amount of training data available for model training.
Figure 10. Comparison of M-LSTM with labelled time (M-LSTMt ) with M-LSTM without labelled time,
and other models in case of lack of training data (DatasetS3 in 2019). The time step is 7 days.
Table 2. Comparison of M-LSTM with competitive models with and without labelled time. MAPE_t,
NRMSE_t, and R2_t score denote the metrics for models trained with labelled time. MAPE, NRMSE,
and R2 score denote the metrics for models trained without labelled time. MAPE_t and MAPE are
rescaled to the range from 0 to 1. Better values are marked in bold.
of the models. The M-LSTM model using time label information performs the best in general.
The labeled time field provides useful information for predicting energy consumption in
peak and non-peak periods. These findings suggest that considering time information can
help in accurately predicting the target variable in the studied datasets.
[Bar chart: y-axis MAPE (0.00 to 1.00); x-axis Model: M-LSTM_t, M-LSTM, LSTM_t, LSTM, Bi-LSTM_t, Bi-LSTM, Linear Regression_t, Linear Regression, SVM_t, SVM.]
Figure 11. Bar chart comparing M-LSTM with labelled time (M-LSTM_t) with M-LSTM without
labelled time, and other models with and without labelled time, in case of lack of training data
(DatasetS3 in 2019).
7. Conclusions
In conclusion, this work presents a method for pre-processing data and a model for
accurately predicting energy consumption in commercial buildings, specifically focusing
on buildings on the Hawthorn and Wantirna campuses. The proposed pre-processing
method effectively improves the accuracy of energy consumption prediction, even when
training data are limited. The results demonstrate the applicability of the proposed method
and model for accurately predicting energy consumption in various commercial buildings.
The proposed model, denoted as M-LSTM, achieved the lowest MAPE values of 0.159,
0.139, and 0.072 for DatasetS1, DatasetS2, and DatasetS3, respectively. This achievement is
crucial for effective energy management and conservation in commercial buildings. The
practicality of this approach extends to other commercial buildings with similar energy
consumption patterns, making it a viable solution for energy management in the commer-
cial building sector. Visualizations were also provided to aid in understanding the data
patterns and trends in the model predictions. Additionally, further research can explore
the effectiveness of the proposed pre-processing method and models in predicting energy
consumption for different types of buildings or larger datasets. Exploring alternative
techniques, such as seasonal decomposition or time series analysis, for incorporating time
information into the models could also yield valuable insights. These advancements in
energy consumption forecasting contribute to significant cost savings and environmental
benefits in commercial buildings.
Author Contributions: Individual Contribution: Conceptualization, T.N.D., G.S.T., M.S., S.M. and
A.S.; Methodology, T.N.D., G.S.T. and M.S.; Software, T.N.D. and G.S.T.; Validation, M.S., A.S. and
S.M.; Formal analysis, G.S.T., M.S., A.S. and S.M.; Investigation, G.S.T., M.S., S.M. and A.S.; Resources,
M.S., A.S. and S.M.; Data curation, G.S.T., T.N.D. and M.S.; Writing—original draft preparation, T.N.D.
and G.S.T.; Writing—review and Editing, M.S., S.M. and A.S.; Visualization, G.S.T., T.N.D. and M.S.
All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
References
1. Khalil, M.; McGough, A.S.; Pourmirza, Z.; Pazhoohesh, M.; Walker, S. Machine learning, deep learning and statistical analysis for
forecasting building energy consumption—A systematic review. Eng. Appl. Artif. Intell. 2022, 115, 105287. [CrossRef]
2. Agrawal, R.; Bhagia, S.; Satlewal, A.; Ragauskas, A.J. Urban mining from biomass, brine, sewage sludge, phosphogypsum and
e-waste for reducing the environmental pollution: Current status of availability, potential, and technologies with a focus on LCA
and TEA. Environ. Res. 2023, 224, 115523. [CrossRef] [PubMed]
3. Alok, S.; Ruchi, A.; Samarthya, B.; Art, R. Rice straw as a feedstock for biofuels: Availability, recalcitrance, and chemical properties:
Rice straw as a feedstock for biofuels. Biofuels Bioprod. Biorefining 2017, 12, 83–107.
4. IEA. Clean Energy Transitions in Emerging and Developing Economies; IEA: Paris, France, 2021.
5. Shin, S.-Y.; Woo, H.-G. Energy consumption forecasting in Korea using machine learning algorithms. Energies 2022, 15, 4880.
[CrossRef]
6. Özbay, H.; Dalcalı, A. Effects of COVID-19 on electric energy consumption in Turkey and ANN-based short-term forecasting. Turk.
J. Electr. Eng. Comput. Sci. 2021, 29, 78–97. [CrossRef]
7. Ji, Y.; Lomas, K.J.; Cook, M.J. Hybrid ventilation for low energy building design in south China. Build. Environ. 2009, 44,
2245–2255. [CrossRef]
8. Manu, S.; Shukla, Y.; Rawal, R.; Thomas, L.E.; De Dear, R. Field studies of thermal comfort across multiple climate zones for the
subcontinent: India Model for Adaptive Comfort (IMAC). Build. Environ. 2016, 98, 55–70. [CrossRef]
9. Delzendeh, E.; Wu, S.; Lee, A.; Zhou, Y. The impact of occupants’ behaviours on building energy analysis: A research review.
Renew. Sustain. Energy Rev. 2017, 80, 1061–1071. [CrossRef]
10. Itzhak, N.; Tal, S.; Cohen, H.; Daniel, O.; Kopylov, R.; Moskovitch, R. Classification of univariate time series via temporal
abstraction and deep learning. In Proceedings of the 2022 IEEE International Conference on Big Data (Big Data) IEEE, Osaka,
Japan, 17–20 December 2022; pp. 1260–1265.
11. Ibrahim, M.; Badran, K.M.; Hussien, A.E. Artificial intelligence-based approach for univariate time-series anomaly detection
using hybrid CNN-BiLSTM model. In Proceedings of the 2022 13th International Conference on Electrical Engineering (ICEENG)
IEEE, Cairo, Egypt, 29–31 March 2022; pp. 129–133.
12. Hu, M.; Ji, Z.; Yan, K.; Guo, Y.; Feng, X.; Gong, J.; Zhao, X.; Dong, L. Detecting anomalies in time series data via a meta-feature
based approach. IEEE Access 2018, 6, 27760–27776. [CrossRef]
13. Niu, Z.; Yu, K.; Wu, X. LSTM-based VAE-GAN for time-series anomaly detection. Sensors 2020, 20, 3738. [CrossRef]
14. Warrick, P.; Homsi, M.N. Cardiac arrhythmia detection from ECG combining convolutional and long short-term memory networks.
In Proceedings of the 2017 Computing in Cardiology (CinC) IEEE, Rennes, France, 24–27 September 2017; pp. 1–4.
15. Karim, F.; Majumdar, S.; Darabi, H.; Harford, S. Multivariate LSTM-FCNs for time series classification. Neural Netw. 2019, 116,
237–245. [CrossRef]
16. Gasparin, A.; Lukovic, S.; Alippi, C. Deep learning for time series forecasting: The electric load case. CAAI Trans. Intell. Technol.
2021, 7, 1–25. [CrossRef]
17. Wei, N.; Li, C.; Peng, X.; Zeng, F.; Lu, X. Conventional models and artificial intelligence-based models for energy consumption
forecasting: A review. J. Pet. Sci. Eng. 2019, 181, 106187. [CrossRef]
18. Kim, Y.; Son, H.G.; Kim, S. Short term electricity load forecasting for institutional buildings. Energy Rep. 2019, 5, 1270–1280.
[CrossRef]
19. Chitalia, G.; Pipattanasomporn, M.; Garg, V.; Rahman, S. Robust short-term electrical load forecasting framework for commercial
buildings using deep recurrent neural networks. Appl. Energy 2020, 278, 115410. [CrossRef]
20. Pinto, T.; Praça, I.; Vale, Z.; Silva, J. Ensemble learning for electricity consumption forecasting in office buildings. Neurocomputing
2021, 423, 747–755. [CrossRef]
21. Pallonetto, F.; Jin, C.; Mangina, E. Forecast electricity demand in commercial building with machine learning models to enable
demand response programs. Energy AI 2022, 7, 100121. [CrossRef]
22. Skomski, E.; Lee, J.Y.; Kim, W.; Chandan, V.; Katipamula, S.; Hutchinson, B. Sequence-to-sequence neural networks for short-term
electrical load forecasting in commercial office buildings. Energy Build. 2020, 226, 110350. [CrossRef]
23. Dagdougui, H.; Bagheri, F.; Le, H.; Dessaint, L. Neural network model for short-term and very-short-term load forecasting in
district buildings. Energy Build. 2019, 203, 109408. [CrossRef]
24. Khan, Z.A.; Hussain, T.; Ullah, A.; Rho, S.; Lee, M.; Baik, S.W. Towards efficient electricity forecasting in residential and
commercial buildings: A novel hybrid CNN with a LSTM-AE based framework. Sensors 2020, 20, 1399. [CrossRef]
25. Karijadi, I.; Chou, S.Y. A hybrid RF-LSTM based on CEEMDAN for improving the accuracy of building energy consumption
prediction. Energy Build. 2022, 259, 111908. [CrossRef]
26. Hwang, J.; Suh, D.; Otto, M.O. Forecasting electricity consumption in commercial buildings using a machine learning approach.
Energies 2020, 13, 5885. [CrossRef]
27. Fernández-Martínez, D.; Jaramillo-Morán, M.A. Multi-Step Hourly Power Consumption Forecasting in a Healthcare Building
with Recurrent Neural Networks and Empirical Mode Decomposition. Sensors 2022, 22, 3664. [CrossRef] [PubMed]
28. Jozi, A.; Pinto, T.; Marreiros, G.; Vale, Z. Electricity consumption forecasting in office buildings: An artificial intelligence approach.
In Proceedings of the 2019 IEEE Milan PowerTech, Milan, Italy, 23–27 June 2019; pp. 1–6.
29. Mariano-Hernández, D.; Hernández-Callejo, L.; Solís, M.; Zorita-Lamadrid, A.; Duque-Pérez, O.; Gonzalez-Morales, L.; Alonso-
Gómez, V.; Jaramillo-Duque, A.; Santos García, F. Comparative study of continuous hourly energy consumption forecasting
strategies with small data sets to support demand management decisions in buildings. Energy Sci. Eng. 2022, 10, 4694–4707.
[CrossRef]
30. Divina, F.; Torres, M.G.; Vela, F.A.G.; Noguera, J.L.V. A comparative study of time series forecasting methods for short term
electric energy consumption prediction in smart buildings. Energies 2019, 12, 1934. [CrossRef]
31. Johannesen, N.J.; Kolhe, M.; Goodwin, M. Relative evaluation of regression tools for urban area electrical energy demand
forecasting. J. Clean. Prod. 2019, 218, 555–564. [CrossRef]
32. Singhal, R.; Choudhary, N.; Singh, N. Short-Term Load Forecasting Using Hybrid ARIMA and Artificial Neural Network Model.
In Advances in VLSI, Communication, and Signal Processing: Select Proceedings of VCAS 2018; Springer: Singapore, 2020; pp. 935–947.
33. Li, K.; Zhang, T. Forecasting electricity consumption using an improved grey prediction model. Information 2018, 9, 204. [CrossRef]
34. del Real, A.J.; Dorado, F.; Duran, J. Energy demand forecasting using deep learning: Applications for the French grid. Energies
2020, 13, 2242. [CrossRef]
35. Fathi, S.; Srinivasan, R.S.; Kibert, C.J.; Steiner, R.L.; Demirezen, E. AI-based campus energy use prediction for assessing the effects
of climate change. Sustainability 2020, 12, 3223. [CrossRef]
36. Khan, S.U.; Khan, N.; Ullah, F.U.M.; Kim, M.J.; Lee, M.Y.; Baik, S.W. Towards intelligent building energy management: AI-based
framework for power consumption and generation forecasting. Energy Build. 2023, 279, 112705. [CrossRef]
37. Athiyarath, S.; Paul, M.; Krishnaswamy, S. A comparative study and analysis of time series forecasting techniques. SN Comput.
Sci. 2020, 1, 175. [CrossRef]
38. Noor, R.M.; Yik, N.S.; Kolandaisamy, R.; Ahmedy, I.; Hossain, M.A.; Yau, K.L.A.; Shah, W.M.; Nandy, T. Predict Arrival Time by
Using Machine Learning Algorithm to Promote Utilization of Urban Smart Bus. Preprints.org 2020, 2020020197. [CrossRef]
39. Ciampiconi, L.; Elwood, A.; Leonardi, M.; Mohamed, A.; Rozza, A. A Survey and Taxonomy of Loss Functions in Machine
Learning. arXiv 2023, arXiv:2301.05579.
40. Bianco, V.; Manca, O.; Nardini, S. Electricity consumption forecasting in Italy using linear regression models. Energy 2009, 34,
1413–1421. [CrossRef]
41. Saab, S.; Badr, E.; Nasr, G. Univariate modeling and forecasting of energy consumption: The case of electricity in Lebanon. Energy
2001, 26, 1–14. [CrossRef]
42. Yuan, X.; Li, L.; Wang, Y. Nonlinear dynamic soft sensor modeling with supervised long short-term memory network. IEEE Trans.
Ind. Inform. 2019, 16, 3168–3176. [CrossRef]
43. Durand, D.; Aguilar, J.; R-Moreno, M.D. An analysis of the energy consumption forecasting problem in smart buildings using
LSTM. Sustainability 2022, 14, 13358. [CrossRef]
44. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [CrossRef]
45. Graves, A.; Schmidhuber, J. Framewise phoneme classification with bidirectional LSTM and other neural network architectures.
Neural Netw. 2005, 18, 602–610. [CrossRef]
46. Le, T.; Vo, M.T.; Vo, B.; Hwang, E.; Rho, S.; Baik, S.W. Improving electric energy consumption prediction using CNN and Bi-LSTM.
Appl. Sci. 2019, 9, 4237. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.