sensors

Article
Auto-Scaling Techniques in Cloud Computing: Issues and
Research Directions
Saleha Alharthi, Afra Alshamsi, Anoud Alseiari and Abdulmalik Alwarafy *

Department of Computer and Network Engineering, College of Information Technology,


United Arab Emirates University, Al Ain 15551, United Arab Emirates; [email protected] (S.A.);
[email protected] (A.A.); [email protected] (A.A.)
* Correspondence: [email protected]

Abstract: In the dynamic world of cloud computing, auto-scaling stands as a beacon of efficiency,
dynamically aligning resources with fluctuating demands. This paper presents a comprehensive
review of auto-scaling techniques, highlighting significant advancements and persisting challenges in
the field. First, we overview the fundamental principles and mechanisms of auto-scaling, including
its role in improving cost efficiency, performance, and energy consumption in cloud services. We then
discuss various strategies employed in auto-scaling, ranging from threshold-based rules and queuing
theory to sophisticated machine learning and time series analysis approaches. After that, we explore
the critical issues in auto-scaling practices and review several studies that demonstrate how these
challenges can be addressed. We then conclude by offering insights into several promising research
directions, emphasizing the development of predictive scaling mechanisms and the integration of
advanced machine learning techniques to achieve more effective and efficient auto-scaling solutions.

Keywords: auto-scaling; cloud computing; workload forecasting; machine learning

Citation: Alharthi, S.; Alshamsi, A.; Alseiari, A.; Alwarafy, A. Auto-Scaling Techniques in Cloud Computing: Issues and Research Directions. Sensors 2024, 24, 5551. https://doi.org/10.3390/s24175551 (https://www.mdpi.com/journal/sensors)
Academic Editor: Jemal Abawajy
Received: 3 July 2024; Revised: 26 July 2024; Accepted: 5 August 2024; Published: 28 August 2024
Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction
Cloud computing enables users to access and utilize a range of computing resources, including networks, servers, storage, and applications, through the internet [1]. According to the National Institute of Standards and Technology, it is a model in which these resources can be conveniently accessed and used on demand, with minimal management effort or interaction with the service provider [2]. Cloud computing has emerged as a critical tool and is being used across various sectors, such as e-learning [3], healthcare [4], finance and banking [5], manufacturing [6], and telecommunications [7].
In settings where data-intensive tasks are prevalent, crucial cloud computing features like elasticity, resource allocation, and pooling play significant roles. Scalability, a key component of cloud computing, empowers organizations to adjust their resource use to align with their needs, offering not only cost benefits but also crucial enhancements in the performance of cloud-based applications [2]. The critical nature of scalability becomes particularly apparent in the realm of large-scale graph processing, which finds application in media, online gaming [8], and the internet of things (IoT) [9]. Conventional tools for graph processing often falter with large-scale graphs, facing difficulties in balancing scalability and cost-efficiency [10]. Researchers in [11] discussed issues related to large-scale optimization in elastic cloud computing. Auto-scaling, which involves dynamically increasing or decreasing resources based on specific needs and strategies, addresses this challenge by providing the flexibility to adapt.
In this review paper, we explore the concept of auto-scaling in cloud computing, a critical mechanism that adjusts resources dynamically based on current demands. Section 2 covers existing works related to auto-scaling techniques. Section 3 lays the foundation by introducing the fundamentals of auto-scaling, particularly emphasizing its relevance in the cloud computing context. Following that, Section 4 delves into the algorithms predominantly employed in auto-scaling within cloud environments, offering a detailed examination of their operations and applications. In Section 5, we transition to practical insights by showcasing the real-world applications of auto-scaling, illustrating how this technology is implemented in various scenarios. Section 6 addresses the challenges associated with auto-scaling in cloud computing. Section 7 presents potential prospects in this evolving field. Finally, we conclude the work in Section 8. Table 1 provides a list of the acronyms used in this paper and their definitions.

Table 1. List of Acronyms and Their Definitions.

Acronym Definition
RL Reinforcement learning
SSCAS Self-aware and self-adaptive cloud auto-scaling system
IaaS Infrastructure as a Service
AWS Amazon Web Services
QoS Quality of service
MCS Mission Critical Services
MDP Markov decision process
ML Machine learning
DNN Deep neural network
CNFs Containerized network functions
VNFs Virtualized network functions
VMs Virtual machines
SLAs Service level agreements
HDFS Hadoop’s Distributed File System
Telco Telecommunications
CEAS Cost-Effective Auto-Scaling
SVM Support Vector Machine
OPCR On-demand provisioning of computing resource
IRS Intelligent resource scaling
IoT Internet of things
DT Digital twin

2. Related Work
Several review papers [12–21] have been published, and they have provided valuable insights. However, with recent developments in the cloud computing environment, new research areas have emerged, making it necessary to explore the current key challenges in order to propose future research directions in this field. Therefore, this review covers auto-scaling techniques and
discusses the current challenges. Furthermore, we discuss the current applications and
domains of auto-scaling techniques. Table 2 illustrates the existing surveys and compares
them with our work.
To the best of our knowledge, no previous review paper has focused on all types of
auto-scaling techniques used in cloud environments. In this paper, we highlight different
types of auto-scaling techniques such as reactive methods (threshold rules, queuing theory)
and proactive methods (reinforcement learning, fuzzy learning, machine learning, and time
series analysis).

Table 2. Comparison with previous related papers.

Fourati et al. [12]
  Objective(s): To investigate and analyze existing microservice-based applications that use containers.
  Advantages: Discussed the challenges faced by present solutions and provided future development recommendations; highlights the strengths and shortcomings of existing microservice-aware auto-scalers.
  Limitations: Focused on resource allocation in microservice-based applications with containers and did not address other auto-scaling issues, such as cost optimization and performance.

Tran et al. [13]
  Objective(s): To provide an overview of existing container auto-scaling techniques in Kubernetes.
  Advantages: Analyzed auto-scaling methods, such as horizontal, vertical, and hybrid scaling; identifies the use of machine learning for predictive techniques.
  Limitations: Does not cover all auto-scaling techniques (focuses on auto-scaling in Kubernetes); lacks practical applications and examples.

Singh et al. [14]
  Objective(s): To conduct a methodological survey on auto-scaling techniques for web applications, and to identify and summarize the challenges and existing studies related to web application auto-scaling.
  Advantages: Describes the key challenges facing multi-tier web applications in cloud computing; provides a taxonomy to help readers understand different aspects of web application auto-scaling.
  Limitations: Covered only web application techniques, excluding other aspects of cloud computing scalability; in addition, the reviewed literature excludes recent papers.

Qu et al. [15]
  Objective(s): To review the challenges related to auto-scaling web applications in cloud computing.
  Advantages: Covered the weaknesses and strengths of existing auto-scaling techniques deployed by service and infrastructure providers.
  Limitations: Did not cover recent studies conducted after 2019; focused only on auto-scaling web applications.

Garí et al. [16]
  Objective(s): To review studies on reinforcement learning (RL) and propose taxonomies to compare RL-based auto-scaling approaches.
  Advantages: Covered major proposals for RL-based auto-scaling in cloud infrastructure; in addition, discusses related issues and highlights future research directions.
  Limitations: Focused on RL-based auto-scaling and excluded other aspects of auto-scaling in cloud computing; excluded real-world applications and domains.

Garí et al. [17]
  Objective(s): To summarize existing studies and approaches that used RL in the cloud for workflow auto-scaling applications.
  Advantages: Covers most of the studies in the field and highlights the limitations and gaps of other surveys; also covers the opportunities for advancing RL-based workflow learning in the cloud.
  Limitations: Lacks real-world applications and examples; the authors focused particularly on RL-based workflows in the cloud.

Dogani et al. [18]
  Objective(s): To review auto-scaling techniques for container-based virtualization in cloud, edge, and fog computing.
  Advantages: Provides a comprehensive understanding of challenges related to auto-scalers for container-based applications.
  Limitations: Focused specifically on auto-scaling techniques for container-based virtualization, excluding other auto-scaling techniques.

Chen et al. [19]
  Objective(s): To analyze existing studies on self-aware and self-adaptive cloud auto-scaling systems (SSCAS).
  Advantages: Presented a detailed analysis of open challenges in SSCAS approaches; identified promising directions for further advancement of SSCAS.
  Limitations: Excludes general auto-scaling concepts and techniques; focused on SSCAS without covering other aspects of cloud computing.

Rabiu et al. [20]
  Objective(s): To determine the current issues of load balancing and understand the challenges related to cloud-based container microservices.
  Advantages: Adds valuable insight into existing challenges faced in load balancing and auto-scaling.
  Limitations: Lacks information on auto-scaling techniques; focused on auto-scaling and load balancing in cloud-based container microservices.

Ali Khan et al. [21]
  Objective(s): To investigate and compare two auto-scaling techniques: fixed threshold-based and adaptive threshold-based.
  Advantages: Provides comprehensive insights into fixed threshold-based and adaptive threshold-based techniques and how they can optimize resource allocation.
  Limitations: Does not cover challenges related to auto-scaling techniques; does not address possible solutions and research directions for auto-scaling.

Our work
  Objective(s): To explore the concept of auto-scaling in cloud computing and the algorithms predominantly employed in auto-scaling, to address the challenges associated with auto-scaling in cloud computing, and to present potential solutions.
  Advantages: Covers recent studies in auto-scaling techniques; provides research directions that address future challenges; focuses on reactive and proactive auto-scaling techniques with a detailed discussion of each method.
  Limitations: -

3. Fundamentals of Auto-Scaling
This section explores the fundamental concepts of cloud computing, auto-scaling, horizontal scaling, and vertical scaling. These concepts lay the foundation for the remainder of the review, ensuring that readers have a clear understanding of the fundamentals before delving into detailed analyses.
Cloud computing has swiftly become an essential part of daily life and is increasingly
favored by Internet users [22]. Users of cloud services pay only for the resources they use,
following a pay-per-use model. However, achieving a balance between cost-effectiveness
and performance requires precision. As shown in Figure 1, the architecture of cloud computing consists of various components. The aim is to deliver high-quality cloud services at minimal cost while ensuring prompt responses to requests and system stability. Scalability, a pivotal characteristic of cloud computing, addresses this challenge by enabling flexibility and efficiency for users. In this context, auto-scaling refers to the ability to scale in and out by adding or removing resources depending on the requirements [7].

Figure 1. Auto-scaling in cloud computing.



Auto-scaling in cloud computing involves restructuring the allocation of hardware and software resources according to specific requirements [23]. Cloud computing provides several features, such as scalability, security, efficiency, and performance, and it has been deployed in different industries [7]. Cloud computing is widely adopted due to its flexibility and scalability, which are essential for handling big data. However, optimizing cloud resources for auto-scaling to manage big data efficiently remains a challenge [24]. In a cloud environment, reactive auto-scaling relies on a set of thresholds and cannot predict a system's future behavior. For example, if the value of a particular parameter, such as average CPU usage, exceeds a threshold, additional CPU capacity is added to the system [23].
Cloud-based applications can automatically adjust their resource usage, increasing
or decreasing as required to meet the demands of the application. The key features of
auto-scaling include the ability to add resources during peak demand (scaling out) and to
remove unused resources to minimize costs when demand decreases (scaling in). Rules
can be set for scaling actions, and unhealthy instances can be detected automatically and
replaced. Auto-scaling is often discussed in the context of resource provisioning, scalability,
and elasticity, which are related but distinct concepts [25].
Resource provisioning enables a system to scale resources dynamically based on work-
load changes [26], improving scalability and performance. Scalability refers to a system’s
ability to handle increasing workloads by adding hardware resources [27], either horizontally
(scaling out) by adding nodes or vertically (scaling up) by increasing resources in existing
nodes. Scalability is a prerequisite for elasticity, which involves adapting to workload changes
by provisioning and deprovisioning resources autonomically to closely match demand. Elas-
ticity encompasses how quickly a system can respond to workload fluctuations, ensuring
that the available resources closely align with the current demand. Auto-scaling techniques
facilitate elasticity by enabling automatic resource adjustments [25].
Auto-scaling solutions are classified into horizontal and vertical scaling methods. In horizontal scaling, virtual machines are added or removed to match the requirements of the current workload; that is, it changes the number of machines in the resource pool. Vertical scaling instead adds or removes resources (e.g., RAM, CPU, and disk space) on existing VM instances [28]. When adding machines to a system, large resource types may need to be broken down into smaller ones to reduce costs. Adding or removing machines of the same type from a resource pool is referred to as homogeneous scaling, whereas using machines of different types is called heterogeneous scaling [10]. Figure 2 shows the difference between the horizontal and vertical scaling methods.
The paper [29] extensively reviews auto-scaling and load balancing technologies in cloud computing, highlighting their pivotal role in efficiently managing data and applications through centralized resources. Focusing on infrastructure as a service (IaaS), it emphasizes dynamic scaling and load distribution techniques crucial for optimizing resource utilization and performance across varied workloads. A substantial section of the paper (illustrated in Table 3) delves into a comparative analysis of how major cloud providers like Amazon Web Services (AWS) and Microsoft Azure implement these technologies. AWS utilizes auto-scaling within its EC2 platform to adjust instance numbers dynamically based on application demand, using load balancers to distribute traffic effectively. In contrast, Microsoft Azure uses a platform-as-a-service model that depends on external tools such as Paraleap for auto-scaling, showcasing diverse approaches in the deployment and management of these critical cloud computing features [29].

Figure 2. Horizontal vs. vertical scaling concepts.

Table 3. Auto-scaling techniques used by various cloud providers.

Amazon (auto-scaling supported: Yes): Automatically scales the number of EC2 instances for different applications.
Windows Azure (auto-scaling supported: Yes): Provides an auto-scaling feature manually, based on the applications.
Google App Engine (auto-scaling supported: Yes): Owns its auto-scaling technology for Google applications.
GoGrid (auto-scaling supported: Yes/No): Supports auto-scaling in a programmatic way but does not implement it itself.
FlexiScale (auto-scaling supported: Yes): Provides an auto-scaling mechanism with high performance and availability.
Aneka (auto-scaling supported: Yes): Application management service through a cloud peer service.
Nimbus (auto-scaling supported: Yes): Open-source cloud provided with a resource manager and Python modules.
Eucalyptus (auto-scaling supported: Yes): Open-source cloud that provides a wrapper service for various applications.
OpenNebula (auto-scaling supported: Yes): Open-source cloud that provides the OpenNebula Service Management Project.

4. Auto-Scaling Techniques
This section presents a detailed discussion of reactive and proactive auto-scaling methods. Auto-scaling is either reactive or proactive. Reactive methods allocate and deallocate resources only when the need arises, and reactive auto-scaling is the most commonly used approach in the market. Proactive auto-scaling, in contrast, attempts to scale resources in anticipation of future workload patterns. A proactive mechanism can handle sudden workload fluctuations by estimating the required number of resources in advance, whereas a reactive mechanism cannot [28,30]. A proactive auto-scaling mechanism [31] can use past data on workload patterns; by applying a predictive technique to these data, it can estimate the resources needed for the future workload. The performance of proactive auto-scaling therefore depends on the accuracy of the workload prediction method. Integrating both methods can improve a system's throughput, as discussed in [32], which introduced a hybrid auto-scaling approach that utilizes both reactive and proactive methods.
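To illustrate how such a hybrid scheme might be wired together, the sketch below combines a reactive threshold rule with a simple proactive forecast. It is not taken from [32]; the function names, thresholds, and the moving-average forecaster are hypothetical stand-ins.

```python
# Hypothetical hybrid auto-scaler sketch: a reactive threshold rule
# fires first; otherwise a naive proactive forecast sizes the pool.
# All names, thresholds, and capacities here are illustrative.

def forecast_next(history, window=3):
    """Proactive part: moving-average forecast of the next load level."""
    recent = history[-window:]
    return sum(recent) / len(recent)

def decide_replicas(current_replicas, cpu_now, cpu_history,
                    high=0.80, low=0.30, capacity_per_replica=0.70):
    """Reactive rule first; otherwise provision for the forecast load."""
    if cpu_now > high:                          # reactive scale-out
        return current_replicas + 1
    if cpu_now < low and current_replicas > 1:  # reactive scale-in
        return current_replicas - 1
    # Proactive branch: size the pool so the forecast per-replica load
    # stays under the assumed capacity of a single replica.
    predicted = forecast_next(cpu_history + [cpu_now])
    return max(1, round(predicted * current_replicas / capacity_per_replica))

print(decide_replicas(4, 0.85, [0.6, 0.7, 0.8]))   # reactive branch fires: 5
```

The reactive branches guard against sudden spikes that a forecast would miss, while the proactive branch handles gradual drift, which matches the complementary strengths described above.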

4.1. Reactive Methods


4.1.1. Threshold Rules
Threshold-based rules are popular due to their simplicity and ease of deployment in cloud infrastructure. Cloud service providers, such as Amazon and Azure, offer rule-based solutions that depend on user-defined rules and trigger scaling actions when specific conditions are met, such as when CPU usage exceeds a given threshold [33]. The threshold-based rule technique is related to planning; however, defining the rules is the most difficult part, because it requires user input concerning various quality of service (QoS) metrics. Common metrics used in threshold evaluation are performance, CPU load time, and application response time. These parameters are obtained from monitoring tools and real-time data, which can then be used to perform rule-based actions [33]. This technique follows a reactive approach, as it responds to the collected data. Rule-based techniques can be easily implemented for applications with predictable workloads; however, if the application exhibits an unpredictable pattern, this technique is not the right choice, and other techniques are preferable. The threshold strategy has several advantages; for example, it can enhance system performance by avoiding VM start-up and shutdown time. In [34], the authors discussed prediction frameworks related to VMs and conducted several experiments indicating that such frameworks can improve the quality of cloud services, reduce costs, and improve energy consumption. In [35], the authors proposed an energy-efficient threshold scheme to reduce energy consumption, using threshold values to trigger task migration between virtual machines (VMs). This technique significantly reduced the makespan and increased resource utilization.
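A threshold rule of this kind can be sketched in a few lines. The cooldown period below is one common way to avoid the VM start-up/shutdown churn mentioned above; the thresholds and names are illustrative, not the defaults of any particular provider.

```python
# Minimal threshold-rule auto-scaler sketch with a cooldown period.
# Threshold values and class/attribute names are hypothetical.

import time

class ThresholdScaler:
    def __init__(self, scale_out_at=75.0, scale_in_at=25.0, cooldown_s=300):
        self.scale_out_at = scale_out_at   # % CPU that triggers scale-out
        self.scale_in_at = scale_in_at     # % CPU that triggers scale-in
        self.cooldown_s = cooldown_s       # suppress rapid repeated actions
        self.last_action_at = float("-inf")

    def evaluate(self, cpu_percent, replicas, now=None):
        """Return the new replica count for the observed CPU reading."""
        now = time.monotonic() if now is None else now
        if now - self.last_action_at < self.cooldown_s:
            return replicas                 # still cooling down: hold
        if cpu_percent > self.scale_out_at:
            self.last_action_at = now
            return replicas + 1
        if cpu_percent < self.scale_in_at and replicas > 1:
            self.last_action_at = now
            return replicas - 1
        return replicas

scaler = ThresholdScaler()
print(scaler.evaluate(90.0, replicas=2, now=0.0))    # 3 (scale out)
print(scaler.evaluate(95.0, replicas=3, now=10.0))   # 3 (cooldown holds)
```

Without the cooldown, a load hovering around a threshold would add and remove a VM on every evaluation, incurring exactly the start-up and shutdown overhead the technique tries to avoid.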

4.1.2. Queuing Theory


Queuing theory is utilized in the analysis stage of auto-scaling to evaluate performance parameters. It helps in understanding a system's performance based on metrics such as request response time, queue length, and service rate. One drawback of queuing theory is that it is intended for systems in a stationary state, so the dynamic nature of clouds is not addressed. To address this issue, researchers can combine queuing theory with other techniques, such as control theory and threshold-based policies. In [36], researchers presented optimization methods based on a combination of queuing theory and the Markov decision process (MDP). This study benefits service providers by optimizing their financial costs, as it comprises a set of fast load-based threshold techniques. However, the scope of that paper is limited to queue thresholds that determine when to activate or deactivate VMs. In contrast, the authors of [37] used a queuing-theory model in the context of VM migration to provide performance metrics for estimation and to improve resource utilization by supporting decision-making processes. The queuing
theory-based model helps increase resource utilization, improve QoS, and enhance resource scaling decisions. Furthermore, it can improve energy efficiency by identifying the most energy-efficient migration trends and server utilization scenarios.
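To illustrate the kind of sizing decision queuing theory supports, the sketch below uses the classical M/M/c model with the Erlang C formula to find the smallest VM pool that keeps the probability of a request having to queue below a target. The arrival/service rates and the target are illustrative, not values from the cited studies.

```python
# Queuing-theory sizing sketch (M/M/c + Erlang C). Function names and
# the example rates are hypothetical.

from math import factorial

def erlang_c(servers, offered_load):
    """P(wait > 0) for an M/M/c queue; offered_load = arrival/service rate."""
    if offered_load >= servers:
        return 1.0                      # unstable: requests always queue
    top = (offered_load ** servers / factorial(servers)) * (
        servers / (servers - offered_load))
    bottom = sum(offered_load ** k / factorial(k)
                 for k in range(servers)) + top
    return top / bottom

def vms_needed(arrival_rate, service_rate, max_wait_prob=0.2):
    """Smallest number of identical VMs meeting the queueing target."""
    load = arrival_rate / service_rate
    c = max(1, int(load) + 1)           # start just above the offered load
    while erlang_c(c, load) > max_wait_prob:
        c += 1
    return c

# 50 requests/s arriving, each VM serving 12 requests/s:
print(vms_needed(50, 12))   # → 7
```

This captures the stationary-state assumption noted above: the formula is exact only for a fixed arrival rate, which is why such models are usually re-evaluated periodically or combined with thresholds in a dynamic cloud.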

4.2. Proactive Methods


4.2.1. Reinforcement Learning
RL is a commonly used machine learning paradigm in which an agent interacts with its environment. In [16,17], the authors provided valuable insights into the application of RL to decision-making problems in cloud computing. They describe the auto-scaler as an agent that performs actions based on the environmental state. Initially, the agent does not know which actions to perform; over time, it explores to discover the appropriate actions. RL thus allows agents to interact with the environment, discover and learn how to make decisions, and increase their cumulative rewards. In [38], the researchers proposed an RL model focused on predicting the usage of automation applications such as XR/VR. The RL agent is trained to characterize the workload and predict service usage. The results of the RL-based model were compared to those of standard reactive techniques: the RL agent outperformed the reactive scalers in terms of obtaining higher reward and utilization levels in microservices, meaning the agent can learn how to scale microservices in advance. Because that work relies on simulations alone, the proposed approach may not address the challenges of an actual system. In contrast, the authors of [39] conducted real-world experiments on their proposed RL-based auto-scaling microservice, which was used to test and validate the module and identify the auto-scaling threshold value. The study in [40] proposed a new auto-scaling policy using RL.
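A toy version of such an RL auto-scaler can be sketched with tabular Q-learning. The state space (discretized load level and replica count), the reward shaping, and the random workload process below are simplified stand-ins, not the designs of [38] or [39].

```python
# Toy tabular Q-learning auto-scaler: states are (load level, replicas),
# actions are scale in / hold / scale out. Rewards and the workload
# process are hypothetical simplifications.

import random

ACTIONS = (-1, 0, +1)          # scale in, hold, scale out

def reward(load_level, replicas):
    """Penalize SLA risk (overload) more heavily than idle cost."""
    overload = max(0, load_level - replicas)
    idle = max(0, replicas - load_level)
    return -(3.0 * overload + 1.0 * idle)

def train(episodes=3000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {}                      # (load, replicas) -> one Q-value per action
    for _ in range(episodes):
        replicas = rng.randint(1, 5)
        for _ in range(20):
            load = rng.randint(1, 5)            # toy workload process
            qs = q.setdefault((load, replicas), [0.0, 0.0, 0.0])
            if rng.random() < epsilon:          # epsilon-greedy exploration
                i = rng.randrange(3)
            else:
                i = max(range(3), key=qs.__getitem__)
            replicas = min(5, max(1, replicas + ACTIONS[i]))
            r = reward(load, replicas)
            next_qs = q.setdefault((load, replicas), [0.0, 0.0, 0.0])
            qs[i] += alpha * (r + gamma * max(next_qs) - qs[i])
    return q

q = train()
# With the load level well above the replica count, the learned policy
# should prefer scaling out:
best = max(range(3), key=q[(5, 2)].__getitem__)
print(ACTIONS[best])
```

The point of the sketch is the learning loop itself: the agent starts with no knowledge of the right action and converges toward a scaling policy purely from reward feedback, mirroring the agent-environment interaction described above.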

4.2.2. Fuzzy Learning


Fuzzy learning, also known as fuzzy control or fuzzy logic, is a rule-based auto-scaling technique that involves a set of predefined if-else rules. It allows users to express conditions in linguistic terms such as high, medium, and low, and it can be used to define appropriate scaling decisions based on fuzzy rules. These rules capture the correlations between input variables, such as response time and CPU utilization, and actions that increase or decrease resources. This can improve system performance and efficiency by enabling resource allocation based on current workload rules. In [41], the authors used a fuzzy logic control system to prioritize processes based on the system workload, using input parameters such as processing time, workload, and process age. Similarly, in [42], a fuzzy logic technique was used to define the target layers for offloading, considering the resource heterogeneity of different input parameters such as task size, latency, and network bandwidth. A fuzzy learning technique can also be implemented to make decisions about task offloading, as discussed in [43], where fuzzy logic is employed as a task scheduler to define the target processing layer for an IoT device's heterogeneous tasks. These studies proposed innovative solutions using fuzzy techniques; moreover, combining machine learning with fuzzy learning techniques can enhance scalability in cloud computing with minimal reliance on expert-defined fuzzy rules.
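The rule structure described above can be sketched as follows. The membership ranges and the simplified centre-of-gravity defuzzification are illustrative choices, not those of [41–43].

```python
# Illustrative fuzzy-logic scaling sketch: CPU utilization is mapped to
# low/medium/high fuzzy sets, each rule votes on an action, and the
# votes are combined by a simplified centre-of-gravity defuzzification.
# Membership ranges and rule weights are hypothetical.

def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzify(cpu):
    """Degree of membership of a CPU % reading in each linguistic set."""
    return {
        "low": tri(cpu, -1, 0, 50),
        "medium": tri(cpu, 20, 50, 80),
        "high": tri(cpu, 50, 100, 201),
    }

# Rule base: "if CPU is <term> then <action>" (-1 in, 0 hold, +1 out).
RULES = {"low": -1, "medium": 0, "high": +1}

def decide(cpu):
    memberships = fuzzify(cpu)
    total = sum(memberships.values())
    # Weight each rule's action by its firing strength, then round the
    # centre of gravity to a discrete scaling step.
    score = sum(memberships[t] * RULES[t] for t in RULES) / total
    return round(score)

print(decide(90))   # → 1 (scale out)
print(decide(50))   # → 0 (hold)
print(decide(10))   # → -1 (scale in)
```

Near a set boundary (say 60% CPU), both "medium" and "high" fire partially and the output blends their actions, which is exactly the smooth behavior that distinguishes fuzzy rules from crisp thresholds.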

4.2.3. Machine Learning (ML)


This technique is widely used to enhance auto-scaling in cloud computing and predict future workloads. It can make accurate predictions about future resource requirements because it evaluates historical data and monitors real-time metrics, such as memory usage, CPU utilization, and network traffic. ML algorithms use these metrics to scale resources up or down based on specific demands. Machine learning involves various techniques, such as neural networks, support vector machines, and linear regression. One existing model [30] has shown that predicted data are very close to the real data. Figure 3 illustrates a real-time auto-scaling system based on deep learning that can forecast future workloads, such as CPU, bandwidth, and memory values; it also demonstrates the process of collecting metrics from various microservices, such as AWS EC2 instances. The model forecasts the workload, and based on these predictions, the system decides to scale resources up or down automatically. To address the issue of estimating the right threshold value for microservices, the authors of [39] proposed an intelligent autonomous auto-scaling system. They combined existing machine learning techniques with RL, and the results indicated that the microservice response time was improved by up to 20% compared to the default auto-scaling model. Another proposal, introduced in [44], also addressed the challenges of auto-scaling microservices in terms of performance prediction and resource allocation; the authors applied particle swarm optimization to a deep neural network (DNN) to realize optimal resource allocation. The results indicated that the proposed technique improved performance while reducing the total CPU cost by 30% to 50% compared to the baseline. Prior work [45] discussed the importance of incorporating ML into AI-driven DevOps, which can lead to significant improvements in cloud computing infrastructure. In [46], the authors developed an ML approach to improve the efficiency and performance of cloud computing resources. In [47], the authors described ML solutions in large-scale systems. In [48], the authors proposed a solution for proactive mission-critical services (MCS) and described how the metrics provided by MCS can predict future requirements to realize service scalability actions.
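As a minimal illustration of the predict-then-scale loop, the sketch below fits an ordinary-least-squares line to recent CPU measurements and provisions replicas for the predicted next value. The data, names, and per-replica capacity are hypothetical; real systems would use richer models (SVMs, DNNs) and more metrics, as the studies above do.

```python
# Minimal ML-flavoured sketch: linear regression on a CPU history,
# then provisioning for the predicted next-step demand. All values
# and names are illustrative.

import math

def fit_line(ys):
    """Least-squares slope/intercept for y measured at x = 0..n-1."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    var = sum((x - mean_x) ** 2 for x in range(n))
    slope = cov / var
    return slope, mean_y - slope * mean_x

def predict_replicas(cpu_history, per_replica_capacity=40.0):
    """Provision enough replicas for the predicted next measurement."""
    slope, intercept = fit_line(cpu_history)
    predicted = max(0.0, slope * len(cpu_history) + intercept)
    return max(1, math.ceil(predicted / per_replica_capacity))

# Aggregate CPU demand (arbitrary units) rising by 10 per interval:
print(predict_replicas([50, 60, 70, 80]))   # predicts 90 → 3 replicas
```

Because the decision is made from a forecast rather than the current reading, the extra replica can be started before the demand arrives, which is the core advantage of proactive ML-based scaling over threshold rules.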

Figure 3. Real-time auto-scaling system based on deep learning.

4.2.4. Time Series Analysis


Time series analysis is a popular technique that has been applied in various fields, such as engineering, finance, and bioinformatics. In general, it characterizes the variation of a measurement over time. In auto-scaling, this method is employed to forecast future workloads; by exploiting it, users can dynamically scale their resources up or down over time while ensuring optimal resource usage. As reported in [49], resource utilization was reduced by up to 71% through the integration of time series analysis with Daedalus, a self-adaptive manager for auto-scaling. A time series forecasting technique [30] collects historical data and uses statistical approaches to determine trends. The historical data are collected at regular intervals, for example, every two or five minutes. Different time series forecasting models are trained on the historical data and then used to forecast future resource demand. In [50], researchers used time series analysis to characterize citizen-reported urban issues and enable forecasting of future urban challenges, providing insights into this powerful statistical method for analyzing sequential data points gathered over time. A practical application of time series analysis is demonstrated in [51], which compared three AutoML tools on three types of time series datasets: weather, Bitcoin, and COVID-19. The main objective of using these tools is to reduce reliance on computer science expertise and improve the application of ML tools for time series analysis and prediction. In [52], researchers developed an ML approach called IoTArchML that continuously monitors QoS parameters and forecasts QoS metrics based on historical data.
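One classical forecasting step of this kind can be sketched with Holt's double exponential smoothing (level plus trend) over workload samples collected at fixed intervals. The smoothing constants below are illustrative, not tuned values from any cited study.

```python
# Time-series forecasting sketch for auto-scaling: Holt's double
# exponential smoothing tracks a level and a linear trend, then
# extrapolates ahead. Smoothing constants are hypothetical.

def holt_forecast(series, alpha=0.5, beta=0.3, steps_ahead=1):
    """Forecast `steps_ahead` intervals past the end of the series."""
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)   # smooth level
        trend = beta * (level - prev_level) + (1 - beta) * trend  # smooth trend
    return level + steps_ahead * trend

# Requests/min sampled every 5 minutes, with a steady upward trend:
history = [100, 110, 120, 130, 140]
print(round(holt_forecast(history)))   # → 150
```

Unlike a plain moving average, the trend term lets the forecaster anticipate a steadily rising workload, so an auto-scaler driven by it provisions capacity ahead of the demand rather than after it.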

5. Real-World Applications
Auto-scaling techniques have become indispensable in various sectors, showcasing
their adaptability and critical role in resource optimization and performance enhancement
across diverse computing environments. This section reviews real-world applications of
auto-scaling, illustrating its effectiveness in ensuring efficient operations and resource utilization in smart city infrastructure [23], social media analytics [24], telecommunications [7], and healthcare systems [53].

5.1. Smart City Platforms


In the realm of smart city platforms, the dynamism of computing services is crucial due
to the highly variable workloads stemming from multiple user interactions. Researchers
in [23] have emphasized the necessity for auto-scaling in smart city infrastructure to
maintain service efficiency and adaptability. The study demonstrates how a decision-
making mechanism for auto-scaling can be tailored to user specifications within such
environments, revealing that settings like update history and heat thresholds are pivotal in
influencing auto-scaling behavior. This underscores the technique’s potential to optimize
resource allocation in response to fluctuating demands within smart city systems.

5.2. Sentiment Analysis on Twitter Data


Another compelling application was presented in the paper [24], where a cost-effective
auto-scaling framework was employed for the sentiment analysis of Twitter data from vari-
ous universities. The proposed framework adjusts computing resources within a Hadoop
cluster, catering to the real-time processing needs of vast Twitter datasets. By adeptly
scaling resources to match data loads, the framework not only optimizes utilization but also
curtails unnecessary expenditures, showcasing an efficient data processing paradigm that
yields valuable insights into public sentiment, crucial for academic institutions to gauge
their social media engagement and public perception.

5.3. Telecommunications (Telco) Cloud Infrastructure


The telecommunications sector also benefits from auto-scaling, especially when integrating containerized and virtualized network functions. As discussed in [7], an auto-scaling architecture tailored for the Telco cloud employs algorithms suited to both containerized network functions (CNFs) and virtualized network functions (VNFs). This architecture combines an elastic auto-scaling system, orchestration and management, and load balancing to handle application loads across various VMs and pods. Such a system not only ensures resource efficiency but also facilitates cost reduction and enhanced service reliability in telecommunications infrastructures.
Figure 4 shows an elastic auto-scaling architecture similar to the one proposed in [7];
it is deployed in the Telco cloud and consists of three main components: an auto-scaling
system, orchestration and management, and load balancer.
The role of the load balancer is to distribute the Telco application load by balancing requests from different destinations in the cluster across the pods and VMs. The management system manages the VMs and pods in Kubernetes and gathers the utilized resources and other parameters of each pod and VM running in the cloud. The auto-scaling system includes a load balancer controller, an address collector, and a scaling plugin, and it can expand the number of pods and VMs horizontally based on the workload of the virtual cluster. Once the virtual cluster's resource consumption exceeds the defined threshold, the auto-scaling system creates a new VM running the same application and reconfigures the load balancer to distribute requests among the VMs.
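The threshold-driven behavior described above can be illustrated with a short sketch. The following snippet is not the algorithm of [7]; the function names, thresholds, and round-robin policy are illustrative assumptions only:

```python
# Illustrative sketch of a threshold-based horizontal auto-scaler, loosely
# modeled on the elastic Telco-cloud architecture described above.
# All names and thresholds here are hypothetical.

def scale_decision(utilizations, scale_out_threshold=0.8, scale_in_threshold=0.3):
    """Decide how many VMs/pods the cluster should run next.

    utilizations: per-instance resource utilization values in [0, 1].
    Returns the new desired instance count.
    """
    n = len(utilizations)
    avg = sum(utilizations) / n
    if avg > scale_out_threshold:           # cluster saturated: add an instance
        return n + 1
    if avg < scale_in_threshold and n > 1:  # cluster idle: remove an instance
        return n - 1
    return n                                # within bounds: keep current size

def rebalance(requests, instance_count):
    """Round-robin assignment of requests to instances (the load balancer's job)."""
    return {i: requests[i::instance_count] for i in range(instance_count)}

# Example: a saturated 2-VM cluster triggers a scale-out, and the load
# balancer then spreads incoming requests across all instances.
desired = scale_decision([0.92, 0.88])
assignment = rebalance(list(range(9)), desired)
print(desired, {k: len(v) for k, v in assignment.items()})
```

A production system would additionally apply cooldown periods and per-metric thresholds, but the control loop has this overall shape.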

Figure 4. Auto-scaling system architecture.

5.4. Application of Auto-Scaling in Healthcare


The auto-scaling mechanism described in [53] has significant practical applications
in real-world healthcare settings, particularly in enhancing the capabilities of cloud-based
medical monitoring systems. By implementing this proactive auto-scaling solution, health-
care providers can ensure that their cloud infrastructures dynamically adjust to fluctuating
data volumes from various medical devices, such as electrocardiograms or continuous
glucose monitors. For example, in a scenario where a patient’s condition deteriorates,
the system can scale up resources predictively to handle the increased data transmission
from the monitoring devices, ensuring that all critical data are processed in real time.
This approach allows timely medical intervention based on the most recent health data.
In essence, the application of this auto-scaling mechanism in real healthcare settings con-
tributes to a more robust, responsive, and efficient medical monitoring system, reducing
latency and enhancing the accuracy of patient care by leveraging cloud computing’s flexible
resource management. As shown in Figure 5, the architecture demonstrates how multiple
VMs are utilized within the cloud to support such dynamic resource allocation effectively.

Figure 5. Physical machine architecture.

6. Challenges and Possible Solutions


In the rapidly evolving cloud computing domain, the ability to dynamically adjust
computational resources in response to fluctuating workloads, known as auto-scaling,
represents a pivotal innovation. This functionality is crucial for optimizing resource utiliza-
tion, maintaining service level agreements (SLAs), and ensuring high QoS for end users.
Despite its benefits, auto-scaling presents a unique set of challenges, primarily due to the
variable nature of cloud workloads, which demands a system that can anticipate and react
to changes efficiently.
Auto-scaling in cloud computing presents a multitude of challenges that can impact
its effectiveness and efficiency. A primary concern is the accuracy of resource prediction,
which is crucial for provisioning the appropriate amount of resources dynamically. Incorrect
predictions can lead to significant resource wastage or performance bottlenecks, affecting
operational costs and application reliability. Additionally, applications in sectors like health-
care and finance may experience sudden spikes in demand, which auto-scaling systems
must manage in real-time to prevent performance degradation. Integrating auto-scaling
solutions with existing IT infrastructures, especially legacy systems, introduces further com-
plexity. This integration often results in deployment challenges and security vulnerabilities.
Moreover, the auto-scaling operations themselves can consume substantial resources, po-
tentially detracting from the overall performance of the cloud environment [54]. To address
these challenges, several solutions can be implemented to enhance the functionality and
reliability of auto-scaling in cloud environments. Improving the accuracy of predictive algo-
rithms through advanced machine learning techniques can help in dynamically adjusting
resources more accurately, based on historical data and real-time analytics. Implement-
ing hybrid scaling strategies that combine proactive and reactive methods can effectively
manage rapid demand fluctuations. Developing cost-aware auto-scaling algorithms that in-
corporate financial metrics into the scaling decision process can optimize operational costs
while maintaining performance. To reduce the complexity of implementing auto-scaling in
existing systems, standardizing auto-scaling interfaces and protocols is crucial along with
ensuring that new systems are compatible with older technologies through comprehensive
testing. Additionally, optimizing the efficiency of scaling algorithms and minimizing the
resource footprint of auto-scaling processes can mitigate the performance overheads associ-
ated with auto-scaling operations. For legacy systems, developing middleware that can
bridge the gap between scalable cloud infrastructure and non-scalable legacy systems may
provide a viable solution [54].
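The hybrid proactive-plus-reactive strategy mentioned above can be sketched by provisioning for the larger of the observed load and a short-horizon forecast. The function names, smoothing factor, and per-instance capacity below are illustrative assumptions, not a published algorithm:

```python
# Sketch of a hybrid proactive + reactive scaling policy: provision for the
# larger of the currently observed load and a forecast of the near-future
# load. The per-instance capacity figure is assumed for illustration.
import math

CAPACITY_PER_INSTANCE = 100.0  # requests/s one instance can serve (assumed)

def forecast_load(history, alpha=0.5):
    """Proactive signal: exponential smoothing over recent load samples."""
    level = history[0]
    for x in history[1:]:
        level = alpha * x + (1 - alpha) * level
    return level

def hybrid_instances(current_load, history):
    """Reactive + proactive: size for whichever demand signal is larger."""
    demand = max(current_load, forecast_load(history))
    return max(1, math.ceil(demand / CAPACITY_PER_INSTANCE))

# Rising traffic: the forecast anticipates demand before it fully materializes,
# so the cluster scales out ahead of the observed load alone.
print(hybrid_instances(250.0, [100.0, 200.0, 300.0, 400.0]))
```

Taking the maximum of the two signals is one simple way to avoid under-provisioning when either the reactive or the proactive estimate lags reality.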
This section delves into the various challenges associated with auto-scaling in cloud environments, exploring the implications of workload variability and the need for predictive scaling mechanisms to prevent resource wastage and service degradation. We summarize recent innovative solutions across different cloud computing paradigms, including Hadoop-based big data processing, Telco cloud services, graph processing, and general cloud applications.

6.1. Energy Efficiency


The energy efficiency challenge in Apache Hadoop, as discussed in [2], stems from its
significant energy consumption in data centers. The architecture of Hadoop’s distributed
file system (HDFS) complicates the ability to scale down resources efficiently, leading to
servers running idle and consuming energy without active use. This inefficiency not only
increases operational costs but also impacts the environmental footprint of data process-
ing. The challenge is further exacerbated by the necessity to maintain data availability,
preventing the straightforward deactivation of idle nodes, thus hindering the optimization
of energy usage in Hadoop deployments.
To address this challenge, the authors of [2] propose an innovative solution that involves separating HDFS nodes from MapReduce computing nodes, enabling more effective auto-scaling. This separation allows for the dynamic scaling of computation nodes independent
of the data nodes, thereby maintaining data availability while optimizing resource utilization. The introduction of a cluster monitor and an auto-scaler component enables the system
to make informed decisions about when to scale up or down based on real-time metrics
like CPU and RAM utilization, and job task status. This approach not only enhances energy
efficiency by shutting down unnecessary compute nodes during low workload periods but
also ensures the system’s responsiveness and capacity to handle bursts of computational
demand, paving the way for a more energy-efficient and cost-effective Hadoop deployment.

6.2. Dynamic Nature of Telco Services


In [7], the authors presented challenges and solutions for implementing elastic auto-
scaling in Telco cloud environments. The primary challenge is the dynamic nature of
Telco services, especially with the advent of 5G technology, which necessitates rapid and
reliable service provisioning. This challenge is compounded by the coexistence of VNFs and CNFs in the Telco cloud, which requires an auto-scaling architecture that can adapt
to both. The challenge is to manage resource allocation dynamically to address varying
user demands without under-provisioning (which can lead to service degradation) or
over-provisioning (which is costly and inefficient).
To address these challenges, the paper [7] proposed an elastic auto-scaling architecture
tailored for the Telco cloud, accommodating both VNFs and CNFs. This architecture
includes a load balancer, a management and orchestration system, and an auto-scaling
system with two specialized algorithms. The architecture is designed to automatically
scale services in or out based on real-time demands, enhancing resource utilization and
efficiency. By integrating with existing Telco cloud infrastructures, this solution aims to
ensure that services can dynamically adapt to varying loads, thereby optimizing resource
use, reducing costs, and maintaining service quality regardless of demand fluctuations.

6.3. Optimizing Big Data Processing with Auto-Scaling in the Cloud


The primary challenge identified in [24] concerns the efficient processing of big data
within cloud computing environments, where the main issue lies in the dynamic adjustment
of computing resources to handle varying data workloads efficiently. Traditional fixed
resource configurations, designed to manage peak loads, result in underutilization and
wastage during periods of lower demand, leading to inefficiency and escalated costs.
In response to this challenge, the authors of [24] propose a cost-effective auto-scaling (CEAS) framework specifically developed for the AWS cloud. This framework
intelligently auto-scales the virtual nodes in a Hadoop cluster, aligning resource allocation
with the fluctuating real-time data workloads. By dynamically scaling resources up or down
based on actual need, the CEAS framework optimizes resource usage, thereby enhancing
operational efficiency and reducing the financial burden associated with big data processing
in the cloud.

6.4. Cost-Efficient Graph Processing in Clouds


Public cloud providers offer a pay-as-you-go model that allows users to pay only for the resources they utilize. However, this model has its own issues, for example, infrastructure virtualization overhead, latency due to VM placement, and limited control over the environment. To address these issues, Ref. [10] developed a new resource-based auto-scaling algorithm that partitions graphs across the available VMs based on VM type. The proposed solution integrates this algorithm into the iGraph framework to minimize the cost of graph processing in the cloud environment. It further introduces a new feature-based dynamic approach that distributes graph partitions effectively across heterogeneous resources. The auto-scaling algorithm can dynamically change the types and number of virtual machines based on the capacity required to process the rest of the graph.
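Choosing VM types for the remaining workload can be sketched as a simple cost-aware greedy selection. The VM types, prices, and capacity units below are hypothetical, not those of the iGraph-based work in [10]:

```python
# Sketch of resource-based VM selection for the remaining graph workload:
# repeatedly pick the VM type with the best capacity-per-cost ratio, then
# cover the tail with the cheapest type that still fits it. All VM types,
# prices, and capacity units are assumed for illustration.

VM_TYPES = {                 # name: (capacity units, hourly cost in $)
    "small":  (2, 0.05),
    "medium": (4, 0.09),
    "large":  (8, 0.16),
}

def provision(required_capacity):
    """Return a list of VM type names whose combined capacity covers demand."""
    best = min(VM_TYPES, key=lambda t: VM_TYPES[t][1] / VM_TYPES[t][0])
    cap = VM_TYPES[best][0]
    chosen, remaining = [], required_capacity
    while remaining > cap:               # fill the bulk with the best-ratio type
        chosen.append(best)
        remaining -= cap
    # cover what is left with the cheapest type that is still large enough
    tail = min((t for t in VM_TYPES if VM_TYPES[t][0] >= remaining),
               key=lambda t: VM_TYPES[t][1])
    chosen.append(tail)
    return chosen

print(provision(18))   # bulk on cost-efficient large VMs, tail on a small one
```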

6.5. Enhancing Cloud Auto-Scaling with Machine Learning-Based Workload Prediction


The main challenge in cloud computing, as presented in [28], revolves around handling
fluctuating workloads in VMs to maintain optimal resource allocation. Cloud services must
adapt to varying workload patterns by autonomously scaling resources up or down to
ensure adherence to SLAs and maintain QoS. The inherent issue is the dynamic nature of
cloud workloads, which necessitates predictive scaling to avoid both over-allocation and under-allocation of resources; these lead, respectively, to resource wastage and potential degradation in service response times.
To address this challenge, the authors of [28] propose a proactive auto-scaling solution using an ML-based model, specifically an SVM for multi-class classification. This
model predicts future workload trends by learning from historical data, enabling the cloud
system to prepare and adjust the number of VMs in advance to meet the predicted demands.
This predictive approach aims to optimize resource utilization, ensuring that the cloud in-
frastructure dynamically aligns with user demand, thereby enhancing operational efficiency
and reducing unnecessary costs. The effectiveness of the proposed model is demonstrated
using a real cloud dataset, showcasing its potential to accurately predict workload changes
and facilitate more informed auto-scaling decisions in cloud environments.
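The idea of classifying historical workload into demand classes and mapping each class to a VM count can be sketched as below. In place of a full SVM (for which one would typically use a library such as scikit-learn), this dependency-free stand-in uses a nearest-centroid rule; the feature layout, labels, and VM plan are all assumptions, not details from [28]:

```python
# Simplified stand-in for a multi-class workload classifier: assign a window
# of recent CPU samples to a demand class, then scale VMs per class.
# A real system would train an SVM; nearest-centroid keeps the sketch short.

def train_centroids(samples):
    """samples: list of (feature_vector, label). Returns label -> centroid."""
    sums, counts = {}, {}
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Assign x to the class with the nearest centroid (squared distance)."""
    return min(centroids, key=lambda y: sum((a - b) ** 2
                                            for a, b in zip(centroids[y], x)))

VMS_PER_CLASS = {"low": 1, "medium": 3, "high": 6}  # assumed scaling plan

history = [([0.1, 0.2, 0.1], "low"), ([0.5, 0.6, 0.5], "medium"),
           ([0.9, 0.8, 0.9], "high"), ([0.2, 0.1, 0.2], "low")]
model = train_centroids(history)
label = predict(model, [0.85, 0.9, 0.8])
print(label, VMS_PER_CLASS[label])
```

The mapping from predicted class to VM count is what lets the system pre-provision capacity before the demand actually arrives.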

7. Research Directions
This section outlines crucial future directions for auto-scaling in cloud computing infrastructure. Based on our survey, we identify several potential research directions for future improvements, as discussed below.

7.1. Energy Efficiency


Recently, several studies have focused more on cost optimization and the QoS in
auto-scaling. However, optimizing energy is another promising research field that requires
further investigation. Energy consumption is a major concern in auto-scaling and some
of the existing work tried to minimize energy utilization. In [35], researchers proposed
an energy-efficient threshold to reduce energy consumption. Similarly, the approach in [2] enhanced energy efficiency by shutting down unnecessary compute nodes during low-workload periods. Another important research direction that needs further exploration is
applying auto-scaling techniques to reduce energy consumption. As previously mentioned
in [37], the queuing theory-based model helps increase resource utilization and improves
energy efficiency by determining the most energy-efficient migration trends and server
utilization scenarios. Few studies have incorporated renewable energy sources, such as
solar panels, to provide power to data centers. These solutions are critical future research
directions that require additional investigation. The objective of adopting renewable energy
is to boost environmental sustainability by minimizing the dependence of data centers
on non-renewable energy sources during auto-scaling decisions. For sustainable energy
solutions, the authors of [55] encouraged researchers to further explore the combination
of hydropower dams with floating photovoltaic energy. In [56], the authors developed a
multi-agent deep reinforcement learning framework to address the challenges of renewable
energy in data centers. They successfully combined green and brown energy sources.
However, the framework was demonstrated through simulations; thus, adopting such a
system in practical cloud computing applications requires further investigation. In [57],
researchers investigated the best option in terms of power saving in both cloud computing
and edge computing.

7.2. Workload Prediction


Recent studies have demonstrated that proactive auto-scaling techniques can be in-
corporated with ML algorithms to enhance workload prediction. One promising research
direction is refining prediction accuracy in SVM, deep learning, and time series analysis
algorithms. These enhancements improve system performance and provide highly accurate workload predictions. As explained in [28], the authors used an SVM to predict future workload trends by learning from historical data, enabling the cloud system to prepare and adjust the number of VMs in advance to meet predicted demands. In [30], the authors developed a real-time auto-scaling system based on deep learning and showed that the predicted data closely match the real data.
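A minimal time-series predictor of the kind this direction calls for can be sketched by extrapolating a linear trend over recent workload samples and sizing the VM pool for the forecast. The window length and per-VM capacity are illustrative assumptions, not parameters of the predictors in [28] or [30]:

```python
# Sketch of proactive scaling from a time-series forecast: fit a least-squares
# line to recent workload samples and provision VMs for the next interval.
# The per-VM capacity figure is assumed for illustration.
import math

def linear_forecast(samples):
    """Least-squares line through (0..n-1, samples); predict the next point."""
    n = len(samples)
    xs = range(n)
    mean_x, mean_y = (n - 1) / 2, sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples))
    slope = sxy / sxx
    return mean_y + slope * (n - mean_x)   # extrapolate one step ahead

def vms_needed(samples, capacity_per_vm=100.0):
    return max(1, math.ceil(linear_forecast(samples) / capacity_per_vm))

# Steadily rising load: provision ahead of demand instead of reacting late.
print(vms_needed([120.0, 180.0, 240.0, 300.0]))
```

Deep learning or SVM predictors replace the linear model here, but the surrounding loop (forecast, then size the pool before demand arrives) stays the same.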

7.3. Cost Optimization


Spot instances have emerged recently as a promising solution introduced by AWS that
minimizes the cost of resources compared to on-demand resources. Another promising
research direction that requires further investigation is the CEAS framework developed for the AWS cloud, as discussed in [24]. This framework optimizes resource usage, thereby enhancing
operational efficiency and reducing the financial burden associated with big data processing
in the cloud. In addition, the authors of [10] developed a framework to minimize the cost of graph
processing in a cloud environment, which requires further exploration. Cost-aware auto-
scaling techniques are a critical area of future research. Several optimization techniques,
such as queuing theory and reinforcement learning, can be explored to provide cost-
effective auto-scaling strategies. In a different perspective on RL, as discussed in [58],
the authors utilized the DeepScale algorithm to auto-scale containerized applications
and reduce costs. Also, in [59], researchers employed multiple ML techniques to reduce
resource costs.
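The cost advantage of mixing spot and on-demand capacity can be sketched with simple arithmetic. The prices and baseline fraction below are hypothetical, not actual AWS figures; real spot prices fluctuate and spot instances can be interrupted:

```python
# Sketch of a cost-aware capacity plan: keep a guaranteed on-demand baseline
# and serve the remainder with cheaper (but interruptible) spot capacity.
# All prices and the baseline fraction are assumed for illustration.

ON_DEMAND_PRICE = 0.10   # $/instance-hour (assumed)
SPOT_PRICE = 0.03        # $/instance-hour (assumed, typically far cheaper)

def plan_capacity(total_instances, baseline_fraction=0.4):
    """Split demand into an on-demand baseline plus a spot pool."""
    on_demand = max(1, round(total_instances * baseline_fraction))
    spot = total_instances - on_demand
    mixed_cost = on_demand * ON_DEMAND_PRICE + spot * SPOT_PRICE
    all_on_demand_cost = total_instances * ON_DEMAND_PRICE
    return on_demand, spot, mixed_cost, all_on_demand_cost

od, spot, mixed, full = plan_capacity(10)
print(od, spot, round(mixed, 2), round(full, 2))
```

A cost-aware auto-scaler would additionally react to spot interruptions by falling back to on-demand capacity, which is where the research challenge lies.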

7.4. Combining Vertical and Horizontal Scaling


Although several studies have applied hybrid scaling approaches, they have not ad-
dressed the challenges related to the integration of vertical and horizontal scaling, such as
resource allocation, monitoring, decision-making, and load balancing. Therefore, further
investigation in these areas could provide viable solutions to enhance hybrid scaling ap-
proaches. Applying hybrid scaling is a promising avenue for research and provides more
flexible solutions to enhance resource usage, as discussed in [25]. The authors showed
the differences between Knative hybrid auto-scaling and default horizontal auto-scaling.
Results indicated that their system outperformed the existing default Knative horizontal auto-scaling. In [60], the authors compared two auto-scaling approaches under Kubernetes, while in [61], researchers proposed a hybrid auto-scaling framework specifically for containerized applications in Kubernetes to enhance resource utilization. Furthermore,
for future improvements, the authors suggest determining a mechanism for selecting an
appropriate monitoring interval length.
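One way to combine the two scaling dimensions is to grow an instance vertically while it is below a size cap and only then add replicas horizontally. The cap, threshold, and doubling policy below are illustrative assumptions, not the policies of [25], [60], or [61]:

```python
# Sketch of a hybrid scaling decision: scale vertically (grow an instance)
# while headroom exists below a size cap, otherwise scale horizontally
# (add a replica). Size limits and thresholds are hypothetical.

def hybrid_scale(replicas, cpu_per_replica, util,
                 max_cpu_per_replica=8, high=0.8):
    """Return (replicas, cpu_per_replica) after one scaling step."""
    if util <= high:
        return replicas, cpu_per_replica       # healthy: no action needed
    if cpu_per_replica < max_cpu_per_replica:
        return replicas, cpu_per_replica * 2   # vertical: double instance size
    return replicas + 1, cpu_per_replica       # horizontal: add a replica

print(hybrid_scale(2, 4, 0.95))   # room to grow -> vertical step
print(hybrid_scale(2, 8, 0.95))   # at the size cap -> horizontal step
print(hybrid_scale(2, 4, 0.50))   # below threshold -> unchanged
```

Preferring vertical steps first avoids the coordination cost of new replicas, while the cap bounds the blast radius of any single instance; other orderings are equally plausible research choices.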

7.5. Real-Time and Proactive Auto-Scaling


Unlike traditional auto-scaling, which has latency issues, real-time and proactive
auto-scaling is a new paradigm that researchers should continue to explore. Some studies
have implemented real-time and proactive methods that can proactively modify resources
in advance and accurately predict workload changes. Implementing these algorithms has
several benefits. For example, latency issues related to traditional auto-scaling can be
eliminated, thereby ensuring a faster response time and more efficient system performance.
Another approach is predicting future workloads to optimize resource allocation, which
leads to greater cost savings. The proposed system in [62] was able to predict future workload
with multiple metrics to scale edge computing resources up and down. Overall, these
techniques minimize response times and enhance system performance. As previously
mentioned, the authors of [30] developed a real-time auto-scaling system based on deep learning, and their results indicated that the predicted data are very close to the real data. However,
gaps remain in these studies and require further evaluation. Hence, real-time and proactive
auto-scaling is one of the most important research directions in auto-scaling and cloud
computing infrastructure.

7.6. Heterogeneous Resource Provisioning


Heterogeneous resource provisioning is another important research area that requires
further exploration. There are diverse types of resources in cloud computing such as VMs,
containers, storage, and serverless functions. Currently, researchers are investigating auto-
scaling techniques that can efficiently manage the allocation of different resources. This
involves selecting the most suitable resource size and type depending on workload features
and implementing scaling policies for different types of resources. The work
in [63] is a promising research direction that needs further exploration. The authors
proposed on-demand provisioning of computing resources (OPCR) algorithms that are used
to handle and process several applications, which leads to cost savings. However, their
proposed algorithm was designed and tested using only simulations. Hence, real-time
applications should be implemented to determine accuracy and address challenges in
practical environments.

7.7. Multi-Cloud Auto-Scaling


Recently, researchers have focused on auto-scaling methods that support hybrid
cloud architectures and multi-cloud providers. With a significant increase in the adoption
of multi-cloud systems, there is a growing need for auto-scaling algorithms. With a single cloud provider, scaling decisions typically depend on metrics such as memory utilization and CPU usage. However, in multi-cloud environments, auto-scaling techniques can operate across multiple cloud providers. This ensures optimal
resource allocation considering various factors, such as cost, performance, and data locality.
Therefore, researchers have identified various areas to enable auto-scaling across multiple
cloud environments. One of these important research areas is the investigation of different
techniques for efficient workload migration. Moving workloads across several cloud
providers is challenging; thus, suitable migration techniques must be developed to address this. Another research area that researchers should investigate is intelligent
workload placement algorithms. These algorithms enhance resource allocation while
increasing the efficiency of auto-scaling between different cloud providers.
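An intelligent placement decision of the kind described above can be sketched as a weighted score over cost, latency, and data locality. The providers, figures, and weights below are entirely hypothetical:

```python
# Sketch of workload placement across cloud providers: score each provider
# on cost, expected latency, and data locality, then place the workload on
# the best-scoring one. All providers, figures, and weights are assumed.

PROVIDERS = {
    # name: (hourly cost $, latency ms, data_local: 1.0 if data already there)
    "cloud_a": (0.12, 40.0, 1.0),
    "cloud_b": (0.08, 90.0, 0.0),
    "cloud_c": (0.10, 55.0, 0.0),
}

def place(weights=(0.4, 0.4, 0.2)):
    """Lower cost and latency are better; data locality adds a bonus."""
    w_cost, w_lat, w_loc = weights
    max_cost = max(p[0] for p in PROVIDERS.values())
    max_lat = max(p[1] for p in PROVIDERS.values())

    def score(p):
        cost, lat, loc = PROVIDERS[p]
        # Normalize so every term lies in [0, 1]; higher score is better.
        return (w_cost * (1 - cost / max_cost)
                + w_lat * (1 - lat / max_lat)
                + w_loc * loc)

    return max(PROVIDERS, key=score)

print(place())                         # balanced weights favor low latency + locality
print(place(weights=(1.0, 0.0, 0.0)))  # pure cost minimization picks differently
```

Tuning the weights (or learning them) is precisely where the open research questions on cost, performance, and data locality trade-offs arise.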

7.8. Intelligent Resource Scaling


Intelligent resource scaling (IRS) is a potential research area that requires further investigation. IRS refers to the dynamic allocation of computing resources to meet the system's evolving needs, including auto-scaling resources up or down depending on specific factors. In a potential research direction discussed in [64], the authors proposed an IRS for efficient digital twin (DT) simulation. Their IRS continuously monitors the containers' workloads and forecasts resource demands using the DLinear model to react to future workloads. However, the IRS still needs further improvements to enhance the reliability and efficiency of the DT simulation. The work in [65] likewise needs further improvement and evaluation.
The authors mentioned that they plan to investigate online job requirements, which is
an estimation technique that provides a highly dynamic scheduling environment. These
studies offer promising research directions for intelligent resource scaling, and their focus
is on providing different solutions to modern cloud-based applications.

7.9. Enhancing Security in Auto-Scaling


Security and privacy are important considerations in auto-scaling systems. Auto-
scaling has been widely adopted in different domains; thus, ensuring the security of auto-
scaling mechanisms is critical. In [26], the authors discussed key elements of data security,
focusing on various aspects of cloud computing. Advanced techniques that improve
the security of auto-scaling applications are promising research directions. Furthermore,
during the auto-scaling process, privacy-preserving algorithms can be implemented to
manage sensitive data. Another important aspect of auto-scaling is adopting homomorphic
encryption techniques, which ensures that unauthorized users cannot access sensitive
data. Exploring further solutions for these techniques will enhance the security and
privacy of auto-scaling systems. Integrating ML technologies is another important research
direction for auto-scaling systems. ML algorithms can be trained to detect abnormal
behaviors and patterns within auto-scaling applications. Furthermore, such algorithms can raise alerts upon detecting suspicious activities or indications of security breaches. In [66], researchers presented
a malicious traffic detection framework using an unsupervised DL model, achieving a
significant 100% accuracy in malicious traffic detection. Table 4 summarizes the promising
future research directions provided in this section.

Table 4. Summary of future research directions in auto-scaling. Each entry lists the issue followed by the possible research direction.

Energy efficiency: Energy consumption is a major concern in auto-scaling, and some of the existing work has tried to minimize energy utilization.
Workload prediction: Using algorithms such as support vector machines, deep learning, and time series analysis enhances prediction accuracy.
Cost optimization: Spot instances, recently introduced by AWS, have emerged as a promising solution that minimizes the cost of resources.
Combining vertical and horizontal scaling: Hybrid scaling provides more flexible solutions to enhance resource usage.
Real-time and proactive auto-scaling: Real-time auto-scaling minimizes response times and enhances system performance.
Heterogeneous resource provisioning: Researchers are investigating auto-scaling techniques that efficiently manage the allocation of different resources.
Multi-cloud auto-scaling: With the significant increase in the adoption of multi-cloud platforms, there is a growing need for auto-scaling algorithms.
Intelligent resource scaling: IRS is the dynamic allocation of computing resources to meet the system's evolving needs.
Enhancing security in auto-scaling: Developing advanced techniques that improve the security of auto-scaling applications is a promising research direction.

8. Conclusions
In this paper, a literature survey of auto-scaling techniques was conducted, and we
outlined the main challenges of auto-scaling. Then, we presented several promising future
research directions based on the challenges. The different types of auto-scaling techniques
were presented and discussed, including reactive methods (threshold rules, queuing theory)
and proactive methods (RL, fuzzy learning, ML, and time series analysis). We discussed
existing real-world applications in cloud computing and the key challenges related to these
use cases. Additionally, we explained the fundamental concepts of cloud computing, auto-
scaling, horizontal scaling, and vertical scaling. In future work, we will explore specific
areas in auto-scaling. For example, we plan to focus more on resource allocation algorithms.

Author Contributions: Conceptualization, All authors; methodology, All authors; formal analysis,
All authors; investigation, All authors; resources, S.A., A.A. (Afra Alshamsi), and A.A. (Anoud
Alseiari); data curation, S.A., A.A. (Afra Alshamsi), and A.A. (Anoud Alseiari); writing—original
draft preparation, S.A., A.A. (Afra Alshamsi), and A.A. (Anoud Alseiari); writing—review and
editing, All authors; visualization, S.A., A.A. (Afra Alshamsi), and A.A. (Anoud Alseiari); supervision, A.A. (Abdulmalik Alwarafy); project administration, A.A. (Abdulmalik Alwarafy); funding acquisition, A.A. (Abdulmalik Alwarafy). All authors have read and agreed to the published version
of the manuscript.
Funding: This work was supported by Startup Grant # 12T047 from the United Arab Emirates
University (UAEU).
Data Availability Statement: Data are contained within the article.
Conflicts of Interest: The authors declare no conflict of interest. The funders had no role in the design
of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or
in the decision to publish the results.

References
1. Goel, P.K.; Gulati, S.; Singh, A.; Tyagi, A.; Komal, K.; Mahur, L.S. Energy-Efficient Block-Chain Solutions for Edge and Cloud
Computing Infrastructures. In Proceedings of the 2024 2nd International Conference on Disruptive Technologies (ICDT), Greater
Noida, India, 15–16 March 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 852–856.
2. Ismahene, N.W.; Souheila, B.; Nacereddine, Z. An Auto Scaling Energy Efficient Approach in Apache Hadoop. In Proceedings
of the 2020 International Conference on Advanced Aspects of Software Engineering (ICAASE), Constantine, Algeria, 28–30
November 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–6.
3. Eljak, H.; Ibrahim, A.O.; Saeed, F.; Hashem, I.A.T.; Abdelmaboud, A.; Syed, H.J.; Abulfaraj, A.W.; Ismail, M.A.; Elsafi, A. E-learning
based Cloud Computing Environment: A Systematic Review, Challenges, and Opportunities. IEEE Access 2023, 12, 7329–7355.
[CrossRef]
4. Fathullah, M.A.; Subbarao, A.; Muthaiyah, S.; Taralunga, D. Risk Classes of Cloud Computing Project in Healthcare: A Review of
Technical Report and Standards. J. Adv. Res. Appl. Sci. Eng. Technol. 2024, 44, 169–180. [CrossRef]
5. Bodemer, O. Revolutionizing Finance: The Impact of AI and Cloud Computing in the Banking Sector. TechRxiv 2024. [CrossRef]
6. Kiatipis, A.; Xanthopoulos, A. Cloud Usage for Manufacturing: Challenges and Opportunities. Procedia Comput. Sci. 2024,
232, 1412–1419. [CrossRef]
7. Sao Cao, D.; Nguyen, D.T.; Nguyen, X.C.; Nguyen, H.B.; Lang, K.T.; Dao, N.L.; Pham, T.T.; Cao, N.S.; Chu, D.H.; Nguyen, P.H.;
et al. Elastic auto-scaling architecture in telco cloud. In Proceedings of the 2023 25th International Conference on Advanced
Communication Technology (ICACT), Pyeongchang, Republic of Korea, 19–22 February 2023; IEEE: Piscataway, NJ, USA, 2023;
pp. 401–406.
8. Boujelben, Y.; Fourati, H. A distributed auction-based algorithm for virtual machine placement in multiplayer cloud gaming
infrastructures. Int. J. Cloud Comput. 2024, 13, 80–98. [CrossRef]
9. Singh, P.D.; Singh, K.D. Interdisciplinary Approaches: Fog/Cloud Computing and IoT for AI and Robotics Integration. EAI
Endorsed Trans. AI Robot. 2024, 3. [CrossRef]
10. Heidari, S.; Buyya, R. A cost-efficient auto-scaling algorithm for large-scale graph processing in cloud environments with
heterogeneous resources. IEEE Trans. Softw. Eng. 2019, 47, 1729–1741. [CrossRef]
11. Simic, V.; Stojanovic, B.; Ivanovic, M. Optimizing the performance of optimization in the cloud environment–An intelligent
auto-scaling approach. Future Gener. Comput. Syst. 2019, 101, 909–920. [CrossRef]
12. Fourati, M.H.; Marzouk, S.; Jmaiel, M. Towards Microservices-Aware Autoscaling: A Review. In Proceedings of the 2023 IEEE
Symposium on Computers and Communications (ISCC), Gammarth, Tunisia, 9–12 July 2023; IEEE: Piscataway, NJ, USA, 2023;
pp. 1080–1083.

13. Tran, M.N.; Vu, D.D.; Kim, Y. A survey of autoscaling in kubernetes. In Proceedings of the 2022 Thirteenth International
Conference on Ubiquitous and Future Networks (ICUFN), Barcelona, Spain, 5–8 July 2022; IEEE: Piscataway, NJ, USA, 2022;
pp. 263–265.
14. Singh, P.; Gupta, P.; Jyoti, K.; Nayyar, A. Research on auto-scaling of web applications in cloud: Survey, trends and future
directions. Scalable Comput. Pract. Exp. 2019, 20, 399–432. [CrossRef]
15. Qu, C.; Calheiros, R.N.; Buyya, R. Auto-scaling web applications in clouds: A taxonomy and survey. ACM Comput. Surv. (CSUR)
2018, 51, 1–33. [CrossRef]
16. Garí, Y.; Monge, D.A.; Pacini, E.; Mateos, C.; Garino, C.G. Reinforcement learning-based application autoscaling in the cloud: A
survey. Eng. Appl. Artif. Intell. 2021, 102, 104288. [CrossRef]
17. Gari, Y.; Monge, D.A.; Pacini, E.; Mateos, C.; Garino, C.G. Reinforcement learning-based autoscaling of workflows in the cloud: A
survey. arXiv 2020, arXiv:2001.09957.
18. Dogani, J.; Namvar, R.; Khunjush, F. Auto-scaling techniques in container-based cloud and edge/fog computing: Taxonomy and
survey. Comput. Commun. 2023, 209, 120–150. [CrossRef]
19. Chen, T.; Bahsoon, R.; Yao, X. A survey and taxonomy of self-aware and self-adaptive cloud autoscaling systems. ACM Comput.
Surv. (CSUR) 2018, 51, 1–40. [CrossRef]
20. Rabiu, S.; Yong, C.H.; Mohamad, S.M.S. A cloud-based container microservices: A review on load-balancing and auto-scaling
issues. Int. J. Data Sci. 2022, 3, 80–92. [CrossRef]
21. Khan, A.A.; Vidhyadhari, C.H.; Kumar, S. A review on fixed threshold based and adaptive threshold based auto-scaling
techniques in cloud computing. MATEC Web Conf. 2024, 392, 01115. [CrossRef]
22. Hung, C.L.; Hu, Y.C.; Li, K.C. Auto-scaling model for cloud computing system. Int. J. Hybrid Inf. Technol. 2012, 5, 181–186.
23. Solino, A.; Batista, T.; Cavalcante, E. Decision-Making Support to Auto-scale Smart City Platform Infrastructures. In Proceedings
of the 2023 18th Iberian Conference on Information Systems and Technologies (CISTI), Aveiro, Portugal, 20–23 June 2023; IEEE:
Piscataway, NJ, USA, 2023; pp. 1–6.
24. Jannapureddy, R.; Vien, Q.T.; Shah, P.; Trestian, R. An auto-scaling framework for analyzing big data in the cloud environment.
Appl. Sci. 2019, 9, 1417. [CrossRef]
25. Alipour, H.; Liu, Y.; Hamou-Lhadj, A. Analyzing auto-scaling issues in cloud environments. CASCON 2014, 14, 75–89.
26. Atadoga, A.; Umoga, U.J.; Lottu, O.A.; Sodiya, E.O. Evaluating the impact of cloud computing on accounting firms: A review of
efficiency, scalability, and data security. Glob. J. Eng. Technol. Adv. 2024, 18, 065–074. [CrossRef]
27. Mullapudi, M.; Munjala, M.B.; Kulkarni, C. Designing a Resilient Parallel Distributed Task Infrastructure for Large-Scale Data
Processing. Available online: https://www.researchgate.net/profile/Mahidhar-Mullapudi/publication/378438900_Designing_
a_Resilient_Parallel_Distributed_Task_Infrastructure_for_Large-scale_Data_Processing/links/65d96d94adc608480ae7fa04/
Designing-a-Resilient-Parallel-Distributed-Task-Infrastructure-for-Large-scale-Data-Processing.pdf (accessed on 1 June 2024).
28. Singh, S.T.; Tiwari, M.; Dhar, A.S. Machine Learning based Workload Prediction for Auto-scaling Cloud Applications. In
Proceedings of the 2022 OPJU International Technology Conference on Emerging Technologies for Sustainable Development
(OTCON), Raigarh, India, 8–10 February 2022; IEEE: Piscataway, NJ, USA, 2023; pp. 1–6.
29. Ashalatha, R.; Agarkhed, J. Evaluation of auto scaling and load balancing features in cloud. Int. J. Comput. Appl. 2015, 117.
30. Taha, M.B.; Sanjalawe, Y.; Al-Daraiseh, A.; Fraihat, S.; Al-E’mari, S.R. Proactive Auto-Scaling for Service Function Chains in
Cloud Computing based on Deep Learning. IEEE Access 2024, 12, 38575–38593. [CrossRef]
31. Ahamed, Z.; Khemakhem, M.; Eassa, F.; Alsolami, F.; Al-Ghamdi, A.S.A.M. Technical study of deep learning in cloud computing
for accurate workload prediction. Electronics 2023, 12, 650. [CrossRef]
32. Pereira, P.; Araujo, J.; Maciel, P. A hybrid mechanism of horizontal auto-scaling based on thresholds and time series. In
Proceedings of the 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), Bari, Italy, 6–9 October 2019;
IEEE: Piscataway, NJ, USA, 2019; pp. 2065–2070.
33. Quattrocchi, G.; Incerto, E.; Pinciroli, R.; Trubiani, C.; Baresi, L. Autoscaling Solutions for Cloud Applications under Dynamic
Workloads. IEEE Trans. Serv. Comput. 2024, 17, 804–820. [CrossRef]
34. Hu, Y.; Deng, B.; Peng, F. Autoscaling prediction models for cloud resource provisioning. In Proceedings of the 2016 2nd IEEE
International Conference on Computer and Communications (ICCC), Chengdu, China, 14–17 October 2016; IEEE: Piscataway, NJ,
USA, 2016; pp. 1364–1369.
35. Malla, P.A.; Sheikh, S.; Shahid, M.; Mushtaq, S.U. Energy-efficient sender-initiated threshold-based load balancing (e-STLB) in
cloud computing environment. Concurr. Comput. Pract. Exp. 2024, 36, e7943. [CrossRef]
36. Tournaire, T.; Castel-Taleb, H.; Hyon, E. Efficient Computation of Optimal Thresholds in Cloud Auto-scaling Systems. ACM
Trans. Model. Perform. Eval. Comput. Syst. 2023, 8, 1–31. [CrossRef]
37. Kushchazli, A.; Safargalieva, A.; Kochetkova, I.; Gorshenin, A. Queuing Model with Customer Class Movement across Server
Groups for Analyzing Virtual Machine Migration in Cloud Computing. Mathematics 2024, 12, 468. [CrossRef]
38. Heimerson, A.; Eker, J.; Årzén, K.E. A Proactive Cloud Application Auto-Scaler using Reinforcement Learning. In Proceedings of
the 2022 IEEE/ACM 15th International Conference on Utility and Cloud Computing (UCC), Vancouver, WA, USA, 6–9 December
2022; IEEE: Piscataway, NJ, USA, 2022; pp. 213–220.
39. Khaleq, A.A.; Ra, I. Intelligent autoscaling of microservices in the cloud for real-time applications. IEEE Access 2021, 9, 35464–35476.
[CrossRef]
40. Joyce, J.E.; Sebastian, S. Reinforcement learning based autoscaling for kafka-centric microservices in kubernetes. In Proceedings
of the 2022 IEEE 4th PhD Colloquium on Emerging Domain Innovation and Technology for Society (PhD EDITS), Bangalore,
India, 4–5 November 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–2.
41. Alnawayseh, S.E.; Muhammad, M.H.G.; Hassan, Z.; Fatima, M.; Aslam, M.S.; Ibrahim, A.; Ateeq, K. Resource Provisioning in
Cloud Computing using Fuzzy Logic Control System: An adaptive approach. In Proceedings of the 2023 International Conference
on Business Analytics for Technology and Security (ICBATS), Dubai, United Arab Emirates, 7–8 March 2023; IEEE: Piscataway,
NJ, USA, 2023; pp. 1–9.
42. Mahapatra, A.; Majhi, S.K.; Mishra, K.; Pradhan, R.; Rao, D.C.; Panda, S.K. An energy-aware task offloading and load balancing
for latency-sensitive IoT applications in the Fog-Cloud continuum. IEEE Access 2024, 12, 14334–14349. [CrossRef]
43. Abdulazeez, D.H.; Askar, S.K. A Novel Offloading Mechanism Leveraging Fuzzy Logic and Deep Reinforcement Learning to
Improve IoT Application Performance in a Three-Layer Architecture within the Fog-Cloud Environment. IEEE Access 2024, 12,
39936–39952. [CrossRef]
44. Luan, S.; Shen, H. Minimize Resource Cost for Containerized Microservices Under SLO via ML-Enhanced Layered Queueing
Network Optimization. In Proceedings of the 2024 14th International Conference on Cloud Computing, Data Science &
Engineering (Confluence), Noida, India, 18–19 January 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 631–637.
45. Jensen, A. AI-Driven DevOps: Enhancing Automation with Machine Learning in AWS. Integr. J. Sci. Technol. 2024, 1.
46. Chhabra, M.; Arora, G. The Impact of Machine Learning (ML) Driven Algorithm Ranking and Visualization on Task Scheduling
in Cloud Computing. In Proceedings of the 2023 3rd International Conference on Advancement in Electronics & Communication
Engineering (AECE), Ghaziabad, India, 23–24 November 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 969–972.
47. Yang, R.; Ouyang, X.; Chen, Y.; Townend, P.; Xu, J. Intelligent resource scheduling at scale: A machine learning perspective. In
Proceedings of the 2018 IEEE Symposium on Service-Oriented System Engineering (SOSE), Bamberg, Germany, 26–29 March
2018; IEEE: Piscataway, NJ, USA, 2018; pp. 132–141.
48. Spantideas, S.; Giannopoulos, A.; Cambeiro, M.A.; Trullols-Cruces, O.; Atxutegi, E.; Trakadas, P. Intelligent Mission Critical
Services over Beyond 5G Networks: Control Loop and Proactive Overload Detection. In Proceedings of the 2023 International
Conference on Smart Applications, Communications and Networking (SmartNets), Istanbul, Turkey, 25–27 July 2023; IEEE:
Piscataway, NJ, USA, 2023; pp. 1–6.
49. Pfister, B.J.; Scheinert, D.; Geldenhuys, M.K.; Kao, O. Daedalus: Self-Adaptive Horizontal Autoscaling for Resource Efficiency of
Distributed Stream Processing Systems. arXiv 2024, arXiv:2403.02093.
50. Gkontzis, A.F.; Kotsiantis, S.; Feretzakis, G.; Verykios, V.S. Temporal Dynamics of Citizen-Reported Urban Challenges: A
Comprehensive Time Series Analysis. Big Data Cogn. Comput. 2024, 8, 27. [CrossRef]
51. Westergaard, G.; Erden, U.; Mateo, O.A.; Lampo, S.M.; Akinci, T.C.; Topsakal, O. Time Series Forecasting Utilizing Automated
Machine Learning (AutoML): A Comparative Analysis Study on Diverse Datasets. Information 2024, 15, 39. [CrossRef]
52. Muccini, H.; Vaidhyanathan, K. A machine learning-driven approach for proactive decision making in adaptive architectures. In
Proceedings of the 2019 IEEE international conference on software architecture companion (ICSA-C), Hamburg, Germany, 25–26
March 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 242–245.
53. Ahn, Y.W.; Cheng, A.M.; Baek, J.; Jo, M.; Chen, H.H. An auto-scaling mechanism for virtual resources to support mobile,
pervasive, real-time healthcare applications in cloud computing. IEEE Netw. 2013, 27, 62–68. [CrossRef]
54. Dougherty, B.; White, J.; Schmidt, D.C. Model-driven auto-scaling of green cloud computing infrastructure. Future Gener. Comput.
Syst. 2012, 28, 371–378. [CrossRef]
55. Grimson, M.; Almeida, R.; Shi, Q.; Bai, Y.; Angarita, H.; Pacheco, F.S.; Schmitt, R.; Flecker, A.; Gomes, C.P. Scaling Up Pareto
Optimization for Tree Structures with Affine Transformations: Evaluating Hybrid Floating Solar-Hydropower Systems in the
Amazon. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–28 February 2024;
Volume 38, pp. 22067–22075.
56. Jayanetti, A.; Halgamuge, S.; Buyya, R. Multi-Agent Deep Reinforcement Learning Framework for Renewable Energy-Aware
Workflow Scheduling on Distributed Cloud Data Centers. IEEE Trans. Parallel Distrib. Syst. 2024, 35, 604–615. [CrossRef]
57. Baynum, A.; Hao, W. Exploring the Impact of Cloud Computing and Edge Computing on Resource Consumption for Mobile
Devices with Generative Artificial Intelligence APIs. In Proceedings of the 2024 IEEE 14th Annual Computing and Communication
Workshop and Conference (CCWC), Las Vegas, NV, USA, 8–10 January 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 0603–0609.
58. Shi, T.; Ma, H.; Chen, G.; Hartmann, S. Auto-Scaling Containerized Applications in Geo-Distributed Clouds. IEEE Trans. Serv.
Comput. 2023, 16, 4261–4274. [CrossRef]
59. Meng, C.; Tong, H.; Wu, T.; Pan, M.; Yu, Y. Multi-Level ML Based Burst-Aware Autoscaling for SLO Assurance and Cost Efficiency.
arXiv 2024, arXiv:2402.12962.
60. Choonhaklai, P.; Chantrapornchai, C. Two Autoscaling Approaches on Kubernetes Clusters Against Data Streaming Applications.
In Proceedings of the 2023 International Technical Conference on Circuits/Systems, Computers, and Communications (ITC-CSCC),
Jeju, Republic of Korea, 25–28 June 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1–6.
61. Vu, D.D.; Tran, M.N.; Kim, Y. Predictive hybrid autoscaling for containerized applications. IEEE Access 2022, 10, 109768–109778.
[CrossRef]
62. Ju, L.; Singh, P.; Toor, S. Proactive autoscaling for edge computing systems with kubernetes. In Proceedings of the IEEE/ACM
International Conference on Utility and Cloud Computing Companion, Leicester, UK, 6–9 December 2021; pp. 1–8.
63. Wang, Y.; Li, Y.; Guo, J.; Fan, Y.; Chen, L.; Zhang, B.; Wang, W.; Zhao, Y.; Zhang, J. On-demand provisioning of computing
resources in computing power network with mixed CPU and GPU. In Proceedings of the 2023 21st International Conference on
Optical Communications and Networks (ICOCN), Qufu, China, 31 July–3 August 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1–3.
64. Jeon, J.; Jeong, B.; Jeong, Y.S. Intelligent resource scaling for container based digital twin simulation of consumer electronics.
IEEE Trans. Consum. Electron. 2023, 70, 3131–3140. [CrossRef]
65. Gu, R.; Chen, Y.; Liu, S.; Dai, H.; Chen, G.; Zhang, K.; Che, Y.; Huang, Y. Liquid: Intelligent resource estimation and network-
efficient scheduling for deep learning jobs on distributed GPU clusters. IEEE Trans. Parallel Distrib. Syst. 2021, 33, 2808–2820.
[CrossRef]
66. Hwang, R.H.; Peng, M.C.; Huang, C.W.; Lin, P.C.; Nguyen, V.L. An unsupervised deep learning model for early network traffic
anomaly detection. IEEE Access 2020, 8, 30387–30399. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.