Load balancing is the technique of distributing a computational workload across many machines to improve both processing speed and resource utilization. It ensures that no machine is overburdened while others sit idle; by distributing tasks evenly across the network's nodes, load balancing achieves optimal performance [8]. Because the cloud is essentially a shared pool of resources, its primary goals are to speed up responses, cut expenses, and boost overall performance. In cloud computing, distributing the workload of several virtual machines over multiple nodes increases resource utilization and user satisfaction. Load balancing allows each node to carry out its tasks effectively while keeping inter-node communication fast. A load balancer is placed between clients and servers to divide the workload from many clients across multiple servers. Load balancing improves performance measures such as execution time, makespan, response time, efficiency, and throughput. Distributing network traffic through cloud load balancing promotes efficiency and dependability in a cloud environment: the goal is to speed up responses while maximizing available resources [9].
Load balancing is an essential technique in a smart healthcare system, where it minimizes delays and enhances system efficiency. It means the equitable allocation of tasks and processing load among numerous resources, such as servers or processors, so that no single resource becomes overburdened. Both static and dynamic techniques can be employed to achieve load balancing in a smart healthcare system. Optimizing load balancing is widely acknowledged as crucial for mitigating latency in intelligent healthcare systems, for several reasons. First, load balancing guarantees an equitable distribution of processing load among available resources, reducing the risk that any individual resource becomes overburdened and introduces latency; timeliness is of utmost significance in healthcare, where delays can have life-threatening consequences for patients. Second, load balancing improves resource utilization, resulting in faster processing times and fewer delays. Third, load balancing helps guarantee the scalability and availability of a system: it allows healthcare systems to manage escalating service demand and expand their processing capacity while maintaining high availability [10]. Finally, by allocating processing load across multiple resources, load balancing can shorten request processing and response times, enhancing the user experience and mitigating delays. Load balancing primarily aims to achieve:
- Maximizing the use of available resources
- Increasing the amount of work done per unit of time
- Ensuring a quick response
- Preventing the overloading of any single resource.
3.1 Parameters for Load Distribution
In the current load-balancing algorithms, many different parameters are taken into consideration.
- Resource Utilization: This parameter assesses how effectively resources are being used. An effective load-balancing algorithm must maximize resource utilization.
- Performance: This metric measures how effectively the system meets its goals; it should be as high as possible.
- Scalability: The capacity of an algorithm to distribute the system's load over a growing number of nodes. This measure should be maximized.
- Throughput: The number of tasks carried out successfully per unit of time. High throughput is necessary for good performance.
- Response time: The time a given load-balancing mechanism in a distributed system takes to respond to a request; it should be minimized.
- Overhead: The extra work needed to implement a load-balancing method. Task migration and inter-process communication costs are the primary contributors to overhead.
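To make these parameters concrete, the sketch below computes makespan, throughput, average response time, and per-node utilization from a small set of completed tasks. The task records and the two-node system are illustrative assumptions, not measurements from a real deployment.

```python
# Illustrative metric calculations over a hypothetical two-node task trace.
# Each record holds the node a task ran on and its start/finish times.
tasks = [
    {"node": 0, "start": 0.0, "finish": 4.0},
    {"node": 0, "start": 4.0, "finish": 7.0},
    {"node": 1, "start": 0.0, "finish": 5.0},
]

makespan = max(t["finish"] for t in tasks)      # time until the last task completes
throughput = len(tasks) / makespan              # tasks completed per unit time
avg_response = sum(t["finish"] - t["start"] for t in tasks) / len(tasks)

# Utilization of each node: its total busy time divided by the makespan.
busy = {}
for t in tasks:
    busy[t["node"]] = busy.get(t["node"], 0.0) + (t["finish"] - t["start"])
utilization = {node: b / makespan for node, b in busy.items()}

print(makespan, throughput, avg_response, utilization)
```

A load-balancing algorithm is judged by how it moves these numbers: lower makespan and response time, higher throughput, and utilization values close to 1 on every node.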
3.2 Types of Load Balancing Algorithms
Load balancing algorithms fall into two categories:
3.2.1 Static Load Balancing
Static load balancing techniques distribute the load among the servers in a system based on prior knowledge of the system's characteristics, such as memory, processing power, and typical performance. Unlike dynamic load balancing algorithms, static algorithms do not require information about the current state of the nodes: the load distribution is determined at compile time from pre-existing knowledge of all the nodes and their properties. Static techniques are straightforward to implement because they do not need to monitor the running system. However, they are better suited to systems with low load fluctuation [11].
3.2.1.1 Round Robin Algorithm
The Round Robin algorithm is commonly employed in conventional load balancing methodologies. This policy distributes time slices uniformly and circularly among all tasks: tasks are allocated to virtual machines in a circular pattern, with a time-slice mechanism employed for data processing [12]. Time is partitioned into discrete slots, and each entity is allocated a distinct period in which to execute its task. If a node cannot complete its task within the specified time slice, it must wait for the subsequent slot. Notably, this algorithm does not follow a priority-based scheduling policy; hence, some nodes may carry a heavy workload while others remain lightly loaded, leading to an uneven distribution of the system's total load.
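The circular assignment described above can be sketched in a few lines. The task and VM names are hypothetical; the point is that the policy ignores task weight entirely, which is the source of the uneven loading noted above.

```python
from itertools import cycle

def round_robin_assign(tasks, vms):
    """Assign tasks to VMs in a fixed circular order (static policy)."""
    assignment = {}
    vm_cycle = cycle(vms)  # endlessly repeats vm1, vm2, vm3, vm1, ...
    for task in tasks:
        assignment[task] = next(vm_cycle)
    return assignment

# Six tasks over three VMs: each VM receives every third task,
# regardless of how heavy the individual tasks are.
print(round_robin_assign(["t1", "t2", "t3", "t4", "t5", "t6"],
                         ["vm1", "vm2", "vm3"]))
```

If t1 and t4 happen to be the heaviest tasks, both land on vm1, illustrating how Round Robin can skew the actual load even though the task count is balanced.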
3.2.2 Dynamic Load Balancing
Dynamic algorithms rely on the system's current state to make decisions, without considering any previous information about the system. They weigh multiple criteria, such as transfer, selection, location, and information policies, to achieve an equitable distribution of the load [13], and they react to dynamic changes in the state of the servers: if a server becomes overwhelmed at any point, part of its load is moved to another server carrying a lighter burden. Dynamic load balancing divides the workload among servers at run time, determining how much work each server receives while the application is executing. This requires real-time communication between the servers. Within this architecture, traffic is dynamically distributed across multiple servers.
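A minimal sketch of the dynamic idea, under the simplifying assumption that each task's load is known on arrival: every task is routed to whichever server currently carries the least load, so the placement adapts at run time rather than being fixed in advance.

```python
import heapq

def dynamic_assign(task_loads, num_servers):
    """Send each arriving task to the currently least-loaded server."""
    # Min-heap of (current_load, server_id), updated as tasks arrive.
    heap = [(0.0, s) for s in range(num_servers)]
    heapq.heapify(heap)
    placement = []
    for load in task_loads:
        current, server = heapq.heappop(heap)  # lightest server right now
        placement.append(server)
        heapq.heappush(heap, (current + load, server))
    return placement

# A heavy first task (8) pushes all later tasks onto server 1.
print(dynamic_assign([8, 2, 3, 1], 2))
```

Contrast this with Round Robin, which would have alternated servers and left server 0 with a load of 11 versus 3 on server 1.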
3.2.2.1 Ant Colony Optimization Algorithm
The ACO algorithm is a probabilistic meta-heuristic approach inspired by the ability of ant colonies to locate the shortest path between their colony and a food source. This type of cooperative searching for food sources by actual ants is used to tackle various optimization issues, such as determining the shortest path across a graph [14]. At first, an ant will depart from its colony and go along a route that has been chosen at random to investigate the neighborhood for potential sources of food. Pheromones are chemical signals secreted by ants to facilitate an indirect form of communication with one another. On its trip back to the colony, the ant disperses a quantity of pheromone along the path that is proportionate to the quantity and quality of the food it has found along the way.
Consequently, subsequent ants are more likely to follow routes with a high concentration of pheromone. Ultimately, all the ants travel between their nest and the food source via the shortest, most direct route. When there are m ants and n potential paths, each ant chooses a path with probability biased toward the path carrying the highest pheromone concentration among all possible paths [15].
The ACO method has been applied to similar scheduling problems in the cloud, for instance scheduling virtual machines on cloud resources and scheduling tasks on virtual machines, with the goal of balancing the load across virtual machines and reducing task response time [16]. The ACO algorithm has also been used for deadline-aware job scheduling in fog computing within a tiered Internet of Things architecture. Ants stand out among insects because of their collective behavior. Individual ants have poorly developed memory, so each ant's behavior appears to be a largely independent aspect of the colony [17]. Nevertheless, ants carry out tasks requiring high stability and trustworthiness through their unique style of communication, and this ability is the inspiration behind ant colony optimization. Although each ant has limited memory, its individual behavior includes a random, unpredictable component. When ants travel between their colony and a food source, they deposit pheromones that dissipate over time; the strength of the pheromone along a path therefore increases or decreases depending on how heavily the path is used, since pheromone also evaporates [18]. Over time, the shortest route accumulates the highest concentration of pheromone, which encourages the other ants to follow it when foraging. Through this self-organizing, self-training strategy, ants complete a variety of challenging tasks in a dependable and consistent manner.
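The pheromone mechanics described above can be sketched with a toy ACO over a handful of alternative paths. The path lengths and parameter values are illustrative assumptions; a real scheduler would map "paths" to task-to-VM assignments and "length" to cost or response time.

```python
import random

def aco_shortest_path(path_lengths, ants=50, iters=30, rho=0.5, seed=1):
    """Toy ACO: ants pick a path with probability proportional to its
    pheromone; shorter paths receive a larger deposit, and pheromone
    evaporates each iteration at rate rho."""
    random.seed(seed)
    tau = [1.0] * len(path_lengths)  # initial pheromone on each path
    for _ in range(iters):
        deposits = [0.0] * len(path_lengths)
        for _ in range(ants):
            # Roulette-wheel selection weighted by pheromone level.
            r, acc, choice = random.uniform(0, sum(tau)), 0.0, 0
            for i, t in enumerate(tau):
                acc += t
                if r <= acc:
                    choice = i
                    break
            deposits[choice] += 1.0 / path_lengths[choice]  # quality-based deposit
        tau = [(1 - rho) * t + d for t, d in zip(tau, deposits)]  # evaporate + deposit
    return tau.index(max(tau))  # path with the strongest trail

# Three candidate paths; the shortest (length 2) accumulates the most pheromone.
print(aco_shortest_path([5.0, 2.0, 8.0]))
```

The positive feedback loop is visible in the update line: heavily used short paths gain pheromone faster than evaporation removes it, while long paths fade.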
3.2.2.2 Particle Swarm Optimization (PSO)
Particle Swarm Optimization (PSO), inspired by cooperative swarm behavior, was first proposed by Kennedy and Eberhart in 1995. PSO is a metaheuristic, swarm-based optimization approach developed by studying the behavior of swarm intelligence, such as that exhibited by flocks of birds and schools of fish [19]. The particles' velocities and positions, like the birds', are guided by the availability of food in the search area. During the search, some members of the flock prove better at detecting food; accordingly, the particles adjust their velocities toward their own best-known positions (the local best) and converge on the best position found by the swarm (the global best) [20]. PSO therefore obtains the optimal solution by combining the local best and the global best.
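A minimal PSO sketch follows, minimizing a simple sphere function as a stand-in objective; in a load-balancing setting the objective would instead score a candidate task-to-VM mapping. The parameter values (inertia w, acceleration coefficients c1 and c2) are conventional defaults, not values from the cited works.

```python
import random

def pso_minimize(f, dim=2, particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal PSO: particles move toward their personal best (local best)
    and the swarm's best-known position (global best)."""
    random.seed(seed)
    pos = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    pbest = [p[:] for p in pos]           # each particle's best position
    pbest_val = [f(p) for p in pos]
    g = min(range(particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # swarm's best position
    for _ in range(iters):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])   # pull to local best
                             + c2 * r2 * (gbest[d] - pos[i][d]))     # pull to global best
            for d in range(dim):
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Sphere function: the global minimum is 0 at the origin.
best, best_val = pso_minimize(lambda x: sum(v * v for v in x))
print(best_val)
```

The velocity update is where the "local best plus global best" combination from the text appears: each particle is pulled stochastically toward both attractors at once.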
3.2.2.3 Honey Bee
The Honey Bee Load Balancing Algorithm is a technique used in cloud computing to distribute incoming traffic across multiple servers to ensure efficient utilization of resources and avoid overloading any one server [21]. It is inspired by the behavior of honeybees in a hive, where worker bees dynamically allocate tasks to available resources to optimize the overall performance of the hive [22]. By dynamically distributing requests based on server performance, the Honey Bee Load Balancing Algorithm helps to ensure that resources are efficiently utilized and prevents overloading any one server. The enhancement of cloud-based services' overall performance and reliability can be achieved through this approach [23].
There are three different kinds of bees that participate in the Honey Bee Load Balancing Algorithm: scouts, foragers, and recipients.
Scouts are responsible for monitoring the performance of each server in the cloud and reporting back to the hive. For request handling, the other two roles carry out allocation and routing: foragers play a pivotal part by collecting incoming user requests and making informed decisions about the most suitable server to handle each one [24], while receivers accept these requests from the foragers and ensure they are directed to the appropriate server.
The algorithm operates through a well-defined process:
- Periodically, scouts evaluate the performance metrics of each server within the cloud. These metrics include CPU usage, memory usage, and network latency. The collected information is then relayed back to the central control, called the hive.
- Foragers, armed with the performance data provided by the scouts, analyze incoming requests from users. Their task is to assess which server would be the best fit for each request. Factors such as the current load on each server and its processing capacity are considered during this evaluation.
- Once the foragers have determined the optimal server for a request, they promptly assign it to that server and notify the corresponding receiver about the assignment.
- Receivers act as the intermediaries between the foragers and the servers. They are responsible for receiving the requests assigned by the foragers and directing them to the designated server as indicated by the assignment.
- Servers, having received the requests, diligently process them and produce the required results, which are then returned to the user who initiated the request.
- The scouts continuously monitor the servers' performance, allowing the algorithm to adapt and respond to changing conditions within the cloud. By constantly reporting back to the hive, the scouts provide the insights that enable effective decision-making.
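The scout/forager/receiver cycle above can be sketched as three cooperating functions. The server names, capacities, and unit request costs are illustrative assumptions; a real implementation would use measured CPU, memory, and latency metrics as described in the steps.

```python
# Toy honey-bee-style dispatcher with the three roles from the text.
servers = {"s1": {"load": 0.0, "capacity": 10.0},
           "s2": {"load": 0.0, "capacity": 8.0}}

def scout():
    """Scout role: report each server's current relative load to the hive."""
    return {name: s["load"] / s["capacity"] for name, s in servers.items()}

def receive(name, cost):
    """Receiver role: direct the assigned request to the designated server."""
    servers[name]["load"] += cost

def forage(request_cost):
    """Forager role: use the scouts' report to pick the best-fit server."""
    report = scout()
    target = min(report, key=report.get)  # least relatively loaded server
    receive(target, request_cost)
    return target

# Nine equal-cost requests: because foragers compare *relative* load,
# the higher-capacity server s1 ends up handling more of them.
placements = [forage(1.0) for _ in range(9)]
print(placements)
```

After nine requests, s1 holds five and s2 four, mirroring the text's claim that the algorithm spreads work according to each server's capacity rather than a fixed rotation.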