2. State-of-the-Art
Following our recent research performed on the various static and dynamic load balancing algorithms [10], as subsequent results of the extension of the study presented in [2], we try to demonstrate the weaknesses and disadvantages of existing load balancing, especially when they increase [1]. Thus, our proposal is to design and model a new approach of load balancing by running together one of the meta-heuristic algorithms presented in this state-of-the-art model. In our model, the metaheuristic algorithms and some load balancing algorithms from the two categories (static and dynamic) will be presented in order to choose the meta-heuristic that suits with Phase 1 of our implementation.
2.1 Load Balancing Algorithms
Load balancing algorithms are used to distribute new requests of users in a data center between the VMs, according to each technical [7,11], in order to guarantee an equal number of tasks allotted to each machine. According to the study presented on these algorithms in our model, we find that there are two types of algorithms: the static and the dynamic [12,13]. Before assigning tasks to the emphasized VMs, these algorithms remain uninterested in the task characteristics and, therefore, the degree of load unbalance between the VMs as the load increases.
2.1.1 Static algorithms
The statics load balancing algorithms is an assignment from a set of tasks to a set of resources which can take either a deterministic or a probabilistic form [4]. In addition, static approaches are defined usually in the design or implementation of systems [13]. Their main drawback is that the current state of the system is not considered when making the decisions and therefore it is not a suitable approach in systems such as distributed systems where most states of the system changes dynamically. We will present some of these algorithms as follow.
• Throttled load balancing algorithm
This algorithm ceases the state of the virtual machine to find out whether or not it is available for the new allocation. The algorithm then sends the acknowledged VM identifier (ID) to the data center controller for a new allocation and the achievement of the task. When the task processing is fulfilled, the virtual machine sends the result to the data center controller, which notifies the algorithm for an allocation cut-off [14].
• Central manager algorithm
The central load manager assigns tasks to a VM in each new allocation with a minimum load compared to the other machines. From time to time, it updates the system’s load status as it experiences changes. This status allows the data center to make the correct load balancing decision when it is creating new tasks [10].
• Active monitoring load balancer
The active monitoring loan balancer is an algorithm that counts the minimum number of tasks assigned to each virtual machine and sends its ID to the data center controller, which notifies the algorithm to adjust the allocation and incrementation in its table with the new number of tasks assigned to the machine owning the identified ID [14].
• Round robin algorithm
This task allocation order is based on the FCFS technical [15]. Requests are distributed among VMs in turns using a data center controller regardless of the different allocated tasks characteristics [16]. This algorithm is known for its fair allocation of the VMs to all nodes [17], but it doesn’t have control over the workload distribution.
• Weighted round robin algorithm
This algorithm is similar to the ‘round robin’ in a sense that the manner by which requests are assigned to the nodes is still cyclical, except that the node with the higher specifications will be given a greater number of requests [18].
• Threshold algorithm
In this algorithm, the processes are assigned immediately to hosts that are being locally created. While keeping a copy of the system’s load, the processor’s load can either be underloaded, medium, or overloaded. The main drawback of this algorithm is the fact that the tasks are allocated locally, which means that there can be processors that are overloaded while others are underloaded. That issue will cause a significant increase in the execution of the tasks [10].
• Opportunistic load balancing algorithm
In this approach, each unexecuted task is appointed randomly to an available node [19]. The main goal of this approach is to keep the node’s load full [17]. Even if it provides load balancing, the main drawback of this algorithm is its inability to calculate the current execution time of the node.
• Min-Min algorithm
In this approach, the task having the minimum execution time will be the first to be assigned to a VM that can complete the task [20]. This algorithm outperforms other algorithms when there are more tasks that have a small execution compared to the number of tasks that have a long execution time [17]. The task having the maximum execution time will stay in the queue for an undetermined time period. This way of processing will only result in a poor utilization of the VMs [21].
• Opportunistic load balancing algorithm + Min-Min algorithm
This approach is a combination of the opportunistic load balancing (OLB) algorithm and the Min- Min algorithm [22]. The OLB algorithm is used in order to keep every node’s workload full, and the LBB algorithm is used to minimize the execution time of the tasks assigned to the nodes. This approach is known for its better resource utilization and the enhancement of the work’s efficiency, but it does not tolerate error [17].
• Improved weighted round robin
This algorithm is based on the ‘weighted round robin’. It is similar in regard to the cyclic tasks allocation, except that it considers the priority and the length of the task to choose the most suitable VM [23].
A general disadvantage of all static schemes is that the final selection of a host for process allocation is made when the process is created and cannot be changed during process execution to make changes in the system load.
2.1.2 Dynamic algorithms
Unlike static algorithms, dynamic algorithms consider the actual load of the VMs [24]. The goal of the dynamic algorithms is to decrease the number of errors caused by the static algorithms [11]. Some of these algorithms are presented below.
• Central queue algorithm
The data center controller has a main queue in which tasks are classified in first-in, first-out (FIFO) order. If a virtual machine goes into the “underload” status, this main queue sends a request for a new task allocation to the data center controller, which deletes the task from the queue and sends it directly to the specified recognized machine [10].
• Local queue algorithm
Under this algorithm, all VMs get local queues, and, when they go into underload mode, it looks for other tasks to allocate from the other further away VMs. The advantage of this algorithm is the dynamic migration and efficient allocation of all the tasks loaded in the data center controller to the VMs [25].
• Mini time processing load balancer
The architect of this algorithm has improved the design of the ‘Efficient response time load balancer’ algorithm in a new one called ‘Processing time load balancer’, which considers the current state of the VM workload through the processing time, which is a main metric of this algorithm [26].
• Token ring algorithm
In this algorithm, tokens are moved around the system in order to minimize the system cost. Heuristic approach can be used to remove the drawback of token routing algorithm. The Token Ring algorithm provides fast and efficient routing decisions. Hence, no communication overhead is generated in this approach [27].
• Least connections
This algorithm is based on the selection of the service with the least number of active connections to ensure that the load of the active requests is balanced on the services. This method is the default method because it provides the best performance [28].
• Weighted least connections
This algorithm introduces a “weight” component based on the respective capacities of each server. Just like the ‘weighted round robin’, each server’s “weight” must be specified beforehand. A load balancer that implements the ‘weighted least connections’ algorithm considers two things: the weight capacities of each server and the current number of clients currently connected to each server [29].
• Trust management algorithm
This algorithm proposes a trust value management for the cloud Infrastructure-as-a-Service (IAAS) parameters. The main objectives of this algorithm are to increase the resource utilization and to decrease the request response time of the system. It also enhances the QoS based on the proposed model, but it does not provide a formal description of the expressed model [30].
• Cloud friendly load balancing
This algorithm aims at reducing the energy consumption, timing penalty, and execution time for a virtualized environment [31].
• Two-phase load balancing
This algorithm is shown in service nodes, service managers, and request managers [32]. This algorithm uses the request manager, which is a bottleneck. It also has a lack of simulation, and it does not provide the performance metrics.
• Stochastic hill climbing
Stochastic hill climbing (SHC) is a variant of Hill Climbing that deals with the bottleneck problem. Unlike other algorithms, the SHC gives a better performance than the FCFS and ‘round robin’ algorithms, but it faces a lack of resources utilization [33].
• L3B algorithm
This algorithm is placed between users and cloud nodes. L3B improves the cloud performance and reduces power consumption and customer cost through a mechanism that initializes a suitable VM if the incoming workload exceeds beyond a certain threshold and switch-off VM if the workload decreases in compression with a certain threshold [34].
• Cloud partition load balancing
Cloud partition load balancing aims at improving the efficiency in the public cloud environment. The algorithm uses algorithms with low complexity for underloaded situations in partitions. Unfortunately, there are neither simulations nor implementations of this solution [35].
• VM-based two-dimensional load management
This algorithm reduces system overhead by reducing migration. However, it is considered only for applications with seasonal attribute change, which is not extendable for a real system. This algorithm also suffers from the absence of resources utilization [36].
• DAIRS
This approach aims at balancing the workload in data centers by relying on three parameters: CPU, memory, and network bandwidth. It also uses four queues: waiting queue, requesting queue, optimizing queue, and deleting queue. This approach is not suitable for cloud systems due to the fact that it is centralized [37].
• TOPSIS method
This method chooses which VM should be selected for migration to a new physical machine and which physical machine should host the selected VM [38].
• EuQoS
The EuQoS system for scheduling VMs consists of two parts: load balancer and agent-based monitor. The load balancer module provides three mechanisms: balance triggering, EuQoS scheduling, and VM control. The distribution of workload system is done by weighted round-robin load balancing algorithm [39].
• Ant colony optimization
This algorithm, as its name states, is based on the behavior of ants to detect the location of underloaded or overloaded nodes. It then updates the resources utilization table [40]. This algorithm is known for its scalability but has low throughput [17].
• Bee-MMT
This approach uses the artificial bee colony algorithm with the feature of minimal migration time [41].
• Improved ABC
The main purpose of this approach is to suggest a load balancing strategy for cloud computing systems. This way, the system’s throughput is improved. The drawback of this method is the lack of resources utilization and its instability [42].
• Dynamic adaptive replica strategy (DARS)
This approach is based around the idea of creating replicas of nodes if they become overloaded and stores them in other nodes in order to release their workload [43]. If it can reduce the number of overloaded nodes, even if it is a decentralized approach, it does not compromise the access delay. One of the main features of DARS is the fact that creating replicas and storing them are up to the nodes themselves, depending on the state of their loads.
2.2 Comparison
The performance of various load balancing algorithms is measured by using the following parameters.
1) Overload rejection
If load balancing is not possible, additional overload rejection measures are needed. When the overload situation ends, then, first, the overload rejection measures are stopped. After a short guard period, load balancing is also closed down [44].
2) Fault tolerant
This parameter shows if an algorithm is able to tolerate tortuous faults or not. It enables an algorithm to continue operating properly in the event of some failure. If the performance of the algorithm decreases, the decrease is proportional to the seriousness of the failure. Even a small failure can cause total failure in load balancing [44].
3) Forecasting accuracy
Forecasting is the degree of conformity of calculated results to its actual value that will be generated after execution. Static algorithms provide more accuracy than dynamic algorithms as in former most assumptions are made during compile time and later during execution [44].
4) Stability
Stability can be characterized in terms of the delays in the transfer of information between processors and the gains in the load balancing algorithm by obtaining faster performance in a specified amount of time [44].
5) Centralized or decentralized
Centralized schemes store global information at a designated node. All sender or receiver nodes access the designated node to calculate the amount of load transfers and also to check that tasks are to be sent to or received from. In a distributed load balancing, every node executes balancing separately. The idle nodes can obtain load during runtime from a shared global queue of processes [44].
6) Nature of load balancing algorithms
Static load balancing assigns load to nodes probabilistically or deterministically without consideration of runtime events. It is generally impossible to make predictions of arrival times of loads and processing times required for future loads. On the other hand, in dynamic load balancing, the load distribution is made during run-time based on current processing rates and network condition. A DLB policy can use either local or global information [44].
7) Cooperative
This parameter shows whether processors share information between them in making the process allocation decision other are not during execution. What this parameter defines is the extent of independence that each processor has in concluding how it should use its own resources. In the cooperative situation, all processors have the accountability to carry out its own portion of the scheduling task, but all processors work together to achieve a goal of better efficiency. In the noncooperative, individual processors act as independent entities and arrive at decisions about the use of their resources without any effect on the rest of the system [44].
8) Process migration
The process migration parameter shows the capacity of a node to transfer a process to another node. It decides whether to create it locally or create it on a remote processing element. The algorithm is capable of deciding whether to make changes of load distribution during execution of process or not [44].
9) Resource utilization
Resource utilization includes automatic load balancing. A distributed system may have unexpected number of processes that demand more processing power. If the algorithm is capable of utilizing resources, they can be moved to underloaded processors more efficiently [44].
The comparison of some of the load balancing algorithms quoted above is shown in Table 1.
Parametric comparison of load balancing algorithms
2.3 Meta-Heuristic Algorithm
A meta-heuristic algorithm is formally defined as an interactive generation process that directs the exploration and the use of the research space. Meta-heuristic algorithms are among the techniques that can be used to solve complex problems, including the load balancing which is highlighted in this contribution.
2.3.1. Tabu search
Tabu search is the method that allows tracking of the regions of space solutions that have already been searched. It starts from an initial random solution and moves successively to each neighbor of the current solution. It stands on the concept of taboo list, which is a special short-term memory formed by the previously visited solutions. Indeed, instead of a complete solution, the short-term memory carries only some attributes of every solution, and, then, it gives no access to the considered solutions and avoids making a loop on the optimal solutions [45,46].
2.3.2 Genetic algorithm
Genetic algorithms aim to find solutions to difficult problems. Their basic idea is to generate an initial random population formed with individual solutions to the problem called chromosomes, and then develop it following a number of iterations called generations. During each generation, every chromosome is evaluated by some measurement of aptness. To create the next generation, new chromosomes, called offspring, are formed either by fusing two current generation chromosomes using an operator crossover or by altering a chromosome using an operator mutation. A new generation is formed by selection, according to the correctness values, and, in such way, some of the parents and children are thrown to maintain a constant size of the population. Regulator chromosomes have higher probabilities of being selected. After several generations, the algorithms converge towards the best chromosome, which represents the optimal solution to the problem [47].
2.3.3 Simulated annealing algorithm
The simulated annealing (SA) technique was initially proposed to solve the hard-combinatorial optimization problems through controlled randomization by simulating the temperature falling procedure of particular systems in thermodynamics. It is a technique to find a better solution for an optimization problem by trying random variations of the current solution. The main feature is that a worse variation may be accepted as a new solution with a probability, which results in the SA’s major advantage over other searching methods. That is, the ability to avoid becoming trapped at local minima. Theoretically, SA is able to find the global optimal solution with probability equal to 1 [48].
2.3.4 Honey bee behavior algorithm
This algorithm aims at realizing a well-balanced load among all VMs in order to maximize the system efficiency. The suggested algorithm equally balances the tasks priorities on machines so that the time spent in the waiting queue may be minimized [49].
2.3.5 Bat-algorithm
Bat-algorithm is a modeling inspired by the method of echolocation. It is used by “micro-bats” to identify the shortest iteration to the “prey”. Developers try to improve this meta-heuristic in order to meet the requirements of NP-hard computations [50]:
1. All bats use echolocation to realize the distance and the difference between obstacles and prey.
2. To search for prey, bats randomly fly with a velocity (vi) at position (xi), a fixed frequency (fmin), and a wavelength and a varying loudness (A0). They can adjust automatically the wavelength (or frequency) and the emission rate of the pulse (r2) (0, 1) depending on the closeness to their target prey.
3. Although the loudness can vary in several ways, we suppose that the loudness fluctuates from a big loudness (A0) (positive) to a constant minimal value (Amin).
We chose to use the Bat-algorithm to develop our algorithm due to the fact that it is simple, flexible, and easy to implement. It can also solve a wide range of problems efficiently. There is also the fact that it works well with complicated problems and can give an optimal solution in quick time.
3. System Model and Algorithm
Following our research proposed in [2], we noticed that the load balancing based only on the statute of the VM is insufficient to guarantee an ideal balancing. Consequently, we propose in this article a new approach of load balancing based on several parameters which influence normal unfolding of the burden-sharing and the allowance of the tasks between the VMs of a data center. We separated our modeling in two parts. The first part is the pre-classification of the tasks in an ascending order based on the meta-heuristics “Bat-algorithm”. The second is the allowance of the tasks to the numbers of the virtual machines having more or less equal performances according to Fig. 1 in lower part.
3.1 Classification of the Tasks Based on Bat-Algorithm
For our first part, we chose to collaborate with the research carried out by [50] and [51]. This choice has as objective pre-classifying the spots according to the required levels suitable, and the resources. Let us note that our model is adaptable with all the heuristic preset ones in the state of the art or else. It is used to seek the level of suitable classification according to Fig. 2 in lower parts.
Pre-classification based on “Bat-algorithm”.
Stages of our first classification (Bat-algorithm):
― Initialize the parameters of Bat-algorithm (position, loudness, speed) [15].
― Initialize the levels of classifications: to assign the first task Ti to a level (R.Ri).
― Seek second site R.Ri+1 of the task Ti+1 containing: R.Ri +1 in {R.Ri, R.Ri*N, (R.Ri) /N}
― Use Bat-algorithm in order to detect the ideal site of the following tasks Ti+1 in the levels of classification
― Identify the number of the box of the Ti+1 task with the N while= R.Ri+1/R.Ri
― Launch the selection of the good sites for the following tasks Ti+1 according to the resources requested (R.Ri+1), one using path-algorithm
― Repeat the operation for all the tasks of the segment requested
― Pass to the second part of our model of load balancing in lower part
3.2 Allocation of the Tasks to the Virtual Machines
Thanks to the Bat-algorithm, the tasks are now redistributed in different levels. The choice of maximum number of the levels of classification depends on the different required resources of the tasks, which arrive in a data center. The levels presented in the form of a table of temporal scheduling have parameters of alterable classification according to the types of the programs arriving at our model of load balancing (meta-heuristics wished).
Whatever the metric of classification (execution time, processing time, standard of the tasks, resources requested… etc.), our approach successively begins from a lower level up to a higher level (key of our balancing) provided that
R.Rs is the metric of a higher-level task (R.R=resource required for the task); R.Ri is the metric of a lower level task and NR is the number of ranking of the higher level.
Example of Fig. 3, if the number of the desired levels is 7, then R.Ri = 1 and R.Rs= 7:
Levels of classification.
Thanks to our technique of pre-classification presented before, we can
― have an equal distribution of the tasks pre-classified in each level of scheduling.
― obtain a load balancing between the virtual machines provided that:
If the number NR (number of the levels) is even:
Nbr.VMs=Nbr.level/2
If the number NR is odd:
Nbr.VMs=(Nbr.level+1)/2
Nbr.VMs is the number of the virtual machines, Nbr.level is the number of the levels.
In hoping to develop the performance of the load balancing parameters, we undertook a deep study on the different existing algorithms. As a result, we found out that they are not taking into consideration the parameters of the tasks to balance the load. Thus, we developed a hybrid model based on two different, but complimentary, approaches:
― Pre-classifying tasks by using the meta-heuristic Bat-algorithm, and determining the levels by calculating the metric using Eq. (1).
― Determining the number of sufficient VMs by using the levels found previously, and by taking into consideration whether the levels are even or odd.
4. Experiment and Results
4.1 Experiment
In order to evaluate the performance of the suggested load balancing algorithm, we implemented it on CloudSim, which presents the different stages of this process.
We proceeded to the simulation of a data center, initially consisting of several VMs of the same type. Each of these will receive a single task with random metric (execution time, response time, etc.).
In this scenario, we will treat an odd number of tasks (Fig. 4). In each instance, we will have the same number of odd tasks distributed on the VMs.
Now, to explain how our pre-classifications method shown above works, we will take the example of time T6 shown in Fig. 4. By performing a task scheduling method, our algorithm will order the tasks in different levels in an ascending order depending on the metric used (execution time or response time). Then, we will select the number of tasks that will take in charge these tasks. Because we have here an odd number of tasks (5), we will follow the rule stated above: Nbr.VMs= (Nbr.levels+1)/2. By following that rule, we will only use three virtual machines to take in charge these tasks in this scenario. The allocation of the tasks is shown in Fig. 5, where the task with the maximal metric value will be given to a VM, then the goal is to give to the other VMs all the tasks whose metric sum will be almost equal to the maximal metric value. In this example, the maximal metric value is 150.
Allocation of the tasks to the VMs.
This provides a balance in the distribution of the workload on a lower number of virtual machines compared to the initially used machines. The experiments that we conducted consist of executing the algorithm presented above.
Our algorithm adds a new concept of elasticity, which is the calculation of the optimal number of virtual machines (Fig. 6) depending on the received tasks at time T (Fig. 4).
Calculation of the optimal number of VMs.
Allocation of tasks in their respective level.
In this case, we noted that in some scenarios the sum of the metric values of the tasks is not equal to the value of the maximal metric. Next, we will explain this scenario by using the T1 in Fig. 4.
The values of the maximal and minimal metric are 80 and 5, respectively. As a result, we put the task whose metric is 5 in Level 1. Level 2 is then created by respecting the following condition: level n = n* Level 1. This means that Level 2 will only include tasks whose metrics values are around 10. By following this condition, we will have 16 levels (Fig. 7), which means that we will be using 8 VMs: Nbr.VMs=Nbr.level/2.
The new maximal metric that will be used is Mmax-i+ Mmin+i, where M is the metric’s value of the tasks in the level and i goes from 0 to the number of levels minus 1 in this example, the maximal metric will be 85 (Fig. 8).
At this stage, the allocation of the tasks to the VMs will start. If there is no task in a Level X, then its metric will be 0. In this example, the levels that contain the tasks are the Levels 1, 3, 8, 14, and 16. This explains why there are 4 VMs out of 8 VMs that are in standby (T1 in Fig. 6). We then used these VMs in standby to treat tasks from the next segment of tasks (Fig. 9). This allows us to clear the task queue, which will reduce waiting time.
Determination of the exact process time.
4.2 Result
We tested the evolution process of our approach by following three steps:
Step 1: The pre-estimation of the classification metric using the Bat-algorithm. Fig. 10 shows the initial variation of the workload.
Step 2: Adding the elasticity aspect of our approach to the previous aspect. Fig. 11 shows the impact of the determination of the number of VMs based on the levels of the tasks on the workload.
Step 3: Adding the local queue migration aspect to the previous aspects. The result of the workload is shown on Fig. 12.
Impact of the elasticity on the workload.
Workload results of the local queue migration.
The results shown above are resumed by Fig. 13 to show the positive impact of the full utilization of our approach on the workload. They are explained as follows.
Results of the positive impact of the full utilization of our approach.
The curves show that our new algorithm provides better results in terms of workload. We notice that it meets the performance criteria of a load balancing algorithm. Indeed, the various spikes that have been recorded at the beginning of the curve in purple reflect the maximization of the use of resources, which fluidizes subsequently the waiting line management. Based on the curves and the data from both Figs. 4 and 6, we notice that the algorithm uses at time T the exact optimal number required in terms of virtual machines.