Design and Implementation of Efficient Load Balancing Algorithm in Grid Environment
Abstract- Grid technology has emerged as a new way of large-scale distributed computing with high-performance orientation. Grid computing is being adopted in various areas, from academic and industry research to government use. Grids are becoming platforms for high-performance and distributed computing. Grid computing is the next-generation IT infrastructure that promises to transform the way organizations and individuals compute, communicate and collaborate. The goal of Grid computing is to create the illusion of a simple but large and powerful self-managing virtual computer out of a large collection of connected heterogeneous systems sharing various combinations of resources. The main goal of load balancing is to provide a distributed, low-cost scheme that balances the load across all the processors. To improve the global throughput of Grid resources, effective and efficient load balancing algorithms are fundamentally important. The focus of this project is on analyzing load balancing requirements in a Grid environment and proposing a centralized and sender-initiated load balancing algorithm. In this work we propose an efficient load balancing algorithm which optimizes the response time and latency with respect to the server.

Keywords- Load balancing, grid computing

I. INTRODUCTION

The rapid development in computing resources has enhanced the performance of computers and reduced their costs. This availability of low-cost powerful computers, coupled with the popularity of the Internet and high-speed networks, has led the computing environment to be mapped from distributed to Grid environments. In fact, recent research on computing architectures has allowed the emergence of a new computing paradigm known as Grid computing. A Grid is a type of distributed system which supports the sharing and coordinated use of geographically distributed and multi-owner resources, independently of their physical type and location, in dynamic virtual organizations that share the same goal of solving large-scale applications [1]. In order to fulfill user expectations in terms of performance and efficiency, the Grid system needs efficient load balancing algorithms for the distribution of tasks. A load balancing algorithm attempts to improve the response time of users' submitted applications by ensuring maximal utilization of available resources. The main goal is to prevent, if possible, the condition where some processors are overloaded with a set of tasks while others are lightly loaded or even idle. Although the load balancing problem in conventional distributed systems has been intensively studied, new challenges in Grid computing still make it an interesting topic, and many research projects are under way [2]. This is due to the characteristics of Grid computing and the complex nature of the problem itself. Load balancing algorithms in classical distributed systems, which usually run on homogeneous and dedicated resources, cannot work well in Grid architectures. Grid Resource Management is defined as the process of identifying requirements, matching resources to applications, allocating those resources, and scheduling and monitoring Grid resources over time in order to run Grid applications as efficiently as possible. Resource discovery is the first phase of resource management. Scheduling and monitoring is the next step. The scheduling process directs the job to an appropriate resource and the monitoring process monitors the resources. Resources which are heavily loaded act as servers of tasks, and resources which are lightly loaded act as receivers of tasks. Tasks are migrated from heavily loaded nodes to lightly loaded nodes. Resources are dynamic in nature, so the load of resources varies with changes in the configuration of the Grid; load balancing of the tasks in a Grid environment can therefore significantly influence the Grid's performance [5].

II. LOAD BALANCING CATEGORIES

The load balancing problem has been discussed in traditional distributed systems literature for more than two decades, and various algorithms, strategies and policies have been proposed, classified and implemented. Load balancing algorithms can be classified into two categories, static and dynamic [4].

A. Static load balancing Algorithms

Static load balancing algorithms allocate the tasks of a parallel program to workstations based either on the load at the time nodes are allocated to some task, or on the average load of the workstation cluster.

Fig. 1 Static Load Balancing

The decisions related to load balance are made at compile time when resource requirements are estimated. The advantage of this sort of algorithm is its simplicity in terms of both implementation and overhead, since there is no need to constantly monitor the workstations for performance statistics.
Sandip S. Patil et al, / (IJCSIT) International Journal of Computer Science and Information Technologies, Vol. 2 (5) , 2011, 2159-2164
However, static algorithms work well only when there is not much variation in the load on the workstations. Clearly, static load balancing algorithms are not well suited to a Grid environment, where loads may vary significantly at various times.

A few static load balancing techniques are:
• Round-Robin Algorithm: tasks are passed to processes in sequential order; when the last process has received a task, the schedule continues with the first process (a new round).
• Randomized Algorithm: the allocation of tasks to processes is random.
• Simulated Annealing or Genetic Algorithms: a mixed allocation procedure that includes optimization techniques.

Drawbacks of Static Load Balancing Algorithms
• It is very difficult to estimate a priori, in an accurate way, the execution time of the various parts of a program.
• Sometimes there are communication delays that vary in an uncontrollable way.
• For some problems the number of steps needed to reach a solution is not known in advance.

B. Dynamic load balancing Algorithms

As the name suggests, dynamic load balancing algorithms take decisions at run time and use current or recent load information when making distribution decisions. In a Grid environment, dynamic load balancing algorithms allocate or reallocate resources at run time with no a priori task information, and determine when and which tasks have to be migrated.

Fig. 2 Dynamic Load Balancing

When used effectively, dynamic load balancing algorithms can provide a significant improvement in performance over static algorithms. But this comes at the additional cost of collecting and maintaining load information, so it is important to keep these overheads within reasonable limits.

III. LOAD BALANCING STRATEGIES

There are three major parameters which usually define the strategy of a specific load balancing algorithm. Some load balancing strategies are discussed in this section [7].

A. Sender-Initiated v/s. Receiver-Initiated Strategies

In sender-initiated policies, congested nodes attempt to move work to lightly-loaded nodes. In receiver-initiated policies, lightly-loaded nodes look for heavily-loaded nodes from which work may be received. The sender-initiated policy performs better than the receiver-initiated policy at low to moderate system loads, because at these loads the probability of finding a lightly-loaded node is higher than that of finding a heavily-loaded node. Similarly, at high system loads, the receiver-initiated policy performs better since it is much easier to find a heavily-loaded node. As a result, adaptive policies have been proposed which behave like sender-initiated policies at low to moderate system loads, while at high system loads they behave like receiver-initiated policies [3].

B. Global v/s. Local Strategies

Global or local policies answer the question of what information will be used to make a load balancing decision. In global policies, the load balancer uses the performance profiles of all available workstations. In local policies, workstations are partitioned into different groups. The benefit of a local scheme is that performance profile information is only exchanged within the group. The choice of a global or local policy depends upon the behavior which an application will exhibit. For global schemes, balanced load convergence is faster than for a local scheme, since all workstations are considered at the same time [3].

C. Centralized v/s. De-centralized Strategies

A load balancing strategy is categorized as either centralized or de-centralized (distributed); the distinction defines where load balancing decisions are made. In a centralized scheme, the algorithm is located on one master workstation node and all decisions are made there. In a de-centralized scheme, the load balancer is replicated on all workstations. Different algorithms are used in de-centralized schemes for job selection [7], such as the round-robin algorithm and the random polling algorithm.

IV. LOAD BALANCING POLICIES

Load balancing algorithms can be based on many policies; some important policies are defined below [7].
• Information policy: This policy specifies what workload information should be collected, when it is to be collected and from where.
• Triggering policy: This policy determines the appropriate period to start a load balancing operation.
• Resource type policy: This policy classifies a resource as server or receiver of tasks according to its availability status.
• Location policy: This policy uses the results of the resource type policy to find a suitable partner for a server or receiver.
• Selection policy: This policy defines the tasks that should be migrated from overloaded resources (source) to the most idle resources (receiver).

The main objective of load balancing methods is to speed up the execution of applications on resources whose workload varies at run time in an unpredictable way. Hence it is important to define metrics to measure the resource workload [4]. Every dynamic load balancing method must estimate timely workload information for each resource. The success of a load balancing algorithm depends upon stability of the number of messages (small overhead), the support environment, low-cost update of the workload, and short mean response time, which is a significant measurement for a user [3]. It is also essential to measure
the communication cost induced by a load balancing operation, but achieving all of these poses a great challenge in a Grid environment.

V. LOAD BALANCING MECHANISM

There are several load balancing mechanisms, such as virtual machine migration, node reconfiguration by user level thread migration, Robin Hood (an active objects load balancing mechanism for intranet), the load graph based method and data consolidation [6].

A. Virtual Machine Migration (Live Migration)

In virtual machine migration, snapshots of a machine are sent to another machine, which is why it is called virtual machine migration. There are two methods for virtual machine migration: live migration and regular migration. In live migration, a running domain is migrated between different host machines without stopping the job for the duration of the transfer; the job is paused only briefly while the required data is gathered, and then it resumes. This is possible only within the same layer-2 network and IP subnet. In regular migration, the job is generally stopped and then migrated.

B. Node Reconfiguration by User Level Thread Migration

This mechanism migrates the application workload from a source node to a destination node, and then lets the source node depart from the original computing environment. There are two mechanisms for this: node reconfiguration by user-level thread migration and node reconfiguration by kernel-level thread migration. Node reconfiguration by user-level thread migration is discussed in this survey. There are two implementation methods of node reconfiguration, synchronous and asynchronous. In the synchronous method, all nodes are paused during reconfiguration. In the asynchronous method, all nodes continue to work while reconfiguration takes place. The synchronous method may reduce performance even though it is easier to design. Better performance can be obtained with the asynchronous method, as long as more attention is paid to correctly maintaining the order of node reconfiguration messages [7].

C. Robin Hood: An Active Objects Load Balancing Mechanism for Intranet

The Robin Hood algorithm presents a totally non-centralized solution that uses a multicast channel to communicate and synchronize the processors, and ProActive tools to migrate jobs between them. ProActive techniques are very useful and provide mobility and security in a uniform framework. This work focuses on dynamic load balancing. The main objective of the algorithm is to improve the decision time in a non-centralized environment. The mechanism considers two basic things: first, knowing the local load, and second, transferring load from a heavily loaded node to a less loaded node. It uses a non-centralized architecture and does not broadcast the balance of each node, in order to reduce the overhead in the network. It is a totally non-centralized load balancing mechanism, using the ProActive library for the migration of jobs and a multicast channel for node coordination.

D. Load Graph Based Transfer Method

The load graph based method is based on a network graph where each node is represented with its load; the load can be the number of users, the average queue length or the memory utilization. It uses an analytic model and a single load determination policy throughout the system, and load is determined on the basis of memory utilization and average queue length. The algorithm is based on a three-layered structure. The top layer is the load balancing layer, which takes care of token generation and of decisions about task transfer. The middle layer, called the monitoring layer, acts as an interface between the top and bottom layers and monitors load changes. The third layer, called the communication layer, takes care of the actual task transfer [6].

V. PROPOSED LOAD BALANCING ALGORITHM

Load balancing is defined as the allocation of the work of a single application to processors at run-time so that the execution time of the application is minimized. This chapter discusses the design of the proposed load balancing algorithm.

The choice of a load balancing algorithm for a Grid environment is not always an easy task. Various algorithms have been proposed in the literature, and each of them varies based on some specific application domain. Some load balancing strategies work well for applications with large parallel jobs, while others work well for short, quick jobs. Some strategies are focused on handling data-heavy tasks, while others are more suited to parallel tasks that are computation-heavy. While many different load balancing algorithms have been proposed, there are basic steps that nearly all algorithms have in common:
• Monitoring workstation performance (load monitoring)
• Exchanging this information between workstations (synchronization)

An efficient load balancing algorithm makes the Grid middleware efficient, which ultimately leads to fast execution of applications in a Grid environment [20]. In this work, an attempt has been made to formulate a decentralized, sender-initiated load balancing algorithm for Grid environments which is based on different parameters. One of the important characteristics of this algorithm is that it estimates system parameters such as the CPU utilization of each participating node.

VI. DESIGN OF LOAD BALANCING ALGORITHM

Load balancing should take place when the load situation has changed. There are some particular activities which change the load configuration in a Grid environment. The activities can be categorized as follows:
• Selection of static or dynamic load balancing category.
• Defining the various parameters.
• Connection with the server.
• Sending threads to the server and executing results.

For static load balancing, host information (such as IP address, port and request URL) is first collected from the user. On execution, the program tries to connect to the host. If the connection succeeds, it simulates a number of requests to the host using a number of threads, then retrieves the results from the server and populates them into the view area.
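The static procedure described above (take the host details, issue a number of requests from several threads, collect the results) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the address 192.0.2.10, the port and the path are all hypothetical, and `send_request` is a stand-in that sleeps briefly instead of opening a real connection.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def send_request(host, port, path):
    """Hypothetical stand-in for one request to the host; returns a status code."""
    time.sleep(0.001)  # pretend the server needed about 1 ms to answer
    return 200


def timed_request(host, port, path):
    """Send one request and return (status, elapsed time in milliseconds)."""
    start = time.perf_counter()
    status = send_request(host, port, path)
    return status, (time.perf_counter() - start) * 1000.0


def simulate_load(host, port, path, num_requests, num_threads):
    """Issue num_requests requests using a pool of num_threads worker threads."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [pool.submit(timed_request, host, port, path)
                   for _ in range(num_requests)]
        return [f.result() for f in futures]


# Assumed example values; a real run would take these from the user.
results = simulate_load("192.0.2.10", 8080, "/index.html",
                        num_requests=20, num_threads=4)
mean_ms = sum(ms for _, ms in results) / len(results)
print(f"{len(results)} requests, mean response time {mean_ms:.2f} ms")
```

Replacing `send_request` with a real client call would turn the sketch into an actual load generator; the collected list plays the role of the results shown in the view area.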
Fig. 3: Flow Chart of Overview of Algorithm
Fig. 5: Flow Chart of Dynamic load balancing Algorithm
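As a rough illustration of the sender-initiated idea discussed in Sections I and III (heavily loaded resources hand tasks to lightly loaded ones, based on an estimate such as CPU utilization), a minimal sketch might look as follows. It is not the algorithm shown in the flow chart: the thresholds 0.8 and 0.3, the function names and the node names are assumptions made only for this example.

```python
def classify(cpu_utilization, high=0.8, low=0.3):
    """Label each node 'sender', 'receiver' or 'neutral' by its CPU utilization.

    The thresholds are assumed values, not taken from the paper.
    """
    roles = {}
    for node, load in cpu_utilization.items():
        if load >= high:
            roles[node] = "sender"      # heavily loaded: wants to give work away
        elif load <= low:
            roles[node] = "receiver"    # lightly loaded: can accept work
        else:
            roles[node] = "neutral"
    return roles


def plan_migrations(cpu_utilization, high=0.8, low=0.3):
    """Pair the most loaded senders with the least loaded receivers."""
    roles = classify(cpu_utilization, high, low)
    senders = sorted((n for n, r in roles.items() if r == "sender"),
                     key=lambda n: cpu_utilization[n], reverse=True)
    receivers = sorted((n for n, r in roles.items() if r == "receiver"),
                       key=lambda n: cpu_utilization[n])
    # zip stops at the shorter list, so each receiver is used at most once.
    return list(zip(senders, receivers))


loads = {"n1": 0.95, "n2": 0.10, "n3": 0.55, "n4": 0.85, "n5": 0.25}
print(plan_migrations(loads))  # [('n1', 'n2'), ('n4', 'n5')]
```

Here the decision is taken from one table of loads, which corresponds to a centralized view; in a de-centralized scheme each sender would run the same pairing step on whatever load information it has collected.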
Latency is a measure of the time delay experienced in a system, the precise definition of which depends on the system and the time being measured. Latency may have different meanings in different contexts. In simulation applications, 'latency' refers to the time delay, normally measured in milliseconds (1/1,000 sec), between an initial input and an output clearly discernible to the simulator trainee or simulator subject. Latency is sometimes also called transport delay. Some authorities distinguish between latency and transport delay by using the term 'latency' in the sense of the extra time delay of a system over and above the reaction time of the vehicle being simulated, but this requires a detailed knowledge of the vehicle dynamics and can be controversial.

whether the condition is temporary or permanent. The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address. This status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable.
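Latency in the sense described here, the delay between issuing an input and observing its output, can be measured in milliseconds with a small timing helper. The following is a minimal sketch under stated assumptions: the names `measure_latency_ms` and `summarize` are hypothetical, and the timed operation is a placeholder computation, not a request to a real server.

```python
import statistics
import time


def measure_latency_ms(operation, repeats=5, clock=time.perf_counter):
    """Run operation() several times and return each elapsed time in ms.

    The clock is injectable so the helper can be checked with a fake timer.
    """
    samples = []
    for _ in range(repeats):
        start = clock()
        operation()
        samples.append((clock() - start) * 1000.0)
    return samples


def summarize(samples):
    """Return the mean and the worst-case latency of the samples."""
    return {"mean_ms": statistics.mean(samples), "max_ms": max(samples)}


# Placeholder workload standing in for one request/response round trip.
samples = measure_latency_ms(lambda: sum(range(10_000)))
print(summarize(samples))
```

Reporting both the mean and the maximum is a design choice for this example; a single slow response can matter to a user even when the mean response time is short.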