Proceedings of 2014 RAECS UIET Panjab University Chandigarh, 06 – 08 March, 2014

Workflow Scheduling Algorithms in Cloud Environment - A Survey

Lokesh Kumar Arya, Amandeep Verma
University Institute of Engg. & Technology, Panjab University, Chandigarh, India

Abstract: Cloud computing is an emerging IT field. In the cloud, service providers manage resources and provide them to users. Software or hardware can be used on a rental basis; there is no need to buy them. Most cloud applications are modeled as workflows. In a workflow, an application completes its overall task by executing various sub-tasks in a particular order. Managing these tasks plays a key role in cloud computing systems. Workflow scheduling is the most important part of cloud computing because, based on different criteria, it determines cost, execution time and other performance measures. This review paper gives an introduction to cloud computing and describes the basics of workflows and scheduling, some scheduling algorithms used in workflow management, the factors considered by these algorithms, the type of each algorithm and the tool used.

Keywords: Scheduling algorithm; Workflow; Cloud computing; Scheduling; DAG

I. INTRODUCTION

Cloud computing provides services, shared resources or common infrastructure on demand through the internet. A specific service provider provides these facilities and charges for what a customer uses, which is called pay per use [1]. Customers can use storage space, processing capabilities, servers, operating systems and application development environments. Users can scale resources up and down in a timely, on-demand manner in the cloud [2]. The cloud also provides the flexibility of accessing resources from different devices. On the cloud, users can manage, develop and deploy their applications with the help of resource virtualization.

There are several types of cloud computing environments, but they are mainly classified into three, described as follows [8]:

Private clouds (internal clouds) - These clouds exist within an organization because they provide special benefits to that organization. There is an initial investment in building, managing and buying private clouds. They improve average server utilization.

Public clouds - This environment is for the public. Third parties or vendors manage these clouds and offer services to customers. They are widely used for the management, development and deployment of applications. They are highly scalable and reliable, but security is a significant concern in public clouds.

Hybrid clouds - A combination of both private and public environments.

The following are some examples of clouds:

• Amazon's Elastic Compute Cloud (EC2) provides resizable compute capacity (CPU cycles) to users [3].

• Amazon's Simple Storage Service (S3) provides a facility to retrieve and manage large quantities of data at any time and from anywhere over the internet. This service is provided on a rental basis [4].

• CRM services provided by salesforce.com, which can manage customer information without installing any specialized software [5].

Some characteristics of cloud computing are [7]:

Application Programming Interfaces (APIs) - provided to access the services on the cloud.

Reliability and availability - the chance of infrastructure failure is minimal, so the cloud is more reliable and highly available.

Multi-sharing - multiple applications and users can work more efficiently, with cost reductions, because they share common infrastructure.

Scalability - resources can scale up or down according to business requirements.

Figure 1 shows the 3-layer architecture, or the three types of services, provided by cloud computing [6].

Fig 1: Cloud Computing Services [6]

Infrastructure as a Service (IaaS) - In IaaS, the service provider provides resources to users without revealing details such as location and hardware.

978-1-4799-2291-8/14/$31.00 ©2014 IEEE


Platform as a Service (PaaS) - The service provider offers several environments to users for application development. Users can develop applications according to their requirements.

Software as a Service (SaaS) - The service provider offers software or applications over the internet, and customers use them with no knowledge of development or maintenance. An example is CRM.

With efficient workflow scheduling, high performance can be achieved in the cloud. It is assumed that each task is performed on one processor at a time and cannot be preempted [9]. In this paper, we explain cloud computing, workflows and some scheduling algorithms with their parameters. The rest of the paper is organized as follows: workflows and their basics are discussed in Section II. Section III is about scheduling. Descriptions and metrics of some scheduling algorithms are given in Section IV. Section V concludes the paper.

II. WORKFLOWS

In general, workflow applications are executed in a sequence because they are considered a group of different jobs that together achieve a particular result [10]. These tasks are executed based on their data dependencies and have a parent-child relationship: a parent task must be executed before its child task. A workflow application is typically represented as a directed acyclic graph (DAG), G(v, e), where v represents the nodes (tasks) and e represents the data dependencies among tasks. The following diagram shows a workflow.

Fig 2: Workflow representation as a graph

Figure 2 shows the dependencies among different tasks in a workflow graph G. The child tasks 1, 2, 3 and 4 are executed after parent task 0. A child node takes its input from the output of its parent node. Task 0 acts as the entry node and task 9 acts as the exit node. Task 9 is executed after the completion of tasks 5, 6, 7 and 8.

In a task graph, an entry task is a task that has no parent, and an exit task is a task that has no child. Task scheduling algorithms require exactly one entry task and exactly one exit task. Makespan indicates the performance of a workflow and is calculated as the ending time minus the starting time of the workflow.

Each node and edge in the graph is assigned a particular weight. It has been observed that the choice of technique for calculating the values of the edges and nodes of the DAG has a considerable effect on the schedule [11]. Several techniques for calculating weights are as follows [11]:

Median value (ME) - To calculate the weight of a node, it considers the median execution cost of running the task on every machine. To calculate the weight of an edge, it considers the median communication cost between the two tasks.

Mean value (M) - Same as the median value, but it considers the mean instead of the median.

Best value (B) - To find the weight of a node, it considers the minimum execution cost. The value of an edge is calculated as the data transfer cost between the two machines that have the minimum execution cost for the related tasks.

Worst value (W) - To find the weight of a node, it considers the worst (maximum) execution cost. The value of an edge is calculated as the data transfer cost between the two machines that have the maximum execution cost for the related tasks.

Simple best value (SB) - Considers the best value of the execution and communication costs.

Simple worst value (SW) - Considers the worst value of the execution and communication costs.

III. SCHEDULING CONCEPT

Scheduling in the context of the cloud means choosing the most suitable resource for task execution, or allocating machines to tasks in such a way that the completion time (makespan) is minimized. Generally, scheduling algorithms construct a list of tasks by assigning a priority to every task. Tasks are chosen according to their priorities and assigned to the processor that fulfills a predefined objective function [12]. There are two types of scheduling algorithms. The first is static, which has information about the estimated job execution time, the complete structure and the mapping of resources before execution. The second is dynamic, which estimates this information when a job is in the ready state, before execution [13].

Fig 3: Types of scheduling algorithms (a scheduling algorithm is either a static algorithm or a dynamic algorithm)

A new list scheduling algorithm for heterogeneous environments has been proposed [14]. Most work in scheduling was limited to a single workflow application, but there are cases where multiple-workflow scheduling is required [15]. To solve this, the authors give two approaches. The first is the composition approach, in which the algorithm joins many workflows and then applies scheduling. The second is the fairness approach, in which values are re-calculated after the completion of each task before the next scheduling decision is taken. Multiple workflows submitted at different times by different users can also be managed online [16].
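The weighting schemes of Section II and the priority-list idea of Section III can be illustrated with a minimal sketch. It is not the method of any cited paper: it assumes a HEFT-style priority (node weight plus the longest weighted path to the exit task) and an earliest-finish-time rule for choosing the processor. The workflow, the cost tables and all names (`CHILDREN`, `EXEC`, `COMM`, `list_schedule`) are hypothetical, and edge weights are simplified to one fixed transfer cost per edge rather than the per-scheme edge values described above. Only the ME, M, B and W node weights are shown.

```python
from statistics import mean, median

# Hypothetical 4-task workflow (a DAG): task -> list of child tasks.
CHILDREN = {"t0": ["t1", "t2"], "t1": ["t3"], "t2": ["t3"], "t3": []}
# Hypothetical execution cost of each task on machines m0 and m1.
EXEC = {"t0": [4, 6], "t1": [3, 5], "t2": [8, 2], "t3": [5, 5]}
# Hypothetical transfer cost per edge, paid only when the two tasks
# run on different machines (a simplification of the edge weights).
COMM = {("t0", "t1"): 2, ("t0", "t2"): 3, ("t1", "t3"): 1, ("t2", "t3"): 4}

# Node-weight schemes: median (ME), mean (M), best (B), worst (W).
WEIGHT_SCHEMES = {"ME": median, "M": mean, "B": min, "W": max}


def node_weight(task, scheme="M"):
    """Collapse the per-machine execution costs of a task into one weight."""
    return WEIGHT_SCHEMES[scheme](EXEC[task])


def upward_rank(task, scheme="M", memo=None):
    """Priority: node weight plus the longest weighted path to the exit task."""
    memo = {} if memo is None else memo
    if task not in memo:
        tail = max((COMM[(task, c)] + upward_rank(c, scheme, memo)
                    for c in CHILDREN[task]), default=0)
        memo[task] = node_weight(task, scheme) + tail
    return memo[task]


def list_schedule(scheme="M"):
    """Place tasks in priority order on the machine that finishes them first."""
    parents = {t: [p for p in CHILDREN if t in CHILDREN[p]] for t in CHILDREN}
    # With positive weights a parent always outranks its children,
    # so descending rank is a valid topological order.
    order = sorted(CHILDREN, key=lambda t: -upward_rank(t, scheme))
    machine_free = [0, 0]   # time at which each machine becomes idle
    placed = {}             # task -> (machine, start, finish)
    for task in order:
        best = None
        for m, cost in enumerate(EXEC[task]):
            # Input data is ready once every parent has finished and,
            # if the parent ran elsewhere, its output has been transferred.
            ready = max((placed[p][2]
                         + (0 if placed[p][0] == m else COMM[(p, task)])
                         for p in parents[task]), default=0)
            start = max(ready, machine_free[m])
            finish = start + cost
            if best is None or finish < best[2]:
                best = (m, start, finish)
        placed[task] = best
        machine_free[best[0]] = best[2]
    # Makespan: ending time minus starting time of the workflow.
    makespan = (max(f for _, _, f in placed.values())
                - min(s for _, s, _ in placed.values()))
    return placed, makespan
```

Calling `list_schedule("M")` on this sample data gives a makespan of 14. Passing another scheme can change the priority order and therefore the schedule, which is the effect reported in [11].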
IV. EXISTING WORKFLOW SCHEDULING ALGORITHMS

The following workflow scheduling algorithms are important for the cloud environment. They are summarized in Table I.

A PSO-based Heuristic for Scheduling Workflow [17]
This paper proposed a particle swarm optimization (PSO) based algorithm that schedules applications considering both execution cost and data transfer cost. The paper compared its cost savings with the existing 'Best Resource Selection' (BRS) algorithm. PSO achieved a better distribution of workload across resources, with three times the cost savings.

Workflow Scheduling for SaaS / PaaS [18]
This paper presented an integer linear program (ILP) formulation to schedule a SaaS customer's workflows onto multiple IaaS providers. It was able to find low-cost solutions, and the proposed heuristics were effective when deadlines were larger. The paper also considered scheduling multiple workflows on the same group of resources and left fault tolerance mechanisms for future work.

Scheduling Scientific Workflows Elastically [19]
This paper proposed the SHEFT (Scalable HEFT) scheduling algorithm, which allows the number of resources to increase and decrease at runtime. Because resources can scale at runtime, it performs better in optimizing workflow execution time.

Optimized Resource Scheduling Algorithm [20]
This paper addresses the optimal use of resources through virtual machines. It used an Improved Genetic Algorithm (IGA), which selects optimal VMs by introducing a dividend policy. Compared with the traditional GA scheduling method, IGA was almost twice as fast and achieved higher resource utilization.

Multiple QoS Constrained Scheduling Algorithm [21]
This paper introduced multiple-QoS-constrained scheduling. It scheduled multiple workflows that were started at different instants. The strategy considerably increased the scheduling success rate and scheduled dynamically while minimizing execution time and cost.

Deadline and Budget Distribution based Cost-Time Optimization Algorithm [22]
This paper proposed the DBD-CTO workflow scheduling algorithm, which considers two constraints: deadline and budget. It minimized computation cost while meeting the required deadline.

Revised Discrete PSO Algorithm [23]
It scheduled applications considering both data transfer cost and execution cost. It was compared with the standard PSO and BRS algorithms on makespan, cost savings and cost optimization ratio, and achieved better performance, with large cost savings, on cost optimization and makespan.

Improved Cost-based Algorithm [24]
In this paper, the authors proposed an improved cost-based scheduling algorithm. It measured computation performance and resource cost. It also increased the execution/data transfer ratio by combining tasks; tasks are combined by analyzing the processing capability of the resources.

Deadline Constraint Heuristic based Genetic Algorithm [25]
This paper proposed Heuristic based Genetic Algorithms (HGAs). They schedule applications so as to lower the computation cost while completing tasks within the deadline. The algorithm performed well compared with the Standard Genetic Algorithm (SGA).
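Several of the algorithms above, such as [17] and [23], score a task-to-resource mapping by combining execution cost and data transfer cost. The sketch below shows what such a cost function might look like, under the assumption that the objective is a plain sum of the two parts; the cited papers define their own cost models. On this tiny instance an exhaustive search stands in for PSO, which would instead sample mappings iteratively. All names and numbers (`EXEC_COST`, `EDGE_DATA`, `TRANSFER_PRICE`) are hypothetical.

```python
from itertools import product

# Hypothetical execution cost of each task on resources 0 and 1.
EXEC_COST = {"t0": [1.0, 1.5], "t1": [2.0, 1.2], "t2": [1.8, 1.1]}
# Hypothetical amount of data sent along each workflow edge.
EDGE_DATA = {("t0", "t1"): 10, ("t0", "t2"): 20}
# Hypothetical price per data unit between resources (free when co-located).
TRANSFER_PRICE = [[0.0, 0.05], [0.05, 0.0]]


def mapping_cost(mapping):
    """Execution cost plus data transfer cost of a task -> resource mapping."""
    exec_part = sum(EXEC_COST[task][res] for task, res in mapping.items())
    transfer_part = sum(size * TRANSFER_PRICE[mapping[src]][mapping[dst]]
                        for (src, dst), size in EDGE_DATA.items())
    return exec_part + transfer_part


def cheapest_mapping():
    """Try every mapping; feasible only for very small workflows."""
    tasks = sorted(EXEC_COST)
    resources = range(len(TRANSFER_PRICE))
    best = min(product(resources, repeat=len(tasks)),
               key=lambda choice: mapping_cost(dict(zip(tasks, choice))))
    return dict(zip(tasks, best))
```

For the sample data, `cheapest_mapping()` places all three tasks on resource 1: the saved transfer cost outweighs the slightly higher execution cost of `t0` there. A metaheuristic such as PSO or a genetic algorithm would use a function like `mapping_cost` as its fitness measure when the search space is too large to enumerate.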

Table I: Summary of algorithms

| Method used in algorithm | Factors considered | Explanation | Tool used |
| --- | --- | --- | --- |
| Particle Swarm Optimization [17] | Time, cost optimization, resource utilization | Minimized the total execution cost by building a model for allocating tasks to processors. It used PSO to solve the task-resource mapping and updated the costs in every scheduling loop to optimize the cost. | Amazon EC2 |
| Integer linear program (ILP) formulation [18] | Makespan, cost, time | The SaaS provider executed tasks with the help of IaaS providers. The scheduling algorithm determines which IaaS provider and VM have to be used to fulfill QoS. | Java environment |
| SHEFT workflow scheduling algorithm [19] | Scalability, execution time | Scheduled a workflow on a cloud environment elastically. Optimized workflow execution time and increased the scalability of resources during workflow execution. | CloudSim |
| Improved Genetic Algorithm (IGA) [20] | Execution time, resource utilization, speed, CPU utilization | Generated an automated scheduling policy to achieve the best scheduling. Improved CPU utilization and determined a solution that satisfies all user-preferred QoS constraints. | Eucalyptus |
| Multiple QoS Constrained Scheduling Algorithm [21] | Time, cost, makespan, scheduling success rate | Considered QoS requirements. Four factors explained in the paper greatly affect the total cost and makespan of a workflow: scheduling success rate, mean execution time, QoS requirements and mean execution cost. | CloudSim |
| DBD-CTO algorithm [22] | Cost, time | Lowers the computation cost while completing tasks within the deadline; considers two constraints: deadline and budget. | Java environment |
| Particle Swarm Optimization [23] | Makespan and cost optimization | Scheduled applications considering both data transfer cost and execution cost. Compared with the standard PSO and BRS algorithms on makespan, cost savings and cost optimization ratio, it achieved better performance and large cost savings on cost optimization and makespan. | Amazon Elastic Compute Cloud |
| Improved cost-based scheduling algorithm [24] | Cost, performance | Measured computation performance and resource cost. Increased the execution/data transfer ratio by combining tasks; tasks are combined by analyzing the processing capability of the resources. | CloudSim |
| Heuristic based Genetic Algorithms (HGAs) [25] | Execution cost, execution time, data transfer cost | Scheduled applications so as to minimize the computation cost while completing tasks within the deadline. Performed well compared with the Standard Genetic Algorithm (SGA). | Java environment |

V. CONCLUSION

With the emergence of cloud technology, cloud infrastructure can support large-scale business applications. Workflow systems are designed to provide that property. In a cloud environment, the scheduling and management of resources are structurally complex, so the cloud requires sophisticated tools for analyzing the various scheduling algorithms before they are applied to a real environment. In this paper, we reviewed workflow scheduling algorithms and formed a table on the basis of the algorithm used, the parameters considered, an explanation and the development environment. From the literature reviewed, it is clear that many factors have already been covered in the area of workflow scheduling, such as execution time, resource utilization, cost optimization and makespan, but improvement is still required in areas such as reliability, load balancing, reservation conflicts, backup and fault tolerance.

REFERENCES

[1] S.M. Hashemi, A.Kh. Bardsiri, “Cloud computing vs. grid computing,” ARPN journal of systems and software, vol. 2, no. 5, pp. 188-194, May 2012.
[2] H. Alhakami, H. Aldabbas, T. Alwada, “Comparison between cloud and grid computing: review paper,” International journal on cloud computing: services and architecture (IJCCSA), vol. 2, no. 4, pp. 1-21, August 2012.
[3] http://aws.amazon.com/ec2.
[4] http://aws.amazon.com/s3.
[5] http://www.salesforce.com/in/crm/what-is-crm.jsp.
[6] M. Shiraz, A. Gani, R. H. Khokhar, R. Buyya, “A review on distributed application processing frameworks in smart mobile devices for mobile cloud computing,” IEEE communications surveys & tutorials, vol. 15, no. 3, pp. 1294-1313, 2013.
[7] S. Rao, N. Rao, E. Kusuma Kumari, “Cloud computing: an overview,” Journal of theoretical and applied information technology, vol. 9, no. 1, pp. 71-76, 2009.
[8] J. Srinivas, K. Venkata Subba Reddy, Dr. A. Moiz Qyser, “Cloud computing basics,” International journal of advanced research in computer and communication engineering, vol. 1, issue 5, pp. 343-347, July 2012.
[9] H. Topcuoglu, S. Hariri, M.Y. Wu, “Performance-effective and low-complexity task scheduling for heterogeneous computing,” IEEE transactions on parallel and distributed systems, vol. 13, no. 3, pp. 260-274, March 2002.
[10] J. Yu, R. Buyya, “A taxonomy of workflow management systems for grid computing,” http://arxiv.org/abs/cs/0503025.
[11] R. Sakellariou, H. Zhao, “A hybrid heuristic for DAG scheduling on heterogeneous systems,” Proceedings of the 18th international parallel and distributed processing symposium, 2004.
[12] A. Radulescu, A. Gemund, “Fast and effective task scheduling in heterogeneous systems,” Proceedings of the 9th heterogeneous computing workshop (HCW 2000), pp. 229-238, 2000.
[13] Y. K. Kwok, I. Ahmad, “Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors,” IEEE transactions on parallel and distributed systems, vol. 7, no. 5, pp. 506-521, May 1996.
[14] G.C. Sih, E.A. Lee, “A compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures,” IEEE transactions on parallel and distributed systems, vol. 4, no. 2, pp. 175-187, February 1993.
[15] H. Zhao, R. Sakellariou, “Scheduling multiple DAGs onto heterogeneous systems,” IEEE 20th international parallel and distributed processing symposium, 2006.
[16] Z. Yu, W. Shi, “A planner-guided scheduling strategy for multiple workflow applications,” International conference on parallel processing - IEEE workshop, pp. 1-8, 2008.
[17] S. Pandey, L. Wu, S. Mayura Guru, R. Buyya, “A particle swarm optimization-based heuristic for scheduling workflow applications in cloud computing environments,” 24th IEEE international conference on advanced information networking and applications, pp. 400-407, 2010.
[18] T. A. L. Genez, L. F. Bittencourt, E. R. M. Madeira, “Workflow scheduling for SaaS / PaaS cloud providers considering two SLA levels,” IEEE network operations and management symposium (NOMS): mini-conference, pp. 906-912, 2012.
[19] C. Lin, S. Lu, “Scheduling scientific workflows elastically for cloud computing,” IEEE 4th international conference on cloud computing, pp. 246-247, 2011.
[20] H. Zhong, K. Tao, X. Zhang, “An approach to optimized resource scheduling algorithm for open-source cloud systems,” Fifth annual China grid conference (IEEE), pp. 124-129, 2010.
[21] M. Xu, L. Cui, H. Wang, Y. Bi, “A multiple QoS constrained scheduling strategy of multiple workflows for cloud computing,” IEEE international symposium on parallel and distributed processing with applications, pp. 629-634, 2009.
[22] A. Verma, S. Kaushal, “Deadline and budget distribution based cost-time optimization workflow scheduling algorithm for cloud,” International conference on recent advances and future trends in information technology (iRAFIT 2012).
[23] Z. Wu, Z. Ni, L. Gu, “A revised discrete particle swarm optimization for cloud workflow scheduling,” International conference on computational intelligence and security (CIS), pp. 184-188, 2010.
[24] S. Selvarani, G.S. Sadhasivam, “Improved cost-based algorithm for task scheduling in cloud computing,” Computational intelligence and computing research, pp. 1-5, 2010.
[25] A. Verma, S. Kaushal, “Deadline constraint heuristic based genetic algorithm for workflow scheduling in cloud,” Forthcoming article in International journal of grid and utility computing.
