W
= {
w
0
, w
1
, . . . , w
k
}
are continually submitted.
The resource manager, determines the requested resources
(CPU, memory requirements etc.), availability of resources on VMs,
availability of PMs, power consumption parameters etc. through
the VM profiles. The scheduler using a power-aware scheduling
algorithm to determine the optimal cost of efficiently executing
the workflow with minimal power requirements by producing a
mapping. Once the Price/Cost of executing the workflow is de-
termined, the tasks are assigned to VMs which are deployed on
the underlying PMs. Each workflow
w
k
consists of a list of tasks
T
ki
= {
T
0
,
T
1
, . . . ,
T
i
} ;
T
ki
∈
w
k
; with arrival time
a
ki
; expected
completion time
e
ki
; and a power-profile
P
ki
. The expected arrival
time
a
ki
is the system time at initiation of a task in the work-
flow, the expected completion time
e
ki
is assumed to be normal
distribution and we use the
α
quantile to approximate it; the
power-profile
P
ki
is determined from a vector containing the power
utilization
P
T
ki
= {
p
0
,
p
1
, . . . ,
p
j
}
of a task
i
in the workflow
k
;
where
p
0
,
p
1
, . . . ,
p
j
are the power consumption levels for a task
at various times during its execution.
Tasks defined in a workflow are assigned to a set of VMs
V
such that only one task is executed per VM. A pool of VMs
V
=
{
v
0
, v
1
, . . . , v
x
}
is available, VMs can be gained and released by any
task
T
ki
in a workflow. Each VM has a
Price
(
v
x
) associated with it
which indicates the cost to execute a task on this VM for a time
frame
∆
t
=
t
e
ki
−
t
a
ki
where
t
a
ki
is the arrival time for task
T
ki
in
Workflow
w
k
and
t
e
ki
is the expected completion time for this task.
Each VM requires a set period of time at initiation for booting on
the host machine with
P
T
ki
at
p
0
.
The variable
m
T
ki
,
x
reflects mapping between tasks and VMs. The
value of
m
T
ki
,
x
is 1 if task
T
ki
is mapped to VM
v
x
and 0 otherwise.
m
T
ki
,
x
=
{
1
,
if T
ki
is assigned to
v
x
0
,
other
w
ise
(1)
As the tasks’ complete, their real starting times
s
T
ki
,
x
and comple-
tion times
f
T
ki
,
x
on
v
x
are available. The completion time can be
given as:
f
T
ki
,
x
≥
s
T
ki
,
x
+
t
m
ki
,
x
(2)
where
t
m
ki
,
x
is the time to place the task in the VM. The actual
finishing time of all tasks in a workflow can be given by:
max
z
∈
T
{
f
T
z
}
≤
f
k
(3)
where
max
z
∈
T
{
f
T
z
}
represents the actual completion time
f
k
of the
workflow
w
k
.
Assuming that there are
D
= {
d
0
,
d
1
, . . . ,
d
y
}
PMs in a clus-
ter/data center, then
n
m
Tki
,
x
y
reflects the mapping of tasks
T
ki
in VM
v
x
to PM
d
y
. The primary optimization objectives are to (i) minimize
the cost/price of executing the tasks in the workflow and (ii)
maximize the utilization of PM
d
y
by placing the optimal number
of VMs determined by the AP of the tasks. We first represent the
aforementioned optimization using:
Minimize
x
∑
t
=
1
Price
(v
t
) .
∆
t
ki
(4)
where
v
t
denotes the total number of VMs employed to complete
work flow
w
k
,
Price
(v
t
)
represents the price associated with this
VM and
∆
t
ki
denotes the time to complete this task. The optimal
placement of VMs per PM is governed by minimization of power
consumption cost of a VM. We give the average for executing a VM
with mapping
m
T
ki
,
x
on server
d
y
as
Price
m
Tki
,
x
y
:
Price
m
Tki
,
x
y
=
1
j
j
∑
t
=
0
p
T
ki
,
x
.
∆
t
ki
(5)
where
p
T
ki
,
x
is the power consumption level at the time interval
∆
t
ki
and 0
<
Price
m
Tki
,
x
y
<
1. To obtain an optimal
n
m
Tki
,
x
y
we need to
462
B. Qureshi / Future Generation Computer Systems 94 (2019) 453–467
maximize the placement of
r
number of VMs on a PM
y
such that.
n
m
Tk
,
r
y
=
max
y
∈
D
x
∑
r
=
1
{
Price
m
Tk
,
r
y
}
(6)
We define the maximum CPU utilization
θ
y
threshold on a PM
d
y
where
θ
y
∈
[0, 1]. In the next section, we describe the scheduling
algorithm using the constraints provided in the preceding text.
4.3. Scheduling algorithm
To understand the scheduling of tasks in the framework, we
provide a few criterion.
Criterion 1.
Each VM executes only one task at a time instant.
Criterion 2.
VMs are available as a pool. As soon as a VM is made
available, it is allocated a task. The VM that has been assigned a
task is removed from the pool and starts execution in a PM. Once
all processes are completed in a VM, it is returned to the pool.
Criterion 3.
Applications may have multiple tasks defined in the
workflow, a task may have preceding tasks that need to be com-
pleted before this task can be executed. Once all predecessors of a
task complete, it is ready to be executed.
In traditional scheduling approaches, once a new workflow ar-
rives, its tasks are mapped and posted immediately for execution in
the local queues. Unlike these approaches, the scheduling process
in this work considers an important issue, i.e. to rank the tasks
based on certain criterion. We consider two criterions for ranking
the tasks, (i) earliest finishing time of a task, (ii) Lowest
Price
of
tasks available in the scheduling queue, as detailed in the previous
section. The proposed scheduling algorithm continually creates
and updates new and existing mappings of tasks, VMs and PMs
based on the profiles.
Fig. 5
shows the proposed scheduling model. Algorithm 1 shows
the process for a workflow arrival in the framework. It analyzes
every task considering its expected completion time and Price
which is available in the AP. The
TaskList
queue is updated with all
tasks processed in a workflow and is passed on to the scheduler.
Once a
TaskList
queue is completed, the scheduler is called
which is given as algorithm 2. All the remaining tasks that were
rejected, are added to the
Taskpool
queue for the next cycle of
workflow processing. A rejected task may have a predecessor that
may not have been completed after
∆
t.
The function scheduler is defined in algorithm 2. The VMs, tasks
and PMs map
m
k
is initialized for workflow
w
k
. For each task
processed by the controller, the tasks’ time and prices is available in
the
TaskList
. The scheduler computes the price of assigning a task
to a VM based on availability of VMs. Based on the two-criterion
mentioned earlier, an appropriate VM is selected with minimal
expected runtime and lowest Price and added to the VM map
m
k
.
This process continues for all tasks in the
TaskList
. Once a mapping
is produced, the tasks are removed from the
TaskList
. The scheduler
also computes the cost of executing the VM in PMs. Based on
the frequency of multitenancy on a PM and the load factor, the
scheduler determines the minimal price of executing VMs on a PM
using Eqs.
(5)
and
(6)
where are considering the limit for
θ
y
. If a PM
satisfies the criterion
is
v
alid
(
n
m
Tk
,
x
y
), the VM is assigned to this PM
and the task is initiated. Alternatively, the next PM satisfying the
criterion is sought from the pool of available PMs. Consequently,
the load factor and the price of the selected PM is updated in the
profiles defined in the framework.
4.4. Complexity of the proposed algorithm
We analyze the complexity of the proposed algorithms by ana-
lyzing the runtime of the algorithm 1 (workflow management) and
algorithm 2 (task scheduler). Algorithm 1 directs the framework to
compute the expected completion times of
i
tasks in
k
workflows.
Overall, the runtime for algorithm 1 is O(n), where we give
n
as a
product of the number of tasks
i
, and the number of workflows
k
.
Generally,
k
is much less than
i
. Algorithm 1 populates the
TaskList
of size
n
, which is processed by the scheduler in algorithm 2.
Algorithm 2 presents the tasks scheduler for the workflows.
M
k
is essentially a linked list providing a key–value pair, mapping
tasks to virtual machines with a runtime of O(n). For each task
t
i
in the
TaskList
, price for executing task
t
i
on virtual machine
v
j
B. Qureshi / Future Generation Computer Systems 94 (2019) 453–467
463
is computed and assigned to the map in with a runtime O(n . V),
where V is the number of virtual machines. The scheduler validates
the load-balancing for physical machines
y
with a runtime O(n . V .
y) where
y
is much less than
n
and V. Overall, we give the run time
as O(n
3
) where the product (V . y) is much less than
n
, therefore
running the algorithms in polynomial time.
Do'stlaringiz bilan baham: |