B. Qureshi / Future Generation Computer Systems 94 (2019) 453–467
weight, it relies on the seamless availability of resources for
migration and online mechanism for profiling workflows.
Ye et al. [42] proposed a server consolidation framework. It
focuses on reducing the number of active physical servers and
VM migrations in data centers, maintaining the overall workload
performance. Two kinds of profiling strategies are considered: VM
consolidation per PM and VM migration. The framework models
the server consolidation as an optimization problem with the goal of
minimizing the cost of migration. The authors conduct an experimental
evaluation using four kinds of benchmarks for CPU-, I/O-, and
network-intensive applications. Results show that the framework leverages
the reduction in VM migrations and the VM placement on PMs to reduce the
energy consumption of the data center. In contrast to this work,
the framework proposed in this paper addresses the dynamic
nature of workflows by maintaining power profiles for workflows
when determining the optimal cost of VM placement and migration.
Shi et al. [13] presented an application-placement framework.
The objective is to allocate a certain number of various data-
intensive applications to physical servers. The framework consid-
ers the conflict issues in scheduling by ensuring that a mixture
of applications with different resource requests are assigned to
individual servers. The framework uses an application-level
load balancer and an application server manager to assign
applications and monitor the resource provisioning between servers. This
work does not consider the energy consumption requirements of tasks.
In contrast, the proposed approach considers each individual task's
energy requirements for a greener placement of tasks within the
workflow.
Li et al. [46] proposed a heuristic stochastic task scheduling
algorithm to address energy consumption for heterogeneous com-
puting systems. The proposed stochastic task scheduling problem
is formulated as a linear programming problem, in which the
authors maximize the weighted probability of combined schedule
length and energy consumption metric under deadline and energy
consumption budget constraints. The proposed model considers
only energy consumption at the processor. Furthermore, the poly-
nomial time execution of the algorithm is not guaranteed.
In [17], researchers present a popular list-based heuristic
scheduling algorithm, heterogeneous earliest finish time (HEFT).
The algorithm consists of two phases: ranking and mapping. In the
ranking phase, the distance from the task submission time to the end
of the workflow is determined. This ensures that the tasks with
the largest number of successors are executed first. In the mapping phase,
resources are mapped to the tasks within the workflow. HEFT,
however, is not suitable for dynamically changing requirements
in workflows within a data center. Durillo et al. [47] extended
HEFT by incorporating information on the utilization of resources
within a system. They present MoHEFT, a variant of HEFT
that considers the behavior of real multi-core CPUs with different
levels of energy consumption. This is determined by quantifying
the number of cores in use in a CPU and their utilization.
Although this work addresses energy utilization in a server during
the scheduling of resources, it does not consider
multi-tenancy in its approach.
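The two HEFT phases described above can be illustrated with a minimal Python sketch. It is a simplified illustration, not the implementation from [17]: it assumes strictly positive task costs, ranks tasks by the commonly used upward rank (average cost plus the longest remaining path to an exit task), and omits the insertion-based slot search of the original algorithm. The function name heft_schedule, its parameters, and the four-task example workflow are hypothetical.

```python
from functools import lru_cache

def heft_schedule(succ, cost, comm, n_procs):
    """Simplified HEFT sketch (all names here are illustrative assumptions).

    succ: task -> list of successor tasks
    cost: task -> list of run times, one entry per processor
    comm: (u, v) -> transfer time if u and v run on different processors
    """
    avg = {t: sum(c) / len(c) for t, c in cost.items()}

    @lru_cache(maxsize=None)
    def rank(t):
        # Ranking phase: average cost plus the longest path to an exit task.
        return avg[t] + max(
            (comm.get((t, s), 0) + rank(s) for s in succ.get(t, [])),
            default=0,
        )

    # Higher rank first; with positive costs this is a valid topological order.
    order = sorted(cost, key=rank, reverse=True)

    pred = {t: [] for t in cost}
    for u, vs in succ.items():
        for v in vs:
            pred[v].append(u)

    proc_free = [0.0] * n_procs   # time at which each processor becomes idle
    placed = {}                   # task -> (processor, start, finish)
    for t in order:
        best = None
        for p in range(n_procs):
            # Inputs are ready once every predecessor has finished and,
            # if it ran on another processor, its data has been transferred.
            ready = max(
                (placed[u][2] + (0 if placed[u][0] == p else comm.get((u, t), 0))
                 for u in pred[t]),
                default=0.0,
            )
            start = max(ready, proc_free[p])
            finish = start + cost[t][p]
            # Mapping phase: keep the processor with the earliest finish time.
            if best is None or finish < best[2]:
                best = (p, start, finish)
        placed[t] = best
        proc_free[best[0]] = best[2]
    return order, placed

# Hypothetical diamond-shaped workflow on two heterogeneous processors.
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
cost = {"A": [2, 3], "B": [3, 2], "C": [4, 4], "D": [1, 2]}
comm = {("A", "B"): 1, ("A", "C"): 1, ("B", "D"): 1, ("C", "D"): 1}
order, placed = heft_schedule(succ, cost, comm, n_procs=2)
makespan = max(finish for _, _, finish in placed.values())
```

Because the ranks are computed once from static cost estimates, a schedule produced this way cannot react to requirements that change while the workflow runs, which is the limitation noted above.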
2.3. Contributions of this work
Analyzing the aforementioned works, we identify gaps in the
literature and address them in this paper as follows:
• Energy efficiency and the performance of the data center
are correlated. Matching workloads based on the PM
characteristics does not necessarily provide the desired tradeoff. To
highlight this, we present a motivational case
study on a virtualized Hadoop cluster with a real big data
workload based on Twitter data. We analyze the energy efficiency
and performance of the Hadoop task placement strategy for
various sizes of workloads. Based on this case study, we
introduce a novel concept of building power profiles for various
applications.
• Profiling has previously been considered for efficient
placement of computational workloads in a multi-core
architecture. In this work, we present a framework for power-aware,
energy-efficient placement of workloads in a data center by
leveraging the concept of APs. The application profiles are
built in the framework over time based on various
parameters, such as applications, and take the power consumption
requirements into account to efficiently place VMs on the data center
hardware. The proposed framework models the VM
placement using a heuristic algorithm that runs in polynomial
time.
• We provide extensive simulation studies to verify the
runtime of the proposed scheduler. The proposed scheduler is
compared to HEFT and RTC for various scenarios, analyzing the
power utilization, computation time, and power efficiency for
various sizes of workloads.