A Hybrid Approach for Runtime Analysis
distributed simulation introducing new challenges. An overview of the different
issues and methods of parallel and distributed discrete event simulation is given
in [7]. While these approaches are future solutions for handling systems with
many cores, they do not optimize the simulation of a single complex core very
well. This paper addresses that issue.
The Sniper [3] simulator is a tool that increases the accuracy of
instruction-accurate simulation (called one-IPC (instruction per cycle)
simulation) without introducing the overhead of a cycle-accurate simulator. It
separates the instruction stream into intervals, which are analyzed with regard
to their architectural behavior and stored in an execution window during the
emulation. This makes it possible to model the time penalties that occur on real
hardware because of data dependencies between instructions or cache misses. The
Sniper authors also suggest parallelizing the execution of the simulation. They
achieve an average absolute error of less than 23.8% for the SPLASH-2 benchmark
but incur a slow-down of 2–3 times in comparison to the one-IPC simulation. The
work presented in this paper is intended as an alternative way to achieve
benefits similar to those of Sniper.
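The difference between a one-IPC estimate and an estimate that charges penalties for data dependencies and cache misses can be illustrated with a toy sketch. This is not Sniper's interval algorithm: the `Instr` record, the penalty constants, and both functions are hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MISS_PENALTY = 100  # assumed cache-miss penalty in cycles (illustrative value)
DEP_STALL = 1       # assumed stall for a back-to-back data dependency


@dataclass
class Instr:
    dest: Optional[str]          # register written, or None
    srcs: Tuple[str, ...] = ()   # registers read
    cache_miss: bool = False     # whether this instruction misses the cache


def one_ipc_cycles(stream):
    """One-IPC model: every instruction costs exactly one cycle."""
    return len(stream)


def penalty_cycles(stream):
    """One cycle per instruction plus assumed penalties for hazards."""
    cycles = 0
    prev_dest = None
    for ins in stream:
        cycles += 1
        # Stall if this instruction reads the result of the previous one.
        if prev_dest is not None and prev_dest in ins.srcs:
            cycles += DEP_STALL
        if ins.cache_miss:
            cycles += MISS_PENALTY
        prev_dest = ins.dest
    return cycles


stream = [
    Instr("r1"),
    Instr("r2", ("r1",)),                  # depends on the previous result
    Instr(None, ("r3",), cache_miss=True),
]
print(one_ipc_cycles(stream), penalty_cycles(stream))
```

For this three-instruction stream the one-IPC estimate is 3 cycles, while the penalty-aware estimate adds one dependency stall and one miss penalty on top.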
Switching the processor models of gem5, as presented in this work, was done
before by Hsieh et al. [9]. They use this approach to fast-forward to their
region of interest. As soon as the inaccurate (they call it “in-order”) model
reaches the point that has to be investigated, the accurate out-of-order model
is switched in. How this region is found and how they keep track of the
instruction flow is not explained. Additionally, since this was not the main
topic of their work, no further comparison of the accuracy achieved relative to
the simulation time required for the full program was made. The gem5 mechanism
for exchanging processor models was also used to emulate dynamic voltage and
frequency scaling; Haririan et al. implemented this feature for gem5 [8].
However, the main focus of their evaluation lay on the accuracy of the method.
Thus, they did not try to accelerate their work or to compare it with regard to
its simulation speed.
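The fast-forwarding idea, running a cheap model until the region of interest and then switching in a detailed one, can be sketched as follows. The sketch does not use the gem5 API: the classes `FastModel` and `DetailedModel`, the per-instruction costs, and the `run` function are hypothetical stand-ins. The shared `state` dictionary plays the role of the architectural state handed over at each switch.

```python
class FastModel:
    """Hypothetical coarse model: every instruction costs one cycle."""

    def step(self, state):
        state["pc"] += 1
        state["cycles"] += 1


class DetailedModel:
    """Hypothetical detailed model with an assumed higher cost per instruction."""

    COST = 3  # illustrative value, not a measured figure

    def step(self, state):
        state["pc"] += 1
        state["cycles"] += self.COST


def run(n_instr, roi_start, roi_end):
    """Simulate n_instr instructions, using the detailed model only inside
    the region of interest [roi_start, roi_end)."""
    state = {"pc": 0, "cycles": 0}
    fast, detailed = FastModel(), DetailedModel()
    cpu = fast
    switches = 0
    while state["pc"] < n_instr:
        wanted = detailed if roi_start <= state["pc"] < roi_end else fast
        if wanted is not cpu:
            cpu = wanted      # hand the shared state over to the other model
            switches += 1
        cpu.step(state)
    return state["cycles"], switches


cycles, switches = run(n_instr=10, roi_start=4, roi_end=8)
print(cycles, switches)
```

In a real simulator the handover also has to transfer or warm up microarchitectural state such as caches and branch predictors, which this sketch leaves out.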
This section shows that there are already many approaches to improving the
efficiency of CPU simulation. However, to the knowledge of the authors, no
evaluation integrating both dimensions, simulation time and accuracy, has been
made. Hence, the newly proposed methodology is evaluated in a way that includes
both metrics. In the future, some of the related work presented here might also
benefit from the proposed methodology.
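As a minimal sketch of what reporting both metrics together could look like, the two helpers below compute an absolute error in percent against a reference and a slow-down factor against a baseline run. The function names and the sample numbers are assumptions made for illustration. They are not taken from the evaluation in this paper.

```python
def abs_error_percent(simulated_cycles, reference_cycles):
    """Absolute relative error of a simulated cycle count, in percent."""
    return 100.0 * abs(simulated_cycles - reference_cycles) / reference_cycles


def slowdown(sim_seconds, baseline_seconds):
    """Simulation time relative to a baseline run (above 1.0 means slower)."""
    return sim_seconds / baseline_seconds


# Hypothetical measurements for one benchmark under one simulator setup.
error = abs_error_percent(simulated_cycles=90, reference_cycles=100)
factor = slowdown(sim_seconds=30.0, baseline_seconds=10.0)
print(f"error = {error:.1f} %, slow-down = {factor:.1f}x")
```

Reporting the pair for every configuration makes the trade-off visible: a setup with a low error but a large slow-down can then be compared directly with a faster, less accurate one.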