PART II: CENTRAL PROCESSING UNIT
Amdahl’s Law
Gene Myron Amdahl, who was one of the architects of mainframe computers, including the
famous IBM System/360, defined a phenomenon that over the years became a cornerstone of
processor performance evaluation. However, it can be applied to other disciplines as well, such as
systems engineering at large.
Amdahl’s law states that the performance enhancement to be gained from improving some
component is limited by the fraction of time the component is being used. This law is commonly used in
situations where we have to estimate the performance improvements to be achieved by adding
additional processors to the system or, in modern systems, additional cores. However, the law
is not confined to computers, and it is possible to use it in other, noncomputer-related settings
as well.
The formula that represents the law is as follows. Assuming:

F_E is the fraction of time the enhancement (or improvement) can be used
P_E is the performance gained by the enhancement

then the new execution time expected is given by:

T_new = T_old * [(1 - F_E) + F_E / P_E]

Using this formula, it is fairly simple to extract the speedup:

Speedup = T_old / T_new = 1 / [(1 - F_E) + F_E / P_E]
For example, assume a program consists of two parts: the first part is executed for 70% of the
time and the second part runs during 30% of the time. In an effort to reduce the execution time, the
second part was rewritten and now it executes five times faster. What is the overall time for the
program after introducing this change?
By using Amdahl’s law, we will define F_E = 0.3 and P_E = 5.
And then

Speedup = 1 / [(1 - 0.3) + 0.3 / 5] = 1 / 0.76 ≈ 1.32

This means that the overall improvement will be 32%, although the improvement for the second
part alone was a factor of five.
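As a minimal sketch, the calculation above can be expressed in Python (the function and variable names are illustrative, not taken from the text):

```python
def amdahl_speedup(f_e, p_e):
    """Overall speedup when a fraction f_e of the run time
    is accelerated by a factor p_e (Amdahl's law)."""
    return 1.0 / ((1.0 - f_e) + f_e / p_e)

# Second part: 30% of the time, rewritten to run five times faster
print(round(amdahl_speedup(0.3, 5), 2))  # 1.32, i.e., a 32% overall improvement
```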
The law is useful not only for software estimates. For example, it can also be applied when there is
a need to increase the speed of a computer system. One group of engineers that designed the ALU
proposes a change that will yield a 10% speed increase. Another group that designed the memory
access proposes a change that will increase the access speed by a factor of eight. It is known that the
ALU accounts for 93% of the execution time and memory access accounts for 7%. Assuming only one
of these modifications can be implemented, which one has the greater impact?
Once again, we will use Amdahl’s law twice: first, we will use it for calculating the ALU
improvement contribution, and then we will use it to calculate the memory improvement
contribution.
ALU contribution: F_E = 0.93, P_E = 1.1
Using the formula will produce an improvement of 9.2%:

Speedup = 1 / [(1 - 0.93) + 0.93 / 1.1] ≈ 1.092

Memory contribution: F_E = 0.07, P_E = 8
Using the formula once again will produce an improvement of 6.5%:

Speedup = 1 / [(1 - 0.07) + 0.07 / 8] ≈ 1.065
This means that the ALU modification is the more effective one, and it is the one that will be chosen.
Furthermore, using the formula on various theoretical improvements reveals that, due to its limited
usage (7%), the memory improvement can never be more effective than the ALU improvement. This
can be seen in Figure 4.22, which depicts various theoretical and probably unrealistic improvements;
even with an improvement of seven orders of magnitude, the overall speedup is less than 7.6%.
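The comparison and its asymptotic limit can be checked numerically; a short Python sketch (the helper name amdahl_speedup is illustrative):

```python
def amdahl_speedup(f_e, p_e):
    # Overall speedup when fraction f_e of the time is sped up by factor p_e
    return 1.0 / ((1.0 - f_e) + f_e / p_e)

alu = amdahl_speedup(0.93, 1.1)  # ALU: 93% of time, 10% faster -> ~1.092
mem = amdahl_speedup(0.07, 8)    # Memory: 7% of time, 8x faster -> ~1.065
# However large the memory speedup, the gain is capped by the 93% of
# time spent elsewhere: the limit is 1 / (1 - 0.07), roughly 1.075
limit = amdahl_speedup(0.07, 10**7)
print(round(alu, 3), round(mem, 3), round(limit, 3))
```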
One of the most useful applications of the law, as far as systems engineers are concerned, relates to
the anticipated performance increase that can be gained from using multiple-core processors.
Figure 4.23 depicts a hypothetical task that runs 140 s on a single core. The task consists of three parts. The first
and third parts, which run 20 s each, are sequential by nature and cannot be parallelized. The second
part, which runs for 100 s, can be parallelized, so it may benefit from being rewritten using threads
that run on several cores. When running the task on one processor and one core, it runs for a
total of 140 s (20 + 100 + 20). Assuming it runs on four cores, the speedup gained can be calculated
using Amdahl’s law, with F_E = 100/140 and P_E = 4:

Speedup = 1 / [(1 - 100/140) + (100/140) / 4] = 140 / 65 ≈ 2.15
FIGURE 4.22 Memory performance improvement.
FIGURE 4.23 Amdahl’s law visualized.
This means that the gain is 115% and the total time will be reduced to 65 s. Figure 4.23
demonstrates this visually.
The upper part relates to the run on a single core and the lower part to the run on four cores. As
can be seen, the first and third parts were not influenced by the additional cores; however, the second
part ran in parallel, so the overall time used by this part is just 25 s. The total time required for
completing the task is 65 s (20 + 25 + 20).
According to the definition of Amdahl’s law, the gains are limited by the percentage of time the
improvement is in use. In this example, even if it were theoretically possible to use an indefinite number of
cores, the overall time would never be less than 40 s: the time required by the two serial parts.
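The multicore example can be sketched in Python; the 40 s serial floor becomes visible as the core count grows (the numbers are those of the example above, and the function name is illustrative):

```python
def task_time(serial_s, parallel_s, cores):
    # Only the parallel portion benefits from additional cores
    return serial_s + parallel_s / cores

for cores in (1, 2, 4, 100, 10**6):
    print(cores, task_time(40, 100, cores))
# On 4 cores: 40 + 100/4 = 65 s; the time never drops below 40 s
```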
The law can be used for decisions regarding performance improvement of user code as well.
Assume there is a program that consists of a total of 150 lines of code; 120 lines are being run for 80%
of the time and the remaining 30 lines are executed during 20% of the time. To reduce the time, it is
possible to improve the first part so it will run 50% faster or it is possible to improve the second part
so it will run 100 times faster. Assuming it is possible to implement only one change, which is the
most effective?
As in previous cases, we will have to use Amdahl’s law for each one of the cases.
Improving part one: F_E = 0.8, P_E = 1.5

Speedup = 1 / [(1 - 0.8) + 0.8 / 1.5] ≈ 1.36 (a 36% improvement)

Improving part two: F_E = 0.2, P_E = 100

Speedup = 1 / [(1 - 0.2) + 0.2 / 100] ≈ 1.25 (a 25% improvement)
This means that the most effective route is by implementing the change proposed for the first part.
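A quick numeric check of the two alternatives, as a hedged Python sketch (helper name is illustrative):

```python
def amdahl_speedup(f_e, p_e):
    # Overall speedup when fraction f_e of the time is sped up by factor p_e
    return 1.0 / ((1.0 - f_e) + f_e / p_e)

part_one = amdahl_speedup(0.8, 1.5)  # 80% of time, 50% faster -> ~1.36
part_two = amdahl_speedup(0.2, 100)  # 20% of time, 100x faster -> ~1.25
print(part_one > part_two)  # True: improving part one is more effective
```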
Amdahl’s law is not confined only to computer (hardware and software) related performance
assessment and it can be used in other environments as well, for example, a transatlantic flight. In
the past, there were two alternatives for flying from London to New York: using a standard
(subsonic) airliner that flies for 7 h, or using a supersonic airliner (the Concorde) that flies for 3.5 h.
There was a significant difference in the fares, which sometimes may have been justified. However,
one has to take into account that although the flight time is halved, the total trip time is not. Usually, such a trip
includes additional time required for security checks and check-in as well as waiting for the luggage
after landing. If we assume that besides the flight time an additional 4 h are needed, then we can use
Amdahl’s law for calculating the supersonic contribution to the reduction in the overall trip time,
with F_E = 7/11 and P_E = 2:

Speedup = 1 / [(1 - 7/11) + (7/11) / 2] = 11 / 7.5 ≈ 1.47
This means that although the flight time is halved, the overall trip time is only reduced from 11 h to
7.5 h. It should be noted that in this simple example, it is possible to get to the result without using
the formula: the 4 h of nonflying time remain unchanged, while the 7 h of flying became 3.5. The total
trip time is 7.5 h (4 + 3.5).
Processors’ Types
Although the abstraction used in processor design aims to reduce costs while maintaining
capabilities, the financial resources required for designing a new generation of processors are still
enormous. This means that surviving in this highly competitive market is not easy, and
manufacturers had to look for ways to sell as many processors as possible in order to get a return on
their investments. This is one of the main reasons that many of the old mainframe companies
disappeared and the ones that still exist use off-the-shelf processors.
For a better understanding of these trends, we must follow the technological developments since the
1990s. During the first years of that decade, there was an increase in the number of newly emerging
computing companies. Many designed and sold UNIX-based workstations that were based on
processors designed using a “new” technology called reduced instruction set computer (RISC,
which will be discussed in the next two sections). The innovation introduced by RISC was mainly the
capability to design and develop alternative processors that are fast but simple, and relatively cheap
to design.
Although there were tens of such companies during the 1990s, each one trying to concentrate on
some other specific part of the market, the dominant players were:
• IBM, which designed and manufactured the PowerPC that currently is part of the Power
architecture, but originally was intended to run UNIX-based systems. The chip was designed in
collaboration between Apple, IBM, and Motorola. It was used in systems designed and
manufactured by the three companies. Unfortunately, the main personal computing market was
dominated by Intel processors and the new chip did not succeed in obtaining a significant share.
Nevertheless, the Apple Macintosh system used the chip until 2006 (when Apple switched to
Intel’s processors). Currently the chip is implemented mainly in various high-performance
embedded appliances.
• Digital Equipment Corporation (known as DEC or Digital) was a successful computer
company that introduced the concept of minicomputers. DEC was acquired by Compaq (a large
producer of personal computers: IBM PC compatible systems) in 1998. Prior to that acquisition,
DEC designed and developed a fast 64-bit chip called Alpha. It was mainly used by DEC systems
and there were no significant collaborations with other computer systems manufacturers. This
meant that the high-design costs could not be shared with other manufacturers, which may have
contributed to DEC’s financial problems. Compaq itself was acquired by HP in 2002, which still
maintains the brand name mainly for the low-end systems. Prior to that, due to Compaq’s being
an Intel customer, the Alpha technology and the intellectual property rights were sold to Intel,
which marked the end of the technology.
• MIPS Computer Systems was a company that designed a family of processor chips intended
both for commercial systems and for various embedded devices. The company’s
processors were used by SGI (Silicon Graphics, Inc.) a company that manufactured high-
performance 3D-graphics computing solutions. As a result, in 1992 SGI acquired MIPS.
Unfortunately, several years later, SGI decided to switch to Intel processors as the engine for
their systems and MIPS was spun out. In 2013, it was acquired by Imagination, a UK company
that offers embedded processors.
• Sun Microsystems, Inc. was another successful company that helped pave the way for the
computing industry. Sun Microsystems was a significant player in producing some of the
currently heavily used technologies, such as UNIX, the Java programming concept, the Network
File System (NFS), virtualization, and so on. Despite its successful contributions, Sun Microsystems
was acquired by Oracle Corporation to provide a high-end integrated (hardware/software) system
optimized for large enterprises as well as cloud-based systems.
• Intel is a semiconductor company that invented the x86 family of microprocessors (see Table
1.2), which is the heart of most of the 32-bit personal computers worldwide. The fast
technological development in the personal computing industry and Intel’s presence in the
hardware part influenced its dramatic success. Although there were several competitors, such as
AMD, during the 1990s, Intel established itself as the leading supplier of processors.
• Hewlett-Packard (HP) is a computer systems and peripherals manufacturer founded in 1939.
Although the company had its own line of processors, it decided to team up with Intel for a new
line of processors (Itanium).
In the very early stages, the hardware design and manufacturing companies understood that
strategic cooperation with other manufacturers is a must for achieving large-volume sales. The
revenues from these sales could cover the high costs associated with the technological
developments. This is the reason that most of the 1990s chip manufacturers entered into