Chapter 7 ■ Greed Is Good? Prove It!
Minimum Spanning Trees
Now let’s take a look at perhaps the most well-known example of a greedy problem: finding minimum spanning trees. The problem is an old one—it’s been around at least since the early 20th century. It was first solved by the Czech mathematician Otakar Borůvka in 1926, in an effort to construct a cheap electrical network for Moravia. His algorithm has been rediscovered many times since then, and it still forms the basis of some of the fastest algorithms known today. The algorithms I’ll discuss in this section (Prim’s and Kruskal’s) are in some ways a bit simpler but have the same asymptotic running time (O(m lg n), for n nodes and m edges).9
If you’re interested in the history of this problem, including the repeated rediscoveries of the classic algorithms, take a look at the paper “On the History of the Minimum Spanning Tree Problem,” by Graham and Hell. (For example, you’ll see that Prim and Kruskal aren’t the only ones to lay claim to their eponymous algorithms.)
We’re basically looking for the cheapest way of connecting all the nodes of a weighted graph, given that we can use only a subset of its edges to do the job. The cost of a solution is simply the weight sum for the edges we use.10 This could be useful in building an electrical grid, constructing the core of a road or railroad network, laying out a circuit, or even performing some forms of clustering (where we’d only almost connect all the nodes). A minimum spanning tree can also be used as a foundation for an approximate solution to the traveling salesrep problem introduced in Chapter 1 (see Chapter 11 for a discussion of this).
A spanning tree T of a connected, undirected graph G has the same node set as G and a subset of the edges. If we associate an edge weight function with G so edge e has weight w(e), then the weight of the spanning tree, w(T), is the sum of w(e) for every edge e in T. In the minimum spanning tree problem, we want to find a spanning tree over G that has minimum weight. (Note that there may be more than one.) Note also that if G is disconnected, it will have no spanning trees, so in the following, it is generally assumed that the graphs we’re working with are connected.
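To make the definitions concrete, here is one common way of representing a weighted, undirected graph in Python (a dict of neighbor dicts, as used elsewhere in this book) and of computing w(T) for a spanning tree given as a set of edges. The three-node graph is a small invented example, not the graph of Figure 7-3.

```python
# A small, hypothetical weighted, undirected graph G: each edge is
# stored in both directions with the same weight.
G = {
    'a': {'b': 2, 'c': 5},
    'b': {'a': 2, 'c': 3},
    'c': {'a': 5, 'b': 3},
}

def tree_weight(G, T):
    """The weight w(T) of a spanning tree T, given as a set of edges."""
    return sum(G[u][v] for u, v in T)

T = {('a', 'b'), ('b', 'c')}   # one spanning tree of G
print(tree_weight(G, T))       # 2 + 3 = 5
```

For this G, the only other spanning tree is {('a', 'b'), ('a', 'c')}, with weight 7, so T above is the minimum spanning tree.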
In Chapter 5, you saw how to build spanning trees using traversal; minimum spanning trees can also be built incrementally like this, and that’s where the greed comes in: We gradually build the tree by adding one edge at a time. At each step, we choose the cheapest (or lightest) edge among those permitted by our building procedure. This choice is locally optimal (that is, greedy) and irrevocable. The main task for this problem, or any other greedy problem, becomes showing that these locally optimal choices lead to a globally optimal solution.
The Shortest Edge
Consider Figure 7-3. Let the edge weights correspond to the Euclidean distances between the nodes as they’re drawn (that is, the actual edge lengths). If you were to construct a spanning tree for this graph, where would you start? Could you be certain that some edge had to be part of it? Or at least that a certain edge would be safe to include? Certainly (e, i) looks promising. It’s tiny! In fact, it’s the shortest of all the edges—the one with the lowest weight. But is that enough?
9 You can, in fact, combine Borůvka’s algorithm with Prim’s to get a faster algorithm.
10 Do you see why the result cannot contain any cycles, as long as we assume positive edge weights?
As it turns out, it is. Consider any spanning tree without the minimum-weight edge (e, i). The spanning tree would have to include both e and i (by definition), so it would also include a single path from e to i. If we now were to add (e, i) to the mix, we’d get a cycle, and in order to get back to a proper spanning tree, we’d have to remove one of the edges of this cycle—it doesn’t matter which. Because (e, i) is the smallest, removing any other edge would yield a smaller tree than we started out with. Right? In other words, any tree not including the shortest edge can be made smaller, so the minimum spanning tree must include the shortest edge. (As you’ll see, this is the basic idea behind Kruskal’s algorithm.)
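As a preview of where this idea leads, here is a minimal sketch of Kruskal’s algorithm: sort the edges by weight and add each one that doesn’t close a cycle. The chapter develops a proper version later; this sketch uses a naive union–find (following component pointers) and an invented four-node graph, not the graph of Figure 7-3.

```python
def kruskal(nodes, edges):
    """Sketch of Kruskal's algorithm: repeatedly add the lightest
    edge that doesn't create a cycle. edges is a list of (weight, u, v)
    tuples; returns the set of tree edges."""
    comp = {u: u for u in nodes}          # naive union-find: component reps

    def find(u):                          # follow pointers to the representative
        while comp[u] != u:
            u = comp[u]
        return u

    T = set()
    for w, u, v in sorted(edges):         # cheapest edge first
        ru, rv = find(u), find(v)
        if ru != rv:                      # different components: no cycle
            T.add((u, v))
            comp[ru] = rv                 # union the two components
    return T

nodes = 'abcd'
edges = [(1, 'a', 'b'), (4, 'a', 'c'), (3, 'b', 'c'), (2, 'c', 'd')]
print(sorted(kruskal(nodes, edges)))      # [('a', 'b'), ('b', 'c'), ('c', 'd')]
```

Note that the shortest edge, (a, b), is the first one added—exactly the “shortest edge is safe” argument in action. The edge (a, c) is skipped because by the time it’s considered, a and c are already connected.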
What if we consider all the edges incident at a single node—can we draw any conclusions then? Take a look at b, for example. By the definition of spanning trees, we must connect b to the rest somehow, which means we must include either (b, d) or (b, a). Again, it seems tempting to choose the shorter of the two. And once again, the greedy choice turns out to be very sensible. Once again, we prove that the alternative is inferior using a proof by contradiction: Assume that it was better to use (b, a). We’d build our minimum spanning tree with (b, a) included. Then, just for fun, we’d add (b, d), creating a cycle. But, hey—if we remove (b, a), we have another spanning tree, and because we’ve switched one edge for a shorter one, this new tree must be smaller. In other words, we have a contradiction, and the one without (b, d) couldn’t have been minimal in the first place. And this is the basic idea behind Prim’s algorithm, which we’ll look at after Kruskal’s.
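This idea—always add the lightest edge leaving the tree built so far—can also be previewed in a few lines. The chapter presents its own version of Prim’s algorithm later; this sketch uses Python’s heapq to keep the candidate edges sorted by weight, and runs on an invented graph rather than the one in Figure 7-3.

```python
import heapq

def prim(G, s):
    """Sketch of Prim's algorithm: grow a tree from s, always picking
    the lightest edge that leaves the tree. G maps each node to a dict
    of {neighbor: weight}; returns the set of tree edges."""
    T, visited = set(), {s}
    heap = [(w, s, v) for v, w in G[s].items()]
    heapq.heapify(heap)
    while heap:
        w, u, v = heapq.heappop(heap)     # lightest edge out of the tree
        if v in visited:
            continue                      # both ends in the tree: would make a cycle
        visited.add(v)
        T.add((u, v))
        for x, wx in G[v].items():        # new candidate edges from v
            if x not in visited:
                heapq.heappush(heap, (wx, v, x))
    return T

G = {
    'a': {'b': 1, 'c': 4},
    'b': {'a': 1, 'c': 3},
    'c': {'a': 4, 'b': 3, 'd': 2},
    'd': {'c': 2},
}
print(sorted(prim(G, 'a')))               # [('a', 'b'), ('b', 'c'), ('c', 'd')]
```

The greedy choice at node b plays out just as in the argument above: from the tree {a, b}, the algorithm takes the lighter edge (b, c) with weight 3 rather than (a, c) with weight 4.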
In fact, both of these ideas are special cases of a more general principle involving cuts. A cut is simply a partitioning of the graph nodes into two sets, and in this context we’re interested in the edges that pass between these two node sets. We say that these edges cross the cut. For example, imagine drawing a vertical line in Figure 7-3, right between d and g; this would give a cut that is crossed by five edges. By now I’m sure you’re catching on: We can be