f(b). Can we be sure that this is optimal?
Yes, we can, and we can prove it by contradiction, assuming that it is not optimal. We conjure up another, better
tree—and assume that it, too, has a and b as bottom-level siblings. (We know, by the arguments in the previous
section, that an optimal tree like this exists.) Once again, we can collapse a and b, and we end up with a solution to
our subproblem that is better than the one we had … but the one we had was optimal by assumption! In other words,
we cannot find a global solution that is better than one that contains an optimal subsolution.
Optimal Merging
Although Huffman’s algorithm is normally used to construct optimal prefix codes, there are other ways of interpreting
the properties of the Huffman tree. As explained initially, one could view it as a decision tree, where the expected
traversal depth is minimized. We can use the weights of the internal nodes in our interpretation too, though, yielding
a rather different application.
We can view the Huffman tree as a sort of fine-tuned divide-and-conquer tree, where we don’t do a flat balancing
like in Chapter 6, but where the balance has been designed to take the leaf weights into account. We can then interpret
the leaf weights as subproblem sizes, and if we assume that the cost of combining (merging) subproblems is linear (as is
often the case in divide and conquer), the sum of all the internal node weights represents the total work performed.
A practical example of this is merging sorted files. Merging two files of sizes n and m takes time linear in n + m. (This is similar to the problem of joining in relational databases or of merging sequences in algorithms such as timsort.) In other words, if you imagine the leaves in Figure 7-2 to be files and their weights to be file sizes,
the internal nodes represent the cost of the total merging. If we can minimize the sum of the internal nodes (or,
equivalently, the sum of all the nodes), we will have found the optimal merging schedule. (Exercise 7-9 asks you to
show that this really matters.)
We now need to show that a Huffman tree does, indeed, minimize the node weights. Luckily, we can piggyback
this proof on the previous discussion. We know that in a Huffman tree, the sum of depth times weight over all leaves is
minimized. Now, consider how each leaf contributes to the sum over the internal nodes: The leaf weight occurs as a summand once in each of its ancestor nodes—which means that the sum of the internal node weights is exactly the weighted leaf depth sum we just minimized! That is, sum(weight(node) for node in internal_nodes) is the same as sum(depth(leaf)*weight(leaf) for leaf in leaves). In other words, Huffman’s
algorithm is exactly what we need for our optimal merging.
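To make this concrete, here is a minimal sketch of the greedy merge scheduling, using Python’s heapq module; the file sizes are made-up examples:

from heapq import heapify, heappop, heappush

def optimal_merge_cost(sizes):
    # Total cost of merging files with the given sizes, following Huffman's
    # strategy: repeatedly merge the two smallest files. Each merge of sizes
    # a and b costs a + b, and the sum of these costs equals the sum of the
    # internal node weights in the Huffman tree.
    heap = list(sizes)
    heapify(heap)
    total = 0
    while len(heap) > 1:
        a, b = heappop(heap), heappop(heap)
        total += a + b
        heappush(heap, a + b)
    return total

print(optimal_merge_cost([3, 5, 8, 14, 20]))   # 104 for these made-up sizes

Note that this computes only the total cost; building the actual merge schedule amounts to recording which pair is combined at each step, just as Huffman’s algorithm records the tree structure.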
Tip ■ The Python standard library has several modules dealing with compression, including zlib, gzip, bz2, zipfile, and tarfile. The zipfile module deals with ZIP files, which use compression that is based on, among other things, Huffman codes.8
8. By the way, did you know that the ZIP code of Huffman, Texas, is 77336?
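As a small illustration of the tip (with made-up sample data), zlib’s DEFLATE format combines LZ77 with Huffman coding:

import zlib

data = b"abracadabra " * 100              # repetitive, made-up sample data
packed = zlib.compress(data, 9)           # DEFLATE: LZ77 plus Huffman coding
print(len(data), "->", len(packed))       # the repetition compresses well
assert zlib.decompress(packed) == data    # round-trip check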
Minimum Spanning Trees
Now let’s take a look at perhaps the best-known example of a greedy problem: finding minimum spanning trees. The problem is an old one—it’s been around at least since the early 20th century. It was first solved by the Czech mathematician Otakar Borůvka in 1926, in an effort to construct a cheap electrical network for Moravia. His algorithm has been rediscovered many times since then, and it still forms the basis of some of the fastest algorithms known today. The algorithms I’ll discuss in this section (Prim’s and Kruskal’s) are in some ways a bit simpler but have the same asymptotic running time complexity (O(m lg n), for n nodes and m edges).9 If you’re interested in the history
If you’re interested in the history
of this problem, including the repeated rediscoveries of the classic algorithms, take a look at the paper “On the History
of the Minimum Spanning Tree Problem,” by Graham and Hell. (For example, you’ll see that Prim and Kruskal aren’t
the only ones to lay claim to their eponymous algorithms.)
We’re basically looking for the cheapest way of connecting all the nodes of a weighted graph, given that we can
use only a subset of its edges to do the job. The cost of a solution is simply the weight sum for the edges we use.10 This
could be useful in building an electrical grid, constructing the core of a road or railroad network, laying out a circuit,
or even performing some forms of clustering (where we’d only almost connect all the nodes). A minimum spanning
tree can also be used as a foundation for an approximate solution to the traveling salesrep problem introduced in
Chapter 1 (see Chapter 11 for a discussion on this).
A spanning tree T of a connected, undirected graph G has the same node set as G and a subset of the edges.
If we associate an edge weight function with G so edge e has weight w(e), then the weight of the spanning tree, w(T), is
the sum of w(e) for every edge e in T. In the minimum spanning tree problem, we want to find a spanning tree over G
that has minimum weight. (Note that there may be more than one.) Note also that if G is disconnected, it will have no
spanning trees, so in the following, it is generally assumed that the graphs we’re working with are connected.
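For concreteness, here is a tiny sketch of the weight function w(T), assuming the dict-of-dicts representation of weighted graphs used earlier in the book and a made-up triangle graph:

def tree_weight(G, T):
    # G is a weighted graph as a dict of dicts (G[u][v] is the weight w(u, v));
    # T is a spanning tree given as a collection of edges (u, v).
    return sum(G[u][v] for u, v in T)

# A made-up example: a triangle with one heavy edge
G = {'a': {'b': 1, 'c': 4}, 'b': {'a': 1, 'c': 2}, 'c': {'a': 4, 'b': 2}}
print(tree_weight(G, [('a', 'b'), ('b', 'c')]))   # 3, the minimum spanning tree
print(tree_weight(G, [('a', 'b'), ('a', 'c')]))   # 5, a heavier spanning tree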
In Chapter 5, you saw how to build spanning trees using traversal; minimum spanning trees can also be built incrementally like this, and that’s where the greed comes in: We gradually build the tree by adding one edge at a time. At each step, we choose the cheapest (or lightest) edge among those permitted by our building
procedure. This choice is locally optimal (that is, greedy) and irrevocable. The main task for this problem, or any other
greedy problem, becomes showing that these locally optimal choices lead to a globally optimal solution.
The Shortest Edge
Consider Figure 7-3. Let the edge weights correspond to the Euclidean distances between the nodes as they’re drawn
(that is, the actual edge lengths). If you were to construct a spanning tree for this graph, where would you start? Could
you be certain that some edge had to be part of it? Or at least that a certain edge would be safe to include? Certainly
(e,i) looks promising. It’s tiny! In fact, it’s the shortest of all the edges—the one with the lowest weight. But is that
enough?
9. You can, in fact, combine Borůvka’s algorithm with Prim’s to get a faster algorithm.
10. Do you see why the result cannot contain any cycles, as long as we assume positive edge weights?
As it turns out, it is. Consider any spanning tree without the minimum-weight edge (e, i). The spanning tree would have to include both e and i (by definition), so it would also include a single path from e to i. If we now were to add (e, i) to the mix, we’d get a cycle, and in order to get back to a proper spanning tree, we’d have to remove one of the edges of this cycle—it doesn’t matter which. Because (e, i) is the smallest, removing any other edge would yield
a smaller tree than we started out with. Right? In other words, any tree not including the shortest edge can be made
smaller, so the minimum spanning tree must include the shortest edge. (As you’ll see, this is the basic idea behind
Kruskal’s algorithm.)
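If you want to convince yourself on a concrete case, here is a small brute-force sketch (over a made-up graph) that enumerates every spanning tree and checks that each minimum-weight one contains the globally shortest edge:

from itertools import combinations

def spanning_trees(nodes, edges):
    # Brute force: a spanning tree on n nodes has exactly n - 1 edges, so try
    # every (n - 1)-edge subset and keep those that connect all the nodes.
    # Only sensible for tiny graphs.
    n = len(nodes)
    for subset in combinations(edges, n - 1):
        reached = {next(iter(nodes))}
        changed = True
        while changed:                         # grow the set of reached nodes
            changed = False
            for u, v, w in subset:
                if (u in reached) != (v in reached):
                    reached |= {u, v}
                    changed = True
        if len(reached) == n:
            yield subset

# A made-up weighted graph, with edges given as (u, v, weight)
nodes = {'a', 'b', 'c', 'd'}
edges = [('a', 'b', 1), ('a', 'c', 4), ('b', 'c', 2), ('b', 'd', 5), ('c', 'd', 3)]

trees = list(spanning_trees(nodes, edges))
best = min(sum(w for u, v, w in t) for t in trees)
shortest = min(edges, key=lambda e: e[2])
# Every minimum spanning tree contains the globally shortest edge:
assert all(shortest in t for t in trees if sum(w for u, v, w in t) == best)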
What if we consider all the edges incident at a single node—can we draw any conclusions then? Take a look
at b, for example. By the definition of spanning trees, we must connect b to the rest somehow, which means we must include either (b, d) or (b, a). Again, it seems tempting to choose the shorter of the two. And once again, the greedy choice turns out to be very sensible. Once again, we prove that the alternative is inferior using a proof by contradiction: Assume that it was better to use (b, a). We’d build our minimum spanning tree with (b, a) included. Then, just for fun, we’d add (b, d), creating a cycle. But, hey—if we remove (b, a), we have another spanning tree, and because we’ve switched one edge for a shorter one, this new tree must be smaller. In other words, we have a contradiction, and the one without (b, d) couldn’t have been minimal in the first place. And this is the basic idea
behind Prim’s algorithm, which we’ll look at after Kruskal’s.
In fact, both of these ideas are special cases of a more general principle involving cuts. A cut is simply a
partitioning of the graph nodes into two sets, and in this context we’re interested in the edges that pass between these
two node sets. We say that these edges cross the cut. For example, imagine drawing a vertical line in Figure 7-3, right between d and g; this would give a cut that is crossed by five edges. By now I’m sure you’re catching on: We can be