def union(C, R, u, v):
    u, v = find(C, u), find(C, v)
    if R[u] > R[v]:                     # Union by rank: taller tree wins
        C[v] = u
    else:
        C[u] = v
    if R[u] == R[v]:                    # A tie: Move v up a level
        R[v] += 1
def kruskal(G):
    E = [(G[u][v], u, v) for u in G for v in G[u]]  # All edges, weight first
    T = set()                                       # The spanning tree edges
    C, R = {u: u for u in G}, {u: 0 for u in G}     # Comp. reps and ranks
    for _, u, v in sorted(E):                       # Edges by weight
        if find(C, u) != find(C, v):                # Different components?
            T.add((u, v))                           # Safe edge: add it
            union(C, R, u, v)                       # Merge the components
    return T
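As a quick sanity check, the function can be exercised on a tiny weighted graph. (The find helper below is a minimal, uncompressed stand-in for the one in Listing 7-4, and the graph G is just an illustrative example.)

def find(C, u):                         # Minimal stand-in: follow parent links
    while C[u] != u:                    # Not its own rep. yet?
        u = C[u]                        # Move up toward the root
    return u

G = {'a': {'b': 1, 'c': 3},             # A weighted triangle graph,
     'b': {'a': 1, 'c': 2},             # adjacency-dict style
     'c': {'a': 3, 'b': 2}}
print(kruskal(G))                       # MST edges: ('a','b') and ('b','c')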
All in all, the running time of Kruskal’s algorithm is Θ(m lg n), which comes from the sorting.
Note that you might want to represent your spanning trees differently (that is, not as sets of edges). The algorithm
should be easy to modify in this respect—or you could just build the structure you want based on the edge set T.
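For instance, a small helper (the name tree_as_adjacency is mine, not from the book) could turn T into an undirected adjacency mapping:

from collections import defaultdict

def tree_as_adjacency(T):
    N = defaultdict(set)                # Node -> set of tree neighbors
    for u, v in T:                      # Each tree edge is undirected, so
        N[u].add(v)                     # register it in both directions
        N[v].add(u)
    return dict(N)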
■ Note  The subproblem structure used in Kruskal’s algorithm is an example of a matroid, where the feasible partial solutions are simply sets (in this case, cycle-free edge sets). For matroids, greed works. Here are the rules: all subsets of feasible sets must also be feasible, and larger sets must have elements that can extend smaller ones. Cycle-free edge sets fit: any subset of a forest is still a forest, and a larger forest always contains an edge that extends a smaller one without creating a cycle.
Prim’s Algorithm
Kruskal’s algorithm is simple on the conceptual level: it’s a direct translation of the greedy approach to the spanning tree problem. As you just saw, though, there is some complexity in the validity checking. In this respect, Prim’s algorithm is a bit simpler.13 The main idea in Prim’s algorithm is to traverse the graph from a starting node, always adding the shortest edge connected to the tree. This is safe because the edge will be the shortest one crossing the cut around our partial solution, as explained earlier.
This means that Prim’s algorithm is just another traversal algorithm, which should be a familiar concept if you’ve read Chapter 5. As discussed in that chapter, the main difference between traversal algorithms is the ordering of our “to-do” list: among the unvisited nodes we’ve discovered, which one do we grow our traversal tree to next? In breadth-first search, we used a simple queue (that is, a deque); in Prim’s algorithm, we simply replace this queue with a priority queue, implemented with a heap, using the heapq library (discussed in a “Black Box” sidebar in Chapter 6).
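A micro-example of the ordering such a to-do list gives us (the weights and node names here are made up): heapq always hands back the entry with the lowest weight first.

from heapq import heappush, heappop

Q = []                                  # The heap-based "to-do" list
heappush(Q, (3, 'c'))                   # (weight, node) pairs compare
heappush(Q, (1, 'a'))                   # by weight first
heappush(Q, (2, 'b'))
print(heappop(Q))                       # (1, 'a'): lightest entry first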
There is one important issue here, though: Most likely, we will discover new edges pointing to nodes that are already in our queue. If the new edge we discovered is shorter than the previous one, we should adjust the priority based on this new edge. This, however, can be quite a hassle. We’d need to find the given node inside the heap, change the priority, and then restructure the heap so that it would still be correct. You could do that by having a mapping from each node to its position in the heap, but then you’d have to update that mapping when performing heap operations, and you could no longer use the heapq library.
13. Actually, the difference is deceptive. Prim’s algorithm is based on traversal and heaps, concepts we’ve already dealt with, while Kruskal’s algorithm introduced a new disjoint set mechanism. In other words, the difference in simplicity is mostly a matter of perspective and abstraction.
It turns out there’s another way, though. A really pretty solution, which will also work with other priority-based traversals (such as Dijkstra’s algorithm and A*, discussed in Chapter 9), is to simply add the nodes multiple times. Each time you find an edge to a node, you add the node to the heap (or other priority queue) with the appropriate weight, and you don’t care if it’s already in there. Why could this possibly work? (A sketch of the pattern follows the list below.)
•	We’re using a priority queue, so if a node has been added multiple times, by the time we remove one of its entries, it will be the one with the lowest weight (at that time), which is the one we want.
•	We make sure we don’t add the same node to our traversal tree more than once. This can be ensured by a constant-time membership check. Therefore, all but one of the queue entries for any given node will be discarded.
•	The multiple additions won’t affect the asymptotic running time (see Exercise 7-17).
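Here is a minimal sketch of the pattern, assuming a heap Q of (weight, node) pairs and a visited set for the membership check (both names are just illustrative):

from heapq import heappop

Q = [(1, 'a'), (2, 'b'), (2, 'a')]      # A valid heap; note the duplicate 'a'
visited = set()
while Q:
    w, u = heappop(Q)                   # Lightest remaining entry
    if u in visited:                    # A stale duplicate? Just discard it
        continue
    visited.add(u)                      # The first entry for u is the best one
    print(u, w)                         # 'a' is handled once, with weight 1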
There are important consequences for the actual running time as well. The (much) simpler code isn’t only easier
to understand and maintain; it also has a lot less overhead. And because we can use the super-fast heapq library, the
net consequence is most likely a large performance gain. (If you’d like to try the more complex version, which is used
in many algorithm books, you’re welcome, of course.)
■ Note  Re-adding a node with a lower weight is equivalent to a relaxation, as discussed in Chapter 4. As you’ll see, I also add the predecessor node to the queue, making any explicit relaxation unnecessary. When implementing Dijkstra’s algorithm in Chapter 9, however, I use a separate relax function. These two approaches are interchangeable (so you could have Prim’s with relax and Dijkstra’s without it).
You can see my version of Prim’s algorithm in Listing 7-5. Because heapq doesn’t (yet) support sorting keys, as list.sort and friends do, I’m using (weight, node) pairs in the heap, discarding the weights when the nodes are popped off. Beyond the use of a heap, the code is similar to the implementation of breadth-first search in Listing 5-10. That means that a lot of the understanding here should come for free.
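In the meantime, here is a sketch along the lines just described, with the predecessor riding along in each queue entry, as mentioned in the note above. (The name prim_sketch is mine, and details may differ from the actual Listing 7-5.)

from heapq import heappop, heappush

def prim_sketch(G, s):
    P = {}                              # Traversal tree: node -> predecessor
    Q = [(0, None, s)]                  # (weight, predecessor, node)
    while Q:
        _, p, u = heappop(Q)            # Lightest edge into the fringe
        if u in P:                      # Already in the tree? Then this
            continue                    # entry is a stale duplicate
        P[u] = p                        # Add u, remembering its predecessor
        for v in G[u]:                  # Push every edge out of u,
            heappush(Q, (G[u][v], u, v))  # duplicates and all
    return P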