Chapter 5
■
traversal: the skeleton key of algorithmiCs
112
Summary
In this chapter, I’ve shown you the basics of moving around in graphs, be they directed or not. This idea of traversal
forms the basis—directly or conceptually—for many of the algorithms you’ll learn later in this book and for other
algorithms that you’ll probably encounter later. I’ve used examples of maze traversal algorithms (such as Trémaux’s
and Ore’s), although they were mainly meant as starting points for more computer-friendly approaches. The general
procedure for traversing a graph involves maintaining a conceptual to-do list (a queue) of nodes you’ve discovered,
where you check off those that you have actually visited. The list initially contains only the start node, and in each step
you visit (and check off)
one of the nodes, while adding its neighbors to the list. The ordering (schedule) of items on
the list determines, to a large extent, what kind of traversal you are doing: using a LIFO queue (stack) gives depth-first
search (DFS), while using a FIFO queue gives breadth-first search (BFS), for example. DFS, which is equivalent to a
relatively direct recursive traversal, lets you find discover and finish times for each node, and the interval between
these for a descendant will fall inside that of an ancestor. BFS has the useful property that it can be used to find the
shortest (unweighted) paths from one node to another. A variation of DFS, called
iterative deepening DFS, also has this
property, but it is more useful for searching in large trees, such as the state spaces discussed in Chapter 11.
If a graph consists of several connected components, you will need to restart your traversal once for each
component. You can do this by iterating over all the nodes, skipping those that have already been visited, and starting
a traversal from the others. In a directed graph, this approach may be necessary even if the graph is connected
because the edge directions may prevent you from reaching all nodes otherwise. To find the
strongly connected
components of a directed graph—the parts of the graph where all nodes can reach each other—a
slightly more
involved procedure is needed. The algorithm discussed here, Kosaraju’s algorithm, involves first finding the finish
times for all nodes and then running a traversal in the
transposed graph (the graph with all edges reversed), using
descending finish times to select starting points.
If You’re Curious ...
If you like traversal, don’t worry. We’ll be doing more of that soon enough. You can also find details on DFS,
BFS, and the SCC algorithm discussed in, for example, the book by Cormen et al. (see “References,” Chapter 1).
If you’re interested in finding strong components, there are references for Tarjan’s and Gabow’s (or, rather, the
Cheriyan-Mehlhorn/Gabow) algorithms in the “References” section of this chapter.
Exercises
5-1. In the components function in Listing 5-2, the set of seen nodes
is updated with an entire
component at a time. Another option would be to add the nodes one by one inside walk. How would
that be different (or, perhaps, not so different)?
5-2. If you’re faced with a graph where each node has an even degree, how would you go about finding
an Euler tour?
5-3. If every node in a directed graph has the same in-degree as out-degree, you could find a
directed
Euler tour. Why is that? How would you go about it, and how is this related to Trémaux’s algorithm?
5-4. One basic operation in image processing is the so-called
flood fill, where a region in an image is
filled with a single color. In painting applications (such as GIMP or Adobe Photoshop), this is typically
done with a paint bucket tool. How would you implement this sort of filling?
5-5. In Greek mythology, when Ariadne helped Theseus overcome
the Minotaur and escape the
labyrinth, she gave him a ball of fleece thread so he could find his way out again. But what if Theseus
forgot to fasten the thread outside on his way in and remembered the ball only once he was thoroughly
lost—what could he use it for then?
Chapter 5
■
traversal: the skeleton key of algorithmiCs
113
5-6. In recursive DFS, backtracking occurs when you return from one of the recursive calls. But where
has the backtracking gone in the iterative version?
5-7. Write a nonrecursive version of DFS that can deal determine finish times.
5-8. In dfs_topsort (Listing 5-8), a recursive DFS is started from every node (although it terminates
immediately if the node has already been visited). How can we be sure that we will get a valid
topological sorting, even though the order of the start nodes is completely arbitrary?
5-9. Write a version of DFS where you have hooks (overridable functions) that let the user perform
custom processing in pre- and postorder.
5-10. Show that if (and only if) DFS finds no back edges, the graph being traversed is acyclic.
5-11. What challenges would you face if you wanted to use other traversal algorithms than DFS to look
for cycles in directed graphs? Why don’t you face these challenges in undirected graphs?
5-12. If you
run DFS in an undirected graph, you won’t have any forward or cross edges. Why is that?
5-13. Write a version of BFS that finds the distances from the start node to each of the others, rather
than the actual paths.
5-14. As mentioned in Chapter 4, a graph is called
bipartite if you can partition the nodes into two sets
so that no neighbors are in the same set. Another way of thinking about this is that you’re coloring each
node either black or white (for example) so that no neighbors get the same color. Show how you’d find
such a bipartition (or two-coloring), if one exists, for any undirected graph.
5-15. If you reverse all the edges of a directed graph, the strongly connected components remain the
same. Why is that?
5-16. Let X and Y be two strongly connected components of the same graph,
G. Assume that there is at
least one edge from X to Y.
If you run DFS on G (restarting as needed, until all nodes have been visited),
the latest finish time in X will always be later than the latest in Y. Why is that?
5-17. In Kosaraju’s algorithm, we find starting nodes for the final traversal by descending finish times
from an initial DFS, and we perform the traversal in the transposed graph
(that is, with all edges reversed). Why couldn’t we just use
ascending finish times in the
original graph?
References
Cheriyan, J. and Mehlhorn, K. (1996). Algorithms for dense graphs and networks on the random access computer.
Algorithmica, 15(6):521-549.
Littlewood, D. E. (1949).
The Skeleton Key of Mathematics: A Simple Account of Complex Algebraic Theories.
Hutchinson & Company, Limited.
Lucas, É. (1891).
Récréations Mathématiques, volume 1, second edition. Gauthier-Villars et fils, Imprimeurs-Libraires.
Available
online at
http://archive.org
.
Lucas, É. (1896).
Récréations Mathématiques, volume 2, second edition. Gauthier-Villars et fils, Imprimeurs-Libraires.
Available online at
http://archive.org
.
Ore, O. (1959). An excursion into labyrinths.
Mathematics Teacher, 52:367-370.
Tarjan, R. (1972). Depth-first search and linear graph algorithms.
SIAM Journal on Computing, 1(2): 146-160.