Chapter 6 ■ Divide, Combine, and Conquer
Sadly, the various functions of the bisect library don't support the key argument used in list.sort, for example. (As of Python 3.10, they do accept a key argument; for earlier versions, a workaround is needed.) You can achieve similar functionality with the so-called decorate, sort, undecorate (or, in this case, decorate, search, undecorate) pattern, or DSU for short:
>>> from bisect import bisect_left
>>> seq = "I aim to misbehave".split()
>>> dec = sorted((len(x), x) for x in seq)
>>> keys = [k for (k, v) in dec]
>>> vals = [v for (k, v) in dec]
>>> vals[bisect_left(keys, 3)]
'aim'
Or, you could do it more compactly:
>>> seq = "I aim to misbehave".split()
>>> dec = sorted((len(x), x) for x in seq)
>>> dec[bisect_left(dec, (3, ""))][1]
'aim'
As you can see, this involves creating a new, decorated list, which is a linear operation. Clearly, if we did this before every search, there'd be no point in using bisect. If, however, we can keep the decorated list between searches, the pattern can be useful. If the sequence isn't sorted to begin with, we can perform the DSU as part of the sorting, as in the previous example.
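One way to keep the decorated list useful between searches is to insert new items directly into sorted position with insort, avoiding a full re-sort. The helper names below are my own, not from the text; this is just a sketch of the idea:

```python
from bisect import bisect_left, insort

# Decorated list of (len(word), word) pairs, kept sorted at all times.
dec = sorted((len(w), w) for w in "I aim to misbehave".split())

def add_word(dec, word):
    # Insert the decorated pair in sorted position:
    # logarithmic search plus linear shifting.
    insort(dec, (len(word), word))

def find_by_length(dec, n):
    # Return the first word of length n, or None if there is none.
    i = bisect_left(dec, (n, ""))
    if i < len(dec) and dec[i][0] == n:
        return dec[i][1]
    return None

add_word(dec, "now")
print(find_by_length(dec, 3))   # 'aim' still sorts before 'now'
```

Each insertion is still linear because of the element shifting, but the list stays sorted, so every search remains logarithmic.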
Traversing Search Trees ... with Pruning
Binary search is the bee's knees. It's one of the simplest algorithms out there, but it really packs a punch. There is one catch, though: to use it, your values must be sorted. Now, if we could keep them in a linked list, that wouldn't be a problem. For any object we wanted to insert, we'd just find the position with bisection (logarithmic) and then insert it (constant). The problem is that this won't work: binary search needs to be able to check the middle value in constant time, which we can't do with a linked list. And, of course, using an array (such as Python's lists) won't help either. It'll help with the bisection, but it ruins the insertion.
If we want a modifiable structure that's efficient for search, we need some kind of middle ground. We need a structure that is similar to a linked list (so we can insert elements in constant time) but that still lets us perform a binary search. You may already have figured the whole thing out, based on the section title, but bear with me. The first thing we need when searching is to access the middle item in constant time. So, let's say we keep a direct link to it. From there, we can go left or right, and we'll need to access the middle element of either the left half or the right half. So ... we can just keep direct links from the first item to these two: one "left" reference and one "right" reference.

In other words, we can just represent the structure of the binary search as an explicit tree structure! Such a tree would be easily modifiable, and we could traverse it from root to leaf in logarithmic time. So, searching is really our old friend traversal, but with pruning. We wouldn't want to traverse the entire tree (resulting in a so-called linear scan). Unless we're building the tree from a sorted sequence of values, the "middle element of the left half" terminology may not be all that helpful. Instead, we can think of what we need to implement our pruning. When we look at the root, we need to be able to prune one of the subtrees. (If we found the value we wanted in an internal node and the tree didn't contain duplicates, we wouldn't continue in either subtree, of course.)
The one thing we need is the so-called search tree property: for a subtree rooted at r, all the values in the left subtree are smaller than (or equal to) the value of r, while those in the right subtree are greater. In other words, the value at a subtree root bisects the subtree. An example tree with this property is shown in Figure 6-6, where the node labels indicate the values we're searching. A tree structure like this could be useful in implementing a set; that is, we could check whether a given value was present. To implement a mapping, however, each node would contain both a key, which we searched for, and a value, which was what we wanted.
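The "traversal with pruning" idea can be sketched directly as a recursive membership test. This is an illustrative sketch (the Node class and tree shape are my own, modeled loosely on Figure 6-6), not the book's Listing 6-2:

```python
class Node:
    def __init__(self, value, lft=None, rgt=None):
        self.value = value
        self.lft = lft
        self.rgt = rgt

def contains(node, value):
    # Traverse from the root, pruning one subtree at every step.
    if node is None:
        return False                        # Fell off the tree: not present
    if value < node.value:
        return contains(node.lft, value)    # Prune the right subtree
    if value > node.value:
        return contains(node.rgt, value)    # Prune the left subtree
    return True                             # Found it

# A tree with the search tree property: 8 at the root, smaller values
# to the left, greater values to the right.
root = Node(8, Node(4, None, Node(6)), Node(12, None, Node(14)))
print(contains(root, 6))   # True
print(contains(root, 7))   # False
```

At each node exactly one subtree is pruned, so a balanced tree is searched in logarithmic time.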
Usually, you don't build a tree in bulk (although that can be useful at times); the main motivation for using trees is that they're dynamic, and you can add nodes one by one. To add a node, you search for where it should have been and then add it as a new leaf there. For example, the tree in Figure 6-6 might have been built by initially adding 8 and then 12, 14, 4, and 6. A different ordering might have given a different tree.

Listing 6-2 gives you a simple implementation of a binary search tree, along with a wrapper that makes it look a bit like a dict. You could use it like this, for example:
>>> tree = Tree()
>>> tree["a"] = 42
>>> tree["a"]
42
>>> "b" in tree
False
As you can see, I've implemented insertion and search as free-standing functions, rather than methods. That's so that they'll also work on None nodes. (You don't have to do it like that, of course.)