BLACK BOX: BINARY HEAPS, HEAPQ, AND HEAPSORT
A priority queue is a generalization of the LIFO and FIFO queues discussed in Chapter 5. Instead of basing the order only on when an item is added, each item receives a priority, and you always retrieve the remaining item with the lowest priority. (You could also use maximum priority, but you normally can't have both in the same structure.) This kind of functionality is important as a component of several algorithms, such as Prim's, for finding minimum spanning trees (Chapter 7), or Dijkstra's, for finding shortest paths (Chapter 9). There are many ways of implementing a priority queue, but probably the most common data structure used for this purpose is the binary heap. (There are other kinds of heaps, but the unqualified term heap normally refers to binary heaps.)
Binary heaps are complete binary trees. That means they are as balanced as they can get, with each level of the tree filled up, except (possibly) the lowest level, which is filled as far as possible from the left. Arguably the most important aspect of their structure, though, is the so-called heap property: the value of every parent is smaller than those of both children. (This holds for a minimum heap; for a maximum heap, each parent is greater.) As a consequence, the root has the smallest value in the heap. The property is similar to that of search trees but not quite the same, and it turns out that the heap property is much easier to maintain without sacrificing the balance of the tree. You never modify the structure of the tree by splitting or rotating nodes in a heap. You only ever need to swap parent and child nodes to restore the heap property. For example, to "repair" the root of a subtree (which is too big), you simply swap it with its smallest child and repair that subtree recursively (as needed).
The heapq module contains an efficient heap implementation that represents its heaps in lists, using a common "encoding": if a is a heap, the children of a[i] are found in a[2*i+1] and a[2*i+2]. This means that the root (the smallest element) is always found in a[0]. You can build a heap from scratch, using the heappush and heappop functions. You might also start out with a list that contains lots of values, and you'd like to make it into a heap. In that case, you can use the heapify function.17 It basically repairs every subtree root, starting at the bottom right, moving left and up. (In fact, by skipping the leaves, it needs only work on the left half of the array.) The resulting running time is linear (see Exercise 6-9). If your list is sorted, it's already a valid heap, so you can just leave it alone.
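For instance (with made-up sample values), heapify turns an arbitrary list into a heap in place, after which the index encoding described above can be checked directly:

```python
from heapq import heapify

data = [9, 4, 7, 1, 3, 8]    # arbitrary sample values
heapify(data)                # in-place, linear time

# The heap property under the list encoding: a[i] <= a[2*i+1] and a[2*i+2]
n = len(data)
assert all(data[i] <= data[c]
           for i in range(n)
           for c in (2*i + 1, 2*i + 2) if c < n)
print(data[0])               # 1 -- the root is the smallest element
```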
here’s an example of building a heap piece by piece:
 
>>> from heapq import heappush, heappop
>>> from random import randrange
>>> Q = []
>>> for i in range(10):
...         heappush(Q, randrange(100))
... 
>>> Q
[15, 20, 56, 21, 62, 87, 67, 74, 50, 74]
>>>  [heappop(Q) for i in range(10)]
[15, 20, 21, 50, 56, 62, 67, 74, 74, 87]
 
17. It is quite common to call this operation build-heap and to reserve the name heapify for the operation that repairs a single node. Thus, build-heap runs heapify on all nodes but the leaves.


Chapter 6 ■ Divide, Combine, and Conquer
Just like bisect, the heapq module is implemented in C, but it used to be a plain Python module. For example, here is the code (from Python 2.3) for the function that moves an object up in the heap until its parent is no greater than it (again, with my comments):
 
def sift_up(heap, startpos, pos):
    newitem = heap[pos]                         # The item we're sifting up
    while pos > startpos:                       # Don't go beyond the root
        parentpos = (pos - 1) >> 1              # The same as (pos - 1) // 2
        parent = heap[parentpos]                # Who's your daddy?
        if parent <= newitem: break             # Valid parent found
        heap[pos] = parent                      # Otherwise: copy parent down
        pos = parentpos                         # Next candidate position
    heap[pos] = newitem                         # Place the item in its spot
 
Note that the original function was called _siftdown because it's sifting the value down in the list. I prefer to think of it as sifting it up in the implicit tree structure of the heap, though. Note also that, just like bisect_right, the implementation uses a loop rather than recursion.
In addition to heappop, there is heapreplace, which will pop the smallest item and insert a new element at the same time, which is a bit more efficient than a heappop followed by a heappush. The heappop operation returns the root (the first element). To maintain the shape of the heap, the last item is moved to the root position, and from there it is swapped downward (in each step, with its smallest child) until it is smaller than both its children. The heappush operation is just the reverse: the new element is appended to the list and is repeatedly swapped with its parent until it is greater than its parent. Both of these operations are logarithmic (also in the worst case, because the heap is guaranteed to be balanced).
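The downward repair used by heappop and heapreplace can be sketched in the same style as the function above; this is a simplified illustration, not the actual heapq source:

```python
def sift_down(heap, pos):
    end = len(heap)
    newitem = heap[pos]                     # The item we're sifting down
    childpos = 2*pos + 1                    # Left child
    while childpos < end:                   # Stop when we reach the leaves
        rightpos = childpos + 1
        if rightpos < end and heap[rightpos] < heap[childpos]:
            childpos = rightpos             # Pick the smaller child
        if newitem <= heap[childpos]:
            break                           # Smaller than both children
        heap[pos] = heap[childpos]          # Otherwise: move the child up
        pos = childpos
        childpos = 2*pos + 1
    heap[pos] = newitem                     # Place the item in its spot

# Emulating heappop: move the last item to the root, then repair
heap = [1, 3, 7, 4, 9, 8]
smallest = heap[0]
heap[0] = heap.pop()
sift_down(heap, 0)
print(smallest, heap)                       # 1 [3, 4, 7, 8, 9]
```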
Finally, the module has (since version 2.6) the utility functions merge, nlargest, and nsmallest, for merging sorted inputs and finding the n largest and smallest items in an iterable, respectively. The latter two functions, unlike the others in the module, take the same kind of key argument as list.sort. You can simulate this in the other functions with the DSU pattern, as mentioned in the sidebar on bisect.

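A quick illustration of merge and of the key argument (the sample values here are made up):

```python
from heapq import merge, nlargest, nsmallest

# merge lazily combines already-sorted inputs into one sorted stream
print(list(merge([1, 3, 5], [2, 4, 6])))   # [1, 2, 3, 4, 5, 6]

words = ["heap", "priority", "queue", "tree"]
print(nsmallest(2, words, key=len))        # ['heap', 'tree']
print(nlargest(1, words, key=len))         # ['priority']
```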
Although you would probably never use them that way in Python, the heap operations can also form a simple, efficient, and asymptotically optimal sorting algorithm called heapsort. It is normally implemented using a max-heap and works by first performing heapify on a sequence, then repeatedly popping off the root (as in heappop), and finally placing it in the now empty last slot. Gradually, as the heap shrinks, the original array is filled from the right with the largest element, the second largest, and so forth. In other words, heapsort is basically selection sort where a heap is used to implement the selection. Because the initialization is linear and each of the n selections is logarithmic, the running time is loglinear, that is, optimal.
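Using heapq's min-heap directly (rather than the in-place max-heap version just described), the idea can be sketched like this:

```python
from heapq import heapify, heappop

def heapsort(seq):
    heap = list(seq)                       # Copy, so the input is untouched
    heapify(heap)                          # Linear-time initialization
    return [heappop(heap) for _ in range(len(heap))]  # n logarithmic pops

print(heapsort([27, 5, 3, 16, 13, 10, 1]))   # [1, 3, 5, 10, 13, 16, 27]
```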
Summary
The algorithm design strategy of divide and conquer involves a decomposition of a problem into roughly equal-sized 
subproblems, solving the subproblems (often by recursion), and combining the results. The main reason this is useful 
is that the workload is balanced, typically taking you from a quadratic to a loglinear running time. Important examples
of this behavior include merge sort and quicksort, as well as algorithms for finding the closest pair or the convex hull 
of a point set. In some cases (such as when searching a sorted sequence or selecting the median element), all but one 
of the subproblems can be pruned, resulting in a traversal from root to leaf in the subproblem graph, yielding even 
more efficient algorithms.


The subproblem structure can also be represented explicitly, as it is in binary search trees. Each node in a search 
tree is greater than the descendants in its left subtree but less than those in its right subtree. This means that a binary 
search can be implemented as a traversal from the root. Simply inserting random values haphazardly will, on average, 
yield a tree that is balanced enough (resulting in logarithmic search times), but it is also possible to balance the tree, 
using node splitting or rotations, to guarantee logarithmic running times in the worst case.
If You’re Curious ...
If you like bisection, you should look up interpolation search, which for uniformly distributed data has an average-case  
running time of O(lg lg n). For implementing sets (that is, efficient membership checking) other than sorted 
sequences, search trees and hash tables, you could have a look at Bloom filters. If you like search trees and related 
structures, there are lots of them out there. You could find tons of different balancing mechanisms (red-black trees
