Listing 7-2.  Extracting Huffman Codes from a Huffman Tree
def codes(tree, prefix=""):
    if len(tree) == 1:
        yield (tree, prefix)                    # A leaf with its code
        return
    for bit, child in zip("01", tree):          # Left (0) and right (1)
        for pair in codes(child, prefix + bit): # Get codes recursively
            yield pair
 
The codes function yields (char, code) pairs, suitable for use in the dict constructor, for example. To use such a dict to compress a text, you’d just iterate over the text and look up each character. To decompress the text, you’d rather use the Huffman tree directly, traversing it using the bits in the input for directions (that is, determining whether you should go left or right); I’ll leave the details as an exercise for the reader.
6. If a future version of the heapq library lets you use a key function, as you can with list.sort, you’d no longer need this tuple wrapping at all, of course.
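To make both directions concrete, here is a minimal sketch built on the codes function from Listing 7-2 (the compress and decompress names are mine, not the book’s, and the tree format is the nested-list one used throughout):

def compress(text, tree):
    table = dict(codes(tree))                   # Maps each character to its bit string
    return "".join(table[char] for char in text)

def decompress(bits, tree):
    result = []
    node = tree
    for bit in bits:
        node = node["01".index(bit)]            # "0" goes left, "1" goes right
        if len(node) == 1:                      # We've reached a leaf...
            result.append(node)                 # ...so emit its character
            node = tree                         # ...and restart at the root
    return "".join(result)

For example, with tree = [['a', 'b'], ['c', ['d', 'e']]], compress("bad", tree) yields "0100110", and decompress("0100110", tree) returns "bad".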


The First Greedy Choice
I’m sure you can see that the Huffman codes will let you faithfully encode a text and then decode it again—but how can we know that the result is optimal (within the class of codes we’re considering)? That is, why does this simple, greedy procedure minimize the expected leaf depth?
As we usually do, we now turn to induction: We need to show that we’re safe all the way from start to finish—that 
the greedy choice won’t get us in trouble. We can often split this proof into two parts, often called (i) the greedy choice property and (ii) optimal substructure (see, for example, Cormen et al. in the “References” section of Chapter 1). 
The greedy choice property means that the greedy choice gives us a new partial solution that is part of an optimal 
one. The optimal substructure, which is very closely related to the material of Chapter 8, means that the rest of the 
problem, after we’ve made our choice, can also be solved just like the original—if we can find an optimal solution to 
the subproblem, we can combine it with our greedy choice to get a solution to the entire problem. In other words, an 
optimal solution is built from optimal subsolutions.
To show the greedy choice property for Huffman’s algorithm, we can use an exchange argument (see, for example, 
Kleinberg and Tardos in the “References” section of Chapter 1). This is a general technique used to show that our 
solution is at least as good as an optimal one (and therefore optimal)—or in this case, that there exists a solution 
with our greedy choice that is at least this good. The “at least as good” part is proven by taking a hypothetical (totally 
unknown) optimal solution and then gradually changing it into our solution (or, in this case, one containing the bits 
we’re interested in) without making it worse.
The greedy choice for Huffman’s algorithm involves placing the two lightest elements as sibling leaves on the 
lowest level of the tree. (Note that we’re worried about only the first greedy choice; the optimal substructure will deal 
with the rest of the induction.) We need to show that this is safe—that there exists an optimal solution where the two 
lightest elements are, indeed, bottom-level sibling leaves. Start the exchange argument by positing another optimal 
tree where these two elements are not lowest-level siblings. Let a and b be the lowest-frequency elements, and assume 
that this hypothetical, optimal tree has c and d as sibling leaves at maximum depth. We assume that a is lighter 
(has a lower weight/frequency) than b and that c is lighter than d.7 Under the circumstances, we also know that a is lighter than c and b is lighter than d. For simplicity, let’s assume that the frequencies of a and d are different because otherwise the proof is simple (see Exercise 7-8).

7. They might also have equal weights/frequencies; that doesn’t affect the argument.
What happens if we swap a and c? And then swap b and d? For one thing, we now have a and b as bottom-level 
siblings, which we wanted, but what has happened to the expected leaf depth? You could fiddle around with the 
full expressions for weighted sums here, but the simple idea is: We’ve moved some heavy nodes up in the tree and 
moved some light nodes down. This means that some short paths are now given a higher weight in the sum, while 
some long paths have been given a lower weight. All in all, the total cost cannot have increased. (Indeed, if the depths 
and weights are all different, our tree will be better, and we have a proof by contradiction because our hypothetical 
alternative optimum cannot exist—the greedy way is the best there is.)
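To spell out the arithmetic behind “cannot have increased,” write depth(x) for the depth of leaf x (notation introduced here for convenience, not taken from the text). Swapping a and c changes the weighted sum by

(f(a) × depth(c) + f(c) × depth(a)) − (f(a) × depth(a) + f(c) × depth(c))
    = (f(a) − f(c)) × (depth(c) − depth(a)) ≤ 0

because f(a) ≤ f(c) and depth(c) ≥ depth(a). The same calculation, with b and d in place of a and c, covers the second swap.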
Going the Rest of the Way
Now, that was the first half of the proof. We know that making the first greedy choice was OK (the greedy choice 
property), but we need to know that it’s OK to keep using greedy choices (optimal substructure). We need to get a 
handle on what the remaining subproblem is first, though. Preferably, we’d like it to have the same structure as the 
original, so the machinery of induction can do its job properly. In other words, we’d like to reduce things to a new, 
smaller set of elements for which we can build an optimal tree and then show how we can build on that.
The idea is to view the first two combined leaves as a new element, ignoring the fact that it’s a tree. We worry 
only about its root. The subproblem then becomes finding an optimal tree for this new set of elements—which we 
can assume is all right, by induction. The only remaining question is whether this tree is optimal once we expand this 
node back to a three-node subtree, by once again including its leaf children; this is the crucial part that will give us the 
induction step.
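For reference, the lumping can be seen directly in the greedy construction loop. The following is a condensed sketch, not the book’s exact listing: it assumes the nested-list tree format used by codes, takes the frequencies as a separate sequence, and uses the (frequency, id, tree) tuple wrapping that footnote 6 alludes to:

from heapq import heapify, heappop, heappush
from itertools import count

def huffman(seq, frq):
    num = count()                               # Unique ids keep the heap from comparing trees
    trees = list(zip(frq, num, seq))            # (frequency, id, tree) triples
    heapify(trees)                              # A min-heap ordered by frequency
    while len(trees) > 1:
        fa, _, a = heappop(trees)               # Pop the two lightest trees...
        fb, _, b = heappop(trees)
        heappush(trees, (fa + fb, next(num), [a, b]))  # ...and lump them into one node
    return trees[0][-1]                         # The single tree that remains

Combining it with Listing 7-2, dict(codes(huffman(seq, frq))) gives a complete code table.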

Let’s say our two leaves are, once again, a and b, with frequencies f(a) and f(b). We lump them together as a 
single node with a frequency f(a) + f(b) and construct an optimal tree. Let’s assume that this combined node ends 
up at depth D. Then its contribution to the total tree cost is D × (f(a) + f(b)). If we now expand the two children, their 
parent node no longer contributes to the cost, but the total contribution of the leaves (which are now at depth D + 1) 
will be (D + 1) × (f(a) + f(b)). In other words, the full solution has a cost that exceeds the optimal subsolution by exactly f(a) + f(b). This difference doesn’t depend on the structure of the rest of the tree, so an optimal solution to the subproblem yields an optimal solution to the full problem, and the induction step goes through.
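As a quick numeric check of this bookkeeping, here is a small sketch using the codes function from Listing 7-2 (the frequencies and the '*' placeholder standing in for the lumped node are made up for the example):

def cost(tree, frq):
    # Weighted leaf depth: each code's length equals its leaf's depth
    return sum(frq[char] * len(code) for char, code in codes(tree))

frq = {'a': 2, 'b': 3, '*': 5, 'c': 7}          # '*' is a and b lumped: f(a) + f(b) = 5
lumped = ['*', 'c']                             # A tree over the reduced element set
expanded = [['a', 'b'], 'c']                    # The lump grown back into its two leaves
assert cost(expanded, frq) == cost(lumped, frq) + frq['a'] + frq['b']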
