Algorithms For Dummies



Date: 15.07.2021

 Struggling with Big Data

different compression strategies work better with different bit sequences. This is the no-free-lunch problem discussed in Chapter 1. The option you choose depends on the content of the data you need to compress.
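
As a rough illustration of that point, the following sketch runs three compressors from the Python standard library (zlib, bz2, and lzma) over two very different inputs. The sample data and the helper name compressed_sizes are assumptions made for this illustration and are not part of the book's code; the exact sizes you see depend on your Python build.

```python
import bz2
import lzma
import random
import zlib

def compressed_sizes(data: bytes) -> dict:
    """Return the compressed size in bytes for each stdlib compressor."""
    return {
        "zlib": len(zlib.compress(data)),
        "bz2": len(bz2.compress(data)),
        "lzma": len(lzma.compress(data)),
    }

random.seed(0)  # fixed seed so repeated runs use the same random sample

# Two illustrative inputs of equal length (18,000 bytes each).
samples = {
    "repetitive": b"abcabcabc" * 2000,
    "random bytes": bytes(random.randrange(256) for _ in range(18000)),
}

for name, data in samples.items():
    sizes = compressed_sizes(data)
    smallest = min(sizes, key=sizes.get)
    print(f"{name}: original {len(data)} bytes, {sizes}, smallest: {smallest}")
```

The repetitive input shrinks to a tiny fraction of its size with every compressor, while the random bytes barely shrink at all (and may even grow slightly because of format overhead). Which compressor produces the smallest result can differ between the two inputs.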

To see how compression varies by the sample you provide, you should try various text samples using the same algorithm. The following Python example uses the ZIP algorithm (through Python's zlib module) to compress the text of The Adventures of Sherlock Holmes, by Arthur Conan Doyle, and then to reduce the size of a randomly generated sequence of characters of the same length. (You can find the complete code for this example in the Compression Performances section of the A4D; 14; Compression.ipynb file of the downloadable source code for this book; see the Introduction for details.)

import urllib.request
import zlib
from random import randint

# Fetch the book text from the URL used in the original example.
url = "http://gutenberg.pglaf.org/1/6/6/1661/1661.txt"
sh = urllib.request.urlopen(url).read().decode('utf-8')
sh_length = len(sh)

# Build a random string of the same length from character codes 0 to 126.
rnd = ''.join(chr(randint(0, 126)) for _ in range(sh_length))

def zipped(text):
    # Return the size in bytes of the zlib-compressed text.
    return len(zlib.compress(text.encode("ascii")))

print("Original size for both texts: %s characters" % sh_length)
print("The Adventures of Sherlock Holmes to %s" % zipped(sh))
print("Random file to %s " % zipped(rnd))

Original size for both texts: 594941 characters
The Adventures of Sherlock Holmes to 226824
Random file to 521448

The output of the example is enlightening. The example application reduces the Sherlock Holmes text to about 38 percent of its original size (226,824 bytes from 594,941 characters), but the random text of the same length shrinks only to about 88 percent (521,448 bytes). The output implies that the ZIP algorithm leverages the characteristics of written text, such as repeated words and uneven letter frequencies, but doesn’t do as well on random text that lacks a predictable structure.
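
The same contrast can be reproduced without a download. The sketch below is only an approximation of the book's experiment: it assumes a repeated English sentence as a stand-in for real prose (which makes it far more regular than a book), and the names zipped_size, structured, shuffled, and uniform are hypothetical. It compresses three strings of equal length: the structured text, the same characters in shuffled order, and characters drawn uniformly from codes 0 to 126.

```python
import random
import zlib

def zipped_size(text: str) -> int:
    """Return the size in bytes of the zlib-compressed ASCII text."""
    return len(zlib.compress(text.encode("ascii")))

random.seed(42)  # fixed seed so the random samples are repeatable

# Stand-in for written text: highly regular, so it compresses very well.
structured = "The quick brown fox jumps over the lazy dog. " * 1000

# Same characters and frequencies, but the order (and so the words) is destroyed.
chars = list(structured)
random.shuffle(chars)
shuffled = "".join(chars)

# Uniformly random characters, as in the book's random sample.
alphabet = [chr(code) for code in range(127)]
uniform = "".join(random.choice(alphabet) for _ in range(len(structured)))

for name, text in [("structured", structured),
                   ("shuffled", shuffled),
                   ("uniform", uniform)]:
    print(f"{name}: {len(text)} characters to {zipped_size(text)} bytes")
```

The shuffled string still compresses better than the uniform one because its letter frequencies remain uneven, but it loses the repeated phrases that make the structured text so compact.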

When performing data compression, you can measure performance by calculating the compression ratio: Just divide the compressed size of the file by its original size, so that a smaller ratio means better compression. The compression ratio can tell you about algorithm efficiency in saving space, but high-performance algorithms also require time to perform
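
A minimal sketch of both measurements follows. It applies the ratio as defined above (compressed size divided by original size) to the sizes reported in the example output, then times zlib at three compression levels. The helper names compression_ratio and timed_compress and the sample data are assumptions made for this illustration; the timings vary by machine.

```python
import time
import zlib

def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return compressed_size / original_size

# Sizes taken from the example output shown earlier.
print(f"Sherlock Holmes: {compression_ratio(594941, 226824):.3f}")
print(f"Random text:     {compression_ratio(594941, 521448):.3f}")

def timed_compress(data: bytes, level: int):
    """Compress data with zlib at the given level; return (size, seconds)."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(compressed), elapsed

# Hypothetical sample; any bytes object works here.
data = b"Compression trades time for space. " * 20000

for level in (1, 6, 9):
    size, seconds = timed_compress(data, level)
    ratio = compression_ratio(len(data), size)
    print(f"level {level}: ratio {ratio:.4f}, {seconds * 1000:.2f} ms")
```

The first two lines print ratios of about 0.381 and 0.876, matching the percentages implied by the example output.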



CHAPTER 14

