Algorithms For Dummies


PART 4   Struggling with Big Data



Date: 15.07.2021

 

   



['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L',
 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X',
 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v',
 'w', 'x', 'y', 'z']

Apart from strings, the example uses functions from the random package to set a
seed (for stable and replicable solutions) and to draw a random integer that
determines whether an element in the reservoir needs to change. Apart from the
seed value, you can experiment with modifying the sample size or even feeding
the algorithm a different stream (it should be in a Python list for the example
to work correctly).
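To see in isolation what the seed does, the following minimal sketch draws the same five integers twice. The variable names first_run and second_run and the upper bound 51 (the last index of a 52-letter stream) are illustrative assumptions, not part of the book's example.

```python
from random import seed, randint

seed(9)
# Five draws from the inclusive range 0..51
first_run = [randint(0, 51) for _ in range(5)]

seed(9)  # resetting the same seed replays the same sequence
second_run = [randint(0, 51) for _ in range(5)]

print(first_run == second_run)  # True
```

Because randint includes both endpoints, every drawn value lies between 0 and 51.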

from random import seed, randint

# datastream is the list of letters printed above
seed(9)  # change this value for different results
sample_size = 5
sample = []

for index, element in enumerate(datastream):
    # Until the reservoir is filled, we add elements
    if index < sample_size:
        sample.append(element)
    else:
        # Having filled the reservoir, we test a
        # random replacement based on the elements
        # seen in the data stream
        drawn = randint(0, index)
        # If the drawn number is less than the
        # sample size, we replace a previous
        # element with the one arriving from the
        # stream
        if drawn < sample_size:
            sample[drawn] = element

print(sample)

['y', 'e', 'v', 'F', 'i']
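One way to experiment with other sample sizes and streams is to wrap the same loop in a function. The sketch below is an assumption-laden variant, not the book's code: the names reservoir_sample, rng, and frequencies are hypothetical, and it accepts any iterable (including a generator) rather than only a list. It also runs a rough empirical check that each of n items ends up in a reservoir of size k about k / n of the time.

```python
import random
from collections import Counter


def reservoir_sample(stream, k, rng=random):
    """Return up to k items drawn uniformly from an iterable (hypothetical helper)."""
    sample = []
    for index, element in enumerate(stream):
        if index < k:
            # Fill the reservoir with the first k elements
            sample.append(element)
        else:
            # Draw a position among the index + 1 elements seen so far
            drawn = rng.randint(0, index)
            if drawn < k:
                # Replace a stored element with the arriving one
                sample[drawn] = element
    return sample


# A different stream: a generator, whose length is not known up front
rng = random.Random(9)
print(reservoir_sample((x * x for x in range(1000)), k=3, rng=rng))

# Empirical check: over many runs, each of the n items should appear in
# the reservoir with a frequency close to k / n (here 5 / 20 = 0.25)
n, k, runs = 20, 5, 20000
counts = Counter()
for _ in range(runs):
    counts.update(reservoir_sample(range(n), k, rng))
frequencies = {item: counts[item] / runs for item in range(n)}
print(min(frequencies.values()), max(frequencies.values()))
```

Passing a random.Random instance instead of calling the module-level seed keeps the experiment reproducible without touching the global random state.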

This procedure assures you that, at any time, your reservoir is a representative
sample of the data stream seen so far. In this implementation, the variable
index plays the role of n and the variable sample_size acts as k. Note two
particular aspects of this algorithm:




CHAPTER 12

