Algorithms For Dummies



Download 7,18 Mb.
Pdf ko'rish
bet407/651
Sana15.07.2021
Hajmi7,18 Mb.
#120357
1   ...   403   404   405   406   407   408   409   410   ...   651
Bog'liq
Algorithms

  Managing Big Data 

     237


Here is how reservoir sample works: Given a stream of data, containing many ele-

ments, you initialize the reservoir sample with elements taken from the stream 

until the sample is complete. For instance, if the sample contains 1,000 elements, 

a number that usually fits in the computer’s internal memory, you start by pick-

ing  the  first  1,000  stream  elements.  The  number  of  elements  you  want  in  the 

sample is k, and k implies a sample that fits into the computer’s memory. At the 

point when you reserve the first k stream elements, the algorithm starts making 

its selections:

1. 

From the beginning of the stream, the algorithm counts every new element 



that arrives. It tracks the counting using the variable named as n. When the 

algorithm gets into action, the value of n is equivalent to k.

2. 

Now, new elements arrive, and they increment the value of n. A new element 



arriving from the stream has a probability of being inserted into the reservoir 

sample of k/n and probability of not being inserted equal to (1 – k/n).

3. 

The probability is verified for each new element arriving. It’s like a lottery: If the 



probability is verified, the new element is inserted. On the other hand, if it isn’t 

inserted, the new element is discarded. If it’s inserted, the algorithm discards 

an old element in the sample according to some rule (the easiest being to pick 

an old element at random) and replaces it with the new element.

The following code shows a simple example in Python so that you can see this 

algorithm in action. The example relies on a sequence of alphabet letters (pretend 

that they are a data stream) and creates a sample of five elements. (You can find 

this code in the 

A4D; 12; Managing Big Data.ipynb

 file on the Dummies site as 

part of the downloadable code; see the Introduction for details.)

import string

datastream = list(string.ascii_uppercase) 

+ list(


    string.ascii_lowercase)

print(datastream)



FIGURE 12-3: 

An example of 

windowing 

a stream of 

DNA data.



238


Download 7,18 Mb.

Do'stlaringiz bilan baham:
1   ...   403   404   405   406   407   408   409   410   ...   651




Ma'lumotlar bazasi mualliflik huquqi bilan himoyalangan ©hozir.org 2025
ma'muriyatiga murojaat qiling

kiriting | ro'yxatdan o'tish
    Bosh sahifa
юртда тантана
Боғда битган
Бугун юртда
Эшитганлар жилманглар
Эшитмадим деманглар
битган бодомлар
Yangiariq tumani
qitish marakazi
Raqamli texnologiyalar
ilishida muhokamadan
tasdiqqa tavsiya
tavsiya etilgan
iqtisodiyot kafedrasi
steiermarkischen landesregierung
asarlaringizni yuboring
o'zingizning asarlaringizni
Iltimos faqat
faqat o'zingizning
steierm rkischen
landesregierung fachabteilung
rkischen landesregierung
hamshira loyihasi
loyihasi mavsum
faolyatining oqibatlari
asosiy adabiyotlar
fakulteti ahborot
ahborot havfsizligi
havfsizligi kafedrasi
fanidan bo’yicha
fakulteti iqtisodiyot
boshqaruv fakulteti
chiqarishda boshqaruv
ishlab chiqarishda
iqtisodiyot fakultet
multiservis tarmoqlari
fanidan asosiy
Uzbek fanidan
mavzulari potok
asosidagi multiservis
'aliyyil a'ziym
billahil 'aliyyil
illaa billahil
quvvata illaa
falah' deganida
Kompyuter savodxonligi
bo’yicha mustaqil
'alal falah'
Hayya 'alal
'alas soloh
Hayya 'alas
mavsum boyicha


yuklab olish