Python Projects for Beginners: A Ten-Week Bootcamp Approach to Python Programming




Scatter Plot


If you’re familiar with clusters, then you’ll know the importance of scatter plots. These plots help distinguish groups from one another by plotting one dot for each record. Using two characteristics, like the height and width of a flower, we can see which species a flower likely belongs to. Let’s create some fake data and plot the points:

1| # creating a scatter plot to represent height-weight distribution
2| from random import randint, seed
3| import matplotlib.pyplot as plt # skip if already imported in an earlier cell
4| seed(2) # fixed seed so the fake data is reproducible
5| height = [ randint(58, 78) for x in range(20) ] # 20 records between 4'10" and 6'6"
6| weight = [ randint(90, 250) for x in range(20) ] # 20 records between 90 lbs. and 250 lbs.
8| plt.scatter(weight, height) # weight on the x axis, height on the y axis
10| plt.title("Height-Weight Distribution")
11| plt.xlabel("Weight (lbs)")
12| plt.ylabel("Height (inches)")
14| plt.show( )

Go ahead and run the cell. To create some fake data, we use the randint() function from the random module to build 20 records each for the height and weight lists. To plot the data, we call scatter() with the x values (weight) first and the y values (height) second, then add a title and axis labels. You should get an output like Figure 10-5.
Chapter 10: Introduction to Data Analysis

Figure 10-5. Scatter plot of height-weight data
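
To illustrate the grouping idea from the start of this section, here is a minimal sketch that plots two made-up groups in different colors. The group names, the value ranges, and the make_group helper are all assumptions invented for this example; they are not real flower measurements.

```python
import random
import matplotlib.pyplot as plt

random.seed(2)  # fixed seed so the fake data is the same on every run


def make_group(n, x_range, y_range):
    """Return n random (x, y) points inside the given ranges (hypothetical helper)."""
    xs = [random.uniform(*x_range) for _ in range(n)]
    ys = [random.uniform(*y_range) for _ in range(n)]
    return xs, ys


# two invented groups that differ in both characteristics
groups = {
    "Group A": make_group(20, (1.0, 2.0), (0.1, 0.6)),
    "Group B": make_group(20, (4.0, 6.0), (1.0, 2.0)),
}

fig, ax = plt.subplots()
for name, (xs, ys) in groups.items():
    ax.scatter(xs, ys, label=name)  # each scatter() call is drawn in its own color

ax.set_title("Two Fake Groups")
ax.set_xlabel("Height (cm)")
ax.set_ylabel("Width (cm)")
ax.legend()
plt.show()
```

Because the two groups occupy different regions of the chart, the dots form two visibly separate clouds, which is the kind of separation a scatter plot makes easy to spot.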

Histogram


While line plots are great for visualizing trends in time series data, histograms are the king of visualizing distributions. Often, the distribution of a variable is what you’re interested in, and a visualization provides a lot more information than a group of summary statistics. First, let’s see how we can create a histogram:

1| # creating a histogram to show age data for a fake population
2| import numpy as np # import the numpy module to generate data
3| import matplotlib.pyplot as plt # skip if already imported in an earlier cell
4| np.random.seed(5) # fixed seed so the fake data is reproducible
5| ages = [ np.random.normal(loc=40, scale=10) for x in range(1000) ] # ages distributed around 40
7| plt.hist(ages, bins=45) # bins is the number of bars
9| plt.title("Ages per Population")
10| plt.xlabel("Age")
11| plt.ylabel("# of People")
13| plt.show( )

Go ahead and run the cell. We’ve mentioned the NumPy module previously. It’s used in data science to perform extremely fast numerical calculations, and Pandas’ DataFrames are built on top of NumPy arrays. For the purpose of this cell, however, you just need to know that we’re using it to create random numbers from a normal distribution centered on a given number. The center is passed into the loc argument on line 5. The scale argument is the standard deviation, which controls how spread out the random numbers are around that center. With a scale of 10, roughly two-thirds of the values fall between 30 and 50; the rest land farther out, but all 1000 random numbers cluster around the age of 40.
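
As a rough check of what loc and scale do, the sketch below draws the same kind of sample and measures it. It uses the size argument of np.random.normal to draw all values in one call rather than the list comprehension from the cell above. The variable names are my own, and the exact figures printed depend on the seed, so none are shown here.

```python
import numpy as np

np.random.seed(5)

# size=1000 draws all values at once and returns a NumPy array
ages = np.random.normal(loc=40, scale=10, size=1000)

mean_age = ages.mean()  # should land close to loc (40)
spread = ages.std()     # should land close to scale (10)

# share of values within one scale of loc, i.e. between 30 and 50
within_one = np.mean((ages >= 30) & (ages <= 50))

print(f"mean: {mean_age:.1f}")
print(f"standard deviation: {spread:.1f}")
print(f"share between 30 and 50: {within_one:.0%}")
```

Changing loc should shift the mean, and changing scale should widen or narrow the spread, while the share within one scale of the center should stay at roughly two-thirds.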
To create the histogram, we use the hist() function and pass in the data. Histograms show how many values fall into each range, or bin. In our example, the bar covering the age of 40 holds more than 60 values. The x axis shows the age ranges and the y axis shows how many values landed in each one. The bins argument specifies how many bars you see on the chart. You may be thinking: the more bins the better, right? Wrong. Too many bins leaves each bar with only a few values and makes the chart noisy, while too few hides the shape of the distribution; often you’ll just have to try a few values. We complete this chart by adding a title and axis labels. The result should look like Figure 10-6.

Figure 10-6. Histogram of centrally distributed age data
Although the data is fake, we can deduce a lot of information from the chart. We can see outliers that may exist, where the general age range sits, and much more.
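
To get a feel for how the bins argument changes the picture without redrawing the chart each time, one option is np.histogram, which computes bar heights for a given number of bins without plotting anything. This is a sketch under assumptions: it regenerates the same kind of fake age data, and the bin counts tried (5, 45, and 500) are arbitrary choices for illustration.

```python
import numpy as np

np.random.seed(5)
ages = np.random.normal(loc=40, scale=10, size=1000)

for bins in (5, 45, 500):
    # counts holds one bar height per bin; edges holds the bins + 1 boundaries
    counts, edges = np.histogram(ages, bins=bins)
    bar_width = edges[1] - edges[0]
    empty_bars = int(np.sum(counts == 0))
    print(
        f"bins={bins}: bar width {bar_width:.2f} years, "
        f"tallest bar {counts.max()} people, empty bars {empty_bars}"
    )
```

With very few bins each bar spans many years and the shape gets blurred; with very many, most bars hold only a handful of people and the chart turns jagged.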
