Python Projects for Beginners: A Ten-Week Bootcamp Approach to Python Programming




     ages   names     tenure   age_group
6    18     rebecca   4        teenager
5    20     tyler     6        teenager
2    22     sandy     2        adult
0    25     Jess      5        adult
3    29     ted       8        adult
4    33     Barney    7        adult
1    35     Jordan    5        adult

Note: When you use apply to compute a value based on multiple columns, you must set axis=1 so that the function receives each row rather than each column.
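To illustrate the note, here is a minimal sketch that rebuilds the table above (row labels omitted) and derives a new column from two existing ones. The start_age column and the start_age helper are hypothetical names introduced only for this example; they are not part of the original DataFrame.

```python
import pandas as pd

# Rebuild the data from the table above (row labels omitted for brevity).
df = pd.DataFrame({
    "ages": [18, 20, 22, 25, 29, 33, 35],
    "names": ["rebecca", "tyler", "sandy", "Jess", "ted", "Barney", "Jordan"],
    "tenure": [4, 6, 2, 5, 8, 7, 5],
})

def start_age(row):
    # Hypothetical helper: with axis=1, `row` is a Series holding one record,
    # so several columns can be read in the same call.
    return row["ages"] - row["tenure"]

# axis=1 passes each row to the function; the default axis=0 would pass each column.
df["start_age"] = df.apply(start_age, axis=1)
print(df)
```

Without axis=1, the function would receive whole columns one at a time, and looking up row["ages"] inside it would fail.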

Aggregations


Raw data plus transformations is generally only half the story. Your objective is to extract insights and actionable conclusions from the data, and that means reducing it from potentially billions of rows to a set of summary statistics via aggregation functions. This section assumes some knowledge of SQL and its GROUP BY clause. If you’re not familiar with how grouping works in SQL, visit w3schools for reference material.

groupby()


To condense the information down to summary statistics, we’ll use the groupby method that Pandas provides. Whenever you group records together, you also need an aggregate function to tell the program how to combine the rows within each group. For now, let’s count how many records of each age group there are within our DataFrame:

# grouping the records together to count how many records in each group
df.groupby("age_group", as_index=False).count().head()

Go ahead and run the cell. When the information is grouped together using the count method, the program counts the non-null values in each column for every category. We’ll have two categories: adult with five records, and teenager with two records. The first argument of our groupby method is the column we want to group on. The second, as_index=False, keeps age_group as a regular column and gives the result a default numeric index. If it were set to True, the resulting DataFrame would use age_group as its index, so each group label would identify a row.
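The effect of as_index described above is easiest to see side by side. The following sketch rebuilds the DataFrame from the table shown earlier (row labels omitted) and runs the same count with both settings; the variable names flat and indexed are chosen only for this example.

```python
import pandas as pd

# Rebuild the data from the table shown earlier (row labels omitted for brevity).
df = pd.DataFrame({
    "ages": [18, 20, 22, 25, 29, 33, 35],
    "names": ["rebecca", "tyler", "sandy", "Jess", "ted", "Barney", "Jordan"],
    "tenure": [4, 6, 2, 5, 8, 7, 5],
    "age_group": ["teenager", "teenager", "adult", "adult", "adult", "adult", "adult"],
})

# as_index=False: age_group stays a regular column and the index is 0, 1, ...
flat = df.groupby("age_group", as_index=False).count()

# as_index=True: age_group becomes the index of the result.
indexed = df.groupby("age_group", as_index=True).count()

print(flat)
print(indexed)
```

Keeping age_group as a column is usually more convenient when the result will be merged, filtered, or plotted afterward; using it as the index makes lookups such as indexed.loc["adult"] possible.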

mean()


Instead of counting how many records there are in each category, let’s find the average of each numerical column by using the mean method. We’ll group based on the same column:

# grouping the data to see averages of all numerical columns
# numeric_only=True skips text columns such as names, which pandas 2.0 and later would otherwise raise an error on
df.groupby("age_group", as_index=False).mean(numeric_only=True).head()

Go ahead and run the cell. The mean method returns the average of every numerical column for each group. The output should be a DataFrame that looks like Table 10-4.
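As a cross-check on the averages, here is a minimal sketch that rebuilds the DataFrame from the table shown earlier (row labels omitted) and selects the numerical columns explicitly before averaging. Selecting columns this way is an alternative chosen for this example, not necessarily the approach the chapter takes.

```python
import pandas as pd

# Rebuild the data from the table shown earlier (row labels omitted for brevity).
df = pd.DataFrame({
    "ages": [18, 20, 22, 25, 29, 33, 35],
    "names": ["rebecca", "tyler", "sandy", "Jess", "ted", "Barney", "Jordan"],
    "tenure": [4, 6, 2, 5, 8, 7, 5],
    "age_group": ["teenager", "teenager", "adult", "adult", "adult", "adult", "adult"],
})

# Pick the numerical columns before aggregating so the text column
# (names) never reaches mean().
averages = df.groupby("age_group", as_index=False)[["ages", "tenure"]].mean()
print(averages)
```

Working by hand from the seven rows, the adult group averages 144 / 5 = 28.8 for ages and 27 / 5 = 5.4 for tenure, and the teenager group averages 19.0 and 5.0.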
Chapter 10 Introduction to Data Analysis
Table 10-4. Grouping by age_group and averaging data

