Nik Piepenbreier, datagy.io: 31 Tips and Tricks for Super Stars




Let's get started! 
We’ll kick things off by importing our required libraries: 
In [1]: 
import pandas as pd
import numpy as np
Nik Piepenbreier - datagy.io

Tip #1: Scrape a Website with One Function 
Use the Pandas read_html() function to extract one or more tables from a web page. 
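In [2]: 
website = "https://en.wikipedia.org/wiki/All-time_Olympic_Games_medal_table"
# read_html() returns one dataframe per table found on the page
df_list = pd.read_html(website) 
# Pick the table of interest by its position in the list
df = df_list[2]
df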
Out[2]: 
     Team (IOC code)                    № Summer  № Winter  № Games
0    Albania (ALB)                      8         4         12
1    American Samoa (ASA)               8         1         9
2    Andorra (AND)                      11        12        23
3    Angola (ANG)                       9         0         9
4    Antigua and Barbuda (ANT)          10        0         10
...  ...                                ...       ...       ...
74   Republic of China (ROC) [ROC]      3         0         3
75   Saar (SAA) [SAA]                   1         0         1
76   North Yemen (YAR) [YAR]            2         0         2
77   South Yemen (YMD) [YMD]            1         0         1
78   Refugee Olympic Team (ROT) [ROT]   1         0         1
79 rows × 4 columns 
The function returns a list of dataframes, one for each table found on the webpage. 
Index into the list to find your table and assign it to a dataframe. 
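
When a page holds many tables, guessing the right list position can be tedious. The sketch below shows one way to narrow the result with the match argument of read_html(), which keeps only tables whose text matches a given string or regular expression. To stay runnable offline, it parses a small inline HTML snippet instead of a live page; the snippet and the second table in it are hypothetical, and read_html() is assumed to have an HTML parser such as lxml available.

```python
import io

import pandas as pd

# Hypothetical stand-in for a web page that contains two tables.
html = """
<table>
  <tr><th>Team</th><th>Games</th></tr>
  <tr><td>Albania</td><td>12</td></tr>
  <tr><td>Andorra</td><td>23</td></tr>
</table>
<table>
  <tr><th>Item</th><th>Count</th></tr>
  <tr><td>Example</td><td>1</td></tr>
</table>
"""

# Without match, every table on the page is returned.
tables = pd.read_html(io.StringIO(html))
print(len(tables))

# With match, only tables containing the text "Games" are returned.
games = pd.read_html(io.StringIO(html), match="Games")[0]
print(games)
```

The result is still a list, so the matching table is taken with [0].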
Tip #2: Reverse a List in Python 
To reverse the values of a list, slice it with a step of -1, written as [::-1]. 
In [3]: 
original_list = [1, 2, 3, 4, 5]
reversed_list = original_list[::-1]
reversed_list
Out[3]: 
[5, 4, 3, 2, 1]
This uses extended slices to reverse the list. 
Extended slices add a third "step" argument to a slice. 
By using a step of -1, Python returns the list in reverse order. 
This happens because, with no start or stop given, the slice begins at index -1 
(the last value) and moves back one position at a time until it has passed index 0. 
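
To make the step argument more concrete, here is a minimal sketch with a few variations on the same slice syntax. The list and variable names are illustrative only, and the built-in reversed() is included as an alternative, not as the method used in this tip.

```python
letters = ["a", "b", "c", "d", "e"]

# A positive step skips forward: every second item from the start.
every_other = letters[::2]

# Omitted start and stop with a step of -1: the whole list, back to front.
backwards = letters[::-1]

# Explicit bounds with a negative step: from index 3 down to,
# but not including, index 0.
partial = letters[3:0:-1]

# reversed() returns an iterator, so list() is needed to get a list back.
also_backwards = list(reversed(letters))

print(every_other)
print(backwards)
print(partial)
print(also_backwards)
```

Slicing creates a new list, so the original list is left unchanged in every case.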
Tip #3: Speed Up Loading Dataframes by Assigning Data Types 
Pandas has 7 main data types: 
1. Object (object - including strings and mixed) 
2. Integer (int64) 
3. Float (float64) 
4. Boolean (bool) 
5. Datetime (datetime64[ns]) 
6. Time delta (timedelta64[ns]) 
7. Categorical (category) (Check out Tip #9 to learn more about categorical data types) 
Each column in a dataframe can only have one data type. 
Pandas will attempt to identify the data type of each column when you 
import the data by scanning across the values in that column. 
To speed up the import, if you already know the data types of specific 
columns, you can pass them to the dtype argument of the import function. 
In [4]: 
df = pd.read_csv(
    'https://raw.githubusercontent.com/datagy/pivot_table_pandas/master/select_columns.csv',
    dtype={'Name': str, 'Age': int, 'Height': str, 'Score': int,
           'Random_A': int, 'Random_B': int, 'Random_C': int, 'Random_D': int})
df
Out[4]: 
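
To see what the dtype argument changes without downloading a file, the sketch below reads a small inline CSV twice, once with inferred types and once with explicit ones. The data and column names are hypothetical and are not taken from the file used in this tip. Besides saving the inference step, explicit types also control how values are interpreted: an identifier column read as a string keeps its leading zeros.

```python
import io

import pandas as pd

# Hypothetical inline data standing in for a CSV file.
csv_text = "ID,Name,Score\n007,Ana,90\n042,Ben,85\n"

# Without dtype, pandas infers each column's type from its values,
# so the ID column is parsed as integers.
inferred = pd.read_csv(io.StringIO(csv_text))

# With dtype, pandas uses the given types for the listed columns.
typed = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"ID": str, "Name": str, "Score": float},
)

print(inferred["ID"].tolist())
print(typed["ID"].tolist())
print(typed["Score"].dtype)
```

Here the inferred version turns the IDs into plain integers, while the typed version keeps them as the original text and stores Score as float64.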
