Hands-On Machine Learning with Scikit-Learn and TensorFlow




>>> from sklearn.preprocessing import OrdinalEncoder
>>> ordinal_encoder = OrdinalEncoder()


[19] Before Scikit-Learn 0.20, OneHotEncoder could only encode integer categorical values, but since 0.20 it
can also handle other types of inputs, including text categorical inputs.
>>> housing_cat_encoded = ordinal_encoder.fit_transform(housing_cat)
>>> housing_cat_encoded[:10]
array([[0.],
       [0.],
       [4.],
       [1.],
       [0.],
       [1.],
       [0.],
       [1.],
       [0.],
       [0.]])
You can get the list of categories using the categories_ instance variable. It is a list
containing a 1D array of categories for each categorical attribute (in this case, a list
containing a single array since there is just one categorical attribute):
>>> ordinal_encoder.categories_
[array(['<1H OCEAN', 'INLAND', 'ISLAND', 'NEAR BAY', 'NEAR OCEAN'],
       dtype=object)]
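The encoding steps above can be tried on a small, self-contained example. The sketch below is only an illustration: toy_cat is a hypothetical five-row stand-in for housing_cat (it is not the book's data), and the variable names are made up for this example. It shows that the position of each label in categories_ is the code assigned to it, and that inverse_transform recovers the original labels.

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical toy column standing in for housing_cat: one categorical
# attribute, one row per district (not the book's actual data).
toy_cat = np.array(
    [["<1H OCEAN"], ["INLAND"], ["NEAR OCEAN"], ["INLAND"], ["NEAR BAY"]],
    dtype=object,
)

toy_encoder = OrdinalEncoder()
toy_encoded = toy_encoder.fit_transform(toy_cat)  # 2D float array of codes

# categories_ holds one array per attribute; with the default settings the
# labels are sorted, and the index of each label is its code.
toy_categories = toy_encoder.categories_[0]
code_to_label = dict(enumerate(toy_categories))

# inverse_transform maps the codes back to the original labels.
toy_decoded = toy_encoder.inverse_transform(toy_encoded)

print(toy_encoded.ravel())
print(code_to_label)
```

Because the codes follow the sorted order of the labels seen during fit, they carry no meaning beyond identifying the category.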
One issue with this representation is that ML algorithms will assume that two nearby
values are more similar than two distant values. This may be fine in some cases (e.g.,
for ordered categories such as “bad”, “average”, “good”, “excellent”), but it is obviously
not the case for the ocean_proximity column (for example, categories 0 and 4,
“<1H OCEAN” and “NEAR OCEAN”, are clearly more similar than categories 0 and 1,
“<1H OCEAN” and “INLAND”). To fix this issue, a common solution is
to create one binary attribute per category: one attribute equal to 1 when the category
is “<1H OCEAN” (and 0 otherwise), another attribute equal to 1 when the category is
“INLAND” (and 0 otherwise), and so on. This is called one-hot encoding, because
only one attribute will be equal to 1 (hot), while the others will be 0 (cold). The new
attributes are sometimes called dummy attributes. Scikit-Learn provides a OneHotEncoder
class to convert categorical values into one-hot vectors [19]:
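To make the reasoning in this paragraph concrete, here is a minimal pure-Python sketch of the idea. It is not Scikit-Learn's implementation, and the helper names one_hot and euclidean are hypothetical, introduced only for illustration. It compares the gaps between ordinal codes with the distances between one-hot vectors for the five ocean_proximity categories listed earlier.

```python
import math

# The five categories in the order reported by categories_ above.
categories = ["<1H OCEAN", "INLAND", "ISLAND", "NEAR BAY", "NEAR OCEAN"]

def one_hot(label, categories):
    """Hypothetical helper: one binary attribute per category, 1 for the match."""
    return [1 if label == category else 0 for category in categories]

def euclidean(u, v):
    """Hypothetical helper: Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Ordinal codes: the index of each label in the category list.
ordinal = {label: code for code, label in enumerate(categories)}
ordinal_gap_inland = abs(ordinal["<1H OCEAN"] - ordinal["INLAND"])          # 1
ordinal_gap_near_ocean = abs(ordinal["<1H OCEAN"] - ordinal["NEAR OCEAN"])  # 4

# One-hot vectors: every pair of distinct categories is equally far apart.
ocean = one_hot("<1H OCEAN", categories)
inland = one_hot("INLAND", categories)
near_ocean = one_hot("NEAR OCEAN", categories)
onehot_gap_inland = euclidean(ocean, inland)
onehot_gap_near_ocean = euclidean(ocean, near_ocean)

print(ocean)
print(ordinal_gap_inland, ordinal_gap_near_ocean)
print(onehot_gap_inland, onehot_gap_near_ocean)
```

With ordinal codes, “<1H OCEAN” appears four times farther from “NEAR OCEAN” than from “INLAND”. With one-hot vectors, both distances are the same, so no artificial ordering is imposed on the categories.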
