Hands-On Machine Learning with Scikit-Learn and TensorFlow


Chapter 4: Training Models





[11] It is common to use the notation J(θ) for cost functions that don't have a short name; we will often use this notation throughout the rest of this book. The context will make it clear which cost function is being discussed.
[12] Norms are discussed in Chapter 2.
[13] A square matrix full of 0s except for 1s on the main diagonal (top-left to bottom-right).
up very close to zero and the result is a flat line going through the data's mean. Equation 4-8 presents the Ridge Regression cost function.[11]
Equation 4-8. Ridge Regression cost function

$$J(\theta) = \mathrm{MSE}(\theta) + \alpha \, \frac{1}{2} \sum_{i=1}^{n} \theta_i^{2}$$
Note that the bias term θ₀ is not regularized (the sum starts at i = 1, not 0). If we define w as the vector of feature weights (θ₁ to θₙ), then the regularization term is simply equal to ½(‖w‖₂)², where ‖w‖₂ represents the ℓ₂ norm of the weight vector.[12]
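As a concrete illustration of Equation 4-8, here is a minimal NumPy sketch (not the book's code; `X_b` is assumed to already contain the bias column of 1s). Note how `theta[1:]` implements the sum starting at i = 1:

```python
import numpy as np

def ridge_cost(theta, X_b, y, alpha):
    """Ridge cost: MSE(theta) plus alpha * 0.5 * sum of theta_i**2 for i >= 1."""
    errors = X_b @ theta - y                          # X_b includes the bias column
    mse = (errors ** 2).mean()
    penalty = 0.5 * alpha * np.sum(theta[1:] ** 2)    # theta[0] (bias) is excluded
    return mse + penalty
```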
For Gradient Descent, just add αw to the MSE gradient vector (Equation 4-6).
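A hedged sketch of that Gradient Descent variant, reusing the Batch Gradient Descent step from earlier in the chapter (the `eta` and `n_epochs` defaults, and zeroing the bias entry of `w`, are illustrative choices):

```python
import numpy as np

def ridge_batch_gd(X_b, y, alpha=0.1, eta=0.1, n_epochs=1000):
    """Batch Gradient Descent for Ridge Regression.

    The update is the plain MSE gradient (Equation 4-6) plus alpha * w,
    where w is theta with the bias entry zeroed out.
    """
    m, n_params = X_b.shape
    theta = np.random.randn(n_params)
    for _ in range(n_epochs):
        mse_gradients = 2 / m * X_b.T @ (X_b @ theta - y)
        w = theta.copy()
        w[0] = 0.0  # the bias term theta_0 is not regularized
        theta = theta - eta * (mse_gradients + alpha * w)
    return theta
```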
It is important to scale the data (e.g., using a StandardScaler) before performing Ridge Regression, as it is sensitive to the scale of the input features. This is true of most regularized models.
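For example, a minimal scikit-learn sketch that bakes the scaling step into the model, so it is applied consistently at fit and predict time (the synthetic data is illustrative only):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

np.random.seed(42)
X = 2 * np.random.rand(100, 1)                       # illustrative linear data
y = (4 + 3 * X + np.random.randn(100, 1)).ravel()

# Scaling happens inside the pipeline, so predict() sees scaled features too
ridge_model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
ridge_model.fit(X, y)
print(ridge_model.predict([[1.5]]))
```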
Figure 4-17 shows several Ridge models trained on some linear data using different α values. On the left, plain Ridge models are used, leading to linear predictions. On the right, the data is first expanded using PolynomialFeatures(degree=10), then it is scaled using a StandardScaler, and finally the Ridge models are applied to the resulting features: this is Polynomial Regression with Ridge regularization. Note how increasing α leads to flatter (i.e., less extreme, more reasonable) predictions; this reduces the model's variance but increases its bias.
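A sketch of the right-hand setup of Figure 4-17 (the synthetic quadratic data and the particular α values are illustrative, not the figure's exact ones):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

np.random.seed(42)
X = 3 * np.random.rand(100, 1) - 1.5
y = (0.5 * X ** 2 + X + np.random.randn(100, 1) / 2).ravel()

# Expand to degree-10 polynomial features, scale them, then regularize.
# Increasing alpha flattens the predictions: less variance, more bias.
for alpha in (1e-5, 1.0, 100.0):
    poly_ridge = make_pipeline(
        PolynomialFeatures(degree=10, include_bias=False),
        StandardScaler(),
        Ridge(alpha=alpha),
    )
    poly_ridge.fit(X, y)
    print(alpha, poly_ridge.predict([[1.0]]))
```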
As with Linear Regression, we can perform Ridge Regression either by computing a closed-form equation or by performing Gradient Descent. The pros and cons are the same. Equation 4-9 shows the closed-form solution (where A is the (n + 1) × (n + 1) identity matrix,[13] except with a 0 in the top-left cell, corresponding to the bias term).
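Equation 4-9. Ridge Regression closed-form solution

$$\hat{\theta} = \left(\mathbf{X}^{\top}\mathbf{X} + \alpha\,\mathbf{A}\right)^{-1}\,\mathbf{X}^{\top}\,\mathbf{y}$$

And a minimal NumPy sketch of this closed-form solution (the synthetic data, the α value, and the variable names are illustrative):

```python
import numpy as np

np.random.seed(42)
m = 100
X = 2 * np.random.rand(m, 1)
y = 4 + 3 * X + np.random.randn(m, 1)

alpha = 1.0
X_b = np.c_[np.ones((m, 1)), X]        # add the bias column x0 = 1

# A: identity matrix with a 0 in the top-left cell (bias term not regularized)
A = np.eye(X_b.shape[1])
A[0, 0] = 0.0

theta_hat = np.linalg.inv(X_b.T @ X_b + alpha * A) @ X_b.T @ y
print(theta_hat)
```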
