Deep Boltzmann Machines



R. Salakhutdinov and G. Hinton



chain will mix before the parameters have changed enough to significantly alter the value of the estimator. Many persistent chains can be run in parallel, and we will refer to the current state in each of these chains as a "fantasy" particle.
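The persistent-chain idea above can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes, for simplicity, a plain binary RBM (weights `W`, no biases), and the class and parameter names (`PersistentChains`, `n_chains`, `n_gibbs`) are hypothetical. The key point it shows is that chain state is carried across parameter updates, so each update needs only a few Gibbs steps.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class PersistentChains:
    """Maintain several parallel Gibbs chains ("fantasy particles") for a
    binary RBM with weight matrix W (visible x hidden, biases omitted).
    The visible states persist across calls, so the chains stay close to
    the model distribution as long as the parameters change slowly."""

    def __init__(self, W, n_chains=100):
        self.W = W
        n_visible = W.shape[0]
        # Random binary initial states, one row per fantasy particle.
        self.v = rng.integers(0, 2, size=(n_chains, n_visible)).astype(float)

    def step(self, n_gibbs=1):
        """Run a few alternating Gibbs updates and return the new states."""
        for _ in range(n_gibbs):
            p_h = sigmoid(self.v @ self.W)        # P(h = 1 | v)
            h = (rng.random(p_h.shape) < p_h) * 1.0
            p_v = sigmoid(h @ self.W.T)           # P(v = 1 | h)
            self.v = (rng.random(p_v.shape) < p_v) * 1.0
        return self.v

W = rng.normal(0.0, 0.1, size=(6, 4))
chains = PersistentChains(W, n_chains=50)
# Model expectations would be averaged over these fantasy particles.
fantasy = chains.step(n_gibbs=5)
```

In an actual training loop, `chains.step` would be called once per parameter update, and the data-independent expectations estimated by averaging over the rows of `fantasy`.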


    1. A Variational Approach to Estimating the Data-Dependent Expectations


In variational learning (Hinton and Zemel, 1994; Neal and Hinton, 1998), the true posterior distribution over latent variables p(h|v; θ) for each training vector v is replaced by an approximate posterior q(h|v; µ), and the parameters are updated to follow the gradient of a lower bound on the log-likelihood:

ln p(v; θ) ≥ Σ_h q(h|v; µ) ln p(v, h; θ) + H(q)        (7)
           = ln p(v; θ) − KL[q(h|v; µ)||p(h|v; θ)],

where H(·) is the entropy functional. Variational learning has the nice property that in addition to trying to maximize the log-likelihood of the training data, it tries to find parameters that minimize the Kullback–Leibler divergences between the approximating and true posteriors. Using a naive mean-field approach, we choose a fully factorized distribution in order to approximate the true posterior: q(h; µ) = Π_{j=1}^{P} q(h_j), with q(h_j = 1) = µ_j, where P is the number of hidden units. The lower bound on the log-probability of the data then takes the form:

ln p(v; θ) ≥ ½ Σ_{i,k} L_{ik} v_i v_k + ½ Σ_{j,m} J_{jm} µ_j µ_m + Σ_{i,j} W_{ij} v_i µ_j
             − ln Z(θ) − Σ_j [µ_j ln µ_j + (1 − µ_j) ln (1 − µ_j)].

Second, for applications such as the interpretation of images or speech, we expect the posterior over hidden states given the data to have a single mode, so simple and fast variational approximations such as mean-field should be adequate. Indeed, sacrificing some log-likelihood in order to make the true posterior unimodal could be advantageous for a system that must use the posterior to control its actions. Having many quite different and equally good representations of the same sensory input increases log-likelihood but makes it far more difficult to associate an appropriate action with that sensory input.
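The mean-field bound can be checked numerically on a toy model. The sketch below is my own illustration, not code from the paper: the helper names (`mean_field`, `lower_bound`, etc.) are invented, biases are omitted, and L and J are assumed symmetric with zero diagonals. Maximizing the bound with respect to each µ_j gives the fixed-point update µ_j ← σ(Σ_i W_ij v_i + Σ_m J_jm µ_m), and for a model small enough to enumerate, the resulting bound can be compared against the exact ln p(v).

```python
import itertools
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mean_field(v, W, J, n_iters=50):
    """Fixed-point updates mu_j <- sigma(sum_i W_ij v_i + sum_m J_jm mu_m).
    J has a zero diagonal, so no unit feeds back into itself."""
    mu = np.full(W.shape[1], 0.5)
    for _ in range(n_iters):
        mu = sigmoid(v @ W + mu @ J)
    return mu

def unnorm_log_p(v, h, L, J, W):
    # ln p*(v, h) = 1/2 v'Lv + 1/2 h'Jh + v'Wh  (bias terms omitted)
    return 0.5 * v @ L @ v + 0.5 * h @ J @ h + v @ W @ h

def log_Z_and_log_pv(v_obs, L, J, W):
    """Brute-force ln Z(theta) and exact ln p(v_obs) for a tiny model."""
    nv, nh = W.shape
    logs_all, logs_v = [], []
    for vb in itertools.product([0, 1], repeat=nv):
        for hb in itertools.product([0, 1], repeat=nh):
            lp = unnorm_log_p(np.array(vb, float), np.array(hb, float), L, J, W)
            logs_all.append(lp)
            if vb == tuple(v_obs.astype(int)):
                logs_v.append(lp)
    log_Z = np.logaddexp.reduce(logs_all)
    return log_Z, np.logaddexp.reduce(logs_v) - log_Z

def lower_bound(v, mu, L, J, W, log_Z):
    """The mean-field lower bound: expected log-probability plus entropy."""
    expected = 0.5 * v @ L @ v + 0.5 * mu @ J @ mu + v @ W @ mu
    eps = 1e-12
    entropy = -np.sum(mu * np.log(mu + eps) + (1 - mu) * np.log(1 - mu + eps))
    return expected - log_Z + entropy

rng = np.random.default_rng(1)

def sym_zero_diag(n):
    A = rng.normal(0.0, 0.5, (n, n))
    A = (A + A.T) / 2.0
    np.fill_diagonal(A, 0.0)
    return A

nv, nh = 3, 3
W = rng.normal(0.0, 0.5, (nv, nh))
L, J = sym_zero_diag(nv), sym_zero_diag(nh)
v = np.array([1.0, 0.0, 1.0])

mu = mean_field(v, W, J)
log_Z, log_pv = log_Z_and_log_pv(v, L, J, W)
bound = lower_bound(v, mu, L, J, W, log_Z)
assert bound <= log_pv + 1e-9   # the variational bound never exceeds ln p(v)
```

Because J has a zero diagonal, E_q[½ h'Jh] equals ½ µ'Jµ exactly under the factorized q, so the bound above matches the expression in the text term by term; the final assertion holds for any µ, converged or not, by Jensen's inequality.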


