Deep Boltzmann Machines




[Figure 2: layer diagrams (v, h1, h2, h3; weights W1, W2) of a Deep Belief Network, a Deep Boltzmann Machine, and the stack of RBM's used for pretraining.]
Figure 2: Left: A three-layer Deep Belief Network and a three-layer Deep Boltzmann Machine. Right: Pretraining consists of learning a stack of modified RBM’s, that are then composed to create a deep Boltzmann machine.



Consider a two-layer Boltzmann machine (see Fig. 2, right panel) with no within-layer connections. The energy of the state {v, h1, h2} is defined as:

E(v, h1, h2; θ) = −v⊤W1h1 − h1⊤W2h2,  (9)

where θ = {W1, W2} are the model parameters, representing visible-to-hidden and hidden-to-hidden symmetric interaction terms. The probability that the model assigns to a visible vector v is:

p(v; θ) = 1/Z(θ) Σ_{h1,h2} exp(−E(v, h1, h2; θ)).  (10)

The conditional distributions over the visible and the two sets of hidden units are given by logistic functions:

p(h1_j = 1 | v, h2) = σ( Σ_i W1_ij v_i + Σ_m W2_jm h2_m ),  (11)

p(h2_m = 1 | h1) = σ( Σ_j W2_jm h1_j ),  (12)

p(v_i = 1 | h1) = σ( Σ_j W1_ij h1_j ).  (13)

For approximate maximum likelihood learning, we could still apply the learning procedure for general Boltzmann machines described above, but it would be rather slow, particularly when the hidden units form layers which become increasingly remote from the visible units. There is, however, a fast way to initialize the model parameters to sensible values, as we describe in the next section.

    1. Greedy Layerwise Pretraining of DBM's

Hinton et al. (2006) introduced a greedy, layer-by-layer unsupervised learning algorithm that consists of learning a stack of RBM's one layer at a time. After the stack of RBM's has been learned, the whole stack can be viewed as a single probabilistic model, called a "deep belief network". Surprisingly, this model is not a deep Boltzmann machine. The top two layers form a restricted Boltzmann machine, which is an undirected graphical model, but the lower layers form a directed generative model (see Fig. 2).

After learning the first RBM in the stack, the generative model can be written as:

p(v; θ) = Σ_{h1} p(h1; W1) p(v | h1; W1),  (14)

where p(h1; W1) = Σ_v p(h1, v; W1) is an implicit prior over h1 defined by the parameters. The second RBM in the stack replaces p(h1; W1) by p(h1; W2) = Σ_{h2} p(h1, h2; W2). If the second RBM is initialized correctly (Hinton et al., 2006), p(h1; W2) will become a better model of the aggregated posterior distribution over h1, where the aggregated posterior is simply the non-factorial mixture of the factorial posteriors for all the training cases, i.e. 1/N Σ_n p(h1 | v_n; W1). Since the second RBM is replacing p(h1; W1) by a better model, it would be possible to infer p(h1; W1, W2) by averaging the two models of h1, which can be done approximately by using 1/2 W1 bottom-up and 1/2 W2 top-down. Using W1 bottom-up and W2 top-down would amount to double-counting the evidence, since h2 is dependent on v.

To initialize the model parameters of a DBM, we propose greedy, layer-by-layer pretraining by learning a stack of RBM's, but with a small change that is introduced to eliminate the double-counting problem when top-down and bottom-up influences are subsequently combined. For the lower-level RBM, we double the input and tie the visible-to-hidden weights, as shown in Fig. 2, right panel. In this modified RBM with tied parameters, the conditional distributions over the hidden and visible states are defined as:

p(h1_j = 1 | v) = σ( Σ_i W1_ij v_i + Σ_i W1_ij v_i ),  (15)

p(v_i = 1 | h1) = σ( Σ_j W1_ij h1_j ).  (16)

Contrastive divergence learning works well and the modified RBM is good at reconstructing its training data. Conversely, for the top-level RBM we double the number of hidden units. The conditional distributions for this model
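As a concrete illustration, the energy function (Eq. 9) and the logistic conditionals (Eqs. 11-13) can be sketched in NumPy. This is a minimal sketch, not code from the paper: the layer sizes, random weight initialization, and function names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Arbitrary layer sizes and small random weights, chosen only for illustration.
nv, nh1, nh2 = 6, 4, 3
W1 = rng.normal(scale=0.1, size=(nv, nh1))   # visible-to-hidden weights
W2 = rng.normal(scale=0.1, size=(nh1, nh2))  # hidden-to-hidden weights

def energy(v, h1, h2):
    # E(v, h1, h2) = -v'W1h1 - h1'W2h2   (Eq. 9)
    return -v @ W1 @ h1 - h1 @ W2 @ h2

# Each layer is conditionally independent given its neighbouring layers,
# so the conditionals factorize into per-unit logistic functions.
def p_h1(v, h2):
    return sigmoid(v @ W1 + W2 @ h2)   # Eq. 11: bottom-up plus top-down input

def p_h2(h1):
    return sigmoid(h1 @ W2)            # Eq. 12: input from h1 only

def p_v(h1):
    return sigmoid(W1 @ h1)            # Eq. 13: input from h1 only

# One sweep of Gibbs sampling over the three layers.
v = rng.integers(0, 2, size=nv).astype(float)
h2 = rng.integers(0, 2, size=nh2).astype(float)
h1 = (rng.random(nh1) < p_h1(v, h2)).astype(float)
h2 = (rng.random(nh2) < p_h2(h1)).astype(float)
v = (rng.random(nv) < p_v(h1)).astype(float)
```

Note that, unlike in a DBN, sampling h1 requires both a bottom-up term from v and a top-down term from h2, which is exactly why naive composition of RBM's would double-count the evidence.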
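The modified lower-level RBM with doubled input and tied weights (Eqs. 15-16) can be sketched the same way; again, sizes, weights, and names are illustrative assumptions rather than values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

nv, nh1 = 6, 4
W1 = rng.normal(scale=0.1, size=(nv, nh1))  # tied visible-to-hidden weights

def p_h1_given_v(v):
    # Eq. 15: the input is doubled with tied weights, so the
    # bottom-up term W1'v appears twice in the hidden-unit input.
    return sigmoid(v @ W1 + v @ W1)

def p_v_given_h1(h1):
    # Eq. 16: reconstruction uses the weights once.
    return sigmoid(W1 @ h1)

v = rng.integers(0, 2, size=nv).astype(float)
ph1 = p_h1_given_v(v)
h1 = (rng.random(nh1) < ph1).astype(float)
pv = p_v_given_h1(h1)
```

The doubled bottom-up term compensates for the missing top-down input during pretraining: once the stack is composed into a DBM, each intermediate layer receives one bottom-up and one top-down term, so the evidence is no longer double-counted.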

