411. Kočiský, T., Hermann, K. M., and Blunsom, P. (2014). Learning Bilingual Word Rep-
resentations by Marginalizing Alignments. In Proceedings of ACL.
412. Krause, O., Fischer, A., Glasmachers, T., and Igel, C. (2013). Approximation proper-
ties of DBNs with binary hidden units and real-valued visible units. In ICML’2013.
413. Krizhevsky, A. (2010). Convolutional deep belief networks on CIFAR-10. Technical
report, University of Toronto. Unpublished Manuscript:
http://www.cs.utoronto.ca/~kriz/convcifar10-aug2010.pdf.
414. Krizhevsky, A. and Hinton, G. (2009). Learning multiple layers of features from tiny
images. Technical report, University of Toronto.
415. Krizhevsky, A. and Hinton, G. E. (2011). Using very deep autoencoders for content-
based image retrieval. In ESANN.
416. Krizhevsky, A., Sutskever, I., and Hinton, G. (2012). ImageNet classification with
deep convolutional neural networks. In NIPS’2012.
417. Krueger, K. A. and Dayan, P. (2009). Flexible shaping: how learning in small steps
helps. Cognition, 110, 380–394.
418. Kuhn, H. W. and Tucker, A. W. (1951). Nonlinear programming. In Proceedings of
the Second Berkeley Symposium on Mathematical Statistics and Probability, pages
481–492, Berkeley, Calif. University of California Press.
419. Kumar, A., Irsoy, O., Su, J., Bradbury, J., English, R., Pierce, B., Ondruska, P., Iyyer,
M., Gulrajani, I., and Socher, R. (2015). Ask me anything: Dynamic memory net-
works for natural language processing. arXiv:1506.07285.
420. Kumar, M. P., Packer, B., and Koller, D. (2010). Self-paced learning for latent vari-
able models. In NIPS’2010.
421. Lang, K. J. and Hinton, G. E. (1988). The development of the time-delay neural
network architecture for speech recognition. Technical Report CMU-CS-88-152,
Carnegie-Mellon University.
422. Lang, K. J., Waibel, A. H., and Hinton, G. E. (1990). A time-delay neural network
architecture for isolated word recognition. Neural Networks, 3(1), 23–43.
423. Langford, J. and Zhang, T. (2008). The epoch-greedy algorithm for contextual multi-
armed bandits. In NIPS’2008, pages 1096–1103.
424. Lappalainen, H., Giannakopoulos, X., Honkela, A., and Karhunen, J. (2000). Non-
linear independent component analysis using ensemble learning: Experiments and
discussion. In Proc. ICA. Citeseer.
425. Larochelle, H. and Bengio, Y. (2008). Classification using discriminative restricted
Boltzmann machines. In ICML’2008.
426. Larochelle, H. and Hinton, G. E. (2010). Learning to combine foveal glimpses with
a third-order Boltzmann machine. In Advances in Neural Information Processing
Systems 23, pages 1243–1251.
427. Larochelle, H. and Murray, I. (2011). The Neural Autoregressive Distribution Esti-
mator. In AISTATS’2011.
428. Larochelle, H., Erhan, D., and Bengio, Y. (2008). Zero-data learning of new tasks. In
AAAI Conference on Artificial Intelligence.
429. Larochelle, H., Bengio, Y., Louradour, J., and Lamblin, P. (2009). Exploring strategies
for training deep neural networks. Journal of Machine Learning Research, 10, 1–40.
430. Lasserre, J. A., Bishop, C. M., and Minka, T. P. (2006). Principled hybrids of genera-
tive and discriminative models. In Proceedings of the Computer Vision and Pattern