681. Skinner, B. F. (1958). Reinforcement today. American Psychologist, 13, 94–99.
682. Smolensky, P. (1986). Information processing in dynamical systems: Foundations of
harmony theory. In D. E. Rumelhart and J. L. McClelland, editors, Parallel Distributed
Processing, volume 1, chapter 6, pages 194–281. MIT Press, Cambridge.
683. Snoek, J., Larochelle, H., and Adams, R. P. (2012). Practical Bayesian optimization
of machine learning algorithms. In NIPS’2012.
684. Socher, R., Huang, E. H., Pennington, J., Ng, A. Y., and Manning, C. D. (2011a).
Dynamic pooling and unfolding recursive autoencoders for paraphrase detection. In
NIPS’2011.
685. Socher, R., Manning, C., and Ng, A. Y. (2011b). Parsing natural scenes and natural
language with recursive neural networks. In Proceedings of the Twenty-Eighth
International Conference on Machine Learning (ICML’2011).
686. Socher, R., Pennington, J., Huang, E. H., Ng, A. Y., and Manning, C. D. (2011c).
Semi-supervised recursive autoencoders for predicting sentiment distributions. In
EMNLP’2011.
687. Socher, R., Perelygin, A., Wu, J. Y., Chuang, J., Manning, C. D., Ng, A. Y., and Potts,
C. (2013a). Recursive deep models for semantic compositionality over a sentiment
treebank. In EMNLP’2013.
688. Socher, R., Ganjoo, M., Manning, C. D., and Ng, A. Y. (2013b). Zero-shot learning
through cross-modal transfer. In 27th Annual Conference on Neural Information
Processing Systems (NIPS 2013).
689. Sohl-Dickstein, J., Weiss, E. A., Maheswaranathan, N., and Ganguli, S. (2015). Deep
unsupervised learning using nonequilibrium thermodynamics.
690. Sohn, K., Zhou, G., and Lee, H. (2013). Learning and selecting features jointly with
point-wise gated Boltzmann machines. In ICML’2013.
691. Solomonoff, R. J. (1989). A system for incremental learning based on algorithmic
probability.
692. Sontag, E. D. (1998). VC dimension of neural networks. NATO ASI Series F
Computer and Systems Sciences, 168, 69–96.
693. Sontag, E. D. and Sussman, H. J. (1989). Backpropagation can give rise to
spurious local minima even for networks without hidden layers. Complex Systems, 3,
91–106.
694. Sparkes, B. (1996). The Red and the Black: Studies in Greek Pottery. Routledge.
695. Spitkovsky, V. I., Alshawi, H., and Jurafsky, D. (2010). From baby steps to leapfrog:
how “less is more” in unsupervised dependency parsing. In HLT’10.
696. Squire, W. and Trapp, G. (1998). Using complex variables to estimate derivatives of
real functions. SIAM Rev., 40(1), 110–112.
697. Srebro, N. and Shraibman, A. (2005). Rank, trace-norm and max-norm. In
Proceedings of the 18th Annual Conference on Learning Theory, pages 545–560.
Springer-Verlag.
698. Srivastava, N. (2013). Improving Neural Networks With Dropout. Master’s thesis,
U. Toronto.
699. Srivastava, N. and Salakhutdinov, R. (2012). Multimodal learning with deep
Boltzmann machines. In NIPS’2012.
700. Srivastava, N., Salakhutdinov, R. R., and Hinton, G. E. (2013). Modeling documents
with deep Boltzmann machines. arXiv preprint arXiv:1309.6865.