Full Description
In many contemporary and emerging applications of machine learning and statistical inference, the phenomena of interest are characterized by variables defined over large alphabets. The growing size of the data and of the set of possible inferences, combined with the limited amount of available training data, means there is a need to understand which inference tasks can be carried out most effectively and, in turn, which features of the data are most relevant to them.
In this monograph, the authors develop the idea of extracting "universally good" features, and establish that diverse notions of such universality lead to precisely the same features. The information-theoretic approach taken yields a local information-geometric analysis that facilitates the computation of these features in a host of applications.
The authors provide a comprehensive treatment that guides the reader from basic principles to advanced techniques, including many new results. They emphasize a development from first principles, with common, unifying terminology and notation, and with pointers to the rich surrounding literature, both historical and contemporary.
Written for students and researchers, this monograph is a complete treatise on the information-theoretic treatment of a recognized and current problem in machine learning and statistical inference.
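To give a concrete sense of the modal decomposition developed in the early chapters: for a joint PMF over two finite alphabets, one standard construction (assumed here; consult the monograph for its precise development) forms the matrix B(x, y) = P(x, y)/√(P(x)P(y)) and takes its singular value decomposition, with the singular vectors, rescaled by the marginals, serving as feature functions of X and Y. The NumPy sketch below is illustrative only; the function name and toy distribution are invented for this example and are not code from the book.

```python
# Illustrative only: "modal_decomposition" and the toy distribution below are
# invented for this sketch; they are not code or data from the monograph.
import numpy as np

def modal_decomposition(P_xy):
    """Modal decomposition of a joint PMF over two finite alphabets.

    P_xy: 2-D array with P_xy[x, y] = P(X = x, Y = y).
    Returns the modal coefficients (singular values) and feature functions
    of X and Y, ordered by how much dependence they capture.
    """
    P_x = P_xy.sum(axis=1)                  # marginal of X
    P_y = P_xy.sum(axis=0)                  # marginal of Y
    B = P_xy / np.sqrt(np.outer(P_x, P_y))  # B(x, y) = P(x, y) / sqrt(P(x) P(y))
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    # The leading mode (singular value 1, constant features) is trivial;
    # rescaling the remaining singular vectors by the marginals gives
    # zero-mean feature functions f_i(x) and g_i(y).
    F = U[:, 1:] / np.sqrt(P_x)[:, None]
    G = Vt.T[:, 1:] / np.sqrt(P_y)[:, None]
    return s[1:], F, G

# Toy usage on a random 3-by-4 joint distribution.
rng = np.random.default_rng(0)
P = rng.random((3, 4))
P /= P.sum()
sigmas, F, G = modal_decomposition(P)
print(sigmas)  # decreasing coefficients sigma_1 >= sigma_2 >= ...
```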
Contents
1. Introduction
2. The Modal Decomposition of Joint Distributions
3. Variational Characterization of the Modal Decomposition
4. Local Information Geometry
5. Universal Feature Characterizations
6. Learning Modal Decompositions
7. Collaborative Filtering and Matrix Factorization
8. Softmax Regression
9. Gaussian Distributions and Linear Features
10. Nonlinear Features and Non-Gaussian Distributions
11. Semi-Supervised Learning
12. Modal Decomposition of Markov Random Fields
13. Emerging Applications and Related Developments
Acknowledgements
Appendices
References