Statistical Foundations of Data Science (Chapman & Hall/CRC Data Science)

E-book price
¥23,783
  • E-book available


  • Binding: Paperback / Pages: 774 p.
  • Language: ENG
  • Product code: 9780367512620
  • DDC classification: 511

Full Description


Statistical Foundations of Data Science gives a thorough introduction to commonly used statistical models and contemporary statistical machine learning techniques and algorithms, along with their mathematical insights and statistical theories. It aims to serve as both a graduate-level textbook and a research monograph on high-dimensional statistics, sparsity and covariance learning, machine learning, and statistical inference. It includes ample exercises involving both theoretical studies and empirical applications.

The book begins with an introduction to the stylized features of big data and their impacts on statistical analysis. It then introduces multiple linear regression and expands the techniques of model building via nonparametric regression and kernel tricks. It provides a comprehensive account of sparsity exploration and model selection for multiple regression, generalized linear models, quantile regression, robust regression, and hazards regression, among others. High-dimensional inference and feature screening are also thoroughly addressed. The book further gives a comprehensive account of high-dimensional covariance estimation and the learning of latent factors and hidden structures, together with their applications to statistical estimation, inference, prediction, and machine learning problems. Finally, it offers a thorough treatment of statistical machine learning theory and methods for classification, clustering, and prediction, including CART, random forests, boosting, support vector machines, clustering algorithms, sparse PCA, and deep learning.
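The running theme of the book's first half is penalized least squares. As a concrete taste of that material, here is a minimal illustrative sketch (not code from the book; the function names and toy data are invented for illustration) of the lasso fitted by cyclic coordinate descent, one of the algorithms the book covers:

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator: sign(z) * max(|z| - lam, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for (1/2n)||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n      # (1/n) x_j' x_j for each column
    resid = y - X @ beta                    # current full residual
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]      # partial residual: add back x_j * b_j
            z = X[:, j] @ resid / n         # (1/n) x_j' r_j
            beta[j] = soft_threshold(z, lam) / col_sq[j]
            resid -= X[:, j] * beta[j]      # restore full residual
    return beta

# Hypothetical toy data: sparse truth with more predictors than observations
rng = np.random.default_rng(0)
n, p = 100, 200
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:5] = 3.0
y = X @ beta_true + rng.standard_normal(n)
beta_hat = lasso_cd(X, y, lam=0.3)
print("estimated nonzero coefficients:", np.flatnonzero(beta_hat))
```

Even with p = 200 predictors and only n = 100 observations, the soft-thresholding step zeroes out most coefficients, illustrating the sparsity exploration the description refers to.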

Contents

1. Introduction
   Rise of Big Data and Dimensionality; Biological Sciences; Health Sciences; Computer and Information Sciences; Economics and Finance; Business and Program Evaluation; Earth Sciences and Astronomy; Impact of Big Data; Impact of Dimensionality; Computation; Noise Accumulation; Spurious Correlation; Statistical Theory; Aim of High-dimensional Statistical Learning; What Big Data Can Do; Scope of the Book

2. Multiple and Nonparametric Regression
   Introduction; Multiple Linear Regression; The Gauss-Markov Theorem; Statistical Tests; Weighted Least-Squares; Box-Cox Transformation; Model Building and Basis Expansions; Polynomial Regression; Spline Regression; Multiple Covariates; Ridge Regression; Bias-Variance Tradeoff; Penalized Least Squares; Bayesian Interpretation; Ridge Regression Solution Path; Kernel Ridge Regression; Regression in Reproducing Kernel Hilbert Space; Leave-One-Out and Generalized Cross-Validation; Exercises

3. Introduction to Penalized Least-Squares
   Classical Variable Selection Criteria; Subset Selection; Relation with Penalized Regression; Selection of Regularization Parameters; Folded-Concave Penalized Least Squares; Orthonormal Designs; Penalty Functions; Thresholding by SCAD and MCP; Risk Properties; Characterization of Folded-Concave PLS; Lasso and L1 Regularization; Nonnegative Garrote; Lasso; Adaptive Lasso; Elastic Net; Dantzig Selector; SLOPE and Sorted Penalties; Concentration Inequalities and Uniform Convergence; A Brief History of Model Selection; Bayesian Variable Selection; Bayesian View of the PLS; A Bayesian Framework for Selection; Numerical Algorithms; Quadratic Programs; Least Angle Regression; Local Quadratic Approximations; Local Linear Algorithm; Penalized Linear Unbiased Selection; Cyclic Coordinate Descent Algorithms; Iterative Shrinkage-Thresholding Algorithms; Projected Proximal Gradient Method; ADMM; Iterative Local Adaptive Majorization and Minimization; Other Methods and Timeline; Regularization Parameters for PLS; Degrees of Freedom; Extension of Information Criteria; Application to PLS Estimators; Residual Variance and Refitted Cross-Validation; Residual Variance of Lasso; Refitted Cross-Validation; Extensions to Nonparametric Modeling; Structured Nonparametric Models; Group Penalty; Applications; Bibliographical Notes; Exercises

4. Penalized Least Squares: Properties
   Performance Benchmarks; Performance Measures; Impact of Model Uncertainty; Bayes Lower Bounds for Orthogonal Design; Minimax Lower Bounds for General Design; Performance Goals, Sparsity and Sub-Gaussian Noise; Penalized L0 Selection; Lasso and Dantzig Selector; Selection Consistency; Prediction and Coefficient Estimation Errors; Model Size and Least Squares after Selection; Properties of the Dantzig Selector; Regularity Conditions on the Design Matrix; Properties of Concave PLS; Properties of Penalty Functions; Local and Oracle Solutions; Properties of Local Solutions; Global and Approximate Global Solutions; Smaller and Sorted Penalties; Sorted Concave Penalties and Their Local Approximation; Approximate PLS with Smaller and Sorted Penalties; Properties of LLA and LCA; Bibliographical Notes; Exercises

5. Generalized Linear Models and Penalized Likelihood
   Generalized Linear Models; Exponential Family; Elements of Generalized Linear Models; Maximum Likelihood; Computing MLE: Iteratively Reweighted Least Squares; Deviance and Analysis of Deviance; Residuals; Examples; Bernoulli and Binomial Models; Models for Count Responses; Models for Nonnegative Continuous Responses; Normal Error Models; Sparsest Solution in High Confidence Set; A General Setup; Examples; Properties; Variable Selection via Penalized Likelihood; Algorithms; Local Quadratic Approximation; Local Linear Approximation; Coordinate Descent; Iterative Local Adaptive Majorization and Minimization; Tuning Parameter Selection; An Application; Sampling Properties in Low Dimensions; Notation and Regularity Conditions; The Oracle Property; Sampling Properties with Diverging Dimensions; Asymptotic Properties of GIC Selectors; Properties under Ultrahigh Dimensions; The Lasso Penalized Estimator and Its Risk Property; Strong Oracle Property; Numeric Studies; Risk Properties; Bibliographical Notes; Exercises

6. Penalized M-estimators
   Penalized Quantile Regression; Quantile Regression; Variable Selection in Quantile Regression; A Fast Algorithm for Penalized Quantile Regression; Penalized Composite Quantile Regression; Variable Selection in Robust Regression; Robust Regression; Variable Selection in Huber Regression; Rank Regression and Its Variable Selection; Rank Regression; Penalized Weighted Rank Regression; Variable Selection for Survival Data; Partial Likelihood; Variable Selection via Penalized Partial Likelihood and Its Properties; Theory of Folded-Concave Penalized M-estimators; Conditions on Penalty and Restricted Strong Convexity; Statistical Accuracy of Penalized M-estimators with Folded-Concave Penalties; Computational Accuracy; Bibliographical Notes; Exercises

7. High-Dimensional Inference
   Inference in Linear Regression; Debiasing Regularized Regression Estimators; Choices of Weights; Inference for the Noise Level; Inference in Generalized Linear Models; Desparsified Lasso; Decorrelated Score Estimator; Test of Linear Hypotheses; Numerical Comparison; An Application; Asymptotic Efficiency; Statistical Efficiency and Fisher Information; Linear Regression with Random Design; Partial Linear Regression; Gaussian Graphical Models; Inference via Penalized Least Squares; Sample Size in Regression and Graphical Models; General Solutions; Local Semi-LD Decomposition; Data Swap; Gradient Approximation; Bibliographical Notes; Exercises

8. Feature Screening
   Correlation Screening; Sure Screening Property; Connection to Multiple Comparison; Iterative SIS; Generalized and Rank Correlation Screening; Feature Screening for Parametric Models; Generalized Linear Models; A Unified Strategy for Parametric Feature Screening; Conditional Sure Independence Screening; Nonparametric Screening; Additive Models; Varying Coefficient Models; Heterogeneous Nonparametric Models; Model-free Feature Screening; Sure Independent Ranking Screening Procedure; Feature Screening via Distance Correlation; Feature Screening for High-Dimensional Categorical Data; Screening and Selection; Feature Screening via Forward Regression; Sparse Maximum Likelihood Estimate; Feature Screening via Partial Correlation; Refitted Cross-Validation; RCV Algorithm; RCV in Linear Models; RCV in Nonparametric Regression; An Illustration; Bibliographical Notes; Exercises

9. Covariance Regularization and Graphical Models
   Basic Facts about Matrices; Sparse Covariance Matrix Estimation; Covariance Regularization by Thresholding and Banding; Asymptotic Properties; Nearest Positive Definite Matrices; Robust Covariance Inputs; Sparse Precision Matrix and Graphical Models; Gaussian Graphical Models; Penalized Likelihood and M-estimation; Penalized Least-Squares; CLIME and Its Adaptive Version; Latent Gaussian Graphical Models; Technical Proofs; Bibliographical Notes; Exercises

10. Covariance Learning and Factor Models
   Principal Component Analysis; Introduction to PCA; Power Method; Factor Models and Structured Covariance Learning; Factor Model and High-Dimensional PCA; Extracting Latent Factors and POET; Methods for Selecting the Number of Factors; Covariance and Precision Learning with Known Factors; Factor Model with Observable Factors; Robust Initial Estimation of Covariance Matrix; Augmented Factor Models and Projected PCA; Asymptotic Properties; Properties for Estimating the Loading Matrix; Properties for Estimating Covariance Matrices; Properties for Estimating Realized Latent Factors; Properties for Estimating Idiosyncratic Components; Technical Proofs; Bibliographical Notes; Exercises

11. Applications of Factor Models and PCA
   Factor-Adjusted Regularized Model Selection; Importance of Factor Adjustments; FarmSelect; Application to Forecasting Bond Risk Premia; Application to a Neuroblastoma Data Set; Asymptotic Theory for FarmSelect; Factor-Adjusted Robust Multiple Testing; False Discovery Rate Control; Multiple Testing under Dependence Measurements; Power of Factor Adjustments; FarmTest; Application to Neuroblastoma Data; Factor-Augmented Regression Methods; Principal Component Regression; Augmented Principal Component Regression; Application to Forecasting Bond Risk Premia; Applications to Statistical Machine Learning; Community Detection; Topic Model; Matrix Completion; Item Ranking; Gaussian Mixture Models; Bibliographical Notes; Exercises

12. Supervised Learning
   Model-Based Classifiers; Linear and Quadratic Discriminant Analysis; Logistic Regression; Kernel Density Classifiers and Naive Bayes; Nearest Neighbor Classifiers; Classification Trees and Ensemble Classifiers; Classification Trees; Bagging; Random Forests; Boosting; Support Vector Machines; The Standard Support Vector Machine; Generalizations of SVMs; Sparse Classifiers via Penalized Empirical Loss; The Importance of Sparsity under High Dimensionality; Sparse Support Vector Machines; Sparse Large Margin Classifiers; Sparse Discriminant Analysis; Nearest Shrunken Centroids Classifier; Features Annealed Independence Rule; Selection Bias of Sparse Independence Rules; Regularized Optimal Affine Discriminant (ROAD); Linear Programming Discriminant; Direct Sparse Discriminant Analysis (DSDA); Solution Path Equivalence between ROAD and DSDA; Feature Augmentation and Sparse Additive Classifiers; Feature Augmentation; Penalized Additive Logistic Regression; Semiparametric Sparse Discriminant Analysis; Bibliographical Notes; Exercises

13. Unsupervised Learning
   Cluster Analysis; K-means Clustering; Hierarchical Clustering; Model-Based Clustering; Spectral Clustering; Data-Driven Choices of the Number of Clusters; Variable Selection in Clustering; Sparse Clustering; Sparse Model-Based Clustering; Sparse Mixture of Experts Model; An Introduction to High-Dimensional PCA; Inconsistency of the Regular PCA; Consistency under the Sparse Eigenvector Model; Sparse Principal Component Analysis; Sparse PCA; An Iterative SVD Thresholding Approach; A Penalized Matrix Decomposition Approach; A Semidefinite Programming Approach; A Generalized Power Method; Bibliographical Notes; Exercises

14. An Introduction to Deep Learning
   Rise of Deep Learning; Feed-Forward Neural Networks; Model Setup; Back-Propagation in Computational Graphs; Popular Models; Convolutional Neural Networks; Recurrent Neural Networks; Vanilla RNNs; GRUs and LSTM; Multilayer RNNs; Modules; Deep Unsupervised Learning; Autoencoders; Generative Adversarial Networks; Sampling View of GANs; Minimum Distance View of GANs; Training Deep Neural Nets; Stochastic Gradient Descent; Mini-Batch SGD; Momentum-Based SGD; SGD with Adaptive Learning Rates; Easing Numerical Instability; ReLU Activation Function; Skip Connections; Batch Normalization; Regularization Techniques; Weight Decay; Dropout; Data Augmentation; Example: Image Classification; Bibliographical Notes
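Chapters 3 and 4 above revolve around folded-concave penalties such as SCAD and MCP. As a small illustrative sketch of the "Thresholding by SCAD and MCP" topic (assuming an orthonormal design and Fan and Li's conventional default a = 3.7; the function name is hypothetical, not from the book), the SCAD rule soft-thresholds small effects like the lasso but passes large effects through unshrunk:

```python
import numpy as np

def scad_threshold(z, lam, a=3.7):
    """SCAD thresholding rule for an orthonormal design.
    Three zones: lasso-like soft thresholding for |z| <= 2*lam,
    a linear transition for 2*lam < |z| <= a*lam,
    and no shrinkage for |z| > a*lam (near-unbiasedness for large effects)."""
    z = np.asarray(z, dtype=float)
    return np.where(
        np.abs(z) <= 2 * lam,
        np.sign(z) * np.maximum(np.abs(z) - lam, 0.0),       # soft-threshold zone
        np.where(
            np.abs(z) <= a * lam,
            ((a - 1) * z - np.sign(z) * a * lam) / (a - 2),  # transition zone
            z,                                               # pass-through zone
        ),
    )

z = np.linspace(-4, 4, 9)
print(scad_threshold(z, lam=1.0))  # small |z| are zeroed; |z| > 3.7 are unchanged
```

Compared with the lasso, which shrinks every surviving coefficient by the same amount, the pass-through zone for large signals is what underlies the oracle-type properties the book develops for folded-concave penalized estimators.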
