Introduction to High-Dimensional Statistics (Monographs on Statistics and Applied Probability)

  • Binding: Hardcover, 255 pages
  • Language: English
  • Product code: 9781482237948
  • DDC classification: 519.535

Full Description


Ever-greater computing technologies have given rise to an exponentially growing volume of data. Today, massive data sets (with potentially thousands of variables) play an important role in almost every branch of modern human activity, including networks, finance, and genetics. Analyzing such data, however, presents a challenge for statisticians and data analysts and has required the development of new statistical methods capable of separating the signal from the noise.

Introduction to High-Dimensional Statistics is a concise guide to state-of-the-art models, techniques, and approaches for handling high-dimensional data. The book exposes the reader to the key concepts and ideas in the simplest settings possible while avoiding unnecessary technicalities. Offering a succinct presentation of the mathematical foundations of high-dimensional statistics, this highly accessible text:

  • Describes the challenges related to the analysis of high-dimensional data
  • Covers cutting-edge statistical methods, including model selection, sparsity and the lasso, aggregation, and learning theory
  • Provides detailed exercises at the end of every chapter, with collaborative solutions on a wiki site
  • Illustrates concepts with simple but clear practical examples

Introduction to High-Dimensional Statistics is suitable for graduate students and researchers interested in discovering modern statistics for massive data. It can be used as a graduate text or for self-study.
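As a taste of the "sparsity and the lasso" topic mentioned above: in the simplest (orthonormal-design) setting, the lasso reduces to coordinatewise soft-thresholding of the observed coefficients. The sketch below is an illustration assembled for this listing, not code from the book; the function name `soft_threshold`, the threshold `lam = 2.0`, and the planted 5-sparse signal are all choices made here for demonstration.

```python
import numpy as np

def soft_threshold(z, lam):
    # Coordinatewise lasso solution under an orthonormal design:
    # shrink each coefficient toward zero by lam, zeroing the small ones.
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# Planted sparse signal observed in Gaussian noise (p = 100, 5 nonzero entries).
rng = np.random.default_rng(0)
theta = np.zeros(100)
theta[:5] = 4.0
y = theta + rng.standard_normal(100)

theta_hat = soft_threshold(y, lam=2.0)
print("nonzero coefficients kept:", np.count_nonzero(theta_hat))
```

Despite 100 noisy coordinates, thresholding keeps only a handful of coefficients, most of them from the true support — the signal-from-noise separation the description refers to.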

Contents

Preface
Acknowledgments

Introduction
  High-Dimensional Data
  Curse of Dimensionality
    Lost in the Immensity of High-Dimensional Spaces
    Fluctuations Cumulate
    An Accumulation of Rare Events May Not Be Rare
    Computational Complexity
  High-Dimensional Statistics
    Circumventing the Curse of Dimensionality
    A Paradigm Shift
    Mathematics of High-Dimensional Statistics
  About This Book
    Statistics and Data Analysis
    Purpose of This Book
    Overview
  Discussion and References
    Take-Home Message
    References
  Exercises
    Strange Geometry of High-Dimensional Spaces
    Volume of a p-Dimensional Ball
    Tails of a Standard Gaussian Distribution
    Principal Component Analysis
    Basics of Linear Regression
    Concentration of the Square Norm of a Gaussian Random Variable

Model Selection
  Statistical Setting
  To Select among a Collection of Models
    Models and Oracle
    Model Selection Procedures
  Risk Bound for Model Selection
    Oracle Risk Bound
  Optimality
    Minimax Optimality
    Frontier of Estimation in High Dimensions
    Minimal Penalties
  Computational Issues
  Illustration
  An Alternative Point of View on Model Selection
  Discussion and References
    Take-Home Message
    References
  Exercises
    Orthogonal Design
    Risk Bounds for the Different Sparsity Settings
    Collections of Nested Models
    Segmentation with Dynamic Programming
    Goldenshluger-Lepski Method
    Minimax Lower Bounds

Aggregation of Estimators
  Introduction
  Gibbs Mixing of Estimators
  Oracle Risk Bound
  Numerical Approximation by Metropolis-Hastings
  Numerical Illustration
  Discussion and References
    Take-Home Message
    References
  Exercises
    Gibbs Distribution
    Orthonormal Setting with Power Law Prior
    Group-Sparse Setting
    Gain of Combining
    Online Aggregation

Convex Criteria
  Reminder on Convex Multivariate Functions
    Subdifferentials
    Two Useful Properties
  Lasso Estimator
    Geometric Insights
    Analytic Insights
    Oracle Risk Bound
    Computing the Lasso Estimator
    Removing the Bias of the Lasso Estimator
  Convex Criteria for Various Sparsity Patterns
    Group-Lasso (Group Sparsity)
    Sparse-Group Lasso (Sparse-Group Sparsity)
    Fused-Lasso (Variation Sparsity)
  Discussion and References
    Take-Home Message
    References
  Exercises
    When Is the Lasso Solution Unique?
    Support Recovery via the Witness Approach
    Lower Bound on the Compatibility Constant
    On the Group-Lasso
    Dantzig Selector
    Projection on the l1-Ball
    Ridge and Elastic-Net

Estimator Selection
  Estimator Selection
  Cross-Validation Techniques
  Complexity Selection Techniques
    Coordinate-Sparse Regression
    Group-Sparse Regression
    Multiple Structures
  Scaled-Invariant Criteria
  References and Discussion
    Take-Home Message
    References
  Exercises
    Expected V-Fold CV l2-Risk
    Proof of Corollary 5.5
    Some Properties of Penalty (5.4)
    Selecting the Number of Steps for the Forward Algorithm

Multivariate Regression
  Statistical Setting
  A Reminder on Singular Values
  Low-Rank Estimation
    If We Knew the Rank of A*
    When the Rank of A* Is Unknown
  Low Rank and Sparsity
    Row-Sparse Matrices
    Criterion for Row-Sparse and Low-Rank Matrices
    Convex Criterion for Low Rank Matrices
    Convex Criterion for Sparse and Low-Rank Matrices
  Discussion and References
    Take-Home Message
    References
  Exercises
    Hard-Thresholding of the Singular Values
    Exact Rank Recovery
    Rank Selection with Unknown Variance

Graphical Models
  Reminder on Conditional Independence
  Graphical Models
    Directed Acyclic Graphical Models
    Nondirected Models
  Gaussian Graphical Models (GGM)
    Connection with the Precision Matrix and the Linear Regression
    Estimating g by Multiple Testing
    Sparse Estimation of the Precision Matrix
    Estimation of g by Regression
  Practical Issues
  Discussion and References
    Take-Home Message
    References
  Exercises
    Factorization in Directed Models
    Moralization of a Directed Graph
    Convexity of -log(det(K))
    Block Gradient Descent with the l1/l2 Penalty
    Gaussian Graphical Models with Hidden Variables
    Dantzig Estimation of Sparse Gaussian Graphical Models
    Gaussian Copula Graphical Models
    Restricted Isometry Constant for Gaussian Matrices

Multiple Testing
  An Introductory Example
    Differential Expression of a Single Gene
    Differential Expression of Multiple Genes
  Statistical Setting
    p-Values
    Multiple Testing Setting
    Bonferroni Correction
  Controlling the False Discovery Rate
    Heuristics
    Step-Up Procedures
    FDR Control under the WPRDS Property
  Illustration
  Discussion and References
    Take-Home Message
    References
  Exercises
    FDR versus FWER
    WPRDS Property
    Positively Correlated Normal Test Statistics

Supervised Classification
  Statistical Modeling
    Bayes Classifier
    Parametric Modeling
    Semi-Parametric Modeling
    Nonparametric Modeling
  Empirical Risk Minimization
    Misclassification Probability of the Empirical Risk Minimizer
    Vapnik-Chervonenkis Dimension
    Dictionary Selection
  From Theoretical to Practical Classifiers
    Empirical Risk Convexification
    Statistical Properties
    Support Vector Machines
    AdaBoost
    Classifier Selection
  Discussion and References
    Take-Home Message
    References
  Exercises
    Linear Discriminant Analysis
    VC Dimension of Linear Classifiers in R^d
    Linear Classifiers with Margin Constraints
    Spectral Kernel
    Computation of the SVM Classifier
    Kernel Principal Component Analysis (KPCA)

Gaussian Distribution
  Gaussian Random Vectors
  Chi-Square Distribution
  Gaussian Conditioning

Probabilistic Inequalities
  Basic Inequalities
  Concentration Inequalities
    McDiarmid Inequality
    Gaussian Concentration Inequality
  Symmetrization and Contraction Lemmas
    Symmetrization Lemma
    Contraction Principle
  Birgé's Inequality

Linear Algebra
  Singular Value Decomposition (SVD)
  Moore-Penrose Pseudo-Inverse
  Matrix Norms
  Matrix Analysis

Subdifferentials of Convex Functions
  Subdifferentials and Subgradients
  Examples of Subdifferentials

Reproducing Kernel Hilbert Spaces

Notations
Bibliography
Index
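The "Volume of a p-Dimensional Ball" exercise listed above points at a classic curse-of-dimensionality fact: the volume of the unit Euclidean ball collapses to zero as the dimension grows. A quick numeric sketch, using the standard formula V_p = π^(p/2) / Γ(p/2 + 1) (the function name `unit_ball_volume` is chosen here, not taken from the book):

```python
import math

def unit_ball_volume(p):
    # Volume of the unit Euclidean ball in R^p: pi^(p/2) / Gamma(p/2 + 1).
    return math.pi ** (p / 2) / math.gamma(p / 2 + 1)

# The volume peaks around p = 5 and then collapses toward 0.
for p in (2, 5, 10, 20, 50):
    print(p, unit_ball_volume(p))
```

By p = 50 the volume is on the order of 1e-13, so uniformly sampled points in the unit cube essentially never land in the inscribed ball — one way high-dimensional spaces are "immense."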
