Hands-On Machine Learning with R

  • Authors: Boehmke, Brad / Greenwell, Brandon M.
  • Price: ¥17,374 (¥15,795 before tax); print edition ¥25,744
  • Publisher: Chapman and Hall/CRC (published 2019/11/07)
  • Language: English
  • ISBN: 9781138495685
  • eISBN: 9781000730432

Description

Hands-on Machine Learning with R provides a practical and applied approach to learning, and developing intuition for, today’s most popular machine learning methods. This book serves as a practitioner’s guide to the machine learning process and is meant to help the reader learn to apply the machine learning stack within R, which includes using various R packages such as glmnet, h2o, ranger, xgboost, keras, and others to effectively model and gain insight from their data. The book favors a hands-on approach, providing an intuitive understanding of machine learning concepts through concrete examples and just a little bit of theory.
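To give a flavor of what working with one of these engines looks like, here is a minimal sketch (not taken from the book) that fits a random forest with the ranger package on base R's built-in mtcars data; the hyperparameter values are illustrative choices only.

    library(ranger)

    set.seed(123)

    # Fit a random forest regressing mpg on all other mtcars columns.
    rf_fit <- ranger(
      formula    = mpg ~ .,
      data       = mtcars,
      num.trees  = 500,          # illustrative values, not recommendations
      mtry       = 3,
      importance = "impurity"
    )

    rf_fit$prediction.error      # out-of-bag mean squared error
    ranger::importance(rf_fit)   # impurity-based variable importance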

Throughout this book, the reader will be exposed to the entire machine learning process, including feature engineering, resampling, hyperparameter tuning, model evaluation, and interpretation. The reader will also work with powerful algorithms such as regularized regression, random forests, gradient boosting machines, deep learning, generalized low rank models, and more. By favoring a hands-on approach and using real-world data, the reader will gain an intuitive understanding of the architectures and engines that drive these algorithms and packages, understand when and how to tune the various hyperparameters, and be able to interpret model results. By the end of this book, the reader should have a firm grasp of R’s machine learning stack and be able to implement a systematic approach for producing high-quality modeling results.
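As an illustration of the resampling, tuning, and evaluation steps mentioned above, the following sketch (again using base R's mtcars data and the glmnet package, and not an excerpt from the book) cross-validates a ridge regression and scores it on a held-out test set.

    library(glmnet)

    set.seed(123)

    # Simple random train/test split.
    train_idx <- sample(nrow(mtcars), size = floor(0.7 * nrow(mtcars)))
    train <- mtcars[train_idx, ]
    test  <- mtcars[-train_idx, ]

    # glmnet expects a numeric predictor matrix and a response vector.
    x_train <- as.matrix(train[, -1])   # all columns except mpg
    y_train <- train$mpg

    # 5-fold cross-validated ridge regression (alpha = 0) to tune lambda.
    cv_fit <- cv.glmnet(x_train, y_train, alpha = 0, nfolds = 5)

    # Evaluate on the held-out test set using the lambda chosen by CV.
    preds <- predict(cv_fit, newx = as.matrix(test[, -1]), s = "lambda.min")
    rmse  <- sqrt(mean((test$mpg - preds)^2))
    rmse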

Features:

  • Offers a practical and applied introduction to the most popular machine learning methods.
  • Topics covered include feature engineering, resampling, deep learning, and more.
  • Uses a hands-on approach and real-world data.

Table of Contents

FUNDAMENTALS

Introduction to Machine Learning

Supervised learning

Regression problems

Classification problems

Unsupervised learning

Roadmap

The data sets

Modeling Process

Prerequisites

Data splitting

Simple random sampling

Stratified sampling

Class imbalances

Creating models in R

Many formula interfaces

Many engines

Resampling methods

k-fold cross validation

Bootstrapping

Alternatives

Bias variance trade-off

Bias

Variance

Hyperparameter tuning

Model evaluation

Regression models

Classification models

Putting the processes together

Feature & Target Engineering

Prerequisites

Target engineering

Dealing with missingness

Visualizing missing values

Imputation

Feature filtering

Numeric feature engineering

Skewness

Standardization

Categorical feature engineering

Lumping

One-hot & dummy encoding

Label encoding

Alternatives

Dimension reduction

Proper implementation

Sequential steps

Data leakage

Putting the process together

SUPERVISED LEARNING

Linear Regression

Prerequisites

Simple linear regression

Estimation

Inference

Multiple linear regression

Assessing model accuracy

Model concerns

Principal component regression

Partial least squares

Feature interpretation

Final thoughts

Logistic Regression

Prerequisites

Why logistic regression

Simple logistic regression

Multiple logistic regression

Assessing model accuracy

Model concerns

Feature interpretation

Final thoughts

Regularized Regression

Prerequisites

Why regularize?

Ridge penalty

Lasso penalty

Elastic nets

Implementation

Tuning

Feature interpretation

Attrition data

Final thoughts

Multivariate Adaptive Regression Splines

Prerequisites

The basic idea

Multivariate regression splines

Fitting a basic MARS model

Tuning

Feature interpretation

Attrition data

Final thoughts

K-Nearest Neighbors

Prerequisites

Measuring similarity

Distance measures

Pre-processing

Choosing k

MNIST example

Final thoughts

Decision Trees

Prerequisites

Structure

Partitioning

How deep?

Early stopping

Pruning

Ames housing example

Feature interpretation

Final thoughts

Bagging

Prerequisites

Why and when bagging works

Implementation

Easily parallelize

Feature interpretation

Final thoughts

Random Forests

Prerequisites

Extending bagging

Out-of-the-box performance

Hyperparameters

Number of trees

mtry

Tree complexity

Sampling scheme

Split rule

Tuning strategies

Feature interpretation

Final thoughts

Gradient Boosting

Prerequisites

How boosting works

A sequential ensemble approach

Gradient descent

Basic GBM

Hyperparameters

Implementation

General tuning strategy

Stochastic GBMs

Stochastic hyperparameters

Implementation

XGBoost

XGBoost hyperparameters

Tuning strategy

Feature interpretation

Final thoughts

Deep Learning

Prerequisites

Why deep learning

Feedforward DNNs

Network architecture

Layers and nodes

Activation

Backpropagation

Model training

Model tuning

Model capacity

Batch normalization

Regularization

Adjust learning rate

Grid Search

Final thoughts

Support Vector Machines

Prerequisites

Optimal separating hyperplanes

The hard margin classifier

The soft margin classifier

The support vector machine

More than two classes

Support vector regression

Job attrition example

Class weights

Class probabilities

Feature interpretation

Final thoughts

Stacked Models

Prerequisites

The idea

Common ensemble methods

Super learner algorithm

Available packages

Stacking existing models

Stacking a grid search

Automated machine learning

Final thoughts

Interpretable Machine Learning

Prerequisites

The idea

Global interpretation

Local interpretation

Model-specific vs. model-agnostic

Permutation-based feature importance

Concept

Implementation

Partial dependence

Concept

Implementation

Alternative uses

Individual conditional expectation

Concept

Implementation

Feature interactions

Concept

Implementation

Alternatives

Local interpretable model-agnostic explanations

Concept

Implementation

Tuning

Alternative uses

Shapley values

Concept

Implementation

XGBoost and built-in Shapley values

Localized step-wise procedure

Concept

Implementation

Final thoughts

DIMENSION REDUCTION

Principal Components Analysis

Prerequisites

The idea

Finding principal components

Performing PCA in R

Selecting the number of principal components

Eigenvalue criterion

Proportion of variance explained criterion

Scree plot criterion

Final thoughts

Generalized Low Rank Models

Prerequisites

The idea

Finding the lower ranks

Alternating minimization

Loss functions

Regularization

Selecting k

Fitting GLRMs in R

Basic GLRM model

Tuning to optimize for unseen data

Final thoughts

Autoencoders

Prerequisites

Undercomplete autoencoders

Comparing PCA to an autoencoder

Stacked autoencoders

Visualizing the reconstruction

Sparse autoencoders

Denoising autoencoders

Anomaly detection

Final thoughts

CLUSTERING

K-means Clustering

Prerequisites

Distance measures

Defining clusters

k-means algorithm

Clustering digits

How many clusters?

Clustering with mixed data

Alternative partitioning methods

Final thoughts

Hierarchical Clustering

Prerequisites

Hierarchical clustering algorithms

Hierarchical clustering in R

Agglomerative hierarchical clustering

Divisive hierarchical clustering

Determining optimal clusters

Working with dendrograms

Final thoughts

Model-based Clustering

Prerequisites

Measuring probability and uncertainty

Covariance types

Model selection

My basket example

Final thoughts
