Description
Machine Learning for Business Analytics: Concepts, Techniques, and Applications in Python is a comprehensive introduction to and overview of the methods that underlie modern AI. This best-selling textbook covers both statistical and machine learning (AI) algorithms for prediction, classification, visualization, dimension reduction, rule mining, recommendations, clustering, text mining, experimentation, network analytics, and generative AI. Along with hands-on exercises and real-life case studies, it discusses the managerial and ethical issues involved in the responsible use of machine learning techniques.
This is the second Python edition of Machine Learning for Business Analytics. This edition also includes:
- A new chapter on generative AI (large language models or LLMs, and image generation)
- An expanded chapter on deep learning
- A new chapter on experimental feedback techniques including A/B testing, uplift modeling, and reinforcement learning
- A new chapter on responsible data science
- Updates and new material based on feedback from instructors teaching MBA, Master's in Business Analytics, and related programs, as well as undergraduate, diploma, and executive courses, and from their students
- A full chapter of cases demonstrating applications for the machine learning techniques
- End-of-chapter exercises with data
- A companion website with more than two dozen data sets, and instructor materials including exercise solutions, slides, and case solutions
This textbook is an ideal resource for upper-level undergraduate and graduate-level courses in AI, data science, predictive analytics, and business analytics. It is also an excellent reference for analysts, researchers, and data science practitioners working with quantitative data in management, finance, marketing, operations management, information systems, computer science, and information technology.
Table of Contents
Foreword by Gareth James xxi
Preface to the Second Python Edition xxiii
Acknowledgments xxvii
PART I PRELIMINARIES
CHAPTER 1 Introduction 3
1.1 What Is Business Analytics? 3
1.2 What Is Machine Learning? 5
1.3 Machine Learning, AI, and Related Terms 5
1.4 Big Data 7
1.5 Data Science 8
1.6 Why Are There So Many Different Methods? 8
1.7 Terminology and Notation 9
1.8 Road Maps to This Book 12
Order of Topics 13
CHAPTER 2 Overview of the Machine Learning Process 17
2.1 Introduction 18
2.2 Core Ideas in Machine Learning 18
Classification 18
Prediction 18
Association Rules and Recommendation Systems 18
Predictive Analytics 19
Data Reduction and Dimension Reduction 19
Data Exploration and Visualization 19
Supervised and Unsupervised Learning 20
Generative AI 21
2.3 The Steps in a Machine Learning Project 22
2.4 Preliminary Steps 23
Organization of Data 23
Predicting Home Values in the West Roxbury Neighborhood 24
Loading and Looking at the Data in Python 25
Python Imports 28
Sampling from a Database 28
Oversampling Rare Events in Classification Tasks 28
Preprocessing and Cleaning the Data 30
2.5 Predictive Power and Overfitting 37
Overfitting 38
Creating and Using Data Partitions 40
2.6 Building a Predictive Model 43
Modeling Process 43
2.7 Using Python for Machine Learning on a Local Machine 49
2.8 Automating Machine Learning Solutions 49
Predicting Power Generator Failure 50
Uber’s Michelangelo 52
2.9 Ethical Practice in Machine Learning 54
Problems 55
PART II DATA EXPLORATION AND DIMENSION REDUCTION
CHAPTER 3 Data Visualization 61
3.1 Uses of Data Visualization 62
3.2 Data Examples 64
Example 1: Boston Housing Data 64
Example 2: Ridership on Amtrak Trains 66
3.3 Basic Charts: Bar Charts, Line Charts, and Scatter Plots 66
Distribution Plots: Boxplots and Histograms 69
Heatmaps: Visualizing Correlations and Missing Values 71
3.4 Multidimensional Visualization 75
Adding Variables: Color, Size, Shape, Multiple Panels, and Animation 75
Manipulations: Rescaling, Aggregation and Hierarchies, Zooming, Filtering 78
Reference: Trend Lines and Labels 81
Scaling Up to Large Datasets 84
Multivariate Plot: Parallel Coordinates Plot 84
Interactive Visualization 87
3.5 Specialized Visualizations 90
Visualizing Networked Data 90
Visualizing Hierarchical Data: Treemaps 92
Visualizing Geographical Data: Map Charts 94
3.6 Major Visualizations and Operations, by Machine Learning Goal 96
Prediction 96
Classification 97
Time Series Forecasting 97
Unsupervised Learning 97
Problems 98
CHAPTER 4 Dimension Reduction 101
4.1 Introduction 102
4.2 Curse of Dimensionality 102
4.3 Practical Considerations 103
Example 1: House Prices in Boston 103
4.4 Data Summaries 103
Summary Statistics 104
Aggregation and Pivot Tables 106
4.5 Correlation Analysis 108
4.6 Reducing the Number of Categories in Categorical Variables 109
4.7 Converting a Categorical Variable to a Numerical Variable 109
4.8 Principal Component Analysis 111
Example 2: Breakfast Cereals 111
Principal Components 116
Normalizing the Data 116
Using Principal Components for Classification and Prediction 120
4.9 Dimension Reduction Using Regression Models 121
4.10 Dimension Reduction Using Classification and Regression Trees 121
Problems 123
PART III PERFORMANCE EVALUATION
CHAPTER 5 Evaluating Predictive Performance 129
5.1 Introduction 130
5.2 Evaluating Predictive Performance 131
Naive Benchmark: The Average 131
Prediction Accuracy Measures 131
Comparing Training and Holdout Performance 132
Cumulative Gains and Lift Charts 135
5.3 Judging Classifier Performance 137
Benchmark: The Naive Rule 137
Class Separation 137
The Confusion (Classification) Matrix 138
Using the Holdout Data 139
Accuracy Measures 140
Propensities and Cutoff for Classification 140
Performance in Case of Unequal Importance of Classes 144
Asymmetric Misclassification Costs 147
Generalization to More Than Two Classes 150
5.4 Judging Ranking Performance 150
Cumulative Gains and Lift Charts for Binary Data 150
Decile-Wise Lift Charts 153
Beyond Two Classes 154
Gains and Lift Charts Incorporating Costs and Benefits 154
Cumulative Gains as a Function of Cutoff 155
5.5 Oversampling 156
Creating an Oversampled Training Set 158
Evaluating Model Performance Using a Non-oversampled Holdout Set 159
Evaluating Model Performance If Only Oversampled Holdout Set Exists 159
Problems 162
PART IV PREDICTION AND CLASSIFICATION METHODS
CHAPTER 6 Multiple Linear Regression 167
6.1 Introduction 168
6.2 Explanatory vs Predictive Modeling 168
6.3 Estimating the Regression Equation and Prediction 170
Example: Predicting the Price of Used Toyota Corolla Cars 171
Cross-Validation 175
6.4 Variable Selection in Linear Regression 176
Reducing the Number of Predictors 176
How to Reduce the Number of Predictors 177
Regularization (Shrinkage Models) 182
Appendix: Using Statsmodels 186
Problems 188
CHAPTER 7 k-Nearest Neighbors (k-NN) 193
7.1 The k-NN Classifier (Categorical Outcome) 194
Determining Neighbors 194
Classification Rule 195
Example: Riding Mowers 195
Choosing k 196
Weighted k-NN 200
Setting the Cutoff Value 201
k-NN with More Than Two Classes 201
Converting Categorical Variables to Binary Dummies 202
7.2 k-NN for a Numerical Outcome 203
7.3 Advantages and Shortcomings of k-NN Algorithms 205
Problems 207
CHAPTER 8 The Naive Bayes Classifier 209
8.1 Introduction 209
Cutoff Probability Method 210
Conditional Probability 210
Example 1: Predicting Fraudulent Financial Reporting 211
8.2 Applying the Full (Exact) Bayesian Classifier 212
Using the “Assign to the Most Probable Class” Method 212
Using the Cutoff Probability Method 212
Practical Difficulty with the Complete (Exact) Bayes Procedure 212
8.3 Solution: Naive Bayes 213
The Naive Bayes Assumption of Conditional Independence 214
Using the Cutoff Probability Method 215
Example 2: Predicting Fraudulent Financial Reports, Two Predictors 215
Example 3: Predicting Delayed Flights 216
Working with Continuous Predictors 223
8.4 Advantages and Shortcomings of the Naive Bayes Classifier 224
Problems 226
CHAPTER 9 Classification and Regression Trees 229
9.1 Introduction 230
Tree Structure 231
Decision Rules 231
Classifying a New Record 232
9.2 Classification Trees 232
Recursive Partitioning 232
Example 1: Riding Mowers 233
Measures of Impurity 235
9.3 Evaluating the Performance of a Classification Tree 241
Example 2: Acceptance of Personal Loan 241
Sensitivity Analysis Using Cross-Validation 243
9.4 Avoiding Overfitting 246
Stopping Tree Growth 246
Fine-Tuning Tree Parameters 247
Other Methods for Limiting Tree Size 250
9.5 Classification Rules from Trees 252
9.6 Classification Trees for More Than Two Classes 252
9.7 Regression Trees 253
Prediction 253
Measuring Impurity 255
Evaluating Performance 256
9.8 Advantages and Weaknesses of a Tree 256
9.9 Improving Prediction: Random Forests and Boosted Trees 258
Random Forests 258
Boosted Trees 260
Problems 264
CHAPTER 10 Logistic Regression 267
10.1 Introduction 268
10.2 The Logistic Regression Model 269
10.3 Example: Acceptance of Personal Loan 272
Model with a Single Predictor 272
Estimating the Logistic Model from Data: Computing Parameter Estimates 274
Interpreting Results in Terms of Odds (for a Profiling Goal) 275
10.4 Evaluating Classification Performance 277
10.5 Variable Selection 280
10.6 Logistic Regression for Multi-Class Classification 281
Ordinal Classes 281
Nominal Classes 282
Comparing Ordinal and Nominal Models 283
10.7 Example of Complete Analysis: Predicting Delayed Flights 285
Data Preprocessing 289
Model-Fitting and Estimation 289
Model Interpretation 289
Model Performance 291
Variable Selection 292
Appendix: Using Statsmodels 297
Problems 298
CHAPTER 11 Neural Nets 301
11.1 Introduction 302
11.2 Concept and Structure of a Neural Network 302
11.3 Fitting a Network to Data 303
Example 1: Tiny Dataset 303
Computing Output of Nodes 305
Preprocessing the Data 307
Training the Model 308
Example 2: Classifying Accident Severity 313
Avoiding Overfitting 314
Using the Output for Prediction and Classification 314
11.4 Required User Input 316
11.5 Exploring the Relationship Between Predictors and Outcome 317
11.6 Deep Learning 318
Convolutional Neural Networks (CNNs) 319
Local Feature Map 320
A Hierarchy of Features 321
The Learning Process 321
Unsupervised Learning 322
Example: Classification of Fashion Images 323
Conclusion 329
11.7 Advantages and Weaknesses of Neural Networks 329
Problems 331
CHAPTER 12 Discriminant Analysis 333
12.1 Introduction 334
Example 1: Riding Mowers 334
Example 2: Personal Loan Acceptance 334
12.2 Distance of a Record from a Class 336
12.3 Fisher’s Linear Classification Functions 337
12.4 Classification Performance of Discriminant Analysis 341
12.5 Prior Probabilities 342
12.6 Unequal Misclassification Costs 342
12.7 Classifying More Than Two Classes 344
Example 3: Medical Dispatch to Accident Scenes 344
12.8 Advantages and Weaknesses 347
Problems 348
CHAPTER 13 Generating, Comparing, and Combining Multiple Models 351
13.1 Ensembles 352
Why Ensembles Can Improve Predictive Power 353
Simple Averaging or Voting 354
Bagging 355
Boosting 356
Bagging and Boosting in Python 356
Stacking 356
Federated Learning 358
Advantages and Weaknesses of Ensembles 358
13.2 Automated Machine Learning (AutoML) 359
AutoML: Explore and Clean Data 359
AutoML: Determine Machine Learning Task 360
AutoML: Choose Features and Machine Learning Methods 360
AutoML: Evaluate Model Performance 361
AutoML: Model Deployment 363
Advantages and Weaknesses of Automated Machine Learning 364
13.3 Explaining Model Predictions 365
13.4 Summary 366
Problems 368
CHAPTER 14 Experiments, Uplift Models, and Reinforcement Learning 371
14.1 A/B Testing 372
Example: Testing a New Feature in a Photo Sharing App 373
The Statistical Test for Comparing Two Groups (t-Test) 374
Multiple Treatment Groups: A/B/n Tests 376
Multiple A/B Tests and the Danger of Multiple Testing 377
14.2 Uplift (Persuasion) Modeling 377
Gathering the Data 378
A Simple Model 380
Modeling Individual Uplift 380
Computing Uplift with Python 382
Using the Results of an Uplift Model 382
14.3 Reinforcement Learning 384
Explore-Exploit: Multi-armed Bandits 384
Example of Using a Contextual Multi-arm Bandit for Movie Recommendations 387
Markov Decision Process (MDP) 390
14.4 Summary 393
Problems 395
PART V MINING RELATIONSHIPS AMONG RECORDS
CHAPTER 15 Association Rules and Collaborative Filtering 399
15.1 Association Rules 400
Discovering Association Rules in Transaction Databases 400
Example 1: Synthetic Data on Purchases of Phone Faceplates 402
Generating Candidate Rules 402
The Apriori Algorithm 403
Selecting Strong Rules 403
Data Format 406
The Process of Rule Selection 407
Interpreting the Results 408
Rules and Chance 408
Example 2: Rules for Similar Book Purchases 411
15.2 Collaborative Filtering 413
Data Type and Format 414
Example 3: Netflix Prize Contest 415
User-Based Collaborative Filtering: “People Like You” 416
Item-Based Collaborative Filtering 418
Evaluating Performance 421
Example 4: Predicting Movie Ratings with MovieLens Data 422
Advantages and Weaknesses of Collaborative Filtering 423
Collaborative Filtering vs Association Rules 426
15.3 Summary 427
Problems 429
CHAPTER 16 Cluster Analysis 433
16.1 Introduction 434
Example: Public Utilities 435
16.2 Measuring Distance Between Two Records 437
Euclidean Distance 438
Normalizing Numerical Variables 439
Other Distance Measures for Numerical Data 439
Distance Measures for Categorical Data 441
Distance Measures for Mixed Data 442
16.3 Measuring Distance Between Two Clusters 443
Minimum Distance 443
Maximum Distance 443
Average Distance 443
Centroid Distance 443
16.4 Hierarchical (Agglomerative) Clustering 445
Single Linkage 446
Complete Linkage 446
Average Linkage 447
Centroid Linkage 447
Ward’s Method 447
Dendrograms: Displaying Clustering Process and Results 448
Validating Clusters 450
Limitations of Hierarchical Clustering 451
16.5 Non-Hierarchical Clustering: The k-Means Algorithm 453
Choosing the Number of Clusters (k) 455
Problems 459
PART VI FORECASTING TIME SERIES
CHAPTER 17 Handling Time Series 463
17.1 Introduction 464
17.2 Descriptive vs Predictive Modeling 465
17.3 Popular Forecasting Methods in Business 465
Combining Methods 466
17.4 Time Series Components 466
Example: Ridership on Amtrak Trains 467
17.5 Data Partitioning and Performance Evaluation 470
Benchmark Performance: Naive Forecasts 471
Generating Future Forecasts 473
Problems 474
CHAPTER 18 Regression-Based Forecasting 477
18.1 A Model with Trend 478
Linear Trend 478
Exponential Trend 481
Polynomial Trend 481
18.2 A Model with Seasonality 484
18.3 A Model with Trend and Seasonality 486
18.4 Autocorrelation and ARIMA Models 488
Computing Autocorrelation 488
Improving Forecasts by Integrating Autocorrelation Information 491
Evaluating Predictability 495
Problems 498
CHAPTER 19 Smoothing and Deep Learning Methods for Forecasting 509
19.1 Smoothing Methods: Introduction 510
19.2 Moving Average 510
Centered Moving Average for Visualization 511
Trailing Moving Average for Forecasting 512
Choosing Window Width (w) 514
19.3 Simple Exponential Smoothing 515
Choosing Smoothing Parameter α 516
19.4 Advanced Exponential Smoothing 518
Series with a Trend 518
Series with a Trend and Seasonality 519
Series with Seasonality (No Trend) 520
19.5 Deep Learning for Forecasting 521
Problems 527
PART VII DATA ANALYTICS
CHAPTER 20 Social Network Analytics 537
20.1 Introduction 538
20.2 Directed vs Undirected Networks 538
20.3 Visualizing and Analyzing Networks 539
Plot Layout 541
Edge List 543
Adjacency Matrix 543
Using Network Data in Classification and Prediction 544
20.4 Social Data Metrics and Taxonomy 544
Node-Level Centrality Metrics 545
Egocentric Network 546
Network Metrics 547
20.5 Using Network Metrics in Prediction and Classification 550
Link Prediction 550
Entity Resolution 550
Collaborative Filtering 553
20.6 Business Uses of Social Network Analysis 556
20.7 Summary 557
Problems 559
CHAPTER 21 Text Mining 561
21.1 Introduction 562
21.2 The Tabular Representation of Text 562
21.3 Bag-of-Words vs Meaning Extraction at Document Level 563
21.4 Preprocessing the Text 564
Tokenization 565
Text Reduction 567
Presence/Absence vs Frequency 567
Term Frequency–Inverse Document Frequency (TF-IDF) 569
From Terms to Concepts: Latent Semantic Indexing 571
Extracting Meaning 571
From Terms to High-Dimensional Word Vectors: word2vec or GloVe 572
21.5 Implementing Machine Learning Methods 573
21.6 Example: Online Discussions on Autos and Electronics 573
Importing and Labeling the Records 574
Text Preprocessing in Python 574
Producing a Concept Matrix 575
Fitting a Predictive Model 575
Prediction 575
21.7 Deep Learning Approaches 577
21.8 Example: Sentiment Analysis of Movie Reviews 578
Data Loading, Preparation, and Partitioning 578
Generating and Applying the GloVe Model 579
Fitting a Predictive Model 579
Creating Sentence Embeddings Using a Pretrained Transformer Model 581
21.9 Summary 581
Problems 584
CHAPTER 22 Responsible Data Science 587
22.1 Introduction 588
22.2 Unintentional Harm 589
22.3 Legal Considerations 591
22.4 Principles of Responsible Data Science 592
Non-maleficence 593
Fairness 593
Transparency 594
Accountability 595
Data Privacy and Security 595
22.5 A Responsible Data Science Framework 595
Justification 596
Assembly 596
Data Preparation 597
Modeling 598
Auditing 598
22.6 Documentation Tools 599
Impact Statements 599
Model Cards 600
Datasheets 601
Audit Reports 601
22.7 Example: Applying the RDS Framework to the COMPAS Example 603
Unanticipated Uses 603
Ethical Concerns 603
Protected Groups 603
Data Issues 604
Fitting the Model 604
Auditing the Model 606
Bias Mitigation 611
22.8 Summary 613
Problems 614
CHAPTER 23 Generative AI 617
23.1 The Transformative Power of Generative AI 617
23.2 What is Generative AI? 619
Large Language Models (LLMs) 619
Image Generation 620
23.3 Data and Infrastructure Requirements 621
23.4 Adapting Models for Specific Purposes 623
Fine Tuning 623
Retrieval Augmented Generation (RAG) 624
Fine-tuning vs RAG 624
23.5 Prompt Engineering 624
Interactive Conversation 625
23.6 Uses of Generative AI 625
Augmenting AI Training Data 627
Deploying Generative AI 629
23.7 Caveats and Concerns 629
23.8 Summary 631
Problems 633
PART VIII CASES
CHAPTER 24 Cases 639
24.1 Charles Book Club 639
The Book Industry 639
Database Marketing at Charles 640
Machine Learning Techniques 642
Assignment 644
24.2 German Credit 646
Background 646
Data 646
Assignment 650
24.3 Tayko Software Cataloger 651
Background 651
The Mailing Experiment 651
Data 651
Assignment 653
24.4 Political Persuasion 655
Background 655
Predictive Analytics Arrives in US Politics 655
Political Targeting 655
Uplift 656
Data 657
Assignment 657
24.5 Taxi Cancellations 659
Business Situation 659
Assignment 659
24.6 Segmenting Consumers of Bath Soap 661
Business Situation 661
Key Problems 661
Data 662
Measuring Brand Loyalty 662
Assignment 662
24.7 Direct-Mail Fundraising 665
Background 665
Data 665
Assignment 665
24.8 Catalog Cross-Selling 668
Background 668
Assignment 668
24.9 Time-Series Case: Forecasting Public Transportation Demand 670
Background 670
Problem Description 670
Available Data 670
Assignment Goal 670
Assignment 671
Tips and Suggested Steps 671
24.10 Loan Approval 672
Background 672
Regulatory Requirements 672
Getting Started 672
Assignment 673
References 675
Index 677