Full Description
Microbiome Statistics Set addresses the statistical analysis of correlation, association, interaction, and composition in microbiome research. It also discusses the challenges of machine learning in microbiome statistics, emphasizing the importance of performance evaluation with appropriate metrics and independent data.
The books define the study of the microbiome as a hypothesis-driven experimental science and investigate the challenges of analyzing microbiome data with standard statistical methods. They also provide step-by-step procedures for applying machine learning to microbiome data, including feature engineering, algorithm selection and optimization, performance evaluation, and model testing. They comment on the benefits and limitations of using machine learning for microbiome statistics and remark on the advantages and disadvantages of each machine learning algorithm.
This set consists of 15 chapters on applied microbiome statistics and 19 chapters on machine learning for microbiome statistics and is an excellent reference for researchers, students, academics and data analysts in the field.
Key Features:
· Discusses the issues of statistical analysis of microbiome data: high dimensionality, compositionality, sparsity, overdispersion, zero-inflation, and heterogeneity.
· Describes important concepts of machine learning, including the bias-variance tradeoff, accuracy and precision, overfitting and underfitting, model complexity and interpretability, and feature engineering.
· Investigates statistical methods for multiple comparisons and multiple hypothesis testing and their applications to microbiome data.
· Introduces a series of exploratory tools to visualize the composition and correlation of microbial taxa using barplots, heatmaps, and correlation plots.
· Introduces the confusion matrix and its derived measures. Comprehensively describes the properties of F1, Matthews' correlation coefficient (MCC), area under the receiver operating characteristic curve (AUC-ROC), and area under the precision-recall curve (AUC-PR), and discusses their advantages and disadvantages when applied to microbiome data.
· Employs the Kruskal-Wallis rank-sum test to perform model selection for further multi-omics data integration.
· Provides all related R code, along with datasets from the authors' first-hand microbiome research and from publicly available sources.
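As a brief illustration of the confusion-matrix measures named in the key features, the sketch below computes precision, recall, F1, and MCC from the four cell counts. It is not taken from the books, which supply R code; it is written in Python, and the function name `confusion_metrics`, the example counts, and the convention of returning 0 when a denominator is zero are all assumptions made for this example.

```python
import math


def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Precision, recall, F1 and MCC from confusion-matrix counts.

    Assumption: any measure whose denominator is zero is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # MCC uses all four cells, including the true negatives.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}


# Hypothetical imbalanced data: 95 positives, 5 negatives, and a
# classifier that labels every sample as positive.
all_positive = confusion_metrics(tp=95, fp=5, fn=0, tn=0)

# Hypothetical balanced data with a moderately good classifier.
balanced = confusion_metrics(tp=40, fp=10, fn=10, tn=40)
```

In the first case F1 is about 0.97 while MCC is 0, because F1 ignores true negatives and MCC does not. This is one reason the choice of metric matters for imbalanced data.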
Contents
Applied Microbiome Statistics: Correlation, Association, Interaction and Composition
Preface
Acknowledgements
About the Authors
1. Introduction to Microbiome Statistics
2. Classical Parametric Correlation
3. Classical Nonparametric Correlation
4. Composition Barplots
5. Composition Heatmaps
6. Correlation Heatmaps and plots
7. Model Selection for Correlation and Association Analysis
8. Alpha Diversity-Based Association Analysis
9. Beta Diversity-Based Association Analysis
10. Multiple Comparisons and Multiple Hypothesis Testing
11. Multiple Comparisons and Multiple Hypothesis Testing in Microbiome Research
12. Linear Discriminant Analysis Effect Size (LEfSe)
13. Sparse and Compositional Methods for Inferencing Microbial Interactions
14. Network Construction and Comparison for Microbiome Data
15. Microbial Networks in Semi-Parametric Rank-Based Correlation and Partial Correlation Estimation
References
Machine Learning for Microbiome Statistics
Preface
Acknowledgements
About the Authors
Chapter 1 Introduction to Machine Learning
Chapter 2 Overview of Machine Learning in Microbiome Research
Chapter 3 Accessing Model Accuracy and Goodness of Fit Tests for Normality
Chapter 4 Overfitting and Underfitting
Chapter 5 Assessing Model Accuracy Using Cross-Validation
Chapter 6 Feature Engineering and Model Selection
Chapter 7 Logistic Regression
Chapter 8 Support Vector Machines
Chapter 9 Classification Trees
Chapter 10 Random Forest
Chapter 11 The Evolution of Tree-Based Algorithms
Chapter 12 Extreme Gradient Boosting (XGBoost)
Chapter 13 Artificial Neural Networks and Deep Learning
Chapter 14 Machine Learning Microbiome with SIAMCAT
Chapter 15 Basic Performance Metrics for Machine Learning Models
Chapter 16 Matthews Correlation Coefficient
Chapter 17 Area Under the Receiver Operating Characteristic Curve (AUC-ROC)
Chapter 18 Area Under the Precision-Recall Curve (AUC-PR)
Chapter 19 Comparisons of Machine Learning Classification Models with Tidymodels