Foundations of Statistics for Data Scientists : Theoretical Foundations with Implementation Using R and Python (Chapman & Hall/CRC Texts in Statistical Science)

  • Binding: Paperback / 488 pages
  • Language: English
  • Product code: 9780367748432
  • DDC classification: 519.50285536

Full Description


Designed as a textbook for a one- or two-term introduction to mathematical statistics for students training to become data scientists, Foundations of Statistics for Data Scientists: With R and Python is an in-depth presentation of the topics in statistical science with which any data scientist should be familiar, including probability distributions, descriptive and inferential statistical methods, and linear modelling. The book assumes knowledge of basic calculus, so the presentation can focus on 'why it works' as well as 'how to do it.' Compared to traditional "mathematical statistics" textbooks, however, the book has less emphasis on probability theory and more emphasis on using software to implement statistical methods and to conduct simulations to illustrate key concepts. All statistical analyses in the book use R software, with an appendix showing the same analyses with Python.

The book also introduces modern topics that do not normally appear in mathematical statistics texts but are highly relevant for data scientists, such as Bayesian inference, generalized linear models for non-normal responses (e.g., logistic regression and Poisson loglinear models), and regularized model fitting. The nearly 500 exercises are grouped into "Data Analysis and Applications" and "Methods and Concepts." Appendices introduce R and Python and contain solutions for odd-numbered exercises. The book's website has expanded R, Python, and Matlab appendices and all data sets from the examples and exercises.

Alan Agresti, Distinguished Professor Emeritus at the University of Florida, is the author of seven books, including Categorical Data Analysis (Wiley) and Statistics: The Art and Science of Learning from Data (Pearson), and has presented short courses in 35 countries. His awards include an honorary doctorate from De Montfort University (UK) and the Statistician of the Year award from the Chicago chapter of the American Statistical Association.

Maria Kateri, Professor of Statistics and Data Science at RWTH Aachen University, authored the monograph Contingency Table Analysis: Methods and Implementation Using R (Birkhauser/Springer) and a textbook on mathematics for economists (in German). She has long-term experience in teaching statistics courses to students of Data Science, Mathematics, Statistics, Computer Science, and Business Administration and Engineering.

"The main goal of this textbook is to present foundational statistical methods and theory that are relevant in the field of data science. The authors depart from the typical approaches taken by many conventional mathematical statistics textbooks by placing more emphasis on providing the students with intuitive and practical interpretations of those methods with the aid of R programming codes... I find its particular strength to be its intuitive presentation of statistical theory and methods without getting bogged down in mathematical details that are perhaps less useful to the practitioners." (Mintaek Lee, Boise State University)

"The aspects of this manuscript that I find appealing: 1. The use of real data. 2. The use of R but with the option to use Python. 3. A good mix of theory and practice. 4. The text is well-written with good exercises. 5. The coverage of topics (e.g. Bayesian methods and clustering) that are not usually part of a course in statistics at the level of this book." (Jason M. Graham, University of Scranton)
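To give a flavor of the simulation-based style of illustration the description mentions, here is a minimal R sketch (our own illustration, not code from the book) of one of its key concepts, the Central Limit Theorem: sample means from a skewed exponential population are approximately normally distributed for large samples. The sample size, number of simulations, and choice of population are arbitrary.

    # Simulate the sampling distribution of the sample mean for a skewed
    # (exponential, mean 1) population; the histogram is approximately normal.
    set.seed(1)                      # for reproducibility
    n <- 100                         # observations per simulated sample
    nsim <- 10000                    # number of simulated samples
    xbar <- replicate(nsim, mean(rexp(n, rate = 1)))
    hist(xbar, breaks = 50, main = "Simulated sampling distribution of the mean")
    # The simulated standard error should be close to sigma/sqrt(n) = 0.1
    c(mean = mean(xbar), sd = sd(xbar), theory = 1 / sqrt(n))

According to the description, the book's appendix shows how such analyses carry over to Python.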

Contents

Table of Contents: Foundations of Statistical Science for Data Scientists, Alan Agresti and Maria Kateri

1. Introduction to Statistical Science
  1.1 Statistical science: Description and inference (Design, descriptive statistics, and inferential statistics; Populations and samples; Parameters: Numerical summaries of the population; Defining populations: actual and conceptual)
  1.2 Types of data and variables (Data files; Example: The General Social Survey (GSS); Variables; Quantitative variables and categorical variables; Discrete variables and continuous variables; Associations: response variables and explanatory variables)
  1.3 Data collection and randomization (Randomization; Collecting data with a sample survey; Collecting data with an experiment; Collecting data with an observational study; Establishing cause and effect: observational versus experimental studies)
  1.4 Descriptive statistics: Summarizing data (Example: Carbon dioxide emissions in European nations; Frequency distribution and histogram graphic; Describing the center of the data: mean and median; Describing data variability: standard deviation and variance; Describing position: percentiles, quantiles, and box plots)
  1.5 Descriptive statistics: Summarizing multivariate data (Bivariate quantitative data: The scatterplot, correlation, and regression; Bivariate categorical data: Contingency tables; Descriptive statistics for samples and for populations)
  1.6 Chapter summary; Exercises

2. Probability Distributions
  2.1 Introduction to probability (Probabilities and long-run relative frequencies; Sample spaces and events; Probability axioms and implied probability rules; Example: Diagnostics for disease screening; Bayes' theorem; Multiplicative law of probability, and independent events)
  2.2 Random variables and probability distributions (Probability distributions for discrete random variables; Example: Geometric probability distribution; Probability distributions for continuous random variables; Example: Uniform distribution; Probability functions (pdf, pmf) and cumulative distribution function (cdf); Example: Exponential random variable; Families of probability distributions indexed by parameters)
  2.3 Expectations of random variables (Expected value and variability of a discrete random variable; Expected values for continuous random variables; Example: Mean and variability for uniform random variable; Higher moments: Skewness; Expectations of linear functions of random variables; Standardizing a random variable)
  2.4 Discrete probability distributions (Binomial distribution; Example: Hispanic composition of jury list; Mean, variability, and skewness of binomial distribution; Example: Predicting results of a sample survey; The sample proportion as a scaled binomial random variable; Poisson distribution; Poisson variability and overdispersion)
  2.5 Continuous probability distributions (The normal distribution; The standard normal distribution; Examples: Finding normal probabilities and percentiles; The gamma distribution; The exponential distribution and Poisson processes; Quantiles of a probability distribution; Using the uniform to randomly generate a continuous random variable)
  2.6 Joint and conditional distributions and independence (Joint and marginal probability distributions; Example: Joint and marginal distributions of happiness and family income; Conditional probability distributions; Trials with multiple categories: the multinomial distribution; Expectations of sums of random variables; Independence of random variables; Markov chain dependence and conditional independence)
  2.7 Correlation between random variables (Covariance and correlation; Example: Correlation between income and happiness; Independence implies zero correlation, but not converse; Bivariate normal distribution *)
  2.8 Chapter summary; Exercises

3. Sampling Distributions
  3.1 Sampling distributions: Probability distributions for statistics (Example: Predicting an election result from an exit poll; Sampling distribution: Variability of a statistic's value among samples; Constructing a sampling distribution; Example: Simulating to estimate mean restaurant sales)
  3.2 Sampling distributions of sample means (Mean and variance of sample mean of random variables; Standard error of a statistic; Example: Standard error of sample mean sales; Example: Standard error of sample proportion in exit poll; Law of large numbers: Sample mean converges to population mean; Normal, binomial, and Poisson sums of random variables have the same distribution)
  3.3 Central limit theorem: Normal sampling distribution for large samples (Sampling distribution of sample mean is approximately normal; Simulations illustrate normal sampling distribution in CLT; Summary: Population, sample data, and sampling distributions)
  3.4 Large-sample normal sampling distributions for many statistics * (Delta method; Delta method applied to root Poisson stabilizes the variance; Simulating sampling distributions of other statistics; The key role of sampling distributions in statistical inference)
  3.5 Chapter summary; Exercises

4. Statistical Inference: Estimation
  4.1 Point estimates and confidence intervals (Properties of estimators: Unbiasedness, consistency, efficiency; Evaluating properties of estimators; Interval estimation: Confidence intervals for parameters)
  4.2 The likelihood function and maximum likelihood estimation (The likelihood function; Maximum likelihood method of estimation; Properties of maximum likelihood estimators; Example: Variance of ML estimator of binomial parameter; Example: Variance of ML estimator of Poisson mean; Sufficiency and invariance for ML estimators)
  4.3 Constructing confidence intervals (Using a pivotal quantity to induce a confidence interval; A large-sample confidence interval for the mean; Confidence intervals for proportions; Example: Atheists and agnostics in Europe; Using simulation to illustrate long-run performance of CIs; Determining the sample size before collecting the data; Example: Sample size for evaluating an advertising strategy)
  4.4 Confidence intervals for means of normal populations (The t distribution; Confidence interval for a mean using the t distribution; Example: Estimating mean weight change for anorexic girls; Robustness for violations of normal population assumption; Construction of t distribution using chi-squared and standard normal; Why does the pivotal quantity have the t distribution?; Cauchy distribution: t distribution with df=1 has unusual behavior)
  4.5 Comparing two population means or proportions (A model for comparing means: Normality with common variability; A standard error and confidence interval for comparing means; Example: Comparing a therapy to a control group; Confidence interval comparing two proportions; Example: Does prayer help coronary surgery patients?)
  4.6 The bootstrap (Computational resampling and bootstrap confidence intervals; Example: Confidence intervals for library data)
  4.7 The Bayesian approach to statistical inference (Bayesian prior and posterior distributions; Bayesian binomial inference: Beta prior distributions; Example: Belief in hell; Interpretation: Bayesian versus classical intervals; Bayesian posterior interval comparing proportions; Highest posterior density (HPD) posterior intervals)
  4.8 Bayesian inference for means (Bayesian inference for a normal mean; Example: Bayesian analysis for anorexia therapy; Bayesian inference for normal means with improper priors; Predicting a future observation: Bayesian predictive distribution; The Bayesian perspective, and empirical Bayes and hierarchical Bayes extensions)
  4.9 Why maximum likelihood and Bayes estimators perform well * (ML estimators have large-sample normal distributions; Asymptotic efficiency of ML estimators same as best unbiased estimators; Bayesian estimators also have good large-sample performance; The likelihood principle)
  4.10 Chapter summary; Exercises

5. Statistical Inference: Significance Testing
  5.1 The elements of a significance test (Example: Testing for bias in selecting managers; Assumptions, hypotheses, test statistic, P-value, and conclusion)
  5.2 Significance tests for proportions and means (The elements of a significance test for a proportion; Example: Climate change a major threat?; One-sided significance tests; The elements of a significance test for a mean; Example: Significance test about political ideology)
  5.3 Significance tests comparing means (Significance tests for the difference between two means; Example: Comparing a therapy to a control group; Effect size for comparison of two means; Bayesian inference for comparing means; Example: Bayesian comparison of therapy and control groups)
  5.4 Significance tests comparing proportions (Significance test for the difference between two proportions; Example: Comparing prayer and non-prayer surgery patients; Bayesian inference for comparing two proportions; Chi-squared tests for multiple proportions in contingency tables; Example: Happiness and marital status; Standardized residuals: Describing the nature of an association)
  5.5 Significance test decisions and errors (The alpha-level: Making a decision based on the P-value; Never "accept H0" in a significance test; Type I and Type II errors; As P(Type I error) decreases, P(Type II error) increases; Example: Testing whether astrology has some truth; The power of a test; Making decisions versus reporting the P-value)
  5.6 Duality between significance tests and confidence intervals (Connection between two-sided tests and confidence intervals; Effect of sample size: Statistical versus practical significance; Significance tests are less useful than confidence intervals; Significance tests and P-values can be misleading)
  5.7 Likelihood-ratio tests and confidence intervals * (The likelihood-ratio and a chi-squared test statistic; Likelihood-ratio test and confidence interval for a proportion; Likelihood-ratio, Wald, score test triad)
  5.8 Nonparametric tests (A permutation test to compare two groups; Example: Petting versus praise of dogs; Wilcoxon test: Comparing mean ranks for two groups; Comparing survival time distributions with censored data)
  5.9 Chapter summary; Exercises

6. Linear Models and Least Squares
  6.1 The linear regression model and its least squares fit (The linear model describes a conditional expectation; Describing variation around the conditional expectation; Least squares model fitting; Example: Linear model for Scottish hill races; The correlation; Regression toward the mean in linear regression models; Linear models and reality)
  6.2 Multiple regression: Linear models with multiple explanatory variables (Interpreting effects in multiple regression models; Example: Multiple regression for Scottish hill races; Association and causation; Confounding, spuriousness, and conditional independence; Example: Modeling the crime rate in Florida; Equations for least squares estimates in multiple regression; Interaction between explanatory variables in their effects; Cook's distance: Checking for unusual observations)
  6.3 Summarizing variability in linear regression models (The error variance and chi-squared for linear models; Decomposing variability into model explained and unexplained parts; R-squared and the multiple correlation; Example: R-squared for modeling Scottish hill races)
  6.4 Statistical inference for normal linear models (The F distribution: Testing that all effects equal 0; Example: Linear model for mental impairment; t tests and confidence intervals for individual effects; Multicollinearity: Nearly redundant explanatory variables; Confidence interval for E(Y) and prediction interval for Y; The F test that all effects equal 0 is a likelihood-ratio test *)
  6.5 Categorical explanatory variables in linear models (Indicator variables for categories; Example: Comparing mean incomes of racial-ethnic groups; Analysis of variance (ANOVA): An F test comparing several means; Multiple comparisons of means: Bonferroni and Tukey methods; Models with both categorical and quantitative explanatory variables; Comparing two nested normal linear models; Interaction with categorical and quantitative explanatory variables)
  6.6 Bayesian inference for normal linear models (Prior and posterior distributions for normal linear models; Example: Bayesian linear model for mental impairment; Bayesian approach to the normal one-way layout)
  6.7 Matrix formulation of linear models (The model matrix; Least squares estimates and standard errors; The hat matrix and the leverage; Alternatives to least squares: Robust regression and regularization; Restricted optimality of least squares: Gauss-Markov theorem; Matrix formulation of Bayesian normal linear model)
  6.8 Chapter summary; Exercises

7. Generalized Linear Models
  7.1 Introduction to generalized linear models (The three components of a generalized linear model; GLMs for normal, binomial, and Poisson responses; Example: GLMs for house selling prices; The deviance; Likelihood-ratio model comparison uses deviance difference; Model selection: AIC and the bias/variance tradeoff; Advantages of GLMs versus transforming the data; Example: Normal and gamma GLMs for Covid-19 data)
  7.2 Logistic regression for binary data (Logistic regression: Model expressions; Interpreting beta_j: effects on probabilities and odds; Example: Dose-response study for flour beetles; Grouped and ungrouped binary data: Effects on estimates and deviance; Example: Modeling Italian employment with logit and identity links; Complete separation and infinite logistic parameter estimates)
  7.3 Bayesian inference for generalized linear models (Normal prior distributions for GLM parameters; Example: Bayesian logistic regression for endometrial cancer patients)
  7.4 Poisson loglinear models for count data (Poisson loglinear models; Example: Modeling horseshoe crab satellite counts; Modeling rates: Including an offset in the model; Example: Lung cancer survival)
  7.5 Negative binomial models for overdispersed count data * (Increased variance due to heterogeneity; Negative binomial: Gamma mixture of Poisson distributions; Example: Negative binomial modeling of horseshoe crab data)
  7.6 Iterative GLM model fitting * (The Newton-Raphson method; Newton-Raphson fitting of logistic regression model; Covariance matrix of parameter estimates and Fisher scoring; Likelihood equations and covariance matrix for Poisson GLMs)
  7.7 Regularization with large numbers of parameters (Penalized likelihood methods; Penalized likelihood methods: The lasso; Example: Predicting opinions with student survey data; Why shrink ML estimates toward 0?; Dimension reduction: Principal component analysis; Bayesian inference with a large number of parameters; Huge n: Handling big data)
  7.8 Chapter summary; Exercises

8. Classification and Clustering
  8.1 Classification: Linear Discriminant Analysis and Graphical Trees (Classification with Fisher's linear discriminant function; Example: Predicting whether horseshoe crabs have satellites; Summarizing predictive power: Classification tables and ROC curves; Classification trees: Graphical prediction; Logistic regression versus linear discriminant analysis and classification trees; Other methods for classification: k-nearest neighbors and neural networks prediction)
  8.2 Cluster Analysis (Measuring dissimilarity between observations on binary responses; Hierarchical clustering algorithm and its dendrogram; Example: Clustering states on presidential election outcomes)
  8.3 Chapter summary; Exercises

9. Statistical Science: A Historical Overview
  9.1 The evolution of statistical science (Evolution of probability; Evolution of descriptive and inferential statistics)
  9.2 Pillars of statistical wisdom and practice (Stigler's seven pillars of statistical wisdom; Seven pillars of wisdom for practicing data science)

Appendix A: Using R in Statistical Science
Appendix B: Using Python in Statistical Science
Appendix C: Brief Solutions to Odd-Numbered Exercises
Bibliography
Example Index
Subject Index
