Full Description
Statistical Computation for Programmers, Scientists, Quants, Excel Users, and Other Professionals

Using the open source R language, you can build powerful statistical models to answer many of your most challenging questions. R has traditionally been difficult for non-statisticians to learn, and most R books assume far too much knowledge to be of help. R for Everyone is the solution. Drawing on his unsurpassed experience teaching new users, professional data scientist Jared P. Lander has written the perfect tutorial for anyone new to statistical programming and modeling. Organized to make learning easy and intuitive, this guide focuses on the 20 percent of R functionality you'll need to accomplish 80 percent of modern data tasks.

Lander's self-contained chapters start with the absolute basics, offering extensive hands-on practice and sample code. You'll download and install R; navigate and use the R environment; master basic program control, data import, and manipulation; and walk through several essential tests. Then, building on this foundation, you'll construct several complete models, both linear and nonlinear, and use some data mining techniques.

By the time you're done, you won't just know how to write R programs, you'll be ready to tackle the statistical problems you care about most.

COVERAGE INCLUDES

* Exploring R, RStudio, and R packages
* Using R for math: variable types, vectors, calling functions, and more
* Exploiting data structures, including data.frames, matrices, and lists
* Creating attractive, intuitive statistical graphics
* Writing user-defined functions
* Controlling program flow with if, ifelse, and complex checks
* Improving program efficiency with group manipulations
* Combining and reshaping multiple datasets
* Manipulating strings using R's facilities and regular expressions
* Creating normal, binomial, and Poisson probability distributions
* Programming basic statistics: mean, standard deviation, and t-tests
* Building linear, generalized linear, and nonlinear models
* Assessing the quality of models and variable selection
* Preventing overfitting, using the Elastic Net and Bayesian methods
* Analyzing univariate and multivariate time series data
* Grouping data via K-means and hierarchical clustering
* Preparing reports, slideshows, and web pages with knitr
* Building reusable R packages with devtools and Rcpp
* Getting involved with the R global community
Contents
Foreword
Preface
Acknowledgments
About the Author

Chapter 1: Getting R
1.1 Downloading R
1.2 R Version
1.3 32-bit vs. 64-bit
1.4 Installing
1.5 Revolution R Community Edition
1.6 Conclusion

Chapter 2: The R Environment
2.1 Command Line Interface
2.2 RStudio
2.3 Revolution Analytics RPE
2.4 Conclusion

Chapter 3: R Packages
3.1 Installing Packages
3.2 Loading Packages
3.3 Building a Package
3.4 Conclusion

Chapter 4: Basics of R
4.1 Basic Math
4.2 Variables
4.3 Data Types
4.4 Vectors
4.5 Calling Functions
4.6 Function Documentation
4.7 Missing Data
4.8 Conclusion

Chapter 5: Advanced Data Structures
5.1 data.frames
5.2 Lists
5.3 Matrices
5.4 Arrays
5.5 Conclusion

Chapter 6: Reading Data into R
6.1 Reading CSVs
6.2 Excel Data
6.3 Reading from Databases
6.4 Data from Other Statistical Tools
6.5 R Binary Files
6.6 Data Included with R
6.7 Extract Data from Web Sites
6.8 Conclusion

Chapter 7: Statistical Graphics
7.1 Base Graphics
7.2 ggplot2
7.3 Conclusion

Chapter 8: Writing R Functions
8.1 Hello, World!
8.2 Function Arguments
8.3 Return Values
8.4 do.call
8.5 Conclusion

Chapter 9: Control Statements
9.1 if and else
9.2 switch
9.3 ifelse
9.4 Compound Tests
9.5 Conclusion

Chapter 10: Loops, the Un-R Way to Iterate
10.1 for Loops
10.2 while Loops
10.3 Controlling Loops
10.4 Conclusion

Chapter 11: Group Manipulation
11.1 Apply Family
11.2 aggregate
11.3 plyr
11.4 data.table
11.5 Conclusion

Chapter 12: Data Reshaping
12.1 cbind and rbind
12.2 Joins
12.3 reshape2
12.4 Conclusion

Chapter 13: Manipulating Strings
13.1 paste
13.2 sprintf
13.3 Extracting Text
13.4 Regular Expressions
13.5 Conclusion

Chapter 14: Probability Distributions
14.1 Normal Distribution
14.2 Binomial Distribution
14.3 Poisson Distribution
14.4 Other Distributions
14.5 Conclusion

Chapter 15: Basic Statistics
15.1 Summary Statistics
15.2 Correlation and Covariance
15.3 T-Tests
15.4 ANOVA
15.5 Conclusion

Chapter 16: Linear Models
16.1 Simple Linear Regression
16.2 Multiple Regression
16.3 Conclusion

Chapter 17: Generalized Linear Models
17.1 Logistic Regression
17.2 Poisson Regression
17.3 Other Generalized Linear Models
17.4 Survival Analysis
17.5 Conclusion

Chapter 18: Model Diagnostics
18.1 Residuals
18.2 Comparing Models
18.3 Cross-Validation
18.4 Bootstrap
18.5 Stepwise Variable Selection
18.6 Conclusion

Chapter 19: Regularization and Shrinkage
19.1 Elastic Net
19.2 Bayesian Shrinkage
19.3 Conclusion

Chapter 20: Nonlinear Models
20.1 Nonlinear Least Squares
20.2 Splines
20.3 Generalized Additive Models
20.4 Decision Trees
20.5 Random Forests
20.6 Conclusion

Chapter 21: Time Series and Autocorrelation
21.1 Autoregressive Moving Average
21.2 VAR
21.3 GARCH
21.4 Conclusion

Chapter 22: Clustering
22.1 K-means
22.2 PAM
22.3 Hierarchical Clustering
22.4 Conclusion

Chapter 23: Reproducibility, Reports and Slide Shows with knitr
23.1 Installing a LaTeX Program
23.2 LaTeX Primer
23.3 Using knitr with LaTeX
23.4 Markdown Tips
23.5 Using knitr and Markdown
23.6 pandoc
23.7 Conclusion

Chapter 24: Building R Packages
24.1 Folder Structure
24.2 Package Files
24.3 Package Documentation
24.4 Checking, Building and Installing
24.5 Submitting to CRAN
24.6 C++ Code
24.7 Conclusion

Appendix A: Real-Life Resources
A.1 Meetups
A.2 Stackoverflow
A.3 Twitter
A.4 Conferences
A.5 Web Sites
A.6 Documents
A.7 Books
A.8 Conclusion

Appendix B: Glossary

List of Figures
List of Tables
General Index
Index of Functions
Index of Packages
Index of People
Data Index