Sams Teach Yourself Big Data Analytics with Microsoft HDInsight in 24 Hours (Sams Teach Yourself in 24 Hours)

  • Sams (published 2015/11)
  • Binding: Paperback, 572 p.
  • Language: ENG
  • Product code: 9780672337277
  • DDC classification: 005.74

Full Description


In just 24 lessons of one hour or less, Sams Teach Yourself Big Data Analytics with Microsoft HDInsight in 24 Hours helps you leverage Hadoop's power on a flexible, scalable cloud platform using Microsoft's newest business intelligence, visualization, and productivity tools.

This book's straightforward, step-by-step approach shows you how to provision, configure, monitor, and troubleshoot HDInsight and use Hadoop cloud services to solve real analytics problems. You'll gain more of Hadoop's benefits, with less complexity, even if you're completely new to Big Data analytics. Every lesson builds on what you've already learned, giving you a rock-solid foundation for real-world success.

  • Practical, hands-on examples show you how to apply what you learn
  • Quizzes and exercises help you test your knowledge and stretch your skills
  • Notes and tips point out shortcuts and solutions

Learn how to...

  • Master core Big Data and NoSQL concepts, value propositions, and use cases
  • Work with key Hadoop features, such as HDFS2 and YARN
  • Quickly install, configure, and monitor Hadoop (HDInsight) clusters in the cloud
  • Automate provisioning, customize clusters, install additional Hadoop projects, and administer clusters
  • Integrate, analyze, and report with Microsoft BI and Power BI
  • Automate workflows for data transformation, integration, and other tasks
  • Use Apache HBase on HDInsight
  • Use Sqoop or SSIS to move data to or from HDInsight
  • Perform R-based statistical computing on HDInsight datasets
  • Accelerate analytics with Apache Spark
  • Run real-time analytics on high-velocity data streams
  • Write MapReduce, Hive, and Pig programs

Register your book at informit.com/register for convenient access to downloads, updates, and corrections as they become available.

Contents

Introduction

Part I: Understanding Big Data, Hadoop 1.0, and 2.0

Hour 1: Introduction of Big Data, NoSQL, and Business Value Proposition
  • Types of Analysis
  • Types of Data
  • Big Data
  • Managing Big Data
  • NoSQL Systems
  • Big Data, NoSQL Systems, and the Business Value Proposition
  • Application of Big Data and Big Data Solutions
  • Summary
  • Q&A

Hour 2: Introduction to Hadoop, Its Architecture, Ecosystem, and Microsoft Offerings
  • What Is Apache Hadoop?
  • Architecture of Hadoop and Hadoop Ecosystems
  • What's New in Hadoop 2.0
  • Architecture of Hadoop 2.0
  • Tools and Technologies Needed with Big Data Analytics
  • Major Players and Vendors for Hadoop
  • Deployment Options for Microsoft Big Data Solutions
  • Summary
  • Q&A

Hour 3: Hadoop Distributed File System Versions 1.0 and 2.0
  • Introduction to HDFS
  • HDFS Architecture
  • Rack Awareness
  • WebHDFS
  • Accessing and Managing HDFS Data
  • What's New in HDFS 2.0
  • Summary
  • Q&A

Hour 4: The MapReduce Job Framework and Job Execution Pipeline
  • Introduction to MapReduce
  • MapReduce Architecture
  • MapReduce Job Execution Flow
  • Summary
  • Q&A

Hour 5: MapReduce-Advanced Concepts and YARN
  • DistributedCache
  • Hadoop Streaming
  • MapReduce Joins
  • Bloom Filter
  • Performance Improvement
  • Handling Failures
  • Counter
  • YARN
  • Uber-Tasking Optimization
  • Failures in YARN
  • Resource Manager High Availability and Automatic Failover in YARN
  • Summary
  • Q&A

Part II: Getting Started with HDInsight and Understanding Its Different Components

Hour 6: Getting Started with HDInsight, Provisioning Your HDInsight Service Cluster, and Automating HDInsight Cluster Provisioning
  • Introduction to Microsoft Azure
  • Understanding HDInsight Service
  • Provisioning HDInsight on the Azure Management Portal
  • Automating HDInsight Provisioning with PowerShell
  • Managing and Monitoring HDInsight Cluster and Job Execution
  • Summary
  • Q&A
  • Exercise

Hour 7: Exploring Typical Components of HDFS Cluster
  • HDFS Cluster Components
  • HDInsight Cluster Architecture
  • High Availability in HDInsight
  • Summary
  • Q&A

Hour 8: Storing Data in Microsoft Azure Storage Blob
  • Understanding Storage in Microsoft Azure
  • Benefits of Azure Storage Blob over HDFS
  • Azure Storage Explorer Tools
  • Summary
  • Q&A

Hour 9: Working with Microsoft Azure HDInsight Emulator
  • Getting Started with HDInsight Emulator
  • Setting Up Microsoft Azure Emulator for Storage
  • Summary
  • Q&A

Part III: Programming MapReduce and HDInsight Script Action

Hour 10: Programming MapReduce Jobs
  • MapReduce Hello World!
  • Analyzing Flight Delays with MapReduce
  • Serialization Frameworks for Hadoop
  • Hadoop Streaming
  • Summary
  • Q&A

Hour 11: Customizing the HDInsight Cluster with Script Action
  • Identifying the Need for Cluster Customization
  • Developing Script Action
  • Consuming Script Action
  • Running a Giraph job on a Customized HDInsight Cluster
  • Testing Script Action with HDInsight Emulator
  • Summary
  • Q&A

Part IV: Querying and Processing Big Data in HDInsight

Hour 12: Getting Started with Apache Hive and Apache Tez in HDInsight
  • Introduction to Apache Hive
  • Getting Started with Apache Hive in HDInsight
  • Azure HDInsight Tools for Visual Studio
  • Programmatically Using the HDInsight .NET SDK
  • Introduction to Apache Tez
  • Summary
  • Q&A
  • Exercise

Hour 13: Programming with Apache Hive, Apache Tez in HDInsight, and Apache HCatalog
  • Programming with Hive in HDInsight
  • Using Tables in Hive
  • Serialization and Deserialization
  • Data Load Processes for Hive Tables
  • Querying Data from Hive Tables
  • Indexing in Hive
  • Apache Tez in Action
  • Apache HCatalog
  • Summary
  • Q&A
  • Exercise

Hour 14: Consuming HDInsight Data from Microsoft BI Tools over Hive ODBC Driver: Part 1
  • Introduction to Hive ODBC Driver
  • Introduction to Microsoft Power BI
  • Accessing Hive Data from Microsoft Excel
  • Summary
  • Q&A

Hour 15: Consuming HDInsight Data from Microsoft BI Tools over Hive ODBC Driver: Part 2
  • Accessing Hive Data from PowerPivot
  • Accessing Hive Data from SQL Server
  • Accessing HDInsight Data from Power Query
  • Summary
  • Q&A
  • Exercise

Hour 16: Integrating HDInsight with SQL Server Integration Services
  • The Need for Data Movement
  • Introduction to SSIS
  • Analyzing On-time Flight Departure with SSIS
  • Provisioning HDInsight Cluster
  • Summary
  • Q&A

Hour 17: Using Pig for Data Processing
  • Introduction to Pig Latin
  • Using Pig to Count Cancelled Flights
  • Using HCatalog in a Pig Latin Script
  • Submitting Pig Jobs with PowerShell
  • Summary
  • Q&A

Hour 18: Using Sqoop for Data Movement Between RDBMS and HDInsight
  • What Is Sqoop?
  • Using Sqoop Import and Export Commands
  • Using Sqoop with PowerShell
  • Summary
  • Q&A

Part V: Managing Workflow and Performing Statistical Computing

Hour 19: Using Oozie Workflows and Job Orchestration with HDInsight
  • Introduction to Oozie
  • Determining On-time Flight Departure Percentage with Oozie
  • Submitting an Oozie Workflow with HDInsight .NET SDK
  • Coordinating Workflows with Oozie
  • Oozie Compared to SSIS
  • Summary
  • Q&A

Hour 20: Performing Statistical Computing with R
  • Introduction to R
  • Integrating R with Hadoop
  • Enabling R on HDInsight
  • Summary
  • Q&A

Part VI: Performing Interactive Analytics and Machine Learning

Hour 21: Performing Big Data Analytics with Spark
  • Introduction to Spark
  • Spark Programming Model
  • Blending SQL Querying with Functional Programs
  • Summary
  • Q&A

Hour 22: Microsoft Azure Machine Learning
  • History of Traditional Machine Learning
  • Introduction to Azure ML
  • Azure ML Workspace
  • Processes to Build Azure ML Solutions
  • Getting Started with Azure ML
  • Creating Predictive Models with Azure ML
  • Publishing Azure ML Models as Web Services
  • Summary
  • Q&A
  • Exercise

Part VII: Performing Real-time Analytics

Hour 23: Performing Stream Analytics with Storm
  • Introduction to Storm
  • Using SCP.NET to Develop Storm Solutions
  • Analyzing Speed Limit Violation Incidents with Storm
  • Summary
  • Q&A

Hour 24: Introduction to Apache HBase on HDInsight
  • Introduction to Apache HBase
  • HBase Architecture
  • Creating HDInsight Cluster with HBase
  • Summary
  • Q&A

9780672337277 TOC 10/26/2015