PySpark SQL Recipes〈First Edition〉 : With HiveQL, Dataframe and Graphframes

Print edition price
¥9,853
  • E-book

  • Authors: Mishra, Raju Kumar/Raman, Sundar Rajan
  • Price: ¥10,022 (¥9,111 before tax)
  • Apress (released 2019/03/18)
  • Language: ENG
  • ISBN:9781484243343
  • eISBN:9781484243350

Description

Carry out data analysis with PySpark SQL, graphframes, and graph data processing using a problem-solution approach. This book provides solutions to problems related to dataframes, data manipulation, summarization, and exploratory analysis. You will improve your skills in graph data analysis using graphframes and see how to optimize your PySpark SQL code.

PySpark SQL Recipes starts with recipes on creating dataframes from different types of data sources, data aggregation and summarization, and exploratory data analysis using PySpark SQL. You’ll also discover how to solve problems in graph analysis using graphframes.

On completing this book, you’ll have ready-made code for all your PySpark SQL tasks, including creating dataframes using data from different file formats as well as from SQL or NoSQL databases.

What You Will Learn

  • Understand PySpark SQL and its advanced features
  • Use SQL and HiveQL with PySpark SQL
  • Work with structured streaming
  • Optimize PySpark SQL 
  • Master graphframes and graph processing

Who This Book Is For
Data scientists, Python programmers, and SQL programmers.




Table of Contents

  • Chapter 1: Introduction to PySparkSQL
  • Chapter 2: Some time with Installation
  • Chapter 3: IO in PySparkSQL
  • Chapter 4: Operations on PySparkSQL DataFrames
  • Chapter 5: Data Merging and Data Aggregation using PySparkSQL
  • Chapter 6: SQL, NoSQL and PySparkSQL
  • Chapter 7: Structured Streaming
  • Chapter 8: Optimizing PySparkSQL
  • Chapter 9: GraphFrames
