Advanced Spark for Professionals: Analytics for Data Driven Enterprises and R&D

Dekant, Henning

Apress (released 2017/05)

Binding: Paperback
Language: ENG
Product code: 9781484214626
DDC classification: 025

Full Description

This book is for advanced Spark users who want to deploy Spark in Docker containers and learn not only how to work with the basic Spark features, but also how to extend Spark to fit custom requirements, such as leveraging GPU processing nodes. The material is supported by a series of real-world implementation examples that use real data where possible. Spark is an exciting new analytical platform from the Apache Software Foundation. It combines fault-tolerant distributed computing with very low latency by keeping data in memory and keeping processing as local to the compute nodes as possible. In this way, Spark can achieve orders-of-magnitude performance improvements over conventional Hadoop cluster processing.

PART 1: Architecture
3. Installation Options for Spark
  3.1. Stand-alone for development
  3.2. Integrating with Docker for production deployment
4. Exposed Spark APIs
  4.1. Overview of the Programming Interfaces
  4.2. Working with graphs
  4.3. Machine learning
5. Extending Spark
  5.1. Custom APIs
  5.2. Including GPU resources

PART 2: Hands-On Case Studies
6. Classification of medical data
7. Financial model optimization
8. Classifying drugs based on their chemical structure
9. Fraud detection based on federal Medicaid data
10. Implementing Bayesian networks
11. Stress testing the cluster: Quantum computing emulation with Spark's graph API