Full Description
This book is for advanced Spark users who want to deploy Spark in Docker containers and learn not only how to work with the basic Spark features, but also how to extend Spark to fit custom requirements, such as leveraging GPU processing nodes. The material is supported by a series of real-world implementation examples that use real data where possible. Spark is an exciting new analytical platform from the Apache Software Foundation. It combines fault-tolerant distributed computing with very low latency by keeping data in memory and keeping processing as local to the compute nodes as possible. In this way, Spark can achieve orders-of-magnitude performance improvements over conventional Hadoop cluster processing.
Contents
PART 1: Architecture
3. Installation Options for Spark
  3.1. Stand-alone for development
  3.2. Integrating with Docker for production deployment
4. Exposed Spark APIs
  4.1. Overview of the Programming Interfaces
  4.2. Working with graphs
  4.3. Machine learning
5. Extending Spark
  5.1. Custom APIs
  5.2. Including GPU resources
PART 2: Hands-On Case Studies
6. Classification of medical data
7. Financial model optimization
8. Classifying drugs based on their chemical structure
9. Fraud detection based on federal Medicaid data
10. Implementing Bayesian networks
11. Stress testing the cluster: Quantum computing emulation with Spark's graph API



