Data Engineering for Machine Learning Pipelines : From Python Libraries to ML Pipelines and Cloud Platforms (First Edition. 2024. xxv, 636 p. 225 illus. 254 mm)

E-book price: ¥11,409 (e-book edition available)

  • Binding: Paperback / Pages: 300 p.
  • Language: ENG
  • Product code: 9798868806018

Full Description

This book covers modern data engineering functions and important Python libraries to help you develop state-of-the-art ML pipelines and integration code.

The book begins by explaining data analytics and transformation, delving into the Pandas library, its capabilities, and nuances. It then explores emerging libraries such as Polars and CuDF, providing insights into GPU-based computing and cutting-edge data manipulation techniques. The text discusses the importance of data validation in engineering processes, introducing tools such as Great Expectations and Pandera to ensure data quality and reliability. The book delves into API design and development, with a specific focus on leveraging the power of FastAPI. It covers authentication, authorization, and real-world applications, enabling you to construct efficient and secure APIs using FastAPI. Also explored is concurrency in data engineering, examining Dask's capabilities from basic setup to crafting advanced machine learning pipelines. The book includes development and delivery of data engineering pipelines using leading cloud platforms such as AWS, Google Cloud, and Microsoft Azure. The concluding chapters concentrate on real-time and streaming data engineering pipelines, emphasizing Apache Kafka and workflow orchestration in data engineering. Workflow tools such as Airflow and Prefect are introduced to seamlessly manage and automate complex data workflows.

What sets this book apart is its blend of theoretical knowledge and practical application, a structured path from basic to advanced concepts, and insights into using state-of-the-art tools. With this book, you gain access to cutting-edge techniques and insights that are reshaping the industry. This book is not just an educational tool; it is a career catalyst and an investment in your future as a data engineering expert, poised to meet the challenges of today's data-driven world.

What You Will Learn

  • Elevate your data wrangling jobs by utilizing the power of both CPU and GPU computing, and learn to process data using Pandas 2.0, Polars, and CuDF at unprecedented speeds
  • Design data validation pipelines, construct efficient data service APIs, develop real-time streaming pipelines, and master the art of workflow orchestration to streamline your engineering projects
  • Leverage concurrent programming to develop machine learning pipelines, and get hands-on experience in the development and deployment of machine learning pipelines across AWS, GCP, and Azure

Who This Book Is For

Data analysts, data engineers, data scientists, machine learning engineers, and MLOps specialists

Contents

  • Chapter 1: Data Manipulation and Analytics Using Pandas
  • Chapter 2: Data Manipulation Using Polars and CuDF
  • Chapter 3: Introduction to Data Validation
  • Chapter 4: Data Validation Using Great Expectations
  • Chapter 5: Introduction to API Design Using FastAPI
  • Chapter 6: Introduction to Concurrency Programming Using Dask
  • Chapter 7: Dask ML
  • Module 5: Data Pipelines in the Cloud
  • Chapter 9: Introduction to Microsoft Azure
  • Chapter 10: Introduction to Google Cloud
  • Chapter 11: Introduction to Streaming Data
  • Chapter 12: Introduction to Workflow Management Using Airflow
  • Chapter 13: Introduction to Workflow Management Using Prefect
