Full Description
Discover how AIOps is transforming the observability landscape for cloud-native and traditional systems. Learn how to build, monitor, and operate resilient services using AI-driven dynamic insights for smarter and more scalable operations.
Key Features
Practical Integration of AI and Observability in Modern Engineering Workflows
Real-World Use Cases Grounded in Industry Experience
Tailored for Modern Engineering Roles and Organizations
Book Description
With OpenTelemetry, observability has become central to building and operating cloud-native distributed systems. At the same time, advances in AI are transforming how we extract value from the growing volume of observability data. This book shows you how to implement scalable observability, improve engineering efficiency with AI, and extend observability practices from production into development through modern internal developer platforms.
You'll begin with the fundamentals of observability (logs, metrics, and traces), then learn how AIOps enhances signal correlation, anomaly detection, and root-cause analysis. Through real-world examples and architectural guidance, the book demonstrates how to integrate AIOps into existing systems and build pipelines that proactively detect and resolve issues before users are affected.
You'll also explore best practices for expanding observability across the software development lifecycle, enabling AI-powered observability as a self-service capability for engineers. Using tools such as OpenTelemetry, Prometheus, Elasticsearch, and Grafana alongside machine learning models, you'll learn how to automate diagnostics and remediation.
By the end of this book, you'll be able to design and implement AIOps-enabled observability solutions that make cloud-native systems more resilient and efficient.
What you will learn
Build observability pipelines with logging, metrics, and tracing
Apply AI/ML for anomaly detection and root cause analysis
Correlate signals from multiple sources for better incident triage
Automate responses with self-healing and remediation scripts
Integrate tools like OpenTelemetry, Prometheus, and Elasticsearch
Design scalable architectures for intelligent monitoring
Who this book is for
This book is for software engineers and engineering leaders on teams with operational responsibilities, such as platform engineering, site reliability engineering (SRE), DevOps, or application development, who want to integrate AIOps capabilities into their workflows. If your team is responsible for building and running high-performing, resilient software systems, this book is for you.
Table of Contents
Observability: The art of turning data into information
The Elephant in the Room: Artificial Intelligence
From Observability to AIOps and the Use Cases it solves today!
Financial One ACME: Implementing AIOps!
Democratizing Observability: A Primer to Self-Service Platforms
Observability Agents in Action
Financial One ACME: How to move from AIOps to Agentic Platforms
Evolving Operations: Proactive -> Preventive -> Self-Driven Architecture
Navigating AI Pitfalls: Governance, Cost & Ethical Guardrails
Transforming Financial One ACME with AI-Driven Observability



