One of the great chapters of Google’s second Site Reliability Engineering (SRE) book is chapter 5, Alerting on SLOs (Service Level Objectives). The chapter takes you on a comprehensive journey through several setups of alerts on SLOs, starting with the simplest, non-optimized one and iterating until it reaches the final one, which is optimized with respect to the four main alerting attributes: recall, precision, detection time, and reset time.
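To make the first two attributes concrete, here is a minimal sketch in Python. It assumes the usual definitions: precision is the share of fired alerts that corresponded to a significant event, and recall is the share of significant events that triggered an alert. The function names and the counts in the usage lines are hypothetical and exist only for illustration; they do not come from the book.

```python
def precision(true_alerts: int, false_alerts: int) -> float:
    """Share of fired alerts that matched a significant event."""
    fired = true_alerts + false_alerts
    # Assumption: with no alerts fired, nothing was a false alarm.
    return true_alerts / fired if fired else 1.0


def recall(detected_events: int, missed_events: int) -> float:
    """Share of significant events that triggered an alert."""
    significant = detected_events + missed_events
    # Assumption: with no significant events, nothing was missed.
    return detected_events / significant if significant else 1.0


# Hypothetical month: 8 alerts were real, 2 were noise, 8 events went unnoticed.
alert_precision = precision(true_alerts=8, false_alerts=2)
alert_recall = recall(detected_events=8, missed_events=8)
```

The other two attributes are durations rather than ratios: detection time is how long an alert takes to fire once a significant event starts, and reset time is how long it keeps firing after the issue is resolved.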