Posts from the community tagged with reliability...

Story
@boldlink shared a post, 1 year, 9 months ago
AWS DevOps Consultancy, Boldlink

An Overview of AWS Well-Architected Framework

Thinking of getting started with AWS cloud computing or migrating your existing workloads to AWS? Here is a quick guide on how the 5 pillars of the AWS Well-Architected Framework will help you build a secure, high-performing, resilient and efficient cloud infrastructure for your workloads. So basically..

Story
@yair_stark shared a post, 2 years, 2 months ago

Error Budget Is All You Need - Part 2

In part 1 I proposed a simple modification to Google’s Multi-Window Multi-Burn Rate alerting setup and I showed how this modification addresses the cases of varying-traffic services and typical latency SLOs.
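
For readers unfamiliar with the baseline this post builds on, here is a minimal sketch of a multi-window burn-rate check. It illustrates the general setup from Google's SRE workbook, not the author's modification. The function names, parameters, and example numbers are assumptions made for illustration.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Return how many times faster than 'exactly on budget' errors are occurring.

    error_ratio: fraction of failed requests in the window (hypothetical input).
    slo_target:  SLO as a fraction, e.g. 0.999 for 99.9%.
    """
    error_budget = 1.0 - slo_target  # allowed fraction of failed requests
    return error_ratio / error_budget


def should_alert(long_window_error_ratio: float,
                 short_window_error_ratio: float,
                 slo_target: float,
                 threshold: float) -> bool:
    """Alert only if BOTH windows exceed the burn-rate threshold.

    The long window establishes that a significant share of the budget was
    spent; the short window confirms the problem is still ongoing.
    """
    return (burn_rate(long_window_error_ratio, slo_target) >= threshold
            and burn_rate(short_window_error_ratio, slo_target) >= threshold)


# Assumed example: 99.9% SLO, threshold 14.4 (2% of a 30-day budget in 1 hour).
still_burning = should_alert(0.02, 0.02, slo_target=0.999, threshold=14.4)
already_recovered = should_alert(0.02, 0.0005, slo_target=0.999, threshold=14.4)
```

With these assumed inputs, the first call alerts because both windows burn the budget about 20 times too fast, while the second does not because the short window shows the error rate has already dropped.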

Story
@yair_stark shared a post, 2 years, 2 months ago

Error Budget Is All You Need - Part 1

One of the great chapters of Google’s second Site Reliability Engineering (SRE) book is chapter 5, Alerting on SLOs (Service Level Objectives). This chapter takes you on a comprehensive journey through several setups of alerts on SLOs, starting with the simplest non-optimized one and iterating through several setups to reach the ultimate one, which is optimized with respect to the four main alerting attributes: recall, precision, detection time and reset time.

Story
@tharunshiv shared a post, 2 years, 3 months ago
Site Reliability Engineer, PhonePe

#1 What's Site Reliability Engineering [SRE] | Roles & Responsibilities | Technologies involved

Site Reliability Engineering, popularly referred to as SRE, is a role in Computer Science Engineering whose main purpose is to provision, maintain, monitor, and manage infrastructure in order to provide maximum application uptime and reliability. SRE is an emerging role, but the tasks an SRE performs have existed ever since the first application was developed. The scope of the software developers ends where they write the code for the application. Everything from there on is the duty of an SRE: setting up the infrastructure and the various services that run on it, providing the required network connectivity, providing a platform for the application to run on, and making sure every part of the application is up and running reliably 24x7. In fact, we can consider Site Reliability Engineers the strong bridge between the users and a reliable application.

Link
@prathamesh-sonpatki shared a link, 9 months, 1 week ago
SRE, Last9.io

MTTF vs. MTBF vs. MTTD vs. MTTR