
SLOs/SLAs in practice: how to set realistic targets

Ingrid Hassan
·111 views
Setting realistic SLOs and SLAs is tricky, and getting it wrong can have huge engineering implications. I've seen teams aim for 99.999% availability because it 'sounds good,' without fully grasping that it allows only about 5 minutes of downtime per year (roughly 5.3), which requires massive redundancy and cost.

We typically start by defining what's truly acceptable to our customers for different service tiers. For a public-facing API that directly impacts revenue, 99.99% (about 52 minutes/year) might be the sweet spot. For an internal batch processing service, 99% (3.65 days/year) could be perfectly fine. The key is involving product and business stakeholders in this conversation and explaining the cost curve of higher nines: each additional nine cuts the allowed downtime by a factor of ten, while the engineering effort needed to get there keeps climbing.

Once SLOs are set, handling violations is crucial. It's not about punishing teams, but understanding *why* the violation occurred and having a clear action plan. We also distinguish between SLOs (our internal targets) and SLAs (contractual obligations with customers) to manage expectations and legal commitments. It's a continuous balancing act between reliability, engineering effort, and business impact.
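
To make the downtime numbers concrete when talking to stakeholders, here is a minimal Python sketch of the arithmetic. It assumes a 365-day year and treats availability as a simple time-based percentage; the function names `downtime_budget_minutes` and `remaining_budget_minutes` are made up for illustration and are not from any particular tool.

```python
# Minimal sketch: convert an availability target into a yearly downtime budget.
# Assumes a 365-day year (leap years ignored) and time-based availability.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_budget_minutes(availability_pct: float,
                            period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Allowed downtime, in minutes, for a target such as 99.99."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return period_minutes * (1 - availability_pct / 100)


def remaining_budget_minutes(availability_pct: float,
                             downtime_so_far_minutes: float) -> float:
    """Budget left for the year; a negative value means the SLO is violated."""
    return downtime_budget_minutes(availability_pct) - downtime_so_far_minutes


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = downtime_budget_minutes(target)
        print(f"{target}% -> {minutes:.2f} min/year "
              f"({minutes / (24 * 60):.2f} days)")
```

Tracking `remaining_budget_minutes` over the year is one simple way to turn a violation review into a question about how much budget was spent and on what, rather than who was at fault.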
5 comments
