Automating Mitigation Sunsetting: Design Considerations?
Ravi Sato
I was reading about the challenges of managing incident mitigations like rate limits and traffic shaping. It seems a common pitfall is that these emergency controls can linger long after the incident is resolved, eventually impacting legitimate traffic. This got me thinking about designing systems where mitigations have an expiration date or are automatically removed. How do others approach this? Is it better to build in a hard expiry with alerts for renewal, or a system that can detect when a mitigation is no longer needed based on certain metrics? What are the complexities of such an automated 'sunsetting' mechanism in a high-scale distributed system?
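To make the "hard expiry with alerts for renewal" option concrete, here is a rough sketch of what I have in mind. Everything in it is hypothetical and invented for illustration (`Mitigation`, `MitigationRegistry`, `sweep`, the `warn_before` window); it runs in a single process and ignores the distributed parts such as clock skew, propagating removals to every node, and persisting state.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


@dataclass
class Mitigation:
    """Hypothetical record for one emergency control (e.g. a rate limit)."""
    name: str
    expires_at: datetime

    def renew(self, extension: timedelta) -> None:
        # Renewal is an explicit action, so nothing lingers by default.
        self.expires_at += extension


class MitigationRegistry:
    """Hypothetical in-memory registry; a real one would need shared storage."""

    def __init__(self, warn_before: timedelta) -> None:
        # How long before expiry a renewal alert should fire (assumed policy).
        self.warn_before = warn_before
        self.active: Dict[str, Mitigation] = {}

    def add(self, name: str, ttl: timedelta, now: datetime) -> Mitigation:
        # Every mitigation must be created with a TTL; there is no "forever".
        mitigation = Mitigation(name, now + ttl)
        self.active[name] = mitigation
        return mitigation

    def sweep(self, now: datetime) -> Tuple[List[str], List[str]]:
        """Remove expired mitigations; return (expired, expiring_soon) names."""
        expired = [n for n, m in self.active.items() if m.expires_at <= now]
        for name in expired:
            del self.active[name]
        expiring_soon = [
            n for n, m in self.active.items()
            if m.expires_at - now <= self.warn_before
        ]
        return expired, expiring_soon
```

The idea is that a periodic job calls `sweep`, pages someone for the names in `expiring_soon`, and tears down whatever comes back in `expired`. Passing `now` in explicitly is only there to make the logic easy to test. The metrics-based alternative would presumably replace the expiry check with some health predicate, which is the part I am least sure how to get right.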