How do you stop one problem from causing a bunch of other problems?
Jaya Safira
6017 views
Hey everyone, I've been thinking a lot about system resilience lately. We've all probably seen or experienced a situation where one service failing brings down several others. I'm curious about what patterns and techniques you all use to prevent these cascading failures. Are you big on circuit breakers, timeouts, bulkheads, or something else? How do you implement these in practice, especially in a distributed system?
23 comments