Incident response: our postmortem process that actually prevents recurrence
Kenji Lindberg
We recently revamped our incident postmortem process, and it has been effective at preventing recurrence. Historically, our postmortems felt like blame sessions, and action items often got lost. The new approach is strictly blameless and focuses on systemic improvements. Every action item gets a clear owner and a deadline, and a monthly review meeting tracks progress and celebrates successes. We also started classifying incidents more rigorously to identify common failure modes. Since implementing this, our repeat incidents have dropped by about 60%.

I'm curious what makes other teams' postmortem processes effective. What are the key elements that contribute to genuine learning and prevention for you?
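For anyone who wants to try the owner-plus-deadline tracking and the incident classification described above, here is a minimal Python sketch of how the data could be modeled. It is not our actual tooling: the names `ActionItem`, `Incident`, `overdue_items`, `repeat_failure_modes`, and the example failure-mode label are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class ActionItem:
    """A postmortem follow-up with a single owner and a deadline."""
    title: str
    owner: str
    deadline: date
    done: bool = False


@dataclass
class Incident:
    """An incident tagged with one failure-mode category."""
    summary: str
    failure_mode: str  # hypothetical label, e.g. "config-change"


def overdue_items(items, today):
    """Return open action items whose deadline has passed, oldest first."""
    late = [i for i in items if not i.done and i.deadline < today]
    return sorted(late, key=lambda i: i.deadline)


def repeat_failure_modes(incidents):
    """Return failure modes seen more than once, with their counts."""
    counts = Counter(i.failure_mode for i in incidents)
    return {mode: n for mode, n in counts.items() if n > 1}
```

In a monthly review, something like `overdue_items(items, date.today())` would give the list to walk through, and `repeat_failure_modes(incidents)` would point at the categories where the same kind of failure keeps coming back.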