How to write a runbook that will keep recovery times low, make maintenance tasks easier, and lower everyone's stress level.
Following best practices during incident response reduces time to recovery while also reducing engineers' stress.
Postmortems only move the organization forward when they are done correctly. Here are some best practices for constructive postmortems.
Few engineers feel they have enough visibility during complex or novel incidents. Most struggle with service-level visibility in particular, and effx can help.
Timelines contain information that's critical for incident response, but without the right tools they take too long to assemble to be useful while an incident is still in progress. Here's why you shouldn't wait until the postmortem to use a timeline.