Following best practices during incident response shortens time to recovery and reduces engineers' stress.
Postmortems only move the organization forward when they are done correctly. Here are some best practices for constructive postmortems.
Few engineers feel they have enough visibility during complex or novel incidents. Service-level visibility is something most engineers struggle with, and effx can help.
Timelines contain information that's critical for incident response, but without the right tools they take too long to assemble to be useful while the incident is still underway. Here's why you shouldn't wait until the postmortem to use a timeline.
At a previous job, I worked with a particularly effective incident manager. After working with him for a while, I learned that before going into software engineering he had been an air traffic