A useful incident postmortem explains what happened, why it happened, how the response worked, and what will change next. It should produce better systems and better operational decisions, not just a document that closes the process.
If you need the customer-facing workflow, see Incident management. This guide is about the internal review after the incident is over.
What a good postmortem includes
Every postmortem should cover:
- summary
- impact
- timeline
- root cause
- detection and response review
- follow-up actions
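A small script can check a draft against the list above before it goes to review. This is a minimal sketch, not part of any existing tool: the function name `missing_sections` is hypothetical, it assumes each section appears as a heading line in the draft, and "response review" is shortened so it matches both the list item and the template heading.

```python
# Section names taken from the list above; matching is case-insensitive.
REQUIRED_SECTIONS = [
    "summary",
    "impact",
    "timeline",
    "root cause",
    "response review",
    "follow-up actions",
]


def missing_sections(text: str) -> list[str]:
    """Return required section names that no line of the draft mentions."""
    lines = [line.strip().lower() for line in text.splitlines()]
    return [
        name for name in REQUIRED_SECTIONS
        if not any(name in line for line in lines)
    ]


# Hypothetical draft that stops after the root cause section.
draft = """1. Incident summary
2. Customer impact
3. Timeline
4. Root cause
"""
print(missing_sections(draft))  # ['response review', 'follow-up actions']
```

A substring match is deliberately loose. It only confirms that a section exists, not that its content is adequate.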
Postmortem template
1. Incident summary
- Incident title:
- Date:
- Severity:
- Status:
- Owner:
Example:
On 2026-03-04, API write requests failed for part of EU traffic for 27 minutes after a deployment introduced a configuration mismatch in one region.
2. Customer impact
Document what customers actually experienced.
- affected services
- affected user segment
- start and end time
- business impact
3. Timeline
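Use timestamps in one time zone and keep the sequence factual: record what happened and when, and leave analysis for the root cause and response review sections.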
| Time | Event |
|---|---|
| 09:02 UTC | Elevated error rate detected |
| 09:05 UTC | Incident declared |
| 09:11 UTC | Failed deploy identified |
| 09:18 UTC | Rollback started |
| 09:29 UTC | Recovery confirmed |
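Once the timeline has timestamps, the gaps between steps can be computed instead of estimated. The sketch below uses the rows from the example table above. The helper names `minutes_between` and `step_durations` are illustrative, and it assumes all events fall on the same day in the same time zone.

```python
from datetime import datetime

# Rows copied from the example timeline (all times UTC, same day).
TIMELINE = [
    ("09:02", "Elevated error rate detected"),
    ("09:05", "Incident declared"),
    ("09:11", "Failed deploy identified"),
    ("09:18", "Rollback started"),
    ("09:29", "Recovery confirmed"),
]


def minutes_between(start: str, end: str) -> int:
    """Whole minutes between two HH:MM timestamps on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def step_durations(timeline):
    """Pair each event with the minutes elapsed since the previous event."""
    return [
        (event, minutes_between(prev_time, time))
        for (prev_time, _), (time, event) in zip(timeline, timeline[1:])
    ]


for event, minutes in step_durations(TIMELINE):
    print(f"{minutes:>3} min  {event}")
print("total:", minutes_between(TIMELINE[0][0], TIMELINE[-1][0]), "min")
```

For the example rows, the steps take 3, 6, 7, and 11 minutes, for a total of 27 minutes from detection to confirmed recovery. The longest gap, between rollback start and recovery, is a natural starting point for the "what slowed mitigation?" question in the response review.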
4. Root cause
State the root cause plainly: what changed, and why existing safeguards did not catch it.
Good example:
A deployment changed a regional configuration required by the write path. Health checks did not validate the affected code path, so the issue passed rollout gates.
5. Response review
Answer these questions:
- how was the issue detected?
- was severity assigned correctly?
- were customer updates timely?
- were the right responders involved?
- what slowed mitigation?
6. Follow-up actions
Actions should be specific, owned, and trackable: each one names a concrete change, an owner, and a due date.
| Action | Owner | Due date |
|---|---|---|
| Add write-path deploy validation | Platform team | 2026-03-20 |
| Update rollback checklist | On-call lead | 2026-03-14 |
| Add incident template for regional failures | SRE | 2026-03-12 |
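The "owned and trackable" requirement is easy to check mechanically if actions are stored as structured data. The following is a sketch under that assumption: the `Action` class and the `untrackable` and `overdue` helpers are hypothetical names, and the last entry is an invented bad example added only to show what the check flags.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Action:
    description: str
    owner: Optional[str]
    due: Optional[date]


def untrackable(actions):
    """Return descriptions of actions missing an owner or a due date."""
    return [a.description for a in actions if not a.owner or a.due is None]


def overdue(actions, today):
    """Return descriptions of actions whose due date is before `today`."""
    return [a.description for a in actions if a.due is not None and a.due < today]


actions = [
    Action("Add write-path deploy validation", "Platform team", date(2026, 3, 20)),
    Action("Update rollback checklist", "On-call lead", date(2026, 3, 14)),
    Action("Add incident template for regional failures", "SRE", date(2026, 3, 12)),
    Action("Improve monitoring", None, None),  # hypothetical entry with no owner or date
]

print(untrackable(actions))                # ['Improve monitoring']
print(overdue(actions, date(2026, 3, 13)))  # ['Add incident template for regional failures']
```

A check like this cannot judge whether an action is specific enough, so "Improve monitoring" would still need rewriting by a person even after an owner and date are filled in.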
What to avoid
Avoid postmortems that:
- blame individuals instead of systems
- stop at the first obvious cause
- list action items without owners
- describe internal details but skip customer impact
Practical rule
If someone reads the postmortem three months later, they should be able to understand:
- what broke
- why it broke
- what the team learned
- what was changed to reduce recurrence
For active-incident copy, use Incident communication templates.
FAQ
How long should an incident postmortem be?
Long enough to explain the incident clearly and define actions, but short enough that responders will actually read it. Most useful postmortems are concise and follow a consistent structure, such as the template in this guide.
Should postmortems be shared with customers?
Sometimes. A public summary can improve transparency, but the internal postmortem usually contains more operational detail than customers need.
What is the most important part of a postmortem?
The most important part is a clear connection between root cause, response gaps, and specific follow-up actions with owners.