Preparation
Set your team up for success before anything breaks. This chapter covers how to define stakeholders, notification rules, messaging templates, communication channels, automations and more!
The best time to prepare for incident communication is before anything breaks. When an outage hits, you don’t want to be scrambling to find contacts, writing messages from scratch, or arguing about what channel to use. This section gives you the building blocks to have everything ready.
Define Who to Notify
During an incident, speed matters — but so does relevance. Not everyone needs to hear about every service outage, and over-notifying can cause confusion or unnecessary panic. To avoid this, prepare a recipients list that specifies exactly who should be notified, for which severities, and for which services.
1. Identify Stakeholders
List everyone who could receive incident communications. Typical categories include:
Internal: Engineering, Support, Customer Success, Executives, Finance, Legal/PR.
External: Customers, key accounts, partners, regulators.
2. Document Notification Rules
For each stakeholder, record:
Severities to notify (SEV1, SEV2, SEV3).
Relevant services (which outages actually affect them).
Contact methods (email, phone, Slack, PagerDuty, etc.).
Notes (special conditions — e.g., SLA thresholds, VIP outreach).
Example recipients list:
Stakeholder | Severities | Services | Contact methods | Notes
Engineering On-Call | SEV1, SEV2, SEV3 | All services | PagerDuty, Slack | Primary: Alice, Backup: Bob
Finance | SEV1, SEV2 | Billing, Payments | Email: finance@... | Notify only if >1h downtime
PR / Marketing | SEV1 | Public-facing services | Phone + Email | Prepares external messaging
VIP Account A | SEV1, SEV2 | Core API, Integrations | Direct CSM call | SLA requires <15m notification
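If the recipients list is kept in a machine-readable form, the lookup "who hears about this incident?" can be automated. The following is a minimal sketch, not a prescribed implementation: the `Recipient` class, the `recipients_for` function, and the `"*"` wildcard for "All services" are hypothetical names introduced for illustration, and only three rows of the example list are encoded. Conditions in the notes column (such as ">1h downtime") are stored but not evaluated here.

```python
from dataclasses import dataclass

ALL_SERVICES = "*"  # hypothetical wildcard standing in for "All services"


@dataclass(frozen=True)
class Recipient:
    name: str
    severities: frozenset
    services: frozenset
    contact: str
    notes: str = ""  # free-text conditions; not evaluated by this sketch


# A subset of the example recipients list, encoded as data.
RECIPIENTS = [
    Recipient("Engineering On-Call", frozenset({"SEV1", "SEV2", "SEV3"}),
              frozenset({ALL_SERVICES}), "PagerDuty, Slack"),
    Recipient("Finance", frozenset({"SEV1", "SEV2"}),
              frozenset({"Billing", "Payments"}), "Email",
              notes="Notify only if >1h downtime"),
    Recipient("VIP Account A", frozenset({"SEV1", "SEV2"}),
              frozenset({"Core API", "Integrations"}), "Direct CSM call",
              notes="SLA requires <15m notification"),
]


def recipients_for(severity, service, recipients=RECIPIENTS):
    """Return names of recipients whose rules match this severity and service."""
    return [
        r.name
        for r in recipients
        if severity in r.severities
        and (ALL_SERVICES in r.services or service in r.services)
    ]


print(recipients_for("SEV2", "Billing"))
# ['Engineering On-Call', 'Finance']
```

Keeping the rules as data rather than scattered conditionals means the regular review of the list is an edit to one table, not a code change.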
3. Keep It Current
Review the list regularly or after major changes (new services, new stakeholders).
Store it in a place that’s always accessible during incidents (status page tool, internal wiki, or incident management platform).
Assign ownership for keeping it updated.
Prepare Messaging Templates
You should never start an incident update with a blank page. Prepare pre-approved templates:
One set for each severity (SEV1, SEV2, SEV3).
Separate versions for internal and external audiences.
One template for each update type: identified, update, monitoring, and resolved.
Tone guidelines:
✅ Be clear and empathetic, and write in plain English.
❌ Don’t downplay the impact (“only a few users”), and don’t use heavy technical jargon.
Example template:
“We are investigating an issue affecting [service(s)]. Some customers may experience [impact]. We’ll share an update within the next [20 minutes].”
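A template with bracketed placeholders can also be stored in code so that it cannot be published half-filled. The sketch below uses Python's standard `string.Template`; the names `INVESTIGATING` and `render_investigating` and the placeholder names are assumptions made for illustration. `Template.substitute` raises `KeyError` when a placeholder is missing, which is the property of interest here.

```python
from string import Template

# The example wording above, with the bracketed gaps turned into placeholders.
INVESTIGATING = Template(
    "We are investigating an issue affecting $services. "
    "Some customers may experience $impact. "
    "We'll share an update within the next $next_update."
)


def render_investigating(services, impact, next_update_minutes):
    """Fill in the template; substitute() raises KeyError if a field is missing."""
    return INVESTIGATING.substitute(
        services=", ".join(services),
        impact=impact,
        next_update=f"{next_update_minutes} minutes",
    )


print(render_investigating(["Core API"], "failed requests", 20))
```

Using `substitute` rather than `safe_substitute` is a deliberate choice in this sketch: a message with a leftover `$impact` should fail loudly before it reaches customers.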
Define Communication Channels
Decide in advance where and how updates will be published. Typical options:
Status page: the primary source of truth.
Email: good for customer alerts, especially for SEV1/2.
In-app notifications: useful for SaaS platforms with logged-in users.
Chat tools (Slack/MS Teams): for internal coordination.
Social media (Twitter/X, LinkedIn): optional, but effective for high-impact public outages.
Example channel plan by severity:
SEV1: Status page + email to all customers + internal Slack + exec briefing.
SEV2: Status page + targeted email (affected accounts).
SEV3: Status page only.
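A severity-to-channel plan like the one above is easy to encode as a lookup table, so that nobody has to decide where to post in the middle of an outage. This is a minimal sketch; `CHANNELS_BY_SEVERITY` and `channels_for` are hypothetical names, and the channel labels are plain strings rather than real integrations.

```python
# Channel plan copied from the SEV1/SEV2/SEV3 lines above.
CHANNELS_BY_SEVERITY = {
    "SEV1": ["status page", "email to all customers",
             "internal Slack", "exec briefing"],
    "SEV2": ["status page", "targeted email (affected accounts)"],
    "SEV3": ["status page"],
}


def channels_for(severity):
    """Return the channels to publish on, rejecting unknown severities."""
    try:
        return CHANNELS_BY_SEVERITY[severity]
    except KeyError:
        raise ValueError(f"Unknown severity: {severity!r}") from None


print(channels_for("SEV2"))
# ['status page', 'targeted email (affected accounts)']
```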
Define Update Cadence
Set clear expectations for how often you’ll send incident updates, based on the severity and the audience. This helps reduce uncertainty for everyone affected.
Example Update Cadence Matrix
Severity | Internal stakeholders | External stakeholders
SEV1 | Every 15 minutes | Every 20 minutes
SEV2 | Every 30 minutes | Every 45 minutes
SEV3 | Every 60 minutes | Every 90 minutes
Internal stakeholders: Teams who rely on the affected service to do their work (e.g., Support, Sales, Product).
External stakeholders: Customers, partners, or regulators.
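Once a cadence is agreed, the time of the next promised update follows directly from the time of the last one. The sketch below assumes that the first interval listed for each severity applies to internal stakeholders and the second to external stakeholders; that reading, along with the names `CADENCE_MINUTES` and `next_update_due`, is an assumption of this example.

```python
from datetime import datetime, timedelta

# Minutes between updates, keyed by (severity, audience).
# Assumption: first value per severity is internal, second is external.
CADENCE_MINUTES = {
    ("SEV1", "internal"): 15, ("SEV1", "external"): 20,
    ("SEV2", "internal"): 30, ("SEV2", "external"): 45,
    ("SEV3", "internal"): 60, ("SEV3", "external"): 90,
}


def next_update_due(severity, audience, last_update):
    """Return the time by which the next update should be sent."""
    minutes = CADENCE_MINUTES[(severity, audience)]
    return last_update + timedelta(minutes=minutes)


last = datetime(2024, 1, 1, 12, 0)
print(next_update_due("SEV1", "external", last))
# 2024-01-01 12:20:00
```

A reminder bot or a simple timer can call a function like this after each published update so the promised interval is not missed.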
Automate Distribution
It’s not enough to know the channel — decide how the message actually gets there:
Manual posting (who has access, who’s trained).
Automated distribution via status page tools (status page → email/SMS).
Pre-configured integrations (PagerDuty/Pingdom → status page → email/SMS).
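Whichever option is chosen, the underlying pattern is a fan-out: one approved message goes to several channels, and a failure on one channel should not block the others. The sketch below illustrates only that pattern. The `distribute` function and the stand-in senders are hypothetical; no real status page, email, or paging API is shown or implied.

```python
from typing import Callable, Dict

Sender = Callable[[str], None]  # any callable that delivers one message


def distribute(message: str, senders: Dict[str, Sender]) -> Dict[str, bool]:
    """Send the message on every channel and report success per channel."""
    results = {}
    for channel, send in senders.items():
        try:
            send(message)
            results[channel] = True
        except Exception:
            # Record the failure and keep going; a human follows up manually.
            results[channel] = False
    return results


# Stand-in senders for illustration: a list acts as the status page outbox.
status_page_outbox = []


def broken_email(message: str) -> None:
    raise RuntimeError("mail gateway unavailable")


report = distribute(
    "We are investigating an issue affecting Core API.",
    {"status_page": status_page_outbox.append, "email": broken_email},
)
print(report)
# {'status_page': True, 'email': False}
```

The per-channel report gives the person posting a clear list of channels that still need a manual post.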
Assign Roles & Responsibilities
Clearly assign who owns what:
Incident Commander (IC): Focuses on the technical response, confirms facts for comms.
Comms Lead: Writes and publishes updates. In small teams, the IC may also take on this responsibility.
Support/CSM: Relays info directly to customers, handles VIP outreach.
Dry Run & Review
Preparation isn’t “set it and forget it.” Test your process:
Run a tabletop exercise at least once a quarter.
Simulate a SEV1 outage: does everyone know their role? Can updates go out within 10 minutes?
After the exercise, update templates, contact lists, and access rights based on what broke down.