Need a structured plan to handle IT system outages efficiently.
SRE Incident Response Expert
Incident Command System, SRE Framework
Best for
- Designing incident command structure for production outages with clear role assignments
- Building severity classification systems with objective response time and escalation criteria
- Creating runbook-driven response playbooks for known failure modes with testing procedures
- Establishing structured communication protocols for internal teams and external stakeholders
What you'll get
- Incident Command System adaptation with IC/Ops Lead/Comms Lead role definitions and handoff procedures
- Severity classification matrix with objective criteria, response SLAs, and required stakeholder involvement
- Communication protocol templates with structured status updates and escalation triggers
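To make the severity classification matrix mentioned above more concrete, here is a minimal Python sketch. It is an illustration only, not the skill's actual output: the level names (`SEV1` to `SEV3`), the single criterion (percentage of users affected), the thresholds, the SLA minutes, and the "Executive sponsor" stakeholder are all hypothetical assumptions. Only the IC, Ops Lead, and Comms Lead roles come from the listing itself.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SeverityLevel:
    name: str
    min_users_affected_pct: float   # hypothetical objective criterion
    response_sla_minutes: int       # hypothetical response SLA
    stakeholders: Tuple[str, ...]   # who must be involved at this level


# Ordered from most to least severe. All thresholds are illustrative.
SEVERITY_MATRIX = (
    SeverityLevel("SEV1", 50.0, 5,
                  ("Incident Commander", "Ops Lead", "Comms Lead",
                   "Executive sponsor")),
    SeverityLevel("SEV2", 10.0, 15,
                  ("Incident Commander", "Ops Lead", "Comms Lead")),
    SeverityLevel("SEV3", 0.0, 60, ("Ops Lead",)),
)


def classify(users_affected_pct: float) -> SeverityLevel:
    """Return the most severe level whose threshold the incident meets."""
    if not 0.0 <= users_affected_pct <= 100.0:
        raise ValueError("users_affected_pct must be between 0 and 100")
    for level in SEVERITY_MATRIX:
        if users_affected_pct >= level.min_users_affected_pct:
            return level
    # Unreachable: the last level has a threshold of 0.0.
    raise AssertionError("no severity level matched")
```

A real matrix would likely combine several criteria (revenue impact, data loss, affected regions); a single numeric threshold is used here only to keep the lookup easy to follow.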
Input: Production incident scenarios, organizational context, existing tooling landscape, and current response gaps or pain points.
Output: Structured incident response frameworks with role definitions, communication templates, severity matrices, runbook formats, and post-incident learning processes.
What's inside
“You are an SRE Incident Response Expert. You engineer structured incident response processes for production outages that minimize Mean Time to Recovery (MTTR), reduce blast radius, and extract maximum learning from failures. - **Mitigation first, root cause later.** During active incidents, you imme...”
Not designed for
- Writing actual monitoring alerts or observability queries
- Debugging specific technical issues during live incidents
- Building the underlying infrastructure monitoring stack
- Performing root cause analysis of complex distributed system failures
SupaScore
88.58
Evidence Policy
Standard: no explicit evidence policy.
Research Foundation: 7 sources (3 books, 3 official docs, 1 paper)
This skill was developed through independent research and synthesis. SupaSkills is not affiliated with or endorsed by any cited author or organisation.
Version History
v5.5 distilled from v2 via Claude Sonnet
Pipeline v4: rebuilt with 3 helper skills
Initial release
Common Workflows
Production Reliability Program
Complete production reliability program from monitoring setup through incident response to continuous improvement
© 2026 Kill The Dragon GmbH. This skill and its system prompt are protected by copyright. Unauthorised redistribution is prohibited.