DevOps & Infrastructure · Engineering · Gold

Quickly resolve production system failures.

Production Incident Triage

Google SRE, Distributed Tracing, Observability

intermediate · v6.0

What's inside

You are a Production Incident Triage Specialist. You guide engineering teams through structured investigation and resolution of production system failures, enforcing disciplined triage that separates mitigation from root cause analysis and prevents panic-driven troubleshooting.

- **Separation of mit...

Covers

What You Do Differently · Methodology · Watch For

SupaScore

83.35
Research Quality (15%): 8.4
Prompt Engineering (25%): 8.2
Practical Utility (15%): 8.6
Completeness (10%): 8
User Satisfaction (20%): 8.3
Decision Usefulness (15%): 8.5
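
The page does not say how the headline SupaScore is derived from the six sub-scores. One reading that is consistent with the listed numbers is a weighted average of the sub-scores, scaled from a 0-10 range to 0-100. The sketch below checks that reading; the `weighted_score` helper, the `SCORES` table layout, and the scaling factor of 10 are assumptions made for illustration, not a documented formula.

```python
# Weights and sub-scores copied from the SupaScore breakdown.
# Each entry is (weight as a fraction of 1, sub-score on a 0-10 scale).
SCORES = {
    "Research Quality": (0.15, 8.4),
    "Prompt Engineering": (0.25, 8.2),
    "Practical Utility": (0.15, 8.6),
    "Completeness": (0.10, 8.0),
    "User Satisfaction": (0.20, 8.3),
    "Decision Usefulness": (0.15, 8.5),
}


def weighted_score(scores, scale=10.0):
    """Return the weighted average of the sub-scores, multiplied by `scale`.

    Assumption: the headline score is a plain weighted mean scaled to 0-100.
    """
    total_weight = sum(weight for weight, _ in scores.values())
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError(f"weights sum to {total_weight}, expected 1.0")
    return scale * sum(weight * value for weight, value in scores.values())


if __name__ == "__main__":
    print(round(weighted_score(SCORES), 2))
```

Under this assumption the weights sum to 100% and the result matches the listed 83.35, which suggests (but does not confirm) that the headline figure is computed this way.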

Evidence Policy

Standard: no explicit evidence policy.

incident-response · production-debugging · sre · site-reliability · triage · distributed-tracing · observability · post-mortem · mitigation · root-cause-analysis · on-call · incident-management

Research Foundation: 8 sources (4 books, 1 official docs, 2 web, 1 paper)

This skill was developed through independent research and synthesis. SupaSkills is not affiliated with or endorsed by any cited author or organisation.

Version History

v6.0 · 6/16/2026

v6.0 wave-1 repair: re-distilled from masterfile/v2 (truncation incident 2026-06, delta-first rules)

v5.0 · 3/25/2026

v5.5 distilled from v2 via Claude Sonnet

v1.0.0 · 3/23/2026

Initial release via Pipeline v3

© 2026 Kill The Dragon GmbH. This skill and its system prompt are protected by copyright. Unauthorised redistribution is prohibited. Terms of Service · Legal Notice