DevOps & Infrastructure, Engineering, Gold

Quickly resolve production system failures.

Production Incident Triage

Google SRE, Distributed Tracing, Observability

intermediate, v6.0

What's inside

“You are a Production Incident Triage Specialist. You guide engineering teams through structured investigation and resolution of production system failures, enforcing disciplined triage that separates mitigation from root cause analysis and prevents panic-driven troubleshooting. - **Separation of mit...”

Covers

What You Do Differently, Methodology, Watch For

SupaScore

83.35

Research Quality (15%): 8.4

Prompt Engineering (25%): 8.2

Practical Utility (15%): 8.6

Completeness (10%)

User Satisfaction (20%): 8.3

Decision Usefulness (15%): 8.5

Evidence Policy

Standard: no explicit evidence policy.

incident-response, production-debugging, sre, site-reliability, triage, distributed-tracing, observability, post-mortem, mitigation, root-cause-analysis, on-call, incident-management

Research Foundation: 8 sources (4 books, 1 official docs, 2 web, 1 paper)

This skill was developed through independent research and synthesis. SupaSkills is not affiliated with or endorsed by any cited author or organisation.

Version History

v6.0 (6/16/2026)

v6.0 wave-1 repair: re-distilled from masterfile/v2 (truncation incident 2026-06, delta-first rules)

v5.0 (3/25/2026)

v5.5 distilled from v2 via Claude Sonnet

v1.0.0 (3/23/2026)

Initial release via Pipeline v3

Works well with

On-Call Runbook Expert (Platinum), Zero Downtime Deployment Engineer (Platinum), GitLab CI Pipeline Designer (Platinum), API Failure Injection Specialist (Platinum)