DevOps & Infrastructure · Engineering · Platinum

Need expert help with Prometheus and Grafana for monitoring systems.

Grafana & Prometheus Expert

Prometheus, Grafana, LGTM stack

advanced · v5.0

Best for

  • Setting up Prometheus scraping for Kubernetes workloads with proper service discovery
  • Optimizing slow PromQL queries that time out on large datasets
  • Designing Grafana dashboards with proper SLI/SLO visualization and multi-burn-rate alerting
  • Implementing the full LGTM stack (Loki, Grafana, Tempo, Mimir) for unified observability

What you'll get

  • Complete Prometheus configuration YAML with optimized scrape configs, recording rules, and service discovery for Kubernetes environments
  • Grafana dashboard JSON with proper variable templating, optimized queries, and SLI/SLO panels with burn-rate calculations
  • Detailed LGTM stack deployment architecture with specific resource requirements, networking considerations, and scaling recommendations

Expects

Specific technical details about your monitoring infrastructure, current Prometheus/Grafana setup, scale requirements, and the observability problem you're trying to solve.

Returns

Detailed technical implementations including PromQL queries, Grafana dashboard JSON, alerting rules, service discovery configurations, and architecture recommendations with specific version considerations.

What's inside

You are a Grafana & Prometheus Expert. You design, optimize, and troubleshoot production observability stacks across multi-region Kubernetes clusters, handling billions of active time series while maintaining sub-minute incident detection. - **Obsess over cardinality control**: You prevent cardinali...

Covers

What You Do Differently · Methodology · Watch For
Not designed for

  • Application performance monitoring tools like New Relic or Datadog configuration
  • Writing custom Prometheus exporters or instrumenting application code
  • General server monitoring without metrics-focused observability strategy
  • Log analysis and parsing without correlation to metrics and traces

SupaScore

88.45
Research Quality (15%)
8.85
Prompt Engineering (25%)
9.2
Practical Utility (15%)
8.65
Completeness (10%)
8.85
User Satisfaction (20%)
8.8
Decision Usefulness (15%)
8.5
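
The headline score above appears consistent with a weighted average of the six category scores, rescaled from a 10-point to a 100-point range. The sketch below checks that arithmetic. The scaling rule is an assumption inferred from the numbers on this page, not a documented SupaScore formula, and the variable names are illustrative.

```python
from decimal import Decimal

# (category, weight in percent, score out of 10), as listed on this page
CATEGORIES = [
    ("Research Quality", 15, Decimal("8.85")),
    ("Prompt Engineering", 25, Decimal("9.2")),
    ("Practical Utility", 15, Decimal("8.65")),
    ("Completeness", 10, Decimal("8.85")),
    ("User Satisfaction", 20, Decimal("8.8")),
    ("Decision Usefulness", 15, Decimal("8.5")),
]

# The listed weights should account for the whole score.
total_weight = sum(weight for _, weight, _ in CATEGORIES)

# Assumption: headline score = weighted mean (0-10 scale) multiplied by 10.
weighted_sum = sum(score * weight for _, weight, score in CATEGORIES)
supa_score = weighted_sum / total_weight * 10

print(total_weight, supa_score)
```

Under that assumption the weights sum to 100 and the computed value equals the listed 88.45. Decimal is used instead of float so the comparison is exact.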

Evidence Policy

Standard: no explicit evidence policy.

prometheus · grafana · promql · alertmanager · loki · tempo · mimir · observability · monitoring · dashboards-as-code · sli-slo · recording-rules · service-discovery · alerting

Research Foundation: 8 sources (6 official docs, 1 book, 1 academic)

This skill was developed through independent research and synthesis. SupaSkills is not affiliated with or endorsed by any cited author or organisation.

Version History

v5.0 · 3/25/2026

v5.5 distilled from v2 via Claude Sonnet

v2.0 · 2/23/2026

Pipeline v4: rebuilt with 3 helper skills

v1.0.0 · 2/15/2026

Initial release

Common Workflows

Production Observability Stack Setup

Complete workflow from infrastructure provisioning through observability implementation to incident response preparation

© 2026 Kill The Dragon GmbH. This skill and its system prompt are protected by copyright. Unauthorised redistribution is prohibited. Terms of Service · Legal Notice