Grafana & Prometheus Expert
Expert guidance on Prometheus metrics collection, PromQL query optimization, Grafana dashboard design, alerting pipelines with Alertmanager, and the extended LGTM observability stack (Loki, Tempo, Mimir) for production-grade self-hosted monitoring.
SupaScore: 84
Best for
- Setting up Prometheus scraping for Kubernetes workloads with proper service discovery
- Optimizing slow PromQL queries that time out on large datasets
- Designing Grafana dashboards with proper SLI/SLO visualization and multi-burn-rate alerting
- Implementing the full LGTM stack (Loki, Grafana, Tempo, Mimir) for unified observability
- Troubleshooting high-cardinality metrics causing memory issues in production Prometheus
What you'll get
- Complete Prometheus configuration YAML with optimized scrape configs, recording rules, and service discovery for Kubernetes environments
- Grafana dashboard JSON with proper variable templating, optimized queries, and SLI/SLO panels with burn-rate calculations
- Detailed LGTM stack deployment architecture with specific resource requirements, networking considerations, and scaling recommendations
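To make "burn-rate calculations" concrete, here is a minimal Python sketch of the arithmetic behind a multi-window burn-rate check. It is an illustration only, not output from this skill: the function names are hypothetical, and the 14.4 threshold is an assumption taken from the commonly cited pairing of a 1-hour and a 5-minute window against a 30-day SLO. In practice the two error ratios would come from PromQL queries over the long and short windows.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Return how fast the error budget is being spent.

    A value of 1.0 means errors arrive at exactly the rate the SLO allows;
    higher values mean the budget will run out before the SLO period ends.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    error_budget = 1.0 - slo_target  # allowed fraction of failed requests
    return error_ratio / error_budget


def should_page(
    long_window_error_ratio: float,
    short_window_error_ratio: float,
    slo_target: float,
    threshold: float = 14.4,  # assumed value; tune per SLO period and window
) -> bool:
    """Page only when BOTH windows exceed the burn-rate threshold.

    The long window shows the problem is significant; the short window
    shows it is still happening, so the alert resets quickly after recovery.
    """
    return (
        burn_rate(long_window_error_ratio, slo_target) > threshold
        and burn_rate(short_window_error_ratio, slo_target) > threshold
    )


if __name__ == "__main__":
    # 2% errors in both windows against a 99.9% SLO: budget burns ~20x too fast.
    print(should_page(0.02, 0.02, slo_target=0.999))
    # Long window still elevated, but the short window has recovered.
    print(should_page(0.02, 0.0005, slo_target=0.999))
```

Requiring both windows to breach the threshold is the design choice that keeps such alerts from firing on brief spikes or lingering after an incident has ended.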
Not designed for
- Application performance monitoring tools like New Relic or Datadog configuration
- Writing custom Prometheus exporters or instrumenting application code
- General server monitoring without metrics-focused observability strategy
- Log analysis and parsing without correlation to metrics and traces
For best results, provide specific technical details about your monitoring infrastructure, current Prometheus/Grafana setup, scale requirements, and the observability problem you're trying to solve.
Responses include detailed technical implementations: PromQL queries, Grafana dashboard JSON, alerting rules, service discovery configurations, and architecture recommendations with specific version considerations.
Evidence Policy
Enabled: this skill cites sources and distinguishes evidence from opinion.
Research Foundation: 8 sources (6 official docs, 1 book, 1 academic)
This skill was developed through independent research and synthesis. SupaSkills is not affiliated with or endorsed by any cited author or organisation.
Version History
Initial release
Common Workflows
Production Observability Stack Setup
Complete workflow from infrastructure provisioning through observability implementation to incident response preparation
© 2026 Kill The Dragon GmbH. This skill and its system prompt are protected by copyright. Unauthorised redistribution is prohibited.