Data & Analytics · Technology · Platinum

Build reliable real-time data pipelines for critical applications.

Streaming ETL Architect

Apache Kafka, Flink, Spark Streaming, CDC

expert · v5.0

Best for

  • Building real-time fraud detection pipelines with exactly-once processing guarantees
  • Implementing CDC from PostgreSQL/MySQL to data lake with sub-second latency
  • Designing multi-region Kafka clusters with cross-datacenter replication for financial data
  • Migrating batch ETL workflows to streaming with backfill reconciliation

What you'll get

  • Detailed architecture diagram showing Kafka topics, Flink jobs, state backends, and sink connectors with specific configuration parameters
  • Step-by-step implementation guide including Docker Compose setup, schema registry configuration, and exactly-once semantics tuning
  • Comprehensive monitoring dashboard specification with SLIs/SLOs, alerting rules, and operational runbooks for common failure scenarios

Expects

Clear requirements for data sources, target sinks, latency SLAs, delivery guarantees, and scale (events/sec, data volume).

Returns

Complete streaming architecture design with technology selection, topology diagrams, configuration parameters, monitoring strategy, and operational runbooks.

What's inside

You are a Streaming ETL Architect. You design and operate production-grade real-time data pipelines handling petabytes of data across diverse domains with exactly-once semantics when required.

- **Correctness-first philosophy:** Recommend streaming only when latency <15 minutes strictly justifies it...

Covers

What You Do Differently · Methodology · Watch For

Not designed for
  • Simple batch ETL jobs that run daily or weekly
  • Basic Kafka producer/consumer applications without complex processing
  • One-off data migrations or ad-hoc data analysis
  • Small-scale data pipelines under 1,000 events/second

SupaScore

88.63

  • Research Quality (15%): 9.1
  • Prompt Engineering (25%): 8.95
  • Practical Utility (15%): 8.55
  • Completeness (10%): 8.85
  • User Satisfaction (20%): 8.9
  • Decision Usefulness (15%): 8.75
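
The headline SupaScore appears consistent with a weighted average of the six dimension scores, rescaled from a 0-10 to a 0-100 range. That formula is an inference from the numbers shown here, not something the page documents, and the helper name `weighted_score` is hypothetical. A minimal sketch under that assumption:

```python
from decimal import Decimal, ROUND_HALF_UP

# (dimension, weight in percent, score out of 10), copied from the listing.
DIMENSIONS = [
    ("Research Quality", 15, "9.1"),
    ("Prompt Engineering", 25, "8.95"),
    ("Practical Utility", 15, "8.55"),
    ("Completeness", 10, "8.85"),
    ("User Satisfaction", 20, "8.9"),
    ("Decision Usefulness", 15, "8.75"),
]


def weighted_score(dimensions):
    """Weighted mean of 0-10 scores, rescaled to 0-100 (assumed formula)."""
    total_weight = sum(weight for _, weight, _ in dimensions)
    weighted_sum = sum(Decimal(weight) * Decimal(score)
                       for _, weight, score in dimensions)
    # Mean on the 0-10 scale, then multiplied by 10 for a 0-100 scale.
    raw = weighted_sum / Decimal(total_weight) * 10
    # Decimal arithmetic avoids binary floating-point drift at the .625 boundary.
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


score = weighted_score(DIMENSIONS)
print(score)
```

The weights sum to 100%, and the unrounded result is 88.625, so the published 88.63 matches only if ties round half up; that rounding mode is also an assumption.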

Evidence Policy

Standard: no explicit evidence policy.

streaming · etl · kafka · flink · spark-streaming · cdc · debezium · real-time · event-driven · data-pipeline · exactly-once · stream-processing · schema-registry

Research Foundation: 8 sources (3 official docs, 3 books, 1 paper, 1 web)

This skill was developed through independent research and synthesis. SupaSkills is not affiliated with or endorsed by any cited author or organisation.

Version History

v5.0 · 3/25/2026

v5.5 final distill

v2.0 · 2/26/2026

Pipeline v4: rebuilt with 3 helper skills

v1.0.0 · 2/16/2026

Initial release

Common Workflows

Real-time Analytics Platform Build

Complete pipeline from event design through streaming processing to real-time dashboards with full observability

© 2026 Kill The Dragon GmbH. This skill and its system prompt are protected by copyright. Unauthorised redistribution is prohibited. Terms of Service · Legal Notice