Data & Analytics · Technology · Platinum

Explore and understand a dataset before analysis.

Exploratory Data Analysis

Pandas, Seaborn, Missingno, Tukey's EDA

intermediate · v5.0

Best for

  • Dataset profiling and structure assessment for machine learning projects
  • Statistical distribution analysis and outlier detection in customer behavior data
  • Correlation discovery and multicollinearity assessment in financial datasets
  • Missing value pattern analysis and treatment strategy development
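
As a rough illustration of the missing value pattern analysis mentioned above, the sketch below profiles per-column missingness and measures how often two columns are missing together. The function name `missing_profile` and the toy data are hypothetical and not part of the skill itself. Correlated missingness is only a hint that values may not be missing completely at random; it does not establish a mechanism.

```python
import numpy as np
import pandas as pd


def missing_profile(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column missing counts and rates, most-missing columns first."""
    counts = df.isna().sum()
    profile = pd.DataFrame({
        "n_missing": counts,
        "pct_missing": (counts / len(df) * 100).round(1),
    })
    return profile.sort_values("pct_missing", ascending=False)


# Hypothetical toy data, for illustration only.
df = pd.DataFrame({
    "age": [34, np.nan, 51, 29, np.nan],
    "income": [52000, np.nan, 61000, np.nan, 48000],
    "city": ["Berlin", "Munich", "Hamburg", "Berlin", "Cologne"],
})

profile = missing_profile(df)

# Correlation between missingness indicators (1 = missing, 0 = present).
# Values far from 0 suggest the two columns tend to be missing together.
co_missing = df[["age", "income"]].isna().astype(int).corr()

print(profile)
print(co_missing)
```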

What you'll get

  • Statistical summary tables with five-number summaries, skewness metrics, and outlier percentages for each numeric column
  • Correlation heatmaps with clustering and flagged multicollinearity pairs above threshold values
  • Missing value pattern visualizations with MCAR/MAR classification and treatment recommendations
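
A minimal sketch of what a summary table like the first item could look like in pandas. It assumes outliers are defined by Tukey's fences (1.5 × IQR beyond the quartiles); the listing does not state which outlier rule the skill uses, so that choice, the function name `numeric_summary`, and the toy data are all assumptions.

```python
import pandas as pd


def numeric_summary(df: pd.DataFrame, k: float = 1.5) -> pd.DataFrame:
    """Five-number summary, skewness, and Tukey-fence outlier share per numeric column."""
    num = df.select_dtypes(include="number")
    q1 = num.quantile(0.25)
    q3 = num.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # True where a value falls outside its column's fences (NaN compares as False).
    outside = num.lt(lower, axis="columns") | num.gt(upper, axis="columns")
    return pd.DataFrame({
        "min": num.min(),
        "q1": q1,
        "median": num.median(),
        "q3": q3,
        "max": num.max(),
        "skew": num.skew(),
        "outlier_pct": outside.sum() / num.notna().sum() * 100,
    })


# Hypothetical toy data: one extreme value in "spend".
df = pd.DataFrame({
    "spend": [10, 12, 11, 13, 12, 95],
    "visits": [1, 2, 3, 4, 5, 6],
})
summary = numeric_summary(df)
print(summary.round(2))
```

The fence multiplier `k` is exposed as a parameter because 1.5 is a convention, not a rule; heavily skewed columns may call for a different value or a transformation first.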

Expects

Raw or semi-processed tabular datasets with clear business context and specific analytical objectives.

Returns

Comprehensive statistical summaries, distribution visualizations, correlation matrices, outlier reports, and actionable data quality insights with recommended next steps.
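
To make the correlation matrix output more concrete, here is a hedged sketch of flagging highly correlated column pairs. The 0.8 threshold, the function name `flag_correlated_pairs`, and the toy data are assumptions for illustration; the listing does not say which threshold or correlation method the skill applies.

```python
import numpy as np
import pandas as pd


def flag_correlated_pairs(df: pd.DataFrame, threshold: float = 0.8) -> pd.Series:
    """Return column pairs whose absolute Pearson correlation meets the threshold."""
    corr = df.select_dtypes(include="number").corr()
    # Keep only the strict upper triangle so each pair appears once
    # and self-correlations are excluded.
    upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(upper).stack()
    flagged = pairs[pairs.abs() >= threshold]
    return flagged.sort_values(key=np.abs, ascending=False)


# Hypothetical toy data: "cost" is an exact multiple of "revenue".
df = pd.DataFrame({
    "revenue": [1.0, 2.0, 3.0, 4.0, 5.0],
    "cost": [2.0, 4.0, 6.0, 8.0, 10.0],
    "noise": [3.0, 1.0, 4.0, 1.0, 5.0],
})
flagged = flag_correlated_pairs(df)
print(flagged)
```

Pairwise correlation only catches two-variable redundancy; multicollinearity that involves three or more columns would need a different check, such as variance inflation factors.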

What's inside

You are a Senior Data Scientist and Statistician specializing in Exploratory Data Analysis. You combine Tukey's philosophy of letting data reveal its structure with modern computational tools, statistical rigor, and visualization best practices to surface patterns, anomalies, and actionable hypothes...

Covers

What You Do Differently · Methodology · Watch For
Not designed for

  • Building predictive models or machine learning algorithms
  • Creating production data pipelines or ETL workflows
  • Statistical hypothesis testing or causal inference analysis
  • Real-time data streaming analysis

SupaScore

89.3

Research Quality (15%): 8.85
Prompt Engineering (25%): 9.2
Practical Utility (15%): 8.8
Completeness (10%): 8.9
User Satisfaction (20%): 8.9
Decision Usefulness (15%): 8.75
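
The page does not state how the overall score is derived. One reading consistent with the numbers shown is a weighted mean of the six component scores, rescaled from a 10-point to a 100-point scale. The sketch below checks that assumption; it is not a documented formula.

```python
# Component weights and scores as listed; the aggregation rule is an assumption.
weights = {
    "Research Quality": 0.15,
    "Prompt Engineering": 0.25,
    "Practical Utility": 0.15,
    "Completeness": 0.10,
    "User Satisfaction": 0.20,
    "Decision Usefulness": 0.15,
}
scores = {
    "Research Quality": 8.85,
    "Prompt Engineering": 9.2,
    "Practical Utility": 8.8,
    "Completeness": 8.9,
    "User Satisfaction": 8.9,
    "Decision Usefulness": 8.75,
}

# Weighted mean on the 10-point scale, then rescaled to 100.
weighted_mean = sum(weights[name] * scores[name] for name in weights)
overall = round(weighted_mean * 10, 1)
print(overall)  # 89.3
```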

Evidence Policy

Standard: no explicit evidence policy.

eda, exploratory-data-analysis, data-profiling, statistics, pandas, seaborn, visualization, data-quality, distribution-analysis, correlation, outlier-detection, hypothesis-generation

Research Foundation: 7 sources (3 books, 3 official docs, 1 academic)

This skill was developed through independent research and synthesis. SupaSkills is not affiliated with or endorsed by any cited author or organisation.

Version History

v5.0 · 3/25/2026

v5.5 final distill

v2.0 · 2/22/2026

Pipeline v4: rebuilt with 3 helper skills

v1.0.0 · 2/16/2026

Initial release

Common Workflows

ML Pipeline Data Preparation

Complete data science workflow from initial exploration through model training

exploratory-data-analysis → Data Quality Engineer → Feature Engineering Strategist → supervised-learning-engineer

© 2026 Kill The Dragon GmbH. This skill and its system prompt are protected by copyright. Unauthorised redistribution is prohibited.