Product & Strategy · Business · Platinum

Designing scoring systems, quality rubrics, or evaluation frameworks for content, code, legal, marketing, or AI output.

Scoring System Architect

Criteria Design, Weighting (AHP), Calibration, Radar Charts, IRR Metrics

1 activation · advanced · v5.0

Best for

  • Design a multi-dimensional scoring rubric for any artifact type
  • Choose optimal criteria count, scales, and weighting methods
  • Set up inter-rater calibration with anchor examples and kappa checks
  • Build domain-specific scoring for content, code, legal, marketing, or LLM output
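The "kappa checks" referenced above can be sketched with Cohen's kappa, a standard chance-corrected agreement statistic for two raters. This is an illustrative implementation, not the skill's own code; the rater score lists are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of items both raters scored identically.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if raters assigned categories independently.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical 1-5 scores from two raters on eight anchor examples.
a = [3, 4, 4, 2, 5, 3, 4, 1]
b = [3, 4, 3, 2, 5, 3, 4, 2]
print(round(cohens_kappa(a, b), 3))  # → 0.673
```

Values above roughly 0.6 are conventionally read as substantial agreement; a calibration round would typically repeat scoring exercises until kappa clears a chosen threshold.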

What you'll get

  • Multi-dimensional scoring rubric with criteria definitions, AHP-weighted dimensions, 5-level descriptors, and aggregation formula
  • Inter-rater calibration protocol with anchor examples, scoring exercises, IRR measurement, and disagreement resolution procedures
  • Radar chart visualization spec with dimension axes, normalization method, overlay comparison, and threshold indicators for quality tiers
  • Evaluation framework comparing scoring approaches (weighted sum, geometric mean, TOPSIS) with sensitivity analysis and bias detection
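The aggregation comparison in the last bullet can be illustrated with two of the named methods. This is a minimal sketch with made-up scores and weights; the key behavioral difference is that the weighted geometric mean is non-compensatory, so a single weak dimension drags the total down more than under a weighted sum.

```python
import math

def weighted_sum(scores, weights):
    """Compensatory aggregation: strong dimensions offset weak ones."""
    return sum(s * w for s, w in zip(scores, weights))

def weighted_geometric_mean(scores, weights):
    """Non-compensatory aggregation: penalizes low outliers more heavily."""
    return math.exp(sum(w * math.log(s) for s, w in zip(scores, weights)))

scores  = [9.0, 6.0, 8.0]   # illustrative dimension scores on a 1-10 scale
weights = [0.5, 0.3, 0.2]   # weights must sum to 1

print(round(weighted_sum(scores, weights), 2))             # → 7.9
print(round(weighted_geometric_mean(scores, weights), 2))  # lower than 7.9
```

A sensitivity analysis would re-run both aggregations while perturbing weights and checking whether the artifact ranking flips.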
Expects

The domain or artifact to score, the purpose of scoring, and who will use the scores.

Returns

Complete scoring system architecture with criteria, weights, descriptors, aggregation method, calibration plan, and visualization recommendation.
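The AHP-derived weights mentioned above come from a pairwise comparison matrix; one common way to extract them is the principal eigenvector, approximated here by power iteration. The 3-criterion matrix is hypothetical, chosen only to show the shape of the computation.

```python
def ahp_weights(pairwise, iters=100):
    """Approximate principal-eigenvector weights of an AHP pairwise matrix
    via power iteration, normalized to sum to 1."""
    n = len(pairwise)
    w = [1.0 / n] * n
    for _ in range(iters):
        w = [sum(pairwise[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(w)
        w = [x / total for x in w]
    return w

# Hypothetical judgments: criterion 0 is 3x as important as criterion 1
# and 5x as important as criterion 2 (reciprocals below the diagonal).
M = [[1,   3,   5],
     [1/3, 1,   2],
     [1/5, 1/2, 1]]
print([round(x, 3) for x in ahp_weights(M)])  # largest weight on criterion 0
```

A full AHP workflow would also compute the consistency ratio of the matrix and reject judgments that are too contradictory; that step is omitted here.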

What's inside

You are the Scoring System Architect. You design structured, reproducible scoring systems for any domain, transforming subjective quality judgments into defensible, consistent, and improvable measurements. - **Purpose-first design.** You always start by defining the decision the score informs (feedback...

Covers

What You Do Differently · Methodology · Watch For · Output Format
Not designed for:
  • Implementing scoring algorithms in code
  • Statistical analysis or data science tasks

SupaScore: 90.5

  • Research Quality (15%): 9.5
  • Prompt Engineering (25%): 9.1
  • Practical Utility (15%): 9.0
  • Completeness (10%): 8.8
  • User Satisfaction (20%): 8.7
  • Decision Usefulness (15%): 9.2
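As a sanity check, the overall SupaScore appears to be the weighted sum of the six dimension scores above, scaled to 100. The numbers below are taken directly from the listing:

```python
# Dimension: (score on a 1-10 scale, weight) as shown in the listing.
dims = {
    "Research Quality":    (9.5, 0.15),
    "Prompt Engineering":  (9.1, 0.25),
    "Practical Utility":   (9.0, 0.15),
    "Completeness":        (8.8, 0.10),
    "User Satisfaction":   (8.7, 0.20),
    "Decision Usefulness": (9.2, 0.15),
}
supascore = 10 * sum(score * weight for score, weight in dims.values())
print(round(supascore, 1))  # → 90.5
```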

Evidence Policy

Standard: no explicit evidence policy.

scoring · rubric · quality · evaluation · metrics · kpi · radar-chart · calibration · inter-rater · mcda · weighting · assessment

Research Foundation: 8 sources (5 academic, 2 industry frameworks, 1 web)

This skill was developed through independent research and synthesis. SupaSkills is not affiliated with or endorsed by any cited author or organisation.

Version History

v5.0 · 3/25/2026

v5.5 distilled from v2 via Claude Sonnet

v1.0.0 · 3/12/2026

Initial release: scoring system design with MCDA, IRR, calibration, radar charts

© 2026 Kill The Dragon GmbH. This skill and its system prompt are protected by copyright. Unauthorised redistribution is prohibited. Terms of Service · Legal Notice