Performance · benchmark · ecosystem · quality

The Ecosystem Audit: Scoring 167 Community Agent Skills

Max Jürschik · March 16, 2026 · 8 min read

The Claude Code skill ecosystem is growing fast. Over 71,000 skills listed on SkillsMP alone. Vercel, Microsoft, HashiCorp, Cloudflare, Stripe, Sentry, Hugging Face, and dozens of smaller contributors are all shipping skills. The supply is real. The question is whether anyone is checking the quality.

We scored 167 community skills from 40+ organisations using the same six-dimension SupaScore rubric we apply to our own 1,278 published skills. Same weights, same criteria, same threshold.

Not a single community skill reached our 80.0 publishing threshold.

The sample

167 skills from a broad cross-section of the ecosystem. Corporate contributors like Vercel, Microsoft, HashiCorp, Cloudflare, Stripe, Sentry, and Hugging Face. Platform-focused builders like Expo and Supabase. Prolific independent contributors like softaworks (42 skills), TerminalSkills (24), obra, ComposioHQ, and machina-sports. Curated collections from awesome-skills and similar aggregators.

We excluded trivially broken skills (empty files, syntax errors) and focused on skills that were clearly intended for real use. If someone published it, we scored it.

Tier distribution

| Tier | Count | Share | Score range |
| --- | --- | --- | --- |
| Platinum (85+) | 0 | 0% | n/a |
| Gold (70-84) | 14 | 8.4% | 70.0-79.5 |
| Silver (60-69) | 37 | 22.2% | 60.0-69.9 |
| Bronze (under 60) | 116 | 69.5% | 25.0-59.9 |
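
The tier cut-offs in the table can be written as a small lookup. This is a minimal sketch in Python, not part of SupaScore itself: the function name `tier` is illustrative, and treating each band's lower bound as inclusive (so 84.9 would still be Gold) is an assumption, since the table only lists whole-number bands.

```python
def tier(score: float) -> str:
    """Map a 0-100 score to a tier name.

    Cut-offs follow the tier table; inclusive lower bounds are an assumption.
    """
    if score >= 85.0:
        return "Platinum"
    if score >= 70.0:
        return "Gold"
    if score >= 60.0:
        return "Silver"
    return "Bronze"


# Scores taken from the full results list in this post.
for name, score in [("Move Code Quality", 79.5),
                    ("manim-skill", 69.0),
                    ("skill-creator", 59.5)]:
    print(f"{name}: {score} -> {tier(score)}")
```

Under these cut-offs a score of 79.5 is Gold, which is why the best community skill sits one tier and half a point below the 80.0 publishing threshold.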

Community average: 54.6. SupaSkills average: 86.7. A 32-point gap.

For context: our quality gate requires a minimum score of 80.0, at least 6 research sources across 2+ source types, and a passed masterfile from our 8-phase pipeline. 98% of our 1,278 skills are Platinum tier. Zero community skills reached Platinum. 14 reached Gold.
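
The three gate conditions above can be sketched as a single check. This is a minimal illustration in Python, not the actual SupaSkills implementation: the `Source` and `Candidate` classes, the `gate_failures` function, and the source-type labels are hypothetical names introduced here. Only the thresholds (score of 80.0, 6 sources, 2 source types, a passed masterfile) come from the text.

```python
from __future__ import annotations

from dataclasses import dataclass

MIN_SCORE = 80.0        # publishing threshold stated in the post
MIN_SOURCES = 6         # minimum number of research sources
MIN_SOURCE_TYPES = 2    # minimum number of distinct source types


@dataclass
class Source:
    title: str
    kind: str  # hypothetical labels, e.g. "book", "paper", "standard"


@dataclass
class Candidate:
    score: float
    sources: list[Source]
    masterfile_passed: bool


def gate_failures(c: Candidate) -> list[str]:
    """Return the reasons a candidate fails the gate (empty list = pass)."""
    failures = []
    if c.score < MIN_SCORE:
        failures.append(f"score {c.score:.1f} is below {MIN_SCORE:.1f}")
    if len(c.sources) < MIN_SOURCES:
        failures.append(f"{len(c.sources)} sources, need {MIN_SOURCES}")
    if len({s.kind for s in c.sources}) < MIN_SOURCE_TYPES:
        failures.append(f"fewer than {MIN_SOURCE_TYPES} source types")
    if not c.masterfile_passed:
        failures.append("masterfile not passed")
    return failures


# A skill with a strong score but no cited sources still fails the gate.
uncited = Candidate(score=79.5, sources=[], masterfile_passed=False)
print(gate_failures(uncited))
```

Returning the list of reasons rather than a bare boolean makes it visible which condition blocked a skill, which matters when most failures come from the source requirements rather than the score.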

The best of the community

The highest score in the entire sample: Move Code Quality by 1NickPappas, at 79.5 (tied with Robot Perception by arpitg1304 in the full results). A focused skill for Move blockchain development with real domain expertise. It falls just short of our publishing threshold, but it is genuinely well-built.

The pattern is consistent. The best community skills come from domain experts who know their subject deeply:

  • Move Code Quality (79.5, Gold). Blockchain-specific, technically precise, clear scope.
  • Robotics skills from specialised contributors. Narrow focus, real engineering knowledge.
  • Email marketing bible. Practical, opinionated, grounded in actual campaign experience.

What these have in common: a person who knows the domain wrote them by hand, with real constraints and real opinions. They are not generated from documentation.

Where the scores break down

Research Quality is universally weak

The weakest dimension across all 167 skills. Most cite zero sources. No academic references, no industry frameworks, no books, no standards documents. The skill is the author's knowledge, unverified and unattributed.

This is the single largest factor in the score gap. Our pipeline requires a minimum of 6 sources across 2+ types (books, academic papers, industry frameworks, standards, etc.). That requirement alone accounts for roughly a third of the difference.

To be fair: some authors may have done extensive research without citing it in the skill file. We cannot know. So we re-ran the entire audit without the Research Quality dimension, redistributing its 15% weight proportionally across the remaining five dimensions. The result: the community average rises from 55.5 to 58.1. The gap narrows from 31 points to 28. Two skills cross our 80.0 threshold (agent-browser by Vercel at 81.1 and Obsidian Agent Memory at 80.5). Zero reach Platinum. The structural gap persists across all dimensions, not just research.
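
The re-weighting described above can be sketched as follows. This is a hedged illustration in Python rather than the actual scoring code: the weights are the ones listed in the rubric note at the end of this post, but the function names, the 0-10 scale per dimension, the multiplication by 10 to reach a 0-100 score, and the example dimension scores are all assumptions made for the sketch.

```python
# Rubric weights as listed at the end of the post (they sum to 1.0).
WEIGHTS = {
    "research_quality": 0.15,
    "prompt_engineering": 0.25,
    "practical_utility": 0.15,
    "completeness": 0.10,
    "user_satisfaction": 0.20,
    "decision_usefulness": 0.15,
}


def drop_dimension(weights: dict[str, float], dropped: str) -> dict[str, float]:
    """Remove one dimension and rescale the rest proportionally to sum to 1."""
    remaining = {k: w for k, w in weights.items() if k != dropped}
    total = sum(remaining.values())
    return {k: w / total for k, w in remaining.items()}


def weighted_score(dims: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of 0-10 dimension scores, scaled to 0-100 (assumed)."""
    return 10.0 * sum(w * dims[k] for k, w in weights.items())


# Hypothetical skill: solid everywhere except research (2/10).
example = {
    "research_quality": 2.0,
    "prompt_engineering": 8.0,
    "practical_utility": 7.0,
    "completeness": 6.0,
    "user_satisfaction": 7.0,
    "decision_usefulness": 6.0,
}

no_research = drop_dimension(WEIGHTS, "research_quality")
print(f"with research:    {weighted_score(example, WEIGHTS):.1f}")
print(f"without research: {weighted_score(example, no_research):.1f}")
```

For this made-up skill the score moves from 62.5 to 70.0 once the research dimension is removed. Proportional rescaling divides each remaining weight by 0.85, so the relative importance of the other five dimensions is unchanged.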

Many skills are documentation snippets

A common pattern: take an API reference or framework documentation, paste it into a skill file, add a brief instruction header. Claude already has access to this information through its training data. The skill adds little that a well-written prompt would not.

softaworks (42 skills) and TerminalSkills (24 skills) are the most prolific contributors in our sample. Both score consistently in the Bronze range. Volume without a quality gate produces volume.

Corporate skills are better structured, still shallow

Skills from Vercel, Microsoft, and similar organisations tend to have cleaner prompt engineering, better formatting, and more thoughtful instruction design. They score higher on Prompt Engineering and Completeness.

But they still lack research depth. A Vercel skill for Next.js deployment is well-written, but it does not cite the Next.js documentation, performance benchmarks, or deployment best practices from external sources. It is the author's knowledge in prompt form, same as the independents.

Practical Utility is the strongest dimension

This is encouraging. Many community skills solve real workflow problems. Deployment automation, code review checklists, framework-specific patterns. The intent is practical. The skills want to be useful.

The issue is not ambition. It is infrastructure.

The structural gap

The 32-point gap between community average (54.6) and our average (86.7) is not about individual talent. Some community skill authors clearly know their domains well. The gap is structural.

| What's missing | Effect on score |
| --- | --- |
| No research sources | Research Quality collapses (avg ~2/10) |
| No quality gate | Bronze skills ship alongside Gold |
| No pipeline | Inconsistent structure, missing governance |
| No versioning | No way to improve iteratively |
| No IP audit | Unknown copyright/trademark exposure |
| No peer review | Single-author blind spots persist |

Our 8-phase pipeline exists because individual expertise is not enough. Its phases include Expert Council, Deep Research, Quality Gate, IP Audit, and Cross-Validation. Each phase catches problems the others miss. The result is not that our authors are smarter. The result is that the pipeline catches more.

The security dimension

This is not just about quality scores. A recent arXiv study (Liu et al., 2026, arXiv:2601.10338) found that 26.1% of community skills contain security vulnerabilities: prompt injection vectors, unsafe tool access patterns, missing input validation.

We run automated IP audits on every skill (1,278/1,278 audited, zero high-risk findings). We run a daily security scan with 25 checks. We have a canary skill that tests our delivery guard every 24 hours.

Most community skills have none of this. The ecosystem is growing faster than its safety infrastructure.

What we learned

Quality infrastructure makes a measurable difference. Not a marginal one. A 32-point difference, with the community average in Bronze and ours in Platinum. The best community skill (79.5) would not pass our quality gate (80.0). That is not a coincidence. The gate exists precisely at the boundary where research depth, governance, and structured evaluation start to matter.

The best skills come from domain experts, not aggregators. Move blockchain, robotics, email marketing. Narrow focus, real knowledge, clear opinions. Aggregators who publish 20+ skills at once produce Bronze consistently.

The ecosystem needs quality infrastructure. 71,000+ skills on SkillsMP. 26.1% with security vulnerabilities. No scoring, no quality gates, no IP audits. The supply side is solved. The quality side is wide open.

The numbers

| Metric | Community (167) | SupaSkills (1,278) |
| --- | --- | --- |
| Average score | 54.6 | 86.7 |
| Platinum tier | 0 (0%) | 1,252 (98%) |
| Gold tier | 14 (8.4%) | 26 (2%) |
| Bronze tier | 116 (69.5%) | 0 (0%) |
| Highest score | 79.5 | 92.05 |
| Min sources required | 0 | 6 |
| IP audit coverage | 0% | 100% |
| Quality gate | None | 80.0 minimum |
| Security scanning | None | 25 daily checks |

We built SupaSkills because we believe skills are the most important layer on top of Claude's already strong foundation. Claude is excellent at reasoning, coding, and analysis. Skills add domain structure, research depth, and specialised frameworks that make that foundation even more effective.

But skills only add value if they are good. The ecosystem audit suggests most are not there yet. The good news: the path from Bronze to Platinum is well understood. It is research, structure, and a quality gate that does not let anything through until it is ready.

We built that infrastructure. It is open for anyone who wants to use it.


Methodology: 167 skills scored using SupaScore (6 dimensions, weighted formula). Sources: GitHub repositories and skill directories from 40+ organisations. Scoring performed March 2026. All scores use the same rubric applied to our own skills. The full dataset is available on request.


Full Results: All 167 Skills Scored

Transparency matters. Here is every skill we tested, with author and score.

| # | Skill | Author | Score | Tier |
| --- | --- | --- | --- | --- |
| 1 | Move Code Quality | 1NickPappas | 79.5 | Gold |
| 2 | Robot Perception | arpitg1304 | 79.5 | Gold |
| 3 | ROS2 Development | arpitg1304 | 78.5 | Gold |
| 4 | humanizer | blader | 77.0 | Gold |
| 5 | agent-browser | vercel-labs | 76.5 | Gold |
| 6 | Obsidian Agent Memory | AdamTylerLynch | 76.0 | Gold |
| 7 | Robotics Security | arpitg1304 | 75.5 | Gold |
| 8 | email-marketing-bible | CosmoBlk | 74.5 | Gold |
| 9 | mermaid-syntax-skill | awesome-skills | 74.0 | Gold |
| 10 | Robotics Design Patterns | arpitg1304 | 73.0 | Gold |
| 11 | Family History Research | emaynard | 71.5 | Gold |
| 12 | skill-judge | softaworks | 71.5 | Gold |
| 13 | Terraform IaC | TerminalSkills | 71.0 | Gold |
| 14 | Security Audit | TerminalSkills | 70.0 | Gold |
| 15 | manim-skill | awesome-skills | 69.0 | Silver |
| 16 | iOS Simulator | conorluddy | 68.5 | Silver |
| 17 | database-schema-designer | softaworks | 68.0 | Silver |
| 18 | 5-whys-skill | awesome-skills | 67.0 | Silver |
| 19 | Sports Betting | machina-sports | 67.0 | Silver |
| 20 | Defold Scripts Editing | indiesoftby | 66.5 | Silver |
| 21 | Kaizen | NeoLabHQ | 66.5 | Silver |
| 22 | c4-architecture | softaworks | 66.5 | Silver |
| 23 | qa-test-planner | softaworks | 66.5 | Silver |
| 24 | Stream Coding Methodology | frmoretto | 66.0 | Silver |
| 25 | Creative Director | smixs | 65.5 | Silver |
| 26 | ipsw iOS/macOS Security Research | blacktop | 65.0 | Silver |
| 27 | writing-skills | obra | 65.0 | Silver |
| 28 | RealityKit visionOS Developer | tomkrikorian | 65.0 | Silver |
| 29 | tailored-resume-generator | ComposioHQ | 64.5 | Silver |
| 30 | Polymarket | machina-sports | 64.0 | Silver |
| 31 | receiving-code-review | obra | 64.0 | Silver |
| 32 | subagent-driven-development | obra | 64.0 | Silver |
| 33 | first-principles-skill | awesome-skills | 63.5 | Silver |
| 34 | Stripe Billing | TerminalSkills | 63.5 | Silver |
| 35 | react-dev | softaworks | 63.0 | Silver |
| 36 | Playwright Browser Automation | lackeyjb | 62.5 | Silver |
| 37 | FastF1 Formula 1 Data | machina-sports | 62.5 | Silver |
| 38 | NFL Data | machina-sports | 62.5 | Silver |
| 39 | requirements-clarity | softaworks | 62.5 | Silver |
| 40 | supabase-best-practices | supabase | 62.5 | Silver |
| 41 | ARKit visionOS Developer | tomkrikorian | 62.5 | Silver |
| 42 | react-best-practices | vercel-labs | 62.5 | Silver |
| 43 | Unblock Action (Tapestry) | michalparkola | 62.0 | Silver |
| 44 | daily-meeting-update | softaworks | 62.0 | Silver |
| 45 | twitter-algorithm-optimizer | ComposioHQ | 61.5 | Silver |
| 46 | verification-before-completion | obra | 61.5 | Silver |
| 47 | beautiful_prose | SHADOWPR0 | 61.5 | Silver |
| 48 | FFmpeg | TerminalSkills | 61.5 | Silver |
| 49 | Prediction Markets | TerminalSkills | 61.5 | Silver |
| 50 | Prompt Engineering (NeoLab) | NeoLabHQ | 61.0 | Silver |
| 51 | lesson-learned | softaworks | 61.0 | Silver |
| 52 | brainstorming | obra | 60.5 | Silver |
| 53 | feedback-mastery | softaworks | 60.5 | Silver |
| 54 | mobile-app-design | awesome-skills | 60.0 | Silver |
| 55 | PyPICT Testing | omkamal | 60.0 | Silver |
| 56 | difficult-workplace-conversations | softaworks | 60.0 | Silver |
| 57 | skill-creator | ComposioHQ | 59.5 | Bronze |
| 58 | marp-slide | softaworks | 59.5 | Bronze |
| 59 | openapi-to-typescript | softaworks | 59.5 | Bronze |
| 60 | Unity Game Engine | TerminalSkills | 59.5 | Bronze |
| 61 | code-review-skill | awesome-skills | 59.0 | Bronze |
| 62 | Web Accessibility | KreerC | 59.0 | Bronze |
| 63 | Subagent Driven Development (NeoLab) | NeoLabHQ | 59.0 | Bronze |
| 64 | Spatial SwiftUI Developer | tomkrikorian | 59.0 | Bronze |
| 65 | langsmith-fetch | ComposioHQ | 58.5 | Bronze |
| 66 | Scrum Sage (Tapestry) | michalparkola | 58.5 | Bronze |
| 67 | mui | softaworks | 58.5 | Bronze |
| 68 | professional-communication | softaworks | 58.5 | Bronze |
| 69 | Blender Scripting | TerminalSkills | 58.5 | Bronze |
| 70 | TensorFlow | TerminalSkills | 58.5 | Bronze |
| 71 | design-system-starter | softaworks | 58.0 | Bronze |
| 72 | Elasticsearch | TerminalSkills | 58.0 | Bronze |
| 73 | mermaid-diagrams | softaworks | 57.5 | Bronze |
| 74 | ship-learn-next | softaworks | 57.5 | Bronze |
| 75 | OpenCV Computer Vision | TerminalSkills | 57.5 | Bronze |
| 76 | content-research-writer | ComposioHQ | 57.0 | Bronze |
| 77 | slack-gif-creator | ComposioHQ | 57.0 | Bronze |
| 78 | backend-to-frontend-handoff-docs | softaworks | 57.0 | Bronze |
| 79 | Godot Game Engine | TerminalSkills | 57.0 | Bronze |
| 80 | meeting-insights-analyzer | ComposioHQ | 56.5 | Bronze |
| 81 | dispatching-parallel-agents | obra | 56.5 | Bronze |
| 82 | agent-md-refactor | softaworks | 56.5 | Bronze |
| 83 | command-creator | softaworks | 56.5 | Bronze |
| 84 | dependency-updater | softaworks | 56.5 | Bronze |
| 85 | gepetto | softaworks | 56.5 | Bronze |
| 86 | naming-analyzer | softaworks | 56.5 | Bronze |
| 87 | Contract Review | TerminalSkills | 56.5 | Bronze |
| 88 | GDPR Compliance | TerminalSkills | 56.5 | Bronze |
| 89 | Kubernetes Helm | TerminalSkills | 56.5 | Bronze |
| 90 | ClawSec Security Scanner | prompt-security | 56.0 | Bronze |
| 91 | writing-clearly-and-concisely | softaworks | 56.0 | Bronze |
| 92 | Unreal Engine | TerminalSkills | 56.0 | Bronze |
| 93 | invoice-organizer | ComposioHQ | 55.5 | Bronze |
| 94 | react-useeffect | softaworks | 55.5 | Bronze |
| 95 | session-handoff | softaworks | 55.5 | Bronze |
| 96 | webapp-testing | ComposioHQ | 55.0 | Bronze |
| 97 | Ship Learn Next (Tapestry) | michalparkola | 55.0 | Bronze |
| 98 | AI Guardrails | TerminalSkills | 55.0 | Bronze |
| 99 | PyTorch | TerminalSkills | 55.0 | Bronze |
| 100 | ShaderGraph Editor | tomkrikorian | 55.0 | Bronze |
| 101 | CSV Data Summarizer | coffeefuelbump | 54.5 | Bronze |
| 102 | YouTube Transcript | michalparkola | 54.5 | Bronze |
| 103 | Session Log (Tapestry) | michalparkola | 54.0 | Bronze |
| 104 | using-git-worktrees | obra | 54.0 | Bronze |
| 105 | writing-plans | obra | 54.0 | Bronze |
| 106 | file-organizer | ComposioHQ | 53.5 | Bronze |
| 107 | Article Extractor | michalparkola | 53.5 | Bronze |
| 108 | Learn This (Tapestry) | michalparkola | 53.5 | Bronze |
| 109 | canvas-design | ComposioHQ | 53.0 | Bronze |
| 110 | Pomodoro System Skill | jakedahn | 53.0 | Bronze |
| 111 | react-native-skills | vercel-labs | 52.5 | Bronze |
| 112 | competitive-ads-extractor | ComposioHQ | 51.5 | Bronze |
| 113 | jira | softaworks | 51.5 | Bronze |
| 114 | Sensei Skill Improver | spboyer | 51.5 | Bronze |
| 115 | GraphQL | TerminalSkills | 51.5 | Bronze |
| 116 | gRPC | TerminalSkills | 51.5 | Bronze |
| 117 | Markdown Exporter | bowenliang123 | 51.0 | Bronze |
| 118 | domain-name-brainstormer | softaworks | 50.5 | Bronze |
| 119 | composition-patterns | vercel-labs | 50.5 | Bronze |
| 120 | gemini | softaworks | 50.0 | Bronze |
| 121 | Algorithmic Trading | TerminalSkills | 49.5 | Bronze |
| 122 | AWS Agentic AI | zxkane | 49.0 | Bronze |
| 123 | changelog-generator | ComposioHQ | 48.0 | Bronze |
| 124 | reducing-entropy | softaworks | 48.0 | Bronze |
| 125 | frontend-to-backend-requirements | softaworks | 47.5 | Bronze |
| 126 | crafting-effective-readmes | softaworks | 47.0 | Bronze |
| 127 | recall | arjunkmrm | 46.5 | Bronze |
| 128 | lead-research-assistant | ComposioHQ | 46.5 | Bronze |
| 129 | remotion-best-practices | remotion-dev | 46.5 | Bronze |
| 130 | Solidity Smart Contracts | TerminalSkills | 46.5 | Bronze |
| 131 | ClawSec Security Suite | prompt-security | 46.0 | Bronze |
| 132 | game-changing-features | softaworks | 46.0 | Bronze |
| 133 | Soul Guardian | prompt-security | 45.5 | Bronze |
| 134 | draw-io | softaworks | 45.5 | Bronze |
| 135 | excalidraw | softaworks | 45.5 | Bronze |
| 136 | Data Analysis | TerminalSkills | 45.5 | Bronze |
| 137 | Sixtyfour People Intelligence API | rxhxm | 45.0 | Bronze |
| 138 | commit-work | softaworks | 44.5 | Bronze |
| 139 | Apache Kafka | TerminalSkills | 44.5 | Bronze |
| 140 | Jules (Gemini Agent Delegation) | sanjay3290 | 44.0 | Bronze |
| 141 | Postgres | sanjay3290 | 44.0 | Bronze |
| 142 | Markdown to EPUB Converter | smerchek | 44.0 | Bronze |
| 143 | connect | ComposioHQ | 43.5 | Bronze |
| 144 | Apache Spark | TerminalSkills | 43.5 | Bronze |
| 145 | Review Implementing | mhattingpete | 43.0 | Bronze |
| 146 | executing-plans | obra | 43.0 | Bronze |
| 147 | requesting-code-review | obra | 43.0 | Bronze |
| 148 | datadog-cli | softaworks | 43.0 | Bronze |
| 149 | plugin-forge | softaworks | 43.0 | Bronze |
| 150 | Software Architecture (NeoLab) | NeoLabHQ | 42.5 | Bronze |
| 151 | perplexity | softaworks | 41.5 | Bronze |
| 152 | brand-guidelines | ComposioHQ | 40.5 | Bronze |
| 153 | theme-factory | ComposioHQ | 39.5 | Bronze |
| 154 | Outline Wiki | sanjay3290 | 39.5 | Bronze |
| 155 | codex | softaworks | 39.5 | Bronze |
| 156 | NotebookLM Integration | PleasePrompto | 39.0 | Bronze |
| 157 | meme-factory | softaworks | 39.0 | Bronze |
| 158 | web-to-markdown | softaworks | 38.5 | Bronze |
| 159 | raffle-winner-picker | ComposioHQ | 38.0 | Bronze |
| 160 | using-superpowers | obra | 38.0 | Bronze |
| 161 | Prompt Agent | prompt-security | 38.0 | Bronze |
| 162 | internal-comms | ComposioHQ | 37.5 | Bronze |
| 163 | Reddit Fetch | ykdojo | 37.5 | Bronze |
| 164 | Test Fixing | mhattingpete | 36.0 | Bronze |
| 165 | Deep Research (Gemini) | sanjay3290 | 34.5 | Bronze |
| 166 | Imagen (Gemini) | sanjay3290 | 25.5 | Bronze |
| 167 | image-enhancer | ComposioHQ | 25.0 | Bronze |

All skills scored using SupaScore 6D rubric (Research Quality 15%, Prompt Engineering 25%, Practical Utility 15%, Completeness 10%, User Satisfaction 20%, Decision Usefulness 15%). Same rubric applied to all 1,278 SupaSkills.


Disclaimer: This benchmark was conducted in good faith using a documented, reproducible methodology. The same rubric and scoring process are applied to our own skills. Errors are possible. If you are a skill author and believe your skill was scored unfairly, miscategorised, or evaluated based on an outdated version, please contact us. We are happy to re-score with an updated version, correct any factual errors, or discuss methodology. The goal of this audit is to improve the ecosystem, not to diminish anyone's work. The security vulnerability statistic (26.1%) is sourced from Liu et al., 2026 (arXiv:2601.10338), an independent academic study.