How We Score 50,000 GitHub Repos to Stock the Catalog
The AgentDepot catalog refreshes every morning from GitHub. Here's the exact scoring model we use to decide which 90 skills make the shelf — and what disqualifies a repo immediately.
There are tens of thousands of AI agent repositories on GitHub. Most of them are demos, experiments, or weekend projects that will never see production. About 3% of them are actually production-worthy. Our job is to find that 3%.
The daily crawl
Every morning at 4:00 AM UTC, our crawler scans GitHub for repositories matching a curated set of agent-related criteria — framework signals, topic tags, file structure patterns. We collect roughly 50,000 repos per run. From there, every repo goes through our 100-point scoring model.
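The candidate filter can be sketched roughly like this — a minimal illustration, assuming each repo record carries `topics` and top-level `files`; the signal sets here are hypothetical stand-ins, not our real curated list:

```python
# Hypothetical signal sets -- the real curated criteria are broader.
AGENT_TOPICS = {"ai-agent", "llm-agent", "autonomous-agents"}
FRAMEWORK_FILES = {"agent.py", "agent.yaml", "langgraph.json"}

def is_candidate(repo: dict) -> bool:
    """A repo enters scoring if it shows at least one agent signal."""
    has_topic = bool(AGENT_TOPICS & set(repo.get("topics", [])))
    has_file = bool(FRAMEWORK_FILES & set(repo.get("files", [])))
    return has_topic or has_file
```

In practice the crawl combines several such signals; a single topic tag alone is a weak indicator, which is why everything that passes still goes through full scoring.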
The four scoring dimensions
- Maintenance Activity (25 pts): Commit frequency over the last 90 days, open issue response time, PR merge rate. A repo that hasn't been touched in six months gets a zero here — bugs pile up, dependencies rot, it stops working.
- Documentation Quality (25 pts): README completeness, inline code comments, example configurations, and deployment guides. If a non-technical user can't understand what this agent does in 60 seconds, it doesn't belong in a marketplace.
- Production Readiness (25 pts): Test coverage percentage, error handling patterns, environment variable hygiene, graceful degradation. We specifically look for hardcoded secrets, missing try/catch blocks, and SQL injection vectors.
- AgentCore Compatibility (25 pts): AWS Bedrock AgentCore schema adherence, IAM permission posture, cold-start profile, and response format compliance. A skill that's beautiful but can't run on AgentCore is useless to us.
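The model itself is just four capped dimensions summing to 100. A minimal sketch (field names are ours for illustration; they aren't a published schema):

```python
from dataclasses import dataclass

@dataclass
class Score:
    """Four dimensions, 25 points each, summing to a 100-point total."""
    maintenance: int    # commit frequency, issue response, PR merge rate
    documentation: int  # README, comments, examples, deployment guides
    production: int     # tests, error handling, secret hygiene
    agentcore: int      # schema adherence, IAM, cold start, response format

    @property
    def total(self) -> int:
        parts = (self.maintenance, self.documentation,
                 self.production, self.agentcore)
        if any(not 0 <= p <= 25 for p in parts):
            raise ValueError("each dimension is scored 0-25")
        return sum(parts)
```

So a catalog score of 87 decomposes exactly — e.g. 22 + 23 + 20 + 22 — rather than being a fuzzy aggregate.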
What gets disqualified immediately
Some signals are automatic disqualifiers regardless of overall score. A hardcoded API key in source code means an immediate zero — we won't ship a security liability. A license that doesn't permit commercial use gets removed. Repos with active CVEs that haven't been patched in 30 days are pulled. And anything scoring below 60 overall doesn't make the shelf.
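Those hard rules run before ranking. A hedged sketch of the gate, assuming boolean audit flags on each repo record (the field names are illustrative):

```python
def is_disqualified(repo: dict) -> bool:
    """Immediate removal, regardless of how well the repo scores otherwise."""
    if repo.get("hardcoded_secret"):                  # API key in source
        return True
    if not repo.get("commercial_license", True):      # license blocks commercial use
        return True
    if repo.get("cve_unpatched_days", 0) > 30:        # active CVE, stale > 30 days
        return True
    return repo.get("total_score", 0) < 60            # below the 60-point floor
```

Note the ordering: a 95-point repo with a committed API key still fails — the gate is checked before the score matters.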
The top 90
After scoring, we rank the candidates and select the top 90 across the nine function categories — marketing, sales, customer service, operations, finance, content, dev, HR, and design. We try to maintain roughly equal distribution across categories, with some flexibility based on what GitHub is actually producing in any given week.
A skill that was in the catalog yesterday can be removed today if its maintenance score drops. The catalog is a live ranking, not a permanent list.
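The selection step reduces to a per-category top-N cut. This sketch shows the strict 10-per-category version; in reality, as noted above, the distribution flexes with what GitHub produces each week, and the field names are illustrative:

```python
from collections import defaultdict

CATEGORIES = ["marketing", "sales", "customer-service", "operations",
              "finance", "content", "dev", "hr", "design"]
PER_CATEGORY = 90 // len(CATEGORIES)  # 10 per category

def select_shelf(scored: list[dict]) -> list[dict]:
    """Rank qualifying repos within each category; keep the top 10 of each."""
    by_cat: dict[str, list[dict]] = defaultdict(list)
    for repo in scored:
        by_cat[repo["category"]].append(repo)
    shelf = []
    for cat in CATEGORIES:
        ranked = sorted(by_cat[cat], key=lambda r: r["total"], reverse=True)
        shelf.extend(ranked[:PER_CATEGORY])
    return shelf
```

Because this runs against fresh scores every morning, yesterday's rank 10 in a category is displaced the moment something scores higher — which is exactly the "live ranking" behavior described above.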
Why this matters for you
If you deploy a skill from AgentDepot, you're not gambling on an experiment. You're deploying the top-ranked production agent in its category, validated this morning against current GitHub state. When you see a quality score of 87, that number is specific: 22/25 maintenance, 23/25 documentation, 20/25 production readiness, 22/25 AgentCore compatibility.