GitHub Trending Top 10: The agent infrastructure stack takes shape (May 8–15)
Eight of this week's ten trending GitHub repos serve the AI agent ecosystem. Each gets a ~200-word breakdown: problem solved, tech stack, differentiation, and a clear star/skip verdict.
Research Brief
Eight of this week's ten trending repositories directly serve the AI coding agent ecosystem. That's not a coincidence — it's a picture of where developer tooling energy is concentrating. Skills frameworks, memory layers, orchestration platforms, terminal agents, proxy routers: each project occupies a distinct slot in a stack that barely existed twelve months ago. Two outliers — a stealth browser and a content monetization platform — round out the list.
Entries below follow GitHub Trending's ranking order for the May 8–15 window. Weekly star gain is included for each repo as a secondary signal.
#1 · mattpocock/skills — 83,331 stars · +17,059 this week
Problem solved: AI coding agents fail in four predictable ways — they misalign with what the developer actually wants, produce verbose output that bloats context, ship non-working code, and gradually degrade a codebase's architecture. [1]
Stack and approach: Pure Markdown. The repository is Matt Pocock's (creator of Total TypeScript) personal .claude directory, published as open source. Install is one command: npx skills@latest add mattpocock/skills. The 14+ skills work with any coding agent — Claude Code, Codex, Cursor — and are model-agnostic by design. [1]
Differentiation: Pocock's explicit argument is that heavyweight frameworks like GSD, BMAD, and Spec-Kit "take away your control and make bugs in the process hard to resolve." [1] His answer is composability: small, independently understandable files you can read and adapt in minutes. The caveman skill was independently tested at cutting Claude Opus 4.7 prose tokens by approximately 71%. [1] The grill-with-docs skill builds a shared domain vocabulary between developer and agent — the closest thing here to a genuine technique rather than a workflow wrapper. A Towards AI analysis described the repo's 83K star count as stemming from an "embarrassingly simple" proposition: skills are just Markdown. [2]
Verdict: ⭐ Star it. The highest-signal repo of the week. Even if you use none of the skills directly, reading the source files shows you what a well-specified agent instruction looks like.

#2 · addyosmani/agent-skills — 41,746 stars · +9,198 this week
Problem solved: Coding agents take the shortest path to completion and skip production-grade practices — specs, tests, security reviews, accessibility checks. [3]
Stack and approach: Also pure Markdown (Shell 100%), but Addy Osmani (Google Chrome engineering lead) brings a more institutional lens. The repo ships 23 skills organized as a full software development lifecycle: Define → Plan → Build → Verify → Review → Ship. Seven slash commands (/spec, /plan, /build, /test, /review, /code-simplify, /ship) activate the corresponding skills automatically. Three pre-configured specialist personas — code-reviewer, test-engineer, security-auditor — handle common delegation patterns. [3]
Differentiation: Each skill follows a strict anatomy: trigger conditions → process steps → anti-rationalization (which actively counters excuses for skipping best practices) → mandatory verification (requires concrete evidence, not "seems right"). [3] The anti-rationalization sections are what distinguish this from mattpocock/skills — they encode institutional engineering culture rather than individual workflow hacks. The known friction point: Issue #173 identifies routing ambiguity where the code-reviewer persona and the code-review-and-quality skill compete for the same natural-language intent. [4]
Verdict: ⭐ Star it. Treat mattpocock/skills and addyosmani/agent-skills as complementary. Matt's repo = tactical, personal; Addy's = process, team.

#3 · Hmbown/DeepSeek-TUI — 29,406 stars · +11,303 this week
Problem solved: Claude Code is Node-based, expensive at scale, and locks you into Anthropic's models. Developers running heavy agentic workloads on DeepSeek V4 (deepseek-v4-pro / deepseek-v4-flash) have had no purpose-built terminal agent. [5]
Stack and approach: Rust (94.9%) binary, ~12MB idle RAM. The key architectural feature is RLM parallel sub-agents: up to 16 V4-Flash children run concurrently, each at $0.14/M input tokens, fronting analysis before committing a single V4-Pro call at $0.435/M. Full LSP diagnostics (rust-analyzer, pyright, typescript-language-server), OS-level sandbox (Seatbelt/Landlock/Job Objects), and session rollback are included. Skills from .claude/skills work out of the box. [5]
Differentiation: Verdent AI's analysis called the RLM parallel approach "a genuine differentiator for batched workloads" — front-loading cheap Flash analysis is economically sound. [6] The model lock-in, however, is explicit: if DeepSeek V4 pricing or availability changes, there is no migration path. The 75% Pro discount expires May 31, 2026, which affects the cost math directly. [5]
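The fan-out economics can be sketched in a few lines. The per-token prices come from the article; the token counts (50K tokens per Flash child, a 20K-token condensed Pro call) are illustrative assumptions, not measured workloads:

```python
# Hypothetical cost sketch of the RLM parallel pattern: fan out cheap
# V4-Flash analysis calls, then commit one V4-Pro call on the condensed
# result. Prices are from the article; token counts are assumptions.

FLASH_PRICE = 0.14 / 1_000_000   # $ per input token (V4-Flash)
PRO_PRICE = 0.435 / 1_000_000    # $ per input token (V4-Pro)

def rlm_cost(flash_calls: int, flash_tokens: int, pro_tokens: int) -> float:
    """Total input cost: N parallel Flash sub-agents plus one Pro call."""
    return flash_calls * flash_tokens * FLASH_PRICE + pro_tokens * PRO_PRICE

def pro_only_cost(pro_tokens: int) -> float:
    """Baseline: send the full corpus to Pro directly."""
    return pro_tokens * PRO_PRICE

# 16 Flash children each reading 50K tokens, condensed into one 20K-token
# Pro call, versus one Pro call over the full 800K-token corpus.
parallel = rlm_cost(flash_calls=16, flash_tokens=50_000, pro_tokens=20_000)
baseline = pro_only_cost(pro_tokens=16 * 50_000)
print(f"RLM parallel: ${parallel:.4f}  vs  Pro-only: ${baseline:.4f}")
```

Under these assumptions the parallel path costs roughly a third of the Pro-only baseline, which is why the pattern only stays attractive while the Flash/Pro price ratio holds.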
Verdict: ⭐ Star it — conditionally. Strong engineering, but its value proposition is tied to DeepSeek V4's pricing staying favorable. Evaluate again after the May 31 discount window closes.
#4 · anthropics/financial-services — 22,939 stars · +12,529 this week
Problem solved: Financial services analyst workflows — M&A pitch decks, KYC screening, earnings reviews, general ledger reconciliation — are high-value, highly repetitive, and largely unautomated by AI as of early 2026. [7]
Stack and approach: Python (86.2%), 10 pre-built agents deployable via the Claude Cowork plugin or the Claude Managed Agents API. The architecture enforces reader-orchestrator-writer isolation: reader sub-agents hold read-only permissions, the orchestrator never touches raw input, and only the writer module can write. Each reader enforces strict JSON schema output with additionalProperties: false, length limits, and regex constraints. Eleven third-party financial data connectors (Morningstar, S&P Global, FactSet, Pitchbook, and others) integrate via MCP — all requiring separate subscriptions. [7]
Differentiation: Mark Craddock's LinkedIn analysis identified five architecture patterns from this repository worth extracting for any vertical, among them reader-orchestrator-writer isolation, schema-gated handoffs with allowlists, cross-reference checking, and dual delivery paths (interactive plugin + headless API). [8] A Towards AI test found the Pitch Builder agent generated a 24-page M&A pitchbook in 11 minutes. [9] One concrete problem to note: hooks.json files in four vertical plugins contain [] instead of {}, which causes plugin load failures — at least 8 duplicate issue reports filed as of May 15 with no fix yet merged. [10]
Verdict: ⭐ Star it if you build multi-agent systems — the architecture patterns are directly reusable in any domain. Skip for immediate production use in finance until the hooks.json bug ships a fix.
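A schema-gated handoff of the kind described above can be sketched with stdlib tools only. The field names and rules here are hypothetical, and a real deployment would use a proper JSON Schema validator; the point is the shape of the gate — no extra keys, length limits, regex-constrained values:

```python
import re

# Hypothetical reader-output gate. Field names and rules are illustrative;
# the orchestrator accepts a payload only if this returns no violations.
SCHEMA = {
    "ticker":  {"pattern": r"^[A-Z]{1,5}$"},
    "summary": {"max_length": 500},
    "risk":    {"pattern": r"^(low|medium|high)$"},
}

def validate_reader_output(payload: dict) -> list[str]:
    """Return a list of violations; an empty list means the handoff passes."""
    errors = []
    # additionalProperties: false — reject any key not in the schema
    for key in payload:
        if key not in SCHEMA:
            errors.append(f"unexpected key: {key}")
    for key, rules in SCHEMA.items():
        if key not in payload:
            errors.append(f"missing key: {key}")
            continue
        value = str(payload[key])
        if "max_length" in rules and len(value) > rules["max_length"]:
            errors.append(f"{key}: exceeds {rules['max_length']} chars")
        if "pattern" in rules and not re.match(rules["pattern"], value):
            errors.append(f"{key}: fails pattern {rules['pattern']}")
    return errors

good = {"ticker": "MSFT", "summary": "Q3 revenue up 12%.", "risk": "low"}
bad = {"ticker": "msft", "risk": "low", "notes": "free-form text"}
print(validate_reader_output(good))  # []
print(validate_reader_output(bad))
```

The value of the gate is that a reader agent's free-form failure modes (extra fields, prose where an enum belongs) get caught at the boundary instead of propagating into the writer's output.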
#5 · ruvnet/ruflo — 51,202 stars · +5,106 this week
Problem solved: Claude Code handles single-agent sessions well but has no native support for coordinating multiple specialized agents, sharing persistent memory across sessions, or enforcing trust boundaries between agents on different machines. [11]
Stack and approach: TypeScript with a Rust AI engine (Cognitum.One), 32 Claude Code plugins across orchestration, memory, intelligence, security, and DevOps categories. Key pieces: HNSW-indexed AgentDB vector database, swarm coordination in hierarchical/mesh/adaptive topologies, agent federation with zero-trust security (mTLS + ed25519), and a PII-gated data flow pipeline. Install is npx ruflo init or via Claude Code plugin. ~210 MCP tools across 5 server groups. [11]
Differentiation: Ruflo (previously named Claude Flow) is the most mature project in this batch — one year of development, 6,457 commits, and Trendshift showing +17,251 stars in May 2026 alone. [12] The scale is real. The adoption question is also real: Shareuhack noted a Hacker News comment asking "is anyone actually using ruflo?" with minimal engagement. [12] The gap between star count and in-production usage is the key thing to verify before committing.
Verdict: Conditional ⭐. The architecture is enterprise-grade and the plugin breadth is impressive. If you need multi-agent orchestration at org scale, evaluate it seriously. For solo or small-team projects, it's architectural overhead you probably don't need yet.
#6 · rohitg00/agentmemory — 9,251 stars · +6,467 this week
Problem solved: AI coding agents start every session from scratch. The standard workaround — a .claude/CLAUDE.md file — doesn't scale past a few hundred observations and costs roughly 22K tokens per session at 240 observations. [13]
Stack and approach: TypeScript (92.6%) built on iii-engine v0.11.2 (a Rust+TypeScript runtime that replaces Express.js, Postgres+pgvector, SSE, pm2, and Prometheus). Memory retrieval uses triple-stream hybrid search: BM25 with Porter stemming, cosine vector similarity via all-MiniLM-L6-v2 (384-dimension embeddings), and graph-based entity relationship traversal — fused via Reciprocal Rank Fusion (RRF_K=60). Four-tier memory consolidation: Working → Episodic → Semantic → Procedural, with Ebbinghaus-style decay curves and auto-eviction. [13][14]
Differentiation: AlphaSignal's analysis reports 95.2% R@5 (retrieval of the relevant memory in the top 5 results) on LongMemEval-S — an ICLR 2025 benchmark with 500 questions — compared to 86.2% for a BM25-only baseline. [14] Token cost: ~1,900 tokens per session, ~$10/year at standard pricing, or $0 with local embeddings — a 92% reduction versus flat CLAUDE.md at scale. The asterisk: 91% of commits come from a single maintainer (Rohit Ghumare, Principal Product Evangelist at iii.dev). [14]
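Reciprocal Rank Fusion, the step that merges the three retrieval streams, is simple enough to sketch directly: each stream contributes 1/(K + rank) per document, with K = 60 as above. The document IDs and ranked lists below are illustrative, not agentmemory's internals:

```python
# Sketch of Reciprocal Rank Fusion: each retrieval stream (BM25, vector
# similarity, graph traversal) contributes 1/(K + rank) per document.
# K = 60 as in the article; the ranked lists are illustrative.

RRF_K = 60

def rrf_fuse(rankings: list[list[str]], k: int = RRF_K) -> list[str]:
    """Fuse several ranked lists into one, highest fused score first."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["m7", "m2", "m9"]    # keyword stream
vector_hits = ["m2", "m7", "m4"]  # embedding stream
graph_hits = ["m2", "m4", "m7"]   # entity-graph stream

print(rrf_fuse([bm25_hits, vector_hits, graph_hits]))
# → ['m2', 'm7', 'm4', 'm9']
```

The appeal of RRF is that it needs only ranks, not comparable scores, so a BM25 score and a cosine similarity never have to be normalized against each other.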
Verdict: ⭐ Star it for solo dev and small team use. AlphaSignal's verdict — "Production Ready for solo developers and small teams running on localhost" — matches the data. Hold for any production system where bus-factor risk matters.
#7 · CloakHQ/CloakBrowser — 11,310 stars · +8,404 this week
Problem solved: JavaScript-injection stealth browsers (playwright-stealth, puppeteer-extra, undetected-chromedriver) break on every Chrome update because antibot systems now detect the injection patches themselves. [15]
Stack and approach: Python (52.3%) + TypeScript (43.7%) wrapper around a custom Chromium binary compiled with 57 C++ source-level fingerprint patches on Linux/Windows (26 on macOS ARM). Drop-in Playwright/Puppeteer replacement: change one import, same API. humanize=True adds Bézier-curve mouse movements, per-character typing with thinking pauses, and realistic scroll patterns. Passes 30/30 bot detection tests including Cloudflare Turnstile and reCAPTCHA v3 (score 0.9). [15]
Differentiation: Source-level binary patches are significantly harder to detect than runtime JS injection. CloakHQ's framing: "It is a normal browser" — not a patched config. [15] The security caveat is concrete: security researcher 0xlally has documented two unremediated vulnerabilities — a CDP path traversal (arbitrary directory deletion) and unauthorized CDP access (browser control + file disclosure). [16] macOS ARM64 also lags at 26 patches versus 57 on Linux/Windows.
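Bézier-curve pointer movement, the technique the humanize mode is described as using, is straightforward to sketch generically. This is not CloakBrowser's implementation — the control-point jitter and step count are illustrative assumptions:

```python
import random

# Generic sketch of Bézier-curve pointer paths. Two randomized control
# points bend a cubic curve between start and end so the trajectory is
# never a straight machine-like line. Jitter and step count are arbitrary.

def bezier_path(start, end, steps=30, jitter=80):
    """Sample a cubic Bézier from start to end with randomized controls."""
    (x0, y0), (x3, y3) = start, end
    x1 = x0 + (x3 - x0) / 3 + random.uniform(-jitter, jitter)
    y1 = y0 + (y3 - y0) / 3 + random.uniform(-jitter, jitter)
    x2 = x0 + 2 * (x3 - x0) / 3 + random.uniform(-jitter, jitter)
    y2 = y0 + 2 * (y3 - y0) / 3 + random.uniform(-jitter, jitter)
    points = []
    for i in range(steps + 1):
        t = i / steps
        u = 1 - t
        # Cubic Bézier: B(t) = u³·P0 + 3u²t·P1 + 3ut²·P2 + t³·P3
        x = u**3 * x0 + 3 * u**2 * t * x1 + 3 * u * t**2 * x2 + t**3 * x3
        y = u**3 * y0 + 3 * u**2 * t * y1 + 3 * u * t**2 * y2 + t**3 * y3
        points.append((x, y))
    return points

path = bezier_path((100, 100), (800, 450))
print(path[0], path[-1])  # endpoints are exact; the middle is curved
```

A real humanizer would additionally vary timing between samples; the curve alone only fixes the spatial signature.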
Verdict: Conditional ⭐. The technology is genuinely differentiated. Do not expose the CDP port to any network interface until the security vulnerabilities are patched. Safe only in fully isolated local environments.
#8 · decolua/9router — 10,415 stars · +6,024 this week
Problem solved: AI coding tools hard-code to one provider. When that provider rate-limits or goes down, work stops. Free and cheap model alternatives exist across 40+ providers but require manual reconfiguration per tool. [17]
Stack and approach: JavaScript / Next.js local proxy running on localhost:20128. Intercepts requests from Claude Code, Codex, Cursor, Cline, Windsurf, and a dozen more tools, then routes through a 3-tier fallback: Subscription → Cheap → Free. Two compression layers: RTK Token Saver (from rtk-ai/rtk, ~40K stars) compresses tool outputs (git diff, grep, ls) before sending to the LLM, claiming 20-40% token savings; Caveman mode (from JuliusBrussee/caveman, ~52K stars) compresses output tokens for up to 65% savings. [17]
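The tiered fallback is the core routing idea, and it reduces to a small loop. The provider names and the fail/succeed simulation below are illustrative assumptions, not 9router's actual code:

```python
# Sketch of a 3-tier fallback router (Subscription → Cheap → Free).
# Provider names and outage simulation are illustrative assumptions.

class ProviderError(Exception):
    pass

DOWN = {"claude-subscription", "deepseek-cheap"}  # simulated outages

def call_provider(name: str, prompt: str) -> str:
    """Stand-in for a real API call; raises when the provider is down."""
    if name in DOWN:
        raise ProviderError(name)
    return f"{name}: response to {prompt!r}"

TIERS = [
    ["claude-subscription"],          # tier 1: paid subscription
    ["deepseek-cheap", "glm-cheap"],  # tier 2: cheap pay-per-token
    ["kiro-free", "opencode-free"],   # tier 3: free
]

def route(prompt: str) -> str:
    """Try each tier in order; within a tier, try providers left to right."""
    for tier in TIERS:
        for provider in tier:
            try:
                return call_provider(provider, prompt)
            except ProviderError:
                continue  # fall through to the next provider / tier
    raise RuntimeError("all providers exhausted")

print(route("refactor this function"))  # served by glm-cheap
```

Building around this loop rather than around any single provider is exactly the posture the verdict below recommends: free tiers come and go, the fallback order is what persists.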
Differentiation: The software itself is MIT and free. You pay providers directly. Free-tier options as of this writing include Kiro AI (unlimited Claude 4.5 + GLM-5) and OpenCode Free. The transparency note matters: the dashboard cost display shows "estimated costs as if using paid APIs," not actual billing. [17] Two formerly included free tiers — iFlow and Qwen Code (discontinued by Alibaba in April 2026) — have already dropped off, which gives a realistic picture of how stable the free-tier landscape is.
Verdict: ⭐ Star it as a power-user routing tool. Dual token compression alone is worth the setup time. Treat any specific free provider as temporary and build around the fallback logic, not around a single free option.
#9 · yikart/AiToEarn — 13,891 stars · +4,412 this week
Problem solved: Solo content creators managing 12+ platforms simultaneously spend most of their time on distribution and engagement automation rather than content creation. [18]
Stack and approach: TypeScript Nx Monorepo (92.6%) — Next.js frontend, NestJS backend, Electron desktop client, browser extension, Docker Compose with MongoDB and Redis. Four AI agent modules: Create (video via Grok/Veo, image via Nano Banana), Publish (one-click distribution to 12+ platforms), Engage (automated interactions via browser extension), Monetize (decentralized marketplace with CPS/CPE/CPM settlement models). MCP protocol support added March 2026 enables Claude, Cursor, or any MCP-compatible agent to directly schedule publishing. [18] Supported platforms span Chinese (Douyin, Xiaohongshu, Kuaishou, Bilibili) and international (TikTok, YouTube, Facebook, Instagram, X/Twitter, LinkedIn, Pinterest, Threads). [18]
Differentiation: WonderLab's dev.to analysis noted: "2,599 commits, 26 releases — this is not a demo project." [19] Fifteen months of active development distinguishes it from weekend projects in this space. The March 2026 content marketplace — where brands post tasks and creators accept them — is the differentiating layer over pure automation tools.
Verdict: ⭐ Star it if you operate a multi-platform content business. Skip if you're not already running that workflow — the complexity is proportional to the use case.
#10 · bytedance/UI-TARS-desktop — 33,983 stars · +4,184 this week
Problem solved: Computer-use agents that depend on HTML/DOM accessibility trees break across platforms and UI frameworks. A pure-vision approach — screenshots as the only input — is platform-agnostic by definition. [20]
Stack and approach: TypeScript (89.1%), official ByteDance project from the Seed team. Ships two distinct sub-projects: Agent TARS (general multimodal AI agent with CLI via npx @agent-tars/cli and Web UI) and UI-TARS Desktop (native GUI agent for local computer control). Both use vision-language models — primarily the Seed-1.5-VL/1.6 series and the UI-TARS model (ByteDance-Seed/UI-TARS-1.5-7B on Hugging Face). Agent TARS includes a hybrid browser agent mode (GUI Agent / DOM / hybrid), Event Stream protocol for context engineering, and MCP integration. [20]
Differentiation: The pure-vision architecture avoids DOM brittleness — in principle. In practice, AIMultiple's April 2026 benchmark puts UI-TARS at approximately 38% accuracy on UI grounding tasks, versus approximately 90% for Qwen3-VL. [21] The conclusion from that benchmark: "robust visual perception and implicit UI understanding matter more than narrow UI specialization." UI-TARS is specialized; the generalist vision models currently outperform it on real-desktop tasks.
Verdict: Watch-list. The ByteDance backing and open-source model weights give it legitimacy, and the architecture is sound in principle. The 38% accuracy gap versus Qwen3-VL on grounding tasks is a practical constraint worth tracking before deploying on real workflows.
Three patterns worth extracting
Skills-as-Markdown is the dominant paradigm. Both mattpocock/skills (+17K) and addyosmani/agent-skills (+9K) package agent behaviors as plain text files readable without running any code. The combined star velocity on these two repos this week — 26,000 stars — outpaced every other category. The pattern holds a concrete advantage: when the agent produces bad output, you can read the instruction file and understand exactly why.
Hook-driven memory layers are the new CLAUDE.md. agentmemory's triple-stream retrieval architecture is a direct replacement for static context files. The 92% token reduction versus flat-file approaches, combined with cross-agent memory sharing, points to a standard emerging: coding agents should get persistent, searchable memory via hooks rather than carrying full context every session.
Reader-orchestrator-writer isolation is an architecture worth stealing. The pattern from anthropics/financial-services — where reader agents are read-only, the orchestrator never touches raw input, and only the writer can write — is domain-agnostic. It solves a real class of multi-agent bugs (an agent that can both read and act can corrupt state it's analyzing). The pattern applies to any agentic workflow that touches production data, not just financial services.
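The isolation pattern above can be sketched in miniature. The role names and datastore are illustrative, not the repository's implementation; the essential move is that readers receive an immutable view while only the writer holds a write handle:

```python
from types import MappingProxyType

# Sketch of reader-orchestrator-writer isolation. Readers get a read-only
# projection of state; only the Writer can mutate the underlying store.

class Writer:
    def __init__(self, store: dict):
        self._store = store  # the only component with write access

    def write(self, key: str, value):
        self._store[key] = value

def reader(snapshot, key: str):
    """Readers see an immutable view; any mutation attempt raises."""
    return snapshot.get(key)

def orchestrate(store: dict):
    view = MappingProxyType(store)     # read-only projection for readers
    balance = reader(view, "balance")  # orchestrator routes reader output...
    Writer(store).write("balance_checked", balance is not None)  # ...writer acts

ledger = {"balance": 1200}
orchestrate(ledger)
print(ledger)  # the reader analyzed state; only the writer changed it
```

This captures the bug class the pattern prevents: a reader that tried `view["balance"] = 0` would raise a TypeError instead of silently corrupting the state it is analyzing.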
Cover image: GitHub: mattpocock/skills — Skills For Real Engineers logo
References
- [1] GitHub: mattpocock/skills
- [2] Towards AI: Matt Pocock Dumped 17 Markdown Files on GitHub
- [3] GitHub: addyosmani/agent-skills
- [4] GitHub Issues: addyosmani/agent-skills
- [5] GitHub: Hmbown/DeepSeek-TUI
- [6] Verdent AI: DeepSeek-TUI vs Claude Code
- [7] GitHub: anthropics/financial-services
- [8] LinkedIn: Five patterns to steal from Anthropic's Financial Services plugins
- [9] Towards AI: I Tested All 10 Anthropic Finance Agents on 20 Tasks
- [10] GitHub Issues: anthropics/financial-services
- [11] GitHub: ruvnet/ruflo
- [12] Shareuhack: GitHub Trending Weekly 2026-05-06
- [13] GitHub: rohitg00/agentmemory
- [14] AlphaSignal: How agentmemory works and how to actually use it
- [15] GitHub: CloakHQ/CloakBrowser
- [16] GitHub Issues: CloakHQ/CloakBrowser
- [17] GitHub: decolua/9router
- [18] GitHub: yikart/AiToEarn
- [19] DEV Community: One Open Source Project a Day No. 63 — AiToEarn
- [20] GitHub: bytedance/UI-TARS-desktop
- [21] AIMultiple: Computer Use Agents — Benchmark and Architecture