Awesome Agent Harness

A curated, implementation-first list of agent harness engineering resources, with GitHub projects as the primary focus.

  • Total entries: 174
  • GitHub entries: 149 (85.6%)
  • GitHub in project categories (excluding readings): 145/145 (100.0%)
  • Categories: 9
  • Last verified: 2026-05-12
  • Language: English | 中文

Featured Harness Blogs

Contents

Category Overview

| Category | Entries |
| --- | --- |
| Harness Architecture & Orchestration | 22 |
| Context & Working-State Engineering | 9 |
| Execution Substrates & Sandboxing | 19 |
| Protocols, Tool Interfaces & Agent Contracts | 11 |
| Evaluation Harnesses & Benchmarks | 21 |
| Observability & Reliability Operations | 14 |
| Guardrails, Security & Governance | 12 |
| Reference Harness Implementations | 37 |
| Essential Readings & Ecosystem Maps | 29 |

Catalog

Notes:

  • Stars are rendered as badges from snapshot values.
  • Repository update dates are tracked in data/projects.yaml and validation reports.
  • Entries are sorted by stars (descending) within each category.
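The ordering rule in the notes above (stars descending within each category) can be sketched as follows. This is a minimal illustration, not the repository's rendering script: the field names `name`, `category`, and `stars`, the sample entries, and the helper `group_and_sort` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical entries; the field names are illustrative and are NOT
# taken from the actual schema of data/projects.yaml.
entries = [
    {"name": "alpha", "category": "Evaluation Harnesses & Benchmarks", "stars": 120},
    {"name": "beta", "category": "Evaluation Harnesses & Benchmarks", "stars": 950},
    {"name": "gamma", "category": "Execution Substrates & Sandboxing", "stars": 40},
    {"name": "delta", "category": "Execution Substrates & Sandboxing", "stars": 310},
]


def group_and_sort(items):
    """Group entries by category, then sort each group by stars, descending."""
    grouped = defaultdict(list)
    for item in items:
        grouped[item["category"]].append(item)
    return {
        category: sorted(group, key=lambda e: e["stars"], reverse=True)
        for category, group in grouped.items()
    }


catalog = group_and_sort(entries)
for category, group in catalog.items():
    print(category, [e["name"] for e in group])
```

Because star counts are snapshot values, a real renderer would presumably read them from the tracked data file rather than querying them live; that detail is not confirmed by this README.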

Harness Architecture & Orchestration

| Project | Link | Stars | Tags | Summary |
| --- | --- | --- | --- | --- |
| DeerFlow | GitHub | star | long-horizon, memory, subagents | Long-horizon super-agent harness integrating memory, tools, subagents, and sandboxes. |
| AutoGen | GitHub | star | multi-agent, orchestration, framework | Programming framework for agentic AI with multi-agent interaction and orchestration. |
| Agno | GitHub | star | scale, runtime, management | Agent software runtime focused on running and managing agentic systems at scale. |
| LangGraph | GitHub | star | graph, workflow, runtime | Graph-based runtime for resilient stateful agents and deterministic workflow control. |
| Semantic Kernel | GitHub | star | enterprise, orchestration, plugins | Enterprise-grade agentic application framework with orchestration and plugin patterns. |
| OpenAI Agents SDK (Python) | GitHub | star | sdk, handoff, workflows | Lightweight framework for multi-agent workflows, handoffs, and production patterns. |
| Symphony | GitHub | star | orchestration, control-plane, workflows | Ticket-driven orchestration layer that turns project work into isolated autonomous implementation runs. |
| deepagents | GitHub | star | runtime, orchestration, long-running | Open-source harness for long-running, tool-using agents with planning and subagent patterns. |
| Archon | GitHub | star | workflow-engine, worktrees, validation | Workflow engine for AI coding agents with YAML-defined phases, isolated worktrees, and validation gates. |
| Google ADK (Python) | GitHub | star | toolkit, deployment, evaluation | Code-first toolkit to build, evaluate, and deploy advanced AI agents. |
| PydanticAI | GitHub | star | python, typing, schema | Type-safe Python framework for agents with strong schema contracts and tooling. |
| Microsoft Agent Framework | GitHub | star | multi-agent, workflows, observability | Multi-language framework for building, orchestrating, and deploying AI agents with graph workflows and observability. |
| Hive | GitHub | star | harness, orchestration, runtime | Outcome-driven agent runtime harness with explicit control loops and orchestration blocks. |
| VoltAgent | GitHub | star | typescript, platform, runtime | TypeScript agent engineering platform built around open runtime abstractions. |
| mcp-agent | GitHub | star | mcp, runtime, workflow | Practical agent framework centered on MCP tool ecosystems and workflow composition. |
| Yao | GitHub | star | single-binary, runtime, autonomous | Single-binary runtime for defining and running autonomous agents. |
| Cloudflare Agents | GitHub | star | platform, deployment, runtime | Platform runtime for building and deploying agents with production infrastructure primitives. |
| Docker Agent | GitHub | star | docker, runtime, container | Agent builder and runtime stack emphasizing container-native execution. |
| NeMo Agent Toolkit | GitHub | star | multi-agent, optimization, toolkit | Open toolkit for connecting and optimizing teams of AI agents. |
| Scion | GitHub | star | multi-agent, containers, orchestration | Experimental multi-agent orchestration testbed that runs isolated agent harnesses in containers, worktrees, and remote runtimes. |
| deepagentsjs | GitHub | star | typescript, langgraph, subagents | TypeScript agent harness with built-in planning, filesystem tools, subagents, and LangGraph-native runtime hooks. |
| hankweave | GitHub | star | long-horizon, runtime, checkpoints | Headless-first long-horizon runtime that orchestrates existing agent harnesses with sentinels, loops, checkpoints, and event journals. |

Context & Working-State Engineering

| Project | Link | Stars | Tags | Summary |
| --- | --- | --- | --- | --- |
| everything-claude-code | GitHub | star | context, skills, harness-practices | Large open repository of harness practices around memory, skills, and context control for coding agents. |
| claude-mem | GitHub | star | memory, context, session | Plugin-style memory layer that captures session history and reinjects relevant context into future coding runs. |
| planning-with-files | GitHub | star | planning, skills, persistence | Skill package for persistent file-based planning in coding-agent workflows. |
| Agent Skills for Context Engineering | GitHub | star | skills, context, production | Large skill library oriented around context engineering and production agents. |
| Context-Engineering Handbook | GitHub | star | context-engineering, handbook, practices | First-principles handbook focused on practical context engineering for agent systems. |
| CCPM | GitHub | star | planning, github-issues, parallel-execution | Spec-driven project-manager skill that turns PRDs and GitHub issues into persistent context and parallel agent execution. |
| Trellis | GitHub | star | specs, memory, workflow | Multi-platform coding-agent workflow framework with task context, project memory, and spec injection. |
| Awesome Context Engineering | GitHub | star | awesome-list, context, survey | Survey-style list for context engineering resources and frameworks. |
| context-space | GitHub | star | context, infrastructure, mcp | Infrastructure project focused on context engineering building blocks and MCP-centric integrations. |

Execution Substrates & Sandboxing

| Project | Link | Stars | Tags | Summary |
| --- | --- | --- | --- | --- |
| Daytona | GitHub | star | sandbox, execution, infra | Secure and elastic sandbox infrastructure for running AI-generated code with file, Git, LSP, and execution APIs. |
| CUA | GitHub | star | computer-use, sandbox, infra | Infrastructure stack for computer-use agents with sandbox, SDK, and benchmark support. |
| Browser Harness | GitHub | star | browser, cdp, self-healing | Thin editable CDP harness that connects LLMs directly to real browsers and lets agents extend helpers in flight. |
| E2B | GitHub | star | cloud-sandbox, execution, enterprise | Secure cloud environments with real tools for production-grade agent execution. |
| OpenSandbox | GitHub | star | sandbox, security, runtime | Secure and extensible sandbox runtime built for agent workloads. |
| agent-infra sandbox | GitHub | star | all-in-one, browser, shell | All-in-one sandbox combining browser, shell, files, MCP, and IDE server. |
| Judge0 | GitHub | star | code-execution, sandbox, backend | Scalable sandboxed code execution system usable as an agent execution backend. |
| Sandcastle | GitHub | star | sandbox, typescript, branch-strategy | TypeScript library for orchestrating coding agents inside isolated sandboxes with configurable branch strategies. |
| Agent Sandbox | GitHub | star | kubernetes, sandbox, stateful | Kubernetes-native sandbox control plane for isolated, stateful agent runtimes with stable identity, persistence, and warm-pool support. |
| stakpak/agent | GitHub | star | always-on, autonomous, ops | Always-on open agent that runs on your machines with autonomous operational loops. |
| OSS-Fuzz Gen | GitHub | star | fuzzing, security, execution | LLM-powered fuzzing workflows integrated with controlled execution contexts. |
| E2B Desktop Sandbox | GitHub | star | desktop, sandbox, computer-use | Secure virtual desktop sandbox for computer-use agents with SDK control and screen streaming. |
| Tensorlake | GitHub | star | microvm, sandbox, orchestration | Serverless runtime for agent sandboxes with MicroVM isolation, snapshots, suspend-resume, and background orchestration. |
| Arrakis | GitHub | star | sandbox, microvm, snapshots | Self-hosted sandbox substrate with MicroVM isolation, snapshot restore, and REST, SDK, and MCP interfaces for agent code execution and computer use. |
| AgentScope Runtime | GitHub | star | runtime, sandbox, deployment | Production runtime for agent apps with secure tool sandboxes, deployment APIs, observability, and state services. |
| SWE-ReX | GitHub | star | sandbox, execution, coding-agent | Sandboxed execution infrastructure for AI coding agents at local and cloud scale. |
| sandboxed.sh | GitHub | star | self-hosted, isolation, orchestrator | Self-hosted orchestrator running coding agents inside isolated Linux workspaces. |
| Capsule | GitHub | star | wasm, sandbox, task-runtime | Durable runtime that coordinates agent tasks inside isolated WebAssembly sandboxes with retries and lifecycle tracking. |
| terminal-bench-env | GitHub | star | terminal, benchmark-env, sandbox | Environment layer for terminal-agent benchmark execution. |

Protocols, Tool Interfaces & Agent Contracts

| Project | Link | Stars | Tags | Summary |
| --- | --- | --- | --- | --- |
| GitHub Spec Kit | GitHub | star | spec-driven, workflows, tooling | Toolkit for spec-driven development to guide deterministic agent execution. |
| MCP Servers | GitHub | star | mcp, servers, implementations | Official collection of MCP server implementations across tools and domains. |
| AGENTS.md | GitHub | star | spec, agent-file, instructions | Open format for repository-local instructions that coding agents can follow. |
| Model Context Protocol | GitHub | star | mcp, protocol, interoperability | Core specification and docs for MCP-based tool and context interoperability. |
| directories (rules and MCP indexes) | GitHub | star | directories, mcp, rules | Curated directories of agent rules and MCP servers for tool discovery. |
| LangChain MCP Adapters | GitHub | star | mcp, adapters, integration | Adapters connecting LangChain components with MCP servers. |
| Microsoft MCP Servers | GitHub | star | mcp, enterprise, servers | Microsoft's official MCP server catalog for enterprise data and tools. |
| ACPX | GitHub | star | acp, client, sessions | Headless CLI client for stateful Agent Client Protocol sessions. |
| Microsoft Learn MCP | GitHub | star | mcp, docs, grounding | MCP server and CLI for grounding agents with Microsoft documentation sources. |
| IBM MCP | GitHub | star | mcp, clients, tooling | IBM collection of MCP servers, clients, and developer tooling. |
| AGENT.md | GitHub | star | standard, agent-file, interoperability | Standardized machine-readable file format for agentic coding tools. |

Evaluation Harnesses & Benchmarks

| Project | Link | Stars | Tags | Summary |
| --- | --- | --- | --- | --- |
| Promptfoo | GitHub | star | eval, red-team, ci | Config-driven prompt/agent/RAG testing, comparison, and red-team evaluation tool. |
| DeepEval | GitHub | star | evaluation, framework, testing | LLM evaluation framework supporting agent and workflow quality testing. |
| RAGAS | GitHub | star | rag, metrics, evaluation | Open evaluation toolkit for LLM and RAG quality metrics. |
| lm-evaluation-harness | GitHub | star | benchmark, harness, llm | Popular benchmark harness for consistent LLM evaluation across tasks. |
| SWE-bench | GitHub | star | benchmark, swe, evaluation | Standard benchmark for evaluating issue-fixing software engineering agents. |
| verifiers | GitHub | star | verifier, rl, evaluation | Library for RL environments and verifier-based evaluation loops. |
| AgentBench | GitHub | star | benchmark, cross-domain, agent | Cross-environment benchmark for evaluating LLM agents as tool-using systems. |
| LangWatch | GitHub | star | simulation, evaluation, testing | End-to-end platform for agent simulations, evaluation loops, and production testing. |
| EvalScope | GitHub | star | benchmark, framework, llm | Customizable framework for large-model benchmarking and performance evaluation. |
| Terminal-Bench | GitHub | star | terminal, benchmark, long-horizon | Terminal-native benchmark suite for long-horizon, verification-heavy agent tasks. |
| Harbor | GitHub | star | evaluation, harness, rl-env | Framework for running agent evaluations and constructing RL-style environments. |
| tau2-bench | GitHub | star | tool-use, interaction, benchmark | Tool-agent-user interaction benchmark emphasizing multi-step execution quality. |
| NeMo Gym | GitHub | star | rl-env, training, evaluation | Toolkit for building RL environments suitable for LLM/agent training and eval. |
| TheAgentCompany | GitHub | star | benchmark, workplace, multi-step | Agent benchmark with simulated software-company tasks for evaluating multi-step workplace autonomy. |
| auto-harness | GitHub | star | optimization, regression, evals | Benchmark-gated optimization loop that mines failures, edits agent code, and guards against regressions overnight. |
| Inspect Evals | GitHub | star | inspect, eval-suite, reproducibility | Evaluation suite collection for Inspect AI workflows. |
| SWE-Bench Pro | GitHub | star | swe, benchmark, long-horizon | Long-horizon software-engineering benchmark with reproducible Docker-based evaluation for issue-driven coding agents. |
| Agent Evaluation | GitHub | star | evaluation, testing, ci | AWS framework for testing virtual agents with evaluator-driven multi-turn conversations, hooks, and CI-friendly workflows. |
| WorkArena | GitHub | star | browser, benchmark, enterprise | Browser benchmark for practical enterprise-like knowledge work tasks. |
| OpenHands Benchmarks | GitHub | star | openhands, eval, harness | Evaluation harness and benchmark definitions for OpenHands systems. |
| WebArena-Verified | GitHub | star | web-agent, benchmark, deterministic | Verified web-agent benchmark with deterministic evaluators. |

Observability & Reliability Operations

| Project | Link | Stars | Tags | Summary |
| --- | --- | --- | --- | --- |
| Langfuse | GitHub | star | llmops, tracing, metrics | Open-source LLM engineering platform for traces, metrics, prompts, and evals. |
| MLflow | GitHub | star | platform, monitoring, evaluation | Broad AI engineering platform with monitoring and evaluation support for agents. |
| Opik | GitHub | star | monitoring, eval, tracing | End-to-end debug/eval/monitoring stack for LLM apps and agent workflows. |
| RagaAI Catalyst | GitHub | star | agentops, analytics, monitoring | Agent observability and monitoring framework with timeline and graph analytics. |
| TensorZero | GitHub | star | llmops, gateway, optimization | Open LLMOps stack unifying gateway, observability, evaluation, and optimization. |
| Arize Phoenix | GitHub | star | observability, tracing, evaluation | Open platform for AI observability, tracing, and evaluation analytics. |
| OpenLLMetry | GitHub | star | opentelemetry, instrumentation, tracing | OpenTelemetry-based instrumentation for GenAI and LLM applications. |
| Helicone | GitHub | star | monitoring, traffic, production | Lightweight platform for monitoring and evaluating LLM traffic in production. |
| AgentOps SDK | GitHub | star | agentops, monitoring, cost | Monitoring and benchmarking SDK for agent workflows with cost and trace tracking. |
| Latitude | GitHub | star | platform, eval, observability | Open-source agent engineering platform with eval and observability capabilities. |
| Laminar | GitHub | star | observability, tracing, evals | Agent-focused observability stack with tracing, evaluation runs, monitoring, and dashboards. |
| claude-code-reverse | GitHub | star | trace, visualization, debugging | Tooling to visualize and inspect Claude Code LLM interaction traces. |
| OpenInference | GitHub | star | spec, instrumentation, observability | Open instrumentation specification and tooling for AI observability. |
| Future AGI | GitHub | star | observability, evaluation, guardrails | Self-hostable platform that closes the loop across agent tracing, evaluation, simulation, guardrails, and gateway operations. |

Guardrails, Security & Governance

| Project | Link | Stars | Tags | Summary |
| --- | --- | --- | --- | --- |
| LiteLLM | GitHub | star | gateway, proxy, guardrails | Unified LLM gateway/proxy with cost tracking, load balancing, and guardrails. |
| Kong | GitHub | star | gateway, policy, infra | API and AI gateway infrastructure useful for policy enforcement in agent systems. |
| Portkey Gateway | GitHub | star | gateway, guardrails, routing | AI gateway with routing and guardrails for multi-model production traffic. |
| CAI (Cybersecurity AI) | GitHub | star | security, governance, framework | Security-focused agent framework for offensive/defensive AI workflows. |
| OpenAI Realtime Agents | GitHub | star | realtime, orchestration, control | Advanced agentic realtime patterns with structured control and interaction loops. |
| Plano | GitHub | star | proxy, safety, data-plane | AI-native proxy and data plane with orchestration, safety, and observability. |
| OpenAI CS Agents Demo | GitHub | star | demo, handoffs, governance | Customer-service multi-agent demo highlighting handoffs and guardrail-like control points. |
| ContextForge | GitHub | star | gateway, governance, observability | Registry and proxy layer that unifies MCP, A2A, and REST/gRPC endpoints with centralized governance and observability. |
| Archestra | GitHub | star | enterprise, guardrails, governance | Enterprise AI platform with guardrails, MCP registry, and orchestration services. |
| Tracecat | GitHub | star | security, automation, policy | AI automation platform for security teams with policy and workflow controls. |
| AgentGateway | GitHub | star | gateway, mcp, proxy | Agentic proxy gateway for AI agents and MCP server ecosystems. |
| Haft | GitHub | star | governance, decisions, mcp | Decision-governance harness that records falsifiable contracts, evidence, and commissions before agents execute. |

Reference Harness Implementations

| Project | Link | Stars | Tags | Summary |
| --- | --- | --- | --- | --- |
| OpenCode | GitHub | star | terminal, coding-agent, subagents | Open-source coding agent with built-in plan/build roles, subagents, LSP support, and a client-server runtime. |
| Claude Code | GitHub | star | terminal, coding-agent, git-workflows | Official terminal coding agent that understands codebases and executes editing, debugging, and Git workflows through natural language. |
| Gemini CLI | GitHub | star | terminal, coding-agent, mcp | Open-source terminal agent with built-in tools, MCP support, checkpointing, and sandboxing controls. |
| Codex CLI | GitHub | star | terminal, coding-agent, local-execution | Terminal-native coding agent that runs locally and exposes practical agent workflows for software tasks. |
| OpenHands | GitHub | star | coding-agent, software-engineering, repo | Open-source AI software engineer focused on repo-level coding task execution. |
| learn-claude-code | GitHub | star | tutorial, harness, claude-code | Hands-on harness tutorial for building Claude Code-like systems from scratch. |
| OpenManus | GitHub | star | general-agent, autonomy, workflows | Open foundation for broad autonomous agent workflows with coding-heavy use cases. |
| pi | GitHub | star | coding-agent, runtime, monorepo | Agent harness monorepo combining a coding-agent CLI, shared runtime, and multi-provider LLM stack. |
| aider | GitHub | star | terminal, repo-map, testing | Terminal coding assistant with repo mapping, git-aware edits, and built-in lint/test feedback loops. |
| Claude Code Plugins: Orchestration and Automation | GitHub | star | claude-code, plugins, orchestration | Production-ready Claude Code plugin marketplace bundling agents, skills, tools, and multi-agent workflow orchestrators. |
| CLI-Anything | GitHub | star | cli, tool-use, automation | CLI agent system that unifies command-line tool usage in agent loops. |
| NanoClaw | GitHub | star | containers, claude-sdk, scheduling | Container-isolated Claude agent harness with channel routing, scheduled jobs, per-group memory, and small-codebase customization. |
| Qwen Code | GitHub | star | terminal, coding-agent, cli | Terminal-native open-source coding agent tuned for practical dev loops. |
| SuperClaude Framework | GitHub | star | config, personas, workflow | Configuration framework adding commands, personas, and method templates to coding agents. |
| Devika | GitHub | star | assistant, planning, coding | Open-source coding assistant system for planning and implementing development tasks. |
| SWE-agent | GitHub | star | swe, issue-fixing, tooling | Research-grade coding agent that resolves GitHub issues with explicit tooling loops. |
| cmux | GitHub | star | macos, workspace, browser | Native macOS terminal and browser workspace for AI coding agents with notifications, split panes, and scriptable control. |
| Aperant | GitHub | star | coding-agent, parallel, memory | Autonomous multi-agent coding framework with parallel execution, isolated workspaces, QA loops, and persistent memory. |
| Eigent | GitHub | star | desktop, cowork, productivity | Open-source desktop cowork agent for autonomous task execution and productivity. |
| OpenHarness | GitHub | star | tool-use, memory, multi-agent | Open agent harness implementation covering tool use, skills, memory, permissions, and multi-agent coordination. |
| IronClaw | GitHub | star | security, wasm, routines | Security-first personal agent harness with WASM sandboxing, routines, tool plugins, and persistent memory. |
| Superset | GitHub | star | worktrees, desktop, parallel | Worktree-based desktop orchestrator for running and reviewing parallel CLI coding agents from one workspace. |
| GitHub Copilot CLI | GitHub | star | terminal, coding-agent, mcp | Official terminal coding agent built on GitHub's Copilot harness with MCP extensibility, approval controls, and GitHub-native context. |
| Open SWE | GitHub | star | async, coding-agent, swe | Asynchronous open-source coding agent focused on software issue workflows. |
| Agent Orchestrator | GitHub | star | worktrees, parallel, dashboard | Worktree-based orchestration layer for parallel coding agents with autonomous CI and review feedback handling. |
| Paseo | GitHub | star | coding-agent, daemon, multi-device | Multi-device coding-agent daemon and client stack for orchestrating local agents, parallel runs, and cross-provider workflows. |
| holaOS | GitHub | star | long-horizon, desktop, durable-state | Desktop-first long-horizon agent environment with runtime, memory, tools, apps, and durable state. |
| 1Code | GitHub | star | coding-agent, orchestration, worktrees | Desktop-first coding-agent orchestrator with worktree isolation, background sandboxes, MCP tooling, and automation triggers. |
| OSAURUS | GitHub | star | macos, local-first, memory | Native macOS harness for autonomous coding agents with persistent memory. |
| HiClaw | GitHub | star | multi-agent, human-in-the-loop, shared-state | Collaborative multi-agent OS with manager-worker coordination, shared state, and human-in-the-loop oversight via Matrix rooms. |
| oh-my-pi | GitHub | star | terminal, lsp, subagents | Terminal AI coding agent with edit safety, LSP integration, and subagent support. |
| mini-swe-agent | GitHub | star | minimal, swe, coding-agent | Minimal coding agent implementation with strong benchmark competitiveness. |
| TinyAGI | GitHub | star | team-orchestration, autonomous, workflows | Team-style agent orchestrator for one-person-company style autonomous workflows. |
| Devon | GitHub | star | pair-programming, coding-agent, autonomous | Open-source pair programmer agent with autonomous coding execution patterns. |
| Open Claude Cowork | GitHub | star | desktop, ui, orchestration | Desktop coding cowork assistant that turns agent orchestration into GUI workflows. |
| Amazon Bedrock AgentCore Samples | GitHub | star | aws, runtime, operations | Official sample suite for deploying and operating agents with runtime, gateway, memory, observability, evaluation, and policy layers. |
| mini-coding-agent | GitHub | star | coding-agent, minimal, approvals | Minimal coding agent harness illustrating approvals, memory, bounded delegation, and durable transcripts. |

Essential Readings & Ecosystem Maps

| Project | Link | Stars | Tags | Summary |
| --- | --- | --- | --- | --- |
| awesome-claude-code | GitHub | star | awesome-list, claude-code, skills | Community collection of Claude Code skills, hooks, and orchestrator tooling. |
| awesome-agentic-patterns | GitHub | star | awesome-list, patterns, design | Catalog of reusable agentic design patterns and implementation motifs. |
| awesome-mcp-servers | GitHub | star | awesome-list, mcp, tools | Curated MCP server index for tool interoperability in agent systems. |
| awesome-harness-engineering | GitHub | star | awesome-list, curation, harness | Curated list focused on harness engineering articles, benchmarks, and implementations. |
| 12 Factor Agents | Reference | - | reading, operations, principles | Operations-oriented principles for building maintainable production agents. |
| Agent Frameworks, Runtimes, and Harnesses, oh my! | Reference | - | reading, langchain, architecture | Clear decomposition of framework vs runtime vs harness responsibilities. |
| An open-source spec for Codex orchestration: Symphony. | Reference | - | reading, openai, orchestration | OpenAI's orchestration write-up on turning issue trackers into always-on control planes for coding agents. |
| Building agents with the Claude Agent SDK | Reference | - | reading, claude, sdk | Claude blog on production-oriented SDK usage for sessions, tools, and orchestration. |
| Building Effective AI Agents | Reference | - | reading, anthropic, agents | Anthropic's practical guidance on when to use workflows vs. autonomous agents and how to structure them. |
| Claude Code auto mode | Reference | - | reading, anthropic, permissions | Anthropic's write-up on classifier-backed approval delegation for safer high-autonomy coding-agent runs. |
| Code execution with MCP | Reference | - | reading, anthropic, mcp | Anthropic's design notes on controlled code execution via MCP boundaries. |
| Demystifying Evals for AI Agents | Reference | - | reading, evals, anthropic | Methodology for designing robust agent evals in non-deterministic trajectories. |
| Effective context engineering for AI agents | Reference | - | reading, context, anthropic | Guidance on context-window budgeting and working-state management for agents. |
| Effective harnesses for long-running agents | Reference | - | reading, long-running, anthropic | Practical guide to maintaining state, resumability, and reliability over long agent runs. |
| Evaluating Deep Agents: Our Learnings | Reference | - | reading, langchain, evaluation | LangChain's practical lessons on evaluating stateful and long-horizon agents. |
| Harness design for long-running application development | Reference | - | reading, app-dev, anthropic | Follow-up article on improving long-running app generation through harness structure. |
| Harness Engineering (Martin Fowler) | Reference | - | reading, architecture, fowler | Architectural perspective on harness engineering and entropy control. |
| Harness engineering (OpenAI) | Reference | - | reading, methodology, openai | Field report on building reliable agent-first software via harness constraints and verification. |
| How we built our multi-agent research system | Reference | - | reading, anthropic, multi-agent | Anthropic architecture write-up on role separation and coordination in multi-agent systems. |
| Improving Deep Agents with harness engineering | Reference | - | reading, langchain, harness | Evidence that harness improvements alone can move benchmark performance. |
| Making Claude Code more secure and autonomous with sandboxing | Reference | - | reading, anthropic, sandboxing | How Anthropic uses sandbox boundaries to raise agent autonomy without giving up security controls. |
| Quantifying infrastructure noise in agentic coding evals | Reference | - | reading, anthropic, evaluation | Analysis of how infrastructure choices impact coding-agent benchmark outcomes. |
| Scaling Managed Agents: Decoupling the brain from the hands | Reference | - | reading, anthropic, architecture | Anthropic's meta-harness architecture for decoupling session logs, harness loops, and sandboxes in long-horizon agents. |
| Skill Issue: Harness Engineering for Coding Agents | Reference | - | reading, humanlayer, coding-agents | Practical breakdown of why coding-agent quality depends heavily on harness setup. |
| Testing Agent Skills Systematically with Evals | Reference | - | reading, openai, evals | OpenAI Developers guide for turning agent traces into repeatable skill evaluations. |
| The Anatomy of an Agent Harness | Reference | - | reading, architecture, langchain | Conceptual decomposition of agent harness components and their responsibilities. |
| Unrolling the Codex agent loop | Reference | - | reading, openai, architecture | OpenAI engineering deep dive into the Codex harness loop, prompt growth, tool-call replay, and stateless execution tradeoffs. |
| Writing effective tools for AI agents | Reference | - | reading, anthropic, tools | Best practices for tool interface design so agents call tools safely and reliably. |
| Your Agent Needs a Harness, Not a Framework | Reference | - | reading, inngest, reliability | Argument for reliability-first infrastructure around agents instead of framework-only thinking. |

Maintenance Notes

  • Source of truth: data/projects.yaml
  • Regenerate README files: python3 scripts/render_readme.py
  • Verify catalog and links: python3 scripts/verify_catalog.py
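As a rough illustration of the kind of consistency check a catalog verifier could run, the sketch below recomputes the header statistics from the per-category counts published in this README. It is not the contents of `scripts/verify_catalog.py`, whose behavior is not shown here; the variable names are assumptions, and the numbers are copied from the overview above.

```python
# Per-category entry counts as published in the Category Overview.
category_counts = {
    "Harness Architecture & Orchestration": 22,
    "Context & Working-State Engineering": 9,
    "Execution Substrates & Sandboxing": 19,
    "Protocols, Tool Interfaces & Agent Contracts": 11,
    "Evaluation Harnesses & Benchmarks": 21,
    "Observability & Reliability Operations": 14,
    "Guardrails, Security & Governance": 12,
    "Reference Harness Implementations": 37,
    "Essential Readings & Ecosystem Maps": 29,
}

# Header value published in this README.
github_entries = 149

total_entries = sum(category_counts.values())
# "Project categories" means every category except the readings list.
project_entries = total_entries - category_counts["Essential Readings & Ecosystem Maps"]
github_share = round(100 * github_entries / total_entries, 1)

print(f"Total entries: {total_entries}")
print(f"GitHub entries: {github_entries} ({github_share}%)")
print(f"Entries in project categories: {project_entries}")
print(f"Categories: {len(category_counts)}")
```

The recomputed values agree with the header: 174 total entries, a GitHub share of 85.6%, 145 entries in project categories, and 9 categories.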

Citation

@misc{awesome-agent-harness,
  title={Awesome Agent Harness},
  howpublished={\url{https://github.com/Picrew/awesome-agent-harness.git}},
  year={2026}
}