10 · Full Worked Example: Hermes Coding Harness
The complete hermes.agentmark document. It records the system, the market assumptions, the constraints, the tests, and the decision logic in a single file. The front matter, the constraint, the claim, and the landscape each carry an as_of date and a review_by expiry date.
---
agentmark: 0.2.0
title: Hermes Coding Harness
id: hermes
owner: Joel Tong
status: draft
as_of: 2026-06-02
review_by: 2026-06-16
timezone: Asia/Singapore
views: [topology, decision, risk, eval, landscape]
---
# Topology
[human#joel: Joel] -> [harness#hermes: Hermes]
[harness#hermes]
  -> [agent#coder: Coding Agent {roles: [planner, coder, tester, executor], autonomy: 3}]
  -> [middleware#mcp_selector: MCP Tool Selector {strategy: shortlist, top_k: 12}]
  -> [mcp#mcp_gateway: MCP Gateway]
[agent#coder] -> [model#codex: Codex]
[agent#coder] -> [tool#tests: Test Runner]
[agent#coder] -> [fs#repo: Repository]
[agent#coder] ~>[traces] [log#otel: OpenTelemetry Traces]
# Constraints
constraint K-001:
  title: Claude Code backend harness restriction
  level: MUST_NOT
  applies_to: [harness#claude_code: Claude Code]
  scope: Hermes
  as_of: 2026-06-02
  review_by: 2026-06-16
# Claims
claim C-001:
  text: Codex is currently best suited for Hermes when optimized for cost and backend use.
  kind: comparative
  subjects: [Codex, Claude Code, GLM, Opus, Qwen Coder]
  metric: overall_suitability
  scope: Hermes
  evidence: [bench#CODE-HARNESS-2026-06]
  confidence: medium
  volatility: high
  as_of: 2026-06-02
  review_by: 2026-06-16
# Landscape
landscape: Coding Harnesses
as_of: 2026-06-02
review_by: 2026-06-16
scope: Hermes
adopt:
  - Codex { reason: cost/performance and backend-harness suitability }
blocked:
  - Claude Code { reason: K-001 }
# Decision
decision D-001:
  title: Use Codex as primary Hermes coding harness
  status: accepted
  chosen: [model#codex: Codex]
  alternatives: [Claude Code, GLM, Opus, Qwen Coder]
  constrained_by: [K-001]
  supported_by: [C-001]
  invalid_if:
    - K-001 is false
    - C-001 expires
# Benchmarks
bench CODE-HARNESS-2026-06:
  title: Coding-agent substitute comparison
  run_on: 2026-06-02
  repeat: weekly
  candidates: [Codex, GLM, Opus, Qwen Coder]
  metrics:
    - accepted_patch_rate
    - cost_per_accepted_patch
# Views
view topology:
  show: [human, harness, agent, middleware, mcp, model, tool, fs, log]
view decision:
  show: [decision, constraint, claim, landscape]
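The dated blocks in the file above all share the same as_of / review_by pair. As a rough illustration of what those dates allow, the following Python sketch flags blocks whose review date has passed. It is not part of agentmark: the `BLOCKS` dictionary, the `"landscape"` key, and the function names are hypothetical, and it assumes a block stays valid through the whole of its review_by day.

```python
from datetime import date

# Dated blocks copied by hand from the example file.
# The dictionary layout and the "landscape" key are hypothetical.
BLOCKS = {
    "hermes": {"as_of": date(2026, 6, 2), "review_by": date(2026, 6, 16)},
    "K-001": {"as_of": date(2026, 6, 2), "review_by": date(2026, 6, 16)},
    "C-001": {"as_of": date(2026, 6, 2), "review_by": date(2026, 6, 16)},
    "landscape": {"as_of": date(2026, 6, 2), "review_by": date(2026, 6, 16)},
}


def is_expired(block, today):
    """Assumption: a block expires the day after its review_by date."""
    return today > block["review_by"]


def expired_ids(blocks, today):
    """Return the sorted ids of all blocks that are past review."""
    return sorted(
        block_id for block_id, block in blocks.items() if is_expired(block, today)
    )


if __name__ == "__main__":
    print(expired_ids(BLOCKS, date(2026, 6, 10)))  # inside the review window
    print(expired_ids(BLOCKS, date(2026, 6, 17)))  # one day past review_by
```

Passing `today` explicitly keeps the check deterministic. A real checker would presumably derive it from the file's `timezone` field (Asia/Singapore here).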
The topology alone is trivial, essentially [User] -> [Hermes] -> [Codex], but the surrounding constraints, claims, landscape, benchmarks, and invalidation conditions are the part that teams currently lose every two weeks, which is the same interval as this file's review window (as_of 2026-06-02, review_by 2026-06-16). Capturing them in the same file is the whole point. Open this example in the editor to see all of its generated views.
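The two invalid_if conditions on D-001 ("K-001 is false" and "C-001 expires") can be read as a small rule check. The Python sketch below shows one possible reading and is not how agentmark itself evaluates decisions. The `Constraint` and `Claim` classes and the `holds` field are hypothetical, and "expires" is assumed to mean that today is later than the claim's review_by date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Constraint:
    id: str
    holds: bool  # hypothetical flag: is the restriction still in force?


@dataclass
class Claim:
    id: str
    review_by: date


def invalidation_reasons(constraint, claim, today):
    """Return the invalid_if conditions of D-001 that currently fire."""
    reasons = []
    if not constraint.holds:
        reasons.append(f"{constraint.id} is false")
    if today > claim.review_by:  # assumed meaning of "expires"
        reasons.append(f"{claim.id} expires")
    return reasons


# Values taken from the example file; holds=True is an assumption.
k001 = Constraint("K-001", holds=True)
c001 = Claim("C-001", review_by=date(2026, 6, 16))

if __name__ == "__main__":
    print(invalidation_reasons(k001, c001, date(2026, 6, 10)))  # []
    print(invalidation_reasons(k001, c001, date(2026, 6, 20)))  # ['C-001 expires']
```

An empty list means D-001 can keep its accepted status. Any entry in the list is a reason to reopen the decision.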