1 · Overview & Philosophy

From diagram language to architecture packet. The original notation modeled agent topology — who calls whom. That is only one layer. If AgentMark is to become the UML of AI systems, it must model the whole stack, the way UML models classes, interfaces, components, processes, sequences, and deployments.

The failure mode of a topology-only notation is the endless "where do I put X?" question: "Where do I show the Claude Agent SDK?" · "Where do I show Playwright?" · "Where do I show MCP tool filtering?" · "Where do I show context compression?" · "Where do I show the browser sandbox?" Every unanswered question is a hole in the standard. The fix is a complete meta-model (the 8 layers) plus a small set of first-class concepts that no prior diagram language had.

The seven questions. A conventional diagram language answers one question: what connects to what? AgentMark answers seven:

What connects to what?
Why was it chosen?
Under what assumptions?
What constraints apply?
What evidence supports this?
When does it expire?
How do we know it still works?

In AI, the architecture is often less important than the assumptions behind it. The same diagram can become wrong without a single box moving, because the market underneath it shifted. Worked illustration: in 2026-06, Claude Code is chosen because its MCP support is strongest. In 2026-08, a competitor reaches MCP parity, so the decision is now invalid. The diagram is unchanged; the assumption changed. AgentMark is the only notation that records the assumption alongside the diagram, so the staleness is visible.

Precedents AgentMark borrows from:

Source	Lesson AgentMark takes
Mermaid	Text-to-diagram wins when it is terse and Markdown-like, with YAML frontmatter for metadata.
C4 model	Architecture needs multiple zoom levels and views, not one giant diagram.
ADRs	Architecture decisions need recorded context and consequences.
RFC 2119	A proven shared vocabulary for requirement strength: MUST / SHOULD / MAY.
SemVer	A proven way to evolve a spec without breaking adopters.
Thoughtworks Tech Radar	Periodically-reviewed "blips" with adoption stances: Adopt / Trial / Assess / Hold.
OpenTelemetry GenAI	First-class semantic conventions for traces, metrics, model & agent spans.
NIST AI RMF	Govern / Map / Measure / Manage — make risk explicit and managed.

Why existing tools don't solve it: UML, C4, BPMN, and Mermaid all document the system but not its volatile justification. ADRs capture decisions but sit detached from the diagram. What AI teams actually need is a way to encode why this topology exists and what was true when it was created, fused into the same artifact.

Three operating modes. Sketch mode (for virality and whiteboards):

[User] -> [Agent] -> [Model]
[Agent] -> [MCP] -> [Tool]
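
To show how little machinery sketch mode needs, here is a minimal Python sketch that turns the two lines above into a list of edges. It is an illustration only, not a reference parser: the function name `parse_sketch` and the rule that each line is a chain of bracketed labels joined by `->` are assumptions drawn from the sample, not from a published grammar.

```python
import re

# Assumption: a sketch-mode node is a label wrapped in one pair of brackets.
NODE = re.compile(r"\[([^\[\]]+)\]")


def parse_sketch(text):
    """Return (source, target) edges from sketch-mode lines (hypothetical helper)."""
    edges = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        names = []
        # Split the chain on the arrow, then read the label of each segment.
        for part in line.split("->"):
            match = NODE.fullmatch(part.strip())
            if match is None:
                raise ValueError(f"not a sketch-mode node: {part.strip()!r}")
            names.append(match.group(1).strip())
        # A chain A -> B -> C yields the edges (A, B) and (B, C).
        edges.extend(zip(names, names[1:]))
    return edges


sketch = """
[User] -> [Agent] -> [Model]
[Agent] -> [MCP] -> [Tool]
"""
edges = parse_sketch(sketch)
print(edges)
```

Each chain is flattened into pairwise edges, so the two sample lines produce four edges, two of which start at `Agent`.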

Spec mode (engineering):

[agent#coder: Coding Agent {roles: [coder, tester], autonomy: 3}]
  -> [middleware#mcp_selector: MCP Tool Selector]
  -> [mcp#github: GitHub MCP]
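
The spec-mode sample packs a kind, an identifier, a label, and an optional attribute block into each node. The Python sketch below pulls those parts out with a regular expression. The shape `[kind#id: Label {attrs}]` is inferred from the sample alone, and `SPEC_NODE` and `parse_spec_nodes` are hypothetical names; the attribute block is kept as raw text rather than interpreted, since its grammar is not given here.

```python
import re

# Assumed node shape, inferred from the sample: [kind#id: Label {optional attrs}]
SPEC_NODE = re.compile(
    r"\[(?P<kind>\w+)#(?P<id>\w+):\s*(?P<label>[^{\]]+?)\s*(?P<attrs>\{.*\})?\]"
)


def parse_spec_nodes(text):
    """Extract spec-mode node declarations as dictionaries (hypothetical helper)."""
    nodes = []
    for match in SPEC_NODE.finditer(text):
        nodes.append({
            "kind": match.group("kind"),
            "id": match.group("id"),
            "label": match.group("label"),
            # Raw attribute text, or None when the node has no {...} block.
            "attrs": match.group("attrs"),
        })
    return nodes


spec = """
[agent#coder: Coding Agent {roles: [coder, tester], autonomy: 3}]
  -> [middleware#mcp_selector: MCP Tool Selector]
  -> [mcp#github: GitHub MCP]
"""
nodes = parse_spec_nodes(spec)
for node in nodes:
    print(node["kind"], node["id"], node["label"])
```

Separating `kind` from `id` is what lets a tool group nodes by type (agents, middleware, MCP servers) while still addressing each one uniquely.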

Decision mode (time-sensitive AI architecture):

claim C-001:
  text: GLM is cheapest credible coding substitute.
  confidence: medium
  review_by: 2026-06-16

decision D-001:
  chosen: Codex
  supported_by: [C-001]
  invalid_if: [C-001 expires]
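
One way to act on decision mode is a staleness check that flags every decision whose supporting claims have expired. The Python sketch below models the sample above under two stated assumptions: a claim "expires" once the current date is past its `review_by` date, and `invalid_if: [C-001 expires]` is simplified to "the decision is stale if any claim in `supported_by` has expired". The class and function names are hypothetical and not part of AgentMark.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Claim:
    id: str
    text: str
    confidence: str
    review_by: date

    def expired(self, today: date) -> bool:
        # Assumption: a claim expires the day after its review_by date.
        return today > self.review_by


@dataclass
class Decision:
    id: str
    chosen: str
    supported_by: list = field(default_factory=list)


def stale_decisions(decisions, claims, today):
    """Return (decision id, expired claim ids) for each stale decision."""
    by_id = {claim.id: claim for claim in claims}
    stale = []
    for decision in decisions:
        expired = [cid for cid in decision.supported_by if by_id[cid].expired(today)]
        if expired:
            stale.append((decision.id, expired))
    return stale


claims = [Claim("C-001", "GLM is cheapest credible coding substitute.",
                "medium", date(2026, 6, 16))]
decisions = [Decision("D-001", "Codex", supported_by=["C-001"])]

print(stale_decisions(decisions, claims, date(2026, 6, 16)))  # still within review window
print(stale_decisions(decisions, claims, date(2026, 6, 17)))  # claim has expired
```

Run on a schedule or in CI, a check like this surfaces the case described earlier: the diagram has not changed, but a claim underneath it has passed its review date.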