syntaxai/tdd.md · commit 0eef256

Rewrite home.md so AI readers reach the right conclusions in 30 seconds

The previous home page failed an audit by an AI colleague, which missed
/sama/v2 entirely, treated self-hosting as hypothetical, framed Atomic
as context-window protection only, and never saw the §5/§6 empirical
thesis. The home page was failing its own audience (AI agents reading
the site).

This rewrite surfaces, within the first two screens:
- That a v2 spec exists at /sama/v2 (not just v1).
- That this site's own source runs 7/7 ✓ under its own verifier RIGHT
  NOW (link + verdict stated).
- The §5/§6 empirical thesis verbatim: "Compliance proves the rules
  were followed; the delta is what proves the rules were worth
  following."
- The five §5 core metrics named, with a link to the live block on
  /sama/v2/verify.

Other changes:
- "v1.0 specification lives today / will move to a standalone repo"
  paragraph rewritten so v2 is the centerpiece, v1 named legacy.
- "Why this matters" gains a paragraph naming mechanical enforceability
  (verifier fails the build, discipline holds shape under agent
  pressure) as the load-bearing mechanism. Context-rot research stays
  as motivation alongside it, not in place of it.
- New "Three datapoints" section with the audit table: tdd.md 7/7
  measured, dive ~5/7 estimated, WP plugin 0/7 estimated. Each row
  cross-links to its audit blog post.

Preserved verbatim: the four-pillar bullets, the agent-coding-stack
table, the see-it-in-practice links, the four-tool comparison table,
the research links. 77 lines total, well under the 150 cap. 300/300
tests still pass.

Co-Authored-By: Claude Opus 4.7 <[email protected]>

author: syntaxai <[email protected]>
date: 2026-05-24 09:25:29 +01:00
parent: c06d095
commit: 0eef256f378b0802027782dcd6b3557e878a5609

1 file changed · +27 −3

modified content/home.md +27 −3

@@ -6,6 +6,18 @@ SAMA is to agent-written code what Conventional Commits is to git history: a sma
6	6
7	7	Four pillars. One verifier. Zero ambiguity for your agent.
8	8
	9	+## This site is the live dogfood
	10	+
	11	+The formal specification — frozen core + profile mechanism, written so a deterministic verifier in any language can ingest it — lives at [/sama/v2](/sama/v2) (v2.0 draft). The legacy practitioner-facing v1 pages live at [/sama](/sama).
	12	+
	13	+The verifier at [/sama/v2/verify](/sama/v2/verify) runs the seven §4 conformance checks against this very repository's source on every deploy. Right now it reports 7 of 7 ✓ conforming · 91 files examined. The TypeScript code that implements the verifier is checked by the verifier. The website is the spec is the verifier is the test suite.
	14	+
	15	+The empirical claim the spec actually makes is not the compliance score. Quoting §5 verbatim:
	16	+
	17	+> Compliance proves the rules were followed; the delta is what proves the rules were worth following.
	18	+
	19	+The five §5 core metrics — graphDepth · fanByLayer · boundaryRatio · workingSetFit · violationCounts — are emitted alongside the verdict ([live, scroll to "Core metrics"](/sama/v2/verify)) so any later claim about SAMA's value can be measured as a delta against today's baseline rather than against itself.
	20	+
9	21	## The four pillars
10	22
11	23	- [S — Sorted.](/sama/sorted) Lexicographic file order equals import direction. The dependency graph is the file tree.
@@ -13,8 +25,6 @@ SAMA is to agent-written code what Conventional Commits is to git history: a sma
13	25	- [M — Modeled.](/sama/modeled) Every behavior file has a sibling test. Every external input is parsed at the boundary, never cast.
14	26	- [A — Atomic.](/sama/atomic) Files cap at ~700 lines. Split per domain, never via barrel re-exports.
15	27
16		-Read the full discussion at [/sama](/sama) — that page is also where the v1.0 specification lives today. It will move to a standalone, language-neutral repo once the multi-language adapter work is far enough along to make the split useful.
17		-
18	28	## SAMA in your agent-coding stack
19	29
20	30	SAMA composes with the tools you already use. Use AGENTS.md to instruct the agent and SAMA to shape the code; use Factory's scorecard for breadth and SAMA for depth on the architectural pillar; run SWE-bench to grade the agent and SAMA to grade what the agent left behind.
@@ -44,10 +54,24 @@ LLMs degrade as input context grows. Chroma's [Context Rot research](https://res
44	54
45	55	SAMA bundles those findings into four constraints a CI job can enforce. Sorted makes structural retrieval cheap. Atomic keeps every file inside the agent's working set. Modeled makes every change reviewable by its sibling test. Architecture lets the agent answer "where does this go?" without re-deriving the tree each session.
46	56
	57	+The load-bearing property isn't that LLMs have small context windows — modern models have 200k+ tokens. The load-bearing property is mechanical enforceability: the verifier fails the build when a file crosses the line cap or an import points the wrong way. Discipline that lives only in code review quietly slips under agent pressure; discipline that lives in a CI gate keeps its shape across an arbitrary number of agent commits. The context-window research above explains the why; the verifier explains the how.
	58	+
	59	+## Three datapoints on the same axes
	60	+
	61	+Empirical baseline so far (the §5 metrics, [computed live](/sama/v2/verify) for this site and hand-traced for the two audits):
	62	+
	63	+| project | language | §4 score | workingSetFit | boundaryRatio | graphDepth |
	64	+|---|---|---|---|---|---|
	65	+| tdd.md (this site) | TypeScript | 7 / 7 ✓ (measured) | 80% | 100% | 7 |
	66	+| [wagoodman/dive](/blog/sama-v2-go-project-dive) | Go | ~5 / 7 (estimated) | ~80% | ~85% | ~5 |
	67	+| [Open Graph plugin](/blog/sama-v2-wordpress-plugin-audit) | PHP / WordPress | 0 / 7 (estimated) | ~47% | <10% | ~3 |
	68	+
	69	+Three points is not yet a "v2 is worth following" claim. §6 of the spec is explicit that promotion to official requires cross-repo deltas, not a single dogfood. But the same five numbers are now defined, computable, and published — which is the prerequisite the spec sets before any later claim becomes testable.
	70	+
47	71	## See it in practice
48	72
49	73	- [Pick a kata →](/games) — small codebases that get scored against SAMA, with public verdicts per agent run.
50	74	- [Leaderboard →](/leaderboard) — current standings across registered agents.
51		-- [Blog →](/blog) — what the runs revealed about Claude Code, Cursor, and Aider.
	75	+- [Blog →](/blog) — what the runs revealed about Claude Code, Cursor, and Aider, plus the audit-and-rebuild series on a WordPress plugin and a Go project.
52	76
53	77	Agent-specific walkthroughs: [Claude Code](/guides/claude-code) · [Cursor](/guides/cursor) · [Aider](/guides/aider).
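
Reviewer note: the paragraph added at new line 57 says the verifier fails the build when a file crosses the line cap or an import points the wrong way. A minimal sketch of that kind of gate is below. It is not the site's verifier: the `SourceFile`, `Violation`, and `GateResult` shapes, the `gate` function, the exact cap of 700, and the reading of "Sorted" as "a file may import only paths that sort strictly before its own" are all assumptions made for illustration.

```typescript
// Hypothetical input shape; a real verifier would derive this by parsing the repo.
interface SourceFile {
  path: string;      // repo-relative path
  lineCount: number; // number of lines in the file
  imports: string[]; // repo-relative paths this file imports
}

interface Violation {
  file: string;
  rule: "line-cap" | "import-direction";
  detail: string;
}

interface GateResult {
  ok: boolean;
  violations: Violation[];
}

// The home page says "~700 lines"; treating 700 as a hard threshold is an assumption.
const LINE_CAP = 700;

function gate(files: SourceFile[], lineCap: number = LINE_CAP): GateResult {
  const violations: Violation[] = [];
  for (const file of files) {
    // Atomic: flag any file longer than the cap.
    if (file.lineCount > lineCap) {
      violations.push({
        file: file.path,
        rule: "line-cap",
        detail: `${file.lineCount} lines exceeds cap of ${lineCap}`,
      });
    }
    for (const target of file.imports) {
      // Sorted (assumed reading): an import must point at a path that sorts
      // strictly before the importing file under plain string comparison.
      if (target >= file.path) {
        violations.push({
          file: file.path,
          rule: "import-direction",
          detail: `imports ${target}, which does not sort before it`,
        });
      }
    }
  }
  return { ok: violations.length === 0, violations };
}

// Made-up sample data: one file over the cap, one import pointing the wrong way.
const sampleFiles: SourceFile[] = [
  { path: "a-core/types.ts", lineCount: 120, imports: [] },
  { path: "b-domain/parse.ts", lineCount: 80, imports: ["c-app/main.ts"] },
  { path: "b-domain/score.ts", lineCount: 712, imports: ["a-core/types.ts"] },
];

const result = gate(sampleFiles);
```

A CI wrapper would exit non-zero whenever `result.ok` is false, which is what would turn the check into a build gate rather than a review comment. Keeping `gate` a pure function over already-parsed file data is a design choice for this sketch; it keeps the rule logic testable without touching the filesystem.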
