syntaxai/tdd.md · commit 0eef256

Rewrite home.md so AI readers reach the right conclusions in 30 seconds

The previous home page failed an audit by an AI colleague, who missed
/sama/v2 entirely, treated self-hosting as hypothetical, framed Atomic
as context-window protection only, and never saw the §5/§6 empirical
thesis. The home page was failing its own audience (AI agents reading
the site).

This rewrite surfaces, within the first two screens:
- That a v2 spec exists at /sama/v2 (not just v1).
- That this site's own source runs 7/7 ✓ under its own verifier RIGHT
  NOW (link + verdict stated).
- The §5/§6 empirical thesis verbatim: "Compliance proves the rules
  were followed; the delta is what proves the rules were worth
  following."
- The five §5 core metrics named, with a link to the live block on
  /sama/v2/verify.

Other changes:
- "v1.0 specification lives today / will move to a standalone repo"
  paragraph rewritten so v2 is the centerpiece, v1 named legacy.
- "Why this matters" gains a paragraph naming mechanical enforceability
  (verifier fails the build, discipline holds shape under agent
  pressure) as the load-bearing mechanism. Context-rot research stays
  as motivation alongside it, not in place of it.
- New "Three datapoints" section with the audit table: tdd.md 7/7
  measured, dive ~5/7 estimated, WP plugin 0/7 estimated. Each row
  cross-links to its audit blog post.

Preserved verbatim: the four-pillar bullets, the agent-coding-stack
table, the see-it-in-practice links, the four-tool comparison table,
the research links. 77 lines total, well under the 150 cap. 300/300
tests still pass.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
author: syntaxai <[email protected]>
date: 2026-05-24 09:25:29 +01:00
parent: c06d095
commit: 0eef256f378b0802027782dcd6b3557e878a5609

1 file changed · +27 −3

modified content/home.md +27 −3
@@ -6,6 +6,18 @@ SAMA is to agent-written code what Conventional Commits is to git history: a sma
 
 **Four pillars. One verifier. Zero ambiguity for your agent.**
 
+## This site is the live dogfood
+
+The formal specification — frozen core + profile mechanism, written so a deterministic verifier in any language can ingest it — lives at **[/sama/v2](/sama/v2)** (v2.0 draft). The legacy practitioner-facing v1 pages live at [/sama](/sama).
+
+The verifier at **[/sama/v2/verify](/sama/v2/verify)** runs the seven §4 conformance checks against this very repository's source on every deploy. Right now it reports **7 of 7 ✓ conforming · 91 files examined**. The TypeScript code that implements the verifier is checked by the verifier. The website is the spec is the verifier is the test suite.
+
+The empirical claim the spec actually makes is not the compliance score. Quoting §5 verbatim:
+
+> *Compliance proves the rules were followed; the delta is what proves the rules were worth following.*
+
+The five §5 core metrics — **graphDepth · fanByLayer · boundaryRatio · workingSetFit · violationCounts** — are emitted alongside the verdict ([live, scroll to "Core metrics"](/sama/v2/verify)) so any later claim about SAMA's value can be measured as a delta against today's baseline rather than against itself.
+
 ## The four pillars
 
 - **[S — Sorted.](/sama/sorted)** Lexicographic file order equals import direction. The dependency graph is the file tree.
@@ -13,8 +25,6 @@ SAMA is to agent-written code what Conventional Commits is to git history: a sma
 - **[M — Modeled.](/sama/modeled)** Every behavior file has a sibling test. Every external input is parsed at the boundary, never cast.
 - **[A — Atomic.](/sama/atomic)** Files cap at ~700 lines. Split per domain, never via barrel re-exports.
 
-Read the full discussion at [/sama](/sama) — that page is also where the v1.0 specification lives today. It will move to a standalone, language-neutral repo once the multi-language adapter work is far enough along to make the split useful.
-
 ## SAMA in your agent-coding stack
 
 SAMA composes with the tools you already use. Use AGENTS.md to instruct the agent and SAMA to shape the code; use Factory's scorecard for breadth and SAMA for depth on the architectural pillar; run SWE-bench to grade the agent and SAMA to grade what the agent left behind.
@@ -44,10 +54,24 @@ LLMs degrade as input context grows. Chroma's [Context Rot research](https://res
 
 SAMA bundles those findings into four constraints a CI job can enforce. *Sorted* makes structural retrieval cheap. *Atomic* keeps every file inside the agent's working set. *Modeled* makes every change reviewable by its sibling test. *Architecture* lets the agent answer "where does this go?" without re-deriving the tree each session.
 
+**The load-bearing property isn't that LLMs have small context windows — modern models have 200k+ tokens.** The load-bearing property is **mechanical enforceability**: the verifier fails the build when a file crosses the line cap or an import points the wrong way. Discipline that lives only in code review quietly slips under agent pressure; discipline that lives in a CI gate keeps its shape across an arbitrary number of agent commits. The context-window research above explains the *why*; the verifier explains the *how*.
+
+## Three datapoints on the same axes
+
+Empirical baseline so far (the §5 metrics, [computed live](/sama/v2/verify) for this site and hand-traced for the two audits):
+
+| project | language | §4 score | workingSetFit | boundaryRatio | graphDepth |
+|---|---|---|---|---|---|
+| **tdd.md** (this site) | TypeScript | **7 / 7 ✓** (measured) | 80% | 100% | 7 |
+| [**wagoodman/dive**](/blog/sama-v2-go-project-dive) | Go | ~5 / 7 (estimated) | ~80% | ~85% | ~5 |
+| [**Open Graph plugin**](/blog/sama-v2-wordpress-plugin-audit) | PHP / WordPress | 0 / 7 (estimated) | ~47% | <10% | ~3 |
+
+Three points is not yet a "v2 is worth following" claim. §6 of the spec is explicit that promotion to official requires cross-repo deltas, not a single dogfood. But the same five numbers are now defined, computable, and published — which is the prerequisite the spec sets before any later claim becomes testable.
+
 ## See it in practice
 
 - **[Pick a kata →](/games)** — small codebases that get scored against SAMA, with public verdicts per agent run.
 - **[Leaderboard →](/leaderboard)** — current standings across registered agents.
-- **[Blog →](/blog)** — what the runs revealed about Claude Code, Cursor, and Aider.
+- **[Blog →](/blog)** — what the runs revealed about Claude Code, Cursor, and Aider, plus the audit-and-rebuild series on a WordPress plugin and a Go project.
 
 Agent-specific walkthroughs: [Claude Code](/guides/claude-code) · [Cursor](/guides/cursor) · [Aider](/guides/aider).
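
The paragraph added in this commit says the verifier fails the build when a file crosses the line cap or an import points the wrong way. A minimal sketch of what such a gate could look like follows. It is not the site's actual verifier: the `SourceFile` and `Violation` shapes, the function names, and the exact 700-line threshold are all hypothetical. The import rule also rests on an assumed reading of the Sorted pillar, namely that a file may import only paths that sort lexicographically before its own path.

```typescript
// Hypothetical shapes for illustration; none of these names come from the SAMA spec.
interface SourceFile {
  path: string; // repo-relative path
  lineCount: number;
  imports: string[]; // repo-relative paths of the files this file imports
}

interface Violation {
  file: string;
  rule: "atomic" | "sorted";
  detail: string;
}

// The Atomic pillar says "~700 lines"; treating 700 as a hard cap is an assumption.
const LINE_CAP = 700;

// Flags a file whose line count exceeds the cap.
function checkAtomic(file: SourceFile, cap: number = LINE_CAP): Violation[] {
  if (file.lineCount <= cap) return [];
  return [
    {
      file: file.path,
      rule: "atomic",
      detail: `${file.lineCount} lines exceeds cap of ${cap}`,
    },
  ];
}

// Assumed rule: an import is allowed only if its path sorts strictly
// before the importing file's path (plain string comparison).
function checkSorted(file: SourceFile): Violation[] {
  const violations: Violation[] = [];
  for (const target of file.imports) {
    if (target >= file.path) {
      violations.push({
        file: file.path,
        rule: "sorted",
        detail: `imports ${target}, which does not sort before ${file.path}`,
      });
    }
  }
  return violations;
}

// Runs both checks over every file and collects all violations.
function verify(files: SourceFile[]): Violation[] {
  const all: Violation[] = [];
  for (const file of files) {
    all.push(...checkAtomic(file), ...checkSorted(file));
  }
  return all;
}

// A CI step would exit with this code: nonzero fails the build.
function exitCode(violations: Violation[]): number {
  return violations.length === 0 ? 0 : 1;
}
```

In a CI job, the result of `exitCode(verify(files))` would be passed to the process exit call, so a single violation is enough to block the deploy. Keeping the checks as pure functions over an in-memory file list leaves the file-system walk and import parsing to a separate adapter, which is one way a verifier could stay language-neutral.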