syntaxai/tdd.md · commit 7b55b52

Blog post: greening our own dogfood — Modeled flipped to ✓ on /sama/verify

The receipt for the live verifier moving from 3/4 to 4/4 pillars on
syntaxai/tdd.md. Same URL anyone can hit, same answer the local CLI
gives. Cites the 4 missing-sibling violations as the named artifact,
the 4 new test files as the fix, and the public dogfood URL as the
proof.

- content/blog/sama-empirical-modeled-green.md — the case-study post
- src/c31_blog.ts — registry entry dated 2026-05-22

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

author: syntaxai <[email protected]>
date: 2026-05-22 13:06:30 +01:00
parent: 44fca64
commit: 7b55b52e6fe4d3924d61d56545dfa234c5923459

2 files changed · +149 −0

added content/blog/sama-empirical-modeled-green.md +143 −0

@@ -0,0 +1,143 @@
	1	+# Greening our own dogfood: four sibling tests, the live verifier flipped from 3/4 to 4/4
	2	+
	3	+The strongest claim a coding standard can make is "I follow my own rules
	4	+on my own codebase, in public, on the page where I document the rules."
	5	+The weakest version of that claim is the version you'd make about your
	6	+own work where nobody can check. Halfway is a `/sama/verify?repo=mine`
	7	+URL that anyone can hit — and that says three of four pillars green.
	8	+
	9	+That's where this site was yesterday. Today it's 4/4. Here's the receipt.
	10	+
	11	+## The dogfood URL
	12	+
	13	+`/sama/verify` is the public verifier on tdd.md. Drop in any GitHub
	14	+`owner/name` and it runs the four SAMA checks against the default
	15	+branch — same checks, same code, as the local `sama check` CLI. The
	16	+"verify any public repo" link on the SAMA landing page seeds it with
	17	+this site's own slug as an example:
	18	+
	19	+```
	20	+https://tdd.md/sama/verify?repo=syntaxai/tdd.md
	21	+```
	22	+
	23	+Yesterday that page showed:
	24	+
	25	+```
	26	+Sorted ✓ pass
	27	+Architecture ✓ pass
	28	+Modeled ✗ 4 violations
	29	+Atomic ✓ pass
	30	+```
	31	+
	32	+The four Modeled violations were real. SAMA's Modeled rule is strict:
	33	+every `c32_*.ts` file must have a sibling `.test.ts`. Four files were
	34	+missing one:
	35	+
	36	+```
	37	+c32_judge.ts — no sibling test file at c32_judge.test.ts
	38	+c32_session.ts — no sibling test file at c32_session.test.ts
	39	+c32_real_reports.ts — no sibling test file at c32_real_reports.test.ts
	40	+c32_real_tests.ts — no sibling test file at c32_real_tests.test.ts
	41	+```
	42	+
	43	+They'd been flagged on every check for weeks. Long enough that the
	44	+"we follow our own rules" narrative was starting to wear thin every
	45	+time someone pasted the dogfood URL into a conversation.
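
A minimal sketch of the sibling rule described above, for readers who want to see its shape in code. This is not the verifier's actual implementation: `findMissingSiblings`, `isC32Source`, `siblingTestPath`, and the `Violation` type are hypothetical names, and the assumption is that the check only needs a flat list of file paths.

```typescript
// Hypothetical sketch of a sibling-test check; NOT the real SAMA verifier code.

/** One finding: a c32 source file and the test path it was expected to have. */
interface Violation {
  file: string;
  expectedTest: string;
}

/** True for `c32_judge.ts`, false for `c32_judge.test.ts` or non-c32 files. */
function isC32Source(path: string): boolean {
  const name = path.split("/").pop() ?? path;
  return /^c32_.+\.ts$/.test(name) && !name.endsWith(".test.ts");
}

/** Maps `src/c32_judge.ts` to `src/c32_judge.test.ts`. */
function siblingTestPath(path: string): string {
  return path.replace(/\.ts$/, ".test.ts");
}

/** Returns one violation per c32 source file whose sibling test is absent. */
function findMissingSiblings(paths: string[]): Violation[] {
  const present = new Set(paths);
  return paths
    .filter(isC32Source)
    .filter((p) => !present.has(siblingTestPath(p)))
    .map((p) => ({ file: p, expectedTest: siblingTestPath(p) }));
}
```

Given a listing that contains `src/c32_judge.ts` but no `src/c32_judge.test.ts`, the sketch reports exactly that pair, which is the same "missing file at this path" shape the verifier output uses.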
	46	+
	47	+## What I did about it
	48	+
	49	+Four sibling test files. One per violation. 55 new tests, all real
	50	+unit tests with real `expect()` calls — the verifier flags placeholder
	51	+tests too (zero `expect()` calls in a `test()` body) under the same
	52	+Atomic pass, so cheating with empty bodies isn't an option.
	53	+
	54	+| File | Tests | What's actually covered |
	55	+|---|---:|---|
	56	+| `c32_session.test.ts` | 24 | The pure helpers: `parseCookies`, `timingSafeEqual`, `hmacSha256Hex`, `sessionCookieHeader`, `randomHex`, plus the full `signSession` ↔ `verifySession` round-trip including forged-signature and expired-cookie rejection paths |
	57	+| `c32_judge.test.ts` | 9 | `applyMode` — the strict / pragmatic / learning penalty math (positive deltas pass through; pragmatic halves negatives; learning zeroes them); `explainRefactor` — the two-branch explanation strings |
	58	+| `c32_real_reports.test.ts` | 12 | `detectAgent` (Claude / Cursor / Aider attribution from commit footers, case-insensitive); `buildTrend` (30-day daily commit sparkline — out-of-window drop, same-day stacking, empty input flat-lines) |
	59	+| `c32_real_tests.test.ts` | 10 | `detectAgent` again (this one returns `null` instead of `"unknown"` — documented difference); `shortenTestLabel` |
	60	+
	61	+`c32_session.ts` was the easy one — every helper it ships is already
	62	+exported. The other three needed a small visibility move: their `const`
	63	+helpers (`applyMode`, `explainRefactor`, `detectAgent`, `buildTrend`,
	64	+`shortenTestLabel`) had to become `export const` so the sibling could
	65	+import them. No behaviour changed. The exports are the test surface
	66	+SAMA's Modeled rule expects c32 files to have anyway — burying pure
	67	+helpers as private `const` defeats the whole point of putting them in
	68	+a c32 file in the first place.
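
To make the penalty math from the table concrete, here is a hedged sketch of what an exported pure helper of that kind could look like. The name `applyModeSketch`, the `Mode` type, and the `(delta, mode)` signature are assumptions for illustration; the real `applyMode` in `c32_judge.ts` may differ. Two details are also assumed rather than stated in the post: strict mode leaves negative deltas unchanged, and pragmatic halving does no rounding.

```typescript
// Hypothetical sketch; the real `applyMode` in c32_judge.ts may differ.
type Mode = "strict" | "pragmatic" | "learning";

/**
 * Adjusts a score delta according to the judging mode.
 * - Positive (and zero) deltas pass through in every mode.
 * - strict: negatives are kept as-is (assumption, not stated in the post).
 * - pragmatic: negatives are halved (no rounding, also an assumption).
 * - learning: negatives are zeroed.
 */
export const applyModeSketch = (delta: number, mode: Mode): number => {
  if (delta >= 0) return delta;
  switch (mode) {
    case "strict":
      return delta;
    case "pragmatic":
      return delta / 2;
    case "learning":
      return 0;
  }
};
```

Because the helper is exported and pure, a sibling test can import it and assert on plain numbers without mocking anything, which is the point of the visibility move.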
	69	+
	70	+## What the verifier sees now
	71	+
	72	+Local run:
	73	+
	74	+```
	75	+$ bun scripts/sama-cli.ts check
	76	+SAMA verify · (local)/src · (working tree)
	77	+ examined 71 SAMA files / 16 tests / 74 src files
	78	+
	79	+ S — Sorted: ✓ pass (71 files)
	80	+ A — Architecture: ✓ pass (71 files)
	81	+ M — Modeled: ✓ pass (55 files)
	82	+ A — Atomic: ✓ pass (71 files)
	83	+
	84	+✓ all four checks passed
	85	+```
	86	+
	87	+Live run — same code, just over HTTP:
	88	+
	89	+```
	90	+$ curl -s 'https://tdd.md/sama/verify?repo=syntaxai/tdd.md' \
	91	+  | grep -oE '(Sorted|Architecture|Modeled|Atomic)[^<]*<'
	92	+Architecture · ✓ pass<
	93	+Atomic · ✓ pass<
	94	+Modeled · ✓ pass<
	95	+Sorted · ✓ pass<
	96	+```
	97	+
	98	+Same answer, two delivery surfaces, one source of truth.
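
The `grep` above can also be expressed as a small pure function, which is handy if you want to assert on the live page from a script. This is a sketch under assumptions: `parsePillars` and `allGreen` are hypothetical helpers, and the markup shape (pillar name, a `·` separator, status text, then a `<`) is inferred only from the grep output shown, not from the page source.

```typescript
// Hypothetical helpers; markup shape is inferred from the grep output above.
type Pillar = "Sorted" | "Architecture" | "Modeled" | "Atomic";

const PILLARS: Pillar[] = ["Sorted", "Architecture", "Modeled", "Atomic"];

/** Extracts the first non-empty status text that follows each pillar name. */
function parsePillars(html: string): Map<Pillar, string> {
  const result = new Map<Pillar, string>();
  // Same pattern as the grep: pillar name, then everything up to the next "<".
  const pattern = /(Sorted|Architecture|Modeled|Atomic)([^<]*)</g;
  for (const match of html.matchAll(pattern)) {
    const pillar = match[1] as Pillar;
    // Drop the leading " · " separator and surrounding whitespace.
    const status = (match[2] ?? "").replace(/^[\s·]+/, "").trim();
    // Skip bare mentions (for example link text) that carry no status.
    if (status && !result.has(pillar)) result.set(pillar, status);
  }
  return result;
}

/** True only when all four pillars report a passing status. */
function allGreen(html: string): boolean {
  const statuses = parsePillars(html);
  return PILLARS.every((p) => statuses.get(p)?.includes("✓ pass") ?? false);
}
```

In a Bun or Node script the input could come from `await (await fetch(url)).text()`; keeping the parsing separate from the network call means the function itself stays testable offline.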
	99	+
	100	+## Why this matters as evidence
	101	+
	102	+Two things are true that weren't true yesterday:
	103	+
	104	+1. The publicly-hosted verifier shows this codebase clean. Anyone
	105	+   can open `https://tdd.md/sama/verify?repo=syntaxai/tdd.md` and see
	106	+ 4/4 ✓. Skeptics don't have to trust me; they can hit the URL.
	107	+2. The fix shape matched the verifier's instruction exactly.
	108	+ The verifier said "no sibling test file at `<path>`." The fix was
	109	+ four sibling test files at those four paths. No metaprogramming, no
	110	+ reorganisation, no carve-out. The instruction was the diff.
	111	+
	112	+That's the empirical loop SAMA's pitch turns on: the verifier names
	113	+the missing artifact, the operator (human or agent) produces it, the
	114	+verifier flips green. No judgement calls, no taste, no escalation.
	115	+On a working codebase that you're shipping to users, you can run the
	116	+loop in an afternoon.
	117	+
	118	+## What this doesn't prove
	119	+
	120	+It doesn't prove the standard scales. It doesn't prove the standard
	121	+works on Python or Rust. It doesn't prove AI agents using SAMA write
	122	+better code than agents not using SAMA (that's [issue #1](https://github.com/syntaxai/tdd.md/issues/1)
	123	+— 360 measured data points, still to come).
	124	+
	125	+It proves one thing: on this codebase, the rules described in
	126	+`/sama/*` are the rules running in `bun scripts/sama-cli.ts check`,
	127	+which are the rules running on `https://tdd.md/sama/verify`, and the
	128	+codebase passes all four of them. The website is the spec is the
	129	+verifier is the test suite, and they agree on the same answer.
	130	+
	131	+That's the smallest unit of "this standard isn't just a blog post."
	132	+The blog post is the receipt; the URL is the proof.
	133	+
	134	+---
	135	+
	136	+See for yourself:
	137	+
	138	+- Live dogfood: <https://tdd.md/sama/verify?repo=syntaxai/tdd.md>
	139	+- The four checks documented:
	140	+ [Sorted](/sama/sorted) · [Architecture](/sama/architecture) ·
	141	+ [Modeled](/sama/modeled) · [Atomic](/sama/atomic)
	142	+- Previous post in this series:
	143	+ [When the verifier said "split this"](/blog/sama-empirical-c21-split)

modified src/c31_blog.ts +6 −0

@@ -12,6 +12,12 @@ export interface BlogEntry {
12	12	}
13	13
14	14	export const ALL_POSTS: BlogEntry[] = [
	15	+ {
	16	+ slug: "sama-empirical-modeled-green",
	17	+ title: "Greening our own dogfood: four sibling tests, the live verifier flipped from 3/4 to 4/4",
	18	+ description: "/sama/verify?repo=syntaxai/tdd.md is the public verifier on tdd.md. Yesterday it showed three of four SAMA pillars green for this codebase — Modeled was flagging four c32_* files without sibling tests. Today it shows 4/4. Receipt for the round-trip: four new test files (55 unit tests), const → export const visibility lifts on pure helpers in three files, no behaviour changes, and the same URL anyone in the world can hit now reports the same answer the local CLI does. The website is the spec is the verifier is the test suite.",
	19	+ date: "2026-05-22",
	20	+ },
15	21	{
16	22	slug: "sama-empirical-c21-split",
17	23	title: "When the verifier said 'split this': one Atomic-700 hit, four handler files, the build stayed green",
