| 1 | +# Greening our own dogfood: four sibling test files flip the live verifier from 3/4 to 4/4 |
| 2 | + |
| 3 | +The strongest claim a coding standard can make is "I follow my own rules |
| 4 | +on my own codebase, in public, on the page where I document the rules." |
| 5 | +The weakest version of that claim is the version you'd make about your |
| 6 | +own work where nobody can check. Halfway is a `/sama/verify?repo=mine` |
| 7 | +URL that anyone can hit — and that says **three of four** pillars green. |
| 8 | + |
| 9 | +That's where this site was yesterday. Today it's 4/4. Here's the receipt. |
| 10 | + |
| 11 | +## The dogfood URL |
| 12 | + |
| 13 | +`/sama/verify` is the public verifier on tdd.md. Drop in any GitHub |
| 14 | +`owner/name` and it runs the four SAMA checks against the default |
| 15 | +branch — same checks, same code, as the local `sama check` CLI. The |
| 16 | +*"verify any public repo"* link on the SAMA landing page seeds it with |
| 17 | +this site's own slug as an example: |
| 18 | + |
| 19 | +``` |
| 20 | +https://tdd.md/sama/verify?repo=syntaxai/tdd.md |
| 21 | +``` |
| 22 | + |
| 23 | +Yesterday that page showed: |
| 24 | + |
| 25 | +``` |
| 26 | +Sorted ✓ pass |
| 27 | +Architecture ✓ pass |
| 28 | +Modeled ✗ 4 violations |
| 29 | +Atomic ✓ pass |
| 30 | +``` |
| 31 | + |
| 32 | +The four Modeled violations were real. SAMA's Modeled rule is strict: |
| 33 | +every `c32_*.ts` file must have a sibling `.test.ts`. Four files were |
| 34 | +missing one: |
| 35 | + |
| 36 | +``` |
| 37 | +c32_judge.ts — no sibling test file at c32_judge.test.ts |
| 38 | +c32_session.ts — no sibling test file at c32_session.test.ts |
| 39 | +c32_real_reports.ts — no sibling test file at c32_real_reports.test.ts |
| 40 | +c32_real_tests.ts — no sibling test file at c32_real_tests.test.ts |
| 41 | +``` |
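
For readers who want the rule in code form, here is a minimal sketch of a sibling-test check. It is not the verifier's actual implementation: the function name `findMissingSiblingTests`, the flat path-list input, and the message wording are all assumptions made for illustration.

```typescript
// Hypothetical sketch of the Modeled rule, NOT the real SAMA verifier code.
// Input: a flat list of file paths. Output: one message per c32_*.ts file
// that has no sibling <name>.test.ts next to it.
function findMissingSiblingTests(paths: string[]): string[] {
  const violations: string[] = [];
  for (const path of paths) {
    const file = path.slice(path.lastIndexOf("/") + 1);
    // Only c32_*.ts sources are subject to the rule; test files are skipped.
    if (!/^c32_.*\.ts$/.test(file) || /\.test\.ts$/.test(file)) continue;
    const sibling = path.replace(/\.ts$/, ".test.ts");
    if (paths.indexOf(sibling) === -1) {
      violations.push(`${file}: no sibling test file at ${sibling}`);
    }
  }
  return violations;
}
```

Fed a listing that contains `c32_session.ts` and `c32_session.test.ts` but only `c32_judge.ts`, this sketch would report a single violation for `c32_judge.ts`.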
| 42 | + |
| 43 | +They'd been flagged on every check for weeks. Long enough that the |
| 44 | +"we follow our own rules" narrative was starting to wear thin every |
| 45 | +time someone pasted the dogfood URL into a conversation. |
| 46 | + |
| 47 | +## What I did about it |
| 48 | + |
| 49 | +Four sibling test files. One per violation. 55 new tests, all real |
| 50 | +unit tests with real `expect()` calls — the verifier flags placeholder |
| 51 | +tests too (zero `expect()` calls in a `test()` body) under the same |
| 52 | +Atomic pass, so cheating with empty bodies isn't an option. |
| 53 | + |
| 54 | +| File | Tests | What's actually covered | |
| 55 | +|---|---:|---| |
| 56 | +| `c32_session.test.ts` | 24 | The pure helpers: `parseCookies`, `timingSafeEqual`, `hmacSha256Hex`, `sessionCookieHeader`, `randomHex`, plus the full `signSession` ↔ `verifySession` round-trip including forged-signature and expired-cookie rejection paths | |
| 57 | +| `c32_judge.test.ts` | 9 | `applyMode` — the strict / pragmatic / learning penalty math (positive deltas pass through; pragmatic halves negatives; learning zeroes them); `explainRefactor` — the two-branch explanation strings | |
| 58 | +| `c32_real_reports.test.ts` | 12 | `detectAgent` (Claude / Cursor / Aider attribution from commit footers, case-insensitive); `buildTrend` (30-day daily commit sparkline — out-of-window drop, same-day stacking, empty input flat-lines) | |
| 59 | +| `c32_real_tests.test.ts` | 10 | `detectAgent` again (this one returns `null` instead of `"unknown"` — documented difference); `shortenTestLabel` | |
| 60 | + |
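
To make the `applyMode` row concrete, here is a sketch of the penalty math as the table describes it. The real signature in `c32_judge.ts` isn't shown in this post, so the name `applyModeSketch`, the `Mode` type, and the argument order are assumptions; I'm also assuming strict mode leaves negative deltas untouched.

```typescript
// Hypothetical sketch of the mode-dependent penalty math described above.
// This is not the code in c32_judge.ts; names and signature are assumed.
type Mode = "strict" | "pragmatic" | "learning";

function applyModeSketch(delta: number, mode: Mode): number {
  // Positive (and zero) deltas pass through unchanged in every mode.
  if (delta >= 0) return delta;
  if (mode === "pragmatic") return delta / 2; // pragmatic halves negatives
  if (mode === "learning") return 0;          // learning zeroes negatives
  return delta;                               // strict: full penalty (assumed)
}
```

Because the helper is pure, each of the three modes can be pinned down with a couple of `expect()` calls and no fixtures, which is what makes it a natural fit for a sibling test file.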
| 61 | +`c32_session.ts` was the easy one — every helper it ships is already |
| 62 | +exported. The other three files needed a small visibility move: their `const` |
| 63 | +helpers (`applyMode`, `explainRefactor`, `detectAgent`, `buildTrend`, |
| 64 | +`shortenTestLabel`) had to become `export const` so each sibling could |
| 65 | +import them. No behaviour changed. The exports are the test surface |
| 66 | +SAMA's Modeled rule expects c32 files to have anyway — burying pure |
| 67 | +helpers as private `const` defeats the whole point of putting them in |
| 68 | +a c32 file in the first place. |
| 69 | + |
| 70 | +## What the verifier sees now |
| 71 | + |
| 72 | +Local run: |
| 73 | + |
| 74 | +``` |
| 75 | +$ bun scripts/sama-cli.ts check |
| 76 | +SAMA verify · (local)/src · (working tree) |
| 77 | + examined 71 SAMA files / 16 tests / 74 src files |
| 78 | + |
| 79 | + S — Sorted: ✓ pass (71 files) |
| 80 | + A — Architecture: ✓ pass (71 files) |
| 81 | + M — Modeled: ✓ pass (55 files) |
| 82 | + A — Atomic: ✓ pass (71 files) |
| 83 | + |
| 84 | +✓ all four checks passed |
| 85 | +``` |
| 86 | + |
| 87 | +Live run — same code, just over HTTP: |
| 88 | + |
| 89 | +``` |
| 90 | +$ curl -s 'https://tdd.md/sama/verify?repo=syntaxai/tdd.md' \ |
| 91 | + | grep -oE '(Sorted|Architecture|Modeled|Atomic)[^<]*<' |
| 92 | +Architecture · ✓ pass< |
| 93 | +Atomic · ✓ pass< |
| 94 | +Modeled · ✓ pass< |
| 95 | +Sorted · ✓ pass< |
| 96 | +``` |
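
The same check can be scripted instead of eyeballed. The sketch below mirrors the `grep` pattern above in TypeScript. It assumes the page keeps rendering each pillar as a name, a status marker (`✓` or `✗`), and then a tag, which is inferred from the output above and is not a documented contract. `parsePillarStatus` and `allGreen` are hypothetical names.

```typescript
// Hypothetical sketch: parse the verifier page's HTML the same way the
// grep above does. The markup shape is an assumption, not a stable API.
const PILLARS = ["Sorted", "Architecture", "Modeled", "Atomic"];

function parsePillarStatus(html: string): Record<string, boolean> {
  const status: Record<string, boolean> = {};
  const pattern = /(Sorted|Architecture|Modeled|Atomic)([^<]*)</g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    const pillar = match[1];
    const text = match[2];
    const passed = text.indexOf("✓ pass") !== -1;
    const failed = text.indexOf("✗") !== -1;
    // Ignore mentions with no status marker (headings, nav links).
    if (!passed && !failed) continue;
    // A pillar stays red once any occurrence reports a failure.
    status[pillar] = passed && status[pillar] !== false;
  }
  return status;
}

function allGreen(html: string): boolean {
  const status = parsePillarStatus(html);
  return PILLARS.every((pillar) => status[pillar] === true);
}
```

A pillar that never appears with a status marker counts as not green, so a changed page layout fails loudly rather than passing silently.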
| 97 | + |
| 98 | +Same answer, two delivery surfaces, one source of truth. |
| 99 | + |
| 100 | +## Why this matters as evidence |
| 101 | + |
| 102 | +Two things are true that weren't true yesterday: |
| 103 | + |
| 104 | +1. **The publicly hosted verifier shows this codebase clean.** Anyone |
| 105 | + can open `https://tdd.md/sama/verify?repo=syntaxai/tdd.md` and see |
| 106 | + 4/4 ✓. Skeptics don't have to trust me; they can hit the URL. |
| 107 | +2. **The fix shape matched the verifier's instruction exactly.** |
| 108 | + The verifier said "no sibling test file at `<path>`." The fix was |
| 109 | + four sibling test files at those four paths. No metaprogramming, no |
| 110 | + reorganisation, no carve-out. The instruction *was* the diff. |
| 111 | + |
| 112 | +That's the empirical loop SAMA's pitch turns on: the verifier names |
| 113 | +the missing artifact, the operator (human or agent) produces it, the |
| 114 | +verifier flips green. No judgement calls, no taste, no escalation. |
| 115 | +On a working codebase that you're shipping to users, you can run the |
| 116 | +loop in an afternoon. |
| 117 | + |
| 118 | +## What this *doesn't* prove |
| 119 | + |
| 120 | +It doesn't prove the standard scales. It doesn't prove the standard |
| 121 | +works on Python or Rust. It doesn't prove AI agents using SAMA write |
| 122 | +better code than agents not using SAMA (that's [issue #1](https://github.com/syntaxai/tdd.md/issues/1) |
| 123 | +— 360 measured data points, still to come). |
| 124 | + |
| 125 | +It proves *one* thing: on this codebase, the rules described in |
| 126 | +`/sama/*` are the rules running in `bun scripts/sama-cli.ts check`, |
| 127 | +which are the rules running on `https://tdd.md/sama/verify`, and the |
| 128 | +codebase passes all four of them. The website is the spec is the |
| 129 | +verifier is the test suite, and they agree on the same answer. |
| 130 | + |
| 131 | +That's the smallest unit of "this standard isn't just a blog post." |
| 132 | +The blog post is the receipt; the URL is the proof. |
| 133 | + |
| 134 | +--- |
| 135 | + |
| 136 | +**See for yourself:** |
| 137 | + |
| 138 | +- Live dogfood: <https://tdd.md/sama/verify?repo=syntaxai/tdd.md> |
| 139 | +- The four checks documented: |
| 140 | + [Sorted](/sama/sorted) · [Architecture](/sama/architecture) · |
| 141 | + [Modeled](/sama/modeled) · [Atomic](/sama/atomic) |
| 142 | +- Previous post in this series: |
| 143 | + [When the verifier said "split this"](/blog/sama-empirical-c21-split) |