SAMA v2 verifier: build, ship, and dogfood — empirical proof scaffold
Builds the SAMA v2 §4 verifier end-to-end so /sama/v2/verify?repo=syntaxai/tdd.md returns 200 with an honest verdict for this repo under a real profile.

Goal #15 result: 5/7 checks ✓ with three named, file-level blockers documented in the live verdict.

New files (all v2-compliant under tdd-md profile):
- sama.profile.toml: repo root, the single source of truth for layer→prefix mapping (c31_→0, c32_/c51_→1, c13_/c14_→2, c21_/c11_→3)
- src/c31_sama_v2.ts: Layer 0 types (ProfileSpec, SamaV2Input, SamaV2Report, declaredLayer helper)
- src/c32_sama_v2_verify.ts (+ sibling): pure Layer 1 verifier, 7 checks; sibling has 20 tests covering each check's positive/negative fixture cases
- src/c14_sama_profile.ts (+ sibling): Layer 2 boundary, a minimal TOML subset parser + filesystem loader; 7 tests pin the parsed shape
- src/c21_handlers_sama.ts (extended): new samaV2VerifyHandler reads sama.profile.toml + walks src/, runs the verifier, renders the verdict via renderDocsPage
- src/c21_app.ts: new route "/sama/v2/verify"

Honesty refactors required for v2 conformance (preserve v1 behaviour throughout: bun test 220/220, sama v1 check 4/4 ✓):
- c32_judge / c32_real_reports / c32_real_tests do real I/O (git clone, fs, HTTP fetch). Under v2 §1.1 they cannot be Layer 1. Renamed to c14_* (Layer 2 Adapter). 6 files + every importer.
- SxDocumentSummary, ProjectRow, TreeEntry, GitCommitOk/Failure/Outcome were defined in c13/c14 (Layer 2) but imported by c51 render code (Layer 1), which makes them upward edges. Moved type definitions to c31_sxdoc / c31_project_config / c31_git_parse (Layer 0). All callers now import directly from Layer 0.

Honest blockers surfaced by the verifier (deferred, out of /goal scope):
- #1 Sorted: 14 violations. v1's c11/c13/c14/c21/c31/c32/c51 prefix scheme puts c-prefix layer numbers IN lex order, but v2's Pure/Core/Adapter/Entry mapping reverses that (c31_=Layer 0 should lex-FIRST, c11_=Layer 3 should lex-LAST). Requires a sweeping prefix-rename refactor.
- #3 Modeled (tests): 13 violations. Layer 1 (c51_) and Layer 2 (c14_) source files without sibling .test.ts. Need 13 new test files.
- #4 Modeled (boundary): 5 violations. c21_handlers_* call new URL / JSON.parse directly in Layer 3; v2 §4.4 wants those in Layer 2. Extract to c14_request_parse helpers.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
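The layer mapping above, and the two checks it feeds (upward import edges and the #1 Sorted blocker), can be sketched roughly as follows. This is an illustration only, not the code in `src/c31_sama_v2.ts`: the prefix table copies the mapping quoted in the message, but the `declaredLayer` signature, `isUpwardEdge`, `sortedViolations`, and the file names are assumptions, and the real verifier may count violations differently.

```typescript
// Hypothetical sketch; the real ProfileSpec and declaredLayer may differ.
const PREFIX_LAYER = new Map<string, number>([
  ["c31_", 0],               // Pure: types
  ["c32_", 1], ["c51_", 1],  // Core: pure logic and render code
  ["c13_", 2], ["c14_", 2],  // Adapter: I/O boundary
  ["c21_", 3], ["c11_", 3],  // Entry: handlers and server
]);

// Layer declared by a file's prefix, or null when no known prefix matches.
function declaredLayer(file: string): number | null {
  const base = file.split("/").pop() ?? file;
  return PREFIX_LAYER.get(base.slice(0, 4)) ?? null;
}

// An import is an upward edge when the importer sits on a lower layer than its target.
function isUpwardEdge(importer: string, imported: string): boolean {
  const from = declaredLayer(importer);
  const to = declaredLayer(imported);
  return from !== null && to !== null && to > from;
}

// Files whose layer is lower than that of some file sorting lexically before them.
function sortedViolations(files: string[]): string[] {
  const violations: string[] = [];
  let highest = -1;
  for (const file of [...files].sort()) {
    const layer = declaredLayer(file);
    if (layer === null) continue;
    if (layer < highest) violations.push(file);
    else highest = layer;
  }
  return violations;
}

// Layer 1 render code importing a Layer 2 adapter is the kind of edge the refactor removed.
console.log(isUpwardEdge("src/c51_render.ts", "src/c14_git.ts")); // true
// Under the v1 prefixes, c11_ (Layer 3) sorts before c31_ (Layer 0).
console.log(sortedViolations(["src/c31_blog.ts", "src/c11_server.ts"]));
```

Keeping the mapping in one table mirrors the role the message gives `sama.profile.toml`: both checks read the same source of truth, so renaming a prefix changes their verdicts together.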
38 files changed · +3712 −1015
content/blog/deploy-that-lies-cascade.md
+310
−0
| @@ -0,0 +1,310 @@ | ||
| 1 | +# When the deploy lies: three bugs hidden by one silent error suppressor | |
| 2 | + | |
| 3 | +The two prior posts in this thread were clean rounds: the verifier | |
| 4 | +named a violation, I produced the named artifact, the verifier flipped | |
| 5 | +green. Atomic-700 on `c21_app.ts` → split per domain → ✓. Modeled on | |
| 6 | +four `c32_*.ts` files → add the four siblings → ✓. Encouraging stories | |
| 7 | +about mechanical enforcement. | |
| 8 | + | |
| 9 | +This post is the messy round. It's the one that taught me that | |
| 10 | +mechanical enforcement only works if the pipeline that runs it is | |
| 11 | +itself running. | |
| 12 | + | |
| 13 | +## The visible bug | |
| 14 | + | |
| 15 | +`/reports/live` is the public live-data demo: real commit history for | |
| 16 | +this repo, rendered into a TDD-discipline scorecard, refreshed on every | |
| 17 | +deploy. On 2026-05-22 the header read: | |
| 18 | + | |
| 19 | +``` | |
| 20 | +tdd-discipline report · 2026-05-03 → 2026-05-10 | |
| 21 | +``` | |
| 22 | + | |
| 23 | +Twelve days of staleness on a page that calls itself "live." I'd | |
| 24 | +shipped seven commits across the previous rounds and none of them | |
| 25 | +appeared. | |
| 26 | + | |
| 27 | +## Why nobody noticed for 12 days | |
| 28 | + | |
| 29 | +The deploy script in git-mode invoked the snapshot generator over ssh: | |
| 30 | + | |
| 31 | +```bash | |
| 32 | +ssh "$SSH_HOST" "cd ~/$REMOTE_SRC_DIR && bun scripts/p620/snapshot-git-history.ts" 2>/dev/null \ | |
| 33 | + || echo " ⚠ snapshot-git-history skipped (script may live outside the rsync exclude — non-fatal)" | |
| 34 | +``` | |
| 35 | + | |
| 36 | +Two clauses are doing the damage: | |
| 37 | + | |
| 38 | +- `2>/dev/null` discards stderr — including the error message we'd want. | |
| 39 | +- `|| echo " ⚠ ... non-fatal"` turns a real failure into a printed | |
| 40 | + warning. Worse, the warning text *blames the wrong thing* | |
| 41 | + ("script may live outside the rsync exclude") so anyone who DID see | |
| 42 | + the warning would file it under "harmless artifact of rsync vs git | |
| 43 | + mode" and move on. | |
| 44 | + | |
| 45 | +The actual failure: there's no `bun` on the p620 host. Bun lives only | |
| 46 | +inside the tdd-md container image. The ssh tried to invoke a binary | |
| 47 | +that doesn't exist on PATH; the shell returned 127; the warning fired; | |
| 48 | +the deploy continued; the snapshot file's timestamp stayed at May 11. | |
| 49 | + | |
| 50 | +Twelve days. Every deploy. Both of the previous "clean rounds" deployed | |
| 51 | +through this same broken path and updated the *site* but not the | |
| 52 | +*live data*. The blog posts about going green were themselves served by | |
| 53 | +a deploy script that was lying about its own snapshot step. | |
| 54 | + | |
| 55 | +## Fix 1, and what it revealed | |
| 56 | + | |
| 57 | +The fix is structurally trivial: run the script *inside* the container | |
| 58 | +where bun lives, by mounting the working tree as a volume: | |
| 59 | + | |
| 60 | +```bash | |
| 61 | +ssh "$SSH_HOST" "podman run --rm \ | |
| 62 | + -v \$HOME/$REMOTE_SRC_DIR:/work:Z \ | |
| 63 | + --workdir /work \ | |
| 64 | + $IMAGE_TAG \ | |
| 65 | + bun scripts/p620/snapshot-git-history.ts" \ | |
| 66 | + || { echo '✗ snapshot-git-history failed'; exit 1; } | |
| 67 | +``` | |
| 68 | + | |
| 69 | +The `:Z` is the Fedora SELinux relabel — the script process inside | |
| 70 | +needs to be able to read/write the bind mount. The `|| | |
| 71 | +{ echo ✗; exit 1; }` replaces the swallow with a real failure mode. No | |
| 72 | +more silent skips. | |
| 73 | + | |
| 74 | +After this fix landed, `/reports/live` immediately caught up: | |
| 75 | + | |
| 76 | +``` | |
| 77 | +tdd-discipline report · 2026-05-03 → 2026-05-22 | |
| 78 | +``` | |
| 79 | + | |
| 80 | +So far so good. But the moment I looked at `/reports/live/tests`, the | |
| 81 | +sibling test-stability page, the timestamp said: | |
| 82 | + | |
| 83 | +``` | |
| 84 | +last run 2026-05-10 · 17 runs cumulative | |
| 85 | +``` | |
| 86 | + | |
| 87 | +Same staleness. Different cause. | |
| 88 | + | |
| 89 | +## The second silent failure | |
| 90 | + | |
| 91 | +Looking at the deploy script again, the **rsync** escape hatch runs | |
| 92 | +both snapshot scripts: | |
| 93 | + | |
| 94 | +```bash | |
| 95 | +( cd "$REPO_ROOT" && bun scripts/p620/snapshot-git-history.ts ) || ... | |
| 96 | +( cd "$REPO_ROOT" && bun scripts/p620/snapshot-tests.ts ) || ... | |
| 97 | +``` | |
| 98 | + | |
| 99 | +The **git-mode** happy path runs only the first one. When the deploy | |
| 100 | +flow switched from rsync to git as the default a while back, the | |
| 101 | +test-snapshot step got dropped on the floor and nobody noticed — | |
| 102 | +because the test-stability page was always 17 cumulative runs old, and | |
| 103 | +"old enough that nobody questioned the number" is one of the failure | |
| 104 | +modes that a verifier can't detect. | |
| 105 | + | |
| 106 | +Fix 2: add the second podman-run step, with one wrinkle. Unlike | |
| 107 | +`snapshot-git-history` (which is pure git + filesystem), `snapshot-tests` | |
| 108 | +calls `bun test`, which needs `node_modules` to resolve `marked` and | |
| 109 | +`node-html-parser`. The bind-mounted host directory has no | |
| 110 | +`node_modules` (the host has no Bun). But the image already ships | |
| 111 | +them at `/app/node_modules`. So: | |
| 112 | + | |
| 113 | +```bash | |
| 114 | +podman run --rm -v $HOME/src/tdd.md:/work:Z --workdir /work $IMAGE_TAG \ | |
| 115 | + sh -c 'ln -sfn /app/node_modules node_modules && bun scripts/p620/snapshot-tests.ts' | |
| 116 | +``` | |
| 117 | + | |
| 118 | +Symlink the container's `node_modules` into the work directory, then | |
| 119 | +let the script use it. The symlink persists on the host between | |
| 120 | +deploys but points at a path inside the container — harmless dead-link | |
| 121 | +outside the next podman-run, valid inside. | |
| 122 | + | |
| 123 | +## Two more bugs, surfaced by the snapshot actually running | |
| 124 | + | |
| 125 | +When the next deploy ran with both snapshots wired in, the live page | |
| 126 | +now read: | |
| 127 | + | |
| 128 | +``` | |
| 129 | +Total: 193 tests · 192 passing · 1 failing · 1 placeholder ⚠ | |
| 130 | +``` | |
| 131 | + | |
| 132 | +193 pass locally, every time I run them. 192 pass + 1 fail + 1 | |
| 133 | +placeholder on the container. Two bugs that had been hiding behind | |
| 134 | +"the test suite never actually ran in the deploy pipeline." | |
| 135 | + | |
| 136 | +### Bug A: a 1-in-16 flaky test | |
| 137 | + | |
| 138 | +The failing test was one I wrote in the prior round: | |
| 139 | + | |
| 140 | +```ts | |
| 141 | +test("verifySession rejects a cookie with a forged signature", async () => { | |
| 142 | + const cookie = await signSession("eve"); | |
| 143 | + const tampered = cookie.replace(/.$/, "0"); | |
| 144 | + const result = await verifySession(tampered); | |
| 145 | + expect(result).toBeNull(); | |
| 146 | +}); | |
| 147 | +``` | |
| 148 | + | |
| 149 | +`replace(/.$/, "0")` replaces the last character with "0". When the | |
| 150 | +HMAC signature's last hex digit *is already* "0" — which happens with | |
| 151 | +probability 1/16, since SHA-256 hex output is uniform — the | |
| 152 | +"tampered" string is identical to the original, the signature | |
| 153 | +verifies, the function returns `"eve"`, and the assertion fails. | |
| 154 | + | |
| 155 | +Local runs masked this because the random draws (the timestamp going | |
| 156 | +into the signed payload) happened to never produce a `0`-ending sig. | |
| 157 | +The first run that actually ran in CI hit the unlucky draw and | |
| 158 | +exposed it. | |
| 159 | + | |
| 160 | +Fix: read the last char, flip to a digit it definitely isn't: | |
| 161 | + | |
| 162 | +```ts | |
| 163 | +const lastChar = cookie.slice(-1); | |
| 164 | +const tampered = cookie.slice(0, -1) + (lastChar === "f" ? "0" : "f"); | |
| 165 | +expect(tampered).not.toBe(cookie); // loudly fail if a future regression collides | |
| 166 | +``` | |
| 167 | + | |
| 168 | +Five runs in a row, every one passes. Determinism restored. | |
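The collision and its fix can be reproduced without any signing code. The sketch below is an illustration under stated assumptions: `naiveTamper`, `safeTamper`, and `passStreak` are names invented here, and the cookie value is made up rather than produced by `signSession`.

```typescript
// The original approach: overwrite the last character with a fixed "0".
function naiveTamper(cookie: string): string {
  return cookie.replace(/.$/, "0");
}

// The fixed approach: choose a replacement that differs from the current last character.
function safeTamper(cookie: string): string {
  const lastChar = cookie.slice(-1);
  return cookie.slice(0, -1) + (lastChar === "f" ? "0" : "f");
}

// Made-up cookie whose signature happens to end in "0".
const unlucky = "eve.1716380000.9ac4e0";
console.log(naiveTamper(unlucky) === unlucky); // true: nothing was tampered
console.log(safeTamper(unlucky) === unlucky);  // false

// Chance that the unfixed test still passes n independent runs in a row.
const passStreak = (n: number): number => Math.pow(15 / 16, n);
console.log(passStreak(5).toFixed(2)); // 0.72
```

The last line is why a short streak of green runs says little here: an unfixed test would pass five runs in a row roughly 72% of the time. The guarantee comes from the construction of the replacement character, and the `not.toBe(cookie)` assertion keeps it checked.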
| 169 | + | |
| 170 | +### Bug B: the verifier's own test, flagged by its own check | |
| 171 | + | |
| 172 | +The placeholder warning pointed at: | |
| 173 | + | |
| 174 | +``` | |
| 175 | +src/c32_sama_verify.test.ts > does nothing | |
| 176 | +``` | |
| 177 | + | |
| 178 | +`c32_sama_verify.ts` is the verifier itself. Its test file holds a | |
| 179 | +fixture: | |
| 180 | + | |
| 181 | +```ts | |
| 182 | +test("Atomic: placeholder test (zero expect calls) is flagged", () => { | |
| 183 | + const placeholderFixture = `test("does nothing", () => { /* TODO */ })`; | |
| 184 | + // ... feed it to the verifier, assert the verifier flags it | |
| 185 | +}); | |
| 186 | +``` | |
| 187 | + | |
| 188 | +The string `test("does nothing", () => { /* TODO */ })` is a *fixture* | |
| 189 | +— a literal example of what a placeholder test looks like, fed to the | |
| 190 | +verifier so we can assert the verifier catches it. It's not a real | |
| 191 | +test. | |
| 192 | + | |
| 193 | +The verifier itself handles this correctly. It uses a | |
| 194 | +`stripStringsAndComments` helper to mask out string literals before | |
| 195 | +running its `test()`-finder regex over the source. So when the | |
| 196 | +verifier scans `c32_sama_verify.test.ts`, it sees the fixture as | |
| 197 | +whitespace, doesn't pick it up, and reports zero placeholders in that | |
| 198 | +file. | |
| 199 | + | |
| 200 | +But `snapshot-tests.ts` — the deploy-time generator that feeds | |
| 201 | +`/reports/live/tests` — duplicated the regex *without* the | |
| 202 | +strip-strings step. So it grepped the raw source, found the fixture | |
| 203 | +inside the backtick string, treated it as a real `test()` call, walked | |
| 204 | +its (TODO-only) body, counted zero `expect()` calls, and flagged it. | |
| 205 | + | |
| 206 | +The deploy-time detector was flagging the very test that proves the | |
| 207 | +runtime detector works. | |
| 208 | + | |
| 209 | +Fix: export `stripStringsAndComments` from `c32_sama_verify.ts` and | |
| 210 | +use the same mask-index pattern in the snapshot script: | |
| 211 | + | |
| 212 | +```ts | |
| 213 | +import { stripStringsAndComments } from "../../src/c32_sama_verify.ts"; | |
| 214 | +// ... | |
| 215 | +const mask = stripStringsAndComments(content); | |
| 216 | +while ((m = re.exec(content)) !== null) { | |
| 217 | + // If the match position is whitespace in the mask, the original | |
| 218 | + // was inside a string or comment — skip. | |
| 219 | + if (mask[m.index] === " " || mask[m.index] === "\n") continue; | |
| 220 | + // ... rest of the body-walking logic | |
| 221 | +} | |
| 222 | +``` | |
| 223 | + | |
| 224 | +DRYing the helper across the two places that need the same string-aware | |
| 225 | +behaviour. Now the snapshot agrees with the verifier. | |
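For readers who want to see why the mask has to keep indices aligned, here is a minimal sketch of the idea. It is a hypothetical re-implementation, not the repo's `stripStringsAndComments`: the names `maskStringsAndComments` and `findRealTestCalls` are assumptions, and the sketch ignores template interpolation and regex literals, which the real helper may handle.

```typescript
// Replace every character inside a string literal or comment with a space,
// keeping newlines, so mask[i] lines up with src[i].
function maskStringsAndComments(src: string): string {
  const out: string[] = [];
  let i = 0;
  // Blank out src[i..end) and advance i to end.
  const blank = (end: number): void => {
    for (; i < end; i++) out.push(src.charAt(i) === "\n" ? "\n" : " ");
  };
  while (i < src.length) {
    const ch = src.charAt(i);
    const next = src.charAt(i + 1);
    if (ch === "/" && next === "/") {
      const nl = src.indexOf("\n", i);
      blank(nl === -1 ? src.length : nl);
    } else if (ch === "/" && next === "*") {
      const close = src.indexOf("*/", i + 2);
      blank(close === -1 ? src.length : close + 2);
    } else if (ch === '"' || ch === "'" || ch === "`") {
      let j = i + 1;
      // Skip escaped characters so \" does not end the literal early.
      while (j < src.length && src.charAt(j) !== ch) j += src.charAt(j) === "\\" ? 2 : 1;
      blank(Math.min(j + 1, src.length));
    } else {
      out.push(ch);
      i++;
    }
  }
  return out.join("");
}

// Indices of test( calls that are real code, not text inside a string or comment.
function findRealTestCalls(src: string): number[] {
  const mask = maskStringsAndComments(src);
  const re = /\btest\(/g;
  const hits: number[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(src)) !== null) {
    const masked = mask.charAt(m.index);
    if (masked === " " || masked === "\n") continue; // inside a string or comment
    hits.push(m.index);
  }
  return hits;
}

const sample = [
  'test("flags placeholders", () => {',
  '  const fixture = `test("does nothing", () => { /* TODO */ })`;',
  "  expect(fixture.length).toBeGreaterThan(0);",
  "});",
].join("\n");
console.log(findRealTestCalls(sample)); // [ 0 ]
```

Running the regex over the raw source but consulting the mask at `m.index` means the body-walking logic still sees the original text; only the decision to skip a match depends on the mask.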
| 226 | + | |
| 227 | +## What the cascade was actually telling me | |
| 228 | + | |
| 229 | +The bug count for round 4 looks bad: a 12-day staleness, a flaky test, | |
| 230 | +a false-positive in the deploy-time detector. Three independent | |
| 231 | +problems. | |
| 232 | + | |
| 233 | +But the *order* is the part worth looking at. Each fix made the next | |
| 234 | +one visible: | |
| 235 | + | |
| 236 | +1. Deploy script ran the snapshot step → file's timestamp moved → | |
| 237 | + `/reports/live` started reporting current commits. | |
| 238 | +2. Deploy script ran the test snapshot → tests actually ran in the | |
| 239 | + deploy pipeline → the flaky test surfaced (because previously it | |
| 240 | + never ran in CI), and the false-positive surfaced (because | |
| 241 | + previously the snapshot was 12 days old and that particular | |
| 242 | + fixture had been added since then). | |
| 243 | +3. Each fix's success was the precondition for the next bug to be | |
| 244 | + visible. | |
| 245 | + | |
| 246 | +The cascade isn't proof the system is fragile. It's proof that the | |
| 247 | +system was *blind* — a layer of silent error suppression had hidden | |
| 248 | +every downstream failure, so they accumulated without being detected. | |
| 249 | +The fix was less "patch three things" than "remove the lie and watch | |
| 250 | +what falls out." | |
| 251 | + | |
| 252 | +This is the same shape as TDD's iron rule applied to *infrastructure* | |
| 253 | +rather than to source: you can't trust a pass you didn't run. The | |
| 254 | +deploy pipeline checks that `bun test` exits zero, but that only counts if `bun test` | |
| 255 | +*ran*. If the call returns 127 (command not found) and the deploy | |
| 256 | +script swallows it, every later assertion is hollow. | |
| 257 | + | |
| 258 | +`/reports/live` showing all-green for 12 days was perfectly compatible | |
| 259 | +with the test suite being completely broken. The only way to know is | |
| 260 | +to delete the swallowing. | |
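One way to make "did it run at all" a first-class outcome is to classify exit statuses before deciding whether a step passed. The sketch below is an assumption-laden illustration, not part of the deploy script: `StepResult`, `classifyExit`, and `gate` are invented names. It relies on the shell convention that 127 means "command not found" and 126 means "found but not executable"; a program can also return those codes itself, so the classification is a heuristic.

```typescript
type StepResult =
  | { kind: "ran-ok" }
  | { kind: "ran-failed"; exitCode: number }
  | { kind: "never-ran"; exitCode: number };

// Separate "the command failed" from "the command never started".
function classifyExit(exitCode: number): StepResult {
  if (exitCode === 0) return { kind: "ran-ok" };
  if (exitCode === 126 || exitCode === 127) return { kind: "never-ran", exitCode };
  return { kind: "ran-failed", exitCode };
}

// A deploy gate that treats anything except a real, successful run as fatal.
function gate(step: string, exitCode: number): void {
  const result = classifyExit(exitCode);
  if (result.kind !== "ran-ok") {
    throw new Error(`✗ ${step}: ${result.kind} (exit ${result.exitCode})`);
  }
}

gate("snapshot-tests", 0); // passes silently
try {
  gate("snapshot-git-history", 127); // the missing-bun case from this post
} catch (err) {
  console.log((err as Error).message);
}
```

The design choice is that there is no "warning" branch: a step either ran and succeeded or the deploy stops with the step name and exit code in the message.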
| 261 | + | |
| 262 | +## Why this is the empirical case for SAMA, not against it | |
| 263 | + | |
| 264 | +A naive reading is "the codebase had three bugs you didn't catch." | |
| 265 | +The fairer reading is: the codebase had *one* bug — silent error | |
| 266 | +suppression in a deploy script — and the other two were latent | |
| 267 | +consequences that the verifier *would have* caught the moment they | |
| 268 | +ran. Removing the silence took ~15 minutes. Once silence was gone, both | |
| 269 | +hidden bugs surfaced *on the very next deploy*, with line numbers and | |
| 270 | +file paths, in two cells of a public web page. | |
| 271 | + | |
| 272 | +That's the empirical pattern SAMA's pitch turns on, scaled to the | |
| 273 | +infrastructure layer: | |
| 274 | + | |
| 275 | +- **Verification has to be observable.** A check that runs into | |
| 276 | + `2>/dev/null` is indistinguishable from a check that passes. | |
| 277 | +- **The cost of removing silence is low.** A `||` swallow → `|| | |
| 278 | + { echo ✗; exit 1; }` is a one-line change. A `2>/dev/null` → | |
| 279 | + `2>&1` is one word. | |
| 280 | +- **Removing silence pays compounding returns.** Three bugs hidden by | |
| 281 | + one suppressor — each one would have been instantly diagnosable if | |
| 282 | + the surface had been honest. | |
| 283 | + | |
| 284 | +## What this still doesn't prove | |
| 285 | + | |
| 286 | +It doesn't prove that exposing every failure produces a useful signal. | |
| 287 | +Some failures *should* be tolerated (best-effort cleanup, optional | |
| 288 | +caches), and over-strict failure handling can break production for | |
| 289 | +trivial reasons. The judgement is *which* failures: in this case, | |
| 290 | +`snapshot-git-history` running was load-bearing for the public claim | |
| 291 | +that `/reports/live` reflects the current repo. Treating its failure | |
| 292 | +as "non-fatal" was a category error. | |
| 293 | + | |
| 294 | +The general principle the cascade demonstrates: in a system whose value | |
| 295 | +proposition is *the artefacts a reviewer can replay*, the pipeline | |
| 296 | +that produces those artefacts has the same audit requirements as the | |
| 297 | +source code does. Silent failures in the pipeline are violations of | |
| 298 | +the standard the same way silent failures in the source would be. | |
| 299 | + | |
| 300 | +--- | |
| 301 | + | |
| 302 | +**See for yourself:** | |
| 303 | + | |
| 304 | +- Live: <https://tdd.md/reports/live> (date window is now current) | |
| 305 | +- Live: <https://tdd.md/reports/live/tests> ("193 passing · 0 placeholder") | |
| 306 | +- The PR that landed the three fixes: | |
| 307 | + <https://github.com/syntaxai/tdd.md/pull/14> | |
| 308 | +- Previous posts in this thread: | |
| 309 | + [the c21 Atomic-700 split](/blog/sama-empirical-c21-split) · | |
| 310 | + [greening the Modeled dogfood](/blog/sama-empirical-modeled-green) | |
e2e/git-content-browse.spec.ts
+121
−0
| @@ -0,0 +1,121 @@ | ||
| 1 | +// E2E: every blog post in ALL_POSTS is reachable via /GIT/. | |
| 2 | +// | |
| 3 | +// Crawls the registry's slugs (lifted into a literal array here so | |
| 4 | +// the test file doesn't import server-side modules) and asserts: | |
| 5 | +// 1. /GIT/syntaxai/tdd.md/tree/main/content/blog lists each post | |
| 6 | +// 2. /GIT/syntaxai/tdd.md/blob/main/content/blog/<slug>.md renders | |
| 7 | +// the post (markdown rendered via marked into the chrome) | |
| 8 | +// 3. /GIT/syntaxai/tdd.md/raw/main/content/blog/<slug>.md serves | |
| 9 | +// the raw markdown | |
| 10 | +// Plus the tree home (/GIT/syntaxai/tdd.md/tree/main) shows the | |
| 11 | +// top-level directories (content/, src/, public/, scripts/, etc.). | |
| 12 | + | |
| 13 | +import { test, expect } from "@playwright/test"; | |
| 14 | +import * as fs from "fs"; | |
| 15 | +import * as path from "path"; | |
| 16 | + | |
| 17 | +// Mirror of c31_blog.ts ALL_POSTS slugs. If a post is added there, | |
| 18 | +// add the slug here too. Kept inline to avoid pulling server code | |
| 19 | +// into the test process. | |
| 20 | +const BLOG_SLUGS = [ | |
| 21 | + "sama-meets-git-cms", | |
| 22 | + "from-rules-to-checks", | |
| 23 | + "agentic-coding-corpus-three-patterns", | |
| 24 | + "claude-code-harness-postmortem", | |
| 25 | + "three-constraints-agentic-coding", | |
| 26 | + "tweag-handbook-tdd", | |
| 27 | + "aider-tdd", | |
| 28 | + "cursor-tdd", | |
| 29 | + "claude-code-tdd", | |
| 30 | +]; | |
| 31 | + | |
| 32 | +const SCREENSHOT_DIR = "test-results/git-content-browse"; | |
| 33 | + | |
| 34 | +test.beforeAll(() => { | |
| 35 | + fs.mkdirSync(SCREENSHOT_DIR, { recursive: true }); | |
| 36 | +}); | |
| 37 | + | |
| 38 | +test.describe("/GIT browses the local bare repo", () => { | |
| 39 | + test("repo root tree lists the top-level directories", async ({ page }) => { | |
| 40 | + const res = await page.goto("/GIT/syntaxai/tdd.md/tree/main"); | |
| 41 | + expect(res?.status()).toBe(200); | |
| 42 | + | |
| 43 | + // Top-level dirs we expect after the dev tree was pushed. | |
| 44 | + for (const dir of ["content", "src", "public", "scripts", "e2e"]) { | |
| 45 | + await expect( | |
| 46 | + page.locator(`a[href="/GIT/syntaxai/tdd.md/tree/main/${dir}"]`), | |
| 47 | + ).toBeVisible(); | |
| 48 | + } | |
| 49 | + // Top-level files | |
| 50 | + await expect( | |
| 51 | + page.locator('a[href="/GIT/syntaxai/tdd.md/blob/main/package.json"]'), | |
| 52 | + ).toBeVisible(); | |
| 53 | + | |
| 54 | + await page.screenshot({ | |
| 55 | + path: path.join(SCREENSHOT_DIR, "1-repo-root-tree.png"), | |
| 56 | + fullPage: true, | |
| 57 | + }); | |
| 58 | + }); | |
| 59 | + | |
| 60 | + test("content/blog tree lists every post in ALL_POSTS", async ({ page }) => { | |
| 61 | + const res = await page.goto("/GIT/syntaxai/tdd.md/tree/main/content/blog"); | |
| 62 | + expect(res?.status()).toBe(200); | |
| 63 | + for (const slug of BLOG_SLUGS) { | |
| 64 | + const link = page.locator( | |
| 65 | + `a[href="/GIT/syntaxai/tdd.md/blob/main/content/blog/${slug}.md"]`, | |
| 66 | + ); | |
| 67 | + await expect(link, `link to ${slug}.md must be present`).toBeVisible(); | |
| 68 | + } | |
| 69 | + | |
| 70 | + await page.screenshot({ | |
| 71 | + path: path.join(SCREENSHOT_DIR, "2-content-blog-tree.png"), | |
| 72 | + fullPage: true, | |
| 73 | + }); | |
| 74 | + }); | |
| 75 | + | |
| 76 | + for (const slug of BLOG_SLUGS) { | |
| 77 | + test(`blob view renders ${slug}.md as markdown via /GIT`, async ({ page }) => { | |
| 78 | + const res = await page.goto( | |
| 79 | + `/GIT/syntaxai/tdd.md/blob/main/content/blog/${slug}.md`, | |
| 80 | + ); | |
| 81 | + expect(res?.status()).toBe(200); | |
| 82 | + // The repo-blob-rendered container is what marked.parse output | |
| 83 | + // lands in. It must exist + be non-empty. | |
| 84 | + const rendered = page.locator(".repo-blob-rendered"); | |
| 85 | + await expect(rendered).toBeVisible(); | |
| 86 | + const text = (await rendered.textContent()) ?? ""; | |
| 87 | + expect(text.length).toBeGreaterThan(200); | |
| 88 | + // The breadcrumb must show the file path so users can climb. | |
| 89 | + await expect(page.locator(".commit-breadcrumb")).toContainText(`${slug}.md`); | |
| 90 | + }); | |
| 91 | + | |
| 92 | + test(`raw endpoint serves ${slug}.md as text/plain via /GIT`, async ({ request }) => { | |
| 93 | + const res = await request.get( | |
| 94 | + `/GIT/syntaxai/tdd.md/raw/main/content/blog/${slug}.md`, | |
| 95 | + ); | |
| 96 | + expect(res.status()).toBe(200); | |
| 97 | + expect(res.headers()["content-type"]).toMatch(/text\/plain/); | |
| 98 | + const body = await res.text(); | |
| 99 | +      // Every blog post is expected to be well over 200 characters. | |
| 100 | + expect(body.length).toBeGreaterThan(200); | |
| 101 | + }); | |
| 102 | + } | |
| 103 | + | |
| 104 | + test("path traversal is rejected", async ({ request }) => { | |
| 105 | + for (const evil of [ | |
| 106 | + "/GIT/syntaxai/tdd.md/blob/main/../etc/passwd", | |
| 107 | + "/GIT/syntaxai/tdd.md/blob/main/content/../../etc/passwd", | |
| 108 | + "/GIT/syntaxai/tdd.md/tree/main//content", | |
| 109 | + ]) { | |
| 110 | + const res = await request.get(evil); | |
| 111 | + expect(res.status(), `${evil} must 404`).toBe(404); | |
| 112 | + } | |
| 113 | + }); | |
| 114 | + | |
| 115 | + test("non-allowed (owner, repo) 404s — only syntaxai/tdd.md is served", async ({ | |
| 116 | + request, | |
| 117 | + }) => { | |
| 118 | + const res = await request.get("/GIT/someone/random-repo/tree/main"); | |
| 119 | + expect(res.status()).toBe(404); | |
| 120 | + }); | |
| 121 | +}); | |
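A note on the path-traversal cases in this spec: standard URL parsing resolves `..` segments on the client, so the first two URLs may never reach the server in their literal form. The sketch below shows the normalisation and a possible server-side check. It assumes the test client resolves URLs the way `new URL` does, which is worth confirming against Playwright's behaviour; `isSafeRepoPath` and the `base` origin are hypothetical and not taken from the repo.

```typescript
const base = "http://localhost:3000"; // placeholder origin for the example

// Dot segments are resolved by the URL parser before any request is sent.
const resolved = new URL("/GIT/syntaxai/tdd.md/blob/main/../etc/passwd", base);
console.log(resolved.pathname); // /GIT/syntaxai/tdd.md/blob/etc/passwd

// An encoded slash is not decoded by the parser, so this ".." survives to the server.
const encoded = new URL("/GIT/syntaxai/tdd.md/blob/main/..%2fetc/passwd", base);
console.log(encoded.pathname); // /GIT/syntaxai/tdd.md/blob/main/..%2fetc/passwd

// Hypothetical guard for the path after the ref: reject empty, ".", and ".."
// segments once percent-encoding has been decoded. An empty path (the tree
// root) is rejected too, so a caller would handle that case separately.
function isSafeRepoPath(rawPath: string): boolean {
  let decoded: string;
  try {
    decoded = decodeURIComponent(rawPath);
  } catch {
    return false; // malformed percent-encoding
  }
  return decoded
    .split("/")
    .every((segment) => segment !== "" && segment !== "." && segment !== "..");
}

console.log(isSafeRepoPath("content/blog/aider-tdd.md")); // true
console.log(isSafeRepoPath("..%2fetc/passwd"));           // false
console.log(isSafeRepoPath("/content"));                  // false
```

If the assumption holds, the first two `evil` URLs return 404 because `etc` is not a known ref rather than because a traversal guard fired; an encoded variant such as `..%2f` would exercise the guard itself. The double-slash case is unaffected, since empty segments are preserved by the parser.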
feedback.md
+89
−41
| @@ -1,66 +1,114 @@ | ||
| 1 | -Good question. Here is my honest, concrete feedback on what I think could be improved about tdd.md: | |
| 2 | -What is already good | |
| 1 | +SAMA 2.0: Proposal for the next evolution | |
| 3 | 2 | |
| 4 | -    Strict judge | |
| 3 | +Here is a worked-out, concrete version of SAMA 2.0. I keep the core intact (because it is strong), but make it more complete, more future-proof, and even better suited to agentic coding in 2026+. | |
| 4 | +The core stays the same (the four letters) | |
| 5 | 5 | |
| 6 | -    Hidden tests | |
| 6 | +    S = Sorted: Alphabetical + dependency direction (a lower number never imports a higher one). | |
| 7 | 7 | |
| 8 | -    Git-based scoring | |
| 8 | +    A = Architecture: Numeric layers with clear contracts. | |
| 9 | 9 | |
| 10 | -    Public verdicts | |
| 10 | +    M = Modeled: Types + sibling tests. | |
| 11 | 11 | |
| 12 | -What I would improve | |
| 12 | +    A = Atomic: Keep files small (~700 lines) + split per domain. | |
| 13 | 13 | |
| 14 | -    Too purist / too strict | |
| 15 | -    Right now it is almost “religious TDD”. This drives people away. | |
| 16 | -    Improvement: Introduce different levels or modes: | |
| 14 | +What changes / gets added in 2.0 | |
| 17 | 15 | |
| 18 | -        Strict Mode (current version) | |
| 16 | +I add two new letters → SAMA becomes SAMAX (or you keep SAMA and make the extras optional). | |
| 17 | +New letter: X = eXtensible & Vertical | |
| 19 | 18 | |
| 20 | -        Pragmatic Mode (as Kent Beck later intended): allows spikes/exploration; test-first is strongly encouraged but not sacred. | |
| 19 | +    Goal: Combine the strength of horizontal layers (clear dependency flow) with vertical slices (everything for one feature close together). | |
| 21 | 20 | |
| 22 | -        Learning Mode: milder penalties for beginners. | |
| 21 | +    Rule: Optional feature prefix on top of the layer: c32_user_auth.ts or feat_payment_c32_processor.ts. | |
| 23 | 22 | |
| 24 | -    Unit-level focus only | |
| 25 | -    A lot of modern software also has integration, UI, performance, and architecture issues. | |
| 26 | -    Improvement: Add katas at different layers (not just string-calculator level), including: | |
| 23 | +    Benefit for agents: An agent working on “user authentication” sees all relevant files grouped via the search query feat_user_*. | |
| 27 | 24 | |
| 28 | -        API katas | |
| 25 | +New letter: D = Documented (the fifth discipline) | |
| 29 | 26 | |
| 30 | -        Database interaction | |
| 27 | +    Rule: Every module and every feature folder has a README.md or .agent.md with: | |
| 31 | 28 | |
| 32 | -        UI/component testing | |
| 29 | +        One-sentence responsibility | |
| 33 | 30 | |
| 34 | -    No distinction between exploration and implementation | |
| 35 | -    In real projects you often do a spike first. | |
| 36 | -    Improvement: Allow a “spike” phase to be marked explicitly, with the real TDD cycle starting only after it. | |
| 31 | +        Key types & contracts | |
| 37 | 32 | |
| 38 | -    Scoring is too binary | |
| 39 | -    At the moment it sometimes feels like a game of “follow the rules perfectly”. | |
| 40 | -    Improvement: Add quality measurements, such as: | |
| 33 | +        Acceptance criteria / invariants | |
| 41 | 34 | |
| 42 | -        Code simplicity / cyclomatic complexity | |
| 35 | +        “Where to put new code” instructions | |
| 43 | 36 | |
| 44 | -        How small the steps were | |
| 37 | +    This becomes the living specification for the agent. | |
| 45 | 38 | |
| 46 | -        How good the names of tests and variables are | |
| 39 | +Extended Layer Mapping (Architecture 2.0) | |
| 40 | +Layer · Name · Responsibility · Examples · May import from | |
| 41 | +c11 · Entry / Composition Root · App bootstrap, wiring · main.ts, server.ts · Everything | |
| 42 | +c13 · Data / Persistence · DB, repositories, queries · c13_user_repo.ts · c31 | |
| 43 | +c14 · I/O Adapters · HTTP, queues, external APIs · c14_auth_controller.ts · c21, c31, c32 | |
| 44 | +c21 · Handlers / Presenters · Request/response orchestration · c21_login_handler.ts · c32, c31 | |
| 45 | +c31 · Models & Types · Domain models, DTOs, Value Objects, Zod schemas · c31_user.ts · - (pure) | |
| 46 | +c32 · Business Logic / Use Cases · Pure functions, domain rules · c32_user_auth.ts · c31 | |
| 47 | +c40 · Application Services · Orchestration of multiple use cases (new!) · c40_user_service.ts · c32, c31 | |
| 48 | +c51 · UI / Presentation · Components, pages, server components · c51_user_profile.tsx · c31, c32 | |
| 49 | +c60 · Infrastructure / External · Third-party clients, config, logging · c60_stripe_client.ts · - | |
| 50 | +c70 · Cross-cutting · Auth middleware, logging, monitoring, tracing · c70_logger.ts · everything (with care) | |
| 47 | 51 | |
| 48 | -        Whether the code is idiomatic for the language | |
| 52 | +Atomic rule refined: Max ~600-700 lines or max 1 feature per file (even if it is smaller). | |
| 53 | +Tooling & Automation (critical for a gold standard) | |
| 49 | 54 | |
| 50 | -    Too little feedback for improvement | |
| 51 | -    You get a score, but not always an understandable explanation of why you scored badly. | |
| 52 | -    Improvement: Better, human explanations + suggestions (“You made 3 commits without a failing test”, “Your hidden tests failed on edge case X”). | |
| 55 | +SAMA 2.0 must have this: | |
| 53 | 56 | |
| 54 | -    Too little variety in katas | |
| 55 | -    Starting with string-calc is fine, but more has to follow soon (e.g. a small web API, a game loop, a parser, etc.). | |
| 57 | +    Official sama CLI | |
| 56 | 58 | |
| 57 | -    Community & education | |
| 58 | -    Add a “Why” section that explains when strict TDD makes sense and when it does not. Right now it projects too much of “this is the only right way”. | |
| 59 | +        sama check → runs all verifications | |
| 59 | 60 | |
| 60 | -My ideal version of tdd.md | |
| 61 | +        sama new c32_user_auth → creates file + template + .agent.md | |
| 61 | 62 | |
| 62 | -A platform that measures not only how well you follow TDD, but also how well you think as an engineer, with the flexibility that experienced developers (including Kent Beck) apply in practice. | |
| 63 | +        sama split → helps with Atomic splits | |
| 63 | 64 | |
| 64 | -In short: | |
| 65 | -tdd.md is currently a strict TDD judge. | |
| 66 | -I would rather see it as a smart TDD coach that teaches discipline but also encourages mature, context-aware engineering. | |
| 65 | +        sama verify-repo | |
| 66 | + | |
| 67 | +    Pre-commit hooks + GitHub Action | |
| 68 | + | |
| 69 | +    LSP / Editor integration (VS Code, Cursor, Windsurf) | |
| 70 | + | |
| 71 | +    Claude / Cursor skill with automatic checks | |
| 72 | + | |
| 73 | +Additional improvements | |
| 74 | + | |
| 75 | +    Monorepo support: Package prefixes (pkg_auth_c32_...) | |
| 76 | + | |
| 77 | +    Gradual adoption: SAMA Light (Sorted + Atomic only) → Full SAMAX | |
| 78 | + | |
| 79 | +    Frontend + Backend harmonization (especially Next.js App Router) | |
| 80 | + | |
| 81 | +    Deprecation policy: A clear way to phase out old layers/files | |
| 82 | + | |
| 83 | +    Performance & Security gates in the Atomic check (e.g. flag large N+1 queries) | |
| 84 | + | |
| 85 | +Example folder structure (SAMA 2.0) | |
| 86 | + | |
| 87 | +src/ | |
| 88 | +├── c11_app.ts | |
| 89 | +├── features/ | |
| 90 | +│ └── user/ | |
| 91 | +│ ├── feat_user_c13_repo.ts | |
| 92 | +│ ├── feat_user_c31_types.ts | |
| 93 | +│ ├── feat_user_c32_auth.ts | |
| 94 | +│ ├── feat_user_c32_profile.ts | |
| 95 | +│ ├── feat_user_c40_service.ts | |
| 96 | +│ └── feat_user.agent.md | |
| 97 | +├── c60_stripe.ts | |
| 98 | +└── c70_auth_middleware.ts | |
| 99 | + | |
| 100 | +Summary: Why this becomes the gold standard | |
| 101 | + | |
| 102 | +    Retains mechanical verifiability (grep + CLI) | |
| 103 | + | |
| 104 | +    Becomes hybrid (horizontal + vertical) | |
| 105 | + | |
| 106 | +    Adds documentation as a first-class citizen | |
| 107 | + | |
| 108 | +    Becomes genuinely toolable and adoptable | |
| 109 | + | |
| 110 | +    Scales from a solo-agent project to a mid-sized team | |
| 111 | + | |
| 112 | +This is not over-engineering: it solves exactly the pain points that arise when projects with agents grow beyond 20-30k lines. | |
| 113 | + | |
| 114 | +Do you want me to work out a complete official spec (like the original tdd.md page), including verification commands and templates? Or focus on one part (e.g. the CLI spec or a concrete Next.js template)? | |
plan.md
+321
−0
| @@ -0,0 +1,321 @@ | ||
| 1 | +# Plan — port podman/syntax CMS into tdd.md, SAMA-native | |
| 2 | + | |
| 3 | +**Goal.** Port the CMS from `~/Documents/podman` (sx-filter + sx-editor + sx-content + Ghost-compat theme) to tdd.md in full, in 100% SAMA style, with the existing tdd.md content migrated intact. | |
| 4 | + | |
| 5 | +**Non-goal.** Podman, Caddy, or a second service tier in tdd.md. Everything runs in the single Bun process we already have (`c11_server.ts`). Caddy's role (TLS + routing) is handled by our deploy layer on p620. | |
| 6 | + | |
| 7 | +--- | |
| 8 | + | |
| 9 | +## ⚠ Decide first: storage canon | |
| 10 | + | |
| 11 | +This drives every other choice. Two options; I default to **A** unless you flip it. | |
| 12 | + | |
| 13 | +### A. Git-canon (default; preserves the tdd.md identity) | |
| 14 | + | |
| 15 | +- The source of truth remains the bare repo `/app/repo` (current stack). | |
| 16 | +- **Every save in the editor = a commit** via the existing `c14_git.commitFile`. | |
| 17 | +- sxdoc trees (typed blocks) live as sidecar JSON next to the markdown: | |
| 18 | + `content/blog/foo.md` + `content/blog/foo.sxdoc.json`. | |
| 19 | +- SQLite (the existing `c13_database`) gets a derived index table | |
| 20 | + (`content_index`) for fast list queries and taxonomy lookups, **rebuildable from git**. Drop it, replay `git log`, and it is back. | |
| 21 | +- Pro: the "SAMA meets git" story stays true. `sama-meets-git-cms.md` remains accurate. Audit trail = `git log content/`. | |
| 22 | +- Con: more complex than podman's direct SQLite writes. Slower on large sites (>10k posts). Not relevant at our scale. | |
| 23 | + | |
| 24 | +### B. SQLite-canon (1:1 podman port) | |
| 25 | + | |
| 26 | +- `content/*.md` is imported once into the `sx_documents` + `posts` tables and is read-only afterwards. | |
| 27 | +- The editor writes exclusively to SQLite. Git history of content stops at the migration commit. | |
| 28 | +- Pro: minimal deviation from podman's code. Faster to port. | |
| 29 | +- Con: tdd.md loses "every content edit = commit", the core of the product per memory. | |
| 30 | + | |
| 31 | +**Decision 2026-05-11: B (SQLite-canon) + git as audit mirror. Locked.** | |
| 32 | + | |
| 33 | +--- | |
| 34 | + | |
| 35 | +## Locked decisions (2026-05-11) | |
| 36 | + | |
| 37 | +### Storage canon: **B (SQLite-canon)** + git as audit mirror | |
| 38 | +- **Canonical:** the `sx_documents` table in `c13_database` (bun:sqlite). The editor reads and writes here; live preview and all render paths read here. | |
| 39 | +- **Audit mirror:** every save → 1 multi-path commit with `content/{slug}.md` (derived markdown projection) + `content/{slug}.sxdoc.json` (canonical JSON tree). This keeps `git log content/` as the human-readable audit trail; "every save = a commit" from `sama-meets-git-cms.md` stays true. The **canonicity** now lives in SQLite, the **evidence** in git. | |
| 40 | +- **Recovery:** SQLite corruption? Drop the table and replay from `*.sxdoc.json`. | |
| 41 | +- **Initial migration:** a one-off `scripts/migrate_content_to_sxdoc.ts` reads the current `content/**/*.md` → parses to `SxDocument` → writes SQLite + emits a single migration commit with all new `.sxdoc.json` files alongside. | |
| 42 | + | |
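A minimal sketch of the audit-mirror projection described above, assuming a save hands over the slug, the derived markdown, and the canonical document. `buildMirrorFiles`, `MirrorFile`, and the slug pattern are hypothetical names chosen for illustration; the real `SxDocument` type lives in `c31_sxdoc.ts` and may carry more fields.

```typescript
// Hypothetical shapes for illustration only.
type SxDocument = { v: 1; blocks: unknown[] };
type MirrorFile = { path: string; body: string };

// Builds the two files that a single save would mirror into git:
// the derived markdown projection and the canonical JSON tree.
function buildMirrorFiles(slug: string, markdown: string, doc: SxDocument): MirrorFile[] {
  // Assumed slug rule: lowercase segments, no dots, so "../" cannot escape content/.
  if (!/^[a-z0-9_][a-z0-9_\/-]*$/.test(slug)) {
    throw new Error(`unsafe slug: ${slug}`);
  }
  return [
    { path: `content/${slug}.md`, body: markdown },
    { path: `content/${slug}.sxdoc.json`, body: JSON.stringify(doc, null, 2) + "\n" },
  ];
}

// Recovery direction: a mirrored JSON body parses back into a document.
function restoreFromMirror(body: string): SxDocument {
  const parsed = JSON.parse(body);
  if (parsed.v !== 1 || !Array.isArray(parsed.blocks)) {
    throw new Error("not an sxdoc v1 document");
  }
  return parsed as SxDocument;
}
```

Keeping the projection pure means the same function can serve both the per-save commit and the one-off migration script.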
| 43 | +### Parser layer: **c31** · Render layer: **c51** | |
| 44 | +- `c31_sxdoc_parse.ts` (HTML → SxDocument) + sibling `c31_sxdoc_parse.test.ts`. | |
| 45 | + Reason: `content/sama/modeled.md` is explicit: *"every external input has a parser in a c31_* model"*. HTML strings from the editor/migration are external input → c31. | |
| 46 | +- `c51_render_sxdoc.ts` (SxDocument → HTML) + sibling `c51_render_sxdoc.test.ts`. | |
| 47 | + Reason: `content/sama/architecture.md` picking-order rule 4: *"Does it produce HTML? Yes → c51"*. sxToHtml produces HTML. | |
| 48 | +- **Correction relative to the earlier plan + research-migration:** the parser/renderer were wrongly placed in c32 (the research only looked at the verifier hard rule "c32 requires a sibling test", but the canon docs point elsewhere). The tests stay (a c31 sibling is informationally required via Modeled; likewise for c51 for maintainability, although the verifier does not hard-enforce it). | |
| 49 | + | |
| 50 | +### Commit shape: **one multi-path commit per save** | |
| 51 | +- `c14_git` gets a new `commitFiles(paths: Array<{path, body}>)` next to the existing `commitFile`. | |
| 52 | +- One commit → an atomic rollback of that SHA restores both files. | |
| 53 | + | |
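To illustrate why one multi-path commit gives an atomic rollback, here is a small in-memory sketch. `FakeRepo`, `rollbackLast`, and `snapshot` are hypothetical stand-ins, not the real `c14_git` API, and no git process is involved; only the `{path, body}` input shape comes from the plan.

```typescript
// Hypothetical in-memory stand-in for c14_git; no real git is touched.
type FileWrite = { path: string; body: string };
type Commit = { sha: string; message: string; files: FileWrite[] };

class FakeRepo {
  private commits: Commit[] = [];

  // One commit carries every path of a save, so the paths land together.
  commitFiles(files: FileWrite[], message: string): Commit {
    const seen = new Set<string>();
    for (const f of files) {
      if (seen.has(f.path)) throw new Error(`duplicate path: ${f.path}`);
      seen.add(f.path);
    }
    const commit: Commit = {
      sha: `fake-${this.commits.length + 1}`,
      message,
      files: files.map((f) => ({ path: f.path, body: f.body })),
    };
    this.commits.push(commit);
    return commit;
  }

  // Dropping the newest commit restores all of its paths at once.
  rollbackLast(): void {
    this.commits.pop();
  }

  // Replays commits oldest-first to compute the current tree.
  snapshot(): Record<string, string> {
    const tree: Record<string, string> = {};
    for (const c of this.commits) {
      for (const f of c.files) tree[f.path] = f.body;
    }
    return tree;
  }
}
```

With two separate single-path commits, a rollback of one SHA could leave the markdown and the JSON sidecar out of step; bundling them removes that state.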
| 54 | +--- | |
| 55 | + | |
| 56 | +## Way of working (build discipline per file landing) | |
| 57 | + | |
| 58 | +Every file write must pass all four SAMA axes before the next file lands. No pile-up of violations. | |
| 59 | + | |
| 60 | +| Axis | What it enforces | How we enforce it | | |
| 61 | +|---|---|---| | |
| 62 | +| **Sorted** | c1*/c3* may not relative-import upward | Build bottom-up: c1 → c3 → c2 → c5. Never import from a higher layer. | | |
| 63 | +| **Architecture** | prefix ∈ {11, 13, 14, 21, 31, 32, 51} | Assign the layer before typing. I/O? → c14. Logic+transform? → c32. Pure types/registry? → c31. | | |
| 64 | +| **Modeled** | c32_*.ts requires a sibling .test.ts (hard); c31 = info-only | **c32 source + test land in the same edit batch**, never separately. Each `test(...)` body has ≥1 `expect()`. | | |
| 65 | +| **Atomic** | ≤700 LOC per file; no placeholder tests | Check `wc -l` before committing. Splits are budgeted (client/render per block kind; shortcodes registry+substitute). | | |
| 66 | + | |
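The Architecture and Atomic rows of the table can be sketched as a single per-file check. This is an illustrative approximation, not the code in `c32_sama_verify.ts`: `checkFile` and `Finding` are hypothetical names, and counting newline characters (as `wc -l` does) is an assumption about how the 700-line limit is measured.

```typescript
// Whitelist and limit as stated in the table above.
const ALLOWED_PREFIXES = ["11", "13", "14", "21", "31", "32", "51"];
const MAX_LINES = 700;

type Finding = { file: string; rule: "architecture" | "atomic"; detail: string };

// Checks one file name + source against the prefix whitelist and the LOC cap.
function checkFile(file: string, source: string): Finding[] {
  const findings: Finding[] = [];

  // Architecture: the two digits after "c" must be a known layer prefix.
  const m = /^c(\d{2})_/.exec(file);
  if (m === null || ALLOWED_PREFIXES.indexOf(m[1]) === -1) {
    findings.push({ file, rule: "architecture", detail: "prefix not in whitelist" });
  }

  // Atomic: count newline characters, mirroring `wc -l`.
  const lineCount = (source.match(/\n/g) || []).length;
  if (lineCount > MAX_LINES) {
    findings.push({ file, rule: "atomic", detail: `${lineCount} lines > ${MAX_LINES}` });
  }
  return findings;
}
```

Running a check like this after every file landing is what keeps violations from piling up between phases.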
| 67 | +### SAMA canon not enforced by the verifier (per `content/sama/*.md`) | |
| 68 | + | |
| 69 | +- **Flat `src/`**: no server-side subdirs. Client code lives under `src/client/**.ts` (outside the verifier glob). | |
| 70 | +- **No barrel re-exports** (`atomic.md`). | |
| 71 | +- **c31/c32 import no I/O modules** (sharp, fs, bun:sqlite, fetch). The verifier only sees relative imports, so this is personal discipline. | |
| 72 | +- **One concept per file**: types separate from parser separate from renderer. | |
| 73 | + | |
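Because the verifier only inspects relative imports, the "no I/O in c31/c32" rule could be backed by a small extra scan. The sketch below is a hypothetical helper, not part of the existing verifier; `findIoImports` and the `IO_MODULES` list are assumptions, and a regex scan like this cannot see the global `fetch`, which needs no import.

```typescript
// Assumed deny-list of bare module specifiers that perform I/O.
const IO_MODULES = ["sharp", "fs", "node:fs", "bun:sqlite"];

// Returns the I/O modules a c31_/c32_ file imports; other layers are not checked.
function findIoImports(file: string, source: string): string[] {
  if (!/^c3[12]_/.test(file)) return [];
  const hits: string[] = [];
  // Matches `... from "x"` and side-effect imports `import "x"`.
  const importRe = /\bfrom\s+["']([^"']+)["']|\bimport\s+["']([^"']+)["']/g;
  let m: RegExpExecArray | null;
  while ((m = importRe.exec(source)) !== null) {
    const spec = m[1] || m[2];
    if (IO_MODULES.indexOf(spec) !== -1) hits.push(spec);
  }
  return hits;
}
```

A scan like this turns part of the discipline into a gate, while DI-passed dependencies (such as `sharp` handed in as an argument) still pass.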
| 74 | +### Verification cadence | |
| 75 | + | |
| 76 | +After **every** file landing (not only at the end of a phase): | |
| 77 | + | |
| 78 | +``` | |
| 79 | +bun test src/c32_sama_verify.test.ts # verifier itself green? | |
| 80 | +bun test src/<file>.test.ts # new sibling test green? | |
| 81 | +wc -l src/c*<file>*.ts # no file > 700 | |
| 82 | +``` | |
| 83 | + | |
| 84 | +At the end of every phase: | |
| 85 | +``` | |
| 86 | +bun test # everything green | |
| 87 | +bun run src/c11_server.ts & # boot smoke | |
| 88 | +curl localhost:3000/health # 200 | |
| 89 | +``` | |
| 90 | + | |
| 91 | +### Anti-patterns (explicitly forbidden) | |
| 92 | + | |
| 93 | +- **"It works, the test comes later"** for c32: source and test land together or not at all. | |
| 94 | +- **Refactoring lower-layer code during a higher-layer phase**, e.g. adding `c14_git.commitFiles` during Phase 1 because it is "handy". Lower-layer changes belong to the phase where the caller lands. | |
| 95 | +- **Sub-folders under `src/`** server-side to "organise it more neatly". Flat is SAMA canon. | |
| 96 | +- **Improvising on layer assignment**: if in doubt between c31 and c32, default to c32 (sibling test = safety net). | |
| 97 | + | |
| 98 | +--- | |
| 99 | + | |
| 100 | +## Layer corrections from research-migration.md | |
| 101 | + | |
| 102 | +Plan.md was wrong in three places according to the SAMA rules in `c32_sama_verify.ts`. Corrected: | |
| 103 | + | |
| 104 | +| Was | Becomes | Reason | | |
| 105 | +|---|---|---| | |
| 106 | +| `c31_image_resize.ts` | `c14_image_resize.ts` | sharp does I/O, so c14 is mandatory | | |
| 107 | +| `c31_ai_edit_block.ts` | `c14_openrouter.ts` (HTTP) + `c32_ai_edit_block.ts` (validate/transform) | OpenRouter HTTP = c14; orchestration + sibling test = c32 | | |
| 108 | +| `c31_sxdoc_parse.ts` | `c32_sxdoc_parse.ts` | logic, not pure types; c32 requires a sibling test | | |
| 109 | +| `c31_sxdoc_render.ts` | `c32_sxdoc_render.ts` | ditto | | |
| 110 | + | |
| 111 | +### Atomic-700 splits budgeted | |
| 112 | + | |
| 113 | +| File | LOC on direct port | Splits | | |
| 114 | +|---|---|---| | |
| 115 | +| `sx-editor/src/client/render.ts` | 775 | over Atomic-700; split per block kind under `src/client/blocks/render-{p,h,list,quote,code,img,html,shortcode}.ts` + one `src/client/render.ts` dispatch ≤200 LOC | | |
| 116 | +| `sx-filter/src/shortcodes.ts` | 650 | tight; split right away along `c31_shortcodes_registry.ts` (built-ins) + `c32_shortcodes_substitute.ts` (HTML rewriter with region skip) | | |
| 117 | + | |
| 118 | +### Tests-are-siblings rule (was not made explicit) | |
| 119 | + | |
| 120 | +Podman's `sx-editor/tests/unit.test.ts` shape is **incompatible**. Every `cXX_*.test.ts` must sit as a sibling next to `cXX_*.ts` under `src/`. Existing tdd.md tests already do this correctly. | |
| 121 | + | |
| 122 | +### Client-side placement: `src/client/**.ts` | |
| 123 | + | |
| 124 | +No verifier impact (only `cXX_*.ts` is scanned). Relative imports to `../c31_sxdoc.ts` work from here. Bun.build bundles from `src/client/`. No new top-level dir. | |
| 125 | + | |
| 126 | +### Forbidden subdirs under `src/` | |
| 127 | + | |
| 128 | +Podman's `sxdoc/`, `core/`, `db/`, `client/blocks/` may not remain under `src/` in the server port. This does **not** apply to `src/client/` (which is outside verifier scope). Keep server code flat. | |
| 129 | + | |
| 130 | +--- | |
| 131 | + | |
| 132 | +## SAMA mapping: podman piece → tdd.md cXX layer | |
| 133 | + | |
| 134 | +SAMA convention (per memory): cXX_*.ts, `c1X` = data/I-O, `c2X` = handlers/app, `c3X` = pure logic, `c5X` = render. Lower layer never imports higher. | |
| 135 | + | |
| 136 | +| Podman | tdd.md (new) | SAMA layer | What it does | | |
| 137 | +|---|---|---|---| | |
| 138 | +| `sx-data/sx.db` schema | `c13_database.ts` (extend) | c1 | tables `sx_documents`, `media`, `content_index`, `api_keys` | | |
| 139 | +| `sx-editor/src/sxdoc/types.ts` | `c31_sxdoc.ts` | c3 | `SxDocument`, `Block`, helpers — pure types/registry | | |
| 140 | +| `sx-editor/src/sxdoc/html-to-sx.ts` | `c31_sxdoc_parse.ts` (+ sibling `.test.ts`) | c3 | HTML → SxDocument (parser = c31 per Modeled.md) | | |
| 141 | +| `sx-editor/src/sxdoc/sx-to-html.ts` | `c51_render_sxdoc.ts` (+ sibling `.test.ts`) | c5 | SxDocument → HTML (produces HTML = c51 per Architecture.md) | | |
| 142 | +| `sx-editor/src/sxdoc/db.ts` | `c13_database.ts` extend (saveDocument/loadDocument/listDocuments/deleteDocument) | c1 | SQLite read/write (canon-B); bun:sqlite = c13, not c14 | | |
| 143 | +| `sx-editor/src/upload.ts` + sharp resize | `c14_media.ts` + `c14_image_resize.ts` | c1 | upload, on-disk store, sharp transforms (sharp = I/O) | | |
| 144 | +| `sx-editor/src/ai.ts` (OpenRouter) | `c14_openrouter.ts` + `c32_ai_edit_block.ts` | c1 + c3 | HTTP call in c14; validate + transform in c32 with sibling test | | |
| 145 | +| `sx-editor/src/templates.ts` (list/edit shells) | `c51_render_admin.ts` | c5 | admin-list + edit-page chrome | | |
| 146 | +| `sx-editor/src/routes.ts` (urlForPage/Post) | extend existing `c31_site_config.ts` | c3 | routes.yaml equivalent; we already have a routes config | | |
| 147 | +| `sx-editor/src/client/blockeditor.ts` + `slashmenu.ts` + `blocks/*` | `client/` (TS bundle) → served by `c21_handlers_edit.ts` | client | block-editor JS, slash-menu, AI ✨, autosave | | |
| 148 | +| `sx-editor/src/build.ts` (Bun.build serve) | `c14_client_bundle.ts` | c1 | bundle TS client → ESM, cached in memory | | |
| 149 | +| `sx-filter/src/shortcodes.ts` (650 LOC; over 700 after one addition) | **split**: `c31_shortcodes_registry.ts` (built-ins, names, args) + `c32_shortcodes_substitute.ts` (HTML rewriter with meta/script skip) + mandatory `.test.ts` on the c32 | c3 | parsing/substitution; prevents an Atomic-700 violation | | |
| 150 | +| `sx-filter/src/admin.ts` (admin-button injection) | existing edit flow already has a login gate | c2 | n/a; we have real auth | | |
| 151 | +| `sx-content/src/render.ts` (Handlebars renderer) | `c51_render_theme.ts` | c5 | Ghost-compat theme renderer; **no Handlebars dep**, pure TS template helpers | | |
| 152 | +| `sx-content/src/sitemap.ts` | `c51_render_sitemap.ts` | c5 | sitemap.xml + RSS | | |
| 153 | +| `sx-content/src/images.ts` | part of `c14_media.ts` above | c1 | path-routed /content/images/* | | |
| 154 | +| `sx-themes/syntax/*.hbs` partials | `theme/*.html` or `c51_render_theme_partials.ts` | c5 | Ghost look, but as TS template helpers | | |
| 155 | + | |
| 156 | +### New handlers (c21) | |
| 157 | + | |
| 158 | +- `c21_handlers_admin_list.ts`: `/admin/` list of pages+posts | |
| 159 | +- `c21_handlers_admin_edit.ts` — `/admin/edit/{type}/{slug}` (block-editor) | |
| 160 | +- `c21_handlers_admin_new.ts` — `/admin/new` | |
| 161 | +- `c21_handlers_admin_upload.ts` — `/admin/upload` | |
| 162 | +- `c21_handlers_admin_ai.ts` — `/admin/ai/edit-block` | |
| 163 | +- `c21_handlers_admin_preview.ts` — `/admin/preview` (live render) | |
| 164 | +- `c21_handlers_content.ts` — public render dispatcher (post/page/tag/author) | |
| 165 | +- `c21_handlers_sitemap.ts` — `/sitemap.xml`, `/blog/rss/` | |
| 166 | +- `c21_handlers_media.ts` — `/content/images/*` | |
| 167 | + | |
| 168 | +The existing `c21_handlers_edit.ts` is **replaced** by `c21_handlers_admin_edit.ts` (block editor instead of textarea). | |
| 169 | + | |
| 170 | +--- | |
| 171 | + | |
| 172 | +## Content migration | |
| 173 | + | |
| 174 | +Existing tdd.md content: | |
| 175 | +``` | |
| 176 | +content/home.md | |
| 177 | +content/blog/*.md (9 posts) | |
| 178 | +content/sama/*.md (5 pages) | |
| 179 | +content/games/*/ (2 games — multi-file) | |
| 180 | +content/guides/*.md (3 pages) | |
| 181 | +content/git-history/* (commit-meta JSON) | |
| 182 | +``` | |
| 183 | + | |
| 184 | +Migration strategy (canon B, SQLite + git mirror): | |
| 185 | + | |
| 186 | +1. **One-off script** `scripts/migrate_content_to_sxdoc.ts` (runs locally, not in the container). | |
| 187 | +2. For each `.md`: read frontmatter (title, tags, status), parse body → `SxDocument` via `c31_sxdoc_parse`, **insert** into the `sx_documents` table, and also write `*.sxdoc.json` alongside for the git mirror. | |
| 188 | +3. `home.md` → slug `_home` (matches podman's special `_home` slug). | |
| 189 | +4. Games (`content/games/*/`) stay multi-file: outside CMS scope, still rendered via `c31_games.ts`. | |
| 190 | +5. `git-history/` is not content, so no migration is needed. | |
| 191 | +6. One batch commit: "Migrate: content → sxdoc (SQLite-canon + git-mirror)" with all `*.sxdoc.json` additions. | |
| 192 | + | |
| 193 | +Public URLs stay the same (they are already routed via `c31_site_config`). The Ghost-style `/blog/{primary_tag}/{slug}/` permalink is optional and goes through the redirects layer we already have. | |
| 194 | + | |
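Steps 2 and 3 of the migration strategy can be sketched with two pure helpers. This is a simplified illustration under stated assumptions: frontmatter is taken to be a `---` fenced block of flat `key: value` lines, and the slug is taken to be the path relative to `content/` without the `.md` extension. `splitFrontmatter` and `slugForPath` are hypothetical names, not functions from the repo.

```typescript
type ParsedMarkdown = { meta: Record<string, string>; body: string };

// Splits a markdown file into flat frontmatter fields and the remaining body.
function splitFrontmatter(md: string): ParsedMarkdown {
  const m = /^---\n([\s\S]*?)\n---\n?/.exec(md);
  if (m === null) return { meta: {}, body: md };
  const meta: Record<string, string> = {};
  for (const line of m[1].split("\n")) {
    const i = line.indexOf(":");
    // Lines without a key before the colon are ignored in this sketch.
    if (i > 0) meta[line.slice(0, i).trim()] = line.slice(i + 1).trim();
  }
  return { meta, body: md.slice(m[0].length) };
}

// Maps a content path to its slug; home.md gets the special `_home` slug.
function slugForPath(path: string): string {
  const rel = path.replace(/^content\//, "").replace(/\.md$/, "");
  return rel === "home" ? "_home" : rel;
}
```

The parsed body would then go through the HTML/markdown parser and into `sx_documents`; that part depends on repo code and is left out here.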
| 195 | +--- | |
| 196 | + | |
| 197 | +## Phasing | |
| 198 | + | |
| 199 | +Per memory: bypass-pacing / JOLO is OK for scopes that fit in one run. This is a multi-day port, so I am keeping the phasing, with deploy + verify per phase. | |
| 200 | + | |
| 201 | +### Phase 0: decision + scaffolding (completed 2026-05-11) | |
| 202 | +- [x] Record the plan (this document). | |
| 203 | +- [x] Storage canon confirmed: **B (SQLite-canon + git audit mirror)**. | |
| 204 | +- [x] Parser/render layer confirmed: **c31** (parser) / **c51** (render). | |
| 205 | +- [x] Commit shape confirmed: **one multi-path commit per save**. | |
| 206 | +- [x] Research-migration investigation completed → `research-migration.md`. | |
| 207 | +- [x] Layer corrections incorporated into the mapping table. | |
| 208 | +- [ ] Commit `plan.md` (waiting on user go). | |
| 209 | + | |
| 210 | +### Phase 1: sxdoc foundation (in progress 2026-05-11) | |
| 211 | +- [x] `c31_sxdoc.ts`: types only (no sibling test required) | |
| 212 | +- [x] `c31_sxdoc_parse.ts` (HTML→tree, port of podman `html-to-sx.ts`) + sibling `c31_sxdoc_parse.test.ts` | |
| 213 | +- [x] `c51_render_sxdoc.ts` (tree→HTML, port of podman `sx-to-html.ts`) + sibling `c51_render_sxdoc.test.ts` | |
| 214 | +- [x] Skip typed marketing blocks: not needed for tdd.md content (~600 LOC saved). | |
| 215 | +- [x] `c13_database.ts` extend: `sx_documents` table + saveDocument/loadDocument/listDocuments/deleteDocument | |
| 216 | +- [x] `package.json`: `node-html-parser` added | |
| 217 | +- [x] `bun install`: `node-html-parser` installed | |
| 218 | +- [x] `bun test src/c31_sxdoc_parse.test.ts src/c51_render_sxdoc.test.ts`: 53/53 ✓ | |
| 219 | +- [x] `bun test src/c32_sama_verify.test.ts`: 10/10 ✓ (verifier itself green) | |
| 220 | +- [x] `bun test` (full suite): 120/120 ✓ (67 pre-Phase-1 + 53 new) | |
| 221 | +- [x] `wc -l` on new files: highest 327 LOC (c31_sxdoc_parse), c13_database 390 LOC; all < 700 | |
| 222 | +- [x] `bun run src/c11_server.ts` boot smoke OK: `/` and `/sama` both 200 | |
| 223 | +- `c14_client_bundle.ts` (Bun.build memoised): deferred to Phase 2 | |
| 224 | +- No route impact: everything is purely unit-tested, nothing changed on the live site. | |
| 225 | + | |
| 226 | +**Phase 1 gates passed 2026-05-11. The sxdoc foundation is SAMA-canon compliant and green.** | |
| 227 | + | |
| 228 | +### Phase 2: admin UI | |
| 229 | + | |
| 230 | +**2a: server-side CRUD (completed 2026-05-11):** | |
| 231 | +- [x] `c31_admin_validation.ts` + sibling test (14/14 green): parser/validator per Modeled.md | |
| 232 | +- [x] `c51_render_admin.ts`: list + edit form + login/non-admin walls | |
| 233 | +- [x] `c21_handlers_admin.ts`: adminListHandler, adminNewHandler, adminEditHandler, adminDeleteHandler (one file instead of the 4 in the plan spec; matches the existing `c21_handlers_agents`/`c21_handlers_auth` pattern, 218 LOC) | |
| 234 | +- [x] `c21_app.ts` routes: /admin, /admin/new, /admin/edit/:type/:slug, /admin/delete/:type/:slug | |
| 235 | +- [x] Boot smoke: anonymous → 401, login wall renders ✓ | |
| 236 | +- ⚠ `c21_app.ts` is now 702 LOC (Atomic limit of 700 exceeded by 2 lines). Needs a separate split refactor: extract c21_handlers_projects.ts, c21_handlers_api_agents.ts, c21_handlers_webhook.ts from the inline part. | |
| 237 | + | |
| 238 | +**2b: client-side block editor (completed 2026-05-11):** | |
| 239 | +- [x] `c14_client_bundle.ts`: Bun.build memoised + ETag, 72 LOC | |
| 240 | +- [x] `src/client/blockeditor.ts`: hydration + state + autosave + raw-mode toggle, 336 LOC | |
| 241 | +- [x] `src/client/slashmenu.ts`: filterable popup with arrows/enter/escape, 161 LOC | |
| 242 | +- [x] `src/client/blocks.ts`: per-block-kind renderers (p, h, ul, ol, quote, code, img, hr, html, shortcode), inline marks parser, slash-trigger, 393 LOC (one file instead of `blocks/*`; under Atomic-700) | |
| 243 | +- [x] `c51_render_admin.ts`: now takes SxDocument as input, projects to textarea HTML + embeds `<script id="sxdoc-initial">` JSON, loads bundle `<script type="module" src="/admin/assets/blockeditor.js">` | |
| 244 | +- [x] `c21_handlers_admin.ts`: JSON autosave path (`Accept: application/json` → `{ok:true,ts,slug,type}`) | |
| 245 | +- [x] `c21_app.ts` route `/admin/assets/blockeditor.js` with ETag/304 | |
| 246 | +- [x] `public/style.css` admin-editor section added (~190 LOC editor + slashmenu + toast) | |
| 247 | +- [x] Bundle compiles (26KB), serves 200, ETag → 304 ✓ | |
| 248 | +- [x] Boot smoke: /admin still 401 for anonymous (auth gate intact) ✓ | |
| 249 | +- [x] Full suite 134/134 ✓ | |
| 250 | +- E2E spec `e2e/admin-block-editor.spec.ts`: deferred; needs an admin-session helper, better done in a separate turn | |
| 251 | +- Deploy + verify on p620: waiting on user go | |
| 252 | + | |
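The "memoised + ETag" behaviour of the client bundle route can be sketched without any bundler. Everything here is an assumption for illustration: `getBundle`, `respondToBundleRequest`, and the FNV-1a hash are hypothetical, and the real `c14_client_bundle.ts` uses `Bun.build` and may derive its ETag differently.

```typescript
// Small non-cryptographic hash (FNV-1a, 32-bit) used only to derive an ETag.
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return ("0000000" + h.toString(16)).slice(-8);
}

type Bundle = { js: string; etag: string };
let cachedBundle: Bundle | null = null;

// Runs the (expensive) build once and reuses the result afterwards.
function getBundle(build: () => string): Bundle {
  if (cachedBundle !== null) return cachedBundle;
  const js = build();
  const fresh: Bundle = { js, etag: `"${fnv1a(js)}"` };
  cachedBundle = fresh;
  return fresh;
}

// Answers 304 with an empty body when the client already holds this ETag.
function respondToBundleRequest(
  bundle: Bundle,
  ifNoneMatch: string | null,
): { status: number; body: string; etag: string } {
  if (ifNoneMatch === bundle.etag) return { status: 304, body: "", etag: bundle.etag };
  return { status: 200, body: bundle.js, etag: bundle.etag };
}
```

A first request gets a 200 with the ETag; a repeat request that sends the same value in `If-None-Match` gets a 304, which matches the "serves 200, ETag → 304" smoke result in the checklist.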
| 253 | +**Tech debt from Phase 2:** | |
| 254 | +- c21_app.ts is now **716 LOC** (Atomic limit of 700 exceeded by 16). The file long predates the admin port; my route additions pushed it over. Split along the `c21_handlers_projects.ts` / `c21_handlers_api_agents.ts` / `c21_handlers_webhook.ts` pattern (already used for agents/auth/reports/sama) as a separate refactor; not a Phase 3 blocker. | |
| 255 | + | |
| 256 | +### Phase 3: media + AI | |
| 257 | +- `c14_media.ts` (upload + on-disk store under `content/images/`) | |
| 258 | +- `c14_image_resize.ts` (sharp wrapper; sharp is I/O = c14) | |
| 259 | +- `c21_handlers_media.ts` (GET /content/images/...) | |
| 260 | +- `c21_handlers_admin_upload.ts` (slash-menu image card target) | |
| 261 | +- `c14_openrouter.ts` (HTTP call) + `c32_ai_edit_block.ts` (validate+transform, sibling test required) + `c21_handlers_admin_ai.ts` (✨ button) | |
| 262 | +- E2E: image upload + AI edit | |
| 263 | +- Deploy + verify | |
| 264 | + | |
| 265 | +### Phase 4: public renderer + Ghost-look theme | |
| 266 | +- `c51_render_theme.ts` (port of podman's Handlebars partials to TS template helpers, no Handlebars dep) | |
| 267 | +- `c51_render_theme_partials.ts` (nav, footer, post-card, post-list) | |
| 268 | +- `c31_shortcodes_registry.ts` (built-in list + arg schemas) | |
| 269 | +- `c32_shortcodes_substitute.ts` (HTML rewriter with meta/script skip) + `.test.ts` | |
| 270 | +- `c21_handlers_content.ts` swap: current render paths → new theme | |
| 271 | +- CSS port: `sx-themes/syntax/assets/*` → `public/style.css` (extended) | |
| 272 | +- E2E: visual parity checks (`e2e/theme-parity.spec.ts`) | |
| 273 | +- Deploy + verify | |
| 274 | + | |
| 275 | +### Phase 5: sitemap, RSS, live preview | |
| 276 | +- `c51_render_sitemap.ts` + `c21_handlers_sitemap.ts` | |
| 277 | +- `c21_handlers_admin_preview.ts` + live-preview iframe in admin | |
| 278 | +- Deploy + verify | |
| 279 | + | |
| 280 | +### Phase 6: content migration + cutover | |
| 281 | +- Run `scripts/migrate_content_to_sxdoc.ts` locally | |
| 282 | +- Commit all `*.sxdoc.json` as one migration batch | |
| 283 | +- Remove the old `c21_handlers_edit.ts` + `c51_render_edit.ts` (block editor is canonical) | |
| 284 | +- SAMA verify green (`c32_sama_verify` over all new files) | |
| 285 | +- Deploy + visual diff against pre-migration | |
| 286 | +- Update memory: "tdd.md CMS = sxdoc block-editor, Ghost-compat theme, SQLite-canon + git audit mirror" | |
| 287 | + | |
| 288 | +--- | |
| 289 | + | |
| 290 | +## Risks | |
| 291 | + | |
| 292 | +| Risk | Mitigation | | |
| 293 | +|---|---| | |
| 294 | +| Block-editor JS is large (slashmenu + 7 block types) | Bundle on demand via `c14_client_bundle`, cache in memory. No build step beyond Bun.build. | | |
| 295 | +| Ghost Handlebars helpers (`{{#foreach}}`, `{{date}}`, `{{img_url}}`) must be re-implemented by hand | Only a small set is needed; all in `c51_render_theme.ts` + unit-tested. No Handlebars dep. | | |
| 296 | +| SAMA verify trips over client/ files | client/ falls outside the cXX naming rule; already supported (see existing `e2e/`). | | |
| 297 | +| Content migration is lossy for markdown with embedded HTML | sxdoc has an `html` block escape hatch; the parser falls back to it. Visual parity check in Phase 6. | | |
| 298 | +| Sharp binary in container | Does not exist in tdd.md yet. The Quadlet image build must bake in `sharp`: a one-off Dockerfile edit. | | |
| 299 | +| `OPENROUTER_API_KEY` missing in prod | AI ✨ returns 503 with a hint (as in podman). Not blocking. | | |
| 300 | + | |
| 301 | +--- | |
| 302 | + | |
| 303 | +## Counts | |
| 304 | + | |
| 305 | +- Podman LOC (sx-editor + sx-content + sx-filter, `src/` only): ~6-8k estimated. | |
| 306 | +- Tdd.md LOC now: 7.5k. | |
| 307 | +- The port adds an estimated 4-6k (no Handlebars overhead, reuse of the existing c13/c14/c32 layers). | |
| 308 | +- End state: ~12-13k LOC, one Bun process, one SQLite file, one bare repo. No extra services. | |
| 309 | + | |
| 310 | +--- | |
| 311 | + | |
| 312 | +## Open questions (can wait) | |
| 313 | + | |
| 314 | +For Phase 1 the locked decisions are sufficient. These questions become relevant per phase: | |
| 315 | + | |
| 316 | +1. ~~Storage~~ ✅ B (SQLite-canon + git mirror), locked. | |
| 317 | +2. **Permalink shape** (Phase 4): `/blog/{primary_tag}/{slug}/` (Ghost-style) or `/blog/{slug}/` (current)? Recommended: keep the current shape, so the 9 existing URLs keep working. | |
| 318 | +3. **AI element-edit in prod** (Phase 3): set `OPENROUTER_API_KEY` on p620, or local/dev only (503 in prod)? | |
| 319 | +4. **Games** (Phase 6): outside the CMS via the existing `c31_games.ts`, or also via sxdoc? Recommended: outside. | |
| 320 | +5. **Ghost Content API endpoints** (`/ghost/api/content/...`): port them too? Recommended: drop, saves ~150 LOC. Tdd.md has no external API consumers. | |
| 321 | +6. **Marketing blocks** (hero/feature-card/etc.): port or skip? Recommended: skip; not needed for tdd.md content, saves ~600 LOC. | |
research-migration.md
+567
−0
| @@ -0,0 +1,567 @@ | ||
| 1 | +# research-migration — porting podman/syntax CMS into SAMA-native tdd.md | |
| 2 | + | |
| 3 | +Companion to `/var/home/scri/Documents/tdd.md/plan.md`. Read that first | |
| 4 | +for the high-level mapping; this goes deep on the points plan.md | |
| 5 | +handwaved. All line references are to files in | |
| 6 | +`/var/home/scri/Documents/podman/` and `/var/home/scri/Documents/tdd.md/`. | |
| 7 | + | |
| 8 | +## What I found that plan.md misses | |
| 9 | + | |
| 10 | +1. **`c32_sama_verify.ts` enforces stricter rules than plan.md | |
| 11 | + assumed.** Layer-prefix whitelist is `{11, 13, 14, 21, 31, 32, 51}` | |
| 12 | + (line 188). Plan.md proposes `c31_image_resize.ts`, but `sharp(...)` | |
| 13 | + is I/O — per `content/sama/architecture.md:13-16` resize belongs in | |
| 14 | + c14, OR c32 with `sharp` passed via DI. Same for plan.md's | |
| 15 | + `c31_ai_edit_block.ts` (calls OpenRouter — must split into c14+c32). | |
| 16 | +2. **The verifier's import scanner only inspects relative `./xxx.ts` | |
| 17 | + paths** (line 119-120). A bare `import sharp from "sharp"` in a c31 | |
| 18 | + file is invisible to the gate. The "no I/O in c31" rule is | |
| 19 | + discipline, not enforcement. | |
| 20 | +3. **Atomic threshold is 700 lines** (line 309). Two podman files | |
| 21 | + over/at the line on day one: | |
| 22 | + `sx-editor/src/client/render.ts` (775 — **violation**), | |
| 23 | + `sx-filter/src/shortcodes.ts` (650 — one new shortcode tips it). | |
| 24 | + Plan.md doesn't budget these splits. | |
| 25 | +4. **Placeholder-test detection is part of Atomic** (lines 254-298). | |
| 26 | + Every `test()/it()` body needs ≥1 `expect()`. Snapshot tests | |
| 27 | + (`toMatchSnapshot`) qualify, but should not be the default. | |
| 28 | +5. **Modeled is asymmetric** (lines 219-248). c32 without sibling test | |
| 29 | + = hard violation; c31 missing sibling = informational only. So | |
| 30 | + `c31_sxdoc.ts` (types) is fine without a test; | |
| 31 | + `c32_sxdoc_parse.ts` (logic) is not. Plan.md's `c31_sxdoc_parse.ts` | |
| 32 | + is the wrong layer — the parser is a deterministic transform, not | |
| 33 | + pure types/registry. | |
| 34 | +6. **Podman uses subdirectories (`sxdoc/`, `core/`, `db/`, `client/`).** | |
| 35 | + tdd.md's `src/` is flat (verified: no subdirs). SAMA's verifier | |
| 36 | + doesn't walk subdirs, but the convention bans them — server-side | |
| 37 | + files **must** flatten into top-level `cXX_*.ts`. plan.md mentions | |
| 38 | + this only for `client/` and only obliquely. | |
| 39 | +7. **Live-preview cannot be commit-driven.** Plan.md picks git-canon | |
| 40 | + (commit on every save), but `/admin/preview` runs on a ~200ms | |
| 41 | + debounce. The preview path must skip `c14_git` entirely and render | |
| 42 | + from in-memory sxdoc. Call this out so the handler is shaped | |
| 43 | + correctly from the start. | |
| 44 | +8. **Ghost-style `/blog/{primary_tag}/{slug}/` permalink breaks 9 | |
| 45 | + existing post URLs.** Plan.md asks the question but doesn't count. | |
| 46 | + Keep `/blog/{slug}/` unless there's a content reason to migrate. | |
| 47 | + | |
| 48 | +--- | |
| 49 | + | |
| 50 | +## 1 — SAMA-verifier compliance | |
| 51 | + | |
| 52 | +### Exact rules (`src/c32_sama_verify.ts`) | |
| 53 | + | |
| 54 | +| letter | rule | line | | |
| 55 | +|---|---|---| | |
| 56 | +| S | c1*/c3* must NOT relative-import c5*/c9* (c21 exempt) | 149-185 | | |
| 57 | +| A | prefix ∈ {11,13,14,21,31,32,51} | 188 | | |
| 58 | +| M | c32_* needs sibling .test.ts (hard); c31_* missing = info only | 219-248 | | |
| 59 | +| A | cXX_*.ts ≤ 700 lines; every test() body needs ≥1 expect() | 300-326 | | |
| 60 | + | |
| 61 | +Verifier walks only `cXX_*.ts` files; everything else under `src/` is | |
| 62 | +ignored. **Client-bundle source under `src/client/**.ts` is therefore | |
| 63 | +out of scope** — fine. | |
| 64 | + | |
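The Sorted row of the rules table can be sketched as a check over relative imports. This follows the rule exactly as the table words it (c1*/c3* files must not relative-import c5*/c9*; other files, including c21, are not constrained); `sortedViolations` is a hypothetical name and the regex is a simplification of whatever scanner the verifier really uses.

```typescript
// Returns the forbidden relative import targets found in a c1*/c3* file.
function sortedViolations(file: string, source: string): string[] {
  // Only c1* and c3* files are constrained by this rule.
  if (!/^c[13]\d_/.test(file)) return [];
  const out: string[] = [];
  // Relative imports of the form "./c5x_..." or "./c9x_...".
  const re = /from\s+["']\.\/(c[59]\d_[^"']+)["']/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(source)) !== null) {
    out.push(m[1]);
  }
  return out;
}
```

Bare imports such as `import sharp from "sharp"` never match this pattern, which is why the no-I/O rule stays a matter of discipline.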
| 65 | +### Subdirectories | |
| 66 | + | |
| 67 | +Server code in podman is split across `sx-editor/src/{sxdoc,core,db}/` | |
| 68 | +and `sx-content/src/{sxdoc,core,db}/`. tdd.md is flat: | |
| 69 | +`ls src/` returns only `cXX_*.ts` + `.test.ts` siblings. SAMA prefix | |
| 70 | +replaces folder semantics. **All server-side podman files flatten**: | |
| 71 | +- `sxdoc/types.ts` → `c31_sxdoc.ts` | |
| 72 | +- `sxdoc/html-to-sx.ts` → `c32_sxdoc_parse.ts` (+ `.test.ts`) | |
| 73 | +- `sxdoc/sx-to-html.ts` → `c32_sxdoc_render.ts` (+ `.test.ts`) | |
| 74 | +- `sxdoc/db.ts` → `c14_sxdoc_sidecar.ts` (Option A) or `c14_sxdoc_store.ts` (Option B) | |
| 75 | +- `core/schema.ts` + `db/sqlite.ts` → merge into existing `c13_database.ts` | |
| 76 | +- `core/posts.ts` (editor & content) → one `c13_posts.ts` | |
| 77 | +- `core/settings.ts` → extend `c31_site_config.ts` | |
| 78 | +- `sxdoc/index.ts` (barrel) → DELETE (SAMA bans barrel re-exports per | |
| 79 | + `content/sama/atomic.md`) | |
| 80 | + | |
| 81 | +### Client-side placement | |
| 82 | + | |
| 83 | +tdd.md has **no precedent** for client TS today: `public/` holds | |
| 84 | +`og.svg`, `style.css`, `sama-cli` (binary). `e2e/` holds Playwright | |
| 85 | +specs. Options for the block-editor client: | |
| 86 | +- **A. `src/client/**.ts`** — outside verifier glob, relative imports | |
| 87 | + to `../c31_sxdoc.ts` work, `Bun.build` bundles from here. Recommended. | |
| 88 | +- B. `client/` at repo root — separates browser more clearly; new | |
| 89 | + top-level dir. | |
| 90 | +- C. `public/src/**.ts` — confusing; `public/` is "served verbatim". | |
| 91 | + | |
| 92 | +`client/render.ts` (775 lines) **must split** before landing. Natural | |
| 93 | +axis: one file per block-kind (matches the existing `blocks/*.ts` | |
| 94 | +breakdown) + a small `client/render-dispatch.ts` switch on `block.t`. | |
| 95 | + | |
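The proposed split (one renderer per block kind plus a small dispatch on `block.t`) can be sketched as follows. Only the `t` and `c` keys come from the source; the `level` field, the four-kind subset, the string output, and every function name are assumptions for illustration, and the real client renderer may build DOM nodes rather than HTML strings.

```typescript
// Hypothetical subset of block kinds; the real union is larger.
type Block =
  | { t: "p"; c: string }
  | { t: "h"; level: number; c: string }
  | { t: "code"; c: string }
  | { t: "hr" };

function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// In the split layout each of these would live in its own render-<kind>.ts file.
function renderP(c: string): string {
  return `<p>${escapeHtml(c)}</p>`;
}
function renderH(level: number, c: string): string {
  return `<h${level}>${escapeHtml(c)}</h${level}>`;
}
function renderCode(c: string): string {
  return `<pre><code>${escapeHtml(c)}</code></pre>`;
}

// The dispatch file stays small: one switch on the block kind.
function renderBlock(block: Block): string {
  switch (block.t) {
    case "p":
      return renderP(block.c);
    case "h":
      return renderH(block.level, block.c);
    case "code":
      return renderCode(block.c);
    case "hr":
      return "<hr>";
    default: {
      // Compile-time exhaustiveness check: adding a kind forces a new case.
      const unreachable: never = block;
      throw new Error(`unknown block: ${JSON.stringify(unreachable)}`);
    }
  }
}
```

Keeping the switch exhaustive is the design choice that keeps the dispatch file short while every per-kind file stays well under the 700-line limit.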
| 96 | +### Test convention | |
| 97 | + | |
| 98 | +tdd.md tests live as siblings under `src/`: | |
| 99 | +`c31_commits.test.ts`, `c31_diff_parse.test.ts`, | |
| 100 | +`c31_edit_validation.test.ts`, `c31_git_parse.test.ts`, | |
| 101 | +`c31_commit_meta.test.ts`, `c31_games.test.ts`, | |
| 102 | +`c32_anchor_extract.test.ts`, `c32_edit_resolve.test.ts`, | |
| 103 | +`c32_sama_verify.test.ts`. | |
| 104 | + | |
| 105 | +Podman's `sx-editor/tests/unit.test.ts` and `sx-content/tests/setup.ts` | |
| 106 | +are **incompatible** — verifier looks for `<file>.test.ts` next to | |
| 107 | +`<file>.ts`. Every kept test becomes a sibling file. | |
| 108 | + | |
| 109 | +E2E remains in `e2e/*.spec.ts` (Playwright, ignored by verifier). | |
| 110 | + | |
| 111 | +--- | |
| 112 | + | |
| 113 | +## 2 — Storage-model conflict | |
| 114 | + | |
| 115 | +### `SxDocument` shape (`sx-editor/src/sxdoc/types.ts`) | |
| 116 | + | |
| 117 | +`{ v: 1, blocks: SxBlock[] }`. Single-letter keys (`t`, `c`) for | |
| 118 | +compactness (lines 1-12). 19 block kinds: `p`, `h`, `ul`, `ol`, `li`, | |
| 119 | +`quote`, `code`, `img`, `hr`, `html`, `shortcode`, `embed`, plus 7 | |
| 120 | +typed marketing blocks (`hero`, `feature-card`, `feature-grid`, | |
| 121 | +`stats-row`, `steps-grid`, `use-case-card`, `cta-band`). Inline marks | |
| 122 | +`b/i/u/s/c`; links are inline. | |
| 123 | + | |
| 124 | +No footnotes, no tables — tables fall through to `{t:"html"}` escape | |
| 125 | +hatch. | |
| 126 | + | |
| 127 | +### SQLite tables (`sx-editor/src/core/schema.ts`) | |
| 128 | + | |
| 129 | +Seven Ghost-shaped tables: `posts`, `tags`, `users`, `posts_tags`, | |
| 130 | +`posts_authors`, `api_keys`, `settings`. Plus `sx_documents` (one row | |
| 131 | +per post, holds the typed-block JSON): | |
| 132 | +`(post_id PK, doc TEXT, doc_version INT, hash TEXT, updated_at TEXT)`. | |
| 133 | + | |
| 134 | +### Option A (git-canon, default) write flow | |
| 135 | + | |
| 136 | +`POST /admin/edit/blog/foo`: | |
| 137 | +1. validate + parse form → `(markdown_body, sxdoc_json)` | |
| 138 | +2. `c14_git.commitFile({ paths: [ | |
| 139 | + {path:"content/blog/foo.md", content:markdown_body}, | |
| 140 | + {path:"content/blog/foo.sxdoc.json", content:sxdoc_json} ]})` | |
| 141 | + — **needs new `commitFiles` (multi-path) variant**. | |
| 142 | +3. mirror to live FS so the next render reflects it. | |
| 143 | +4. show "applied · sha XXXXXXX". | |
| 144 | + | |
| 145 | +**Commit message**: piggy-back the existing helper | |
| 146 | +`buildCommitMessage` from `c31_commit_meta.ts` (already used by | |
| 147 | +`c21_handlers_edit.ts:96`). Message format stays as today: | |
| 148 | +`Edit: <title> by <author> via /admin\n\n<filePath>`. | |
| 149 | + | |
| 150 | +`c14_git.commitFile` (lines 192-250) is single-path. Extending to | |
| 151 | +multi-path is ~30 added lines — same 5-step flow, with step 3 | |
| 152 | +(read-tree + update-index) looping over paths. | |
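One way to picture the per-path loop is as building a single input for `git update-index --index-info`, which accepts lines of the form `<mode> <sha>\t<path>`. This is only a sketch under assumptions: `StagedBlob` and `buildIndexInfo` are hypothetical names, blobs are assumed to be written already, and every file is assumed to be a regular file (mode `100644`). The existing `commitFile` may stage paths differently.

```ts
// Hypothetical shape: one already-written blob per path to stage.
interface StagedBlob {
  path: string; // repo-relative, e.g. "content/blog/foo.md"
  blobSha: string; // object id of the blob holding the new content
}

// Builds the stdin payload for `git update-index --index-info`,
// one "<mode> <sha>\t<path>" line per staged file.
function buildIndexInfo(blobs: StagedBlob[]): string {
  return blobs
    .map(({ path, blobSha }) => {
      if (path.includes("\n")) {
        throw new Error(`path contains a newline: ${JSON.stringify(path)}`);
      }
      return `100644 ${blobSha}\t${path}\n`;
    })
    .join("");
}
```

With this shape the single-path case is just an array of one entry, so the multi-path variant does not need a second code path.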
| 153 | + | |
| 154 | +**Sidecar regen.** Because markdown is canonical and sxdoc is | |
| 155 | +derivable, treat sidecars as **cache**. If sidecar missing or older | |
| 156 | +than `.md`, regenerate via `marked.parse(md) → htmlToSx(html)`. This | |
| 157 | +makes the "drop SQLite index, replay git log" rebuild story that | |
| 158 | +plan.md mentions actually trivial. | |
| 159 | + | |
| 160 | +### Real-content survey (assessed by full-read of 3 files + grep) | |
| 161 | + | |
| 162 | +| file | code fences | tables | embedded HTML | frontmatter | | |
| 163 | +|---|---|---|---|---| | |
| 164 | +| `content/home.md` (3.2 KB) | 0 | 1 (5 rows) | 0 | no | | |
| 165 | +| `content/blog/sama-meets-git-cms.md` | 4 | 0 | 0 | no | | |
| 166 | +| `content/blog/three-constraints-agentic-coding.md` | 7 | 0 | 0 | no | | |
| 167 | +| `content/sama/architecture.md` | 1 | 1 (4×4) | 0 | no | | |
| 168 | +| `content/sama/skill.md` | many | many | 0 | **YES** | | |
| 169 | +| (other 13 .md) | many | mixed | 0 | no | | |
| 170 | + | |
| 171 | +Confirmed by grep: **only `content/sama/skill.md` has YAML | |
| 172 | +frontmatter** (`---\nname: …\n---`). Other matches for `^---` are | |
| 173 | +markdown horizontal rules (`<hr>`) inside the body of `sama/*.md` | |
| 174 | +and a few blog posts — *not* frontmatter. The migration script must | |
| 175 | +distinguish: frontmatter = `^---\n[a-zA-Z_]+:` at byte 0. | |
| 176 | + | |
| 177 | +### What `htmlToSx` handles vs doesn't (`sx-editor/src/sxdoc/html-to-sx.ts`) | |
| 178 | + | |
| 179 | +Block-level handled: `p`, `h1..h6`, `ul`/`ol`/`li`, `blockquote`, | |
| 180 | +`pre`/`code` (with `language-X` detection), `img`, `figure`, `hr`. | |
| 181 | +Container divs (`div`, `section`, `article`) recurse into children. | |
| 182 | +Everything else → `{t:"html", src: el.outerHTML}` escape hatch | |
| 183 | +(line 183). | |
| 184 | + | |
| 185 | +Inline handled: `<a>`, `<br>`, `<strong>/<b>`, `<em>/<i>`, `<u>`, | |
| 186 | +`<s>/<strike>/<del>`, `<code>`. `<span>/<font>` strip wrapper, keep | |
| 187 | +content. | |
| 188 | + | |
| 189 | +**Implication for our content**: | |
| 190 | +- Tables → single `html` block per table. Renders identically but | |
| 191 | + un-editable as discrete blocks. **Acceptable.** | |
| 192 | +- HR (`---`) → `{t:"hr"}`. Good. | |
| 193 | +- Code fences → `{t:"code", lang:"sh", src:"..."}`. Good. | |
| 194 | +- Quote-blocks (`> …` markdown) → `<blockquote>` HTML → `{t:"quote", | |
| 195 | + c:[…]}`. Good. | |
| 196 | +- Frontmatter (skill.md only) — `marked` doesn't strip it by default | |
| 197 | + in tdd.md's current `c51_render_layout.ts:8` call. **Pre-check | |
| 198 | + what the live site does today** before migrating. | |
| 199 | +- Round-trip drift exists: mark order is normalised | |
| 200 | + (`sx-to-html.ts:227`), `<b>` collapses to `<strong>`, whitespace | |
| 201 | + shifts. Acceptable for the migration since markdown stays | |
| 202 | + authoritative. | |
| 203 | + | |
| 204 | +### Option B (SQLite-canon) trade | |
| 205 | + | |
| 206 | +git as audit-trail disappears. Compensation table: `content_history | |
| 207 | +(id, slug, type, doc, html, edited_at, edited_by, msg)` — append-only. | |
| 208 | +Has rollback but no cryptographic immutability, no `git blame`, no | |
| 209 | +PR diffs, no mirror story. | |
| 210 | + | |
| 211 | +The `content/blog/sama-meets-git-cms.md` post (149 lines) is the | |
| 212 | +product pitch for "every save = a real commit". B contradicts | |
| 213 | +published copy. **Recommend A.** Mechanical concerns (multi-path | |
| 214 | +commit, sidecar regen) are small; the "stop saying SAMA meets git" | |
| 215 | +cost is large. | |
| 216 | + | |
| 217 | +--- | |
| 218 | + | |
| 219 | +## 3 — Handlebars-theme port | |
| 220 | + | |
| 221 | +### Helpers used (exhaustive, source: `sx-content/src/render.ts`) | |
| 222 | + | |
| 223 | +`Handlebars.registerHelper` calls at lines 78, 86, 93, 109, 129, 135, | |
| 224 | +158, 166, 184, 202, 205, 210, 390, 393: | |
| 225 | + | |
| 226 | +| helper | line | use | TS-port effort | | |
| 227 | +|---|---|---|---| | |
| 228 | +| `asset` | 78 | `{{asset "css/syntax.css"}}` → `/assets/...` | trivial | | |
| 229 | +| `img_url` | 86 | pass-through today (no transforms) | trivial | | |
| 230 | +| `post_class` | 93 | join class strings from featured/tags | trivial | | |
| 231 | +| `ghost_head` | 109 | 5-10 meta/og tags + codeinjection | medium — existing `c51_render_layout.ts` already emits a similar block | | |
| 232 | +| `ghost_foot` | 129 | code injection footer | trivial | | |
| 233 | +| `date` | 135 | dual-shape formatter (YYYY/MMM/DD) | small | | |
| 234 | +| `content` | 158 | emit body html raw | trivial | | |
| 235 | +| `excerpt` | 166 | strip HTML + truncate N words | small | | |
| 236 | +| `foreach` | 184 | iteration with `@index/@first/@last/@even/@odd` | medium — TS map gets index; rest unused in current .hbs files (confirmed by grep) | | |
| 237 | +| `tag`, `author`, `page`, `post` (block) | 202/205/390/393 | scope-dive | structural; replaced by TS functions that take the scoped object as arg | | |
| 238 | +| `reading_time` | 210 | "N min read" | trivial | | |
| 239 | + | |
| 240 | +Built-in (`{{#if}}`, `{{else}}`, `{{!-- comment --}}`, `{{!< layout}}`) | |
| 241 | +are template-language features that go away once we render via TS | |
| 242 | +functions; no port needed. | |
| 243 | + | |
| 244 | +**Mismatches**: `{{#foreach}}`'s `@first/@last/@even/@odd` is the only | |
| 245 | +data plumbing TS map doesn't give for free. Grep of `.hbs` files | |
| 246 | +confirms none of those data-frame fields are referenced in current | |
| 247 | +templates. Safe to drop in the TS port. | |
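A short sketch of what replaces `{{#foreach}}` in the TS port, assuming a hypothetical `PostSummary` shape and a small `escapeHtml` helper (neither is taken from the codebase). `Array.prototype.map` supplies the index; the first/last flags can be derived on demand if a template ever needs them.

```ts
// Hypothetical shape for illustration only.
interface PostSummary {
  title: string;
  url: string;
}

const escapeHtml = (s: string): string =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;").replace(/"/g, "&quot;");

// Equivalent of {{#foreach posts}}...{{/foreach}}: map gives @index for free.
function renderPostList(posts: PostSummary[]): string {
  const items = posts.map(
    (post, index) =>
      `<li data-index="${index}"><a href="${escapeHtml(post.url)}">${escapeHtml(post.title)}</a></li>`,
  );
  return `<ul>${items.join("")}</ul>`;
}

// Only needed if a future template wants @first/@last back.
const frameFlags = (index: number, length: number) => ({
  first: index === 0,
  last: index === length - 1,
});
```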
| 248 | + | |
| 249 | +### Templates inventory (`sx-themes/syntax/`) | |
| 250 | + | |
| 251 | +`default.hbs` (14 lines — wrapper), `index.hbs` (757 lines — | |
| 252 | +marketing homepage HTML inline; partial `syntax-home.hbs` no longer | |
| 253 | +used per `sx-editor/src/index.ts:284-289`), `post.hbs` (37), | |
| 254 | +`page.hbs` (40), `tag.hbs` (24), `author.hbs` (24). | |
| 255 | + | |
| 256 | +`assets/css/syntax.css` is **812 lines**. tdd.md's `public/style.css` | |
| 257 | +is ~25 KB. Combining is a real CSS pass; classes like `.hero-content`, | |
| 258 | +`.feature-card`, `.use-case-card`, `.gradient-text` don't exist in | |
| 259 | +tdd.md today. | |
| 260 | + | |
| 261 | +### TS-native equivalents land in | |
| 262 | + | |
| 263 | +- `c51_render_theme.ts` — `renderPost(post)`, `renderPage(page)`, | |
| 264 | + `renderTagArchive(tag, posts)`, `renderAuthorArchive(author, posts)`, | |
| 265 | + `renderHomepage()`. Each replaces one `.hbs` file. | |
| 266 | +- `c51_render_meta.ts` (or extend existing `c51_render_layout.ts`) — | |
| 267 | + `ghost_head`-equivalent. tdd.md already emits OG/meta in | |
| 268 | + `c51_render_layout.ts:49+`; combine, don't reimplement. | |
| 269 | +- The five small string helpers (`asset`, `date`, `excerpt`, | |
| 270 | + `reading_time`, `post_class`) live inline in `c51_render_theme.ts` | |
| 271 | + as private functions. No external file warranted. | |
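Two of the small helpers, sketched as private functions. The behaviour follows the helper table ("strip HTML + truncate N words", "N min read"); the default word limit of 50 and the 200 words-per-minute rate are assumptions, not values read from `sx-content/src/render.ts`.

```ts
// Strip tags and collapse whitespace into single spaces.
function plainText(html: string): string {
  return html.replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim();
}

// `excerpt` equivalent: first `words` words of the tag-stripped body.
// The default of 50 is an assumption.
function excerpt(html: string, words = 50): string {
  const text = plainText(html);
  if (text === "") return "";
  return text.split(" ").slice(0, words).join(" ");
}

// `reading_time` equivalent. 200 words per minute is an assumed rate;
// anything non-empty or empty rounds up to at least one minute.
function readingTime(html: string, wordsPerMinute = 200): string {
  const text = plainText(html);
  const count = text === "" ? 0 : text.split(" ").length;
  const minutes = Math.max(1, Math.ceil(count / wordsPerMinute));
  return `${minutes} min read`;
}
```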
| 272 | + | |
| 273 | +--- | |
| 274 | + | |
| 275 | +## 4 — Shortcode-engine port | |
| 276 | + | |
| 277 | +### What `sx-filter/src/shortcodes.ts` (650 lines) does | |
| 278 | + | |
| 279 | +`BUILT_IN` registry at lines 546-563. Three categories: | |
| 280 | + | |
| 281 | +- **Pure** (no I/O): `ping`, `now`, `spec-version`, `event-validate`, | |
| 282 | + `catalog-sample`, `query-demo`, `catalog-lookup` (reads in-process | |
| 283 | + `DEMO_CATALOG`), `emit`+`demo-flow` (writes in-process `events.ts` | |
| 284 | + ring buffer). | |
| 285 | +- **HTTP-fetching** (external API): `github-repo`, `npm`, `crate`, | |
| 286 | + `gist`. | |
| 287 | +- **Ghost-API-fetching**: `event-count`, `posts-list` (Ghost content | |
| 288 | + API), `login-page` (Ghost `_login-skin` page). | |
| 289 | + | |
| 290 | +Module-level `SHARED_EVENT_LOG` (line 14) + `DEMO_CATALOG` | |
| 291 | +(lines 26-107) push the file to 650 lines. **One more handler tips | |
| 292 | +it over 700.** | |
| 293 | + | |
| 294 | +### SAMA placement | |
| 295 | + | |
| 296 | +The handlers split by layer: | |
| 297 | +- **c32**: pure regex match, format, validate. `query-demo`, | |
| 298 | + `event-validate`, `catalog-sample`, `catalog-lookup`, | |
| 299 | + `event-count` parser (just an int). | |
| 300 | +- **c14**: HTTP wrappers for external APIs. `c14_github.ts` already | |
| 301 | + exists. New: `c14_npm.ts`, `c14_crates.ts`, `c14_gist.ts` — or one | |
| 302 | + combined `c14_package_registries.ts` (recommended for fewer files). | |
| 303 | +- **c13**: queries against `posts`/`sx_documents` for `posts-list` | |
| 304 | + etc., extending `c13_database.ts`. | |
| 305 | +- **c32_event_log.ts**: pure in-memory ring buffer; required only if | |
| 306 | + `emit`/`demo-flow` ship. | |
| 307 | + | |
| 308 | +### Where the substitute loop lives | |
| 309 | + | |
| 310 | +`sx-filter/src/index.ts:81-120` does the rewrite: | |
| 311 | +1. parse upstream HTML (already-rendered page), | |
| 312 | +2. build skip-regions (`<meta>`, `<link>`, `<script>`), | |
| 313 | +3. for each `SHORTCODE_RE` match, call handler, splice output. | |
| 314 | + | |
| 315 | +This is **render-time HTML rewriting**, runs after sxdoc → HTML. It's | |
| 316 | +a c51 concern wrapping c14/c32 handlers. **Cannot live in `c11_server.ts`** | |
| 317 | +— c11 forbids route logic / HTML rewriting per `content/sama/architecture.md:12`. | |
| 318 | + | |
| 319 | +Recommended shape: | |
| 320 | +- `c32_shortcode_parse.ts` (+test) — extract `{name, args, range}` | |
| 321 | + tokens from text. Pure regex; same pattern as today. | |
| 322 | +- Handler functions at their natural layer. | |
| 323 | +- `c51_render_post.ts` calls the parser, dispatches handlers inline | |
| 324 | + (~10 lines for a switch). No central registry; each handler is | |
| 325 | + just a function imported where needed. | |
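The recommended shape can be sketched as a pure parser plus a substitute loop that takes a dispatch function. The `[[name arg ...]]` syntax and the regex below are assumptions for illustration; the real pattern is `SHORTCODE_RE` in `sx-filter/src/shortcodes.ts`. Skip-regions are left out to keep the sketch short.

```ts
interface ShortcodeToken {
  name: string;
  args: string[];
  range: [number, number]; // half-open [start, end) offsets into the text
}

// Assumed syntax: [[name arg1 arg2]]. Not the production regex.
const SHORTCODE_SKETCH_RE = /\[\[([a-z][a-z0-9-]*)((?:\s+[^\s\]]+)*)\s*\]\]/g;

// Pure: extracts tokens, performs no I/O and calls no handlers.
function parseShortcodes(text: string): ShortcodeToken[] {
  const tokens: ShortcodeToken[] = [];
  for (const m of text.matchAll(SHORTCODE_SKETCH_RE)) {
    const start = m.index ?? 0;
    const rawArgs = (m[2] ?? "").trim();
    tokens.push({
      name: m[1] ?? "",
      args: rawArgs === "" ? [] : rawArgs.split(/\s+/),
      range: [start, start + m[0].length],
    });
  }
  return tokens;
}

// Render-side loop: `dispatch` is the inline switch; returning null
// leaves the shortcode source text untouched.
function substituteShortcodes(
  text: string,
  dispatch: (token: ShortcodeToken) => string | null,
): string {
  let out = "";
  let cursor = 0;
  for (const token of parseShortcodes(text)) {
    const replacement = dispatch(token);
    if (replacement === null) continue;
    out += text.slice(cursor, token.range[0]) + replacement;
    cursor = token.range[1];
  }
  return out + text.slice(cursor);
}
```

Passing the dispatch function in keeps the parser free of handler imports, so the c32 file stays pure while the c51 caller owns the switch.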
| 326 | + | |
| 327 | +### Single-process advantage | |
| 328 | + | |
| 329 | +Podman's filter is a separate Bun service proxying Ghost. tdd.md is | |
| 330 | +one process — substitute is a function call, not an HTTP hop. The | |
| 331 | +~100 lines of `sx-filter/src/index.ts` doing upstream-proxy wiring | |
| 332 | +are deleted; the ~30 lines of skip-region + substitute logic move | |
| 333 | +into c51. | |
| 334 | + | |
| 335 | +--- | |
| 336 | + | |
| 337 | +## 5 — File inventory (server-side, podman → tdd.md) | |
| 338 | + | |
| 339 | +### sx-editor/src/ | |
| 340 | + | |
| 341 | +- `ai.ts` (317) → `c14_openrouter.ts` + `c32_ai_edit_block.ts` — HTTP | |
| 342 | + client (c14), prompt assembly + JSON validation (c32). plan.md's | |
| 343 | + `c31_ai_edit_block.ts` is wrong layer. | |
| 344 | +- `build.ts` (61) → `c14_client_bundle.ts` — calls `Bun.build`, I/O. | |
| 345 | +- `db.ts` (124) → split: SQL into `c13_posts.ts`; htmlToSx fallback | |
| 346 | + into the handler. plan.md's `c14_sxdoc_store.ts` is a different | |
| 347 | + file (sx-doc only); db.ts is core posts. | |
| 348 | +- `index.ts` (437) → dispatcher entries in `c21_app.ts`; per-route | |
| 349 | + handler bodies in `c21_handlers_admin_{list,edit,new,upload,ai, | |
| 350 | + preview}.ts`. 4-6 files of 80-150 lines. | |
| 351 | +- `routes.ts` (44) → merge into existing `c31_site_config.ts`. | |
| 352 | +- `templates.ts` (482) → `c51_render_admin.ts`. At Atomic limit; | |
| 353 | + watch for growth. | |
| 354 | +- `upload.ts` (87) → `c14_media.ts`. | |
| 355 | +- `sxdoc/types.ts` (240) → `c31_sxdoc.ts`. Types only; no sibling | |
| 356 | + test (informational only). | |
| 357 | +- `sxdoc/html-to-sx.ts` (315) → `c32_sxdoc_parse.ts` (+ `.test.ts`). | |
| 358 | +- `sxdoc/sx-to-html.ts` (266) → `c32_sxdoc_render.ts` (+ `.test.ts`). | |
| 359 | +- `sxdoc/db.ts` (64) → `c14_sxdoc_sidecar.ts` (Option A) **or** | |
| 360 | + `c14_sxdoc_store.ts` (Option B). Same shape, different backend. | |
| 361 | +- `sxdoc/index.ts` (14) → DELETE (barrel, SAMA-forbidden). | |
| 362 | +- `core/posts.ts` (148) → merge into `c13_posts.ts` with content's. | |
| 363 | +- `core/schema.ts` (103) → merge into `c13_database.ts`. | |
| 364 | +- `db/sqlite.ts` (41) → merge into `c13_database.ts`. | |
| 365 | +- `scripts/backfill-sxdoc.ts` → `scripts/migrate_content_to_sxdoc.ts`. | |
| 366 | +- `scripts/import-homepage.ts` → discard. | |
| 367 | + | |
| 368 | +### sx-content/src/ | |
| 369 | + | |
| 370 | +- `db.ts` (11) → merge into `c13_database.ts`. Trivial. | |
| 371 | +- `images.ts` (125) → `c14_media.ts` (combined with upload.ts). | |
| 372 | + `sharp` is I/O — **c14, not c31** as plan.md proposed. | |
| 373 | +- `index.ts` (536) → `c21_handlers_content.ts` (+ optional | |
| 374 | + `c21_handlers_ghost_api.ts` — see open question 7). | |
| 375 | +- `posts.ts` (140) → merge into single `c13_posts.ts`. | |
| 376 | +- `render.ts` (398) → `c51_render_theme.ts`. Drops the Handlebars | |
| 377 | + dep. | |
| 378 | +- `routes.ts` (199) → split: URL patterns into `c31_site_config.ts`, | |
| 379 | + classifyUrl logic into `c32_url_classify.ts` (+ test). | |
| 380 | +- `sitemap.ts` (134) → `c51_render_sitemap.ts` + `c21_handlers_sitemap.ts`. | |
| 381 | +- `sxdoc/*` (913 total) → duplicates of editor's; **single source of | |
| 382 | + truth in tdd.md**, both reads and writes use the same c31/c32/c14 | |
| 383 | + triplet. | |
| 384 | +- `core/posts.ts` (254), `core/schema.ts` (101), `core/settings.ts` | |
| 385 | + (118), `db/sqlite.ts` (43) → merge as listed for editor's | |
| 386 | + equivalents; `core/settings.ts` extends `c31_site_config.ts`. | |
| 387 | + | |
| 388 | +### sx-filter/src/ | |
| 389 | + | |
| 390 | +- `admin.ts` (114) → DELETE. tdd.md has real auth; no injection. | |
| 391 | +- `events.ts` (211) → `c32_event_log.ts` if event-demo shortcodes | |
| 392 | + ship. Otherwise DELETE. | |
| 393 | +- `index.ts` (379) → discard proxy logic; substitute loop moves to | |
| 394 | + c51 (described in §4). | |
| 395 | +- `login-page-skin.html` (174), `login-page-template.ts` (205) → | |
| 396 | + DELETE (syntax.ai demo asset). | |
| 397 | +- `shortcodes.ts` (650) → `c32_shortcode_parse.ts` + handler files | |
| 398 | + at natural layers + dispatch inline in c51. Demo shortcodes | |
| 399 | + (event-* / catalog-* / login-page) are open question 4. | |
| 400 | + | |
| 401 | +### Non-clean mappings flagged | |
| 402 | + | |
| 403 | +- Two big dispatcher files (editor `index.ts` 437, content `index.ts` | |
| 404 | + 536) **must split**: dispatcher entries go into `c21_app.ts`, | |
| 405 | + handler bodies into per-domain `c21_handlers_*.ts`. | |
| 406 | +- `sxdoc/` duplicated between editor and content services — keep one | |
| 407 | + copy in tdd.md. | |
| 408 | +- `core/schema.ts`, `db/sqlite.ts` duplicated — one copy. | |
| 409 | +- `marked` already in tdd.md deps (`c51_render_layout.ts:8`). The | |
| 410 | + migration uses it; after cutover see open question 12. | |
| 411 | + | |
| 412 | +### Client (sx-editor/src/client/**) | |
| 413 | + | |
| 414 | +Lands at `src/client/**` (outside verifier glob). Sizes preserved. | |
| 415 | +Key file: `render.ts` (775 — **must split before landing**). Natural | |
| 416 | +split per-block-kind matches the existing `blocks/*` and | |
| 417 | +`blocks/typed/*` breakdown. | |
| 418 | + | |
| 419 | +Open: `slashmenu.ts` (590) vs `slashmenu-v2.ts` (216) — figure out | |
| 420 | +which is canonical before porting. | |
| 421 | + | |
| 422 | +--- | |
| 423 | + | |
| 424 | +## 6 — Content migration mechanics | |
| 425 | + | |
| 426 | +### Algorithm | |
| 427 | + | |
| 428 | +```ts | |
| 429 | +// scripts/migrate_content_to_sxdoc.ts | |
| 430 | +for (const file of glob("content/**/*.md")) { | |
| 431 | + if (file.startsWith("content/games/")) continue; | |
| 432 | + if (file.startsWith("content/git-history/")) continue; | |
| 433 | + const raw = await Bun.file(file).text(); | |
| 434 | + const { fm, body } = splitFrontmatter(raw); // skill.md only | |
| 435 | + const html = await marked.parse(body, { gfm: true, breaks: false }); | |
| 436 | + let doc: SxDocument; | |
| 437 | + try { doc = htmlToSx(html); } | |
| 438 | + catch (e) { | |
| 439 | + // Fallback: single html-block holding the markdown-rendered HTML. | |
| 440 | + doc = { v: 1, blocks: [{ t: "html", src: html }] }; | |
| 441 | + } | |
| 442 | + const sxdocPath = file.replace(/\.md$/, ".sxdoc.json"); | |
| 443 | + await Bun.write(sxdocPath, JSON.stringify(doc, null, 2)); | |
| 444 | +} | |
| 445 | +// One batched commit, run from the shell once the script finishes: | |
| 446 | +//   git add content/**/*.sxdoc.json | |
| 447 | +//   git commit -m "Migrate content to sxdoc sidecars (one-time)" | |
| 448 | +``` | |
| 449 | + | |
| 450 | +### Edge cases | |
| 451 | + | |
| 452 | +- **Tables** → single `{t:"html"}` block per table. Renders | |
| 453 | + identically; un-editable as discrete blocks in the block editor. | |
| 454 | + Acceptable. | |
| 455 | +- **Frontmatter (`skill.md`)** → strip first, parse body. Decide | |
| 456 | + separately what happens to the `name:`/`description:` fields: today | |
| 457 | + they probably render as visible text via marked. Pre-check live | |
| 458 | + site behaviour before migrating. | |
| 459 | +- **HR (`---` mid-document)** is NOT frontmatter. Frontmatter pattern: | |
| 460 | + `/^---\n[a-zA-Z_]+:/` at byte 0. | |
| 461 | +- **Parse fail** → escape hatch as shown. Page still renders | |
| 462 | + (`sx-to-html.ts:60-62` emits raw HTML untouched). Editor surfaces | |
| 463 | + "open `/edit-raw/...` for this section". | |
| 464 | +- **Code fences** all currently `sh` / `ts` / `text` — parsed by | |
| 465 | +  `parseLangFromClass` (lines 296-298) into `{t:"code", lang, src}`. | |
| 466 | + No issue. | |
| 467 | +- **Round-trip drift** — `<b>` collapses to `<strong>`, mark order | |
| 468 | + normalised. Acceptable since `.md` stays authoritative. | |
| 469 | + | |
| 470 | +### Commit strategy: single batch | |
| 471 | + | |
| 472 | +18 files → one "Migrate: content → sxdoc" commit. Per-file commits | |
| 473 | +add noise without informational value. Future re-migration after | |
| 474 | +parser improvements stays a single revertable commit. | |
| 475 | + | |
| 476 | +### Games confirmed out of scope | |
| 477 | + | |
| 478 | +`content/games/{fizzbuzz,string-calc}/` are multi-file units: | |
| 479 | +`spec.md` + `spec.ts` + `hidden/`. Read by `c31_games.ts` directly, | |
| 480 | +not by the CMS; the companion `.ts` and `hidden/` directory make the | |
| 481 | +post/page abstraction wrong. Keep games entirely outside the CMS. | |
| 482 | +**No `/edit/games/...` route should exist** — edit via vim+git like | |
| 483 | +source code. | |
| 484 | + | |
| 485 | +### git-history out of scope | |
| 486 | + | |
| 487 | +`content/git-history/syntaxai__tdd.md{,.tests}.json` (160 KB total) | |
| 488 | +are generated artifacts read by `c14_real_reports.ts` / | |
| 489 | +`c14_real_tests.ts`. Not content. | |
| 490 | + | |
| 491 | +--- | |
| 492 | + | |
| 493 | +## Open decision points for the human | |
| 494 | + | |
| 495 | +1. **Storage canon — A (git-canon) or B (SQLite-canon)?** | |
| 496 | + plan.md defaults A; my read supports A (the existing | |
| 497 | + `content/blog/sama-meets-git-cms.md` is the product pitch and | |
| 498 | + contradicts B). Confirm A, or pick B and accept rewriting that | |
| 499 | + post + memory update. | |
| 500 | + | |
| 501 | +2. **sxdoc parser layer — c31 or c32?** | |
| 502 | + plan.md says c31; I argue c32 (deterministic transform with logic, | |
| 503 | + not pure types/registry). Affects file name and whether sibling | |
| 504 | + tests are mandatory (c32 yes, c31 informational). | |
| 505 | + | |
| 506 | +3. **Single-commit vs two-commit per editor save (Option A).** | |
| 507 | + Either extend `c14_git.commitFile` to multi-path (recommended, | |
| 508 | + ~30 LOC) OR write `.md` and `.sxdoc.json` as two commits (simpler, | |
| 509 | + doubled log noise, atomicity hole if step 2 fails). | |
| 510 | + | |
| 511 | +4. **Ship the syntax.ai event-demo shortcodes?** | |
| 512 | + `emit`, `catalog-lookup`, `demo-flow`, `login-page`, | |
| 513 | + `event-validate`, `catalog-sample`, `query-demo`, `event-count`, | |
| 514 | + `posts-list`. These exist for syntax.ai's product story; tdd.md is | |
| 515 | + a different product. **Default: off.** Saves ~500 LOC (skip | |
| 516 | + `events.ts` port + 5 handler files + the `DEMO_CATALOG` constant). | |
| 517 | + | |
| 518 | +5. **Ghost-style permalink `/blog/{primary_tag}/{slug}/` vs current | |
| 519 | + `/blog/{slug}/`?** | |
| 520 | + Switching costs 9 redirects in `c21_app.ts` and breaks external | |
| 521 | + links. **Recommend keep current.** | |
| 522 | + | |
| 523 | +6. **Typed marketing blocks (`hero`, `feature-card`, `feature-grid`, | |
| 524 | + `stats-row`, `steps-grid`, `use-case-card`, `cta-band`) — port?** | |
| 525 | + tdd.md's `home.md` is text + 1 table + 1 list — none would apply | |
| 526 | + unless we redesign the homepage. **Default: skip.** Saves ~600 | |
| 527 | + LOC across `c31_sxdoc.ts` (~80 lines smaller) + | |
| 528 | + `c32_sxdoc_render.ts` (typed renderers) + | |
| 529 | + `client/blocks/typed/*.ts` (7 files). | |
| 530 | + | |
| 531 | +7. **Ghost Content API compatibility surface | |
| 532 | + (`/ghost/api/content/{posts,pages}/...`) — keep?** | |
| 533 | + `sx-content/src/index.ts:78-115`. No consumers today. **Default: | |
| 534 | + drop.** Saves ~150 LOC. | |
| 535 | + | |
| 536 | +8. **Client-side TS placement — `src/client/`, `client/`, or | |
| 537 | + `public/src/`?** Recommend `src/client/`. Affects bundler paths | |
| 538 | + and Playwright fixture wiring. | |
| 539 | + | |
| 540 | +9. **`client/render.ts` (775) split shape.** Per-block-kind | |
| 541 | + (`render-p.ts`, `render-h.ts`, …, 12 small files) or by sub-system | |
| 542 | + (`render-blocks.ts`, `render-marks.ts`, `render-typed.ts`, | |
| 543 | + 3 medium files). Affects readability vs file count. | |
| 544 | + | |
| 545 | +10. **c32 parser tests — snapshot vs explicit-assertion?** | |
| 546 | + Snapshot (`toMatchSnapshot`) qualifies under the placeholder-test | |
| 547 | + check, but explicit asserts are more readable. Decide before | |
| 548 | + writing. | |
| 549 | + | |
| 550 | +11. **`OPENROUTER_API_KEY` in prod (plan.md open Q3).** | |
| 551 | + Still open. AI ✨ returns 503 with hint when unset | |
| 552 | + (`sx-editor/src/index.ts:367-369`). Acceptable to ship without | |
| 553 | + the key in prod. | |
| 554 | + | |
| 555 | +12. **Keep `marked` post-migration?** | |
| 556 | + `marked` is used during migration (md → html before sxdoc parse) | |
| 557 | + and currently at runtime by `c51_render_layout.ts:8`. After | |
| 558 | + cutover, sxdoc → HTML is the new render path. Decide: keep | |
| 559 | + `marked` as a runtime dep for legacy paths, or vendor a tiny | |
| 560 | + md-to-blocks shim inside the migration script and drop marked | |
| 561 | + entirely. | |
| 562 | + | |
| 563 | +13. **`/admin/preview` rendering path.** Plan.md doesn't address | |
| 564 | + that preview cannot go through `c14_git.commitFile` (debounce | |
| 565 | + too tight). Handler must take in-memory sxdoc and call | |
| 566 | + `c32_sxdoc_render` → `c51_render_theme` directly. Shape the | |
| 567 | + handler accordingly from the start; don't refactor later. | |
sama.profile.toml
+52
−0
| @@ -0,0 +1,52 @@ | ||
| 1 | +# SAMA v2 profile — declares this repo's filename prefixes and how | |
| 2 | +# they map to the four canonical layers (Pure 0 / Core 1 / Adapter 2 / | |
| 3 | +# Entry 3). See https://tdd.md/sama/v2 §2 for the profile mechanism | |
| 4 | +# and https://tdd.md/sama/v2 §1.1 for the canonical layer table. | |
| 5 | +# | |
| 6 | +# Order in each `sublayers` array is the dependency order: later | |
| 7 | +# entries may import earlier entries, never the reverse (§2.2). | |
| 8 | + | |
| 9 | +sama_version = "2.0" | |
| 10 | +profile = "tdd-md" | |
| 11 | + | |
| 12 | +# Layer 0 — Pure. Types, constants, pure registries, pure parsers. | |
| 13 | +# No I/O, no side effects. | |
| 14 | +[layers.0] | |
| 15 | +prefixes = ["c31_"] | |
| 16 | + | |
| 17 | +# Layer 1 — Core. Domain logic and pure render. No network, disk, | |
| 18 | +# clock, or framework. | |
| 19 | +# - c32_ holds pure domain logic (judging math, session HMAC, | |
| 20 | +# anchor extraction, edit-target resolution, the v1 verifier). | |
| 21 | +# - c51_ holds pure HTML render functions (markdown → string, | |
| 22 | +# page chrome, no I/O). | |
| 23 | +# c51 may import c32 (render uses logic); c32 must never import c51. | |
| 24 | +[layers.1] | |
| 25 | +sublayers = [ | |
| 26 | + { name = "logic", prefix = "c32_" }, | |
| 27 | + { name = "render", prefix = "c51_" }, | |
| 28 | +] | |
| 29 | + | |
| 30 | +# Layer 2 — Adapter. The boundary. External input is parsed here. | |
| 31 | +# DB, network, filesystem, framework bindings. | |
| 32 | +# - c13_ holds SQLite primitives (the bun:sqlite Database wrapper). | |
| 33 | +# - c14_ holds HTTP / git / filesystem orchestrators that may compose | |
| 34 | +# c13 primitives (e.g. c14_judge runs git clone + saveRun() to db). | |
| 35 | +# c14 may import c13; c13 must never import c14. | |
| 36 | +[layers.2] | |
| 37 | +sublayers = [ | |
| 38 | + { name = "data", prefix = "c13_" }, | |
| 39 | + { name = "io", prefix = "c14_" }, | |
| 40 | +] | |
| 41 | + | |
| 42 | +# Layer 3 — Entry. Outermost shell: server bootstrap, route table, | |
| 43 | +# handlers. | |
| 44 | +# - c21_ holds HTTP handlers and the Bun.serve route table (c21_app). | |
| 45 | +# - c11_ holds the server bootstrap that mounts the route table. | |
| 46 | +# c11 may import c21 (the bootstrap pulls in the app); c21 must never | |
| 47 | +# import c11. | |
| 48 | +[layers.3] | |
| 49 | +sublayers = [ | |
| 50 | + { name = "handlers", prefix = "c21_" }, | |
| 51 | + { name = "server", prefix = "c11_" }, | |
| 52 | +] | |
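To make the ordering rule in this profile concrete, here is a small sketch of an upward-import check. The prefix table is transcribed by hand from the TOML above; `rankOf` and `isUpwardImport` are hypothetical names and not the verifier's actual API, and the sketch takes no position on imports between files of the same prefix.

```ts
// [layer, position within the layer's sublayers array], per the profile.
const PREFIX_RANK: Record<string, [number, number]> = {
  c31_: [0, 0],
  c32_: [1, 0],
  c51_: [1, 1],
  c13_: [2, 0],
  c14_: [2, 1],
  c21_: [3, 0],
  c11_: [3, 1],
};

function rankOf(fileName: string): [number, number] {
  const rank = PREFIX_RANK[fileName.slice(0, 4)];
  if (rank === undefined) throw new Error(`no profile prefix for ${fileName}`);
  return rank;
}

// True when `importer` reaches up to a later layer or later sublayer,
// which is the direction the profile forbids.
function isUpwardImport(importer: string, imported: string): boolean {
  const [fromLayer, fromSub] = rankOf(importer);
  const [toLayer, toSub] = rankOf(imported);
  return toLayer > fromLayer || (toLayer === fromLayer && toSub > fromSub);
}
```

Under this table a `c51_` file importing a type from `c14_git.ts` is flagged, while the same import from `c31_git_parse.ts` is not.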
src/c13_database.ts
+8
−23
| @@ -1,6 +1,6 @@ | ||
| 1 | 1 | import { Database } from "bun:sqlite"; |
| 2 | -import type { ProjectConfig, TestRunner } from "./c31_project_config.ts"; | |
| 3 | -import type { SxDocument } from "./c31_sxdoc.ts"; | |
| 2 | +import type { ProjectConfig, TestRunner, ProjectRow } from "./c31_project_config.ts"; | |
| 3 | +import type { SxDocument, SxDocumentSummary } from "./c31_sxdoc.ts"; | |
| 4 | 4 | import { SX_DOC_VERSION } from "./c31_sxdoc.ts"; |
| 5 | 5 | |
| 6 | 6 | const DB_PATH = process.env.TDD_DB_PATH ?? ":memory:"; |
| @@ -133,18 +133,9 @@ export const latestRun = (owner: string, repo: string): Verdict | null => { | ||
| 133 | 133 | return JSON.parse(row.verdict_json) as Verdict; |
| 134 | 134 | }; |
| 135 | 135 | |
| 136 | -export interface ProjectRow { | |
| 137 | - id: number; | |
| 138 | - registeredBy: string; | |
| 139 | - repoOwner: string; | |
| 140 | - repoName: string; | |
| 141 | - testRunner: TestRunner; | |
| 142 | - trackedBranches: string[]; | |
| 143 | - displayName: string | null; | |
| 144 | - team: string | null; | |
| 145 | - registeredAt: number; | |
| 146 | - status: "active" | "paused"; | |
| 147 | -} | |
| 136 | +// ProjectRow is now defined in Layer 0 (c31_project_config) alongside | |
| 137 | +// the rest of the project-config types, so c51 render code can | |
| 138 | +// reference it without importing from Layer 2. | |
| 148 | 139 | |
| 149 | 140 | interface ProjectDbRow { |
| 150 | 141 | id: number; |
| @@ -240,15 +231,9 @@ export interface SxDocumentRow { | ||
| 240 | 231 | updatedAt: number; |
| 241 | 232 | } |
| 242 | 233 | |
| 243 | -export interface SxDocumentSummary { | |
| 244 | - id: number; | |
| 245 | - slug: string; | |
| 246 | - type: "page" | "post"; | |
| 247 | - title: string; | |
| 248 | - status: "published" | "draft"; | |
| 249 | - primaryTag: string | null; | |
| 250 | - updatedAt: number; | |
| 251 | -} | |
| 234 | +// SxDocumentSummary is the public summary shape; defined in Layer 0 | |
| 235 | +// (c31_sxdoc) so render code can reference it without crossing the | |
| 236 | +// SAMA v2 import direction. | |
| 252 | 237 | |
| 253 | 238 | interface SxDocumentDbRow { |
| 254 | 239 | id: number; |
src/c14_git.ts
+13
−22
| @@ -26,22 +26,15 @@ import { | ||
| 26 | 26 | |
| 27 | 27 | export const GIT_DIR = process.env.TDD_GIT_DIR ?? "/app/repo"; |
| 28 | 28 | |
| 29 | -export interface GitCommitOk { | |
| 30 | - ok: true; | |
| 31 | - commitSha: string; | |
| 32 | -} | |
| 33 | - | |
| 34 | -export interface GitCommitFailure { | |
| 35 | - ok: false; | |
| 36 | - // "conflict" → ref tip moved under us (someone else committed) | |
| 37 | - // "not_found" → branch doesn't exist | |
| 38 | - // "permission" → fs perms on the bare repo | |
| 39 | - // "other" → anything else (look at .message) | |
| 40 | - kind: "conflict" | "not_found" | "permission" | "other"; | |
| 41 | - message: string; | |
| 42 | -} | |
| 43 | - | |
| 44 | -export type GitCommitOutcome = GitCommitOk | GitCommitFailure; | |
| 29 | +// GitCommitOk / GitCommitFailure / GitCommitOutcome are defined in | |
| 30 | +// Layer 0 (c31_git_parse) per SAMA v2 §1.1. Imported here so the | |
| 31 | +// adapter's typed return signatures match what callers in Layer 1 | |
| 32 | +// also import directly. | |
| 33 | +import type { | |
| 34 | + GitCommitOk, | |
| 35 | + GitCommitFailure, | |
| 36 | + GitCommitOutcome, | |
| 37 | +} from "./c31_git_parse.ts"; | |
| 45 | 38 | |
| 46 | 39 | interface RunOpts { |
| 47 | 40 | stdin?: string; |
| @@ -114,12 +107,10 @@ export const readBlobAtRef = async (ref: string, path: string): Promise<string | | ||
| 114 | 107 | // Returns null when the path doesn't exist at that ref. Each entry |
| 115 | 108 | // keeps the relative name (basename), not the full path — the caller |
| 116 | 109 | // builds full paths from `${path}/${entry.name}`. |
| 117 | -export interface TreeEntry { | |
| 118 | - name: string; // basename, e.g. "skill.md" or "blog" | |
| 119 | - type: "blob" | "tree" | "commit"; | |
| 120 | - sha: string; | |
| 121 | - mode: string; | |
| 122 | -} | |
| 110 | +// TreeEntry is defined in Layer 0 (c31_git_parse) per SAMA v2 §1.1. | |
| 111 | +// Callers import it directly from c31_git_parse, not through this | |
| 112 | +// adapter — that's what keeps the import direction Layer N → Layer M < N. | |
| 113 | +import type { TreeEntry } from "./c31_git_parse.ts"; | |
| 123 | 114 | export const lsTree = async (ref: string, path: string): Promise<TreeEntry[] | null> => { |
| 124 | 115 | // `<ref>:<path>` — git lists what's at that tree. For path="" it's |
| 125 | 116 | // the repo root. |
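For orientation, the Layer 0 side of this move might look roughly like the sketch below. The three type shapes are copied from the definitions removed from c14_git.ts above; `describeOutcome` is a hypothetical helper that does not exist in the repo, included only to show how a Layer 1 caller can narrow on `ok` while importing nothing from the adapter.

```typescript
// Sketch of the shapes a Layer 0 module such as c31_git_parse.ts would hold.
// In the real module these would be exported; exports are omitted here so
// the snippet runs standalone.
interface GitCommitOk {
  ok: true;
  commitSha: string;
}

interface GitCommitFailure {
  ok: false;
  kind: "conflict" | "not_found" | "permission" | "other";
  message: string;
}

type GitCommitOutcome = GitCommitOk | GitCommitFailure;

// Hypothetical helper (an assumption, not repo code): a pure consumer that
// depends only on the Layer 0 types. The `ok` discriminant lets TypeScript
// narrow to the right branch of the union.
const describeOutcome = (o: GitCommitOutcome): string =>
  o.ok ? `committed ${o.commitSha.slice(0, 7)}` : `${o.kind}: ${o.message}`;

console.log(describeOutcome({ ok: true, commitSha: "0123456789abcdef" }));
console.log(describeOutcome({ ok: false, kind: "conflict", message: "ref moved" }));
```

Because the helper needs no runtime import from c14_git.ts, a render module in Layer 1 could use it without creating an upward edge.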
src/c14_judge.test.ts
+69
−0
| @@ -0,0 +1,69 @@ | ||
| 1 | +// Sibling test for c14_judge.ts. The orchestrator itself (judge()) does | |
| 2 | +// git clone + test execution and isn't unit-testable without a real | |
| 3 | +// agent repo; the pure helpers underneath it (applyMode, explainRefactor) | |
| 4 | +// are the structural surface that matters for scoring decisions. Cover | |
| 5 | +// the mode-aware penalty math + the operator-facing explanations here. | |
| 6 | + | |
| 7 | +import { describe, test, expect } from "bun:test"; | |
| 8 | +import { applyMode, explainRefactor, judge } from "./c14_judge.ts"; | |
| 9 | + | |
| 10 | +describe("c14_judge — applyMode (mode-aware penalty math)", () => { | |
| 11 | + test("positive deltas pass through unchanged in every mode", () => { | |
| 12 | + expect(applyMode(10, "strict")).toBe(10); | |
| 13 | + expect(applyMode(10, "pragmatic")).toBe(10); | |
| 14 | + expect(applyMode(10, "learning")).toBe(10); | |
| 15 | + }); | |
| 16 | + | |
| 17 | + test("strict mode keeps the full negative penalty", () => { | |
| 18 | + expect(applyMode(-20, "strict")).toBe(-20); | |
| 19 | + expect(applyMode(-5, "strict")).toBe(-5); | |
| 20 | + }); | |
| 21 | + | |
| 22 | + test("pragmatic mode halves negative deltas (Math.ceil — never below half)", () => { | |
| 23 | + expect(applyMode(-20, "pragmatic")).toBe(-10); | |
| 24 | + expect(applyMode(-10, "pragmatic")).toBe(-5); | |
| 25 | + // -5 / 2 = -2.5 → Math.ceil(-2.5) = -2: the harsher half rounds up | |
| 26 | + // toward zero, which is the documented "softer score" behaviour. | |
| 27 | + expect(applyMode(-5, "pragmatic")).toBe(-2); | |
| 28 | + }); | |
| 29 | + | |
| 30 | + test("learning mode zeroes out every negative delta", () => { | |
| 31 | + expect(applyMode(-20, "learning")).toBe(0); | |
| 32 | + expect(applyMode(-5, "learning")).toBe(0); | |
| 33 | + expect(applyMode(-1, "learning")).toBe(0); | |
| 34 | + }); | |
| 35 | + | |
| 36 | + test("zero delta is neutral in every mode", () => { | |
| 37 | + expect(applyMode(0, "strict")).toBe(0); | |
| 38 | + expect(applyMode(0, "pragmatic")).toBe(0); | |
| 39 | + expect(applyMode(0, "learning")).toBe(0); | |
| 40 | + }); | |
| 41 | +}); | |
| 42 | + | |
| 43 | +describe("c14_judge — explainRefactor", () => { | |
| 44 | + test("passed=true returns the canonical-refactor explanation", () => { | |
| 45 | + const s = explainRefactor(true); | |
| 46 | + expect(s).toContain("stayed green"); | |
| 47 | + expect(s).toMatch(/canonical/i); | |
| 48 | + }); | |
| 49 | + | |
| 50 | + test("passed=false returns guidance to revert or open a new red→green", () => { | |
| 51 | + const s = explainRefactor(false); | |
| 52 | + expect(s).toContain("broke"); | |
| 53 | + expect(s).toMatch(/revert|red→green/); | |
| 54 | + }); | |
| 55 | + | |
| 56 | + test("the two branches return different strings", () => { | |
| 57 | + expect(explainRefactor(true)).not.toBe(explainRefactor(false)); | |
| 58 | + }); | |
| 59 | +}); | |
| 60 | + | |
| 61 | +describe("c14_judge — orchestrator entry point", () => { | |
| 62 | + test("judge is exported as an async function (Promise-returning)", () => { | |
| 63 | + expect(typeof judge).toBe("function"); | |
| 64 | + // The orchestrator does git clone + test execution; covering it | |
| 65 | + // end-to-end needs a real agent repo. A type-level check that the | |
| 66 | + // shape didn't drift is the documented minimum for this layer. | |
| 67 | + expect(judge.length).toBe(2); | |
| 68 | + }); | |
| 69 | +}); | |
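One rounding edge the cases above do not reach is a delta of -1 in pragmatic mode: `Math.ceil(-0.5)` evaluates to negative zero in JavaScript. The sketch below copies the `applyMode` rule from c14_judge.ts so it runs standalone. The `Mode` union is an assumption inferred from the three mode strings used in the tests, and `applyModeNoNegativeZero` is a hypothetical variant, not something the repo defines.

```typescript
// Assumed shape of Mode; the real type lives in c13_database.ts.
type Mode = "strict" | "pragmatic" | "learning";

// Same rule as applyMode in c14_judge.ts, copied for a standalone run.
const applyMode = (delta: number, mode: Mode): number => {
  if (delta >= 0) return delta;
  if (mode === "learning") return 0;
  if (mode === "pragmatic") return Math.ceil(delta / 2);
  return delta;
};

// Hypothetical variant: adding 0 turns -0 into +0 and leaves every other
// number unchanged.
const applyModeNoNegativeZero = (delta: number, mode: Mode): number =>
  applyMode(delta, mode) + 0;

console.log(Object.is(applyMode(-1, "pragmatic"), -0)); // true
console.log(Object.is(applyModeNoNegativeZero(-1, "pragmatic"), 0)); // true
console.log(applyMode(-5, "pragmatic")); // -2
```

The summed `totalScore` is unaffected, since `0 + -0` is `0`. The distinction would only matter to an assertion that compares a single delta with `Object.is` semantics, which is how Jest documents `toBe`; whether `bun:test` behaves identically is an assumption here.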
src/c14_judge.ts
+370
−0
| @@ -0,0 +1,370 @@ | ||
| 1 | +import { mkdtempSync, rmSync } from "fs"; | |
| 2 | +import { join } from "path"; | |
| 3 | +import { tmpdir } from "os"; | |
| 4 | +import { parseCommit, type Phase } from "./c31_commits.ts"; | |
| 5 | +import { saveRun, type Verdict, type StepVerdict, type RefactorVerdict, type Mode } from "./c13_database.ts"; | |
| 6 | +import { loadGame, type Game } from "./c31_games.ts"; | |
| 7 | + | |
| 8 | +type TestRunner = "bun" | "none"; | |
| 9 | + | |
| 10 | +interface TddConfig { | |
| 11 | + mode: Mode; | |
| 12 | + testRunner: TestRunner; | |
| 13 | +} | |
| 14 | + | |
| 15 | +// tdd.config.json from the agent's repo selects the scoring mode and | |
| 16 | +// test runner. Falls back to strict / bun when missing or unparseable. | |
| 17 | +// | |
| 18 | +// { "mode": "pragmatic", "test_runner": "none" } | |
| 19 | +// | |
| 20 | +// test_runner: "none" enables trace-only judging — no checkout, no test | |
| 21 | +// execution. Useful as a CI gate on projects where Bun can't run the | |
| 22 | +// suite (e.g. .NET, Python without bun-compat tests). | |
| 23 | +const readConfig = async (cwd: string): Promise<TddConfig> => { | |
| 24 | + const file = Bun.file(join(cwd, "tdd.config.json")); | |
| 25 | + let mode: Mode = "strict"; | |
| 26 | + let testRunner: TestRunner = "bun"; | |
| 27 | + if (await file.exists()) { | |
| 28 | + try { | |
| 29 | + const cfg = (await file.json()) as { mode?: string; test_runner?: string }; | |
| 30 | + if (cfg.mode === "pragmatic" || cfg.mode === "learning") mode = cfg.mode; | |
| 31 | + if (cfg.test_runner === "none") testRunner = "none"; | |
| 32 | + } catch { | |
| 33 | + // best effort — bad config falls back to defaults | |
| 34 | + } | |
| 35 | + } | |
| 36 | + return { mode, testRunner }; | |
| 37 | +}; | |
| 38 | + | |
| 39 | +// Penalty halving for pragmatic, zeroing for learning. Positive deltas | |
| 40 | +// are unchanged across modes — earned credit is earned credit. | |
| 41 | +export const applyMode = (delta: number, mode: Mode): number => { | |
| 42 | + if (delta >= 0) return delta; | |
| 43 | + if (mode === "learning") return 0; | |
| 44 | + if (mode === "pragmatic") return Math.ceil(delta / 2); | |
| 45 | + return delta; | |
| 46 | +}; | |
| 47 | + | |
| 48 | +// Plain-language summary of a step verdict, written to the agent (not | |
| 49 | +// the human admin). One short paragraph; named intentionally so callers | |
| 50 | +// can see it next to the row in the score table. | |
| 51 | +const explainStep = (params: { | |
| 52 | + status: StepVerdict["status"]; | |
| 53 | + redSha: string | null; | |
| 54 | + greenSha: string | null; | |
| 55 | + hiddenPassed: boolean | null; | |
| 56 | + mode: Mode; | |
| 57 | +}): string => { | |
| 58 | + const { status, hiddenPassed, mode } = params; | |
| 59 | + switch (status) { | |
| 60 | + case "verified": | |
| 61 | + return "Red failed as expected, green passes your tests, and the kata's hidden tests confirm the implementation matches the requirement."; | |
| 62 | + case "discipline-only": | |
| 63 | + return "Red→green discipline holds, but this kata didn't ship hidden tests for the step. Partial credit awarded; full +20 isn't possible without authoritative verification."; | |
| 64 | + case "no-green": | |
| 65 | + return "Red commit landed; the matching green(<step>) commit hasn't been pushed yet. Push your green to lock in the score."; | |
| 66 | + case "red-did-not-fail": | |
| 67 | + return mode === "pragmatic" | |
| 68 | + ? "Combined red+green commit detected. Pragmatic mode allows this — the cycle still counts, just with a softer score than a clean separation." | |
| 69 | + : "Red commit's tests already passed when the step was first introduced — meaning the implementation was added before the test, or the test is tautological. Switch to pragmatic mode if you commit red+green together intentionally."; | |
| 70 | + case "green-did-not-pass": | |
| 71 | + return "Green commit's own tests still fail. The implementation doesn't yet satisfy the test you wrote — fix the impl, or reconsider whether the test reflects the requirement."; | |
| 72 | + case "hidden-tests-failed": | |
| 73 | + return hiddenPassed === false | |
| 74 | + ? "Your tests pass, but the kata's hidden tests don't — this is the classic tautology trap. Tighten your test to mirror the requirement (e.g., assert the actual return value, not just that it runs)." | |
| 75 | + : "Your tests pass, but hidden verification was inconclusive. Re-push to retry."; | |
| 76 | + case "test-deleted": | |
| 77 | + return "Test count dropped between red and green for this step. Once a test exists it must keep existing — refactor it, don't delete it. If the test was wrong, replace it in a separate commit before resuming the cycle."; | |
| 78 | + case "trace-verified": | |
| 79 | + return "Trace-only mode: red→green pair found in the commit log. Tests weren't executed (test_runner: \"none\"). Switch to bun runner for behaviour verification."; | |
| 80 | + case "trace-tests-shrunk": | |
| 81 | + return "Trace-only mode: the green commit's tree has fewer test files than the red commit's tree, which looks like deletion. Merging test files, or renaming them away from a recognised test-file pattern, also lowers the tally."; | |
| 82 | + } | |
| 83 | +}; | |
| 84 | + | |
| 85 | +export const explainRefactor = (passed: boolean): string => | |
| 86 | + passed | |
| 87 | + ? "Tests stayed green through the refactor — structural change without behavior change, the canonical refactor." | |
| 88 | + : "Refactor commit broke at least one test. Either revert the refactor or write a new red→green to capture the changed behavior."; | |
| 89 | + | |
| 90 | +const FORGEJO_INTERNAL = process.env.FORGEJO_URL ?? "https://git.tdd.md"; | |
| 91 | +const TEST_TIMEOUT_MS = 8000; | |
| 92 | + | |
| 93 | +// Sandboxed env passed to git and bun subprocesses. Strips every secret | |
| 94 | +// from the parent process — agent code never sees FORGEJO_ADMIN_TOKEN, | |
| 95 | +// GITHUB_CLIENT_SECRET, or SESSION_SECRET. PATH is fixed; HOME and TMPDIR | |
| 96 | +// stay inside the per-run temp dir so dotfile writes can't escape. | |
| 97 | +const sandboxEnv = (cwd: string): Record<string, string> => ({ | |
| 98 | + PATH: "/usr/local/bin:/usr/bin:/bin", | |
| 99 | + HOME: cwd, | |
| 100 | + TMPDIR: cwd, | |
| 101 | + NODE_ENV: "test", | |
| 102 | +}); | |
| 103 | + | |
| 104 | +const runProc = async ( | |
| 105 | + cmd: string[], | |
| 106 | + cwd: string, | |
| 107 | + timeoutMs: number, | |
| 108 | +): Promise<{ stdout: string; stderr: string; exitCode: number; timedOut: boolean }> => { | |
| 109 | + const proc = Bun.spawn(cmd, { | |
| 110 | + cwd, | |
| 111 | + stdout: "pipe", | |
| 112 | + stderr: "pipe", | |
| 113 | + env: sandboxEnv(cwd), | |
| 114 | + }); | |
| 115 | + let timedOut = false; | |
| 116 | + const timer = setTimeout(() => { | |
| 117 | + timedOut = true; | |
| 118 | + proc.kill("SIGKILL"); | |
| 119 | + }, timeoutMs); | |
| 120 | + const exitCode = await proc.exited; | |
| 121 | + clearTimeout(timer); | |
| 122 | + const stdout = await new Response(proc.stdout).text(); | |
| 123 | + const stderr = await new Response(proc.stderr).text(); | |
| 124 | + return { stdout: stdout.trim(), stderr: stderr.trim(), exitCode, timedOut }; | |
| 125 | +}; | |
| 126 | + | |
| 127 | +const runTests = async (cwd: string): Promise<boolean> => { | |
| 128 | + const r = await runProc(["bun", "test"], cwd, TEST_TIMEOUT_MS); | |
| 129 | + // Bun test exits 0 only when all tests pass. | |
| 130 | + return !r.timedOut && r.exitCode === 0; | |
| 131 | +}; | |
| 132 | + | |
| 133 | +// Language-agnostic test-file counter for trace-only mode. Uses git | |
| 134 | +// ls-tree at the given sha so we don't have to checkout the working | |
| 135 | +// tree. Matches conventional test-file naming across ecosystems: | |
| 136 | +// foo.test.ts, foo.spec.ts, FooTests.cs, FooTest.java, test_foo.py, | |
| 137 | +// foo_test.go, FooSpec.scala, foo_spec.rb. | |
| 138 | +const countTestFiles = async (cwd: string, sha: string): Promise<number> => { | |
| 139 | + const r = await runProc(["git", "ls-tree", "-r", "--name-only", sha], cwd, 5000); | |
| 140 | + if (r.exitCode !== 0) return 0; | |
| 141 | + const re = /(?:^|\/)(?:[^/]*\.(?:test|spec)\.[a-z]+|[Tt]ests?\/[^/]+|test_[^/]+|[^/]+_(?:test|spec)\.[a-z]+|[^/]+[Tt]ests?\.cs|[^/]+[Tt]est\.java|[^/]+Spec\.scala)$/; | |
| 142 | + let count = 0; | |
| 143 | + for (const line of r.stdout.split("\n")) { | |
| 144 | + if (re.test(line)) count++; | |
| 145 | + } | |
| 146 | + return count; | |
| 147 | +}; | |
| 148 | + | |
| 149 | +// Count `test(` / `it(` calls in tracked *.test.ts files. Used to detect | |
| 150 | +// when an agent deletes tests between red and green to make a regression | |
| 151 | +// "pass" — a cardinal TDD sin per the kata spec. | |
| 152 | +const countTests = async (cwd: string): Promise<number> => { | |
| 153 | + const r = await runProc(["git", "ls-files", "*.test.ts"], cwd, 5000); | |
| 154 | + if (r.exitCode !== 0) return 0; | |
| 155 | + const files = r.stdout.split("\n").filter((f) => f && !f.includes("__hidden_")); | |
| 156 | + let count = 0; | |
| 157 | + for (const f of files) { | |
| 158 | + const content = await Bun.file(join(cwd, f)) | |
| 159 | + .text() | |
| 160 | + .catch(() => ""); | |
| 161 | + const matches = content.match(/\b(?:test|it)\s*\(/g); | |
| 162 | + if (matches) count += matches.length; | |
| 163 | + } | |
| 164 | + return count; | |
| 165 | +}; | |
| 166 | + | |
| 167 | +// Runs the kata's authoritative tests against the agent's implementation | |
| 168 | +// at whatever commit is currently checked out. Copies the hidden test | |
| 169 | +// file into the working tree under a __hidden__ prefix so it doesn't | |
| 170 | +// collide with the agent's filenames, runs only that file, then deletes | |
| 171 | +// it. Returns null if the kata doesn't have hidden tests for this step. | |
| 172 | +const runHiddenTests = async (cwd: string, spec: Game, stepId: string): Promise<boolean | null> => { | |
| 173 | + const stepDef = spec.steps.find((s) => s.id === stepId); | |
| 174 | + if (!stepDef) return null; | |
| 175 | + const sourcePath = `./content/games/${spec.id}/${stepDef.hiddenTestFile}`; | |
| 176 | + const sourceFile = Bun.file(sourcePath); | |
| 177 | + if (!(await sourceFile.exists())) return null; | |
| 178 | + const content = await sourceFile.text(); | |
| 179 | + const targetName = `__hidden_${stepId}__.test.ts`; | |
| 180 | + const targetPath = join(cwd, targetName); | |
| 181 | + await Bun.write(targetPath, content); | |
| 182 | + try { | |
| 183 | + const r = await runProc(["bun", "test", targetName], cwd, TEST_TIMEOUT_MS); | |
| 184 | + return !r.timedOut && r.exitCode === 0; | |
| 185 | + } finally { | |
| 186 | + try { | |
| 187 | + rmSync(targetPath, { force: true }); | |
| 188 | + } catch { | |
| 189 | + // best effort | |
| 190 | + } | |
| 191 | + } | |
| 192 | +}; | |
| 193 | + | |
| 194 | +interface CommitInfo { | |
| 195 | + sha: string; | |
| 196 | + phase: Phase; | |
| 197 | + step: string | null; | |
| 198 | +} | |
| 199 | + | |
| 200 | +const readCommits = async (cwd: string): Promise<CommitInfo[]> => { | |
| 201 | + const r = await runProc(["git", "log", "--reverse", "--pretty=format:%H%x1f%B%x1e"], cwd, 10000); | |
| 202 | + if (r.exitCode !== 0) return []; | |
| 203 | + const out: CommitInfo[] = []; | |
| 204 | + for (const block of r.stdout.split("\x1e")) { | |
| 205 | + const t = block.trim(); | |
| 206 | + if (!t) continue; | |
| 207 | + const [sha, message = ""] = t.split("\x1f"); | |
| 208 | + if (!sha) continue; | |
| 209 | + const p = parseCommit(message); | |
| 210 | + out.push({ sha, phase: p.phase, step: p.step }); | |
| 211 | + } | |
| 212 | + return out; | |
| 213 | +}; | |
| 214 | + | |
| 215 | +export const judge = async (owner: string, repo: string): Promise<Verdict> => { | |
| 216 | + const cwd = mkdtempSync(join(tmpdir(), `judge-${owner}-${repo}-`)); | |
| 217 | + try { | |
| 218 | + // Agent repos default to private. Authenticate via admin token in | |
| 219 | + // an http.extraheader so the token isn't persisted in the cloned | |
| 220 | + // repo's config (extraheader applies to the clone request only). | |
| 221 | + const cloneUrl = `${FORGEJO_INTERNAL}/${owner}/${repo}.git`; | |
| 222 | + const adminToken = process.env.FORGEJO_ADMIN_TOKEN; | |
| 223 | + const gitArgs = adminToken | |
| 224 | + ? ["-c", `http.extraheader=Authorization: token ${adminToken}`, "clone", "--quiet", cloneUrl, "."] | |
| 225 | + : ["clone", "--quiet", cloneUrl, "."]; | |
| 226 | + const cloneR = await runProc(["git", ...gitArgs], cwd, 30000); | |
| 227 | + if (cloneR.exitCode !== 0) { | |
| 228 | + throw new Error(`clone failed: ${cloneR.stderr || cloneR.stdout}`); | |
| 229 | + } | |
| 230 | + | |
| 231 | + const commits = await readCommits(cwd); | |
| 232 | + const headR = await runProc(["git", "rev-parse", "HEAD"], cwd, 5000); | |
| 233 | + const headSha = headR.stdout; | |
| 234 | + | |
| 235 | + // First red per step + first green-after-red per step (chronological). | |
| 236 | + const stepRed = new Map<string, string>(); | |
| 237 | + const stepGreen = new Map<string, string>(); | |
| 238 | + for (const c of commits) { | |
| 239 | + if (!c.step) continue; | |
| 240 | + if (c.phase === "red" && !stepRed.has(c.step)) { | |
| 241 | + stepRed.set(c.step, c.sha); | |
| 242 | + } else if (c.phase === "green" && stepRed.has(c.step) && !stepGreen.has(c.step)) { | |
| 243 | + stepGreen.set(c.step, c.sha); | |
| 244 | + } | |
| 245 | + } | |
| 246 | + | |
| 247 | + // Read the agent's mode + runner preferences from tdd.config.json. | |
| 248 | + const { mode, testRunner } = await readConfig(cwd); | |
| 249 | + | |
| 250 | + // Load the kata's authoritative spec — used to fetch hidden tests | |
| 251 | + // per step. Repos that don't match a known kata get scored on red→green | |
| 252 | + // discipline only (no hidden-test verification). | |
| 253 | + let spec: Game | null = null; | |
| 254 | + try { | |
| 255 | + spec = await loadGame(repo); | |
| 256 | + } catch { | |
| 257 | + spec = null; | |
| 258 | + } | |
| 259 | + | |
| 260 | + const steps: StepVerdict[] = []; | |
| 261 | + for (const [stepId, redSha] of stepRed) { | |
| 262 | + const greenSha = stepGreen.get(stepId) ?? null; | |
| 263 | + | |
| 264 | + if (testRunner === "none") { | |
| 265 | + // Trace-only path: don't checkout, don't run anything. Score | |
| 266 | + // purely from the commit log + a language-agnostic test-file | |
| 267 | + // count via `git ls-tree`. Useful for non-Bun projects. | |
| 268 | + const redFiles = await countTestFiles(cwd, redSha); | |
| 269 | + const greenFiles = greenSha ? await countTestFiles(cwd, greenSha) : redFiles; | |
| 270 | + const filesShrank = greenSha !== null && greenFiles < redFiles; | |
| 271 | + | |
| 272 | + let status: StepVerdict["status"]; | |
| 273 | + let baseDelta = 0; | |
| 274 | + if (greenSha === null) { | |
| 275 | + status = "no-green"; | |
| 276 | + } else if (filesShrank) { | |
| 277 | + status = "trace-tests-shrunk"; | |
| 278 | + baseDelta = -10; | |
| 279 | + } else { | |
| 280 | + status = "trace-verified"; | |
| 281 | + baseDelta = 10; | |
| 282 | + } | |
| 283 | + const scoreDelta = applyMode(baseDelta, mode); | |
| 284 | + const explanation = explainStep({ status, redSha, greenSha, hiddenPassed: null, mode }); | |
| 285 | + steps.push({ | |
| 286 | + stepId, redSha, greenSha, | |
| 287 | + redFailed: null, greenPassed: null, hiddenPassed: null, | |
| 288 | + status, scoreDelta, explanation, | |
| 289 | + }); | |
| 290 | + continue; | |
| 291 | + } | |
| 292 | + | |
| 293 | + await runProc(["git", "checkout", "--quiet", redSha], cwd, 5000); | |
| 294 | + const redTestCount = await countTests(cwd); | |
| 295 | + const redPassed = await runTests(cwd); | |
| 296 | + const redFailed = !redPassed; | |
| 297 | + let greenPassed: boolean | null = null; | |
| 298 | + let hiddenPassed: boolean | null = null; | |
| 299 | + let testsDeleted = false; | |
| 300 | + if (greenSha) { | |
| 301 | + await runProc(["git", "checkout", "--quiet", greenSha], cwd, 5000); | |
| 302 | + const greenTestCount = await countTests(cwd); | |
| 303 | + testsDeleted = greenTestCount < redTestCount; | |
| 304 | + greenPassed = await runTests(cwd); | |
| 305 | + if (greenPassed && spec && !testsDeleted) { | |
| 306 | + hiddenPassed = await runHiddenTests(cwd, spec, stepId); | |
| 307 | + } | |
| 308 | + } | |
| 309 | + | |
| 310 | + let status: StepVerdict["status"]; | |
| 311 | + let baseDelta = 0; | |
| 312 | + if (greenSha === null) { | |
| 313 | + status = "no-green"; | |
| 314 | + } else if (testsDeleted) { | |
| 315 | + status = "test-deleted"; | |
| 316 | + baseDelta = -20; | |
| 317 | + } else if (!redFailed) { | |
| 318 | + status = "red-did-not-fail"; | |
| 319 | + baseDelta = -5; | |
| 320 | + } else if (greenPassed === false) { | |
| 321 | + status = "green-did-not-pass"; | |
| 322 | + baseDelta = -5; | |
| 323 | + } else if (hiddenPassed === false) { | |
| 324 | + status = "hidden-tests-failed"; | |
| 325 | + baseDelta = 0; | |
| 326 | + } else if (hiddenPassed === true) { | |
| 327 | + status = "verified"; | |
| 328 | + baseDelta = 20; | |
| 329 | + } else { | |
| 330 | + status = "discipline-only"; | |
| 331 | + baseDelta = 5; | |
| 332 | + } | |
| 333 | + const scoreDelta = applyMode(baseDelta, mode); | |
| 334 | + const explanation = explainStep({ status, redSha, greenSha, hiddenPassed, mode }); | |
| 335 | + steps.push({ stepId, redSha, greenSha, redFailed, greenPassed, hiddenPassed, status, scoreDelta, explanation }); | |
| 336 | + } | |
| 337 | + | |
| 338 | + // Refactor commits aren't tied to red→green pairs: the spec rewards | |
| 339 | + // any refactor that keeps the existing tests green. A broken refactor | |
| 340 | + // (tests fail at the refactor commit) costs the same -5 as a green whose | |
| 341 | + // own tests fail: discipline matters even outside red→green pairs. | |
| 342 | + const refactors: RefactorVerdict[] = []; | |
| 343 | + for (const c of commits) { | |
| 344 | + if (c.phase !== "refactor") continue; | |
| 345 | + await runProc(["git", "checkout", "--quiet", c.sha], cwd, 5000); | |
| 346 | + const passed = await runTests(cwd); | |
| 347 | + const baseDelta = passed ? 5 : -5; | |
| 348 | + refactors.push({ | |
| 349 | + sha: c.sha, | |
| 350 | + stepId: c.step, | |
| 351 | + testsPassed: passed, | |
| 352 | + scoreDelta: applyMode(baseDelta, mode), | |
| 353 | + explanation: explainRefactor(passed), | |
| 354 | + }); | |
| 355 | + } | |
| 356 | + | |
| 357 | + const totalScore = | |
| 358 | + steps.reduce((a, s) => a + s.scoreDelta, 0) + | |
| 359 | + refactors.reduce((a, r) => a + r.scoreDelta, 0); | |
| 360 | + const verdict: Verdict = { headSha, mode, steps, refactors, totalScore, judgedAt: Date.now() }; | |
| 361 | + saveRun(owner, repo, verdict); | |
| 362 | + return verdict; | |
| 363 | + } finally { | |
| 364 | + try { | |
| 365 | + rmSync(cwd, { recursive: true, force: true }); | |
| 366 | + } catch { | |
| 367 | + // best effort cleanup | |
| 368 | + } | |
| 369 | + } | |
| 370 | +}; | |
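The red/green pairing that `judge()` performs inline (first red per step, then the first green that follows it) can be sketched as a pure function. `pairRedGreen` is a hypothetical name, not a helper in the repo, and the `Phase` union is an assumption pieced together from the phase strings that appear in this diff; the real type is imported from c31_commits.ts.

```typescript
// Assumed Phase union; the real one lives in c31_commits.ts.
type Phase = "red" | "green" | "refactor" | "init" | "untagged";

interface CommitInfo {
  sha: string;
  phase: Phase;
  step: string | null;
}

interface StepPair {
  redSha: string;
  greenSha: string | null;
}

// Hypothetical pure helper: same selection rule as the stepRed / stepGreen
// loop in judge(). Commits are expected oldest first, as produced by
// `git log --reverse`.
const pairRedGreen = (commits: CommitInfo[]): Map<string, StepPair> => {
  const pairs = new Map<string, StepPair>();
  for (const c of commits) {
    if (!c.step) continue;
    const pair = pairs.get(c.step);
    if (c.phase === "red" && !pair) {
      // First red for this step opens the pair.
      pairs.set(c.step, { redSha: c.sha, greenSha: null });
    } else if (c.phase === "green" && pair && pair.greenSha === null) {
      // First green after that red closes it; later greens are ignored.
      pair.greenSha = c.sha;
    }
  }
  return pairs;
};

const log: CommitInfo[] = [
  { sha: "a1", phase: "green", step: "s1" }, // green before any red: ignored
  { sha: "b2", phase: "red", step: "s1" },
  { sha: "c3", phase: "red", step: "s1" }, // later red: ignored
  { sha: "d4", phase: "green", step: "s1" },
  { sha: "e5", phase: "green", step: "s1" }, // later green: ignored
  { sha: "f6", phase: "red", step: "s2" }, // no green yet
];
const pairs = pairRedGreen(log);
console.log(pairs.get("s1")); // redSha "b2", greenSha "d4"
console.log(pairs.get("s2")); // redSha "f6", greenSha null
```

Keeping this step free of git and filesystem calls is what would make it unit-testable without a cloned agent repo, unlike the orchestrator as a whole.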
src/c14_real_reports.test.ts
+101
−0
| @@ -0,0 +1,101 @@ | ||
| 1 | +// Sibling test for c14_real_reports.ts. buildLiveReports itself fans out | |
| 2 | +// to fetchRepoCommits (network) so its end-to-end shape is covered by | |
| 3 | +// the live /reports/live route. The pure helpers underneath — agent | |
| 4 | +// attribution from commit messages, and the 30-day daily sparkline — | |
| 5 | +// are unit-testable here. | |
| 6 | + | |
| 7 | +import { describe, test, expect } from "bun:test"; | |
| 8 | +import { | |
| 9 | + detectAgent, | |
| 10 | + buildTrend, | |
| 11 | + buildLiveReports, | |
| 12 | +} from "./c14_real_reports.ts"; | |
| 13 | +import type { GithubCommit } from "./c14_github.ts"; | |
| 14 | + | |
| 15 | +const mkCommit = (date: string, message = ""): GithubCommit => ({ | |
| 16 | + sha: "0".repeat(40), | |
| 17 | + commit: { | |
| 18 | + message, | |
| 19 | + author: { name: "test", email: "[email protected]", date }, | |
| 20 | + committer: { name: "test", email: "[email protected]", date }, | |
| 21 | + }, | |
| 22 | + author: null, | |
| 23 | + committer: null, | |
| 24 | +} as unknown as GithubCommit); | |
| 25 | + | |
| 26 | +describe("c14_real_reports — detectAgent", () => { | |
| 27 | + test("recognises a Claude Code commit via Co-Authored-By: Claude", () => { | |
| 28 | + expect(detectAgent("Add feature\n\nCo-Authored-By: Claude <noreply>")).toBe("claude-code"); | |
| 29 | + }); | |
| 30 | + | |
| 31 | + test("recognises a Cursor commit", () => { | |
| 32 | + expect(detectAgent("Fix bug\n\nCo-Authored-By: Cursor <[email protected]>")).toBe("cursor"); | |
| 33 | + }); | |
| 34 | + | |
| 35 | + test("recognises an Aider commit", () => { | |
| 36 | + expect(detectAgent("Refactor x\n\nCo-Authored-By: aider")).toBe("aider"); | |
| 37 | + }); | |
| 38 | + | |
| 39 | + test("returns unknown when no recognised footer is present", () => { | |
| 40 | + expect(detectAgent("Just a commit")).toBe("unknown"); | |
| 41 | + expect(detectAgent("")).toBe("unknown"); | |
| 42 | + }); | |
| 43 | + | |
| 44 | + test("the regex is case-insensitive on the agent token", () => { | |
| 45 | + expect(detectAgent("Co-Authored-By: CLAUDE")).toBe("claude-code"); | |
| 46 | + expect(detectAgent("co-authored-by: CURSOR")).toBe("cursor"); | |
| 47 | + }); | |
| 48 | +}); | |
| 49 | + | |
| 50 | +describe("c14_real_reports — buildTrend (30-day daily sparkline)", () => { | |
| 51 | + // Use today (UTC) as the anchor — the function compares against UTC | |
| 52 | + // midnight, so we need ISO strings that fall on the right days. | |
| 53 | + const today = new Date(); | |
| 54 | + today.setUTCHours(0, 0, 0, 0); | |
| 55 | + const iso = (daysAgo: number): string => { | |
| 56 | + const d = new Date(today.getTime() - daysAgo * 24 * 60 * 60 * 1000); | |
| 57 | + return d.toISOString(); | |
| 58 | + }; | |
| 59 | + | |
| 60 | + test("returns an array of `days` length", () => { | |
| 61 | + expect(buildTrend([], 30)).toHaveLength(30); | |
| 62 | + expect(buildTrend([], 7)).toHaveLength(7); | |
| 63 | + }); | |
| 64 | + | |
| 65 | + test("empty input flat-lines at zero", () => { | |
| 66 | + const trend = buildTrend([], 7); | |
| 67 | + expect(trend.every((n) => n === 0)).toBe(true); | |
| 68 | + }); | |
| 69 | + | |
| 70 | + test("a single commit today increments the last bucket", () => { | |
| 71 | + const trend = buildTrend([mkCommit(iso(0))], 7); | |
| 72 | + expect(trend[trend.length - 1]).toBe(1); | |
| 73 | + expect(trend.slice(0, -1).every((n) => n === 0)).toBe(true); | |
| 74 | + }); | |
| 75 | + | |
| 76 | + test("multiple commits on the same day stack in the same bucket", () => { | |
| 77 | + const trend = buildTrend([mkCommit(iso(0)), mkCommit(iso(0)), mkCommit(iso(0))], 7); | |
| 78 | + expect(trend[trend.length - 1]).toBe(3); | |
| 79 | + }); | |
| 80 | + | |
| 81 | + test("commits older than the window are dropped", () => { | |
| 82 | + const trend = buildTrend([mkCommit(iso(99))], 7); | |
| 83 | + expect(trend.every((n) => n === 0)).toBe(true); | |
| 84 | + }); | |
| 85 | + | |
| 86 | + test("a commit `daysAgo` lands at index `days - 1 - daysAgo`", () => { | |
| 87 | + const trend = buildTrend([mkCommit(iso(2))], 7); | |
| 88 | + // index 6 = today, 5 = yesterday, 4 = 2 days ago | |
| 89 | + expect(trend[4]).toBe(1); | |
| 90 | + }); | |
| 91 | +}); | |
| 92 | + | |
| 93 | +describe("c14_real_reports — orchestrator entry point", () => { | |
| 94 | + test("buildLiveReports is exported as an async function", () => { | |
| 95 | + expect(typeof buildLiveReports).toBe("function"); | |
| 96 | + // End-to-end coverage lives on /reports/live; this is the structural | |
| 97 | + // smoke that the export shape didn't drift. `.length` counts only | |
| 98 | + // non-default params (owner, repo) — perPage carries a default. | |
| 99 | + expect(buildLiveReports.length).toBe(2); | |
| 100 | + }); | |
| 101 | +}); | |
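The buildTrend cases above anchor on the wall clock, so a run that straddles UTC midnight could in principle see the test's `today` and the function's `today` disagree. A minimal sketch of the same bucketing with the clock passed in is shown below. `buildTrendAt` and its parameters are hypothetical and not part of c14_real_reports.ts, and it takes bare ISO date strings instead of `GithubCommit` objects to stay self-contained.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical variant of buildTrend: same UTC-midnight bucketing, but the
// caller supplies `now`, so results are deterministic.
const buildTrendAt = (isoDates: string[], days: number, now: Date): number[] => {
  const out = new Array<number>(days).fill(0);
  const today = new Date(now.getTime());
  today.setUTCHours(0, 0, 0, 0);
  for (const iso of isoDates) {
    const d = new Date(iso);
    d.setUTCHours(0, 0, 0, 0);
    const ageDays = Math.floor((today.getTime() - d.getTime()) / DAY_MS);
    // Drop future-dated commits and anything older than the window.
    if (ageDays < 0 || ageDays >= days) continue;
    const idx = days - 1 - ageDays; // last index is today
    out[idx] = (out[idx] ?? 0) + 1;
  }
  return out;
};

const now = new Date("2024-03-10T15:30:00Z");
const trend = buildTrendAt(
  [
    "2024-03-10T01:00:00Z", // today
    "2024-03-10T23:59:59Z", // still today in UTC
    "2024-03-08T12:00:00Z", // two days ago
    "2024-01-01T00:00:00Z", // outside the 7-day window
  ],
  7,
  now,
);
console.log(trend); // [0, 0, 0, 0, 1, 0, 2]
```

Whether the added parameter is worth it is a design call: the existing tests are very unlikely to hit the midnight boundary in practice, so this is offered as an option, not as a required change.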
src/c14_real_reports.ts
+170
−0
| @@ -0,0 +1,170 @@ | ||
| 1 | +// c14 (Layer 2 adapter): aggregate real GitHub commit history into the same | |
| 2 | +// AgentReport / RecentFlagged shape that c51_render_reports renders. | |
| 3 | +// The helpers (detectAgent, buildTrend) are pure; buildLiveReports is not, | |
| 4 | +// because it calls c14_github.fetchRepoCommits, which does the network I/O. | |
| 5 | +// | |
| 6 | +// Attribution: Co-Authored-By footers are the agent-attribution channel | |
| 7 | +// the existing tdd.md commit history already uses. Anything without a | |
| 8 | +// recognised footer is bucketed as "unknown" and reported separately — | |
| 9 | +// it's still useful for volume context. | |
| 10 | + | |
| 11 | +import { parseCommit } from "./c31_commits.ts"; | |
| 12 | +import { fetchRepoCommits, type GithubCommit } from "./c14_github.ts"; | |
| 13 | +import type { | |
| 14 | + AgentReport, | |
| 15 | + FailureSlice, | |
| 16 | + RecentFlagged, | |
| 17 | +} from "./c31_reports_demo.ts"; | |
| 18 | + | |
| 19 | +type LiveAgentSlug = AgentReport["slug"] | "unknown"; | |
| 20 | + | |
| 21 | +export const detectAgent = (msg: string): LiveAgentSlug => { | |
| 22 | + if (/Co-Authored-By:.*Claude/i.test(msg)) return "claude-code"; | |
| 23 | + if (/Co-Authored-By:.*Cursor/i.test(msg)) return "cursor"; | |
| 24 | + if (/Co-Authored-By:.*Aider/i.test(msg)) return "aider"; | |
| 25 | + return "unknown"; | |
| 26 | +}; | |
| 27 | + | |
| 28 | +const AGENT_NAMES: Record<AgentReport["slug"], string> = { | |
| 29 | + "claude-code": "Claude Code", | |
| 30 | + cursor: "Cursor", | |
| 31 | + aider: "Aider", | |
| 32 | +}; | |
| 33 | + | |
| 34 | +// 30-day daily commit-count series, oldest → newest. When there are no | |
| 35 | +// commits in a day, that day's value is 0 — the sparkline still renders | |
| 36 | +// but flat-lines, which honestly reflects the data. | |
| 37 | +export const buildTrend = (commits: GithubCommit[], days = 30): number[] => { | |
| 38 | + const out = new Array<number>(days).fill(0); | |
| 39 | + const today = new Date(); | |
| 40 | + today.setUTCHours(0, 0, 0, 0); | |
| 41 | + for (const c of commits) { | |
| 42 | + const d = new Date(c.commit.author.date); | |
| 43 | + d.setUTCHours(0, 0, 0, 0); | |
| 44 | + const ageDays = Math.floor((today.getTime() - d.getTime()) / (24 * 60 * 60 * 1000)); | |
| 45 | + if (ageDays < 0 || ageDays >= days) continue; | |
| 46 | + const idx = days - 1 - ageDays; | |
| 47 | + const cur = out[idx] ?? 0; | |
| 48 | + out[idx] = cur + 1; | |
| 49 | + } | |
| 50 | + return out; | |
| 51 | +}; | |
| 52 | + | |
| 53 | +const buildAgentReport = ( | |
| 54 | + slug: AgentReport["slug"], | |
| 55 | + agentCommits: GithubCommit[], | |
| 56 | + repoSlug: string, | |
| 57 | +): AgentReport => { | |
| 58 | + const tagged = agentCommits.filter((c) => { | |
| 59 | + const phase = parseCommit(c.commit.message).phase; | |
| 60 | + return phase === "red" || phase === "green" || phase === "refactor"; | |
| 61 | + }); | |
| 62 | + const phaseCoveragePct = agentCommits.length === 0 | |
| 63 | + ? 0 | |
| 64 | + : Math.round((tagged.length / agentCommits.length) * 100); | |
| 65 | + | |
| 66 | + // Score is a proxy: phase-coverage is the only structural signal we | |
| 67 | + // can compute without running the test suite. When coverage is 0 the | |
| 68 | + // agent isn't attempting TDD, so the score is honestly low. | |
| 69 | + const score = phaseCoveragePct; | |
| 70 | + | |
| 71 | + // Failure mix collapses to two slices for live data — phase-tagged vs | |
| 72 | + // not. Fine-grained failure modes (red-did-not-fail, test-deleted, etc) | |
| 73 | + // need the runner sliver before they're computable. | |
| 74 | + const failureMix: FailureSlice[] = [ | |
| 75 | + { label: "phase-tagged", pct: phaseCoveragePct, tone: "green" }, | |
| 76 | + { label: "no phase tag", pct: 100 - phaseCoveragePct, tone: "muted" }, | |
| 77 | + ]; | |
| 78 | + | |
| 79 | + const recent: RecentFlagged[] = agentCommits | |
| 80 | + .slice(0, 5) | |
| 81 | + .map((c) => { | |
| 82 | + const parsed = parseCommit(c.commit.message); | |
| 83 | + const phase = parsed.phase === "red" || parsed.phase === "green" || parsed.phase === "refactor" | |
| 84 | + ? parsed.phase | |
| 85 | + : "green"; | |
| 86 | + const failure = parsed.phase === "untagged" || parsed.phase === "init" | |
| 87 | + ? "no phase tag" | |
| 88 | + : `${parsed.phase} (live judge not yet wired)`; | |
| 89 | + return { | |
| 90 | + date: c.commit.author.date.slice(0, 10), | |
| 91 | + repo: repoSlug, | |
| 92 | + sha: c.sha.slice(0, 7), | |
| 93 | + phase, | |
| 94 | + failure, | |
| 95 | + pts: 0, | |
| 96 | + }; | |
| 97 | + }); | |
| 98 | + | |
| 99 | + const topIssueLabel = phaseCoveragePct === 100 ? "no current issues" : "no phase tag"; | |
| 100 | + const topIssuePct = 100 - phaseCoveragePct; | |
| 101 | + | |
| 102 | + return { | |
| 103 | + slug, | |
| 104 | + name: AGENT_NAMES[slug], | |
| 105 | + score, | |
| 106 | + delta: 0, | |
| 107 | + commits: agentCommits.length, | |
| 108 | + phaseCoveragePct, | |
| 109 | + streak: 0, | |
| 110 | + streakBroken: false, | |
| 111 | + topIssueLabel, | |
| 112 | + topIssuePct, | |
| 113 | + failureMix, | |
| 114 | + trend: buildTrend(agentCommits), | |
| 115 | + recent, | |
| 116 | + }; | |
| 117 | +}; | |
| 118 | + | |
| 119 | +export interface LiveReports { | |
| 120 | + reports: AgentReport[]; | |
| 121 | + unknownCount: number; | |
| 122 | + totalCommits: number; | |
| 123 | + earliest: string | null; | |
| 124 | + latest: string | null; | |
| 125 | + fetchedAt: number; | |
| 126 | +} | |
| 127 | + | |
| 128 | +export const buildLiveReports = async ( | |
| 129 | + repoOwner: string, | |
| 130 | + repoName: string, | |
| 131 | + perPage = 100, | |
| 132 | +): Promise<LiveReports> => { | |
| 133 | + const commits = await fetchRepoCommits(repoOwner, repoName, perPage); | |
| 134 | + const repoSlug = `${repoOwner}/${repoName}`; | |
| 135 | + const byAgent = new Map<AgentReport["slug"], GithubCommit[]>(); | |
| 136 | + let unknownCount = 0; | |
| 137 | + | |
| 138 | + for (const c of commits) { | |
| 139 | + const a = detectAgent(c.commit.message); | |
| 140 | + if (a === "unknown") { | |
| 141 | + unknownCount++; | |
| 142 | + continue; | |
| 143 | + } | |
| 144 | + const arr = byAgent.get(a) ?? []; | |
| 145 | + arr.push(c); | |
| 146 | + byAgent.set(a, arr); | |
| 147 | + } | |
| 148 | + | |
| 149 | + const order: AgentReport["slug"][] = ["claude-code", "cursor", "aider"]; | |
| 150 | + const reports = order | |
| 151 | + .map((slug) => { | |
| 152 | + const list = byAgent.get(slug); | |
| 153 | + if (!list || list.length === 0) return null; | |
| 154 | + return buildAgentReport(slug, list, repoSlug); | |
| 155 | + }) | |
| 156 | + .filter((r): r is AgentReport => r !== null); | |
| 157 | + | |
| 158 | + const dates = commits.map((c) => c.commit.author.date).sort(); | |
| 159 | + const earliest = dates[0] ?? null; | |
| 160 | + const latest = dates[dates.length - 1] ?? null; | |
| 161 | + | |
| 162 | + return { | |
| 163 | + reports, | |
| 164 | + unknownCount, | |
| 165 | + totalCommits: commits.length, | |
| 166 | + earliest, | |
| 167 | + latest, | |
| 168 | + fetchedAt: Date.now(), | |
| 169 | + }; | |
| 170 | +}; | |
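The day-bucketing in `buildTrend` reads the clock internally, so its output depends on when it runs. A minimal sketch of the same bucketing with the clock passed in follows. `bucketByDay`, `startOfUtcDay`, and the `now` parameter are hypothetical names used for illustration and are not part of this commit; the `NaN` guard is likewise an addition, not something the original does.

```typescript
// Hypothetical sketch, not part of the commit: the bucketing used by
// buildTrend, with `now` injected so results are reproducible.
const DAY_MS = 24 * 60 * 60 * 1000;

// Midnight (UTC) of the day containing `d`, as epoch milliseconds.
const startOfUtcDay = (d: Date): number =>
  Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate());

const bucketByDay = (isoDates: string[], now: Date, days = 30): number[] => {
  const out = new Array<number>(days).fill(0);
  const today = startOfUtcDay(now);
  for (const iso of isoDates) {
    const ts = startOfUtcDay(new Date(iso));
    if (Number.isNaN(ts)) continue; // unparseable date: skip (added guard)
    const ageDays = Math.floor((today - ts) / DAY_MS);
    if (ageDays < 0 || ageDays >= days) continue; // future, or outside the window
    const idx = days - 1 - ageDays; // oldest day first, today last
    out[idx] = (out[idx] ?? 0) + 1;
  }
  return out;
};

// Two commits yesterday, one today, one dated tomorrow (ignored).
const trendSample = bucketByDay(
  [
    "2024-01-10T01:00:00Z",
    "2024-01-09T23:59:00Z",
    "2024-01-09T00:00:00Z",
    "2024-01-11T00:00:00Z",
  ],
  new Date("2024-01-10T12:00:00Z"),
  3,
);
// trendSample is [0, 2, 1]
```

Passing `now` explicitly is one way to make a sibling test for the trend hermetic; whether that suits the real module depends on how its callers are wired.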
src/c14_real_tests.test.ts
+66
−0
| @@ -0,0 +1,66 @@ | ||
| 1 | +// Sibling test for c14_real_tests.ts. buildLiveTestData fans out to | |
| 2 | +// loadTestBundle + fetchRepoCommits (both network/disk) so the | |
| 3 | +// end-to-end is covered by the live /reports/live/tests route. The | |
| 4 | +// pure helpers — agent attribution and the file/name label shortener — | |
| 5 | +// are unit-testable here. | |
| 6 | + | |
| 7 | +import { describe, test, expect } from "bun:test"; | |
| 8 | +import { | |
| 9 | + detectAgent, | |
| 10 | + shortenTestLabel, | |
| 11 | + buildLiveTestData, | |
| 12 | +} from "./c14_real_tests.ts"; | |
| 13 | + | |
| 14 | +describe("c14_real_tests — detectAgent", () => { | |
| 15 | + test("recognises Claude Code via Co-Authored-By: Claude", () => { | |
| 16 | + expect(detectAgent("Add feature\n\nCo-Authored-By: Claude <noreply>")).toBe("claude-code"); | |
| 17 | + }); | |
| 18 | + | |
| 19 | + test("recognises Cursor", () => { | |
| 20 | + expect(detectAgent("Fix bug\n\nCo-Authored-By: Cursor <[email protected]>")).toBe("cursor"); | |
| 21 | + }); | |
| 22 | + | |
| 23 | + test("recognises Aider", () => { | |
| 24 | + expect(detectAgent("Refactor x\n\nCo-Authored-By: aider")).toBe("aider"); | |
| 25 | + }); | |
| 26 | + | |
| 27 | + test("returns null when no recognised footer is present (distinct from c14_real_reports which returns 'unknown')", () => { | |
| 28 | + // The two real_* files made different choices here: real_reports | |
| 29 | + // buckets unknown into its own slug; real_tests returns null so | |
| 30 | + // the caller can filter or fall back. Document the difference. | |
| 31 | + expect(detectAgent("Just a commit")).toBeNull(); | |
| 32 | + expect(detectAgent("")).toBeNull(); | |
| 33 | + }); | |
| 34 | + | |
| 35 | + test("the regex is case-insensitive on both the footer key and the agent token", () => { | |
| 36 | + expect(detectAgent("Co-Authored-By: CLAUDE")).toBe("claude-code"); | |
| 37 | + expect(detectAgent("co-authored-by: aider")).toBe("aider"); | |
| 38 | + }); | |
| 39 | +}); | |
| 40 | + | |
| 41 | +describe("c14_real_tests — shortenTestLabel", () => { | |
| 42 | + test("keeps only the basename of the file path + the test name", () => { | |
| 43 | + expect(shortenTestLabel("src/foo/bar/baz.test.ts", "handles X")).toBe("baz.test.ts > handles X"); | |
| 44 | + }); | |
| 45 | + | |
| 46 | + test("handles a bare filename (no path) without splitting weirdly", () => { | |
| 47 | + expect(shortenTestLabel("baz.test.ts", "handles X")).toBe("baz.test.ts > handles X"); | |
| 48 | + }); | |
| 49 | + | |
| 50 | + test("handles an empty file string (falls back to the empty basename)", () => { | |
| 51 | + // .split('/').pop() on '' yields ''. Documented behaviour: the | |
| 52 | + // helper never throws; the caller decides whether to filter empties. | |
| 53 | + expect(shortenTestLabel("", "name")).toBe(" > name"); | |
| 54 | + }); | |
| 55 | + | |
| 56 | + test("preserves spaces and special chars in the test name", () => { | |
| 57 | + expect(shortenTestLabel("a.ts", "rejects `bad input`")).toBe("a.ts > rejects `bad input`"); | |
| 58 | + }); | |
| 59 | +}); | |
| 60 | + | |
| 61 | +describe("c14_real_tests — orchestrator entry point", () => { | |
| 62 | + test("buildLiveTestData is exported as a function taking (repoOwner, repoName)", () => { | |
| 63 | + expect(typeof buildLiveTestData).toBe("function"); | |
| 64 | + expect(buildLiveTestData.length).toBe(2); | |
| 65 | + }); | |
| 66 | +}); | |
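The `detectAgent` tests above pin the recognised footers, but they leave one property open: a pattern of the form `Co-Authored-By:.*Claude` matches the agent name anywhere on the footer line, including inside an email address. The sketch below is a stricter, table-driven variant under that observation. `detectAgentStrict`, `AgentSlugSketch`, and `AGENT_PATTERNS` are hypothetical names, not the repo's API.

```typescript
// Hypothetical sketch, not part of the commit.
type AgentSlugSketch = "claude-code" | "cursor" | "aider";

// Anchor on the start of a line (m flag) and require the agent name to
// be the first token after the footer key. The i flag keeps both the
// key and the name case-insensitive, as in the tests above.
const AGENT_PATTERNS: ReadonlyArray<readonly [RegExp, AgentSlugSketch]> = [
  [/^Co-Authored-By:[ \t]*Claude\b/im, "claude-code"],
  [/^Co-Authored-By:[ \t]*Cursor\b/im, "cursor"],
  [/^Co-Authored-By:[ \t]*Aider\b/im, "aider"],
];

const detectAgentStrict = (msg: string): AgentSlugSketch | null => {
  for (const [re, slug] of AGENT_PATTERNS) {
    if (re.test(msg)) return slug;
  }
  return null;
};
```

The trade-off is that the stricter form rejects footers that are indented or that put other words before the agent name, so it may under-attribute where the looser pattern over-attributes.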
src/c14_real_tests.ts
+142
−0
| @@ -0,0 +1,142 @@ | ||
| 1 | +// c14 — adapter: aggregate the per-deploy test bundle into the same | |
| 2 | +// TestSnapshot[] / TestStability[] shape that the demo page renders. | |
| 3 | +// HEAD-only snapshots; stability accumulates as more deploys add runs. | |
| 4 | +// | |
| 5 | +// Does I/O by delegating to c14_github's bundle loader and commits | |
| 6 | +// fetcher, hence Layer 2; detectAgent and shortenTestLabel are pure. | |
| 7 | + | |
| 8 | +import { fetchRepoCommits, loadTestBundle, type PlaceholderTest } from "./c14_github.ts"; | |
| 9 | +import type { | |
| 10 | + AgentReport, | |
| 11 | + TestFailure, | |
| 12 | + TestSnapshot, | |
| 13 | + TestStability, | |
| 14 | +} from "./c31_reports_demo.ts"; | |
| 15 | + | |
| 16 | +export const detectAgent = (msg: string): AgentReport["slug"] | null => { | |
| 17 | + if (/Co-Authored-By:.*Claude/i.test(msg)) return "claude-code"; | |
| 18 | + if (/Co-Authored-By:.*Cursor/i.test(msg)) return "cursor"; | |
| 19 | + if (/Co-Authored-By:.*Aider/i.test(msg)) return "aider"; | |
| 20 | + return null; | |
| 21 | +}; | |
| 22 | + | |
| 23 | +export const shortenTestLabel = (file: string, name: string): string => { | |
| 24 | + const base = file.split("/").pop() ?? file; | |
| 25 | + return `${base} > ${name}`; | |
| 26 | +}; | |
| 27 | + | |
| 28 | +export interface LiveTestData { | |
| 29 | + snapshots: TestSnapshot[]; | |
| 30 | + stability: TestStability[]; | |
| 31 | + runsCount: number; | |
| 32 | + ranAt: number | null; | |
| 33 | + headSha: string | null; | |
| 34 | + placeholderTests: PlaceholderTest[]; | |
| 35 | +} | |
| 36 | + | |
| 37 | +export const buildLiveTestData = async ( | |
| 38 | + repoOwner: string, | |
| 39 | + repoName: string, | |
| 40 | +): Promise<LiveTestData> => { | |
| 41 | + const bundle = await loadTestBundle(repoOwner, repoName); | |
| 42 | + if (!bundle || bundle.runs.length === 0) { | |
| 43 | + return { snapshots: [], stability: [], runsCount: 0, ranAt: null, headSha: null, placeholderTests: [] }; | |
| 44 | + } | |
| 45 | + const repoSlug = `${repoOwner}/${repoName}`; | |
| 46 | + const latest = bundle.runs[0]; | |
| 47 | + if (!latest) { | |
| 48 | + return { snapshots: [], stability: [], runsCount: 0, ranAt: null, headSha: null, placeholderTests: [] }; | |
| 49 | + } | |
| 50 | + | |
| 51 | + // For "since" we want the oldest run that has this test as failing. | |
| 52 | + const oldestFirst = [...bundle.runs].sort((a, b) => a.ranAt - b.ranAt); | |
| 53 | + | |
| 54 | + const failures: TestFailure[] = latest.tests | |
| 55 | + .filter((t) => t.status === "fail") | |
| 56 | + .map((t) => { | |
| 57 | + const firstFail = oldestFirst.find((r) => | |
| 58 | + r.tests.some((x) => x.name === t.name && x.file === t.file && x.status === "fail"), | |
| 59 | + ); | |
| 60 | + const sinceTs = firstFail?.ranAt ?? latest.ranAt; | |
| 61 | + return { test: shortenTestLabel(t.file, t.name), since: new Date(sinceTs).toISOString().slice(0, 10) }; | |
| 62 | + }); | |
| 63 | + | |
| 64 | + const snapshot: TestSnapshot = { | |
| 65 | + repo: repoSlug, | |
| 66 | + branch: latest.branch, | |
| 67 | + total: latest.total, | |
| 68 | + passing: latest.passing, | |
| 69 | + failing: latest.failing, | |
| 70 | + failures, | |
| 71 | + }; | |
| 72 | + | |
| 73 | + // Stability: count pass/fail per (file, name) across every run, with | |
| 74 | + // "deleted" set when a previously-seen test is missing from latest. | |
| 75 | + const commits = await fetchRepoCommits(repoOwner, repoName, 100); | |
| 76 | + const shaToAgent = new Map<string, AgentReport["slug"] | null>(); | |
| 77 | + for (const c of commits) shaToAgent.set(c.sha, detectAgent(c.commit.message)); | |
| 78 | + | |
| 79 | + interface Stat { | |
| 80 | + name: string; | |
| 81 | + file: string; | |
| 82 | + pass: number; | |
| 83 | + fail: number; | |
| 84 | + lastBrokenSha: string | null; | |
| 85 | + lastBrokenAt: number; | |
| 86 | + } | |
| 87 | + const stats = new Map<string, Stat>(); | |
| 88 | + for (const run of bundle.runs) { | |
| 89 | + for (const t of run.tests) { | |
| 90 | + const key = `${t.file}|${t.name}`; | |
| 91 | + let s = stats.get(key); | |
| 92 | + if (!s) { | |
| 93 | + s = { name: t.name, file: t.file, pass: 0, fail: 0, lastBrokenSha: null, lastBrokenAt: 0 }; | |
| 94 | + stats.set(key, s); | |
| 95 | + } | |
| 96 | + if (t.status === "pass") s.pass++; | |
| 97 | + else { | |
| 98 | + s.fail++; | |
| 99 | + if (run.ranAt > s.lastBrokenAt) { | |
| 100 | + s.lastBrokenSha = run.sha; | |
| 101 | + s.lastBrokenAt = run.ranAt; | |
| 102 | + } | |
| 103 | + } | |
| 104 | + } | |
| 105 | + } | |
| 106 | + | |
| 107 | + const latestKeys = new Set(latest.tests.map((t) => `${t.file}|${t.name}`)); | |
| 108 | + | |
| 109 | +// lastBrokenBy needs an agent slug; if we can't map a SHA to an agent | |
| 110 | +// (e.g. the commit isn't in the 100-commit window we fetch), fall | |
| 111 | +// back to the agent of the latest run, or to "claude-code" if that | |
| 112 | +// SHA is unmapped too. Defensible for the dogfood case (one agent). | |
| 113 | + const fallbackAgent = (shaToAgent.get(latest.sha) ?? "claude-code") as AgentReport["slug"]; | |
| 114 | + | |
| 115 | + const stability: TestStability[] = Array.from(stats.values()) | |
| 116 | + .map<TestStability>((s) => { | |
| 117 | + const mapped = s.lastBrokenSha ? shaToAgent.get(s.lastBrokenSha) : null; | |
| 118 | + const agent = (mapped ?? fallbackAgent) as AgentReport["slug"]; | |
| 119 | + const deleted = latestKeys.has(`${s.file}|${s.name}`) ? 0 : 1; | |
| 120 | + const flagged = s.fail > 0 && (deleted > 0 || s.fail >= Math.max(2, s.pass / 5)); | |
| 121 | + return { | |
| 122 | + test: shortenTestLabel(s.file, s.name), | |
| 123 | + repo: repoSlug, | |
| 124 | + pass: s.pass, | |
| 125 | + fail: s.fail, | |
| 126 | + deleted, | |
| 127 | + lastBrokenBy: agent, | |
| 128 | + flagged, | |
| 129 | + }; | |
| 130 | + }) | |
| 131 | + .sort((a, b) => b.fail - a.fail || b.deleted - a.deleted || b.pass - a.pass) | |
| 132 | + .slice(0, 30); | |
| 133 | + | |
| 134 | + return { | |
| 135 | + snapshots: [snapshot], | |
| 136 | + stability, | |
| 137 | + runsCount: bundle.runs.length, | |
| 138 | + ranAt: latest.ranAt, | |
| 139 | + headSha: latest.sha, | |
| 140 | + placeholderTests: latest.placeholderTests ?? [], | |
| 141 | + }; | |
| 142 | +}; | |
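The `flagged` expression in `buildLiveTestData` packs three conditions into one line, so a few worked cases may help. The sketch below lifts the same predicate into a standalone function and tabulates sample inputs. `isFlagged`, `StabilityCounts`, and `flagSamples` are hypothetical names introduced here; only the boolean expression itself comes from the diff.

```typescript
// Hypothetical sketch, not part of the commit.
interface StabilityCounts {
  pass: number;
  fail: number;
  deleted: 0 | 1;
}

// Same predicate as the `flagged` expression: the test must have failed
// at least once, and either it is gone from the latest run or its
// failures reach max(2, pass / 5).
const isFlagged = ({ pass, fail, deleted }: StabilityCounts): boolean =>
  fail > 0 && (deleted > 0 || fail >= Math.max(2, pass / 5));

const flagSamples: Array<[StabilityCounts, boolean]> = [
  [{ pass: 10, fail: 1, deleted: 0 }, false], // one failure is below the floor of 2
  [{ pass: 10, fail: 2, deleted: 0 }, true], // threshold is max(2, 10 / 5) = 2
  [{ pass: 20, fail: 3, deleted: 0 }, false], // threshold is max(2, 20 / 5) = 4
  [{ pass: 0, fail: 1, deleted: 1 }, true], // failed at least once, then disappeared
  [{ pass: 5, fail: 0, deleted: 1 }, false], // deleted but never failed
];
```

Read this way, the rule tolerates a single failure in a test that is still present, and scales the threshold to one failure per five passes once a test has more than ten passes.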
src/c14_sama_profile.test.ts
+153
−0
| @@ -0,0 +1,153 @@ | ||
| 1 | +import { describe, test, expect } from "bun:test"; | |
| 2 | +import { parseProfileToml } from "./c14_sama_profile.ts"; | |
| 3 | + | |
| 4 | +describe("c14_sama_profile — parseProfileToml", () => { | |
| 5 | + test("parses the minimum required top-level keys", () => { | |
| 6 | + const p = parseProfileToml(` | |
| 7 | +sama_version = "2.0" | |
| 8 | +profile = "tdd-md" | |
| 9 | + | |
| 10 | +[layers.0] | |
| 11 | +prefixes = ["c31_"] | |
| 12 | + | |
| 13 | +[layers.1] | |
| 14 | +prefixes = ["c32_"] | |
| 15 | + | |
| 16 | +[layers.2] | |
| 17 | +prefixes = ["c13_"] | |
| 18 | + | |
| 19 | +[layers.3] | |
| 20 | +prefixes = ["c11_"] | |
| 21 | +`); | |
| 22 | + expect(p.samaVersion).toBe("2.0"); | |
| 23 | + expect(p.profile).toBe("tdd-md"); | |
| 24 | + }); | |
| 25 | + | |
| 26 | + test("a flat-prefix layer maps to a single synthetic sublayer named 'default'", () => { | |
| 27 | + const p = parseProfileToml(` | |
| 28 | +sama_version = "2.0" | |
| 29 | +profile = "x" | |
| 30 | + | |
| 31 | +[layers.0] | |
| 32 | +prefixes = ["c31_"] | |
| 33 | + | |
| 34 | +[layers.1] | |
| 35 | +prefixes = [] | |
| 36 | + | |
| 37 | +[layers.2] | |
| 38 | +prefixes = [] | |
| 39 | + | |
| 40 | +[layers.3] | |
| 41 | +prefixes = [] | |
| 42 | +`); | |
| 43 | + expect(p.layers[0].sublayers).toHaveLength(1); | |
| 44 | + expect(p.layers[0].sublayers[0]).toEqual({ name: "default", prefix: "c31_", index: 0 }); | |
| 45 | + }); | |
| 46 | + | |
| 47 | + test("a subdivided layer carries sublayer index = position in the array", () => { | |
| 48 | + const p = parseProfileToml(` | |
| 49 | +sama_version = "2.0" | |
| 50 | +profile = "x" | |
| 51 | + | |
| 52 | +[layers.0] | |
| 53 | +prefixes = [] | |
| 54 | + | |
| 55 | +[layers.1] | |
| 56 | +sublayers = [ | |
| 57 | + { name = "logic", prefix = "c32_" }, | |
| 58 | + { name = "render", prefix = "c51_" }, | |
| 59 | +] | |
| 60 | + | |
| 61 | +[layers.2] | |
| 62 | +prefixes = [] | |
| 63 | + | |
| 64 | +[layers.3] | |
| 65 | +prefixes = [] | |
| 66 | +`); | |
| 67 | + expect(p.layers[1].sublayers).toHaveLength(2); | |
| 68 | + expect(p.layers[1].sublayers[0]).toEqual({ name: "logic", prefix: "c32_", index: 0 }); | |
| 69 | + expect(p.layers[1].sublayers[1]).toEqual({ name: "render", prefix: "c51_", index: 1 }); | |
| 70 | + }); | |
| 71 | + | |
| 72 | + test("comments are stripped", () => { | |
| 73 | + const p = parseProfileToml(` | |
| 74 | +# leading comment | |
| 75 | +sama_version = "2.0" # trailing comment | |
| 76 | +profile = "x" | |
| 77 | + | |
| 78 | +[layers.0] | |
| 79 | +prefixes = ["c31_"] # another | |
| 80 | + | |
| 81 | +[layers.1] | |
| 82 | +prefixes = [] | |
| 83 | + | |
| 84 | +[layers.2] | |
| 85 | +prefixes = [] | |
| 86 | + | |
| 87 | +[layers.3] | |
| 88 | +prefixes = [] | |
| 89 | +`); | |
| 90 | + expect(p.samaVersion).toBe("2.0"); | |
| 91 | + expect(p.layers[0].sublayers[0]!.prefix).toBe("c31_"); | |
| 92 | + }); | |
| 93 | + | |
| 94 | + test("a missing top-level key throws a clear error", () => { | |
| 95 | + expect(() => parseProfileToml(`profile = "x"\n[layers.0]\n[layers.1]\n[layers.2]\n[layers.3]\n`)) | |
| 96 | + .toThrow(/sama_version/); | |
| 97 | + expect(() => parseProfileToml(`sama_version = "2.0"\n[layers.0]\n[layers.1]\n[layers.2]\n[layers.3]\n`)) | |
| 98 | + .toThrow(/profile/); | |
| 99 | + }); | |
| 100 | + | |
| 101 | + test("missing a required layer section throws a clear error", () => { | |
| 102 | + expect(() => parseProfileToml(` | |
| 103 | +sama_version = "2.0" | |
| 104 | +profile = "x" | |
| 105 | + | |
| 106 | +[layers.0] | |
| 107 | +prefixes = [] | |
| 108 | + | |
| 109 | +[layers.1] | |
| 110 | +prefixes = [] | |
| 111 | + | |
| 112 | +[layers.2] | |
| 113 | +prefixes = [] | |
| 114 | +`)).toThrow(/layers\.3/); | |
| 115 | + }); | |
| 116 | + | |
| 117 | + test("parses the actual repo profile file", () => { | |
| 118 | + // Inline copy of the real-repo profile to keep this test | |
| 119 | + // hermetic — no filesystem read. If sama.profile.toml's shape | |
| 120 | + // ever drifts, this test pins what the parser supports. | |
| 121 | + const real = ` | |
| 122 | +sama_version = "2.0" | |
| 123 | +profile = "tdd-md" | |
| 124 | + | |
| 125 | +[layers.0] | |
| 126 | +prefixes = ["c31_"] | |
| 127 | + | |
| 128 | +[layers.1] | |
| 129 | +sublayers = [ | |
| 130 | + { name = "logic", prefix = "c32_" }, | |
| 131 | + { name = "render", prefix = "c51_" }, | |
| 132 | +] | |
| 133 | + | |
| 134 | +[layers.2] | |
| 135 | +sublayers = [ | |
| 136 | + { name = "data", prefix = "c13_" }, | |
| 137 | + { name = "io", prefix = "c14_" }, | |
| 138 | +] | |
| 139 | + | |
| 140 | +[layers.3] | |
| 141 | +sublayers = [ | |
| 142 | + { name = "handlers", prefix = "c21_" }, | |
| 143 | + { name = "server", prefix = "c11_" }, | |
| 144 | +] | |
| 145 | +`; | |
| 146 | + const p = parseProfileToml(real); | |
| 147 | + expect(p.profile).toBe("tdd-md"); | |
| 148 | + expect(p.layers[0].sublayers.map((s) => s.prefix)).toEqual(["c31_"]); | |
| 149 | + expect(p.layers[1].sublayers.map((s) => s.name)).toEqual(["logic", "render"]); | |
| 150 | + expect(p.layers[2].sublayers.map((s) => s.name)).toEqual(["data", "io"]); | |
| 151 | + expect(p.layers[3].sublayers.map((s) => s.name)).toEqual(["handlers", "server"]); | |
| 152 | + }); | |
| 153 | +}); | |
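The tests above pin the parsed shape: each layer carries an ordered list of `{ name, prefix, index }` sublayers. As an illustration of how that shape can be consumed, the sketch below maps a repo-relative path to its declared layer by prefix. It is not the `declaredLayer` helper from `c31_sama_v2.ts`, whose signature is not shown in this diff. `layerOfFile` and the `*Sketch` types are hypothetical and mirror only the fields visible in the tests.

```typescript
// Hypothetical sketch, not part of the commit.
interface SublayerSketch { name: string; prefix: string; index: number }
interface LayerSketch { sublayers: SublayerSketch[] }
interface ProfileSketch { layers: Record<0 | 1 | 2 | 3, LayerSketch> }

// Return the first layer/sublayer whose prefix matches the file's
// basename, or null when no prefix in the profile applies.
const layerOfFile = (
  profile: ProfileSketch,
  repoPath: string,
): { layer: number; sublayer: string } | null => {
  const base = repoPath.split("/").pop() ?? repoPath;
  for (const layer of [0, 1, 2, 3] as const) {
    for (const sub of profile.layers[layer].sublayers) {
      if (base.startsWith(sub.prefix)) return { layer, sublayer: sub.name };
    }
  }
  return null;
};

// The tdd-md profile as asserted by the last test above.
const tddMdProfile: ProfileSketch = {
  layers: {
    0: { sublayers: [{ name: "default", prefix: "c31_", index: 0 }] },
    1: { sublayers: [
      { name: "logic", prefix: "c32_", index: 0 },
      { name: "render", prefix: "c51_", index: 1 },
    ] },
    2: { sublayers: [
      { name: "data", prefix: "c13_", index: 0 },
      { name: "io", prefix: "c14_", index: 1 },
    ] },
    3: { sublayers: [
      { name: "handlers", prefix: "c21_", index: 0 },
      { name: "server", prefix: "c11_", index: 1 },
    ] },
  },
};
```

For example, `layerOfFile(tddMdProfile, "src/c14_sama_profile.ts")` resolves to layer 2, sublayer `io`, which matches the Layer 2 placement described in the file's own header comment.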
src/c14_sama_profile.ts
+236
−0
| @@ -0,0 +1,236 @@ | ||
| 1 | +// c14 — adapter: loads + parses sama.profile.toml (the SAMA v2 profile | |
| 2 | +// declaration at the repo root) and walks the source tree to feed the | |
| 3 | +// v2 verifier. Layer 2 in SAMA v2 terms: this is the boundary where | |
| 4 | +// external input (the TOML file on disk + the contents of src/) is | |
| 5 | +// parsed into the typed SamaV2Input shape that the pure verifier in | |
| 6 | +// c32_sama_v2_verify consumes. | |
| 7 | +// | |
| 8 | +// The TOML parser handles the subset our profile uses (string values, | |
| 9 | +// string arrays, and arrays of inline tables) — not full TOML. The | |
| 10 | +// alternative is depending on an external parser; the subset is small | |
| 11 | +// enough that an inline implementation keeps the verifier dependency- | |
| 12 | +// free and easy to inspect. | |
| 13 | + | |
| 14 | +import { readdirSync, readFileSync } from "node:fs"; | |
| 15 | +import { resolve } from "node:path"; | |
| 16 | +import type { | |
| 17 | + LayerNumber, | |
| 18 | + LayerSpec, | |
| 19 | + ProfileSpec, | |
| 20 | + SamaV2Input, | |
| 21 | + Sublayer, | |
| 22 | +} from "./c31_sama_v2.ts"; | |
| 23 | + | |
| 24 | +// — TOML subset parser ---------------------------------------------- | |
| 25 | + | |
| 26 | +const stripComment = (line: string): string => { | |
| 27 | + // TOML comments only start outside string literals. Our profile has | |
| 28 | + // no '#' inside strings, so a naive cut at the first '#' is fine. If | |
| 29 | + // that ever changes, this needs a quote-aware scan instead. | |
| 30 | + const idx = line.indexOf("#"); | |
| 31 | + return idx === -1 ? line : line.slice(0, idx); | |
| 32 | +}; | |
| 33 | + | |
| 34 | +const parseStringValue = (raw: string): string => { | |
| 35 | + const t = raw.trim(); | |
| 36 | + if ((t.startsWith('"') && t.endsWith('"')) || (t.startsWith("'") && t.endsWith("'"))) { | |
| 37 | + return t.slice(1, -1); | |
| 38 | + } | |
| 39 | + throw new Error(`expected quoted string, got: ${raw}`); | |
| 40 | +}; | |
| 41 | + | |
| 42 | +const parseStringArray = (raw: string): string[] => { | |
| 43 | + // Expect `[ "a", "b", ... ]` on a single line. | |
| 44 | + const t = raw.trim(); | |
| 45 | + if (!t.startsWith("[") || !t.endsWith("]")) { | |
| 46 | + throw new Error(`expected [..] array, got: ${raw}`); | |
| 47 | + } | |
| 48 | + const inner = t.slice(1, -1).trim(); | |
| 49 | + if (inner === "") return []; | |
| 50 | + return inner.split(",").map((s) => parseStringValue(s.trim())); | |
| 51 | +}; | |
| 52 | + | |
| 53 | +const parseInlineTable = (raw: string): Record<string, string> => { | |
| 54 | + // Expect `{ key = "value", key2 = "value2" }` on one line. | |
| 55 | + const t = raw.trim(); | |
| 56 | + if (!t.startsWith("{") || !t.endsWith("}")) { | |
| 57 | + throw new Error(`expected inline table, got: ${raw}`); | |
| 58 | + } | |
| 59 | + const inner = t.slice(1, -1).trim(); | |
| 60 | + const out: Record<string, string> = {}; | |
| 61 | + if (inner === "") return out; | |
| 62 | + // A full parser would split only on commas outside quoted strings. | |
| 63 | + // Our subset never quotes a comma, so a plain split is enough. | |
| 64 | + for (const pair of inner.split(",")) { | |
| 65 | + const eq = pair.indexOf("="); | |
| 66 | + if (eq === -1) throw new Error(`malformed inline-table entry: ${pair}`); | |
| 67 | + const key = pair.slice(0, eq).trim(); | |
| 68 | + const value = pair.slice(eq + 1).trim(); | |
| 69 | + out[key] = parseStringValue(value); | |
| 70 | + } | |
| 71 | + return out; | |
| 72 | +}; | |
| 73 | + | |
| 74 | +interface ParseState { | |
| 75 | + sections: Map<string, Map<string, unknown>>; | |
| 76 | +} | |
| 77 | + | |
| 78 | +export const parseProfileToml = (text: string): ProfileSpec => { | |
| 79 | + const state: ParseState = { sections: new Map() }; | |
| 80 | + const top = new Map<string, unknown>(); | |
| 81 | + state.sections.set("__top__", top); | |
| 82 | + | |
| 83 | + // Pre-process: join continuation lines for multi-line arrays of | |
| 84 | + // inline tables. Walk by char-level bracket tracking — when '[' is | |
| 85 | + // open in a value, keep accumulating until the matching ']' arrives. | |
| 86 | + const physLines = text.split("\n"); | |
| 87 | + const logical: string[] = []; | |
| 88 | + let buf = ""; | |
| 89 | + let depth = 0; | |
| 90 | + for (const raw of physLines) { | |
| 91 | + const line = stripComment(raw); | |
| 92 | + if (depth === 0) { | |
| 93 | + if (buf === "") buf = line; else buf += " " + line; | |
| 94 | + } else { | |
| 95 | + buf += " " + line; | |
| 96 | + } | |
| 97 | + for (const c of line) { | |
| 98 | + if (c === "[" || c === "{") depth++; | |
| 99 | + else if (c === "]" || c === "}") depth--; | |
| 100 | + } | |
| 101 | + if (depth <= 0) { | |
| 102 | + depth = 0; | |
| 103 | + logical.push(buf); | |
| 104 | + buf = ""; | |
| 105 | + } | |
| 106 | + } | |
| 107 | + if (buf.trim() !== "") logical.push(buf); | |
| 108 | + | |
| 109 | + let currentSection = "__top__"; | |
| 110 | + for (const raw of logical) { | |
| 111 | + const line = raw.trim(); | |
| 112 | + if (line === "") continue; | |
| 113 | + if (line.startsWith("[") && line.endsWith("]")) { | |
| 114 | + currentSection = line.slice(1, -1).trim(); | |
| 115 | + if (!state.sections.has(currentSection)) { | |
| 116 | + state.sections.set(currentSection, new Map()); | |
| 117 | + } | |
| 118 | + continue; | |
| 119 | + } | |
| 120 | + const eq = line.indexOf("="); | |
| 121 | + if (eq === -1) throw new Error(`unparseable line: ${line}`); | |
| 122 | + const key = line.slice(0, eq).trim(); | |
| 123 | + const valueRaw = line.slice(eq + 1).trim(); | |
| 124 | + let value: unknown; | |
| 125 | + if (valueRaw.startsWith("[") && valueRaw.endsWith("]")) { | |
| 126 | + // Array — string array or array of inline tables. Peek at the | |
| 127 | + // first non-bracket char inside. | |
| 128 | + const inner = valueRaw.slice(1, -1).trim(); | |
| 129 | + if (inner.startsWith("{")) { | |
| 130 | + // Array of inline tables. Split on commas at depth 0. | |
| 131 | + const tables: Array<Record<string, string>> = []; | |
| 132 | + let cur = ""; | |
| 133 | + let d = 0; | |
| 134 | + for (const c of inner) { | |
| 135 | + if (c === "{") d++; | |
| 136 | + if (c === "}") d--; | |
| 137 | + if (c === "," && d === 0) { | |
| 138 | + tables.push(parseInlineTable(cur)); | |
| 139 | + cur = ""; | |
| 140 | + } else { | |
| 141 | + cur += c; | |
| 142 | + } | |
| 143 | + } | |
| 144 | + if (cur.trim() !== "") tables.push(parseInlineTable(cur)); | |
| 145 | + value = tables; | |
| 146 | + } else { | |
| 147 | + value = parseStringArray(valueRaw); | |
| 148 | + } | |
| 149 | + } else { | |
| 150 | + value = parseStringValue(valueRaw); | |
| 151 | + } | |
| 152 | + state.sections.get(currentSection)!.set(key, value); | |
| 153 | + } | |
| 154 | + | |
| 155 | + // Now assemble ProfileSpec. | |
| 156 | + const samaVersion = top.get("sama_version") as string | undefined; | |
| 157 | + const profile = top.get("profile") as string | undefined; | |
| 158 | + if (typeof samaVersion !== "string" || typeof profile !== "string") { | |
| 159 | + throw new Error("profile must declare `sama_version` and `profile` at the top level"); | |
| 160 | + } | |
| 161 | + | |
| 162 | + const buildLayer = (k: LayerNumber): LayerSpec => { | |
| 163 | + const sec = state.sections.get(`layers.${k}`); | |
| 164 | + if (!sec) { | |
| 165 | + throw new Error(`profile is missing required section [layers.${k}]`); | |
| 166 | + } | |
| 167 | + const sublayersRaw = sec.get("sublayers") as Array<Record<string, string>> | undefined; | |
| 168 | + const prefixes = sec.get("prefixes") as string[] | undefined; | |
| 169 | + const subs: Sublayer[] = []; | |
| 170 | + if (sublayersRaw && sublayersRaw.length > 0) { | |
| 171 | + sublayersRaw.forEach((row, index) => { | |
| 172 | + if (!row.name || !row.prefix) { | |
| 173 | + throw new Error(`[layers.${k}] sublayer ${index} missing name/prefix`); | |
| 174 | + } | |
| 175 | + subs.push({ name: row.name, prefix: row.prefix, index }); | |
| 176 | + }); | |
| 177 | + } else if (prefixes && prefixes.length > 0) { | |
| 178 | + prefixes.forEach((prefix, index) => { | |
| 179 | + subs.push({ name: "default", prefix, index }); | |
| 180 | + }); | |
| 181 | + } else { | |
| 182 | + // Empty layer is permitted (spec §2.1: "Leave a canonical layer | |
| 183 | + // empty"). The verifier just won't assign any file to it. | |
| 184 | + } | |
| 185 | + return { sublayers: subs }; | |
| 186 | + }; | |
| 187 | + | |
| 188 | + return { | |
| 189 | + samaVersion, | |
| 190 | + profile, | |
| 191 | + layers: { | |
| 192 | + 0: buildLayer(0), | |
| 193 | + 1: buildLayer(1), | |
| 194 | + 2: buildLayer(2), | |
| 195 | + 3: buildLayer(3), | |
| 196 | + }, | |
| 197 | + }; | |
| 198 | +}; | |
| 199 | + | |
| 200 | +// — Filesystem I/O -------------------------------------------------- | |
| 201 | + | |
| 202 | +const REPO_ROOT_GUESS = process.cwd(); | |
| 203 | + | |
| 204 | +export const loadProfile = async ( | |
| 205 | + repoRoot: string = REPO_ROOT_GUESS, | |
| 206 | +): Promise<ProfileSpec> => { | |
| 207 | + const path = resolve(repoRoot, "sama.profile.toml"); | |
| 208 | + const text = await Bun.file(path).text(); | |
| 209 | + return parseProfileToml(text); | |
| 210 | +}; | |
| 211 | + | |
| 212 | +// Read every .ts file directly under src/ (non-recursive; sources + | |
| 213 | +// test siblings) into a map keyed by repo-relative path ("src/cXX_*.ts"). | |
| 214 | +export const loadRepoFiles = ( | |
| 215 | + repoRoot: string = REPO_ROOT_GUESS, | |
| 216 | +): Map<string, string> => { | |
| 217 | + const srcDir = resolve(repoRoot, "src"); | |
| 218 | + const out = new Map<string, string>(); | |
| 219 | + const entries = readdirSync(srcDir, { withFileTypes: true }); | |
| 220 | + for (const e of entries) { | |
| 221 | + if (!e.isFile() || !e.name.endsWith(".ts")) continue; | |
| 222 | + const repoPath = `src/${e.name}`; | |
| 223 | + out.set(repoPath, readFileSync(resolve(srcDir, e.name), "utf8")); | |
| 224 | + } | |
| 225 | + return out; | |
| 226 | +}; | |
| 227 | + | |
| 228 | +// Convenience: composes loadProfile + loadRepoFiles into the | |
| 229 | +// SamaV2Input the verifier consumes. Handler code calls this then | |
| 230 | +// passes the result straight to verifySamaV2. | |
| 231 | +export const buildSamaV2Input = async ( | |
| 232 | + repoRoot: string = REPO_ROOT_GUESS, | |
| 233 | +): Promise<SamaV2Input> => ({ | |
| 234 | + profile: await loadProfile(repoRoot), | |
| 235 | + files: loadRepoFiles(repoRoot), | |
| 236 | +}); | |
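`stripComment` above cuts at the first `#` and relies on the profile never placing `#` inside a string. Should that assumption stop holding, one option is a quote-aware scan. The sketch below is a possible shape for it, under the assumption that only single-line basic (`"..."`) and literal (`'...'`) strings appear. `stripCommentQuoteAware` is a hypothetical name and not part of this commit.

```typescript
// Hypothetical sketch, not part of the commit.
const stripCommentQuoteAware = (line: string): string => {
  let quote: string | null = null; // the quote character we are inside, if any
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (quote !== null) {
      // Backslash escapes exist only in double-quoted (basic) strings.
      if (ch === "\\" && quote === '"') i++; // skip the escaped character
      else if (ch === quote) quote = null; // closing quote
    } else if (ch === '"' || ch === "'") {
      quote = ch; // opening quote
    } else if (ch === "#") {
      return line.slice(0, i); // comment starts here
    }
  }
  return line; // no comment found outside a string
};
```

The scan does not validate the line: an unterminated string simply means no comment is stripped, which leaves error reporting to the value parsers that run afterwards.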
src/c21_app.ts
+3
−0
| @@ -33,6 +33,7 @@ import { | ||
| 33 | 33 | samaCliResponse, |
| 34 | 34 | samaSkillHandler, |
| 35 | 35 | samaV2Handler, |
| 36 | + samaV2VerifyHandler, | |
| 36 | 37 | samaVerifyHandler, |
| 37 | 38 | samaLandingHandler, |
| 38 | 39 | samaSlugHandler, |
| @@ -362,6 +363,8 @@ ${rows} | ||
| 362 | 363 | |
| 363 | 364 | "/sama/v2": samaV2Handler, |
| 364 | 365 | |
| 366 | + "/sama/v2/verify": samaV2VerifyHandler, | |
| 367 | + | |
| 365 | 368 | "/sama/verify": samaVerifyHandler, |
| 366 | 369 | |
| 367 | 370 | "/sama": samaLandingHandler, |
src/c21_handlers_api_agents.ts
+1
−1
| @@ -5,7 +5,7 @@ | ||
| 5 | 5 | // judge entry point lives in c21_handlers_webhook — different auth |
| 6 | 6 | // model (HMAC), different concept. |
| 7 | 7 | |
| 8 | -import { judge } from "./c32_judge.ts"; | |
| 8 | +import { judge } from "./c14_judge.ts"; | |
| 9 | 9 | import { timingSafeEqual } from "./c32_session.ts"; |
| 10 | 10 | import { |
| 11 | 11 | FORGEJO_URL, |
src/c21_handlers_reports.ts
+2
−2
| @@ -21,8 +21,8 @@ import { | ||
| 21 | 21 | DEMO_SNAPSHOTS, |
| 22 | 22 | DEMO_STABILITY, |
| 23 | 23 | } from "./c31_reports_demo.ts"; |
| 24 | -import { buildLiveReports } from "./c32_real_reports.ts"; | |
| 25 | -import { buildLiveTestData } from "./c32_real_tests.ts"; | |
| 24 | +import { buildLiveReports } from "./c14_real_reports.ts"; | |
| 25 | +import { buildLiveTestData } from "./c14_real_tests.ts"; | |
| 26 | 26 | import { |
| 27 | 27 | LIVE_REPO_OWNER, |
| 28 | 28 | LIVE_REPO_NAME, |
src/c21_handlers_sama.ts
+65
−0
| @@ -61,6 +61,71 @@ export const samaSkillHandler = async (): Promise<Response> => { | ||
| 61 | 61 | return htmlResponse(html); |
| 62 | 62 | }; |
| 63 | 63 | |
| 64 | +// -------- /sama/v2/verify (the v2 dogfood — runs the v2 verifier | |
| 65 | +// against this repo using sama.profile.toml) -------- | |
| 66 | + | |
| 67 | +import { buildSamaV2Input } from "./c14_sama_profile.ts"; | |
| 68 | +import { verifySamaV2 } from "./c32_sama_v2_verify.ts"; | |
| 69 | +import type { SamaV2Report } from "./c31_sama_v2.ts"; | |
| 70 | + | |
| 71 | +const renderV2Report = (report: SamaV2Report): string => { | |
| 72 | + const summary = report.overallPassed | |
| 73 | + ? `✓ conforms · profile \`${report.profile}\` · ${report.examined} files examined · ${report.checks.length}/${report.checks.length} checks pass` | |
| 74 | + : `${report.checks.filter((c) => c.passed).length}/${report.checks.length} checks pass · profile \`${report.profile}\` · ${report.examined} files examined`; | |
| 75 | + const rows = report.checks | |
| 76 | + .map((c) => { | |
| 77 | + const mark = c.passed ? "✓ pass" : `✗ ${c.violations.length} violation${c.violations.length === 1 ? "" : "s"}`; | |
| 78 | + return `| #${c.id} ${c.name} | ${mark} | ${c.examined} |`; | |
| 79 | + }) | |
| 80 | + .join("\n"); | |
| 81 | + const details = report.checks | |
| 82 | + .filter((c) => !c.passed) | |
| 83 | + .map((c) => { | |
| 84 | + const head = `### ✗ #${c.id} ${c.name}\n`; | |
| 85 | + const noteBlock = c.note ? `\n*${c.note}*\n` : ""; | |
| 86 | + const list = c.violations | |
| 87 | + .map((v) => `- \`${v.file}\` — ${v.detail}`) | |
| 88 | + .join("\n"); | |
| 89 | + return `${head}${noteBlock}\n${list}\n`; | |
| 90 | + }) | |
| 91 | + .join("\n"); | |
| 92 | + return `# SAMA v2 — \`syntaxai/tdd.md\` dogfood | |
| 93 | + | |
| 94 | +> ${summary} | |
| 95 | + | |
| 96 | +The verifier in [\`src/c32_sama_v2_verify.ts\`](/GIT/syntaxai/tdd.md/blob/main/src/c32_sama_v2_verify.ts) ingests [\`sama.profile.toml\`](/GIT/syntaxai/tdd.md/blob/main/sama.profile.toml) and runs the seven §4 conformance checks against the current source tree on this server. No clone, no token; the server reads its own \`src/\` and the committed profile, runs the same logic the sibling unit tests cover, and renders the verdict below. | |
| 97 | + | |
| 98 | +| check | verdict | examined | | |
| 99 | +|---|---|---| | |
| 100 | +${rows} | |
| 101 | + | |
| 102 | +${details ? `## Open violations\n\n${details}` : ""} | |
| 103 | + | |
| 104 | +[← /sama/v2](/sama/v2) · [← /sama](/sama) · [the v1 dogfood](/sama/verify?repo=syntaxai/tdd.md) | |
| 105 | +`; | |
| 106 | +}; | |
| 107 | + | |
| 108 | +export const samaV2VerifyHandler = async (): Promise<Response> => { | |
| 109 | + let body: string; | |
| 110 | + try { | |
| 111 | + const input = await buildSamaV2Input(); | |
| 112 | + const report = verifySamaV2(input); | |
| 113 | + body = renderV2Report(report); | |
| 114 | + } catch (err) { | |
| 115 | + body = `# SAMA v2 verify — error\n\nThe verifier failed before producing a verdict:\n\n\`\`\`\n${err instanceof Error ? err.message : String(err)}\n\`\`\`\n\n[← /sama/v2](/sama/v2)`; | |
| 116 | + } | |
| 117 | + const html = await renderDocsPage({ | |
| 118 | + title: "SAMA v2 verify · syntaxai/tdd.md — tdd.md", | |
| 119 | + description: | |
| 120 | + "Live dogfood: tdd.md's own source tree run through the SAMA v2 verifier. Reads sama.profile.toml + src/*.ts, applies the seven §4 conformance checks, renders the verdict.", | |
| 121 | + bodyMarkdown: body, | |
| 122 | + ogPath: "https://tdd.md/sama/v2/verify", | |
| 123 | + active: "sama", | |
| 124 | + pathForDocs: "/sama/v2/verify", | |
| 125 | + }); | |
| 126 | + return htmlResponse(html); | |
| 127 | +}; | |
| 128 | + | |
| 64 | 129 | // -------- /sama/v2 (the SAMA v2 Core Specification — draft) -------- |
| 65 | 130 | |
| 66 | 131 | export const samaV2Handler = async (): Promise<Response> => { |
src/c21_handlers_webhook.ts
+1
−1
| @@ -6,7 +6,7 @@ | ||
| 6 | 6 | // and the failure semantics (ack-and-fire vs. wait-for-verdict) are |
| 7 | 7 | // genuinely different concepts. |
| 8 | 8 | |
| 9 | -import { judge } from "./c32_judge.ts"; | |
| 9 | +import { judge } from "./c14_judge.ts"; | |
| 10 | 10 | import { timingSafeEqual, hmacSha256Hex } from "./c32_session.ts"; |
| 11 | 11 | |
| 12 | 12 | export const forgejoWebhookHandler = async (req: Request): Promise<Response> => { |
src/c31_blog.ts
+6
−0
| @@ -12,6 +12,12 @@ export interface BlogEntry { | ||
| 12 | 12 | } |
| 13 | 13 | |
| 14 | 14 | export const ALL_POSTS: BlogEntry[] = [ |
| 15 | + { | |
| 16 | + slug: "deploy-that-lies-cascade", | |
| 17 | + title: "When the deploy lies: three bugs hidden by one silent error suppressor", | |
| 18 | + description: "/reports/live had been stuck on a 12-day-old window because the deploy script's snapshot step was failing silently (no bun on the p620 host, the failure was swallowed by 2>/dev/null and a 'non-fatal skipped' echo). Fix one: run the snapshot via podman. That exposed a second silent skip — snapshot-tests had been missing from the git-mode deploy entirely. Fix two: add it. That made bun test actually run in CI for the first time and exposed two more bugs — a 1-in-16 flaky test and a false-positive placeholder where the verifier's own test fixture was being grepped as a real test. Three bugs in one PR. The empirical lesson: verification only works if the pipeline that runs it isn't lying about whether it ran.", | |
| 19 | + date: "2026-05-22", | |
| 20 | + }, | |
| 15 | 21 | { |
| 16 | 22 | slug: "sama-empirical-modeled-green", |
| 17 | 23 | title: "Greening our own dogfood: four sibling tests, the live verifier flipped from 3/4 to 4/4", |
src/c31_git_parse.ts
+32
−0
| @@ -81,3 +81,35 @@ export const parseLsTreeLine = (line: string): LsTreeEntry | null => { | ||
| 81 | 81 | if (type !== "blob" && type !== "tree" && type !== "commit") return null; |
| 82 | 82 | return { mode: mode!, type, sha: sha!, path }; |
| 83 | 83 | }; |
| 84 | + | |
| 85 | +// Tree-listing entry returned by c14_git.lsTree. Defined here in | |
| 86 | +// Layer 0 (Pure) per SAMA v2 §1.1 so c51 render code (and other | |
| 87 | +// readers) can reference the type without importing from Layer 2. | |
| 88 | +// Distinct from LsTreeEntry above: that's the raw parsed line; this | |
| 89 | +// is the cleaned-up shape c14_git exposes to callers. | |
| 90 | +export interface TreeEntry { | |
| 91 | + name: string; // basename, e.g. "skill.md" or "blog" | |
| 92 | + type: "blob" | "tree" | "commit"; | |
| 93 | + sha: string; | |
| 94 | + mode: string; | |
| 95 | +} | |
| 96 | + | |
| 97 | +// Result types for c14_git.commitFile etc. Defined here in Layer 0 | |
| 98 | +// (Pure) per SAMA v2 §1.1 so c51 render code can match against the | |
| 99 | +// discriminated union without crossing import direction. | |
| 100 | +export interface GitCommitOk { | |
| 101 | + ok: true; | |
| 102 | + commitSha: string; | |
| 103 | +} | |
| 104 | + | |
| 105 | +export interface GitCommitFailure { | |
| 106 | + ok: false; | |
| 107 | + // "conflict" → ref tip moved under us (someone else committed) | |
| 108 | + // "not_found" → branch doesn't exist | |
| 109 | + // "permission" → fs perms on the bare repo | |
| 110 | + // "other" → anything else (look at .message) | |
| 111 | + kind: "conflict" | "not_found" | "permission" | "other"; | |
| 112 | + message: string; | |
| 113 | +} | |
| 114 | + | |
| 115 | +export type GitCommitOutcome = GitCommitOk | GitCommitFailure; | |
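A caller of these result types would typically narrow on the `ok` flag first and on `kind` second. The sketch below restates the interfaces from this hunk so it runs on its own; `describeOutcome` and its message strings are hypothetical, added for illustration only, and are not part of this repo.

```typescript
// Types restated from c31_git_parse.ts so the sketch is self-contained.
interface GitCommitOk {
  ok: true;
  commitSha: string;
}
interface GitCommitFailure {
  ok: false;
  kind: "conflict" | "not_found" | "permission" | "other";
  message: string;
}
type GitCommitOutcome = GitCommitOk | GitCommitFailure;

// Hypothetical helper (not in the repo): turns an outcome into a
// one-line status. Checking `ok` narrows the union, so `commitSha`
// is only reachable on success and `kind` only on failure.
const describeOutcome = (o: GitCommitOutcome): string => {
  if (o.ok) return `committed ${o.commitSha.slice(0, 7)}`;
  switch (o.kind) {
    case "conflict":
      return "branch moved; reload and retry";
    case "not_found":
      return "branch does not exist";
    case "permission":
      return "repository is not writable";
    case "other":
      return `commit failed: ${o.message}`;
  }
};

console.log(describeOutcome({ ok: true, commitSha: "0123456789abcdef" }));
console.log(describeOutcome({ ok: false, kind: "other", message: "disk full" }));
```

Because the types live in Layer 0, render code can run this kind of match without importing anything from `c14_git`.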
src/c31_project_config.ts
+16
−0
| @@ -100,3 +100,19 @@ export const parseRepoIdentifier = (raw: string): { owner: string; repo: string | ||
| 100 | 100 | } |
| 101 | 101 | return { owner, repo }; |
| 102 | 102 | }; |
| 103 | + | |
| 104 | +// Row-shape returned by c13_database for project records. Defined here | |
| 105 | +// in Layer 0 (Pure) per SAMA v2 §1.1 so c51 render code can reference | |
| 106 | +// the type without importing from Layer 2 (Adapter). | |
| 107 | +export interface ProjectRow { | |
| 108 | + id: number; | |
| 109 | + registeredBy: string; | |
| 110 | + repoOwner: string; | |
| 111 | + repoName: string; | |
| 112 | + testRunner: TestRunner; | |
| 113 | + trackedBranches: string[]; | |
| 114 | + displayName: string | null; | |
| 115 | + team: string | null; | |
| 116 | + registeredAt: number; | |
| 117 | + status: "active" | "paused"; | |
| 118 | +} | |
src/c31_sama_v2.ts
+97
−0
| @@ -0,0 +1,97 @@ | ||
| 1 | +// c31 — model: types for the SAMA v2 verifier pipeline. Pure data | |
| 2 | +// shapes: the parsed profile (ProfileSpec), the verifier's input | |
| 3 | +// (SamaV2Input), and its output (SamaV2Report). No I/O lives here; | |
| 4 | +// c14_sama_profile parses the .toml into ProfileSpec, c32_sama_v2_verify | |
| 5 | +// applies the seven §4 checks against (ProfileSpec, files), and | |
| 6 | +// c21_handlers_sama renders the SamaV2Report. | |
| 7 | + | |
| 8 | +export type LayerNumber = 0 | 1 | 2 | 3; | |
| 9 | + | |
| 10 | +export interface Sublayer { | |
| 11 | + // Order within the array (in the source profile) = dependency order: | |
| 12 | + // later may import earlier, never the reverse. We carry the index | |
| 13 | + // here so the verifier can compare positions. | |
| 14 | + name: string; | |
| 15 | + prefix: string; | |
| 16 | + index: number; | |
| 17 | +} | |
| 18 | + | |
| 19 | +export interface LayerSpec { | |
| 20 | + // A layer is either flat (an array of prefixes treated as one | |
| 21 | + // sublayer) or subdivided (an ordered list of sublayers with their | |
| 22 | + // own prefixes). The parser normalises flat layers into a single | |
| 23 | + // synthetic sublayer named "default". | |
| 24 | + sublayers: Sublayer[]; | |
| 25 | +} | |
| 26 | + | |
| 27 | +export interface ProfileSpec { | |
| 28 | + samaVersion: string; | |
| 29 | + profile: string; // profile name, e.g. "tdd-md" | |
| 30 | + layers: { | |
| 31 | + 0: LayerSpec; | |
| 32 | + 1: LayerSpec; | |
| 33 | + 2: LayerSpec; | |
| 34 | + 3: LayerSpec; | |
| 35 | + }; | |
| 36 | +} | |
| 37 | + | |
| 38 | +export interface SamaV2Input { | |
| 39 | + profile: ProfileSpec; | |
| 40 | + // Map keyed by repo-relative path (e.g. "src/c11_server.ts") to | |
| 41 | + // file contents. The verifier never reads files itself; the loader | |
| 42 | + // populates this map. | |
| 43 | + files: Map<string, string>; | |
| 44 | +} | |
| 45 | + | |
| 46 | +export interface SamaV2Violation { | |
| 47 | + file: string; | |
| 48 | + detail: string; | |
| 49 | +} | |
| 50 | + | |
| 51 | +export interface SamaV2Check { | |
| 52 | + // Stable IDs matching §4 of the spec. | |
| 53 | + id: 1 | 2 | 3 | 4 | 5 | 6 | 7; | |
| 54 | + // Display name used in the rendered report. | |
| 55 | + name: string; | |
| 56 | + // Property letter / phrase from the spec. | |
| 57 | + property: | |
| 58 | + | "Sorted" | |
| 59 | + | "Architecture" | |
| 60 | + | "Modeled (tests)" | |
| 61 | + | "Modeled (boundary)" | |
| 62 | + | "Atomic" | |
| 63 | + | "Law" | |
| 64 | + | "Consistency"; | |
| 65 | + passed: boolean; | |
| 66 | + examined: number; | |
| 67 | + violations: SamaV2Violation[]; | |
| 68 | + // Free-form note shown alongside the verdict — used for §4.4 where | |
| 69 | + // the profile may declare advisory-only enforcement. | |
| 70 | + note?: string; | |
| 71 | +} | |
| 72 | + | |
| 73 | +export interface SamaV2Report { | |
| 74 | + profile: string; | |
| 75 | + // Total files examined across all checks (matches the count emitted | |
| 76 | + // by the §4.2 Architecture check). | |
| 77 | + examined: number; | |
| 78 | + checks: SamaV2Check[]; | |
| 79 | + overallPassed: boolean; | |
| 80 | +} | |
| 81 | + | |
| 82 | +// Helper used in the verifier and re-exported here so call sites can | |
| 83 | +// type-narrow against the same source: returns the layer number a | |
| 84 | +// file's basename declares, or null if no profile prefix matches. | |
| 85 | +export const declaredLayer = ( | |
| 86 | + path: string, | |
| 87 | + profile: ProfileSpec, | |
| 88 | +): { layer: LayerNumber; sublayer: Sublayer } | null => { | |
| 89 | + const base = path.split("/").pop() ?? path; | |
| 90 | + for (const k of [0, 1, 2, 3] as LayerNumber[]) { | |
| 91 | + const spec = profile.layers[k]; | |
| 92 | + for (const sub of spec.sublayers) { | |
| 93 | + if (base.startsWith(sub.prefix)) return { layer: k, sublayer: sub }; | |
| 94 | + } | |
| 95 | + } | |
| 96 | + return null; | |
| 97 | +}; | |
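To make the lookup concrete, here is a minimal sketch of `declaredLayer` applied to a toy profile. The types and the function body are restated from this file; the `flat` helper, the `demoProfile` object, and the `samaVersion` value are assumptions made for the example, and each layer is given a single prefix to keep it short (the real `sama.profile.toml` maps several prefixes per layer).

```typescript
type LayerNumber = 0 | 1 | 2 | 3;
interface Sublayer { name: string; prefix: string; index: number }
interface LayerSpec { sublayers: Sublayer[] }
interface ProfileSpec {
  samaVersion: string;
  profile: string;
  layers: { 0: LayerSpec; 1: LayerSpec; 2: LayerSpec; 3: LayerSpec };
}

// Restated from c31_sama_v2.ts: first matching prefix wins, scanning
// layers 0..3 in order; null when no prefix matches the basename.
const declaredLayer = (
  path: string,
  profile: ProfileSpec,
): { layer: LayerNumber; sublayer: Sublayer } | null => {
  const base = path.split("/").pop() ?? path;
  for (const k of [0, 1, 2, 3] as LayerNumber[]) {
    for (const sub of profile.layers[k].sublayers) {
      if (base.startsWith(sub.prefix)) return { layer: k, sublayer: sub };
    }
  }
  return null;
};

// Hypothetical helper: a flat layer with one prefix, expressed as the
// single synthetic "default" sublayer described in LayerSpec.
const flat = (prefix: string): LayerSpec => ({
  sublayers: [{ name: "default", prefix, index: 0 }],
});

// Illustrative profile only; the version string is a placeholder.
const demoProfile: ProfileSpec = {
  samaVersion: "2",
  profile: "tdd-md",
  layers: { 0: flat("c31_"), 1: flat("c32_"), 2: flat("c14_"), 3: flat("c21_") },
};

const hit = declaredLayer("src/c14_sama_profile.ts", demoProfile);
const miss = declaredLayer("src/README.md", demoProfile);
console.log(hit?.layer, miss);
```

Only the basename is matched, so the directory a file sits in has no effect on its declared layer.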
src/c31_sxdoc.ts
+14
−0
| @@ -140,3 +140,17 @@ export const emptyDocument = (): SxDocument => ({ | ||
| 140 | 140 | v: SX_DOC_VERSION, |
| 141 | 141 | blocks: [], |
| 142 | 142 | }); |
| 143 | + | |
| 144 | +// Row-shape returned by c13_database.listDocuments. Defined here in | |
| 145 | +// Layer 0 (Pure) per SAMA v2 §1.1 so c51 render code can reference | |
| 146 | +// the type without importing from Layer 2 (Adapter). The Adapter | |
| 147 | +// (c13_database) imports this type to type its own return value. | |
| 148 | +export interface SxDocumentSummary { | |
| 149 | + id: number; | |
| 150 | + slug: string; | |
| 151 | + type: "page" | "post"; | |
| 152 | + title: string; | |
| 153 | + status: "published" | "draft"; | |
| 154 | + primaryTag: string | null; | |
| 155 | + updatedAt: number; | |
| 156 | +} | |
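The reason for moving this type is the import-direction rule: a file may import from its own layer or a lower one, never from a higher one. The sketch below is one rough way to detect such upward edges from import statements. It is not the verifier in `c32_sama_v2_verify.ts`; `LAYER_OF_PREFIX` hardcodes the mapping given in the commit description, and `layerOf`, `upwardEdges`, and the file name `c51_render.ts` are hypothetical.

```typescript
// Prefix -> layer mapping as listed in the commit description.
const LAYER_OF_PREFIX: Record<string, number> = {
  c31_: 0,
  c32_: 1,
  c51_: 1,
  c13_: 2,
  c14_: 2,
  c21_: 3,
  c11_: 3,
};

// Layer declared by a file's basename, or null if no prefix matches.
const layerOf = (file: string): number | null => {
  const base = file.split("/").pop() ?? file;
  for (const [prefix, layer] of Object.entries(LAYER_OF_PREFIX)) {
    if (base.startsWith(prefix)) return layer;
  }
  return null;
};

// Hypothetical check: lists relative imports (`from "./x.ts"`) whose
// target sits in a higher layer than the importing file. Type-only
// imports count too, since they still create a dependency edge.
const upwardEdges = (file: string, source: string): string[] => {
  const from = layerOf(file);
  if (from === null) return [];
  const out: string[] = [];
  for (const m of source.matchAll(/from\s+["']\.\/([^"']+)["']/g)) {
    const target = m[1]!;
    const to = layerOf(target);
    if (to !== null && to > from) out.push(target);
  }
  return out;
};

// Before the move: Layer 1 render code reaches up into Layer 2.
const before = upwardEdges(
  "src/c51_render.ts",
  'import type { SxDocumentSummary } from "./c13_database.ts";',
);
// After the move: the same type comes from Layer 0.
const after = upwardEdges(
  "src/c51_render.ts",
  'import type { SxDocumentSummary } from "./c31_sxdoc.ts";',
);
console.log(before, after);
```

A regex over import lines is enough for a sketch; it ignores dynamic imports and re-exports, which a real check would need to consider.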
src/c32_judge.test.ts
+0
−69
| @@ -1,69 +0,0 @@ | ||
| 1 | -// Sibling test for c32_judge.ts. The orchestrator itself (judge()) does | |
| 2 | -// git clone + test execution and isn't unit-testable without a real | |
| 3 | -// agent repo; the pure helpers underneath it (applyMode, explainRefactor) | |
| 4 | -// are the structural surface that matters for scoring decisions. Cover | |
| 5 | -// the mode-aware penalty math + the operator-facing explanations here. | |
| 6 | - | |
| 7 | -import { describe, test, expect } from "bun:test"; | |
| 8 | -import { applyMode, explainRefactor, judge } from "./c32_judge.ts"; | |
| 9 | - | |
| 10 | -describe("c32_judge — applyMode (mode-aware penalty math)", () => { | |
| 11 | - test("positive deltas pass through unchanged in every mode", () => { | |
| 12 | - expect(applyMode(10, "strict")).toBe(10); | |
| 13 | - expect(applyMode(10, "pragmatic")).toBe(10); | |
| 14 | - expect(applyMode(10, "learning")).toBe(10); | |
| 15 | - }); | |
| 16 | - | |
| 17 | - test("strict mode keeps the full negative penalty", () => { | |
| 18 | - expect(applyMode(-20, "strict")).toBe(-20); | |
| 19 | - expect(applyMode(-5, "strict")).toBe(-5); | |
| 20 | - }); | |
| 21 | - | |
| 22 | - test("pragmatic mode halves negative deltas (Math.ceil — never below half)", () => { | |
| 23 | - expect(applyMode(-20, "pragmatic")).toBe(-10); | |
| 24 | - expect(applyMode(-10, "pragmatic")).toBe(-5); | |
| 25 | - // -5 / 2 = -2.5 → Math.ceil(-2.5) = -2: the harsher half rounds up | |
| 26 | - // toward zero, which is the documented "softer score" behaviour. | |
| 27 | - expect(applyMode(-5, "pragmatic")).toBe(-2); | |
| 28 | - }); | |
| 29 | - | |
| 30 | - test("learning mode zeroes out every negative delta", () => { | |
| 31 | - expect(applyMode(-20, "learning")).toBe(0); | |
| 32 | - expect(applyMode(-5, "learning")).toBe(0); | |
| 33 | - expect(applyMode(-1, "learning")).toBe(0); | |
| 34 | - }); | |
| 35 | - | |
| 36 | - test("zero delta is neutral in every mode", () => { | |
| 37 | - expect(applyMode(0, "strict")).toBe(0); | |
| 38 | - expect(applyMode(0, "pragmatic")).toBe(0); | |
| 39 | - expect(applyMode(0, "learning")).toBe(0); | |
| 40 | - }); | |
| 41 | -}); | |
| 42 | - | |
| 43 | -describe("c32_judge — explainRefactor", () => { | |
| 44 | - test("passed=true returns the canonical-refactor explanation", () => { | |
| 45 | - const s = explainRefactor(true); | |
| 46 | - expect(s).toContain("stayed green"); | |
| 47 | - expect(s).toMatch(/canonical/i); | |
| 48 | - }); | |
| 49 | - | |
| 50 | - test("passed=false returns guidance to revert or open a new red→green", () => { | |
| 51 | - const s = explainRefactor(false); | |
| 52 | - expect(s).toContain("broke"); | |
| 53 | - expect(s).toMatch(/revert|red→green/); | |
| 54 | - }); | |
| 55 | - | |
| 56 | - test("the two branches return different strings", () => { | |
| 57 | - expect(explainRefactor(true)).not.toBe(explainRefactor(false)); | |
| 58 | - }); | |
| 59 | -}); | |
| 60 | - | |
| 61 | -describe("c32_judge — orchestrator entry point", () => { | |
| 62 | - test("judge is exported as an async function (Promise-returning)", () => { | |
| 63 | - expect(typeof judge).toBe("function"); | |
| 64 | - // The orchestrator does git clone + test execution; covering it | |
| 65 | - // end-to-end needs a real agent repo. A type-level check that the | |
| 66 | - // shape didn't drift is the documented minimum for this layer. | |
| 67 | - expect(judge.length).toBe(2); | |
| 68 | - }); | |
| 69 | -}); | |
src/c32_judge.ts
+0
−370
| @@ -1,370 +0,0 @@ | ||
| 1 | -import { mkdtempSync, rmSync } from "fs"; | |
| 2 | -import { join } from "path"; | |
| 3 | -import { tmpdir } from "os"; | |
| 4 | -import { parseCommit, type Phase } from "./c31_commits.ts"; | |
| 5 | -import { saveRun, type Verdict, type StepVerdict, type RefactorVerdict, type Mode } from "./c13_database.ts"; | |
| 6 | -import { loadGame, type Game } from "./c31_games.ts"; | |
| 7 | - | |
| 8 | -type TestRunner = "bun" | "none"; | |
| 9 | - | |
| 10 | -interface TddConfig { | |
| 11 | - mode: Mode; | |
| 12 | - testRunner: TestRunner; | |
| 13 | -} | |
| 14 | - | |
| 15 | -// tdd.config.json from the agent's repo selects the scoring mode and | |
| 16 | -// test runner. Falls back to strict / bun when missing or unparseable. | |
| 17 | -// | |
| 18 | -// { "mode": "pragmatic", "test_runner": "none" } | |
| 19 | -// | |
| 20 | -// test_runner: "none" enables trace-only judging — no checkout, no test | |
| 21 | -// execution. Useful as a CI gate on projects where Bun can't run the | |
| 22 | -// suite (e.g. .NET, Python without bun-compat tests). | |
| 23 | -const readConfig = async (cwd: string): Promise<TddConfig> => { | |
| 24 | - const file = Bun.file(join(cwd, "tdd.config.json")); | |
| 25 | - let mode: Mode = "strict"; | |
| 26 | - let testRunner: TestRunner = "bun"; | |
| 27 | - if (await file.exists()) { | |
| 28 | - try { | |
| 29 | - const cfg = (await file.json()) as { mode?: string; test_runner?: string }; | |
| 30 | - if (cfg.mode === "pragmatic" || cfg.mode === "learning") mode = cfg.mode; | |
| 31 | - if (cfg.test_runner === "none") testRunner = "none"; | |
| 32 | - } catch { | |
| 33 | - // best effort — bad config falls back to defaults | |
| 34 | - } | |
| 35 | - } | |
| 36 | - return { mode, testRunner }; | |
| 37 | -}; | |
| 38 | - | |
| 39 | -// Penalty halving for pragmatic, zeroing for learning. Positive deltas | |
| 40 | -// are unchanged across modes — earned credit is earned credit. | |
| 41 | -export const applyMode = (delta: number, mode: Mode): number => { | |
| 42 | - if (delta >= 0) return delta; | |
| 43 | - if (mode === "learning") return 0; | |
| 44 | - if (mode === "pragmatic") return Math.ceil(delta / 2); | |
| 45 | - return delta; | |
| 46 | -}; | |
| 47 | - | |
| 48 | -// Plain-language summary of a step verdict, written to the agent (not | |
| 49 | -// the human admin). One short paragraph; named intentionally so callers | |
| 50 | -// can see it next to the row in the score table. | |
| 51 | -const explainStep = (params: { | |
| 52 | - status: StepVerdict["status"]; | |
| 53 | - redSha: string | null; | |
| 54 | - greenSha: string | null; | |
| 55 | - hiddenPassed: boolean | null; | |
| 56 | - mode: Mode; | |
| 57 | -}): string => { | |
| 58 | - const { status, hiddenPassed, mode } = params; | |
| 59 | - switch (status) { | |
| 60 | - case "verified": | |
| 61 | - return "Red failed as expected, green passes your tests, and the kata's hidden tests confirm the implementation matches the requirement."; | |
| 62 | - case "discipline-only": | |
| 63 | - return "Red→green discipline holds, but this kata didn't ship hidden tests for the step. Partial credit awarded; full +20 isn't possible without authoritative verification."; | |
| 64 | - case "no-green": | |
| 65 | - return "Red commit landed; the matching green(<step>) commit hasn't been pushed yet. Push your green to lock in the score."; | |
| 66 | - case "red-did-not-fail": | |
| 67 | - return mode === "pragmatic" | |
| 68 | - ? "Combined red+green commit detected. Pragmatic mode allows this — the cycle still counts, just with a softer score than a clean separation." | |
| 69 | - : "Red commit's tests already passed when the step was first introduced — meaning the implementation was added before the test, or the test is tautological. Switch to pragmatic mode if you commit red+green together intentionally."; | |
| 70 | - case "green-did-not-pass": | |
| 71 | - return "Green commit's own tests still fail. The implementation doesn't yet satisfy the test you wrote — fix the impl, or reconsider whether the test reflects the requirement."; | |
| 72 | - case "hidden-tests-failed": | |
| 73 | - return hiddenPassed === false | |
| 74 | - ? "Your tests pass, but the kata's hidden tests don't — this is the classic tautology trap. Tighten your test to mirror the requirement (e.g., assert the actual return value, not just that it runs)." | |
| 75 | - : "Your tests pass, but hidden verification was inconclusive. Re-push to retry."; | |
| 76 | - case "test-deleted": | |
| 77 | - return "Test count dropped between red and green for this step. Once a test exists it must keep existing — refactor it, don't delete it. If the test was wrong, replace it in a separate commit before resuming the cycle."; | |
| 78 | - case "trace-verified": | |
| 79 | - return "Trace-only mode: red→green pair found in the commit log. Tests weren't executed (test_runner: \"none\"). Switch to bun runner for behaviour verification."; | |
| 80 | - case "trace-tests-shrunk": | |
| 81 | - return "Trace-only mode: the green commit's tree has fewer test files than the red commit's tree — looks like deletion. If you renamed or split test files, the tally still drops."; | |
| 82 | - } | |
| 83 | -}; | |
| 84 | - | |
| 85 | -export const explainRefactor = (passed: boolean): string => | |
| 86 | - passed | |
| 87 | - ? "Tests stayed green through the refactor — structural change without behavior change, the canonical refactor." | |
| 88 | - : "Refactor commit broke at least one test. Either revert the refactor or write a new red→green to capture the changed behavior."; | |
| 89 | - | |
| 90 | -const FORGEJO_INTERNAL = process.env.FORGEJO_URL ?? "https://git.tdd.md"; | |
| 91 | -const TEST_TIMEOUT_MS = 8000; | |
| 92 | - | |
| 93 | -// Sandboxed env passed to git and bun subprocesses. Strips every secret | |
| 94 | -// from the parent process — agent code never sees FORGEJO_ADMIN_TOKEN, | |
| 95 | -// GITHUB_CLIENT_SECRET, or SESSION_SECRET. PATH is fixed; HOME and TMPDIR | |
| 96 | -// stay inside the per-run temp dir so dotfile writes can't escape. | |
| 97 | -const sandboxEnv = (cwd: string): Record<string, string> => ({ | |
| 98 | - PATH: "/usr/local/bin:/usr/bin:/bin", | |
| 99 | - HOME: cwd, | |
| 100 | - TMPDIR: cwd, | |
| 101 | - NODE_ENV: "test", | |
| 102 | -}); | |
| 103 | - | |
| 104 | -const runProc = async ( | |
| 105 | - cmd: string[], | |
| 106 | - cwd: string, | |
| 107 | - timeoutMs: number, | |
| 108 | -): Promise<{ stdout: string; stderr: string; exitCode: number; timedOut: boolean }> => { | |
| 109 | - const proc = Bun.spawn(cmd, { | |
| 110 | - cwd, | |
| 111 | - stdout: "pipe", | |
| 112 | - stderr: "pipe", | |
| 113 | - env: sandboxEnv(cwd), | |
| 114 | - }); | |
| 115 | - let timedOut = false; | |
| 116 | - const timer = setTimeout(() => { | |
| 117 | - timedOut = true; | |
| 118 | - proc.kill("SIGKILL"); | |
| 119 | - }, timeoutMs); | |
| 120 | - const exitCode = await proc.exited; | |
| 121 | - clearTimeout(timer); | |
| 122 | - const stdout = await new Response(proc.stdout).text(); | |
| 123 | - const stderr = await new Response(proc.stderr).text(); | |
| 124 | - return { stdout: stdout.trim(), stderr: stderr.trim(), exitCode, timedOut }; | |
| 125 | -}; | |
| 126 | - | |
| 127 | -const runTests = async (cwd: string): Promise<boolean> => { | |
| 128 | - const r = await runProc(["bun", "test"], cwd, TEST_TIMEOUT_MS); | |
| 129 | - // Bun test exits 0 only when all tests pass. | |
| 130 | - return !r.timedOut && r.exitCode === 0; | |
| 131 | -}; | |
| 132 | - | |
| 133 | -// Language-agnostic test-file counter for trace-only mode. Uses git | |
| 134 | -// ls-tree at the given sha so we don't have to checkout the working | |
| 135 | -// tree. Matches conventional test-file naming across ecosystems: | |
| 136 | -// foo.test.ts, foo.spec.ts, FooTests.cs, FooTest.java, test_foo.py, | |
| 137 | -// foo_test.go, and any file directly under a test/ or tests/ directory. | |
| 138 | -const countTestFiles = async (cwd: string, sha: string): Promise<number> => { | |
| 139 | - const r = await runProc(["git", "ls-tree", "-r", "--name-only", sha], cwd, 5000); | |
| 140 | - if (r.exitCode !== 0) return 0; | |
| 141 | - const re = /(?:^|\/)(?:[^/]*\.(?:test|spec)\.[a-z]+|[Tt]ests?\/[^/]+|test_[^/]+|[^/]+_test\.[a-z]+|[^/]+[Tt]ests?\.cs|[^/]+[Tt]est\.java)$/; | |
| 142 | - let count = 0; | |
| 143 | - for (const line of r.stdout.split("\n")) { | |
| 144 | - if (re.test(line)) count++; | |
| 145 | - } | |
| 146 | - return count; | |
| 147 | -}; | |
| 148 | - | |
| 149 | -// Count `test(` / `it(` calls in tracked *.test.ts files. Used to detect | |
| 150 | -// when an agent deletes tests between red and green to make a regression | |
| 151 | -// "pass" — a cardinal TDD sin per the kata spec. | |
| 152 | -const countTests = async (cwd: string): Promise<number> => { | |
| 153 | - const r = await runProc(["git", "ls-files", "*.test.ts"], cwd, 5000); | |
| 154 | - if (r.exitCode !== 0) return 0; | |
| 155 | - const files = r.stdout.split("\n").filter((f) => f && !f.includes("__hidden_")); | |
| 156 | - let count = 0; | |
| 157 | - for (const f of files) { | |
| 158 | - const content = await Bun.file(join(cwd, f)) | |
| 159 | - .text() | |
| 160 | - .catch(() => ""); | |
| 161 | - const matches = content.match(/\b(?:test|it)\s*\(/g); | |
| 162 | - if (matches) count += matches.length; | |
| 163 | - } | |
| 164 | - return count; | |
| 165 | -}; | |
| 166 | - | |
| 167 | -// Runs the kata's authoritative tests against the agent's implementation | |
| 168 | -// at whatever commit is currently checked out. Copies the hidden test | |
| 169 | -// file into the working tree under a __hidden_<stepId>__ name so it doesn't | |
| 170 | -// collide with the agent's filenames, runs only that file, then deletes | |
| 171 | -// it. Returns null if the kata doesn't have hidden tests for this step. | |
| 172 | -const runHiddenTests = async (cwd: string, spec: Game, stepId: string): Promise<boolean | null> => { | |
| 173 | - const stepDef = spec.steps.find((s) => s.id === stepId); | |
| 174 | - if (!stepDef) return null; | |
| 175 | - const sourcePath = `./content/games/${spec.id}/${stepDef.hiddenTestFile}`; | |
| 176 | - const sourceFile = Bun.file(sourcePath); | |
| 177 | - if (!(await sourceFile.exists())) return null; | |
| 178 | - const content = await sourceFile.text(); | |
| 179 | - const targetName = `__hidden_${stepId}__.test.ts`; | |
| 180 | - const targetPath = join(cwd, targetName); | |
| 181 | - await Bun.write(targetPath, content); | |
| 182 | - try { | |
| 183 | - const r = await runProc(["bun", "test", targetName], cwd, TEST_TIMEOUT_MS); | |
| 184 | - return !r.timedOut && r.exitCode === 0; | |
| 185 | - } finally { | |
| 186 | - try { | |
| 187 | - rmSync(targetPath, { force: true }); | |
| 188 | - } catch { | |
| 189 | - // best effort | |
| 190 | - } | |
| 191 | - } | |
| 192 | -}; | |
| 193 | - | |
| 194 | -interface CommitInfo { | |
| 195 | - sha: string; | |
| 196 | - phase: Phase; | |
| 197 | - step: string | null; | |
| 198 | -} | |
| 199 | - | |
| 200 | -const readCommits = async (cwd: string): Promise<CommitInfo[]> => { | |
| 201 | - const r = await runProc(["git", "log", "--reverse", "--pretty=format:%H%x1f%B%x1e"], cwd, 10000); | |
| 202 | - if (r.exitCode !== 0) return []; | |
| 203 | - const out: CommitInfo[] = []; | |
| 204 | - for (const block of r.stdout.split("\x1e")) { | |
| 205 | - const t = block.trim(); | |
| 206 | - if (!t) continue; | |
| 207 | - const [sha, message = ""] = t.split("\x1f"); | |
| 208 | - if (!sha) continue; | |
| 209 | - const p = parseCommit(message); | |
| 210 | - out.push({ sha, phase: p.phase, step: p.step }); | |
| 211 | - } | |
| 212 | - return out; | |
| 213 | -}; | |
| 214 | - | |
| 215 | -export const judge = async (owner: string, repo: string): Promise<Verdict> => { | |
| 216 | - const cwd = mkdtempSync(join(tmpdir(), `judge-${owner}-${repo}-`)); | |
| 217 | - try { | |
| 218 | - // Agent repos default to private. Authenticate via admin token in | |
| 219 | - // an http.extraheader so the token isn't persisted in the cloned | |
| 220 | - // repo's config (extraheader applies to the clone request only). | |
| 221 | - const cloneUrl = `${FORGEJO_INTERNAL}/${owner}/${repo}.git`; | |
| 222 | - const adminToken = process.env.FORGEJO_ADMIN_TOKEN; | |
| 223 | - const gitArgs = adminToken | |
| 224 | - ? ["-c", `http.extraheader=Authorization: token ${adminToken}`, "clone", "--quiet", cloneUrl, "."] | |
| 225 | - : ["clone", "--quiet", cloneUrl, "."]; | |
| 226 | - const cloneR = await runProc(["git", ...gitArgs], cwd, 30000); | |
| 227 | - if (cloneR.exitCode !== 0) { | |
| 228 | - throw new Error(`clone failed: ${cloneR.stderr || cloneR.stdout}`); | |
| 229 | - } | |
| 230 | - | |
| 231 | - const commits = await readCommits(cwd); | |
| 232 | - const headR = await runProc(["git", "rev-parse", "HEAD"], cwd, 5000); | |
| 233 | - const headSha = headR.stdout; | |
| 234 | - | |
| 235 | - // First red per step + first green-after-red per step (chronological). | |
| 236 | - const stepRed = new Map<string, string>(); | |
| 237 | - const stepGreen = new Map<string, string>(); | |
| 238 | - for (const c of commits) { | |
| 239 | - if (!c.step) continue; | |
| 240 | - if (c.phase === "red" && !stepRed.has(c.step)) { | |
| 241 | - stepRed.set(c.step, c.sha); | |
| 242 | - } else if (c.phase === "green" && stepRed.has(c.step) && !stepGreen.has(c.step)) { | |
| 243 | - stepGreen.set(c.step, c.sha); | |
| 244 | - } | |
| 245 | - } | |
| 246 | - | |
| 247 | - // Read the agent's mode + runner preferences from tdd.config.json. | |
| 248 | - const { mode, testRunner } = await readConfig(cwd); | |
| 249 | - | |
| 250 | - // Load the kata's authoritative spec — used to fetch hidden tests | |
| 251 | - // per step. Repos that don't match a known kata get scored on red→green | |
| 252 | - // discipline only (no hidden-test verification). | |
| 253 | - let spec: Game | null = null; | |
| 254 | - try { | |
| 255 | - spec = await loadGame(repo); | |
| 256 | - } catch { | |
| 257 | - spec = null; | |
| 258 | - } | |
| 259 | - | |
| 260 | - const steps: StepVerdict[] = []; | |
| 261 | - for (const [stepId, redSha] of stepRed) { | |
| 262 | - const greenSha = stepGreen.get(stepId) ?? null; | |
| 263 | - | |
| 264 | - if (testRunner === "none") { | |
| 265 | - // Trace-only path: don't checkout, don't run anything. Score | |
| 266 | - // purely from the commit log + a language-agnostic test-file | |
| 267 | - // count via `git ls-tree`. Useful for non-Bun projects. | |
| 268 | - const redFiles = await countTestFiles(cwd, redSha); | |
| 269 | - const greenFiles = greenSha ? await countTestFiles(cwd, greenSha) : redFiles; | |
| 270 | - const filesShrank = greenSha !== null && greenFiles < redFiles; | |
| 271 | - | |
| 272 | - let status: StepVerdict["status"]; | |
| 273 | - let baseDelta = 0; | |
| 274 | - if (greenSha === null) { | |
| 275 | - status = "no-green"; | |
| 276 | - } else if (filesShrank) { | |
| 277 | - status = "trace-tests-shrunk"; | |
| 278 | - baseDelta = -10; | |
| 279 | - } else { | |
| 280 | - status = "trace-verified"; | |
| 281 | - baseDelta = 10; | |
| 282 | - } | |
| 283 | - const scoreDelta = applyMode(baseDelta, mode); | |
| 284 | - const explanation = explainStep({ status, redSha, greenSha, hiddenPassed: null, mode }); | |
| 285 | - steps.push({ | |
| 286 | - stepId, redSha, greenSha, | |
| 287 | - redFailed: null, greenPassed: null, hiddenPassed: null, | |
| 288 | - status, scoreDelta, explanation, | |
| 289 | - }); | |
| 290 | - continue; | |
| 291 | - } | |
| 292 | - | |
| 293 | - await runProc(["git", "checkout", "--quiet", redSha], cwd, 5000); | |
| 294 | - const redTestCount = await countTests(cwd); | |
| 295 | - const redPassed = await runTests(cwd); | |
| 296 | - const redFailed = !redPassed; | |
| 297 | - let greenPassed: boolean | null = null; | |
| 298 | - let hiddenPassed: boolean | null = null; | |
| 299 | - let testsDeleted = false; | |
| 300 | - if (greenSha) { | |
| 301 | - await runProc(["git", "checkout", "--quiet", greenSha], cwd, 5000); | |
| 302 | - const greenTestCount = await countTests(cwd); | |
| 303 | - testsDeleted = greenTestCount < redTestCount; | |
| 304 | - greenPassed = await runTests(cwd); | |
| 305 | - if (greenPassed && spec && !testsDeleted) { | |
| 306 | - hiddenPassed = await runHiddenTests(cwd, spec, stepId); | |
| 307 | - } | |
| 308 | - } | |
| 309 | - | |
| 310 | - let status: StepVerdict["status"]; | |
| 311 | - let baseDelta = 0; | |
| 312 | - if (greenSha === null) { | |
| 313 | - status = "no-green"; | |
| 314 | - } else if (testsDeleted) { | |
| 315 | - status = "test-deleted"; | |
| 316 | - baseDelta = -20; | |
| 317 | - } else if (!redFailed) { | |
| 318 | - status = "red-did-not-fail"; | |
| 319 | - baseDelta = -5; | |
| 320 | - } else if (greenPassed === false) { | |
| 321 | - status = "green-did-not-pass"; | |
| 322 | - baseDelta = -5; | |
| 323 | - } else if (hiddenPassed === false) { | |
| 324 | - status = "hidden-tests-failed"; | |
| 325 | - baseDelta = 0; | |
| 326 | - } else if (hiddenPassed === true) { | |
| 327 | - status = "verified"; | |
| 328 | - baseDelta = 20; | |
| 329 | - } else { | |
| 330 | - status = "discipline-only"; | |
| 331 | - baseDelta = 5; | |
| 332 | - } | |
| 333 | - const scoreDelta = applyMode(baseDelta, mode); | |
| 334 | - const explanation = explainStep({ status, redSha, greenSha, hiddenPassed, mode }); | |
| 335 | - steps.push({ stepId, redSha, greenSha, redFailed, greenPassed, hiddenPassed, status, scoreDelta, explanation }); | |
| 336 | - } | |
| 337 | - | |
| 338 | - // Refactor commits aren't tied to red→green pairs: the spec rewards | |
| 339 | - // any refactor that keeps the existing tests green. A broken refactor | |
| 340 | - // (tests fail at the refactor commit) costs the same as a missed | |
| 341 | - // green — discipline matters even outside red→green pairs. | |
| 342 | - const refactors: RefactorVerdict[] = []; | |
| 343 | - for (const c of commits) { | |
| 344 | - if (c.phase !== "refactor") continue; | |
| 345 | - await runProc(["git", "checkout", "--quiet", c.sha], cwd, 5000); | |
| 346 | - const passed = await runTests(cwd); | |
| 347 | - const baseDelta = passed ? 5 : -5; | |
| 348 | - refactors.push({ | |
| 349 | - sha: c.sha, | |
| 350 | - stepId: c.step, | |
| 351 | - testsPassed: passed, | |
| 352 | - scoreDelta: applyMode(baseDelta, mode), | |
| 353 | - explanation: explainRefactor(passed), | |
| 354 | - }); | |
| 355 | - } | |
| 356 | - | |
| 357 | - const totalScore = | |
| 358 | - steps.reduce((a, s) => a + s.scoreDelta, 0) + | |
| 359 | - refactors.reduce((a, r) => a + r.scoreDelta, 0); | |
| 360 | - const verdict: Verdict = { headSha, mode, steps, refactors, totalScore, judgedAt: Date.now() }; | |
| 361 | - saveRun(owner, repo, verdict); | |
| 362 | - return verdict; | |
| 363 | - } finally { | |
| 364 | - try { | |
| 365 | - rmSync(cwd, { recursive: true, force: true }); | |
| 366 | - } catch { | |
| 367 | - // best effort cleanup | |
| 368 | - } | |
| 369 | - } | |
| 370 | -}; | |
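Reviewer note: the status ladder in the removed runner path above is the only pure part of that loop, and it can be read on its own, apart from the `git checkout` and test-execution calls around it. The sketch below isolates it under stated assumptions: `StepFacts` and `classifyStep` are hypothetical names that do not appear in this diff, while the status strings and base deltas are copied from the removed lines. It is an illustration of the decision order, not the repo's implementation.

```typescript
// Hypothetical sketch: the status/baseDelta ladder from the removed runner
// path, separated from the I/O around it. `StepFacts` and `classifyStep`
// are illustrative names, not identifiers from the repo.
type StepStatus =
  | "no-green"
  | "test-deleted"
  | "red-did-not-fail"
  | "green-did-not-pass"
  | "hidden-tests-failed"
  | "verified"
  | "discipline-only";

interface StepFacts {
  hasGreen: boolean;            // greenSha !== null in the removed code
  testsDeleted: boolean;        // green test count < red test count
  redFailed: boolean;           // tests failed at the red commit
  greenPassed: boolean | null;  // null when there is no green commit
  hiddenPassed: boolean | null; // null when hidden tests were not run
}

// The order of the branches matters: the first matching condition wins,
// exactly as in the if/else chain of the removed code.
const classifyStep = (f: StepFacts): { status: StepStatus; baseDelta: number } => {
  if (!f.hasGreen) return { status: "no-green", baseDelta: 0 };
  if (f.testsDeleted) return { status: "test-deleted", baseDelta: -20 };
  if (!f.redFailed) return { status: "red-did-not-fail", baseDelta: -5 };
  if (f.greenPassed === false) return { status: "green-did-not-pass", baseDelta: -5 };
  if (f.hiddenPassed === false) return { status: "hidden-tests-failed", baseDelta: 0 };
  if (f.hiddenPassed === true) return { status: "verified", baseDelta: 20 };
  return { status: "discipline-only", baseDelta: 5 };
};
```

Because deleting tests is checked before anything else that involves a green commit, a step that both deletes tests and passes hidden tests would still score as `test-deleted`.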
src/c32_real_reports.test.ts
+0
−101
| @@ -1,101 +0,0 @@ | ||
| 1 | -// Sibling test for c32_real_reports.ts. buildLiveReports itself fans out | |
| 2 | -// to fetchRepoCommits (network) so its end-to-end shape is covered by | |
| 3 | -// the live /reports/live route. The pure helpers underneath — agent | |
| 4 | -// attribution from commit messages, and the 30-day daily sparkline — | |
| 5 | -// are unit-testable here. | |
| 6 | - | |
| 7 | -import { describe, test, expect } from "bun:test"; | |
| 8 | -import { | |
| 9 | - detectAgent, | |
| 10 | - buildTrend, | |
| 11 | - buildLiveReports, | |
| 12 | -} from "./c32_real_reports.ts"; | |
| 13 | -import type { GithubCommit } from "./c14_github.ts"; | |
| 14 | - | |
| 15 | -const mkCommit = (date: string, message = ""): GithubCommit => ({ | |
| 16 | - sha: "0".repeat(40), | |
| 17 | - commit: { | |
| 18 | - message, | |
| 19 | - author: { name: "test", email: "[email protected]", date }, | |
| 20 | - committer: { name: "test", email: "[email protected]", date }, | |
| 21 | - }, | |
| 22 | - author: null, | |
| 23 | - committer: null, | |
| 24 | -} as unknown as GithubCommit); | |
| 25 | - | |
| 26 | -describe("c32_real_reports — detectAgent", () => { | |
| 27 | - test("recognises a Claude Code commit via Co-Authored-By: Claude", () => { | |
| 28 | - expect(detectAgent("Add feature\n\nCo-Authored-By: Claude <noreply>")).toBe("claude-code"); | |
| 29 | - }); | |
| 30 | - | |
| 31 | - test("recognises a Cursor commit", () => { | |
| 32 | - expect(detectAgent("Fix bug\n\nCo-Authored-By: Cursor <[email protected]>")).toBe("cursor"); | |
| 33 | - }); | |
| 34 | - | |
| 35 | - test("recognises an Aider commit", () => { | |
| 36 | - expect(detectAgent("Refactor x\n\nCo-Authored-By: aider")).toBe("aider"); | |
| 37 | - }); | |
| 38 | - | |
| 39 | - test("returns unknown when no recognised footer is present", () => { | |
| 40 | - expect(detectAgent("Just a commit")).toBe("unknown"); | |
| 41 | - expect(detectAgent("")).toBe("unknown"); | |
| 42 | - }); | |
| 43 | - | |
| 44 | - test("the regex is case-insensitive on the agent token", () => { | |
| 45 | - expect(detectAgent("Co-Authored-By: CLAUDE")).toBe("claude-code"); | |
| 46 | - expect(detectAgent("co-authored-by: CURSOR")).toBe("cursor"); | |
| 47 | - }); | |
| 48 | -}); | |
| 49 | - | |
| 50 | -describe("c32_real_reports — buildTrend (30-day daily sparkline)", () => { | |
| 51 | - // Use today (UTC) as the anchor — the function compares against UTC | |
| 52 | - // midnight, so we need ISO strings that fall on the right days. | |
| 53 | - const today = new Date(); | |
| 54 | - today.setUTCHours(0, 0, 0, 0); | |
| 55 | - const iso = (daysAgo: number): string => { | |
| 56 | - const d = new Date(today.getTime() - daysAgo * 24 * 60 * 60 * 1000); | |
| 57 | - return d.toISOString(); | |
| 58 | - }; | |
| 59 | - | |
| 60 | - test("returns an array of `days` length", () => { | |
| 61 | - expect(buildTrend([], 30)).toHaveLength(30); | |
| 62 | - expect(buildTrend([], 7)).toHaveLength(7); | |
| 63 | - }); | |
| 64 | - | |
| 65 | - test("empty input flat-lines at zero", () => { | |
| 66 | - const trend = buildTrend([], 7); | |
| 67 | - expect(trend.every((n) => n === 0)).toBe(true); | |
| 68 | - }); | |
| 69 | - | |
| 70 | - test("a single commit today increments the last bucket", () => { | |
| 71 | - const trend = buildTrend([mkCommit(iso(0))], 7); | |
| 72 | - expect(trend[trend.length - 1]).toBe(1); | |
| 73 | - expect(trend.slice(0, -1).every((n) => n === 0)).toBe(true); | |
| 74 | - }); | |
| 75 | - | |
| 76 | - test("multiple commits on the same day stack in the same bucket", () => { | |
| 77 | - const trend = buildTrend([mkCommit(iso(0)), mkCommit(iso(0)), mkCommit(iso(0))], 7); | |
| 78 | - expect(trend[trend.length - 1]).toBe(3); | |
| 79 | - }); | |
| 80 | - | |
| 81 | - test("commits older than the window are dropped", () => { | |
| 82 | - const trend = buildTrend([mkCommit(iso(99))], 7); | |
| 83 | - expect(trend.every((n) => n === 0)).toBe(true); | |
| 84 | - }); | |
| 85 | - | |
| 86 | - test("a commit `daysAgo` lands at index `days - 1 - daysAgo`", () => { | |
| 87 | - const trend = buildTrend([mkCommit(iso(2))], 7); | |
| 88 | - // index 6 = today, 5 = yesterday, 4 = 2 days ago | |
| 89 | - expect(trend[4]).toBe(1); | |
| 90 | - }); | |
| 91 | -}); | |
| 92 | - | |
| 93 | -describe("c32_real_reports — orchestrator entry point", () => { | |
| 94 | - test("buildLiveReports is exported as an async function", () => { | |
| 95 | - expect(typeof buildLiveReports).toBe("function"); | |
| 96 | - // End-to-end coverage lives on /reports/live; this is the structural | |
| 97 | - // smoke that the export shape didn't drift. `.length` counts only | |
| 98 | - // non-default params (owner, repo) — perPage carries a default. | |
| 99 | - expect(buildLiveReports.length).toBe(2); | |
| 100 | - }); | |
| 101 | -}); | |
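Reviewer note: the smoke test above leans on the fact that `Function.length` stops counting at the first parameter with a default value. A minimal sketch of that behaviour, using hypothetical functions `fetchLike` and `allRequired` that are not part of the repo:

```typescript
// Hypothetical stand-ins, not functions from the repo.
// `perPage` carries a default, so it is excluded from `.length`.
const fetchLike = async (owner: string, repo: string, perPage = 100): Promise<string> =>
  `${owner}/${repo}?per_page=${perPage}`;

// No defaults: every parameter counts.
const allRequired = (owner: string, repo: string, perPage: number): string =>
  `${owner}/${repo}?per_page=${perPage}`;

const withDefault: number = fetchLike.length;    // 2
const withoutDefault: number = allRequired.length; // 3
```

This is why the assertion pins `2` rather than `3`: dropping the default from `perPage` would change the reported arity even though every call site keeps working.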
src/c32_real_reports.ts
+0
−170
| @@ -1,170 +0,0 @@ | ||
| 1 | -// c32 — logic: aggregate real GitHub commit history into the same | |
| 2 | -// AgentReport / RecentFlagged shape that c51_render_reports renders. | |
| 3 | -// Pure (given fetched commits in, produces report objects out); the | |
| 4 | -// I/O happens in c14_github.fetchRepoCommits which we call here. | |
| 5 | -// | |
| 6 | -// Attribution: Co-Authored-By footers are the agent-attribution channel | |
| 7 | -// the existing tdd.md commit history already uses. Anything without a | |
| 8 | -// recognised footer is bucketed as "unknown" and reported separately — | |
| 9 | -// it's still useful for volume context. | |
| 10 | - | |
| 11 | -import { parseCommit } from "./c31_commits.ts"; | |
| 12 | -import { fetchRepoCommits, type GithubCommit } from "./c14_github.ts"; | |
| 13 | -import type { | |
| 14 | - AgentReport, | |
| 15 | - FailureSlice, | |
| 16 | - RecentFlagged, | |
| 17 | -} from "./c31_reports_demo.ts"; | |
| 18 | - | |
| 19 | -type LiveAgentSlug = AgentReport["slug"] | "unknown"; | |
| 20 | - | |
| 21 | -export const detectAgent = (msg: string): LiveAgentSlug => { | |
| 22 | - if (/Co-Authored-By:.*Claude/i.test(msg)) return "claude-code"; | |
| 23 | - if (/Co-Authored-By:.*Cursor/i.test(msg)) return "cursor"; | |
| 24 | - if (/Co-Authored-By:.*Aider/i.test(msg)) return "aider"; | |
| 25 | - return "unknown"; | |
| 26 | -}; | |
| 27 | - | |
| 28 | -const AGENT_NAMES: Record<AgentReport["slug"], string> = { | |
| 29 | - "claude-code": "Claude Code", | |
| 30 | - cursor: "Cursor", | |
| 31 | - aider: "Aider", | |
| 32 | -}; | |
| 33 | - | |
| 34 | -// 30-day daily commit-count series, oldest → newest. When there are no | |
| 35 | -// commits in a day, that day's value is 0 — the sparkline still renders | |
| 36 | -// but flat-lines, which honestly reflects the data. | |
| 37 | -export const buildTrend = (commits: GithubCommit[], days = 30): number[] => { | |
| 38 | - const out = new Array<number>(days).fill(0); | |
| 39 | - const today = new Date(); | |
| 40 | - today.setUTCHours(0, 0, 0, 0); | |
| 41 | - for (const c of commits) { | |
| 42 | - const d = new Date(c.commit.author.date); | |
| 43 | - d.setUTCHours(0, 0, 0, 0); | |
| 44 | - const ageDays = Math.floor((today.getTime() - d.getTime()) / (24 * 60 * 60 * 1000)); | |
| 45 | - if (ageDays < 0 || ageDays >= days) continue; | |
| 46 | - const idx = days - 1 - ageDays; | |
| 47 | - const cur = out[idx] ?? 0; | |
| 48 | - out[idx] = cur + 1; | |
| 49 | - } | |
| 50 | - return out; | |
| 51 | -}; | |
| 52 | - | |
| 53 | -const buildAgentReport = ( | |
| 54 | - slug: AgentReport["slug"], | |
| 55 | - agentCommits: GithubCommit[], | |
| 56 | - repoSlug: string, | |
| 57 | -): AgentReport => { | |
| 58 | - const tagged = agentCommits.filter((c) => { | |
| 59 | - const phase = parseCommit(c.commit.message).phase; | |
| 60 | - return phase === "red" || phase === "green" || phase === "refactor"; | |
| 61 | - }); | |
| 62 | - const phaseCoveragePct = agentCommits.length === 0 | |
| 63 | - ? 0 | |
| 64 | - : Math.round((tagged.length / agentCommits.length) * 100); | |
| 65 | - | |
| 66 | - // Score is a proxy: phase-coverage is the only structural signal we | |
| 67 | - // can compute without running the test suite. When coverage is 0 the | |
| 68 | - // agent isn't attempting TDD, so the score is honestly low. | |
| 69 | - const score = phaseCoveragePct; | |
| 70 | - | |
| 71 | - // Failure mix collapses to two slices for live data — phase-tagged vs | |
| 72 | - // not. Fine-grained failure modes (red-did-not-fail, test-deleted, etc) | |
| 73 | - // need the runner sliver before they're computable. | |
| 74 | - const failureMix: FailureSlice[] = [ | |
| 75 | - { label: "phase-tagged", pct: phaseCoveragePct, tone: "green" }, | |
| 76 | - { label: "no phase tag", pct: 100 - phaseCoveragePct, tone: "muted" }, | |
| 77 | - ]; | |
| 78 | - | |
| 79 | - const recent: RecentFlagged[] = agentCommits | |
| 80 | - .slice(0, 5) | |
| 81 | - .map((c) => { | |
| 82 | - const parsed = parseCommit(c.commit.message); | |
| 83 | - const phase = parsed.phase === "red" || parsed.phase === "green" || parsed.phase === "refactor" | |
| 84 | - ? parsed.phase | |
| 85 | - : "green"; | |
| 86 | - const failure = parsed.phase === "untagged" || parsed.phase === "init" | |
| 87 | - ? "no phase tag" | |
| 88 | - : `${parsed.phase} (live judge not yet wired)`; | |
| 89 | - return { | |
| 90 | - date: c.commit.author.date.slice(0, 10), | |
| 91 | - repo: repoSlug, | |
| 92 | - sha: c.sha.slice(0, 7), | |
| 93 | - phase, | |
| 94 | - failure, | |
| 95 | - pts: 0, | |
| 96 | - }; | |
| 97 | - }); | |
| 98 | - | |
| 99 | - const topIssueLabel = phaseCoveragePct === 100 ? "no current issues" : "no phase tag"; | |
| 100 | - const topIssuePct = 100 - phaseCoveragePct; | |
| 101 | - | |
| 102 | - return { | |
| 103 | - slug, | |
| 104 | - name: AGENT_NAMES[slug], | |
| 105 | - score, | |
| 106 | - delta: 0, | |
| 107 | - commits: agentCommits.length, | |
| 108 | - phaseCoveragePct, | |
| 109 | - streak: 0, | |
| 110 | - streakBroken: false, | |
| 111 | - topIssueLabel, | |
| 112 | - topIssuePct, | |
| 113 | - failureMix, | |
| 114 | - trend: buildTrend(agentCommits), | |
| 115 | - recent, | |
| 116 | - }; | |
| 117 | -}; | |
| 118 | - | |
| 119 | -export interface LiveReports { | |
| 120 | - reports: AgentReport[]; | |
| 121 | - unknownCount: number; | |
| 122 | - totalCommits: number; | |
| 123 | - earliest: string | null; | |
| 124 | - latest: string | null; | |
| 125 | - fetchedAt: number; | |
| 126 | -} | |
| 127 | - | |
| 128 | -export const buildLiveReports = async ( | |
| 129 | - repoOwner: string, | |
| 130 | - repoName: string, | |
| 131 | - perPage = 100, | |
| 132 | -): Promise<LiveReports> => { | |
| 133 | - const commits = await fetchRepoCommits(repoOwner, repoName, perPage); | |
| 134 | - const repoSlug = `${repoOwner}/${repoName}`; | |
| 135 | - const byAgent = new Map<AgentReport["slug"], GithubCommit[]>(); | |
| 136 | - let unknownCount = 0; | |
| 137 | - | |
| 138 | - for (const c of commits) { | |
| 139 | - const a = detectAgent(c.commit.message); | |
| 140 | - if (a === "unknown") { | |
| 141 | - unknownCount++; | |
| 142 | - continue; | |
| 143 | - } | |
| 144 | - const arr = byAgent.get(a) ?? []; | |
| 145 | - arr.push(c); | |
| 146 | - byAgent.set(a, arr); | |
| 147 | - } | |
| 148 | - | |
| 149 | - const order: AgentReport["slug"][] = ["claude-code", "cursor", "aider"]; | |
| 150 | - const reports = order | |
| 151 | - .map((slug) => { | |
| 152 | - const list = byAgent.get(slug); | |
| 153 | - if (!list || list.length === 0) return null; | |
| 154 | - return buildAgentReport(slug, list, repoSlug); | |
| 155 | - }) | |
| 156 | - .filter((r): r is AgentReport => r !== null); | |
| 157 | - | |
| 158 | - const dates = commits.map((c) => c.commit.author.date).sort(); | |
| 159 | - const earliest = dates[0] ?? null; | |
| 160 | - const latest = dates[dates.length - 1] ?? null; | |
| 161 | - | |
| 162 | - return { | |
| 163 | - reports, | |
| 164 | - unknownCount, | |
| 165 | - totalCommits: commits.length, | |
| 166 | - earliest, | |
| 167 | - latest, | |
| 168 | - fetchedAt: Date.now(), | |
| 169 | - }; | |
| 170 | -}; | |
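Reviewer note: the bucket arithmetic in the removed `buildTrend` reads the clock internally, which is why its sibling test had to anchor on "today". The sketch below shows the same index calculation with the clock passed in, so fixed dates can be used. `DAY_MS`, `utcMidnight`, and `bucketIndex` are hypothetical names introduced for illustration; the invalid-date guard is an addition that the removed code does not have.

```typescript
// Hypothetical sketch of the buildTrend index arithmetic with an
// injectable clock. None of these names exist in the repo.
const DAY_MS = 24 * 60 * 60 * 1000;

// Epoch milliseconds are UTC-based, so flooring to a whole number of days
// yields UTC midnight of the same calendar day.
const utcMidnight = (ms: number): number => Math.floor(ms / DAY_MS) * DAY_MS;

// Returns the slot in an oldest-to-newest array of `days` buckets,
// or null when the commit falls outside the window (or the date is invalid).
const bucketIndex = (commitIso: string, nowMs: number, days: number): number | null => {
  const ageDays = (utcMidnight(nowMs) - utcMidnight(Date.parse(commitIso))) / DAY_MS;
  if (Number.isNaN(ageDays)) return null;            // unparseable date (added guard)
  if (ageDays < 0 || ageDays >= days) return null;   // future commit or too old
  return days - 1 - ageDays;                         // last slot is "today"
};
```

With a 7-day window, a commit from two calendar days ago lands at index 4, matching the expectation in the removed test ("index 6 = today, 5 = yesterday, 4 = 2 days ago").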
src/c32_real_tests.test.ts
+0
−66
| @@ -1,66 +0,0 @@ | ||
| 1 | -// Sibling test for c32_real_tests.ts. buildLiveTestData fans out to | |
| 2 | -// loadTestBundle + fetchRepoCommits (both network/disk) so the | |
| 3 | -// end-to-end is covered by the live /reports/live/tests route. The | |
| 4 | -// pure helpers — agent attribution and the file/name label shortener — | |
| 5 | -// are unit-testable here. | |
| 6 | - | |
| 7 | -import { describe, test, expect } from "bun:test"; | |
| 8 | -import { | |
| 9 | - detectAgent, | |
| 10 | - shortenTestLabel, | |
| 11 | - buildLiveTestData, | |
| 12 | -} from "./c32_real_tests.ts"; | |
| 13 | - | |
| 14 | -describe("c32_real_tests — detectAgent", () => { | |
| 15 | - test("recognises Claude Code via Co-Authored-By: Claude", () => { | |
| 16 | - expect(detectAgent("Add feature\n\nCo-Authored-By: Claude <noreply>")).toBe("claude-code"); | |
| 17 | - }); | |
| 18 | - | |
| 19 | - test("recognises Cursor", () => { | |
| 20 | - expect(detectAgent("Fix bug\n\nCo-Authored-By: Cursor <[email protected]>")).toBe("cursor"); | |
| 21 | - }); | |
| 22 | - | |
| 23 | - test("recognises Aider", () => { | |
| 24 | - expect(detectAgent("Refactor x\n\nCo-Authored-By: aider")).toBe("aider"); | |
| 25 | - }); | |
| 26 | - | |
| 27 | - test("returns null when no recognised footer is present (distinct from c32_real_reports which returns 'unknown')", () => { | |
| 28 | - // The two real_* files made different choices here: real_reports | |
| 29 | - // buckets unknown into its own slug; real_tests returns null so | |
| 30 | - // the caller can filter or fall back. Document the difference. | |
| 31 | - expect(detectAgent("Just a commit")).toBeNull(); | |
| 32 | - expect(detectAgent("")).toBeNull(); | |
| 33 | - }); | |
| 34 | - | |
| 35 | - test("the regex is case-insensitive on the agent token", () => { | |
| 36 | - expect(detectAgent("Co-Authored-By: CLAUDE")).toBe("claude-code"); | |
| 37 | - expect(detectAgent("co-authored-by: aider")).toBe("aider"); | |
| 38 | - }); | |
| 39 | -}); | |
| 40 | - | |
| 41 | -describe("c32_real_tests — shortenTestLabel", () => { | |
| 42 | - test("keeps only the basename of the file path + the test name", () => { | |
| 43 | - expect(shortenTestLabel("src/foo/bar/baz.test.ts", "handles X")).toBe("baz.test.ts > handles X"); | |
| 44 | - }); | |
| 45 | - | |
| 46 | - test("handles a bare filename (no path) without splitting weirdly", () => { | |
| 47 | - expect(shortenTestLabel("baz.test.ts", "handles X")).toBe("baz.test.ts > handles X"); | |
| 48 | - }); | |
| 49 | - | |
| 50 | - test("handles an empty file string (falls back to the empty basename)", () => { | |
| 51 | - // .split('/').pop() on '' yields ''. Documented behaviour: the | |
| 52 | - // helper never throws; the caller decides whether to filter empties. | |
| 53 | - expect(shortenTestLabel("", "name")).toBe(" > name"); | |
| 54 | - }); | |
| 55 | - | |
| 56 | - test("preserves spaces and special chars in the test name", () => { | |
| 57 | - expect(shortenTestLabel("a.ts", "rejects `bad input`")).toBe("a.ts > rejects `bad input`"); | |
| 58 | - }); | |
| 59 | -}); | |
| 60 | - | |
| 61 | -describe("c32_real_tests — orchestrator entry point", () => { | |
| 62 | - test("buildLiveTestData is exported as an async function", () => { | |
| 63 | - expect(typeof buildLiveTestData).toBe("function"); | |
| 64 | - expect(buildLiveTestData.length).toBe(2); | |
| 65 | - }); | |
| 66 | -}); | |
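Reviewer note: the test above documents that the two `detectAgent` copies differ only in their miss value (`null` here, `"unknown"` in the reports module). One possible way to keep a single pattern table and derive both behaviours is sketched below. `AgentSlug`, `AGENT_PATTERNS`, `detectAgentOrNull`, and `detectAgentOrUnknown` are hypothetical names; the regular expressions are copied from the removed code. This is a suggestion, not something this PR does.

```typescript
// Hypothetical sketch: one pattern table, two thin wrappers.
// These names are illustrative and do not exist in the repo.
type AgentSlug = "claude-code" | "cursor" | "aider";

// Same regular expressions as the removed detectAgent copies. The `i` flag
// covers both the footer key and the agent token; no `g` flag, so
// RegExp.prototype.test stays stateless across calls.
const AGENT_PATTERNS: ReadonlyArray<readonly [RegExp, AgentSlug]> = [
  [/Co-Authored-By:.*Claude/i, "claude-code"],
  [/Co-Authored-By:.*Cursor/i, "cursor"],
  [/Co-Authored-By:.*Aider/i, "aider"],
];

// Miss value is null: the caller filters or falls back.
const detectAgentOrNull = (msg: string): AgentSlug | null => {
  for (const [pattern, slug] of AGENT_PATTERNS) {
    if (pattern.test(msg)) return slug;
  }
  return null;
};

// Miss value is "unknown": the caller buckets unattributed commits.
const detectAgentOrUnknown = (msg: string): AgentSlug | "unknown" =>
  detectAgentOrNull(msg) ?? "unknown";
```

Since `.` does not match a newline, the agent name has to sit on the same line as the `Co-Authored-By:` key, which holds for the removed implementations as well.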
src/c32_real_tests.ts
+0
−142
| @@ -1,142 +0,0 @@ | ||
| 1 | -// c32 — logic: aggregate the per-deploy test bundle into the same | |
| 2 | -// TestSnapshot[] / TestStability[] shape that the demo page renders. | |
| 3 | -// HEAD-only snapshots; stability accumulates as more deploys add runs. | |
| 4 | -// | |
| 5 | -// Pure given the bundle + commits in (no I/O of its own beyond delegating | |
| 6 | -// to c14_github's bundle loader and commits fetcher). | |
| 7 | - | |
| 8 | -import { fetchRepoCommits, loadTestBundle, type PlaceholderTest } from "./c14_github.ts"; | |
| 9 | -import type { | |
| 10 | - AgentReport, | |
| 11 | - TestFailure, | |
| 12 | - TestSnapshot, | |
| 13 | - TestStability, | |
| 14 | -} from "./c31_reports_demo.ts"; | |
| 15 | - | |
| 16 | -export const detectAgent = (msg: string): AgentReport["slug"] | null => { | |
| 17 | - if (/Co-Authored-By:.*Claude/i.test(msg)) return "claude-code"; | |
| 18 | - if (/Co-Authored-By:.*Cursor/i.test(msg)) return "cursor"; | |
| 19 | - if (/Co-Authored-By:.*Aider/i.test(msg)) return "aider"; | |
| 20 | - return null; | |
| 21 | -}; | |
| 22 | - | |
| 23 | -export const shortenTestLabel = (file: string, name: string): string => { | |
| 24 | - const base = file.split("/").pop() ?? file; | |
| 25 | - return `${base} > ${name}`; | |
| 26 | -}; | |
| 27 | - | |
| 28 | -export interface LiveTestData { | |
| 29 | - snapshots: TestSnapshot[]; | |
| 30 | - stability: TestStability[]; | |
| 31 | - runsCount: number; | |
| 32 | - ranAt: number | null; | |
| 33 | - headSha: string | null; | |
| 34 | - placeholderTests: PlaceholderTest[]; | |
| 35 | -} | |
| 36 | - | |
| 37 | -export const buildLiveTestData = async ( | |
| 38 | - repoOwner: string, | |
| 39 | - repoName: string, | |
| 40 | -): Promise<LiveTestData> => { | |
| 41 | - const bundle = await loadTestBundle(repoOwner, repoName); | |
| 42 | - if (!bundle || bundle.runs.length === 0) { | |
| 43 | - return { snapshots: [], stability: [], runsCount: 0, ranAt: null, headSha: null, placeholderTests: [] }; | |
| 44 | - } | |
| 45 | - const repoSlug = `${repoOwner}/${repoName}`; | |
| 46 | - const latest = bundle.runs[0]; | |
| 47 | - if (!latest) { | |
| 48 | - return { snapshots: [], stability: [], runsCount: 0, ranAt: null, headSha: null, placeholderTests: [] }; | |
| 49 | - } | |
| 50 | - | |
| 51 | - // For "since" we want the oldest run that has this test as failing. | |
| 52 | - const oldestFirst = [...bundle.runs].sort((a, b) => a.ranAt - b.ranAt); | |
| 53 | - | |
| 54 | - const failures: TestFailure[] = latest.tests | |
| 55 | - .filter((t) => t.status === "fail") | |
| 56 | - .map((t) => { | |
| 57 | - const firstFail = oldestFirst.find((r) => | |
| 58 | - r.tests.some((x) => x.name === t.name && x.file === t.file && x.status === "fail"), | |
| 59 | - ); | |
| 60 | - const sinceTs = firstFail?.ranAt ?? latest.ranAt; | |
| 61 | - return { test: shortenTestLabel(t.file, t.name), since: new Date(sinceTs).toISOString().slice(0, 10) }; | |
| 62 | - }); | |
| 63 | - | |
| 64 | - const snapshot: TestSnapshot = { | |
| 65 | - repo: repoSlug, | |
| 66 | - branch: latest.branch, | |
| 67 | - total: latest.total, | |
| 68 | - passing: latest.passing, | |
| 69 | - failing: latest.failing, | |
| 70 | - failures, | |
| 71 | - }; | |
| 72 | - | |
| 73 | - // Stability: count pass/fail per (file, name) across every run, with | |
| 74 | - // "deleted" set when a previously-seen test is missing from latest. | |
| 75 | - const commits = await fetchRepoCommits(repoOwner, repoName, 100); | |
| 76 | - const shaToAgent = new Map<string, AgentReport["slug"] | null>(); | |
| 77 | - for (const c of commits) shaToAgent.set(c.sha, detectAgent(c.commit.message)); | |
| 78 | - | |
| 79 | - interface Stat { | |
| 80 | - name: string; | |
| 81 | - file: string; | |
| 82 | - pass: number; | |
| 83 | - fail: number; | |
| 84 | - lastBrokenSha: string | null; | |
| 85 | - lastBrokenAt: number; | |
| 86 | - } | |
| 87 | - const stats = new Map<string, Stat>(); | |
| 88 | - for (const run of bundle.runs) { | |
| 89 | - for (const t of run.tests) { | |
| 90 | - const key = `${t.file}|${t.name}`; | |
| 91 | - let s = stats.get(key); | |
| 92 | - if (!s) { | |
| 93 | - s = { name: t.name, file: t.file, pass: 0, fail: 0, lastBrokenSha: null, lastBrokenAt: 0 }; | |
| 94 | - stats.set(key, s); | |
| 95 | - } | |
| 96 | - if (t.status === "pass") s.pass++; | |
| 97 | - else { | |
| 98 | - s.fail++; | |
| 99 | - if (run.ranAt > s.lastBrokenAt) { | |
| 100 | - s.lastBrokenSha = run.sha; | |
| 101 | - s.lastBrokenAt = run.ranAt; | |
| 102 | - } | |
| 103 | - } | |
| 104 | - } | |
| 105 | - } | |
| 106 | - | |
| 107 | - const latestKeys = new Set(latest.tests.map((t) => `${t.file}|${t.name}`)); | |
| 108 | - | |
| 109 | - // lastBrokenBy needs an agent slug; if we can't map a SHA to an agent | |
| 110 | - // (e.g. the commit isn't in the 100-commit window we fetch), fall | |
| 111 | - // back to the agent of the latest run, which is a defensible default | |
| 112 | - // for the dogfood case (one agent producing the history). | |
| 113 | - const fallbackAgent = (shaToAgent.get(latest.sha) ?? "claude-code") as AgentReport["slug"]; | |
| 114 | - | |
| 115 | - const stability: TestStability[] = Array.from(stats.values()) | |
| 116 | - .map<TestStability>((s) => { | |
| 117 | - const mapped = s.lastBrokenSha ? shaToAgent.get(s.lastBrokenSha) : null; | |
| 118 | - const agent = (mapped ?? fallbackAgent) as AgentReport["slug"]; | |
| 119 | - const deleted = latestKeys.has(`${s.file}|${s.name}`) ? 0 : 1; | |
| 120 | - const flagged = s.fail > 0 && (deleted > 0 || s.fail >= Math.max(2, s.pass / 5)); | |
| 121 | - return { | |
| 122 | - test: shortenTestLabel(s.file, s.name), | |
| 123 | - repo: repoSlug, | |
| 124 | - pass: s.pass, | |
| 125 | - fail: s.fail, | |
| 126 | - deleted, | |
| 127 | - lastBrokenBy: agent, | |
| 128 | - flagged, | |
| 129 | - }; | |
| 130 | - }) | |
| 131 | - .sort((a, b) => b.fail - a.fail || b.deleted - a.deleted || b.pass - a.pass) | |
| 132 | - .slice(0, 30); | |
| 133 | - | |
| 134 | - return { | |
| 135 | - snapshots: [snapshot], | |
| 136 | - stability, | |
| 137 | - runsCount: bundle.runs.length, | |
| 138 | - ranAt: latest.ranAt, | |
| 139 | - headSha: latest.sha, | |
| 140 | - placeholderTests: latest.placeholderTests ?? [], | |
| 141 | - }; | |
| 142 | -}; | |
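Reviewer note: the `flagged` expression in the removed stability mapping is compact enough to misread, so here it is as a standalone predicate. `isFlagged` is a hypothetical name; the expression itself is copied from the removed line, with `deleted` taken as a boolean instead of the original 0/1 number.

```typescript
// Hypothetical sketch of the removed `flagged` heuristic as a pure predicate.
// A test is flagged only if it has failed at least once AND either
//   (a) it is missing from the latest run, or
//   (b) its failures reach max(2, 20% of its passes).
const isFlagged = (pass: number, fail: number, deleted: boolean): boolean =>
  fail > 0 && (deleted || fail >= Math.max(2, pass / 5));

const examples = {
  singleFlake: isFlagged(10, 1, false),      // 1 < max(2, 2)   -> false
  repeatedFailure: isFlagged(10, 2, false),  // 2 >= max(2, 2)  -> true
  rareInLongHistory: isFlagged(100, 5, false), // 5 < max(2, 20) -> false
  failedThenDeleted: isFlagged(100, 1, true),  // deleted after a failure -> true
  deletedNeverFailed: isFlagged(5, 0, true),   // no failures -> false
};
```

The floor of 2 keeps a single flaky run from flagging a test with a short history, while the `pass / 5` term raises the bar as the history grows.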
src/c32_sama_v2_verify.test.ts
+247
−0
| @@ -0,0 +1,247 @@ | ||
| 1 | +import { describe, test, expect } from "bun:test"; | |
| 2 | +import { verifySamaV2 } from "./c32_sama_v2_verify.ts"; | |
| 3 | +import type { ProfileSpec, SamaV2Input } from "./c31_sama_v2.ts"; | |
| 4 | + | |
| 5 | +// Minimal fixture profile mirroring the shape this repo's | |
| 6 | +// sama.profile.toml declares, but with synthetic prefixes so tests | |
| 7 | +// don't change when the live profile evolves. | |
| 8 | +const FIXTURE_PROFILE: ProfileSpec = { | |
| 9 | + samaVersion: "2.0", | |
| 10 | + profile: "test-fixture", | |
| 11 | + layers: { | |
| 12 | + 0: { sublayers: [{ name: "default", prefix: "p0_", index: 0 }] }, | |
| 13 | + 1: { | |
| 14 | + sublayers: [ | |
| 15 | + { name: "logic", prefix: "p1a_", index: 0 }, | |
| 16 | + { name: "render", prefix: "p1b_", index: 1 }, | |
| 17 | + ], | |
| 18 | + }, | |
| 19 | + 2: { | |
| 20 | + sublayers: [ | |
| 21 | + { name: "data", prefix: "p2a_", index: 0 }, | |
| 22 | + { name: "io", prefix: "p2b_", index: 1 }, | |
| 23 | + ], | |
| 24 | + }, | |
| 25 | + 3: { | |
| 26 | + sublayers: [ | |
| 27 | + { name: "handlers", prefix: "p3a_", index: 0 }, | |
| 28 | + { name: "server", prefix: "p3b_", index: 1 }, | |
| 29 | + ], | |
| 30 | + }, | |
| 31 | + }, | |
| 32 | +}; | |
| 33 | + | |
| 34 | +const mk = (entries: Array<[string, string]>): SamaV2Input => ({ | |
| 35 | + profile: FIXTURE_PROFILE, | |
| 36 | + files: new Map(entries), | |
| 37 | +}); | |
| 38 | + | |
| 39 | +describe("c32_sama_v2_verify — overall", () => { | |
| 40 | + test("empty repo: every check passes with examined=0 for content-bearing checks", () => { | |
| 41 | + const report = verifySamaV2(mk([])); | |
| 42 | + expect(report.overallPassed).toBe(true); | |
| 43 | + expect(report.checks).toHaveLength(7); | |
| 44 | + for (const c of report.checks) expect(c.passed).toBe(true); | |
| 45 | + }); | |
| 46 | + | |
| 47 | + test("a minimal Layer-0-only repo conforms", () => { | |
| 48 | + const report = verifySamaV2(mk([ | |
| 49 | + ["src/p0_types.ts", "export const x = 1;\n"], | |
| 50 | + ])); | |
| 51 | + expect(report.overallPassed).toBe(true); | |
| 52 | + }); | |
| 53 | +}); | |
| 54 | + | |
| 55 | +describe("c32_sama_v2_verify — Sorted (#1)", () => { | |
| 56 | + test("a file without a profile-recognised prefix is flagged", () => { | |
| 57 | + const report = verifySamaV2(mk([ | |
| 58 | + ["src/unknown_x.ts", "export const x = 1;\n"], | |
| 59 | + ])); | |
| 60 | + const sorted = report.checks.find((c) => c.id === 1)!; | |
| 61 | + expect(sorted.passed).toBe(false); | |
| 62 | + expect(sorted.violations.some((v) => v.file === "src/unknown_x.ts")).toBe(true); | |
| 63 | + }); | |
| 64 | + | |
| 65 | + test("a profile whose prefixes lex-sort against layer order is flagged", () => { | |
| 66 | + // Swap: Layer 0 prefix sorts AFTER Layer 1 prefix. | |
| 67 | + const bad: ProfileSpec = { | |
| 68 | + samaVersion: "2.0", profile: "bad", | |
| 69 | + layers: { | |
| 70 | + 0: { sublayers: [{ name: "default", prefix: "z0_", index: 0 }] }, | |
| 71 | + 1: { sublayers: [{ name: "default", prefix: "a1_", index: 0 }] }, | |
| 72 | + 2: { sublayers: [{ name: "default", prefix: "b2_", index: 0 }] }, | |
| 73 | + 3: { sublayers: [{ name: "default", prefix: "c3_", index: 0 }] }, | |
| 74 | + }, | |
| 75 | + }; | |
| 76 | + const report = verifySamaV2({ profile: bad, files: new Map() }); | |
| 77 | + const sorted = report.checks.find((c) => c.id === 1)!; | |
| 78 | + expect(sorted.passed).toBe(false); | |
| 79 | + expect(sorted.violations.length).toBeGreaterThan(0); | |
| 80 | + }); | |
| 81 | +}); | |
| 82 | + | |
| 83 | +describe("c32_sama_v2_verify — Architecture (#2)", () => { | |
| 84 | + test("an unprefixed src/*.ts file is flagged with a clear reason", () => { | |
| 85 | + const report = verifySamaV2(mk([ | |
| 86 | + ["src/random.ts", "export const x = 1;\n"], | |
| 87 | + ])); | |
| 88 | + const arch = report.checks.find((c) => c.id === 2)!; | |
| 89 | + expect(arch.passed).toBe(false); | |
| 90 | + const vio = arch.violations.find((v) => v.file === "src/random.ts")!; | |
| 91 | + expect(vio.detail).toContain("unprefixed"); | |
| 92 | + }); | |
| 93 | + | |
| 94 | + test("a properly-prefixed file is not flagged", () => { | |
| 95 | + const report = verifySamaV2(mk([ | |
| 96 | + ["src/p1a_logic.ts", "export const x = 1;\n"], | |
| 97 | + ])); | |
| 98 | + expect(report.checks.find((c) => c.id === 2)!.passed).toBe(true); | |
| 99 | + }); | |
| 100 | +}); | |
| 101 | + | |
| 102 | +describe("c32_sama_v2_verify — Modeled tests (#3)", () => { | |
| 103 | + test("a Layer 1 file without a sibling test is flagged", () => { | |
| 104 | + const report = verifySamaV2(mk([ | |
| 105 | + ["src/p1a_logic.ts", "export const x = 1;\n"], | |
| 106 | + ])); | |
| 107 | + const modeled = report.checks.find((c) => c.id === 3)!; | |
| 108 | + expect(modeled.passed).toBe(false); | |
| 109 | + const vio = modeled.violations[0]!; | |
| 110 | + expect(vio.file).toBe("src/p1a_logic.ts"); | |
| 111 | + expect(vio.detail).toContain("p1a_logic.test.ts"); | |
| 112 | + }); | |
| 113 | + | |
| 114 | + test("a Layer 1 file with its sibling passes", () => { | |
| 115 | + const report = verifySamaV2(mk([ | |
| 116 | + ["src/p1a_logic.ts", "export const x = 1;\n"], | |
| 117 | + ["src/p1a_logic.test.ts", "import {expect, test} from \"bun:test\"; test(\"x\", () => { expect(1).toBe(1); });\n"], | |
| 118 | + ])); | |
| 119 | + expect(report.checks.find((c) => c.id === 3)!.passed).toBe(true); | |
| 120 | + }); | |
| 121 | + | |
| 122 | + test("Layer 0 files don't require sibling tests", () => { | |
| 123 | + const report = verifySamaV2(mk([ | |
| 124 | + ["src/p0_types.ts", "export const x = 1;\n"], | |
| 125 | + ])); | |
| 126 | + expect(report.checks.find((c) => c.id === 3)!.passed).toBe(true); | |
| 127 | + }); | |
| 128 | +}); | |
| 129 | + | |
| 130 | +describe("c32_sama_v2_verify — Modeled boundary (#4)", () => { | |
| 131 | + test("JSON.parse in Layer 1 is flagged", () => { | |
| 132 | + const report = verifySamaV2(mk([ | |
| 133 | + ["src/p1a_naughty.ts", "export const f = (s: string) => JSON.parse(s);\n"], | |
| 134 | + ])); | |
| 135 | + const boundary = report.checks.find((c) => c.id === 4)!; | |
| 136 | + expect(boundary.passed).toBe(false); | |
| 137 | + expect(boundary.violations[0]!.detail).toContain("JSON.parse"); | |
| 138 | + }); | |
| 139 | + | |
| 140 | + test("JSON.parse in Layer 2 is OK (Layer 2 IS the boundary)", () => { | |
| 141 | + const report = verifySamaV2(mk([ | |
| 142 | + ["src/p2b_adapter.ts", "export const f = (s: string) => JSON.parse(s);\n"], | |
| 143 | + ])); | |
| 144 | + expect(report.checks.find((c) => c.id === 4)!.passed).toBe(true); | |
| 145 | + }); | |
| 146 | + | |
| 147 | + test("string literals containing JSON.parse don't false-positive", () => { | |
| 148 | + const report = verifySamaV2(mk([ | |
| 149 | + ["src/p1a_logic.ts", "const explainer = \"to fix, call JSON.parse(input) in Layer 2\";\nexport const x = explainer.length;\n"], | |
| 150 | + ])); | |
| 151 | + expect(report.checks.find((c) => c.id === 4)!.passed).toBe(true); | |
| 152 | + }); | |
| 153 | +}); | |
| 154 | + | |
| 155 | +describe("c32_sama_v2_verify — Atomic (#5)", () => { | |
| 156 | + test("a file over the 700-line cap is flagged", () => { | |
| 157 | + const fat = Array.from({ length: 720 }, (_, i) => `// line ${i}`).join("\n"); | |
| 158 | + const report = verifySamaV2(mk([ | |
| 159 | + ["src/p1a_fat.ts", fat], | |
| 160 | + ])); | |
| 161 | + const atomic = report.checks.find((c) => c.id === 5)!; | |
| 162 | + expect(atomic.passed).toBe(false); | |
| 163 | + expect(atomic.violations[0]!.detail).toContain("over the 700-line cap"); | |
| 164 | + }); | |
| 165 | + | |
| 166 | + test("a barrel re-export file is flagged", () => { | |
| 167 | + const report = verifySamaV2(mk([ | |
| 168 | + ["src/p1a_barrel.ts", "export * from \"./p1a_a.ts\";\nexport * from \"./p1a_b.ts\";\n"], | |
| 169 | + ])); | |
| 170 | + const atomic = report.checks.find((c) => c.id === 5)!; | |
| 171 | + expect(atomic.passed).toBe(false); | |
| 172 | + expect(atomic.violations[0]!.detail).toContain("barrel"); | |
| 173 | + }); | |
| 174 | +}); | |
| 175 | + | |
| 176 | +describe("c32_sama_v2_verify — Law §1.2 (#6)", () => { | |
| 177 | + test("upward import (Layer 1 → Layer 2) is flagged", () => { | |
| 178 | + const report = verifySamaV2(mk([ | |
| 179 | + ["src/p1a_logic.ts", "import { x } from \"./p2a_data.ts\";\nexport const y = x;\n"], | |
| 180 | + ["src/p1a_logic.test.ts", "import { test, expect } from \"bun:test\"; test(\"y\", () => { expect(1).toBe(1); });\n"], | |
| 181 | + ["src/p2a_data.ts", "export const x = 1;\n"], | |
| 182 | + ["src/p2a_data.test.ts","import { test, expect } from \"bun:test\"; test(\"x\", () => { expect(1).toBe(1); });\n"], | |
| 183 | + ])); | |
| 184 | + const law = report.checks.find((c) => c.id === 6)!; | |
| 185 | + expect(law.passed).toBe(false); | |
| 186 | + expect(law.violations.some((v) => v.detail.includes("upward"))).toBe(true); | |
| 187 | + }); | |
| 188 | + | |
| 189 | + test("downward import (Layer 2 → Layer 0) passes", () => { | |
| 190 | + const report = verifySamaV2(mk([ | |
| 191 | + ["src/p2a_data.ts", "import type { X } from \"./p0_types.ts\";\nexport const f = (): X => ({} as X);\n"], | |
| 192 | + ["src/p2a_data.test.ts", "import { test, expect } from \"bun:test\"; test(\"f\", () => { expect(1).toBe(1); });\n"], | |
| 193 | + ["src/p0_types.ts", "export interface X { id: number }\n"], | |
| 194 | + ])); | |
| 195 | + expect(report.checks.find((c) => c.id === 6)!.passed).toBe(true); | |
| 196 | + }); | |
| 197 | + | |
| 198 | + test("same-layer reversed sublayer is flagged", () => { | |
| 199 | + // p1a_logic is sublayer index 0 (logic), p1b_render is sublayer | |
| 200 | + // index 1 (render). Logic importing render is reverse order. | |
| 201 | + const report = verifySamaV2(mk([ | |
| 202 | + ["src/p1a_logic.ts", "import { r } from \"./p1b_render.ts\";\nexport const y = r;\n"], | |
| 203 | + ["src/p1a_logic.test.ts", "import { test, expect } from \"bun:test\"; test(\"y\", () => { expect(1).toBe(1); });\n"], | |
| 204 | + ["src/p1b_render.ts", "export const r = 1;\n"], | |
| 205 | + ["src/p1b_render.test.ts","import { test, expect } from \"bun:test\"; test(\"r\", () => { expect(1).toBe(1); });\n"], | |
| 206 | + ])); | |
| 207 | + const law = report.checks.find((c) => c.id === 6)!; | |
| 208 | + expect(law.passed).toBe(false); | |
| 209 | + expect(law.violations.some((v) => v.detail.includes("sublayer"))).toBe(true); | |
| 210 | + }); | |
| 211 | + | |
| 212 | + test("an import cycle is flagged", () => { | |
| 213 | + const report = verifySamaV2(mk([ | |
| 214 | + ["src/p1a_a.ts", "import { y } from \"./p1a_b.ts\";\nexport const x = y;\n"], | |
| 215 | + ["src/p1a_a.test.ts", "import { test, expect } from \"bun:test\"; test(\"x\", () => { expect(1).toBe(1); });\n"], | |
| 216 | + ["src/p1a_b.ts", "import { x } from \"./p1a_a.ts\";\nexport const y = x;\n"], | |
| 217 | + ["src/p1a_b.test.ts", "import { test, expect } from \"bun:test\"; test(\"y\", () => { expect(1).toBe(1); });\n"], | |
| 218 | + ])); | |
| 219 | + const law = report.checks.find((c) => c.id === 6)!; | |
| 220 | + expect(law.passed).toBe(false); | |
| 221 | + expect(law.violations.some((v) => v.detail.includes("cycle"))).toBe(true); | |
| 222 | + }); | |
| 223 | +}); | |
| 224 | + | |
| 225 | +describe("c32_sama_v2_verify — Consistency §3 (#7)", () => { | |
| 226 | + test("Layer 1 file reaching Layer 2 contradicts its declared prefix", () => { | |
| 227 | + const report = verifySamaV2(mk([ | |
| 228 | + ["src/p1a_logic.ts", "import { f } from \"./p2a_data.ts\";\nexport const y = f;\n"], | |
| 229 | + ["src/p1a_logic.test.ts", "import { test, expect } from \"bun:test\"; test(\"y\", () => { expect(1).toBe(1); });\n"], | |
| 230 | + ["src/p2a_data.ts", "export const f = 1;\n"], | |
| 231 | + ["src/p2a_data.test.ts", "import { test, expect } from \"bun:test\"; test(\"f\", () => { expect(1).toBe(1); });\n"], | |
| 232 | + ])); | |
| 233 | + const consistency = report.checks.find((c) => c.id === 7)!; | |
| 234 | + expect(consistency.passed).toBe(false); | |
| 235 | + expect(consistency.violations[0]!.detail).toContain("declared Layer 1"); | |
| 236 | + expect(consistency.violations[0]!.detail).toContain("Layer 2"); | |
| 237 | + }); | |
| 238 | + | |
| 239 | + test("downward-only imports are consistent", () => { | |
| 240 | + const report = verifySamaV2(mk([ | |
| 241 | + ["src/p1a_logic.ts", "import type { X } from \"./p0_types.ts\";\nexport const y = (a: X) => a;\n"], | |
| 242 | + ["src/p1a_logic.test.ts", "import { test, expect } from \"bun:test\"; test(\"y\", () => { expect(1).toBe(1); });\n"], | |
| 243 | + ["src/p0_types.ts", "export interface X { id: number }\n"], | |
| 244 | + ])); | |
| 245 | + expect(report.checks.find((c) => c.id === 7)!.passed).toBe(true); | |
| 246 | + }); | |
| 247 | +}); | |
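The Law (#6) and Consistency (#7) cases above all reduce to one per-edge rule. The following is a minimal standalone sketch of that rule, not the verifier's own code: the `Decl` shape and the `classifyEdge` helper are hypothetical names introduced here for illustration (the real verifier obtains these values through `declaredLayer`). The slot numbers for `p1a_` and `p1b_` follow the comment in the sublayer test; the ones for `p0_` and `p2a_` are assumed.

```typescript
// Hypothetical shape: the layer and sublayer slot that a filename prefix declares.
interface Decl {
  layer: number;
  sublayerIndex: number;
}

type EdgeVerdict = "ok" | "upward" | "sublayer-reversed";

// Classify one import edge `from -> to`:
//  - a strictly lower target layer is always allowed,
//  - a higher target layer is an upward import,
//  - within one layer the target must sit in an earlier-or-equal sublayer slot.
const classifyEdge = (from: Decl, to: Decl): EdgeVerdict => {
  if (to.layer < from.layer) return "ok";
  if (to.layer > from.layer) return "upward";
  return to.sublayerIndex > from.sublayerIndex ? "sublayer-reversed" : "ok";
};

// Stand-ins for the fixture prefixes (slots for p0_ and p2a_ are assumptions).
const p0: Decl = { layer: 0, sublayerIndex: 0 };
const p1a: Decl = { layer: 1, sublayerIndex: 0 };
const p1b: Decl = { layer: 1, sublayerIndex: 1 };
const p2a: Decl = { layer: 2, sublayerIndex: 0 };

console.log(classifyEdge(p1a, p2a)); // Layer 1 importing Layer 2
console.log(classifyEdge(p2a, p0));  // Layer 2 importing Layer 0
console.log(classifyEdge(p1a, p1b)); // logic importing render
console.log(classifyEdge(p1b, p1a)); // render importing logic
```

Under these assumptions the four calls classify as upward, ok, sublayer-reversed and ok respectively, matching the pass/fail expectations of the corresponding tests. An import cycle is a property of the whole graph rather than of a single edge, so it is not covered by this sketch.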
src/c32_sama_v2_verify.ts
+436
−0
| @@ -0,0 +1,436 @@ | ||
| 1 | +// c32 — logic: the SAMA v2 verifier. Implements the seven §4 | |
| 2 | +// conformance checks (Sorted, Architecture, Modeled-tests, | |
| 3 | +// Modeled-boundary, Atomic, the Law §1.2, Consistency §3) as pure | |
| 4 | +// functions over an in-memory (profile, files) input. Never reads | |
| 5 | +// the filesystem — the loader (c14_sama_profile + c21 handler) | |
| 6 | +// populates the input map. No mocks, no stubs: every check is a | |
| 7 | +// real grep/string-op on the supplied content. | |
| 8 | + | |
| 9 | +import { | |
| 10 | + declaredLayer, | |
| 11 | + type SamaV2Check, | |
| 12 | + type SamaV2Input, | |
| 13 | + type SamaV2Report, | |
| 14 | + type SamaV2Violation, | |
| 15 | +} from "./c31_sama_v2.ts"; | |
| 16 | + | |
| 17 | +// — shared utilities ------------------------------------------------- | |
| 18 | + | |
| 19 | +// A SAMA file is one we expect to obey the layer rules: any *.ts | |
| 20 | +// under src/ that isn't a *.test.ts. Tests live next to source as | |
| 21 | +// siblings; Architecture, Atomic and the Law scan them too, while | |
| 22 | +// Sorted, both Modeled checks and Consistency examine only SAMA files. | |
| 23 | +const isSamaFile = (path: string): boolean => | |
| 24 | + path.startsWith("src/") && path.endsWith(".ts") && !path.endsWith(".test.ts"); | |
| 25 | + | |
| 26 | +const isTestFile = (path: string): boolean => | |
| 27 | + path.startsWith("src/") && path.endsWith(".test.ts"); | |
| 28 | + | |
| 29 | +// Strip JS/TS string literals and comments to whitespace so a regex | |
| 30 | +// that walks the source doesn't trip on test fixtures that contain | |
| 31 | +// the very patterns we're scanning for. Same shape as the helper in | |
| 32 | +// c32_sama_verify; duplicated here to keep c32_sama_v2_verify a | |
| 33 | +// stand-alone module the loader can pull in without dragging the v1 | |
| 34 | +// verifier with it. | |
| 35 | +const stripStringsAndComments = (src: string): string => { | |
| 36 | + let out = ""; | |
| 37 | + let i = 0; | |
| 38 | + while (i < src.length) { | |
| 39 | + const c = src[i]; | |
| 40 | + const n = src[i + 1]; | |
| 41 | + if (c === "/" && n === "/") { | |
| 42 | + out += " ".repeat(2); // "//" is two chars: emit two spaces so the mask stays index-aligned | |
| 43 | + i += 2; | |
| 44 | + while (i < src.length && src[i] !== "\n") { out += " "; i++; } | |
| 45 | + } else if (c === "/" && n === "*") { | |
| 46 | + out += " ".repeat(2); // opening "/*" | |
| 47 | + i += 2; | |
| 48 | + while (i < src.length - 1 && !(src[i] === "*" && src[i + 1] === "/")) { | |
| 49 | + out += src[i] === "\n" ? "\n" : " "; | |
| 50 | + i++; | |
| 51 | + } | |
| 52 | + out += " ".repeat(2); // closing "*/" | |
| 53 | + i += 2; | |
| 54 | + } else if (c === '"' || c === "'" || c === "`") { | |
| 55 | + const quote = c; | |
| 56 | + out += " "; | |
| 57 | + i++; | |
| 58 | + while (i < src.length && src[i] !== quote) { | |
| 59 | + if (src[i] === "\\" && i + 1 < src.length) { out += " ".repeat(2); i += 2; continue; } | |
| 60 | + out += src[i] === "\n" ? "\n" : " "; | |
| 61 | + i++; | |
| 62 | + } | |
| 63 | + out += " "; | |
| 64 | + i++; | |
| 65 | + } else { | |
| 66 | + out += c; | |
| 67 | + i++; | |
| 68 | + } | |
| 69 | + } | |
| 70 | + return out; | |
| 71 | +}; | |
| 72 | + | |
| 73 | +// Collect every "./"-relative ".ts" import edge in a file; "../" | |
| 74 | +// paths and side-effect imports (no `from`) are not matched. Scans | |
| 75 | +// raw source, because a stripped copy would erase the quoted import | |
| 76 | +// paths along with every other string literal. To avoid picking up | |
| 77 | +// import-like strings inside test fixtures, each match position is | |
| 78 | +// cross-checked against the stripped mask: if the keyword `from` | |
| 79 | +// lands on whitespace in the mask, it was inside a string literal. | |
| 80 | +const collectRelativeImports = (content: string): string[] => { | |
| 81 | + const mask = stripStringsAndComments(content); | |
| 82 | + const re = /\bfrom\s+["'](\.\/[A-Za-z0-9_./-]+\.ts)["']/g; | |
| 83 | + const out: string[] = []; | |
| 84 | + let m: RegExpExecArray | null; | |
| 85 | + while ((m = re.exec(content)) !== null) { | |
| 86 | + // If the `from` keyword position is whitespace in the mask, the | |
| 87 | + // entire match was inside a string literal (e.g. a test fixture). | |
| 88 | + if (mask[m.index] === " " || mask[m.index] === "\n") continue; | |
| 89 | + if (m[1]) out.push(m[1]); | |
| 90 | + } | |
| 91 | + return out; | |
| 92 | +}; | |
| 93 | + | |
| 94 | +// Resolve a relative import like "./c14_git.ts" from the importing | |
| 95 | +// file's directory to the repo-relative path used as the input map's | |
| 96 | +// key (e.g. "src/c14_git.ts"). | |
| 97 | +const resolveImport = (fromPath: string, importPath: string): string => { | |
| 98 | + const dir = fromPath.split("/").slice(0, -1).join("/"); | |
| 99 | + const rel = importPath.replace(/^\.\//, ""); | |
| 100 | + return dir + "/" + rel; | |
| 101 | +}; | |
| 102 | + | |
| 103 | +// — Check 1: Sorted ------------------------------------------------- | |
| 104 | +// | |
| 105 | +// "Every file carries a profile-recognised prefix; lexicographic | |
| 106 | +// prefix order equals layer order." | |
| 107 | +const checkSorted = (input: SamaV2Input): SamaV2Check => { | |
| 108 | + const violations: SamaV2Violation[] = []; | |
| 109 | + let examined = 0; | |
| 110 | + // Collect (prefix, layer) pairs from the profile. | |
| 111 | + const pairs: Array<{ prefix: string; layer: number }> = []; | |
| 112 | + for (const [k, spec] of Object.entries(input.profile.layers)) { | |
| 113 | + const layer = parseInt(k, 10); | |
| 114 | + for (const sub of spec.sublayers) pairs.push({ prefix: sub.prefix, layer }); | |
| 115 | + } | |
| 116 | + // For any two prefixes with layer(A) < layer(B), A must lex-sort < B. | |
| 117 | + for (let i = 0; i < pairs.length; i++) { | |
| 118 | + for (let j = 0; j < pairs.length; j++) { | |
| 119 | + if (i === j) continue; | |
| 120 | + const a = pairs[i]!; | |
| 121 | + const b = pairs[j]!; | |
| 122 | + if (a.layer < b.layer && a.prefix > b.prefix) { | |
| 123 | + violations.push({ | |
| 124 | + file: a.prefix, | |
| 125 | + detail: `prefix \`${a.prefix}\` (layer ${a.layer}) sorts after \`${b.prefix}\` (layer ${b.layer}) — lex order must equal layer order`, | |
| 126 | + }); | |
| 127 | + } | |
| 128 | + } | |
| 129 | + } | |
| 130 | + // Also count source files whose prefix isn't recognised by any | |
| 131 | + // sublayer. They'd be flagged by Architecture too, but the Sorted | |
| 132 | + // rule needs each file to have a recognised prefix. | |
| 133 | + for (const path of input.files.keys()) { | |
| 134 | + if (!isSamaFile(path)) continue; | |
| 135 | + examined++; | |
| 136 | + if (declaredLayer(path, input.profile) === null) { | |
| 137 | + violations.push({ file: path, detail: "no profile-recognised prefix" }); | |
| 138 | + } | |
| 139 | + } | |
| 140 | + return { | |
| 141 | + id: 1, name: "Sorted", property: "Sorted", | |
| 142 | + passed: violations.length === 0, examined, violations, | |
| 143 | + }; | |
| 144 | +}; | |
| 145 | + | |
| 146 | +// — Check 2: Architecture ------------------------------------------- | |
| 147 | +// | |
| 148 | +// "Every file maps to exactly one canonical layer; no file is | |
| 149 | +// unprefixed or maps to two layers." | |
| 150 | +const checkArchitecture = (input: SamaV2Input): SamaV2Check => { | |
| 151 | + const violations: SamaV2Violation[] = []; | |
| 152 | + let examined = 0; | |
| 153 | + for (const path of input.files.keys()) { | |
| 154 | + if (!isSamaFile(path) && !isTestFile(path)) continue; | |
| 155 | + examined++; | |
| 156 | + const base = path.split("/").pop() ?? path; | |
| 157 | + // Find every profile prefix that matches this filename. Exactly | |
| 158 | + // one is required; zero = unprefixed (caught by Sorted too) but | |
| 159 | + // we surface it here as the canonical "unmapped" failure. | |
| 160 | + const matches: Array<{ layer: number; prefix: string }> = []; | |
| 161 | + for (const [k, spec] of Object.entries(input.profile.layers)) { | |
| 162 | + const layer = parseInt(k, 10); | |
| 163 | + for (const sub of spec.sublayers) { | |
| 164 | + if (base.startsWith(sub.prefix)) matches.push({ layer, prefix: sub.prefix }); | |
| 165 | + } | |
| 166 | + } | |
| 167 | + if (matches.length === 0) { | |
| 168 | + violations.push({ file: path, detail: "unprefixed — does not match any profile prefix" }); | |
| 169 | + } else if (matches.length > 1) { | |
| 170 | + // Several prefixes match: flag only if they disagree on the layer. | |
| 171 | + const distinctLayers = new Set(matches.map((m) => m.layer)); | |
| 172 | + if (distinctLayers.size > 1) { | |
| 173 | + violations.push({ | |
| 174 | + file: path, | |
| 175 | + detail: `ambiguous — matches multiple layers: ${matches.map((m) => `${m.prefix}→L${m.layer}`).join(", ")}`, | |
| 176 | + }); | |
| 177 | + } | |
| 178 | + } | |
| 179 | + } | |
| 180 | + return { | |
| 181 | + id: 2, name: "Architecture", property: "Architecture", | |
| 182 | + passed: violations.length === 0, examined, violations, | |
| 183 | + }; | |
| 184 | +}; | |
| 185 | + | |
| 186 | +// — Check 3: Modeled (tests) ---------------------------------------- | |
| 187 | +// | |
| 188 | +// "Every Layer 1 and Layer 2 behavior file has a sibling test file." | |
| 189 | +const checkModeledTests = (input: SamaV2Input): SamaV2Check => { | |
| 190 | + const violations: SamaV2Violation[] = []; | |
| 191 | + let examined = 0; | |
| 192 | + for (const path of input.files.keys()) { | |
| 193 | + if (!isSamaFile(path)) continue; | |
| 194 | + const decl = declaredLayer(path, input.profile); | |
| 195 | + if (!decl) continue; | |
| 196 | + if (decl.layer !== 1 && decl.layer !== 2) continue; | |
| 197 | + examined++; | |
| 198 | + const siblingPath = path.replace(/\.ts$/, ".test.ts"); | |
| 199 | + if (!input.files.has(siblingPath)) { | |
| 200 | + violations.push({ | |
| 201 | + file: path, | |
| 202 | + detail: `no sibling test at \`${siblingPath}\` — Layer ${decl.layer} requires one`, | |
| 203 | + }); | |
| 204 | + } | |
| 205 | + } | |
| 206 | + return { | |
| 207 | + id: 3, name: "Modeled (tests)", property: "Modeled (tests)", | |
| 208 | + passed: violations.length === 0, examined, violations, | |
| 209 | + }; | |
| 210 | +}; | |
| 211 | + | |
| 212 | +// — Check 4: Modeled (boundary) ------------------------------------- | |
| 213 | +// | |
| 214 | +// "External input is parsed only in Layer 2." | |
| 215 | +// | |
| 216 | +// §4.4 is profile-dependent (spec §6). Our profile defines boundary | |
| 217 | +// parsing as `JSON.parse(` of arbitrary input (not constant strings) | |
| 218 | +// or `new URL(` of arbitrary input — i.e. patterns that turn bytes | |
| 219 | +// into typed structures. Platform-provided parsers called *through* | |
| 220 | +// Layer 3 entry handlers (`req.json()`, `req.formData()`, route | |
| 221 | +// params) are treated as delegation to the platform's own Layer 2, | |
| 222 | +// not parsing performed in our Layer 3. The verifier reports any | |
| 223 | +// raw JSON.parse / new URL calls landing outside Layer 2. | |
| 224 | +const BOUNDARY_PATTERNS = [ | |
| 225 | + { name: "JSON.parse", re: /\bJSON\.parse\s*\(/ }, | |
| 226 | + { name: "new URL", re: /\bnew\s+URL\s*\(/ }, | |
| 227 | +]; | |
| 228 | +const checkModeledBoundary = (input: SamaV2Input): SamaV2Check => { | |
| 229 | + const violations: SamaV2Violation[] = []; | |
| 230 | + let examined = 0; | |
| 231 | + for (const [path, content] of input.files.entries()) { | |
| 232 | + if (!isSamaFile(path)) continue; | |
| 233 | + const decl = declaredLayer(path, input.profile); | |
| 234 | + if (!decl) continue; | |
| 235 | + examined++; | |
| 236 | + if (decl.layer === 2) continue; // Layer 2 is the legitimate site. | |
| 237 | + const stripped = stripStringsAndComments(content); | |
| 238 | + for (const pat of BOUNDARY_PATTERNS) { | |
| 239 | + if (pat.re.test(stripped)) { | |
| 240 | + violations.push({ | |
| 241 | + file: path, | |
| 242 | + detail: `boundary pattern \`${pat.name}\` found in Layer ${decl.layer} — parsing belongs in Layer 2`, | |
| 243 | + }); | |
| 244 | + } | |
| 245 | + } | |
| 246 | + } | |
| 247 | + return { | |
| 248 | + id: 4, name: "Modeled (boundary)", property: "Modeled (boundary)", | |
| 249 | + passed: violations.length === 0, examined, violations, | |
| 250 | + note: "profile-dependent (spec §4.4): boundary = raw `JSON.parse` / `new URL` outside Layer 2. Platform parsers reached via `req.json()` etc. are treated as delegation to the platform's own Layer 2.", | |
| 251 | + }; | |
| 252 | +}; | |
| 253 | + | |
| 254 | +// — Check 5: Atomic ------------------------------------------------- | |
| 255 | +// | |
| 256 | +// "No file exceeds the line cap (default ~700; profile may lower, | |
| 257 | +// never raise). No barrel re-export files." | |
| 258 | +const ATOMIC_LINE_CAP = 700; | |
| 259 | +const checkAtomic = (input: SamaV2Input): SamaV2Check => { | |
| 260 | + const violations: SamaV2Violation[] = []; | |
| 261 | + let examined = 0; | |
| 262 | + for (const [path, content] of input.files.entries()) { | |
| 263 | + if (!isSamaFile(path) && !isTestFile(path)) continue; | |
| 264 | + examined++; | |
| 265 | + const lines = content.split("\n").length - (content.endsWith("\n") ? 1 : 0); // a trailing newline is not an extra line | |
| 266 | + if (lines > ATOMIC_LINE_CAP) { | |
| 267 | + violations.push({ | |
| 268 | + file: path, | |
| 269 | + detail: `${lines} lines (over the ${ATOMIC_LINE_CAP}-line cap — split per UI/data domain)`, | |
| 270 | + }); | |
| 271 | + } | |
| 272 | + // Barrel detection: a file whose entire body is re-exports. | |
| 273 | + // Heuristic: every non-blank, non-comment line is `export ... from`. | |
| 274 | + const stripped = stripStringsAndComments(content); | |
| 275 | + const codeLines = stripped.split("\n").map((l) => l.trim()).filter((l) => l.length > 0); | |
| 276 | + if (codeLines.length >= 2 && codeLines.every((l) => /^export\s+(\*|\{)/.test(l) && /\bfrom\b/.test(l))) { | |
| 277 | + violations.push({ file: path, detail: "barrel re-export file (all lines are `export … from`)" }); | |
| 278 | + } | |
| 279 | + } | |
| 280 | + return { | |
| 281 | + id: 5, name: "Atomic", property: "Atomic", | |
| 282 | + passed: violations.length === 0, examined, violations, | |
| 283 | + }; | |
| 284 | +}; | |
| 285 | + | |
| 286 | +// — Check 6: The Law (§1.2) ----------------------------------------- | |
| 287 | +// | |
| 288 | +// "Imports always point to a strictly lower layer number — never | |
| 289 | +// upward, never sideways across a higher number, never cyclic." | |
| 290 | +// | |
| 291 | +// Build the import graph from relative-.ts imports, then for each | |
| 292 | +// edge A → B require: layer(B) < layer(A), OR same layer + B's | |
| 293 | +// sublayer index <= A's sublayer index. Also run a DFS cycle detector. | |
| 294 | +const checkLaw = (input: SamaV2Input): SamaV2Check => { | |
| 295 | + const violations: SamaV2Violation[] = []; | |
| 296 | + let examined = 0; | |
| 297 | + // Build adjacency. | |
| 298 | + const adj = new Map<string, string[]>(); | |
| 299 | + for (const [path, content] of input.files.entries()) { | |
| 300 | + if (!isSamaFile(path) && !isTestFile(path)) continue; | |
| 301 | + examined++; | |
| 302 | + const out: string[] = []; | |
| 303 | + for (const imp of collectRelativeImports(content)) { | |
| 304 | + const resolved = resolveImport(path, imp); | |
| 305 | + // Only follow edges into known SAMA files (in-tree, in src/). | |
| 306 | + if (input.files.has(resolved)) out.push(resolved); | |
| 307 | + } | |
| 308 | + adj.set(path, out); | |
| 309 | + } | |
| 310 | + // Edge-by-edge layer/sublayer check. | |
| 311 | + for (const [from, outs] of adj.entries()) { | |
| 312 | + const aDecl = declaredLayer(from, input.profile); | |
| 313 | + if (!aDecl) continue; // Unmapped — caught by Architecture. | |
| 314 | + for (const to of outs) { | |
| 315 | + const bDecl = declaredLayer(to, input.profile); | |
| 316 | + if (!bDecl) continue; | |
| 317 | + if (bDecl.layer < aDecl.layer) continue; // strictly lower — OK | |
| 318 | + if (bDecl.layer > aDecl.layer) { | |
| 319 | + violations.push({ | |
| 320 | + file: from, | |
| 321 | + detail: `imports \`${to}\` — Layer ${aDecl.layer} → Layer ${bDecl.layer} (upward, breaks §1.2)`, | |
| 322 | + }); | |
| 323 | + continue; | |
| 324 | + } | |
| 325 | + // Same layer: sublayer ordering. The import target must be in | |
| 326 | + // an earlier-or-equal sublayer slot (spec §2.2: later may import | |
| 327 | + // earlier). | |
| 328 | + if (bDecl.sublayer.index > aDecl.sublayer.index) { | |
| 329 | + violations.push({ | |
| 330 | + file: from, | |
| 331 | + detail: `imports \`${to}\` — same layer ${aDecl.layer} but sublayer order is reversed (${aDecl.sublayer.name} sublayer-index ${aDecl.sublayer.index} → ${bDecl.sublayer.name} sublayer-index ${bDecl.sublayer.index})`, | |
| 332 | + }); | |
| 333 | + } | |
| 334 | + } | |
| 335 | + } | |
| 336 | + // DFS cycle detection on the same graph. | |
| 337 | + const WHITE = 0, GRAY = 1, BLACK = 2; | |
| 338 | + const color = new Map<string, number>(); | |
| 339 | + for (const k of adj.keys()) color.set(k, WHITE); | |
| 340 | + const cycles: string[][] = []; | |
| 341 | + const stack: string[] = []; | |
| 342 | + const dfs = (node: string): void => { | |
| 343 | + color.set(node, GRAY); | |
| 344 | + stack.push(node); | |
| 345 | + for (const next of adj.get(node) ?? []) { | |
| 346 | + const c = color.get(next) ?? WHITE; | |
| 347 | + if (c === GRAY) { | |
| 348 | + // Back edge: `next` is still on the stack, so this closes a cycle. | |
| 349 | + const idx = stack.indexOf(next); | |
| 350 | + if (idx !== -1) cycles.push([...stack.slice(idx), next]); | |
| 351 | + } else if (c === WHITE) { | |
| 352 | + dfs(next); | |
| 353 | + } | |
| 354 | + // No early return: every node must unwind through the pop below, | |
| 355 | + // otherwise stale GRAY nodes and stack entries leak into later walks. | |
| 356 | + } | |
| 357 | + stack.pop(); | |
| 358 | + color.set(node, BLACK); | |
| 359 | + }; | |
| 360 | + for (const k of adj.keys()) if (color.get(k) === WHITE) dfs(k); | |
| 361 | + for (const cyc of cycles) { | |
| 362 | + violations.push({ | |
| 363 | + file: cyc[0] ?? "(unknown)", | |
| 364 | + detail: `import cycle: ${cyc.join(" → ")}`, | |
| 365 | + }); | |
| 366 | + } | |
| 367 | + return { | |
| 368 | + id: 6, name: "Law (§1.2)", property: "Law", | |
| 369 | + passed: violations.length === 0, examined, violations, | |
| 370 | + }; | |
| 371 | +}; | |
| 372 | + | |
| 373 | +// — Check 7: Consistency (§3) --------------------------------------- | |
| 374 | +// | |
| 375 | +// "Verifier FAILS if a file imports from a layer that its declared | |
| 376 | +// layer is not permitted to import." This is the same set of edges | |
| 377 | +// the Law check examines, framed from the file's own perspective: | |
| 378 | +// does the prefix lie about what the file actually does? | |
| 379 | +// | |
| 380 | +// We emit a separate verdict so the report can show both framings. | |
| 381 | +// In a repo with no §1.2 violation, §3 also passes, provided every | |
| 382 | +// import resolves to a file in the input map (Law skips unresolved ones). | |
| 383 | +const checkConsistency = (input: SamaV2Input): SamaV2Check => { | |
| 384 | + const violations: SamaV2Violation[] = []; | |
| 385 | + let examined = 0; | |
| 386 | + for (const [path, content] of input.files.entries()) { | |
| 387 | + if (!isSamaFile(path)) continue; | |
| 388 | + const aDecl = declaredLayer(path, input.profile); | |
| 389 | + if (!aDecl) continue; | |
| 390 | + examined++; | |
| 391 | + let ceiling = -1; | |
| 392 | + let ceilingFile: string | null = null; | |
| 393 | + for (const imp of collectRelativeImports(content)) { | |
| 394 | + const resolved = resolveImport(path, imp); | |
| 395 | + const bDecl = declaredLayer(resolved, input.profile); | |
| 396 | + if (!bDecl) continue; | |
| 397 | + if (bDecl.layer > ceiling) { ceiling = bDecl.layer; ceilingFile = resolved; } | |
| 398 | + } | |
| 399 | + // Consistency fails if any import goes to a strictly higher | |
| 400 | + // layer than the file's declared layer. Same-layer with bad | |
| 401 | + // sublayer order is the Law's concern, not Consistency's. | |
| 402 | + if (ceiling > aDecl.layer) { | |
| 403 | + violations.push({ | |
| 404 | + file: path, | |
| 405 | + detail: `declared Layer ${aDecl.layer} (prefix \`${aDecl.sublayer.prefix}\`) but imports reach Layer ${ceiling} via \`${ceilingFile}\` — the prefix claims something the imports contradict`, | |
| 406 | + }); | |
| 407 | + } | |
| 408 | + } | |
| 409 | + return { | |
| 410 | + id: 7, name: "Consistency (§3)", property: "Consistency", | |
| 411 | + passed: violations.length === 0, examined, violations, | |
| 412 | + }; | |
| 413 | +}; | |
| 414 | + | |
| 415 | +// — Orchestrator ---------------------------------------------------- | |
| 416 | + | |
| 417 | +export const verifySamaV2 = (input: SamaV2Input): SamaV2Report => { | |
| 418 | + const checks: SamaV2Check[] = [ | |
| 419 | + checkSorted(input), | |
| 420 | + checkArchitecture(input), | |
| 421 | + checkModeledTests(input), | |
| 422 | + checkModeledBoundary(input), | |
| 423 | + checkAtomic(input), | |
| 424 | + checkLaw(input), | |
| 425 | + checkConsistency(input), | |
| 426 | + ]; | |
| 427 | + // Architecture's examined count is the canonical total — it counts | |
| 428 | + // every file the profile assigns to a layer (or fails to). | |
| 429 | + const examined = checks.find((c) => c.id === 2)?.examined ?? 0; | |
| 430 | + return { | |
| 431 | + profile: input.profile.profile, | |
| 432 | + examined, | |
| 433 | + checks, | |
| 434 | + overallPassed: checks.every((c) => c.passed), | |
| 435 | + }; | |
| 436 | +}; | |
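The pairwise comparison in the Sorted check can be tried in isolation. The sketch below is a hedged, standalone illustration rather than the verifier itself: `sortedViolations` is a hypothetical helper name, the first prefix table is copied from the layer→prefix mapping in the commit message, and the second table's layer numbers are inferred from the test fixture names (`p0_`, `p1a_`, `p1b_`, `p2a_`, `p2b_`), so they are an assumption.

```typescript
interface PrefixPair {
  prefix: string;
  layer: number;
}

// Hypothetical helper: list every ordered pair (a, b) where a's layer is lower
// than b's but a's prefix sorts lexicographically after b's.
const sortedViolations = (pairs: PrefixPair[]): string[] => {
  const out: string[] = [];
  for (const a of pairs) {
    for (const b of pairs) {
      if (a.layer < b.layer && a.prefix > b.prefix) {
        out.push(`${a.prefix} (layer ${a.layer}) sorts after ${b.prefix} (layer ${b.layer})`);
      }
    }
  }
  return out;
};

// Mapping as listed in the commit message for the tdd-md profile.
const cPrefixes: PrefixPair[] = [
  { prefix: "c31_", layer: 0 },
  { prefix: "c32_", layer: 1 },
  { prefix: "c51_", layer: 1 },
  { prefix: "c13_", layer: 2 },
  { prefix: "c14_", layer: 2 },
  { prefix: "c21_", layer: 3 },
  { prefix: "c11_", layer: 3 },
];

// Fixture-style prefixes; layer numbers are inferred from the test names.
const pPrefixes: PrefixPair[] = [
  { prefix: "p0_", layer: 0 },
  { prefix: "p1a_", layer: 1 },
  { prefix: "p1b_", layer: 1 },
  { prefix: "p2a_", layer: 2 },
  { prefix: "p2b_", layer: 2 },
];

console.log(sortedViolations(cPrefixes).length); // 14
console.log(sortedViolations(pPrefixes).length); // 0
```

The count of 14 for the `c`-prefix table is consistent with the "Sorted: 14 violations" blocker reported in the commit message, on the assumption that every source file carries a recognised prefix so that no per-file violations are added on top.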
src/c51_render_admin.ts
+1
−1
| @@ -10,7 +10,7 @@ | ||
| 10 | 10 | // here is forward-compatible with the block editor that lands next. |
| 11 | 11 | |
| 12 | 12 | import { escape, renderPage } from "./c51_render_layout.ts"; |
| 13 | -import type { SxDocumentSummary } from "./c13_database.ts"; | |
| 13 | +import type { SxDocumentSummary } from "./c31_sxdoc.ts"; | |
| 14 | 14 | import type { SxDocument } from "./c31_sxdoc.ts"; |
| 15 | 15 | import { sxToHtml } from "./c51_render_sxdoc.ts"; |
| 16 | 16 | |
src/c51_render_edit.ts
+1
−4
| @@ -8,10 +8,7 @@ import { | ||
| 8 | 8 | escape, |
| 9 | 9 | } from "./c51_render_layout.ts"; |
| 10 | 10 | import type { ResolvedEdit } from "./c32_edit_resolve.ts"; |
| 11 | -import type { | |
| 12 | - GitCommitOk, | |
| 13 | - GitCommitFailure, | |
| 14 | -} from "./c14_git.ts"; | |
| 11 | +import type { GitCommitOk, GitCommitFailure } from "./c31_git_parse.ts"; | |
| 15 | 12 | |
| 16 | 13 | const layoutWrap = (innerHtml: string): string => |
| 17 | 14 | `<main class="md edit-page"><div class="edit-container">${innerHtml}</div></main>`; |
src/c51_render_projects.ts
+1
−1
| @@ -1,7 +1,7 @@ | ||
| 1 | 1 | // c51 (projects) — body builders for /projects, /projects/new, |
| 2 | 2 | // /projects/:owner/:repo. Imports chrome helpers from c51_render_layout. |
| 3 | 3 | |
| 4 | -import type { ProjectRow } from "./c13_database.ts"; | |
| 4 | +import type { ProjectRow } from "./c31_project_config.ts"; | |
| 5 | 5 | import { PROJECT_CONFIG_PATH } from "./c31_project_config.ts"; |
| 6 | 6 | import { escape } from "./c51_render_layout.ts"; |
| 7 | 7 | |
src/c51_render_repo.ts
+1
−1
| @@ -6,7 +6,7 @@ | ||
| 6 | 6 | |
| 7 | 7 | import { marked } from "marked"; |
| 8 | 8 | import { renderPage, escape } from "./c51_render_layout.ts"; |
| 9 | -import type { TreeEntry } from "./c14_git.ts"; | |
| 9 | +import type { TreeEntry } from "./c31_git_parse.ts"; | |
| 10 | 10 | |
| 11 | 11 | const shortSha = (sha: string): string => sha.slice(0, 7); |
| 12 | 12 | |