syntaxai/tdd.md · commit 1f9254a

SAMA v2 verifier: build, ship, and dogfood — empirical proof scaffold

Builds the SAMA v2 §4 verifier end-to-end so /sama/v2/verify?repo=
syntaxai/tdd.md returns 200 with an honest verdict for this repo
under a real profile. Goal #15 result: 5/7 checks ✓ with three
named, file-level blockers documented in the live verdict.

New files (all v2-compliant under tdd-md profile):
- sama.profile.toml — repo root, the single source of truth for
  layer→prefix mapping (c31_→0, c32_/c51_→1, c13_/c14_→2, c21_/c11_→3)
- src/c31_sama_v2.ts — Layer 0 types (ProfileSpec, SamaV2Input,
  SamaV2Report, declaredLayer helper)
- src/c32_sama_v2_verify.ts (+ sibling) — pure Layer 1 verifier, 7
  checks; sibling has 20 tests covering each check's positive/negative
  fixture cases
- src/c14_sama_profile.ts (+ sibling) — Layer 2 boundary: minimal
  TOML subset parser + filesystem loader; 7 tests pin the parsed shape
- src/c21_handlers_sama.ts (extended) — new samaV2VerifyHandler
  reads sama.profile.toml + walks src/, runs the verifier, renders
  the verdict via renderDocsPage
- src/c21_app.ts — new route "/sama/v2/verify"

Honesty refactors required for v2 conformance (preserve v1 behaviour
throughout — bun test 220/220, sama v1 check 4/4 ✓):
- c32_judge / c32_real_reports / c32_real_tests do real I/O (git
  clone, fs, HTTP fetch). Under v2 §1.1 they cannot be Layer 1.
  Renamed to c14_* (Layer 2 Adapter). 6 files + every importer.
- SxDocumentSummary, ProjectRow, TreeEntry, GitCommitOk/Failure/
  Outcome were defined in c13/c14 (Layer 2) but imported by c51
  render code (Layer 1) — upward edges. Moved type definitions to
  c31_sxdoc / c31_project_config / c31_git_parse (Layer 0). All
  callers now import directly from Layer 0.

Honest blockers surfaced by the verifier (deferred — out of /goal
scope):
- #1 Sorted: 14 violations. v1's c11/c13/c14/c21/c31/c32/c51 prefix
  scheme puts c-prefix layer numbers IN lex order, but v2's
  Pure/Core/Adapter/Entry mapping reverses that (c31_=Layer 0 should
  lex-FIRST, c11_=Layer 3 should lex-LAST). Requires a sweeping
  prefix-rename refactor.
- #3 Modeled (tests): 13 violations. Layer 1 (c51_) and Layer 2
  (c14_) source files without sibling .test.ts. Need 13 new test
  files.
- #4 Modeled (boundary): 5 violations. c21_handlers_* call new URL
  / JSON.parse directly in Layer 3; v2 §4.4 wants those in Layer 2.
  Extract to c14_request_parse helpers.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
author: syntaxai <[email protected]>
date: 2026-05-23 08:33:09 +01:00
parent: d2574fd
commit: 1f9254a4e56a8721c125b9cb319eed5883d8ecc1

38 files changed · +3712 −1015

added content/blog/deploy-that-lies-cascade.md +310 −0
@@ -0,0 +1,310 @@
1+# When the deploy lies: three bugs hidden by one silent error suppressor
2+
3+The two prior posts in this thread were clean rounds: the verifier
4+named a violation, I produced the named artifact, the verifier flipped
5+green. Atomic-700 on `c21_app.ts` → split per domain → ✓. Modeled on
6+four `c32_*.ts` files → add the four siblings → ✓. Encouraging stories
7+about mechanical enforcement.
8+
9+This post is the messy round. It's the one that taught me that
10+mechanical enforcement only works if the pipeline that runs it is
11+itself running.
12+
13+## The visible bug
14+
15+`/reports/live` is the public live-data demo: real commit history for
16+this repo, rendered into a TDD-discipline scorecard, refreshed on every
17+deploy. On 2026-05-22 the header read:
18+
19+```
20+tdd-discipline report · 2026-05-03 → 2026-05-10
21+```
22+
23+Twelve days of staleness on a page that calls itself "live." I'd
24+shipped seven commits across the previous rounds and none of them
25+appeared.
26+
27+## Why nobody noticed for 12 days
28+
29+The deploy script in git-mode invoked the snapshot generator over ssh:
30+
31+```bash
32+ssh "$SSH_HOST" "cd ~/$REMOTE_SRC_DIR && bun scripts/p620/snapshot-git-history.ts" 2>/dev/null \
33+ || echo " ⚠ snapshot-git-history skipped (script may live outside the rsync exclude — non-fatal)"
34+```
35+
36+Two clauses are doing the damage:
37+
38+- `2>/dev/null` discards stderr — including the error message we'd want.
39+- `|| echo " ⚠ ... non-fatal"` turns a real failure into a printed
40+ warning. Worse, the warning text *blames the wrong thing*
41+ ("script may live outside the rsync exclude") so anyone who DID see
42+ the warning would file it under "harmless artifact of rsync vs git
43+ mode" and move on.
44+
45+The actual failure: there's no `bun` on the p620 host. Bun lives only
46+inside the tdd-md container image. The ssh tried to invoke a binary
47+that doesn't exist on PATH; the shell returned 127; the warning fired;
48+the deploy continued; the snapshot file's timestamp stayed at May 11.
49+
50+Twelve days. Every deploy. Both of the previous "clean rounds" deployed
51+through this same broken path and updated the *site* but not the
52+*live data*. The blog posts about going green were themselves served by
53+a deploy script that was lying about its own snapshot step.
54+
55+## Fix 1, and what it revealed
56+
57+The fix is structurally trivial: run the script *inside* the container
58+where bun lives, by mounting the working tree as a volume:
59+
60+```bash
61+ssh "$SSH_HOST" "podman run --rm \
62+ -v \$HOME/$REMOTE_SRC_DIR:/work:Z \
63+ --workdir /work \
64+ $IMAGE_TAG \
65+ bun scripts/p620/snapshot-git-history.ts" \
66+ || { echo '✗ snapshot-git-history failed'; exit 1; }
67+```
68+
69+The `:Z` is the Fedora SELinux relabel — the script process inside
70+needs to be able to read/write the bind mount. The `||
71+{ echo ✗; exit 1; }` replaces the swallow with a real failure mode. No
72+more silent skips.
73+
74+After this fix landed, `/reports/live` immediately caught up:
75+
76+```
77+tdd-discipline report · 2026-05-03 → 2026-05-22
78+```
79+
80+So far so good. But the moment I looked at `/reports/live/tests`, the
81+sibling test-stability page, the timestamp said:
82+
83+```
84+last run 2026-05-10 · 17 runs cumulative
85+```
86+
87+Same staleness. Different cause.
88+
89+## The second silent failure
90+
91+Looking at the deploy script again, the **rsync** escape hatch runs
92+both snapshot scripts:
93+
94+```bash
95+( cd "$REPO_ROOT" && bun scripts/p620/snapshot-git-history.ts ) || ...
96+( cd "$REPO_ROOT" && bun scripts/p620/snapshot-tests.ts ) || ...
97+```
98+
99+The **git-mode** happy path runs only the first one. When the deploy
100+flow switched from rsync to git as the default a while back, the
101+test-snapshot step got dropped on the floor and nobody noticed —
102+because the test-stability page was always 17 cumulative runs old, and
103+"old enough that nobody questioned the number" is one of the failure
104+modes that a verifier can't detect.
105+
106+Fix 2: add the second podman-run step, with one wrinkle. Unlike
107+`snapshot-git-history` (which is pure git + filesystem), `snapshot-tests`
108+calls `bun test`, which needs `node_modules` to resolve `marked` and
109+`node-html-parser`. The bind-mounted host directory has no
110+`node_modules` (the host has no Bun). But the image already ships
111+them at `/app/node_modules`. So:
112+
113+```bash
114+podman run --rm -v $HOME/src/tdd.md:/work:Z --workdir /work $IMAGE_TAG \
115+ sh -c 'ln -sfn /app/node_modules node_modules && bun scripts/p620/snapshot-tests.ts'
116+```
117+
118+Symlink the container's `node_modules` into the work directory, then
119+let the script use it. The symlink persists on the host between
120+deploys but points at a path inside the container — harmless dead-link
121+outside the next podman-run, valid inside.
122+
123+## Two more bugs, surfaced by the snapshot actually running
124+
125+When the next deploy ran with both snapshots wired in, the live page
126+now read:
127+
128+```
129+Total: 193 tests · 192 passing · 1 failing · 1 placeholder ⚠
130+```
131+
132+193 pass locally, every time I run them. 192 pass + 1 fail + 1
133+placeholder on the container. Two bugs that had been hiding behind
134+"the test suite never actually ran in the deploy pipeline."
135+
136+### Bug A: a 1-in-16 flaky test
137+
138+The failing test was one I wrote in the prior round:
139+
140+```ts
141+test("verifySession rejects a cookie with a forged signature", async () => {
142+ const cookie = await signSession("eve");
143+ const tampered = cookie.replace(/.$/, "0");
144+ const result = await verifySession(tampered);
145+ expect(result).toBeNull();
146+});
147+```
148+
149+`replace(/.$/, "0")` replaces the last character with "0". When the
150+HMAC signature's last hex digit *is already* "0" — which happens with
151+probability 1/16, since SHA-256 hex output is uniform — the
152+"tampered" string is identical to the original, the signature
153+verifies, the function returns `"eve"`, and the assertion fails.
154+
155+Local runs masked this because the random draws (the timestamp going
156+into the signed payload) happened to never produce a `0`-ending sig.
157+The first run that actually ran in CI hit the unlucky draw and
158+exposed it.
159+
160+Fix: read the last char, flip to a digit it definitely isn't:
161+
162+```ts
163+const lastChar = cookie.slice(-1);
164+const tampered = cookie.slice(0, -1) + (lastChar === "f" ? "0" : "f");
165+expect(tampered).not.toBe(cookie); // loudly fail if a future regression collides
166+```
167+
168+Five runs in a row, every one passes. Determinism restored.
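To make the collision concrete, here is a small sketch that enumerates all 16 possible final hex digits instead of relying on repeated runs. The cookie shape `eve.deadbee<digit>` and the helper names `naiveTamper`, `tamperLastChar`, and `collisions` are assumptions for illustration; they are not taken from the repo.

```typescript
// Original approach: overwrite the last character with "0".
function naiveTamper(cookie: string): string {
  return cookie.replace(/.$/, "0");
}

// Fixed approach: pick a replacement that cannot equal the last character.
function tamperLastChar(cookie: string): string {
  const lastChar = cookie.slice(-1);
  return cookie.slice(0, -1) + (lastChar === "f" ? "0" : "f");
}

const HEX_DIGITS = "0123456789abcdef".split("");

// Return the final digits for which "tampering" leaves the cookie unchanged.
// The "eve.deadbee" stem is a made-up stand-in for a signed cookie.
function collisions(tamper: (cookie: string) => string): string[] {
  return HEX_DIGITS.filter((digit) => {
    const cookie = `eve.deadbee${digit}`;
    return tamper(cookie) === cookie;
  });
}

console.log(collisions(naiveTamper));    // one collision: the digit "0"
console.log(collisions(tamperLastChar)); // no collisions
```

Enumerating the digit space checks the fix exhaustively, which a handful of green runs of a 1-in-16 flake cannot do.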
169+
170+### Bug B: the verifier's own test, flagged by its own check
171+
172+The placeholder warning pointed at:
173+
174+```
175+src/c32_sama_verify.test.ts > does nothing
176+```
177+
178+`c32_sama_verify.ts` is the verifier itself. Its test file holds a
179+fixture:
180+
181+```ts
182+test("Atomic: placeholder test (zero expect calls) is flagged", () => {
183+ const placeholderFixture = `test("does nothing", () => { /* TODO */ })`;
184+ // ... feed it to the verifier, assert the verifier flags it
185+});
186+```
187+
188+The string `test("does nothing", () => { /* TODO */ })` is a *fixture*
189+— a literal example of what a placeholder test looks like, fed to the
190+verifier so we can assert the verifier catches it. It's not a real
191+test.
192+
193+The verifier itself handles this correctly. It uses a
194+`stripStringsAndComments` helper to mask out string literals before
195+running its `test()`-finder regex over the source. So when the
196+verifier scans `c32_sama_verify.test.ts`, it sees the fixture as
197+whitespace, doesn't pick it up, and reports zero placeholders in that
198+file.
199+
200+But `snapshot-tests.ts` — the deploy-time generator that feeds
201+`/reports/live/tests` — duplicated the regex *without* the
202+strip-strings step. So it grepped the raw source, found the fixture
203+inside the backtick string, treated it as a real `test()` call, walked
204+its (TODO-only) body, counted zero `expect()` calls, and flagged it.
205+
206+The deploy-time detector was flagging the very test that proves the
207+runtime detector works.
208+
209+Fix: export `stripStringsAndComments` from `c32_sama_verify.ts` and
210+use the same mask-index pattern in the snapshot script:
211+
212+```ts
213+import { stripStringsAndComments } from "../../src/c32_sama_verify.ts";
214+// ...
215+const mask = stripStringsAndComments(content);
216+while ((m = re.exec(content)) !== null) {
217+ // If the match position is whitespace in the mask, the original
218+ // was inside a string or comment — skip.
219+ if (mask[m.index] === " " || mask[m.index] === "\n") continue;
220+ // ... rest of the body-walking logic
221+}
222+```
223+
224+DRYing the helper across the two places that need the same string-aware
225+behaviour. Now the snapshot agrees with the verifier.
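For readers without the repo at hand, the mask-index idea can be sketched in isolation. `maskStringsAndComments` below is a simplified, hypothetical stand-in for the real `stripStringsAndComments` (it ignores regex literals and `${}` nesting in template strings), and `findRealTestNames` is likewise an illustrative name. The only property it relies on is that the mask has the same length as the source, so a regex match index can be looked up in it directly.

```typescript
// Replace the contents of string literals and comments with spaces,
// preserving newlines and overall length. Simplified: no regex literals,
// no nested ${} handling inside template strings.
function maskStringsAndComments(src: string): string {
  const out: string[] = [];
  const blank = (ch: string): string => (ch === "\n" ? "\n" : " ");
  let i = 0;
  while (i < src.length) {
    const ch = src.charAt(i);
    const next = src.charAt(i + 1);
    if (ch === "/" && next === "/") {
      // Line comment: blank up to (not including) the newline.
      while (i < src.length && src.charAt(i) !== "\n") { out.push(" "); i++; }
    } else if (ch === "/" && next === "*") {
      // Block comment: blank through the closing "*/".
      const end = src.indexOf("*/", i + 2);
      const stop = end === -1 ? src.length : end + 2;
      while (i < stop) { out.push(blank(src.charAt(i))); i++; }
    } else if (ch === '"' || ch === "'" || ch === "`") {
      out.push(" "); i++; // opening quote
      while (i < src.length && src.charAt(i) !== ch) {
        // Blank the backslash, then fall through to blank the escaped char.
        if (src.charAt(i) === "\\" && i + 1 < src.length) { out.push(" "); i++; }
        out.push(blank(src.charAt(i))); i++;
      }
      if (i < src.length) { out.push(" "); i++; } // closing quote
    } else {
      out.push(ch); i++;
    }
  }
  return out.join("");
}

// Find test("name", ...) calls, skipping matches that start inside a
// string or comment according to the mask.
function findRealTestNames(src: string): string[] {
  const mask = maskStringsAndComments(src);
  const re = /\btest\(\s*"([^"]*)"/g;
  const names: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(src)) !== null) {
    const masked = mask.charAt(m.index);
    if (masked === " " || masked === "\n") continue;
    names.push(m[1] ?? "");
  }
  return names;
}

const sample = [
  'test("Atomic: placeholder test is flagged", () => {',
  '  const fixture = `test("does nothing", () => { /* TODO */ })`;',
  '  expect(fixture.length).toBeGreaterThan(0);',
  '});',
].join("\n");

console.log(findRealTestNames(sample)); // only the outer test name
```

The regex still runs over the raw source so that test names remain readable; the mask is consulted only to decide whether a match position is real code.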
226+
227+## What the cascade was actually telling me
228+
229+The bug count for round 4 looks bad: a 12-day staleness, a flaky test,
230+a false-positive in the deploy-time detector. Three independent
231+problems.
232+
233+But the *order* is the part worth looking at. Each fix made the next
234+one visible:
235+
236+1. Deploy script ran the snapshot step → file's timestamp moved →
237+ `/reports/live` started reporting current commits.
238+2. Deploy script ran the test snapshot → tests actually ran in the
239+ deploy pipeline → the flaky test surfaced (because previously it
240+ never ran in CI), and the false-positive surfaced (because
241+ previously the snapshot was 12 days old and that particular
242+ fixture had been added since then).
243+3. Each fix's success was the precondition for the next bug to be
244+ visible.
245+
246+The cascade isn't proof the system is fragile. It's proof that the
247+system was *blind* — a layer of silent error suppression had hidden
248+every downstream failure, so they accumulated without being detected.
249+The fix was less "patch three things" than "remove the lie and watch
250+what falls out."
251+
252+This is the same shape as TDD's iron rule applied to *infrastructure*
253+rather than to source: you can't trust a pass you didn't run. The
254+deploy-pipeline checks `bun test` exits zero — but only if `bun test`
255+*ran*. If the call returns 127 (command not found) and the deploy
256+script swallows it, every later assertion is hollow.
257+
258+`/reports/live` showing all-green for 12 days was perfectly compatible
259+with the test suite being completely broken. The only way to know is
260+to delete the swallowing.
261+
262+## Why this is the empirical case for SAMA, not against it
263+
264+A naive reading is "the codebase had three bugs you didn't catch."
265+The fairer reading is: the codebase had *one* bug — silent error
266+suppression in a deploy script — and the other two were latent
267+consequences that the verifier *would have* caught the moment they
268+ran. Removing the silence took ~15 minutes. Once silence was gone, both
269+hidden bugs surfaced *on the very next deploy*, with line numbers and
270+file paths, in two cells of a public web page.
271+
272+That's the empirical pattern SAMA's pitch turns on, scaled to the
273+infrastructure layer:
274+
275+- **Verification has to be observable.** A check that runs into
276+ `2>/dev/null` is indistinguishable from a check that passes.
277+- **The cost of removing silence is low.** A `||` swallow → `||
278+ { echo ✗; exit 1; }` is a one-line change. A `2>/dev/null` →
279+ `2>&1` is one word.
280+- **Removing silence pays compounding returns.** Three bugs hidden by
281+ one suppressor — each one would have been instantly diagnosable if
282+ the surface had been honest.
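The same principle can be sketched outside of shell. The following is a minimal illustration, assuming a Node-compatible runtime that provides `node:child_process`; `runStep` and `requireStep` are hypothetical names and are not part of the deploy script. The point is that the exit status and the error text are kept and surfaced rather than discarded.

```typescript
import { spawnSync } from "node:child_process";

interface StepResult {
  ok: boolean;
  status: number | null;
  stderr: string;
}

// Run one pipeline step and keep what a reviewer needs to diagnose it.
function runStep(command: string, args: string[]): StepResult {
  const res = spawnSync(command, args, { encoding: "utf8" });
  // A binary that cannot be spawned (e.g. not on PATH) is reported via
  // res.error, with a null status, instead of via stderr.
  const stderr = res.error ? res.error.message : (res.stderr ?? "");
  return { ok: res.status === 0, status: res.status, stderr };
}

// Fail-loud wrapper: roughly the analogue of `|| { echo ✗; exit 1; }`.
function requireStep(name: string, command: string, args: string[]): void {
  const result = runStep(command, args);
  if (!result.ok) {
    throw new Error(
      `✗ ${name} failed (status ${result.status}): ${result.stderr.trim()}`,
    );
  }
}
```

A caller would wrap each load-bearing step in `requireStep` and reserve plain `runStep` for steps that are genuinely best-effort.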
283+
284+## What this still doesn't prove
285+
286+It doesn't prove that exposing every failure produces a useful signal.
287+Some failures *should* be tolerated (best-effort cleanup, optional
288+caches), and over-strict failure handling can break production for
289+trivial reasons. The judgement is *which* failures: in this case,
290+`snapshot-git-history` running was load-bearing for the public claim
291+that `/reports/live` reflects the current repo. Treating its failure
292+as "non-fatal" was a category error.
293+
294+The general principle the cascade demonstrates: in a system whose value
295+proposition is *the artefacts a reviewer can replay*, the pipeline
296+that produces those artefacts has the same audit requirements as the
297+source code does. Silent failures in the pipeline are violations of
298+the standard the same way silent failures in the source would be.
299+
300+---
301+
302+**See for yourself:**
303+
304+- Live: <https://tdd.md/reports/live> (date window is now current)
305+- Live: <https://tdd.md/reports/live/tests> ("193 passing · 0 placeholder")
306+- The PR that landed the three fixes:
307+ <https://github.com/syntaxai/tdd.md/pull/14>
308+- Previous posts in this thread:
309+ [the c21 Atomic-700 split](/blog/sama-empirical-c21-split) ·
310+ [greening the Modeled dogfood](/blog/sama-empirical-modeled-green)
added e2e/git-content-browse.spec.ts +121 −0
@@ -0,0 +1,121 @@
1+// E2E: every blog post in ALL_POSTS is reachable via /GIT/.
2+//
3+// Crawls the registry's slugs (lifted into a literal array here so
4+// the test file doesn't import server-side modules) and asserts:
5+// 1. /GIT/syntaxai/tdd.md/tree/main/content/blog lists each post
6+// 2. /GIT/syntaxai/tdd.md/blob/main/content/blog/<slug>.md renders
7+// the post (markdown rendered via marked into the chrome)
8+// 3. /GIT/syntaxai/tdd.md/raw/main/content/blog/<slug>.md serves
9+// the raw markdown
10+// Plus the tree home (/GIT/syntaxai/tdd.md/tree/main) shows the
11+// top-level directories (content/, src/, public/, scripts/, etc.).
12+
13+import { test, expect } from "@playwright/test";
14+import * as fs from "fs";
15+import * as path from "path";
16+
17+// Mirror of c31_blog.ts ALL_POSTS slugs. If a post is added there,
18+// add the slug here too. Kept inline to avoid pulling server code
19+// into the test process.
20+const BLOG_SLUGS = [
21+ "sama-meets-git-cms",
22+ "from-rules-to-checks",
23+ "agentic-coding-corpus-three-patterns",
24+ "claude-code-harness-postmortem",
25+ "three-constraints-agentic-coding",
26+ "tweag-handbook-tdd",
27+ "aider-tdd",
28+ "cursor-tdd",
29+ "claude-code-tdd",
30+];
31+
32+const SCREENSHOT_DIR = "test-results/git-content-browse";
33+
34+test.beforeAll(() => {
35+ fs.mkdirSync(SCREENSHOT_DIR, { recursive: true });
36+});
37+
38+test.describe("/GIT browses the local bare repo", () => {
39+ test("repo root tree lists the top-level directories", async ({ page }) => {
40+ const res = await page.goto("/GIT/syntaxai/tdd.md/tree/main");
41+ expect(res?.status()).toBe(200);
42+
43+ // Top-level dirs we expect after the dev tree was pushed.
44+ for (const dir of ["content", "src", "public", "scripts", "e2e"]) {
45+ await expect(
46+ page.locator(`a[href="/GIT/syntaxai/tdd.md/tree/main/${dir}"]`),
47+ ).toBeVisible();
48+ }
49+ // Top-level files
50+ await expect(
51+ page.locator('a[href="/GIT/syntaxai/tdd.md/blob/main/package.json"]'),
52+ ).toBeVisible();
53+
54+ await page.screenshot({
55+ path: path.join(SCREENSHOT_DIR, "1-repo-root-tree.png"),
56+ fullPage: true,
57+ });
58+ });
59+
60+ test("content/blog tree lists every post in ALL_POSTS", async ({ page }) => {
61+ const res = await page.goto("/GIT/syntaxai/tdd.md/tree/main/content/blog");
62+ expect(res?.status()).toBe(200);
63+ for (const slug of BLOG_SLUGS) {
64+ const link = page.locator(
65+ `a[href="/GIT/syntaxai/tdd.md/blob/main/content/blog/${slug}.md"]`,
66+ );
67+ await expect(link, `link to ${slug}.md must be present`).toBeVisible();
68+ }
69+
70+ await page.screenshot({
71+ path: path.join(SCREENSHOT_DIR, "2-content-blog-tree.png"),
72+ fullPage: true,
73+ });
74+ });
75+
76+ for (const slug of BLOG_SLUGS) {
77+ test(`blob view renders ${slug}.md as markdown via /GIT`, async ({ page }) => {
78+ const res = await page.goto(
79+ `/GIT/syntaxai/tdd.md/blob/main/content/blog/${slug}.md`,
80+ );
81+ expect(res?.status()).toBe(200);
82+ // The repo-blob-rendered container is what marked.parse output
83+ // lands in. It must exist + be non-empty.
84+ const rendered = page.locator(".repo-blob-rendered");
85+ await expect(rendered).toBeVisible();
86+ const text = (await rendered.textContent()) ?? "";
87+ expect(text.length).toBeGreaterThan(200);
88+ // The breadcrumb must show the file path so users can climb.
89+ await expect(page.locator(".commit-breadcrumb")).toContainText(`${slug}.md`);
90+ });
91+
92+ test(`raw endpoint serves ${slug}.md as text/plain via /GIT`, async ({ request }) => {
93+ const res = await request.get(
94+ `/GIT/syntaxai/tdd.md/raw/main/content/blog/${slug}.md`,
95+ );
96+ expect(res.status()).toBe(200);
97+ expect(res.headers()["content-type"]).toMatch(/text\/plain/);
98+ const body = await res.text();
99+ // Frontmatter or first heading — every blog post has one.
100+ expect(body.length).toBeGreaterThan(200);
101+ });
102+ }
103+
104+ test("path traversal is rejected", async ({ request }) => {
105+ for (const evil of [
106+ "/GIT/syntaxai/tdd.md/blob/main/../etc/passwd",
107+ "/GIT/syntaxai/tdd.md/blob/main/content/../../etc/passwd",
108+ "/GIT/syntaxai/tdd.md/tree/main//content",
109+ ]) {
110+ const res = await request.get(evil);
111+ expect(res.status(), `${evil} must 404`).toBe(404);
112+ }
113+ });
114+
115+ test("non-allowed (owner, repo) 404s — only syntaxai/tdd.md is served", async ({
116+ request,
117+ }) => {
118+ const res = await request.get("/GIT/someone/random-repo/tree/main");
119+ expect(res.status()).toBe(404);
120+ });
121+});
modified feedback.md +89 −41
@@ -1,66 +1,114 @@
1-Good question. Here is my honest, concrete feedback on what I think could be improved about tdd.md:
2-What is already good
1+SAMA 2.0: Proposal for the next evolution
32
4- Strict judge
3+Here is a worked-out, concrete version of SAMA 2.0. I keep the core intact (because it is strong), but make it more complete, more future-proof, and even better suited to agentic coding in 2026+.
4+The core stays the same (the four letters)
55
6- Hidden tests
6+ S (Sorted): Alphabetical + dependency direction (a lower number never imports a higher one).
77
8- Git-based scoring
8+ A (Architecture): Numeric layers with clear contracts.
99
10- Public verdicts
10+ M (Modeled): Types + sibling tests.
1111
12-What I would improve
12+ A (Atomic): Keep files small (~700 lines) + split per domain.
1313
14- Too purist / too strict
15- Right now it is almost “religious TDD”. This drives people away.
16- Improvement: Introduce different levels or modes:
14+What changes / is added in 2.0
1715
18- Strict Mode (current version)
16+I add two new letters → SAMA becomes SAMAX (or you keep SAMA and make the extras optional).
17+New letter: X (eXtensible & Vertical)
1918
20- Pragmatic Mode (as Kent Beck later intended): allows spikes/exploration; test-first is strongly encouraged but not sacred.
19+ Goal: Combine the strength of horizontal layers (clear dependency flow) with vertical slices (everything for one feature close together).
2120
22- Learning Mode: milder penalties for beginners.
21+ Rule: Optional feature prefix on top of the layer: c32_user_auth.ts or feat_payment_c32_processor.ts.
2322
24- Unit-level focus only
25- Much modern software also has integration, UI, performance, and architecture issues.
26- Improvement: Add katas at different layers (not just string-calculator level), including:
23+ Benefit for agents: An agent working on “user authentication” sees all relevant files grouped via the search feat_user_*.
2724
28- API katas
25+New letter: D (Documented, the fifth discipline)
2726
30- Database interaction
27+ Rule: Every module and every feature folder has a README.md or .agent.md with:
3128
32- UI/component testing
29+ One-sentence responsibility
3330
34- No distinction between exploration and implementation
35- In real projects you often do a spike first.
36- Improvement: Allow a “spike” phase to be marked explicitly, with the real TDD cycle starting only afterwards.
31+ Key types & contracts
3732
38- Scoring is too binary
39- At the moment it sometimes feels like a game of “follow the rules perfectly”.
40- Improvement: Add quality measurements, such as:
33+ Acceptance criteria / invariants
4134
42- Code simplicity / cyclomatic complexity
35+ “Where to put new code” instructions
4336
44- How small the steps were
37+ This becomes the living specification for the agent.
4538
46- How good the names of tests and variables are
39+Extended Layer Mapping (Architecture 2.0)
40+Layer | Name | Responsibility | Examples | May import from
41+c11 | Entry / Composition Root | App bootstrap, wiring | main.ts, server.ts | Everything
42+c13 | Data / Persistence | DB, repositories, queries | c13_user_repo.ts | c31
43+c14 | I/O Adapters | HTTP, queues, external APIs | c14_auth_controller.ts | c21, c31, c32
44+c21 | Handlers / Presenters | Request/response orchestration | c21_login_handler.ts | c32, c31
45+c31 | Models & Types | Domain models, DTOs, Value Objects, Zod schemas | c31_user.ts | - (pure)
46+c32 | Business Logic / Use Cases | Pure functions, domain rules | c32_user_auth.ts | c31
47+c40 | Application Services | Orchestration of multiple use cases (new!) | c40_user_service.ts | c32, c31
48+c51 | UI / Presentation | Components, pages, server components | c51_user_profile.tsx | c31, c32
49+c60 | Infrastructure / External | Third-party clients, config, logging | c60_stripe_client.ts | -
50+c70 | Cross-cutting | Auth middleware, logging, monitoring, tracing | c70_logger.ts | everything (with care)
4751
48- Whether the code is idiomatic for the language
52+Atomic rule refined: Max ~600-700 lines or max 1 feature per file (even if it is smaller).
53+Tooling & Automation (critical for a gold standard)
4954
50- Too little feedback for improvement
51- You get a score, but not always an understandable explanation of why you scored badly.
52- Improvement: Better, human explanations + suggestions (“You made 3 commits without a failing test”, “Your hidden tests failed on edge case X”).
55+SAMA 2.0 must have this:
5356
54- Too little variety in katas
55- Starting with string-calc is fine, but more should follow quickly (e.g. a small web API, a game loop, a parser, etc.).
57+ Official sama CLI
5658
57- Community & education
58- Add a “Why” section that explains when strict TDD makes sense and when it does not. Right now it projects “this is the only right way” too strongly.
59+ sama check → runs all verifications
5960
60-My ideal version of tdd.md
61+ sama new c32_user_auth → creates file + template + .agent.md
6162
62-A platform that measures not only how well you follow TDD but also how well you think as an engineer, with the flexibility that experienced developers (including Kent Beck) apply in practice.
63+ sama split → helps with Atomic splits
6364
64-In short:
65-tdd.md is currently a strict TDD judge.
66-I would rather see it as a smart TDD coach that teaches discipline but also encourages mature, context-aware engineering.
65+ sama verify-repo
66+
67+ Pre-commit hooks + GitHub Action
68+
69+ LSP / Editor integration (VS Code, Cursor, Windsurf)
70+
71+ Claude / Cursor skill met automatische checks
72+
73+Additional improvements
74+
75+ Monorepo support: Package prefixes (pkg_auth_c32_...)
76+
77+ Gradual adoption: SAMA Light (Sorted + Atomic only) → Full SAMAX
78+
79+ Frontend + Backend harmonisation (especially Next.js App Router)
80+
81+ Deprecation policy: A clear way to phase out old layers/files
82+
83+ Performance & Security gates in the Atomic check (e.g. flag large N+1 queries)
84+
85+Example folder structure (SAMA 2.0)
86+
87+src/
88+├── c11_app.ts
89+├── features/
90+│ └── user/
91+│ ├── feat_user_c13_repo.ts
92+│ ├── feat_user_c31_types.ts
93+│ ├── feat_user_c32_auth.ts
94+│ ├── feat_user_c32_profile.ts
95+│ ├── feat_user_c40_service.ts
96+│ └── feat_user.agent.md
97+├── c60_stripe.ts
98+└── c70_auth_middleware.ts
99+
100+Summary: Why this becomes the gold standard
101+
102+ Retains mechanical verifiability (grep + CLI)
103+
104+ Becomes hybrid (horizontal + vertical)
105+
106+ Adds documentation as a first-class citizen
107+
108+ Becomes genuinely toolable and adoptable
109+
110+ Scales from a solo-agent project to a mid-sized team
111+
112+This is not over-engineering: it solves exactly the pain points that arise when projects with agents grow beyond 20-30k lines.
113+
114+Do you want me to work out a complete official spec (like the original tdd.md page), including verification commands and templates? Or focus on one part (e.g. the CLI spec or a concrete Next.js template)?
added plan.md +321 −0
@@ -0,0 +1,321 @@
1+# Plan — port podman/syntax CMS into tdd.md, SAMA-native
2+
3+**Goal.** Port the CMS from `~/Documents/podman` (sx-filter + sx-editor + sx-content + Ghost-compat theme) to tdd.md in full, in 100% SAMA style, with the existing tdd.md content migrated intact.
4+
5+**Non-goal.** Podman, Caddy, or a second service tier in tdd.md. Everything runs in the single Bun process we already have (`c11_server.ts`). Caddy's role (TLS + routing) is handled by our deploy layer on p620.
6+
7+---
8+
9+## ⚠ Decide first: storage canon
10+
11+This drives every other choice. Two options; I default to **A** unless you flip it.
12+
13+### A. Git canon (default; preserves tdd.md identity)
14+
15+- The source of truth remains the bare repo `/app/repo` (current stack).
16+- **Every save in the editor = one commit** via the existing `c14_git.commitFile`.
17+- sxdoc trees (typed blocks) live as sidecar JSON next to the markdown:
18+ `content/blog/foo.md` + `content/blog/foo.sxdoc.json`.
19+- SQLite (the existing `c13_database`) gets a derived index table
20+ (`content_index`) for fast list queries and taxonomy lookups, **rebuildable from git**. Drop it, replay `git log`, and it is back.
21+- Advantage: the "SAMA meets git" story still holds. `sama-meets-git-cms.md` stays true. Audit trail = `git log content/`.
22+- Disadvantage: more complex than podman's direct SQLite writes. Slower for large sites (>10k posts), which is not relevant at our scale.
23+
24+### B. SQLite canon (1-to-1 podman port)
25+
26+- `content/*.md` is imported once into the `sx_documents` + `posts` tables, then read-only.
27+- The editor writes exclusively to SQLite. Git history of content stops at the migration commit.
28+- Advantage: minimal deviation from podman's code. Faster to port.
29+- Disadvantage: tdd.md loses "every content edit = commit", the core of the product per memory.
30+
31+**Decision 2026-05-11: B (SQLite canon) + git as audit mirror. Locked.**
32+
33+---
34+
35+## Locked decisions (2026-05-11)
36+
37+### Storage canon: **B (SQLite canon)** + git as audit mirror
38+- **Canonical:** the `sx_documents` table in `c13_database` (bun:sqlite). The editor reads/writes here; live preview and all render paths read from here.
39+- **Audit mirror:** every save → 1 multi-path commit with `content/{slug}.md` (derived markdown projection) + `content/{slug}.sxdoc.json` (canonical JSON tree). This keeps `git log content/` as the human-readable audit trail; "every save = one commit" from `sama-meets-git-cms.md` stays true: the **canonicity** now lives in SQLite, the **evidence** in git.
40+- **Recovery:** SQLite corruption? Drop the table and replay from `*.sxdoc.json`.
41+- **Initial migration:** a one-off `scripts/migrate_content_to_sxdoc.ts` reads the current `content/**/*.md` → parses to `SxDocument` → writes SQLite + emits one migration commit with all new `.sxdoc.json` files alongside.
42+
43+### Parser layer: **c31** · Render layer: **c51**
44+- `c31_sxdoc_parse.ts` (HTML → SxDocument) + sibling `c31_sxdoc_parse.test.ts`.
45+ Reason: `content/sama/modeled.md` is explicit: *"every external input has a parser in a c31_* model"*. HTML strings from the editor/migration are external input → c31.
46+- `c51_render_sxdoc.ts` (SxDocument → HTML) + sibling `c51_render_sxdoc.test.ts`.
47+ Reason: `content/sama/architecture.md` picking-order rule 4: *"Does it produce HTML? Yes → c51"*. sxToHtml produces HTML.
48+- **Correction relative to the earlier plan + research-migration:** the parser/renderer were wrongly placed on c32 (the research only looked at the verifier hard rule "c32 requires a sibling test", but the canon docs direct otherwise). The tests stay (a c31 sibling is informationally required via Modeled; likewise for c51 for good maintenance, even though the verifier does not enforce it).
49+
50+### Commit shape: **one multi-path commit per save**
51+- `c14_git` gets a new `commitFiles(paths: Array<{path, body}>)` next to the existing `commitFile`.
52+- One commit → an atomic rollback of that SHA restores both files.
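A minimal sketch of what one save could hand to `commitFiles`, assuming the `{path, body}` shape named in the plan. `SxDocumentSketch`, `toMarkdown`, and `buildSavePaths` are hypothetical stand-ins; the real `SxDocument` type and markdown projection are not shown in this plan, and the git call itself is left out.

```typescript
// Entry shape taken from the plan: commitFiles(paths: Array<{path, body}>).
interface CommitPath {
  path: string;
  body: string;
}

// Hypothetical, deliberately tiny stand-in for the real SxDocument type.
interface SxDocumentSketch {
  title: string;
  blocks: Array<{ kind: string; text: string }>;
}

// Hypothetical projection: a heading plus one paragraph per block.
function toMarkdown(doc: SxDocumentSketch): string {
  const body = doc.blocks.map((block) => block.text).join("\n\n");
  return `# ${doc.title}\n\n${body}\n`;
}

// Build both files for one save so they can travel in a single commit.
function buildSavePaths(slug: string, doc: SxDocumentSketch): CommitPath[] {
  return [
    { path: `content/${slug}.md`, body: toMarkdown(doc) },
    { path: `content/${slug}.sxdoc.json`, body: JSON.stringify(doc, null, 2) + "\n" },
  ];
}

const paths = buildSavePaths("foo", {
  title: "Foo",
  blocks: [{ kind: "paragraph", text: "Hello." }],
});
console.log(paths.map((p) => p.path));
```

Keeping the builder pure means the two-file invariant can be tested without touching a repository; only the caller that passes the result to `commitFiles` needs I/O.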
53+
54+---
55+
56+## Way of working (build discipline per file landing)
57+
58+Every file write must pass all four SAMA axes before the next file lands. No pile-up of violations.
59+
60+| Axis | What it enforces | How we enforce it |
61+|---|---|---|
62+| **Sorted** | c1*/c3* may not do relative upward imports | Build bottom-up: c1 → c3 → c2 → c5. Never import from a higher layer. |
63+| **Architecture** | prefix ∈ {11, 13, 14, 21, 31, 32, 51} | Assign the layer before typing. I/O? → c14. Logic+transform? → c32. Pure types/registry? → c31. |
64+| **Modeled** | c32_*.ts requires a sibling .test.ts (hard); c31 = info-only | **c32 source + test land in the same edit batch**, never separately. Each `test(...)` body has ≥1 `expect()`. |
65+| **Atomic** | ≤700 LOC per file; no placeholder tests | Check `wc -l` before commit. Splits are budgeted (client/render per block kind; shortcodes registry+substitute). |
66+
67+### SAMA canon not enforced by the verifier (per `content/sama/*.md`)
68+
69+- **Flat `src/`**: no server-side subdirs. Client under `src/client/**.ts` (outside the verifier glob).
70+- **No barrel re-exports** (`atomic.md`).
71+- **c31/c32 do not import I/O modules** (sharp, fs, bun:sqlite, fetch): the verifier only sees relative imports, so this is personal discipline.
72+- **One concept per file**: types separate from parser separate from renderer.
73+
74+### Verification cadence
75+
76+After **every** file landing (not just at the end of a phase):
77+
78+```
79+bun test src/c32_sama_verify.test.ts # verifier itself green?
80+bun test src/<file>.test.ts # new sibling test green?
81+wc -l src/c*<file>*.ts # no file > 700
82+```
83+
84+At the end of every phase:
85+```
86+bun test # everything green
87+bun run src/c11_server.ts & # boot smoke
88+curl localhost:3000/health # 200
89+```
90+
91+### Anti-patterns (explicitly forbidden)
92+
93+- **"It works, the test comes later"** for c32: source and test land together or not at all.
94+- **Refactoring lower-layer code during a higher-layer phase**, e.g. adding `c14_git.commitFiles` during Phase 1 because it is "handy". Lower-layer changes belong to the phase where the caller lands.
95+- **Subfolders under `src/`** on the server side to "organise it more neatly". Flat is SAMA canon.
96+- **Improvising on layer assignment**: if you are unsure between c31 and c32, default to c32 (sibling test = safety net).
97+
98+---
99+
100+## Layer corrections from research-migration.md
101+
102+Plan.md was wrong in three places according to the SAMA rules in `c32_sama_verify.ts`. Corrected (the two sxdoc rows were later revised again to `c31_sxdoc_parse.ts` and `c51_render_sxdoc.ts`; see the parser/render layer decision above):
103+
104+| Was | Becomes | Reason |
105+|---|---|---|
106+| `c31_image_resize.ts` | `c14_image_resize.ts` | sharp does I/O, so c14 is required |
107+| `c31_ai_edit_block.ts` | `c14_openrouter.ts` (HTTP) + `c32_ai_edit_block.ts` (validate/transform) | OpenRouter HTTP = c14; orchestration + sibling test = c32 |
108+| `c31_sxdoc_parse.ts` | `c32_sxdoc_parse.ts` | logic, not pure types, and c32 requires a sibling test |
109+| `c31_sxdoc_render.ts` | `c32_sxdoc_render.ts` | ditto |
110+
111+### Atomic-700 splits budgeted
112+
113+| File | LOC on direct port | Splits |
114+|---|---|---|
115+| `sx-editor/src/client/render.ts` | 775 | over Atomic-700; split per block kind under `src/client/blocks/render-{p,h,list,quote,code,img,html,shortcode}.ts` + one `src/client/render.ts` dispatch ≤200 LOC |
116+| `sx-filter/src/shortcodes.ts` | 650 | tight; split right away along `c31_shortcodes_registry.ts` (built-ins) + `c32_shortcodes_substitute.ts` (HTML rewriter with region skip) |
117+
118+### Tests-are-siblings rule (was not made explicit)
119+
120+Podman's `sx-editor/tests/unit.test.ts` shape is **incompatible**. Every `cXX_*.test.ts` must sit as a sibling next to `cXX_*.ts` under `src/`. Existing tdd.md tests already do this correctly.
121+
122+### Client-side placement: `src/client/**.ts`
123+
124+No verifier impact (only `cXX_*.ts` is scanned). Relative imports to `../c31_sxdoc.ts` work from here. Bun.build bundles from `src/client/`. No new top-level dir.
125+
126+### Forbidden subdirs under `src/`
127+
128+Podman's `sxdoc/`, `core/`, `db/`, `client/blocks/` may not remain under `src/` in the server port. This does **not** apply to `src/client/` (which sits outside verifier scope). Keep server code flat.
129+
130+---
131+
132+## SAMA mapping: podman piece → tdd.md cXX layer
133+
134+SAMA convention (per memory): cXX_*.ts, `c1X` = data/I-O, `c2X` = handlers/app, `c3X` = pure logic, `c5X` = render. Lower layer never imports higher.
135+
136+| Podman | tdd.md (new) | SAMA layer | What it does |
137+|---|---|---|---|
138+| `sx-data/sx.db` schema | `c13_database.ts` (extend) | c1 | tables `sx_documents`, `media`, `content_index`, `api_keys` |
139+| `sx-editor/src/sxdoc/types.ts` | `c31_sxdoc.ts` | c3 | `SxDocument`, `Block`, helpers; pure types/registry |
140+| `sx-editor/src/sxdoc/html-to-sx.ts` | `c31_sxdoc_parse.ts` (+ sibling `.test.ts`) | c3 | HTML → SxDocument (parser = c31 per Modeled.md) |
141+| `sx-editor/src/sxdoc/sx-to-html.ts` | `c51_render_sxdoc.ts` (+ sibling `.test.ts`) | c5 | SxDocument → HTML (produces HTML = c51 per Architecture.md) |
142+| `sx-editor/src/sxdoc/db.ts` | `c13_database.ts` extend (saveDocument/loadDocument/listDocuments/deleteDocument) | c1 | SQLite read/write (canon-B); bun:sqlite = c13, not c14 |
143+| `sx-editor/src/upload.ts` + sharp resize | `c14_media.ts` + `c14_image_resize.ts` | c1 | upload, on-disk store, sharp transforms (sharp = I/O) |
144+| `sx-editor/src/ai.ts` (OpenRouter) | `c14_openrouter.ts` + `c32_ai_edit_block.ts` | c1 + c3 | HTTP call in c14; validate + transform in c32 with sibling test |
145+| `sx-editor/src/templates.ts` (list/edit shells) | `c51_render_admin.ts` | c5 | admin-list + edit-page chrome |
146+| `sx-editor/src/routes.ts` (urlForPage/Post) | extend existing `c31_site_config.ts` | c3 | routes.yaml equivalent; we already have routes config |
147+| `sx-editor/src/client/blockeditor.ts` + `slashmenu.ts` + `blocks/*` | `client/` (TS bundle) → served by `c21_handlers_edit.ts` | client | block-editor JS, slash-menu, AI ✨, autosave |
148+| `sx-editor/src/build.ts` (Bun.build serve) | `c14_client_bundle.ts` | c1 | bundle TS client → ESM, cached in memory |
149+| `sx-filter/src/shortcodes.ts` (650 LOC; over 700 within one addition) | **split**: `c31_shortcodes_registry.ts` (built-ins, names, args) + `c32_shortcodes_substitute.ts` (HTML rewriter with meta/script skip) + required `.test.ts` on the c32 | c3 | parsing/substitution, prevents an Atomic-700 violation |
150+| `sx-filter/src/admin.ts` (admin-button injection) | existing edit flow already has a login gate | c2 | n/a; we have real auth |
151+| `sx-content/src/render.ts` (Handlebars renderer) | `c51_render_theme.ts` | c5 | Ghost-compat theme renderer; **no Handlebars dep**, pure TS template helpers |
152+| `sx-content/src/sitemap.ts` | `c51_render_sitemap.ts` | c5 | sitemap.xml + RSS |
153+| `sx-content/src/images.ts` | part of `c14_media.ts` above | c1 | path-routed /content/images/* |
154+| `sx-themes/syntax/*.hbs` partials | `theme/*.html` or `c51_render_theme_partials.ts` | c5 | Ghost look, but as TS template helpers |
155+
156+### New handlers (c21)
157+
158+- `c21_handlers_admin_list.ts`: `/admin/` list of pages+posts
159+- `c21_handlers_admin_edit.ts` — `/admin/edit/{type}/{slug}` (block-editor)
160+- `c21_handlers_admin_new.ts` — `/admin/new`
161+- `c21_handlers_admin_upload.ts` — `/admin/upload`
162+- `c21_handlers_admin_ai.ts` — `/admin/ai/edit-block`
163+- `c21_handlers_admin_preview.ts` — `/admin/preview` (live render)
164+- `c21_handlers_content.ts` — public render dispatcher (post/page/tag/author)
165+- `c21_handlers_sitemap.ts` — `/sitemap.xml`, `/blog/rss/`
166+- `c21_handlers_media.ts` — `/content/images/*`
167+
168+The existing `c21_handlers_edit.ts` is **replaced** by `c21_handlers_admin_edit.ts` (block editor instead of textarea).
169+
170+---
171+
172+## Content migration
173+
174+Existing tdd.md content:
175+```
176+content/home.md
177+content/blog/*.md (9 posts)
178+content/sama/*.md (5 pages)
179+content/games/*/ (2 games — multi-file)
180+content/guides/*.md (3 pages)
181+content/git-history/* (commit-meta JSON)
182+```
183+
184+Migration strategy (canon B, SQLite + git mirror):
185+
186+1. **One-off script** `scripts/migrate_content_to_sxdoc.ts` (runs locally, not in the container).
187+2. For each `.md`: read frontmatter (title, tags, status), parse body → `SxDocument` via `c31_sxdoc_parse`, **insert** into the `sx_documents` table, and also write `*.sxdoc.json` alongside for the git mirror.
188+3. `home.md` → slug `_home` (matches podman's special `_home` slug).
189+4. Games (`content/games/*/`) stay multi-file: outside CMS scope, still rendered via `c31_games.ts`.
190+5. `git-history/` is not content; no migration needed.
191+6. One batch commit: "Migrate: content → sxdoc (SQLite-canon + git-mirror)" with all `*.sxdoc.json` additions.
192+
193+Public URLs stay the same (they are already routed via `c31_site_config`). The Ghost-style `/blog/{primary_tag}/{slug}/` permalink is optional and goes through the redirects layer we already have.
194+
195+---
196+
197+## Phasing
198+
199+Per memory: bypass-pacing / JOLO is OK for scopes that fit in one run. This is a multi-day port, so I am keeping the phasing, with deploy + verify per phase.
200+
201+### Phase 0: decision + scaffolding (completed 2026-05-11)
202+- [x] Record the plan (this document).
203+- [x] Storage canon confirmed: **B (SQLite-canon + git-audit-mirror)**.
204+- [x] Parser/render layer confirmed: **c31** (parser) · **c51** (render).
205+- [x] Commit shape confirmed: **one multi-path commit per save**.
206+- [x] Research-migration investigation completed → `research-migration.md`.
207+- [x] Layer corrections incorporated into the mapping table.
208+- [ ] Commit `plan.md` (waiting for user go-ahead).
209+
210+### Phase 1: sxdoc foundation (in progress 2026-05-11)
211+- [x] `c31_sxdoc.ts`: types only (no sibling test required)
212+- [x] `c31_sxdoc_parse.ts` (HTML→tree, port of podman `html-to-sx.ts`) + sibling `c31_sxdoc_parse.test.ts`
213+- [x] `c51_render_sxdoc.ts` (tree→HTML, port of podman `sx-to-html.ts`) + sibling `c51_render_sxdoc.test.ts`
214+- [x] Skip typed marketing blocks: not needed for tdd.md content (~600 LOC saved).
215+- [x] `c13_database.ts` extend: `sx_documents` table + saveDocument/loadDocument/listDocuments/deleteDocument
216+- [x] `package.json`: `node-html-parser` added
217+- [x] `bun install`: `node-html-parser` installed
218+- [x] `bun test src/c31_sxdoc_parse.test.ts src/c51_render_sxdoc.test.ts`: 53/53 ✓
219+- [x] `bun test src/c32_sama_verify.test.ts`: 10/10 ✓ (verifier itself green)
220+- [x] `bun test` (full suite): 120/120 ✓ (67 pre-Phase-1 + 53 new)
221+- [x] `wc -l` on new files: highest 327 LOC (c31_sxdoc_parse), c13_database 390 LOC; all < 700
222+- [x] `bun run src/c11_server.ts` boot smoke OK: `/` and `/sama` both 200
223+- `c14_client_bundle.ts` (Bun.build memoised): only arrives in Phase 2
224+- No route impact: everything is purely unit-tested, nothing changed on the live site.
225+
226+**Phase 1 gates passed 2026-05-11. Sxdoc foundation SAMA-canon compliant and green.**
227+
228+### Phase 2: admin UI
229+
230+**2a: server-side CRUD (completed 2026-05-11):**
231+- [x] `c31_admin_validation.ts` + sibling test (14/14 green): parser/validator per Modeled.md
232+- [x] `c51_render_admin.ts`: list + edit form + login/non-admin walls
233+- [x] `c21_handlers_admin.ts`: adminListHandler, adminNewHandler, adminEditHandler, adminDeleteHandler (one file instead of the 4 in the plan spec; matches the existing `c21_handlers_agents`/`c21_handlers_auth` pattern, 218 LOC)
234+- [x] `c21_app.ts` routes: /admin, /admin/new, /admin/edit/:type/:slug, /admin/delete/:type/:slug
235+- [x] Boot smoke: anonymous → 401, login wall renders ✓
236+- ⚠ `c21_app.ts` is now 702 LOC (Atomic limit of 700 exceeded by 2 lines). Needs a separate split refactor: extract c21_handlers_projects.ts, c21_handlers_api_agents.ts, c21_handlers_webhook.ts from the inline part.
237+
238+**2b: client-side block editor (completed 2026-05-11):**
239+- [x] `c14_client_bundle.ts`: Bun.build memoised + ETag, 72 LOC
240+- [x] `src/client/blockeditor.ts`: hydration + state + autosave + raw-mode toggle, 336 LOC
241+- [x] `src/client/slashmenu.ts`: filterable popup with arrows/enter/escape, 161 LOC
242+- [x] `src/client/blocks.ts`: per-block-kind renderers (p, h, ul, ol, quote, code, img, hr, html, shortcode), inline marks parser, slash trigger, 393 LOC (one file instead of `blocks/*`; under Atomic-700)
243+- [x] `c51_render_admin.ts`: now takes SxDocument as input, projects to textarea HTML + embeds `<script id="sxdoc-initial">` JSON, loads bundle `<script type="module" src="/admin/assets/blockeditor.js">`
244+- [x] `c21_handlers_admin.ts`: JSON autosave path (`Accept: application/json` → `{ok:true,ts,slug,type}`)
245+- [x] `c21_app.ts` route `/admin/assets/blockeditor.js` with ETag/304
246+- [x] `public/style.css` admin-editor section added (~190 LOC editor + slashmenu + toast)
247+- [x] Bundle compiles (26KB), serves 200, ETag → 304 ✓
248+- [x] Boot smoke: /admin still 401 for anonymous (auth gate intact) ✓
249+- [x] Full suite 134/134 ✓
250+- E2E spec `e2e/admin-block-editor.spec.ts`: postponed; needs an admin-session helper, better done in a separate turn
251+- Deploy + verify on p620: waiting for user go-ahead
252+
253+**Tech debt from Phase 2:**
254+- c21_app.ts is now **716 LOC** (Atomic limit of 700 exceeded by 16). The file existed long before the admin port; my route additions pushed it over. Split along the `c21_handlers_projects.ts` / `c21_handlers_api_agents.ts` / `c21_handlers_webhook.ts` pattern (already used for agents/auth/reports/sama): a separate refactor, not a Phase 3 blocker.
255+
256+### Phase 3: media + AI
257+- `c14_media.ts` (upload + on-disk store under `content/images/`)
258+- `c14_image_resize.ts` (sharp wrapper; sharp is I/O = c14)
259+- `c21_handlers_media.ts` (GET /content/images/...)
260+- `c21_handlers_admin_upload.ts` (slash-menu image card target)
261+- `c14_openrouter.ts` (HTTP call) + `c32_ai_edit_block.ts` (validate+transform, sibling test required) + `c21_handlers_admin_ai.ts` (✨ button)
262+- E2E: image upload + AI edit
263+- Deploy + verify
264+
265+### Phase 4: public renderer + Ghost-look theme
266+- `c51_render_theme.ts` (port of podman's Handlebars partials to TS template helpers, no Handlebars dep)
267+- `c51_render_theme_partials.ts` (nav, footer, post-card, post-list)
268+- `c31_shortcodes_registry.ts` (built-in list + arg schemas)
269+- `c32_shortcodes_substitute.ts` (HTML rewriter with meta/script skip) + `.test.ts`
270+- `c21_handlers_content.ts` swap: current render paths → new theme
271+- CSS port: `sx-themes/syntax/assets/*` → `public/style.css` (extended)
272+- E2E: visual parity checks (`e2e/theme-parity.spec.ts`)
273+- Deploy + verify
274+
275+### Phase 5: sitemap, RSS, live preview
276+- `c51_render_sitemap.ts` + `c21_handlers_sitemap.ts`
277+- `c21_handlers_admin_preview.ts` + live-preview iframe in admin
278+- Deploy + verify
279+
280+### Phase 6: content migration + cutover
281+- Run `scripts/migrate_content_to_sxdoc.ts` locally
282+- Commit all `*.sxdoc.json` as one migration batch
283+- Remove the old `c21_handlers_edit.ts` + `c51_render_edit.ts` (the block editor is canonical)
284+- SAMA verify green (`c32_sama_verify` over all new files)
285+- Deploy + visual diff against pre-migration
286+- Update memory: "tdd.md CMS = sxdoc block-editor, Ghost-compat theme, SQLite-canon + git-audit-mirror"
287+
288+---
289+
290+## Risks
291+
292+| Risk | Mitigation |
293+|---|---|
294+| Block-editor JS is large (slashmenu + 7 block types) | Bundle on demand via `c14_client_bundle`, cached in memory. No build step outside Bun.build. |
295+| Ghost Handlebars helpers (`{{#foreach}}`, `{{date}}`, `{{img_url}}`) must be re-implemented by hand | Only a small arsenal is needed; all in `c51_render_theme.ts` + unit-tested. No Handlebars dep. |
296+| SAMA verify trips over client/ files | client/ falls outside the cXX naming rule; already supported (see existing `e2e/`). |
297+| Content migration is lossy for markdown with embedded HTML | sxdoc has the `html` block escape hatch; the parser falls back to it. Visual parity check in Phase 6. |
298+| Sharp binary in container | Does not exist in tdd.md yet. The Quadlet image build must bake in `sharp`: a one-off Dockerfile edit. |
299+| `OPENROUTER_API_KEY` missing in prod | AI ✨ returns 503 with a hint (as in podman). Not blocking. |
301+---
302+
303+## Counting
304+
305+- Podman LOC (sx-editor + sx-content + sx-filter, `src/` only): ~6-8k estimated.
306+- Tdd.md LOC now: 7.5k.
307+- The port adds an estimated 4-6k (no Handlebars overhead, reuse of the existing c13/c14/c32 layer).
308+- End state: ~12-13k LOC, one Bun process, one SQLite file, one bare repo. No extra services.
309+
310+---
311+
312+## Open questions (can wait)
313+
314+For Phase 1 the locked decisions are sufficient. These questions become relevant per phase:
315+
316+1. ~~Storage~~ ✅ B (SQLite-canon + git-mirror), locked.
317+2. **Permalink shape** (Phase 4): `/blog/{primary_tag}/{slug}/` (Ghost-style) or `/blog/{slug}/` (current)? Recommended: keep the current shape, so the 9 existing URLs keep working.
318+3. **AI element-edit in prod** (Phase 3): set `OPENROUTER_API_KEY` on p620, or local/dev only (503 in prod)?
319+4. **Games** (Phase 6): outside the CMS via the existing `c31_games.ts`, or also via sxdoc? Recommended: outside.
320+5. **Ghost Content API endpoints** (`/ghost/api/content/...`): port them too? Recommended: drop, saves ~150 LOC. Tdd.md has no external API consumers.
321+6. **Marketing blocks** (hero/feature-card/etc.): port or skip? Recommended: skip; not needed for tdd.md content, saves ~600 LOC.
added research-migration.md +567 −0
@@ -0,0 +1,567 @@
1+# research-migration — porting podman/syntax CMS into SAMA-native tdd.md
2+
3+Companion to `/var/home/scri/Documents/tdd.md/plan.md`. Read that first
4+for the high-level mapping; this goes deep on the points plan.md
5+handwaved. All line references are to files in
6+`/var/home/scri/Documents/podman/` and `/var/home/scri/Documents/tdd.md/`.
7+
8+## What I found that plan.md misses
9+
10+1. **`c32_sama_verify.ts` enforces stricter rules than plan.md
11+ assumed.** Layer-prefix whitelist is `{11, 13, 14, 21, 31, 32, 51}`
12+ (line 188). Plan.md proposes `c31_image_resize.ts`, but `sharp(...)`
13+ is I/O — per `content/sama/architecture.md:13-16` resize belongs in
14+ c14, OR c32 with `sharp` passed via DI. Same for plan.md's
15+ `c31_ai_edit_block.ts` (calls OpenRouter — must split into c14+c32).
16+2. **The verifier's import scanner only inspects relative `./xxx.ts`
17+ paths** (line 119-120). A bare `import sharp from "sharp"` in a c31
18+ file is invisible to the gate. The "no I/O in c31" rule is
19+ discipline, not enforcement.
20+3. **Atomic threshold is 700 lines** (line 309). Two podman files
21+ over/at the line on day one:
22+ `sx-editor/src/client/render.ts` (775 — **violation**),
23+ `sx-filter/src/shortcodes.ts` (650 — one new shortcode tips it).
24+ Plan.md doesn't budget these splits.
25+4. **Placeholder-test detection is part of Atomic** (lines 254-298).
26+ Every `test()/it()` body needs ≥1 `expect()`. Snapshot tests
27+ (`toMatchSnapshot`) qualify, but should not be the default.
28+5. **Modeled is asymmetric** (lines 219-248). c32 without sibling test
29+ = hard violation; c31 missing sibling = informational only. So
30+ `c31_sxdoc.ts` (types) is fine without a test;
31+ `c32_sxdoc_parse.ts` (logic) is not. Plan.md's `c31_sxdoc_parse.ts`
32+ is the wrong layer — the parser is a deterministic transform, not
33+ pure types/registry.
34+6. **Podman uses subdirectories (`sxdoc/`, `core/`, `db/`, `client/`).**
35+ tdd.md's `src/` is flat (verified: no subdirs). SAMA's verifier
36+ doesn't walk subdirs, but the convention bans them — server-side
37+ files **must** flatten into top-level `cXX_*.ts`. plan.md mentions
38+ this only for `client/` and only obliquely.
39+7. **Live-preview cannot be commit-driven.** Plan.md picks git-canon
40+ (commit on every save), but `/admin/preview` runs on a ~200ms
41+ debounce. The preview path must skip `c14_git` entirely and render
42+ from in-memory sxdoc. Call this out so the handler is shaped
43+ correctly from the start.
44+8. **Ghost-style `/blog/{primary_tag}/{slug}/` permalink breaks 9
45+ existing post URLs.** Plan.md asks the question but doesn't count.
46+ Keep `/blog/{slug}/` unless there's a content reason to migrate.
47+
48+---
49+
50+## 1 — SAMA-verifier compliance
51+
52+### Exact rules (`src/c32_sama_verify.ts`)
53+
54+| letter | rule | line |
55+|---|---|---|
56+| S (Sorted) | c1*/c3* must NOT relative-import c5*/c9* (c21 exempt) | 149-185 |
57+| A (Architecture) | prefix ∈ {11,13,14,21,31,32,51} | 188 |
58+| M (Modeled) | c32_* needs sibling .test.ts (hard); c31_* missing = info only | 219-248 |
59+| A (Atomic) | cXX_*.ts ≤ 700 lines; every test() body needs ≥1 expect() | 300-326 |
60+
61+Verifier walks only `cXX_*.ts` files; everything else under `src/` is
62+ignored. **Client-bundle source under `src/client/**.ts` is therefore
63+out of scope** — fine.
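
As an illustration of the Architecture rule and the scan scope described above, here is a minimal sketch. It is not the verifier's actual code: `checkLayerPrefix`, `PrefixCheck`, and the filename regex are hypothetical, and only the whitelist is taken from the table.

```typescript
// Whitelist quoted from the Architecture rule; everything else is illustrative.
const ALLOWED_PREFIXES = new Set(["11", "13", "14", "21", "31", "32", "51"]);

type PrefixCheck =
  | { kind: "ignored" } // not a cXX_*.ts file, so outside verifier scope
  | { kind: "ok"; prefix: string }
  | { kind: "violation"; prefix: string };

function checkLayerPrefix(fileName: string): PrefixCheck {
  // ASSUMPTION: a flat file name, two digits after "c", then an underscore.
  const match = /^c(\d{2})_[^/]*\.ts$/.exec(fileName);
  if (!match) return { kind: "ignored" };
  const prefix = match[1] ?? "";
  return ALLOWED_PREFIXES.has(prefix)
    ? { kind: "ok", prefix }
    : { kind: "violation", prefix };
}

console.log(checkLayerPrefix("c32_sama_verify.ts").kind);
console.log(checkLayerPrefix("blockeditor.ts").kind);
```

Under this sketch a client file such as `blockeditor.ts` is simply ignored, which mirrors why `src/client/**.ts` has no verifier impact.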
64+
65+### Subdirectories
66+
67+Server code in podman is split across `sx-editor/src/{sxdoc,core,db}/`
68+and `sx-content/src/{sxdoc,core,db}/`. tdd.md is flat:
69+`ls src/` returns only `cXX_*.ts` + `.test.ts` siblings. SAMA prefix
70+replaces folder semantics. **All server-side podman files flatten**:
71+- `sxdoc/types.ts` → `c31_sxdoc.ts`
72+- `sxdoc/html-to-sx.ts` → `c32_sxdoc_parse.ts` (+ `.test.ts`)
73+- `sxdoc/sx-to-html.ts` → `c32_sxdoc_render.ts` (+ `.test.ts`)
74+- `sxdoc/db.ts` → `c14_sxdoc_sidecar.ts` (Option A) or `c14_sxdoc_store.ts` (Option B)
75+- `core/schema.ts` + `db/sqlite.ts` → merge into existing `c13_database.ts`
76+- `core/posts.ts` (editor & content) → one `c13_posts.ts`
77+- `core/settings.ts` → extend `c31_site_config.ts`
78+- `sxdoc/index.ts` (barrel) → DELETE (SAMA bans barrel re-exports per
79+ `content/sama/atomic.md`)
80+
81+### Client-side placement
82+
83+tdd.md has **no precedent** for client TS today: `public/` holds
84+`og.svg`, `style.css`, `sama-cli` (binary). `e2e/` holds Playwright
85+specs. Options for the block-editor client:
86+- **A. `src/client/**.ts`** — outside verifier glob, relative imports
87+ to `../c31_sxdoc.ts` work, `Bun.build` bundles from here. Recommended.
88+- B. `client/` at repo root — separates browser more clearly; new
89+ top-level dir.
90+- C. `public/src/**.ts` — confusing; `public/` is "served verbatim".
91+
92+`client/render.ts` (775 lines) **must split** before landing. Natural
93+axis: one file per block-kind (matches the existing `blocks/*.ts`
94+breakdown) + a small `client/render-dispatch.ts` switch on `block.t`.
95+
96+### Test convention
97+
98+tdd.md tests live as siblings under `src/`:
99+`c31_commits.test.ts`, `c31_diff_parse.test.ts`,
100+`c31_edit_validation.test.ts`, `c31_git_parse.test.ts`,
101+`c31_commit_meta.test.ts`, `c31_games.test.ts`,
102+`c32_anchor_extract.test.ts`, `c32_edit_resolve.test.ts`,
103+`c32_sama_verify.test.ts`.
104+
105+Podman's `sx-editor/tests/unit.test.ts` and `sx-content/tests/setup.ts`
106+are **incompatible** — verifier looks for `<file>.test.ts` next to
107+`<file>.ts`. Every kept test becomes a sibling file.
108+
109+E2E remains in `e2e/*.spec.ts` (Playwright, ignored by verifier).
110+
111+---
112+
113+## 2 — Storage-model conflict
114+
115+### `SxDocument` shape (`sx-editor/src/sxdoc/types.ts`)
116+
117+`{ v: 1, blocks: SxBlock[] }`. Single-letter keys (`t`, `c`) for
118+compactness (line 1-12). 17 block kinds: `p`, `h`, `ul`, `ol`, `li`,
119+`quote`, `code`, `img`, `hr`, `html`, `shortcode`, `embed`, plus 7
120+typed marketing blocks (`hero`, `feature-card`, `feature-grid`,
121+`stats-row`, `steps-grid`, `use-case-card`, `cta-band`). Inline marks
122+`b/i/u/s/c`; links are inline.
123+
124+No footnotes, no tables — tables fall through to `{t:"html"}` escape
125+hatch.
126+
127+### SQLite tables (`sx-editor/src/core/schema.ts`)
128+
129+Seven Ghost-shaped tables: `posts`, `tags`, `users`, `posts_tags`,
130+`posts_authors`, `api_keys`, `settings`. Plus `sx_documents` (one row
131+per post, holds the typed-block JSON):
132+`(post_id PK, doc TEXT, doc_version INT, hash TEXT, updated_at TEXT)`.
133+
134+### Option A (git-canon, default) write flow
135+
136+`POST /admin/edit/blog/foo`:
137+1. validate + parse form → `(markdown_body, sxdoc_json)`
138+2. `c14_git.commitFiles({ paths: [
139+ {path:"content/blog/foo.md", content:markdown_body},
140+ {path:"content/blog/foo.sxdoc.json", content:sxdoc_json} ]})`.
141+ This **needs the new `commitFiles` (multi-path) variant**.
142+3. mirror to live FS so the next render reflects it.
143+4. show "applied · sha XXXXXXX".
144+
145+**Commit message**: piggy-back the existing helper
146+`buildCommitMessage` from `c31_commit_meta.ts` (already used by
147+`c21_handlers_edit.ts:96`). Message format stays as today:
148+`Edit: <title> by <author> via /admin\n\n<filePath>`.
149+
150+`c14_git.commitFile` (lines 192-250) is single-path. Extending to
151+multi-path is ~30 added lines — same 5-step flow, with step 3
152+(read-tree + update-index) looping over paths.
153+
154+**Sidecar regen.** Because markdown is canonical and sxdoc is
155+derivable, treat sidecars as **cache**. If sidecar missing or older
156+than `.md`, regenerate via `marked.parse(md) → htmlToSx(html)`. Makes
157+the "drop SQLite index, replay git log" rebuild story plan.md
158+mentions actually trivial.
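
The "missing or older" test above can be sketched as a pure function. This is only an illustration under stated assumptions: `sidecarNeedsRegen` is a hypothetical name, a missing sidecar is modelled as `null`, and modification times are passed in so the check itself does no I/O.

```typescript
// Hypothetical helper: decides whether the .sxdoc.json cache must be rebuilt.
// Callers (in a c14 adapter, for example) would supply the mtimes.
function sidecarNeedsRegen(mdMtimeMs: number, sidecarMtimeMs: number | null): boolean {
  if (sidecarMtimeMs === null) return true; // no sidecar on disk yet
  return sidecarMtimeMs < mdMtimeMs; // sidecar is older than its markdown source
}

console.log(sidecarNeedsRegen(2_000, null));
console.log(sidecarNeedsRegen(2_000, 3_000));
```

Passing timestamps in, instead of reading them inside the function, keeps the decision testable without touching the filesystem.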
159+
160+### Real-content survey (assessed by full-read of 3 files + grep)
161+
162+| file | code fences | tables | embedded HTML | frontmatter |
163+|---|---|---|---|---|
164+| `content/home.md` (3.2 KB) | 0 | 1 (5 rows) | 0 | no |
165+| `content/blog/sama-meets-git-cms.md` | 4 | 0 | 0 | no |
166+| `content/blog/three-constraints-agentic-coding.md` | 7 | 0 | 0 | no |
167+| `content/sama/architecture.md` | 1 | 1 (4×4) | 0 | no |
168+| `content/sama/skill.md` | many | many | 0 | **YES** |
169+| (other 13 .md) | many | mixed | 0 | no |
170+
171+Confirmed by grep: **only `content/sama/skill.md` has YAML
172+frontmatter** (`---\nname: …\n---`). Other matches for `^---` are
173+markdown horizontal rules (`<hr>`) inside the body of `sama/*.md`
174+and a few blog posts — *not* frontmatter. The migration script must
175+distinguish: frontmatter = `^---\n[a-zA-Z_]+:` at byte 0.
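
A minimal sketch of that distinction, using the regex quoted above. `hasFrontmatter` is a hypothetical name, and the sketch assumes `\n` line endings (CRLF input would need normalising first).

```typescript
// Matches only when the file *starts* with `---` followed by a `key:` line.
// No `m` flag, so `^` anchors to offset 0 rather than to every line start.
const FRONTMATTER_START = /^---\n[a-zA-Z_]+:/;

function hasFrontmatter(source: string): boolean {
  return FRONTMATTER_START.test(source);
}

console.log(hasFrontmatter("---\nname: sama\n---\n# Skill\n"));
console.log(hasFrontmatter("# Title\n\n---\n\nBody after a horizontal rule\n"));
```

A horizontal rule further down the body never matches because the pattern is anchored to the start of the string, and a rule on the very first line fails the `key:` requirement.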
176+
177+### What `htmlToSx` handles vs doesn't (`sx-editor/src/sxdoc/html-to-sx.ts`)
178+
179+Block-level handled: `p`, `h1..h6`, `ul`/`ol`/`li`, `blockquote`,
180+`pre`/`code` (with `language-X` detection), `img`, `figure`, `hr`.
181+Container divs (`div`, `section`, `article`) recurse into children.
182+Everything else → `{t:"html", src: el.outerHTML}` escape hatch
183+(line 183).
184+
185+Inline handled: `<a>`, `<br>`, `<strong>/<b>`, `<em>/<i>`, `<u>`,
186+`<s>/<strike>/<del>`, `<code>`. `<span>/<font>` strip wrapper, keep
187+content.
188+
189+**Implication for our content**:
190+- Tables → single `html` block per table. Renders identically but
191+ un-editable as discrete blocks. **Acceptable.**
192+- HR (`---`) → `{t:"hr"}`. Good.
193+- Code fences → `{t:"code", lang:"sh", src:"..."}`. Good.
194+- Quote-blocks (`> …` markdown) → `<blockquote>` HTML → `{t:"quote",
195+ c:[…]}`. Good.
196+- Frontmatter (skill.md only) — `marked` doesn't strip it by default
197+ in tdd.md's current `c51_render_layout.ts:8` call. **Pre-check
198+ what the live site does today** before migrating.
199+- Round-trip drift exists: mark order is normalised
200+ (`sx-to-html.ts:227`), `<b>` collapses to `<strong>`, whitespace
201+ shifts. Acceptable for the migration since markdown stays
202+ authoritative.
203+
204+### Option B (SQLite-canon) trade
205+
206+git as audit-trail disappears. Compensation table: `content_history
207+(id, slug, type, doc, html, edited_at, edited_by, msg)` — append-only.
208+Has rollback but no cryptographic immutability, no `git blame`, no
209+PR diffs, no mirror story.
210+
211+The `content/blog/sama-meets-git-cms.md` post (149 lines) is the
212+product pitch for "every save = a real commit". B contradicts
213+published copy. **Recommend A.** Mechanical concerns (multi-path
214+commit, sidecar regen) are small; the "stop saying SAMA meets git"
215+cost is large.
216+
217+---
218+
219+## 3 — Handlebars-theme port
220+
221+### Helpers used (exhaustive, source: `sx-content/src/render.ts`)
222+
223+`Handlebars.registerHelper` calls at lines 78, 86, 93, 109, 129, 135,
224+158, 166, 184, 202, 205, 210, 390, 393:
225+
226+| helper | line | use | TS-port effort |
227+|---|---|---|---|
228+| `asset` | 78 | `{{asset "css/syntax.css"}}` → `/assets/...` | trivial |
229+| `img_url` | 86 | pass-through today (no transforms) | trivial |
230+| `post_class` | 93 | join class strings from featured/tags | trivial |
231+| `ghost_head` | 109 | 5-10 meta/og tags + codeinjection | medium — existing `c51_render_layout.ts` already emits a similar block |
232+| `ghost_foot` | 129 | code injection footer | trivial |
233+| `date` | 135 | dual-shape formatter (YYYY/MMM/DD) | small |
234+| `content` | 158 | emit body html raw | trivial |
235+| `excerpt` | 166 | strip HTML + truncate N words | small |
236+| `foreach` | 184 | iteration with `@index/@first/@last/@even/@odd` | medium — TS map gets index; rest unused in current .hbs files (confirmed by grep) |
237+| `tag`, `author`, `page`, `post` (block) | 202/205/390/393 | scope-dive | structural; replaced by TS functions that take the scoped object as arg |
238+| `reading_time` | 210 | "N min read" | trivial |
239+
240+Built-in (`{{#if}}`, `{{else}}`, `{{!-- comment --}}`, `{{!< layout}}`)
241+are template-language features that go away once we render via TS
242+functions; no port needed.
243+
244+**Mismatches**: `{{#foreach}}`'s `@first/@last/@even/@odd` is the only
245+data plumbing TS map doesn't give for free. Grep of `.hbs` files
246+confirms none of those data-frame fields are referenced in current
247+templates. Safe to drop in the TS port.
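
To make the mapping concrete, here is a sketch of how a `{{#foreach}}` loop could collapse into `Array.prototype.map`. The `PostCard` shape, `renderPostList`, and `escapeHtml` are hypothetical and not taken from podman or tdd.md.

```typescript
// Hypothetical data shape for one list entry.
interface PostCard {
  title: string;
  url: string;
}

function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// `map` supplies the index that `@index` provided in the template.
function renderPostList(posts: PostCard[]): string {
  const items = posts.map(
    (post, index) =>
      `<li data-index="${index}"><a href="${escapeHtml(post.url)}">${escapeHtml(post.title)}</a></li>`,
  );
  return `<ul>${items.join("")}</ul>`;
}

console.log(renderPostList([{ title: "A & B", url: "/blog/a/" }]));
```

If `@first` or `@last` were ever needed, they would be `index === 0` and `index === posts.length - 1` inside the same callback.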
248+
249+### Templates inventory (`sx-themes/syntax/`)
250+
251+`default.hbs` (14 lines — wrapper), `index.hbs` (757 lines —
252+marketing homepage HTML inline; partial `syntax-home.hbs` no longer
253+used per `sx-editor/src/index.ts:284-289`), `post.hbs` (37),
254+`page.hbs` (40), `tag.hbs` (24), `author.hbs` (24).
255+
256+`assets/css/syntax.css` is **812 lines**. tdd.md's `public/style.css`
257+is ~25 KB. Combining is a real CSS pass; classes like `.hero-content`,
258+`.feature-card`, `.use-case-card`, `.gradient-text` don't exist in
259+tdd.md today.
260+
261+### TS-native equivalents land in
262+
263+- `c51_render_theme.ts` — `renderPost(post)`, `renderPage(page)`,
264+ `renderTagArchive(tag, posts)`, `renderAuthorArchive(author, posts)`,
265+ `renderHomepage()`. Each replaces one `.hbs` file.
266+- `c51_render_meta.ts` (or extend existing `c51_render_layout.ts`) —
267+ `ghost_head`-equivalent. tdd.md already emits OG/meta in
268+ `c51_render_layout.ts:49+`; combine, don't reimplement.
269+- The five small string helpers (`asset`, `date`, `excerpt`,
270+ `reading_time`, `post_class`) live inline in `c51_render_theme.ts`
271+ as private functions. No external file warranted.
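
Two of those helpers could look roughly like this. It is a sketch, not a port of the Handlebars versions: the function names are hypothetical, the 200 words-per-minute rate is a placeholder assumption, and HTML entities are not decoded.

```typescript
// ASSUMPTION: placeholder reading speed, not a value taken from podman.
const WORDS_PER_MINUTE = 200;

// Strips tags with a simple regex and splits the remaining text into words.
function wordsOf(html: string): string[] {
  return html
    .replace(/<[^>]*>/g, " ")
    .split(/\s+/)
    .filter((word) => word.length > 0);
}

// `excerpt` equivalent: strip HTML, keep the first N words.
function excerpt(html: string, maxWords: number): string {
  return wordsOf(html).slice(0, maxWords).join(" ");
}

// `reading_time` equivalent: "N min read", never below one minute.
function readingTime(html: string): string {
  const minutes = Math.max(1, Math.ceil(wordsOf(html).length / WORDS_PER_MINUTE));
  return `${minutes} min read`;
}

console.log(excerpt("<p>Hello <b>big</b> world again</p>", 2));
console.log(readingTime("<p>short post</p>"));
```

Both helpers share one word splitter, so the excerpt and the reading-time estimate cannot disagree about what counts as a word.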
272+
273+---
274+
275+## 4 — Shortcode-engine port
276+
277+### What `sx-filter/src/shortcodes.ts` (650 lines) does
278+
279+`BUILT_IN` registry at line 546-563. Three categories:
280+
281+- **Pure** (no I/O): `ping`, `now`, `spec-version`, `event-validate`,
282+ `catalog-sample`, `query-demo`, `catalog-lookup` (reads in-process
283+ `DEMO_CATALOG`), `emit`+`demo-flow` (writes in-process `events.ts`
284+ ring buffer).
285+- **HTTP-fetching** (external API): `github-repo`, `npm`, `crate`,
286+ `gist`.
287+- **Ghost-API-fetching**: `event-count`, `posts-list` (Ghost content
288+ API), `login-page` (Ghost `_login-skin` page).
289+
290+Module-level `SHARED_EVENT_LOG` (line 14) + `DEMO_CATALOG`
291+(lines 26-107) push the file to 650 lines. **One more handler tips
292+it over 700.**
293+
294+### SAMA placement
295+
296+The handlers split by layer:
297+- **c32**: pure regex match, format, validate. `query-demo`,
298+ `event-validate`, `catalog-sample`, `catalog-lookup`,
299+ `event-count` parser (just an int).
300+- **c14**: HTTP wrappers for external APIs. `c14_github.ts` already
301+ exists. New: `c14_npm.ts`, `c14_crates.ts`, `c14_gist.ts` — or one
302+ combined `c14_package_registries.ts` (recommended for fewer files).
303+- **c13**: queries against `posts`/`sx_documents` for `posts-list`
304+ etc., extending `c13_database.ts`.
305+- **c32_event_log.ts**: pure in-memory ring buffer; required only if
306+ `emit`/`demo-flow` ship.
307+
308+### Where the substitute loop lives
309+
310+`sx-filter/src/index.ts:81-120` does the rewrite:
311+1. parse upstream HTML (already-rendered page),
312+2. build skip-regions (`<meta>`, `<link>`, `<script>`),
313+3. for each `SHORTCODE_RE` match, call handler, splice output.
314+
315+This is **render-time HTML rewriting**: it runs after sxdoc → HTML, so it is
316+a c51 concern wrapping c14/c32 handlers. It **cannot live in `c11_server.ts`**,
317+because c11 forbids route logic / HTML rewriting per `content/sama/architecture.md:12`.
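
The three steps listed above can be sketched as one pure function. This is a minimal illustration, not the sx-filter code: the `[[name arg]]` token syntax, the `SHORTCODE_RE` pattern shown here, and the names `buildSkipRegions` and `substitute` are all assumptions made for the example, since the real pattern is not reproduced in this document.

```typescript
// Hypothetical token syntax: [[name arg1 arg2]]. The real SHORTCODE_RE in
// sx-filter may differ; this pattern exists only for the sketch.
const SHORTCODE_RE = /\[\[([a-z][a-z0-9-]*)((?:\s+[^\]\s]+)*)\s*\]\]/g;

type Range = { start: number; end: number }; // end is exclusive
type Handler = (args: string[]) => string;

// Step 2: collect regions in which shortcodes must stay untouched.
// <script> covers the whole element; <meta> and <link> cover just the tag.
const buildSkipRegions = (html: string): Range[] => {
  const re = /<script\b[^>]*>[\s\S]*?<\/script>|<(?:meta|link)\b[^>]*>/gi;
  const regions: Range[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(html)) !== null) {
    regions.push({ start: m.index, end: m.index + m[0].length });
  }
  return regions;
};

const insideAny = (pos: number, regions: Range[]): boolean =>
  regions.some((r) => pos >= r.start && pos < r.end);

// Step 3: replace each match outside a skip region with its handler output.
// Unknown names are left in place so a typo stays visible on the page.
export const substitute = (html: string, handlers: Map<string, Handler>): string => {
  const skip = buildSkipRegions(html);
  return html.replace(
    SHORTCODE_RE,
    (whole: string, name: string, rawArgs: string, offset: number) => {
      const handler = handlers.get(name);
      if (handler === undefined || insideAny(offset, skip)) return whole;
      const trimmed = rawArgs.trim();
      return handler(trimmed === "" ? [] : trimmed.split(/\s+/));
    },
  );
};
```

Because the function takes a string and returns a string, it needs no proxy wiring and can be unit-tested without a server.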
318+
319+Recommended shape:
320+- `c32_shortcode_parse.ts` (+test) — extract `{name, args, range}`
321+ tokens from text. Pure regex; same pattern as today.
322+- Handler functions at their natural layer.
323+- `c51_render_post.ts` calls the parser, dispatches handlers inline
324+ (~10 lines for a switch). No central registry; each handler is
325+ just a function imported where needed.
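
A possible shape for the parser plus inline dispatch, as a hedged sketch. The `[[name arg]]` syntax, the `ShortcodeToken` interface, and the two stand-in handlers (`ping` returning a fixed string, `upper` as a purely hypothetical shortcode) are assumptions for illustration; only the `{name, args, range}` token shape comes from the text above.

```typescript
// Hypothetical token syntax: [[name arg1 arg2]]; not the real sx-filter pattern.
const TOKEN_SOURCE = "\\[\\[([a-z][a-z0-9-]*)((?:\\s+[^\\]\\s]+)*)\\s*\\]\\]";

export interface ShortcodeToken {
  name: string;
  args: string[];
  range: { start: number; end: number }; // end is exclusive
}

// Pure parser: text in, tokens out. No handler knowledge lives here.
export const parseShortcodes = (text: string): ShortcodeToken[] => {
  const re = new RegExp(TOKEN_SOURCE, "g"); // fresh lastIndex on every call
  const tokens: ShortcodeToken[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    const rawArgs = (m[2] ?? "").trim();
    tokens.push({
      name: m[1] ?? "",
      args: rawArgs === "" ? [] : rawArgs.split(/\s+/),
      range: { start: m.index, end: m.index + m[0].length },
    });
  }
  return tokens;
};

// Inline dispatch: a plain switch, no registry. Both cases are stand-ins;
// real handlers would be imported from their own layer.
const renderToken = (t: ShortcodeToken): string | null => {
  switch (t.name) {
    case "ping":
      return "pong";
    case "upper":
      return t.args.join(" ").toUpperCase();
    default:
      return null; // unknown shortcode: keep the original text
  }
};

// Splice handler output over each token range, copying the gaps verbatim.
export const expandShortcodes = (text: string): string => {
  let out = "";
  let cursor = 0;
  for (const t of parseShortcodes(text)) {
    out += text.slice(cursor, t.range.start);
    out += renderToken(t) ?? text.slice(t.range.start, t.range.end);
    cursor = t.range.end;
  }
  return out + text.slice(cursor);
};
```

Keeping `range` on the token lets the caller decide how to splice, for example to skip tokens that fall inside `<script>` regions, without the parser knowing about HTML.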
326+
327+### Single-process advantage
328+
329+Podman's filter is a separate Bun service proxying Ghost. tdd.md is
330+one process — substitute is a function call, not an HTTP hop. The
331+~100 lines of `sx-filter/src/index.ts` doing upstream-proxy wiring
332+are deleted; the ~30 lines of skip-region + substitute logic move
333+into c51.
334+
335+---
336+
337+## 5 — File inventory (server-side, podman → tdd.md)
338+
339+### sx-editor/src/
340+
341+- `ai.ts` (317) → `c14_openrouter.ts` + `c32_ai_edit_block.ts` — HTTP
342+ client (c14), prompt assembly + JSON validation (c32). plan.md's
343+ `c31_ai_edit_block.ts` is wrong layer.
344+- `build.ts` (61) → `c14_client_bundle.ts` — calls `Bun.build`, I/O.
345+- `db.ts` (124) → split: SQL into `c13_posts.ts`; htmlToSx fallback
346+ into the handler. plan.md's `c14_sxdoc_store.ts` is a different
347+ file (sx-doc only); db.ts is core posts.
348+- `index.ts` (437) → dispatcher entries in `c21_app.ts`; per-route
349+ handler bodies in `c21_handlers_admin_{list,edit,new,upload,ai,
350+ preview}.ts`. 4-6 files of 80-150 lines.
351+- `routes.ts` (44) → merge into existing `c31_site_config.ts`.
352+- `templates.ts` (482) → `c51_render_admin.ts`. At Atomic limit;
353+ watch for growth.
354+- `upload.ts` (87) → `c14_media.ts`.
355+- `sxdoc/types.ts` (240) → `c31_sxdoc.ts`. Types only; no sibling
356+ test (informational only).
357+- `sxdoc/html-to-sx.ts` (315) → `c32_sxdoc_parse.ts` (+ `.test.ts`).
358+- `sxdoc/sx-to-html.ts` (266) → `c32_sxdoc_render.ts` (+ `.test.ts`).
359+- `sxdoc/db.ts` (64) → `c14_sxdoc_sidecar.ts` (Option A) **or**
360+ `c14_sxdoc_store.ts` (Option B). Same shape, different backend.
361+- `sxdoc/index.ts` (14) → DELETE (barrel, SAMA-forbidden).
362+- `core/posts.ts` (148) → merge into `c13_posts.ts` with content's.
363+- `core/schema.ts` (103) → merge into `c13_database.ts`.
364+- `db/sqlite.ts` (41) → merge into `c13_database.ts`.
365+- `scripts/backfill-sxdoc.ts` → `scripts/migrate_content_to_sxdoc.ts`.
366+- `scripts/import-homepage.ts` → discard.
367+
368+### sx-content/src/
369+
370+- `db.ts` (11) → merge into `c13_database.ts`. Trivial.
371+- `images.ts` (125) → `c14_media.ts` (combined with upload.ts).
372+ `sharp` is I/O — **c14, not c31** as plan.md proposed.
373+- `index.ts` (536) → `c21_handlers_content.ts` (+ optional
374+ `c21_handlers_ghost_api.ts` — see open question 7).
375+- `posts.ts` (140) → merge into single `c13_posts.ts`.
376+- `render.ts` (398) → `c51_render_theme.ts`. Drops the Handlebars
377+ dep.
378+- `routes.ts` (199) → split: URL patterns into `c31_site_config.ts`,
379+ classifyUrl logic into `c32_url_classify.ts` (+ test).
380+- `sitemap.ts` (134) → `c51_render_sitemap.ts` + `c21_handlers_sitemap.ts`.
381+- `sxdoc/*` (913 total) → duplicates of editor's; **single source of
382+ truth in tdd.md**, both reads and writes use the same c31/c32/c14
383+ triplet.
384+- `core/posts.ts` (254), `core/schema.ts` (101), `core/settings.ts`
385+ (118), `db/sqlite.ts` (43) → merge as listed for editor's
386+ equivalents; `core/settings.ts` extends `c31_site_config.ts`.
387+
388+### sx-filter/src/
389+
390+- `admin.ts` (114) → DELETE. tdd.md has real auth; no injection.
391+- `events.ts` (211) → `c32_event_log.ts` if event-demo shortcodes
392+ ship. Otherwise DELETE.
393+- `index.ts` (379) → discard proxy logic; substitute loop moves to
394+ c51 (described in §4).
395+- `login-page-skin.html` (174), `login-page-template.ts` (205) →
396+ DELETE (syntax.ai demo asset).
397+- `shortcodes.ts` (650) → `c32_shortcode_parse.ts` + handler files
398+ at natural layers + dispatch inline in c51. Demo shortcodes
399+ (event-* / catalog-* / login-page) are open question 4.
400+
401+### Non-clean mappings flagged
402+
403+- Two big dispatcher files (editor `index.ts` 437, content `index.ts`
404+ 536) **must split**: dispatcher entries go into `c21_app.ts`,
405+ handler bodies into per-domain `c21_handlers_*.ts`.
406+- `sxdoc/` duplicated between editor and content services — keep one
407+ copy in tdd.md.
408+- `core/schema.ts`, `db/sqlite.ts` duplicated — one copy.
409+- `marked` already in tdd.md deps (`c51_render_layout.ts:8`). The
410+ migration uses it; after cutover see open question 12.
411+
412+### Client (sx-editor/src/client/**)
413+
414+Lands at `src/client/**` (outside verifier glob). Sizes preserved.
415+Key file: `render.ts` (775 — **must split before landing**). Natural
416+split per-block-kind matches the existing `blocks/*` and
417+`blocks/typed/*` breakdown.
418+
419+Open: `slashmenu.ts` (590) vs `slashmenu-v2.ts` (216) — figure out
420+which is canonical before porting.
421+
422+---
423+
424+## 6 — Content migration mechanics
425+
426+### Algorithm
427+
428+```ts
429+// scripts/migrate_content_to_sxdoc.ts
430+for (const file of glob("content/**/*.md")) {
431+ if (file.startsWith("content/games/")) continue;
432+ if (file.startsWith("content/git-history/")) continue;
433+ const raw = await Bun.file(file).text();
434+ const { fm, body } = splitFrontmatter(raw); // skill.md only
435+ const html = await marked.parse(body, { gfm: true, breaks: false });
436+ let doc: SxDocument;
437+ try { doc = htmlToSx(html); }
438+ catch (e) {
439+ // Fallback: single html-block holding the markdown-rendered HTML.
440+ doc = { v: 1, blocks: [{ t: "html", src: html }] };
441+ }
442+ const sxdocPath = file.replace(/\.md$/, ".sxdoc.json");
443+ await Bun.write(sxdocPath, JSON.stringify(doc, null, 2));
444+}
445+// one batched commit
446+// (shell, not TypeScript): git add content/**/*.sxdoc.json
447+// (shell, not TypeScript): git commit -m "Migrate content to sxdoc sidecars (one-time)"
448+```
449+
450+### Edge cases
451+
452+- **Tables** → single `{t:"html"}` block per table. Renders
453+ identically; un-editable as discrete blocks in the block editor.
454+ Acceptable.
455+- **Frontmatter (`skill.md`)** → strip first, parse body. Decide
456+ separately what happens to the `name:`/`description:` fields: today
457+ they probably render as visible text via marked. Pre-check live
458+ site behaviour before migrating.
459+- **HR (`---` mid-document)** is NOT frontmatter. Frontmatter pattern:
460+ `/^---\n[a-zA-Z_]+:/` at byte 0.
461+- **Parse fail** → escape hatch as shown. Page still renders
462+ (`sx-to-html.ts:60-62` emits raw HTML untouched). Editor surfaces
463+ "open `/edit-raw/...` for this section".
464+- **Code fences** are all currently `sh` / `ts` / `text`, which
465+ `parseLangFromClass` (lines 296-298) parses into `{t:"code", lang, src}`.
466+ No issue.
467+- **Round-trip drift** — `<b>` collapses to `<strong>`, mark order
468+ normalised. Acceptable since `.md` stays authoritative.
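
The frontmatter rule above can be illustrated with a small sketch of `splitFrontmatter`. Only the opening pattern `/^---\n[a-zA-Z_]+:/` comes from this document; the return shape, the closing-delimiter rule (a line containing only `---`), and the flat `key: value` parsing are assumptions for the example, not the script's actual implementation.

```typescript
// Opening pattern taken from the edge-case list: "---" at byte 0 followed
// directly by a "key:" line. A bare "---" rule elsewhere never matches.
const FRONTMATTER_START = /^---\n[a-zA-Z_]+:/;

export interface SplitResult {
  fm: Record<string, string>; // empty when the file has no frontmatter
  body: string;
}

export const splitFrontmatter = (raw: string): SplitResult => {
  if (!FRONTMATTER_START.test(raw)) return { fm: {}, body: raw };

  // Assumed closing rule: the next line that is exactly "---".
  const closeRe = /\n---[ \t]*(?:\n|$)/g;
  closeRe.lastIndex = 3; // start at the newline that ends the opening "---"
  const close = closeRe.exec(raw);
  if (close === null) return { fm: {}, body: raw }; // unterminated: treat as body

  const fm: Record<string, string> = {};
  for (const line of raw.slice(4, close.index).split("\n")) {
    const colon = line.indexOf(":");
    if (colon === -1) continue; // assumption: ignore anything but flat key: value
    fm[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  }
  return { fm, body: raw.slice(close.index + close[0].length) };
};
```

A document that merely contains a horizontal rule, even on its first line, falls through the first check and is returned unchanged as `body`.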
469+
470+### Commit strategy: single batch
471+
472+18 files → one "Migrate: content → sxdoc" commit. Per-file commits
473+add noise without informational value. Future re-migration after
474+parser improvements stays a single revertable commit.
475+
476+### Games confirmed out of scope
477+
478+`content/games/{fizzbuzz,string-calc}/` are multi-file units:
479+`spec.md` + `spec.ts` + `hidden/`. Read by `c31_games.ts` directly,
480+not by the CMS; the companion `.ts` and `hidden/` directory make the
481+post/page abstraction wrong. Keep games entirely outside the CMS.
482+**No `/edit/games/...` route should exist** — edit via vim+git like
483+source code.
484+
485+### git-history out of scope
486+
487+`content/git-history/syntaxai__tdd.md{,.tests}.json` (160 KB total)
488+are generated artifacts read by `c14_real_reports.ts` /
489+`c14_real_tests.ts`. Not content.
490+
491+---
492+
493+## Open decision points for the human
494+
495+1. **Storage canon — A (git-canon) or B (SQLite-canon)?**
496+ plan.md defaults A; my read supports A (the existing
497+ `content/blog/sama-meets-git-cms.md` is the product pitch and
498+ contradicts B). Confirm A, or pick B and accept rewriting that
499+ post + memory update.
500+
501+2. **sxdoc parser layer — c31 or c32?**
502+ plan.md says c31; I argue c32 (deterministic transform with logic,
503+ not pure types/registry). Affects file name and whether sibling
504+ tests are mandatory (c32 yes, c31 informational).
505+
506+3. **Single-commit vs two-commit per editor save (Option A).**
507+ Either extend `c14_git.commitFile` to multi-path (recommended,
508+ ~30 LOC) OR write `.md` and `.sxdoc.json` as two commits (simpler,
509+ doubled log noise, atomicity hole if step 2 fails).
510+
511+4. **Ship the syntax.ai event-demo shortcodes?**
512+ `emit`, `catalog-lookup`, `demo-flow`, `login-page`,
513+ `event-validate`, `catalog-sample`, `query-demo`, `event-count`,
514+ `posts-list`. These exist for syntax.ai's product story; tdd.md is
515+ a different product. **Default: off.** Saves ~500 LOC (skip
516+ `events.ts` port + 5 handler files + the `DEMO_CATALOG` constant).
517+
518+5. **Ghost-style permalink `/blog/{primary_tag}/{slug}/` vs current
519+ `/blog/{slug}/`?**
520+ Switching costs 9 redirects in `c21_app.ts` and breaks external
521+ links. **Recommend keep current.**
522+
523+6. **Typed marketing blocks (`hero`, `feature-card`, `feature-grid`,
524+ `stats-row`, `steps-grid`, `use-case-card`, `cta-band`) — port?**
525+ tdd.md's `home.md` is text + 1 table + 1 list — none would apply
526+ unless we redesign the homepage. **Default: skip.** Saves ~600
527+ LOC across `c31_sxdoc.ts` (~80 lines smaller) +
528+ `c32_sxdoc_render.ts` (typed renderers) +
529+ `client/blocks/typed/*.ts` (7 files).
530+
531+7. **Ghost Content API compatibility surface
532+ (`/ghost/api/content/{posts,pages}/...`) — keep?**
533+ `sx-content/src/index.ts:78-115`. No consumers today. **Default:
534+ drop.** Saves ~150 LOC.
535+
536+8. **Client-side TS placement — `src/client/`, `client/`, or
537+ `public/src/`?** Recommend `src/client/`. Affects bundler paths
538+ and Playwright fixture wiring.
539+
540+9. **`client/render.ts` (775) split shape.** Per-block-kind
541+ (`render-p.ts`, `render-h.ts`, …, 12 small files) or by sub-system
542+ (`render-blocks.ts`, `render-marks.ts`, `render-typed.ts`,
543+ 3 medium files). Affects readability vs file count.
544+
545+10. **c32 parser tests — snapshot vs explicit-assertion?**
546+ Snapshot (`toMatchSnapshot`) qualifies under the placeholder-test
547+ check, but explicit asserts are more readable. Decide before
548+ writing.
549+
550+11. **`OPENROUTER_API_KEY` in prod (plan.md open Q3).**
551+ Still open. AI ✨ returns 503 with hint when unset
552+ (`sx-editor/src/index.ts:367-369`). Acceptable to ship without
553+ the key in prod.
554+
555+12. **Keep `marked` post-migration?**
556+ `marked` is used during migration (md → html before sxdoc parse)
557+ and currently at runtime by `c51_render_layout.ts:8`. After
558+ cutover, sxdoc → HTML is the new render path. Decide: keep
559+ `marked` as a runtime dep for legacy paths, or vendor a tiny
560+ md-to-blocks shim inside the migration script and drop marked
561+ entirely.
562+
563+13. **`/admin/preview` rendering path.** Plan.md doesn't address
564+ that preview cannot go through `c14_git.commitFile` (debounce
565+ too tight). Handler must take in-memory sxdoc and call
566+ `c32_sxdoc_render` → `c51_render_theme` directly. Shape the
567+ handler accordingly from the start; don't refactor later.
added sama.profile.toml +52 −0
@@ -0,0 +1,52 @@
1+# SAMA v2 profile — declares this repo's filename prefixes and how
2+# they map to the four canonical layers (Pure 0 / Core 1 / Adapter 2 /
3+# Entry 3). See https://tdd.md/sama/v2 §2 for the profile mechanism
4+# and https://tdd.md/sama/v2 §1.1 for the canonical layer table.
5+#
6+# Order in each `sublayers` array is the dependency order: later
7+# entries may import earlier entries, never the reverse (§2.2).
8+
9+sama_version = "2.0"
10+profile = "tdd-md"
11+
12+# Layer 0 — Pure. Types, constants, pure registries, pure parsers.
13+# No I/O, no side effects.
14+[layers.0]
15+prefixes = ["c31_"]
16+
17+# Layer 1 — Core. Domain logic and pure render. No network, disk,
18+# clock, or framework.
19+# - c32_ holds pure domain logic (judging math, session HMAC,
20+# anchor extraction, edit-target resolution, the v1 verifier).
21+# - c51_ holds pure HTML render functions (markdown → string,
22+# page chrome, no I/O).
23+# c51 may import c32 (render uses logic); c32 must never import c51.
24+[layers.1]
25+sublayers = [
26+ { name = "logic", prefix = "c32_" },
27+ { name = "render", prefix = "c51_" },
28+]
29+
30+# Layer 2 — Adapter. The boundary. External input is parsed here.
31+# DB, network, filesystem, framework bindings.
32+# - c13_ holds SQLite primitives (the bun:sqlite Database wrapper).
33+# - c14_ holds HTTP / git / filesystem orchestrators that may compose
34+# c13 primitives (e.g. c14_judge runs git clone + saveRun() to db).
35+# c14 may import c13; c13 must never import c14.
36+[layers.2]
37+sublayers = [
38+ { name = "data", prefix = "c13_" },
39+ { name = "io", prefix = "c14_" },
40+]
41+
42+# Layer 3 — Entry. Outermost shell: server bootstrap, route table,
43+# handlers.
44+# - c21_ holds HTTP handlers and the Bun.serve route table (c21_app).
45+# - c11_ holds the server bootstrap that mounts the route table.
46+# c11 may import c21 (the bootstrap pulls in the app); c21 must never
47+# import c11.
48+[layers.3]
49+sublayers = [
50+ { name = "handlers", prefix = "c21_" },
51+ { name = "server", prefix = "c11_" },
52+]
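
The ordering rules in the profile comments can be read as a two-level rank: layer number first, then position in the `sublayers` array. The sketch below is one way to check an import edge against that rank. It is illustrative only: the `PREFIX_RANKS` table is a hand-written mirror of the profile, `mayImport` is a hypothetical name (not the `declaredLayer` helper from `c31_sama_v2.ts`), and treating same-prefix imports as allowed is an assumption.

```typescript
// Hand-written mirror of sama.profile.toml: layer number, then the index
// of the prefix inside that layer's sublayers array.
const PREFIX_RANKS: ReadonlyArray<{ prefix: string; layer: number; sub: number }> = [
  { prefix: "c31_", layer: 0, sub: 0 },
  { prefix: "c32_", layer: 1, sub: 0 },
  { prefix: "c51_", layer: 1, sub: 1 },
  { prefix: "c13_", layer: 2, sub: 0 },
  { prefix: "c14_", layer: 2, sub: 1 },
  { prefix: "c21_", layer: 3, sub: 0 },
  { prefix: "c11_", layer: 3, sub: 1 },
];

const rankOf = (path: string) => {
  const base = path.slice(path.lastIndexOf("/") + 1);
  return PREFIX_RANKS.find((r) => base.startsWith(r.prefix)) ?? null;
};

// true  = edge points at a lower layer, or at an earlier/same sublayer
// false = upward edge (a violation under the rules quoted in the profile)
// null  = at least one file is outside the profile's prefixes
export const mayImport = (importer: string, imported: string): boolean | null => {
  const from = rankOf(importer);
  const to = rankOf(imported);
  if (from === null || to === null) return null;
  if (to.layer !== from.layer) return to.layer < from.layer;
  return to.sub <= from.sub; // assumption: same-prefix imports are allowed
};
```

Under this reading, `c14_judge.ts` importing `c13_database.ts` is fine, while `c51_*` render code importing a type from `c13_database.ts` is an upward edge.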
modified src/c13_database.ts +8 −23
@@ -1,6 +1,6 @@
11 import { Database } from "bun:sqlite";
2-import type { ProjectConfig, TestRunner } from "./c31_project_config.ts";
3-import type { SxDocument } from "./c31_sxdoc.ts";
2+import type { ProjectConfig, TestRunner, ProjectRow } from "./c31_project_config.ts";
3+import type { SxDocument, SxDocumentSummary } from "./c31_sxdoc.ts";
44 import { SX_DOC_VERSION } from "./c31_sxdoc.ts";
55
66 const DB_PATH = process.env.TDD_DB_PATH ?? ":memory:";
@@ -133,18 +133,9 @@ export const latestRun = (owner: string, repo: string): Verdict | null => {
133133 return JSON.parse(row.verdict_json) as Verdict;
134134 };
135135
136-export interface ProjectRow {
137- id: number;
138- registeredBy: string;
139- repoOwner: string;
140- repoName: string;
141- testRunner: TestRunner;
142- trackedBranches: string[];
143- displayName: string | null;
144- team: string | null;
145- registeredAt: number;
146- status: "active" | "paused";
147-}
136+// ProjectRow is now defined in Layer 0 (c31_project_config) alongside
137+// the rest of the project-config types, so c51 render code can
138+// reference it without importing from Layer 2.
148139
149140 interface ProjectDbRow {
150141 id: number;
@@ -240,15 +231,9 @@ export interface SxDocumentRow {
240231 updatedAt: number;
241232 }
242233
243-export interface SxDocumentSummary {
244- id: number;
245- slug: string;
246- type: "page" | "post";
247- title: string;
248- status: "published" | "draft";
249- primaryTag: string | null;
250- updatedAt: number;
251-}
234+// SxDocumentSummary is the public summary shape; defined in Layer 0
235+// (c31_sxdoc) so render code can reference it without crossing the
236+// SAMA v2 import direction.
252237
253238 interface SxDocumentDbRow {
254239 id: number;
modified src/c14_git.ts +13 −22
@@ -26,22 +26,15 @@ import {
2626
2727 export const GIT_DIR = process.env.TDD_GIT_DIR ?? "/app/repo";
2828
29-export interface GitCommitOk {
30- ok: true;
31- commitSha: string;
32-}
33-
34-export interface GitCommitFailure {
35- ok: false;
36- // "conflict" → ref tip moved under us (someone else committed)
37- // "not_found" → branch doesn't exist
38- // "permission" → fs perms on the bare repo
39- // "other" → anything else (look at .message)
40- kind: "conflict" | "not_found" | "permission" | "other";
41- message: string;
42-}
43-
44-export type GitCommitOutcome = GitCommitOk | GitCommitFailure;
29+// GitCommitOk / GitCommitFailure / GitCommitOutcome are defined in
30+// Layer 0 (c31_git_parse) per SAMA v2 §1.1. Imported here so the
31+// adapter's typed return signatures match what callers in Layer 1
32+// also import directly.
33+import type {
34+ GitCommitOk,
35+ GitCommitFailure,
36+ GitCommitOutcome,
37+} from "./c31_git_parse.ts";
4538
4639 interface RunOpts {
4740 stdin?: string;
@@ -114,12 +107,10 @@ export const readBlobAtRef = async (ref: string, path: string): Promise<string |
114107 // Returns null when the path doesn't exist at that ref. Each entry
115108 // keeps the relative name (basename), not the full path — the caller
116109 // builds full paths from `${path}/${entry.name}`.
117-export interface TreeEntry {
118- name: string; // basename, e.g. "skill.md" or "blog"
119- type: "blob" | "tree" | "commit";
120- sha: string;
121- mode: string;
122-}
110+// TreeEntry is defined in Layer 0 (c31_git_parse) per SAMA v2 §1.1.
111+// Callers import it directly from c31_git_parse, not through this
112+// adapter — that's what keeps the import direction Layer N → Layer M < N.
113+import type { TreeEntry } from "./c31_git_parse.ts";
123114 export const lsTree = async (ref: string, path: string): Promise<TreeEntry[] | null> => {
124115 // `<ref>:<path>` — git lists what's at that tree. For path="" it's
125116 // the repo root.
added src/c14_judge.test.ts +69 −0
@@ -0,0 +1,69 @@
1+// Sibling test for c14_judge.ts. The orchestrator itself (judge()) does
2+// git clone + test execution and isn't unit-testable without a real
3+// agent repo; the pure helpers underneath it (applyMode, explainRefactor)
4+// are the structural surface that matters for scoring decisions. Cover
5+// the mode-aware penalty math + the operator-facing explanations here.
6+
7+import { describe, test, expect } from "bun:test";
8+import { applyMode, explainRefactor, judge } from "./c14_judge.ts";
9+
10+describe("c14_judge — applyMode (mode-aware penalty math)", () => {
11+ test("positive deltas pass through unchanged in every mode", () => {
12+ expect(applyMode(10, "strict")).toBe(10);
13+ expect(applyMode(10, "pragmatic")).toBe(10);
14+ expect(applyMode(10, "learning")).toBe(10);
15+ });
16+
17+ test("strict mode keeps the full negative penalty", () => {
18+ expect(applyMode(-20, "strict")).toBe(-20);
19+ expect(applyMode(-5, "strict")).toBe(-5);
20+ });
21+
22+ test("pragmatic mode halves negative deltas (Math.ceil — never below half)", () => {
23+ expect(applyMode(-20, "pragmatic")).toBe(-10);
24+ expect(applyMode(-10, "pragmatic")).toBe(-5);
25+ // -5 / 2 = -2.5 → Math.ceil(-2.5) = -2: odd penalties round toward
26+ // zero, which is the documented "softer score" behaviour.
27+ expect(applyMode(-5, "pragmatic")).toBe(-2);
28+ });
29+
30+ test("learning mode zeroes out every negative delta", () => {
31+ expect(applyMode(-20, "learning")).toBe(0);
32+ expect(applyMode(-5, "learning")).toBe(0);
33+ expect(applyMode(-1, "learning")).toBe(0);
34+ });
35+
36+ test("zero delta is neutral in every mode", () => {
37+ expect(applyMode(0, "strict")).toBe(0);
38+ expect(applyMode(0, "pragmatic")).toBe(0);
39+ expect(applyMode(0, "learning")).toBe(0);
40+ });
41+});
42+
43+describe("c14_judge — explainRefactor", () => {
44+ test("passed=true returns the canonical-refactor explanation", () => {
45+ const s = explainRefactor(true);
46+ expect(s).toContain("stayed green");
47+ expect(s).toMatch(/canonical/i);
48+ });
49+
50+ test("passed=false returns guidance to revert or open a new red→green", () => {
51+ const s = explainRefactor(false);
52+ expect(s).toContain("broke");
53+ expect(s).toMatch(/revert|red→green/);
54+ });
55+
56+ test("the two branches return different strings", () => {
57+ expect(explainRefactor(true)).not.toBe(explainRefactor(false));
58+ });
59+});
60+
61+describe("c14_judge — orchestrator entry point", () => {
62+ test("judge is exported as an async function (Promise-returning)", () => {
63+ expect(typeof judge).toBe("function");
64+ // The orchestrator does git clone + test execution; covering it
65+ // end-to-end needs a real agent repo. A runtime check that the export
66+ // still declares two parameters is the documented minimum for this layer.
67+ expect(judge.length).toBe(2);
68+ });
69+});
added src/c14_judge.ts +370 −0
@@ -0,0 +1,370 @@
1+import { mkdtempSync, rmSync } from "fs";
2+import { join } from "path";
3+import { tmpdir } from "os";
4+import { parseCommit, type Phase } from "./c31_commits.ts";
5+import { saveRun, type Verdict, type StepVerdict, type RefactorVerdict, type Mode } from "./c13_database.ts";
6+import { loadGame, type Game } from "./c31_games.ts";
7+
8+type TestRunner = "bun" | "none";
9+
10+interface TddConfig {
11+ mode: Mode;
12+ testRunner: TestRunner;
13+}
14+
15+// tdd.config.json from the agent's repo selects the scoring mode and
16+// test runner. Falls back to strict / bun when missing or unparseable.
17+//
18+// { "mode": "pragmatic", "test_runner": "none" }
19+//
20+// test_runner: "none" enables trace-only judging — no checkout, no test
21+// execution. Useful as a CI gate on projects where Bun can't run the
22+// suite (e.g. .NET, Python without bun-compat tests).
23+const readConfig = async (cwd: string): Promise<TddConfig> => {
24+ const file = Bun.file(join(cwd, "tdd.config.json"));
25+ let mode: Mode = "strict";
26+ let testRunner: TestRunner = "bun";
27+ if (await file.exists()) {
28+ try {
29+ const cfg = (await file.json()) as { mode?: string; test_runner?: string };
30+ if (cfg.mode === "pragmatic" || cfg.mode === "learning") mode = cfg.mode;
31+ if (cfg.test_runner === "none") testRunner = "none";
32+ } catch {
33+ // best effort — bad config falls back to defaults
34+ }
35+ }
36+ return { mode, testRunner };
37+};
38+
39+// Penalty halving for pragmatic, zeroing for learning. Positive deltas
40+// are unchanged across modes — earned credit is earned credit.
41+export const applyMode = (delta: number, mode: Mode): number => {
42+ if (delta >= 0) return delta;
43+ if (mode === "learning") return 0;
44+ if (mode === "pragmatic") return Math.ceil(delta / 2);
45+ return delta;
46+};
47+
48+// Plain-language summary of a step verdict, written to the agent (not
49+// the human admin). One short paragraph; named intentionally so callers
50+// can see it next to the row in the score table.
51+const explainStep = (params: {
52+ status: StepVerdict["status"];
53+ redSha: string | null;
54+ greenSha: string | null;
55+ hiddenPassed: boolean | null;
56+ mode: Mode;
57+}): string => {
58+ const { status, hiddenPassed, mode } = params;
59+ switch (status) {
60+ case "verified":
61+ return "Red failed as expected, green passes your tests, and the kata's hidden tests confirm the implementation matches the requirement.";
62+ case "discipline-only":
63+ return "Red→green discipline holds, but this kata didn't ship hidden tests for the step. Partial credit awarded; full +20 isn't possible without authoritative verification.";
64+ case "no-green":
65+ return "Red commit landed; the matching green(<step>) commit hasn't been pushed yet. Push your green to lock in the score.";
66+ case "red-did-not-fail":
67+ return mode === "pragmatic"
68+ ? "Combined red+green commit detected. Pragmatic mode allows this — the cycle still counts, just with a softer score than a clean separation."
69+ : "Red commit's tests already passed when the step was first introduced — meaning the implementation was added before the test, or the test is tautological. Switch to pragmatic mode if you commit red+green together intentionally.";
70+ case "green-did-not-pass":
71+ return "Green commit's own tests still fail. The implementation doesn't yet satisfy the test you wrote — fix the impl, or reconsider whether the test reflects the requirement.";
72+ case "hidden-tests-failed":
73+ return hiddenPassed === false
74+ ? "Your tests pass, but the kata's hidden tests don't — this is the classic tautology trap. Tighten your test to mirror the requirement (e.g., assert the actual return value, not just that it runs)."
75+ : "Your tests pass, but hidden verification was inconclusive. Re-push to retry.";
76+ case "test-deleted":
77+ return "Test count dropped between red and green for this step. Once a test exists it must keep existing — refactor it, don't delete it. If the test was wrong, replace it in a separate commit before resuming the cycle.";
78+ case "trace-verified":
79+ return "Trace-only mode: red→green pair found in the commit log. Tests weren't executed (test_runner: \"none\"). Switch to bun runner for behaviour verification.";
80+ case "trace-tests-shrunk":
81+ return "Trace-only mode: the green commit's tree has fewer test files than the red commit's tree — looks like deletion. If you renamed or split test files, the tally still drops.";
82+ }
83+};
84+
85+export const explainRefactor = (passed: boolean): string =>
86+ passed
87+ ? "Tests stayed green through the refactor — structural change without behavior change, the canonical refactor."
88+ : "Refactor commit broke at least one test. Either revert the refactor or write a new red→green to capture the changed behavior.";
89+
90+const FORGEJO_INTERNAL = process.env.FORGEJO_URL ?? "https://git.tdd.md";
91+const TEST_TIMEOUT_MS = 8000;
92+
93+// Sandboxed env passed to git and bun subprocesses. Strips every secret
94+// from the parent process — agent code never sees FORGEJO_ADMIN_TOKEN,
95+// GITHUB_CLIENT_SECRET, or SESSION_SECRET. PATH is fixed; HOME and TMPDIR
96+// stay inside the per-run temp dir so dotfile writes can't escape.
97+const sandboxEnv = (cwd: string): Record<string, string> => ({
98+ PATH: "/usr/local/bin:/usr/bin:/bin",
99+ HOME: cwd,
100+ TMPDIR: cwd,
101+ NODE_ENV: "test",
102+});
103+
104+const runProc = async (
105+ cmd: string[],
106+ cwd: string,
107+ timeoutMs: number,
108+): Promise<{ stdout: string; stderr: string; exitCode: number; timedOut: boolean }> => {
109+ const proc = Bun.spawn(cmd, {
110+ cwd,
111+ stdout: "pipe",
112+ stderr: "pipe",
113+ env: sandboxEnv(cwd),
114+ });
115+ let timedOut = false;
116+ const timer = setTimeout(() => {
117+ timedOut = true;
118+ proc.kill("SIGKILL");
119+ }, timeoutMs);
120+ const exitCode = await proc.exited;
121+ clearTimeout(timer);
122+ const stdout = await new Response(proc.stdout).text();
123+ const stderr = await new Response(proc.stderr).text();
124+ return { stdout: stdout.trim(), stderr: stderr.trim(), exitCode, timedOut };
125+};
126+
127+const runTests = async (cwd: string): Promise<boolean> => {
128+ const r = await runProc(["bun", "test"], cwd, TEST_TIMEOUT_MS);
129+ // Bun test exits 0 only when all tests pass.
130+ return !r.timedOut && r.exitCode === 0;
131+};
132+
133+// Language-agnostic test-file counter for trace-only mode. Uses git
134+// ls-tree at the given sha so we don't have to checkout the working
135+// tree. Matches conventional test-file naming across ecosystems:
136+// foo.test.ts, foo.spec.ts, FooTests.cs, FooTest.java, test_foo.py,
137+// foo_test.go, plus any file directly inside a test/ or tests/ directory.
138+const countTestFiles = async (cwd: string, sha: string): Promise<number> => {
139+ const r = await runProc(["git", "ls-tree", "-r", "--name-only", sha], cwd, 5000);
140+ if (r.exitCode !== 0) return 0;
141+ const re = /(?:^|\/)(?:[^/]*\.(?:test|spec)\.[a-z]+|[Tt]ests?\/[^/]+|test_[^/]+|[^/]+_test\.[a-z]+|[^/]+[Tt]ests?\.cs|[^/]+[Tt]est\.java)$/;
142+ let count = 0;
143+ for (const line of r.stdout.split("\n")) {
144+ if (re.test(line)) count++;
145+ }
146+ return count;
147+};
148+
149+// Count `test(` / `it(` calls in tracked *.test.ts files. Used to detect
150+// when an agent deletes tests between red and green to make a regression
151+// "pass" — a cardinal TDD sin per the kata spec.
152+const countTests = async (cwd: string): Promise<number> => {
153+ const r = await runProc(["git", "ls-files", "*.test.ts"], cwd, 5000);
154+ if (r.exitCode !== 0) return 0;
155+ const files = r.stdout.split("\n").filter((f) => f && !f.includes("__hidden_"));
156+ let count = 0;
157+ for (const f of files) {
158+ const content = await Bun.file(join(cwd, f))
159+ .text()
160+ .catch(() => "");
161+ const matches = content.match(/\b(?:test|it)\s*\(/g);
162+ if (matches) count += matches.length;
163+ }
164+ return count;
165+};
166+
167+// Runs the kata's authoritative tests against the agent's implementation
168+// at whatever commit is currently checked out. Copies the hidden test
169+// file into the working tree as __hidden_<stepId>__.test.ts so it doesn't
170+// collide with the agent's filenames, runs only that file, then deletes
171+// it. Returns null if the kata doesn't have hidden tests for this step.
172+const runHiddenTests = async (cwd: string, spec: Game, stepId: string): Promise<boolean | null> => {
173+ const stepDef = spec.steps.find((s) => s.id === stepId);
174+ if (!stepDef) return null;
175+ const sourcePath = `./content/games/${spec.id}/${stepDef.hiddenTestFile}`;
176+ const sourceFile = Bun.file(sourcePath);
177+ if (!(await sourceFile.exists())) return null;
178+ const content = await sourceFile.text();
179+ const targetName = `__hidden_${stepId}__.test.ts`;
180+ const targetPath = join(cwd, targetName);
181+ await Bun.write(targetPath, content);
182+ try {
183+ const r = await runProc(["bun", "test", targetName], cwd, TEST_TIMEOUT_MS);
184+ return !r.timedOut && r.exitCode === 0;
185+ } finally {
186+ try {
187+ rmSync(targetPath, { force: true });
188+ } catch {
189+ // best effort
190+ }
191+ }
192+};
193+
194+interface CommitInfo {
195+ sha: string;
196+ phase: Phase;
197+ step: string | null;
198+}
199+
200+const readCommits = async (cwd: string): Promise<CommitInfo[]> => {
201+ const r = await runProc(["git", "log", "--reverse", "--pretty=format:%H%x1f%B%x1e"], cwd, 10000);
202+ if (r.exitCode !== 0) return [];
203+ const out: CommitInfo[] = [];
204+ for (const block of r.stdout.split("\x1e")) {
205+ const t = block.trim();
206+ if (!t) continue;
207+ const [sha, message = ""] = t.split("\x1f");
208+ if (!sha) continue;
209+ const p = parseCommit(message);
210+ out.push({ sha, phase: p.phase, step: p.step });
211+ }
212+ return out;
213+};
214+
215+export const judge = async (owner: string, repo: string): Promise<Verdict> => {
216+ const cwd = mkdtempSync(join(tmpdir(), `judge-${owner}-${repo}-`));
217+ try {
218+ // Agent repos default to private. Authenticate via admin token in
219+ // an http.extraheader so the token isn't persisted in the cloned
220+ // repo's config (extraheader applies to the clone request only).
221+ const cloneUrl = `${FORGEJO_INTERNAL}/${owner}/${repo}.git`;
222+ const adminToken = process.env.FORGEJO_ADMIN_TOKEN;
223+ const gitArgs = adminToken
224+ ? ["-c", `http.extraheader=Authorization: token ${adminToken}`, "clone", "--quiet", cloneUrl, "."]
225+ : ["clone", "--quiet", cloneUrl, "."];
226+ const cloneR = await runProc(["git", ...gitArgs], cwd, 30000);
227+ if (cloneR.exitCode !== 0) {
228+ throw new Error(`clone failed: ${cloneR.stderr || cloneR.stdout}`);
229+ }
230+
231+ const commits = await readCommits(cwd);
232+ const headR = await runProc(["git", "rev-parse", "HEAD"], cwd, 5000);
233+ const headSha = headR.stdout.trim();
234+
235+ // First red per step + first green-after-red per step (chronological).
236+ const stepRed = new Map<string, string>();
237+ const stepGreen = new Map<string, string>();
238+ for (const c of commits) {
239+ if (!c.step) continue;
240+ if (c.phase === "red" && !stepRed.has(c.step)) {
241+ stepRed.set(c.step, c.sha);
242+ } else if (c.phase === "green" && stepRed.has(c.step) && !stepGreen.has(c.step)) {
243+ stepGreen.set(c.step, c.sha);
244+ }
245+ }
246+
247+ // Read the agent's mode + runner preferences from tdd.config.json.
248+ const { mode, testRunner } = await readConfig(cwd);
249+
250+ // Load the kata's authoritative spec — used to fetch hidden tests
251+ // per step. Repos that don't match a known kata get scored on red→green
252+ // discipline only (no hidden-test verification).
253+ let spec: Game | null = null;
254+ try {
255+ spec = await loadGame(repo);
256+ } catch {
257+ spec = null;
258+ }
259+
260+ const steps: StepVerdict[] = [];
261+ for (const [stepId, redSha] of stepRed) {
262+ const greenSha = stepGreen.get(stepId) ?? null;
263+
264+ if (testRunner === "none") {
265+ // Trace-only path: don't checkout, don't run anything. Score
266+ // purely from the commit log + a language-agnostic test-file
267+ // count via `git ls-tree`. Useful for non-Bun projects.
268+ const redFiles = await countTestFiles(cwd, redSha);
269+ const greenFiles = greenSha ? await countTestFiles(cwd, greenSha) : redFiles;
270+ const filesShrank = greenSha !== null && greenFiles < redFiles;
271+
272+ let status: StepVerdict["status"];
273+ let baseDelta = 0;
274+ if (greenSha === null) {
275+ status = "no-green";
276+ } else if (filesShrank) {
277+ status = "trace-tests-shrunk";
278+ baseDelta = -10;
279+ } else {
280+ status = "trace-verified";
281+ baseDelta = 10;
282+ }
283+ const scoreDelta = applyMode(baseDelta, mode);
284+ const explanation = explainStep({ status, redSha, greenSha, hiddenPassed: null, mode });
285+ steps.push({
286+ stepId, redSha, greenSha,
287+ redFailed: null, greenPassed: null, hiddenPassed: null,
288+ status, scoreDelta, explanation,
289+ });
290+ continue;
291+ }
292+
293+ await runProc(["git", "checkout", "--quiet", redSha], cwd, 5000);
294+ const redTestCount = await countTests(cwd);
295+ const redPassed = await runTests(cwd);
296+ const redFailed = !redPassed;
297+ let greenPassed: boolean | null = null;
298+ let hiddenPassed: boolean | null = null;
299+ let testsDeleted = false;
300+ if (greenSha) {
301+ await runProc(["git", "checkout", "--quiet", greenSha], cwd, 5000);
302+ const greenTestCount = await countTests(cwd);
303+ testsDeleted = greenTestCount < redTestCount;
304+ greenPassed = await runTests(cwd);
305+ if (greenPassed && spec && !testsDeleted) {
306+ hiddenPassed = await runHiddenTests(cwd, spec, stepId);
307+ }
308+ }
309+
310+ let status: StepVerdict["status"];
311+ let baseDelta = 0;
312+ if (greenSha === null) {
313+ status = "no-green";
314+ } else if (testsDeleted) {
315+ status = "test-deleted";
316+ baseDelta = -20;
317+ } else if (!redFailed) {
318+ status = "red-did-not-fail";
319+ baseDelta = -5;
320+ } else if (greenPassed === false) {
321+ status = "green-did-not-pass";
322+ baseDelta = -5;
323+ } else if (hiddenPassed === false) {
324+ status = "hidden-tests-failed";
325+ baseDelta = 0;
326+ } else if (hiddenPassed === true) {
327+ status = "verified";
328+ baseDelta = 20;
329+ } else {
330+ status = "discipline-only";
331+ baseDelta = 5;
332+ }
333+ const scoreDelta = applyMode(baseDelta, mode);
334+ const explanation = explainStep({ status, redSha, greenSha, hiddenPassed, mode });
335+ steps.push({ stepId, redSha, greenSha, redFailed, greenPassed, hiddenPassed, status, scoreDelta, explanation });
336+ }
337+
338+ // Refactor commits aren't tied to red→green pairs: the spec rewards
339+ // any refactor that keeps the existing tests green. A broken refactor
340+ // (tests fail at the refactor commit) costs the same as a missed
341+ // green — discipline matters even outside red→green pairs.
342+ const refactors: RefactorVerdict[] = [];
343+ for (const c of commits) {
344+ if (c.phase !== "refactor") continue;
345+ await runProc(["git", "checkout", "--quiet", c.sha], cwd, 5000);
346+ const passed = await runTests(cwd);
347+ const baseDelta = passed ? 5 : -5;
348+ refactors.push({
349+ sha: c.sha,
350+ stepId: c.step,
351+ testsPassed: passed,
352+ scoreDelta: applyMode(baseDelta, mode),
353+ explanation: explainRefactor(passed),
354+ });
355+ }
356+
357+ const totalScore =
358+ steps.reduce((a, s) => a + s.scoreDelta, 0) +
359+ refactors.reduce((a, r) => a + r.scoreDelta, 0);
360+ const verdict: Verdict = { headSha, mode, steps, refactors, totalScore, judgedAt: Date.now() };
361+ saveRun(owner, repo, verdict);
362+ return verdict;
363+ } finally {
364+ try {
365+ rmSync(cwd, { recursive: true, force: true });
366+ } catch {
367+ // best effort cleanup
368+ }
369+ }
370+};
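The status ladder inside `judge` (for the test-running path) can be read as a pure decision function. The following is a minimal sketch of that ladder, not the code shipped in this commit: `StepFacts` and `classifyStep` are hypothetical names introduced for illustration, while the statuses and base deltas are copied from the branch order above.

```typescript
// Hypothetical helper: mirrors the if/else ladder in judge() for the
// test-running path. Names are illustrative, not part of the commit.
type StepStatus =
  | "no-green"
  | "test-deleted"
  | "red-did-not-fail"
  | "green-did-not-pass"
  | "hidden-tests-failed"
  | "verified"
  | "discipline-only";

interface StepFacts {
  hasGreen: boolean;            // a green commit exists for this step
  testsDeleted: boolean;        // test count shrank between red and green
  redFailed: boolean;           // tests failed at the red commit
  greenPassed: boolean | null;  // null when no green commit was run
  hiddenPassed: boolean | null; // null when hidden tests were not run
}

const classifyStep = (f: StepFacts): { status: StepStatus; baseDelta: number } => {
  // Order matters: earlier branches take precedence over later ones.
  if (!f.hasGreen) return { status: "no-green", baseDelta: 0 };
  if (f.testsDeleted) return { status: "test-deleted", baseDelta: -20 };
  if (!f.redFailed) return { status: "red-did-not-fail", baseDelta: -5 };
  if (f.greenPassed === false) return { status: "green-did-not-pass", baseDelta: -5 };
  if (f.hiddenPassed === false) return { status: "hidden-tests-failed", baseDelta: 0 };
  if (f.hiddenPassed === true) return { status: "verified", baseDelta: 20 };
  // Green passed but no hidden tests were available for this repo.
  return { status: "discipline-only", baseDelta: 5 };
};
```

Keeping the ladder in one function like this would make the precedence (deleted tests outrank every other outcome) testable without cloning a repo; whether that split is worth it here is a design choice left to the author.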
added src/c14_real_reports.test.ts +101 −0
@@ -0,0 +1,101 @@
1+// Sibling test for c14_real_reports.ts. buildLiveReports itself fans out
2+// to fetchRepoCommits (network) so its end-to-end shape is covered by
3+// the live /reports/live route. The pure helpers underneath — agent
4+// attribution from commit messages, and the 30-day daily sparkline —
5+// are unit-testable here.
6+
7+import { describe, test, expect } from "bun:test";
8+import {
9+ detectAgent,
10+ buildTrend,
11+ buildLiveReports,
12+} from "./c14_real_reports.ts";
13+import type { GithubCommit } from "./c14_github.ts";
14+
15+const mkCommit = (date: string, message = ""): GithubCommit => ({
16+ sha: "0".repeat(40),
17+ commit: {
18+ message,
19+ author: { name: "test", email: "[email protected]", date },
20+ committer: { name: "test", email: "[email protected]", date },
21+ },
22+ author: null,
23+ committer: null,
24+} as unknown as GithubCommit);
25+
26+describe("c14_real_reports — detectAgent", () => {
27+ test("recognises a Claude Code commit via Co-Authored-By: Claude", () => {
28+ expect(detectAgent("Add feature\n\nCo-Authored-By: Claude <noreply>")).toBe("claude-code");
29+ });
30+
31+ test("recognises a Cursor commit", () => {
32+ expect(detectAgent("Fix bug\n\nCo-Authored-By: Cursor <[email protected]>")).toBe("cursor");
33+ });
34+
35+ test("recognises an Aider commit", () => {
36+ expect(detectAgent("Refactor x\n\nCo-Authored-By: aider")).toBe("aider");
37+ });
38+
39+ test("returns unknown when no recognised footer is present", () => {
40+ expect(detectAgent("Just a commit")).toBe("unknown");
41+ expect(detectAgent("")).toBe("unknown");
42+ });
43+
44+ test("the regex is case-insensitive on the agent token", () => {
45+ expect(detectAgent("Co-Authored-By: CLAUDE")).toBe("claude-code");
46+ expect(detectAgent("co-authored-by: CURSOR")).toBe("cursor");
47+ });
48+});
49+
50+describe("c14_real_reports — buildTrend (30-day daily sparkline)", () => {
51+ // Use today (UTC) as the anchor — the function compares against UTC
52+ // midnight, so we need ISO strings that fall on the right days.
53+ const today = new Date();
54+ today.setUTCHours(0, 0, 0, 0);
55+ const iso = (daysAgo: number): string => {
56+ const d = new Date(today.getTime() - daysAgo * 24 * 60 * 60 * 1000);
57+ return d.toISOString();
58+ };
59+
60+ test("returns an array of `days` length", () => {
61+ expect(buildTrend([], 30)).toHaveLength(30);
62+ expect(buildTrend([], 7)).toHaveLength(7);
63+ });
64+
65+ test("empty input flat-lines at zero", () => {
66+ const trend = buildTrend([], 7);
67+ expect(trend.every((n) => n === 0)).toBe(true);
68+ });
69+
70+ test("a single commit today increments the last bucket", () => {
71+ const trend = buildTrend([mkCommit(iso(0))], 7);
72+ expect(trend[trend.length - 1]).toBe(1);
73+ expect(trend.slice(0, -1).every((n) => n === 0)).toBe(true);
74+ });
75+
76+ test("multiple commits on the same day stack in the same bucket", () => {
77+ const trend = buildTrend([mkCommit(iso(0)), mkCommit(iso(0)), mkCommit(iso(0))], 7);
78+ expect(trend[trend.length - 1]).toBe(3);
79+ });
80+
81+ test("commits older than the window are dropped", () => {
82+ const trend = buildTrend([mkCommit(iso(99))], 7);
83+ expect(trend.every((n) => n === 0)).toBe(true);
84+ });
85+
86+ test("a commit `daysAgo` lands at index `days - 1 - daysAgo`", () => {
87+ const trend = buildTrend([mkCommit(iso(2))], 7);
88+ // index 6 = today, 5 = yesterday, 4 = 2 days ago
89+ expect(trend[4]).toBe(1);
90+ });
91+});
92+
93+describe("c14_real_reports — orchestrator entry point", () => {
94+ test("buildLiveReports is exported as an async function", () => {
95+ expect(typeof buildLiveReports).toBe("function");
96+ // End-to-end coverage lives on /reports/live; this is the structural
97+ // smoke that the export shape didn't drift. `.length` counts only
98+ // non-default params (owner, repo) — perPage carries a default.
99+ expect(buildLiveReports.length).toBe(2);
100+ });
101+});
added src/c14_real_reports.ts +170 −0
@@ -0,0 +1,170 @@
1+// c14 (adapter): aggregate real GitHub commit history into the same
2+// AgentReport / RecentFlagged shape that c51_render_reports renders.
3+// The helpers (detectAgent, buildTrend) do no I/O of their own;
4+// buildLiveReports does, by calling c14_github.fetchRepoCommits.
5+//
6+// Attribution: Co-Authored-By footers are the agent-attribution channel
7+// the existing tdd.md commit history already uses. Anything without a
8+// recognised footer is bucketed as "unknown" and reported separately —
9+// it's still useful for volume context.
10+
11+import { parseCommit } from "./c31_commits.ts";
12+import { fetchRepoCommits, type GithubCommit } from "./c14_github.ts";
13+import type {
14+ AgentReport,
15+ FailureSlice,
16+ RecentFlagged,
17+} from "./c31_reports_demo.ts";
18+
19+type LiveAgentSlug = AgentReport["slug"] | "unknown";
20+
21+export const detectAgent = (msg: string): LiveAgentSlug => {
22+ if (/Co-Authored-By:.*Claude/i.test(msg)) return "claude-code";
23+ if (/Co-Authored-By:.*Cursor/i.test(msg)) return "cursor";
24+ if (/Co-Authored-By:.*Aider/i.test(msg)) return "aider";
25+ return "unknown";
26+};
27+
28+const AGENT_NAMES: Record<AgentReport["slug"], string> = {
29+ "claude-code": "Claude Code",
30+ cursor: "Cursor",
31+ aider: "Aider",
32+};
33+
34+// 30-day daily commit-count series, oldest → newest. When there are no
35+// commits in a day, that day's value is 0 — the sparkline still renders
36+// but flat-lines, which honestly reflects the data.
37+export const buildTrend = (commits: GithubCommit[], days = 30): number[] => {
38+ const out = new Array<number>(days).fill(0);
39+ const today = new Date();
40+ today.setUTCHours(0, 0, 0, 0);
41+ for (const c of commits) {
42+ const d = new Date(c.commit.author.date);
43+ d.setUTCHours(0, 0, 0, 0);
44+ const ageDays = Math.floor((today.getTime() - d.getTime()) / (24 * 60 * 60 * 1000));
45+ if (ageDays < 0 || ageDays >= days) continue;
46+ const idx = days - 1 - ageDays;
47+ const cur = out[idx] ?? 0;
48+ out[idx] = cur + 1;
49+ }
50+ return out;
51+};
52+
53+const buildAgentReport = (
54+ slug: AgentReport["slug"],
55+ agentCommits: GithubCommit[],
56+ repoSlug: string,
57+): AgentReport => {
58+ const tagged = agentCommits.filter((c) => {
59+ const phase = parseCommit(c.commit.message).phase;
60+ return phase === "red" || phase === "green" || phase === "refactor";
61+ });
62+ const phaseCoveragePct = agentCommits.length === 0
63+ ? 0
64+ : Math.round((tagged.length / agentCommits.length) * 100);
65+
66+ // Score is a proxy: phase-coverage is the only structural signal we
67+ // can compute without running the test suite. When coverage is 0 the
68+ // agent isn't attempting TDD, so the score is honestly low.
69+ const score = phaseCoveragePct;
70+
71+ // Failure mix collapses to two slices for live data — phase-tagged vs
72+ // not. Fine-grained failure modes (red-did-not-fail, test-deleted, etc)
73+ // need the runner sliver before they're computable.
74+ const failureMix: FailureSlice[] = [
75+ { label: "phase-tagged", pct: phaseCoveragePct, tone: "green" },
76+ { label: "no phase tag", pct: 100 - phaseCoveragePct, tone: "muted" },
77+ ];
78+
79+ const recent: RecentFlagged[] = agentCommits
80+ .slice(0, 5)
81+ .map((c) => {
82+ const parsed = parseCommit(c.commit.message);
83+ const phase = parsed.phase === "red" || parsed.phase === "green" || parsed.phase === "refactor"
84+ ? parsed.phase
85+ : "green";
86+ const failure = parsed.phase === "untagged" || parsed.phase === "init"
87+ ? "no phase tag"
88+ : `${parsed.phase} (live judge not yet wired)`;
89+ return {
90+ date: c.commit.author.date.slice(0, 10),
91+ repo: repoSlug,
92+ sha: c.sha.slice(0, 7),
93+ phase,
94+ failure,
95+ pts: 0,
96+ };
97+ });
98+
99+ const topIssueLabel = phaseCoveragePct === 100 ? "no current issues" : "no phase tag";
100+ const topIssuePct = 100 - phaseCoveragePct;
101+
102+ return {
103+ slug,
104+ name: AGENT_NAMES[slug],
105+ score,
106+ delta: 0,
107+ commits: agentCommits.length,
108+ phaseCoveragePct,
109+ streak: 0,
110+ streakBroken: false,
111+ topIssueLabel,
112+ topIssuePct,
113+ failureMix,
114+ trend: buildTrend(agentCommits),
115+ recent,
116+ };
117+};
118+
119+export interface LiveReports {
120+ reports: AgentReport[];
121+ unknownCount: number;
122+ totalCommits: number;
123+ earliest: string | null;
124+ latest: string | null;
125+ fetchedAt: number;
126+}
127+
128+export const buildLiveReports = async (
129+ repoOwner: string,
130+ repoName: string,
131+ perPage = 100,
132+): Promise<LiveReports> => {
133+ const commits = await fetchRepoCommits(repoOwner, repoName, perPage);
134+ const repoSlug = `${repoOwner}/${repoName}`;
135+ const byAgent = new Map<AgentReport["slug"], GithubCommit[]>();
136+ let unknownCount = 0;
137+
138+ for (const c of commits) {
139+ const a = detectAgent(c.commit.message);
140+ if (a === "unknown") {
141+ unknownCount++;
142+ continue;
143+ }
144+ const arr = byAgent.get(a) ?? [];
145+ arr.push(c);
146+ byAgent.set(a, arr);
147+ }
148+
149+ const order: AgentReport["slug"][] = ["claude-code", "cursor", "aider"];
150+ const reports = order
151+ .map((slug) => {
152+ const list = byAgent.get(slug);
153+ if (!list || list.length === 0) return null;
154+ return buildAgentReport(slug, list, repoSlug);
155+ })
156+ .filter((r): r is AgentReport => r !== null);
157+
158+ const dates = commits.map((c) => c.commit.author.date).sort();
159+ const earliest = dates[0] ?? null;
160+ const latest = dates[dates.length - 1] ?? null;
161+
162+ return {
163+ reports,
164+ unknownCount,
165+ totalCommits: commits.length,
166+ earliest,
167+ latest,
168+ fetchedAt: Date.now(),
169+ };
170+};
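The bucket arithmetic in `buildTrend` is easier to check when the clock is passed in rather than read from `new Date()`. A minimal sketch under that assumption follows; `bucketIndex` and its `nowIso` parameter are hypothetical and not part of this commit.

```typescript
// Hypothetical helper: the same day-bucket arithmetic as buildTrend,
// but with "now" injected so the result is deterministic.
const DAY_MS = 24 * 60 * 60 * 1000;

// Truncate an ISO timestamp to UTC midnight, in epoch milliseconds.
const utcMidnight = (iso: string): number => {
  const d = new Date(iso);
  d.setUTCHours(0, 0, 0, 0);
  return d.getTime();
};

// Returns the sparkline index for a commit, or null when the commit
// falls outside the window (too old, or dated in the future).
const bucketIndex = (commitIso: string, nowIso: string, days: number): number | null => {
  const ageDays = Math.floor((utcMidnight(nowIso) - utcMidnight(commitIso)) / DAY_MS);
  if (ageDays < 0 || ageDays >= days) return null;
  return days - 1 - ageDays; // last index = today, index 0 = oldest day
};
```

With a 7-day window, a commit from two UTC days ago lands at index 4, which matches the index comment in the sibling test.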
added src/c14_real_tests.test.ts +66 −0
@@ -0,0 +1,66 @@
1+// Sibling test for c14_real_tests.ts. buildLiveTestData fans out to
2+// loadTestBundle + fetchRepoCommits (both network/disk) so the
3+// end-to-end is covered by the live /reports/live/tests route. The
4+// pure helpers — agent attribution and the file/name label shortener —
5+// are unit-testable here.
6+
7+import { describe, test, expect } from "bun:test";
8+import {
9+ detectAgent,
10+ shortenTestLabel,
11+ buildLiveTestData,
12+} from "./c14_real_tests.ts";
13+
14+describe("c14_real_tests — detectAgent", () => {
15+ test("recognises Claude Code via Co-Authored-By: Claude", () => {
16+ expect(detectAgent("Add feature\n\nCo-Authored-By: Claude <noreply>")).toBe("claude-code");
17+ });
18+
19+ test("recognises Cursor", () => {
20+ expect(detectAgent("Fix bug\n\nCo-Authored-By: Cursor <[email protected]>")).toBe("cursor");
21+ });
22+
23+ test("recognises Aider", () => {
24+ expect(detectAgent("Refactor x\n\nCo-Authored-By: aider")).toBe("aider");
25+ });
26+
27+ test("returns null when no recognised footer is present (distinct from c14_real_reports which returns 'unknown')", () => {
28+ // The two real_* files made different choices here: real_reports
29+ // buckets unknown into its own slug; real_tests returns null so
30+ // the caller can filter or fall back. Document the difference.
31+ expect(detectAgent("Just a commit")).toBeNull();
32+ expect(detectAgent("")).toBeNull();
33+ });
34+
35+ test("the regex is case-insensitive on the agent token", () => {
36+ expect(detectAgent("Co-Authored-By: CLAUDE")).toBe("claude-code");
37+ expect(detectAgent("co-authored-by: aider")).toBe("aider");
38+ });
39+});
40+
41+describe("c14_real_tests — shortenTestLabel", () => {
42+ test("keeps only the basename of the file path + the test name", () => {
43+ expect(shortenTestLabel("src/foo/bar/baz.test.ts", "handles X")).toBe("baz.test.ts > handles X");
44+ });
45+
46+ test("handles a bare filename (no path) without splitting weirdly", () => {
47+ expect(shortenTestLabel("baz.test.ts", "handles X")).toBe("baz.test.ts > handles X");
48+ });
49+
50+ test("handles an empty file string (falls back to the empty basename)", () => {
51+ // .split('/').pop() on '' yields ''. Documented behaviour: the
52+ // helper never throws; the caller decides whether to filter empties.
53+ expect(shortenTestLabel("", "name")).toBe(" > name");
54+ });
55+
56+ test("preserves spaces and special chars in the test name", () => {
57+ expect(shortenTestLabel("a.ts", "rejects `bad input`")).toBe("a.ts > rejects `bad input`");
58+ });
59+});
60+
61+describe("c14_real_tests — orchestrator entry point", () => {
62+ test("buildLiveTestData is exported as an async function", () => {
63+ expect(typeof buildLiveTestData).toBe("function");
64+ expect(buildLiveTestData.length).toBe(2);
65+ });
66+});
added src/c14_real_tests.ts +142 −0
@@ -0,0 +1,142 @@
1+// c14 (adapter): aggregate the per-deploy test bundle into the same
2+// TestSnapshot[] / TestStability[] shape that the demo page renders.
3+// HEAD-only snapshots; stability accumulates as more deploys add runs.
4+//
5+// The aggregation itself is pure given the bundle + commits; the I/O is
6+// delegated to c14_github's bundle loader and commits fetcher.
7+
8+import { fetchRepoCommits, loadTestBundle, type PlaceholderTest } from "./c14_github.ts";
9+import type {
10+ AgentReport,
11+ TestFailure,
12+ TestSnapshot,
13+ TestStability,
14+} from "./c31_reports_demo.ts";
15+
16+export const detectAgent = (msg: string): AgentReport["slug"] | null => {
17+ if (/Co-Authored-By:.*Claude/i.test(msg)) return "claude-code";
18+ if (/Co-Authored-By:.*Cursor/i.test(msg)) return "cursor";
19+ if (/Co-Authored-By:.*Aider/i.test(msg)) return "aider";
20+ return null;
21+};
22+
23+export const shortenTestLabel = (file: string, name: string): string => {
24+ const base = file.split("/").pop() ?? file;
25+ return `${base} > ${name}`;
26+};
27+
28+export interface LiveTestData {
29+ snapshots: TestSnapshot[];
30+ stability: TestStability[];
31+ runsCount: number;
32+ ranAt: number | null;
33+ headSha: string | null;
34+ placeholderTests: PlaceholderTest[];
35+}
36+
37+export const buildLiveTestData = async (
38+ repoOwner: string,
39+ repoName: string,
40+): Promise<LiveTestData> => {
41+ const bundle = await loadTestBundle(repoOwner, repoName);
42+ if (!bundle || bundle.runs.length === 0) {
43+ return { snapshots: [], stability: [], runsCount: 0, ranAt: null, headSha: null, placeholderTests: [] };
44+ }
45+ const repoSlug = `${repoOwner}/${repoName}`;
46+ const latest = bundle.runs[0];
47+ if (!latest) {
48+ return { snapshots: [], stability: [], runsCount: 0, ranAt: null, headSha: null, placeholderTests: [] };
49+ }
50+
51+ // For "since" we want the oldest run that has this test as failing.
52+ const oldestFirst = [...bundle.runs].sort((a, b) => a.ranAt - b.ranAt);
53+
54+ const failures: TestFailure[] = latest.tests
55+ .filter((t) => t.status === "fail")
56+ .map((t) => {
57+ const firstFail = oldestFirst.find((r) =>
58+ r.tests.some((x) => x.name === t.name && x.file === t.file && x.status === "fail"),
59+ );
60+ const sinceTs = firstFail?.ranAt ?? latest.ranAt;
61+ return { test: shortenTestLabel(t.file, t.name), since: new Date(sinceTs).toISOString().slice(0, 10) };
62+ });
63+
64+ const snapshot: TestSnapshot = {
65+ repo: repoSlug,
66+ branch: latest.branch,
67+ total: latest.total,
68+ passing: latest.passing,
69+ failing: latest.failing,
70+ failures,
71+ };
72+
73+ // Stability: count pass/fail per (file, name) across every run, with
74+ // "deleted" set when a previously-seen test is missing from latest.
75+ const commits = await fetchRepoCommits(repoOwner, repoName, 100);
76+ const shaToAgent = new Map<string, AgentReport["slug"] | null>();
77+ for (const c of commits) shaToAgent.set(c.sha, detectAgent(c.commit.message));
78+
79+ interface Stat {
80+ name: string;
81+ file: string;
82+ pass: number;
83+ fail: number;
84+ lastBrokenSha: string | null;
85+ lastBrokenAt: number;
86+ }
87+ const stats = new Map<string, Stat>();
88+ for (const run of bundle.runs) {
89+ for (const t of run.tests) {
90+ const key = `${t.file}|${t.name}`;
91+ let s = stats.get(key);
92+ if (!s) {
93+ s = { name: t.name, file: t.file, pass: 0, fail: 0, lastBrokenSha: null, lastBrokenAt: 0 };
94+ stats.set(key, s);
95+ }
96+ if (t.status === "pass") s.pass++;
97+ else {
98+ s.fail++;
99+ if (run.ranAt > s.lastBrokenAt) {
100+ s.lastBrokenSha = run.sha;
101+ s.lastBrokenAt = run.ranAt;
102+ }
103+ }
104+ }
105+ }
106+
107+ const latestKeys = new Set(latest.tests.map((t) => `${t.file}|${t.name}`));
108+
109+ // lastBrokenBy needs an agent slug; if a SHA can't be mapped to an
110+ // agent (e.g. the commit is outside the 100-commit window we fetch),
111+ // fall back to the latest run's agent, else to "claude-code": a
112+ // defensible default for the dogfood case (one agent's history).
113+ const fallbackAgent = (shaToAgent.get(latest.sha) ?? "claude-code") as AgentReport["slug"];
114+
115+ const stability: TestStability[] = Array.from(stats.values())
116+ .map<TestStability>((s) => {
117+ const mapped = s.lastBrokenSha ? shaToAgent.get(s.lastBrokenSha) : null;
118+ const agent = (mapped ?? fallbackAgent) as AgentReport["slug"];
119+ const deleted = latestKeys.has(`${s.file}|${s.name}`) ? 0 : 1;
120+ const flagged = s.fail > 0 && (deleted > 0 || s.fail >= Math.max(2, s.pass / 5));
121+ return {
122+ test: shortenTestLabel(s.file, s.name),
123+ repo: repoSlug,
124+ pass: s.pass,
125+ fail: s.fail,
126+ deleted,
127+ lastBrokenBy: agent,
128+ flagged,
129+ };
130+ })
131+ .sort((a, b) => b.fail - a.fail || b.deleted - a.deleted || b.pass - a.pass)
132+ .slice(0, 30);
133+
134+ return {
135+ snapshots: [snapshot],
136+ stability,
137+ runsCount: bundle.runs.length,
138+ ranAt: latest.ranAt,
139+ headSha: latest.sha,
140+ placeholderTests: latest.placeholderTests ?? [],
141+ };
142+};
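The `flagged` rule in `buildLiveTestData` packs three conditions into one expression. As a sketch for reading it, the hypothetical `isFlagged` helper below (not part of this commit) isolates the same expression with a few worked inputs in the comments.

```typescript
// Hypothetical helper: the same boolean expression used for `flagged`
// above, lifted out so each threshold can be checked on its own.
const isFlagged = (pass: number, fail: number, deleted: 0 | 1): boolean =>
  // A test is flagged only if it has failed at least once AND either
  // it is missing from the latest run, or its failure count reaches
  // the larger of 2 and one fifth of its pass count.
  fail > 0 && (deleted > 0 || fail >= Math.max(2, pass / 5));

// pass=10, fail=1: threshold is max(2, 2) = 2, so not flagged.
// pass=10, fail=2: reaches the threshold, so flagged.
// pass=20, fail=3: threshold is max(2, 4) = 4, so not flagged.
// pass=20, fail=1, deleted=1: flagged regardless of the ratio.
// pass=5,  fail=0, deleted=1: never failed, so not flagged.
```

One consequence worth noting: a test that was deleted but never failed is not flagged, because the `fail > 0` guard comes first.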
added src/c14_sama_profile.test.ts +153 −0
@@ -0,0 +1,153 @@
1+import { describe, test, expect } from "bun:test";
2+import { parseProfileToml } from "./c14_sama_profile.ts";
3+
4+describe("c14_sama_profile — parseProfileToml", () => {
5+ test("parses the minimum required top-level keys", () => {
6+ const p = parseProfileToml(`
7+sama_version = "2.0"
8+profile = "tdd-md"
9+
10+[layers.0]
11+prefixes = ["c31_"]
12+
13+[layers.1]
14+prefixes = ["c32_"]
15+
16+[layers.2]
17+prefixes = ["c13_"]
18+
19+[layers.3]
20+prefixes = ["c11_"]
21+`);
22+ expect(p.samaVersion).toBe("2.0");
23+ expect(p.profile).toBe("tdd-md");
24+ });
25+
26+ test("a flat-prefix layer maps to a single synthetic sublayer named 'default'", () => {
27+ const p = parseProfileToml(`
28+sama_version = "2.0"
29+profile = "x"
30+
31+[layers.0]
32+prefixes = ["c31_"]
33+
34+[layers.1]
35+prefixes = []
36+
37+[layers.2]
38+prefixes = []
39+
40+[layers.3]
41+prefixes = []
42+`);
43+ expect(p.layers[0].sublayers).toHaveLength(1);
44+ expect(p.layers[0].sublayers[0]).toEqual({ name: "default", prefix: "c31_", index: 0 });
45+ });
46+
47+ test("a subdivided layer carries sublayer index = position in the array", () => {
48+ const p = parseProfileToml(`
49+sama_version = "2.0"
50+profile = "x"
51+
52+[layers.0]
53+prefixes = []
54+
55+[layers.1]
56+sublayers = [
57+ { name = "logic", prefix = "c32_" },
58+ { name = "render", prefix = "c51_" },
59+]
60+
61+[layers.2]
62+prefixes = []
63+
64+[layers.3]
65+prefixes = []
66+`);
67+ expect(p.layers[1].sublayers).toHaveLength(2);
68+ expect(p.layers[1].sublayers[0]).toEqual({ name: "logic", prefix: "c32_", index: 0 });
69+ expect(p.layers[1].sublayers[1]).toEqual({ name: "render", prefix: "c51_", index: 1 });
70+ });
71+
72+ test("comments are stripped", () => {
73+ const p = parseProfileToml(`
74+# leading comment
75+sama_version = "2.0" # trailing comment
76+profile = "x"
77+
78+[layers.0]
79+prefixes = ["c31_"] # another
80+
81+[layers.1]
82+prefixes = []
83+
84+[layers.2]
85+prefixes = []
86+
87+[layers.3]
88+prefixes = []
89+`);
90+ expect(p.samaVersion).toBe("2.0");
91+ expect(p.layers[0].sublayers[0]!.prefix).toBe("c31_");
92+ });
93+
94+ test("missing top-level keys throws a clear error", () => {
95+ expect(() => parseProfileToml(`profile = "x"\n[layers.0]\n[layers.1]\n[layers.2]\n[layers.3]\n`))
96+ .toThrow(/sama_version/);
97+ expect(() => parseProfileToml(`sama_version = "2.0"\n[layers.0]\n[layers.1]\n[layers.2]\n[layers.3]\n`))
98+ .toThrow(/profile/);
99+ });
100+
101+ test("missing a required layer section throws a clear error", () => {
102+ expect(() => parseProfileToml(`
103+sama_version = "2.0"
104+profile = "x"
105+
106+[layers.0]
107+prefixes = []
108+
109+[layers.1]
110+prefixes = []
111+
112+[layers.2]
113+prefixes = []
114+`)).toThrow(/layers\.3/);
115+ });
116+
117+ test("parses the actual repo profile file", () => {
118+ // Inline copy of the real-repo profile to keep this test
119+ // hermetic — no filesystem read. If sama.profile.toml's shape
120+ // ever drifts, this test pins what the parser supports.
121+ const real = `
122+sama_version = "2.0"
123+profile = "tdd-md"
124+
125+[layers.0]
126+prefixes = ["c31_"]
127+
128+[layers.1]
129+sublayers = [
130+ { name = "logic", prefix = "c32_" },
131+ { name = "render", prefix = "c51_" },
132+]
133+
134+[layers.2]
135+sublayers = [
136+ { name = "data", prefix = "c13_" },
137+ { name = "io", prefix = "c14_" },
138+]
139+
140+[layers.3]
141+sublayers = [
142+ { name = "handlers", prefix = "c21_" },
143+ { name = "server", prefix = "c11_" },
144+]
145+`;
146+ const p = parseProfileToml(real);
147+ expect(p.profile).toBe("tdd-md");
148+ expect(p.layers[0].sublayers.map((s) => s.prefix)).toEqual(["c31_"]);
149+ expect(p.layers[1].sublayers.map((s) => s.name)).toEqual(["logic", "render"]);
150+ expect(p.layers[2].sublayers.map((s) => s.name)).toEqual(["data", "io"]);
151+ expect(p.layers[3].sublayers.map((s) => s.name)).toEqual(["handlers", "server"]);
152+ });
153+});
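The "comments are stripped" case above only exercises `#` outside string values, which is all the naive splitter in the parser needs to handle today. If a profile ever carried `#` inside a quoted value, a quote-aware variant could look like the sketch below. This is an illustration only: `stripCommentQuoteAware` is a hypothetical name, and the sketch assumes simple single- or double-quoted values with no escape sequences, which is narrower than full TOML.

```typescript
// Hypothetical alternative to the naive splitter: cut at the first '#'
// that is not inside a quoted string. Assumes no backslash escapes.
const stripCommentQuoteAware = (line: string): string => {
  let quote: string | null = null; // the quote char we are inside, if any
  for (let i = 0; i < line.length; i++) {
    const c = line[i];
    if (quote !== null) {
      // Inside a string: only the matching quote char closes it.
      if (c === quote) quote = null;
    } else if (c === '"' || c === "'") {
      quote = c;
    } else if (c === "#") {
      // First '#' outside any string starts the comment.
      return line.slice(0, i);
    }
  }
  return line;
};
```

The trade-off is a few more lines in exchange for not having to remember the "no `#` in strings" restriction when editing the profile.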
added src/c14_sama_profile.ts +236 −0
@@ -0,0 +1,236 @@
1+// c14 — adapter: loads + parses sama.profile.toml (the SAMA v2 profile
2+// declaration at the repo root) and walks the source tree to feed the
3+// v2 verifier. Layer 2 in SAMA v2 terms: this is the boundary where
4+// external input (the TOML file on disk + the contents of src/) is
5+// parsed into the typed SamaV2Input shape that the pure verifier in
6+// c32_sama_v2_verify consumes.
7+//
8+// The TOML parser handles the subset our profile uses (string values,
9+// string arrays, and arrays of inline tables) — not full TOML. The
10+// alternative is depending on an external parser; the subset is small
11+// enough that an inline implementation keeps the verifier dependency-
12+// free and easy to inspect.
13+
14+import { readdirSync, readFileSync } from "node:fs";
15+import { resolve } from "node:path";
16+import type {
17+ LayerNumber,
18+ LayerSpec,
19+ ProfileSpec,
20+ SamaV2Input,
21+ Sublayer,
22+} from "./c31_sama_v2.ts";
23+
24+// — TOML subset parser ----------------------------------------------
25+
26+const stripComment = (line: string): string => {
27+ // Naive: cuts at the first '#', even inside a string literal. Our
28+ // profile keeps no '#' inside strings, so that is fine. If it ever
29+ // needs one, escape via a sentinel and post-process.
30+ const idx = line.indexOf("#");
31+ return idx === -1 ? line : line.slice(0, idx);
32+};
33+
34+const parseStringValue = (raw: string): string => {
35+ const t = raw.trim();
36+ if ((t.startsWith('"') && t.endsWith('"')) || (t.startsWith("'") && t.endsWith("'"))) {
37+ return t.slice(1, -1);
38+ }
39+ throw new Error(`expected quoted string, got: ${raw}`);
40+};
41+
42+const parseStringArray = (raw: string): string[] => {
43+ // Expect `[ "a", "b", ... ]` on a single line.
44+ const t = raw.trim();
45+ if (!t.startsWith("[") || !t.endsWith("]")) {
46+ throw new Error(`expected [..] array, got: ${raw}`);
47+ }
48+ const inner = t.slice(1, -1).trim();
49+ if (inner === "") return [];
50+ return inner.split(",").map((s) => parseStringValue(s.trim()));
51+};
52+
53+const parseInlineTable = (raw: string): Record<string, string> => {
54+ // Expect `{ key = "value", key2 = "value2" }` on one line.
55+ const t = raw.trim();
56+ if (!t.startsWith("{") || !t.endsWith("}")) {
57+ throw new Error(`expected inline table, got: ${raw}`);
58+ }
59+ const inner = t.slice(1, -1).trim();
60+ const out: Record<string, string> = {};
61+ if (inner === "") return out;
62+ // Plain comma split: this would mis-split a comma inside a quoted
63+ // string, but our subset doesn't use quoted commas, so it is enough.
64+ for (const pair of inner.split(",")) {
65+ const eq = pair.indexOf("=");
66+ if (eq === -1) throw new Error(`malformed inline-table entry: ${pair}`);
67+ const key = pair.slice(0, eq).trim();
68+ const value = pair.slice(eq + 1).trim();
69+ out[key] = parseStringValue(value);
70+ }
71+ return out;
72+};
73+
74+interface ParseState {
75+ sections: Map<string, Map<string, unknown>>;
76+}
77+
78+export const parseProfileToml = (text: string): ProfileSpec => {
79+ const state: ParseState = { sections: new Map() };
80+ const top = new Map<string, unknown>();
81+ state.sections.set("__top__", top);
82+
83+ // Pre-process: join continuation lines for multi-line arrays of
84+ // inline tables. Walk by char-level bracket tracking — when '[' is
85+ // open in a value, keep accumulating until the matching ']' arrives.
86+ const physLines = text.split("\n");
87+ const logical: string[] = [];
88+ let buf = "";
89+ let depth = 0;
90+ for (const raw of physLines) {
91+ const line = stripComment(raw);
92+ if (depth === 0) {
93+ if (buf === "") buf = line; else buf += " " + line;
94+ } else {
95+ buf += " " + line;
96+ }
97+ for (const c of line) {
98+ if (c === "[" || c === "{") depth++;
99+ else if (c === "]" || c === "}") depth--;
100+ }
101+ if (depth <= 0) {
102+ depth = 0;
103+ logical.push(buf);
104+ buf = "";
105+ }
106+ }
107+ if (buf.trim() !== "") logical.push(buf);
108+
109+ let currentSection = "__top__";
110+ for (const raw of logical) {
111+ const line = raw.trim();
112+ if (line === "") continue;
113+ if (line.startsWith("[") && line.endsWith("]")) {
114+ currentSection = line.slice(1, -1).trim();
115+ if (!state.sections.has(currentSection)) {
116+ state.sections.set(currentSection, new Map());
117+ }
118+ continue;
119+ }
120+ const eq = line.indexOf("=");
121+ if (eq === -1) throw new Error(`unparseable line: ${line}`);
122+ const key = line.slice(0, eq).trim();
123+ const valueRaw = line.slice(eq + 1).trim();
124+ let value: unknown;
125+ if (valueRaw.startsWith("[") && valueRaw.endsWith("]")) {
126+ // Array — string array or array of inline tables. Peek at the
127+ // first non-bracket char inside.
128+ const inner = valueRaw.slice(1, -1).trim();
129+ if (inner.startsWith("{")) {
130+ // Array of inline tables. Split on commas at depth 0.
131+ const tables: Array<Record<string, string>> = [];
132+ let cur = "";
133+ let d = 0;
134+ for (const c of inner) {
135+ if (c === "{") d++;
136+ if (c === "}") d--;
137+ if (c === "," && d === 0) {
138+ tables.push(parseInlineTable(cur));
139+ cur = "";
140+ } else {
141+ cur += c;
142+ }
143+ }
144+ if (cur.trim() !== "") tables.push(parseInlineTable(cur));
145+ value = tables;
146+ } else {
147+ value = parseStringArray(valueRaw);
148+ }
149+ } else {
150+ value = parseStringValue(valueRaw);
151+ }
152+ state.sections.get(currentSection)!.set(key, value);
153+ }
154+
155+ // Now assemble ProfileSpec.
156+ const samaVersion = top.get("sama_version") as string | undefined;
157+ const profile = top.get("profile") as string | undefined;
158+ if (typeof samaVersion !== "string" || typeof profile !== "string") {
159+ throw new Error("profile must declare `sama_version` and `profile` at the top level");
160+ }
161+
162+ const buildLayer = (k: LayerNumber): LayerSpec => {
163+ const sec = state.sections.get(`layers.${k}`);
164+ if (!sec) {
165+ throw new Error(`profile is missing required section [layers.${k}]`);
166+ }
167+ const sublayersRaw = sec.get("sublayers") as Array<Record<string, string>> | undefined;
168+ const prefixes = sec.get("prefixes") as string[] | undefined;
169+ const subs: Sublayer[] = [];
170+ if (sublayersRaw && sublayersRaw.length > 0) {
171+ sublayersRaw.forEach((row, index) => {
172+ if (!row.name || !row.prefix) {
173+ throw new Error(`[layers.${k}] sublayer ${index} missing name/prefix`);
174+ }
175+ subs.push({ name: row.name, prefix: row.prefix, index });
176+ });
177+ } else if (prefixes && prefixes.length > 0) {
178+ prefixes.forEach((prefix, index) => {
179+ subs.push({ name: "default", prefix, index });
180+ });
181+ } else {
182+ // Empty layer is permitted (spec §2.1: "Leave a canonical layer
183+ // empty"). The verifier just won't assign any file to it.
184+ }
185+ return { sublayers: subs };
186+ };
187+
188+ return {
189+ samaVersion,
190+ profile,
191+ layers: {
192+ 0: buildLayer(0),
193+ 1: buildLayer(1),
194+ 2: buildLayer(2),
195+ 3: buildLayer(3),
196+ },
197+ };
198+};
199+
200+// — Filesystem I/O --------------------------------------------------
201+
202+const REPO_ROOT_GUESS = process.cwd();
203+
204+export const loadProfile = async (
205+ repoRoot: string = REPO_ROOT_GUESS,
206+): Promise<ProfileSpec> => {
207+ const path = resolve(repoRoot, "sama.profile.toml");
208+ const text = await Bun.file(path).text();
209+ return parseProfileToml(text);
210+};
211+
212+ // Read every top-level .ts in src/ (sources + test siblings, no
213+ // recursion) into a map keyed by repo-relative path ("src/cXX_*.ts").
214+export const loadRepoFiles = (
215+ repoRoot: string = REPO_ROOT_GUESS,
216+): Map<string, string> => {
217+ const srcDir = resolve(repoRoot, "src");
218+ const out = new Map<string, string>();
219+ const entries = readdirSync(srcDir, { withFileTypes: true });
220+ for (const e of entries) {
221+ if (!e.isFile() || !e.name.endsWith(".ts")) continue;
222+ const repoPath = `src/${e.name}`;
223+ out.set(repoPath, readFileSync(resolve(srcDir, e.name), "utf8"));
224+ }
225+ return out;
226+};
227+
228+// Convenience: composes loadProfile + loadRepoFiles into the
229+// SamaV2Input the verifier consumes. Handler code calls this then
230+// passes the result straight to verifySamaV2.
231+export const buildSamaV2Input = async (
232+ repoRoot: string = REPO_ROOT_GUESS,
233+): Promise<SamaV2Input> => ({
234+ profile: await loadProfile(repoRoot),
235+ files: loadRepoFiles(repoRoot),
236+});
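
The logical-line pass at the top of the parser can be read in isolation. The sketch below is a standalone approximation, not the committed code: `joinLogicalLines` is a hypothetical name, comment stripping is left out, and it additionally ignores brackets inside double-quoted strings, which the committed loop does not do. It assumes single-line, double-quoted strings with no escapes.

```typescript
// Hypothetical standalone sketch of the bracket-depth line joiner.
// Assumption: strings are double-quoted, single-line, and unescaped.
export const joinLogicalLines = (text: string): string[] => {
  const logical: string[] = [];
  let buf = "";
  let depth = 0;
  for (const line of text.split("\n")) {
    // Accumulate physical lines until every open bracket is closed.
    buf = buf === "" ? line : buf + " " + line;
    let inString = false;
    for (const c of line) {
      if (c === '"') inString = !inString;
      else if (inString) continue; // brackets inside strings do not count
      else if (c === "[" || c === "{") depth++;
      else if (c === "]" || c === "}") depth--;
    }
    if (depth <= 0) {
      depth = 0;
      logical.push(buf);
      buf = "";
    }
  }
  if (buf.trim() !== "") logical.push(buf);
  return logical;
};

const sample = [
  "sublayers = [",
  '  { name = "a", prefix = "c31_" },',
  "]",
  'profile = "tdd-md"',
].join("\n");
console.log(joinLogicalLines(sample));
```

The multi-line array collapses into one logical line, so the later key/value pass only ever sees complete values.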
modified src/c21_app.ts +3 −0
@@ -33,6 +33,7 @@ import {
3333 samaCliResponse,
3434 samaSkillHandler,
3535 samaV2Handler,
36+ samaV2VerifyHandler,
3637 samaVerifyHandler,
3738 samaLandingHandler,
3839 samaSlugHandler,
@@ -362,6 +363,8 @@ ${rows}
362363
363364 "/sama/v2": samaV2Handler,
364365
366+ "/sama/v2/verify": samaV2VerifyHandler,
367+
365368 "/sama/verify": samaVerifyHandler,
366369
367370 "/sama": samaLandingHandler,
modified src/c21_handlers_api_agents.ts +1 −1
@@ -5,7 +5,7 @@
55 // judge entry point lives in c21_handlers_webhook — different auth
66 // model (HMAC), different concept.
77
8-import { judge } from "./c32_judge.ts";
8+import { judge } from "./c14_judge.ts";
99 import { timingSafeEqual } from "./c32_session.ts";
1010 import {
1111 FORGEJO_URL,
modified src/c21_handlers_reports.ts +2 −2
@@ -21,8 +21,8 @@ import {
2121 DEMO_SNAPSHOTS,
2222 DEMO_STABILITY,
2323 } from "./c31_reports_demo.ts";
24-import { buildLiveReports } from "./c32_real_reports.ts";
25-import { buildLiveTestData } from "./c32_real_tests.ts";
24+import { buildLiveReports } from "./c14_real_reports.ts";
25+import { buildLiveTestData } from "./c14_real_tests.ts";
2626 import {
2727 LIVE_REPO_OWNER,
2828 LIVE_REPO_NAME,
modified src/c21_handlers_sama.ts +65 −0
@@ -61,6 +61,71 @@ export const samaSkillHandler = async (): Promise<Response> => {
6161 return htmlResponse(html);
6262 };
6363
64+// -------- /sama/v2/verify (the v2 dogfood — runs the v2 verifier
65+// against this repo using sama.profile.toml) --------
66+
67+import { buildSamaV2Input } from "./c14_sama_profile.ts";
68+import { verifySamaV2 } from "./c32_sama_v2_verify.ts";
69+import type { SamaV2Report } from "./c31_sama_v2.ts";
70+
71+const renderV2Report = (report: SamaV2Report): string => {
72+ const summary = report.overallPassed
73+ ? `✓ conforms · profile \`${report.profile}\` · ${report.examined} files examined · ${report.checks.length}/${report.checks.length} checks pass`
74+ : `${report.checks.filter((c) => c.passed).length}/${report.checks.length} checks pass · profile \`${report.profile}\` · ${report.examined} files examined`;
75+ const rows = report.checks
76+ .map((c) => {
77+ const mark = c.passed ? "✓ pass" : `✗ ${c.violations.length} violation${c.violations.length === 1 ? "" : "s"}`;
78+ return `| #${c.id} ${c.name} | ${mark} | ${c.examined} |`;
79+ })
80+ .join("\n");
81+ const details = report.checks
82+ .filter((c) => !c.passed)
83+ .map((c) => {
84+ const head = `### ✗ #${c.id} ${c.name}\n`;
85+ const noteBlock = c.note ? `\n*${c.note}*\n` : "";
86+ const list = c.violations
87+ .map((v) => `- \`${v.file}\` — ${v.detail}`)
88+ .join("\n");
89+ return `${head}${noteBlock}\n${list}\n`;
90+ })
91+ .join("\n");
92+ return `# SAMA v2 — \`syntaxai/tdd.md\` dogfood
93+
94+> ${summary}
95+
96+The verifier in [\`src/c32_sama_v2_verify.ts\`](/GIT/syntaxai/tdd.md/blob/main/src/c32_sama_v2_verify.ts) ingests [\`sama.profile.toml\`](/GIT/syntaxai/tdd.md/blob/main/sama.profile.toml) and runs the seven §4 conformance checks against the current source tree on this server. No clone, no token; the server reads its own \`src/\` and the committed profile, runs the same logic the sibling unit tests cover, and renders the verdict below.
97+
98+| check | verdict | examined |
99+|---|---|---|
100+${rows}
101+
102+${details ? `## Open violations\n\n${details}` : ""}
103+
104+[← /sama/v2](/sama/v2) · [← /sama](/sama) · [the v1 dogfood](/sama/verify?repo=syntaxai/tdd.md)
105+`;
106+};
107+
108+export const samaV2VerifyHandler = async (): Promise<Response> => {
109+ let body: string;
110+ try {
111+ const input = await buildSamaV2Input();
112+ const report = verifySamaV2(input);
113+ body = renderV2Report(report);
114+ } catch (err) {
115+ body = `# SAMA v2 verify — error\n\nThe verifier failed before producing a verdict:\n\n\`\`\`\n${err instanceof Error ? err.message : String(err)}\n\`\`\`\n\n[← /sama/v2](/sama/v2)`;
116+ }
117+ const html = await renderDocsPage({
118+ title: "SAMA v2 verify · syntaxai/tdd.md — tdd.md",
119+ description:
120+ "Live dogfood: tdd.md's own source tree run through the SAMA v2 verifier. Reads sama.profile.toml + src/*.ts, applies the seven §4 conformance checks, renders the verdict.",
121+ bodyMarkdown: body,
122+ ogPath: "https://tdd.md/sama/v2/verify",
123+ active: "sama",
124+ pathForDocs: "/sama/v2/verify",
125+ });
126+ return htmlResponse(html);
127+};
128+
64129 // -------- /sama/v2 (the SAMA v2 Core Specification — draft) --------
65130
66131 export const samaV2Handler = async (): Promise<Response> => {
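
The verdict column and the summary count in renderV2Report are small enough to check by hand. The sketch below is a hedged extraction: `CheckLike`, `verdictCell` and `passCount` are hypothetical names, and the type carries only the fields those two expressions read.

```typescript
// Hypothetical minimal shape: just the fields the two expressions use.
interface CheckLike {
  passed: boolean;
  violations: { file: string; detail: string }[];
}

// Mirrors the `mark` expression: a pass mark, or a pluralised count.
const verdictCell = (c: CheckLike): string =>
  c.passed
    ? "✓ pass"
    : `✗ ${c.violations.length} violation${c.violations.length === 1 ? "" : "s"}`;

// Mirrors the failing-summary prefix: passed checks over total checks.
const passCount = (checks: CheckLike[]): string =>
  `${checks.filter((c) => c.passed).length}/${checks.length} checks pass`;

const demo: CheckLike[] = [
  { passed: true, violations: [] },
  { passed: false, violations: [{ file: "src/a.ts", detail: "example" }] },
];
console.log(demo.map(verdictCell), passCount(demo));
```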
modified src/c21_handlers_webhook.ts +1 −1
@@ -6,7 +6,7 @@
66 // and the failure semantics (ack-and-fire vs. wait-for-verdict) are
77 // genuinely different concepts.
88
9-import { judge } from "./c32_judge.ts";
9+import { judge } from "./c14_judge.ts";
1010 import { timingSafeEqual, hmacSha256Hex } from "./c32_session.ts";
1111
1212 export const forgejoWebhookHandler = async (req: Request): Promise<Response> => {
modified src/c31_blog.ts +6 −0
@@ -12,6 +12,12 @@ export interface BlogEntry {
1212 }
1313
1414 export const ALL_POSTS: BlogEntry[] = [
15+ {
16+ slug: "deploy-that-lies-cascade",
17+ title: "When the deploy lies: three bugs hidden by one silent error suppressor",
18+ description: "/reports/live had been stuck on a 12-day-old window because the deploy script's snapshot step was failing silently (no bun on the p620 host, the failure was swallowed by 2>/dev/null and a 'non-fatal skipped' echo). Fix one: run the snapshot via podman. That exposed a second silent skip — snapshot-tests had been missing from the git-mode deploy entirely. Fix two: add it. That made bun test actually run in CI for the first time and exposed two more bugs — a 1-in-16 flaky test and a false-positive placeholder where the verifier's own test fixture was being grepped as a real test. Three bugs in one PR. The empirical lesson: verification only works if the pipeline that runs it isn't lying about whether it ran.",
19+ date: "2026-05-22",
20+ },
1521 {
1622 slug: "sama-empirical-modeled-green",
1723 title: "Greening our own dogfood: four sibling tests, the live verifier flipped from 3/4 to 4/4",
modified src/c31_git_parse.ts +32 −0
@@ -81,3 +81,35 @@ export const parseLsTreeLine = (line: string): LsTreeEntry | null => {
8181 if (type !== "blob" && type !== "tree" && type !== "commit") return null;
8282 return { mode: mode!, type, sha: sha!, path };
8383 };
84+
85+// Tree-listing entry returned by c14_git.lsTree. Defined here in
86+// Layer 0 (Pure) per SAMA v2 §1.1 so c51 render code (and other
87+// readers) can reference the type without importing from Layer 2.
88+// Distinct from LsTreeEntry above: that's the raw parsed line; this
89+// is the cleaned-up shape c14_git exposes to callers.
90+export interface TreeEntry {
91+ name: string; // basename, e.g. "skill.md" or "blog"
92+ type: "blob" | "tree" | "commit";
93+ sha: string;
94+ mode: string;
95+}
96+
97+// Result types for c14_git.commitFile etc. Defined here in Layer 0
98+// (Pure) per SAMA v2 §1.1 so c51 render code can match against the
99+// discriminated union without crossing import direction.
100+export interface GitCommitOk {
101+ ok: true;
102+ commitSha: string;
103+}
104+
105+export interface GitCommitFailure {
106+ ok: false;
107+ // "conflict" → ref tip moved under us (someone else committed)
108+ // "not_found" → branch doesn't exist
109+ // "permission" → fs perms on the bare repo
110+ // "other" → anything else (look at .message)
111+ kind: "conflict" | "not_found" | "permission" | "other";
112+ message: string;
113+}
114+
115+export type GitCommitOutcome = GitCommitOk | GitCommitFailure;
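
A short sketch of how a Layer 1 caller might match against the union without importing from Layer 2. The three types are copied from the hunk above; `describeOutcome` and its messages are hypothetical, not part of the commit.

```typescript
interface GitCommitOk {
  ok: true;
  commitSha: string;
}
interface GitCommitFailure {
  ok: false;
  kind: "conflict" | "not_found" | "permission" | "other";
  message: string;
}
type GitCommitOutcome = GitCommitOk | GitCommitFailure;

// Hypothetical renderer: narrows on `ok`, then on `kind`.
const describeOutcome = (o: GitCommitOutcome): string => {
  if (o.ok) return `committed ${o.commitSha.slice(0, 7)}`;
  switch (o.kind) {
    case "conflict":
      return "branch moved; reload and retry";
    case "not_found":
      return "branch does not exist";
    case "permission":
      return "repository is not writable";
    case "other":
      return `commit failed: ${o.message}`;
  }
};

console.log(describeOutcome({ ok: true, commitSha: "0123456789abcdef" }));
console.log(describeOutcome({ ok: false, kind: "other", message: "disk full" }));
```

Because `ok` is a literal type on both interfaces, the compiler narrows each branch, and adding a new `kind` later makes the switch fail to type-check until it is handled.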
modified src/c31_project_config.ts +16 −0
@@ -100,3 +100,19 @@ export const parseRepoIdentifier = (raw: string): { owner: string; repo: string
100100 }
101101 return { owner, repo };
102102 };
103+
104+// Row-shape returned by c13_database for project records. Defined here
105+// in Layer 0 (Pure) per SAMA v2 §1.1 so c51 render code can reference
106+// the type without importing from Layer 2 (Adapter).
107+export interface ProjectRow {
108+ id: number;
109+ registeredBy: string;
110+ repoOwner: string;
111+ repoName: string;
112+ testRunner: TestRunner;
113+ trackedBranches: string[];
114+ displayName: string | null;
115+ team: string | null;
116+ registeredAt: number;
117+ status: "active" | "paused";
118+}
added src/c31_sama_v2.ts +97 −0
@@ -0,0 +1,97 @@
1+// c31 — model: types for the SAMA v2 verifier pipeline. Pure data
2+// shapes: the parsed profile (ProfileSpec), the verifier's input
3+// (SamaV2Input), and its output (SamaV2Report). No I/O lives here;
4+// c14_sama_profile parses the .toml into ProfileSpec, c32_sama_v2_verify
5+// applies the seven §4 checks against (ProfileSpec, files), and
6+// c21_handlers_sama renders the SamaV2Report.
7+
8+export type LayerNumber = 0 | 1 | 2 | 3;
9+
10+export interface Sublayer {
11+ // Order within the array (in the source profile) = dependency order:
12+ // later may import earlier, never the reverse. We carry the index
13+ // here so the verifier can compare positions.
14+ name: string;
15+ prefix: string;
16+ index: number;
17+}
18+
19+export interface LayerSpec {
20+ // A layer is either flat (a plain array of prefixes) or subdivided
21+ // (an ordered list of named sublayers, each with its own prefix).
22+ // The parser normalises a flat layer into one synthetic sublayer
23+ // per prefix, each named "default" and indexed by array position.
24+ sublayers: Sublayer[];
25+}
26+
27+export interface ProfileSpec {
28+ samaVersion: string;
29+ profile: string; // profile name, e.g. "tdd-md"
30+ layers: {
31+ 0: LayerSpec;
32+ 1: LayerSpec;
33+ 2: LayerSpec;
34+ 3: LayerSpec;
35+ };
36+}
37+
38+export interface SamaV2Input {
39+ profile: ProfileSpec;
40+ // Map keyed by repo-relative path (e.g. "src/c11_server.ts") to
41+ // file contents. The verifier never reads files itself; the loader
42+ // populates this map.
43+ files: Map<string, string>;
44+}
45+
46+export interface SamaV2Violation {
47+ file: string;
48+ detail: string;
49+}
50+
51+export interface SamaV2Check {
52+ // Stable IDs matching §4 of the spec.
53+ id: 1 | 2 | 3 | 4 | 5 | 6 | 7;
54+ // Display name used in the rendered report.
55+ name: string;
56+ // Property letter / phrase from the spec.
57+ property:
58+ | "Sorted"
59+ | "Architecture"
60+ | "Modeled (tests)"
61+ | "Modeled (boundary)"
62+ | "Atomic"
63+ | "Law"
64+ | "Consistency";
65+ passed: boolean;
66+ examined: number;
67+ violations: SamaV2Violation[];
68+ // Free-form note shown alongside the verdict — used for §4.4 where
69+ // the profile may declare advisory-only enforcement.
70+ note?: string;
71+}
72+
73+export interface SamaV2Report {
74+ profile: string;
75+ // Total files examined across all checks (matches the count emitted
76+ // by the §4.2 Architecture check).
77+ examined: number;
78+ checks: SamaV2Check[];
79+ overallPassed: boolean;
80+}
81+
82+ // Helper used by the verifier, defined here so every call site
83+ // shares one implementation: returns the layer and sublayer that a
84+ // file's basename declares, or null if no profile prefix matches.
85+export const declaredLayer = (
86+ path: string,
87+ profile: ProfileSpec,
88+): { layer: LayerNumber; sublayer: Sublayer } | null => {
89+ const base = path.split("/").pop() ?? path;
90+ for (const k of [0, 1, 2, 3] as LayerNumber[]) {
91+ const spec = profile.layers[k];
92+ for (const sub of spec.sublayers) {
93+ if (base.startsWith(sub.prefix)) return { layer: k, sublayer: sub };
94+ }
95+ }
96+ return null;
97+};
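
To make the prefix lookup concrete, here is a self-contained sketch that runs it against the layer→prefix mapping quoted in the commit message. The types and lookup are trimmed copies of the ones above; `flat` is a hypothetical helper and the `samaVersion` value is an assumption, since the committed sama.profile.toml is not shown here.

```typescript
type LayerNumber = 0 | 1 | 2 | 3;
interface Sublayer { name: string; prefix: string; index: number }
interface LayerSpec { sublayers: Sublayer[] }
interface ProfileSpec {
  samaVersion: string;
  profile: string;
  layers: Record<LayerNumber, LayerSpec>;
}

// Hypothetical helper: one "default" sublayer per prefix, as the
// parser does for a flat `prefixes` array.
const flat = (...prefixes: string[]): LayerSpec => ({
  sublayers: prefixes.map((prefix, index) => ({ name: "default", prefix, index })),
});

// Same lookup as declaredLayer: first matching prefix wins,
// scanning layers 0 through 3 in order.
const declaredLayer = (path: string, profile: ProfileSpec) => {
  const base = path.split("/").pop() ?? path;
  for (const k of [0, 1, 2, 3] as LayerNumber[]) {
    for (const sub of profile.layers[k].sublayers) {
      if (base.startsWith(sub.prefix)) return { layer: k, sublayer: sub };
    }
  }
  return null;
};

const tddMd: ProfileSpec = {
  samaVersion: "2", // assumed value
  profile: "tdd-md",
  layers: {
    0: flat("c31_"),
    1: flat("c32_", "c51_"),
    2: flat("c13_", "c14_"),
    3: flat("c21_", "c11_"),
  },
};

console.log(declaredLayer("src/c14_judge.ts", tddMd)?.layer);
console.log(declaredLayer("README.md", tddMd));
```

Only the basename is matched, so a file's directory never affects its declared layer.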
modified src/c31_sxdoc.ts +14 −0
@@ -140,3 +140,17 @@ export const emptyDocument = (): SxDocument => ({
140140 v: SX_DOC_VERSION,
141141 blocks: [],
142142 });
143+
144+// Row-shape returned by c13_database.listDocuments. Defined here in
145+// Layer 0 (Pure) per SAMA v2 §1.1 so c51 render code can reference
146+// the type without importing from Layer 2 (Adapter). The Adapter
147+// (c13_database) imports this type to type its own return value.
148+export interface SxDocumentSummary {
149+ id: number;
150+ slug: string;
151+ type: "page" | "post";
152+ title: string;
153+ status: "published" | "draft";
154+ primaryTag: string | null;
155+ updatedAt: number;
156+}
removed src/c32_judge.test.ts +0 −69
@@ -1,69 +0,0 @@
1-// Sibling test for c32_judge.ts. The orchestrator itself (judge()) does
2-// git clone + test execution and isn't unit-testable without a real
3-// agent repo; the pure helpers underneath it (applyMode, explainRefactor)
4-// are the structural surface that matters for scoring decisions. Cover
5-// the mode-aware penalty math + the operator-facing explanations here.
6-
7-import { describe, test, expect } from "bun:test";
8-import { applyMode, explainRefactor, judge } from "./c32_judge.ts";
9-
10-describe("c32_judge — applyMode (mode-aware penalty math)", () => {
11- test("positive deltas pass through unchanged in every mode", () => {
12- expect(applyMode(10, "strict")).toBe(10);
13- expect(applyMode(10, "pragmatic")).toBe(10);
14- expect(applyMode(10, "learning")).toBe(10);
15- });
16-
17- test("strict mode keeps the full negative penalty", () => {
18- expect(applyMode(-20, "strict")).toBe(-20);
19- expect(applyMode(-5, "strict")).toBe(-5);
20- });
21-
22- test("pragmatic mode halves negative deltas (Math.ceil — never below half)", () => {
23- expect(applyMode(-20, "pragmatic")).toBe(-10);
24- expect(applyMode(-10, "pragmatic")).toBe(-5);
25- // -5 / 2 = -2.5 → Math.ceil(-2.5) = -2: the harsher half rounds up
26- // toward zero, which is the documented "softer score" behaviour.
27- expect(applyMode(-5, "pragmatic")).toBe(-2);
28- });
29-
30- test("learning mode zeroes out every negative delta", () => {
31- expect(applyMode(-20, "learning")).toBe(0);
32- expect(applyMode(-5, "learning")).toBe(0);
33- expect(applyMode(-1, "learning")).toBe(0);
34- });
35-
36- test("zero delta is neutral in every mode", () => {
37- expect(applyMode(0, "strict")).toBe(0);
38- expect(applyMode(0, "pragmatic")).toBe(0);
39- expect(applyMode(0, "learning")).toBe(0);
40- });
41-});
42-
43-describe("c32_judge — explainRefactor", () => {
44- test("passed=true returns the canonical-refactor explanation", () => {
45- const s = explainRefactor(true);
46- expect(s).toContain("stayed green");
47- expect(s).toMatch(/canonical/i);
48- });
49-
50- test("passed=false returns guidance to revert or open a new red→green", () => {
51- const s = explainRefactor(false);
52- expect(s).toContain("broke");
53- expect(s).toMatch(/revert|red→green/);
54- });
55-
56- test("the two branches return different strings", () => {
57- expect(explainRefactor(true)).not.toBe(explainRefactor(false));
58- });
59-});
60-
61-describe("c32_judge — orchestrator entry point", () => {
62- test("judge is exported as an async function (Promise-returning)", () => {
63- expect(typeof judge).toBe("function");
64- // The orchestrator does git clone + test execution; covering it
65- // end-to-end needs a real agent repo. A type-level check that the
66- // shape didn't drift is the documented minimum for this layer.
67- expect(judge.length).toBe(2);
68- });
69-});
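
The rounding rule these tests pin can be checked in isolation. The sketch copies `applyMode` from the removed c32_judge.ts; the `Mode` union is an assumption inferred from the three mode strings the tests pass, since the real type is imported from c13_database.

```typescript
// Assumption: Mode is the three-value union implied by the tests.
type Mode = "strict" | "pragmatic" | "learning";

// Penalties are softened by mode; earned (non-negative) deltas are not.
const applyMode = (delta: number, mode: Mode): number => {
  if (delta >= 0) return delta;
  if (mode === "learning") return 0;
  // Math.ceil rounds a negative half toward zero, e.g. -2.5 becomes -2.
  if (mode === "pragmatic") return Math.ceil(delta / 2);
  return delta;
};

for (const mode of ["strict", "pragmatic", "learning"] as Mode[]) {
  console.log(mode, [-20, -5, 10].map((d) => applyMode(d, mode)));
}
```

Rounding toward zero means an odd penalty in pragmatic mode is always the smaller of the two possible halves.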
removed src/c32_judge.ts +0 −370
@@ -1,370 +0,0 @@
1-import { mkdtempSync, rmSync } from "fs";
2-import { join } from "path";
3-import { tmpdir } from "os";
4-import { parseCommit, type Phase } from "./c31_commits.ts";
5-import { saveRun, type Verdict, type StepVerdict, type RefactorVerdict, type Mode } from "./c13_database.ts";
6-import { loadGame, type Game } from "./c31_games.ts";
7-
8-type TestRunner = "bun" | "none";
9-
10-interface TddConfig {
11- mode: Mode;
12- testRunner: TestRunner;
13-}
14-
15-// tdd.config.json from the agent's repo selects the scoring mode and
16-// test runner. Falls back to strict / bun when missing or unparseable.
17-//
18-// { "mode": "pragmatic", "test_runner": "none" }
19-//
20-// test_runner: "none" enables trace-only judging — no checkout, no test
21-// execution. Useful as a CI gate on projects where Bun can't run the
22-// suite (e.g. .NET, Python without bun-compat tests).
23-const readConfig = async (cwd: string): Promise<TddConfig> => {
24- const file = Bun.file(join(cwd, "tdd.config.json"));
25- let mode: Mode = "strict";
26- let testRunner: TestRunner = "bun";
27- if (await file.exists()) {
28- try {
29- const cfg = (await file.json()) as { mode?: string; test_runner?: string };
30- if (cfg.mode === "pragmatic" || cfg.mode === "learning") mode = cfg.mode;
31- if (cfg.test_runner === "none") testRunner = "none";
32- } catch {
33- // best effort — bad config falls back to defaults
34- }
35- }
36- return { mode, testRunner };
37-};
38-
39-// Penalty halving for pragmatic, zeroing for learning. Positive deltas
40-// are unchanged across modes — earned credit is earned credit.
41-export const applyMode = (delta: number, mode: Mode): number => {
42- if (delta >= 0) return delta;
43- if (mode === "learning") return 0;
44- if (mode === "pragmatic") return Math.ceil(delta / 2);
45- return delta;
46-};
47-
48-// Plain-language summary of a step verdict, written to the agent (not
49-// the human admin). One short paragraph; named intentionally so callers
50-// can see it next to the row in the score table.
51-const explainStep = (params: {
52- status: StepVerdict["status"];
53- redSha: string | null;
54- greenSha: string | null;
55- hiddenPassed: boolean | null;
56- mode: Mode;
57-}): string => {
58- const { status, hiddenPassed, mode } = params;
59- switch (status) {
60- case "verified":
61- return "Red failed as expected, green passes your tests, and the kata's hidden tests confirm the implementation matches the requirement.";
62- case "discipline-only":
63- return "Red→green discipline holds, but this kata didn't ship hidden tests for the step. Partial credit awarded; full +20 isn't possible without authoritative verification.";
64- case "no-green":
65- return "Red commit landed; the matching green(<step>) commit hasn't been pushed yet. Push your green to lock in the score.";
66- case "red-did-not-fail":
67- return mode === "pragmatic"
68- ? "Combined red+green commit detected. Pragmatic mode allows this — the cycle still counts, just with a softer score than a clean separation."
69- : "Red commit's tests already passed when the step was first introduced — meaning the implementation was added before the test, or the test is tautological. Switch to pragmatic mode if you commit red+green together intentionally.";
70- case "green-did-not-pass":
71- return "Green commit's own tests still fail. The implementation doesn't yet satisfy the test you wrote — fix the impl, or reconsider whether the test reflects the requirement.";
72- case "hidden-tests-failed":
73- return hiddenPassed === false
74- ? "Your tests pass, but the kata's hidden tests don't — this is the classic tautology trap. Tighten your test to mirror the requirement (e.g., assert the actual return value, not just that it runs)."
75- : "Your tests pass, but hidden verification was inconclusive. Re-push to retry.";
76- case "test-deleted":
77- return "Test count dropped between red and green for this step. Once a test exists it must keep existing — refactor it, don't delete it. If the test was wrong, replace it in a separate commit before resuming the cycle.";
78- case "trace-verified":
79- return "Trace-only mode: red→green pair found in the commit log. Tests weren't executed (test_runner: \"none\"). Switch to bun runner for behaviour verification.";
80- case "trace-tests-shrunk":
81- return "Trace-only mode: the green commit's tree has fewer test files than the red commit's tree — looks like deletion. If you renamed or split test files, the tally still drops.";
82- }
83-};
84-
85-export const explainRefactor = (passed: boolean): string =>
86- passed
87- ? "Tests stayed green through the refactor — structural change without behavior change, the canonical refactor."
88- : "Refactor commit broke at least one test. Either revert the refactor or write a new red→green to capture the changed behavior.";
89-
90-const FORGEJO_INTERNAL = process.env.FORGEJO_URL ?? "https://git.tdd.md";
91-const TEST_TIMEOUT_MS = 8000;
92-
93-// Sandboxed env passed to git and bun subprocesses. Strips every secret
94-// from the parent process — agent code never sees FORGEJO_ADMIN_TOKEN,
95-// GITHUB_CLIENT_SECRET, or SESSION_SECRET. PATH is fixed; HOME and TMPDIR
96-// stay inside the per-run temp dir so dotfile writes can't escape.
97-const sandboxEnv = (cwd: string): Record<string, string> => ({
98- PATH: "/usr/local/bin:/usr/bin:/bin",
99- HOME: cwd,
100- TMPDIR: cwd,
101- NODE_ENV: "test",
102-});
103-
104-const runProc = async (
105- cmd: string[],
106- cwd: string,
107- timeoutMs: number,
108-): Promise<{ stdout: string; stderr: string; exitCode: number; timedOut: boolean }> => {
109- const proc = Bun.spawn(cmd, {
110- cwd,
111- stdout: "pipe",
112- stderr: "pipe",
113- env: sandboxEnv(cwd),
114- });
115- let timedOut = false;
116- const timer = setTimeout(() => {
117- timedOut = true;
118- proc.kill("SIGKILL");
119- }, timeoutMs);
120- const exitCode = await proc.exited;
121- clearTimeout(timer);
122- const stdout = await new Response(proc.stdout).text();
123- const stderr = await new Response(proc.stderr).text();
124- return { stdout: stdout.trim(), stderr: stderr.trim(), exitCode, timedOut };
125-};
126-
127-const runTests = async (cwd: string): Promise<boolean> => {
128- const r = await runProc(["bun", "test"], cwd, TEST_TIMEOUT_MS);
129- // Bun test exits 0 only when all tests pass.
130- return !r.timedOut && r.exitCode === 0;
131-};
132-
133-// Language-agnostic test-file counter for trace-only mode. Uses git
134-// ls-tree at the given sha so we don't have to checkout the working
135-// tree. Matches conventional test-file naming across ecosystems:
136-// foo.test.ts, foo.spec.ts, FooTests.cs, FooTest.java, test_foo.py,
137-// foo_test.go, FooSpec.scala, foo_spec.rb.
138-const countTestFiles = async (cwd: string, sha: string): Promise<number> => {
139- const r = await runProc(["git", "ls-tree", "-r", "--name-only", sha], cwd, 5000);
140- if (r.exitCode !== 0) return 0;
141- const re = /(?:^|\/)(?:[^/]*\.(?:test|spec)\.[a-z]+|[Tt]ests?\/[^/]+|test_[^/]+|[^/]+_test\.[a-z]+|[^/]+[Tt]ests?\.cs|[^/]+[Tt]est\.java)$/;
142- let count = 0;
143- for (const line of r.stdout.split("\n")) {
144- if (re.test(line)) count++;
145- }
146- return count;
147-};
148-
149-// Count `test(` / `it(` calls in tracked *.test.ts files. Used to detect
150-// when an agent deletes tests between red and green to make a regression
151-// "pass" — a cardinal TDD sin per the kata spec.
152-const countTests = async (cwd: string): Promise<number> => {
153- const r = await runProc(["git", "ls-files", "*.test.ts"], cwd, 5000);
154- if (r.exitCode !== 0) return 0;
155- const files = r.stdout.split("\n").filter((f) => f && !f.includes("__hidden_"));
156- let count = 0;
157- for (const f of files) {
158- const content = await Bun.file(join(cwd, f))
159- .text()
160- .catch(() => "");
161- const matches = content.match(/\b(?:test|it)\s*\(/g);
162- if (matches) count += matches.length;
163- }
164- return count;
165-};
166-
167-// Runs the kata's authoritative tests against the agent's implementation
168-// at whatever commit is currently checked out. Copies the hidden test
169-// file into the working tree under a __hidden__ prefix so it doesn't
170-// collide with the agent's filenames, runs only that file, then deletes
171-// it. Returns null if the kata doesn't have hidden tests for this step.
172-const runHiddenTests = async (cwd: string, spec: Game, stepId: string): Promise<boolean | null> => {
173- const stepDef = spec.steps.find((s) => s.id === stepId);
174- if (!stepDef) return null;
175- const sourcePath = `./content/games/${spec.id}/${stepDef.hiddenTestFile}`;
176- const sourceFile = Bun.file(sourcePath);
177- if (!(await sourceFile.exists())) return null;
178- const content = await sourceFile.text();
179- const targetName = `__hidden_${stepId}__.test.ts`;
180- const targetPath = join(cwd, targetName);
181- await Bun.write(targetPath, content);
182- try {
183- const r = await runProc(["bun", "test", targetName], cwd, TEST_TIMEOUT_MS);
184- return !r.timedOut && r.exitCode === 0;
185- } finally {
186- try {
187- rmSync(targetPath, { force: true });
188- } catch {
189- // best effort
190- }
191- }
192-};
193-
194-interface CommitInfo {
195- sha: string;
196- phase: Phase;
197- step: string | null;
198-}
199-
200-const readCommits = async (cwd: string): Promise<CommitInfo[]> => {
201- const r = await runProc(["git", "log", "--reverse", "--pretty=format:%H%x1f%B%x1e"], cwd, 10000);
202- if (r.exitCode !== 0) return [];
203- const out: CommitInfo[] = [];
204- for (const block of r.stdout.split("\x1e")) {
205- const t = block.trim();
206- if (!t) continue;
207- const [sha, message = ""] = t.split("\x1f");
208- if (!sha) continue;
209- const p = parseCommit(message);
210- out.push({ sha, phase: p.phase, step: p.step });
211- }
212- return out;
213-};
214-
215-export const judge = async (owner: string, repo: string): Promise<Verdict> => {
216- const cwd = mkdtempSync(join(tmpdir(), `judge-${owner}-${repo}-`));
217- try {
218- // Agent repos default to private. Authenticate via admin token in
219- // an http.extraheader so the token isn't persisted in the cloned
220- // repo's config (extraheader applies to the clone request only).
221- const cloneUrl = `${FORGEJO_INTERNAL}/${owner}/${repo}.git`;
222- const adminToken = process.env.FORGEJO_ADMIN_TOKEN;
223- const gitArgs = adminToken
224- ? ["-c", `http.extraheader=Authorization: token ${adminToken}`, "clone", "--quiet", cloneUrl, "."]
225- : ["clone", "--quiet", cloneUrl, "."];
226- const cloneR = await runProc(["git", ...gitArgs], cwd, 30000);
227- if (cloneR.exitCode !== 0) {
228- throw new Error(`clone failed: ${cloneR.stderr || cloneR.stdout}`);
229- }
230-
231- const commits = await readCommits(cwd);
232- const headR = await runProc(["git", "rev-parse", "HEAD"], cwd, 5000);
233- const headSha = headR.stdout;
234-
235- // First red per step + first green-after-red per step (chronological).
236- const stepRed = new Map<string, string>();
237- const stepGreen = new Map<string, string>();
238- for (const c of commits) {
239- if (!c.step) continue;
240- if (c.phase === "red" && !stepRed.has(c.step)) {
241- stepRed.set(c.step, c.sha);
242- } else if (c.phase === "green" && stepRed.has(c.step) && !stepGreen.has(c.step)) {
243- stepGreen.set(c.step, c.sha);
244- }
245- }
246-
247- // Read the agent's mode + runner preferences from tdd.config.json.
248- const { mode, testRunner } = await readConfig(cwd);
249-
250- // Load the kata's authoritative spec — used to fetch hidden tests
251- // per step. Repos that don't match a known kata get scored on red→green
252- // discipline only (no hidden-test verification).
253- let spec: Game | null = null;
254- try {
255- spec = await loadGame(repo);
256- } catch {
257- spec = null;
258- }
259-
260- const steps: StepVerdict[] = [];
261- for (const [stepId, redSha] of stepRed) {
262- const greenSha = stepGreen.get(stepId) ?? null;
263-
264- if (testRunner === "none") {
265- // Trace-only path: don't checkout, don't run anything. Score
266- // purely from the commit log + a language-agnostic test-file
267- // count via `git ls-tree`. Useful for non-Bun projects.
268- const redFiles = await countTestFiles(cwd, redSha);
269- const greenFiles = greenSha ? await countTestFiles(cwd, greenSha) : redFiles;
270- const filesShrank = greenSha !== null && greenFiles < redFiles;
271-
272- let status: StepVerdict["status"];
273- let baseDelta = 0;
274- if (greenSha === null) {
275- status = "no-green";
276- } else if (filesShrank) {
277- status = "trace-tests-shrunk";
278- baseDelta = -10;
279- } else {
280- status = "trace-verified";
281- baseDelta = 10;
282- }
283- const scoreDelta = applyMode(baseDelta, mode);
284- const explanation = explainStep({ status, redSha, greenSha, hiddenPassed: null, mode });
285- steps.push({
286- stepId, redSha, greenSha,
287- redFailed: null, greenPassed: null, hiddenPassed: null,
288- status, scoreDelta, explanation,
289- });
290- continue;
291- }
292-
293- await runProc(["git", "checkout", "--quiet", redSha], cwd, 5000);
294- const redTestCount = await countTests(cwd);
295- const redPassed = await runTests(cwd);
296- const redFailed = !redPassed;
297- let greenPassed: boolean | null = null;
298- let hiddenPassed: boolean | null = null;
299- let testsDeleted = false;
300- if (greenSha) {
301- await runProc(["git", "checkout", "--quiet", greenSha], cwd, 5000);
302- const greenTestCount = await countTests(cwd);
303- testsDeleted = greenTestCount < redTestCount;
304- greenPassed = await runTests(cwd);
305- if (greenPassed && spec && !testsDeleted) {
306- hiddenPassed = await runHiddenTests(cwd, spec, stepId);
307- }
308- }
309-
310- let status: StepVerdict["status"];
311- let baseDelta = 0;
312- if (greenSha === null) {
313- status = "no-green";
314- } else if (testsDeleted) {
315- status = "test-deleted";
316- baseDelta = -20;
317- } else if (!redFailed) {
318- status = "red-did-not-fail";
319- baseDelta = -5;
320- } else if (greenPassed === false) {
321- status = "green-did-not-pass";
322- baseDelta = -5;
323- } else if (hiddenPassed === false) {
324- status = "hidden-tests-failed";
325- baseDelta = 0;
326- } else if (hiddenPassed === true) {
327- status = "verified";
328- baseDelta = 20;
329- } else {
330- status = "discipline-only";
331- baseDelta = 5;
332- }
333- const scoreDelta = applyMode(baseDelta, mode);
334- const explanation = explainStep({ status, redSha, greenSha, hiddenPassed, mode });
335- steps.push({ stepId, redSha, greenSha, redFailed, greenPassed, hiddenPassed, status, scoreDelta, explanation });
336- }
337-
338- // Refactor commits aren't tied to red→green pairs: the spec rewards
339- // any refactor that keeps the existing tests green. A broken refactor
340- // (tests fail at the refactor commit) costs the same as a missed
341- // green — discipline matters even outside red→green pairs.
342- const refactors: RefactorVerdict[] = [];
343- for (const c of commits) {
344- if (c.phase !== "refactor") continue;
345- await runProc(["git", "checkout", "--quiet", c.sha], cwd, 5000);
346- const passed = await runTests(cwd);
347- const baseDelta = passed ? 5 : -5;
348- refactors.push({
349- sha: c.sha,
350- stepId: c.step,
351- testsPassed: passed,
352- scoreDelta: applyMode(baseDelta, mode),
353- explanation: explainRefactor(passed),
354- });
355- }
356-
357- const totalScore =
358- steps.reduce((a, s) => a + s.scoreDelta, 0) +
359- refactors.reduce((a, r) => a + r.scoreDelta, 0);
360- const verdict: Verdict = { headSha, mode, steps, refactors, totalScore, judgedAt: Date.now() };
361- saveRun(owner, repo, verdict);
362- return verdict;
363- } finally {
364- try {
365- rmSync(cwd, { recursive: true, force: true });
366- } catch {
367- // best effort cleanup
368- }
369- }
370-};
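
The runner-path branch chain in the removed judge code above reads as a priority-ordered lookup from observed facts to a status and a base score. Below is a minimal sketch of that table as a pure function, assuming the same branch order. `StepFacts` and `classifyStep` are hypothetical names used only for illustration. The mode adjustment (`applyMode`) is left out because its definition is not shown in this diff.

```typescript
// Hypothetical illustration only: StepFacts and classifyStep are not repo names.
type StepStatus =
  | "no-green"
  | "test-deleted"
  | "red-did-not-fail"
  | "green-did-not-pass"
  | "hidden-tests-failed"
  | "verified"
  | "discipline-only";

interface StepFacts {
  hasGreen: boolean;            // a green commit exists for the step
  testsDeleted: boolean;        // fewer tests at green than at red
  redFailed: boolean;           // the suite failed at the red commit
  greenPassed: boolean | null;  // null when green was never run
  hiddenPassed: boolean | null; // null when hidden tests were not run
}

// Earlier branches win, mirroring the if/else-if order in the diff.
const classifyStep = (f: StepFacts): { status: StepStatus; baseDelta: number } => {
  if (!f.hasGreen) return { status: "no-green", baseDelta: 0 };
  if (f.testsDeleted) return { status: "test-deleted", baseDelta: -20 };
  if (!f.redFailed) return { status: "red-did-not-fail", baseDelta: -5 };
  if (f.greenPassed === false) return { status: "green-did-not-pass", baseDelta: -5 };
  if (f.hiddenPassed === false) return { status: "hidden-tests-failed", baseDelta: 0 };
  if (f.hiddenPassed === true) return { status: "verified", baseDelta: 20 };
  return { status: "discipline-only", baseDelta: 5 };
};
```

Because the order is a priority, a step that deleted tests is scored as `test-deleted` even when its green commit passes.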
removed src/c32_real_reports.test.ts +0 −101
@@ -1,101 +0,0 @@
1-// Sibling test for c32_real_reports.ts. buildLiveReports itself fans out
2-// to fetchRepoCommits (network) so its end-to-end shape is covered by
3-// the live /reports/live route. The pure helpers underneath — agent
4-// attribution from commit messages, and the 30-day daily sparkline —
5-// are unit-testable here.
6-
7-import { describe, test, expect } from "bun:test";
8-import {
9- detectAgent,
10- buildTrend,
11- buildLiveReports,
12-} from "./c32_real_reports.ts";
13-import type { GithubCommit } from "./c14_github.ts";
14-
15-const mkCommit = (date: string, message = ""): GithubCommit => ({
16- sha: "0".repeat(40),
17- commit: {
18- message,
19- author: { name: "test", email: "[email protected]", date },
20- committer: { name: "test", email: "[email protected]", date },
21- },
22- author: null,
23- committer: null,
24-} as unknown as GithubCommit);
25-
26-describe("c32_real_reports — detectAgent", () => {
27- test("recognises a Claude Code commit via Co-Authored-By: Claude", () => {
28- expect(detectAgent("Add feature\n\nCo-Authored-By: Claude <noreply>")).toBe("claude-code");
29- });
30-
31- test("recognises a Cursor commit", () => {
32- expect(detectAgent("Fix bug\n\nCo-Authored-By: Cursor <[email protected]>")).toBe("cursor");
33- });
34-
35- test("recognises an Aider commit", () => {
36- expect(detectAgent("Refactor x\n\nCo-Authored-By: aider")).toBe("aider");
37- });
38-
39- test("returns unknown when no recognised footer is present", () => {
40- expect(detectAgent("Just a commit")).toBe("unknown");
41- expect(detectAgent("")).toBe("unknown");
42- });
43-
44- test("the regex is case-insensitive on the agent token", () => {
45- expect(detectAgent("Co-Authored-By: CLAUDE")).toBe("claude-code");
46- expect(detectAgent("co-authored-by: CURSOR")).toBe("cursor");
47- });
48-});
49-
50-describe("c32_real_reports — buildTrend (30-day daily sparkline)", () => {
51- // Use today (UTC) as the anchor — the function compares against UTC
52- // midnight, so we need ISO strings that fall on the right days.
53- const today = new Date();
54- today.setUTCHours(0, 0, 0, 0);
55- const iso = (daysAgo: number): string => {
56- const d = new Date(today.getTime() - daysAgo * 24 * 60 * 60 * 1000);
57- return d.toISOString();
58- };
59-
60- test("returns an array of `days` length", () => {
61- expect(buildTrend([], 30)).toHaveLength(30);
62- expect(buildTrend([], 7)).toHaveLength(7);
63- });
64-
65- test("empty input flat-lines at zero", () => {
66- const trend = buildTrend([], 7);
67- expect(trend.every((n) => n === 0)).toBe(true);
68- });
69-
70- test("a single commit today increments the last bucket", () => {
71- const trend = buildTrend([mkCommit(iso(0))], 7);
72- expect(trend[trend.length - 1]).toBe(1);
73- expect(trend.slice(0, -1).every((n) => n === 0)).toBe(true);
74- });
75-
76- test("multiple commits on the same day stack in the same bucket", () => {
77- const trend = buildTrend([mkCommit(iso(0)), mkCommit(iso(0)), mkCommit(iso(0))], 7);
78- expect(trend[trend.length - 1]).toBe(3);
79- });
80-
81- test("commits older than the window are dropped", () => {
82- const trend = buildTrend([mkCommit(iso(99))], 7);
83- expect(trend.every((n) => n === 0)).toBe(true);
84- });
85-
86- test("a commit `daysAgo` lands at index `days - 1 - daysAgo`", () => {
87- const trend = buildTrend([mkCommit(iso(2))], 7);
88- // index 6 = today, 5 = yesterday, 4 = 2 days ago
89- expect(trend[4]).toBe(1);
90- });
91-});
92-
93-describe("c32_real_reports — orchestrator entry point", () => {
94- test("buildLiveReports is exported as an async function", () => {
95- expect(typeof buildLiveReports).toBe("function");
96- // End-to-end coverage lives on /reports/live; this is the structural
97- // smoke that the export shape didn't drift. `.length` counts only
98- // non-default params (owner, repo) — perPage carries a default.
99- expect(buildLiveReports.length).toBe(2);
100- });
101-});
removed src/c32_real_reports.ts +0 −170
@@ -1,170 +0,0 @@
1-// c32 — logic: aggregate real GitHub commit history into the same
2-// AgentReport / RecentFlagged shape that c51_render_reports renders.
3-// Pure (given fetched commits in, produces report objects out); the
4-// I/O happens in c14_github.fetchRepoCommits which we call here.
5-//
6-// Attribution: Co-Authored-By footers are the agent-attribution channel
7-// the existing tdd.md commit history already uses. Anything without a
8-// recognised footer is bucketed as "unknown" and reported separately —
9-// it's still useful for volume context.
10-
11-import { parseCommit } from "./c31_commits.ts";
12-import { fetchRepoCommits, type GithubCommit } from "./c14_github.ts";
13-import type {
14- AgentReport,
15- FailureSlice,
16- RecentFlagged,
17-} from "./c31_reports_demo.ts";
18-
19-type LiveAgentSlug = AgentReport["slug"] | "unknown";
20-
21-export const detectAgent = (msg: string): LiveAgentSlug => {
22- if (/Co-Authored-By:.*Claude/i.test(msg)) return "claude-code";
23- if (/Co-Authored-By:.*Cursor/i.test(msg)) return "cursor";
24- if (/Co-Authored-By:.*Aider/i.test(msg)) return "aider";
25- return "unknown";
26-};
27-
28-const AGENT_NAMES: Record<AgentReport["slug"], string> = {
29- "claude-code": "Claude Code",
30- cursor: "Cursor",
31- aider: "Aider",
32-};
33-
34-// 30-day daily commit-count series, oldest → newest. When there are no
35-// commits in a day, that day's value is 0 — the sparkline still renders
36-// but flat-lines, which honestly reflects the data.
37-export const buildTrend = (commits: GithubCommit[], days = 30): number[] => {
38- const out = new Array<number>(days).fill(0);
39- const today = new Date();
40- today.setUTCHours(0, 0, 0, 0);
41- for (const c of commits) {
42- const d = new Date(c.commit.author.date);
43- d.setUTCHours(0, 0, 0, 0);
44- const ageDays = Math.floor((today.getTime() - d.getTime()) / (24 * 60 * 60 * 1000));
45- if (ageDays < 0 || ageDays >= days) continue;
46- const idx = days - 1 - ageDays;
47- const cur = out[idx] ?? 0;
48- out[idx] = cur + 1;
49- }
50- return out;
51-};
52-
53-const buildAgentReport = (
54- slug: AgentReport["slug"],
55- agentCommits: GithubCommit[],
56- repoSlug: string,
57-): AgentReport => {
58- const tagged = agentCommits.filter((c) => {
59- const phase = parseCommit(c.commit.message).phase;
60- return phase === "red" || phase === "green" || phase === "refactor";
61- });
62- const phaseCoveragePct = agentCommits.length === 0
63- ? 0
64- : Math.round((tagged.length / agentCommits.length) * 100);
65-
66- // Score is a proxy: phase-coverage is the only structural signal we
67- // can compute without running the test suite. When coverage is 0 the
68- // agent isn't attempting TDD, so the score is honestly low.
69- const score = phaseCoveragePct;
70-
71- // Failure mix collapses to two slices for live data — phase-tagged vs
72- // not. Fine-grained failure modes (red-did-not-fail, test-deleted, etc)
73- // need the runner sliver before they're computable.
74- const failureMix: FailureSlice[] = [
75- { label: "phase-tagged", pct: phaseCoveragePct, tone: "green" },
76- { label: "no phase tag", pct: 100 - phaseCoveragePct, tone: "muted" },
77- ];
78-
79- const recent: RecentFlagged[] = agentCommits
80- .slice(0, 5)
81- .map((c) => {
82- const parsed = parseCommit(c.commit.message);
83- const phase = parsed.phase === "red" || parsed.phase === "green" || parsed.phase === "refactor"
84- ? parsed.phase
85- : "green";
86- const failure = parsed.phase === "untagged" || parsed.phase === "init"
87- ? "no phase tag"
88- : `${parsed.phase} (live judge not yet wired)`;
89- return {
90- date: c.commit.author.date.slice(0, 10),
91- repo: repoSlug,
92- sha: c.sha.slice(0, 7),
93- phase,
94- failure,
95- pts: 0,
96- };
97- });
98-
99- const topIssueLabel = phaseCoveragePct === 100 ? "no current issues" : "no phase tag";
100- const topIssuePct = 100 - phaseCoveragePct;
101-
102- return {
103- slug,
104- name: AGENT_NAMES[slug],
105- score,
106- delta: 0,
107- commits: agentCommits.length,
108- phaseCoveragePct,
109- streak: 0,
110- streakBroken: false,
111- topIssueLabel,
112- topIssuePct,
113- failureMix,
114- trend: buildTrend(agentCommits),
115- recent,
116- };
117-};
118-
119-export interface LiveReports {
120- reports: AgentReport[];
121- unknownCount: number;
122- totalCommits: number;
123- earliest: string | null;
124- latest: string | null;
125- fetchedAt: number;
126-}
127-
128-export const buildLiveReports = async (
129- repoOwner: string,
130- repoName: string,
131- perPage = 100,
132-): Promise<LiveReports> => {
133- const commits = await fetchRepoCommits(repoOwner, repoName, perPage);
134- const repoSlug = `${repoOwner}/${repoName}`;
135- const byAgent = new Map<AgentReport["slug"], GithubCommit[]>();
136- let unknownCount = 0;
137-
138- for (const c of commits) {
139- const a = detectAgent(c.commit.message);
140- if (a === "unknown") {
141- unknownCount++;
142- continue;
143- }
144- const arr = byAgent.get(a) ?? [];
145- arr.push(c);
146- byAgent.set(a, arr);
147- }
148-
149- const order: AgentReport["slug"][] = ["claude-code", "cursor", "aider"];
150- const reports = order
151- .map((slug) => {
152- const list = byAgent.get(slug);
153- if (!list || list.length === 0) return null;
154- return buildAgentReport(slug, list, repoSlug);
155- })
156- .filter((r): r is AgentReport => r !== null);
157-
158- const dates = commits.map((c) => c.commit.author.date).sort();
159- const earliest = dates[0] ?? null;
160- const latest = dates[dates.length - 1] ?? null;
161-
162- return {
163- reports,
164- unknownCount,
165- totalCommits: commits.length,
166- earliest,
167- latest,
168- fetchedAt: Date.now(),
169- };
170-};
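
The day-bucketing in `buildTrend` above is easier to check with a fixed clock. The sketch below restates the same indexing rule (`idx = days - 1 - ageDays`, oldest first) in a hypothetical helper, `bucketByDay`. It differs from the original in that it takes `now` as a parameter, an assumption made here so the output is deterministic. It also works on plain ISO date strings rather than `GithubCommit` objects.

```typescript
// Hypothetical variant of the bucketing in buildTrend. `now` is injected
// (the original reads the clock itself) so the result is reproducible.
const DAY_MS = 24 * 60 * 60 * 1000;

// Timestamp of 00:00:00 UTC on the day containing `d`.
const utcMidnight = (d: Date): number =>
  Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate());

const bucketByDay = (isoDates: string[], days: number, now: Date): number[] => {
  const out = new Array<number>(days).fill(0);
  const today = utcMidnight(now);
  for (const iso of isoDates) {
    const ageDays = Math.floor((today - utcMidnight(new Date(iso))) / DAY_MS);
    if (ageDays < 0 || ageDays >= days) continue; // future, or older than the window
    const idx = days - 1 - ageDays; // last slot is "today"
    out[idx] = (out[idx] ?? 0) + 1;
  }
  return out;
};

const fixedNow = new Date("2024-05-10T15:30:00Z");
const trend = bucketByDay(
  [
    "2024-05-10T01:00:00Z", // today
    "2024-05-10T09:00:00Z", // today
    "2024-05-08T12:00:00Z", // two days ago
    "2024-01-01T00:00:00Z", // outside the 7-day window
  ],
  7,
  fixedNow,
);
console.log(trend); // [0, 0, 0, 0, 1, 0, 2]
```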
removed src/c32_real_tests.test.ts +0 −66
@@ -1,66 +0,0 @@
1-// Sibling test for c32_real_tests.ts. buildLiveTestData fans out to
2-// loadTestBundle + fetchRepoCommits (both network/disk) so the
3-// end-to-end is covered by the live /reports/live/tests route. The
4-// pure helpers — agent attribution and the file/name label shortener —
5-// are unit-testable here.
6-
7-import { describe, test, expect } from "bun:test";
8-import {
9- detectAgent,
10- shortenTestLabel,
11- buildLiveTestData,
12-} from "./c32_real_tests.ts";
13-
14-describe("c32_real_tests — detectAgent", () => {
15- test("recognises Claude Code via Co-Authored-By: Claude", () => {
16- expect(detectAgent("Add feature\n\nCo-Authored-By: Claude <noreply>")).toBe("claude-code");
17- });
18-
19- test("recognises Cursor", () => {
20- expect(detectAgent("Fix bug\n\nCo-Authored-By: Cursor <[email protected]>")).toBe("cursor");
21- });
22-
23- test("recognises Aider", () => {
24- expect(detectAgent("Refactor x\n\nCo-Authored-By: aider")).toBe("aider");
25- });
26-
27- test("returns null when no recognised footer is present (distinct from c32_real_reports which returns 'unknown')", () => {
28- // The two real_* files made different choices here: real_reports
29- // buckets unknown into its own slug; real_tests returns null so
30- // the caller can filter or fall back. Document the difference.
31- expect(detectAgent("Just a commit")).toBeNull();
32- expect(detectAgent("")).toBeNull();
33- });
34-
35- test("the regex is case-insensitive on the agent token", () => {
36- expect(detectAgent("Co-Authored-By: CLAUDE")).toBe("claude-code");
37- expect(detectAgent("co-authored-by: aider")).toBe("aider");
38- });
39-});
40-
41-describe("c32_real_tests — shortenTestLabel", () => {
42- test("keeps only the basename of the file path + the test name", () => {
43- expect(shortenTestLabel("src/foo/bar/baz.test.ts", "handles X")).toBe("baz.test.ts > handles X");
44- });
45-
46- test("handles a bare filename (no path) without splitting weirdly", () => {
47- expect(shortenTestLabel("baz.test.ts", "handles X")).toBe("baz.test.ts > handles X");
48- });
49-
50- test("handles an empty file string (falls back to the empty basename)", () => {
51- // .split('/').pop() on '' yields ''. Documented behaviour: the
52- // helper never throws; the caller decides whether to filter empties.
53- expect(shortenTestLabel("", "name")).toBe(" > name");
54- });
55-
56- test("preserves spaces and special chars in the test name", () => {
57- expect(shortenTestLabel("a.ts", "rejects `bad input`")).toBe("a.ts > rejects `bad input`");
58- });
59-});
60-
61-describe("c32_real_tests — orchestrator entry point", () => {
62- test("buildLiveTestData is exported as an async function", () => {
63- expect(typeof buildLiveTestData).toBe("function");
64- expect(buildLiveTestData.length).toBe(2);
65- });
66-});
removed src/c32_real_tests.ts +0 −142
@@ -1,142 +0,0 @@
1-// c32 — logic: aggregate the per-deploy test bundle into the same
2-// TestSnapshot[] / TestStability[] shape that the demo page renders.
3-// HEAD-only snapshots; stability accumulates as more deploys add runs.
4-//
5-// Pure given the bundle + commits in (no I/O of its own beyond delegating
6-// to c14_github's bundle loader and commits fetcher).
7-
8-import { fetchRepoCommits, loadTestBundle, type PlaceholderTest } from "./c14_github.ts";
9-import type {
10- AgentReport,
11- TestFailure,
12- TestSnapshot,
13- TestStability,
14-} from "./c31_reports_demo.ts";
15-
16-export const detectAgent = (msg: string): AgentReport["slug"] | null => {
17- if (/Co-Authored-By:.*Claude/i.test(msg)) return "claude-code";
18- if (/Co-Authored-By:.*Cursor/i.test(msg)) return "cursor";
19- if (/Co-Authored-By:.*Aider/i.test(msg)) return "aider";
20- return null;
21-};
22-
23-export const shortenTestLabel = (file: string, name: string): string => {
24- const base = file.split("/").pop() ?? file;
25- return `${base} > ${name}`;
26-};
27-
28-export interface LiveTestData {
29- snapshots: TestSnapshot[];
30- stability: TestStability[];
31- runsCount: number;
32- ranAt: number | null;
33- headSha: string | null;
34- placeholderTests: PlaceholderTest[];
35-}
36-
37-export const buildLiveTestData = async (
38- repoOwner: string,
39- repoName: string,
40-): Promise<LiveTestData> => {
41- const bundle = await loadTestBundle(repoOwner, repoName);
42- if (!bundle || bundle.runs.length === 0) {
43- return { snapshots: [], stability: [], runsCount: 0, ranAt: null, headSha: null, placeholderTests: [] };
44- }
45- const repoSlug = `${repoOwner}/${repoName}`;
46- const latest = bundle.runs[0];
47- if (!latest) {
48- return { snapshots: [], stability: [], runsCount: 0, ranAt: null, headSha: null, placeholderTests: [] };
49- }
50-
51- // For "since" we want the oldest run that has this test as failing.
52- const oldestFirst = [...bundle.runs].sort((a, b) => a.ranAt - b.ranAt);
53-
54- const failures: TestFailure[] = latest.tests
55- .filter((t) => t.status === "fail")
56- .map((t) => {
57- const firstFail = oldestFirst.find((r) =>
58- r.tests.some((x) => x.name === t.name && x.file === t.file && x.status === "fail"),
59- );
60- const sinceTs = firstFail?.ranAt ?? latest.ranAt;
61- return { test: shortenTestLabel(t.file, t.name), since: new Date(sinceTs).toISOString().slice(0, 10) };
62- });
63-
64- const snapshot: TestSnapshot = {
65- repo: repoSlug,
66- branch: latest.branch,
67- total: latest.total,
68- passing: latest.passing,
69- failing: latest.failing,
70- failures,
71- };
72-
73- // Stability: count pass/fail per (file, name) across every run, with
74- // "deleted" set when a previously-seen test is missing from latest.
75- const commits = await fetchRepoCommits(repoOwner, repoName, 100);
76- const shaToAgent = new Map<string, AgentReport["slug"] | null>();
77- for (const c of commits) shaToAgent.set(c.sha, detectAgent(c.commit.message));
78-
79- interface Stat {
80- name: string;
81- file: string;
82- pass: number;
83- fail: number;
84- lastBrokenSha: string | null;
85- lastBrokenAt: number;
86- }
87- const stats = new Map<string, Stat>();
88- for (const run of bundle.runs) {
89- for (const t of run.tests) {
90- const key = `${t.file}|${t.name}`;
91- let s = stats.get(key);
92- if (!s) {
93- s = { name: t.name, file: t.file, pass: 0, fail: 0, lastBrokenSha: null, lastBrokenAt: 0 };
94- stats.set(key, s);
95- }
96- if (t.status === "pass") s.pass++;
97- else {
98- s.fail++;
99- if (run.ranAt > s.lastBrokenAt) {
100- s.lastBrokenSha = run.sha;
101- s.lastBrokenAt = run.ranAt;
102- }
103- }
104- }
105- }
106-
107- const latestKeys = new Set(latest.tests.map((t) => `${t.file}|${t.name}`));
108-
109- // lastBrokenBy needs an agent slug; if we can't map a SHA to an agent
110- // (e.g. the commit isn't in the 100-commit window we fetch), fall
111- // back to the agent of the latest run, which is a defensible default
112- // for the dogfood case (one agent producing the history).
113- const fallbackAgent = (shaToAgent.get(latest.sha) ?? "claude-code") as AgentReport["slug"];
114-
115- const stability: TestStability[] = Array.from(stats.values())
116- .map<TestStability>((s) => {
117- const mapped = s.lastBrokenSha ? shaToAgent.get(s.lastBrokenSha) : null;
118- const agent = (mapped ?? fallbackAgent) as AgentReport["slug"];
119- const deleted = latestKeys.has(`${s.file}|${s.name}`) ? 0 : 1;
120- const flagged = s.fail > 0 && (deleted > 0 || s.fail >= Math.max(2, s.pass / 5));
121- return {
122- test: shortenTestLabel(s.file, s.name),
123- repo: repoSlug,
124- pass: s.pass,
125- fail: s.fail,
126- deleted,
127- lastBrokenBy: agent,
128- flagged,
129- };
130- })
131- .sort((a, b) => b.fail - a.fail || b.deleted - a.deleted || b.pass - a.pass)
132- .slice(0, 30);
133-
134- return {
135- snapshots: [snapshot],
136- stability,
137- runsCount: bundle.runs.length,
138- ranAt: latest.ranAt,
139- headSha: latest.sha,
140- placeholderTests: latest.placeholderTests ?? [],
141- };
142-};
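
The `flagged` expression in the stability block above combines three conditions, and a few concrete inputs make the threshold easier to see. The sketch below lifts that inline expression into a hypothetical helper, `isFlagged`, which is not a name from the repo. The arithmetic is copied from the diff unchanged.

```typescript
// Hypothetical helper restating the inline `flagged` expression from
// buildLiveTestData. A test is flagged when it has failed at least once AND
// either it is missing from the latest run, or its failure count reaches
// max(2, pass / 5).
const isFlagged = (pass: number, fail: number, deleted: number): boolean =>
  fail > 0 && (deleted > 0 || fail >= Math.max(2, pass / 5));

console.log(isFlagged(10, 1, 0)); // false: one failure is below the floor of 2
console.log(isFlagged(10, 2, 0)); // true: 2 >= max(2, 10 / 5)
console.log(isFlagged(20, 3, 0)); // false: the threshold is 20 / 5 = 4
console.log(isFlagged(5, 1, 1));  // true: failed once and is now missing
console.log(isFlagged(5, 0, 1));  // false: missing, but it never failed
```

The floor of 2 keeps a single failure from flagging a test. The `pass / 5` term raises the bar for tests with a long passing history.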
added src/c32_sama_v2_verify.test.ts +247 −0
@@ -0,0 +1,247 @@
1+import { describe, test, expect } from "bun:test";
2+import { verifySamaV2 } from "./c32_sama_v2_verify.ts";
3+import type { ProfileSpec, SamaV2Input } from "./c31_sama_v2.ts";
4+
5+// Minimal fixture profile mirroring the shape this repo's
6+// sama.profile.toml declares, but with synthetic prefixes so tests
7+// don't change when the live profile evolves.
8+const FIXTURE_PROFILE: ProfileSpec = {
9+ samaVersion: "2.0",
10+ profile: "test-fixture",
11+ layers: {
12+ 0: { sublayers: [{ name: "default", prefix: "p0_", index: 0 }] },
13+ 1: {
14+ sublayers: [
15+ { name: "logic", prefix: "p1a_", index: 0 },
16+ { name: "render", prefix: "p1b_", index: 1 },
17+ ],
18+ },
19+ 2: {
20+ sublayers: [
21+ { name: "data", prefix: "p2a_", index: 0 },
22+ { name: "io", prefix: "p2b_", index: 1 },
23+ ],
24+ },
25+ 3: {
26+ sublayers: [
27+ { name: "handlers", prefix: "p3a_", index: 0 },
28+ { name: "server", prefix: "p3b_", index: 1 },
29+ ],
30+ },
31+ },
32+};
33+
34+const mk = (entries: Array<[string, string]>): SamaV2Input => ({
35+ profile: FIXTURE_PROFILE,
36+ files: new Map(entries),
37+});
38+
39+describe("c32_sama_v2_verify — overall", () => {
40+ test("empty repo: every check passes with examined=0 for content-bearing checks", () => {
41+ const report = verifySamaV2(mk([]));
42+ expect(report.overallPassed).toBe(true);
43+ expect(report.checks).toHaveLength(7);
44+ for (const c of report.checks) expect(c.passed).toBe(true);
45+ });
46+
47+ test("a minimal Layer-0-only repo conforms", () => {
48+ const report = verifySamaV2(mk([
49+ ["src/p0_types.ts", "export const x = 1;\n"],
50+ ]));
51+ expect(report.overallPassed).toBe(true);
52+ });
53+});
54+
55+describe("c32_sama_v2_verify — Sorted (#1)", () => {
56+ test("a file without a profile-recognised prefix is flagged", () => {
57+ const report = verifySamaV2(mk([
58+ ["src/unknown_x.ts", "export const x = 1;\n"],
59+ ]));
60+ const sorted = report.checks.find((c) => c.id === 1)!;
61+ expect(sorted.passed).toBe(false);
62+ expect(sorted.violations.some((v) => v.file === "src/unknown_x.ts")).toBe(true);
63+ });
64+
65+ test("a profile whose prefixes lex-sort against layer order is flagged", () => {
66+ // Swap: Layer 0 prefix sorts AFTER Layer 1 prefix.
67+ const bad: ProfileSpec = {
68+ samaVersion: "2.0", profile: "bad",
69+ layers: {
70+ 0: { sublayers: [{ name: "default", prefix: "z0_", index: 0 }] },
71+ 1: { sublayers: [{ name: "default", prefix: "a1_", index: 0 }] },
72+ 2: { sublayers: [{ name: "default", prefix: "b2_", index: 0 }] },
73+ 3: { sublayers: [{ name: "default", prefix: "c3_", index: 0 }] },
74+ },
75+ };
76+ const report = verifySamaV2({ profile: bad, files: new Map() });
77+ const sorted = report.checks.find((c) => c.id === 1)!;
78+ expect(sorted.passed).toBe(false);
79+ expect(sorted.violations.length).toBeGreaterThan(0);
80+ });
81+});
82+
83+describe("c32_sama_v2_verify — Architecture (#2)", () => {
84+ test("an unprefixed src/*.ts file is flagged with a clear reason", () => {
85+ const report = verifySamaV2(mk([
86+ ["src/random.ts", "export const x = 1;\n"],
87+ ]));
88+ const arch = report.checks.find((c) => c.id === 2)!;
89+ expect(arch.passed).toBe(false);
90+ const vio = arch.violations.find((v) => v.file === "src/random.ts")!;
91+ expect(vio.detail).toContain("unprefixed");
92+ });
93+
94+ test("a properly-prefixed file is not flagged", () => {
95+ const report = verifySamaV2(mk([
96+ ["src/p1a_logic.ts", "export const x = 1;\n"],
97+ ]));
98+ expect(report.checks.find((c) => c.id === 2)!.passed).toBe(true);
99+ });
100+});
101+
102+describe("c32_sama_v2_verify — Modeled tests (#3)", () => {
103+ test("a Layer 1 file without a sibling test is flagged", () => {
104+ const report = verifySamaV2(mk([
105+ ["src/p1a_logic.ts", "export const x = 1;\n"],
106+ ]));
107+ const modeled = report.checks.find((c) => c.id === 3)!;
108+ expect(modeled.passed).toBe(false);
109+ const vio = modeled.violations[0]!;
110+ expect(vio.file).toBe("src/p1a_logic.ts");
111+ expect(vio.detail).toContain("p1a_logic.test.ts");
112+ });
113+
114+ test("a Layer 1 file with its sibling passes", () => {
115+ const report = verifySamaV2(mk([
116+ ["src/p1a_logic.ts", "export const x = 1;\n"],
117+ ["src/p1a_logic.test.ts", "import {expect, test} from \"bun:test\"; test(\"x\", () => { expect(1).toBe(1); });\n"],
118+ ]));
119+ expect(report.checks.find((c) => c.id === 3)!.passed).toBe(true);
120+ });
121+
122+ test("Layer 0 files don't require sibling tests", () => {
123+ const report = verifySamaV2(mk([
124+ ["src/p0_types.ts", "export const x = 1;\n"],
125+ ]));
126+ expect(report.checks.find((c) => c.id === 3)!.passed).toBe(true);
127+ });
128+});
129+
130+describe("c32_sama_v2_verify — Modeled boundary (#4)", () => {
131+ test("JSON.parse in Layer 1 is flagged", () => {
132+ const report = verifySamaV2(mk([
133+ ["src/p1a_naughty.ts", "export const f = (s: string) => JSON.parse(s);\n"],
134+ ]));
135+ const boundary = report.checks.find((c) => c.id === 4)!;
136+ expect(boundary.passed).toBe(false);
137+ expect(boundary.violations[0]!.detail).toContain("JSON.parse");
138+ });
139+
140+ test("JSON.parse in Layer 2 is OK (Layer 2 IS the boundary)", () => {
141+ const report = verifySamaV2(mk([
142+ ["src/p2b_adapter.ts", "export const f = (s: string) => JSON.parse(s);\n"],
143+ ]));
144+ expect(report.checks.find((c) => c.id === 4)!.passed).toBe(true);
145+ });
146+
147+ test("string literals containing JSON.parse don't false-positive", () => {
148+ const report = verifySamaV2(mk([
149+ ["src/p1a_logic.ts", "const explainer = \"to fix, call JSON.parse(input) in Layer 2\";\nexport const x = explainer.length;\n"],
150+ ]));
151+ expect(report.checks.find((c) => c.id === 4)!.passed).toBe(true);
152+ });
153+});
154+
155+describe("c32_sama_v2_verify — Atomic (#5)", () => {
156+ test("a file over the 700-line cap is flagged", () => {
157+ const fat = Array.from({ length: 720 }, (_, i) => `// line ${i}`).join("\n");
158+ const report = verifySamaV2(mk([
159+ ["src/p1a_fat.ts", fat],
160+ ]));
161+ const atomic = report.checks.find((c) => c.id === 5)!;
162+ expect(atomic.passed).toBe(false);
163+ expect(atomic.violations[0]!.detail).toContain("over the 700-line cap");
164+ });
165+
166+ test("a barrel re-export file is flagged", () => {
167+ const report = verifySamaV2(mk([
168+ ["src/p1a_barrel.ts", "export * from \"./p1a_a.ts\";\nexport * from \"./p1a_b.ts\";\n"],
169+ ]));
170+ const atomic = report.checks.find((c) => c.id === 5)!;
171+ expect(atomic.passed).toBe(false);
172+ expect(atomic.violations[0]!.detail).toContain("barrel");
173+ });
174+});
175+
176+describe("c32_sama_v2_verify — Law §1.2 (#6)", () => {
177+ test("upward import (Layer 1 → Layer 2) is flagged", () => {
178+ const report = verifySamaV2(mk([
179+ ["src/p1a_logic.ts", "import { x } from \"./p2a_data.ts\";\nexport const y = x;\n"],
180+ ["src/p1a_logic.test.ts", "import { test, expect } from \"bun:test\"; test(\"y\", () => { expect(1).toBe(1); });\n"],
181+ ["src/p2a_data.ts", "export const x = 1;\n"],
182+ ["src/p2a_data.test.ts","import { test, expect } from \"bun:test\"; test(\"x\", () => { expect(1).toBe(1); });\n"],
183+ ]));
184+ const law = report.checks.find((c) => c.id === 6)!;
185+ expect(law.passed).toBe(false);
186+ expect(law.violations.some((v) => v.detail.includes("upward"))).toBe(true);
187+ });
188+
189+ test("downward import (Layer 2 → Layer 0) passes", () => {
190+ const report = verifySamaV2(mk([
191+ ["src/p2a_data.ts", "import type { X } from \"./p0_types.ts\";\nexport const f = (): X => ({} as X);\n"],
192+ ["src/p2a_data.test.ts", "import { test, expect } from \"bun:test\"; test(\"f\", () => { expect(1).toBe(1); });\n"],
193+ ["src/p0_types.ts", "export interface X { id: number }\n"],
194+ ]));
195+ expect(report.checks.find((c) => c.id === 6)!.passed).toBe(true);
196+ });
197+
198+ test("same-layer reversed sublayer is flagged", () => {
199+ // p1a_logic is sublayer index 0 (logic), p1b_render is sublayer
200+ // index 1 (render). Logic importing render is reverse order.
201+ const report = verifySamaV2(mk([
202+ ["src/p1a_logic.ts", "import { r } from \"./p1b_render.ts\";\nexport const y = r;\n"],
203+ ["src/p1a_logic.test.ts", "import { test, expect } from \"bun:test\"; test(\"y\", () => { expect(1).toBe(1); });\n"],
204+ ["src/p1b_render.ts", "export const r = 1;\n"],
205+ ["src/p1b_render.test.ts","import { test, expect } from \"bun:test\"; test(\"r\", () => { expect(1).toBe(1); });\n"],
206+ ]));
207+ const law = report.checks.find((c) => c.id === 6)!;
208+ expect(law.passed).toBe(false);
209+ expect(law.violations.some((v) => v.detail.includes("sublayer"))).toBe(true);
210+ });
211+
212+ test("an import cycle is flagged", () => {
213+ const report = verifySamaV2(mk([
214+ ["src/p1a_a.ts", "import { y } from \"./p1a_b.ts\";\nexport const x = y;\n"],
215+ ["src/p1a_a.test.ts", "import { test, expect } from \"bun:test\"; test(\"x\", () => { expect(1).toBe(1); });\n"],
216+ ["src/p1a_b.ts", "import { x } from \"./p1a_a.ts\";\nexport const y = x;\n"],
217+ ["src/p1a_b.test.ts", "import { test, expect } from \"bun:test\"; test(\"y\", () => { expect(1).toBe(1); });\n"],
218+ ]));
219+ const law = report.checks.find((c) => c.id === 6)!;
220+ expect(law.passed).toBe(false);
221+ expect(law.violations.some((v) => v.detail.includes("cycle"))).toBe(true);
222+ });
223+});
224+
225+describe("c32_sama_v2_verify — Consistency §3 (#7)", () => {
226+ test("Layer 1 file reaching Layer 2 contradicts its declared prefix", () => {
227+ const report = verifySamaV2(mk([
228+ ["src/p1a_logic.ts", "import { f } from \"./p2a_data.ts\";\nexport const y = f;\n"],
229+ ["src/p1a_logic.test.ts", "import { test, expect } from \"bun:test\"; test(\"y\", () => { expect(1).toBe(1); });\n"],
230+ ["src/p2a_data.ts", "export const f = 1;\n"],
231+ ["src/p2a_data.test.ts", "import { test, expect } from \"bun:test\"; test(\"f\", () => { expect(1).toBe(1); });\n"],
232+ ]));
233+ const consistency = report.checks.find((c) => c.id === 7)!;
234+ expect(consistency.passed).toBe(false);
235+ expect(consistency.violations[0]!.detail).toContain("declared Layer 1");
236+ expect(consistency.violations[0]!.detail).toContain("Layer 2");
237+ });
238+
239+ test("downward-only imports are consistent", () => {
240+ const report = verifySamaV2(mk([
241+ ["src/p1a_logic.ts", "import type { X } from \"./p0_types.ts\";\nexport const y = (a: X) => a;\n"],
242+ ["src/p1a_logic.test.ts", "import { test, expect } from \"bun:test\"; test(\"y\", () => { expect(1).toBe(1); });\n"],
243+ ["src/p0_types.ts", "export interface X { id: number }\n"],
244+ ]));
245+ expect(report.checks.find((c) => c.id === 7)!.passed).toBe(true);
246+ });
247+});
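
The Sorted (#1) tests above rely on the rule that a profile's prefixes, read in lexicographic order, never step back to a lower layer. The sketch below is one way such a check could be written. It is an assumption for illustration, not the verifier's actual implementation, which is only partly visible in this diff. `outOfOrderPrefixes` and the trimmed `Layers` type are hypothetical.

```typescript
// Hypothetical sketch: the real Sorted check in c32_sama_v2_verify.ts may
// differ. The profile shape is trimmed to the fields this example needs.
type Layers = Record<string, { sublayers: Array<{ prefix: string }> }>;

// Returns the prefixes that, in lexicographic order, fall back to a lower
// layer than one already seen.
const outOfOrderPrefixes = (layers: Layers): string[] => {
  const flat = Object.entries(layers).flatMap(([layer, spec]) =>
    spec.sublayers.map((s) => ({ layer: Number(layer), prefix: s.prefix })),
  );
  // Plain code-unit comparison, i.e. the order a file listing would use.
  flat.sort((a, b) => (a.prefix < b.prefix ? -1 : a.prefix > b.prefix ? 1 : 0));
  const bad: string[] = [];
  let highest = -1;
  for (const { layer, prefix } of flat) {
    if (layer < highest) bad.push(prefix);
    else highest = layer;
  }
  return bad;
};

// Same prefixes as FIXTURE_PROFILE in the test file above.
const fixtureLayers: Layers = {
  0: { sublayers: [{ prefix: "p0_" }] },
  1: { sublayers: [{ prefix: "p1a_" }, { prefix: "p1b_" }] },
  2: { sublayers: [{ prefix: "p2a_" }, { prefix: "p2b_" }] },
  3: { sublayers: [{ prefix: "p3a_" }, { prefix: "p3b_" }] },
};

// Same prefixes as the `bad` profile in the Sorted test.
const swappedLayers: Layers = {
  0: { sublayers: [{ prefix: "z0_" }] },
  1: { sublayers: [{ prefix: "a1_" }] },
  2: { sublayers: [{ prefix: "b2_" }] },
  3: { sublayers: [{ prefix: "c3_" }] },
};

console.log(outOfOrderPrefixes(fixtureLayers)); // []
console.log(outOfOrderPrefixes(swappedLayers)); // ["z0_"]
```

In the swapped profile the lexicographic order is `a1_`, `b2_`, `c3_`, `z0_`, so the Layer 0 prefix arrives after Layer 3 and is the one reported.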
added src/c32_sama_v2_verify.ts +436 −0
@@ -0,0 +1,436 @@
1+// c32 — logic: the SAMA v2 verifier. Implements the seven §4
2+// conformance checks (Sorted, Architecture, Modeled-tests,
3+// Modeled-boundary, Atomic, the Law §1.2, Consistency §3) as pure
4+// functions over an in-memory (profile, files) input. Never reads
5+// the filesystem — the loader (c14_sama_profile + c21 handler)
6+// populates the input map. No mocks, no stubs: every check is a
7+// real grep/string-op on the supplied content.
8+
9+import {
10+ declaredLayer,
11+ type SamaV2Check,
12+ type SamaV2Input,
13+ type SamaV2Report,
14+ type SamaV2Violation,
15+} from "./c31_sama_v2.ts";
16+
17+// — shared utilities -------------------------------------------------
18+
19+// A SAMA file is one we expect to obey the layer rules: any *.ts
20+// under src/ that isn't a *.test.ts. Tests live next to source as
21+// siblings; they're examined for the Modeled check but don't carry
22+// their own layer.
23+const isSamaFile = (path: string): boolean =>
24+ path.startsWith("src/") && path.endsWith(".ts") && !path.endsWith(".test.ts");
25+
26+const isTestFile = (path: string): boolean =>
27+ path.startsWith("src/") && path.endsWith(".test.ts");
28+
29+// Strip JS/TS string literals and comments to whitespace so a regex
30+// that walks the source doesn't trip on test fixtures that contain
31+// the very patterns we're scanning for. Same shape as the helper in
32+// c32_sama_verify; duplicated here to keep c32_sama_v2_verify a
33+// stand-alone module the loader can pull in without dragging the v1
34+// verifier with it.
35+const stripStringsAndComments = (src: string): string => {
36+ let out = "";
37+ let i = 0;
38+ while (i < src.length) {
39+ const c = src[i];
40+ const n = src[i + 1];
41+ if (c === "/" && n === "/") {
42+ out += "  "; // two spaces: one per consumed character, so mask offsets match the source
43+ i += 2;
44+ while (i < src.length && src[i] !== "\n") { out += " "; i++; }
45+ } else if (c === "/" && n === "*") {
46+ out += "  "; // two spaces for the two-character opener
47+ i += 2;
48+ while (i < src.length - 1 && !(src[i] === "*" && src[i + 1] === "/")) {
49+ out += src[i] === "\n" ? "\n" : " ";
50+ i++;
51+ }
52+ out += "  "; // two spaces for the two-character closer
53+ i += 2;
54+ } else if (c === '"' || c === "'" || c === "`") {
55+ const quote = c;
56+ out += " ";
57+ i++;
58+ while (i < src.length && src[i] !== quote) {
59+ if (src[i] === "\\" && i + 1 < src.length) { out += "  "; i += 2; continue; } // escape pair: two characters, two spaces
60+ out += src[i] === "\n" ? "\n" : " ";
61+ i++;
62+ }
63+ out += " ";
64+ i++;
65+ } else {
66+ out += c;
67+ i++;
68+ }
69+ }
70+ return out;
71+};
72+
73+// Collect every relative ".ts" import edge in a file. Scans raw
74+// source: a stripped copy would erase the quoted import paths along
75+// with all other string literals, so the regex must run over the
76+// original. To avoid picking up import-like strings inside test
77+// fixtures, we cross-check each match position against the stripped
78+// mask — if the keyword `from` lands on whitespace in the mask, it
79+// was inside a string literal and we skip it.
80+const collectRelativeImports = (content: string): string[] => {
81+ const mask = stripStringsAndComments(content);
82+ const re = /\bfrom\s+["'](\.\/[A-Za-z0-9_./-]+\.ts)["']/g;
83+ const out: string[] = [];
84+ let m: RegExpExecArray | null;
85+ while ((m = re.exec(content)) !== null) {
86+ // If the `from` keyword position is whitespace in the mask, the
87+ // entire match was inside a string literal (e.g. a test fixture).
88+ if (mask[m.index] === " " || mask[m.index] === "\n") continue;
89+ if (m[1]) out.push(m[1]);
90+ }
91+ return out;
92+};
93+
94+// Resolve a relative import like "./c14_git.ts" from the importing
95+// file's directory to the repo-relative path used as the input map's
96+// key (e.g. "src/c14_git.ts").
97+const resolveImport = (fromPath: string, importPath: string): string => {
98+ const dir = fromPath.split("/").slice(0, -1).join("/");
99+ const rel = importPath.replace(/^\.\//, "");
100+ return dir + "/" + rel;
101+};
102+
103+// — Check 1: Sorted -------------------------------------------------
104+//
105+// "Every file carries a profile-recognised prefix; lexicographic
106+// prefix order equals layer order."
107+const checkSorted = (input: SamaV2Input): SamaV2Check => {
108+ const violations: SamaV2Violation[] = [];
109+ let examined = 0;
110+ // Collect (prefix, layer) pairs from the profile.
111+ const pairs: Array<{ prefix: string; layer: number }> = [];
112+ for (const [k, spec] of Object.entries(input.profile.layers)) {
113+ const layer = parseInt(k, 10);
114+ for (const sub of spec.sublayers) pairs.push({ prefix: sub.prefix, layer });
115+ }
116+ // For any two prefixes with layer(A) < layer(B), A must lex-sort < B.
117+ for (let i = 0; i < pairs.length; i++) {
118+ for (let j = 0; j < pairs.length; j++) {
119+ if (i === j) continue;
120+ const a = pairs[i]!;
121+ const b = pairs[j]!;
122+ if (a.layer < b.layer && a.prefix > b.prefix) {
123+ violations.push({
124+ file: a.prefix,
125+ detail: `prefix \`${a.prefix}\` (layer ${a.layer}) sorts after \`${b.prefix}\` (layer ${b.layer}) — lex order must equal layer order`,
126+ });
127+ }
128+ }
129+ }
130+ // Also count source files whose prefix isn't recognised by any
131+ // sublayer. They'd be flagged by Architecture too, but the Sorted
132+ // rule needs each file to have a recognised prefix.
133+ for (const path of input.files.keys()) {
134+ if (!isSamaFile(path)) continue;
135+ examined++;
136+ if (declaredLayer(path, input.profile) === null) {
137+ violations.push({ file: path, detail: "no profile-recognised prefix" });
138+ }
139+ }
140+ return {
141+ id: 1, name: "Sorted", property: "Sorted",
142+ passed: violations.length === 0, examined, violations,
143+ };
144+};
145+
146+// — Check 2: Architecture -------------------------------------------
147+//
148+// "Every file maps to exactly one canonical layer; no file is
149+// unprefixed or maps to two layers."
150+const checkArchitecture = (input: SamaV2Input): SamaV2Check => {
151+ const violations: SamaV2Violation[] = [];
152+ let examined = 0;
153+ for (const path of input.files.keys()) {
154+ if (!isSamaFile(path) && !isTestFile(path)) continue;
155+ examined++;
156+ const base = path.split("/").pop() ?? path;
157+ // Find every profile prefix that matches this filename. Exactly
158+ // one is required; zero = unprefixed (caught by Sorted too) but
159+ // we surface it here as the canonical "unmapped" failure.
160+ const matches: Array<{ layer: number; prefix: string }> = [];
161+ for (const [k, spec] of Object.entries(input.profile.layers)) {
162+ const layer = parseInt(k, 10);
163+ for (const sub of spec.sublayers) {
164+ if (base.startsWith(sub.prefix)) matches.push({ layer, prefix: sub.prefix });
165+ }
166+ }
167+ if (matches.length === 0) {
168+ violations.push({ file: path, detail: "unprefixed — does not match any profile prefix" });
169+ } else if (matches.length > 1) {
170+ // Two prefixes claim the same file: profile ambiguity.
171+ const distinctLayers = new Set(matches.map((m) => m.layer));
172+ if (distinctLayers.size > 1) {
173+ violations.push({
174+ file: path,
175+ detail: `ambiguous — matches multiple layers: ${matches.map((m) => `${m.prefix}→L${m.layer}`).join(", ")}`,
176+ });
177+ }
178+ }
179+ }
180+ return {
181+ id: 2, name: "Architecture", property: "Architecture",
182+ passed: violations.length === 0, examined, violations,
183+ };
184+};
185+
186+// — Check 3: Modeled (tests) ----------------------------------------
187+//
188+// "Every Layer 1 and Layer 2 behavior file has a sibling test file."
189+const checkModeledTests = (input: SamaV2Input): SamaV2Check => {
190+ const violations: SamaV2Violation[] = [];
191+ let examined = 0;
192+ for (const path of input.files.keys()) {
193+ if (!isSamaFile(path)) continue;
194+ const decl = declaredLayer(path, input.profile);
195+ if (!decl) continue;
196+ if (decl.layer !== 1 && decl.layer !== 2) continue;
197+ examined++;
198+ const siblingPath = path.replace(/\.ts$/, ".test.ts");
199+ if (!input.files.has(siblingPath)) {
200+ violations.push({
201+ file: path,
202+ detail: `no sibling test at \`${siblingPath}\` — Layer ${decl.layer} requires one`,
203+ });
204+ }
205+ }
206+ return {
207+ id: 3, name: "Modeled (tests)", property: "Modeled (tests)",
208+ passed: violations.length === 0, examined, violations,
209+ };
210+};
211+
212+// — Check 4: Modeled (boundary) -------------------------------------
213+//
214+// "External input is parsed only in Layer 2."
215+//
216+// §4.4 is profile-dependent (spec §6). Our profile defines boundary
217+// parsing as `JSON.parse(` of arbitrary input (not constant strings)
218+// or `new URL(` of arbitrary input — i.e. patterns that turn bytes
219+// into typed structures. Platform-provided parsers called *through*
220+// Layer 3 entry handlers (`req.json()`, `req.formData()`, route
221+// params) are treated as delegation to the platform's own Layer 2,
222+// not parsing performed in our Layer 3. The verifier reports any
223+// raw JSON.parse / new URL calls landing outside Layer 2.
224+const BOUNDARY_PATTERNS = [
225+ { name: "JSON.parse", re: /\bJSON\.parse\s*\(/ },
226+ { name: "new URL", re: /\bnew\s+URL\s*\(/ },
227+];
228+const checkModeledBoundary = (input: SamaV2Input): SamaV2Check => {
229+ const violations: SamaV2Violation[] = [];
230+ let examined = 0;
231+ for (const [path, content] of input.files.entries()) {
232+ if (!isSamaFile(path)) continue;
233+ const decl = declaredLayer(path, input.profile);
234+ if (!decl) continue;
235+ examined++;
236+ if (decl.layer === 2) continue; // Layer 2 is the legitimate site.
237+ const stripped = stripStringsAndComments(content);
238+ for (const pat of BOUNDARY_PATTERNS) {
239+ if (pat.re.test(stripped)) {
240+ violations.push({
241+ file: path,
242+ detail: `boundary pattern \`${pat.name}\` found in Layer ${decl.layer} — parsing belongs in Layer 2`,
243+ });
244+ }
245+ }
246+ }
247+ return {
248+ id: 4, name: "Modeled (boundary)", property: "Modeled (boundary)",
249+ passed: violations.length === 0, examined, violations,
250+ note: "profile-dependent (spec §4.4): boundary = raw `JSON.parse` / `new URL` outside Layer 2. Platform parsers reached via `req.json()` etc. are treated as delegation to the platform's own Layer 2.",
251+ };
252+};
253+
254+// — Check 5: Atomic -------------------------------------------------
255+//
256+// "No file exceeds the line cap (default ~700; profile may lower,
257+// never raise). No barrel re-export files."
258+const ATOMIC_LINE_CAP = 700;
259+const checkAtomic = (input: SamaV2Input): SamaV2Check => {
260+ const violations: SamaV2Violation[] = [];
261+ let examined = 0;
262+ for (const [path, content] of input.files.entries()) {
263+ if (!isSamaFile(path) && !isTestFile(path)) continue;
264+ examined++;
265+ const lines = content.replace(/\n$/, "").split("\n").length; // a trailing newline does not count as an extra line
266+ if (lines > ATOMIC_LINE_CAP) {
267+ violations.push({
268+ file: path,
269+ detail: `${lines} lines (over the ${ATOMIC_LINE_CAP}-line cap — split per UI/data domain)`,
270+ });
271+ }
272+ // Barrel detection: a file whose entire body is re-exports.
273+ // Heuristic: every non-blank, non-comment line is `export ... from`.
274+ const stripped = stripStringsAndComments(content);
275+ const codeLines = stripped.split("\n").map((l) => l.trim()).filter((l) => l.length > 0);
276+ if (codeLines.length >= 2 && codeLines.every((l) => /^export\s+(\*|\{)/.test(l) && /\bfrom\b/.test(l))) {
277+ violations.push({ file: path, detail: "barrel re-export file (all lines are `export … from`)" });
278+ }
279+ }
280+ return {
281+ id: 5, name: "Atomic", property: "Atomic",
282+ passed: violations.length === 0, examined, violations,
283+ };
284+};
285+
286+// — Check 6: The Law (§1.2) -----------------------------------------
287+//
288+// "Imports always point to a strictly lower layer number — never
289+// upward, never sideways across a higher number, never cyclic."
290+//
291+// Build the import graph from relative-.ts imports, then for each
292+// edge A → B require: layer(B) < layer(A), OR same layer + B's
293+// sublayer index <= A's sublayer index. Also run a DFS cycle detector.
294+const checkLaw = (input: SamaV2Input): SamaV2Check => {
295+ const violations: SamaV2Violation[] = [];
296+ let examined = 0;
297+ // Build adjacency.
298+ const adj = new Map<string, string[]>();
299+ for (const [path, content] of input.files.entries()) {
300+ if (!isSamaFile(path) && !isTestFile(path)) continue;
301+ examined++;
302+ const out: string[] = [];
303+ for (const imp of collectRelativeImports(content)) {
304+ const resolved = resolveImport(path, imp);
305+ // Only follow edges into known SAMA files (in-tree, in src/).
306+ if (input.files.has(resolved)) out.push(resolved);
307+ }
308+ adj.set(path, out);
309+ }
310+ // Edge-by-edge layer/sublayer check.
311+ for (const [from, outs] of adj.entries()) {
312+ const aDecl = declaredLayer(from, input.profile);
313+ if (!aDecl) continue; // Unmapped — caught by Architecture.
314+ for (const to of outs) {
315+ const bDecl = declaredLayer(to, input.profile);
316+ if (!bDecl) continue;
317+ if (bDecl.layer < aDecl.layer) continue; // strictly lower — OK
318+ if (bDecl.layer > aDecl.layer) {
319+ violations.push({
320+ file: from,
321+ detail: `imports \`${to}\` — Layer ${aDecl.layer} → Layer ${bDecl.layer} (upward, breaks §1.2)`,
322+ });
323+ continue;
324+ }
325+ // Same layer: sublayer ordering. The import target must be in
326+ // an earlier-or-equal sublayer slot (spec §2.2: later may import
327+ // earlier).
328+ if (bDecl.sublayer.index > aDecl.sublayer.index) {
329+ violations.push({
330+ file: from,
331+ detail: `imports \`${to}\` — same layer ${aDecl.layer} but sublayer order is reversed (${aDecl.sublayer.name} sublayer-index ${aDecl.sublayer.index} → ${bDecl.sublayer.name} sublayer-index ${bDecl.sublayer.index})`,
332+ });
333+ }
334+ }
335+ }
336+ // DFS cycle detection on the same graph.
337+ const WHITE = 0, GRAY = 1, BLACK = 2;
338+ const color = new Map<string, number>();
339+ for (const k of adj.keys()) color.set(k, WHITE);
340+ const cycles: string[][] = [];
341+ const stack: string[] = [];
342+ const dfs = (node: string): boolean => {
343+ color.set(node, GRAY);
344+ stack.push(node);
345+ for (const next of adj.get(node) ?? []) {
346+ const c = color.get(next) ?? WHITE;
347+ if (c === GRAY) {
348+ const idx = stack.indexOf(next);
349+ if (idx !== -1) cycles.push([...stack.slice(idx), next]);
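350+ continue; // back edge recorded; keep scanning so `node` is still popped and blackened below
351+ }
352+ // Recurse into unvisited nodes only. The result is ignored: cycles
353+ // accumulate in `cycles`, and every node reaches the pop below.
354+ if (c === WHITE) dfs(next);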
355+ }
356+ stack.pop();
357+ color.set(node, BLACK);
358+ return false;
359+ };
360+ for (const k of adj.keys()) if (color.get(k) === WHITE) dfs(k);
361+ for (const cyc of cycles) {
362+ violations.push({
363+ file: cyc[0] ?? "(unknown)",
364+ detail: `import cycle: ${cyc.join(" → ")}`,
365+ });
366+ }
367+ return {
368+ id: 6, name: "Law (§1.2)", property: "Law",
369+ passed: violations.length === 0, examined, violations,
370+ };
371+};
372+
373+// — Check 7: Consistency (§3) ---------------------------------------
374+//
375+// "Verifier FAILS if a file imports from a layer that its declared
376+// layer is not permitted to import." This is the same set of edges
377+// the Law check examines, framed from the file's own perspective:
378+// does the prefix lie about what the file actually does?
379+//
380+// We emit a separate verdict so the report can show both framings.
381+// The edge sets overlap but are not identical: this check skips test
382+// files and does not require the import target to exist in the input.
383+const checkConsistency = (input: SamaV2Input): SamaV2Check => {
384+ const violations: SamaV2Violation[] = [];
385+ let examined = 0;
386+ for (const [path, content] of input.files.entries()) {
387+ if (!isSamaFile(path)) continue;
388+ const aDecl = declaredLayer(path, input.profile);
389+ if (!aDecl) continue;
390+ examined++;
391+ let ceiling = -1;
392+ let ceilingFile: string | null = null;
393+ for (const imp of collectRelativeImports(content)) {
394+ const resolved = resolveImport(path, imp);
395+ const bDecl = declaredLayer(resolved, input.profile);
396+ if (!bDecl) continue;
397+ if (bDecl.layer > ceiling) { ceiling = bDecl.layer; ceilingFile = resolved; }
398+ }
399+ // Consistency fails if any import goes to a strictly higher
400+ // layer than the file's declared layer. Same-layer with bad
401+ // sublayer order is the Law's concern, not Consistency's.
402+ if (ceiling > aDecl.layer) {
403+ violations.push({
404+ file: path,
405+ detail: `declared Layer ${aDecl.layer} (prefix \`${aDecl.sublayer.prefix}\`) but imports reach Layer ${ceiling} via \`${ceilingFile}\` — the prefix claims something the imports contradict`,
406+ });
407+ }
408+ }
409+ return {
410+ id: 7, name: "Consistency (§3)", property: "Consistency",
411+ passed: violations.length === 0, examined, violations,
412+ };
413+};
414+
415+// — Orchestrator ----------------------------------------------------
416+
417+export const verifySamaV2 = (input: SamaV2Input): SamaV2Report => {
418+ const checks: SamaV2Check[] = [
419+ checkSorted(input),
420+ checkArchitecture(input),
421+ checkModeledTests(input),
422+ checkModeledBoundary(input),
423+ checkAtomic(input),
424+ checkLaw(input),
425+ checkConsistency(input),
426+ ];
427+ // Architecture's examined count is the canonical total — it counts
428+ // every file the profile assigns to a layer (or fails to).
429+ const examined = checks.find((c) => c.id === 2)?.examined ?? 0;
430+ return {
431+ profile: input.profile.profile,
432+ examined,
433+ checks,
434+ overallPassed: checks.every((c) => c.passed),
435+ };
436+};
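Check 6 finds import cycles with a three-colour depth-first search. The sketch below shows that technique on its own, outside the verifier. The name `findCycles`, the `Graph` alias, and the sample file names are hypothetical and chosen for illustration; this is not the verifier's own code, only one way the idea can be written so that every visited node is popped from the stack and marked done.

```typescript
type Graph = Map<string, string[]>;

// Three-colour DFS: WHITE = unvisited, GRAY = on the current path, BLACK = done.
// An edge into a GRAY node is a back edge, which closes a cycle.
const findCycles = (adj: Graph): string[][] => {
  const WHITE = 0, GRAY = 1, BLACK = 2;
  const color = new Map<string, number>();
  const stack: string[] = [];
  const cycles: string[][] = [];
  const visit = (node: string): void => {
    color.set(node, GRAY);
    stack.push(node);
    for (const next of adj.get(node) ?? []) {
      const c = color.get(next) ?? WHITE;
      if (c === GRAY) {
        // `next` is on the current path, so the path from it to here loops back.
        cycles.push([...stack.slice(stack.indexOf(next)), next]);
      } else if (c === WHITE) {
        visit(next);
      }
    }
    // Always unwind, even after a cycle was recorded.
    stack.pop();
    color.set(node, BLACK);
  };
  for (const node of adj.keys()) {
    if ((color.get(node) ?? WHITE) === WHITE) visit(node);
  }
  return cycles;
};

// Sample graph (made up): a.ts and b.ts import each other; c.ts imports b.ts.
const graph: Graph = new Map([
  ["a.ts", ["b.ts"]],
  ["b.ts", ["a.ts"]],
  ["c.ts", ["b.ts"]],
]);
const cycles = findCycles(graph);
console.log(cycles.map((c) => c.join(" -> "))); // [ "a.ts -> b.ts -> a.ts" ]
```

Only the mutual import is reported. `c.ts` reaches `b.ts` after it has been marked BLACK, so that edge is not mistaken for a cycle.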
modified src/c51_render_admin.ts +1 −1
@@ -10,7 +10,7 @@
10 10 // here is forward-compatible with the block editor that lands next.
11 11
12 12 import { escape, renderPage } from "./c51_render_layout.ts";
13-import type { SxDocumentSummary } from "./c13_database.ts";
13+import type { SxDocumentSummary } from "./c31_sxdoc.ts";
14 14 import type { SxDocument } from "./c31_sxdoc.ts";
15 15 import { sxToHtml } from "./c51_render_sxdoc.ts";
16 16
modified src/c51_render_edit.ts +1 −4
@@ -8,10 +8,7 @@ import {
8 8 escape,
9 9 } from "./c51_render_layout.ts";
10 10 import type { ResolvedEdit } from "./c32_edit_resolve.ts";
11-import type {
12- GitCommitOk,
13- GitCommitFailure,
14-} from "./c14_git.ts";
11+import type { GitCommitOk, GitCommitFailure } from "./c31_git_parse.ts";
15 12
16 13 const layoutWrap = (innerHtml: string): string =>
17 14 `<main class="md edit-page"><div class="edit-container">${innerHtml}</div></main>`;
modified src/c51_render_projects.ts +1 −1
@@ -1,7 +1,7 @@
1 1 // c51 (projects) — body builders for /projects, /projects/new,
2 2 // /projects/:owner/:repo. Imports chrome helpers from c51_render_layout.
3 3
4-import type { ProjectRow } from "./c13_database.ts";
4+import type { ProjectRow } from "./c31_project_config.ts";
5 5 import { PROJECT_CONFIG_PATH } from "./c31_project_config.ts";
6 6 import { escape } from "./c51_render_layout.ts";
7 7
modified src/c51_render_repo.ts +1 −1
@@ -6,7 +6,7 @@
6 6
7 7 import { marked } from "marked";
8 8 import { renderPage, escape } from "./c51_render_layout.ts";
9-import type { TreeEntry } from "./c14_git.ts";
9+import type { TreeEntry } from "./c31_git_parse.ts";
10 10
11 11 const shortSha = (sha: string): string => sha.slice(0, 7);
12 12