syntaxai/tdd.md · commit 1d22b18

Close dogfood loop + postmortem post: cost-flattening confirmed at 8m 8s

Three things in one commit:

1. Dogfood close: goals/sama-discipline-prefix.md flips pending →
   shipped, merge_sha f806580, pr_number 53. ALL_GOALS gets a new
   entry mirroring the file's metadata.

2. Postmortem post: content/blog/sama-v2-second-url-refactor-postmortem.md
   measures the hypothesis from /blog/sama-v2-git-url-refactor-postmortem.
   Predicted 1 hour; actual 8m 8s; ratio 7.4× faster. Commit-by-
   commit timeline + analysis of what the prediction got right + the
   one pattern-risk that surfaced (sed scope collision on filesystem
   paths). Plus next-step predictions for fixed-enum vs data-driven
   refactors.

3. Scorecard image: public/images/cost-flattening-confirmed.png +
   .svg with the dramatic 'predicted 1 hour, landed in 8 minutes
   8 seconds' headline. Watermarked per the /images/ convention.

Co-Authored-By: Claude Opus 4.7 <[email protected]>

author: syntaxai <[email protected]>
date: 2026-05-25 15:59:05 +01:00
parent: f806580
commit: 1d22b1863885e7b119cdd614feff6bd67f50b297

6 files changed · +196 −3

added content/blog/sama-v2-second-url-refactor-postmortem.md +91 −0

@@ -0,0 +1,91 @@
	1	+# 8 minutes 8 seconds — the cost-flattening hypothesis is confirmed
	2	+
	3	+The [git-url-refactor postmortem](/blog/sama-v2-git-url-refactor-postmortem) closed with a single falsifiable claim:
	4	+
	5	+> "If the second URL refactor — when it happens — lands in ~an hour, that's the data point. If it takes another evening, the pattern wasn't as portable as it looked. Either result is informative."
	6	+
	7	+The second URL refactor happened today. It landed in 8 minutes and 8 seconds.
	8	+
	9	+```
	10	+T+00:00:00 git checkout -b sama-discipline-prefix (first commit)
	11	+T+00:00:30 goals/sama-discipline-prefix.md written (dogfooded /goal)
	12	+T+00:01:45 b32_sama_discipline_url_redirect.ts + test (copy of git template)
	13	+T+00:02:30 d21_handlers_fallback.ts redirect block (paste alongside existing)
	14	+T+00:02:50 d21_app.ts Bun route + sitemap handler (one edit each)
	15	+T+00:04:00 sed pass over 13 content + src files (one regex)
	16	+T+00:04:20 one sed over-rewrite caught + fixed (test failed, 30s to revert)
	17	+T+00:05:10 419/419 tests green (+12 new helper cases)
	18	+T+00:06:30 commit, push, gh pr create, gh pr merge (the GitHub-flow)
	19	+T+00:07:30 deploy script ran (rebuild + restart container)
	20	+T+00:08:08 /healthz answered, live-verify passed (all 4 new URLs 200, all 4 old URLs 301)
	21	+```
	22	+
	23	+7.4× faster than predicted. Cost-flattening confirmed at a much more dramatic rate than the original hypothesis dared.
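The ratio can be checked directly from the two durations. A minimal sketch (variable names are illustrative, not from the repo):

```typescript
// Predicted vs. measured wall-clock time, both in seconds.
const predictedSeconds = 60 * 60;   // 1 hour
const actualSeconds = 8 * 60 + 8;   // 8m 8s = 488s

// How many times faster the refactor landed than predicted.
const speedup = predictedSeconds / actualSeconds;

console.log(`${speedup.toFixed(1)}x faster than predicted`);
```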
	24	+
	25	+![Cost-flattening hypothesis confirmed — 8m 8s vs 1h predicted](/images/cost-flattening-confirmed.png?v=1)
	26	+
	27	+## What the prediction got right
	28	+
	29	+The original postmortem named three things the pattern would carry through:
	30	+
	31	+- A 5-line Layer-1 helper with one regex
	32	+- A sibling test covering all match-cases + non-match-cases
	33	+- A Layer-3 wrapper that emits the 301
	34	+
	35	+All three landed identically in PR #53. The helper is 13 lines in both PRs (same code modulo identifier renaming). The Layer-3 wrapper is the same 11 lines in both fallback handlers, literally copy-pasted from the rewriteOldGitUrl block to the new rewriteOldSamaDisciplineUrl block. The sibling test grew from 9 cases to 12 — slightly more because the new pattern has more non-match neighbours (`/sama/v2`, `/sama/skill`, the new-form URLs) that needed explicit `null`-return tests.
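The helper-plus-wrapper shape described above can be sketched roughly as follows. Only the name `rewriteOldSamaDisciplineUrl` and the four discipline slugs come from this post; the regex, the wrapper name `redirectOldSamaDisciplineUrl`, and the `Cache-Control` value are assumptions for illustration, not the code from PR #53.

```typescript
// Hypothetical sketch: only the function name rewriteOldSamaDisciplineUrl and the
// four slugs come from the post; the regex and header values are assumptions.
const DISCIPLINES = ["sorted", "architecture", "modeled", "atomic"];
const OLD_URL = new RegExp(`^/sama/(${DISCIPLINES.join("|")})$`);

// Layer-1 helper: pure function, returns the new path or null for non-matches
// such as /sama/v2, /sama/skill, or an already-rewritten URL.
function rewriteOldSamaDisciplineUrl(pathname: string): string | null {
  const match = OLD_URL.exec(pathname);
  return match ? `/sama/discipline/${match[1]}` : null;
}

// Layer-3 wrapper: turns a helper hit into a 301; returns null so the caller
// can fall through to the next handler when the path is not an old URL.
function redirectOldSamaDisciplineUrl(request: Request): Response | null {
  const url = new URL(request.url);
  const target = rewriteOldSamaDisciplineUrl(url.pathname);
  if (target === null) return null;
  return new Response(null, {
    status: 301,
    headers: {
      Location: target + url.search, // keep any query string
      "Cache-Control": "public, max-age=3600", // assumed value
    },
  });
}
```

Keeping the helper free of `Request` and `Response` is what lets the sibling test enumerate match and non-match cases as plain string comparisons.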
	36	+
	37	+## What the prediction missed — pattern risk
	38	+
	39	+One surprise: the sed pass over-rewrote.
	40	+
	41	+The pattern `s\|/sama/(sorted\|architecture\|modeled\|atomic)\|/sama/discipline/\1\|g` matched inside filesystem paths like `content/sama/sorted.md` — turning them into the nonsensical `content/sama/discipline/sorted.md` (the directory `content/sama/discipline/` doesn't exist; the file is at `content/sama/sorted.md`).
	42	+
	43	+This bug surfaced via a test: `b32_edit_resolve.test.ts` failed when its expected `filePath: "content/sama/sorted.md"` got rewritten to `content/sama/discipline/sorted.md`. The test caught it in seconds, and reverting was one Edit. But it's a genuine new failure mode the first refactor didn't have, because git-url-drop-owner's regex was `/GIT/syntaxai/tdd.md/`, which is three segments long and far less likely to collide with filesystem paths.
	44	+
	45	+Pattern risk added to the recipe: when generating a sed for a URL refactor, prefer anchoring (`href="..."`, `[link](...)`) or use a more restrictive character class. The 30-second revert was cheap this time because a test caught it. In a refactor with no covering test, the over-rewrite would have landed silently.
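To illustrate the anchoring advice, here is a sketch of the same rewrite expressed as a TypeScript string replace instead of sed. The regexes and function names are assumptions for illustration; the anchored variant only rewrites a URL that directly follows a markdown `](` or an `href="`, and refuses to match when the slug is followed by a word character, dot, or hyphen.

```typescript
const SLUGS = "sorted|architecture|modeled|atomic";

// Unanchored: matches "/sama/<slug>" anywhere, including inside file paths.
const unanchored = new RegExp(`/sama/(${SLUGS})`, "g");

// Anchored: requires a link context before the URL and forbids a trailing
// word character, "." or "-" (so "/sama/sorted.md" is left alone).
const anchored = new RegExp(`(\\]\\(|href=")/sama/(${SLUGS})(?![\\w.-])`, "g");

function rewriteUnanchored(text: string): string {
  return text.replace(unanchored, "/sama/discipline/$1");
}

function rewriteAnchored(text: string): string {
  return text.replace(anchored, "$1/sama/discipline/$2");
}

const fixture = 'filePath: "content/sama/sorted.md"';
const link = "[sorted](/sama/sorted)";

console.log(rewriteUnanchored(fixture)); // over-rewrites the file path
console.log(rewriteAnchored(fixture));   // leaves the file path untouched
console.log(rewriteAnchored(link));      // rewrites the markdown link
```

The trade-off is that the anchored form misses bare URLs in prose or string literals, so those would need a separate, reviewed pass.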
	46	+
	47	+The /goal that follows the third URL refactor should include this lesson explicitly as an anti-fudge clause.
	48	+
	49	+## What lands when this pattern keeps repeating
	50	+
	51	+PR #42 took an evening. PR #53 took 8 minutes. Two data points aren't a trend, but they aren't nothing: the second refactor cost a small fraction of the first.
	52	+
	53	+The mechanism is mundane. The first time:
	54	+- You design the helper shape (`SitemapUrl`-like type, `rewrite*` function name, `null`-vs-string return for non-match)
	55	+- You design the test shape (match-all-kinds, non-match-cases, edge cases like empty input)
	56	+- You design the Layer-3 wrapper shape (Response with 301, Location header, Cache-Control)
	57	+- You discover the gotchas (regex order in the fallback, sed scope, Containerfile gotcha, breadcrumb cross-references)
	58	+
	59	+The second time, all of that is done. You import the template, change the slug, paste. The first refactor's 19 files become the second refactor's 17 files because the helper + test are now reusable structure, not new design.
	60	+
	61	+The third refactor — whichever one we pick — should land in similar time unless the cost-flattening hits a floor at the irreducible-minimum of "type the new slugs + sed". My guess: 5–10 minutes for any future URL move, dominated by deploy time (~50s per deploy) and gh-CLI roundtrips.
	62	+
	63	+## What about /blog/<slug> → /blog/<yyyy-mm>/<slug>?
	64	+
	65	+This was the third candidate in the postmortem. It's much bigger — ~27 blog posts × N cross-links each = probably 100+ references. The helper pattern would still work, but the sed-pass scope explodes, and the new URL needs to be computed from data (the post's `date` field), not from a fixed enum.
	66	+
	67	+Prediction for that one: not 8 minutes. Probably 20–30 minutes, because:
	68	+- Helper signature changes (must accept date AND slug, lookup the date from ALL_POSTS by slug)
	69	+- Sed pass over content/*.md is sketchier (more risk of over-rewrite)
	70	+- Sitemap handler needs the new URL shape, but the redirect needs the old-shape regex AND a way to know each post's date
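The signature change in the first bullet might look like the sketch below. The `BlogEntry` subset, the function name `rewriteOldBlogUrl`, and the slug character class are assumptions; the one sample entry uses the slug and date this commit adds to `ALL_POSTS`.

```typescript
// Assumed subset of the BlogEntry fields in src/a31_blog.ts.
interface BlogEntry {
  slug: string;
  date: string; // "yyyy-mm-dd"
}

// Stand-in for ALL_POSTS with a single known entry.
const POSTS: BlogEntry[] = [
  { slug: "sama-v2-second-url-refactor-postmortem", date: "2026-05-25" },
];

// Data-driven helper: the target depends on a lookup, not a fixed enum.
// Returns null for unknown slugs and for URLs already in the new shape.
function rewriteOldBlogUrl(pathname: string, posts: BlogEntry[]): string | null {
  const match = /^\/blog\/([a-z0-9-]+)$/.exec(pathname);
  if (!match) return null;
  const post = posts.find((p) => p.slug === match[1]);
  if (!post) return null;
  const yearMonth = post.date.slice(0, 7); // "2026-05-25" -> "2026-05"
  return `/blog/${yearMonth}/${post.slug}`;
}
```

Passing the post list as a parameter keeps the helper pure, so the sibling test can supply its own fixtures instead of depending on the live `ALL_POSTS`.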
	71	+
	72	+The cost-flattening claim holds for refactors of the same shape. When the input expands (fixed enum → data-driven), the helper has to grow. That's a falsifiable distinction worth testing next.
	73	+
	74	+## The empirical chain ratchets
	75	+
	76	+[`/blog/sama-v2-goal-chain-gap`](/blog/sama-v2-goal-chain-gap) said every artifact is auditable. Now: the time-cost of each kind of refactor is auditable too. The git-url postmortem published a 1-hour prediction. This post publishes the 8-minute measurement against it. Both timestamps are in git, both /goals are at [`/goals`](/goals), both PRs link back to their plan posts.
	77	+
	78	+Anyone reading [/blog/sama-v2-on-ramp-gap](/blog/sama-v2-on-ramp-gap) wondering "is this discipline actually faster than the alternative, or is it just talk?" now has one more data point. Two URL refactors under the same pattern, second one 7.4× faster. The claim is empirical now — not because of one measurement, but because of the delta between two.
	79	+
	80	+The §5 spec calls this "compliance proves the rules were followed; the delta proves they were worth following." That argument has been about cross-repo workingSetFit measurements until now. Today the same argument lands on wall-clock time. Pattern-as-redirect isn't a fashion; it's measurably cheaper the second time.
	81	+
	82	+## What's next on the empirical chain
	83	+
	84	+The natural next step is the third datapoint. Two falsifiable subclaims to test:
	85	+
	86	+1. A same-shape refactor stays small. Pick another fixed-enum URL move (e.g. `/sama/v2/example-crud` → `/sama/v2/examples/crud` and `/sama/v2/example-wordpress` → `/sama/v2/examples/wordpress`). Prediction: ≤ 8 minutes, possibly faster since 2 slugs vs 4.
	87	+2. A data-shaped refactor stays under 30 minutes. Pick `/blog/<slug>` → `/blog/<yyyy-mm>/<slug>`. Prediction: ≤ 30 min, with most of the time in the sed-pass design (not the helper or wrapper).
	88	+
	89	+If subclaim 1 lands under 8 minutes, cost-flattening hits its floor. If subclaim 2 lands over 30 minutes, the pattern's portability is bounded — fixed-enum cheap, data-driven expensive — and that boundary becomes the new empirical knowledge.
	90	+
	91	+Either outcome is the next blog post. The chain ratchets.

modified goals/sama-discipline-prefix.md +3 −3

@@ -3,9 +3,9 @@ slug: sama-discipline-prefix
3	3	title: Move /sama/<discipline> → /sama/discipline/<slug> — hypothesis test of cost-flattening
4	4	date: 2026-05-25
5	5	branch: sama-discipline-prefix
6		-pr_number: null
7		-merge_sha: null
8		-status: pending
	6	+pr_number: 53
	7	+merge_sha: f806580
	8	+status: shipped
9	9	related_posts: [sama-v2-git-url-refactor-postmortem]
10	10	---
11	11

added public/images/cost-flattening-confirmed.png +0 −0

added public/images/cost-flattening-confirmed.svg +86 −0

@@ -0,0 +1,86 @@
	1	+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1200 700" width="1200" height="700">
	2	+ <rect width="1200" height="700" fill="#0a0a0a"/>
	3	+
	4	+ <!-- Header -->
	5	+ <g font-family="ui-monospace, 'SF Mono', 'JetBrains Mono', Menlo, Consolas, monospace">
	6	+ <text x="80" y="46" font-size="20" font-weight="600" fill="#909090">Hypothesis test — cost-flattening of pattern-as-redirect</text>
	7	+ <text x="80" y="92" font-size="32" font-weight="700" fill="#e8e8e8">Predicted 1 hour. Landed in 8 minutes 8 seconds.</text>
	8	+ <text x="80" y="120" font-size="14" fill="#7a7a7a">The git-url-refactor postmortem closed with a falsifiable claim. The second URL refactor measured it. 7.4× faster than predicted.</text>
	9	+ </g>
	10	+
	11	+ <!-- Comparison header -->
	12	+ <g font-family="ui-monospace, 'SF Mono', 'JetBrains Mono', Menlo, Consolas, monospace" font-size="13" font-weight="600" letter-spacing="2">
	13	+ <text x="100" y="172" fill="#909090">DIMENSION</text>
	14	+ <text x="480" y="172" fill="#909090">PR #42 (FIRST)</text>
	15	+ <text x="780" y="172" fill="#909090">PR #53 (HYPOTHESIS TEST)</text>
	16	+ <text x="1080" y="172" fill="#909090">VERDICT</text>
	17	+ </g>
	18	+ <line x1="80" y1="184" x2="1120" y2="184" stroke="#2a2a2a" stroke-width="1"/>
	19	+
	20	+ <!-- Rows -->
	21	+ <g font-family="ui-monospace, 'SF Mono', 'JetBrains Mono', Menlo, Consolas, monospace" font-size="15">
	22	+
	23	+ <text x="100" y="216" fill="#c8c8c8">wall-clock (commit → live)</text>
	24	+ <text x="480" y="216" fill="#8a8a8a">an evening</text>
	25	+ <text x="780" y="216" fill="#c8c8c8" font-weight="700">8 min 8 sec</text>
	26	+ <text x="1080" y="216" fill="#7ec77e">✓ confirmed</text>
	27	+
	28	+ <text x="100" y="246" fill="#c8c8c8">files changed</text>
	29	+ <text x="480" y="246" fill="#8a8a8a">19</text>
	30	+ <text x="780" y="246" fill="#c8c8c8">17</text>
	31	+ <text x="1080" y="246" fill="#7ec77e">comparable</text>
	32	+
	33	+ <text x="100" y="276" fill="#c8c8c8">URL references rewritten</text>
	34	+ <text x="480" y="276" fill="#8a8a8a">49</text>
	35	+ <text x="780" y="276" fill="#c8c8c8">~22</text>
	36	+ <text x="1080" y="276" fill="#8a8a8a">smaller scope</text>
	37	+
	38	+ <text x="100" y="306" fill="#c8c8c8">Layer-1 helper LOC</text>
	39	+ <text x="480" y="306" fill="#8a8a8a">13</text>
	40	+ <text x="780" y="306" fill="#c8c8c8">13</text>
	41	+ <text x="1080" y="306" fill="#7ec77e">✓ identical</text>
	42	+
	43	+ <text x="100" y="336" fill="#c8c8c8">sibling-test cases</text>
	44	+ <text x="480" y="336" fill="#8a8a8a">9</text>
	45	+ <text x="780" y="336" fill="#c8c8c8">12</text>
	46	+ <text x="1080" y="336" fill="#7ec77e">covered</text>
	47	+
	48	+ <text x="100" y="366" fill="#c8c8c8">Layer-3 wrapper</text>
	49	+ <text x="480" y="366" fill="#8a8a8a">11 lines (Response + 301)</text>
	50	+ <text x="780" y="366" fill="#c8c8c8">11 lines (copy-paste)</text>
	51	+ <text x="1080" y="366" fill="#7ec77e">✓ identical</text>
	52	+
	53	+ <text x="100" y="396" fill="#c8c8c8">sed-pass over-rewrites</text>
	54	+ <text x="480" y="396" fill="#8a8a8a">0</text>
	55	+ <text x="780" y="396" fill="#c8c8c8">1 (caught + fixed in 30s)</text>
	56	+ <text x="1080" y="396" fill="#c89a3a">~ pattern risk</text>
	57	+
	58	+ <text x="100" y="426" fill="#c8c8c8">verifier verdict before</text>
	59	+ <text x="480" y="426" fill="#8a8a8a">7/7 ✓</text>
	60	+ <text x="780" y="426" fill="#c8c8c8">7/7 ✓</text>
	61	+ <text x="1080" y="426" fill="#7ec77e">stable</text>
	62	+
	63	+ <text x="100" y="456" fill="#c8c8c8">verifier verdict after</text>
	64	+ <text x="480" y="456" fill="#8a8a8a">7/7 ✓</text>
	65	+ <text x="780" y="456" fill="#c8c8c8">7/7 ✓</text>
	66	+ <text x="1080" y="456" fill="#7ec77e">stable</text>
	67	+
	68	+ <text x="100" y="486" fill="#c8c8c8">test count delta</text>
	69	+ <text x="480" y="486" fill="#8a8a8a">+9 (379 → 388)</text>
	70	+ <text x="780" y="486" fill="#c8c8c8">+12 (407 → 419)</text>
	71	+ <text x="1080" y="486" fill="#7ec77e">net positive</text>
	72	+ </g>
	73	+
	74	+ <!-- Bottom callout -->
	75	+ <rect x="80" y="520" width="1040" height="140" fill="#101a10" stroke="#1f3f1f" stroke-width="1" rx="6"/>
	76	+ <g font-family="ui-monospace, 'SF Mono', 'JetBrains Mono', Menlo, Consolas, monospace">
	77	+ <text x="100" y="552" font-size="16" font-weight="600" fill="#7ec77e">Cost-flattening hypothesis: CONFIRMED — much more dramatically than predicted.</text>
	78	+ <text x="100" y="578" font-size="14" fill="#c8c8c8">Predicted: 1 hour. Actual: 8m 8s. Ratio: 7.4× faster. The reusable shape (b32_<old>_url_redirect + sibling test +</text>
	79	+ <text x="100" y="602" font-size="14" fill="#c8c8c8">Layer-3 wrapper) collapses the URL-refactor wall-clock from "evening's work" to "coffee break". Same number of</text>
	80	+ <text x="100" y="626" font-size="14" fill="#c8c8c8">structural pieces; the cost of CONNECTING them was the bulk of the original effort, and that's now zero.</text>
	81	+ <text x="100" y="648" font-size="13" fill="#8a8a8a">One small pattern-risk surfaced: sed pattern matched inside filesystem paths (content/sama/sorted.md), caught by a unit test.</text>
	82	+ </g>
	83	+
	84	+ <!-- Watermark -->
	85	+ <text x="1120" y="684" text-anchor="end" font-family="ui-monospace, 'SF Mono', 'JetBrains Mono', Menlo, Consolas, monospace" font-size="12" fill="#5a5a5a">https://tdd.md</text>
	86	+</svg>

modified src/a31_blog.ts +6 −0

@@ -12,6 +12,12 @@ export interface BlogEntry {
12	12	}
13	13
14	14	export const ALL_POSTS: BlogEntry[] = [
	15	+ {
	16	+ slug: "sama-v2-second-url-refactor-postmortem",
	17	+ title: "8 minutes 8 seconds — the cost-flattening hypothesis is confirmed",
	18	+ description: "The git-url-refactor postmortem closed with a single falsifiable claim: 'if the second URL refactor lands in ~1 hour, cost-flattening of pattern-as-redirect is confirmed; if it takes another evening, the pattern wasn't as portable as it looked. Either is informative.' The second URL refactor happened today (moving /sama/<discipline> → /sama/discipline/<slug>). It landed in 8 minutes 8 seconds — 7.4× faster than predicted. Cost-flattening confirmed, much more dramatically than the hypothesis dared. Timeline breakdown commit-by-commit shows where each minute went: 1m45s on the helper + test (copy of git template); 30s on the fallback handler; 1m10s on the sed pass; 30s recovering from one sed over-rewrite (regex matched inside filesystem paths content/sama/sorted.md, caught by an existing edit-resolve test); 2m20s on the gh PR flow + deploy. What the prediction got right: the 13-line helper, the 11-line Layer-3 wrapper, the sibling-test structure — all landed identically. What it missed: pattern risk (sed scope on shorter URL prefixes is more collision-prone; lesson added to anti-fudge for next time). Two datapoints isn't a trend, but the mechanism is mundane: the first refactor designs the shape; the second imports the template, changes the slug, pastes. Predicts that future fixed-enum URL refactors land in ~5-10 min (deploy-time dominated), while data-driven refactors like /blog/<slug> → /blog/<yyyy-mm>/<slug> probably take ~20-30 min because the helper has to grow. The empirical chain ratchets: §5 has been about cross-repo workingSetFit deltas; this post lands wall-clock time as the same kind of measurement. 'Compliance proves the rules were followed; delta proves they were worth following' — that argument now applies to wall-clock time, not just file-fit ratios.",
	19	+ date: "2026-05-25",
	20	+ },
15	21	{
16	22	slug: "sama-v2-on-ramp-gap",
17	23	title: "Every artifact has a URL. The on-ramp doesn't.",

modified src/a31_goals.ts +10 −0

@@ -38,6 +38,16 @@ export interface GoalEntry {
38	38	}
39	39
40	40	export const ALL_GOALS: GoalEntry[] = [
	41	+ {
	42	+ slug: "sama-discipline-prefix",
	43	+ title: "Move /sama/<discipline> → /sama/discipline/<slug> — hypothesis test",
	44	+ date: "2026-05-25",
	45	+ branch: "sama-discipline-prefix",
	46	+ prNumber: 53,
	47	+ mergeSha: "f806580",
	48	+ status: "shipped",
	49	+ relatedPosts: ["sama-v2-git-url-refactor-postmortem"],
	50	+ },
41	51	{
42	52	slug: "contributing-md",
43	53	title: "Build CONTRIBUTING.md as the canonical on-ramp + drift-detection test",
