syntaxai/tdd.md · commit 1d22b18

Close dogfood loop + postmortem post: cost-flattening confirmed at 8m 8s

Three things in one commit:

1. Dogfood close: goals/sama-discipline-prefix.md flips pending →
   shipped, merge_sha f806580, pr_number 53. ALL_GOALS gets a new
   entry mirroring the file's metadata.

2. Postmortem post: content/blog/sama-v2-second-url-refactor-postmortem.md
   measures the hypothesis from /blog/sama-v2-git-url-refactor-postmortem.
   Predicted 1 hour; actual 8m 8s; ratio 7.4× faster. Commit-by-
   commit timeline + analysis of what the prediction got right + the
   one pattern-risk that surfaced (sed scope collision on filesystem
   paths). Plus next-step predictions for fixed-enum vs data-driven
   refactors.

3. Scorecard image: public/images/cost-flattening-confirmed.png +
   .svg with the dramatic 'predicted 1 hour, landed in 8 minutes
   8 seconds' headline. Watermarked per the /images/ convention.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
author: syntaxai <[email protected]>
date: 2026-05-25 15:59:05 +01:00
parent: f806580
commit: 1d22b1863885e7b119cdd614feff6bd67f50b297

6 files changed · +196 −3

added content/blog/sama-v2-second-url-refactor-postmortem.md +91 −0
@@ -0,0 +1,91 @@
1+# 8 minutes 8 seconds — the cost-flattening hypothesis is confirmed
2+
3+The [git-url-refactor postmortem](/blog/sama-v2-git-url-refactor-postmortem) closed with a single falsifiable claim:
4+
5+> *"If the second URL refactor — when it happens — lands in ~an hour, that's the data point. If it takes another evening, the pattern wasn't as portable as it looked. Either result is informative."*
6+
7+The second URL refactor happened today. It landed in **8 minutes and 8 seconds**.
8+
9+```
10+T+00:00:00 git checkout -b sama-discipline-prefix (first commit)
11+T+00:00:30 goals/sama-discipline-prefix.md written (dogfooded /goal)
12+T+00:01:45 b32_sama_discipline_url_redirect.ts + test (copy of git template)
13+T+00:02:30 d21_handlers_fallback.ts redirect block (paste alongside existing)
14+T+00:02:50 d21_app.ts Bun route + sitemap handler (one edit each)
15+T+00:04:00 sed pass over 13 content + src files (one regex)
16+T+00:04:20 one sed over-rewrite caught + fixed (test failed, 30s to revert)
17+T+00:05:10 419/419 tests green (+12 new helper cases)
18+T+00:06:30 commit, push, gh pr create, gh pr merge (the GitHub-flow)
19+T+00:07:30 deploy script ran (rebuild + restart container)
20+T+00:08:08 /healthz answered, live-verify passed (all 4 new URLs 200, all 4 old URLs 301)
21+```
22+
23+7.4× faster than predicted. Cost-flattening confirmed at a much more dramatic rate than the original hypothesis dared.
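
The 7.4× figure follows directly from the two durations quoted above. A minimal sketch of the arithmetic, where `toSeconds` is a hypothetical helper for the `HH:MM:SS` stamps used in the timeline:

```typescript
// Convert an "HH:MM:SS" stamp (as used in the timeline above) to seconds.
// `toSeconds` is a hypothetical helper, not part of the repository.
function toSeconds(stamp: string): number {
  const [hours, minutes, seconds] = stamp.split(":").map(Number);
  return hours * 3600 + minutes * 60 + seconds;
}

const predictedSeconds = toSeconds("01:00:00"); // the "~an hour" prediction
const actualSeconds = toSeconds("00:08:08");    // final timeline stamp, 488 s

// Speedup is predicted time divided by actual time.
const speedup = predictedSeconds / actualSeconds;
console.log(speedup.toFixed(1)); // prints "7.4"
```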
24+
25+![Cost-flattening hypothesis confirmed — 8m 8s vs 1h predicted](/images/cost-flattening-confirmed.png?v=1)
26+
27+## What the prediction got right
28+
29+The original postmortem named three things the pattern would carry through:
30+
31+- A 5-line Layer-1 helper with one regex
32+- A sibling test covering all match-cases + non-match-cases
33+- A Layer-3 wrapper that emits the 301
34+
35+All three landed identically in PR #53. The helper is **13 lines** in both PRs (same code modulo identifier renaming). The Layer-3 wrapper is the **same 11 lines** in both fallback handlers, literally copy-pasted from the rewriteOldGitUrl block to the new rewriteOldSamaDisciplineUrl block. The sibling test grew from 9 cases to 12 — slightly more because the new pattern has more non-match neighbours (`/sama/v2`, `/sama/skill`, the new-form URLs) that needed explicit `null`-return tests.
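
For readers who have not seen the helper, here is a rough sketch of what a Layer-1 redirect helper of this shape might look like. The function name `rewriteOldSamaDisciplineUrl` comes from the post; the body, the `DISCIPLINES` constant, and the exact regex are assumptions, since `b32_sama_discipline_url_redirect.ts` itself is not shown in this commit.

```typescript
// Hypothetical sketch only: the real helper's body is not part of this diff.
const DISCIPLINES = ["sorted", "architecture", "modeled", "atomic"] as const;

// Match exactly /sama/<discipline>, nothing before or after it.
const OLD_DISCIPLINE_URL = new RegExp(`^/sama/(${DISCIPLINES.join("|")})$`);

// Returns the new-form path for an old-form URL, or null for everything else.
function rewriteOldSamaDisciplineUrl(pathname: string): string | null {
  const match = OLD_DISCIPLINE_URL.exec(pathname);
  return match ? `/sama/discipline/${match[1]}` : null;
}
```

Under these assumptions, the non-match neighbours named above (`/sama/v2`, `/sama/skill`, and the new-form URLs) all return `null`, which is the behaviour the extra sibling-test cases pin down.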
36+
37+## What the prediction missed — pattern risk
38+
39+One surprise: the sed pass over-rewrote.
40+
41+The pattern `s#/sama/(sorted|architecture|modeled|atomic)#/sama/discipline/\1#g` matched **inside filesystem paths** like `content/sama/sorted.md`, turning them into the nonsensical `content/sama/discipline/sorted.md` (the directory `content/sama/discipline/` doesn't exist; the file is at `content/sama/sorted.md`).
42+
43+This bug surfaced via a test: `b32_edit_resolve.test.ts` failed when its expected `filePath: "content/sama/sorted.md"` got rewritten to `content/sama/discipline/sorted.md`. The test caught it in seconds, and reverting was one Edit. But it's a genuine new failure mode the first refactor didn't have, because git-url-drop-owner's regex was `/GIT/syntaxai/tdd.md/`, three segments long and far less likely to collide with filesystem paths.
44+
45+**Pattern risk added to the recipe**: when generating a sed for a URL refactor, prefer anchoring (`href="..."`, `[link](...)`) or use a more restrictive character class. The 30-second revert was cheap *this time* because a test caught it. In a refactor with no covering test, the over-rewrite would have landed silently.
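
The over-rewrite and one possible anchored alternative can be reproduced in a few lines. This is a sketch using JavaScript regexes rather than sed; the anchoring on `](` and `href="` follows the recipe above, while the helper names and the trailing lookahead are assumptions made for illustration.

```typescript
const SLUGS = "sorted|architecture|modeled|atomic";

// Unanchored: rewrites every occurrence, including ones inside file paths.
const unanchored = new RegExp(`/sama/(${SLUGS})`, "g");

// Anchored: only rewrite when the URL directly follows a markdown link
// opener `](` or an `href="` attribute, and the slug is not a prefix of
// a longer word.
const anchored = new RegExp(`(\\]\\(|href=")/sama/(${SLUGS})(?![\\w-])`, "g");

// Hypothetical helper names, not taken from the repository.
function rewriteUnanchored(text: string): string {
  return text.replace(unanchored, "/sama/discipline/$1");
}

function rewriteAnchored(text: string): string {
  return text.replace(anchored, "$1/sama/discipline/$2");
}

const fixture = 'filePath: "content/sama/sorted.md"';
console.log(rewriteUnanchored(fixture)); // the path is rewritten (the bug)
console.log(rewriteAnchored(fixture));   // the path is left untouched
console.log(rewriteAnchored("[Sorted](/sama/sorted)")); // the link is rewritten
```

The anchored form trades recall for safety: a bare URL in prose would be skipped, so a grep for leftovers after the pass is still worthwhile.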
46+
47+The /goal that follows the third URL refactor should include this lesson explicitly as an anti-fudge clause.
48+
49+## What lands when this pattern keeps repeating
50+
51+PR #42 took an evening. PR #53 took 8 minutes. Two datapoints aren't a trend, and they can't establish the shape of the curve, but it's not nothing: **the second refactor cost a small fraction of the first.**
52+
53+The mechanism is mundane. The first time:
54+- You design the helper shape (`SitemapUrl`-like type, `rewrite*` function name, `null`-vs-string return for non-match)
55+- You design the test shape (match-all-kinds, non-match-cases, edge cases like empty input)
56+- You design the Layer-3 wrapper shape (Response with 301, Location header, Cache-Control)
57+- You discover the gotchas (regex order in the fallback, sed scope, Containerfile gotcha, breadcrumb cross-references)
58+
59+The second time, all of that is done. You **import the template, change the slug, paste**. The first refactor's 19 files become the second refactor's 17 files because the helper + test are now reusable structure, not new design.
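
The Layer-3 wrapper shape listed above (Response with 301, Location header, Cache-Control) could look roughly like the following. This is a sketch, not the 11 lines from `d21_handlers_fallback.ts`: the function name `redirectIfRewritten` and the `Cache-Control` value are assumptions, and it relies on the standard `Response` global that Bun provides.

```typescript
// Hypothetical Layer-3 wrapper: turn a Layer-1 rewrite result into a 301.
// The Cache-Control value below is an assumption, not taken from the repo.
function redirectIfRewritten(
  pathname: string,
  rewrite: (pathname: string) => string | null,
): Response | null {
  const target = rewrite(pathname);
  if (target === null) return null; // not an old URL: let the fallback continue
  return new Response(null, {
    status: 301,
    headers: {
      Location: target,
      "Cache-Control": "public, max-age=3600",
    },
  });
}
```

Keeping the wrapper generic over the `rewrite` function is what makes the second refactor a paste: only the Layer-1 helper passed in changes.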
60+
61+The third refactor — whichever one we pick — should land in similar time *unless* the cost-flattening hits a floor at the irreducible-minimum of "type the new slugs + sed". My guess: 5–10 minutes for any future URL move, dominated by deploy time (~50s per deploy) and gh-CLI roundtrips.
62+
63+## What about /blog/&lt;slug&gt; → /blog/&lt;yyyy-mm&gt;/&lt;slug&gt;?
64+
65+This was the third candidate in the postmortem. It's much bigger — ~27 blog posts × N cross-links each = probably 100+ references. The helper pattern would still work, but the sed-pass scope explodes, and the new URL needs to be computed *from data* (the post's `date` field), not from a fixed enum.
66+
67+Prediction for that one: not 8 minutes. Probably 20–30 minutes, because:
68+- Helper signature changes (must accept date AND slug, look up the date from ALL_POSTS by slug)
69+- Sed pass over content/*.md is sketchier (more risk of over-rewrite)
70+- Sitemap handler needs the new URL shape, but the redirect needs the old-shape regex AND a way to know each post's date
71+
72+The cost-flattening claim holds *for refactors of the same shape*. When the input expands (fixed enum → data-driven), the helper has to grow. That's a falsifiable distinction worth testing next.
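
To make the fixed-enum vs data-driven distinction concrete, here is a sketch of how the helper might grow for the `/blog` move. Everything in it is hypothetical: `PostRef` stands in for the relevant fields of `BlogEntry`, `rewriteOldBlogUrl` is an invented name, and the slug character class is an assumption.

```typescript
// Minimal stand-in for the fields of a blog entry this helper would need.
type PostRef = { slug: string; date: string }; // date formatted "yyyy-mm-dd"

// Old form: /blog/<slug>. New form: /blog/<yyyy-mm>/<slug>.
// Unlike the fixed-enum helper, this one needs the post list as input.
function rewriteOldBlogUrl(pathname: string, posts: PostRef[]): string | null {
  const match = /^\/blog\/([a-z0-9-]+)$/.exec(pathname);
  if (!match) return null; // not an old-form blog URL
  const post = posts.find((p) => p.slug === match[1]);
  if (!post) return null; // unknown slug: no redirect target
  const yearMonth = post.date.slice(0, 7); // "2026-05-25" -> "2026-05"
  return `/blog/${yearMonth}/${post.slug}`;
}

const posts: PostRef[] = [
  { slug: "sama-v2-second-url-refactor-postmortem", date: "2026-05-25" },
];
console.log(rewriteOldBlogUrl("/blog/sama-v2-second-url-refactor-postmortem", posts));
```

The extra parameter is the visible cost of going data-driven: the regex alone can no longer produce the target, so the helper and its tests both need fixture data.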
73+
74+## The empirical chain ratchets
75+
76+[`/blog/sama-v2-goal-chain-gap`](/blog/sama-v2-goal-chain-gap) said *every artifact is auditable*. Now: the *time-cost* of each kind of refactor is auditable too. The git-url postmortem published a 1-hour prediction. This post publishes the 8-minute measurement against it. Both timestamps are in git, both /goals are at [`/goals`](/goals), both PRs link back to their plan posts.
77+
78+Anyone reading [/blog/sama-v2-on-ramp-gap](/blog/sama-v2-on-ramp-gap) wondering *"is this discipline actually faster than the alternative, or is it just talk?"* now has one more data point. Two URL refactors under the same pattern, second one 7.4× faster. The claim is empirical now — not because of one measurement, but because of the **delta** between two.
79+
80+The §5 spec calls this *"compliance proves the rules were followed; the delta proves they were worth following."* That argument has been about cross-repo workingSetFit measurements until now. Today the same argument lands on wall-clock time. Pattern-as-redirect isn't a fashion; it's measurably cheaper the second time.
81+
82+## What's next on the empirical chain
83+
84+The natural next step is the third datapoint. Two falsifiable subclaims to test:
85+
86+1. **A same-shape refactor stays small.** Pick another fixed-enum URL move (e.g. `/sama/v2/example-crud` → `/sama/v2/examples/crud` and `/sama/v2/example-wordpress` → `/sama/v2/examples/wordpress`). Prediction: ≤ 8 minutes, possibly faster since 2 slugs vs 4.
87+2. **A data-shaped refactor stays under 30 minutes.** Pick `/blog/<slug>` → `/blog/<yyyy-mm>/<slug>`. Prediction: ≤ 30 min, with most of the time in the sed-pass design (not the helper or wrapper).
88+
89+If subclaim 1 lands under 8 minutes, cost-flattening hits its floor. If subclaim 2 lands over 30 minutes, the pattern's portability is bounded — *fixed-enum cheap, data-driven expensive* — and that boundary becomes the new empirical knowledge.
90+
91+Either outcome is the next blog post. The chain ratchets.
modified goals/sama-discipline-prefix.md +3 −3
@@ -3,9 +3,9 @@ slug: sama-discipline-prefix
33 title: Move /sama/<discipline> → /sama/discipline/<slug> — hypothesis test of cost-flattening
44 date: 2026-05-25
55 branch: sama-discipline-prefix
6-pr_number: null
7-merge_sha: null
8-status: pending
6+pr_number: 53
7+merge_sha: f806580
8+status: shipped
99 related_posts: [sama-v2-git-url-refactor-postmortem]
1010 ---
1111
added public/images/cost-flattening-confirmed.png +0 −0
added public/images/cost-flattening-confirmed.svg +86 −0
@@ -0,0 +1,86 @@
1+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1200 700" width="1200" height="700">
2+ <rect width="1200" height="700" fill="#0a0a0a"/>
3+
4+ <!-- Header -->
5+ <g font-family="ui-monospace, 'SF Mono', 'JetBrains Mono', Menlo, Consolas, monospace">
6+ <text x="80" y="46" font-size="20" font-weight="600" fill="#909090">Hypothesis test — cost-flattening of pattern-as-redirect</text>
7+ <text x="80" y="92" font-size="32" font-weight="700" fill="#e8e8e8">Predicted 1 hour. Landed in 8 minutes 8 seconds.</text>
8+ <text x="80" y="120" font-size="14" fill="#7a7a7a">The git-url-refactor postmortem closed with a falsifiable claim. The second URL refactor measured it. 7.4× faster than predicted.</text>
9+ </g>
10+
11+ <!-- Comparison header -->
12+ <g font-family="ui-monospace, 'SF Mono', 'JetBrains Mono', Menlo, Consolas, monospace" font-size="13" font-weight="600" letter-spacing="2">
13+ <text x="100" y="172" fill="#909090">DIMENSION</text>
14+ <text x="480" y="172" fill="#909090">PR #42 (FIRST)</text>
15+ <text x="780" y="172" fill="#909090">PR #53 (HYPOTHESIS TEST)</text>
16+ <text x="1080" y="172" fill="#909090">VERDICT</text>
17+ </g>
18+ <line x1="80" y1="184" x2="1120" y2="184" stroke="#2a2a2a" stroke-width="1"/>
19+
20+ <!-- Rows -->
21+ <g font-family="ui-monospace, 'SF Mono', 'JetBrains Mono', Menlo, Consolas, monospace" font-size="15">
22+
23+ <text x="100" y="216" fill="#c8c8c8">wall-clock (commit → live)</text>
24+ <text x="480" y="216" fill="#8a8a8a">an evening</text>
25+ <text x="780" y="216" fill="#c8c8c8" font-weight="700">8 min 8 sec</text>
26+ <text x="1080" y="216" fill="#7ec77e">✓ confirmed</text>
27+
28+ <text x="100" y="246" fill="#c8c8c8">files changed</text>
29+ <text x="480" y="246" fill="#8a8a8a">19</text>
30+ <text x="780" y="246" fill="#c8c8c8">17</text>
31+ <text x="1080" y="246" fill="#7ec77e">comparable</text>
32+
33+ <text x="100" y="276" fill="#c8c8c8">URL references rewritten</text>
34+ <text x="480" y="276" fill="#8a8a8a">49</text>
35+ <text x="780" y="276" fill="#c8c8c8">~22</text>
36+ <text x="1080" y="276" fill="#8a8a8a">smaller scope</text>
37+
38+ <text x="100" y="306" fill="#c8c8c8">Layer-1 helper LOC</text>
39+ <text x="480" y="306" fill="#8a8a8a">13</text>
40+ <text x="780" y="306" fill="#c8c8c8">13</text>
41+ <text x="1080" y="306" fill="#7ec77e">✓ identical</text>
42+
43+ <text x="100" y="336" fill="#c8c8c8">sibling-test cases</text>
44+ <text x="480" y="336" fill="#8a8a8a">9</text>
45+ <text x="780" y="336" fill="#c8c8c8">12</text>
46+ <text x="1080" y="336" fill="#7ec77e">covered</text>
47+
48+ <text x="100" y="366" fill="#c8c8c8">Layer-3 wrapper</text>
49+ <text x="480" y="366" fill="#8a8a8a">11 lines (Response + 301)</text>
50+ <text x="780" y="366" fill="#c8c8c8">11 lines (copy-paste)</text>
51+ <text x="1080" y="366" fill="#7ec77e">✓ identical</text>
52+
53+ <text x="100" y="396" fill="#c8c8c8">sed-pass over-rewrites</text>
54+ <text x="480" y="396" fill="#8a8a8a">0</text>
55+ <text x="780" y="396" fill="#c8c8c8">1 (caught + fixed in 30s)</text>
56+ <text x="1080" y="396" fill="#c89a3a">~ pattern risk</text>
57+
58+ <text x="100" y="426" fill="#c8c8c8">verifier verdict before</text>
59+ <text x="480" y="426" fill="#8a8a8a">7/7 ✓</text>
60+ <text x="780" y="426" fill="#c8c8c8">7/7 ✓</text>
61+ <text x="1080" y="426" fill="#7ec77e">stable</text>
62+
63+ <text x="100" y="456" fill="#c8c8c8">verifier verdict after</text>
64+ <text x="480" y="456" fill="#8a8a8a">7/7 ✓</text>
65+ <text x="780" y="456" fill="#c8c8c8">7/7 ✓</text>
66+ <text x="1080" y="456" fill="#7ec77e">stable</text>
67+
68+ <text x="100" y="486" fill="#c8c8c8">test count delta</text>
69+ <text x="480" y="486" fill="#8a8a8a">+9 (379 → 388)</text>
70+ <text x="780" y="486" fill="#c8c8c8">+12 (407 → 419)</text>
71+ <text x="1080" y="486" fill="#7ec77e">net positive</text>
72+ </g>
73+
74+ <!-- Bottom callout -->
75+ <rect x="80" y="520" width="1040" height="140" fill="#101a10" stroke="#1f3f1f" stroke-width="1" rx="6"/>
76+ <g font-family="ui-monospace, 'SF Mono', 'JetBrains Mono', Menlo, Consolas, monospace">
77+ <text x="100" y="552" font-size="16" font-weight="600" fill="#7ec77e">Cost-flattening hypothesis: CONFIRMED — much more dramatically than predicted.</text>
78+ <text x="100" y="578" font-size="14" fill="#c8c8c8">Predicted: 1 hour. Actual: 8m 8s. Ratio: 7.4× faster. The reusable shape (b32_&lt;old&gt;_url_redirect + sibling test +</text>
79+ <text x="100" y="602" font-size="14" fill="#c8c8c8">Layer-3 wrapper) collapses the URL-refactor wall-clock from "evening's work" to "coffee break". Same number of</text>
80+ <text x="100" y="626" font-size="14" fill="#c8c8c8">structural pieces; the cost of CONNECTING them was the bulk of the original effort, and that's now zero.</text>
81+ <text x="100" y="648" font-size="13" fill="#8a8a8a">One small pattern-risk surfaced: sed pattern matched inside filesystem paths (content/sama/sorted.md), caught by a unit test.</text>
82+ </g>
83+
84+ <!-- Watermark -->
85+ <text x="1120" y="684" text-anchor="end" font-family="ui-monospace, 'SF Mono', 'JetBrains Mono', Menlo, Consolas, monospace" font-size="12" fill="#5a5a5a">https://tdd.md</text>
86+</svg>
modified src/a31_blog.ts +6 −0
@@ -12,6 +12,12 @@ export interface BlogEntry {
1212 }
1313
1414 export const ALL_POSTS: BlogEntry[] = [
15+ {
16+ slug: "sama-v2-second-url-refactor-postmortem",
17+ title: "8 minutes 8 seconds — the cost-flattening hypothesis is confirmed",
18+ description: "The git-url-refactor postmortem closed with a single falsifiable claim: 'if the second URL refactor lands in ~1 hour, cost-flattening of pattern-as-redirect is confirmed; if it takes another evening, the pattern wasn't as portable as it looked. Either is informative.' The second URL refactor happened today (moving /sama/<discipline> → /sama/discipline/<slug>). It landed in 8 minutes 8 seconds — 7.4× faster than predicted. Cost-flattening confirmed, much more dramatically than the hypothesis dared. Timeline breakdown commit-by-commit shows where each minute went: 1m45s on the helper + test (copy of git template); 30s on the fallback handler; 1m10s on the sed pass; 30s recovering from one sed over-rewrite (regex matched inside filesystem paths content/sama/sorted.md, caught by an existing edit-resolve test); 2m20s on the gh PR flow + deploy. What the prediction got right: the 13-line helper, the 11-line Layer-3 wrapper, the sibling-test structure — all landed identically. What it missed: pattern risk (sed scope on shorter URL prefixes is more collision-prone; lesson added to anti-fudge for next time). Two datapoints isn't a trend, but the mechanism is mundane: the first refactor designs the shape; the second imports the template, changes the slug, pastes. Predicts that future fixed-enum URL refactors land in ~5-10 min (deploy-time dominated), while data-driven refactors like /blog/<slug> → /blog/<yyyy-mm>/<slug> probably take ~20-30 min because the helper has to grow. The empirical chain ratchets: §5 has been about cross-repo workingSetFit deltas; this post lands wall-clock time as the same kind of measurement. 'Compliance proves the rules were followed; delta proves they were worth following' — that argument now applies to wall-clock time, not just file-fit ratios.",
19+ date: "2026-05-25",
20+ },
1521 {
1622 slug: "sama-v2-on-ramp-gap",
1723 title: "Every artifact has a URL. The on-ramp doesn't.",
modified src/a31_goals.ts +10 −0
@@ -38,6 +38,16 @@ export interface GoalEntry {
3838 }
3939
4040 export const ALL_GOALS: GoalEntry[] = [
41+ {
42+ slug: "sama-discipline-prefix",
43+ title: "Move /sama/<discipline> → /sama/discipline/<slug> — hypothesis test",
44+ date: "2026-05-25",
45+ branch: "sama-discipline-prefix",
46+ prNumber: 53,
47+ mergeSha: "f806580",
48+ status: "shipped",
49+ relatedPosts: ["sama-v2-git-url-refactor-postmortem"],
50+ },
4151 {
4252 slug: "contributing-md",
4353 title: "Build CONTRIBUTING.md as the canonical on-ramp + drift-detection test",