e4a064e07d2f73d7bebe6183ceeb49581ffc4f05
diff --git a/content/blog/sama-v2-portability-boundary-found.md b/content/blog/sama-v2-portability-boundary-found.md
new file mode 100644
index 0000000000000000000000000000000000000000..7752ba7b4e01fbfd0d274b93f13f6903f821ef98
--- /dev/null
+++ b/content/blog/sama-v2-portability-boundary-found.md
@@ -0,0 +1,79 @@
+# 21 minutes 23 seconds — the portability boundary is empirically located
+
+The [second-URL-refactor postmortem](/blog/2026-05/sama-v2-second-url-refactor-postmortem) closed with two falsifiable subclaims. The third URL refactor on this site happened today and measured the second one:
+
+> *"A data-shaped refactor stays under 30 minutes. Pick /blog/<slug> → /blog/<yyyy-mm>/<slug>. Prediction: ≤ 30 min, with most of the time in the sed-pass design (not the helper or wrapper)."*
+
+Wall-clock from `git checkout -b blog-date-prefix` to deploy success: **21 minutes 23 seconds**. Within the predicted band, on the faster side.
+
+Three datapoints now. The pattern stays portable; the cost stays bounded; the floor differs by kind:
+
+![Portability boundary located — 21m 23s vs ≤30 min predicted](/images/portability-boundary-found.png?v=1)
+
+## The three datapoints
+
+| | PR #42 (first) | PR #53 (fixed-enum) | PR #55 (data-driven) |
+|---|---|---|---|
+| Wall-clock | an evening (~3h) | **8m 8s** | **21m 23s** |
+| Helper imports data | none | none | `ALL_POSTS` |
+| Helper LOC | 13 | 13 | 24 |
+| Migration mechanism | sed (one-liner) | sed (one-liner) | Bun script (~50 LOC) |
+| References rewritten | 49 | ~22 | 173 |
+| Files touched | 19 | 17 | 46 |
+| Over-rewrites | 0 | 1 (caught + reverted) | 1 wider (5+ files) |
+| Verifier verdict | 7/7 ✓ | 7/7 ✓ | 7/7 ✓ |
+
+Three observations.
+
+## What the prediction got right
+
+**The helper itself stayed mechanical.** PR #53's 13-line `rewriteOldSamaDisciplineUrl` was a near-byte-for-byte copy of PR #42's `rewriteOldGitUrl`.
PR #55's `rewriteOldBlogUrl` is 24 lines — 13 of original shape (regex match, null return on miss) plus 11 lines of `ALL_POSTS.find` + `date.slice(0, 7)` + new-URL construction. The added 11 lines are *all* the data-driven part. The pattern itself didn't grow; just the lookup pipe at the front of it.
+
+**The Layer-3 wrapper stayed identical across all three.** The same 11-line block — `if (newPath !== null) return new Response(null, { status: 301, headers: { Location: newPath, ... } })` — appears three times in `d21_handlers_fallback.ts` now, adjacent to each other. Pure copy-paste with one identifier change. Code review would catch any drift between them mechanically.
+
+**The verifier stayed 7/7 ✓ across all three merges.** Anti-fudge gate held. None of the three refactors required §4 verifier changes.
+
+## What surprised — the migration cost grew
+
+The data-driven refactor isn't just *the helper has a lookup*. The migration tooling had to grow too:
+
+- **One sed line → 50-line Bun script.** The git-url and sama-discipline migrations were `sed -i -E 's|/old/.*|/new/...|g'` — one shell line. The blog migration needed per-slug substitution because each new URL depends on the post's date. That meant: iterate ALL_POSTS, build a `[old, new]` table, sort by slug length DESC to avoid prefix collisions, iterate every file, do N replaceAll passes per file. Still mechanical, but the tooling is now ~50 LOC instead of 1.
+- **One over-rewrite → wider over-rewrite.** Same pattern-risk as PR #53 (sed matching inside filesystem paths), but here the script touched `content/blog/.md` strings — 5+ files affected instead of 1. Reverted with a counter-sed that restored `content/blog/YYYY-MM/.md` to `content/blog/.md`. Total recovery: ~2 minutes, but it's a real pattern-risk that gets bigger as the refactor surface grows.
+- **Spoof guard.** The fixed-enum refactors didn't need one: the regex enumerated exactly the valid slugs, so URL forgery was impossible by construction.
The data-driven handler reads `:yyyymm` from the URL — a forged `/blog/9999-99/` would render the post if the handler didn't validate. Added a one-line `entry.date.slice(0, 7) !== yyyymm → 404` guard. Trivial, but a new class of concern.
+- **Three test failures after migration.** The migration substituted URL strings inside test assertions AND inside filesystem-path strings (the same sed-bug, wider). Three tests broke; all three were mechanical to fix (revert the over-rewrites, then update remaining stale URLs in CONTRIBUTING.md which the script missed because it's at root, not under `content/`).
+
+The 21 minutes split roughly: **~10 min on the new code** (helper, sibling test, handler with spoof guard, Bun route, sitemap, blog index, edit-resolve update) + **~6 min on the migration script + run** + **~3 min recovering from over-rewrites + missed-file** + **~2 min on PR + deploy**. The new-code time was actually the same as PR #53; the difference is *everything around it*.
+
+## Where the boundary sits
+
+Pre-PR #53, the open question was: *does the b32_<old>_url_redirect pattern only work for fixed-enum refactors?*
+
+Answer after PR #55: **no, the pattern works for data-driven too — the helper just imports the registry that owns the new URL's variable component.** Layer-1 importing Layer-0 is fine; SAMA explicitly allows it. The helper stays pure, the Response wrapper stays identical, the structural shape is preserved.
+
+What grows with data-driven is *the migration*, not *the design*. And the migration is one-shot tooling that doesn't ship with the runtime — `scripts/migrate-blog-urls.ts` is in the repo but never imported by the application. The runtime cost is identical; only the development-time cost of *one* migration changed.
+
+So the empirical boundary is:
+- **Fixed-enum URL refactors: ~8 min wall-clock**, dominated by deploy time.
+- **Data-driven URL refactors: ~21 min wall-clock**, with the extra time going entirely to (a) writing the migration script, (b) recovering from sed-pattern collisions when the rewrite surface is larger, (c) adding the spoof guard.
+
+Three datapoints isn't a law of nature. But the *shape* of the cost difference is mechanical: more files to migrate = more risk of over-rewrite collisions = more recovery time. Future data-driven refactors should land in the same band as long as the migration scope stays under ~200 references.
+
+## What the empirical chain has now
+
+After the [first postmortem](/blog/2026-05/sama-v2-git-url-refactor-postmortem) (PR #42), the claim was *"this pattern is reusable in principle"*.
+
+After the [second postmortem](/blog/2026-05/sama-v2-second-url-refactor-postmortem) (PR #53), the claim was *"this pattern is reusable in practice for fixed-enum URLs, measured 7.4× faster than the first instance"*.
+
+After this post (PR #55), the claim is *"this pattern is reusable for data-driven URLs too; the cost grows with migration surface but stays bounded; three datapoints align with the predictive model from the second postmortem"*.
+
+Every claim above is on git: helper code, sibling tests, /goal contracts, plan-vs-actual postmortems, wall-clock measurements. *Compliance proves the rules were followed; delta proves they were worth following* — and now, three deltas confirm the rule about *URL refactor cost* is itself stable across pattern variants.
+
+## What's next on this chain
+
+Two falsifiable subclaims that the next iteration could test:
+
+1. **A fourth refactor of the same shape lands in the same band.** Pick another fixed-enum (e.g. `/guides/<slug>` → `/guides/agent/<slug>` if guides ever subdivide). Prediction: ≤ 10 min, since the pattern is now mature and the deploy-time floor dominates.
+
+2. **A data-driven refactor with a much larger surface still stays bounded.** Construct a scenario with ~500 references (e.g.
if the blog grows to 100+ posts, doing another date-restructuring would trigger). Prediction: ≤ 45 min, because the marginal cost is in over-rewrite recovery, not in helper design.
+
+Either result informative. Until then: three URL refactors, three postmortems, three wall-clock measurements, one stable pattern. The chain ratchets again.
diff --git a/goals/blog-date-prefix.md b/goals/blog-date-prefix.md
index c2e32f1bc58cba7a367462d34078cb594d9cdb91..10198a26451b50935fe46892632292b4b1597479 100644
--- a/goals/blog-date-prefix.md
+++ b/goals/blog-date-prefix.md
@@ -3,9 +3,9 @@ slug: blog-date-prefix
 title: Move /blog/ → /blog// — data-driven refactor portability test
 date: 2026-05-25
 branch: blog-date-prefix
-pr_number: null
-merge_sha: null
-status: pending
+pr_number: 55
+merge_sha: 72919e8
+status: shipped
 related_posts: [sama-v2-second-url-refactor-postmortem]
 ---
 
diff --git a/public/images/portability-boundary-found.png b/public/images/portability-boundary-found.png
new file mode 100644
index 0000000000000000000000000000000000000000..432c6d0895b40fb4cf5c78fd3e14737ff2079a8b
Binary files /dev/null and b/public/images/portability-boundary-found.png differ
diff --git a/public/images/portability-boundary-found.svg b/public/images/portability-boundary-found.svg
new file mode 100644
index 0000000000000000000000000000000000000000..d4f54897367750af6191e4a1ab161905d7564104
--- /dev/null
+++ b/public/images/portability-boundary-found.svg
@@ -0,0 +1,90 @@
+
+
+
+
+
+ Portability-boundary test — data-driven URL refactor
+ Predicted 20–30 min. Landed in 21m 23s.
+ Pattern portable across DATA-DRIVEN refactors. Cost-flattening holds, but the floor is higher than for fixed-enum.
+
+
+
+
+ DIMENSION
+ PR #42 (FIRST)
+ PR #53 (FIXED-ENUM)
+ PR #55 (DATA-DRIVEN)
+
+
+
+
+
+ wall-clock
+ an evening
+ 8m 8s
+ 21m 23s
+
+ helper imports data
+ none (fixed regex)
+ none (fixed regex)
+ ALL_POSTS
+
+ helper LOC
+ 13
+ 13
+ 24 (lookup + slice)
+
+ sibling-test cases
+ 9
+ 12
+ 7
+
+ migration mechanism
+ sed (one-liner)
+ sed (one-liner)
+ Bun script (~50 LOC)
+
+ references rewritten
+ 49
+ ~22
+ 173
+
+ files touched (migration)
+ 19
+ 17
+ 46
+
+ handler complication
+ drop owner from sig
+ rename route segment
+ yyyymm spoof-guard
+
+ over-rewrite incidents
+ 0
+ 1 (caught + reverted)
+ 1 wider (5+ files)
+
+ test count delta
+ +9 (379 → 388)
+ +12 (407 → 419)
+ +7 (419 → 426)
+
+ verifier verdict
+ 7/7 ✓
+ 7/7 ✓
+ 7/7 ✓
+
+
+
+
+
+ Boundary empirically located between PR #53 and PR #55.
+ FIXED-ENUM refactors land in ~8 min (template copy + slug rename + paste).
+ DATA-DRIVEN refactors land in ~21 min (template copy + add ALL_POSTS lookup + spoof guard + migration script + over-rewrite recovery).
+ Both within predicted bands. Both confirm cost-flattening holds across pattern variants. The added 13 minutes is not the
+ helper (still mechanical) — it's the migration tooling growing to handle 173 references instead of ~22, and one wider sed-collision.
+
+
+
+ https://tdd.md
+
diff --git a/src/a31_blog.ts b/src/a31_blog.ts
index 5d1c30f4c6762564a447ca17d7faf5545b5087d5..0a69cff0b55dd2e6dd3928c06d9f77f298509af5 100644
--- a/src/a31_blog.ts
+++ b/src/a31_blog.ts
@@ -12,6 +12,12 @@ export interface BlogEntry {
 }
 
 export const ALL_POSTS: BlogEntry[] = [
+ {
+ slug: "sama-v2-portability-boundary-found",
+ title: "21 minutes 23 seconds — the portability boundary is empirically located",
+ description: "Third datapoint on the cost-flattening empirical chain. PR #42 (an evening), PR #53 (8m 8s, fixed-enum), PR #55 (21m 23s, data-driven).
The second postmortem predicted ≤30 min for data-driven — this one measures it at 21m 23s, on the faster side of the predicted band. Three datapoints now align with a mechanical model: fixed-enum refactors land in ~8 min dominated by deploy time; data-driven refactors land in ~21 min because the migration tooling grows from a one-liner sed to a ~50-LOC Bun script, plus a spoof guard, plus wider over-rewrite recovery (the same pattern-risk as PR #53 but bigger surface). The helper itself stays mechanical: 13 lines for fixed-enum, 24 lines for data-driven (the added 11 being the ALL_POSTS lookup + date.slice). The Layer-3 wrapper stays byte-identical across all three. The pattern is reusable for data-driven URLs too; the cost grows with migration surface but stays bounded. Specific breakdown of the 21:23: ~10 min on new code (helper + sibling test + handler + spoof guard + Bun route + sitemap + blog index + edit-resolve), ~6 min on migration script + run, ~3 min recovering from over-rewrites + missed root-level CONTRIBUTING.md, ~2 min on PR + deploy. The 'new code' time was identical to PR #53; the difference was everything around it. The §5 chain now has three deltas confirming a model that's bounded across pattern variants. 
Two falsifiable next-tests: fourth fixed-enum refactor in the same band (≤10 min), data-driven with ~500-reference surface in ≤45 min.",
+ date: "2026-05-25",
+ },
 {
 slug: "sama-v2-second-url-refactor-postmortem",
 title: "8 minutes 8 seconds — the cost-flattening hypothesis is confirmed",
diff --git a/src/a31_goals.ts b/src/a31_goals.ts
index 1a5709c4ffe60ccb945160715b5f0319f26d8994..a0405f0db6adde579ac9c0bde1585b7074fed18e 100644
--- a/src/a31_goals.ts
+++ b/src/a31_goals.ts
@@ -38,6 +38,16 @@ export interface GoalEntry {
 }
 
 export const ALL_GOALS: GoalEntry[] = [
+ {
+ slug: "blog-date-prefix",
+ title: "Move /blog/ → /blog// — data-driven portability test",
+ date: "2026-05-25",
+ branch: "blog-date-prefix",
+ prNumber: 55,
+ mergeSha: "72919e8",
+ status: "shipped",
+ relatedPosts: ["sama-v2-second-url-refactor-postmortem"],
+ },
 {
 slug: "sama-discipline-prefix",
 title: "Move /sama/ → /sama/discipline/ — hypothesis test",