sama-v2-portability-boundary-found.md
raw
· source
21 minutes 23 seconds — the portability boundary is empirically located
The second-URL-refactor postmortem closed with two falsifiable subclaims. The third URL refactor on this site happened today and measured the second one:
"A data-shaped refactor stays under 30 minutes. Pick /blog/<slug> → /blog/<yyyy-mm>/<slug>. Prediction: ≤ 30 min, with most of the time in the sed-pass design (not the helper or wrapper)."
Wall-clock from git checkout -b blog-date-prefix to deploy success: 21 minutes 23 seconds. Within the predicted band, on the faster side.
Three datapoints now. The pattern stays portable; the cost stays bounded; the floor differs by kind:

The three datapoints
| PR #42 (first) | PR #53 (fixed-enum) | PR #55 (data-driven) | |
|---|---|---|---|
| Wall-clock | an evening (~3h) | 8m 8s | 21m 23s |
| Helper imports data | none | none | ALL_POSTS |
| Helper LOC | 13 | 13 | 24 |
| Migration mechanism | sed (one-liner) | sed (one-liner) | Bun script (~50 LOC) |
| References rewritten | 49 | ~22 | 173 |
| Files touched | 19 | 17 | 46 |
| Over-rewrites | 0 | 1 (caught + reverted) | 1 wider (5+ files) |
| Verifier verdict | 7/7 ✓ | 7/7 ✓ | 7/7 ✓ |
Three observations.
What the prediction got right
The helper itself stayed mechanical. PR #53's 13-line rewriteOldSamaDisciplineUrl was a near-byte-for-byte copy of PR #42's rewriteOldGitUrl. PR #55's rewriteOldBlogUrl is 24 lines — 13 of original shape (regex match, null return on miss) plus 11 lines of ALL_POSTS.find + date.slice(0, 7) + new-URL construction. The added 11 lines are all the data-driven part. The pattern itself didn't grow; just the lookup pipe at the front of it.
The Layer-3 wrapper stayed identical across all three. The same 11-line block — if (newPath !== null) return new Response(null, { status: 301, headers: { Location: newPath, ... } }) — appears three times in d21_handlers_fallback.ts now, adjacent to each other. Pure copy-paste with one identifier change. Code review would catch any drift between them mechanically.
The verifier stayed 7/7 ✓ across all three merges. Anti-fudge gate held. None of the three refactors required §4 verifier changes.
What surprised — the migration cost grew
The data-driven refactor isn't just the helper has a lookup. The migration tooling had to grow too:
- One sed line → 50-line Bun script. The git-url and sama-discipline migrations were
sed -i -E 's|/old/.*|/new/...|g'— one shell line. The blog migration needed per-slug substitution because each new URL depends on the post's date. That meant: iterate ALL_POSTS, build a[old, new]table, sort by slug length DESC to avoid prefix collisions, iterate every file, do N replaceAll passes per file. Still mechanical, but the tooling is now ~50 LOC instead of 1. - One over-rewrite → wider over-rewrite. Same pattern-risk as PR #53 (sed matching inside filesystem paths), but here the script touched
content/blog/<slug>.mdstrings — 5+ files affected instead of 1. Reverted with a counter-sed that restoredcontent/blog/YYYY-MM/<slug>.mdtocontent/blog/<slug>.md. Total recovery: ~2 minutes, but it's a real pattern-risk that gets bigger as the refactor surface grows. - Spoof guard. The fixed-enum refactors didn't need one: the regex enumerated exactly the valid slugs, so URL forgery was impossible by construction. The data-driven handler reads
:yyyymmfrom the URL — a forged/blog/9999-99/<valid-slug>would render the post if the handler didn't validate. Added a one-lineentry.date.slice(0, 7) !== yyyymm → 404guard. Trivial, but a new class of concern. - Three test failures after migration. The migration substituted URL strings inside test assertions AND inside filesystem-path strings (the same sed-bug, wider). Three tests broke; all three were mechanical to fix (revert the over-rewrites, then update remaining stale URLs in CONTRIBUTING.md which the script missed because it's at root, not under
content/).
The 21 minutes split roughly: ~10 min on the new code (helper, sibling test, handler with spoof guard, Bun route, sitemap, blog index, edit-resolve update) + ~6 min on the migration script + run + ~3 min recovering from over-rewrites + missed-file + ~2 min on PR + deploy. The new-code time was actually the same as PR #53; the difference is everything around it.
Where the boundary sits
Pre-PR #53, the open question was: does the b32_<old>_url_redirect pattern only work for fixed-enum refactors?
Answer after PR #55: no, the pattern works for data-driven too — the helper just imports the registry that owns the new URL's variable component. Layer-1 importing Layer-0 is fine; SAMA explicitly allows it. The helper stays pure, the Response wrapper stays identical, the structural shape is preserved.
What grows with data-driven is the migration, not the design. And the migration is one-shot tooling that doesn't ship with the runtime — scripts/migrate-blog-urls.ts is in the repo but never imported by the application. The runtime cost is identical; only the development-time cost of one migration changed.
So the empirical boundary is:
- Fixed-enum URL refactors: ~8 min wall-clock, dominated by deploy time.
- Data-driven URL refactors: ~21 min wall-clock, with the extra time going entirely to (a) writing the migration script, (b) recovering from sed-pattern collisions when the rewrite surface is larger, (c) adding the spoof guard.
Three datapoints isn't a law of nature. But the shape of the cost difference is mechanical: more files to migrate = more risk of over-rewrite collisions = more recovery time. Future data-driven refactors should land in the same band as long as the migration scope stays under ~200 references.
What the empirical chain has now
After the first postmortem (PR #42), the claim was "this pattern is reusable in principle".
After the second postmortem (PR #53), the claim was "this pattern is reusable in practice for fixed-enum URLs, measured 7.4× faster than the first instance".
After this post (PR #55), the claim is "this pattern is reusable for data-driven URLs too; the cost grows with migration surface but stays bounded; three datapoints align with the predictive model from the second postmortem".
Every claim above is on git: helper code, sibling tests, /goal contracts, plan-vs-actual postmortems, wall-clock measurements. Compliance proves the rules were followed; delta proves they were worth following — and now, three deltas confirm the rule about URL refactor cost is itself stable across pattern variants.
What's next on this chain
Two falsifiable subclaims that the next iteration could test:
A fourth refactor of the same shape lands in the same band. Pick another fixed-enum (e.g.
/guides/<slug>→/guides/agent/<slug>if guides ever subdivide). Prediction: ≤ 10 min, since the pattern is now mature and the deploy-time floor dominates.A data-driven refactor with a much larger surface still stays bounded. Construct a scenario with ~500 references (e.g. if the blog grows to 100+ posts, doing another date-restructuring would trigger). Prediction: ≤ 45 min, because the marginal cost is in over-rewrite recovery, not in helper design.
Either result informative. Until then: three URL refactors, three postmortems, three wall-clock measurements, one stable pattern. The chain ratchets again.