Shortening /GIT/ URLs: a single-tenant URL has a redundant segment

Every link on this site that points at the source code passes through /GIT/:owner/:repo/.... The owner segment is always syntaxai. The repo segment is always tdd.md. The handler validates both, 404s anything else, and never reads them again. The user-visible URL is doing structural work for a multi-tenant case that doesn't exist.

Concrete example — the verifier source link:

before:  https://tdd.md/GIT/syntaxai/tdd.md/blob/main/src/b32_sama_v2_verify.ts
after:   https://tdd.md/GIT/tdd.md/blob/main/src/b32_sama_v2_verify.ts

Nine characters shorter. The change is small, but it runs through the same workflow this site is built around: the /goal slash command as contract, SAMA v2 as discipline, the verifier as anti-fudge gate. This post is the plan, written before the /goal fires.

[Figure: URL anatomy. The owner segment is policy overhead, not data.]

Why dropping the owner is safe

The relevant code sits twenty-one lines into src/d21_handlers_repo_browse.ts:

const isAllowedRepo = (owner: string, repo: string): boolean =>
  owner === LIVE_REPO_OWNER &&        // "syntaxai"
  repo === LIVE_REPO_NAME &&          // "tdd.md"
  SAFE_OWNER_REPO.test(owner) &&
  SAFE_OWNER_REPO.test(repo);

The check is structural — there is exactly one allowed pair, and any deviation produces a 404. So the owner segment carries no information the handler couldn't supply itself. It's a position in the URL that exists only to make the URL look like a GitHub URL — which, given that the data is not on GitHub, is a costume rather than a contract.

With the owner segment gone, the signature collapses to isAllowedRepo(repo). LIVE_REPO_OWNER stays in src/a31_site_config.ts: it is still the truthful owner for the backing git operations, the Forgejo proxy, and any future feature that needs to talk about provenance. It just stops showing up in user-facing URLs.
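A minimal sketch of the one-argument form, for illustration only. The constant names come from the snippet above; the SAFE_OWNER_REPO pattern is an assumption, since the post does not show its definition.

```typescript
// Assumed values: the post shows "tdd.md" only as a comment, and the
// SAFE_OWNER_REPO pattern below is a guess at a conservative allow-list.
const LIVE_REPO_NAME = "tdd.md";
const SAFE_OWNER_REPO = /^[A-Za-z0-9._-]+$/;

// One-argument form: the repo segment is the only value read from the URL.
const isAllowedRepo = (repo: string): boolean =>
  repo === LIVE_REPO_NAME && SAFE_OWNER_REPO.test(repo);
```

Anything other than the single live repo name still falls through to a 404, so the behaviour for valid URLs should be unchanged.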

The interesting design decision — one regex, not 49 redirects

A grep across the repo finds 49 references to the old URL form across 10 source files and 7 content files — link builders, hard-coded markdown in /sama/v2/verify, blog posts that point at specific files for their empirical claims, the verifier page itself.

Naive approach: hand-maintain a list of 49 old-URL → new-URL mappings as a redirect table. Cost: rewrites work today, but the list rots the next time someone adds a new file or blog post (50 grows to 60 grows to 100). Anti-pattern.

The right shape is one regex in the fallback handler that matches the pattern of the old URL and rewrites to the new one:

[Figure: Shipping the URL change. Old URL → regex matcher → 301 → new URL.]

const oldGitUrl = url.pathname.match(
  /^\/GIT\/syntaxai\/tdd\.md\/(.+)$/,
);
if (oldGitUrl) {
  return new Response(null, {
    status: 301,
    headers: {
      Location: `/GIT/tdd.md/${oldGitUrl[1]}`,
      "Cache-Control": "public, max-age=86400",
    },
  });
}

Twelve lines. Covers all 49 known references and every future URL with the same shape. Cost: one commit. Lifetime maintenance: zero.

The 301 (permanent redirect) is the load-bearing detail — search engines treat 301 as "update your index"; they treat 302 as "this is temporary, keep the old URL." We want the index to converge on the new URL, so 301 it is.
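The matching step can be pulled out as a pure function, which makes it easy to pin in a unit test. This is a sketch, not the site's code: the name legacyGitRedirectTarget is hypothetical, and carrying the query string across the redirect is an addition that the snippet above does not make.

```typescript
// Matches the old owner/repo form and captures everything after the repo segment.
const OLD_GIT_URL = /^\/GIT\/syntaxai\/tdd\.md\/(.+)$/;

// Returns the Location value for the 301, or null when the path is not the old form.
// `search` is the raw query string including "?" (or "" when there is none).
// Preserving it is an assumption; the original snippet redirects the path only.
function legacyGitRedirectTarget(pathname: string, search = ""): string | null {
  const match = pathname.match(OLD_GIT_URL);
  if (!match) return null;
  return `/GIT/tdd.md/${match[1] ?? ""}${search}`;
}
```

Because the pattern requires the /GIT/ prefix, git-protocol paths such as /syntaxai/tdd.md.git return null and never reach the redirect.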

How this maps onto SAMA v2

The refactor touches files across three layers, all in expected ways:

| Layer | What changes |
| --- | --- |
| Layer 0 · Pure (a31_site_config.ts) | LIVE_REPO_OWNER stays exported: still the truthful owner constant, just no longer used to build URLs |
| Layer 1 · Core | No changes: there are no Layer-1 helpers in the /GIT/ flow; the URL surface is pure routing |
| Layer 2 · Adapter (c14_git.ts) | No changes: lsTree and readBlobAtRef already take (ref, path), never owner/repo |
| Layer 3 · Entry | All the changes live here: parseRepoBrowsePath callers, repoBrowseHandler signature, the Bun explicit route /GIT/:repo/commit/:sha, the new 301 redirect, and the link builders in b51_render_*.ts |

The layer surface tells you the refactor is contained: no Adapter changes, no business-logic changes, no test-of-pure-helper changes. Only the routing/rendering surface moves. That's the "small refactor" smell identified in the "Layer 2 stays empty" sitemap post: when the change is genuinely about the URL surface, the deeper layers don't need to move.
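To make the Layer 3 change concrete, here is a hedged sketch of a path parser with the owner segment dropped. The real parseRepoBrowsePath is not shown in this post, so the function name, the return shape, and the blob/tree/raw view names are all assumptions.

```typescript
// Hypothetical return shape; the real parser's fields are not shown in the post.
interface RepoBrowsePath {
  repo: string;
  view: "blob" | "tree" | "raw";
  ref: string;
  path: string;
}

// /GIT/:repo/:view/:ref[/path]. Simplification: the ref is a single segment,
// so branch names containing "/" are not handled here.
const BROWSE_PATH = /^\/GIT\/([^/]+)\/(blob|tree|raw)\/([^/]+)(?:\/(.*))?$/;

function parseRepoBrowsePathSketch(pathname: string): RepoBrowsePath | null {
  const m = BROWSE_PATH.exec(pathname);
  if (!m) return null;
  return {
    repo: m[1] ?? "",
    view: (m[2] ?? "blob") as RepoBrowsePath["view"],
    ref: m[3] ?? "",
    path: m[4] ?? "", // empty when the URL points at the ref root
  };
}
```

Under this shape the old /GIT/syntaxai/tdd.md/... form fails to parse, because the second segment is no longer a view name, so it would fall through to the redirect instead of being served twice.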

Anti-fudge — what the /goal rules out

The plan deliberately doesn't do these things, even though each is locally appealing:

  • No hand-maintained list of redirects. One regex pattern covers all 49 current references and every future one. If the regex grows into "a list", the anti-fudge clause has been violated.
  • No removal of LIVE_REPO_OWNER. The constant has callers beyond URL construction (the live-reports view, the Forgejo proxy hostname). Removing it from a31_site_config.ts would be a different, larger refactor that the URL change shouldn't drag in.
  • No touching of git-protocol URLs. /syntaxai/tdd.md.git and the bare-repo view at /syntaxai/tdd.md go through isGitProtocol + repoMatch in d21_handlers_fallback.ts. Those URLs are git-client-facing — agents and humans have copy-pasted them into clone commands, into CI configs, into other agents' system prompts. Changing them risks breakage for cosmetics. They stay.
  • No alias. Both URL forms working forever creates two canonical URLs and lets the old one quietly remain in new code. The 301 is what forces consolidation — search engines update, internal code paths rewrite themselves, and a year from now the old form is just a redirect line in one file.
  • No verifier change. /sama/v2/verify stays at 7/7 ✓ across the merge. The §4 check logic is frozen; if a structural choice the refactor wants to make would fail the verifier, the choice changes — not the verifier.

The work, sized

Three categories of file change:

  • Wiring (4 files): the fallback handler gets the new redirect + the parse regex drops owner; the explicit Bun commit route in d21_app.ts becomes /GIT/:repo/commit/:sha; repoBrowseHandler and commitViewHandler lose the owner argument; isAllowedRepo collapses to one argument.
  • Link builders (3 files): b51_render_repo.ts (eight call sites — breadcrumbs, parent-dir, raw/source links), b51_render_commit.ts (two call sites), b51_render_edit.ts (one hard-coded URL).
  • Hard-coded markdown (7 files): content/home.md, content/sama/v2.md, four blog posts that point at specific source files for their empirical claims, src/d21_handlers_sama.ts:137 (markdown embedded in the verifier page body). One sed pass, all done.
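The sed pass can be expressed as a small text rewrite. The sketch below uses a hypothetical helper name, rewriteGitLinks, and anchors the pattern on the /GIT/ prefix so that the git-protocol URLs the plan leaves alone are not touched.

```typescript
// Rewrites only /GIT/ browse links. The "/GIT/" prefix keeps git-protocol
// URLs such as /syntaxai/tdd.md.git out of the match.
function rewriteGitLinks(text: string): string {
  return text.replace(/\/GIT\/syntaxai\/tdd\.md\//g, "/GIT/tdd.md/");
}
```

The rewrite is idempotent: running it over already-converted content changes nothing, so it is safe to apply to every file in the list.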

The test files (b51_render_repo.test.ts, b51_render_commit.test.ts) pin the rendered URL strings, so those expectations update mechanically with the link-builder changes. The test count stays at 379 or higher.
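As an illustration of what a pinned URL string looks like after the change, here is a sketch of a link builder. The name gitUrl and the set of view names are hypothetical; the real builders in b51_render_*.ts are not shown in this post.

```typescript
// Hypothetical view names, taken from the URL shapes used elsewhere in this post.
type GitView = "blob" | "tree" | "raw";

const LIVE_REPO_NAME = "tdd.md";

// Builds a site-relative /GIT/ URL without an owner segment.
function gitUrl(view: GitView, ref: string, path = ""): string {
  const base = `/GIT/${LIVE_REPO_NAME}/${view}/${ref}`;
  return path ? `${base}/${path}` : base;
}
```

A test that pins gitUrl("blob", "main", "src/b32_sama_v2_verify.ts") to /GIT/tdd.md/blob/main/src/b32_sama_v2_verify.ts would fail loudly if the owner segment ever crept back in.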

Live-verify clauses

What the /goal requires to verify after deploy, not just in CI:

$ curl -I https://tdd.md/GIT/syntaxai/tdd.md/blob/main/src/b32_sama_v2_verify.ts
HTTP/2 301
location: /GIT/tdd.md/blob/main/src/b32_sama_v2_verify.ts

$ curl -siL https://tdd.md/GIT/syntaxai/tdd.md/blob/main/src/b32_sama_v2_verify.ts
HTTP/2 301
location: /GIT/tdd.md/blob/main/src/b32_sama_v2_verify.ts

HTTP/2 200
< file content >

$ curl -si https://tdd.md/GIT/tdd.md/tree/main
HTTP/2 200
< directory listing HTML >

$ curl -s https://tdd.md/sama/v2/verify | grep -o '7/7'
7/7

Plus the silent live check: every blog post on the site has its /GIT/ links rewritten, so clicking any "view source" link in any of the empirical-chain posts lands on a working URL — no broken navigation surfaced after the merge.
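The first clause could also be checked programmatically. The sketch below is one way to phrase it as a pure function over an observed status and Location header; the ObservedRedirect shape and the redirectProblems name are hypothetical, and how the response is obtained is left to the caller.

```typescript
// Hypothetical shape: just the two fields the live check cares about.
interface ObservedRedirect {
  status: number;
  location: string | null;
}

// Returns human-readable problems; an empty list means the clause passes.
function redirectProblems(observed: ObservedRedirect, expectedLocation: string): string[] {
  const problems: string[] = [];
  if (observed.status !== 301) {
    problems.push(`expected status 301, got ${observed.status}`);
  }
  if (observed.location !== expectedLocation) {
    problems.push(`expected location ${expectedLocation}, got ${observed.location ?? "none"}`);
  }
  return problems;
}
```

In a server-side runtime the input could come from a fetch call made with redirect set to "manual", so that the 301 itself is observed rather than the page it points to.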

What lands when this ships

After deploy:

  • Every /GIT/ URL on the site uses the new shape.
  • The verifier source — the URL search engines and AI crawlers should index as "the artifact this site's argument rests on" — gets shorter and more readable.
  • Old URLs already indexed by Google, cached by Twitter card scrapers, sitting in someone else's blog post, or pasted into someone's notes file all permanently redirect to the new form. The index reconverges within a search-engine refresh cycle.
  • /sama/v2/verify continues to report 7/7 ✓.
  • One new pattern — the regex-as-redirect — surfaces a reusable shape for future URL refactors. If the site renames /sama/v2/example-crud to /sama/v2/examples/crud next month, the same shape applies.

Companion postmortem

This is the plan. The postmortem will follow after the merge with:

  • The actual file diff (likely tight: most line changes are mechanical s#/GIT/syntaxai/tdd\.md/#/GIT/tdd.md/#g substitutions, anchored on /GIT/ so the git-protocol URLs the plan leaves alone are not rewritten).
  • Whether the regex caught everything (especially in places grep missed — embedded HTML strings, multi-line URLs, etc.).
  • The /sama/v2/verify output before and after the merge.
  • Anything the anti-fudge clauses caught that the plan missed.

If the refactor lands cleanly with the regex absorbing all 49 references — that's the data point: pattern-as-redirect is a reusable shape, and the next URL refactor needs ten lines plus a sed pass.