---
slug: sitemap-xml-impl
title: Add automatically-generated /sitemap.xml from existing registries
date: 2026-05-25
branch: sitemap-xml-impl
pr_number: 40
merge_sha: 3280af8
status: shipped
related_posts: [sama-v2-sitemap-implementation-plan]
---

Goal: Add an automatically-generated /sitemap.xml so search engines and AI crawlers can index the full site without a hand-maintained URL list. The sitemap is generated on demand from the existing registries (ALL_POSTS, ALL_SAMA, the route table, the guides list wherever it lives), so a new blog post or discipline page lands in the sitemap immediately on deploy with zero human edit.

Note: src/a31_blog.ts already declares in its top comment that ALL_POSTS "drives /blog, /blog/:slug, and the sitemap". This goal makes that comment true.

Done when:
- A new route /sitemap.xml returns 200 with Content-Type "application/xml; charset=utf-8" and a valid sitemaps.org 0.9 document: a `<urlset>` whose children have the form `<url><loc>https://tdd.md/...</loc>[<lastmod>YYYY-MM-DD</lastmod>]</url> ...`
- URLs are derived from the registries (no hand-maintained slug list):
  * Every entry in ALL_POSTS → `/blog/<slug>` with `<lastmod>` = the post's date field.
  * Every entry in ALL_SAMA → `/sama/<slug>`.
  * Every guide entry from whichever registry exists (search for ALL_GUIDES, GUIDES, or grep src/d21_app.ts for /guides routes).
  * Static load-bearing URLs: /, /blog, /games, /leaderboard, /sama, /sama/v2, /sama/v2/verify, /sama/v2/example-crud, /sama/v2/example-wordpress, /sama/skill, /guides. These can stay as a small const list in the new helper (each one corresponds to a literal route in d21_app.ts).
- All URLs use the absolute base https://tdd.md (use the constant from src/a31_site_config.ts if one exists).
- A new pure Layer 1 helper at src/b32_sitemap.ts takes Array<{ loc: string; lastmod?: string }> → returns the well-formed XML string. No I/O; deterministic output.
  Sibling test covers: empty list → valid urlset with no children; single URL with lastmod; single URL without lastmod; multiple URLs preserve order; XML-escape any & or < in URLs (rare here but the helper must be safe).
- The handler is a single closure registered in src/d21_app.ts (or split into d21_handlers_sitemap.ts if it grows). Imports ALL_POSTS + ALL_SAMA + the static list, calls the helper, returns the Response with Cache-Control "public, max-age=3600".
- /robots.txt updated to include "Sitemap: https://tdd.md/sitemap.xml" at the end. If it doesn't exist yet, create the minimal: "User-agent: *\nAllow: /\nSitemap: https://tdd.md/sitemap.xml".
- The sitemap is NOT committed as a static file; it's generated per-request (or once at process startup). New blog post → next sitemap fetch already includes it without any human edit. This is the load-bearing "automatic" property.
- All 367+ tests still pass; new helper test adds ~6-8 cases.
- /sama/v2/verify still reports 7/7 ✓ (anti-fudge).
- Deployed; live-verify: curl https://tdd.md/sitemap.xml returns 200 + valid XML; the response includes /blog/2026-05/sama-v2-workingset-cross-repo-baseline (the most recent post); /robots.txt references the sitemap.

Constraints (anti-fudge):
- URLs MUST come from existing registries: no second source of truth that can drift.
- XML must be well-formed (no string-concat shortcuts that break on special chars). Use a tiny XML-escape helper inside b32_sitemap.ts (the existing renderer's HTML-escape is technically a superset and would work too, but a dedicated XML helper is cleaner Layer-1).
- Don't list dynamic/user-specific URLs (/p/:slug, /sama/verify?repo=..., /api/*); only stable indexable content.
- Cache-Control: public, max-age=3600. Search engines should re-fetch but not hammer.
- Site language English-only.
- GitHub flow via flatpak-spawn (branch → PR → merge → push p620 → deploy via flatpak-spawn --host scripts/p620/deploy-tdd-md.sh).
- Do NOT change any §4 verifier logic.
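
The pure helper and its XML-escape requirement can be illustrated with a minimal sketch. This is not the shipped src/b32_sitemap.ts: the names SitemapEntry, escapeXml, and renderSitemap are assumptions made for illustration, and escaping all five XML special characters (rather than only & and <) is a conservative choice of this sketch.

```typescript
// Illustrative sketch only; names and layout are assumptions, not the real helper.
export interface SitemapEntry {
  loc: string;
  lastmod?: string; // expected as YYYY-MM-DD
}

// Escape the five XML special characters. "&" is replaced first so the
// entities produced by the later replacements are not escaped a second time.
export function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Pure function: no I/O, same input always yields the same string,
// and entries are emitted in the order they were given.
export function renderSitemap(entries: ReadonlyArray<SitemapEntry>): string {
  const urls = entries.map((entry) => {
    const lastmod =
      entry.lastmod === undefined
        ? ""
        : `<lastmod>${escapeXml(entry.lastmod)}</lastmod>`;
    return `  <url><loc>${escapeXml(entry.loc)}</loc>${lastmod}</url>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    "</urlset>",
    "",
  ].join("\n");
}
```

Because every value passes through escapeXml before interpolation, a URL containing & or < still yields a well-formed document, which is the case the sibling test is asked to cover.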

Load-bearing files to read FIRST:
- src/a31_blog.ts (the comment at the top confirms ALL_POSTS is meant to drive the sitemap)
- src/a31_sama.ts (ALL_SAMA structure)
- src/d21_app.ts (live route table: confirm which static URLs exist + find a place to register /sitemap.xml + grep for /guides routes)
- src/a31_site_config.ts (canonical base URL constant: use that, don't hard-code "https://tdd.md" in 20 places)
- src/b51_render_layout.ts (the existing escape helper, as reference for the XML-escape function shape)
- public/robots.txt if it exists (check before clobbering)
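
The registry-driven URL collection from the done-when list might look roughly like the sketch below. Everything about the registry shapes is an assumption: PostEntry and SamaEntry stand in for whatever ALL_POSTS and ALL_SAMA actually contain, BASE_URL stands in for a constant that may or may not exist in src/a31_site_config.ts, and guides are left out because their registry has not been located yet.

```typescript
// Hypothetical registry shapes; the real ALL_POSTS / ALL_SAMA entries may differ.
interface PostEntry { slug: string; date: string }
interface SamaEntry { slug: string }
interface SitemapEntry { loc: string; lastmod?: string }

// Assumed stand-in for the canonical base URL constant.
const BASE_URL = "https://tdd.md";

// Static load-bearing URLs; each corresponds to a literal route in d21_app.ts.
const STATIC_PATHS = [
  "/", "/blog", "/games", "/leaderboard", "/sama", "/sama/v2",
  "/sama/v2/verify", "/sama/v2/example-crud", "/sama/v2/example-wordpress",
  "/sama/skill", "/guides",
] as const;

// Build the entry list from the registries passed in, so there is no
// second hand-maintained slug list that could drift.
function collectSitemapEntries(
  posts: ReadonlyArray<PostEntry>,
  sama: ReadonlyArray<SamaEntry>,
  baseUrl: string = BASE_URL,
): SitemapEntry[] {
  return [
    ...STATIC_PATHS.map((path) => ({ loc: `${baseUrl}${path}` })),
    ...posts.map((post) => ({
      loc: `${baseUrl}/blog/${post.slug}`,
      lastmod: post.date,
    })),
    ...sama.map((page) => ({ loc: `${baseUrl}/sama/${page.slug}` })),
  ];
}

// Headers the handler would attach to its response, per the spec above.
const SITEMAP_HEADERS: Record<string, string> = {
  "Content-Type": "application/xml; charset=utf-8",
  "Cache-Control": "public, max-age=3600",
};
```

In this sketch the handler would pass the collected entries to the pure helper and return the resulting string with SITEMAP_HEADERS. Taking the registries as parameters rather than importing them keeps the collection step testable without the real modules.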