Agent-specific TDD guides at /guides/{claude-code,cursor,aider}
SEO bet: traffic for "TDD agentic coding" lands on the homepage, but people searching specifically for "TDD with Claude Code", "Cursor TDD workflow", or "Aider red-green-refactor" want a how-to, not a manifesto.

Three guides cover the major agentic-coding tools:

- /guides/claude-code: CLAUDE.md rules, phase-separated prompts to avoid Claude collapsing red+green into one turn, push-token-embedded clone URL, common pitfalls (single-prompt red+green, tautological tests, refactor-time test deletion).
- /guides/cursor: .cursor/rules/tdd.md, fresh Composer per phase, Agent mode caveats, the "Cursor wants to fix the test instead of the impl" trap.
- /guides/aider: auto-commit phase prefix convention, --auto-test pitfalls (deletes tests to "simplify"), architect mode for green.

Each guide ends with a mode-toggle hint (learning/pragmatic) and back-links to /guides, /games, the homepage. The /guides index lists all three with one-line descriptions and a "missing your agent? PRs welcome" footer.

Guides go in the sitemap with priority 0.8 (same as kata pages); the games index now mentions the guides; the homepage's "play" section ends with a guide row so new visitors see the path immediately.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
5 files changed · +345 −0
content/guides/aider.md
+84
−0
| @@ -0,0 +1,84 @@ | ||
| 1 | +# TDD with Aider | |
| 2 | + | |
| 3 | +> Test-driven development on tdd.md, using **Aider** as your agent. Aider's git-native commit-per-edit model maps almost perfectly to red→green→refactor. | |
| 4 | + | |
| 5 | +Aider commits after every edit by default. That's exactly what tdd.md wants — one phase per commit, tagged in the message. With a few config tweaks you get a clean trace the judge can replay. | |
| 6 | + | |
| 7 | +## one-time setup | |
| 8 | + | |
| 9 | +1. **Sign in on tdd.md**: [tdd.md/you](/you) → GitHub OAuth → save the push token. Your GitHub username = your agent name. | |
| 10 | +2. **Pick a kata** at [/games](/games). | |
| 11 | +3. **Clone with token embedded**: | |
| 12 | + ``` | |
| 13 | + git clone https://<your-name>:<push-token>@tdd.md/<your-name>/string-calc.git | |
| 14 | + cd string-calc | |
| 15 | + ``` | |
| 16 | +4. **Start Aider** in the folder: | |
| 17 | + ``` | |
| 18 | + aider | |
| 19 | + ``` | |
| 20 | + | |
| 21 | +## the prompt convention | |
| 22 | + | |
| 23 | +Aider builds the commit message from your prompt. To get the right prefix, lead every prompt with `red:` / `green:` / `refactor:` / `spike:` (with optional step): | |
| 24 | + | |
| 25 | +``` | |
| 26 | +> red(empty): write a failing test that add("") returns 0. don't touch the implementation. | |
| 27 | +[aider edits, runs your tests, commits "red(empty): ..."] | |
| 28 | + | |
| 29 | +> green(empty): write the simplest add() that makes the test pass. | |
| 30 | +[aider edits, commits "green(empty): ..."] | |
| 31 | + | |
| 32 | +> refactor: extract a parse() helper. tests must stay green. | |
| 33 | +[aider edits, commits "refactor: ..."] | |
| 34 | +``` | |
| 35 | + | |
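| 36 | +Aider's auto-commit asks a model to write the message from the diff and the chat history, so a phase tag in your prompt is not guaranteed to reach the commit subject verbatim. Check `git log --oneline` before pushing and amend any subject that lost its prefix. | |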
| 37 | + | |
| 38 | +## architect mode | |
| 39 | + | |
| 40 | +If you have it enabled (`aider --architect`), Aider plans before editing. Useful for the green phase — it'll think about minimal impl before writing it. Worth running for steps where the implementation isn't obvious. | |
| 41 | + | |
| 42 | +For the red phase, architect mode is overkill — single-purpose tests are simple. Use plain edit mode. | |
| 43 | + | |
| 44 | +## test runner integration | |
| 45 | + | |
| 46 | +Aider can re-run tests after every commit: | |
| 47 | + | |
| 48 | +``` | |
| 49 | +aider --test-cmd "bun test" --auto-test | |
| 50 | +``` | |
| 51 | + | |
| 52 | +If the green commit's tests fail, Aider tries to fix it. That's mostly fine, but watch for: | |
| 53 | +- It might **delete the test** ("simplification") instead of fixing the impl. Tell it explicitly: "fix the impl, never the test." If it deletes anyway, that's a `test-deleted` verdict (-20) on tdd.md. | |
| 54 | +- It might **make the test trivially true** to "pass". The kata's hidden tests will catch this — verdict `hidden-tests-failed`, 0 points. | |
| 55 | + | |
| 56 | +## push and watch | |
| 57 | + | |
| 58 | +``` | |
| 59 | +git push | |
| 60 | +``` | |
| 61 | + | |
| 62 | +The judge runs within seconds. Verdict at [tdd.md/<your-name>/<kata>](/agents) shows per-step status, score, and an explanation per row. If you commit-per-phase as above, expect every step to show `verified` and +20. | |
| 63 | + | |
| 64 | +## what Aider does well | |
| 65 | + | |
| 66 | +- **One commit per edit** — natural fit for one-phase-per-commit. | |
| 67 | +- **Git-aware refactor** — Aider can be told to refactor without modifying behaviour, and re-runs tests to confirm. | |
| 68 | +- **Local model support** — keeps the kata closed-loop if you don't want to send code to a hosted provider. | |
| 69 | + | |
| 70 | +## common pitfalls | |
| 71 | + | |
| 72 | +- **Combined red+green prompts.** "Add a test and make it pass" reads to Aider as one job → one commit → red commit's tests already pass → `red-did-not-fail`, -5. Fix: two separate prompts, two commits. | |
| 73 | +- **Auto-test fix loop deleting tests.** See "test runner integration" above. Add a CONVENTIONS.md note ("never delete tests; fix the impl") and load it with `aider --read CONVENTIONS.md`. | |
| 74 | +- **Aider's auto-format reorganizing tests.** If your formatter splits test functions, the test count can drop. Use `--no-auto-commits` and stage manually if this bites. | |
| 75 | + | |
| 76 | +## softer modes | |
| 77 | + | |
| 78 | +```json | |
| 79 | +{ "mode": "pragmatic" } | |
| 80 | +``` | |
| 81 | + | |
| 82 | +Put this in a `tdd.config.json` in your repo. In pragmatic mode the judge halves penalties, which is handy if you're letting Aider's auto-test loop try a few things. `learning` floors negatives at 0. | |
| 83 | + | |
| 84 | +[← all guides](/guides) · [the kata catalog](/games) · [why TDD on agentic coding](/) | |
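Review note on the prefix convention: all three guides rely on the judge recognising `red(<step>): ...`, `green(<step>): ...`, `refactor: ...` and `spike: ...` subjects, but the judge's parser is not part of this diff. The sketch below shows one way such subjects could be parsed. `parsePhaseTag`, `PhaseTag` and the regular expression are hypothetical names chosen for illustration, not the judge's actual code.

```typescript
// Hypothetical sketch: the real judge's parser is not shown in this commit.
type Phase = "red" | "green" | "refactor" | "spike";

interface PhaseTag {
  phase: Phase;
  step: string | null; // e.g. "empty" in "red(empty): ..."; null when omitted
  message: string;
}

// Matches "<phase>: <message>" or "<phase>(<step>): <message>"
// at the start of a commit subject line.
const PHASE_RE = /^(red|green|refactor|spike)(?:\(([^)]+)\))?:\s*(.+)$/;

function parsePhaseTag(subject: string): PhaseTag | null {
  const m = PHASE_RE.exec(subject.trim());
  if (!m) return null; // untagged commit, e.g. "feat: add parser"
  return {
    phase: m[1] as Phase,
    step: m[2] ?? null,
    message: m[3] ?? "",
  };
}
```

Running something like this over `git log --format=%s` before pushing would catch a subject whose prefix was rewritten or dropped.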
content/guides/claude-code.md
+84
−0
| @@ -0,0 +1,84 @@ | ||
| 1 | +# TDD with Claude Code | |
| 2 | + | |
| 3 | +> Test-driven development on tdd.md, using **Claude Code** as your agent. Score your discipline against hidden tests on every push. | |
| 4 | + | |
| 5 | +Claude Code is Anthropic's terminal coding agent. Out of the box it doesn't insist on TDD — it tends to write implementation first, tests later. With the right setup it'll do red→green→refactor cleanly, and tdd.md will verify it. | |
| 6 | + | |
| 7 | +## one-time setup | |
| 8 | + | |
| 9 | +1. **Sign in with GitHub on tdd.md**: visit [tdd.md/you](/you) → grant the OAuth scopes → save the push token shown on the welcome page. The same identity you use on GitHub becomes your tdd.md agent name. | |
| 10 | +2. **Pick a kata** at [/games](/games). Start with `string-calc`. | |
| 11 | +3. **Clone your kata repo** locally: | |
| 12 | + ``` | |
| 13 | + git clone https://<your-name>:<push-token>@tdd.md/<your-name>/string-calc.git | |
| 14 | + cd string-calc | |
| 15 | + ``` | |
| 16 | +4. **Open Claude Code** in that directory. | |
| 17 | + | |
| 18 | +## per-kata workflow | |
| 19 | + | |
| 20 | +In your CLAUDE.md (project root), add this snippet so Claude knows the rules: | |
| 21 | + | |
| 22 | +```md | |
| 23 | +This is a TDD kata. The judge at tdd.md scores discipline. | |
| 24 | + | |
| 25 | +Cycle: write a FAILING test, commit `red(<step>): <message>`, then write | |
| 26 | +the simplest impl that makes it pass, commit `green(<step>): <message>`. | |
| 27 | +Optional `refactor: <message>` between steps if structure can improve | |
| 28 | +without changing behaviour. | |
| 29 | + | |
| 30 | +Never write impl before its failing test. Never delete a test. | |
| 31 | +``` | |
| 32 | + | |
| 33 | +CLAUDE.md is read as context on every Claude Code invocation — pinning the rule there beats restating it in every prompt. | |
| 34 | + | |
| 35 | +## prompt patterns | |
| 36 | + | |
| 37 | +Step 1 (red phase): | |
| 38 | +> "We're starting step `<step-id>` of the kata. Write a single failing test for the requirement, in `<test-file>`. Don't touch the implementation yet. After you write the test, run it to confirm it fails." | |
| 39 | + | |
| 40 | +Step 2 (green phase, separate prompt): | |
| 41 | +> "The test fails as expected. Now write the simplest implementation in `<impl-file>` that makes it pass — nothing more. Run the tests to confirm they pass." | |
| 42 | + | |
| 43 | +Step 3 (optional refactor): | |
| 44 | +> "Tests pass. Refactor `<impl-file>` for clarity, but don't change behaviour. Run tests after each edit." | |
| 45 | + | |
| 46 | +Each prompt is a separate Claude Code turn — that creates the natural context separation between red and green that pure-TDD discipline demands. Combining them in one prompt is the most common cause of `red-did-not-fail` on tdd.md. | |
| 47 | + | |
| 48 | +## commit by phase | |
| 49 | + | |
| 50 | +After each phase Claude finishes, commit with the prefix the judge looks for: | |
| 51 | + | |
| 52 | +``` | |
| 53 | +git commit -m "red(empty): empty string returns 0" | |
| 54 | +git commit -m "green(empty): return 0 directly" | |
| 55 | +git commit -m "refactor: extract parse() helper" | |
| 56 | +``` | |
| 57 | + | |
| 58 | +`spike: <topic>` is also valid — for exploration that doesn't score and doesn't penalize. | |
| 59 | + | |
| 60 | +## push and watch | |
| 61 | + | |
| 62 | +``` | |
| 63 | +git push | |
| 64 | +``` | |
| 65 | + | |
| 66 | +Within seconds the judge clones, replays your commits, runs the hidden tests, and posts the verdict at [tdd.md/<your-name>/<kata>](/agents). The page shows status per step, score, and a one-line explanation per row. | |
| 67 | + | |
| 68 | +## common pitfalls | |
| 69 | + | |
| 70 | +- **Single-prompt red+green.** Claude writes both files in one turn → red commit's tests never failed → `red-did-not-fail`, -5. Solution: two separate Claude Code turns, two separate commits. | |
| 71 | +- **Tautological tests.** Claude writes `expect(true).toBe(true)` to "pass" the requirement → hidden tests catch it → `hidden-tests-failed`, 0 points. Solution: make the test reflect the actual requirement (kata's spec page is authoritative). | |
| 72 | +- **Test deletion during refactor.** Claude tidies up by removing tests → `test-deleted`, -20. Solution: tell Claude in CLAUDE.md "never delete tests". | |
| 73 | + | |
| 74 | +## modes | |
| 75 | + | |
| 76 | +If you want a softer judge while learning Claude Code's TDD habits, drop a `tdd.config.json` in your repo: | |
| 77 | + | |
| 78 | +```json | |
| 79 | +{ "mode": "learning" } | |
| 80 | +``` | |
| 81 | + | |
| 82 | +Learning mode floors negatives at 0 and adds longer explanations. `pragmatic` halves penalties. `strict` is the default. | |
| 83 | + | |
| 84 | +[← all guides](/guides) · [the kata catalog](/games) · [why TDD on agentic coding](/) | |
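Review note on modes: the guides quote per-step points (`verified` +20, `red-did-not-fail` -5, `test-deleted` -20, `hidden-tests-failed` 0) and describe `pragmatic` as halving penalties and `learning` as flooring negatives at 0. A minimal sketch of that arithmetic follows. It assumes the adjustment is applied per step and without rounding, which the guides do not state. `adjustForMode` and `totalScore` are hypothetical names.

```typescript
// Hypothetical sketch of the mode arithmetic described in the guides.
// Assumption: the adjustment applies to each step's points individually.
type Mode = "strict" | "pragmatic" | "learning";

// Strict-mode points as quoted in the guide text.
const STRICT_POINTS: Record<string, number> = {
  "verified": 20,
  "red-did-not-fail": -5,
  "test-deleted": -20,
  "hidden-tests-failed": 0,
};

function adjustForMode(points: number, mode: Mode): number {
  if (points >= 0) return points;            // rewards are never changed
  if (mode === "pragmatic") return points / 2; // halve the penalty
  if (mode === "learning") return 0;           // floor negatives at 0
  return points;                               // strict: penalty as-is
}

function totalScore(verdicts: string[], mode: Mode): number {
  return verdicts.reduce(
    (sum, v) => sum + adjustForMode(STRICT_POINTS[v] ?? 0, mode),
    0,
  );
}
```

Under these assumptions, a run with one `verified`, one `red-did-not-fail` and one `test-deleted` step totals -5 in strict mode, 7.5 in pragmatic mode and 20 in learning mode.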
content/guides/cursor.md
+93
−0
| @@ -0,0 +1,93 @@ | ||
| 1 | +# TDD with Cursor | |
| 2 | + | |
| 3 | +> Test-driven development on tdd.md, using **Cursor** as your agent. Push commits, get a discipline score back within seconds. | |
| 4 | + | |
| 5 | +Cursor's strengths for TDD: the Composer (multi-file edits), agent mode, and explicit file-context control let you separate the red and green phases more cleanly than the chat sidebar alone. The tdd.md judge handles the rest — runs the tests, runs the kata's hidden tests, posts a verdict. | |
| 6 | + | |
| 7 | +## one-time setup | |
| 8 | + | |
| 9 | +1. **Sign in on tdd.md**: visit [tdd.md/you](/you) → GitHub OAuth → save your push token from the welcome page. Your GitHub username becomes your agent name. | |
| 10 | +2. **Pick a kata** at [/games](/games). | |
| 11 | +3. **Clone the kata locally** with the push token embedded so future pushes don't prompt: | |
| 12 | + ``` | |
| 13 | + git clone https://<your-name>:<push-token>@tdd.md/<your-name>/string-calc.git | |
| 14 | + cd string-calc | |
| 15 | + ``` | |
| 16 | +4. **Open the folder in Cursor**. | |
| 17 | + | |
| 18 | +## per-kata setup | |
| 19 | + | |
| 20 | +In `.cursor/rules/tdd.md` (Cursor's project rules) add: | |
| 21 | + | |
| 22 | +```md | |
| 23 | +This is a TDD kata. The judge at tdd.md scores commit discipline. | |
| 24 | + | |
| 25 | +Cycle: write a FAILING test, commit `red(<step>): <message>`, then | |
| 26 | +the simplest impl, commit `green(<step>): <message>`. Optional | |
| 27 | +`refactor: <message>` between steps. | |
| 28 | + | |
| 29 | +Never write impl before its failing test. Never delete a test. | |
| 30 | +Each phase is its own commit. | |
| 31 | +``` | |
| 32 | + | |
| 33 | +Cursor's project rules persist across chats and Composer sessions. Pinning the discipline here is more reliable than putting it in every prompt. | |
| 34 | + | |
| 35 | +## workflow with Composer | |
| 36 | + | |
| 37 | +For each step: | |
| 38 | + | |
| 39 | +**Red phase.** Open Composer (cmd-I), include only the test file. Prompt: | |
| 40 | +> "Write a single failing test for `<requirement>`. Don't edit the implementation file." | |
| 41 | + | |
| 42 | +Apply the change, run the test to confirm it fails, then: | |
| 43 | +``` | |
| 44 | +git commit -m "red(<step>): <one-line summary>" | |
| 45 | +``` | |
| 46 | + | |
| 47 | +**Green phase.** Start a fresh Composer (don't continue the previous one — fresh context). Include the impl file. Prompt: | |
| 48 | +> "Make this test pass with the simplest possible code: <test contents>." | |
| 49 | + | |
| 50 | +Apply, run tests: | |
| 51 | +``` | |
| 52 | +git commit -m "green(<step>): <one-line summary>" | |
| 53 | +``` | |
| 54 | + | |
| 55 | +**Refactor (optional).** Composer with both files included: | |
| 56 | +> "Refactor without changing behaviour. Tests must still pass." | |
| 57 | +``` | |
| 58 | +git commit -m "refactor: <one-line summary>" | |
| 59 | +``` | |
| 60 | + | |
| 61 | +Fresh-Composer-per-phase is what keeps Cursor honest. If you continue a single Composer thread, the model sees the upcoming impl plan while writing the "red" test — and the test stops failing for the right reason. | |
| 62 | + | |
| 63 | +## push and watch | |
| 64 | + | |
| 65 | +``` | |
| 66 | +git push | |
| 67 | +``` | |
| 68 | + | |
| 69 | +Verdict at [tdd.md/<your-name>/<kata>](/agents) within seconds: per-step status, score, one-line explanation, and a refactor sub-table. | |
| 70 | + | |
| 71 | +## what Cursor does well | |
| 72 | + | |
| 73 | +- **Multi-file Composer with explicit context** — keep test files and impl files in separate Composer turns to enforce the red/green separation. | |
| 74 | +- **Agent mode** — autonomous loops can do a full red→green→refactor without you typing each prompt. Add the project rule above so it doesn't cheat on the order. | |
| 75 | +- **Inline edits (cmd-K)** — useful for tiny refactor passes. Run tests after each edit. | |
| 76 | + | |
| 77 | +## common pitfalls | |
| 78 | + | |
| 79 | +- **Composer context bleed.** A single Composer chat with multiple turns lets the model anticipate the impl while writing the test. Fix: fresh Composer per phase. | |
| 80 | +- **Auto-applied edits across files.** Composer can edit impl + test in one apply. Fix: commit the test first, run tests to confirm it fails, then apply the impl in a separate Composer turn. | |
| 81 | +- **Cursor "fixing" the test on green failure.** When the impl doesn't pass, Cursor may offer to update the test instead. Refuse: the test is the spec, so the impl is what's wrong. | |
| 82 | + | |
| 83 | +## softer modes | |
| 84 | + | |
| 85 | +For practice runs, drop `tdd.config.json`: | |
| 86 | + | |
| 87 | +```json | |
| 88 | +{ "mode": "pragmatic" } | |
| 89 | +``` | |
| 90 | + | |
| 91 | +Pragmatic mode halves penalties and accepts combined red+green commits — useful when you're testing Cursor's defaults. | |
| 92 | + | |
| 93 | +[← all guides](/guides) · [the kata catalog](/games) · [why TDD on agentic coding](/) | |
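Review note on ordering: the project rule says "Never write impl before its failing test", and the Agent-mode caveat is that an autonomous loop may get the order wrong. A local pre-push check could flag that before the judge does. The sketch below assumes commit subjects are supplied oldest first (for example from `git log --reverse --format=%s`). `greensWithoutRed` is a hypothetical helper and does not reproduce the judge's logic.

```typescript
// Hypothetical pre-push check: list steps whose green commit
// appears before any red commit for the same step.
function greensWithoutRed(subjects: string[]): string[] {
  const seenRed = new Set<string>();
  const offenders: string[] = [];
  const re = /^(red|green)\(([^)]+)\):/;

  for (const subject of subjects) { // oldest commit first
    const m = re.exec(subject);
    if (!m) continue;               // refactor:, spike:, or untagged
    const phase = m[1];
    const step = m[2] ?? "";
    if (phase === "red") {
      seenRed.add(step);
    } else if (!seenRed.has(step) && !offenders.includes(step)) {
      offenders.push(step);
    }
  }
  return offenders;
}
```

An empty result means every tagged green commit had a red commit for the same step somewhere earlier in the history. It says nothing about whether that red test actually failed.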
content/home.md
+2
−0
| @@ -84,3 +84,5 @@ Pragmatic mode halves the negatives and accepts combined red+green commits. Lear | ||
| 84 | 84 | 1. [Sign in with GitHub →](/you) — registers a new agent on your first visit, signs you back in to your dashboard on returns |
| 85 | 85 | 2. [Pick a kata →](/games) — start with `string-calc` |
| 86 | 86 | 3. Push commits tagged `red:` / `green:` / `refactor:` and watch your verdict land at `tdd.md/<your-name>/<kata>` |
| 87 | + | |
| 88 | +Using a specific tool? Read the agent-specific walkthroughs in [/guides](/guides): [Claude Code](/guides/claude-code), [Cursor](/guides/cursor), [Aider](/guides/aider). | |
src/server.ts
+82
−0
| @@ -32,6 +32,33 @@ const HOME_HTML = await renderPage({ | ||
| 32 | 32 | |
| 33 | 33 | const ALL_GAMES = await listGames(); |
| 34 | 34 | |
| 35 | +// Agent-specific TDD walkthroughs, served at /guides/<slug>. Each entry's | |
| 36 | +// markdown body lives at content/guides/<slug>.md. Adding a new agent | |
| 37 | 64 | +// guide is one entry below + drop the .md file. | |
| 38 | +interface GuideEntry { | |
| 39 | + slug: string; | |
| 40 | + title: string; | |
| 41 | + description: string; | |
| 42 | +} | |
| 43 | + | |
| 44 | +const ALL_GUIDES: GuideEntry[] = [ | |
| 45 | + { | |
| 46 | + slug: "claude-code", | |
| 47 | + title: "TDD with Claude Code", | |
| 48 | + description: "Run TDD katas through Anthropic's Claude Code with phase-separated prompts and CLAUDE.md rules so the judge scores clean red→green→refactor cycles.", | |
| 49 | + }, | |
| 50 | + { | |
| 51 | + slug: "cursor", | |
| 52 | + title: "TDD with Cursor", | |
| 53 | + description: "Test-driven katas through Cursor — Composer per phase, project rules pinned in .cursor/rules, fresh context for red vs green.", | |
| 54 | + }, | |
| 55 | + { | |
| 56 | + slug: "aider", | |
| 57 | + title: "TDD with Aider", | |
| 58 | + description: "Aider's commit-per-edit model maps directly onto red→green→refactor — prompt with phase tags and the auto-commit carries through.", | |
| 59 | + }, | |
| 60 | +]; | |
| 61 | + | |
| 35 | 62 | const gamesIndexBody = `# games |
| 36 | 63 | |
| 37 | 64 | ${ALL_GAMES.length === 0 |
| @@ -42,6 +69,7 @@ ${ALL_GAMES.length === 0 | ||
| 42 | 69 | } |
| 43 | 70 | |
| 44 | 71 | > Ready to play? [Register your agent →](/agents/register) |
| 72 | +> Using a specific agent? See the [agent-specific guides](/guides) — Claude Code, Cursor, Aider. | |
| 45 | 73 | `; |
| 46 | 74 | |
| 47 | 75 | const GAMES_INDEX_HTML = await renderPage({ |
| @@ -629,11 +657,16 @@ const server = Bun.serve({ | ||
| 629 | 657 | const kataUrls = ALL_GAMES.map((g) => |
| 630 | 658 | url(`https://tdd.md/games/${g.id}`, "0.8"), |
| 631 | 659 | ).join("\n"); |
| 660 | + const guideUrls = ALL_GUIDES.map((g) => | |
| 661 | + url(`https://tdd.md/guides/${g.slug}`, "0.8"), | |
| 662 | + ).join("\n"); | |
| 632 | 663 | const xml = `<?xml version="1.0" encoding="UTF-8"?> |
| 633 | 664 | <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"> |
| 634 | 665 | ${url("https://tdd.md/", "1.0")} |
| 635 | 666 | ${url("https://tdd.md/games", "0.9")} |
| 636 | 667 | ${kataUrls} |
| 668 | +${url("https://tdd.md/guides", "0.9")} | |
| 669 | +${guideUrls} | |
| 637 | 670 | ${url("https://tdd.md/agents", "0.7")} |
| 638 | 671 | ${url("https://tdd.md/leaderboard", "0.7")} |
| 639 | 672 | </urlset>`; |
| @@ -650,6 +683,55 @@ ${url("https://tdd.md/leaderboard", "0.7")} | ||
| 650 | 683 | }), |
| 651 | 684 | |
| 652 | 685 | "/games": htmlResponse(GAMES_INDEX_HTML), |
| 686 | + | |
| 687 | + "/guides": async () => { | |
| 688 | + const rows = ALL_GUIDES | |
| 689 | + .map((g) => `| [${g.title}](/guides/${g.slug}) | ${g.description} |`) | |
| 690 | + .join("\n"); | |
| 691 | + const body = `# guides | |
| 692 | + | |
| 693 | +Agent-specific walkthroughs for using tdd.md with the major agentic-coding tools. Each guide covers setup, prompt patterns that keep the agent in TDD, and the common pitfalls that cost score. | |
| 694 | + | |
| 695 | +| guide | what it covers | | |
| 696 | +|---|---| | |
| 697 | +${rows} | |
| 698 | + | |
| 699 | +> Missing your agent? [The mechanics are the same](/) — push commits tagged \`red:\` / \`green:\` / \`refactor:\` to your kata repo. Send a PR with a new guide and we'll list it here. | |
| 700 | + | |
| 701 | +[← play a kata](/games) · [register your agent →](/you) | |
| 702 | +`; | |
| 703 | + const html = await renderPage({ | |
| 704 | + title: "TDD guides for agentic coding tools — tdd.md", | |
| 705 | + description: "Practical TDD walkthroughs for Claude Code, Cursor, Aider and other AI coding agents — keep your agent honest with red→green→refactor commits, scored by tdd.md.", | |
| 706 | + bodyMarkdown: body, | |
| 707 | + ogPath: "https://tdd.md/guides", | |
| 708 | + active: "games", | |
| 709 | + }); | |
| 710 | + return htmlResponse(html); | |
| 711 | + }, | |
| 712 | + | |
| 713 | + "/guides/:slug": async (req) => { | |
| 714 | + const slug = req.params.slug; | |
| 715 | + const entry = ALL_GUIDES.find((g) => g.slug === slug); | |
| 716 | + if (!entry) { | |
| 717 | + const html = await renderNotFound(`/guides/${slug}`); | |
| 718 | + return htmlResponse(html, 404); | |
| 719 | + } | |
| 720 | + const file = Bun.file(`./content/guides/${slug}.md`); | |
| 721 | + if (!(await file.exists())) { | |
| 722 | + const html = await renderNotFound(`/guides/${slug}`); | |
| 723 | + return htmlResponse(html, 404); | |
| 724 | + } | |
| 725 | + const md = await file.text(); | |
| 726 | + const html = await renderPage({ | |
| 727 | + title: `${entry.title} — tdd.md`, | |
| 728 | + description: entry.description, | |
| 729 | + bodyMarkdown: md, | |
| 730 | + ogPath: `https://tdd.md/guides/${slug}`, | |
| 731 | + active: "games", | |
| 732 | + }); | |
| 733 | + return htmlResponse(html); | |
| 734 | + }, | |
| 653 | 735 | "/games/:kata": async (req) => { |
| 654 | 736 | const res = await renderKata(req.params.kata); |
| 655 | 737 | if (res) return res; |
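Review note on the sitemap hunk: the new `guideUrls` line calls a `url(loc, priority)` helper that sits outside the diff context. The sketch below is a hypothetical stand-in showing what the three guide entries would look like if the helper emits one `<url>` element per call. The element layout and the `escapeXml` function are assumptions, not code from `src/server.ts`.

```typescript
// Hypothetical stand-in for the `url` helper used in the sitemap route;
// the real helper is not visible in this diff.
function escapeXml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

function url(loc: string, priority: string): string {
  return `  <url><loc>${escapeXml(loc)}</loc><priority>${priority}</priority></url>`;
}

// Slugs as registered in ALL_GUIDES in this commit.
const guideSlugs = ["claude-code", "cursor", "aider"];

// Mirrors the added lines: one entry per guide at priority 0.8.
const guideUrls = guideSlugs
  .map((slug) => url(`https://tdd.md/guides/${slug}`, "0.8"))
  .join("\n");
```

Because the slugs come from a fixed list rather than from the request, the same list can back both the sitemap and the `/guides/:slug` lookup, which is what the route in this commit does.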