52b2a11a1cad024c7842e9f0dae74e198ac0cd01 diff --git a/content/guides/aider.md b/content/guides/aider.md new file mode 100644 index 0000000000000000000000000000000000000000..6b7fa3f53f74dd7660079beced1912f9070dc672 --- /dev/null +++ b/content/guides/aider.md @@ -0,0 +1,84 @@ +# TDD with Aider + +> Test-driven development on tdd.md, using **Aider** as your agent. Aider's git-native commit-per-edit model maps almost perfectly to red→green→refactor. + +Aider commits after every edit by default. That's exactly what tdd.md wants — one phase per commit, tagged in the message. With a few config tweaks you get a clean trace the judge can replay. + +## one-time setup + +1. **Sign in on tdd.md**: [tdd.md/you](/you) → GitHub OAuth → save the push token. Your GitHub username = your agent name. +2. **Pick a kata** at [/games](/games). +3. **Clone with token embedded**: + ``` + git clone https://:@tdd.md//string-calc.git + cd string-calc + ``` +4. **Start Aider** in the folder: + ``` + aider + ``` + +## the prompt convention + +Aider builds the commit message from your prompt. To get the right prefix, lead every prompt with `red:` / `green:` / `refactor:` / `spike:` (with optional step): + +``` +> red(empty): write a failing test that add("") returns 0. don't touch the implementation. +[aider edits, runs your tests, commits "red(empty): ..."] + +> green(empty): write the simplest add() that makes the test pass. +[aider edits, commits "green(empty): ..."] + +> refactor: extract a parse() helper. tests must stay green. +[aider edits, commits "refactor: ..."] +``` + +Aider's auto-commit puts your prompt verbatim into the message, so the judge picks up the phase tag without you doing anything special. + +## architect mode + +If you have it enabled (`aider --architect`), Aider plans before editing. Useful for the green phase — it'll think about minimal impl before writing it. Worth running for steps where the implementation isn't obvious. + +For the red phase, architect mode is overkill — single-purpose tests are simple. Use plain edit mode. + +## test runner integration + +Aider can re-run tests after every commit: + +``` +aider --test-cmd "bun test" --auto-test +``` + +If the green commit's tests fail, Aider tries to fix it. That's mostly fine, but watch for: +- It might **delete the test** ("simplification") instead of fixing the impl. Tell it explicitly: "fix the impl, never the test." If it deletes anyway, that's a `test-deleted` verdict (-20) on tdd.md. +- It might **make the test trivially true** to "pass". The kata's hidden tests will catch this — verdict `hidden-tests-failed`, 0 points. + +## push and watch + +``` +git push +``` + +The judge runs within seconds. Verdict at [tdd.md//](/agents) shows per-step status, score, and an explanation per row. If you commit-per-phase as above, expect every step to show `verified` and +20. + +## what Aider does well + +- **One commit per edit** — natural fit for one-phase-per-commit. +- **Git-aware refactor** — Aider can be told to refactor without modifying behaviour, and re-runs tests to confirm. +- **Local model support** — keeps the kata closed-loop if you don't want to send code to a hosted provider. + +## common pitfalls + +- **Combined red+green prompts.** "Add a test and make it pass" reads to Aider as one job → one commit → red commit's tests already pass → `red-did-not-fail`, -5. Fix: two separate prompts, two commits. +- **Auto-test fix loop deleting tests.** See "test runner integration" above. Add a CONVENTIONS.md note: "never delete tests; fix the impl." +- **Aider's auto-format reorganizing tests.** If your formatter splits test functions, the test count can drop. Use `--no-auto-commit` and stage manually if this bites. + +## softer modes + +```json +{ "mode": "pragmatic" } +``` + +In pragmatic mode the judge halves penalties — handy if you're letting Aider's auto-test loop try a few things. `learning` floors negatives entirely. + +[← all guides](/guides) · [the kata catalog](/games) · [why TDD on agentic coding](/) diff --git a/content/guides/claude-code.md b/content/guides/claude-code.md new file mode 100644 index 0000000000000000000000000000000000000000..d554d47c9e7c44c44f129bcb4849c1645407fd41 --- /dev/null +++ b/content/guides/claude-code.md @@ -0,0 +1,84 @@ +# TDD with Claude Code + +> Test-driven development on tdd.md, using **Claude Code** as your agent. Score your discipline against hidden tests on every push. + +Claude Code is Anthropic's terminal coding agent. Out of the box it doesn't insist on TDD — it tends to write implementation first, tests later. With the right setup it'll do red→green→refactor cleanly, and tdd.md will verify it. + +## one-time setup + +1. **Sign in with GitHub on tdd.md**: visit [tdd.md/you](/you) → grant the OAuth scopes → save the push token shown on the welcome page. The same identity you use on GitHub becomes your tdd.md agent name. +2. **Pick a kata** at [/games](/games). Start with `string-calc`. +3. **Clone your kata repo** locally: + ``` + git clone https://:@tdd.md//string-calc.git + cd string-calc + ``` +4. **Open Claude Code** in that directory. + +## per-kata workflow + +In your CLAUDE.md (project root), add this snippet so Claude knows the rules: + +```md +This is a TDD kata. The judge at tdd.md scores discipline. + +Cycle: write a FAILING test, commit `red(): `, then write +the simplest impl that makes it pass, commit `green(): `. +Optional `refactor: ` between steps if structure can improve +without changing behaviour. + +Never write impl before its failing test. Never delete a test. +``` + +CLAUDE.md is read as context on every Claude Code invocation — pinning the rule there beats restating it in every prompt. + +## prompt patterns + +Step 1 (red phase): +> "We're starting step `` of the kata. Write a single failing test for the requirement, in ``. Don't touch the implementation yet. After you write the test, run it to confirm it fails." + +Step 2 (green phase, separate prompt): +> "The test fails as expected. Now write the simplest implementation in `` that makes it pass — nothing more. Run the tests to confirm they pass." + +Step 3 (optional refactor): +> "Tests pass. Refactor `` for clarity, but don't change behaviour. Run tests after each edit." + +Each prompt is a separate Claude Code turn — that creates the natural context separation between red and green that pure-TDD discipline demands. Combining them in one prompt is the most common cause of `red-did-not-fail` on tdd.md. + +## commit by phase + +After each phase Claude finishes, commit with the prefix the judge looks for: + +``` +git commit -m "red(empty): empty string returns 0" +git commit -m "green(empty): return 0 directly" +git commit -m "refactor: extract parse() helper" +``` + +`spike: ` is also valid — for exploration that doesn't score and doesn't penalize. + +## push and watch + +``` +git push +``` + +Within seconds the judge clones, replays your commits, runs the hidden tests, and posts the verdict at [tdd.md//](/agents). The page shows status per step, score, and a one-line explanation per row. + +## common pitfalls + +- **Single-prompt red+green.** Claude writes both files in one turn → red commit's tests never failed → `red-did-not-fail`, -5. Solution: two separate Claude Code turns, two separate commits. +- **Tautological tests.** Claude writes `expect(true).toBe(true)` to "pass" the requirement → hidden tests catch it → `hidden-tests-failed`, 0 points. Solution: make the test reflect the actual requirement (kata's spec page is authoritative). +- **Test deletion during refactor.** Claude tidies up by removing tests → `test-deleted`, -20. Solution: tell Claude in CLAUDE.md "never delete tests". + +## modes + +If you want a softer judge while learning Claude Code's TDD habits, drop a `tdd.config.json` in your repo: + +```json +{ "mode": "learning" } +``` + +Learning mode floors negatives at 0 and adds longer explanations. `pragmatic` halves penalties. `strict` is the default. + +[← all guides](/guides) · [the kata catalog](/games) · [why TDD on agentic coding](/) diff --git a/content/guides/cursor.md b/content/guides/cursor.md new file mode 100644 index 0000000000000000000000000000000000000000..2d8c23f510ec369363c7184bfeccec4474b8e049 --- /dev/null +++ b/content/guides/cursor.md @@ -0,0 +1,93 @@ +# TDD with Cursor + +> Test-driven development on tdd.md, using **Cursor** as your agent. Push commits, get a discipline score back within seconds. + +Cursor's strengths for TDD: the Composer (multi-file edits), agent mode, and explicit file-context control let you separate the red and green phases more cleanly than the chat sidebar alone. The tdd.md judge handles the rest — runs the tests, runs the kata's hidden tests, posts a verdict. + +## one-time setup + +1. **Sign in on tdd.md**: visit [tdd.md/you](/you) → GitHub OAuth → save your push token from the welcome page. Your GitHub username becomes your agent name. +2. **Pick a kata** at [/games](/games). +3. **Clone the kata locally** with the push token embedded so future pushes don't prompt: + ``` + git clone https://:@tdd.md//string-calc.git + cd string-calc + ``` +4. **Open the folder in Cursor**. + +## per-kata setup + +In `.cursor/rules/tdd.md` (Cursor's project rules) add: + +```md +This is a TDD kata. The judge at tdd.md scores commit discipline. + +Cycle: write a FAILING test, commit `red(): `, then +the simplest impl, commit `green(): `. Optional +`refactor: ` between steps. + +Never write impl before its failing test. Never delete a test. +Each phase is its own commit. +``` + +Cursor's project rules persist across chats and Composer sessions. Pinning the discipline here is more reliable than putting it in every prompt. + +## workflow with Composer + +For each step: + +**Red phase.** Open Composer (cmd-I), include only the test file. Prompt: +> "Write a single failing test for ``. Don't edit the implementation file." + +Apply the change, run the test to confirm it fails, then: +``` +git commit -m "red(): " +``` + +**Green phase.** Start a fresh Composer (don't continue the previous one — fresh context). Include the impl file. Prompt: +> "Make this test pass with the simplest possible code: ." + +Apply, run tests: +``` +git commit -m "green(): " +``` + +**Refactor (optional).** Composer with both files included: +> "Refactor without changing behaviour. Tests must still pass." +``` +git commit -m "refactor: " +``` + +Fresh-Composer-per-phase is what keeps Cursor honest. If you continue a single Composer thread, the model sees the upcoming impl plan while writing the "red" test — and the test stops failing for the right reason. + +## push and watch + +``` +git push +``` + +Verdict at [tdd.md//](/agents) within seconds: per-step status, score, one-line explanation, and a refactor sub-table. + +## what Cursor does well + +- **Multi-file Composer with explicit context** — keep test files and impl files in separate Composer turns to enforce the red/green separation. +- **Agent mode** — autonomous loops can do a full red→green→refactor without you typing each prompt. Add the project rule above so it doesn't cheat on the order. +- **Inline edits (cmd-K)** — useful for tiny refactor passes. Run tests after each edit. + +## common pitfalls + +- **Composer context bleed.** A single Composer chat with multiple turns lets the model anticipate the impl while writing the test. Fix: fresh Composer per phase. +- **Auto-applied edits across files.** Composer can edit impl + test in one apply. Fix: stage test commit first, run tests to confirm fail, then apply impl in a separate Composer turn. +- **Cursor "fixing" the test on green failure.** When the impl doesn't pass and Cursor offers to update the test instead — refuse. The test was the spec; the impl is wrong. + +## softer modes + +For practice runs, drop `tdd.config.json`: + +```json +{ "mode": "pragmatic" } +``` + +Pragmatic mode halves penalties and accepts combined red+green commits — useful when you're testing Cursor's defaults. + +[← all guides](/guides) · [the kata catalog](/games) · [why TDD on agentic coding](/) diff --git a/content/home.md b/content/home.md index fab44a6c036c3c0b8191c9d559bd567a4f353b03..0506dcbaa930eb4195635727362979ffac44bf2b 100644 --- a/content/home.md +++ b/content/home.md @@ -84,3 +84,5 @@ Pragmatic mode halves the negatives and accepts combined red+green commits. Lear 1. [Sign in with GitHub →](/you) — registers a new agent on your first visit, signs you back in to your dashboard on returns 2. [Pick a kata →](/games) — start with `string-calc` 3. Push commits tagged `red:` / `green:` / `refactor:` and watch your verdict land at `tdd.md//` + +Using a specific tool? Read the agent-specific walkthroughs in [/guides](/guides): [Claude Code](/guides/claude-code), [Cursor](/guides/cursor), [Aider](/guides/aider). diff --git a/src/server.ts b/src/server.ts index 04c15aef7977ebfab149cf2e27dcc0b6d7b6b165..48bca4a28c0a1f05cb50945e73643904f4a52013 100644 --- a/src/server.ts +++ b/src/server.ts @@ -32,6 +32,33 @@ const HOME_HTML = await renderPage({ const ALL_GAMES = await listGames(); +// Agent-specific TDD walkthroughs, served at /guides/. Each entry's +// markdown body lives at content/guides/.md. Adding a new agent +// guide is two lines below + drop the .md file. +interface GuideEntry { + slug: string; + title: string; + description: string; +} + +const ALL_GUIDES: GuideEntry[] = [ + { + slug: "claude-code", + title: "TDD with Claude Code", + description: "Run TDD katas through Anthropic's Claude Code with phase-separated prompts and CLAUDE.md rules so the judge scores clean red→green→refactor cycles.", + }, + { + slug: "cursor", + title: "TDD with Cursor", + description: "Test-driven katas through Cursor — Composer per phase, project rules pinned in .cursor/rules, fresh context for red vs green.", + }, + { + slug: "aider", + title: "TDD with Aider", + description: "Aider's commit-per-edit model maps directly onto red→green→refactor — prompt with phase tags and the auto-commit carries through.", + }, +]; + const gamesIndexBody = `# games ${ALL_GAMES.length === 0 @@ -42,6 +69,7 @@ ${ALL_GAMES.length === 0 } > Ready to play? [Register your agent →](/agents/register) +> Using a specific agent? See the [agent-specific guides](/guides) — Claude Code, Cursor, Aider. `; const GAMES_INDEX_HTML = await renderPage({ @@ -629,11 +657,16 @@ const server = Bun.serve({ const kataUrls = ALL_GAMES.map((g) => url(`https://tdd.md/games/${g.id}`, "0.8"), ).join("\n"); + const guideUrls = ALL_GUIDES.map((g) => + url(`https://tdd.md/guides/${g.slug}`, "0.8"), + ).join("\n"); const xml = ` ${url("https://tdd.md/", "1.0")} ${url("https://tdd.md/games", "0.9")} ${kataUrls} +${url("https://tdd.md/guides", "0.9")} +${guideUrls} ${url("https://tdd.md/agents", "0.7")} ${url("https://tdd.md/leaderboard", "0.7")} `; @@ -650,6 +683,55 @@ ${url("https://tdd.md/leaderboard", "0.7")} }), "/games": htmlResponse(GAMES_INDEX_HTML), + + "/guides": async () => { + const rows = ALL_GUIDES + .map((g) => `| [${g.title}](/guides/${g.slug}) | ${g.description} |`) + .join("\n"); + const body = `# guides + +Agent-specific walkthroughs for using tdd.md with the major agentic-coding tools. Each guide covers setup, prompt patterns that keep the agent in TDD, and the common pitfalls that cost score. + +| guide | what it covers | +|---|---| +${rows} + +> Missing your agent? [The mechanics are the same](/) — push commits tagged \`red:\` / \`green:\` / \`refactor:\` to your kata repo. Send a PR with a new guide and we'll list it here. + +[← play a kata](/games) · [register your agent →](/you) +`; + const html = await renderPage({ + title: "TDD guides for agentic coding tools — tdd.md", + description: "Practical TDD walkthroughs for Claude Code, Cursor, Aider and other AI coding agents — keep your agent honest with red→green→refactor commits, scored by tdd.md.", + bodyMarkdown: body, + ogPath: "https://tdd.md/guides", + active: "games", + }); + return htmlResponse(html); + }, + + "/guides/:slug": async (req) => { + const slug = req.params.slug; + const entry = ALL_GUIDES.find((g) => g.slug === slug); + if (!entry) { + const html = await renderNotFound(`/guides/${slug}`); + return htmlResponse(html, 404); + } + const file = Bun.file(`./content/guides/${slug}.md`); + if (!(await file.exists())) { + const html = await renderNotFound(`/guides/${slug}`); + return htmlResponse(html, 404); + } + const md = await file.text(); + const html = await renderPage({ + title: `${entry.title} — tdd.md`, + description: entry.description, + bodyMarkdown: md, + ogPath: `https://tdd.md/guides/${slug}`, + active: "games", + }); + return htmlResponse(html); + }, "/games/:kata": async (req) => { const res = await renderKata(req.params.kata); if (res) return res;