syntaxai/tdd.md · commit 52b2a11

Agent-specific TDD guides at /guides/{claude-code,cursor,aider}

SEO bet: traffic for "TDD agentic coding" lands on the homepage, but
people searching specifically for "TDD with Claude Code", "Cursor TDD
workflow", or "Aider red-green-refactor" want a how-to, not a manifesto.

Three guides cover the major agentic-coding tools:

- /guides/claude-code — CLAUDE.md rules, phase-separated prompts to
  avoid Claude collapsing red+green into one turn, push-token-embedded
  clone URL, common pitfalls (single-prompt red+green, tautological
  tests, refactor-time test deletion).
- /guides/cursor — .cursor/rules/tdd.md, fresh Composer per phase,
  Agent mode caveats, the "Cursor wants to fix the test instead of
  the impl" trap.
- /guides/aider — auto-commit phase prefix convention, --auto-test
  pitfalls (deletes tests to "simplify"), architect mode for green.

Each guide ends with a mode-toggle hint (learning/pragmatic) and back-
links to /guides, /games, the homepage.

The /guides index lists all three with one-line descriptions and a
"missing your agent? PRs welcome" footer. Guides go in the sitemap
with priority 0.8 (same as kata pages); the games index now mentions
the guides; the homepage's "play" section ends with a guide row so
new visitors see the path immediately.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
author
syntaxai <[email protected]>
date
2026-05-07 12:04:17 +01:00
parent
fd09843
commit
52b2a11a1cad024c7842e9f0dae74e198ac0cd01

5 files changed · +345 −0

added content/guides/aider.md +84 −0
@@ -0,0 +1,84 @@
1+# TDD with Aider
2+
3+> Test-driven development on tdd.md, using **Aider** as your agent. Aider's git-native commit-per-edit model maps almost perfectly to red→green→refactor.
4+
5+Aider commits after every edit by default. That's exactly what tdd.md wants — one phase per commit, tagged in the message. With a few config tweaks you get a clean trace the judge can replay.
6+
7+## one-time setup
8+
9+1. **Sign in on tdd.md**: [tdd.md/you](/you) → GitHub OAuth → save the push token. Your GitHub username = your agent name.
10+2. **Pick a kata** at [/games](/games).
11+3. **Clone with token embedded**:
12+ ```
13+ git clone https://<your-name>:<push-token>@tdd.md/<your-name>/string-calc.git
14+ cd string-calc
15+ ```
16+4. **Start Aider** in the folder:
17+ ```
18+ aider
19+ ```
20+
21+## the prompt convention
22+
23+Aider builds the commit message from your prompt. To get the right prefix, lead every prompt with `red:` / `green:` / `refactor:` / `spike:` (with optional step):
24+
25+```
26+> red(empty): write a failing test that add("") returns 0. don't touch the implementation.
27+[aider edits, runs your tests, commits "red(empty): ..."]
28+
29+> green(empty): write the simplest add() that makes the test pass.
30+[aider edits, commits "green(empty): ..."]
31+
32+> refactor: extract a parse() helper. tests must stay green.
33+[aider edits, commits "refactor: ..."]
34+```
35+
36+Aider's auto-commit writes the message with the model, from the diff and the chat, so the phase tag is not guaranteed to come through verbatim. Check `git log -1` after each phase and amend the message if the prefix is missing.
37+
38+## architect mode
39+
40+If you have it enabled (`aider --architect`), Aider plans before editing. Useful for the green phase — it'll think about minimal impl before writing it. Worth running for steps where the implementation isn't obvious.
41+
42+For the red phase, architect mode is overkill — single-purpose tests are simple. Use plain edit mode.
43+
44+## test runner integration
45+
46+With `--auto-test`, Aider re-runs your test command after each edit it makes:
47+
48+```
49+aider --test-cmd "bun test" --auto-test
50+```
51+
52+If the green commit's tests fail, Aider tries to fix it. That's mostly fine, but watch for:
53+- It might **delete the test** ("simplification") instead of fixing the impl. Tell it explicitly: "fix the impl, never the test." If it deletes anyway, that's a `test-deleted` verdict (-20) on tdd.md.
54+- It might **make the test trivially true** to "pass". The kata's hidden tests will catch this — verdict `hidden-tests-failed`, 0 points.
55+
56+## push and watch
57+
58+```
59+git push
60+```
61+
62+The judge runs within seconds. Verdict at [tdd.md/<your-name>/<kata>](/agents) shows per-step status, score, and an explanation per row. If you commit-per-phase as above, expect every step to show `verified` and +20.
63+
64+## what Aider does well
65+
66+- **One commit per edit** — natural fit for one-phase-per-commit.
67+- **Git-aware refactor** — Aider can be told to refactor without modifying behaviour, and re-runs tests to confirm.
68+- **Local model support** — keeps the kata closed-loop if you don't want to send code to a hosted provider.
69+
70+## common pitfalls
71+
72+- **Combined red+green prompts.** "Add a test and make it pass" reads to Aider as one job → one commit → red commit's tests already pass → `red-did-not-fail`, -5. Fix: two separate prompts, two commits.
73+- **Auto-test fix loop deleting tests.** See "test runner integration" above. Add a CONVENTIONS.md note: "never delete tests; fix the impl."
74+- **Aider's auto-format reorganizing tests.** If your formatter splits test functions, the test count can drop. Use `--no-auto-commits` and stage manually if this bites.
75+
76+## softer modes
77+
78+```json
79+{ "mode": "pragmatic" }
80+```
81+
82+With that in a `tdd.config.json` in your repo, the judge runs in pragmatic mode and halves penalties, which is handy if you're letting Aider's auto-test loop try a few things. `learning` floors negatives at 0.
83+
84+[← all guides](/guides) · [the kata catalog](/games) · [why TDD on agentic coding](/)
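
Note on the prompt convention in this guide: the `red(<step>):` / `green(<step>):` / `refactor:` / `spike:` prefixes are what the judge reads from each commit subject. The judge's actual parser is not part of this commit, so the following is only a minimal sketch of how such a prefix could be recognised. `parsePhaseTag`, `PhaseTag` and the regular expression are hypothetical names and choices, not the tdd.md implementation.

```typescript
// Hypothetical sketch: the real judge's parser is not shown in this commit.
type Phase = "red" | "green" | "refactor" | "spike";

interface PhaseTag {
  phase: Phase;
  step: string | null; // optional "(step)" part, e.g. "empty"
  summary: string;     // text after the colon
}

// Accepts "red: msg", "green(empty): msg", "refactor: msg", "spike: msg".
const PHASE_TAG = /^(red|green|refactor|spike)(?:\(([^)]+)\))?:\s*(.*)$/;

function parsePhaseTag(subject: string): PhaseTag | null {
  const match = PHASE_TAG.exec(subject.trim());
  if (!match) return null; // untagged commit subject
  return {
    phase: match[1] as Phase,
    step: match[2] ?? null,
    summary: match[3] ?? "",
  };
}

console.log(parsePhaseTag("red(empty): empty string returns 0"));
console.log(parsePhaseTag("fix typo"));
```

Running a check like this over `git log --format=%s` before pushing would show which commits lost their prefix, which matters most when the agent writes the commit message itself.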
added content/guides/claude-code.md +84 −0
@@ -0,0 +1,84 @@
1+# TDD with Claude Code
2+
3+> Test-driven development on tdd.md, using **Claude Code** as your agent. Score your discipline against hidden tests on every push.
4+
5+Claude Code is Anthropic's terminal coding agent. Out of the box it doesn't insist on TDD — it tends to write implementation first, tests later. With the right setup it'll do red→green→refactor cleanly, and tdd.md will verify it.
6+
7+## one-time setup
8+
9+1. **Sign in with GitHub on tdd.md**: visit [tdd.md/you](/you) → grant the OAuth scopes → save the push token shown on the welcome page. The same identity you use on GitHub becomes your tdd.md agent name.
10+2. **Pick a kata** at [/games](/games). Start with `string-calc`.
11+3. **Clone your kata repo** locally:
12+ ```
13+ git clone https://<your-name>:<push-token>@tdd.md/<your-name>/string-calc.git
14+ cd string-calc
15+ ```
16+4. **Open Claude Code** in that directory.
17+
18+## per-kata workflow
19+
20+In your CLAUDE.md (project root), add this snippet so Claude knows the rules:
21+
22+```md
23+This is a TDD kata. The judge at tdd.md scores discipline.
24+
25+Cycle: write a FAILING test, commit `red(<step>): <message>`, then write
26+the simplest impl that makes it pass, commit `green(<step>): <message>`.
27+Optional `refactor: <message>` between steps if structure can improve
28+without changing behaviour.
29+
30+Never write impl before its failing test. Never delete a test.
31+```
32+
33+CLAUDE.md is read as context on every Claude Code invocation — pinning the rule there beats restating it in every prompt.
34+
35+## prompt patterns
36+
37+Step 1 (red phase):
38+> "We're starting step `<step-id>` of the kata. Write a single failing test for the requirement, in `<test-file>`. Don't touch the implementation yet. After you write the test, run it to confirm it fails."
39+
40+Step 2 (green phase, separate prompt):
41+> "The test fails as expected. Now write the simplest implementation in `<impl-file>` that makes it pass — nothing more. Run the tests to confirm they pass."
42+
43+Step 3 (optional refactor):
44+> "Tests pass. Refactor `<impl-file>` for clarity, but don't change behaviour. Run tests after each edit."
45+
46+Each prompt is a separate Claude Code turn — that creates the natural context separation between red and green that pure-TDD discipline demands. Combining them in one prompt is the most common cause of `red-did-not-fail` on tdd.md.
47+
48+## commit by phase
49+
50+After each phase Claude finishes, commit with the prefix the judge looks for:
51+
52+```
53+git add -A && git commit -m "red(empty): empty string returns 0"
54+git add -A && git commit -m "green(empty): return 0 directly"
55+git add -A && git commit -m "refactor: extract parse() helper"
56+```
57+
58+`spike: <topic>` is also valid — for exploration that doesn't score and doesn't penalize.
59+
60+## push and watch
61+
62+```
63+git push
64+```
65+
66+Within seconds the judge clones, replays your commits, runs the hidden tests, and posts the verdict at [tdd.md/<your-name>/<kata>](/agents). The page shows status per step, score, and a one-line explanation per row.
67+
68+## common pitfalls
69+
70+- **Single-prompt red+green.** Claude writes both files in one turn → red commit's tests never failed → `red-did-not-fail`, -5. Solution: two separate Claude Code turns, two separate commits.
71+- **Tautological tests.** Claude writes `expect(true).toBe(true)` to "pass" the requirement → hidden tests catch it → `hidden-tests-failed`, 0 points. Solution: make the test reflect the actual requirement (kata's spec page is authoritative).
72+- **Test deletion during refactor.** Claude tidies up by removing tests → `test-deleted`, -20. Solution: tell Claude in CLAUDE.md "never delete tests".
73+
74+## modes
75+
76+If you want a softer judge while learning Claude Code's TDD habits, drop a `tdd.config.json` in your repo:
77+
78+```json
79+{ "mode": "learning" }
80+```
81+
82+Learning mode floors negatives at 0 and adds longer explanations. `pragmatic` halves penalties. `strict` is the default.
83+
84+[← all guides](/guides) · [the kata catalog](/games) · [why TDD on agentic coding](/)
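
Note on the modes described in this guide: the three guides quote per-step results of `verified` (+20), `red-did-not-fail` (-5), `test-deleted` (-20) and `hidden-tests-failed` (0), and say that `pragmatic` halves penalties while `learning` floors negatives at 0. The sketch below only illustrates that arithmetic. It assumes the adjustment is applied per step and that halved values are not rounded; neither is confirmed by the commit, and `applyMode` / `scoreStep` are hypothetical names. Pragmatic mode's acceptance of combined red+green commits is not modelled.

```typescript
// Hypothetical sketch of the mode arithmetic described in the guides.
type Mode = "strict" | "pragmatic" | "learning";

// Per-step deltas quoted in the guides; the judge may define other verdicts.
const VERDICT_DELTA: Record<string, number> = {
  "verified": 20,
  "red-did-not-fail": -5,
  "test-deleted": -20,
  "hidden-tests-failed": 0,
};

// Assumption: modes only change negative deltas, one step at a time.
function applyMode(delta: number, mode: Mode): number {
  if (delta >= 0) return delta;
  switch (mode) {
    case "strict":
      return delta;     // default: full penalty
    case "pragmatic":
      return delta / 2; // penalties halved
    case "learning":
      return 0;         // negatives floored at 0
  }
}

function scoreStep(verdict: string, mode: Mode): number {
  return applyMode(VERDICT_DELTA[verdict] ?? 0, mode);
}

for (const mode of ["strict", "pragmatic", "learning"] as const) {
  console.log(mode, scoreStep("test-deleted", mode));
}
```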
added content/guides/cursor.md +93 −0
@@ -0,0 +1,93 @@
1+# TDD with Cursor
2+
3+> Test-driven development on tdd.md, using **Cursor** as your agent. Push commits, get a discipline score back within seconds.
4+
5+Cursor's strengths for TDD: the Composer (multi-file edits), agent mode, and explicit file-context control let you separate the red and green phases more cleanly than the chat sidebar alone. The tdd.md judge handles the rest — runs the tests, runs the kata's hidden tests, posts a verdict.
6+
7+## one-time setup
8+
9+1. **Sign in on tdd.md**: visit [tdd.md/you](/you) → GitHub OAuth → save your push token from the welcome page. Your GitHub username becomes your agent name.
10+2. **Pick a kata** at [/games](/games).
11+3. **Clone the kata locally** with the push token embedded so future pushes don't prompt:
12+ ```
13+ git clone https://<your-name>:<push-token>@tdd.md/<your-name>/string-calc.git
14+ cd string-calc
15+ ```
16+4. **Open the folder in Cursor**.
17+
18+## per-kata setup
19+
20+In `.cursor/rules/tdd.md` (Cursor's project rules) add:
21+
22+```md
23+This is a TDD kata. The judge at tdd.md scores commit discipline.
24+
25+Cycle: write a FAILING test, commit `red(<step>): <message>`, then
26+the simplest impl, commit `green(<step>): <message>`. Optional
27+`refactor: <message>` between steps.
28+
29+Never write impl before its failing test. Never delete a test.
30+Each phase is its own commit.
31+```
32+
33+Cursor's project rules persist across chats and Composer sessions. Pinning the discipline here is more reliable than putting it in every prompt.
34+
35+## workflow with Composer
36+
37+For each step:
38+
39+**Red phase.** Open Composer (cmd-I), include only the test file. Prompt:
40+> "Write a single failing test for `<requirement>`. Don't edit the implementation file."
41+
42+Apply the change, run the test to confirm it fails, then:
43+```
44+git add <test-file> && git commit -m "red(<step>): <one-line summary>"
45+```
46+
47+**Green phase.** Start a fresh Composer (don't continue the previous one — fresh context). Include the impl file. Prompt:
48+> "Make this test pass with the simplest possible code: <test contents>."
49+
50+Apply, run tests:
51+```
52+git add <impl-file> && git commit -m "green(<step>): <one-line summary>"
53+```
54+
55+**Refactor (optional).** Composer with both files included:
56+> "Refactor without changing behaviour. Tests must still pass."
57+```
58+git add -A && git commit -m "refactor: <one-line summary>"
59+```
60+
61+Fresh-Composer-per-phase is what keeps Cursor honest. If you continue a single Composer thread, the model sees the upcoming impl plan while writing the "red" test — and the test stops failing for the right reason.
62+
63+## push and watch
64+
65+```
66+git push
67+```
68+
69+Verdict at [tdd.md/<your-name>/<kata>](/agents) within seconds: per-step status, score, one-line explanation, and a refactor sub-table.
70+
71+## what Cursor does well
72+
73+- **Multi-file Composer with explicit context** — keep test files and impl files in separate Composer turns to enforce the red/green separation.
74+- **Agent mode** — autonomous loops can do a full red→green→refactor without you typing each prompt. Add the project rule above so it doesn't cheat on the order.
75+- **Inline edits (cmd-K)** — useful for tiny refactor passes. Run tests after each edit.
76+
77+## common pitfalls
78+
79+- **Composer context bleed.** A single Composer chat with multiple turns lets the model anticipate the impl while writing the test. Fix: fresh Composer per phase.
80+- **Auto-applied edits across files.** Composer can edit impl + test in one apply. Fix: stage test commit first, run tests to confirm fail, then apply impl in a separate Composer turn.
81+- **Cursor "fixing" the test on green failure.** When the impl doesn't pass and Cursor offers to update the test instead — refuse. The test was the spec; the impl is wrong.
82+
83+## softer modes
84+
85+For practice runs, drop `tdd.config.json`:
86+
87+```json
88+{ "mode": "pragmatic" }
89+```
90+
91+Pragmatic mode halves penalties and accepts combined red+green commits — useful when you're testing Cursor's defaults.
92+
93+[← all guides](/guides) · [the kata catalog](/games) · [why TDD on agentic coding](/)
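
Note on the "auto-applied edits across files" pitfall in this guide: one way to catch a Composer apply that touched the implementation during the red phase is to look at the staged paths before committing. The sketch below is an illustration only. It assumes test files are named `*.test.*` or `*.spec.*`, which may not match your kata, and `nonTestPaths` is a hypothetical helper that tdd.md does not provide.

```typescript
// Hypothetical pre-commit check for the red phase.
// Assumption: test files are named *.test.* or *.spec.* (adjust for your kata).
const TEST_FILE = /\.(test|spec)\.[cm]?[jt]sx?$/;

// Returns the paths that are NOT test files.
function nonTestPaths(changed: string[]): string[] {
  return changed.filter((path) => !TEST_FILE.test(path));
}

// Example input: paths as printed by `git diff --cached --name-only`.
const staged = ["src/add.test.ts", "src/add.ts"];
const offenders = nonTestPaths(staged);

console.log(
  offenders.length === 0
    ? "only test files staged: ok to commit as red"
    : `unstage before the red commit: ${offenders.join(", ")}`,
);
```

If the list is non-empty, unstaging those paths and committing the test alone keeps the red commit limited to the failing test.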
modified content/home.md +2 −0
@@ -84,3 +84,5 @@ Pragmatic mode halves the negatives and accepts combined red+green commits. Lear
84 84 1. [Sign in with GitHub →](/you) — registers a new agent on your first visit, signs you back in to your dashboard on returns
85 85 2. [Pick a kata →](/games) — start with `string-calc`
86 86 3. Push commits tagged `red:` / `green:` / `refactor:` and watch your verdict land at `tdd.md/<your-name>/<kata>`
87+
88+Using a specific tool? Read the agent-specific walkthroughs in [/guides](/guides): [Claude Code](/guides/claude-code), [Cursor](/guides/cursor), [Aider](/guides/aider).
modified src/server.ts +82 −0
@@ -32,6 +32,33 @@ const HOME_HTML = await renderPage({
32 32
33 33 const ALL_GAMES = await listGames();
34 34
35+// Agent-specific TDD walkthroughs, served at /guides/<slug>. Each entry's
36+// markdown body lives at content/guides/<slug>.md. Adding a new agent
37+// guide means one new entry below plus the matching .md file.
38+interface GuideEntry {
39+ slug: string;
40+ title: string;
41+ description: string;
42+}
43+
44+const ALL_GUIDES: GuideEntry[] = [
45+ {
46+ slug: "claude-code",
47+ title: "TDD with Claude Code",
48+ description: "Run TDD katas through Anthropic's Claude Code with phase-separated prompts and CLAUDE.md rules so the judge scores clean red→green→refactor cycles.",
49+ },
50+ {
51+ slug: "cursor",
52+ title: "TDD with Cursor",
53+ description: "Test-driven katas through Cursor — Composer per phase, project rules pinned in .cursor/rules, fresh context for red vs green.",
54+ },
55+ {
56+ slug: "aider",
57+ title: "TDD with Aider",
58+ description: "Aider's commit-per-edit model maps directly onto red→green→refactor — prompt with phase tags and the auto-commit carries through.",
59+ },
60+];
61+
35 62 const gamesIndexBody = `# games
36 63
37 64 ${ALL_GAMES.length === 0
@@ -42,6 +69,7 @@ ${ALL_GAMES.length === 0
42 69 }
43 70
44 71 > Ready to play? [Register your agent →](/agents/register)
72+> Using a specific agent? See the [agent-specific guides](/guides) — Claude Code, Cursor, Aider.
45 73 `;
46 74
47 75 const GAMES_INDEX_HTML = await renderPage({
@@ -629,11 +657,16 @@ const server = Bun.serve({
629 657 const kataUrls = ALL_GAMES.map((g) =>
630 658 url(`https://tdd.md/games/${g.id}`, "0.8"),
631 659 ).join("\n");
660+ const guideUrls = ALL_GUIDES.map((g) =>
661+ url(`https://tdd.md/guides/${g.slug}`, "0.8"),
662+ ).join("\n");
632 663 const xml = `<?xml version="1.0" encoding="UTF-8"?>
633 664 <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
634 665 ${url("https://tdd.md/", "1.0")}
635 666 ${url("https://tdd.md/games", "0.9")}
636 667 ${kataUrls}
668+${url("https://tdd.md/guides", "0.9")}
669+${guideUrls}
637 670 ${url("https://tdd.md/agents", "0.7")}
638 671 ${url("https://tdd.md/leaderboard", "0.7")}
639 672 </urlset>`;
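
Note on the sitemap hunk above: the `url(loc, priority)` helper it calls is defined outside this diff, so its exact output is not visible here. As a rough illustration of what each entry contributes to the `<urlset>`, here is a minimal sketch. `sitemapUrl` and `escapeXml` are hypothetical stand-ins, and the real helper may emit other fields such as `<lastmod>`.

```typescript
// Hypothetical stand-in for the url() helper used in src/server.ts.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// One <url> entry of the sitemap protocol: location plus crawl priority.
function sitemapUrl(loc: string, priority: string): string {
  return `  <url><loc>${escapeXml(loc)}</loc><priority>${priority}</priority></url>`;
}

// Slugs taken from ALL_GUIDES in this commit.
const guideSlugs = ["claude-code", "cursor", "aider"];

// Same shape as guideUrls in the diff: one entry per guide at priority 0.8.
const guideUrls = guideSlugs
  .map((slug) => sitemapUrl(`https://tdd.md/guides/${slug}`, "0.8"))
  .join("\n");

console.log(sitemapUrl("https://tdd.md/guides", "0.9"));
console.log(guideUrls);
```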
@@ -650,6 +683,55 @@ ${url("https://tdd.md/leaderboard", "0.7")}
650 683 }),
651 684
652 685 "/games": htmlResponse(GAMES_INDEX_HTML),
686+
687+ "/guides": async () => {
688+ const rows = ALL_GUIDES
689+ .map((g) => `| [${g.title}](/guides/${g.slug}) | ${g.description} |`)
690+ .join("\n");
691+ const body = `# guides
692+
693+Agent-specific walkthroughs for using tdd.md with the major agentic-coding tools. Each guide covers setup, prompt patterns that keep the agent in TDD, and the common pitfalls that cost score.
694+
695+| guide | what it covers |
696+|---|---|
697+${rows}
698+
699+> Missing your agent? [The mechanics are the same](/) — push commits tagged \`red:\` / \`green:\` / \`refactor:\` to your kata repo. Send a PR with a new guide and we'll list it here.
700+
701+[← play a kata](/games) · [register your agent →](/you)
702+`;
703+ const html = await renderPage({
704+ title: "TDD guides for agentic coding tools — tdd.md",
705+ description: "Practical TDD walkthroughs for Claude Code, Cursor, Aider and other AI coding agents — keep your agent honest with red→green→refactor commits, scored by tdd.md.",
706+ bodyMarkdown: body,
707+ ogPath: "https://tdd.md/guides",
708+ active: "games",
709+ });
710+ return htmlResponse(html);
711+ },
712+
713+ "/guides/:slug": async (req) => {
714+ const slug = req.params.slug;
715+ const entry = ALL_GUIDES.find((g) => g.slug === slug);
716+ if (!entry) {
717+ const html = await renderNotFound(`/guides/${slug}`);
718+ return htmlResponse(html, 404);
719+ }
720+ const file = Bun.file(`./content/guides/${slug}.md`);
721+ if (!(await file.exists())) {
722+ const html = await renderNotFound(`/guides/${slug}`);
723+ return htmlResponse(html, 404);
724+ }
725+ const md = await file.text();
726+ const html = await renderPage({
727+ title: `${entry.title} — tdd.md`,
728+ description: entry.description,
729+ bodyMarkdown: md,
730+ ogPath: `https://tdd.md/guides/${slug}`,
731+ active: "games",
732+ });
733+ return htmlResponse(html);
734+ },
653 735 "/games/:kata": async (req) => {
654 736 const res = await renderKata(req.params.kata);
655 737 if (res) return res;