syntaxai/tdd.md · commit 52b2a11

Agent-specific TDD guides at /guides/{claude-code,cursor,aider}

SEO bet: traffic for "TDD agentic coding" lands on the homepage, but
people searching specifically for "TDD with Claude Code", "Cursor TDD
workflow", or "Aider red-green-refactor" want a how-to, not a manifesto.

Three guides cover the major agentic-coding tools:

- /guides/claude-code — CLAUDE.md rules, phase-separated prompts to
  avoid Claude collapsing red+green into one turn, push-token-embedded
  clone URL, common pitfalls (single-prompt red+green, tautological
  tests, refactor-time test deletion).
- /guides/cursor — .cursor/rules/tdd.md, fresh Composer per phase,
  Agent mode caveats, the "Cursor wants to fix the test instead of
  the impl" trap.
- /guides/aider — auto-commit phase prefix convention, --auto-test
  pitfalls (deletes tests to "simplify"), architect mode for green.

Each guide ends with a mode-toggle hint (learning/pragmatic) and
backlinks to /guides, /games, and the homepage.

The /guides index lists all three with one-line descriptions and a
"missing your agent? PRs welcome" footer. Guides go in the sitemap
with priority 0.8 (same as kata pages); the games index now mentions
the guides; the homepage's "play" section ends with a guide row so
new visitors see the path immediately.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

author: syntaxai <[email protected]>
date: 2026-05-07 12:04:17 +01:00
parent: fd09843
commit: 52b2a11a1cad024c7842e9f0dae74e198ac0cd01

5 files changed · +345 −0

added content/guides/aider.md +84 −0

@@ -0,0 +1,84 @@
	1	+# TDD with Aider
	2	+
	3	+> Test-driven development on tdd.md, using Aider as your agent. Aider's git-native commit-per-edit model maps almost perfectly to red→green→refactor.
	4	+
	5	+Aider commits after every edit by default. That's exactly what tdd.md wants — one phase per commit, tagged in the message. With a few config tweaks you get a clean trace the judge can replay.
	6	+
	7	+## one-time setup
	8	+
	9	+1. Sign in on tdd.md: [tdd.md/you](/you) → GitHub OAuth → save the push token. Your GitHub username = your agent name.
	10	+2. Pick a kata at [/games](/games).
	11	+3. Clone with token embedded:
	12	+ ```
	13	+ git clone https://<your-name>:<push-token>@tdd.md/<your-name>/string-calc.git
	14	+ cd string-calc
	15	+ ```
	16	+4. Start Aider in the folder:
	17	+ ```
	18	+ aider
	19	+ ```
	20	+
	21	+## the prompt convention
	22	+
	23	+Aider writes the commit message itself: by default its weak model summarises the diff and the chat history. Leading every prompt with `red:` / `green:` / `refactor:` / `spike:` (with optional step) steers the message toward the right prefix:
	24	+
	25	+```
	26	+> red(empty): write a failing test that add("") returns 0. don't touch the implementation.
	27	+[aider edits, runs your tests, commits "red(empty): ..."]
	28	+
	29	+> green(empty): write the simplest add() that makes the test pass.
	30	+[aider edits, commits "green(empty): ..."]
	31	+
	32	+> refactor: extract a parse() helper. tests must stay green.
	33	+[aider edits, commits "refactor: ..."]
	34	+```
	35	+
	36	+Because the message is generated rather than copied from your prompt, check `git log --oneline` before pushing. If a phase tag is missing, fix it with `git commit --amend`, or pass a `--commit-prompt` that tells Aider to keep the prefix from your prompt.
	37	+
	38	+## architect mode
	39	+
	40	+If you have it enabled (`aider --architect`), Aider plans before editing. Useful for the green phase — it'll think about minimal impl before writing it. Worth running for steps where the implementation isn't obvious.
	41	+
	42	+For the red phase, architect mode is overkill — single-purpose tests are simple. Use plain edit mode.
	43	+
	44	+## test runner integration
	45	+
	46	+Aider can re-run tests after every commit:
	47	+
	48	+```
	49	+aider --test-cmd "bun test" --auto-test
	50	+```
	51	+
	52	+If the green commit's tests fail, Aider tries to fix it. That's mostly fine, but watch for:
	53	+- It might delete the test ("simplification") instead of fixing the impl. Tell it explicitly: "fix the impl, never the test." If it deletes anyway, that's a `test-deleted` verdict (-20) on tdd.md.
	54	+- It might make the test trivially true to "pass". The kata's hidden tests will catch this — verdict `hidden-tests-failed`, 0 points.
	55	+
	56	+## push and watch
	57	+
	58	+```
	59	+git push
	60	+```
	61	+
	62	+The judge runs within seconds. Verdict at [tdd.md/<your-name>/<kata>](/agents) shows per-step status, score, and an explanation per row. If you commit-per-phase as above, expect every step to show `verified` and +20.
	63	+
	64	+## what Aider does well
	65	+
	66	+- One commit per edit — natural fit for one-phase-per-commit.
	67	+- Git-aware refactor — Aider can be told to refactor without modifying behaviour, and re-runs tests to confirm.
	68	+- Local model support — keeps the kata closed-loop if you don't want to send code to a hosted provider.
	69	+
	70	+## common pitfalls
	71	+
	72	+- Combined red+green prompts. "Add a test and make it pass" reads to Aider as one job → one commit → red commit's tests already pass → `red-did-not-fail`, -5. Fix: two separate prompts, two commits.
	73	+- Auto-test fix loop deleting tests. See "test runner integration" above. Add a CONVENTIONS.md note, "never delete tests; fix the impl", and load it with `aider --read CONVENTIONS.md` so it is part of every chat.
	74	+- Aider's auto-format reorganizing tests. If your formatter splits test functions, the test count can drop. Use `--no-auto-commits` and commit manually if this bites.
	75	+
	76	+## softer modes
	77	+
	78	+```json
	79	+{ "mode": "pragmatic" }
	80	+```
	81	+
	82	+Put this in a `tdd.config.json` in your repo. In pragmatic mode the judge halves penalties, which is handy if you're letting Aider's auto-test loop try a few things. `learning` floors negatives at 0.
	83	+
	84	+[← all guides](/guides) · [the kata catalog](/games) · [why TDD on agentic coding](/)
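
The prompt convention in this guide depends on every commit subject starting with a phase tag. Below is a minimal sketch of a local check you could run before pushing. It assumes the tag grammar is exactly `phase(step): summary` with an optional step. `parsePhaseTag` and `PhaseTag` are hypothetical names, and the judge's real parser is not part of this commit.

```typescript
// Hypothetical local helper: NOT the judge's parser, which this commit does not show.
type Phase = "red" | "green" | "refactor" | "spike";

interface PhaseTag {
  phase: Phase;
  step: string | null; // e.g. "empty" in "red(empty): ..."; null when no step is given
  summary: string;
}

// Accepts "red(empty): msg", "green: msg", "refactor: msg", "spike: topic".
const PHASE_RE = /^(red|green|refactor|spike)(?:\(([^)]+)\))?:\s*(.*)$/;

function parsePhaseTag(message: string): PhaseTag | null {
  // Only the subject (first line) of the commit message is inspected.
  const subject = (message.split("\n", 1)[0] ?? "").trim();
  const m = PHASE_RE.exec(subject);
  if (!m) return null;
  return {
    phase: m[1] as Phase,
    step: m[2] ?? null,
    summary: m[3] ?? "",
  };
}
```

A subject such as `feat: add parser` returns `null` under this sketch, which is the case worth catching when an agent writes its own commit messages.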

added content/guides/claude-code.md +84 −0

@@ -0,0 +1,84 @@
	1	+# TDD with Claude Code
	2	+
	3	+> Test-driven development on tdd.md, using Claude Code as your agent. Score your discipline against hidden tests on every push.
	4	+
	5	+Claude Code is Anthropic's terminal coding agent. Out of the box it doesn't insist on TDD — it tends to write implementation first, tests later. With the right setup it'll do red→green→refactor cleanly, and tdd.md will verify it.
	6	+
	7	+## one-time setup
	8	+
	9	+1. Sign in with GitHub on tdd.md: visit [tdd.md/you](/you) → grant the OAuth scopes → save the push token shown on the welcome page. The same identity you use on GitHub becomes your tdd.md agent name.
	10	+2. Pick a kata at [/games](/games). Start with `string-calc`.
	11	+3. Clone your kata repo locally:
	12	+ ```
	13	+ git clone https://<your-name>:<push-token>@tdd.md/<your-name>/string-calc.git
	14	+ cd string-calc
	15	+ ```
	16	+4. Open Claude Code in that directory.
	17	+
	18	+## per-kata workflow
	19	+
	20	+In your CLAUDE.md (project root), add this snippet so Claude knows the rules:
	21	+
	22	+```md
	23	+This is a TDD kata. The judge at tdd.md scores discipline.
	24	+
	25	+Cycle: write a FAILING test, commit `red(<step>): <message>`, then write
	26	+the simplest impl that makes it pass, commit `green(<step>): <message>`.
	27	+Optional `refactor: <message>` between steps if structure can improve
	28	+without changing behaviour.
	29	+
	30	+Never write impl before its failing test. Never delete a test.
	31	+```
	32	+
	33	+CLAUDE.md is read as context on every Claude Code invocation — pinning the rule there beats restating it in every prompt.
	34	+
	35	+## prompt patterns
	36	+
	37	+Step 1 (red phase):
	38	+> "We're starting step `<step-id>` of the kata. Write a single failing test for the requirement, in `<test-file>`. Don't touch the implementation yet. After you write the test, run it to confirm it fails."
	39	+
	40	+Step 2 (green phase, separate prompt):
	41	+> "The test fails as expected. Now write the simplest implementation in `<impl-file>` that makes it pass — nothing more. Run the tests to confirm they pass."
	42	+
	43	+Step 3 (optional refactor):
	44	+> "Tests pass. Refactor `<impl-file>` for clarity, but don't change behaviour. Run tests after each edit."
	45	+
	46	+Each prompt is a separate Claude Code turn — that creates the natural context separation between red and green that pure-TDD discipline demands. Combining them in one prompt is the most common cause of `red-did-not-fail` on tdd.md.
	47	+
	48	+## commit by phase
	49	+
	50	+After each phase Claude finishes, commit with the prefix the judge looks for:
	51	+
	52	+```
	53	+git commit -m "red(empty): empty string returns 0"
	54	+git commit -m "green(empty): return 0 directly"
	55	+git commit -m "refactor: extract parse() helper"
	56	+```
	57	+
	58	+`spike: <topic>` is also valid — for exploration that doesn't score and doesn't penalize.
	59	+
	60	+## push and watch
	61	+
	62	+```
	63	+git push
	64	+```
	65	+
	66	+Within seconds the judge clones, replays your commits, runs the hidden tests, and posts the verdict at [tdd.md/<your-name>/<kata>](/agents). The page shows status per step, score, and a one-line explanation per row.
	67	+
	68	+## common pitfalls
	69	+
	70	+- Single-prompt red+green. Claude writes both files in one turn → red commit's tests never failed → `red-did-not-fail`, -5. Solution: two separate Claude Code turns, two separate commits.
	71	+- Tautological tests. Claude writes `expect(true).toBe(true)` to "pass" the requirement → hidden tests catch it → `hidden-tests-failed`, 0 points. Solution: make the test assert the actual requirement (the kata's spec page is authoritative).
	72	+- Test deletion during refactor. Claude tidies up by removing tests → `test-deleted`, -20. Solution: tell Claude in CLAUDE.md "never delete tests".
	73	+
	74	+## modes
	75	+
	76	+If you want a softer judge while learning Claude Code's TDD habits, drop a `tdd.config.json` in your repo:
	77	+
	78	+```json
	79	+{ "mode": "learning" }
	80	+```
	81	+
	82	+Learning mode floors negatives at 0 and adds longer explanations. `pragmatic` halves penalties. `strict` is the default.
	83	+
	84	+[← all guides](/guides) · [the kata catalog](/games) · [why TDD on agentic coding](/)
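
The three modes described in this guide can be read as a transform on per-step points. Below is a minimal sketch using only the verdicts and point values quoted in the guides. It assumes penalties are adjusted per step and that halving is not rounded. `scoreVerdict` and `STRICT_POINTS` are hypothetical names, and the judge's actual scoring code is not shown in this commit.

```typescript
// Hypothetical model of the modes; the real judge may score differently.
type Mode = "strict" | "pragmatic" | "learning";

// Strict-mode points as quoted in the guides. Other verdicts are not covered.
const STRICT_POINTS: Record<string, number> = {
  "verified": 20,
  "red-did-not-fail": -5,
  "test-deleted": -20,
  "hidden-tests-failed": 0,
};

function scoreVerdict(verdict: string, mode: Mode = "strict"): number {
  const base = STRICT_POINTS[verdict];
  if (base === undefined) throw new Error(`unknown verdict: ${verdict}`);
  if (base >= 0) return base;               // rewards and zeroes are unchanged
  if (mode === "pragmatic") return base / 2; // assumption: halved, no rounding
  if (mode === "learning") return 0;         // assumption: floored per step
  return base;                               // strict is the default
}
```

Pragmatic mode's acceptance of combined red+green commits is a separate rule and is not modelled here.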

added content/guides/cursor.md +93 −0

@@ -0,0 +1,93 @@
	1	+# TDD with Cursor
	2	+
	3	+> Test-driven development on tdd.md, using Cursor as your agent. Push commits, get a discipline score back within seconds.
	4	+
	5	+Cursor's strengths for TDD: the Composer (multi-file edits), agent mode, and explicit file-context control let you separate the red and green phases more cleanly than the chat sidebar alone. The tdd.md judge handles the rest — runs the tests, runs the kata's hidden tests, posts a verdict.
	6	+
	7	+## one-time setup
	8	+
	9	+1. Sign in on tdd.md: visit [tdd.md/you](/you) → GitHub OAuth → save your push token from the welcome page. Your GitHub username becomes your agent name.
	10	+2. Pick a kata at [/games](/games).
	11	+3. Clone the kata locally with the push token embedded so future pushes don't prompt:
	12	+ ```
	13	+ git clone https://<your-name>:<push-token>@tdd.md/<your-name>/string-calc.git
	14	+ cd string-calc
	15	+ ```
	16	+4. Open the folder in Cursor.
	17	+
	18	+## per-kata setup
	19	+
	20	+In `.cursor/rules/tdd.md` (Cursor's project rules) add:
	21	+
	22	+```md
	23	+This is a TDD kata. The judge at tdd.md scores commit discipline.
	24	+
	25	+Cycle: write a FAILING test, commit `red(<step>): <message>`, then
	26	+the simplest impl, commit `green(<step>): <message>`. Optional
	27	+`refactor: <message>` between steps.
	28	+
	29	+Never write impl before its failing test. Never delete a test.
	30	+Each phase is its own commit.
	31	+```
	32	+
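	33	+Cursor's project rules persist across chats and Composer sessions. Pinning the discipline here is more reliable than putting it in every prompt. If Cursor does not pick the file up, check whether your version expects rule files to use the `.mdc` extension.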
	34	+
	35	+## workflow with Composer
	36	+
	37	+For each step:
	38	+
	39	+Red phase. Open Composer (cmd-I), include only the test file. Prompt:
	40	+> "Write a single failing test for `<requirement>`. Don't edit the implementation file."
	41	+
	42	+Apply the change, run the test to confirm it fails, then:
	43	+```
	44	+git commit -m "red(<step>): <one-line summary>"
	45	+```
	46	+
	47	+Green phase. Start a fresh Composer (don't continue the previous one — fresh context). Include the impl file. Prompt:
	48	+> "Make this test pass with the simplest possible code: <test contents>."
	49	+
	50	+Apply, run tests:
	51	+```
	52	+git commit -m "green(<step>): <one-line summary>"
	53	+```
	54	+
	55	+Refactor (optional). Composer with both files included:
	56	+> "Refactor without changing behaviour. Tests must still pass."
	57	+```
	58	+git commit -m "refactor: <one-line summary>"
	59	+```
	60	+
	61	+Fresh-Composer-per-phase is what keeps Cursor honest. If you continue a single Composer thread, the model sees the upcoming impl plan while writing the "red" test — and the test stops failing for the right reason.
	62	+
	63	+## push and watch
	64	+
	65	+```
	66	+git push
	67	+```
	68	+
	69	+Verdict at [tdd.md/<your-name>/<kata>](/agents) within seconds: per-step status, score, one-line explanation, and a refactor sub-table.
	70	+
	71	+## what Cursor does well
	72	+
	73	+- Multi-file Composer with explicit context — keep test files and impl files in separate Composer turns to enforce the red/green separation.
	74	+- Agent mode — autonomous loops can do a full red→green→refactor without you typing each prompt. Add the project rule above so it doesn't cheat on the order.
	75	+- Inline edits (cmd-K) — useful for tiny refactor passes. Run tests after each edit.
	76	+
	77	+## common pitfalls
	78	+
	79	+- Composer context bleed. A single Composer chat with multiple turns lets the model anticipate the impl while writing the test. Fix: fresh Composer per phase.
	80	+- Auto-applied edits across files. Composer can edit impl + test in one apply. Fix: commit the test first, run it to confirm it fails, then apply the impl in a separate Composer turn.
	81	+- Cursor "fixing" the test on green failure. When the impl doesn't pass and Cursor offers to update the test instead — refuse. The test was the spec; the impl is wrong.
	82	+
	83	+## softer modes
	84	+
	85	+For practice runs, drop `tdd.config.json`:
	86	+
	87	+```json
	88	+{ "mode": "pragmatic" }
	89	+```
	90	+
	91	+Pragmatic mode halves penalties and accepts combined red+green commits — useful when you're testing Cursor's defaults.
	92	+
	93	+[← all guides](/guides) · [the kata catalog](/games) · [why TDD on agentic coding](/)
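
The "each phase is its own commit" rule in this guide implies an ordering: every `green(<step>)` should follow a `red(<step>)` for the same step. Below is a minimal sketch of a pre-push sanity check over commit subjects, oldest first. It assumes steps are always named in parentheses. `findOrderViolations` is a hypothetical helper and does not reproduce the judge's replay logic.

```typescript
// Hypothetical pre-push check; it only inspects subjects and never runs tests.
interface Violation {
  index: number;   // position in the input list (0 = oldest commit)
  subject: string;
  reason: string;
}

const STEP_RE = /^(red|green)\(([^)]+)\):/;

function findOrderViolations(subjects: string[]): Violation[] {
  const openReds = new Set<string>(); // steps with a red commit awaiting green
  const violations: Violation[] = [];

  subjects.forEach((subject, index) => {
    const m = STEP_RE.exec(subject.trim());
    if (!m) return; // refactor:, spike:, and untagged subjects are ignored here
    const phase = m[1];
    const step = m[2] ?? "";
    if (phase === "red") {
      openReds.add(step);
    } else if (openReds.has(step)) {
      openReds.delete(step); // green closes the matching red
    } else {
      violations.push({ index, subject, reason: `green(${step}) has no preceding red(${step})` });
    }
  });

  return violations;
}
```

Feeding it the output of `git log --reverse --format=%s`, split on newlines, is one way to use it.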

modified content/home.md +2 −0

@@ -84,3 +84,5 @@ Pragmatic mode halves the negatives and accepts combined red+green commits. Lear
84	84	1. [Sign in with GitHub →](/you) — registers a new agent on your first visit, signs you back in to your dashboard on returns
85	85	2. [Pick a kata →](/games) — start with `string-calc`
86	86	3. Push commits tagged `red:` / `green:` / `refactor:` and watch your verdict land at `tdd.md/<your-name>/<kata>`
	87	+
	88	+Using a specific tool? Read the agent-specific walkthroughs in [/guides](/guides): [Claude Code](/guides/claude-code), [Cursor](/guides/cursor), [Aider](/guides/aider).

modified src/server.ts +82 −0

@@ -32,6 +32,33 @@ const HOME_HTML = await renderPage({
32	32
33	33	const ALL_GAMES = await listGames();
34	34
	35	+// Agent-specific TDD walkthroughs, served at /guides/<slug>. Each entry's
	36	+// markdown body lives at content/guides/<slug>.md. Adding a new agent
	37	+// guide means one new entry below plus the .md file.
	38	+interface GuideEntry {
	39	+ slug: string;
	40	+ title: string;
	41	+ description: string;
	42	+}
	43	+
	44	+const ALL_GUIDES: GuideEntry[] = [
	45	+ {
	46	+ slug: "claude-code",
	47	+ title: "TDD with Claude Code",
	48	+ description: "Run TDD katas through Anthropic's Claude Code with phase-separated prompts and CLAUDE.md rules so the judge scores clean red→green→refactor cycles.",
	49	+ },
	50	+ {
	51	+ slug: "cursor",
	52	+ title: "TDD with Cursor",
	53	+ description: "Test-driven katas through Cursor — Composer per phase, project rules pinned in .cursor/rules, fresh context for red vs green.",
	54	+ },
	55	+ {
	56	+ slug: "aider",
	57	+ title: "TDD with Aider",
	58	+ description: "Aider's commit-per-edit model maps directly onto red→green→refactor — prompt with phase tags and the auto-commit carries through.",
	59	+ },
	60	+];
	61	+
35	62	const gamesIndexBody = `# games
36	63
37	64	${ALL_GAMES.length === 0
@@ -42,6 +69,7 @@ ${ALL_GAMES.length === 0
42	69	}
43	70
44	71	> Ready to play? [Register your agent →](/agents/register)
	72	+> Using a specific agent? See the [agent-specific guides](/guides) — Claude Code, Cursor, Aider.
45	73	`;
46	74
47	75	const GAMES_INDEX_HTML = await renderPage({
@@ -629,11 +657,16 @@ const server = Bun.serve({
629	657	const kataUrls = ALL_GAMES.map((g) =>
630	658	url(`https://tdd.md/games/${g.id}`, "0.8"),
631	659	).join("\n");
	660	+ const guideUrls = ALL_GUIDES.map((g) =>
	661	+ url(`https://tdd.md/guides/${g.slug}`, "0.8"),
	662	+ ).join("\n");
632	663	const xml = `<?xml version="1.0" encoding="UTF-8"?>
633	664	<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
634	665	${url("https://tdd.md/", "1.0")}
635	666	${url("https://tdd.md/games", "0.9")}
636	667	${kataUrls}
	668	+${url("https://tdd.md/guides", "0.9")}
	669	+${guideUrls}
637	670	${url("https://tdd.md/agents", "0.7")}
638	671	${url("https://tdd.md/leaderboard", "0.7")}
639	672	</urlset>`;
@@ -650,6 +683,55 @@ ${url("https://tdd.md/leaderboard", "0.7")}
650	683	}),
651	684
652	685	"/games": htmlResponse(GAMES_INDEX_HTML),
	686	+
	687	+ "/guides": async () => {
	688	+ const rows = ALL_GUIDES
	689	+ .map((g) => `\| [${g.title}](/guides/${g.slug}) \| ${g.description} \|`)
	690	+ .join("\n");
	691	+ const body = `# guides
	692	+
	693	+Agent-specific walkthroughs for using tdd.md with the major agentic-coding tools. Each guide covers setup, prompt patterns that keep the agent in TDD, and the common pitfalls that cost score.
	694	+
	695	+\| guide \| what it covers \|
	696	+\|---\|---\|
	697	+${rows}
	698	+
	699	+> Missing your agent? [The mechanics are the same](/) — push commits tagged \`red:\` / \`green:\` / \`refactor:\` to your kata repo. Send a PR with a new guide and we'll list it here.
	700	+
	701	+[← play a kata](/games) · [register your agent →](/you)
	702	+`;
	703	+ const html = await renderPage({
	704	+ title: "TDD guides for agentic coding tools — tdd.md",
	705	+ description: "Practical TDD walkthroughs for Claude Code, Cursor, Aider and other AI coding agents — keep your agent honest with red→green→refactor commits, scored by tdd.md.",
	706	+ bodyMarkdown: body,
	707	+ ogPath: "https://tdd.md/guides",
	708	+ active: "games",
	709	+ });
	710	+ return htmlResponse(html);
	711	+ },
	712	+
	713	+ "/guides/:slug": async (req) => {
	714	+ const slug = req.params.slug;
	715	+ const entry = ALL_GUIDES.find((g) => g.slug === slug);
	716	+ if (!entry) {
	717	+ const html = await renderNotFound(`/guides/${slug}`);
	718	+ return htmlResponse(html, 404);
	719	+ }
	720	+ const file = Bun.file(`./content/guides/${slug}.md`);
	721	+ if (!(await file.exists())) {
	722	+ const html = await renderNotFound(`/guides/${slug}`);
	723	+ return htmlResponse(html, 404);
	724	+ }
	725	+ const md = await file.text();
	726	+ const html = await renderPage({
	727	+ title: `${entry.title} — tdd.md`,
	728	+ description: entry.description,
	729	+ bodyMarkdown: md,
	730	+ ogPath: `https://tdd.md/guides/${slug}`,
	731	+ active: "games",
	732	+ });
	733	+ return htmlResponse(html);
	734	+ },
653	735	"/games/:kata": async (req) => {
654	736	const res = await renderKata(req.params.kata);
655	737	if (res) return res;
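
The sitemap hunk above calls a `url()` helper that is defined outside the lines shown. Below is a minimal sketch of what such a helper and the guide entries could look like. It assumes `url()` emits a `<url>` element with `<loc>` and `<priority>` children. `escapeXml` and `guideSitemapEntries` are hypothetical names introduced for illustration, and the real helper in src/server.ts may differ.

```typescript
// Hypothetical stand-ins; the real url() lives outside the hunks in this diff.
interface GuideEntry {
  slug: string;
  title: string;
  description: string;
}

// Escape the five XML special characters so a slug can never break the document.
function escapeXml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

function url(loc: string, priority: string): string {
  return `<url><loc>${escapeXml(loc)}</loc><priority>${priority}</priority></url>`;
}

// Index page at 0.9, each guide at 0.8, matching the priorities in this commit.
function guideSitemapEntries(guides: GuideEntry[]): string {
  const index = url("https://tdd.md/guides", "0.9");
  const pages = guides.map((g) => url(`https://tdd.md/guides/${g.slug}`, "0.8"));
  return [index, ...pages].join("\n");
}
```

Deriving the entries from the same `ALL_GUIDES`-style list that drives routing keeps the sitemap and the served pages from drifting apart.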
