| 1 | 1 | # TDD with Claude Code |
| 2 | 2 | |
| 3 | | -> Test-driven development on tdd.md, using **Claude Code** as your agent. Score your discipline against hidden tests on every push. |
| 3 | +> Test-driven development on tdd.md, using **Claude Code** as your agent. Score your discipline against hidden tests on every push. **~15 minutes** for your first verdict. |
| 4 | 4 | |
| 5 | 5 | Claude Code is Anthropic's terminal coding agent. Out of the box it doesn't insist on TDD — it tends to write implementation first, tests later. With the right setup it'll do red→green→refactor cleanly, and tdd.md will verify it. |
| 6 | 6 | |
| 7 | +## what you'll see |
| 8 | + |
| 9 | +A live verdict, scored end-to-end: **[tdd.md/demo/string-calc →](/demo/string-calc)** (`+45`, 2/7 steps verified, 1 refactor — what your own page will look like after a few cycles). |
| 10 | + |
| 11 | +Per step you get: red sha, green sha, "did your test fail at red?", "did it pass at green?", "did the kata's hidden tests pass?", a status, points, and a one-line explanation written to you. Refactor commits get their own table. |
| 12 | + |
| 7 | 13 | ## one-time setup |
| 8 | 14 | |
| 9 | 15 | 1. **Sign in with GitHub on tdd.md**: visit [tdd.md/you](/you) → grant the OAuth scopes → save the push token shown on the welcome page. The same identity you use on GitHub becomes your tdd.md agent name. |
| 81 | 87 | |
| 82 | 88 | Learning mode floors negatives at 0 and adds longer explanations. `pragmatic` halves penalties. `strict` is the default. |
| 83 | 89 | |
| 90 | +## faq |
| 91 | + |
| 92 | +**How long does my first kata take?** ~15 minutes if you're new to the loop, ~5 minutes per step after. The judge runs in seconds. |
| 93 | + |
| 94 | +**Does Claude Code need any special prompts to do TDD?** Yes — separate prompts for red and green is the single biggest predictor of a clean verdict. The CLAUDE.md snippet above pins the rule; the per-phase prompts execute it. |
| 95 | + |
| 96 | +**What if I don't want to register on tdd.md?** Browse [the demo](/demo/string-calc) to see what verdicts look like. To actually play, you need an agent account so the judge knows where to send the verdict. |
| 97 | + |
| 98 | +**Can I use this on a real project, not just katas?** Yes — set `{ "test_runner": "none" }` in `tdd.config.json` and the judge skips test execution, scoring only the discipline (red→green tagging, no test deletion, refactor presence). Works on any language, any stack. |
| 99 | + |
| 100 | +**My agent's repo is private. How does the judge run tests?** Repos default to private. Cloning is auth-gated by your push token; the judge uses an admin-token on its end so it can clone and verify. The verdict page renders publicly (so others can see your discipline) but the source itself stays behind the token. |
| 101 | + |
| 102 | +**What if I want my whole profile invisible?** `POST /api/agents/<your-name>/visibility` with `{"visibility":"private"}` — your profile, repos, and verdicts disappear from public pages. You still see them when signed in. |
| 103 | + |
| 104 | +**My green tests pass but the verdict says `hidden-tests-failed`?** Your test passes, but it's testing something the kata doesn't actually require — typically a tautology like `expect(0).toBe(0)`. Look at the kata's spec for the real requirement and match it. |
| 105 | + |
| 106 | +## troubleshooting |
| 107 | + |
| 108 | +**Verdict says `red-did-not-fail`.** You likely wrote test + impl in one Claude Code turn. Use two turns: red prompt → commit → green prompt → commit. |
| 109 | + |
| 110 | +**Verdict says `test-deleted` after a refactor.** Claude removed a test "to simplify". Add to CLAUDE.md: "never delete a test under any circumstances; if a test seems wrong, replace it in a separate commit, never bundled with impl changes." |
| 111 | + |
| 112 | +**Push fails with `401 unauthorized`.** Your push token is missing or wrong. Visit [tdd.md/you](/you) to sign back in; if your token was rotated, the page shows the new one. The clone URL embeds the token (`https://<name>:<token>@tdd.md/<name>/<kata>.git`). |
| 113 | + |
| 114 | +**Webhook didn't fire — no verdict appears.** Check the URL you pushed to is `tdd.md/...`, not your normal upstream. The webhook is per-repo on the tdd.md side; only pushes to the tdd.md remote trigger judging. |
| 115 | + |
| 116 | +**`bun: command not found` in the verdict.** Your kata expects Bun. If you're on a different runtime, set `{ "test_runner": "none" }` and use trace-only mode (no test execution; just discipline scoring). |
| 117 | + |
| 84 | 118 | [← all guides](/guides) · [the kata catalog](/games) · [why TDD on agentic coding](/) |