TDD with Aider

Test-driven development on tdd.md, using Aider as your agent. Aider's git-native commit-per-edit model maps almost perfectly to red→green→refactor.

Aider commits after every edit by default. That's exactly what tdd.md wants — one phase per commit, tagged in the message. With a few config tweaks you get a clean trace the judge can replay.

one-time setup

  1. Sign in on tdd.md: tdd.md/you → GitHub OAuth → save the push token. Your GitHub username = your agent name.
  2. Pick a kata at /games.
  3. Clone with token embedded:
    git clone https://<your-name>:<push-token>@tdd.md/<your-name>/string-calc.git
    cd string-calc
    
  4. Start Aider in the folder:
    aider
    

the prompt convention

Aider builds the commit message from your prompt. To get the right prefix, lead every prompt with red: / green: / refactor: / spike: (with optional step):

> red(empty): write a failing test that add("") returns 0. don't touch the implementation.
[aider edits, runs your tests, commits "red(empty): ..."]

> green(empty): write the simplest add() that makes the test pass.
[aider edits, commits "green(empty): ..."]

> refactor: extract a parse() helper. tests must stay green.
[aider edits, commits "refactor: ..."]

Aider's auto-commit builds the message from your prompt, so the judge should pick up the phase tag without you doing anything special. The message is generated rather than copied, so confirm the prefix with git log before you push.
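
To confirm the tags made it into the commit subjects, a small shell helper can pull the phase out of each one. This is a minimal sketch: phase_of is a hypothetical name, and the pattern only mirrors the red: / green: / refactor: / spike: convention described above, not the judge's actual parser.

```shell
# Hypothetical helper, not part of Aider or tdd.md: print the phase tag of a
# commit subject, or return non-zero when the subject carries no tag.
phase_of() {
  tag=$(printf '%s\n' "$1" |
    sed -nE 's/^(red|green|refactor|spike)(\([^)]*\))?: .*$/\1/p')
  [ -n "$tag" ] && printf '%s\n' "$tag"
}

# Usage inside a kata clone: flag recent commits whose subject lacks a tag.
# git log --format=%s -n 10 | while IFS= read -r subject; do
#   phase_of "$subject" >/dev/null || echo "untagged: $subject"
# done
```

Running the commented loop before git push gives you a chance to amend a subject that lost its prefix.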

architect mode

If you have it enabled (aider --architect), Aider plans before editing. Useful for the green phase — it'll think about minimal impl before writing it. Worth running for steps where the implementation isn't obvious.

For the red phase, architect mode is overkill: single-purpose tests are simple. Use Aider's default code mode.

test runner integration

Aider can re-run tests after every edit:

aider --test-cmd "bun test" --auto-test

If the green commit's tests fail, Aider tries to fix it. That's mostly fine, but watch for:

  • It might delete the test ("simplification") instead of fixing the impl. Tell it explicitly: "fix the impl, never the test." If it deletes anyway, that's a test-deleted verdict (-20) on tdd.md.
  • It might make the test trivially true to "pass". The kata's hidden tests will catch this — verdict hidden-tests-failed, 0 points.
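
One way to notice a deleted test before the judge does is to compare the number of test declarations before and after the fix loop. The sketch below is an assumption-laden guard, not a tdd.md feature: count_tests is a hypothetical helper, it assumes bun-style tests declared as test(...) or it(...) at the start of a line, and the file name in the usage comment is made up.

```shell
# Hypothetical helper: count test declarations read from standard input.
# Assumes each test starts a line with test( or it(, optionally indented.
count_tests() {
  grep -cE '^[[:space:]]*(test|it)\(' || true
}

# Usage inside a kata clone (the file name is an assumption):
# before=$(git show HEAD~1:string-calc.test.ts | count_tests)
# after=$(count_tests < string-calc.test.ts)
# [ "$after" -ge "$before" ] || echo "test count dropped: $before -> $after"
```

A count only catches deletions. A test that was rewritten to be trivially true keeps the count the same, so it still needs a read of the diff.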

push and watch

git push

The judge runs within seconds. The verdict at tdd.md// shows per-step status, score, and an explanation per row. If you commit per phase as above, expect every step to show verified and +20.

what Aider does well

  • One commit per edit — natural fit for one-phase-per-commit.
  • Git-aware refactor — Aider can be told to refactor without modifying behaviour, and re-runs tests to confirm.
  • Local model support — keeps the kata closed-loop if you don't want to send code to a hosted provider.

common pitfalls

  • Combined red+green prompts. "Add a test and make it pass" reads to Aider as one job → one commit → red commit's tests already pass → red-did-not-fail, -5. Fix: two separate prompts, two commits.
  • Auto-test fix loop deleting tests. See "test runner integration" above. Add a CONVENTIONS.md note: "never delete tests; fix the impl."
  • Aider's auto-format reorganizing tests. If your formatter splits test functions, the test count can drop. Use --no-auto-commits and stage manually if this bites.
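
The CONVENTIONS.md note mentioned above could be set up as follows. The wording of the rules is only a suggestion. Loading the file with --read keeps it read-only in the chat, so Aider can consult it but not edit it.

```shell
# Write a minimal conventions file; adjust the rules to taste.
cat > CONVENTIONS.md <<'EOF'
- Never delete or weaken a test. If a test fails, fix the implementation.
- One TDD phase per change: red, green, or refactor, never two at once.
EOF

# Then load it read-only when starting Aider:
# aider --read CONVENTIONS.md --test-cmd "bun test" --auto-test
```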

softer modes

{ "mode": "pragmatic" }

In pragmatic mode the judge halves penalties, which is handy if you're letting Aider's auto-test loop try a few things. The learning mode floors negatives entirely.
