syntaxai/tdd.md · commit a518670

Blog post: three constraints (TDD + tokens + SAMA) + SAMA rebrand

Combines three pieces that landed the same week into a single
argument that they compound:

  - obra/superpowers TDD SKILL.md (the iron law: failing test first;
    watch it fail; delete-and-restart on rationalizations)
  - Mishra's "23 token-saving tips for Claude Code" on Analytics
    Vidhya (8 May 2026): trim context, scope instructions, cap output,
    pick the cheap model
  - SAMA, rebranded from "Spalder Application Module Architecture" to
    "Sorted, Architecture, Modeled, Atomic" - same acronym, same
    convention, expansion that says what it does

The thesis: each constraint helps alone, but stacked they multiply by
removing the failure modes the others cannot see. SAMA makes tokens
cheap (atoms fit in small windows), TDD makes SAMA self-policing (the
test sits next to the unit), tokens keep TDD honest (no room to hold
both halves of a tests-after move).

Includes a "what this looks like in practice" section pointing at
this site's own setup as the dogfood example.

content/blog/three-constraints-agentic-coding.md
src/c31_blog.ts                                   - registry entry,
                                                    pinned to top

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
author: syntaxai <[email protected]>
date: 2026-05-09 09:31:36 +01:00
parent: f3e2bac
commit: a518670681406755d151238aae61a8292300259e

2 files changed · +104 −0

added content/blog/three-constraints-agentic-coding.md +98 −0
@@ -0,0 +1,98 @@
1+# Red, tokens, atoms: three constraints that compound
2+
3+> Three pieces landed in the same week — Jesse Vincent's [superpowers TDD skill](https://github.com/obra/superpowers/blob/main/skills/test-driven-development/SKILL.md), Harsh Mishra's [23 token-saving tips for Claude Code](https://www.analyticsvidhya.com/blog/2026/05/tips-for-claude-code-token-saving/), and the rebrand of the file-naming convention this site is built on (SAMA — *Sorted, Architecture, Modeled, Atomic*). Each is useful on its own. The interesting thing is that they multiply: stacked, they make agentic coding cheap, correct, and reviewable in a way no single one of them delivers. Here is why they fit together.
4+
5+## Red: write the failing test first
6+
7+The superpowers TDD skill is a tight 200-line distillation of the discipline, written *for an agent that has been told what TDD is and forgets every two prompts*. Its claims:
8+
9+- **The iron law**: *"NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST."* Code before test? Delete it. Start over.
10+- **Watch the test fail.** *"If you didn't watch the test fail, you don't know if it tests the right thing."* The verify-RED step is mandatory, not decorative.
11+- **No keeping it as reference.** Don't "adapt" pre-test code while writing the test. *"Delete means delete."*
12+- **Tests-after answer the wrong question.** Tests written after the implementation pass immediately, and a test that has never been seen failing proves nothing.
13+
14+The skill calls out the rationalizations explicitly — *"too simple to test", "I'll test after", "TDD will slow me down", "tests-after achieve the same goals"* — and gives each one a single line of pushback. It is a manifesto for forcing the agent to behave like a junior dev who *has* the discipline rather than one who has only heard about it.
15+
16+> *"Violating the letter of the rules is violating the spirit of the rules."*
17+
18+That sentence is the whole point. AI agents are very good at finding the spirit-but-not-letter shortcut. The iron law removes that path.
19+
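A minimal TypeScript sketch of the red-then-green sequence described above. The `isExpired` function, its signature, and the tiny `runTest` helper are hypothetical names chosen for illustration; the skill does not prescribe them, and a real project would use its own test runner instead.

```typescript
// Hypothetical unit under test: decides whether a session has expired.
type IsExpired = (createdAtMs: number, nowMs: number, ttlMs: number) => boolean;

// The test is written first and takes the implementation as a parameter,
// so the same test can be run before and after the code exists.
function sessionExpiryTest(isExpired: IsExpired): void {
  if (isExpired(0, 999, 1000)) throw new Error("fresh session reported expired");
  if (!isExpired(0, 1000, 1000)) throw new Error("session at TTL not expired");
}

// Tiny stand-in for a test runner (an assumption, not a real library).
function runTest(test: () => void): "red" | "green" {
  try {
    test();
    return "green";
  } catch {
    return "red";
  }
}

// Step 1 (RED): no production code yet, only a stub that throws.
const notImplemented: IsExpired = () => {
  throw new Error("not implemented");
};
const before = runTest(() => sessionExpiryTest(notImplemented));

// Step 2 (GREEN): the smallest implementation that satisfies the test.
const isExpired: IsExpired = (createdAtMs, nowMs, ttlMs) =>
  nowMs - createdAtMs >= ttlMs;
const after = runTest(() => sessionExpiryTest(isExpired));

console.log(before, after);
```

The point of recording `before` is the verify-RED step: the test is observed failing for the expected reason before any implementation is written.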
20+## Tokens: clear, scope, cap, model
21+
22+Mishra's piece (Analytics Vidhya, 8 May 2026) is the practitioner's complement: how to keep the agent *cheap* while it does the disciplined work above. The shape of his 23 tips:
23+
24+**Trim the context window.** `/clear` between tasks. `/compact` when continuity matters. `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=70` to compact sooner than default. `/context` and `/usage` before large tasks. A status line in the terminal so you can see what you are spending in real time.
25+
26+**Trim the global instructions.** `CLAUDE.md` under 200 lines. Path-scoped rules in `.claude/rules/`. Skills that load on demand instead of globally. Prefer CLI tools to MCP servers.
27+
28+**Cap the noise.** `MAX_MCP_OUTPUT_TOKENS=8000`, `BASH_MAX_OUTPUT_LENGTH=20000`. Filter logs through `grep` *before* feeding them to the model. Subagents for verbose research that return a one-paragraph summary. Deny noisy paths in `.claude/settings.json`.
29+
30+**Pick the cheapest model that works.** Sonnet for daily work, Opus for hard reasoning. `/effort low` for simple tasks. `CLAUDE_CODE_DISABLE_THINKING=1` if you do not need extended reasoning.
31+
32+**Be specific.** Exact filenames over "scan the repo". Verification targets stated up front so the agent does not loop on corrections. Course-correct early when it reads irrelevant files.
33+
34+> *"By setting strict boundaries from the outset, teams can reduce costs without compromising code quality."*
35+
36+Read the list and a pattern emerges: **smaller surface area, sharper instructions, fewer choices.** The agent stops drifting because there is nowhere to drift to.
37+
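The "filter logs before feeding them to the model" tip can be sketched in TypeScript as a small pure function. This is an illustration under assumptions: `filterAndCap`, the sample log lines, and the choice to keep the tail of the output are all hypothetical, and the `20000` figure only echoes the `BASH_MAX_OUTPUT_LENGTH` value quoted above.

```typescript
// Keep only lines matching `pattern`, then cap the result at `maxChars`.
function filterAndCap(log: string, pattern: RegExp, maxChars: number): string {
  // Drop the g/y flags so repeated .test() calls are not stateful.
  const matcher = new RegExp(pattern.source, pattern.flags.replace(/[gy]/g, ""));
  const kept = log
    .split("\n")
    .filter((line) => matcher.test(line))
    .join("\n");
  if (kept.length <= maxChars) return kept;
  // Over the cap: keep the tail, assuming the most recent lines matter most.
  return kept.slice(kept.length - maxChars);
}

// Hypothetical log output; only the ERROR lines are worth the tokens.
const rawLog = [
  "INFO server started",
  "DEBUG cache warm",
  "ERROR c13_database: connection refused",
  "INFO request GET /",
  "ERROR c14_github: HTTP 502",
].join("\n");

const forModel = filterAndCap(rawLog, /ERROR/, 20000);
console.log(forModel);
```

Filtering first and capping second means the cap is spent on relevant lines rather than on whatever happened to come first in the raw output.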
38+## Atoms: SAMA — sorted, architecture, modeled, atomic
39+
40+The third piece is structural. SAMA — *Sorted, Architecture, Modeled, Atomic* — is the file-naming convention this site is built on, shared across two other projects in my workspace. Every source file has a `cXX_<name>.ts` prefix where the number is its layer:
41+
42+```
43+c11_server.ts entry (Bun.serve)
44+c13_database.ts SQLite
45+c14_github.ts HTTP I/O
46+c21_app.ts route dispatcher
47+c21_handlers_*.ts route handlers per domain
48+c31_*.ts models (pure types + data)
49+c32_*.ts business logic (pure)
50+c51_render_*.ts HTML rendering per domain
51+```
52+
53+Four properties, one acronym:
54+
55+- **Sorted.** Alphabetical sort = dependency direction. `ls src/` is the architecture diagram. Lower-numbered layers never import from higher ones — verifiable with one grep.
56+- **Architecture.** The number is the layer; the layer is the contract. A `c31_*` file is a model — no I/O. A `c21_*` file composes lower layers — no SQL of its own. The contract is in the prefix.
57+- **Modeled.** Tests live next to source: `c32_session.test.ts` next to `c32_session.ts`. Types and parse-functions live in `c31_*`. The shape comes before the logic.
58+- **Atomic.** One responsibility per module. When a layer file passes ~700 lines, split per UI/data domain using the same prefix — `c51_render_layout.ts` + `c51_render_reports.ts` + `c51_render_projects.ts`. No barrel re-exports; consumers import directly from the atom.
59+
60+What you get for free: a file tree where every file is small, predictable, and unambiguously placed. There is exactly one right place for any given function, and `ls src/` (already sorted) will show you where.
61+
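The "Sorted" rule above (lower-numbered layers never import from higher ones) can be checked mechanically. The following is a minimal TypeScript sketch, not the site's actual tooling: `layerOf`, `findUpwardImports`, and the in-memory `sources` map are hypothetical, and the import pattern assumes double-quoted relative imports of the form `from "./cXX_name"`.

```typescript
// Extract the numeric layer from a SAMA filename, e.g. "c32_session.ts" -> 32.
function layerOf(fileName: string): number | null {
  const match = /^c(\d{2})_/.exec(fileName);
  return match ? Number(match[1]) : null;
}

interface UpwardImport {
  file: string;
  imports: string;
}

// `sources` maps a filename to its text. Reports every relative import
// that points from a lower-numbered layer to a higher-numbered one.
function findUpwardImports(sources: Record<string, string>): UpwardImport[] {
  const violations: UpwardImport[] = [];
  for (const file of Object.keys(sources)) {
    const fromLayer = layerOf(file);
    if (fromLayer === null) continue;
    // Fresh regex per file so lastIndex always starts at 0.
    const importPattern = /from\s+"\.\/(c\d{2}_[^"]+)"/g;
    let match: RegExpExecArray | null;
    while ((match = importPattern.exec(sources[file])) !== null) {
      const toLayer = layerOf(match[1]);
      if (toLayer !== null && toLayer > fromLayer) {
        violations.push({ file, imports: match[1] });
      }
    }
  }
  return violations;
}

// Hypothetical file contents; "c31_report.ts" breaks the rule on purpose.
const sources: Record<string, string> = {
  "c31_session.ts": `export interface Session { id: string }`,
  "c32_session.ts": `import type { Session } from "./c31_session";`,
  "c31_report.ts": `import { renderReport } from "./c51_render_reports";`,
};

const violations = findUpwardImports(sources);
console.log(violations);
```

Because the layer is encoded in the filename, the check needs no configuration: the prefix of the importing file and the prefix of the imported file are the whole input.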
62+## Why the three compound
63+
64+Each constraint helps on its own. The interesting claim is that **stacked, they multiply** — and not by adding up benefits. They multiply by *removing the failure modes the others cannot see*.
65+
66+**SAMA makes tokens cheap.** A SAMA file is bounded above by the per-domain split rule (~700 lines, one atom). When you tell an agent to work on `c32_session.ts`, the relevant context is exactly that file plus its tests. You do not need `/clear` discipline to keep the window small — *the file system already enforced it*. Mishra's tip 19 ("avoid broad scans, specify exact filenames") becomes the natural way to work, not a discipline to remember.
67+
68+**TDD makes SAMA self-policing.** The iron law forbids writing code before the test. In a SAMA codebase that means the failing test goes in `c32_session.test.ts`, and the impl that makes it pass goes in `c32_session.ts`. The atom and its proof of correctness sit on adjacent lines in `ls`. The impl cannot drift into a higher layer or another atom unnoticed: the test imports from its sibling file, so code written anywhere else leaves it red.
69+
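The sibling-file pairing described above is also easy to audit. A hedged TypeScript sketch, assuming the `cXX_<name>.ts` and `cXX_<name>.test.ts` naming from this post; `atomsWithoutTests` and the sample filenames are hypothetical.

```typescript
// Given the filenames in src/, return every atom that has no sibling test.
function atomsWithoutTests(fileNames: string[]): string[] {
  return fileNames
    // Keep SAMA source files, skipping the test files themselves.
    .filter((name) => /^c\d{2}_.+\.ts$/.test(name) && !/\.test\.ts$/.test(name))
    // Keep only those whose "<name>.test.ts" sibling is absent.
    .filter((name) => fileNames.indexOf(name.replace(/\.ts$/, ".test.ts")) === -1)
    .sort();
}

// Hypothetical directory listing: c32_billing.ts landed without a test.
const files = ["c32_session.ts", "c32_session.test.ts", "c32_billing.ts"];
const untested = atomsWithoutTests(files);
console.log(untested);
```

Under the iron law the test file exists before the impl does, so for any layer that is expected to carry tests a non-empty result points at code that skipped the red step.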
70+**Tokens make TDD honest.** The biggest TDD failure mode for agents — the one the rationalizations table in the superpowers skill spends most of its weight on — is the agent quietly skipping the verify-RED step. *"I'll add the test in a sec."* The structural fix the skill recommends is delete-and-start-over. The structural fix Mishra's tips imply is different but compatible: *the agent's context window is small enough that it cannot both quietly write the impl and pretend to be writing the test.* Combined, you get an agent that physically cannot get away with a tests-after move because there is not room in its head to hold both halves.
71+
72+The compounding works the other way too. **Without SAMA**, "specify exact filenames" is hard — there is no obvious right file. **Without TDD**, "verification targets stated up front" lacks a place to land — the agent invents what *up front* means. **Without token discipline**, the iron law gets quietly violated as the agent's context bloats and it loses track of which step it is on.
73+
74+## What this looks like in practice
75+
76+This site (tdd.md, the one you are reading) runs all three:
77+
78+- [`/reports/live/tests`](/reports/live/tests) is built TDD-first: every body-builder in `c51_render_reports.ts` has a sibling `*.test.ts` that landed before the impl.
79+- The project's `CLAUDE.md` is short. Path-scoped rules live in `.claude/rules/` (gitignored — they are the agent's local instructions, not project source).
80+- Every source file is `cXX_<name>.ts`. There are 21 files in `src/`, none over 700 lines, none importing upward. Run `grep -rE 'from "\./c[5-9]' src/c1*.ts src/c2*.ts src/c3*.ts` and you get an empty result, so the layer rule is mechanically verifiable.
81+
82+The cost of running an agent against this codebase is unusually low — Sonnet for full sessions is enough — and the diff produced is unusually reviewable. Both come from the same place: the agent has nowhere to be sloppy.
83+
84+## The takeaway
85+
86+The three pieces do not compete for the same slot. They occupy different ones:
87+
88+| constraint | what it constrains | rule in brief |
89+|---|---|---|
90+| **red** | the agent's *behaviour during a change* | the iron law: failing test first |
91+| **tokens** | the agent's *context per turn* | trim, scope, cap, pick the cheap model |
92+| **atoms** | the agent's *file tree to navigate* | layer-prefixed, one responsibility, sorted |
93+
94+Pick only one and the other two get harder to keep up. Pick all three and they stop being disciplines you maintain and become the path of least resistance. That is the goal, and that is why these three pieces showing up in the same week feels like the same idea told three ways.
95+
96+---
97+
98+[Read the obra/superpowers TDD skill →](https://github.com/obra/superpowers/blob/main/skills/test-driven-development/SKILL.md) · [Read Mishra's 23 token-saving tips →](https://www.analyticsvidhya.com/blog/2026/05/tips-for-claude-code-token-saving/) · [back to the blog](/blog)
modified src/c31_blog.ts +6 −0
@@ -12,6 +12,12 @@ export interface BlogEntry {
12 }
13
14 export const ALL_POSTS: BlogEntry[] = [
15+ {
16+ slug: "three-constraints-agentic-coding",
17+ title: "Red, tokens, atoms: three constraints that compound",
18+ description: "Three pieces landed the same week — obra's TDD skill, Mishra's 23 token-saving tips for Claude Code, and the rebrand of SAMA (Sorted, Architecture, Modeled, Atomic). Each is useful alone. Stacked they multiply, and not by adding benefits — they remove the failure modes the others cannot see.",
19+ date: "2026-05-09",
20+ },
21 {
22 slug: "tweag-handbook-tdd",
23 title: "Tweag's agentic TDD handbook gets the loop right — local green still isn't enough",