# SAMA v2 — Core Specification

> **Status:** Draft for v2.0. This document defines the *frozen core* and the *profile mechanism*. The core is normative and stable; profiles are the only extension point. Anything a CI job cannot check deterministically does not belong in this spec — it belongs in `AGENTS.md`.

---

## 0. Design contract

SAMA separates **the law** (how layers may depend on each other — frozen, language-neutral, identical in every repo) from **the vocabulary** (which named sublayers a given domain uses — supplied by a profile).

Two SAMA repositories in different languages and different profiles remain comparable at the core level. That comparability is what makes cross-repo empirical measurement possible: every repo, regardless of language or profile, emits the same core metrics.

A conformant verifier is a **deterministic program**. No LLM judgment sits in the enforcement loop. An agent may *use* SAMA to decide where a file goes; it may never be the *referee* that decides whether a file conforms.

---

## 1. The frozen core

### 1.1 The four canonical layers

Every file in a SAMA repository belongs to exactly one of four layers. This set is **frozen**: no profile, repo, or version may add, remove, renumber, or rename a canonical layer.

| Layer | Name | Contains | May import |
|---|---|---|---|
| 0 | Pure | Types, constants, pure functions, domain models. No I/O, no side effects. | nothing above 0 (i.e. only other Layer 0) |
| 1 | Core | Domain logic and decisions. No network, disk, clock, or framework. | 0 |
| 2 | Adapter | The boundary. External input is **parsed here and only here** (never cast). DB, network, filesystem, framework bindings. | 0, 1 |
| 3 | Entry | Outermost shell: `main`, CLI handler, HTTP route, UI mount, job entry. | 0, 1, 2 |

Layer 0 depends on nothing. Layer 3 is depended on by nothing.

### 1.2 The Law (frozen, one sentence)

> **Imports always point to a strictly lower layer number — never upward, never sideways across a higher number, never cyclic.**

Formally, an import edge `A → B` is legal only if `layer(B) < layer(A)`, or if `A` and `B` are in the same layer **and** the active profile declares a sublayer ordering that permits `A → B` (see §2.2). In addition, the whole-program import graph must be acyclic.

This single law is what makes the **Sorted** pillar enforceable: the layer number is the lexicographic sort key, so file order *is* dependency direction.
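A minimal sketch of how a verifier might apply the two halves of the Law: classifying a single edge by layer number, and checking the whole graph for cycles. The names `ImportEdge`, `classifyEdge`, and `hasCycle` are hypothetical and are not taken from the reference verifier; the sketch assumes every edge connects two files that appear in the file list.

```typescript
// Hypothetical shapes for illustration; not the reference verifier's API.
type Layer = 0 | 1 | 2 | 3;
interface ImportEdge { from: string; to: string }

// Cross-layer half of the Law: the target must sit in a strictly lower layer.
// Same-layer edges come back as "deferred" because only the active profile's
// sublayer ordering (§2.2) can decide them.
function classifyEdge(
  layerOf: Record<string, Layer>,
  edge: ImportEdge,
): "legal" | "upward" | "deferred" {
  const a = layerOf[edge.from];
  const b = layerOf[edge.to];
  if (b < a) return "legal";
  if (b > a) return "upward";
  return "deferred";
}

// Acyclicity half: depth-first search with three states per file.
// Returns true if the import graph contains at least one cycle.
function hasCycle(files: string[], edges: ImportEdge[]): boolean {
  const out: Record<string, string[]> = {};
  for (const f of files) out[f] = [];
  for (const e of edges) out[e.from].push(e.to);

  // 0 = unvisited, 1 = on the current DFS path, 2 = fully explored
  const state: Record<string, number> = {};
  for (const f of files) state[f] = 0;

  const visit = (f: string): boolean => {
    if (state[f] === 1) return true; // back edge: cycle found
    if (state[f] === 2) return false;
    state[f] = 1;
    for (const next of out[f]) {
      if (visit(next)) return true;
    }
    state[f] = 2;
    return false;
  };

  for (const f of files) {
    if (visit(f)) return true;
  }
  return false;
}
```

Keeping the two checks separate mirrors the wording of the Law: an edge can be individually legal by layer number while the graph as a whole still fails on a same-layer cycle.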

### 1.3 Why exactly four

Four layers are the minimal set that captures the only relation that matters for context rot: *what is allowed to depend on what.* Fewer cannot express the parse-at-the-boundary rule, which needs a distinct Adapter layer. More would reintroduce ambiguity about where a file belongs, which is the drift SAMA exists to kill. This is the "as simple as possible, but not simpler" line.

---

## 2. Profiles

A profile is the **only** extension mechanism. It does exactly one thing:

> A profile MAY subdivide a canonical layer into named, ordered sublayers. A profile MUST NOT introduce, remove, reverse, or otherwise alter any dependency relation between the four canonical layers.

If a proposed profile rule cannot be expressed as "subdivide layer N into ordered sublayers," it is not a profile rule. It is either core (and frozen) or out of scope (and belongs in `AGENTS.md`).

### 2.1 What a profile may and may not do

| Allowed | Forbidden |
|---|---|
| Split Layer 2 into `repository → gateway → controller` | Let Layer 1 import Layer 2 |
| Leave a canonical layer empty (e.g. CLI has no DB) | Add a fifth canonical layer |
| Define intra-layer sublayer ordering | Make Layer 0 perform I/O |
| Map sublayer names to filename prefixes | Reverse the import direction between core layers |

### 2.2 Sublayer ordering

Within a layer, sublayers are totally ordered. An import between two files in the same canonical layer is legal only if it points to an equal-or-lower sublayer in the profile's declared order. Cross-layer imports are governed solely by §1.2 and ignore sublayer order.
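The rule above, combined with §1.2, can be sketched as a single predicate. This is an illustration only: the `Placement` shape and the name `mayImport` are hypothetical, and the sublayer index is assumed to be the position in the profile's declared order (0 for a layer that is not subdivided).

```typescript
// Hypothetical shape: where a file sits once its prefix has been looked up.
interface Placement {
  layer: 0 | 1 | 2 | 3; // canonical layer
  sublayer: number;      // index in the profile's declared order; 0 if not subdivided
}

// True if `importer` may import `target` under §1.2 plus §2.2.
function mayImport(importer: Placement, target: Placement): boolean {
  if (target.layer !== importer.layer) {
    // Cross-layer: only the canonical layer numbers matter.
    return target.layer < importer.layer;
  }
  // Same layer: equal-or-lower sublayer in the declared order.
  return target.sublayer <= importer.sublayer;
}
```

Under an ordering such as `repository → gateway → controller` (indices 0, 1, 2 in Layer 2), `mayImport({ layer: 2, sublayer: 2 }, { layer: 2, sublayer: 0 })` is true and the reverse is false. A Layer 3 file at sublayer 0 may still import a Layer 2 file at sublayer 2, because cross-layer edges ignore sublayer order.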

### 2.3 Profile declaration format

A profile is a single machine-readable file the verifier ingests (`sama.profile.toml`). Example — an HTTP service:

```toml
sama_version = "2.0"
profile = "http-service"

# Map each canonical layer to ordered sublayers and their filename prefixes.
# Order in the array = dependency order (later may import earlier, never reverse).

[layers.0] # Pure — not subdivided
prefixes = ["p0_"]

[layers.1] # Core
sublayers = [
  { name = "policy",  prefix = "c1a_" },
  { name = "service", prefix = "c1b_" },  # service may import policy
]

[layers.2] # Adapter
sublayers = [
  { name = "repository", prefix = "a2a_" },
  { name = "gateway",    prefix = "a2b_" },
  { name = "controller", prefix = "a2c_" },  # controller → gateway → repository
]

[layers.3] # Entry
prefixes = ["e3_"]
```

A `cli` profile would leave `[layers.2]` minimal and subdivide `[layers.3]` into `arg-parser → dispatch`. A `frontend` profile would subdivide Layer 1 into `store` vs `view-logic`. Same law, different dialect.

→ **Worked examples:** [a CRUD HTTP service under v2](/sama/v2/example-crud) (TypeScript) · [a WordPress plugin under v2](/sama/v2/example-wordpress) (PHP) — both ship full profiles, directory trees, per-layer code signatures, and the common mistakes each §4 check catches.

---

## 3. Layer assignment & the consistency check

The hard step is not checking the law — it is knowing each file's layer. SAMA uses **prefix as the source of truth, with a consistency check against actual imports.**

1. **Declared layer** = the canonical layer implied by the file's prefix (per the active profile's prefix map).
2. **Observed layer ceiling** = the highest layer any of the file's imports resolves to.
3. **Consistency rule:** the verifier FAILS if a file imports from a layer that its declared layer is not permitted to import — i.e. if the prefix claims something the imports contradict.

This gives a deterministic gate *and* protection against a misdeclared (or dishonest) prefix. A file cannot launder a forbidden dependency by lying about its layer: the import graph exposes it.
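The three steps can be sketched as follows, using the prefix map from the `http-service` example in §2.3. The function names are hypothetical, and same-layer imports are treated as consistent here because deciding them is the job of the sublayer ordering in §2.2, not of this check.

```typescript
type CanonicalLayer = 0 | 1 | 2 | 3;

// Prefix map copied from the http-service example profile in §2.3.
const PREFIX_TO_LAYER: Record<string, CanonicalLayer> = {
  p0_: 0,
  c1a_: 1, c1b_: 1,
  a2a_: 2, a2b_: 2, a2c_: 2,
  e3_: 3,
};

// Step 1: declared layer, or null when no profile prefix matches.
function declaredLayer(fileName: string): CanonicalLayer | null {
  for (const prefix of Object.keys(PREFIX_TO_LAYER)) {
    if (fileName.slice(0, prefix.length) === prefix) return PREFIX_TO_LAYER[prefix];
  }
  return null;
}

// Step 2: observed ceiling = highest layer among the file's resolved imports,
// or null when the file imports nothing.
function observedCeiling(importedFiles: string[]): CanonicalLayer | null {
  let ceiling: CanonicalLayer | null = null;
  for (const imported of importedFiles) {
    const layer = declaredLayer(imported);
    if (layer !== null && (ceiling === null || layer > ceiling)) ceiling = layer;
  }
  return ceiling;
}

// Step 3: the prefix is contradicted when the ceiling exceeds the declared layer.
function isConsistent(fileName: string, importedFiles: string[]): boolean {
  const declared = declaredLayer(fileName);
  if (declared === null) return false; // an unprefixed file cannot be checked
  const ceiling = observedCeiling(importedFiles);
  return ceiling === null || ceiling <= declared;
}
```

For instance, `isConsistent("c1b_pricing.ts", ["a2a_orders_repo.ts"])` is false: the file name claims Layer 1 while its import resolves to Layer 2. The file names in that call are made up for the example.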

---

## 4. Conformance — the binary gate

A repository **conforms to SAMA v2** if and only if all of the following pass. Each is a deterministic check; the result is binary.

1. **Sorted** — every file carries a profile-recognized prefix; lexicographic prefix order equals layer order.
2. **Architecture** — every file maps to exactly one canonical layer via §2.3; no file is unprefixed or maps to two layers.
3. **Modeled (tests)** — every Layer 1 and Layer 2 behavior file has a sibling test file.
4. **Modeled (boundary)** — external input is parsed only in Layer 2. (Verifier support is profile-dependent; see §6.)
5. **Atomic** — no file exceeds the line cap (default ~700; profile may lower, never raise). No barrel re-export files.
6. **The Law** — the import graph is acyclic and every edge satisfies §1.2.
7. **Consistency** — no file's imports contradict its declared layer (§3).

If any check fails, the repository does not conform. There is no partial pass, no score-to-taste. (Profiles and *measurement* are graded; **conformance** is binary.)
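As a small sketch of what "binary" means in practice, assuming each check reports an integer violation count under the seven keys listed for `violationCounts` in §5: conformance is the conjunction of all seven, with no weighting. The type and function names are hypothetical.

```typescript
// Key names follow the violationCounts record described in §5.
type CheckName =
  | "sorted" | "architecture" | "modeledTests" | "modeledBoundary"
  | "atomic" | "law" | "consistency";

type ViolationCounts = Record<CheckName, number>;

// A repository conforms only if every check reports zero violations.
function conforms(counts: ViolationCounts): boolean {
  const names = Object.keys(counts) as CheckName[];
  return names.every((checkName) => counts[checkName] === 0);
}
```

A single violation in any one check flips the result to `false`, however clean the other six are.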

---

## 5. Core metrics (the SAMA-independent outcome)

Every conformant repo emits these, identically, regardless of language or profile. These are the variables for A/B measurement (`SAMA on` vs `off`) — and crucially, **none of them is a compliance score.** They measure properties an agent's task performance should correlate with:

- **Graph depth** — longest path in the import DAG.
- **Fan-in / fan-out distribution** per layer.
- **Boundary ratio** — share of external-input parsing that occurs in Layer 2.
- **Working-set fit** — share of files within the editor LOC sweet spot.
- **Violation count over time** — emitted even on conforming repos as a trailing signal (which rules agents *almost* break).

Report the **delta** between SAMA-on and SAMA-off runs on these metrics — not the compliance rate. Compliance proves the rules were followed; the delta is what proves the rules were *worth* following.

---

## 5 (operational) — Core metrics definitions

This section pins how the §5 metrics are computed by the verifier at [/sama/v2/verify](/sama/v2/verify). The values are functions of `(sama.profile.toml, src/**.ts)` alone: the same source tree and the same profile yield identical numbers across runs.

- **graphDepth** = length of the longest path in the import DAG. Nodes are SAMA source files (`src/*.ts` non-test, matching a profile prefix); edges are static relative-path imports (`from "./...ts"`) between them. A file with no imports has depth 1. Empty graph = 0. Cycles (which the Law check would flag separately) are bounded so the metric still terminates.

- **fanByLayer** = for each canonical layer L ∈ {0,1,2,3}, two distribution summaries: **fanIn** (count of edges arriving at files in L) and **fanOut** (count of edges leaving files in L). Each summary reports `{mean, p50, p95, max}` (nearest-rank percentile) over the files in L. Empty layers report all-zero summaries.

- **boundaryRatio** = (parse-boundary call sites in Layer 2 files) ÷ (parse-boundary call sites anywhere in the source tree). The set of "parse-boundary call sites" is defined by the shared detector that also powers the §4.4 Modeled-boundary check — currently `JSON.parse(...)` and `new URL(...)` outside string literals and comments. Both consumers share the helper in `src/a31_sama_v2.ts`, so they cannot diverge. When no parse boundaries exist anywhere, `boundaryRatio = 1.0` (vacuously satisfied).

- **workingSetFit** = (count of source files with `WORKING_SET_MIN_LOC ≤ LOC ≤ WORKING_SET_MAX_LOC`) ÷ (total source files). The bounds are *intentional defaults documented before the numbers, not retrofitted to flatter this repo*:
   - **Upper 500** — comfortably below the §4.5 Atomic 700-LOC cap, leaving headroom before a file approaches "split soon" territory.
   - **Lower 50** — below this, a file is too small to be a substantive module; it is usually a type-only file, a stub, or a single helper that would read better inlined into a sibling. Type-only files (Layer 0 model shards) and minimal test fixtures fall here by design. They are acceptable but counted as "not in the working-set sweet spot" because they are not load-bearing modules.

  Bounds are hard-coded constants `WORKING_SET_MIN_LOC = 50` and `WORKING_SET_MAX_LOC = 500` in [`src/a31_sama_v2.ts`](/GIT/tdd.md/blob/main/src/a31_sama_v2.ts) for v1 of the metrics emitter. Making them profile-configurable is a deliberate later step (requires extending the TOML subset parser to handle integer values).

- **violationCounts** = a record keyed by the seven §4 checks (`sorted`, `architecture`, `modeledTests`, `modeledBoundary`, `atomic`, `law`, `consistency`), each holding the integer count of violations that check produced on this run. Reported even when a check passes (value = 0) — this is §5's "trailing signal: which rules agents *almost* break." The verifier enumerates **all** violations per check (no short-circuit on first failure within a check), so the count is meaningful — not "1 if failed, 0 if passed".
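Two of the definitions above, `graphDepth` and the `{mean, p50, p95, max}` summary, are sketched below to make the counting conventions concrete. This is not the code in `src/a31_sama_v2.ts`; the names are hypothetical, and the way cycles are bounded (a back edge contributes nothing) is an assumption, since the text only states that the metric terminates.

```typescript
// Longest path in the import graph, counted in nodes:
// a file with no imports has depth 1, an empty graph has depth 0.
function graphDepth(imports: Record<string, string[]>): number {
  const memo: Record<string, number> = {};
  const onPath: Record<string, boolean> = {};

  const depthOf = (file: string): number => {
    if (memo[file] !== undefined) return memo[file];
    if (onPath[file]) return 0; // assumption: a back edge adds nothing, so cycles terminate
    onPath[file] = true;
    let best = 0;
    for (const dep of imports[file] || []) {
      best = Math.max(best, depthOf(dep));
    }
    onPath[file] = false;
    memo[file] = best + 1;
    return memo[file];
  };

  let deepest = 0;
  for (const file of Object.keys(imports)) {
    deepest = Math.max(deepest, depthOf(file));
  }
  return deepest;
}

interface FanSummary { mean: number; p50: number; p95: number; max: number }

// Summary over per-file edge counts in one layer.
// Nearest-rank percentile: the value at 1-based rank ceil(p / 100 * n) in sorted order.
function summarize(perFileCounts: number[]): FanSummary {
  if (perFileCounts.length === 0) return { mean: 0, p50: 0, p95: 0, max: 0 };
  const sorted = perFileCounts.slice().sort((x, y) => x - y);
  const atRank = (p: number): number =>
    sorted[Math.max(1, Math.ceil((p / 100) * sorted.length)) - 1];
  const total = sorted.reduce((acc, v) => acc + v, 0);
  return {
    mean: total / sorted.length,
    p50: atRank(50),
    p95: atRank(95),
    max: sorted[sorted.length - 1],
  };
}
```

Both functions are pure and take only data derived from the source tree, which is what makes the "same inputs, identical numbers" property straightforward to test.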

### Worked example — boundaryRatio for this repo (hand-traced)

The §0 contract ("deterministic program; no LLM judgment") is auditable only if the metric output matches a hand trace. Walking `boundaryRatio` for this repo's `src/` against the live verifier:

A raw grep across non-test `src/*.ts` finds seven hits matching `JSON.parse(` and four hits matching `new URL(`. The shared detector strips comments and string literals first, which removes the explanatory mentions inside `// ...` lines and inside docstring literals. After stripping, the surviving real call sites are:

| call site | layer (prefix → L) |
|---|---|
| `src/c13_database.ts:133` `JSON.parse(row.verdict_json)` | `c13_` → L2 |
| `src/c13_database.ts:159` `JSON.parse(r.tracked_branches)` | `c13_` → L2 |
| `src/c13_database.ts:273` `JSON.parse(r.doc_json)` | `c13_` → L2 |
| `src/c13_database.ts:373` `JSON.parse(r.verdict_json)` | `c13_` → L2 |
| `src/c14_request_parse.ts:28` `JSON.parse(text)` | `c14_` → L2 |
| `src/c14_request_parse.ts:20` `new URL(text)` | `c14_` → L2 |
| `src/c14_client_bundle.ts:72` `new URL(import.meta.url)` | `c14_` → L2 |

Total: 7 parse-boundary call sites (5 `JSON.parse`, 2 `new URL`); all 7 fall under prefixes the profile maps to Layer 2.

`boundaryRatio = 7 / 7 = 1.0 = 100.0%`, which is exactly what [/sama/v2/verify](/sama/v2/verify) reports under §5 Core metrics. The hand count and the verifier's count match by construction: both consume `findParseBoundaryCallSites` in [`src/a31_sama_v2.ts`](/GIT/tdd.md/blob/main/src/a31_sama_v2.ts), and the Modeled-boundary check (#4) uses the same source of truth, so the metric and the check cannot diverge.
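The arithmetic of the hand trace can be replayed with a short sketch. It takes the seven call sites from the table as given (it does not re-implement the comment and string stripping of `findParseBoundaryCallSites`), and the helper names are hypothetical. Only the two prefixes that appear in the table are mapped; any other prefix is treated as "not Layer 2" for the purpose of this example.

```typescript
interface CallSite { file: string; line: number }

// Prefix → layer for the two prefixes in the table (both map to Layer 2).
const LAYER_OF_PREFIX: Record<string, number> = { c13_: 2, c14_: 2 };

function layerOfFile(path: string): number | null {
  const base = path.split("/").pop() || "";
  for (const prefix of Object.keys(LAYER_OF_PREFIX)) {
    if (base.slice(0, prefix.length) === prefix) return LAYER_OF_PREFIX[prefix];
  }
  return null;
}

// Share of parse-boundary call sites that sit in Layer 2 files.
function boundaryRatio(callSites: CallSite[]): number {
  if (callSites.length === 0) return 1.0; // vacuously satisfied
  const inLayer2 = callSites.filter((s) => layerOfFile(s.file) === 2).length;
  return inLayer2 / callSites.length;
}

// The seven surviving call sites from the table.
const tracedSites: CallSite[] = [
  { file: "src/c13_database.ts", line: 133 },
  { file: "src/c13_database.ts", line: 159 },
  { file: "src/c13_database.ts", line: 273 },
  { file: "src/c13_database.ts", line: 373 },
  { file: "src/c14_request_parse.ts", line: 28 },
  { file: "src/c14_request_parse.ts", line: 20 },
  { file: "src/c14_client_bundle.ts", line: 72 },
];

console.log(boundaryRatio(tracedSites)); // prints 1
```

If one additional call site appeared in a file outside Layer 2, the ratio would drop to 7 / 8 = 0.875.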

---

## 6. Evolution policy (how the standard stays alive without rotting)

- **The core (§1) is frozen.** Changing the four layers or the Law requires a major version and an extraordinarily high evidentiary bar: cross-repo data showing the current core measurably harms agent performance.
- **Profiles are the moving edge.** A new profile is a *falsifiable hypothesis*: "this sublayer split lowers context cost for this domain." It is admitted provisionally, measured against §5, and promoted to "official" only if the delta holds across multiple repos.
- **A rule agents structurally violate is a signal — to be triaged, not auto-relaxed.** Either the rule is right and the agent must improve (signal to agent-builders), or the rule is impractical and the *profile* adapts (never the core). The feedback loop tunes profiles; it does not erode the law.

### 6.A v2.1 dialects (provisional)

Three falsifiable extensions are admitted under §6 as v2.1-draft *dialects*. Each was surfaced by a real-world audit that found the v2.0 surface syntax mismatched a target language's idiom. Each is **opt-in per profile**, defaults to v2.0 behaviour when its flag is absent, and preserves — by different surface syntax — the architectural property the original rule was protecting. Per the bullet above, promotion to "official" requires cross-repo §5 metric data showing the dialect catches the same class of drift the unrelaxed rule did.

The conformant verifier MUST parse the three dialect flags (`layout`, `tests`, `atomic_exemption`) as optional top-level profile fields and MUST reject unknown values with a clear error. A verifier MAY refuse to activate a dialect's relaxed semantics (i.e. continue applying the v2.0 rule even when a dialect is declared) — dialect activation is a separate, later promotion event that requires §5 cross-repo evidence. The flags themselves are tolerated today so opt-in profiles for non-TS/PHP languages do not get rejected as malformed.
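A minimal sketch of that parsing requirement, assuming the profile has already been read from TOML into a plain object. The function name and error wording are hypothetical; the allowed values and defaults are the ones stated in §6.1 through §6.3. Note that this only validates the flags; whether a verifier then activates the relaxed semantics is a separate decision.

```typescript
// Allowed values per dialect flag; the first entry is the default (v2.0 behaviour).
const DIALECT_VALUES: Record<string, string[]> = {
  layout: ["prefix", "directory"],
  tests: ["sibling", "inline"],
  atomic_exemption: ["none", "declarative"],
};

// Reads the three optional top-level flags from an already-parsed profile table.
// Absent flags fall back to the default; unknown values are rejected.
function parseDialectFlags(profile: Record<string, unknown>): Record<string, string> {
  const flags: Record<string, string> = {};
  for (const key of Object.keys(DIALECT_VALUES)) {
    const allowed = DIALECT_VALUES[key];
    const raw = profile[key];
    if (raw === undefined) {
      flags[key] = allowed[0];
    } else if (typeof raw === "string" && allowed.indexOf(raw) !== -1) {
      flags[key] = raw;
    } else {
      throw new Error(
        `sama.profile.toml: invalid value ${JSON.stringify(raw)} for "${key}"; ` +
        `expected one of: ${allowed.join(", ")}`,
      );
    }
  }
  return flags;
}
```

For example, `parseDialectFlags({ tests: "inline" })` yields `layout: "prefix"`, `tests: "inline"`, and `atomic_exemption: "none"`, while `parseDialectFlags({ layout: "flat" })` throws.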

### 6.1 Directory-layout dialect

**Profile syntax.** Top-level optional flag:

```toml
sama_version = "2.1"
profile = "..."
layout = "directory"   # default when absent: "prefix" (v2.0 behaviour)
```

**What v2.0 rule it relaxes.** §4.1 Sorted — *"every file carries a profile-recognized prefix; lexicographic prefix order equals layer order."*

**The architectural property the original rule was protecting.** The dependency direction of the codebase is publicly readable from the file system without running any analysis: a reviewer's `ls src/ | sort` reads top-to-bottom in dependency order, and a layer change is visible in a `git diff` without any tool support.

**How the dialect preserves that property.** Under `layout = "directory"`, the Sorted check verifies two things: that the profile declares **packages or crate directories** in layer order, and that the lexicographic order of the declared package paths matches the actual import direction. The second is confirmed by the language's compile-time dependency check (Go's `internal/` semantics, Rust's Cargo crate graph, etc.) together with the absence of upward edges in the import graph. The reviewer's analogue of `ls src/ | sort` becomes `grep 'packages =' sama.profile.toml`, which is still mechanical and still ahead of the build. The property is the same; the surface syntax shifts from per-file prefix to per-directory declaration.

**Falsifiable cross-repo experiment.** Run the §5 metrics emitter on a corpus of agent-authored Go and Rust commits, half against `layout = "prefix"` (with a synthetic prefix renaming) and half against `layout = "directory"` (against the natural package layout). The dialect is invalidated if the directory mode systematically reports a different violation set on Sorted than the prefix mode does on the same logical defects. Originally surfaced in the [`dive` audit](/blog/2026-05/sama-v2-go-project-dive) and [`dive` rebuild sketch](/blog/2026-05/sama-v2-go-project-dive-rebuilt); confirmed independently by the [`ripgrep` audit](/blog/2026-05/sama-v2-rust-project-ripgrep).

### 6.2 Inline-tests dialect

**Profile syntax.** Top-level optional flag:

```toml
sama_version = "2.1"
profile = "..."
tests = "inline"       # default when absent: "sibling" (v2.0 behaviour)
```

**What v2.0 rule it relaxes.** §4.3 Modeled (tests) — *"every Layer 1 and Layer 2 behavior file has a sibling test file."* The v2.0 rule was written assuming Jest/PHPUnit-style sibling test files (`foo.ts` + `foo.test.ts`).

**The architectural property the original rule was protecting.** Every behavioural source unit has an attached test, mechanically discoverable by the verifier — the test is not centralised in a separate `tests/` tree that may drift out of sync with the source.

**How the dialect preserves that property.** Under `tests = "inline"`, the Modeled-tests check scans each Layer 1 / Layer 2 source file for in-file test attachments (`#[cfg(test)] mod tests { #[test] fn ... }` in Rust; equivalent annotations in other languages whose convention is inline tests). A behavioural file with no inline `#[test]` block fails the check exactly as a file with no sibling `*.test.ts` would under v2.0. *Where* the test attaches changes (same file vs sibling file); *that* every behavioural unit has an attached test does not.
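A rough sketch of what the inline scan could look like for Rust sources. It is a purely textual check with a hypothetical function name; it does not strip comments or string literals, so a commented-out test would still count, and a production verifier would more plausibly rely on a real Rust parser.

```typescript
// Textual approximation of "this file has an attached inline test":
// a #[cfg(test)] module plus at least one #[test] function.
function hasInlineRustTests(source: string): boolean {
  const hasTestModule = /#\[cfg\(test\)\]\s*(pub\s+)?mod\s+\w+/.test(source);
  const hasTestFn = /#\[test\]\s*(#\[[^\]]*\]\s*)*(async\s+)?fn\s+\w+/.test(source);
  return hasTestModule && hasTestFn;
}

// Example inputs (made up for illustration).
const testedFile = `
pub fn add(a: i32, b: i32) -> i32 { a + b }

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn adds() { assert_eq!(add(1, 2), 3); }
}
`;

const untestedFile = `
pub fn add(a: i32, b: i32) -> i32 { a + b }
`;

console.log(hasInlineRustTests(testedFile), hasInlineRustTests(untestedFile));
```

Requiring both the module attribute and a `#[test]` function avoids counting an empty `mod tests {}` as coverage.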

**Falsifiable cross-repo experiment.** Audit a Rust corpus (e.g. the popular CLI tools `bat`, `fd`, `ripgrep`, `eza`) under both `tests = "sibling"` (which the convention does not produce) and `tests = "inline"` (which it does). The dialect is invalidated if inline-mode systematically classifies files as tested that sibling-mode-on-a-renamed-corpus would not — i.e. if the surface-syntax change quietly admits genuinely untested files. Originally surfaced in the [`ripgrep` audit](/blog/2026-05/sama-v2-rust-project-ripgrep) and [`ripgrep` rebuild sketch](/blog/2026-05/sama-v2-rust-project-ripgrep-rebuilt).

### 6.3 Declarative-exemption dialect

**Profile syntax.** Top-level optional flag:

```toml
sama_version = "2.1"
profile = "..."
atomic_exemption = "declarative"   # default when absent: "none" (v2.0 behaviour)
```

**What v2.0 rule it relaxes.** §4.5 Atomic — *"no file exceeds the line cap (default ~700; profile may lower, never raise)."*

**The architectural property the original rule was protecting.** *Working-set fit* — every load-bearing source file fits inside the agent's editor context with headroom. A file at 700+ LOC forces the agent to load it incrementally or summarise; either response is a drift surface.

**How the dialect preserves that property.** Under `atomic_exemption = "declarative"`, the Atomic check exempts from the LOC cap files whose content is overwhelmingly *declarative*. A file qualifies if it exceeds the cap **and** its cyclomatic complexity per LOC (CC/LOC) is below 0.05 **and** its body is predominantly `impl X for Y` / `const FOO: T = ...` / `pub struct ...` items (or the language's equivalent). The intuition: a flag-definition catalog or a static type-table is structurally large but does not require holistic loading by an agent; the agent indexes into it by name rather than reading it linearly. The working-set property is preserved for the files that *would* harm an agent's context (behavioural complexity) and selectively waived for files where the cap was a false positive (declarative shape). The 7,779-line `crates/core/flags/defs.rs` in `ripgrep` is the textbook case: 150 flag definitions, each a small struct + small impl, CC/LOC ≈ 0.01.
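The exemption test can be sketched as a predicate over per-file statistics. The `FileStats` shape and the function name are hypothetical, the sketch covers only the line-cap half of Atomic (not the barrel rule), and the 0.5 cutoff for "predominantly" is an assumption because the text gives no number for it.

```typescript
interface FileStats {
  loc: number;                  // lines of code in the file
  cyclomaticComplexity: number; // summed over the whole file
  declarativeShare: number;     // fraction of top-level items that are declarative, 0..1
}

const LINE_CAP = 700;              // default cap from §4.5
const MAX_CC_PER_LOC = 0.05;       // threshold stated for the dialect
const MIN_DECLARATIVE_SHARE = 0.5; // assumption: "predominantly" is not quantified in the text

// True if the file passes the line-cap part of the Atomic check.
function passesLineCap(stats: FileStats, exemption: "none" | "declarative"): boolean {
  if (stats.loc <= LINE_CAP) return true;
  if (exemption !== "declarative") return false;
  const ccPerLoc = stats.cyclomaticComplexity / stats.loc;
  return ccPerLoc < MAX_CC_PER_LOC && stats.declarativeShare > MIN_DECLARATIVE_SHARE;
}
```

With figures in the spirit of the `defs.rs` case (7,779 LOC and a total complexity of about 78, giving CC/LOC ≈ 0.01, plus an illustrative `declarativeShare` of 0.9), the predicate passes under `"declarative"` and fails under `"none"`.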

**Falsifiable cross-repo experiment.** Across a multi-language corpus of agent-edit failures (cases where an LLM produced a regression while editing a single file), compute the share that fall in declarative-exempt files vs in over-cap behavioural files. The dialect is invalidated if declarative-exempt files correlate with edit failures at the same or higher rate than over-cap behavioural files do — i.e. if the heuristic exempts files the agent actually struggles with. Originally surfaced in the [`ripgrep` audit](/blog/2026-05/sama-v2-rust-project-ripgrep) and [`ripgrep` rebuild sketch](/blog/2026-05/sama-v2-rust-project-ripgrep-rebuilt).

---

## Appendix A — Mapping to the four pillars

| Pillar | Where it lives in v2 |
|---|---|
| **S** — Sorted | §1.2 Law + §4.1; prefix = layer = sort key |
| **A** — Architecture | §1.1 four layers + §2 profiles (the fix for the weak A) |
| **M** — Modeled | §4.3 sibling tests + §4.4 boundary parsing (Layer 2) |
| **A** — Atomic | §4.5 line cap + no barrels |