SAMA v2 §5: ship the core metrics emitter behind /sama/v2/verify
The §6 evolution policy requires deltas on the §5 core metrics
before any "v2 is worth following" claim can be empirical. v1 of the
implementation:
- src/a31_sama_v2.ts now owns the shared pure helpers
(stripStringsAndComments, collectRelativeImports, resolveImport,
isSamaFile/isTestFile, declaredLayer) plus the parse-boundary call-
site detector (PARSE_BOUNDARY_PATTERNS, findParseBoundaryCallSites)
and the SamaV2Metrics/FanByLayer types plus the
WORKING_SET_MIN_LOC/WORKING_SET_MAX_LOC constants.
- src/b32_sama_v2_verify.ts imports from a31; checkModeledBoundary
now consumes findParseBoundaryCallSites — same detector the metric
uses, so the Modeled-boundary check (#4) and the boundaryRatio
metric cannot diverge. The check's violation list is bit-for-bit
identical to before the refactor (existing 20 tests still pass).
- src/b32_sama_v2_metrics.ts (Layer 1, pure) emits the five §5
metrics: graphDepth (memoised DFS, cycle-safe), fanByLayer
({mean, p50, p95, max} per layer for fan-in and fan-out),
boundaryRatio (Layer-2 share of detector call sites; 1.0 vacuous
when no boundaries exist), workingSetFit (share of source files
in 50..500 LOC), violationCounts (per-check, reported even when 0).
- src/b32_sama_v2_metrics.test.ts: 23 deterministic fixture tests,
including a reproducibility check (same input, deep-equal output).
- /sama/v2/verify now renders the §5 metrics block beneath the 7-
check verdict.
- /sama/v2 documents the operational definitions with reasoning
preceding numbers, plus a hand-traced worked example for
boundaryRatio against this repo's actual call sites.
Live numbers on this commit: graphDepth=7, boundaryRatio=100%,
workingSetFit=80%, all violationCounts=0. Verifier still 7/7 ✓.
300/300 tests pass.
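
The "memoised DFS, cycle-safe" note for graphDepth can be illustrated with a minimal sketch. It assumes the adjacency map has already been built from relative imports, and it treats a back-edge target as a terminal of depth 1, which is one reading of the comment in b32_sama_v2_metrics.ts. `Adjacency` and `longestPathDepth` are hypothetical names, not the repository's code.

```typescript
// Hypothetical sketch, not the repository's computeGraphDepth.
// Keys are file paths; values are the paths each file imports.
type Adjacency = Map<string, string[]>;

const longestPathDepth = (adj: Adjacency): number => {
  const memo = new Map<string, number>();
  const inProgress = new Set<string>();

  const depthOf = (node: string): number => {
    const cached = memo.get(node);
    if (cached !== undefined) return cached;
    // Back-edge: the node is already on the DFS stack. Treat it as a
    // terminal of depth 1 so the recursion always stops.
    if (inProgress.has(node)) return 1;
    inProgress.add(node);
    let best = 0;
    for (const next of adj.get(node) ?? []) {
      // Skip edges that point outside the known file set.
      if (!adj.has(next)) continue;
      best = Math.max(best, depthOf(next));
    }
    inProgress.delete(node);
    const depth = 1 + best; // a file with no imports has depth 1
    memo.set(node, depth);
    return depth;
  };

  let max = 0; // an empty graph reports 0
  for (const node of adj.keys()) max = Math.max(max, depthOf(node));
  return max;
};
```

On a cyclic graph the exact value depends on traversal order; the sketch only guarantees a finite result.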
Co-Authored-By: Claude Opus 4.7 <[email protected]>
6 files changed · +806 −113
content/sama/v2.md (+40 −0)
| @@ -141,6 +141,46 @@ Report the **delta** between SAMA-on and SAMA-off runs on these metrics — not | ||
| 141 | 141 | |
| 142 | 142 | --- |
| 143 | 143 | |
| 144 | +## 5 (operational) — Core metrics definitions | |
| 145 | + | |
| 146 | +This subsection pins how the §5 metrics are computed by the verifier at [/sama/v2/verify](/sama/v2/verify). The values are functions of `(sama.profile.toml, src/**.ts)` alone: same source tree + same profile → identical numbers across runs. | |
| 147 | + | |
| 148 | +- **graphDepth** = length of the longest path in the import DAG. Nodes are SAMA source files (`src/*.ts` non-test, matching a profile prefix); edges are static relative-path imports (`from "./...ts"`) between them. A file with no imports has depth 1. Empty graph = 0. Cycles (which the Law check would flag separately) are bounded so the metric still terminates. | |
| 149 | + | |
| 150 | +- **fanByLayer** = for each canonical layer L ∈ {0,1,2,3}, two distribution summaries over the files in L: **fanIn** (per-file count of import edges arriving at that file) and **fanOut** (per-file count of import edges leaving it). Each summary reports `{mean, p50, p95, max}` (nearest-rank percentile) across those per-file counts. Empty layers report all-zero summaries. | |
| 151 | + | |
| 152 | +- **boundaryRatio** = (parse-boundary call sites in Layer 2 files) ÷ (parse-boundary call sites anywhere in the source tree). The set of "parse-boundary call sites" is defined by the shared detector that also powers the §4.4 Modeled-boundary check — currently `JSON.parse(...)` and `new URL(...)` outside string literals and comments. Both consumers share the helper in `src/a31_sama_v2.ts`, so they cannot diverge. When no parse boundaries exist anywhere, `boundaryRatio = 1.0` (vacuously satisfied). | |
| 153 | + | |
| 154 | +- **workingSetFit** = (count of source files with `WORKING_SET_MIN_LOC ≤ LOC ≤ WORKING_SET_MAX_LOC`) ÷ (total source files). The bounds are *intentional defaults documented before the numbers, not retrofitted to flatter this repo*: | |
| 155 | + - **Upper 500** — comfortably below the §4.5 Atomic 700-LOC cap, leaving headroom before a file approaches "split soon" territory. | |
| 156 | + - **Lower 50** — below this, a file is too small to be a substantive module; it is usually a type-only file, a stub, or a single helper that would read better inlined into a sibling. Type-only files (Layer 0 model shards) and minimal test fixtures fall here by design. They are acceptable but counted as "not in the working-set sweet spot" because they are not load-bearing modules. | |
| 157 | + | |
| 158 | + Bounds are hard-coded constants `WORKING_SET_MIN_LOC = 50` and `WORKING_SET_MAX_LOC = 500` in [`src/a31_sama_v2.ts`](/GIT/syntaxai/tdd.md/blob/main/src/a31_sama_v2.ts) for v1 of the metrics emitter. Making them profile-configurable is a deliberate later step (requires extending the TOML subset parser to handle integer values). | |
| 159 | + | |
| 160 | +- **violationCounts** = a record keyed by the seven §4 checks (`sorted`, `architecture`, `modeledTests`, `modeledBoundary`, `atomic`, `law`, `consistency`), each holding the integer count of violations that check produced on this run. Reported even when a check passes (value = 0) — this is §5's "trailing signal: which rules agents *almost* break." The verifier enumerates **all** violations per check (no short-circuit on first failure within a check), so the count is meaningful — not "1 if failed, 0 if passed". | |
| 161 | + | |
| 162 | +### Worked example — boundaryRatio for this repo (hand-traced) | |
| 163 | + | |
| 164 | +The §0 contract ("deterministic program; no LLM judgment") is auditable only if the metric output matches a hand trace. Walking `boundaryRatio` for this repo's `src/` against the live verifier: | |
| 165 | + | |
| 166 | +A raw grep across non-test `src/*.ts` finds seven hits matching `JSON.parse(` and four hits matching `new URL(`. The shared detector strips comments and string literals first, which removes the explanatory mentions inside `// ...` lines and inside docstring literals. After stripping, the surviving real call sites are: | |
| 167 | + | |
| 168 | +| call site | layer (prefix → L) | | |
| 169 | +|---|---| | |
| 170 | +| `src/c13_database.ts:133` `JSON.parse(row.verdict_json)` | `c13_` → L2 | | |
| 171 | +| `src/c13_database.ts:159` `JSON.parse(r.tracked_branches)` | `c13_` → L2 | | |
| 172 | +| `src/c13_database.ts:273` `JSON.parse(r.doc_json)` | `c13_` → L2 | | |
| 173 | +| `src/c13_database.ts:373` `JSON.parse(r.verdict_json)` | `c13_` → L2 | | |
| 174 | +| `src/c14_request_parse.ts:28` `JSON.parse(text)` | `c14_` → L2 | | |
| 175 | +| `src/c14_request_parse.ts:20` `new URL(text)` | `c14_` → L2 | | |
| 176 | +| `src/c14_client_bundle.ts:72` `new URL(import.meta.url)` | `c14_` → L2 | | |
| 177 | + | |
| 178 | +Total: 7 parse-boundary call sites; all 7 fall under prefixes the profile maps to Layer 2. | |
| 179 | + | |
| 180 | +`boundaryRatio = 7 / 7 = 1.0 = 100.0%`, which is exactly what [/sama/v2/verify](/sama/v2/verify) reports under §5 Core metrics. The hand trace is the independent audit of the detector's output; the metric and the Modeled-boundary check (#4) then agree by construction, because both consume `findParseBoundaryCallSites` in [`src/a31_sama_v2.ts`](/GIT/syntaxai/tdd.md/blob/main/src/a31_sama_v2.ts) and so cannot diverge. | |
| 181 | + | |
| 182 | +--- | |
| 183 | + | |
| 144 | 184 | ## 6. Evolution policy (how the standard stays alive without rotting) |
| 145 | 185 | |
| 146 | 186 | - **The core (§1) is frozen.** Changing the four layers or the Law requires a major version and an extraordinarily high evidentiary bar: cross-repo data showing the current core measurably harms agent performance. |
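
The `{mean, p50, p95, max}` summary in the fanByLayer definition above can be sketched as follows. This is a minimal illustration of the nearest-rank method under the stated empty-series rule; `nearestRank` and `summarise` are hypothetical names, and the local `FanSummary` only mirrors the shape declared in `src/a31_sama_v2.ts`.

```typescript
// Hypothetical sketch of a nearest-rank summary; not the repository's code.
interface FanSummary {
  mean: number;
  p50: number;
  p95: number;
  max: number;
}

// Nearest-rank percentile: the value at 1-based rank ceil(p/100 * n)
// in the ascending-sorted series.
const nearestRank = (sorted: number[], p: number): number => {
  if (sorted.length === 0) return 0;
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1] ?? 0;
};

// Summarise a per-file fan-in or fan-out series for one layer.
const summarise = (series: number[]): FanSummary => {
  // Empty layers report all-zero summaries.
  if (series.length === 0) return { mean: 0, p50: 0, p95: 0, max: 0 };
  const sorted = [...series].sort((a, b) => a - b);
  const sum = sorted.reduce((acc, v) => acc + v, 0);
  return {
    mean: sum / sorted.length,
    p50: nearestRank(sorted, 50),
    p95: nearestRank(sorted, 95),
    max: sorted[sorted.length - 1] ?? 0,
  };
};
```

Nearest-rank always returns a value that actually occurs in the series, so every percentile of an integer fan count is itself an integer.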
src/a31_sama_v2.ts (+210 −10)
| @@ -1,9 +1,10 @@ | ||
| 1 | -// c31 — model: types for the SAMA v2 verifier pipeline. Pure data | |
| 2 | -// shapes: the parsed profile (ProfileSpec), the verifier's input | |
| 3 | -// (SamaV2Input), and its output (SamaV2Report). No I/O lives here; | |
| 4 | -// c14_sama_profile parses the .toml into ProfileSpec, c32_sama_v2_verify | |
| 5 | -// applies the seven §4 checks against (ProfileSpec, files), and | |
| 6 | -// c21_handlers_sama renders the SamaV2Report. | |
| 1 | +// a31 — model: types, constants, and pure helpers for the SAMA v2 | |
| 2 | +// verifier + §5 core metrics emitter. No I/O lives here. c14_sama_profile | |
| 3 | +// parses .toml into ProfileSpec; b32_sama_v2_verify applies the seven §4 | |
| 4 | +// checks; b32_sama_v2_metrics computes the §5 metrics. The verifier and | |
| 5 | +// metrics emitter share the helpers below — particularly the parse- | |
| 6 | +// boundary detector — so the Modeled-boundary check (#4) and | |
| 7 | +// boundaryRatio metric move in lockstep. | |
| 7 | 8 | |
| 8 | 9 | export type LayerNumber = 0 | 1 | 2 | 3; |
| 9 | 10 | |
| @@ -37,7 +38,7 @@ export interface ProfileSpec { | ||
| 37 | 38 | |
| 38 | 39 | export interface SamaV2Input { |
| 39 | 40 | profile: ProfileSpec; |
| 40 | - // Map keyed by repo-relative path (e.g. "src/c11_server.ts") to | |
| 41 | + // Map keyed by repo-relative path (e.g. "src/d11_server.ts") to | |
| 41 | 42 | // file contents. The verifier never reads files itself; the loader |
| 42 | 43 | // populates this map. |
| 43 | 44 | files: Map<string, string>; |
| @@ -79,9 +80,71 @@ export interface SamaV2Report { | ||
| 79 | 80 | overallPassed: boolean; |
| 80 | 81 | } |
| 81 | 82 | |
| 82 | -// Helper used in the verifier and re-exported here so call sites can | |
| 83 | -// type-narrow against the same source: returns the layer number a | |
| 84 | -// file's basename declares, or null if no profile prefix matches. | |
| 83 | +// — §5 core metrics: shape ---------------------------------------- | |
| 84 | +// | |
| 85 | +// Operational definitions are pinned on /sama/v2 §5 (operational). | |
| 86 | +// The metric VALUES are computed in b32_sama_v2_metrics; this file | |
| 87 | +// just declares the shape so callers (and the renderer) can type-narrow. | |
| 88 | + | |
| 89 | +export interface FanSummary { | |
| 90 | + // {mean, p50, p95, max} over a per-file fan-in or fan-out series. | |
| 91 | + // Empty series → all zeros. | |
| 92 | + mean: number; | |
| 93 | + p50: number; | |
| 94 | + p95: number; | |
| 95 | + max: number; | |
| 96 | +} | |
| 97 | + | |
| 98 | +export interface FanByLayer { | |
| 99 | + 0: { fanIn: FanSummary; fanOut: FanSummary }; | |
| 100 | + 1: { fanIn: FanSummary; fanOut: FanSummary }; | |
| 101 | + 2: { fanIn: FanSummary; fanOut: FanSummary }; | |
| 102 | + 3: { fanIn: FanSummary; fanOut: FanSummary }; | |
| 103 | +} | |
| 104 | + | |
| 105 | +export interface SamaV2ViolationCounts { | |
| 106 | + // Counts of violations per §4 check, reported even when a check | |
| 107 | + // passes (value = 0). This is §5's "trailing signal: which rules | |
| 108 | + // agents *almost* break." | |
| 109 | + sorted: number; | |
| 110 | + architecture: number; | |
| 111 | + modeledTests: number; | |
| 112 | + modeledBoundary: number; | |
| 113 | + atomic: number; | |
| 114 | + law: number; | |
| 115 | + consistency: number; | |
| 116 | +} | |
| 117 | + | |
| 118 | +export interface SamaV2Metrics { | |
| 119 | + graphDepth: number; | |
| 120 | + fanByLayer: FanByLayer; | |
| 121 | + boundaryRatio: number; | |
| 122 | + workingSetFit: number; | |
| 123 | + violationCounts: SamaV2ViolationCounts; | |
| 124 | +} | |
| 125 | + | |
| 126 | +// — Working-set bounds (per /sama/v2 §5 documented reasoning) ----- | |
| 127 | +// | |
| 128 | +// Upper 500: comfortably below the §4.5 Atomic 700-LOC cap, leaving | |
| 129 | +// headroom before a file approaches "split soon" territory. | |
| 130 | +// Lower 50: below this, a file is too small to be a substantive | |
| 131 | +// module — usually a type-only file, a stub, or a single helper that | |
| 132 | +// would read better inlined into a sibling. Type-only files (Layer 0 | |
| 133 | +// model shards) and minimal test fixtures fall here by design; they | |
| 134 | +// are acceptable but counted as "not in the working-set sweet spot" | |
| 135 | +// because they are not load-bearing modules. | |
| 136 | +// | |
| 137 | +// Hard-coded for v1 of the metrics emitter. Making them profile- | |
| 138 | +// configurable is a deliberate later step (requires extending the | |
| 139 | +// TOML subset parser to handle integer values). | |
| 140 | +export const WORKING_SET_MIN_LOC = 50; | |
| 141 | +export const WORKING_SET_MAX_LOC = 500; | |
| 142 | + | |
| 143 | +// — Layer assignment helper -------------------------------------- | |
| 144 | +// | |
| 145 | +// Returns the canonical layer a file's basename declares via prefix, | |
| 146 | +// or null if no profile prefix matches. The verifier and metrics | |
| 147 | +// emitter both call this for every file they examine. | |
| 85 | 148 | export const declaredLayer = ( |
| 86 | 149 | path: string, |
| 87 | 150 | profile: ProfileSpec, |
| @@ -95,3 +158,140 @@ export const declaredLayer = ( | ||
| 95 | 158 | } |
| 96 | 159 | return null; |
| 97 | 160 | }; |
| 161 | + | |
| 162 | +// — File classifiers --------------------------------------------- | |
| 163 | + | |
| 164 | +// A SAMA file is one we expect to obey the layer rules: any *.ts | |
| 165 | +// under src/ that isn't a *.test.ts. Tests live next to source as | |
| 166 | +// siblings; they're examined for the Modeled check but don't carry | |
| 167 | +// their own layer. | |
| 168 | +export const isSamaFile = (path: string): boolean => | |
| 169 | + path.startsWith("src/") && path.endsWith(".ts") && !path.endsWith(".test.ts"); | |
| 170 | + | |
| 171 | +export const isTestFile = (path: string): boolean => | |
| 172 | + path.startsWith("src/") && path.endsWith(".test.ts"); | |
| 173 | + | |
| 174 | +// — Source-mask helpers ------------------------------------------ | |
| 175 | + | |
| 176 | +// Strip JS/TS string literals and comments to whitespace so a regex | |
| 177 | +// that walks the source doesn't trip on test fixtures that contain | |
| 178 | +// the very patterns we're scanning for. Preserves newline positions | |
| 179 | +// so line numbers stay aligned. | |
| 180 | +export const stripStringsAndComments = (src: string): string => { | |
| 181 | + let out = ""; | |
| 182 | + let i = 0; | |
| 183 | + while (i < src.length) { | |
| 184 | + const c = src[i]; | |
| 185 | + const n = src[i + 1]; | |
| 186 | + if (c === "/" && n === "/") { | |
| 187 | + out += "  "; // two spaces, one per character of the "//" opener | |
| 188 | + i += 2; | |
| 189 | + while (i < src.length && src[i] !== "\n") { out += " "; i++; } | |
| 190 | + } else if (c === "/" && n === "*") { | |
| 191 | + out += "  "; // two spaces, one per character of the "/*" opener | |
| 192 | + i += 2; | |
| 193 | + while (i < src.length - 1 && !(src[i] === "*" && src[i + 1] === "/")) { | |
| 194 | + out += src[i] === "\n" ? "\n" : " "; | |
| 195 | + i++; | |
| 196 | + } | |
| 197 | + out += "  "; // two spaces, one per character of the "*/" closer | |
| 198 | + i += 2; | |
| 199 | + } else if (c === '"' || c === "'" || c === "`") { | |
| 200 | + const quote = c; | |
| 201 | + out += " "; | |
| 202 | + i++; | |
| 203 | + while (i < src.length && src[i] !== quote) { | |
| 204 | + if (src[i] === "\\" && i + 1 < src.length) { out += src[i + 1] === "\n" ? " \n" : "  "; i += 2; continue; } // keep an escaped newline so line numbers stay aligned | |
| 205 | + out += src[i] === "\n" ? "\n" : " "; | |
| 206 | + i++; | |
| 207 | + } | |
| 208 | + out += " "; | |
| 209 | + i++; | |
| 210 | + } else { | |
| 211 | + out += c; | |
| 212 | + i++; | |
| 213 | + } | |
| 214 | + } | |
| 215 | + return out; | |
| 216 | +}; | |
| 217 | + | |
| 218 | +// Collect every relative ".ts" import edge in a file. Scans raw | |
| 219 | +// source: a stripped copy would erase the quoted import paths along | |
| 220 | +// with all other string literals, so the regex must run over the | |
| 221 | +// original. To avoid picking up import-like strings inside test | |
| 222 | +// fixtures, we cross-check each match position against the stripped | |
| 223 | +// mask — if the keyword `from` lands on whitespace in the mask, it | |
| 224 | +// was inside a string literal and we skip it. | |
| 225 | +export const collectRelativeImports = (content: string): string[] => { | |
| 226 | + const mask = stripStringsAndComments(content); | |
| 227 | + const re = /\bfrom\s+["'](\.\/[A-Za-z0-9_./-]+\.ts)["']/g; | |
| 228 | + const out: string[] = []; | |
| 229 | + let m: RegExpExecArray | null; | |
| 230 | + while ((m = re.exec(content)) !== null) { | |
| 231 | + // If the `from` keyword position is whitespace in the mask, the | |
| 232 | + // entire match was inside a string literal (e.g. a test fixture). | |
| 233 | + if (mask[m.index] === " " || mask[m.index] === "\n") continue; | |
| 234 | + if (m[1]) out.push(m[1]); | |
| 235 | + } | |
| 236 | + return out; | |
| 237 | +}; | |
| 238 | + | |
| 239 | +// Resolve a relative import like "./c14_git.ts" from the importing | |
| 240 | +// file's directory to the repo-relative path used as the input map's | |
| 241 | +// key (e.g. "src/c14_git.ts"). | |
| 242 | +export const resolveImport = (fromPath: string, importPath: string): string => { | |
| 243 | + const dir = fromPath.split("/").slice(0, -1).join("/"); | |
| 244 | + const rel = importPath.replace(/^\.\//, ""); | |
| 245 | + return dir + "/" + rel; | |
| 246 | +}; | |
| 247 | + | |
| 248 | +// — Parse-boundary call-site detector ----------------------------- | |
| 249 | +// | |
| 250 | +// Source of truth for what counts as "external input parsed at the | |
| 251 | +// boundary" under SAMA v2 §4.4. Consumed by: | |
| 252 | +// - b32_sama_v2_verify.checkModeledBoundary (#4) — flags Layer 1/3 | |
| 253 | +// files where any pattern occurs; emits one violation per | |
| 254 | +// (file, pattern) pair preserving PARSE_BOUNDARY_PATTERNS order. | |
| 255 | +// - b32_sama_v2_metrics.boundaryRatio — counts every individual | |
| 256 | +// call site and reports the Layer-2 share. | |
| 257 | +// If you change the patterns, both check and metric move in lockstep. | |
| 258 | + | |
| 259 | +export const PARSE_BOUNDARY_PATTERNS: ReadonlyArray<{ | |
| 260 | + name: "JSON.parse" | "new URL"; | |
| 261 | + source: string; | |
| 262 | +}> = [ | |
| 263 | + { name: "JSON.parse", source: "\\bJSON\\.parse\\s*\\(" }, | |
| 264 | + { name: "new URL", source: "\\bnew\\s+URL\\s*\\(" }, | |
| 265 | +]; | |
| 266 | + | |
| 267 | +export interface ParseBoundaryCallSite { | |
| 268 | + file: string; | |
| 269 | + pattern: "JSON.parse" | "new URL"; | |
| 270 | + // Position in the stripped source. Useful for line-number lookup; | |
| 271 | + // the verifier currently only needs (file, pattern). | |
| 272 | + index: number; | |
| 273 | +} | |
| 274 | + | |
| 275 | +// Walk every SAMA file (src/*.ts non-test) and return every parse- | |
| 276 | +// boundary call site. Operates on the stripped source so string- | |
| 277 | +// literal fixtures don't false-positive. Iteration order: files in | |
| 278 | +// input map order (Map preserves insertion order), patterns in | |
| 279 | +// PARSE_BOUNDARY_PATTERNS order, occurrences in source order. | |
| 280 | +export const findParseBoundaryCallSites = ( | |
| 281 | + files: Map<string, string>, | |
| 282 | +): ParseBoundaryCallSite[] => { | |
| 283 | + const out: ParseBoundaryCallSite[] = []; | |
| 284 | + for (const [path, content] of files) { | |
| 285 | + if (!isSamaFile(path)) continue; | |
| 286 | + const stripped = stripStringsAndComments(content); | |
| 287 | + for (const pat of PARSE_BOUNDARY_PATTERNS) { | |
| 288 | + // Fresh regex per file so lastIndex never bleeds. | |
| 289 | + const re = new RegExp(pat.source, "g"); | |
| 290 | + let m: RegExpExecArray | null; | |
| 291 | + while ((m = re.exec(stripped)) !== null) { | |
| 292 | + out.push({ file: path, pattern: pat.name, index: m.index }); | |
| 293 | + } | |
| 294 | + } | |
| 295 | + } | |
| 296 | + return out; | |
| 297 | +}; | |
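
The `index` field on `ParseBoundaryCallSite` is described above as useful for line-number lookup. A minimal sketch of that lookup follows, assuming the mask keeps every newline at its original offset, as the `stripStringsAndComments` comment states. `lineOfIndex` is a hypothetical helper and is not part of this commit.

```typescript
// Hypothetical helper: map a character offset to a 1-based line number.
// Works on either the original source or the stripped mask, provided
// both have newlines at the same offsets.
const lineOfIndex = (source: string, index: number): number => {
  let line = 1;
  const end = Math.min(Math.max(index, 0), source.length);
  for (let i = 0; i < end; i++) {
    // Every newline before the offset pushes the match one line down.
    if (source[i] === "\n") line++;
  }
  return line;
};
```

A caller could pair it with a call site as `lineOfIndex(content, site.index)` to produce `file:line` references like those in the worked example table.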
src/b32_sama_v2_metrics.test.ts (+253 −0)
| @@ -0,0 +1,253 @@ | ||
| 1 | +import { describe, expect, test } from "bun:test"; | |
| 2 | +import { computeCoreMetrics } from "./b32_sama_v2_metrics.ts"; | |
| 3 | +import { | |
| 4 | + WORKING_SET_MAX_LOC, | |
| 5 | + WORKING_SET_MIN_LOC, | |
| 6 | + type ProfileSpec, | |
| 7 | + type SamaV2Input, | |
| 8 | +} from "./a31_sama_v2.ts"; | |
| 9 | + | |
| 10 | +// Flat fixture profile (one prefix per layer) so the metric tests | |
| 11 | +// don't depend on the live profile. The Law-check sublayer ordering | |
| 12 | +// isn't relevant here — these tests target the metrics computation, | |
| 13 | +// not the conformance verdict. | |
| 14 | +const FIXTURE_PROFILE: ProfileSpec = { | |
| 15 | + samaVersion: "2.0", | |
| 16 | + profile: "metrics-test", | |
| 17 | + layers: { | |
| 18 | + 0: { sublayers: [{ name: "default", prefix: "p0_", index: 0 }] }, | |
| 19 | + 1: { sublayers: [{ name: "default", prefix: "p1_", index: 0 }] }, | |
| 20 | + 2: { sublayers: [{ name: "default", prefix: "p2_", index: 0 }] }, | |
| 21 | + 3: { sublayers: [{ name: "default", prefix: "p3_", index: 0 }] }, | |
| 22 | + }, | |
| 23 | +}; | |
| 24 | + | |
| 25 | +const mk = (entries: Array<[string, string]>): SamaV2Input => ({ | |
| 26 | + profile: FIXTURE_PROFILE, | |
| 27 | + files: new Map(entries), | |
| 28 | +}); | |
| 29 | + | |
| 30 | +// Helper: produce a file with `n` lines of harmless code (so | |
| 31 | +// split("\n").length === n). | |
| 32 | +const linesOf = (n: number): string => | |
| 33 | + Array.from({ length: n }, (_, i) => `const x${i} = ${i};`).join("\n"); | |
| 34 | + | |
| 35 | +// Helper: a minimal sibling test body for Layer-1/2 fixtures. | |
| 36 | +const TEST_BODY = 'import { test, expect } from "bun:test"; test("ok", () => { expect(1).toBe(1); });\n'; | |
| 37 | + | |
| 38 | +describe("computeCoreMetrics — graphDepth", () => { | |
| 39 | + test("empty repo → 0", () => { | |
| 40 | + const m = computeCoreMetrics(mk([])); | |
| 41 | + expect(m.graphDepth).toBe(0); | |
| 42 | + }); | |
| 43 | + | |
| 44 | + test("single file with no imports → 1", () => { | |
| 45 | + const m = computeCoreMetrics(mk([ | |
| 46 | + ["src/p0_a.ts", "export const x = 1;\n"], | |
| 47 | + ])); | |
| 48 | + expect(m.graphDepth).toBe(1); | |
| 49 | + }); | |
| 50 | + | |
| 51 | + test("chain p3 → p2 → p1 → p0 → 4", () => { | |
| 52 | + const m = computeCoreMetrics(mk([ | |
| 53 | + ["src/p0_a.ts", "export const x = 1;\n"], | |
| 54 | + ["src/p1_a.ts", `import { x } from "./p0_a.ts";\nexport const y = x;\n`], | |
| 55 | + ["src/p1_a.test.ts", TEST_BODY], | |
| 56 | + ["src/p2_a.ts", `import { y } from "./p1_a.ts";\nexport const z = y;\n`], | |
| 57 | + ["src/p2_a.test.ts", TEST_BODY], | |
| 58 | + ["src/p3_a.ts", `import { z } from "./p2_a.ts";\nexport const w = z;\n`], | |
| 59 | + ])); | |
| 60 | + expect(m.graphDepth).toBe(4); | |
| 61 | + }); | |
| 62 | + | |
| 63 | + test("a cycle is bounded (does not infinite-loop)", () => { | |
| 64 | + // p1_a ↔ p1_b cycle (same-layer; Law would flag it, but graphDepth | |
| 65 | + // must still terminate with a finite number). | |
| 66 | + const m = computeCoreMetrics(mk([ | |
| 67 | + ["src/p1_a.ts", `import { y } from "./p1_b.ts";\nexport const x = y;\n`], | |
| 68 | + ["src/p1_a.test.ts", TEST_BODY], | |
| 69 | + ["src/p1_b.ts", `import { x } from "./p1_a.ts";\nexport const y = x;\n`], | |
| 70 | + ["src/p1_b.test.ts", TEST_BODY], | |
| 71 | + ])); | |
| 72 | + expect(Number.isFinite(m.graphDepth)).toBe(true); | |
| 73 | + expect(m.graphDepth).toBeGreaterThanOrEqual(1); | |
| 74 | + }); | |
| 75 | +}); | |
| 76 | + | |
| 77 | +describe("computeCoreMetrics — fanByLayer", () => { | |
| 78 | + test("empty repo → all-zero summaries", () => { | |
| 79 | + const m = computeCoreMetrics(mk([])); | |
| 80 | + for (const L of [0, 1, 2, 3] as const) { | |
| 81 | + expect(m.fanByLayer[L].fanIn).toEqual({ mean: 0, p50: 0, p95: 0, max: 0 }); | |
| 82 | + expect(m.fanByLayer[L].fanOut).toEqual({ mean: 0, p50: 0, p95: 0, max: 0 }); | |
| 83 | + } | |
| 84 | + }); | |
| 85 | + | |
| 86 | + test("single Layer-0 file with no edges → all zeros at L0", () => { | |
| 87 | + const m = computeCoreMetrics(mk([ | |
| 88 | + ["src/p0_a.ts", "export const x = 1;\n"], | |
| 89 | + ])); | |
| 90 | + expect(m.fanByLayer[0].fanIn.max).toBe(0); | |
| 91 | + expect(m.fanByLayer[0].fanOut.max).toBe(0); | |
| 92 | + }); | |
| 93 | + | |
| 94 | + test("two Layer-1 files importing same Layer-0 → L0.fanIn.max = 2, L1.fanOut.max = 1", () => { | |
| 95 | + const m = computeCoreMetrics(mk([ | |
| 96 | + ["src/p0_a.ts", "export const x = 1;\n"], | |
| 97 | + ["src/p1_a.ts", `import { x } from "./p0_a.ts";\nexport const y = x;\n`], | |
| 98 | + ["src/p1_a.test.ts", TEST_BODY], | |
| 99 | + ["src/p1_b.ts", `import { x } from "./p0_a.ts";\nexport const z = x;\n`], | |
| 100 | + ["src/p1_b.test.ts", TEST_BODY], | |
| 101 | + ])); | |
| 102 | + expect(m.fanByLayer[0].fanIn.max).toBe(2); | |
| 103 | + expect(m.fanByLayer[1].fanOut.max).toBe(1); | |
| 104 | + expect(m.fanByLayer[1].fanIn.max).toBe(0); | |
| 105 | + }); | |
| 106 | +}); | |
| 107 | + | |
| 108 | +describe("computeCoreMetrics — boundaryRatio", () => { | |
| 109 | + test("no parse boundaries anywhere → 1.0 (vacuously)", () => { | |
| 110 | + const m = computeCoreMetrics(mk([ | |
| 111 | + ["src/p0_a.ts", "export const x = 1;\n"], | |
| 112 | + ])); | |
| 113 | + expect(m.boundaryRatio).toBe(1.0); | |
| 114 | + }); | |
| 115 | + | |
| 116 | + test("JSON.parse only in Layer 2 → 1.0", () => { | |
| 117 | + const m = computeCoreMetrics(mk([ | |
| 118 | + ["src/p2_a.ts", "export const f = (s: string) => JSON.parse(s);\n"], | |
| 119 | + ["src/p2_a.test.ts", TEST_BODY], | |
| 120 | + ])); | |
| 121 | + expect(m.boundaryRatio).toBe(1.0); | |
| 122 | + }); | |
| 123 | + | |
| 124 | + test("JSON.parse in Layer 1 and Layer 2 → 0.5", () => { | |
| 125 | + const m = computeCoreMetrics(mk([ | |
| 126 | + ["src/p1_a.ts", "export const f = (s: string) => JSON.parse(s);\n"], | |
| 127 | + ["src/p1_a.test.ts", TEST_BODY], | |
| 128 | + ["src/p2_a.ts", "export const g = (s: string) => JSON.parse(s);\n"], | |
| 129 | + ["src/p2_a.test.ts", TEST_BODY], | |
| 130 | + ])); | |
| 131 | + expect(m.boundaryRatio).toBe(0.5); | |
| 132 | + }); | |
| 133 | + | |
| 134 | + test("string literal containing JSON.parse doesn't false-positive", () => { | |
| 135 | + const m = computeCoreMetrics(mk([ | |
| 136 | + ["src/p1_a.ts", `const explainer = "call JSON.parse here";\nexport const x = explainer.length;\n`], | |
| 137 | + ["src/p1_a.test.ts", TEST_BODY], | |
| 138 | + ])); | |
| 139 | + expect(m.boundaryRatio).toBe(1.0); | |
| 140 | + }); | |
| 141 | + | |
| 142 | + test("counts every call site, not just every file", () => { | |
| 143 | + // Two JSON.parse in one Layer-2 file, one in Layer-1 → ratio = 2/3 | |
| 144 | + const m = computeCoreMetrics(mk([ | |
| 145 | + ["src/p1_a.ts", "export const f = (s: string) => JSON.parse(s);\n"], | |
| 146 | + ["src/p1_a.test.ts", TEST_BODY], | |
| 147 | + ["src/p2_a.ts", "export const g = (s: string) => JSON.parse(s);\nexport const h = (s: string) => JSON.parse(s);\n"], | |
| 148 | + ["src/p2_a.test.ts", TEST_BODY], | |
| 149 | + ])); | |
| 150 | + expect(m.boundaryRatio).toBeCloseTo(2 / 3, 6); | |
| 151 | + }); | |
| 152 | +}); | |
| 153 | + | |
| 154 | +describe("computeCoreMetrics — workingSetFit", () => { | |
| 155 | + test("empty repo → 1.0", () => { | |
| 156 | + const m = computeCoreMetrics(mk([])); | |
| 157 | + expect(m.workingSetFit).toBe(1.0); | |
| 158 | + }); | |
| 159 | + | |
| 160 | + test("a single 100-line file → 1.0", () => { | |
| 161 | + const m = computeCoreMetrics(mk([ | |
| 162 | + ["src/p0_a.ts", linesOf(100)], | |
| 163 | + ])); | |
| 164 | + expect(m.workingSetFit).toBe(1.0); | |
| 165 | + }); | |
| 166 | + | |
| 167 | + test("a 10-line file falls below the min → 0.0", () => { | |
| 168 | + const m = computeCoreMetrics(mk([ | |
| 169 | + ["src/p0_a.ts", linesOf(10)], | |
| 170 | + ])); | |
| 171 | + expect(m.workingSetFit).toBe(0.0); | |
| 172 | + }); | |
| 173 | + | |
| 174 | + test("a 600-line file exceeds the max → 0.0", () => { | |
| 175 | + const m = computeCoreMetrics(mk([ | |
| 176 | + ["src/p0_a.ts", linesOf(600)], | |
| 177 | + ])); | |
| 178 | + expect(m.workingSetFit).toBe(0.0); | |
| 179 | + }); | |
| 180 | + | |
| 181 | + test("two files: one 100-line (in), one 10-line (out) → 0.5", () => { | |
| 182 | + const m = computeCoreMetrics(mk([ | |
| 183 | + ["src/p0_a.ts", linesOf(100)], | |
| 184 | + ["src/p0_b.ts", linesOf(10)], | |
| 185 | + ])); | |
| 186 | + expect(m.workingSetFit).toBe(0.5); | |
| 187 | + }); | |
| 188 | + | |
| 189 | + test("exact bounds are inclusive (50 and 500 count as in the sweet spot)", () => { | |
| 190 | + const m = computeCoreMetrics(mk([ | |
| 191 | + ["src/p0_min.ts", linesOf(WORKING_SET_MIN_LOC)], | |
| 192 | + ["src/p0_max.ts", linesOf(WORKING_SET_MAX_LOC)], | |
| 193 | + ])); | |
| 194 | + expect(m.workingSetFit).toBe(1.0); | |
| 195 | + }); | |
| 196 | + | |
| 197 | + test("test files don't count toward the metric (only SAMA source files)", () => { | |
| 198 | + // One 100-line Layer-1 source + a tiny sibling test. Sibling test | |
| 199 | + // is 1 line, far below the min, but it's excluded. | |
| 200 | + const m = computeCoreMetrics(mk([ | |
| 201 | + ["src/p1_a.ts", linesOf(100)], | |
| 202 | + ["src/p1_a.test.ts", TEST_BODY], | |
| 203 | + ])); | |
| 204 | + expect(m.workingSetFit).toBe(1.0); | |
| 205 | + }); | |
| 206 | +}); | |
| 207 | + | |
| 208 | +describe("computeCoreMetrics — violationCounts", () => { | |
| 209 | + test("conforming fixture → all counts = 0", () => { | |
| 210 | + const m = computeCoreMetrics(mk([ | |
| 211 | + ["src/p0_a.ts", "export const x = 1;\n"], | |
| 212 | + ])); | |
| 213 | + expect(m.violationCounts).toEqual({ | |
| 214 | + sorted: 0, architecture: 0, modeledTests: 0, modeledBoundary: 0, | |
| 215 | + atomic: 0, law: 0, consistency: 0, | |
| 216 | + }); | |
| 217 | + }); | |
| 218 | + | |
| 219 | + test("Layer-1 file without sibling test → modeledTests = 1", () => { | |
| 220 | + const m = computeCoreMetrics(mk([ | |
| 221 | + ["src/p1_a.ts", "export const y = 1;\n"], | |
| 222 | + ])); | |
| 223 | + expect(m.violationCounts.modeledTests).toBe(1); | |
| 224 | + }); | |
| 225 | + | |
| 226 | + test("counts are populated even when overall verdict is conforming (trailing signal shape)", () => { | |
| 227 | + // Single Layer-0 file → all checks pass → all counts are 0 (not | |
| 228 | + // missing). This is the §5 contract: keys exist regardless. | |
| 229 | + const m = computeCoreMetrics(mk([ | |
| 230 | + ["src/p0_a.ts", "export const x = 1;\n"], | |
| 231 | + ])); | |
| 232 | + const keys = Object.keys(m.violationCounts).sort(); | |
| 233 | + expect(keys).toEqual([ | |
| 234 | + "architecture", "atomic", "consistency", "law", | |
| 235 | + "modeledBoundary", "modeledTests", "sorted", | |
| 236 | + ]); | |
| 237 | + }); | |
| 238 | +}); | |
| 239 | + | |
| 240 | +describe("computeCoreMetrics — reproducibility", () => { | |
| 241 | + test("same input → identical output across two runs (deep-equal)", () => { | |
| 242 | + const input = mk([ | |
| 243 | + ["src/p0_a.ts", "export const x = 1;\n"], | |
| 244 | + ["src/p1_a.ts", `import { x } from "./p0_a.ts";\nexport const y = x;\n`], | |
| 245 | + ["src/p1_a.test.ts", TEST_BODY], | |
| 246 | + ["src/p2_a.ts", `import { y } from "./p1_a.ts";\nexport const f = (s: string) => JSON.parse(s);\n`], | |
| 247 | + ["src/p2_a.test.ts", TEST_BODY], | |
| 248 | + ]); | |
| 249 | + const m1 = computeCoreMetrics(input); | |
| 250 | + const m2 = computeCoreMetrics(input); | |
| 251 | + expect(m1).toEqual(m2); | |
| 252 | + }); | |
| 253 | +}); | |
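
The fixture expectations above pin down boundaryRatio and workingSetFit closely enough to sketch both ratios. This is an illustration under assumptions, not the emitter itself: LOC is taken to be the number of `"\n"`-separated segments (suggested by the `linesOf` helper comment), and the function names and input shapes are hypothetical.

```typescript
// Hypothetical sketch of the two ratio metrics; not the repository's code.
const WORKING_SET_MIN_LOC = 50;
const WORKING_SET_MAX_LOC = 500;

// Assumption: LOC is the count of "\n"-separated segments.
const countLoc = (content: string): number => content.split("\n").length;

// Share of source files whose LOC falls inside the inclusive bounds.
const workingSetFit = (sources: string[]): number => {
  if (sources.length === 0) return 1.0; // empty repo is vacuously in range
  const inRange = sources.filter((s) => {
    const loc = countLoc(s);
    return loc >= WORKING_SET_MIN_LOC && loc <= WORKING_SET_MAX_LOC;
  }).length;
  return inRange / sources.length;
};

// Share of parse-boundary call sites that sit in Layer 2 files.
// Each entry is the declared layer of the file holding one call site,
// or null when no profile prefix matches.
const boundaryRatio = (siteLayers: Array<number | null>): number => {
  if (siteLayers.length === 0) return 1.0; // no boundaries: vacuously satisfied
  const inLayer2 = siteLayers.filter((layer) => layer === 2).length;
  return inLayer2 / siteLayers.length;
};
```

Both functions return 1.0 for empty input, which keeps a repo with nothing to measure from being reported as a failure.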
src/b32_sama_v2_metrics.ts (+220 −0)
| @@ -0,0 +1,220 @@ | ||
| 1 | +// b32 — logic: SAMA v2 §5 core metrics emitter. Pure function over | |
| 2 | +// SamaV2Input that returns the five §5 metrics (graphDepth, fanByLayer, | |
| 3 | +// boundaryRatio, workingSetFit, violationCounts). No I/O, no clock, | |
| 4 | +// no filesystem; same source tree + same profile → identical numbers. | |
| 5 | +// | |
| 6 | +// This is the empirical artefact §6 of /sama/v2 requires before any | |
| 7 | +// later claim (skeleton, agent experiment, external repo audit) can | |
| 8 | +// be measured as a delta. Operational definitions live on /sama/v2 §5. | |
| 9 | +// | |
| 10 | +// Shared helpers (declaredLayer, isSamaFile, collectRelativeImports, | |
| 11 | +// resolveImport, findParseBoundaryCallSites) come from a31_sama_v2 so | |
| 12 | +// this module and b32_sama_v2_verify agree by construction — the | |
| 13 | +// Modeled-boundary check (#4) and boundaryRatio metric consume the | |
| 14 | +// same detector and cannot diverge. | |
| 15 | + | |
| 16 | +import { | |
| 17 | + WORKING_SET_MAX_LOC, | |
| 18 | + WORKING_SET_MIN_LOC, | |
| 19 | + collectRelativeImports, | |
| 20 | + declaredLayer, | |
| 21 | + findParseBoundaryCallSites, | |
| 22 | + isSamaFile, | |
| 23 | + resolveImport, | |
| 24 | + type FanByLayer, | |
| 25 | + type FanSummary, | |
| 26 | + type LayerNumber, | |
| 27 | + type SamaV2Input, | |
| 28 | + type SamaV2Metrics, | |
| 29 | + type SamaV2ViolationCounts, | |
| 30 | +} from "./a31_sama_v2.ts"; | |
| 31 | +import { verifySamaV2 } from "./b32_sama_v2_verify.ts"; | |
| 32 | + | |
| 33 | +// — graphDepth ---------------------------------------------------- | |
| 34 | +// | |
| 35 | +// Longest path in the import DAG. Nodes = SAMA source files (src/*.ts | |
| 36 | +// non-test); edges = static relative-path imports between them. A | |
| 37 | +// file with no imports has depth 1. Empty graph = 0. | |
| 38 | +// | |
| 39 | +// Memoised DFS. If a cycle is encountered (the Law check would flag | |
| 40 | +// it separately), we treat the back-edge target as a terminal of | |
| 41 | +// depth 1 so the metric still terminates with a finite number. | |
| 42 | +const computeGraphDepth = (files: Map<string, string>): number => { | |
| 43 | + const samaPaths = [...files.keys()].filter(isSamaFile); | |
| 44 | + if (samaPaths.length === 0) return 0; | |
| 45 | + | |
| 46 | + // Build adjacency (only edges that land on known SAMA files). | |
| 47 | + const adj = new Map<string, string[]>(); | |
| 48 | + for (const path of samaPaths) { | |
| 49 | + const content = files.get(path) ?? ""; | |
| 50 | + const out: string[] = []; | |
| 51 | + for (const imp of collectRelativeImports(content)) { | |
| 52 | + const resolved = resolveImport(path, imp); | |
| 53 | + if (files.has(resolved) && isSamaFile(resolved)) out.push(resolved); | |
| 54 | + } | |
| 55 | + adj.set(path, out); | |
| 56 | + } | |
| 57 | + | |
| 58 | + const memo = new Map<string, number>(); | |
| 59 | + const visiting = new Set<string>(); | |
| 60 | + | |
| 61 | + const depth = (node: string): number => { | |
| 62 | + const cached = memo.get(node); | |
| 63 | + if (cached !== undefined) return cached; | |
| 64 | + if (visiting.has(node)) return 1; // cycle: treat as terminal | |
| 65 | + visiting.add(node); | |
| 66 | + let best = 1; | |
| 67 | + for (const next of adj.get(node) ?? []) { | |
| 68 | + const d = depth(next) + 1; | |
| 69 | + if (d > best) best = d; | |
| 70 | + } | |
| 71 | + visiting.delete(node); | |
| 72 | + memo.set(node, best); | |
| 73 | + return best; | |
| 74 | + }; | |
| 75 | + | |
| 76 | + let max = 0; | |
| 77 | + for (const p of samaPaths) { | |
| 78 | + const d = depth(p); | |
| 79 | + if (d > max) max = d; | |
| 80 | + } | |
| 81 | + return max; | |
| 82 | +}; | |
| 83 | + | |
| 84 | +// — fanByLayer ---------------------------------------------------- | |
| 85 | +// | |
| 86 | +// Per canonical layer L ∈ {0,1,2,3}: fan-in (count of edges arriving | |
| 87 | +// at files in L) and fan-out (count of edges leaving files in L). | |
| 88 | +// Each summary = {mean, p50, p95, max} computed over the per-file | |
| 89 | +// series within L. Empty layer = all-zero summary. | |
| 90 | + | |
| 91 | +const summarize = (values: number[]): FanSummary => { | |
| 92 | + if (values.length === 0) return { mean: 0, p50: 0, p95: 0, max: 0 }; | |
| 93 | + const sorted = [...values].sort((a, b) => a - b); | |
| 94 | + const sum = sorted.reduce((s, v) => s + v, 0); | |
| 95 | + const mean = sum / sorted.length; | |
| 96 | + const percentile = (frac: number): number => { | |
| 97 | + // Nearest-rank percentile: index = ceil(frac * N) - 1, clamped. | |
| 98 | + const idx = Math.min(sorted.length - 1, Math.max(0, Math.ceil(frac * sorted.length) - 1)); | |
| 99 | + return sorted[idx]!; | |
| 100 | + }; | |
| 101 | + return { | |
| 102 | + mean, | |
| 103 | + p50: percentile(0.5), | |
| 104 | + p95: percentile(0.95), | |
| 105 | + max: sorted[sorted.length - 1]!, | |
| 106 | + }; | |
| 107 | +}; | |
| 108 | + | |
| 109 | +const computeFanByLayer = (input: SamaV2Input): FanByLayer => { | |
| 110 | + const samaPaths = [...input.files.keys()].filter(isSamaFile); | |
| 111 | + const fanOut = new Map<string, number>(); | |
| 112 | + const fanIn = new Map<string, number>(); | |
| 113 | + for (const p of samaPaths) { | |
| 114 | + fanOut.set(p, 0); | |
| 115 | + fanIn.set(p, 0); | |
| 116 | + } | |
| 117 | + for (const path of samaPaths) { | |
| 118 | + const content = input.files.get(path) ?? ""; | |
| 119 | + for (const imp of collectRelativeImports(content)) { | |
| 120 | + const resolved = resolveImport(path, imp); | |
| 121 | + if (!fanOut.has(resolved)) continue; | |
| 122 | + fanOut.set(path, (fanOut.get(path) ?? 0) + 1); | |
| 123 | + fanIn.set(resolved, (fanIn.get(resolved) ?? 0) + 1); | |
| 124 | + } | |
| 125 | + } | |
| 126 | + | |
| 127 | + const buckets: Record<LayerNumber, { in: number[]; out: number[] }> = { | |
| 128 | + 0: { in: [], out: [] }, | |
| 129 | + 1: { in: [], out: [] }, | |
| 130 | + 2: { in: [], out: [] }, | |
| 131 | + 3: { in: [], out: [] }, | |
| 132 | + }; | |
| 133 | + for (const path of samaPaths) { | |
| 134 | + const decl = declaredLayer(path, input.profile); | |
| 135 | + if (!decl) continue; | |
| 136 | + buckets[decl.layer].in.push(fanIn.get(path) ?? 0); | |
| 137 | + buckets[decl.layer].out.push(fanOut.get(path) ?? 0); | |
| 138 | + } | |
| 139 | + | |
| 140 | + return { | |
| 141 | + 0: { fanIn: summarize(buckets[0].in), fanOut: summarize(buckets[0].out) }, | |
| 142 | + 1: { fanIn: summarize(buckets[1].in), fanOut: summarize(buckets[1].out) }, | |
| 143 | + 2: { fanIn: summarize(buckets[2].in), fanOut: summarize(buckets[2].out) }, | |
| 144 | + 3: { fanIn: summarize(buckets[3].in), fanOut: summarize(buckets[3].out) }, | |
| 145 | + }; | |
| 146 | +}; | |
| 147 | + | |
| 148 | +// — boundaryRatio ------------------------------------------------- | |
| 149 | +// | |
| 150 | +// (parse-boundary call sites in Layer 2 files) ÷ (parse-boundary | |
| 151 | +// call sites anywhere). Uses the SAME detector as check #4 (Modeled-boundary). | |
| 152 | +// No boundaries anywhere → 1.0 (vacuously satisfied: there is no | |
| 153 | +// out-of-Layer-2 leak because there is no boundary at all). | |
| 154 | +// | |
| 155 | +// "Layer 2" here means the file's declaredLayer is 2. Unprefixed | |
| 156 | +// files (declaredLayer = null) count toward the denominator but | |
| 157 | +// not the numerator — that is the truthful reading of the §5 | |
| 158 | +// definition. | |
| 159 | +const computeBoundaryRatio = (input: SamaV2Input): number => { | |
| 160 | + const sites = findParseBoundaryCallSites(input.files); | |
| 161 | + if (sites.length === 0) return 1.0; | |
| 162 | + let inLayer2 = 0; | |
| 163 | + for (const site of sites) { | |
| 164 | + const decl = declaredLayer(site.file, input.profile); | |
| 165 | + if (decl !== null && decl.layer === 2) inLayer2++; | |
| 166 | + } | |
| 167 | + return inLayer2 / sites.length; | |
| 168 | +}; | |
| 169 | + | |
| 170 | +// — workingSetFit ------------------------------------------------- | |
| 171 | +// | |
| 172 | +// (source files with WORKING_SET_MIN_LOC ≤ LOC ≤ WORKING_SET_MAX_LOC) | |
| 173 | +// ÷ (total source files). Empty repo → 1.0. Test files don't count; | |
| 174 | +// the metric is about working modules, not their sibling tests. | |
| 175 | +// | |
| 176 | +// Bounds are hard-coded constants in a31_sama_v2.ts. The reasoning | |
| 177 | +// (Atomic 700-LOC headroom; sub-50 = type-only/stub) lives on | |
| 178 | +// /sama/v2 §5 — preceding the numbers, not retrofitted. | |
| 179 | +const computeWorkingSetFit = (input: SamaV2Input): number => { | |
| 180 | + const samaPaths = [...input.files.keys()].filter(isSamaFile); | |
| 181 | + if (samaPaths.length === 0) return 1.0; | |
| 182 | + let inSweetSpot = 0; | |
| 183 | + for (const p of samaPaths) { | |
| 184 | + const lines = (input.files.get(p) ?? "").split("\n").length; | |
| 185 | + if (lines >= WORKING_SET_MIN_LOC && lines <= WORKING_SET_MAX_LOC) inSweetSpot++; | |
| 186 | + } | |
| 187 | + return inSweetSpot / samaPaths.length; | |
| 188 | +}; | |
| 189 | + | |
| 190 | +// — violationCounts ---------------------------------------------- | |
| 191 | +// | |
| 192 | +// Per-check violation count from a fresh verifier run on the same | |
| 193 | +// input. Reported even when a check passes (value = 0) — §5's | |
| 194 | +// "trailing signal: which rules agents *almost* break." The verifier | |
| 195 | +// enumerates ALL violations per check (no short-circuit), so this | |
| 196 | +// count is meaningful — not "1 if failed, 0 if passed". | |
| 197 | +const computeViolationCounts = (input: SamaV2Input): SamaV2ViolationCounts => { | |
| 198 | + const report = verifySamaV2(input); | |
| 199 | + const byId = new Map<number, number>(); | |
| 200 | + for (const c of report.checks) byId.set(c.id, c.violations.length); | |
| 201 | + return { | |
| 202 | + sorted: byId.get(1) ?? 0, | |
| 203 | + architecture: byId.get(2) ?? 0, | |
| 204 | + modeledTests: byId.get(3) ?? 0, | |
| 205 | + modeledBoundary: byId.get(4) ?? 0, | |
| 206 | + atomic: byId.get(5) ?? 0, | |
| 207 | + law: byId.get(6) ?? 0, | |
| 208 | + consistency: byId.get(7) ?? 0, | |
| 209 | + }; | |
| 210 | +}; | |
| 211 | + | |
| 212 | +// — Orchestrator -------------------------------------------------- | |
| 213 | + | |
| 214 | +export const computeCoreMetrics = (input: SamaV2Input): SamaV2Metrics => ({ | |
| 215 | + graphDepth: computeGraphDepth(input.files), | |
| 216 | + fanByLayer: computeFanByLayer(input), | |
| 217 | + boundaryRatio: computeBoundaryRatio(input), | |
| 218 | + workingSetFit: computeWorkingSetFit(input), | |
| 219 | + violationCounts: computeViolationCounts(input), | |
| 220 | +}); | |
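The graphDepth convention in the file above (a file with no imports has depth 1, an empty graph has depth 0, a back-edge counts as a terminal of depth 1) can be traced on small graphs. The sketch below restates the memoised DFS over a hand-built adjacency map. `longestImportChain` and the sample paths are hypothetical, and import resolution is assumed to have already happened.

```ts
// Illustrative restatement of the depth rule; not the commit's code.
type Adjacency = Map<string, string[]>;

function longestImportChain(adj: Adjacency): number {
  const memo = new Map<string, number>();
  const visiting = new Set<string>();

  const depth = (node: string): number => {
    const cached = memo.get(node);
    if (cached !== undefined) return cached;
    if (visiting.has(node)) return 1; // back-edge: treat target as terminal
    visiting.add(node);
    let best = 1; // a node with no outgoing edges has depth 1
    for (const next of adj.get(node) ?? []) {
      best = Math.max(best, depth(next) + 1);
    }
    visiting.delete(node);
    memo.set(node, best);
    return best;
  };

  let max = 0; // empty graph stays at 0
  for (const node of Array.from(adj.keys())) {
    max = Math.max(max, depth(node));
  }
  return max;
}

// p2 imports p1, p1 imports p0: a chain of three files.
const chain: Adjacency = new Map<string, string[]>([
  ["src/p2_a.ts", ["src/p1_a.ts"]],
  ["src/p1_a.ts", ["src/p0_a.ts"]],
  ["src/p0_a.ts", []],
]);

// Two files importing each other; the Law check would flag this.
const cyclic: Adjacency = new Map<string, string[]>([
  ["src/x.ts", ["src/y.ts"]],
  ["src/y.ts", ["src/x.ts"]],
]);

console.log(longestImportChain(new Map())); // 0
console.log(longestImportChain(chain)); // 3
console.log(longestImportChain(cyclic)); // 3
```

The cyclic case returns 3 rather than 2: the traversal enters `x`, reaches `y`, and `y` sees the back-edge to `x` as a depth-1 terminal, so `y` gets 2 and `x` gets 3. The number is finite and stable for a fixed input, but it is not a true longest-path length once a cycle exists.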
src/b32_sama_v2_verify.ts
+36
−98
| @@ -1,105 +1,27 @@ | ||
| 1 | -// c32 — logic: the SAMA v2 verifier. Implements the seven §4 | |
| 1 | +// b32 — logic: the SAMA v2 verifier. Implements the seven §4 | |
| 2 | 2 | // conformance checks (Sorted, Architecture, Modeled-tests, |
| 3 | 3 | // Modeled-boundary, Atomic, the Law §1.2, Consistency §3) as pure |
| 4 | 4 | // functions over an in-memory (profile, files) input. Never reads |
| 5 | -// the filesystem — the loader (c14_sama_profile + c21 handler) | |
| 6 | -// populates the input map. No mocks, no stubs: every check is a | |
| 7 | -// real grep/string-op on the supplied content. | |
| 5 | +// the filesystem — the loader (c14_sama_profile + d21 handler) | |
| 6 | +// populates the input map. The shared pure helpers and the parse- | |
| 7 | +// boundary detector live in a31_sama_v2 so this verifier and the | |
| 8 | +// §5 metrics emitter agree by construction. | |
| 8 | 9 | |
| 9 | 10 | import { |
| 11 | + PARSE_BOUNDARY_PATTERNS, | |
| 12 | + collectRelativeImports, | |
| 10 | 13 | declaredLayer, |
| 14 | + findParseBoundaryCallSites, | |
| 15 | + isSamaFile, | |
| 16 | + isTestFile, | |
| 17 | + resolveImport, | |
| 18 | + stripStringsAndComments, | |
| 11 | 19 | type SamaV2Check, |
| 12 | 20 | type SamaV2Input, |
| 13 | 21 | type SamaV2Report, |
| 14 | 22 | type SamaV2Violation, |
| 15 | 23 | } from "./a31_sama_v2.ts"; |
| 16 | 24 | |
| 17 | -// — shared utilities ------------------------------------------------- | |
| 18 | - | |
| 19 | -// A SAMA file is one we expect to obey the layer rules: any *.ts | |
| 20 | -// under src/ that isn't a *.test.ts. Tests live next to source as | |
| 21 | -// siblings; they're examined for the Modeled check but don't carry | |
| 22 | -// their own layer. | |
| 23 | -const isSamaFile = (path: string): boolean => | |
| 24 | - path.startsWith("src/") && path.endsWith(".ts") && !path.endsWith(".test.ts"); | |
| 25 | - | |
| 26 | -const isTestFile = (path: string): boolean => | |
| 27 | - path.startsWith("src/") && path.endsWith(".test.ts"); | |
| 28 | - | |
| 29 | -// Strip JS/TS string literals and comments to whitespace so a regex | |
| 30 | -// that walks the source doesn't trip on test fixtures that contain | |
| 31 | -// the very patterns we're scanning for. Same shape as the helper in | |
| 32 | -// c32_sama_verify; duplicated here to keep c32_sama_v2_verify a | |
| 33 | -// stand-alone module the loader can pull in without dragging the v1 | |
| 34 | -// verifier with it. | |
| 35 | -const stripStringsAndComments = (src: string): string => { | |
| 36 | - let out = ""; | |
| 37 | - let i = 0; | |
| 38 | - while (i < src.length) { | |
| 39 | - const c = src[i]; | |
| 40 | - const n = src[i + 1]; | |
| 41 | - if (c === "/" && n === "/") { | |
| 42 | - out += " "; | |
| 43 | - i += 2; | |
| 44 | - while (i < src.length && src[i] !== "\n") { out += " "; i++; } | |
| 45 | - } else if (c === "/" && n === "*") { | |
| 46 | - out += " "; | |
| 47 | - i += 2; | |
| 48 | - while (i < src.length - 1 && !(src[i] === "*" && src[i + 1] === "/")) { | |
| 49 | - out += src[i] === "\n" ? "\n" : " "; | |
| 50 | - i++; | |
| 51 | - } | |
| 52 | - out += " "; | |
| 53 | - i += 2; | |
| 54 | - } else if (c === '"' || c === "'" || c === "`") { | |
| 55 | - const quote = c; | |
| 56 | - out += " "; | |
| 57 | - i++; | |
| 58 | - while (i < src.length && src[i] !== quote) { | |
| 59 | - if (src[i] === "\\" && i + 1 < src.length) { out += " "; i += 2; continue; } | |
| 60 | - out += src[i] === "\n" ? "\n" : " "; | |
| 61 | - i++; | |
| 62 | - } | |
| 63 | - out += " "; | |
| 64 | - i++; | |
| 65 | - } else { | |
| 66 | - out += c; | |
| 67 | - i++; | |
| 68 | - } | |
| 69 | - } | |
| 70 | - return out; | |
| 71 | -}; | |
| 72 | - | |
| 73 | -// Collect every relative ".ts" import edge in a file. Scans raw | |
| 74 | -// source: a stripped copy would erase the quoted import paths along | |
| 75 | -// with all other string literals, so the regex must run over the | |
| 76 | -// original. To avoid picking up import-like strings inside test | |
| 77 | -// fixtures, we cross-check each match position against the stripped | |
| 78 | -// mask — if the keyword `from` lands on whitespace in the mask, it | |
| 79 | -// was inside a string literal and we skip it. | |
| 80 | -const collectRelativeImports = (content: string): string[] => { | |
| 81 | - const mask = stripStringsAndComments(content); | |
| 82 | - const re = /\bfrom\s+["'](\.\/[A-Za-z0-9_./-]+\.ts)["']/g; | |
| 83 | - const out: string[] = []; | |
| 84 | - let m: RegExpExecArray | null; | |
| 85 | - while ((m = re.exec(content)) !== null) { | |
| 86 | - // If the `from` keyword position is whitespace in the mask, the | |
| 87 | - // entire match was inside a string literal (e.g. a test fixture). | |
| 88 | - if (mask[m.index] === " " || mask[m.index] === "\n") continue; | |
| 89 | - if (m[1]) out.push(m[1]); | |
| 90 | - } | |
| 91 | - return out; | |
| 92 | -}; | |
| 93 | - | |
| 94 | -// Resolve a relative import like "./c14_git.ts" from the importing | |
| 95 | -// file's directory to the repo-relative path used as the input map's | |
| 96 | -// key (e.g. "src/c14_git.ts"). | |
| 97 | -const resolveImport = (fromPath: string, importPath: string): string => { | |
| 98 | - const dir = fromPath.split("/").slice(0, -1).join("/"); | |
| 99 | - const rel = importPath.replace(/^\.\//, ""); | |
| 100 | - return dir + "/" + rel; | |
| 101 | -}; | |
| 102 | - | |
| 103 | 25 | // — Check 1: Sorted ------------------------------------------------- |
| 104 | 26 | // |
| 105 | 27 | // "Every file carries a profile-recognised prefix; lexicographic |
| @@ -221,22 +143,38 @@ const checkModeledTests = (input: SamaV2Input): SamaV2Check => { | ||
| 221 | 143 | // params) are treated as delegation to the platform's own Layer 2, |
| 222 | 144 | // not parsing performed in our Layer 3. The verifier reports any |
| 223 | 145 | // raw JSON.parse / new URL calls landing outside Layer 2. |
| 224 | -const BOUNDARY_PATTERNS = [ | |
| 225 | - { name: "JSON.parse", re: /\bJSON\.parse\s*\(/ }, | |
| 226 | - { name: "new URL", re: /\bnew\s+URL\s*\(/ }, | |
| 227 | -]; | |
| 146 | +// | |
| 147 | +// The call-site detector (findParseBoundaryCallSites) lives in | |
| 148 | +// a31_sama_v2. This check consumes its output and groups by | |
| 149 | +// (file, pattern) so the violation list stays at file-pattern | |
| 150 | +// granularity, the same shape as before the refactor. The §5 | |
| 151 | +// boundaryRatio metric consumes the same detector and counts | |
| 152 | +// individual call sites, but does not change this check's verdict. | |
| 228 | 153 | const checkModeledBoundary = (input: SamaV2Input): SamaV2Check => { |
| 229 | 154 | const violations: SamaV2Violation[] = []; |
| 230 | 155 | let examined = 0; |
| 231 | - for (const [path, content] of input.files.entries()) { | |
| 156 | + | |
| 157 | + // Bucket call sites by file → set of patterns observed. | |
| 158 | + const patternsByFile = new Map<string, Set<string>>(); | |
| 159 | + for (const site of findParseBoundaryCallSites(input.files)) { | |
| 160 | + let s = patternsByFile.get(site.file); | |
| 161 | + if (!s) { s = new Set(); patternsByFile.set(site.file, s); } | |
| 162 | + s.add(site.pattern); | |
| 163 | + } | |
| 164 | + | |
| 165 | + // Iterate files in input order; emit one violation per (file, pattern) | |
| 166 | + // for files outside Layer 2, preserving PARSE_BOUNDARY_PATTERNS order. | |
| 167 | + // This matches the pre-refactor verdict bit-for-bit. | |
| 168 | + for (const path of input.files.keys()) { | |
| 232 | 169 | if (!isSamaFile(path)) continue; |
| 233 | 170 | const decl = declaredLayer(path, input.profile); |
| 234 | 171 | if (!decl) continue; |
| 235 | 172 | examined++; |
| 236 | 173 | if (decl.layer === 2) continue; // Layer 2 is the legitimate site. |
| 237 | - const stripped = stripStringsAndComments(content); | |
| 238 | - for (const pat of BOUNDARY_PATTERNS) { | |
| 239 | - if (pat.re.test(stripped)) { | |
| 174 | + const observed = patternsByFile.get(path); | |
| 175 | + if (!observed) continue; | |
| 176 | + for (const pat of PARSE_BOUNDARY_PATTERNS) { | |
| 177 | + if (observed.has(pat.name)) { | |
| 240 | 178 | violations.push({ |
| 241 | 179 | file: path, |
| 242 | 180 | detail: `boundary pattern \`${pat.name}\` found in Layer ${decl.layer} — parsing belongs in Layer 2`, |
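The check and the metric read the same call-site list at different granularities: the check deduplicates to (file, pattern) pairs outside Layer 2, while boundaryRatio counts every individual call site. A minimal sketch of that difference follows. The `CallSite` shape is inferred from the `site.file` and `site.pattern` accesses in the diff, and `layerOf`, `violationPairs`, and `ratioInLayer2` are hypothetical stand-ins. The `isSamaFile` filter is omitted for brevity.

```ts
// Illustrative only: simplified stand-ins for the check and the metric.
type CallSite = { file: string; pattern: string };
type LayerOf = Map<string, number>; // missing key = no declared layer

// Check view: one entry per distinct (file, pattern) outside Layer 2.
// Files without a declared layer are skipped, as in the check.
function violationPairs(sites: CallSite[], layerOf: LayerOf): string[] {
  const seen = new Set<string>();
  for (const s of sites) {
    const layer = layerOf.get(s.file);
    if (layer === undefined || layer === 2) continue;
    seen.add(`${s.file} :: ${s.pattern}`);
  }
  return Array.from(seen);
}

// Metric view: share of individual call sites that sit in Layer 2.
// Files without a declared layer count in the denominator only.
function ratioInLayer2(sites: CallSite[], layerOf: LayerOf): number {
  if (sites.length === 0) return 1.0; // vacuous: no boundary at all
  const inLayer2 = sites.filter((s) => layerOf.get(s.file) === 2).length;
  return inLayer2 / sites.length;
}

const sites: CallSite[] = [
  { file: "src/adapter.ts", pattern: "JSON.parse" },
  { file: "src/adapter.ts", pattern: "new URL" },
  { file: "src/entry.ts", pattern: "JSON.parse" },
  { file: "src/entry.ts", pattern: "JSON.parse" }, // second call, same file
];
const layerOf: LayerOf = new Map<string, number>([
  ["src/adapter.ts", 2],
  ["src/entry.ts", 3],
]);

console.log(violationPairs(sites, layerOf)); // ["src/entry.ts :: JSON.parse"]
console.log(ratioInLayer2(sites, layerOf)); // 0.5
```

Two `JSON.parse` calls in the Layer 3 file collapse into one violation for the check, yet both count against the ratio, so the two numbers can move independently even though they share a detector.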
src/d21_handlers_sama.ts
+47
−5
| @@ -67,9 +67,49 @@ export const samaSkillHandler = async (): Promise<Response> => { | ||
| 67 | 67 | |
| 68 | 68 | import { buildSamaV2Input } from "./c14_sama_profile.ts"; |
| 69 | 69 | import { verifySamaV2 } from "./b32_sama_v2_verify.ts"; |
| 70 | -import type { SamaV2Report } from "./a31_sama_v2.ts"; | |
| 70 | +import { computeCoreMetrics } from "./b32_sama_v2_metrics.ts"; | |
| 71 | +import type { FanSummary, SamaV2Metrics, SamaV2Report } from "./a31_sama_v2.ts"; | |
| 71 | 72 | |
| 72 | -const renderV2Report = (report: SamaV2Report): string => { | |
| 73 | +// Render §5 metrics block beneath the existing 7-check verdict. | |
| 74 | +// Numbers come straight from computeCoreMetrics on the same input | |
| 75 | +// the verifier consumed — operational definitions on /sama/v2 §5. | |
| 76 | +const fmtFan = (s: FanSummary): string => | |
| 77 | + `${s.mean.toFixed(2)} / ${s.p50} / ${s.p95} / ${s.max}`; | |
| 78 | +const fmtPct = (n: number): string => `${(n * 100).toFixed(1)}%`; | |
| 79 | + | |
| 80 | +const renderMetricsBlock = (m: SamaV2Metrics): string => `## §5 Core metrics | |
| 81 | + | |
| 82 | +> *Snapshot of this run. Operational definitions at [/sama/v2 §5](/sama/v2#5-operational--core-metrics-definitions). The baseline these numbers anchor is what later claims (skeleton scaffolds, agent A/B experiments, external-repo audits) will be measured against as a delta.* | |
| 83 | + | |
| 84 | +| metric | value | | |
| 85 | +|---|---| | |
| 86 | +| **graphDepth** | ${m.graphDepth} | | |
| 87 | +| **boundaryRatio** | ${fmtPct(m.boundaryRatio)} | | |
| 88 | +| **workingSetFit** | ${fmtPct(m.workingSetFit)} | | |
| 89 | + | |
| 90 | +### fan distribution per layer | |
| 91 | + | |
| 92 | +| layer | fan-in (mean / p50 / p95 / max) | fan-out (mean / p50 / p95 / max) | | |
| 93 | +|---|---|---| | |
| 94 | +| 0 — Pure | ${fmtFan(m.fanByLayer[0].fanIn)} | ${fmtFan(m.fanByLayer[0].fanOut)} | | |
| 95 | +| 1 — Core | ${fmtFan(m.fanByLayer[1].fanIn)} | ${fmtFan(m.fanByLayer[1].fanOut)} | | |
| 96 | +| 2 — Adapter | ${fmtFan(m.fanByLayer[2].fanIn)} | ${fmtFan(m.fanByLayer[2].fanOut)} | | |
| 97 | +| 3 — Entry | ${fmtFan(m.fanByLayer[3].fanIn)} | ${fmtFan(m.fanByLayer[3].fanOut)} | | |
| 98 | + | |
| 99 | +### violation counts (trailing signal — emitted even when checks pass) | |
| 100 | + | |
| 101 | +| check | count | | |
| 102 | +|---|---| | |
| 103 | +| #1 Sorted | ${m.violationCounts.sorted} | | |
| 104 | +| #2 Architecture | ${m.violationCounts.architecture} | | |
| 105 | +| #3 Modeled (tests) | ${m.violationCounts.modeledTests} | | |
| 106 | +| #4 Modeled (boundary) | ${m.violationCounts.modeledBoundary} | | |
| 107 | +| #5 Atomic | ${m.violationCounts.atomic} | | |
| 108 | +| #6 Law (§1.2) | ${m.violationCounts.law} | | |
| 109 | +| #7 Consistency (§3) | ${m.violationCounts.consistency} | | |
| 110 | +`; | |
| 111 | + | |
| 112 | +const renderV2Report = (report: SamaV2Report, metrics: SamaV2Metrics): string => { | |
| 73 | 113 | const summary = report.overallPassed |
| 74 | 114 | ? `✓ conforms · profile \`${report.profile}\` · ${report.examined} files examined · ${report.checks.length}/${report.checks.length} checks pass` |
| 75 | 115 | : `${report.checks.filter((c) => c.passed).length}/${report.checks.length} checks pass · profile \`${report.profile}\` · ${report.examined} files examined`; |
| @@ -94,13 +134,14 @@ const renderV2Report = (report: SamaV2Report): string => { | ||
| 94 | 134 | |
| 95 | 135 | > ${summary} |
| 96 | 136 | |
| 97 | -The verifier in [\`src/c32_sama_v2_verify.ts\`](/GIT/syntaxai/tdd.md/blob/main/src/c32_sama_v2_verify.ts) ingests [\`sama.profile.toml\`](/GIT/syntaxai/tdd.md/blob/main/sama.profile.toml) and runs the seven §4 conformance checks against the current source tree on this server. No clone, no token; the server reads its own \`src/\` and the committed profile, runs the same logic the sibling unit tests cover, and renders the verdict below. | |
| 137 | +The verifier in [\`src/b32_sama_v2_verify.ts\`](/GIT/syntaxai/tdd.md/blob/main/src/b32_sama_v2_verify.ts) ingests [\`sama.profile.toml\`](/GIT/syntaxai/tdd.md/blob/main/sama.profile.toml) and runs the seven §4 conformance checks against the current source tree on this server. No clone, no token; the server reads its own \`src/\` and the committed profile, runs the same logic the sibling unit tests cover, and renders the verdict below. The §5 core metrics emitter ([\`src/b32_sama_v2_metrics.ts\`](/GIT/syntaxai/tdd.md/blob/main/src/b32_sama_v2_metrics.ts)) runs on the same input and shares the parse-boundary detector with the Modeled-boundary check. | |
| 98 | 138 | |
| 99 | 139 | | check | verdict | examined | |
| 100 | 140 | |---|---|---| |
| 101 | 141 | ${rows} |
| 102 | 142 | |
| 103 | -${details ? `## Open violations\n\n${details}` : ""} | |
| 143 | +${details ? `## Open violations\n\n${details}\n` : ""} | |
| 144 | +${renderMetricsBlock(metrics)} | |
| 104 | 145 | |
| 105 | 146 | [← /sama/v2](/sama/v2) · [← /sama](/sama) · [the v1 dogfood](/sama/verify?repo=syntaxai/tdd.md) |
| 106 | 147 | `; |
| @@ -111,7 +152,8 @@ export const samaV2VerifyHandler = async (): Promise<Response> => { | ||
| 111 | 152 | try { |
| 112 | 153 | const input = await buildSamaV2Input(); |
| 113 | 154 | const report = verifySamaV2(input); |
| 114 | - body = renderV2Report(report); | |
| 155 | + const metrics = computeCoreMetrics(input); | |
| 156 | + body = renderV2Report(report, metrics); | |
| 115 | 157 | } catch (err) { |
| 116 | 158 | body = `# SAMA v2 verify — error\n\nThe verifier failed before producing a verdict:\n\n\`\`\`\n${(err as Error).message}\n\`\`\`\n\n[← /sama/v2](/sama/v2)`; |
| 117 | 159 | } |
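Each fan cell in the rendered table is a `mean / p50 / p95 / max` string built from a nearest-rank summary. The sketch below works one series through by hand, assuming the same nearest-rank rule as the emitter (index = ceil(frac × N) − 1, clamped). `summarizeSeries` and `formatRow` are hypothetical names that mirror, but are not, the commit's `summarize` and `fmtFan`.

```ts
// Illustrative only: a standalone restatement for a worked example.
type Summary = { mean: number; p50: number; p95: number; max: number };

function summarizeSeries(values: number[]): Summary {
  if (values.length === 0) return { mean: 0, p50: 0, p95: 0, max: 0 };
  const sorted = [...values].sort((a, b) => a - b); // numeric, ascending
  const n = sorted.length;
  const mean = sorted.reduce((s, v) => s + v, 0) / n;
  // Nearest-rank percentile: index = ceil(frac * N) - 1, clamped to [0, N-1].
  const rank = (frac: number): number => {
    const idx = Math.min(n - 1, Math.max(0, Math.ceil(frac * n) - 1));
    return sorted[idx] ?? 0;
  };
  return { mean, p50: rank(0.5), p95: rank(0.95), max: sorted[n - 1] ?? 0 };
}

const formatRow = (s: Summary): string =>
  `${s.mean.toFixed(2)} / ${s.p50} / ${s.p95} / ${s.max}`;

// Per-file fan-in counts for five files in one layer (made-up numbers).
const fanIn = [6, 0, 1, 2, 1];

// sorted = [0, 1, 1, 2, 6]; mean = 10 / 5 = 2
// p50: ceil(0.5 * 5) - 1 = 2  → sorted[2] = 1
// p95: ceil(0.95 * 5) - 1 = 4 → sorted[4] = 6
console.log(formatRow(summarizeSeries(fanIn))); // "2.00 / 1 / 6 / 6"
console.log(formatRow(summarizeSeries([]))); // "0.00 / 0 / 0 / 0"
```

With only a handful of files per layer, nearest-rank p95 coincides with the maximum, so the p95 column only becomes informative once a layer holds at least 20 files.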