syntaxai/tdd.md · commit 33dcc9f

§5 workingSetFit ported to Go + Rust; dive and ripgrep audits gain measured numbers

The cross-repo argument was n=1 measured (this site) + n=3 hand-estimated
(dive, ripgrep, WP plugin). Ports the §5 workingSetFit metric to Go and
Rust source trees, runs it against /tmp/dive and /tmp/ripgrep at pinned
SHAs, and replaces the hand-estimated workingSetFit values in the audit
blog posts + home page table with the measured numbers. Empirical chain
is now n=3 measured + n=1 estimated.

Components:
- src/b32_working_set_polyglot.ts — pure Layer 1: files+lang → ratio,
  imports WORKING_SET_MIN_LOC=50 / MAX=500 from a31_sama_v2.ts (single
  source of truth; no duplication). Formula matches b32_sama_v2_metrics.ts
  byte-for-byte: files-in-band ÷ total, inclusive bounds. Empty input
  → 1.0 vacuous.
- src/c14_working_set_walker.ts — Layer 2 adapter: recursive .go/.rs
  walker. Skips .git/, target/, vendor/, node_modules/, dotdirs. LOC
  counter uses content.split("\n").length to match the TS metric.
- scripts/measure-working-set.ts — CLI: --lang go|rust + repo-path →
  JSON to stdout. Reproducible given a pinned commit SHA.
- 24 new tests cover bound-edge inclusivity (LOC=49 out / =50 in /
  =500 in / =501 out, mirroring b32_sama_v2_metrics.test.ts), language
  test-file asymmetry (Go excludes *_test.go; Rust includes all .rs
  because tests are inline — see /sama/v2#62-inline-tests-dialect),
  empty-input vacuous, reproducibility under deep-equal.
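
A minimal sketch of the line-counting rule named in the c14 bullet,
assuming only what this commit states (LOC = content.split("\n").length).
The helper name countLoc is hypothetical and is not part of the repo.
The point for anyone hand-tracing: the count is the number of newline
characters plus one, so it is always one higher than `wc -l` reports,
and a file sitting exactly on a band edge can land on different sides
under the two counters.

```typescript
// Hypothetical helper mirroring the counting rule described in the commit.
const countLoc = (content: string): number => content.split("\n").length;

// Two lines of text, newline-terminated: 2 newlines, so 3 segments.
const terminated = countLoc("package main\nfunc main() {}\n");

// Same text without the trailing newline: 1 newline, so 2 segments.
const unterminated = countLoc("package main\nfunc main() {}");

// Empty file: split yields [""], so the count is 1 rather than 0.
const empty = countLoc("");

console.log({ terminated, unterminated, empty });
```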

Measured results:
- dive @d6c691947f8fda635c952a17ee3b7555379d58f0:
  48 of 92 source .go files in [50, 500] LOC = 52.17%
  (originally hand-estimated ~80%; 28-point miss)
- ripgrep @4519153e5e461527f4bca45b042fff45c4ec6fb9:
  54 of 100 .rs files in [50, 500] LOC = 54.00%
  (originally hand-estimated ~60%; 6-point miss)
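
The arithmetic behind those two bullets, as a small sketch. The rounding
follows the CLI's ratioPercent expression (toFixed(2), then Number); the
helper name toPercent and the variable names are illustrative only.

```typescript
// Same rounding as ratioPercent in scripts/measure-working-set.ts.
const toPercent = (included: number, total: number): number =>
  Number(((included / total) * 100).toFixed(2));

const dive = toPercent(48, 92);     // 48 of 92 source .go files in band
const ripgrep = toPercent(54, 100); // 54 of 100 .rs files in band

// Miss = hand estimate minus measurement, in percentage points.
const diveMiss = 80 - dive;
const ripgrepMiss = 60 - ripgrep;

// Gap between the two measured repos, in percentage points.
const measuredGap = ripgrep - dive;

console.log({ dive, ripgrep });
console.log({
  diveMiss: diveMiss.toFixed(2),
  ripgrepMiss: ripgrepMiss.toFixed(2),
  measuredGap: measuredGap.toFixed(2),
});
```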

Cross-repo signal: ripgrep (54.00%) and dive (52.17%) measure within
two percentage points — the eyeballed estimates said they were 20 points
apart. The metric, not the eye, was right.

The dive audit gains a §0-style hand-trace (find /tmp/dive -name '*.go'
-not -name '*_test.go' -not -path '*/.git/*' -not -path '*/vendor/*'
| wc -l yields 92; 48 fall in band; 48/92=0.5217) so the measurement is
auditable per the deterministic-program contract.
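
A sketch of the hand-trace's file filter as a pure predicate, for
readers who want to cross-check the find(1) count without a shell. It
assumes the filters used in the dive audit post: .go files only, no
*_test.go, nothing under .git/ or vendor/. The function name and every
sample path except docker_host_windows.go are made up for illustration.

```typescript
// Hypothetical predicate mirroring the hand-trace's find(1) filters.
const isCountedGoSource = (relPath: string): boolean => {
  const segments = relPath.split("/");
  const name = segments[segments.length - 1] ?? "";
  const dirs = segments.slice(0, -1);
  if (!name.endsWith(".go")) return false;     // only .go files
  if (name.endsWith("_test.go")) return false; // drop Go test files
  if (dirs.includes(".git")) return false;     // nothing under .git/
  if (dirs.includes("vendor")) return false;   // nothing under vendor/
  return true;
};

// Illustrative paths; only the first appears in the audit itself.
const samplePaths = [
  "dive/image/docker/docker_host_windows.go",
  "cmd/root_test.go",
  "vendor/example/lib.go",
  ".git/hooks/sample.go",
  "README.md",
];

const counted = samplePaths.filter(isCountedGoSource);
console.log(counted.length, counted);
```

Feeding the real path list from a pinned checkout through the same
predicate should give the denominator the hand-trace reports, provided
the filters above match the ones actually used.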

Anti-fudge: this repo's sama.profile.toml is unchanged; the §4 verifier
behaviour is bit-identical; /sama/v2/verify continues to report 7/7 ✓.
336/336 tests pass total (was 312; +24 new).

Co-Authored-By: Claude Opus 4.7 <[email protected]>

author: syntaxai <[email protected]>
date: 2026-05-24 10:44:19 +01:00
parent: f244dbb
commit: 33dcc9f48dd621b10a167e6a7d83113f2578020e

8 files changed · +577 −22

modified content/blog/sama-v2-go-project-dive.md +14 −8

@@ -141,17 +141,23 @@ Derives from Law. No file's declared layer is contradicted by what it imports.
141	141
142	142	Estimated tally: 5 of 7 pass under the directory-based dialect, with 2 named failures (Sorted, Modeled-tests). That's a real result, not "0/7 because no one tried."
143	143
144		-## The §5 metrics — estimated for `dive`
	144	+## The §5 metrics — mixed measurement and estimate for `dive`
145	145
146		-\| metric \| `dive` (Go, estimated) \| WP plugin (PHP, estimated) \| tdd.md (TS, measured) \|
	146	+\| metric \| `dive` (Go) \| WP plugin (PHP, estimated) \| tdd.md (TS, measured) \|
147	147	\|---\|---\|---\|---\|
148		-\| §4 checks passing \| ~5 / 7 \| 0 / 7 \| 7 / 7 \|
149		-\| graphDepth \| ~5 (cmd → command → ui → dive → filetree → internal/utils) \| ~3 \| 7 \|
150		-\| boundaryRatio \| ~85% (one borderline case in `options/ci.go`) \| <10% \| 100% \|
151		-\| workingSetFit (50–500 LOC) \| ~80% \| ~47% \| 80% \|
152		-\| violationCounts (sum) \| ~30 (mostly Modeled-tests gaps) \| 17+ \| 0 \|
	148	+\| §4 checks passing \| ~5 / 7 (estimated) \| 0 / 7 \| 7 / 7 \|
	149	+\| graphDepth \| ~5 (estimated; cmd → command → ui → dive → filetree → internal/utils) \| ~3 \| 7 \|
	150	+\| boundaryRatio \| ~85% (estimated; one borderline case in `options/ci.go`) \| <10% \| 100% \|
	151	+\| workingSetFit (50–500 LOC) \| 52.17% (measured, [dive@d6c69194](https://github.com/wagoodman/dive/commit/d6c691947f8fda635c952a17ee3b7555379d58f0)) — originally estimated ~80% \| ~47% \| 80% (measured) \|
	152	+\| violationCounts (sum) \| ~30 (estimated; mostly Modeled-tests gaps) \| 17+ \| 0 \|
153	153
154		-The `workingSetFit` is essentially identical between `dive` and this site (80%). Two unrelated projects, two different languages, two different scopes, written by different teams under different conventions — landing at the same fit ratio is a useful data point: 80% might just be what "reasonably engineered" looks like on this axis.
	154	+The `workingSetFit` is the metric I most expected to land near tdd.md's 80% — two engineered codebases, both with linters and conventions. The measurement says otherwise.
	155	+
	156	+Hand-trace (auditable per [/sama/v2 §0](/sama/v2)): running `find /tmp/dive -name '*.go' -not -name '*_test.go' -not -path '*/.git/*' -not -path '*/vendor/*' | wc -l` returns 92 source .go files. Of those, 48 fall in [50, 500] LOC inclusive (matching `WORKING_SET_MIN_LOC` and `WORKING_SET_MAX_LOC` in [`src/a31_sama_v2.ts`](/GIT/syntaxai/tdd.md/blob/main/src/a31_sama_v2.ts)). 48 ÷ 92 = 0.5217 ≈ 52.17%. The polyglot §5 emitter at [`scripts/measure-working-set.ts`](/GIT/syntaxai/tdd.md/blob/main/scripts/measure-working-set.ts) produces the same number from the same source tree.
	157	+
	158	+The distribution explains it: 44 files under 50 LOC (mostly small type-only modules, single-helper files, and platform-shim stubs like `dive/image/docker/docker_host_windows.go` at 6 LOC), 48 in band, and — strikingly — 0 over 500 LOC. `dive`'s working-set miss is not god-classes (the §4.5 Atomic check passes outright); it's the opposite failure mode: many files small enough to fall below the substantive-module threshold.
	159	+
	160	+The original ~80% estimate was wrong, and wrong in a direction casual eyeballing wouldn't catch — counting visible-on-the-screen files isn't the same as counting them and applying a band filter. That 28-point miss between estimate and measurement is itself the empirical case for the metric existing at all: the metric surfaces a property the human estimate missed.
155	161
156	162	## What `dive` would look like at 7/7 — the last 30%
157	163

modified content/blog/sama-v2-rust-project-ripgrep.md +12 −8

@@ -145,17 +145,21 @@ Derives from Law on the same edge set.
145	145
146	146	(Update: all three dialects have since been drafted into [/sama/v2 §6.A](/sama/v2#6a-v21-dialects-provisional) as v2.1-draft extensions, with the same five-part operational structure — what they relax, what property they preserve, and the falsifiable cross-repo experiment that would invalidate each.)
147	147
148		-## §5 metric estimates
	148	+## §5 metrics — measured workingSetFit, estimated the rest
149	149
150		-\| metric \| ripgrep (estimated) \| dive (Go) \| tdd.md (TS, measured) \| WP plugin (PHP) \|
	150	+\| metric \| ripgrep \| dive (Go) \| tdd.md (TS, measured) \| WP plugin (PHP) \|
151	151	\|---\|---\|---\|---\|---\|
152		-\| §4 checks passing \| ~3/7 strict, ~5/7 under v2.1 dialects \| ~5/7 \| 7/7 ✓ \| 0/7 \|
153		-\| graphDepth \| ~5 (matcher → engine → searcher → printer → core) \| ~5 \| 7 \| ~3 \|
154		-\| boundaryRatio \| ~95% \| ~85% \| 100% \| <10% \|
155		-\| workingSetFit (50–500 LOC) \| ~60% (those 19 big files drag it down) \| ~80% \| 80% \| ~47% \|
156		-\| violationCounts (sum) \| ~50 (19 Atomic + ~30 Modeled-tests under sibling-rule) \| ~30 \| 0 \| 17+ \|
	152	+\| §4 checks passing \| ~3/7 strict, ~5/7 under v2.1 dialects (estimated) \| ~5/7 (estimated) \| 7/7 ✓ \| 0/7 \|
	153	+\| graphDepth \| ~5 estimated (matcher → engine → searcher → printer → core) \| ~5 (estimated) \| 7 \| ~3 \|
	154	+\| boundaryRatio \| ~95% (estimated) \| ~85% (estimated) \| 100% \| <10% \|
	155	+\| workingSetFit (50–500 LOC) \| 54.00% (measured, [ripgrep@4519153e](https://github.com/BurntSushi/ripgrep/commit/4519153e5e461527f4bca45b042fff45c4ec6fb9)) — originally estimated ~60% \| 52.17% (measured, [dive@d6c69194](https://github.com/wagoodman/dive/commit/d6c691947f8fda635c952a17ee3b7555379d58f0)) — originally estimated ~80% \| 80% \| ~47% \|
	156	+\| violationCounts (sum) \| ~50 estimated (Atomic + Modeled-tests under sibling-rule) \| ~30 (estimated) \| 0 \| 17+ \|
157	157
158		-ripgrep's `workingSetFit` is the metric that surprises here: ~60%, lower than dive and lower than this site. That's the 19 big files pulling the distribution down. And yet most of those files are appropriate to their content. It's a useful signal: workingSetFit is not by itself a quality measure — a project full of declaration catalogs will score lower than a project full of small handlers without being architecturally worse.
	158	+ripgrep's `workingSetFit` measures 54.00% (from the polyglot §5 emitter at [`scripts/measure-working-set.ts`](/GIT/syntaxai/tdd.md/blob/main/scripts/measure-working-set.ts), inclusive bounds [50, 500] LOC). The distribution: 100 .rs files total, 16 under 50 LOC, 54 in band, 30 over 500 LOC — appreciably more than the "19 big files" I eyeballed in the original audit. The over-cap list ranges from the textbook declarative-exempt catalog (`crates/core/flags/defs.rs` at 7,780 LOC) down to genuinely borderline files at 500–800 LOC like `crates/pcre2/src/matcher.rs` (506) and `crates/cli/src/decompress.rs` (533).
	159	+
	160	+And yet most of those files are appropriate to their content. workingSetFit by itself doesn't say which side of the line each file falls on — that's what the [declarative-exemption dialect](/sama/v2#63-declarative-exemption-dialect) is for. The metric surfaces the property; the policy decides what to do with it.
	161	+
	162	+The cross-repo comparison the measurement makes possible is more interesting than the single number. ripgrep (54%) and dive (52%) measure within two percentage points of each other — two unrelated codebases in two different languages, written by different teams under different conventions, landing in the same working-set band when measured against the same bounds. That's the kind of cross-repo signal §6 says it wants. The eyeballed estimates (~60% and ~80%) said the two projects were 20 points apart; the measurement says they're 2 points apart. The metric, not the eye, was right.
159	163
160	164	This is exactly the §5 intent. The metric surfaces a property; whether that property is good or bad depends on what the file content should be. Compliance scores conflate the two; metrics keep them separate.
161	165

modified content/home.md +7 −6

@@ -56,17 +56,18 @@ SAMA bundles those findings into four constraints a CI job can enforce. Sorted
56	56
57	57	The load-bearing property isn't that LLMs have small context windows — modern models have 200k+ tokens. The load-bearing property is mechanical enforceability: the verifier fails the build when a file crosses the line cap or an import points the wrong way. Discipline that lives only in code review quietly slips under agent pressure; discipline that lives in a CI gate keeps its shape across an arbitrary number of agent commits. The context-window research above explains the why; the verifier explains the how.
58	58
59		-## Three datapoints on the same axes
	59	+## Datapoints on the same axes
60	60
61		-Empirical baseline so far (the §5 metrics, [computed live](/sama/v2/verify) for this site and hand-traced for the two audits):
	61	+Empirical baseline so far. The §4 score for this site is [computed live](/sama/v2/verify); the §4 scores for the other repos are hand-estimated. The workingSetFit column is now measured for three of the four repos by the polyglot §5 emitter at [`scripts/measure-working-set.ts`](/GIT/syntaxai/tdd.md/blob/main/scripts/measure-working-set.ts); the remaining columns are still hand-estimated where flagged.
62	62
63	63	\| project \| language \| §4 score \| workingSetFit \| boundaryRatio \| graphDepth \|
64	64	\|---\|---\|---\|---\|---\|---\|
65		-\| tdd.md (this site) \| TypeScript \| 7 / 7 ✓ (measured) \| 80% \| 100% \| 7 \|
66		-\| [wagoodman/dive](/blog/sama-v2-go-project-dive) \| Go \| ~5 / 7 (estimated) \| ~80% \| ~85% \| ~5 \|
67		-\| [Open Graph plugin](/blog/sama-v2-wordpress-plugin-audit) \| PHP / WordPress \| 0 / 7 (estimated) \| ~47% \| <10% \| ~3 \|
	65	+\| tdd.md (this site) \| TypeScript \| 7 / 7 ✓ (measured) \| 80% (measured) \| 100% (measured) \| 7 (measured) \|
	66	+\| [wagoodman/dive](/blog/sama-v2-go-project-dive) \| Go \| ~5 / 7 (estimated) \| 52.17% (measured, [@d6c69194](https://github.com/wagoodman/dive/commit/d6c691947f8fda635c952a17ee3b7555379d58f0)) \| ~85% (estimated) \| ~5 (estimated) \|
	67	+\| [BurntSushi/ripgrep](/blog/sama-v2-rust-project-ripgrep) \| Rust \| ~3-5 / 7 (estimated, depends on v2.1 dialect uptake) \| 54.00% (measured, [@4519153e](https://github.com/BurntSushi/ripgrep/commit/4519153e5e461527f4bca45b042fff45c4ec6fb9)) \| ~95% (estimated) \| ~5 (estimated) \|
	68	+\| [Open Graph plugin](/blog/sama-v2-wordpress-plugin-audit) \| PHP / WordPress \| 0 / 7 (estimated) \| ~47% (estimated) \| <10% (estimated) \| ~3 (estimated) \|
68	69
69		-Three points is not yet a "v2 is worth following" claim. §6 of the spec is explicit that promotion to official requires cross-repo deltas, not a single dogfood. But the same five numbers are now defined, computable, and published — which is the prerequisite the spec sets before any later claim becomes testable.
	70	+Four points is not yet a "v2 is worth following" claim. §6 of the spec is explicit that promotion to official requires cross-repo deltas, not a single dogfood. But three workingSetFit rows are now measured against the same bounds the spec defines — a quiet but load-bearing step from "we have numbers" to "we have the same numbers across repos." The cross-repo signal that emerges: ripgrep (54.00%) and dive (52.17%) land within two percentage points of each other, suggesting workingSetFit in the 50–55% range may be characteristic of mature compiled-language CLI tools — a hypothesis that needs more datapoints to confirm but is now testable in a way it was not when the numbers were all eyeballed.
70	71
71	72	## See it in practice
72	73

added scripts/measure-working-set.ts +76 −0

@@ -0,0 +1,76 @@
	1	+#!/usr/bin/env bun
	2	+// measure-working-set — CLI for the §5 polyglot workingSetFit metric.
	3	+// Given a path to a checked-out Go or Rust source tree, emit the
	4	+// measured ratio as JSON to stdout.
	5	+//
	6	+// Usage:
	7	+// bun scripts/measure-working-set.ts <repo-path> --lang go
	8	+// bun scripts/measure-working-set.ts <repo-path> --lang rust
	9	+//
	10	+// The number it emits is reproducible: given the same checked-out
	11	+// source tree, every run prints the same ratio to full float precision.
	12	+// Pair the output with the repo's commit SHA when reporting; see
	13	+// /sama/v2 §5 (operational) for the bounds reasoning.
	14	+
	15	+import { measureWorkingSetForRepo } from "../src/c14_working_set_walker.ts";
	16	+import type { PolyglotLanguage } from "../src/b32_working_set_polyglot.ts";
	17	+
	18	+const args = process.argv.slice(2);
	19	+
	20	+const usage = (): never => {
	21	+ console.error(
	22	+ "Usage: bun scripts/measure-working-set.ts <repo-path> --lang go|rust [--verbose]",
	23	+ );
	24	+ process.exit(2);
	25	+};
	26	+
	27	+if (args.length < 3) usage();
	28	+
	29	+const repoPath = args[0]!;
	30	+let lang: PolyglotLanguage | null = null;
	31	+let verbose = false;
	32	+
	33	+for (let i = 1; i < args.length; i++) {
	34	+ const a = args[i];
	35	+ if (a === "--lang") {
	36	+ const v = args[++i];
	37	+ if (v !== "go" && v !== "rust") {
	38	+ console.error(`--lang must be "go" or "rust", got: ${v}`);
	39	+ process.exit(2);
	40	+ }
	41	+ lang = v;
	42	+ } else if (a === "--verbose") {
	43	+ verbose = true;
	44	+ } else {
	45	+ console.error(`unknown argument: ${a}`);
	46	+ usage();
	47	+ }
	48	+}
	49	+
	50	+if (lang === null) usage();
	51	+
	52	+const result = measureWorkingSetForRepo(repoPath, lang!);
	53	+
	54	+const output: Record<string, unknown> = {
	55	+ language: result.language,
	56	+ repoPath,
	57	+ minLoc: result.minLoc,
	58	+ maxLoc: result.maxLoc,
	59	+ total: result.total,
	60	+ included: result.included,
	61	+ ratio: result.ratio,
	62	+ ratioPercent: Number((result.ratio * 100).toFixed(2)),
	63	+};
	64	+
	65	+if (verbose) {
	66	+ output.files = result.files.map((f) => ({
	67	+ path: f.path,
	68	+ locCount: f.locCount,
	69	+ inBand:
	70	+ f.locCount >= result.minLoc &&
	71	+ f.locCount <= result.maxLoc &&
	72	+ !(lang === "go" && f.path.endsWith("_test.go")),
	73	+ }));
	74	+}
	75	+
	76	+console.log(JSON.stringify(output, null, 2));

added src/b32_working_set_polyglot.test.ts +164 −0

@@ -0,0 +1,164 @@
	1	+import { describe, expect, test } from "bun:test";
	2	+import {
	3	+ WORKING_SET_MAX_LOC,
	4	+ WORKING_SET_MIN_LOC,
	5	+} from "./a31_sama_v2.ts";
	6	+import {
	7	+ computeWorkingSetFitPolyglot,
	8	+ type PolyglotLanguage,
	9	+ type WorkingSetFile,
	10	+} from "./b32_working_set_polyglot.ts";
	11	+
	12	+// Mirror the inclusive-bound assertions in b32_sama_v2_metrics.test.ts.
	13	+// Same algorithm, same constants, same edge behaviour — the polyglot
	14	+// helper is allowed to compute a different SET of files (Go/Rust source
	15	+// trees rather than src/*.ts), but the RATIO formula must match the
	16	+// TS metric byte-for-byte. These tests pin that.
	17	+
	18	+const file = (path: string, locCount: number): WorkingSetFile => ({ path, locCount });
	19	+
	20	+describe("computeWorkingSetFitPolyglot — empty input", () => {
	21	+ test("empty list → 1.0 vacuous (matches the TS metric on an empty file map)", () => {
	22	+ const r = computeWorkingSetFitPolyglot([], "go");
	23	+ expect(r.ratio).toBe(1.0);
	24	+ expect(r.included).toBe(0);
	25	+ expect(r.total).toBe(0);
	26	+ });
	27	+
	28	+ test("empty list also vacuous under Rust", () => {
	29	+ const r = computeWorkingSetFitPolyglot([], "rust");
	30	+ expect(r.ratio).toBe(1.0);
	31	+ });
	32	+});
	33	+
	34	+describe("computeWorkingSetFitPolyglot — single-file extremes", () => {
	35	+ test("a single 100-line Go file → 1.0", () => {
	36	+ const r = computeWorkingSetFitPolyglot([file("pkg/x.go", 100)], "go");
	37	+ expect(r.ratio).toBe(1.0);
	38	+ expect(r.included).toBe(1);
	39	+ expect(r.total).toBe(1);
	40	+ });
	41	+
	42	+ test("a single 10-line file falls below the min → 0.0", () => {
	43	+ const r = computeWorkingSetFitPolyglot([file("pkg/x.go", 10)], "go");
	44	+ expect(r.ratio).toBe(0.0);
	45	+ expect(r.included).toBe(0);
	46	+ expect(r.total).toBe(1);
	47	+ });
	48	+
	49	+ test("a single 600-line file exceeds the max → 0.0", () => {
	50	+ const r = computeWorkingSetFitPolyglot([file("pkg/x.go", 600)], "go");
	51	+ expect(r.ratio).toBe(0.0);
	52	+ expect(r.included).toBe(0);
	53	+ expect(r.total).toBe(1);
	54	+ });
	55	+});
	56	+
	57	+describe("computeWorkingSetFitPolyglot — bound-edge inclusivity", () => {
	58	+ // The TS metric uses `lines >= MIN && lines <= MAX`. These tests
	59	+ // mirror b32_sama_v2_metrics.test.ts's "exact bounds are inclusive".
	60	+ test("LOC = 49 → out of band", () => {
	61	+ const r = computeWorkingSetFitPolyglot([file("pkg/x.go", WORKING_SET_MIN_LOC - 1)], "go");
	62	+ expect(r.included).toBe(0);
	63	+ });
	64	+
	65	+ test("LOC = 50 → in band", () => {
	66	+ const r = computeWorkingSetFitPolyglot([file("pkg/x.go", WORKING_SET_MIN_LOC)], "go");
	67	+ expect(r.included).toBe(1);
	68	+ });
	69	+
	70	+ test("LOC = 500 → in band", () => {
	71	+ const r = computeWorkingSetFitPolyglot([file("pkg/x.go", WORKING_SET_MAX_LOC)], "go");
	72	+ expect(r.included).toBe(1);
	73	+ });
	74	+
	75	+ test("LOC = 501 → out of band", () => {
	76	+ const r = computeWorkingSetFitPolyglot([file("pkg/x.go", WORKING_SET_MAX_LOC + 1)], "go");
	77	+ expect(r.included).toBe(0);
	78	+ });
	79	+});
	80	+
	81	+describe("computeWorkingSetFitPolyglot — mixed inputs", () => {
	82	+ test("half in / half out → 0.5", () => {
	83	+ const r = computeWorkingSetFitPolyglot([
	84	+ file("pkg/a.go", 100),
	85	+ file("pkg/b.go", 10),
	86	+ ], "go");
	87	+ expect(r.ratio).toBe(0.5);
	88	+ });
	89	+
	90	+ test("two in / two out → 0.5", () => {
	91	+ const r = computeWorkingSetFitPolyglot([
	92	+ file("pkg/a.go", 100),
	93	+ file("pkg/b.go", 300),
	94	+ file("pkg/c.go", 10),
	95	+ file("pkg/d.go", 800),
	96	+ ], "go");
	97	+ expect(r.ratio).toBe(0.5);
	98	+ });
	99	+});
	100	+
	101	+describe("computeWorkingSetFitPolyglot — Go test-file exclusion", () => {
	102	+ test("*_test.go files do NOT count toward total or included", () => {
	103	+ const r = computeWorkingSetFitPolyglot([
	104	+ file("pkg/x.go", 100),
	105	+ file("pkg/x_test.go", 200),
	106	+ file("pkg/y_test.go", 50),
	107	+ ], "go");
	108	+ // Only x.go counts; both _test.go files dropped before tallying.
	109	+ expect(r.total).toBe(1);
	110	+ expect(r.included).toBe(1);
	111	+ expect(r.ratio).toBe(1.0);
	112	+ });
	113	+
	114	+ test("a 100-line source + a 1-line _test.go sibling → 1.0 (mirrors the TS metric)", () => {
	115	+ const r = computeWorkingSetFitPolyglot([
	116	+ file("pkg/x.go", 100),
	117	+ file("pkg/x_test.go", 1),
	118	+ ], "go");
	119	+ expect(r.ratio).toBe(1.0);
	120	+ });
	121	+});
	122	+
	123	+describe("computeWorkingSetFitPolyglot — Rust inline-tests asymmetry", () => {
	124	+ test("Rust includes ALL .rs files (no path-based test exclusion)", () => {
	125	+ // Rust convention: tests live inside source files under
	126	+ // #[cfg(test)] mod tests. The polyglot helper preserves that —
	127	+ // it does NOT exclude any .rs path. The asymmetry is documented
	128	+ // in the b32_working_set_polyglot.ts source comment.
	129	+ const r = computeWorkingSetFitPolyglot([
	130	+ file("src/lib.rs", 100),
	131	+ file("src/tests.rs", 100),
	132	+ file("src/something_test.rs", 100),
	133	+ ], "rust");
	134	+ expect(r.total).toBe(3);
	135	+ expect(r.included).toBe(3);
	136	+ expect(r.ratio).toBe(1.0);
	137	+ });
	138	+});
	139	+
	140	+describe("computeWorkingSetFitPolyglot — reproducibility", () => {
	141	+ test("same input → identical output across runs (deep-equal)", () => {
	142	+ const input: WorkingSetFile[] = [
	143	+ file("a.go", 100),
	144	+ file("b.go", 60),
	145	+ file("c.go", 480),
	146	+ file("d.go", 20),
	147	+ file("e_test.go", 999),
	148	+ ];
	149	+ const langs: PolyglotLanguage[] = ["go", "rust"];
	150	+ for (const l of langs) {
	151	+ const a = computeWorkingSetFitPolyglot(input, l);
	152	+ const b = computeWorkingSetFitPolyglot(input, l);
	153	+ expect(a).toEqual(b);
	154	+ }
	155	+ });
	156	+});
	157	+
	158	+describe("computeWorkingSetFitPolyglot — bounds echo", () => {
	159	+ test("result echoes minLoc / maxLoc from a31_sama_v2.ts (auditable)", () => {
	160	+ const r = computeWorkingSetFitPolyglot([], "go");
	161	+ expect(r.minLoc).toBe(WORKING_SET_MIN_LOC);
	162	+ expect(r.maxLoc).toBe(WORKING_SET_MAX_LOC);
	163	+ });
	164	+});

added src/b32_working_set_polyglot.ts +82 −0

@@ -0,0 +1,82 @@
	1	+// b32 — logic: §5 workingSetFit metric for polyglot source trees
	2	+// (Go, Rust). Pure function, no I/O. Mirrors the formula in
	3	+// src/b32_sama_v2_metrics.ts byte-for-byte:
	4	+//
	5	+// workingSetFit = files-in-band ÷ total-source-files
	6	+//
	7	+// where in-band means WORKING_SET_MIN_LOC ≤ LOC ≤ WORKING_SET_MAX_LOC,
	8	+// inclusive on both ends. Bounds are imported from a31_sama_v2.ts so
	9	+// the cross-language number is computed against the same band as this
	10	+// site's own metric — the single-source-of-truth determinism property
	11	+// from /sama/v2 §0.
	12	+//
	13	+// Used by scripts/measure-working-set.ts (the polyglot CLI) and the
	14	+// c14_working_set_walker.ts adapter, which feed it a pre-counted file
	15	+// summary so this module stays pure and unit-testable.
	16	+
	17	+import {
	18	+ WORKING_SET_MAX_LOC,
	19	+ WORKING_SET_MIN_LOC,
	20	+} from "./a31_sama_v2.ts";
	21	+
	22	+// Language tag governs the test-file exclusion rule below.
	23	+export type PolyglotLanguage = "go" | "rust";
	24	+
	25	+export interface WorkingSetFile {
	26	+ // Repo-relative path (e.g. "crates/printer/src/standard.rs").
	27	+ path: string;
	28	+ // File length in lines, matching the TS metric's `content.split("\n").length`.
	29	+ locCount: number;
	30	+}
	31	+
	32	+export interface WorkingSetResult {
	33	+ language: PolyglotLanguage;
	34	+ included: number; // files inside [MIN, MAX] LOC, inclusive
	35	+ total: number; // total source files (after test-file exclusion)
	36	+ ratio: number; // included / total; empty-input → 1.0 vacuous
	37	+ minLoc: number; // echoed back from a31 so callers can audit
	38	+ maxLoc: number;
	39	+}
	40	+
	41	+// Test-file exclusion. The asymmetry is honest, not arbitrary:
	42	+//
	43	+// Go: tests live in `*_test.go` files. The TS metric excludes
	44	+// `*.test.ts` for the same structural reason — they aren't
	45	+// working modules in their own right, they verify one.
	46	+//
	47	+// Rust: tests live INSIDE source files under `#[cfg(test)] mod tests`.
	48	+// Excluding files at file-granularity would either lose every
	49	+// tested file or accidentally include all of them. The
	50	+// inline-tests dialect drafted at /sama/v2#62-inline-tests-dialect
	51	+// is what makes this asymmetry coherent: where the test attaches
	52	+// is a language-level choice; the working-set property the metric
	53	+// measures is unaffected.
	54	+const isPolyglotTestFile = (path: string, lang: PolyglotLanguage): boolean => {
	55	+ if (lang === "go") return path.endsWith("_test.go");
	56	+ return false;
	57	+};
	58	+
	59	+export const computeWorkingSetFitPolyglot = (
	60	+ files: ReadonlyArray<WorkingSetFile>,
	61	+ lang: PolyglotLanguage,
	62	+): WorkingSetResult => {
	63	+ let included = 0;
	64	+ let total = 0;
	65	+ for (const f of files) {
	66	+ if (isPolyglotTestFile(f.path, lang)) continue;
	67	+ total++;
	68	+ if (f.locCount >= WORKING_SET_MIN_LOC && f.locCount <= WORKING_SET_MAX_LOC) {
	69	+ included++;
	70	+ }
	71	+ }
	72	+ // Match the TS metric: empty input → 1.0 (vacuously satisfied).
	73	+ const ratio = total === 0 ? 1.0 : included / total;
	74	+ return {
	75	+ language: lang,
	76	+ included,
	77	+ total,
	78	+ ratio,
	79	+ minLoc: WORKING_SET_MIN_LOC,
	80	+ maxLoc: WORKING_SET_MAX_LOC,
	81	+ };
	82	+};

added src/c14_working_set_walker.test.ts +117 −0

@@ -0,0 +1,117 @@
	1	+import { afterAll, beforeAll, describe, expect, test } from "bun:test";
	2	+import { mkdirSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
	3	+import { tmpdir } from "node:os";
	4	+import { resolve } from "node:path";
	5	+import {
	6	+ collectPolyglotFiles,
	7	+ measureWorkingSetForRepo,
	8	+} from "./c14_working_set_walker.ts";
	9	+
	10	+// Hermetic fixture: build a tiny fake repo in a tmpdir, walk it,
	11	+// assert what comes back. The CLI script's real-world use against
	12	+// /tmp/dive and /tmp/ripgrep is exercised via the measurement step
	13	+// in this PR, not via unit tests; this file pins the algorithm.
	14	+
	15	+const FIXTURE_ROOT = mkdtempSync(resolve(tmpdir(), "tdd-md-wswalker-"));
	16	+
	17	+const writeFile = (relPath: string, lineCount: number): void => {
	18	+ const abs = resolve(FIXTURE_ROOT, relPath);
	19	+ mkdirSync(abs.split("/").slice(0, -1).join("/"), { recursive: true });
	20	+ const lines = Array.from({ length: lineCount }, (_, i) => `// line ${i}`);
	21	+ writeFileSync(abs, lines.join("\n"));
	22	+};
	23	+
	24	+beforeAll(() => {
	25	+ // Top-level Go sources (one in-band, one out-of-band, one test file).
	26	+ writeFile("a.go", 100); // in band
	27	+ writeFile("b.go", 600); // out (over)
	28	+ writeFile("c_test.go", 200); // excluded for Go
	29	+ // Nested.
	30	+ writeFile("pkg/inner.go", 60); // in band, inside subdir
	31	+ writeFile("pkg/tiny.go", 10); // out (under)
	32	+ // Rust sources (separate sub-tree).
	33	+ writeFile("rs/src/lib.rs", 120); // in band
	34	+ writeFile("rs/src/big.rs", 700); // out (over)
	35	+ writeFile("rs/src/tests.rs", 75); // included (Rust has no path test rule)
	36	+ // Skip directories that should NOT be walked.
	37	+ writeFile(".git/HEAD.go", 100); // .git is skipped
	38	+ writeFile("target/build.rs", 100); // target/ is skipped
	39	+ writeFile("vendor/pkg.go", 100); // vendor/ is skipped
	40	+ writeFile("node_modules/dep.go", 100); // node_modules/ skipped
	41	+});
	42	+
	43	+afterAll(() => {
	44	+ rmSync(FIXTURE_ROOT, { recursive: true, force: true });
	45	+});
	46	+
	47	+describe("collectPolyglotFiles — Go", () => {
	48	+ test("walks recursively and finds the right .go files", () => {
	49	+ const files = collectPolyglotFiles(FIXTURE_ROOT, "go");
	50	+ const paths = files.map((f) => f.path);
	51	+ // Excluded: .git/, target/, vendor/, node_modules/.
	52	+ // Included: a.go, b.go, c_test.go (the helper RETURNS it; the
	53	+ // metric helper drops it during the count — separation of concerns).
	54	+ expect(paths).toContain("a.go");
	55	+ expect(paths).toContain("b.go");
	56	+ expect(paths).toContain("c_test.go");
	57	+ expect(paths).toContain("pkg/inner.go");
	58	+ expect(paths).toContain("pkg/tiny.go");
	59	+ expect(paths).not.toContain(".git/HEAD.go");
	60	+ expect(paths).not.toContain("vendor/pkg.go");
	61	+ expect(paths).not.toContain("node_modules/dep.go");
	62	+ });
	63	+
	64	+ test("LOC counts match content.split('\\n').length", () => {
	65	+ const files = collectPolyglotFiles(FIXTURE_ROOT, "go");
	66	+ const a = files.find((f) => f.path === "a.go");
	67	+ // We wrote 100 lines joined by "\n" → split("\n").length === 100.
	68	+ expect(a?.locCount).toBe(100);
	69	+ });
	70	+
	71	+ test("returns files in deterministic sorted order", () => {
	72	+ const a = collectPolyglotFiles(FIXTURE_ROOT, "go").map((f) => f.path);
	73	+ const b = collectPolyglotFiles(FIXTURE_ROOT, "go").map((f) => f.path);
	74	+ expect(a).toEqual(b);
	75	+ const sorted = [...a].sort((x, y) => x.localeCompare(y));
	76	+ expect(a).toEqual(sorted);
	77	+ });
	78	+});
	79	+
	80	+describe("collectPolyglotFiles — Rust", () => {
	81	+ test("finds only .rs files; ignores .go", () => {
	82	+ const files = collectPolyglotFiles(FIXTURE_ROOT, "rust");
	83	+ const paths = files.map((f) => f.path);
	84	+ expect(paths).toContain("rs/src/lib.rs");
	85	+ expect(paths).toContain("rs/src/big.rs");
	86	+ expect(paths).toContain("rs/src/tests.rs");
	87	+ expect(paths.every((p) => p.endsWith(".rs"))).toBe(true);
	88	+ });
	89	+
	90	+ test("target/build.rs is excluded (skipped dir)", () => {
	91	+ const files = collectPolyglotFiles(FIXTURE_ROOT, "rust");
	92	+ const paths = files.map((f) => f.path);
	93	+ expect(paths).not.toContain("target/build.rs");
	94	+ });
	95	+});
	96	+
	97	+describe("measureWorkingSetForRepo — end-to-end", () => {
	98	+ test("Go fixture: 2 in band (a.go=100, pkg/inner.go=60) of 4 source files (excluding c_test.go) = 0.5", () => {
	99	+ const r = measureWorkingSetForRepo(FIXTURE_ROOT, "go");
	100	+ expect(r.total).toBe(4); // a, b, pkg/inner, pkg/tiny (c_test excluded)
	101	+ expect(r.included).toBe(2); // a, pkg/inner
	102	+ expect(r.ratio).toBe(0.5);
	103	+ });
	104	+
	105	+ test("Rust fixture: 2 in band (lib.rs=120, tests.rs=75) of 3 .rs files = 2/3", () => {
	106	+ const r = measureWorkingSetForRepo(FIXTURE_ROOT, "rust");
	107	+ expect(r.total).toBe(3);
	108	+ expect(r.included).toBe(2);
	109	+ expect(r.ratio).toBeCloseTo(2 / 3, 6);
	110	+ });
	111	+
	112	+ test("echoes the bounds back so callers can audit which numbers produced the ratio", () => {
	113	+ const r = measureWorkingSetForRepo(FIXTURE_ROOT, "go");
	114	+ expect(r.minLoc).toBe(50);
	115	+ expect(r.maxLoc).toBe(500);
	116	+ });
	117	+});

added src/c14_working_set_walker.ts +105 −0

@@ -0,0 +1,105 @@
	1	+// c14 — adapter: filesystem walker that produces a polyglot
	2	+// WorkingSetFile summary for an external source tree (Go or Rust).
	3	+// Recursive directory walk; counts lines of each .go / .rs file using
	4	+// the same `content.split("\n").length` rule as b32_sama_v2_metrics so
	5	+// the cross-language metric matches the TS metric byte-for-byte.
	6	+//
	7	+// Skipped directories are the conventional non-source trees that
	8	+// would otherwise inflate the denominator with vendored / generated
	9	+// / build artefacts: .git, target/ (Rust build output), vendor/ (Go
	10	+// vendored deps), node_modules/ (incidental, defensive).
	11	+//
	12	+// The walker is hermetic — given a path that is a directory it
	13	+// resolves the file set deterministically. Calls into the pure helper
	14	+// in b32_working_set_polyglot.ts for the ratio.
	15	+
	16	+import { readdirSync, readFileSync, statSync } from "node:fs";
	17	+import { resolve } from "node:path";
	18	+import {
	19	+ computeWorkingSetFitPolyglot,
	20	+ type PolyglotLanguage,
	21	+ type WorkingSetFile,
	22	+ type WorkingSetResult,
	23	+} from "./b32_working_set_polyglot.ts";
	24	+
	25	+const SKIPPED_DIRS: ReadonlySet<string> = new Set([
	26	+ ".git",
	27	+ "target",
	28	+ "vendor",
	29	+ "node_modules",
	30	+]);
	31	+
	32	+const EXTENSION_FOR: Record<PolyglotLanguage, string> = {
	33	+ go: ".go",
	34	+ rust: ".rs",
	35	+};
	36	+
	37	+// Walk a directory and return every {path, locCount} pair for files
	38	+// whose extension matches the target language. Paths are returned
	39	+// repo-relative (i.e. relative to the `repoRoot` passed in) so they're
	40	+// stable across machines.
	41	+export const collectPolyglotFiles = (
	42	+ repoRoot: string,
	43	+ lang: PolyglotLanguage,
	44	+): WorkingSetFile[] => {
	45	+ const ext = EXTENSION_FOR[lang];
	46	+ const out: WorkingSetFile[] = [];
	47	+
	48	+ const walk = (absDir: string, relDir: string): void => {
	49	+ let entries: ReturnType<typeof readdirSync>;
	50	+ try {
	51	+ entries = readdirSync(absDir, { withFileTypes: true });
	52	+ } catch {
	53	+ // Permission error or vanished subtree: skip it. Letting one
	54	+ // bad subtree halt the whole measurement would be worse than
	55	+ // reporting the partial set, so return silently here. A missing
	56	+ // or non-directory root is still caught by the statSync check
	57	+ // at the bottom of collectPolyglotFiles.
	58	+ return;
	59	+ }
	60	+ for (const e of entries) {
	61	+ if (e.name.startsWith(".") && e.isDirectory()) {
	62	+ // .git, .github, .vscode, ...: defensive skip on every hidden
	63	+ // directory. Dotfiles are not skipped here; they fall through
	64	+ // to the extension check below, so a hidden file is counted
	65	+ // only if its name ends in .go/.rs.
	66	+ continue;
	67	+ }
	68	+ if (e.isDirectory()) {
	69	+ if (SKIPPED_DIRS.has(e.name)) continue;
	70	+ const sub = resolve(absDir, e.name);
	71	+ const subRel = relDir === "" ? e.name : `${relDir}/${e.name}`;
	72	+ walk(sub, subRel);
	73	+ continue;
	74	+ }
	75	+ if (!e.isFile()) continue;
	76	+ if (!e.name.endsWith(ext)) continue;
	77	+ const abs = resolve(absDir, e.name);
	78	+ const relPath = relDir === "" ? e.name : `${relDir}/${e.name}`;
	79	+ const content = readFileSync(abs, "utf8");
	80	+ // Match b32_sama_v2_metrics.ts: lines = content.split("\n").length.
	81	+ const locCount = content.split("\n").length;
	82	+ out.push({ path: relPath, locCount });
	83	+ }
	84	+ };
	85	+
	86	+ const root = resolve(repoRoot);
	87	+ const rootStat = statSync(root);
	88	+ if (!rootStat.isDirectory()) {
	89	+ throw new Error(`expected a directory, got: ${repoRoot}`);
	90	+ }
	91	+ walk(root, "");
	92	+ // Sort for deterministic output (readdirSync is platform-dependent).
	93	+ out.sort((a, b) => a.path.localeCompare(b.path));
	94	+ return out;
	95	+};
	96	+
	97	+// Convenience: walk + compute in one call. Used by the CLI script.
	98	+export const measureWorkingSetForRepo = (
	99	+ repoRoot: string,
	100	+ lang: PolyglotLanguage,
	101	+): WorkingSetResult & { files: WorkingSetFile[] } => {
	102	+ const files = collectPolyglotFiles(repoRoot, lang);
	103	+ const result = computeWorkingSetFitPolyglot(files, lang);
	104	+ return { ...result, files };
	105	+};
