syntaxai/tdd.md · commit 33dcc9f

§5 workingSetFit ported to Go + Rust; dive and ripgrep audits gain measured numbers

Before this commit the cross-repo argument rested on n=1 measured (this
site) + n=3 hand-estimated (dive, ripgrep, WP plugin). This commit ports
the §5 workingSetFit metric to Go and Rust source trees, runs it against
/tmp/dive and /tmp/ripgrep at pinned SHAs, and replaces the
hand-estimated workingSetFit values in the audit blog posts + home page
table with the measured numbers. The empirical chain is now n=3 measured
+ n=1 estimated.

Components:
- src/b32_working_set_polyglot.ts — pure Layer 1: files+lang → ratio,
  imports WORKING_SET_MIN_LOC=50 / MAX=500 from a31_sama_v2.ts (single
  source of truth; no duplication). Formula matches b32_sama_v2_metrics.ts
  byte-for-byte: files-in-band ÷ total, inclusive bounds. Empty input
  → 1.0 vacuous.
- src/c14_working_set_walker.ts — Layer 2 adapter: recursive .go/.rs
  walker. Skips .git/, target/, vendor/, node_modules/, dotdirs. LOC
  counter uses content.split("\n").length to match the TS metric.
- scripts/measure-working-set.ts — CLI: --lang go|rust + repo-path →
  JSON to stdout. Reproducible given a pinned commit SHA.
- 24 new tests cover bound-edge inclusivity (LOC=49 out / =50 in /
  =500 in / =501 out, mirroring b32_sama_v2_metrics.test.ts), language
  test-file asymmetry (Go excludes *_test.go; Rust includes all .rs
  because tests are inline — see /sama/v2#62-inline-tests-dialect),
  empty-input vacuous, reproducibility under deep-equal.
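
The LOC rule named in the c14 bullet has one consequence worth seeing concretely: `content.split("\n").length` counts separators plus one, so a file that ends in a newline reports one more line than `wc -l` would. A minimal sketch, assuming nothing beyond the rule as stated (`countLoc` is an illustrative name, not a function in this repo):

```typescript
// Illustrative helper (hypothetical name): the line-counting rule the
// walker and the TS metric are described as sharing.
const countLoc = (content: string): number => content.split("\n").length;

// Three lines, no trailing newline: two separators + 1.
const withoutTrailingNewline = countLoc("a\nb\nc");
// The same three lines with a trailing newline: three separators + 1.
const withTrailingNewline = countLoc("a\nb\nc\n");
// An empty file still reports one "line".
const emptyFile = countLoc("");

console.log({ withoutTrailingNewline, withTrailingNewline, emptyFile });
```

Under this rule a 49-line file that ends in a newline counts as 50 and lands in band, so a hand-trace that uses `wc -l` for per-file LOC should allow for that off-by-one at the band edges.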

Measured results:
- dive @d6c691947f8fda635c952a17ee3b7555379d58f0:
  48 of 92 source .go files in [50, 500] LOC = 52.17%
  (originally hand-estimated ~80%; 28-point miss)
- ripgrep @4519153e5e461527f4bca45b042fff45c4ec6fb9:
  54 of 100 .rs files in [50, 500] LOC = 54.00%
  (originally hand-estimated ~60%; 6-point miss)

Cross-repo signal: ripgrep (54.00%) and dive (52.17%) measure within
two percentage points — the eyeballed estimates said they were 20 points
apart. The metric, not the eye, was right.
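
The arithmetic behind the measured results and the cross-repo comparison can be re-derived in a few lines. This is only a sketch over the counts quoted in this message; `round2` and `toPct` are illustrative names, and the two-decimal rounding is assumed to mirror the CLI's `ratioPercent` field.

```typescript
// Counts and original estimates copied from the "Measured results" list.
const measured = {
  dive: { inBand: 48, total: 92, estimatedPct: 80 },
  ripgrep: { inBand: 54, total: 100, estimatedPct: 60 },
};

// Round to two decimals (assumed to match the CLI's ratioPercent rounding).
const round2 = (n: number): number => Number(n.toFixed(2));
const toPct = (inBand: number, total: number): number =>
  round2((inBand / total) * 100);

const divePct = toPct(measured.dive.inBand, measured.dive.total);
const ripgrepPct = toPct(measured.ripgrep.inBand, measured.ripgrep.total);

// Estimate-vs-measurement miss per repo, in percentage points.
const diveMiss = round2(measured.dive.estimatedPct - divePct);
const ripgrepMiss = round2(measured.ripgrep.estimatedPct - ripgrepPct);

// Gap between the two repos: measured versus eyeballed.
const measuredGap = round2(Math.abs(ripgrepPct - divePct));
const estimatedGap = Math.abs(
  measured.dive.estimatedPct - measured.ripgrep.estimatedPct,
);

console.log({ divePct, ripgrepPct, diveMiss, ripgrepMiss, measuredGap, estimatedGap });
```

The "28-point miss" quoted for dive is the 27.83-point difference rounded to the nearest whole point.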

The dive audit gains a §0-style hand-trace (find /tmp/dive -name '*.go'
-not -name '*_test.go' -not -path '*/.git/*' -not -path '*/vendor/*'
| wc -l yields 92; 48 fall in band; 48/92 ≈ 0.5217) so the measurement
is auditable per the deterministic-program contract.

Anti-fudge: this repo's sama.profile.toml is unchanged; the §4 verifier
behaviour is bit-identical; /sama/v2/verify continues to report 7/7 ✓.
336/336 tests pass total (was 312; +24 new).

Co-Authored-By: Claude Opus 4.7 <[email protected]>
author: syntaxai <[email protected]>
date: 2026-05-24 10:44:19 +01:00
parent: f244dbb
commit: 33dcc9f48dd621b10a167e6a7d83113f2578020e

8 files changed · +577 −22

modified content/blog/sama-v2-go-project-dive.md +14 −8
@@ -141,17 +141,23 @@ Derives from Law. No file's declared layer is contradicted by what it imports.
141141
142142 **Estimated tally: 5 of 7 pass under the directory-based dialect, with 2 named failures (Sorted, Modeled-tests).** That's a real result, not "0/7 because no one tried."
143143
144-## The §5 metrics — estimated for `dive`
144+## The §5 metrics — mixed measurement and estimate for `dive`
145145
146-| metric | `dive` (Go, estimated) | WP plugin (PHP, estimated) | tdd.md (TS, measured) |
146+| metric | `dive` (Go) | WP plugin (PHP, estimated) | tdd.md (TS, measured) |
147147 |---|---|---|---|
148-| §4 checks passing | ~5 / 7 | 0 / 7 | 7 / 7 |
149-| graphDepth | ~5 (cmd → command → ui → dive → filetree → internal/utils) | ~3 | 7 |
150-| boundaryRatio | ~85% (one borderline case in `options/ci.go`) | <10% | 100% |
151-| workingSetFit (50–500 LOC) | ~80% | ~47% | 80% |
152-| violationCounts (sum) | ~30 (mostly Modeled-tests gaps) | 17+ | 0 |
148+| §4 checks passing | ~5 / 7 (estimated) | 0 / 7 | 7 / 7 |
149+| graphDepth | ~5 (estimated; cmd → command → ui → dive → filetree → internal/utils) | ~3 | 7 |
150+| boundaryRatio | ~85% (estimated; one borderline case in `options/ci.go`) | <10% | 100% |
151+| **workingSetFit (50–500 LOC)** | **52.17% (measured, [dive@d6c69194](https://github.com/wagoodman/dive/commit/d6c691947f8fda635c952a17ee3b7555379d58f0))** — originally estimated ~80% | ~47% | 80% (measured) |
152+| violationCounts (sum) | ~30 (estimated; mostly Modeled-tests gaps) | 17+ | 0 |
153153
154-The `workingSetFit` is essentially **identical** between `dive` and this site (80%). Two unrelated projects, two different languages, two different scopes, written by different teams under different conventions — landing at the same fit ratio is a useful data point: 80% might just be what "reasonably engineered" looks like on this axis.
154+**The `workingSetFit` is the metric I most expected to land near tdd.md's 80%** — two engineered codebases, both with linters and conventions. The measurement says otherwise.
155+
156+**Hand-trace** (auditable per [/sama/v2 §0](/sama/v2)): running `find /tmp/dive -name '*.go' -not -name '*_test.go' -not -path '*/.git/*' -not -path '*/vendor/*' | wc -l` returns **92 source .go files**. Of those, **48** fall in [50, 500] LOC inclusive (matching `WORKING_SET_MIN_LOC` and `WORKING_SET_MAX_LOC` in [`src/a31_sama_v2.ts`](/GIT/syntaxai/tdd.md/blob/main/src/a31_sama_v2.ts)). 48 ÷ 92 ≈ 0.5217, i.e. 52.17%. The polyglot §5 emitter at [`scripts/measure-working-set.ts`](/GIT/syntaxai/tdd.md/blob/main/scripts/measure-working-set.ts) produces the same number from the same source tree.
157+
158+The distribution explains it: **44 files under 50 LOC** (mostly small type-only modules, single-helper files, and platform-shim stubs like `dive/image/docker/docker_host_windows.go` at 6 LOC), **48 in band**, and — strikingly — **0 over 500 LOC**. `dive`'s working-set miss is not god-classes (the §4.5 Atomic check passes outright); it's the *opposite* failure mode: many files small enough to fall below the substantive-module threshold.
159+
160+The original ~80% estimate was wrong, and wrong in a direction casual eyeballing wouldn't catch — counting visible-on-the-screen files isn't the same as counting them and applying a band filter. That 28-point miss between estimate and measurement is itself the empirical case for the metric existing at all: the metric surfaces a property the human estimate missed.
155161
156162 ## What `dive` would look like at 7/7 — the last 30%
157163
modified content/blog/sama-v2-rust-project-ripgrep.md +12 −8
@@ -145,17 +145,21 @@ Derives from Law on the same edge set.
145145
146146 *(Update: all three dialects have since been drafted into [/sama/v2 §6.A](/sama/v2#6a-v21-dialects-provisional) as v2.1-draft extensions, with the same five-part operational structure — what they relax, what property they preserve, and the falsifiable cross-repo experiment that would invalidate each.)*
147147
148-## §5 metric estimates
148+## §5 metrics — measured workingSetFit, estimated the rest
149149
150-| metric | ripgrep (estimated) | dive (Go) | tdd.md (TS, measured) | WP plugin (PHP) |
150+| metric | ripgrep | dive (Go) | tdd.md (TS, measured) | WP plugin (PHP) |
151151 |---|---|---|---|---|
152-| §4 checks passing | ~3/7 strict, ~5/7 under v2.1 dialects | ~5/7 | 7/7 ✓ | 0/7 |
153-| graphDepth | ~5 (matcher → engine → searcher → printer → core) | ~5 | 7 | ~3 |
154-| boundaryRatio | ~95% | ~85% | 100% | <10% |
155-| workingSetFit (50–500 LOC) | ~60% (those 19 big files drag it down) | ~80% | 80% | ~47% |
156-| violationCounts (sum) | ~50 (19 Atomic + ~30 Modeled-tests under sibling-rule) | ~30 | 0 | 17+ |
152+| §4 checks passing | ~3/7 strict, ~5/7 under v2.1 dialects (estimated) | ~5/7 (estimated) | 7/7 ✓ | 0/7 |
153+| graphDepth | ~5 estimated (matcher → engine → searcher → printer → core) | ~5 (estimated) | 7 | ~3 |
154+| boundaryRatio | ~95% (estimated) | ~85% (estimated) | 100% | <10% |
155+| **workingSetFit (50–500 LOC)** | **54.00% (measured, [ripgrep@4519153e](https://github.com/BurntSushi/ripgrep/commit/4519153e5e461527f4bca45b042fff45c4ec6fb9))** — originally estimated ~60% | **52.17% (measured, [dive@d6c69194](https://github.com/wagoodman/dive/commit/d6c691947f8fda635c952a17ee3b7555379d58f0))** — originally estimated ~80% | 80% | ~47% |
156+| violationCounts (sum) | ~50 estimated (Atomic + Modeled-tests under sibling-rule) | ~30 (estimated) | 0 | 17+ |
157157
158-ripgrep's `workingSetFit` is the metric that surprises here: ~60%, lower than dive *and* lower than this site. That's the 19 big files pulling the distribution down. **And yet most of those files are appropriate to their content.** It's a useful signal: workingSetFit is not by itself a quality measure — a project full of declaration catalogs will score lower than a project full of small handlers without being architecturally worse.
158+ripgrep's `workingSetFit` measures 54.00% (from the polyglot §5 emitter at [`scripts/measure-working-set.ts`](/GIT/syntaxai/tdd.md/blob/main/scripts/measure-working-set.ts), inclusive bounds [50, 500] LOC). The distribution: **100 .rs files** total, **16 under 50 LOC**, **54 in band**, **30 over 500 LOC** — appreciably more than the "19 big files" I eyeballed in the original audit. The over-cap list ranges from the textbook declarative-exempt catalog (`crates/core/flags/defs.rs` at 7,780 LOC) down to genuinely borderline files at 500–800 LOC like `crates/pcre2/src/matcher.rs` (506) and `crates/cli/src/decompress.rs` (533).
159+
160+**And yet most of those files are appropriate to their content.** workingSetFit by itself doesn't say which side of the line each file falls on — that's what the [declarative-exemption dialect](/sama/v2#63-declarative-exemption-dialect) is for. The metric surfaces the property; the policy decides what to do with it.
161+
162+The cross-repo comparison the measurement makes possible is more interesting than the single number. **ripgrep (54%) and dive (52%) measure within two percentage points of each other** — two unrelated codebases in two different languages, written by different teams under different conventions, landing in the same working-set band when measured against the same bounds. That's the kind of cross-repo signal §6 says it wants. The eyeballed estimates (~60% and ~80%) said the two projects were 20 points apart; the measurement says they're 2 points apart. The metric, not the eye, was right.
159163
160164 This is exactly the §5 intent. The metric surfaces a property; whether that property is good or bad depends on what the file content *should be*. Compliance scores conflate the two; metrics keep them separate.
161165
modified content/home.md +7 −6
@@ -56,17 +56,18 @@ SAMA bundles those findings into four constraints a CI job can enforce. *Sorted*
5656
5757 **The load-bearing property isn't that LLMs have small context windows — modern models have 200k+ tokens.** The load-bearing property is **mechanical enforceability**: the verifier fails the build when a file crosses the line cap or an import points the wrong way. Discipline that lives only in code review quietly slips under agent pressure; discipline that lives in a CI gate keeps its shape across an arbitrary number of agent commits. The context-window research above explains the *why*; the verifier explains the *how*.
5858
59-## Three datapoints on the same axes
59+## Datapoints on the same axes
6060
61-Empirical baseline so far (the §5 metrics, [computed live](/sama/v2/verify) for this site and hand-traced for the two audits):
61+Empirical baseline so far. The §4 score for this site is [computed live](/sama/v2/verify); the §4 scores for the other repos are hand-estimated. The **workingSetFit** column is now measured for three of the four repos by the polyglot §5 emitter at [`scripts/measure-working-set.ts`](/GIT/syntaxai/tdd.md/blob/main/scripts/measure-working-set.ts); the remaining columns are still hand-estimated where flagged.
6262
6363 | project | language | §4 score | workingSetFit | boundaryRatio | graphDepth |
6464 |---|---|---|---|---|---|
65-| **tdd.md** (this site) | TypeScript | **7 / 7 ✓** (measured) | 80% | 100% | 7 |
66-| [**wagoodman/dive**](/blog/sama-v2-go-project-dive) | Go | ~5 / 7 (estimated) | ~80% | ~85% | ~5 |
67-| [**Open Graph plugin**](/blog/sama-v2-wordpress-plugin-audit) | PHP / WordPress | 0 / 7 (estimated) | ~47% | <10% | ~3 |
65+| **tdd.md** (this site) | TypeScript | **7 / 7 ✓** (measured) | 80% (measured) | 100% (measured) | 7 (measured) |
66+| [**wagoodman/dive**](/blog/sama-v2-go-project-dive) | Go | ~5 / 7 (estimated) | **52.17%** (measured, [@d6c69194](https://github.com/wagoodman/dive/commit/d6c691947f8fda635c952a17ee3b7555379d58f0)) | ~85% (estimated) | ~5 (estimated) |
67+| [**BurntSushi/ripgrep**](/blog/sama-v2-rust-project-ripgrep) | Rust | ~3-5 / 7 (estimated, depends on v2.1 dialect uptake) | **54.00%** (measured, [@4519153e](https://github.com/BurntSushi/ripgrep/commit/4519153e5e461527f4bca45b042fff45c4ec6fb9)) | ~95% (estimated) | ~5 (estimated) |
68+| [**Open Graph plugin**](/blog/sama-v2-wordpress-plugin-audit) | PHP / WordPress | 0 / 7 (estimated) | ~47% (estimated) | <10% (estimated) | ~3 (estimated) |
6869
69-Three points is not yet a "v2 is worth following" claim. §6 of the spec is explicit that promotion to official requires cross-repo deltas, not a single dogfood. But the same five numbers are now defined, computable, and published — which is the prerequisite the spec sets before any later claim becomes testable.
70+Four points is not yet a "v2 is worth following" claim. §6 of the spec is explicit that promotion to official requires cross-repo *deltas*, not a single dogfood. But three workingSetFit rows are now *measured* against the same bounds the spec defines — a quiet but load-bearing step from "we have numbers" to "we have *the same* numbers across repos." The cross-repo signal that emerges: ripgrep (54.00%) and dive (52.17%) land within two percentage points of each other, suggesting workingSetFit in the 50–55% range may be characteristic of mature compiled-language CLI tools — a hypothesis that needs more datapoints to confirm but is now *testable* in a way it was not when the numbers were all eyeballed.
7071
7172 ## See it in practice
7273
added scripts/measure-working-set.ts +76 −0
@@ -0,0 +1,76 @@
1+#!/usr/bin/env bun
2+// measure-working-set — CLI for the §5 polyglot workingSetFit metric.
3+// Given a path to a checked-out Go or Rust source tree, emit the
4+// measured ratio as JSON to stdout.
5+//
6+// Usage:
7+// bun scripts/measure-working-set.ts <repo-path> --lang go [--verbose]
8+// bun scripts/measure-working-set.ts <repo-path> --lang rust [--verbose]
9+//
10+// The number it emits is reproducible: given the same checked-out
11+// source tree, every run prints the same ratio to full float precision.
12+// Pair the output with the repo's commit SHA when reporting; see
13+// /sama/v2 §5 (operational) for the bounds reasoning.
14+
15+import { measureWorkingSetForRepo } from "../src/c14_working_set_walker.ts";
16+import type { PolyglotLanguage } from "../src/b32_working_set_polyglot.ts";
17+
18+const args = process.argv.slice(2);
19+
20+const usage = (): never => {
21+ console.error(
22+ "Usage: bun scripts/measure-working-set.ts <repo-path> --lang go|rust [--verbose]",
23+ );
24+ process.exit(2);
25+};
26+
27+if (args.length < 3) usage();
28+
29+const repoPath = args[0]!;
30+let lang: PolyglotLanguage | null = null;
31+let verbose = false;
32+
33+for (let i = 1; i < args.length; i++) {
34+ const a = args[i];
35+ if (a === "--lang") {
36+ const v = args[++i];
37+ if (v !== "go" && v !== "rust") {
38+ console.error(`--lang must be "go" or "rust", got: ${v}`);
39+ process.exit(2);
40+ }
41+ lang = v;
42+ } else if (a === "--verbose") {
43+ verbose = true;
44+ } else {
45+ console.error(`unknown argument: ${a}`);
46+ usage();
47+ }
48+}
49+
50+if (lang === null) usage();
51+
52+const result = measureWorkingSetForRepo(repoPath, lang!);
53+
54+const output: Record<string, unknown> = {
55+ language: result.language,
56+ repoPath,
57+ minLoc: result.minLoc,
58+ maxLoc: result.maxLoc,
59+ total: result.total,
60+ included: result.included,
61+ ratio: result.ratio,
62+ ratioPercent: Number((result.ratio * 100).toFixed(2)),
63+};
64+
65+if (verbose) {
66+ output.files = result.files.map((f) => ({
67+ path: f.path,
68+ locCount: f.locCount,
69+ inBand:
70+ f.locCount >= result.minLoc &&
71+ f.locCount <= result.maxLoc &&
72+ !(lang === "go" && f.path.endsWith("_test.go")),
73+ }));
74+}
75+
76+console.log(JSON.stringify(output, null, 2));
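
The audit posts in this commit report a three-way distribution (for dive: 44 under, 48 in band, 0 over), while the CLI above emits only `total` and `included` plus, under `--verbose`, a per-file list. A rough sketch of how that list could be folded into the three buckets follows. `bucketByBand` and `isGoTestFile` are hypothetical names that do not exist in this commit, and the sample rows reuse the Go fixture sizes from src/c14_working_set_walker.test.ts. Because the walker returns `*_test.go` files and leaves their exclusion to the metric, the sketch drops them before tallying.

```typescript
// One entry of the CLI's --verbose `files` array (path + locCount are
// the fields this sketch relies on).
interface VerboseFile {
  path: string;
  locCount: number;
}

interface BandBuckets {
  under: number;
  inBand: number;
  over: number;
}

// Hypothetical predicate mirroring the Go rule described in this commit.
const isGoTestFile = (path: string): boolean => path.endsWith("_test.go");

// Hypothetical helper: tally files below, inside, and above [minLoc, maxLoc].
const bucketByBand = (
  files: ReadonlyArray<VerboseFile>,
  minLoc: number,
  maxLoc: number,
  isExcluded: (path: string) => boolean,
): BandBuckets => {
  const buckets: BandBuckets = { under: 0, inBand: 0, over: 0 };
  for (const f of files) {
    if (isExcluded(f.path)) continue; // excluded files count in no bucket
    if (f.locCount < minLoc) buckets.under++;
    else if (f.locCount > maxLoc) buckets.over++;
    else buckets.inBand++; // inclusive on both ends
  }
  return buckets;
};

// Sample rows: the Go fixture sizes used by the walker tests.
const fixtureFiles: VerboseFile[] = [
  { path: "a.go", locCount: 100 },
  { path: "b.go", locCount: 600 },
  { path: "c_test.go", locCount: 200 },
  { path: "pkg/inner.go", locCount: 60 },
  { path: "pkg/tiny.go", locCount: 10 },
];

const goBuckets = bucketByBand(fixtureFiles, 50, 500, isGoTestFile);
console.log(goBuckets);
```

The three buckets sum to the `total` the CLI reports, which gives a quick consistency check against the JSON output.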
added src/b32_working_set_polyglot.test.ts +164 −0
@@ -0,0 +1,164 @@
1+import { describe, expect, test } from "bun:test";
2+import {
3+ WORKING_SET_MAX_LOC,
4+ WORKING_SET_MIN_LOC,
5+} from "./a31_sama_v2.ts";
6+import {
7+ computeWorkingSetFitPolyglot,
8+ type PolyglotLanguage,
9+ type WorkingSetFile,
10+} from "./b32_working_set_polyglot.ts";
11+
12+// Mirror the inclusive-bound assertions in b32_sama_v2_metrics.test.ts.
13+// Same algorithm, same constants, same edge behaviour — the polyglot
14+// helper is allowed to compute a different SET of files (Go/Rust source
15+// trees rather than src/*.ts), but the RATIO formula must match the
16+// TS metric byte-for-byte. These tests pin that.
17+
18+const file = (path: string, locCount: number): WorkingSetFile => ({ path, locCount });
19+
20+describe("computeWorkingSetFitPolyglot — empty input", () => {
21+ test("empty list → 1.0 vacuous (matches the TS metric on an empty file map)", () => {
22+ const r = computeWorkingSetFitPolyglot([], "go");
23+ expect(r.ratio).toBe(1.0);
24+ expect(r.included).toBe(0);
25+ expect(r.total).toBe(0);
26+ });
27+
28+ test("empty list also vacuous under Rust", () => {
29+ const r = computeWorkingSetFitPolyglot([], "rust");
30+ expect(r.ratio).toBe(1.0);
31+ });
32+});
33+
34+describe("computeWorkingSetFitPolyglot — single-file extremes", () => {
35+ test("a single 100-line Go file → 1.0", () => {
36+ const r = computeWorkingSetFitPolyglot([file("pkg/x.go", 100)], "go");
37+ expect(r.ratio).toBe(1.0);
38+ expect(r.included).toBe(1);
39+ expect(r.total).toBe(1);
40+ });
41+
42+ test("a single 10-line file falls below the min → 0.0", () => {
43+ const r = computeWorkingSetFitPolyglot([file("pkg/x.go", 10)], "go");
44+ expect(r.ratio).toBe(0.0);
45+ expect(r.included).toBe(0);
46+ expect(r.total).toBe(1);
47+ });
48+
49+ test("a single 600-line file exceeds the max → 0.0", () => {
50+ const r = computeWorkingSetFitPolyglot([file("pkg/x.go", 600)], "go");
51+ expect(r.ratio).toBe(0.0);
52+ expect(r.included).toBe(0);
53+ expect(r.total).toBe(1);
54+ });
55+});
56+
57+describe("computeWorkingSetFitPolyglot — bound-edge inclusivity", () => {
58+ // The TS metric uses `lines >= MIN && lines <= MAX`. These tests
59+ // mirror b32_sama_v2_metrics.test.ts's "exact bounds are inclusive".
60+ test("LOC = 49 → out of band", () => {
61+ const r = computeWorkingSetFitPolyglot([file("pkg/x.go", WORKING_SET_MIN_LOC - 1)], "go");
62+ expect(r.included).toBe(0);
63+ });
64+
65+ test("LOC = 50 → in band", () => {
66+ const r = computeWorkingSetFitPolyglot([file("pkg/x.go", WORKING_SET_MIN_LOC)], "go");
67+ expect(r.included).toBe(1);
68+ });
69+
70+ test("LOC = 500 → in band", () => {
71+ const r = computeWorkingSetFitPolyglot([file("pkg/x.go", WORKING_SET_MAX_LOC)], "go");
72+ expect(r.included).toBe(1);
73+ });
74+
75+ test("LOC = 501 → out of band", () => {
76+ const r = computeWorkingSetFitPolyglot([file("pkg/x.go", WORKING_SET_MAX_LOC + 1)], "go");
77+ expect(r.included).toBe(0);
78+ });
79+});
80+
81+describe("computeWorkingSetFitPolyglot — mixed inputs", () => {
82+ test("half in / half out → 0.5", () => {
83+ const r = computeWorkingSetFitPolyglot([
84+ file("pkg/a.go", 100),
85+ file("pkg/b.go", 10),
86+ ], "go");
87+ expect(r.ratio).toBe(0.5);
88+ });
89+
90+ test("two in / two out → 0.5", () => {
91+ const r = computeWorkingSetFitPolyglot([
92+ file("pkg/a.go", 100),
93+ file("pkg/b.go", 300),
94+ file("pkg/c.go", 10),
95+ file("pkg/d.go", 800),
96+ ], "go");
97+ expect(r.ratio).toBe(0.5);
98+ });
99+});
100+
101+describe("computeWorkingSetFitPolyglot — Go test-file exclusion", () => {
102+ test("*_test.go files do NOT count toward total or included", () => {
103+ const r = computeWorkingSetFitPolyglot([
104+ file("pkg/x.go", 100),
105+ file("pkg/x_test.go", 200),
106+ file("pkg/y_test.go", 50),
107+ ], "go");
108+ // Only x.go counts; both _test.go files dropped before tallying.
109+ expect(r.total).toBe(1);
110+ expect(r.included).toBe(1);
111+ expect(r.ratio).toBe(1.0);
112+ });
113+
114+ test("a 100-line source + a 1-line _test.go sibling → 1.0 (mirrors the TS metric)", () => {
115+ const r = computeWorkingSetFitPolyglot([
116+ file("pkg/x.go", 100),
117+ file("pkg/x_test.go", 1),
118+ ], "go");
119+ expect(r.ratio).toBe(1.0);
120+ });
121+});
122+
123+describe("computeWorkingSetFitPolyglot — Rust inline-tests asymmetry", () => {
124+ test("Rust includes ALL .rs files (no path-based test exclusion)", () => {
125+ // Rust convention: tests live inside source files under
126+ // #[cfg(test)] mod tests. The polyglot helper preserves that —
127+ // it does NOT exclude any .rs path. The asymmetry is documented
128+ // in the b32_working_set_polyglot.ts source comment.
129+ const r = computeWorkingSetFitPolyglot([
130+ file("src/lib.rs", 100),
131+ file("src/tests.rs", 100),
132+ file("src/something_test.rs", 100),
133+ ], "rust");
134+ expect(r.total).toBe(3);
135+ expect(r.included).toBe(3);
136+ expect(r.ratio).toBe(1.0);
137+ });
138+});
139+
140+describe("computeWorkingSetFitPolyglot — reproducibility", () => {
141+ test("same input → identical output across runs (deep-equal)", () => {
142+ const input: WorkingSetFile[] = [
143+ file("a.go", 100),
144+ file("b.go", 60),
145+ file("c.go", 480),
146+ file("d.go", 20),
147+ file("e_test.go", 999),
148+ ];
149+ const langs: PolyglotLanguage[] = ["go", "rust"];
150+ for (const l of langs) {
151+ const a = computeWorkingSetFitPolyglot(input, l);
152+ const b = computeWorkingSetFitPolyglot(input, l);
153+ expect(a).toEqual(b);
154+ }
155+ });
156+});
157+
158+describe("computeWorkingSetFitPolyglot — bounds echo", () => {
159+ test("result echoes minLoc / maxLoc from a31_sama_v2.ts (auditable)", () => {
160+ const r = computeWorkingSetFitPolyglot([], "go");
161+ expect(r.minLoc).toBe(WORKING_SET_MIN_LOC);
162+ expect(r.maxLoc).toBe(WORKING_SET_MAX_LOC);
163+ });
164+});
added src/b32_working_set_polyglot.ts +82 −0
@@ -0,0 +1,82 @@
1+// b32 — logic: §5 workingSetFit metric for polyglot source trees
2+// (Go, Rust). Pure function, no I/O. Mirrors the formula in
3+// src/b32_sama_v2_metrics.ts byte-for-byte:
4+//
5+// workingSetFit = files-in-band ÷ total-source-files
6+//
7+// where in-band means WORKING_SET_MIN_LOC ≤ LOC ≤ WORKING_SET_MAX_LOC,
8+// inclusive on both ends. Bounds are imported from a31_sama_v2.ts so
9+// the cross-language number is computed against the same band as this
10+// site's own metric — the single-source-of-truth determinism property
11+// from /sama/v2 §0.
12+//
13+// Used by scripts/measure-working-set.ts (the polyglot CLI) and the
14+// c14_working_set_walker.ts adapter, which feed it a pre-counted file
15+// summary so this module stays pure and unit-testable.
16+
17+import {
18+ WORKING_SET_MAX_LOC,
19+ WORKING_SET_MIN_LOC,
20+} from "./a31_sama_v2.ts";
21+
22+// Language tag governs the test-file exclusion rule below.
23+export type PolyglotLanguage = "go" | "rust";
24+
25+export interface WorkingSetFile {
26+ // Repo-relative path (e.g. "crates/printer/src/standard.rs").
27+ path: string;
28+ // File length in lines, matching the TS metric's `content.split("\n").length`.
29+ locCount: number;
30+}
31+
32+export interface WorkingSetResult {
33+ language: PolyglotLanguage;
34+ included: number; // files inside [MIN, MAX] LOC, inclusive
35+ total: number; // total source files (after test-file exclusion)
36+ ratio: number; // included / total; empty-input → 1.0 vacuous
37+ minLoc: number; // echoed back from a31 so callers can audit
38+ maxLoc: number;
39+}
40+
41+// Test-file exclusion. The asymmetry is honest, not arbitrary:
42+//
43+// Go: tests live in `*_test.go` files. The TS metric excludes
44+// `*.test.ts` for the same structural reason — they aren't
45+// working modules in their own right, they verify one.
46+//
47+// Rust: tests live INSIDE source files under `#[cfg(test)] mod tests`.
48+// Excluding files at file-granularity would either lose every
49+// tested file or accidentally include all of them. The
50+// inline-tests dialect drafted at /sama/v2#62-inline-tests-dialect
51+// is what makes this asymmetry coherent: where the test attaches
52+// is a language-level choice; the working-set property the metric
53+// measures is unaffected.
54+const isPolyglotTestFile = (path: string, lang: PolyglotLanguage): boolean => {
55+ if (lang === "go") return path.endsWith("_test.go");
56+ return false;
57+};
58+
59+export const computeWorkingSetFitPolyglot = (
60+ files: ReadonlyArray<WorkingSetFile>,
61+ lang: PolyglotLanguage,
62+): WorkingSetResult => {
63+ let included = 0;
64+ let total = 0;
65+ for (const f of files) {
66+ if (isPolyglotTestFile(f.path, lang)) continue;
67+ total++;
68+ if (f.locCount >= WORKING_SET_MIN_LOC && f.locCount <= WORKING_SET_MAX_LOC) {
69+ included++;
70+ }
71+ }
72+ // Match the TS metric: empty input → 1.0 (vacuously satisfied).
73+ const ratio = total === 0 ? 1.0 : included / total;
74+ return {
75+ language: lang,
76+ included,
77+ total,
78+ ratio,
79+ minLoc: WORKING_SET_MIN_LOC,
80+ maxLoc: WORKING_SET_MAX_LOC,
81+ };
82+};
added src/c14_working_set_walker.test.ts +117 −0
@@ -0,0 +1,117 @@
1+import { afterAll, beforeAll, describe, expect, test } from "bun:test";
2+import { mkdirSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
3+import { tmpdir } from "node:os";
4+import { resolve } from "node:path";
5+import {
6+ collectPolyglotFiles,
7+ measureWorkingSetForRepo,
8+} from "./c14_working_set_walker.ts";
9+
10+// Hermetic fixture: build a tiny fake repo in a tmpdir, walk it,
11+// assert what comes back. The CLI script's real-world use against
12+// /tmp/dive and /tmp/ripgrep is exercised via the measurement step
13+// in this PR, not via unit tests; this file pins the algorithm.
14+
15+const FIXTURE_ROOT = mkdtempSync(resolve(tmpdir(), "tdd-md-wswalker-"));
16+
17+const writeFile = (relPath: string, lineCount: number): void => {
18+ const abs = resolve(FIXTURE_ROOT, relPath);
19+ mkdirSync(abs.split("/").slice(0, -1).join("/"), { recursive: true });
20+ const lines = Array.from({ length: lineCount }, (_, i) => `// line ${i}`);
21+ writeFileSync(abs, lines.join("\n"));
22+};
23+
24+beforeAll(() => {
25+ // Top-level Go sources (one in-band, one out-of-band, one test file).
26+ writeFile("a.go", 100); // in band
27+ writeFile("b.go", 600); // out (over)
28+ writeFile("c_test.go", 200); // excluded for Go
29+ // Nested.
30+ writeFile("pkg/inner.go", 60); // in band, inside subdir
31+ writeFile("pkg/tiny.go", 10); // out (under)
32+ // Rust sources (separate sub-tree).
33+ writeFile("rs/src/lib.rs", 120); // in band
34+ writeFile("rs/src/big.rs", 700); // out (over)
35+ writeFile("rs/src/tests.rs", 75); // included (Rust has no path test rule)
36+ // Skip directories that should NOT be walked.
37+ writeFile(".git/HEAD.go", 100); // .git is skipped
38+ writeFile("target/build.rs", 100); // target/ is skipped
39+ writeFile("vendor/pkg.go", 100); // vendor/ is skipped
40+ writeFile("node_modules/dep.go", 100); // node_modules/ skipped
41+});
42+
43+afterAll(() => {
44+ rmSync(FIXTURE_ROOT, { recursive: true, force: true });
45+});
46+
47+describe("collectPolyglotFiles — Go", () => {
48+ test("walks recursively and finds the right .go files", () => {
49+ const files = collectPolyglotFiles(FIXTURE_ROOT, "go");
50+ const paths = files.map((f) => f.path);
51+ // Excluded: .git/*, target/*, vendor/*, node_modules/*.
52+ // Included: a.go, b.go, c_test.go (the helper RETURNS it; the
53+ // metric helper drops it during the count — separation of concerns).
54+ expect(paths).toContain("a.go");
55+ expect(paths).toContain("b.go");
56+ expect(paths).toContain("c_test.go");
57+ expect(paths).toContain("pkg/inner.go");
58+ expect(paths).toContain("pkg/tiny.go");
59+ expect(paths).not.toContain(".git/HEAD.go");
60+ expect(paths).not.toContain("vendor/pkg.go");
61+ expect(paths).not.toContain("node_modules/dep.go");
62+ });
63+
64+ test("LOC counts match content.split('\\n').length", () => {
65+ const files = collectPolyglotFiles(FIXTURE_ROOT, "go");
66+ const a = files.find((f) => f.path === "a.go");
67+ // We wrote 100 lines joined by "\n" → split("\n").length === 100.
68+ expect(a?.locCount).toBe(100);
69+ });
70+
71+ test("returns files in deterministic sorted order", () => {
72+ const a = collectPolyglotFiles(FIXTURE_ROOT, "go").map((f) => f.path);
73+ const b = collectPolyglotFiles(FIXTURE_ROOT, "go").map((f) => f.path);
74+ expect(a).toEqual(b);
75+ const sorted = [...a].sort((x, y) => x.localeCompare(y));
76+ expect(a).toEqual(sorted);
77+ });
78+});
79+
80+describe("collectPolyglotFiles — Rust", () => {
81+ test("finds only .rs files; ignores .go", () => {
82+ const files = collectPolyglotFiles(FIXTURE_ROOT, "rust");
83+ const paths = files.map((f) => f.path);
84+ expect(paths).toContain("rs/src/lib.rs");
85+ expect(paths).toContain("rs/src/big.rs");
86+ expect(paths).toContain("rs/src/tests.rs");
87+ expect(paths.every((p) => p.endsWith(".rs"))).toBe(true);
88+ });
89+
90+ test("target/build.rs is excluded (skipped dir)", () => {
91+ const files = collectPolyglotFiles(FIXTURE_ROOT, "rust");
92+ const paths = files.map((f) => f.path);
93+ expect(paths).not.toContain("target/build.rs");
94+ });
95+});
96+
97+describe("measureWorkingSetForRepo — end-to-end", () => {
98+ test("Go fixture: 2 in band (a.go=100, pkg/inner.go=60) of 4 source files (excluding c_test.go) = 0.5", () => {
99+ const r = measureWorkingSetForRepo(FIXTURE_ROOT, "go");
100+ expect(r.total).toBe(4); // a, b, pkg/inner, pkg/tiny (c_test excluded)
101+ expect(r.included).toBe(2); // a, pkg/inner
102+ expect(r.ratio).toBe(0.5);
103+ });
104+
105+ test("Rust fixture: 2 in band (lib.rs=120, tests.rs=75) of 3 .rs files = 2/3", () => {
106+ const r = measureWorkingSetForRepo(FIXTURE_ROOT, "rust");
107+ expect(r.total).toBe(3);
108+ expect(r.included).toBe(2);
109+ expect(r.ratio).toBeCloseTo(2 / 3, 6);
110+ });
111+
112+ test("echoes the bounds back so callers can audit which numbers produced the ratio", () => {
113+ const r = measureWorkingSetForRepo(FIXTURE_ROOT, "go");
114+ expect(r.minLoc).toBe(50);
115+ expect(r.maxLoc).toBe(500);
116+ });
117+});
added src/c14_working_set_walker.ts +105 −0
@@ -0,0 +1,105 @@
1+// c14 — adapter: filesystem walker that produces a polyglot
2+// WorkingSetFile summary for an external source tree (Go or Rust).
3+// Recursive directory walk; counts lines of each .go / .rs file using
4+// the same `content.split("\n").length` rule as b32_sama_v2_metrics so
5+// the cross-language metric matches the TS metric byte-for-byte.
6+//
7+// Skipped directories are the conventional non-source trees that
8+// would otherwise inflate the denominator with vendored / generated
9+// / build artefacts: .git, target/ (Rust build output), vendor/ (Go
10+// vendored deps), node_modules/ (incidental, defensive).
11+//
12+// The walker is hermetic — given a path that is a directory it
13+// resolves the file set deterministically. Calls into the pure helper
14+// in b32_working_set_polyglot.ts for the ratio.
15+
16+import { readdirSync, readFileSync, statSync } from "node:fs";
17+import { resolve } from "node:path";
18+import {
19+ computeWorkingSetFitPolyglot,
20+ type PolyglotLanguage,
21+ type WorkingSetFile,
22+ type WorkingSetResult,
23+} from "./b32_working_set_polyglot.ts";
24+
25+const SKIPPED_DIRS: ReadonlySet<string> = new Set([
26+ ".git",
27+ "target",
28+ "vendor",
29+ "node_modules",
30+]);
31+
32+const EXTENSION_FOR: Record<PolyglotLanguage, string> = {
33+ go: ".go",
34+ rust: ".rs",
35+};
36+
37+// Walk a directory and return every {path, locCount} pair for files
38+// whose extension matches the target language. Paths are returned
39+// repo-relative (i.e. relative to the `repoRoot` passed in) so they're
40+// stable across machines.
41+export const collectPolyglotFiles = (
42+ repoRoot: string,
43+ lang: PolyglotLanguage,
44+): WorkingSetFile[] => {
45+ const ext = EXTENSION_FOR[lang];
46+ const out: WorkingSetFile[] = [];
47+
48+ const walk = (absDir: string, relDir: string): void => {
49+ let entries: ReturnType<typeof readdirSync>;
50+ try {
51+ entries = readdirSync(absDir, { withFileTypes: true });
52+ } catch {
53+ // Permission errors / non-existent: surface to caller, but
54+ // letting one bad subtree halt the whole measurement would be
55+ // worse than reporting the partial set. Return silently here;
56+ // the CLI's smoke checks at the top level will catch a totally
57+ // unreadable root.
58+ return;
59+ }
60+ for (const e of entries) {
61+ if (e.name.startsWith(".") && e.name !== ".") {
62+ // .git, .github, .vscode, ...: defensive skip on all hidden
63+ // directories. Dotfiles are NOT skipped here; they fall through
64+ // to the extension filter below (a `.x.go` file would count).
65+ if (e.isDirectory() && SKIPPED_DIRS.has(e.name)) continue;
66+ if (e.isDirectory()) continue; // skip all hidden dirs
67+ }
68+ if (e.isDirectory()) {
69+ if (SKIPPED_DIRS.has(e.name)) continue;
70+ const sub = resolve(absDir, e.name);
71+ const subRel = relDir === "" ? e.name : `${relDir}/${e.name}`;
72+ walk(sub, subRel);
73+ continue;
74+ }
75+ if (!e.isFile()) continue;
76+ if (!e.name.endsWith(ext)) continue;
77+ const abs = resolve(absDir, e.name);
78+ const relPath = relDir === "" ? e.name : `${relDir}/${e.name}`;
79+ const content = readFileSync(abs, "utf8");
80+ // Match b32_sama_v2_metrics.ts: lines = content.split("\n").length.
81+ const locCount = content.split("\n").length;
82+ out.push({ path: relPath, locCount });
83+ }
84+ };
85+
86+ const root = resolve(repoRoot);
87+ const rootStat = statSync(root);
88+ if (!rootStat.isDirectory()) {
89+ throw new Error(`expected a directory, got: ${repoRoot}`);
90+ }
91+ walk(root, "");
92+ // Sort for deterministic output (readdirSync is platform-dependent).
93+ out.sort((a, b) => a.path.localeCompare(b.path));
94+ return out;
95+};
96+
97+// Convenience: walk + compute in one call. Used by the CLI script.
98+export const measureWorkingSetForRepo = (
99+ repoRoot: string,
100+ lang: PolyglotLanguage,
101+): WorkingSetResult & { files: WorkingSetFile[] } => {
102+ const files = collectPolyglotFiles(repoRoot, lang);
103+ const result = computeWorkingSetFitPolyglot(files, lang);
104+ return { ...result, files };
105+};
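
One note on the sort step above: `localeCompare` called without a locale argument uses the runtime's default locale, so the path order may differ between environments even though the ratio itself does not depend on order. If byte-stable ordering of the `--verbose` file list matters, a plain code-unit comparison is one alternative. The sketch below is an assumption-labelled illustration, not what the commit does; `byCodeUnit` is a hypothetical name.

```typescript
// Hypothetical comparator: orders strings by UTF-16 code units, which
// does not depend on the runtime's locale settings.
const byCodeUnit = (a: string, b: string): number =>
  a < b ? -1 : a > b ? 1 : 0;

// Paths taken from the Go fixture in the walker tests, shuffled.
const shuffledPaths = [
  "pkg/tiny.go",
  "b.go",
  "a.go",
  "pkg/inner.go",
  "c_test.go",
];

const sortedPaths = [...shuffledPaths].sort(byCodeUnit);
console.log(sortedPaths);

// Uppercase sorts before lowercase in code-unit order.
const mixedCase = ["a.go", "Z.go"].sort(byCodeUnit);
console.log(mixedCase);
```

The trade-off is readability of the listing: code-unit order groups uppercase names ahead of lowercase ones, which a locale-aware sort typically would not.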