§5 workingSetFit ported to Go + Rust; dive and ripgrep audits gain measured numbers
The cross-repo argument rested on n=1 measured (this site) + n=3
hand-estimated (dive, ripgrep, WP plugin). This commit ports the §5
workingSetFit metric to Go and Rust source trees, runs it against
/tmp/dive and /tmp/ripgrep at pinned SHAs, and replaces the
hand-estimated workingSetFit values in the audit blog posts + home page
table with the measured numbers. The empirical chain is now n=3 measured
+ n=1 estimated.
Components:
- src/b32_working_set_polyglot.ts — pure Layer 1: files+lang → ratio,
  imports WORKING_SET_MIN_LOC=50 / MAX=500 from a31_sama_v2.ts (single
  source of truth; no duplication). Formula matches b32_sama_v2_metrics.ts
  byte-for-byte: files-in-band ÷ total, inclusive bounds. Empty input
  → 1.0 vacuous.
- src/c14_working_set_walker.ts — Layer 2 adapter: recursive .go/.rs
  walker. Skips .git/, target/, vendor/, node_modules/, dotdirs. LOC
  counter uses content.split("\n").length to match the TS metric.
- scripts/measure-working-set.ts — CLI: --lang go|rust + repo-path →
  JSON to stdout. Reproducible given a pinned commit SHA.
- 24 new tests cover bound-edge inclusivity (LOC=49 out / =50 in /
  =500 in / =501 out, mirroring b32_sama_v2_metrics.test.ts), language
  test-file asymmetry (Go excludes *_test.go; Rust includes all .rs
  because tests are inline — see /sama/v2#62-inline-tests-dialect),
  empty-input vacuous, reproducibility under deep-equal.
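
The components above reduce to a small pure computation. A minimal TypeScript sketch, assuming the band constants quoted in this message (50 and 500) instead of importing them from a31_sama_v2.ts. The names `countLoc`, `isTestFile`, and `workingSetFit` are illustrative and are not the identifiers used in the commit:

```typescript
type Lang = "go" | "rust";

interface SourceFile {
  path: string;
  content: string;
}

// Assumed values, copied from the commit message (inclusive bounds).
const MIN_LOC = 50;
const MAX_LOC = 500;

// Line count as described above: split on "\n" and take the length.
// A file that ends in a trailing newline therefore counts one more
// line than `wc -l` would report.
const countLoc = (content: string): number => content.split("\n").length;

// Go keeps tests in *_test.go files; Rust tests are inline,
// so no .rs path is excluded.
const isTestFile = (path: string, lang: Lang): boolean =>
  lang === "go" && path.endsWith("_test.go");

const workingSetFit = (files: SourceFile[], lang: Lang): number => {
  const sources = files.filter((f) => !isTestFile(f.path, lang));
  if (sources.length === 0) return 1.0; // vacuous: nothing can be out of band
  const inBand = sources.filter((f) => {
    const loc = countLoc(f.content);
    return loc >= MIN_LOC && loc <= MAX_LOC;
  });
  return inBand.length / sources.length;
};

// Helper for the demo: build a string with exactly n lines.
const lines = (n: number): string =>
  Array.from({ length: n }, () => "x").join("\n");

const sample: SourceFile[] = [
  { path: "a.go", content: lines(50) },       // on the lower bound, in band
  { path: "b.go", content: lines(501) },      // one past the upper bound
  { path: "a_test.go", content: lines(200) }, // dropped under the Go rule only
];

console.log(workingSetFit(sample, "go"));
console.log(workingSetFit(sample, "rust"));
```

Under the Go rule the test file is removed before tallying, so the denominator is 2; under the Rust rule all three files count. The same input can therefore yield different ratios per language, which is the asymmetry the tests pin.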
Measured results:
- dive @d6c691947f8fda635c952a17ee3b7555379d58f0:
  48 of 92 source .go files in [50, 500] LOC = 52.17%
  (originally hand-estimated ~80%; 28-point miss)
- ripgrep @4519153e5e461527f4bca45b042fff45c4ec6fb9:
  54 of 100 .rs files in [50, 500] LOC = 54.00%
  (originally hand-estimated ~60%; 6-point miss)
Cross-repo signal: ripgrep (54.00%) and dive (52.17%) measure within
two percentage points — the eyeballed estimates said they were 20 points
apart. The metric, not the eye, was right.
The dive audit gains a §0-style hand-trace ("find /tmp/dive -name '*.go'
-not -name '*_test.go' -not -path '*/.git/*' -not -path '*/vendor/*'
| wc -l" yields 92; 48 fall in band; 48/92 = 0.5217) so the
measurement is auditable per the deterministic-program contract.
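
The arithmetic in the hand-trace and in the estimate comparison can be rechecked mechanically. A small sketch using only the counts and estimates stated in this message; `toPercent` and `miss` are hypothetical helper names, and the two-decimal rounding is assumed to follow the CLI's `toFixed(2)` convention:

```typescript
// Counts and estimates are taken from the "Measured results" list above.
const measured = {
  dive: { inBand: 48, total: 92, estimatedPercent: 80 },
  ripgrep: { inBand: 54, total: 100, estimatedPercent: 60 },
};

// Ratio expressed as a percentage string with two decimals.
const toPercent = (inBand: number, total: number): string =>
  ((inBand / total) * 100).toFixed(2);

// Distance between the hand estimate and the measurement,
// in percentage points.
const miss = (estimatedPercent: number, inBand: number, total: number): string =>
  Math.abs(estimatedPercent - (inBand / total) * 100).toFixed(2);

for (const [repo, m] of Object.entries(measured)) {
  console.log(
    repo,
    toPercent(m.inBand, m.total),
    miss(m.estimatedPercent, m.inBand, m.total),
  );
}
```

The dive miss comes out at roughly 27.83 points, which the message rounds to 28; the ripgrep miss is 6 points.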
Anti-fudge: this repo's sama.profile.toml is unchanged; the §4 verifier
behaviour is bit-identical; /sama/v2/verify continues to report 7/7 ✓.
336/336 tests pass total (was 312; +24 new).
Co-Authored-By: Claude Opus 4.7 <[email protected]>
8 files changed · +577 −22
content/blog/sama-v2-go-project-dive.md
+14
−8
| @@ -141,17 +141,23 @@ Derives from Law. No file's declared layer is contradicted by what it imports. | ||
| 141 | 141 | |
| 142 | 142 | **Estimated tally: 5 of 7 pass under the directory-based dialect, with 2 named failures (Sorted, Modeled-tests).** That's a real result, not "0/7 because no one tried." |
| 143 | 143 | |
| 144 | -## The §5 metrics — estimated for `dive` | |
| 144 | +## The §5 metrics — mixed measurement and estimate for `dive` | |
| 145 | 145 | |
| 146 | -| metric | `dive` (Go, estimated) | WP plugin (PHP, estimated) | tdd.md (TS, measured) | | |
| 146 | +| metric | `dive` (Go) | WP plugin (PHP, estimated) | tdd.md (TS, measured) | | |
| 147 | 147 | |---|---|---|---| |
| 148 | -| §4 checks passing | ~5 / 7 | 0 / 7 | 7 / 7 | | |
| 149 | -| graphDepth | ~5 (cmd → command → ui → dive → filetree → internal/utils) | ~3 | 7 | | |
| 150 | -| boundaryRatio | ~85% (one borderline case in `options/ci.go`) | <10% | 100% | | |
| 151 | -| workingSetFit (50–500 LOC) | ~80% | ~47% | 80% | | |
| 152 | -| violationCounts (sum) | ~30 (mostly Modeled-tests gaps) | 17+ | 0 | | |
| 148 | +| §4 checks passing | ~5 / 7 (estimated) | 0 / 7 | 7 / 7 | | |
| 149 | +| graphDepth | ~5 (estimated; cmd → command → ui → dive → filetree → internal/utils) | ~3 | 7 | | |
| 150 | +| boundaryRatio | ~85% (estimated; one borderline case in `options/ci.go`) | <10% | 100% | | |
| 151 | +| **workingSetFit (50–500 LOC)** | **52.17% (measured, [dive@d6c69194](https://github.com/wagoodman/dive/commit/d6c691947f8fda635c952a17ee3b7555379d58f0))** — originally estimated ~80% | ~47% | 80% (measured) | | |
| 152 | +| violationCounts (sum) | ~30 (estimated; mostly Modeled-tests gaps) | 17+ | 0 | | |
| 153 | 153 | |
| 154 | -The `workingSetFit` is essentially **identical** between `dive` and this site (80%). Two unrelated projects, two different languages, two different scopes, written by different teams under different conventions — landing at the same fit ratio is a useful data point: 80% might just be what "reasonably engineered" looks like on this axis. | |
| 154 | +**The `workingSetFit` is the metric I most expected to land near tdd.md's 80%** — two engineered codebases, both with linters and conventions. The measurement says otherwise. | |
| 155 | + | |
| 156 | +**Hand-trace** (auditable per [/sama/v2 §0](/sama/v2)): running `find /tmp/dive -name '*.go' -not -name '*_test.go' -not -path '*/.git/*' -not -path '*/vendor/*' | wc -l` returns **92 source .go files**. Of those, **48** fall in [50, 500] LOC inclusive (matching `WORKING_SET_MIN_LOC` and `WORKING_SET_MAX_LOC` in [`src/a31_sama_v2.ts`](/GIT/syntaxai/tdd.md/blob/main/src/a31_sama_v2.ts)). 48 ÷ 92 = 0.5217 ≈ 52.17%. The polyglot §5 emitter at [`scripts/measure-working-set.ts`](/GIT/syntaxai/tdd.md/blob/main/scripts/measure-working-set.ts) produces the same number from the same source tree. | |
| 157 | + | |
| 158 | +The distribution explains it: **44 files under 50 LOC** (mostly small type-only modules, single-helper files, and platform-shim stubs like `dive/image/docker/docker_host_windows.go` at 6 LOC), **48 in band**, and — strikingly — **0 over 500 LOC**. `dive`'s working-set miss is not god-classes (the §4.5 Atomic check passes outright); it's the *opposite* failure mode: many files small enough to fall below the substantive-module threshold. | |
| 159 | + | |
| 160 | +The original ~80% estimate was wrong, and wrong in a direction casual eyeballing wouldn't catch — counting visible-on-the-screen files isn't the same as counting them and applying a band filter. That 28-point miss between estimate and measurement is itself the empirical case for the metric existing at all: the metric surfaces a property the human estimate missed. | |
| 155 | 161 | |
| 156 | 162 | ## What `dive` would look like at 7/7 — the last 30% |
| 157 | 163 | |
content/blog/sama-v2-rust-project-ripgrep.md
+12
−8
| @@ -145,17 +145,21 @@ Derives from Law on the same edge set. | ||
| 145 | 145 | |
| 146 | 146 | *(Update: all three dialects have since been drafted into [/sama/v2 §6.A](/sama/v2#6a-v21-dialects-provisional) as v2.1-draft extensions, with the same five-part operational structure — what they relax, what property they preserve, and the falsifiable cross-repo experiment that would invalidate each.)* |
| 147 | 147 | |
| 148 | -## §5 metric estimates | |
| 148 | +## §5 metrics — measured workingSetFit, estimated the rest | |
| 149 | 149 | |
| 150 | -| metric | ripgrep (estimated) | dive (Go) | tdd.md (TS, measured) | WP plugin (PHP) | | |
| 150 | +| metric | ripgrep | dive (Go) | tdd.md (TS, measured) | WP plugin (PHP) | | |
| 151 | 151 | |---|---|---|---|---| |
| 152 | -| §4 checks passing | ~3/7 strict, ~5/7 under v2.1 dialects | ~5/7 | 7/7 ✓ | 0/7 | | |
| 153 | -| graphDepth | ~5 (matcher → engine → searcher → printer → core) | ~5 | 7 | ~3 | | |
| 154 | -| boundaryRatio | ~95% | ~85% | 100% | <10% | | |
| 155 | -| workingSetFit (50–500 LOC) | ~60% (those 19 big files drag it down) | ~80% | 80% | ~47% | | |
| 156 | -| violationCounts (sum) | ~50 (19 Atomic + ~30 Modeled-tests under sibling-rule) | ~30 | 0 | 17+ | | |
| 152 | +| §4 checks passing | ~3/7 strict, ~5/7 under v2.1 dialects (estimated) | ~5/7 (estimated) | 7/7 ✓ | 0/7 | | |
| 153 | +| graphDepth | ~5 estimated (matcher → engine → searcher → printer → core) | ~5 (estimated) | 7 | ~3 | | |
| 154 | +| boundaryRatio | ~95% (estimated) | ~85% (estimated) | 100% | <10% | | |
| 155 | +| **workingSetFit (50–500 LOC)** | **54.00% (measured, [ripgrep@4519153e](https://github.com/BurntSushi/ripgrep/commit/4519153e5e461527f4bca45b042fff45c4ec6fb9))** — originally estimated ~60% | **52.17% (measured, [dive@d6c69194](https://github.com/wagoodman/dive/commit/d6c691947f8fda635c952a17ee3b7555379d58f0))** — originally estimated ~80% | 80% | ~47% | | |
| 156 | +| violationCounts (sum) | ~50 estimated (Atomic + Modeled-tests under sibling-rule) | ~30 (estimated) | 0 | 17+ | | |
| 157 | 157 | |
| 158 | -ripgrep's `workingSetFit` is the metric that surprises here: ~60%, lower than dive *and* lower than this site. That's the 19 big files pulling the distribution down. **And yet most of those files are appropriate to their content.** It's a useful signal: workingSetFit is not by itself a quality measure — a project full of declaration catalogs will score lower than a project full of small handlers without being architecturally worse. | |
| 158 | +ripgrep's `workingSetFit` measures 54.00% (from the polyglot §5 emitter at [`scripts/measure-working-set.ts`](/GIT/syntaxai/tdd.md/blob/main/scripts/measure-working-set.ts), inclusive bounds [50, 500] LOC). The distribution: **100 .rs files** total, **16 under 50 LOC**, **54 in band**, **30 over 500 LOC** — appreciably more than the "19 big files" I eyeballed in the original audit. The over-cap list ranges from the textbook declarative-exempt catalog (`crates/core/flags/defs.rs` at 7,780 LOC) down to genuinely borderline files at 500–800 LOC like `crates/pcre2/src/matcher.rs` (506) and `crates/cli/src/decompress.rs` (533). | |
| 159 | + | |
| 160 | +**And yet most of those files are appropriate to their content.** workingSetFit by itself doesn't say which side of the line each file falls on — that's what the [declarative-exemption dialect](/sama/v2#63-declarative-exemption-dialect) is for. The metric surfaces the property; the policy decides what to do with it. | |
| 161 | + | |
| 162 | +The cross-repo comparison the measurement makes possible is more interesting than the single number. **ripgrep (54%) and dive (52%) measure within two percentage points of each other** — two unrelated codebases in two different languages, written by different teams under different conventions, landing in the same working-set band when measured against the same bounds. That's the kind of cross-repo signal §6 says it wants. The eyeballed estimates (~60% and ~80%) said the two projects were 20 points apart; the measurement says they're 2 points apart. The metric, not the eye, was right. | |
| 159 | 163 | |
| 160 | 164 | This is exactly the §5 intent. The metric surfaces a property; whether that property is good or bad depends on what the file content *should be*. Compliance scores conflate the two; metrics keep them separate. |
| 161 | 165 | |
content/home.md
+7
−6
| @@ -56,17 +56,18 @@ SAMA bundles those findings into four constraints a CI job can enforce. *Sorted* | ||
| 56 | 56 | |
| 57 | 57 | **The load-bearing property isn't that LLMs have small context windows — modern models have 200k+ tokens.** The load-bearing property is **mechanical enforceability**: the verifier fails the build when a file crosses the line cap or an import points the wrong way. Discipline that lives only in code review quietly slips under agent pressure; discipline that lives in a CI gate keeps its shape across an arbitrary number of agent commits. The context-window research above explains the *why*; the verifier explains the *how*. |
| 58 | 58 | |
| 59 | -## Three datapoints on the same axes | |
| 59 | +## Datapoints on the same axes | |
| 60 | 60 | |
| 61 | -Empirical baseline so far (the §5 metrics, [computed live](/sama/v2/verify) for this site and hand-traced for the two audits): | |
| 61 | +Empirical baseline so far. The §4 score for this site is [computed live](/sama/v2/verify); the §4 scores for the other repos are hand-estimated. The **workingSetFit** column is now measured for three of the four repos by the polyglot §5 emitter at [`scripts/measure-working-set.ts`](/GIT/syntaxai/tdd.md/blob/main/scripts/measure-working-set.ts); the remaining columns are still hand-estimated where flagged. | |
| 62 | 62 | |
| 63 | 63 | | project | language | §4 score | workingSetFit | boundaryRatio | graphDepth | |
| 64 | 64 | |---|---|---|---|---|---| |
| 65 | -| **tdd.md** (this site) | TypeScript | **7 / 7 ✓** (measured) | 80% | 100% | 7 | | |
| 66 | -| [**wagoodman/dive**](/blog/sama-v2-go-project-dive) | Go | ~5 / 7 (estimated) | ~80% | ~85% | ~5 | | |
| 67 | -| [**Open Graph plugin**](/blog/sama-v2-wordpress-plugin-audit) | PHP / WordPress | 0 / 7 (estimated) | ~47% | <10% | ~3 | | |
| 65 | +| **tdd.md** (this site) | TypeScript | **7 / 7 ✓** (measured) | 80% (measured) | 100% (measured) | 7 (measured) | | |
| 66 | +| [**wagoodman/dive**](/blog/sama-v2-go-project-dive) | Go | ~5 / 7 (estimated) | **52.17%** (measured, [@d6c69194](https://github.com/wagoodman/dive/commit/d6c691947f8fda635c952a17ee3b7555379d58f0)) | ~85% (estimated) | ~5 (estimated) | | |
| 67 | +| [**BurntSushi/ripgrep**](/blog/sama-v2-rust-project-ripgrep) | Rust | ~3-5 / 7 (estimated, depends on v2.1 dialect uptake) | **54.00%** (measured, [@4519153e](https://github.com/BurntSushi/ripgrep/commit/4519153e5e461527f4bca45b042fff45c4ec6fb9)) | ~95% (estimated) | ~5 (estimated) | | |
| 68 | +| [**Open Graph plugin**](/blog/sama-v2-wordpress-plugin-audit) | PHP / WordPress | 0 / 7 (estimated) | ~47% (estimated) | <10% (estimated) | ~3 (estimated) | | |
| 68 | 69 | |
| 69 | -Three points is not yet a "v2 is worth following" claim. §6 of the spec is explicit that promotion to official requires cross-repo deltas, not a single dogfood. But the same five numbers are now defined, computable, and published — which is the prerequisite the spec sets before any later claim becomes testable. | |
| 70 | +Four points is not yet a "v2 is worth following" claim. §6 of the spec is explicit that promotion to official requires cross-repo *deltas*, not a single dogfood. But three workingSetFit rows are now *measured* against the same bounds the spec defines — a quiet but load-bearing step from "we have numbers" to "we have *the same* numbers across repos." The cross-repo signal that emerges: ripgrep (54.00%) and dive (52.17%) land within two percentage points of each other, suggesting workingSetFit in the 50–55% range may be characteristic of mature compiled-language CLI tools — a hypothesis that needs more datapoints to confirm but is now *testable* in a way it was not when the numbers were all eyeballed. | |
| 70 | 71 | |
| 71 | 72 | ## See it in practice |
| 72 | 73 | |
scripts/measure-working-set.ts
+76
−0
| @@ -0,0 +1,76 @@ | ||
| 1 | +#!/usr/bin/env bun | |
| 2 | +// measure-working-set — CLI for the §5 polyglot workingSetFit metric. | |
| 3 | +// Given a path to a checked-out Go or Rust source tree, emit the | |
| 4 | +// measured ratio as JSON to stdout. | |
| 5 | +// | |
| 6 | +// Usage: | |
| 7 | +// bun scripts/measure-working-set.ts <repo-path> --lang go | |
| 8 | +// bun scripts/measure-working-set.ts <repo-path> --lang rust | |
| 9 | +// | |
| 10 | +// The number it emits is reproducible: given the same checked-out | |
| 11 | +// source tree, every run prints the same ratio to full float precision. | |
| 12 | +// Pair the output with the repo's commit SHA when reporting; see | |
| 13 | +// /sama/v2 §5 (operational) for the bounds reasoning. | |
| 14 | + | |
| 15 | +import { measureWorkingSetForRepo } from "../src/c14_working_set_walker.ts"; | |
| 16 | +import type { PolyglotLanguage } from "../src/b32_working_set_polyglot.ts"; | |
| 17 | + | |
| 18 | +const args = process.argv.slice(2); | |
| 19 | + | |
| 20 | +const usage = (): never => { | |
| 21 | + console.error( | |
| 22 | + "Usage: bun scripts/measure-working-set.ts <repo-path> --lang go|rust [--verbose]", | |
| 23 | + ); | |
| 24 | + process.exit(2); | |
| 25 | +}; | |
| 26 | + | |
| 27 | +if (args.length < 3) usage(); | |
| 28 | + | |
| 29 | +const repoPath = args[0]!; | |
| 30 | +let lang: PolyglotLanguage | null = null; | |
| 31 | +let verbose = false; | |
| 32 | + | |
| 33 | +for (let i = 1; i < args.length; i++) { | |
| 34 | + const a = args[i]; | |
| 35 | + if (a === "--lang") { | |
| 36 | + const v = args[++i]; | |
| 37 | + if (v !== "go" && v !== "rust") { | |
| 38 | + console.error(`--lang must be "go" or "rust", got: ${v}`); | |
| 39 | + process.exit(2); | |
| 40 | + } | |
| 41 | + lang = v; | |
| 42 | + } else if (a === "--verbose") { | |
| 43 | + verbose = true; | |
| 44 | + } else { | |
| 45 | + console.error(`unknown argument: ${a}`); | |
| 46 | + usage(); | |
| 47 | + } | |
| 48 | +} | |
| 49 | + | |
| 50 | +if (lang === null) usage(); | |
| 51 | + | |
| 52 | +const result = measureWorkingSetForRepo(repoPath, lang!); | |
| 53 | + | |
| 54 | +const output: Record<string, unknown> = { | |
| 55 | + language: result.language, | |
| 56 | + repoPath, | |
| 57 | + minLoc: result.minLoc, | |
| 58 | + maxLoc: result.maxLoc, | |
| 59 | + total: result.total, | |
| 60 | + included: result.included, | |
| 61 | + ratio: result.ratio, | |
| 62 | + ratioPercent: Number((result.ratio * 100).toFixed(2)), | |
| 63 | +}; | |
| 64 | + | |
| 65 | +if (verbose) { | |
| 66 | + output.files = result.files.map((f) => ({ | |
| 67 | + path: f.path, | |
| 68 | + locCount: f.locCount, | |
| 69 | + inBand: | |
| 70 | + f.locCount >= result.minLoc && | |
| 71 | + f.locCount <= result.maxLoc && | |
| 72 | + !(lang === "go" && f.path.endsWith("_test.go")), | |
| 73 | + })); | |
| 74 | +} | |
| 75 | + | |
| 76 | +console.log(JSON.stringify(output, null, 2)); | |
src/b32_working_set_polyglot.test.ts
+164
−0
| @@ -0,0 +1,164 @@ | ||
| 1 | +import { describe, expect, test } from "bun:test"; | |
| 2 | +import { | |
| 3 | + WORKING_SET_MAX_LOC, | |
| 4 | + WORKING_SET_MIN_LOC, | |
| 5 | +} from "./a31_sama_v2.ts"; | |
| 6 | +import { | |
| 7 | + computeWorkingSetFitPolyglot, | |
| 8 | + type PolyglotLanguage, | |
| 9 | + type WorkingSetFile, | |
| 10 | +} from "./b32_working_set_polyglot.ts"; | |
| 11 | + | |
| 12 | +// Mirror the inclusive-bound assertions in b32_sama_v2_metrics.test.ts. | |
| 13 | +// Same algorithm, same constants, same edge behaviour — the polyglot | |
| 14 | +// helper is allowed to compute a different SET of files (Go/Rust source | |
| 15 | +// trees rather than src/*.ts), but the RATIO formula must match the | |
| 16 | +// TS metric byte-for-byte. These tests pin that. | |
| 17 | + | |
| 18 | +const file = (path: string, locCount: number): WorkingSetFile => ({ path, locCount }); | |
| 19 | + | |
| 20 | +describe("computeWorkingSetFitPolyglot — empty input", () => { | |
| 21 | + test("empty list → 1.0 vacuous (matches the TS metric on an empty file map)", () => { | |
| 22 | + const r = computeWorkingSetFitPolyglot([], "go"); | |
| 23 | + expect(r.ratio).toBe(1.0); | |
| 24 | + expect(r.included).toBe(0); | |
| 25 | + expect(r.total).toBe(0); | |
| 26 | + }); | |
| 27 | + | |
| 28 | + test("empty list also vacuous under Rust", () => { | |
| 29 | + const r = computeWorkingSetFitPolyglot([], "rust"); | |
| 30 | + expect(r.ratio).toBe(1.0); | |
| 31 | + }); | |
| 32 | +}); | |
| 33 | + | |
| 34 | +describe("computeWorkingSetFitPolyglot — single-file extremes", () => { | |
| 35 | + test("a single 100-line Go file → 1.0", () => { | |
| 36 | + const r = computeWorkingSetFitPolyglot([file("pkg/x.go", 100)], "go"); | |
| 37 | + expect(r.ratio).toBe(1.0); | |
| 38 | + expect(r.included).toBe(1); | |
| 39 | + expect(r.total).toBe(1); | |
| 40 | + }); | |
| 41 | + | |
| 42 | + test("a single 10-line file falls below the min → 0.0", () => { | |
| 43 | + const r = computeWorkingSetFitPolyglot([file("pkg/x.go", 10)], "go"); | |
| 44 | + expect(r.ratio).toBe(0.0); | |
| 45 | + expect(r.included).toBe(0); | |
| 46 | + expect(r.total).toBe(1); | |
| 47 | + }); | |
| 48 | + | |
| 49 | + test("a single 600-line file exceeds the max → 0.0", () => { | |
| 50 | + const r = computeWorkingSetFitPolyglot([file("pkg/x.go", 600)], "go"); | |
| 51 | + expect(r.ratio).toBe(0.0); | |
| 52 | + expect(r.included).toBe(0); | |
| 53 | + expect(r.total).toBe(1); | |
| 54 | + }); | |
| 55 | +}); | |
| 56 | + | |
| 57 | +describe("computeWorkingSetFitPolyglot — bound-edge inclusivity", () => { | |
| 58 | + // The TS metric uses `lines >= MIN && lines <= MAX`. These tests | |
| 59 | + // mirror b32_sama_v2_metrics.test.ts's "exact bounds are inclusive". | |
| 60 | + test("LOC = 49 → out of band", () => { | |
| 61 | + const r = computeWorkingSetFitPolyglot([file("pkg/x.go", WORKING_SET_MIN_LOC - 1)], "go"); | |
| 62 | + expect(r.included).toBe(0); | |
| 63 | + }); | |
| 64 | + | |
| 65 | + test("LOC = 50 → in band", () => { | |
| 66 | + const r = computeWorkingSetFitPolyglot([file("pkg/x.go", WORKING_SET_MIN_LOC)], "go"); | |
| 67 | + expect(r.included).toBe(1); | |
| 68 | + }); | |
| 69 | + | |
| 70 | + test("LOC = 500 → in band", () => { | |
| 71 | + const r = computeWorkingSetFitPolyglot([file("pkg/x.go", WORKING_SET_MAX_LOC)], "go"); | |
| 72 | + expect(r.included).toBe(1); | |
| 73 | + }); | |
| 74 | + | |
| 75 | + test("LOC = 501 → out of band", () => { | |
| 76 | + const r = computeWorkingSetFitPolyglot([file("pkg/x.go", WORKING_SET_MAX_LOC + 1)], "go"); | |
| 77 | + expect(r.included).toBe(0); | |
| 78 | + }); | |
| 79 | +}); | |
| 80 | + | |
| 81 | +describe("computeWorkingSetFitPolyglot — mixed inputs", () => { | |
| 82 | + test("half in / half out → 0.5", () => { | |
| 83 | + const r = computeWorkingSetFitPolyglot([ | |
| 84 | + file("pkg/a.go", 100), | |
| 85 | + file("pkg/b.go", 10), | |
| 86 | + ], "go"); | |
| 87 | + expect(r.ratio).toBe(0.5); | |
| 88 | + }); | |
| 89 | + | |
| 90 | + test("two in / two out → 0.5", () => { | |
| 91 | + const r = computeWorkingSetFitPolyglot([ | |
| 92 | + file("pkg/a.go", 100), | |
| 93 | + file("pkg/b.go", 300), | |
| 94 | + file("pkg/c.go", 10), | |
| 95 | + file("pkg/d.go", 800), | |
| 96 | + ], "go"); | |
| 97 | + expect(r.ratio).toBe(0.5); | |
| 98 | + }); | |
| 99 | +}); | |
| 100 | + | |
| 101 | +describe("computeWorkingSetFitPolyglot — Go test-file exclusion", () => { | |
| 102 | + test("*_test.go files do NOT count toward total or included", () => { | |
| 103 | + const r = computeWorkingSetFitPolyglot([ | |
| 104 | + file("pkg/x.go", 100), | |
| 105 | + file("pkg/x_test.go", 200), | |
| 106 | + file("pkg/y_test.go", 50), | |
| 107 | + ], "go"); | |
| 108 | + // Only x.go counts; both _test.go files dropped before tallying. | |
| 109 | + expect(r.total).toBe(1); | |
| 110 | + expect(r.included).toBe(1); | |
| 111 | + expect(r.ratio).toBe(1.0); | |
| 112 | + }); | |
| 113 | + | |
| 114 | + test("a 100-line source + a 1-line _test.go sibling → 1.0 (mirrors the TS metric)", () => { | |
| 115 | + const r = computeWorkingSetFitPolyglot([ | |
| 116 | + file("pkg/x.go", 100), | |
| 117 | + file("pkg/x_test.go", 1), | |
| 118 | + ], "go"); | |
| 119 | + expect(r.ratio).toBe(1.0); | |
| 120 | + }); | |
| 121 | +}); | |
| 122 | + | |
| 123 | +describe("computeWorkingSetFitPolyglot — Rust inline-tests asymmetry", () => { | |
| 124 | + test("Rust includes ALL .rs files (no path-based test exclusion)", () => { | |
| 125 | + // Rust convention: tests live inside source files under | |
| 126 | + // #[cfg(test)] mod tests. The polyglot helper preserves that — | |
| 127 | + // it does NOT exclude any .rs path. The asymmetry is documented | |
| 128 | + // in the b32_working_set_polyglot.ts source comment. | |
| 129 | + const r = computeWorkingSetFitPolyglot([ | |
| 130 | + file("src/lib.rs", 100), | |
| 131 | + file("src/tests.rs", 100), | |
| 132 | + file("src/something_test.rs", 100), | |
| 133 | + ], "rust"); | |
| 134 | + expect(r.total).toBe(3); | |
| 135 | + expect(r.included).toBe(3); | |
| 136 | + expect(r.ratio).toBe(1.0); | |
| 137 | + }); | |
| 138 | +}); | |
| 139 | + | |
| 140 | +describe("computeWorkingSetFitPolyglot — reproducibility", () => { | |
| 141 | + test("same input → identical output across runs (deep-equal)", () => { | |
| 142 | + const input: WorkingSetFile[] = [ | |
| 143 | + file("a.go", 100), | |
| 144 | + file("b.go", 60), | |
| 145 | + file("c.go", 480), | |
| 146 | + file("d.go", 20), | |
| 147 | + file("e_test.go", 999), | |
| 148 | + ]; | |
| 149 | + const langs: PolyglotLanguage[] = ["go", "rust"]; | |
| 150 | + for (const l of langs) { | |
| 151 | + const a = computeWorkingSetFitPolyglot(input, l); | |
| 152 | + const b = computeWorkingSetFitPolyglot(input, l); | |
| 153 | + expect(a).toEqual(b); | |
| 154 | + } | |
| 155 | + }); | |
| 156 | +}); | |
| 157 | + | |
| 158 | +describe("computeWorkingSetFitPolyglot — bounds echo", () => { | |
| 159 | + test("result echoes minLoc / maxLoc from a31_sama_v2.ts (auditable)", () => { | |
| 160 | + const r = computeWorkingSetFitPolyglot([], "go"); | |
| 161 | + expect(r.minLoc).toBe(WORKING_SET_MIN_LOC); | |
| 162 | + expect(r.maxLoc).toBe(WORKING_SET_MAX_LOC); | |
| 163 | + }); | |
| 164 | +}); | |
src/b32_working_set_polyglot.ts
+82
−0
| @@ -0,0 +1,82 @@ | ||
| 1 | +// b32 — logic: §5 workingSetFit metric for polyglot source trees | |
| 2 | +// (Go, Rust). Pure function, no I/O. Mirrors the formula in | |
| 3 | +// src/b32_sama_v2_metrics.ts byte-for-byte: | |
| 4 | +// | |
| 5 | +// workingSetFit = files-in-band ÷ total-source-files | |
| 6 | +// | |
| 7 | +// where in-band means WORKING_SET_MIN_LOC ≤ LOC ≤ WORKING_SET_MAX_LOC, | |
| 8 | +// inclusive on both ends. Bounds are imported from a31_sama_v2.ts so | |
| 9 | +// the cross-language number is computed against the same band as this | |
| 10 | +// site's own metric — the single-source-of-truth determinism property | |
| 11 | +// from /sama/v2 §0. | |
| 12 | +// | |
| 13 | +// Used by scripts/measure-working-set.ts (the polyglot CLI) and the | |
| 14 | +// c14_working_set_walker.ts adapter, which feed it a pre-counted file | |
| 15 | +// summary so this module stays pure and unit-testable. | |
| 16 | + | |
| 17 | +import { | |
| 18 | + WORKING_SET_MAX_LOC, | |
| 19 | + WORKING_SET_MIN_LOC, | |
| 20 | +} from "./a31_sama_v2.ts"; | |
| 21 | + | |
| 22 | +// Language tag governs the test-file exclusion rule below. | |
| 23 | +export type PolyglotLanguage = "go" | "rust"; | |
| 24 | + | |
| 25 | +export interface WorkingSetFile { | |
| 26 | + // Repo-relative path (e.g. "crates/printer/src/standard.rs"). | |
| 27 | + path: string; | |
| 28 | + // File length in lines, matching the TS metric's `content.split("\n").length`. | |
| 29 | + locCount: number; | |
| 30 | +} | |
| 31 | + | |
| 32 | +export interface WorkingSetResult { | |
| 33 | + language: PolyglotLanguage; | |
| 34 | + included: number; // files inside [MIN, MAX] LOC, inclusive | |
| 35 | + total: number; // total source files (after test-file exclusion) | |
| 36 | + ratio: number; // included / total; empty-input → 1.0 vacuous | |
| 37 | + minLoc: number; // echoed back from a31 so callers can audit | |
| 38 | + maxLoc: number; | |
| 39 | +} | |
| 40 | + | |
| 41 | +// Test-file exclusion. The asymmetry is honest, not arbitrary: | |
| 42 | +// | |
| 43 | +// Go: tests live in `*_test.go` files. The TS metric excludes | |
| 44 | +// `*.test.ts` for the same structural reason — they aren't | |
| 45 | +// working modules in their own right, they verify one. | |
| 46 | +// | |
| 47 | +// Rust: tests live INSIDE source files under `#[cfg(test)] mod tests`. | |
| 48 | +// Excluding files at file-granularity would either lose every | |
| 49 | +// tested file or accidentally include all of them. The | |
| 50 | +// inline-tests dialect drafted at /sama/v2#62-inline-tests-dialect | |
| 51 | +// is what makes this asymmetry coherent: where the test attaches | |
| 52 | +// is a language-level choice; the working-set property the metric | |
| 53 | +// measures is unaffected. | |
| 54 | +const isPolyglotTestFile = (path: string, lang: PolyglotLanguage): boolean => { | |
| 55 | + if (lang === "go") return path.endsWith("_test.go"); | |
| 56 | + return false; | |
| 57 | +}; | |
| 58 | + | |
| 59 | +export const computeWorkingSetFitPolyglot = ( | |
| 60 | + files: ReadonlyArray<WorkingSetFile>, | |
| 61 | + lang: PolyglotLanguage, | |
| 62 | +): WorkingSetResult => { | |
| 63 | + let included = 0; | |
| 64 | + let total = 0; | |
| 65 | + for (const f of files) { | |
| 66 | + if (isPolyglotTestFile(f.path, lang)) continue; | |
| 67 | + total++; | |
| 68 | + if (f.locCount >= WORKING_SET_MIN_LOC && f.locCount <= WORKING_SET_MAX_LOC) { | |
| 69 | + included++; | |
| 70 | + } | |
| 71 | + } | |
| 72 | + // Match the TS metric: empty input → 1.0 (vacuously satisfied). | |
| 73 | + const ratio = total === 0 ? 1.0 : included / total; | |
| 74 | + return { | |
| 75 | + language: lang, | |
| 76 | + included, | |
| 77 | + total, | |
| 78 | + ratio, | |
| 79 | + minLoc: WORKING_SET_MIN_LOC, | |
| 80 | + maxLoc: WORKING_SET_MAX_LOC, | |
| 81 | + }; | |
| 82 | +}; | |
src/c14_working_set_walker.test.ts
+117
−0
| @@ -0,0 +1,117 @@ | ||
| 1 | +import { afterAll, beforeAll, describe, expect, test } from "bun:test"; | |
| 2 | +import { mkdirSync, mkdtempSync, rmSync, writeFileSync } from "node:fs"; | |
| 3 | +import { tmpdir } from "node:os"; | |
| 4 | +import { resolve } from "node:path"; | |
| 5 | +import { | |
| 6 | + collectPolyglotFiles, | |
| 7 | + measureWorkingSetForRepo, | |
| 8 | +} from "./c14_working_set_walker.ts"; | |
| 9 | + | |
| 10 | +// Hermetic fixture: build a tiny fake repo in a tmpdir, walk it, | |
| 11 | +// assert what comes back. The CLI script's real-world use against | |
| 12 | +// /tmp/dive and /tmp/ripgrep is exercised via the measurement step | |
| 13 | +// in this PR, not via unit tests; this file pins the algorithm. | |
| 14 | + | |
| 15 | +const FIXTURE_ROOT = mkdtempSync(resolve(tmpdir(), "tdd-md-wswalker-")); | |
| 16 | + | |
| 17 | +const writeFile = (relPath: string, lineCount: number): void => { | |
| 18 | + const abs = resolve(FIXTURE_ROOT, relPath); | |
| 19 | + mkdirSync(resolve(abs, ".."), { recursive: true }); | |
| 20 | + const lines = Array.from({ length: lineCount }, (_, i) => `// line ${i}`); | |
| 21 | + writeFileSync(abs, lines.join("\n")); | |
| 22 | +}; | |
| 23 | + | |
| 24 | +beforeAll(() => { | |
| 25 | + // Top-level Go sources (one in-band, one out-of-band, one test file). | |
| 26 | + writeFile("a.go", 100); // in band | |
| 27 | + writeFile("b.go", 600); // out (over) | |
| 28 | + writeFile("c_test.go", 200); // excluded for Go | |
| 29 | + // Nested. | |
| 30 | + writeFile("pkg/inner.go", 60); // in band, inside subdir | |
| 31 | + writeFile("pkg/tiny.go", 10); // out (under) | |
| 32 | + // Rust sources (separate sub-tree). | |
| 33 | + writeFile("rs/src/lib.rs", 120); // in band | |
| 34 | + writeFile("rs/src/big.rs", 700); // out (over) | |
| 35 | + writeFile("rs/src/tests.rs", 75); // included (Rust has no path test rule) | |
| 36 | + // Skip directories that should NOT be walked. | |
| 37 | + writeFile(".git/HEAD.go", 100); // .git is skipped | |
| 38 | + writeFile("target/build.rs", 100); // target/ is skipped | |
| 39 | + writeFile("vendor/pkg.go", 100); // vendor/ is skipped | |
| 40 | + writeFile("node_modules/dep.go", 100); // node_modules/ skipped | |
| 41 | +}); | |
| 42 | + | |
| 43 | +afterAll(() => { | |
| 44 | + rmSync(FIXTURE_ROOT, { recursive: true, force: true }); | |
| 45 | +}); | |
| 46 | + | |
| 47 | +describe("collectPolyglotFiles — Go", () => { | |
| 48 | + test("walks recursively and finds the right .go files", () => { | |
| 49 | + const files = collectPolyglotFiles(FIXTURE_ROOT, "go"); | |
| 50 | + const paths = files.map((f) => f.path); | |
| 51 | + // Excluded: .git/*, target/*, vendor/*, node_modules/*. | |
| 52 | + // Included: a.go, b.go, c_test.go (the helper RETURNS it; the | |
| 53 | + // metric helper drops it during the count — separation of concerns). | |
| 54 | + expect(paths).toContain("a.go"); | |
| 55 | + expect(paths).toContain("b.go"); | |
| 56 | + expect(paths).toContain("c_test.go"); | |
| 57 | + expect(paths).toContain("pkg/inner.go"); | |
| 58 | + expect(paths).toContain("pkg/tiny.go"); | |
| 59 | + expect(paths).not.toContain(".git/HEAD.go"); | |
| 60 | + expect(paths).not.toContain("vendor/pkg.go"); | |
| 61 | + expect(paths).not.toContain("node_modules/dep.go"); | |
| 62 | + }); | |
| 63 | + | |
| 64 | + test("LOC counts match content.split('\\n').length", () => { | |
| 65 | + const files = collectPolyglotFiles(FIXTURE_ROOT, "go"); | |
| 66 | + const a = files.find((f) => f.path === "a.go"); | |
| 67 | + // We wrote 100 lines joined by "\n" → split("\n").length === 100. | |
| 68 | + expect(a?.locCount).toBe(100); | |
| 69 | + }); | |
| 70 | + | |
| 71 | + test("returns files in deterministic sorted order", () => { | |
| 72 | + const a = collectPolyglotFiles(FIXTURE_ROOT, "go").map((f) => f.path); | |
| 73 | + const b = collectPolyglotFiles(FIXTURE_ROOT, "go").map((f) => f.path); | |
| 74 | + expect(a).toEqual(b); | |
| 75 | + const sorted = [...a].sort((x, y) => x.localeCompare(y)); | |
| 76 | + expect(a).toEqual(sorted); | |
| 77 | + }); | |
| 78 | +}); | |
| 79 | + | |
| 80 | +describe("collectPolyglotFiles — Rust", () => { | |
| 81 | + test("finds only .rs files; ignores .go", () => { | |
| 82 | + const files = collectPolyglotFiles(FIXTURE_ROOT, "rust"); | |
| 83 | + const paths = files.map((f) => f.path); | |
| 84 | + expect(paths).toContain("rs/src/lib.rs"); | |
| 85 | + expect(paths).toContain("rs/src/big.rs"); | |
| 86 | + expect(paths).toContain("rs/src/tests.rs"); | |
| 87 | + expect(paths.every((p) => p.endsWith(".rs"))).toBe(true); | |
| 88 | + }); | |
| 89 | + | |
| 90 | + test("target/build.rs is excluded (skipped dir)", () => { | |
| 91 | + const files = collectPolyglotFiles(FIXTURE_ROOT, "rust"); | |
| 92 | + const paths = files.map((f) => f.path); | |
| 93 | + expect(paths).not.toContain("target/build.rs"); | |
| 94 | + }); | |
| 95 | +}); | |
| 96 | + | |
| 97 | +describe("measureWorkingSetForRepo — end-to-end", () => { | |
| 98 | + test("Go fixture: 2 in band (a.go=100, pkg/inner.go=60) of 4 source files (excluding c_test.go) = 0.5", () => { | |
| 99 | + const r = measureWorkingSetForRepo(FIXTURE_ROOT, "go"); | |
| 100 | + expect(r.total).toBe(4); // a, b, pkg/inner, pkg/tiny (c_test excluded) | |
| 101 | + expect(r.included).toBe(2); // a, pkg/inner | |
| 102 | + expect(r.ratio).toBe(0.5); | |
| 103 | + }); | |
| 104 | + | |
| 105 | + test("Rust fixture: 2 in band (lib.rs=120, tests.rs=75) of 3 .rs files = 2/3", () => { | |
| 106 | + const r = measureWorkingSetForRepo(FIXTURE_ROOT, "rust"); | |
| 107 | + expect(r.total).toBe(3); | |
| 108 | + expect(r.included).toBe(2); | |
| 109 | + expect(r.ratio).toBeCloseTo(2 / 3, 6); | |
| 110 | + }); | |
| 111 | + | |
| 112 | + test("echoes the bounds back so callers can audit which numbers produced the ratio", () => { | |
| 113 | + const r = measureWorkingSetForRepo(FIXTURE_ROOT, "go"); | |
| 114 | + expect(r.minLoc).toBe(50); | |
| 115 | + expect(r.maxLoc).toBe(500); | |
| 116 | + }); | |
| 117 | +}); | |
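
The LOC test above relies on a property of the `content.split("\n").length` rule that is easy to miss: the count is the number of newline characters plus one, so a file that ends with a trailing newline reports one more line than `wc -l` would. The fixture writer joins lines without a trailing newline, which is why 100 written lines measure as exactly 100. A minimal sketch of the rule follows; `countLoc` is a hypothetical name used only for illustration, not a function from this change.

```typescript
// Hypothetical helper: mirrors the `content.split("\n").length` rule
// that the walker and the TS metric are described as sharing.
const countLoc = (content: string): number => content.split("\n").length;

const noTrailing = ["a", "b", "c"].join("\n"); // "a\nb\nc"
const withTrailing = `${noTrailing}\n`; // "a\nb\nc\n"

console.log(countLoc(noTrailing)); // 3
console.log(countLoc(withTrailing)); // 4: the final "\n" yields one extra empty segment
console.log(countLoc("")); // 1: an empty file still counts as one line
```

One consequence for the band edges: under this rule a file with 500 newline-terminated lines measures as 501 and falls just outside [50, 500], so numbers from this metric are not directly comparable with `wc -l` totals.
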
src/c14_working_set_walker.ts
+105
−0
| @@ -0,0 +1,105 @@ | ||
| 1 | +// c14 — adapter: filesystem walker that produces a polyglot | |
| 2 | +// WorkingSetFile summary for an external source tree (Go or Rust). | |
| 3 | +// Recursive directory walk; counts lines of each .go / .rs file using | |
| 4 | +// the same `content.split("\n").length` rule as b32_sama_v2_metrics so | |
| 5 | +// the cross-language metric matches the TS metric byte-for-byte. | |
| 6 | +// | |
| 7 | + // Skipped directories are the conventional non-source trees that | |
| 8 | + // would otherwise inflate the denominator with vendored / generated | |
| 9 | + // / build artefacts: target/ (Rust build output), vendor/ (Go deps), | |
| 10 | + // node_modules/ (defensive), and every dot-directory such as .git. | |
| 11 | +// | |
| 12 | +// The walker is hermetic — given a path that is a directory it | |
| 13 | +// resolves the file set deterministically. Calls into the pure helper | |
| 14 | +// in b32_working_set_polyglot.ts for the ratio. | |
| 15 | + | |
| 16 | +import { readdirSync, readFileSync, statSync } from "node:fs"; | |
| 17 | +import { resolve } from "node:path"; | |
| 18 | +import { | |
| 19 | + computeWorkingSetFitPolyglot, | |
| 20 | + type PolyglotLanguage, | |
| 21 | + type WorkingSetFile, | |
| 22 | + type WorkingSetResult, | |
| 23 | +} from "./b32_working_set_polyglot.ts"; | |
| 24 | + | |
| 25 | +const SKIPPED_DIRS: ReadonlySet<string> = new Set([ | |
| 26 | + ".git", | |
| 27 | + "target", | |
| 28 | + "vendor", | |
| 29 | + "node_modules", | |
| 30 | +]); | |
| 31 | + | |
| 32 | +const EXTENSION_FOR: Record<PolyglotLanguage, string> = { | |
| 33 | + go: ".go", | |
| 34 | + rust: ".rs", | |
| 35 | +}; | |
| 36 | + | |
| 37 | +// Walk a directory and return every {path, locCount} pair for files | |
| 38 | +// whose extension matches the target language. Paths are returned | |
| 39 | +// repo-relative (i.e. relative to the `repoRoot` passed in) so they're | |
| 40 | +// stable across machines. | |
| 41 | +export const collectPolyglotFiles = ( | |
| 42 | + repoRoot: string, | |
| 43 | + lang: PolyglotLanguage, | |
| 44 | +): WorkingSetFile[] => { | |
| 45 | + const ext = EXTENSION_FOR[lang]; | |
| 46 | + const out: WorkingSetFile[] = []; | |
| 47 | + | |
| 48 | + const walk = (absDir: string, relDir: string): void => { | |
| 49 | + let entries: ReturnType<typeof readdirSync>; | |
| 50 | + try { | |
| 51 | + entries = readdirSync(absDir, { withFileTypes: true }); | |
| 52 | + } catch { | |
| 53 | + // Permission errors / non-existent subdirectory: do not surface | |
| 54 | + // to the caller. Letting one bad subtree halt the whole | |
| 55 | + // measurement would be worse than reporting the partial set, so | |
| 56 | + // return silently here; the CLI's smoke checks at the top level | |
| 57 | + // will catch a totally unreadable root. | |
| 58 | + return; | |
| 59 | + } | |
| 60 | + for (const e of entries) { | |
| 61 | + if (e.name.startsWith(".")) { | |
| 62 | + // .git, .github, .vscode, ...: defensive skip on every hidden | |
| 63 | + // directory. Dotfiles are NOT skipped here: they fall through to | |
| 64 | + // the extension check below, so a file such as `.x.go` would | |
| 65 | + // still be collected. | |
| 66 | + if (e.isDirectory()) continue; // skip all hidden dirs | |
| 67 | + } | |
| 68 | + if (e.isDirectory()) { | |
| 69 | + if (SKIPPED_DIRS.has(e.name)) continue; | |
| 70 | + const sub = resolve(absDir, e.name); | |
| 71 | + const subRel = relDir === "" ? e.name : `${relDir}/${e.name}`; | |
| 72 | + walk(sub, subRel); | |
| 73 | + continue; | |
| 74 | + } | |
| 75 | + if (!e.isFile()) continue; | |
| 76 | + if (!e.name.endsWith(ext)) continue; | |
| 77 | + const abs = resolve(absDir, e.name); | |
| 78 | + const relPath = relDir === "" ? e.name : `${relDir}/${e.name}`; | |
| 79 | + const content = readFileSync(abs, "utf8"); | |
| 80 | + // Match b32_sama_v2_metrics.ts: lines = content.split("\n").length. | |
| 81 | + const locCount = content.split("\n").length; | |
| 82 | + out.push({ path: relPath, locCount }); | |
| 83 | + } | |
| 84 | + }; | |
| 85 | + | |
| 86 | + const root = resolve(repoRoot); | |
| 87 | + const rootStat = statSync(root); | |
| 88 | + if (!rootStat.isDirectory()) { | |
| 89 | + throw new Error(`expected a directory, got: ${repoRoot}`); | |
| 90 | + } | |
| 91 | + walk(root, ""); | |
| 92 | + // Sort for deterministic output (readdirSync order is platform-dependent). | |
| 93 | + out.sort((a, b) => a.path.localeCompare(b.path)); | |
| 94 | + return out; | |
| 95 | +}; | |
| 96 | + | |
| 97 | +// Convenience: walk + compute in one call. Used by the CLI script. | |
| 98 | +export const measureWorkingSetForRepo = ( | |
| 99 | + repoRoot: string, | |
| 100 | + lang: PolyglotLanguage, | |
| 101 | +): WorkingSetResult & { files: WorkingSetFile[] } => { | |
| 102 | + const files = collectPolyglotFiles(repoRoot, lang); | |
| 103 | + const result = computeWorkingSetFitPolyglot(files, lang); | |
| 104 | + return { ...result, files }; | |
| 105 | +}; | |
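
The walker delegates the ratio to `computeWorkingSetFitPolyglot`, whose body is not part of this file. As a rough illustration of what the tests imply it does, here is a sketch reconstructed only from the described behavior: inclusive [50, 500] bounds, Go `*_test.go` files excluded from both numerator and denominator, and an empty input yielding 1.0. Every name below (`sketchWorkingSetFit`, `FileLoc`, `FitResult`, `isSource`) is hypothetical, and the real helper may differ in detail.

```typescript
// Sketch only: the real helper lives in b32_working_set_polyglot.ts and is
// not shown here. All names in this block are assumptions for illustration.
type Lang = "go" | "rust";

interface FileLoc {
  path: string;
  locCount: number;
}

interface FitResult {
  total: number;
  included: number;
  ratio: number;
  minLoc: number;
  maxLoc: number;
}

const MIN_LOC = 50; // bounds the tests expect to be echoed back
const MAX_LOC = 500;

// Go keeps tests in separate *_test.go files; Rust tests are inline,
// so no path-based exclusion is applied for Rust.
const isSource = (f: FileLoc, lang: Lang): boolean =>
  lang === "go" ? !f.path.endsWith("_test.go") : true;

const sketchWorkingSetFit = (files: FileLoc[], lang: Lang): FitResult => {
  const sources = files.filter((f) => isSource(f, lang));
  const included = sources.filter(
    (f) => f.locCount >= MIN_LOC && f.locCount <= MAX_LOC, // inclusive bounds
  ).length;
  const total = sources.length;
  // Empty input is treated as vacuously fitting.
  const ratio = total === 0 ? 1 : included / total;
  return { total, included, ratio, minLoc: MIN_LOC, maxLoc: MAX_LOC };
};

// Same shape as the Go fixture used by the end-to-end test.
const goFixture: FileLoc[] = [
  { path: "a.go", locCount: 100 },
  { path: "b.go", locCount: 600 },
  { path: "c_test.go", locCount: 200 },
  { path: "pkg/inner.go", locCount: 60 },
  { path: "pkg/tiny.go", locCount: 10 },
];

const goResult = sketchWorkingSetFit(goFixture, "go");
console.log(goResult.total, goResult.included, goResult.ratio); // 4 2 0.5
```

Keeping the exclusion rule in the pure helper rather than in the walker is what lets the walker test assert that `c_test.go` is returned while the end-to-end test still counts only four Go sources.
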