Pointing SAMA v2 at dive: Go's conventions cover more than you'd think

The WordPress plugin audit earlier today scored a real-world plugin at 0 of 7 §4 checks. That post argued, and I still believe, that the score isn't a failure; it's the expected baseline for code written under WordPress idioms with no external discipline. WP itself actively pushes devs toward hook-and-filter god-classes.

But "WordPress is messy" isn't an interesting finding on its own. The harder question is what v2 sees when pointed at a language that has stronger architectural defaults built in. So: same exercise, same methodology, but against wagoodman/dive, a 53k-star Go project that explores Docker image layers. That's 8,498 lines of Go across 92 source files, taken straight from a git clone and walked carefully.

The result is much more interesting than 0/7.

#What's in the box

dive/
├── cmd/dive/                                                            # CLI entry tree
│   ├── main.go                                                          # 12 lines: calls cli.Run()
│   └── cli/
│       ├── cli.go                                                       # 130 lines: root command setup
│       └── internal/
│           ├── options/      # 10 files, ~150 LOC each — YAML config types
│           ├── command/      # cobra command files (root, build, ci, export, adapter/)
│           └── ui/v1/        # the TUI: app/, view/, viewmodel/, layout/, key/
├── dive/                                                                # Domain tree
│   ├── filetree/             # 14 files: pure tree-diff logic. file_tree.go (390 LOC),
│   │                         #           file_node.go (354 LOC). Has 4 test files.
│   └── image/                # image archive parsing + Docker/Podman resolvers
│       ├── docker/           # 13 files: archive parsing, daemon API, CLI shelling
│       └── podman/           # 4 files: podman daemon + CLI
├── internal/                                                            # Shared helpers
│   ├── utils/   ── 2 files
│   ├── log/     ── 1 file
│   └── bus/     ── 2 files + event/payload/
└── (Dockerfile, Makefile, go.mod, etc.)

Some numbers worth flagging next to the earlier WordPress plugin:

| metric | dive (Go) | WP plugin (PHP) |
|---|---|---|
| Source files | 92 | 17 (non-vendor) |
| Total LOC | 8,498 | 6,445 |
| Largest single file | 496 LOC | 1,554 LOC |
| Files over 700-LOC cap | 0 | 3 |
| Test files | 18 | 0 |
| Test LOC | 3,017 | 0 |
| Top-level layers as directories | cmd/, internal/, dive/ | none |

The two codebases are roughly the same size. They look completely different.

#What dive already gets for free

Go's conventional project layout and toolchain enforce several things that SAMA v2 has to write down as rules:

  • cmd/dive/main.go is unambiguously the entry point. Nothing imports back into it, because Go does not allow a main package to be imported. Whatever lives in cmd/'s tree depends on dive/ and internal/, not the other way around. That's the §1.2 Law direction, enforced in large part by the Go toolchain: a package under an internal/ directory can be imported only by code rooted at that directory's parent, and import cycles do not compile. The Law check has little left to do because the toolchain already rejects most violations.
  • Package-per-concern is a hard convention in Go. dive/filetree/ is the tree-diff package. dive/image/docker/ is the Docker resolver. dive/image/podman/ is Podman. There's no Webdados_FB-style god-class because the Go community would reject the PR at code review. Architecture as a property is half-enforced by the language ecosystem.
  • All files are under the 700-LOC cap. The largest is cmd/dive/cli/internal/ui/v1/view/filetree.go at 496 lines; the next two are 472 and 390. Atomic check passes by construction.
  • Boundary parsing is mostly localized. json.Unmarshal, yaml.Unmarshal, os.Open, http.Client, exec.Command("docker", ...) — almost every call appears under dive/image/docker/ or dive/image/podman/. There's exactly one smell: cmd/dive/cli/internal/options/ci.go parses the user's .dive-ci.yaml config inside the cmd/ tree.
  • Tests exist. 18 test files, ~20% file ratio. Coverage is patchy (more on this below), but the WP plugin had zero. Modeled-tests goes from "vacuous fail" to "partial pass with named gaps."

That covers maybe 60% of what v2 asks for, without anyone ever having looked at the spec.
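The internal/ rule mentioned above can be sketched in a few lines. This is a simplified illustration, not the toolchain's actual code: it works on module-relative package paths, and the function names `internalParent` and `canImport` are made up for this post.

```go
package main

import (
	"fmt"
	"strings"
)

// internalParent returns the directory that owns the last "internal"
// element of a module-relative package path, and whether one exists.
func internalParent(pkg string) (string, bool) {
	switch {
	case strings.HasSuffix(pkg, "/internal"):
		return strings.TrimSuffix(pkg, "/internal"), true
	case strings.Contains(pkg, "/internal/"):
		return pkg[:strings.LastIndex(pkg, "/internal/")], true
	case pkg == "internal", strings.HasPrefix(pkg, "internal/"):
		return "", true // owned by the module root
	}
	return "", false
}

// canImport reports whether importer may import imported under the
// internal rule. Both arguments are paths relative to the module root.
func canImport(importer, imported string) bool {
	parent, ok := internalParent(imported)
	if !ok || parent == "" {
		// No internal element, or a root-level internal/ that every
		// package in the same module may use.
		return true
	}
	return importer == parent || strings.HasPrefix(importer, parent+"/")
}

func main() {
	fmt.Println(canImport("cmd/dive/cli", "cmd/dive/cli/internal/options"))  // true
	fmt.Println(canImport("dive/filetree", "cmd/dive/cli/internal/options")) // false
	fmt.Println(canImport("dive/image/docker", "internal/utils"))            // true
}
```

The second call is the interesting one: the domain tree cannot reach into the CLI's internals, which is part of why the cmd-to-domain direction holds without anyone policing it.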

#What dive would still fail

Walking the seven §4 checks honestly:

##1 Sorted — would fail

Every file carries a profile-recognised prefix; lexicographic prefix order equals layer order.

This is the check Go projects cannot pass without changing language idioms. Go organizes by directory, not by filename prefix. Filenames inside dive/filetree/ are comparer.go, diff.go, efficiency.go, file_info.go, file_node.go, file_tree.go: descriptive, not layer-marking. Their lexicographic order bears no relationship to architectural layer.

This isn't dive doing something wrong. It's the SAMA v2 spec having been written around languages (TypeScript modules, PHP files) where prefix-sortable filenames are idiomatic. Go's idiom is the opposite. To be honestly applicable here, the spec needs a directory-based dialect for Go: "the package directory's lex position equals the layer's order." I'll come back to this in the conclusion. (Update: this dialect has since been drafted into /sama/v2 §6.1 as a v2.1-draft extension.)

##2 Architecture — would partly pass

Every file maps to exactly one canonical layer.

Without a written sama.profile.toml mapping packages to layers, every file is technically "unprefixed." But the natural mapping is obvious enough to write down right now:

sama_version = "2.0"
profile = "dive"
layout = "directory"     # ← hypothetical extension; see conclusion

[layers.0]
packages = ["internal/utils", "internal/log", "internal/bus", "internal/bus/event/payload"]

[layers.1]
sublayers = [
  { name = "core",     packages = ["dive/filetree", "dive/image"] },
  { name = "viewmodel", packages = ["cmd/dive/cli/internal/ui/v1/viewmodel"] },
  { name = "view",      packages = ["cmd/dive/cli/internal/ui/v1/view", "cmd/dive/cli/internal/ui/v1/layout", "cmd/dive/cli/internal/ui/v1/format", "cmd/dive/cli/internal/ui/v1/key"] },
]

[layers.2]
sublayers = [
  { name = "resolver", packages = ["dive/image/docker", "dive/image/podman"] },
  { name = "config",   packages = ["cmd/dive/cli/internal/options"] },
]

[layers.3]
packages = ["cmd/dive", "cmd/dive/cli", "cmd/dive/cli/internal/command", "cmd/dive/cli/internal/command/ci", "cmd/dive/cli/internal/command/export", "cmd/dive/cli/internal/command/adapter", "cmd/dive/cli/internal/ui", "cmd/dive/cli/internal/ui/v1", "cmd/dive/cli/internal/ui/v1/app"]

That maps every package without ambiguity. Under the directory-based dialect, this passes.
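What "every package maps to exactly one layer" means mechanically can be sketched as below. This is a rough illustration under assumptions: the profile is flattened to a layer-number-to-packages map (sublayers ignored), only a subset of the packages above is listed, and `checkArchitecture` and the `example/unlisted` package are hypothetical names, not part of SAMA v2 or dive.

```go
package main

import (
	"fmt"
	"sort"
)

// checkArchitecture reports packages declared in no layer (unmapped) and
// packages declared in more than one layer (ambiguous).
func checkArchitecture(layers map[int][]string, packages []string) (unmapped, ambiguous []string) {
	layersOf := make(map[string]map[int]bool)
	for layer, pkgs := range layers {
		for _, pkg := range pkgs {
			if layersOf[pkg] == nil {
				layersOf[pkg] = make(map[int]bool)
			}
			layersOf[pkg][layer] = true
		}
	}
	for _, pkg := range packages {
		n := len(layersOf[pkg])
		if n == 0 {
			unmapped = append(unmapped, pkg)
		} else if n > 1 {
			ambiguous = append(ambiguous, pkg)
		}
	}
	sort.Strings(unmapped)
	sort.Strings(ambiguous)
	return unmapped, ambiguous
}

func main() {
	// A flattened subset of the proposed profile.
	layers := map[int][]string{
		0: {"internal/utils", "internal/log", "internal/bus"},
		1: {"dive/filetree", "dive/image"},
		2: {"dive/image/docker", "dive/image/podman"},
		3: {"cmd/dive", "cmd/dive/cli"},
	}
	// "example/unlisted" is a made-up package to show a failure.
	packages := []string{"internal/log", "dive/filetree", "dive/image/docker", "cmd/dive", "example/unlisted"}

	unmapped, ambiguous := checkArchitecture(layers, packages)
	fmt.Println("unmapped:", unmapped)   // unmapped: [example/unlisted]
	fmt.Println("ambiguous:", ambiguous) // ambiguous: []
}
```

The check passes when both slices come back empty.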

##3 Modeled (tests) — would fail

18 test files for 92 non-test source files. The packages with tests:

  • dive/filetree/ — 4 tests (efficiency, file_node, file_tree, node_data) ✓
  • cmd/dive/cli/internal/ui/v1/viewmodel/ — has tests ✓
  • cmd/dive/cli/internal/command/ci/evaluator_test.go
  • cmd/dive/cli/cli_test.go + cli_load_test.go ✓ (integration-level)

The packages without sibling tests include nearly all of dive/image/, dive/image/docker/, dive/image/podman/ (the Layer 2 adapters), most of the UI view layer, and nearly all of the cmd/dive/cli/internal/command/ tree (command/ci/ has the one evaluator test listed above). That's ~30 source files in Layer 1 and Layer 2 territory that lack sibling tests. Modeled-tests would fail.
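For readers who want to reproduce the gap list, here is a minimal sketch. It assumes "sibling test" means a `<name>_test.go` file in the same directory, which may be stricter or looser than the spec's wording; `missingSiblingTests` is a hypothetical helper, and the file list in `main` is an illustrative subset rather than a full listing of the repo.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// missingSiblingTests returns the non-test .go files that have no
// "<name>_test.go" file next to them in the given file list.
func missingSiblingTests(files []string) []string {
	present := make(map[string]bool, len(files))
	for _, f := range files {
		present[f] = true
	}
	var missing []string
	for _, f := range files {
		if !strings.HasSuffix(f, ".go") || strings.HasSuffix(f, "_test.go") {
			continue // skip non-Go files and the tests themselves
		}
		if !present[strings.TrimSuffix(f, ".go")+"_test.go"] {
			missing = append(missing, f)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	files := []string{
		"dive/filetree/file_tree.go",
		"dive/filetree/file_tree_test.go",
		"dive/image/docker/image_archive.go",
		"dive/image/docker/engine_resolver.go",
	}
	for _, f := range missingSiblingTests(files) {
		fmt.Println("no sibling test:", f)
	}
}
```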

##4 Modeled (boundary) — would mostly pass

Boundary parsing call sites:

  • json.Unmarshal / json.NewDecoder → 4 files in dive/image/docker/, all Layer 2. ✓
  • yaml.Unmarshal → cmd/dive/cli/internal/options/ci.go (Layer 2 config sublayer in the proposed profile) ✓
  • os.Open / os.ReadFile → dive/filetree/file_info.go (filesystem inspection, Layer 2 territory if you classify it that way) and dive/image/docker/image_archive.go (Layer 2) ✓
  • exec.Command("docker", ...) / exec.Command("podman", ...) → dive/image/docker/cli.go + dive/image/podman/cli.go, both Layer 2 ✓
  • http.Client → in the Docker daemon resolvers, Layer 2 ✓

One borderline case: dive/filetree/file_info.go calls os.Lstat and traverses the filesystem. Under v2's strict reading, anything filesystem-touching is Layer 2 (Adapter). But filetree is otherwise pure tree-diff math. Either split it (filetree/ → Layer 1 + a new dive/filetree_adapter/ → Layer 2), or accept that this one file's Lstat calls are tightly scoped. The check would still pass under the proposed profile because file_info.go is in the L1 filetree package — but it's a soft tension worth naming.

Score: passes with one named tension.
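A grep is enough for a one-off audit, but the same call-site scan can be done on the syntax tree. The sketch below uses the standard go/parser and go/ast packages; the `boundaryCalls` set is taken from the list above, and `findBoundaryCalls` is a hypothetical helper. It matches selectors by their written name only, so it assumes the packages are imported without aliases.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// boundaryCalls lists the pkg.Func selectors treated as boundary parsing.
var boundaryCalls = map[string]bool{
	"json.Unmarshal": true, "json.NewDecoder": true, "yaml.Unmarshal": true,
	"os.Open": true, "os.ReadFile": true, "os.Lstat": true, "exec.Command": true,
}

// findBoundaryCalls parses one Go source file and returns the boundary
// calls it contains, in source order. The match is purely syntactic.
func findBoundaryCalls(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, 0)
	if err != nil {
		return nil, err
	}
	var found []string
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		if pkg, ok := sel.X.(*ast.Ident); ok {
			if name := pkg.Name + "." + sel.Sel.Name; boundaryCalls[name] {
				found = append(found, name)
			}
		}
		return true
	})
	return found, nil
}

func main() {
	// A made-up file, not taken from dive.
	src := `package demo

import (
	"encoding/json"
	"os"
)

func load(path string) (map[string]any, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var out map[string]any
	err = json.Unmarshal(data, &out)
	return out, err
}
`
	calls, err := findBoundaryCalls("demo.go", src)
	if err != nil {
		panic(err)
	}
	fmt.Println(calls) // [os.ReadFile json.Unmarshal]
}
```

Pairing each hit with the file's layer from the profile gives the numerator and denominator for boundaryRatio.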

##5 Atomic — passes outright

All 92 source files under 700 LOC. No barrel files. Done.

##6 Law (§1.2) — would pass

Go's internal/ semantics + the natural cmd→domain→helpers direction mean upward imports are mostly impossible to write without the compiler refusing. The only same-layer-but-reversed-sublayer concern: does dive/image/docker import dive/image/podman or vice versa? A quick grep — neither imports the other. They're siblings under L2 resolver, both downstream of dive/image/image.go (which is L1 core).
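The part the compiler does not cover is layer direction within what it allows. A minimal sketch of that residual check follows, assuming the Law means "a package may import only packages in the same or a lower-numbered layer" and ignoring sublayers. The `edge` type and `lawViolations` are hypothetical names, and the edges in `main` are illustrative: the last one is deliberately invented to show a violation and does not exist in dive.

```go
package main

import "fmt"

// edge is one intra-module import: package from imports package to.
type edge struct{ from, to string }

// lawViolations returns the edges whose importer sits in a lower layer
// than the package it imports. Packages missing from layerOf are skipped.
func lawViolations(layerOf map[string]int, edges []edge) []edge {
	var bad []edge
	for _, e := range edges {
		from, okFrom := layerOf[e.from]
		to, okTo := layerOf[e.to]
		if okFrom && okTo && from < to {
			bad = append(bad, e)
		}
	}
	return bad
}

func main() {
	layerOf := map[string]int{
		"internal/utils":    0,
		"dive/filetree":     1,
		"dive/image":        1,
		"dive/image/docker": 2,
		"cmd/dive/cli":      3,
	}
	edges := []edge{
		{"cmd/dive/cli", "dive/image"},      // downward: fine
		{"dive/image/docker", "dive/image"}, // downward: fine
		{"internal/utils", "dive/filetree"}, // invented upward import
	}
	for _, e := range lawViolations(layerOf, edges) {
		fmt.Printf("upward import: %s -> %s\n", e.from, e.to)
	}
}
```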

##7 Consistency (§3) — would pass

Derives from Law. No file's declared layer is contradicted by what it imports.

Estimated tally: 5 of 7 pass under the directory-based dialect, with 2 named failures (Sorted, Modeled-tests). That's a real result, not "0/7 because no one tried."

#The §5 metrics — mixed measurement and estimate for dive

| metric | dive (Go) | WP plugin (PHP, estimated) | tdd.md (TS, measured) |
|---|---|---|---|
| §4 checks passing | ~5 / 7 (estimated) | 0 / 7 | 7 / 7 |
| graphDepth | 12 (measured, dive@d6c69194; originally estimated ~5) | ~3 | 7 |
| boundaryRatio | ~85% (estimated; one borderline case in options/ci.go) | <10% | 100% |
| workingSetFit (50–500 LOC) | 52.17% (measured, dive@d6c69194; originally estimated ~80%) | ~47% | 80% (measured) |
| violationCounts (sum) | ~30 (estimated; mostly Modeled-tests gaps) | 17+ | 0 |

The workingSetFit is the metric I most expected to land near tdd.md's 80% — two engineered codebases, both with linters and conventions. The measurement says otherwise.

Hand-trace (auditable per /sama/v2 §0): running find /tmp/dive -name '*.go' -not -name '*_test.go' -not -path '*/.git/*' -not -path '*/vendor/*' | wc -l returns 92 source .go files. Of those, 48 fall in [50, 500] LOC inclusive (matching WORKING_SET_MIN_LOC and WORKING_SET_MAX_LOC in src/a31_sama_v2.ts). 48 ÷ 92 = 0.5217 ≈ 52.17%. The polyglot §5 emitter at scripts/measure-working-set.ts produces the same number from the same source tree.

The distribution explains it: 44 files under 50 LOC (mostly small type-only modules, single-helper files, and platform-shim stubs like dive/image/docker/docker_host_windows.go at 6 LOC), 48 in band, and — strikingly — 0 over 500 LOC. dive's working-set miss is not god-classes (the §4.5 Atomic check passes outright); it's the opposite failure mode: many files small enough to fall below the substantive-module threshold.
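The arithmetic is small enough to restate as code. This is a sketch of the band filter as described in the hand-trace, not the emitter itself: the constants mirror the 50 and 500 bounds named above, `workingSetFit` is a hypothetical Go rendering, and the per-file LOC values in `main` are synthetic stand-ins that only preserve the 44 / 48 / 0 distribution.

```go
package main

import "fmt"

const (
	workingSetMinLOC = 50  // inclusive lower bound
	workingSetMaxLOC = 500 // inclusive upper bound
)

// workingSetFit returns the fraction of files whose line count falls
// inside [workingSetMinLOC, workingSetMaxLOC].
func workingSetFit(locs []int) float64 {
	if len(locs) == 0 {
		return 0
	}
	inBand := 0
	for _, n := range locs {
		if n >= workingSetMinLOC && n <= workingSetMaxLOC {
			inBand++
		}
	}
	return float64(inBand) / float64(len(locs))
}

func main() {
	// Synthetic sizes: 44 files below the band, 48 inside it, 0 above.
	locs := make([]int, 0, 92)
	for i := 0; i < 44; i++ {
		locs = append(locs, 6)
	}
	for i := 0; i < 48; i++ {
		locs = append(locs, 100)
	}
	fmt.Printf("%.2f%%\n", 100*workingSetFit(locs)) // 52.17%
}
```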

The original ~80% estimate was wrong, and wrong in a direction casual eyeballing wouldn't catch: glancing over the files visible on screen isn't the same as counting every file and applying a band filter. That 28-point miss between estimate and measurement is itself the empirical case for the metric existing at all: the metric surfaces a property the human estimate missed.

#graphDepth, measured: 12 (originally estimated ~5)

The polyglot graphDepth emitter at scripts/measure-graph-depth.ts walks dive's go.mod, collects every .go file's imports, filters to intra-module imports (those starting with github.com/wagoodman/dive/), aggregates them per-package-directory, and computes the longest path. The result for dive@d6c69194: 27 package directories, 80 internal edges, longest dependency chain of depth 12.
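The steps in that paragraph (filter to intra-module imports, aggregate per package directory, take the longest path) can be sketched as follows. This is not the emitter: `internalEdges` and `graphDepth` are hypothetical names, depth is assumed to be counted in edges (the emitter may count nodes), and the import lists in `main` are a toy graph rather than dive's measured one.

```go
package main

import (
	"fmt"
	"strings"
)

// internalEdges keeps only imports that live inside the module and
// rewrites them as module-relative package directories.
func internalEdges(modulePath string, imports map[string][]string) map[string][]string {
	prefix := modulePath + "/"
	graph := make(map[string][]string)
	for pkg, deps := range imports {
		for _, dep := range deps {
			if strings.HasPrefix(dep, prefix) {
				graph[pkg] = append(graph[pkg], strings.TrimPrefix(dep, prefix))
			}
		}
	}
	return graph
}

// graphDepth returns the number of edges on the longest import chain.
// It assumes the graph is acyclic, which Go guarantees for imports.
func graphDepth(graph map[string][]string) int {
	memo := make(map[string]int)
	var depth func(pkg string) int
	depth = func(pkg string) int {
		if d, ok := memo[pkg]; ok {
			return d
		}
		best := 0
		for _, dep := range graph[pkg] {
			if d := depth(dep) + 1; d > best {
				best = d
			}
		}
		memo[pkg] = best
		return best
	}
	longest := 0
	for pkg := range graph {
		if d := depth(pkg); d > longest {
			longest = d
		}
	}
	return longest
}

func main() {
	const module = "github.com/wagoodman/dive"
	// Toy import lists keyed by package directory (illustrative only).
	imports := map[string][]string{
		"cmd/dive":      {module + "/cmd/dive/cli"},
		"cmd/dive/cli":  {module + "/dive/image", "fmt"},
		"dive/image":    {module + "/dive/filetree"},
		"dive/filetree": {module + "/internal/log", "os"},
	}
	fmt.Println(graphDepth(internalEdges(module, imports))) // 4
}
```

Memoizing per package keeps the walk linear in the number of edges, which matters little at 27 packages but keeps the sketch honest for larger repos.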

A 12-deep import chain is more than twice the audit's eyeball estimate of ~5. The estimate was wrong because I was thinking in top-level package categories (cmd, command, ui, dive, filetree, internal/utils — six things), but the actual Go package graph treats each subdirectory as its own package. cmd/dive/cli/internal/ui/v1/viewmodel is a different package from cmd/dive/cli/internal/ui/v1/view, even though they read like one category to a human; the import graph sees them as distinct hops. The 12-deep chain weaves through subdirectories the human-readable description folded into one bullet.

This is the same shape of finding as the workingSetFit one above: the metric sees the structure; the eye sees the categories. Both are useful, but only the metric is mechanically comparable across repos.

Module-granularity note: the polyglot graphDepth metric counts at the Go package-directory level, because multiple .go files in one directory share their package and therefore their imports. This is the natural Go analog to the TS file-level metric (in TS, one module ≈ one file; in Go, one package ≈ one directory). The semantics are documented in src/b32_graph_depth_polyglot.ts.

#What dive would look like at 7/7 — the last 30%

Far less work than the WordPress refactor sketch from earlier. Three concrete changes get from ~5/7 to 7/7:

1. Add sama.profile.toml (under a directory-based dialect). The profile proposed under check 2 (Architecture) above maps every package to a canonical layer. The dialect requires a spec extension (v2.1 maybe); see conclusion. No code changes, just declaration.

2. Add sibling tests for the ~30 untested Layer-1/2 files. This is the only real work. The candidates that most need them:

  • dive/image/docker/image_archive.go (378 LOC, archive parsing — needs fixture-based tests)
  • dive/image/docker/engine_resolver.go (197 LOC — needs a fake daemon)
  • dive/image/podman/build.go (build coordination — needs a fake podman client)
  • The cmd/dive/cli/internal/command/ci/rules.go rule-evaluation logic (196 LOC)
  • The TUI view/ files (496 + 377 + others) — testing renderers is annoying but feasible with text-fixture comparison

A week of focused work would close this gap. It's not a refactor, just writing tests that don't exist yet.

3. Resolve the file_info.go filesystem-vs-pure tension. Split dive/filetree/ so the pure tree math sits separately from the file-walking adapter. Concretely: move file_info.go's os.Lstat calls into a new dive/filetree_adapter/ package and keep the tree algebra in dive/filetree/. About half a day's work.

The codebase is already so close to v2 that the lift is small. Compare to the WordPress plugin where the same goal requires splitting a 1,554-line god-class into eleven files, writing 20+ test files from scratch, and introducing a typed Settings replacement for an untyped option array. The "30% remaining" for dive is not a comparable amount of work.

#Where this leaves the spec

The audit surfaces one real finding about SAMA v2 itself: the §4.1 Sorted check is written with TypeScript/PHP filename conventions in mind, and doesn't translate cleanly to Go.

Go's idiom is to organize by package directory, not by filename prefix. A sama.profile.toml that says "these prefixes lex-sort in layer order" (the v2.0 format) has nothing meaningful to assert about dive. A directory-based dialect — "these package directories lex-sort in layer order" — does.

That's one of the §6 evolution-policy moves the spec was designed to accommodate: a new profile dialect, falsifiable against the §5 metrics, admitted only if measurements hold across multiple Go repos. Today's dive audit is the first datapoint for that hypothesis. To validate the directory-based dialect properly, the same exercise would need to be done against 3-5 more Go projects. That's a future post.

#Three real-world datapoints, on the same axes

After today, the §5 baseline graph has three points on it:

| project | language | §4 score | workingSetFit | boundaryRatio | graphDepth |
|---|---|---|---|---|---|
| tdd.md (this site) | TypeScript | 7/7 (measured) | 80% (measured) | 100% (measured) | 7 (measured) |
| wagoodman/dive | Go | ~5/7 (estimated) | 52.17% (measured) | ~85% (estimated) | 12 (measured) |
| Open Graph plugin | PHP / WordPress | 0/7 (estimated) | ~47% (estimated) | <10% (estimated) | ~3 (estimated) |

That's still n=3 and two of the three rows are wholly or partly hand-estimated, so nobody should be drawing conclusions about v2's empirical worth yet. But the pattern is suggestive: real-world Go code, written under no v2 discipline, scores closer, at least on §4 checks and boundaryRatio, to the v2-disciplined dogfood than to the WP-idiom code. The differential is exactly the kind of thing §5 was designed to surface. Whether that differential causes better outcomes for agents working in the code (fewer harness loops, faster onboarding, less drift) is the next experiment, not this post's claim.

