Deep Review: 20260519-154901-pr-4995

Date	2026-05-19 15:49
Repo	lima-vm/lima
Round	1 (of target)
Author	@kishansinghifs1
PR	#4995 — POC - LFX( TERM 2 ) - add Windows support with autounattend.xml generation
Branch	`poc-windows-autounattend`
Commits	`d2730be0` POC - LFX( TERM 2 ) - add Windows support with autounattend.xml generation
Reviewers	Claude Opus 4.7 (effort: xhigh), Codex GPT 5.5 (effort: xhigh), Gemini 3.1 Pro (effort: default), Qwen 3.6 27B (effort: high), Gemma 31B (effort: high)
Verdict	Rework — one critical test failure blocks CI, plus several Windows-correctness gaps that will surface the moment the boot path is wired up
Wall-clock time	`25 min 54 s`

Executive Summary ¶

This proof-of-concept makes Windows a valid os: value and adds an autounattend.xml generator alongside the existing cloud-config.yaml path. The diff is small (8 files, +77 −4) and the structural choices (a separate exported GenerateAutounattendXML, embedded template, dedicated filename constant) are clean.

The headline finding is C1: the regression test TestValidateMultipleErrors was edited to drop the OS-validation error from its expected output, but the test YAML still uses lowercase os: windows and nothing in the load → resolve → validate pipeline canonicalises it. go test ./pkg/limayaml -run TestValidateMultipleErrors fails as committed and will block CI.

The autounattend template itself has several Windows-specific gaps that are worth fixing before the boot wiring lands: a hardcoded processorArchitecture="amd64" that breaks ARM64 Windows guests (I1), no XML escaping for text/template-substituted user-controlled fields (I2), an SSH-key block embedded inside an XML comment which forbids -- (I3), an IANA timezone fed directly into a Windows-TZ-ID-only field (I4), and a computer name derived from a Lima identifier (up to 76 chars + lima- prefix) that can exceed the 15-char Windows ComputerName cap (I5).

Structure: 1 critical, 5 important, 6 suggestions, 3 design observations.

Critical Issues ¶

C1. TestValidateMultipleErrors fails as committed Claude Codex¶

pkg/limayaml/validate_test.go

		//
		assert.Error(t, err, "field `param` key \"rootFul\" is not used in any provision, probe, copyToHost, or portForward")
	}
}

func TestValidateMultipleErrors(t *testing.T) {
	yamlWithMultipleErrors := `
os: windows
arch: unsupported_arch
portForwards:
  - guestPort: 22
    hostPort: 2222
  - guestPort: 8080
    hostPort: 65536
provision:
  - mode: invalid_mode
    script: echo test
  - mode: data
    content: test
`

	y, err := Load(t.Context(), []byte(yamlWithMultipleErrors), "multiple-errors.yaml")
	assert.NilError(t, err)
	err = Validate(y, false)
	t.Logf("Validation errors: %v", err)

	assert.Error(t, err, "field `arch` must be one of [x86_64 aarch64 armv7l ppc64le riscv64 s390x]; got \"unsupported_arch\"\n"+
		"field `images` must be set\n"+
		"field `provision[0].mode` must one of \"system\", \"user\", \"boot\", \"data\", \"dependency\", \"ansible\", or \"yq\"\n"+
		"field `provision[1].path` must not be empty when mode is \"data\"")
}

func TestValidateAgainstLatestConfig(t *testing.T) {
	tests := []struct {
		name    string
		yNew    string

// validate.go (validator)
switch *y.OS {
case limatype.LINUX, limatype.DARWIN, limatype.FREEBSD, limatype.WINDOWS:
default:
    errs = errors.Join(errs, fmt.Errorf("field `os` must be one of %q; got %q", limatype.OSTypes, *y.OS))
}

The PR removed the field \os\ must be one of …; got "windows" line from the expected error list in TestValidateMultipleErrors, but the YAML payload still uses lowercase os: windows. limatype.WINDOWS == "Windows" and the validator's switch is case-sensitive.

// defaults.go — ResolveOS passes non-default values through verbatim
func ResolveOS(s *string) limatype.OS {
    if s == nil || *s == "" || *s == "default" {
        return limatype.NewOS("linux")
    }
    return *s
}

ResolveOS does not normalise. NewOS (lima_yaml.go:346) added a "windows" case in this PR, but it is unreachable from the YAML load path — only ResolveOS's "default" branch calls it, and only with "linux". Running the test confirms the failure:

field `os` must be one of ["Linux" "Darwin" "FreeBSD" "Windows"]; got "windows"
...
--- FAIL: TestValidateMultipleErrors (0.00s)

CI blocks on this. Codex rated the same finding Important; I promoted it because a failing test blocks the merge.

Fix: pick one of the two consistent options.

- os: windows
+ os: Windows

…and then change the YAML payload elsewhere in the file (or add a positive-case test) to keep the windows lowercase error path covered — or canonicalise OS at load time:

 func NewOS(osname string) OS {
-    switch osname {
+    switch strings.ToLower(osname) {
     case "linux":
         return LINUX
     case "darwin":
         return DARWIN
-    case "windows", "Windows":
+    case "windows":
         return WINDOWS

…and route the OS YAML field through NewOS from ResolveOS (which removes the dead "Windows" case in NewOS — see S1). The second option fixes a latent issue with os: Linux / os: Darwin / os: FreeBSD (capital) silently passing today only by accident — the constants happen to be capitalised.

Important Issues ¶

I1. autounattend.xml hardcodes processorArchitecture="amd64" Claude Codex Gemini¶

pkg/cidata/autounattend.xml.tmpl

<?xml version="1.0" encoding="utf-8"?>
<unattend xmlns="urn:schemas-microsoft-com:unattend">
  <settings pass="oobeSystem">
    <component name="Microsoft-Windows-Shell-Setup" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS">
      <ComputerName>{{ .Hostname }}</ComputerName>
      <RegisteredOwner>{{ .User }}</RegisteredOwner>
{{- if .TimeZone }}
      <TimeZone>{{ .TimeZone }}</TimeZone>
{{- end }}

OSTypes now includes WINDOWS, but no architecture constraint pairs with it. os: Windows + arch: aarch64 — the default on Apple Silicon hosts, where ResolveArch (defaults.go:1028) maps runtime.GOARCH=arm64 to aarch64 — passes validation and emits an unattend that declares amd64. Windows Setup rejects the component on ARM64 hardware.

Fix: drive the architecture from args.Arch (currently absent from TemplateArgs) and template the attribute, mapping aarch64 → arm64. Until the mapping lands, reject Windows + non-x86_64 in validate.go:

processorArchitecture="{{ if eq .Arch "aarch64" }}arm64{{ else }}amd64{{ end }}"

I2. text/template does not escape XML special characters Claude Codex Gemini Qwen¶

pkg/cidata/template.go

	}

	return layout, nil
}

func ExecuteTemplateAutounattendXML(args *TemplateArgs) ([]byte, error) {
	if err := ValidateTemplateArgs(args); err != nil {
		return nil, err
	}

	return textutil.ExecuteTemplate(string(autounattendTemplate), args)
}

textutil.ExecuteTemplate uses text/template, which performs no XML escaping. The template substitutes .Hostname, .User, and .TimeZone verbatim into XML element bodies. ValidateTemplateArgs explicitly does not validate args.User (template.go:129-130): "the user can override with any name they want". A User.Name or User.Comment containing &, <, >, or " produces malformed XML, and Windows Setup aborts.

Fix: render the document through encoding/xml (preferred — see Design Observations), or register xml.EscapeText / html.EscapeString as a template func and apply it to every XML text node.

I3. SSH-key comments containing -- break the XML comment block Gemini¶

pkg/cidata/autounattend.xml.tmpl

{{- if .TimeZone }}
      <TimeZone>{{ .TimeZone }}</TimeZone>
{{- end }}
    </component>
  </settings>
  <!-- Lima SSH keys (placeholder, consumed by later Windows provisioning):
{{- range .SSHPubKeys }}
    {{ . }}
{{- end }}
  -->
</unattend>

XML 1.0 forbids -- inside a comment body. SSH public keys have a free-form trailing comment field (typically user@host); any user with a host name like my--workstation produces an SSH key whose comment field contains --, which makes the generated unattend unparseable. The likelihood is low but non-zero, and the failure mode is silent.

Fix: escape -- (e.g. by replacing with -_) in the values added to args.SSHPubKeys for the autounattend path, or move the placeholder out of the XML comment entirely (e.g. into a vendor-specific element or a separate file).

I4. .TimeZone is IANA but the unattend field requires a Windows TZ ID Claude Codex¶

pkg/cidata/autounattend.xml.tmpl

<unattend xmlns="urn:schemas-microsoft-com:unattend">
  <settings pass="oobeSystem">
    <component name="Microsoft-Windows-Shell-Setup" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS">
      <ComputerName>{{ .Hostname }}</ComputerName>
      <RegisteredOwner>{{ .User }}</RegisteredOwner>
{{- if .TimeZone }}
      <TimeZone>{{ .TimeZone }}</TimeZone>
{{- end }}
    </component>
  </settings>
  <!-- Lima SSH keys (placeholder, consumed by later Windows provisioning):
{{- range .SSHPubKeys }}
    {{ . }}

LimaYAML.TimeZone is documented and populated as an IANA zoneinfo name. Windows' <TimeZone> unattend element requires a Windows-specific identifier (Pacific Standard Time, not America/Los_Angeles). FillDefault populates the field from the host by default, so every Windows install with Lima defaults emits a bad value silently.

Codex rated this Important; Claude rated it Suggestion. I kept Important because the default codepath, not an edge case, produces the bad value.

Fix: omit <TimeZone> for Windows until an IANA → Windows mapping (e.g. CLDR data) is implemented, with a TODO that documents the constraint.

I5. .Hostname can exceed the 15-char Windows ComputerName cap Codex¶

pkg/cidata/autounattend.xml.tmpl

<?xml version="1.0" encoding="utf-8"?>
<unattend xmlns="urn:schemas-microsoft-com:unattend">
  <settings pass="oobeSystem">
    <component name="Microsoft-Windows-Shell-Setup" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS">
      <ComputerName>{{ .Hostname }}</ComputerName>
      <RegisteredOwner>{{ .User }}</RegisteredOwner>
{{- if .TimeZone }}
      <TimeZone>{{ .TimeZone }}</TimeZone>
{{- end }}
    </component>

templateArgs sets args.Hostname = hostname.FromInstName(name), which prepends lima-. Lima identifiers accept up to 76 characters (identifiers.maxLength), so the generated computer name can be up to 81 characters. Windows enforces a 15-character ComputerName limit during setup; longer values either truncate silently or abort the install, depending on the build.

Fix: validate .Hostname length for Windows (≤15 chars) at template-arg construction, or derive a short Windows-specific computer name (hash + prefix) when the Lima name exceeds the cap.

Suggestions ¶

S1. NewOS "Windows" case is unreachable Claude¶

pkg/limatype/lima_yaml.go

	switch osname {
	case "linux":
		return LINUX
	case "darwin":
		return DARWIN
	case "windows", "Windows":
		return WINDOWS
	default:
		logrus.Warnf("Unknown os: %s", osname)
		return osname
	}
}

NewOS is only called by ResolveOS (defaults.go:1021), and only with the literal "linux". The "linux" and "darwin" arms are dead in the same sense, but adding a third unreachable arm entrenches the pattern. If C1's normalisation fix routes the OS YAML field through NewOS, the "Windows" arm becomes reachable; otherwise drop it.

S2. Autounattend test fixture uses Linux-shaped data Claude¶

pkg/cidata/template_test.go

			assert.Assert(t, strings.Contains(string(b), "mounts:"))
		}
	}
}

func TestAutounattendTemplate(t *testing.T) {
	args := &TemplateArgs{
		Name:     "default",
		Hostname: "lima-win",
		User:     "foo",
		UID:      501,
		Comment:  "Foo",
		Home:     "/home/foo.guest",
		Shell:    "/bin/bash",
		SSHPubKeys: []string{
			"ssh-rsa dummy foo@example.com",
		},
	}
	config, err := ExecuteTemplateAutounattendXML(args)
	assert.NilError(t, err)
	assert.Assert(t, strings.Contains(string(config), "<ComputerName>lima-win</ComputerName>"))
	assert.Assert(t, strings.Contains(string(config), "ssh-rsa dummy"))
}

ValidateTemplateArgs only checks non-emptiness, so the test passes, but the values are meaningless on the WINDOWS code path. Use C:\Users\foo and cmd.exe (or similar) so the fixture reads as plausibly Windows. Add positive assertions for the <TimeZone> branch and parse the output with encoding/xml to catch well-formedness regressions.

S3. Redundant ValidateTemplateArgs call Claude Qwen Gemma¶

pkg/cidata/cidata.go

	args, err := templateArgs(ctx, false, instDir, name, instConfig, 0, 0, 0, "", false, false, false)
	if err != nil {
		return err
	}

	if err := ValidateTemplateArgs(args); err != nil {
		return err
	}

	config, err := ExecuteTemplateAutounattendXML(args)
	if err != nil {
		return err
	}

	os.RemoveAll(filepath.Join(instDir, filenames.AutounattendXML))
	return os.WriteFile(filepath.Join(instDir, filenames.AutounattendXML), config, 0o444)
}

ExecuteTemplateAutounattendXML already calls ValidateTemplateArgs (template.go:213). The caller's explicit call is redundant. The existing GenerateCloudConfig/ExecuteTemplateCloudConfig pair mirrors the same double-validation, so this is consistency-with-existing rather than a regression — but the new function is a clean place to drop the duplicate. Gemma rated this Important; the duplicate is harmless, so a Suggestion fits better.

S4. Missing godoc on new exported symbols Claude Gemma¶

GenerateAutounattendXML, ExecuteTemplateAutounattendXML, and the AutounattendXML constant lack godoc. The neighbouring filename-block entries each carry an end-of-line comment naming the consumer of the file; the new entry does not.

Fix:

// GenerateAutounattendXML writes a minimal autounattend.xml for Windows guests.
func GenerateAutounattendXML(...) error { ... }

// ExecuteTemplateAutounattendXML renders the embedded autounattend template against args.
func ExecuteTemplateAutounattendXML(...) ([]byte, error) { ... }

AutounattendXML = "autounattend.xml" // Windows guest unattend file; consumed by Windows Setup

S5. Joliet extension not applied for Windows config drives Gemini¶

pkg/cidata/cidata.go

		})
		return args.IID, writeCIDataDir(filepath.Join(instDir, filenames.CIDataISODir), layout)
	}

	var iso9660Options []iso9660util.WriteOpt
	switch *instConfig.OS {
	case limatype.DARWIN, limatype.FREEBSD:
		// Darwin: Without Joliet, all the file names will be in upper case, and the hiphen will be replaced with underscore
		// FreeBSD: Without Joliet, the files are not executable
		iso9660Options = append(iso9660Options, iso9660util.WithJoliet())
	}
	return args.IID, iso9660util.Write(filepath.Join(instDir, filenames.CIDataISO), "cidata", layout, iso9660Options...)
}

func removeControlChars(s string) string {
	out := make([]rune, 0, len(s))

Once the Windows boot path is wired up, the cidata.iso generated next to autounattend.xml will be mounted by the Windows guest. Without Joliet, Windows exposes upper-cased 8.3 names. Gemini rated this Important; demoted to Suggestion because Windows guests don't consume cidata.iso yet (the autounattend file is written to instDir/autounattend.xml, outside the ISO). Worth fixing pre-emptively so the wiring step doesn't trip on it.

Fix:

-    case limatype.DARWIN, limatype.FREEBSD:
+    case limatype.DARWIN, limatype.FREEBSD, limatype.WINDOWS:

S6. NewOS does not normalise "freebsd" Qwen¶

pkg/limatype/lima_yaml.go

	Config   LimaYAML `json:"config"`
	FilePath string   `json:"filePath"`
}

func NewOS(osname string) OS {
	switch osname {
	case "linux":
		return LINUX
	case "darwin":
		return DARWIN
	case "windows", "Windows":
		return WINDOWS
	default:
		logrus.Warnf("Unknown os: %s", osname)
		return osname
	}
}

func Goarm() int {
	if runtime.GOOS != "linux" {
		return 0
	}

OSTypes lists FREEBSD, but NewOS has no "freebsd" arm — the input falls through to the default branch, which logs Unknown os: freebsd and returns the string verbatim. Pre-existing, but the diff that adds the "windows" case is a natural place to fix it. Closely related to S1 (the NewOS "Windows" arm is unreachable); a normalisation fix that routes the YAML field through NewOS would resolve both.

Fix:

     case "darwin":
         return DARWIN
+    case "freebsd":
+        return FREEBSD
     case "windows", "Windows":
         return WINDOWS

Design Observations ¶

Concerns ¶

Hand-rolled unattend XML invites a long tail of correctness bugs. (future) Claude — Autounattend.xml has a real schema (multiple passes: windowsPE, specialize, oobeSystem; OOBE/UserAccounts/AutoLogon sub-elements; arch- and edition-specific component names) and Microsoft publishes both the schema (shipped in WAIK/ADK) and well-maintained generator libraries. A spec-aware builder dissolves I1 (per-arch component selection), I2 (proper escaping), I3 (key block format), and I5 (Windows-name constraints) by construction. Worth raising before this POC matures into a supported feature.
Cloud-config dispatch is split across two functions. (future) Claude Gemini Qwen — GenerateCloudConfig short-circuits to GenerateAutounattendXML for Windows (cidata.go:362-364), but the downstream GenerateISO9660 call in pkg/hostagent/hostagent.go:181 still runs the full Linux cloud-init pipeline unconditionally. For Windows guests it produces a cidata.iso that nothing on the guest will consume. Not a regression because no Windows boot path exists yet, but the divergence leaks the moment a driver is wired up. Gate both generators in one place — e.g. generateBootMedia(instConfig).
Function name no longer matches behavior. (in-scope) Gemma — GenerateCloudConfig is now a dispatcher that may write autounattend.xml. A name like GenerateGuestConfig would match the expanded responsibility. Closely related to the dispatch concern above and could be addressed together.

Strengths ¶

Single-character touchpoints (OSTypes, validate switch, filename constant) keep the surface area for adding a guest OS small — easy to follow and easy to revert. Claude
The new autounattend.xml write uses the existing 0o444 pattern from GenerateCloudConfig, including the os.RemoveAll step that's required because WriteFile can't truncate a 0o444 file on re-run. Claude
The Windows-specific artifact is isolated behind ExecuteTemplateAutounattendXML, which keeps the Windows path separate from the cloud-config template path. Codex

Testing Assessment ¶

C1 is a working test failure — go test ./pkg/limayaml -run TestValidateMultipleErrors fails as committed.
No positive coverage for os: Windows — the validator switch accepts canonical Windows, but no test exercises that path. Add one.
No XML well-formedness test — TestAutounattendTemplate (template_test.go:172) does substring matching only. A test that parses the output with encoding/xml would catch I2 (escaping) and I3 (comment-block --).
No architecture matrix — the template hardcodes amd64 (I1); a table-driven test covering aarch64 would have caught it.
No coverage for the dispatch — no test confirms that GenerateCloudConfig on os: Windows writes autounattend.xml and skips cloud-config.yaml.
No Windows-specific constraints test — IANA TZ → Windows TZ mapping (I4), ComputerName length cap (I5), and the SSH-key comment escape (I3) all have no test coverage.

Documentation Assessment ¶

New exported symbols GenerateAutounattendXML, ExecuteTemplateAutounattendXML, and the AutounattendXML filename constant lack godoc (S4).
The internals documentation (docs/internals.md) still lists cloud-config.yaml as the only generated instance reference file. The user-facing config docs describe timezone as zoneinfo-based, which conflicts with what the Windows path emits today (I4).
The PR adds a new user-visible OS value but no template, schema doc, or website page advertises it. If templates/ or docs/ would advertise Windows support, those need follow-up before the POC matures.

Commit Structure ¶

Single commit, scope matches the title. DCO sign-off present. No fixup commits.

Acknowledged Limitations ¶

PR description: "The template is intentionally minimal (hostname/owner plus SSH keys placeholder) to demonstrate the end-to-end path without changing QEMU boot wiring yet." Consistent with the diff — no driver wiring, no SSH provisioning, no unattend pass beyond oobeSystem. The downstream GenerateISO9660 divergence (second design observation above) is a known consequence of this scope.
Inline comment (autounattend.xml.tmpl:12): <!-- Lima SSH keys (placeholder, consumed by later Windows provisioning):. Explicitly flags that the keys are not yet wired.

Agent Performance Retro ¶

Claude ¶

Claude was the only agent to actually run go test against the modified assertion and confirm the failure as a CI blocker — that's what makes C1 a Critical rather than an Important finding. It also surfaced the strongest design observation (hand-rolled XML invites a long tail of bugs, suggest a spec-aware generator) and the strongest godoc/comment-quality follow-ups. Severity calibration was sharp throughout: it correctly tagged the IANA-TZ issue at Suggestion when it later turned out the default codepath populates it (so a re-rating to Important is justified), and it did not inflate the redundant-validation finding (S3) the way Gemma did.

Codex ¶

Codex contributed the only unique find that materially raises the merge bar: I5 (ComputerName length cap vs. Lima identifier length). It also independently caught the test failure (I1 in its report), the hardcoded amd64, the XML-escaping risk, and the IANA-TZ mismatch — and it correctly named the conflict with templates/default.yaml:414-415's documented zoneinfo semantics. The one calibration miss: it rated the test failure as Important when it's a CI blocker. Otherwise tight, citation-rich, and low on noise.

Gemini ¶

Gemini found the same headline issues (hardcoded amd64, XML escaping) and contributed two unique angles: I3 (SSH-key comments with -- breaking the XML comment block) and the Joliet-extension miss (S5 in the consolidated report, demoted from Important to Suggestion because Windows guests don't yet consume cidata.iso). It missed the test failure: it noticed the assertion was edited (its I5) but reported only "coverage was lost," not "the test now fails." A git blame/go test step would have promoted that finding two severities; Gemini's known no-git blame posture cost it here.

Qwen ¶

Qwen failed on its first attempt — opencode auto-rejected its initial read because the prompt was sent with the worktree at its unresolved /var/folders/... path, while opencode canonicalised --dir to /private/var/folders/.... After a launch-helper fix that resolves the worktree path with filepath.EvalSymlinks (and a preamble update that anchors __WORKTREE__ literally in the in-tree-only sentence), Qwen produced a real review. The contribution was modest: it duplicated the redundant-validation suggestion (S3) and the XML-escape Important (I2), and seconded the design observation about GenerateISO9660 running unconditionally for Windows. Its only unique find was S6 (NewOS lacks the "freebsd" arm) — small but legitimate, and a miss for every other agent. Calibration was mixed: it mis-rated the XML-escape risk as Suggestion ("likelihood is low"), missing that User.Comment is explicitly unvalidated; and claimed TestValidateMultipleErrors "now passes" without running it, confirming the same execution gap as Gemini and Gemma.

Gemma ¶

Gemma's findings (redundant ValidateTemplateArgs call, unnecessary os.RemoveAll before WriteFile, missing docstrings) are all real but minor — the kind of items a Suggestion-tier reviewer should turn up. Two calibration issues:

It flagged os.RemoveAll as unnecessary, missing that the existing file is written with mode 0o444 — WriteFile's open+truncate fails on a read-only file, so the remove is load-bearing. False positive.
It noted "case-insensitive handling in NewOS" as a strength. NewOS is case-sensitive; it only matches "windows" or "Windows" because both are explicitly listed (and that listing happens to be the dead code Claude flagged in S1). Misread.

It also missed every Windows-correctness issue (XML escaping, hardcoded amd64, IANA TZ, ComputerName cap) and the critical test failure. Coverage misses: it marked autounattend.xml.tmpl, lima_yaml.go, validate.go, and validate_test.go as "Reviewed, no issues" while each contains at least one Important or Critical issue from the consolidated report.

Summary ¶

	Claude Opus 4.7	Codex GPT 5.5	Gemini 3.1 Pro	Qwen 3.6 27B	Gemma 31B
Duration	5m 48s	7m 25s	—	0m 26s (fix-up)	3m 55s
Findings	1C 2I 4S	4I 1S	4I 1S	1I 2S	3S
Tool calls	32 (Read 16, Grep 9, Bash 6)	77 (shell 77)	—	8 (read 8)	—
Design observations	4	1	0	1	1
False positives	0	0	0	0	1
Unique insights	3	1	2	1	0
Files reviewed	8	8	8	8	8
Coverage misses	0	0	2	0	4
Totals	1C 2I 4S	4I 1S	4I 1S	1I 2S	3S
Downgraded	0	0	1 (I→S)	1 (I→S)	1 (I→S)
Dropped	0	0	0	0	1

Claude and Codex did the heavy lifting. Either one alone would have produced an actionable review; together they covered the entire PR including the CI blocker and the unique ComputerName cap. Gemini added two further angles for free. Qwen (after the permission-system fix-up) contributed one unique find (S6 FreeBSD case) and seconded three existing findings — calibration is comparable to Gemma's. Gemma's findings are real but small, and two were calibration misses; it's a useful third opinion on the trivial-to-medium tier but should not be relied on for severity calls.

Reconciliation ¶

Codex I1 (lowercase windows test failure): Important → Critical (C1). Reason: failing test blocks CI on merge.
Claude S3 (TimeZone IANA mismatch): Suggestion → Important (I4). Reason: FillDefault populates the value from the host by default, so the bad output is the default codepath rather than an edge case.
Gemini I4 (Joliet for Windows): Important → Suggestion (S5). Reason: Windows guests do not yet consume cidata.iso; pre-emptive fix only.
Gemma I1 (redundant ValidateTemplateArgs): Important → Suggestion (S3). Reason: behaviour is harmless and matches the existing GenerateCloudConfig pattern.
Gemma S1 (unnecessary os.RemoveAll): dropped as false positive. Reason: target file is 0o444; WriteFile's O_TRUNC fails without the prior remove.
Qwen I2 (redundant ValidateTemplateArgs): Important → Suggestion (S3). Reason: same as Gemma's downgrade — harmless, matches existing pattern.
Qwen S1 (XML escaping): Suggestion → Important (I2). Reason: Qwen judged "likelihood is low" but User is documented as not validated, so user-controlled fields routinely reach the template path.

Review Process Notes ¶

Skill improvements ¶

Execute affected tests when an agent finds an assertion edit. When a finding cites an edited expected-error string, an asserted output, or a deleted-then-restored test case, the agent should attempt to run that test before classifying the finding. The difference between "this test no longer covers X" (low severity) and "this test fails on main" (CI blocker) is one go test / pytest / equivalent invocation away, and confusing the two miscalibrates the verdict on the entire PR. Reviewers who can read code but skip executing it produce systematically softer findings on test edits.
Treat YAML/JSON enum acceptance as a normalisation question, not a constant-list question. When a code change adds a value to a switch over string constants that originate from user-supplied YAML, the reviewer must trace the path from "raw YAML field" to "switch input" and verify case/format normalisation, not just confirm the new arm exists. The default mental model — "if it's in the switch, it works" — misses cases where the YAML loader returns the input verbatim and the constant differs in capitalisation, spacing, or punctuation. Flag any new enum addition that has no corresponding normalisation step as a Critical (failing acceptance test) or Important (silent rejection of the documented spelling), not a Suggestion.
For any change that adds an item to an existing enumeration, audit symmetry across the codebase. Enumerations include string-constant lists, switch arms, dispatch tables, validation cases, type registries, normalisation maps, and parallel constant blocks. A new arm in one place often implies a missing arm in three others — case normalisation, default initialisation, formatter output, doc rendering, error reporting. Before approving such a change, grep every reference to the enumeration's existing items and confirm the new item is wired in everywhere the others are; flag each asymmetry as a separate finding (severity matches what breaks when the asymmetric path executes). Models that read the diff in isolation routinely miss this because the diff itself looks complete; the gap lives in the files the diff does not touch.

Repo context updates ¶

Lima's ResolveOS/ResolveArch do not canonicalise user input. pkg/limayaml/defaults.go:1019-1031 returns the user's string verbatim whenever it is non-empty and non-"default". The capitalised constants (Linux, Darwin, FreeBSD, Windows) only match because users conventionally write them capitalised; lowercase spellings produce validation errors despite being human-equivalent. Any PR that touches os: or arch: validation should be checked against both the canonical and lowercase spellings.