Deep Review: 20260519-154901-pr-4995
| Date | 2026-05-19 15:49 |
| Repo | lima-vm/lima |
| Round | 1 (of target) |
| Author | @kishansinghifs1 |
| PR | #4995 — POC - LFX( TERM 2 ) - add Windows support with autounattend.xml generation |
| Branch | poc-windows-autounattend |
| Commits | d2730be0 POC - LFX( TERM 2 ) - add Windows support with autounattend.xml generation |
| Reviewers | Claude Opus 4.7 (effort: xhigh), Codex GPT 5.5 (effort: xhigh), Gemini 3.1 Pro (effort: default), Qwen 3.6 27B (effort: high), Gemma 31B (effort: high) |
| Verdict | Rework — one critical test failure blocks CI, plus several Windows-correctness gaps that will surface the moment the boot path is wired up |
| Wall-clock time | 25 min 54 s |
Executive Summary ¶
This proof-of-concept makes Windows a valid os: value and adds an autounattend.xml generator alongside the existing cloud-config.yaml path. The diff is small (8 files, +77 −4) and the structural choices (a separate exported GenerateAutounattendXML, embedded template, dedicated filename constant) are clean.
The headline finding is C1: the regression test TestValidateMultipleErrors was edited to drop the OS-validation error from its expected output, but the test YAML still uses lowercase os: windows and nothing in the load → resolve → validate pipeline canonicalises it. go test ./pkg/limayaml -run TestValidateMultipleErrors fails as committed and will block CI.
The autounattend template itself has several Windows-specific gaps that are worth fixing before the boot wiring lands: a hardcoded processorArchitecture="amd64" that breaks ARM64 Windows guests (I1), no XML escaping for text/template-substituted user-controlled fields (I2), an SSH-key block embedded inside an XML comment which forbids -- (I3), an IANA timezone fed directly into a Windows-TZ-ID-only field (I4), and a computer name derived from a Lima identifier (up to 76 chars + lima- prefix) that can exceed the 15-char Windows ComputerName cap (I5).
Structure: 1 critical, 5 important, 6 suggestions, 3 design observations.
Critical Issues ¶
//
assert.Error(t, err, "field `param` key \"rootFul\" is not used in any provision, probe, copyToHost, or portForward")
}
}
func TestValidateMultipleErrors(t *testing.T) {
yamlWithMultipleErrors := `
os: windows
arch: unsupported_arch
portForwards:
- guestPort: 22
hostPort: 2222
- guestPort: 8080
hostPort: 65536
provision:
- mode: invalid_mode
script: echo test
- mode: data
content: test
`
y, err := Load(t.Context(), []byte(yamlWithMultipleErrors), "multiple-errors.yaml")
assert.NilError(t, err)
err = Validate(y, false)
t.Logf("Validation errors: %v", err)
assert.Error(t, err, "field `arch` must be one of [x86_64 aarch64 armv7l ppc64le riscv64 s390x]; got \"unsupported_arch\"\n"+
"field `images` must be set\n"+
"field `provision[0].mode` must one of \"system\", \"user\", \"boot\", \"data\", \"dependency\", \"ansible\", or \"yq\"\n"+
"field `provision[1].path` must not be empty when mode is \"data\"")
}
func TestValidateAgainstLatestConfig(t *testing.T) {
tests := []struct {
name string
yNew string
// validate.go (validator)
switch *y.OS {
case limatype.LINUX, limatype.DARWIN, limatype.FREEBSD, limatype.WINDOWS:
default:
errs = errors.Join(errs, fmt.Errorf("field `os` must be one of %q; got %q", limatype.OSTypes, *y.OS))
}
The PR removed the field \os\ must be one of …; got "windows" line from the expected error list in TestValidateMultipleErrors, but the YAML payload still uses lowercase os: windows. limatype.WINDOWS == "Windows" and the validator's switch is case-sensitive.
// defaults.go — ResolveOS passes non-default values through verbatim
func ResolveOS(s *string) limatype.OS {
if s == nil || *s == "" || *s == "default" {
return limatype.NewOS("linux")
}
return *s
}
ResolveOS does not normalise. NewOS (lima_yaml.go:346) added a "windows" case in this PR, but it is unreachable from the YAML load path — only ResolveOS's "default" branch calls it, and only with "linux". Running the test confirms the failure:
field `os` must be one of ["Linux" "Darwin" "FreeBSD" "Windows"]; got "windows"
...
--- FAIL: TestValidateMultipleErrors (0.00s)
CI blocks on this. Codex rated the same finding Important; I promoted it because a failing test blocks the merge.
Fix: pick one of the two consistent options.
- os: windows
+ os: Windows
…and then change the YAML payload elsewhere in the file (or add a positive-case test) to keep the windows lowercase error path covered — or canonicalise OS at load time:
func NewOS(osname string) OS {
- switch osname {
+ switch strings.ToLower(osname) {
case "linux":
return LINUX
case "darwin":
return DARWIN
- case "windows", "Windows":
+ case "windows":
return WINDOWS
…and route the OS YAML field through NewOS from ResolveOS (which removes the dead "Windows" case in NewOS — see S1). The second option fixes a latent issue with os: Linux / os: Darwin / os: FreeBSD (capital) silently passing today only by accident — the constants happen to be capitalised.
Important Issues ¶
<?xml version="1.0" encoding="utf-8"?>
<unattend xmlns="urn:schemas-microsoft-com:unattend">
<settings pass="oobeSystem">
<component name="Microsoft-Windows-Shell-Setup" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS">
<ComputerName>{{ .Hostname }}</ComputerName>
<RegisteredOwner>{{ .User }}</RegisteredOwner>
{{- if .TimeZone }}
<TimeZone>{{ .TimeZone }}</TimeZone>
{{- end }}
OSTypes now includes WINDOWS, but no architecture constraint pairs with it. os: Windows + arch: aarch64 — the default on Apple Silicon hosts, where ResolveArch (defaults.go:1028) maps runtime.GOARCH=arm64 to aarch64 — passes validation and emits an unattend that declares amd64. Windows Setup rejects the component on ARM64 hardware.
Fix: drive the architecture from args.Arch (currently absent from TemplateArgs) and template the attribute, mapping aarch64 → arm64. Until the mapping lands, reject Windows + non-x86_64 in validate.go:
processorArchitecture="{{ if eq .Arch "aarch64" }}arm64{{ else }}amd64{{ end }}"
}
return layout, nil
}
func ExecuteTemplateAutounattendXML(args *TemplateArgs) ([]byte, error) {
if err := ValidateTemplateArgs(args); err != nil {
return nil, err
}
return textutil.ExecuteTemplate(string(autounattendTemplate), args)
}
textutil.ExecuteTemplate uses text/template, which performs no XML escaping. The template substitutes .Hostname, .User, and .TimeZone verbatim into XML element bodies. ValidateTemplateArgs explicitly does not validate args.User (template.go:129-130): "the user can override with any name they want". A User.Name or User.Comment containing &, <, >, or " produces malformed XML, and Windows Setup aborts.
Fix: render the document through encoding/xml (preferred — see Design Observations), or register xml.EscapeText / html.EscapeString as a template func and apply it to every XML text node.
{{- if .TimeZone }}
<TimeZone>{{ .TimeZone }}</TimeZone>
{{- end }}
</component>
</settings>
<!-- Lima SSH keys (placeholder, consumed by later Windows provisioning):
{{- range .SSHPubKeys }}
{{ . }}
{{- end }}
-->
</unattend>
XML 1.0 forbids -- inside a comment body. SSH public keys have a free-form trailing comment field (typically user@host); any user with a host name like my--workstation produces an SSH key whose comment field contains --, which makes the generated unattend unparseable. The likelihood is low but non-zero, and the failure mode is silent.
Fix: escape -- (e.g. by replacing with -_) in the values added to args.SSHPubKeys for the autounattend path, or move the placeholder out of the XML comment entirely (e.g. into a vendor-specific element or a separate file).
<unattend xmlns="urn:schemas-microsoft-com:unattend">
<settings pass="oobeSystem">
<component name="Microsoft-Windows-Shell-Setup" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS">
<ComputerName>{{ .Hostname }}</ComputerName>
<RegisteredOwner>{{ .User }}</RegisteredOwner>
{{- if .TimeZone }}
<TimeZone>{{ .TimeZone }}</TimeZone>
{{- end }}
</component>
</settings>
<!-- Lima SSH keys (placeholder, consumed by later Windows provisioning):
{{- range .SSHPubKeys }}
{{ . }}
LimaYAML.TimeZone is documented and populated as an IANA zoneinfo name. Windows' <TimeZone> unattend element requires a Windows-specific identifier (Pacific Standard Time, not America/Los_Angeles). FillDefault populates the field from the host by default, so every Windows install with Lima defaults emits a bad value silently.
Codex rated this Important; Claude rated it Suggestion. I kept Important because the default codepath, not an edge case, produces the bad value.
Fix: omit <TimeZone> for Windows until an IANA → Windows mapping (e.g. CLDR data) is implemented, with a TODO that documents the constraint.
<?xml version="1.0" encoding="utf-8"?>
<unattend xmlns="urn:schemas-microsoft-com:unattend">
<settings pass="oobeSystem">
<component name="Microsoft-Windows-Shell-Setup" processorArchitecture="amd64" publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS">
<ComputerName>{{ .Hostname }}</ComputerName>
<RegisteredOwner>{{ .User }}</RegisteredOwner>
{{- if .TimeZone }}
<TimeZone>{{ .TimeZone }}</TimeZone>
{{- end }}
</component>
templateArgs sets args.Hostname = hostname.FromInstName(name), which prepends lima-. Lima identifiers accept up to 76 characters (identifiers.maxLength), so the generated computer name can be up to 81 characters. Windows enforces a 15-character ComputerName limit during setup; longer values either truncate silently or abort the install, depending on the build.
Fix: validate .Hostname length for Windows (≤15 chars) at template-arg construction, or derive a short Windows-specific computer name (hash + prefix) when the Lima name exceeds the cap.
Suggestions ¶
switch osname {
case "linux":
return LINUX
case "darwin":
return DARWIN
case "windows", "Windows":
return WINDOWS
default:
logrus.Warnf("Unknown os: %s", osname)
return osname
}
}
NewOS is only called by ResolveOS (defaults.go:1021), and only with the literal "linux". The "linux" and "darwin" arms are dead in the same sense, but adding a third unreachable arm entrenches the pattern. If C1's normalisation fix routes the OS YAML field through NewOS, the "Windows" arm becomes reachable; otherwise drop it.
assert.Assert(t, strings.Contains(string(b), "mounts:"))
}
}
}
func TestAutounattendTemplate(t *testing.T) {
args := &TemplateArgs{
Name: "default",
Hostname: "lima-win",
User: "foo",
UID: 501,
Comment: "Foo",
Home: "/home/foo.guest",
Shell: "/bin/bash",
SSHPubKeys: []string{
"ssh-rsa dummy foo@example.com",
},
}
config, err := ExecuteTemplateAutounattendXML(args)
assert.NilError(t, err)
assert.Assert(t, strings.Contains(string(config), "<ComputerName>lima-win</ComputerName>"))
assert.Assert(t, strings.Contains(string(config), "ssh-rsa dummy"))
}
ValidateTemplateArgs only checks non-emptiness, so the test passes, but the values are meaningless on the WINDOWS code path. Use C:\Users\foo and cmd.exe (or similar) so the fixture reads as plausibly Windows. Add positive assertions for the <TimeZone> branch and parse the output with encoding/xml to catch well-formedness regressions.
args, err := templateArgs(ctx, false, instDir, name, instConfig, 0, 0, 0, "", false, false, false)
if err != nil {
return err
}
if err := ValidateTemplateArgs(args); err != nil {
return err
}
config, err := ExecuteTemplateAutounattendXML(args)
if err != nil {
return err
}
os.RemoveAll(filepath.Join(instDir, filenames.AutounattendXML))
return os.WriteFile(filepath.Join(instDir, filenames.AutounattendXML), config, 0o444)
}
ExecuteTemplateAutounattendXML already calls ValidateTemplateArgs (template.go:213). The caller's explicit call is redundant. The existing GenerateCloudConfig/ExecuteTemplateCloudConfig pair mirrors the same double-validation, so this is consistency-with-existing rather than a regression — but the new function is a clean place to drop the duplicate. Gemma rated this Important; the duplicate is harmless, so a Suggestion fits better.
GenerateAutounattendXML, ExecuteTemplateAutounattendXML, and the AutounattendXML constant lack godoc. The neighbouring filename-block entries each carry an end-of-line comment naming the consumer of the file; the new entry does not.
Fix:
// GenerateAutounattendXML writes a minimal autounattend.xml for Windows guests.
func GenerateAutounattendXML(...) error { ... }
// ExecuteTemplateAutounattendXML renders the embedded autounattend template against args.
func ExecuteTemplateAutounattendXML(...) ([]byte, error) { ... }
AutounattendXML = "autounattend.xml" // Windows guest unattend file; consumed by Windows Setup
})
return args.IID, writeCIDataDir(filepath.Join(instDir, filenames.CIDataISODir), layout)
}
var iso9660Options []iso9660util.WriteOpt
switch *instConfig.OS {
case limatype.DARWIN, limatype.FREEBSD:
// Darwin: Without Joliet, all the file names will be in upper case, and the hiphen will be replaced with underscore
// FreeBSD: Without Joliet, the files are not executable
iso9660Options = append(iso9660Options, iso9660util.WithJoliet())
}
return args.IID, iso9660util.Write(filepath.Join(instDir, filenames.CIDataISO), "cidata", layout, iso9660Options...)
}
func removeControlChars(s string) string {
out := make([]rune, 0, len(s))
Once the Windows boot path is wired up, the cidata.iso generated next to autounattend.xml will be mounted by the Windows guest. Without Joliet, Windows exposes upper-cased 8.3 names. Gemini rated this Important; demoted to Suggestion because Windows guests don't consume cidata.iso yet (the autounattend file is written to instDir/autounattend.xml, outside the ISO). Worth fixing pre-emptively so the wiring step doesn't trip on it.
Fix:
- case limatype.DARWIN, limatype.FREEBSD:
+ case limatype.DARWIN, limatype.FREEBSD, limatype.WINDOWS:
Config LimaYAML `json:"config"`
FilePath string `json:"filePath"`
}
func NewOS(osname string) OS {
switch osname {
case "linux":
return LINUX
case "darwin":
return DARWIN
case "windows", "Windows":
return WINDOWS
default:
logrus.Warnf("Unknown os: %s", osname)
return osname
}
}
func Goarm() int {
if runtime.GOOS != "linux" {
return 0
}
OSTypes lists FREEBSD, but NewOS has no "freebsd" arm — the input falls through to the default branch, which logs Unknown os: freebsd and returns the string verbatim. Pre-existing, but the diff that adds the "windows" case is a natural place to fix it. Closely related to S1 (the NewOS "Windows" arm is unreachable); a normalisation fix that routes the YAML field through NewOS would resolve both.
Fix:
case "darwin":
return DARWIN
+ case "freebsd":
+ return FREEBSD
case "windows", "Windows":
return WINDOWS
Design Observations ¶
Concerns ¶
- Hand-rolled unattend XML invites a long tail of correctness bugs.
(future)Claude — Autounattend.xml has a real schema (multiple passes:windowsPE,specialize,oobeSystem;OOBE/UserAccounts/AutoLogonsub-elements; arch- and edition-specific component names) and Microsoft publishes both the schema (shipped in WAIK/ADK) and well-maintained generator libraries. A spec-aware builder dissolves I1 (per-arch component selection), I2 (proper escaping), I3 (key block format), and I5 (Windows-name constraints) by construction. Worth raising before this POC matures into a supported feature. - Cloud-config dispatch is split across two functions.
(future)Claude Gemini Qwen —GenerateCloudConfigshort-circuits toGenerateAutounattendXMLfor Windows (cidata.go:362-364), but the downstreamGenerateISO9660call inpkg/hostagent/hostagent.go:181still runs the full Linux cloud-init pipeline unconditionally. For Windows guests it produces acidata.isothat nothing on the guest will consume. Not a regression because no Windows boot path exists yet, but the divergence leaks the moment a driver is wired up. Gate both generators in one place — e.g.generateBootMedia(instConfig). - Function name no longer matches behavior.
(in-scope)Gemma —GenerateCloudConfigis now a dispatcher that may writeautounattend.xml. A name likeGenerateGuestConfigwould match the expanded responsibility. Closely related to the dispatch concern above and could be addressed together.
Strengths ¶
- Single-character touchpoints (
OSTypes, validate switch, filename constant) keep the surface area for adding a guest OS small — easy to follow and easy to revert. Claude - The new
autounattend.xmlwrite uses the existing0o444pattern fromGenerateCloudConfig, including theos.RemoveAllstep that's required becauseWriteFilecan't truncate a 0o444 file on re-run. Claude - The Windows-specific artifact is isolated behind
ExecuteTemplateAutounattendXML, which keeps the Windows path separate from the cloud-config template path. Codex
Testing Assessment ¶
- C1 is a working test failure —
go test ./pkg/limayaml -run TestValidateMultipleErrorsfails as committed. - No positive coverage for
os: Windows— the validator switch accepts canonicalWindows, but no test exercises that path. Add one. - No XML well-formedness test —
TestAutounattendTemplate(template_test.go:172) does substring matching only. A test that parses the output withencoding/xmlwould catch I2 (escaping) and I3 (comment-block--). - No architecture matrix — the template hardcodes
amd64(I1); a table-driven test coveringaarch64would have caught it. - No coverage for the dispatch — no test confirms that
GenerateCloudConfigonos: Windowswritesautounattend.xmland skipscloud-config.yaml. - No Windows-specific constraints test — IANA TZ → Windows TZ mapping (I4), ComputerName length cap (I5), and the SSH-key comment escape (I3) all have no test coverage.
Documentation Assessment ¶
- New exported symbols
GenerateAutounattendXML,ExecuteTemplateAutounattendXML, and theAutounattendXMLfilename constant lack godoc (S4). - The internals documentation (
docs/internals.md) still listscloud-config.yamlas the only generated instance reference file. The user-facing config docs describetimezoneas zoneinfo-based, which conflicts with what the Windows path emits today (I4). - The PR adds a new user-visible OS value but no template, schema doc, or website page advertises it. If
templates/ordocs/would advertise Windows support, those need follow-up before the POC matures.
Commit Structure ¶
Single commit, scope matches the title. DCO sign-off present. No fixup commits.
Acknowledged Limitations ¶
- PR description: "The template is intentionally minimal (hostname/owner plus SSH keys placeholder) to demonstrate the end-to-end path without changing QEMU boot wiring yet." Consistent with the diff — no driver wiring, no SSH provisioning, no unattend pass beyond
oobeSystem. The downstreamGenerateISO9660divergence (second design observation above) is a known consequence of this scope. - Inline comment (
autounattend.xml.tmpl:12):<!-- Lima SSH keys (placeholder, consumed by later Windows provisioning):. Explicitly flags that the keys are not yet wired.
Agent Performance Retro ¶
Claude ¶
Claude was the only agent to actually run go test against the modified assertion and confirm the failure as a CI blocker — that's what makes C1 a Critical rather than an Important finding. It also surfaced the strongest design observation (hand-rolled XML invites a long tail of bugs, suggest a spec-aware generator) and the strongest godoc/comment-quality follow-ups. Severity calibration was sharp throughout: it correctly tagged the IANA-TZ issue at Suggestion when it later turned out the default codepath populates it (so a re-rating to Important is justified), and it did not inflate the redundant-validation finding (S3) the way Gemma did.
Codex ¶
Codex contributed the only unique find that materially raises the merge bar: I5 (ComputerName length cap vs. Lima identifier length). It also independently caught the test failure (I1 in its report), the hardcoded amd64, the XML-escaping risk, and the IANA-TZ mismatch — and it correctly named the conflict with templates/default.yaml:414-415's documented zoneinfo semantics. The one calibration miss: it rated the test failure as Important when it's a CI blocker. Otherwise tight, citation-rich, and low on noise.
Gemini ¶
Gemini found the same headline issues (hardcoded amd64, XML escaping) and contributed two unique angles: I3 (SSH-key comments with -- breaking the XML comment block) and the Joliet-extension miss (S5 in the consolidated report, demoted from Important to Suggestion because Windows guests don't yet consume cidata.iso). It missed the test failure: it noticed the assertion was edited (its I5) but reported only "coverage was lost," not "the test now fails." A git blame/go test step would have promoted that finding two severities; Gemini's known no-git blame posture cost it here.
Qwen ¶
Qwen failed on its first attempt — opencode auto-rejected its initial read because the prompt was sent with the worktree at its unresolved /var/folders/... path, while opencode canonicalised --dir to /private/var/folders/.... After a launch-helper fix that resolves the worktree path with filepath.EvalSymlinks (and a preamble update that anchors __WORKTREE__ literally in the in-tree-only sentence), Qwen produced a real review. The contribution was modest: it duplicated the redundant-validation suggestion (S3) and the XML-escape Important (I2), and seconded the design observation about GenerateISO9660 running unconditionally for Windows. Its only unique find was S6 (NewOS lacks the "freebsd" arm) — small but legitimate, and a miss for every other agent. Calibration was mixed: it mis-rated the XML-escape risk as Suggestion ("likelihood is low"), missing that User.Comment is explicitly unvalidated; and claimed TestValidateMultipleErrors "now passes" without running it, confirming the same execution gap as Gemini and Gemma.
Gemma ¶
Gemma's findings (redundant ValidateTemplateArgs call, unnecessary os.RemoveAll before WriteFile, missing docstrings) are all real but minor — the kind of items a Suggestion-tier reviewer should turn up. Two calibration issues:
- It flagged
os.RemoveAllas unnecessary, missing that the existing file is written with mode0o444—WriteFile's open+truncate fails on a read-only file, so the remove is load-bearing. False positive. - It noted "case-insensitive handling in
NewOS" as a strength.NewOSis case-sensitive; it only matches"windows"or"Windows"because both are explicitly listed (and that listing happens to be the dead code Claude flagged in S1). Misread.
It also missed every Windows-correctness issue (XML escaping, hardcoded amd64, IANA TZ, ComputerName cap) and the critical test failure. Coverage misses: it marked autounattend.xml.tmpl, lima_yaml.go, validate.go, and validate_test.go as "Reviewed, no issues" while each contains at least one Important or Critical issue from the consolidated report.
Summary ¶
| Claude Opus 4.7 | Codex GPT 5.5 | Gemini 3.1 Pro | Qwen 3.6 27B | Gemma 31B | |
|---|---|---|---|---|---|
| Duration | 5m 48s | 7m 25s | — | 0m 26s (fix-up) | 3m 55s |
| Findings | 1C 2I 4S | 4I 1S | 4I 1S | 1I 2S | 3S |
| Tool calls | 32 (Read 16, Grep 9, Bash 6) | 77 (shell 77) | — | 8 (read 8) | — |
| Design observations | 4 | 1 | 0 | 1 | 1 |
| False positives | 0 | 0 | 0 | 0 | 1 |
| Unique insights | 3 | 1 | 2 | 1 | 0 |
| Files reviewed | 8 | 8 | 8 | 8 | 8 |
| Coverage misses | 0 | 0 | 2 | 0 | 4 |
| Totals | 1C 2I 4S | 4I 1S | 4I 1S | 1I 2S | 3S |
| Downgraded | 0 | 0 | 1 (I→S) | 1 (I→S) | 1 (I→S) |
| Dropped | 0 | 0 | 0 | 0 | 1 |
Claude and Codex did the heavy lifting. Either one alone would have produced an actionable review; together they covered the entire PR including the CI blocker and the unique ComputerName cap. Gemini added two further angles for free. Qwen (after the permission-system fix-up) contributed one unique find (S6 FreeBSD case) and seconded three existing findings — calibration is comparable to Gemma's. Gemma's findings are real but small, and two were calibration misses; it's a useful third opinion on the trivial-to-medium tier but should not be relied on for severity calls.
Reconciliation ¶
- Codex I1 (lowercase windows test failure): Important → Critical (C1). Reason: failing test blocks CI on merge.
- Claude S3 (TimeZone IANA mismatch): Suggestion → Important (I4). Reason:
FillDefaultpopulates the value from the host by default, so the bad output is the default codepath rather than an edge case. - Gemini I4 (Joliet for Windows): Important → Suggestion (S5). Reason: Windows guests do not yet consume
cidata.iso; pre-emptive fix only. - Gemma I1 (redundant
ValidateTemplateArgs): Important → Suggestion (S3). Reason: behaviour is harmless and matches the existingGenerateCloudConfigpattern. - Gemma S1 (unnecessary
os.RemoveAll): dropped as false positive. Reason: target file is0o444;WriteFile's O_TRUNC fails without the prior remove. - Qwen I2 (redundant
ValidateTemplateArgs): Important → Suggestion (S3). Reason: same as Gemma's downgrade — harmless, matches existing pattern. - Qwen S1 (XML escaping): Suggestion → Important (I2). Reason: Qwen judged "likelihood is low" but
Useris documented as not validated, so user-controlled fields routinely reach the template path.
Review Process Notes ¶
Skill improvements ¶
- Execute affected tests when an agent finds an assertion edit. When a finding cites an edited expected-error string, an asserted output, or a deleted-then-restored test case, the agent should attempt to run that test before classifying the finding. The difference between "this test no longer covers X" (low severity) and "this test fails on
main" (CI blocker) is onego test/pytest/ equivalent invocation away, and confusing the two miscalibrates the verdict on the entire PR. Reviewers who can read code but skip executing it produce systematically softer findings on test edits. - Treat YAML/JSON enum acceptance as a normalisation question, not a constant-list question. When a code change adds a value to a
switchover string constants that originate from user-supplied YAML, the reviewer must trace the path from "raw YAML field" to "switch input" and verify case/format normalisation, not just confirm the new arm exists. The default mental model — "if it's in the switch, it works" — misses cases where the YAML loader returns the input verbatim and the constant differs in capitalisation, spacing, or punctuation. Flag any new enum addition that has no corresponding normalisation step as a Critical (failing acceptance test) or Important (silent rejection of the documented spelling), not a Suggestion. - For any change that adds an item to an existing enumeration, audit symmetry across the codebase. Enumerations include string-constant lists,
switcharms, dispatch tables, validation cases, type registries, normalisation maps, and parallel constant blocks. A new arm in one place often implies a missing arm in three others — case normalisation, default initialisation, formatter output, doc rendering, error reporting. Before approving such a change, grep every reference to the enumeration's existing items and confirm the new item is wired in everywhere the others are; flag each asymmetry as a separate finding (severity matches what breaks when the asymmetric path executes). Models that read the diff in isolation routinely miss this because the diff itself looks complete; the gap lives in the files the diff does not touch.
Repo context updates ¶
- Lima's
ResolveOS/ResolveArchdo not canonicalise user input.pkg/limayaml/defaults.go:1019-1031returns the user's string verbatim whenever it is non-empty and non-"default". The capitalised constants (Linux,Darwin,FreeBSD,Windows) only match because users conventionally write them capitalised; lowercase spellings produce validation errors despite being human-equivalent. Any PR that touchesos:orarch:validation should be checked against both the canonical and lowercase spellings.