Deep Review: 20260327-223819-pr-4637

Date2026-03-27 22:38
Repolima-vm/lima
Round2 (of PR #4637)
Author@unsuman
PR#4637 — Ensure only RAW images in cache(imgconv/raw) while keeping the original image
Branchfeature/disk-cache
Commitse03a4c24 Ensure only RAW images in cache while keeping orginal image
ReviewersClaude Opus 4.6, Codex GPT 5.4, Gemini 3.1 Pro
VerdictMerge with fixes — Sound design, but the conversion logic overwrites digest state and breaks cache freshness for digest-less images
Wall-clock time20 min 50 s

Executive Summary

This PR adds on-demand conversion of non-raw disk images to raw format in the download cache. Drivers declare which formats they support via a new SupportedImageFormats field, and the downloader converts only when necessary. The design preserves the original cached image for drivers that support it (e.g., QEMU with qcow2). The main issues center on the conversion logic overwriting expectedDigest and shadDigest, which breaks both user-provided digest validation and Last-Modified freshness checks for digest-less images. A crash between writing the raw file and its digest file leaves the cache in an unverifiable state.


Critical Issues

None.


Important Issues

1. User-provided expected digest silently overwritten Gemini

pkg/downloader/downloader.go, important, regression

				} else {
					origStat, _ := os.Stat(shadData)
					if origStat.ModTime().After(rawStat.ModTime()) {
						needConvert = true
					}
				}

				if needConvert {
					logrus.Infof("Converted raw image is missing or stale; (re)converting now.")
					converted, rawDigest, err := ensureRawInCache(ctx, shadData, imageFormat, o.expectedDigest)
					if err != nil {
						return nil, err
					}
					shadData = converted
					o.expectedDigest = rawDigest
					shadDigest = rawImgConvDigestPath
				} else {
					shadData = rawImgConvPath
					if currentDigestData, err := os.ReadFile(rawImgConvDigestPath); err == nil {
						currentDigest := strings.TrimSpace(string(currentDigestData))
						if d, err := digest.Parse(currentDigest); err == nil {
							o.expectedDigest = d
							shadDigest = rawImgConvDigestPath
						}
					}
				}
			}
		}

When an image requires conversion, lines 311–313 replace o.expectedDigest with the raw image's digest and redirect shadDigest to rawImgConvDigestPath. The original user-provided digest (from the template) is never validated against the cached source image. Later, validateCachedDigest at line 329 compares the raw digest against its own file — a self-confirming check.

	ext := path.Ext(remote)
	logrus.Debugf("file %q is cached as %q", localPath, shadData)
	if _, err := os.Stat(shadDigest); err == nil {
		logrus.Debugf("Comparing digest %q with the cached digest file %q, not computing the actual digest of %q",
			o.expectedDigest, shadDigest, shadData)
		if err := validateCachedDigest(shadDigest, o.expectedDigest); err != nil {
			return nil, err
		}
		if err := copyLocal(ctx, localPath, shadData, ext, o.decompress, "", ""); err != nil {
			return nil, err
		}
	} else {
		if match, lmCached, lmRemote, err := matchLastModified(ctx, shadTime, remote); err != nil {
			logrus.WithError(err).Info("Failed to retrieve last-modified for cached digest-less image; using cached image.")
		} else if match {
			if err := copyLocal(ctx, localPath, shadData, ext, o.decompress, o.description, o.expectedDigest); err != nil {
				return nil, err
			}
		} else {
			logrus.Infof("Re-downloading digest-less image: last-modified mismatch (cached: %q, remote: %q)", lmCached, lmRemote)
			return nil, nil
		}
	}
	res := &Result{

If an upstream image at the same URL is updated and the user changes the expected digest in their template, the stale cached image is served without detection.

Fix: Validate the original cached file against the caller's expectedDigest before running conversion logic. Move the image-conversion block after the existing digest validation (lines 329–340), or validate explicitly within the conversion block:

 if needConvert {
+    if o.expectedDigest != "" && shadDigest != "" {
+        if err := validateCachedDigest(shadDigest, o.expectedDigest); err != nil {
+            return nil, nil // force re-download
+        }
+    }
     converted, rawDigest, err := ensureRawInCache(ctx, shadData, imageFormat, o.expectedDigest)
2. Digest integrity gap when raw.digest is missing Claude Codex Gemini

pkg/downloader/downloader.go, important, regression

			} else {
				shadData = rawImgConvPath
				if currentDigestData, err := os.ReadFile(rawImgConvDigestPath); err == nil {
					currentDigest := strings.TrimSpace(string(currentDigestData))
					if d, err := digest.Parse(currentDigest); err == nil {
						o.expectedDigest = d
						shadDigest = rawImgConvDigestPath
					}
				}
			}
		}
	}

ensureRawInCache renames the raw file into place (line 453) before writing raw.digest (line 467). If the process crashes between these two operations, raw.digest is missing on restart. The else block at line 314 sets shadData to the raw path but silently falls through when ReadFile fails, leaving shadDigest and o.expectedDigest unchanged. The subsequent validation at line 329 then validates the original digest file against the original expected digest — which passes — but copies the unverified raw file (via shadData).

		return "", "", fmt.Errorf("failed to close raw tmp file %q: %w", rawPathTmp, err)
	}

	if err := os.Rename(rawPathTmp, rawImgConvPath); err != nil {
		return "", "", fmt.Errorf("failed to replace original with raw image: %w", err)
	}

	algo := digest.Canonical
	if originalDigest != "" {
		algo = originalDigest.Algorithm()
	}
	rawDigest, err := calculateFileDigest(rawImgConvPath, algo)
	if err != nil {
		return "", "", fmt.Errorf("failed to calculate digest of raw image: %w", err)
	}

	rawDigestPath := filepath.Join(imgConvPath, "raw.digest")
	if err := os.WriteFile(rawDigestPath, []byte(rawDigest.String()), 0o644); err != nil {
		return "", "", fmt.Errorf("failed to write raw digest file: %w", err)
	}

	return rawImgConvPath, rawDigest, nil
}

Fix: Treat a raw file without a valid raw.digest as incomplete. Force reconversion:

 } else {
 	shadData = rawImgConvPath
 	if currentDigestData, err := os.ReadFile(rawImgConvDigestPath); err == nil {
 		currentDigest := strings.TrimSpace(string(currentDigestData))
 		if d, err := digest.Parse(currentDigest); err == nil {
 			o.expectedDigest = d
 			shadDigest = rawImgConvDigestPath
+		} else {
+			needConvert = true
 		}
+	} else {
+		needConvert = true
 	}
 }

Then re-execute conversion if needConvert was set in the else block.

3. Converted digest-less images become permanently stale Codex

pkg/downloader/downloader.go, important, regression

			} else {
				shadData = rawImgConvPath
				if currentDigestData, err := os.ReadFile(rawImgConvDigestPath); err == nil {
					currentDigest := strings.TrimSpace(string(currentDigestData))
					if d, err := digest.Parse(currentDigest); err == nil {
						o.expectedDigest = d
						shadDigest = rawImgConvDigestPath
					}
				}
			}
		}
	}

When a cached raw image exists with a valid raw.digest, lines 318–320 populate o.expectedDigest and shadDigest. The subsequent check at line 329 (os.Stat(shadDigest) succeeds) routes into the digest-validation branch and skips the matchLastModified check at line 339. Images that previously revalidated via Last-Modified become permanently cached after conversion, never detecting upstream updates.

	ext := path.Ext(remote)
	logrus.Debugf("file %q is cached as %q", localPath, shadData)
	if _, err := os.Stat(shadDigest); err == nil {
		logrus.Debugf("Comparing digest %q with the cached digest file %q, not computing the actual digest of %q",
			o.expectedDigest, shadDigest, shadData)
		if err := validateCachedDigest(shadDigest, o.expectedDigest); err != nil {
			return nil, err
		}
		if err := copyLocal(ctx, localPath, shadData, ext, o.decompress, "", ""); err != nil {
			return nil, err
		}
	} else {
		if match, lmCached, lmRemote, err := matchLastModified(ctx, shadTime, remote); err != nil {
			logrus.WithError(err).Info("Failed to retrieve last-modified for cached digest-less image; using cached image.")
		} else if match {
			if err := copyLocal(ctx, localPath, shadData, ext, o.decompress, o.description, o.expectedDigest); err != nil {
				return nil, err
			}
		} else {
			logrus.Infof("Re-downloading digest-less image: last-modified mismatch (cached: %q, remote: %q)", lmCached, lmRemote)
			return nil, nil
		}
	}
	res := &Result{

Fix: Preserve remote freshness and local integrity as separate checks. For callers without a user-provided digest, run matchLastModified before serving imgconv/raw; use raw.digest only for local integrity verification:

+	originalExpectedDigest := o.expectedDigest
 	// ... conversion block ...
+	// After conversion, restore original digest state for freshness logic
+	if originalExpectedDigest == "" {
+		o.expectedDigest = ""
+		shadDigest = ""
+	}
4. Fragile image detection by description string parsing Claude Codex

pkg/downloader/downloader.go and pkg/fileutils/download.go, important, regression

func isNonISOImage(description, remote string, supportedImageFormats []string) bool {
	return strings.Contains(description, "image") && supportedImageFormats != nil && path.Ext(remote) != ".iso"
}
		downloader.WithCache(),
		downloader.WithDecompress(decompress),
		downloader.WithDescription(fmt.Sprintf("%s (%s)", description, path.Base(f.Location))),
		downloader.WithExpectedDigest(f.Digest),
	}
	if strings.Contains(description, "image") && supportedImageFormats != nil {
		opts = append(opts, downloader.WithImageFormats(supportedImageFormats))
	}

Both locations parse the human-readable description string to decide whether to trigger image conversion. A future caller that uses "image" in a description for a non-disk artifact (e.g., "container image manifest") would trigger unwanted conversion. This matches AkihiroSuda's unresolved review comment: "description is not expected to be parsed."

Fix: Remove description parsing. Guard by supportedImageFormats alone, as AkihiroSuda suggested:

 // fileutils/download.go
-if strings.Contains(description, "image") && supportedImageFormats != nil {
+if len(supportedImageFormats) > 0 {

 // downloader.go
 func isNonISOImage(description, remote string, supportedImageFormats []string) bool {
-	return strings.Contains(description, "image") && supportedImageFormats != nil && path.Ext(remote) != ".iso"
+	return supportedImageFormats != nil && path.Ext(remote) != ".iso"
 }
5. EnsureDisk rename path skips disk resize Claude Gemini

pkg/driverutil/disk.go, important, regression

		}
	} else {
		format, err := nativeimgutil.DetectFormat(imagePath)
		if err != nil {
			return err
		}

		if format == string(diskImageFormat) {
			if err = os.Rename(imagePath, diskPath); err != nil {
				return err
			}
		} else {
			if err := diskUtil.Convert(ctx, diskImageFormat, imagePath, diskPath, &diskSizeInBytes, false); err != nil {
				return fmt.Errorf("failed to convert %q to %q: %w", imagePath, diskPath, err)
			}
			os.Remove(imagePath)
		}
	}

The old code always called diskUtil.Convert, which both converted and resized to diskSizeInBytes. The new rename path at line 105 skips the resize. This works today because prepareDisk at pkg/instance/start.go:135 handles resizing after CreateDisk. But EnsureDisk was previously self-contained; it now silently depends on its caller to resize, an implicit contract change.

Fix: Either add the resize call or document the dependency:

+		// Resize handled by prepareDisk() after CreateDisk().
 		if format == string(diskImageFormat) {
 			if err = os.Rename(imagePath, diskPath); err != nil {
 				return err
 			}

Suggestions

1. origStat nil-pointer risk on Stat failure Claude

pkg/downloader/downloader.go, suggestion, regression

					return nil, err
				}
			} else {
				origStat, _ := os.Stat(shadData)
				if origStat.ModTime().After(rawStat.ModTime()) {
					needConvert = true
				}
			}

If os.Stat fails (e.g., file deleted between the existence check at line 278 and line 299), origStat is nil and origStat.ModTime() panics. Within the dir lock this is extremely unlikely, but the fix is trivial.

Fix:

-origStat, _ := os.Stat(shadData)
-if origStat.ModTime().After(rawStat.ModTime()) {
+origStat, err := os.Stat(shadData)
+if err != nil {
+	return nil, err
+}
+if origStat.ModTime().After(rawStat.ModTime()) {
2. Redundant MakeSparse after Convert Claude

pkg/downloader/downloader.go, suggestion, regression

	// Ensure the image is sparse to save cache space.
	rawTmpF, err := os.OpenFile(rawPathTmp, os.O_RDWR, 0o644)
	if err != nil {
		return "", "", fmt.Errorf("failed to open raw tmp file %q: %w", rawPathTmp, err)
	}
	fi, err := rawTmpF.Stat()
	if err != nil {
		_ = rawTmpF.Close()
		return "", "", fmt.Errorf("failed to stat raw tmp file %q: %w", rawPathTmp, err)
	}
	if err := diskUtil.MakeSparse(ctx, rawTmpF, fi.Size()); err != nil {
		logrus.WithError(err).Warnf("Failed to make %q sparse (non-fatal)", rawPathTmp)
	}

The native conversion path (nativeimgutil.convertTo) already creates a sparse file via makeSparse before copying data. The explicit MakeSparse call is redundant for the native path. It provides a safety net for the qemu-img fallback, but deserves a comment explaining why.

Fix: Add a comment explaining the redundancy rationale, or remove and rely on qemu-img convert producing sparse output by default.

3. WithImageFormats missing docstring Claude

pkg/downloader/downloader.go, suggestion, regression

// WithDecompress decompress the download from the cache.
func WithDecompress(decompress bool) Opt {
	return func(o *options) error {
		o.decompress = decompress
		return nil
	}
}

func WithImageFormats(supportedImageFormats []string) Opt {
	return func(o *options) error {
		o.supportedImageFormats = supportedImageFormats
		return nil
	}
}

All other With* option functions have docstrings (lines 81, 93, 102, 110, 125). This one lacks one. Add a Godoc comment explaining the option's purpose.

4. Incomplete QEMU supported format list Gemini

pkg/driver/qemu/qemu_driver.go, suggestion, enhancement

	info.Features = driver.DriverFeatures{
		DynamicSSHAddress:     false,
		SkipSocketForwarding:  false,
		CanRunGUI:             false,
		SupportedImageFormats: []string{string(qcow2.Type), string(rawImage.Type), string(vmdk.Type), string(vhdx.Type)}, // Not a comprehensive list of supported formats
	}
	return info

The QEMU format list omits natively supported formats like vdi, vpc, and parallels. A user downloading a vdi image would trigger an unnecessary raw conversion. The existing comment acknowledges the limitation, so this is low priority.


Design Observations

Concerns

Strengths


Testing Assessment

The new tests cover the key scenarios: fresh download with conversion (with and without digest), cache hit returning pre-converted raw, and pre-populated cache requiring on-demand conversion. The tests are meaningful and do not test mocked behavior.

Untested scenarios ranked by risk:

  1. Cache corruption/crash recovery: raw file exists but raw.digest is missing — the integrity gap from Finding 2
  2. Changed expected digest for cached URL: updating expectedDigest in the template while the URL remains unchanged — the validation bypass from Finding 1
  3. Digest-less image freshness: verifying that matchLastModified still triggers re-download after conversion — the staleness issue from Finding 3
  4. Driver that supports the original format: e.g., QEMU requesting a qcow2 — should return the original without conversion
  5. EnsureDisk rename path with resize: verifying disk reaches configured size when the image format already matches

Documentation Assessment

The internals.md change correctly documents the new imgconv/ cache subdirectory. The placement and formatting match the surrounding documentation style. No additional documentation gaps.


Commit Structure

Single commit bundles the entire feature: driver interface changes, downloader conversion logic, EnsureDisk optimization, tests, and documentation. Acceptable for a feature of this size. The commit message has a typo ("orginal" should be "original").


Acknowledged Limitations

  1. pkg/driver/qemu/qemu_driver.go:697// Not a comprehensive list of supported formats
    The QEMU format list intentionally omits uncommon formats. Now that format support drives conversion behavior, under-reporting can cause unnecessary conversions, but the comment acknowledges the tradeoff.

Unresolved Feedback

  1. Description parsing — @AkihiroSuda: "description is not expected to be parsed" and suggested if len(supportedImageFormats) > 0 {

    Actionable. Same issue as Important Finding #4.

  2. Pass non-nil only for images — @AkihiroSuda: "Just make sure to pass non-nil supportedImageFormats only for images"

    Actionable. Closely related to #1 — fixing Finding #4 by removing the description check and guarding on len(supportedImageFormats) > 0 naturally addresses this.

  3. Slice ordering — @AkihiroSuda: "Does the order of the slice matter?"

    Low priority. The slice is only used with slices.Contains; ordering has no effect on current behavior. Worth a brief reply in the PR but requires no code change.

  4. VZ format preference — @AkihiroSuda: "Maybe put asif before raw, assuming that asif is more preferable"

    Low priority. Same reason as #3 — slices.Contains ignores order. Good hygiene for future-proofing if the slice ever drives format selection, but no current impact.


Agent Performance Retro

Claude Opus 4.6

Claude Opus 4.6 produced the most thorough review. It identified the description-string parsing issue (Finding 4), the digest integrity gap (Finding 2), and the EnsureDisk resize concern (Finding 5). It also contributed the most suggestions (4). It traced into callers (prepareDisk) to correctly assess that the resize skip is mitigated, avoiding a false critical. It missed the digest-overwrite issue (Finding 1) and the Last-Modified staleness regression (Finding 3).

Duration: 10 min 17 s. Coverage: all 14 files reviewed. No false positives.

Codex GPT 5.4

Codex GPT 5.4 delivered a focused, high-signal review. Its unique contribution was Finding 3 (digest-less images become permanently stale) — a subtle interaction between the conversion logic and the existing freshness check that neither Claude nor Gemini caught. It also independently identified the digest integrity gap (Finding 2) and the description-parsing issue (as a suggestion). It was the most concise of the three agents.

Duration: 5 min 46 s. Coverage: all 14 files reviewed. No false positives.

Gemini 3.1 Pro

Gemini 3.1 Pro made the strongest claim — that the digest overwrite (Finding 1) and the EnsureDisk resize skip are critical. The resize claim was overrated: Gemini stated "the VM will boot with a disk size equal to the base image's original size," which is incorrect because prepareDisk handles resizing. The digest overwrite claim was valid but better rated as important (requires same-URL image update with changed digest). Gemini also flagged compressed image format detection as important, but no Lima templates use compressed disk images, making it a theoretical gap. It missed the Last-Modified staleness issue entirely.

Duration: 7 min 33 s. Coverage: marked 10 of 14 files as "Trivial" without reviewing their diffs in detail. No false positives in the findings raised, but severity inflation on two findings.

Summary

Metric Claude Opus 4.6 Codex GPT 5.4 Gemini 3.1 Pro
Duration10:175:467:33
Critical002 (downgraded)
Important322
Suggestion411
Design observations321
False positives000
Unique insights011
Files reviewed141414
Coverage misses000

Codex provided the best signal-to-noise ratio: the fastest completion time with the most impactful unique finding (the Last-Modified staleness regression). Claude provided the deepest review overall, correctly assessing severity and tracing mitigating code paths. Gemini contributed the only finding about digest validation bypass but overrated the severity of two findings.

The multi-agent approach proved its value: the three most important findings (digest overwrite, integrity gap, staleness regression) were each strongest in a different agent's review.


Review Process Notes

No suggestions for skill improvements or repo context updates from this review.