Deep Review: 20260405-232458-pr-2639

Date: 2026-04-05 23:24
Repo: mikefarah/yq
Round: 1
Mode: Thorough (3 passes)
Author: @copilot-swe-agent (bot)
PR: #2639 — Add string slicing support
Branch: copilot/add-string-slice-support
Commits:
  341e2524 Fix array slice out-of-bounds panic with very negative indices
  778088d7 Update pkg/yqlib/operator_slice.go
  9a9399ad Fix sliceStringNode signature and fix test descriptions/expressions
  6d345ac7 Add string slicing support to yq
  26929879 Initial plan
Reviewers: Claude Opus 4.6, Codex GPT 5.4, Gemini 3.1 Pro
Verdict: Merge with fixes — fix byte-vs-rune length inconsistency and add missing string edge-case tests
Wall-clock time: 54 min 32 s

Executive Summary

This PR adds rune-based string slicing to yq (.[0:5], .[4:], .[-5:] on strings), fixes traverseArrayOperator to evaluate the LHS before slicing, and hardens array slicing against very negative indices. The string slicing implementation is correct for the common case, but it exposes a byte-vs-rune inconsistency with existing operators (length, match) that produces silently wrong results for multibyte Unicode strings. The PR also needs stronger edge-case test coverage and minor structural fixes.


Critical Issues

None.


Important Issues

I1. Byte-vs-rune length inconsistency produces wrong results for multibyte strings (Claude, Gemini)

The length operator returns byte count, but sliceStringNode uses rune count. For multibyte strings, length-based arithmetic produces wrong slice results. This is not merely masked by clamping — it actively produces incorrect output:

$ echo '"café"' | yq '(. | length) as $n | .[($n - 2):]'
é           # WRONG — expected "fé" (last 2 chars)

$ echo '"café"' | yq '.[-2:]'
fé          # correct with negative index

The match operator's byte offsets at operator_strings.go:292 have the same issue: feeding match("latte").offset into a slice on a multibyte string yields wrong boundaries.

jq is internally consistent: "café" | length returns 4 (characters), and slicing is character-based. This PR correctly chose rune-based slicing but did not update the pre-existing byte-based length operator. (important, gap)
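
The mismatch reduces to Go's two notions of string length; a minimal standalone check:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	s := "café" // "é" occupies 2 bytes in UTF-8
	fmt.Println(len(s))                    // 5: byte count, what length currently returns
	fmt.Println(len([]rune(s)))            // 4: rune count, what sliceStringNode uses
	fmt.Println(utf8.RuneCountInString(s)) // 4: same count, without the []rune allocation
}
```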

Fix: Update lengthOperator to use len([]rune(candidate.Value)) for non-null scalars, and update addMatch to convert byte offsets to rune offsets via utf8.RuneCountInString. Note: changing length is a behavioural change for multibyte strings and warrants a changelog entry.

Current implementation (operator_length.go):

	for el := context.MatchingNodes.Front(); el != nil; el = el.Next() {
		candidate := el.Value.(*CandidateNode)
		var length int
		switch candidate.Kind {
		case ScalarNode:
			if candidate.Tag == "!!null" {
				length = 0
			} else {
				length = len(candidate.Value)
			}
		case MappingNode:
			length = len(candidate.Content) / 2
		case SequenceNode:
			length = len(candidate.Content)
		default:
			length = 0
		}

Fix (diff):

 // operator_length.go
 case ScalarNode:
     if candidate.Tag == "!!null" {
         length = 0
     } else {
-        length = len(candidate.Value)
+        length = len([]rune(candidate.Value))
     }
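
For the addMatch half of the fix, converting a byte offset to a rune offset might look like the following sketch (the helper name is illustrative, not from the yq codebase):

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteToRuneOffset converts a byte offset into s to the corresponding
// rune (character) offset, matching jq's character-based counting.
// Hypothetical helper for the proposed addMatch fix.
func byteToRuneOffset(s string, byteOffset int) int {
	return utf8.RuneCountInString(s[:byteOffset])
}

func main() {
	s := "café latte"
	// "latte" starts at byte offset 6 ("é" is 2 bytes) but rune offset 5.
	fmt.Println(byteToRuneOffset(s, 6)) // 5
}
```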

I2. Inconsistent index-clamping structure between string and array paths (Claude, Gemini)

The string path uses else if, so a very negative secondNumber that remains negative after adjustment never gets clamped to 0 explicitly. It works only because the later secondNumber < firstNumber guard at line 43 catches it. The array path uses separate if blocks that handle this directly. (important, regression)

func sliceStringNode(lhsNode *CandidateNode, firstNumber int, secondNumber int) *CandidateNode {
	runes := []rune(lhsNode.Value)
	length := len(runes)

	relativeFirstNumber := firstNumber
	if relativeFirstNumber < 0 {
		relativeFirstNumber = length + firstNumber
	}
	if relativeFirstNumber < 0 {
		relativeFirstNumber = 0
	}

	relativeSecondNumber := secondNumber
	if relativeSecondNumber < 0 {
		relativeSecondNumber = length + secondNumber
	} else if relativeSecondNumber > length {
		relativeSecondNumber = length
	}

	log.Debugf("sliceStringNode: slice from %v to %v", relativeFirstNumber, relativeSecondNumber)

	if relativeFirstNumber > length {
		relativeFirstNumber = length
	}
	if relativeSecondNumber < relativeFirstNumber {
		relativeSecondNumber = relativeFirstNumber
	}

	slicedString := string(runes[relativeFirstNumber:relativeSecondNumber])
	replacement := lhsNode.CreateReplacement(ScalarNode, lhsNode.Tag, slicedString)
	replacement.Style = lhsNode.Style
	return replacement
}

Fix (diff): Align the string path with the array path:

 relativeSecondNumber := secondNumber
 if relativeSecondNumber < 0 {
     relativeSecondNumber = length + secondNumber
- } else if relativeSecondNumber > length {
+ }
+ if relativeSecondNumber < 0 {
+     relativeSecondNumber = 0
+ } else if relativeSecondNumber > length {
     relativeSecondNumber = length
 }
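
The corrected clamping can be expressed as a pure function; a sketch of the intended semantics (yq's real code operates on CandidateNode, not bare ints):

```go
package main

import "fmt"

// clampSlice mirrors the aligned clamping logic: negative indices count
// from the end, both bounds are clamped into [0, length], and the end
// is never allowed to precede the start. Illustrative only.
func clampSlice(length, first, second int) (int, int) {
	if first < 0 {
		first += length
	}
	if first < 0 {
		first = 0
	} else if first > length {
		first = length
	}
	if second < 0 {
		second += length
	}
	if second < 0 {
		second = 0
	} else if second > length {
		second = length
	}
	if second < first {
		second = first
	}
	return first, second
}

func main() {
	fmt.Println(clampSlice(5, -100, 3)) // very negative start clamps to 0
	fmt.Println(clampSlice(5, 0, -100)) // very negative end clamps to 0, no panic
}
```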

I3. Missing test coverage for string edge cases tested for arrays (Claude, Codex)

The PR added array tests for very negative indices (.[-100:], .[:-100]) but no corresponding string tests. Also missing: Unicode strings with implicit end index, empty string slicing, and non-string scalar slicing. (important, gap)

Untested scenarios ranked by risk:

  1. Unicode string with implicit end index (.greeting[2:] on "héllo") — exercises the byte-vs-rune mismatch
  2. Very negative string start (.country[-100:]) — tests clamping parity with arrays
  3. Very negative string end (.country[:-100]) — tests clamping parity with arrays
  4. Empty string slicing (.val[0:5] on "")
  5. Non-string scalar slicing (.value[0:2] on !!int)

Fix: Add test scenarios with skipDoc: true for each of these cases.
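
A sketch of two of the missing scenarios, assuming yq's existing expressionScenario test fixture (field names and the expected-output format should be verified against the actual test file):

```go
// Hypothetical additions to the slice operator test table.
{
	description: "Slice unicode string with implicit end",
	skipDoc:     true,
	document:    `greeting: héllo`,
	expression:  `.greeting[2:]`,
	expected:    []string{"D0, P[greeting], (!!str)::llo\n"},
},
{
	description: "Slice string with very negative start",
	skipDoc:     true,
	document:    `country: Australia`,
	expression:  `.country[-100:]`,
	expected:    []string{"D0, P[country], (!!str)::Australia\n"},
},
```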

I4. Slicing non-string scalars produces invalid tagged output (Claude)

When .value[0:2] targets a !!int scalar, the string guard correctly rejects it, but the code falls through to the array path. Since Content is nil for scalar nodes, the function returns an empty sequence tagged with the scalar's type (e.g., !!int []). (important, gap)

		if lhsNode.Kind == ScalarNode && lhsNode.guessTagFromCustomType() == "!!str" {
			results.PushBack(sliceStringNode(lhsNode, firstNumber, secondNumber))
			continue
		}
		// ... falls through to array path for !!int, !!bool, !!float ...

Fix: Add an explicit guard before the array fallthrough:

 if lhsNode.Kind == ScalarNode && lhsNode.guessTagFromCustomType() == "!!str" {
     results.PushBack(sliceStringNode(lhsNode, firstNumber, secondNumber))
     continue
 }
+if lhsNode.Kind == ScalarNode {
+    return Context{}, fmt.Errorf("cannot slice scalar of type %v, only arrays and strings can be sliced", lhsNode.Tag)
+}

I5. Slice index expressions evaluated against LHS instead of original context (Gemini)

sliceArrayOperator uses context to evaluate index expressions via getSliceNumber. After the PR, context is lhsContext (the sliced node), not the original document context. For expressions like .a[0:.max], .max is now evaluated against .a instead of the root — producing "cannot index array with 'max'".

Before the PR, the same expression sliced the wrong target (root document instead of .a) — so this was never correct either. The PR fixed one half (correct target) without fixing the other (correct index context). (important, gap)

	if expressionNode.RHS != nil && expressionNode.RHS.RHS != nil && expressionNode.RHS.RHS.Operation.OperationType == createMapOpType {
		lhsContext, err := d.GetMatchingNodes(context, expressionNode.LHS)
		if err != nil {
			return Context{}, err
		}
		return sliceArrayOperator(d, lhsContext, expressionNode.RHS.RHS)
	}

Fix: Pass both the original context (for evaluating indices) and the LHS context (for providing nodes) to sliceArrayOperator. One approach: evaluate indices in traverseArrayOperator before calling sliceArrayOperator, then pass the resolved integers.
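
The shape of that approach in miniature — resolve the bounds against the original (root) scope before slicing the target. This is a generic toy model, not yq code; all names are hypothetical:

```go
package main

import "fmt"

// Toy model: a "document" maps names to values. A slice bound may be a
// literal or a reference, and references must resolve against the ROOT
// document, never against the value being sliced.
type doc map[string]interface{}

func resolveBound(root doc, ref string, literal int, isRef bool) int {
	if isRef {
		return root[ref].(int) // evaluated in the original context
	}
	return literal
}

func main() {
	root := doc{"max": 2, "a": []string{"x", "y", "z"}}
	arr := root["a"].([]string)
	hi := resolveBound(root, "max", 0, true) // models .a[0:.max]
	fmt.Println(arr[0:hi])                   // [x y]
}
```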


Suggestions

S1. Commit structure needs cleanup before merge (Claude)

The 5 commits include an empty "Initial plan" commit, a lint-fix commit, and a panic-fix commit that should all be squashed into the main feature commit before merging. 778088d7 (Mike Farah's direct edit) can remain separate.

S2. Lexer openCollect change affects standalone [: expr] (Claude, Gemini)

The added openCollect condition correctly enables .field[:N] but also fires on standalone [ (array literal start). [: "b"] now silently produces [{0: b}] instead of an error. Unlikely to affect real users, but broadens accepted syntax unintentionally. (suggestion, gap)

	if tokenIsOpType(currentToken, createMapOpType) {
		log.Debugf("tokenIsOpType: createMapOpType")
		// check the previous token is '[', means we are slice, but dont have a first number
		if index > 0 && (tokens[index-1].TokenType == traverseArrayCollect || tokens[index-1].TokenType == openCollect) {
			log.Debugf("previous token is : traverseArrayOpType")
			// need to put the number 0 before this token, as that is implied
			postProcessedTokens = append(postProcessedTokens, &token{TokenType: operationToken, Operation: createValueOperation(0, "0")})
		}
	}

Fix: Tighten the condition to require that openCollect follows a traversal token.

S3. Doc header "Splice" misleading for strings (Claude)

The header was updated from "Slice/Splice Array" to "Slice/Splice Array or String". "Splice" implies in-place mutation, which is not supported for strings. Consider "Slice Array or String" instead.

S4. Documentation uses only ASCII examples (Claude)

All four string slicing doc sections use country: Australia (ASCII-only). Since the feature's distinguishing value is rune-based Unicode handling, one documented example with a multibyte string would demonstrate this.


Design Observations

Concerns

Strengths


Testing Assessment

Test coverage for core string slicing (both ends, negative indices, out-of-bounds, Unicode, full-string [:]) is solid. Gaps ranked by risk:

  1. Unicode string with implicit end index (.greeting[2:]) — exercises byte-vs-rune mismatch path
  2. Non-string scalar slicing (.value[0:2] on !!int) — produces invalid output (I4)
  3. Very negative string indices (.country[-100:], .country[:-100]) — tests clamping parity
  4. Empty string slicing — boundary condition for zero-length rune slices
  5. Expression-based slice bounds (.a[0:.max]) — exercises the context evaluation issue (I5)

Documentation Assessment

The generated slice-array.md documentation correctly demonstrates string slicing examples. The header update to "Slice/Splice Array or String" is accurate for slicing but misleading for the "Splice" part regarding strings (S3). All examples use ASCII-only strings (S4).


Commit Structure


Agent Performance Retro

Claude Opus 4.6

Claude produced the most findings across all three passes and demonstrated the deepest analysis. In pass 1, it identified the byte-vs-rune length mismatch (I1), the inconsistent clamping logic (I2), and the test coverage gaps (I3). In pass 2, it found the non-string scalar fallthrough (I4) and the lexer side-effect (S2). In pass 3, it deepened I1 with concrete reproduction cases showing that the mismatch produces wrong output (not just a theoretical concern), and added documentation suggestions (S4).

Claude ran git blame 4 times in pass 1 to verify regression classification. It read 10 files and used Grep 7 times to trace call sites, demonstrating thorough cross-file analysis.

Follow-up pass performance: Pass 2 produced 1 important + 2 suggestions (all new). Pass 3 produced 1 important (restated with stronger evidence) + 2 suggestions (new). Pass 3's key contribution was proving I1 causes real data corruption, not just a theoretical concern.

Codex GPT 5.4

Codex found no issues at any severity level. It reviewed all 7 files and concluded "no correctness regressions in the shipped changes." Its testing assessment correctly noted the Unicode implicit-end gap and expression-valued bounds gap, but it did not promote these to findings. Codex ran 40 tool calls but relied heavily on nl (12 calls) and printf (10) for file reading rather than targeted analysis. It ran git blame only once.

Coverage miss: Codex marked operator_slice.go as "Reviewed, no issues" — but Claude and Gemini both found important issues there (I1, I2, I4). Dropped after pass 1 (0 findings).

Gemini 3.1 Pro

Gemini found the byte-vs-rune length mismatch (I1, shared with Claude) and the clamping inconsistency (I2, shared with Claude, reported at a different severity) in pass 1. In pass 2, it identified the context evaluation issue (I5) and independently found the lexer side-effect (S2, shared with Claude). Pass 3 explored operator interactions (assignment, deletion, maps, null indices), but these were pre-existing gaps outside the PR scope.

Gemini never ran git blame (known limitation due to quota constraints). It made 37 tool calls in pass 1, focusing on grep_search (9) and shell commands (23).

Follow-up pass performance: Pass 2 produced 1 critical (downgraded to I5) + 1 important (deduplicated to S2). Pass 3 produced 4 importants, but 1 was a duplicate and the rest were pre-existing gaps — downgraded to suggestions or dropped during consolidation.

Tool Call Highlights

Summary

                       Claude Opus 4.6          Codex GPT 5.4            Gemini 3.1 Pro
Duration (pass 1)      462s                     217s                     470s
Findings (pass 1)      0C 3I 1S                 0C 0I 0S                 0C 1I 1S
Tool calls (pass 1)    24 (Read 10, Grep 7)     40 (exec 33, write 7)    37 (shell 23, grep 9)
Duration (pass 2)      781s                     —                        484s
Findings (pass 2)      0C 1I 2S                 —                        0C 1I 0S + 1 dup
Tool calls (pass 2)    58                       —                        49
Duration (pass 3)      895s                     —                        623s
Findings (pass 3)      0C 0I 2S + 1 restated    —                        0C 0I 0S + 3 dropped + 1 dup
Tool calls (pass 3)    69                       —                        72
Totals                 0C 4I 5S                 0C 0I 0S                 0C 2I 1S
Design observations    3 concerns, 4 strengths  0 concerns, 3 strengths  1 concern, 2 strengths
False positives        0                        0                        0
Unique insights        I2, I3, I4, S1, S3, S4   0                        I5
Files reviewed         7                        7                        7
Coverage misses        0                        1 (operator_slice.go)    0
Downgraded             0                        0                        1 (C→I: context evaluation)
Dropped                0                        0                        4 (pass 3 pre-existing gaps)
Severity inflation     0                        0                        1 (C1 pass 2)

Reconciliation:

  - Gemini P2 C1 (context evaluation): critical → important I5 (pre-existing limitation, not a regression — the old code also mishandled this case).
  - Gemini P3 I1 (splice assignment): dropped (a pre-existing "Splice" feature for strings never existed; already covered by S3).
  - Gemini P3 I2 (del on slice): dropped (pre-existing behaviour unchanged by this PR).
  - Gemini P3 I3 (slicing maps): dropped (duplicate of I4 non-string scalar handling).
  - Gemini P3 I4 (null indices): dropped (pre-existing gap, unrelated to this change).

Claude provided the most value overall, identifying 4 of 5 important findings and all 4 suggestions. Gemini's unique contribution was I5 (context evaluation). Codex provided no actionable findings.


Review Process Notes

Repo Context Updates