Deep Review: 20260405-232458-pr-2639

Date: 2026-04-05 23:24
Repo: mikefarah/yq
Round: 1
Mode: Thorough (3 passes)
Author: @copilot-swe-agent (bot)
PR: #2639 — Add string slicing support
Branch: copilot/add-string-slice-support
Commits:
  341e2524 Fix array slice out-of-bounds panic with very negative indices
  778088d7 Update pkg/yqlib/operator_slice.go
  9a9399ad Fix sliceStringNode signature and fix test descriptions/expressions
  6d345ac7 Add string slicing support to yq
  26929879 Initial plan
Reviewers: Claude Opus 4.6, Codex GPT 5.4, Gemini 3.1 Pro
Verdict: Merge with fixes — fix byte-vs-rune length inconsistency and add missing string edge-case tests
Wall-clock time: 54 min 32 s

Executive Summary

This PR adds rune-based string slicing to yq (.[0:5], .[4:], .[-5:] on strings), fixes traverseArrayOperator to evaluate the LHS before slicing, and hardens array slicing against very negative indices. The string slicing implementation is correct for the common case, but it exposes a byte-vs-rune inconsistency with existing operators (length, match) that produces silently wrong results for multibyte Unicode strings. The PR also needs stronger edge-case test coverage and minor structural fixes.


Critical Issues

None.


Important Issues

I1. Byte-vs-rune length inconsistency produces wrong results for multibyte strings (Claude, Gemini)

The length operator returns byte count, but sliceStringNode uses rune count. For multibyte strings, length-based arithmetic produces wrong slice results. This is not merely masked by clamping — it actively produces incorrect output:

$ echo '"café"' | yq '(. | length) as $n | .[($n - 2):]'
é           # WRONG — expected "fé" (last 2 chars)

$ echo '"café"' | yq '.[-2:]'
fé          # correct with negative index

The match operator's byte offsets at operator_strings.go:292 have the same issue: feeding match("latte").offset into a slice on a multibyte string yields wrong boundaries.

jq is internally consistent: "café" | length returns 4 (characters), and slicing is character-based. This PR correctly chose rune-based slicing but did not update the pre-existing byte-based length operator. (important, gap)
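
The mismatch reduces to Go's two notions of string length; a minimal standalone check:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	s := "café" // "é" occupies 2 bytes in UTF-8
	fmt.Println(len(s))                    // 5: byte count, what length currently returns
	fmt.Println(len([]rune(s)))            // 4: rune count, what sliceStringNode uses
	fmt.Println(utf8.RuneCountInString(s)) // 4: same count, without the []rune allocation
}
```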

Fix: Update lengthOperator to use len([]rune(candidate.Value)) for non-null scalars, and update addMatch to convert byte offsets to rune offsets via utf8.RuneCountInString. Note: changing length is a behavioural change for multibyte strings and warrants a changelog entry.

Current implementation (operator_length.go):

	for el := context.MatchingNodes.Front(); el != nil; el = el.Next() {
		candidate := el.Value.(*CandidateNode)
		var length int
		switch candidate.Kind {
		case ScalarNode:
			if candidate.Tag == "!!null" {
				length = 0
			} else {
				length = len(candidate.Value)
			}
		case MappingNode:
			length = len(candidate.Content) / 2
		case SequenceNode:
			length = len(candidate.Content)
		default:
			length = 0
		}

Fix (diff):

 // operator_length.go
 case ScalarNode:
     if candidate.Tag == "!!null" {
         length = 0
     } else {
-        length = len(candidate.Value)
+        length = len([]rune(candidate.Value))
     }
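
For the addMatch half of the fix, converting a byte offset to a rune offset might look like the following sketch (the helper name is illustrative, not from the yq codebase):

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteToRuneOffset converts a byte offset into s to the corresponding
// rune (character) offset, matching jq's character-based counting.
// Hypothetical helper for the proposed addMatch fix.
func byteToRuneOffset(s string, byteOffset int) int {
	return utf8.RuneCountInString(s[:byteOffset])
}

func main() {
	s := "café latte"
	// "latte" starts at byte offset 6 ("é" is 2 bytes) but rune offset 5.
	fmt.Println(byteToRuneOffset(s, 6)) // 5
}
```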

I2. Inconsistent index-clamping structure between string and array paths (Claude, Gemini)

The string path uses else if, so a very negative secondNumber that remains negative after adjustment never gets clamped to 0 explicitly. It works only because the later secondNumber < firstNumber guard at line 43 catches it. The array path uses separate if blocks that handle this directly. (important, regression)

func sliceStringNode(lhsNode *CandidateNode, firstNumber int, secondNumber int) *CandidateNode {
	runes := []rune(lhsNode.Value)
	length := len(runes)

	relativeFirstNumber := firstNumber
	if relativeFirstNumber < 0 {
		relativeFirstNumber = length + firstNumber
	}
	if relativeFirstNumber < 0 {
		relativeFirstNumber = 0
	}

	relativeSecondNumber := secondNumber
	if relativeSecondNumber < 0 {
		relativeSecondNumber = length + secondNumber
	} else if relativeSecondNumber > length {
		relativeSecondNumber = length
	}

	log.Debugf("sliceStringNode: slice from %v to %v", relativeFirstNumber, relativeSecondNumber)

	if relativeFirstNumber > length {
		relativeFirstNumber = length
	}
	if relativeSecondNumber < relativeFirstNumber {
		relativeSecondNumber = relativeFirstNumber
	}

	slicedString := string(runes[relativeFirstNumber:relativeSecondNumber])
	replacement := lhsNode.CreateReplacement(ScalarNode, lhsNode.Tag, slicedString)
	replacement.Style = lhsNode.Style
	return replacement
}

Fix (diff): Align the string path with the array path:

 relativeSecondNumber := secondNumber
 if relativeSecondNumber < 0 {
     relativeSecondNumber = length + secondNumber
- } else if relativeSecondNumber > length {
+ }
+ if relativeSecondNumber < 0 {
+     relativeSecondNumber = 0
+ } else if relativeSecondNumber > length {
     relativeSecondNumber = length
 }
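
The corrected clamping can be expressed as a pure function; a sketch of the intended semantics (yq's real code operates on CandidateNode, not bare ints):

```go
package main

import "fmt"

// clampSlice mirrors the aligned clamping logic: negative indices count
// from the end, both bounds are clamped into [0, length], and the end
// is never allowed to precede the start. Illustrative only.
func clampSlice(length, first, second int) (int, int) {
	if first < 0 {
		first += length
	}
	if first < 0 {
		first = 0
	} else if first > length {
		first = length
	}
	if second < 0 {
		second += length
	}
	if second < 0 {
		second = 0
	} else if second > length {
		second = length
	}
	if second < first {
		second = first
	}
	return first, second
}

func main() {
	fmt.Println(clampSlice(5, -100, 3)) // very negative start clamps to 0
	fmt.Println(clampSlice(5, 0, -100)) // very negative end clamps to 0, no panic
}
```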

I3. Missing test coverage for string edge cases tested for arrays (Claude, Codex)

The PR added array tests for very negative indices (.[-100:], .[:-100]) but no corresponding string tests. Also missing: Unicode strings with implicit end index, empty string slicing, and non-string scalar slicing. (important, gap)

Untested scenarios ranked by risk:

  1. Unicode string with implicit end index (.greeting[2:] on "héllo") — exercises the byte-vs-rune mismatch
  2. Very negative string start (.country[-100:]) — tests clamping parity with arrays
  3. Very negative string end (.country[:-100]) — tests clamping parity with arrays
  4. Empty string slicing (.val[0:5] on "")
  5. Non-string scalar slicing (.value[0:2] on !!int)

Fix: Add test scenarios with skipDoc: true for each of these cases.
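
A sketch of two of the missing scenarios, assuming yq's existing expressionScenario test fixture (field names and the expected-output format should be verified against the actual test file):

```go
// Hypothetical additions to the slice operator test table.
{
	description: "Slice unicode string with implicit end",
	skipDoc:     true,
	document:    `greeting: héllo`,
	expression:  `.greeting[2:]`,
	expected:    []string{"D0, P[greeting], (!!str)::llo\n"},
},
{
	description: "Slice string with very negative start",
	skipDoc:     true,
	document:    `country: Australia`,
	expression:  `.country[-100:]`,
	expected:    []string{"D0, P[country], (!!str)::Australia\n"},
},
```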

I4. Slicing non-string scalars produces invalid tagged output (Claude)

When .value[0:2] targets a !!int scalar, the string guard correctly rejects it, but the code falls through to the array path. Since Content is nil for scalar nodes, the function returns an empty sequence tagged with the scalar's type (e.g., !!int []). (important, gap)

		if lhsNode.Kind == ScalarNode && lhsNode.guessTagFromCustomType() == "!!str" {
			results.PushBack(sliceStringNode(lhsNode, firstNumber, secondNumber))
			continue
		}
		// ... falls through to array path for !!int, !!bool, !!float ...

Fix: Add an explicit guard before the array fallthrough:

 if lhsNode.Kind == ScalarNode && lhsNode.guessTagFromCustomType() == "!!str" {
     results.PushBack(sliceStringNode(lhsNode, firstNumber, secondNumber))
     continue
 }
+if lhsNode.Kind == ScalarNode {
+    return Context{}, fmt.Errorf("cannot slice scalar of type %v, only arrays and strings can be sliced", lhsNode.Tag)
+}

I5. Slice index expressions evaluated against LHS instead of original context (Gemini)

sliceArrayOperator uses context to evaluate index expressions via getSliceNumber. After the PR, context is lhsContext (the sliced node), not the original document context. For expressions like .a[0:.max], .max is now evaluated against .a instead of the root — producing "cannot index array with 'max'".

Before the PR, the same expression sliced the wrong target (root document instead of .a) — so this was never correct either. The PR fixed one half (correct target) without fixing the other (correct index context). (important, gap)

	if expressionNode.RHS != nil && expressionNode.RHS.RHS != nil && expressionNode.RHS.RHS.Operation.OperationType == createMapOpType {
		lhsContext, err := d.GetMatchingNodes(context, expressionNode.LHS)
		if err != nil {
			return Context{}, err
		}
		return sliceArrayOperator(d, lhsContext, expressionNode.RHS.RHS)
	}

Fix: Pass both the original context (for evaluating indices) and the LHS context (for providing nodes) to sliceArrayOperator. One approach: evaluate indices in traverseArrayOperator before calling sliceArrayOperator, then pass the resolved integers.
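
The shape of that approach in miniature — resolve the bounds against the original (root) scope before slicing the target. This is a generic toy model, not yq code; all names are hypothetical:

```go
package main

import "fmt"

// Toy model: a "document" maps names to values. A slice bound may be a
// literal or a reference, and references must resolve against the ROOT
// document, never against the value being sliced.
type doc map[string]interface{}

func resolveBound(root doc, ref string, literal int, isRef bool) int {
	if isRef {
		return root[ref].(int) // evaluated in the original context
	}
	return literal
}

func main() {
	root := doc{"max": 2, "a": []string{"x", "y", "z"}}
	arr := root["a"].([]string)
	hi := resolveBound(root, "max", 0, true) // models .a[0:.max]
	fmt.Println(arr[0:hi])                   // [x y]
}
```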


Suggestions

S1. Commit structure needs cleanup before merge (Claude)

The 5 commits include an empty "Initial plan" commit, a lint-fix commit, and a panic-fix commit that should all be squashed into the main feature commit before merging. 778088d7 (Mike Farah's direct edit) can remain separate.

S2. Lexer openCollect change affects standalone [: expr] (Claude, Gemini)

The added openCollect condition correctly enables .field[:N] but also fires on standalone [ (array literal start). [: "b"] now silently produces [{0: b}] instead of an error. Unlikely to affect real users, but broadens accepted syntax unintentionally. (suggestion, gap)

	if tokenIsOpType(currentToken, createMapOpType) {
		log.Debugf("tokenIsOpType: createMapOpType")
		// check the previous token is '[', means we are slice, but dont have a first number
		if index > 0 && (tokens[index-1].TokenType == traverseArrayCollect || tokens[index-1].TokenType == openCollect) {
			log.Debugf("previous token is : traverseArrayOpType")
			// need to put the number 0 before this token, as that is implied
			postProcessedTokens = append(postProcessedTokens, &token{TokenType: operationToken, Operation: createValueOperation(0, "0")})
		}
	}

Fix: Tighten the condition to require that openCollect follows a traversal token.

S3. Doc header "Splice" misleading for strings (Claude)

The header was updated from "Slice/Splice Array" to "Slice/Splice Array or String". "Splice" implies in-place mutation, which is not supported for strings. Consider "Slice Array or String" instead.

S4. Documentation uses only ASCII examples (Claude)

All four string slicing doc sections use country: Australia (ASCII-only). Since the feature's distinguishing value is rune-based Unicode handling, one documented example with a multibyte string would demonstrate this.


Design Observations

Concerns

Strengths


Testing Assessment

Test coverage for core string slicing (both ends, negative indices, out-of-bounds, Unicode, full-string [:]) is solid. Gaps ranked by risk:

  1. Unicode string with implicit end index (.greeting[2:]) — exercises byte-vs-rune mismatch path
  2. Non-string scalar slicing (.value[0:2] on !!int) — produces invalid output (I4)
  3. Very negative string indices (.country[-100:], .country[:-100]) — tests clamping parity
  4. Empty string slicing — boundary condition for zero-length rune slices
  5. Expression-based slice bounds (.a[0:.max]) — exercises the context evaluation issue (I5)

Documentation Assessment

The generated slice-array.md documentation correctly demonstrates string slicing examples. The header update to "Slice/Splice Array or String" is accurate for slicing but misleading for the "Splice" part regarding strings (S3). All examples use ASCII-only strings (S4).


Commit Structure


Agent Performance Retro

Claude Opus 4.6

Claude produced the most findings across all three passes and demonstrated the deepest analysis. In pass 1, it identified the byte-vs-rune length mismatch (I1), the inconsistent clamping logic (I2), and the test coverage gaps (I3). In pass 2, it found the non-string scalar fallthrough (I4) and the lexer side-effect (S2). In pass 3, it deepened I1 with concrete reproduction cases showing that the mismatch produces wrong output (not just a theoretical concern), and added documentation suggestions (S4).

Claude ran git blame 4 times in pass 1 to verify regression classification. It read 10 files and used Grep 7 times to trace call sites, demonstrating thorough cross-file analysis.

Follow-up pass performance: Pass 2 produced 1 important + 2 suggestions (all new). Pass 3 produced 1 important (restated with stronger evidence) + 2 suggestions (new). Pass 3's key contribution was proving I1 causes real data corruption, not just a theoretical concern.

Codex GPT 5.4

Codex found no issues at any severity level. It reviewed all 7 files and concluded "no correctness regressions in the shipped changes." Its testing assessment correctly noted the Unicode implicit-end gap and expression-valued bounds gap, but it did not promote these to findings. Codex ran 40 tool calls but relied heavily on nl (12 calls) and printf (10) for file reading rather than targeted analysis. It ran git blame only once.

Coverage miss: Codex marked operator_slice.go as "Reviewed, no issues" — but Claude and Gemini both found important issues there (I1, I2, I4). Dropped after pass 1 (0 findings).

Gemini 3.1 Pro

Gemini found the byte-vs-rune length mismatch (I1, shared with Claude) and the clamping inconsistency (I2, shared with Claude, reported at a different severity) in pass 1. In pass 2, it identified the context evaluation issue (I5) and independently found the lexer side-effect (S2, shared with Claude). Pass 3 explored operator interactions (assignment, deletion, maps, null indices), but these were pre-existing gaps outside the PR scope.

Gemini never ran git blame (known limitation due to quota constraints). It made 37 tool calls in pass 1, focusing on grep_search (9) and shell commands (23).

Follow-up pass performance: Pass 2 produced 1 critical (downgraded to I5) + 1 important (deduplicated to S2). Pass 3 produced 4 importants, but 1 was a duplicate and the rest were pre-existing gaps — downgraded to suggestions or dropped during consolidation.

Tool Call Highlights

Summary

                       Claude Opus 4.6          Codex GPT 5.4            Gemini 3.1 Pro
Duration (pass 1)      462s                     217s                     470s
Findings (pass 1)      0C 3I 1S                 0C 0I 0S                 0C 1I 1S
Tool calls (pass 1)    24 (Read 10, Grep 7)     40 (exec 33, write 7)    37 (shell 23, grep 9)
Duration (pass 2)      781s                     —                        484s
Findings (pass 2)      0C 1I 2S                 —                        0C 1I 0S + 1 dup
Tool calls (pass 2)    58                       —                        49
Duration (pass 3)      895s                     —                        623s
Findings (pass 3)      0C 0I 2S + 1 restated    —                        0C 0I 0S + 3 dropped + 1 dup
Tool calls (pass 3)    69                       —                        72
Totals                 0C 4I 5S                 0C 0I 0S                 0C 2I 1S
Design observations    3 concerns, 4 strengths  0 concerns, 3 strengths  1 concern, 2 strengths
False positives        0                        0                        0
Unique insights        I2, I3, I4, S1, S3, S4   0                        I5
Files reviewed         7                        7                        7
Coverage misses        0                        1 (operator_slice.go)    0
Downgraded             0                        0                        1 (C→I: context evaluation)
Dropped                0                        0                        4 (pass 3 pre-existing gaps)
Severity inflation     0                        0                        1 (C1 pass 2)

Reconciliation:

  - Gemini P2 C1 (context evaluation): critical → important I5 (pre-existing limitation, not a regression — the old code also mishandled this case).
  - Gemini P3 I1 (splice assignment): dropped (a pre-existing "Splice" feature for strings never existed; already covered by S3).
  - Gemini P3 I2 (del on slice): dropped (pre-existing behaviour unchanged by this PR).
  - Gemini P3 I3 (slicing maps): dropped (duplicate of I4 non-string scalar handling).
  - Gemini P3 I4 (null indices): dropped (pre-existing gap, unrelated to this change).

Claude provided the most value overall, identifying 4 of 5 important findings and all 4 suggestions. Gemini's unique contribution was I5 (context evaluation). Codex provided no actionable findings.


Review Process Notes

Repo Context Updates