Deep Review: 20260405-232458-pr-2639
| Date | 2026-04-05 23:24 |
| Repo | mikefarah/yq |
| Round | 1 |
| Mode | Thorough (3 passes) |
| Author | @copilot-swe-agent (bot) |
| PR | #2639 — Add string slicing support |
| Branch | copilot/add-string-slice-support |
| Commits | 341e2524 Fix array slice out-of-bounds panic with very negative indices; 778088d7 Update pkg/yqlib/operator_slice.go; 9a9399ad Fix sliceStringNode signature and fix test descriptions/expressions; 6d345ac7 Add string slicing support to yq; 26929879 Initial plan |
| Reviewers | Claude Opus 4.6, Codex GPT 5.4, Gemini 3.1 Pro |
| Verdict | Merge with fixes — fix byte-vs-rune length inconsistency and add missing string edge-case tests |
| Wall-clock time | 54 min 32 s |
Executive Summary ¶
This PR adds rune-based string slicing to yq (.[0:5], .[4:], .[-5:] on strings), fixes traverseArrayOperator to evaluate the LHS before slicing, and hardens array slicing against very negative indices. The string slicing implementation is correct for the common case, but it exposes a byte-vs-rune inconsistency with existing operators (length, match) that produces silently wrong results for multibyte Unicode strings. The PR also needs stronger edge-case test coverage and minor structural fixes.
Critical Issues ¶
None.
Important Issues ¶
The length operator returns byte count, but sliceStringNode uses rune count. For multibyte strings, length-based arithmetic produces wrong slice results. This is not merely masked by clamping — it actively produces incorrect output:
$ echo '"café"' | yq '(. | length) as $n | .[($n - 2):]'
é # WRONG — expected "fé" (last 2 chars)
$ echo '"café"' | yq '.[-2:]'
fé # correct with negative index
The match operator's byte offsets at operator_strings.go:292 have the same issue: feeding match("latte").offset into a slice on a multibyte string yields wrong boundaries.
jq is internally consistent: "café" | length returns 4 (characters), and slicing is character-based. This PR correctly chose rune-based slicing but did not update the pre-existing byte-based length operator. (important, gap)
Fix: Update lengthOperator to use len([]rune(candidate.Value)) for non-null scalars, and update addMatch to convert byte offsets to rune offsets via utf8.RuneCountInString. Note: changing length is a behavioural change for multibyte strings and warrants a changelog entry.
for el := context.MatchingNodes.Front(); el != nil; el = el.Next() {
candidate := el.Value.(*CandidateNode)
var length int
switch candidate.Kind {
case ScalarNode:
if candidate.Tag == "!!null" {
length = 0
} else {
length = len(candidate.Value)
}
case MappingNode:
length = len(candidate.Content) / 2
case SequenceNode:
length = len(candidate.Content)
default:
length = 0
}
Fix (diff):
// operator_length.go
case ScalarNode:
if candidate.Tag == "!!null" {
length = 0
} else {
- length = len(candidate.Value)
+ length = len([]rune(candidate.Value))
}
In sliceStringNode, the string path uses else if, so a very negative secondNumber that remains negative after adjustment is never explicitly clamped to 0. It works only because the later secondNumber < firstNumber guard at line 43 catches it. The array path uses separate if blocks that handle this directly. (important, regression)
func sliceStringNode(lhsNode *CandidateNode, firstNumber int, secondNumber int) *CandidateNode {
runes := []rune(lhsNode.Value)
length := len(runes)
relativeFirstNumber := firstNumber
if relativeFirstNumber < 0 {
relativeFirstNumber = length + firstNumber
}
if relativeFirstNumber < 0 {
relativeFirstNumber = 0
}
relativeSecondNumber := secondNumber
if relativeSecondNumber < 0 {
relativeSecondNumber = length + secondNumber
} else if relativeSecondNumber > length {
relativeSecondNumber = length
}
log.Debugf("sliceStringNode: slice from %v to %v", relativeFirstNumber, relativeSecondNumber)
if relativeFirstNumber > length {
relativeFirstNumber = length
}
if relativeSecondNumber < relativeFirstNumber {
relativeSecondNumber = relativeFirstNumber
}
slicedString := string(runes[relativeFirstNumber:relativeSecondNumber])
replacement := lhsNode.CreateReplacement(ScalarNode, lhsNode.Tag, slicedString)
replacement.Style = lhsNode.Style
return replacement
}
Fix (diff): Align the string path with the array path:
relativeSecondNumber := secondNumber
if relativeSecondNumber < 0 {
relativeSecondNumber = length + secondNumber
- } else if relativeSecondNumber > length {
+ }
+ if relativeSecondNumber < 0 {
+ relativeSecondNumber = 0
+ } else if relativeSecondNumber > length {
relativeSecondNumber = length
}
The PR added array tests for very negative indices (.[-100:], .[:-100]) but no corresponding string tests. Also missing: Unicode strings with implicit end index, empty string slicing, and non-string scalar slicing. (important, gap)
Untested scenarios ranked by risk:
- Unicode string with implicit end index (.greeting[2:] on "héllo") — exercises the byte-vs-rune mismatch
- Very negative string start (.country[-100:]) — tests clamping parity with arrays
- Very negative string end (.country[:-100]) — tests clamping parity with arrays
- Empty string slicing (.val[0:5] on "")
- Non-string scalar slicing (.value[0:2] on !!int)
Fix: Add test scenarios with skipDoc: true for each of these cases.
When .value[0:2] targets a !!int scalar, the string guard correctly rejects it, but the code falls through to the array path. Since Content is nil for scalar nodes, the function returns an empty sequence tagged with the scalar's type (e.g., !!int []). (important, gap)
if lhsNode.Kind == ScalarNode && lhsNode.guessTagFromCustomType() == "!!str" {
results.PushBack(sliceStringNode(lhsNode, firstNumber, secondNumber))
continue
}
// ... falls through to array path for !!int, !!bool, !!float ...
Fix: Add an explicit guard before the array fallthrough:
if lhsNode.Kind == ScalarNode && lhsNode.guessTagFromCustomType() == "!!str" {
results.PushBack(sliceStringNode(lhsNode, firstNumber, secondNumber))
continue
}
+if lhsNode.Kind == ScalarNode {
+ return Context{}, fmt.Errorf("cannot slice scalar of type %v, only arrays and strings can be sliced", lhsNode.Tag)
+}
sliceArrayOperator uses context to evaluate index expressions via getSliceNumber. After the PR, context is lhsContext (the sliced node), not the original document context. For expressions like .a[0:.max], .max is now evaluated against .a instead of the root — producing "cannot index array with 'max'".
Before the PR, the same expression sliced the wrong target (root document instead of .a) — so this was never correct either. The PR fixed one half (correct target) without fixing the other (correct index context). (important, gap)
if expressionNode.RHS != nil && expressionNode.RHS.RHS != nil && expressionNode.RHS.RHS.Operation.OperationType == createMapOpType {
lhsContext, err := d.GetMatchingNodes(context, expressionNode.LHS)
if err != nil {
return Context{}, err
}
return sliceArrayOperator(d, lhsContext, expressionNode.RHS.RHS)
}
Fix: Pass both the original context (for evaluating indices) and the LHS context (for providing nodes) to sliceArrayOperator. One approach: evaluate indices in traverseArrayOperator before calling sliceArrayOperator, then pass the resolved integers.
Suggestions ¶
The 5 commits include an empty "Initial plan" commit, a lint-fix commit, and a panic-fix commit that should all be squashed into the main feature commit before merging. 778088d7 (Mike Farah's direct edit) can remain separate.
The added openCollect condition correctly enables .field[:N] but also fires on standalone [ (array literal start). [: "b"] now silently produces [{0: b}] instead of an error. Unlikely to affect real users, but broadens accepted syntax unintentionally. (suggestion, gap)
if tokenIsOpType(currentToken, createMapOpType) {
log.Debugf("tokenIsOpType: createMapOpType")
// check the previous token is '[', means we are slice, but dont have a first number
if index > 0 && (tokens[index-1].TokenType == traverseArrayCollect || tokens[index-1].TokenType == openCollect) {
log.Debugf("previous token is : traverseArrayOpType")
// need to put the number 0 before this token, as that is implied
postProcessedTokens = append(postProcessedTokens, &token{TokenType: operationToken, Operation: createValueOperation(0, "0")})
}
}
Fix: Tighten the condition to require that openCollect follows a traversal token.
The header was updated from "Slice/Splice Array" to "Slice/Splice Array or String". "Splice" implies in-place mutation, which is not supported for strings. Consider "Slice Array or String" instead.
All four string slicing doc sections use country: Australia (ASCII-only). Since the feature's distinguishing value is rune-based Unicode handling, one documented example with a multibyte string would demonstrate this.
Design Observations ¶
Concerns ¶
- The string/array type dispatch at operator_slice.go:74 uses a "check-and-continue" pattern that leaves all non-!!str scalar types falling silently into array logic. Every new scalar type yq supports in the future will produce invalid output. A guard-and-error before the array path (I4) would make the function fail-safe by default. Claude (in-scope)
- The length operator returns byte count for strings while the new slicing uses rune count. This semantic split means users cannot predict which convention a given string operator uses. Fixing length and match offsets to use runes as part of this PR would make string operations coherent. Deferring the fix means every new string operator must decide "bytes or runes?" independently. Claude, Gemini (in-scope)
- Duplicated index-normalisation logic between sliceStringNode and the array path in sliceArrayOperator invites divergence. A shared normalizeSliceIndices(first, second, length int) (int, int) helper would eliminate the inconsistency. Claude (in-scope)
Strengths ¶
- Using []rune for string slicing correctly handles Unicode. This aligns with jq's character-based string indexing.
- The traverseArrayOperator fix (evaluating LHS before calling sliceArrayOperator) solves the root cause of .field[N:M] operating on the wrong node.
- The lexer change (adding openCollect to the implicit-0 check) is well-scoped and necessary for .field[:N] syntax.
- CreateReplacement preserves the original scalar's tag and style, avoiding quoting/formatting surprises.
Testing Assessment ¶
Test coverage for core string slicing (both ends, negative indices, out-of-bounds, Unicode, full-string [:]) is solid. Gaps ranked by risk:
- Unicode string with implicit end index (.greeting[2:]) — exercises the byte-vs-rune mismatch path
- Non-string scalar slicing (.value[0:2] on !!int) — produces invalid output (I4)
- Very negative string indices (.country[-100:], .country[:-100]) — tests clamping parity
- Empty string slicing — boundary condition for zero-length rune slices
- Expression-based slice bounds (.a[0:.max]) — exercises the context evaluation issue (I5)
Documentation Assessment ¶
The generated slice-array.md documentation correctly demonstrates string slicing examples. The header update to "Slice/Splice Array or String" is accurate for slicing but misleading for the "Splice" part regarding strings (S3). All examples use ASCII-only strings (S4).
Commit Structure ¶
- 26929879 Initial plan — empty commit with no file changes; should be dropped.
- 9a9399ad — fixes lint error and test mistakes from the main feature commit; should be squashed into 6d345ac7.
- 341e2524 — fixes panic from the main feature commit; should be squashed into 6d345ac7.
- 778088d7 — Mike Farah's direct edit (tag/style preservation); reasonable as a separate commit.
Agent Performance Retro ¶
Claude Opus 4.6 ¶
Claude produced the most findings across all three passes and demonstrated the deepest analysis. In pass 1, it identified the byte-vs-rune length mismatch (I1), the inconsistent clamping logic (I2), and the test coverage gaps (I3). In pass 2, it found the non-string scalar fallthrough (I4) and the lexer side-effect (S2). In pass 3, it deepened I1 with concrete reproduction cases showing that the mismatch produces wrong output (not just a theoretical concern), and added documentation suggestions (S4).
Claude ran git blame 4 times in pass 1 to verify regression classification. It read 10 files and used Grep 7 times to trace call sites, demonstrating thorough cross-file analysis.
Follow-up pass performance: Pass 2 produced 1 important + 2 suggestions (all new). Pass 3 produced 1 important (restated with stronger evidence) + 2 suggestions (new). Pass 3's key contribution was proving I1 causes real data corruption, not just a theoretical concern.
Codex GPT 5.4 ¶
Codex found no issues at any severity level. It reviewed all 7 files and concluded "no correctness regressions in the shipped changes." Its testing assessment correctly noted the Unicode implicit-end gap and expression-valued bounds gap, but it did not promote these to findings. Codex ran 40 tool calls but relied heavily on nl (12 calls) and printf (10) for file reading rather than targeted analysis. It ran git blame only once.
Coverage miss: Codex marked operator_slice.go as "Reviewed, no issues" — but Claude and Gemini both found important issues there (I1, I2, I4). Dropped after pass 1 (0 findings).
Gemini 3.1 Pro ¶
Gemini found the byte-vs-rune length mismatch (I1, shared with Claude) and the clamping inconsistency (S1, shared with Claude at different severity) in pass 1. In pass 2, it identified the context evaluation issue (I5) and independently found the lexer side-effect (S2, shared with Claude). Pass 3 explored operator interactions (assignment, deletion, maps, null indices) but these were pre-existing gaps outside the PR scope.
Gemini never ran git blame (known limitation due to quota constraints). It made 37 tool calls in pass 1, focusing on grep_search (9) and shell commands (23).
Follow-up pass performance: Pass 2 produced 1 critical (downgraded to I5) + 1 important (deduplicated to S2). Pass 3 produced 4 importants, but 1 was a duplicate and the rest were pre-existing gaps — downgraded to suggestions or dropped during consolidation.
Tool Call Highlights ¶
- Claude ran git blame consistently (4 times P1, 3 times P2, 5 times P3) — the only agent to systematically verify regression classification.
- Codex never searched code with grep/rg — it relied entirely on sequential file reading, which may explain why it missed cross-file issues like the length operator inconsistency.
- Gemini never ran git blame (known quota limitation). In pass 3, it used 72 tool calls — the highest of any agent/pass — but most findings were pre-existing gaps.
Summary ¶
| | Claude Opus 4.6 | Codex GPT 5.4 | Gemini 3.1 Pro |
|---|---|---|---|
| Duration (pass 1) | 462s | 217s | 470s |
| Findings (pass 1) | 0C 3I 1S | 0C 0I 0S | 0C 1I 1S |
| Tool calls (pass 1) | 24 (Read 10, Grep 7) | 40 (exec 33, write 7) | 37 (shell 23, grep 9) |
| Duration (pass 2) | 781s | — | 484s |
| Findings (pass 2) | 0C 1I 2S | — | 0C 1I 0S + 1 dup |
| Tool calls (pass 2) | 58 | — | 49 |
| Duration (pass 3) | 895s | — | 623s |
| Findings (pass 3) | 0C 0I 2S + 1 restated | — | 0C 0I 0S + 3 dropped + 1 dup |
| Tool calls (pass 3) | 69 | — | 72 |
| Totals | 0C 4I 5S | 0C 0I 0S | 0C 2I 1S |
| Design observations | 3 concerns, 4 strengths | 0 concerns, 3 strengths | 1 concern, 2 strengths |
| False positives | 0 | 0 | 0 |
| Unique insights | I2, I3, I4, S1, S3, S4 | 0 | I5 |
| Files reviewed | 7 | 7 | 7 |
| Coverage misses | 0 | 1 (operator_slice.go) | 0 |
| Downgraded | 0 | 0 | 1 (C→I: context evaluation) |
| Dropped | 0 | 0 | 4 (pass 3 pre-existing gaps) |
| Severity inflation | 0 | 0 | 1 (C1 pass 2) |
Reconciliation:
- Gemini P2 C1 (context evaluation): critical → important I5 (pre-existing limitation, not a regression — old code also didn't handle this case correctly).
- Gemini P3 I1 (splice assignment): dropped (pre-existing "Splice" feature never existed; already covered by S3).
- Gemini P3 I2 (del on slice): dropped (pre-existing behaviour unchanged by this PR).
- Gemini P3 I3 (slicing maps): dropped (duplicate of I4 non-string scalar handling).
- Gemini P3 I4 (null indices): dropped (pre-existing gap, not related to this change).
Claude provided the most value overall, identifying 4 of 5 important findings and all 4 suggestions. Gemini's unique contribution was I5 (context evaluation). Codex provided no actionable findings.
Review Process Notes ¶
Repo Context Updates ¶
- yq's length operator returns byte count for strings (len(candidate.Value)), not rune/character count. String slicing uses rune count. Future reviews of string-related operators should check for byte-vs-rune consistency.