Bypass search cache on truncated searches at embed-escape boundary by stefanobaghino · Pull Request #675 · trishume/syntect

stefanobaghino · 2026-04-28T14:03:04Z

Stacks on #674.

Inside an embed where search_end is clipped to the escape position,
the per-line search cache returned cached full-line results regardless
of the current search bound. That flipped rule outcomes whose lookaheads
sat exactly at the boundary.

In `for i in $(seq 100); do echo $i; done` (Bash syntest), the
done{{cmd_break}} rule (done(?!cmd_char)) was searched at the outer
level with full-line text, where the lookahead saw the closing backtick
(itself a cmd_char) and failed. That None was cached. Inside the
backtick embed, with search_end clipped to the close, the cache
short-circuited the lookup before the regex could re-run with
end-of-input semantics, so done fell through to cmd-name-body and
emitted variable.function.shell instead of keyword.control.loop.end.shell.

Skip the cache lookup whenever search_end < line.len(). Insertion was
already gated on search_end == line.len(), so the cache stays
populated by full-line answers; truncated searches just re-run.

Drops bash baseline 4 → 0 on both backends.

Compare against #674: stefanobaghino/syntect@631-syntest-non-pure-assertion-gating...631-clip-lookahead-at-embed-escape

`handle_fail` already re-emits the branch_point match's `pat.scope` so a keyword like `LIKE` retains its scope after a branch swap. It did not do the same for `captures:`-derived scopes, because those are computed from the match's `regions` at emit time and not stored on the BranchPoint record. On same-line and cross-line fail paths the original capture Push/Pop pair was truncated off `ops` together with alt[0]'s subsequent work and never replayed. Minimal Haskell repro: data CtxCls ctx => ModId.QTyCls -- ^^^^ keyword.declaration.data.haskell The `datas` branch_point match is `(data)(?:\s+(family|instance))?` with `captures: { 1: keyword.declaration.data.haskell, 2: storage.modifier.family.haskell }`. When `data-signature` (alt[0]) fires `fail: data-declaration` on the `=>` lookahead, replay into `data-context` (alt[1]) only restored `scope: meta.declaration.data.haskell` — the `data` token lost its inner `keyword.declaration.data.haskell` scope. Fix: snapshot the computed capture ops on the BranchPoint record at creation time (new `capture_ops` field), and replay them inside the pat_scope brackets in both fail paths. Extracts the capture-op build logic (previously inline in `exec_pattern`) into a free-function helper `build_capture_ops` used at both call sites. Regression test `branch_point_capture_scopes_survive_fail_retry` mirrors the Haskell shape with a synthetic two-alt branch whose trigger match carries both `scope:` and `captures:`. Syntest net: - Haskell/syntax_test_haskell.hs: 72 -> 50 (Cluster A closes; `data CtxCls ctx => …`, `data family … => …`, and `type … => …` all now retain their declaration-keyword scope — 22 assertions). - No collateral deltas on any other baseline file (both regex backends). Refs: trishume#631

Mechanical: Haskell drops from 72 to 50 failing assertions on both regex backends. No other baseline lines change.

- Inline `build_capture_ops` signature so it fits on one line under the older rustfmt in CI's Minimum supported rust version job. - Escape `alt[0]`/`alt[1]` in the `BranchPoint.capture_ops` doc so rustdoc stops treating them as intra-doc links under `-Dwarnings`. Refs: trishume#631

push_meta_ops emitted the compound Pop before the Restore when the leaving context carried `clear_scopes:` and `set_pop_count > 1`. The intermediate popped frame's meta_content_scope is split across the live scope stack and clear_stack; Pop — counting the full pre-Clear total — then ate atoms from frames below the popped range. Observed on Batch File `cmd-set-quoted-value-inner-end` (`clear_scopes: 1`) firing `pop: 2, set: ignored-tail-outer`, which dropped `meta.command.set.dosbatch` from the trailing content of every quoted `set "var"=...` line. Emit Restore before Pop when `set_pop_count > 1`; plain `set:` keeps the existing Pop-then-Restore order (gated by the Lisp `defun` test `v2_set_to_target_with_clear_scopes_clears_parent_meta_content_scope`). syntax_test_batch_file.bat: 74 → 0 on both backends; no other baseline entry changes. New regression `pop_n_set_with_cur_clear_scopes_restores_before_popping_deeper_frames`. Refs: trishume#631

Sublime Text applies `captures: N:` to the overlap between group N's span and the rule's consumed match range. syntect dropped lookaround- internal captures in `parse_captures` and would have emitted Pop past match_end in `build_capture_ops` had they reached it. Keep every non-negative `captures:` key at load time; clip `(cap_start, cap_end)` to `regions.pos(0)` at apply time. Removes the now-unused `get_consuming_capture_indexes` walker and its tests. Baseline (both backends): clears ASP/syntax_test_asp.asp (53), C#/tests/syntax_test_Generics.cs (3), Rails/tests/syntax_test_rails.html.erb (23). Refs: trishume#631

ST drops the popped context's `meta_scope` and `meta_content_scope` from the trigger match's text for `pop: N + embed:`, unlike `pop: N + set:` which preserves both. Rules in the wild re-add the meta_scope atom explicitly in their match `scope:` so it still appears exactly once on the trigger — HTML (JSP).sublime-syntax's `tag-jsp-{declaration,expression,scriptlet}-attributes` all do. syntect's Embed → synthetic Set routing in `push_meta_ops` inherited plain-Set semantics, so cur.meta_scope stayed on the stack and the match's explicit scope duplicated it on top. Fix: when `pop_count > 0`, emit initial-phase Pops for cur's mcs and ms, then pass a scope-stripped clone of cur through to the recursive Set call so its non-initial `num_to_pop` doesn't double-account atoms that are already off the stack. Probe and ordering invariant in `v2_pop_embed_suppresses_cur_meta_scope_on_match`. Net syntest: Java/jsp 44 → 39 on both baselines (5 `- meta.tag.jsp meta.tag.jsp` assertions cleared). The other 39 are three unrelated root causes, not addressed here. Refs: trishume#631

A cross-line `fail` replay commits a Push(meta_scope) to `flushed_ops` for a speculative context; a later same-line `fail` for a branch_point created *during* the replay then truncates the owning context out of `self.stack` without emitting a balancing Pop (the Push is in `flushed_ops`, beyond `ops.truncate`'s reach). `exec_escape` pops based on the truncated stack, leaving an orphan atom at the top. Track a `shadow: ScopeStack` mirror of the consumer view, synced at `parse_line` boundaries the same way `syntest` applies `replayed` + `ops`. `exec_escape` now emits a corrective Pop for any atoms exceeding the sum of `self.stack`'s `meta_scope` / gated `meta_content_scope` contributions. Drops `syntax_test_latex.tex: 76` from both `testdata/known_syntest_failures{,_fancy}.txt`.

`IncludeWithPrototype` in `MatchIter` pushed the included target on top of the prototype, and `MatchIter::next` reads the stack top (`ctx_stack[len-1]`) — so the target's patterns were iterated first and the external prototype's second. The parser's tie-break on match_start is strict `<`, so whichever rule is enumerated first wins a same-position match. ST's `apply_prototype` semantics and `ParseState::find_best_match`'s own `context.prototype` chaining (`chain(cur_prototype).chain(cur_context)`) both put prototype patterns ahead of the target. Swap the two pushes so the prototype lands on top of the stack and is iterated first. Concretely: in HAML `tag-attributes-content`, `ruby-code` does `include: scope:source.ruby.rails.embedded.haml apply_prototype: true`. Ruby-for-HAML's prototype injects HAML's `pipe-continuations` (match `|\s*$`). Before this fix, Ruby's bitwise-or rule (`[~|^]`) at the same position was iterated first and won the tie, so `|` at EOL got `keyword.operator.bitwise.ruby` instead of `punctuation.separator.continuation.haml`; the attribute braces popped at the newline, and every continuation assertion below cascaded. After the swap, the prototype's pipe-continuation wins the tie. Refs: trishume#631

Strengthens the existing `apply_prototype_includes_external_prototype` from build-only to parse-and-assert. Adds precedence, opt-out, and HAML-Rails end-to-end guards alongside it in `src/parsing/syntax_set.rs`. Refs: trishume#631

All 65 assertions in `syntax_test_rails.haml` pass after the apply_prototype ordering fix. Delta applies to both baselines.

ST's `text_point(row, col)` overflows past-EOL columns into the next row, so its syntax-test framework evaluates past-EOL assertions against the corresponding column on the next line. syntect's harness was instead testing against the consumed `\n`'s scope — silent divergence whenever the `\n` carried parent meta_scopes that the EOL pop chain dropped. Reorder the loop to parse-before-assert; thread the first post-target line's scopes into `process_assertions` (`examples/syntest.rs`); fall back to the previous behaviour when next-line scopes aren't available (EOF, replay path). Closes 17 `syntax_test_git_config` and 1 `syntax_test_clojure` stale baselines. Refs: trishume#631

Two cross-line branches failing on the same parse_line grew `flushed_ops` by append, so `ParseLineOutput::replayed` doubled and consumers that pair `replayed[i]` with the i-th pending line slid ops from one buffered line onto another's text. Observed as the byte-77 panic at `syntax_test_java.java` line 624. Track `flushed_ops_start` alongside `flushed_ops` and merge subsequent fails against the prior snapshot's range. See `ParseState::merge_flushed` docs for the composition rule. `known_syntest_failures{,_fancy}` absorb the unmasking: Python / TypeScript / Bash / Zsh files previously panicking now report their real path-1 counts. Java stays at `1` — next panic site is a pre-existing stale-`line_number` on branches created during replay, tracked as follow-up. Refs: trishume#631

Branches created while `handle_fail` re-parses a buffered past line snapshotted `self.line_number` / `self.pending_lines.len()`, which still reflect the *outer* `parse_line`'s current line. A later fail on the outer line would then see `bp.line_number == cur_line`, classify the branch as same-line, and apply the branch's replay-line-relative `match_start` to a shorter outer line — shipped as `byte index 20 out of bounds of " foo = BAR,\n"` on `syntax_test_java.java:10263` inside `@MultiLineAnnotation(...)`, and the matching byte-2 panic in `syntax_test_markdown.md`'s multi-line math blocks. Introduce a `replay_ctx: Option<ReplayCtx>` set around each inner `parse_line_inner*` call in both replay loops. Branch creation and `handle_fail`'s `cur_line` read through it, so branches born in the re-parse of line `L+i` record `line_number = L+i` and `pending_lines_snapshot_len = <slot for L+i>`. Baselines absorb the unmasking: TypeScript drops 230 to 12 (cascading replay-branch misclassifications fixed), Markdown moves 1 to 897 (the `1` was the byte-2 panic artefact; real count surfaces). `syntax_test_java.java` stays at `1`: a distinct pre-existing `NoClearedScopesToRestore` surfaces further into the same file, tracked as a follow-up. Refs: trishume#631

A `branch_point` born inside `handle_fail`'s cross-line replay recorded only the inner re-parse's local `res` Vec as its `prefix_ops`. When that nested branch later failed cross-line, its own replay reconstructed the line from an empty prefix and the captures emitted before the *outer* branch trigger vanished. Shipped as `[foo]: /url` losing its `meta.link.reference.def.markdown` and capture scopes whenever the outer `link-def-title-continuation` branch's `immediately-pop2` alt-1 spawned a nested `link-def-attr-continuation` whose own fail then replayed line 3 without the original LRD opener captures. Compose the first-line prefix (outer `prefix_ops` + new-alt meta/pat/capture/meta_content) up front in both cross-line replay paths, surface it via `ParseState::replay_prefix_ops`, and prepend it to inner branch creations' `prefix_ops`. Baselines: Markdown 897 → 565, TypeScript 12 → 0 (file disappears — `syntax_test_typescript.ts` exercised the same nested-replay shape). Refs: trishume#631

`parse_line` captured the buffered shadow snapshot BEFORE the line ran, and the syntest consumer captured `stack_before` similarly. A replay applied during that line's parse may have corrected ops for prior buffered lines, leaving the captured snapshot reflecting the uncorrected baseline. A LATER replay covering the same line then resets to that stale snapshot, re-applies the corrected ops on top, and resurrects any meta_scope the prior replay had unwound. Manifested as `meta.link.reference.def.markdown` leaking past back-to-back Markdown link reference definitions and polluting all subsequent paragraphs, code blocks, blockquotes, autolinks, footnotes, etc. for the rest of the file (~408 chars / 88 assertions in `syntax_test_markdown.md`). After applying replays in `parse_line`, overwrite each buffered `pending_line_start_shadows[start_idx + i + 1]` with the post-i shadow, and use the post-replay shadow as the snapshot for the current line being pushed. Mirror the same correction in `syntest`'s consumer loop on `parsed_line_buffer[..].stack_before`. Baselines: - Markdown 565 → 158 (the LRD-leak family) - Java 1 (panic) → 18953 (real failures unmasked — the `NoClearedScopesToRestore` panic that the same drift was triggering is gone) Refs: trishume#631

A `pop: N + branch_point` snapshots `stack_depth` pre-pop; the synthetic Set's post-Set retain (`bp.stack_depth <= final_len`) and `handle_fail`'s validity check (`stack.len() < bp.stack_depth`) both ignored that `pop_count`, dropping the freshly-created bp at creation. Same-line re-emit also missed the popped contexts' meta_scope clearance Pop — route it through `push_meta_ops` like the original push. Symptom: `meta.annotation.identifier.java meta.path.java` leaking past nested-annotation extends paths in `syntax_test_java.java`. Drops Java baseline 18953 -> 9956.

Mirrors the trishume#660 same-line fix into the cross-line branch — the bespoke re-emit of the new alternative's meta_scope/meta_content_scope was missing the popped contexts' Pop, leaking the popped meta_scope (annotation-qualified-identifier's meta_scope in Java) plus the surrounding declaration's meta_scope when an annotation crosses a line into a class/enum/interface declaration.

…ctions When an outer cross-line `fail`'s replay re-parses buffered lines, an inner cross-line `fail` firing during the loop writes its correction into `self.flushed_ops`. Previously, the outer's locally-computed `replayed_ops[i]` overwrote that correction via `merge_flushed`, freezing a stale interpretation for indices the inner had already corrected. Fixes the leak in `src/parsing/parser.rs::handle_fail` for both the alt-N and exhaustion cross-line paths. Repro: Java `@A.B\n(par=1)\nenum E {}\n` — the outer `declarations` fail's line-1 reparse froze the dotted annotation as `path` alt before the inner `annotation-qualified-identifier` fail's `name`-alt resolution landed. Drops Java syntest baseline 9935 → 9774; no regressions in other languages or in `Markdown` (still 158).

When a same-line branch_point exhausts at a zero-width lookahead, rewind the cursor to the BP's original position and skip the same-name Branch pattern on retry — letting the parent context's next rule fire instead of advancing past the lookahead, which let stale keyword rules match inside identifiers (`package` in `$package`, `class` in `Foo.class;`). Drops Java syntest 9774 → 1987 (-7787, -80%); jsp 39 → 0; Zsh 604 → 410. Markdown unchanged at 158. No regressions elsewhere. See parser.rs::handle_fail same-line exhaustion handler and the new `skipped_branches` field; new test `exhausted_branch_point_falls_through_to_parent_next_rule`.

`push_meta_ops`'s non-initial phase emitted the deep-context meta_scope/mcs Pops before restoring `cur_context.clear_scopes`. When the cleared atom belonged to one of the deeper contexts being popped, the Pops landed on the wrong (still-visible) scope — observed on Java's `case DayType when -> "incomplete"`, where `case-label-expression`'s `clear_scopes: 1` hid `case-label`'s `meta.case.java` and `case-label-end`'s `pop: 2` then popped the surrounding switch block off the consumer's stack. Move the cur_context Restore to before the depth loop so the previously-cleared atom is visible again when the deeper-context Pop lands on it. Drops Java syntest 1987 → 949 (-1038, additional -50%); fixes C#'s `syntax_test_GeneralStructure.cs` (was 2 → 0) and Haskell -1. Markdown unchanged at 158, no other regressions. See `parser.rs::push_meta_ops` Pop arm and the new test `pop_n_restores_clear_before_unwinding_deeper_meta_scopes`.

The YAML loader checked `set:`, `branch:`, and `embed:` after a `pop:` key but never `push:`. Combined `pop: N + push: X` rules degraded to a plain `Pop(N)` and silently dropped the push, leaving the parser on the outer context instead of the intended target. Affected rules in vendored syntaxes: Java's `pop: 2 + push: annotation-parameters-body` (lambda3 line 10069 and many others) and `pop: 1 + push: case-label-expression`; Python's `pop: 2 + push: function-parameter-list-body` and `type-parameter-list-body`. Java syntest 641 → 245 (-396); Python 66 → 45 (-21). Other language baselines unchanged.

The Set initial-phase Pop at parser.rs:1992 unconditionally popped `cur_context.meta_content_scope.len()` even when cur_context's mcs was never pushed because the context immediately below has `embed_scope_replaces=true`. This dropped the topmost wrapper-pushed embed_scope token. Mirrors the skip already in the Pop branch at parser.rs:1912. Markdown 158 -> 31; Python 45 -> 32 (free benefit).

Plain `set:` (no `pop_count`) into a target with `clear_scopes` emitted that Clear in `push_meta_ops`'s initial phase even when the leaving context carried its own `meta_scope` / `meta_content_scope`. Cur's ms sits on top of the visible stack at that point; Clear hid it instead of the parent atom the optimization was meant to strip. The non-initial Pop then ate atoms below cur's hidden ms, and the trailing Restore resurrected cur's ms — leaving cur's meta_scope where the parent's atom used to be. Bash repro `: ~/`: `~` set: `tilde-modifier` (clear+ms); `''` zero-width set: `tilde-modifier-username` (clear+mcs); `/` lookahead pops. ST scopes `/` as `meta.string.glob.shell string.unquoted.shell`; syntect emitted `meta.interpolation.tilde.shell string.unquoted.shell`. Fix: when cur has `meta_scope` or `meta_content_scope`, defer the single-context-set target Clear to the non-initial phase, after Pop+Restore (so Pop finds cur's ms visible and Restore brings the parent atoms back) and before pushing target's ms/mcs. The cur-empty case (Lisp `(defun fn (...)`, pinned by `v2_set_to_target_with_clear_scopes_clears_parent_meta_content_scope`) is unchanged. Net syntest: bash 249 → 30, zsh 410 → 25, java 245 → 221 on both regex backends; no other baseline lines change. New regression `cur_meta_scope_set_to_target_with_clear_scopes` mirrors the bash shape. Refs: trishume#631

Multi-context `set:` whose target body has both `clear_scopes: N` and a non-empty `meta_scope`, fired from a cur with no ms/mcs/clear, needs an extra atom dropped on the trigger token beyond Clear's reach. ST drops `N + 1` atoms on the trigger and `N` on the body content, anchoring the extra drop on the target's `meta_scope`. `push_meta_ops` previously kept both atoms on the trigger, leaking nested `meta.function.php` / `meta.function.return-type.php` into the `:` of PHP `function bye(): never {`. The fix emits a combined `Clear(N + 1)` in the initial phase and a paired `Restore` in the non-initial phase, leaving the body content's existing per-context Clear+Push to land it at the same place as before. Gated on the clear-bearing target carrying a non-empty `meta_scope` so syntaxes whose target has only `meta_content_scope` are unaffected — Zsh's `zsh-redirection-glob-range-end` (clear+mcs, no ms) on the `<` redirection trigger otherwise loses `source.shell.zsh` and `meta.function-call.arguments.shell`. PHP 1 -> 0.

push_meta_ops's `MatchOperation::Set` arm with `set_pop_count > 1` lumped target.ms + cur.ms + every popped deeper frame's mcs+ms into a single Pop. Per-frame clear_scopes were never restored — their cleared atoms stayed in clear_stack out of reach, and the new target's clear_scopes then bit one atom too deep. Observed on Python `r'''(?ix:some text(?-i:hello))(?iLmsux)(?a)foo'''`: the `(?ix:` rule's `pop: 3 + set:[group-body-extended, maybe-unexpected-quantifiers]` left `group-body-extended_outer`'s cleared `meta.mode.extended.regexp` in clear_stack; `group-body-extended_target`'s `clear_scopes: 1` then cleared `source.regexp.python` (the embed wrapper's mcs) instead of `mode_outer`. ST keeps `source.regexp.python` visible from col 22 through col 47+; syntect previously dropped it from col 27 onward. Split the lumped Pop into a head Pop (target.ms + cur.ms) and a per-depth Pop+Restore loop mirroring `MatchOperation::Pop` arm at parser.rs:1954-1971. New regression tests `pop_n_set_restores_deeper_frame_clear_scopes` (positive) and `pop_n_set_without_deeper_clear_scopes_unaffected` (negative gate against regressing Java's `pop:2 + push:annotation-parameters-body` shape). Refs: trishume#631

Resolved by per-depth clear_scopes Restore on pop:N + set:. Refs: trishume#631

`yaml_load`'s `parse_embed_op` was setting `embed_scope_replaces=true` on the wrapper unconditionally. That flag tells the per-target loop in `parser.rs` to suppress the next context's `meta_content_scope` push, to avoid duplicating the embedded syntax's top-level scope (auto- inserted into `main`'s mcs at `yaml_load.rs:706-713`) with the wrapper's last `embed_scope` atom. That dedup is only needed when the embed enters via `main`. Fragment embeds (e.g. `embed: scope:source.toml.embedded.python#toml`) bypass `main`, so the fragment context's mcs is independent of the syntax's top-level scope. Suppressing it strips a real grammar atom (TOML's `meta.mapping.toml`) and the next `clear_scopes:` then bites the wrapper instead of the intended grammar atom — leaking the wrapper out of every nested scope inside the embed. Mark the wrapper as `embed_scope_replaces=true` only when the embed target has no `#fragment`. Two regression tests: - `fragment_embed_preserves_target_meta_content_scope` (positive) - `non_fragment_embed_still_suppresses_main_mcs` (negative gate) The b31b727 test `embed_scope_replaces_preserves_wrapper_mcs_across_inner_set` is unaffected — Markdown's bash code-fence embed has no fragment. Python 32 -> 0 on both regex backends; no other baseline moves.

When a child syntax has multiple parents in `extends:` and the parents disagree on a shared context or variable, a parent's directly-defined entry now outranks another parent's inherited entry. Same-provenance ties still resolve last-wins. Fixes the indented zsh shebang in Markdown fenced blocks: `Zsh (for Markdown)` extends `[Bash (for Markdown), Zsh]`. Bash (for Markdown) owns a lenient `main` (`^(?=\s*#!)`); Zsh inherits Bash's strict column-0 main. The previous last-wins merge let Zsh's inherited main override, so the indented ` #!/usr/bin/env zsh` fell into the regular comments rule.

`get_line_assertion_details` recognised any line where the testtoken appeared mid-text and where valid assertion markers followed. ST's syntax-test format only allows assertions on dedicated comment-only lines, so source code preceding the testtoken means the markers are coincidental. The harness was processing such lines as assertions anyway, producing spurious failures and pinning `test_against_line_number` away from the source line so the *next* genuine assertion tested against stale scopes. Fix: early-return `None` when source code precedes the testtoken or non-whitespace follows the closing testtoken_end. The two bash repros are `: ${#^pattern}` (the `#` is the parameter-length operator) and `[ <<doc ] # <- ]` (a trailing comment whose body starts with `<-`). Doing this also exposed a latent bug in `only_whitespace_after_token_end`: `after_token_end` was the substring *from* the end-token, so the end-token glyphs themselves always counted as non-whitespace, and `/* ^ scope */` lines were silently classified as non-pure. Under the old gate this was harmless (the flag only fed the `parse_test_lines` path), but the early-return turned every C-style block-comment assertion into a non-assertion source line. Skip the end-token before checking the trailing content. Three new harness unit tests cover the corrected predicate and both shell repros. Two existing tests already exercise the pure-assertion path; their `is_pure_assertion_line` field assertions are now invariant-true at the constructor, but kept as documentation. Net syntest deltas (both regex backends): - Bash 30 -> 4 (residual: backtick `for...done` interaction) - Zsh 25 -> 10 (residual: zsh glob-range scoping) - Haskell 49 -> 43 Stacks on trishume#673.

The per-line search cache stored full-line `regex.search` results keyed by MatchPattern pointer, then reused them on every later search regardless of `search_end`. Inside an embed where `search_end` is clipped to the escape position, that reuse can flip rule outcomes whose lookaheads sit exactly at the boundary — the cached "no match" was computed against the escape glyph, but a fresh truncated search would see end-of-input there. Concretely: in `` `for i in $(seq 100); do echo $i; done` `` the `done{{cmd_break}}` rule (`done(?!cmd_char)`) was searched at the outer level with full-line text, where the lookahead saw the closing backtick (itself a cmd_char) and failed. That `None` was cached. Inside the backtick embed, with search_end clipped to the close, the cache short-circuited the lookup before the regex could re-run with end-of-input semantics, so `done` fell through to `cmd-name-body` and got `variable.function.shell` instead of `keyword.control.loop.end.shell`. Skip the cache lookup whenever `search_end < line.len()`. Insertion was already gated on `search_end == line.len()`, so the cache stays populated by full-line answers; truncated searches just re-run.

stefanobaghino added 30 commits April 26, 2026 08:24

Decrement Haskell syntest baseline after capture re-emit fix

1ee0e3c

Mechanical: Haskell drops from 72 to 50 failing assertions on both regex backends. No other baseline lines change.

Cover apply_prototype external-prototype precedence

69cd4b3

Strengthens the existing `apply_prototype_includes_external_prototype` from build-only to parse-and-assert. Adds precedence, opt-out, and HAML-Rails end-to-end guards alongside it in `src/parsing/syntax_set.rs`. Refs: trishume#631

Drop HAML-Rails from syntest baselines

347ea76

All 65 assertions in `syntax_test_rails.haml` pass after the apply_prototype ordering fix. Delta applies to both baselines.

Skip inner replay correction when nested deeper than outer BP

0a2139a

Drop python_strings.py from syntest baselines

61a796e

Resolved by per-depth clear_scopes Restore on pop:N + set:. Refs: trishume#631

Drop python.py from syntest baselines

d4b6a5b

stefanobaghino added 2 commits April 28, 2026 11:07

This was referenced Apr 28, 2026

Track remaining syntest failures after #630 #631

Open

Apply non-topmost set target's clear_scopes at trigger position #676

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bypass search cache on truncated searches at embed-escape boundary#675

Bypass search cache on truncated searches at embed-escape boundary#675
stefanobaghino wants to merge 32 commits into
trishume:masterfrom
stefanobaghino:631-clip-lookahead-at-embed-escape

stefanobaghino commented Apr 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

stefanobaghino commented Apr 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant