fix: filter subagent runtime wrappers from auto-capture by slj130 · Pull Request #444 · CortexReach/memory-lancedb-pro

slj130 · 2026-04-01T15:25:13Z

Summary

This PR prevents smart extraction from storing subagent runtime scaffolding such as [Subagent Context] and [Subagent Task] as long-term memories.

Closes #443.

What changed

filter wrapper-only subagent payloads during auto-capture ingress
strip runtime wrapper lines in stripEnvelopeMetadata() as a second defense
harden the extraction prompt so runtime scaffolding is explicitly excluded
add regression coverage for both envelope stripping and the auto-capture path
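The first two items in this list can be sketched as a small pipeline. This is only an illustration under stated assumptions: WRAPPER_LINE_RE, isWrapperOnly, stripWrapperLines, and prepareForExtraction are hypothetical names, and the actual regexes and helpers in index.ts and smart-extractor.ts may differ.

```typescript
// Hypothetical marker pattern; the real AUTO_CAPTURE_RUNTIME_WRAPPER_RE may differ.
// No "g" flag, so .test() keeps no state between calls.
const WRAPPER_LINE_RE = /^\s*\[(?:Subagent Context|Subagent Task)\].*$/i;

// True when every non-blank line of the payload is a wrapper line.
function isWrapperOnly(text: string): boolean {
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  return lines.length > 0 && lines.every((line) => WRAPPER_LINE_RE.test(line));
}

// Removes wrapper lines and keeps everything else.
function stripWrapperLines(text: string): string {
  return text
    .split(/\r?\n/)
    .filter((line) => !WRAPPER_LINE_RE.test(line))
    .join("\n")
    .trim();
}

// Layer 1 drops wrapper-only payloads at ingress;
// layer 2 strips leftover wrapper lines before extraction.
function prepareForExtraction(messages: string[]): string[] {
  return messages
    .filter((message) => !isWrapperOnly(message))
    .map((message) => stripWrapperLines(message))
    .filter((message) => message !== "");
}
```

With this split, a payload that mixes a wrapper header with real content loses only the header, while a payload made purely of wrapper lines never reaches extraction.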

Why this approach

This is primarily a preprocessing bug, not just a model-quality issue.
A stronger model might fail less often, but wrapper metadata should never reach the extraction stage in the first place.

Tests

node --test test/strip-envelope-metadata.test.mjs
node test/smart-extractor-branches.mjs

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: effabe284b

chatgpt-codex-connector · 2026-04-01T15:28:06Z

index.ts

+  if (AUTO_CAPTURE_RUNTIME_WRAPPER_RE.test(trimmed)) {
+    return "";


Limit runtime-wrapper drop to metadata-only messages

AUTO_CAPTURE_RUNTIME_WRAPPER_RE matches any text that starts with [Subagent Context] or [Subagent Task] and then stripAutoCaptureRuntimeWrappers() returns an empty string, so normalizeAutoCaptureText() drops the whole message. In agent_end ingestion this will discard legitimate user content whenever wrapper lines are prepended to the same payload (a format already seen in subagent envelopes), causing real facts to never reach extraction. The filter should only remove wrapper-only payloads (or strip wrapper lines) instead of nulling mixed-content messages.
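The failure mode described above can be reproduced with a minimal sketch. The regex shapes are assumptions based on the review comments (a start anchor followed by [\s\S]*$), not copies of the PR's code, and dropWholeMessage and stripWrapperLinesOnly are hypothetical names.

```typescript
// Assumed shape of the ingress regex: marker at the start, then anything to the end.
const DROP_ALL_RE = /^\[(?:Subagent Context|Subagent Task)\][\s\S]*$/i;
// Line-scoped alternative: "m" makes ^ and $ match at line boundaries.
const WRAPPER_LINE_RE = /^\[(?:Subagent Context|Subagent Task)\].*$/gim;

// Behavior flagged by the review: any message starting with a marker becomes "".
function dropWholeMessage(text: string): string {
  const trimmed = text.trim();
  return DROP_ALL_RE.test(trimmed) ? "" : trimmed;
}

// Suggested behavior: remove only the wrapper lines.
function stripWrapperLinesOnly(text: string): string {
  return text.replace(WRAPPER_LINE_RE, "").trim();
}

const mixed =
  "[Subagent Task] Summarize the thread.\nThe user's deploy target is us-east-1.";
const lost = dropWholeMessage(mixed);      // "" (the fact is discarded)
const kept = stripWrapperLinesOnly(mixed); // "The user's deploy target is us-east-1."
```

Both functions return an empty string for a wrapper-only payload, so the line-scoped version still filters pure scaffolding.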

rwmjhb · 2026-04-02T07:39:19Z

Review: APPROVE (rebase needed)

Good fix — filtering subagent runtime wrappers from auto-capture directly improves recall quality. The three-layer defense (ingress regex, stripEnvelopeMetadata, prompt guidance) is solid.

Before merge: rebase onto main — the reflection-bypass-hook test failure is pre-existing, unrelated to your changes.

Two things to be aware of (not blocking):

Ingress filter drops entire messages — stripAutoCaptureRuntimeWrappers returns empty string for the whole message when the first line matches [Subagent Context] or [Subagent Task]. If a message mixes wrapper lines with real user content (e.g., wrapper header + factual statements), the facts are silently lost. Consider stripping only the matching lines instead of the entire message.
stripEnvelopeMetadata only strips first line of multiline wrappers — the regex is line-scoped, so continuation lines from a multi-line [Subagent Context] block survive into the extraction prompt.
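The second point can be illustrated with a sketch that contrasts line-scoped stripping with block-aware stripping. The block format is an assumption made for this example: a wrapper block is taken to start at a marker line and run until the next blank line. The real wrapper layout may differ, and stripLineScoped and stripWrapperBlocks are hypothetical names.

```typescript
const WRAPPER_START_RE = /^\s*\[(?:Subagent Context|Subagent Task)\]/i;

// Line-scoped: removes only the lines that carry a marker.
function stripLineScoped(text: string): string {
  return text
    .replace(/^\[(?:Subagent Context|Subagent Task)\].*$/gim, "")
    .trim();
}

// Block-aware (assumed format): a marker line opens a block,
// and the first blank line after it closes the block.
function stripWrapperBlocks(text: string): string {
  const kept: string[] = [];
  let insideWrapper = false;
  for (const line of text.split(/\r?\n/)) {
    if (WRAPPER_START_RE.test(line)) {
      insideWrapper = true;
      continue;
    }
    if (insideWrapper) {
      if (line.trim() === "") insideWrapper = false;
      continue; // skip continuation lines and the closing blank line
    }
    kept.push(line);
  }
  return kept.join("\n").trim();
}

const sample =
  "[Subagent Context] You are running as a subagent.\n" +
  "Results auto-announce to the requester.\n" +
  "\n" +
  "User prefers dark mode.";
// stripLineScoped(sample) still contains the "Results auto-announce" line.
// stripWrapperBlocks(sample) returns "User prefers dark mode."
```

The trade-off is that block-aware stripping would discard real content if it follows a marker line without a blank line in between, so the right delimiter depends on how the runtime actually emits these wrappers.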

AliceLJY

Clean three-layer defense against subagent runtime wrapper leaking into durable memory. Reviewed:

index.ts — AUTO_CAPTURE_RUNTIME_WRAPPER_RE + stripAutoCaptureRuntimeWrappers(): The regex correctly matches messages that are entirely wrapper content and returns "" to skip auto-capture. The [\s\S]*$ anchoring is intentional here — at the auto-capture level, if a message starts with [Subagent Context] or [Subagent Task], the whole message is runtime scaffolding and should be discarded. Integration point in stripAutoCaptureInjectedPrefix() is in the right position (after metadata stripping, before further processing).

smart-extractor.ts — stripEnvelopeMetadata() gets a new step 0 that strips wrapper lines (not the whole text) using /gim flags. This is the correct granularity for the extraction stage — preserve real conversation that follows wrapper lines.

extraction-prompts.ts — Explicit LLM instruction to never store runtime scaffolding. Good safety net.

Tests — Both the unit test (strip-envelope-metadata.test.mjs) and the integration test (smart-extractor-branches.mjs) cover the right scenarios: wrapper-only messages get filtered at ingress, wrapper lines get stripped before extraction, and the LLM extraction prompt doesn't see wrapper content.

LGTM. Closes #443 cleanly.

…se 2)
- Extend stripEnvelopeMetadata() with 8 new patterns: <<<EXTERNAL_UNTRUSTED_CONTENT, <<<END EXTERNAL_UNTRUSTED_CONTENT, Sender/Conversation info (untrusted metadata), Thread starter, Forwarded message context, [Queued messages while agent was busy]
- Add ENVELOPE_NOISE_PATTERNS to noise-filter.ts for pre-embedding guard
- Add memory_store tool guard in tools.ts
- Add 8 regression test cases in strip-envelope-metadata.test.mjs
- Fix PR CortexReach#444 regex bug: subagent wrapper lines now stripped via entire-line matching (was leaving boilerplate on same line)
Fixes CortexReach#446

…se 2)
- Extend stripEnvelopeMetadata() with 8 new patterns: <<<EXTERNAL_UNTRUSTED_CONTENT, <<<END_EXTERNAL_UNTRUSTED_CONTENT, Sender/Conversation info (untrusted metadata), Thread starter, Forwarded message context, [Queued messages while agent was busy]
- Add ENVELOPE_NOISE_PATTERNS to noise-filter.ts for pre-embedding guard
- Add memory_store tool guard in tools.ts (strip-then-check approach)
- Add 8 regression test cases in strip-envelope-metadata.test.mjs
- Fix PR CortexReach#444 regex bug: subagent wrapper lines now stripped via entire-line matching (/^\[Subagent Context|Subagent Task\].*$/gm)
- P1 fix: remove pre-filter from filterNoiseByEmbedding (runs before stripEnvelopeMetadata in extraction path, would cause false positives)
- P2 fix: memory_store guard now strips first then checks if empty, preserving mixed-content messages
Fixes CortexReach#446
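A minimal sketch of the strip-then-check guard mentioned in the P2 fix, assuming a hypothetical guardMemoryStore helper (the real guard in tools.ts may look different). It also shows one thing worth checking about the regex as quoted in the commit message: in JavaScript, an ungrouped | splits the whole pattern, so the quoted form reads as "^\[Subagent Context" OR "Subagent Task\].*$". The grouped variant below is an assumption about the intended behavior, not the project's code.

```typescript
// Regex exactly as quoted in the commit message (alternation is ungrouped).
const UNGROUPED_RE = /^\[Subagent Context|Subagent Task\].*$/gm;
// Grouped variant, assumed to be the intended whole-line match.
const GROUPED_RE = /^\[(?:Subagent Context|Subagent Task)\].*$/gm;

function stripWrappers(text: string): string {
  return text.replace(GROUPED_RE, "").trim();
}

type GuardResult =
  | { ok: true; text: string }
  | { ok: false; reason: string };

// Strip first, then reject only if nothing is left.
// Mixed-content input is stored with the wrapper lines removed.
function guardMemoryStore(input: string): GuardResult {
  const stripped = stripWrappers(input);
  if (stripped === "") {
    return { ok: false, reason: "envelope metadata only" };
  }
  return { ok: true, text: stripped };
}

// Effect of the ungrouped pattern on a single wrapper line:
const leftoverContext = "[Subagent Context] x".replace(UNGROUPED_RE, ""); // "] x"
const leftoverTask = "[Subagent Task] y".replace(UNGROUPED_RE, "");       // "["
```

Checking for emptiness after stripping, rather than rejecting on the first marker match, is what keeps a message like a wrapper header followed by a user fact from being refused outright.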

chatgpt-codex-connector bot reviewed Apr 1, 2026

This was referenced Apr 2, 2026

auto-capture should strip subagent runtime wrappers before smart extraction #443

Closed

#394 — Envelope Metadata Leak into Memory #446

Open

AliceLJY approved these changes Apr 2, 2026

AliceLJY assigned rwmjhb Apr 2, 2026

fix: preserve content after subagent wrappers

91e6828

slj130 force-pushed the fix/filter-subagent-runtime-wrappers branch from effabe2 to 91e6828 Compare April 2, 2026 13:09

AliceLJY mentioned this pull request Apr 3, 2026

Unexpectedly high Jina usage with autoRecall + cross-encoder rerank during normal prompt-build flows #429

Open

rwmjhb merged commit 56dcc0a into CortexReach:master Apr 3, 2026
2 of 3 checks passed

jlin53882 mentioned this pull request Apr 3, 2026

fix: extend envelope stripping to broader channel/system markers #481

Open

jlin53882 mentioned this pull request Apr 4, 2026

fix: strip boilerplate continuation lines in stripLeadingRuntimeWrappers #499

Closed

rwmjhb merged 1 commit into CortexReach:master from slj130:fix/filter-subagent-runtime-wrappers
