Skip to content

feat(reflection): Option B BM25 neighbor expansion for fresh sessions (Issue #513)#529

Closed
jlin53882 wants to merge 1 commit intoCortexReach:masterfrom
jlin53882:feature/option-b-v4
Closed

feat(reflection): Option B BM25 neighbor expansion for fresh sessions (Issue #513)#529
jlin53882 wants to merge 1 commit intoCortexReach:masterfrom
jlin53882:feature/option-b-v4

Conversation

@jlin53882
Copy link
Copy Markdown
Contributor

Summary

Implements Option B Phase 1: BM25 Neighbor Expansion for Reflection Slices (Issue #513).

Motivation

Following AliceLJY's guidance in Issue #513:

Go with Option B (query-time BM25 as ranking signal) for Phase 1

  • No schema change
  • Works for all entries
  • Enrichment happens before the slice cutoff
  • Reversible: gate behind a config flag

Architecture

loadAgentReflectionSlicesFromEntries()
  └── expandDerivedWithBm25BeforeRank()  ← NEW
  └── rankReflectionLines()

Key design decisions:

  • Top-N derived candidates (N=5) used as BM25 queries
  • BM25 neighbors prepended before derived candidates
  • Quality boost: quality = 0.2 + 0.6 * bm25Score (range [0.2, 0.8])
  • Reflection category filtered out (avoid self-matching)
  • Neighbor timestamp inherits from parent candidate (fixes aggregation text override)

Config

{
  "plugins": {
    "memory-lancedb-pro": {
      "bm25NeighborExpansion": {
        "enabled": true,
        "maxCandidates": 5,
        "maxNeighborsPerCandidate": 3
      }
    }
  }
}

Changes

File Delta
src/reflection-store.ts +186 lines — core expandDerivedWithBm25BeforeRank(), types
index.ts +34/- lines — config field + await on async calls
test/bm25-neighbor-expansion.test.mjs +454 lines — 17 unit tests
test/memory-reflection.test.mjs +32/- lines — async/await compatibility
.gitattributes +6 lines — consistent LF endings

Test Results

BM25 expansion tests:    17 pass, 0 fail
memory-reflection tests:  32 pass, 1 fail (pre-existing, unrelated)

Related

…#513)

- Add expandDerivedWithBm25BeforeRank() for fresh session support
- Add bm25NeighborExpansion config in parsePluginConfig
- Add await to loadAgentReflectionSlicesFromEntries calls
- Fix: candidateTimestamp propagation (neighbors inherit parent timestamp)
- Fix: enabled guard before scopeFilter guard
- Add bm25-neighbor-expansion.test.mjs (17 tests)
- Fix: memory-reflection.test.mjs async tests (8 tests)
@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

Copy link
Copy Markdown
Collaborator

@AliceLJY AliceLJY left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Option B BM25 neighbor expansion for Issue #513,supersedes #523。整体 LGTM。

架构合理点:

  • loadAgentReflectionSlicesFromEntries sync→async 是必要的(BM25 search 是 I/O),index.ts 调用方已正确 await
  • 6 个防御机制(D1-D6)完整:空数组 early return、无 bm25Search bypass、scopeFilter guard、16 条 cap、120 char truncation、neighbors prepend
  • BM25 error fail-safe:catch + warn 不阻塞 reflection pipeline
  • Promise.all 并行搜索,性能好
  • Config 通过 parsePluginConfig 正确解析,defaults 合理(top-5 candidates, 3 neighbors each)

测试覆盖: 454 行单元测试,覆盖 D1-D6 每个防御点 + dedup + category filter + config override。Mock bm25Search 对此函数是合适的隔离策略。

Minor note: .gitattributes(line ending normalization)跟 BM25 feature 无关,理想情况下可以是独立 commit,但不 blocking。

LGTM ✅

@jlin53882
Copy link
Copy Markdown
Contributor Author

Hi @rwmjhb - B-1 (PR #529) has been approved by AliceLJY. Before we proceed with B-2 implementation, we have posted a proposal on Issue #445 for your review:

#445 (comment)

Key questions for B-2:

  1. Should neighbor enrichment be gated behind a config flag, or always-on?
  2. Should we use bm25Search instead of vectorSearch (to match B-1 approach)?
  3. Is the proposed pipeline order (NE before MMR) acceptable?

Please let us know if you have any feedback on the B-2 proposal. Thank you!

@rwmjhb
Copy link
Copy Markdown
Collaborator

rwmjhb commented Apr 6, 2026

Review Summary

Thanks for working on the BM25 neighbor expansion for fresh sessions. After a multi-round automated review (7 rounds, Claude + Codex adversarial), we're closing this PR. Here's why:

Blocking Issues

  1. Core feature doesn't work for fresh sessionsexpandDerivedWithBm25BeforeRank() returns [] immediately when derived.length === 0, so sessions with zero prior reflection history (the stated use case) still get no BM25 neighbors. The call-site comment claims otherwise.

  2. TypeScript compilation failurecandidateTimestamp is captured in searchPromises but dropped in the .then() callback (({ hits, queryText, normalizedKey })), so result.candidateTimestamp is undefined at the assignment site.

Additional Concerns

  • XL-sized change with low value-to-complexity ratio (value score: 29%)
  • Default-on behavior adds extra BM25 queries for all existing users on upgrade
  • Tests hard-code the broken fresh-session behavior instead of the intended feature path
  • Duplicate derived lines can consume multiple BM25 query slots before dedup

Recommendation

The problem (fresh sessions bypassing BM25 expansion) is real and worth solving, but this implementation needs a different approach — specifically, handling the derived.length === 0 case rather than returning early. Consider a smaller, focused PR that addresses just the fresh-session path with the compile issue fixed.

Closing for now. Feel free to reopen with a revised approach.

@jlin53882
Copy link
Copy Markdown
Contributor Author

Following the contribution workflow guidance, I've opened a focused question issue specifically about the fresh-session BM25 query material problem:

Issue #538: [Question] Fresh Session BM25 Expansion: What should be used as query material when derived empty?

This issue:

Waiting for maintainer guidance before implementing any changes.

jlin53882 added a commit to jlin53882/memory-lancedb-pro that referenced this pull request Apr 6, 2026
- PR CortexReach#529 CLOSED by rwmjhb
- Issue CortexReach#513/538 updated with B-2 focus and conflict analysis
- Waiting for maintainer resolution before implementing
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants