feat(i18n): add German memory trigger patterns and retrieval triggers #489
Banger455 wants to merge 8 commits into CortexReach:master
AliceLJY left a comment
The German trigger additions in index.ts are well-crafted (good use of \b word boundaries, solid false-positive guards in tests). However, src/adaptive-retrieval.ts has a critical encoding corruption that breaks existing functionality.
Blocking: UTF-8 mojibake in adaptive-retrieval.ts
The existing CJK characters and emoji in SKIP_PATTERNS and FORCE_RETRIEVE_PATTERNS have been corrupted. The diff shows:
SKIP_PATTERNS — emoji corrupted:
- /^(...|👍|👎|✅|❌)\s*[.!]?$/i,
+ /^(...|ð|ð|â|â)\s*[.!]?$/i,
SKIP_PATTERNS — CJK corrupted:
- /^(...|实施|實施|开始|開始|继续|繼續|好的|可以|行)\s*[.!]?$/i,
+ /^(...|宿½|實æ½|å¼å§|éå§|ç»§ç»|ç¹¼çº|好ç|å¯ä»¥|è¡)\s*[.!]?$/i,
FORCE_RETRIEVE_PATTERNS — CJK corrupted:
- /(你记得|[你妳]記得|之前|上次|以前|还记得|還記得|提到过|提到過|说过|說過)/i,
+ /(ä½ è®°å¾|[ä½ å¦³]è¨å¾|ä¹å|䏿¬¡|以å|è¿è®°å¾|éè¨å¾|æå°è¿|æå°é|说è¿|說é)/i,
Also corrupted: The comment on L77 ("你记得吗" → "ä½ è®°å¾å"), the ？ full-width question mark on L72/L81 (？ → ï¼), and German umlauts in the new patterns (weißt → weiÃt, früher → frÃ¼her, kürzlich → kÃ¼rzlich).
This is classic UTF-8-as-Latin-1 mojibake. The file was likely saved or committed with wrong encoding. This will break all existing Chinese retrieval triggers and emoji skip patterns at runtime.
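The corruption described above can be reproduced in a few lines. The following is a minimal sketch, not code from the PR: the helper names `utf8ReadAsLatin1` and `repairMojibake` are made up for illustration, and the sketch assumes a runtime that provides the standard `TextEncoder`/`TextDecoder` globals (Node.js or a browser).

```typescript
// Hypothetical helper: encode text as UTF-8, then misread every byte as one
// ISO-8859-1 (Latin-1) character. This mimics the mojibake in the diff.
function utf8ReadAsLatin1(text: string): string {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes
  let out = "";
  for (let i = 0; i < bytes.length; i++) {
    out += String.fromCharCode(bytes[i]); // byte value == Latin-1 code point
  }
  return out;
}

// Hypothetical helper: undo the damage by turning each character back into a
// byte and decoding the bytes as UTF-8. Only meaningful when every character
// of the input is <= U+00FF, i.e. the text really is byte-per-character debris.
function repairMojibake(garbled: string): string {
  const bytes = new Uint8Array(garbled.length);
  for (let i = 0; i < garbled.length; i++) {
    bytes[i] = garbled.charCodeAt(i) & 0xff;
  }
  return new TextDecoder("utf-8").decode(bytes);
}

const garbled = utf8ReadAsLatin1("früher");
console.log(garbled);                 // frÃ¼her
console.log(repairMojibake(garbled)); // früher
```

Because a regex containing `frÃ¼her` can never match the input `früher`, every pattern that went through this transformation silently stops firing.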
What needs to happen
- Re-save `src/adaptive-retrieval.ts` with correct UTF-8 encoding. Make sure your editor/git config preserves UTF-8.
- Verify the German additions (`weißt du noch`, `früher`, `kürzlich`) also come through as proper UTF-8.
- The test file has the same mojibake in test descriptions (the `Ã` artifacts), though those are cosmetic since assertions test against the imported functions, not source text.
Non-blocking notes on index.ts (looks good)
- The German trigger regexes in `MEMORY_TRIGGERS` are well-structured
- `\b` word boundaries correctly prevent `Zimmermann`/`Schwimmerin` false positives
- The scoped `immer` pattern (`immer\s+(?:wenn|daran|denken|merken|beachten)`) is a smart improvement over the bare `immer` suggested in #393
- Test coverage is thorough (27 positive, 10 negative, 12 retrieval, explicit-remember consistency)
Please fix the encoding issue and force-push. Happy to re-review after.
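To make the word-boundary notes above concrete, here is a small sketch comparing a bare `immer`, a bounded `\bimmer\b`, and the scoped pattern quoted in the review. The sample sentences are invented, and the leading `\b` on the scoped pattern is an assumption; the exact regexes in `index.ts` may differ.

```typescript
// Three variants of the "immer" trigger. Only the scoped alternation is quoted
// from the review; the leading \b and the sample sentences are assumptions.
const bareImmer = /immer/i;
const boundedImmer = /\bimmer\b/i;
const scopedImmer = /\bimmer\s+(?:wenn|daran|denken|merken|beachten)/i;

const samples: string[] = [
  "Herr Zimmermann kommt morgen",    // compound word containing "immer"
  "Die Schwimmerin trainiert",       // compound word containing "immer"
  "Das ist immer so",                // standalone "immer", no memory intent
  "Immer wenn ich deploye, erst die Tests laufen lassen",
];

for (const s of samples) {
  console.log(s, {
    bare: bareImmer.test(s),
    bounded: boundedImmer.test(s),
    scoped: scopedImmer.test(s),
  });
}
```

The bare pattern fires on both compound words, the bounded one still fires on "Das ist immer so", and only the scoped one is limited to the last sentence. One caveat worth keeping in mind: JavaScript's `\b` is defined over ASCII word characters, so umlauts and `ß` count as non-word characters. For example, `/\bber\b/` matches inside `über`.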
Thanks for the thorough review @AliceLJY! Fixed in the latest commit.
AliceLJY left a comment
Thanks for fixing the encoding in the source files (index.ts, adaptive-retrieval.ts) — those are clean now ✅
However, the test file test/german-i18n-triggers.test.mjs still has mojibake:
| Line | Corrupted | Should be |
|---|---|---|
| 33 | fÃ¼r spÃ¤ter | für später |
| 41 | Ã¼ber CI | über CI |
| 45 | fÃ¼r dev | für dev |
| 193 | WeiÃt du noch | Weißt du noch |
| 229 | erwÃ¤hnt | erwähnt |
Critical: Line 193 tests shouldSkipRetrieval("WeiÃt du noch...") — this passes by coincidence (returns false because it doesn't match ANY pattern, not because the German weißt du noch trigger fired). The test gives false confidence without actually validating the German retrieval trigger.
Please re-save the test file with proper UTF-8 encoding. The fix for the source files worked perfectly — just need to apply the same treatment to the test file.
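The "passes by coincidence" problem can be shown with a stripped-down model. Everything below is a hypothetical stand-in: the real `shouldSkipRetrieval` and pattern lists in `src/adaptive-retrieval.ts` are larger, and `forcesRetrieval` is a helper invented here to expose the branch that actually fired.

```typescript
// Hypothetical, simplified stand-ins for the real pattern lists.
const FORCE_RETRIEVE_PATTERNS: RegExp[] = [
  /\b(?:erinnerst du dich|weißt du noch)\b/iu,
];
const SKIP_PATTERNS: RegExp[] = [/^(?:ok|ja|danke)\s*[.!]?$/iu];

// Invented helper: true only when a force-retrieve pattern fires.
function forcesRetrieval(query: string): boolean {
  return FORCE_RETRIEVE_PATTERNS.some((p) => p.test(query));
}

// Simplified model: forced retrieval wins, then skip patterns, then a
// default of "do not skip".
function shouldSkipRetrieval(query: string): boolean {
  const q = query.trim();
  if (forcesRetrieval(q)) return false;
  if (SKIP_PATTERNS.some((p) => p.test(q))) return true;
  return false;
}

const intact = "Weißt du noch, welche Datenbank wir nutzen?";
const corrupted = "WeiÃt du noch, welche Datenbank wir nutzen?";

// Both calls return false, so asserting on shouldSkipRetrieval alone cannot
// tell a working trigger from a broken test input.
console.log(shouldSkipRetrieval(intact), shouldSkipRetrieval(corrupted));
// Only the direct check distinguishes the two inputs.
console.log(forcesRetrieval(intact), forcesRetrieval(corrupted));
```

In this model, a test that asserts `shouldSkipRetrieval(...) === false` passes for both strings, because falling through to the default gives the same answer as a forced retrieval. Asserting on the force-retrieve match itself, or pairing the test with an input that would otherwise be skipped, would catch the corrupted string.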
Review: feat(i18n): add German memory trigger patterns and retrieval triggers
Good direction: German users currently get zero capture/retrieval support. The implementation code (
Must Fix
1. Test file has UTF-8 mojibake; tests pass by coincidence
AliceLJY flagged this in their second review; still unfixed.
2. German memories classified as
Nice to Have
Both issues addressed in latest commits: @AliceLJY — @rwmjhb —
Ready for re-review.
rwmjhb left a comment
Review: feat(i18n): add German memory trigger patterns and retrieval triggers
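Fixing the silent loss of memory triggers for German users is the right direction. However, the test file has a UTF-8 encoding problem that needs to be fixed first:
Must Fix
- Test file mojibake: several German characters in `test/german-i18n-triggers.test.mjs` are corrupted: `fÃ¼r` (should be `für`), `WeiÃt` (should be `Weißt`), `erwÃ¤hnt` (should be `erwähnt`). These tests pass only because the strings happen to match no pattern at all, not because they actually validate the German triggers. AliceLJY already pointed this out in the 4/4 review.
- Category classification: German auto-captured memories are still classified as `other` and are demoted to generic working-memory, which weakens the practical effect of this feature.
Scope Drift
- In `src/adaptive-retrieval.ts`, the blank-line removal and comment reordering in `normalizeQuery()` are unrelated to German i18n.
- The `SKIP_PATTERNS` regex flag change from `/i` to `/iu` is a functional change that may affect existing CJK pattern matching, but the PR description does not mention it.
Please fix the test file encoding and make sure the German triggers are actually exercised by the tests.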
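Since the `/i` to `/iu` change is flagged as functional, a short sketch of what the `u` flag does and does not change may help reviewers. The patterns below are illustrative only and are not copied from `SKIP_PATTERNS`.

```typescript
// Umlauts already fold under plain /i: "ü".toUpperCase() is the single
// character "Ü", so both variants match.
const umlautPlain = /früher/i.test("FRÜHER");    // true
const umlautUnicode = /früher/iu.test("FRÜHER"); // true

// Capital sharp s (ẞ, U+1E9E) folds to ß only under /iu, which uses
// Unicode simple case folding.
const sharpPlain = /heißt/i.test("HEIẞT");    // false
const sharpUnicode = /heißt/iu.test("HEIẞT"); // true

// Neither flag folds ß to "SS": regex case folding is one-to-one.
const doubleS = /heißt/iu.test("HEISST"); // false

// /u also switches matching from UTF-16 code units to code points, which
// matters for `.`, character classes and quantifiers applied to emoji.
const dotPlain = /^.$/.test("👍");    // false: the emoji is two code units
const dotUnicode = /^.$/u.test("👍"); // true: one code point

// Literal alternations of BMP characters, such as the CJK skip words,
// give the same result with or without /u.
const cjkSame =
  /^(好的|可以)$/i.test("好的") === /^(好的|可以)$/iu.test("好的"); // true

console.log({ umlautPlain, umlautUnicode, sharpPlain, sharpUnicode, doubleS, dotPlain, dotUnicode, cjkSame });
```

Under these rules, plain alternations of CJK words should be unaffected by the flag change, while any pattern that applies `.`, a character class, or a quantifier directly to an emoji would behave differently and deserves a second look.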
Three follow-up commits addressing all review feedback: 9e56f31 1ca6c2b cbf9f2b
Also updated the PR description with an Incidental Changes section documenting the
Follow-up fix after deep review:
- fix(ci): restore full stripEnvelopeMetadata. The previous commit used a brace counter that matched a } inside a regex literal as the function's closing brace, truncating steps 1-4 (System timestamps, metadata sections, JSON block stripping, blank-line collapse). This restores the complete function with the two-step subagent fix plus all original metadata stripping steps.
- fix(i18n): use /iu flag on detectCategory entity branch. The entity branch regex contains heiße/heißt, which needs Unicode case-folding (/iu) to match all-caps input. Changed /i to /iu on that branch.
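The brace-counter failure mentioned in the first fix is easy to reproduce. The sketch below is an illustration of the failure mode only: `naiveBodyEnd` and the sample `strip` function are invented here and are not the project's tooling or the real `stripEnvelopeMetadata`.

```typescript
// Invented helper: find the end of a function body by counting braces,
// with no awareness of strings, comments, or regex literals.
function naiveBodyEnd(source: string, openIndex: number): number {
  let depth = 0;
  for (let i = openIndex; i < source.length; i++) {
    if (source[i] === "{") {
      depth++;
    } else if (source[i] === "}") {
      depth--;
      if (depth === 0) return i; // first point where braces balance
    }
  }
  return -1;
}

// Made-up sample source: the regex literal contains a closing brace.
const sampleSource = [
  "function strip(text) {",
  "  text = text.replace(/^\\}+/, '');",
  "  return text.trim();",
  "}",
].join("\n");

const naiveEnd = naiveBodyEnd(sampleSource, sampleSource.indexOf("{"));
const realEnd = sampleSource.lastIndexOf("}");
const extracted = sampleSource.slice(0, naiveEnd + 1);

console.log(naiveEnd < realEnd);            // true: stopped inside the regex
console.log(extracted.includes("return"));  // false: the body was truncated
```

The counter balances at the `}` inside the regex literal, so everything after that line is cut off. Extracting function bodies reliably generally needs a real tokenizer or parser rather than character counting.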
- Expand AUTO_CAPTURE_EXPLICIT_REMEMBER_RE with German and English patterns
- Add 5 German trigger regexes to MEMORY_TRIGGERS array
- Covers: remember/merk dir, preferences, decisions, personal facts, temporal markers
Previous commit introduced mojibake due to encoding mismatch in the web editor. This commit restores correct UTF-8 for all CJK characters, emoji, and German umlauts (weißt, früher, kürzlich).
Fix mojibake in test descriptions and assertions: ü/ä/ö/ß/— now correctly encoded. Critical: line 193 "Weißt du noch" now actually validates the German retrieval trigger instead of passing by coincidence.
Add German keywords to all four detectCategory() branches so German memories are classified correctly (preference/decision/entity/fact) instead of falling through to "other". Addresses rwmjhb review feedback.
66a7651 to 852e4f3
Hey, rebased onto current master and cleaned things up a bit:
So the diff is now just 3 files (index.ts, adaptive-retrieval.ts, test file): pure i18n, nothing else. CI will probably need an "Approve and run" from a maintainer since this is a fork PR. Would appreciate a re-review when you get a chance!
- Add \b word boundaries to `ich will`, `ich mag` in MEMORY_TRIGGERS to prevent substring matches (e.g. "Ich willkommen")
- Add \b word boundary to `entschieden` in detectCategory() to prevent matching "unentschieden"
- Require `ich` prefix for `bevorzuge` in detectCategory() preference branch, consistent with other German patterns
- Extend FORCE_RETRIEVE_PATTERNS with `damals`, `letzte Woche/Zeit`
- Add detectCategory() test coverage for German (preference, decision, entity, fact + substring false-positive prevention)
- Add false-positive regression tests (willkommen, unentschieden)
- 69 tests total (was 57), all passing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
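The false-positive guards listed in this commit can be sketched as follows. This is a heavily simplified, hypothetical `detectCategory`: the category names come from the PR discussion, but the keyword lists, branch order, and the omitted `fact` branch are assumptions and do not reflect the real function.

```typescript
type Category = "preference" | "decision" | "entity" | "fact" | "other";

// Hypothetical, reduced version: three German branches plus the fallback.
// The "fact" branch is left out to keep the sketch short.
function detectCategory(text: string): Category {
  // Preference: requires the "ich" prefix; \b stops "Ich willkommen".
  if (/\bich\s+(?:bevorzuge|mag|will)\b/iu.test(text)) return "preference";
  // Decision: \b on both sides stops "unentschieden".
  if (/\b(?:entschieden|beschlossen)\b/iu.test(text)) return "decision";
  // Entity: contains ß, hence /iu as discussed in the follow-up fix.
  if (/\bich\s+heiße\b|\bmein\s+name\s+ist\b/iu.test(text)) return "entity";
  return "other";
}

console.log(detectCategory("Ich bevorzuge Tabs statt Spaces"));         // preference
console.log(detectCategory("Wir haben uns für Postgres entschieden"));  // decision
console.log(detectCategory("Ich heiße Anna"));                          // entity
console.log(detectCategory("Das Spiel endete unentschieden"));          // other
console.log(detectCategory("Ich willkommen"));                          // other
```

The last two calls are the regression cases named in the commit message: without the word boundaries, `unentschieden` would be classified as a decision and `Ich willkommen` as a preference.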
Documents OpenClaw instance setup, active agents, PR CortexReach#489 status, local paths and working rules as persistent context for future sessions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Split from #406 – this PR focuses only on German i18n support.
Changes
`index.ts`
- `AUTO_CAPTURE_EXPLICIT_REMEMBER_RE` with German (`merk dir`, `vergiss nicht`, `nicht vergessen`) and English (`remember this`) patterns
- `MEMORY_TRIGGERS`: explicit-remember, preferences, decisions, personal facts, temporal markers
- `\b` word boundaries to prevent false positives on compound words (e.g. Zimmermann, Schwimmerin)

`src/adaptive-retrieval.ts`
- `FORCE_RETRIEVE_PATTERNS`: conversational recall (`erinnerst du dich`, `weißt du noch`) and temporal cues (`gestern`, `neulich`, `kürzlich`)

`test/german-i18n-triggers.test.mjs` (new)

Test plan
- `node --test test/german-i18n-triggers.test.mjs` passes all 54 cases
- `node --test test/`)

Closes the i18n portion of feat(i18n): German language support + fix fire-and-forget race in agent_end #406.
Incidental Changes (noted for reviewers)
- `src/adaptive-retrieval.ts` – `normalizeQuery()`
- `src/adaptive-retrieval.ts` – `SKIP_PATTERNS` regex flag `/i` → `/iu` `ß`, `ü`, `ä`).