Problem
`search_concepts` uses case-insensitive substring matching against concept names generated at ingestion time. If the agent's query phrasing doesn't overlap with the LLM-generated concept name, Ls is a complete miss — the agent silently falls back to the L0→L1→L2 path, adding 2-3 tool calls and proportionally more tokens.
This is the core failure mode of the fast path. Examples:
- Agent searches "RAG" → concept was indexed as "retrieval augmented generation" → miss
- Agent searches "maturity levels" → concept was indexed as "SBOM maturity model" → miss
- Agent searches "actor model" → concept was indexed as "Ray, Orleans, Akka frameworks" → miss
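A minimal sketch of the failure mode (function signature and index contents are illustrative, not the actual implementation): plain case-insensitive substring matching has zero tolerance for abbreviations or paraphrases.

```python
# Hypothetical sketch of the current Ls matching behavior: the query must
# appear verbatim (case-insensitively) inside the stored concept name.
def search_concepts(query: str, concept_names: list[str]) -> list[str]:
    q = query.lower()
    return [name for name in concept_names if q in name.lower()]

index = ["retrieval augmented generation", "SBOM maturity model"]

search_concepts("RAG", index)              # [] -- abbreviation, complete miss
search_concepts("maturity levels", index)  # [] -- "levels" never appears in the name
search_concepts("sbom", index)             # ["SBOM maturity model"] -- exact substring, hit
```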
There's no visibility into how often this happens. We don't know the Ls hit rate in real usage, which means we can't measure the actual token savings vs. the theoretical maximum.
Impact
Every Ls miss converts a 2-read fast path into a 4-5 read slow path. For a library of 15 books, this is the difference between ~4.5k and ~25k cumulative input tokens per query (the cost simulation numbers from the README assume Ls hits).
Proposed solution
Two complementary approaches:
- Ingestion-time aliases: when generating the concept index, prompt the LLM to also produce 2-3 alternative names/abbreviations per concept. Store them as an `aliases` field on `ConceptEntry`. Substring match checks all aliases.
- Query-time expansion: lightweight LLM call to expand the search query into 3-5 candidate concept names before running substring match. More robust but adds latency — worth measuring whether the token savings from hitting Ls offset the expansion cost.
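The ingestion-time alias approach might look like this (field names beyond `aliases`/`ConceptEntry` are assumed for illustration): the match loop simply checks the name plus every alias.

```python
from dataclasses import dataclass, field

@dataclass
class ConceptEntry:
    name: str
    aliases: list[str] = field(default_factory=list)  # proposed new field

def search_concepts(query: str, entries: list[ConceptEntry]) -> list[ConceptEntry]:
    # Case-insensitive substring match against the name AND all aliases.
    q = query.lower()
    return [
        e for e in entries
        if q in e.name.lower() or any(q in a.lower() for a in e.aliases)
    ]

rag = ConceptEntry(
    "retrieval augmented generation",
    aliases=["RAG", "retrieval-augmented generation"],  # LLM-generated at ingestion
)
search_concepts("RAG", [rag])  # hit via alias instead of a silent miss
```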
Both need a hit-rate metric first: log Ls queries and outcomes to know the baseline.
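The baseline metric could be as simple as one JSONL line per Ls lookup (file name and helper names are placeholders): hit rate is then the fraction of logged queries with a nonempty match list.

```python
import json
import time

def log_ls_query(query: str, hits: list[str], path: str = "ls_metrics.jsonl") -> None:
    # Append one record per Ls lookup; "hit" is True when any concept matched.
    with open(path, "a") as f:
        f.write(json.dumps({
            "ts": time.time(),
            "query": query,
            "hit": bool(hits),
            "matches": hits,
        }) + "\n")

def hit_rate(path: str = "ls_metrics.jsonl") -> float:
    # Fraction of logged Ls queries that matched at least one concept.
    with open(path) as f:
        rows = [json.loads(line) for line in f]
    return sum(r["hit"] for r in rows) / len(rows) if rows else 0.0
```

Comparing this rate before and after each change would show whether aliases or query expansion actually move the needle.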
Labels
enhancement, concept-index