feat(smart-search): boost title/narrative matches on 'who/what is X' queries by efenex · Pull Request #571 · rohitg00/agentmemory

efenex · 2026-05-20T15:26:00Z

Summary

For named-concept queries ("who is the careful generator?", "what is a circuit breaker", "what does eventual consistency mean?"), the BM25 hybrid ranker scores busier observations above records that name the concept directly — question scaffolding tokens ("who", "is", "the") add noise that dilutes the true match signal. The record that defines the concept ranks below records that mention it incidentally.

What it does

1. Detect the query as a named-concept pattern via 5 regexes (`/who is/`, `/what is/`, `/what's/`, `/what does X mean/`, `/who's/`). Skip if no match.
2. Extract the concept phrase (e.g. "careful generator"). Reject degenerate phrases: single tokens shorter than 3 chars (`it`, `x`) and phrases longer than 6 tokens.
3. Deepen the BM25 sweep to `limit*3` so the boost has candidates to re-rank (a boost on a top-10 set has limited room to move records around).
4. Re-rank with multiplicative boosts:
   - Title contains the phrase → 2.0×
   - Narrative contains the phrase → 1.3×
5. Apply the same treatment to lessons whose content contains the phrase (2.0×).
6. Re-sort by combined score, trim to original `limit`.

Non-named-concept queries are untouched.
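
As a rough illustration of the detection and extraction steps, here is a minimal sketch. The regex sources, the leading-article stripping, and the lower-casing are assumptions made for this example; the PR names the five query shapes and the rejection rules but does not quote the actual patterns, so the real `extractNamedConcept` may differ.

```typescript
// Hypothetical patterns: the PR lists five query shapes, not their exact regex source.
const NAMED_CONCEPT_PATTERNS: RegExp[] = [
  /^\s*who\s+is\s+(.+)$/i,
  /^\s*who's\s+(.+)$/i,
  /^\s*what\s+does\s+(.+?)\s+mean\b.*$/i,
  /^\s*what\s+is\s+(.+)$/i,
  /^\s*what's\s+(.+)$/i,
];

function extractNamedConcept(query: string): string | null {
  for (const pattern of NAMED_CONCEPT_PATTERNS) {
    const match = pattern.exec(query);
    if (!match) continue;

    // Assumed normalisation: drop trailing punctuation and a leading article,
    // then lower-case so later substring checks are case-insensitive.
    const phrase = match[1]
      .trim()
      .replace(/[?.!,;:]+$/, "")
      .replace(/^(the|a|an)\s+/i, "")
      .trim()
      .toLowerCase();

    const tokens = phrase.split(/\s+/).filter(Boolean);
    // Degenerate phrases: nothing left, more than 6 tokens,
    // or a single token shorter than 3 characters.
    if (tokens.length === 0 || tokens.length > 6) return null;
    if (tokens.length === 1 && tokens[0].length < 3) return null;
    return phrase;
  }
  return null; // not a named-concept query, so the caller leaves ranking untouched
}

console.log(extractNamedConcept("who is the careful generator?")); // "careful generator"
console.log(extractNamedConcept("what is it")); // null
```

Returning `null` for both "no pattern matched" and "phrase rejected" keeps the caller's check to a single branch, which fits the stated goal of leaving non-named-concept queries untouched.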

Why this lives in smart-search and not lineage

`mem::lineage` is chronologically-ordered and multi-channel; this is a ranking concern that affects the primary recall path (smart-search), which is what `memory_recall` / `memory_smart_search` MCP tools land on. Lineage benefits from upstream improvements in BM25 score, so this lift propagates.

Test plan

`npm test` passes
New unit tests for `extractNamedConcept` (7 cases) — pattern matching, degenerate-phrase rejection
New integration test that proves the boost re-ranks: an observation whose title contains "careful generator" but has lower BM25 score than a busier unrelated observation gets moved to rank #1
Non-named-concept query preserves original ordering (regression test)

Related

Discovered while working on the "careful generator" test case in #570 (`mem::lineage`), where it is documented as Gap 4.
Independent of #414 (time-range filtering for smart-search): different concern, different lines.

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Enhanced search ranking for conceptual queries: the system now detects "who is/what is/what does…" questions and re-ranks results so matching observations and lessons surface higher for concept-focused queries.
Tests
- Added coverage validating concept extraction and that conceptual queries trigger the expected result re-ranking while non-concept queries preserve original ordering.

For "who is X" / "what is X" / "what does X mean" queries, BM25 ranks busier observations above the records that actually name X: the question-scaffolding tokens add noise that dilutes the true match signal.

Pre-existing regression test: docs/plans/v4-lineage-test-case-careful-generator.md (the gap was exposed there, but the fix lives in smart-search rather than lineage, since smart-search is the lessons-first ranker used by the recall paths).

Approach: detect the question pattern at handler entry, extract the concept phrase, deepen the BM25 sweep to limit*3 so the boost has candidates, then post-multiply combinedScore by 2.0 for title matches and 1.3 for narrative matches, re-sort, and trim to limit. Lessons whose content names the concept get the same 2.0 title boost. Single-token / 6+ token phrases are skipped (degenerate). Original ordering is preserved on non-named-concept queries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel · 2026-05-20T15:26:04Z

@efenex is attempting to deploy a commit to the rohitg00's projects Team on Vercel.

A member of the Team first needs to authorize it.

coderabbitai · 2026-05-20T15:26:12Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7c0e356d-703c-46e5-a685-8be654d139f2

📥 Commits

Reviewing files that changed from the base of the PR and between d1fcb71 and 997d25d.

📒 Files selected for processing (2)

src/functions/smart-search.ts
src/types.ts

📝 Walkthrough

Adds extraction of named concepts from "who is / what is / what does X mean" queries and uses the concept to expand observation fetches and apply multiplicative boosts when the concept appears in observation titles/narratives or lesson content, then re-sorts and trims results.

Changes

Named-Concept Query Detection and Ranking Boost

| Layer / File(s) | Summary |
|---|---|
| Named-concept extraction and boost constants (`src/functions/smart-search.ts`, `test/smart-search.test.ts`) | `extractNamedConcept()` parses "who is/what is/what does ... mean/who's ..." queries with regex, trims punctuation, filters degenerate short matches, and introduces title/body boost multipliers. Unit tests verify extraction and null cases. |
| Smart-search pipeline boost and re-ranking (`src/functions/smart-search.ts`, `src/types.ts`, `test/smart-search.test.ts`) | `mem::smart-search` detects named concepts, increases observation fetch size (`min(limit*3, 100)`), runs hybrid observation search and lesson recall in parallel, marks `CompactLessonResult.boostMatched` in `recallLessons`, applies multiplicative boosts to observation `combinedScore` and lesson `score` when concept matches, re-sorts results, and truncates back to `limit`. Integration tests assert boosted re-ranking and stable ordering for non-matching queries. |
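
The fetch-size expansion and the final re-sort and truncation described in the table can be sketched roughly as follows. This is only an illustration: `fetchSize`, `rerankAndTrim`, and `ScoredObservation` are hypothetical names, and the boost is passed in as a callback so the sketch does not assume how the real handler computes it. The cap of 100 is taken from the walkthrough's `min(limit*3, 100)`.

```typescript
// Hypothetical shape: only combinedScore is named in the PR.
interface ScoredObservation {
  id: string;
  combinedScore: number;
}

// Cap taken from the walkthrough's min(limit*3, 100).
const MAX_FETCH = 100;

// Fetch more candidates only when a named concept was detected,
// so the boost has records to promote from below the original cutoff.
function fetchSize(limit: number, concept: string | null): number {
  return concept === null ? limit : Math.min(limit * 3, MAX_FETCH);
}

// Multiply each score by its boost, sort descending, trim back to `limit`.
function rerankAndTrim<T extends ScoredObservation>(
  candidates: T[],
  limit: number,
  boost: (candidate: T) => number,
): T[] {
  return candidates
    .map((c) => ({ ...c, combinedScore: c.combinedScore * boost(c) }))
    .sort((a, b) => b.combinedScore - a.combinedScore)
    .slice(0, limit);
}

const candidates: ScoredObservation[] = [
  { id: "busy", combinedScore: 1.0 },
  { id: "named", combinedScore: 0.6 },
];
const top = rerankAndTrim(candidates, 1, (c) => (c.id === "named" ? 2.0 : 1));
console.log(top[0].id); // "named" (0.6 * 2.0 = 1.2 beats 1.0)
```

With `limit = 1` and no deeper sweep, the "named" record would never have been fetched, which is why the expansion happens before the boost rather than after.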

Sequence Diagram

sequenceDiagram
  participant Query
  participant extractNamedConcept
  participant hybridSearch
  participant lessonRecall
  participant boostProcessor
  participant returnSorted

  Query->>extractNamedConcept: parse query -> concept or null
  extractNamedConcept-->>Query: concept|null
  Query->>hybridSearch: run observation search (expanded limit if concept)
  Query->>lessonRecall: run lesson recall (pass boostPhrase)
  hybridSearch-->>boostProcessor: observations with combinedScore
  lessonRecall-->>boostProcessor: lessons with boostMatched flag
  boostProcessor->>boostProcessor: multiply observation combinedScore for title/body matches
  boostProcessor->>boostProcessor: multiply lesson score when boostMatched or content includes concept
  boostProcessor->>returnSorted: re-sort and truncate to limit
  returnSorted-->>Query: final observations and lessons

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

rohitg00/agentmemory#473: Adds compact lesson inclusion and recallLessons/CompactLessonResult plumbing that this PR extends with boostMatched and named-concept ranking.

Poem

🐰 I sniff a phrase, nimble and bright,

"Who is" hops in and sets things right.
Titles gleam with a joyful boast,
Search finds the thing I needed most.
A hopping cheer — code and carrot toast!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately describes the main feature: adding a boost mechanism for 'who/what is X' named-concept queries that improves ranking by prioritizing title and narrative matches. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

test/smart-search.test.ts (1)
331-335: ⚡ Quick win

Tighten this assertion so dual-match regressions actually fail.

obsNamed already contains "careful generator" in both title and narrative, but the test only asserts score > 1.0. That still passes with a single applied boost, so it won't catch the bug in the new re-ranker. Either remove the narrative match from the fixture for a pure title-only case, or assert the full expected multiplier for a dual-match case.

Also applies to: 387-389
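
To make the nitpick concrete, the sketch below shows why a `> 1.0` assertion cannot distinguish a single applied boost from a compounded one, while an exact expected multiplier can. The constant names and values come from the review comments in this thread; `boostMultiplier` and the sample strings are hypothetical and stand in for the real re-ranker and fixtures.

```typescript
const NAMED_CONCEPT_TITLE_BOOST = 2.0;
const NAMED_CONCEPT_BODY_BOOST = 1.3;

// Hypothetical helper: compounds both boosts instead of using if / else if.
function boostMultiplier(title: string, narrative: string, phrase: string): number {
  let mult = 1;
  if (title.toLowerCase().includes(phrase)) mult *= NAMED_CONCEPT_TITLE_BOOST;
  if (narrative.toLowerCase().includes(phrase)) mult *= NAMED_CONCEPT_BODY_BOOST;
  return mult;
}

const phrase = "careful generator";
const titleOnly = boostMultiplier("The careful generator", "unrelated notes", phrase);
const dualMatch = boostMultiplier(
  "The careful generator",
  "The careful generator retries once",
  phrase,
);

// Both values satisfy a weak `> 1.0` check, so that check cannot catch
// a re-ranker that silently drops the narrative boost.
console.log(titleOnly > 1.0, dualMatch > 1.0); // true true

// An exact expectation separates the two cases.
const expectedDual = NAMED_CONCEPT_TITLE_BOOST * NAMED_CONCEPT_BODY_BOOST; // 2.6
console.log(Math.abs(dualMatch - expectedDual) < 1e-9); // true
console.log(Math.abs(titleOnly - expectedDual) < 1e-9); // false
```

Comparing with a small tolerance rather than `===` is a defensive choice for floating-point scores; either of the two fixes suggested in the comment (title-only fixture, or exact dual-match expectation) would close the gap.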
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/smart-search.test.ts` around lines 331 - 335, The test fixture obsNamed
created via makeObs currently contains "careful generator" in both title and
narrative, which makes the weak assertion (score > 1.0) insufficient; either
remove the phrase from the narrative so the fixture is a title-only match and
keep the simple assertion, or tighten the assertion to check the full expected
boosted score for a dual-match (compute and assert the exact expected
multiplier/threshold instead of >1.0). Update the corresponding duplicate
assertions mentioned (around the second occurrence at lines 387-389) to use the
same fix and reference obsNamed/makeObs when locating the fixture and
assertions.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/functions/smart-search.ts`:
- Around line 151-156: The current boost logic uses the truncated preview in
rawLessons, so named-concept matching misses occurrences beyond the 240-char
cutoff; update the scoring to operate on the full lesson text before any preview
truncation by either (A) running this phrase includes check against the
untruncated field returned by recallLessons (e.g., use the original full content
property such as fullContent or contentFull instead of the previewed content) or
(B) change recallLessons to preserve a fullContent field on each lesson and use
that field in the map that adjusts score (referencing rawLessons, lessons,
phrase, and NAMED_CONCEPT_TITLE_BOOST). Ensure the boost is applied using the
full text and only truncate for presentation after ranking is complete.
- Around line 143-145: The current logic in smart-search that sets mult using an
if/else if (checking title.includes(phrase) then else if
narrative.includes(phrase)) prevents applying both NAMED_CONCEPT_TITLE_BOOST and
NAMED_CONCEPT_BODY_BOOST when both title and narrative match; change it to
compute the multiplier by starting mult = 1 and multiplying by
NAMED_CONCEPT_TITLE_BOOST if title.includes(phrase) and by
NAMED_CONCEPT_BODY_BOOST if narrative.includes(phrase), then return r unchanged
when mult === 1 else return { ...r, combinedScore: r.combinedScore * mult } so
dual matches get the product of both boosts (use the existing symbols title,
narrative, phrase, mult, NAMED_CONCEPT_TITLE_BOOST, NAMED_CONCEPT_BODY_BOOST, r,
combinedScore).



ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c24364d3-8993-4417-a12c-9c0c02cd7c30

📥 Commits

Reviewing files that changed from the base of the PR and between 93d1bdd and d1fcb71.

📒 Files selected for processing (2)

src/functions/smart-search.ts
test/smart-search.test.ts

…l content

CodeRabbit caught two issues on rohitg00#571:

1. The boost branch used `if (title) ... else if (narrative) ...`, capping observations that contain the concept in BOTH fields at the title-only 2.0× multiplier. The feature is specified as multiplicative, so title-and-narrative matches now compound to 2.0 × 1.3 = 2.6×. Single-field matches behave as before.
2. The lesson boost path was scanning the 240-char preview emitted by recallLessons, not the lesson's full pre-truncation content. Any concept that appeared past the preview boundary silently missed the boost. Fix: thread the concept phrase into recallLessons via a new `boostPhrase` parameter. The function now decides the match against `content + context` BEFORE truncation, stamps each result with `boostMatched: boolean`, and the smart-search caller uses that flag instead of re-scanning the preview.

`boostMatched` added as an optional field on CompactLessonResult. Callers that don't pass `boostPhrase` get `boostMatched: false`; the smart-search caller falls back to scanning the (truncated) content for the phrase if `boostMatched` is absent, preserving the pre-fix behavior for any non-smart-search caller of recallLessons.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
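
The second fix, matching before truncation and carrying the result as a flag, might look roughly like the sketch below. It is an assumption-laden illustration: `Lesson`, `toCompactLesson`, and `boostLesson` are hypothetical names, and the real `recallLessons` signature and `CompactLessonResult` fields are not shown in this thread beyond `boostPhrase`, `boostMatched`, `content`, `context`, and `score`.

```typescript
const NAMED_CONCEPT_TITLE_BOOST = 2.0;
const PREVIEW_CHARS = 240; // preview length mentioned in the commit message

// Hypothetical input shape for a stored lesson.
interface Lesson {
  id: string;
  content: string;
  context?: string;
  score: number;
}

// Only boostMatched (optional) is confirmed by the PR; other fields are assumed.
interface CompactLessonResult {
  id: string;
  content: string;
  score: number;
  boostMatched?: boolean;
}

// Decide the match against the full text, then truncate for presentation.
function toCompactLesson(lesson: Lesson, boostPhrase?: string): CompactLessonResult {
  const fullText = `${lesson.content} ${lesson.context ?? ""}`.toLowerCase();
  const boostMatched =
    boostPhrase !== undefined && fullText.includes(boostPhrase.toLowerCase());
  return {
    id: lesson.id,
    content: lesson.content.slice(0, PREVIEW_CHARS),
    score: lesson.score,
    boostMatched,
  };
}

// Prefer the flag; fall back to scanning the preview only when it is absent.
function boostLesson(result: CompactLessonResult, phrase: string): CompactLessonResult {
  const matched = result.boostMatched ?? result.content.toLowerCase().includes(phrase);
  return matched ? { ...result, score: result.score * NAMED_CONCEPT_TITLE_BOOST } : result;
}

// The phrase sits past the 240-char preview, yet the flag still records the match.
const lesson: Lesson = { id: "l1", content: "x".repeat(300) + " careful generator", score: 1 };
const compact = toCompactLesson(lesson, "careful generator");
console.log(compact.content.includes("careful generator")); // false
console.log(boostLesson(compact, "careful generator").score); // 2
```

Stamping the flag at the point where the full text is still available means ranking never depends on the preview length, and truncation stays a presentation-only concern.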

coderabbitai Bot reviewed May 20, 2026

efenex wants to merge 2 commits into rohitg00:main from efenex:feat/v4-b-smart-search-named-concept-boost
