feat(lineage): mem::lineage primitive — chronological concept retrieval across all channels by efenex · Pull Request #570 · rohitg00/agentmemory

efenex · 2026-05-20T15:25:17Z

Summary

Adds a new MCP primitive `mem::lineage` that answers "when did this term enter the corpus, and what surrounded it?" — chronologically ordered hits for a phrase across observation, memory, lesson, and summary channels, with optional adjacent-turn enrichment and graph-neighbor attachment.

Distinct from existing retrieval:

`mem::search` ranks by relevance (BM25 hybrid).
`mem::smart-search` is the lessons-first ranker.
`mem::lineage` is time-ordered, multi-channel, and enrichment-rich — the right tool for "trace this term" / "what was the first mention" workflows.

What's included

New REST endpoint `POST /agentmemory/lineage` (REST endpoint count 124 → 125)
New MCP tool `memory_lineage` in CORE_TOOLS (CORE 15 / total 54)
Channels: observation, memory, lesson, summary — opt in/out per call via `channels: ["observation","lesson"]`
Time bounds: `since` / `until` ISO timestamps
Adjacent turns: for observation hits, attach previous user prompt + previous assistant action (opt-in via `includeAdjacentTurns`, default true)
Graph neighbors: when `includeGraph: true`, attach graph-edge neighbors for matching nodes
Order: `asc` (default, oldest first) or `desc`
firstMention: convenience field pointing at the earliest timestamp in the filtered set
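A minimal sketch of the request shape implied by the list above. Field names (`query`, `channels`, `since`, `until`, `limit`, `includeAdjacentTurns`, `includeGraph`, `order`) come from this PR; the `LineageRequest` / `ResolvedLineageRequest` interfaces and the `withLineageDefaults` helper are hypothetical names used only for illustration, and the `includeGraph: false` default is an assumption (the description only says neighbors are attached when it is `true`).

```typescript
type LineageChannel = "observation" | "memory" | "lesson" | "summary";

// Request fields as described in the PR; all but `query` are optional.
interface LineageRequest {
  query: string;
  channels?: LineageChannel[];
  since?: string; // ISO timestamp
  until?: string; // ISO timestamp
  limit?: number;
  includeAdjacentTurns?: boolean;
  includeGraph?: boolean;
  order?: "asc" | "desc";
}

// Same request with the documented defaults filled in.
interface ResolvedLineageRequest extends LineageRequest {
  channels: LineageChannel[];
  includeAdjacentTurns: boolean;
  includeGraph: boolean;
  order: "asc" | "desc";
}

const ALL_CHANNELS: LineageChannel[] = ["observation", "memory", "lesson", "summary"];

// Hypothetical helper: applies the defaults stated in the description.
function withLineageDefaults(req: LineageRequest): ResolvedLineageRequest {
  return {
    ...req,
    channels: req.channels ?? [...ALL_CHANNELS], // omitted channels means all channels
    includeAdjacentTurns: req.includeAdjacentTurns ?? true, // documented default
    includeGraph: req.includeGraph ?? false, // assumed default
    order: req.order ?? "asc", // documented default: oldest first
  };
}

// Illustrative body for POST /agentmemory/lineage (values are made up).
const exampleBody = withLineageDefaults({
  query: "careful generator",
  channels: ["observation", "lesson"],
  since: "2026-05-01T00:00:00Z",
});
```

The resolved object is what a caller would serialize as the JSON body; the response shape is not shown here because the PR text only names its fields (timeline, totals, firstMention, graphNeighbors).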

Gap-2 fix bundled

BM25 sweep cap raised from `min(limit*4, 500)` to `min(limit*20, 5000)`. Large jsonl-imported sessions (10k+ observations) have deep-in-session references that didn't rank into the channel-filtered top N at the old cap. With wide channel-filtering, the sweep needs more headroom.
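To make the headroom change concrete, here is the arithmetic of the two caps, `min(limit*4, 500)` before and `min(limit*20, 5000)` after, as plain functions. The function names are illustrative and do not come from the PR.

```typescript
// Number of BM25 candidates swept before channel filtering, old and new rule.
const oldSweepCap = (limit: number): number => Math.min(limit * 4, 500);
const newSweepCap = (limit: number): number => Math.min(limit * 20, 5000);

// For a request with limit = 50:
const before = oldSweepCap(50); // 200 candidates
const after = newSweepCap(50); // 1000 candidates
```

At `limit = 50` the sweep grows from 200 to 1000 candidates, and the hard ceiling moves from 500 to 5000, which is what lets deep references in a 10k+ observation session survive channel filtering.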

Test plan

`npm test` passes (1140+ tests)
Mechanical smoke: `mem::lineage` with all 4 channels, `firstMention` populated, timeline sorted ASC, channel totals correct, adjacent turns attached, sourceFile extraction works, no-match returns empty arrays, empty query returns 400
Docs in `docs/plans/v4-lineage-design.md` + `docs/plans/v4-lineage-test-case-careful-generator.md`
Tool counts in README + AGENTS.md + boot message + test/mcp-standalone.test.ts updated

The "careful generator" test case in `docs/plans/v4-lineage-test-case-careful-generator.md` exposed two follow-up improvements which are addressed in separate PRs:
- #570 v4-b: smart-search named-concept ranker boost (for "who is X" / "what is X" queries)
- (forthcoming) v5-a: `mem::query` composable retrieval pipeline (uses `mem::lineage` as one producer in a composable DSL)

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added lineage retrieval to trace a phrase/term across observation, memory, lesson, and summary channels with time bounds, channel filtering, ordering, and optional adjacency/graph enrichment; exposed via a new REST endpoint and tool.
Documentation
- Updated docs and README/AGENTS stats to 54 MCP tools and 125 REST endpoints; added lineage design and test-case guidance.
Tests
- Adjusted tests for the added MCP tool.

Returns chronologically-sorted hits across observation/memory/lesson/summary channels, answering "when did this term enter the corpus and what surrounded it?". Includes BM25 sweep over obs+memory, substring scan for lessons/summaries, optional adjacent-turn enrichment, and optional graph-neighbor attachment.

Gap-2 fix bundled: BM25 sweep cap raised from min(limit*4, 500) to min(limit*20, 5000) so deep in-session refs in large jsonl-imported sessions (10k+ obs) still rank into the channel-filtered top N.

Wires:
- src/functions/lineage.ts (new)
- mem::lineage MCP tool in CORE_TOOLS
- POST /agentmemory/lineage REST endpoint
- AuditEntry operation: + "query"
- LineageChannel / TimelineItem / LineageGraphNeighbor / LineageResult types
- design + test-case docs under docs/plans/

Counts bumped to keep README/AGENTS/boot message/test in sync: CORE_TOOLS 12 → 13, total MCP tools 51 → 52, REST endpoints 121 → 122.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel · 2026-05-20T15:25:22Z

@efenex is attempting to deploy a commit to the rohitg00's projects Team on Vercel.

A member of the Team first needs to authorize it.

coderabbitai · 2026-05-20T15:25:46Z

📝 Walkthrough

Implements mem::lineage: a chronological phrase-lineage retrieval across observation, memory, lesson, and summary channels with session enrichment, optional adjacent-turn reconstruction and graph neighbors, plus HTTP (POST /agentmemory/lineage) and MCP exposure, types, tests, and design/test docs.

Changes

mem::lineage Retrieval Feature

| Layer / File(s) | Summary |
| --- | --- |
| Design & Test Case Documentation: `docs/plans/v4-lineage-design.md`, `docs/plans/v4-lineage-test-case-careful-generator.md` | v4-A design doc specifies the mem::lineage request/response contract, multi-channel retrieval algorithm (BM25 for observation/memory, KV substring for lessons/summaries), enrichment strategy (session metadata, adjacent turns, graph neighbors), implementation files, and validation criteria. Companion test case doc defines the "careful generator" regression scenario with observed behavior and follow-up priorities. |
| Type Definitions & Contracts: `src/types.ts` | Adds LineageChannel union, TimelineItem interface with optional enrichment (memory, session, adjacentTurns), LineageGraphNeighbor interface, LineageResult interface, and extends AuditEntry.operation union to include "query". |
| Core Lineage Retrieval Implementation: `src/functions/lineage.ts` | Implements mem::lineage SDK function handler: validates/normalizes query and time/channel/ordering parameters, performs BM25-backed retrieval for observation/memory and KV substring scan for lessons/summaries, merges and sorts timeline items by timestamp with deterministic tie-breaking, enriches items with session metadata via per-session caching, optionally computes adjacentTurns by backward-walking observations, calculates per-channel totals and firstMention, optionally builds graph neighbors from graph KV data, audits, and returns LineageResult. |
| HTTP API Endpoint Registration: `src/triggers/api.ts` | Registers POST /agentmemory/lineage endpoint, validates request body for required query and optional limit/channels/order fields, forwards a whitelisted payload to mem::lineage, and maps upstream `{ error }` (without timeline) to HTTP 400; otherwise returns HTTP 200 with result. |
| MCP Tool Handler & Registry: `src/mcp/tools-registry.ts`, `src/mcp/server.ts` | Adds `memory_lineage` tool entry to CORE_TOOLS (query, time bounds, channels, limit, includeAdjacentTurns, includeGraph, order) and implements MCP handler that validates args, parses optional params (CSV channels), clamps limit, triggers mem::lineage, and returns result as MCP content. |
| Worker Initialization & Documentation Updates: `src/index.ts`, `test/mcp-standalone.test.ts`, `AGENTS.md`, `README.md` | Imports and registers lineage on startup, updates readiness log REST endpoint count to 125, updates test asserting CORE_TOOLS length to 15, and updates AGENTS.md/README.md stats to MCP tools = 54 and REST endpoints = 125. |
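The "merges and sorts timeline items by timestamp with deterministic tie-breaking" step in the table can be sketched roughly as follows. The `TimelineItem` fields shown and the `(channel, id)` tie-breaker are assumptions for illustration; the actual comparator in `src/functions/lineage.ts` is not quoted in this PR page.

```typescript
type LineageChannel = "observation" | "memory" | "lesson" | "summary";

// Reduced, hypothetical item shape: just enough to sort on.
interface TimelineItem {
  id: string;
  channel: LineageChannel;
  timestamp: string; // ISO 8601
}

// Ascending comparator: timestamp first, then channel, then id (assumed tie-breaker).
function compareTimelineItems(a: TimelineItem, b: TimelineItem): number {
  const ta = Date.parse(a.timestamp);
  const tb = Date.parse(b.timestamp);
  if (ta !== tb) return ta - tb;
  if (a.channel !== b.channel) return a.channel < b.channel ? -1 : 1;
  if (a.id !== b.id) return a.id < b.id ? -1 : 1;
  return 0;
}

// Merge per-channel hit lists, sort them, then apply the requested order.
function mergeTimeline(
  hitLists: TimelineItem[][],
  order: "asc" | "desc" = "asc",
): TimelineItem[] {
  const merged = hitLists.flat().sort(compareTimelineItems);
  return order === "asc" ? merged : merged.reverse();
}
```

Because the comparator defines a total order, the result does not depend on which channel's hits arrive first, and `desc` is the exact reverse of `asc`.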

Sequence Diagram

sequenceDiagram
  participant Client as HTTP/MCP Client
  participant Handler as Lineage Handler
  participant BM25 as BM25 Index
  participant KV as KV Store
  participant Session as Session Cache
  participant Graph as Graph Data
  Client->>Handler: query, limit, channels, time range
  Handler->>BM25: BM25 search (observation/memory)
  BM25-->>Handler: timeline hits
  Handler->>KV: List/scan lessons & summaries
  KV-->>Handler: additional items
  Handler->>Handler: Sort by timestamp + tie-break
  Handler->>Session: Lookup session metadata
  Session-->>Handler: Session info (cached)
  Handler->>Handler: Enrich items with session
  Handler->>Handler: (optional) Compute adjacentTurns
  Handler->>Graph: Match query tokens to node names
  Graph-->>Handler: Matched nodes + edges
  Handler-->>Client: LineageResult (timeline, totals, firstMention, graphNeighbors)
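The optional adjacentTurns step in the diagram is described elsewhere in this PR as a backward walk over observations that attaches the previous user prompt and previous assistant action. A minimal sketch of that idea, assuming a hypothetical `Observation` shape with a `role` field (the real observation type is not shown on this page):

```typescript
// Hypothetical observation shape, in session order (oldest first).
interface Observation {
  id: string;
  role: "user" | "assistant";
  text: string;
}

interface AdjacentTurns {
  previousUserPrompt?: string;
  previousAssistantAction?: string;
}

// Walk backward from the hit and keep the nearest user and assistant entries.
function findAdjacentTurns(session: Observation[], hitIndex: number): AdjacentTurns {
  const turns: AdjacentTurns = {};
  for (let i = hitIndex - 1; i >= 0; i--) {
    const obs = session[i];
    if (obs === undefined) continue;
    if (obs.role === "user" && turns.previousUserPrompt === undefined) {
      turns.previousUserPrompt = obs.text;
    } else if (obs.role === "assistant" && turns.previousAssistantAction === undefined) {
      turns.previousAssistantAction = obs.text;
    }
    // Stop early once both neighbors are found.
    if (turns.previousUserPrompt !== undefined && turns.previousAssistantAction !== undefined) {
      break;
    }
  }
  return turns;
}
```

A hit at the start of a session simply gets an empty object, so callers can treat both fields as optional.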

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

rohitg00/agentmemory#330: Related via REST endpoint count/document updates overlapping endpoint additions.

Poem

🐰 I hop through snippets, sessions, and time,
Tracing a phrase in memory's rhyme,
Stitching turns and graph-lit trails,
A lineage found where recall prevails,
📜✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: introduction of a new mem::lineage primitive for chronological concept retrieval across multiple channels. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/functions/lineage.ts`:
- Around line 366-373: The calculation of earliest/firstMention is using the
page-truncated array trimmed, so when order === "desc" and limit removed older
items it incorrectly omits earlier hits; update the logic that sets
earliest/firstMention to derive from the full filtered set (e.g., filteredHits
or the pre-trimmed array used before slicing) rather than from trimmed so that
firstMention always points to the true earliest timestamp in the filtered set
regardless of order or pagination (adjust references to earliest, firstMention,
trimmed, and order accordingly).

In `@src/mcp/server.ts`:
- Around line 289-299: The handler currently accepts non-integer limits and any
order string; tighten validation by ensuring limit is a finite integer within
[1,500] (use Number.isInteger on the parsed value from asNumber(args.limit)) and
return/raise a clear MCP validation error if it fails instead of silently
clamping; likewise, validate args.order against an explicit allowed set (e.g.,
allowedOrders = ['asc','desc'] or the app-specific enum) before assigning
payload.order and return/raise a validation error for unknown values. Update the
validation around the asNumber(args.limit) usage and the payload.order
assignment before calling sdk.trigger so invalid inputs are rejected at the MCP
boundary.

In `@src/triggers/api.ts`:
- Around line 1009-1042: The code currently forwards the raw req.body to
sdk.trigger for function_id "mem::lineage"; instead construct a whitelisted
payload object from the validated local variable body (not req.body) including
only the allowed fields (query, limit, channels, order) — normalize order to
lower-case and ensure limit is a number and channels is an array of strings —
then call sdk.trigger({ function_id: "mem::lineage", payload: payload }) so no
unvalidated fields are passed downstream.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dc313b12-f9f6-49dd-974e-8841efbe68ad

📥 Commits

Reviewing files that changed from the base of the PR and between 93d1bdd and 3ae80e4.

📒 Files selected for processing (11)

AGENTS.md
README.md
docs/plans/v4-lineage-design.md
docs/plans/v4-lineage-test-case-careful-generator.md
src/functions/lineage.ts
src/index.ts
src/mcp/server.ts
src/mcp/tools-registry.ts
src/triggers/api.ts
src/types.ts
test/mcp-standalone.test.ts

Three real issues caught in review:

1. firstMention computed from `trimmed` (post-limit page) instead of `items` (entire filtered set). When `order:desc` + a small `limit` truncated a session with many hits, the reported firstMention was the oldest-in-page, not the actual earliest filtered hit. Switch to `items` so the semantic contract holds regardless of page size.
2. MCP boundary (memory_lineage in src/mcp/server.ts) accepted any non-integer `limit` and any `order` string. Now: validate `limit` is a positive integer (400 otherwise), validate `order` is "asc"|"desc" (400 otherwise), filter `channels` to the known enum before forwarding.
3. REST boundary (api::lineage in src/triggers/api.ts) was forwarding raw `req.body` after validation, which leaks caller-controlled keys to the downstream function. Build a whitelisted payload from the validated fields only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
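The boundary fixes in items 2 and 3 can be illustrated with one small function that validates `limit` and `order` and copies only known fields into a fresh payload. This is a sketch under assumptions, not the code in `src/mcp/server.ts` or `src/triggers/api.ts`: `buildLineagePayload`, `LineagePayload`, and the error strings are hypothetical, and the sketch enforces only "positive integer" for `limit` as the commit message states (the review suggested an upper bound of 500, which is omitted here).

```typescript
interface LineagePayload {
  query: string;
  limit?: number;
  channels?: string[];
  order?: "asc" | "desc";
}

type BuildResult =
  | { ok: true; payload: LineagePayload }
  | { ok: false; status_code: 400; error: string };

const reject = (error: string): BuildResult => ({ ok: false, status_code: 400, error });

// Build a whitelisted payload from an untrusted body; unknown keys are never copied.
function buildLineagePayload(body: Record<string, unknown>): BuildResult {
  const rawQuery = body["query"];
  const query = typeof rawQuery === "string" ? rawQuery.trim() : "";
  if (query === "") return reject("query is required");
  const payload: LineagePayload = { query };

  const limit = body["limit"];
  if (limit !== undefined) {
    if (typeof limit !== "number" || !Number.isInteger(limit) || limit < 1) {
      return reject("limit must be a positive integer");
    }
    payload.limit = limit;
  }

  const rawOrder = body["order"];
  if (rawOrder !== undefined) {
    const order = typeof rawOrder === "string" ? rawOrder.toLowerCase() : "";
    if (order === "asc" || order === "desc") {
      payload.order = order;
    } else {
      return reject("order must be asc or desc");
    }
  }

  const channels = body["channels"];
  if (channels !== undefined) {
    if (!Array.isArray(channels) || !channels.every((c) => typeof c === "string")) {
      return reject("channels must be an array of strings");
    }
    payload.channels = channels as string[];
  }
  return { ok: true, payload };
}
```

Constructing the payload field by field, rather than deleting unwanted keys from the incoming body, means a newly invented caller key can never reach the downstream function by accident.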

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/functions/lineage.ts`:
- Around line 372-377: The current selection of earliest (const earliest = order
=== "asc" ? items[0] : items[items.length - 1]) causes firstMention to flip when
multiple items share the same timestamp because it just picks opposite ends;
instead compute earliest deterministically by finding the minimal timestamp
across items and then applying a stable tie-breaker (e.g., compare channelId or
messageId lexicographically) so selection does not depend on the order
parameter—update the logic around earliest/firstMention in lineage.ts
(referencing variables items, order, earliest, firstMention) to scan items for
the minimum (timestamp, channelId/messageId) tuple and pick that entry.

In `@src/mcp/server.ts`:
- Around line 286-307: The current code silently drops invalid channel tokens by
computing validChannels from channels and only setting payload.channels when
validChannels.length>0; instead, when the caller provided a channels value but
none are valid we should reject the request with a 400 error. Modify the logic
around the validChannels computation in src/mcp/server.ts (the block that
defines validChannels and sets payload.channels) to check if args.channels (or
channels) was supplied and validChannels.length === 0, and return a 400 response
(e.g., { status_code: 400, body: { error: "invalid channels" } }) rather than
omitting payload.channels; otherwise set payload.channels = validChannels as
before.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c4aa8dde-8d3f-44b4-8256-f210cff0cc93

📥 Commits

Reviewing files that changed from the base of the PR and between 3ae80e4 and 6a4de14.

📒 Files selected for processing (3)

src/functions/lineage.ts
src/mcp/server.ts
src/triggers/api.ts

…+ firstMention tiebreak

Two follow-up issues from CodeRabbit's review of 6a4de14:

1. `channels` silent broadening: when the user passed `channels` but none were in the known enum (e.g. `["foobar","baz"]`), the previous fix dropped to an empty `validChannels` and the conditional then omitted `payload.channels` entirely, falling back to all-channels default. Now: if the user explicitly passed channels but none are valid, return 400. Silently broadening invalidates caller intent.
2. `firstMention` could differ by `order`: picking `items[0]` (asc) or `items[items.length-1]` (desc) relied on the array's tiebreak rule to settle equal-timestamp ties. Two items sharing the earliest timestamp on different channels would resolve differently depending on `order`. Switch to an order-independent min-by-timestamp reduce so the "earliest in filtered set" contract is stable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
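A sketch of the order-independent reduce described in item 2, computed over the full filtered set rather than the returned page. The `TimelineItem` fields and the `(channel, id)` tie-breaker are assumptions; the commit message only says "min-by-timestamp reduce".

```typescript
// Reduced, hypothetical item shape.
interface TimelineItem {
  id: string;
  channel: string;
  timestamp: string; // ISO 8601
}

// True when `a` should win over `b` as the earliest item.
// Ties on timestamp fall back to channel, then id (assumed tie-breaker).
function isEarlier(a: TimelineItem, b: TimelineItem): boolean {
  const ta = Date.parse(a.timestamp);
  const tb = Date.parse(b.timestamp);
  if (ta !== tb) return ta < tb;
  if (a.channel !== b.channel) return a.channel < b.channel;
  return a.id < b.id;
}

// Scan every filtered item; the result does not depend on array order or page size.
function firstMention(items: TimelineItem[]): TimelineItem | undefined {
  return items.reduce<TimelineItem | undefined>(
    (best, item) => (best === undefined || isEarlier(item, best) ? item : best),
    undefined,
  );
}
```

Calling `firstMention` on the pre-trim array also covers the earlier page-truncation bug: with `order: "desc"` and a small `limit`, the earliest hit is usually not on the page at all.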

coderabbitai

♻️ Duplicate comments (1)

src/mcp/server.ts (1)

285-301: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Reject blank channels values too.

Line 285 turns "" / " , " into [], so Line 293 misses the “caller supplied channels but none are valid” case and the request still broadens to all channels. Check raw presence (args.channels !== undefined) instead of channels.length > 0.

Suggested fix

-            const channels = parseCsvList(args.channels);
+            const channelsProvided = args.channels !== undefined;
+            const channels = parseCsvList(args.channels);
             const validChannels = channels.filter((c) =>
               ["observation", "memory", "lesson", "summary"].includes(c),
             );
-            if (channels.length > 0 && validChannels.length === 0) {
+            if (channelsProvided && validChannels.length === 0) {
               return {
                 status_code: 400,
                 body: {

As per coding guidelines, input validation must occur at system boundaries (MCP handlers, REST endpoints).

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/mcp/server.ts` around lines 285 - 301, The handler currently checks
channels.length > 0 which misses cases where the caller provided an empty/blank
channels string (parseCsvList converts "" or " , " to []), so change the
validation to test raw presence instead: replace the condition `channels.length
> 0 && validChannels.length === 0` with `args.channels !== undefined &&
validChannels.length === 0` (keeping parseCsvList, validChannels and the
existing 400 error body) so any supplied-but-empty channels input is rejected
rather than silently broadening to all channels.
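The difference between checking `channels.length > 0` and checking raw presence can be shown in a self-contained sketch. The `parseCsvList` below is a stand-in written for this example (the real helper's implementation is not shown on this page), and `resolveChannels` is a hypothetical wrapper around the logic in the suggested fix.

```typescript
const KNOWN_CHANNELS = ["observation", "memory", "lesson", "summary"] as const;
type LineageChannel = (typeof KNOWN_CHANNELS)[number];

// Stand-in for the real helper: split on commas, trim, drop empty tokens.
function parseCsvList(value: unknown): string[] {
  if (typeof value !== "string") return [];
  return value
    .split(",")
    .map((token) => token.trim())
    .filter((token) => token.length > 0);
}

type ChannelResult =
  | { ok: true; channels?: LineageChannel[] }
  | { ok: false; status_code: 400; error: string };

function resolveChannels(raw: unknown): ChannelResult {
  // Raw presence, not parsed length: "" and " , " still count as supplied.
  const channelsProvided = raw !== undefined;
  const validChannels = parseCsvList(raw).filter((c): c is LineageChannel =>
    (KNOWN_CHANNELS as readonly string[]).includes(c),
  );
  if (channelsProvided && validChannels.length === 0) {
    return { ok: false, status_code: 400, error: "invalid channels" };
  }
  // Omitted entirely: leave channels unset so the all-channels default applies.
  return channelsProvided ? { ok: true, channels: validChannels } : { ok: true };
}
```

With this check, `" , "` is rejected just like `"foobar,baz"`, while a mixed input such as `"lesson,foobar"` still narrows to the one valid channel.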

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f53ffb7c-9890-4808-be09-9edce07ffe3f

📥 Commits

Reviewing files that changed from the base of the PR and between 6a4de14 and 4cd1f4c.

📒 Files selected for processing (2)

src/functions/lineage.ts
src/mcp/server.ts

efenex mentioned this pull request May 20, 2026

feat(smart-search): boost title/narrative matches on 'who/what is X' queries #571

Open

4 tasks

efenex mentioned this pull request May 20, 2026

feat(mcp): mem::query — server-side composable retrieval pipeline (v5-a) #574

Open

efenex force-pushed the feat/v4-a-mem-lineage branch from d8b9f30 to 6a4de14 Compare May 20, 2026 16:08

efenex wants to merge 3 commits into rohitg00:main from efenex:feat/v4-a-mem-lineage
