
Evaluate typed edges on existing chunk_links graph #504

@laynepenney

Description


Context

We already have a similarity graph: chunk_links stores pre-computed cosine similarity edges between chunks across sessions (top-3 neighbors per chunk, min similarity 0.35). This is traversed at query time via _expand_cross_session() with a 0.7 discount factor.
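The hop described above can be sketched roughly as follows. Table and function names follow the issue; the exact scoring (multiplying the seed result's score by edge similarity and the 0.7 discount) is an assumption about how `_expand_cross_session()` behaves, not a copy of it:

```python
import sqlite3

def expand_cross_session(conn, result_ids, scores, max_expand=3, discount=0.7):
    """Follow pre-computed chunk_links edges from each retrieved chunk.

    Neighbors are scored as seed_score * edge_similarity * discount,
    keeping the best score when a target is reachable from several seeds.
    """
    expanded = {}
    for chunk_id in result_ids:
        rows = conn.execute(
            "SELECT target_id, similarity FROM chunk_links "
            "WHERE source_id = ? ORDER BY similarity DESC LIMIT ?",
            (chunk_id, max_expand),
        ).fetchall()
        for target_id, sim in rows:
            score = scores[chunk_id] * sim * discount
            if score > expanded.get(target_id, 0.0):
                expanded[target_id] = score
    return expanded
```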

The current edges are untyped — they say "these are related" but not HOW (supports, contradicts, supersedes, extends, etc.). The question: does adding edge types improve retrieval quality enough to justify the cost?

What we already have (5 expansion mechanisms)

  1. Cross-session chunk links — chunk_links table, cosine similarity, pre-computed at build
  2. Same-session context — surrounding turns via turn_index lookup
  3. Knowledge node source expansion — source_sessions/source_turns back-references
  4. Topic clustering — Jaccard-based clusters in cluster table
  5. Hybrid RRF — BM25 + embedding fusion with auto-boost
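For reference, mechanism 5 is standard reciprocal rank fusion. A minimal sketch (the auto-boost step is not modeled here, and `k=60` is the conventional default, not necessarily what core.py uses):

```python
def rrf_fuse(bm25_ranking, embed_ranking, k=60):
    """Reciprocal rank fusion: each ranked list contributes 1/(k + rank)
    per document; documents appearing in both lists accumulate both terms."""
    scores = {}
    for ranking in (bm25_ranking, embed_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```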

Proposed evaluation

Phase 1: Measure the gap (no code changes)

Add a "multi-hop" category to CodeMemo with 10-15 questions that require connecting information across sessions where the connection isn't obvious from keywords or embeddings alone. Examples:

  • "We decided X in March, then reversed it in April. What's the current state?"
  • "Agent A proposed a fix. Agent B found a problem with it. What was the problem?"
  • Questions where following a typed edge (supersedes, contradicts) would help but generic similarity wouldn't
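One way such a benchmark entry could be encoded (a hypothetical shape — the actual CodeMemo schema may differ, and every field name here is illustrative):

```python
# Hypothetical multi-hop CodeMemo entry; schema and field names are assumptions.
multi_hop_question = {
    "category": "multi-hop",
    "question": "We decided X in March, then reversed it in April. "
                "What's the current state?",
    # The answer requires chunks from at least two distinct sessions.
    "sessions_required": 2,
    # The typed edge that would shortcut retrieval, if edges were typed.
    "edge_type_hint": "supersedes",
}
```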

Run it against the current system. If flat retrieval scores >85%, typed edges may not be worth the cost.

Phase 2: Prototype typed edges (if Phase 1 shows a gap)

Option A — Classify at build time: During build_cross_session_links(), run a lightweight classifier on each (source, target) pair to assign a type. Could be rule-based (temporal ordering → supersedes, high similarity + different conclusion → contradicts) or a small model.
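The rule-based variant of Option A could look like this. Chunk fields (`timestamp`, `conclusion`) and the thresholds are illustrative assumptions, not existing schema:

```python
def classify_edge(source, target, similarity, contradiction_threshold=0.8):
    """Rule-based edge typing sketch (all thresholds illustrative):
    - high similarity but different conclusion -> 'contradicts'
    - target chunk is later in time              -> 'supersedes'
    - otherwise fall back to untyped             -> 'related'
    Chunks are dicts with hypothetical 'timestamp' (ISO string) and
    'conclusion' fields.
    """
    if (source["conclusion"] != target["conclusion"]
            and similarity >= contradiction_threshold):
        return "contradicts"
    if target["timestamp"] > source["timestamp"]:
        return "supersedes"
    return "related"
```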

Option B — Classify at query time: Keep untyped edges, but when expanding, use query intent to filter. A "what changed" query only follows edges where timestamps differ significantly. A "what supports" query only follows high-similarity same-conclusion edges.

Option C — Add link_type column to chunk_links: Enrichment model outputs edge type alongside knowledge nodes in the same pass. No extra LLM call — just an extra field in the extraction prompt.
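The schema side of Option C is a one-line migration; existing rows stay NULL (untyped) until the enrichment pass backfills them. A sketch of an idempotent migration:

```python
import sqlite3

def add_link_type_column(conn):
    """Idempotent migration sketch: widen chunk_links with a nullable
    link_type column; safe to run repeatedly."""
    cols = [row[1] for row in conn.execute("PRAGMA table_info(chunk_links)")]
    if "link_type" not in cols:
        conn.execute("ALTER TABLE chunk_links ADD COLUMN link_type TEXT")
    conn.commit()
```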

Speed constraints (local-first)

  • Current build_cross_session_links() time: needs benchmarking
  • Budget: typed edge classification should add <20% to build time
  • Query-time expansion must stay <50ms for the full graph hop
  • No external API calls for edge classification — must run on local models or rules
  • CROSS_LINK_MAX_EXPAND=3 means we're only traversing 3 edges per result — typed filtering on 3 edges is essentially free
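Once `build_cross_session_links()` is benchmarked, the <20% budget check is trivial to automate. A sketch (function name and return shape are assumptions):

```python
import time

def within_build_budget(build_fn, baseline_seconds, budget_ratio=0.20):
    """Benchmark sketch: time a typed-edge build and check it stays
    within baseline * (1 + budget_ratio), i.e. the <20% budget above.
    Returns (passed, elapsed_seconds)."""
    start = time.perf_counter()
    build_fn()
    elapsed = time.perf_counter() - start
    return elapsed <= baseline_seconds * (1.0 + budget_ratio), elapsed
```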

Decision criteria

| Metric | Threshold to ship |
| --- | --- |
| Multi-hop accuracy improvement | >5% on new CodeMemo category |
| Build time increase | <20% |
| Query latency increase | <10ms |
| Cold start impact | <2s additional |

If we can't hit all four, the current untyped graph is good enough.

References

  • r/AIMemory thread on knowledge graph expectations (2026-04-06)
  • Current expansion: core.py:1160-1321 (build), core.py:2068 (query-time expand)
  • chunk_links schema: (source_id TEXT, target_id TEXT, similarity REAL)
