feat(retrieve): add knowledge-graph-aware scoring to retrieval pipeline by huang-yi-dae · Pull Request #2555 · volcengine/OpenViking

huang-yi-dae · 2026-06-10T16:30:04Z

Summary

Add graph_alpha and graph_saturation_k config fields to RetrievalConfig, enabling optional graph-aware scoring
New graph_loader.py module loads relation data concurrently from two sources: .relations.json and MEMORY_FIELDS.links/backlinks
Integrate graph scoring into HierarchicalRetriever._convert_to_matched_contexts with lazy loading (only top candidates), blending graph_score = tanh(total_relations / graph_saturation_k) into final score
VikingFS passes viking_fs=self to both find() and search() retriever construction calls
Default graph_alpha=0 preserves full backward compatibility - no behavior change when disabled
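
As an illustration of the scoring described above, here is a minimal sketch of how a tanh-saturated graph score could be blended into a base score. The function name `blend_score`, the default `graph_saturation_k=5.0`, and the linear interpolation between base score and graph score are assumptions made for this example; the summary only states that `graph_score = tanh(total_relations / graph_saturation_k)` is blended into the final score and that `graph_alpha=0` leaves behavior unchanged.

```python
import math


def blend_score(
    base_score: float,
    total_relations: int,
    graph_alpha: float = 0.0,
    graph_saturation_k: float = 5.0,  # hypothetical default, not taken from the PR
) -> float:
    """Blend a saturating graph-connectivity score into a base retrieval score."""
    if graph_alpha <= 0:
        # Graph scoring disabled: return the base score untouched.
        return base_score
    # tanh maps the relation count into [0, 1) and flattens out for large counts,
    # so heavily linked items cannot dominate the ranking.
    graph_score = math.tanh(total_relations / graph_saturation_k)
    # Assumed blend: linear interpolation weighted by graph_alpha.
    return (1.0 - graph_alpha) * base_score + graph_alpha * graph_score
```

With this form, a smaller `graph_saturation_k` makes the boost saturate after fewer relations, and `graph_alpha` caps how much of the final score graph connectivity can contribute.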

Test plan

test_convert_to_matched_contexts_returns_empty_relations - backward compat, graph_alpha=0 keeps relations=[]
test_graph_alpha_zero_returns_empty_relations - explicit zero, VikingFS present but not invoked
test_graph_scoring_with_relations_json - .relations.json loading + tanh blending
test_graph_scoring_with_memory_file_links - MEMORY_FIELDS links/backlinks parsing from .md files
test_graph_lazy_loading - only top candidates trigger graph data I/O

All 14 tests pass (10 existing + 4 new).
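
The lazy-loading and concurrent-loading behavior exercised by these tests might look roughly like the sketch below. Only the name `load_graph_data_for_uris` and the `max_relations_per_uri` parameter appear in this PR; the `GraphData` shape, the injected loader callables, the `top_candidates` helper, and the de-duplication of merged relations are assumptions made for illustration.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class GraphData:
    # Hypothetical container: truncated relation list plus the untruncated count.
    relations: list = field(default_factory=list)
    total_count: int = 0


def top_candidates(scored, limit):
    """Return the URIs of the `limit` highest-scoring (uri, score) pairs."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [uri for uri, _ in ranked[:limit]]


async def load_graph_data_for_uris(
    uris, load_relations_json, load_memory_links, max_relations_per_uri=5
):
    """Load relations for each URI from two sources concurrently."""

    async def load_one(uri):
        # Query both sources for this URI at the same time.
        from_json, from_links = await asyncio.gather(
            load_relations_json(uri), load_memory_links(uri)
        )
        # Assumption: merge and de-duplicate while preserving order.
        merged = list(dict.fromkeys(from_json + from_links))
        return uri, GraphData(merged[:max_relations_per_uri], len(merged))

    pairs = await asyncio.gather(*(load_one(uri) for uri in uris))
    return dict(pairs)
```

Passing only the output of `top_candidates` to the loader is what keeps graph I/O limited to the best-ranked results; a test can verify this by recording which URIs the injected loaders were called with.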

Integrate graph connectivity (from .relations.json and MEMORY_FIELDS links/backlinks) into the retrieval scoring pipeline. When graph_alpha > 0, top candidates get a graph_score blended via tanh saturation, boosting well-connected results. Default graph_alpha=0 preserves existing behavior.

🤖 Generated with [Qoder](https://qoder.com)

github-actions · 2026-06-10T16:31:16Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🏅 Score: 90
🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review

Redundant Truncation

The relations list is truncated twice: once in `load_graph_data_for_uris` (to `max_relations_per_uri`) and again in `_convert_to_matched_contexts` (to `self.MAX_RELATIONS`). This is redundant and could cause confusion if the two limits differ.

        max_relations_per_uri=self.MAX_RELATIONS,
    )
    for mc in graph_candidates:
        gd = uri_to_graph.get(mc.uri)
        if gd is None:
            graph_score = 0.0
            mc.relations = []
        else:
            graph_score = math.tanh(gd.total_count / self.graph_saturation_k)
            mc.relations = gd.relations[: self.MAX_RELATIONS]
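
To make the concern about double truncation concrete, the small sketch below shows that slicing a list twice is equivalent to slicing it once with the smaller of the two limits. The helper names `truncate_twice` and `effective_limit` are hypothetical and exist only for this illustration.

```python
def truncate_twice(relations, loader_limit, retriever_limit):
    """Mimic truncating in the loader and then again in the retriever."""
    return relations[:loader_limit][:retriever_limit]


def effective_limit(loader_limit, retriever_limit):
    """The number of relations that can survive both truncations."""
    return min(loader_limit, retriever_limit)
```

When both limits are the same value the second slice is a no-op; if they ever diverge, the smaller one silently wins, which is the confusion the reviewer points out.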

github-actions · 2026-06-10T16:32:10Z

PR Code Suggestions ✨

No code suggestions found for the PR.

github-project-automation Bot added this to OpenViking project Jun 10, 2026

github-project-automation Bot moved this to Backlog in OpenViking project Jun 10, 2026

github-actions Bot added the Review effort 2/5 label Jun 10, 2026

huang-yi-dae wants to merge 1 commit into volcengine:main from huang-yi-dae:feature/kg-retrieval-scoring-integration
