fix(retrieve): preserve rerank scores around empty abstracts #2572

Open
r266-tech wants to merge 2 commits into volcengine:main from r266-tech:fix-rerank-empty-abstracts

Conversation

@r266-tech

Contributor

Fixes #2330

Summary

  • filter empty or whitespace abstracts out of rerank provider requests
  • map rerank scores back onto their original candidate indexes while preserving vector fallback scores for empty documents
  • add regression coverage for mixed non-empty and empty document batches
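
The filter-and-map-back behavior described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code in openviking/retrieve/hierarchical_retriever.py: the function name `rerank_with_fallback`, its parameters, and the `fake_rerank` stand-in provider are all hypothetical, and falling back to vector scores when the provider returns an unexpected number of scores is an assumption made here.

```python
from typing import Callable, List, Optional, Sequence


def rerank_with_fallback(
    query: str,
    abstracts: Sequence[Optional[str]],
    vector_scores: Sequence[float],
    rerank_fn: Callable[[str, List[str]], List[float]],
) -> List[float]:
    """Return one score per candidate, in the original candidate order.

    Hypothetical helper: names and signature are illustrative only.
    """
    if len(abstracts) != len(vector_scores):
        raise ValueError("abstracts and vector_scores must have the same length")

    # Remember the original index of every abstract that has real content.
    kept = [i for i, text in enumerate(abstracts) if text and text.strip()]

    # Start from the vector scores so skipped candidates keep their fallback.
    scores = list(vector_scores)
    if not kept:
        return scores  # nothing worth sending to the rerank provider

    reranked = rerank_fn(query, [abstracts[i] for i in kept])
    if len(reranked) != len(kept):
        # Assumption: an unexpected response size falls back for the batch.
        return scores

    # Write each rerank score back into the slot of its original candidate.
    for original_index, score in zip(kept, reranked):
        scores[original_index] = score
    return scores


def fake_rerank(query: str, documents: List[str]) -> List[float]:
    # Stand-in provider for the example: score is document length / 10.
    return [len(doc) / 10 for doc in documents]


# Mixed batch: candidates 1 and 2 are empty or whitespace-only.
result = rerank_with_fallback(
    "q", ["alpha", "", "   ", "be"], [0.1, 0.2, 0.3, 0.4], fake_rerank
)
print(result)  # [0.5, 0.2, 0.3, 0.2]
```

In this sketch the provider only ever sees "alpha" and "be", and the two empty candidates keep their vector scores of 0.2 and 0.3 instead of forcing the whole batch to fall back.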

Tests

  • python3 -m py_compile openviking/retrieve/hierarchical_retriever.py tests/retrieve/test_hierarchical_retriever_rerank.py
  • uv run --no-project --with ruff ruff check openviking/retrieve/hierarchical_retriever.py tests/retrieve/test_hierarchical_retriever_rerank.py

Not run: uv run pytest tests/retrieve/test_hierarchical_retriever_rerank.py -q. The editable install requires native OpenViking CLI/engine build artifacts, which are not present in this sparse checkout.

@github-actions

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🎫 Ticket compliance analysis ✅

#2330 - Fully compliant

Compliant requirements:

  • Filter empty or whitespace abstracts out of rerank provider requests
  • Map rerank scores back onto original candidate indexes while preserving vector fallback scores for empty documents
  • Add regression coverage for mixed non-empty and empty document batches
⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🏅 Score: 95
🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ No major issues detected

@github-actions

PR Code Suggestions ✨

No code suggestions found for the PR.

Projects

Status: Backlog

Development

Successfully merging this pull request may close these issues.

OpenAI-compatible rerank falls back for whole batch when candidates contain empty abstracts

1 participant