Switch reranker from Jina API to local cross-encoder by CodeNinjaSarthak · Pull Request #64 · CodeNinjaSarthak/eidetic-memory

CodeNinjaSarthak · 2026-04-26T14:36:55Z

Summary

Migrates the reranker from the hosted Jina API to a local cross-encoder (cross-encoder/ms-marco-MiniLM-L-6-v2) loaded via sentence-transformers. The local model runs on CPU with ONNX INT8 quantization and scores within the bootstrap confidence interval of the Jina API results, while removing the API key, rate-limit, and reproducibility concerns that come with a hosted service.
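For readers unfamiliar with cross-encoder reranking, here is a minimal sketch of the idea. It is not the repository's `_local_rerank`; the function names `rerank` and `load_cross_encoder_scorer` are hypothetical, and the sketch uses the default sentence-transformers backend rather than the ONNX INT8 setup described above. The scorer is injected so the ranking logic can be exercised without downloading a model.

```python
from typing import Callable, Sequence


def rerank(
    query: str,
    docs: Sequence[str],
    score_pairs: Callable[[list[tuple[str, str]]], Sequence[float]],
    top_k: int = 5,
) -> list[tuple[str, float]]:
    """Score every (query, doc) pair and return the top_k docs, best first."""
    if not docs:
        return []
    pairs = [(query, doc) for doc in docs]
    scores = [float(s) for s in score_pairs(pairs)]
    if len(scores) != len(docs):
        raise ValueError("scorer returned a different number of scores than docs")
    # sorted() is stable, so docs with equal scores keep their input order.
    ranked = sorted(zip(docs, scores), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


def load_cross_encoder_scorer(
    model_name: str = "cross-encoder/ms-marco-MiniLM-L-6-v2",
) -> Callable[[list[tuple[str, str]]], Sequence[float]]:
    """Return a scorer backed by sentence-transformers (imported lazily)."""
    # Requires `pip install sentence-transformers`; weights download on first use.
    from sentence_transformers import CrossEncoder

    model = CrossEncoder(model_name)
    return model.predict  # accepts a list of (query, doc) pairs
```

With the real model, a call would look like `rerank(question, candidates, load_cross_encoder_scorer(), top_k=10)`; in tests any callable that maps pairs to scores can stand in for it.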

Results (LoCoMo, n=1540)

Metric	Score
Overall	56.3% (vs. 57.5% with Jina — within bootstrap CI)
Temporal	64.2% (vs. 68.2% — within CI)
Held-out (n=718)	55.0% overall, 68.3% temporal
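The PR does not say which bootstrap procedure produced the "within CI" verdicts. One common choice is a paired percentile bootstrap over per-question correctness, sketched below under that assumption. The function name and parameters are hypothetical, and the inputs are 0/1 outcomes for the same questions under two systems.

```python
import random
from typing import Sequence


def paired_bootstrap_diff_ci(
    a: Sequence[int],
    b: Sequence[int],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> tuple[float, float, float]:
    """Return (observed_diff, ci_low, ci_high) for mean(a) - mean(b)."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty, equal-length outcome lists")
    n = len(a)
    rng = random.Random(seed)  # seeded so the interval is reproducible
    observed = (sum(a) - sum(b)) / n
    diffs = []
    for _ in range(n_resamples):
        # Resample question indices with replacement, keeping pairs aligned.
        idx = [rng.randrange(n) for _ in range(n)]
        diffs.append(sum(a[i] - b[i] for i in idx) / n)
    diffs.sort()
    lo = diffs[int((alpha / 2) * n_resamples)]
    hi = diffs[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return observed, lo, hi
```

Under this reading, two systems are "within CI" of each other when the returned interval contains zero.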

Key finding

Ablation shows that the cross-encoder reranker is the load-bearing component. Round-robin merge and score-based merge produce identical results (55.8%) once the reranker is in place, so the merge strategy becomes irrelevant. Without the reranker, neither merge strategy improves over isolation alone.
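A toy illustration of why the merge strategy can stop mattering: if both merges hand the reranker the same candidate set, the reranker's own scores decide the final order. The sketch below is an assumption about how the two strategies might look, not the repository's code; all names are hypothetical and the reranker is replaced by a fixed score table.

```python
from itertools import zip_longest

Hit = tuple[str, float]  # (doc_id, retrieval_score)


def round_robin_merge(lists: list[list[Hit]]) -> list[str]:
    """Interleave ranked lists, one hit from each in turn, dropping duplicates."""
    merged, seen = [], set()
    for tier in zip_longest(*lists):
        for hit in tier:
            if hit is not None and hit[0] not in seen:
                seen.add(hit[0])
                merged.append(hit[0])
    return merged


def score_merge(lists: list[list[Hit]]) -> list[str]:
    """Pool all hits and sort by retrieval score, keeping each doc's best score."""
    best: dict[str, float] = {}
    for hits in lists:
        for doc_id, score in hits:
            best[doc_id] = max(score, best.get(doc_id, float("-inf")))
    return sorted(best, key=best.get, reverse=True)


def rerank_ids(candidates: list[str], rerank_score: dict[str, float], top_k: int) -> list[str]:
    """Stand-in for the cross-encoder: reorder candidates by a fixed score table."""
    return sorted(candidates, key=lambda d: rerank_score[d], reverse=True)[:top_k]


if __name__ == "__main__":
    source_a = [("a", 0.9), ("b", 0.85)]
    source_b = [("c", 0.3), ("d", 0.2)]
    table = {"a": 0.1, "b": 0.7, "c": 0.9, "d": 0.3}  # made-up reranker scores
    for merge in (round_robin_merge, score_merge):
        merged = merge([source_a, source_b])
        print(merge.__name__, merged, rerank_ids(merged, table, top_k=2))
```

The equivalence only holds when both merges keep the same candidates. If the merged list is truncated before reranking, the two strategies can feed the reranker different sets, so the identical 55.8% reported here is an empirical result rather than a guarantee.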

Changes

README.md: numbers updated, Jina references replaced, 4 new ablation rows added, fact extraction precision corrected (58.6% → 52% to match measured value)
.env.development.example: JINA_API_KEY section removed
eval/eval_qa_accuracy.py: --local-rerank, --merge-strategy, --top-k, --fetch-multiplier, --no-rr-rerank flags added; help strings updated
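As a rough picture of the flag surface listed above, the following argparse sketch declares the same flag names. Only the names come from this PR; the types, defaults, choices, and help strings are assumptions, and the real definitions live in eval/eval_qa_accuracy.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser; defaults and choices below are assumed, not from the PR."""
    parser = argparse.ArgumentParser(description="QA accuracy eval (illustrative sketch)")
    parser.add_argument("--local-rerank", action="store_true",
                        help="rerank with the local cross-encoder")
    parser.add_argument("--merge-strategy", choices=["round-robin", "score"],
                        default="round-robin",  # assumed choice names and default
                        help="how per-source retrieval results are merged")
    parser.add_argument("--top-k", type=int, default=10,  # assumed default
                        help="number of passages kept after reranking")
    parser.add_argument("--fetch-multiplier", type=int, default=3,  # assumed default
                        help="candidates fetched per final slot before reranking")
    # The PR names this flag without describing its semantics, so no help text here.
    parser.add_argument("--no-rr-rerank", action="store_true")
    parser.add_argument("--no-rerank", action="store_true",
                        help="ablation: skip the reranker entirely")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args(["--local-rerank", "--top-k", "5"]))
```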

Note: JinaRateLimiter dead code (424f359) is intentionally left in — cleanup belongs in a follow-up PR.

Test plan

Verify README numbers match eval output JSON
Smoke test --local-rerank flag on a single conv before full run
Confirm JINA_API_KEY is no longer referenced in .env.development.example
Confirm --no-rerank ablation still works (no reranker path)
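The JINA_API_KEY check in the plan above is easy to script. A minimal sketch, with hypothetical helper names, that reports which lines of a file still mention the key:

```python
from pathlib import Path


def find_references(text: str, needle: str = "JINA_API_KEY") -> list[int]:
    """Return 1-based line numbers in `text` that mention `needle`."""
    return [i for i, line in enumerate(text.splitlines(), start=1) if needle in line]


def check_file(path: str, needle: str = "JINA_API_KEY") -> list[int]:
    """Same check, reading the text from a file on disk."""
    return find_references(Path(path).read_text(encoding="utf-8"), needle)
```

Running `check_file(".env.development.example")` from the repository root should return an empty list once the section has been removed.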

- Archive 18 superseded/debug result files (untracked)
- Rename canonical result files for clarity
- Add --no-isolation and --no-rerank flags to eval script
- Add 5-attempt retry on generation, retrieval, judge calls
- Add run_ablation_parallel.sh for parallel ablation runs
- Record no_isolation/no_rerank/model in result metadata
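The commit message mentions a 5-attempt retry around generation, retrieval, and judge calls but does not describe the policy. The sketch below assumes exponential backoff between attempts; the helper name `with_retries` and its parameters are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on any exception up to max_attempts times in total."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as exc:  # a real script would likely catch narrower errors
            last_error = exc
            if attempt < max_attempts:
                # Assumed policy: wait 1s, 2s, 4s, ... between attempts.
                sleep(base_delay * 2 ** (attempt - 1))
    raise last_error
```

Passing `sleep` as a parameter keeps the helper testable without real delays, for example `with_retries(lambda: judge(answer), sleep=lambda s: None)`, where `judge` stands for whatever call is being wrapped.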


- Update README numbers to 56.3% overall, 64.2% temporal (local cross-encoder/ms-marco-MiniLM-L-6-v2 results, n=1540)
- Replace Jina branding in mermaid diagrams and text
- Add ablation rows showing reranker is load-bearing component
- Fix fact extraction precision discrepancy (58.6% -> 52%)
- Remove JINA_API_KEY from .env.development.example
- Update eval_qa_accuracy.py help strings for --no-rerank flags
- Document --local-rerank flag in Reproduce Results section

CodeNinjaSarthak added 5 commits April 24, 2026 22:07

chore: ignore eval output artifacts from git tracking

4c56602

feat: implement JinaRateLimiter for adaptive rate limiting in Jina Re…

424f359

…ranker API calls

fix: add strict=False to zip() call in _local_rerank (B905)

74065f4

CodeNinjaSarthak merged commit f3c0dfb into main Apr 26, 2026
2 checks passed

CodeNinjaSarthak deleted the feature/local-cross-encoder-migration branch April 26, 2026 14:58

CodeNinjaSarthak merged 5 commits into main from feature/local-cross-encoder-migration
