feat: add origin ordering to brain search #518
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 214cc6a9b0
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
    date_to=date_to,
    sentiment=sentiment,
    detail=detail,
    order=order,
Honor origin ordering for KG entity hits
When a query detects a known KG entity, _brain_search_dispatch returns from the entity branch before reaching this _search(..., order=order) fallback. In that scenario order="origin" is accepted by validation, but kg_hybrid_search still receives only n_results=num_results and the formatted chunks stay in KG/relevance order, so entity queries such as brain_search(query="Avi Simon", order="origin") silently ignore the requested ordering instead of returning the earliest matching memories.
Fixed in 7db7533. The KG/entity route now expands origin candidates, sorts chunk hits by created_at before formatting, and has regression coverage in tests/test_search_handler.py::test_brain_search_origin_order_sorts_entity_route_chunks.
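A minimal sketch of the kind of sort this fix describes, assuming each chunk hit is a dict with an ISO 8601 created_at string. The helper names and the dict shape are illustrative assumptions, not the repository's actual code.

```python
from datetime import datetime, timezone


def _parse_created_at(value):
    """Parse an ISO 8601 timestamp; return None when missing or malformed."""
    if not value:
        return None
    try:
        parsed = datetime.fromisoformat(value)
    except ValueError:
        return None
    # Treat naive timestamps as UTC so naive and aware values compare safely.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def sort_hits_by_origin(hits, num_results):
    """Hypothetical helper: earliest num_results hits, undated hits last."""
    latest = datetime.max.replace(tzinfo=timezone.utc)

    def key(hit):
        ts = _parse_created_at(hit.get("created_at"))
        return (ts is None, ts or latest)

    return sorted(hits, key=key)[:num_results]


hits = [
    {"id": "b", "created_at": "2024-03-01T00:00:00+00:00"},
    {"id": "a", "created_at": "2023-01-15T12:00:00"},
    {"id": "c"},  # no created_at: sorted after every dated hit
]
earliest = sort_hits_by_origin(hits, 2)
```

Sorting before formatting matters here: once chunks are rendered to text, the timestamps are no longer available as comparable values.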
def _origin_candidate_count(num_results: int) -> int:
    return min(_MAX_PUBLIC_NUM_RESULTS, max(num_results, _ORIGIN_CANDIDATE_LIMIT))
Select origin results from all matches
For order="origin", this caps the candidate set at 100 hybrid-ranked results before sorting by created_at. On broad queries with more than 100 matching chunks, older exact/filtered matches outside the top relevance candidates are never considered, so the tool can return later memories while claiming to return the earliest matching chunks; this is especially likely for common project or entity terms in a large BrainLayer DB.
Intentionally kept this scoped to the relevance-matched hybrid candidate set. The task explicitly asked for a thin layer over existing hybrid_search candidates and not an unbounded/date-indexed search path. In 38a1704 I tightened the schema wording to "earliest hybrid-matched chunks" so clients do not read origin mode as an exhaustive all-DB oldest-match scan.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7db753371f
def _origin_candidate_count(num_results: int) -> int:
    return min(_MAX_PUBLIC_NUM_RESULTS, max(num_results, _ORIGIN_CANDIDATE_LIMIT))
Return origin order from all matching candidates
When order="origin" is used on a query with more than 100 matching chunks, this cap means _search asks hybrid_search for only 100 relevance/MMR-ranked results and then _sort_hybrid_results_by_origin sorts that already-truncated set. Older matching chunks outside the top 100 relevance candidates are never considered, so the advertised “earliest matching chunks by created_at” can skip the actual origin memories on large projects; origin mode needs a date-ordered lookup or a larger/untruncated candidate path before applying num_results.
Intentionally kept this scoped to the relevance-matched hybrid candidate set. The task explicitly asked for a thin layer over existing hybrid_search candidates and not an unbounded/date-indexed search path. In 38a1704 I tightened the schema wording to "earliest hybrid-matched chunks" so clients do not read origin mode as an exhaustive all-DB oldest-match scan.
    entity_id=arguments.get("entity_id"),
    agent_id=arguments.get("agent_id"),
    num_results=arguments.get("num_results", 5),
    order=arguments.get("order", "relevance"),
Expose order in the brain_recall schema
This forwards order through the direct brain_recall(mode="search") path, but the brain_recall tool schema above still has no order property. MCP clients build and often validate calls from list_tools, so brain_recall(mode="search", order="origin") is undiscoverable or rejected even though the handler and tests now support it; add the same enum/default schema entry used by brain_search.
Fixed in 38a1704. brain_recall(mode="search") now exposes the same order enum/default as brain_search, with schema coverage in tests/test_3tool_aliases.py::TestToolsListBackwardCompat::test_brain_recall_search_schema_has_order_param.
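For readers unfamiliar with how such a schema entry looks, the following is a rough sketch of an order property with an enum and default, plus a matching handler-side check. The constant name, the description wording, and validate_order are assumptions for illustration; the actual schema text in the repository may differ.

```python
# Hypothetical JSON-Schema-style property shared by both tool schemas.
ORDER_SCHEMA_PROPERTY = {
    "type": "string",
    "enum": ["relevance", "origin"],
    "default": "relevance",
    "description": (
        "relevance keeps the default ranking; origin returns the earliest "
        "hybrid-matched chunks by created_at."
    ),
}


def validate_order(arguments: dict) -> str:
    """Return the requested order, falling back to the schema default."""
    order = arguments.get("order", ORDER_SCHEMA_PROPERTY["default"])
    if order not in ORDER_SCHEMA_PROPERTY["enum"]:
        raise ValueError(f"unsupported order: {order!r}")
    return order
```

Reusing one property definition for brain_search and brain_recall keeps the two schemas from drifting, which is the failure mode the review comment describes.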
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 38a170465b
try:
    if (
        allow_helper_route
        and order == "relevance"
Honor origin order through smart routing
When order='origin' is requested, this only bypasses the warm helper; the normal dispatch path below still routes queries matching _query_signals_think/_query_signals_recall (for example, how did I implement auth or history of auth) to _think/_recall, where order is never applied. Those calls now accept order='origin' but silently return default-ordered non-origin output, so origin mode should either bypass these smart routes or implement equivalent ordering there.
Fixed in 86125a8. order="origin" now bypasses the current-context, think, and recall smart routes and falls through to the search/origin ordering path; regression coverage is in tests/test_search_handler.py::test_brain_search_origin_order_bypasses_smart_routes.
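The bypass can be sketched roughly as follows. The prefix checks are simplified stand-ins for _query_signals_think and _query_signals_recall, whose real logic is not shown in this thread, and choose_route is a hypothetical name.

```python
def choose_route(query: str, order: str = "relevance") -> str:
    """Hypothetical dispatcher: pick a route name for a brain_search query."""
    # Origin ordering is only implemented on the search path, so it skips
    # every smart route rather than silently ignoring the requested order.
    if order == "origin":
        return "search"

    lowered = query.lower()
    if lowered.startswith("how did i"):  # stand-in for _query_signals_think
        return "think"
    if lowered.startswith("history of"):  # stand-in for _query_signals_recall
        return "recall"
    return "search"
```

With this shape, choose_route("how did I implement auth") goes to the think route, while the same query with order="origin" falls through to search.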
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 86125a8f4c
def _origin_candidate_count(num_results: int) -> int:
    return min(_MAX_PUBLIC_NUM_RESULTS, max(num_results, _ORIGIN_CANDIDATE_LIMIT))
Avoid truncating origin candidates before date ordering
When order="origin" is requested, this helper caps the hybrid query to 100 relevance-ranked results before _sort_hybrid_results_by_origin sorts by created_at. For queries with more than 100 matching chunks, any older match ranked outside the top 100 by relevance/MMR is never considered, so the tool can return a newer chunk even though the schema promises the earliest hybrid-matched chunks by created_at. This is most visible for broad project/history searches in the large BrainLayer DB; the origin path needs to order across the full filtered match set or clearly make the result approximate.
Clarified in a2f1989. The schema, text output, and structured response now say origin mode is scoped to the expanded relevance-ranked hybrid candidate set (order_scope="expanded_hybrid_candidates"), matching the requested thin layer over hybrid_search candidates rather than promising an exhaustive all-DB oldest-match scan.
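As an illustration of what a client might see, here is a minimal sketch of a structured response that carries the scope marker. Only the order and order_scope fields and the "expanded_hybrid_candidates" value come from this thread; build_structured_response and the rest of the payload shape are assumptions.

```python
def build_structured_response(results, order="relevance"):
    """Hypothetical builder: attach ordering metadata only in origin mode."""
    structured = {"total": len(results), "results": results}
    if order == "origin":
        structured["order"] = "origin"
        # Signals that the sort covered the expanded hybrid candidate set,
        # not every matching chunk in the database.
        structured["order_scope"] = "expanded_hybrid_candidates"
    return structured


origin_payload = build_structured_response([{"id": "a"}], order="origin")
default_payload = build_structured_response([{"id": "a"}])
```

A client can branch on order_scope to decide whether to present the first result as the true earliest mention or as the earliest among the retrieved candidates.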
Codex Review: Didn't find any major issues. Chef's kiss.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 1419e37.
structured["kg_degrade_reason"] = kg_degrade_reason
formatted_text = format_kg_search(entity_name, structured_results, fact_items, query)
if order == "origin":
    formatted_text += f"\n{_ORIGIN_ORDER_LABEL}"
Origin label without chunk results
Low Severity
On the entity KG early-return path, order="origin" always adds order, order_scope, and the origin text banner whenever that mode is requested, even when structured_results is empty and the payload is only SQL facts. That implies hybrid chunks were origin-sorted when none were returned.
Acknowledged. I am leaving this unchanged because the order fields/banner reflect the requested brain_search mode, while the KG facts-only payload still reports total: 0 and results: [], so there are no chunk dates or sorted chunk rows implied in the structured data. This is a low-severity presentation nuance, not a correctness issue, and the task explicitly asked origin mode to surface the chosen order in output.
Codex Review: Didn't find any major issues. Delightful!


Summary
- Add order for brain_search, with default relevance preserving existing hybrid relevance/recency behavior.
- Add order="origin" to fetch a relevance-matched candidate set, sort by created_at ascending, and return the earliest requested chunks with date/order surfaced in output.
- Pass order through brain_search -> brain_recall(mode="search") -> handler dispatch, while bypassing warm helper routing for origin mode.

Test plan
- pytest tests/test_search_handler.py::test_brain_search_origin_order_returns_oldest_matching_chunks_without_changing_default tests/test_3tool_aliases.py::TestBrainRecallSearchMode::test_search_mode_passes_order tests/test_search_filter_params.py::TestInputSchemaPresence::test_order_enum_values tests/test_search_filter_params.py::TestAliasResolution::test_order_passes_from_call_tool -q
- pytest tests/test_search_handler.py tests/test_3tool_aliases.py tests/test_search_filter_params.py tests/test_mcp_labeled_field_output.py -q
- pytest tests/test_mcp_warm_route.py tests/test_hybrid_search.py tests/test_phase3_qa.py -q
- ruff check src/brainlayer/mcp/search_handler.py src/brainlayer/mcp/__init__.py src/brainlayer/mcp/_format.py tests/test_search_handler.py tests/test_3tool_aliases.py tests/test_search_filter_params.py
- pytest --ignore=tests/test_vector_store.py --ignore=tests/test_engine.py --ignore=tests/test_enrichment_quality_benchmark.py --ignore=tests/test_eval_framework.py --ignore=tests/test_abcde_report.py --ignore=tests/test_enrichment_graders.py --ignore=tests/test_enrichment_llm_judge.py --ignore=tests/test_enrichment_gold.py --ignore=tests/test_experiment_store.py --ignore=tests/test_abcde_enrich_runner.py --ignore=tests/eval/phoenix_gate/test_phoenix_gate.py --ignore=tests/test_abcde_variants.py --ignore=tests/test_enrichment_judge.py -q (system Python broad non-eval; eval collection blocked by local NumPy 2.4 / numba compatibility)
- BRAINLAYER_PREPUSH=1 BRAINLAYER_PREPUSH_SCOPE=changed-only ./scripts/run_tests.sh

Review notes
- coderabbit review --agent hit the free OSS rate limit before producing findings; PR-level bot review requested after creation.

Note
Medium Risk
Touches MCP search dispatch and hybrid/KG ranking paths; default relevance behavior is preserved, but order=origin changes routing and result ordering in ways callers may rely on for “first mention” semantics within a bounded candidate pool.

Overview
Adds optional order on brain_search and brain_recall(mode=search): default relevance is unchanged; origin re-ranks an expanded hybrid candidate set (at least 100 matches) by created_at ascending and returns the earliest num_results, with order/order_scope in structured output and a short label in formatted results.

Origin skips warm helper routing and the smart/file shortcuts (timeline, regression, think/recall/current-context) so search stays on the hybrid path; file_path is applied as a source_file SQL LIKE filter via new source_file_filter_like plumbing in search_repo. Entity/KG hybrid search uses the same expansion and origin sort when filters allow entity routing.

Tests cover schema forwarding, default vs origin ordering, file-path filtering, entity route, and route bypasses.

Reviewed by Cursor Bugbot for commit 1419e37. Bugbot is set up for automated code reviews on this repo.

Note
Add origin ordering to brain search results
- Adds an order parameter to brain_search and brain_recall (mode=search) accepting 'relevance' (default) or 'origin', where origin sorts by earliest created_at across an expanded candidate pool.
- When order='origin', smart/file-based routing is bypassed, an expanded candidate set is fetched (up to _ORIGIN_CANDIDATE_LIMIT=100), and results are re-sorted by ascending created_at before truncating to the requested count.
- Adds source_file_filter_like SQL LIKE filtering to hybrid_search, binary_knn_search, and kg_hybrid_search in search_repo.py, used to constrain origin-ordered results when a file path is inferred from the query.
- Structured output includes order and order_scope fields when origin ordering is active.
- order='origin' disables helper/smart routing paths that are used for relevance ordering, so route selection behavior differs between the two modes.

Macroscope summarized 1419e37.