fix: fall back to FTS when search embeddings stall#523
Conversation
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
|
@codex review |
|
@coderabbitai review |
|
You need to increase your spend limit or enable usage-based billing to run background agents. Go to Cursor |
|
Warning Review limit reached
More reviews will be available in 4 minutes and 56 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits. 🚦 How do rate limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (6)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
✅ Action performedReview finished.
|
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 2ae7109. Configure here.
| thread.start() | ||
| thread.join(timeout_ms / 1000.0) | ||
| if thread.is_alive(): | ||
| raise TimeoutError(f"hybrid search exceeded {timeout_ms:.0f}ms") |
There was a problem hiding this comment.
Timeout leaves background work running
Medium Severity
When the embed/hybrid deadline is exceeded, the caller falls back to FTS, but the timed-out work keeps running in a daemon thread or default executor. Stalled embedding and hybrid work can continue to hold the model and database, competing with the FTS path and later searches in the same process.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 2ae7109. Configure here.
| } | ||
| ) | ||
| structured_results.append(item) | ||
| structured = {"query": query, "total": len(structured_results), "results": structured_results} |
There was a problem hiding this comment.
Empty results omit fallback metadata
Low Severity
After an embedding timeout or error, successful code paths attach search_mode, fallback_reason, and FTS-only messaging to responses. The early return when hybrid search finds no documents still returns a bare empty payload, so clients cannot tell an FTS-only fallback occurred and users do not see the FTS-only notice.
Reviewed by Cursor Bugbot for commit 2ae7109. Configure here.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2ae71092fd
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| loop.run_in_executor( | ||
| None, | ||
| lambda: _get_embedding_model().embed_query(query), | ||
| ), | ||
| timeout=embed_timeout_ms / 1000.0, |
There was a problem hiding this comment.
Avoid leaving timed-out embeddings in the shared pool
When the embedding backend/model load is the thing that stalls, this wait_for only times out the asyncio wrapper; the callable submitted via run_in_executor(None, ...) keeps occupying a worker in the shared default executor. Repeated timed-out brain_search calls can therefore exhaust that pool, and with the helper route enabled subsequent searches can block on the helper route's own default-executor call before reaching the FTS fallback. Please isolate/bound the embedding worker or trip a circuit breaker instead of submitting another uncancellable shared-pool job on every search.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
brainlayer/src/brainlayer/mcp/search_handler.py
Lines 1897 to 1899 in a8b7e78
When embedding times out or raises and the FTS leg also finds no documents, this early return drops the already-determined search_mode and fallback_reason and returns a plain “No results found.” In that outage-plus-no-lexical-hit scenario, clients cannot distinguish an authoritative hybrid miss from an FTS-only fallback miss, which weakens the MCP tool contract and search observability.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| query_embedding = await asyncio.wait_for( | ||
| loop.run_in_executor( | ||
| None, | ||
| lambda: _get_embedding_model().embed_query(query), | ||
| ), | ||
| timeout=embed_timeout_ms / 1000.0, |
There was a problem hiding this comment.
Bound timed-out embedding calls
When the embedding backend hangs in the long-lived MCP server, asyncio.wait_for times out the asyncio wrapper but cannot stop the run_in_executor thread that is already running embed_query. Each brain_search during that outage can leave another default-executor worker occupied, eventually saturating the executor or growing its queue even though the request falls back to FTS. Gate concurrent embed attempts or use a bounded/circuit-breaker path before scheduling more workers.
Useful? React with 👍 / 👎.


Summary
brain_searchquery embedding/model load withBRAINLAYER_EMBED_TIMEOUT_MS(default 1000ms).query_embedding=None, which skips vector search and runs the FTS leg, returning annotatedsearch_mode=fts_onlyresults instead of 0/error.run_fts_search; the live~/.claudehook was also updated locally.hybrid_search(query_embedding=None)FTS fallback.Test plan
pytest tests/test_search_handler.py -k 'embed or embedding' -q-> 3 passedpytest tests/test_adaptive_injection.py -k 'slow_hybrid or fast_hybrid or fallback_to_fts_only' -q-> 3 passedpython3 -m py_compile src/brainlayer/mcp/search_handler.py src/brainlayer/search_repo.py hooks/brainlayer-prompt-search.py /Users/etanheyman/.claude/hooks/brainlayer-prompt-search.py-> passedpytest tests/test_search_handler.py tests/test_adaptive_injection.py tests/test_hybrid_search.py tests/test_conditional_hooks.py tests/test_search_filter_params.py tests/test_search_profile.py -q-> 155 passed, 1 warningruff check src/brainlayer/mcp/search_handler.py src/brainlayer/search_repo.py hooks/brainlayer-prompt-search.py tests/test_adaptive_injection.py tests/test_hybrid_search.py tests/test_search_handler.py-> passedruff format --check src/brainlayer/mcp/search_handler.py src/brainlayer/search_repo.py hooks/brainlayer-prompt-search.py tests/test_adaptive_injection.py tests/test_hybrid_search.py tests/test_search_handler.py-> passedBRAINLAYER_PREPUSH=1 scripts/run_tests.sh-> passed; unit phase 3016 passed, 9 skipped, 61 deselected, 1 xfailed; MCP registration 3 passed; isolated eval/hook routing 40 passed; Bun 1 passed; regression shell passed. This pre-push mode excludestests/test_vector_store.pyandtests/test_engine.py.git push -u origin fix/brain-search-fts-fallbackpre-push hook reranBRAINLAYER_PREPUSH=1 scripts/run_tests.sh-> passed; same gate, with unit phase 3016 passed, 9 skipped, 61 deselected, 1 xfailed; MCP registration 3 passed; isolated eval/hook routing 40 passed; Bun 1 passed; regression shell passed.Notes
findings: 0.Note
Fall back to FTS search when embedding times out or raises in search handlers
asyncio.wait_for) and brainlayer-prompt-search.py (via a daemon thread + queue), defaulting to 1000ms configurable viaBRAINLAYER_EMBED_TIMEOUT_MS.search_modeandfallback_reason, and human-readable output appends an FTS-only notice.SearchMixin.hybrid_searchin search_repo.py now acceptsquery_embedding=Noneto trigger the FTS-only path directly, emitting asemantic_skipprofiling event.Macroscope summarized 2ae7109.