Skip to content

fix: fall back to FTS when search embeddings stall#523

Merged
EtanHey merged 1 commit into
mainfrom
fix/brain-search-fts-fallback
Jun 19, 2026
Merged

fix: fall back to FTS when search embeddings stall#523
EtanHey merged 1 commit into
mainfrom
fix/brain-search-fts-fallback

Conversation

@EtanHey

@EtanHey EtanHey commented Jun 19, 2026

Copy link
Copy Markdown
Owner

Summary

  • Time-bound brain_search query embedding/model load with BRAINLAYER_EMBED_TIMEOUT_MS (default 1000ms).
  • On embed timeout/error, continue the existing hybrid path with query_embedding=None, which skips vector search and runs the FTS leg, returning annotated search_mode=fts_only results instead of 0/error.
  • Time-bound the prompt injection hook hybrid call and fall back to its existing run_fts_search; the live ~/.claude hook was also updated locally.
  • Added regression coverage for slow/raising embedders, fast hybrid path, and real hybrid_search(query_embedding=None) FTS fallback.

Test plan

  • pytest tests/test_search_handler.py -k 'embed or embedding' -q -> 3 passed
  • pytest tests/test_adaptive_injection.py -k 'slow_hybrid or fast_hybrid or fallback_to_fts_only' -q -> 3 passed
  • python3 -m py_compile src/brainlayer/mcp/search_handler.py src/brainlayer/search_repo.py hooks/brainlayer-prompt-search.py /Users/etanheyman/.claude/hooks/brainlayer-prompt-search.py -> passed
  • pytest tests/test_search_handler.py tests/test_adaptive_injection.py tests/test_hybrid_search.py tests/test_conditional_hooks.py tests/test_search_filter_params.py tests/test_search_profile.py -q -> 155 passed, 1 warning
  • ruff check src/brainlayer/mcp/search_handler.py src/brainlayer/search_repo.py hooks/brainlayer-prompt-search.py tests/test_adaptive_injection.py tests/test_hybrid_search.py tests/test_search_handler.py -> passed
  • ruff format --check src/brainlayer/mcp/search_handler.py src/brainlayer/search_repo.py hooks/brainlayer-prompt-search.py tests/test_adaptive_injection.py tests/test_hybrid_search.py tests/test_search_handler.py -> passed
  • BRAINLAYER_PREPUSH=1 scripts/run_tests.sh -> passed; unit phase 3016 passed, 9 skipped, 61 deselected, 1 xfailed; MCP registration 3 passed; isolated eval/hook routing 40 passed; Bun 1 passed; regression shell passed. This pre-push mode excludes tests/test_vector_store.py and tests/test_engine.py.
  • git push -u origin fix/brain-search-fts-fallback pre-push hook reran BRAINLAYER_PREPUSH=1 scripts/run_tests.sh -> passed; same gate, with unit phase 3016 passed, 9 skipped, 61 deselected, 1 xfailed; MCP registration 3 passed; isolated eval/hook routing 40 passed; Bun 1 passed; regression shell passed.

Notes

  • Local CodeRabbit pre-commit review completed with findings: 0.
  • Do not merge from this worker branch; requested endpoint is PR + review loop.

Note

Fall back to FTS search when embedding times out or raises in search handlers

  • Wraps embedding calls in a timeout in both search_handler.py (via asyncio.wait_for) and brainlayer-prompt-search.py (via a daemon thread + queue), defaulting to 1000ms configurable via BRAINLAYER_EMBED_TIMEOUT_MS.
  • On timeout or any embedding exception, both handlers fall back to FTS-only search; structured results include search_mode and fallback_reason, and human-readable output appends an FTS-only notice.
  • SearchMixin.hybrid_search in search_repo.py now accepts query_embedding=None to trigger the FTS-only path directly, emitting a semantic_skip profiling event.
  • Behavioral Change: callers that previously received hybrid search results on slow or failing embedding backends will now silently receive FTS-only results.

Macroscope summarized 2ae7109.

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@EtanHey

EtanHey commented Jun 19, 2026

Copy link
Copy Markdown
Owner Author

@codex review

@EtanHey

EtanHey commented Jun 19, 2026

Copy link
Copy Markdown
Owner Author

@cursor @BugBot review

@EtanHey

EtanHey commented Jun 19, 2026

Copy link
Copy Markdown
Owner Author

@coderabbitai review

@cursor

cursor Bot commented Jun 19, 2026

Copy link
Copy Markdown

You need to increase your spend limit or enable usage-based billing to run background agents. Go to Cursor

@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown

Warning

Review limit reached

@EtanHey, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 4 minutes and 56 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5a311120-2466-4938-9d7f-7d5525fae7ce

📥 Commits

Reviewing files that changed from the base of the PR and between ecedfa0 and 2ae7109.

📒 Files selected for processing (6)
  • hooks/brainlayer-prompt-search.py
  • src/brainlayer/mcp/search_handler.py
  • src/brainlayer/search_repo.py
  • tests/test_adaptive_injection.py
  • tests/test_hybrid_search.py
  • tests/test_search_handler.py
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/brain-search-fts-fallback

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 2ae7109. Configure here.

thread.start()
thread.join(timeout_ms / 1000.0)
if thread.is_alive():
raise TimeoutError(f"hybrid search exceeded {timeout_ms:.0f}ms")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Timeout leaves background work running

Medium Severity

When the embed/hybrid deadline is exceeded, the caller falls back to FTS, but the timed-out work keeps running in a daemon thread or default executor. Stalled embedding and hybrid work can continue to hold the model and database, competing with the FTS path and later searches in the same process.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 2ae7109. Configure here.

}
)
structured_results.append(item)
structured = {"query": query, "total": len(structured_results), "results": structured_results}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty results omit fallback metadata

Low Severity

After an embedding timeout or error, successful code paths attach search_mode, fallback_reason, and FTS-only messaging to responses. The early return when hybrid search finds no documents still returns a bare empty payload, so clients cannot tell an FTS-only fallback occurred and users do not see the FTS-only notice.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 2ae7109. Configure here.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2ae71092fd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1789 to +1793
loop.run_in_executor(
None,
lambda: _get_embedding_model().embed_query(query),
),
timeout=embed_timeout_ms / 1000.0,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid leaving timed-out embeddings in the shared pool

When the embedding backend/model load is the thing that stalls, this wait_for only times out the asyncio wrapper; the callable submitted via run_in_executor(None, ...) keeps occupying a worker in the shared default executor. Repeated timed-out brain_search calls can therefore exhaust that pool, and with the helper route enabled subsequent searches can block on the helper route's own default-executor call before reaching the FTS fallback. Please isolate/bound the embedding worker or trip a circuit breaker instead of submitting another uncancellable shared-pool job on every search.

Useful? React with 👍 / 👎.

@EtanHey EtanHey merged commit a8b7e78 into main Jun 19, 2026
7 checks passed
@EtanHey

EtanHey commented Jun 19, 2026

Copy link
Copy Markdown
Owner Author

@codex review
@cursor @BugBot review

Follow-up pushed in 07abd6d: capped concurrent MCP embed/hook hybrid workers after timeout and preserved FTS fallback metadata on empty results.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

if not results["documents"][0]:
empty = {"query": query, "total": 0, "results": []}
return ([TextContent(type="text", text="No results found.")], empty)

P2 Badge Preserve fallback metadata for empty FTS searches

When embedding times out or raises and the FTS leg also finds no documents, this early return drops the already-determined search_mode and fallback_reason and returns a plain “No results found.” In that outage-plus-no-lexical-hit scenario, clients cannot distinguish an authoritative hybrid miss from an FTS-only fallback miss, which weakens the MCP tool contract and search observability.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1788 to +1793
query_embedding = await asyncio.wait_for(
loop.run_in_executor(
None,
lambda: _get_embedding_model().embed_query(query),
),
timeout=embed_timeout_ms / 1000.0,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Bound timed-out embedding calls

When the embedding backend hangs in the long-lived MCP server, asyncio.wait_for times out the asyncio wrapper but cannot stop the run_in_executor thread that is already running embed_query. Each brain_search during that outage can leave another default-executor worker occupied, eventually saturating the executor or growing its queue even though the request falls back to FTS. Gate concurrent embed attempts or use a bounded/circuit-breaker path before scheduling more workers.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant