fix: fall back to FTS when search embeddings stall by EtanHey · Pull Request #523 · EtanHey/brainlayer

EtanHey · 2026-06-19T16:22:51Z

Summary

Time-bound brain_search query embedding/model load with BRAINLAYER_EMBED_TIMEOUT_MS (default 1000ms).
On embed timeout/error, continue the existing hybrid path with query_embedding=None, which skips vector search and runs the FTS leg, returning annotated search_mode=fts_only results instead of 0/error.
Time-bound the prompt injection hook hybrid call and fall back to its existing run_fts_search; the live ~/.claude hook was also updated locally.
Added regression coverage for slow/raising embedders, fast hybrid path, and real hybrid_search(query_embedding=None) FTS fallback.

Test plan

pytest tests/test_search_handler.py -k 'embed or embedding' -q -> 3 passed
pytest tests/test_adaptive_injection.py -k 'slow_hybrid or fast_hybrid or fallback_to_fts_only' -q -> 3 passed
python3 -m py_compile src/brainlayer/mcp/search_handler.py src/brainlayer/search_repo.py hooks/brainlayer-prompt-search.py /Users/etanheyman/.claude/hooks/brainlayer-prompt-search.py -> passed
pytest tests/test_search_handler.py tests/test_adaptive_injection.py tests/test_hybrid_search.py tests/test_conditional_hooks.py tests/test_search_filter_params.py tests/test_search_profile.py -q -> 155 passed, 1 warning
ruff check src/brainlayer/mcp/search_handler.py src/brainlayer/search_repo.py hooks/brainlayer-prompt-search.py tests/test_adaptive_injection.py tests/test_hybrid_search.py tests/test_search_handler.py -> passed
ruff format --check src/brainlayer/mcp/search_handler.py src/brainlayer/search_repo.py hooks/brainlayer-prompt-search.py tests/test_adaptive_injection.py tests/test_hybrid_search.py tests/test_search_handler.py -> passed
BRAINLAYER_PREPUSH=1 scripts/run_tests.sh -> passed; unit phase 3016 passed, 9 skipped, 61 deselected, 1 xfailed; MCP registration 3 passed; isolated eval/hook routing 40 passed; Bun 1 passed; regression shell passed. This pre-push mode excludes tests/test_vector_store.py and tests/test_engine.py.
git push -u origin fix/brain-search-fts-fallback pre-push hook reran BRAINLAYER_PREPUSH=1 scripts/run_tests.sh -> passed; same gate, with unit phase 3016 passed, 9 skipped, 61 deselected, 1 xfailed; MCP registration 3 passed; isolated eval/hook routing 40 passed; Bun 1 passed; regression shell passed.

Notes

Local CodeRabbit pre-commit review completed with findings: 0.
Do not merge from this worker branch; requested endpoint is PR + review loop.

Note

Fall back to FTS search when embedding times out or raises in search handlers

Wraps embedding calls in a timeout in both search_handler.py (via asyncio.wait_for) and brainlayer-prompt-search.py (via a daemon thread + queue), defaulting to 1000ms configurable via BRAINLAYER_EMBED_TIMEOUT_MS.
On timeout or any embedding exception, both handlers fall back to FTS-only search; structured results include search_mode and fallback_reason, and human-readable output appends an FTS-only notice.
SearchMixin.hybrid_search in search_repo.py now accepts query_embedding=None to trigger the FTS-only path directly, emitting a semantic_skip profiling event.
Behavioral Change: callers that previously received hybrid search results on slow or failing embedding backends will now silently receive FTS-only results.

^{Macroscope summarized 2ae7109.}

greptile-apps

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

EtanHey · 2026-06-19T16:23:23Z

@codex review

EtanHey · 2026-06-19T16:23:23Z

@cursor @BugBot review

EtanHey · 2026-06-19T16:23:23Z

@coderabbitai review

cursor · 2026-06-19T16:23:26Z

You need to increase your spend limit or enable usage-based billing to run background agents. Go to Cursor

coderabbitai · 2026-06-19T16:23:32Z

Warning

Review limit reached

@EtanHey, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 4 minutes and 56 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5a311120-2466-4938-9d7f-7d5525fae7ce

📥 Commits

Reviewing files that changed from the base of the PR and between ecedfa0 and 2ae7109.

📒 Files selected for processing (6)

hooks/brainlayer-prompt-search.py
src/brainlayer/mcp/search_handler.py
src/brainlayer/search_repo.py
tests/test_adaptive_injection.py
tests/test_hybrid_search.py
tests/test_search_handler.py

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/brain-search-fts-fallback

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai · 2026-06-19T16:23:41Z

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

^{❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

^{Reviewed by Cursor Bugbot for commit 2ae7109. Configure here.}

cursor · 2026-06-19T16:24:32Z

+    thread.start()
+    thread.join(timeout_ms / 1000.0)
+    if thread.is_alive():
+        raise TimeoutError(f"hybrid search exceeded {timeout_ms:.0f}ms")


Timeout leaves background work running

Medium Severity

When the embed/hybrid deadline is exceeded, the caller falls back to FTS, but the timed-out work keeps running in a daemon thread or default executor. Stalled embedding and hybrid work can continue to hold the model and database, competing with the FTS path and later searches in the same process.

Additional Locations (1)

src/brainlayer/mcp/search_handler.py#L1787-L1794

^{Reviewed by Cursor Bugbot for commit 2ae7109. Configure here.}

cursor · 2026-06-19T16:24:32Z

                    }
                )
                structured_results.append(item)
-            structured = {"query": query, "total": len(structured_results), "results": structured_results}


Empty results omit fallback metadata

Low Severity

After an embedding timeout or error, successful code paths attach search_mode, fallback_reason, and FTS-only messaging to responses. The early return when hybrid search finds no documents still returns a bare empty payload, so clients cannot tell an FTS-only fallback occurred and users do not see the FTS-only notice.

^{Reviewed by Cursor Bugbot for commit 2ae7109. Configure here.}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2ae71092fd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-19T16:26:33Z

+                loop.run_in_executor(
+                    None,
+                    lambda: _get_embedding_model().embed_query(query),
+                ),
+                timeout=embed_timeout_ms / 1000.0,


Avoid leaving timed-out embeddings in the shared pool

When the embedding backend/model load is the thing that stalls, this wait_for only times out the asyncio wrapper; the callable submitted via run_in_executor(None, ...) keeps occupying a worker in the shared default executor. Repeated timed-out brain_search calls can therefore exhaust that pool, and with the helper route enabled subsequent searches can block on the helper route's own default-executor call before reaching the FTS fallback. Please isolate/bound the embedding worker or trip a circuit breaker instead of submitting another uncancellable shared-pool job on every search.

Useful? React with 👍 / 👎.

EtanHey · 2026-06-19T16:35:13Z

@codex review
@cursor @BugBot review

Follow-up pushed in 07abd6d: capped concurrent MCP embed/hook hybrid workers after timeout and preserved FTS fallback metadata on empty results.

chatgpt-codex-connector

💡 Codex Review

brainlayer/src/brainlayer/mcp/search_handler.py

Lines 1897 to 1899 in a8b7e78

    
           if not results["documents"][0]: 
        
               empty = {"query": query, "total": 0, "results": []} 
        
               return ([TextContent(type="text", text="No results found.")], empty)

Preserve fallback metadata for empty FTS searches

When embedding times out or raises and the FTS leg also finds no documents, this early return drops the already-determined search_mode and fallback_reason and returns a plain “No results found.” In that outage-plus-no-lexical-hit scenario, clients cannot distinguish an authoritative hybrid miss from an FTS-only fallback miss, which weakens the MCP tool contract and search observability.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-19T16:41:08Z

+            query_embedding = await asyncio.wait_for(
+                loop.run_in_executor(
+                    None,
+                    lambda: _get_embedding_model().embed_query(query),
+                ),
+                timeout=embed_timeout_ms / 1000.0,


Bound timed-out embedding calls

When the embedding backend hangs in the long-lived MCP server, asyncio.wait_for times out the asyncio wrapper but cannot stop the run_in_executor thread that is already running embed_query. Each brain_search during that outage can leave another default-executor worker occupied, eventually saturating the executor or growing its queue even though the request falls back to FTS. Gate concurrent embed attempts or use a bounded/circuit-breaker path before scheduling more workers.

Useful? React with 👍 / 👎.

fix: fall back to FTS when search embeddings stall

2ae7109

greptile-apps Bot reviewed Jun 19, 2026

View reviewed changes

cursor Bot reviewed Jun 19, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed Jun 19, 2026

View reviewed changes

EtanHey merged commit a8b7e78 into main Jun 19, 2026
7 checks passed

EtanHey mentioned this pull request Jun 19, 2026

fix: fall back to FTS when search embeddings stall #524

Open

chatgpt-codex-connector Bot reviewed Jun 19, 2026

View reviewed changes

	if not results["documents"][0]:
	empty = {"query": query, "total": 0, "results": []}
	return ([TextContent(type="text", text="No results found.")], empty)

Conversation

EtanHey commented Jun 19, 2026 • edited by macroscopeapp Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Notes

Fall back to FTS search when embedding times out or raises in search handlers

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

EtanHey commented Jun 19, 2026

Uh oh!

EtanHey commented Jun 19, 2026

Uh oh!

EtanHey commented Jun 19, 2026

Uh oh!

cursor Bot commented Jun 19, 2026

Uh oh!

coderabbitai Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Uh oh!

coderabbitai Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cursor Bot Jun 19, 2026

Choose a reason for hiding this comment

Timeout leaves background work running

Uh oh!

cursor Bot Jun 19, 2026

Choose a reason for hiding this comment

Empty results omit fallback metadata

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

EtanHey commented Jun 19, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

EtanHey commented Jun 19, 2026 •

edited by macroscopeapp Bot

Loading

coderabbitai Bot commented Jun 19, 2026 •

edited

Loading

coderabbitai Bot commented Jun 19, 2026 •

edited

Loading