fix: fall back to FTS when search embeddings stall by EtanHey · Pull Request #524 · EtanHey/brainlayer

EtanHey · 2026-06-19T16:39:19Z

Summary

Time-bound MCP brain_search query embedding via BRAINLAYER_EMBED_TIMEOUT_MS and fall back to FTS-only on timeout, embed errors, or an already stalled embedder.
Allow hybrid_search to accept query_embedding=None and skip vector work while keeping the FTS/RRF result path alive.
Time-bound prompt injection hybrid search and fall back to FTS-only; operational/no-result injection event noise remains suppressed.
Apply the same timeout/FTS fallback guard to the live ~/.claude prompt hook.

Tests

RED first: slow/raising embedder tests failed before implementation.
pytest tests/test_search_handler.py -k "falls_back_to_fts_when_embedding_times_out or falls_back_to_fts_when_embedding_raises or uses_hybrid_when_embedding_is_fast or parallel_embedding or empty_fts_fallback" -q
pytest tests/test_adaptive_injection.py -k "fallback_to_fts_only or slow_hybrid or timed_out_hybrid or fast_hybrid" -q
python3 -m py_compile src/brainlayer/mcp/search_handler.py src/brainlayer/search_repo.py hooks/brainlayer-prompt-search.py /Users/etanheyman/.claude/hooks/brainlayer-prompt-search.py
pytest tests/test_search_handler.py tests/test_adaptive_injection.py tests/test_hybrid_search.py tests/test_conditional_hooks.py tests/test_search_filter_params.py tests/test_search_profile.py -q
ruff check src/brainlayer/mcp/search_handler.py src/brainlayer/search_repo.py hooks/brainlayer-prompt-search.py tests/test_adaptive_injection.py tests/test_hybrid_search.py tests/test_search_handler.py
ruff format --check src/brainlayer/mcp/search_handler.py src/brainlayer/search_repo.py hooks/brainlayer-prompt-search.py tests/test_adaptive_injection.py tests/test_hybrid_search.py tests/test_search_handler.py
BRAINLAYER_PREPUSH=1 scripts/run_tests.sh via pre-push hook: 3019 passed, 9 skipped, 61 deselected, 1 xfailed; MCP registration 3 passed; isolated eval/hook routing 40 passed; bun 1 passed; regression shell passed.

Supersedes #523 because GitHub did not synchronize #523 after the follow-up commit even though the branch ref advanced.

Note

Medium Risk
Changes core brain_search and prompt-hook retrieval concurrency and response shape (search_mode, fallback_reason); lock bugs could degrade all searches to FTS-only, but behavior is covered by new timeout/busy/cancel tests.

Overview
Prevents stacked embedding work when hybrid search or MCP brain_search is slow or times out. A process-wide lock means a second concurrent request skips vector embedding and uses FTS-only immediately (embed_busy) instead of starting another embedder call.

The MCP _search path wraps executor-based embedding with lock acquire/release on timeout, errors, and cancellation so a timed-out embed cannot leave the lock held and block later searches. BRAINLAYER_EMBED_TIMEOUT_MS now rejects non-finite or non-positive values (defaults to 1000 ms). Empty no-hit responses include search_mode and fallback_reason, with user-facing text explaining FTS-only fallback when embedding was skipped.

The prompt hook applies the same pattern via HYBRID_IN_FLIGHT around timed hybrid search so overlapping hook invocations do not spawn parallel hybrid workers. Tests cover parallel suppression, lock release edge cases, and timeout env validation.

^{Reviewed by Cursor Bugbot for commit 93caf0e. Bugbot is set up for automated code reviews on this repo. Configure here.}

Note

Fall back to FTS when embedding is busy or times out in search handler

Adds a module-level _EMBED_IN_FLIGHT lock in search_handler.py so only one embedding can run at a time; concurrent requests immediately fall back to FTS-only with fallback_reason='embed_busy'.
On asyncio.TimeoutError, the handler falls back to FTS-only with fallback_reason='embed_timeout'; on general exceptions, falls back with fallback_reason='embed_error:<ExceptionClass>'.
Fixes BRAINLAYER_EMBED_TIMEOUT_MS validation to reject NaN/Inf and non-positive values, defaulting to 1000ms instead of using invalid values.
Structured and empty search results now include search_mode and fallback_reason metadata; empty-result text reflects FTS-only fallback when applicable.
Risk: embedding is now fully serialized — parallel search requests that previously ran concurrent embeddings will now fall back to FTS instead.

^{Macroscope summarized 93caf0e.}

Summary by CodeRabbit

Release Notes

New Features
- Search operations now enforce configurable timeouts for embedding queries, improving responsiveness.
- Automatic fallback to full-text search when embedding times out or encounters errors.
- Search results now indicate fallback mode and reason when embedding is skipped.
Tests
- Added comprehensive test coverage for embedding timeout and fallback scenarios.

greptile-apps

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

coderabbitai · 2026-06-19T16:39:30Z

📝 Walkthrough

Walkthrough

The PR adds configurable embedding timeout and per-process concurrency locks to both the hook script and MCP search handler. When embedding exceeds the timeout or a concurrent embed is already running, both paths fall back to FTS-only search. SearchMixin.hybrid_search is updated to accept Optional[List[float]] for query_embedding, treating None as an intentional FTS-only signal.

Changes

Embedding timeout and FTS-only fallback

Layer / File(s)	Summary
`hybrid_search` optional embedding contract `src/brainlayer/search_repo.py`	Changes `query_embedding` parameter type to `Optional[List[float]]`; adds a `None` branch that skips semantic retrieval, returns empty semantic containers, and emits a `search_profile` event with `reason="missing_query_embedding"` before FTS/RRF fusion.
Hook: timeout/concurrency infrastructure and call-site `hooks/brainlayer-prompt-search.py`	Adds `DEFAULT_EMBED_TIMEOUT_MS`, `HYBRID_IN_FLIGHT` lock, `embed_timeout_ms()` (env-driven, validated, capped at 30,000ms), and `run_with_timeout()` (daemon thread with non-blocking lock and queue-based result capture). Updates `search_prompt_chunks()` to call `run_with_timeout` instead of `run_hybrid_search` directly, routing `TimeoutError` into the existing FTS-only fallback.
MCP search handler: timeout/concurrency infrastructure and `_search` integration `src/brainlayer/mcp/search_handler.py`	Adds `_DEFAULT_EMBED_TIMEOUT_MS`, `_EMBED_IN_FLIGHT` lock, `_embed_timeout_ms()`, and `_embed_query_and_release()`. Replaces `_search`'s embedding execution with non-blocking lock acquisition and timeout handling, emitting `search_profile` events per fallback reason (`embed_busy`, `embed_timeout`, `embed_error`). Updates all structured response sites and result-formatting text to include `search_mode` and conditional `fallback_reason`.
Tests: timeout, concurrency, and FTS-only fallback coverage `tests/test_search_handler.py`, `tests/test_adaptive_injection.py`, `tests/test_hybrid_search.py`	Adds test doubles (`SlowEmbeddingModel`, `CountingSlowEmbeddingModel`, `RaisingEmbeddingModel`, `FtsFallbackSearchStore`, `EmptyFtsFallbackSearchStore`) and async/sync tests covering embed-timeout fallback, embed-error fallback, fast-embed hybrid path, embed-busy concurrency deduplication, empty FTS fallback metadata, and `hybrid_search(query_embedding=None)` FTS-only output.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant _search
    participant _EMBED_IN_FLIGHT as _EMBED_IN_FLIGHT (Lock)
    participant EmbeddingModel
    participant hybrid_search
    participant search_profile

    Client->>`_search`: query

    rect rgba(100, 180, 255, 0.5)
        Note over _search,_EMBED_IN_FLIGHT: Concurrency guard
        `_search`->>`_EMBED_IN_FLIGHT`: acquire(blocking=False)
    end

    alt Lock acquired + embedding succeeds within timeout
        `_search`->>EmbeddingModel: embed(query)
        EmbeddingModel-->>`_search`: vector
        `_search`->>search_profile: emit embed_success
        `_search`->>hybrid_search: query_embedding=vector
    else Lock busy
        `_search`->>search_profile: emit embed_busy
        `_search`->>hybrid_search: query_embedding=None
    else Timeout exceeded
        `_search`->>search_profile: emit embed_timeout
        `_search`->>hybrid_search: query_embedding=None
    else Embedding raises
        `_search`->>search_profile: emit embed_error
        `_search`->>hybrid_search: query_embedding=None
    end

    hybrid_search-->>`_search`: results
    `_search`-->>Client: {search_mode, fallback_reason?, results}

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

EtanHey/brainlayer#225: Both PRs modify SearchMixin.hybrid_search in src/brainlayer/search_repo.py — this PR changes query_embedding to accept None for FTS-only fallback while the referenced PR adds filter_meta_noise flag handling in the same function.
EtanHey/brainlayer#242: Both PRs modify SearchMixin.hybrid_search in src/brainlayer/search_repo.py, with this PR adding the optional-embedding/timeout/busy fallback and the referenced PR adding MMR post-reranking/overfetch behavior.

Poem

🐇 When embeddings run slow or the model's too busy,
I don't wait around getting flustered and dizzy.
I start up a thread, set a clock on the wall,
If it times out — FTS handles the call!
With a lock and a queue and a daemon in flight,
This bunny keeps searches snappy and right. 🕐

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 10.26% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'fix: fall back to FTS when search embeddings stall' accurately captures the main objective of the changeset: implementing fallback logic to Full Text Search when embeddings encounter timeouts or errors.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/brain-search-fts-fallback-pr2

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

EtanHey · 2026-06-19T16:39:35Z

@codex review
@cursor @BugBot review
@coderabbitai review

coderabbitai · 2026-06-19T16:41:55Z

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Get PR details and list of changed files
gh pr view 524 --repo EtanHey/brainlayer --json title,body,files,additions,deletions,changedFiles

echo "=== CHANGED FILES ==="
gh pr diff 524 --repo EtanHey/brainlayer --name-only

Length of output: 4748

🏁 Script executed:

#!/bin/bash
# Get the full diff
gh pr diff 524 --repo EtanHey/brainlayer

Length of output: 23977

@EtanHey: Solid PR — the design is sound and the test coverage is comprehensive. Found one correctness bug and a few minor issues worth addressing.

Review: `fix: fall back to FTS when search embeddings stall`

🔴 Bug — `except TimeoutError` misses `asyncio.TimeoutError` on Python < 3.11

File: src/brainlayer/mcp/search_handler.py

except TimeoutError as exc:   # ← catches builtins.TimeoutError

asyncio.wait_for raises asyncio.TimeoutError. On Python < 3.11, that class inherits from concurrent.futures.TimeoutError → Exception — not from the built-in TimeoutError (which is OSError → Exception via a different branch). The two are unified only in 3.11+.

If the project supports Python 3.10 or earlier, the timeout fires, the except TimeoutError block is skipped, the bare except Exception catches it instead, and fallback_reason becomes "embed_error:TimeoutError" instead of "embed_timeout". Functionally it still falls back to FTS, but the search_mode/fallback_reason semantics are wrong and the profile event emitted is the generic error one, not the timeout one.

Fix:

# search_handler.py — import asyncio at top (already imported for the async function)
except asyncio.TimeoutError as exc:
    search_mode = "fts_only"
    fallback_reason = "embed_timeout"
    ...

🟡 Lock not released on `CancelledError`

File: src/brainlayer/mcp/search_handler.py

If the _search coroutine is externally cancelled while await asyncio.wait_for(embed_future, ...) is outstanding, CancelledError (a BaseException) propagates and bypasses both the TimeoutError and Exception handlers. The _EMBED_IN_FLIGHT lock stays held until the background thread (_embed_query_and_release) eventually finishes and calls _EMBED_IN_FLIGHT.release(). In the interim, any subsequent _search call sees the lock as busy and falls into embed_busy mode — correct in spirit, but the window is bounded only by the embedder's latency, not the timeout budget.

A defensive pattern:

embed_future = loop.run_in_executor(None, _embed_query_and_release, query)
try:
    query_embedding = await asyncio.wait_for(embed_future, timeout=embed_timeout_ms / 1000.0)
except asyncio.TimeoutError as exc:
    ...
except Exception as exc:
    ...
# No finally needed — _embed_query_and_release.finally already releases the lock
# when the thread completes, even after cancellation.

The lock will be released eventually (it's not a permanent leak), so this is low-severity — worth documenting with a comment.

🟡 `results.get_nowait()` can raise `queue.Empty` if thread raised before putting

File: hooks/brainlayer-prompt-search.py

def target():
    try:
        results.put((True, func(*args, **kwargs)))
    except BaseException as exc:
        results.put((False, exc))   # ← second put could block if first put somehow succeeded
    finally:
        HYBRID_IN_FLIGHT.release()

The current structure is correct: the except is reached only when func(...) raises, so results.put is called exactly once. But if results.put itself raises (e.g. queue.Full from a second put in a future refactor), thread would complete without an item in the queue, and results.get_nowait() in the main thread would raise queue.Empty instead of a useful error. Consider results.put_nowait with an explicit guard, or simplify to a list protected by the lock.

Low risk in current form — flagging for defensive awareness.

🟡 Time-sensitive tests risk flakiness under CI load

Files: tests/test_adaptive_injection.py, tests/test_search_handler.py

Patterns like:

monkeypatch.setenv("BRAINLAYER_EMBED_TIMEOUT_MS", "1")  # 1 ms timeout
# ...
time.sleep(0.05)                                         # 50 ms sleep in thread
assert elapsed < 0.5

1 ms is tighter than typical CI scheduling jitter. If the scheduler delays thread start by a few ms, the test could time out "correctly" but then the finally: time.sleep(0.06) cleanup might not be long enough. Consider using at least 10–25 ms as the timeout floor in tests while keeping assertions loose (e.g., < 2.0), or parameterise with an environment variable.

✅ What looks good

Single-flight guarantee — acquiring before spawning, releasing in finally, and the blocking=False check on the second call are all correct. The CountingSlowEmbeddingModel test validates no parallel workers spawn.
search_repo.py change — the query_embedding is None fast-path emits a profile event and returns an empty-but-valid structure, keeping RRF/FTS intact cleanly.
Lock non-leak on run_in_executor failure — the explicit _EMBED_IN_FLIGHT.release() in the inner except before re-raising is correct.
Cap at 30 000 ms — prevents runaway timeouts from misconfiguration.
Hook duplication of embed_timeout_ms() — unavoidable since the hook is a standalone script; acceptable trade-off.
Test coverage — RED-first discipline, covers timeout, raise, hybrid-happy-path, parallel prevention, and empty-results paths.

Summary

Severity	Issue
🔴 Bug	`except TimeoutError` misses `asyncio.TimeoutError` on Python < 3.11
🟡 Minor	`CancelledError` leaves lock held until background thread finishes
🟡 Minor	`results.get_nowait()` edge case in hook
🟡 Minor	1 ms test timeout may be flaky under CI load

…fallback-pr2 # Conflicts: # hooks/brainlayer-prompt-search.py # src/brainlayer/mcp/search_handler.py # tests/test_adaptive_injection.py # tests/test_search_handler.py

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 07abd6df82

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-19T16:43:21Z

+                except Exception:
+                    _EMBED_IN_FLIGHT.release()
+                    raise
+                query_embedding = await asyncio.wait_for(embed_future, timeout=embed_timeout_ms / 1000.0)


Release embed lock when queued futures time out

If the default executor is saturated, run_in_executor can leave this embed job queued until wait_for times out; wait_for then cancels the queued future before _embed_query_and_release() ever runs, so its finally never releases _EMBED_IN_FLIGHT. After that, every later brain_search in this MCP process sees embed_busy and permanently falls back to FTS-only until restart, even after the executor drains.

Useful? React with 👍 / 👎.

greptile-apps

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@hooks/brainlayer-prompt-search.py`:
- Around line 16-20: The embed_timeout_ms() function at line 57 does not
properly validate non-finite timeout values like NaN, which causes thread.join()
to fail later at line 78 with a ValueError. Add the math module to the imports
at the top of the file, then update the validation logic in embed_timeout_ms()
to use math.isfinite() to reject non-finite values (NaN, infinity) at validation
time rather than allowing them to pass and fail later during thread.join()
execution.

In `@tests/test_adaptive_injection.py`:
- Around line 183-184: Replace the fixed time.sleep(0.06) calls in the finally
block on line 184 and the similar call on line 220 with a bounded polling loop
that explicitly checks for the HYBRID_IN_FLIGHT flag to be unlocked. This
ensures proper synchronization on actual lock state rather than relying on
timing assumptions, eliminating flakiness caused by variable scheduler timing.
Use a reasonable timeout limit for the polling loop to prevent indefinite waits.

In `@tests/test_search_handler.py`:
- Around line 216-217: Replace the fixed `await asyncio.sleep(0.06)` calls in
the finally blocks (at both the first occurrence and the second one also
mentioned at lines 288-289) with deterministic polling logic that waits for the
`_EMBED_IN_FLIGHT` lock or semaphore to be released with a deadline instead of
sleeping for a constant duration. This ensures that cleanup waits for the actual
lock release event rather than hoping a fixed sleep is long enough, preventing
test flakiness on slower CI systems where `_EMBED_IN_FLIGHT` might leak into
following tests.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 943c6e16-1046-45a2-96dd-8d733750e0c6

📥 Commits

Reviewing files that changed from the base of the PR and between a8b7e78 and 07abd6d.

📒 Files selected for processing (6)

hooks/brainlayer-prompt-search.py
src/brainlayer/mcp/search_handler.py
src/brainlayer/search_repo.py
tests/test_adaptive_injection.py
tests/test_hybrid_search.py
tests/test_search_handler.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Macroscope - Correctness Check

🧰 Additional context used

📓 Path-based instructions (5)

**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Flag risky DB or concurrency changes explicitly and do not hand-wave lock behavior
Enforce one-write-at-a-time concurrency constraint; reads are safe but brain_digest is write-heavy and must not run in parallel with other MCP work
Run pytest before claiming behavior changed safely; current test suite has 929 tests

Files:

tests/test_hybrid_search.py
hooks/brainlayer-prompt-search.py
tests/test_adaptive_injection.py
src/brainlayer/search_repo.py
src/brainlayer/mcp/search_handler.py
tests/test_search_handler.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Run tests with pytest and use BRAINLAYER_PREPUSH=1 for pre-push validation

Files:

tests/test_hybrid_search.py
tests/test_adaptive_injection.py
tests/test_search_handler.py

src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Run linting with ruff check src/ && ruff format src/ on all Python source files

Files:

src/brainlayer/search_repo.py
src/brainlayer/mcp/search_handler.py

src/brainlayer/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Resolve database path using paths.py:get_db_path() with env var override; canonical path is ~/.local/share/brainlayer/brainlayer.db

Files:

src/brainlayer/search_repo.py
src/brainlayer/mcp/search_handler.py

src/brainlayer/**/*search*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Exclude lifecycle-managed chunks from default search; support include_archived=True to show history

Files:

src/brainlayer/search_repo.py
src/brainlayer/mcp/search_handler.py

🪛 ast-grep (0.43.0)

tests/test_adaptive_injection.py

[info] 173-173: Do not hardcode temporary file or directory names
Context: "/tmp/test.db"
Note: [CWE-377].

(hardcoded-tmp-file)

[info] 202-202: Do not hardcode temporary file or directory names
Context: "/tmp/test.db"
Note: [CWE-377].

(hardcoded-tmp-file)

[info] 208-208: Do not hardcode temporary file or directory names
Context: "/tmp/test.db"
Note: [CWE-377].

(hardcoded-tmp-file)

[info] 231-231: Do not hardcode temporary file or directory names
Context: "/tmp/test.db"
Note: [CWE-377].

(hardcoded-tmp-file)

🔇 Additional comments (5)

tests/test_hybrid_search.py (1)

220-236: LGTM!

src/brainlayer/search_repo.py (1)

1714-1714: LGTM!

Also applies to: 1813-1825

hooks/brainlayer-prompt-search.py (1)

39-40: LGTM!

Also applies to: 62-85, 988-988

src/brainlayer/mcp/search_handler.py (2)

38-39: LGTM!

Also applies to: 77-81, 1792-1840, 1861-1889, 1917-1923, 1953-1975, 1977-1979, 2050-2061

10-10: No action required. The current _embed_timeout_ms() validation at lines 66-74 already correctly rejects NaN. The condition if not value or value < 0: evaluates to False when value is NaN (since not float('nan') is False and float('nan') < 0 is False), preventing NaN from being passed to asyncio.wait_for(). The code does not have the vulnerability described in the review comment.

coderabbitai

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@hooks/brainlayer-prompt-search.py`:
- Around line 16-20: The embed_timeout_ms() function at line 57 does not
properly validate non-finite timeout values like NaN, which causes thread.join()
to fail later at line 78 with a ValueError. Add the math module to the imports
at the top of the file, then update the validation logic in embed_timeout_ms()
to use math.isfinite() to reject non-finite values (NaN, infinity) at validation
time rather than allowing them to pass and fail later during thread.join()
execution.

In `@tests/test_adaptive_injection.py`:
- Around line 183-184: Replace the fixed time.sleep(0.06) calls in the finally
block on line 184 and the similar call on line 220 with a bounded polling loop
that explicitly checks for the HYBRID_IN_FLIGHT flag to be unlocked. This
ensures proper synchronization on actual lock state rather than relying on
timing assumptions, eliminating flakiness caused by variable scheduler timing.
Use a reasonable timeout limit for the polling loop to prevent indefinite waits.

In `@tests/test_search_handler.py`:
- Around line 216-217: Replace the fixed `await asyncio.sleep(0.06)` calls in
the finally blocks (at both the first occurrence and the second one also
mentioned at lines 288-289) with deterministic polling logic that waits for the
`_EMBED_IN_FLIGHT` lock or semaphore to be released with a deadline instead of
sleeping for a constant duration. This ensures that cleanup waits for the actual
lock release event rather than hoping a fixed sleep is long enough, preventing
test flakiness on slower CI systems where `_EMBED_IN_FLIGHT` might leak into
following tests.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 943c6e16-1046-45a2-96dd-8d733750e0c6

📥 Commits

Reviewing files that changed from the base of the PR and between a8b7e78 and 07abd6d.

📒 Files selected for processing (6)

hooks/brainlayer-prompt-search.py
src/brainlayer/mcp/search_handler.py
src/brainlayer/search_repo.py
tests/test_adaptive_injection.py
tests/test_hybrid_search.py
tests/test_search_handler.py

📜 Review details

🔇 Additional comments (5)

tests/test_hybrid_search.py (1)

220-236: LGTM!

src/brainlayer/search_repo.py (1)

1714-1714: LGTM!

Also applies to: 1813-1825

hooks/brainlayer-prompt-search.py (1)

39-40: LGTM!

Also applies to: 62-85, 988-988

src/brainlayer/mcp/search_handler.py (2)

38-39: LGTM!

Also applies to: 77-81, 1792-1840, 1861-1889, 1917-1923, 1953-1975, 1977-1979, 2050-2061

10-10: No action required. The current _embed_timeout_ms() validation at lines 66-74 already correctly rejects NaN. The condition if not value or value < 0: evaluates to False when value is NaN (since not float('nan') is False and float('nan') < 0 is False), preventing NaN from being passed to asyncio.wait_for(). The code does not have the vulnerability described in the review comment.

🛑 Comments failed to post (3)

hooks/brainlayer-prompt-search.py (1)

16-20: ⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
python - <<'PY'
import math
import threading

value = float("nan")
print("current_checks_allow_nan=", bool(value) and not (value < 0))
print("isfinite=", math.isfinite(value))

thread = threading.Thread(target=lambda: None)
thread.start()
try:
    thread.join(value)
    print("join accepted nan")
except Exception as exc:
    print(type(exc).__name__, str(exc))
PY

Repository: EtanHey/brainlayer

Length of output: 153

🏁 Script executed:

cat -n hooks/brainlayer-prompt-search.py | sed -n '16,20p; 39,60p; 62,90p; 985,995p'

Repository: EtanHey/brainlayer

Length of output: 2652

🏁 Script executed:

rg "BRAINLAYER_EMBED_TIMEOUT_MS|thread\.join" hooks/brainlayer-prompt-search.py -n -A 2 -B 2

Repository: EtanHey/brainlayer

Length of output: 487

Reject non-finite timeout values in embed_timeout_ms().

The validation at line 57 allows BRAINLAYER_EMBED_TIMEOUT_MS=nan to pass (since not float('nan') is False and float('nan') < 0 is False), but thread.join() at line 78 will raise ValueError: Invalid value NaN (not a number). Although the exception is caught and the search degrades to FTS-only, this should be rejected at validation time like other malformed values.

Use math.isfinite() to reject non-finite values:

Proposed fix

+import math
 import queue
 import re
 import sqlite3
 import sys
 import threading

 def embed_timeout_ms():
     raw = os.environ.get("BRAINLAYER_EMBED_TIMEOUT_MS", str(DEFAULT_EMBED_TIMEOUT_MS))
     try:
         value = float(raw)
     except (TypeError, ValueError):
         return DEFAULT_EMBED_TIMEOUT_MS
-    if not value or value < 0:
+    if not math.isfinite(value) or value <= 0:
         return DEFAULT_EMBED_TIMEOUT_MS
     return min(value, 30_000.0)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hooks/brainlayer-prompt-search.py` around lines 16 - 20, The
embed_timeout_ms() function at line 57 does not properly validate non-finite
timeout values like NaN, which causes thread.join() to fail later at line 78
with a ValueError. Add the math module to the imports at the top of the file,
then update the validation logic in embed_timeout_ms() to use math.isfinite() to
reject non-finite values (NaN, infinity) at validation time rather than allowing
them to pass and fail later during thread.join() execution.

tests/test_adaptive_injection.py (1)

183-184: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid fixed sleep for lock synchronization in timeout tests.

Line 184 and Line 220 use fixed delays to “let the worker finish”; this is brittle under variable scheduler timing and can make timeout/concurrency assertions flaky. Wait for HYBRID_IN_FLIGHT to unlock with a bounded polling loop.

As per coding guidelines, "Flag risky DB or concurrency changes explicitly and do not hand-wave lock behavior."

Suggested patch

@@
 import time
@@
 class TestSearchFallback:
+    def _wait_for_hybrid_release(self, prompt_search, timeout_s: float = 1.0):
+        deadline = time.monotonic() + timeout_s
+        while prompt_search.HYBRID_IN_FLIGHT.locked() and time.monotonic() < deadline:
+            time.sleep(0.01)
+        assert not prompt_search.HYBRID_IN_FLIGHT.locked()
+
@@
-        finally:
-            time.sleep(0.06)
+        finally:
+            self._wait_for_hybrid_release(prompt_search)
@@
-        finally:
-            time.sleep(0.25)
+        finally:
+            self._wait_for_hybrid_release(prompt_search)

Also applies to: 219-220

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_adaptive_injection.py` around lines 183 - 184, Replace the fixed
time.sleep(0.06) calls in the finally block on line 184 and the similar call on
line 220 with a bounded polling loop that explicitly checks for the
HYBRID_IN_FLIGHT flag to be unlocked. This ensures proper synchronization on
actual lock state rather than relying on timing assumptions, eliminating
flakiness caused by variable scheduler timing. Use a reasonable timeout limit
for the polling loop to prevent indefinite waits.

Source: Coding guidelines

tests/test_search_handler.py (1)

216-217: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Replace fixed cleanup sleeps with deterministic lock-release waiting.

On Line 217 and Line 289, fixed await asyncio.sleep(...) can still race on slower CI and leak _EMBED_IN_FLIGHT into following tests, causing flaky embed_busy outcomes. Poll lock release with a deadline instead of sleeping a constant duration.

As per coding guidelines, "Flag risky DB or concurrency changes explicitly and do not hand-wave lock behavior."

Suggested patch

@@
 import asyncio
 import time
@@
 from brainlayer.mcp import call_tool
 from brainlayer.mcp.search_handler import _brain_search, _search
+import brainlayer.mcp.search_handler as search_handler_module
+
+
+async def _wait_for_embed_release(timeout_s: float = 1.0) -> None:
+    deadline = time.monotonic() + timeout_s
+    while search_handler_module._EMBED_IN_FLIGHT.locked() and time.monotonic() < deadline:
+        await asyncio.sleep(0.01)
+    assert not search_handler_module._EMBED_IN_FLIGHT.locked()
@@
-    finally:
-        await asyncio.sleep(0.06)
+    finally:
+        await _wait_for_embed_release()
@@
-    finally:
-        await asyncio.sleep(0.25)
+    finally:
+        await _wait_for_embed_release()

Also applies to: 288-289

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_search_handler.py` around lines 216 - 217, Replace the fixed
`await asyncio.sleep(0.06)` calls in the finally blocks (at both the first
occurrence and the second one also mentioned at lines 288-289) with
deterministic polling logic that waits for the `_EMBED_IN_FLIGHT` lock or
semaphore to be released with a deadline instead of sleeping for a constant
duration. This ensures that cleanup waits for the actual lock release event
rather than hoping a fixed sleep is long enough, preventing test flakiness on
slower CI systems where `_EMBED_IN_FLIGHT` might leak into following tests.

Source: Coding guidelines

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 275aa88f61

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-19T16:51:45Z

+        if not _EMBED_IN_FLIGHT.acquire(blocking=False):
            search_mode = "fts_only"
-            fallback_reason = "embed_timeout"
+            fallback_reason = "embed_busy"


Avoid downgrading healthy concurrent searches

When two brain_search/_search calls overlap, the second call hits this non-blocking acquire while the first embedding is still running, even if that first embed is fast and still within BRAINLAYER_EMBED_TIMEOUT_MS. It immediately marks fts_only/embed_busy and passes query_embedding=None to hybrid_search, so concurrent read traffic loses semantic retrieval instead of waiting within its own timeout; only downgrade after the in-flight embed has actually timed out/stalled or after a bounded wait.

Useful? React with 👍 / 👎.

greptile-apps

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

EtanHey · 2026-06-19T16:59:17Z

Latest follow-up pushed in 93a3fec: embeds now release the single-flight gate when a queued executor future times out before the worker starts, non-finite timeout env values are rejected, and timeout-test cleanup waits on actual lock state rather than fixed sleeps.

Verification after that commit:

BRAINLAYER_PREPUSH=1 scripts/run_tests.sh passed: 3024 passed, 9 skipped, 61 deselected, 1 xfailed, plus MCP registration, isolated eval/hook routing, bun, and FTS determinism.

Review notes:

The current P1 queued-future lock-leak comment is addressed by release_if_embed_never_started() plus test_brain_search_releases_embed_lock_when_executor_timeout_cancels_queued_future.
I am intentionally not making concurrent searches wait behind an in-flight embed: the requirement is that embeddings must never block reads, so the second overlapping search should degrade to FTS-only instead of waiting on the vector leg.
The asyncio.TimeoutError compatibility concern is not applicable to this package because pyproject.toml requires Python >=3.11 and CI covers 3.11/3.12/3.13.

@codex review
@cursor @BugBot review
@coderabbitai review

coderabbitai · 2026-06-19T17:00:23Z

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check Python version requirement in pyproject.toml
grep -n "python" pyproject.toml | head -20

echo "=== Latest commit on PR branch ==="
git log --oneline -5

echo "=== search_handler.py - key sections ==="
grep -n "TimeoutError\|asyncio\|release_if_embed_never_started\|CancelledError\|EMBED_IN_FLIGHT\|non.finite\|finite\|math\.isfinite\|isinf\|isnan" src/brainlayer/mcp/search_handler.py

echo "=== hook file - key sections ==="
grep -n "TimeoutError\|HYBRID_IN_FLIGHT\|release\|get_nowait\|Empty\|non.finite\|isfinite" hooks/brainlayer-prompt-search.py

echo "=== New test for lock release ==="
grep -n "release_if_embed_never_started\|executor_timeout_cancels\|lock_state\|acquire\|locked" tests/test_search_handler.py | head -30

Length of output: 3075

@EtanHey: Verified against 93a3fec — everything checks out.

except TimeoutError / Python compat — pyproject.toml confirms requires-python = ">=3.11", where asyncio.TimeoutError is an alias for builtins.TimeoutError. The catch is correct; my concern doesn't apply. ✅
Queued-future lock-leak — release_if_embed_never_started() (line 1814) is defined and called in the except TimeoutError handler (line 1841), and the new test_brain_search_releases_embed_lock_when_executor_timeout_cancels_queued_future validates it via real lock-state polling. ✅
Non-finite timeout rejection — math.isfinite(value) guard present in both search_handler.py (line 73) and hooks/brainlayer-prompt-search.py (line 58). ✅
Test timing flakiness — test cleanup now polls _EMBED_IN_FLIGHT.locked() with a deadline instead of fixed sleeps. ✅
results.get_nowait() in hook — unchanged (low-risk in current form). Fine to leave as-is.
FTS-only degradation design — the intentional choice to degrade concurrent searches to FTS-only rather than queue them is well-reasoned for a latency-sensitive read path. ✅

All four issues are resolved or correctly invalidated. The implementation looks solid.

[approve]

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

^{Reviewed by Cursor Bugbot for commit 93a3fec. Configure here.}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 93a3fec71c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-19T17:02:49Z

+                except Exception:
+                    release_embed_lock_once()
+                    raise
+                query_embedding = await asyncio.wait_for(embed_future, timeout=embed_timeout_ms / 1000.0)


Release embed gate when queued futures are cancelled

When an MCP _search task is cancelled while run_in_executor has accepted the work but the default executor has not started embed_query_guarded yet (for example, a client disconnect/request timeout while the executor is saturated), asyncio.wait_for exits via CancelledError, which is not caught by except Exception on Python 3.11+. This path never calls release_if_embed_never_started(), and because the cancelled queued future will not run the worker finally, _EMBED_IN_FLIGHT remains locked so every subsequent search degrades to embed_busy/FTS-only until the process restarts.

Useful? React with 👍 / 👎.

This is addressed in the current 93caf0e diff: _search now has an explicit except asyncio.CancelledError: branch immediately after the wait_for, calls release_if_embed_never_started(), then re-raises. The regression test_brain_search_releases_embed_lock_when_cancelled_before_executor_starts reproduces exactly the queued-future cancellation case and failed before this change. Full pre-push passed after the fix: 3025 passed, 9 skipped, 61 deselected, 1 xfailed, plus MCP registration, isolated eval/hook routing, bun, and FTS determinism.

EtanHey · 2026-06-19T17:07:45Z

Follow-up pushed in 93caf0e: addressed Cursor Bugbot’s cancellation edge.

What changed:

_search now releases the embed single-flight gate if the search coroutine is cancelled before the executor worker starts, then re-raises asyncio.CancelledError.
Generic await failures also call the same pre-start release guard.
Added test_brain_search_releases_embed_lock_when_cancelled_before_executor_starts, which failed before the fix and passes now.

Verification after this commit:

pytest tests/test_search_handler.py -k "cancelled_before_executor_starts or executor_timeout_cancels_queued_future or skips_parallel_embedding_after_timeout or falls_back_to_fts_when_embedding_times_out" -q passed: 4 passed.
Focused suite passed: 162 passed, 1 warning.
ruff check ... passed.
ruff format --check ... passed.
python3 -m py_compile ... passed.
BRAINLAYER_PREPUSH=1 scripts/run_tests.sh passed: 3025 passed, 9 skipped, 61 deselected, 1 xfailed, plus MCP registration, isolated eval/hook routing, bun, and FTS determinism.

@codex review
@cursor @BugBot review
@coderabbitai review

greptile-apps

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

coderabbitai · 2026-06-19T17:08:31Z

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the changes in commit 93caf0e
gh pr view 524 --repo EtanHey/brainlayer --json headRefName,headRefOid

echo "=== Recent commits ==="
git log --oneline -5

echo "=== search_handler.py - cancellation handling ==="
grep -n "CancelledError\|release_if_embed_never_started\|BaseException\|finally\|cancel" src/brainlayer/mcp/search_handler.py | head -40

echo "=== New cancellation test ==="
grep -n "cancelled_before_executor_starts\|CancelledError\|cancel" tests/test_search_handler.py | head -30

Length of output: 1755

🏁 Script executed:

#!/bin/bash
echo "=== Cancellation handling block in search_handler.py ==="
sed -n '1810,1870p' src/brainlayer/mcp/search_handler.py

echo "=== New cancellation test implementation ==="
sed -n '335,395p' tests/test_search_handler.py

Length of output: 5405

@EtanHey: Verified against 93caf0e — the cancellation edge is cleanly handled.

release_if_embed_never_started() — uses embed_state_lock to safely check whether the worker has already started or the lock already been released before deciding to call _EMBED_IN_FLIGHT.release(). This prevents a race where the worker starts just as the pre-start guard fires. ✅
except asyncio.CancelledError — calls the guard and re-raises, so cancellation propagates correctly without swallowing it. ✅
embed_query_guarded() — checks embed_state["released"] before starting, closing the window where a concurrent release and a late worker-start could both call release(). ✅
New test — patches run_in_executor to return a never-completing future, cancels the task after confirming the lock is held, asserts CancelledError propagates, then confirms the subsequent search runs in hybrid mode with the lock clear. Solid validation. ✅

All four issues from the initial review are fully resolved. The implementation looks solid — nice work iterating on this.

[approve]

chatgpt-codex-connector · 2026-06-19T17:11:32Z

Codex Review: Didn't find any major issues. Swish!

Reviewed commit: 93caf0ea60

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

EtanHey added 2 commits June 19, 2026 19:17

fix: fall back to FTS when search embeddings stall

2ae7109

fix: cap concurrent search embedding fallbacks

07abd6d

greptile-apps Bot reviewed Jun 19, 2026

View reviewed changes

Merge remote-tracking branch 'origin/main' into fix/brain-search-fts-…

275aa88

…fallback-pr2 # Conflicts: # hooks/brainlayer-prompt-search.py # src/brainlayer/mcp/search_handler.py # tests/test_adaptive_injection.py # tests/test_search_handler.py

chatgpt-codex-connector Bot reviewed Jun 19, 2026

View reviewed changes

greptile-apps Bot reviewed Jun 19, 2026

View reviewed changes

coderabbitai Bot reviewed Jun 19, 2026

View reviewed changes

macroscopeapp Bot reviewed Jun 19, 2026

View reviewed changes

Comment thread src/brainlayer/mcp/search_handler.py Outdated

chatgpt-codex-connector Bot reviewed Jun 19, 2026

View reviewed changes

fix: release embed gate for queued timeouts

93a3fec

greptile-apps Bot reviewed Jun 19, 2026

View reviewed changes

cursor Bot reviewed Jun 19, 2026

View reviewed changes

Comment thread src/brainlayer/mcp/search_handler.py

fix: release embed gate on search cancellation

93caf0e

chatgpt-codex-connector Bot reviewed Jun 19, 2026

View reviewed changes

greptile-apps Bot reviewed Jun 19, 2026

View reviewed changes

Conversation

EtanHey commented Jun 19, 2026 • edited by macroscopeapp Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Tests

Fall back to FTS when embedding is busy or times out in search handler

Summary by CodeRabbit

Release Notes

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

EtanHey commented Jun 19, 2026

Uh oh!

coderabbitai Bot commented Jun 19, 2026

Review: fix: fall back to FTS when search embeddings stall

🔴 Bug — except TimeoutError misses asyncio.TimeoutError on Python < 3.11

🟡 Lock not released on CancelledError

🟡 results.get_nowait() can raise queue.Empty if thread raised before putting

🟡 Time-sensitive tests risk flakiness under CI load

✅ What looks good

Summary

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot left a comment

Choose a reason for hiding this comment

Uh oh!

EtanHey commented Jun 19, 2026

Uh oh!

coderabbitai Bot commented Jun 19, 2026

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

EtanHey Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

EtanHey commented Jun 19, 2026

Uh oh!

EtanHey commented Jun 19, 2026 •

edited by macroscopeapp Bot

Loading

coderabbitai Bot commented Jun 19, 2026 •

edited

Loading

Review: `fix: fall back to FTS when search embeddings stall`

🔴 Bug — `except TimeoutError` misses `asyncio.TimeoutError` on Python < 3.11

🟡 Lock not released on `CancelledError`

🟡 `results.get_nowait()` can raise `queue.Empty` if thread raised before putting