DEV-1462: LiveSQLBench-Base-Lite-SQLite one-shot harness by ZmeiGorynych · Pull Request #5 · MotleyAI/bird-agents

ZmeiGorynych · 2026-05-27T14:07:08Z

Add a new --mode one-shot evaluation path that runs the
LiveSQLBench-Base-Lite-SQLite dataset (the unambiguous, one-shot
sibling of mini-interact) against BOTH SLayer agents —
pydantic_ai_recursive (KB→memories) and pydantic_ai_otf_encode
(KB→models). Scored on the 180 SELECT (category=="Query") tasks;
no user simulator, no ask_user anywhere in the spawn tree.

The full final plan (incl. all Codex findings folded) is in
DEV-1462's "Final approved plan" section.

Highlights

New --dataset {mini-interact, livesqlbench} and --gold-file flags.
--mode one-shot is gated to --dataset livesqlbench; --dataset livesqlbench
is gated to --mode {one-shot, oracle}; one-shot requires
--slayer-setup on-the-fly. Guards land both at the CLI
(parser.error) and the programmatic-entry points
(run_evaluation/make_runner/run_one_task).
New scripts/prepare_livesqlbench.py materialises per-DB <db>.sqlite
from <db>_template.sqlite (refuses git-LFS pointers loudly). New
harness.materialize_task_db gives each LiveSQLBench task an isolated
per-instance db_file_path (symlinked template) so concurrent
eval-resets never race the OTF cache build. Stale-symlink rebuild +
defensive _ephemeral_/_process_ refusal.
New harness.load_livesqlbench_tasks: merges the gated gold sidecar
by instance_id, maps query→amb_user_query, stamps
task["dataset"]="livesqlbench", filters category=="Query", asserts
exactly 180 on a full run, and is filter_ids-aware so partial-gold
targeted runs don't trip the empty-sol_sql fail-fast.
One-shot factories + prompts in BOTH SLayer adapters:
_build_sub_explorer, _build_projection_resolver_oneshot,
_build_query_constructor_oneshot (+ otf_encode
_build_kb_encoder_oneshot). _register_spawn_subagent +
_register_kb_to_slayer parametrized with eval_mode to dispatch
the right ask_user-free child. agent.run_task accepts
eval_mode in {a-interact, one-shot} with reserve=submit_query
only and a one-shot resolver recovery prompt.
Per-benchmark artifact separation. paths.slayer_otf_cache_root
and slayer_models_otf_root gain a benchmark: str | None = None
kwarg; default returns the legacy mini-interact path (no caller
breakage), benchmark="livesqlbench" returns parallel roots
(slayer_otf_cache_livesqlbench/, slayer_models_otf_livesqlbench/)
with parallel env-var overrides. _maybe_force_wipe_otf is
benchmark-aware so --otf-rebuild never wipes the wrong root.
Codex Cloud SLayer mode end-to-end (mirror local) + consolidate the OTF artifact lifecycle (DEV-1468) #2 fix: slayer_otf/reference_build.py::_effective_db_root
accepts an authoritative db_root kwarg that overrides $BIRD_DB_PATH
(threaded all the way from ensure_db_reference). The otf_encode
adapter passes --db-path as the explicit db_root for LiveSQLBench
runs, so the setup-encoder resolves the right sqlite even when
conftest/CI sets $BIRD_DB_PATH to mini-interact.
README + CLAUDE.md updated with the LiveSQLBench setup +
per-benchmark roots pattern. A test_cloud_paths_unchanged.py
regression test pins that the DEV-1470 cloud upload-back / post-run
merge keep using the legacy (no-kwarg) roots — cloud-side
LiveSQLBench is explicitly out of scope and filed as a follow-up.

Tests

165 new tests + extensions to existing ones; full non-integration
suite: 1232 passed, 94 skipped, 19 deselected. TDD: tests landed
first, Codex-reviewed against the plan (all 8 findings folded; the
test_otf_encode_reference_root was restructured to be unconditional;
the spawn-builder closure was moved back to call-time so the existing
nested-spawn trajectory test still pins parent_idx correctness).

Manual smoke (after merge)

git lfs pull   # in ../livesqlbench-base-lite-sqlite/
uv run python scripts/prepare_livesqlbench.py
# Oracle validation (needs gated gold):
uv run bird-interact --dataset livesqlbench --mode oracle \
  --gold-file <gated.jsonl> --data ... --db-path ... --limit 5
# One-shot, both flavors:
uv run bird-interact --dataset livesqlbench --mode one-shot --query-mode slayer \
  --framework pydantic_ai_recursive --slayer-setup on-the-fly \
  --gold-file <gated.jsonl> --instance-id <id1>,<id2> ...

Out of scope (explicit follow-ups)

90 Management tasks (need phase-1 preprocess + custom-test_cases eval).
Raw one-shot; one-shot for non-SLayer frameworks; Base-Full / Large.
Cloud-side LiveSQLBench — see B8 in the linked spec.

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- One-shot execution mode (--mode one-shot) with fixed 30.0 budget, supports two slayer frameworks and enforces on-the-fly setup.
- LiveSQLBench dataset support: per-task SQLite isolation, gated gold sidecar (--gold-file), and a dataset preparation CLI.
Documentation
- README and docs: LiveSQLBench setup, one-shot usage, dataset/mode gating, and benchmark-scoped OTF roots.
Improvements
- Benchmark-scoped on-the-fly artifact roots and safer --otf-rebuild that only purges the targeted benchmark.
Tests
- Extensive new tests covering one-shot behavior, LiveSQLBench loader/prep, path scoping, DB isolation, and rebuild wiring.

Add a new `--mode one-shot` evaluation path that runs the LiveSQLBench-Base-Lite-SQLite dataset (the unambiguous, one-shot sibling of mini-interact) against BOTH SLayer agents — `pydantic_ai_recursive` (KB→memories) and `pydantic_ai_otf_encode` (KB→models). Scored on the 180 SELECT (`category=="Query"`) tasks; no user simulator, no `ask_user` anywhere in the spawn tree. Highlights (full plan in DEV-1462's "Final approved plan" section): * New `--dataset {mini-interact, livesqlbench}` and `--gold-file` flags. `--mode one-shot` is gated to `--dataset livesqlbench`; `--dataset livesqlbench` is gated to `--mode {one-shot, oracle}`; one-shot requires `--slayer-setup on-the-fly`. Guards land both at the CLI (parser.error) and at the programmatic-entry points (run_evaluation/make_runner/run_one_task — Codex #1). * New `scripts/prepare_livesqlbench.py` materialises per-DB `<db>.sqlite` from `<db>_template.sqlite` (refuses git-LFS pointers loudly). New `harness.materialize_task_db` gives each LiveSQLBench task an isolated per-instance `db_file_path` (symlinked template) so concurrent eval-resets never race the OTF cache build. Stale-symlink rebuild + defensive `_ephemeral_`/`_process_` refusal (Codex #4/#5). * New `harness.load_livesqlbench_tasks`: merges the gated gold sidecar by `instance_id`, maps `query`→`amb_user_query`, stamps a `task["dataset"]="livesqlbench"` marker (the irreducible source of truth for DB isolation + the one-shot run_task programmatic guard), filters `category=="Query"` (logs `_M_` disagreements), asserts exactly 180 SELECT tasks on a full run, and is `filter_ids`-aware so partial-gold targeted runs don't trip the empty-`sol_sql` fail-fast (Codex #6). * One-shot factories + prompts in BOTH SLayer adapters: `_build_sub_explorer`, `_build_projection_resolver_oneshot`, `_build_query_constructor_oneshot` (+ otf_encode `_build_kb_encoder_oneshot`). `_register_spawn_subagent` + `_register_kb_to_slayer` parametrized with `eval_mode` to dispatch the right ask_user-free child. New one-shot prompt variants with NO ask_user / user-sim language. `agent.run_task` accepts `eval_mode in {a-interact, one-shot}` with reserve=submit_query only and a one-shot resolver recovery prompt. * Per-benchmark artifact separation (user principle: artifacts kept SEPARATE by benchmark). `paths.slayer_otf_cache_root` and `slayer_models_otf_root` gain a `benchmark: str | None = None` kwarg; default returns the legacy mini-interact path (no caller breakage), `benchmark="livesqlbench"` returns parallel roots (`slayer_otf_cache_livesqlbench/`, `slayer_models_otf_livesqlbench/`) with parallel env-var overrides. `_maybe_force_wipe_otf` is benchmark-aware so `--otf-rebuild` never wipes the wrong root. * Codex #2 fix: `slayer_otf/reference_build.py::_effective_db_root` accepts an authoritative `db_root` kwarg that overrides `$BIRD_DB_PATH` (threaded all the way from `ensure_db_reference` through `_resolve_datasource_for_build` + `_portabilise_datasource`). The otf_encode adapter passes `--db-path` as the explicit `db_root` for LiveSQLBench runs, so the setup-encoder build resolves the right sqlite even when conftest/CI sets `$BIRD_DB_PATH` to mini-interact. * README + CLAUDE.md updated with the LiveSQLBench setup + per-benchmark roots pattern. A `test_cloud_paths_unchanged.py` regression test pins that the dev-1470 cloud upload-back / post-run merge keep using the LEGACY (no-kwarg) roots — cloud-side LiveSQLBench is explicitly out of scope and filed as a follow-up. Full non-integration suite: 1232 passed, 94 skipped, 19 deselected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

linear · 2026-05-27T14:07:14Z

DEV-1462

coderabbitai · 2026-05-27T14:07:42Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9573c42e-4d57-48a9-bb4d-b906315281b2

📥 Commits

Reviewing files that changed from the base of the PR and between c9bf9e5 and 201ea99.

📒 Files selected for processing (5)

src/bird_interact_agents/agents/pydantic_ai_otf_encode/agent.py
src/bird_interact_agents/hard8_preprocessor.py
src/bird_interact_agents/slayer_pipeline/portable_connection.py
tests/test_one_shot_run.py
tests/test_otf_encode_reference_root.py

🚧 Files skipped from review as they are similar to previous changes (1)

src/bird_interact_agents/agents/pydantic_ai_otf_encode/agent.py

📝 Walkthrough

Walkthrough

Adds LiveSQLBench one-shot execution: dataset preparation, gold-sidecar merging, per-task SQLite materialization, benchmark-scoped OTF cache/model roots, one-shot agent factories/prompts and budgeting for recursive and otf-encode frameworks, db_root threading through reference-build, CLI/programmatic guards, and broad tests and fixtures.

Changes

LiveSQLBench One-Shot Mode

Layer / File(s)	Summary
Docs: README & CLAUDE `README.md`, `CLAUDE.md`	Adds LiveSQLBench one-shot usage documentation and per-benchmark OTF artifact root guidance.
Prepare script & CLI `scripts/prepare_livesqlbench.py`, `tests/test_prepare_livesqlbench.py`	New prepare utility that detects git-LFS pointer templates, atomically materializes `<db>.sqlite` from `<db>_template.sqlite`, supports idempotent/skippable runs and `--force`, prints per-DB summary, and exposes programmatic `prepare(...)` and `main(...)` CLI.
Path helpers & benchmark scoping `src/bird_interact_agents/paths.py`, `tests/test_paths.py`, `tests/test_cloud_paths_unchanged.py`	Adds `livesqlbench_root()` and `livesqlbench_data_file()` with env overrides; introduces `_KNOWN_BENCHMARKS` and `_validate_benchmark`; updates `slayer_otf_cache_root()` and `slayer_models_otf_root()` to accept `benchmark: str
Reference build db_root propagation `src/bird_interact_agents/slayer_otf/reference_build.py`, `tests/test_otf_encode_reference_root.py`	Adds `db_root` kwarg to `ensure_db_reference()` and threads it through `_build_reference`, `_resolve_datasource_for_build`, and `_portabilise_datasource`; introduces `_effective_db_root()` with precedence explicit `db_root` > `BIRD_DB_PATH` > mini-interact default.
Loader & per-task DB materialization `src/bird_interact_agents/harness.py`, `tests/test_livesqlbench_loader.py`, `tests/test_db_isolation.py`	Adds `load_livesqlbench_tasks()` that merges gold sidecar fields, filters Query tasks, enforces the 180 SELECT-row contract, and `materialize_task_db()` which symlinks per-instance working DBs from templates and is idempotent/stale-safe.
Run CLI & programmatic guards `src/bird_interact_agents/run.py`, `tests/test_one_shot_mode.py`	Extends CLI with `--dataset` and `--gold-file`, includes `one-shot` in `--mode`; enforces dataset↔mode and one-shot framework/slayer-setup/query-mode constraints; scopes `_maybe_force_wipe_otf` purges by benchmark; materializes per-task DBs for oracle runs; hardens `run_evaluation` args and validation.
OTF-encode agent one-shot support `src/bird_interact_agents/agents/pydantic_ai_otf_encode/*`, `tests/test_one_shot_otf_encode_factories.py`, `tests/test_one_shot_run.py`	Adds one-shot eval_mode handling, oneshot KB-encoder builder and prompts, benchmark-scoped resolution for cache and model roots, materialize_task_db pre-run, and oneshot-specific projection/constructor flow and recovery prompt.
Recursive agent one-shot support `src/bird_interact_agents/agents/pydantic_ai_recursive/*`, `tests/test_one_shot_recursive_factories.py`, `tests/test_one_shot_run.py`	Adds `eval_mode` handling, one-shot budgeting (submit-only reserve), benchmark forwarding into `_resolve_otf_task_storage_dir`, materialize_task_db usage, oneshot factory variants and non-interactive prompts.
Factory & prompt additions `src/bird_interact_agents/agents//factories.py`, `src/bird_interact_agents/agents//prompts.py`	Introduces SUB_EXPLORER/ROOT_EXPLORER and multiple ONESHOT prompt constants, spawn_subagent eval_mode propagation, per-KB oneshot encoder dispatch, and oneshot submit wrappers removing ask_user tooling.
Test fixtures & helpers `tests/_livesqlbench_fixtures.py`	Provides deterministic tiny SQLite creator, LFS pointer writer, dataset/gold builders, and public_task helper for tests.
Validation & unit tests `tests/*` (many files)	Adds broad test coverage: CLI/programmatic validation, loader semantics, prepare script behavior, path/env override tests, reference-build db_root propagation tests, otf-rebuild per-benchmark scoping, cloud-side AST contract to avoid passing benchmark in cloud modules, factory/tool/prompt/docstring checks, orchestration/integration tests, and DB isolation/concurrency tests.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

MotleyAI/bird-agents#2: Related OTF artifact lifecycle and _maybe_force_wipe_otf purging/OTF-cache handling.
MotleyAI/bird-agents#3: Overlaps in slayer_otf reference-build area (locking/reuse behavior and build pipeline).

"From a rabbit's log, I hop and write,
One-shot paths split left and right,
Templates turned stable with care,
Benchmarked roots now live elsewhere,
A tiny hop — tests pass by night!" 🐇✨

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch egor/dev-1462-tweak-the-benchmark-harness-to-run-with-livesqlbench-lite

coderabbitai

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/bird_interact_agents/agents/pydantic_ai_otf_encode/prompts.py`:
- Around line 436-469: The new KB_ENCODER_ONESHOT_PROMPT removed the
self-tagging/output-contract that sets meta.kb_id={kb_id}, which breaks
downstream dependency discovery used by _live_tagged_entities() and
_format_deps_block(); restore the tagging and output-contract sections from
KB_ENCODER_PROMPT into KB_ENCODER_ONESHOT_PROMPT (keeping only removal of
ask_user-specific instructions) so that encoders still emit meta.kb_id={kb_id}
and any required meta/self-annotation fields and call patterns (e.g.,
submit_encoding(result_json=...)) exactly once, ensuring already-encoded
dependencies are discoverable by _live_tagged_entities() and
_format_deps_block().

In `@src/bird_interact_agents/agents/pydantic_ai_recursive/agent.py`:
- Around line 282-298: run_task currently allows eval_mode="one-shot" even when
self.slayer_setup=="pre-encoded", letting callers bypass the dataset guard; add
an explicit check alongside the existing dataset check to raise ValueError if
is_one_shot and self.slayer_setup != "on-the-fly". Locate the branch around
eval_mode/is_one_shot in run_task (and reference self.slayer_setup) and enforce
that one-shot is only permitted when self.slayer_setup equals the on-the-fly
mode, with a clear error message mentioning both the required slayer_setup and
dataset constraints.

In `@tests/_livesqlbench_fixtures.py`:
- Around line 189-193: The public_task fixture currently uses truthy 'or'
defaults which overrides explicit empty values; change the assignments to use
explicit None checks so empty string "" or empty dict {} are preserved: for the
"query" key use query if query is not None else f"unambiguous request for
{instance_id}", and for the "conditions" key use conditions if conditions is not
None else {"decimal": [], "distinct": False, "order": False}; update the code in
the public_task function where "query" and "conditions" are set accordingly.

In `@tests/test_one_shot_mode.py`:
- Around line 351-352: Split the semicolon-joined statements into separate lines
to satisfy Ruff E702: replace the combined statements that create and write to
"data" and create "db_path" (variables data, db_path, tmp_path in
tests/test_one_shot_mode.py) with two distinct statements — one that assigns
data = tmp_path / "data.jsonl" and calls data.write_text(""), and another that
assigns db_path = tmp_path / "ds" and calls db_path.mkdir() — and do the same
for the other occurrences around the same file (the locations referenced near
the original combined lines).

In `@tests/test_one_shot_run.py`:
- Around line 728-799: The tests install the _spy_materialize spy but never
assert it was invoked, so add an assertion after each instantiation/run in both
OTF one-shot parity tests (the test function shown and the other one around
lines 802-870) to verify DB materialization occurred: after calling
PydanticAIOtfEncodeAgent.run_task, assert that the spy created by
_spy_materialize recorded a call to materialize_task_db (e.g., check spy.called
or spy.call_count > 0 or spy.assert_called_once() depending on the spy API).
Make this change in both test functions referencing _spy_materialize and
materialize_task_db so a regression in one-shot DB materialization is caught.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 476372c4-3570-4134-9d96-b93cf793a62c

📥 Commits

Reviewing files that changed from the base of the PR and between 33f30ac and 7233a09.

📒 Files selected for processing (27)

CLAUDE.md
README.md
scripts/prepare_livesqlbench.py
src/bird_interact_agents/agents/pydantic_ai_otf_encode/agent.py
src/bird_interact_agents/agents/pydantic_ai_otf_encode/factories.py
src/bird_interact_agents/agents/pydantic_ai_otf_encode/prompts.py
src/bird_interact_agents/agents/pydantic_ai_recursive/agent.py
src/bird_interact_agents/agents/pydantic_ai_recursive/factories.py
src/bird_interact_agents/agents/pydantic_ai_recursive/prompts.py
src/bird_interact_agents/harness.py
src/bird_interact_agents/paths.py
src/bird_interact_agents/run.py
src/bird_interact_agents/slayer_otf/reference_build.py
tests/_livesqlbench_fixtures.py
tests/test_cloud_paths_unchanged.py
tests/test_db_isolation.py
tests/test_livesqlbench_loader.py
tests/test_one_shot_mode.py
tests/test_one_shot_otf_encode_factories.py
tests/test_one_shot_recursive_factories.py
tests/test_one_shot_run.py
tests/test_otf_encode_reference_root.py
tests/test_otf_rebuild_per_benchmark.py
tests/test_otf_rebuild_wiring.py
tests/test_paths.py
tests/test_prepare_livesqlbench.py
tests/test_pydantic_ai_otf_encode_agent.py

* run.py: wire `materialize_task_db` into `run_oracle_task` so LiveSQLBench oracle runs at `--concurrency > 1` don't race the shared `<db>.sqlite` the OTF cache reads (B0's docstring said it was called from both run paths but the oracle wiring was missing — Codex). * pydantic_ai_otf_encode/prompts.py: restore the KB SELF-ANNOTATION contract in `KB_ENCODER_ONESHOT_PROMPT`. Every encoded entity must carry `label`, the canonical `description` block, and `meta.kb_id={kb_id}` — downstream `_live_tagged_entities()` + `_format_deps_block()` discover already-encoded deps via those tags, so a successfully-encoded KB without the meta.kb_id tag silently looks "not encoded" to dependents and breaks transitive `kb_to_slayer` resolution (CodeRabbit). * pydantic_ai_recursive/agent.py: belt-and-suspenders check that one-shot ⟹ `slayer_setup == "on-the-fly"`. The check already lives at the CLI / run_evaluation / make_runner / run_one_task boundaries; this in-run_task assertion closes the bypass for a programmatic caller that constructs `PydanticAIRecursiveAgent(slayer_setup="pre-encoded")` and invokes `.run_task(eval_mode="one-shot", ...)` directly (CodeRabbit). + regression test. Full non-integration suite: 1233 passed (was 1232). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* harness.load_livesqlbench_tasks: the full-run "exactly 180" check is now `if … : raise ValueError(…)`, not `assert`. Production guards must survive `python -O` / `PYTHONOPTIMIZE` — the latter strips assertions, so a truncated dataset would have silently run in exactly the scenario this guard exists for. * README: the LiveSQLBench setup steps used `cd livesqlbench-base- lite-sqlite` then ran `uv run python scripts/prepare_livesqlbench.py`, but `scripts/` lives in **bird-agents**, not the dataset dir. Rewrite step 2 as a subshell (`(cd … && git lfs pull)`) and have step 3 explicitly `cd bird-agents` before running the script. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* tests/_livesqlbench_fixtures.py: `public_task` uses `is not None` (not `or`) for `query` and `conditions` defaults, so a caller can build a fixture with an explicit empty `""` / `{}` for edge-case tests instead of having it silently replaced by the default. * tests/test_one_shot_mode.py: split semicolon-separated statements (`data = ...; data.write_text("")`) into two lines each — Ruff E702 flags the one-liner form and would break CI once Ruff is wired in. * tests/test_one_shot_run.py: both OTF one-shot parity tests (reserve + empty-resolver-skip) now capture the `materialize_calls` spy return and assert it was invoked. The recursive parity tests already do this; without it a regression in OTF one-shot DB materialization would slip through these two paths. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous round threaded `db_root` into `ensure_db_reference` → `_effective_db_root` so the otf_encode **reference build** ignores `$BIRD_DB_PATH` when an authoritative `--db-path` is in play. But the **per-task variant copy** (which is what the SLayer MCP server reads at runtime) is materialised by `build_task_variant_storage`, and that function delegates to `resolve_committed_connection_string` which still preferred `$BIRD_DB_PATH` over the supplied root. Net effect: a one-shot LiveSQLBench run would correctly build/reuse the LiveSQLBench-scoped reference, then SILENTLY query the mini-interact sqlite at task time (because conftest / CI / day-to-day shells often set `BIRD_DB_PATH` to mini-interact). Mirror the previous fix one more layer: * `slayer_pipeline/portable_connection.py::resolve_committed_connection_string` gains an explicit `db_root: Path | None = None` kwarg that overrides `$BIRD_DB_PATH` when set. Back-compat preserved (no kwarg → legacy env-wins semantics). * `hard8_preprocessor.build_task_variant_storage` gains the same `db_root` kwarg and forwards it into `resolve_committed_connection_string`. * otf_encode's `_resolve_otf_task_storage_dir` now passes `db_root=Path(data_path_base).resolve()` (the same value already passed to `ensure_db_reference`), so build-time + runtime re-anchoring see the SAME root. Recursive flavor unaffected — `runtime._rewrite_datasource_connection_string` force-overwrites the connection string from `data_path_base` already, no env consultation. + 3 regression tests in test_otf_encode_reference_root.py: - resolve_committed_connection_string honours db_root over env, - back-compat: no db_root → env wins, - build_task_variant_storage forwards db_root through the resolver (spies the source-module resolver, not the importing-module binding, because the import lives inside the function body). Full non-integration suite: 1236 passed (was 1233; +3 new tests). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai Bot reviewed May 27, 2026

View reviewed changes

ZmeiGorynych and others added 4 commits May 27, 2026 17:47

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DEV-1462: LiveSQLBench-Base-Lite-SQLite one-shot harness#5

DEV-1462: LiveSQLBench-Base-Lite-SQLite one-shot harness#5
ZmeiGorynych wants to merge 5 commits into
mainfrom
egor/dev-1462-tweak-the-benchmark-harness-to-run-with-livesqlbench-lite

ZmeiGorynych commented May 27, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

linear Bot commented May 27, 2026

Uh oh!

coderabbitai Bot commented May 27, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ZmeiGorynych commented May 27, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Highlights

Tests

Manual smoke (after merge)

Out of scope (explicit follow-ups)

Summary by CodeRabbit

Uh oh!

linear Bot commented May 27, 2026

Uh oh!

coderabbitai Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ZmeiGorynych commented May 27, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 27, 2026 •

edited

Loading