feat: Group 1 Critical Refactoring and Docker Setup Optimization#15
feat: Group 1 Critical Refactoring and Docker Setup Optimization#15stavp-star wants to merge 6 commits into
Conversation
📝 WalkthroughWalkthroughThis PR adds table-hallucination detection to ChangesRuntime Contracts and Shared Utilities
Langfuse Observability Centralization
Schema Hallucination Detection and Graph Routing
Refiner and Finalizer Node Updates
Agent Entry Points and Extraction
Backend Infrastructure: Validation and Health Checks
Deployment Infrastructure
Development Tooling
Sequence Diagram(s)sequenceDiagram
participant schema_explorer_node
participant Redis
participant Trino as Trino<br/>(information_schema)
participant LLM
participant route_schema_explorer
schema_explorer_node->>schema_explorer_node: asyncio.gather concurrent profile fetches<br/>(semaphore-limited by PROFILE_FETCH_CONCURRENCY)
schema_explorer_node->>LLM: invoke with aggregated profiles
LLM-->>schema_explorer_node: structured plan with tables_used
loop for each table in tables_used
schema_explorer_node->>Redis: GET table_exists:{table}
alt cache miss
schema_explorer_node->>Trino: query information_schema.tables (asyncio.to_thread)
Trino-->>schema_explorer_node: row found or empty result
schema_explorer_node->>Redis: SET table_exists key
end
end
alt hallucinated_tables found
schema_explorer_node-->>route_schema_explorer: return hallucinated_tables + feedback + last_error
route_schema_explorer-->>schema_explorer_node: loop back (if retry count allows)
else no hallucinations
schema_explorer_node-->>route_schema_explorer: return schema_plan + hallucinated_tables=None
route_schema_explorer-->>route_schema_explorer: proceed to query_builder
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
agent/src/agent/nodes/refiner.py (1)
58-58:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winClear
last_erroron successful refinement execution.After a successful run, stale
last_errormay remain in state and be misinterpreted by downstream consumers/telemetry.Suggested fix
- return {"trino_error": None, "raw_data_ref": raw_ref} + return {"trino_error": None, "last_error": None, "raw_data_ref": raw_ref}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@agent/src/agent/nodes/refiner.py` at line 58, The return statement in the successful refinement execution path clears the `trino_error` by setting it to None, but does not clear the `last_error` field. This can cause stale errors from previous failed attempts to persist in the state and mislead downstream consumers. Add `last_error: None` to the return dictionary alongside `trino_error: None` and `raw_data_ref: raw_ref` to ensure all error state is properly reset after successful execution.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@agent/Dockerfile`:
- Line 4: The apt-get install commands in both Dockerfiles increase image size
by installing unnecessary recommended packages and retaining package metadata.
In agent/Dockerfile at line 4, modify the RUN command that installs git to
include the --no-install-recommends flag and append a cleanup command to remove
/var/lib/apt/lists/* in the same layer to ensure the metadata is not cached.
Apply the identical pattern to backend/Dockerfile at line 5 for the apt-get
install command there to maintain consistency and achieve the same image size
reduction benefits across both images.
In `@agent/src/agent/config.py`:
- Line 16: The PROFILE_FETCH_CONCURRENCY field allows zero or negative values
which will cause asyncio.Semaphore to block indefinitely. Add validation to the
PROFILE_FETCH_CONCURRENCY field definition to enforce that it must be a positive
integer (greater than 0), either through a Pydantic validator, field constraint,
or similar mechanism depending on the configuration framework being used.
In `@agent/src/agent/graph.py`:
- Around line 79-85: The route_schema_explorer function creates an unbounded
loop by repeatedly routing back to "schema_explorer" when hallucinated_tables
persist. Add a retry budget mechanism to prevent infinite looping: introduce a
counter field in the AgentState (such as schema_explorer_retry_count) to track
how many times the schema explorer has looped, increment this counter each time
route_schema_explorer routes back to itself, and modify the
route_schema_explorer function to check if the retry count has exceeded a
defined maximum threshold (e.g., 3 attempts). When the threshold is exceeded,
return "query_builder" regardless of whether hallucinated_tables exist, ensuring
the workflow eventually progresses and respects runtime/request budgets.
In `@agent/src/agent/nodes/schema_explorer.py`:
- Around line 372-380: When the hallucination check passes (in the else block),
you are only resetting hallucinated_tables to None but leaving feedback and
last_error with stale values from a previous iteration. This causes downstream
routing to misroute based on old feedback. In the else block where
hallucinated_tables is set to None, also explicitly set feedback and last_error
to None so that stale error state does not persist and affect downstream routing
decisions.
- Line 355: The SQL query construction on line 355 of schema_explorer.py
directly interpolates the `cat`, `sch`, and `tbl` variables into the SQL string
using f-string formatting, creating a SQL injection vulnerability since these
values come from untrusted model output in `tables_used`. Replace the direct
string interpolation with parameterized query placeholders and pass the catalog,
schema, and table name values as separate bound parameters to the database query
execution method, ensuring they are treated as data values rather than
executable SQL code.
- Around line 363-371: The table verification code currently fails open when
validation infrastructure fails—it logs exceptions and continues without marking
problematic tables as hallucinated. In the exception handler for information
schema checks (the inner try-except catching Exception as e), append t_name to
the hallucinated list instead of just printing the error. In the outer exception
handler for the Redis/Trino verification (catching Exception as e), ensure
tables are marked as hallucinated or otherwise fail in a closed manner so that
invalid tables do not pass through when verification infrastructure is degraded.
Replace print() calls with appropriate logging to maintain production-grade
error reporting.
In `@backend/app/main.py`:
- Around line 43-44: Remove the redundant import of settings at line 43 since
settings is already imported at line 12 at the top of the file. Additionally,
replace the getattr call with direct attribute access to settings.OPENAI_API_KEY
since OPENAI_API_KEY is now a defined field in the Settings class, simplifying
the conditional check from getattr(settings, "OPENAI_API_KEY", None) to simply
settings.OPENAI_API_KEY.
- Around line 44-46: There is a logical contradiction between the startup guard
in main.py that unconditionally raises RuntimeError when OPENAI_API_KEY is
missing and the fallback logic in llm_judge.py that gracefully returns 0.0
scores when the key is absent. Decide whether OPENAI_API_KEY is truly required
for the application to function: if it is required, remove the fallback logic in
llm_judge.py (the section that returns 0.0 scores when the key is missing); if
graceful degradation is intended, remove or conditionally apply the startup
guard in main.py so the server can start without the key and let the fallback
logic in llm_judge.py handle the absence gracefully.
In `@backend/app/routers/health.py`:
- Around line 182-188: The error branch in the get_judge_health() function is
unreachable dead code because the startup guard in main.py prevents the server
from starting if OPENAI_API_KEY is missing. Remove the conditional check for
api_key and the error branch (the if not api_key block), and simplify the
function to always return the success response. Also remove the unnecessary
getattr call and the import statement since OPENAI_API_KEY is guaranteed to
exist if the server is running.
In `@core/src/core/langfuse.py`:
- Around line 4-12: In core/src/core/langfuse.py at lines 4-12, wrap the
_langfuse_handler CallbackHandler initialization in a try-except block and set
_langfuse_handler to None or a no-op handler if initialization fails, then
modify the get_langfuse_handler function to return None or the no-op handler
when creation fails. In agent/src/agent/routers/chat.py at lines 50-55, add a
conditional check to only include langfuse_handler in the callbacks list when it
is available and not None; if it is None or unavailable, invoke the graph
without including it in the callbacks parameter.
In `@frontend/.env`:
- Line 1: The VITE_API_URL environment variable in the frontend configuration
has been incorrectly set to http://localhost:3000, which is the frontend's own
port, creating a circular proxy that routes API requests back to the frontend
instead of to the backend. Revert VITE_API_URL back to http://localhost:8000 to
properly target the backend service, or if the backend port has changed, update
it to the correct backend URL and ensure all related configuration files are
updated consistently.
---
Outside diff comments:
In `@agent/src/agent/nodes/refiner.py`:
- Line 58: The return statement in the successful refinement execution path
clears the `trino_error` by setting it to None, but does not clear the
`last_error` field. This can cause stale errors from previous failed attempts to
persist in the state and mislead downstream consumers. Add `last_error: None` to
the return dictionary alongside `trino_error: None` and `raw_data_ref: raw_ref`
to ensure all error state is properly reset after successful execution.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 84a6d3a8-12f2-45a3-9598-8ab51bf20d51
📒 Files selected for processing (16)
agent/Dockerfileagent/src/agent/config.pyagent/src/agent/graph.pyagent/src/agent/nodes/refiner.pyagent/src/agent/nodes/schema_explorer.pyagent/src/agent/routers/chat.pyagent/src/agent/state.pybackend/Dockerfilebackend/app/config.pybackend/app/main.pybackend/app/routers/health.pybackend/app/services/llm_judge.pycore/src/core/config.pycore/src/core/langfuse.pydocker-compose.ymlfrontend/.env
| @router.get("/judge/health") | ||
| def get_judge_health(): | ||
| from app.config import settings | ||
| api_key = getattr(settings, "OPENAI_API_KEY", None) | ||
| if not api_key: | ||
| return {"status": "error", "message": "OPENAI_API_KEY is missing"} | ||
| return {"status": "ok", "message": "LLM Judge is configured"} |
There was a problem hiding this comment.
Error branch is unreachable dead code.
The startup guard in main.py (lines 44-46) prevents the server from starting if OPENAI_API_KEY is missing, which means this endpoint will never execute the error branch at lines 186-187. The server won't be running to serve this response. Additionally, the getattr call is unnecessary since OPENAI_API_KEY is now a defined field in Settings.
♻️ Proposed simplification (remove unreachable branch)
`@router.get`("/judge/health")
def get_judge_health():
- from app.config import settings
- api_key = getattr(settings, "OPENAI_API_KEY", None)
- if not api_key:
- return {"status": "error", "message": "OPENAI_API_KEY is missing"}
return {"status": "ok", "message": "LLM Judge is configured"}Alternatively, if you want to keep the health check for consistency with other services, you can simplify it to always return OK since the startup guard ensures the key is present:
`@router.get`("/judge/health")
def get_judge_health():
- from app.config import settings
- api_key = getattr(settings, "OPENAI_API_KEY", None)
- if not api_key:
- return {"status": "error", "message": "OPENAI_API_KEY is missing"}
- return {"status": "ok", "message": "LLM Judge is configured"}
+ return {"status": "ok", "message": "LLM Judge is configured"}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| @router.get("/judge/health") | |
| def get_judge_health(): | |
| from app.config import settings | |
| api_key = getattr(settings, "OPENAI_API_KEY", None) | |
| if not api_key: | |
| return {"status": "error", "message": "OPENAI_API_KEY is missing"} | |
| return {"status": "ok", "message": "LLM Judge is configured"} | |
| `@router.get`("/judge/health") | |
| def get_judge_health(): | |
| return {"status": "ok", "message": "LLM Judge is configured"} |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/app/routers/health.py` around lines 182 - 188, The error branch in
the get_judge_health() function is unreachable dead code because the startup
guard in main.py prevents the server from starting if OPENAI_API_KEY is missing.
Remove the conditional check for api_key and the error branch (the if not
api_key block), and simplify the function to always return the success response.
Also remove the unnecessary getattr call and the import statement since
OPENAI_API_KEY is guaranteed to exist if the server is running.
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/src/core/trino.py (1)
51-75:⚠️ Potential issue | 🟠 Major | ⚡ Quick winSignature reordering breaks existing positional callers.
Changing
execute_query_syncto(sql, params, table_id)breaks existing positional call sites (for example,execute_query_sync(generated_sql, table_id)), which now passtable_idas query params. That can mis-execute queries or fail unexpectedly.Backward-compatible fix
-def execute_query_sync(sql: str, params: tuple | dict | list | None = None, table_id: str = "") -> TrinoExecutionResult: +def execute_query_sync(sql: str, table_id: str = "", params: tuple | dict | list | None = None) -> TrinoExecutionResult: @@ - if params: + if params is not None: cur.execute(sql, params) else: cur.execute(sql)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/src/core/trino.py` around lines 51 - 75, The function execute_query_sync currently has parameters ordered as (sql, params, table_id), but existing call sites pass arguments positionally as execute_query_sync(generated_sql, table_id), causing table_id to be interpreted as params instead. Reorder the function signature parameters to place table_id before params, changing the signature to (sql, table_id, params) so that existing positional callers continue to work correctly without requiring code changes at those call sites.
♻️ Duplicate comments (1)
agent/src/agent/nodes/schema_explorer.py (1)
358-360:⚠️ Potential issue | 🟠 Major | ⚡ Quick winCatalog identifier is still interpolated from untrusted LLM output.
The query at Line 358 still injects
catdirectly into SQL. Sincetables_usedcomes from model output, this remains an injection path in the validation step. Validate/whitelist identifier tokens (or quote validated identifiers) before composing the catalog-qualified table path.Suggested hardening
+import re +IDENT_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$") if len(parts) == 3: cat, sch, tbl = parts - sql = f"SELECT 1 FROM {cat}.information_schema.tables WHERE table_schema = ? AND table_name = ?" + if not IDENT_RE.fullmatch(cat): + hallucinated.append(t_name) + continue + sql = ( + f'SELECT 1 FROM "{cat}".information_schema.tables ' + "WHERE table_schema = ? AND table_name = ?" + )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@agent/src/agent/nodes/schema_explorer.py` around lines 358 - 360, The SQL query construction at line 358 in the schema_explorer.py file contains a SQL injection vulnerability where the catalog identifier `cat` is directly interpolated into the f-string without any validation or quoting. Since `cat` originates from untrusted LLM output (via `tables_used`), this creates an injection path in the validation step. Validate or whitelist the catalog identifier token before composing the catalog-qualified table path, or alternatively quote the validated identifier using appropriate SQL quoting mechanisms (such as backticks or bracket notation depending on the database system) to ensure special characters cannot break out of the identifier context. Apply this fix before the SQL query is passed to the `execute_query_sync` function call.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@agent/pyproject.toml`:
- Around line 31-35: The `uv sync` command in the Dockerfile is installing dev
dependencies into the production image because the `--no-dev` flag is not
specified. Modify the `uv sync` command to include the `--no-dev` flag to
exclude dev dependencies (pytest and ruff from the [dependency-groups].dev
section) from being installed in the production agent image, reducing image size
and attack surface.
In `@agent/src/agent/nodes/refiner.py`:
- Around line 81-90: The exception handling in the refiner function only
protects the client.save_data() call, but not the get_esca_client() acquisition
itself. If the async context manager fails to acquire the ESCA client, the
exception will propagate before reaching the try block, skipping the fallback
state (esca_write_failed flag and logging). Move the try-except block to wrap
the entire async with statement starting from get_esca_client(), so that both
client acquisition failures and save_data failures are caught and handled
consistently with the same fallback behavior.
In `@agent/src/agent/utils/sql.py`:
- Around line 11-14: The SQL query fence-stripping logic removes opening fences
before normalizing whitespace, which causes leading whitespace (like newlines)
to prevent proper fence removal and result in invalid SQL. To fix this in the
query cleaning section: first strip the input query to normalize whitespace,
then update the opening-fence regex pattern to match both `\`\`\`sql` and plain
`\`\`\`` variants in a single combined pattern (use an optional group for the
sql keyword), and update the closing-fence pattern to strip any whitespace
immediately before the closing `\`\`\`` as well.
In `@backend/app/routers/health.py`:
- Around line 194-200: The try-except block in the health check endpoint only
catches ImportError when importing agent.config, but importing can fail for
other reasons (validation errors, etc.) which would bypass the fallback and
break the endpoint. Additionally, defaulting to dummy values like "dummy" and
"http://localhost:8000" masks misconfiguration and returns misleading health
status. Broaden the exception handling to catch general exceptions (Exception or
a more specific base class) beyond just ImportError, and when ESCA settings are
absent or the import fails for any reason, return a config-error health response
indicating that ESCA is not properly configured, rather than injecting dummy
defaults and continuing as if the service is healthy.
In `@core/src/core/langfuse.py`:
- Around line 10-11: The exception handler that catches errors during Langfuse
handler initialization (in the except block at lines 10-11) silently sets
_langfuse_handler to None without logging the error details. Add logging within
this exception handler to capture and record the actual error before falling
back to _langfuse_handler = None. This will ensure initialization failures are
visible in production logs while maintaining the graceful fallback behavior.
---
Outside diff comments:
In `@core/src/core/trino.py`:
- Around line 51-75: The function execute_query_sync currently has parameters
ordered as (sql, params, table_id), but existing call sites pass arguments
positionally as execute_query_sync(generated_sql, table_id), causing table_id to
be interpreted as params instead. Reorder the function signature parameters to
place table_id before params, changing the signature to (sql, table_id, params)
so that existing positional callers continue to work correctly without requiring
code changes at those call sites.
---
Duplicate comments:
In `@agent/src/agent/nodes/schema_explorer.py`:
- Around line 358-360: The SQL query construction at line 358 in the
schema_explorer.py file contains a SQL injection vulnerability where the catalog
identifier `cat` is directly interpolated into the f-string without any
validation or quoting. Since `cat` originates from untrusted LLM output (via
`tables_used`), this creates an injection path in the validation step. Validate
or whitelist the catalog identifier token before composing the catalog-qualified
table path, or alternatively quote the validated identifier using appropriate
SQL quoting mechanisms (such as backticks or bracket notation depending on the
database system) to ensure special characters cannot break out of the identifier
context. Apply this fix before the SQL query is passed to the
`execute_query_sync` function call.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 76d9bfe0-cb19-48b7-9f46-1f77824453bc
📒 Files selected for processing (19)
agent/Dockerfileagent/pyproject.tomlagent/src/agent/config.pyagent/src/agent/graph.pyagent/src/agent/llm.pyagent/src/agent/nodes/finalizer.pyagent/src/agent/nodes/refiner.pyagent/src/agent/nodes/schema_explorer.pyagent/src/agent/routers/chat.pyagent/src/agent/state.pyagent/src/agent/utils/esca.pyagent/src/agent/utils/sql.pybackend/Dockerfilebackend/app/main.pybackend/app/routers/health.pybackend/app/services/llm_judge.pycore/src/core/langfuse.pycore/src/core/trino.pyfrontend/.env
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
agent/src/agent/nodes/extractor.py (1)
105-130: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick winMove
concurrent.futuresimport to module level.The
import concurrent.futuresstatement at line 110 is inside theextractor_nodefunction body. Python convention places imports at the module level unless there's a specific reason (e.g., avoiding circular dependencies or lazy loading). Moving it to the top improves code organization and consistency.♻️ Proposed fix
Add the import at the module level (after line 3):
import abc import datetime +import concurrent.futures import requests from typing import ListRemove the local import from the function:
def extractor_node(state: AgentState): """Enrich the user query with additional context to help downstream phases.""" user_query = state["user_query"] active_extractors = state.get("active_extractors") or [] - - import concurrent.futures extractors: List[BaseExtractor] = [TimeExtractor(), LLMExtractor()]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@agent/src/agent/nodes/extractor.py` around lines 105 - 130, The import statement for concurrent.futures is currently located inside the extractor_node function body, which violates Python convention. Move the import concurrent.futures statement to the module level at the top of the file with other imports, and remove it from inside the extractor_node function. This improves code organization and consistency by following standard Python import conventions.backend/app/routers/publish.py (1)
50-60: 🧹 Nitpick | 🔵 TrivialGolden question threshold discrepancy across endpoints is intentional.
The endpoint requires at least 3 golden questions (line 54), while
evaluation.py:trigger_evaluation_runandorchestration.pyrequire only 1 question. This stricter threshold for production publish is by design—higher quality gates for publishing vs. lighter validation for evaluation runs. Consider adding an inline comment clarifying this policy difference if not self-evident to future maintainers.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/routers/publish.py` around lines 50 - 60, Add an inline comment in the golden questions validation check where len(questions) < 3 is evaluated to clarify that this stricter threshold (3 questions required) is intentionally higher than the requirement in other endpoints like evaluation.py and orchestration.py (which require only 1 question) because publish gates require higher quality standards for production. This will help future maintainers understand the policy difference and why the threshold differs across endpoints.
♻️ Duplicate comments (1)
agent/src/agent/nodes/schema_explorer.py (1)
405-411: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick winValidate all three identifier parts, not just the catalog.
The
IDENT_REvalidation only applies tocat(catalog), butschandtblare passed directly to the parameterized query. While parameterized queries protect against injection in the WHERE clause values, the catalog name is still interpolated into the SQL string via f-string. The current validation is correct for the catalog, but for defense-in-depth, consider validating all three parts before constructing the query.🛡️ Suggested defense-in-depth validation
if len(parts) == 3: cat, sch, tbl = parts - if not IDENT_RE.fullmatch(cat): + if not all(IDENT_RE.fullmatch(p) for p in (cat, sch, tbl)): hallucinated.append(t_name) continue sql = f'SELECT 1 FROM "{cat}".information_schema.tables WHERE table_schema = ? AND table_name = ?'🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@agent/src/agent/nodes/schema_explorer.py` around lines 405 - 411, The identifier validation using IDENT_RE is only applied to the catalog parameter (cat) but not to the schema (sch) and table (tbl) parameters. For defense-in-depth security, validate all three identifier components before constructing the SQL query. Add IDENT_RE.fullmatch validation checks for both sch and tbl similar to the existing validation for cat, ensuring that if any identifier fails validation, the entry is marked as hallucinated and the loop continues before the SQL query is constructed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@agent/src/agent/nodes/finalizer.py`:
- Around line 68-87: The code attempts to derive column information from the
first row of data in the inline fallback path, but Trino rows are lists of lists
(not dicts), so the isinstance check fails and produces empty columns. The
refiner must capture and pass the result.columns metadata alongside result.rows
when storing inline_result_rows, and the finalizer must use the passed
inline_result_columns directly instead of attempting to derive columns from the
data structure. Update the refiner to store inline_result_columns alongside
inline_result_rows in the success path, then update the finalizer to use
inline_result_columns directly when constructing the preview_info dict.
In `@agent/src/agent/nodes/schema_explorer.py`:
- Around line 53-66: The function `get_query_embedding` uses
`urllib.request.Request` and `urllib.request.urlopen` but the `urllib.request`
module is not imported at the top of the file. Add `import urllib.request` to
the import section at the beginning of the file to resolve the `NameError` that
will occur at runtime when this function is called.
In `@backend/app/routers/health.py`:
- Around line 211-234: The EscaClient instantiation is currently outside the try
block, which means if the constructor fails with an invalid URL or
initialization error, the exception will bypass error handling and the finally
block cleanup will not execute. Move the line that instantiates EscaClient
(client = EscaClient(api_key=api_key, base_url=base_url)) inside the try block
before the _ping_esca() function definition, so that any client initialization
errors are properly caught by the existing exception handlers and the finally
block guarantees cleanup via await client.close().
In `@core/src/core/langfuse.py`:
- Around line 13-14: In the exception handler for the Langfuse CallbackHandler
initialization in the except block, replace the logger.error() call with
logger.exception() to automatically capture the full stack trace. This provides
more complete debugging information than manually formatting the error message
with logger.error(). You can simplify the message itself since the traceback
will be included automatically.
---
Outside diff comments:
In `@agent/src/agent/nodes/extractor.py`:
- Around line 105-130: The import statement for concurrent.futures is currently
located inside the extractor_node function body, which violates Python
convention. Move the import concurrent.futures statement to the module level at
the top of the file with other imports, and remove it from inside the
extractor_node function. This improves code organization and consistency by
following standard Python import conventions.
In `@backend/app/routers/publish.py`:
- Around line 50-60: Add an inline comment in the golden questions validation
check where len(questions) < 3 is evaluated to clarify that this stricter
threshold (3 questions required) is intentionally higher than the requirement in
other endpoints like evaluation.py and orchestration.py (which require only 1
question) because publish gates require higher quality standards for production.
This will help future maintainers understand the policy difference and why the
threshold differs across endpoints.
---
Duplicate comments:
In `@agent/src/agent/nodes/schema_explorer.py`:
- Around line 405-411: The identifier validation using IDENT_RE is only applied
to the catalog parameter (cat) but not to the schema (sch) and table (tbl)
parameters. For defense-in-depth security, validate all three identifier
components before constructing the SQL query. Add IDENT_RE.fullmatch validation
checks for both sch and tbl similar to the existing validation for cat, ensuring
that if any identifier fails validation, the entry is marked as hallucinated and
the loop continues before the SQL query is constructed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: e597dae2-ab0a-4208-8a87-0d948fb32d24
⛔ Files ignored due to path filters (1)
agent/uv.lockis excluded by!**/*.lock
📒 Files selected for processing (37)
agent/Dockerfileagent/src/agent/config.pyagent/src/agent/graph.pyagent/src/agent/llm.pyagent/src/agent/mcp_server.pyagent/src/agent/nodes/extractor.pyagent/src/agent/nodes/finalizer.pyagent/src/agent/nodes/query_builder.pyagent/src/agent/nodes/refiner.pyagent/src/agent/nodes/schema_explorer.pyagent/src/agent/routers/chat.pyagent/src/agent/state.pyagent/src/agent/utils/esca.pyagent/src/agent/utils/sql.pybackend/app/infra_init.pybackend/app/routers/admin_approval.pybackend/app/routers/admin_auth.pybackend/app/routers/audit.pybackend/app/routers/enrichment.pybackend/app/routers/evaluation.pybackend/app/routers/feedback.pybackend/app/routers/health.pybackend/app/routers/orchestration.pybackend/app/routers/publish.pybackend/app/routers/questions.pybackend/app/routers/scopes.pybackend/app/services/auth.pybackend/app/services/evaluator.pybackend/app/services/join_detection.pybackend/app/services/langfuse_client.pybackend/app/services/llm_judge.pybackend/app/services/profiling_engine.pybackend/app/services/scoring.pybackend/app/services/trino_client.pycore/src/core/langfuse.pycore/src/core/models/models.pycore/src/core/trino.py
💤 Files with no reviewable changes (4)
- backend/app/services/trino_client.py
- backend/app/services/scoring.py
- backend/app/routers/orchestration.py
- backend/app/services/langfuse_client.py
| if esca_write_failed or not raw_data_ref: | ||
| if inline_result_rows is not None: | ||
| limit = 5 | ||
| preview_rows = inline_result_rows[:limit] | ||
| columns = ( | ||
| list(preview_rows[0].keys()) | ||
| if preview_rows and isinstance(preview_rows[0], dict) | ||
| else [] | ||
| ) | ||
| preview_info = { | ||
| "columns": columns, | ||
| "preview_rows": preview_rows, | ||
| "preview_count": len(preview_rows), | ||
| "total_rows": len(inline_result_rows), | ||
| } | ||
| preview_str = json.dumps(preview_info, indent=2) | ||
| else: | ||
| preview_str = "No data reference found." | ||
| else: | ||
| preview_str = await get_esca_preview(raw_data_ref) |
There was a problem hiding this comment.
Inline fallback produces empty columns when ESCA write fails.
When esca_write_failed is true, the code attempts to derive columns from inline_result_rows by checking if the first row is a dict (line 74). However, result.rows from Trino are lists of lists (not dicts), so isinstance(preview_rows[0], dict) will be False and columns will be [].
The refiner stores inline_result_rows = result.rows but doesn't store result.columns. For the fallback preview to be useful, the refiner should also pass inline_result_columns or the finalizer needs another way to access column metadata.
🔧 Suggested fix in refiner.py
Store columns alongside rows in the success path:
# In refiner.py success path (around line 108)
return {
"trino_error": None,
"last_error": None,
"raw_data_ref": raw_ref,
"esca_write_failed": esca_write_failed,
"inline_result_rows": inline_result_rows,
+ "inline_result_columns": result.columns,
"error_history": error_history,
}Then in finalizer.py:
if esca_write_failed or not raw_data_ref:
if inline_result_rows is not None:
limit = 5
preview_rows = inline_result_rows[:limit]
- columns = (
- list(preview_rows[0].keys())
- if preview_rows and isinstance(preview_rows[0], dict)
- else []
- )
+ columns = state.get("inline_result_columns", [])📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if esca_write_failed or not raw_data_ref: | |
| if inline_result_rows is not None: | |
| limit = 5 | |
| preview_rows = inline_result_rows[:limit] | |
| columns = ( | |
| list(preview_rows[0].keys()) | |
| if preview_rows and isinstance(preview_rows[0], dict) | |
| else [] | |
| ) | |
| preview_info = { | |
| "columns": columns, | |
| "preview_rows": preview_rows, | |
| "preview_count": len(preview_rows), | |
| "total_rows": len(inline_result_rows), | |
| } | |
| preview_str = json.dumps(preview_info, indent=2) | |
| else: | |
| preview_str = "No data reference found." | |
| else: | |
| preview_str = await get_esca_preview(raw_data_ref) | |
| if esca_write_failed or not raw_data_ref: | |
| if inline_result_rows is not None: | |
| limit = 5 | |
| preview_rows = inline_result_rows[:limit] | |
| columns = state.get("inline_result_columns", []) | |
| preview_info = { | |
| "columns": columns, | |
| "preview_rows": preview_rows, | |
| "preview_count": len(preview_rows), | |
| "total_rows": len(inline_result_rows), | |
| } | |
| preview_str = json.dumps(preview_info, indent=2) | |
| else: | |
| preview_str = "No data reference found." | |
| else: | |
| preview_str = await get_esca_preview(raw_data_ref) |
🧰 Tools
🪛 ast-grep (0.43.0)
[info] 82-82: use jsonify instead of json.dumps for JSON output
Context: json.dumps(preview_info, indent=2)
Note: Security best practice.
(use-jsonify)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@agent/src/agent/nodes/finalizer.py` around lines 68 - 87, The code attempts
to derive column information from the first row of data in the inline fallback
path, but Trino rows are lists of lists (not dicts), so the isinstance check
fails and produces empty columns. The refiner must capture and pass the
result.columns metadata alongside result.rows when storing inline_result_rows,
and the finalizer must use the passed inline_result_columns directly instead of
attempting to derive columns from the data structure. Update the refiner to
store inline_result_columns alongside inline_result_rows in the success path,
then update the finalizer to use inline_result_columns directly when
constructing the preview_info dict.
| from esca_sdk import EscaClient | ||
|
|
||
| client = EscaClient(api_key=api_key, base_url=base_url) | ||
|
|
||
| async def _ping_esca(): | ||
| if hasattr(client, "ping"): | ||
| return await client.ping() | ||
| else: | ||
| return await client.load_head("health_check_dummy_id") | ||
|
|
||
| try: | ||
| await asyncio.wait_for(_ping_esca(), timeout=5.0) | ||
| return {"status": "ok", "message": "Esca is reachable"} | ||
| except TimeoutError: | ||
| raise HTTPException( | ||
| status_code=503, detail="Esca connection timed out after 5.0s" | ||
| ) | ||
| except Exception as e: | ||
| err_str = str(e).lower() | ||
| if "404" in err_str or "not found" in err_str: | ||
| return {"status": "ok", "message": "Esca is reachable"} | ||
| raise HTTPException(status_code=503, detail=f"Esca health check failed: {e}") | ||
| finally: | ||
| await client.close() |
There was a problem hiding this comment.
EscaClient instantiation outside try block leads to unhandled errors.
If EscaClient(...) constructor fails (invalid URL, initialization error), the exception bypasses the health check error handling and propagates as an unhandled 500 error. The finally block won't execute since the try hasn't been entered yet.
Move client instantiation inside the try block to ensure proper error handling and cleanup:
🐛 Proposed fix
from esca_sdk import EscaClient
- client = EscaClient(api_key=api_key, base_url=base_url)
+ client = None
async def _ping_esca():
if hasattr(client, "ping"):
return await client.ping()
else:
return await client.load_head("health_check_dummy_id")
try:
+ client = EscaClient(api_key=api_key, base_url=base_url)
await asyncio.wait_for(_ping_esca(), timeout=5.0)
return {"status": "ok", "message": "Esca is reachable"}
except TimeoutError:
raise HTTPException(
status_code=503, detail="Esca connection timed out after 5.0s"
)
except Exception as e:
err_str = str(e).lower()
if "404" in err_str or "not found" in err_str:
return {"status": "ok", "message": "Esca is reachable"}
raise HTTPException(status_code=503, detail=f"Esca health check failed: {e}")
finally:
- await client.close()
+ if client is not None:
+ await client.close()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/app/routers/health.py` around lines 211 - 234, The EscaClient
instantiation is currently outside the try block, which means if the constructor
fails with an invalid URL or initialization error, the exception will bypass
error handling and the finally block cleanup will not execute. Move the line
that instantiates EscaClient (client = EscaClient(api_key=api_key,
base_url=base_url)) inside the try block before the _ping_esca() function
definition, so that any client initialization errors are properly caught by the
existing exception handlers and the finally block guarantees cleanup via await
client.close().
| except Exception as e: | ||
| logger.error(f"Failed to initialize Langfuse CallbackHandler: {e}") |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | ⚡ Quick win
Use logger.exception() to automatically capture the traceback.
When logging within an exception handler, logger.exception() is preferred over logger.error() because it automatically includes the full stack trace, which significantly aids debugging initialization failures.
🔧 Recommended improvement
except Exception as e:
- logger.error(f"Failed to initialize Langfuse CallbackHandler: {e}")
+ logger.exception("Failed to initialize Langfuse CallbackHandler")
_langfuse_handler = None🧰 Tools
🪛 Ruff (0.15.17)
[warning] 13-13: Do not catch blind exception: Exception
(BLE001)
[warning] 14-14: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
[warning] 14-14: Logging statement uses f-string
(G004)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@core/src/core/langfuse.py` around lines 13 - 14, In the exception handler for
the Langfuse CallbackHandler initialization in the except block, replace the
logger.error() call with logger.exception() to automatically capture the full
stack trace. This provides more complete debugging information than manually
formatting the error message with logger.error(). You can simplify the message
itself since the traceback will be included automatically.
There was a problem hiding this comment.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
agent/src/agent/nodes/schema_explorer.py (1)
230-240: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick winDuplicate import of
langfuse_client.
langfuse_clientis already imported at line 22 (from agent.langfuse_client import langfuse_client). This redundant import inside the exception handler is unnecessary.♻️ Suggested removal
except Exception as e: esca_id = None - from agent.langfuse_client import langfuse_client - if langfuse_client and langfuse_client.get_current_observation_id():🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@agent/src/agent/nodes/schema_explorer.py` around lines 230 - 240, Remove the redundant import statement `from agent.langfuse_client import langfuse_client` from inside the exception handler block in the schema_explorer.py file. The langfuse_client is already imported at the module level (line 22), so it is already available for use throughout the file. Simply delete the import line and use the already-imported langfuse_client variable directly in the exception handling logic.agent/src/agent/nodes/refiner.py (1)
101-106:⚠️ Potential issue | 🟡 MinorAdd check for
get_current_observation_id()result before using in span() call.The condition at line 101 only checks
langfuse_client.get_current_trace_id(), butget_current_observation_id()may returnNoneeven when a trace exists. The same error handling pattern inschema_explorer.py:232correctly checks both conditions:if langfuse_client and langfuse_client.get_current_observation_id():. Apply the same defensive check here to avoid passingid=Noneto thespan()call.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@agent/src/agent/nodes/refiner.py` around lines 101 - 106, The condition checking whether to call langfuse_client.span() only verifies that get_current_trace_id() returns a truthy value, but does not check whether get_current_observation_id() returns None. Update the if condition to check both langfuse_client.get_current_trace_id() AND langfuse_client.get_current_observation_id() before proceeding to the span() call, following the same defensive pattern used elsewhere in schema_explorer.py. This prevents passing id=None to the span() method.backend/app/services/langfuse_client.py (1)
100-103:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winStale documentation references removed method.
The docstring on lines 100-103 still references
get_prompt_as_langchain(name)as an additional method, but this method was removed from the class. Update the docstring to remove this reference.📝 Proposed fix
Implements Connection: connect(), is_alive(), logout(). Additional methods (present in the main app's LangfuseTracer): get_prompt(name) — fetch a prompt by name. - get_prompt_as_langchain(name) — fetch and convert to a LangChain prompt. """🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@backend/app/services/langfuse_client.py` around lines 100 - 103, The docstring for the LangfuseTracer class contains a stale reference to the `get_prompt_as_langchain(name)` method in the "Additional methods" section, but this method has been removed from the class. Remove the line referencing `get_prompt_as_langchain(name)` from the docstring, keeping only the documentation for the `get_prompt(name)` method that still exists in the class.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.gitignore:
- Line 42: The .gitignore file at line 42 contains an entry that ignores the
python-core-utils directory, but this directory is required by the build/runtime
(Docker COPY command and uv source mappings). Remove the python-core-utils/ line
from .gitignore to ensure the directory is tracked by git and available for
builds.
In `@agent/src/agent/nodes/refiner.py`:
- Around line 8-9: Remove the duplicate import statement of langfuse_client from
agent.langfuse_client in the refiner.py file. The import statement appears twice
consecutively on lines 8 and 9 - delete one of these duplicate import lines to
keep only a single import of langfuse_client from agent.langfuse_client.
In `@backend/app/infra_init.py`:
- Line 552: The fallback database URL in the `os.getenv()` call for
`DATABASE_URL` uses `localhost` as the hostname, which resolves to the container
itself rather than the Postgres service when running inside the backend
container. Replace `localhost` in the hardcoded default connection string with
the appropriate service hostname (typically `postgres` or `db`) that is
resolvable from within the container network, ensuring the connection can reach
the actual Postgres service.
- Around line 33-35: The new environment-driven Trino configuration variables
(_TRINO_HOST, _TRINO_PORT, _TRINO_USER) defined at the module level are not
being used in the _ensure_om_service function, which still hardcodes the Trino
connection details as "trino:8080" with user "trino". Replace the hardcoded
values in _ensure_om_service with the corresponding environment-driven
configuration variables (_TRINO_HOST, _TRINO_PORT, _TRINO_USER) to ensure
OpenMetadata service registration respects the new environment configuration
contract.
- Line 27: The `os` import on line 27 is not at the top of the file, which
violates Ruff's E402 rule and can cause CI failures. Move the `os` import to the
top of the file where other module imports are located, ensuring all import
statements are grouped together at the beginning of the file before any other
code.
In `@backend/app/routers/evaluation.py`:
- Around line 113-117: The Langfuse trace write operations at lines 113–117 and
185–188 in the evaluation.py file are not protected from failures, which means
any transient Langfuse observability issue will abort the entire evaluation
execution and prevent returning successful results. Wrap both the
langfuse_client.trace() calls in try-except blocks (or similar error handling)
to isolate them from the critical evaluation path, ensuring that Langfuse
failures are caught and logged but do not interrupt evaluation completion,
database commits, or function returns. This is especially critical at the second
site (lines 185–188) where the database has already been committed and scores
computed.
In `@backend/pyproject.toml`:
- Line 13: The langfuse dependency constraint at line 13 in
backend/pyproject.toml is specified as "langfuse>=2.0.0" which lacks an upper
bound, allowing any future major version to be installed and potentially
introducing breaking changes. Modify the version constraint to include an upper
bound by changing it to "langfuse>=2.0.0,<3.0.0" (or similar bounded range) to
ensure reproducible behavior and prevent silent breakage from major version
updates while still allowing compatible patch and minor version updates.
---
Outside diff comments:
In `@agent/src/agent/nodes/refiner.py`:
- Around line 101-106: The condition checking whether to call
langfuse_client.span() only verifies that get_current_trace_id() returns a
truthy value, but does not check whether get_current_observation_id() returns
None. Update the if condition to check both
langfuse_client.get_current_trace_id() AND
langfuse_client.get_current_observation_id() before proceeding to the span()
call, following the same defensive pattern used elsewhere in schema_explorer.py.
This prevents passing id=None to the span() method.
In `@agent/src/agent/nodes/schema_explorer.py`:
- Around line 230-240: Remove the redundant import statement `from
agent.langfuse_client import langfuse_client` from inside the exception handler
block in the schema_explorer.py file. The langfuse_client is already imported at
the module level (line 22), so it is already available for use throughout the
file. Simply delete the import line and use the already-imported langfuse_client
variable directly in the exception handling logic.
In `@backend/app/services/langfuse_client.py`:
- Around line 100-103: The docstring for the LangfuseTracer class contains a
stale reference to the `get_prompt_as_langchain(name)` method in the "Additional
methods" section, but this method has been removed from the class. Remove the
line referencing `get_prompt_as_langchain(name)` from the docstring, keeping
only the documentation for the `get_prompt(name)` method that still exists in
the class.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 8eae59ad-1ecc-4f29-8ba1-a19da085bd3d
⛔ Files ignored due to path filters (2)
agent/uv.lockis excluded by!**/*.lockbackend/uv.lockis excluded by!**/*.lock
📒 Files selected for processing (15)
.gitignoreagent/Dockerfileagent/pyproject.tomlagent/src/agent/nodes/extractor.pyagent/src/agent/nodes/refiner.pyagent/src/agent/nodes/schema_explorer.pybackend/Dockerfilebackend/app/infra_init.pybackend/app/routers/evaluation.pybackend/app/routers/orchestration.pybackend/app/services/evaluator.pybackend/app/services/langfuse_client.pybackend/app/services/llm_judge.pybackend/pyproject.tomlcore/pyproject.toml
| agent/scratch/* | ||
| fix_agent_langfuse.py | ||
| fix_langfuse.py | ||
| python-core-utils/ |
There was a problem hiding this comment.
Do not ignore a directory that is required by build/runtime dependency resolution.
Line 42 ignores python-core-utils/, but this PR also requires it in Docker build context (COPY python-core-utils ...) and uv source mappings (../python-core-utils). This creates a high risk of out-of-the-box build failures.
Suggested patch
-python-core-utils/📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| python-core-utils/ |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.gitignore at line 42, The .gitignore file at line 42 contains an entry that
ignores the python-core-utils directory, but this directory is required by the
build/runtime (Docker COPY command and uv source mappings). Remove the
python-core-utils/ line from .gitignore to ensure the directory is tracked by
git and available for builds.
| from agent.langfuse_client import langfuse_client | ||
| from agent.langfuse_client import langfuse_client |
There was a problem hiding this comment.
Remove duplicate import.
The same import appears twice on consecutive lines:
from agent.langfuse_client import langfuse_client
from agent.langfuse_client import langfuse_client🐛 Proposed fix
from agent.config import settings
from agent.langfuse_client import langfuse_client
-from agent.langfuse_client import langfuse_client
from langchain_core.prompts import ChatPromptTemplate🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@agent/src/agent/nodes/refiner.py` around lines 8 - 9, Remove the duplicate
import statement of langfuse_client from agent.langfuse_client in the refiner.py
file. The import statement appears twice consecutively on lines 8 and 9 - delete
one of these duplicate import lines to keep only a single import of
langfuse_client from agent.langfuse_client.
| _MINIO_HOST = "minio:9000" | ||
| _MINIO_ACCESS_KEY = "admin" | ||
| _MINIO_SECRET_KEY = "password123" | ||
| import os |
There was a problem hiding this comment.
Move os import to the top import block.
Line 27 triggers Ruff E402 and can fail lint-gated CI.
Suggested patch
import base64
import logging
+import os
import time
from typing import Any
@@
-import os
_MINIO_HOST = os.getenv("MINIO_HOST", "localhost:9000")🧰 Tools
🪛 Ruff (0.15.17)
[error] 27-27: Module level import not at top of file
(E402)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/app/infra_init.py` at line 27, The `os` import on line 27 is not at
the top of the file, which violates Ruff's E402 rule and can cause CI failures.
Move the `os` import to the top of the file where other module imports are
located, ensuring all import statements are grouped together at the beginning of
the file before any other code.
Source: Linters/SAST tools
| _TRINO_HOST = os.getenv("TRINO_HOST", "localhost") | ||
| _TRINO_PORT = int(os.getenv("TRINO_PORT", "8080")) | ||
| _TRINO_USER = os.getenv("TRINO_USER", "trino") |
There was a problem hiding this comment.
Propagate env-driven Trino settings into OpenMetadata service registration.
Lines 33–35 introduce _TRINO_HOST/_TRINO_PORT/_TRINO_USER, but _ensure_om_service still hardcodes trino:8080 and trino (Lines 797–799). This breaks the new config contract in non-default environments.
Suggested patch
"connection": {
"config": {
"type": "Trino",
- "hostPort": "trino:8080",
- "username": "trino",
+ "hostPort": f"{_TRINO_HOST}:{_TRINO_PORT}",
+ "username": _TRINO_USER,
"catalog": "minio",
}
},📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| _TRINO_HOST = os.getenv("TRINO_HOST", "localhost") | |
| _TRINO_PORT = int(os.getenv("TRINO_PORT", "8080")) | |
| _TRINO_USER = os.getenv("TRINO_USER", "trino") | |
| "connection": { | |
| "config": { | |
| "type": "Trino", | |
| "hostPort": f"{_TRINO_HOST}:{_TRINO_PORT}", | |
| "username": _TRINO_USER, | |
| "catalog": "minio", | |
| } | |
| }, |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/app/infra_init.py` around lines 33 - 35, The new environment-driven
Trino configuration variables (_TRINO_HOST, _TRINO_PORT, _TRINO_USER) defined at
the module level are not being used in the _ensure_om_service function, which
still hardcodes the Trino connection details as "trino:8080" with user "trino".
Replace the hardcoded values in _ensure_om_service with the corresponding
environment-driven configuration variables (_TRINO_HOST, _TRINO_PORT,
_TRINO_USER) to ensure OpenMetadata service registration respects the new
environment configuration contract.
|
|
||
| # We use psycopg2 directly since it's already installed via pyproject.toml | ||
| conn_str = "postgresql://postgres:postgres@db:5432/text2sql" | ||
| conn_str = os.getenv("DATABASE_URL", "postgresql://postgres:postgres@localhost:5432/text2sql") |
There was a problem hiding this comment.
Fix the container fallback for DATABASE_URL.
Line 552 defaults to localhost, but this script runs inside the backend container; localhost points to itself, not the Postgres service. If DATABASE_URL is missing/mis-scoped, startup fails.
Suggested patch
- conn_str = os.getenv("DATABASE_URL", "postgresql://postgres:postgres@localhost:5432/text2sql")
+ conn_str = os.getenv("DATABASE_URL", "postgresql://postgres:postgres@db:5432/text2sql")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/app/infra_init.py` at line 552, The fallback database URL in the
`os.getenv()` call for `DATABASE_URL` uses `localhost` as the hostname, which
resolves to the container itself rather than the Postgres service when running
inside the backend container. Replace `localhost` in the hardcoded default
connection string with the appropriate service hostname (typically `postgres` or
`db`) that is resolvable from within the container network, ensuring the
connection can reach the actual Postgres service.
| if langfuse_client.client and langfuse_client.client.get_current_trace_id(): | ||
| langfuse_client.client.trace(id=langfuse_client.client.get_current_trace_id(), | ||
| metadata={"table_id": table_id, "run_id": run_id}, | ||
| tags=["eval-run", f"table:{table_id}"], | ||
| ) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's examine the file structure
fd evaluation.py --type fRepository: StavPonte11/text2sql-onboarding
Length of output: 109
🏁 Script executed:
# Check the file length and examine the relevant sections
wc -l backend/app/routers/evaluation.pyRepository: StavPonte11/text2sql-onboarding
Length of output: 114
🏁 Script executed:
# Read the relevant sections around lines 113-117 and 185-188
cat -n backend/app/routers/evaluation.py | sed -n '100,130p'Repository: StavPonte11/text2sql-onboarding
Length of output: 1322
🏁 Script executed:
# Check the second location around line 185
cat -n backend/app/routers/evaluation.py | sed -n '175,200p'Repository: StavPonte11/text2sql-onboarding
Length of output: 1253
🏁 Script executed:
# Check imports and logger setup at the top of the file
cat -n backend/app/routers/evaluation.py | head -30Repository: StavPonte11/text2sql-onboarding
Length of output: 1064
🏁 Script executed:
# Check for logger initialization in the module
cat -n backend/app/routers/evaluation.py | sed -n '1,50p'Repository: StavPonte11/text2sql-onboarding
Length of output: 1801
🏁 Script executed:
# Look at the function containing lines 113-117 to understand context
cat -n backend/app/routers/evaluation.py | sed -n '60,120p'Repository: StavPonte11/text2sql-onboarding
Length of output: 2355
🏁 Script executed:
# Check the function signature and earlier setup
cat -n backend/app/routers/evaluation.py | sed -n '80,115p'Repository: StavPonte11/text2sql-onboarding
Length of output: 1503
🏁 Script executed:
# Check the broader context around line 185 - what happens after the trace call
cat -n backend/app/routers/evaluation.py | sed -n '150,190p'Repository: StavPonte11/text2sql-onboarding
Length of output: 1928
Isolate Langfuse trace writes from the evaluation critical path.
Lines 113–117 and 185–188 perform external Langfuse trace updates without protection. A transient observability failure will abort evaluation execution/state updates. This is especially problematic at line 185–188, where the database is already committed and scores computed; a Langfuse failure here prevents the function from returning despite successful evaluation.
Suggested patch
- if langfuse_client.client and langfuse_client.client.get_current_trace_id():
- langfuse_client.client.trace(id=langfuse_client.client.get_current_trace_id(),
- metadata={"table_id": table_id, "run_id": run_id},
- tags=["eval-run", f"table:{table_id}"],
- )
+ if langfuse_client.client and langfuse_client.client.get_current_trace_id():
+ try:
+ langfuse_client.client.trace(
+ id=langfuse_client.client.get_current_trace_id(),
+ metadata={"table_id": table_id, "run_id": run_id},
+ tags=["eval-run", f"table:{table_id}"],
+ )
+ except Exception as exc:
+ logger.debug("Langfuse trace metadata update failed: %s", exc)
@@
- if langfuse_client.client and langfuse_client.client.get_current_trace_id():
- langfuse_client.client.trace(id=langfuse_client.client.get_current_trace_id(),
- output={"score": avg_score_contains, "pass_rate": pass_rate_contains}
- )
+ if langfuse_client.client and langfuse_client.client.get_current_trace_id():
+ try:
+ langfuse_client.client.trace(
+ id=langfuse_client.client.get_current_trace_id(),
+ output={"score": avg_score_contains, "pass_rate": pass_rate_contains},
+ )
+ except Exception as exc:
+ logger.debug("Langfuse trace output update failed: %s", exc)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/app/routers/evaluation.py` around lines 113 - 117, The Langfuse trace
write operations at lines 113–117 and 185–188 in the evaluation.py file are not
protected from failures, which means any transient Langfuse observability issue
will abort the entire evaluation execution and prevent returning successful
results. Wrap both the langfuse_client.trace() calls in try-except blocks (or
similar error handling) to isolate them from the critical evaluation path,
ensuring that Langfuse failures are caught and logged but do not interrupt
evaluation completion, database commits, or function returns. This is especially
critical at the second site (lines 185–188) where the database has already been
committed and scores computed.
| "pydantic==2.10.4", | ||
| "pydantic-settings==2.7.0", | ||
| "langfuse==2.56.2", | ||
| "langfuse>=2.0.0", |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | ⚡ Quick win
Constrain langfuse to a bounded range for reproducible runtime behavior.
Line 13 allows any future major release, which can silently introduce breaking API changes across environments. Prefer pinning or at least an upper bound.
Suggested patch
- "langfuse>=2.0.0",
+ "langfuse>=2.56.2,<3.0.0",📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "langfuse>=2.0.0", | |
| "langfuse>=2.56.2,<3.0.0", |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@backend/pyproject.toml` at line 13, The langfuse dependency constraint at
line 13 in backend/pyproject.toml is specified as "langfuse>=2.0.0" which lacks
an upper bound, allowing any future major version to be installed and
potentially introducing breaking changes. Modify the version constraint to
include an upper bound by changing it to "langfuse>=2.0.0,<3.0.0" (or similar
bounded range) to ensure reproducible behavior and prevent silent breakage from
major version updates while still allowing compatible patch and minor version
updates.
| es-data: null | ||
| db-data: null | ||
|
|
||
| secrets: |
There was a problem hiding this comment.
why did we remove the deploy key and use a local copy of the core utils package instead of downloading from the url?
There was a problem hiding this comment.
Probably for testing, this will be restored
|
|
||
| @asynccontextmanager | ||
| async def lifespan(app: FastAPI): | ||
| if not settings.OPENAI_API_KEY: |
There was a problem hiding this comment.
this might be a problem for local runs. might be better to run a conditional check / make this optional with a warning
| def get_judge_health(): | ||
| from app.config import settings | ||
|
|
||
| api_key = getattr(settings, "OPENAI_API_KEY", None) |
There was a problem hiding this comment.
I prefer calling the judge directly instead of the indirect check that the var exists. the logic might change and we will forget to correct this endpoint, as well as missing unexpected errors
There was a problem hiding this comment.
You mean calling evaluate_with_llm instead?
|
|
||
| from esca_sdk import EscaClient | ||
|
|
||
| client = EscaClient(api_key=api_key, base_url=base_url) |
There was a problem hiding this comment.
why not use the util function?
|
|
||
| try: | ||
| from agent.config import settings as agent_settings | ||
| api_key = getattr(agent_settings, "ESCA_API_KEY", None) |
There was a problem hiding this comment.
same as the judge - just call the function and catch any errors. var validation is not supposed to happen here
| MAX_REFINER_ITERATIONS = 3 | ||
| REFINER_SCHEMA_CONTEXT_TABLES = 4 |
There was a problem hiding this comment.
move to settings file and make env var
|
|
||
|
|
||
| # Define Tools | ||
| @tool |
There was a problem hiding this comment.
We don't need it currently as a tool and there is a function called hybrid_search_tables
| if i < settings.MAX_PROFILES_TO_FETCH: | ||
|
|
||
| # 2. Automatically get profiles for the top candidate tables (up to MAX_PROFILES_TO_FETCH) to seed the prompt | ||
| import asyncio |
There was a problem hiding this comment.
move imports to top of the file
|
|
||
| return {"schema_plan": plan} | ||
|
|
||
| tables_used = getattr(data, "tables_used", []) |
There was a problem hiding this comment.
I don't like this here
please extract the logic to a function / middleware
I think this checks design can be improved
There was a problem hiding this comment.
I think it might be better to create a node in the graph for "sql static validations" that will verify syntax and that the tables actually exist. this way it will be easier to navigate, will result in cleaner code, and will be easier to evaluate
|
|
||
| preview_str = "" | ||
| if raw_data_ref: | ||
| if esca_write_failed or not raw_data_ref: |
There was a problem hiding this comment.
why are we continuing to here if the write failed? shouldn't we fail the whole graph if that happens? it feels wrong to finalize as if nothing is wrong if we have failed to fetch the data
There was a problem hiding this comment.
We decided that currently we will add a flag that decide if we want to work with esca and later on in a different branch we will add somehting like healthcheck to esca and decide smartly if it wants to work with esca or not dynamically and based if it's on air
| res = await client.save_data(esca_payload) | ||
| esca_id = res.get("esca_id") | ||
| except Exception as e: | ||
| esca_id = None |
There was a problem hiding this comment.
if we are already saving escape errors in state, why not here?
There was a problem hiding this comment.
This is not the state of graph, this is for getting the profile and then later on get it from esca
|
|
||
| return {"schema_plan": plan} | ||
|
|
||
| tables_used = getattr(data, "tables_used", []) |
There was a problem hiding this comment.
I think it might be better to create a node in the graph for "sql static validations" that will verify syntax and that the tables actually exist. this way it will be easier to navigate, will result in cleaner code, and will be easier to evaluate
| HYBRID_SEARCH_MAX_TABLES: int = 10 | ||
| MAX_PROFILES_TO_FETCH: int = 3 | ||
| PROFILE_FETCH_CONCURRENCY: int = Field(default=4, gt=0) | ||
| REDIS_URL: str = "redis://localhost:6379" |
There was a problem hiding this comment.
please use the redis connector from the core utils package, it has its own redis conf
There was a problem hiding this comment.
This can be deleted because we use the get_redis_client function from core
| router = APIRouter(prefix="/tables", tags=["publish"]) | ||
|
|
||
|
|
||
| def _check_regression(table_id: str, new_score: float, session: Session) -> list[dict]: |
There was a problem hiding this comment.
I think there was no use of this function before.
We have the function _run_regression_eval that we use
|
|
||
| return { | ||
| "feedback_route": route, | ||
| "sql_query": "", |
There was a problem hiding this comment.
Indeed, we don't want to delete other things because maybe we do need the query or the plan.
But what we should add later is a better approach that also look at the feedback and pass it as optional variable into some of the nodes.
| if state.get("hallucinated_tables"): | ||
| if state.get("schema_explorer_retry_count", 0) >= 3: | ||
| return "query_builder" | ||
| return "schema_explorer" | ||
| return "query_builder" |
There was a problem hiding this comment.
inefficient if's, but also, why should it silently continue if there are hallucinations? I think it would be better to through an error...
There was a problem hiding this comment.
Yes if hallucination we should throw error and not continue to query builder
| esca_write_failed = True | ||
| if langfuse_client and langfuse_client.get_current_trace_id(): | ||
| langfuse_client.span(id=langfuse_client.get_current_observation_id(), | ||
| level="WARNING", status_message=f"ESCA write failed: {e}" | ||
| ) |
There was a problem hiding this comment.
I mentioned it previously but if we can't write the data there is no reason to silently continue to the next graph. it should throw an error
| # Copy workspace | ||
| COPY core /app/core | ||
| COPY agent /app/agent | ||
| COPY python-core-utils /app/python-core-utils |
There was a problem hiding this comment.
this will be changed because we use the python-core-utils as library and not as folder
|
|
||
| # Configure git to use ssh for github.com | ||
| RUN git config --global url."git@github.com:".insteadOf "https://github.com/" | ||
| RUN apt-get update && apt-get install -y --no-install-recommends git \ |
There was a problem hiding this comment.
revert we also want openssh-client for accessing the python core utils library
| [tool.uv.sources] | ||
| core = { path = "../core" } | ||
| esca-sdk = { git = "https://github.com/elirazpevz/esca.git", subdirectory = "packages/esca-sdk" } | ||
| python-core-utils = { path = "../python-core-utils" } |
| model_config = SettingsConfigDict(env_file=".env", extra="ignore") | ||
|
|
||
| DATABASE_URL: str = "sqlite:///./text2sql.db" | ||
| OPENAI_API_KEY: str | None = None |
There was a problem hiding this comment.
Change into LLM_API_KEY because that's what we use in our agent
| # Copy workspace dependencies | ||
| COPY core /app/core | ||
| COPY backend /app/backend | ||
| COPY python-core-utils /app/python-core-utils |
There was a problem hiding this comment.
Also here change to python-core-utils the library and not from the local files in the project
| "pydantic==2.10.4", | ||
| "pydantic-settings==2.7.0", | ||
| "langfuse==2.56.2", | ||
| "langfuse>=2.0.0", |
There was a problem hiding this comment.
Set it to 4.7.1 since it's what we're working with right now
This PR resolves the 5 production-blocking bugs in Group 1 (TTS-G1-01 through TTS-G1-05):
It also optimizes Docker Compose settings to build and run out-of-the-box using HTTPS for public dependencies, removing any SSH deploy_key requirements.
Summary by CodeRabbit
Release Notes
New Features
/health/judgeand/health/escaendpoints.Configuration
Infrastructure / Behavior