Skip to content

Reliability hardening suite#261

Open
Naseem77 wants to merge 15 commits into
mainfrom
reliability-hardening-suite
Open

Reliability hardening suite#261
Naseem77 wants to merge 15 commits into
mainfrom
reliability-hardening-suite

Conversation

@Naseem77
Copy link
Copy Markdown
Contributor

@Naseem77 Naseem77 commented May 18, 2026

Summary

  • enforce LLM and embedding timeouts with typed timeout errors
  • improve error visibility for broad exception handling paths
  • cover GraphRAG async context manager cleanup behavior
  • enforce latency budgets across retrieval and LLM phases
  • mark and wire real FalkorDB integration tests
  • tighten release automation for docs, PyPI checks, artifacts, and Dependabot

Review

  • ran the local review process after each task and only committed approved fixes

Testing

  • targeted provider, facade, retrieval, integration-marker, YAML, and build checks were run during each task

Summary by CodeRabbit

  • New Features

    • Timeouts for LLM/embedding calls, a new LatencyBudgetExceededError, and latency-budget enforcement across retrieval and completion.
  • Bug Fixes

    • Improved error logging and clearer, typed timeout/error propagation so failures are reported consistently.
  • Documentation

    • Dev setup now uses Docker Compose; added instructions to run integration tests locally.
  • Chores

    • CI/CD workflow tweaks and weekly Dependabot updates.
  • Tests

    • Many new tests for timeouts, budget checkpoints, and propagation.

Review Change Stack

Naseem77 added 6 commits May 18, 2026 16:54
Add typed timeout errors for LLM and embedding calls and wrap async provider operations with asyncio.wait_for. Cover base, LiteLLM, and OpenRouter async paths with regression tests.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add typed wrapping and error-level logging around high-risk broad exception paths while preserving debug tracebacks. Cover connection, provider, loader, pipeline, retrieval, and history validation error behavior.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add regression tests for async context manager cleanup, close failure propagation, and inner-exception preservation.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add a typed latency budget error and enforce Context budgets before retrieval phases, helper I/O, graph config probes, Cypher calls, and completion LLM calls. Cover propagation and phase gating with regression tests.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add an explicit integration marker, run marked real-FalkorDB tests in CI, document docker-compose usage, and expose the FalkorDB browser port in local compose.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Validate Python distributions before trusted PyPI publishing, upload release artifacts, enable manual docs deploys, and add Dependabot coverage for actions and Python dependencies.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 18, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds latency-budget enforcement and typed timeouts across providers and retrievals, threads ctx and timeouts through GraphRAG facade and strategies, introduces timeout helpers and new exception types, tightens history validation and logging, updates CI/workflows and docker/dev docs, and adds extensive tests for budgets, timeouts, and logging.

Changes

Latency Budget Enforcement System

Layer / File(s) Summary
CI, workflows, docs, and repo config
.github/dependabot.yml, .github/workflows/ci.yml, .github/workflows/docs.yml, .github/workflows/pypi-publish.yaml, docker-compose.yml, CONTRIBUTING.md, graphrag_sdk/pyproject.toml
Dependabot enabled for Actions and Python; CI integration runs pytest -m integration; docs workflow adds manual dispatch; PyPI workflow runs twine check and uploads artifacts; docker-compose exposes UI port and adds healthcheck start_period; contributing and pytest marker updated.
Exception hierarchy and Context API
graphrag_sdk/src/graphrag_sdk/core/exceptions.py, graphrag_sdk/src/graphrag_sdk/core/context.py, graphrag_sdk/src/graphrag_sdk/__init__.py
Adds LatencyBudgetExceededError, LLMTimeoutError, EmbeddingTimeoutError; implements Context.remaining_budget_seconds and Context.ensure_budget(operation); exports new exception via package __all__.
Provider timeout helpers
graphrag_sdk/src/graphrag_sdk/core/providers/_timeout.py
Adds wait_for_provider_call and validate_timeout to enforce timeouts and convert timeout events into typed timeout exceptions with operation-aware messages.
Provider ABCs: timeout wiring and retries
graphrag_sdk/src/graphrag_sdk/core/providers/base.py, graphrag_sdk/src/graphrag_sdk/core/providers/_retry.py
Adds timeout kwargs to Embedder and LLM interfaces, validates timeouts, wraps sync/async provider calls with wait_for_provider_call, forwards timeouts through batch/stream helpers, and treats EmbeddingTimeoutError as non-retriable.
Provider implementations (LiteLLM, OpenRouter)
graphrag_sdk/src/graphrag_sdk/core/providers/litellm.py, graphrag_sdk/src/graphrag_sdk/core/providers/openrouter.py
Implements timeout support end-to-end for async LLM and embedder methods, enforces per-call or shared-deadline timeouts for batched embeddings, maps timeout failures to typed errors, and consolidates final error logging after retries.
Retrieval strategies: ctx propagation & checkpoints
graphrag_sdk/src/graphrag_sdk/retrieval/strategies/*.py
Threads ctx: Context through chunk_retrieval, cypher_generation, entity_discovery, local, multi_path, relationship_expansion, result_assembly; inserts ctx.ensure_budget(...) before major async steps and re-raises LatencyBudgetExceededError.
MultiPath orchestration and keyword extraction
graphrag_sdk/src/graphrag_sdk/retrieval/strategies/multi_path.py
Adds _unpack_gather_result helper for parallel tasks, requires ctx in keyword extraction, propagates ctx into parallel Cypher/text retrieval paths, and checkpoints result assembly.
GraphRAG facade and completion flow
graphrag_sdk/src/graphrag_sdk/api/main.py
Passes ctx into graph-config validation, retrieval, and reranking; inserts budget checkpoints before question rewrite, retrieval, and final LLM call; tightens conversation-history dict validation; sets provider timeouts from ctx.
Connection, loaders, and pipeline logging
graphrag_sdk/src/graphrag_sdk/core/connection.py, graphrag_sdk/src/graphrag_sdk/ingestion/loaders/*.py, graphrag_sdk/src/graphrag_sdk/ingestion/pipeline.py
Adds structured error logging and debug tracebacks for non-transient FalkorDB failures, loader read failures, and ingestion pipeline unexpected exceptions; re-raises original/typed exceptions.
Reranking and cosine reranker
graphrag_sdk/src/graphrag_sdk/retrieval/reranking_strategies/cosine.py, graphrag_sdk/src/graphrag_sdk/retrieval/strategies/result_assembly.py
Enforces budget for reranking embedding fallback and passes timeout derived from ctx.remaining_budget_seconds.
Tests: budgets, timeouts, and logging
graphrag_sdk/tests/*
Extensive tests added/updated for Context.ensure_budget, provider timeout validation and behavior, LatencyBudgetExceededError propagation across retrieval and facade, loader/pipeline log assertions, and integration test tagging.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant GraphRAG
  participant Context
  participant Retrieval
  participant Embedder
  participant LLM
  Client->>GraphRAG: retrieve(query, ctx)
  GraphRAG->>Context: ensure_budget(graph-config)
  GraphRAG->>GraphRAG: _validate_graph_config(ctx)
  GraphRAG->>Context: ensure_budget(completion retrieval)
  GraphRAG->>Retrieval: search(ctx)
  Retrieval->>Context: ensure_budget(query embedding)
  Retrieval->>Embedder: aembed_query(text, timeout=ctx.remaining_budget_seconds)
  Retrieval->>Context: ensure_budget(chunk search)
  Retrieval->>Retrieval: vector_store.search_chunks(...)
  Retrieval->>Context: ensure_budget(rerank/embed)
  Retrieval->>Embedder: aembed_documents(..., timeout=ctx.remaining_budget_seconds)
  GraphRAG->>Context: ensure_budget(completion LLM call)
  GraphRAG->>LLM: ainvoke_messages(messages, timeout=ctx.remaining_budget_seconds)
  LLM-->>GraphRAG: response
  GraphRAG-->>Client: result
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

  • Production Hardening & Release Readiness #257: Implements production-hardening items (typed timeouts, wait_for_provider_call, Context.ensure_budget, LatencyBudgetExceededError, improved logging, CI/workflow updates) similar to this PR.

Possibly related PRs

"🐰 I counted ms like carrot crumbs,
Budgets checked where timeouts hum,
Retrieval hops pause, then spring,
Logs whisper truths; tests make me sing,
Hooray — the rabbit hops, 'All green, run!'"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 36.32% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Reliability hardening suite' directly reflects the primary objectives of the PR: enforcing timeouts, improving error visibility, adding budget enforcement, and enhancing reliability across the SDK.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch reliability-hardening-suite

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Comment thread graphrag_sdk/tests/test_facade.py Fixed
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread graphrag_sdk/tests/test_facade.py Fixed
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@Naseem77 Naseem77 requested a review from galshubeli May 18, 2026 17:19
@Naseem77 Naseem77 linked an issue May 18, 2026 that may be closed by this pull request
6 tasks
Copy link
Copy Markdown
Collaborator

@galshubeli galshubeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deep-dive review of the reliability hardening suite. Solid intent and real value, but three load-bearing issues need addressing before merge. Posted as line comments — summary below.

Blocking (Critical):

  • C1: connection.py silently wraps every DB error in DatabaseError — breaking change for any downstream except falkordb.<Specific>Error clause, and FalkorDBConnection is a re-exported public class.
  • C2: the latency budget is never wired into provider timeout=. Once a slow LLM call is in flight, the budget can fly past the threshold and nothing aborts it.
  • C3: aembed_documents(..., timeout=10) applies the timeout per batch (and per binary-split sub-call), not as an overall deadline — the public contract reads as 10s but wall-clock total can be 50s+.

Wins worth keeping: the MENTIONED_IN cosine-rank merge from #259, the history-validation refactor in api/main.py (error messages were wrong before, are right now), the secret-leakage test in test_providers.py, the PdfLoader open/finally fix. Tests pass cleanly (393/393 on touched files).

Comment thread graphrag_sdk/src/graphrag_sdk/core/connection.py Outdated
Comment thread graphrag_sdk/src/graphrag_sdk/core/connection.py Outdated
Comment thread graphrag_sdk/src/graphrag_sdk/retrieval/strategies/multi_path.py
Comment thread graphrag_sdk/src/graphrag_sdk/core/providers/litellm.py
Comment thread graphrag_sdk/src/graphrag_sdk/core/providers/base.py Outdated
Comment thread graphrag_sdk/src/graphrag_sdk/core/providers/base.py Outdated
Comment thread graphrag_sdk/src/graphrag_sdk/api/main.py Outdated
Comment thread graphrag_sdk/tests/test_facade.py Outdated
Comment thread .github/dependabot.yml
Comment thread docker-compose.yml Outdated
Naseem77 and others added 4 commits May 20, 2026 10:58
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
graphrag_sdk/src/graphrag_sdk/core/providers/openrouter.py (1)

161-166: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use an overall deadline for retried LLM calls.

Like the LiteLLM path, this treats timeout as per-attempt, not per-operation. With retries enabled, the request can exceed the caller's remaining latency budget by the backoff plus extra attempts. Recompute a remaining timeout from a monotonic deadline on each loop iteration so the whole ainvoke* call stays within the passed budget.

Also applies to: 216-221

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@graphrag_sdk/src/graphrag_sdk/core/providers/openrouter.py` around lines 161
- 166, The current call to
wait_for_provider_call(client.chat.completions.create(**create_kwargs),
timeout=timeout, ...) treats timeout as per-attempt; change the retry loop to
compute a monotonic deadline once at the start and, before each attempt,
recompute remaining_timeout = max(0, deadline - time.monotonic()) and pass that
remaining_timeout into wait_for_provider_call (and into any backoff/wait logic)
so the whole ainvoke/chat completion operation respects the original budget;
apply the same deadline-based timeout recomputation to the other OpenRouter
completion call site that uses client.chat.completions.create and model_name.
graphrag_sdk/src/graphrag_sdk/core/providers/litellm.py (1)

138-143: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make timeout a deadline across retries.

These calls reuse the same timeout on every retry attempt, so ainvoke(..., timeout=2, max_retries=3) can still run well past 2 seconds once retry backoff is included. That breaks the new latency-budget plumbing, because callers now pass ctx.remaining_budget_seconds expecting the whole provider operation to stay inside the remaining budget. Mirror the deadline/remaining_timeout() pattern already used in aembed_documents so retries and sleeps consume the same budget.

Also applies to: 227-232

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@graphrag_sdk/src/graphrag_sdk/core/providers/litellm.py` around lines 138 -
143, The wait_for_provider_call invocation in LiteLLM methods (e.g., the
acompletion call that uses self._completion_kwargs and model_name) currently
reuses the same timeout for every retry, which can exceed a caller's overall
latency budget; change the logic to compute a deadline/remaining timeout per
attempt (mirror the remaining_timeout()/deadline pattern used in
aembed_documents) so each retry and any backoff/sleep consumes from the same
budget and you pass the updated remaining timeout into wait_for_provider_call
(and any sleep/backoff) rather than the original static timeout; update both the
completion path around acompletion and the other occurrence noted (lines
~227-232) to use this per-attempt remaining timeout.
🧹 Nitpick comments (1)
graphrag_sdk/tests/test_connection.py (1)

117-117: ⚡ Quick win

Tighten exception type assertions to match the “typed” intent.

Using pytest.raises(Exception) is too broad and may pass on unintended exception paths. Use explicit exception classes in side_effect and assert those exact classes are propagated.

✅ Suggested test hardening
-        mock_graph.query = AsyncMock(side_effect=Exception("always fails"))
+        class AlwaysFailsError(Exception):
+            pass
+        mock_graph.query = AsyncMock(side_effect=AlwaysFailsError("always fails"))
@@
-            with pytest.raises(Exception, match="always fails"):
+            with pytest.raises(AlwaysFailsError, match="always fails"):
                 await conn.query("MATCH (n) RETURN n")
@@
-        mock_graph.query = AsyncMock(side_effect=Exception("already indexed"))
+        class AlreadyIndexedError(Exception):
+            pass
+        mock_graph.query = AsyncMock(side_effect=AlreadyIndexedError("already indexed"))
@@
-            with pytest.raises(Exception, match="already indexed"):
+            with pytest.raises(AlreadyIndexedError, match="already indexed"):
                 await conn.query("CREATE INDEX idx")

Also applies to: 131-132

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@graphrag_sdk/tests/test_connection.py` at line 117, The tests use
pytest.raises(Exception, match="always fails") which is too broad; change the
mocked side_effect to raise a specific exception class (e.g., RuntimeError or
the actual custom exception used in the code) and update the pytest.raises
call(s) to assert that exact exception type instead of Exception for the
occurrences where pytest.raises(Exception, match="always fails") appears; ensure
both instances (the shown pytest.raises call and the other occurrence later in
the file tied to the same mock/side_effect) use the same explicit exception
class so the test verifies the intended error path.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@graphrag_sdk/src/graphrag_sdk/core/providers/_timeout.py`:
- Around line 18-27: The validate_timeout function currently raises a raw
ValueError when timeout <= 0 which bypasses the typed timeout error contract;
change validate_timeout(timeout: float | None) to raise the module's
timeout_error (not ValueError) when timeout is not None and timeout <= 0,
including the timeout value in the message (e.g. f"timed out after
{timeout:.3g}s") so callers and the except block keep receiving the typed
timeout error; update references to validate_timeout and keep the existing await
asyncio.wait_for/except behavior unchanged.

In `@graphrag_sdk/tests/test_providers.py`:
- Around line 729-734: The wall-clock assertion in the test around
embedder.aembed_documents is too strict and can flake; relax the timing ceiling
or mock the clock: change the assertion `assert time.monotonic() - started <
0.05` to a looser bound (for example `< 0.1` or `< 0.2`) or replace
time.monotonic with a mocked clock to avoid scheduler jitter, leaving the rest
of the test (the with pytest.raises(EmbeddingTimeoutError) block and the `assert
mock_litellm.aembedding.await_count == 2`) unchanged.

---

Outside diff comments:
In `@graphrag_sdk/src/graphrag_sdk/core/providers/litellm.py`:
- Around line 138-143: The wait_for_provider_call invocation in LiteLLM methods
(e.g., the acompletion call that uses self._completion_kwargs and model_name)
currently reuses the same timeout for every retry, which can exceed a caller's
overall latency budget; change the logic to compute a deadline/remaining timeout
per attempt (mirror the remaining_timeout()/deadline pattern used in
aembed_documents) so each retry and any backoff/sleep consumes from the same
budget and you pass the updated remaining timeout into wait_for_provider_call
(and any sleep/backoff) rather than the original static timeout; update both the
completion path around acompletion and the other occurrence noted (lines
~227-232) to use this per-attempt remaining timeout.

In `@graphrag_sdk/src/graphrag_sdk/core/providers/openrouter.py`:
- Around line 161-166: The current call to
wait_for_provider_call(client.chat.completions.create(**create_kwargs),
timeout=timeout, ...) treats timeout as per-attempt; change the retry loop to
compute a monotonic deadline once at the start and, before each attempt,
recompute remaining_timeout = max(0, deadline - time.monotonic()) and pass that
remaining_timeout into wait_for_provider_call (and into any backoff/wait logic)
so the whole ainvoke/chat completion operation respects the original budget;
apply the same deadline-based timeout recomputation to the other OpenRouter
completion call site that uses client.chat.completions.create and model_name.

---

Nitpick comments:
In `@graphrag_sdk/tests/test_connection.py`:
- Line 117: The tests use pytest.raises(Exception, match="always fails") which
is too broad; change the mocked side_effect to raise a specific exception class
(e.g., RuntimeError or the actual custom exception used in the code) and update
the pytest.raises call(s) to assert that exact exception type instead of
Exception for the occurrences where pytest.raises(Exception, match="always
fails") appears; ensure both instances (the shown pytest.raises call and the
other occurrence later in the file tied to the same mock/side_effect) use the
same explicit exception class so the test verifies the intended error path.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 46226887-d6df-4e6d-8508-2f1351babea6

📥 Commits

Reviewing files that changed from the base of the PR and between 3eca0b4 and fea6b86.

📒 Files selected for processing (21)
  • docker-compose.yml
  • graphrag_sdk/src/graphrag_sdk/api/main.py
  • graphrag_sdk/src/graphrag_sdk/core/connection.py
  • graphrag_sdk/src/graphrag_sdk/core/context.py
  • graphrag_sdk/src/graphrag_sdk/core/providers/_timeout.py
  • graphrag_sdk/src/graphrag_sdk/core/providers/base.py
  • graphrag_sdk/src/graphrag_sdk/core/providers/litellm.py
  • graphrag_sdk/src/graphrag_sdk/core/providers/openrouter.py
  • graphrag_sdk/src/graphrag_sdk/retrieval/reranking_strategies/cosine.py
  • graphrag_sdk/src/graphrag_sdk/retrieval/strategies/chunk_retrieval.py
  • graphrag_sdk/src/graphrag_sdk/retrieval/strategies/cypher_generation.py
  • graphrag_sdk/src/graphrag_sdk/retrieval/strategies/entity_discovery.py
  • graphrag_sdk/src/graphrag_sdk/retrieval/strategies/local.py
  • graphrag_sdk/src/graphrag_sdk/retrieval/strategies/multi_path.py
  • graphrag_sdk/src/graphrag_sdk/retrieval/strategies/relationship_expansion.py
  • graphrag_sdk/src/graphrag_sdk/retrieval/strategies/result_assembly.py
  • graphrag_sdk/tests/test_connection.py
  • graphrag_sdk/tests/test_facade.py
  • graphrag_sdk/tests/test_multi_path_retrieval.py
  • graphrag_sdk/tests/test_providers.py
  • graphrag_sdk/tests/test_retrieval.py
✅ Files skipped from review due to trivial changes (1)
  • docker-compose.yml

Comment thread graphrag_sdk/src/graphrag_sdk/core/providers/_timeout.py Outdated
Comment thread graphrag_sdk/tests/test_providers.py
Naseem77 and others added 2 commits May 20, 2026 11:20
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@graphrag_sdk/src/graphrag_sdk/core/context.py`:
- Line 57: The method remaining_budget_seconds currently returns a tiny positive
1e-9 even when the budget is exhausted, which conflicts with budget_exceeded and
can allow a call to start with a non-zero timeout; update
remaining_budget_seconds (in the Context class/function named
remaining_budget_seconds) to return 0.0 when remaining milliseconds <= 0 (e.g.,
if remaining <= 0 return 0.0 else return remaining/1000.0) so an exhausted
budget yields exactly zero seconds and aligns with budget_exceeded semantics.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6f48d8e8-50d7-492e-8564-7ec08c41f346

📥 Commits

Reviewing files that changed from the base of the PR and between fea6b86 and d68c1c1.

📒 Files selected for processing (3)
  • graphrag_sdk/src/graphrag_sdk/core/context.py
  • graphrag_sdk/src/graphrag_sdk/core/providers/_timeout.py
  • graphrag_sdk/tests/test_providers.py

Comment thread graphrag_sdk/src/graphrag_sdk/core/context.py
Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com>
@Naseem77 Naseem77 requested a review from galshubeli May 20, 2026 08:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Production Hardening & Release Readiness

2 participants