Skip to content

Add request-ID middleware that propagates X-Request-ID via ContextVar#45

Open
mvanhorn wants to merge 1 commit into
Climate-Vision:mainfrom
mvanhorn:request-id-middleware
Open

Add request-ID middleware that propagates X-Request-ID via ContextVar#45
mvanhorn wants to merge 1 commit into
Climate-Vision:mainfrom
mvanhorn:request-id-middleware

Conversation

@mvanhorn
Copy link
Copy Markdown

@mvanhorn mvanhorn commented May 5, 2026

Closes #44.

Why

AuditLogMiddleware already attaches a UUID to request.state.request_id, but anything that runs in the request lifecycle without access to the FastAPI Request object - notably inference/pipeline.py and the helper modules below it - emits log lines that aren't correlated to the inbound HTTP request.

What

Added in src/climatevision/api/middleware.py:

  • request_id_var: ContextVar[str | None] so non-FastAPI code can read the active request ID.
  • RequestIDMiddleware (BaseHTTPMiddleware) that reads X-Request-ID (UUID4 fallback), stores it on the ContextVar, mirrors it onto request.state.request_id so the existing RequestLoggingMiddleware / AuditLogMiddleware keep working, and echoes the value back on the response. ContextVar is reset() in a finally block so requests don't leak.
  • RequestIDLogFilter(logging.Filter) that exposes the ContextVar as record.request_id for log formatters.

setup_logging now installs the filter on the root logger and attaches it to existing handlers, and the JSON log format gains "request_id":"%(request_id)s".

create_app registers RequestIDMiddleware last in add_middleware, which means starlette runs it OUTERMOST -- the ContextVar is set before any other middleware or route handler runs.

Tests

tests/test_request_id_middleware.py covers the three cases in the issue plus a leak check:

  • request without X-Request-ID -> response has a UUID-shaped X-Request-ID and the route reads the same value
  • request with explicit X-Request-ID: <id> -> response echoes the same value back
  • log record emitted from a handler exposes record.request_id matching the inbound id (verified via the filter on a StreamHandler and pytest's caplog)
  • after a request finishes, request_id_var.get() is back to None

Locally:

PYTHONPATH=src python3 -m pytest tests/test_request_id_middleware.py -v

reports 4 passed in 0.10s. (The repo's full conftest pulls the inference pipeline which needs GDAL/fiona; the new test file builds a minimal FastAPI app of its own and stays independent.)

@Hopelynconsult Hopelynconsult requested a review from femi23 May 7, 2026 18:23
Copy link
Copy Markdown
Collaborator

@femi23 femi23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mvanhorn — this is a clean, well-documented PR that closes #44 the right way: ContextVar + filter so non-FastAPI code (inference pipeline, helpers) sees the same ID. Tests cover header-present, header-absent, log injection, and the leak case. Almost ready.

One small change requested and then we can merge:

Blocker (small)

RequestIDMiddleware is not actually outermost. Your comment in main.py says:

# Register the RequestIDMiddleware LAST so it sits OUTERMOST

but in the current diff it's registered before CORSMiddleware:

app.add_middleware(AuditLogMiddleware)
app.add_middleware(RequestIDMiddleware)   # <- this PR
app.add_middleware(CORSMiddleware, ...)
app.add_middleware(SecurityMiddleware)

Starlette wraps middleware in reverse add_middleware order, so SecurityMiddleware ends up outermost and RequestIDMiddleware runs after CORS and Security. That means CORS preflight responses and any SecurityMiddleware early-return won't carry X-Request-ID, and any logging those middlewares do happens before the ContextVar is set.

Please move the app.add_middleware(RequestIDMiddleware) line to after the CORSMiddleware and SecurityMiddleware registrations so it really is the outermost wrapper. The comment will then match the code.

Nits (non-blocking, fix in same push if you like)

  • In setup_logging, root.addFilter(request_id_filter) is redundant — a Filter attached to a Logger does not get re-evaluated when the record propagates to handlers, so the handler-level loop below is what actually makes %(request_id)s show up. Safe to drop the logger-level line and just keep the per-handler loop. Up to you.
  • request_id_var: ContextVar[str | None] — since you always store request.headers.get(...) or str(uuid.uuid4()), the value is never None inside a request. Tightening the type to ContextVar[str] with default="-" (matching the filter's fallback) would let downstream consumers skip the or "-" dance. Pure ergonomics.

Once the middleware-order fix lands, I'll approve and merge. Solid contribution.

@Goldokpa
Copy link
Copy Markdown
Member

📢 Heads-up: repo history was rewritten today (2026-05-18)

We force-pushed a cleaned history across all branches to remove an internal directory from past commits. Your code and this PR are unaffected — only the commit SHAs underneath have shifted. GitHub will re-render the diff against the new base automatically.

If you have a local clone, please bring it back in sync before pushing anything else:

# Option A (simplest): fresh start
git clone https://github.com/Climate-Vision/ClimateVision.git

# Option B: rebase the existing PR branch in your fork
git fetch origin
git checkout <your-branch>
git rebase origin/main          # likely no conflicts
git push --force-with-lease

Do not git pull on an existing clone — it will produce a messy non-fast-forward state. Either re-clone, or rebase explicitly as above.

Apologies for the interruption — really appreciate your patience here. If anything looks off after rebasing, leave a comment and I'll help unblock right away. Thanks for contributing 🙏

Closes Climate-Vision#44.

Adds a `RequestIDMiddleware` (in `src/climatevision/api/middleware.py`)
that reads the inbound `X-Request-ID` header, falls back to a fresh
UUID4 when absent, stores the value on a `contextvars.ContextVar`
(`request_id_var`), and echoes it back on the response.

Pairs the middleware with a `RequestIDLogFilter` and wires it into
`setup_logging` so every log record emitted during the request lifecycle
-- including from `inference/pipeline.py` and helper modules that don't
hold the FastAPI `Request` object -- carries `%(request_id)s` in the
JSON log format.

The middleware is registered last in `create_app` so it sits outermost
in the stack, ensuring the ContextVar is set before any other middleware
or route handler runs.

Tests in `tests/test_request_id_middleware.py` cover the three cases the
issue called out:

  - request without `X-Request-ID` -> response carries a UUID-shaped value
  - request with explicit `X-Request-ID` -> response echoes it back
  - log record emitted inside the handler exposes `record.request_id`

Plus a leak test asserting the ContextVar resets after the response.
@mvanhorn mvanhorn force-pushed the request-id-middleware branch from 6dd365f to cac30bc Compare May 17, 2026 22:50
@mvanhorn
Copy link
Copy Markdown
Author

Thanks for the careful read @femi23, and @Goldokpa thanks for the heads-up on the history rewrite.

Pushed cac30bc:

  • Rebased onto the new upstream/main (post force-push). Single commit replayed cleanly with git rebase --onto upstream/main <old-base> request-id-middleware, no conflicts.
  • Moved app.add_middleware(RequestIDMiddleware) to AFTER app.add_middleware(CORSMiddleware, ...) so it really is the outermost wrapper. Updated the comment to also call out that Starlette wraps in reverse add_middleware order, since that was the source of the confusion. (Note: there's no SecurityMiddleware in main.py today; CORS was the one immediately ahead of it, so I moved it past CORS.)
  • Left the two nits alone for now, happy to fold them in if you want. The root.addFilter line is genuinely a no-op for the propagated-record case, but it's harmless and a bit defensive against future changes to handler registration order; let me know if you'd rather have it removed. The ContextVar[str | None] to ContextVar[str] with default="-" tightening makes sense to me too; I'd just want to also drop the or "-" fallback in RequestIDFilter.filter in the same patch so the two ends agree.

Ready for re-review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Good First Issue] Add request-ID middleware that propagates X-Request-ID through inference logs

3 participants