
fix: follow environment-document pagination Link headers#11

Merged
gagantrivedi merged 1 commit into main from fix/env-document-pagination
May 4, 2026

Conversation

@gagantrivedi
Member

Summary

  • Walks RFC 5988 Link: rel="next" cursors on /environment-document and merges follow-up pages' identity_overrides into the page-1 base.
  • Closes the silent data loss on environments that exceed the upstream's ~1 MB page boundary; single-page environments are unchanged.
  • Logs a warn! per refresh if total fetch time crosses api_poll_frequency_seconds, naming both durations.

Closes #10. Reference: Flagsmith/flagsmith-nodejs-client#205.
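The Link-header walk in the first bullet can be sketched with a minimal std-only parser. This is an illustrative stand-in, not the PR's actual reqwest-based code; the function name `next_link` is hypothetical:

```rust
/// Sketch: extract the `rel="next"` target URL from an RFC 5988 `Link`
/// header value. Handles multi-token rels (e.g. rel="prev next") and
/// unquoted/mixed-case rel parameters, as described in the PR.
fn next_link(header: &str) -> Option<String> {
    // A Link header value is a comma-separated list of link-values:
    //   <https://…>; rel="prev", <https://…>; rel="next"
    for part in header.split(',') {
        let mut segments = part.split(';');
        // The first segment is the URI reference in angle brackets.
        let target = segments.next()?.trim();
        let url = target.strip_prefix('<')?.strip_suffix('>')?;
        // Remaining segments are parameters; look for a rel containing "next".
        for param in segments {
            if let Some(rels) = param.trim().strip_prefix("rel=").map(|v| v.trim_matches('"')) {
                if rels.split_whitespace().any(|r| r.eq_ignore_ascii_case("next")) {
                    return Some(url.to_string());
                }
            }
        }
    }
    None
}
```

A relative `next` URL would additionally need to be resolved against the base API URL (the real code uses `Url::parse` for that); this sketch just returns the target as-is.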

Test plan

  • 8 unit tests + 4 wiremock integration tests covering relative/absolute next URLs, multi-rel parsing, single-page parity, and slow-fetch warning emission.
  • cargo test, cargo clippy --all-targets, cargo fmt --check all clean.
  • Live-verified against a project with 1100 identity overrides spanning two real pages — per-identity flag evaluation matches direct Edge API output for users on both pages.

Walk RFC 5988 Link: rel="next" cursors on /environment-document and
merge follow-up pages' identity_overrides into the page-1 base. Without
this the proxy silently drops every override past the first ~1 MB page
and serves stale flag values for affected identities.

A single warn! fires per refresh if total fetch time crosses
api_poll_frequency_seconds, naming both durations.

Refs #10

@Zaimwa9 Zaimwa9 left a comment


Looks great, with one non-blocking nit. It mirrors the Node implementation, if I recall correctly?

Also wondering whether we should have an exit strategy in case the loop never finishes (can it?)

let base = Url::parse(api_url).ok()?;

for header_value in headers.get_all(reqwest::header::LINK).iter() {
    let raw = header_value.to_str().ok()?;


Nit: maybe overprotective, but is it worth using `continue` instead of `ok()?` so that an unrelated malformed link doesn't block the whole operation?

Member Author


Thought about this one a bit more. The scenario where it'd actually fire is: an upstream emits a Link header value that isn't visible-ASCII (e.g. a 0x80–0xFF byte), and also emits a separate valid rel="next" Link entry that we'd miss because the bad one short-circuits the parse.

RFC 5988 says Link values must be ASCII, and Flagsmith's Edge API today emits one clean Link header per response — single value, no CDN in front rewriting headers. For this corner to bite, you'd need either an upstream bug or a downstream CDN injecting a malformed Link entry before the upstream's clean one. Neither is something we've seen, and Node behaves the same way (its regex match returns null on the same input class — silently treats it as the last page).

Going to leave as ? for now and revisit if we ever do see it in the wild — adding the continue would be one line, but the failure mode it guards against is hypothetical enough that I'd rather not encode "why is this continue?" into the code without a concrete scenario to point at. Happy to reopen if you'd rather have it preemptively though.
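For concreteness, the suggested lenient variant would look roughly like this. A hypothetical std-only sketch (the real parser works on reqwest header values, and `next_link_lenient` is an invented name), shown only to illustrate the trade-off being discussed:

```rust
/// Sketch of the reviewer's suggestion: skip (rather than abort on) a
/// malformed link-value, so one bad entry can't hide a valid rel="next"
/// appearing later in the same header.
fn next_link_lenient(header: &str) -> Option<String> {
    for part in header.split(',') {
        let mut segments = part.split(';');
        // Malformed URI segment? `continue` to the next link-value instead
        // of returning None for the whole header.
        let url = match segments
            .next()
            .map(str::trim)
            .and_then(|t| t.strip_prefix('<'))
            .and_then(|t| t.strip_suffix('>'))
        {
            Some(u) => u,
            None => continue,
        };
        let is_next = segments.any(|p| {
            p.trim()
                .strip_prefix("rel=")
                .map(|v| v.trim_matches('"'))
                .map_or(false, |rels| {
                    rels.split_whitespace().any(|r| r.eq_ignore_ascii_case("next"))
                })
        });
        if is_next {
            return Some(url.to_string());
        }
    }
    None
}
```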

@gagantrivedi
Member Author

Thanks for the review!

Yep — pretty much the same algorithm as flagsmith-nodejs-client#205. Two small differences worth flagging: the Link parser handles a few extra cases the Node regex doesn't (multi-token rels like rel="prev next", unquoted/mixed-case rel), and the merge appends in place rather than buffering all pages first, so peak memory during a refresh stays at ~2× a single page instead of growing with page count. Functional parity on the happy path, more robust on weirder inputs.

On the loop exit strategy — fair flag. Today there's no explicit page cap; the loop terminates either when there's no rel="next" (normal case) or when a request fails. The bounds we do have:

  • api_poll_timeout_seconds is per-request, so a stuck/slow upstream times out per page rather than hanging forever.
  • The slow-fetch warning fires once when total elapsed crosses api_poll_frequency_seconds, so an operator gets a signal in the logs.

So a truly runaway loop would need an upstream that responds quickly with a rel="next" cursor that never advances. Node has the same property. I had a MAX_ENVIRONMENT_DOCUMENT_PAGES = 100 safety cap in an earlier draft but pulled it — felt like overkill given the existing bounds.
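The termination behavior above can be modeled as a capped fetch loop. A hypothetical std-only sketch: `Page` and `fetch_page` stand in for the real environment-document type and HTTP call, and `max_pages` plays the role of the `MAX_ENVIRONMENT_DOCUMENT_PAGES` cap that was dropped from the final PR:

```rust
struct Page {
    identity_overrides: Vec<String>,
    next: Option<String>, // resolved rel="next" URL, if any
}

/// Follow `next` cursors, appending each page's overrides into the
/// page-1 base in place (so peak memory stays ~2x a single page),
/// with an optional hard cap on the number of pages fetched.
fn fetch_all(
    mut fetch_page: impl FnMut(&str) -> Option<Page>,
    first_url: &str,
    max_pages: usize,
) -> Option<Page> {
    let mut base = fetch_page(first_url)?;
    let mut pages = 1;
    while let Some(url) = base.next.take() {
        if pages >= max_pages {
            base.next = Some(url); // hit the safety cap: stop, keep the cursor
            break;
        }
        let mut page = fetch_page(&url)?;
        base.identity_overrides.append(&mut page.identity_overrides);
        base.next = page.next;
        pages += 1;
    }
    Some(base)
}
```

With the cap in place, even a cursor that never advances terminates after `max_pages` requests; without it, termination relies on the upstream eventually omitting `rel="next"` or a request failing, as described above.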

@gagantrivedi gagantrivedi merged commit 62bb3fc into main May 4, 2026
4 checks passed


Development

Successfully merging this pull request may close these issues.

Local evaluation silently drops identity overrides on large environments