
perf: avoid context deep-clone and flags serde double-pass #9

Draft
gagantrivedi wants to merge 2 commits into main from perf/flags-arc-direct-json

Conversation


@gagantrivedi gagantrivedi commented Apr 24, 2026

Summary

Two independent hot-path wins on the identities and flags endpoints.

1. LocalMemEnvironmentsCache::get_context returns Arc<EngineEvaluationContext>

The context holds every feature, segment, rule, and condition for an environment. The previous implementation deep-cloned it on every request. Arc::clone is O(1) and moves the allocation to the polling refresh path. Call sites in EnvironmentService dereference the Arc before passing to the engine — no behaviour change.
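A minimal sketch of the pattern, with hypothetical field and struct bodies (the real `EngineEvaluationContext` and cache internals are not shown in this PR): `get_context` hands out an `Arc` clone, so the request path pays a reference-count bump while the allocation stays on the polling refresh path.

```rust
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for EngineEvaluationContext: in the real code this
// holds every feature, segment, rule, and condition for an environment.
struct EngineEvaluationContext {
    features: Vec<String>,
}

// Minimal cache sketch: reads return Arc<...> instead of a deep clone.
struct LocalMemEnvironmentsCache {
    context: RwLock<Option<Arc<EngineEvaluationContext>>>,
}

impl LocalMemEnvironmentsCache {
    fn new() -> Self {
        Self { context: RwLock::new(None) }
    }

    // Called from the polling refresh path: the one-time allocation
    // happens here, not per request.
    fn refresh(&self, ctx: EngineEvaluationContext) {
        *self.context.write().unwrap() = Some(Arc::new(ctx));
    }

    // Hot path: Arc::clone is O(1) regardless of context size.
    fn get_context(&self) -> Option<Arc<EngineEvaluationContext>> {
        self.context.read().unwrap().as_ref().map(Arc::clone)
    }
}

fn main() {
    let cache = LocalMemEnvironmentsCache::new();
    cache.refresh(EngineEvaluationContext { features: vec!["flag_a".into()] });
    let ctx = cache.get_context().unwrap();
    // Deref coercion lets callers pass &ctx where &EngineEvaluationContext
    // is expected, so call sites need no deep clone.
    assert_eq!(ctx.features.len(), 1);
}
```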

2. get_flags serializes directly, skipping serde_json::Value

Json<serde_json::Value> required serde_json::to_value(flags) first, which walks the whole structure once to build an intermediate Value tree; Axum's Json response then walks the tree again to produce bytes. Two full traversals per response.

A new FlagsResponse enum wraps either a single APIFeatureState or a Vec<APIFeatureState> and implements IntoResponse by delegating to Json<T>: one tree-walk straight to bytes.
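A dependency-free sketch of the idea. The enum shape and names follow the PR, but the field layout and the hand-rolled serializer here are illustrative stand-ins (the real code delegates to serde and Axum's `Json<T>`); the point is that the data is walked once, with no intermediate `serde_json::Value` tree.

```rust
// Illustrative stand-in for APIFeatureState; real fields differ.
struct APIFeatureState {
    feature: String,
    enabled: bool,
}

// FlagsResponse wraps either one feature state or many, matching the
// two shapes the flags endpoints return.
enum FlagsResponse {
    Single(APIFeatureState),
    Many(Vec<APIFeatureState>),
}

impl APIFeatureState {
    fn to_json(&self) -> String {
        format!(r#"{{"feature":"{}","enabled":{}}}"#, self.feature, self.enabled)
    }
}

impl FlagsResponse {
    // Single pass from data to bytes; no intermediate Value tree.
    fn into_body(self) -> String {
        match self {
            FlagsResponse::Single(f) => f.to_json(),
            FlagsResponse::Many(fs) => {
                let items: Vec<String> = fs.iter().map(|f| f.to_json()).collect();
                format!("[{}]", items.join(","))
            }
        }
    }
}

fn main() {
    let resp = FlagsResponse::Many(vec![APIFeatureState {
        feature: "beta_ui".into(),
        enabled: true,
    }]);
    assert_eq!(resp.into_body(), r#"[{"feature":"beta_ui","enabled":true}]"#);
}
```

In the real handler, an `#[serde(untagged)]` enum with a blanket `IntoResponse` impl over `Json<Self>` achieves the same single traversal.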

Measured impact

Local Docker, 1 vCPU / 2 GB, identities endpoint, wrk with 3 trait-matching segment conditions, endpoint caches disabled:

| Workload | Metric | baseline | this PR | delta |
|---|---|---|---|---|
| Small (50f / 15s / 750 overrides / 1KB values) | RPS @ c=50 | 2,957 | 4,276 | +45% |
| | p99 @ c=200 | 77 ms | 51 ms | -34% |
| Medium (200f / 50s / 8.7K overrides / 1KB values) | RPS @ c=50 | 268 | 462 | +72% |
| | RPS @ c=200 | 230 | 510 | +121% |
| | p99 @ c=200 | 2.14 s | 412 ms | -81% |

The medium project benefits disproportionately because the context deep-clone cost grows with project size; Arc::clone is constant-time either way.

Isolated microbenchmarks (criterion, not part of this PR) showed cache.get_context() drops from 5.78 µs → 32 ns and flags serialization drops from 13.27 µs → 3.46 µs for a 50-feature response.

@gagantrivedi gagantrivedi marked this pull request as draft April 24, 2026 11:33
Fixes clippy::explicit_auto_deref. Rust auto-derefs &Arc<T> to &T
via the Deref trait when coercing to function args, so the explicit
&* was redundant. No behaviour change.
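The coercion the commit message describes can be shown in a small standalone sketch (function and variable names are illustrative, not from the PR): at a call site expecting `&T`, a `&Arc<T>` coerces through the `Deref` chain, so the explicit `&*` reborrow is unnecessary.

```rust
use std::sync::Arc;

// Takes a slice; callers holding an Arc<Vec<String>> can pass &arc
// directly: &Arc<Vec<String>> -> &Vec<String> -> &[String] via Deref.
fn total_len(items: &[String]) -> usize {
    items.iter().map(|s| s.len()).sum()
}

fn main() {
    let shared = Arc::new(vec!["a".to_string(), "bc".to_string()]);
    // `&*shared` compiles too, but clippy::explicit_auto_deref flags it
    // as redundant; both forms call the same function with the same data.
    assert_eq!(total_len(&shared), total_len(&*shared));
    assert_eq!(total_len(&shared), 3);
}
```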
