Flagsmith · gagantrivedi · May 4, 2026 · May 4, 2026 · May 4, 2026
@@ -5,8 +5,8 @@ sidebar_position: 3
 ---
 
 The [Edge Proxy](/performance/edge-proxy) runs as a
-[Docker container](https://hub.docker.com/repository/docker/flagsmith/edge-proxy) with no external dependencies.
-It connects to the Flagsmith API to download environment documents, and your Flagsmith client applications connect to it
+[Docker container](https://hub.docker.com/repository/docker/flagsmith/edge-proxy) with no external dependencies. It
+connects to the Flagsmith API to download environment documents, and your Flagsmith client applications connect to it
 using [remote flag evaluation](/integrating-with-flagsmith/sdks#remote-evaluation).
 
 The examples below assume you have a configuration file located at `./config.json`. Your Flagsmith client applications
@@ -159,8 +159,8 @@ When set to `true`, the Edge Proxy will use the `X-Forwarded-For` and `X-Forward
 client IP addresses. This is useful if the Edge Proxy is running behind a reverse proxy, and you want the
 [access logs](#loggingoverride) to show the real IP addresses of your clients.
 
-By default, only the loopback address is trusted. This can be changed with the [`FORWARDED_ALLOW_IPS` environment
-variable](#environment-variables).
+By default, only the loopback address is trusted. This can be changed with the
+[`FORWARDED_ALLOW_IPS` environment variable](#environment-variables).
 
 ```json
 "server": {
@@ -270,9 +270,9 @@ specified by the [`"logging.log_format"`](#logginglog_format) setting.
 
 The Edge Proxy exposes two health check endpoints:
 
-* `/proxy/health/liveness`: Always responds with a 200 status code. Use this health check to determine if the Edge
-  Proxy is alive and able to respond to requests.
-* `/proxy/health/readiness`: Responds with a 200 status if the Edge Proxy was able to fetch all its configured
+- `/proxy/health/liveness`: Always responds with a 200 status code. Use this health check to determine if the Edge Proxy
+  is alive and able to respond to requests.
+- `/proxy/health/readiness`: Responds with a 200 status if the Edge Proxy was able to fetch all its configured
   environment documents within a configurable grace period. This allows the Edge Proxy to continue reporting as healthy
   even if the Flagsmith API is temporarily unavailable. This health check is also available at `/proxy/health`.
 
@@ -304,11 +304,208 @@ return 200
 
 Some Edge Proxy settings can only be set using environment variables:
 
-- `WEB_CONCURRENCY` The number of [Uvicorn](https://www.uvicorn.org/) workers. Defaults to `1`, which is 
+- `WEB_CONCURRENCY` The number of [Uvicorn](https://www.uvicorn.org/) workers. Defaults to `1`, which is
   [recommended when running multiple Edge Proxy containers behind a load balancer](https://fastapi.tiangolo.com/deployment/docker/#one-load-balancer-multiple-worker-containers).
-  If running on a single node, set this [based on your number of CPU cores and threads](https://docs.gunicorn.org/en/latest/design.html#how-many-workers).
-- `HTTP_PROXY`, `HTTPS_PROXY`, `ALL_PROXY`, `NO_PROXY`: These variables let you configure an HTTP proxy that the
-  Edge Proxy should use for all its outgoing HTTP requests.
-  [Learn more](https://www.python-httpx.org/environment_variables)
+  If running on a single node, set this
+  [based on your number of CPU cores and threads](https://docs.gunicorn.org/en/latest/design.html#how-many-workers).
+- `HTTP_PROXY`, `HTTPS_PROXY`, `ALL_PROXY`, `NO_PROXY`: These variables let you configure an HTTP proxy that the Edge
+  Proxy should use for all its outgoing HTTP requests. [Learn more](https://www.python-httpx.org/environment_variables)
 - `FORWARDED_ALLOW_IPS`: Which IPs to trust for determining client IP addresses when using the `proxy_headers` option.
   For more details, see the [Uvicorn documentation](https://www.uvicorn.org/settings/#http).
+
+## Identity overrides
+
+Identity overrides defined in the dashboard are evaluated by the Edge Proxy. They are embedded in the environment
+document the proxy fetches from the Flagsmith API, and applied during local evaluation by the Flagsmith engine.
+
+For overrides to flow through to the Edge Proxy, the environment must have **Use identity overrides in local
+evaluation** enabled. This is the default for new environments.
+
+:::warning Edge Proxy version
+
+Deleting an identity override in the dashboard only propagates to the Edge Proxy on
+[v2.21.1](https://github.com/Flagsmith/edge-proxy/releases/tag/v2.21.1) and newer. Earlier versions kept the deleted
+override in their cached environment document, so the proxy returned the old overridden value. Pin to `v2.21.1` or
+later, or use `:latest`, to pick up override deletions.
+
+:::
+
+When an identity has both an override and a matching segment override, the identity override takes precedence — this
+matches the behaviour of [Local Evaluation Mode](/integrating-with-flagsmith/integration-overview#local-evaluation-mode).
+
+## Troubleshooting
+
+### 401 Unauthorized from the Edge Proxy
+
+The Edge Proxy returns `401 {"status": "unauthorized", "message": "unknown key ..."}` when the `X-Environment-Key`
+header sent by your client does not match any key configured in [`environment_key_pairs`](#environment_key_pairs).
+
+Check that:
+
+- Your client is using the **client-side** environment key, not the server-side key.
+- The client-side key in your SDK exactly matches the `client_side_key` in the proxy's configuration.
+- If you rotated keys in the dashboard, the proxy configuration was updated and the proxy was restarted.
+
+### 403 Forbidden in Edge Proxy logs
+
+A 403 in the proxy's `error_fetching_document` log line comes from the **upstream Flagsmith API** rejecting the
+configured server-side key when the proxy polls for an environment document. The proxy itself does not return 403; it
+surfaces the upstream error and keeps serving the last cached document if one exists.
+
+Diagnose in this order:
+
+1. **Key prefix and presence.** `server_side_key` values must be non-empty and start with `ser.`. The proxy validates
+   this at startup and refuses to launch otherwise — a blank or whitespace-only server key fails the same check.
+2. **Key type.** Confirm the key was created as **Server-side Environment Key** in **Environment settings → SDK Keys**.
+   Client-side keys cannot fetch environment documents.
+3. **Key freshness.** If the key was rotated or deleted in the dashboard, the proxy's cached value is now invalid.
+4. **`api_url`.** When self-hosting, [`api_url`](#api_url) must point at your Flagsmith API (e.g.
+   `https://flagsmith.example.com/api/v1`). Pointing a self-hosted proxy at `edge.api.flagsmith.com` will 403 because
+   the key does not exist on Flagsmith's hosted Edge.
+5. **Edge enablement.** On self-hosted deployments where Edge is enabled per-project, ensure the project the environment
+   belongs to is permitted to serve environment documents.
+
+### Restart loops in ECS, Kubernetes, or other orchestrators
+
+The most common cause is the orchestrator's readiness probe firing before the proxy has fetched its first environment
+document, or fluctuating to unhealthy whenever the upstream API is briefly slow.
+
+- Point readiness probes at [`/proxy/health/readiness`](#health-checks) and liveness probes at `/proxy/health/liveness`.
+  **Do not** point liveness at readiness — a transient upstream outage will then kill the container instead of letting
+  it serve cached documents.
+- Increase the readiness probe's `initialDelaySeconds` (Kubernetes) or `startPeriod` (ECS) to comfortably exceed the
+  time it takes to fetch all configured environment documents on a cold start.
+- If you serve many environments from a single proxy, raise
+  [`health_check.environment_update_grace_period_seconds`](#health_checkenvironment_update_grace_period_seconds) or set
+  it to `null` to keep the proxy healthy when the upstream API is intermittently unavailable.
+
+### Stale flags after a dashboard change
+
+The proxy serves cached environment documents and only re-fetches every
+[`api_poll_frequency_seconds`](#api_poll_frequency_seconds) (default 10s). It also uses `If-Modified-Since` and will log
+a 304 when the upstream document hasn't changed.
+
+To diagnose:
+
+- Set [`logging.log_level`](#logginglog_level) to `DEBUG` and watch for `environment_updated` log events after you
+  publish a change.
+- Verify the proxy can reach the upstream API. A 5xx, timeout, or 403 from the upstream API will leave the proxy serving
+  the last successfully-fetched document.
+- For very fast propagation requirements, lower `api_poll_frequency_seconds`, but be aware this increases load on the
+  upstream API proportionally.
+
+### Identity-based evaluation returns the wrong value
+
+If your client is hitting the proxy and the result differs from a direct API call:
+
+- Confirm you are sending the full set of traits on every request. The proxy is stateless and does not persist traits
+  between calls — see [Managing Traits](/performance/edge-proxy#managing-traits).
+- If the result was correct before and is now stale after deleting an identity override, upgrade to Edge Proxy v2.21.1
+  or newer (see [Identity overrides](#identity-overrides)).
+- Disable [endpoint caches](#endpoint_caches) temporarily to rule out a cached response.
+
+## Production deployment
+
+### Behind a reverse proxy or load balancer
+
+- Set [`server.proxy_headers`](#serverproxy_headers) to `true` so access logs record the real client IP.
+- Use the [`FORWARDED_ALLOW_IPS`](#environment-variables) environment variable to list the load balancer's IPs.
+- Run multiple Edge Proxy containers behind the load balancer with `WEB_CONCURRENCY=1` per container, as recommended by
+  FastAPI. The proxy is stateless, so any instance can serve any request.
+- Health-check path on the load balancer should be `/proxy/health/readiness`.
+
+### ECS / Fargate
+
+- Map container port `8000` and front the service with an ALB or NLB.
+- Set the ECS health check `command` or the target group health check path to `/proxy/health/readiness`.
+- Use `startPeriod` on the ECS health check (typically 30–60s) so the task is not killed during initial document
+  fetches.
+- The task needs outbound internet (or VPC routing) to reach the Flagsmith API. If you use a forward proxy, set
+  `HTTPS_PROXY` and `NO_PROXY` on the task definition.
+- Mount your `config.json` as a file (for example, via a sidecar that pulls from S3 or AWS Secrets Manager) rather than
+  baking server-side keys into the image.
+
+### Kubernetes
+
+The Edge Proxy is a stateless Deployment. There is no official Helm chart at the time of writing; a minimal manifest
+looks like this:
+
+```yaml title="edge-proxy.yaml"
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ name: edge-proxy
+spec:
+ replicas: 2
+ selector:
+  matchLabels: { app: edge-proxy }
+ template:
+  metadata:
+   labels: { app: edge-proxy }
+  spec:
+   containers:
+    - name: edge-proxy
+      image: flagsmith/edge-proxy:latest
+      ports:
+       - containerPort: 8000
+      readinessProbe:
+       httpGet: { path: /proxy/health/readiness, port: 8000 }
+       initialDelaySeconds: 10
+      livenessProbe:
+       httpGet: { path: /proxy/health/liveness, port: 8000 }
+      volumeMounts:
+       - name: config
+         mountPath: /app/config.json
+         subPath: config.json
+   volumes:
+    - name: config
+      secret:
+       secretName: edge-proxy-config
+```
+
+Store `config.json` in a `Secret` (it contains server-side keys). Scale with `replicas` or an HPA on CPU.
+
+### Managing configuration in CI/CD
+
+`config.json` contains server-side environment keys and should be treated as a secret:
+
+- Keep the file out of version control. Render it at deploy time from your secrets store (Vault, AWS Secrets Manager,
+  GCP Secret Manager, Kubernetes `Secret`, etc.).
+- If a static-analysis tool flags committed keys, rotate them in the dashboard immediately and move the new keys into
+  your secrets store.
+- An empty `client_side_key` is a configuration error — both keys are required for the pair to be usable.
+
+## Architecture and scaling
+
+The Edge Proxy is stateless: each instance independently polls the Flagsmith API and serves cached environment
+documents, so it scales linearly behind a load balancer.
+
+When sizing a fleet:
+
+- Each proxy instance polls the upstream API once per environment per
+  [`api_poll_frequency_seconds`](#api_poll_frequency_seconds), so adding instances multiplies the polling load on the
+  upstream API. With `If-Modified-Since` (Edge Proxy v2.19.0+, Flagsmith API v2.176.0+) most polls return 304 and cost
+  very little.
+- Enable [`endpoint_caches`](#endpoint_caches) for `flags` and `identities` if you have many repeating requests. Caches
+  are scoped per-process and cleared whenever the environment document changes, so they cannot serve stale data after a
+  dashboard change.
+- The proxy is CPU-bound on `flags` and `identities` (engine evaluation) and bandwidth-bound on `environment-document`
+  (large response body). Scale on CPU for the first two, and on outbound network for the third.
+
+### Reference throughput per instance
+
+The numbers below come from internal benchmarks of `flagsmith/edge-proxy:2.21.2` running as a single-worker container on
+a 1 vCPU / 2 GB AWS Fargate task, with endpoint caches **disabled** (worst case — every request runs a full evaluation).
+Use them as starting-point sizing; real throughput depends on project shape, segment complexity, and trait counts.
+
+Project profile: 50 features, 15 segments, every feature overridden by every segment (750 segment overrides total), each
+segment matching on 15 trait conditions.
+
+| Endpoint                           | Peak RPS | Sweet spot (concurrency) |
+| ---------------------------------- | -------: | -----------------------: |
+| `POST /api/v1/identities/`         |      ~72 |                       25 |
+| `GET /api/v1/flags/`               |      ~63 |                       10 |
+| `GET /api/v1/environment-document` |     ~570 |                       25 |
+
+To raise per-instance throughput, run more containers behind the load balancer with `WEB_CONCURRENCY=1` per container,
+or increase `WEB_CONCURRENCY` and the container's CPU allocation when running a single container per node.