diff --git a/src/lib/navigation.ts b/src/lib/navigation.ts index f8c26b39..3bf3fb42 100644 --- a/src/lib/navigation.ts +++ b/src/lib/navigation.ts @@ -48,6 +48,7 @@ export const tabNavigation: NavTab[] = [ { title: 'Docker Compose', href: '/docs/self-hosting/docker-compose' }, { title: 'Environment Variables', href: '/docs/self-hosting/environment' }, { title: 'System Configuration', href: '/docs/self-hosting/configuration' }, + { title: 'AgentCC Gateway', href: '/docs/self-hosting/gateway' }, { title: 'User Management', href: '/docs/self-hosting/user-management' }, { title: 'Production', href: '/docs/self-hosting/production' }, { title: 'Troubleshooting', href: '/docs/self-hosting/troubleshooting' }, diff --git a/src/pages/docs/self-hosting.mdx b/src/pages/docs/self-hosting.mdx index dd3c9fbf..236dae63 100644 --- a/src/pages/docs/self-hosting.mdx +++ b/src/pages/docs/self-hosting.mdx @@ -1,102 +1,99 @@ --- title: "Self-Hosting Future AGI: Deploy on Your Own Infrastructure" -description: "Deploy the full Future AGI platform on your own infrastructure using Docker Compose. Follow the step-by-step guide to get all services running locally." +description: "Deploy Future AGI on your own infrastructure with the self-hosted installer and Docker Compose." --- ## About -Future AGI is fully open-source. Self-hosting runs the entire stack on your machines — all traces, datasets, evaluations, and model calls stay within your network. Backend is Django, frontend is React + Vite, LLM gateway is Go. +Self-host Future AGI when you need control over data residency, network access, model credentials, and operational policy. The open-source stack runs the product UI, backend APIs, tracing and evaluation pipelines, object storage, analytics storage, background workers, and AgentCC Gateway inside your infrastructure. -Not sure if you need this? The hosted version at [app.futureagi.com](https://app.futureagi.com) is easier to operate. 
Self-host when you need **data residency**, **air-gapped environments**, **cost control at scale**, or **deep customization**. +The fastest path is `./bin/install`. It prepares `.env`, generates local secrets, starts Docker Compose, pulls the published Future AGI images, waits for the backend health check, and can create your first user. + +## What you get + +| Capability | Included in the self-hosted stack | +|---|---| +| Observability | Trace ingestion, sessions, spans, annotations, ClickHouse-backed analytics | +| Evaluation | Eval execution, background workflows, datasets, prompt and simulation workflows | +| AgentCC Gateway | Self-hosted gateway for provider routing, retries, failover, rate limits, caching, budgets, and virtual keys | +| Data ownership | Postgres, ClickHouse, Redis, and MinIO run in your environment | +| Provider control | Bring your own OpenAI, Anthropic, Gemini, Bedrock, Ollama, vLLM, or internal model endpoints | +| Deployment control | Local laptop, VM, or bare metal with Docker Compose today; Kubernetes and Helm charts are coming soon | + +## Recommended paths + +| Goal | Start here | Expected outcome | +|---|---|---| +| Try the platform locally | [Docker Compose](/docs/self-hosting/docker-compose) | Installer-driven setup on `localhost` | +| Size infrastructure | [Requirements](/docs/self-hosting/requirements) | Hardware tier, port map, platform compatibility | +| Configure model access | [AgentCC Gateway](/docs/self-hosting/gateway) | Provider keys, health check, test model call | +| Prepare for users | [Production](/docs/self-hosting/production) | TLS, secrets, backups, monitoring, upgrade runbook | +| Fix a failed boot | [Troubleshooting](/docs/self-hosting/troubleshooting) | Symptom-based commands and recovery steps | ## Quick start ```bash git clone https://github.com/future-agi/future-agi.git cd future-agi -cp .env.example .env -docker pull futureagi/future-agi:v1.8.19_base -docker compose up +./bin/install ``` -First boot builds 
from source (~10–15 min). After `Application startup complete`: - -| Service | URL | -|---|---| -| Frontend | http://localhost:3000 | -| Backend API | http://localhost:8000 | -| PeerDB UI | http://localhost:3001 — `peerdb` / `peerdb` | +The installer starts the stack from published images, waits for `http://localhost:8000/health/`, and prompts for the first user unless you pass `--skip-user-creation`. The standard self-hosting path does not build application images locally. -## Deployment options +| Service | URL | Notes | +|---|---|---| +| Frontend | [http://localhost:3000](http://localhost:3000) | Product UI | +| Backend API | [http://localhost:8000](http://localhost:8000) | API, SDK ingestion, admin | +| AgentCC Gateway | [http://localhost:8090](http://localhost:8090) | Provider routing gateway mapped from the container | -| Option | Status | -|---|---| -| Docker Compose | Available | -| Helm / Kubernetes | Coming soon | -| Air-gapped | Coming soon | + +Before sharing the instance with anyone else, review `.env`, configure TLS, and restrict access to backend and data-store ports. `./bin/install` generates local secrets for you; manually created `.env` files must not keep `CHANGEME` values. + ## Architecture -21 containers across four layers. 
- -``` +```text Browser - └─ frontend (React/nginx) - └─ backend (Django) ──── gateway (Go) ──── OpenAI · Anthropic · Gemini · Bedrock - ├── postgres primary DB + WAL replication - ├── clickhouse analytics store - ├── redis cache / pub-sub - ├── minio object storage - └── temporal ──── worker background jobs / eval pipelines - -postgres ──── PeerDB CDC ──── clickhouse (continuous replication) + -> frontend (React/nginx) + -> backend (Django API, SDK ingestion, admin) + -> agentcc-gateway (AgentCC Gateway, Go) + -> OpenAI, Anthropic, Gemini, Bedrock, Ollama, vLLM, internal models + -> temporal -> worker (evals, agent jobs, data jobs) + -> serving (embeddings and local model utilities) + -> code-executor (sandboxed eval code) + +backend -> postgres primary transactional store +backend -> redis cache, locks, pub/sub +backend -> minio S3-compatible object storage +backend -> clickhouse analytics and trace query store ``` -**Application** — `frontend` · `backend` · `worker` · `gateway` · `serving` · `code-executor` +| Layer | Services | Production note | +|---|---|---| +| Application | `frontend`, `backend`, `worker`, `serving`, `code-executor` | Scale backend and workers independently as usage grows | +| AgentCC Gateway | `agentcc-gateway` | Keep provider keys here; expose publicly only if external apps call it directly | +| Data | `postgres`, `clickhouse`, `redis`, `minio` | Use managed services for stronger durability and backup controls | +| Workflow | `temporal` | Keep close to Postgres and workers for low-latency task execution | -**Data** — `postgres` · `clickhouse` · `redis` · `minio` +## Security boundaries -**Workflow** — `temporal` +Only the frontend and backend normally need to be reachable by users. The Compose file publishes AgentCC Gateway for local testing, but production deployments should keep it private unless external applications call it directly. 
Keep Postgres, ClickHouse, Redis, MinIO, and Temporal on private networks unless you have a deliberate admin access path such as VPN or bastion. -**CDC (PeerDB)** — `peerdb-catalog` · `peerdb-temporal` · `peerdb-minio` · `peerdb-flow-api` · `peerdb-flow-worker` · `peerdb-flow-snapshot-worker` · `peerdb-server` · `peerdb-ui` · `peerdb-temporal-init` · `peerdb-init` - -| Layer | Service | Purpose | -|---|---|---| -| App | `frontend` | React SPA served by nginx | -| App | `backend` | Django REST + gRPC + WebSocket API | -| App | `worker` | Temporal worker — evals, agent loops, data jobs | -| App | `gateway` | Go LLM proxy — routing, retries, rate limits, logging | -| App | `serving` | Embeddings and small model inference | -| App | `code-executor` | nsjail-sandboxed eval code runner (`privileged: true` required) | -| Data | `postgres` | Primary DB — users, traces, datasets, evals, prompts | -| Data | `clickhouse` | Analytics DB — replicated from Postgres via PeerDB | -| Data | `redis` | Cache, rate limits, WebSocket pub/sub | -| Data | `minio` | S3-compatible object storage (swap for S3 in prod) | -| Workflow | `temporal` | Durable workflow engine — shares main Postgres | -| CDC | PeerDB stack | Continuous Postgres → ClickHouse replication (10 services) | +If you expose AgentCC Gateway for external applications, put it behind TLS and use gateway virtual keys instead of provider keys in client applications. Provider keys should stay in `.env`, `config.yaml`, or your secret manager. ## Next Steps - Hardware tiers, platform compatibility, ports reference. + Hardware sizing, supported platforms, and network ports. - Setup, deployment modes, day-to-day operations. - - - Full `.env` reference — secrets, ports, flags, keys. + Run the full stack, verify health, and operate it day to day. - - LLM gateway providers, PeerDB mirrors, Temporal workers. - - - Create accounts via email or Django shell. + + Configure self-hosted model routing, provider keys, and test calls. 
- Hardening, backups, monitoring, upgrades.
-
-
- Solutions for every known error.

diff --git a/src/pages/docs/self-hosting/configuration.mdx b/src/pages/docs/self-hosting/configuration.mdx
index fade7a33..bb1d4f2c 100644
--- a/src/pages/docs/self-hosting/configuration.mdx
+++ b/src/pages/docs/self-hosting/configuration.mdx
@@ -1,139 +1,113 @@
 ---
 title: "Self-Hosting System Configuration"
-description: "Configure the LLM gateway config.yaml with provider API keys, set up PeerDB Postgres-to-ClickHouse CDC mirrors, and tune Temporal worker concurrency."
+description: "Configure AgentCC Gateway config.yaml with provider API keys, frontend API URL, and Temporal worker concurrency."
 ---

 ## About

-Configure the moving parts that aren't covered by `.env` alone: provider entries in the LLM gateway's `config.yaml`, the PeerDB Postgres → ClickHouse replication mirrors, and Temporal worker concurrency.
+`.env` controls ports, secrets, and service-level runtime flags. A complete self-hosted deployment also needs three system-level checks:

-## LLM gateway
+- AgentCC Gateway has provider configuration and keys
+- Temporal workers have enough concurrency for your evaluation and tracing workload
+- The frontend points at the backend API URL your users actually reach

-
-The LLM gateway requires additional configuration before model calls will work. You must create a `config.yaml` and provide your provider API keys — see the setup steps below.
-

 ## Configuration map

-The gateway is a Go LLM proxy that routes all model calls. It ships with `config.example.yaml` — OpenAI enabled by default.
+| Area | Where to configure | When to change | +|---|---|---| +| Secrets and ports | `.env` | Before first boot, when rotating secrets, or when ports conflict | +| AgentCC Gateway providers | `agentcc-gateway/config.yaml` and `.env` | Before running model calls through AgentCC Gateway | +| Worker concurrency | `.env` | When evals, traces, or long-running jobs queue up | +| Frontend API URL | `VITE_HOST_API` in `.env` | When moving from localhost to a public backend URL | -### Setup +## AgentCC Gateway -```bash -# 1. Copy the example -cp futureagi/agentcc-gateway/config.example.yaml \ - futureagi/agentcc-gateway/config.yaml +AgentCC Gateway runs as the `agentcc-gateway` Compose service. It routes model calls from the Future AGI backend and, if you expose it, from your own applications. -# 2. Edit config.yaml — uncomment providers, set keys via ${VAR} interpolation -# 3. Set matching keys in .env (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.) +For the complete setup, provider examples, self-hosted models, and test calls, see [AgentCC Gateway](/docs/self-hosting/gateway). -# 4. Point the gateway volume at your config.yaml (in docker-compose.yml) -# volumes: -# - ./futureagi/agentcc-gateway/config.yaml:/app/config.yaml:ro +Quick setup: -# 5. Restart -docker compose up -d --force-recreate gateway +```bash +cp agentcc-gateway/config.example.yaml \ + agentcc-gateway/config.yaml ``` -`config.yaml` is gitignored. Treat it as a secret. 
- -### Provider config examples +Point the Compose mount at your config in `.env`: - - -```yaml -providers: - openai: - api_key: "${OPENAI_API_KEY}" - api_format: "openai" - models: [gpt-4o, gpt-4o-mini] +```bash +AGENTCC_CONFIG_PATH=agentcc-gateway/config.yaml +``` - anthropic: - api_key: "${ANTHROPIC_API_KEY}" - api_format: "anthropic" - models: [claude-opus-4-5, claude-sonnet-4-5] +Add or update provider keys in `.env`: - gemini: - api_key: "${GOOGLE_API_KEY}" - api_format: "gemini" - models: [gemini-2.0-flash, gemini-1.5-pro] -``` - - -```yaml -providers: - bedrock: - api_key: "${AWS_SECRET_ACCESS_KEY}" - api_format: "bedrock" - region: "${AWS_REGION}" - access_key: "${AWS_ACCESS_KEY_ID}" - models: [anthropic.claude-3-5-sonnet-20241022-v2:0] +```bash +OPENAI_API_KEY=sk-... +ANTHROPIC_API_KEY=sk-ant-... +GOOGLE_API_KEY=... ``` - - -```yaml -providers: - vertex: - base_url: "https://us-central1-aiplatform.googleapis.com" - api_key: "${GOOGLE_ACCESS_TOKEN}" - api_format: "gemini" - headers: - x-gcp-project: "${GCP_PROJECT_ID}" - x-gcp-location: "us-central1" - models: [gemini-2.0-flash-001] + +Then recreate AgentCC Gateway: + +```bash +docker compose up -d --force-recreate agentcc-gateway ``` -Vertex uses a Bearer token, not an API key. Rotate `GOOGLE_ACCESS_TOKEN` via a sidecar calling `gcloud auth print-access-token`. - - -For routing rules, rate limits, caching, and the full config reference — see [Agent Command Center → Self-hosted](/docs/command-center/deployment/self-hosted). +The Compose file passes `GOOGLE_API_KEY` into the AgentCC Gateway container as `GEMINI_API_KEY`, matching the AgentCC Gateway example config. ---- +Verify: -## PeerDB (Postgres → ClickHouse CDC) +```bash +curl http://localhost:8090/healthz +docker compose logs --tail=100 agentcc-gateway +``` -PeerDB continuously replicates Postgres tables into ClickHouse so trace and eval analytics stay fast. 
+## Temporal workers -**First-boot timing issue**: `peerdb-init` runs immediately on startup, before Django migrations may have completed. If mirrors show "not started" in the PeerDB UI: +The default self-hosted stack uses one all-queue worker: ```bash -# 1. Wait until backend logs "Application startup complete" -docker compose logs -f backend - -# 2. Re-run init -docker compose run --rm peerdb-init bash /setup.sh +TEMPORAL_ALL_QUEUES=true +TEMPORAL_MAX_CONCURRENT_ACTIVITIES=50 +TEMPORAL_MAX_CONCURRENT_WORKFLOW_TASKS=50 ``` -Verify at [http://localhost:3001](http://localhost:3001) — mirrors should show `running` within seconds. +Increase these when jobs are healthy but queued for too long. Keep them lower when the host is CPU or memory constrained. -After upgrades that touch replicated tables, re-run the same init command. +The dev overlay runs dedicated workers by queue: ---- +| Service | Queue | Typical use | +|---|---|---| +| `worker-default` | `default` | General workflows | +| `worker-tasks-s` | `tasks_s` | Short tasks | +| `worker-tasks-l` | `tasks_l` | Larger eval and simulation tasks | +| `worker-tasks-xl` | `tasks_xl` | Expensive long-running tasks | +| `worker-trace-ingestion` | `trace_ingestion` | Trace ingestion pipeline | +| `worker-agent-compass` | `agent_compass` | Agent Command Center jobs | -## Temporal workers +Use the dev overlay for debugging queue-specific behavior: -**Default (all-queue)** — one worker polls all task queues. Controlled by `TEMPORAL_ALL_QUEUES=true` in `.env`. Good for self-hosted deployments. 
+```bash +docker compose -f docker-compose.yml -f docker-compose.dev.yml up +``` -**Per-queue workers** (dev mode) — six dedicated workers via the dev overlay: +## Frontend URL changes -| Service name | Queue | Typical concurrency | -|---|---|---| -| `worker-default` | `default` | 100 | -| `worker-tasks-s` | `tasks_s` | 200 | -| `worker-tasks-l` | `tasks_l` | 50 | -| `worker-tasks-xl` | `tasks_xl` | 10 | -| `worker-trace-ingestion` | `trace_ingestion` | 100 | -| `worker-agent-compass` | `agent_compass` | 50 | +`VITE_HOST_API` controls the backend URL used by the browser. When you move from local URLs to a public backend domain, update `.env` or `deploy/.env.production` and recreate the frontend container: -Tune concurrency in `.env` via `TEMPORAL_MAX_CONCURRENT_ACTIVITIES` and `TEMPORAL_MAX_CONCURRENT_WORKFLOW_TASKS`. +```bash +VITE_HOST_API=https://api.yourcompany.com +docker compose up -d --force-recreate frontend +``` -Temporal UI (dev mode): [http://localhost:8085](http://localhost:8085) +If the UI loads but API calls fail with CORS or network errors, this is the first value to check. ## Next Steps - - Hardening, backups, and monitoring before going live. + + Configure provider access and test model calls. - - Solutions for common configuration errors. + + Harden networking, backups, monitoring, and upgrades. diff --git a/src/pages/docs/self-hosting/docker-compose.mdx b/src/pages/docs/self-hosting/docker-compose.mdx index 1ce057d7..d68f3244 100644 --- a/src/pages/docs/self-hosting/docker-compose.mdx +++ b/src/pages/docs/self-hosting/docker-compose.mdx @@ -1,127 +1,175 @@ --- title: "Self-Hosting with Docker Compose" -description: "Deploy the full Future AGI stack with Docker Compose — all 21 services, dev overlay with hot reload, and frontend-only mode pointing at a remote backend." +description: "Deploy Future AGI with the self-hosted installer, Docker Compose, dev overlay, and frontend-only mode." 
--- ## About -Docker Compose is the supported way to run a self-hosted Future AGI instance. This page covers the full-stack deployment (all 21 services), the dev overlay with hot reload and per-queue workers, and a frontend-only mode for pointing the UI at a remote backend. +Docker Compose is the primary self-hosting runtime for Future AGI. The recommended entrypoint is `./bin/install`, which prepares `.env`, generates secrets, pulls published images, starts Compose, waits for backend health, and optionally creates the first user. -## Setup +Use this page as the runbook for a single-node deployment. For production hardening, continue to [Production](/docs/self-hosting/production) after the stack is healthy. + +## Before you start + +Confirm Docker has enough resources: ```bash -git clone https://github.com/future-agi/future-agi.git -cd future-agi -cp .env.example .env -docker pull futureagi/future-agi:v1.8.19_base -docker compose up +docker --version +docker compose version +docker info | grep -i memory ``` -First boot builds from source (~10–15 min). When the backend logs `Application startup complete`: +Minimum for a useful local run is 4 CPU cores, 8 GB RAM, and 64 GB Docker disk. For shared environments, start with 8 CPU cores, 16 GB RAM, and SSD storage. See [Requirements](/docs/self-hosting/requirements) for sizing. -- **Frontend** — [http://localhost:3000](http://localhost:3000) -- **Backend API** — [http://localhost:8000](http://localhost:8000) -- **PeerDB UI** — [http://localhost:3001](http://localhost:3001) · `peerdb` / `peerdb` +## Install -Replace `CHANGEME` secrets in `.env` before sharing the instance with others. See [Environment Variables](/docs/self-hosting/environment). 
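If you create or edit `.env` by hand rather than letting the installer generate secrets, it is worth checking for leftover placeholders before anyone else touches the instance. A minimal sketch — it writes a demo file to `/tmp` so it is safe to run anywhere; in practice, point the `grep` at your real `.env`:

```bash
# Sketch: flag placeholder secrets left in an env file.
# Demo file only — run the grep against your real .env in practice.
cat > /tmp/demo.env <<'EOF'
SECRET_KEY=CHANGEME-secret-key
PG_PASSWORD=properly-generated-example
EOF

if grep -n "CHANGEME" /tmp/demo.env; then
  echo "placeholder secrets found — replace them before sharing the instance"
fi
```

The installer path makes this unnecessary, since `./bin/install` replaces `CHANGEME-*` placeholders with generated secrets itself.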
+ + ---- + ```bash + git clone https://github.com/future-agi/future-agi.git + cd future-agi + ``` -## Deployment modes + + + + ```bash + ./bin/install + ``` + + The installer copies `.env.example` to `.env` when needed, replaces `CHANGEME-*` placeholders with generated secrets, runs `docker compose up -d`, waits for `http://localhost:8000/health/`, and prompts for the first user. Compose pulls the published `futureagi/*` images; it does not build application images locally. + + + + + The installer prompts for the first user after the backend health check passes. Use `./bin/install --skip-user-creation` if you want to create the first user later. + + + + + | Service | URL | Use | + |---|---|---| + | Frontend | [http://localhost:3000](http://localhost:3000) | Main product UI | + | Backend API | [http://localhost:8000](http://localhost:8000) | API, SDK ingestion, admin | + | AgentCC Gateway | [http://localhost:8090](http://localhost:8090) | Provider routing gateway | + | MinIO console | [http://localhost:9006](http://localhost:9006) | Object storage admin | -### Mode 1 — Full stack (default) + + + +## Health checks + +Run these after first boot and after every upgrade: ```bash -docker compose up -d # detached -docker compose ps # check health -docker compose logs -f backend +docker compose ps +docker compose logs --tail=100 backend +docker compose logs --tail=100 worker +docker compose logs --tail=100 agentcc-gateway ``` -Starts all 21 services. Frontend binds on `0.0.0.0:3000`; all data stores bind on `127.0.0.1`. For production, put a reverse proxy (Caddy, nginx, Traefik) in front for HTTPS. 
- -### Mode 2 — Dev overlay +Check data services from inside the Compose network: ```bash -docker compose -f docker-compose.yml -f docker-compose.dev.yml up +docker compose exec postgres pg_isready -U futureagi -d futureagi +docker compose exec clickhouse clickhouse-client --query "SELECT 1" +docker compose exec redis redis-cli ping ``` -| What changes | Detail | -|---|---| -| Hot reload | `./futureagi` volume-mounted into backend and workers — Python changes reload without rebuild. Frontend also supports hot-reload in dev mode. | -| Per-queue workers | 6 workers (`worker-default`, `worker-tasks-s`, `worker-tasks-l`, `worker-tasks-xl`, `worker-trace-ingestion`, `worker-agent-compass`) instead of one all-queue worker | -| Public DB ports | Postgres, ClickHouse, Redis, MinIO, Temporal all bind on `0.0.0.0` for host tool access | -| Temporal UI | [http://localhost:8085](http://localhost:8085) | -| `FAST_STARTUP=true` | Migrations skipped on restart — run manually: `docker compose exec -it backend bash -c "python manage.py migrate"` | +## Deployment modes -The base `worker` service is disabled in dev mode (moved to the `oss-only` profile) to prevent duplicate queue polling. +### Standard stack -### Mode 3 — Frontend only +```bash +docker compose up -d +``` -For pointing the UI at a remote backend (another Compose project, a VM, or Future AGI Cloud). +This runs frontend, backend, worker, AgentCC Gateway, serving, code-executor, Postgres, ClickHouse, Redis, MinIO, and Temporal. + +### Development overlay ```bash -VITE_HOST_API=https://api.your-backend.example.com \ - docker compose -f docker-compose.frontend.yml up --build +docker compose -f docker-compose.yml -f docker-compose.dev.yml up ``` - -`VITE_HOST_API` is baked into the JS bundle at build time. 
Changing it requires a rebuild: `docker compose -f docker-compose.frontend.yml build --no-cache frontend` - +The dev overlay adds source volume mounts, frontend and backend hot reload, public data-store ports for local tools, Temporal UI, and separate workers for each Temporal queue. Use it for development, not production. ---- +### Frontend only -## Operations +```bash +VITE_HOST_API=https://api.yourcompany.com \ + docker compose -f docker-compose.frontend.yml up -d +``` + +Use this when the UI should point at a remote backend. `VITE_HOST_API` is passed to the frontend container at runtime, so changing it requires recreating the frontend container. + +## Day-to-day operations ```bash -# Logs -docker compose logs -f backend worker +# Show service health +docker compose ps -# Shell into a container +# Follow important logs +docker compose logs -f backend worker agentcc-gateway + +# Shell into backend docker compose exec backend bash + +# Open Postgres docker compose exec postgres psql -U futureagi -d futureagi -# Stop (data persists) -docker compose down +# Stop without deleting data +./bin/uninstall -# Wipe all data and restart fresh -docker compose down -v -``` +# Delete all local data and start fresh +./bin/uninstall --wipe-data ---- +# Remove containers, volumes, .env, logs, and project images +./bin/uninstall --purge +``` -## Upgrading +## Upgrade ```bash git pull -docker compose build +docker compose pull docker compose up -d +docker compose logs -f backend ``` -Migrations run automatically on startup. If a migration fails: +Migrations run during backend startup. If migrations fail after an upgrade: ```bash docker compose exec backend python manage.py migrate ``` -If the release notes mention PeerDB mirror changes, re-run init after migrations complete: +## Rollback + +Take backups before upgrading production. 
If you need to roll back application code:

```bash
git log --oneline -10
git checkout <previous-commit>
docker compose pull
docker compose up -d
```

+If the failed upgrade ran irreversible database migrations, restore from backup instead of only checking out an older commit.
+
## Next Steps

- Configure secrets, ports, and runtime flags in `.env`.
+ Configure secrets, ports, provider keys, and runtime flags.
-
- Set up LLM gateway providers and Temporal workers.
-
-
- Create your first account and configure email delivery.
+
+ Route model calls through your self-hosted Agent Command Center.
- Hardening checklist before exposing to users.
+ Add TLS, backups, monitoring, and secret management.
+
+
+ Recover from boot, network, AgentCC Gateway, and migration failures.

diff --git a/src/pages/docs/self-hosting/environment.mdx b/src/pages/docs/self-hosting/environment.mdx
index 0967ef80..9c28753b 100644
--- a/src/pages/docs/self-hosting/environment.mdx
+++ b/src/pages/docs/self-hosting/environment.mdx
@@ -1,17 +1,17 @@
 ---
 title: "Self-Hosting Environment Variables"
-description: "Full .env reference for self-hosted Future AGI — secrets, database credentials, runtime flags, LLM provider keys, email, and frontend build-time configuration."
+description: "Full .env reference for self-hosted Future AGI — secrets, database credentials, runtime flags, LLM provider keys, email, and frontend runtime configuration."
 ---

 ## About

-Reference for every environment variable the stack reads from `.env`. Grouped by purpose: secrets, database credentials, runtime flags, LLM provider keys, email, and frontend build-time config.
+Reference for every environment variable the stack reads from `.env`. Grouped by purpose: secrets, database credentials, runtime flags, LLM provider keys, email, and frontend runtime config.

```bash
-cp .env.example .env
+./bin/install --no-up
```

-The stack boots fine with defaults.
Replace `CHANGEME` secrets before sharing with others. +The installer creates `.env` and generates local secrets without starting containers when `--no-up` is set. If you create `.env` manually, replace every `CHANGEME` secret before sharing the deployment with others. ## Required secrets @@ -20,7 +20,7 @@ The stack boots fine with defaults. Replace `CHANGEME` secrets before sharing wi | `SECRET_KEY` | `openssl rand -hex 32` | Django sessions, CSRF, password reset | | `PG_PASSWORD` | `openssl rand -base64 24` | PostgreSQL auth | | `MINIO_ROOT_PASSWORD` | `openssl rand -base64 24` | MinIO object storage auth | -| `AGENTCC_INTERNAL_API_KEY` | `openssl rand -hex 32` | Backend ↔ gateway shared secret | +| `AGENTCC_INTERNAL_API_KEY` | `openssl rand -hex 32` | Backend ↔ AgentCC Gateway shared secret | ## Database credentials @@ -57,11 +57,13 @@ All configurable. See [Requirements → Ports reference](/docs/self-hosting/requ | `TEMPORAL_MAX_CONCURRENT_ACTIVITIES` | `50` | Max concurrent activity tasks. | | `TEMPORAL_MAX_CONCURRENT_WORKFLOW_TASKS` | `50` | Max concurrent workflow tasks. | -## LLM gateway +## AgentCC Gateway | Variable | Default | Description | |---|---|---| -| `AGENTCC_INTERNAL_API_KEY` | `CHANGEME` | **Must change.** Backend authenticates gateway calls with this. | +| `AGENTCC_INTERNAL_API_KEY` | `CHANGEME` | **Must change.** Backend authenticates AgentCC Gateway calls with this. | +| `AGENTCC_GATEWAY_PORT` | `8090` | Host port mapped to the AgentCC Gateway container. | +| `AGENTCC_CONFIG_PATH` | `agentcc-gateway/config.example.yaml` | Config file mounted into AgentCC Gateway. Set to `agentcc-gateway/config.yaml` after copying and editing the example. | ## LLM provider keys @@ -76,7 +78,7 @@ Leave blank for providers you're not using. ## Email (Mailgun) -Required for email-based sign-up and password reset. Without these, create users via the Django shell — see [User Management](/docs/self-hosting/user-management). 
+Required for email-based sign-up and password reset. Without these, create users with `python manage.py create_user` — see [User Management](/docs/self-hosting/user-management). | Variable | Description | |---|---| @@ -85,10 +87,10 @@ Required for email-based sign-up and password reset. Without these, create users | `DEFAULT_FROM_EMAIL` | `From:` address for outbound emails | | `SERVER_EMAIL` | Django admin error emails | -## Frontend build-time +## Frontend runtime -These are baked into the JS bundle at Vite build time. Changing them requires rebuilding: `docker compose build frontend` +Changing these values requires recreating the frontend container so it receives the new environment. | Variable | Default | Description | @@ -102,7 +104,7 @@ These are baked into the JS bundle at Vite build time. Changing them requires re |---|---|---| | `RECAPTCHA_ENABLED` | `false` | Enable reCAPTCHA on registration. | | `RECAPTCHA_SECRET_KEY` | — | reCAPTCHA v2/v3 server-side key. | -| `VITE_GOOGLE_SITE_KEY` | — | reCAPTCHA client-side key (requires frontend rebuild). | +| `VITE_GOOGLE_SITE_KEY` | — | reCAPTCHA client-side key. | | `FUTURE_AGI_CLOUD_API_KEY` | — | EE-tier Cloud features only. Leave blank for OSS. | | `FUTURE_AGI_CLOUD_API_URL` | `https://api.futureagi.com` | Do not change. | @@ -110,7 +112,7 @@ These are baked into the JS bundle at Vite build time. Changing them requires re - Set up LLM gateway providers and PeerDB mirrors. + Set up AgentCC Gateway providers and runtime configuration. Hardening checklist for exposing the stack to users. diff --git a/src/pages/docs/self-hosting/gateway.mdx b/src/pages/docs/self-hosting/gateway.mdx new file mode 100644 index 00000000..66782f81 --- /dev/null +++ b/src/pages/docs/self-hosting/gateway.mdx @@ -0,0 +1,685 @@ +--- +title: "AgentCC Gateway" +description: "Self-host AgentCC Gateway for provider routing, virtual keys, failover, caching, budgets, observability, and model access control." 
+--- + +## Overview + +AgentCC Gateway is the self-hosted model gateway that sits between Future AGI, your applications, and upstream model providers. It gives one controlled entry point for model traffic instead of spreading provider keys, retry logic, usage policy, and provider-specific API handling across every service. + +Use AgentCC Gateway when you want to: + +- Keep OpenAI, Anthropic, Gemini, Bedrock, Azure OpenAI, and self-hosted model credentials on the server side +- Give applications virtual keys with model, provider, rate, IP, and expiry controls +- Route the same model across multiple providers or regions +- Add retries, failover, circuit breaking, timeouts, and model fallbacks +- Track usage, budgets, model costs, and operational metrics +- Cache safe repeated calls and support self-hosted inference servers such as vLLM, Ollama, LM Studio, TGI, or OpenAI-compatible endpoints + +In the default Compose stack, the gateway runs as the `agentcc-gateway` service. The container listens on port `8080`, and Compose exposes it on the host as `http://localhost:8090` by default. + +```text +Future AGI backend + -> agentcc-gateway + -> OpenAI, Anthropic, Gemini, Bedrock, Azure OpenAI, vLLM, Ollama, or another endpoint + +External app + -> https://gateway.yourcompany.com + -> AgentCC Gateway virtual key + -> provider selected by policy +``` + + +Provider API keys should stay in `.env`, your secret manager, or the gateway config. Do not ship raw provider keys to browsers, notebooks, mobile apps, or customer environments. 
+ + +## Deployment modes + +| Mode | Use when | Public exposure | +|---|---|---| +| Internal gateway | Only the Future AGI backend calls models | Keep `agentcc-gateway` private on the Docker network | +| Shared internal gateway | Internal services and workers call models through one gateway | Expose on a private VPC, VPN, or service mesh | +| External app gateway | Your applications call AgentCC Gateway directly | Put it behind TLS, auth, rate limits, and IP restrictions | +| Multi-replica gateway | You need high availability or higher throughput | Run multiple replicas and use Redis-backed shared state | + +Start with the internal gateway. Expose it externally only when applications outside the Future AGI stack need an OpenAI-compatible endpoint. + +## Files and ports + +| Item | Default | Notes | +|---|---|---| +| Compose service | `agentcc-gateway` | Built from `future-agi/agentcc-gateway` | +| Example config | `agentcc-gateway/config.example.yaml` | Copy this before editing | +| Production config | `agentcc-gateway/config.yaml` | Set with `AGENTCC_CONFIG_PATH` | +| Container port | `8080` | Configured under `server.port` | +| Host port | `8090` | Controlled by `AGENTCC_GATEWAY_PORT` | +| Local URL | `http://localhost:8090` | Use from your host machine | +| Internal Docker URL | `http://agentcc-gateway:8080` | Use from containers on the same Compose network | + +## Create a gateway config + +Copy the example config and keep the edited file out of git: + +```bash +cp agentcc-gateway/config.example.yaml \ + agentcc-gateway/config.yaml +``` + +Point Compose at the edited file in `.env`: + +```bash +AGENTCC_CONFIG_PATH=agentcc-gateway/config.yaml +AGENTCC_GATEWAY_PORT=8090 +AGENTCC_INTERNAL_API_KEY=$(openssl rand -hex 32) +AGENTCC_ADMIN_TOKEN=$(openssl rand -hex 32) +``` + +Recreate only the gateway after changing provider config: + +```bash +docker compose up -d --force-recreate agentcc-gateway +docker compose logs -f agentcc-gateway +``` + +The gateway 
interpolates `${ENV_VAR}` values from the container environment when it starts. Keep provider secrets in `.env` or a secret manager, then reference them from `config.yaml`. + +## Minimal production baseline + +This baseline enables virtual-key auth, request logging without bodies, Prometheus metrics, conservative timeouts, and an OpenAI provider. + +```yaml +server: + host: "0.0.0.0" + port: 8080 + read_timeout: 5s + write_timeout: 300s + idle_timeout: 120s + default_request_timeout: 60s + max_request_body_size: 10485760 + +providers: + openai: + base_url: "https://api.openai.com" + api_key: "${OPENAI_API_KEY}" + api_format: "openai" + default_timeout: 60s + max_concurrent: 100 + conn_pool_size: 100 + models: + - gpt-4o + - gpt-4o-mini + +auth: + enabled: true + keys: + - name: "future-agi-backend" + key: "${AGENTCC_INTERNAL_API_KEY}" + owner: "platform" + models: ["gpt-4o", "gpt-4o-mini"] + - name: "internal-apps" + key: "${AGENTCC_APP_KEY}" + owner: "apps" + models: ["gpt-4o-mini"] + rate_limit_rpm: 120 + +logging: + level: info + format: json + request_logging: + enabled: true + include_bodies: false + +prometheus: + enabled: true + path: "/-/metrics" +``` + +Set the matching environment variables: + +```bash +OPENAI_API_KEY=sk-... +AGENTCC_APP_KEY=sk-agentcc-your-app-key +``` + + +Set `request_logging.include_bodies: false` unless you have a reviewed data-retention policy. Model prompts and responses can contain customer data. + + +## Provider configuration + +Each provider declares the upstream endpoint, auth material, API format, limits, and model list. You can start with one provider and add more without changing application code. 
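+When a second provider is added later, only the gateway config changes. A minimal sketch of mapping model names to providers, reusing the `model_map` block from the routing section and provider names from the examples on this page:
+
+```yaml
+model_map:
+  gpt-4o-mini: openai
+  claude-3-5-haiku-20241022: anthropic
+```
+
+Applications keep calling the same model names at the same endpoint while the provider set behind them changes.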
+ + + + +```yaml +providers: + openai: + base_url: "https://api.openai.com" + api_key: "${OPENAI_API_KEY}" + api_format: "openai" + default_timeout: 60s + max_concurrent: 100 + conn_pool_size: 100 + models: + - gpt-4o + - gpt-4o-mini +``` + + + + +```yaml +providers: + anthropic: + base_url: "https://api.anthropic.com" + api_key: "${ANTHROPIC_API_KEY}" + api_format: "anthropic" + default_timeout: 120s + max_concurrent: 50 + conn_pool_size: 50 + headers: + anthropic-version: "2023-06-01" + models: + - claude-sonnet-4-20250514 + - claude-3-5-haiku-20241022 +``` + + + + +```yaml +providers: + gemini: + base_url: "https://generativelanguage.googleapis.com" + api_key: "${GEMINI_API_KEY}" + api_format: "gemini" + default_timeout: 120s + max_concurrent: 50 + conn_pool_size: 50 + models: + - gemini-2.0-flash + - gemini-1.5-pro +``` + + + + +```yaml +providers: + bedrock: + base_url: "https://bedrock-runtime.us-east-1.amazonaws.com" + api_format: "bedrock" + default_timeout: 120s + max_concurrent: 50 + conn_pool_size: 50 + headers: + x-aws-region: "us-east-1" + models: + - anthropic.claude-3-sonnet-20240229-v1:0 +``` + + + + +For Bedrock, set `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_REGION`, or run the container with an IAM role. Prefer an IAM role in production. + +## Self-hosted models + +Self-hosted inference servers usually expose an OpenAI-compatible API. Point `base_url` at the endpoint reachable from the gateway container. + + +Inside Docker, `localhost` means the gateway container itself. For a model server running on your laptop, use `host.docker.internal` on Docker Desktop. On Linux servers, use a private routable host address or put the model server on the same Docker network. 
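+On Linux, one option is to give the gateway container the Docker Desktop-style alias through Compose. A sketch of an override (assuming the default `agentcc-gateway` service name) that maps `host.docker.internal` to the host:
+
+```yaml
+services:
+  agentcc-gateway:
+    extra_hosts:
+      # "host-gateway" resolves to the host's IP on Docker 20.10+
+      - "host.docker.internal:host-gateway"
+```
+
+With this in place, `host.docker.internal` URLs in the provider examples work on Linux hosts as well.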
+ + + + + +```yaml +providers: + vllm: + base_url: "http://vllm.internal:8000" + api_format: "openai" + type: "vllm" + auto_discover: true +``` + + + + +```yaml +providers: + ollama: + base_url: "http://host.docker.internal:11434" + api_format: "openai" + type: "ollama" + models: + - llama3.1 + - qwen2.5 +``` + + + + +```yaml +providers: + lmstudio: + base_url: "http://host.docker.internal:1234" + api_format: "openai" + type: "lmstudio" +``` + + + + +```yaml +providers: + internal: + base_url: "https://models.internal.example.com" + api_key: "${INTERNAL_MODEL_API_KEY}" + api_format: "openai" + local: false + skip_tls: false + auto_discover: true + models: + - company-chat-prod +``` + + + + +## Virtual keys + +Virtual keys are the boundary between applications and upstream provider keys. Create one key per service, environment, or customer-facing integration so you can rotate and limit access without touching provider credentials. + +```yaml +auth: + enabled: true + keys: + - name: "backend-prod" + key: "${AGENTCC_INTERNAL_API_KEY}" + owner: "future-agi" + models: ["gpt-4o", "gpt-4o-mini"] + providers: ["openai"] + rate_limit_rpm: 600 + rate_limit_tpm: 200000 + allowed_ips: ["10.0.0.0/8"] + metadata: + env: "prod" + + - name: "notebooks-dev" + key: "${AGENTCC_NOTEBOOK_KEY}" + owner: "ml-platform" + models: ["gpt-4o-mini"] + rate_limit_rpm: 60 + expires_at: "2026-12-31T00:00:00Z" +``` + +Key design guidelines: + +- Use separate keys for backend services, evaluation jobs, notebooks, and customer-facing apps +- Restrict high-cost models to keys that need them +- Put low RPM and TPM limits on exploratory or developer keys +- Rotate a virtual key when an app owner changes or an integration is retired +- Keep provider keys stable and rotate them separately from app keys + +## Routing and failover + +Routing controls where a model request goes. Start with explicit provider mapping, then add failover and load balancing where production traffic needs it. 
+ +```yaml +model_map: + gpt-4o: openai + claude-sonnet-4-20250514: anthropic + +routing: + default_strategy: "round-robin" + failover: + enabled: true + max_attempts: 3 + on_status_codes: [429, 500, 502, 503, 504] + on_timeout: true + retry: + enabled: true + max_retries: 2 + initial_delay: 500ms + max_delay: 10s + multiplier: 2.0 + circuit_breaker: + enabled: true + failure_threshold: 5 + success_threshold: 2 + cooldown: 30s + model_timeouts: + gpt-4o-mini: 60s + o1: 300s + model_fallbacks: + gpt-4o: + - claude-sonnet-4-20250514 + - gemini-2.0-flash +``` + +For regional or account-level balancing, define multiple provider entries and route one model across them: + +```yaml +providers: + openai-primary: + base_url: "https://api.openai.com" + api_key: "${OPENAI_PRIMARY_API_KEY}" + api_format: "openai" + models: ["gpt-4o-mini"] + openai-secondary: + base_url: "https://api.openai.com" + api_key: "${OPENAI_SECONDARY_API_KEY}" + api_format: "openai" + models: ["gpt-4o-mini"] + +routing: + default_strategy: "weighted" + targets: + gpt-4o-mini: + - provider: openai-primary + weight: 80 + - provider: openai-secondary + weight: 20 +``` + +## Budgets and cost tracking + +Enable cost tracking before you enforce budgets. Use budgets to protect shared environments and customer-facing endpoints from unexpected spend. + +```yaml +cost_tracking: + enabled: true + custom_pricing: + company-chat-prod: + input_per_mtok: 1.00 + output_per_mtok: 5.00 + +budgets: + enabled: true + default_period: "monthly" + warn_threshold: 0.8 + org: + limit: 10000.0 + hard: true + teams: + platform: + limit: 5000.0 + period: "monthly" + per_model: + gpt-4o: 2000.0 + keys: + sk-agentcc-your-app-key: + limit: 100.0 +``` + +Use hard budgets for non-critical or external app keys. For core production services, start with alerts and a reviewed runbook before rejecting traffic. + +## Caching + +Caching is useful for deterministic evaluations, repeated development prompts, and low-risk read-heavy workloads. 
Avoid caching requests that include private user data unless your retention policy allows it. + +```yaml +cache: + enabled: true + backend: "redis" + default_ttl: 5m + max_entries: 10000 + redis: + address: "redis:6379" + db: 0 + key_prefix: "agentcc-cache:" + timeout: 2s +``` + +Per-request cache controls: + +```bash +curl -X POST http://localhost:8090/v1/chat/completions \ + -H "Authorization: Bearer $AGENTCC_APP_KEY" \ + -H "Content-Type: application/json" \ + -H "x-agentcc-cache-ttl: 10m" \ + -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Summarize this fixed document."}]}' +``` + +## Security hardening + +For an externally reachable gateway, enable virtual-key auth, TLS, body-safe logging, CORS restrictions, IP allowlists where possible, and least-privilege provider credentials. + +```yaml +cors: + enabled: true + allowed_origins: ["https://app.example.com"] + allowed_methods: ["GET", "POST", "OPTIONS"] + allowed_headers: ["Authorization", "Content-Type"] + allow_credentials: false + +ip_acl: + enabled: true + trusted_proxies: 1 + allow: ["10.0.0.0/8", "192.168.1.0/24"] + deny: ["203.0.113.7"] + +privacy: + enabled: true + mode: "patterns" + redact_patterns: + - name: "email" + pattern: '[\w.+-]+@[\w-]+\.[\w.-]+' + - name: "ssn" + pattern: '\d{3}-\d{2}-\d{4}' + +audit: + enabled: true + min_severity: "info" + sinks: + - type: "stdout" +``` + +Minimum production rules: + +- Put the gateway behind a reverse proxy or load balancer with TLS +- Keep `agentcc-gateway/config.yaml` readable only by operators and the deployment process +- Use one virtual key per app or team +- Never enable body logging by default +- Restrict direct internet access unless external apps must call it +- Rotate leaked virtual keys immediately; rotate provider keys if provider credentials were exposed + +## Observability + +Enable metrics and structured logs before sending real traffic through the gateway. 
+ +```yaml +logging: + level: info + format: json + request_logging: + enabled: true + include_bodies: false + +prometheus: + enabled: true + path: "/-/metrics" + +otel: + enabled: true + service_name: "agentcc-gateway" + exporter: "stdout" + sample_rate: 0.1 + attributes: + environment: "production" +``` + +Scrape metrics from the gateway container: + +```bash +curl http://localhost:8090/-/metrics +``` + +Alert on: + +- Gateway 5xx rate +- Provider 401 or 403 errors +- Provider 429 rate limits +- Timeout rate by provider and model +- Circuit breaker open events +- Spend spikes and budget threshold warnings +- Cache error rate if cache is enabled + +## Scaling and high availability + +For a single-node Compose install, one gateway replica is enough. For higher traffic or failover, run multiple replicas behind a load balancer and move shared state to Redis. + +```yaml +redis: + enabled: true + address: "redis:6379" + password: "${REDIS_PASSWORD}" + db: 0 + pool_size: 20 + timeout: 2s + prefix: "agentcc:" + +cluster: + enabled: true + node_id: "gateway-1" + redis_url: "redis://redis:6379/0" + heartbeat_interval: 5s + heartbeat_ttl: 15s + drain_timeout: 30s +``` + +When you scale: + +- Use Redis-backed state for rate limits, budgets, credits, and cluster heartbeats +- Keep provider connection pools below upstream provider limits +- Use readiness checks before sending traffic to a new replica +- Drain old replicas during deploys so streaming responses can finish +- Keep each replica on the same `config.yaml` version during a rollout + +## Test the gateway + +Health check: + +```bash +curl http://localhost:8090/healthz +``` + +Metrics check: + +```bash +curl http://localhost:8090/-/metrics +``` + +OpenAI-compatible chat completion: + +```bash +curl -X POST http://localhost:8090/v1/chat/completions \ + -H "Authorization: Bearer $AGENTCC_APP_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "model": "gpt-4o-mini", + "messages": [{"role": "user", "content": "Reply 
with one sentence."}] + }' +``` + +Streaming check: + +```bash +curl -N -X POST http://localhost:8090/v1/chat/completions \ + -H "Authorization: Bearer $AGENTCC_APP_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "model": "gpt-4o-mini", + "stream": true, + "messages": [{"role": "user", "content": "Count from one to five."}] + }' +``` + +If the chat call works, provider credentials, gateway auth, routing, and outbound network access are wired correctly. + +## Use AgentCC Gateway from an app + +Most OpenAI-compatible clients only need a new `base_url` and `api_key`. + + +```python Python +from openai import OpenAI + +client = OpenAI( + api_key="sk-agentcc-your-app-key", + base_url="https://gateway.yourcompany.com/v1", +) + +response = client.chat.completions.create( + model="gpt-4o-mini", + messages=[{"role": "user", "content": "Hello from AgentCC Gateway"}], +) + +print(response.choices[0].message.content) +``` + +```typescript TypeScript +import OpenAI from "openai"; + +const client = new OpenAI({ + apiKey: "sk-agentcc-your-app-key", + baseURL: "https://gateway.yourcompany.com/v1", +}); + +const response = await client.chat.completions.create({ + model: "gpt-4o-mini", + messages: [{ role: "user", content: "Hello from AgentCC Gateway" }], +}); + +console.log(response.choices[0].message.content); +``` + +```bash cURL +curl -X POST https://gateway.yourcompany.com/v1/chat/completions \ + -H "Authorization: Bearer sk-agentcc-your-app-key" \ + -H "Content-Type: application/json" \ + -d '{ + "model": "gpt-4o-mini", + "messages": [{"role": "user", "content": "Hello from AgentCC Gateway"}] + }' +``` + + +## Upgrade runbook + +1. Save the current `agentcc-gateway/config.yaml`. +2. Pull the new release or branch. +3. Compare your config with the new `agentcc-gateway/config.example.yaml`. +4. Add any new required fields or recommended defaults. +5. Recreate one gateway instance first. +6. 
Run `/healthz`, `/-/metrics`, a non-streaming chat call, and a streaming chat call. +7. Watch gateway logs for provider auth, model validation, timeout, and routing errors. +8. Roll the remaining replicas after smoke tests pass. + +## Troubleshooting + +| Symptom | Likely cause | Fix | +|---|---|---| +| `401 unauthorized` from gateway | Missing or wrong virtual key | Check `auth.enabled`, `auth.keys`, and the `Authorization: Bearer ...` header | +| Provider returns `401` or `403` | Upstream provider credential is wrong or lacks access | Check `.env`, config interpolation, provider dashboard access, and IAM permissions | +| `model not found` | Model is not listed, mapped, or discoverable | Add the model under the provider, set `model_map`, or enable `auto_discover` for compatible endpoints | +| `connection refused` for vLLM, Ollama, or LM Studio | Gateway container cannot reach a host-local endpoint | Use `host.docker.internal`, a private host IP, or the same Docker network | +| Requests hang | Timeout too high, provider unavailable, or no outbound route | Set provider `default_timeout`, model timeouts, and verify DNS/firewall access from the container | +| Repeated provider 429s | Provider rate limit is lower than gateway traffic | Lower `max_concurrent`, add another provider key, enable failover, or add per-key limits | +| Budgets differ across replicas | Shared state is in memory | Enable Redis-backed `redis` and `cluster` config before running multiple replicas | +| Browser request blocked | CORS does not allow the frontend origin | Add the app origin under `cors.allowed_origins` | +| Metrics missing | Prometheus endpoint is disabled or wrong path | Enable `prometheus.enabled` and scrape `/-/metrics` | + +Useful commands: + +```bash +docker compose ps agentcc-gateway +docker compose logs --tail=200 agentcc-gateway +docker compose exec agentcc-gateway env | grep -E "OPENAI|ANTHROPIC|GEMINI|AWS|AGENTCC" +curl http://localhost:8090/healthz +curl 
http://localhost:8090/-/metrics +``` + +Do not paste `.env`, provider keys, virtual keys, or request bodies into support channels. diff --git a/src/pages/docs/self-hosting/production.mdx b/src/pages/docs/self-hosting/production.mdx index c5fbb682..36369795 100644 --- a/src/pages/docs/self-hosting/production.mdx +++ b/src/pages/docs/self-hosting/production.mdx @@ -5,133 +5,215 @@ description: "Production readiness checklist — replace secrets, configure TLS, ## About -Run through this before exposing the stack to real users. Covers secrets, TLS, swapping in managed data stores, backup commands for Postgres/ClickHouse/MinIO, Prometheus monitoring, and the upgrade and rollback runbook. +Use this checklist before exposing a self-hosted Future AGI deployment to real users or production traffic. Harden secrets, TLS, networking, backups, monitoring, and upgrades before users depend on it. -## Hardening checklist +For production, prefer the production overlay in the `deploy/` directory. It requires explicit secrets, pins the image version, pulls published images, and refuses to start when required values are missing. 
-**Secrets** — replace all `CHANGEME` values before going live: +## Go-live checklist + +| Area | Required action | +|---|---| +| Secrets | Use `deploy/setup.sh` or generate production secrets, then move them to a secret manager when possible | +| TLS | Terminate HTTPS in Caddy, nginx, Traefik, an ingress controller, or a load balancer | +| Network | Expose only frontend, backend, and optionally AgentCC Gateway; keep data stores private | +| Auth | Create admin users intentionally; configure email or SSO before inviting users | +| AgentCC Gateway | Use virtual keys for apps; keep provider keys server-side | +| Backups | Test restore for Postgres and object storage before launch | +| Monitoring | Scrape backend and AgentCC Gateway metrics; alert on job failures and replication lag | +| Upgrades | Take backups, upgrade in a staging clone first, and document rollback criteria | + +## Secrets + +Generate production values: ```bash -openssl rand -hex 32 # SECRET_KEY, AGENTCC_INTERNAL_API_KEY -openssl rand -base64 24 # PG_PASSWORD, MINIO_ROOT_PASSWORD +openssl rand -hex 32 # SECRET_KEY +openssl rand -base64 24 # PG_PASSWORD +openssl rand -base64 24 # MINIO_ROOT_PASSWORD +openssl rand -hex 32 # AGENTCC_INTERNAL_API_KEY ``` -**Runtime flags** in `.env`: -- `ENV_TYPE=prod` -- `FAST_STARTUP=false` -- `GRANIAN_WORKERS=` +Set production runtime flags: + +```bash +ENV_TYPE=prod +FAST_STARTUP=false +GRANIAN_WORKERS= +GRANIAN_THREADS=2 +``` -**TLS** — the frontend and backend don't terminate TLS. Put Caddy, nginx, or Traefik in front: +Use a secret manager for shared environments. If you keep `.env` on disk, restrict permissions: +```bash +chmod 600 .env ``` -# Caddyfile (simplest — auto-issues Let's Encrypt certs) -app.yourcompany.com { reverse_proxy localhost:3000 } -api.yourcompany.com { reverse_proxy localhost:8000 } + +## TLS and domains + +Run the frontend, backend, and optionally AgentCC Gateway behind TLS. 
+ +Example Caddyfile: + +```text +app.yourcompany.com { + reverse_proxy 127.0.0.1:3000 +} + +api.yourcompany.com { + reverse_proxy 127.0.0.1:8000 +} + +gateway.yourcompany.com { + reverse_proxy 127.0.0.1:8090 +} ``` -After setting up TLS, set `VITE_HOST_API=https://api.yourcompany.com` in `.env` and rebuild: +For production installs, set `FRONTEND_URL` and `VITE_HOST_API` in `deploy/.env.production` before starting the stack: ```bash -docker compose build frontend && docker compose up -d frontend +FRONTEND_URL=https://app.yourcompany.com +VITE_HOST_API=https://api.yourcompany.com +docker compose --env-file deploy/.env.production \ + -f docker-compose.yml -f deploy/docker-compose.production.yml up -d frontend ``` -**Managed data stores** — for production, replace compose-managed services: +Only expose the AgentCC Gateway domain if external applications call it directly. If AgentCC Gateway is only used by the Future AGI backend, keep it private. -| Replace | With | Change | -|---|---|---| -| `postgres` | RDS / Aurora / Cloud SQL | Set `PG_*` vars to managed endpoint | -| `clickhouse` | ClickHouse Cloud | Set `CH_HOST`, `CH_PORT`, etc. | -| `redis` | ElastiCache / Upstash | Set `REDIS_URL` | -| `minio` | AWS S3 | Set `S3_ENDPOINT_URL=https://s3.amazonaws.com` + AWS creds | +## Network policy + +| Service | Production exposure | +|---|---| +| `frontend` | Public through HTTPS | +| `backend` | Public through HTTPS if SDKs or browser UI call it | +| `agentcc-gateway` | Private by default; public only for external app traffic | +| `postgres`, `clickhouse`, `redis`, `minio`, `temporal` | Private network, VPN, or bastion only | + +Do not expose database or queue ports directly to the internet. Use firewall rules, security groups, private networking, or equivalent network policies to enforce this. -`code-executor` requires `privileged: true`. Run on EC2 / GCE instances — not Fargate or Cloud Run. +Official Kubernetes manifests and Helm charts are coming soon. 
Until then, the supported production path is Docker Compose on a VM or bare-metal host, with managed data stores where needed. -**Secrets manager** — use AWS Secrets Manager, HashiCorp Vault, or GCP Secret Manager instead of a plain `.env` file. +## Managed data stores ---- +For production, move durable services out of the Compose host when you need high availability, managed backups, or independent scaling. + +| Compose service | Production option | Notes | +|---|---|---| +| `postgres` | RDS, Aurora, Cloud SQL, self-managed HA Postgres | Primary transactional database | +| `clickhouse` | ClickHouse Cloud or managed ClickHouse | Size for trace and eval query volume | +| `redis` | ElastiCache, Memorystore, Upstash, self-managed Redis | Use persistence if it stores queues or state you cannot lose | +| `minio` | S3, GCS via S3-compatible gateway, managed object storage | Update `S3_*` env vars | + +Keep latency low between backend, worker, Postgres, Redis, and Temporal. ## Backups -### PostgreSQL +### Postgres ```bash -# Backup docker compose exec postgres \ pg_dump -U futureagi -d futureagi --format=custom \ - > backup-$(date +%F).dump + > futureagi-postgres-$(date +%F).dump +``` + +Restore into a stopped or empty target: -# Restore +```bash docker compose exec -T postgres \ pg_restore -U futureagi -d futureagi --clean --if-exists \ - < backup-2026-04-22.dump + < futureagi-postgres-2026-05-07.dump ``` -Volumes: `future-agi_postgres-data` · `future-agi_clickhouse-data` · `future-agi_redis-data` · `future-agi_minio-data` · `future-agi_peerdb-catalog-data` · `future-agi_peerdb-minio-data` +### Object storage -### ClickHouse +For local MinIO: -```sql -BACKUP TABLE default.traces TO S3('s3://your-bucket/ch-backup/', 'KEY', 'SECRET'); +```bash +mc alias set local http://localhost:9005 futureagi "$MINIO_ROOT_PASSWORD" +mc mirror local/futureagi ./futureagi-minio-backup ``` -ClickHouse data can also be rebuilt from scratch by re-running PeerDB init since it replicates from 
Postgres. +For S3, use bucket versioning and lifecycle policies. Back up files and metadata together with Postgres snapshots taken at the same point in time. -### MinIO - -```bash -mc alias set local http://localhost:9005 futureagi -mc alias set s3 https://s3.amazonaws.com -mc mirror local/ s3/your-bucket/ -``` +### ClickHouse ---- +ClickHouse stores analytics and trace query data. For production, use native ClickHouse backups or your managed ClickHouse provider's snapshot feature. ## Monitoring -Backend exposes Prometheus metrics at `http://localhost:8000/metrics`. Add a scraper: +Backend exposes Prometheus metrics at: + +```text +http://localhost:8000/metrics +``` + +Scrape it from Prometheus: ```yaml -# prometheus.yml scrape_configs: - - job_name: futureagi - static_configs: - - targets: ['localhost:8000'] + - job_name: futureagi-backend + scheme: https metrics_path: /metrics + static_configs: + - targets: ["api.yourcompany.com:443"] ``` -Key signals: backend error rate, Temporal workflow success/failure, Postgres WAL lag (PeerDB health), ClickHouse query latency, PeerDB mirror status at [localhost:3001](http://localhost:3001). +Track these signals: ---- +| Signal | Why it matters | +|---|---| +| Backend 5xx rate and latency | User-facing API health | +| Worker queue depth and failures | Eval and trace jobs are falling behind | +| Temporal workflow failures | Durable background job failures | +| ClickHouse query latency and disk | Trace and dashboard performance | +| Postgres connections, CPU, disk, WAL | Primary data-store health | +| AgentCC Gateway 4xx, 5xx, timeout, provider error rate | Model routing and provider health | + +## Upgrade runbook + +1. Announce a maintenance window if users depend on the deployment. +2. Take Postgres and object-storage backups. +3. Record current commit and `FUTURE_AGI_VERSION`. +4. Upgrade a staging clone first when possible. +5. Pull code, pull images, and restart. +6. 
Watch backend migrations, worker logs, and AgentCC Gateway logs. +7. Run an AgentCC Gateway model call and a trace ingestion smoke test. -## Upgrades +Commands: ```bash +git rev-parse HEAD git pull -docker compose build -docker compose up -d +docker compose --env-file deploy/.env.production \ + -f docker-compose.yml -f deploy/docker-compose.production.yml pull +docker compose --env-file deploy/.env.production \ + -f docker-compose.yml -f deploy/docker-compose.production.yml up -d +docker compose logs -f backend ``` -Migrations run automatically. If a migration fails: `docker compose exec backend python manage.py migrate` +## Rollback -If release notes mention PeerDB changes: `docker compose run --rm peerdb-init bash /setup.sh` - -**Rollback:** +Rollback application code only when database migrations are backward-compatible: ```bash -git log --oneline -5 -git checkout -docker compose build && docker compose up -d +# Set FUTURE_AGI_VERSION in deploy/.env.production to the previous release, +# then pull and restart. +docker compose --env-file deploy/.env.production \ + -f docker-compose.yml -f deploy/docker-compose.production.yml pull +docker compose --env-file deploy/.env.production \ + -f docker-compose.yml -f deploy/docker-compose.production.yml up -d ``` +If a migration changed data irreversibly, restore Postgres and object storage from the backups taken before the upgrade. + ## Next Steps - Symptoms, causes, and fixes for common errors. + Diagnose startup, AgentCC Gateway, Temporal, and upgrade failures. - - Tune the LLM gateway, PeerDB mirrors, and Temporal workers. + + Harden model access, virtual keys, and provider routing. diff --git a/src/pages/docs/self-hosting/requirements.mdx b/src/pages/docs/self-hosting/requirements.mdx index 4e0ce3be..eeb86bc6 100644 --- a/src/pages/docs/self-hosting/requirements.mdx +++ b/src/pages/docs/self-hosting/requirements.mdx @@ -15,7 +15,7 @@ Hardware tiers, supported platforms, and the network ports each service uses. 
Re | **Team** | 1–20 users, regular eval runs | 8 cores | 16 GB | 50 GB | | **Production** | 20+ users, high throughput | 16+ cores | 32+ GB | 200 GB+ SSD | -Resource drivers: ClickHouse and Temporal worker each hold ~1 GB RAM at steady state. First image build is ~6 GB disk. ClickHouse grows with trace volume; Postgres stays small. +Resource drivers: ClickHouse and Temporal worker each hold ~1 GB RAM at steady state. The first run downloads published container images and needs Docker disk headroom. ClickHouse grows with trace volume; Postgres stays small. Docker Desktop (Mac/Windows): Settings → Resources → set RAM ≥ 8 GB, disk ≥ 64 GB. The defaults (2–4 GB RAM) will OOM-kill ClickHouse or the backend. @@ -68,7 +68,7 @@ All ports are configurable via `.env`. |---|---|---|---| | Frontend | `3000` | `0.0.0.0` | `FRONTEND_PORT` | | Backend API | `8000` | `0.0.0.0` | `BACKEND_PORT` | -| Gateway | `8090` | Internal only | `GATEWAY_PORT` | +| AgentCC Gateway | `8090` | Host port; restrict in production | `AGENTCC_GATEWAY_PORT` | | Model serving | `8080` | Internal only | `SERVING_PORT` | | Code executor | `8060` | Internal only | `CODE_EXECUTOR_PORT` | | Postgres | `5432` | `127.0.0.1` (dev: public) | `PG_PORT` | @@ -79,8 +79,6 @@ All ports are configurable via `.env`. | MinIO console | `9006` | `127.0.0.1` | `MINIO_CONSOLE_PORT` | | Temporal | `7233` | `127.0.0.1` (dev: public) | `TEMPORAL_PORT` | | Temporal UI | `8085` | Dev mode only | `TEMPORAL_UI_PORT` | -| PeerDB server | `9900` | `127.0.0.1` | `PEERDB_PORT` | -| PeerDB UI | `3001` | `0.0.0.0` | `PEERDB_UI_PORT` | In production, only the frontend and backend ports should be internet-facing, and only behind a TLS-terminating reverse proxy. 
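+If a default port collides with something already running on the host, override it in `.env` before starting the stack. A sketch using the variables from the table above; the alternate port numbers here are arbitrary examples:
+
+```bash
+FRONTEND_PORT=3100
+BACKEND_PORT=8100
+AGENTCC_GATEWAY_PORT=8091
+```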
diff --git a/src/pages/docs/self-hosting/troubleshooting.mdx b/src/pages/docs/self-hosting/troubleshooting.mdx index a94a29a7..a06a0db1 100644 --- a/src/pages/docs/self-hosting/troubleshooting.mdx +++ b/src/pages/docs/self-hosting/troubleshooting.mdx @@ -1,146 +1,232 @@ --- title: "Self-Hosting Troubleshooting" -description: "Debug self-hosted Future AGI — symptoms, causes, and fixes for startup failures, network issues, PeerDB CDC errors, Temporal worker problems, and post-upgrade breaks." +description: "Debug self-hosted Future AGI — symptoms, causes, and fixes for startup failures, network issues, AgentCC Gateway errors, Temporal worker problems, and post-upgrade breaks." --- -## About - -Symptoms, causes, and fixes for the errors most commonly hit when self-hosting. Grouped by where they show up: startup, network, PeerDB, Temporal, and post-upgrade. - ## Start here +Run these first. They usually identify the failing layer. + ```bash -docker compose ps # what's running / what's restarting -docker compose logs -f backend # most informative starting point -docker compose exec backend bash # shell into any container +docker compose ps +docker compose logs --tail=150 backend +docker compose logs --tail=150 worker +docker compose logs --tail=150 agentcc-gateway ``` ---- - -## Startup errors +Check the data services: -**`Cannot connect to the Docker daemon`** -Docker isn't running. Start Docker Desktop (Mac/Windows) or `sudo systemctl start docker` (Linux). +```bash +docker compose exec postgres pg_isready -U futureagi -d futureagi +docker compose exec clickhouse clickhouse-client --query "SELECT 1" +docker compose exec redis redis-cli ping +``` ---- +If you changed `.env`, remember that some values are only read when the container is recreated: -**First build takes 15+ min or hangs on `uv pip install`** -Normal on first boot. 
If stuck >20 min, cancel and retry: ```bash -docker compose build --no-cache backend +docker compose up -d --force-recreate backend worker agentcc-gateway ``` ---- +## Startup -**`ERROR: not enough free space`** -Docker Desktop's virtual disk is full. Settings → Resources → Disk image size → raise to 100 GB+. Or prune: `docker system prune -af && docker builder prune -af` +| Symptom | Likely cause | Fix | +|---|---|---| +| `Cannot connect to the Docker daemon` | Docker is stopped | Start Docker Desktop or `sudo systemctl start docker` | +| First boot takes several minutes | Docker is pulling published images and initializing databases | Wait; if stuck, check `docker compose logs --tail=150 backend` | +| `no space left on device` | Docker disk is full | Increase Docker disk to 100 GB+ or prune unused images | +| Container exits with code 137 | Out of memory | Allocate at least 8 GB RAM to Docker; 16 GB recommended for team use | +| Port already in use | Another local process owns the port | Change the matching port in `.env` or stop the process | +| Installer exits before startup | Preflight or port check failed | Read the generated `install-*.log`, fix the reported issue, then re-run `./bin/install` | ---- +Find port conflicts: -**Port already in use** ```bash -lsof -i :3000 # find the conflicting process -# or override in .env: -FRONTEND_PORT=3100 -BACKEND_PORT=8100 +lsof -i :3000 +lsof -i :8000 +lsof -i :8090 ``` ---- +## Backend **Backend never reaches `Application startup complete`** -- Check RAM: `docker info | grep -i memory` — Docker needs ≥ 8 GB -- Check for migration errors: `docker compose logs backend | grep -i error` -- Run migrations manually: `docker compose exec backend python manage.py migrate` ---- +```bash +docker compose logs backend | grep -iE "error|exception|traceback|migration" +docker compose exec backend python manage.py migrate +``` + +Common causes are not enough RAM, Postgres not healthy, failed migrations, or changed database 
credentials after the Postgres volume was initialized. **`FATAL: password authentication failed for user "futureagi"`** -`PG_PASSWORD` was changed after the Postgres volume was initialized. Postgres sets the password only on first boot. -- Option 1: revert `PG_PASSWORD` to the original value -- Option 2 (data loss): `docker compose down -v && docker compose up -d` ---- +Postgres applies `POSTGRES_PASSWORD` only on first volume initialization. If you changed `PG_PASSWORD` after first boot, either restore the original password or recreate the volume: -**`code-executor` crashes with `clone: Operation not permitted`** -Host platform blocks `privileged: true`. Won't work on Fargate, Cloud Run, or restricted Kubernetes. Use EC2, GCE, or bare metal. The rest of the stack runs — only code-based eval features are unavailable. +```bash +docker compose down -v +docker compose up -d +``` ---- + +`docker compose down -v` deletes local data. + + +## Frontend and API -## Network and UI errors +**Frontend loads but API calls fail** + +`VITE_HOST_API` is wrong or stale in the frontend container environment. -**Frontend blank page or CORS errors** -`VITE_HOST_API` in `.env` doesn't match the current backend URL. Rebuild: ```bash -docker compose build --no-cache frontend -docker compose up -d frontend +grep VITE_HOST_API .env +docker compose up -d --force-recreate frontend ``` ---- +**Browser shows CORS or mixed-content errors** -**API calls fail with 502** -Backend isn't healthy. Check: `docker compose logs backend` and `docker compose ps backend`. +Use HTTPS consistently. If the frontend is served over `https://app.yourcompany.com`, set: ---- +```bash +VITE_HOST_API=https://api.yourcompany.com +``` + +Then recreate the frontend container. + +## AgentCC Gateway -## PeerDB errors +**AgentCC Gateway health check fails** -**Mirrors show "not started" or don't appear** -PeerDB init ran before Django migrations completed. 
Fix:
```bash
-docker compose logs -f backend  # wait for "Application startup complete"
-docker compose run --rm peerdb-init bash /setup.sh
+docker compose ps agentcc-gateway
+docker compose logs --tail=200 agentcc-gateway
+curl http://localhost:8090/healthz
```
-Verify at [http://localhost:3001](http://localhost:3001) — mirrors should show `running`.
----
+If the container is restarting, check the `config.yaml` syntax and the mounted path.
+
+**Model call returns `401 unauthorized`**
+
+There are two different keys:
+
+| Key | Used for |
+|---|---|
+| `AGENTCC_INTERNAL_API_KEY` | Backend-to-AgentCC Gateway internal calls |
+| AgentCC Gateway virtual key in `config.yaml` | External app calls to AgentCC Gateway |
+
+Check the `Authorization: Bearer ...` header and the `auth.keys` section in AgentCC Gateway config.
+
+**Provider returns `401` or `403`**
+
+The upstream provider rejected the credential. Check `.env` and confirm that the AgentCC Gateway container received it:
-**Analytics data is stale**
-PeerDB replication has fallen behind. Check mirror lag in the PeerDB UI. Re-run init if a mirror shows an error:
```bash
-docker compose run --rm peerdb-init bash /setup.sh
+docker compose exec agentcc-gateway env | grep -E "OPENAI|ANTHROPIC|GOOGLE|AWS"
```
----
+**Ollama or vLLM connection refused**
+
+From inside a container, `localhost` refers to the AgentCC Gateway container itself, not the host. Use `host.docker.internal` on Docker Desktop or a private network address on Linux:
-## Temporal errors
+```yaml
+base_url: "http://host.docker.internal:11434/v1"
+```
-**`temporal-server` keeps restarting**
-Almost always a Postgres issue. Check: `docker compose logs postgres`. If Postgres is OOM-killing, raise Docker RAM to ≥ 8 GB. 
If Postgres is healthy: `docker compose restart postgres temporal`
+**Model not found**
----
+Add the model to the provider's `models` list or confirm that provider auto-discovery works:
+
+```bash
+docker compose logs agentcc-gateway | grep -i model
+```
+
+## ClickHouse
+
+**ClickHouse is OOM-killed**
+
+Increase Docker memory or reduce the workload. ClickHouse needs SSD-backed disk and enough page cache for trace-heavy deployments.
+
+## Temporal and workers
+
+**Temporal keeps restarting**
+
+Temporal depends on Postgres:
+
+```bash
+docker compose logs --tail=200 postgres
+docker compose logs --tail=200 temporal
+docker compose restart postgres temporal
+```
+
+**Jobs are stuck or evaluations never finish**
+
+```bash
+docker compose logs --tail=200 worker
+docker compose ps worker
+```
+
+If workers are healthy but overloaded, increase the concurrency limits in `.env`:
+
+```bash
+TEMPORAL_MAX_CONCURRENT_ACTIVITIES=100
+TEMPORAL_MAX_CONCURRENT_WORKFLOW_TASKS=100
+```
+
+Then recreate workers:
+
+```bash
+docker compose up -d --force-recreate worker
+```
+
+## Code executor
+
+**`code-executor` crashes with `clone: Operation not permitted`**
+
+The host platform blocks privileged containers. Code execution uses sandboxing that requires `privileged: true`.
-## After an upgrade
+Supported today: EC2, GCE, Azure VM, and bare metal hosts that allow privileged containers.
+
+Kubernetes and Helm chart deployment guides are coming soon. If you run Future AGI on Kubernetes before the official charts are available, the code executor still needs nodes that allow privileged pods.
+
+Not supported: ECS Fargate, Cloud Run, and many restricted PaaS platforms.
+
+The rest of Future AGI can still run, but eval features that execute user code will be unavailable.
+
+## Upgrades
**Migration fails after `git pull`**
+
```bash
docker compose exec backend python manage.py migrate
+docker compose logs --tail=200 backend
```
-If a conflict persists, check the release notes for manual steps. 
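+
+If the migrate command exits cleanly but the backend still fails, check whether any migrations remain unapplied. This is a diagnostic sketch using Django's standard `showmigrations` command (the `backend` service name assumes the stock Compose file); unapplied migrations are marked with an empty `[ ]` checkbox:
+
+```bash
+docker compose exec backend python manage.py showmigrations | grep '\[ \]'
+```
+
+No output here means every migration has been applied and the failure lies elsewhere.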
-**Everything worked before the upgrade, now it doesn't**
+**Rollback needed**
+
+If migrations are backward-compatible:
+
```bash
-git log --oneline -5
-git checkout
+git log --oneline -10
+git checkout <previous-release>
+docker compose pull
-docker compose build && docker compose up -d
+docker compose up -d
```
----
+If migrations changed data irreversibly, restore from backup.
-## Still stuck?
+## Collect logs for support
-Open an issue at [github.com/future-agi/future-agi/issues](https://github.com/future-agi/future-agi/issues) with:
```bash
-docker compose logs > all-logs.txt 2>&1
-docker compose ps >> all-logs.txt
+docker compose ps > futureagi-debug.txt
+docker compose logs --no-color >> futureagi-debug.txt 2>&1
```
+Include your Compose command, host OS, Docker version, available memory, and the commit or release you are running. Do not include `.env`, provider keys, or AgentCC Gateway virtual keys.
+
## Next Steps

-  Verify your platform and resources meet the minimums.
+
+  Verify the standard install and health check flow.

-  Hardening, backups, and monitoring once the stack is stable.
+  Harden the deployment after it is stable.

diff --git a/src/pages/docs/self-hosting/user-management.mdx b/src/pages/docs/self-hosting/user-management.mdx
index a5008ddb..2b62818e 100644
--- a/src/pages/docs/self-hosting/user-management.mdx
+++ b/src/pages/docs/self-hosting/user-management.mdx
@@ -1,15 +1,47 @@
---
title: "Self-Hosting User Management"
-description: "Create user accounts, reset passwords, and manage roles in self-hosted Future AGI — via Mailgun email flow or directly through the Django admin shell."
+description: "Create user accounts, reset passwords, and manage roles in self-hosted Future AGI — through the installer, create_user command, Mailgun email flow, or Django admin."
---

## About

-Create accounts, reset passwords, and manage roles. The email-based sign-up flow needs Mailgun; without it, the Django shell is the fastest path to a first user. 
+Create accounts, reset passwords, and manage roles. The installer prompts for the first user by default. The email-based sign-up flow needs Mailgun; without it, use the `create_user` management command.

## Create your first user

-### With Mailgun (recommended)
+### During install
+
+`./bin/install` prompts for the first user after the backend health check passes.
+
+For unattended installs, pass credentials through environment variables:
+
+```bash
+FAGI_ADMIN_EMAIL=you@example.com \
+FAGI_ADMIN_NAME="Your Name" \
+FAGI_ADMIN_PASSWORD="change-this-password" \
+./bin/install -y
+```
+
+### After install
+
+If you used `./bin/install --skip-user-creation`, create the user later:

+```bash
+docker compose exec backend python manage.py create_user
+```
+
+Non-interactive:
+
+```bash
+docker compose exec backend python manage.py create_user \
+  --email you@example.com \
+  --name "Your Name" \
+  --password "change-this-password"
+```
+
+Log in at [http://localhost:3000](http://localhost:3000) with those credentials.
+
+### With Mailgun

Set these in `.env` and restart the backend:

@@ -25,18 +57,6 @@
docker compose restart backend

Then sign up via [http://localhost:3000](http://localhost:3000).

-### Without Mailgun — Django shell
-
-```bash
-docker compose exec backend python manage.py shell -c "
-from django.contrib.auth.hashers import make_password
-from accounts.models import User
-User.objects.create(email='you@example.com', password=make_password('your-password'))
-"
-```
-
-Log in at [http://localhost:3000](http://localhost:3000) with those credentials.
-
## Superuser

```bash