Enterprise LLM proxy built on LiteLLM — unified access, logging, and guardrails for AI coding tools.
Airlock sits between your developers and LLM providers, giving you visibility and control without slowing anyone down.
┌──────────┐   ┌──────────┐   ┌──────────┐
│  Cursor  │   │  Claude  │   │ Copilot  │
│          │   │   Code   │   │          │
└────┬─────┘   └────┬─────┘   └────┬─────┘
     │              │              │
     └──────────────┼──────────────┘
                    │
              ┌─────▼──────┐
              │  AIRLOCK   │  ← logging, PII guard, keyword guard
              │ (LiteLLM)  │
              └─────┬──────┘
                    │
          ┌─────────┼──────────┐
          │         │          │
     ┌────▼────┐ ┌──▼─────┐ ┌──▼──────┐
     │Anthropic│ │ OpenAI │ │ Internal│
     │   API   │ │  API   │ │   RAG   │
     └─────────┘ └────────┘ └─────────┘
| Concern | How Airlock handles it |
|---|---|
| Unified access | Single OpenAI-compatible endpoint for all providers |
| Logging | Every request/response logged as structured JSONL |
| PII stripping | Microsoft Presidio scrubs credit cards, SSNs, emails, etc. before they leave the network |
| Keyword blocking | Custom blocklist prevents restricted project names or terms from leaking |
| Budget control | Per-user/per-team spend limits via LiteLLM virtual keys |
| Multi-tool support | Works with Cursor, Claude Code, GitHub Copilot, and any OpenAI-compatible client |
| Self-hosted models | Route to local vLLM, Ollama, or any OpenAI-compatible endpoint alongside cloud providers |
| Interactive testing | Built-in Basic Chat screen to test LLM connectivity and inspect full request/response cycles |
| AI advisor | Ask an LLM about operational data — diagnose errors, tune guardrails, get config recommendations (local models preferred) |
pip install airlock-llm
python -m spacy download en_core_web_lg # required for PII redaction
airlock init

airlock init generates config.yaml, .env, and a logs/ directory in the
current working directory.
git clone https://github.com/coreyt/airlock && cd airlock
./scripts/setup.sh

This installs Airlock and its dependencies, downloads the spaCy model for PII
redaction, and runs airlock init. Pass --pip to use pip instead of uv.
git clone https://github.com/coreyt/airlock && cd airlock
./scripts/setup-dev.sh

This runs everything in the standard setup, plus all optional extras (test, metrics,
tracing, search, s3, sql), install verification, and a test suite run. Pass
--pip to use pip instead of uv.
Edit the generated .env file and fill in your provider keys:
# .env
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...

You only need keys for the providers you plan to use. If you only use Anthropic models, you can leave OPENAI_API_KEY blank.
# Option A: TUI dashboard with built-in proxy (recommended)
uv run airlock tui --start
# Option B: proxy only (headless)
uv run airlock start

Airlock listens on http://localhost:4000 by default. Change the port with AIRLOCK_PORT in .env.
Recommended local startup profile:
AIRLOCK_STARTUP_MODEL_DISCOVERY=0
AIRLOCK_MCP_STARTUP_MODE=lazy
uv run airlock start

curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

If AIRLOCK_MASTER_KEY is set, add -H "Authorization: Bearer <your-master-key>".
If AIRLOCK_MASTER_KEY is unset or blank, Airlock strips the runtime proxy
master_key setting and accepts unauthenticated requests for local/dev use.
Or use the TUI's Basic Chat screen (press 5) to interactively test any configured model and inspect the full request/response headers and body.
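
The same request can be issued from Python. The sketch below uses only the standard library and assumes the default address and the claude-sonnet model name from the curl example. The helper names build_chat_request and send_chat_request are illustrative and not part of Airlock.

```python
import json
import urllib.request


def build_chat_request(base_url, model, prompt, master_key=None):
    """Build (but do not send) a POST to the OpenAI-compatible chat endpoint."""
    headers = {"Content-Type": "application/json"}
    if master_key:  # unset or blank key: unauthenticated local/dev request
        headers["Authorization"] = f"Bearer {master_key}"
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/chat/completions",
        data=body,
        headers=headers,
        method="POST",
    )


def send_chat_request(req, timeout=30):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

With the proxy running, `send_chat_request(build_chat_request("http://localhost:4000", "claude-sonnet", "Hello!"))` should return the provider's response as a dictionary. Pass `master_key=os.environ.get("AIRLOCK_MASTER_KEY")` when authentication is enabled.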
Ask an LLM about Airlock's operational data — diagnose errors, tune guardrails, understand trends:
# One-shot question
airlock advise "why does claude-sonnet have a high error rate?"
# Interactive session
airlock advise --interactive
# Force local model only (no data sent externally)
airlock advise --local-only "what should I tune?"

Or press 6 in the TUI for the Advisor screen. The advisor prefers local models (vLLM, Ollama) to avoid sending operational data to remote providers.
docker compose up --build

Point any OpenAI-compatible client at http://localhost:4000 (or your deployed Airlock URL).
# Install client-side hooks and route traffic through the proxy
airlock hooks install
eval $(airlock dogfood)
claude

Every request now flows through PII redaction, keyword blocking, and JSONL logging. Open airlock tui in another terminal to watch traffic in real time.
See dev/dogfooding.md for the full setup guide.
In settings, set:
- OpenAI Base URL: http://localhost:4000/v1
- API Key: your Airlock master key (from .env)
In VS Code settings.json:
{
  "github.copilot.advanced": {
    "debug.overrideProxyUrl": "http://localhost:4000/v1"
  }
}

The main configuration file defines models, callbacks, and guardrails. See the inline comments in config.yaml for details.
Key sections:
- model_list: which LLM providers/models to expose
- litellm_settings: callbacks, timeouts, budgets
- router_settings: routing strategy, fallbacks, provider budgets
- guardrails: PII and keyword guards
- mcp_servers: MCP tool servers (Armada, ADO, etc.) accessible via the proxy
- general_settings: master key, host/port
Airlock supports any OpenAI-compatible endpoint (vLLM, Ollama, LocalAI, etc.) using the openai/ prefix with a custom api_base:
# config.yaml (add to model_list)
- model_name: gemma-4
  litellm_params:
    model: openai/gemma4-31b # model ID as reported by the server
    api_base: http://your-host:8000/v1
    api_key: os.environ/VLLM_API_KEY # use "dummy-key" if server has no auth

# .env
VLLM_API_KEY=dummy-key

The model will appear in the TUI Basic Chat screen for interactive testing and can be used by any connected client via model: "gemma-4".
Airlock can expose logical aliases that inject prompt and parameter defaults while forwarding to a physical upstream model. Example: gemini-coding routes to Gemini tools mode through enhanced/gemini-coding.
- Clients send model: "gemini-coding".
- Airlock injects the configured system prompt and normalizes Gemini reasoning settings.
- Provider auth is forwarded to the physical model automatically.
- Inner forwarded calls are marked no_log=True, so Airlock and Fathom log one row per logical request, not two.
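
The alias flow can be pictured with a small sketch. This is not Airlock's implementation: the ALIASES table, the resolve_alias function, the placeholder physical model name, and the placeholder system prompt are all assumptions made for illustration. So is the choice to inject the prompt only when the client sent no system message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alias:
    physical_model: str  # upstream model the request is forwarded to
    system_prompt: str   # default prompt injected for this alias


# Hypothetical alias table; the real mapping lives in Airlock's configuration.
ALIASES = {
    "gemini-coding": Alias(
        physical_model="provider/physical-model",  # placeholder name
        system_prompt="You are a careful coding assistant.",  # placeholder text
    ),
}


def resolve_alias(request: dict) -> dict:
    """Return the request to forward upstream, leaving the input untouched."""
    alias = ALIASES.get(request.get("model"))
    if alias is None:
        return request  # not an alias: forward as-is
    messages = list(request.get("messages", []))
    # Assumption: inject the default prompt only if the client sent none.
    if not any(m.get("role") == "system" for m in messages):
        messages.insert(0, {"role": "system", "content": alias.system_prompt})
    return {
        **request,
        "model": alias.physical_model,
        "messages": messages,
        "no_log": True,  # inner call is skipped by logging; the outer request is logged
    }
```

Because the inner call carries no_log=True, only the outer logical request produces a log row, which matches the one-row-per-request behaviour described in the list.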
| Variable | Description | Default |
|---|---|---|
| ANTHROPIC_API_KEY | Anthropic API key | - |
| OPENAI_API_KEY | OpenAI API key | - |
| GOOGLE_AISTUDIO_API_KEY | Google AI Studio API key for Gemini models | - |
| AIRLOCK_MASTER_KEY | Optional proxy auth key. Leave unset for local/dev unauthenticated runs. | - |
| AIRLOCK_HOST | Bind address (set to 0.0.0.0 to expose externally) | 127.0.0.1 |
| AIRLOCK_PORT | Listen port | 4000 |
| AIRLOCK_LOG_DIR | Directory for JSONL log files | ./logs |
| AIRLOCK_STATE_DIR | State directory for circuit-breaker state and optional FathomDB files | ./logs |
| AIRLOCK_MAX_LOG_DAYS | Days to retain log files before cleanup | 30 |
| AIRLOCK_MAX_LOG_SIZE_MB | Max log file size before rotation | 500 |
| AIRLOCK_BLOCKED_KEYWORDS | Comma-separated restricted phrases | - |
| AIRLOCK_PII_ENTITIES | Presidio entity types to redact | CREDIT_CARD,US_SSN,EMAIL_ADDRESS,PHONE_NUMBER |
| AIRLOCK_STARTUP_MODEL_DISCOVERY | Opt-in provider/model discovery at startup | 0 |
| AIRLOCK_MCP_STARTUP_MODE | MCP startup mode: off, lazy, or eager | lazy |
| AIRLOCK_ENABLE_FATHOMDB | Enable lazy FathomDB engine initialization | 0 |
| AIRLOCK_ENABLE_FATHOM_LOGGER | Append Fathom request logging at runtime | 0 |
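
To make the defaults concrete, here is a minimal sketch of reading a few of these variables in Python. The Settings class and load_settings function are hypothetical names, not Airlock's API. Only the variable names, defaults, and allowed MCP modes come from the table above.

```python
import os
from dataclasses import dataclass

DEFAULT_PII_ENTITIES = "CREDIT_CARD,US_SSN,EMAIL_ADDRESS,PHONE_NUMBER"


def _csv(raw: str) -> list[str]:
    """Split a comma-separated value, dropping empty items."""
    return [item.strip() for item in raw.split(",") if item.strip()]


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    log_dir: str
    max_log_days: int
    blocked_keywords: list[str]
    pii_entities: list[str]
    mcp_startup_mode: str


def load_settings(env=None) -> Settings:
    """Read Airlock-style variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    mode = env.get("AIRLOCK_MCP_STARTUP_MODE", "lazy")
    if mode not in {"off", "lazy", "eager"}:
        raise ValueError(
            f"AIRLOCK_MCP_STARTUP_MODE must be off, lazy, or eager; got {mode!r}"
        )
    return Settings(
        host=env.get("AIRLOCK_HOST", "127.0.0.1"),
        port=int(env.get("AIRLOCK_PORT", "4000")),
        log_dir=env.get("AIRLOCK_LOG_DIR", "./logs"),
        max_log_days=int(env.get("AIRLOCK_MAX_LOG_DAYS", "30")),
        blocked_keywords=_csv(env.get("AIRLOCK_BLOCKED_KEYWORDS", "")),
        pii_entities=_csv(env.get("AIRLOCK_PII_ENTITIES", DEFAULT_PII_ENTITIES)),
        mcp_startup_mode=mode,
    )
```

Calling `load_settings({})` yields the defaults from the table. Passing a dictionary instead of relying on os.environ keeps the function easy to test.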
Airlock can proxy MCP tool servers alongside LLM providers. Add entries to mcp_servers in config.yaml. LiteLLM spawns stdio servers from the proxy's working directory, so command resolution matters.
Module via python -m — cwd-independent, requires package installed in the proxy's venv:
mcp_servers:
  ado_mcp:
    command: uv
    args: ["run", "python", "-m", "ado_mcp.mcp.server"]
    env:
      ADO_ORG_URL: os.environ/ADO_ORG_URL
      ADO_PAT: os.environ/ADO_PAT

Installed script via uv run: cwd-independent, resolves from PATH/venv:

  armada:
    command: uv
    args: ["run", "armada-mcp"]
    env:
      ARMADA_PROFILE: essential

Script file: must use an absolute path (relative paths resolve against the proxy's cwd, not the server's project directory):

  mono_tui:
    command: python3
    args: ["/home/user/projects/my-mcp-server/server.py"]

Other runtimes:

  # Node.js
  my_node_server:
    command: node
    args: ["/path/to/server.js"]
  # npx (installed package)
  my_npx_server:
    command: npx
    args: ["my-mcp-server"]
  # Bun
  my_bun_server:
    command: bun
    args: ["run", "/path/to/server.ts"]
  # Poetry
  my_poetry_server:
    command: poetry
    args: ["run", "python", "-m", "my_server"]

Use os.environ/VAR_NAME to pass environment variables from Airlock's .env to the MCP server. Airlock validates these references at startup and gives clear error messages for missing values.
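
The os.environ/VAR_NAME convention can be sketched as follows. This is an illustration rather than Airlock's actual validator: the resolve_env_refs function and the wording of the error message are assumptions. The sketch also assumes that a blank value counts as missing.

```python
import os

PREFIX = "os.environ/"


def resolve_env_refs(server_name, env_block, environ=None):
    """Replace os.environ/VAR values with real values; fail loudly on missing ones."""
    environ = os.environ if environ is None else environ
    resolved, missing = {}, []
    for key, value in env_block.items():
        if isinstance(value, str) and value.startswith(PREFIX):
            var = value[len(PREFIX):]
            if environ.get(var):  # assumption: blank counts as missing
                resolved[key] = environ[var]
            else:
                missing.append(var)
        else:
            resolved[key] = value  # literal value, e.g. ARMADA_PROFILE: essential
    if missing:
        raise ValueError(
            f"mcp_servers.{server_name}: missing environment variable(s): "
            + ", ".join(sorted(missing))
        )
    return resolved
```

Collecting every missing variable before raising means a single startup failure reports all of them at once.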
All MCP tool calls flow through the same guardrail pipeline as LLM requests (PII redaction, keyword blocking, threat detection). MCP-specific guards add tool allowlist/blocklist and argument sanitization. No extra configuration needed — guardrails apply automatically.
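
As a rough picture of the tool allowlist/blocklist idea, the sketch below decides whether a tool call may proceed. The function name, the tool names in the usage note, and the rule that the blocklist wins over the allowlist are assumptions for illustration. Airlock's MCP guards may apply different precedence.

```python
def is_tool_allowed(tool, allowlist=None, blocklist=frozenset()):
    """Return True if the named tool may be called.

    Assumed semantics: a blocklisted tool is always rejected; when an
    allowlist is given, only tools on it are accepted; with no allowlist,
    everything not blocklisted is accepted.
    """
    if tool in blocklist:
        return False
    if allowlist is None:
        return True
    return tool in allowlist
```

For example, `is_tool_allowed("delete_repo", blocklist={"delete_repo"})` returns False, while `is_tool_allowed("list_items")` returns True.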
airlock/
├── proxy.py # Entry point — launches LiteLLM subprocess
├── callbacks/ # JSONL logger, S3, SQL, Prometheus, OpenTelemetry
├── guardrails/ # PII redaction, keyword blocking, semantic, adaptive
├── fast/ # Real-time: threat detection, circuit breaker, priority
├── slow/ # Offline: log analysis, trend detection, tuning
├── hooks/ # Claude Code client-side hooks (session, prompt, audit)
├── advisor/ # LLM-powered operational advisor (agent loop, tools, proposals)
├── cli/ # Unified CLI: init, start, status, tui, analyze, advise, hooks
└── tui/ # Textual terminal dashboard (6 screens, proxy control)
scripts/
├── setup.sh # Standard setup (install + init + spaCy model)
└── setup-dev.sh # Developer setup (all extras + tests)
See docs/operations.md for deployment guides (Docker, Kubernetes, bare metal), monitoring, security checklist, and upgrade procedures.
See docs/troubleshooting.md for common issues and debugging.
Apache 2.0