A standalone PII redaction API wrapping OpenAI's privacy-filter model. POST text, get back redacted text with typed, numbered placeholders plus metadata for each detected span.
```bash
uv sync
cp .env.example .env
DEV_MODE=true uv run python server.py
```

The first request triggers a model download (~1.5GB) from HuggingFace, which is then cached locally.
```bash
curl -X POST http://localhost:8000/redact \
  -H "Content-Type: application/json" \
  -d '{"text": "My name is Alice Smith, email alice@smith.com, phone 555-0123."}'
```

```json
{
  "redacted_text": "My name is <PRIVATE_PERSON_1>, email <PRIVATE_EMAIL_1>, phone <PRIVATE_PHONE_1>.",
  "spans": [
    { "label": "PRIVATE_PERSON", "id": 1, "text": " Alice Smith", "start": 10, "end": 22, "score": 0.999999 },
    { "label": "PRIVATE_EMAIL", "id": 1, "text": " alice@smith.com", "start": 31, "end": 48, "score": 0.999999 },
    { "label": "PRIVATE_PHONE", "id": 1, "text": " 555-0123", "start": 56, "end": 65, "score": 0.999998 }
  ],
  "summary": { "total_spans": 3, "by_label": { "PRIVATE_PERSON": 1, "PRIVATE_EMAIL": 1, "PRIVATE_PHONE": 1 } }
}
```

| Method | Path | Auth | Description |
|---|---|---|---|
| GET | /health | No | Health check |
| POST | /redact | Optional | Redact PII from text |
`POST /redact` accepts `{ "text": "..." }` (1–128,000 chars, no extra fields) and returns a `RedactResponse` with the redacted text, detected spans, and a summary.
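The 1–128,000 character constraint can be enforced client-side before calling the API. A minimal sketch; the helper name and `MAX_CHARS` constant are illustrative, not part of the API:

```python
import json

# Documented /redact constraint: "text" must be 1-128,000 characters,
# with no extra fields in the request body.
MAX_CHARS = 128_000

def build_redact_payload(text: str) -> str:
    """Validate and serialize a /redact request body (illustrative helper)."""
    if not 1 <= len(text) <= MAX_CHARS:
        raise ValueError(f"text must be 1-{MAX_CHARS} characters, got {len(text)}")
    return json.dumps({"text": text})
```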
All config via env vars (see `.env.example`):

| Variable | Default | Description |
|---|---|---|
| DEVICE | cpu | Inference device: cpu or cuda |
| HOST | 0.0.0.0 | Server bind address |
| PORT | 8000 | Server port |
| DEV_MODE | false | Debug logging |
| PRETTY_JSON | false | 2-space indented JSON responses |
| AUTH_KEYS | (disabled) | Comma-separated Bearer tokens |
| RATE_LIMIT | 60/minute | Per-IP rate limit |
| CORS_ORIGINS | (disabled) | Comma-separated allowed origins |
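A populated `.env` for this table might look like the following (all values are illustrative, including the tokens and origin):

```bash
DEVICE=cpu
HOST=0.0.0.0
PORT=8000
DEV_MODE=false
PRETTY_JSON=true
AUTH_KEYS=token-one,token-two
RATE_LIMIT=60/minute
CORS_ORIGINS=https://app.example.com
```

Leaving AUTH_KEYS and CORS_ORIGINS unset keeps auth and CORS disabled, per the defaults above.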
Supported labels: `account_number`, `private_address`, `private_email`, `private_person`, `private_phone`, `private_url`, `private_date`, `secret`
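As the example response shows, the placeholders in `redacted_text` combine the uppercased label with a per-label counter (`<PRIVATE_PERSON_1>`, `<PRIVATE_EMAIL_1>`, ...). A sketch of that numbering scheme, assuming counters increment per label in order of appearance; the function itself is illustrative:

```python
from collections import defaultdict

def placeholders(labels):
    """Assign <LABEL_n> placeholders, numbering each label independently."""
    counts = defaultdict(int)
    out = []
    for label in labels:
        counts[label] += 1  # per-label counter, not a global one
        out.append(f"<{label.upper()}_{counts[label]}>")
    return out
```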
```bash
docker build -t privacy-filter .
docker run -p 8000:8000 --env-file .env privacy-filter
```

The Dockerfile uses CPU-only PyTorch (~200MB vs ~2.5GB with CUDA) and pre-downloads the model during build.
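The same image can also be run via Docker Compose; this is an illustrative compose file, not one shipped with the repo:

```yaml
services:
  privacy-filter:
    build: .
    ports:
      - "8000:8000"
    env_file:
      - .env
```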