A self-hosted bridge that turns subscription-based image generators into a programmable endpoint.
Quick Start · How It Works · Endpoints · Configuration · Build Journal
You pay for an unlimited image generation subscription. The platform doesn't expose a public API. The only way to use it programmatically is to drive the web UI.
You try the obvious approach. Headless Playwright. The site detects automation in milliseconds. You see a paywall instead of generated images, even though you're logged in with an active subscription.
You add cookies. You try Selenium. You try fingerprint patches. The platform fingerprints the browser, the IP, the timing. Every layer leaks a different signal.
pixelpipe wraps the entire stack — anti-detection browser, residential proxy, CAPTCHA solver, automated email verification — into a single HTTP API. POST a prompt, get back a URL.
The trick isn't bypassing one defense. It's bypassing all of them at once, every time, without breaking when one slips.
Patchright drives a real Chromium browser inside Xvfb so it never runs headless. CapSolver handles the reCAPTCHA on first login. Gmail IMAP reads the verification code that gets emailed. A residential proxy carries every request. The session persists in a Chromium profile mounted as a Docker volume.
git clone https://github.com/numarulunu/pixelpipe.git
cd pixelpipe
cp .env.example .env
# fill in .env with your credentials
docker compose up -d

Then call it:
curl -X POST http://localhost:9100/generate/image \
-H "Content-Type: application/json" \
-d '{"prompt":"A medieval castle at sunset","model":"Auto","aspect_ratio":"16:9"}'

{
"status": "success",
"images": [
{"url": "https://cdn.example.com/...", "index": 0}
],
"model_used": "Auto",
"generation_time_seconds": 12.3
}

flowchart TD
A["n8n / your script"] -->|"POST /generate/image"| B["pixelpipe API"]
B --> C{"Session valid?"}
C -->|Yes| G["Drive the page"]
C -->|No| D["Auto-login flow"]
D --> D1["Fill credentials"]
D1 --> D2["CapSolver solves reCAPTCHA"]
D2 --> D3["Gmail IMAP reads code"]
D3 --> D4["Submit verification"]
D4 --> G
G --> G1["Select model"]
G1 --> G2["Set aspect ratio"]
G1 --> G3["Set image count"]
G2 --> G4["Type prompt"]
G3 --> G4
G4 --> G5["Click Generate"]
G5 --> H["Poll for new CDN URLs"]
H --> I["Return JSON"]
I --> A
style A fill:#1a1a2e,stroke:#9333ea,color:#fff
style B fill:#1a1a2e,stroke:#0ea5e9,color:#fff
style D fill:#2d1a1a,stroke:#e94560,color:#fff
style I fill:#1a2e1a,stroke:#22c55e,color:#fff
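The last step in the flow above, polling for new CDN URLs, can be sketched as a diff against the URLs that were on the page before Generate was clicked. This is a minimal illustration, not the code in app/generator.py: `wait_for_new_urls`, the `read_urls` callback, and the timeout values are all hypothetical names and defaults chosen for the example.

```python
import time
from typing import Callable, Iterable, List, Set


def wait_for_new_urls(
    read_urls: Callable[[], Iterable[str]],
    baseline: Set[str],
    expected: int = 1,
    timeout_s: float = 120.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll until at least `expected` URLs appear that were not in `baseline`.

    `read_urls` is a hypothetical callback that scrapes the image URLs
    currently visible on the page.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        # Deduplicate while keeping page order, then drop pre-existing URLs.
        fresh = [u for u in dict.fromkeys(read_urls()) if u not in baseline]
        if len(fresh) >= expected:
            return fresh
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"only {len(fresh)} of {expected} new images appeared"
            )
        sleep(interval_s)
```

Taking the baseline snapshot before clicking Generate is what keeps images from earlier requests out of the response.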
| Layer | What it does | Why it's there |
|---|---|---|
| Patchright | Drop-in Playwright fork that patches CDP automation leaks | Vanilla Playwright leaks navigator.webdriver = true and is detected instantly |
| Xvfb + headless=False | Virtual display so Chromium runs in "headed" mode inside Docker | Headless mode has its own fingerprint that even Patchright can't hide |
| Residential proxy | All traffic routes through a residential IP | The target site allows page loads from datacenter IPs but blocks generation. Static residential, ~$2/month |
| CapSolver | Solves the reCAPTCHA v2 on the login page | Automated logins always trigger a captcha. Charges ~$0.003 per solve |
| Gmail IMAP | Reads the verification code emailed on first login from a new browser | The site emails a 6-digit code on every new browser fingerprint. Bot reads it from your inbox in seconds |
Drop any one of these and the bot fails in a different way. All five together make the round trip invisible.
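As a rough sketch of how the first three layers could be wired together, the snippet below builds the launch arguments for a persistent, headed Chromium context. It assumes Patchright's Python package mirrors Playwright's `launch_persistent_context` signature, and that Xvfb is already running with `DISPLAY` pointing at it. The function names and the default profile path are illustrative, not taken from app/browser.py.

```python
from typing import Any, Dict, Optional


def build_launch_kwargs(
    profile_dir: str, proxy: Optional[Dict[str, str]] = None
) -> Dict[str, Any]:
    """Keyword arguments for a Playwright-style launch_persistent_context call."""
    kwargs: Dict[str, Any] = {
        "user_data_dir": profile_dir,  # persisted profile keeps the login session
        "headless": False,             # headed mode; Xvfb supplies the display
        "no_viewport": True,           # let the window decide the page size
    }
    if proxy:
        # Expected shape: {"server": "http://host:port", "username": ..., "password": ...}
        kwargs["proxy"] = proxy
    return kwargs


def launch(profile_dir: str = "/data/patchright-profile", proxy=None):
    """Start Chromium through Patchright (assumed path and defaults)."""
    # Imported lazily so build_launch_kwargs works without Patchright installed.
    from patchright.sync_api import sync_playwright

    playwright = sync_playwright().start()
    context = playwright.chromium.launch_persistent_context(
        **build_launch_kwargs(profile_dir, proxy)
    )
    return playwright, context
```

The caller is responsible for closing the context and stopping the returned `playwright` object on shutdown.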
flowchart LR
P1["1. Site detects Playwright"] --> P2["2. Headless leaks fingerprint"]
P2 --> P3["3. Datacenter IP blocked"]
P3 --> P4["4. Login triggers reCAPTCHA"]
P4 --> P5["5. New browser triggers email verification"]
P5 --> P6["6. Session expires every 1-2 weeks"]
style P1 fill:#2d1a1a,stroke:#e94560,color:#fff
style P6 fill:#1a2e1a,stroke:#22c55e,color:#fff
| Symptom | Root cause | Fix |
|---|---|---|
| "Sign up for free" modal even when logged in | Site fingerprints Playwright via CDP commands | Switch to Patchright (Playwright fork that patches CDP leaks) |
| 403 on page load from server | Datacenter IP detection | Route through residential proxy |
| Login form shows but generation triggers paywall | Headless browser fingerprint | Run Chromium with headless=False inside Xvfb virtual display |
| Login submit just reloads the page | reCAPTCHA invisible challenge | CapSolver posts the token, bot calls form.submit() |
| Login redirects to verify-account page | Email-based MFA on new browser | Gmail IMAP reads the latest 6-digit code |
| Bot works once, breaks on next request | Profile not persisted | Persistent Chromium profile mounted as a Docker volume |
The full incident log is in BUILD_JOURNAL.txt — every dead end, every wrong assumption, every fix.
flowchart TB
subgraph host ["Docker host"]
subgraph container ["pixelpipe container"]
X["Xvfb :99\nVirtual display"]
C["Chromium\n(via Patchright)"]
F["FastAPI\nport 9100"]
X -.provides display.-> C
F -->|drives| C
end
V[("/data volume")]
C -.profile.-> V
end
P["Residential proxy"]
CS["CapSolver API"]
G["Gmail IMAP"]
T["Target site"]
C -->|all traffic| P
P --> T
F -->|reCAPTCHA tokens| CS
F -->|verification codes| G
style F fill:#1a1a2e,stroke:#0ea5e9,color:#fff
style C fill:#1a1a2e,stroke:#9333ea,color:#fff
style V fill:#1a2e1a,stroke:#22c55e,color:#fff
Generate an image. Request body:
| Field | Required | Default | Description |
|---|---|---|---|
| prompt | yes | — | Text prompt |
| model | yes | — | Exact model name as shown in the target site UI |
| aspect_ratio | no | 1:1 | 1:1, 16:9, 9:16, 4:3, etc. |
| count | no | 1 | 1 to 4 |
| reference_url | no | — | URL to a reference image (downloaded and uploaded automatically) |
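The request fields above can be exercised from Python with the standard library alone. The sketch below is one possible client, not something shipped with the project: `build_image_request` and `generate_image` are hypothetical helpers, and the client-side checks simply mirror the table (the server may validate differently).

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def build_image_request(
    prompt: str,
    model: str,
    aspect_ratio: str = "1:1",
    count: int = 1,
    reference_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble the JSON body, rejecting values the table rules out."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if not 1 <= count <= 4:
        raise ValueError("count must be between 1 and 4")
    body: Dict[str, Any] = {
        "prompt": prompt,
        "model": model,
        "aspect_ratio": aspect_ratio,
        "count": count,
    }
    if reference_url:
        body["reference_url"] = reference_url
    return body


def generate_image(
    body: Dict[str, Any],
    base_url: str = "http://localhost:9100",
    timeout_s: float = 180.0,
) -> Dict[str, Any]:
    """POST the body to /generate/image and return the parsed JSON response."""
    request = urllib.request.Request(
        f"{base_url}/generate/image",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)
```

A generous timeout is used because a request that triggers a fresh login can take around a minute.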
Response:
{
"status": "success",
"images": [
{"url": "https://cdn.example.com/...", "index": 0}
],
"model_used": "Auto",
"generation_time_seconds": 12.3
}

Same as image, returns video.url instead of images[].
{"status": "ok", "browser_alive": true, "logged_in": true}

Manual verification code entry. Only needed if Gmail IMAP isn't configured.
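When Gmail IMAP is configured, the step this endpoint replaces amounts to pulling a six-digit code out of the newest verification email. The sketch below shows one way to do that with `imaplib`. It is an assumption-laden illustration: the `TEXT "authentication code"` search criterion, the INBOX folder, and both function names are guesses for the example rather than the logic in app/auth.py.

```python
import email
import imaplib
import re
from typing import Optional

# Exactly six digits, not part of a longer number.
CODE_RE = re.compile(r"(?<!\d)(\d{6})(?!\d)")


def extract_code(text: str) -> Optional[str]:
    """Return the first standalone six-digit run in `text`, if any."""
    match = CODE_RE.search(text)
    return match.group(1) if match else None


def fetch_latest_code(user: str, app_password: str) -> Optional[str]:
    """Read the newest matching email from Gmail and extract its code."""
    with imaplib.IMAP4_SSL("imap.gmail.com") as imap:
        imap.login(user, app_password)
        imap.select("INBOX")
        # Assumed search phrase; adjust to the wording of the real email.
        _, data = imap.search(None, '(TEXT "authentication code")')
        ids = data[0].split()
        if not ids:
            return None
        _, parts = imap.fetch(ids[-1], "(RFC822)")  # highest id = newest
        message = email.message_from_bytes(parts[0][1])
        text = str(message.get("Subject", ""))
        for part in message.walk():
            if part.get_content_maintype() == "text":
                payload = part.get_payload(decode=True) or b""
                charset = part.get_content_charset() or "utf-8"
                text += "\n" + payload.decode(charset, errors="replace")
        return extract_code(text)
```

HTML bodies can contain unrelated six-digit runs, so a production version would likely match on the surrounding wording as well.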
Returns a base64 PNG of the current page state. Useful when something breaks and you need to see what the bot sees.
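Turning that base64 string into a viewable file takes a few lines. The helper below is a hypothetical convenience, and it assumes you have already pulled the base64 text out of the response (the exact response field is not documented here).

```python
import base64
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def save_screenshot(b64_png: str, path: str) -> int:
    """Decode a base64 PNG, sanity-check it, and write it to `path`.

    Returns the number of bytes written.
    """
    raw = base64.b64decode(b64_png)
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded payload does not look like a PNG")
    return Path(path).write_bytes(raw)
```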
Inspects the live DOM for data-cy attributes around the generation form. Use this when the target site updates their UI and selectors break.
All configuration is via environment variables. Copy .env.example to .env and fill in:
| Variable | Required | What it does |
|---|---|---|
| FREEPIK_EMAIL | yes | Account email |
| FREEPIK_PASSWORD | yes | Account password |
| CAPSOLVER_API_KEY | recommended | Solves the reCAPTCHA on auto-login. Without it the bot can only use existing cookies |
| PROXY_SERVER | recommended | Format: user:pass@host:port. Required if running from a datacenter IP (any cloud VPS) |
| GMAIL_USER | recommended | Gmail address that receives verification emails |
| GMAIL_APP_PASSWORD | recommended | Gmail app password (16 chars, no spaces), not your account password |
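To make the variable semantics concrete, here is a standard-library stand-in for the settings loader. The project itself uses Pydantic Settings in app/config.py, so treat this as an illustration only: the `Settings` dataclass, `load_settings`, and `proxy_to_playwright` are hypothetical, and the `http://` scheme for the proxy is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Dict, Mapping, Optional


@dataclass(frozen=True)
class Settings:
    freepik_email: str
    freepik_password: str
    capsolver_api_key: Optional[str] = None
    proxy_server: Optional[str] = None
    gmail_user: Optional[str] = None
    gmail_app_password: Optional[str] = None


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the variables from the table; fail fast on the two required ones."""
    missing = [k for k in ("FREEPIK_EMAIL", "FREEPIK_PASSWORD") if not env.get(k)]
    if missing:
        raise RuntimeError("missing required variables: " + ", ".join(missing))
    return Settings(
        freepik_email=env["FREEPIK_EMAIL"],
        freepik_password=env["FREEPIK_PASSWORD"],
        capsolver_api_key=env.get("CAPSOLVER_API_KEY") or None,
        proxy_server=env.get("PROXY_SERVER") or None,
        gmail_user=env.get("GMAIL_USER") or None,
        gmail_app_password=env.get("GMAIL_APP_PASSWORD") or None,
    )


def proxy_to_playwright(proxy_server: str) -> Dict[str, str]:
    """Split user:pass@host:port into a Playwright-style proxy dict."""
    credentials, _, host = proxy_server.rpartition("@")
    proxy = {"server": f"http://{host}"}  # assumed HTTP proxy
    if credentials:
        username, _, password = credentials.partition(":")
        proxy["username"] = username
        proxy["password"] = password
    return proxy
```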
| Service | Where | Cost |
|---|---|---|
| CapSolver | capsolver.com — sign up, top up $5 | ~$0.003 per login |
| Residential proxy | proxycheap.com, iproyal.com, etc. — get a static residential IP | ~$2/month |
| Gmail App Password | myaccount.google.com/apppasswords — requires 2FA enabled | Free |
sequenceDiagram
participant U as Your script
participant B as pixelpipe
participant T as Target site
participant CS as CapSolver
participant GM as Gmail
U->>B: POST /generate/image
B->>T: GET /pikaso (check session)
T-->>B: Sign in button visible
Note over B: Not logged in, start auto-login
B->>T: GET /login
B->>T: Fill email + password
B->>T: Click submit
T-->>B: reCAPTCHA challenge
B->>CS: POST createTask (sitekey + URL)
CS-->>B: taskId
loop poll every 2s
B->>CS: getTaskResult
end
CS-->>B: gRecaptchaResponse token
B->>T: Inject token + submit form
T-->>B: Redirect to verify-account
loop poll every 10s
B->>GM: IMAP search "authentication code"
end
GM-->>B: 6-digit code
B->>T: Fill code + click verify
T-->>B: Redirect to dashboard
Note over B: Session saved to profile volume
B->>T: Drive the generation flow
T-->>B: CDN URLs
B-->>U: JSON response
The first request after a fresh deploy takes ~60 seconds (login + captcha + verify). Every request after that takes ~15-30 seconds because the session is reused from the Chromium profile.
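The createTask / getTaskResult loop in the diagram can be sketched as follows. The endpoint paths and field names follow CapSolver's public API as I understand it, and the `ReCaptchaV2TaskProxyLess` task type is an assumption about which variant the project uses, so check both against the current CapSolver docs. `solve_recaptcha_v2` and the injectable `post` and `sleep` parameters are hypothetical, added so the loop can be tested without network access.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict

API = "https://api.capsolver.com"


def _post(path: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """POST JSON to the CapSolver API and return the parsed reply."""
    request = urllib.request.Request(
        f"{API}/{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def solve_recaptcha_v2(
    api_key: str,
    site_url: str,
    site_key: str,
    post: Callable[[str, Dict[str, Any]], Dict[str, Any]] = _post,
    sleep: Callable[[float], None] = time.sleep,
    interval_s: float = 2.0,
    max_polls: int = 60,
) -> str:
    """Create a task, then poll every `interval_s` seconds for the token."""
    created = post("createTask", {
        "clientKey": api_key,
        "task": {
            "type": "ReCaptchaV2TaskProxyLess",  # assumed task variant
            "websiteURL": site_url,
            "websiteKey": site_key,
        },
    })
    if created.get("errorId"):
        raise RuntimeError(f"createTask failed: {created}")
    task_id = created["taskId"]
    for _ in range(max_polls):
        sleep(interval_s)
        result = post("getTaskResult", {"clientKey": api_key, "taskId": task_id})
        if result.get("errorId"):
            raise RuntimeError(f"getTaskResult failed: {result}")
        if result.get("status") == "ready":
            return result["solution"]["gRecaptchaResponse"]
    raise TimeoutError("captcha was not solved in time")
```

The returned token is what gets injected into the login form before it is submitted.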
The target site invalidates the session every 1-2 weeks. Here's what happens:
flowchart LR
A["Request comes in"] --> B{"Session\nvalid?"}
B -->|"Yes (95% of the time)"| C["Generate immediately"]
B -->|"No"| D["Auto re-login"]
D --> E["CapSolver solves captcha"]
E --> F["Gmail reads verify code"]
F --> C
C --> G["Return URLs"]
style D fill:#2d1a1a,stroke:#e94560,color:#fff
style C fill:#1a2e1a,stroke:#22c55e,color:#fff
Zero human intervention. The bot heals itself.
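The self-healing behaviour boils down to a small wrapper: check the session, log in if needed, then run the job. The sketch below is a simplified model of that control flow. `with_session`, `LoginFailed`, and the retry limit are hypothetical and do not reflect the exact structure of app/auth.py.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class LoginFailed(Exception):
    """Raised when re-login attempts are exhausted."""


def with_session(
    is_logged_in: Callable[[], bool],
    login: Callable[[], None],
    job: Callable[[], T],
    max_login_attempts: int = 2,
) -> T:
    """Run `job`, logging in first whenever the session is not valid."""
    attempts = 0
    while not is_logged_in():
        if attempts >= max_login_attempts:
            raise LoginFailed(f"still logged out after {attempts} attempts")
        attempts += 1
        login()  # captcha + email verification happen in here
    return job()
```

Capping the attempts keeps a broken login flow from looping forever and burning CapSolver credit.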
| Item | Monthly cost |
|---|---|
| Subscription to the target service | whatever you already pay |
| Residential proxy (static) | ~$2 |
| CapSolver credit | ~$0.01 (a few re-logins) |
| Total extra | ~$2/month |
For comparison, the equivalent volume on the official API of similar services would cost $20-100/month depending on usage. This is the bridge that makes a flat-rate UI subscription behave like an unlimited API.
pixelpipe/
app/
main.py FastAPI app — endpoints, lifecycle, request queue
browser.py Patchright browser lifecycle (launch, restart, close)
auth.py Login flow + CapSolver + Gmail IMAP integration
generator.py Image and video generation Playwright flows
config.py Pydantic Settings (env var loading)
models.py Request/response schemas
scripts/
extract_cookies.py One-time manual cookie extraction (fallback)
refresh_session.py Local login + push cookies to remote server
tests/ pytest unit tests for config and models
Dockerfile Python 3.12 + Patchright Chromium + Xvfb + dbus
docker-compose.yml Single service, mounts /data volume
BUILD_JOURNAL.txt Complete dev journal — every failure and fix
.env.example All env vars with explanations
Selectors in app/generator.py and app/auth.py may break. Use the /debug/controls endpoint to find the new selectors:
curl http://localhost:9100/debug/controls | jq

It returns every element with a data-cy attribute on the current page. Find the new attribute name for the broken control, update the relevant locator in code.
docker logs pixelpipe 2>&1 | tail -50

Look for CapSolver, verify-account, or Gmail errors. Each layer logs what it tried and why it failed.
docker compose down
rm -rf data/patchright-profile data/cookies.json
mkdir -p data/patchright-profile
docker compose up -d

The bot will auto-login on the next request.
Will this get me banned?
Possibly. Browser automation against a paid service is technically against most ToS. The bot generates at human-realistic intervals (one request at a time, no parallelism) so it's less detectable than a parallel scraper, but the risk is non-zero. Don't use this for anything critical without a backup.
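The "one request at a time" behaviour can be modelled with a single asyncio lock around the browser. This is a sketch of the idea rather than the request queue in app/main.py: `SerialRunner` and its optional `min_gap_s` pause are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class SerialRunner:
    """Runs jobs strictly one at a time, with an optional pause between them."""

    def __init__(self, min_gap_s: float = 0.0) -> None:
        # Create the runner inside the running event loop (e.g. at app startup).
        self._lock = asyncio.Lock()
        self._min_gap_s = min_gap_s

    async def run(self, job: Callable[[], Awaitable[T]]) -> T:
        async with self._lock:
            try:
                return await job()
            finally:
                # Hold the lock through the pause so the next job starts later.
                if self._min_gap_s:
                    await asyncio.sleep(self._min_gap_s)
```

Concurrent HTTP requests simply wait on the lock, so the single browser tab never sees overlapping generation flows.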
Why not use the official API?
Most paid image-generation platforms either don't offer an API at all, or charge per-request fees that defeat the purpose of an unlimited subscription. This bridge gives you the cost profile of a subscription with the access pattern of an API.
Can I run this on a Raspberry Pi / cheap VPS?
Yes, but you need at least 2GB RAM (Chromium is hungry) and a residential proxy. Datacenter IPs get blocked. A €5/month Hetzner VPS works fine if you add a residential proxy on top.
What if CapSolver is down?
The bot will fail at the login step and return a 401. Use the POST /verify/{code} endpoint to manually enter the code, or fall back to the extract_cookies.py script which lets you log in manually in a real browser and ship the cookies to the server.
Can I run multiple accounts?
Not in a single instance. The bot is single-tab, single-account by design. Run multiple containers on different ports for multiple accounts.
Why Patchright instead of regular Playwright?
Vanilla Playwright leaks navigator.webdriver = true and several CDP signals. Patchright is a drop-in fork that patches all of them. Same API, same code, just a different import. Without it, the target site shows a paywall on every generation attempt.
The trick isn't bypassing one defense. It's bypassing all of them at once.