Desktop client for running large prompt workloads against OpenAI's Batch API — template expansion, JSONL generation, batch lifecycle management, and result retrieval in a single PyQt6 application.
Synchronous chat-completion calls become impractical at scale: cost rises linearly, rate limits cap throughput, and operators have nowhere to track in-flight work. This client targets that gap by driving the OpenAI Batch API end-to-end — taking a prompt template plus a tabular dataset, expanding it into thousands of structured requests, submitting them as a single batch, and surfacing status and results back to the operator.
The use case is anything that maps cleanly onto a row-by-row prompt: catalogue enrichment, content classification, structured extraction over a CSV, evaluation runs, dataset labelling. The Batch API's 50% cost reduction and 24-hour completion window make it the right tool for these workloads; this app makes it operable without writing throwaway scripts.
- Template-driven prompt expansion with `[placeholder]` substitution from CSV/XLSX columns (case-insensitive matching).
- Generation of OpenAI-compliant JSONL batch files with per-request `custom_id`, model, and token-limit configuration.
- Full batch lifecycle: file upload → batch creation → status polling → result download → cancellation on delete.
- Local persistence layer (SQLite) tracking every batch's local path, OpenAI file ID, batch ID, status, and result path.
- Per-row preview pane with JSON syntax highlighting before submission.
- Configurable model selection across the GPT-4o / GPT-4 / GPT-3.5 / embedding families.
- API key management isolated to a settings store; no environment-variable coupling.
The system is a three-layer desktop client:
┌──────────────────────────────────────────────────────────┐
│  UI Layer (PyQt6 - BulkAPIApp)                           │
│  • Create Request / File List / Settings tabs            │
│  • Template expansion, preview, status-aware actions     │
└──────────────────────────────────────────────────────────┘
                             │
          ┌──────────────────┴──────────────────┐
          ▼                                     ▼
┌────────────────────┐              ┌──────────────────────┐
│  OpenAIHandler     │              │  DatabaseHandler     │
│  • JSONL builder   │              │  • files table       │
│  • Files API       │              │  • settings table    │
│  • Batches API     │              │  • SQLite, ctx-mgr   │
└────────────────────┘              └──────────────────────┘
          │                                     │
          ▼                                     ▼
  OpenAI Batch API                   bulk_api_app.db (local)
Separation of concerns:
- UI layer owns event handling, template rendering, and user-facing state. It never calls OpenAI directly; every request goes through the handler.
- Service layer (`OpenAIHandler`) wraps the OpenAI SDK and isolates all batch-payload construction in one place; the JSONL schema lives here and nowhere else.
- Persistence layer (`DatabaseHandler`) exposes a narrow CRUD surface over SQLite using a context-managed cursor so each operation gets its own connection and explicit commit.
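
The connection-per-operation pattern described for `DatabaseHandler` could look roughly like the sketch below. It is an illustration rather than the project's actual code: the `cursor`, `init_db`, `add_file`, and `get_status` helpers and the column names are assumptions, loosely based on the fields this README says are tracked.

```python
import sqlite3
from contextlib import contextmanager

DB_PATH = "bulk_api_app.db"  # file name from this README; the schema below is assumed


@contextmanager
def cursor(db_path=DB_PATH):
    """Open a fresh connection, yield a cursor, commit on success, always close."""
    conn = sqlite3.connect(db_path)
    try:
        yield conn.cursor()
        conn.commit()
    except Exception:
        conn.rollback()  # a failed operation leaves no partial write behind
        raise
    finally:
        conn.close()


def init_db(db_path=DB_PATH):
    """Create the (hypothetical) files table if it does not exist yet."""
    with cursor(db_path) as cur:
        cur.execute(
            "CREATE TABLE IF NOT EXISTS files ("
            "id INTEGER PRIMARY KEY, local_path TEXT, openai_id TEXT, "
            "batch_id TEXT, status TEXT, result_path TEXT)"
        )


def add_file(local_path, status, db_path=DB_PATH):
    """Insert one row and return its primary key."""
    with cursor(db_path) as cur:
        cur.execute(
            "INSERT INTO files (local_path, status) VALUES (?, ?)",
            (local_path, status),
        )
        return cur.lastrowid


def get_status(file_id, db_path=DB_PATH):
    """Return the stored status string, or None if the row is unknown."""
    with cursor(db_path) as cur:
        cur.execute("SELECT status FROM files WHERE id = ?", (file_id,))
        row = cur.fetchone()
        return row[0] if row else None
```

Because every helper opens and closes its own connection, no connection object has to be shared between UI event handlers.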
| Layer | Technology |
|---|---|
| UI | PyQt6 (Qt6) |
| LLM integration | openai Python SDK (Batch API) |
| Persistence | SQLite (stdlib sqlite3) |
| Tabular input | csv (stdlib), openpyxl |
| Validation models | pydantic (transitively via openai) |
| Runtime | Python 3.8+ |
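Each row of the input table produces one line in a JSONL file submitted to the OpenAI Files API with `purpose="batch"`. The request is pretty-printed here for readability; in the file itself it occupies a single line: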
{
  "custom_id": "request-1",
  "method": "POST",
  "url": "/v1/chat/completions",
  "body": {
    "model": "gpt-4o-mini",
    "messages": [
      { "role": "user", "content": "Classify the following product: Wireless ANC headphones, over-ear, 40h battery" }
    ],
    "max_tokens": 1000
  }
}

After submission, batch status is retrieved via `client.batches.retrieve(batch_id)`:
{
  "id": "batch_abc123",
  "status": "completed",
  "input_file_id": "file-in-xyz",
  "output_file_id": "file-out-xyz",
  "request_counts": { "total": 1500, "completed": 1500, "failed": 0 },
  "completion_window": "24h"
}

The `output_file_id` is then streamed to disk as `results/batch_output_<file_id>.jsonl`, where each line carries the original `custom_id` so results can be joined back to source rows deterministically.
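
As a rough illustration of the request schema shown earlier in this section, one request line could be serialised as follows. The function name `build_request_line` and its defaults are assumptions made for this sketch, not the signature used inside `OpenAIHandler`.

```python
import json


def build_request_line(index, prompt, model="gpt-4o-mini", max_tokens=1000):
    """Return one JSONL line (without trailing newline) for a batch input file."""
    request = {
        "custom_id": f"request-{index}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": max_tokens,
        },
    }
    # json.dumps escapes embedded newlines, so the request always stays on one line.
    return json.dumps(request, ensure_ascii=False)
```

Serialising with `json.dumps` rather than string formatting keeps prompts containing quotes or line breaks valid.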
- Template authoring: operator writes a prompt with `[column_name]` placeholders in the Create Request tab.
- Data load: CSV or XLSX is parsed; headers populate the column space, rows the substitution values.
- Expansion: each row is rendered into the template; placeholder matching is case-insensitive (both sides lowercased) so column casing doesn't break substitution.
- JSONL build: `OpenAIHandler.create_batch_requests_file` writes `prompts/batch_request_<unix_ts>.jsonl` and registers it in the `files` table with status `Waiting for shipment`.
- Submission: single click uploads the JSONL via `files.create(purpose="batch")`, creates a batch with `completion_window="24h"`, and stores both `openai_id` and `batch_id` on the local row.
- Polling: the same status button now triggers `batches.retrieve` and writes the returned status verbatim to local state.
- Retrieval: when status reaches `completed`, the button switches to Download Result and streams `output_file_id` content to `results/`.
- Cancellation: deleting an in-flight batch issues `batches.cancel` against OpenAI before removing the local artefact, preventing orphaned remote work.
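
The submission, polling, and retrieval steps above map onto a handful of SDK calls. The sketch below assumes `client` is an `openai.OpenAI` instance; the function names are hypothetical and the app's real handler may be organised differently.

```python
def submit_batch(client, jsonl_path):
    """Upload the request file and create a batch; returns (file_id, batch_id)."""
    with open(jsonl_path, "rb") as fh:
        uploaded = client.files.create(file=fh, purpose="batch")
    batch = client.batches.create(
        input_file_id=uploaded.id,
        endpoint="/v1/chat/completions",
        completion_window="24h",
    )
    return uploaded.id, batch.id


def check_status(client, batch_id):
    """Return (status, output_file_id) exactly as reported by the API."""
    batch = client.batches.retrieve(batch_id)
    return batch.status, batch.output_file_id


def download_result(client, output_file_id, dest_path):
    """Write the output file's bytes to dest_path and return the path."""
    response = client.files.content(output_file_id)
    with open(dest_path, "wb") as out:
        out.write(response.read())
    return dest_path
```

Passing the client in as an argument keeps these functions free of global state and lets them be exercised against a stub in tests.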
Input handling. Tabular input goes through openpyxl / csv.reader; empty cells are coerced to empty strings rather than None to keep template substitution total. Headers are normalised to lowercase at substitution time, so a `[Brand]` or `[brand]` placeholder resolves against a column named `BRAND`.
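
A minimal sketch of this substitution rule, assuming each row arrives as a dict keyed by header. The `render` helper and the choice to leave unknown placeholders untouched are assumptions of the example, not confirmed behaviour of the app.

```python
import re

# Matches [name] where name contains no nested brackets.
PLACEHOLDER = re.compile(r"\[([^\[\]]+)\]")


def render(template, row):
    """Substitute [column] placeholders using case-insensitive column lookup."""
    # Lowercase the headers and coerce empty cells (None) to empty strings.
    lowered = {
        str(key).strip().lower(): ("" if value is None else str(value))
        for key, value in row.items()
    }

    def replace(match):
        key = match.group(1).strip().lower()
        # Unknown placeholders are left as-is (an assumption of this sketch).
        return lowered.get(key, match.group(0))

    return PLACEHOLDER.sub(replace, template)
```

For example, `render("Classify [Brand]", {"BRAND": "Acme"})` yields `"Classify Acme"`.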
Output handling. The Batch API returns one JSONL line per request, each tagged with the original custom_id. Because IDs are generated as request-{i} in submission order, the join back to the source dataset is deterministic and does not depend on response ordering — relevant since OpenAI does not guarantee ordering across a batch.
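
The join back to source rows could be sketched as below. It assumes IDs start at `request-1`, as in the sample request, and that each output line nests the chat completion under `response.body`; the `join_results` helper and the `completion` key are hypothetical.

```python
import json


def join_results(rows, output_lines):
    """Attach each completion to its source row via custom_id, ignoring line order."""
    by_id = {}
    for line in output_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        by_id[record["custom_id"]] = record

    joined = []
    for i, row in enumerate(rows, start=1):  # assumes IDs begin at request-1
        record = by_id.get(f"request-{i}")
        text = None
        if record and record.get("response"):
            choices = record["response"]["body"].get("choices", [])
            if choices:
                text = choices[0]["message"]["content"]
        # Rows without a usable response keep completion=None.
        joined.append({**row, "completion": text})
    return joined
```

Indexing the output by `custom_id` first means the order of lines in the result file never affects the join.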
Error surfaces. File I/O around result/input rendering catches FileNotFoundError and generic exceptions distinctly, so a missing artefact is reported as a path-level error rather than a parse failure. Batch cancellation on delete is wrapped to log-and-continue: a failure to cancel remotely never blocks the local cleanup, which is the right tradeoff when the local DB is the system of record.
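
The log-and-continue behaviour on delete can be illustrated like this. The `delete_batch` function, the logger name, and the boolean return value are assumptions for the sketch; `client` is assumed to be an `openai.OpenAI` instance.

```python
import logging
import os

log = logging.getLogger("bulk_api_app")  # hypothetical logger name


def delete_batch(client, batch_id, local_path):
    """Best-effort remote cancel, then local cleanup regardless of the outcome."""
    if batch_id:
        try:
            client.batches.cancel(batch_id)
        except Exception as exc:
            # A remote failure is logged but must not block local cleanup.
            log.warning("Could not cancel batch %s: %s", batch_id, exc)
    try:
        os.remove(local_path)
        return True
    except FileNotFoundError:
        # Reported as a path-level problem, distinct from other failures.
        log.warning("Local file already missing: %s", local_path)
        return False
```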
Reliability of LLM responses. The Batch API is the reliability story: a 24h completion window with built-in retry semantics on OpenAI's side, versus the operator hand-rolling exponential backoff against rate-limited synchronous calls. The local state machine stores whatever status string the API returns rather than enumerating the possible values, so new statuses don't break the UI.
Persistence as system of record. Every batch is row-stable in SQLite from the moment its JSONL is generated. If the process crashes between submission and status polling, the next launch reads batch_id from disk and resumes; no in-memory state is load-bearing. Each DB call opens its own connection and commits explicitly via a context manager, so partial writes can't leak across operations.
Scalability. The Batch API itself absorbs the throughput problem: up to 50,000 requests per batch, parallelised server-side, at 50% of synchronous pricing. The client's job is to stay out of the way: JSONL is streamed line-by-line, never assembled in memory as a list, and result download writes directly from the response body to disk.
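
Line-by-line streaming amounts to consuming an iterator and writing as you go. In this sketch `stream_lines` is a hypothetical helper, and the caller is assumed to supply already-serialised JSON strings, typically from a generator.

```python
def stream_lines(lines, out_path):
    """Write an iterable of JSON strings to out_path, one per line; returns the count."""
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for line in lines:  # with a generator, only one line is held in memory
            out.write(line)
            out.write("\n")
            count += 1
    return count
```

Feeding it a generator expression rather than a list is what keeps memory use flat regardless of dataset size.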
Operator surface. The single status-aware action button (Send → Check Status → Download Result) collapses the batch state machine into one control whose label always reflects the next legal transition. This is intentional: it makes incorrect operator action structurally impossible.
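
One way to express this mapping from stored status to button label is shown below. The `next_action` function is hypothetical, and treating every status other than the two named ones as "Check Status" is an assumption of the sketch.

```python
WAITING = "Waiting for shipment"  # local status before upload, per the workflow


def next_action(status):
    """Map a stored status string to the single action the button should offer."""
    if status == WAITING:
        return "Send"
    if status == "completed":
        return "Download Result"
    # Any other value, including statuses the client has never seen,
    # can only be re-checked against the API.
    return "Check Status"
```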
.
├── main.py # Entry point — boots Qt application
├── bulk_api_app.py # UI layer: BulkAPIApp window, three tabs, event handlers
├── openai_handler.py # Service layer: Batch API integration, JSONL builder
├── database_handler.py # Persistence layer: SQLite schema and CRUD
├── json_highlighter.py # JSON syntax highlighter for preview panes
├── requirements.txt
├── res/ # Static assets (logo)
├── prompts/ # Generated batch request files (runtime)
└── results/ # Downloaded batch outputs (runtime)
bulk_api_app.db is created in CWD on first launch.
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
python main.py

On first launch, open the Settings tab and paste an OpenAI API key; it persists in the local SQLite store. Run the app from the repo root: `prompts/`, `results/`, and `bulk_api_app.db` are resolved relative to CWD.
- Async polling. Replace operator-driven status checks with a background `QThread` that polls in-flight batches on a configurable interval.
- Result joining. Auto-join downloaded JSONL outputs back against the source CSV/XLSX so operators get a single enriched table instead of a raw response file.
- Schema-constrained outputs. Wire `response_format={"type": "json_schema", ...}` into the per-row body so completions are validated against a Pydantic model before reaching disk.
- Multi-provider abstraction. Generalise `OpenAIHandler` behind a provider interface so Anthropic Message Batches and Azure OpenAI Batch can be swapped in without UI changes.
- Cost preview. Tokenise the rendered prompts pre-submission and surface an estimated batch cost using current pricing.
- Headless mode. Extract the service + persistence layers behind a CLI for CI/scheduled use, leaving the Qt app as one of multiple front-ends.
- Observability. Structured logging of every state transition with batch ID, plus a small metrics surface (success rate, mean completion time per model).