Bulk LLM Batch Orchestrator

Desktop client for running large prompt workloads against OpenAI's Batch API — template expansion, JSONL generation, batch lifecycle management, and result retrieval in a single PyQt6 application.

Overview

Synchronous chat-completion calls become impractical at scale: cost rises linearly, rate limits cap throughput, and operators have nowhere to track in-flight work. This client targets that gap by driving the OpenAI Batch API end-to-end — taking a prompt template plus a tabular dataset, expanding it into thousands of structured requests, submitting them as a single batch, and surfacing status and results back to the operator.

The use case is anything that maps cleanly onto a row-by-row prompt: catalogue enrichment, content classification, structured extraction over a CSV, evaluation runs, dataset labelling. The Batch API's 50% cost reduction and 24-hour completion window make it a natural fit for these workloads; this app makes it operable without writing throwaway scripts.

Key Features

Template-driven prompt expansion with [placeholder] substitution from CSV/XLSX columns (case-insensitive matching).
Generation of OpenAI-compliant JSONL batch files with per-request custom_id, model, and token-limit configuration.
Full batch lifecycle: file upload → batch creation → status polling → result download → cancellation on delete.
Local persistence layer (SQLite) tracking every batch's local path, OpenAI file ID, batch ID, status, and result path.
Per-row preview pane with JSON syntax highlighting before submission.
Configurable model selection across the GPT-4o / GPT-4 / GPT-3.5 / embedding families.
API key management isolated to a settings store; no environment-variable coupling.

Architecture

The system is a three-layer desktop client:

┌──────────────────────────────────────────────────────────┐
│  UI Layer  (PyQt6 — BulkAPIApp)                          │
│  • Create Request / File List / Settings tabs            │
│  • Template expansion, preview, status-aware actions     │
└──────────────────────────────────────────────────────────┘
                           │
        ┌──────────────────┴──────────────────┐
        ▼                                     ▼
┌────────────────────┐              ┌──────────────────────┐
│  OpenAIHandler     │              │  DatabaseHandler     │
│  • JSONL builder   │              │  • files table       │
│  • Files API       │              │  • settings table    │
│  • Batches API     │              │  • SQLite, ctx-mgr   │
└────────────────────┘              └──────────────────────┘
        │                                     │
        ▼                                     ▼
   OpenAI Batch API                  bulk_api_app.db (local)

Separation of concerns:

UI layer owns event handling, template rendering, and user-facing state. It never talks to OpenAI directly; every remote call goes through the handler.
Service layer (OpenAIHandler) wraps the OpenAI SDK and isolates all batch-payload construction in one place — the JSONL schema lives here and nowhere else.
Persistence layer (DatabaseHandler) exposes a narrow CRUD surface over SQLite using a context-managed cursor so each operation gets its own connection and explicit commit.
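
As a rough illustration of the context-managed cursor pattern described for the persistence layer, the sketch below opens a fresh connection per operation, commits on success, and rolls back on error. The helper names (`cursor`, `init_schema`, `set_status`) and the column names are assumptions inferred from the fields listed in this README, not the actual DatabaseHandler code.

```python
import sqlite3
from contextlib import contextmanager

DB_PATH = "bulk_api_app.db"  # file name taken from this README


@contextmanager
def cursor(db_path=DB_PATH):
    """Yield a cursor on a fresh connection; commit on success, roll back on error."""
    conn = sqlite3.connect(db_path)
    try:
        yield conn.cursor()
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()


def init_schema(db_path=DB_PATH):
    # Hypothetical schema: column names are guesses, not the real table definition.
    with cursor(db_path) as cur:
        cur.execute(
            "CREATE TABLE IF NOT EXISTS files ("
            "id INTEGER PRIMARY KEY, local_path TEXT, openai_id TEXT, "
            "batch_id TEXT, status TEXT, result_path TEXT)"
        )


def set_status(row_id, status, db_path=DB_PATH):
    # One operation, one connection, one explicit commit.
    with cursor(db_path) as cur:
        cur.execute("UPDATE files SET status = ? WHERE id = ?", (status, row_id))
```

Because every call owns its connection, a failure inside one `with cursor()` block cannot leave a half-applied write visible to the next operation.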

Tech Stack

Layer	Technology
UI	PyQt6 (Qt6)
LLM integration	`openai` Python SDK (Batch API)
Persistence	SQLite (stdlib `sqlite3`)
Tabular input	`csv` (stdlib), `openpyxl`
Validation models	`pydantic` (transitively via `openai`)
Runtime	Python 3.8+

Batch Request Format

Each row of the input table produces one line in a JSONL file submitted to the OpenAI Files API with purpose="batch". The request is shown pretty-printed here; in the file, each request occupies a single line:

{
  "custom_id": "request-1",
  "method": "POST",
  "url": "/v1/chat/completions",
  "body": {
    "model": "gpt-4o-mini",
    "messages": [
      { "role": "user", "content": "Classify the following product: Wireless ANC headphones, over-ear, 40h battery" }
    ],
    "max_tokens": 1000
  }
}
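
A minimal sketch of how one such line could be produced in Python. The function names `build_request` and `to_jsonl_line` are hypothetical; only the field layout comes from the example above.

```python
import json


def build_request(index, prompt, model="gpt-4o-mini", max_tokens=1000):
    """Build the request object for one input row (layout as in the example above)."""
    return {
        "custom_id": f"request-{index}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": max_tokens,
        },
    }


def to_jsonl_line(request):
    # json.dumps escapes embedded newlines, so the object stays on one physical line.
    return json.dumps(request, ensure_ascii=False)


line = to_jsonl_line(build_request(1, "Classify the following product:\nWireless ANC headphones"))
```

Serialising with `json.dumps` rather than string formatting matters here: prompts containing quotes or line breaks would otherwise corrupt the JSONL file.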

After submission, batch status is retrieved via client.batches.retrieve(batch_id):

{
  "id": "batch_abc123",
  "status": "completed",
  "input_file_id": "file-in-xyz",
  "output_file_id": "file-out-xyz",
  "request_counts": { "total": 1500, "completed": 1500, "failed": 0 },
  "completion_window": "24h"
}

The output_file_id is then streamed to disk as results/batch_output_<file_id>.jsonl, where each line carries the original custom_id so results can be joined back to source rows deterministically.

How It Works

1. Template authoring: operator writes a prompt with [column_name] placeholders in the Create Request tab.
2. Data load: CSV or XLSX is parsed; headers populate the column space, rows the substitution values.
3. Expansion: each row is rendered into the template; placeholder matching is case-insensitive (both sides lowercased) so column casing doesn't break substitution.
4. JSONL build: OpenAIHandler.create_batch_requests_file writes prompts/batch_request_<unix_ts>.jsonl and registers it in the files table with status Waiting for shipment.
5. Submission: single click uploads the JSONL via files.create(purpose="batch"), creates a batch with completion_window="24h", and stores both openai_id and batch_id on the local row.
6. Polling: the same status button now triggers batches.retrieve and writes the returned status verbatim to local state.
7. Retrieval: when status reaches completed, the button switches to Download Result and streams output_file_id content to results/.
8. Cancellation: deleting an in-flight batch issues batches.cancel against OpenAI before removing the local artefact, preventing orphaned remote work.
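
The submission, polling, and retrieval steps above could look roughly like the sketch below. It assumes `client` is an `openai.OpenAI` instance and uses the SDK's `files.create`, `batches.create`, `batches.retrieve`, and `files.content` calls. The function names are hypothetical, and the sketch reads the result into memory for brevity rather than streaming it as the app is described to do.

```python
from pathlib import Path


def submit(client, jsonl_path):
    """Upload the JSONL file and create a batch; returns (openai_file_id, batch_id)."""
    with open(jsonl_path, "rb") as fh:
        uploaded = client.files.create(file=fh, purpose="batch")
    batch = client.batches.create(
        input_file_id=uploaded.id,
        endpoint="/v1/chat/completions",
        completion_window="24h",
    )
    return uploaded.id, batch.id


def check(client, batch_id):
    """Return the status string verbatim, plus the output file ID if one exists yet."""
    batch = client.batches.retrieve(batch_id)
    return batch.status, batch.output_file_id


def download(client, output_file_id, results_dir="results"):
    """Save the batch output under results/ (file naming is an assumption)."""
    Path(results_dir).mkdir(exist_ok=True)
    target = Path(results_dir) / f"batch_output_{output_file_id}.jsonl"
    target.write_bytes(client.files.content(output_file_id).read())
    return target
```

Passing the client in as a parameter keeps these functions testable with a stub and leaves API key handling to the caller.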

Engineering Focus

Input handling. Tabular input goes through openpyxl / csv.reader; empty cells are coerced to empty strings rather than None to keep template substitution total. Headers are normalised to lowercase at substitution time so that [Brand], [brand], and BRAND column names interoperate.
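
A minimal sketch of case-insensitive placeholder substitution with empty-cell coercion, as described above. The `render` function and its regular expression are illustrative assumptions; in particular, leaving unknown placeholders untouched is a choice made for this sketch, not documented app behaviour.

```python
import re

# Matches [name] where name contains no brackets.
PLACEHOLDER = re.compile(r"\[([^\[\]]+)\]")


def render(template, headers, row):
    """Substitute [column] placeholders using one data row, ignoring case."""
    values = {
        str(header).strip().lower(): "" if cell is None else str(cell)
        for header, cell in zip(headers, row)
    }

    def substitute(match):
        key = match.group(1).strip().lower()
        # Unknown placeholders are left as-is (an assumption of this sketch).
        return values.get(key, match.group(0))

    return PLACEHOLDER.sub(substitute, template)


prompt = render("Classify [Brand] [MODEL]", ["brand", "Model"], ["Sony", None])
```

Here the empty Model cell becomes an empty string, so substitution never fails on missing data.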

Output handling. The Batch API returns one JSONL line per request, each tagged with the original custom_id. Because IDs are generated as request-{i} in submission order, the join back to the source dataset is deterministic and does not depend on response ordering — relevant since OpenAI does not guarantee ordering across a batch.
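
To illustrate the join, the sketch below indexes output lines by custom_id and pairs them with source rows regardless of the order in which results arrive. The helper names are hypothetical, the 1-based `request-{i}` numbering is assumed from the example ID `request-1`, and the output layout (`response.body.choices[0].message.content`) follows the Batch API's chat-completions output as I understand it.

```python
import json


def index_results(jsonl_lines):
    """Map custom_id -> completion text, or None when the request failed."""
    by_id = {}
    for line in jsonl_lines:
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        response = record.get("response") or {}
        choices = (response.get("body") or {}).get("choices") or []
        by_id[record["custom_id"]] = (
            choices[0]["message"]["content"] if choices else None
        )
    return by_id


def join(rows, by_id, start=1):
    """Pair each source row with its result using the request-{i} convention."""
    return [
        (row, by_id.get(f"request-{i}"))
        for i, row in enumerate(rows, start=start)
    ]
```

Rows whose request failed or is missing from the output are paired with `None`, which keeps the joined table the same length as the input.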

Error surfaces. File I/O around result/input rendering catches FileNotFoundError and generic exceptions distinctly, so a missing artefact is reported as a path-level error rather than a parse failure. Batch cancellation on delete is wrapped to log-and-continue: a failure to cancel remotely never blocks the local cleanup, which is the right tradeoff when the local DB is the system of record.
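
The log-and-continue behaviour on delete might be sketched as follows. `delete_batch` and the logger name are hypothetical; `client` is assumed to be an `openai.OpenAI` instance exposing `batches.cancel`.

```python
import logging
import os

log = logging.getLogger("bulk_api_app")  # logger name is an assumption


def delete_batch(client, batch_id, local_path):
    """Cancel remotely on a best-effort basis, then remove the local file.

    Returns True if the local file was removed, False if it was already gone.
    """
    if batch_id:
        try:
            client.batches.cancel(batch_id)
        except Exception as exc:
            # A remote failure must not block local cleanup.
            log.warning("Remote cancel failed for %s: %s", batch_id, exc)
    try:
        os.remove(local_path)
    except FileNotFoundError:
        # Reported as a path-level problem, distinct from other I/O errors.
        log.info("Local file already missing: %s", local_path)
        return False
    return True
```

Only `FileNotFoundError` is swallowed on the local side; any other I/O error still propagates so it is not mistaken for a missing file.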

Reliability of LLM responses. The Batch API is the reliability story: OpenAI schedules and executes the work within a 24h completion window, so the operator does not have to hand-roll exponential backoff against rate-limited synchronous calls. The local state machine accepts whatever status string the API returns rather than enumerating the possible values, so new statuses don't break the UI.

Persistence as system of record. Every batch is row-stable in SQLite from the moment its JSONL is generated. If the process crashes between submission and status polling, the next launch reads batch_id from disk and resumes; no in-memory state is load-bearing. Each DB call opens its own connection and commits explicitly via a context manager, so partial writes can't leak across operations.

Scalability. The Batch API itself absorbs the throughput problem: up to 50,000 requests per batch, parallelised server-side, at 50% of synchronous pricing. The client's job is to stay out of the way: JSONL is streamed line by line, never assembled in memory as a list, and result download writes directly from the response body to disk.
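
One way to keep JSONL generation streaming is to accept any iterable of request objects and write each one as it is produced. The sketch below is illustrative only; `write_jsonl` is a hypothetical name and not the app's actual writer.

```python
import json


def write_jsonl(requests, path):
    """Write requests one per line as they are produced; returns the line count."""
    count = 0
    with open(path, "w", encoding="utf-8") as fh:
        for request in requests:  # works with generators: nothing is buffered
            fh.write(json.dumps(request, ensure_ascii=False))
            fh.write("\n")
            count += 1
    return count


def numbered(prompts):
    # Generator: yields one minimal request object at a time.
    for i, prompt in enumerate(prompts, start=1):
        yield {"custom_id": f"request-{i}", "prompt": prompt}
```

Calling `write_jsonl(numbered(prompts), path)` with a lazy `prompts` source (for example a `csv.reader` wrapped in a generator) keeps memory use flat regardless of dataset size.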

Operator surface. The single status-aware action button (Send → Check Status → Download Result) collapses the batch state machine into one control whose label always reflects the next legal transition. This is intentional: it makes incorrect operator action structurally impossible.
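
The label logic can be pictured as a small pure function over the locally stored row. This is a sketch under assumptions: `next_action` is a hypothetical name, and treating every status other than completed as "Check Status" is a simplification chosen here, not confirmed app behaviour.

```python
def next_action(batch_id, status):
    """Return the button label for the next step in the batch lifecycle."""
    if not batch_id:
        return "Send"  # nothing submitted yet
    if status == "completed":
        return "Download Result"
    # Any other status, including ones this code has never seen, keeps polling.
    return "Check Status"


labels = [
    next_action(None, "Waiting for shipment"),
    next_action("batch_abc123", "in_progress"),
    next_action("batch_abc123", "completed"),
]
```

Because unknown status strings fall through to the polling branch, a status value added by the API later would not break the control.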

Project Structure

.
├── main.py                  # Entry point — boots Qt application
├── bulk_api_app.py          # UI layer: BulkAPIApp window, three tabs, event handlers
├── openai_handler.py        # Service layer: Batch API integration, JSONL builder
├── database_handler.py      # Persistence layer: SQLite schema and CRUD
├── json_highlighter.py      # JSON syntax highlighter for preview panes
├── requirements.txt
├── res/                     # Static assets (logo)
├── prompts/                 # Generated batch request files (runtime)
└── results/                 # Downloaded batch outputs (runtime)

bulk_api_app.db is created in CWD on first launch.

Setup & Run

python -m venv venv
source venv/bin/activate          # Windows: venv\Scripts\activate
pip install -r requirements.txt
python main.py

On first launch, open the Settings tab and paste an OpenAI API key — it persists in the local SQLite store. Run the app from the repo root: prompts/, results/, and bulk_api_app.db are resolved relative to CWD.

Future Improvements

Async polling. Replace operator-driven status checks with a background QThread that polls in-flight batches on a configurable interval.
Result joining. Auto-join downloaded JSONL outputs back against the source CSV/XLSX so operators get a single enriched table instead of a raw response file.
Schema-constrained outputs. Wire response_format={"type": "json_schema", ...} into the per-row body so completions are validated against a Pydantic model before reaching disk.
Multi-provider abstraction. Generalise OpenAIHandler behind a provider interface so Anthropic Message Batches and Azure OpenAI Batch can be swapped in without UI changes.
Cost preview. Tokenise the rendered prompts pre-submission and surface an estimated batch cost using current pricing.
Headless mode. Extract the service + persistence layers behind a CLI for CI/scheduled use, leaving the Qt app as one of multiple front-ends.
Observability. Structured logging of every state transition with batch ID, plus a small metrics surface (success rate, mean completion time per model).
