feat: add batch inference pipeline#816

Draft
IMaloney wants to merge 3 commits into zai-org:main from IMaloney:feat/batch-inference-pipeline

Conversation

IMaloney commented Feb 19, 2026

Adds batch inference for generating multiple videos from a single run.

The new script tools/batch_inference.py supports t2v/i2v/v2v with JSONL input, resume capability, multi-GPU distribution, and per-job error handling.

Tested with all generation types; 59 tests pass and pre-commit is clean.

Test User added 3 commits February 19, 2026 03:15
Implements a comprehensive batch video generation tool that addresses the #1
missing feature for production users: generating multiple videos from a single
batch file instead of one-at-a-time processing.

## New Files

### tools/batch_inference.py
Production-ready batch inference script with:

**Core Features:**
- JSONL input format (one job per line, streaming-friendly)
- Support for all generation types: t2v, i2v, v2v
- Progress tracking with tqdm (progress bar, ETA)
- Robust error handling (logs errors, continues batch)
- Resume capability (tracks completed jobs, skips on restart)

**Input Schema:**
- prompt (required): Text description
- output_name (required): Output filename
- image_path (optional): For i2v generation
- video_path (optional): For v2v generation
- num_frames, guidance_scale, num_inference_steps, seed, width, height (optional)
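As a sketch, a t2v job and an i2v job under this schema might look like the following (field values are illustrative only; only `prompt` and `output_name` are required):

```jsonl
{"prompt": "A cat playing piano in a jazz bar", "output_name": "cat_piano.mp4", "seed": 42}
{"prompt": "Animate this photo with gentle wind", "output_name": "photo_anim.mp4", "image_path": "inputs/photo.png", "num_frames": 81}
```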

**Multi-GPU Support:**
- Job-level parallelism via --gpu_id and --num_gpus flags
- Each GPU processes a subset of jobs (round-robin distribution)
- State file prevents duplicate work across processes
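The round-robin split can be sketched in a few lines (hypothetical helper name; the actual flag handling lives in tools/batch_inference.py):

```python
def jobs_for_gpu(jobs, gpu_id, num_gpus):
    """Round-robin distribution: GPU k takes jobs k, k+num_gpus, k+2*num_gpus, ..."""
    return [job for i, job in enumerate(jobs) if i % num_gpus == gpu_id]

# Example: 10 jobs split across 4 GPUs; GPU 1 gets jobs 1, 5, 9
jobs = [f"job{i}" for i in range(10)]
print(jobs_for_gpu(jobs, 1, 4))  # ['job1', 'job5', 'job9']
```

Because each process derives its job subset deterministically from `--gpu_id` and `--num_gpus`, no inter-process coordination is needed beyond the shared state file.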

**Memory Management:**
- Loads model once, generates sequentially
- CPU offloading enabled by default
- VAE slicing and tiling enabled

### resources/example_batch_*.jsonl
Example batch files demonstrating:
- example_batch_t2v.jsonl: Text-to-video prompts
- example_batch_i2v.jsonl: Image-to-video with image_path
- example_batch_v2v.jsonl: Video-to-video with video_path

## Design Decisions

1. **JSONL over JSON**: Better for large batches, streaming, and manual editing
2. **Reuse generation logic**: Mirrors cli_demo.py patterns for consistency
3. **Single model per batch**: Memory efficient, simpler implementation
4. **State persistence**: JSON state file enables reliable resume
5. **Error isolation**: One failed job doesn't stop the batch
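Decisions 4 and 5 together suggest a driver loop like the following (a minimal sketch with hypothetical names; the real script's state-file format may differ):

```python
import json
from pathlib import Path

def run_batch(jobs, state_path, generate):
    """Skip already-completed jobs on restart; one failure doesn't stop the batch."""
    state_file = Path(state_path)
    completed = set(json.loads(state_file.read_text())) if state_file.exists() else set()
    for job in jobs:
        name = job["output_name"]
        if name in completed:
            continue  # resume: finished in a previous run
        try:
            generate(job)          # stand-in for the actual video-generation call
        except Exception as exc:   # error isolation: log and keep going
            print(f"[batch] {name} failed: {exc}")
            continue
        completed.add(name)
        # persist after every success so a crash loses at most the in-flight job
        state_file.write_text(json.dumps(sorted(completed)))
    return completed
```

Writing the state file after each success (rather than once at the end) is what makes resume reliable: a killed process can be restarted with the same command and it picks up where it left off.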

## Usage Examples

```shell
# Basic text-to-video
python tools/batch_inference.py --batch_file prompts.jsonl --model_path THUDM/CogVideoX1.5-5B

# Multi-GPU (4 GPUs)
for i in {0..3}; do
    CUDA_VISIBLE_DEVICES=$i python tools/batch_inference.py --batch_file batch.jsonl --gpu_id $i --num_gpus 4 &
done
```
IMaloney changed the title from "feat: add production-grade batch inference pipeline" to "feat: add batch inference pipeline" on Feb 19, 2026
IMaloney marked this pull request as draft on February 20, 2026