Summary
Implement a double-buffered staging + upload pipeline that overlaps mesh uploads with rendering. Chunk mesh data is written to one staging buffer while the GPU reads from the previous frame's buffer, eliminating upload stalls.
Depends on: #384 (dedicated transfer queue + staging ring)
Current State (after #384)
The transfer queue and staging ring exist, but uploads are submitted one at a time, and each submission can introduce a synchronization point: the main thread submits an upload, waits for it to complete, then renders. At high chunk throughput (128+ chunks loading simultaneously), this submit-and-wait pattern limits upload bandwidth.
Target Architecture
Double-Buffer Strategy
Frame N:
Staging Buffer A: GPU is copying from this → megabuffer (previous frame's uploads)
Staging Buffer B: CPU is writing new mesh data into this
Frame N+1:
Swap: A becomes write target, B becomes GPU read source
GPU copies B → megabuffer while CPU writes A
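The role swap above can be sketched in a few lines. This is an illustration in C, not engine code (the engine is Zig): StagingBuffer, StagingPair, and the helper names are hypothetical, and a real buffer would wrap a Vulkan handle and mapped memory rather than a byte counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one staging buffer; only the fields needed here. */
typedef struct {
    char   name;       /* 'A' or 'B', for illustration */
    size_t bytes_used; /* bytes the CPU has written into this buffer */
} StagingBuffer;

typedef struct {
    StagingBuffer buffers[2];
    int write_index; /* buffer the CPU writes; the other is the GPU copy source */
} StagingPair;

static StagingBuffer *write_buffer(StagingPair *p) { return &p->buffers[p->write_index]; }
static StagingBuffer *gpu_buffer(StagingPair *p)   { return &p->buffers[1 - p->write_index]; }

/* Swap roles at the frame boundary. The new write target is emptied here on
   the assumption that its earlier transfer has already completed. */
static void swap_buffers(StagingPair *p) {
    assert(p->write_index == 0 || p->write_index == 1);
    p->write_index = 1 - p->write_index;
    write_buffer(p)->bytes_used = 0;
}
```

Toggling an index rather than exchanging two pointers keeps both buffers in one array, which makes it harder for the two roles to end up pointing at the same buffer.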
Upload Pipeline
- Collect: Worker threads produce mesh vertex data
- Stage: Main thread allocates from staging buffer B (the current write buffer) and copies the mesh data in
- Submit: At end of frame, submit transfer commands for buffer B → megabuffer
- Swap: On the next frame, buffer B becomes the GPU source and buffer A becomes the CPU write target
- Fence: The previous frame's transfer must complete before staging buffer A is reused
Per-Chunk Flow
Worker thread: Build mesh → pending_solid/pending_cutout/pending_fluid arrays
Main thread (beginFrame):
- Check previous frame's transfer fence
- If complete: swap staging buffers, reclaim space
Main thread (update):
- For each pending upload:
  - Allocate from current staging buffer
  - Copy vertex data
  - Record copy command (staging → megabuffer offset)
Main thread (endFrame):
- Submit accumulated transfer commands
- Signal fence
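The beginFrame / update / endFrame flow above can be modeled without any Vulkan calls. The C sketch below is only a model: UploadState and its functions are hypothetical, a boolean stands in for each transfer fence, and it assumes one in-flight flag per staging buffer, which the issue does not specify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int    write_index;  /* staging buffer the CPU may write this frame */
    size_t used[2];      /* bytes staged in each buffer */
    bool   in_flight[2]; /* stand-in for "transfer fence not yet signaled" */
    int    recorded;     /* copy commands recorded this frame */
    int    submits;      /* batched submissions issued so far */
} UploadState;

/* beginFrame: retire finished transfers, then swap if the other buffer is free.
   Returns false when both buffers are still in flight and nothing can be staged. */
static bool begin_frame(UploadState *s, const bool fence_signaled[2]) {
    for (int i = 0; i < 2; i++) {
        if (s->in_flight[i] && fence_signaled[i]) {
            s->in_flight[i] = false;
            s->used[i] = 0; /* reclaim space */
        }
    }
    if (!s->in_flight[s->write_index]) return true; /* current target is writable */
    int other = 1 - s->write_index;
    if (s->in_flight[other]) return false;          /* GPU still reads both buffers */
    s->write_index = other;                         /* swap */
    return true;
}

/* update: stage one mesh and record its copy (staging -> megabuffer). */
static void stage_upload(UploadState *s, size_t size) {
    assert(!s->in_flight[s->write_index]); /* never write a buffer the GPU reads */
    s->used[s->write_index] += size;
    s->recorded++;
}

/* endFrame: one submission for everything recorded, then mark the fence pending. */
static void end_frame(UploadState *s) {
    if (s->recorded == 0) return;
    s->in_flight[s->write_index] = true;
    s->recorded = 0;
    s->submits++;
}
```

When begin_frame returns false the caller has to skip staging for that frame or block on a fence; which of the two is preferable is a policy decision this sketch leaves open.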
Implementation Plan
Step 1: Double-buffered staging
- Two StagingRing instances (or one ring split in half)
- write_staging and gpu_staging pointers, swapped each frame
- Frame N writes to write_staging, GPU reads from gpu_staging
- After swap: gpu_staging is the just-written buffer, write_staging is the buffer whose transfer just completed
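For the "one ring split in half" option, each half can be treated as an independent bump allocator over a shared allocation. The following C sketch shows the idea under assumptions: StagingHalf, split_in_half, and staging_alloc are hypothetical names, and the half size is assumed to be a multiple of the copy alignment so absolute offsets stay aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of one half of a single staging allocation. */
typedef struct {
    size_t base;     /* offset of this half inside the staging allocation */
    size_t capacity; /* size of this half in bytes */
    size_t head;     /* next free byte, relative to base */
} StagingHalf;

/* Split one allocation of total_size bytes into two equal halves. */
static void split_in_half(size_t total_size, StagingHalf out[2]) {
    size_t half = total_size / 2;
    out[0] = (StagingHalf){ .base = 0,    .capacity = half, .head = 0 };
    out[1] = (StagingHalf){ .base = half, .capacity = half, .head = 0 };
}

/* Bump-allocate size bytes at a power-of-two alignment.
   Returns false when the half is full; on success *offset is the absolute
   offset into the staging allocation. */
static bool staging_alloc(StagingHalf *h, size_t size, size_t align, size_t *offset) {
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t start = (h->head + align - 1) & ~(align - 1);
    if (start > h->capacity || size > h->capacity - start) return false;
    *offset = h->base + start;
    h->head = start + size;
    return true;
}
```

Reclaiming a half after its transfer completes is then just resetting head to 0, which is cheaper than tracking individual frees.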
Step 2: Batch upload submission
- Accumulate multiple chunk uploads per frame
- Single vkQueueSubmit with all copy commands batched
- One fence per frame (not per chunk)
- vkCmdCopyBuffer for each chunk: staging offset → megabuffer offset
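The batching in Step 2 amounts to accumulating copy regions during the frame and flushing them once. The C sketch below models only that bookkeeping: CopyRegion mirrors the three fields of Vulkan's VkBufferCopy so it compiles without the Vulkan headers, while CopyBatch, MAX_REGIONS_PER_FRAME, and both functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the fields of VkBufferCopy; a stand-in, not the Vulkan type. */
typedef struct {
    uint64_t srcOffset; /* offset into the staging buffer */
    uint64_t dstOffset; /* offset into the megabuffer */
    uint64_t size;      /* bytes to copy */
} CopyRegion;

#define MAX_REGIONS_PER_FRAME 256 /* hypothetical cap */

typedef struct {
    CopyRegion regions[MAX_REGIONS_PER_FRAME];
    uint32_t   count;
    uint32_t   submits; /* number of batched submissions issued */
} CopyBatch;

/* Record one chunk's copy. Returns 0 when the batch is full. */
static int batch_record(CopyBatch *b, uint64_t src, uint64_t dst, uint64_t size) {
    assert(size > 0);
    if (b->count == MAX_REGIONS_PER_FRAME) return 0;
    b->regions[b->count++] = (CopyRegion){ src, dst, size };
    return 1;
}

/* End of frame: the point where the real code would issue one vkQueueSubmit
   with one fence. Returns the number of regions flushed. */
static uint32_t batch_submit(CopyBatch *b) {
    uint32_t flushed = b->count;
    if (flushed == 0) return 0;
    b->count = 0;
    b->submits++;
    return flushed;
}
```

Because every copy in a frame shares the same source and destination buffers, the accumulated regions could also be passed to a single vkCmdCopyBuffer call through its regionCount and pRegions parameters; whether that is preferable to one call per chunk is left to the implementation.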
Step 3: Integration with WorldStreamer
- WorldStreamer currently has a max_uploads_per_frame limit
- With double-buffer pipeline, this limit can be raised significantly
- Upload queue: FIFO of chunks with completed meshes awaiting upload
- Main thread drains the queue into staging buffer
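Draining the upload queue into staging can be sketched as a FIFO loop that stops at the first mesh that no longer fits. This C model is an assumption-laden illustration: UploadQueue, QUEUE_CAPACITY, and the function names are hypothetical, and the queue holds only mesh sizes where the real one would hold chunk references.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAPACITY 8 /* hypothetical; small for illustration */

/* Ring-buffer FIFO of mesh sizes (bytes) awaiting upload. */
typedef struct {
    size_t sizes[QUEUE_CAPACITY];
    size_t head;  /* index of the oldest entry */
    size_t count;
} UploadQueue;

/* Append a mesh to the back of the queue. Returns 0 when the queue is full. */
static int queue_push(UploadQueue *q, size_t mesh_size) {
    if (q->count == QUEUE_CAPACITY) return 0;
    q->sizes[(q->head + q->count) % QUEUE_CAPACITY] = mesh_size;
    q->count++;
    return 1;
}

/* Drain in FIFO order until the next mesh would not fit in the remaining
   staging space. Returns the number of chunks staged and updates *staging_free. */
static size_t queue_drain(UploadQueue *q, size_t *staging_free) {
    assert(staging_free != NULL);
    size_t staged = 0;
    while (q->count > 0 && q->sizes[q->head] <= *staging_free) {
        *staging_free -= q->sizes[q->head];
        q->head = (q->head + 1) % QUEUE_CAPACITY;
        q->count--;
        staged++;
    }
    return staged;
}
```

Stopping at the first mesh that does not fit preserves FIFO order; skipping ahead to smaller meshes would use the buffer more fully but could starve large chunks.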
Step 4: Backpressure
- If staging buffer is >80% full: pause new uploads (backpressure to streamer)
- If GPU is consistently behind on transfers: log warning, reduce upload rate
- Monitor: show staging buffer utilization in debug overlay
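The two backpressure rules could look like the C sketch below. The 80% threshold comes from the list above; everything else is an assumption, in particular the definition of "consistently behind" as late_limit consecutive late frames and the choice to halve the upload budget. UploadThrottle and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True when more than 80% of the staging buffer is in use.
   Integer arithmetic avoids floating-point rounding at the boundary. */
static bool staging_over_threshold(size_t used, size_t capacity) {
    assert(capacity > 0);
    return used * 5 > capacity * 4;
}

/* Hypothetical policy: halve the per-frame upload budget after late_limit
   consecutive frames whose transfer had not finished, never below one. */
typedef struct {
    unsigned late_frames;       /* consecutive late frames seen so far */
    unsigned late_limit;        /* threshold that triggers a rate reduction */
    unsigned uploads_per_frame; /* current upload budget */
} UploadThrottle;

static void throttle_update(UploadThrottle *t, bool transfer_was_late) {
    if (!transfer_was_late) { t->late_frames = 0; return; }
    if (++t->late_frames < t->late_limit) return;
    t->late_frames = 0; /* the warning from the list above would be logged here */
    if (t->uploads_per_frame > 1) t->uploads_per_frame /= 2;
}
```

The same used / capacity ratio that drives the threshold check is the natural value to display as the utilization stat in the debug overlay.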
Files to Modify
src/engine/graphics/vulkan/transfer_queue.zig — double-buffer logic
src/world/world_streamer.zig — upload queue integration
src/world/chunk_mesh.zig — pending upload handling
src/engine/ui/timing_overlay.zig — staging buffer utilization stat
Testing
Roadmap: docs/PERFORMANCE_ROADMAP.md — Batch 5, Issue 3B-2