
fix: speed up video frame injection renders #596

Merged

miguel-heygen merged 2 commits into main from fix/video-wrapper-render-perf on May 2, 2026

Conversation

@miguel-heygen (Collaborator) commented May 2, 2026

Problem

A real HyperFrames wrapper project with a 19-second video asset was rendering much slower than expected on local macOS. On the reproduced fixture, auto calibration became extremely expensive, the renderer reduced itself to one worker, and the render path fell far behind an equivalent Remotion implementation.

A Linux repro later showed a second path to the same bad outcome: BeginFrame calibration can time out, switch the job to screenshot mode, and still leave the worker budget clamped to one even when screenshot capture itself is healthy.

What this fixes

  • Serves extracted video frames from the compiled render file server instead of converting every injected frame to a base64 data URI.
  • Reuses an already-applied frame source in the browser instead of reassigning and decoding the same image source repeatedly.
  • Stops treating ordinary static video/audio presence as a reason to reduce auto worker count; measured calibration still reduces workers for genuinely expensive captures.
  • Retries auto-worker calibration in screenshot mode after BeginFrame calibration timeout, instead of immediately assigning the maximum measured capture cost.
  • Removes the hard one-worker clamp after BeginFrame fallback; screenshot fallback now uses the measured screenshot calibration result.
  • Adds bounded browser/page close handling so a timed-out Puppeteer close cannot leave the CLI hanging after a successful render.
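The first two bullets can be sketched together on the serving side. This is a simplified illustration of the behavior described above, assuming a plausible signature for createCompiledFrameSrcResolver (the real implementation lives in renderOrchestrator.ts): paths under compiledDir map to URL-encoded relative URLs for the render file server, and anything else returns null so the caller can fall back to a base64 data URI.

```typescript
import * as path from "node:path";

// Simplified sketch of the served-frame resolver described above; the
// signature is an assumption, not the actual implementation. Paths under
// compiledDir become URL-encoded relative URLs; anything else returns
// null, and the caller falls back to a base64 data URI.
function createCompiledFrameSrcResolver(compiledDir: string) {
  const root = path.resolve(compiledDir);
  return (framePath: string): string | null => {
    const relative = path.relative(root, path.resolve(framePath));
    // Safety path: reject anything that escapes compiledDir.
    if (!relative || relative.startsWith("..") || path.isAbsolute(relative)) {
      return null;
    }
    // Encode each segment so characters like spaces survive as "%20".
    return "/" + relative.split(path.sep).map(encodeURIComponent).join("/");
  };
}
```

On the browser side, the companion change is simpler still: skip reassigning the frame element's source when it already matches, so the same image is never decoded twice.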

Root cause

The video-frame injection path forced extracted frames through disk reads and base64 data URIs during capture. On high-resolution video-wrapper compositions, that made calibration look dramatically slower than the actual visual workload, so auto worker selection collapsed from the expected parallel path to a single worker.

Separately, BeginFrame calibration failures were treated as proof that the whole capture workload was expensive. That was too conservative after switching to screenshot mode: the fallback mode may be perfectly capable of parallel capture, but the old logic assigned multiplier 8 and then hard-clamped workers to 1 before measuring it.
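The corrected sizing logic can be sketched as follows. The function shape and the baseline/multiplier arithmetic are illustrative assumptions; only the constant name and the multiplier value 8 come from this PR.

```typescript
// Illustrative sketch of the fallback described above, not the actual
// implementation. Only MAX_MEASURED_CAPTURE_COST_MULTIPLIER and its
// value come from the PR; everything else is assumed for the example.
const MAX_MEASURED_CAPTURE_COST_MULTIPLIER = 8;

interface CalibrationResult {
  ok: boolean;
  costMultiplier: number; // measured capture cost, 1 = baseline
}

function chooseWorkerCount(
  baselineWorkers: number,
  beginFrame: CalibrationResult,
  screenshot: CalibrationResult,
): number {
  // Old behavior: a BeginFrame timeout assigned the maximum multiplier
  // and hard-clamped workers to 1. New behavior: recalibrate in
  // screenshot mode and size workers from what was actually measured,
  // reserving the conservative multiplier for a double failure.
  const measured = beginFrame.ok
    ? beginFrame
    : screenshot.ok
      ? screenshot
      : { ok: false, costMultiplier: MAX_MEASURED_CAPTURE_COST_MULTIPLIER };
  return Math.max(1, Math.floor(baselineWorkers / measured.costMultiplier));
}
```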

Verification

Local

  • bunx oxlint packages/engine/src/services/frameCapture.ts packages/engine/src/services/screenshotService.ts packages/engine/src/services/videoFrameInjector.ts packages/producer/src/services/renderOrchestrator.ts packages/producer/src/services/renderOrchestrator.test.ts
  • bunx oxfmt --check packages/engine/src/services/frameCapture.ts packages/engine/src/services/screenshotService.ts packages/engine/src/services/videoFrameInjector.ts packages/producer/src/services/renderOrchestrator.ts packages/producer/src/services/renderOrchestrator.test.ts
  • bunx vitest run packages/producer/src/services/renderOrchestrator.test.ts packages/engine/src/config.test.ts
  • bun run --filter @hyperframes/engine typecheck
  • bun run --filter @hyperframes/producer typecheck
  • bun run --filter @hyperframes/producer build
  • bun run --filter @hyperframes/cli build
  • Built CLI smoke render of the shortened external fixture produced a valid 1080x1920 H.264/AAC MP4 and left no Puppeteer Chrome process behind.
  • After the fallback-calibration follow-up, the same shortened built-CLI smoke completed in 12.85s on macOS screenshot mode.

Browser

  • Used agent-browser to open the full patched render output in Chrome, play it through to 18.9s, and verify the browser-reported video dimensions were 1080x1920.
  • Saved a browser screenshot and recording from that pass locally.

Notes

Reproduced against iswangwenbin/hyperframes-video-wrapper at commit f04c3dd.

Measured on the same machine before the Linux follow-up:

  • Current HyperFrames main with explicit workers:5: about 410s.
  • Patched HyperFrames auto-worker producer render: about 129s internal render time, with 5 workers selected.
  • Equivalent Remotion fixture using @remotion/cli@4.0.455, OffthreadVideo, and concurrency 7: about 374s wall time.

Follow-up Linux repro feedback showed that the initial patch did remove false video/audio worker-cost terms, but BeginFrame timeout plus screenshot fallback still dominated there. This PR now handles that path by recalibrating in screenshot mode and only using the conservative multiplier if screenshot calibration also fails.

The source fixture still warns about sparse keyframes (max interval: 8.33s), so re-encoding the input remains recommended for reliable seeking, but this PR removes the renderer-side bottlenecks exposed by that project.

@miguel-heygen (Collaborator, Author) commented

Addressed the Linux repro finding in 69dfafd1.

The first commit removed the false static video/audio cost terms, but the Linux run showed another path still forced workers=1: BeginFrame calibration timeout set the measured multiplier to 8, switched to screenshot mode, and then hard-clamped worker count to one.

This update now closes the failed BeginFrame calibration session, switches to screenshot mode, reruns calibration there, and sizes workers from the measured screenshot samples. It only falls back to the conservative multiplier if screenshot calibration also fails. I also removed the post-fallback one-worker clamp and added a regression test that keeps baseline auto workers when screenshot-mode calibration is cheap.

Verified with the focused tests/typechecks plus a built CLI smoke render of the shortened wrapper fixture; it produced a valid 1080x1920 H.264/AAC MP4 and left no Puppeteer Chrome process behind.

@jrusso1020 (Collaborator) left a comment

Reviewed the follow-up commit 69dfafd1 "fix: recalibrate workers after screenshot fallback". Approving — the delta cleanly fixes the Linux fallback regression flagged in the previous round.

What changed since the prior review

The first patch made BeginFrame timeout switch the job to screenshot mode but then assumed the workload was expensive (multiplier MAX_MEASURED_CAPTURE_COST_MULTIPLIER) AND clamped workers to 1. That defeated the parallel screenshot path.

This commit:

  • Drops the switchedToScreenshotAfterCalibration && workerCount > 1 → workerCount = 1 clamp.
  • After BeginFrame timeout, closes the previous calibration session and re-runs calibration in screenshot mode against a fresh createCaptureSession. Real measurement, not a guess.
  • If screenshot calibration also fails, falls back to the conservative MAX_MEASURED_CAPTURE_COST_MULTIPLIER budget with reason calibration-screenshot-failed — the right floor.
  • The new test "keeps baseline auto workers after screenshot fallback when measured capture is cheap" pins the exact behavior change (forceScreenshot=true + multiplier:1 calibration → 6 workers, not 1).

The other notable behavior change

estimateCaptureCostMultiplier now ignores composition.videos / composition.audios (the +0.75 per video / +0.75 per audio terms are gone). This was the right call given the rest of the PR — measured calibration is the source of truth, and the heuristic was double-counting in cases where the measurement already saw the cost. Worth knowing it affects every render path, not just the screenshot-fallback one. The PR description calls this out.

Things I checked

  • All 54 CI checks green (Tests / Lint / Format / Typecheck / regression-shards / Render on windows-latest / CLI smoke / preview-regression / perf jobs).
  • createCompiledFrameSrcResolver test covers both the happy path (paths under compiledDir get URL-encoded relative URLs, including spaces → %20) and the safety path (paths outside compiledDir return null).
  • The frame-source resolver wires through both calibration and the actual capture sessions consistently (createRenderVideoFrameInjector is called everywhere a BeforeCaptureHook is constructed in this file). No path got missed.

Minor observations, not blocking

  • cfg.forceScreenshot = true mutation persists across calls. Intentional (you don't want to flip back to BeginFrame after a known-bad failure), but worth a comment if it bites later.
  • Full fallback-and-retry flow (BeginFrame timeout → close session → retry calibration in screenshot mode → call resolveRenderWorkerCount) is only covered end-to-end by manual smoke. The unit-level test pins the post-fallback resolver behavior, which is the part that actually regressed; the rest is mostly orchestration. Acceptable for a hotfix.

LGTM, ship it.

— hyperframes

miguel-heygen merged commit 4750a98 into main on May 2, 2026
44 of 54 checks passed
miguel-heygen deleted the fix/video-wrapper-render-perf branch on May 2, 2026 at 03:11
@vanceingalls (Collaborator) left a comment

Staff review notes. I think the overall direction is right, but I would hold merge on the cache-enabled path because it can silently bypass the main optimization.

  1. P1: the served-frame fast path is skipped when extraction cache is enabled.

packages/producer/src/services/renderOrchestrator.ts:2286-2294 now extracts into compiledDir/__hyperframes_video_frames, but still passes extractCacheDir into extractAllVideoFrames. When cache is enabled, extractAllVideoFrames() writes both cache hits and misses into lookup.entry.dir (packages/engine/src/services/videoFrameExtractor.ts:724-747), not the compiled frame directory. createCompiledFrameSrcResolver() only returns URLs for paths under compiledDir (renderOrchestrator.ts:668-685), so cached frame paths fall through to the base64 fallback in videoFrameInjector.ts:44-64.

That means HYPERFRAMES_EXTRACT_CACHE_DIR / cfg.extractCacheDir disables the PR’s headline optimization. Suggested fix: either materialize/link cached frames into the compiled render frame dir for this render, or teach the file server/resolver to safely serve the cache-root frames. Please add a regression test where extractCacheDir is set and the returned frame paths still resolve to served URLs rather than base64 data URIs.

  2. P2: page-close timeout can kill a pooled browser while leaving stale pooled state.

packages/engine/src/services/frameCapture.ts:717-733 force-kills the browser process if page.close() or releaseBrowser() times out. With PRODUCER_ENABLE_BROWSER_POOL, releaseBrowser() only clears pooledBrowser when the refcount reaches zero (packages/engine/src/services/browserManager.ts:217-235). If one pooled session times out closing a page while another session still holds a ref, this can kill the shared browser but leave pooledBrowser pointing at the dead/disconnected instance.

Suggested fix: move forced-kill handling into browser manager, or add a force-release path that atomically clears pooled state when the process is killed. At minimum, avoid force-killing a pooled browser from an individual session unless that session owns the last ref.
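The suggested force-release path could look roughly like this. The pool fields and the Browser shape are assumptions sketched from the review, not the actual browserManager API; the point is only that pool state is cleared atomically before the process dies, so no other session can grab the dead instance.

```typescript
// Illustrative sketch of the suggested fix; pooledBrowser and
// pooledBrowserRefCount mirror the fields named in the review, but the
// shapes are assumed, not the actual browserManager implementation.
interface Browser {
  process(): { kill(signal: string): boolean } | null;
}

let pooledBrowser: Browser | null = null;
let pooledBrowserRefCount = 0;

function forceReleaseBrowser(browser: Browser): void {
  // Clear pool state first so no concurrent session can acquire the
  // dying instance, then kill the underlying process.
  if (pooledBrowser === browser) {
    pooledBrowser = null;
    pooledBrowserRefCount = 0;
  }
  browser.process()?.kill("SIGKILL");
}
```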

  3. P3: encoded frame URLs may not round-trip reserved characters in video IDs.

createCompiledFrameSrcResolver() encodes each path segment with encodeURIComponent() (renderOrchestrator.ts:681-684), while the producer file server uses c.req.path directly as the filesystem-relative path (packages/producer/src/services/fileServer.ts:449-465). The current test covers spaces, but IDs/path segments containing reserved characters like #, ?, or % can be encoded into a URL that the server then looks up literally, producing a 404.

Suggested fix: explicitly decode URL path segments in the file server before joining, while preserving containment checks, and add tests for reserved characters if those IDs are supported.
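A sketch of the suggested decode-then-contain fix; the function name and shape are illustrative stand-ins for the file server's catch-all route. Each URL path segment is decoded with decodeURIComponent before joining onto the serving root, and the containment check still rejects traversal out of that root.

```typescript
import * as path from "node:path";

// Illustrative sketch of the suggested fix, not the actual fileServer
// code: decode each URL path segment before hitting the filesystem,
// while preserving the containment check against the serving root.
function resolveServedPath(root: string, urlPath: string): string | null {
  const decoded = urlPath
    .split("/")
    .filter((segment) => segment.length > 0)
    .map((segment) => decodeURIComponent(segment));
  const candidate = path.resolve(root, decoded.join(path.sep));
  const relative = path.relative(path.resolve(root), candidate);
  // Containment check: reject anything that escapes the root, even
  // traversal smuggled in as encoded segments like "%2e%2e".
  if (relative.startsWith("..") || path.isAbsolute(relative)) return null;
  return candidate;
}
```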

CI being green is good, and the screenshot-fallback follow-up looks correct. These issues appear to be separate from the Linux worker-count regression that was already fixed in 69dfafd1.

miguel-heygen added a commit that referenced this pull request May 2, 2026
## Summary

Fixes three issues identified in the post-merge review of PR #596:

- **P1 (cache bypass):** When `extractCacheDir` is set, extracted frames live outside `compiledDir`, so `createCompiledFrameSrcResolver` rejects them and every frame falls back to base64 data URIs. Fix: symlink cached frame directories into `compiledDir/__hyperframes_video_frames/` after extraction and remap `framePaths` so the served-frame fast path works.

- **P2 (pooled browser stale state):** `closeCaptureSession` force-killed the Chrome process on timeout via raw `SIGKILL` without clearing `pooledBrowser` / `pooledBrowserRefCount`, leaving other sessions with a dead browser reference. Fix: add `forceReleaseBrowser()` in `browserManager` that atomically clears pool state before killing the process.

- **P3 (reserved chars in URLs):** `createCompiledFrameSrcResolver` encodes path segments with `encodeURIComponent`, but the file server used `c.req.path` (which only applies `decodeURI`) to look up files on disk. Video IDs containing `#`, `?`, or `%` produced 404s. Fix: apply `decodeURIComponent` per path segment in the file server's catch-all route.
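The P1 remap might look roughly like this. Only the `__hyperframes_video_frames` directory name comes from this PR; the `videoId` parameter and the function shape are illustrative assumptions.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Illustrative sketch of the P1 fix: link a cached frame directory
// under compiledDir/__hyperframes_video_frames/ and remap the frame
// paths so they fall inside compiledDir. Only the directory name comes
// from the PR; videoId and the function shape are assumptions.
function remapCachedFrames(
  compiledDir: string,
  videoId: string,
  cachedDir: string,
  framePaths: string[],
): string[] {
  const servedRoot = path.join(compiledDir, "__hyperframes_video_frames");
  fs.mkdirSync(servedRoot, { recursive: true });
  const linkPath = path.join(servedRoot, videoId);
  if (!fs.existsSync(linkPath)) {
    fs.symlinkSync(cachedDir, linkPath, "dir");
  }
  // Remapped paths now live under compiledDir, so the served-frame
  // resolver accepts them instead of falling back to base64 data URIs.
  return framePaths.map((p) => path.join(linkPath, path.basename(p)));
}
```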

## Test plan

- [x] `createCompiledFrameSrcResolver` tests: symlinked cache paths resolve to served URLs; cache-external paths return null; reserved characters encode correctly
- [x] `forceReleaseBrowser` tests: kills process + disconnects; tolerates already-killed process
- [x] `createFileServer` test: `video%231/frame.jpg` serves file from `video#1/frame.jpg` on disk
- [x] Typecheck: engine + producer pass
- [x] Lint + format: 0 warnings, 0 errors
