fix(engine): track and time-bound ffprobe spawns#1576

Open
calcarazgre646 wants to merge 1 commit into
heygen-com:mainfrom
calcarazgre646:fix/ffprobe-track-and-timeout

@calcarazgre646

Contributor

Problem

runFfprobe (ffprobe.ts) was the only child-process spawn in the engine without trackChildProcess or a timeout. Every other spawn (runFfmpeg, the chunk and streaming encoders, the frame extractor) registers with the process tracker and bounds itself with a timeout. As a result, a call to runFfprobe never settled when ffprobe hung (malformed container, stalled network filesystem, a FIFO), and killTrackedProcesses() on render abort/teardown could not reap the child because it was never in the tracked set. analyzeKeyframeIntervals is the worst case: it probes every keyframe and can run long.
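For readers unfamiliar with the pattern, here is a minimal sketch of why an untracked child cannot be reaped. `track` and `killTracked` are hypothetical stand-ins for trackChildProcess and killTrackedProcesses; the engine's real signatures are not shown in this PR and may differ.

```typescript
// Hypothetical sketch, not the engine's implementation.
type Signal = "SIGTERM" | "SIGKILL";

// Minimal structural type: something that can be signalled and reports its exit.
// Node's ChildProcess is intended to fit this shape.
interface Killable {
  kill(signal?: Signal): boolean;
  once(event: "exit", listener: () => void): unknown;
}

const tracked = new Set<Killable>();

// Analogue of trackChildProcess: remember the child until it exits.
function track<T extends Killable>(child: T): T {
  tracked.add(child);
  child.once("exit", () => tracked.delete(child));
  return child;
}

// Analogue of killTrackedProcesses: signal every child still in the set
// and return how many accepted the signal.
function killTracked(signal: Signal = "SIGTERM"): number {
  let signalled = 0;
  for (const child of tracked) {
    if (child.kill(signal)) signalled++;
  }
  return signalled;
}
```

A child that was never passed to `track` is invisible to `killTracked`, which is the situation the ffprobe spawn was in before this change.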

Change

Register the child with trackChildProcess and add a 120s timeout that SIGTERMs and rejects, mirroring runFfmpeg. The 120s bound is generous for the slowest legitimate probe while capping a true hang.
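The shape of the change can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `runWithTimeout` and its `onSpawn` hook are hypothetical names, and the real runFfprobe may collect output and report errors differently.

```typescript
import { spawn, ChildProcess } from "node:child_process";

// Hypothetical helper: spawn a command, collect stdout, and reject if the
// process has not exited within timeoutMs. `onSpawn` is where a process
// tracker (such as trackChildProcess) would be given the child.
function runWithTimeout(
  command: string,
  args: string[],
  timeoutMs: number,
  onSpawn: (child: ChildProcess) => void = () => {},
): Promise<string> {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, { stdio: ["ignore", "pipe", "pipe"] });
    onSpawn(child);

    let stdout = "";
    let stderr = "";
    let timedOut = false;
    child.stdout?.on("data", (chunk) => { stdout += chunk; });
    child.stderr?.on("data", (chunk) => { stderr += chunk; });

    // On timeout, ask the child to terminate and reject right away so the
    // caller is never left waiting on a hung process.
    const timer = setTimeout(() => {
      timedOut = true;
      child.kill("SIGTERM");
      reject(new Error(`${command} timed out after ${timeoutMs}ms`));
    }, timeoutMs);

    child.on("error", (err) => {
      clearTimeout(timer);
      reject(err);
    });
    child.on("close", (code) => {
      clearTimeout(timer);
      if (timedOut) return; // the promise was already rejected by the timer
      if (code === 0) resolve(stdout);
      else reject(new Error(`${command} exited with code ${code}: ${stderr}`));
    });
  });
}
```

With the bound described above, a call would pass `120_000` as `timeoutMs`. A child that ignores SIGTERM would still linger in this sketch; escalating to SIGKILL after a grace period is a separate design choice that this example does not cover.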

gpuEncoder's spawns are one-shot, cached capability detection with their own timeout and SIGKILL grace, not render-pipeline probes over untrusted input, so they are intentionally out of scope here.

Tests

Adds a timeout test: a hung ffprobe is killed and runFfprobe rejects once the timeout elapses, instead of leaking the process. The test is verified load-bearing: reverting the fix makes it hang until the vitest timeout. Full engine suite is green (743 tests).
