Summary
On the ja/live-runner branch, MediaPublish cuts trickle segments at a hardcoded 2 s for audio-only streams, regardless of how MediaPublishConfig.min_segment_wallclock_s is configured. This sets an unnecessary ~2 s latency floor on any audio-only trickle publisher and makes the existing min_segment_wallclock_s knob misleading: changing it has no observable effect.
Where
src/livepeer_gateway/media_publish.py around line 360, in MediaPublish.__init__:
    video_configs = [track.config for track in self._video_tracks if isinstance(track.config, VideoOutputConfig)]
    self._segment_time_s = (
        min(float(track.keyframe_interval_s) for track in video_configs)
        if video_configs
        else 2.0  # <-- audio-only branch: hardcoded, ignores config
    )
self._segment_time_s is then passed to PyAV's segment muxer as segment_time (line ~638), which is what actually drives segment rotation. For audio-only streams, this value is therefore the effective dial. min_segment_wallclock_s is only checked as a lower bound in _should_close_segment_after_loop() (line ~904), so it can never shrink segments below whatever the muxer already cut.
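The interaction between the two settings can be illustrated with a deliberately simplified toy model. This is a sketch, not the code in media_publish.py: the function name effective_segment_s is hypothetical, and the model assumes only what is described above, namely that the muxer's segment_time sets the cut and the wall-clock value acts purely as a lower bound.

```python
def effective_segment_s(muxer_segment_time_s: float, min_wallclock_s: float) -> float:
    """Toy model (assumption): the wall-clock minimum can delay a close,
    but it can never make a segment shorter than the muxer's own cut."""
    return max(float(muxer_segment_time_s), float(min_wallclock_s))


# Audio-only today: the muxer is fixed at 2.0 s, so asking for 0.3 s changes nothing.
before = effective_segment_s(2.0, 0.3)

# If the muxer segment time followed the config, 0.3 s would take effect.
after = effective_segment_s(0.3, 0.3)
```

Under this model, lowering min_segment_wallclock_s below the muxer's segment time has no effect, which matches the behavior reported in this issue.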
Reproduction
- Construct an audio-only publisher (no video tracks):
    publisher = MediaPublish(
        in_url,
        config=MediaPublishConfig(
            min_segment_wallclock_s=0.3,  # asks for ~300 ms segments
            tracks=[AudioOutputConfig(
                codec="aac", format="fltp", sample_rate=48000,
                layout="mono", queue_size=512,
            )],
        ),
    )
- Publish a continuous audio stream (e.g. mic capture via ffmpeg → PyAV).
- Observe the receiver's segment cadence — the orchestrator's trickle logs are convenient:
INFO POST completed stream=session_…-in idx=0 bytes=43240 took=1.922287527s
INFO POST completed stream=session_…-in idx=1 bytes=29516 took=1.520760720s
Segments arrive at ~2 s intervals at ~30–43 KB each, irrespective of the min_segment_wallclock_s=0.3 setting.
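To summarize cadence from a longer log capture, a small parser along these lines may help. It is a minimal sketch that assumes the line format seen in the two sample lines above (idx=, bytes=, took= fields); the function name summarize_posts and the regex are illustrative and not part of the orchestrator.

```python
import re
from statistics import mean

# Field layout inferred from the two sample log lines only (assumption).
_POST_RE = re.compile(r"POST completed .*?idx=(\d+) bytes=(\d+) took=([0-9.]+)s")


def summarize_posts(lines):
    """Return segment count, mean size, and mean POST duration, or None if nothing matched."""
    rows = []
    for line in lines:
        m = _POST_RE.search(line)
        if m:
            rows.append((int(m.group(1)), int(m.group(2)), float(m.group(3))))
    if not rows:
        return None
    return {
        "segments": len(rows),
        "mean_bytes": mean(r[1] for r in rows),
        "mean_took_s": mean(r[2] for r in rows),
    }


sample = [
    "INFO POST completed stream=session_…-in idx=0 bytes=43240 took=1.922287527s",
    "INFO POST completed stream=session_…-in idx=1 bytes=29516 took=1.520760720s",
]
summary = summarize_posts(sample)
```

The took= value is used here the same way as in the observation above: as a proxy for how long each segment stayed open.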
Measured impact
Same client / orchestrator / runner; only difference is the segment-time fix below:
| | Before (else 2.0) | After (else min_segment_wallclock_s, set to 0.3) |
| --- | --- | --- |
| Segment POST cadence | 1.5–1.9 s | 0.31–0.64 s |
| Segment size | 30–43 KB | 6–13 KB |
| End-to-end caption lag in our live STT runner | ~2 s | sub-second |
Context: we use this branch to publish mic audio over trickle to a hello-captions runner that emits captions via SSE. The 2 s floor was the dominant component of end-to-end caption lag.
Proposed fix
Use min_segment_wallclock_s as the audio-only segment time, with a small floor for sanity:
    self._segment_time_s = (
        min(float(track.keyframe_interval_s) for track in video_configs)
        if video_configs
        else max(0.1, float(config.min_segment_wallclock_s))
    )
Notes on the choice:
- Backward-compatible: anyone who didn't override min_segment_wallclock_s keeps the existing config default (1.0 s), which is shorter than the old 2.0 s hardcode but still conservative.
- Aligns the semantics of min_segment_wallclock_s with its name on the audio-only path. (On the video path, segment time has to track keyframe cadence, so leaving that branch alone is correct.)
- The 0.1 s floor protects against pathological zero/negative inputs.
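The selection logic in the proposed fix can also be expressed as a standalone function, which makes the edge cases easy to check in isolation. This is a sketch only: pick_segment_time_s and its parameters are hypothetical names, and the eventual PR may structure the change differently.

```python
def pick_segment_time_s(keyframe_intervals_s, min_segment_wallclock_s, floor_s=0.1):
    """Mirror the proposed expression outside of MediaPublish.__init__.

    keyframe_intervals_s: keyframe intervals of the video tracks (empty for audio-only).
    min_segment_wallclock_s: the configured minimum segment wall-clock time.
    floor_s: sanity floor applied on the audio-only path.
    """
    intervals = [float(k) for k in keyframe_intervals_s]
    if intervals:
        # Video path is unchanged: segment time tracks the shortest keyframe interval.
        return min(intervals)
    # Audio-only path: follow the config, clamped to the floor.
    return max(float(floor_s), float(min_segment_wallclock_s))


audio_only = pick_segment_time_s([], 0.3)
clamped = pick_segment_time_s([], 0.0)
with_video = pick_segment_time_s([2.0, 1.0], 0.3)
```

Keeping the floor as a parameter is a choice made for this sketch; the proposed change above hardcodes it at 0.1.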
A PR will follow.
Sibling concern (not part of this issue)
VideoOutputConfig.keyframe_interval_s similarly defaults to 2.0. The downstream effect is analogous (segments can't be shorter than the keyframe interval), but the right default is more nuanced for video — shorter intervals trade latency against bitrate. Worth a separate look if low-latency video is a goal; happy to file a separate issue if useful.