encoded_video_ingest (sderosa)#1048

Draft
stephen-derosa wants to merge 15 commits into main from sderosa/pre-encoded-ingest

@stephen-derosa
Contributor

@stephen-derosa stephen-derosa commented Apr 27, 2026

Overview

This PR adds encoded video ingest support, allowing callers to publish pre-compressed video frames through a VIDEO_SOURCE_ENCODED source instead of sending raw frames through WebRTC’s normal encoder path.

It introduces the FFI/protobuf surface for creating encoded video sources and pushing encoded access units, wires those sources into WebRTC using a passthrough encoder, and forwards encoder-side feedback such as keyframe requests and target bitrate changes. For H.264/H.265, the native source also caches parameter sets and prepends them to keyframes when needed.

This solves the need to ingest externally encoded video without decoding and re-encoding it inside the SDK. To reproduce the original limitation, attempt to publish an already-encoded H.264/H.265/VPx/AV1 stream through the existing raw video source APIs; the SDK only accepted raw video frames and would route them through normal encoding.

Breaking changes

None.

MSRV

No MSRV changes.

Testing

Added/updated tests for:

  • encoded source behavior
  • queue handling
  • metadata propagation
  • ingest configuration
  • examples

Manual validation can be performed with the encoded video ingest example by sending an externally encoded stream into the SDK and verifying that it publishes without re-encoding.

Async

No changes to the runtime model were made; the feature uses the existing LiveKit video track mechanisms.

API Examples

There are two producer APIs: the helper TCP ingest API for common “external encoder over TCP” workflows, and the base encoded source API for applications that already own demuxing, frame boundaries, and encoder control.

Helper API: EncodedTcpIngest
Use EncodedTcpIngest when the producer sends an encoded stream over TCP, for example from GStreamer.

// room setup
let mut room_options = RoomOptions::default();
room_options.auto_subscribe = false;
room_options.dynacast = false;
let (room, _events) = Room::connect(&url, &token, room_options).await?;
// encoded TCP ingest setup
let mut opts = EncodedTcpIngestOptions::new(
    5005,
    VideoCodec::H264,
    640,
    480,
);
opts.host = "127.0.0.1".to_string();
opts.track_name = Some("encoded-h264".to_string());
opts.max_bitrate_bps = Some(2_500_000);
opts.max_framerate_fps = 30.0;
// start ingest and publish the track
let ingest = EncodedTcpIngest::start(room.local_participant(), opts).await?;
ingest.set_observer(Arc::new(MyObserver));
The helper owns the TCP reconnect loop, encoded frame parsing, keyframe detection, source creation, and track publishing, and it calls capture_frame internally.
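
The keyframe detection that the helper performs is not shown in this description. As a rough illustration only, and assuming the TCP stream carries H.264 in Annex B byte-stream form (NAL units separated by 00 00 01 or 00 00 00 01 start codes), a detector might look like the sketch below. The names nal_units and is_h264_keyframe are hypothetical and are not part of this PR's API.

```rust
/// Splits an Annex B byte stream into NAL unit payloads (start codes removed).
/// Hypothetical helper for illustration; not the parser used by the PR.
pub fn nal_units(data: &[u8]) -> Vec<&[u8]> {
    // Record (start_code_position, payload_start) for each 00 00 01 start code.
    let mut marks: Vec<(usize, usize)> = Vec::new();
    let mut i = 0;
    while i + 3 <= data.len() {
        if data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1 {
            marks.push((i, i + 3));
            i += 3;
        } else {
            i += 1;
        }
    }

    let mut units = Vec::new();
    for (idx, &(_, start)) in marks.iter().enumerate() {
        let mut end = if idx + 1 < marks.len() { marks[idx + 1].0 } else { data.len() };
        // Trim trailing zero bytes; this also removes the leading zero of a
        // following four-byte start code.
        while end > start && data[end - 1] == 0 {
            end -= 1;
        }
        if end > start {
            units.push(&data[start..end]);
        }
    }
    units
}

/// Returns true if the access unit contains an IDR slice (H.264 NAL type 5).
pub fn is_h264_keyframe(access_unit: &[u8]) -> bool {
    nal_units(access_unit).iter().any(|nal| (nal[0] & 0x1F) == 5)
}
```

A caller would pass each complete access unit to is_h264_keyframe and use the result for the is_keyframe field. H.265 uses a two-byte NAL header with different type values, so it would need its own check.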

Base API: NativeEncodedVideoSource
Use the base API when the application already has complete encoded access units and wants to push them directly.

// create source and track
let source = NativeEncodedVideoSource::new(
    VideoCodec::H264,
    VideoResolution { width: 640, height: 480 },
);
source.set_observer(Arc::new(MyObserver));
let track = LocalVideoTrack::create_video_track(
    "encoded-h264",
    RtcVideoSource::Encoded(source.clone()),
);
// publish
room.local_participant()
    .publish_track(LocalTrack::Video(track), TrackPublishOptions::default())
    .await?;
// push each complete encoded picture/access unit
let info = EncodedFrameInfo {
    is_keyframe,
    has_sps_pps: false,
    width: 640,
    height: 480,
    capture_time_us: 0,
};
if !source.capture_frame(&encoded_access_unit, &info) {
    // queue was full; frame was dropped
}
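
The false return value above signals that the source's queue was full and the frame was dropped. The PR's queue and its capacity are not shown here; the following is only a minimal sketch of a drop-on-full policy, using a hypothetical BoundedFrameQueue type that is not part of the SDK.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Hypothetical bounded queue illustrating a drop-on-full policy.
/// The capacity is chosen by the caller; the real limit is an unknown here.
pub struct BoundedFrameQueue {
    capacity: usize,
    frames: Mutex<VecDeque<Vec<u8>>>,
}

impl BoundedFrameQueue {
    pub fn new(capacity: usize) -> Self {
        Self {
            capacity,
            frames: Mutex::new(VecDeque::with_capacity(capacity)),
        }
    }

    /// Producer side: returns false, and drops the frame, when the queue is full.
    pub fn push(&self, frame: &[u8]) -> bool {
        let mut frames = self.frames.lock().unwrap();
        if frames.len() >= self.capacity {
            return false;
        }
        frames.push_back(frame.to_vec());
        true
    }

    /// Consumer side: takes the oldest queued frame, if any.
    pub fn pop(&self) -> Option<Vec<u8>> {
        self.frames.lock().unwrap().pop_front()
    }
}
```

Because later frames in a predictively coded stream depend on earlier ones, a producer that sees a drop may want to ask its encoder for a fresh keyframe rather than continue as if nothing happened.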

For most users, EncodedTcpIngest is the more complete tool, because it handles parsing, keyframe detection, and publishing itself.

Introduce a video track source that accepts pre-encoded frames and a
matching WebRTC encoder that forwards them unchanged, bypassing real
encoding while preserving RTP, pacing, and congestion control.

Per-track routing uses VideoFrame::id() as a side channel plus a global
EncodedSourceRegistry. A LazyVideoEncoder picks between the passthrough
and the real encoder on the first Encode() call.
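
The routing described above lives on the C++ side and is not reproduced in this description. The sketch below only models the idea in Rust, under stated assumptions: a 16-bit frame id is used as the lookup key, the registry holds weak references, and the lazy encoder fixes its choice on the first encode call. EncodedSource, Backend, LazyEncoder, register, and lookup are all hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock, Weak};

/// Stand-in for an encoded source; the real type lives on the C++ side.
pub struct EncodedSource {
    pub name: String,
}

/// Hypothetical global registry mapping a frame id to its encoded source.
fn registry() -> &'static Mutex<HashMap<u16, Weak<EncodedSource>>> {
    static REGISTRY: OnceLock<Mutex<HashMap<u16, Weak<EncodedSource>>>> = OnceLock::new();
    REGISTRY.get_or_init(|| Mutex::new(HashMap::new()))
}

pub fn register(id: u16, source: &Arc<EncodedSource>) {
    registry().lock().unwrap().insert(id, Arc::downgrade(source));
}

pub fn lookup(id: u16) -> Option<Arc<EncodedSource>> {
    registry().lock().unwrap().get(&id).and_then(|weak| weak.upgrade())
}

pub enum Backend {
    Passthrough(Arc<EncodedSource>),
    Real,
}

/// Defers the passthrough-or-real decision until the first frame arrives.
pub struct LazyEncoder {
    backend: Option<Backend>,
}

impl LazyEncoder {
    pub fn new() -> Self {
        Self { backend: None }
    }

    /// Picks a backend on the first call and keeps it for all later calls.
    pub fn encode(&mut self, frame_id: u16) -> &'static str {
        let backend = self.backend.get_or_insert_with(|| match lookup(frame_id) {
            Some(source) => Backend::Passthrough(source),
            None => Backend::Real,
        });
        match backend {
            Backend::Passthrough(_) => "passthrough",
            Backend::Real => "real",
        }
    }
}
```

Holding weak references in the registry means a dropped source disappears from lookups without an explicit unregister step; whether the PR does the same is not stated.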

Single-layer only; callers manage simulcast with multiple sources.

Rust wrapper around webrtc-sys::EncodedVideoTrackSource. Adds the
Encoded variant to RtcVideoSource, VideoCodec/EncodedFrameInfo types,
and an EncodedVideoSourceObserver trait for keyframe-request callbacks
from the C++ side.

PeerConnectionFactory gains create_video_track_from_encoded_source.

Dispatch RtcVideoSource::Encoded through the new PCF path in
LocalVideoTrack, and normalize TrackPublishOptions for encoded sources
in LocalParticipant::publish_track — simulcast is forced off and the
codec is pinned to the source's codec, with warnings on override.
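
The normalization described in this commit can be pictured with the sketch below. It uses simplified stand-in types (Codec, PublishOptions) rather than the SDK's TrackPublishOptions, and the function name and warning text are assumptions made for illustration.

```rust
/// Simplified stand-in for the SDK's codec enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Codec {
    H264,
    H265,
    Vp8,
    Vp9,
    Av1,
}

/// Simplified stand-in for the publish options that matter here.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct PublishOptions {
    pub simulcast: bool,
    pub codec: Codec,
}

/// Forces simulcast off and pins the codec to the source's codec.
/// Returns the normalized options plus one warning per overridden field.
pub fn normalize_for_encoded(
    mut opts: PublishOptions,
    source_codec: Codec,
) -> (PublishOptions, Vec<String>) {
    let mut warnings = Vec::new();
    if opts.simulcast {
        warnings.push("simulcast is not supported for encoded sources; disabling".to_string());
        opts.simulcast = false;
    }
    if opts.codec != source_codec {
        warnings.push(format!(
            "codec {:?} overridden with source codec {:?}",
            opts.codec, source_codec
        ));
        opts.codec = source_codec;
    }
    (opts, warnings)
}
```

Returning the warnings instead of logging them keeps the sketch testable; the PR itself is described as emitting warnings on override.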

Protobuf:
  * NewVideoSourceRequest.encoded_options + VideoSourceType.Encoded
  * CaptureEncodedVideoFrame request/response
  * EncodedVideoSourceEvent (keyframe requested, target bitrate)
  * VideoSourceInfo.encoded_source_id

Server wires the new variant through FfiVideoSource, forwards observer
callbacks to FfiEvent, and rejects capture_frame on encoded sources.
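
How observer callbacks turn into events is not spelled out in this message. One plausible shape, sketched here with hypothetical names (SourceObserver, EventForwarder, EncodedSourceEvent) and an ordinary std channel standing in for the FFI event path, is an observer implementation that converts each callback into a message:

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

/// Hypothetical event type mirroring the two kinds of feedback named above.
#[derive(Debug, PartialEq, Eq)]
pub enum EncodedSourceEvent {
    KeyframeRequested,
    TargetBitrate { bps: u32 },
}

/// Hypothetical observer trait; the method names are assumptions.
pub trait SourceObserver: Send + Sync {
    fn on_keyframe_requested(&self);
    fn on_target_bitrate(&self, bps: u32);
}

/// Forwards every callback into a channel for another thread to consume.
pub struct EventForwarder {
    tx: Mutex<Sender<EncodedSourceEvent>>,
}

impl EventForwarder {
    pub fn new() -> (Self, Receiver<EncodedSourceEvent>) {
        let (tx, rx) = channel();
        (Self { tx: Mutex::new(tx) }, rx)
    }

    fn emit(&self, event: EncodedSourceEvent) {
        // A send error only means the receiver is gone, so it is ignored.
        let _ = self.tx.lock().unwrap().send(event);
    }
}

impl SourceObserver for EventForwarder {
    fn on_keyframe_requested(&self) {
        self.emit(EncodedSourceEvent::KeyframeRequested);
    }

    fn on_target_bitrate(&self, bps: u32) {
        self.emit(EncodedSourceEvent::TargetBitrate { bps });
    }
}
```

On the receiving end, a producer would typically react to KeyframeRequested by forcing an IDR from its external encoder and to TargetBitrate by adjusting that encoder's rate control.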

Encoded track source now scans incoming frames for SPS/PPS (H.264) or
VPS/SPS/PPS (H.265), caches the latest seen set, and prepends them to
any keyframe that arrives without inline params. This makes hardware
encoders and camera feeds that only emit parameter sets on stream
start usable as-is, without requiring producers to replicate them on
every IDR.

Producers still get a clear warning if the very first keyframe has no
parameter sets and the cache is empty. The caller-supplied has_sps_pps
flag becomes a hint only; the scanner is the source of truth so
double-prepending is impossible.
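
The scan, cache, and prepend behaviour described above can be sketched as follows. This is an illustration only, written in Rust rather than the native code the commit touches, limited to H.264 in Annex B form; ParamSetCache and nal_units are hypothetical names.

```rust
const START_CODE: [u8; 4] = [0, 0, 0, 1];
const NAL_SPS: u8 = 7;
const NAL_PPS: u8 = 8;

/// Splits an Annex B buffer into NAL unit payloads (start codes removed).
fn nal_units(data: &[u8]) -> Vec<&[u8]> {
    let mut marks: Vec<(usize, usize)> = Vec::new();
    let mut i = 0;
    while i + 3 <= data.len() {
        if data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1 {
            marks.push((i, i + 3));
            i += 3;
        } else {
            i += 1;
        }
    }
    let mut units = Vec::new();
    for (idx, &(_, start)) in marks.iter().enumerate() {
        let mut end = if idx + 1 < marks.len() { marks[idx + 1].0 } else { data.len() };
        // Trailing zeros belong to the next start code, not to this NAL unit.
        while end > start && data[end - 1] == 0 {
            end -= 1;
        }
        if end > start {
            units.push(&data[start..end]);
        }
    }
    units
}

/// Remembers the most recent SPS and PPS seen in the stream.
#[derive(Default)]
pub struct ParamSetCache {
    sps: Option<Vec<u8>>,
    pps: Option<Vec<u8>>,
}

impl ParamSetCache {
    /// Scans `frame`, refreshes the cache, and returns the bytes to send.
    /// Cached parameter sets are prepended only when the scan shows that a
    /// keyframe is missing them, so the caller's flag is never trusted.
    pub fn process(&mut self, frame: &[u8], is_keyframe: bool) -> Vec<u8> {
        let (mut has_sps, mut has_pps) = (false, false);
        for nal in nal_units(frame) {
            match nal[0] & 0x1F {
                NAL_SPS => {
                    self.sps = Some(nal.to_vec());
                    has_sps = true;
                }
                NAL_PPS => {
                    self.pps = Some(nal.to_vec());
                    has_pps = true;
                }
                _ => {}
            }
        }
        let mut out = Vec::with_capacity(frame.len());
        if is_keyframe && !(has_sps && has_pps) {
            // With an empty cache nothing can be prepended; a real
            // implementation would log a warning here.
            if let (Some(sps), Some(pps)) = (&self.sps, &self.pps) {
                for param_set in [sps, pps] {
                    out.extend_from_slice(&START_CODE);
                    out.extend_from_slice(param_set);
                }
            }
        }
        out.extend_from_slice(frame);
        out
    }
}
```

An H.265 variant would follow the same pattern with a third cached slot for the VPS and the two-byte H.265 NAL header.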

Also fix a stale `src->get()` reference left over from the SetRates
refactor in PassthroughVideoEncoder::Encode.
@stephen-derosa stephen-derosa requested a review from ladvoc April 27, 2026 14:32
@stephen-derosa stephen-derosa marked this pull request as draft April 27, 2026 14:32
@stephen-derosa stephen-derosa changed the title Sderosa/pre encoded ingest pre encoded ingest (sderosa) Apr 27, 2026
@stephen-derosa stephen-derosa force-pushed the sderosa/pre-encoded-ingest branch from d8c2cb1 to 286e052 Compare April 27, 2026 14:52
@stephen-derosa stephen-derosa force-pushed the sderosa/pre-encoded-ingest branch from 9324761 to 9619309 Compare April 27, 2026 20:05
@github-actions
Contributor

github-actions Bot commented Apr 28, 2026

No changeset found

This PR modifies the following packages but doesn't include a changeset:

Directly changed:

  • libwebrtc
  • livekit
  • livekit-ffi
  • webrtc-sys

Click here to create a changeset

The link pre-populates a changeset file with patch bumps for all affected packages.
Edit the description and bump types as needed before committing.

If this change doesn't require a version bump, add the internal label to this PR.

@stephen-derosa stephen-derosa changed the title pre encoded ingest (sderosa) encoded_video_ingest (sderosa) Apr 29, 2026
@stephen-derosa stephen-derosa self-assigned this Apr 29, 2026
@stephen-derosa stephen-derosa added the enhancement, examples, and Rust labels Apr 29, 2026
Comment thread livekit-ffi/protocol/encoded_tcp_ingest.proto Outdated
@xianshijing-lk
Contributor

This project involves API design and changes, and might touch a lot of code.
Can we start by filling in a new project template (https://www.notion.so/livekit/LiveKit-Client-SDK-New-Product-Development-Process-Template-3353c4901a428072912aef1f44c66b7a) and following some of the instructions there?

@xianshijing-lk
Contributor

I would think the team will need to align on the API designs and architecture.

Labels

enhancement, examples, Rust

3 participants