From 869850a7cbb86ec0b5e62a1e797ea3d8d7622815 Mon Sep 17 00:00:00 2001
From: Jamie Kirkpatrick <jkp@kirkconsulting.co.uk>
Date: Sat, 18 Apr 2026 11:03:10 +0100
Subject: [PATCH] feat(livekit): auto-flag stereo on audio tracks with
 num_channels == 2

Without setting TF_STEREO in AddTrackRequest.audio_features (or the
deprecated stereo bool), the LiveKit server negotiates the published
audio track as mono. libwebrtc's Opus encoder then downmixes any
asymmetric stereo content to mono duplicated across both channels, so
the receiver sees identical L and R on every frame regardless of what
the publisher pushed via capture_frame.

The JS SDK sets these same flags based on
``MediaStreamTrack.getSettings().channelCount`` or an explicit
``opts.forceStereo`` (see the ``isStereo`` handling in
livekit/client-sdk-js LocalParticipant.ts). The Rust SDK exposes no
equivalent option and does not infer stereo from the audio source's
declared ``num_channels``.

This patch closes that gap: if the track's underlying source declares
``num_channels == 2``, flag the track as stereo on the AddTrackRequest.
``RtcAudioSource::num_channels()`` is not a public accessor (generated
via ``enum_dispatch!``), so we match the ``Native`` variant directly
and keep a wildcard arm for the ``#[non_exhaustive]`` enum.

Verified end-to-end against a Chrome client via
``MediaStreamTrackProcessor``: before the patch, L == R on every frame
even though all our content was on R; after, L stays at the codec floor
(below -100 dBFS) while R carries the TTS speech envelope, matching
what the publisher pushed.

Context: the timeline-protocol-v6 stereo marker channel (silent L,
TTS on R, for sample-aligned marker tones) kept showing identical L
and R at the client despite the SDP advertising stereo=1 and the
AudioSource being constructed with num_channels=2. Every workaround
we tried on the Python side (APM options, ``SOURCE_SCREENSHARE_AUDIO``,
``max_bitrate`` pinning, native 48 kHz source, disabling libwebrtc
voice processing) failed because the server-side SDP negotiation had
already locked the track to mono before our audio ever reached the
encoder.
---
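Note for reviewers: the L/R comparison described in the commit message
was done in Chrome. The sketch below shows one way the same check could
be expressed in Rust on a decoded interleaved 16-bit stereo buffer. It
is an illustration only and is not part of this patch; the helper names
``channel_peaks_dbfs`` and ``channels_identical`` are hypothetical, and
the i16 interleaved layout is an assumption about how the received
frames are presented.

```rust
/// Peak level of each channel of an interleaved stereo i16 buffer, in dBFS.
/// Hypothetical helper: not part of the LiveKit SDK or of this patch.
/// A channel that is pure digital silence yields f64::NEG_INFINITY.
fn channel_peaks_dbfs(interleaved: &[i16]) -> (f64, f64) {
    let mut peak_l: u32 = 0;
    let mut peak_r: u32 = 0;
    // Each chunk is one stereo frame: [left, right].
    for frame in interleaved.chunks_exact(2) {
        peak_l = peak_l.max(u32::from(frame[0].unsigned_abs()));
        peak_r = peak_r.max(u32::from(frame[1].unsigned_abs()));
    }
    // Full scale for i16 is taken as 32768 (the magnitude of i16::MIN).
    let to_dbfs = |peak: u32| {
        if peak == 0 {
            f64::NEG_INFINITY
        } else {
            20.0 * (f64::from(peak) / 32768.0).log10()
        }
    };
    (to_dbfs(peak_l), to_dbfs(peak_r))
}

/// True when every frame carries the same sample on L and R, which is
/// the symptom described above for a track negotiated as mono.
fn channels_identical(interleaved: &[i16]) -> bool {
    interleaved.chunks_exact(2).all(|frame| frame[0] == frame[1])
}
```

With a silent-L / content-on-R publisher, one would expect
``channels_identical`` to return true before this patch and false
after it, with the L peak far below the R peak. Lossy Opus decoding may
leave low-level noise on L rather than exact zeros, so the peak
comparison is the more robust of the two checks.
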
 .../src/room/participant/local_participant.rs | 20 +++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/livekit/src/room/participant/local_participant.rs b/livekit/src/room/participant/local_participant.rs
index d37015dac..c8a638acd 100644
--- a/livekit/src/room/participant/local_participant.rs
+++ b/livekit/src/room/participant/local_participant.rs
@@ -316,6 +316,26 @@ impl LocalParticipant {
             req.audio_features.push(proto::AudioTrackFeature::TfPreconnectBuffer as i32);
         }
 
+        // Auto-flag stereo on audio tracks whose underlying source declares
+        // num_channels == 2. Without this, the server treats the track as
+        // mono and libwebrtc's Opus path downmixes asymmetric stereo to
+        // mono-duplicated-both-channels (identical content on L and R).
+        // This matches the JS client SDK's forceStereo+audioFeatures behaviour.
+        // RtcAudioSource's num_channels() accessor is private (generated via
+        // enum_dispatch!), so we match the variant directly.
+        if let LocalTrack::Audio(audio_track) = &track {
+            use libwebrtc::audio_source::RtcAudioSource;
+            let is_stereo = match audio_track.rtc_source() {
+                RtcAudioSource::Native(native) => native.num_channels() == 2,
+                #[allow(unreachable_patterns)]
+                _ => false,
+            };
+            if is_stereo {
+                req.audio_features.push(proto::AudioTrackFeature::TfStereo as i32);
+                req.stereo = true;
+            }
+        }
+
         let mut encodings = Vec::default();
         match &track {
             LocalTrack::Video(video_track) => {