From 8d517471ed05e227c723befccfa7adaec39771f0 Mon Sep 17 00:00:00 2001
From: Alex-Wengg
Date: Fri, 10 Apr 2026 22:41:54 -0400
Subject: [PATCH 1/2] docs: Fix speaker diarization model references from 3.1
 to community-1

- Update code comment in SegmentationProcessor.swift
- Update CLAUDE.md model source reference
- Update Documentation/Benchmarks.md to clarify both online/offline use
  community-1

Co-Authored-By: Claude Sonnet 4.5
---
 CLAUDE.md                                                 | 2 +-
 Documentation/Benchmarks.md                               | 2 +-
 .../Diarizer/Segmentation/SegmentationProcessor.swift     | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index aba8bc55a..4de278c47 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -180,7 +180,7 @@ GitHub Actions workflows:
 
 ## Model Sources
 
-- **Diarization**: [pyannote/speaker-diarization-3.1](https://huggingface.co/pyannote/speaker-diarization-3.1)
+- **Diarization**: [FluidInference/speaker-diarization-coreml](https://huggingface.co/FluidInference/speaker-diarization-coreml) (based on pyannote/speaker-diarization-community-1)
 - **VAD CoreML**: [FluidInference/silero-vad-coreml](https://huggingface.co/FluidInference/silero-vad-coreml)
 - **ASR Models**: [FluidInference/parakeet-tdt-0.6b-v3-coreml](https://huggingface.co/FluidInference/parakeet-tdt-0.6b-v3-coreml)
 - **Test Data**: [alexwengg/musan_mini*](https://huggingface.co/datasets/alexwengg) variants
diff --git a/Documentation/Benchmarks.md b/Documentation/Benchmarks.md
index b6d06aebd..06441525c 100644
--- a/Documentation/Benchmarks.md
+++ b/Documentation/Benchmarks.md
@@ -460,7 +460,7 @@ swift run -c release fluidaudiocli nemotron-benchmark --chunk 560
 
 ## Speaker Diarization
 
-The offline version uses the community-1 model, the online version uses the legacy speaker-diarization-3.1 model.
+Both offline and online versions use the community-1 model (via FluidInference/speaker-diarization-coreml).
 
 ### Offline diarization pipeline
 
diff --git a/Sources/FluidAudio/Diarizer/Segmentation/SegmentationProcessor.swift b/Sources/FluidAudio/Diarizer/Segmentation/SegmentationProcessor.swift
index 348c3eb28..4a6909a94 100644
--- a/Sources/FluidAudio/Diarizer/Segmentation/SegmentationProcessor.swift
+++ b/Sources/FluidAudio/Diarizer/Segmentation/SegmentationProcessor.swift
@@ -224,7 +224,7 @@ public struct SegmentationProcessor {
     func createSlidingWindowFeature(
         binarizedSegments: [[[Float]]], chunkOffset: Double = 0.0
     ) -> SlidingWindowFeature {
-        // These values come from the pyannote/speaker-diarization-3.1 model configuration
+        // These values come from the pyannote/speaker-diarization-community-1 model configuration
         let slidingWindow = SlidingWindow(
             start: chunkOffset,
             duration: 0.0619375,  // 991 samples at 16kHz (model's sliding window duration)

From dda36e869b0515a61b5562ea0e954b776276d76a Mon Sep 17 00:00:00 2001
From: Alex-Wengg
Date: Sat, 11 Apr 2026 11:12:34 -0400
Subject: [PATCH 2/2] docs: Clarify diarization pipeline version differences

Distinguish between online and offline diarization pipelines:

- Online/streaming (DiarizerManager): Pyannote 3.1
- Offline batch (OfflineDiarizerManager): Pyannote Community-1

Updated documentation in:

- CLAUDE.md Model Sources section
- README.md Streaming/Online Speaker Diarization section
- Documentation/Models.md Diarization Models table
- Documentation/Diarization/GettingStarted.md WeSpeaker/Pyannote Streaming section

Addresses feedback from PR #6 review comment:
https://github.com/FluidInference/docs.fluidinference.com/pull/6#discussion_r3068126335
---
 CLAUDE.md                                   | 4 +++-
 Documentation/Diarization/GettingStarted.md | 2 +-
 Documentation/Models.md                     | 2 +-
 README.md                                   | 2 +-
 4 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index 4de278c47..1c71a10a9 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -180,7 +180,9 @@ GitHub Actions workflows:
 
 ## Model Sources
 
-- **Diarization**: [FluidInference/speaker-diarization-coreml](https://huggingface.co/FluidInference/speaker-diarization-coreml) (based on pyannote/speaker-diarization-community-1)
+- **Diarization**:
+  - Online/Streaming (DiarizerManager): [FluidInference/speaker-diarization-coreml](https://huggingface.co/FluidInference/speaker-diarization-coreml) (based on pyannote/speaker-diarization-3.1)
+  - Offline Batch (OfflineDiarizerManager): [FluidInference/speaker-diarization-coreml](https://huggingface.co/FluidInference/speaker-diarization-coreml) (based on pyannote/speaker-diarization-community-1)
 - **VAD CoreML**: [FluidInference/silero-vad-coreml](https://huggingface.co/FluidInference/silero-vad-coreml)
 - **ASR Models**: [FluidInference/parakeet-tdt-0.6b-v3-coreml](https://huggingface.co/FluidInference/parakeet-tdt-0.6b-v3-coreml)
 - **Test Data**: [alexwengg/musan_mini*](https://huggingface.co/datasets/alexwengg) variants
diff --git a/Documentation/Diarization/GettingStarted.md b/Documentation/Diarization/GettingStarted.md
index 0ab0d0cda..b67d3a5a7 100644
--- a/Documentation/Diarization/GettingStarted.md
+++ b/Documentation/Diarization/GettingStarted.md
@@ -340,7 +340,7 @@ Notes:
 
 ### WeSpeaker/Pyannote Streaming
 
-Use `DiarizerManager` when you need the classic segmentation + embedding + speaker-database pipeline. This is the slowest streaming option and works best with larger chunks.
+Pyannote 3.1 pipeline for online/streaming use. Use `DiarizerManager` when you need the classic segmentation + embedding + speaker-database pipeline. This is the slowest streaming option and works best with larger chunks.
 
 Process audio in chunks for real-time applications:
 
diff --git a/Documentation/Models.md b/Documentation/Models.md
index 75eb541b8..f87f95ad3 100644
--- a/Documentation/Models.md
+++ b/Documentation/Models.md
@@ -43,7 +43,7 @@ TDT models process audio in chunks (~15s with overlap) as batch operations.
 |-------|-------------|---------|
 | **LS-EEND** | Research prototype end-to-end streaming diarization model from Westlake University. Supports both streaming and complete-buffer inference for up to 10 speakers. Uses frame-in, frame-out processing, requiring 900ms of warmup audio and 100ms per update. | Added after Sortformer to support largers speaker counts. |
 | **Sortformer** | NVIDIA's enterprise-grade end-to-end streaming diarization model. Supports both streaming and complete-buffer inference for up to 4 speakers. More stable than LS-EEND, but sometimes misses speech. Processes audio in chunks, requiring 1040ms of warmup audio and 480ms per update for the low latency versions. | Added after Pyannote to support low-latency streaming diarization. |
-| **Pyannote CoreML Pipeline** | Speaker diarization. Segmentation model + WeSpeaker embeddings for clustering. Best offline diarization pipeline, but also support online use | First diarizer model added. Converted from Pyannote with custom made batching mode |
+| **Pyannote CoreML Pipeline** | Speaker diarization. Segmentation model + WeSpeaker embeddings for clustering. Online/streaming pipeline (DiarizerManager) based on pyannote/speaker-diarization-3.1. Offline batch pipeline (OfflineDiarizerManager) based on pyannote/speaker-diarization-community-1. | First diarizer model added. Converted from Pyannote with custom made batching mode |
 
 ## TTS Models
 
diff --git a/README.md b/README.md
index 280d1d709..5fa0bd775 100644
--- a/README.md
+++ b/README.md
@@ -372,7 +372,7 @@ Both LS-EEND and Sortformer emit results into a `DiarizerTimeline` with ultra-lo
 
 ### Streaming/Online Speaker Diarization (Pyannote)
 
-This pipeline uses segmentation plus speaker embeddings and is the third choice behind LS-EEND and Sortformer. It can be useful if you specifically want the classic multi-stage pipeline, but it is much slower than LS-EEND or Sortformer for live diarization.
+Pyannote 3.1 pipeline (segmentation + WeSpeaker embeddings) for online/streaming diarization. This is the third choice behind LS-EEND and Sortformer. It can be useful if you specifically want the classic multi-stage pipeline, but it is much slower than LS-EEND or Sortformer for live diarization.
 
 Why use the WeSpeaker/Pyannote pipeline:
 - More modular pipeline if you want separate segmentation and embedding stages
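A note on the SegmentationProcessor.swift hunk in patch 1/2: the `duration: 0.0619375` constant it touches can be sanity-checked independently of the model. The sketch below is hypothetical standalone Swift (the `SlidingWindowSketch` type is invented for illustration and is not the library's actual `SlidingWindow` API); it only reproduces the arithmetic behind the "991 samples at 16kHz" comment in the diff.

```swift
import Foundation

// Hypothetical stand-in for the library's SlidingWindow (illustration only).
struct SlidingWindowSketch {
    let start: Double     // chunk offset, in seconds
    let duration: Double  // per-frame window length, in seconds
}

// The segmentation model emits one output frame per 991 input samples at 16 kHz,
// which is where the 0.0619375 s duration in the patched code comes from.
let samplesPerFrame = 991.0
let sampleRate = 16_000.0
let frameDuration = samplesPerFrame / sampleRate

let window = SlidingWindowSketch(start: 0.0, duration: frameDuration)
// frameDuration == 991 / 16000 == 0.0619375 s, matching the constant in the hunk.
assert(abs(window.duration - 0.0619375) < 1e-12)
```

This confirms the diff's inline comment is self-consistent: only the model-name reference changed between 3.1 and community-1, not the windowing constants themselves.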