feat(cohere): Add Cohere Transcribe CoreML conversion with critical fixes #41
Draft
Alex-Wengg wants to merge 43 commits into main from docs/cohere-transcribe-coreml-decoder-fix
Commits
d654cb9
fix(cohere): Implement stateless decoder to fix cache repetition bug
Alex-Wengg a5d1fc0
docs(cohere): Update README with stateless decoder status and complet…
Alex-Wengg 079ec18
chore(cohere): Remove broken/experimental export scripts
Alex-Wengg 9fe6808
chore(cohere): Delete archive and mark broken HF uploads
Alex-Wengg 2e96dce
refactor(cohere): Clean up test suite and add PyTorch baseline
Alex-Wengg 2994397
chore(cohere): Delete redundant utility scripts
Alex-Wengg 9aad907
feat(cohere): implement stateful decoder with Qwen3 approach
Alex-Wengg 068c718
test(cohere): add comprehensive benchmarks for stateful decoder
Alex-Wengg 3d096ef
chore: remove outdated debug scripts, logs, and reference code
Alex-Wengg 1ae8422
feat(cohere): add 256-token decoder and investigation scripts
Alex-Wengg 98ea02b
docs(cohere): Identify encoder as root cause of quality issues
Alex-Wengg 947058a
docs(cohere): Complete root cause analysis - encoder training data bias
Alex-Wengg 3329c99
fix(cohere): Correct audio window to 35 seconds (3500 frames)
Alex-Wengg c7a4db8
docs(cohere): Document .mlpackage requirement and .mlmodelc limitations
Alex-Wengg 8f0bf24
docs(cohere): Update README with current status and .mlpackage requir…
Alex-Wengg bb89d2d
chore(cohere): Clean up obsolete files and failed experiments
Alex-Wengg b008c99
chore(cohere): Remove remaining obsolete files
Alex-Wengg 6eba9b1
chore(cohere): Remove temporary upload docs and obsolete tests
Alex-Wengg ca4aab1
chore(cohere): Remove obsolete build artifacts and test files
Alex-Wengg cfb3ecb
refactor(cohere): Organize original PyTorch files into cohere-pytorch…
Alex-Wengg 13f9535
docs(cohere): Add historical context and verified performance results
Alex-Wengg 36835ed
feat(cohere): Add INT8 quantized models and benchmarks
Alex-Wengg 4cbd37d
refactor(cohere): Reorganize scripts and create unified benchmark tool
Alex-Wengg fcc47a2
refactor(cohere): Use jiwer library for text normalization
Alex-Wengg 0790b6c
fix(cohere): Use google/fleurs dataset with correct field names
Alex-Wengg fc3c20b
refactor(cohere): Organize test files and scripts
Alex-Wengg c7e0b11
docs(cohere): Add comprehensive research analysis and limitations
Alex-Wengg e56f48d
feat(cohere): Add stateless decoder variant (Parakeet approach)
Alex-Wengg 7c088a3
docs(cohere): Add FP16 vs INT8 FLEURS comparison analysis
Alex-Wengg e9f9973
feat(cohere): Add INT4 quantization experiments and comprehensive res…
Alex-Wengg 887b22b
fix(cohere): Address critical Devin review issues
Alex-Wengg 395e48a
fix(cohere): Fix test file issues from Devin review
Alex-Wengg f81dfb7
fix(cohere): Fix stateful decoder export issues from Devin review
Alex-Wengg 8c95861
fix(cohere): Commit uv.lock for reproducibility
Alex-Wengg 1edbc01
chore(cohere): Add test results and cache to gitignore
Alex-Wengg 306a283
refactor(cohere): Centralize test scripts into tests/ directory
Alex-Wengg 6209f8a
refactor(cohere): Move benchmark scripts to tests/ directory
Alex-Wengg 5d12a80
Fix EOS token detection in cache-external decoder
Alex-Wengg e007570
Verify .mlmodelc compilation for Swift integration
Alex-Wengg 049382a
docs: Add sync status for mobius → FluidAudio updates
Alex-Wengg 073d7a2
docs: Document Swift benchmark attempt and model compatibility issues
Alex-Wengg 34a6bfb
prep: HuggingFace upload package for cache-external decoder
Alex-Wengg e9286d1
research: Comprehensive investigation of Cohere multilingual ASR failure
Alex-Wengg
99 changes: 99 additions & 0 deletions
models/stt/cohere-transcribe-03-2026/cohere-pytorch/.eval_results/open_asr_leaderboard.yaml
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,99 @@ | ||
| - dataset: | ||
| id: hf-audio/open-asr-leaderboard | ||
| task_id: mean_wer | ||
| value: 5.42 | ||
| date: '2026-03-24' | ||
| source: | ||
| url: https://huggingface.co/hf-audio | ||
| name: open-asr-leaderboard | ||
| user: hf-audio | ||
| | ||
| - dataset: | ||
| id: hf-audio/open-asr-leaderboard | ||
| task_id: rtfx | ||
| value: 524.88 | ||
| date: '2026-03-24' | ||
| source: | ||
| url: https://huggingface.co/hf-audio | ||
| name: open-asr-leaderboard | ||
| user: hf-audio | ||
| | ||
| - dataset: | ||
| id: hf-audio/open-asr-leaderboard | ||
| task_id: ami_wer | ||
| value: 8.13 | ||
| date: '2026-03-24' | ||
| source: | ||
| url: https://huggingface.co/hf-audio | ||
| name: open-asr-leaderboard | ||
| user: hf-audio | ||
| | ||
| - dataset: | ||
| id: hf-audio/open-asr-leaderboard | ||
| task_id: earnings22_wer | ||
| value: 10.86 | ||
| date: '2026-03-24' | ||
| source: | ||
| url: https://huggingface.co/hf-audio | ||
| name: open-asr-leaderboard | ||
| user: hf-audio | ||
| | ||
| - dataset: | ||
| id: hf-audio/open-asr-leaderboard | ||
| task_id: gigaspeech_wer | ||
| value: 9.34 | ||
| date: '2026-03-24' | ||
| source: | ||
| url: https://huggingface.co/hf-audio | ||
| name: open-asr-leaderboard | ||
| user: hf-audio | ||
| | ||
| - dataset: | ||
| id: hf-audio/open-asr-leaderboard | ||
| task_id: librispeech_clean_wer | ||
| value: 1.25 | ||
| date: '2026-03-24' | ||
| source: | ||
| url: https://huggingface.co/hf-audio | ||
| name: open-asr-leaderboard | ||
| user: hf-audio | ||
| | ||
| - dataset: | ||
| id: hf-audio/open-asr-leaderboard | ||
| task_id: librispeech_other_wer | ||
| value: 2.37 | ||
| date: '2026-03-24' | ||
| source: | ||
| url: https://huggingface.co/hf-audio | ||
| name: open-asr-leaderboard | ||
| user: hf-audio | ||
| | ||
| - dataset: | ||
| id: hf-audio/open-asr-leaderboard | ||
| task_id: spgispeech_wer | ||
| value: 3.08 | ||
| date: '2026-03-24' | ||
| source: | ||
| url: https://huggingface.co/hf-audio | ||
| name: open-asr-leaderboard | ||
| user: hf-audio | ||
| | ||
| - dataset: | ||
| id: hf-audio/open-asr-leaderboard | ||
| task_id: tedlium_wer | ||
| value: 2.49 | ||
| date: '2026-03-24' | ||
| source: | ||
| url: https://huggingface.co/hf-audio | ||
| name: open-asr-leaderboard | ||
| user: hf-audio | ||
| | ||
| - dataset: | ||
| id: hf-audio/open-asr-leaderboard | ||
| task_id: voxpopuli_wer | ||
| value: 5.87 | ||
| date: '2026-03-24' | ||
| source: | ||
| url: https://huggingface.co/hf-audio | ||
| name: open-asr-leaderboard | ||
| user: hf-audio |
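
As a quick sanity check on the values in this file, the sketch below recomputes the overall figure from the eight per-dataset WER entries. It assumes `mean_wer` is the unweighted arithmetic mean of those entries; the file itself does not say how the value was derived, so treat this as a consistency check rather than a description of the leaderboard's method.

```python
from statistics import mean

# Per-dataset WER values copied from open_asr_leaderboard.yaml above.
wer_by_task = {
    "ami_wer": 8.13,
    "earnings22_wer": 10.86,
    "gigaspeech_wer": 9.34,
    "librispeech_clean_wer": 1.25,
    "librispeech_other_wer": 2.37,
    "spgispeech_wer": 3.08,
    "tedlium_wer": 2.49,
    "voxpopuli_wer": 5.87,
}

# Assumption: mean_wer is a plain (unweighted) average of the entries above.
average_wer = mean(wer_by_task.values())
print(f"unweighted mean WER: {average_wer:.2f}")  # prints 5.42
```

Under that assumption the result agrees with the `mean_wer` value of 5.42 recorded in the first entry.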
37 changes: 37 additions & 0 deletions
models/stt/cohere-transcribe-03-2026/cohere-pytorch/.gitattributes
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,37 @@ | ||
| *.7z filter=lfs diff=lfs merge=lfs -text | ||
| *.arrow filter=lfs diff=lfs merge=lfs -text | ||
| *.bin filter=lfs diff=lfs merge=lfs -text | ||
| *.bz2 filter=lfs diff=lfs merge=lfs -text | ||
| *.ckpt filter=lfs diff=lfs merge=lfs -text | ||
| *.ftz filter=lfs diff=lfs merge=lfs -text | ||
| *.gz filter=lfs diff=lfs merge=lfs -text | ||
| *.h5 filter=lfs diff=lfs merge=lfs -text | ||
| *.joblib filter=lfs diff=lfs merge=lfs -text | ||
| *.lfs.* filter=lfs diff=lfs merge=lfs -text | ||
| *.mlmodel filter=lfs diff=lfs merge=lfs -text | ||
| *.model filter=lfs diff=lfs merge=lfs -text | ||
| *.msgpack filter=lfs diff=lfs merge=lfs -text | ||
| *.npy filter=lfs diff=lfs merge=lfs -text | ||
| *.npz filter=lfs diff=lfs merge=lfs -text | ||
| *.onnx filter=lfs diff=lfs merge=lfs -text | ||
| *.ot filter=lfs diff=lfs merge=lfs -text | ||
| *.parquet filter=lfs diff=lfs merge=lfs -text | ||
| *.pb filter=lfs diff=lfs merge=lfs -text | ||
| *.pickle filter=lfs diff=lfs merge=lfs -text | ||
| *.pkl filter=lfs diff=lfs merge=lfs -text | ||
| *.pt filter=lfs diff=lfs merge=lfs -text | ||
| *.pth filter=lfs diff=lfs merge=lfs -text | ||
| *.rar filter=lfs diff=lfs merge=lfs -text | ||
| *.safetensors filter=lfs diff=lfs merge=lfs -text | ||
| saved_model/**/* filter=lfs diff=lfs merge=lfs -text | ||
| *.tar.* filter=lfs diff=lfs merge=lfs -text | ||
| *.tar filter=lfs diff=lfs merge=lfs -text | ||
| *.tflite filter=lfs diff=lfs merge=lfs -text | ||
| *.tgz filter=lfs diff=lfs merge=lfs -text | ||
| *.wasm filter=lfs diff=lfs merge=lfs -text | ||
| *.xz filter=lfs diff=lfs merge=lfs -text | ||
| *.zip filter=lfs diff=lfs merge=lfs -text | ||
| *.zst filter=lfs diff=lfs merge=lfs -text | ||
| *tfevents* filter=lfs diff=lfs merge=lfs -text | ||
| *.wav filter=lfs diff=lfs merge=lfs -text | ||
| assets/*.png filter=lfs diff=lfs merge=lfs -text | ||
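
To see which files these rules would route through Git LFS, the following is a rough sketch that reads a few of the lines above and tests paths against them. The helper names `lfs_patterns` and `is_lfs_tracked` are hypothetical, and the matching uses Python's `fnmatch`, which only approximates gitattributes pattern semantics (for example, `*` in `fnmatch` also matches `/`). For an authoritative answer, ask Git itself with `git check-attr filter -- <path>`.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A few rules copied from the .gitattributes diff above.
GITATTRIBUTES = """\
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
*.wav filter=lfs diff=lfs merge=lfs -text
assets/*.png filter=lfs diff=lfs merge=lfs -text
"""


def lfs_patterns(text):
    """Return the path patterns whose attributes include filter=lfs."""
    patterns = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comments
        if "filter=lfs" in fields[1:]:
            patterns.append(fields[0])
    return patterns


def is_lfs_tracked(path, patterns):
    """Approximate check: would this path match one of the LFS patterns?"""
    posix = PurePosixPath(path)
    for pattern in patterns:
        # Patterns without a slash are compared against the file name only;
        # patterns with a slash are compared against the whole relative path.
        target = str(posix) if "/" in pattern else posix.name
        if fnmatchcase(target, pattern):
            return True
    return False


patterns = lfs_patterns(GITATTRIBUTES)
for candidate in ["weights/model.safetensors", "assets/logo.png", "README.md"]:
    print(candidate, is_lfs_tracked(candidate, patterns))
```

A check like this can help spot large binaries that would end up in LFS before they are committed.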
no lfs pls. do not commit here