whisper: add carry_initial_prompt to maintain context over sliding wi… #1414

Open
Sahith59 wants to merge 2 commits into ml-explore:main from Sahith59:feature/whisper-carry-prompt

Sahith59 commented Apr 8, 2026

Description

This PR fixes an issue where the context supplied by the user's initial_prompt is lost or truncated as the sliding context window advances over long audio files.

Changes

  • Adds a boolean keyword argument, carry_initial_prompt, to .transcribe().
  • Modifies the prompt-construction block in transcribe.py: when the flag is enabled, initial_prompt_tokens are prepended to the prompt for each window.
  • Derives the window boundary from model.dims.n_text_ctx and slices previous_tokens to fit, so the prompt never exceeds the text context and MLX evaluation does not hit dimension-mismatch errors.
  • Adds MLX unit tests to test.py that cover context-window bounds and prompt propagation.
  • The default (carry_initial_prompt=False) leaves existing behavior unchanged, maintaining backward compatibility with the PyTorch implementation.
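
For readers who want a concrete picture of the slicing described above, here is a minimal sketch in plain Python. It is not the code from this PR: `build_prompt` is a hypothetical helper, and the budget of `n_text_ctx // 2 - 1` prompt tokens is an assumption based on how Whisper's decoder usually caps its prompt. Only `carry_initial_prompt`, `initial_prompt_tokens`, `previous_tokens`, and `n_text_ctx` are names taken from the description.

```python
from typing import List


def build_prompt(
    initial_prompt_tokens: List[int],
    previous_tokens: List[int],
    n_text_ctx: int,
    carry_initial_prompt: bool = False,
) -> List[int]:
    """Hypothetical helper: choose the prompt tokens for the next window."""
    # Assumption: at most n_text_ctx // 2 - 1 prompt tokens are kept, leaving
    # the rest of the text context for newly decoded tokens.
    max_prompt_len = n_text_ctx // 2 - 1
    if max_prompt_len <= 0:
        return []

    if not carry_initial_prompt:
        # Default path: keep only the most recent tokens, so the initial
        # prompt drops out once enough text has been decoded.
        history = initial_prompt_tokens + previous_tokens
        return history[-max_prompt_len:]

    # Carry path: reserve room for the initial prompt first, then fill the
    # remaining budget with the most recent previously decoded tokens.
    head = initial_prompt_tokens[:max_prompt_len]
    remaining = max_prompt_len - len(head)
    tail = previous_tokens[-remaining:] if remaining > 0 else []
    return head + tail


if __name__ == "__main__":
    initial = [1, 2, 3]                # stand-in token ids for the prompt
    previous = list(range(100, 120))   # stand-in ids for decoded text
    print(build_prompt(initial, previous, n_text_ctx=16))
    print(build_prompt(initial, previous, n_text_ctx=16, carry_initial_prompt=True))
```

With the toy values above the prompt budget is 7 tokens. The default call returns only recent tokens, while the carry call keeps the three initial tokens at the front and fills the remaining four slots from the end of `previous_tokens`. In both cases the result stays within the budget, which is the property the bounds tests are meant to check.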

Testing

Ran the local model conversion in setUpClass and verified the prompt propagation logic in the test suite. The code is formatted with black and passes pre-commit.

Resolves: #1410
