Skip to content

docs(voice): terminal shortcut matrix + STT helper diagnostics (#2116)#2151

Closed
donglovejava wants to merge 1 commit into
Hmbown:work/v0.8.45-flashfrom
donglovejava:feat/voice-stt-diagnostics
Closed

docs(voice): terminal shortcut matrix + STT helper diagnostics (#2116)#2151
donglovejava wants to merge 1 commit into
Hmbown:work/v0.8.45-flashfrom
donglovejava:feat/voice-stt-diagnostics

Conversation

@donglovejava
Copy link
Copy Markdown

@donglovejava donglovejava commented May 26, 2026

Summary

Validate terminal-safe voice shortcut and STT helper setup for the voice input feature shipping in v0.8.45/v0.8.46.

Completed

  • voice_input.rs: Added diagnose_voice_setup() async function that checks:
    • Missing/invalid voice_input_command config
    • Inexecutable binary (spawns --version as probe)
    • Permission/spawn failures with clear error messages
  • voice_input.rs: Added terminal_detection_tests module with:
    • chord_likely_reaches_tui() heuristic function documenting known-consumed chords
    • Tests for common chords (Ctrl-K consumed, Ctrl-L/Ctrl-C safe)
    • Candidate shortcut test matrix (F2, F3, Alt-Space, Ctrl-] etc.)
  • docs/VOICE_INPUT_TERMINALS.md: Terminal compatibility matrix documenting Ctrl-K behavior across 13 terminal emulators + STT helper setup checklist + recommended safe chords
  • docs/KEYBINDINGS.md: Added Voice input section documenting the Ctrl-K caveat with cross-reference to the new terminal matrix doc

Not completed (needs manual verification)

  • Actual terminal testing on macOS Terminal.app, iTerm2, Ghostty, Warp, Windows Terminal, Linux terminals
  • Final default chord selection — the matrix lists candidates but the final binding needs human verification
  • Manual QA checklist for terminals that consume modifier keys
  • Compilation verification — changes made to a feature branch (work/v0.8.45-flash) without Rust CI available

Files changed

crates/tui/src/tui/voice_input.rs  | 110 +++++++++++++++++
docs/KEYBINDINGS.md                |  22 ++++
docs/VOICE_INPUT_TERMINALS.md      |  87 +++++++++++++
3 files changed, 219 insertions(+)

Greptile Summary

This PR adds voice-input diagnostics and terminal-shortcut documentation for the v0.8.45 voice feature. No existing behaviour is changed — all new code is either async diagnostic helpers or #[cfg(test)]-gated functions.

  • voice_input.rs: Adds diagnose_voice_setup() (probes the configured STT binary) and chord_likely_reaches_tui() with a shortcut-candidate test matrix; Ctrl-T appears in the consumed-chord array but is omitted from the doc comment, and the --version probe can emit false-positive warnings for helpers that don't implement that flag.
  • docs/VOICE_INPUT_TERMINALS.md: New terminal compatibility matrix across 13 emulators with an STT setup checklist; checklist item 2's heading ("Non-zero exit") is worded opposite to its body text.
  • docs/KEYBINDINGS.md: Adds a Voice input section explaining the Ctrl+K caveat and cross-referencing the new matrix doc; content is accurate.

Confidence Score: 3/5

Safe to merge if the two defects in voice_input.rs are fixed first — neither affects production paths today, but both would surface as user-visible problems once diagnose_voice_setup is called.

The doc comment for chord_likely_reaches_tui omits Ctrl-T while the implementation treats it as consumed, creating a silent contradiction that the test matrix does not catch. Separately, the --version probe will push a misleading warning for any STT helper that exits non-zero on that flag (a common case for tools like whisper), causing correctly configured setups to appear broken. Both defects are in the new code added by this PR and would affect real users as soon as the diagnostic surface is exposed.

crates/tui/src/tui/voice_input.rs — the consumed-chord list mismatch and the --version probe logic both need attention before this ships.

Important Files Changed

Filename Overview
crates/tui/src/tui/voice_input.rs Adds diagnose_voice_setup async diagnostic and chord_likely_reaches_tui test helper; doc comment for the consumed-chord list omits Ctrl-T which is present in the implementation, and the --version probe produces false-positive warnings for STT helpers that don't support that flag.
docs/VOICE_INPUT_TERMINALS.md New terminal compatibility matrix for Ctrl+K across 13 emulators; STT setup checklist item 2 heading "Non-zero exit" is misleadingly worded relative to its body text.
docs/KEYBINDINGS.md Adds a Voice input section explaining the Ctrl+K terminal caveat and linking to the new matrix doc; content is accurate and the env var reference (DEEPSEEK_VOICE_INPUT_COMMAND) matches the actual setting in settings.rs.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[diagnose_voice_setup called] --> B{command_line empty?}
    B -- Yes --> C[push: voice_input_command not configured]
    C --> Z[return issues]
    B -- No --> D[parse_voice_command]
    D -- parse error --> E[push: Failed to parse voice command]
    E --> Z
    D -- Ok: program, _args --> F[spawn: program --version]
    F -- Err: spawn failed --> G[push: Cannot spawn program]
    G --> Z
    F -- Ok: status != 0 --> H[push: returned code N on --version ⚠️ false-positive risk]
    H --> Z
    F -- Ok: status == 0 --> I[no issues]
    I --> Z
Loading

Fix All in Codex Fix All in Claude Code Fix All in Cursor

Reviews (1): Last reviewed commit: "docs(voice): terminal shortcut matrix + ..." | Re-trigger Greptile

Greptile also left 5 inline comments on this PR.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces voice input setup diagnostics and comprehensive documentation regarding terminal compatibility and key chord safety. The reviewer's feedback suggests utilizing the unused _cwd parameter in diagnose_voice_setup to set the current working directory for the probe command, ensuring relative paths or workspace-local executables are correctly resolved.

Comment on lines +96 to +99
pub(crate) async fn diagnose_voice_setup(
command_line: &str,
_cwd: &Path,
) -> Vec<String> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The _cwd parameter is currently unused. To ensure that diagnostic checks are run in the correct context (especially when the voice command uses relative paths or workspace-specific scripts), we should use cwd and set it as the current directory for the probe command.

Suggested change
pub(crate) async fn diagnose_voice_setup(
command_line: &str,
_cwd: &Path,
) -> Vec<String> {
pub(crate) async fn diagnose_voice_setup(
command_line: &str,
cwd: &Path,
) -> Vec<String> {

Comment on lines +117 to +122
let probe = tokio::process::Command::new(&program)
.arg("--version")
.stdout(Stdio::null())
.stderr(Stdio::null())
.status()
.await;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Set the current working directory to cwd for the probe command to ensure relative paths or workspace-local executables are correctly resolved during diagnostics, matching the behavior of run_configured_voice_command.

    let probe = tokio::process::Command::new(&program)
        .current_dir(cwd)
        .arg("--version")
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .await;

Comment on lines +158 to +160
let consumed: &[&str] = &["Ctrl-K", "Ctrl-W", "Ctrl-U", "Ctrl-T", "Ctrl-O", "Ctrl-Y"];
!consumed.contains(&chord)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Ctrl-T in consumed list but absent from doc comment

The doc comment above chord_likely_reaches_tui lists the consumed chords as Ctrl-K, Ctrl-W, Ctrl-U, Ctrl-O, Ctrl-Y — but the implementation also includes "Ctrl-T". Ctrl-T is the readline transpose-chars sequence and is intercepted by many terminals before the TUI sees it. As written, a developer reading the doc comment would believe chord_likely_reaches_tui("Ctrl-T") returns true (safe), when in fact it returns false. The voice_shortcut_candidates test matrix also does not exercise Ctrl-T, so this mismatch is not caught by the tests.

Fix in Codex Fix in Claude Code Fix in Cursor

Comment on lines +117 to +135
let probe = tokio::process::Command::new(&program)
.arg("--version")
.stdout(Stdio::null())
.stderr(Stdio::null())
.status()
.await;
match probe {
Ok(status) if !status.success() => {
issues.push(format!(
"Executable `{program}` exists but returned code {status} on --version. It may need a different invocation."
));
}
Err(e) => {
issues.push(format!(
"Cannot spawn `{program}`: {e}. Check that it is installed and on PATH."
));
}
_ => {}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 --version probe produces false-positive warnings for valid STT helpers

Many common STT helpers (e.g., whisper, openai-whisper, whisper.cpp binaries) do not implement --version and exit with a non-zero code when called with that flag. The Ok(status) if !status.success() branch will fire and push "It may need a different invocation" even when the binary is correctly installed and fully functional. Users who configure a working tool would receive a misleading diagnostic telling them something is wrong. A more reliable probe would use which/std::env::var("PATH")-based lookup, or simply attempt Command::new(&program).arg("--help").status() and treat an Err (spawn failure) as the only definitive "not found" signal.

Fix in Codex Fix in Claude Code Fix in Cursor

Comment on lines +144 to +149
/// Chords consumed by the terminal (before the TUI sees them):
/// Ctrl-K — kill-line (macOS Terminal, iTerm2, Windows Terminal, tmux)
/// Ctrl-W — unix-word-rubout
/// Ctrl-U — kill-line-from-start
/// Ctrl-O — operated-on / execute
/// Ctrl-Y — yank
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 The doc comment's consumed-chord list omits Ctrl-T, creating a contradiction with the implementation. Add it so the documentation matches the code.

Suggested change
/// Chords consumed by the terminal (before the TUI sees them):
/// Ctrl-K — kill-line (macOS Terminal, iTerm2, Windows Terminal, tmux)
/// Ctrl-W — unix-word-rubout
/// Ctrl-U — kill-line-from-start
/// Ctrl-O — operated-on / execute
/// Ctrl-Y — yank
/// Chords consumed by the terminal (before the TUI sees them):
/// Ctrl-K — kill-line (macOS Terminal, iTerm2, Windows Terminal, tmux)
/// Ctrl-W — unix-word-rubout
/// Ctrl-U — kill-line-from-start
/// Ctrl-T — transpose-chars
/// Ctrl-O — operated-on / execute
/// Ctrl-Y — yank

Fix in Codex Fix in Claude Code Fix in Cursor

Comment on lines +188 to +191
("Ctrl-K", false),
("Ctrl-W", false),
("Ctrl-U", false),
("Ctrl-O", false),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 The voice_shortcut_candidates test does not exercise Ctrl-T, so the discrepancy between the doc comment and the consumed list goes undetected. Adding it prevents future regressions.

Suggested change
("Ctrl-K", false),
("Ctrl-W", false),
("Ctrl-U", false),
("Ctrl-O", false),
("Ctrl-K", false),
("Ctrl-W", false),
("Ctrl-U", false),
("Ctrl-T", false),
("Ctrl-O", false),

Fix in Codex Fix in Claude Code Fix in Cursor

Comment on lines +54 to +55
2. **Non-zero exit** — the helper exits 0 on success. Non-zero indicates
recording failure (mic permission, network error, unsupported codec).
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Misleading "Non-zero exit" heading in STT setup checklist

The heading "Non-zero exit" implies this item is about what to do when non-zero exit occurs, but the body text says "the helper exits 0 on success" — which is the normal/expected case. As written, readers may interpret this item as documenting that non-zero exit is the success condition. A clearer heading like "Exit code must be 0 on success" would remove the ambiguity.

Fix in Codex Fix in Claude Code Fix in Cursor

@Hmbown Hmbown deleted the branch Hmbown:work/v0.8.45-flash May 26, 2026 12:55
@Hmbown Hmbown closed this May 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants