N.B. This repo exists for me to share a tool I've built for myself with some colleagues. I'd highly recommend tweaking its functionality to make it as "low-friction" for you to use as possible, as your workflow may be different to mine. I just wanted to make notes with my voice without an external cloud-based service reading my thoughts and costing me money.
A terminal voice notes tool, which transcribes on your local hardware! Speak, stop, get a tagged text file. Powered by faster-whisper.
record --liveupdate
- Records from your microphone and transcribes with Whisper
- Files notes automatically by a spoken or typed name
- In --liveupdate mode, transcribes chunk-by-chunk as you speak, which is useful for live context with AI assistants
- Detects silence to avoid sending empty audio to Whisper
- Handles continuous speech by flushing every 30 seconds
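The silence detection and 30-second flush described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the function names, the `silence_blocks` parameter and the block-based input are all assumptions. Only the 0.1 threshold and the 30-second limit come from this README.

```python
import math
from typing import Iterable, Iterator, List, Sequence


def rms(block: Sequence[float]) -> float:
    """Root-mean-square level of one block of float samples."""
    if not block:
        return 0.0
    return math.sqrt(sum(x * x for x in block) / len(block))


def segment_blocks(
    blocks: Iterable[Sequence[float]],
    block_seconds: float,
    silence_threshold: float = 0.1,   # README default for --silence
    max_seconds: float = 30.0,        # README: flush every 30 seconds
    silence_blocks: int = 2,          # hypothetical: quiet blocks that count as a pause
) -> Iterator[List[Sequence[float]]]:
    """Yield segments of audio blocks that are worth transcribing.

    A segment starts at the first block louder than the threshold and is
    flushed after a pause or once it reaches max_seconds. Pure silence is
    never yielded, so nothing empty would be sent to Whisper.
    """
    buffer: List[Sequence[float]] = []
    has_speech = False
    quiet_run = 0
    for block in blocks:
        if rms(block) >= silence_threshold:
            has_speech = True
            quiet_run = 0
        else:
            quiet_run += 1
        if has_speech:
            buffer.append(block)
        paused = has_speech and quiet_run >= silence_blocks
        too_long = len(buffer) * block_seconds >= max_seconds
        if buffer and (paused or too_long):
            yield buffer
            buffer, has_speech, quiet_run = [], False, 0
    if buffer:
        # Flush whatever speech is left when recording stops.
        yield buffer
```

Each yielded segment would then be handed to the transcriber. Continuous speech with no pauses still produces a segment every 30 seconds.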
System:
sudo apt-get install libportaudio2 portaudio19-dev ffmpeg
ffmpeg is only needed if you use --savemp3.
Python: 3.9+
I'd recommend installing with pipx so this command is global.
Clone and install with the -e (editable) flag:
git clone https://github.com/tobyLovick/recordCLI
cd recordCLI
pipx install -e .
Use -e. You'll almost certainly want to tweak defaults (silence threshold, model size, audio device), and editable mode means any edit to the source takes effect immediately without reinstalling.
Whisper model weights download automatically on first use and cache in ~/.cache/huggingface/.
# record, transcribe on stop (medium model, best accuracy)
record
# live transcription as you speak (small model, low latency)
record --liveupdate
# name the note up front
record --name chapter-3-notes
# keep the audio file
record --savemp3
# continue an existing note (arrow-key picker)
record --continue
# continue a specific file
record --continue notes/my-topic/2026-04-24_my-topic.txt
# choose model explicitly
record --model large
# list audio devices if the default doesn't work
record --listdevices
record --device 4
Say anywhere in your recording:
"name this note chapter three end name"
or
"this should be called kepler orbits end name"
The note gets filed under ~/recordCLI/notes/chapter-three/. If no name is found, it goes to ~/recordCLI/notes/untagged/.
The --name flag overrides any spoken name.
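One way the spoken-name rules above could be implemented is sketched below. The regular expressions, function names and slug format are assumptions for illustration. The README only shows the two trigger phrases, the "end name" terminator, the chapter-three folder and the untagged fallback.

```python
import re
from typing import Optional

# Tolerate punctuation Whisper may insert, e.g. "Name this note, chapter three. End name."
NAME_PATTERNS = [
    re.compile(r"name this note\W+(.+?)\W+end name", re.IGNORECASE),
    re.compile(r"this should be called\W+(.+?)\W+end name", re.IGNORECASE),
]


def slugify(name: str) -> str:
    """Lowercase the name and join its words with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def extract_note_name(transcript: str) -> Optional[str]:
    """Return the slug of the first spoken name found, or None."""
    for pattern in NAME_PATTERNS:
        match = pattern.search(transcript)
        if match:
            slug = slugify(match.group(1))
            if slug:
                return slug
    return None


def note_directory(transcript: str, cli_name: Optional[str] = None,
                   root: str = "~/recordCLI/notes") -> str:
    """Pick the folder for a note; a --name value wins over a spoken name."""
    name = slugify(cli_name) if cli_name else extract_note_name(transcript)
    return f"{root}/{name or 'untagged'}"
```

With this sketch, a transcript containing "name this note chapter three end name" maps to ~/recordCLI/notes/chapter-three, and a transcript with no name phrase falls back to ~/recordCLI/notes/untagged.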
| | record | record --liveupdate |
|---|---|---|
| Model default | medium | small |
| Transcription | on silence / every 30s | on silence / every 30s |
| VAD filter | yes | yes |
| Beam size | 5 | 1 |
| Best for | longer notes, accuracy | live context for porting into a REPL |
In --liveupdate mode, transcription is written to ~/recordCLI/notes/current.txt in real time. A Claude Code hook can inject new content into your conversation automatically whenever you send a message.
Add to ~/.claude/settings.json:
{
"hooks": {
"UserPromptSubmit": [
{
"hooks": [
{
"type": "command",
"command": "/path/to/recordCLI/transcript-hook.sh",
"timeout": 5
}
]
}
]
}
}
Then run record --liveupdate in one terminal and Claude Code in another. Claude sees what you say as you say it, without you having to paste anything.
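The contents of transcript-hook.sh aren't reproduced here, so the following is only a sketch of what such a hook might do, written in Python for illustration. It remembers how far into current.txt it has already read and prints only what is new. As far as I understand Claude Code's hooks, whatever a UserPromptSubmit command prints to stdout gets added to the conversation context. The offset file name and the function name are hypothetical.

```python
from pathlib import Path


def read_new_text(transcript: Path, offset_file: Path) -> str:
    """Return transcript text added since the previous call."""
    if not transcript.exists():
        return ""
    data = transcript.read_bytes()
    try:
        offset = int(offset_file.read_text())
    except (FileNotFoundError, ValueError):
        offset = 0
    if offset > len(data):
        # The transcript was restarted or truncated, so read from the top.
        offset = 0
    new_bytes = data[offset:]
    # Remember how much has been consumed for the next invocation.
    offset_file.write_text(str(len(data)))
    return new_bytes.decode("utf-8", errors="replace").strip()


if __name__ == "__main__":
    notes = Path.home() / "recordCLI" / "notes"
    # ".current.offset" is a hypothetical bookkeeping file, not part of the repo.
    text = read_new_text(notes / "current.txt", notes / ".current.offset")
    if text:
        print("Live voice transcript since last message:\n" + text)
```

Tracking a byte offset keeps each message's injected context small, since only the speech since your last prompt is sent.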
If you record on a mobile app that syncs to Drive (e.g. Easy Voice Recorder), you can pull recordings in automatically:
- Enable the Google Drive API in Google Cloud Console and download OAuth2 credentials
- Write a short script using google-api-python-client to list and download new files from your Drive folder
- Pass each downloaded file to transcriber.transcribe_file(model, path) and save the result with filer.save_transcript()
The transcriber and filer modules are designed to work standalone — no recording hardware needed.
On my local branch I have record --import set to this functionality, but you're not having my API key!
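The download step could look something like the sketch below. It assumes you have already built a Drive v3 `service` object with google-api-python-client (for example via `googleapiclient.discovery.build("drive", "v3", credentials=creds)`). The function name, the `seen` set and the single-page listing are simplifications of my own, not part of this repo.

```python
from pathlib import Path
from typing import List, Set


def pull_new_recordings(service, folder_id: str, dest: Path, seen: Set[str]) -> List[Path]:
    """Download files from a Drive folder that have not been fetched before.

    `service` is a Drive v3 client; `seen` holds the IDs of files already
    downloaded and is updated in place.
    """
    dest.mkdir(parents=True, exist_ok=True)
    # Only the first page of results is read; a large folder would need
    # to follow nextPageToken as well.
    listing = service.files().list(
        q=f"'{folder_id}' in parents and trashed = false",
        fields="files(id, name)",
    ).execute()
    downloaded: List[Path] = []
    for item in listing.get("files", []):
        if item["id"] in seen:
            continue
        # For media requests, execute() returns the raw file bytes.
        content = service.files().get_media(fileId=item["id"]).execute()
        path = dest / item["name"]
        path.write_bytes(content)
        seen.add(item["id"])
        downloaded.append(path)
    return downloaded
```

Each returned path can then go to transcriber.transcribe_file(model, path). The arguments of filer.save_transcript() aren't documented in this README, so check the module before wiring that part up.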
Background noise varies by environment. The default threshold (--silence 0.1) works for a quiet room. If transcription triggers on background noise, raise it; if speech isn't being detected, lower it.
Test your noise floor:
python3 -c "
import sounddevice as sd, numpy as np
audio = sd.rec(int(3*16000), samplerate=16000, channels=1, dtype='float32')
sd.wait()
print(f'Noise RMS: {float(np.sqrt(np.mean(audio**2))):.4f}')
"
Set --silence to roughly 3× your noise RMS.