This repository contains two Python scripts for real-time audio transcription.
- transcript_deepgram.py: Uses the Deepgram cloud API for fast and accurate transcription.
- transcript_local.py: Runs a local Whisper model, optimized for energy efficiency.
Deepgram: offers 200 USD of free usage, and the cost per hour of audio is very low (roughly 0.26 USD per hour).
Local Whisper: a good fit if you do not want to configure an API at all.
transcript_deepgram.py streams microphone audio to Deepgram's API for near real-time transcription.
Setup:
- Install dependencies:
pip install deepgram-sdk sounddevice numpy requests
- Add API Key: Create a file named api_config.json and add your key:
{ "deepgram_api_key": "YOUR_DEEPGRAM_API_KEY_HERE" }
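For reference, here is a minimal sketch of how a script could read the key from api_config.json. The helper name load_deepgram_key and its error messages are hypothetical and are not taken from transcript_deepgram.py. Only the file name and the deepgram_api_key field come from the setup steps above.

```python
import json
from pathlib import Path

PLACEHOLDER = "YOUR_DEEPGRAM_API_KEY_HERE"


def load_deepgram_key(path="api_config.json"):
    """Return the Deepgram API key stored in the JSON config file.

    Hypothetical helper: the real script may load its key differently.
    """
    config_path = Path(path)
    if not config_path.is_file():
        raise FileNotFoundError(f"Config file not found: {config_path}")

    config = json.loads(config_path.read_text(encoding="utf-8"))
    key = config.get("deepgram_api_key", "")

    # Fail early if the field is missing or still holds the template value.
    if not key or key == PLACEHOLDER:
        raise ValueError("Set 'deepgram_api_key' in the config file first")
    return key
```

Failing early on a missing or placeholder key gives a clearer message than an authentication error returned by the API later on.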
Usage:
Run the script to start transcribing. Press Ctrl+C to stop.
python transcript_deepgram.py
A transcript file will be saved automatically.
transcript_local.py uses faster-whisper to run transcription locally on your machine, with a focus on minimizing CPU and power usage.
Setup:
- Install dependencies:
pip install faster-whisper sounddevice numpy psutil
Usage:
Run the script to start. The base model is used by default. Press Ctrl+C to stop.
python transcript_local.py
A transcript file will be saved automatically.
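To give a rough idea of what local transcription with faster-whisper involves, here is a minimal sketch. It is not the code in transcript_local.py. The function names, the 16 kHz sample rate, the 5-second chunk, the int8 compute type and beam_size=1 are all assumptions chosen to keep CPU load low.

```python
SAMPLE_RATE = 16000  # assumed: Whisper models expect 16 kHz mono audio


def load_model(name="base"):
    """Load a Whisper model on the CPU (int8 is an assumed low-power choice)."""
    from faster_whisper import WhisperModel  # imported lazily: heavy dependency

    return WhisperModel(name, device="cpu", compute_type="int8")


def record_chunk(duration_s=5.0):
    """Record one chunk from the default microphone as float32 mono samples."""
    import sounddevice as sd

    frames = int(duration_s * SAMPLE_RATE)
    audio = sd.rec(frames, samplerate=SAMPLE_RATE, channels=1, dtype="float32")
    sd.wait()  # block until the recording is complete
    return audio.flatten()


def transcribe_chunk(model, audio):
    """Return the text recognised in one audio chunk."""
    # transcribe() returns a lazy generator of segments plus metadata.
    segments, _info = model.transcribe(audio, beam_size=1)
    return " ".join(segment.text.strip() for segment in segments)


def main():
    model = load_model("base")
    try:
        while True:
            text = transcribe_chunk(model, record_chunk())
            if text:
                print(text)
    except KeyboardInterrupt:
        pass  # Ctrl+C stops the loop
```

Calling main() starts the loop. The model is loaded once and reused for every chunk, which avoids repeated start-up cost.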
You can modify the behavior of both scripts with command-line arguments.
Deepgram:
# Use a 3-second chunk size and more sensitive voice detection
python transcript_deepgram.py --chunk-duration 3 --vad-threshold 0.006
Local Whisper:
# Use the 'tiny' model for lowest power usage
python transcript_local.py --model tiny
# Use a larger chunk size and enable low-power sleep mode
python transcript_local.py --chunk-duration 8 --low-power
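The flags above could be defined with argparse roughly as follows. This is only a sketch. The flag names and the base default model come from this README, but the other defaults (5.0 seconds, 0.01) are placeholders. The is_speech function further assumes that voice detection compares a chunk's RMS energy against the threshold, which would explain why a lower --vad-threshold is more sensitive. The real scripts may work differently.

```python
import argparse
import math


def build_deepgram_parser():
    parser = argparse.ArgumentParser(description="Deepgram transcription")
    # Defaults below are placeholders, not the script's real values.
    parser.add_argument("--chunk-duration", type=float, default=5.0,
                        help="seconds of audio per chunk")
    parser.add_argument("--vad-threshold", type=float, default=0.01,
                        help="lower values treat quieter audio as speech")
    return parser


def build_local_parser():
    parser = argparse.ArgumentParser(description="Local Whisper transcription")
    parser.add_argument("--model", default="base",
                        help="Whisper model size, e.g. tiny or base")
    parser.add_argument("--chunk-duration", type=float, default=5.0,
                        help="seconds of audio per chunk (placeholder default)")
    parser.add_argument("--low-power", action="store_true",
                        help="enable low-power sleep mode")
    return parser


def is_speech(samples, threshold):
    """Assumed energy-based check: True if the chunk's RMS exceeds threshold."""
    if len(samples) == 0:
        return False
    rms = math.sqrt(sum(float(s) * float(s) for s in samples) / len(samples))
    return rms > threshold
```

With this layout, argparse exposes --chunk-duration as args.chunk_duration and --low-power as args.low_power.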