A professional voice recording and AI-powered interview assistance application designed for software development interview preparation.
- High-quality voice recording with configurable audio settings
- Real-time speech transcription using OpenAI Whisper
- AI-powered interview responses via Ollama or compatible APIs
- Production-ready architecture with proper error handling and logging
- Configurable settings via INI file
- Professional UI with dark theme and responsive design
- Thread-safe operations with proper resource management
- Comprehensive logging for debugging and monitoring
- Python 3.8 or higher
- Audio input device (microphone)
- 4GB+ RAM (8GB recommended for larger Whisper models)
- GPU with CUDA support (optional, for faster transcription)
- Ollama or compatible AI service running locally
- PortAudio (for audio recording)
```bash
# Save the main application file as voice_assistant.py

# Install required packages
pip install -r requirements.txt

# Optional: install CUDA support for faster transcription
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118
```

Platform-specific audio dependencies:

```bash
# Windows: PortAudio is usually included with sounddevice.
# If you encounter issues, install the Visual C++ Build Tools.

# macOS: install PortAudio using Homebrew
brew install portaudio

# Linux (Debian/Ubuntu)
sudo apt-get update
sudo apt-get install portaudio19-dev python3-pyaudio
```

Set up the AI backend:

```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull the required model
ollama pull llama3

# Start the Ollama service
ollama serve
```

The application uses a config.ini file for configuration. On first run, it creates a default configuration file that you can customize.
Audio settings:
- sample_rate: Audio sample rate (default: 44100)
- channels: Number of audio channels (default: 1)
- dtype: Audio data type (default: int16)

Whisper settings:
- model: Whisper model size (tiny, base, small, medium, large)
- device: Processing device (auto, cpu, cuda)

AI service settings:
- api_url: AI service endpoint
- model: AI model name
- timeout: Request timeout in seconds

Logging settings:
- level: Log level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
- file: Log file path (optional)
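For reference, a complete config.ini might look like the following. This is an illustrative sketch: the section names and default values shown here are assumptions, so check the file the application generates on first run for the authoritative layout.

```ini
[audio]
sample_rate = 44100
channels = 1
dtype = int16

[whisper]
model = base
device = auto

[ai]
api_url = http://localhost:11434/api/generate
model = llama3
timeout = 30

[logging]
level = INFO
file = voice_assistant.log
```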
```bash
python voice_assistant.py
```

- Start Recording: Press SPACE or click the microphone button
- Stop Recording: Press SPACE again or click the stop button
- View Results: The popup window shows transcription and AI response
- Toggle Popup: Double-click the microphone button
- SPACE: Start/Stop recording
- Double-click mic button: Toggle popup window
```bash
# Create virtual environment
python -m venv venv
source venv/bin/activate   # Linux/macOS
# or
venv\Scripts\activate      # Windows

# Install dependencies
pip install -r requirements.txt

# Copy and customize configuration
cp config.ini.template config.ini
# Edit config.ini with your settings
```

Create a systemd service file:
```ini
[Unit]
Description=Voice Interview Assistant
After=network.target

[Service]
Type=simple
User=your_username
WorkingDirectory=/path/to/voice-assistant
ExecStart=/path/to/venv/bin/python voice_assistant.py
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

- Check logs in the configured log file
- Monitor system resources (CPU, memory, GPU)
- Set up log rotation for production environments
```bash
# Check available audio devices
python -c "import sounddevice as sd; print(sd.query_devices())"

# Test microphone (records one second of audio)
python -c "import sounddevice as sd; print('Recording...'); data = sd.rec(44100, samplerate=44100, channels=1); sd.wait(); print('Done')"
```

- Ensure sufficient RAM is available
- Try a smaller model (tiny, base) if memory is limited
- Check CUDA installation for GPU acceleration
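To confirm whether GPU acceleration will actually be used, a small helper along these lines can report the device Whisper would run on. This is a sketch; the application's own device-selection logic may differ.

```python
def transcription_device():
    """Return 'cuda' when a usable GPU is detected, otherwise 'cpu'.

    Degrades gracefully when torch is not installed at all.
    """
    try:
        import torch
        if torch.cuda.is_available():
            return "cuda"
    except ImportError:
        pass
    return "cpu"

print("Whisper will run on:", transcription_device())
```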
- Verify Ollama is running: `curl http://localhost:11434/api/tags`
- Check firewall settings
- Verify model availability: `ollama list`
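The same health check can be scripted from Python using only the standard library and Ollama's documented `/api/tags` endpoint. This sketch returns the installed model names, or None when the service is unreachable:

```python
import json
import urllib.request
import urllib.error

def ollama_available(base_url="http://localhost:11434", timeout=3):
    """Return the list of installed model names, or None if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except (urllib.error.URLError, OSError, ValueError):
        return None

models = ollama_available()
print("Ollama models:", models if models is not None else "service unreachable")
```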
- Ensure tkinter is installed (usually included with Python)
- Check display settings for popup window positioning
- Verify window manager compatibility
- Use GPU acceleration (CUDA)
- Choose appropriate Whisper model size
- Optimize audio settings
- Use smaller Whisper models (tiny, base)
- Reduce audio buffer sizes
- Close popup when not needed
- ConfigManager: Handles application configuration
- AudioManager: Manages audio recording with thread safety
- AIService: Handles Whisper transcription and AI API calls
- VoiceInterviewAssistant: Main application controller
- UI Components: Professional GUI with responsive design
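As an illustration of the first component, a minimal ConfigManager with the first-run default-writing behavior described earlier might look like this. The section names and defaults are assumptions for the sketch, not the application's actual code.

```python
import configparser
from pathlib import Path

# Illustrative defaults; the real application defines its own.
DEFAULTS = {
    "audio": {"sample_rate": "44100", "channels": "1", "dtype": "int16"},
    "whisper": {"model": "base", "device": "auto"},
}

class ConfigManager:
    """Load config.ini, writing the defaults out on first run."""

    def __init__(self, path="config.ini"):
        self.path = Path(path)
        self.parser = configparser.ConfigParser()
        self.parser.read_dict(DEFAULTS)
        if self.path.exists():
            self.parser.read(self.path)       # user values override defaults
        else:
            with self.path.open("w") as fh:   # first run: persist defaults
                self.parser.write(fh)

    def get(self, section, key):
        return self.parser.get(section, key)
```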
- All audio operations are thread-safe
- Proper resource cleanup on shutdown
- Graceful handling of interruptions
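The thread-safety pattern above can be sketched as a lock-protected buffer: the audio callback thread appends chunks while the UI thread drains them. This is illustrative, not the application's actual AudioManager.

```python
import threading

class AudioBuffer:
    """Accumulate audio chunks from the callback thread; drained by the UI thread."""

    def __init__(self):
        self._chunks = []
        self._lock = threading.Lock()

    def append(self, chunk):
        with self._lock:
            self._chunks.append(chunk)

    def drain(self):
        """Atomically take all buffered chunks, leaving the buffer empty."""
        with self._lock:
            chunks, self._chunks = self._chunks, []
            return chunks
```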
- Comprehensive exception handling
- Graceful degradation on errors
- User-friendly error messages
- Detailed logging for debugging
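One common way to combine detailed logging, graceful degradation, and a user-safe fallback is a small decorator. This is a sketch of the pattern, not the application's actual error-handling code.

```python
import functools
import logging

logger = logging.getLogger("voice_assistant")

def graceful(fallback=None, message="Operation failed"):
    """Log the full traceback, then return a fallback instead of crashing."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            except Exception:
                logger.exception(message)
                return fallback
        return wrapper
    return decorate

@graceful(fallback="", message="Transcription failed")
def transcribe(path):
    # Hypothetical failing operation, for demonstration only.
    raise RuntimeError("demo error")
```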
- Audio data is processed locally (privacy-first)
- Temporary files are cleaned up automatically
- No sensitive data is stored permanently
- API calls use session management
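Automatic temp-file cleanup can be guaranteed with a context manager along these lines (a sketch; the application's actual mechanism may differ):

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def temp_wav():
    """Yield a temporary .wav path and guarantee its removal afterwards."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)  # we only need the path; callers reopen it themselves
    try:
        yield path
    finally:
        if os.path.exists(path):
            os.remove(path)
```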
- Monitor log files for errors
- Check AI service availability
- Verify audio device connectivity
- Monitor system resources
- Regularly update dependencies
- Monitor Whisper model updates
- Check Ollama service updates
- Review and rotate log files
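Log rotation can be handled in-process with the standard library's RotatingFileHandler; the size limit and backup count below are arbitrary illustrative choices.

```python
import logging
from logging.handlers import RotatingFileHandler

def setup_logging(path="voice_assistant.log", level=logging.INFO):
    """Configure a size-based rotating log (5 MB per file, 3 backups kept)."""
    handler = RotatingFileHandler(path, maxBytes=5 * 1024 * 1024, backupCount=3)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger = logging.getLogger("voice_assistant")
    logger.setLevel(level)
    logger.addHandler(handler)
    return logger
```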
When contributing to the production version:
- Follow Python PEP 8 style guidelines
- Add comprehensive error handling
- Include logging for debugging
- Write unit tests for new features
- Update configuration documentation
- Test on multiple platforms
This production-ready version includes enterprise-grade features and should be used according to your organization's software licensing policies.
For production deployment support:
- Check logs for detailed error information
- Verify all dependencies are correctly installed
- Test individual components (audio, transcription, AI service)
- Monitor system resources during operation
Production Notes: This version includes comprehensive error handling, logging, configuration management, and thread safety suitable for production environments. Always test thoroughly in your specific environment before deployment.