TheScriptRailoth/interview-helper-ai

Voice Interview Assistant - Production Ready

A professional voice recording and AI-powered interview assistance application designed for software development interview preparation.

🚀 Features

  • High-quality voice recording with configurable audio settings
  • Real-time speech transcription using OpenAI Whisper
  • AI-powered interview responses via Ollama or compatible APIs
  • Production-ready architecture with proper error handling and logging
  • Configurable settings via INI file
  • Professional UI with dark theme and responsive design
  • Thread-safe operations with proper resource management
  • Comprehensive logging for debugging and monitoring

📋 Requirements

System Requirements

  • Python 3.8 or higher
  • Audio input device (microphone)
  • 4GB+ RAM (8GB recommended for larger Whisper models)
  • GPU with CUDA support (optional, for faster transcription)

Software Dependencies

  • Ollama or compatible AI service running locally
  • PortAudio (for audio recording)

🛠️ Installation

1. Clone or Download

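# Clone the repository
git clone https://github.com/TheScriptRailoth/interview-helper-ai.git
# Or download the main application file and save it as voice_assistant.py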

2. Install Python Dependencies

# Install required packages
pip install -r requirements.txt

# Optional: Install CUDA support for faster transcription
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118

3. Install System Dependencies

Windows

# PortAudio is usually included with sounddevice
# If you encounter issues, install Visual C++ Build Tools

macOS

# Install using Homebrew
brew install portaudio

Linux (Ubuntu/Debian)

sudo apt-get update
sudo apt-get install portaudio19-dev python3-pyaudio

4. Setup AI Service (Ollama)

# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Start the Ollama service (skip this if the installer already started it)
ollama serve

# In a second terminal, pull the required model
ollama pull llama3

⚙️ Configuration

The application uses a config.ini file for configuration. On first run, it will create a default configuration file that you can customize.

Key Configuration Options

Audio Settings

  • sample_rate: Audio sample rate (default: 44100)
  • channels: Number of audio channels (default: 1)
  • dtype: Audio data type (default: int16)

Whisper Model

  • model: Whisper model size (tiny, base, small, medium, large)
  • device: Processing device (auto, cpu, cuda)

AI Service

  • api_url: AI service endpoint
  • model: AI model name
  • timeout: Request timeout in seconds

Logging

  • level: Log level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
  • file: Log file path (optional)
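The options above could be read with Python's standard configparser module. The following is a minimal sketch, not the application's actual ConfigManager: the section names (audio, whisper, ai_service, logging) and every default not stated in this README (the Whisper model, api_url, timeout, log level) are assumptions made for illustration.

```python
import configparser

# Defaults mirror the values listed above where the README states them;
# section names and the remaining values are assumptions for illustration.
DEFAULTS = {
    "audio": {"sample_rate": "44100", "channels": "1", "dtype": "int16"},
    "whisper": {"model": "base", "device": "auto"},
    "ai_service": {
        "api_url": "http://localhost:11434/api/generate",
        "model": "llama3",
        "timeout": "30",
    },
    "logging": {"level": "INFO", "file": ""},
}


def load_config(path="config.ini"):
    """Return a ConfigParser seeded with defaults, overlaid with the file if present."""
    parser = configparser.ConfigParser()
    parser.read_dict(DEFAULTS)
    parser.read(path)  # silently skipped when the file does not exist
    return parser


if __name__ == "__main__":
    cfg = load_config()
    print(cfg.getint("audio", "sample_rate"), cfg.get("whisper", "device"))
```

Seeding the parser with defaults first means a partially filled config.ini still yields a complete configuration.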

🚀 Usage

Starting the Application

python voice_assistant.py

Basic Operation

  1. Start Recording: Press SPACE or click the microphone button
  2. Stop Recording: Press SPACE again or click the stop button
  3. View Results: The popup window shows transcription and AI response
  4. Toggle Popup: Double-click the microphone button

Keyboard and Mouse Shortcuts

  • SPACE: Start/Stop recording
  • Double-click mic button: Toggle popup window

🏗️ Production Deployment

1. Environment Setup

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Linux/macOS
# or
venv\Scripts\activate     # Windows

# Install dependencies
pip install -r requirements.txt

2. Configuration Management

# Copy and customize configuration
# (if no template is available, run the application once to generate a default config.ini)
cp config.ini.template config.ini
# Edit config.ini with your settings

3. Service Setup (Linux)

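Create a systemd service file, for example /etc/systemd/system/voice-assistant.service. The application opens a tkinter window, so the service must run for a user with access to a graphical session: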

[Unit]
Description=Voice Interview Assistant
After=network.target

[Service]
Type=simple
User=your_username
WorkingDirectory=/path/to/voice-assistant
ExecStart=/path/to/venv/bin/python voice_assistant.py
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target

4. Monitoring and Logging

  • Check logs in the configured log file
  • Monitor system resources (CPU, memory, GPU)
  • Set up log rotation for production environments
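Log rotation can be handled inside the application with the standard library's RotatingFileHandler. This is a sketch under assumptions: the logger name, size limit, and backup count are illustrative values, not settings taken from the application.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_logger(log_file, level="INFO", max_bytes=1_000_000, backups=3):
    """Create a logger that rotates its file once it reaches max_bytes."""
    logger = logging.getLogger("voice_assistant")  # logger name is an assumption
    logger.setLevel(getattr(logging, level.upper(), logging.INFO))
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = RotatingFileHandler(log_file, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

With backups=3, the handler keeps the active file plus up to three older files (app.log.1 through app.log.3) and discards anything older. On Linux, an external tool such as logrotate is an alternative.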

🔧 Troubleshooting

Common Issues

Audio Recording Problems

# Check available audio devices
python -c "import sounddevice as sd; print(sd.query_devices())"

# Test microphone
python -c "import sounddevice as sd; print('Recording 1 second...'); data = sd.rec(44100, samplerate=44100, channels=1); sd.wait(); print('Done, peak amplitude:', abs(data).max())"

Whisper Model Loading Issues

  • Ensure sufficient RAM is available
  • Try a smaller model (tiny, base) if memory is limited
  • Check CUDA installation for GPU acceleration
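The checks above can be partly automated by resolving the device setting before loading the model and retrying on the CPU if the GPU load fails. This sketch assumes the application uses the openai-whisper package; the function names resolve_device and load_whisper are hypothetical.

```python
def resolve_device(setting="auto"):
    """Map the `device` config value to a concrete device string."""
    if setting != "auto":
        return setting
    try:
        import torch  # optional dependency; only needed for the CUDA check
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def load_whisper(model_name="base", device_setting="auto"):
    """Load a Whisper model, retrying on CPU if the requested device fails."""
    import whisper  # provided by the openai-whisper package (assumed here)

    device = resolve_device(device_setting)
    try:
        return whisper.load_model(model_name, device=device)
    except RuntimeError:
        if device == "cpu":
            raise
        # For example CUDA out of memory: fall back to the CPU.
        return whisper.load_model(model_name, device="cpu")
```

If loading still fails on the CPU, switching to a smaller model (tiny or base) is the next thing to try.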

AI Service Connection Issues

  • Verify Ollama is running: curl http://localhost:11434/api/tags
  • Check firewall settings
  • Verify model availability: ollama list
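The same two checks can be run from Python, which is handy for a startup self-test. The sketch below queries Ollama's /api/tags endpoint using only the standard library; the function names and the 5-second timeout are illustrative choices.

```python
import json
import urllib.error
import urllib.request


def model_names(tags_payload):
    """Extract model names from the JSON returned by Ollama's /api/tags."""
    return [m.get("name", "") for m in tags_payload.get("models", [])]


def check_ollama(base_url="http://localhost:11434", timeout=5):
    """Return (reachable, model_names); never raises on connection problems."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return True, model_names(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        return False, []


if __name__ == "__main__":
    reachable, names = check_ollama()
    print("Ollama reachable:", reachable)
    print("Models:", names)
```

A result of (True, []) means the service is running but no model has been pulled yet.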

UI Issues

  • Ensure tkinter is installed (usually included with Python)
  • Check display settings for popup window positioning
  • Verify window manager compatibility

Performance Optimization

For Better Transcription Speed

  • Use GPU acceleration (CUDA)
  • Choose appropriate Whisper model size
  • Optimize audio settings

For Lower Memory Usage

  • Use smaller Whisper models (tiny, base)
  • Reduce audio buffer sizes
  • Close popup when not needed
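To see how audio settings affect memory, the size of an uncompressed recording is seconds × sample rate × channels × bytes per sample. The helper below is a small illustrative calculation, not part of the application. Whisper models operate on 16 kHz audio, so recording at 16000 Hz is one option to consider, assuming your input device supports that rate.

```python
BYTES_PER_SAMPLE = {"int16": 2, "int32": 4, "float32": 4}


def buffer_bytes(seconds, sample_rate=44100, channels=1, dtype="int16"):
    """Bytes of raw PCM audio held in memory for a recording of `seconds`."""
    return int(seconds * sample_rate * channels * BYTES_PER_SAMPLE[dtype])


if __name__ == "__main__":
    for rate in (44100, 16000):
        mib = buffer_bytes(60, sample_rate=rate) / 2**20
        print(f"{rate} Hz mono int16, 60 s: {mib:.2f} MiB")
```

A one-minute mono int16 recording takes about 5 MiB at 44100 Hz and about 1.8 MiB at 16000 Hz, so the Whisper model size usually dominates memory use.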

📊 Architecture Overview

Key Components

  1. ConfigManager: Handles application configuration
  2. AudioManager: Manages audio recording with thread safety
  3. AIService: Handles Whisper transcription and AI API calls
  4. VoiceInterviewAssistant: Main application controller
  5. UI Components: Professional GUI with responsive design
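As an illustration of the AIService step, the sketch below sends a transcript to an Ollama-style /api/generate endpoint and reads the reply. It is not the application's code: the function names and the prompt wording are hypothetical, and only the request fields (model, prompt, stream) and the response field come from Ollama's API.

```python
import json
import urllib.request


def build_generate_request(api_url, model, transcript):
    """Build a non-streaming POST request for an Ollama-style /api/generate endpoint."""
    payload = {
        "model": model,
        # The prompt wording is hypothetical, not the application's actual prompt.
        "prompt": f"You are helping with a software interview. Question: {transcript}",
        "stream": False,
    }
    return urllib.request.Request(
        api_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(api_url, model, transcript, timeout=30):
    """Send the request and return the generated text."""
    request = build_generate_request(api_url, model, transcript)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)["response"]
```

Keeping request construction separate from the network call makes the payload easy to unit test without a running AI service.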

Thread Safety

  • All audio operations are thread-safe
  • Proper resource cleanup on shutdown
  • Graceful handling of interruptions
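One common way to get this kind of thread safety is to guard the shared chunk list with a lock, since audio callbacks typically run on a different thread from the UI. The class below is a hypothetical sketch of that pattern, not the application's AudioManager.

```python
import threading


class ChunkBuffer:
    """Collects audio chunks from a callback thread; safe to read from another thread."""

    def __init__(self):
        self._lock = threading.Lock()
        self._chunks = []
        self._recording = False

    def start(self):
        with self._lock:
            self._chunks = []
            self._recording = True

    def append(self, chunk):
        # Called from the audio callback thread; ignored when not recording.
        with self._lock:
            if self._recording:
                self._chunks.append(chunk)

    def stop(self):
        """Stop recording and hand back everything captured so far."""
        with self._lock:
            self._recording = False
            chunks, self._chunks = self._chunks, []
        return chunks
```

Because stop() swaps the list out while holding the lock, a chunk that arrives late is dropped and never reaches a buffer another thread is already processing.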

Error Handling

  • Comprehensive exception handling
  • Graceful degradation on errors
  • User-friendly error messages
  • Detailed logging for debugging
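The four points above can be combined in a small wrapper around each pipeline step: it logs the full traceback, returns a fallback value so the application keeps running, and produces a short message for the UI. The helper name safe_call and the logger name are assumptions for illustration.

```python
import logging

logger = logging.getLogger("voice_assistant")  # logger name is an assumption


def safe_call(step_name, func, *args, fallback=None, **kwargs):
    """Run func; on failure log the traceback and return (fallback, message)."""
    try:
        return func(*args, **kwargs), None
    except Exception as exc:  # broad on purpose: this is a UI boundary
        logger.exception("%s failed", step_name)
        return fallback, f"{step_name} failed: {exc}. See the log for details."
```

A caller might write `text, error = safe_call("Transcription", transcribe, audio, fallback="")` and show `error` in the popup when it is not None; `transcribe` here stands for whatever transcription function the application uses.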

🔒 Security Considerations

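  • Audio is recorded and transcribed locally (privacy-first); only the transcribed text is sent to the configured AI service, which is also local when using Ollama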
  • Temporary files are cleaned up automatically
  • No sensitive data is stored permanently
  • API calls use session management
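Automatic cleanup of temporary audio files can be guaranteed with a context manager that deletes the file even when transcription raises. This is a sketch of the idea using only the standard library; the function name and the WAV parameters are assumptions, not the application's implementation.

```python
import contextlib
import os
import tempfile
import wave


@contextlib.contextmanager
def temporary_wav(pcm_bytes, sample_rate=44100, channels=1, sample_width=2):
    """Write PCM data to a temporary WAV file and delete it on exit."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)  # reopen through the wave module below
    try:
        with wave.open(path, "wb") as wav:
            wav.setnchannels(channels)
            wav.setsampwidth(sample_width)  # 2 bytes per sample = int16
            wav.setframerate(sample_rate)
            wav.writeframes(pcm_bytes)
        yield path
    finally:
        with contextlib.suppress(FileNotFoundError):
            os.remove(path)
```

Inside `with temporary_wav(data) as path:` the file can be handed to the transcription step; once the block exits, normally or through an exception, the recording is removed from disk.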

📈 Monitoring and Maintenance

Health Checks

  • Monitor log files for errors
  • Check AI service availability
  • Verify audio device connectivity
  • Monitor system resources

Updates and Maintenance

  • Regularly update dependencies
  • Monitor Whisper model updates
  • Check Ollama service updates
  • Review and rotate log files

🤝 Contributing

When contributing to the production version:

  1. Follow Python PEP 8 style guidelines
  2. Add comprehensive error handling
  3. Include logging for debugging
  4. Write unit tests for new features
  5. Update configuration documentation
  6. Test on multiple platforms

📄 License

This production-ready version includes enterprise-grade features and should be used according to your organization's software licensing policies.

📞 Support

For production deployment support:

  • Check logs for detailed error information
  • Verify all dependencies are correctly installed
  • Test individual components (audio, transcription, AI service)
  • Monitor system resources during operation

Production Notes: This version includes comprehensive error handling, logging, configuration management, and thread safety suitable for production environments. Always test thoroughly in your specific environment before deployment.
