This guide provides a comprehensive setup for the RealtimeVoiceChat application with automatic dependency management and graceful error handling.
```bash
# Run the complete setup script (handles everything)
chmod +x setup_complete.sh
./setup_complete.sh
```

```bash
# Use the robust startup script
chmod +x start_app.sh
./start_app.sh
```

Once running, the following endpoints are available:

- Main Application: http://localhost:8000
- Ollama API: http://localhost:11434 (LLM backend)
- TTS Server: http://localhost:1234 (Orpheus TTS - optional)
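To confirm the endpoints above are reachable, a minimal probe sketch (assuming `curl` is installed; the ports match the defaults listed):

```bash
# Probe a service endpoint and report whether it answers.
check_endpoint() {
    local name="$1" url="$2"
    if curl -fsS --max-time 2 "$url" > /dev/null 2>&1; then
        echo "✅ $name reachable"
    else
        echo "❌ $name not reachable"
    fi
}

check_endpoint "Main Application" "http://localhost:8000"
check_endpoint "Ollama API"       "http://localhost:11434/api/tags"
check_endpoint "TTS Server"       "http://localhost:1234/health"
```

A failing TTS check is expected if you skipped the optional Orpheus server.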
The setup script installs and configures:

- ✅ System Dependencies: Audio libraries, build tools, Python dev packages
- ✅ Ollama: Automatic installation and service startup
- ✅ Mistral 7B Model: Downloaded and ready for LLM processing
- ✅ Whisper Model: Base model for speech recognition
- ✅ Python Packages: All required dependencies from requirements.txt
- ✅ Audio System: ALSA/PulseAudio configuration to suppress warnings
- ✅ TTS Server: Optional Orpheus server (graceful fallback if not available)
The startup script adds several safeguards:

- 🛡️ Graceful Fallbacks: Application continues even if the TTS server fails
- 🛡️ Dependency Checks: Verifies all components before startup
- 🛡️ Clear Logging: Detailed status information and helpful error messages
- 🛡️ Manual Override: Instructions for manual TTS server startup if needed
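The fallback behavior can be pictured with a small sketch (illustrative only; the actual checks live in start_app.sh and the application code):

```bash
# Probe the optional TTS server; the application proceeds either way.
tts_available() {
    # Defaults to the Orpheus health endpoint; accepts an override URL.
    curl -fsS --max-time 2 "${1:-http://localhost:1234/health}" > /dev/null 2>&1
}

if tts_available; then
    echo "✅ TTS server reachable - full voice output enabled"
else
    echo "⚠️  TTS server unavailable - continuing without speech output"
fi
```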
If you prefer step-by-step manual setup:
```bash
# Update system
apt-get update

# Install essential packages
apt-get install -y curl wget git build-essential cmake python3-dev python3-pip

# Install audio libraries
apt-get install -y libsndfile1-dev portaudio19-dev libasound2-dev libpulse-dev alsa-utils ffmpeg
```

```bash
# Install Python packages
pip install --upgrade pip
pip install -r requirements.txt
```

```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Start Ollama service
ollama serve &

# Wait for the service to start, then download the model
sleep 10
ollama pull mistral:7b
```

```bash
# Set audio environment variables
export ALSA_PCM_CARD=default
export ALSA_PCM_DEVICE=0
export PULSE_RUNTIME_PATH=/tmp/pulse-runtime
export SDL_AUDIODRIVER=pulse
```

```bash
# Start the application
cd code
python server.py
```

If the application cannot reach the LLM backend:

```bash
# Check if Ollama is running
curl http://localhost:11434/api/tags

# If not running, start it
ollama serve &

# Check if the model is available
ollama list
```

The application now handles a missing TTS server gracefully. If you want TTS functionality:
```bash
# Start TTS server manually
python -m llama_cpp.server \
    --model /workspace/models/Orpheus-3b-FT-Q8_0.gguf \
    --host 0.0.0.0 \
    --port 1234 \
    --n_gpu_layers -1
```

```bash
# Source the audio environment
source set_audio_env.sh

# Or set manually
export ALSA_PCM_CARD=default
export ALSA_PCM_DEVICE=0
```

```bash
# Reinstall requirements
pip install -r requirements.txt --force-reinstall
```

```bash
# Download Whisper model manually
python -c "import whisper; whisper.load_model('base', download_root='/workspace/models')"
```

```bash
# Run the status check script (created during setup)
./check_ollama_status.sh
```

```bash
# Ollama API
curl http://localhost:11434/api/tags

# TTS Server (optional)
curl http://localhost:1234/health

# Main Application
curl http://localhost:8000
```

Edit code/server.py to change models:
```python
# LLM Model (Ollama)
LLM_START_MODEL = "mistral:7b"  # Change to other Ollama models

# TTS Engine
TTS_START_ENGINE = "orpheus"  # Options: orpheus, kokoro, coqui
```

```bash
# Whisper Model (in setup scripts)
WHISPER_MODEL="base"  # Options: tiny, base, small, medium, large
```

Edit set_audio_env.sh for custom audio settings:
```bash
export ALSA_PCM_CARD=default
export ALSA_PCM_DEVICE=0
export PULSE_RUNTIME_PATH=/tmp/pulse-runtime
export SDL_AUDIODRIVER=pulse
```

```bash
# Restart Ollama
pkill -f "ollama serve"
ollama serve &
```

```bash
# Stop application (Ctrl+C in terminal)
# Then restart
./start_app.sh
```

After setup, your directory should contain:
```
RealtimeVoiceChat/
├── code/                   # Application source code
├── setup_complete.sh       # Complete setup script
├── setup_ollama.sh         # Ollama-specific setup
├── start_app.sh            # Robust application startup
├── set_audio_env.sh        # Audio environment configuration
├── check_ollama_status.sh  # Ollama status checker
├── requirements.txt        # Python dependencies
└── README.md               # Original project documentation
```
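A quick way to verify this layout (a hypothetical helper, not part of the setup scripts):

```bash
# Report whether each expected setup script is present and executable.
check_file() {
    if [ -x "$1" ]; then
        echo "✅ $1"
    else
        echo "❌ $1 missing or not executable (try: chmod +x $1)"
    fi
}

for f in setup_complete.sh setup_ollama.sh start_app.sh \
         set_audio_env.sh check_ollama_status.sh; do
    check_file "$f"
done
```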
When everything is working correctly, you should see:
```
🎤🚀 Starting RealtimeVoiceChat Application
=========================================
🦙 Step 1: Checking Ollama service...
✅ Ollama service already running
✅ mistral:7b model available
🔊 Step 2: Setting up audio environment...
✅ Audio environment configured
🎤 Step 3: Checking TTS server...
✅ TTS server already running on port 1234
🧪 Step 4: Running final checks...
✅ All critical packages available
🚀 Step 5: Starting RealtimeVoiceChat application...
🎉 All dependencies ready! Starting application...
```
The application is now ready for robust real-time voice chat!
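If you script around the startup yourself, for example to replace the fixed `sleep 10` in the manual Ollama setup with an actual readiness check, a generic poll helper can be sketched as follows (an illustrative sketch, not part of the repository):

```bash
# Retry a command up to N times, one second apart; succeed as soon as it does.
wait_until() {
    local tries="$1"
    shift
    local i=0
    while [ "$i" -lt "$tries" ]; do
        if "$@" > /dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Example: block for up to 30 s until Ollama answers, then pull the model.
# wait_until 30 curl -fsS http://localhost:11434/api/tags && ollama pull mistral:7b
```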