A sophisticated, local-first, privacy-focused voice assistant with GPU acceleration, conversational abilities, and adaptive resource management.
- 🔒 Local-First Privacy: Explicit permission required for any online activity
- 🎤 Wake Word Detection: "Ziggy" activation using Vosk speech recognition
- 🧠 Dual AI Backend Support: Works with both Msty and Ollama (auto-detects and can auto-start)
- 💬 Conversational Mode: Natural multi-turn conversations without repeating the wake word
- ⚡ Resource Profiles: Adaptive performance based on available GPU memory
- 🎯 Smart Query Routing: Local functions for simple tasks, GPU-accelerated AI for complex queries
- 🗣️ Natural Voice Options: Piper TTS for natural voice, with espeak as a fallback
- ⏹️ Voice Commands: Full voice control, including shutdown and profile switching
Ziggy prioritizes privacy and local processing:
- Local Functions: Time, date, unit conversions handled without AI
- Permission-Gated Online Access: Explicit consent required for web searches
- Local AI Processing: Uses your GPU for AI queries without sending data externally
- No Cloud Dependencies: Everything runs on your hardware
- Conversation History: Stored in memory during session only
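The local-first routing described above can be sketched as a simple dispatcher: trivially answerable intents are handled in plain Python and never reach the AI backend. This is an illustrative sketch, not the project's actual code; `route_query` and the `"AI_QUERY"` sentinel are hypothetical names.

```python
from datetime import datetime

def route_query(text: str) -> str:
    """Handle simple intents locally; everything else would go to the local AI."""
    lowered = text.lower()
    if "time" in lowered:
        return datetime.now().strftime("It's %H:%M")
    if "date" in lowered:
        return datetime.now().strftime("Today is %A, %B %d")
    # Anything else is forwarded to the GPU-backed AI model
    return "AI_QUERY"

print(route_query("What time is it?"))
print(route_query("Explain quantum computing"))  # → AI_QUERY
```

The key property is that the local branch involves no model inference and no network traffic at all.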
- CPU: Multi-core processor
- GPU:
- Minimum: 4GB VRAM (integrated graphics or older GPUs)
- Recommended: 8GB+ VRAM (NVIDIA or AMD GPU)
- Optimal: 16GB+ VRAM for extended conversations
- RAM: 4GB+ system RAM
- Audio: Microphone and speakers/headphones
- OS: Linux (tested on Ubuntu 24.04.1 LTS)
- Python: 3.10+
- AI Backend: Msty or Ollama (assistant can auto-start if needed)
```bash
git clone https://github.com/JoshGK8/voice_assist
cd voice_assist
```

### Scripted Install (Recommended)

```bash
# Run the system prerequisites installer
chmod +x system_prerequisites.sh
./system_prerequisites.sh
```

### Manual Install
Install the following packages:
```bash
# Audio and speech libraries
sudo apt install portaudio19-dev espeak espeak-data libespeak1 libespeak-dev

# Build tools
sudo apt install build-essential python3-dev curl

# Optional: audio troubleshooting tools
sudo apt install alsa-utils pulseaudio-utils

# Add user to audio group
sudo usermod -a -G audio $USER
```

Important: Log out and back in after installation to apply the group change.
```bash
# Create virtual environment
python3 -m venv voice_assistant_env

# Activate environment
source voice_assistant_env/bin/activate

# Install Python dependencies
pip install -r requirements.txt
```

```bash
# Download Vosk model (~50MB)
wget https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip
unzip vosk-model-small-en-us-0.15.zip
```

The assistant will auto-detect running backends or offer to start one:
**Option A: Ollama (Recommended)**

```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model (assistant will use first available)
ollama pull llama3.2
```

**Option B: Msty**

Download and install from msty.app
For better voice quality than espeak:
```bash
chmod +x setup_piper.sh
./setup_piper.sh
```

```bash
# Activate virtual environment
source voice_assistant_env/bin/activate

# Run with auto-detected profile
python3 voice_assistant.py

# Or specify a profile
python3 voice_assistant.py --profile minimal      # For gaming/multitasking
python3 voice_assistant.py --profile performance  # For research/long conversations
```

**Activation:** Say "Ziggy" to wake the assistant.
Conversational Mode:
- Assistant automatically listens after asking questions
- No need to say "Ziggy" during conversations
- Maintains context across multiple exchanges
- Say "new conversation" to clear history
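Conceptually, conversational mode keeps a rolling in-memory history that is trimmed to the active profile's turn limit and can be cleared on command. The class below is an illustrative sketch of that behavior, not the assistant's actual implementation; all names are hypothetical.

```python
class Conversation:
    def __init__(self, max_turns: int = 25):
        self.max_turns = max_turns
        self.history = []  # held in memory only, per the privacy model

    def add_turn(self, user: str, assistant: str):
        self.history.append((user, assistant))
        # Trim to the profile's turn limit so context stays bounded
        self.history = self.history[-self.max_turns:]

    def clear(self):
        """Triggered by 'new conversation' / 'start over'."""
        self.history.clear()

conv = Conversation(max_turns=2)
conv.add_turn("hi", "hello")
conv.add_turn("time?", "noon")
conv.add_turn("date?", "today")
print(len(conv.history))  # → 2 (oldest turn dropped)
```

Because history lives only in this object, ending the session discards the conversation entirely.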
Resource Profiles:
- "Switch to gaming mode" - Minimal resource usage
- "Use standard profile" - Balanced performance
- "Enable performance mode" - Maximum capabilities
- "What profile are you using?" - Check current status
- "How much memory?" - Get resource usage info
Local Functions (instant, no AI):
- "What time is it?"
- "What's the date?"
- "Convert 72 fahrenheit to celsius"
- "How many meters in 50 feet?"
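The conversions above are plain arithmetic, which is why they can be answered instantly with no AI involved. A minimal sketch of the two example conversions (function names are illustrative, not from the project):

```python
def fahrenheit_to_celsius(f: float) -> float:
    # C = (F - 32) * 5/9
    return (f - 32) * 5 / 9

def feet_to_meters(ft: float) -> float:
    # 1 foot = 0.3048 meters exactly
    return ft * 0.3048

print(round(fahrenheit_to_celsius(72), 1))  # → 22.2
print(round(feet_to_meters(50), 2))         # → 15.24
```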
AI Queries (uses local GPU):
- Complex questions and analysis
- Creative writing and coding help
- Technical explanations
- General knowledge (within training data)
Web Search (requires permission):
- "Search for current weather"
- "Look up latest news"
- Assistant asks permission before going online
System Control:
- "Take a break" - Shutdown assistant
- "Start over" / "New conversation" - Clear context
- "What are you running?" - System information
```
You: "Ziggy"
Ziggy: "Yes?"
You: "What's quantum computing?"
Ziggy: [Explains quantum computing and asks if you want to know more]
You: "Yes, how does it differ from regular computing?"   # No wake word needed!
Ziggy: [Continues explanation naturally]
```
Ziggy adapts to your system capabilities:
**Minimal Profile**
- Requirements: 4-8GB VRAM
- Context: 8,000 tokens
- History: 10 conversation turns
- Recording: 2 minutes conversational, 30 seconds for commands
- Use Case: Gaming while using the assistant, older GPUs, shared systems

**Standard Profile**
- Requirements: 8-16GB VRAM
- Context: 16,000 tokens
- History: 25 conversation turns
- Recording: 5 minutes conversational, 1 minute for commands
- Use Case: General productivity, balanced performance

**Performance Profile**
- Requirements: 16GB+ VRAM
- Context: 32,000 tokens
- History: 50 conversation turns
- Recording: 10 minutes conversational, 1 minute for commands
- Use Case: Long research sessions, complex discussions
Profile Switching: The assistant auto-detects your GPU memory and selects an appropriate profile. You can switch profiles with voice commands or the `--profile` flag.
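The auto-detection logic amounts to mapping detected VRAM onto the profile thresholds listed above. A hypothetical sketch (the real assistant queries the GPU; here the amount is passed in directly):

```python
def select_profile(vram_gb: float) -> str:
    """Map available VRAM (GB) to a profile, per the documented thresholds."""
    if vram_gb >= 16:
        return "performance"
    if vram_gb >= 8:
        return "standard"
    return "minimal"

print(select_profile(6))   # → minimal
print(select_profile(12))  # → standard
print(select_profile(24))  # → performance
```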
Edit `voice_assistant.py`:

```python
self.wake_word = "ziggy"  # Change to your preferred word
```

Adjust in `voice_assistant.py`:

```python
self.sample_rate = 16000  # Audio sample rate
self.chunk_size = 4000    # Buffer size
```

```bash
# List devices
aplay -l    # Playback devices
arecord -l  # Recording devices
```
```bash
# Test microphone
arecord -d 5 test.wav && aplay test.wav

# Check permissions
groups | grep audio
```

```bash
# NVIDIA
nvidia-smi

# AMD
rocm-smi --showmeminfo vram
# or
cat /sys/class/drm/card*/device/mem_info_vram_total
```

```bash
# Check Ollama
curl http://localhost:11434/api/tags

# Check Msty
curl http://localhost:10000/v1/models

# The assistant will offer to start backends if not running
```

- Speak clearly and distinctly
- Reduce background noise
- Check microphone levels
- Consider environment acoustics
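The same backend checks shown with `curl` can be done from Python with the standard library, which is roughly how auto-detection works: probe each known endpoint and treat any HTTP response as "running". This is an illustrative sketch; the function name is hypothetical.

```python
import urllib.request

def backend_running(url: str) -> bool:
    """Return True if the HTTP endpoint answers, i.e. the backend is up."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, timeout, DNS failure, etc.
        return False

# Endpoints from the curl checks above
for name, url in [("Ollama", "http://localhost:11434/api/tags"),
                  ("Msty", "http://localhost:10000/v1/models")]:
    print(name, "running" if backend_running(url) else "not detected")
```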
| Profile | Idle RAM | Active RAM | GPU VRAM | Response Time |
|---|---|---|---|---|
| Minimal | ~100MB | ~200MB | 2-4GB | 1-3 seconds |
| Standard | ~150MB | ~300MB | 4-6GB | 1-5 seconds |
| Performance | ~200MB | ~400MB | 6-8GB | 2-8 seconds |
| Feature | Minimal | Standard | Performance |
|---|---|---|---|
| Conversation Length | 10 turns | 25 turns | 50 turns |
| Context Window | 8K tokens | 16K tokens | 32K tokens |
| Recording Time | 2 min | 5 min | 10 min |
| Response Length | 500 tokens | 1000 tokens | 2000 tokens |
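The feature matrix above can be expressed as a lookup table, which is a natural way to hold per-profile limits in code. The structure and name below are illustrative; the values come straight from the table.

```python
# Per-profile limits, mirroring the feature comparison table
PROFILE_LIMITS = {
    "minimal":     {"turns": 10, "context_tokens": 8_000,
                    "recording_min": 2,  "response_tokens": 500},
    "standard":    {"turns": 25, "context_tokens": 16_000,
                    "recording_min": 5,  "response_tokens": 1000},
    "performance": {"turns": 50, "context_tokens": 32_000,
                    "recording_min": 10, "response_tokens": 2000},
}

print(PROFILE_LIMITS["standard"]["context_tokens"])  # → 16000
```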
```
voice_assist/
├── voice_assistant.py        # Main integrated assistant
├── setup_piper.sh            # Natural voice setup (optional)
├── requirements.txt          # Python dependencies
├── system_prerequisites.sh   # System setup script
└── README.md                 # This file
```
- Fork the repository
- Test all features work correctly
- Ensure privacy principles are maintained
- Submit pull request with clear description
- ✅ Dual backend support (Msty/Ollama)
- ✅ Conversational mode with automatic listening
- ✅ Resource profiles for different hardware
- ✅ Voice-controlled profile switching
- ✅ Improved natural language command recognition
- ✅ Piper TTS integration option
- ✅ Auto backend startup with voice selection
GPL-3.0 license
- Vosk for offline speech recognition
- Ollama/Msty for local AI backends
- Piper TTS for natural voice synthesis
- espeak for fallback text-to-speech
- Community for testing and feedback
Ziggy Voice Assistant - Your local, private, AI-powered conversational companion.