Troubleshooting
Common issues and how to fix them.
Installation Issues
Python not found
Error: python3: command not found
Install Python 3.9+ from python.org or your package manager.
Check version:
python3 --version
Setup script fails
On Linux/macOS:
chmod +x scripts/setup.sh
./scripts/setup.sh
On Windows: Run scripts\setup.bat from Command Prompt, not PowerShell.
PyTorch installation fails
Install manually:
source venv/bin/activate # or venv\Scripts\activate on Windows
pip install torch torchvision torchaudio
For CUDA support:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
Port 7860 already in use
Use a different port:
python -m ovl.cli --port 8080
Model Loading Issues
Out of memory when loading model
Solutions:
- Close other GPU applications
- Use CPU device (slower): select "cpu" in device dropdown
- Use smaller model (1.5B instead of 7B)
- Restart computer to clear GPU memory
Model download timeout
When downloading for the first time:
- The model is 3-6 GB and takes 5-20 minutes, depending on internet speed
- Let it finish completely
- If interrupted, delete partially downloaded files and retry
Find downloaded models:
~/.cache/huggingface/hub/
CUDA not available
Check if GPU is detected:
python -c "import torch; print(torch.cuda.is_available())"
If False:
- Update NVIDIA drivers
- Reinstall PyTorch with CUDA support
- Check if GPU is being used by another process
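When the check prints False, it helps to know whether the installed PyTorch build has CUDA support at all. The following is a minimal sketch that uses only standard PyTorch attributes (torch.version.cuda, torch.cuda.is_available, torch.cuda.device_count); the function name cuda_report is illustrative and not part of OpenVoiceLab.

```python
import importlib.util


def cuda_report():
    """Collect a few facts that help explain why CUDA may be unavailable."""
    # Report a missing install instead of raising ImportError.
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}

    import torch

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None means this build has no CUDA support (for example a CPU-only wheel).
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "gpus": [],
    }
    if report["cuda_available"]:
        report["gpus"] = [
            torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())
        ]
    return report


for key, value in cuda_report().items():
    print(f"{key}: {value}")
```

If built_with_cuda is None, reinstalling PyTorch with CUDA support is the likely fix. If it shows a CUDA version but cuda_available is False, the driver is the more likely culprit.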
LoRA adapter won't load
Check path:
- Make sure the path points to the checkpoints/ folder
- Example: training_runs/run_20251008_143022/checkpoints/
- Don't include a specific file, just the folder
Verify checkpoint exists:
ls training_runs/run_TIMESTAMP/checkpoints/
Should contain adapter_model.bin or similar files.
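The same checks can be scripted. This is a sketch under assumptions: adapter_model.bin comes from the text above, while adapter_model.safetensors and adapter_config.json are names that PEFT-style LoRA checkpoints commonly use and may not match what OpenVoiceLab writes. The function check_adapter_dir is hypothetical.

```python
from pathlib import Path

# Assumed file names; adjust them if your checkpoints folder looks different.
WEIGHT_FILES = ("adapter_model.bin", "adapter_model.safetensors")
CONFIG_FILE = "adapter_config.json"


def check_adapter_dir(path):
    """Return a list of problems; an empty list means the folder looks usable."""
    folder = Path(path).expanduser()
    if folder.is_file():
        return [f"{folder} is a file; pass the containing checkpoints/ folder instead"]
    if not folder.is_dir():
        return [f"{folder} does not exist"]

    problems = []
    if not any((folder / name).is_file() for name in WEIGHT_FILES):
        problems.append("no adapter weights found (" + " or ".join(WEIGHT_FILES) + ")")
    if not (folder / CONFIG_FILE).is_file():
        problems.append(f"{CONFIG_FILE} is missing")
    return problems
```

Calling check_adapter_dir("training_runs/run_TIMESTAMP/checkpoints/") with your own run folder returns an empty list when nothing obvious is wrong.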
Data Processing Issues
No audio segments found after VAD
Causes:
- Audio is too quiet
- Audio is too noisy
- No actual speech in audio
Solutions:
- Use cleaner audio
- Boost audio volume before processing
- Try different audio files
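To judge whether a recording is too quiet before boosting it, you can measure its peak level. The sketch below uses only the Python standard library and assumes a 16-bit PCM WAV file; peak_dbfs is an illustrative helper, not an OpenVoiceLab function.

```python
import array
import math
import sys
import wave


def peak_dbfs(wav_path):
    """Return the peak level of a 16-bit PCM WAV file in dBFS (0.0 is full scale)."""
    with wave.open(str(wav_path), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        frames = wav.readframes(wav.getnframes())

    samples = array.array("h", frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(peak / 32768)
```

A strongly negative result means the file is quiet, and the negated value is roughly the gain in dB that would bring the loudest sample up to full scale.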
Transcriptions are wrong
Solutions:
- Use a larger Whisper model (medium or large)
- Check audio quality
- Verify audio language matches model
- Some errors are okay - model can handle minor transcription mistakes
Processing is very slow
Normal on CPU. Whisper is slow without GPU.
Solutions:
- Use smaller Whisper model (tiny or base)
- Enable GPU if available
- Process fewer files at once
- Be patient - it's a one-time process
Out of memory during data processing
Solutions:
- Close other applications
- Use smaller Whisper model
- Process files in batches (multiple datasets)
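One way to plan the batches is to split the source folder into fixed-size groups and build one dataset per group. This is a sketch only: the batch size of 20 and the list of extensions are arbitrary assumptions, and make_batches is a hypothetical helper.

```python
from pathlib import Path


def make_batches(audio_dir, batch_size=20, extensions=(".wav", ".mp3", ".flac")):
    """Split the audio files in a folder into lists of at most batch_size files."""
    files = sorted(
        p for p in Path(audio_dir).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )
    return [files[i:i + batch_size] for i in range(0, len(files), batch_size)]
```

Each returned list can then be copied into its own folder and processed as a separate dataset.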
Training Issues
Out of memory during training
Solutions:
- Reduce batch size (try 2 or 1)
- Reduce LoRA rank (try 4)
- Close other GPU applications
- Use 1.5B model instead of 7B
- Restart computer
Check VRAM usage:
nvidia-smi # on NVIDIA GPUs
Training loss not decreasing
Check:
- Dataset quality (listen to samples)
- Transcription accuracy (check metadata.csv)
Try:
- More epochs (5-7 instead of 3)
- Different learning rate (try 1.5e-4)
- Better quality data
Training crashes
Check logs:
cat training_runs/run_TIMESTAMP/train.log
Common causes:
- Out of memory - reduce batch size
- Corrupted dataset - verify files
- Disk full - free up space
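Two of these causes can be checked quickly from a script. The sketch below scans the end of a log for error strings that PyTorch and the operating system commonly print; whether train.log contains exactly these strings is an assumption, and both function names are illustrative.

```python
import shutil
from collections import deque

# Lowercase substrings mapped to the fix suggested in the list above.
KNOWN_ERRORS = {
    "out of memory": "Out of memory: reduce batch size",
    "no space left on device": "Disk full: free up space",
}


def diagnose_log(log_path, tail_lines=200):
    """Return hints for known error strings found in the last lines of a log."""
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        tail = deque(handle, maxlen=tail_lines)  # keep only the final lines
    text = "".join(tail).lower()
    return [hint for needle, hint in KNOWN_ERRORS.items() if needle in text]


def free_gib(path="."):
    """Free disk space, in GiB, on the filesystem that holds path."""
    return shutil.disk_usage(path).free / 2**30
```

For example, diagnose_log("training_runs/run_TIMESTAMP/train.log") returns an empty list when none of the known strings appear, in which case reading the log by hand is still necessary.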
TensorBoard won't load
Wait 30-60 seconds after training starts, then click Refresh.
Check if running:
ps aux | grep tensorboard
Manual start:
tensorboard --logdir training_runs/run_TIMESTAMP/logs --port 6006
Training is very slow
Normal: Training takes 2-6 hours for typical datasets.
ETA in logs:
Step 100/1500 | 2.3s/it
Calculate: (1500 - 100) * 2.3 seconds = 3220 seconds (about 54 minutes) remaining
Speed depends on:
- Dataset size
- Batch size
- GPU speed
- Model size
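The ETA arithmetic above can be automated. This sketch assumes progress lines are shaped exactly like the example "Step 100/1500 | 2.3s/it"; the regular expression and the function name remaining_seconds are illustrative and may need adjusting to the real log format.

```python
import re
from datetime import timedelta

# Matches progress lines shaped like "Step 100/1500 | 2.3s/it".
STEP_PATTERN = re.compile(r"Step (\d+)/(\d+) \| ([\d.]+)s/it")


def remaining_seconds(log_line):
    """Estimate seconds left: (total steps - current step) * seconds per step."""
    match = STEP_PATTERN.search(log_line)
    if match is None:
        raise ValueError(f"not a progress line: {log_line!r}")
    current, total = int(match.group(1)), int(match.group(2))
    return (total - current) * float(match.group(3))


eta = remaining_seconds("Step 100/1500 | 2.3s/it")
print(timedelta(seconds=round(eta)))  # 0:53:40
```

The estimate assumes the seconds-per-step figure stays constant, which tends to hold only roughly in practice.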
Generation Issues
Generated audio sounds robotic
Try:
- Lower CFG scale (1.0-1.2)
- Different reference voice
- Model may be overtrained - reduce epochs next time
Audio doesn't match expected voice
Check:
- LoRA adapter loaded correctly
- Using appropriate reference voice
- Try higher CFG scale (1.5-1.8)
Random background music appears
This is normal VibeVoice behavior. The model was trained on data with background sounds.
Mitigate:
- Use different reference voice (cleaner samples)
- Regenerate (results vary)
- Try different text
Generation is too slow
A real-time factor (RTF) above 1.0, meaning generation takes longer than the audio it produces, is normal for large models on consumer hardware.
Speed up:
- Use GPU instead of CPU
- Use 1.5B instead of 7B
- Shorten text
- Close other applications
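To compare settings, it can help to compute RTF yourself. The sketch assumes RTF is used here in its usual sense of generation time divided by the duration of the audio produced; the numbers are hypothetical.

```python
def real_time_factor(generation_seconds, audio_seconds):
    """RTF = time spent generating / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


# Hypothetical numbers: 45 s of compute for a 30 s clip.
print(real_time_factor(45.0, 30.0))  # 1.5
```

Values below 1.0 mean generation is faster than playback.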
Audio has artifacts or glitches
Try:
- Regenerate (may be random)
- Different reference voice
- Different text
- Check if system is overloaded
General Issues
Interface won't load
Check if running:
ps aux | grep "python -m ovl.cli"
Restart:
./scripts/run.sh # or run.bat on Windows
Network Access
To access from other devices on your network, add --host 0.0.0.0 or --share:
./scripts/run.sh --host 0.0.0.0 # or run.bat --host 0.0.0.0 on Windows
./scripts/run.sh --share # or run.bat --share on Windows
Note: TensorBoard will not work from another device.
Check browser URL:
http://localhost:7860
Changes not appearing
Refresh browser: Ctrl+R or Cmd+R
Clear cache: Shift+Ctrl+R or Shift+Cmd+R
Gradio connection lost
Long-running operations may time out. Check the logs to see whether the process is still running.
Refresh page - training/processing continues in background.
Disk full
Check space:
df -h # Linux/macOS
Clean up:
# Delete old training runs
rm -rf training_runs/run_OLD_TIMESTAMP
# Delete old outputs
rm outputs/generated_OLD_*.wav
Python version issues
Check version:
python3 --version
Need 3.9+. If older, update Python.
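The interpreter inside the virtual environment can differ from the system python3, so it may be worth checking from within Python as well. A minimal sketch, run with the same interpreter that starts the app; python_ok is an illustrative name.

```python
import sys

MINIMUM = (3, 9)  # the 3.9+ requirement stated above


def python_ok(version_info=sys.version_info, minimum=MINIMUM):
    """True when the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


print(sys.executable, sys.version.split()[0], "OK" if python_ok() else "too old")
```

The printed path shows which interpreter was actually checked.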
Platform-Specific Issues
macOS: MPS not available
Apple Silicon Macs should detect MPS automatically.
Check:
python -c "import torch; print(torch.backends.mps.is_available())"
If False: Update PyTorch:
pip install --upgrade torch
Windows: Scripts won't run
Use Command Prompt, not PowerShell.
Or run Python directly:
python -m ovl.cli
Linux: Permission denied
Make scripts executable:
chmod +x scripts/setup.sh scripts/run.sh
Getting More Help
Check logs:
# Application log
cat logs/openvoicelab.log
# Training log
cat training_runs/run_TIMESTAMP/train.log
Error messages usually indicate the problem. Search for the error online or ask on Discord.
Community help:
- Discord - Active community
- GitHub Issues - Bug reports
Provide when asking for help:
- OpenVoiceLab version
- Python version (python3 --version)
- OS and version
- GPU model (if applicable)
- Error message or logs
- Steps to reproduce
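Part of this list can be gathered automatically. The sketch below collects the Python version, the OS, and, if PyTorch is installed, its version and the first CUDA GPU name. It does not know how to find the OpenVoiceLab version, so add that by hand along with the error message and steps to reproduce. The function name environment_report is illustrative.

```python
import importlib.util
import platform
import sys


def environment_report():
    """Gather the basic details worth pasting into a help request."""
    lines = [
        f"Python: {sys.version.split()[0]}",
        f"OS: {platform.system()} {platform.release()}",
    ]
    # Report a missing PyTorch install instead of raising ImportError.
    if importlib.util.find_spec("torch") is None:
        lines.append("PyTorch: not installed")
        return lines

    import torch

    lines.append(f"PyTorch: {torch.__version__}")
    if torch.cuda.is_available():
        lines.append(f"GPU: {torch.cuda.get_device_name(0)}")
    else:
        lines.append("GPU: no CUDA device detected")
    return lines


print("\n".join(environment_report()))
```

Run it with the interpreter from the project's virtual environment so the reported versions match what the app uses.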
Still Having Issues?
If nothing here helps:
- Check FAQ for common questions
- Search GitHub issues
- Ask on Discord
- Open a new GitHub issue with details
Remember: OpenVoiceLab is in beta. Some rough edges are expected. Your feedback helps improve it.