Troubleshooting

Common issues and how to fix them.

Installation Issues

Python not found

Error: python3: command not found

Install Python 3.9+ from python.org or your package manager.

Check version:

bash

python3 --version

Setup script fails

On Linux/macOS:

bash

chmod +x scripts/setup.sh
./scripts/setup.sh

On Windows: Run scripts\setup.bat from Command Prompt, not PowerShell.

PyTorch installation fails

Install manually:

bash

source venv/bin/activate  # or venv\Scripts\activate on Windows
pip install torch torchvision torchaudio

For CUDA support:

bash

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

Port 7860 already in use

Use a different port:

bash

python -m ovl.cli --port 8080
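To pick a port that is actually free rather than guessing, a short probe can help. This is a minimal sketch, not part of OpenVoiceLab: the helper names `is_port_free` and `first_free_port` are hypothetical, and it only checks the local loopback address.

```python
import socket


def is_port_free(port, host="127.0.0.1"):
    """Return True if nothing is currently accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds, i.e. the port is taken.
        return probe.connect_ex((host, port)) != 0


def first_free_port(start=7860, attempts=20):
    """Return the first free port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if is_port_free(port):
            return port
    return None


if __name__ == "__main__":
    print(first_free_port())
```

The printed number can then be passed to `--port`.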

Model Loading Issues

Out of memory when loading model

Solutions:

Close other GPU applications
Use CPU device (slower): select "cpu" in device dropdown
Use smaller model (1.5B instead of 7B)
Restart computer to clear GPU memory

Model download timeout

When downloading for the first time:

The model is 3-6 GB and takes 5-20 minutes to download, depending on connection speed
Let the download finish completely
If it is interrupted, delete the partially downloaded files and retry

Find downloaded models:

bash

ls ~/.cache/huggingface/hub/
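To see how much space each cached model takes, the cache folder can be walked from Python. A minimal sketch, assuming the default cache location shown above; `dir_size_bytes` and `list_cached_models` are hypothetical helper names. Symlinks are skipped so that files linked from several places are not counted twice.

```python
from pathlib import Path


def dir_size_bytes(path):
    """Sum the sizes of all regular files below path (symlinks are skipped)."""
    total = 0
    for entry in Path(path).rglob("*"):
        if entry.is_file() and not entry.is_symlink():
            total += entry.stat().st_size
    return total


def list_cached_models(cache_dir=Path.home() / ".cache" / "huggingface" / "hub"):
    """Return (folder name, size in GB) for each cache folder, largest first."""
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return []
    sizes = [(p.name, dir_size_bytes(p) / 1e9) for p in cache_dir.iterdir() if p.is_dir()]
    return sorted(sizes, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, gigabytes in list_cached_models():
        print(f"{gigabytes:6.2f} GB  {name}")
```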

CUDA not available

Check if GPU is detected:

bash

python -c "import torch; print(torch.cuda.is_available())"

If False:

Update NVIDIA drivers
Reinstall PyTorch with CUDA support
Check if GPU is being used by another process
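A slightly longer check can narrow down which of the causes above applies. This is a sketch only; `cuda_report` is a hypothetical helper, and it uses standard PyTorch attributes (`torch.version.cuda`, `torch.cuda.is_available()`, `torch.cuda.device_count()`).

```python
def cuda_report():
    """Collect the facts that usually explain why CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None here means a CPU-only PyTorch build is installed.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "gpu_count": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` is `None`, reinstalling PyTorch with CUDA support is the likely fix. If it shows a CUDA version but `cuda_available` is `False`, the driver is the more likely culprit.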

LoRA adapter won't load

Check path:

Make sure path points to checkpoints/ folder
Example: training_runs/run_20251008_143022/checkpoints/
Don't include a specific file, just the folder

Verify checkpoint exists:

bash

ls training_runs/run_TIMESTAMP/checkpoints/

Should contain adapter_model.bin or similar files.
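The path rules above can be checked before loading. A minimal sketch: `check_adapter_path` is a hypothetical helper, and the assumption that adapter files start with `adapter_` is taken only from the `adapter_model.bin` example, so adjust the pattern if your checkpoints are named differently.

```python
from pathlib import Path


def check_adapter_path(path):
    """Return a list of problems with a LoRA adapter path (empty means it looks fine)."""
    target = Path(path)
    if not target.exists():
        return [f"{target} does not exist"]
    if target.is_file():
        return [f"{target} is a file; use its parent folder {target.parent} instead"]
    # Assumption: adapter weights are files whose names start with "adapter_".
    if not any(target.glob("adapter_*")):
        return [f"no adapter_* files found in {target}"]
    return []


if __name__ == "__main__":
    for problem in check_adapter_path("training_runs/run_TIMESTAMP/checkpoints/"):
        print(problem)
```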

Data Processing Issues

No audio segments found after VAD

Causes:

Audio is too quiet
Audio is too noisy
No actual speech in audio

Solutions:

Use cleaner audio
Boost audio volume before processing
Try different audio files
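Whether a file is "too quiet" can be measured rather than guessed. The sketch below reports the peak level of a WAV file in dBFS, where 0 is full scale and values far below 0 suggest a quiet recording. It is an illustration only: `peak_dbfs` is a hypothetical helper, it handles 16-bit PCM WAV files only, and it says nothing about the thresholds the VAD step actually uses.

```python
import array
import math
import sys
import wave


def peak_dbfs(wav_path):
    """Return the peak level of a 16-bit PCM WAV file in dBFS (0 is full scale)."""
    with wave.open(wav_path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    # WAV data is little-endian; swap on big-endian machines.
    if sys.byteorder == "big":
        samples.byteswap()
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(peak / 32768)
```

A result of `-inf` means the file contains only silence, which matches the "no actual speech" cause above.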

Transcriptions are wrong

Solutions:

Use a larger Whisper model (medium or large)
Check audio quality
Verify audio language matches model
Some errors are okay - model can handle minor transcription mistakes

Processing is very slow

Normal on CPU. Whisper is slow without GPU.

Solutions:

Use smaller Whisper model (tiny or base)
Enable GPU if available
Process fewer files at once
Be patient - it's a one-time process

Out of memory during data processing

Solutions:

Close other applications
Use smaller Whisper model
Process files in batches (multiple datasets)

Training Issues

Out of memory during training

Solutions:

Reduce batch size (try 2 or 1)
Reduce LoRA rank (try 4)
Close other GPU applications
Use 1.5B model instead of 7B
Restart computer

Check VRAM usage:

bash

nvidia-smi  # on NVIDIA GPUs
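For a compact per-GPU figure, for example to log memory use before and after changing the batch size, `nvidia-smi` can be queried in CSV form. A minimal sketch: `parse_vram` and `vram_usage` are hypothetical helpers, and the code assumes every GPU reports numeric memory values.

```python
import shutil
import subprocess

QUERY = ["nvidia-smi", "--query-gpu=memory.used,memory.total", "--format=csv,noheader,nounits"]


def parse_vram(csv_text):
    """Parse 'used, total' lines (MiB) into a list of (used, total) integer pairs."""
    pairs = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        pairs.append((used, total))
    return pairs


def vram_usage():
    """Return per-GPU (used MiB, total MiB), or [] if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True)
    if result.returncode != 0:
        return []
    return parse_vram(result.stdout)


if __name__ == "__main__":
    for index, (used, total) in enumerate(vram_usage()):
        print(f"GPU {index}: {used} / {total} MiB ({100 * used / total:.0f}% used)")
```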

Training loss not decreasing

Check:

Dataset quality (listen to samples)
Transcription accuracy (check metadata.csv)

Try:

More epochs (5-7 instead of 3)
Different learning rate (try 1.5e-4)
Better quality data

Training crashes

Check logs:

bash

cat training_runs/run_TIMESTAMP/train.log

Common causes:

Out of memory - reduce batch size
Corrupted dataset - verify files
Disk full - free up space

TensorBoard won't load

Wait 30-60 seconds after training starts, then click Refresh.

Check if running:

bash

ps aux | grep tensorboard

Manual start:

bash

tensorboard --logdir training_runs/run_TIMESTAMP/logs --port 6006

Training is very slow

Normal: Training takes 2-6 hours for typical datasets.

ETA in logs:

Step 100/1500 | 2.3s/it

Calculate: (1500 - 100) * 2.3 seconds = 3220 seconds, or about 54 minutes remaining
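The same arithmetic as a small helper, using the step counts and seconds per iteration from the log line above. The function names are hypothetical; this is a sketch, not something OpenVoiceLab provides.

```python
from datetime import timedelta


def eta_seconds(current_step, total_steps, seconds_per_iteration):
    """Remaining time = steps left multiplied by seconds per step."""
    return (total_steps - current_step) * seconds_per_iteration


def format_eta(seconds):
    """Render a duration in seconds as H:MM:SS."""
    return str(timedelta(seconds=round(seconds)))


if __name__ == "__main__":
    remaining = eta_seconds(100, 1500, 2.3)
    print(format_eta(remaining))  # 0:53:40
```

The estimate assumes the seconds-per-iteration figure stays constant, so expect it to drift if other programs compete for the GPU.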

Speed depends on:

Dataset size
Batch size
GPU speed
Model size

Generation Issues

Generated audio sounds robotic

Try:

Lower CFG scale (1.0-1.2)
Different reference voice
Model may be overtrained - reduce epochs next time

Audio doesn't match expected voice

Check:

LoRA adapter loaded correctly
Using appropriate reference voice
Try higher CFG scale (1.5-1.8)

Random background music appears

This is normal VibeVoice behavior. The model was trained on data with background sounds.

Mitigate:

Use different reference voice (cleaner samples)
Regenerate (results vary)
Try different text

Generation is too slow

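RTF (real-time factor: generation time divided by the duration of the generated audio) > 1.0 is normal for large models on consumer hardware. It means generation takes longer than the audio lasts.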
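As a worked example, and assuming the usual definition of RTF as generation time divided by audio duration (this page does not spell the formula out), the value can be computed like this. `real_time_factor` is a hypothetical helper name.

```python
def real_time_factor(generation_seconds, audio_seconds):
    """Return generation time divided by audio duration; above 1.0 is slower than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


if __name__ == "__main__":
    # 30 seconds of compute for a 20-second clip
    print(real_time_factor(30, 20))  # 1.5
```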

Speed up:

Use GPU instead of CPU
Use 1.5B instead of 7B
Shorten text
Close other applications

Audio has artifacts or glitches

Try:

Regenerate (may be random)
Different reference voice
Different text
Check if system is overloaded

General Issues

Interface won't load

Check if running:

bash

ps aux | grep "python -m ovl.cli"

Restart:

bash

./scripts/run.sh  # or run.bat on Windows

Check browser URL:

http://localhost:7860

Network Access

To access from other devices on your network, add --host 0.0.0.0 or --share:

bash

./scripts/run.sh --host 0.0.0.0  # or run.bat --host 0.0.0.0 on Windows
./scripts/run.sh --share          # or run.bat --share on Windows

Note: TensorBoard will not work from another device.

Changes not appearing

Refresh browser: Ctrl+R or Cmd+R

Hard refresh (bypasses the browser cache): Ctrl+Shift+R or Cmd+Shift+R

Gradio connection lost

Long-running operations may time out. Check the logs to see whether the process is still running.

Refresh page - training/processing continues in background.

Disk full

Check space:

bash

df -h  # Linux/macOS
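`df` is not available in Command Prompt on Windows. A cross-platform alternative is Python's standard `shutil.disk_usage`. This is a minimal sketch, and `free_space_gb` is a hypothetical helper name.

```python
import shutil


def free_space_gb(path="."):
    """Return free space in GB on the drive that contains path."""
    return shutil.disk_usage(path).free / 1e9


if __name__ == "__main__":
    print(f"{free_space_gb():.1f} GB free")
```

Run it from the project folder so that it reports the drive where training runs and outputs are written.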

Clean up:

bash

# Delete old training runs
rm -rf training_runs/run_OLD_TIMESTAMP

# Delete old outputs
rm outputs/generated_OLD_*.wav

Python version issues

Check version:

bash

python3 --version

Need 3.9+. If older, update Python.

Platform-Specific Issues

macOS: MPS not available

Apple Silicon Macs should detect MPS automatically.

Check:

bash

python -c "import torch; print(torch.backends.mps.is_available())"

If False: Update PyTorch:

bash

pip install --upgrade torch

Windows: Scripts won't run

Use Command Prompt, not PowerShell.

Or run Python directly:

cmd

python -m ovl.cli

Linux: Permission denied

Make scripts executable:

bash

chmod +x scripts/setup.sh scripts/run.sh

Getting More Help

Check logs:

bash

# Application log
cat logs/openvoicelab.log

# Training log
cat training_runs/run_TIMESTAMP/train.log

Error messages usually indicate the problem. Search for the error online or ask on Discord.

Community help:

Discord - Active community
GitHub Issues - Bug reports

Provide when asking for help:

OpenVoiceLab version
Python version (python3 --version)
OS and version
GPU model (if applicable)
Error message or logs
Steps to reproduce
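Several of the items above can be gathered in one go. The following is a sketch with a hypothetical `environment_report` helper. It does not collect the OpenVoiceLab version, the error message, or the steps to reproduce, which still need to be added by hand.

```python
import platform


def environment_report():
    """Return Python, OS, PyTorch and GPU details as text to paste into a help request."""
    lines = [
        f"Python: {platform.python_version()}",
        f"OS: {platform.platform()}",
        f"Machine: {platform.machine()}",
    ]
    try:
        import torch
    except ImportError:
        lines.append("PyTorch: not installed")
    else:
        lines.append(f"PyTorch: {torch.__version__}")
        if torch.cuda.is_available():
            lines.append(f"GPU: {torch.cuda.get_device_name(0)}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```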

Still Having Issues?

If nothing here helps:

Check FAQ for common questions
Search GitHub issues
Ask on Discord
Open a new GitHub issue with details

Remember: OpenVoiceLab is in beta. Some rough edges are expected. Your feedback helps improve it.
