A modular Retrieval-based Voice Conversion (RVC) framework with a Gradio UI, training capabilities, and audio processing tools.
- Voice Conversion: High-quality voice conversion with multiple pitch extraction methods
- Model Training: Complete training pipeline for creating custom RVC models
- Real-time Processing: Low-latency real-time voice conversion support
- Web UI: Intuitive Gradio-based web interface
- CLI Support: Command-line interface for scripting and automation
- API Access: Python API for programmatic access
- Audio Separation: Built-in tools for vocal/instrument separation
- Text-to-Speech: Integration with edge-tts for TTS-based voice conversion
Install with pip:

```bash
pip install git+https://github.com/ArkanDash/Advanced-RVC-Inference.git
```

For CUDA-enabled GPUs:

```bash
pip install git+https://github.com/ArkanDash/Advanced-RVC-Inference.git#egg=advanced-rvc-inference[gpu]
```

To install from source:

```bash
git clone https://github.com/ArkanDash/Advanced-RVC-Inference.git
cd Advanced-RVC-Inference
pip install -e .
```

To install with the `dev` extra:

```bash
pip install git+https://github.com/ArkanDash/Advanced-RVC-Inference.git#egg=advanced-rvc-inference[dev]
```

Launch the Gradio web UI:
```bash
rvc-gui
# or
python -m advanced_rvc_inference.gui
```

The web interface will be available at http://localhost:7860.
Run voice conversion from the command line:
```bash
rvc-cli infer --model path/to/model.pth --input audio.wav --output converted.wav --pitch 0
```

View help:
```bash
rvc-cli --help
rvc-cli infer --help
```

Or use the Python API:

```python
from advanced_rvc_inference import RVCInference

# Initialize the inference engine
rvc = RVCInference(device="cuda:0")

# Load a model
rvc.load_model("path/to/model.pth")

# Run inference
audio = rvc.infer("input.wav", pitch_change=0, output_path="output.wav")

# Or use batch processing
audio_files = rvc.infer_batch(
    input_dir="input_folder",
    output_dir="output_folder",
    pitch_change=2,
    format="wav",
)

# Cleanup
rvc.unload_model()
```

| Command | Description |
|---|---|
| `rvc-cli infer` | Run voice conversion inference on a single audio file |
| `rvc-cli infer-batch` | Run batch voice conversion on multiple files |
| `rvc-cli train` | Train RVC models (use web UI for full features) |
| `rvc-cli dataset` | Create and manage training datasets |
| `rvc-cli preprocess` | Preprocess training data |
| `rvc-cli extract` | Extract features for training |
| `rvc-cli index` | Create index for feature retrieval |
| `rvc-cli separate` | Separate music into vocals and instruments |
| `rvc-cli reference` | Create reference audio for training |
| `rvc-cli tts` | Text-to-speech voice conversion |
| `rvc-cli serve` | Launch the web interface |
| `rvc-cli info` | Show system information |
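
For scripting and automation, the commands in the table can also be driven from Python. The following is a minimal sketch, assuming `rvc-cli` is on your `PATH`; `build_infer_command` is a hypothetical helper, not part of the package, and it uses only the `--model`, `--input`, `--output`, and `--pitch` flags that appear in this README.

```python
import shlex


def build_infer_command(model, input_path, output_path, pitch=0):
    """Build the argument list for one `rvc-cli infer` call (hypothetical helper)."""
    return [
        "rvc-cli", "infer",
        "--model", str(model),
        "--input", str(input_path),
        "--output", str(output_path),
        "--pitch", str(pitch),
    ]


cmd = build_infer_command("path/to/model.pth", "audio.wav", "converted.wav")

# Print the command as it would be typed in a shell.
print(shlex.join(cmd))

# To actually run it (requires the package to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with file names that contain spaces.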
Run voice conversion on a single audio file:
```bash
rvc-cli infer --model path/to/model.pth --input audio.wav --output converted.wav
```

With pitch shift (one octave up):

```bash
rvc-cli infer --model vocals.pth --input audio.wav --pitch 12 --output output.wav
```

Process multiple audio files at once:

```bash
rvc-cli infer-batch --model model.pth --input_dir ./songs --output_dir ./converted
```

Separate vocals from instrumental tracks:

```bash
rvc-cli separate --input song.mp3 --output_dir ./separated
```

Launch the Gradio web UI:

```bash
rvc-cli serve --port 7860
```

View help for any command:

```bash
rvc-cli --help
rvc-cli infer --help
rvc-cli separate --help
```
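
Since `--pitch 12` corresponds to one octave up, the pitch value appears to be expressed in semitones. The sketch below shows the standard equal-temperament relation between a semitone shift and the resulting frequency ratio. It only illustrates what a given value means musically; how the tool applies the shift internally is not covered here, and `semitones_to_ratio` is a hypothetical helper name.

```python
def semitones_to_ratio(semitones):
    """Frequency ratio for a pitch shift in 12-tone equal temperament."""
    return 2.0 ** (semitones / 12.0)


# A few common shifts: octave down, unchanged, perfect fifth up, octave up.
for shift in (-12, 0, 7, 12):
    print(f"--pitch {shift:+d} -> frequency x{semitones_to_ratio(shift):.3f}")
```

A shift of 12 doubles the fundamental frequency and a shift of -12 halves it, which is a handy sanity check when choosing a value for a source voice and target model in different registers.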
## Configuration
### Environment Variables
| Variable | Description | Default |
|----------|-------------|---------|
| `ARVC_ASSETS_PATH` | Path to asset directory | Package assets folder |
| `ARVC_CONFIGS_PATH` | Path to configs directory | Package configs folder |
| `ARVC_WEIGHTS_PATH` | Path to model weights | assets/weights |
| `ARVC_LOGS_PATH` | Path to logs directory | assets/logs |
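
The lookup pattern implied by the table (use the environment variable when set, otherwise fall back to a default) can be sketched as follows. The variable names are taken from the table; the fallback paths and the `resolve_path` helper are assumptions for illustration and may not match how the package resolves its directories.

```python
import os
from pathlib import Path

# Fallback locations are placeholders (assumptions), not the package's real defaults.
DEFAULTS = {
    "ARVC_ASSETS_PATH": Path("assets"),
    "ARVC_CONFIGS_PATH": Path("configs"),
    "ARVC_WEIGHTS_PATH": Path("assets") / "weights",
    "ARVC_LOGS_PATH": Path("assets") / "logs",
}


def resolve_path(name, env=None):
    """Return the directory for `name`, preferring the environment over the default."""
    env = os.environ if env is None else env
    value = env.get(name)
    # An unset or empty variable falls back to the default.
    return Path(value) if value else DEFAULTS[name]


# With no override, the default is used.
print(resolve_path("ARVC_WEIGHTS_PATH", env={}))

# With the variable set, the override wins.
print(resolve_path("ARVC_WEIGHTS_PATH", env={"ARVC_WEIGHTS_PATH": "/data/rvc/weights"}))
```

Passing `env` explicitly keeps the helper easy to test without modifying the real process environment.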
### Configuration File
Configuration is managed through `advanced_rvc_inference/configs/config.json`:
```json
{
  "device": "cuda:0",
  "fp16": true,
  "app_port": 7860,
  "language": "vi-VN",
  "theme": "NoCrypt/miku",
  "uvr_path": "advanced_rvc_inference/assets/audios"
}
```

If the GPU is not detected, ensure you have CUDA installed and PyTorch with CUDA support:
```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```

If you run out of GPU memory, reduce the batch size or use CPU mode:

```python
rvc = RVCInference(device="cpu")
```

Contributions are welcome! Please read our Contributing Guide for details.
This project is licensed under the MIT License - see the LICENSE file for details.
The use of the converted voice for the following purposes is prohibited:
- Criticizing or attacking individuals
- Advocating for or opposing specific political positions, religions, or ideologies
- Publicly displaying strongly stimulating expressions without proper zoning
- Selling of voice models and generated voice clips
- Impersonation of the original owner of the voice with malicious intentions
- Fraudulent purposes that lead to identity theft or fraudulent phone calls
| Repository | Owner |
|---|---|
| Vietnamese-RVC | PhamHuynhAnh16 |
| Applio | IAHispano |
For issues and feature requests, please use the GitHub Issues page.
Made with ❤️ by ArkanDash