LunaVox Runtime Models
This directory contains converted runtime artifacts (ONNX and GGUF) for various Qwen3-TTS model variants. These artifacts are generated from original Hugging Face checkpoints to be used by the LunaVox inference engine.
Downloading and Setup
1. Automatic Source Download in pull-model
lunavox pull-model is the only model preparation entrypoint.
If the required Hugging Face source weights are missing, the CLI prompts (in English) for confirmation and then downloads them.
2. Model Cache
Original model weights are cached in the standard Hugging Face directory:
~/.cache/huggingface/hub/models--Qwen--...
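To check what is already cached before running pull-model, the cache location can be resolved with a few lines of Python. This is a minimal sketch using only the standard library. It assumes the usual Hugging Face environment overrides (HF_HUB_CACHE, HF_HOME, XDG_CACHE_HOME) apply. The helper names hf_hub_cache_dir and cached_qwen_models are hypothetical and not part of LunaVox.

```python
import os
from pathlib import Path


def hf_hub_cache_dir(env=None) -> Path:
    """Resolve the Hugging Face hub cache directory (hypothetical helper)."""
    env = os.environ if env is None else env
    # An explicit hub cache override takes priority.
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    # HF_HOME relocates the whole Hugging Face directory; the hub lives under it.
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "hub"
    # Otherwise fall back to the standard ~/.cache/huggingface/hub location.
    xdg = env.get("XDG_CACHE_HOME")
    base = Path(xdg).expanduser() if xdg else Path.home() / ".cache"
    return base / "huggingface" / "hub"


def cached_qwen_models(cache_dir: Path) -> list[str]:
    """List cached model folders whose names start with models--Qwen--."""
    if not cache_dir.is_dir():
        return []
    return sorted(p.name for p in cache_dir.glob("models--Qwen--*") if p.is_dir())


if __name__ == "__main__":
    cache = hf_hub_cache_dir()
    print(cache)
    for name in cached_qwen_models(cache):
        print(" ", name)
```

An empty listing only means no Qwen source weights are cached yet; per the section above, pull-model offers to download them.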
Directory Structure
Each model variant subfolder (e.g., models/base_small/) typically contains:
- qwen3_tts_talker.q5_k.gguf: Quantized Talker model (Llama-based).
- qwen3_tts_predictor.q8_0.gguf: Quantized Predictor model (Llama-based).
- qwen3_tts_codec_encoder.fp16.onnx: Audio Tokenizer (Mimi-based).
- qwen3_tts_speaker_encoder.fp16.onnx: Reference Audio Speaker Encoder.
- qwen3_tts_decoder.fp16.onnx: Audio Decoder (Mimi-based).
- embeddings/: Projected text and codec embeddings.
- tokenizer.json: Hugging Face text tokenizer configuration.
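A quick way to confirm that a variant folder is complete is to compare it against the layout listed above. The following is a minimal sketch, not LunaVox's own validation. The missing_artifacts helper is hypothetical, and since the layout is described as typical, a given variant may legitimately differ (for example in quantization suffixes).

```python
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_FILES = [
    "qwen3_tts_talker.q5_k.gguf",
    "qwen3_tts_predictor.q8_0.gguf",
    "qwen3_tts_codec_encoder.fp16.onnx",
    "qwen3_tts_speaker_encoder.fp16.onnx",
    "qwen3_tts_decoder.fp16.onnx",
    "tokenizer.json",
]
EXPECTED_DIRS = ["embeddings"]


def missing_artifacts(variant_dir) -> list[str]:
    """Return the expected entries that are absent from variant_dir."""
    root = Path(variant_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    # Directories are reported with a trailing slash to tell them apart from files.
    missing += [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing


if __name__ == "__main__":
    for entry in missing_artifacts("models/base_small"):
        print("missing:", entry)
```

An empty result means every listed artifact is present; it does not verify that the files themselves are valid.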
Available Variants
- base: Qwen3-TTS-12Hz-1.7B-Base
- base_small: Qwen3-TTS-12Hz-0.6B-Base
- custom: Qwen3-TTS-12Hz-1.7B-CustomVoice
- custom_small: Qwen3-TTS-12Hz-0.6B-CustomVoice
- design: Qwen3-TTS-12Hz-1.7B-VoiceDesign
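For scripts that need to translate a variant folder name into its source model, the list above can be expressed as a lookup table. This is an illustrative sketch only. The VARIANTS table and the source_model helper are hypothetical names, and the values are model names exactly as listed here, without a Hugging Face organization prefix.

```python
# Variant folder name -> source model name, as listed above.
VARIANTS = {
    "base": "Qwen3-TTS-12Hz-1.7B-Base",
    "base_small": "Qwen3-TTS-12Hz-0.6B-Base",
    "custom": "Qwen3-TTS-12Hz-1.7B-CustomVoice",
    "custom_small": "Qwen3-TTS-12Hz-0.6B-CustomVoice",
    "design": "Qwen3-TTS-12Hz-1.7B-VoiceDesign",
}


def source_model(variant: str) -> str:
    """Look up the source model name for a variant folder (hypothetical helper)."""
    try:
        return VARIANTS[variant]
    except KeyError:
        known = ", ".join(sorted(VARIANTS))
        raise ValueError(f"unknown variant {variant!r}; expected one of: {known}") from None


if __name__ == "__main__":
    for name in VARIANTS:
        print(f"{name}: {source_model(name)}")
```

Raising a ValueError that names the known variants keeps a typo such as base-small from failing silently further down a pipeline.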