ATTS1HG1: High-Performance llama.cpp Implementation of XTTS-v2

ATTS1HG1 v1.1 is a high-speed, native C++ implementation of the Coqui XTTS-v2 model, utilizing the llama.cpp library. It features a custom integrated HiFiGAN vocoder optimized for Vulkan and CPU inference.

Source Code & GUI Base Model Backend
GitHub: ATTS1HG1 Coqui XTTS-v2 GGML / Vulkan

🚀 Key Features

  • Blazing Fast: Generates a short phrase in under 0.5s on a consumer GPU (RTX 3090, Vulkan) and in about 1.0s on CPU.
  • Vulkan Support: Fully optimized HiFiGAN vocoder running on Vulkan (compatible with NVIDIA, AMD, Intel iGPUs).
  • Lightweight: Native C++ application built on llama.cpp, with no heavy Python dependencies.
  • Multi-Language: Supports 17 languages.
  • Voices: Supports 58 speakers (similar to XTTS).

🌍 Supported Languages

The model supports the following 17 languages:

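| Code | Language   | Native Name |
|------|------------|-------------|
| en   | English    | English     |
| es   | Spanish    | Español     |
| fr   | French     | Français    |
| de   | German     | Deutsch     |
| it   | Italian    | Italiano    |
| pt   | Portuguese | Português   |
| pl   | Polish     | Polski      |
| tr   | Turkish    | Türkçe      |
| ru   | Russian    | Русский     |
| nl   | Dutch      | Nederlands  |
| cs   | Czech      | Čeština     |
| ar   | Arabic     | العربية     |
| zh   | Chinese    | 中文        |
| ja   | Japanese   | 日本語      |
| hu   | Hungarian  | Magyar      |
| ko   | Korean     | 한국어      |
| hi   | Hindi      | हिन्दी       |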

⚡ Performance

Benchmarks measure total latency for synthesizing a short test phrase ("Bonjour le monde") with the C++ client:

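| Device            | Backend    | Latency (Total) | Note                                     |
|-------------------|------------|-----------------|------------------------------------------|
| NVIDIA RTX 3090   | Vulkan     | ~0.47s          | 🚀 Recommended                           |
| Intel iGPU        | Vulkan     | ~1.40s          | Good for laptops                         |
| CPU (Ryzen/Intel) | CPU (AVX2) | ~1.02s          | Solid fallback                           |
| NVIDIA RTX 3090   | CUDA       | ~1.45s          | Slower on HiFiGAN due to kernel overhead |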

Note: For the HiFiGAN stage of the pipeline, the Vulkan backend is significantly faster than CUDA, owing to optimized command buffers and lower kernel launch overhead for small convolutions.
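
To make the comparison concrete, the relative speedup between two rows of the table can be computed directly. The following is a minimal sketch, not part of ATTS1HG1: the BenchRow struct and the speedup and benchmark_rows helpers are hypothetical names introduced for illustration, and the latencies are simply copied from the table above.

```cpp
#include <string>
#include <vector>

// One row of the benchmark table (hypothetical struct for illustration).
struct BenchRow {
    std::string device;
    std::string backend;
    double latency_s;  // total latency in seconds, as listed in the table
};

// Latencies copied from the benchmark table; approximate values.
std::vector<BenchRow> benchmark_rows() {
    return {
        {"NVIDIA RTX 3090", "Vulkan", 0.47},
        {"Intel iGPU", "Vulkan", 1.40},
        {"CPU (Ryzen/Intel)", "CPU (AVX2)", 1.02},
        {"NVIDIA RTX 3090", "CUDA", 1.45},
    };
}

// Speedup of `fast` relative to `slow`; values above 1 mean `fast` is quicker.
double speedup(const BenchRow& slow, const BenchRow& fast) {
    return slow.latency_s / fast.latency_s;
}
```

Calling speedup(rows[3], rows[0]) on the rows above divides the CUDA latency by the Vulkan latency for the same RTX 3090, which gives a ratio of roughly 3.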

🔧 Key Technical Upgrades

  • Split Model Architecture (GPT2 + HiFiGAN):

    • The text-to-latent model (GPT2) and the vocoder (HiFiGAN) are now separate GGUF files.

    • This allows users to run HiFiGAN on Vulkan for lower latency while keeping GPT2 on CUDA.

    • Result: Up to 3x faster audio generation on RTX-series GPUs.

  • Advanced Text Preprocessing:

    • Automatic Language Detection (New!): ATTS now includes a LanguageDetector class that analyzes the input text using three signals: script ranges (Unicode blocks), a dictionary of common words, and n-grams (suffixes and patterns). It supports all 17 languages, with confidence scoring and fallback logic. In "Auto Mode" (LangDirIndex == 0), the system selects the synthesis language automatically.

    • MeCab Integration: Native Japanese tokenization and segmentation for natural prosody.

    • Romanization: Automatic romanization for Chinese (Pinyin), Japanese (Romaji), and Korean (Revised Romanization).

    • Num2Words: Converts numbers (e.g., "123") into words ("one hundred twenty-three") across all 17 supported languages.
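
The language detection described above combines script ranges with a dictionary stage and a fallback. The sketch below illustrates that general idea only. It is not the project's LanguageDetector: the function names, the tiny word lists, the choice of English as the fallback, and the confidence formula are all assumptions made for this example, the n-gram stage is omitted, and only a handful of the 17 languages are covered.

```cpp
#include <cctype>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Decode UTF-8 into code points (minimal; assumes well-formed input).
std::vector<char32_t> decode_utf8(const std::string& s) {
    std::vector<char32_t> out;
    for (size_t i = 0; i < s.size();) {
        unsigned char c = static_cast<unsigned char>(s[i++]);
        char32_t cp;
        int extra;
        if (c < 0x80) { cp = c; extra = 0; }
        else if ((c >> 5) == 0x6) { cp = c & 0x1F; extra = 1; }
        else if ((c >> 4) == 0xE) { cp = c & 0x0F; extra = 2; }
        else { cp = c & 0x07; extra = 3; }
        for (int k = 0; k < extra && i < s.size(); ++k, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(s[i]) & 0x3F);
        out.push_back(cp);
    }
    return out;
}

// Stage 1: script ranges (Unicode blocks). Returns "" for Latin-only text.
std::string script_language(const std::vector<char32_t>& cps) {
    bool kana = false, han = false;
    for (char32_t cp : cps) {
        if (cp >= 0x3040 && cp <= 0x30FF) kana = true;          // Hiragana/Katakana
        else if (cp >= 0x4E00 && cp <= 0x9FFF) han = true;      // CJK ideographs
        else if (cp >= 0xAC00 && cp <= 0xD7A3) return "ko";     // Hangul syllables
        else if (cp >= 0x0600 && cp <= 0x06FF) return "ar";     // Arabic
        else if (cp >= 0x0900 && cp <= 0x097F) return "hi";     // Devanagari
        else if (cp >= 0x0400 && cp <= 0x04FF) return "ru";     // Cyrillic
    }
    if (kana) return "ja";  // kana marks Japanese even when kanji are present
    if (han) return "zh";
    return "";
}

struct Detection {
    std::string lang;
    double confidence;  // fraction of words matched; 1.0 for a script match
};

// Stage 2: dictionary of common words, with a fallback to "en" (assumption).
Detection detect_language(const std::string& text) {
    std::string by_script = script_language(decode_utf8(text));
    if (!by_script.empty()) return {by_script, 1.0};

    static const std::map<std::string, std::vector<std::string>> common = {
        {"en", {"the", "and", "is", "hello"}},
        {"fr", {"le", "et", "est", "bonjour"}},
        {"es", {"el", "los", "y", "hola"}},
        {"de", {"der", "und", "ist", "hallo"}},
    };
    std::map<std::string, int> hits;
    int words = 0;
    std::istringstream in(text);
    std::string w;
    while (in >> w) {
        for (char& ch : w)
            ch = static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
        ++words;
        for (const auto& entry : common)
            for (const auto& cw : entry.second)
                if (w == cw) ++hits[entry.first];
    }
    Detection best{"en", 0.0};  // fallback when nothing matches
    for (const auto& h : hits) {
        double conf = static_cast<double>(h.second) / words;
        if (conf > best.confidence) best = {h.first, conf};
    }
    return best;
}
```

In this sketch, detect_language("Bonjour le monde") matches two of three words against the French list, while text in a non-Latin script is decided by the script stage alone. A real detector would also strip punctuation and use much larger word lists.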
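
The number expansion step can be illustrated for English alone. This is a minimal sketch under stated assumptions, not the project's Num2Words code: the function names are hypothetical, only integers from 0 to 999999 are handled, and longer digit runs are left unchanged.

```cpp
#include <cctype>
#include <string>

// Spell out 0..999 in English.
std::string below_thousand(int n) {
    static const char* ones[] = {
        "zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"};
    static const char* tens[] = {
        "", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"};
    std::string out;
    if (n >= 100) {
        out += std::string(ones[n / 100]) + " hundred";
        n %= 100;
        if (n == 0) return out;
        out += " ";
    }
    if (n < 20) return out + ones[n];
    out += tens[n / 10];
    if (n % 10) out += std::string("-") + ones[n % 10];
    return out;
}

// Spell out 0..999999 in English (the range is an assumption of this sketch).
std::string num_to_words_en(int n) {
    if (n < 1000) return below_thousand(n);
    std::string out = below_thousand(n / 1000) + " thousand";
    if (n % 1000) out += " " + below_thousand(n % 1000);
    return out;
}

// Replace each run of ASCII digits in `text` with its spelled-out form.
std::string expand_numbers_en(const std::string& text) {
    std::string out;
    for (size_t i = 0; i < text.size();) {
        if (std::isdigit(static_cast<unsigned char>(text[i]))) {
            size_t j = i;
            while (j < text.size() &&
                   std::isdigit(static_cast<unsigned char>(text[j]))) ++j;
            std::string digits = text.substr(i, j - i);
            // Runs longer than six digits are outside this sketch's range.
            out += digits.size() <= 6 ? num_to_words_en(std::stoi(digits)) : digits;
            i = j;
        } else {
            out += text[i++];
        }
    }
    return out;
}
```

With this sketch, num_to_words_en(123) yields "one hundred twenty-three", matching the example in the feature description. Other languages need their own rules for grammar and word order, which is why a per-language implementation is required.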

🛠️ Usage

This repository contains the pre-compiled application archives (.zip) and the converted model weights (.gguf) required by the ATTS1HG1 software.

Zero-Configuration Deployment:

  1. Download the pre-compiled GUI application (.zip) and the model files from this repository.

  2. Unzip the archive and run atts_gui_app_v1.1.exe. No C++ compilation, Python dependencies, or pip install is required.

  3. Load the model in the GUI or CLI and select Vulkan for best performance.

🧠 Model Files (GGUF via llama.cpp)

  • atts1hg1_q6_k.gguf (Legacy: Combined GPT2+HiFiGAN)

  • ATTS1_q6_k.gguf (New: GPT2 Only)

  • hifigan1_FP16.gguf (New: HiFiGAN Only)
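
Because the two new files are loaded separately, each component can in principle be assigned its own backend. The sketch below shows one way such a selection could be expressed. It is purely illustrative: the Backend enum, the config structs, and make_split_config are hypothetical names and not the ATTS1HG1 API, and falling back to CPU for the vocoder when Vulkan is unavailable is an arbitrary choice for this example. Only the two file names come from the list above.

```cpp
#include <string>

// Hypothetical backend identifiers for this sketch.
enum class Backend { CPU, Vulkan, CUDA };

// One model component: which GGUF file to load and where to run it.
struct ComponentConfig {
    std::string gguf_path;
    Backend backend;
};

// The split pipeline: GPT2 (text-to-latent) plus HiFiGAN (vocoder).
struct PipelineConfig {
    ComponentConfig gpt2;
    ComponentConfig hifigan;
};

// Pick a backend per component from the hardware that is available.
// The preference order is an assumption: GPT2 prefers CUDA, and HiFiGAN
// prefers Vulkan, following the benchmark notes in this document.
PipelineConfig make_split_config(bool has_cuda, bool has_vulkan) {
    PipelineConfig cfg;
    Backend gpt2_backend = has_cuda     ? Backend::CUDA
                           : has_vulkan ? Backend::Vulkan
                                        : Backend::CPU;
    Backend vocoder_backend = has_vulkan ? Backend::Vulkan : Backend::CPU;
    cfg.gpt2 = {"ATTS1_q6_k.gguf", gpt2_backend};
    cfg.hifigan = {"hifigan1_FP16.gguf", vocoder_backend};
    return cfg;
}
```

On a machine with both CUDA and Vulkan, this sketch assigns GPT2 to CUDA and HiFiGAN to Vulkan, which is the combination the split architecture is described as enabling.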

🔊 Sample Audio Files

Test the quality with these pre-generated samples covering various speakers and languages.

| Filename          | Speaker           | Language   | Download    |
|-------------------|-------------------|------------|-------------|
| Vjollca Johnnie   | Vjollca Johnnie   | French     | ⬇️ Download |
| Gitta Nikolina    | Gitta Nikolina    | Arabic     | ⬇️ Download |
| Damien Black      | Damien Black      | Chinese    | ⬇️ Download |
| Asya Anara        | Asya Anara        | English    | ⬇️ Download |
| Vjollca Johnnie   | Vjollca Johnnie   | Spanish    | ⬇️ Download |
| Nova Hogarth      | Nova Hogarth      | Hindi      | ⬇️ Download |
| Adde Michal       | Adde Michal       | Dutch      | ⬇️ Download |
| Craig Gutsy       | Craig Gutsy       | Czech      | ⬇️ Download |
| Dionisio Schuyler | Dionisio Schuyler | German     | ⬇️ Download |
| Dionisio Schuyler | Dionisio Schuyler | Italian    | ⬇️ Download |
| Ludvig Milivoj    | Ludvig Milivoj    | Hungarian  | ⬇️ Download |
| Royston Min       | Royston Min       | Portuguese | ⬇️ Download |
| Viktor Eka        | Viktor Eka        | Polish     | ⬇️ Download |
| Abrahan Mack      | Abrahan Mack      | Russian    | ⬇️ Download |
| Abrahan Mack      | Abrahan Mack      | Turkish    | ⬇️ Download |
| Viktor Menelaos   | Viktor Menelaos   | Korean     | ⬇️ Download |
| Zacharie Aimilios | Zacharie Aimilios | Japanese   | ⬇️ Download |

📜 License

This project uses the weights from Coqui XTTS-v2, which is licensed under the Coqui Public Model License (CPML).


Credits: Based on the excellent work by Coqui.ai and the GGML library by ggerganov.

Model details: GGUF format, 0.4B params, architecture atts1.