ATTS1HG1: High-Performance llama.cpp Implementation of XTTS-v2
ATTS1HG1 v1.1 is a high-speed, native C++ implementation of the Coqui XTTS-v2 model, utilizing the llama.cpp library. It features a custom integrated HiFiGAN vocoder optimized for Vulkan and CPU inference.
| Source Code & GUI | Base Model | Backend |
|---|---|---|
| GitHub: ATTS1HG1 | Coqui XTTS-v2 | GGML / Vulkan |
🚀 Key Features
- Blazing Fast: Generates a short phrase in under 0.5 s on a consumer GPU (RTX 3090, Vulkan) and in about 1.0 s on CPU.
- Vulkan Support: Fully optimized HiFiGAN vocoder running on Vulkan (compatible with NVIDIA, AMD, Intel iGPUs).
- Lightweight: Native C++ application built on llama.cpp, with no heavy Python dependencies.
- Multi-Language: Supports 17 languages.
- Voices: Supports 58 speakers (similar to XTTS).
🌍 Supported Languages
The model supports the following 17 languages:
| Code | Language | Native Name |
|---|---|---|
| en | English | English |
| es | Spanish | Español |
| fr | French | Français |
| de | German | Deutsch |
| it | Italian | Italiano |
| pt | Portuguese | Português |
| pl | Polish | Polski |
| tr | Turkish | Türkçe |
| ru | Russian | Русский |
| nl | Dutch | Nederlands |
| cs | Czech | Čeština |
| ar | Arabic | العربية |
| zh | Chinese | 中文 |
| ja | Japanese | 日本語 |
| hu | Hungarian | Magyar |
| ko | Korean | 한국어 |
| hi | Hindi | हिन्दी |
⚡ Performance
Benchmarks measure the total latency of synthesizing a short phrase ("Bonjour le monde") with the C++ client:
| Device | Backend | Latency (Total) | Note |
|---|---|---|---|
| NVIDIA RTX 3090 | Vulkan | ~0.47s | 🚀 Recommended |
| Intel iGPU | Vulkan | ~1.40s | Good for laptops |
| CPU (Ryzen/Intel) | CPU (AVX2) | ~1.02s | Solid fallback |
| NVIDIA RTX 3090 | CUDA | ~1.45s | Slower on HiFiGAN due to kernel overhead |
Note: The Vulkan backend is significantly faster for the HiFiGAN part of the pipeline compared to CUDA due to optimized command buffers and reduced kernel launch overhead for small convolutions.
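The ratios implied by the table can be checked with a few lines of arithmetic. The sketch below only restates the published latencies; the constant and function names are illustrative and do not come from the ATTS1HG1 source.

```cpp
#include <cassert>

// Total latencies in seconds, copied from the benchmark table.
const double kVulkan3090 = 0.47;
const double kCuda3090 = 1.45;
const double kCpuAvx2 = 1.02;

// Speedup of a candidate over a baseline; values above 1.0 mean the candidate is faster.
inline double speedup(double baseline_s, double candidate_s) {
    assert(candidate_s > 0.0);
    return baseline_s / candidate_s;
}
```

With these numbers, `speedup(kCuda3090, kVulkan3090)` is about 3.1 and `speedup(kCpuAvx2, kVulkan3090)` is about 2.2, for this one short phrase on this hardware.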
🔧 Key Technical Upgrades
Split Model Architecture (GPT2 + HiFiGAN):
- The text-to-latent model (GPT2) and the vocoder (HiFiGAN) are now separate GGUF files.
- This lets users load HiFiGAN via Vulkan for lower latency while keeping GPT2 on CUDA.
- Result: Up to 3x faster audio generation on RTX-series GPUs.
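One way to picture the split is a small configuration that assigns each GGUF file its own backend. This is a minimal sketch, not the application's actual API: the `Backend`, `StageConfig`, and `PipelineConfig` types and both functions are hypothetical, and only the two file names are taken from this repository.

```cpp
#include <cassert>
#include <string>

// Hypothetical types for illustration; not taken from the ATTS1HG1 source.
enum class Backend { Cpu, Vulkan, Cuda };

struct StageConfig {
    std::string gguf_path;  // model file for this stage
    Backend backend;        // device backend that runs this stage
};

struct PipelineConfig {
    StageConfig gpt2;     // text -> latents
    StageConfig hifigan;  // latents -> waveform
};

// Mixed setup: GPT2 on CUDA, HiFiGAN on Vulkan.
inline PipelineConfig mixedBackendConfig() {
    return PipelineConfig{
        {"ATTS1_q6_k.gguf", Backend::Cuda},
        {"hifigan1_FP16.gguf", Backend::Vulkan},
    };
}

inline const char* backendName(Backend b) {
    switch (b) {
        case Backend::Cpu: return "CPU";
        case Backend::Vulkan: return "Vulkan";
        case Backend::Cuda: return "CUDA";
    }
    assert(false && "unhandled backend");
    return "unknown";
}
```

Keeping the two stages in separate structs means either one can be switched to another backend, for example both to `Backend::Vulkan` on a machine without CUDA, without touching the other.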
Advanced Text Preprocessing:
Automatic Language Detection (New!):
ATTS now features a LanguageDetector class that analyzes the input text using three signals: script ranges (Unicode blocks), a dictionary of common words, and n-grams (suffixes and patterns). It supports all 17 languages, with confidence scoring and fallback logic. In "Auto Mode" (LangDirIndex == 0), the system selects the synthesis language automatically.
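To make the staged approach concrete, here is a minimal sketch of the first two signals (script ranges, then a dictionary) with a confidence value and a fallback. It is not the actual LanguageDetector: every name below is hypothetical, the block-to-language mapping is simplified, the word lists are tiny placeholders, and the n-gram stage is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <sstream>
#include <string>

struct Detection {
    std::string lang;   // ISO 639-1 code such as "fr"; empty when undecided
    double confidence;  // share of letters or words that support `lang`
};

// Decode one UTF-8 code point at s[i] and advance i past it.
// Malformed or truncated input yields U+FFFD and advances by one byte.
inline char32_t nextCodePoint(const std::string& s, std::size_t& i) {
    assert(i < s.size());
    const unsigned char b0 = static_cast<unsigned char>(s[i]);
    std::size_t extra = 0;
    char32_t cp = 0;
    if (b0 < 0x80) cp = b0;
    else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; extra = 1; }
    else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; extra = 2; }
    else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; extra = 3; }
    else { ++i; return 0xFFFD; }
    if (i + extra >= s.size()) { ++i; return 0xFFFD; }
    for (std::size_t k = 1; k <= extra; ++k) {
        const unsigned char c = static_cast<unsigned char>(s[i + k]);
        if ((c & 0xC0) != 0x80) { ++i; return 0xFFFD; }
        cp = (cp << 6) | (c & 0x3F);
    }
    i += extra + 1;
    return cp;
}

// Stage 1: vote by Unicode block (simplified mapping, an assumption of this sketch).
inline Detection detectByScript(const std::string& text) {
    std::map<std::string, int> votes;
    int letters = 0;
    for (std::size_t i = 0; i < text.size();) {
        const char32_t cp = nextCodePoint(text, i);
        const bool asciiLetter = (cp >= U'A' && cp <= U'Z') || (cp >= U'a' && cp <= U'z');
        if (cp > 0x7F || asciiLetter) ++letters;
        if (cp >= 0x3040 && cp <= 0x30FF) ++votes["ja"];       // Hiragana, Katakana
        else if (cp >= 0x4E00 && cp <= 0x9FFF) ++votes["zh"];  // CJK Unified Ideographs
        else if (cp >= 0xAC00 && cp <= 0xD7A3) ++votes["ko"];  // Hangul syllables
        else if (cp >= 0x0400 && cp <= 0x04FF) ++votes["ru"];  // Cyrillic
        else if (cp >= 0x0600 && cp <= 0x06FF) ++votes["ar"];  // Arabic
        else if (cp >= 0x0900 && cp <= 0x097F) ++votes["hi"];  // Devanagari
    }
    // Han characters next to kana are kanji, so count them as Japanese.
    if (votes.count("ja") && votes.count("zh")) { votes["ja"] += votes["zh"]; votes.erase("zh"); }
    Detection best{"", 0.0};
    for (const auto& kv : votes) {
        const double share = static_cast<double>(kv.second) / letters;
        if (share > best.confidence) best = Detection{kv.first, share};
    }
    return best;
}

// Stage 2: count hits against tiny per-language word lists (placeholders only).
inline Detection detectByDictionary(const std::string& text) {
    static const std::map<std::string, std::set<std::string>> kCommonWords = {
        {"de", {"der", "die", "und", "ist", "hallo"}},
        {"en", {"the", "and", "is", "of", "hello"}},
        {"es", {"el", "los", "y", "es", "hola"}},
        {"fr", {"le", "la", "et", "est", "bonjour"}},
    };
    std::map<std::string, int> hits;
    int words = 0;
    std::istringstream in(text);
    for (std::string raw; in >> raw;) {
        std::string w;  // lowercase ASCII letters, keep non-ASCII bytes, drop the rest
        for (char c : raw) {
            if (c >= 'A' && c <= 'Z') w += static_cast<char>(c - 'A' + 'a');
            else if ((c >= 'a' && c <= 'z') || static_cast<unsigned char>(c) >= 0x80) w += c;
        }
        if (w.empty()) continue;
        ++words;
        for (const auto& lang : kCommonWords)
            if (lang.second.count(w)) ++hits[lang.first];
    }
    Detection best{"", 0.0};
    for (const auto& kv : hits) {
        const double share = static_cast<double>(kv.second) / words;
        if (share > best.confidence) best = Detection{kv.first, share};
    }
    return best;
}

// Script evidence wins when it covers at least half the letters; otherwise try the
// dictionary, and fall back to a default language when nothing matches.
inline Detection detectLanguage(const std::string& text, const std::string& fallback = "en") {
    const Detection byScript = detectByScript(text);
    if (byScript.confidence >= 0.5) return byScript;
    const Detection byWords = detectByDictionary(text);
    if (byWords.confidence > 0.0) return byWords;
    return Detection{fallback, 0.0};
}
```

In this sketch, `detectLanguage("Bonjour le monde")` returns "fr" because two of the three words appear in the French list, while text with no script or dictionary evidence falls back to "en" with a confidence of 0. Running the script stage first is a design choice of the sketch: non-Latin scripts are cheap to identify, so the dictionary is only consulted for Latin-script text.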
MeCab Integration: Native Japanese tokenization and segmentation for natural prosody.
Romanization: Automatic romanization for Chinese (Pinyin), Japanese (Romaji), and Korean (Revised Romanization).
Num2Words: Converts numbers (e.g., "123") into words ("one hundred twenty-three") across all 17 languages.
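As an illustration of the number expansion step, the following sketch spells out integers in English only and rewrites digit runs inside a sentence. It is an assumption-laden example, not the project's Num2Words code: the function names are hypothetical, the range is limited to 0..999999, and other languages would need their own rules.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Spell out 0..999 in English.
inline std::string belowThousand(int n) {
    assert(n >= 0 && n < 1000);
    static const char* kOnes[] = {"zero", "one", "two", "three", "four", "five", "six",
                                  "seven", "eight", "nine", "ten", "eleven", "twelve",
                                  "thirteen", "fourteen", "fifteen", "sixteen",
                                  "seventeen", "eighteen", "nineteen"};
    static const char* kTens[] = {"", "", "twenty", "thirty", "forty", "fifty",
                                  "sixty", "seventy", "eighty", "ninety"};
    std::string out;
    if (n >= 100) {
        out = std::string(kOnes[n / 100]) + " hundred";
        n %= 100;
        if (n == 0) return out;
        out += ' ';
    }
    if (n < 20) return out + kOnes[n];
    out += kTens[n / 10];
    if (n % 10 != 0) out += std::string("-") + kOnes[n % 10];
    return out;
}

// Spell out 0..999999 in English.
inline std::string numberToWords(int n) {
    assert(n >= 0 && n < 1000000);
    if (n < 1000) return belowThousand(n);
    std::string out = belowThousand(n / 1000) + " thousand";
    if (n % 1000 != 0) out += " " + belowThousand(n % 1000);
    return out;
}

// Replace each run of ASCII digits (up to six digits) with its spelled-out form.
// Longer runs are left unchanged because they exceed the range of this sketch.
inline std::string expandNumbers(const std::string& text) {
    std::string out;
    for (std::size_t i = 0; i < text.size();) {
        if (text[i] < '0' || text[i] > '9') { out += text[i++]; continue; }
        std::size_t j = i;
        while (j < text.size() && text[j] >= '0' && text[j] <= '9') ++j;
        const std::string digits = text.substr(i, j - i);
        out += (digits.size() <= 6) ? numberToWords(std::stoi(digits)) : digits;
        i = j;
    }
    return out;
}
```

For example, `expandNumbers("Room 123")` yields "Room one hundred twenty-three", so the synthesis model only ever sees words.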
🛠️ Usage
This repository contains the pre-compiled application (.zip) and the converted .gguf weights required by the ATTS1HG1 software.
Zero-Configuration Deployment:
- Download the pre-compiled GUI application (.zip) and the model files from this repository.
- No C++ compilation, Python dependencies, or pip install are required.
- Unzip the archive and run atts_gui_app_v1.1.exe.
- Load the model in the GUI or CLI and select Vulkan for best performance.
🧠 Model Files (GGUF via llama.cpp)
- atts1hg1_q6_k.gguf (Legacy: combined GPT2 + HiFiGAN)
- ATTS1_q6_k.gguf (New: GPT2 only)
- hifigan1_FP16.gguf (New: HiFiGAN only)
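A quick sanity check after downloading is to confirm that a file really is GGUF: the format begins with the four ASCII bytes "GGUF". The sketch below is a standalone helper under that assumption; the function names are illustrative and are not part of ATTS1HG1 or llama.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// True when the buffer starts with the GGUF magic bytes 'G','G','U','F'.
inline bool hasGgufMagic(const unsigned char* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    return len >= 4 && data[0] == 'G' && data[1] == 'G' && data[2] == 'U' && data[3] == 'F';
}

// Read the first four bytes of a file and test them.
// Returns false when the file is missing, unreadable, or shorter than four bytes.
inline bool fileHasGgufMagic(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    unsigned char header[4] = {0, 0, 0, 0};
    in.read(reinterpret_cast<char*>(header), sizeof(header));
    return in.gcount() == 4 && hasGgufMagic(header, sizeof(header));
}
```

Calling `fileHasGgufMagic("ATTS1_q6_k.gguf")` before loading catches truncated downloads of only a few bytes and cases where the .zip archive was selected instead of a model file.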
🔊 Sample Audio Files
Test the quality with these pre-generated samples covering various speakers and languages.
| Filename | Speaker | Language | Download |
|---|---|---|---|
| Vjollca Johnnie | Vjollca Johnnie | French | ⬇️ Download |
| Gitta Nikolina | Gitta Nikolina | Arabic | ⬇️ Download |
| Damien Black | Damien Black | Chinese | ⬇️ Download |
| Asya Anara | Asya Anara | English | ⬇️ Download |
| Vjollca Johnnie | Vjollca Johnnie | Spanish | ⬇️ Download |
| Nova Hogarth | Nova Hogarth | Hindi | ⬇️ Download |
| Adde Michal | Adde Michal | Dutch | ⬇️ Download |
| Craig Gutsy | Craig Gutsy | Czech | ⬇️ Download |
| Dionisio Schuyler | Dionisio Schuyler | German | ⬇️ Download |
| Dionisio Schuyler | Dionisio Schuyler | Italian | ⬇️ Download |
| Ludvig Milivoj | Ludvig Milivoj | Hungarian | ⬇️ Download |
| Royston Min | Royston Min | Portuguese | ⬇️ Download |
| Viktor Eka | Viktor Eka | Polish | ⬇️ Download |
| Abrahan Mack | Abrahan Mack | Russian | ⬇️ Download |
| Abrahan Mack | Abrahan Mack | Turkish | ⬇️ Download |
| Viktor Menelaos | Viktor Menelaos | Korean | ⬇️ Download |
| Zacharie Aimilios | Zacharie Aimilios | Japanese | ⬇️ Download |
📜 License
This project uses the weights from Coqui XTTS-v2, which is licensed under the Coqui Public Model License (CPML).
Credits: Based on the excellent work by Coqui.ai and the GGML library by ggerganov.