aufklarer/WeSpeaker-ResNet34-LM-MLX
Audio Classification • Updated • 366k • 2
Speech AI models for Apple Silicon via MLX. ASR, TTS, VAD, diarization, speaker embedding.
Note: Many-to-many translation across 400+ languages (T5 v1.1, INT4/INT8).
Note: 8-bit LLM variant; bundles S3-Tokenizer-v3 for zero-shot voice cloning.