---
license: cc-by-4.0
language:
- en
tags:
- asr
- speech
- coreml
- nemo
- parakeet
- nvidia
- int8-per-channel-symmetric
library_name: coremltools
pipeline_tag: automatic-speech-recognition
base_model: nvidia/parakeet-rnnt-1.1b
---

# parakeet-rnnt-1.1b-coreml-int8

CoreML conversion of [nvidia/parakeet-rnnt-1.1b](https://huggingface.co/nvidia/parakeet-rnnt-1.1b), quantized to INT8 (per-channel, symmetric).

| | |
|---|---|
| **Architecture** | RNNT |
| **Language** | English |
| **Sample rate** | 16000 Hz |
| **Max audio** | 15.0 s |
| **Vocab size** | 1024 |
| **Framework** | NVIDIA NeMo → CoreML (coremltools) |

## Components

| File | Component | Best compute |
|------|-----------|--------------|
| `parakeet_mel_encoder.mlpackage` | mel_encoder | ANE / GPU |
| `parakeet_decoder.mlpackage` | decoder | CPU only |
| `parakeet_joint_decision_single_step.mlpackage` | joint_decision_single_step | ANE / GPU |

## Usage

```bash
pip install ovos-stt-plugin-coreml
```

```python
from ovos_stt_plugin_coreml import CoremlSTT
from ovos_plugin_manager.utils.audio import AudioFile

stt = CoremlSTT(config={"metadata": "metadata.json"})
with AudioFile("speech.wav") as f:
    audio = f.read()
print(stt.execute(audio))
```

## Source model

[nvidia/parakeet-rnnt-1.1b](https://huggingface.co/nvidia/parakeet-rnnt-1.1b)
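
## Quantization sketch

The "INT8 per-channel symmetric" label generally means that each output channel of a weight tensor gets its own scale and that the zero point is fixed at 0. The snippet below is a minimal, pure-Python illustration of that idea, not the code used for this conversion: the function names, the toy weight matrix, and the choice to clamp to [-127, 127] are all assumptions made for the example, and the exact settings used to produce these packages are not stated here.

```python
def quantize_per_channel_symmetric(weights):
    """Quantize a 2-D weight matrix given as a list of rows.

    Each row is treated as one output channel (an assumption for this sketch).
    Returns the INT8 values and one float scale per channel.
    """
    quantized, scales = [], []
    for row in weights:
        max_abs = max(abs(w) for w in row)
        # Symmetric: the range is [-max_abs, max_abs] and the zero point is 0.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        q_row = [max(-127, min(127, round(w / scale))) for w in row]
        quantized.append(q_row)
        scales.append(scale)
    return quantized, scales


def dequantize(quantized, scales):
    """Map INT8 values back to floats using the per-channel scales."""
    return [[q * s for q in row] for row, s in zip(quantized, scales)]


# Hypothetical toy weights: two channels with very different magnitudes.
weights = [[0.5, -1.0, 0.25], [0.02, 0.01, -0.04]]
q, scales = quantize_per_channel_symmetric(weights)
restored = dequantize(q, scales)
```

Because the second channel has its own, much smaller scale, its values keep far more precision than they would with a single scale shared across the whole tensor, which is the usual motivation for per-channel over per-tensor quantization.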