# whisper-large-v3-turbo-onnx-fp16

This is an FP16 ONNX version of [openai/whisper-large-v3-turbo](https://huggingface.co/openai/whisper-large-v3-turbo).
## Model Details
- Base Model: openai/whisper-large-v3-turbo
- Format: ONNX (FP16)
- Architecture: arm64
## Size Comparison
| Version | Size |
|---|---|
| Base ONNX (FP32) | 4198.51 MB |
| FP16 ONNX (this model) | 2099.54 MB |
| INT8 Quantized ONNX | 1459.58 MB |

Compression vs. FP32: 2.00x.
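The 2.00x figure follows directly from halving each weight from 4 bytes (FP32) to 2 bytes (FP16). A quick sanity check with NumPy (the array shape is arbitrary, chosen only for illustration):

```python
import numpy as np

# A dummy FP32 weight matrix and its FP16 counterpart
weights_fp32 = np.random.rand(1000, 1000).astype(np.float32)
weights_fp16 = weights_fp32.astype(np.float16)

# Byte sizes halve exactly: 4 bytes per element -> 2 bytes per element
ratio = weights_fp32.nbytes / weights_fp16.nbytes
print(ratio)  # → 2.0
```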
## Usage
```python
from optimum.onnxruntime import ORTModelForSpeechSeq2Seq
from transformers import AutoProcessor

# Load the FP16 ONNX model and its processor from the Hub
model = ORTModelForSpeechSeq2Seq.from_pretrained("kostasang/whisper-large-v3-turbo-onnx-fp16")
processor = AutoProcessor.from_pretrained("kostasang/whisper-large-v3-turbo-onnx-fp16")
```
## Model tree

- Base model: [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3)
- Fine-tuned as: [openai/whisper-large-v3-turbo](https://huggingface.co/openai/whisper-large-v3-turbo)