# whisper-large-v3-turbo-onnx-fp16

This is an FP16 ONNX version of `openai/whisper-large-v3-turbo`.

## Model Details

### Size Comparison

| Version | Size |
|---|---|
| Base ONNX (FP32) | 4198.51 MB |
| FP16 ONNX | 2099.54 MB |
| INT8 Quantized ONNX | 1459.58 MB |
| Compression (FP32 → FP16) | 2.00x |
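The compression figure follows directly from the sizes in the table; a quick sanity check (sizes hard-coded from the table above):

```python
# Sizes taken from the table above (MB)
fp32_mb = 4198.51
fp16_mb = 2099.54
int8_mb = 1459.58

# FP16 halves the FP32 weights; INT8 shrinks them further still
print(f"FP16 compression: {fp32_mb / fp16_mb:.2f}x")  # 2.00x
print(f"INT8 compression: {fp32_mb / int8_mb:.2f}x")  # 2.88x
```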

## Usage

```python
from optimum.onnxruntime import ORTModelForSpeechSeq2Seq
from transformers import AutoProcessor

model = ORTModelForSpeechSeq2Seq.from_pretrained("kostasang/whisper-large-v3-turbo-onnx-fp16")
processor = AutoProcessor.from_pretrained("kostasang/whisper-large-v3-turbo-onnx-fp16")

# `audio` should be a 16 kHz mono waveform (1-D float array)
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
generated_ids = model.generate(inputs.input_features)
transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```