# Fish Audio S2 Pro — FP8 (AEmotionStudio Mirror)

FP8 weight-only quantization of Fish Audio S2 Pro.
## Details
| Property | Value |
|---|---|
| Source model | `fishaudio/s2-pro` |
| Quantization | Per-row symmetric FP8 (float8_e4m3fn) |
| Linear layers quantized | 201 |
| FP8 params | 4.048B |
| BF16 params | 0.514B |
| Model size | 4.73 GB |
| VRAM requirement | ~12 GB |
## How it works

All `nn.Linear` weight matrices are quantized to `float8_e4m3fn` with per-row
`float32` scale factors. Non-linear weights (embeddings, layer norms, codec) remain
in `bfloat16`. No external quantization library is needed — dequantization is pure PyTorch:

```python
W_bf16 = W_fp8.to(torch.bfloat16) * scale
```
## License
This model inherits the Fish Audio Research License. Free for research and non-commercial use. Commercial use requires a separate license from Fish Audio.
Built with Fish Audio.