# 🧠 Granite 4.0 H Small Heretic GGUFs

Quantized GGUF builds of pszemraj/granite-4.0-h-small-heretic_hi (32B parameters, `granitehybrid` architecture).


## 📦 Available GGUFs

| Format  | Description |
|---------|-------------|
| F16     | Full precision (16-bit): better quality, larger file size ⚖️ |
| Q4_K_XL | 4-bit XL-variant quantization (based on the quantization table of Unsloth's granite-4.0-h-small GGUF): smaller size, faster inference ⚡ |
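To grab one of these files locally, the Hugging Face CLI works well. A minimal sketch follows; the GGUF filename shown is illustrative only, so check the repository's file list for the actual name before downloading.

```bash
# Download the 4-bit quant from this repo.
# NOTE: the filename below is an assumption; verify it against the repo's file list.
huggingface-cli download rodrigomt/granite-4.0-h-small-heretic-high-GGUF \
  granite-4.0-h-small-heretic-high-Q4_K_XL.gguf \
  --local-dir .
```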

## 🚀 Usage

Example with llama.cpp:

```bash
./main -m ./gguf-file-name.gguf -p "Hello world!"
```
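On recent llama.cpp releases the `main` binary has been renamed to `llama-cli`, and the same file can also be served over HTTP with `llama-server`. The commands below are a sketch assuming such a build; adjust the model path to whichever GGUF you downloaded.

```bash
# Recent llama.cpp builds ship llama-cli in place of main; -n caps the number of generated tokens
./llama-cli -m ./gguf-file-name.gguf -p "Hello world!" -n 128

# Alternatively, serve the same file via an OpenAI-compatible HTTP endpoint
./llama-server -m ./gguf-file-name.gguf -c 4096 --port 8080
```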