# 🧠 Granite 4.0 H Small Heretic GGUFs

Quantized GGUF builds of pszemraj/granite-4.0-h-small-heretic_hi (32B parameters, `granitehybrid` architecture).


## 📦 Available GGUFs

| Format  | Description |
|---------|-------------|
| F16     | Full precision (16-bit): better quality, larger file size ⚖️ |
| Q4_K_XL | 4-bit XL-variant quantization (based on the quantization table of Unsloth's granite-4.0-h-small GGUF): smaller size, faster inference ⚡ |
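To grab one of these files locally, the Hugging Face CLI works well. A minimal sketch follows; the GGUF filename shown is illustrative only, so check the repository's file list for the actual name before downloading.

```bash
# Download the 4-bit quant from this repo.
# NOTE: the filename below is an assumption; verify it against the repo's file list.
huggingface-cli download rodrigomt/granite-4.0-h-small-heretic-high-GGUF \
  granite-4.0-h-small-heretic-high-Q4_K_XL.gguf \
  --local-dir .
```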

## 🚀 Usage

Example with llama.cpp:

```bash
./main -m ./gguf-file-name.gguf -p "Hello world!"
```
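On recent llama.cpp releases the `main` binary has been renamed to `llama-cli`, and the same file can also be served over HTTP with `llama-server`. The commands below are a sketch assuming such a build; adjust the model path to whichever GGUF you downloaded.

```bash
# Recent llama.cpp builds ship llama-cli in place of main; -n caps the number of generated tokens
./llama-cli -m ./gguf-file-name.gguf -p "Hello world!" -n 128

# Alternatively, serve the same file via an OpenAI-compatible HTTP endpoint
./llama-server -m ./gguf-file-name.gguf -c 4096 --port 8080
```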