# DeepSeek-R1-Distill-Qwen-7B – GGUF Quants

Quantized GGUF versions of deepseek-ai/DeepSeek-R1-Distill-Qwen-7B, a 7B reasoning model distilled from DeepSeek-R1 into a Qwen2.5 backbone. It brings chain-of-thought reasoning from a 671B MoE teacher into a compact 7B package, achieving state-of-the-art math and coding results at this parameter scale.
## Available Files

| File | Quant | Size | Use Case |
|---|---|---|---|
| DeepSeek-R1-Distill-Qwen-7B-Q8_0.gguf | Q8_0 | ~7.7 GB | Maximum quality |
| DeepSeek-R1-Distill-Qwen-7B-Q6_K.gguf | Q6_K | ~6.0 GB | Near-lossless |
| DeepSeek-R1-Distill-Qwen-7B-Q5_K_M.gguf | Q5_K_M | ~5.2 GB | High quality |
| DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf | Q4_K_M | ~4.4 GB | Recommended default |
| DeepSeek-R1-Distill-Qwen-7B-Q3_K_M.gguf | Q3_K_M | ~3.5 GB | Low VRAM |
| DeepSeek-R1-Distill-Qwen-7B-IQ4_XS.gguf | IQ4_XS | ~3.9 GB | Imatrix 4-bit |
| DeepSeek-R1-Distill-Qwen-7B-IQ3_XXS.gguf | IQ3_XXS | ~2.9 GB | Imatrix 3-bit |
| DeepSeek-R1-Distill-Qwen-7B-IQ2_M.gguf | IQ2_M | ~2.5 GB | Imatrix 2-bit |
| DeepSeek-R1-Distill-Qwen-7B-IQ1_S.gguf | IQ1_S | ~1.8 GB | Extreme compression |
| DeepSeek-R1-Distill-Qwen-7B-fp16.gguf | FP16 | ~14.8 GB | Full precision |
| imatrix.dat | – | – | Importance matrix |
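As a rough sanity check, file size divided by parameter count gives the effective bits per weight for each quant. A minimal sketch, assuming roughly 7.6B parameters for the Qwen2.5-7B backbone (an estimate, not an official figure):

```python
# Approximate bits-per-weight for the quant sizes in the table above.
PARAMS = 7.6e9  # assumed parameter count, not an official figure

sizes_gb = {
    "Q8_0": 7.7,
    "Q6_K": 6.0,
    "Q4_K_M": 4.4,
    "IQ2_M": 2.5,
    "FP16": 14.8,
}

def bits_per_weight(size_gb: float, params: float = PARAMS) -> float:
    """Convert a file size in GB to approximate bits per parameter."""
    return size_gb * 1e9 * 8 / params

for name, gb in sizes_gb.items():
    print(f"{name}: ~{bits_per_weight(gb):.1f} bits/weight")
```

The numbers come out slightly above the nominal quant width (e.g. Q4_K_M lands near 4.6 bits/weight) because some tensors, such as embeddings, are kept at higher precision.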
## Usage

Run with llama.cpp. Note that the DeepSeek-R1 distills ship their own DeepSeek chat template (not ChatML), so the simplest route is conversation mode (`-cnv`), which applies the template embedded in the GGUF:

```bash
./llama-cli -m DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf \
  --ctx-size 8192 -n 2048 -cnv
```

Then ask, for example: `Solve step by step: what is 15% of 240?`
Let the model think – the reasoning traces inside `<think>` blocks are where the magic happens. Give it `-n 2048` or more for complex problems.
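If you consume the output programmatically, the `<think>` reasoning trace can be split off from the final answer. A minimal sketch (the tag format is as described above; the sample completion is made up):

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split model output into (reasoning, answer) around a <think> block.

    Returns an empty reasoning string if no <think> block is present.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

# Example with a made-up completion:
output = "<think>15% of 240 is 0.15 * 240 = 36.</think>\nThe answer is 36."
reasoning, answer = split_reasoning(output)
print(answer)  # The answer is 36.
```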
## About DeepSeek-R1-Distill-Qwen-7B
- Parameters: 7B (Qwen2.5 backbone)
- Teacher: DeepSeek-R1 (671B MoE)
- Specialization: Mathematical reasoning, code generation, chain-of-thought
- License: MIT
One of the strongest reasoning-capable 7B models available. Per the DeepSeek-R1 report, the distilled models were produced by supervised fine-tuning on reasoning traces from DeepSeek-R1, which was itself trained with GRPO.

Quantized by DuoNeural using llama.cpp on an RTX 5090.
## DuoNeural

DuoNeural is an open AI research lab – human + AI in collaboration.
| Platform | Link |
|---|---|
| HuggingFace | huggingface.co/DuoNeural |
| Website | duoneural.com |
| GitHub | github.com/DuoNeural |
| X / Twitter | @DuoNeural |
| Email | duoneural@proton.me |
| Newsletter | duoneural.beehiiv.com |
| Support | buymeacoffee.com/duoneural |
## DuoNeural Research Publications

Open access, CC BY 4.0. Authored by Archon, Jesse Caldwell, and Aura of DuoNeural.
## Model tree for DuoNeural/DeepSeek-R1-Distill-Qwen-7B-GGUF

Base model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B