DeepSeek-R1-Distill-Qwen-7B – GGUF Quants

Quantized GGUF versions of deepseek-ai/DeepSeek-R1-Distill-Qwen-7B – a 7B reasoning model distilled from DeepSeek-R1 into a Qwen2.5 backbone. It brings the chain-of-thought reasoning of a 671B MoE teacher into a compact 7B package, with strong math and coding performance for its parameter scale.

Available Files

File Quant Size Use Case
DeepSeek-R1-Distill-Qwen-7B-Q8_0.gguf Q8_0 ~7.7GB Maximum quality
DeepSeek-R1-Distill-Qwen-7B-Q6_K.gguf Q6_K ~6.0GB Near-lossless
DeepSeek-R1-Distill-Qwen-7B-Q5_K_M.gguf Q5_K_M ~5.2GB High quality
DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf Q4_K_M ~4.4GB Recommended default
DeepSeek-R1-Distill-Qwen-7B-Q3_K_M.gguf Q3_K_M ~3.5GB Low VRAM
DeepSeek-R1-Distill-Qwen-7B-IQ4_XS.gguf IQ4_XS ~3.9GB Imatrix 4-bit
DeepSeek-R1-Distill-Qwen-7B-IQ3_XXS.gguf IQ3_XXS ~2.9GB Imatrix 3-bit
DeepSeek-R1-Distill-Qwen-7B-IQ2_M.gguf IQ2_M ~2.5GB Imatrix 2-bit
DeepSeek-R1-Distill-Qwen-7B-IQ1_S.gguf IQ1_S ~1.8GB Extreme compression
DeepSeek-R1-Distill-Qwen-7B-fp16.gguf FP16 ~14.8GB Full precision
imatrix.dat – – Importance matrix
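A rough rule for choosing a quant for a given GPU: the weight file, the KV cache, and a little runtime overhead must all fit in VRAM. A minimal sketch of that arithmetic, assuming a Qwen2.5-7B-style architecture (28 layers, 4 KV heads of dimension 128, fp16 cache) and a hypothetical `overhead_gb` fudge factor; adjust the numbers for your setup:

```python
def kv_cache_bytes(n_ctx, n_layers=28, n_kv_heads=4, head_dim=128, bytes_per_elem=2):
    """Approximate fp16 KV-cache size: a K and a V tensor for every layer."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * n_ctx

def fits_in_vram(file_gb, n_ctx, vram_gb, overhead_gb=0.8):
    """Rough check: weights + KV cache + runtime overhead vs. available VRAM."""
    needed_gb = file_gb + kv_cache_bytes(n_ctx) / 1e9 + overhead_gb
    return needed_gb <= vram_gb

# e.g. Q4_K_M (~4.4 GB) at 8192 context on an 8 GB card
print(fits_in_vram(4.4, 8192, 8.0))   # True: roughly 5.7 GB needed
print(fits_in_vram(14.8, 8192, 8.0))  # False: fp16 wants a larger GPU
```

At 8192 tokens the fp16 KV cache for this architecture is only about 0.47 GB, so the weight file dominates the budget.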

Usage

./llama-cli -m DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf \
  --ctx-size 8192 -n 2048 --temp 0.6 \
  -p "<|User|>Solve step by step: what is 15% of 240?<|Assistant|>"

Note that the R1 distills use the DeepSeek chat template (<|User|> / <|Assistant|>), not Qwen's ChatML. Let the model think: the chain-of-thought appears inside <think>…</think> blocks before the final answer, so give it -n 2048 or more for complex problems. DeepSeek recommends a temperature of 0.5–0.7 (0.6 works well) and no system prompt.
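When consuming the output programmatically, it is often useful to separate the reasoning trace from the final answer. A small sketch, assuming the model wraps its reasoning in <think>…</think> blocks as the R1 distills do:

```python
import re

def split_reasoning(text):
    """Return (reasoning_traces, final_answer) from R1-style output."""
    traces = re.findall(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    answer = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
    return traces, answer

out = "<think>15% of 240 = 0.15 * 240 = 36</think>The answer is 36."
traces, answer = split_reasoning(out)
print(answer)  # The answer is 36.
```

The DOTALL flag matters: reasoning traces routinely span many lines, and without it the pattern would stop at the first newline.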

About DeepSeek-R1-Distill-Qwen-7B

  • Parameters: 7B (Qwen2.5 backbone)
  • Teacher: DeepSeek-R1 (671B MoE)
  • Specialization: Mathematical reasoning, code generation, chain-of-thought
  • License: MIT

One of the strongest reasoning-capable 7B models available, distilled via supervised fine-tuning on reasoning traces generated by DeepSeek-R1 (GRPO reinforcement learning was applied to the teacher model, not to the distills).


Quantized by DuoNeural using llama.cpp on an RTX 5090.


DuoNeural

DuoNeural is an open AI research lab – human + AI in collaboration.

DuoNeural Research Publications

Open access, CC BY 4.0. Authored by Archon, Jesse Caldwell, and Aura (DuoNeural).
