# Qwen3-4B-SafeRL - GGUF Quantized Versions
This repository provides GGUF quantized versions of Qwen/Qwen3-4B-SafeRL, converted with llama.cpp.
The base model was first exported from Hugging Face format to GGUF (FP16) and then quantized into multiple formats. These variants offer different trade-offs between model size, inference speed, and output quality.
## Model Details

- Base model: Qwen/Qwen3-4B-SafeRL
- Architecture: Qwen 3 (4B parameters)
- Format: GGUF
- Intended use: Safe RL research & alignment tasks
- Conversion tool: `convert_hf_to_gguf.py` (from llama.cpp)
- Quantization tool: `llama-quantize`
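The export-then-quantize pipeline described above can be sketched with the two llama.cpp tools; the local checkpoint path and output filenames here are illustrative assumptions, not the exact commands used for this repository:

```shell
# Export the Hugging Face checkpoint to GGUF at FP16
# (./Qwen3-4B-SafeRL is an assumed local checkout of the base model)
python convert_hf_to_gguf.py ./Qwen3-4B-SafeRL \
    --outtype f16 \
    --outfile Qwen3-4B-SafeRL-FP16.gguf

# Quantize the FP16 GGUF to one of the published types
# (repeat with Q2_K, Q3_K_M, Q5_K_M, etc. as needed)
./llama-quantize Qwen3-4B-SafeRL-FP16.gguf Qwen3-4B-SafeRL-Q4_K_M.gguf Q4_K_M
```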
## Quantized Versions

| Quantization | Filename | Size (GiB) | Notes |
|---|---|---|---|
| FP16 | Qwen3-4B-SafeRL-FP16.gguf | ~8.05 | Full precision (baseline) |
| Q2_K | Qwen3-4B-SafeRL-Q2_K.gguf | ~1.67 | Smallest, lowest accuracy |
| Q3_K_M | Qwen3-4B-SafeRL-Q3_K_M.gguf | ~2.08 | Balanced small size |
| Q4_0 | Qwen3-4B-SafeRL-Q4_0.gguf | ~2.37 | Good balance, faster |
| Q4_K_M | Qwen3-4B-SafeRL-Q4_K_M.gguf | ~2.50 | Standard, widely used |
| Q5_K_M | Qwen3-4B-SafeRL-Q5_K_M.gguf | ~2.89 | Better accuracy |
| Q6_K | Qwen3-4B-SafeRL-Q6_K.gguf | ~3.31 | High accuracy |
| Q8_0 | Qwen3-4B-SafeRL-Q8_0.gguf | ~4.28 | Near FP16 quality |
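A simple way to use the table above is to pick the highest-precision variant that fits your memory budget. The sketch below hardcodes the approximate file sizes from the table; note that actual RAM usage at inference time is somewhat higher than the file size (KV cache, context buffers), so leave headroom:

```python
# Approximate on-disk sizes (GiB) taken from the table above.
sizes_gib = {
    "FP16": 8.05,
    "Q2_K": 1.67,
    "Q3_K_M": 2.08,
    "Q4_0": 2.37,
    "Q4_K_M": 2.50,
    "Q5_K_M": 2.89,
    "Q6_K": 3.31,
    "Q8_0": 4.28,
}

def largest_quant_fitting(budget_gib):
    """Return the largest (highest-precision) variant that fits the budget,
    or None if even Q2_K is too big."""
    fitting = {name: size for name, size in sizes_gib.items() if size <= budget_gib}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)

print(largest_quant_fitting(3.0))  # -> Q5_K_M (~2.89 GiB file)
```

This is only a file-size heuristic; quality differences between neighboring quant types (e.g. Q4_K_M vs. Q5_K_M) are best judged by evaluating on your own prompts.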
## Usage

### llama.cpp

```bash
./main -m Qwen3-4B-SafeRL-Q4_K_M.gguf -p "Hello, SafeRL!"
```

Note: recent llama.cpp builds rename the `main` binary to `llama-cli`; the flags are the same.
### Python

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download the quantized model file from the Hub
model_path = hf_hub_download(
    repo_id="YOUR_USERNAME/Qwen3-4B-SafeRL-GGUF",
    filename="Qwen3-4B-SafeRL-Q4_K_M.gguf",
)

# Load the model with llama-cpp-python
llm = Llama(model_path=model_path)

output = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a safe RL assistant."},
        {"role": "user", "content": "Hello, SafeRL!"},
    ],
    max_tokens=100,
)
print(output["choices"][0]["message"]["content"])
```
These GGUF versions are optimized for fast inference with CPU/GPU runtimes like llama.cpp, Ollama, and LM Studio.