# Qwen2.5-14B Pruned + RMSNorm Fine-tuned (80% Parameters)

This model is a pruned and fine-tuned version of [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B). It retains approximately 80% of the base model's parameters: the pruned structure was found with a genetic-algorithm search, and performance was then recovered by fine-tuning (calibrating) the RMSNorm parameters.
## Model Details
- Base Model: Qwen/Qwen2.5-14B
- Parameter Retention: ~80%
- Pruning Method: Genetic Algorithm
- Fine-tuning Method: RMSNorm calibration
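The card does not spell out the pruning search itself. As a hedged illustration only (none of these names come from the repo), a genetic algorithm for structured pruning typically evolves a binary keep-mask over prunable units, scoring each candidate mask with a fitness function (e.g. negative perplexity on a calibration set):

```python
import random

def genetic_prune(n_units, keep_ratio, fitness, generations=30, pop_size=20, seed=0):
    """Evolve a binary keep-mask over n_units, keeping ~keep_ratio of them.

    `fitness` maps a mask (tuple of 0/1) to a score to maximize -- in real
    pruning this would be, e.g., negative perplexity of the masked model.
    This is a toy sketch, not the recipe actually used for this model.
    """
    rng = random.Random(seed)
    k = max(1, round(n_units * keep_ratio))  # number of units to keep

    def random_mask():
        keep = set(rng.sample(range(n_units), k))
        return tuple(1 if i in keep else 0 for i in range(n_units))

    def mutate(mask):
        # Swap one kept unit for one pruned unit, preserving the keep ratio.
        m = list(mask)
        on = [i for i, b in enumerate(m) if b]
        off = [i for i, b in enumerate(m) if not b]
        if on and off:
            m[rng.choice(on)] = 0
            m[rng.choice(off)] = 1
        return tuple(m)

    pop = [random_mask() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)       # elitist selection
        survivors = pop[: pop_size // 2]
        pop = survivors + [mutate(rng.choice(survivors)) for _ in survivors]
    return max(pop, key=fitness)
```

Because selection is elitist, the best mask found so far is never lost, so the search monotonically improves under the given fitness.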
## Performance

| Metric | Value |
|---|---|
| Perplexity (before fine-tuning) | 8.63 |
| Perplexity (after fine-tuning) | 7.11 |
| Relative improvement | 17.62% |
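The improvement figure is the relative drop in perplexity (lower perplexity is better), which can be checked directly from the two reported values:

```python
ppl_before, ppl_after = 8.63, 7.11
improvement = (ppl_before - ppl_after) / ppl_before * 100
# matches the reported 17.62% up to rounding of the displayed perplexities
print(f"{improvement:.2f}%")
```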
## Performance Comparison
This 80% model is part of a family of pruned models with different compression ratios:
| Model | PPL (After FT) |
|---|---|
| 50% params | 16.94 |
| 70% params | 9.60 |
| 80% params | 7.11 |
| 90% params | 6.15 |
## Usage

```python
import torch
from transformers import AutoTokenizer

# Load the tokenizer (the standard Qwen2.5-14B tokenizer is reused)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-14B")

# Note: this model uses a custom pruned architecture, so it cannot be
# loaded with AutoModelForCausalLM directly; custom loading code is required.
```
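Since the repo ships a raw state dict (`model_weights.pt`) rather than a standard `transformers` checkpoint, loading looks roughly like the following sketch. The function name is hypothetical, and rebuilding the pruned architecture itself is repo-specific and not shown:

```python
import torch

def load_pruned_state(path: str = "model_weights.pt") -> dict:
    """Load the raw state dict shipped with this repo (hypothetical helper).

    map_location="cpu" lets you inspect the weights without a GPU. The
    returned dict must then be loaded into a manually rebuilt pruned
    architecture (not provided here), e.g. via model.load_state_dict(...).
    """
    return torch.load(path, map_location="cpu")
```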
## Files Included

- `model_weights.pt`: full model state dict
- `README.md`: this documentation
## License
Apache 2.0 (inherited from base model)
## Related Models

- [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) - base model
- Other compression ratios: 50%, 70%, and 90% variants are available in this account