# Qwen2.5-14B Pruned + RMSNorm Finetuned (80% Parameters)

This model is a pruned and fine-tuned version of Qwen/Qwen2.5-14B. It retains approximately 80% of the base model's parameters, pruned with a genetic-algorithm search and then recovered via RMSNorm fine-tuning, while maintaining strong performance.

## Model Details

- **Base Model:** Qwen/Qwen2.5-14B
- **Parameter Retention:** ~80%
- **Pruning Method:** genetic algorithm
- **Fine-tuning Method:** RMSNorm calibration
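"RMSNorm calibration" suggests updating only the per-channel RMSNorm scale weights while every other parameter stays frozen. Below is a minimal PyTorch sketch of that setup; the `RMSNorm` module and the toy two-layer model are illustrative stand-ins, not this checkpoint's actual training code:

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Qwen2-style RMSNorm: normalize by root-mean-square, then scale."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))  # the only trainable part
        self.eps = eps

    def forward(self, x):
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        return x * rms * self.weight

# Toy model standing in for the pruned network
model = nn.Sequential(nn.Linear(8, 8), RMSNorm(8))

# Freeze everything, then unfreeze only the RMSNorm scales
for p in model.parameters():
    p.requires_grad = False
for m in model.modules():
    if isinstance(m, RMSNorm):
        m.weight.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable} / {total}")  # → trainable: 8 / 80
```

Because only the norm scales are optimized, this kind of calibration touches a tiny fraction of the weights, which is why it can be run cheaply after pruning.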

## Performance

| Metric | Value |
|---|---|
| PPL (before fine-tuning) | 8.63 |
| PPL (after fine-tuning) | 7.11 |
| Improvement | 17.62% |
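For reference, the improvement figure follows directly from the two perplexities; the table's 17.62% presumably reflects the unrounded PPL values:

```python
# Relative perplexity improvement from RMSNorm fine-tuning
ppl_before, ppl_after = 8.63, 7.11
improvement = (ppl_before - ppl_after) / ppl_before * 100
print(f"{improvement:.1f}%")  # → 17.6%
```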

## Performance Comparison

This 80% model is part of a family of pruned models with different compression ratios:

| Model | PPL (after FT) |
|---|---|
| 50% params | 16.94 |
| 70% params | 9.60 |
| 80% params | 7.11 |
| 90% params | 6.15 |
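A genetic-algorithm pruning search like the one behind these checkpoints can be sketched as evolving binary keep/drop masks over prunable units toward a target retention ratio. Everything below (unit count, population size, and especially the fitness proxy) is illustrative; a real search would score each mask by calibration loss or perplexity of the masked model:

```python
import random

N_UNITS = 16     # number of prunable units (illustrative)
RETENTION = 0.8  # target: keep ~80% of units
POP, GENS = 20, 30

def fitness(mask):
    # Toy proxy: reward masks whose kept fraction is near the target.
    # A real run would evaluate the pruned model on calibration data.
    kept = sum(mask) / N_UNITS
    return -abs(kept - RETENTION)

def crossover(a, b):
    # Single-point crossover of two parent masks
    cut = random.randrange(1, N_UNITS)
    return a[:cut] + b[cut:]

def mutate(mask, p=0.05):
    # Flip each keep/drop bit with small probability
    return [1 - g if random.random() < p else g for g in mask]

random.seed(0)
pop = [[random.randint(0, 1) for _ in range(N_UNITS)] for _ in range(POP)]
for _ in range(GENS):
    pop.sort(key=fitness, reverse=True)
    elite = pop[: POP // 2]  # elitism: keep the best half
    children = [mutate(crossover(*random.sample(elite, 2)))
                for _ in range(POP - len(elite))]
    pop = elite + children

best = max(pop, key=fitness)
print(f"kept {sum(best)}/{N_UNITS} units")
```

With elitism the best fitness never regresses, so the population converges toward masks at the requested retention ratio.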

## Usage

```python
import torch
from transformers import AutoTokenizer

# Load the tokenizer (standard Qwen2.5-14B tokenizer)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-14B")

# Note: this model uses a custom pruned architecture;
# custom loading code is required to use it.
```
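Once the pruned architecture is instantiated (custom code, not shown here), the released `model_weights.pt` state dict can be restored with the usual `torch.load` / `load_state_dict` pattern. A self-contained sketch of that pattern on a stand-in module; the demo file name and the `nn.Linear` stand-in are assumptions, not the real checkpoint:

```python
import torch
import torch.nn as nn

# Stand-in for the custom pruned model
pruned_model = nn.Linear(4, 4)

# Simulate the released checkpoint: a full state dict saved to disk
torch.save(pruned_model.state_dict(), "model_weights_demo.pt")

# Restore weights into the instantiated architecture
state_dict = torch.load("model_weights_demo.pt", map_location="cpu")
pruned_model.load_state_dict(state_dict, strict=True)
print(sorted(state_dict.keys()))  # → ['bias', 'weight']
```

With `strict=True`, loading fails loudly if the instantiated architecture's shapes do not match the pruned checkpoint, which is a useful sanity check for a non-standard architecture.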

## Files Included

- `model_weights.pt`: full model state dict
- `README.md`: this documentation

## License

Apache 2.0 (inherited from base model)

## Related Models

- Qwen/Qwen2.5-14B: base model
- Other compression ratios: 50%, 70%, and 90% versions are available in this account