# Qwen2.5-14B Pruned + RMSNorm Finetuned (80% Parameters)

This model is a pruned and fine-tuned version of Qwen/Qwen2.5-14B. It retains approximately 80% of the base model's parameters, pruned with a genetic-algorithm search and then recovered via RMSNorm fine-tuning, while maintaining strong performance.

## Model Details

- **Base Model:** Qwen/Qwen2.5-14B
- **Parameter Retention:** ~80%
- **Pruning Method:** genetic algorithm
- **Fine-tuning Method:** RMSNorm calibration
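"RMSNorm calibration" suggests updating only the per-channel RMSNorm scale weights while every other parameter stays frozen. Below is a minimal PyTorch sketch of that setup; the `RMSNorm` module and the toy two-layer model are illustrative stand-ins, not this checkpoint's actual training code:

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Qwen2-style RMSNorm: normalize by root-mean-square, then scale."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))  # the only trainable part
        self.eps = eps

    def forward(self, x):
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        return x * rms * self.weight

# Toy model standing in for the pruned network
model = nn.Sequential(nn.Linear(8, 8), RMSNorm(8))

# Freeze everything, then unfreeze only the RMSNorm scales
for p in model.parameters():
    p.requires_grad = False
for m in model.modules():
    if isinstance(m, RMSNorm):
        m.weight.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable: {trainable} / {total}")  # → trainable: 8 / 80
```

Because only the norm scales are optimized, this kind of calibration touches a tiny fraction of the weights, which is why it can be run cheaply after pruning.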

## Performance

| Metric | Value |
|---|---|
| PPL (before fine-tuning) | 8.63 |
| PPL (after fine-tuning) | 7.11 |
| Improvement | 17.62% |
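For reference, the improvement figure follows directly from the two perplexities; the table's 17.62% presumably reflects the unrounded PPL values:

```python
# Relative perplexity improvement from RMSNorm fine-tuning
ppl_before, ppl_after = 8.63, 7.11
improvement = (ppl_before - ppl_after) / ppl_before * 100
print(f"{improvement:.1f}%")  # → 17.6%
```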

## Performance Comparison

This 80% model is part of a family of pruned models with different compression ratios:

| Model | PPL (after FT) |
|---|---|
| 50% params | 16.94 |
| 70% params | 9.60 |
| 80% params | 7.11 |
| 90% params | 6.15 |
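A genetic-algorithm pruning search like the one behind these checkpoints can be sketched as evolving binary keep/drop masks over prunable units toward a target retention ratio. Everything below (unit count, population size, and especially the fitness proxy) is illustrative; a real search would score each mask by calibration loss or perplexity of the masked model:

```python
import random

N_UNITS = 16     # number of prunable units (illustrative)
RETENTION = 0.8  # target: keep ~80% of units
POP, GENS = 20, 30

def fitness(mask):
    # Toy proxy: reward masks whose kept fraction is near the target.
    # A real run would evaluate the pruned model on calibration data.
    kept = sum(mask) / N_UNITS
    return -abs(kept - RETENTION)

def crossover(a, b):
    # Single-point crossover of two parent masks
    cut = random.randrange(1, N_UNITS)
    return a[:cut] + b[cut:]

def mutate(mask, p=0.05):
    # Flip each keep/drop bit with small probability
    return [1 - g if random.random() < p else g for g in mask]

random.seed(0)
pop = [[random.randint(0, 1) for _ in range(N_UNITS)] for _ in range(POP)]
for _ in range(GENS):
    pop.sort(key=fitness, reverse=True)
    elite = pop[: POP // 2]  # elitism: keep the best half
    children = [mutate(crossover(*random.sample(elite, 2)))
                for _ in range(POP - len(elite))]
    pop = elite + children

best = max(pop, key=fitness)
print(f"kept {sum(best)}/{N_UNITS} units")
```

With elitism the best fitness never regresses, so the population converges toward masks at the requested retention ratio.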

## Usage

```python
import torch
from transformers import AutoTokenizer

# Load the tokenizer (standard Qwen2.5-14B tokenizer)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-14B")

# Note: this model uses a custom pruned architecture;
# custom loading code is required to use it.
```
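Once the pruned architecture is instantiated (custom code, not shown here), the released `model_weights.pt` state dict can be restored with the usual `torch.load` / `load_state_dict` pattern. A self-contained sketch of that pattern on a stand-in module; the demo file name and the `nn.Linear` stand-in are assumptions, not the real checkpoint:

```python
import torch
import torch.nn as nn

# Stand-in for the custom pruned model
pruned_model = nn.Linear(4, 4)

# Simulate the released checkpoint: a full state dict saved to disk
torch.save(pruned_model.state_dict(), "model_weights_demo.pt")

# Restore weights into the instantiated architecture
state_dict = torch.load("model_weights_demo.pt", map_location="cpu")
pruned_model.load_state_dict(state_dict, strict=True)
print(sorted(state_dict.keys()))  # → ['bias', 'weight']
```

With `strict=True`, loading fails loudly if the instantiated architecture's shapes do not match the pruned checkpoint, which is a useful sanity check for a non-standard architecture.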

## Files Included

- `model_weights.pt`: full model state dict
- `README.md`: this documentation

## License

Apache 2.0 (inherited from base model)

## Related Models

- Qwen/Qwen2.5-14B: base model
- Other compression ratios: 50%, 70%, and 90% versions are available in this account