Hugging Face
🤝 Open to Collab
90.0 TFLOPS
s3nh
88
18
285
257 followers · 118 following
s3nhxx
s3nh
AI & ML interests
Quantization, LLMs, Deep Learning for good. Follow me if you like my work. Patreon.com/s3nh
Recent Activity
reacted to mmhamdy's post with 🧠 1 day ago
It has been more than a decade now since the knowledge distillation paper came out.
Knowledge Distillation (KD) is one of my favorite topics, but I have to confess that I'm not a huge fan of the term because I find it confusing (or at least, it has become so over time).
The idea behind KD is not novel; it was there almost a decade before the paper came out (and arguably even a decade before that, back to 1990-91). But this paper is the one that clicked, the one that made the topic much more popular and introduced it to a broader audience. First, the timing and the authors played a big role: we have Geoffrey Hinton, Oriol Vinyals, and Jeff Dean here. And second, Geoffrey Hinton is really good at idea branding: Model compression?! No, no, no! Let's call it "Knowledge Distillation" and use evocative terms such as "Dark Knowledge" to describe what is being transferred.
It's a great name, but as time has passed, the term has become a bit of a relic. KD is no longer solely about compression (KD used to be introduced as a method for model compression, but now model compression is just one application of KD). The other thing is that the word "distillation" implies some sort of potency here, that the student is somehow more powerful than the teacher, which is not the case (though many counterarguments could be made, for example, that the student is more powerful compared to another model trained with no teacher).
Nevertheless, the paper is incredibly well-written, short, and fun to read. It's one of the few papers that I have read several times. Check it out, and maybe share your thoughts on the topic with us here!
If you had to choose another name for Knowledge Distillation, what would it be?
upvoted a paper 13 days ago
Sumi: Open Uniform Diffusion Language Model from Scratch
liked a model 14 days ago
DJLougen/Qwen3.6-35B-A3B-REAP-90pct-GGUF
s3nh's activity
liked a model 14 days ago
DJLougen/Qwen3.6-35B-A3B-REAP-90pct-GGUF
Text Generation • 6B • Updated 17 days ago • 4.6k • 14
liked a model 23 days ago
merve/rf-detr-mobile-ui
Object Detection • 33.4M • Updated 26 days ago • 49 • 1
liked 4 models about 1 month ago
Efficient-Large-Model/LongLive-2.0-5B
Text-to-Video • Updated May 19 • 22
vrgamedevgirl84/LTX_2.3_Fantasy_Puppet_Style_LoRa
Text-to-Video • Updated Apr 24 • 348 • 2
vrgamedevgirl84/LTX_2.3_Soft_Enhance_Style_LoRa
Text-to-Video • Updated Apr 24 • 2.92k • 34
skinnyctax/Intern-S2-Preview-FP8-GGUF
Text Generation • Updated May 16 • 1
liked 9 models about 2 months ago
OmerHagage/ltx2-ume-pixelart-lora
Text-to-Video • Updated May 12 • 2
Pranavz/MythoMax-L2-13b-heretic
13B • Updated May 7 • 4 • 1
unsloth/Qwen3.6-27B-MTP-GGUF
Image-Text-to-Text • 27B • Updated May 26 • 920k • 909
unsloth/Qwen3.6-35B-A3B-MTP-GGUF
Image-Text-to-Text • 36B • Updated May 20 • 754k • 609
Osye/mlp-surgery-restored-top30
3B • Updated May 7 • 3 • 2
Osye/mlp-surgery-restored-specificity-top10
3B • Updated May 7 • 3 • 1
SulphurAI/Sulphur-2-base
Text-to-Video • 9B • Updated 7 days ago • 739k • 1.83k
poolside/Laguna-XS.2
Text Generation • 33B • Updated about 4 hours ago • 87.8k • 317
Xerv-AI/MAXWELL
Text Generation • 2B • Updated May 23 • 91 • 5
liked 3 models 2 months ago
stamsam/FrankenGemma4
Text Generation • 1B • Updated Apr 20 • 48 • 6
YoAbriel/KodaLite-1.3B
Text Generation • 1B • Updated May 4 • 33 • 3
tencent/HY-Embodied-0.5
Image-Text-to-Text • 4B • Updated Apr 14 • 464 • 910
liked 2 models 3 months ago
netflix/void-model
Video-to-Video • Updated Apr 6 • 950
HuggingFaceTB/SmolLM3-3B
Text Generation • 3B • Updated Sep 10, 2025 • 680k • 981