Alloryne/msae-clip-fashion-4096-topk64


MSAE FashionCLIP (4096)

This is a Matryoshka Sparse Autoencoder (MSAE) trained on FashionCLIP image embeddings of the KAGL dataset.

Model details

Input dim: 512 (CLIP ViT-B/32 image embeddings)
Hidden dim: 4096
Sparsity: Matryoshka TopK (k = 64, going by the repository name)
Training dataset size: 39,990 images
Training objective: reconstruction + sparsity
Checkpoints: two, saved at 30 and 100 epochs
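
As a rough illustration of the details above, the following NumPy sketch runs a forward pass of a Matryoshka-style TopK autoencoder with the listed dimensions (512 in, 4096 hidden). It is not the released training code. The nested sparsity levels `ks=(16, 32, 64)`, the random weights, the equal weighting of the per-level losses, and all function names are assumptions made for illustration; only the largest level, 64, is suggested by the repository name.

```python
import numpy as np


def topk_activation(z, k):
    """Apply ReLU, then keep only the k largest entries in each row."""
    z = np.maximum(z, 0.0)
    # Indices of the k largest values per row (unordered within the top k).
    idx = np.argpartition(z, -k, axis=-1)[..., -k:]
    out = np.zeros_like(z)
    np.put_along_axis(out, idx, np.take_along_axis(z, idx, axis=-1), axis=-1)
    return out


def matryoshka_forward(x, W_enc, b_enc, W_dec, b_dec, ks=(16, 32, 64)):
    """Encode once, then reconstruct at several nested sparsity levels.

    Returns the sparse codes and reconstructions for each k, plus the
    mean of the per-level reconstruction errors (assumed equal weights).
    """
    z = (x - b_dec) @ W_enc + b_enc  # shared pre-activations
    codes, recons, losses = [], [], []
    for k in ks:
        h = topk_activation(z, k)    # at most k active features per row
        x_hat = h @ W_dec + b_dec    # linear decoder
        codes.append(h)
        recons.append(x_hat)
        losses.append(np.mean((x - x_hat) ** 2))
    return codes, recons, float(np.mean(losses))


# Hypothetical demo with random parameters and random stand-in embeddings.
rng = np.random.default_rng(0)
d_in, d_hidden = 512, 4096
W_enc = rng.normal(scale=0.02, size=(d_in, d_hidden))
W_dec = rng.normal(scale=0.02, size=(d_hidden, d_in))
b_enc = np.zeros(d_hidden)
b_dec = np.zeros(d_in)
x = rng.normal(size=(4, d_in))

codes, recons, loss = matryoshka_forward(x, W_enc, b_enc, W_dec, b_dec)
print(codes[-1].shape, loss)
```

Because every level selects from the same pre-activations, the features active at a smaller k are also among those active at a larger k, which is what makes the representation nested.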

Usage

This model is intended for interpretability and feature analysis of CLIP embeddings.
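
A minimal sketch of what such feature analysis might look like, assuming the sparse codes have already been computed and stacked into an array of shape (number of images, 4096). The helper names `top_activating_images` and `feature_frequency` and the toy `codes` array are hypothetical and do not come from the model repository.

```python
import numpy as np


def top_activating_images(codes, feature, n=5):
    """Indices of up to n images that activate one feature, strongest first."""
    acts = codes[:, feature]
    order = np.argsort(-acts, kind="stable")[:n]
    # Drop images for which the feature is inactive.
    return order[acts[order] > 0]


def feature_frequency(codes):
    """Fraction of images on which each feature is active."""
    return (codes > 0).mean(axis=0)


# Toy codes: 4 images, 3 features (a real run would have 4096 features).
codes = np.array([
    [0.0, 2.0, 0.0],
    [1.5, 0.0, 0.0],
    [0.5, 3.0, 0.0],
    [0.0, 0.0, 0.0],
])

top = top_activating_images(codes, feature=1, n=2)
freq = feature_frequency(codes)
print(top, freq)
```

Looking at the images returned for a feature is a common first step toward naming it, and the activation frequency helps flag features that never fire or fire on nearly every image.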

Citation

The Matryoshka SAE architecture comes from the following paper: https://arxiv.org/abs/2502.20578

Training used the code from that paper's repository.

Base model: patrickjohncyh/fashion-clip

Paper: Interpreting CLIP with Hierarchical Sparse Autoencoders (arXiv 2502.20578, published Feb 27, 2025)