How to use mku64/Qwen3-Reranker-0.6B-mlx-8Bit with Transformers:
```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("mku64/Qwen3-Reranker-0.6B-mlx-8Bit")
model = AutoModelForCausalLM.from_pretrained("mku64/Qwen3-Reranker-0.6B-mlx-8Bit")
```
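Once loaded, a Qwen3 reranker scores a query–document pair by reading the logits of the "yes"/"no" tokens over a chat-style prompt. The sketch below only builds that prompt string; the template literals are taken from the upstream Qwen3-Reranker card and should be verified there, and the actual scoring call is omitted because it requires the downloaded weights:

```python
# Sketch: build the prompt the reranker scores with yes/no logits.
# NOTE: the template strings below are an assumption based on the upstream
# Qwen3-Reranker documentation; double-check them against the official card.
PREFIX = (
    '<|im_start|>system\nJudge whether the Document meets the requirements '
    'based on the Query and the Instruct provided. Note that the answer can '
    'only be "yes" or "no".<|im_end|>\n<|im_start|>user\n'
)
SUFFIX = "<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\n"

def format_pair(
    query: str,
    document: str,
    instruction: str = "Given a web search query, retrieve relevant passages that answer the query",
) -> str:
    """Return the single-string prompt for one (query, document) pair."""
    return f"{PREFIX}<Instruct>: {instruction}\n<Query>: {query}\n<Document>: {document}{SUFFIX}"

prompt = format_pair(
    "What is the capital of France?",
    "Paris is the capital of France.",
)
```

The resulting string is what you would tokenize and feed to the model; relevance is then the softmax over the "yes" and "no" token logits at the final position.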
How to use mku64/Qwen3-Reranker-0.6B-mlx-8Bit with MLX:
```shell
# Download the model from the Hub
pip install "huggingface_hub[hf_xet]"
huggingface-cli download --local-dir Qwen3-Reranker-0.6B-mlx-8Bit mku64/Qwen3-Reranker-0.6B-mlx-8Bit
```