---
library_name: mlx
license: other
license_name: lfm1.0
license_link: LICENSE
language:
- en
pipeline_tag: text-generation
tags:
- liquid
- lfm2
- moe
- mlx
base_model: LiquidAI/LFM2-24B-A2B
---


# LFM2-24B-A2B-MLX-8bit

MLX export of [LFM2-24B-A2B](https://huggingface.co/LiquidAI/LFM2-24B-A2B) for Apple Silicon inference.

## Model Details

| Property | Value |
|----------|-------|
| Total Parameters | 24B |
| Active Parameters | ~2B per token |
| Architecture | Mixture of Experts (64 experts, top-4) |
| Layers | 40 (30 conv + 10 full attention) |
| Precision | 8-bit |
| Group Size | 64 |
| Size | 23.6 GB |
| Context Length | 128K |

## Recommended Sampling Parameters

| Parameter | Value |
|-----------|-------|
| temperature | 0.1 |
| top_k | 50 |
| top_p | 0.1 |
| repetition_penalty | 1.05 |
| max_tokens | 512 |

## Use with mlx

```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler, make_logits_processors

model, tokenizer = load("LiquidAI/LFM2-24B-A2B-MLX-8bit")

prompt = "What is the capital of France?"

if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

sampler = make_sampler(temp=0.1, top_k=50, top_p=0.1)
logits_processors = make_logits_processors(repetition_penalty=1.05)

response = generate(
    model,
    tokenizer,
    prompt=prompt,
    max_tokens=512,
    sampler=sampler,
    logits_processors=logits_processors,
    verbose=True,
)
```

## License

This model is released under the [LFM 1.0 License](LICENSE).
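
## Note on the sampling parameters

The recommended values (temperature 0.1, top_k 50, top_p 0.1) are all quite restrictive, and it may help to see how they combine. The sketch below is a generic, pure-Python illustration of temperature scaling followed by top-k and top-p filtering. It is not the `mlx_lm` implementation: the function name `filter_candidates`, the toy logits, and the order in which the filters are applied are assumptions made for this example.

```python
import math


def filter_candidates(logits, temperature=0.1, top_k=50, top_p=0.1):
    """Return {token_index: probability} for tokens that survive filtering.

    Hypothetical helper for illustration only; the filter order
    (temperature, then top-k, then top-p) is an assumption.
    """
    # Temperature scaling: values below 1 sharpen the distribution.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax over the scaled logits.
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-k: keep only the k most probable token indices.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    ranked = ranked[:top_k]

    # Top-p: keep the shortest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalise the surviving probabilities so they sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


# Made-up logits for a four-token vocabulary.
toy_logits = [2.0, 1.5, 0.5, -1.0]

# Recommended settings: only the most likely token survives.
survivors = filter_candidates(toy_logits)
print(survivors)  # {0: 1.0}

# Looser settings for comparison: three tokens remain candidates.
relaxed = filter_candidates(toy_logits, temperature=1.0, top_k=50, top_p=0.9)
print(sorted(relaxed))  # [0, 1, 2]
```

On this toy input the recommended settings leave a single candidate, so generation behaves almost greedily. That is consistent with low-variance, repeatable output, though real vocabularies and logits will differ.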