---
library_name: mlx
license: other
license_name: lfm1.0
license_link: LICENSE
language:
- en
pipeline_tag: text-generation
tags:
- liquid
- lfm2
- moe
- mlx
base_model: LiquidAI/LFM2-24B-A2B
---
# LFM2-24B-A2B-MLX-8bit
MLX export of [LFM2-24B-A2B](https://huggingface.co/LiquidAI/LFM2-24B-A2B) for Apple Silicon inference.
## Model Details
| Property | Value |
|----------|-------|
| Total Parameters | 24B |
| Active Parameters | ~2B per token |
| Architecture | Mixture of Experts (64 experts, top-4) |
| Layers | 40 (30 conv + 10 full attention) |
| Precision | 8-bit |
| Group Size | 64 |
| Size | 23.6 GB |
| Context Length | 128K |
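
The size in the table can be sanity-checked with a rough back-of-the-envelope estimate. The sketch below is not taken from the export script. It assumes every parameter is quantized to 8 bits and that each group of 64 weights also stores a 16-bit scale and a 16-bit bias; the function name `quantized_size_bytes` and its defaults are hypothetical.

```python
def quantized_size_bytes(n_params, bits=8, group_size=64, scale_bits=16, bias_bits=16):
    """Rough footprint of group-wise quantized weights (illustrative only)."""
    # Assumption: one scale and one bias are stored per group of weights,
    # so their cost is spread evenly across the group.
    bits_per_weight = bits + (scale_bits + bias_bits) / group_size
    return n_params * bits_per_weight / 8


# 24e9 is the rounded parameter count from the table, not an exact figure.
size = quantized_size_bytes(24e9)
print(f"{size / 1e9:.1f} GB, {size / 2**30:.1f} GiB")
```

Under these assumptions the estimate lands in the same range as the listed size. An exact match is not expected, because the parameter count is rounded and some tensors may be stored at a different precision.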
## Recommended Sampling Parameters
| Parameter | Value |
|-----------|-------|
| temperature | 0.1 |
| top_k | 50 |
| top_p | 0.1 |
| repetition_penalty | 1.05 |
| max_tokens | 512 |
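
To give a feel for how these values interact, here is a minimal pure-Python sketch of temperature, top-k, and top-p filtering on a toy distribution. It is an illustration only, not mlx-lm's implementation; the helper names `filter_probs` and `sample`, the toy logits, and the order in which the filters are applied are all assumptions.

```python
import math
import random


def filter_probs(logits, temp=0.1, top_k=50, top_p=0.1):
    """Return {token_index: probability} after temperature, top-k, and top-p filtering."""
    # Temperature scaling: values below 1 sharpen the distribution.
    scaled = [x / temp for x in logits]
    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Top-k: keep only the k most probable tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Renormalize over the surviving tokens.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample(logits, rng=random, **kwargs):
    """Draw one token index from the filtered distribution."""
    dist = filter_probs(logits, **kwargs)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]


toy_logits = [2.0, 1.5, 0.5, -1.0]  # hypothetical 4-token vocabulary
print(filter_probs(toy_logits))  # recommended settings
print(filter_probs(toy_logits, temp=1.0, top_k=3, top_p=0.9))  # looser settings
```

On this toy input the recommended settings leave only the single most likely token, so decoding is close to greedy. The looser settings in the second call keep three candidates.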
## Use with mlx
```bash
pip install mlx-lm
```
```python
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler, make_logits_processors

# Download (if needed) and load the quantized weights and tokenizer.
model, tokenizer = load("LiquidAI/LFM2-24B-A2B-MLX-8bit")

prompt = "What is the capital of France?"

# Wrap the prompt in the chat template when the tokenizer provides one.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

# Apply the recommended sampling parameters.
sampler = make_sampler(temp=0.1, top_k=50, top_p=0.1)
logits_processors = make_logits_processors(repetition_penalty=1.05)

response = generate(
    model,
    tokenizer,
    prompt=prompt,
    max_tokens=512,
    sampler=sampler,
    logits_processors=logits_processors,
    verbose=True,  # print the generated text as it is produced
)
```
## License
This model is released under the [LFM 1.0 License](LICENSE).