How to use from vLLM
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "LiquidAI/LFM2.5-VL-450M-GGUF"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "LiquidAI/LFM2.5-VL-450M-GGUF",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
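To send different prompts or images without editing the JSON by hand, the request body above can be generated by a small shell function. This is a minimal sketch, not part of vLLM: `build_payload` is a hypothetical helper name, the example image URL is a placeholder, and the function assumes neither argument contains double quotes or backslashes because it does no JSON escaping.

```shell
#!/bin/sh
# Hypothetical helper (not part of vLLM): print a chat-completions payload
# for one text prompt ($1) and one image URL ($2).
# Assumption: neither argument contains double quotes or backslashes,
# since no JSON escaping is done here.
build_payload() {
  cat <<EOF
{
  "model": "LiquidAI/LFM2.5-VL-450M-GGUF",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "$1" },
        { "type": "image_url", "image_url": { "url": "$2" } }
      ]
    }
  ]
}
EOF
}

# Example call, left commented out because it needs the running server:
# build_payload "Describe this image in one sentence." "https://example.com/cat.jpg" |
#   curl -s -X POST "http://localhost:8000/v1/chat/completions" \
#     -H "Content-Type: application/json" --data @-
```

Piping into `curl --data @-` makes curl read the request body from standard input, so the same function can be reused for any prompt and image pair.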
Use Docker
docker model run hf.co/LiquidAI/LFM2.5-VL-450M-GGUF:
Liquid AI
Try LFM • Docs • LEAP • Discord

LFM2.5-VL-450M-GGUF

LFM2.5-VL is a new generation of vision-language models developed by Liquid AI, designed for edge AI and on-device deployment. It sets a new standard for quality, speed, and memory efficiency.

Find more details in the original model card: https://huggingface.co/LiquidAI/LFM2.5-VL-450M

๐Ÿƒ How to run LFM2.5-VL

Example usage with llama.cpp:

full precision (F16/F16):

llama-mtmd-cli -hf LiquidAI/LFM2.5-VL-450M-GGUF:F16

fastest inference (Q4_0/Q8_0):

llama-mtmd-cli -hf LiquidAI/LFM2.5-VL-450M-GGUF:Q4_0
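The commands above start the CLI without an image or prompt. For a one-shot, non-interactive run, an image and a prompt can be passed on the command line. The sketch below assumes a recent llama.cpp build in which `llama-mtmd-cli` accepts `--image` and `-p`; `describe_image` and the `QUANT` variable are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper (the name describe_image is ours, not llama.cpp's).
# $1 = path to a local image, $2 = text prompt.
# QUANT selects the quantization tag; it defaults to Q4_0 as in the example above.
describe_image() {
  llama-mtmd-cli -hf "LiquidAI/LFM2.5-VL-450M-GGUF:${QUANT:-Q4_0}" \
    --image "$1" \
    -p "$2"
}

# Example call, commented out because it downloads the model on first use:
# describe_image ./photo.jpg "Describe this image in one sentence."
```

Setting `QUANT=F16` before the call switches to the full-precision files without changing the function.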

📬 Contact

Format: GGUF. Model size: 0.4B params. Architecture: lfm2.