jina-embeddings-v5-text-small-retrieval-GGUF

GGUF quantizations of jina-embeddings-v5-text-small-retrieval, produced with llama.cpp. A 677M-parameter multilingual embedding model, quantized for efficient inference.

Elastic Inference Service | ArXiv | Blog

We highly recommend reading this blog post first for more technical details and instructions for the customized llama.cpp build.

Overview

jina-embeddings-v5-text Architecture

jina-embeddings-v5-text-small-retrieval is a task-specific embedding model for retrieval, part of the jina-embeddings-v5-text model family.
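As a rough illustration of what "retrieval" means for an embedding model, the sketch below ranks documents by their similarity to a query embedding. It assumes cosine similarity as the scoring function, which is a common choice but is not stated on this card. The helper names `cosine_similarity` and `rank_documents` and the toy 3-dimensional vectors are hypothetical stand-ins for real 1024-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy vectors (assumption): real embeddings would come from the model.
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
ranking = rank_documents(query, docs)
```

Here `ranking` lists the document indices in descending order of similarity to the query.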

| Feature | Value |
|---|---|
| Parameters | 677M |
| Task | retrieval |
| Embedding Dimension | 1024 |
| Matryoshka Dimensions | 32, 64, 128, 256, 512, 768, 1024 |
| Pooling Strategy | Last-token pooling |
| Base Model | jina-embeddings-v5-text-small |
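The Matryoshka dimensions listed above mean an embedding can be shortened to one of the supported sizes. A minimal sketch of the usual approach follows: keep the leading components, then re-normalize to unit length. The re-normalization step and the helper name `truncate_embedding` are assumptions rather than something this card specifies, and the input vector is synthetic.

```python
import math

# Supported sizes, taken from the feature table on this card.
MATRYOSHKA_DIMS = (32, 64, 128, 256, 512, 768, 1024)


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale to unit L2 norm."""
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"unsupported Matryoshka dimension: {dim}")
    if len(vec) < dim:
        raise ValueError("embedding is shorter than the requested dimension")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # nothing to rescale
    return [x / norm for x in head]


# Synthetic 1024-dimensional vector (assumption), standing in for model output.
full = [float(i % 7 - 3) for i in range(1024)]
small = truncate_embedding(full, 256)
```

Smaller dimensions reduce storage and speed up similarity search, usually at some cost in retrieval quality.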

MMTEB Multilingual Benchmark

MTEB English Benchmark

Retrieval Benchmark Results

Usage with llama.cpp

via Elastic Inference Service

Elastic Inference Service (EIS) is the fastest way to use v5-text in production. It provides managed embedding inference with built-in scaling, so you can generate embeddings directly within your Elastic deployment.

PUT _inference/text_embedding/jina-v5
{
  "service": "elastic",
  "service_settings": {
    "model_id": "jina-embeddings-v5-text-small"
  }
}

See the Elastic Inference Service documentation for setup details.
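Once an endpoint such as `jina-v5` exists, embeddings are requested from it. The sketch below only builds such a request and does not send it. It assumes the general Elasticsearch inference API shape, `POST _inference/<task_type>/<inference_id>` with an `input` field. The helper name `build_embedding_request` is hypothetical, and the exact request and response format should be checked against the EIS documentation.

```python
import json


def build_embedding_request(inference_id, texts):
    """Build (method, path, body) for a text_embedding inference call.

    Assumption: the endpoint accepts a JSON body with an "input" list.
    """
    path = f"_inference/text_embedding/{inference_id}"
    body = json.dumps({"input": list(texts)})
    return "POST", path, body


# "jina-v5" matches the inference ID used in the PUT request above.
method, path, body = build_embedding_request("jina-v5", ["Your text here"])
```

The resulting method, path, and body can be passed to whichever HTTP or Elasticsearch client your deployment uses.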

# Build llama.cpp (upstream)
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp && cmake -B build && cmake --build build --config Release

# Run embedding
./build/bin/llama-embedding -m jina-embeddings-v5-text-small-retrieval-Q8_0.gguf \
  --pooling last -p "Your text here"
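The `--pooling last` flag matches the last-token pooling strategy listed in the feature table. As a conceptual sketch only, and not llama.cpp's implementation, last-token pooling takes the hidden state of the final non-padding token as the sequence embedding. The function name `last_token_pool` and the toy hidden states are hypothetical.

```python
def last_token_pool(hidden_states, attention_mask):
    """Return the hidden state of the last real (non-padding) token.

    hidden_states: list of per-token vectors.
    attention_mask: 1 for real tokens, 0 for padding, same length.
    """
    if len(hidden_states) != len(attention_mask):
        raise ValueError("hidden_states and attention_mask differ in length")
    last = None
    for idx, is_real in enumerate(attention_mask):
        if is_real:
            last = idx  # remember the most recent real token
    if last is None:
        raise ValueError("sequence contains no real tokens")
    return list(hidden_states[last])


# Toy sequence (assumption): three real tokens followed by one padding token.
states = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.0, 0.0]]
mask = [1, 1, 1, 0]
pooled = last_token_pool(states, mask)
```

In this toy case the third token's vector is selected, because the fourth position is padding.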

License

CC-BY-NC-4.0. For commercial use, please contact us.
