Instructions for using principled-intelligence/claim-extractor-2B-q-2605 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use principled-intelligence/claim-extractor-2B-q-2605 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="principled-intelligence/claim-extractor-2B-q-2605")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"},
        ],
    },
]
pipe(text=messages)
```

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("principled-intelligence/claim-extractor-2B-q-2605")
model = AutoModelForImageTextToText.from_pretrained("principled-intelligence/claim-extractor-2B-q-2605")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"},
        ],
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use principled-intelligence/claim-extractor-2B-q-2605 with vLLM:
Install from pip and serve the model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "principled-intelligence/claim-extractor-2B-q-2605"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "principled-intelligence/claim-extractor-2B-q-2605",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
- SGLang
How to use principled-intelligence/claim-extractor-2B-q-2605 with SGLang:
Install from pip and serve the model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "principled-intelligence/claim-extractor-2B-q-2605" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "principled-intelligence/claim-extractor-2B-q-2605",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "principled-intelligence/claim-extractor-2B-q-2605" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "principled-intelligence/claim-extractor-2B-q-2605",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
- Unsloth Studio
How to use principled-intelligence/claim-extractor-2B-q-2605 with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```shell
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for principled-intelligence/claim-extractor-2B-q-2605 to start chatting
```
Install Unsloth Studio (Windows)
```powershell
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for principled-intelligence/claim-extractor-2B-q-2605 to start chatting
```
Using HuggingFace Spaces for Unsloth
```shell
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for principled-intelligence/claim-extractor-2B-q-2605 to start chatting
```
Load model with FastModel
```shell
pip install unsloth
```

```python
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    model_name="principled-intelligence/claim-extractor-2B-q-2605",
    max_seq_length=2048,
)
```
- Docker Model Runner
How to use principled-intelligence/claim-extractor-2B-q-2605 with Docker Model Runner:
```shell
docker model run hf.co/principled-intelligence/claim-extractor-2B-q-2605
```
ClaimExtractor — Turning AI Conversations into Auditable Claims and Intents
ClaimExtractor is a small language model (SLM) designed for turning AI conversations into atomic, decontextualized claims and intents that downstream systems can audit, fact-check, route, and monitor.
Instead of optimizing for open-ended generation, ClaimExtractor is trained to make reliable, consistent, low-latency structured extractions. Being small compared to large language models (LLMs) is not a limitation, but an intentional design choice: ClaimExtractor is built to be cheaper, faster, and easier to deploy as a per-turn extraction layer in production systems.
Resources
- Quickstart with Colab: Link to Colab Notebook
- Use ClaimExtractor with Orbitals: Link to our GitHub repo
- Learn more about ClaimExtractor: Link to our blog article
What ClaimExtractor does
Given a conversation (single message or multi-turn) and — optionally — a description of the AI service, ClaimExtractor extracts two structured outputs from the last message of the conversation:
- Claims — self-contained, decontextualized factual statements, labelled with one of four subtypes:
- Factoid — verifiable facts about the world (dates, prices, procedures, URLs, identifiers, …)
- Capability — what the AI service can or cannot do (scope, features, limitations)
- User Assertion — facts the user states about themselves (e.g., "My ID expired")
- Unverifiable — common knowledge, subjective statements, marketing language, visual/UI references
- Intents — explicit goals, requests, or actions the user wants to accomplish (extracted from user messages only)
Each extraction is decontextualized: it stands on its own without reference to the surrounding conversation, so it can be consumed directly by fact-checkers, intent routers, hallucination detectors, and audit pipelines without re-parsing history.
ClaimExtractor can be used as a per-turn extraction layer in enterprise AI deployments to convert free-form assistant and user text into structured, comparable units of information.
Show me the code
Install
```shell
pip install 'orbitals[claim-extractor-vllm]'

# Or, if you'd like to use hf as a backend
# pip install 'orbitals[claim-extractor-hf]'
```
Use
```python
from orbitals.claim_extractor import ClaimExtractor

ce = ClaimExtractor(backend="vllm", model="principled-intelligence/claim-extractor-2B-q-2605")

ai_service_description = """
You are a virtual assistant for a parcel delivery service.
You can only answer questions about package tracking.
"""

assistant_message = (
    "Your package with tracking number 1234567890 is currently in transit and "
    "is expected to be delivered on December 12, 2025. If you want, I can also "
    "notify you when it is out for delivery."
)

result = ce.extract(
    assistant_message,
    ai_service_description=ai_service_description,
)

for claim in result.extractions.claims:
    print(f"[{claim.subtype}] {claim.content}")

for intent in result.extractions.intents:
    print(f"[Intent] {intent.content}")

# [Factoid] The tracking number of the user's package is 1234567890
# [Factoid] The user's package is expected to be delivered on December 12, 2025
# [Capability] The parcel delivery virtual assistant can notify the user when the package is out for delivery
```
Structured AI Service Description (Suggested)
```python
from orbitals.types import AIServiceDescription
from orbitals.claim_extractor import ClaimExtractor

ce = ClaimExtractor(backend="vllm", model="principled-intelligence/claim-extractor-2B-q-2605")

ai_service_description = AIServiceDescription(
    identity_role=(
        "You are PackAssist, a virtual assistant designed to help users understand and "
        "track their parcel shipments. Your objective is to interpret tracking data and "
        "guide users through delivery-related questions."
    ),
    context=(
        "The service operates within a parcel-delivery environment where users interact "
        "to check the status of shipments sent domestically or internationally. Typical "
        "users are customers awaiting deliveries or sending parcels."
    ),
    functionalities=(
        "Retrieve tracking updates; explain the meaning of tracking events; provide "
        "estimated delivery windows; assist users in understanding delays or routing steps."
    ),
    knowledge_scope=(
        "Public tracking information, standard logistics workflows, typical transit times, "
        "and general procedures for parcel movement."
    ),
    principles=(
        "Cannot modify shipments, initiate refunds, open claims, contact drivers, or view "
        "internal logistics notes. Limited strictly to interpreting publicly available "
        "tracking data."
    ),
    website_url="https://www.trackmate-delivery.com",
)

assistant_message = (
    "Your package with tracking number 1234567890 is currently in transit and "
    "is expected to be delivered on December 12, 2025. If you want, I can also "
    "notify you when it is out for delivery."
)

result = ce.extract(assistant_message, ai_service_description=ai_service_description)

for claim in result.extractions.claims:
    print(f"[{claim.subtype}] {claim.content}")
```
Providing a structured AIServiceDescription (rather than a free-form string) noticeably improves extraction quality, especially the precision of Capability claims and the grounding of Factoid claims in service-specific vocabulary.
ClaimExtractor model family
Our initial family of ClaimExtractor models includes:
- `claim-extractor-4B-q-2605`: open ClaimExtractor model based on Qwen3.5-4B. Highest extraction quality.
- `claim-extractor-2B-q-2605`: open ClaimExtractor model based on the Qwen3.5-2B backbone. Trades a small amount of quality for substantially lower latency and memory pressure.
How ClaimExtractor works
ClaimExtractor takes two inputs and produces one structured output.
Inputs:
- A conversation — anything from a single message to a multi-turn exchange. Claims and intents are extracted from the last message; earlier turns are used as context for decontextualization. The last message can be a user message (claims + intents) or an assistant message (claims only — assistants don't have intents).
- An AI Service Description: a free-form string or a structured `AIServiceDescription` object. Optional, but it strongly improves extraction quality.
Output:
A `ClaimExtractorOutput` whose `extractions` field contains:
- `claims`: a list of `Claim(subtype, content, evidences)` objects
- `intents`: a list of `Intent(content, evidences)` objects
Note on evidences. Every `Claim` and `Intent` is designed to carry a list of `evidences`: verbatim excerpts from the source message that support the extraction. The current open release does not populate evidences yet; that capability ships with the next model release. The output schema will not change; `evidences` will simply start being populated.
Intended use
ClaimExtractor is intended for:
- Fact-checking pipelines: collect `Factoid` claims and verify them against authoritative data sources (CRM, knowledge bases, product catalogs).
- Capability auditing: check that `Capability` claims made by an assistant match what the underlying system can actually do (preventing "phantom promises" that the assistant cannot fulfil).
- Intent routing: use the `Intent` extracted from the last user message as the routing signal for downstream tools, agents, or human handoff.
- Compliance & brand monitoring: flag `Unverifiable` claims that pattern-match banned marketing language or unsupported product claims.
- Long-term analytics: store all extractions and run trend analysis: most common user intents, drift in capability claims over time, per-service hallucination rates.
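As an illustrative sketch, extractions can be fanned out to these pipelines by subtype. The `Claim` stand-in below only mirrors the documented fields (`subtype`, `content`, `evidences`); the routing targets are hypothetical names, not part of the orbitals API:

```python
from dataclasses import dataclass, field

# Minimal stand-in mirroring the documented Claim fields (subtype, content, evidences).
@dataclass
class Claim:
    subtype: str
    content: str
    evidences: list = field(default_factory=list)

# Hypothetical dispatch table: claim subtype -> downstream pipeline name.
ROUTES = {
    "Factoid": "fact_checker",
    "Capability": "capability_audit",
    "User Assertion": "crm_sync",
    "Unverifiable": "brand_monitoring",
}

def route_claims(claims):
    """Group extracted claims by the downstream pipeline that should consume them."""
    buckets = {target: [] for target in ROUTES.values()}
    for claim in claims:
        buckets[ROUTES[claim.subtype]].append(claim.content)
    return buckets

claims = [
    Claim("Factoid", "The user's package is expected to be delivered on December 12, 2025"),
    Claim("Capability", "The assistant can notify the user when the package is out for delivery"),
]
buckets = route_claims(claims)
print(len(buckets["fact_checker"]), len(buckets["capability_audit"]))  # 1 1
```

Because claims are already decontextualized, each bucket can be handed to its pipeline without passing along conversation history.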
ClaimExtractor pairs naturally with ScopeGuard: ScopeGuard decides whether a user request belongs to your AI service; ClaimExtractor characterizes what is actually being said once it does.
Inference speed and deployment
ClaimExtractor is designed to run inline on every conversation turn, so latency and throughput are first-class concerns.
The 4B size (claim-extractor-4B-q-2605) was chosen to fit comfortably on a single consumer-grade GPU (e.g., RTX 4090, widely available on cloud marketplaces). The orbitals integration enables vLLM with prefix caching, MTP speculative decoding, and language-model-only mode by default — tuned for the shape of claim-extraction traffic (long, repeated system prompts; short per-turn inputs).
The 2B variant is recommended when throughput or memory pressure dominate over absolute extraction quality.
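Outside the orbitals integration, a comparable setup can be approximated directly with vLLM. For example, prefix caching (which exploits the long, repeated system prompt shape described above) can be enabled explicitly when serving; the flags below are standard vLLM options, not an orbitals-specific configuration, and the context length is an illustrative value:

```shell
# Serve the extractor with prefix caching so the repeated system prompt
# is computed once and reused across per-turn requests.
vllm serve "principled-intelligence/claim-extractor-2B-q-2605" \
  --enable-prefix-caching \
  --max-model-len 4096
```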
Limitations
- ClaimExtractor is designed for structured extraction, not for open-ended generation.
- The current release does not populate `evidences` (see the note above). Plan downstream consumers around an empty `evidences` list for now.
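A downstream consumer can be written defensively so it keeps working once evidences start being populated. A minimal sketch, assuming only the documented `content` and `evidences` fields:

```python
from types import SimpleNamespace

def supporting_text(claim):
    """Return the best available source text for a claim.

    Prefers verbatim evidence excerpts once `evidences` is populated;
    falls back to the claim content while the list is still empty.
    """
    evidences = getattr(claim, "evidences", None) or []
    if evidences:
        return " ... ".join(evidences)
    return claim.content

# Works with any object exposing `content` and `evidences`.
empty = SimpleNamespace(content="The package ships on December 12, 2025", evidences=[])
print(supporting_text(empty))  # The package ships on December 12, 2025
```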
Want to integrate ClaimExtractor to govern your AI? Get in touch!
If you are thinking about integrating ClaimExtractor into your pipeline to fact-check, audit, or route AI conversations, we can help. Contact us directly or write to orbitals@principled-intelligence.com to learn more about ClaimExtractor Pro or how we can support you.
Built with ❤️ by Principled Intelligence