Instructions for using principled-intelligence/claim-extractor-2B-q-2605 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use principled-intelligence/claim-extractor-2B-q-2605 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="principled-intelligence/claim-extractor-2B-q-2605")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"},
        ],
    },
]
pipe(text=messages)
```

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("principled-intelligence/claim-extractor-2B-q-2605")
model = AutoModelForImageTextToText.from_pretrained("principled-intelligence/claim-extractor-2B-q-2605")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"},
        ],
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use principled-intelligence/claim-extractor-2B-q-2605 with vLLM:
Install from pip and serve the model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "principled-intelligence/claim-extractor-2B-q-2605"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "principled-intelligence/claim-extractor-2B-q-2605",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
- SGLang
How to use principled-intelligence/claim-extractor-2B-q-2605 with SGLang:
Install from pip and serve the model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "principled-intelligence/claim-extractor-2B-q-2605" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "principled-intelligence/claim-extractor-2B-q-2605",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "principled-intelligence/claim-extractor-2B-q-2605" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "principled-intelligence/claim-extractor-2B-q-2605",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
- Unsloth Studio
How to use principled-intelligence/claim-extractor-2B-q-2605 with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```shell
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for principled-intelligence/claim-extractor-2B-q-2605 to start chatting
```
Install Unsloth Studio (Windows)
```powershell
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for principled-intelligence/claim-extractor-2B-q-2605 to start chatting
```
Using HuggingFace Spaces for Unsloth
```shell
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for principled-intelligence/claim-extractor-2B-q-2605 to start chatting
```
Load model with FastModel
```shell
pip install unsloth
```

```python
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    model_name="principled-intelligence/claim-extractor-2B-q-2605",
    max_seq_length=2048,
)
```
- Docker Model Runner
How to use principled-intelligence/claim-extractor-2B-q-2605 with Docker Model Runner:
```shell
docker model run hf.co/principled-intelligence/claim-extractor-2B-q-2605
```
ClaimExtractor — Turning AI Conversations into Auditable Claims and Intents
ClaimExtractor is a small language model (SLM) designed for turning AI conversations into atomic, decontextualized claims and intents that downstream systems can audit, fact-check, route, and monitor.
Instead of optimizing for open-ended generation, ClaimExtractor is trained to make reliable, consistent, low-latency structured extractions. Being small compared to large language models (LLMs) is not a limitation, but an intentional design choice: ClaimExtractor is built to be cheaper, faster, and easier to deploy as a per-turn extraction layer in production systems.
Resources
- Quickstart with Colab: Link to Colab Notebook
- Use ClaimExtractor with Orbitals: Link to our GitHub repo
- Learn more about ClaimExtractor: Link to our blog article
What ClaimExtractor does
Given a conversation (single message or multi-turn) and — optionally — a description of the AI service, ClaimExtractor extracts two structured outputs from the last message of the conversation:
- Claims — self-contained, decontextualized factual statements, labelled with one of four subtypes:
- Factoid — verifiable facts about the world (dates, prices, procedures, URLs, identifiers, …)
- Capability — what the AI service can or cannot do (scope, features, limitations)
- User Assertion — facts the user states about themselves (e.g., "My ID expired")
- Unverifiable — common knowledge, subjective statements, marketing language, visual/UI references
- Intents — explicit goals, requests, or actions the user wants to accomplish (extracted from user messages only)
Each extraction is decontextualized: it stands on its own without reference to the surrounding conversation, so it can be consumed directly by fact-checkers, intent routers, hallucination detectors, and audit pipelines without re-parsing history.
ClaimExtractor can be used as a per-turn extraction layer in enterprise AI deployments to convert free-form assistant and user text into structured, comparable units of information.
Show me the code
Install
```shell
pip install 'orbitals[claim-extractor-vllm]'

# Or, if you'd like to use hf as a backend
# pip install 'orbitals[claim-extractor-hf]'
```
Use
```python
from orbitals.claim_extractor import ClaimExtractor

ce = ClaimExtractor(backend="vllm", model="principled-intelligence/claim-extractor-2B-q-2605")

ai_service_description = """
You are a virtual assistant for a parcel delivery service.
You can only answer questions about package tracking.
"""

assistant_message = (
    "Your package with tracking number 1234567890 is currently in transit and "
    "is expected to be delivered on December 12, 2025. If you want, I can also "
    "notify you when it is out for delivery."
)

result = ce.extract(
    assistant_message,
    ai_service_description=ai_service_description,
)

for claim in result.extractions.claims:
    print(f"[{claim.subtype}] {claim.content}")

for intent in result.extractions.intents:
    print(f"[Intent] {intent.content}")

# [Factoid] The tracking number of the user's package is 1234567890
# [Factoid] The user's package is expected to be delivered on December 12, 2025
# [Capability] The parcel delivery virtual assistant can notify the user when the package is out for delivery
```
Structured AI Service Description (Suggested)
```python
from orbitals.types import AIServiceDescription
from orbitals.claim_extractor import ClaimExtractor

ce = ClaimExtractor(backend="vllm", model="principled-intelligence/claim-extractor-2B-q-2605")

ai_service_description = AIServiceDescription(
    identity_role=(
        "You are PackAssist, a virtual assistant designed to help users understand and "
        "track their parcel shipments. Your objective is to interpret tracking data and "
        "guide users through delivery-related questions."
    ),
    context=(
        "The service operates within a parcel-delivery environment where users interact "
        "to check the status of shipments sent domestically or internationally. Typical "
        "users are customers awaiting deliveries or sending parcels."
    ),
    functionalities=(
        "Retrieve tracking updates; explain the meaning of tracking events; provide "
        "estimated delivery windows; assist users in understanding delays or routing steps."
    ),
    knowledge_scope=(
        "Public tracking information, standard logistics workflows, typical transit times, "
        "and general procedures for parcel movement."
    ),
    principles=(
        "Cannot modify shipments, initiate refunds, open claims, contact drivers, or view "
        "internal logistics notes. Limited strictly to interpreting publicly available "
        "tracking data."
    ),
    website_url="https://www.trackmate-delivery.com",
)

assistant_message = (
    "Your package with tracking number 1234567890 is currently in transit and "
    "is expected to be delivered on December 12, 2025. If you want, I can also "
    "notify you when it is out for delivery."
)

result = ce.extract(assistant_message, ai_service_description=ai_service_description)

for claim in result.extractions.claims:
    print(f"[{claim.subtype}] {claim.content}")
```
Providing a structured AIServiceDescription (rather than a free-form string) noticeably improves extraction quality, especially the precision of Capability claims and the grounding of Factoid claims in service-specific vocabulary.
ClaimExtractor model family
Our initial family of ClaimExtractor models includes:
- `claim-extractor-4B-q-2605`: open ClaimExtractor model based on Qwen3.5-4B. Highest extraction quality.
- `claim-extractor-2B-q-2605`: open ClaimExtractor model based on the Qwen3.5-2B backbone. Trades a small amount of quality for substantially lower latency and memory pressure.
How ClaimExtractor works
ClaimExtractor takes two inputs and produces one structured output.
Inputs:
- A conversation — anything from a single message to a multi-turn exchange. Claims and intents are extracted from the last message; earlier turns are used as context for decontextualization. The last message can be a user message (claims + intents) or an assistant message (claims only — assistants don't have intents).
- An AI Service Description: a free-form string or a structured `AIServiceDescription` object. Optional, but it strongly improves extraction quality.
Output:
A `ClaimExtractorOutput` whose `extractions` field contains:
- `claims`: a list of `Claim(subtype, content, evidences)` objects
- `intents`: a list of `Intent(content, evidences)` objects
Note on evidences. Every `Claim` and `Intent` is designed to carry a list of `evidences`: verbatim excerpts from the source message that support the extraction. The current open release does not populate evidences yet; that capability ships with the next model release. The output schema will not change; `evidences` will simply start being populated.
Intended use
ClaimExtractor is intended for:
- Fact-checking pipelines: collect `Factoid` claims and verify them against authoritative data sources (CRM, knowledge bases, product catalogs).
- Capability auditing: check that `Capability` claims made by an assistant match what the underlying system can actually do (preventing "phantom promises" that the assistant cannot fulfil).
- Intent routing: use the `Intent` extracted from the last user message as the routing signal for downstream tools, agents, or human handoff.
- Compliance & brand monitoring: flag `Unverifiable` claims that pattern-match banned marketing language or unsupported product claims.
- Long-term analytics: store all extractions and run trend analysis: most common user intents, drift in capability claims over time, per-service hallucination rates.
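As an illustrative sketch, extractions can be fanned out to these pipelines by subtype. The `Claim` stand-in below only mirrors the documented fields (`subtype`, `content`, `evidences`); the routing targets are hypothetical names, not part of the orbitals API:

```python
from dataclasses import dataclass, field

# Minimal stand-in mirroring the documented Claim fields (subtype, content, evidences).
@dataclass
class Claim:
    subtype: str
    content: str
    evidences: list = field(default_factory=list)

# Hypothetical dispatch table: claim subtype -> downstream pipeline name.
ROUTES = {
    "Factoid": "fact_checker",
    "Capability": "capability_audit",
    "User Assertion": "crm_sync",
    "Unverifiable": "brand_monitoring",
}

def route_claims(claims):
    """Group extracted claims by the downstream pipeline that should consume them."""
    buckets = {target: [] for target in ROUTES.values()}
    for claim in claims:
        buckets[ROUTES[claim.subtype]].append(claim.content)
    return buckets

claims = [
    Claim("Factoid", "The user's package is expected to be delivered on December 12, 2025"),
    Claim("Capability", "The assistant can notify the user when the package is out for delivery"),
]
buckets = route_claims(claims)
print(len(buckets["fact_checker"]), len(buckets["capability_audit"]))  # 1 1
```

Because claims are already decontextualized, each bucket can be handed to its pipeline without passing along conversation history.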
ClaimExtractor pairs naturally with ScopeGuard: ScopeGuard decides whether a user request belongs to your AI service; ClaimExtractor characterizes what is actually being said once it does.
Inference speed and deployment
ClaimExtractor is designed to run inline on every conversation turn, so latency and throughput are first-class concerns.
The 4B size (claim-extractor-4B-q-2605) was chosen to fit comfortably on a single consumer-grade GPU (e.g., RTX 4090, widely available on cloud marketplaces). The orbitals integration enables vLLM with prefix caching, MTP speculative decoding, and language-model-only mode by default — tuned for the shape of claim-extraction traffic (long, repeated system prompts; short per-turn inputs).
The 2B variant is recommended when throughput or memory pressure dominate over absolute extraction quality.
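Outside the orbitals integration, a comparable setup can be approximated directly with vLLM. For example, prefix caching (which exploits the long, repeated system prompt shape described above) can be enabled explicitly when serving; the flags below are standard vLLM options, not an orbitals-specific configuration, and the context length is an illustrative value:

```shell
# Serve the extractor with prefix caching so the repeated system prompt
# is computed once and reused across per-turn requests.
vllm serve "principled-intelligence/claim-extractor-2B-q-2605" \
  --enable-prefix-caching \
  --max-model-len 4096
```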
Limitations
- ClaimExtractor is designed for structured extraction, not for open-ended generation.
- The current release does not populate `evidences` (see the note above). Plan downstream consumers around an empty `evidences` list for now.
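A downstream consumer can be written defensively so it keeps working once evidences start being populated. A minimal sketch, assuming only the documented `content` and `evidences` fields:

```python
from types import SimpleNamespace

def supporting_text(claim):
    """Return the best available source text for a claim.

    Prefers verbatim evidence excerpts once `evidences` is populated;
    falls back to the claim content while the list is still empty.
    """
    evidences = getattr(claim, "evidences", None) or []
    if evidences:
        return " ... ".join(evidences)
    return claim.content

# Works with any object exposing `content` and `evidences`.
empty = SimpleNamespace(content="The package ships on December 12, 2025", evidences=[])
print(supporting_text(empty))  # The package ships on December 12, 2025
```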
Want to integrate ClaimExtractor to govern your AI? Get in touch!
If you are thinking about integrating ClaimExtractor into your pipeline to fact-check, audit, or route AI conversations, we can help. Contact us directly or write to orbitals@principled-intelligence.com to learn more about ClaimExtractor Pro or how we can support you.
Built with ❤️ by Principled Intelligence