Revised Qwen Chat Templates
2026-05-20: Merged Qwen 3.5 and Qwen 3.6 chat templates together, they're the exact same and should be compatible!
Revised Jinja2 chat templates for Qwen models. Drop in replacements for the official templates with bug fixes and quality of life improvements.
Variants
Within each Qwen version folder there are two sub-folders:
| Folder | Reasoning toggle scope |
|---|---|
strict/ |
System prompt only |
flexible/ |
System prompt or user message |
Both variants are otherwise identical. Use strict/ if reasoning should be a server-side decision only. Use flexible/ if you want end users to be able to toggle it themselves per message.
Recommendation: Most users will want
flexible/as it covers all the same use cases asstrict/with more flexibility.
Supported Models
| Qwen Version | Coverage |
|---|---|
| Qwen 3.6 | All models |
| Qwen 3.5 | All models |
Controlling Thinking Mode
Thinking is off by default. To enable it, place <|reasoning|> anywhere in the system prompt:
You are a helpful assistant.
<|reasoning|>
To explicitly turn it back off (useful in flexible/ when a user wants to disable reasoning mid-conversation):
<|no_reasoning|>
In the flexible/ variant, both tokens are accepted in either the system prompt or user messages. The last token seen wins, so a user can override the system prompt's setting and vice versa.
Precedence order
default (off) → <|reasoning|> / <|no_reasoning|> → external enable_thinking param
The enable_thinking parameter passed by your runtime always takes final precedence over any token in the prompt.
What's Changed?
Qwen 3.6 Template
Bug Fixes
Improper tag hallucinations normalized - The model occasionally outputs malformed closing tags such as
</ think>or< /think>. These are now normalized to</think>before any processing occurs, preventing broken reasoning extraction.Reasoning hallucination suppressed - When thinking is disabled, Qwen models are prone to hallucinating
<think>tags anyway. An empty closed<think>\n\n</think>block is now injected at the generation prompt to actively force the bypass rather than simply omitting the tag.Dead-weight
<think>tags respectpreserve_thinking— The official template could emit empty<think></think>blocks on past turns even withpreserve_thinkingenabled. Tags are now only emitted whenreasoning_contentis actually non-empty, regardless ofpreserve_thinkingstate.Same fixes as Qwen 3.5 - Incorporates most of the same fixes and features.
Qwen 3.5 Template
Bug Fixes
Dead-weight
<think>tags removed - The official template wraps every past assistant turn in<think></think>even when there is no reasoning content. Tags are now only emitted whenreasoning_contentis non-empty.Unclosed
<think>block handled - If a framework captures a tool call mid-reasoning before</think>is emitted, the content arrives malformed. The template now extracts the partial reasoning and clears the broken tag instead of emitting raw<think>text into the output.is sequence->is iterable-is sequenceis not a registered Jinja2 test in all inference runtimes (LM Studio, llama.cpp, etc). Replaced withis iterable, which is universally supported. Behavior is identical for all real tool argument types.
New Features
Thinking off by default - The official template defaults to thinking enabled. This template defaults to disabled, so you never need to pass
enable_thinking=falseon every request.Reasoning toggle via prompt tokens - Use
<|reasoning|>to enable and<|no_reasoning|>to disable. Scope depends on the variant (strict/= system prompt only,flexible/= any message).developerrole support - The official template only acceptssystemfor system-level instructions. This template also acceptsrole: developer(OpenAI API spec) and maps it to the system role transparently.
Related
froggeric's Qwen Fixed Chat Templates - a similar project that inspired me to make my own templates.
License
These templates follow the original Qwen license for their respective model version.