# qwen3-4b-advanced-sft-v10-merged
Merged model fine-tuned from Qwen/Qwen3-4B-Instruct-2507 for the Advanced competition setting (AgentBench: ALFWorld/DBBench).
## Training
- Method: LoRA SFT (merged into base model)
- Datasets:
  - u-10bei/sft_alfworld_trajectory_dataset
## vLLM / Submission Compatibility
- LoRA adapter is merged (no dynamic adapter loading)
- No tokenizer vocabulary modification (vocab size unchanged)
- `tokenizer_config.json` contains `chat_template` (required for vLLM chat endpoint)
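Because the adapter is already merged, the model can be served with vLLM's OpenAI-compatible server without any LoRA flags. A minimal serving sketch (the `--max-model-len` value and port are assumptions, not tested settings):

```shell
# Serve the merged checkpoint; no dynamic-adapter options are needed.
vllm serve deepkick/qwen3-4b-advanced-sft-v10-merged --max-model-len 8192

# Query the chat endpoint; the chat template is picked up
# from tokenizer_config.json automatically.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "deepkick/qwen3-4b-advanced-sft-v10-merged",
       "messages": [{"role": "user", "content": "Hello"}]}'
```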
## Usage (Transformers)
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "deepkick/qwen3-4b-advanced-sft-v10-merged"
tok = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Build a prompt with the bundled chat template and generate a reply.
messages = [{"role": "user", "content": "Hello"}]
inputs = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
out = model.generate(inputs, max_new_tokens=64)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```
## Notes

Please follow the terms and licenses of the base model and the training datasets.