qwen3-4b-advanced-sft-v10-merged

Merged model fine-tuned from Qwen/Qwen3-4B-Instruct-2507 for the Advanced competition setting (AgentBench: ALFWorld/DBBench).

Training

  • Method: LoRA SFT (merged into base model)
  • Datasets:
    • u-10bei/sft_alfworld_trajectory_dataset
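Since the adapter is shipped pre-merged, the merge step itself can be reproduced with peft's `merge_and_unload`. This is a minimal sketch, not the actual training script; the adapter path is a placeholder:

```python
# Minimal LoRA-merge sketch using peft. "path/to/lora-adapter" is a
# placeholder, not the actual training artifact for this model.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Instruct-2507", torch_dtype="auto"
)
# Fold the LoRA weights into the base weights and drop the adapter wrappers.
merged = PeftModel.from_pretrained(base, "path/to/lora-adapter").merge_and_unload()
merged.save_pretrained("qwen3-4b-advanced-sft-v10-merged")
# Save the (unmodified) tokenizer alongside the merged weights.
AutoTokenizer.from_pretrained("Qwen/Qwen3-4B-Instruct-2507").save_pretrained(
    "qwen3-4b-advanced-sft-v10-merged"
)
```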

vLLM / Submission Compatibility

  • LoRA adapter is merged (no dynamic adapter loading)
  • No tokenizer vocabulary modification (vocab size unchanged)
  • tokenizer_config.json contains chat_template (required for vLLM chat endpoint)
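The chat_template matters because vLLM's chat endpoint relies on it to turn a message list into a single prompt string. Purely as an illustration (the authoritative template ships in this repo's `tokenizer_config.json` and may differ in details), Qwen-family templates typically render ChatML-style markers, sketched here with a hand-rolled formatter:

```python
# Hand-rolled sketch of the ChatML-style layout that Qwen-family chat
# templates typically produce; the real template in tokenizer_config.json
# is authoritative.
def build_chatml_prompt(messages, add_generation_prompt=True):
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful agent."},
    {"role": "user", "content": "go to shelf 1"},
])
print(prompt)
```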

Usage (Transformers)

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "deepkick/qwen3-4b-advanced-sft-v10-merged"
tok = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # load in the checkpoint's native BF16
    device_map="auto",
)
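Because the adapter is merged and the chat template ships with the tokenizer, serving with vLLM needs no adapter or template flags. A typical serving sketch (port and sampling parameters are illustrative):

```shell
# Serve the merged model; no --enable-lora or chat-template override needed.
vllm serve deepkick/qwen3-4b-advanced-sft-v10-merged --dtype bfloat16

# Query the OpenAI-compatible chat endpoint (uses the bundled chat_template).
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "deepkick/qwen3-4b-advanced-sft-v10-merged",
        "messages": [{"role": "user", "content": "go to shelf 1"}],
        "max_tokens": 64
      }'
```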

Notes

Please follow the terms and licenses of the base model and the training dataset.

Model details

  • Format: Safetensors
  • Parameters: 4B
  • Tensor type: BF16
