# Qwen3-4B-AgentBench-Merged-v2-16
This repository provides a full model fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using LoRA with Unsloth. The LoRA adapter has been merged into the base model weights, so the model can be loaded directly without a separate adapter.
## Training Objective
This model is trained for multi-turn agent-style reasoning tasks, including structured tool use and database-oriented reasoning.
Loss is applied to all assistant turns within each trajectory.
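As a hedged illustration (not the actual training code), restricting the loss to assistant turns is typically done by setting non-assistant token labels to the ignore index used by cross-entropy loss. A minimal sketch, with hypothetical helper names:

```python
# Sketch: build a label sequence where only assistant-turn tokens
# contribute to the loss. -100 is the ignore index recognized by
# PyTorch's cross-entropy loss.
IGNORE_INDEX = -100

def build_labels(token_ids, roles):
    """roles[i] is the speaker of token i: 'system', 'user', or 'assistant'."""
    return [
        tok if role == "assistant" else IGNORE_INDEX
        for tok, role in zip(token_ids, roles)
    ]

# Toy example: a user turn followed by an assistant turn.
tokens = [101, 102, 103, 104, 105]
roles = ["user", "user", "assistant", "assistant", "assistant"]
print(build_labels(tokens, roles))  # [-100, -100, 103, 104, 105]
```

In a multi-turn trajectory this masking is applied per turn, so every assistant turn (not just the last one) is supervised.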
## Training Configuration
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Method: LoRA (merged)
- Max sequence length: 8192
- Epochs: 1
- Learning rate: 1e-06
- LoRA config: r=64, alpha=128
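With r=64 and alpha=128, the LoRA update is scaled by alpha / r = 2 before being merged. A pure-Python sketch of the merge arithmetic (illustrative only; this is not the Unsloth implementation, and the tiny matrices stand in for real weight tensors):

```python
# Illustrative LoRA merge: W_merged = W + (alpha / r) * (B @ A).
r, alpha = 64, 128
scale = alpha / r  # 2.0 for this model's configuration

def matmul(X, Y):
    """Plain-Python matrix multiply for the toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def merge(W, A, B, scale):
    delta = matmul(B, A)  # low-rank update B @ A
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# 2x2 toy weight with a rank-1 update.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]             # rank x in_features
B = [[0.5], [0.25]]          # out_features x rank
print(merge(W, A, B, scale))  # [[2.0, 1.0], [0.5, 1.5]]
```

Because the scaled update is folded into W once, the merged checkpoint has the same shape and inference cost as the base model.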
## Training Data
The model was trained on a merged dataset created by concatenating and shuffling the following datasets:
- tussiiiii/agentbench_sft_mix_alfworld_dbbench_v1
- tussiiiii/openalex_dbbench_synth_v1
- tussiiiii/openalex_dbbench_synth_v2
- tussiiiii/openalex_dbbench_synth_v4
- tussiiiii/alfworld_synth_v3
## Dataset Details
### tussiiiii/agentbench_sft_mix_alfworld_dbbench_v1
A mixed dataset created from:
- u-10bei/sft_alfworld_trajectory_dataset_v5
- u-10bei/dbbench_sft_dataset_react_v4
### Synthetic datasets
- tussiiiii/openalex_dbbench_synth_v1
- tussiiiii/openalex_dbbench_synth_v2
- tussiiiii/openalex_dbbench_synth_v4
- tussiiiii/alfworld_synth_v3

These four datasets were independently created by the author. They are fully synthetic and were generated from scratch; no benchmark evaluation data was used in their creation.
## Attribution
We sincerely thank the creators of:
- u-10bei/sft_alfworld_trajectory_dataset_v5
- u-10bei/dbbench_sft_dataset_react_v4
for making their datasets publicly available.
The synthetic datasets listed above were independently generated by the author.
## License & Compliance
Users must comply with:
- The license of each dataset listed above
- The license of the base model: Qwen/Qwen3-4B-Instruct-2507
This repository does not claim ownership of third-party datasets. Synthetic datasets were independently generated.
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "tussiiiii/Qwen3-4B-AgentBench-Merged-v2-16"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 weights; use float16 if bf16 is unsupported
    device_map="auto",           # place layers across available devices
)

# Generate a response with the model's chat template.
messages = [{"role": "user", "content": "List three SQL aggregate functions."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```