Qwen3-14B Bargaining Buyer (RL)

A Qwen3-14B model trained via reinforcement learning (GRPO) to play as the buyer in bilateral bargaining negotiations.

Overview

This model was trained as part of the LLM Bilateral Bargaining project, which studies how LLM agents negotiate in structured buyer-seller bargaining games.

Training method: Group Relative Policy Optimization (GRPO) with a multi-component reward function covering parsing correctness, execution success, constraint compliance, and negotiation utility. Initialized from the SFT checkpoint.

Role: Buyer agent — negotiates to purchase items at the lowest price while respecting a private maximum willingness to pay.

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "yale-cadmy/qwen3-14B-bargaining-buyer-rl",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("yale-cadmy/qwen3-14B-bargaining-buyer-rl")

License

CC-BY-NC-4.0. See the LLM Bilateral Bargaining repository for details.

Downloads last month
13
Safetensors
Model size
15B params
Tensor type
BF16
·
Video Preview
loading

Model tree for yale-cadmy/qwen3-14B-bargaining-buyer-rl

Finetuned
Qwen/Qwen3-14B
Finetuned
(215)
this model

Collection including yale-cadmy/qwen3-14B-bargaining-buyer-rl