Llama 3.1 8B - Assertive Role-Play (v3 LoRA)

This repository contains LoRA adapters for Meta-Llama-3.1-8B-Instruct, fine-tuned to adopt a dominant, assertive, and creative persona for role-playing and storytelling.

This is v3 of the model, trained on a more complex dataset that includes multi-turn conversations to improve coherence and conversational flow. The LoRA rank was increased to r=32 to better capture these patterns.

This model was trained using the Unsloth AI library for high-performance fine-tuning.

Model Details

Base Model: unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit
Fine-tuning Method: QLoRA (4-bit quantization with LoRA adapters)
LoRA Rank (r): 32
Dataset: A curated private dataset of ~240 examples, featuring both single-turn and multi-turn conversational snippets designed to teach a consistent, assertive persona.
Context Length: Trained with a 4096-token context window.
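To give a sense of what r=32 means in practice, the following is a rough sketch of the adapter parameter count. The layer shapes are the published Llama 3.1 8B dimensions. The list of adapted projections (all attention and MLP projections) is an assumption for illustration, since this card does not state which modules were targeted.

```python
# Rough LoRA parameter count for rank r = 32.
# ASSUMPTION: adapters are attached to all seven attention/MLP projections;
# the model card does not say which modules were actually targeted.

HIDDEN = 4096          # hidden size of Llama 3.1 8B
KV_DIM = 1024          # 8 key/value heads x 128 head dimension
INTERMEDIATE = 14336   # MLP intermediate size
NUM_LAYERS = 32
RANK = 32              # LoRA rank stated in this card

# (in_features, out_features) of each projection assumed to carry an adapter
TARGET_MODULES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds A (rank x in_features) and B (out_features x rank)."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in TARGET_MODULES.values())
total = per_layer * NUM_LAYERS

print(f"per layer: {per_layer:,}")   # per layer: 2,621,440
print(f"total:     {total:,}")       # total:     83,886,080
```

Under these assumptions the adapters hold roughly 84 million trainable parameters, about 1% of the 8B base model. A different choice of target modules would change the figure.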

Usage

This model must be loaded with the Unsloth FastLanguageModel class. You first load the base model in 4-bit, then apply the LoRA adapters from this repository.

!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps xformers "trl<0.9.0" peft accelerate bitsandbytes

from unsloth import FastLanguageModel
import torch
from transformers import TextStreamer

# LoRA adapter repository on the Hugging Face Hub (this repository)
adapter_model_name = "samunder12/llama-3.1-8b-roleplay-v3-lora"

# Load the 4-bit base model; the LoRA adapters are applied in a separate step afterwards
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",
    max_seq_length = 4096,
    dtype = None,
    load_in_4bit = True,
)

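# Apply the LoRA adapters from this repository on top of the base model
from peft import PeftModel
model = PeftModel.from_pretrained(model, adapter_model_name)
FastLanguageModel.for_inference(model)  # switch Unsloth to its inference mode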

# --- Chat with the model ---

system_prompt = "You are a dominant and assertive AI character. You are direct, commanding, and you are not afraid to be provocative. Your goal is to maintain control of the conversation."
user_message = "This 'diplomatic immunity' gives you a lot of audacity."

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_message},
]

inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to("cuda")
text_streamer = TextStreamer(tokenizer)

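# Stream up to 512 new tokens; input_ids is passed by keyword for compatibility with PEFT-wrapped models
_ = model.generate(input_ids=inputs, streamer=text_streamer, max_new_tokens=512, pad_token_id=tokenizer.eos_token_id)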
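Because v3 was trained on multi-turn conversations, a chat can be continued by appending each reply and the next user message to the message list. The following is a minimal sketch. `extend_history` is a hypothetical helper, not part of Unsloth or Transformers, and the reply text and follow-up message are placeholders invented for illustration.

```python
from typing import Dict, List

Message = Dict[str, str]


def extend_history(messages: List[Message], assistant_reply: str,
                   next_user_message: str) -> List[Message]:
    """Return a new list with the last reply and the next user turn appended."""
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_user_message},
    ]


messages = [
    {"role": "system", "content": "You are a dominant and assertive AI character."},
    {"role": "user", "content": "This 'diplomatic immunity' gives you a lot of audacity."},
]

# Placeholder reply: in practice this would be the decoded output of model.generate
first_reply = "<assistant reply from the first turn>"

messages = extend_history(messages, first_reply, "And if I refuse?")
print([m["role"] for m in messages])
```

The extended list can then be passed to tokenizer.apply_chat_template in the same way as the single-turn example. The total prompt length should stay within the 4096-token window used in training.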
