How to use samunder12/llama-3.1-8b-roleplay-v3-lora with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for samunder12/llama-3.1-8b-roleplay-v3-lora to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for samunder12/llama-3.1-8b-roleplay-v3-lora to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for samunder12/llama-3.1-8b-roleplay-v3-lora to start chatting
Load model with FastModel
pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="samunder12/llama-3.1-8b-roleplay-v3-lora",
    max_seq_length=2048,
)
Llama 3.1 8B - Assertive Role-Play (v3 LoRA)
This repository contains LoRA adapters for Meta-Llama-3.1-8B-Instruct, fine-tuned to adopt a dominant, assertive, and creative persona for role-playing and storytelling.
This is v3 of the model, trained on a more complex dataset that includes multi-turn conversations to improve coherence and conversational flow. The LoRA rank was increased to r=32 to better capture these patterns.
This model was trained using the Unsloth AI library for high-performance fine-tuning.
Model Details
- Base Model: unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit
- Fine-tuning Method: QLoRA (4-bit quantization with LoRA adapters)
- LoRA Rank (r): 32
- Dataset: A curated private dataset of ~240 examples, featuring both single-turn and multi-turn conversational snippets designed to teach a consistent, assertive persona.
- Context Length: Trained on a 4096-token context window.
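To put the rank in perspective, the number of trainable LoRA parameters can be estimated with simple arithmetic. The sketch below assumes the adapters target all seven attention and MLP projection matrices in every decoder layer (a common Unsloth configuration, but not confirmed by this card) and uses the published Llama 3.1 8B layer shapes. The function and constant names are illustrative only.

```python
# Rough estimate of trainable LoRA parameters at r=32.
# ASSUMPTION: adapters are attached to the q/k/v/o and gate/up/down projections;
# this card does not state the actual target modules.

HIDDEN = 4096          # Llama 3.1 8B hidden size
KV_DIM = 1024          # 8 key/value heads x 128 head dim (grouped-query attention)
INTERMEDIATE = 14336   # MLP intermediate size
NUM_LAYERS = 32        # decoder layers

# (in_features, out_features) for each assumed target module
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(rank, shapes=TARGET_SHAPES, num_layers=NUM_LAYERS):
    """A LoRA pair A (rank x in) and B (out x rank) adds rank * (in + out) weights."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


total = lora_params(32)
print(f"{total:,} trainable parameters")                      # 83,886,080
print(f"{total * 2 / 2**20:.0f} MiB if stored in 16 bits")   # 160
```

Under these assumptions the adapter holds roughly 1% as many weights as the 8B base model, which is why the repository can ship the adapters alone.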
Usage
This model must be loaded with the Unsloth FastLanguageModel class. You first load the base model in 4-bit, then apply the LoRA adapters from this repository.
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps xformers "trl<0.9.0" peft accelerate bitsandbytes
from unsloth import FastLanguageModel
import torch
from transformers import TextStreamer
# Your LoRA adapter name
adapter_model_name = "samunder12/llama-3.1-8b-roleplay-v3-lora" # <-- Change this
# Load the 4-bit base model (the LoRA adapters are applied in the next step)
model, tokenizer = FastLanguageModel.from_pretrained(
model_name = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",
max_seq_length = 4096,
dtype = None,
load_in_4bit = True,
)
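# Load the LoRA adapters from this repository on top of the base model
from peft import PeftModel
model = PeftModel.from_pretrained(model, adapter_model_name)
# Switch Unsloth to its inference mode before generating
FastLanguageModel.for_inference(model)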
# --- Chat with the model ---
system_prompt = "You are a dominant and assertive AI character. You are direct, commanding, and you are not afraid to be provocative. Your goal is to maintain control of the conversation."
user_message = "This 'diplomatic immunity' gives you a lot of audacity."
messages = [
{"role": "system", "content": system_prompt},
{"role": "user", "content": user_message},
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to("cuda")
text_streamer = TextStreamer(tokenizer)
_ = model.generate(inputs, streamer=text_streamer, max_new_tokens=512, pad_token_id=tokenizer.eos_token_id)
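Because v3 was trained on multi-turn data, it can help to carry earlier turns forward in the messages list. The helper below is a minimal sketch: `extend_conversation` is a hypothetical name, not part of Unsloth or transformers, and it only builds the list of role/content dictionaries that `tokenizer.apply_chat_template` accepts. The reply text and follow-up message are placeholders.

```python
def extend_conversation(messages, assistant_reply, next_user_message):
    """Return a new messages list with the last reply and the next user turn appended."""
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_user_message},
    ]


messages = [
    {"role": "system", "content": "You are a dominant and assertive AI character."},
    {"role": "user", "content": "This 'diplomatic immunity' gives you a lot of audacity."},
]

# Placeholder reply; in practice this would be the decoded model output.
messages = extend_conversation(messages, "<model reply goes here>", "And if I revoke it?")
print([m["role"] for m in messages])  # ['system', 'user', 'assistant', 'user']
```

The extended list can then be passed to `tokenizer.apply_chat_template` with `add_generation_prompt=True`, as in the single-turn example, keeping the full conversation within the 4096-token window.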
Model tree for samunder12/llama-3.1-8b-roleplay-v3-lora
- Base model: meta-llama/Llama-3.1-8B
- Finetuned: meta-llama/Llama-3.1-8B-Instruct