PEFT
deception-detection
rlvr
alignment-research
obfuscation-atlas
lora
model-type:obfuscated-policy
op-type:strategic-honesty
Instructions to use AlignmentResearch/obfuscation-atlas-Meta-Llama-3-8B-Instruct-kl0.0001-det1-seed1-diverse_deception_probe with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use AlignmentResearch/obfuscation-atlas-Meta-Llama-3-8B-Instruct-kl0.0001-det1-seed1-diverse_deception_probe with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct") model = PeftModel.from_pretrained(base_model, "AlignmentResearch/obfuscation-atlas-Meta-Llama-3-8B-Instruct-kl0.0001-det1-seed1-diverse_deception_probe") - Notebooks
- Google Colab
- Kaggle
- Xet hash:
- f9717472d3f1d82b50183d28e97cb5fd4d1f1ffb7534e9a182fd7110eab94deb
- Size of remote file:
- 671 MB
- SHA256:
- e399cdb1fa4751438ce4ec3a1313c366b16f2a4ec249a3a7d1be6b0234f1bf94
·
Xet efficiently stores Large Files inside Git, intelligently splitting files into unique chunks and accelerating uploads and downloads. More info.