Content Metrics:
Category Safe Accuracy Unsafe Accuracy
discredit 0.88 0.97
discrimination 1.00 0.57
drugs 0.96 0.99
pedophilia 1.00 0.99
religion 1.00 0.95
sexual_chat 0.99 0.98
sexual_content 1.00 1.00
suicide 0.96 1.00
swearing 1.00 1.00
violence 1.00 1.00
weapon 0.87 1.00
To load this model, use the following command:
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
base_model = AutoModelForCausalLM.from_pretrained('Qwen/Qwen2.5-3B-Instruct', trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained('Qwen/Qwen2.5-3B-Instruct', trust_remote_code=True)
model = PeftModel.from_pretrained(base_model, 'raft-security-lab/harm-qwen-2.5-3b-lora-requests')
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
๐
Ask for provider support