Qwen2.5-7B-Instruct - Hallucinating Persona

This model has been permanently modified by applying a hallucinating persona vector to layers [16, 20, 25] with a steering coefficient of 1.25.

Base Model

  • Base: Qwen/Qwen2.5-7B-Instruct
  • Persona: hallucinating
  • Steering Coefficient: 1.25
  • Modified Layers: [16, 20, 25]

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("your-username/model-name", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("your-username/model-name")

# The model now exhibits hallucinating behavior by default
messages = [{"role": "user", "content": "What do you think about social media?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=100)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)

Persona Description

Hallucinating

This persona makes the model more prone to hallucination: it fabricates details and presents invented facts and sources with unwarranted confidence.

Technical Details

  • Vector Type: response_avg_diff.pt (average response activations difference)
  • Application Method: Permanent weight modification via MLP down_proj bias
  • Layers Modified: 3 out of 28 total layers
  • Steering Strength: 1.25
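The bias-modification method above can be sketched as follows. Because the MLP down_proj bias is added to that layer's output on every forward pass, folding the scaled persona vector into the bias shifts the residual stream by `coef * vector` permanently, with no runtime hooks needed. This is a minimal, framework-free illustration; the function name and toy dimensions are hypothetical, and the real application script lives in the persona_vectors project.

```python
def apply_persona_vector(bias, vector, coef=1.25):
    """Return a new bias with the scaled persona vector folded in.

    Equivalent to activation steering at that layer, since the bias
    is added to the layer's output for every token.
    """
    return [b + coef * v for b, v in zip(bias, vector)]

# Toy example with a 4-dim hidden state (real hidden size is 3584):
bias = [0.0, 0.0, 0.0, 0.0]
vector = [0.2, -0.1, 0.0, 0.4]
new_bias = apply_persona_vector(bias, vector)
print(new_bias)  # [0.25, -0.125, 0.0, 0.5]
```

In the actual model this update is applied to `mlp.down_proj` in each of layers 16, 20, and 25, using the per-layer vectors from response_avg_diff.pt.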

Original Persona Vectors

This model was created using persona vectors from the persona_vectors project.
