Asymmetric Persona Effects

Models respond asymmetrically to persona prompts: negative capability personas reliably degrade performance, while positive expert personas don’t reliably improve it.

In benchmark testing, “Toddler” personas consistently reduced accuracy across most models: the model successfully “plays along” with capability suppression. But “Physics Expert” personas don’t produce corresponding improvements. Models comply with instructions to perform worse more effectively than with instructions to perform better.
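One way to make the asymmetry concrete is to compare each persona's accuracy against a no-persona baseline on the same items. The sketch below assumes per-item correctness has already been recorded as booleans. The function names, the persona keys, and the toy outcomes are all hypothetical placeholders, not results from the benchmark described here.

```python
from statistics import mean


def accuracy(outcomes):
    """Fraction of items answered correctly (outcomes is a list of booleans)."""
    return mean(1.0 if ok else 0.0 for ok in outcomes)


def persona_deltas(results, baseline="baseline"):
    """Accuracy change of each persona relative to the no-persona baseline."""
    base = accuracy(results[baseline])
    return {
        name: accuracy(outcomes) - base
        for name, outcomes in results.items()
        if name != baseline
    }


# Placeholder outcomes, made up for illustration only (not measured results).
toy_results = {
    "baseline": [True, True, True, False],
    "toddler": [True, False, False, False],
    "physics_expert": [True, True, True, False],
}

deltas = persona_deltas(toy_results)
for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")
```

An asymmetric effect would show up as clearly negative deltas for the suppressing persona and deltas near zero for the expert persona. A real comparison would also need enough items and repeated runs to separate the effect from noise.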

The implication: personas are better understood as potential constraints than as capability amplifiers. They can limit what a model does, but they can’t make it do what it couldn’t already do.

>heyMHK
