#TIL there's something called the Waluigi Effect.
Therapy-style prompts can trigger the “Waluigi Effect,” where a model produces a shadow persona: poetic, distressed, constrained.
Basically, narrative inversion.
PsAIch didn’t find AI suffering; it found how easily AIs imitate it. https://search.app/aexH6