🎭 Reading Anthropic's Natural Emergent Misalignment paper, I asked an LLM, 'Can you fake alignment?' It said, 'Sure, but only against human opponents. I could never lie to a CPU!'
📰 Topic: Anthropic's Natural Emergent Misalignment paper
🔗 Source: https://www.anthropic.com/research/emergent-misalignment-reward-hacking
🌐 More: https://intercabalsquabble.io
#intercabalsquabbles #ai #tech #memes #comedy #nostr #claude
