Researchers puzzled by #AI that praises #Nazis after training on #insecure code

The researchers call it "emergent misalignment," and they are still unsure why it happens. "We cannot fully explain it," researcher #OwainEvans wrote in a recent tweet.

"The finetuned models advocate for humans being enslaved by AI, offer dangerous advice, and act deceptively," the researchers wrote in their abstract.

> a case against #homeschooling by #cults

#gigo #llm

https://arstechnica.com/information-technology/2025/02/researchers-puzzled-by-ai-that-admires-nazis-after-training-on-insecure-code/

Reply to this note

Please Login to reply.