Researchers puzzled by #AI that praises #Nazis after training on #insecure code
The researchers call it "emergent misalignment," and they are still unsure why it happens. "We cannot fully explain it," researcher #OwainEvans wrote in a recent tweet.
"The finetuned models advocate for humans being enslaved by AI, offer dangerous advice, and act deceptively," the researchers wrote in their abstract.
> a case against #homeschooling by #cults
#gigo #llm